Atypical protein kinases of the RIO (right open reading frame) kinase family are found in all three domains of life, emphasizing their essential function. In all archaeal genomes sequenced to date, at least one, and typically two, members of the RIO kinase family have been identified. Although the function of RIO kinases in Archaea remains to be resolved, bioinformatics analysis (e.g. comparison of the phylogenetic distribution and gene neighbourhood analysis, as well as interaction analysis) in combination with the available phosphoproteome study of Sulfolobus solfataricus has provided first hints of their possible function and revealed some putative target proteins for RIO kinases in Archaea. This study suggests a possible function of archaeal RIO kinases in RNA and/or DNA binding/processing, translation initiation or ribosomal biogenesis, resembling the assumed physiological role in yeast.
- atypical protein kinase
- protein phosphorylation
- right open reading frame kinase (RIO kinase)
The final aim in systems biology is to build a silicon cell, a precise replicate of the living cell that is able to predict in vivo function. Besides regulation at the transcription and translation level, PTMs (post-translational modifications) play a major role in cellular regulation, since they allow for a fast response to environmental changes. The most important PTM is probably reversible protein phosphorylation, which has a crucial role in signal transduction. This modification usually takes place at histidine or aspartate (two-component systems, most common in Bacteria) or serine, threonine or tyrosine (most common in Eukarya) residues. Protein phosphorylation is mediated by PKs (protein kinases), which catalyse the phosphorylation of their respective target proteins, and by protein phosphatases, which specifically remove the respective phosphate group. PKs form a large superfamily comprising six families: apart from the well-investigated ePKs (eukaryotic type-like PKs), the so-called aPKs (atypical PKs) of the ABC1 (ATP-binding cassette transporter 1), RIO (right open reading frame), piD261, AQ578 and PKN2 families have been proposed. Although the presence of aPKs is well established, their physiological function remains largely unknown.
RIO kinases have received a lot of attention over the last decade. The RIO families were originally identified in Archaea and Eukarya, and a common ancestor with an ancestral RIO gene has been predicted. To date, four different types of RIO kinases have been identified: Rio 1, Rio 2, Rio 3 and Rio B kinase. Phylogenetic analysis confirmed the distribution of RIO kinases in all three domains of life and proposed the presence of a combination of one Rio 1 kinase and one Rio 2 kinase in less complex species (i.e. prokaryotes and single-cellular eukaryotes), whereas multicellular eukaryotes, including humans, possess, in addition, one Rio 3 kinase.
For yeast Rio 1, an essential function in cell-cycle progression and chromosome maintenance is known, and, furthermore, for both yeast RIO kinases a role in 18S rRNA processing has been demonstrated [5,6]. Deletion of either of the two RIO kinases results in cell death, suggesting that the two kinases have different functions [5–10]. However, whereas the function in yeast is well established, the role of their prokaryotic counterparts remains obscure. Although PK activity could be verified, their physiological target proteins are still unknown. A major breakthrough was achieved on the basis of the crystallization of the Rio 1 and Rio 2 kinases from Archaeoglobus fulgidus [11,12] and a combined bioinformatics analysis. For both archaeal RIO kinases, autophosphorylation and activity with common kinase substrates (Rio 1 kinase: histone H1 and myelin basic protein; Rio 2 kinase: histone H1, myelin basic protein and α-casein) have been shown [12,13].
In general, RIO kinases are regarded as trimmed versions of canonical ePKs, lacking subdomains VIII (activation loop), X and XI (involved in substrate binding). In addition, RIO domains possess an insertion of 18–23 amino acids between αC and β3 (flexible loop), which is absent from ePKs. Rio 1 and Rio 2 kinases are very similar with regard to their overall fold, although the N-terminal domain of the Rio 2 family comprises a winged-helix domain, which is absent from all other RIO families. As outlined previously, members of the different RIO families can be distinguished by their specific P-loop (interaction and orientation of the ATP moiety) and DFG (Asp-Phe-Gly) loop (metal-ion-binding and positioning) sequences [3,11].
In the present paper, we focus on the phylogenetic distribution of RIO kinase subfamilies within the third domain of life, i.e. Archaea, and, by using comparative genomics (i.e. BLAST analysis and gene neighbourhood), interaction predictions (i.e. STRING) as well as the available phosphoproteome of Sulfolobus solfataricus P2, we aim to gain the first insights into their possible physiological function in Archaea.
The distribution of RIO kinases within Archaea was analysed using AF1804 and AF2426 from A. fulgidus DSM 4304 as templates [11–13]. All sequences with an E-value smaller than 10⁻¹⁰ were considered as putative RIO kinases and, for classification within the RIO subfamilies, the Rio 1 (STGKEA), Rio 2 (GXGKES), Rio 3 (STGKES) or Rio B (SGKEA) P-loop sequences were used.
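The classification step can be illustrated with a minimal Python sketch that assigns a protein sequence to a RIO subfamily by searching for the P-loop motifs listed above. The function name, the order in which the motifs are tested and the choice to return None for sequences without an exact match are assumptions made for illustration only; this is not the procedure used to generate Supplementary Table S1, and the preceding BLAST search is not reproduced here.

```python
import re

# P-loop motifs as quoted in the text; "X" (any residue) is written as "." in the regex.
# Order is an assumption of this sketch: the fully specified motifs are tested
# before the wildcard-containing Rio 2 motif.
P_LOOP_MOTIFS = [
    ("Rio 1", re.compile(r"STGKEA")),
    ("Rio 3", re.compile(r"STGKES")),
    ("Rio B", re.compile(r"SGKEA")),
    ("Rio 2", re.compile(r"G.GKES")),
]


def classify_rio_subfamily(protein_sequence):
    """Return the RIO subfamily whose P-loop motif occurs in the sequence.

    Returns None when no motif matches exactly, e.g. for sequences that
    carry a variant P-loop and would need manual inspection.
    """
    sequence = protein_sequence.upper()
    for subfamily, motif in P_LOOP_MOTIFS:
        if motif.search(sequence):
            return subfamily
    return None


if __name__ == "__main__":
    # Toy sequence fragments (hypothetical, for demonstration only).
    for fragment in ["AAGSTGKEANVY", "MKGVGKESLL", "LLSGKEAAA", "GVGKEG"]:
        print(fragment, classify_rio_subfamily(fragment))
```

Exact matching of this kind leaves sequences with variant P-loops unassigned, so they would still have to be classified by inspection or by a more tolerant profile-based approach.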
In the 121 archaeal genomes available (http://img.jgi.doe.gov/cgi-bin/w/main.cgi; version 3.5, August 2012), 224 RIO kinases were identified, confirming the assumption that RIO kinases are conserved in all three domains of life. According to their P-loop sequences, 72 of the 224 RIO kinases were classified as Rio 1 kinases, 122 as Rio 2 kinases and 30 as Rio B kinases (see Supplementary Table S1 at http://www.biochemsoctrans.org/bst/041/bst0410399add.htm). A total of 55 of the 72 Rio 1 kinases harbour the canonical GXXSTGKEANVY/F P-loop sequence or the short variant STGKEA, whereas the remaining 17 Rio 1 kinases showed variations in their P-loop sequence (for details, see Supplementary Table S1). Among the 122 Rio 2 kinases, 81 possess the Rio 2 kinase P-loop sequence GXGKES [3,11,12], and in 41 Rio 2 kinases, an alternative Rio 2 P-loop was identified (e.g. GVGKEG; for details, see Supplementary Table S1). All 30 members of the Rio B kinase subfamily contain the conserved Rio B kinase P-loop sequence SGKEA, and no alterations of this sequence were observed.
In most archaea (100 of 121), two RIO kinases were present. In only 19 archaea was a single RIO kinase identified; in Aciduliprofundum boonei T469, three Rio 1 kinases were identified, and in Methanopyrus kandleri AV19, four RIO kinases (two Rio 1 and two Rio 2) were identified. Combinations of Rio 1 and Rio 2 kinases were identified in 58 archaea (most members of the Desulfurococcales, Archaeoglobales, Halobacteriales, Methanomicrobia and Thermococcales) and of Rio B and Rio 2 in 30 archaea [all members of the Sulfolobales except S. acidocaldarius (Rio 1 and Rio 2) and most members of the Methanococcales]. Two copies of Rio 2 were found in 12 archaea (most are members of the Thermoproteales). Only one copy of Rio 1 was detected in ten archaea (all are members of the Methanobacteriaceae) and one copy of Rio 2 in nine archaea (all are members of Thermoplasmatales, Nanoarchaeum, Cenarchaeum and Nitrosopumilaceae) (Supplementary Table S1). Interestingly, no archaeon with a combination of Rio 1 and Rio B was identified. Furthermore, no Rio 3 kinases could be identified in archaea, which is in accordance with their unique presence in multicellular eukaryotes.
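The genome and subfamily totals quoted above can be cross-checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary keys are labels chosen here for readability, and the numbers are taken directly from the text rather than recomputed from Supplementary Table S1.

```python
# Subfamily assignment of the identified RIO kinases (numbers as quoted in the text).
subfamily_counts = {"Rio 1": 72, "Rio 2": 122, "Rio B": 30}

# Genomes grouped by RIO kinase content (numbers as quoted in the text).
genomes_two_kinases = {"Rio 1 + Rio 2": 58, "Rio B + Rio 2": 30, "Rio 2 + Rio 2": 12}
genomes_one_kinase = {"Rio 1 only": 10, "Rio 2 only": 9}
genomes_exceptions = {"A. boonei T469 (three)": 1, "M. kandleri AV19 (four)": 1}

total_kinases = sum(subfamily_counts.values())
total_two = sum(genomes_two_kinases.values())
total_one = sum(genomes_one_kinase.values())
total_genomes = total_two + total_one + sum(genomes_exceptions.values())

print("RIO kinases classified:", total_kinases)
print("Genomes with two RIO kinases:", total_two)
print("Genomes with one RIO kinase:", total_one)
print("Genomes accounted for:", total_genomes)
```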
To gain a better understanding of the physiological function of the RIO kinase family in Archaea, we compared the gene neighbourhood of all identified archaeal RIO kinases (up to ten genes up/down-stream). Remarkably, for most RIO kinases, a conserved gene neighbourhood was observed (164 of 224 RIO kinases, 73.2%), mainly with genes encoding proteins involved in biological processes related to transcription and translation (Figure 1 and Supplementary Table S1; see below for detailed discussion).
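The neighbourhood comparison relies on collecting the genes within a fixed window around each RIO kinase gene. A minimal sketch of that windowing step is given below, assuming that the genes of a replicon are available as a list ordered by chromosomal position; the function name and the toy locus tags are hypothetical, and strand orientation as well as the circularity of the chromosome are ignored for simplicity.

```python
def gene_neighbourhood(ordered_genes, target, window=10):
    """Return (upstream, downstream) genes within `window` positions of `target`.

    `ordered_genes` is a list of locus tags sorted by chromosomal position.
    The window is simply truncated at the ends of the list (no wrap-around).
    """
    index = ordered_genes.index(target)  # raises ValueError if the gene is absent
    upstream = ordered_genes[max(0, index - window):index]
    downstream = ordered_genes[index + 1:index + 1 + window]
    return upstream, downstream


if __name__ == "__main__":
    # Toy replicon with 30 hypothetical locus tags.
    toy_genes = [f"gene_{i:02d}" for i in range(30)]
    up, down = gene_neighbourhood(toy_genes, "gene_15")
    print("upstream:", up)
    print("downstream:", down)
```

The neighbourhoods obtained in this way for different genomes can then be compared by the annotated function or protein family of the neighbouring genes rather than by locus tag.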
The majority of the Rio 1 kinase genes (43 of 68, 59.7%) as well as Rio B kinase genes (29 of 30) are found in a conserved neighbourhood with genes encoding the human KH (K homology) domain protein hnRNP (heterogeneous nuclear ribonucleoprotein) and the aeIF (archaeal translation initiation factor)-1A. Some 14 (19.4%) Rio 1 kinase genes cluster with the gene encoding the KH domain protein alone and four (5.6%) with genes encoding the KH domain protein and the DEAD/DEAH-box protein (five, 6.9%) or with the DUF (domain of unknown function) 460 gene alone (two, 2.8%). The remaining Rio B kinase gene of Methanococcus voltae (Mvol_0151) is located in the gene neighbourhood of the aeIF-1A and the DEAD/DEAH-box protein genes. The majority of the Rio 2 kinase genes present as additional copies (46 of 122, 37.7%) are located in a conserved gene neighbourhood with a gene encoding the DUF460 protein (Pfam) (31, 24.5%) or a combination of genes encoding DUF460 and snRNPs (small nuclear ribonucleoproteins) (13, 10.7%, restricted to Sulfolobus species) or cobyrinic acid a,c-diamide synthase CbiA (two, 1.6%, restricted to Metallosphaera species). The remaining 18 Rio 2 kinase genes were located in the gene neighbourhood of the gene encoding the KH domain protein alone (nine) or in combination with the aeIF-1A-encoding gene (eight), and the last Rio 2 kinase gene, that of Vulcanisaeta distributa (Vdis_2273), was located upstream of aeIF-1A.
Remarkably, the archaea that show a similar gene organization of Rio 2 as found for Rio 1 and Rio B are, with few exceptions, the ones that harbour two Rio 2 genes/paralogues or only one copy of Rio 2. This suggests that the combination of RIO kinases with aeIF-1A and the KH domain protein encodes an essential function that is present in all archaea. In no archaeon with two RIO kinases, except A. fulgidus, was a similar gene neighbourhood observed for both genes, and it is tempting to speculate that the two atypical RIO protein kinases might have acquired different functions.
As outlined above, the majority of proteins encoded in the RIO gene cluster have a predicted function in processes related to transcription and translation. KH domain proteins are present in all three domains of life. In general, the function of KH domain proteins is to recognize RNA or ssDNA (single-stranded DNA), and KH motifs are found in various proteins which are involved in a vast number of biological processes (e.g. transcriptional regulation or translation control) [16–18]. Although our knowledge about archaeal translation is rather scant, for aeIF-1A from S. solfataricus P2, binding to the small ribosomal subunit and, furthermore, stimulation of the binding of Met-tRNAi (initiator methionyl-tRNA) and mRNA to the ribosome have been demonstrated. DEAD/DEAH-box helicases are involved, for example, in ribosome biogenesis, translation initiation and mRNA decay [21,22]. Their general function is to separate short RNA duplexes or RNA–DNA heteroduplexes and they are also important for RNA structure modification. For the cold-shock-induced DEAD-box helicase CsdA from Escherichia coli, an involvement in translation initiation, biogenesis of the 50S ribosomal subunit and/or degradosome formation with RNase E has been demonstrated [24–26]. Intriguingly, studies of archaeal DEAD-box helicases also revealed a significant induction by cold shock and cell growth at low temperature [27,28]. DUF460 proteins are only present in Archaea and no homologues were found in Bacteria or Eukarya (Pfam 26.0); however, their function remains to be elucidated. snRNPs are distributed in all three domains of life. Although archaeal snRNPs are structurally more closely related to their eukaryotic than their bacterial homologues (Hfq proteins), in terms of function, they resemble Hfq proteins. Whereas the eukaryal snRNPs are involved in spliceosome formation, Hfq proteins are involved in RNA processing and are of interest for RNA–RNA interaction.
An interaction of DEAD-box proteins with snRNPs has been reported in Methanococcus jannaschii [23,31]. Only the cbiA gene in Metallosphaera species, which encodes the cobyrinic acid a,c-diamide synthase involved in cobalamin biosynthesis, shows no obvious link to transcription and translation.
RIO gene cluster in Sulfolobaceae
In all seven Sulfolobaceae sequenced, except for Acidianus hospitalis and Metallosphaera cuprina, a larger conserved RIO kinase gene region was observed (Figure 2). The cluster comprises nine genes and, for all encoded proteins (except acetyl-CoA c-acetyltransferase, acaB-4), a possible function in RNA or DNA processing, modification and degradation can be predicted. Ribonuclease HII (SSO2384) exhibits ribonucleotide-specific endonuclease activity and degrades the RNA strand of DNA–RNA heteroduplexes. A function in DNA replication, transcription, repair and development in all three domains of life is predicted, and some archaeal enzymes have been investigated (e.g. Thermococcus kodakaraensis).
For archaeal NMD3 (nonsense-mediated decay 3) domain proteins (SSO2383), a function in biogenesis of the large ribosomal subunit has been predicted. For DUF424 domain proteins (SSO2382), no functional information is available. The role of universal translation initiation factor 2 [subunit B; aeIF-2: α, β (aeIF-2B, SSO2381) and γ subunit] is well established in Sulfolobus and Archaea in general, and a crucial role in binding of charged initiator tRNA to the small ribosomal subunit has been demonstrated. SSO2380 encodes UPF (uncharacterized protein family) 004, Elp3 (eukaryotic elongation protein subunit 3) and TRAM (TRM2 and MiaB) domains. TRAM domains are common in Archaea and a function in thiolation of tRNA and ribosomal proteins has been predicted. UPF004 and Elp3 domains are commonly present in methyltransferases [38,39], and an N-terminal UPF004 domain and a C-terminal TRAM domain were identified in methylthiotransferases (MtaB) [38,39]. Besides the RIO kinase gene (SSO2375), as discussed above, the genes encoding aeIF-1A and the KH domain protein are also present in the gene cluster. Therefore, except for the acaB-4 gene, all genes encode proteins which seem to be tightly connected to biological processes related to the regulation of transcription and translation, which is also assumed for RIO kinases.
First evidence for the function of RIO kinases in S. solfataricus P2
S. solfataricus P2 is probably the best studied archaeon in terms of protein phosphorylation. Two protein phosphatases, as well as several protein kinases, have been characterized and the phosphoproteome in response to changes of carbon source (glucose/tryptone) was determined [15,40–45]. Besides the conserved gene organization of Rio B kinase (SSO0197) and Rio 2 kinase (SSO2374), we used ‘predicted protein–protein interactions’ (STRING) as well as available phosphoproteome data in order to identify possible target proteins for RIO kinase phosphorylation.
For SSO Rio B kinase, STRING analysis predicted protein–protein interaction with aeIF-1A (SSO2375), the KH domain-containing protein (SSO2373), aeIF-2B (SSO2381) and the ribonuclease HII (SSO2384), which reside in the same gene cluster, as well as with several other proteins involved in translation or transcription (e.g. DNA topoisomerase VI type IIB 6a and 6b, tyrosyl-tRNA synthetase and DNA topoisomerase III). Intriguingly, in accordance with this observation, aeIF-2B was identified to be phosphorylated on tyrosine, serine and also threonine (ApYVECpSpTCK) in glucose-grown cells. In addition, both DNA topoisomerase 6 subunits [Top6A (SSO0907) and Top6B (SSO0968)] were detected to be multiply phosphorylated (Top6A: DIVpTpYMLpSpSE, glucose and MpTADMEpSK, tryptone; Top6B: KpYEDFR, glucose and RLpYpTFK, tryptone).
For Rio 2 kinase (SSO0195), STRING analysis predicts protein–protein interaction with the DUF460 domain-containing protein (SSO0198), which is located in the conserved gene cluster. SSO0198 was also detected as a phosphoprotein in the phosphoproteome study of S. solfataricus P2 in glucose-grown cells. In addition, the predicted interacting transcription regulator SSO0942 was detected to be phosphorylated in tryptone-grown cells (DpYNKpSMK). Furthermore, besides interactions with several hypothetical proteins, NAD synthetase (nadE) and the putative TBP (TATA-box-binding protein)-interacting protein (TIP49-like) (SSO2450), an interaction between both S. solfataricus RIO kinases is predicted, as well as between the S. solfataricus Rio 2 kinase and two ePKs (SSO3182 and SSO3207), suggesting a more complex cellular regulatory network.
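The phosphopeptides quoted above use a compact notation in which a lower-case p precedes each phosphorylated residue. As a reading aid, the following minimal Python sketch converts such a string into the unmodified peptide sequence and a list of phosphosite positions (1-based, within the peptide). The function name is hypothetical, and the sketch assumes that p is the only lower-case character in the annotation; it does not map the sites back to positions in the full-length protein.

```python
def parse_phosphopeptide(annotated):
    """Split an annotated peptide such as 'ApYVECpSpTCK' into
    (plain_sequence, [(position, residue), ...]) with 1-based positions."""
    residues = []
    phosphosites = []
    i = 0
    while i < len(annotated):
        char = annotated[i]
        if char == "p":
            # A 'p' marks the following residue as phosphorylated.
            if i + 1 >= len(annotated):
                raise ValueError("'p' marker without a following residue")
            residue = annotated[i + 1]
            residues.append(residue)
            phosphosites.append((len(residues), residue))
            i += 2
        else:
            residues.append(char)
            i += 1
    return "".join(residues), phosphosites


if __name__ == "__main__":
    # Peptides as quoted in the text.
    for peptide in ["ApYVECpSpTCK", "DIVpTpYMLpSpSE", "DpYNKpSMK"]:
        print(peptide, parse_phosphopeptide(peptide))
```

For the aeIF-2B peptide, this yields one phosphotyrosine, one phosphoserine and one phosphothreonine, in line with the description in the text.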
aPKs of the RIO kinase family are found in all three domains of life, underlining their essential function. In all archaeal genomes sequenced, at least one (Rio 1 or Rio 2 kinase), but typically two, members of the RIO kinase family (Rio 1 and Rio 2 kinase, Rio B and Rio 2 kinase or, in a few cases, two Rio 2 kinases) were identified. In all archaea, a conserved clustering of one RIO kinase gene (usually of the Rio 1 or Rio B kinase gene or of one of the two Rio 2 kinase genes) with genes encoding the KH domain protein and often aeIF-1A is observed. The second Rio 2 kinase gene typically clusters with a gene encoding the DUF460 protein. In some archaea, an additional association of DEAD/DEAH-box helicases, snRNP or CbiA is observed. An even larger conserved gene cluster was identified in most members of the Sulfolobaceae comprising, for example, ribonuclease HII, aeIF-2B and a predicted methylthiotransferase for thiolation of tRNA or ribosomal proteins. Although the function of RIO kinases in Archaea remains to be resolved, the bioinformatics analysis described in the present paper, including comparison of the phylogenetic distribution, gene neighbourhood analysis as well as protein–protein interaction prediction, provides some first hints of a possible function. Furthermore, the combination with the available phosphoproteome study of S. solfataricus allows the identification of possible target proteins for RIO kinase phosphorylation in S. solfataricus. Our studies suggest a regulatory function of archaeal RIO kinases in biological processes related to transcription and translation, resembling the assumed physiological role in yeast.
The first in vivo evidence for the important role of RIO kinases comes from a study of the haloarchaeon Haloferax volcanii. The α1 protein of the proteasome 20S core particle was identified as a target for in vitro phosphorylation by Rio 1 kinase (Rio 1p, HVO 0135). Site-directed mutagenesis of the phosphosites of the α1 protein and expression in H. volcanii revealed a link between in vivo phosphorylation of the proteasome and cell viability, as well as pigmentation.
The project was partially performed in the course of the Sulfolobus Systems Biology ‘SulfoSYS’ (SysMO, Bundesministerium für Bildung und Forschung) and the ‘Hot signal transduction in Sulfolobus’ (Deutsche Forschungsgemeinschaft) project. D.E. was supported by the Bundesministerium für Bildung und Forschung [grant number 0315004A] and Deutsche Forschungsgemeinschaft [grant number SI 642/10-1].
Molecular Biology of Archaea 3: An Independent Meeting held at the Max Planck Institute for Terrestrial Microbiology, Marburg, Germany, 2–4 July 2012. Organized and Edited by Sonja-Verena Albers (Max Planck Institute for Terrestrial Microbiology, Germany), Bettina Siebers (University of Duisburg-Essen, Germany) and Finn Werner (University College London, U.K.).
Abbreviations: aeIF, archaeal translation initiation factor; DUF, domain of unknown function; Elp3, eukaryotic elongation protein subunit 3; KH, K homology; PK, protein kinase; aPK, atypical PK; ePK, eukaryotic type-like PK; PTM, post-translational modification; RIO, right open reading frame; snRNP, small nuclear ribonucleoprotein; TRAM, TRM2 and MiaB; UPF, uncharacterized protein family
- © The Authors Journal compilation © 2013 Biochemical Society