Of all the steps in mRNA translation, initiation is the one that differs most radically between prokaryotes and eukaryotes. Not only is there no equivalent of the prokaryotic Shine–Dalgarno rRNA–mRNA interaction, but also what requires only three initiation factor proteins (aggregate size ∼125 kDa) in eubacteria needs at least 28 different polypeptides (aggregate >1600 kDa) in mammalian cells, which is actually larger than the size of the 40 S ribosomal subunit. Translation of the overwhelming majority of mammalian mRNAs occurs by a scanning mechanism, in which the 40 S ribosomal subunit, primed for initiation by the binding of several initiation factors including the eIF2 (eukaryotic initiation factor 2)–GTP–Met-tRNAi complex, is loaded on the mRNA immediately downstream of the 5′-cap, and then scans the RNA in the 5′→3′ direction. On recognition of (usually) the first AUG triplet via base-pairing with the Met-tRNAi anticodon, scanning ceases, triggering GTP hydrolysis and release of eIF2–GDP. Finally, ribosomal subunit joining and the release of the other initiation factors complete the initiation process. This sketchy outline conceals the fact that the exact mechanism of scanning and the precise roles of the initiation factors remain enigmatic. However, the factor requirements for initiation site selection on some viral IRESs (internal ribosome entry sites/segments) are simpler, and investigations into these IRES-dependent mechanisms (particularly picornavirus, hepatitis C virus and insect dicistrovirus IRESs) have significantly enhanced our understanding of the standard scanning mechanism. This article surveys the various alternative mechanisms of initiation site selection on mammalian (and other eukaryotic) cellular and viral mRNAs, starting from the simplest (in terms of initiation factor requirements) and working towards the most complex, which paradoxically happens to be the reverse order of their discovery.
- initiation factor
- internal initiation
- mammalian mRNA
- ribosome scanning
- translation initiation
Although it was mammalian systems that provided the first insights, some 50 years ago, into the mechanism of protein biosynthesis and the role of ribosomes, this was soon overhauled by research into bacterial mRNA translation and thus further investigation into mammalian systems became rather a ‘compare and contrast’ exercise. Throughout the 1960s, when the emphasis was mainly on deciphering the genetic code and studying mechanisms of elongation, it seemed that eubacterial and eukaryotic protein synthesis were rather similar processes, and the discovery that both systems used an AUG codon and a dedicated initiator Met-tRNA for initiation raised false hopes that initiation, too, might be similar.
However, by the mid 1970s, it was becoming clear that, in most other respects, initiation in the two systems must be very different. Sequencing the ends of rRNAs revealed that, although the 3′-ends of prokaryotic 16 S rRNA and eukaryotic 18 S rRNA are quite similar, the prokaryotic 16 S rRNA CCUCC motif, which is the core of the Shine–Dalgarno interaction, is precisely deleted from all eukaryotic 18 S rRNAs. In addition, the pioneering work of the Theo Staehelin and John Hershey groups [1,2] had already discovered (and outlined the function of) seven different eIFs (eukaryotic initiation factors; not including eIF2B, 4E, 4G and 5, which were discovered subsequently), indicative of a much greater complexity than in prokaryotes.
In 1976, we developed the nuclease-treated rabbit reticulocyte lysate system for assaying translation of exogenous mRNAs, and found that, in general, it translated all eukaryotic mRNAs accurately and efficiently, whether from yeast, insect, plant or mammalian cells (although there were some viral RNA exceptions, notably poliovirus and rhinovirus RNAs). This was encouraging as it implied that all eukaryotes shared a common initiation mechanism. However, as investigations into initiation site selection mechanisms really demand the study of a single RNA species, ideally coupled with the potential for mutagenesis, the use of cell-free systems for such work was effectively limited to the study of wild-type viral RNAs until 1984, when Paul Krieg and Doug Melton invented methods for transcribing cloned cDNAs using bacteriophage RNA polymerases.
The discovery, again in the mid-1970s, that eukaryotic mRNAs have 5′-caps drew attention to the importance of the 5′-end in translation initiation [5–7]. Reovirus mRNAs could be synthesized in vitro with either m7Gppp-capped ends or Gppp-ends, and compared in translation assays, which showed that both coded for the same product, but the former was far more efficient [6,7], implying that, at some stage, the translation machinery might interact with the mRNA 5′-end. This was confirmed by the finding that eukaryotic ribosomes could not bind to, or initiate translation on, ∼60 nt RNAs that had been covalently circularized. Moreover, when 80 S initiation complexes were formed on reovirus mRNAs in the presence of sparsomycin (to block elongation), the sequence of the ribosome-protected RNA segment included a central AUG triplet that was invariably the first AUG from the 5′-end, and, with some mRNAs, the protected segment also included the 5′-cap. These findings were highly suggestive of a mechanism of initiation site selection by ribosome scanning from the 5′-end [10,11], which was in due course verified by cDNA transfection assays coupled with mutagenesis, as this was still in the era before the development of in vitro transcription methods for RNA production.
The discovery of mRNAs where initiation does not occur exclusively at the 5′-proximal AUG prompted further investigations, leading to minor modification of the original model. First, it was shown that the efficiency of recognition of an AUG triplet by the scanning ribosome is influenced by its local sequence context. If the context of the 5′-proximal AUG is unfavourable, some scanning ribosomes will bypass it and initiate at the second or subsequent AUG. Secondly, if the 5′-proximal initiation site is followed by a sORF (short open reading frame) (<20–30 codons), there may be resumption of scanning and reinitiation at a downstream site. However, if a dicistronic construct is generated with two long, non-overlapping, protein-coding cistrons, there is generally no translation of the downstream cistron.
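The consequence of leaky scanning for the distribution of initiation events can be illustrated with a minimal Python sketch. It assumes that every 40 S subunit reaching an AUG either recognises it with a fixed probability or scans on to the next AUG. The function names, the example sequence and the efficiency values are hypothetical illustrations rather than measured quantities, and reinitiation after a sORF is not modelled.

```python
def find_augs(mrna):
    """Return 0-based positions of all AUG triplets in 5'->3' order (any frame)."""
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]


def initiation_fractions(efficiencies):
    """Fraction of scanning subunits that initiate at each successive AUG.

    efficiencies[k] is the assumed probability that a subunit reaching
    AUG number k recognises it; the remainder bypass it and keep scanning.
    """
    fractions = []
    still_scanning = 1.0  # fraction of subunits that have not yet initiated
    for e in efficiencies:
        fractions.append(still_scanning * e)
        still_scanning *= (1.0 - e)
    return fractions


# Hypothetical mRNA with two AUG triplets
print(find_augs("GGCAAUGGCCUAAGGAUGAAA"))  # [4, 15]

# If the first AUG is in a poor context (assumed 50% recognition) and the
# second is always recognised, initiation is split equally between them.
print(initiation_fractions([0.5, 1.0]))    # [0.5, 0.5]
```

In this toy model, a favourable context at the 5′-proximal AUG (efficiency close to 1) leaves almost no subunits to reach downstream sites, which corresponds to the original first-AUG rule.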
Despite the compelling evidence favouring the modified scanning model, it seemed to many in the field that there were some viral RNA genomes, notably picornavirus RNAs, that could not be accommodated by this model because their long 5′-untranslated regions had so many AUG triplets that were apparently silent as initiation sites. The discovery that insertion of these 5′-untranslated regions between the two cistrons of a dicistronic construct with two long cistrons strongly promoted translation of the downstream cistron raised the idea that the viral sequences could somehow cause direct ribosome entry at an internal site [15,16]. Thus a sequence with these properties came to be known as an IRES (internal ribosome entry site/segment). At first, the concept of internal initiation was regarded as something of a heresy, but, in due course, it has been widely accepted, certainly in the case of viral IRESs. If nothing else, the seminal finding that mammalian ribosomes can translate a covalently circularized mRNA provided that it includes a functional picornavirus IRES element should have silenced the sceptics and closed the debate.
As this potted history shows, the field is greatly indebted to Theo Staehelin and John Hershey for their pioneering work on mammalian initiation factors, and to Marilyn Kozak for her 1975–1990 work on elucidating and proving the scanning mechanism, and its variants. We are also hugely indebted to Tatyana Pestova and Christopher Hellen for their contributions over the past 10 years on the roles of individual initiation factors. In essence, they have revisited the Staehelin and Hershey experiments of the 1970s [1,2], but with much cleaner reagents, recombinant initiation factors wherever possible and more highly purified preparations of native forms of the multisubunit factors. A further advance has been to supplement sucrose-gradient centrifugation methods of detecting ribosome–mRNA complexes with toeprint assays that show the actual position of the ribosome on the mRNA. If this approach of dissecting the initiation system down to its individual components is considered a ‘bottom up’ strategy, then our own approach has been more ‘top down’, where we manipulate the rabbit reticulocyte lysate system in the desired direction, for example, by selectively depleting tRNA, or specific RNA-binding proteins, or certain initiation factors such as eIF4G. The two approaches should not be regarded as rivals, but as complementary to each other. While the ‘bottom up’ strategy is obviously the only way to identify all the players, the advantage of the ‘top down’ approach is that it retains the high activity of the crude reticulocyte lysate, which is close to the intact cell in its efficiency of protein biosynthesis, and thus is likely to be fairly free of artifacts.
An introduction to mammalian translation initiation factors (eIFs)
Although ribosome scanning was the first mechanism of mammalian initiation to be described, and is undoubtedly the mechanism operative with the overwhelming majority of mRNAs, in many ways it remains poorly understood or, rather, it would be poorly understood but for the insights gained from studying some of the non-standard initiation mechanisms. So, not for the first time in biology, and certainly not for the last, studies of the abnormal or deviant have greatly increased understanding of the normal, which is the reason why this review starts with the simplest (in terms of initiation factor requirements) of the various alternative mechanisms and works towards the most complex.
Obviously, this journey through alternative initiation mechanisms cannot be made without reference to the initiation factors, which are listed in Table 1, together with a brief summary of their roles. Further information on their structure and function can be found in a review chapter by Hershey and Merrick . Table 1 highlights the contrast between the extreme complexity of mammalian initiation factors and the relatively simple situation of just three initiation factors (each a single polypeptide) in eubacteria. Despite this awesome difference in complexity, it is somewhat reassuring that there is quite close structural and functional homology between bacterial IF1 and mammalian eIF1A, and between bacterial IF2 and eukaryotic eIF5B, while bacterial IF3 shows functional, but not structural, similarity to eIF1 . Thus the eukaryotic system seems to represent an extreme expansion (with at least 25 novel polypeptides) of the prokaryotic set of factors, rather than a complete de novo reinvention of the wheel.
The most important factors from the standpoint of this discussion are eIF2, eIF3 and the eIF4 family of factors. The first of these has a relatively straightforward function: it forms an eIF2–GTP–Met-tRNAi ternary complex which loads the initiator tRNA on to the 40 S ribosomal subunit . In contrast, mammalian eIF3 is an exceedingly complex factor composed of 12 different polypeptides (although Saccharomyces cerevisiae eIF3 has only six of these). Not surprisingly, given its size, mammalian eIF3 appears to have multiple roles, and seems to orchestrate the function of the other factors. It not only binds to the 40 S ribosomal subunit, although surprisingly mainly contacting the back or solvent-exposed face of the 40 S subunit , but also interacts directly with the central domain of eIF4G . In addition, yeast eIF3 (and probably also its mammalian counterpart) interacts with other initiation factors to form a so-called multifactor complex [21,22], consisting of eIF1, the eIF2 ternary complex, eIF3 and eIF5 (Figure 1A).
The eIF4 factors include eIF4A, the prototype of the DEAD (Asp-Glu-Ala-Asp) box RNA helicase family, and eIF4E, the only factor with specific affinity for the mRNA 5′-cap structure. Both of these are components of the eIF4F holoenzyme complex (Figure 1B). The core of this complex is eIF4G, which appears to fulfil a scaffolding role and binds eIF4E and eIF4A at specific sites . In addition, the central domain of mammalian eIF4G interacts with eIF3 , which also binds to 40 S ribosomal subunits. As these two interactions of eIF3 do not seem to be mutually exclusive, in principle a tripartite interaction relay (eIF4G–eIF3–40 S subunit) is possible, and this is thought to be the key to how the 40 S subunit is loaded on to the mRNA. Both the eIF4F complex and eIF4A as an individual entity have ATP-dependent RNA helicase activity in vitro, which, in both cases, is stimulated by eIF4B for reasons that are poorly understood . However, investigations into the impact of dominant-negative eIF4A mutants on mammalian translation initiation have suggested that eIF4A functions in initiation mainly, if not exclusively, as a component of the eIF4F complex, rather than as a singular entity . Many, but not all, animal picornaviruses encode a protease which cleaves the eIF4G component of the eIF4F complex into an N-terminal one-third fragment and a C-terminal two-thirds fragment (Figure 1B), which is often referred to as p100. As will become evident in the subsequent discussion, there are many situations where p100 can substitute for the uncleaved factor, as can the central one-third fragment (known as p50).
Intergenic IRESs of dicistroviruses
The simplest mechanism of initiation on a naturally occurring mRNA is the translation of the downstream (capsid protein) cistron of the dicistroviruses, a family of invertebrate viruses, most of which infect insects. Translation initiation in this case does not require any of the canonical initiation factors, nor Met-tRNAi, but is dependent on a ∼180 nt intergenic IRES [26,27], which consists of a triple pseudoknot structure, two of them overlapping, while the 3′-pseudoknot (actually known as PK1) is more independent [28,29]. The IRES binds directly to the 40 S ribosomal subunit, mainly via the two overlapping pseudoknots, which interact with the platform and E-site regions, while PK1 appears to occupy the 40 S subunit P-site as a codon–anticodon stem–loop mimic [29–31]. There are actually 5 bp in PK1 [28,29], of which the last 3 bp are the presumed codon–anticodon mimic. Provided that the base-pairing is maintained, the identity of the codon–anticodon mimic is relatively unimportant, and efficient initiation is seen even if the codon mimic is a stop codon [27,32]. The N-terminal residue of the capsid protein is specified by the next codon downstream of this P-site codon mimic, i.e. the codon occupying the ribosomal A-site [27,32]. Binding of the IRES to the 40 S subunit does not hinder 60 S subunit joining, and may even promote it. The cognate aminoacyl-tRNA is delivered to the A-site by elongation factor eEF1A (eukaryotic elongation factor 1A) in the usual way, and then a pseudo-translocation event transfers it to the P-site, whereupon the normal process of elongation can commence [33–35].
Remarkable though it is, this highly unusual mechanism involving what is in effect a bifunctional tRNA–mRNA does not tell us much about the more standard initiation mechanisms, except that it very strongly emphasizes the importance of P-site occupancy. In the latter respect, it is not unlike the mechanism of initiation of translation of poly(U), which occurs at non-physiologically high Mg2+ concentrations necessary to promote mRNA binding to ribosomes, and factor-independent binding of deacylated tRNAPhe to the P-site. Similar to the events on the dicistrovirus IRES, this is then followed by eEF1A-dependent binding of Phe-tRNA to the A-site, which in turn is followed by a pseudo-translocation event to transfer it to the P-site.
HCV (hepatitis C virus) and pestivirus IRESs
The ∼330 nt IRESs of HCV and the animal pestiviruses represent the next level of complexity. As is also seen with the different types of picornavirus IRES, there is much closer homology between HCV and the pestiviruses in their IRESs than there is in the coding region, or especially the 3′-untranslated regions, which are quite different. These IRESs show two distinctive properties which are almost certainly interdependent. First, salt-washed 40 S subunits bind to the IRES at the correct site in the complete absence of any initiation factors, unlike all other mRNAs, except RNAs with a dicistrovirus IRES. Thus, in operational terms, the ∼300 nt IRES serves the same function as the approx. 5 bp prokaryotic Shine–Dalgarno interaction. Secondly, initiation dependent on these IRESs shows no requirement whatsoever for (or participation of) eIF4A, 4B, 4E or 4G (or the eIF4F complex), nor any requirement for ATP hydrolysis. As it is quite likely that eIF1 and eIF1A are also redundant, all that is required is the eIF2–GTP–Met-tRNAi ternary complex, eIF5, eIF5B and eIF3. The last of these is not required for 40 S subunit binding to the mRNA, nor for ternary complex loading, but is most probably needed for subsequent steps by virtue of its interaction with eIF5 [21,22]. Cryo-electron microscopy shows that the bulk of the HCV IRES binds to the back (solvent face) of the 40 S subunit, behind the platform region, but with the 5′-proximal irregular stem–loop reaching round to interact with the subunit interface in the region of the E-site, leaving the P-site vacant to accept the Met-tRNAi of the ternary complex.
The discovery of this unusual mechanism of initiation, where the ability of 40 S subunits to bind directly to the mRNA in the absence of initiation factors makes initiation independent of the eIF4 factors and ATP, had an important impact on our understanding of the more general mechanism, as it implies that the role (or, certainly, the main role) of the eIF4 factors in translation initiation of most other mRNAs is to load the primed 40 S subunits on to the mRNA.
Unusually (for eukaryotic mRNAs), the efficiency of initiation dependent on the HCV and pestivirus IRESs can be influenced very profoundly by the nature of the 5′-proximal coding sequences [38,39], which seems to be largely, if not entirely, due to a strong inhibitory effect of even the slightest secondary structure around the initiation codon [39,40]. This is probably a direct and inevitable consequence of the fact that initiation on these IRESs occurs without any participation of eIF4A (the RNA helicase initiation factor), either as an individual entity or in association with eIF4G as part of the eIF4F complex. This shows a further parallel with prokaryotic initiation, where none of the canonical initiation factors is an RNA helicase, and which is likewise very inefficient if the initiation codon and Shine–Dalgarno motif are occluded by secondary structure that is more stable than a critical threshold value of approx. −6 kcal/mol (1 kcal=4.184 kJ) [41,42]. For every additional −1.4 kcal/mol beyond this threshold, the efficiency of prokaryotic initiation decreases 10-fold [41,42]. Thus, if mutations are introduced to incorporate a previously unstructured initiation site into a −30 kcal/mol hairpin motif, the outcome for a mammalian mRNA translated by the scanning mechanism is a reduction in initiation efficiency of no more than 10–20%, compared with a predicted reduction by a factor of 10¹⁷ for a eubacterial mRNA! Such is the power of helicases!
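The arithmetic behind the eubacterial estimate can be made explicit with a short Python sketch. It assumes a simple log-linear rule built from the figures quoted above (no penalty down to the −6 kcal/mol threshold, then a 10-fold loss of efficiency for each additional −1.4 kcal/mol); the function and parameter names are illustrative only.

```python
import math


def predicted_fold_reduction(hairpin_dg, threshold_dg=-6.0, dg_per_tenfold=1.4):
    """Predicted fold reduction in prokaryotic initiation efficiency.

    hairpin_dg and threshold_dg are free energies in kcal/mol (more negative
    means more stable). Structures no more stable than the threshold are
    assumed to carry no penalty.
    """
    excess = threshold_dg - hairpin_dg  # stability beyond the threshold
    if excess <= 0:
        return 1.0
    return 10 ** (excess / dg_per_tenfold)


fold = predicted_fold_reduction(-30.0)
# (30 - 6) / 1.4 is about 17.1 orders of magnitude
print(f"predicted reduction: about 10^{math.log10(fold):.0f}-fold")
# prints "predicted reduction: about 10^17-fold"
```

The 24 kcal/mol of stability beyond the threshold corresponds to roughly 17 successive 10-fold losses, which is the origin of the 10¹⁷ figure.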
Clearly, initiation without the involvement of an RNA helicase initiation factor, as occurs with HCV and pestivirus IRESs, has potential disadvantages. On the other hand, it would seem to have a potential advantage in that if the virus encoded enzymes which could inactivate any of the eIF4 factors, it could shut down competing host-cell mRNA translation very effectively indeed. Curiously, so far, none of these viruses appears to have (yet) evolved such a strategy.
Picornavirus IRESs
Next in this hierarchy of complexity come picornavirus IRESs, which were first discovered in 1988 in poliovirus and EMCV (encephalomyocarditis virus) RNAs [15,16]. At that time, there were enough sequences available to see that all picornavirus IRESs, apart from HAV (hepatitis A virus), fell into two groups [44,45]: (i) rhino- and entero-viruses (such as poliovirus), which are often designated as type 1 picornavirus IRESs, and (ii) aphtho- and cardio-viruses (e.g. EMCV), which are known as type 2 IRESs. (Some more recently discovered picornaviruses, such as porcine teschovirus, do not fall into either class, but have an IRES with remarkable similarity to the HCV IRES , suggesting the possibility that such IRESs might be acquired from cross-species recombination during mixed infections.) Within each of the original ∼450 nt type 1/type 2 picornavirus IRES groups, there is fair conservation of primary nucleotide sequence and even stronger conservation of predicted secondary structure, but there is very little similarity between the two types, apart from a pyrimidine-rich tract located at the 3′-end of the IRES, as defined by deletion analysis [44,45].
One of the first questions to be addressed was where the actual ribosome entry site, which was defined as the most 5′-proximal position at which initiation by internal ribosome entry could occur, is located. With the EMCV IRES, the answer was straightforward: the ribosome enters at the authentic initiation codon (AUG-11), which is situated at the 3′-end of the IRES, some 25 nt downstream of the start of the pyrimidine-rich tract. Significantly, there is almost no initiation at AUG-10 which is situated just 8 nt upstream of AUG-11 . However, if the pyrimidine-rich tract was lengthened, there was a partial, but incomplete, shift towards initiation at AUG-10 at the expense of AUG-11, suggesting that the selection of the initiation site depends, at least in part, on the distance of the AUG triplet from the start of the pyrimidine-rich tract and the rest of the IRES upstream of this tract .
The situation with type 1 picornavirus IRESs is more complex. There is an absolutely conserved AUG triplet located in a similar position ∼25 nt downstream of the start of the pyrimidine-rich tract, at nt 586 in poliovirus type 1. In fact, there is very little, if any, initiation at this site, possibly because of its unfavourable local sequence context. However, this AUG is certainly important for efficient internal initiation, as mutations to non-AUG codons reduce translation efficiency quite severely [49–51]. The authentic initiation codon is invariably the next AUG downstream, at a distance of ∼40 nt in rhinoviruses and ∼160 nt in enteroviruses. However, deletions of this intervening sequence which therefore bring the authentic initiation codon to a position ∼25 nt downstream of the start of the pyrimidine-rich tract are viable, if producing slightly small plaques [52,53]. Current models envisage internal 40 S ribosomal subunit entry at the near-silent AUG (at nt 586 in poliovirus type 1), followed by transfer to the authentic initiation codon. It remains uncertain whether this transfer is invariably by a linear scanning process similar to that which occurs with capped mRNAs, or whether some 40 S subunits bypass part of the intervening sequences in a form of ribosome ‘shunting’. FMDV (foot-and-mouth disease virus) has an IRES that is clearly of the type 2 class, but uses two initiation sites 84 nt apart, which are accessed by a mechanism that is a hybrid of the EMCV and the type 1 IRES mechanisms. Most probably, all 40 S subunits enter at the upstream Lab site, but only a fraction actually initiate at that site, while the rest scan to the downstream Lb site and initiate translation there .
When picornavirus IRESs were first discovered, it was widely assumed that this type of initiation would require a different set of initiation factors than the scanning mechanism. With the wisdom of hindsight, this presumption seems rather naïve, but it can perhaps be excused on the grounds that (i) at that time, this was the only known example of non-standard initiation, and (ii) it was well established that initiation of enterovirus, and especially rhinovirus, RNA translation was very inefficient in reticulocyte lysates, unless the system was supplemented with HeLa cell cytoplasmic extract. Thus the early 1990s saw numerous efforts to identify the putative IRES-specific initiation factors by using UV-cross-linking to catalogue IRES-binding proteins. This again was rather naïve, as we now know that there are several proteins which bind to IRESs yet apparently play no role in translation. There is no substitute for proper functional (translation) assays, which were eventually used to show that HeLa cell cytoplasm contributes two proteins essential for initiation on rhinovirus IRESs [56,57]: (i) unr (upstream of N-ras), a cytoplasmic RNA-binding protein with five cold-shock domains; and (ii) PTB (polypyrimidine-tract-binding protein), an RNA-binding protein that regulates alternative splicing, and thus is predominantly nuclear, but with a significant cytoplasmic presence. Another cytoplasmic RNA-binding protein, PCBP-2 [poly(C)-binding protein-2], is also required, but is as abundant in reticulocyte lysates as in HeLa cell extracts. Poliovirus IRES activity requires PCBP-2 and PTB [57,58], but the status of any unr requirement is still uncertain. PTB is stimulatory rather than essential for initiation on the type 2 EMCV IRES [59,60], while the FMDV IRES has a strong requirement for PTB and another RNA-binding protein, ITAF45 (IRES trans-acting factor of 45 kDa).
It is somewhat curious that closely related IRESs show rather different requirements for these RNA-binding proteins, although it is worth noting that many closely related picornaviruses use completely different receptors for entry into host cells.
None of these RNA-binding proteins (PTB, unr, PCBP-2, etc.) have characteristics expected of translation initiation factors, and it is thought more likely that their effect on IRES activity results from their binding inducing subtle changes in the three-dimensional structure of the IRES . The question of initiation factor requirements was finally solved, at least for the EMCV IRES, by Tatyana Pestova and Christopher Hellen [62,63], who showed that (i) 40 S subunits do not bind to the IRES in the absence of initiation factors, and (ii) all the canonical initiation factors are required, except that eIF4E is completely redundant, and full-length eIF4G can be replaced by the C-terminal two-thirds fragment (p100), or even the central one-third (p50), both of which include interaction sites for eIF3 and eIF4A (Figure 1B). In fact, this was not unexpected, as many picornaviruses (although not EMCV) encode proteases which cleave eIF4G to generate an N-terminal one-third fragment plus p100 (Figure 1B), which results in a severe, but not necessarily complete, reduction in translation of capped host-cell mRNAs, while the initiation of viral RNA translation is usually stimulated to a significant extent. It was shown that the intact eIF4F complex, or p100 or p50, binds specifically and tightly to an A-rich bulge situated just upstream of the pyrimidine-rich tract in the EMCV IRES [63–65] in fairly close proximity to the authentic initiation codon (AUG-11), so that one can readily envisage how the eIF4G–eIF3–40 S subunit interaction relay could deliver the 40 S subunit to the correct initiation site. However, although this provides a self-consistent model of how the 40 S subunit is delivered to the initiation codon, it cannot be the whole explanation of how the EMCV IRES functions, as it does not account for why internal initiation requires some ∼350 nt of IRES sequence upstream of the eIF4G-binding site.
The type 1 entero-/rhino-virus IRESs are superficially similar in that 40 S subunits do not bind to the IRES in the absence of initiation factors, and p100 or p50 (with eIF4A) can promote more efficient initiation than the intact eIF4F holoenzyme complex. Although these IRESs have no equivalent of the EMCV IRES A-rich bulge, it is presumed that the eIF4F complex, or p100–eIF4A or p50–eIF4A, binds to the IRES in a similar position relative to the pyrimidine-rich tract and the putative ribosome entry site (the AUG at nt 586 in poliovirus type 1) as on the EMCV IRES. On the other hand, the HAV IRES is rather different. It requires the complete eIF4F complex, including eIF4E and intact eIF4G, and is inhibited (i) by cleavage of eIF4G into an N-terminal one-third fragment plus p100, or (ii) by the m7GpppG cap analogue [66,67]. It seems likely that the eIF4E subunit of the eIF4F complex interacts with a flipped out (presumably unmethylated) G residue somewhere within the IRES. While the lack of methylation would make this interaction rather weak, it could be stabilized by the type of direct eIF4G–RNA interaction discussed below. As all these picornaviruses are thought to be descended from a common ancestor, presumably this last common ancestor had factor requirements similar to the present-day HAV IRES, which has remained stuck in an evolutionary backwater, while the other viruses have evolved towards an ability to function with the p100 eIF4G fragment, which in turn has allowed the evolution of mechanisms of inhibiting host-cell mRNA translation via cleavage of eIF4G.
The ribosome-scanning mechanism
Lastly, we come to the ribosome-scanning mechanism, which is generally considered to require all of the canonical initiation factors [18,68]. The basic reaction scheme is shown in Figure 2, which is almost certainly an oversimplification, as it has eIF1, 3 and 5 participating as individual entities and fails to take into account the evidence for a multifactor complex comprising all these factors and the eIF2 ternary complex [19,21,22]. In this simplified scheme, the 40 S ribosomal subunit is thought to be primed for initiation by binding eIF1, 1A and 3, and the eIF2–GTP–Met-tRNAi ternary complex, and perhaps also eIF5, for the reasons just explained. Meanwhile, the eIF4F complex is thought to bind to the mRNA 5′-end, largely via interaction of the eIF4E cap-binding subunit with the methylated 5′-cap, possibly assisted by eIF4G–mRNA interactions (see below). This interaction allows local unwinding of mRNA structure by the eIF4A helicase subunit of eIF4F , assisted in an unknown way by eIF4B. The potential interaction between 40 S-associated eIF3 and the central domain of the eIF4G subunit of eIF4F (Figure 1) then promotes loading of the 40 S subunit on the mRNA, and scanning commences. Once the anticodon of the Met-tRNAi (in the 40 S-associated ternary complex) has engaged an AUG triplet, usually the 5′-proximal AUG subject to some influence of the local sequence context, the GAP (GTPase-activating protein) function of eIF5 triggers GTP hydrolysis and release of eIF2–GDP. Finally, eIF5B catalyses ribosomal subunit joining, and all other initiation factors supposedly dissociate , leaving an 80 S ribosome at the initiation codon with Met-tRNAi in the P-site, ready for the start of elongation.
With an uncapped mRNA, or an mRNA with a non-methylated 5′-cap (Gppp- or Appp-), initiation site selection is still by a scanning mechanism, judging by the preference for the 5′-proximal AUG, and involves the participation of at least part of the eIF4F complex, as it is sensitive to inhibition by dominant-negative eIF4A mutants . On the other hand, the efficiency of translation of these mRNAs is low, unless the K+ concentration is reduced from the usual 100–120 mM. However, cleavage of the eIF4G component of the eIF4F complex by picornavirus proteases (generating an N-terminal one-third fragment and p100), or simply addition of the p100 eIF4G fragment, greatly stimulates uncapped mRNA translation, even at 100 mM K+ [70,71]. In contrast, the same cleavage of eIF4G reduces the efficiency of capped mRNA translation, but, contrary to popular opinion, does not completely abrogate it, even though the cap-binding function of eIF4E associated with the N-terminal one-third of eIF4G becomes physically separated from the eIF4G central domain which has the eIF4A and eIF3 interaction sites that are critical for loading the primed 40 S subunit on the mRNA (Figure 1B). Moreover, the p100 eIF4G fragment (or even the p50 fragment) can support translation of (methylated) capped mRNAs from the 5′-proximal AUG in the eIF4G-depleted system that we have developed, although it requires approx. 4-fold more p100 than intact eIF4F complex to achieve the same product yield [72,73]. Thus initiation of capped mRNA translation is more efficient with the intact eIF4F complex than with p100–eIF4A, while the reverse is true for uncapped mRNAs (Table 2), with the result that the differences in efficiency between capped and uncapped versions of the same RNA largely disappear when p100–eIF4A is driving initiation. 
These considerations led to the suggestion that, in addition to interaction between the 5′-cap and the eIF4E subunit of the eIF4F complex (which can obviously occur only with capped RNAs), there is also direct interaction between the mRNA and eIF4G, most probably with the region of eIF4G just downstream of the protease cleavage site (Figure 1B). This direct mRNA–eIF4G interaction seems to occur preferentially near the 5′-end of the mRNA, regardless of the exact nature of the 5′-terminus, and seems to be of higher affinity with the p100 eIF4G fragment than with intact eIF4G.
An unexpected recent discovery is that if the 5′-untranslated region consists essentially of approx. 20 tandem CAA repeats, which is believed to give a completely unstructured RNA segment, initiation can occur, albeit at somewhat reduced efficiency, even if the eIF4 factors and ATP are omitted. This initiation has an absolute requirement for eIF1, 1A and 3, and appears to be by a scanning mechanism, as there is strong preference for the 5′-proximal AUG, subject to a similar influence of local sequence context, as when the eIF4 factors and ATP are present. However, as the introduction of even the slightest secondary structure into the middle of this 5′-untranslated region makes translation absolutely dependent on the eIF4 factors and ATP, in practice these factors will be needed for virtually all naturally occurring mRNAs.
Although the 40 S subunit is normally pre-loaded with an eIF2–GTP–Met-tRNAi ternary complex before mRNA binding and scanning, it is important to emphasize that a bound Met-tRNAi is not obligatory for 40 S subunit scanning. This is most evident when the 5′-proximal initiation site is followed by a sORF, in which case there is some resumption of scanning with the possibility of reinitiation at a downstream site. The 40 S subunits that resume scanning are initially bereft of Met-tRNAi, but can acquire a ternary complex in the course of migrating in the 5′→3′ direction. Very detailed investigations into the regulation of translation of yeast GCN4 (general control of nitrogen metabolism gene 4) mRNA, which has four very short upstream ORFs, indicate that, under conditions of slightly reduced intracellular ternary complex concentrations, the small ribosomal subunits may travel at least 200 nt before acquiring a ternary complex, completely bypassing three AUGs on the way (because a bound ternary complex is a prerequisite for AUG recognition via codon–anticodon pairing).
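The bypassing of AUGs by subunits that have not yet acquired a ternary complex can be sketched as follows. This is a toy model under stated assumptions: the distance travelled before ternary complex acquisition is treated as a fixed number of nucleotides (in reality it will vary from subunit to subunit), sequence context is ignored, and the names `reinitiation_site`, `bypassed_augs` and the example RNA are hypothetical.

```python
from typing import List, Optional


def aug_positions(mrna: str) -> List[int]:
    """0-based start positions of every AUG triplet, in any frame."""
    seq = mrna.upper().replace("T", "U")
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]


def reinitiation_site(mrna: str, resume_at: int, nt_before_tc: int) -> Optional[int]:
    """First AUG reached after the subunit has acquired a ternary complex.

    resume_at    -- position at which scanning resumes (e.g. after a sORF stop)
    nt_before_tc -- assumed distance scanned before a ternary complex is bound
    """
    competent_from = resume_at + nt_before_tc
    for pos in aug_positions(mrna):
        if pos >= competent_from:
            return pos
    return None


def bypassed_augs(mrna: str, resume_at: int, nt_before_tc: int) -> List[int]:
    """AUGs passed while the subunit still lacks Met-tRNAi."""
    competent_from = resume_at + nt_before_tc
    return [p for p in aug_positions(mrna) if resume_at <= p < competent_from]


# Hypothetical RNA with three AUGs separated by unstructured CAA spacers.
rna = "CAA" * 2 + "AUG" + "CAA" * 3 + "AUG" + "CAA" * 4 + "AUG" + "CAA"
print(aug_positions(rna))
print(reinitiation_site(rna, resume_at=0, nt_before_tc=20))
print(bypassed_augs(rna, resume_at=0, nt_before_tc=20))
```

Lengthening `nt_before_tc` in this sketch corresponds to lowering the ternary complex concentration: more upstream AUGs are bypassed and reinitiation shifts to a more distal site.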
One of the frustrating features of ribosome scanning is that it is almost impossible to trap scanning intermediates, or to catch 40 S subunits in the act of scanning. There is some evidence that, if ATP is limiting, the 40 S subunits do not migrate far from the 5′-cap, and that certain inhibitors of initiation (edeine, pactamycin or sodium fluoride), inosine-substituted mRNAs or just a low Mg2+ concentration cause scanning 40 S subunits to bypass potential AUG initiation codons, presumably by interfering with codon–anticodon recognition [76–78]. In addition, omission of eIF1 and eIF1A causes scanning 40 S subunits to stall a short distance downstream of the cap, perhaps because eIF1 plays a critical role in monitoring the fidelity of initiation codon–Met-tRNAi anticodon interaction. However, it is doubtful whether these stalled 40 S subunits can be considered true scanning intermediates, since the delayed addition of eIF1 and eIF1A does not cause them to resume scanning, but, rather, promotes their dissociation from the mRNA, whereupon a de novo attempt at scanning is made, which is now successful because of the presence of these two factors. We have very little idea whether scanning is a strictly linear and unidirectional nucleotide-by-nucleotide inspection process, or whether it is more of a ‘shuffling’ motion, somewhat akin to bidirectional diffusion. Neither do we know whether the nature of the movement is any different if the 40 S subunits lack bound Met-tRNAi, or if they are scanning an unstructured CAA repeat 5′-untranslated region in the absence of the eIF4 family of factors and ATP. One indication that scanning may not be a strictly unidirectional nucleotide-by-nucleotide search is the failure of the scanning mechanism to discriminate between two closely spaced AUG codons which have what appear to be equally favourable local sequence contexts, as is found with the NA/NB mRNA of the influenza B viruses. Of the various types of mutation examined, the only one which promoted selective use of the upstream AUG was one which increased the separation between the two.
What determines whether mammalian ribosomes resume scanning after translation of an upstream sORF?
The critical parameter that determines whether there is a resumption of scanning appears to be not so much the actual length of the sORF itself, but the time taken to translate it, as the presence of a pseudoknot structure that would be expected to cause ribosomal pausing during translation of the sORF abrogates reinitiation at downstream sites. This suggests that the initiation factors which were instrumental in promoting scanning-dependent initiation at the sORF initiation codon may remain ribosome-associated very briefly (probably for less than approx. 20 s), while translation of the sORF is proceeding. Given the near-impossibility of direct detection of such short-lived interactions, we have examined this issue by investigating whether reinitiation is dependent on the actual mechanism of initiation of translation of the sORF.
We find that, in all circumstances in which the complete eIF4F complex or the p100 eIF4G fragment participates in initiation at the sORF AUG, there is reasonably efficient (20–40%) reinitiation at the downstream site. However, if sORF translation is dependent on a pestivirus IRES or a dicistrovirus intergenic IRES, and thus occurs without any involvement of the eIF4F complex (or any form of eIF4G that includes the central domain), reinitiation is negligible (approx. 1%). The critical test was the mRNA with an unstructured CAA repeat 5′-untranslated region, which is translated by a scanning mechanism, but one which is not absolutely dependent on the eIF4 factors. In a standard reticulocyte lysate, where the eIF4 factors are all present and are likely to participate in initiation of translation of this mRNA, we observe reinitiation at approx. 25% efficiency, but, in the eIF4G-depleted lysate system that we have developed, there is virtually no reinitiation.
These results suggest that, when the eIF4F complex, or at least the eIF4G p100 fragment with associated eIF4A, participates in initiation of translation of the sORF, the interaction of these factors with the 40 S subunit (which is likely to be an indirect interaction via eIF3 as an intermediary) is not disrupted immediately, but may persist for a few seconds. If it persists until such time as sORF translation has been completed and terminated, these factors can then promote further 40 S subunit scanning and reinitiation further downstream. However, if the interaction is disrupted before translation of the sORF is complete, then termination of sORF translation is probably followed automatically by dissociation of both ribosomal subunits from the mRNA. If these ideas prove to be correct, the currently favoured model (Figure 2), which has all the initiation factors dissociating from the ribosome at the stage of eIF5B-promoted ribosomal subunit joining, may need minor revision to allow for a slight delay in the release of some of the factors.
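The proposed race between factor release and completion of sORF translation can be put into a minimal sketch. Everything quantitative here is an assumption made for illustration: the elongation rate, the pause duration and the factor retention time are hypothetical inputs, not measured values, and factor release is treated as a sharp cut-off rather than a stochastic event.

```python
def sorf_translation_time(codons: int, codons_per_s: float, pause_s: float = 0.0) -> float:
    """Time to translate a sORF, plus any pausing (e.g. at a pseudoknot)."""
    return codons / codons_per_s + pause_s


def can_resume_scanning(codons: int, codons_per_s: float,
                        factor_retention_s: float, pause_s: float = 0.0) -> bool:
    """True if the eIF4G/4A-type factors are assumed to be still associated
    with the ribosome when sORF translation terminates.

    factor_retention_s is a hypothetical hard cut-off on how long the factors
    persist after initiation at the sORF AUG.
    """
    return sorf_translation_time(codons, codons_per_s, pause_s) <= factor_retention_s


# Hypothetical numbers: a 5-codon sORF, 1 codon/s, factors retained for 10 s.
print(can_resume_scanning(5, 1.0, factor_retention_s=10.0))                # short sORF
print(can_resume_scanning(5, 1.0, factor_retention_s=10.0, pause_s=30.0))  # with a long pause
```

The point of the sketch is only that the same sORF can permit or prevent reinitiation depending on elapsed time, which is the behaviour that the pseudoknot experiments suggest.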
Reinitiation after translation of a long ORF
Although the conventional wisdom is that ribosomes which have completed translation of a long ORF cannot reinitiate translation at downstream sites, there have been occasional claims to the contrary. Most of these have not stood up to closer scrutiny, but there are a few candidates worthy of consideration among viral RNAs, of which the sub-genomic RNA of caliciviruses is the best studied example. This RNA has a 5′-proximal cistron (ORF-1) coding for the ∼75 kDa major capsid protein precursor and a downstream ORF-2 encoding a ∼10 kDa putative minor capsid protein that is produced in ∼15% relative molar yield. Depending on the particular virus species, the two ORFs overlap by 1–8 nt, and, in the FCV (feline calicivirus) sub-genomic RNA that we have studied, the overlap is 4 nt (AUGA).
In vitro time-course assays show that initiation of FCV sub-genomic ORF-2 translation does not start until the first ribosomes have completed translation of ORF-1, indicative of a translational coupling mechanism involving a termination–reinitiation event (T.A.A. Pöyry, A. Kaminski, E.J. Connell and R.J. Jackson, unpublished work). Surprisingly, initiation of ORF-2 translation is not inhibited by dominant-negative eIF4A mutants, which implies that eIF4G/4A plays no role in the reinitiation process. Substitution of reporters for viral sequences shows that the only viral sequence specifically required for ORF-2 expression is the last ∼84 nt of ORF-1, which is strikingly in agreement with the results that Meyers obtained with the rabbit calicivirus sub-genomic RNA. We find that translation through this ∼84 nt segment is necessary, as ORF-2 expression is abolished if the ORF-1 termination codon is moved upstream. On the other hand, the termination–reinitiation event does tolerate a small (10–15 codons) downstream displacement of the ORF-1 stop codon, but, in such circumstances, reinitiation still occurs at the original endogenous ORF-2 initiation codon rather than at any AUG introduced into the ORF-2 frame closer to the displaced ORF-1 termination codon. We have good evidence that the critical terminal 84 nt segment of ORF-1 has the ability to bind eIF3. As eIF3 can interact with the 40 S ribosomal subunit, this suggests that when ORF-1 translation terminates at a stop codon located within a short distance of the 84 nt segment, the 40 S ribosomal subunit is retained on the mRNA (via interaction with the bound eIF3) in an appropriate position and orientation relative to the ORF-2 initiation codon, so that, once this 40 S subunit has acquired a ternary complex, initiation can occur without any requirement for an eIF4G/4A complex.
The failure to see any ORF-2 expression if the ORF-1 termination codon is moved to upstream positions, or is displaced too far downstream, implies that this type of reinitiation is an unusual process requiring a special event (in this case, binding of eIF3 to the appropriate segment of RNA). This in turn implies that the usual default event following termination of translation of a long cistron will be complete dissociation of both ribosomal subunits from the mRNA and no reinitiation.
A number of basic principles emerge from this survey of the alternative mechanisms of initiation site selection:
There can be no initiation without occupancy of the ribosomal P-site by an appropriate ligand, which is almost invariably Met-tRNAi loaded by eIF2. At elevated Mg2+ concentrations, other tRNAs can enter the P-site; for example, deacylated tRNAPhe when poly(U) is used as a surrogate mRNA. Exceptionally, the dicistrovirus IRESs have evolved an acceptable P-site codon–anticodon stem–loop mimic.
Loading the primed 40 S subunits on to the mRNA nearly always requires the eIF4 factors and ATP. The critical component in this 40 S subunit delivery is the central domain of eIF4G, which has the sites for interaction with eIF4A and eIF3 [20,23], but larger fragments of eIF4G may give more efficient initiation on some types of mRNA (e.g. the intact eIF4F complex with capped mRNAs). In exceptional cases, the eIF4 factors may be redundant for 40 S subunit binding to mRNA: (i) if the RNA sequence and structure allows direct factor-independent binding of 40 S subunits at the appropriate site to place the AUG codon in the P-site, as with HCV-like IRESs; or (ii) if the 5′-untranslated region is completely unstructured (e.g. when it consists of tandem CAA repeats), and, in this case, eIF1, 1A and 3 are absolutely required.
If the 40 S subunit with associated ternary complex is loaded on to the mRNA at a site where there is no cognate initiation codon, there is a propensity for it to scan the mRNA until codon–anticodon interaction occurs. While this scanning normally involves participation of the eIF4 factors and ATP, the fact that 40 S subunits can scan the CAA repeat 5′-untranslated region in the complete absence of any eIF4 factors and ATP suggests that the propensity to scan is a property either of the 40 S subunits themselves or of the 40 S–eIF1–eIF1A–eIF3 complex, and that the involvement of the eIF4 factors and ATP in scanning is more to eliminate inhibitory secondary structure than an intrinsic part of the scanning process.
This totally surprising honour makes me deeply aware of how much I owe to colleagues and collaborators. Even though it would clearly be impossible to name them all, I thank every single one of them. At the risk of making invidious divisions, I further record my special gratitude to Tim Hunt, Tony Hunter, Hugh Pelham and Paul Farrell for their input during my formative years; and Ann Kaminski, Tuija Pöyry, Iraj Ali, Sarah Hunt, Simon Morley, Simon Fletcher, Emma Brown, Julie Batley and Mike Howell for their pivotal contributions to the results that have been discussed in this article. I also thank Tuija Pöyry for invaluable help with the Figures. Work from our laboratory described herein has been supported in the main by the Wellcome Trust, with some additional funding from the Medical Research Council.
Abbreviations: eEF1A, eukaryotic elongation factor 1A; EMCV, encephalomyocarditis virus; FCV, feline calicivirus; FMDV, foot-and-mouth disease virus; (e)IF, (eukaryotic) initiation factor; HAV, hepatitis A virus; HCV, hepatitis C virus; IRES, internal ribosome entry site/segment; (s)ORF, (short) open reading frame; PCBP-2, poly(C)-binding protein-2; PK1, pseudoknot 1; PTB, polypyrimidine-tract-binding protein; unr, upstream of N-ras
© 2005 The Biochemical Society