The splicing regulator protein Tra2β is conserved between humans and insects and is essential for mouse development. Recent identification of physiological RNA targets has started to uncover molecular targets and mechanisms of action of Tra2β. At a transcriptome-wide level, Tra2β protein binds a matrix of AGAA-rich sequences mapping frequently to exons. Particular tissue-specific alternatively spliced exons contain high concentrations of high scoring Tra2β-binding sites and bind Tra2β strongly in vitro. These top exons were also activated for splicing inclusion in cellulo by co-expression of Tra2β protein and were significantly down-regulated after genetic depletion of Tra2β. Tra2β itself seems to be fairly evenly expressed across several different mouse tissues. In the present paper, we review the properties of Tra2β and its regulated target exons, and mechanisms through which this fairly evenly expressed alternative splicing regulator might drive tissue-specific splicing patterns.
- gene expression
- high-throughput sequencing of RNAs isolated by cross-linking immunoprecipitation (HITS-CLIP)
- RNA-binding proteins
- RNA splicing
Tra2β protein is a splicing activator conserved between fruitflies and mice
Alternative splicing introduces new coding information into mRNAs, and so plays a pivotal role in expanding genome capacity to encode more proteins than just the ~23000 that would be expected if each gene encoded a single protein. Alternative splice events are controlled in part by a number of different RNA-binding proteins attaching to pre-mRNAs, although links with transcription and epigenetic modification of the template chromatin are also important. A large group of splicing regulator proteins contain domains that are enriched in arginine and serine residues (so-called RS domains, based on the one-letter amino acid code) [2,3]. These include Tra2 proteins, which have a modular organization comprising a single central RRM (RNA recognition motif) flanked on either side by RS domains [4,5].
A single Tra2 protein is found in fruitflies, where Tra2 is one of the classical splicing regulators controlling sexual differentiation as well as being essential for spermatogenesis [6–8]. The fruitfly gene is the source of the acronym Tra2, which is short for transformer 2, since mutations in this gene transform sexual phenotype. The Tra2 gene has duplicated in vertebrates, resulting in two mammalian Tra2 proteins with 63% amino acid identity (aligned at http://www.ebi.ac.uk/Tools/emboss/align/). These proteins are called Tra2α (encoded by the Tra2a gene on mouse chromosome 6) and Tra2β (encoded by the Sfrs10 gene on mouse chromosome 16). Tra2β binds to exons to regulate their alternative splicing inclusion. For example, Tra2β binds to the testis-specific T exon in the homeodomain interacting protein kinase 3 gene to regulate its splicing inclusion in the testis [9,10]. Recently, the details of exactly how Tra2β protein binds to both AGAA and CAA target RNA sequences have been revealed at atomic resolution, and involve protein–RNA interactions with both the RRM and flanking regions [11,12].
Transcriptome-wide identification of splicing targets for Tra2β
Despite similar amino acid sequences and RNA-binding specificities between Tra2α and Tra2β, genetic deletion of the Sfrs10 gene still results in embryonic lethality even though the Tra2a gene remains intact. Sfrs10−/− mice die at approximately 12 days of gestation. This indicates either non-redundancy with Tra2α at this stage of development or that expression levels achieved from both genes are needed for embryonic development. Similarly, Tra2β is essential in the embryonic brain.
Because of the known functions for Tra2β protein in splicing, it is likely that defects in splicing regulation are a major contributing factor to the embryonic death of Sfrs10−/− mice. Such defects would probably lead to downstream changes in mRNA and protein isoforms impacting on development. Previously there were just a handful of known splicing targets for Tra2β, and these were not clearly mis-regulated in the absence of Tra2β [10,16]. Recently, HITS-CLIP (high-throughput sequencing of RNAs isolated by cross-linking immunoprecipitation) has been used to comprehensively identify endogenous RNA targets for Tra2β during mouse germ cell development. In this procedure, endogenous target RNAs are cross-linked by UV radiation, then short fragments bound to Tra2β are rigorously purified, amplified, deep sequenced and mapped on to the genome. As an illustration of the resolution of this technique in identifying binding sites in vivo, the CLIP tags mapping to one of these identified target exons (Nasp-T) are shown in Figure 1.
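As a rough illustration of how mapped CLIP tags can be summarized over a target exon, the following Python sketch computes per-nucleotide tag depth across an exon interval. The coordinates, the tag list and the function name `clip_coverage` are hypothetical and are not taken from the published dataset; a real analysis would start from aligned sequencing reads rather than hand-written intervals.

```python
def clip_coverage(tags, exon_start, exon_end):
    """Per-base CLIP tag depth across an exon.

    tags: iterable of (start, end) genomic intervals, 0-based and half-open.
    exon_start, exon_end: exon interval in the same coordinate system.
    Returns a list with one depth value per exon nucleotide.
    """
    depth = [0] * (exon_end - exon_start)
    for tag_start, tag_end in tags:
        # Clip each tag to the exon boundaries before counting.
        lo = max(tag_start, exon_start)
        hi = min(tag_end, exon_end)
        for pos in range(lo, hi):
            depth[pos - exon_start] += 1
    return depth


# Hypothetical tags: three overlap the exon, one maps elsewhere.
tags = [(95, 125), (100, 130), (110, 140), (300, 330)]
depth = clip_coverage(tags, exon_start=100, exon_end=150)
print("peak depth:", max(depth))
print("covered nucleotides:", sum(1 for d in depth if d > 0))
```

Positions where many independent tags pile up are candidate binding sites; regions of the exon with no tags contribute zero depth.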
Target exons that depend on Tra2β for splicing inclusion should be mis-spliced in the absence of the Tra2β protein. Splicing analyses in the brains of mice containing Sfrs10−/− neurons indicated that splicing inclusion of at least two of the target exons identified from the HITS-CLIP screen (within the Nasp and Tra2a genes) was indeed strongly down-regulated in the absence of Tra2β protein, thus identifying these as physiologically regulated target exons.
The identification of these target exons reveals for the first time biological functions operating downstream of the Tra2β protein in mouse development. The Nasp gene itself is essential for mouse development and encodes a protein that mediates histone import into the nucleus and assembly of chromatin after replication. The Tra2β-regulated Nasp-T exon is very long at 975 nt, making this single exon almost as long as the rest of the Nasp mRNA. This 975 nt exon length is divisible by three, so the exon inserts the ORF (open reading frame) for a peptide cassette into the Nasp mRNA in those tissues where it is spliced. In adult mice, the Nasp-T exon is most actively spliced in the testis (it is also spliced in the embryo). In the testis, the longer encoded Nasp protein isoform is associated with meiotic chromosomes where it forms part of the complex that monitors the completion of double-strand break repair [18,20–22]. Hence, by regulating splicing of the Nasp-T exon, Tra2β operates as one of the key upstream regulators for this crucial physiological process in mouse male germ cells.
Tra2β also physiologically regulates splicing inclusion of a ‘poison’ exon into the Tra2a mRNA [9,23]. This poison exon inserts in-frame stop codons into the Tra2a mRNA, thereby targeting it for nonsense-mediated decay and preventing subsequent production of Tra2α protein. The identification of this regulated poison exon has thus revealed a novel pathway of feedback control, which operates between the vertebrate Tra2 proteins: overexpression of Tra2β protein leads to increased splicing inclusion of the Tra2a poison exon and down-regulation of Tra2α expression.
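The contrast between a frame-preserving cassette exon such as Nasp-T and a poison exon such as that in Tra2a can be made concrete with a small Python sketch. It checks whether an exon length is a multiple of three and lists stop codons that fall in the reading frame. The toy sequences are invented for illustration (they are not the real exon sequences), sequences are written as DNA, and the `phase` parameter (the offset of the first complete codon within the exon) is an assumption of this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def preserves_frame(exon_length):
    """True if including the exon leaves the downstream reading frame unchanged."""
    return exon_length % 3 == 0


def in_frame_stops(exon_seq, phase=0):
    """Return 0-based positions of stop codons read in frame.

    phase: offset (0, 1 or 2) of the first complete codon within the exon.
    """
    seq = exon_seq.upper()
    return [i for i in range(phase, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


# A 975 nt exon keeps the frame and would add 975 // 3 codons.
print(preserves_frame(975), 975 // 3)

# Toy sequences (hypothetical): the first carries an in-frame stop codon,
# the second contains TGA only out of frame.
print(in_frame_stops("ATGGCCTAAGGC"))
print(in_frame_stops("ATGACC"))
```

An exon that returns an empty list and has a length divisible by three behaves as a peptide cassette; an exon that returns one or more positions introduces a premature stop codon, the situation that exposes an mRNA to nonsense-mediated decay.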
Both the Nasp-T and Tra2a exons are found in all available vertebrate genome sequences, indicating strong selective pressure [24–28]. High conservation of the Nasp-T and Tra2a exons is particularly remarkable in the context of the testis, since many alternative exons included in this tissue are not conserved between species. Functionally important exons also tend to be frequently included into mRNAs in at least some tissues. In mice, the Nasp-T and Tra2a exons are spliced into mRNAs at high levels in testis, and also spliced into mRNAs in other adult tissues and in the embryo. The defects in Nasp-T and Tra2a splicing may therefore contribute to the phenotype of the Sfrs10−/− mice. Since RNA-binding proteins are global regulators, defects resulting from their absence are likely to be widespread. Tra2β can bind and potentially regulate thousands of exons in parallel across the transcriptome, many of which might contribute to the embryonic or brain phenotype when Tra2β is missing [14,15].
Multiple binding sites are needed for Tra2β-mediated splicing activation of Nasp-T and Tra2a
Both the Nasp-T and Tra2a poison exons have two key features relating to their particular dependence on Tra2β for splicing inclusion, as discussed below.
A high frequency of Tra2β-binding sites
Both the Nasp-T and Tra2a exons contain multiple Tra2β-binding sites inferred from transcriptome-wide analysis. Within the transcriptome-wide dataset of RNA targets, particular 6-mer sequences were identified as enriched compared with either the genomic or testis-specific transcriptomic backgrounds. Each of these most frequently recovered 6-mers represents a binding site for Tra2β recognized in vivo within the mouse transcriptome, and all were subtle variations of AGAA-rich sequences, very similar to the known Tra2β-binding site [11,12].
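The idea of 6-mer enrichment relative to a background can be sketched in Python as follows. This is a minimal illustration and not the statistical procedure used in the original study: it simply compares the frequency of each 6-mer in a set of CLIP tag sequences with its frequency in a background set, using a pseudocount (an assumption of this sketch) to avoid division by zero. All sequences shown are invented.

```python
from collections import Counter


def kmer_counts(seqs, k=6):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def kmer_enrichment(clip_seqs, background_seqs, k=6, pseudocount=1.0):
    """Ratio of k-mer frequency in CLIP tags to frequency in the background.

    Returns a dict mapping each k-mer seen in the CLIP tags to its ratio.
    """
    fg = kmer_counts(clip_seqs, k)
    bg = kmer_counts(background_seqs, k)
    fg_total = sum(fg.values()) + pseudocount
    bg_total = sum(bg.values()) + pseudocount
    return {
        kmer: ((count + pseudocount) / fg_total)
        / ((bg[kmer] + pseudocount) / bg_total)
        for kmer, count in fg.items()
    }


# Hypothetical CLIP tags (AGAA-rich) and hypothetical background sequences.
clip = ["AGAAGAAG", "GAAGAAGA"]
background = ["ACGTACGTAC", "TTGCAGAAGC"]
scores = kmer_enrichment(clip, background)
for kmer, ratio in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(kmer, round(ratio, 2))
```

Ranking 6-mers by such a ratio gives an ordered list in which the top entries are the most over-represented sequences in the bound RNA relative to the chosen background.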
These frequently recovered 6-mers were used as a Tra2β-binding site matrix, with the most frequently recovered 6-mers representing the best physiological binding sites. When analysed according to preferred physiological binding sites, the Nasp-T exon contains approximately 37 Tra2β-binding sites (defined as containing sequences from the 25 most frequently recovered 6-mers in the transcriptome-wide dataset, with particular importance being placed on the top five recovered 6-mers); and the Tra2a exon contains approximately 12 binding sites. Note that these numbers are approximations: some predicted Tra2β-binding sites overlap, making it difficult to be precise. Tra2β-binding sites clustered in the upstream and downstream portions of the Nasp-T exon identified by CLIP are shown in Figure 1.
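Why overlapping sites make the count approximate can be seen in a short Python sketch that scans an exon for a list of 6-mers and then merges overlapping matches into clusters. The three 6-mers and the exon sequence below are placeholders chosen for illustration; they are not the published top-25 matrix or the real Nasp-T sequence.

```python
def find_sites(exon_seq, kmers):
    """Return sorted (start, end) intervals for every match of any k-mer."""
    seq = exon_seq.upper()
    hits = set()
    for kmer in kmers:
        start = seq.find(kmer)
        while start != -1:
            hits.add((start, start + len(kmer)))
            # Step by one so that overlapping matches are also found.
            start = seq.find(kmer, start + 1)
    return sorted(hits)


def merge_overlaps(intervals):
    """Merge overlapping intervals into non-overlapping clusters."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Placeholder AGAA-rich 6-mers and a toy exon sequence (both hypothetical).
example_kmers = ["AGAAGA", "GAAGAA", "AAGAAG"]
toy_exon = "CCAGAAGAAGCCCCCCAGAAGACC"

raw_hits = find_sites(toy_exon, example_kmers)
clusters = merge_overlaps(raw_hits)
print("raw 6-mer matches:", len(raw_hits))
print("merged clusters:", clusters)
```

In this toy sequence, four raw 6-mer matches collapse into two clusters, so the number of sites reported depends on whether overlapping matches are counted separately or together.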
Highly efficient Tra2β binding
Consistent with their high concentration of binding sites, both the Nasp-T and Tra2a exons bound Tra2β protein very efficiently in gel shift experiments, and less stable complexes were formed after binding site mutagenesis.
By looking at the exon sequences of Nasp-T and Tra2a, it is possible to estimate when splicing regulation by Tra2β might have evolved. Superimposition of binding sites and detailed phylogenetic comparisons of exon sequence show that the Tra2a poison exon is likely to be controlled by Tra2β across all vertebrates (all of which contain multiple binding sites), whereas Tra2β control of Nasp-T is more likely to have evolved in mammals.
Models for physiological splicing control by Tra2β
The Sfrs10 gene is expressed fairly evenly in different mouse tissues. This raises the important question: how does Tra2β regulate tissue-specific splicing patterns? Two conceptually different (but not mutually exclusive) models might explain how Tra2β protein operates as a tissue-specific splicing factor.
Tissue-specific patterns of splicing may be controlled by differences in the cellular concentration of Tra2β (Figure 2A).
Subtle but important differences in Tra2β protein concentration might occur both between and within tissues to drive tissue-specific splicing patterns. For example, within the mouse testis, Tra2β protein is low in spermatogonia (a cell population including the stem cells), but expression is higher in spermatocytes (the meiotic cells). Tra2β protein levels then fall again in round spermatids (the post-meiotic haploid cells). The splicing inclusion of exons such as Nasp-T and Tra2a thus might peak in spermatocytes, which have the highest Tra2β protein concentration to activate splicing inclusion. When compared between different tissues, total RNA and protein preparations would conceal these intrinsic gradients of Tra2β protein concentration. Another piece of evidence argues for this kind of differential splicing control within tissues. Tra2β protein regulates its own level through regulating splicing inclusion of a poison exon into its own mRNA, and splicing levels of this poison exon are highest in the testis, possibly also in meiotic cells expressing maximum levels of Tra2β protein. Muscle also shows high levels of inclusion of this poison exon.
Tra2β might regulate specific exons that are de-repressed in particular tissues or cell types and thus available for activation (Figure 2B).
In this model, alternative splicing changes might be driven not directly by changes in Tra2β protein concentration, but rather by changes in the concentration of splicing repressors which antagonize Tra2β. Most exons are under combinatorial control, meaning they are influenced by a combination of both activating and repressive activities of splicing factors. Particular tissue-specific exons that contain Tra2β protein-binding sites might be strongly repressed in some tissues due to high concentrations of these splicing repressors, but more weakly repressed in other tissues or cell types where the expression of these repressors is weaker. In tissues containing lower levels of splicing repressors, Tra2β might operate as a tissue-specific activator of exons which contain Tra2β-binding sites, even though its actual expression level may not be different compared with the former tissues. For example, hnRNPA1 protein (a splicing repressor) is down-regulated over the course of germ cell development, so de-repressed exons that contain Tra2β-binding sites might be co-ordinately activated at this time.
Irrespective of the actual model that explains tissue-specific splicing of individual exons by Tra2β (referred to as ‘activator driven’ or ‘repressor driven’ in Figure 2), experimental evidence shows directly that multiple adjacent Tra2β-binding sites are important to activate tissue-specific exons (Figure 2). Since RNA–protein interactions are likely to be dynamic in the nucleus and have both on and off rates, multiple adjacent binding sites might simply act to increase the probability of occupancy by a single Tra2β protein at any one time to activate exon splicing. It is, however, difficult to see how mutation of just two binding sites out of a total of ~37 potential sites in Nasp-T could explain the ~80% reduction in splicing inclusion using this kinetic probability model.
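The difficulty with the kinetic probability model can be illustrated with a deliberately simple calculation. Assume, purely for the sake of argument, that the sites are independent, that each is occupied with the same probability p at any instant, and that splicing activation requires at least one occupied site; the probability of occupancy is then 1 − (1 − p)^n for n sites. The value of p is unknown, so the Python sketch below tries several hypothetical values.

```python
def occupancy(n_sites, p_bound):
    """Probability that at least one of n independent sites is occupied."""
    return 1.0 - (1.0 - p_bound) ** n_sites


def relative_drop(n_before, n_after, p_bound):
    """Fractional loss of occupancy when the number of sites is reduced."""
    before = occupancy(n_before, p_bound)
    after = occupancy(n_after, p_bound)
    return (before - after) / before


# Hypothetical per-site occupancy probabilities; 37 -> 35 sites mirrors
# mutation of two sites out of ~37 in Nasp-T.
for p in (0.01, 0.05, 0.2):
    print(f"p = {p}: occupancy falls by {relative_drop(37, 35, p):.1%}")
```

Under these assumptions, the loss of two sites reduces occupancy by only a few per cent (approaching 2/37 when p is small, and less when p is larger), far short of an ~80% reduction in splicing inclusion if inclusion simply tracked occupancy.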
A more likely scenario than the probability model is that adjacent RNA-binding sites might act as a platform to assemble multiple Tra2β proteins into a splicing activator complex. Gel shifts show large protein complexes assembling on these exons in vitro [9,15]. For some exons such as Nasp-T, a version of Tra2β unable to directly bind RNA by itself can still co-activate splicing in cellulo, possibly by attaching to splicing complexes already nucleated by wild-type endogenous Tra2β protein on the regulated exons [9,15]. Binding site mutagenesis of Nasp-T also resulted in large differences in splicing inclusion levels between individual double mutants, indicating that the organization of available Tra2β-binding sites is important as well as the number of sites available.
Overexpressed Tra2β protein might also play a key role in disease
The above studies are starting to address the role of Tra2β protein in normal development. Tra2β might have additional roles in situations when it is expressed above its normal cellular concentration, perhaps by enabling it to activate weaker target exons than would be normally regulated. One such important target exon is SMN2 exon 7.
Splicing of SMN2 exon 7 was not obviously affected in the Sfrs10−/− mice, but was activated by overexpression of Tra2β in transfected cells. Regulation of SMN2 makes overexpression of Tra2β a potential therapeutic option to treat the developmental disease SMA (spinal muscular atrophy). SMA is caused by deletion of the SMN1 gene. Expression of the adjacent SMN2 gene is normally very low, since exon 7 is poorly spliced into the SMN2 mRNA, resulting in an unstable protein product. By improving exon 7 splicing, Tra2β could improve expression from the SMN2 gene and ameliorate disease in some patients. In addition to potentially useful roles in therapy, Tra2β has also been implicated as a more sinister modifier of other diseases, including breast cancer, where overexpression leads to increased inclusion of CD44 alternative exons associated with metastasis.
Recent research has identified new physiological target RNAs for Tra2β in development, and revealed it as an important developmental splicing regulator. Gradients in Tra2β protein concentrations may drive exon inclusion patterns over development, leading to the regulated production of developmentally important protein isoforms. Alternatively, Tra2β might provide a level of intrinsic background activation which is enough to activate available de-repressed exons that contain Tra2β-binding sites and generate tissue-specific splicing profiles. Future work will continue to elucidate the role of Tra2β both in splicing regulation and in the metabolism of other kinds of cellular RNA target.
This work was supported by the Wellcome Trust [grant numbers WT080368MA and WT089225/Z/09/Z (to D.J.E.)], the Biotechnology and Biological Sciences Research Council [grant numbers BB/D013917/1 and BB/I006923/1 (to D.J.E.)] and the Breast Cancer Campaign (to D.J.E.).
RNA UK 2012: An Independent Meeting held at The Burnside Hotel, Bowness-on-Windermere, Cumbria, U.K., 20–22 January 2012. Organized and Edited by Raymond O'Keefe and Mark Ashe (Manchester, U.K.).
Abbreviations: HITS-CLIP, high-throughput sequencing of RNAs isolated by cross-linking immunoprecipitation; RRM, RNA recognition motif; SMA, spinal muscular atrophy
- © The Authors Journal compilation © 2012 Biochemical Society