The complex processes of mRNA transcription and splicing were traditionally studied in isolation. In vitro studies showed that splicing could occur independently of transcription, and the received wisdom was that, to a large extent, it probably did. However, there is now abundant evidence for functional interactions between transcription and splicing, with important consequences for splicing regulation. In the present paper, we summarize the evidence that transcription affects splicing and vice versa, and the more recent indications of epigenetic effects on splicing, through chromatin modifications. We end by discussing the potential for a systems biology approach to obtain better insight into how these processes affect each other.
- histone modification
- RNA polymerase
- small nuclear ribonucleoprotein particle (snRNP)
When protein-encoding transcripts are produced by RNAPII (RNA polymerase II), they are subject to processing at their 5′-ends (capping), 3′-ends (cleavage and polyadenylation) and internally (splicing to remove non-coding intron sequences). There has been a gradual move away from the dogma that these processes are independent events, and it is now widely accepted that some of these modifications occur co-transcriptionally, that is, while RNAPII is still elongating the nascent RNA. There are many excellent reviews on the subject [1–5]. In the present paper, the emphasis is on cross-talk between transcription, splicing and chromatin remodelling.
Early electron microscopy studies showed that many, but not all, introns are removed in a co-transcriptional manner. However, it was only relatively recently that the application of chromatin immunoprecipitation allowed the co-transcriptional recruitment of splicing factors to be monitored in real time [7–11]. Owing to their close proximity to the DNA template, it is possible to cross-link co-transcriptionally recruited splicing factors to the DNA by treating cells with formaldehyde. After cell lysis, chromatin is sheared and immunoprecipitated with antibodies raised against specific splicing factors. At this point, the cross-links are reversed and the recovered DNA is amplified by PCR using specific primers. It is therefore possible to detect the sequential recruitment of splicing factors during co-transcriptional spliceosome assembly along the length of a gene as the nascent RNA is elongated by RNAPII. Nevertheless, it has been pointed out that, although early splicing factors may be recruited co-transcriptionally, splicing is not necessarily completed co-transcriptionally [2,12].
So, is it just by chance that spliceosome assembly sometimes occurs at the site of transcription or are the processes actively coupled and, if so, why? What are the benefits? In vitro studies to address these questions used HeLa nuclear extracts that have combined transcription and splicing activities. Whereas improved kinetics and altered patterns of splicing were observed in these physically coupled transcription/splicing systems, it was concluded that this may be largely attributed to RNAPII protecting nascent transcripts against nuclear degradation, rather than enhanced splicing activity [14,15]. In other words, there was no evidence for functional coupling of the two processes.
In order to clarify discussion of this topic, it was proposed that the transcription and splicing reactions can be considered ‘coupled’ if properties of the splicing reaction are specifically altered in a transcription-dependent manner. Two distinct mechanisms of coupling have been identified and are described in the present paper.
There is evidence that certain splicing factors interact with the CTD (C-terminal domain) of the large subunit of RNAPII, thereby facilitating their co-transcriptional recruitment to the nascent transcript. Although this may, in turn, promote co-transcriptional spliceosome assembly on nascent transcripts, it does not necessarily imply functional coupling of the two processes. The CTD consists of multiple repeats (26 in yeast, 52 in humans) of the heptad sequence YSPTSPS (Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7), which is dynamically phosphorylated as RNAPII transcribes the gene (Figure 1). Typically, when RNAPII is at the promoter, the CTD is in a hypophosphorylated form. It then becomes phosphorylated on Ser5 (pSer5) to permit promoter clearance and initiation. Phosphorylation at Ser5 decreases towards the 3′-end of the gene as the CTD becomes phosphorylated on Ser2. Recent reports suggest that CTD Ser7 is also phosphorylated, in a manner resembling that of Ser5. Currently, the only characterized function for pSer7 is related to 3′-end processing of snRNAs (small nuclear RNAs) [16,17]. Over 100 proteins have been shown to bind to the phosphorylated CTD, and it was proposed that the CTD acts as a ‘landing platform’ for RNA-processing factors, with the specificity of binding being determined by a ‘CTD code’ of post-translational modifications. For example, the yeast U1 snRNP (small nuclear ribonucleoprotein particle) protein Prp40 was found to bind specifically to CTD that is doubly phosphorylated at Ser2 and Ser5. The central importance of the CTD was demonstrated in a study where truncation of the CTD led to defects in capping, polyadenylation and splicing. This dynamic CTD phosphorylation suggests that RNAPII reorganizes its CTD interactions during transit along the DNA template to promote the sequential action of distinct RNA-processing events.
Recruitment coupling is also evident from the existence of dual-function proteins such as PGC-1 (peroxisome-proliferator-activated receptor γ co-activator 1), which is both a transcriptional co-activator and an alternative splicing regulator whose splicing activity is effective only when it is tethered to promoters through binding to a sequence-specific transcription factor.
‘The co-transcriptional race’
The second mode of interaction between transcription and splicing is referred to as kinetic coupling, which is particularly important in the context of alternative splicing. Alternative splicing is a widespread means of producing polypeptide diversity from a single gene. The transcripts of 95% of human multi-exon genes undergo alternative splicing and there are ~100000 alternative splicing events in major human tissues. Transcript elongation by RNAPII is not a uniform process that occurs at a steady rate. Rather, it fluctuates and is prone to pause, as dictated by the local sequence environment. The simplest way in which elongation rate can influence the process of alternative splicing is through the control of what has been described as the “window of opportunity”. This is the time in which an upstream splice site can assemble in a functional spliceosome before it has to compete with a downstream splice site. A low elongation rate or paused polymerase would allow recognition and inclusion of a poor upstream 3′-splice site before a better downstream one is transcribed, thereby promoting inclusion of an alternative exon. Essentially, it represents a race between RNAPII elongation and the time it takes for splicing factor recruitment and spliceosome assembly to occur. It is therefore not surprising to find other factors that are implicated in the control of RNAPII pausing and processivity, and which affect both constitutive and alternative splicing. These include promoter- and enhancer-associated transcription factors, elongation factors and co-activators (for a review, see ).
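The race described above can be made concrete with a toy calculation. The following Python sketch is purely illustrative and is not taken from any of the studies cited. It assumes that the window of opportunity is simply the time RNAPII needs to transcribe from the weak upstream 3′-splice site to the competing downstream one (plus any lumped pause), and that commitment to the upstream site is a single first-order step. The function names, the commitment rate and all numerical values are hypothetical.

```python
import math


def window_of_opportunity(distance_nt, elongation_rate_nt_per_s, pause_s=0.0):
    """Time (s) before the downstream competing splice site is transcribed.

    Assumption: constant elongation rate plus an optional lumped pause.
    """
    if elongation_rate_nt_per_s <= 0:
        raise ValueError("elongation rate must be positive")
    return distance_nt / elongation_rate_nt_per_s + pause_s


def upstream_commitment_probability(window_s, commitment_rate_per_s):
    """Probability that the upstream site is committed within the window.

    Assumption: commitment is a single first-order step, so the waiting
    time is exponentially distributed with the given (hypothetical) rate.
    """
    return 1.0 - math.exp(-commitment_rate_per_s * window_s)


if __name__ == "__main__":
    distance = 3000        # nt between the competing 3' splice sites (hypothetical)
    weak_site_rate = 0.01  # commitment rate of the weak site, per second (hypothetical)
    for rate in (50.0, 25.0, 10.0):  # elongation rates in nt/s (hypothetical)
        window = window_of_opportunity(distance, rate)
        prob = upstream_commitment_probability(window, weak_site_rate)
        print(f"{rate:5.1f} nt/s: window {window:6.1f} s, "
              f"P(upstream committed) = {prob:.2f}")
```

Under these assumptions, halving the elongation rate doubles the window, and the probability of committing to the weak upstream site rises accordingly, which is the qualitative behaviour expected from kinetic coupling.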
Is coupling reciprocal?
Most of the available evidence demonstrates influences of the transcription machinery on factors involved in pre-mRNA processing. However, there is a growing body of evidence to suggest a reciprocal relationship (bi-directional coupling) between the two processes. For example, a functional 5′-splice site in close proximity to a promoter was found to enhance transcriptional output, and it was shown that the presence of the U1 snRNP at a 5′-splice site can actively recruit general transcription factors and RNAPII to the promoter in a splicing-dependent manner. Similarly, Fong and Zhou demonstrated that the U1 snRNP can regulate transcription via stimulation of RNAPII elongation. The effect is mediated through interaction with the elongation factor Tat-SF1 [Tat (transactivator of transcription)-specific factor 1], which itself interacts with P-TEFb (positive transcription elongation factor b), the kinase responsible for phosphorylation of Ser2 on the CTD of RNAPII. Additionally, in human cell extracts, U1 snRNPs, but not other snRNPs, were shown to co-purify with RNAPII.
More recently, the mammalian transcriptional regulator SKIP (Ski-interacting protein) was demonstrated to have a role in Tat-dependent transcription. Interestingly, SKIP and its yeast counterpart, Prp45p, are components of spliceosomes [29,30], as is Cus2p, the yeast counterpart of Tat-SF1. These findings point towards a conserved structural and functional intertwining of the transcription and splicing machineries.
The SR (serine/arginine-rich) family of proteins is a class of RNA-binding proteins that are capable of binding nascent transcripts, committing pre-mRNA to the splicing pathway through contact with the U1 and U2 snRNPs (reviewed in ). SR proteins have been demonstrated to co-purify with RNAPII, and the SC35 family member has been implicated in stimulation of transcriptional elongation through interaction with P-TEFb. SC35 depletion induced RNAPII accumulation within the gene body and attenuated elongation. This work suggests that SC35, and possibly other SR proteins, may act at an earlier stage than was first thought, functioning as single-strand-RNA-binding proteins to facilitate transcriptional elongation, even on intronless genes (reviewed in ).
Whereas the evidence discussed so far is strongly indicative of functional links between transcription and splicing, recent studies suggest a more complex scenario. As reviewed by Kornblihtt et al., the recognition of exons by the splicing machinery might need “a little help from a chromatin friend”.
Chromatin joins the pre-mRNA processing party
In eukaryotes, genomic DNA is packed into nucleosomes, octamers of histone proteins around which the DNA is wrapped and from which histone ‘tails’ protrude. This chromatin environment is in a constant state of flux, being altered by specific modifications to the histone tails, including acetylation, methylation, phosphorylation, ubiquitination and proline isomerization [36,37]. These modifications convert chromatin between ‘closed’, transcriptionally repressed, heterochromatin and more ‘open’, transcriptionally accessible, euchromatin states. A general link between nucleosome density and gene exon/intron architecture was first proposed by Beckmann and Trifonov, who made the striking observation that the mean distance between consecutive 5′- or 3′-splice sites showed a periodicity reminiscent of the 147 nt length of DNA required to wrap around a nucleosome. They suggested that nucleosomes are somehow positioned in conjunction with the elements that promote intron removal. More recently, a number of reports have suggested a link between chromatin structure and exon/intron architecture. Kolasinska-Zwierz et al. showed that trimethylation of histone H3 Lys36 (H3K36me3), a mark previously associated with transcription elongation, is found preferentially on exons relative to introns of actively transcribed genes in nematodes and mammals. It was suggested that H3K36me3 on expressed exons may represent a marking mechanism, providing a dynamic link between transcription and splicing.
Bioinformatic analyses of published experimental data derived from deep sequencing of human and Caenorhabditis elegans DNA fragments generated through micrococcal nuclease digestion have provided further insights. It was demonstrated that exons are differentially marked from introns, both in terms of nucleosome occupancy (approx. 1.5-fold higher in exons than in introns) and in specific histone modifications [41–44]. This suggests that the H3K36me3 pattern reported by Kolasinska-Zwierz et al. can be explained at least in part by the more general phenomenon of increased nucleosome density over exons. Other epigenetic marks such as monomethylation of H3K79, H4K20 and H2BK5 and mono-, di- and tri-methylation of H3K27 are also enriched on exons, with an increase in amplitude as gene expression levels increase (Figure 2). Conceivably, nucleosomes carrying specific histone modifications may interact with splicing factors to enhance exon recognition [42,45]. It was also found that the ends of introns contain sequences disfavoured by nucleosomes, which may shift nucleosome occupancy to exons, and that nucleosome enrichment was most pronounced for exons with weak splice sites [42–44]. It was suggested that nucleosomes may function as ‘speed bumps’ to slow RNAPII, thereby improving the selection of exons by increasing the time (or window of opportunity) for newly synthesized splice signals in the nascent pre-mRNA to be recognized by splicing factors. Indeed, recent biophysical evidence has been obtained for the nucleosome behaving as a fluctuating barrier that affects RNAPII movement. In this way, nucleosome positioning may aid exon definition, especially for long genes.
Another intriguing connection between nucleosome positioning and splicing is through chromatin remodelling. This was suggested after the discovery that the SWI/SNF chromatin-remodelling ATPases influence splicing. This work demonstrated that alternative splicing was influenced by differences in transcription elongation rates in a manner regulated by SWI/SNF and CTD phosphorylation, and involving the creation of ‘road blocks’ to transcription (reviewed in ).
RNAPII is thought to mediate the cross-talk between chromatin and the exon/intron architecture of RNA. The nucleosome-specific histone methyltransferase Set2 is responsible for methylation of H3 at Lys36 and is recruited through the selective action of pSer2 on RNAPII. Intriguingly, this suggests that perturbations of the phosphorylation status of RNAPII would have an effect on epigenetic marking specifically of the H3K36me3 mark, with an altered chromatin landscape possibly controlling alternative splicing events. Indeed, Luco et al. have demonstrated a direct role for histone modifications in splice site selection in human cells. Using the human FGFR2 (fibroblast growth factor receptor 2) gene, which has been used extensively to study alternative splicing, they observed that modulating the level of H3K36me3 through overexpression of Set2 resulted in a significant increase in H3K36me3 globally and reduced inclusion of PTB (polypyrimidine-tract-binding protein)-dependent exons. Conversely, down-regulation of Set2 by RNAi (RNA interference) promoted the inclusion of normally repressed PTB-dependent exons. These results suggest a role for histone modifications in alternative splicing control through the existence of an adaptor system, containing histone modifications, a chromatin-binding protein that reads the histone marks and an interacting splicing regulator. It was proposed that such a system could transmit epigenetic information to the pre-mRNA-processing machinery by promoting the recruitment of specific splicing factors (reviewed in ).
Lateral thinking: a systems biology approach
Considering the growing body of evidence suggesting a three-way cross-talk between transcription, splicing and chromatin, it would be beneficial to adopt a systems biology approach to integrate the large amount of complex and diverse biological information that is available. Through mathematical modelling, it should be possible to generate biological predictions that can be tested experimentally. There has already been a move towards a systems-level approach to understand coupling and co-ordination in gene expression systems, focused primarily on protein interactions. Maciag et al. developed computational methods to analyse protein coupling in the gene-expression machineries of yeast, with extrapolation of their findings to humans. Using this approach, they were able to confirm known coupling such as that of transcription and RNA processing with export, and predict further coupling with translation and nonsense-mediated decay.
Efforts have been made by Darzacq et al. to measure elongation rates of RNAPII in situ. By following the synthesis of RNA in real time and through the use of deterministic computational models constrained by extensive data analysis, they were able to make predictions that were tested experimentally through the use of transcriptional inhibitors. Multi-step models of transcript synthesis have also been developed and studied theoretically [56,57]. Modelling techniques were applied to the analysis of cytoplasmic mRNA turnover in Saccharomyces cerevisiae, yielding insights into mRNA metabolism that would not readily have been obtained from conventional RNA analyses alone. RNA turnover is a simpler pathway than transcription and splicing, but this illustrates the potential for generating novel insights.
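To give a flavour of what a deterministic multi-step model looks like, the following Python sketch integrates a deliberately minimal two-pool scheme. It is not the model used in any of the studies cited above. It assumes only that nascent transcripts are initiated at a constant rate, are completed by a first-order step, and that the resulting mature mRNA decays by first-order turnover. All names and rate constants are hypothetical, and splicing and chromatin are deliberately left out.

```python
def simulate_transcript_pools(k_init, k_complete, k_decay, t_end, dt=0.01):
    """Integrate a two-pool deterministic model with the forward Euler method.

    d(nascent)/dt = k_init - k_complete * nascent
    d(mature)/dt  = k_complete * nascent - k_decay * mature

    Both pools start empty. Returns (nascent, mature) at time t_end.
    """
    nascent, mature = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        d_nascent = k_init - k_complete * nascent
        d_mature = k_complete * nascent - k_decay * mature
        nascent += d_nascent * dt
        mature += d_mature * dt
    return nascent, mature


def steady_state(k_init, k_complete, k_decay):
    """Analytical steady state of the same model: (nascent, mature)."""
    return k_init / k_complete, k_init / k_decay


if __name__ == "__main__":
    # Hypothetical rate constants, arbitrary time units
    rates = dict(k_init=2.0, k_complete=1.0, k_decay=0.1)
    print("simulated :", simulate_transcript_pools(t_end=200.0, **rates))
    print("analytical:", steady_state(**rates))
```

In this scheme the steady-state level of mature mRNA depends only on the initiation and decay rates, whereas the completion step sets the size of the nascent pool. Adding a splicing step or a chromatin-dependent elongation rate would mean adding further pools and rate terms in the same way.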
Although both transcription and mRNA turnover have been extensively modelled, these models have yet to incorporate the splicing reaction or take into account the effects of chromatin. Many intriguing questions remain, such as the following. Is the order of RNA processing events important? How is information transferred from the DNA to the RNA level and vice versa? With regard to the chromatin link, what happens first? Does the splicing machinery first signal to the chromatin where a functional intron is present, causing chromatin remodelling and/or histone modification to establish the intron mark? Or conversely, does epigenetic marking first determine splicing by signalling to the polymerase and splicing machinery?
It should be possible to address some of these questions by using genetic or RNAi knockdown of relevant factors in combination with high-resolution kinetic analyses of transcript elongation, splicing, the recruitment of transcription and splicing factors, CTD kinases and phosphatases, and chromatin-modifying and -remodelling factors during the very early stages of de novo transcription of a gene. As a beginning, a recent quantitative analysis of the expression of long genes in human cells following removal of a chemical inhibitor of transcription has permitted more accurate measurements of the kinetics of transcription and splicing and provided direct evidence for co-transcriptional splicing, even of long introns. This demonstrates the feasibility of obtaining the kinds of quantitative real-time in vivo data required for mathematical modelling. Undoubtedly, new modelling approaches will also be required to describe this very complex multi-component and multi-dimensional system, but the insights gained should be worth the effort.
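As an illustration of how time-course data of this kind might be turned into a kinetic parameter, the Python sketch below fits an elongation rate by least squares. It assumes, hypothetically, that after release from the inhibitor one records the time at which newly made pre-mRNA is first detected at probes placed at increasing distances from the transcription start site, and that elongation proceeds at a constant rate after a fixed lag. The function name, probe positions and times are invented for illustration and are not data from the study mentioned above.

```python
def fit_elongation_rate(arrival_times_min, probe_positions_nt):
    """Least-squares fit of position = rate * (time - lag).

    arrival_times_min: time (min) at which signal first appears at each probe
    probe_positions_nt: distance (nt) of each probe from the start site
    Returns (rate in nt/min, lag in min).
    """
    n = len(arrival_times_min)
    if n < 2 or n != len(probe_positions_nt):
        raise ValueError("need at least two matched (time, position) pairs")
    mean_t = sum(arrival_times_min) / n
    mean_x = sum(probe_positions_nt) / n
    s_tt = sum((t - mean_t) ** 2 for t in arrival_times_min)
    if s_tt == 0:
        raise ValueError("arrival times must not all be identical")
    s_tx = sum((t - mean_t) * (x - mean_x)
               for t, x in zip(arrival_times_min, probe_positions_nt))
    rate = s_tx / s_tt                 # slope of position against time
    if rate == 0:
        raise ValueError("fitted rate is zero; lag is undefined")
    intercept = mean_x - rate * mean_t
    lag = -intercept / rate            # time at which position would be zero
    return rate, lag


if __name__ == "__main__":
    # Hypothetical measurements for one long gene
    times = [5.0, 30.0, 55.0]            # min after inhibitor removal
    positions = [0.0, 100000.0, 200000.0]  # nt from the start site
    rate, lag = fit_elongation_rate(times, positions)
    print(f"rate = {rate:.0f} nt/min, lag = {lag:.1f} min")
```

A constant-rate fit is the simplest possible choice. Systematic deviations of real data from the fitted line would themselves be informative, for example as signs of pausing or of local chromatin barriers.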
R.A. was supported by the European Union-funded Framework Programme 6 RiboSys project [grant number 518280] and by the Biotechnology and Biological Sciences Research Council and the Engineering and Physical Sciences Research Council through funding [grant number BB/D019621/1] to the Edinburgh Centre for Systems Biology. J.D.B. is the Royal Society Darwin Trust Research Professor.
We are grateful to Steve Innocente for helpful discussion and comments on the manuscript.
Signalling and Control from a Systems Perspective: A Biochemical Society Focused Meeting held at University of York, U.K., 22–24 March 2010, as part of the Systems Biochemistry Linked Focused Meetings. Organized and Edited by David Fell (Oxford Brookes, U.K.), Hans Westerhoff (Manchester, U.K., and Amsterdam, The Netherlands) and Michael White (Liverpool, U.K.).
Abbreviations: CTD, C-terminal domain; PTB, polypyrimidine-tract-binding protein; P-TEFb, positive transcription elongation factor b; RNAi, RNA interference; RNAPII, RNA polymerase II; SKIP, Ski-interacting protein; snRNP, small nuclear ribonucleoprotein particle; SR, serine/arginine-rich; Tat, transactivator of transcription; Tat-SF1, Tat-specific factor 1
- © The Authors Journal compilation © 2010 Biochemical Society