Transcription factories: structures conserved during differentiation and evolution

I. Faro-Trindade, P.R. Cook


Many cellular functions take place in discrete compartments, but our textbooks make little reference to any compartments involved in transcription. We review the evidence that active RNA polymerases and associated factors cluster into ‘factories’ that carry out many (perhaps all) of the functions required to generate mature transcripts. Clustering ensures high local concentrations and efficient interaction. Then, a gene must associate with the appropriate factory before it can be transcribed. Recent results show that the density and diameter of nucleoplasmic factories remain roughly constant as cells differentiate, despite large changes in the numbers of active polymerases and nucleoplasmic volumes.

  • cell differentiation
  • chromatin loop
  • genome organization
  • RNA polymerase
  • transcription factory

Molecules are concentrated within cells into distinct cellular compartments to increase reaction rates (through mass action), and facilitate regulation (e.g. through ‘co-operative’ effects). Concentration is achieved by enclosing those molecules in membranes (as in mitochondria) and/or by forming local clusters (as in the machinery for oxidative phosphorylation). We will argue that the transcription machinery is clustered into structures called ‘factories’, and that this underpins the self-organization of all genomes [1–3]. Local concentrations in factories can be high; HeLa nuclei contain a dispersed approx. 1 μM pool of RNA polymerase II, but this accounts for little transcription as the concentration in factories is approx. 1000-fold higher [2,4]. Here, we review the properties of eukaryotic transcription factories that carry out many (perhaps all) of the functions required to generate a mature transcript; in the case of a ‘standard’ message, these include capping, splicing, polyadenylation and proofreading [5,6].

Nucleoli: prototypic transcription factories

The nucleolus is the one nuclear compartment seen classically; it is dedicated to 45 S rRNA synthesis and ribosome production. The human loci encoding rRNA are carried on chromosomes 13, 14, 15, 21 and 22, with each locus carrying approx. 80 tandem repeats of 43 kb containing the gene and an untranscribed spacer. Repeats appear as secondary constrictions in mitotic chromosomes: the NORs (nucleolar organizing regions). Inactive RNA polymerase I and its transcription factor UBF (upstream binding factor) are bound to some NORs and, on exit from mitosis, these NORs fuse into one or more nucleoli [7]. NORs lacking bound UBF and the polymerase remain inactive and are not incorporated into nucleoli [8,9]; UBF plays a critical role in organizing the structure as introducing UBF-binding sites into other chromosomes generates pseudo-NORs [10].

Three distinct regions within nucleoli can be seen by electron microscopy: a ‘granular component’ in which are embedded one or more ‘fibrillar centres’ and associated ‘dense fibrillar component(s)’. Polymerase I and UBF are concentrated in the fibrillar centre, and nascent Br-RNA [BrU (bromouridine)-labelled RNA] is found on its surface in the dense fibrillar component [11]. Newly completed transcripts are processed in the granular component to emerge as mature ribosomal subunits into the nucleoplasm [12]. A typical nucleolar factory in HeLa contains approx. 500 polymerases engaged on approximately four templates packed around one fibrillar centre (Table 1). The nucleolus provides important precedents in this context: two or more active transcription units cluster into one compartment to be transcribed; other identical but unassociated units are inactive.
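The nucleolar figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the inputs (approx. 120 nucleolar foci per cell, approx. 125 polymerases per unit, approximately four units per factory) are the approximate values given in the text and in the notes to Table 1, and the function and variable names are ours rather than anything used in the cited studies.

```python
# Back-of-envelope check of the nucleolar numbers (all inputs are approximate).
FOCI_PER_CELL = 120          # nucleolar Br-RNA foci, each assumed to be one active unit
POLYMERASES_PER_UNIT = 125   # polymerase I molecules engaged on one active rRNA unit
UNITS_PER_FACTORY = 4        # active units packed around one fibrillar centre


def nucleolar_summary(foci=FOCI_PER_CELL,
                      per_unit=POLYMERASES_PER_UNIT,
                      units_per_factory=UNITS_PER_FACTORY):
    """Return derived per-factory and per-cell estimates as a dict."""
    return {
        "polymerases_per_factory": per_unit * units_per_factory,
        "active_polymerases_per_cell": foci * per_unit,
        "factories_per_cell": foci / units_per_factory,
    }


if __name__ == "__main__":
    for name, value in nucleolar_summary().items():
        print(f"{name}: {value:g}")
```

Under these assumptions, the per-factory figure reproduces the approx. 500 polymerases quoted above; the per-cell values are derived estimates, not measurements.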

Table 1 Approximate number of polymerases, nascent transcripts and transcription factories (with diameters) in a HeLa cell

Pol I, polymerase I. Notes:
a. From quantitative Western blotting [4].
b. From Western blotting (54% of all forms is IIO, while sarkosyl removes 73% of all forms and 62% IIO) [28]; sarkosyl-resistant IIO is assumed to be active.
c. From [³²P]UTP incorporation by permeabilized cells (mitochondrial contribution of ≤6% neglected) [28].
d. Minimal estimates of 4500 (calculated assuming 5% of 90000 nascent transcripts made by pol III) and 7900 (from numbers of nascent 5 S+tRNA transcripts) also obtained by Pombo et al. [31].
e. From number (i.e. 120) of nucleolar Br-RNA foci seen by LM (light microscopy) in spreads after ‘run-on’ in Br-UTP [28], assuming each focus represents one unit with 125 polymerases [38].
f. Extra-nucleolar Br-RNA foci seen by LM in spreads after ‘run-on’ in Br-UTP [28]; underestimate, as some short molecules missed.
g. Average of nucleoplasmic clusters of gold particles (i.e. 7000, 8800 and 7700) seen by EM (electron microscopy) after ‘run-on’ transcription in biotin-CTP and post-embedment labelling [28,30,39].
h. Corrected as described in [39,40].
i. Site diameter is needed to calculate site numbers and was initially overestimated as immunolabelling probes are large relative to sites examined. The extent of the overestimation was determined subsequently [41], and a corrected diameter of 50 nm (rather than 71, 80 or 77 nm) is used to calculate site numbers here. The equivalent correction has not yet been determined for diameters seen in cryosections (see also [42]).
j. Nucleoplasmic Br-RNA foci seen by LM in cryosections after ‘run-on’ in Br-UTP [30].
k. Nucleoplasmic clusters of gold particles marking Br-RNA seen in cryosections by EM after ‘run-on’ in Br-UTP [31].
l. Nucleolar Br-RNA foci seen by LM in spreads after ‘run-on’ in Br-UTP [28].
m. Average of nucleoplasmic clusters of gold particles (i.e. 8800 and 8470) seen by EM after growth in BrU and post-embedment labelling of Br-RNA [28,39].
n. Nucleoplasmic clusters of gold particles seen by EM in unlysed cells after labelling with anti-pol II [30].
o. Uncorrected site diameters seen after extension in BrU, Br-UTP, or biotin-CTP and post-embedment immunolabelling are 71–80 nm [28,30,39]; after correction (see note i), these become approx. 50 nm. Uncorrected diameters seen in cryosections by EM are 36 nm [33] and 46 nm [31].
p. Largest diameters are twice the average, so could accommodate approx. 8-fold more nascent transcripts [30,33,42].
q. Nucleoplasmic Br-RNA sites marked by clusters of gold particles in cryosections after ‘run-on’ in Br-UTP in 2 μg/ml α-amanitin [31].
r. Labelled with anti-pol II [30].
s. Labelled with anti-pol IIO; high-resolution LM gives an overestimated diameter of 74 nm [42].
t. From ‘spreads’ [38].
u. From ‘spreads’ [28].
v. Most class III units are too short to accommodate >1 polymerase.
w. From rows 3, 4 and 11; each factory contains approximately four active units, each with approx. 125 polymerases, and a nucleolus may contain more than one factory [28].
x. From number of nascent transcripts and Br-RNA sites or pol II/IIO sites.
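Two of the notes to Table 1 rest on short calculations that are easy to reproduce. The sketch below assumes that factories are roughly spherical, so that capacity scales with the cube of the diameter (a simplifying assumption of ours, used only to illustrate note p), and it recomputes the minimal polymerase III estimate in note d from the quoted 5% of 90000 nascent transcripts. Function names are illustrative.

```python
def capacity_ratio(diameter_nm, reference_diameter_nm):
    """Relative volume of two spheres: (d / d_ref) cubed."""
    return (diameter_nm / reference_diameter_nm) ** 3


def share_of_transcripts(total_transcripts, fraction):
    """Number of nascent transcripts attributed to one polymerase class."""
    return total_transcripts * fraction


if __name__ == "__main__":
    # Note p: a site with twice the average diameter has 2**3 = 8 times the volume.
    print(capacity_ratio(100, 50))            # 8.0

    # Note i: how much an uncorrected 71 nm diameter inflates the apparent
    # volume relative to the corrected 50 nm value.
    print(round(capacity_ratio(71, 50), 2))

    # Note d: 5% of 90000 nascent transcripts assigned to polymerase III.
    print(round(share_of_transcripts(90000, 0.05)))
```

The cubic dependence is also why the diameter correction in note i matters: a modest overestimate of diameter produces a much larger overestimate of volume.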

Nucleoplasmic factories

We imagine that nucleoplasmic factories containing polymerase II (the enzyme that generates most eukaryotic mRNAs) are built similarly [3]. Polymerase II disengages during mitosis, but factors such as TBP (TATA-box-binding protein) remain bound [13] to ‘bookmark’ previously active genes so that the enzyme can re-engage on them after mitosis [14,15]. Newly active polymerases on different units would aggregate into a factory surrounded by a ‘cloud’ of loops (Figure 1), driven both by specific and non-specific forces [3,16]. Then, the polymerase is both an enzyme and a molecular tie that organizes loops [2]. A typical nucleoplasmic factory in HeLa contains a cluster of approximately eight active polymerases (diameter ∼50 nm) overlapping a zone (diameter ∼50 nm) containing approximately eight transcripts (Table 1). Polymerase III factories have roughly similar structures. Critical evidence for this model is now summarized.

Figure 1 Model for a polymerase II factory and genome structure in a HeLa cell

DNA is coiled around the histone octamer, and runs of nucleosomes form a zigzagging string. As active polymerizing complexes and bound transcription factors (diamond) tend to cluster, a ‘cloud’ of ten to twenty loops forms around a factory. Active polymerases do not track along their templates; they are bound to a factory and act both as motors that reel in their templates and as one of the critical structural ties that maintain the loops [2]. Loops inevitably appear and disappear as polymerases initiate and terminate (and then dissociate to join the soluble pool); bound transcription factors also exchange with the soluble pool. Genes tethered near the factory are more likely to initiate than more distant ones [43]. Nucleosomes in long loops are static and acquire a (heterochromatic) histone code that spreads down the fibre; they also aggregate on to the lamina, nucleoli and chromocentres. Each factory contains one type of RNA polymerase (in this case polymerase II) to the exclusion of others, and some factories are richer in certain transcription factors than others (and so are involved in the transcription of specific sets of genes; [37]). 50–200 successive clouds strung along the chromosome form a territory (the general path of DNA between clouds is shown). The Figure is modified from [17]; this material is used with the permission of John Wiley and Sons.

Chromatin is looped

That chromatin is organized into loops is an old, if controversial, idea [17]; application of ‘chromosome conformation capture’ (3C) now provides decisive evidence for looping. The method involves fixation, followed by analysis of which DNA sequences lie next to each other in three-dimensional space (and so loop out the intervening DNA). Several loops have now been shown to exist [18], but we will exemplify results using only one: the mouse globin locus [19,20]. The Hbb-b1 (β-globin) gene lies tens of kilobase-pairs distant from its LCR (locus control region), and approx. 25 Mb away from a gene (Eraf) encoding the α-globin stabilizing protein. 3C shows that Hbb-b1 contacts the LCR and Eraf in erythroid nuclei (where all three are transcribed), but not in brain nuclei (where all are inactive). It is thought that the LCR nucleates an ‘active chromatin hub’ or ‘factory’ that facilitates expression of globin-related genes.

Bound polymerases and factors are molecular ties

DNA sequences at tethering points can be mapped using nucleases. Studies using eukaryotic ‘nucleoids’ show nearly all residual fragments to be parts of transcription units [21], and this is confirmed by a detailed analysis of clones transformed by single polyoma or avian sarcoma viruses [22]. In every clone where an integrated virus is expressed, viral units resist detachment. Where the virus integrates away from a tie, flanking cellular DNA attaches (unlike identical sequences on the unaffected homologue). Close attachment is lost when proviruses spontaneously become inactive, and is regained on re-activation. These ties are unlikely to form ‘artefactually’, as comparable results are obtained with iso-osmotic buffers; now, polymerizing activity also resists detachment, implying that it mediates attachments [23,24]. For example, when 5 kb plasmids carrying the SV40 (simian virus 40) ori and two transcription units are transfected into COS7 cells, replication yields hundreds of minichromosomes. On permeabilization, cutting with HaeIII and removing most of the chromatin, essentially no activity is lost, and the residual fragments encode either a transcribed region or an (untranscribed) promoter [25]. All these studies point to engaged polymerases and transcription factors as the molecular ties.

Inhibiting transcription eliminates looping

We suggest that looping depends on ongoing transcription, and it seems to. For example, when demembranated spermatozoa heads are injected into the germinal vesicle of amphibian oocytes, they swell and transcription begins. If the contents of the germinal vesicle are now dispersed in a hypo-osmotic buffer, lampbrush chromosomes (with associated loops) derived from both injected spermatozoa and host can be seen. These structures have the active form of polymerase II concentrated along the lampbrush axis. However, the inhibitor, actinomycin D, prevents loop formation [26]. The contacts seen by 3C also involve ≥2 active units, and are lost when transcription ceases (see above).

Nascent eukaryotic transcripts are concentrated in foci

When HeLa cells are permeabilized in a ‘physiological’ buffer and engaged polymerases allowed to extend their transcripts by a few nucleotides in Br-UTP, immunolabelling shows the resulting Br-RNA to be concentrated in a limited number of discrete sites – the factories [27–30]. These sites remain even when most of the chromatin is removed by nucleases [27]. Some of these sites contain only RNA polymerase II and others contain only polymerase III [31]. The small numbers suggest that each site contains many polymerases active on different units, and their resistance to nucleolytic detachment and small diameter (∼50 nm) imply that those polymerases are attached to them [25,30].

Most factories contain >1 transcription unit

Numbers of active polymerases (and so nascent transcripts) per site can be calculated by quantitative analysis. In a HeLa cell, for example, there are approx. 8-fold more active molecules of polymerase II than sites; as only one polymerase is typically engaged on a unit, each site must then contain approximately eight units (Table 1). We now summarize the evidence for the three values used in this calculation. In each case, estimates were obtained using different experimental approaches, and – as each approach has a different threshold of detection and as results are reassuringly similar – one estimate lends credibility to another. (i) Numbers of active polymerases were determined by end-labelling nascent transcripts, quantitative immunoblotting of polymerase II, and by counting the number of transcription complexes seen in ‘spreads’ made from known numbers of nuclei. (ii) Site numbers were determined using intact/permeabilized cells, different precursors (BrU, Br-UTP and biotin-CTP), pre-/post-embedment immunolabelling on/in sections and light/electron microscopy. Can we be confident that all sites are detected? We can. If many less-active sites escape detection, increased incorporation of a tagged precursor such as Br-UTP should raise more above the detection threshold, but it does not; rather, the same numbers of sites are labelled more intensely [30–33]. Moreover, sectioning cuts through some sites to leave ‘polar’ caps, and one can estimate how small such caps must be before they go undetected. It turns out that sites containing one-twentieth the nascent RNA of the average are detected, so any missed ones can contain only a fraction of the total [31,33]. As might be expected, examination of exactly the same sites shows that one site seen by light microscopy is sometimes resolved into two sites by electron microscopy [33].
(iii) Numbers of polymerases engaged per unit have mainly been determined using electron microscopy of ‘Miller’ spreads (although other methods give similar results). When we think of such spreads, we see the famous ‘Christmas trees’ in our mind's eye. But these are the exceptional rRNA operons, where each unit is transcribed by approx. 120 closely packed polymerase I molecules. The densities seen on polymerase II units in the same spreads are much lower. Thus, if one excludes hyperactive units (e.g. chorion and heat shock in flies and actin in mammals), all the evidence shows that most active units are associated with only one polymerase [34]. For example, analysis of 100 active HeLa units in spreads showed that (at least) two-thirds were associated with only one transcript [30], whereas microarrays reveal that only 73 of the several thousand yeast open reading frames are transcribed by more than one polymerase – and only eight are transcribed by more than two [35]. Studies on green fluorescent protein-tagged polymerase II support the idea that transcriptional initiation is rate-limiting, so few units even become loaded with more than one polymerase [36].
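The division behind the estimate of approximately eight units per site can be written out as a minimal sketch. The three site counts are those quoted in the notes to Table 1 (note g); the active polymerase II count, `ASSUMED_ACTIVE_POL_II`, is a hypothetical placeholder chosen only so that the ratio falls near the approx. 8-fold stated above, and should be replaced by the measured value from Table 1. The function name and parameters are illustrative.

```python
from statistics import mean

# Nucleoplasmic site counts from three EM studies (note g to Table 1).
SITE_COUNTS = [7000, 8800, 7700]

# HYPOTHETICAL placeholder, not a value taken from Table 1: picked only so
# that the ratio lands near the approx. 8-fold excess stated in the text.
ASSUMED_ACTIVE_POL_II = 63000


def units_per_site(active_polymerases, site_counts, polymerases_per_unit=1.0):
    """Estimate active transcription units per factory.

    active_polymerases   -- engaged polymerase II molecules per cell
    site_counts          -- independent estimates of sites per cell (averaged)
    polymerases_per_unit -- mean number of polymerases engaged on one unit
    """
    sites = mean(site_counts)
    return active_polymerases / (sites * polymerases_per_unit)


if __name__ == "__main__":
    estimate = units_per_site(ASSUMED_ACTIVE_POL_II, SITE_COUNTS)
    print(f"mean sites per cell: {mean(SITE_COUNTS):.0f}")
    print(f"units per site:      {estimate:.1f}")
```

Raising `polymerases_per_unit` above one lowers the estimated number of units per site in proportion, which is why the evidence summarized in (iii) matters for the calculation.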

Speculations on changes occurring during differentiation and evolution

How does the organization change as cells differentiate? Do numbers of active polymerases rise or fall, and by how much? Do factory numbers and sizes change? We now have the first answers [32]. Retinoic acid induces totipotent (euploid) mouse embryonic stem cells to differentiate into small SPARC (secreted protein that is acidic and rich in cysteine residues)-positive parietal endoderm cells and larger SPARC-negative cells. (SPARC is a specific marker of parietal endoderm.) Numbers of active polymerase II and factories roughly follow changes in nucleoplasmic volume, but factory diameter and density remain the same (Table 2). The same trends are seen as (aneuploid) F9 teratocarcinoma cells differentiate into SPARC-positive parietal endoderm. The constancy of site density and diameter in the face of changing nucleoplasmic volume prompted analysis of A1 cells from the red-spotted salamander, which have an 11-fold larger genome; density and diameter were surprisingly similar despite increased volume and polymerase number. If we assume that an active (polymerase II) unit is typically associated with one polymerase (see above), factories from all these cells would contain four to eighteen polymerases engaged on the same number of different units (Table 2).

Table 2 The organization of nucleoplasmic transcription in different cells

Values for HeLa cells are from Table 1, and for other cells, from [32]. Nucleoplasmic volumes, properties of nucleoplasmic sites containing nascent Br-RNA, and numbers of molecules of active polymerase (pol) II were determined by confocal microscopy, electron microscopy and quantitative immunoblotting respectively.


We accept that the many compartments in the cytoplasm carry out different functions. As conventional imaging reveals no obvious compartments within nuclei, the idea has developed that polymerases bind to their target genes wherever they might be. We have reviewed the growing evidence that nuclei contain distinct ‘factories’ dedicated to the production of specific transcripts. Then, genes must diffuse to, and bind to, the appropriate factory before they can be expressed [18,37]. Our model has the merit that the key architectural motifs are all defined. However, many questions remain. For example, we still know little about factory microarchitecture, how that structure might be maintained in the face of the continual exchange of individual components with the soluble pool, and what path DNA might follow around a factory and from factory to factory.


We thank our colleagues for helpful discussions and the Biotechnology and Biological Sciences Research Council, Cancer Research UK, the Engineering and Physical Sciences Research Council, the Foundation for Science and Technology (Portugal) plus the European Social Fund in the context of the III Community Support Panel, the Medical Research Council and Wellcome Trust for support.


  • Mechanisms of Gene Regulation: Research Colloquium at BioScience2006, held at SECC Glasgow, U.K., 23–27 July 2006. Edited by S. Graham (Glasgow, U.K.). Sponsored by Pfizer.

Abbreviations: BrU, bromouridine; Br-UTP, bromo-UTP; Br-RNA, BrU-labelled RNA; 3C, chromosome conformation capture; LCR, locus control region; NOR, nucleolar organizing region; SPARC, secreted protein that is acidic and rich in cysteine residues; UBF, upstream binding factor

