Understanding organisms from a systems perspective is essential for predicting cellular behaviour as well as designing gene-metabolic circuits for novel functions. The structure, dynamics and interactions of cellular networks are all vital components of systems biology. To facilitate investigation of these aspects, we have developed an integrative technique called network component analysis, which utilizes mRNA expression and transcriptional network connectivity to determine network component dynamics, functions and interactions. This approach has been applied to elucidate transcription factor dynamics in Saccharomyces cerevisiae cell-cycle regulation, detect cross-talks in Escherichia coli two-component signalling pathways, and characterize E. coli carbon source transition. An ultimate test of system-wide understanding is the ability to design and construct novel gene-metabolic circuits. To this end, artificial feedback regulation, cell–cell communication and oscillatory circuits have been constructed, which demonstrate the design principles of gene-metabolic regulation in the cell.
- chromatin immunoprecipitation–DNA microarray (ChIP-chip)
- gene circuit
- network component analysis (NCA)
- transcriptional regulation
- transcription factor
Organisms respond to their environment through a series of interconnected networks that include signal transduction and metabolic and transcriptional systems. How these networks are structured, how they respond dynamically and how they interact are all questions essential to the understanding of biology from a systems perspective. To address these questions, we have developed an integrative technique called NCA (network component analysis), and its subsequent generalization, gNCA, which utilizes mRNA expression and transcriptional network connectivity to determine network component dynamics, functions and interactions. So far, we have used NCA/gNCA to: (i) deduce TFA (transcription factor activity) profiles during the Saccharomyces cerevisiae cell cycle and Escherichia coli carbon source transition [1–3], (ii) determine interactions of S. cerevisiae regulators and of E. coli's TCSs (two-component systems) [3,4] and (iii) identify putative functions for S. cerevisiae transcription factors .
With a larger breadth of knowledge of the structure, dynamic properties and interactions of biological systems, rational cellular design of complex functionality becomes a feasible objective. To demonstrate this concept, we have designed and constructed artificial feedback regulation, cell–cell communication and oscillatory circuits in E. coli.
There are many published works concerned with the deduction of biological network architecture, dynamic response and interaction. Most notably, probabilistic models (reviewed in ), genomic association [6–9], multiple regression [10–14] and dimensional reduction [15–18] have been developed for bioinformatic analysis. NCA is a dimensional reduction technique akin to PCA (principal component analysis) and ICA (independent component analysis). However, unlike previous methods, NCA does not impose mathematical constraints on its solution such as orthogonality (PCA) or statistical independence (ICA). The constraints that NCA uses to ensure a unique solution are the connectivity structure of the analysed system. If analysing the transcriptional system, this would be the presence or absence of transcription factor binding to the promoter region of a gene. Previously, during the study of transcriptional systems, the network topology necessary to perform NCA was provided by publicly available ChIP-chip (chromatin immunoprecipitation–DNA microarray) binding data . ChIP-chip assays determine the probability of a transcription factor binding the promoter element of a gene with a method that incorporates chromatin immunoprecipitation and DNA microarray technology . With more biologically meaningful constraints, NCA presents a method to deduce more biologically relevant information, such as TFA profiles from mRNA expression and ChIP-chip binding data.
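The decomposition described above can be written compactly. The notation below is ours, introduced only for illustration: E is the matrix of (typically log-ratio) expression values for N genes over M conditions, A holds the control strength of each of L regulators on each gene, and P holds the regulator activity (TFA) profiles.

```latex
\begin{aligned}
E &\approx A\,P, \qquad
E \in \mathbb{R}^{N \times M},\quad
A \in \mathbb{R}^{N \times L},\quad
P \in \mathbb{R}^{L \times M},\\[4pt]
\min_{A,\,P}\ & \lVert E - A\,P \rVert_F^{2}
\quad \text{subject to} \quad
A_{ik} = 0 \ \text{ whenever regulator } k \text{ is not connected to gene } i .
\end{aligned}
```

PCA would instead require the components to be orthogonal, and ICA would require them to be statistically independent; here the only constraint is the zero pattern of A, which comes from the connectivity data.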
The general concept and application of NCA were developed by Liao et al. , where TFA profiles for the S. cerevisiae cell cycle were deduced from mRNA expression and ChIP-chip binding data. Since transcription factors are often regulated post-transcriptionally, their transcript expression profile does not necessarily correlate well with their activity, and thus cannot be reliably used as a proxy. Gene expression profiles from a transcription factor's target genes are much more likely to provide accurate approximations of activity. However, due to the complexity of transcriptional regulation, a straightforward approximation of activity from target gene expression profiles would be inadequate. To furnish a robust approximation of TFAs from gene expression profiles, NCA deduces all TFAs simultaneously while allowing combinatorial regulation, consistent with the network architecture and variable control strengths. To demonstrate the qualitative accuracy of TFA profiles deduced from NCA, Kao et al.  performed NCA on E. coli carbon source transition expression data. All TFA profiles were found to be consistent with the literature or were verified experimentally with independent measurements.
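One simple way to see how all TFAs can be deduced simultaneously under a fixed connectivity pattern is an alternating least-squares loop, sketched below in plain Python. This is a minimal illustration under our own assumptions, not the published implementation: the function names (`lstsq`, `nca`, `residual`), the toy connectivity pattern and the synthetic data are all hypothetical, and the sketch ignores noise, normalization and identifiability checks.

```python
def lstsq(X, y):
    """Least-squares solution of X w ~ y via the normal equations."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    M = [[sum(r[a] * r[b] for r in X) for b in range(n)]
         + [sum(r[a] * t for r, t in zip(X, y))] for a in range(n)]
    for c in range(n):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def residual(E, A, P):
    """Sum of squared differences between E and the product A P."""
    total = 0.0
    for i, row in enumerate(E):
        for j, e in enumerate(row):
            fit = sum(A[i][k] * P[k][j] for k in range(len(P)))
            total += (e - fit) ** 2
    return total


def nca(E, pattern, n_iter=30):
    """Alternate between updating P (activities) and A (control strengths).

    pattern[i][k] is 1 if regulator k is connected to gene i, else 0.
    Entries of A outside the pattern are never updated, so they stay zero.
    """
    n_genes, n_samples, n_tf = len(E), len(E[0]), len(pattern[0])
    A = [[float(z) for z in row] for row in pattern]
    P = [[0.0] * n_samples for _ in range(n_tf)]
    for _ in range(n_iter):
        # Update every column of P with A held fixed.
        for j in range(n_samples):
            col = lstsq(A, [E[i][j] for i in range(n_genes)])
            for k in range(n_tf):
                P[k][j] = col[k]
        # Update the allowed entries of every row of A with P held fixed.
        for i in range(n_genes):
            allowed = [k for k in range(n_tf) if pattern[i][k]]
            X = [[P[k][j] for k in allowed] for j in range(n_samples)]
            for k, v in zip(allowed, lstsq(X, E[i])):
                A[i][k] = v
    return A, P


# Hypothetical toy system: 5 genes, 2 regulators, 4 conditions.
pattern = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
A_true = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, -1.5], [1.0, 1.0]]
P_true = [[1.0, 2.0, 0.0, -1.0], [0.0, 1.0, 1.0, 2.0]]
E = [[sum(A_true[i][k] * P_true[k][j] for k in range(2)) for j in range(4)]
     for i in range(5)]

A_est, P_est = nca(E, pattern)
print("residual:", residual(E, A_est, P_est))
```

Because A and P enter only as a product, each regulator's column of A and row of P can be rescaled against each other without changing the fit, so the recovered profiles are meaningful in shape rather than in absolute scale.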
TFA profiles deduced from NCA offer a dynamic perspective of the transcriptional regulation system, which can yield a more detailed understanding than static properties alone. Yang et al.  hypothesized that transcription factors participating in cell-cycle regulation would exhibit cyclic TFA profiles. Out of 11 known cell-cycle regulators in S. cerevisiae, nine exhibited significant periodicity in their TFA profile. By combining the results of the periodicity analysis with those from a cluster analysis, five putative cell-cycle-related regulators were identified, thereby demonstrating the utility of investigating dynamic responses when searching for functional characteristics.
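The text does not state which periodicity test was applied, so the following is only one plausible way to score how cyclic a TFA profile is: fit a single sinusoid of a chosen period by least squares and report the fraction of variance it explains. The function name, the sampling times and the assumed period are all hypothetical.

```python
import math


def periodicity_score(times, profile, period):
    """Fraction of the profile's variance explained by one sinusoid.

    Fits y ~ a*cos(2*pi*t/period) + b*sin(2*pi*t/period) to the
    mean-centred profile and returns 1 - residual/total sum of squares.
    """
    mean = sum(profile) / len(profile)
    y = [v - mean for v in profile]
    c = [math.cos(2 * math.pi * t / period) for t in times]
    s = [math.sin(2 * math.pi * t / period) for t in times]
    # 2x2 normal equations for the coefficients a and b
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    cy = sum(u * v for u, v in zip(c, y))
    sy = sum(u * v for u, v in zip(s, y))
    det = cc * ss - cs * cs
    a = (cy * ss - cs * sy) / det
    b = (cc * sy - cs * cy) / det
    resid = sum((yi - a * ci - b * si) ** 2 for yi, ci, si in zip(y, c, s))
    total = sum(yi * yi for yi in y)
    return 1.0 - resid / total


# Hypothetical sampling: every 10 min over two assumed 60-min cycles.
times = [10.0 * i for i in range(12)]
cyclic = [math.sin(2 * math.pi * t / 60.0) for t in times]
drifting = [0.1 * t for t in times]

print(periodicity_score(times, cyclic, 60.0))
print(periodicity_score(times, drifting, 60.0))
```

A score near 1 indicates a profile dominated by the assumed period; deciding what counts as significant would still need a null distribution, for example from shuffled profiles.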
In addition to functional characteristics, network interactions can be determined through the analysis of NCA-deduced TFA profiles. Two methods have been used to determine network interactions. The first was described by Yang and Liao , where E. coli's TCSs were analysed. A total of 36 TCS deletion strain gene-expression data sets were investigated under the hypothesis that interactions are rare and that the presence of a significant fluctuation in a TFA profile compared with the median level over all deletion experiments would indicate a network interaction between the corresponding gene deletion and the regulator whose TFA profile was affected. The analysis identified 37 functional interactions, with 18 of them previously established in the literature and the others constituting novel predictions.
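The screening logic of this first method can be sketched as follows: for each regulator, compare its deduced activity in every deletion experiment with the median over all deletions, and flag large deviations. The robust scale (median absolute deviation), the threshold of 3 and all names and numbers below are our assumptions for illustration, not values from the study.

```python
from statistics import median


def flag_interactions(tfa, threshold=3.0):
    """Return (deletion, regulator) pairs with an outlying TFA value.

    tfa maps regulator -> {deletion strain -> deduced activity}.
    A value is flagged when it lies more than `threshold` robust
    standard deviations from that regulator's median over all deletions.
    """
    hits = []
    for regulator, by_deletion in tfa.items():
        values = list(by_deletion.values())
        med = median(values)
        mad = median([abs(v - med) for v in values])
        if mad == 0:
            continue  # no spread, so nothing can be called an outlier
        scale = 1.4826 * mad  # MAD rescaled to match a normal std. dev.
        for deletion, value in by_deletion.items():
            if abs(value - med) / scale > threshold:
                hits.append((deletion, regulator))
    return hits


# Hypothetical activities for two regulators over seven deletion strains.
tfa = {
    "TF_A": {"del_1": 0.1, "del_2": -0.1, "del_3": 0.0, "del_4": 0.2,
             "del_5": -0.2, "del_6": 0.05, "del_7": 3.0},
    "TF_B": {"del_1": 0.0, "del_2": 0.1, "del_3": -0.1, "del_4": 0.05,
             "del_5": -0.05, "del_6": 0.02, "del_7": -0.02},
}

print(flag_interactions(tfa))
```

Using the median rather than the mean as the reference level matches the stated hypothesis that interactions are rare, so most deletion experiments should leave a given regulator's activity unchanged.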
The second method to identify network interactions relied upon gNCA. NCA when first derived utilized network architecture to impose constraints and produce a unique solution. In the case of transcriptional regulation, these constraints would be represented with a network map created from a binding assay such as ChIP-chip. gNCA, derived by Tran et al. , is an expansion of NCA that allows the integration of constraints on to the deduced dynamic behaviour. In the case of transcriptional system analyses, these additional constraints would be imposed upon the TFA profiles. This seemingly subtle mathematical development enables the incorporation of important experimental information, such as regulator knockout experiments, used to identify interaction.
Yang et al.  employed gNCA to determine network interactions. Gene expression data from S. cerevisiae cell-cycle experiments including forkhead transcription factor deletion strains were analysed. ChIP-chip data were used along with knowledge of the regulator deletion strains to constrain the problem to a unique solution. To incorporate regulator deletion information, zeros were imposed on the deduced TFA profiles of the forkhead transcription factors over the deletion experiments. They hypothesized that any regulator that exhibited a statistically significant perturbation in its TFA profile over the forkhead transcription factor deletion experiments functionally interacts with the forkhead transcription factors. Seven interaction partners of the forkhead transcription factors were identified, three of which had been previously verified and four of which were putative.
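Imposing zeros on the TFA profile of a deleted regulator amounts to removing that regulator from the fit for the corresponding experiment. The sketch below shows this for a single experiment with the control-strength matrix held fixed; it is a minimal illustration under our own assumptions (function names, matrix and data are hypothetical), not the gNCA algorithm itself.

```python
def lstsq(X, y):
    """Least-squares solution of X w ~ y via the normal equations."""
    n = len(X[0])
    M = [[sum(r[a] * r[b] for r in X) for b in range(n)]
         + [sum(r[a] * t for r, t in zip(X, y))] for a in range(n)]
    for c in range(n):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def constrained_tfa(A, expression, deleted):
    """Deduce activities for one experiment with some regulators forced to zero.

    A is the genes x regulators control-strength matrix, `expression` the
    expression values of all genes in this experiment, and `deleted` the set
    of regulator indices knocked out in it.
    """
    n_tf = len(A[0])
    free = [k for k in range(n_tf) if k not in deleted]
    # Fit only the regulators that are still present.
    X = [[row[k] for k in free] for row in A]
    activities = [0.0] * n_tf
    for k, value in zip(free, lstsq(X, expression)):
        activities[k] = value
    return activities


# Hypothetical network: 5 genes, 3 regulators; regulator 1 is deleted.
A = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
expression = [2.0, 0.0, -1.0, 2.0, -1.0]

print(constrained_tfa(A, expression, deleted={1}))
```

Repeating this over the deletion experiments gives TFA profiles for the remaining regulators, which can then be examined for significant perturbations.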
With the development of NCA and gNCA, a valuable new tool has become available to aid in the elucidation of biological system network dynamics and interactions. Various applications of the method in E. coli and S. cerevisiae have shown this utility. By utilizing known network architecture, provided by the ChIP-chip assay, and microarray gene expression data, NCA has been able to provide insights into the system dynamics and interactions of transcriptional systems.
Since NCA and gNCA are mathematical techniques, their use is not limited to the study of transcriptional systems. Any system capable of being represented by a number of regulators controlling a larger number of outputs can theoretically be analysed by NCA. The advent of the ChIP-chip assay and the DNA microarray has simply made it easier to collect network architectures and biological data, and thus easier to analyse transcriptional systems by NCA. In time, more systems will be analysed by NCA and, in concerted effort with other techniques, a wider, more comprehensive view of systems biology will be achieved.
Artificial gene-metabolic circuits
De novo design of transcriptional and metabolic networks represents another powerful approach for elucidating network design principles and exploring potential applications not limited by natural systems. However, application of complex design scenarios remains challenging, with the disruption of unanticipated cellular pathways remaining a concern. Several synthetic circuits, such as oscillators [21,23], toggle switches [23,24] and feedback loops [25,26], were designed and implemented experimentally to function independently from cellular metabolism and physiology. To enhance control capabilities and create novel functionalities, another dimension can be added to the synthetic circuit architecture by integrating both transcriptional and metabolic controls. Implementation of such a design requires extensive knowledge of an organism's physiology. To this end, we chose E. coli as our host because of the extensive knowledge of its metabolic pathways, metabolic control and transcriptional regulation. We have engineered an intracellular dynamic feedback controller that senses metabolic state and allows separation of growth phase and metabolite production phase to improve lycopene production ; we have constructed a gene-metabolic network for artificial cell–cell communication using acetate as the signalling molecule, thus enabling co-ordinated population-level control ; recently, we have built a synthetic gene-metabolic oscillator that creates autonomous oscillation between two pools of metabolites .
The conceptual design of the metabolic oscillator consists of a flux-carrying network with two interconverting metabolite pools (M1 and M2) catalysed by two enzymes (E1 and E2), whose expression is negatively and positively regulated by M2 respectively. In the first stage, where the M2 level is low, E1 is expressed and E2 is not. A high input metabolic flux drives M1 to M2 rapidly. The accumulation of M2 represses E1 and up-regulates E2. When the backward reaction rate exceeds the sum of the forward reaction rate and the output rate, the M2 level decreases and the M1 level increases. Then, E1 is expressed again and E2 is degraded, returning the system to the first stage (Scheme 1A). On the other hand, if the input flux is low, M2 does not accumulate quickly enough to cause a large swing in gene expression, and a stable steady state can be achieved. This design allows the metabolic state to modulate gene expression cycles, a characteristic commonly seen in circadian rhythm.
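The conceptual loop can be written as a small set of rate equations and integrated numerically. The sketch below is a toy model under our own assumptions: mass-action enzyme kinetics, Hill-type regulation of E1 and E2 by M2, first-order enzyme decay, forward-Euler integration and arbitrary parameter values, none of which are taken from the published circuit. Whether this toy system oscillates or settles to a steady state depends on those choices, so it should be explored rather than read as a prediction.

```python
def simulate(v_in, n_steps=20000, dt=0.01):
    """Integrate the M1/M2/E1/E2 loop with forward Euler.

    v_in is the input flux into M1. Returns a list of
    (M1, M2, E1, E2) tuples, one per time point.
    """
    # Illustrative parameter values only; none come from the source.
    k1, k2, k_out = 1.0, 1.0, 0.1   # forward, backward and output rates
    alpha, delta = 1.0, 0.1         # max. synthesis and decay of enzymes
    K, n = 1.0, 2                   # Hill constant and coefficient

    m1 = m2 = e1 = e2 = 0.0
    trajectory = [(m1, m2, e1, e2)]
    for _ in range(n_steps):
        act = (m2 / K) ** n
        forward = k1 * e1 * m1      # M1 -> M2, catalysed by E1
        backward = k2 * e2 * m2     # M2 -> M1, catalysed by E2
        dm1 = v_in - forward + backward
        dm2 = forward - backward - k_out * m2
        de1 = alpha / (1.0 + act) - delta * e1        # repressed by M2
        de2 = alpha * act / (1.0 + act) - delta * e2  # activated by M2
        m1 += dt * dm1
        m2 += dt * dm2
        e1 += dt * de1
        e2 += dt * de2
        trajectory.append((m1, m2, e1, e2))
    return trajectory


# Compare a higher and a lower input flux (values chosen arbitrarily).
for flux in (1.0, 0.05):
    final = simulate(flux)[-1]
    print(flux, [round(x, 3) for x in final])
```

Varying `v_in` is the natural experiment here, since the design argues that input flux decides between sustained cycling and a stable steady state.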
The design was implemented experimentally within E. coli central metabolism by choosing AcCoA (acetyl-CoA) as M1 and a lumped pool of AcP (acetyl phosphate), acetate and acetic acid as M2. Connecting these two metabolite pools were Pta (phosphotransacetylase), as E1, which converts AcCoA into AcP, and Acs (AcCoA synthetase), as E2, which converts acetate into AcCoA. To control gene expression with M2, the glnAp2 promoter was chosen and used to control two genes, lacI and acs. Pta was under the control of lacO1 (Scheme 1B). To obtain a readout of the circuit, a green fluorescent protein was placed under the control of a tac promoter. When the AcP level is low, LacI and Acs levels are also low, thus derepressing the lacO1 promoter and beginning the production of Pta. As Pta is produced, it converts AcCoA into AcP and thus activates the glnAp2 promoter, which drives the synthesis of Acs and LacI. An increase in the level of LacI represses the transcription of pta, hence lowering the production level of AcP. Simultaneously, as the level of Acs increases, it converts acetate into AcCoA, which helps remove the downstream bottleneck of the AcP degradation pathway and thus also helps lower the AcP level. The key to this design was the control of two metabolite pools, AcCoA and AcP, by two enzymes in the circuit, Pta and Acs. These two enzymes in turn, indirectly and directly, responded to AcP.
Integration of gene circuits into metabolic and transcriptional systems holds the possibility of expanded functionality and increased control. Successful application of such design schema requires extensive knowledge of system properties, such as signal transduction, metabolic and transcriptional dynamics, interactions, and network architecture.
Integrated gene-metabolic circuits require system-wide understanding of an organism. To further this understanding, we have developed a technique called NCA, which, when applied to biological systems, is capable of uncovering the dynamics, function and interaction within cellular networks. To date, NCA has been used to elucidate functional characteristics, transcriptional regulatory dynamics and interactions in S. cerevisiae and E. coli. Additionally, we have been able to utilize the wealth of knowledge on E. coli transcriptional and metabolic systems to engineer artificial feedback regulation, cell–cell communication and oscillatory circuits.
Large-Scale Screening: A Focus Topic at BioScience2005, held at SECC Glasgow, U.K., 17–21 July 2005. Edited by B. Baum (Ludwig Institute, London, U.K.), K. Brindle (Cambridge, U.K.), S. Eaton (Institute of Child Health, London, U.K.) and I. Johnstone (Glasgow, U.K.).
Abbreviations: AcCoA, acetyl-CoA; AcP, acetyl phosphate; Acs, AcCoA synthetase; ChIP-chip, chromatin immunoprecipitation–DNA microarray; ICA, independent component analysis; NCA, network component analysis; gNCA, generalized form of NCA; PCA, principal component analysis; Pta, phosphotransacetylase; TCS, two-component system; TFA, transcription factor activity
© 2005 The Biochemical Society