Diterpene resin acids, together with monoterpenes and sesquiterpenes, are the most prominent defence chemicals in conifers. These compounds belong to the large group of structurally diverse terpenoids formed by enzymes known as terpenoid synthases. CYPs (cytochrome P450-dependent mono-oxygenases) can further increase the structural diversity of these terpenoids. While most terpenoids are characterized as specialized or secondary metabolites, some terpenoids, such as the phytohormones GA (gibberellic acid), BRs (brassinosteroids) and ABA (abscisic acid), have essential functions in plant growth and development. To date, very few CYP genes involved in conifer terpenoid metabolism have been functionally characterized, and these are limited to two systems, yew (Taxus) and loblolly pine (Pinus taeda). The characterized yew CYP genes are involved in taxol diterpene biosynthesis, while the only characterized pine terpenoid CYP gene is part of DRA (diterpene resin acid) biosynthesis. These CYPs from yew and pine are members of two apparently conifer-specific CYP families within the larger CYP85 clan, one of four plant CYP multifamily clans. Other CYP families within the CYP85 clan have been characterized from a variety of angiosperms, with functions in the metabolism of the terpenoid phytohormones GA, BR and ABA. The recent development of EST (expressed sequence tag) and FLcDNA (where FL is full-length) sequence databases and cDNA collections for species of two conifers, spruce (Picea) and pine, allows for the discovery of new terpenoid CYPs in gymnosperms by means of large-scale sequence mining, phylogenetic analysis and functional characterization. Here, we present a snapshot of conifer CYP data mining, discovery of new conifer CYPs in all but one family within the CYP85 clan, and suggestions for their functional characterization.
This paper will focus on the discovery of conifer CYPs associated with diterpene metabolism and CYP with possible functions in the formation of GA, BR, and ABA in conifers.
- CYP85 clan
- cytochrome P450-dependent mono-oxygenase (CYP)
- diterpene resin acid (DRA)
Genome sequencing of Arabidopsis, rice and poplar [1–4] provided new insights into the divergence and evolution of the highly divergent CYP (cytochrome P450-dependent mono-oxygenase) superfamily in angiosperms [3,5]. However, knowledge of genomic organization, phylogeny, expression and functions of plant CYP is largely restricted to angiosperms. Within the gymnosperms, CYPs have only been characterized for a few species in the coniferophyta (conifers), which include the Pinaceae and Taxaceae, representing the largest and economically most important group of conifers.
On the basis of the dominant terpenoid chemical profiles of conifers (Figure 1), specifically the abundance of DRA (diterpene resin acid) oleoresin defence chemicals in the Pinaceae and occurrence of the anticancer drug taxol in species of Taxus, two conifer-specific families of terpenoid CYP oxygenases have been identified (Figure 2). In Taxus, seven different CYPs have been cloned and characterized for various oxidative taxane modifications in the biosynthesis of taxol. Using loblolly pine, a first CYP gene (CYP720B1) of DRA biosynthesis was functionally characterized to catalyse at least two of the three oxidation steps in the conversion of abietadiene to abietic acid and showed activity with several other diterpene alcohols and aldehydes. Except for these CYP genes that are apparently unique to conifers, functions for other gymnosperm CYPs in the CYP85 clan are not known. Despite their biological importance in plant development, there is no report of functional identification of any gymnosperm gene in the CYP85 clan for the biosynthesis or modification of GA (gibberellic acid), ABA (abscisic acid) or BR (brassinosteroid) phytohormones.
The enormous size of conifer genomes (for example, the pine genome is estimated to be 160 times larger than that of Arabidopsis) as well as the limited feasibility of genetic approaches to CYP discovery in conifers suggest that mining of expressed genes in the form of ESTs (expressed sequence tags), FLcDNAs (where FL is full-length) and gene expression profiles, combined with phylogenetic analysis and functional characterization of heterologously expressed proteins, provides an alternative route to CYP discovery in these systems. The recent development of large-scale conifer EST databases, with more than 500000 entries to date from loblolly pine (PlantGDB; http://www.plantgdb.org) and spruce, represents an enormous resource for gymnosperm CYP gene discovery.
We illustrate an approach of CYP mining in conifers by focusing on the CYP85 clan and the closely related CYP701 family (Figures 1 and 2). All previously characterized angiosperm genes within this group are involved in oxidative terpenoid phytohormone metabolism.
We present a topology of the CYP85 clan with the inclusion of gymnosperm genes and possible implications for an understanding of evolution and functions of newly identified conifer CYP. In addition to identification of gymnosperm CYP that cluster together with angiosperm genes of GA, ABA and BR metabolism, we identified one conifer-specific family and two subfamilies within the CYP85 clan that could be indicative of lineage-specific evolution of terpenoid secondary metabolite pathways, as shown for the conifer CYP family associated with taxol (CYP725) biosynthesis.
The CYP85 clan contributes to terpenoid chemical diversity in plants
The CYP85 clan is one of four multifamily clans in plants including the following major families: CYP85, CYP87, CYP88, CYP90, CYP707, CYP708, CYP716, CYP720, CYP724 and CYP728. The CYP701 family, functionally related to the CYP88 family, is phylogenetically close to the CYP85 clan. The CYP85 family, together with CYP90 and CYP724, was shown to catalyse the early steps in the pathway from the triterpenoid campesterol to brassinolide in tomato, Arabidopsis and rice [10–12]. 8′-Hydroxylation of ABA to 8′-OH-ABA, the predominant route of ABA catabolism, is catalysed by the CYP707As in Arabidopsis and barley [13,14]. The biosynthetic routes to GAs have been studied intensively and early biosynthetic key steps, the three-step oxidation of ent-kaurene to ent-kaurenoic acid, and the subsequent analogous oxidation to GA, are catalysed by two multifunctional oxygenases, members of the CYP701A and the CYP88A family in Arabidopsis and pea (Pisum sativum) [15–18]. A series of eight CYP-mediated oxidation steps of the taxane core are crucial for the vast array of taxoids, including taxol, and the CYP725 family was demonstrated to act in the taxol biosynthetic pathway in Taxus. CYP720B1 from pine was shown to oxidize the diterpenoid-derived alcohol and aldehyde precursors of abietic acid and its structurally related isopimaradiene and levopimaradiene derivatives, yielding the major constituents of loblolly pine DRAs.
In addition to the ubiquitous CYP families related to phytohormone metabolism and the conifer-specific terpenoid oxygenases, the CYP85 clan contains several CYP families, some of them species-specific, that still remain uncharacterized.
In the following sections, we will briefly describe the discovery and classification of conifer CYP genes associated with the known families of the CYP85 clan and apparently one conifer-specific CYP85 family and two subfamilies.
Arabidopsis is known to have four CYP90 members; each is associated with a distinct clade within this family. Characterization of respective null-allele mutants for these genes showed that the CYP90 members in Arabidopsis are subfunctionalized and not functionally redundant [19–21]. For each subfamily, CYP90A, CYP90C/D and CYP90B/CYP724, we found rice and conifer candidates (Figure 2B). Interestingly, the CYP724 members of rice and Arabidopsis cluster within the CYP90 family and it was also shown that the rice CYP90B2/OsDWARF4 (where Os is Oryza sativa) and CYP724B1/D11 function redundantly. Considering the structure of the CYP90 subfamilies and the functions of the characterized members, it is likely that newly identified conifer CYP90 genes perform functions in BR metabolism, similar to their characterized angiosperm homologues. This hypothesis can be tested in cross-species complementation assays using existing Arabidopsis CYP90 mutants.
Similar to the CYP90 family, the CYP85 family is clearly substructured by mono- and di-cotyledonous lineages and by the more distant new pine and spruce CYP85 (Figure 2B). Apparently, both Arabidopsis and tomato maintained two copies of this gene, with significant functional redundancy of the Arabidopsis paralogues. It remains to be tested whether the single copies found in conifers are atypical for this family.
The CYP701 and CYP88 families
New pine and spruce genes were also identified within the CYP701 (Figure 2C) and CYP88 families (Figure 2A). The pine and spruce CYP88s, together with the gymnosperm Ginkgo biloba sequence, form a more distant branch relative to the existing mono- and di-cotyledonous lineages. Similarly, the pine and spruce CYP701 form a branch distinct from the angiosperms, which are split within the clade into mono- and di-cotyledonous lineages. This indicates that even though the reactions catalysed by CYP701 and CYP88 are biochemically related, both distinct CYP88 and CYP701 genes involved in the respective late and early oxidation steps of GA biosynthesis existed independently in a common progenitor of angiosperms and gymnosperms.
The CYP725 family is a conifer-specific family
In addition to the functionally characterized Taxus CYP725 involved in taxane hydroxylation, single genes for each of Sitka spruce and loblolly pine were identified in the CYP725 family (Figure 2A). Considering the close distance of the pine and spruce candidates to the Taxus CYP, it is possible that these genes form a functionally coherent family, acting on similar substrates. It will therefore be interesting to test to what extent taxane-related substrates exist in the Pinaceae and whether the spruce and pine CYP725 enzymes function with such compounds. The closest gene outside of the CYP725 family was identified in Arabidopsis, AtCYP718 (where At is Arabidopsis thaliana), which is as yet uncharacterized.
The CYP720B is a conifer-specific subfamily
In addition to the recently functionally characterized CYP720B1 diterpene oxidase of DRA formation in loblolly pine, several new conifer candidates were identified in the CYP720 family (Figure 2B). The large number of genes and divergence within this family are not likely to be due to allelic variation only, but may represent biochemical diversification in conifer terpenoid secondary metabolism. This family lacks rice genes and only a single Arabidopsis CYP720A gene is somewhat distantly related. We recently characterized the spruce homologue of loblolly pine PtCYP720B1 [where Pt is Pinus taeda (loblolly pine)], SsCYP720B4 (where Ss is Sitka spruce), to catalyse all three oxidations from abietadiene to abietic acid (B. Hamberger and J. Bohlmann, unpublished work). It is possible that members of this family each act on multiple diterpene isomers, as was shown for PtCYP720B1, with possible overlapping or unique substrate specificities, thereby enhancing diterpenoid chemical diversity in conifer defence. These ideas can now be tested using the heterologous expression and enzyme assay systems recently developed for PtCYP720B1.
The CYP716B is a conifer-specific subfamily
The CYP716B subfamily is similar to the CYP720B subfamily in that it contains only members from conifers (Figure 2A). Phylogenetic distance to the next closest subfamily in the CYP716 family, represented by the Arabidopsis AtCYP716A1, is significant. To date, no gene of this family has been functionally characterized.
The CYP87 family
Further inspection of the CYP87 family (Figure 2B) will be needed since not a single member of this family could be identified in conifer species. Lack of conifer CYP87 candidates may suggest loss of this family in modern gymnosperms, or that the family evolved recently within angiosperms.
The CYP708, CYP702 and CYP728 families
Both closely related CYP708 and CYP702 families consist exclusively of Arabidopsis genes (Figure 2B) and may be limited to Brassicaceae with distinct and specific functions. Likewise, the CYP728 family (Figure 2A) lacks counterparts in Arabidopsis and conifers but has multiple genes in rice. No gene of these families has been functionally characterized so far.
Approach, bioinformatic algorithms and parameters
Recursive rounds of reciprocal BLAST searches with a low-stringency E-value cut-off were used to extract pine and spruce CYP homologues for the CYP85 clan. Queries were functionally characterized CYP, uncharacterized members from Arabidopsis and rice, and homologues identified in the NCBI databases (http://www.ncbi.nlm.nih.gov). The exhaustive list of rice and Arabidopsis CYP sequences was based on homologous, annotated and characterized candidates.
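The control flow of such a recursive reciprocal search can be sketched as follows. This is a minimal illustration, not the pipeline used for the analysis: the function name, the two search callables and the toy sequence identifiers are all hypothetical, and in practice each callable would wrap a BLAST run filtered at the chosen E-value cut-off.

```python
from typing import Callable, Iterable, Set


def recursive_reciprocal_search(
    seeds: Iterable[str],
    search_target: Callable[[str], Set[str]],
    search_reference: Callable[[str], Set[str]],
    max_rounds: int = 10,
) -> Set[str]:
    """Collect target-database homologues by repeated reciprocal searching.

    search_target(q)    -> IDs in the target (e.g. conifer EST) set hit by q
    search_reference(c) -> IDs in the reference set hit by candidate c
    Both callables are assumed to apply the E-value cut-off themselves.
    """
    reference = set(seeds)        # known clan members used for the reciprocal check
    accepted: Set[str] = set()    # candidates that passed the reciprocal check
    frontier = set(seeds)         # queries for the current round

    for _ in range(max_rounds):
        new_candidates: Set[str] = set()
        for query in frontier:
            for candidate in search_target(query):
                if candidate in accepted:
                    continue
                # Reciprocal step: keep the candidate only if it hits back
                # into the set of known clan members.
                if search_reference(candidate) & reference:
                    new_candidates.add(candidate)
        if not new_candidates:
            break                 # converged: no new homologues this round
        accepted |= new_candidates
        frontier = new_candidates  # recursion: new hits become next queries
    return accepted
```

Passing the searches in as callables keeps the loop independent of how the hits are produced, so the same logic can be exercised on small hand-made hit tables before being pointed at real databases.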
Mapping of conifer candidates on to the CYP85 clan tree was performed using a phylogeny reconstruction of the entire dataset and subsequently of individual families. Phylogenetic analyses on aligned amino acid sequences (dialign2; http://bioweb.pasteur.fr; manually curated; Bioedit) were tested for consistency and the reconstructed maximum likelihood trees were bootstrapped [PhyML; http://atgc.lirmm.fr/phyml; four rate substitution categories, γ shape parameter optimized, JTT (Jones–Taylor–Thornton) substitution model, BioNJ starting tree and 100 bootstrap repetitions]. Tree topologies were supported by TREE-PUZZLE (http://www.tree-puzzle.de).
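The idea behind the bootstrap repetitions can be illustrated with a short sketch. It shows only the two generic steps of a non-parametric bootstrap, resampling alignment columns with replacement and counting how often a clade recurs among the replicate trees. Tree inference itself is not shown, the function names are hypothetical, and the code makes no claim about how PhyML implements these steps internally.

```python
import random
from typing import Dict, FrozenSet, List, Set

Clade = FrozenSet[str]  # a clade is represented by the set of its leaf names


def resample_columns(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Build one bootstrap pseudo-replicate by drawing columns with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(length) for _ in range(length)]
    # The same column indices are applied to every sequence so that
    # homologous positions stay aligned in the replicate.
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_support(
    reference_clades: Set[Clade],
    replicate_clades: List[Set[Clade]],
) -> Dict[Clade, float]:
    """Percentage of replicate trees in which each reference clade appears."""
    if not replicate_clades:
        raise ValueError("at least one replicate tree is required")
    n = len(replicate_clades)
    return {
        clade: 100.0 * sum(clade in rep for rep in replicate_clades) / n
        for clade in reference_clades
    }
```

With 100 repetitions, as used here, the support value for a branch is simply the number of replicate trees that recover the same grouping of sequences.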
Discussion and conclusions
Cloning and phylogenetic analysis of conifer CYP of the CYP85 clan will direct functional characterization of new genes in conifer terpenoid secondary metabolism. Until now, our knowledge of genes in the biosynthesis of terpenoid phytohormones ABA, GA and BR has been based on angiosperm species. In contrast, several P450 genes of diterpenoid secondary metabolism (taxol and DRA biosynthesis) have been identified only in conifers. The new identification of conifer genes in all but one family of the CYP85 clan suggests that much of the diversity of this clan existed prior to the divergence of angiosperms and gymnosperms. The gene clustering presented in the present paper will guide functional testing of the newly cloned conifer CYP for their proposed functions in phytohormone and secondary metabolism and can eventually lead to a reconstruction of common ancestry of angiosperm and gymnosperm CYP involved in ubiquitous phytohormone and specialized secondary metabolism. Analysis of CYP from the lycophyte Selaginella moellendorffii, one of the oldest lineages of vascular plants, or the moss Physcomitrella patens could help to further resolve the ancestry of the CYP85 clan.
An interesting feature of the CYP85 clan is lineage-specific family expansions with new members evolving to acquire novel biochemical functions (paralogues). Examples are the Arabidopsis CYP702 and CYP708 families, the rice CYP728 family, the conifer CYP725 and CYP716 families and the CYP701 family, with rice atypically represented by four members. The biosynthesis of rice-specific diterpenoid phytoalexins/allelochemicals structurally closely related to GA (momilactones, oryzalexins and phytocassanes) includes oxygenation steps most likely involving CYP and possibly explaining the rice CYP701A subfamily diversity.
The conifer-specific families and subfamilies CYP725 and CYP720 of terpenoid secondary metabolism may share common ancestors with CYP genes of ubiquitous phytohormone biosynthesis and may have diversified over time after initial duplication and neo-functionalization. These genes are likely to contribute to the large terpenoid chemical diversity in conifers. The CYP725 family is one example of such biochemical diversification in Taxus and we may find similar interesting and new biochemical functions in the other conifer CYP families that remain to be functionally characterized. The main extant gymnosperm groups diverged after the gymnosperm-angiosperm split. Presence of pine and spruce CYP in the CYP725 family is therefore not surprising. The question arises whether the variety of CYP genes observed in Taxus was recruited from a common gymnosperm progenitor, still represented in other conifers.
8th International Symposium on Cytochrome P450 Biodiversity and Biotechnology: Independent Meeting held at Swansea Medical School, Swansea, Wales, U.K., 23–27 July 2006. Organized and Edited by D. Kelly, D. Lamb and S. Kelly (Swansea, U.K.).
Abbreviations: ABA, abscisic acid; BR, brassinosteroid; CYP, cytochrome P450-dependent mono-oxygenase; DRA, diterpene resin acid; EST, expressed sequence tag; GA, gibberellic acid
- © 2006 The Biochemical Society