The first fully sequenced insect genomes were those of the fruitfly and the mosquito, both from the order Diptera. Now, with an increasing number and diversity of insect genomes becoming available, the diversity of insect P450 genes can be better appreciated and tentative ideas about the evolution of the CYP (cytochrome P450) superfamily in insects can be proposed. There are four large clades of insect P450 genes that existed before the divergence of the class Insecta and that are also represented by CYP families in vertebrates: the CYP2 clade, the CYP3 clade, the CYP4 clade and the mitochondrial P450 clade. P450s with known or suspected physiological functions are present in each of these clades and only a dozen genes appear to have orthologues or very close paralogues in each insect genome. P450 enzymes from each of these clades have been linked to insecticide resistance or to the metabolism of natural products and xenobiotics. In particular, insects appear to maintain a repertoire of mitochondrial P450 paralogues devoted to the response to environmental challenges.
- cytochrome P450 (CYP)
- Drosophila melanogaster
- insect genome
Introduction: insect genomes
Insects, an ancient (>450 million years) taxonomic group, have the largest number of macroscopic species, both described (∼60%) and predicted. These include major pests of agriculture and forestry and major vectors of disease, but also beneficial insects such as major pollinators of our crops or natural enemies of pests. They include markers of biodiversity and model species in biomedical research. Furthermore, insects have evolved sociality several times. The first insect genome was that of the fruitfly Drosophila melanogaster with 85 CYP (cytochrome P450) genes and five pseudogenes (http://P450.antibes.inra.fr). The number of fully sequenced insect genomes grows regularly, and beyond D. melanogaster (180 Mb genome) and a dozen closely related species of Drosophila (which all diverged within 40 million years), we have access now or will shortly have access to data from several vectors of disease: the mosquitoes Anopheles gambiae (220 Mb), vector of malaria, Aedes aegypti (800 Mb), vector of yellow fever and dengue, and Culex pipiens (540 Mb), vector of West Nile virus. All these species belong to the order Diptera. Recently, genome sequences became available for the honeybee Apis mellifera (200 Mb), the silkworm Bombyx mori (530 Mb) and the red flour beetle Tribolium castaneum (200 Mb). These species, a hymenopteran, a lepidopteran and a coleopteran, are, like the dipterans, holometabolous insects, i.e. insects with a complete metamorphosis (endopterygotes, insects with distinct larval, pupal and adult stages). The fossil record dates complete metamorphosis to approx. 300 million years ago, and sequencing is now in progress for two hemipterans, the blood-sucking bug Rhodnius prolixus (670 Mb), vector of Trypanosoma cruzi, and the pea aphid Acyrthosiphon pisum (300 Mb). Hemiptera are insects with an ancestral state of incomplete metamorphosis (exopterygotes or hemimetabolous insects).
Shortly thereafter we will obtain the sequences (i) of Bicyclus anynana (490 Mb), a tropical butterfly, (ii) of Nasonia vitripennis, a parasitoid wasp (330 Mb), and two related species, and (iii) of the body louse Pediculus humanus, another exopterygote, order Phthiraptera. Future genome projects may include the cotton bollworm Helicoverpa armigera (400 Mb), the tsetse fly Glossina morsitans (200 Mb), the Asian tiger mosquito Aedes albopictus, the Eastern tree hole mosquito Ochlerotatus (Aedes) triseriatus and the medfly Ceratitis capitata.
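As a reading aid, the genome sizes quoted above can be grouped by insect order. The following is only a minimal sketch: the names GENOMES and sizes_by_order are illustrative, and the list is restricted to the species whose order and genome size are both stated explicitly in the text.

```python
from collections import defaultdict

# (species, order, genome size in Mb), as quoted in the text above.
GENOMES = [
    ("Drosophila melanogaster", "Diptera", 180),
    ("Anopheles gambiae", "Diptera", 220),
    ("Aedes aegypti", "Diptera", 800),
    ("Culex pipiens", "Diptera", 540),
    ("Apis mellifera", "Hymenoptera", 200),
    ("Bombyx mori", "Lepidoptera", 530),
    ("Tribolium castaneum", "Coleoptera", 200),
    ("Rhodnius prolixus", "Hemiptera", 670),
    ("Acyrthosiphon pisum", "Hemiptera", 300),
]


def sizes_by_order(genomes):
    """Group genome sizes (Mb) by insect order, keeping input order."""
    grouped = defaultdict(list)
    for _species, order, size_mb in genomes:
        grouped[order].append(size_mb)
    return dict(grouped)


by_order = sizes_by_order(GENOMES)
for order, sizes in sorted(by_order.items()):
    print(f"{order}: n={len(sizes)}, {min(sizes)}-{max(sizes)} Mb")
```

Even this small sample shows that genome size varies severalfold within a single order (180 to 800 Mb among the four dipterans listed).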
Beyond insects, genome projects are under way for the crustacean Daphnia pulex, a water flea, and for two Chelicerate arthropods, the tick Ixodes scapularis and the spider mite Tetranychus urticae. This wealth of new data represents a tremendous annotation challenge for the ancient superfamily of CYP genes because it is one of the most abundant families of genes found in eukaryotic genomes, as each release of a new completed genome sequence amply demonstrates. Annotation of the fruitfly, mosquito and honeybee P450 families each brought specific problems. These included (i) nomenclature (lump new sequence into existing CYP family or split into new CYP family?); (ii) sequence and assembly quality (how do you assemble genes that are ostensibly spread on to two or more sequence contigs? Will a pseudogene in release 1.0 become a ‘real’ gene in release 2.0?); and (iii) homozygosity of the genomic DNA that was sequenced (in An. gambiae, a large P450 cluster was present on contigs that originated from the DNA of two different cytotypes). Annotation work is therefore a continuing effort, and new experimental data, particularly the PCR amplification and resequencing of problematic P450s, are often needed.
Initial and tentative ideas about the evolution of the CYP genes are presented here, with original information gathered from the annotated genomes [1–3], from P450 websites (http://drnelson.utmem.edu/cytochromeP450.html; http://P450.antibes.inra.fr) and the websites for the various insect genomes, particularly FlyBase (http://flybase.indiana.bio.edu/).
Four clades of insect CYP genes
Analysis of the available sequences indicates that insect CYP genes fall into four major clades (Figure 1). These clades correspond to branches above the family level described as subclasses by Gotoh and as clans by Nelson. Each one of these four clades includes some CYP families from vertebrate species, and for clarity, these have been named here and before as the CYP2, CYP3, CYP4 and mitochondrial clades. Implicit in this designation is the fact that these four groups of genes were represented by at least one member in the last common ancestor of vertebrates and insects. Insects do not make sterols and have lost the CYP51 gene. Interestingly, insects appear to lack P450s related to CYP26, CYP7 and CYP8, and these families have probably diverged from CYP51 more recently in deuterostomes. The origin of CYP19, also missing in insects, is less clear.
Table 1 gives a current summary of the number of genes in each clade for each insect genome sequenced to date, as well as the CYP families represented in each clade. P450s from these four clades are represented by ESTs (expressed sequence tags) of the pea aphid and of the migratory locust, two exopterygotes. Also indicated are the CYP genes for which presumed orthologues are found in almost all species. In the mosquito An. gambiae, only ten P450 gene orthologues were found compared with the predicted ±40, and the lower number (46) of honeybee Ap. mellifera P450 genes also includes ten genes with orthologues in the Diptera. However, there are more P450s with known physiological functions, and this indicates that specialized physiological functions in different insect orders recruit new P450s after gene duplication events. The evolution of new function following gene duplication is therefore not restricted to ‘detoxification’ or ‘environmental response’. In the Drosophila lineage, a P450 gene duplication event was estimated to occur on average every 5 million years. The sequence of 12 closely related Drosophila species will provide a fertile ground for the study of P450 gene duplication and gene loss, but also intron loss and gain.
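To give a rough sense of scale, the estimated rate of one P450 duplication every 5 million years can be combined with the 40-million-year window within which the sequenced Drosophila species diverged. The sketch below is a back-of-the-envelope calculation only: it assumes a constant rate along a single lineage and takes 40 million years as an upper bound, and the function name expected_duplications is illustrative.

```python
def expected_duplications(elapsed_myr, interval_myr=5.0):
    """Expected number of duplication events along one lineage.

    Assumes a constant average rate of one event per `interval_myr`
    million years; this is a simplifying assumption, not a fitted model.
    """
    if interval_myr <= 0:
        raise ValueError("interval_myr must be positive")
    return elapsed_myr / interval_myr


# Upper bound on the divergence time among the sequenced Drosophila species.
events_per_lineage = expected_duplications(40.0)
print(events_per_lineage)  # 8.0
```

Under these assumptions, each lineage would have accumulated at most about eight P450 duplications since the species diverged, which is small relative to the 85 CYP genes of D. melanogaster yet large enough to be detected in a comparison of 12 genomes.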
Mitochondrial P450: the insect difference
Mitochondrial P450s are not found in plants or fungi, but seem restricted to animals. In the pond snail Lymnaea stagnalis, a CYP10 sequence appears to encode a mitochondrial P450, and its tissue localization suggests an endocrine function. In vertebrates, the mitochondrial P450s are all involved in steroid or vitamin D metabolism. Interaction with adrenodoxin involves several basic residues that are conserved and easily recognized. Several years ago, the characterization of CYP12A1 from the housefly revealed that this P450 was structurally close to the vertebrate mitochondrial P450s, and indeed proof of the mitochondrial targeting of this P450 was unambiguous: reduction of the recombinantly produced CYP12A1 was rapid and efficient with bovine adrenodoxin and adrenodoxin reductase, whereas reduction by the housefly microsomal P450 reductase was sluggish. Furthermore, immunogold histochemistry showed concentration of the CYP12A1 protein in mitochondria. The fly and mosquito genomes then showed that the CYP12 family consisted of half a dozen members in each species. Several of the Halloween genes CYP302, CYP314 and CYP315 (see Rewitz et al. on pp. 1256–1260 of this issue) as well as a highly conserved P450 of unknown function, CYP301, also belong to the mitochondrial P450 clade.
It thus appears that there are two types of mitochondrial P450s in insects: on one hand conserved genes involved in essential physiological functions, and on the other hand a variable number of taxon-specific paralogous P450s that are rapidly evolving. CYP12A1 of the housefly is phenobarbital-inducible, and is constitutively overexpressed in an insecticide-resistant strain. It metabolizes a variety of xenobiotics, but apparently not insect ecdysteroids. Drosophila and Anopheles CYP12 (Cyp12d1/2, Cyp12a4 and CYP12F1) have now also been associated with insecticide resistance. Cyp12d1 and Cyp12d2 are the products of a very recent gene duplication event.
The two types of mitochondrial P450 are monophyletic with the vertebrate and nematode mitochondrial P450s. It is interesting to speculate about the origin of the mitochondrial P450s from an ancestral microsomal P450. CYP44A1 is the only mitochondrial-type P450 of Caenorhabditis elegans, but silencing of the CYP44A1 transcript by RNA interference does not lead to obvious abnormalities in development. This argues against a role of CYP44A1 in the endocrinology or development of the nematode. The vertebrates, then, are characterized by a single type of mitochondrial P450, three families all involved in essential physiological functions. However, they complement their mitochondria with additional P450s by alternative targeting of microsomal P450, for instance proteolytic truncation of CYP1A1.
The CYP2 clade: conserved sequences in an ancient clan
The ‘founder’ mitochondrial P450 gene probably originated in the CYP2 clade, which comprises a large number of families and includes P450s with essential physiological functions (CYP17 and CYP21) as well as P450s with broader substrate specificity (CYP2). In insects, several members of the CYP2 clade have been characterized. Phantom (Cyp306a1) is a Halloween gene encoding an ecdysteroid 25-hydroxylase. Spook (Cyp307a1) is another Halloween gene, expressed in early embryos and in adult ovaries, but not in the prothoracic glands of larvae. This suggests that the ecdysone pathway may utilize different branches depending on the developmental stage. Drosophila Cyp18 was initially cloned as an ecdysone-responsive gene (Eig17-1), up-regulated by the moulting hormone. It is a predicted target of the miR-276b microRNA. CYP18 is absent from the mosquito An. gambiae, but is found in Aed. aegypti and all other species. No experimental, comparative evidence on ecdysone metabolism is available to test the hypothesis that CYP18 is involved in C26 hydroxylation/oxidation. CYP15 is the stereospecific epoxidase involved in juvenile hormone biosynthesis and is lacking in Drosophila, which has a non-typical juvenile hormone. nompH (Cyp303a1) is a named Drosophila gene with orthologues in all insect species. Its function is unclear but CYP303A1 probably metabolizes an essential signal in the development of some external sensory organs.
The CYP3 clade: multiple insect-specific blooms of gene duplications
The CYP6 genes were among the first genes of this clade to be cloned and characterized from insects. They are related to the vertebrate CYP3 and CYP5 families. Genes in this clade are the most numerous among insect P450 genes, and are often found in large clusters. Considerable evidence links members of this clade to xenobiotic metabolism and also insecticide resistance, and several are inducible by phenobarbital, pesticides and natural products. In the honeybee, which has the lowest number of P450 genes, a very recent expansion of the CYP6AS subfamily has been observed. This appears to be a recurrent event in different insect lineages for different families and subfamilies in the CYP3 clade.
Cyp310a1 of Drosophila is expressed specifically in the wing imaginal disc of the developing larva [14,15] and not in the haltere imaginal disc. Wings and halteres are the adult dorsal appendages of the second and third thoracic segments, and their different fate is controlled by the homoeotic gene Ultrabithorax. Along with a Kazal-type serine protease inhibitor, Cyp310a1 appears to be involved in the signal transduction of the Wg (Wingless) pathway, and in fact, these genes are negatively regulated by Wg. Cyp310a1 is on the negative strand of the single intron of the MESR3 (misexpression suppressor of ras3) gene. In An. gambiae, no P450 gene is embedded in the MESR3 orthologue.
The ‘insect CYP4’ and their relatives
Insect CYP4 genes and their relatives are also very numerous in insect genomes, with the notable exception of the honeybee, which has only four members of the CYP4 family, including a CYP4AA1 orthologue. The great diversity of genes in the CYP4 clan is also reflected in a great diversity of functions. Some CYP4s appear to be inducible metabolizers of xenobiotics; others have been linked to odorant or pheromone metabolism. The CYP4G genes are abundantly expressed and their products probably perform an important physiological function. CYP4C7 is a sesquiterpenoid ω-hydroxylase in cockroach endocrine glands that produce juvenile hormone. Paradoxically, this highly diversified clade is the least studied in insects. The P450 enzymes of all four clades will undoubtedly reveal many facets of the highly successful physiological adaptations of insects to their varied food sources and environments.
8th International Symposium on Cytochrome P450 Biodiversity and Biotechnology: Independent Meeting held at Swansea Medical School, Swansea, Wales, U.K., 23–27 July 2006. Organized and Edited by D. Kelly, D. Lamb and S. Kelly (Swansea, U.K.).
Abbreviations: CYP, cytochrome P450; MESR3, misexpression suppressor of ras3; Wg, Wingless
- © 2006 The Biochemical Society