Phosphorylation plays essential roles in nearly every aspect of cell life. Protein kinases regulate signalling pathways and cellular processes that mediate metabolism, transcription, cell-cycle progression, differentiation, cytoskeleton arrangement and cell movement, apoptosis, intercellular communication, and neuronal and immunological functions. Protein kinases share a conserved catalytic domain, which catalyses the transfer of the γ-phosphate of ATP to a serine, threonine or tyrosine residue in protein substrates. The kinase can exist in an active or inactive state regulated by a variety of mechanisms in different kinases that include control by phosphorylation, regulation by additional domains that may target other molecules, binding and regulation by additional subunits, and control by protein–protein association. This Novartis Medal Lecture was delivered at a meeting on protein evolution celebrating the 200th anniversary of Charles Darwin's birth. I begin with a summary of current observations from protein sequences of kinase phylogeny. I then review the structural consequences of protein phosphorylation using our work on glycogen phosphorylase to illustrate one of the more dramatic consequences of phosphorylation. Regulation of protein phosphorylation is frequently disrupted in the diseased state, and protein kinases have become high-profile targets for drug development. Finally, I consider recent advances on protein kinases as drug targets and describe some of our recent work with CDK9 (cyclin-dependent kinase 9)–cyclin T, a regulator of transcription.
- cyclin-dependent kinase 9–cyclin T (CDK9–cyclin T)
- glycogen phosphorylase
- protein kinase
- protein kinase inhibitor
The kinome and phylogeny
The discovery of phosphorylation as a regulatory physiological mechanism arose from the work in 1955 of Eddie Fischer and Ed Krebs  who showed that the activation of GPb (glycogen phosphorylase b) to GPa (glycogen phosphorylase a) was dependent on a protein kinase action together with ATP. They also found, by accident, that calcium leached from filter paper was important to activate the kinase through limited proteolysis by a calcium-dependent protease. The second enzyme discovered to be controlled by phosphorylation was glycogen synthase, and, for many years, control by phosphorylation was considered to be an idiosyncrasy of glycogen metabolism. It was the isolation and purification of PKA (protein kinase A, also known as cAMP-dependent protein kinase) in 1968 that showed protein phosphorylation to be a much more widespread phenomenon . Protein phosphorylation in eukaryotes appeared to be confined to serine and threonine residues until, in 1979, Tony Hunter and colleagues identified the third phospho-amino acid, phosphotyrosine, as the product of a protein kinase activity in immunoprecipitates of a viral oncoprotein . In cell extracts, the more abundant phosphothreonine peak can obscure the phosphotyrosine peak, but, in these experiments, Hunter and colleagues had accidentally allowed their electrophoresis buffer to become more acidic, leading to better separation of the phosphotyrosine peak . In humans, phosphorylation on serine, threonine and tyrosine is approx. 86.4, 11.8 and 1.8% respectively . Although less common than serine or threonine phosphorylation, tyrosine phosphorylation is crucially important in health and disease. 
The discoveries that many other cytoplasmic oncoproteins possess tyrosine kinase activity and that this activity is associated with their transforming ability, together with discoveries that membrane receptors for growth factors and hormones were also tyrosine kinases, provided deep insights into the regulation of normal and abnormal cellular processes.
The completion of the human genome sequencing project allowed the identification of the whole human kinome in 2002 . This landmark paper produced the iconic diagram of the protein kinome in which the ePKs (eukaryotic protein kinases) could be assigned to seven major groups on the basis of characteristic sequence identification features: AGC [including PKA, PKG (protein kinase G) and PKC (protein kinase C) families], CAMK (Ca2+/calmodulin-regulated kinases), CK1 (casein kinase 1 family), CMGC [including CDKs (cyclin-dependent kinases), MAPKs (mitogen-activated protein kinases), GSK (glycogen synthase kinase) and CDK-like kinases], STE (related to yeast sterile kinases), tyrosine kinases and TKL (tyrosine kinase-like). Within each group, the kinases also shared related functions. In addition an eighth group, RGC (receptor guanylate cyclase kinases), was added, and there was a further group labelled ‘other’. The analysis also included the aPKs (atypical protein kinases), proteins reported to have biochemical kinase activity, but which lack sequence similarity to the ePK domain. Manning et al.  identified a total of 518 protein kinases in the human kinome, comprising 478 ePKs and 40 aPKs.
Phylogenetic comparison of the human kinome with that of yeasts, nematode worms, fruitflies and other organisms shows that most ePK families are shared among different eukaryotes. The most dramatic difference is the lack of tyrosine kinases in the unicellular organisms and the relatively few such kinases in the greatly expanded plant kinome (Table 1). The rice kinome contains 40% more kinases than does Arabidopsis and is three times larger than the human kinome. About two-thirds of all rice kinases fall into the TKL group (1179 out of 1778 ePKs), the group whose members have sequence similarity to the tyrosine kinases, but which show serine/threonine kinase activity. They include both receptor and cytoplasmic kinases, mostly of unknown function. Although part of the expansion of the plant kinome can be attributed to an early genome duplication event and many subsequent tandem gene duplications, much work remains to be carried out to understand the plant kinome in the context of plant physiology. In view of their roles in cell–cell communication and signalling, tyrosine kinases were assumed to have emerged during the evolution of unicellular to multicellular organisms. The recent sequencing of the choanoflagellate Monosiga brevicollis, a unicellular organism, which is the closest known relative of metazoans, has given fascinating insight into the emergence of tyrosine kinases as signalling molecules. Out of 9200 genes, the genome contains 331 protein kinases, of which ∼95 are tyrosine kinases (D. Miranda-Saavedra, personal communication). These constitute the largest group of tyrosine kinases in a kinome, a startling result for a unicellular organism. Moreover, this organism shows an equivalent expansion of tyrosine phosphatase and phosphotyrosine-recognition domains [e.g. SH2 (Src homology 2) domain] [9,10]. Comparison with other unicellular organisms (e.g. yeast, Dictyostelium discoideum) suggests that the phosphatases and SH2 domains emerged early in evolution, but were of little use until the emergence of tyrosine kinase activity. Many of the domains associated with phosphotyrosine recognition in M. brevicollis have unique architecture with no specific homology with human proteins. The elaborate signalling network shows little orthology with metazoan counterparts, yet displays many innovations reminiscent of metazoans. As Manning et al. comment: “This uniquely divergent and elaborate signalling network illuminates the early evolution of phosphotyrosine signalling and shows extensive convergent evolution.”
The parallel evolution of the protein phosphatases has also been important for the regulation of cellular control by phosphorylation [11,12]. Eukaryotic protein phosphatases are structurally and functionally diverse enzymes that are represented by three distinct gene families. The majority of phosphoserine and phosphothreonine dephosphorylation is accounted for by the PPP (phosphoprotein phosphatase) family (which includes PP1, PP2A, PP2B and others) and the PPM (metallo-dependent protein phosphatase) family (including PP2C). The PPP and PPM families are unrelated in sequence and probably evolved from two ancestral genes. The PTP (protein tyrosine phosphatase) family is different again. A subfamily of the PTPs, the dual-specificity phosphatases, is able to dephosphorylate all three phosphoamino acids. Within each family, the catalytic domains are highly conserved, with functional diversity endowed by regulatory domains and subunits.
Structural consequences of phosphorylation
Phosphorylation can have profound effects on the function of the target protein. The phosphoryl group with a pKa of ∼6.7 is likely to be predominantly dianionic at physiological pH. The property of a double negative charge (a property not carried by any of the naturally occurring amino acids) and the capacity for the phosphoryl oxygens to form hydrogen-bond networks confer special characteristics. Two types of interaction predominate. First, at tight binding sites used to stabilize a conformational state, the phosphate group frequently interacts with the side chain of one or more arginine residues (Figure 1). The guanidinium group of an arginine is well suited for such interactions because of its planar structure and its ability to form multiple hydrogen bonds. The guanidinium group (pKa>12) is a poor proton donor and cannot function as a general acid in the hydrolysis of phosphorylated amino acids. Theoretical calculations on the strengths of hydrogen bonds have shown that the bidentate interactions available to arginine with phosphate provide much stronger interactions than those that can be formed with -NH3+ groups as in lysine side chains. Secondly, an interaction often observed at less tight phosphoryl group-binding sites involves the interaction of the phosphate group with the main-chain nitrogens at the start of an α-helix, utilizing the positive charge of the helix dipole. In addition, a number of other polar residues may also be involved in contacts, including lysine, histidine, tyrosine, serine and threonine.
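The statement that a phosphoryl group with a pKa of ∼6.7 is predominantly dianionic at physiological pH can be checked with the Henderson–Hasselbalch relation. The following is a minimal sketch only: the function name is illustrative, and the value of 7.4 for physiological pH is an assumption not given in the text.

```python
def dianion_fraction(ph: float, pka: float = 6.7) -> float:
    """Fraction of phosphoryl groups carrying a double negative charge.

    Applies the Henderson-Hasselbalch relation to the second ionization
    (monoanion <-> dianion). The first ionization is assumed to be
    complete at these pH values and is ignored.
    """
    ratio = 10.0 ** (ph - pka)  # [dianion] / [monoanion]
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # pH 7.4 is an assumed value for "physiological pH".
    for ph in (6.0, 6.7, 7.0, 7.4):
        print(f"pH {ph:.1f}: {dianion_fraction(ph):.2f} dianionic")
```

Under these assumptions, half of the groups are dianionic at pH 6.7 and roughly five in six at pH 7.4, which is consistent with "predominantly dianionic" while leaving a significant monoanionic population.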
Phosphorylation can activate enzyme activity through allosteric conformational changes, as observed for glycogen phosphorylase  and many protein kinases that rely on phosphorylation by upstream kinases for activity . Phosphorylation can inhibit enzyme activity, as observed in isocitrate dehydrogenase, where the phosphate group acts as a steric blocking agent and does not promote any conformational change , and in CDK2, where phosphorylation on Tyr15 impedes protein substrate recognition . Phosphorylation can lead to recognition sites for other protein molecules, such as in the phosphotyrosine-recognition SH2 domains important for regulation of kinases such as Src, ZAP70 (ζ-chain-associated protein kinase of 70 kDa), Fes and Abl protein. Recent results have shown how recognition of a phosphotyrosine residue by the SH2 domain in Fes is coupled to substrate recognition through co-operative SH2-kinase–substrate interactions . Less extensive, but also important, are the regulatory domains that recognize phosphoserine or phosphothreonine such as the 14-3-3 proteins  or the Polo-box domain of Polo-like kinase where a phosphoserine site is recognized by the Polo-box domain, which then targets the Polo-like kinase to its substrate . In a further variation of specific site recognition, some protein kinases require hierarchical substrate phosphorylation where the phosphorylation of one site is necessary to create a recognition site to allow subsequent phosphorylation, as occurs in the phosphorylation of the substrate APC (adenomatous polyposis coli) protein by CK1 and GSK3 as part of the pathway for β-catenin degradation in Wnt signalling . 
Phosphorylation may also cause an order-to-disorder transition as in the K+ channel inactivation domain , or it may cause a disorder-to-order transition as in the KIX (kinase-inducible interaction)/pKID (phosphorylated kinase-inducible domain) CBP [CREB (cAMP-response-element-binding protein)-binding protein] co-activator protein/CREB transactivation domain system , these transitions being demonstrated in NMR structural studies. Phosphorylation can promote conformational changes that lead to protein association as in ERK (extracellular-signal-regulated kinase) 2  or STAT (signal transducer and activator of transcription) proteins [26,27] and entry to the nucleus. Phosphorylation may also cause protein–protein disassociation as in the CDK-mediated phosphorylation of pRb (retinoblastoma protein) that promotes dissociation of pRb from the transcription factor E2F/DP1 . For almost all of these systems, structures have explained the molecular basis for these phenomena.
Glycogen phosphorylase: an example of the structural consequences of phosphorylation and the evolutionary history of control by phosphorylation
Glycogen phosphorylase was the first phosphoprotein structure to be understood in its non-phospho and phospho states. The work demonstrates how phosphorylation can lead to activation. According to the Monod–Wyman–Changeux theory for allosteric activation, it is assumed that the enzyme exists in two (or at least two) functional states, the T (tense) state, which is less active and is characteristic of non-phospho GPb, and the R (relaxed) state characteristic of the active-state phosphorylated GPa. The equilibrium between the two can be controlled both by non-covalent ligands and by phosphorylation. Both mechanisms are used in mammalian muscle, but the response to stimulation by adrenaline that leads to activation of PhK (phosphorylase kinase) and phosphorylation of GPb to GPa results in nearly instantaneous activation that is needed for a fight or flight response. The activation of phosphorylase by phosphorylation is the end result of one of the best understood signalling pathways. As a result of numerous structural studies over the years, we now understand the metabolic transformations and the structures of the players involved in this pathway (Figure 2). There are two major gaps in our understanding of the structural basis of this pathway: first, how an activated GPCR (G-protein-coupled receptor) in turn binds and activates a heterotrimeric G-protein and, secondly, how phosphorylation of PhK by PKA activates this very large kinase complex. There has been progress in understanding the structure of PhK (1.3×10⁶ Da) from three-dimensional image reconstruction of cryo-electron micrographs and the X-ray structure of the catalytic γ subunit [30,31], but how the large complex of α and β subunits together with a calmodulin domain acts to restrain the kinase in the inactive state until stimulation remains a subject for future work.
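The two-state picture invoked above for GPb and GPa can be made concrete with the standard Monod–Wyman–Changeux expression for the fraction of enzyme in the R state. This is a sketch under stated assumptions, not a fitted model of phosphorylase: the numerical values of the allosteric constant L, the ligand term alpha and the affinity ratio c are hypothetical, and phosphorylation is represented simply as a lowering of L.

```python
def r_state_fraction(L: float, alpha: float = 0.0, c: float = 0.01, n: int = 2) -> float:
    """Fraction of enzyme in the R state in the MWC two-state model.

    L     : allosteric constant [T0]/[R0] in the absence of ligand
    alpha : activator concentration divided by its R-state dissociation constant
    c     : K_R / K_T (c < 1 means the activator binds preferentially to R)
    n     : number of equivalent binding sites (2 for a dimer)
    """
    r_weight = (1.0 + alpha) ** n
    t_weight = L * (1.0 + c * alpha) ** n
    return r_weight / (r_weight + t_weight)


if __name__ == "__main__":
    # Hypothetical constants chosen only to illustrate the two control routes.
    L_NON_PHOSPHO = 1000.0  # T state strongly favoured (GPb-like)
    L_PHOSPHO = 0.1         # R state favoured (GPa-like)
    print("non-phospho, no activator:  ", r_state_fraction(L_NON_PHOSPHO))
    print("non-phospho, with activator:", r_state_fraction(L_NON_PHOSPHO, alpha=100.0))
    print("phospho, no activator:      ", r_state_fraction(L_PHOSPHO))
```

In this sketch, both a non-covalent activator (raising alpha) and phosphorylation (lowering L) shift the population towards the R state, which mirrors the two control routes described for the muscle enzyme.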
Glycogen phosphorylase catalyses the first step in the breakdown of glycogen. The reaction involves the phosphorylysis of the α-1,4-glycosidic bond of the terminal sugar of glycogen to yield glucose 1-phosphate. Phosphorylation by PhK takes place on Ser14, near the N-terminus of this large polypeptide chain (842 residues). The question is how does phosphorylation of a single serine residue activate such a large molecule?
Glycogen phosphorylase is a dimer in which the two subunits are related by a 2-fold axis of symmetry. The catalytic site is at the centre of each subunit and well away from the subunit–subunit interface. In GPb, the non-phospho form of the enzyme, the N-terminal region contacts the subunit at an acidic site on the surface, an environment that is consistent with the sequence of positively charged groups that surround Ser14 (Figure 3A) (sequence K9RKQLS14VR16 using the single-letter amino acid code). The N-terminal region is less well ordered than the rest of the structure, a feature that allows it to adapt to the PhK catalytic site. On phosphorylation, Ser14 shifts approx. 34 Å (1 Å=0.1 nm) and the phosphoserine residue contacts two basic residues at the subunit–subunit interface, Arg69 from one subunit and Arg43′ from the other subunit (Figure 3B). The phosphorylase structure demonstrates the importance of arginine residues in creating phosphate-recognition sites. These contacts and others result in a tightening of interactions at this interface.
Our understanding of the allosteric mechanism of phosphorylase, whereby binding events at sites that are over 45 Å from the catalytic site can nevertheless activate the enzyme, rests on the intimate connection between the tertiary structure and the quaternary structure as the structure goes from the less active T state (GPb) to the more active R state (GPa). The tightening of the subunit interfaces in the vicinity of the phosphoserine sites is accompanied by a rotation of approx. 10 ° of one subunit with respect to the other about an axis approximately normal to the 2-fold axis of the GP dimer. This leads to changes at the subunit–subunit interface at the other side of the molecule, as can be seen in the view normal to the 2-fold axis (Figures 3C and 3D). In the T state, this interface is characterized by helix–helix packing of two helices one from each subunit, the so-called tower helices. Each helix is followed by a loop of chain termed the 280s loop (that contains Asp283) that blocks access for the large substrate glycogen to the catalytic site. On the transition from T to R state, the two helices pull apart and change their angle of tilt. The 280s loop becomes displaced, and the catalytic site is now accessible.
These allosteric changes create a substrate-recognition site for the substrate phosphate. Glycogen phosphorylase contains the cofactor PLP (pyridoxal 5′-phosphate), and the 5′-phosphate of PLP is essential for catalysis. It is proposed that the 5′-phosphate acts as a general acid to promote attack by the inorganic phosphate on the glycosidic bond of the terminal sugar in a glycogen chain. This requires direct PLP 5′-phosphate interaction with the inorganic substrate phosphate. In the T state of the enzyme, the acidic residue from the 280s loop, Asp283, is in the catalytic site directed towards the 5′-phosphate. The negatively charged aspartate residue hinders binding of the negatively charged inorganic substrate phosphate. When the structure changes to the R state, the 280s loop is displaced; the acidic group is removed and is replaced by a positively charged group, Arg569, thus creating a substrate-binding site for the inorganic phosphate .
A variation on this theme of control by phosphorylation is seen with yeast glycogen phosphorylase. Yeast and mammalian muscle phosphorylase have overall sequence identity of 46% over the large polypeptide chain. However, they differ significantly at the N-terminal region, where the intron splice site before residue 80 has been used to fuse two rather different peptide regions. Independent genetic-fusion events have joined unrelated segments to a conserved core. Yeast glycogen phosphorylase is longer by 39 residues at the N-terminus and its site of phosphorylation is at Thr−10 (using the rabbit muscle sequence). The reaction is catalysed not by PhK, but by a yeast PKA. In the inactive form of the enzyme, the N-terminal region blocks access to the catalytic site. But, in the active phospho-form of the enzyme , the phosphothreonine docks into a site that is conserved between mammalian and yeast enzymes that contains two arginine residues (Arg309 and Arg310), which, in mammalian phosphorylase, is used to bind the phosphate of AMP. The enzyme is kept in its inactive state by a steric blocking mechanism. Activation by phosphorylation serves to remove the steric block without allosteric changes at the catalytic site, although there are changes at the subunit interface that regulate access to the inhibitor glucose 6-phosphate or activator threonine phosphate. The Escherichia coli maltodextrin phosphorylase shows another variation. This enzyme is not regulated by post-translational modification, but regulation takes place at the level of gene transcription through relief of repression of the mal genes, which is mediated by cAMP in response to the need to utilize oligosaccharides. Overall, mammalian and E. coli phosphorylase are 46% identical in amino acid sequence, but the E. coli enzyme is smaller (796 amino acids) and lacks 17 residues at the N-terminal region. Again, the region up to the first intron–exon boundary of the mammalian enzyme (residue 80) is different. 
The structural studies showed that the enzyme is held in an active conformation by virtue of the subunit–subunit contacts from regions that show sequence differences from the mammalian enzyme and the loss of the regulatory sites present in the mammalian enzyme [33,35]. The open access to the conserved catalytic site and the correct disposition of the arginine residue provide an explanation for the activity without control. Thus enzymes that carry out a similar function show different mechanisms for regulation. Control by phosphorylation has evolved differently in metazoans and in the unicellular organism, whereas the prokaryotic system has evolved a system of regulation at the gene level.
Protein kinase inhibitors
Protein kinases have become prime targets for drug intervention in the diseased state, especially in cancer. In 2008, there were ten protein kinase inhibitors that had been approved for use in the clinic (Table 2) and there are many more in clinical trials . All of the approved compounds target the ATP-binding site on the kinase, with the exception of temsirolimus that targets a domain of the mTOR (mammalian target of rapamycin) kinase that is involved with substrate localization. The ATP site is common to all the protein kinases and hence it is remarkable that targeting this site can result in inhibitors that are specific for just a few kinases. Selectivity is engineered by targeting pockets adjacent to the ATP site with groups that have chemical diversity.
Structural analyses of compounds bound to the ATP site have revealed the following five (or at least five) pockets that can be exploited for binding different chemical groups (reviewed in ) (Figure 4). (i) The adenine-binding pocket that is lined by aliphatic and aromatic hydrophobic groups with the potential for two or more hydrogen bonds to the hinge region. Many groups with planar geometry based on sp2 hybridization bind here provided that they also contain at least one group that can hydrogen-bond with the hinge region. (ii) The ribose-binding pocket that has several polar groups. (iii) The so-called ‘p38’ pocket (recognized in the p38α–BIRB-796 complex and in the Abl–imatinib complex) above the adenine, which in p38 and other kinases such as Abl and Raf is guarded by a small residue (usually threonine) known as the ‘gatekeeper’, but which in other kinases is blocked by a more bulky residue [e.g. phenylalanine (Phe80) in CDK2]. (iv) The pocket below the adenine, the Lys89 pocket in CDK2 or the hydrophobic pocket in other kinases, that allows targeting by a number of different groups. (v) The Type II inhibitor pocket formed between the C-helix and the activation segment, which accommodates some large inhibitors that target the inactive conformation .
There have been developments in the generation of allosteric inhibitors of protein kinases. BIRB-796 in complex with p38α was an example of a compound that only partly overlapped with the ATP site and bound to the inactive conformation, eliciting a slow conformational change. The MEK1 (MAPK/ERK kinase 1) inhibitors PD184352 and the more potent compound PD0325901 are more striking. The relatively small PD184352 binds to a site tucked under the C-helix that is adjacent to the ATP site in the inactive conformation. It does not overlap ATP and acts as an allosteric inhibitor. These compounds have progressed in clinical trials, but have not yet received approval. Because of their potency and selectivity, they have been recommended in cellular studies for elucidating the roles of signalling pathways.
Temsirolimus (a derivative of rapamycin) is the only example of a drug in the clinic that does not target the kinase ATP site. Instead, rapamycin, when in complex with the protein FKBP12 (FK506-binding protein 12), targets mTOR kinase at a site located on a domain, the FRB (FKBP12–rapamycin-binding) domain, that appears to be involved in localization or substrate docking. In a new slant on the protein kinase inhibitor story, the action of rapamycin on mTOR has shown that it is possible to have a drug that apparently interferes with protein-docking interactions. The basis for the inhibition of mTOR by the FKBP12–rapamycin complex is not completely understood, but is different from the mechanism of rapamycin when used as an immunosuppressant.
Protein kinases can exist in active and inactive conformations that are dependent on various different regulatory mechanisms . Both the active and inactive conformations have been used in strategies to produce potent and selective compounds. Targeting the inactive conformation can give high specificity, but targeting the active conformation is favourable where the diseased state has arisen from activating mutations. Kinases converge to a similar conformation in the active state, whereas the inactive states are often unique. Hence targeting the active conformation often results in a less specific inhibitor. Drug-resistance mutations are a potential risk for both conformational states, where drug-binding regions are not directly involved in catalysis. However, the active conformation requires conservation of both key residues and tertiary structure and is less likely to be tolerant of mutations. Conventional kinase assays by their nature identify inhibitors of the active kinase conformation and can establish whether the inhibitor is competitive with ATP. Of the compounds listed in Table 2, fasudil, dasatinib, gefitinib and erlotinib bind to the active conformations, whereas imatinib, lapatinib and sorafenib bind to the inactive conformation of their respective kinases.
Targeting an inactive conformation is attractive because this is more likely to represent a conformation that is unique to that protein kinase. For example, imatinib (Glivec), the most successful of protein kinase inhibitors, targets the inactive conformation of Abl tyrosine kinase with high specificity . Only the kinases Abl, ARG (Abl-related gene), Kit and PDGFR (platelet-derived growth factor receptor) are targets for imatinib. Residues that contact imatinib in Abl are conserved in Src or replaced by a smaller residue, but Src is 2400-fold less sensitive to inhibition by imatinib. It has been shown that Src can adopt the inactive Abl-inhibited conformation, but the overall conformational balance is tuned differently, demonstrating the contribution of long-range contributions to a protein's conformational state . A newer Abl inhibitor, dasatinib, targets the Abl active state . It has been developed to increase potency and has proved effective against some, but not all, drug-resistant mutations.
Where disease occurs through loss of regulation leading to an inappropriately active protein kinase, targeting the active conformation can be advantageous. The first EGFR (epidermal growth factor receptor) kinase inhibitors in the clinic, gefitinib (Iressa)  and erlotinib (Tarceva) , targeted the active form of the kinase and this proved advantageous for patients whose cancer was caused by mutations that resulted in a constitutively active EGFR kinase domain. Newer approved compounds, such as lapatinib (Tykerb), target the inactive conformation with high potency . A further compound that forms a covalent attachment to the kinase has been found to overcome one of the major drug-resistance mutations, where the effectiveness of the drug in vivo is dependent on its ability to compete successfully in the presence of cellular concentrations of ATP . Indeed, an emerging theme in the effectiveness of a drug in vivo has been the ability of the drug to compete with cellular concentrations of ATP (typically approx. 1–10 mM). For most kinases, the Km for ATP is of the order of 10 μM. As the work with the EGFR inhibitor gefitinib  has shown, it is the balance between ATP affinity and drug affinity that determines the efficacy of the drug. For EGFR, such problems have been overcome with a compound, HKI-272, that binds covalently to a cysteine residue .
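The balance between ATP affinity and drug affinity described above can be illustrated with the Cheng–Prusoff relation for a reversible, purely ATP-competitive inhibitor, IC50 = Ki(1 + [ATP]/Km). This is a sketch only: the relation is a standard approximation brought in here for illustration, the Ki of 1 nM is a hypothetical value, and the expression does not apply to covalent inhibitors such as HKI-272.

```python
def apparent_ic50(ki: float, atp: float, km_atp: float) -> float:
    """Cheng-Prusoff estimate of IC50 for an ATP-competitive inhibitor.

    All three arguments must be given in the same concentration unit;
    the result is returned in that unit.
    """
    return ki * (1.0 + atp / km_atp)


if __name__ == "__main__":
    KI_NM = 1.0            # hypothetical inhibitor Ki, in nM
    KM_ATP_NM = 10_000.0   # Km for ATP of ~10 uM (from the text), in nM
    for atp_mm in (1.0, 10.0):  # cellular ATP range quoted in the text
        atp_nm = atp_mm * 1_000_000.0
        ic50 = apparent_ic50(KI_NM, atp_nm, KM_ATP_NM)
        print(f"ATP {atp_mm:>4.0f} mM -> apparent IC50 about {ic50:.0f} nM")
```

With a Km for ATP of 10 μM, cellular ATP at 1–10 mM raises the apparent IC50 by roughly two to three orders of magnitude over Ki in this approximation, which is why a compound that looks potent in a low-ATP assay may be much less effective in cells.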
Protein kinases are also targets for treatment of inflammatory disease, diabetes and neurodegenerative diseases. There has been progress in these areas, but, as yet, there are no drugs in the clinic, possibly because of the complexity of the signalling pathways, which involve much cross-talk. For neurodegenerative diseases, inhibition of the kinases that phosphorylate the tau protein could be beneficial. For these indications, drugs must cross the blood–brain barrier. Kinases are also possible targets for drugs against parasitic diseases. For example, knowledge of the malaria genome has allowed characteristic sequences to be discerned that are specific to the parasite and different from the human host. Those kinases that are essential for the parasite life cycle could be exploited for drug design from structural knowledge.
In addition to small-molecule inhibitors, there are also successful approaches that have resulted in products in the clinic using humanized monoclonal antibodies generated against the extracellular domain of transmembrane receptor protein kinases, e.g. trastuzumab (Herceptin) that targets Her2 (ErbB-2) receptor for treatment of breast cancer or cetuximab (Erbitux) that targets EGFR for treatment of metastatic colorectal cancer, or against the growth factors themselves, such as bevacizumab (Avastin) that targets VEGF (vascular endothelial growth factor) for treatment of metastatic colorectal cancer. Structures are known for each of these complexes, and the structures have contributed to an understanding of the mechanisms of action.
CDKs have been prime targets for inhibition in cancer therapy because of their role in control of the cell cycle and their up-regulation through overexpression of cyclins or down-regulation of tumour-suppressor proteins in cancer (reviewed in ). For example, breast cancer that is driven by the ErbB-2 oncogene is dependent on CDK4–cyclin D1 function, and mice lacking cyclin D1 are resistant to ErbB-2-driven mammary neoplasia [54,55]. The observations that siRNA (small interfering RNA) knockdown of CDK2 failed to halt proliferation of osteosarcomas and pRb-negative cervical cancer cells and that Cdk2−/− mice are viable [56–58] suggested that CDK2 might not be a good target for inhibition. However, most CDK2 inhibitors also target CDK1 , and the joint down-regulation of the CDKs is likely to have beneficial effects. CDK2 has received much attention in CDK inhibitor design, partly because of ease of expression of the kinase and the early availability of a structure in 1993 (reviewed in ). More than 130 CDK2–inhibitor complexes have been reported .
Two events are required to activate the CDKs. First, the binding of a cyclin and, secondly, phosphorylation of a threonine residue in the activation segment. The structures of CDK2, the kinase that is responsible for driving events through S-phase, in complex with cyclin A and in its phospho-form with cyclin A, have demonstrated the key events that lead to activation [62–66]. In its inactive state, CDK2 lacks features that recognize and align the triphosphate moiety of ATP, largely through misalignment of the C-helix, and it also lacks the protein substrate-recognition site through the misalignment of the activation segment (Figure 5A). On binding cyclin A, the C-helix swings in and a glutamate residue from this helix, Glu51, contacts a lysine residue, Lys33, which in turn contacts the α and β phosphates of ATP, aligning them correctly for catalysis (Figure 4). At the same time, the activation segment, a region from the conserved DFG (Asp-Phe-Gly) motif to the APE (Ala-Pro-Glu) motif swings down so that it no longer blocks access to the catalytic site. The start of the segment changes conformation from a short α-helix to an extended conformation to allow the C-helix to swing in and the aspartate residue at the DFG motif (Asp145) to adopt a conformation in which it chelates an Mg2+ ion that binds the α and γ phosphates of ATP (Figures 4 and 5B). However, the kinase is still not active. It requires phosphorylation of Thr160 by the upstream kinase CDK7–cyclin H. Phosphorylation changes the activation segment in the key region for substrate recognition and forms an organizing centre linking the C-helix (Arg50), the catalytic strand (Arg126) and the activation segment (Arg150) (Figure 1). The phosphate group is partially shielded by a non-polar residue from the cyclin and is resistant to attack by phosphatases (Figure 5B). 
Structural studies on binding a cognate peptide with the sequence HHAS*PRK (where S* indicates the serine residue to be phosphorylated) showed how the unusual left-handed conformation of the activation segment in the region between Val163 and Val164, stabilized by an interaction of the two main-chain carbonyl oxygen atoms with an arginine residue (Arg169), created a pocket that could only accommodate a proline residue, because any other amino acid with a peptide NH group would demand a hydrogen-bonding partner. Specificity for the basic group three residues from the phosphorylated serine (P+3 site) (a lysine in this peptide) was provided by a contact to the phosphothreonine residue (Figure 5B).
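The recognition rule described above (a proline immediately after the phospho-acceptor and a basic residue at P+3) can be written as a simple sequence pattern. The following is a minimal sketch, not a validated site predictor. Allowing threonine as well as serine at the acceptor position and arginine as well as lysine at P+3 are assumptions made for illustration, and the names `CDK_MOTIF` and `find_cdk_sites` are hypothetical.

```python
import re

# Assumed minimal motif: acceptor (S or T), proline at P+1,
# any residue at P+2, basic residue (K or R) at P+3.
# The lookahead allows overlapping candidate sites to be reported.
CDK_MOTIF = re.compile(r"(?=[ST]P.[KR])")

def find_cdk_sites(sequence):
    """Return 1-based positions of candidate phospho-acceptor residues."""
    return [m.start() + 1 for m in CDK_MOTIF.finditer(sequence.upper())]

# The cognate peptide from the text; the serine is residue 4.
print(find_cdk_sites("HHASPRK"))  # [4]
# Replacing the P+1 proline with alanine removes the match.
print(find_cdk_sites("HHASARK"))  # []
```

A pattern of this kind only captures the local sequence requirement; it says nothing about accessibility or docking interactions with the cyclin.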
Analysis of the structures of two other CDK2–cyclin complexes, CDK2–cyclin E  and CDK2–cyclin B , showed a very similar mode of interaction and activation mechanism, although there are some detailed differences arising from sequence changes in the cyclins (Figure 6). Structures of more distantly related complexes of the neuronal CDK, CDK5 in complex with the single domain cyclin p25 , of the cell-cycle kinase CDK6 in complex with a viral cyclin  and the yeast Pho85–Pho80 complex , showed more significant differences from the canonical CDK–cyclin interaction, but each demonstrated common features that lead to activation of the CDK. In the light of these studies, we were surprised to find a different motif when we determined the structure of the transcriptional CDK, CDK9, in complex with cyclin T.
CDK9–cyclin T, a transcriptional regulator
CDK9–cyclin T [P-TEFb (positive transcription elongation factor b)] co-ordinates the elongation phase of transcription. After initiation of transcription, two inhibitors, DSIF [DRB (5,6-dichloro-1-β-D-ribobenzimidazole)-sensitivity-inducing factor] and NELF (negative elongation factor), halt elongation of mRNA by RNA PolII (RNA polymerase II). P-TEFb phosphorylates these inhibitors, causing dissociation of NELF from the complex, and it also phosphorylates the CTD (C-terminal tail domain) of RNA PolII on Ser2, the serine residue at the second position of the heptapeptide repeat YSPTSPS. These phosphorylations release repressor actions and allow productive elongation of mRNA. The hyperphosphorylated CTD recruits members of the splicing and 3′-polyadenylation machinery for subsequent processing of mRNA.
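To make the position of the target serine concrete, the sketch below locates the residue at position 2 of each YSPTSPS heptad (Ser2, taken here as the P-TEFb site) in a CTD-like sequence. It is an illustration only: the three-repeat sequence is a toy example, the function name `ser2_positions` is hypothetical, and the exact-match search ignores the degenerate repeats found in real CTD sequences.

```python
HEPTAD = "YSPTSPS"

def ser2_positions(ctd, heptad=HEPTAD):
    """Return 1-based positions of the second residue of each exact,
    non-overlapping heptad repeat found in the sequence."""
    positions = []
    start = ctd.find(heptad)
    while start != -1:
        positions.append(start + 2)  # 0-based start + 1 is Tyr1, + 2 is Ser2
        start = ctd.find(heptad, start + len(heptad))
    return positions

# Toy CTD made of three perfect repeats (not a real sequence).
toy_ctd = HEPTAD * 3
print(ser2_positions(toy_ctd))  # [2, 9, 16]
```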
The structure of CDK9–cyclin T1 showed a dramatic difference in the association of the CDK and the cyclin compared with the cell-cycle CDK complexes  (Figure 7A). The orientation of the cyclin is rotated approx. 26 ° compared with the position in CDK2–cyclin A. This results in a comparatively sparse number of contacts between CDK9 and cyclin T1. The buried molecular surface is only 1763 Å2 compared with 2892 Å2 observed for CDK2–cyclin A.
Cyclin A and cyclin T consist of a duplication of two cyclin box folds each based on a five α-helical motif together with additional N-terminal and C-terminal helical regions. The key interactions for CDK2–cyclin A involve the N-terminal helix of the cyclin, which contacts several different regions of the kinase and which has been shown to be functionally important. Important contacts also come from the cyclin A H3, H4 and H5 helices and the H5-H1′ loop to the CDK2 C-helix and β4 strand. The cyclin H5 helix runs parallel to the CDK C-helix and assists in the alignment for the active conformation (Figure 7B). The most obvious difference between CDK9–cyclin T and CDK2–cyclin A is in the orientation of the cyclin N-terminal helix. In cyclin T1, the N-terminal helix adopts an orientation in which it is directed towards the solvent and it does not participate in the extensive CDK interactions observed for the CDK2–cyclin A complex, although it does make a few contacts to a region corresponding to an N-terminal extension in CDK9.
The complexes of CDK5–p25 and Pho85–Pho80 also lack participation of the N-terminal helix in the CDK–cyclin interactions and yet both show a similar cyclin orientation to cyclin A. Despite the different orientation of cyclin T1, the cyclin participates in key interactions from the H3, H4 and H5 helices that orient the CDK9 in its active conformation. Sequence changes in both the CDK and the cyclins indicate why cyclin T1 could not adopt the conformation observed for cyclin A. Overall, CDK2 and CDK9 show 41% sequence identity. The ATP-binding sites are conserved with 17/18 contact residues identical. Analysis of the CDK2 surface residues that are in contact with cyclin A shows that only 34% of these are conserved in CDK9. Although it is generally expected that surface residues are less conserved than core residues that are important for maintaining the structural framework, the lower conservation of protein–protein interaction sites is unusual and demonstrates how the CDKs have evolved to present different surfaces that can be exploited by different cyclins. For CDK9–cyclin T1, the less intimate association of CDK and cyclin allows more flexibility in the activated conformation of the CDK and this appears to be a feature that is important in specific inhibitor design.
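The three conservation figures quoted above (overall identity, ATP-site contacts and cyclin-contact surface) are all percent identities computed over different subsets of aligned positions. The sketch below shows one way such a calculation could be set up. The aligned fragments and the column subset are invented for illustration and are not CDK2 or CDK9 sequence, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a, seq_b, columns=None):
    """Percent identity over aligned columns; columns with a gap ('-')
    in either sequence are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    cols = range(len(seq_a)) if columns is None else columns
    pairs = [(seq_a[i], seq_b[i]) for i in cols
             if seq_a[i] != "-" and seq_b[i] != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

# Invented aligned fragments, for illustration only.
toy_a = "MKVLGEGTYG-VVYKA"
toy_b = "MEKLGQGTFGEVFKAR"
interface_columns = [1, 2, 5, 8]  # hypothetical 0-based interface positions

print(f"overall:   {percent_identity(toy_a, toy_b):.1f}%")
print(f"interface: {percent_identity(toy_a, toy_b, interface_columns):.1f}%")

# The ATP-site figure from the text, 17 identical of 18 contact residues:
print(f"ATP site:  {100.0 * 17 / 18:.1f}%")
```

Restricting the column set is what allows a pair of kinases to look highly conserved at the ATP site while diverging at the cyclin interface.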
The recently determined structure of CDK4 in complex with cyclin D3 has shown a further variation in CDK–cyclin interactions . Here, cyclin binding has not been sufficient to drive the CDK to the active conformation. Like the CDK9–cyclin T1 structure, the CDK–cyclin interactions are fewer than in CDK2–cyclin A and involve the N-terminal regions of the kinase and the N-terminal cyclin box of cyclin D3. For this CDK–cyclin conformation, it appears that both phosphorylation and substrate binding are required to bring the kinase to the active conformation.
Flavopiridol, a CDK9–cyclin T inhibitor
Flavopiridol (Alvocidib) is a derivative of the flavonoid rohitukine, obtained from the bark of an indigenous tree used in Ayurvedic medicine, Dysoxylum binectariferum, which grows in India, Nepal, Sri Lanka and China. Flavopiridol is a broad-specificity CDK inhibitor with a distinct preference for CDK9. Ki values for CDK9 (3 nM) are lower than those for CDK1, CDK2 and CDK4 (40–70 nM) and for CDK7 (300 nM). Flavopiridol administration has profound effects on transcription and results in decreased levels of labile mRNAs, including those for early transcription factors, cytokines, cell-cycle regulators and kinases, as well as those for proteins that mediate the anti-apoptotic response such as Mcl-1 and XIAP (X-linked inhibitor of apoptosis). Flavopiridol is currently in Phase II clinical trials as a CDK9 inhibitor for a number of cancers, with promising results for the treatment of B-cell CLL (chronic lymphocytic leukaemia). CLL is an uncommon cancer (approx. 8000 cases in the U.S.A. in 2004), with a significant risk ratio for first- or second-degree relatives associated with epigenetic silencing and/or germline mutations in DAPK1 (death-associated protein kinase 1), a positive mediator of apoptosis. CLL cells are refractory to apoptosis. Flavopiridol appears to promote apoptosis by inhibiting transcription of anti-apoptotic proteins.
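The preference for CDK9 can be expressed as a fold-selectivity, the ratio of each Ki to the CDK9 Ki. The short sketch below uses only the values quoted above. Treating the 40–70 nM range as two bounds is a simplification, and the dictionary keys and the function name `fold_selectivity` are hypothetical labels.

```python
# Ki values in nM, as quoted in the text.
KI_NM = {
    "CDK9": 3.0,
    "CDK1/2/4 (lower bound)": 40.0,
    "CDK1/2/4 (upper bound)": 70.0,
    "CDK7": 300.0,
}

def fold_selectivity(ki_nm, reference="CDK9"):
    """Ratio of each Ki to the reference Ki; larger means weaker binding
    relative to the reference kinase."""
    ref = ki_nm[reference]
    return {name: ki / ref for name, ki in ki_nm.items() if name != reference}

for name, fold in fold_selectivity(KI_NM).items():
    print(f"{name}: about {fold:.0f}-fold weaker than CDK9")
```

On these figures flavopiridol binds CDK9 roughly 13- to 23-fold more tightly than CDK1, CDK2 and CDK4, and 100-fold more tightly than CDK7.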
Flavopiridol has also shown promise in combination therapy with docetaxel or with irinotecan, where information based on the known mode of action of the inhibitors was used to decide the order in which the drugs were administered. In human patients, flavopiridol is largely bound to serum (only ∼5–8% flavopiridol is in the free state compared with 63–100% for bovine serum). To achieve pharmacologically active concentrations, this has been partly addressed by using a bolus injection for 30 min of 30 mg/m2 (plasma concentration achieved 1.5–2.2 μM), followed by a 4 h infusion of 30 mg/m2 (plasma concentration 0.9–1.5 μM), administered over a 4–6-week period. Serum binding through human serum albumin or α1 acid glycoprotein may have beneficial effects in promoting drug solubility, transport and preventing too rapid elimination, but too tight binding means that high concentrations are required to achieve the desired free concentration of the drug in cells.
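The effect of serum binding on the available drug can be estimated by multiplying the total plasma concentration by the free fraction. The sketch below does this for the ranges quoted above. It assumes that the free fraction is independent of drug concentration and ignores cellular uptake and clearance, so it is a rough bound rather than a pharmacokinetic model. The function and constant names are hypothetical.

```python
def free_concentration_nm(total_um, free_fraction):
    """Convert a (low, high) total plasma range in uM and a (low, high)
    free-fraction range into a (low, high) free concentration range in nM."""
    low = total_um[0] * free_fraction[0] * 1000.0
    high = total_um[1] * free_fraction[1] * 1000.0
    return low, high

FREE_FRACTION = (0.05, 0.08)    # 5-8% free in human serum (from the text)
BOLUS_TOTAL_UM = (1.5, 2.2)     # plasma range after the 30 min bolus
INFUSION_TOTAL_UM = (0.9, 1.5)  # plasma range during the 4 h infusion

for label, total in (("bolus", BOLUS_TOTAL_UM), ("infusion", INFUSION_TOTAL_UM)):
    low, high = free_concentration_nm(total, FREE_FRACTION)
    print(f"{label}: roughly {low:.0f}-{high:.0f} nM free")
```

Under these assumptions the free drug is of the order of 75–176 nM after the bolus and 45–120 nM during the infusion, which illustrates why micromolar total plasma levels are needed to reach nanomolar free concentrations.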
Recent structural studies have shown that flavopiridol binds to CDK9–cyclin T1 at the ATP-binding pocket  in a similar fashion to the binding of deschloroflavopiridol to inactive monomeric CDK2  (Figure 8). Flavopiridol is almost entirely buried in CDK9, with only 8% of its molecular surface area exposed. Flavopiridol hydrogen-bonds to the hinge residue Phe105 carbonyl oxygen (long) and Cys106 main-chain NH. The piperidinyl group N1, which is assumed to be protonated, contacts the Asn154 side chain, whereas the O3 hydroxy group and the N1 contact the Asp167 side chain (the DFG aspartate residue). The chlorophenyl ring is oriented approx. 30 ° to the flavone ring to accommodate the chlorine, which makes a long contact to the main-chain oxygen of Ile25 (4.2 Å) and an internal contact to the oxygen of the adjacent ring (2.8 Å). The pocket that accommodates the chlorophenyl ring is more open in CDK9. Gly112 takes the place of CDK2 Lys89, the only change at the ATP-binding site between CDK2 and CDK9. This creates a less crowded and a more favourable electrostatic environment than in CDK2, thus providing a partial explanation for the higher affinity for CDK9 than for CDK2.
A further explanation for specificity is found in the conformational changes induced on binding flavopiridol to CDK9. The glycine-rich loop between β1 and β2 folds over the active site (Figure 8). Phe30 in the glycine-rich loop rotates and makes additional van der Waals contacts (edge to face) with the piperidinyl group of the inhibitor (similar to the contact between imatinib and Abl Tyr253 [44,81]). The β3/αC loop changes its conformation so that Leu51 occupies the space of Phe30 in the ATP-bound structure and locks the glycine-rich loop in its closed conformation. The new position of the glycine-rich loop would exclude ATP binding. These conformational changes are less likely to be accomplished by CDK2 because of sequence changes. The residue equivalent to CDK9 Leu51 in CDK2 is Arg36 and the arginine could not fulfil the requirements for filling the non-polar pocket. The structure of the CDK9–cyclin T1–flavopiridol complex shows that specificity can be engineered both by local contacts and by the overall framework of the kinase that allows flexibility and a conformational response.
The human kinome and the kinomes of other organisms have provided a rich source of information on the distribution of protein kinases in different physiological environments and the development of signalling pathways that use tyrosine kinases. Over the years, structural studies have demonstrated the stereochemical consequences of phosphorylation of protein targets, revealing a wide variety of mechanisms, but with the common feature that where a tight phosphate-binding site is needed, then arginine residues are involved in bidentate hydrogen bonds to the phosphate. The ATP-recognition site of the protein kinases, despite sharing common features among the different kinases, has proved a most-druggable site, and, by 2008, there were nine compounds approved for use in the clinic that target this site. An emerging feature of the kinase studies for ligand interactions has been the importance of not only local interactions, but also the conformational adaptability of the kinase to select a particular conformational state that also provides recognition properties, as seen in imatinib binding to Abl kinase and flavopiridol binding to CDK9–cyclin T1.
I am grateful to the Medical Research Council, the Biotechnology and Biological Sciences Research Council, the Wellcome Trust and the European Union for their financial support throughout my career.
I warmly thank the Biochemical Society and the Novartis Foundation for this honour. I acknowledge with gratitude the contributions from my research colleagues in Oxford and around the world to the research described. I thank Diego Miranda-Saavedra for discussions.
Novartis Medal Lecture:
Abbreviations: aPK, atypical protein kinase; CDK, cyclin-dependent kinase; CLL, chronic lymphocytic leukaemia; CREB, cAMP-response-element-binding protein; CTD, C-terminal tail domain; EGFR, epidermal growth factor receptor; ePK, eukaryotic protein kinase; ERK, extracellular-signal-regulated kinase; FKBP12, FK506-binding protein 12; GPCR, G-protein-coupled receptor; GPa, glycogen phosphorylase a; GPb, glycogen phosphorylase b; GSK, glycogen synthase kinase; MAPK, mitogen-activated protein kinase; mTOR, mammalian target of rapamycin; PhK, phosphorylase kinase; PKA, protein kinase A; PLP, pyridoxal 5′-phosphate; PPM, metallo-dependent protein phosphatase; PPP, phosphoprotein phosphatase; pRb, retinoblastoma protein; P-TEFb, positive transcription elongation factor b; PTP, protein tyrosine phosphatase; R, relaxed; RNA Pol II, RNA polymerase II; SH2, Src homology 2; T, tense; TKL, tyrosine kinase-like
- © The Authors Journal compilation © 2009 Biochemical Society