Biochemical Society Transactions

Synaptopathies: Dysfunction of Synaptic Function

Confirmed rare copy number variants implicate novel genes in schizophrenia

Gloria W.C. Tam, Louie N. van de Lagemaat, Richard Redon, Karen E. Strathdee, Mike D.R. Croning, Mary P. Malloy, Walter J. Muir, Ben S. Pickard, Ian J. Deary, Douglas H.R. Blackwood, Nigel P. Carter, Seth G.N. Grant

Abstract

Understanding how cognitive processes including learning, memory, decision making and ideation are encoded by the genome is a key question in biology. Identification of sets of genes underlying human mental disorders is a path towards this objective. Schizophrenia is a common disease with cognitive symptoms, high heritability and complex genetics. We have identified genes involved with schizophrenia by measuring differences in DNA copy number across the entire genome in 91 schizophrenia cases and 92 controls in the Scottish population. Our data reproduce rare and common variants observed in public domain data from >3000 schizophrenia cases, confirming known disease loci as well as identifying novel loci. We found copy number variants in PDE10A (phosphodiesterase 10A), CYFIP1 [cytoplasmic FMR1 (Fragile X mental retardation 1)-interacting protein 1], K+ channel genes KCNE1 and KCNE2, the Down's syndrome critical region 1 gene RCAN1 (regulator of calcineurin 1), cell-recognition protein CHL1 (cell adhesion molecule with homology with L1CAM), the transcription factor SP4 (specificity protein 4) and histone deacetylase HDAC9, among others (see http://www.genes2cognition.org/SCZ-CNV). Integrating the function of these many genes into a coherent model of schizophrenia and cognition is a major unanswered challenge.

  • array comparative genome hybridization
  • copy number variant
  • psychiatric genetics
  • schizophrenia

Introduction

Schizophrenia is a debilitating psychiatric disorder that affects 1% of the population worldwide. It is characterized by positive psychotic symptoms, such as hallucinations and delusions, and negative symptoms, including cognitive and social impairments [1]. With an estimated heritability coefficient of ~0.8, schizophrenia is thought to have strong genetic components, interacting with a number of environmental, epigenetic and stochastic factors. Identifying genomic variants linked to schizophrenia is therefore a crucial step in understanding the aetiology and pathophysiology of the disorder.

In previous decades, the main strategies used in psychiatric genetics were family linkage and case-control association studies [1–3], complemented by the identification of cytogenetic abnormalities. This led to the discovery of some rare, potentially disease-causing, mutations: for example, disease-related genes or genetic loci such as DISC1 (disrupted in schizophrenia 1), NPAS3 [neuronal PAS (Per/Arnt/Sim) domain 3] and 22q11 were identified from balanced translocations or microdeletions. They have subsequently been supported by compelling biological and functional evidence as candidates for contributors to the aetiology of schizophrenia.

More recently, whole-genome screening for CNV (copy number variation) and genome rearrangements has revealed many more structural variations than expected in healthy individuals [4,5]. Furthermore, associations between CNV and neuropsychiatric conditions have been identified [6,7]. For example, studies by the ISC (International Schizophrenia Consortium), the SGENE-plus Consortium, and others have shown significant associations between rare recurrent CNVs and schizophrenia [8–11]. These studies also revealed a higher collective disease burden from rare genic CNVs.

We screened DNA from 91 cases and 92 controls for CNVs using aCGH (array-based comparative genome hybridization). The screen was performed using the WGTP (whole genome tile path) microarray platform. Examination of autosomal CNVs revealed agreement with the ISC dataset at individual loci and confirmation of disease-biased CNV at multiple loci. These results take a first step in pinpointing rare variants that are likely to be functional and demonstrate the usefulness of confirmatory association studies of rare variants in schizophrenia.

aCGH data analysis

CNVs were detected in 91 Scottish individuals with schizophrenia and 92 control Scottish samples from the 1921 LBC (Lothian Birth Cohort) [12].

WGTP array CGH was performed as described previously by Redon et al. [13]. Briefly, each fluorescently labelled [with Cy3 (indocarbocyanine) or Cy5 (indodicarbocyanine)] HapMap DNA sample (200 ng) was hybridized to an oppositely labelled (Cy5 or Cy3) reference genomic DNA (NA10851) (200 ng) in a dye-swap experiment using two BAC (bacterial artificial chromosome) array slides. After 21 h of hybridization, array slides were scanned in an Agilent scanner and analysed using the BlueFuse software and an in-house Perl script for data normalization and GC-content correction.

In addition to manual inspection of profile qualities, all hybridization results were monitored with two statistical indicators for quality control. The first is global SDe [5], an estimate of S.D. of all log2 fluorescence ratio signals in the genome for a particular sample profile. The second is the clone retention rate after fusion of dye-swap experimental data. Experiments were accepted for further analysis only if global SDe was <0.06, global clone retention rate was >90% and clone exclusion rate per individual chromosome was <20%.
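
The three acceptance criteria above can be expressed as a simple filter. The following is a minimal sketch, not the authors' pipeline: the `HybridizationQC` record, its field names and the example values are hypothetical, and only the three thresholds are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict

# Thresholds quoted in the text; everything else here is illustrative.
MAX_GLOBAL_SDE = 0.06        # global SDe must be below this value
MIN_CLONE_RETENTION = 0.90   # global clone retention rate must exceed this
MAX_CHROM_EXCLUSION = 0.20   # per-chromosome clone exclusion rate must be below this


@dataclass
class HybridizationQC:
    """Hypothetical per-sample QC summary (field names are assumptions)."""
    sample_id: str
    global_sde: float                  # estimated S.D. of log2 ratios, genome-wide
    clone_retention: float             # fraction of clones kept after dye-swap fusion
    chrom_exclusion: Dict[str, float]  # fraction of clones excluded, per chromosome


def passes_qc(qc: HybridizationQC) -> bool:
    """Return True only if all three acceptance criteria are met."""
    return (
        qc.global_sde < MAX_GLOBAL_SDE
        and qc.clone_retention > MIN_CLONE_RETENTION
        and all(rate < MAX_CHROM_EXCLUSION for rate in qc.chrom_exclusion.values())
    )


# Made-up example profiles: one clean, one too noisy, one with a poor chromosome.
good = HybridizationQC("S1", 0.045, 0.95, {"chr1": 0.03, "chr2": 0.10})
noisy = HybridizationQC("S2", 0.071, 0.95, {"chr1": 0.03})
patchy = HybridizationQC("S3", 0.045, 0.95, {"chr1": 0.03, "chr19": 0.25})
accepted = [qc.sample_id for qc in (good, noisy, patchy) if passes_qc(qc)]
```

In this sketch a profile is rejected as soon as any one criterion fails, which matches the "only if" wording of the acceptance rule.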

Clones mapping to the X chromosome in females and to the Y chromosome and pseudo-autosomal regions in all individuals were excluded from further analysis. Clones were then excluded further if they were retained in less than 75% of expected individuals in either SCZ (Schizophrenia Cohort) or LBC. This left a total of 27980 clones, representing an average of 25102 clones per individual (range 23292–25805). Data for each individual were then normalized to give a median log2 fluorescence ratio of 0.00. Finally, to compensate for CNV in the reference individual, the clonewise median log2 fluorescence ratio was subtracted across all individuals. CNV was detected in each individual using CNVFinder [5]. The reduced variation in the samples enabled more sensitive calling of CNV events, since the CNVFinder algorithm automatically increases sensitivity of detection when the overall sample variance is low.
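
The two normalization steps described above (centring each individual on a median log2 ratio of 0.00, then subtracting the clonewise median across individuals) can be sketched as follows. This is an illustration under stated assumptions rather than the in-house implementation: the function names, the use of `None` for a clone missing in an individual, and the toy values are all hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional

Profile = List[Optional[float]]  # log2 ratios in a fixed clone order; None = clone missing


def median_centre(values: Profile) -> Profile:
    """Shift one individual's log2 ratios so that their median is 0.00."""
    m = median(v for v in values if v is not None)
    return [None if v is None else v - m for v in values]


def subtract_clonewise_median(profiles: Dict[str, Profile]) -> Dict[str, Profile]:
    """Subtract, for every clone, its median across all individuals.

    A signal shared by most individuals (for example one caused by CNV in the
    reference DNA) is removed, while individual-specific deviations remain.
    """
    samples = list(profiles)
    n_clones = len(profiles[samples[0]])
    clone_medians = []
    for i in range(n_clones):
        observed = [profiles[s][i] for s in samples if profiles[s][i] is not None]
        clone_medians.append(median(observed) if observed else 0.0)
    return {
        s: [None if v is None else v - clone_medians[i]
            for i, v in enumerate(profiles[s])]
        for s in samples
    }


def normalise(profiles: Dict[str, Profile]) -> Dict[str, Profile]:
    """Apply both steps in the order given in the text."""
    centred = {s: median_centre(v) for s, v in profiles.items()}
    return subtract_clonewise_median(centred)


# Toy data: the third clone carries a shift shared by every individual.
raw = {
    "A": [0.2, 0.3, 0.5, 0.1, 0.2],
    "B": [0.0, -0.1, 0.3, 0.1, 0.0],
    "C": [-0.1, -0.1, 0.2, None, -0.1],
}
corrected = normalise(raw)
```

After correction the shared shift at the third clone is close to zero in all three toy profiles, whereas the individual-specific deviation of sample A at the second clone is preserved.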

CNV detection

A CNV dataset was generated from 91 Scottish SCZ and 92 Scottish control (LBC) DNA samples hybridized against a single HapMap reference DNA on the WGTP array platform. Initial normalization on the WGTP data was performed as described previously [5]. Additional normalization and filtering steps, including correction for GC content, and filtering of clones for artefacts are summarized in [14]. One key pre-processing normalization step in which our method differs from earlier methods involved the computation of the median log2 ratio for each clone and subtraction of this median across individuals. This step reduced the S.D. of overall log2-transformed fluorescence ratios in the SCZ samples from 0.052 to 0.043 and in control samples from 0.050 to 0.040.

CNVs were detected using CNVFinder [5]. This algorithm automatically increases the sensitivity of detection when the overall sample variance is low. Since the SCZ samples had generally higher variation in fluorescence than LBC samples, CNVFinder detected fewer CNVs in SCZ samples than in controls (Figure 1). The total number of CNV calls was 3551 in the cases (1715 gains, 1836 losses, 39 CNVs average per sample) and 4041 in the controls (2038 gains, 2003 losses, 44 CNVs average per sample) (Figures 1a and 1b). In each cohort, CNVs from multiple individuals may overlap at the same genomic location. To generate sets of non-overlapping CNV genomic locations, we grouped CNVs for each cohort into CNVRs (CNV regions). For our purposes, we defined CNVRs as clusters of CNVs or singletons isolated by more than 1 Mb from the next adjacent CNV. SCZ samples were found to harbour 449 CNVRs (147 gain only, 138 loss only, 164 gain or loss), whereas controls had 481 (166 gain only, 141 loss only, 174 gain or loss) (Figures 1e and 1f). The CNV length distributions were broadly similar between cases and controls (Figures 1c and 1d), as were ratios of CNVRs representing rare compared with recurrent CNVs. In cases, 204 CNVRs represented rare events, whereas 245 were recurrent; in controls, 211 CNVRs were associated with rare CNVs compared with 270 regions with recurrent CNVs (Figures 1g and 1h).
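
The CNVR definition above amounts to merging, within one cohort, any CNV calls that lie within 1 Mb of each other on the same chromosome. A minimal sketch of that grouping is given below; the `CNV` and `CNVR` records, the function names and the toy co-ordinates are hypothetical, and treating a region with a single call as "rare" is our reading of the text rather than a documented rule.

```python
from typing import FrozenSet, Iterable, List, NamedTuple

GAP_BP = 1_000_000  # calls separated by more than 1 Mb start a new region


class CNV(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str    # "gain" or "loss"
    sample: str


class CNVR(NamedTuple):
    chrom: str
    start: int
    end: int
    kinds: FrozenSet[str]
    n_calls: int


def _summarise(calls: List[CNV]) -> CNVR:
    """Collapse a cluster of calls into one region record."""
    return CNVR(
        calls[0].chrom,
        min(c.start for c in calls),
        max(c.end for c in calls),
        frozenset(c.kind for c in calls),
        len(calls),
    )


def cluster_cnvs(cnvs: Iterable[CNV], gap: int = GAP_BP) -> List[CNVR]:
    """Group calls into regions isolated by more than `gap` bp from their neighbours."""
    regions: List[CNVR] = []
    current: List[CNV] = []
    current_end = 0
    for cnv in sorted(cnvs, key=lambda c: (c.chrom, c.start, c.end)):
        same_chrom = bool(current) and cnv.chrom == current[0].chrom
        if same_chrom and cnv.start - current_end <= gap:
            current.append(cnv)
            current_end = max(current_end, cnv.end)
        else:
            if current:
                regions.append(_summarise(current))
            current = [cnv]
            current_end = cnv.end
    if current:
        regions.append(_summarise(current))
    return regions


def category(region: CNVR) -> str:
    """Label a region as in Figures 1e and 1f."""
    if region.kinds == {"gain"}:
        return "gain only"
    if region.kinds == {"loss"}:
        return "loss only"
    return "gain or loss"


# Toy calls with made-up co-ordinates.
calls = [
    CNV("chr1", 1_000_000, 1_200_000, "gain", "S1"),
    CNV("chr1", 1_900_000, 2_100_000, "loss", "S2"),  # 0.7 Mb gap: same region
    CNV("chr1", 5_000_000, 5_100_000, "loss", "S3"),  # 2.9 Mb gap: new region
    CNV("chr2", 1_000_000, 1_050_000, "gain", "S1"),
]
regions = cluster_cnvs(calls)
labels = [(category(r), "rare" if r.n_calls == 1 else "recurrent") for r in regions]
```

With these toy calls the first two CNVs merge into one "gain or loss" region, and the remaining two each form a single-call region.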

Figure 1 Overall characteristics of CNVs and CNVRs found by CNVFinder in cases (a, c, e and g) and controls (b, d, f and h)

Among CNVs (a–d), similar numbers of gains and losses were observed in each cohort (a and b), and the numbers of clones associated with gains and losses of DNA were similar between cohorts (c and d). CNV regions in cases (e) and controls (f) had similar fractions of gains, losses and regions of gain and loss. Cases (g) and controls (h) both had roughly half of CNV regions representing single events within their cohort.

Validation of individual CNVs by secondary platforms

Previous literature proposed the role of rare CNVs in schizophrenia, with an estimated enrichment in cases of up to three times that found in controls. The fact that CNVFinder penalizes the SCZ samples more heavily due to their higher variation makes it less likely that any such enrichment will be found in our dataset. Conversely, it was expected that CNVs detected as specific to the SCZ samples were more likely to represent true rare variants. We therefore validated CNVs that were specific to cases for potential schizophrenia candidate genes.

Validation was performed by quantitative PCR or by using Nimblegen high-density oligonucleotide arrays for 11 individual CNVs, of which ten were technically confirmed (see Supplementary Table S1 at http://www.biochemsoctrans.org/bst/038/bst0380445add.htm). In addition, we compared CNVs from 39 individuals in our study with CNVs detected by standard genotyping arrays in the same patients, which were included in the ISC study [8]. Of 91 ISC CNVs in these 39 individuals, 50 were confirmed in our CNV calls (Supplementary Table S1). We noted that three individuals accounted for fully one third of the 91 ISC calls. When these individuals were excluded, 77% (46 out of 60) of ISC calls were replicated in our study.

These validated regions (Supplementary Table S1) contain neuronal-related genes or overlap with loci previously associated with psychiatric disorders (Figure 2). These include sarcoglycan ε (SGCE), which is associated with Tourette's syndrome and obsessive compulsive disorder; phosphodiesterase 10A (PDE10A), which is associated with schizophrenia and bipolar disorder; and the recurrent schizophrenia candidate locus at 15q11.2 encompassing the gene for cytoplasmic FMR1 (Fragile X mental retardation 1)-interacting protein 1, CYFIP1. CYFIP1 is involved in the regulation of translation in neurons, with important functions in synaptic plasticity and brain development [15]. The deletion region containing CYFIP1 has been described previously as a statistically significant schizophrenia-associated CNV locus in a large case-control cohort [9], deleted in approx. 0.55% of cases and 0.2% of controls.

Figure 2 Loci confirmed by secondary platforms

Normalized fluorescence ratios for cases (red) and controls (green) are shown. Gene positions (NCBI36) are shown in blue.

Confirmed genic mutational biases between studies

By comparing our data with ISC data on >3000 cases and controls (http://pngu.mgh.harvard.edu/isc/isc-r1.cnv.bed), we could cross-validate putative mutational biases between schizophrenia-specific genic CNV gains and losses in the WGTP data. We excluded from the ISC data the CNV co-ordinates of the 39 individuals who were included in our study and whose calls we used earlier for validation of individual CNV calls. Cross-validation of genic mutational bias was then assessed separately for insertions and deletions and was deemed positive if a gene overlapped an indel of the same sense (insertion or deletion) in the same (case or control) cohort in both datasets. We detected 36 autosomal regions demonstrating replicated case-specific gains or losses (see Supplementary Table S2 at http://www.biochemsoctrans.org/bst/038/bst0380445add.htm). Of these regions, 12 were deletions, harbouring 17 genes, and 24 were insertions, overlapping 55 genes. These groups of genes, and especially the deletions, are expected to be enriched in disease candidates. A total of 338 genes in 123 regions had matching case or control skew in their insertion or deletion bias in both studies (results not shown).
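
The cross-validation rule above can be illustrated with a small sketch. It assumes, as a simplification, that a gene counts as a replicated case-specific hit for a given sense when, in both datasets, it overlaps at least one call of that sense in cases and none in controls; the text does not spell out the exact criterion, so this is an assumption. The gene names, co-ordinates and record types are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


class Call(NamedTuple):
    chrom: str
    start: int
    end: int
    sense: str   # "gain" (insertion) or "loss" (deletion)
    cohort: str  # "case" or "control"


def gene_hits(genes: Dict[str, Interval],
              calls: Iterable[Call]) -> Dict[str, Set[Tuple[str, str]]]:
    """For each gene, collect the (cohort, sense) pairs of the calls overlapping it."""
    hits: Dict[str, Set[Tuple[str, str]]] = {name: set() for name in genes}
    for call in calls:
        for name, gene in genes.items():
            if (gene.chrom == call.chrom
                    and gene.start <= call.end
                    and call.start <= gene.end):
                hits[name].add((call.cohort, call.sense))
    return hits


def replicated_case_specific(genes: Dict[str, Interval],
                             study_a: Iterable[Call],
                             study_b: Iterable[Call],
                             sense: str) -> List[str]:
    """Genes hit by `sense` in cases, and never in controls, in both studies."""
    per_study = [gene_hits(genes, list(study_a)), gene_hits(genes, list(study_b))]
    return sorted(
        name for name in genes
        if all(("case", sense) in h[name] and ("control", sense) not in h[name]
               for h in per_study)
    )


# Toy annotation and calls; none of these co-ordinates are real.
genes = {
    "GENE_A": Interval("chr7", 100, 200),
    "GENE_B": Interval("chr7", 500, 600),
    "GENE_C": Interval("chr3", 100, 200),
}
study1 = [
    Call("chr7", 150, 300, "loss", "case"),
    Call("chr7", 450, 550, "loss", "case"),
    Call("chr3", 50, 150, "gain", "case"),
]
study2 = [
    Call("chr7", 90, 120, "loss", "case"),
    Call("chr7", 520, 700, "loss", "case"),
    Call("chr7", 580, 650, "loss", "control"),  # control hit disqualifies GENE_B
    Call("chr3", 120, 180, "loss", "case"),     # opposite sense to study1 for GENE_C
]
shared_losses = replicated_case_specific(genes, study1, study2, "loss")
shared_gains = replicated_case_specific(genes, study1, study2, "gain")
```

Gains and losses are queried separately, mirroring the separate assessment of insertions and deletions; in the toy data only GENE_A survives as a replicated case-specific deletion.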

In addition to CNVs in our study which overlapped with known schizophrenia-associated regions, our cross-validation analysis identified new candidate loci for schizophrenia. Among known regions, the two CNV regions most robustly validated in the ISC data involved deletions at 15q13 and a duplication at 21q22 (Figure 3). Human chromosome 15q13 is a region shown previously to be associated with schizophrenia [16,17]. CHRNA7 (cholinergic receptor, nicotinic, α7) is thought to participate in the gating of auditory stimuli and a lack of this receptor has been hypothesized to contribute to auditory hallucinations in schizophrenia [16]. In our data, 23 deletions overlapping the CHRNA7 gene were detected in cases, compared with 14 in controls; in the ISC data, this gene was affected by ten deletions in cases and three in controls. It should be noted that the CHRNA7 locus shows strong nucleotide similarity to that of the CHRFAM7A [CHRNA7 and FAM7A (family with sequence similarity 7A) fusion] gene, which represents a fusion between a partially duplicated copy of CHRNA7 and a copy of the FAM7A gene. Owing to the strong similarity between these loci, it is possible that some of the events assigned to CHRNA7 are a result of changes at the CHRFAM7A locus. However, our results replicate earlier findings by Freedman et al. [16], lending credence to this locus as being subject to disruption in schizophrenia.

Figure 3 Six loci cross-validated in the ISC dataset

Normalized fluorescence ratios for cases (red) and controls (green) are shown. Gene positions (NCBI36) are shown in blue.

A case-specific single duplication in 21q22 was detected in our study (Figure 3c) and was matched by duplication occurring 16 times in ISC patients compared with seven times in controls. The CNV breakpoints are similar in all patients. This ~200 kb duplication harbours two K+ channel genes, KCNE1 and KCNE2. The DSCR1 (Down's syndrome critical region 1) gene, also known as RCAN1 (regulator of calcineurin 1), is located at the 3′ breakpoint of the duplication. RCAN1 has been linked to Down's syndrome and Alzheimer's disease [18–20]. It encodes calcipressin 1, a regulator of calcineurin (calmodulin-dependent protein phosphatase). RCAN1 was shown previously to affect the expression of GSK3B (glycogen synthase kinase 3β) [21], a candidate gene for schizophrenia and bipolar disorder [22]. Furthermore, knockout mouse models of RCAN1 showed impairment in spatial learning and memory (in particular long-term potentiation), as well as sensorimotor deficits [18].

Several other variants recurrent in both studies also attracted our attention, including case-specific deletions in SP4 (transcription factor SP4) and HDAC9 (histone deacetylase 9) (Figures 3d and 3e). SP4 is a transcription factor highly expressed in brain and therefore could reasonably have a large effect through controlling the transcription of many genes. In addition, SP4 single nucleotide polymorphisms have been linked with schizophrenia and bipolar disorder, and SP4 has been linked with sensorimotor gating in mice, in other studies [23,24]. Furthermore, SP4 has been shown to be involved in cerebellar granule neuron development by promoting activity-dependent pruning of dendritic processes [25]. HDAC9 is highly expressed in brain and skeletal muscle [26] and was found to contain single schizophrenia-specific deletions in both studies. A decrease in the expression of this gene has been associated with increased neuronal apoptosis [27]. However, its putative role in chromatin modification suggests a potential for broader effects on transcription in neurons.

In addition to SP4 and HDAC9, our study robustly confirmed a rare deletion variant in the CHL1 (cell adhesion molecule with homology with L1CAM) gene (Figure 3f). CHL1 is a cell-recognition protein with roles in nervous system development, synaptic plasticity and behaviour in mice [28–31]. CHL1 has also been implicated in mental retardation in humans [32]. Other genes having a role in cell–cell interactions, for example NRG1 (neuregulin 1) [33], have been implicated in schizophrenia.

Finally, rare CNVs in our study suggested several novel candidate schizophrenia genes. For example, we found rare single deletions overlapping the PITPNA (phosphatidylinositol-transfer protein α) gene in our study and in the ISC study. A mouse knockout of this gene [34] exhibited early-onset tremors and neurodegeneration, indicating its relevance to neural function. Rare insertions were found in several genes, including BIN1 (bridging integrator 1), which has been found to be important for synaptic vesicle recycling and learning in the mouse [35]. To facilitate the use of these data, we have created a web resource of the CNVs and the genes containing them (http://www.genes2cognition.org/SCZ-CNV).

Discussion

This study illustrates that some of the more common earlier findings of genes associated with schizophrenia are replicated in smaller-scale studies, suggesting commonality of schizophrenia aetiology across populations. Principally, mutations in PDE10A and CYFIP1, suggested previously as candidate genes for schizophrenia, were confirmed. SGCE, associated with obsessive compulsive disorder, was overlapped by a deletion in one patient in our study. A region containing two K+ channel genes, KCNE1 and KCNE2, and RCAN1 was duplicated in one patient. Both KCNE1 and KCNE2 mutations have been strongly associated with heart arrhythmias and related sudden death. Although KCNE2 is highly expressed in brain, no brain pathology has been associated with this gene in OMIM. Whereas no study to date has implicated RCAN1 in human pathology, in rats an alternative transcript of Rcan1 is expressed during nervous system development, suggesting a possible role for RCAN1 in brain development. CHL1, a haploinsufficient gene [36] located on the short arm of human chromosome 3, was also associated previously with schizophrenia [37]. Our study also suggests novel candidate genes for schizophrenia: for example, mutations in Pitpna and Bin1 in the mouse have been associated with neuropathology and learning deficits respectively. Numerous other genes implicated in schizophrenia are in the postsynaptic proteome and interact in multiprotein complexes with neurotransmitter receptors [38,39]. For example, the multiprotein complex associated with PSD (postsynaptic density)-95 protein, identified using affinity purification, is highly enriched in known schizophrenia candidate genes [38].
Most recently, identification of the human PSD from fresh cortical tissue and related analysis revealed that, among known schizophrenia candidate genes annotated in public databases, the human PSD is highly enriched in these genes relative to the rest of the genome, and even relative to the rest of brain-expressed genes identified by proteomic techniques [40]. This insight may help identify likely candidate sets of genes in the future and help to narrow down lists of candidate genes. The present study also highlights the necessity for and utility of confirmatory studies to validate variants found by others. Finally, several novel candidate genes for schizophrenia are identified; however, determining how they act together with other susceptibility genes to produce the disease remains an unanswered challenge.

Funding

G.W.C.T., K.E.S., L.N.v.d.L., M.D.R.C., M.P.M., W.J.M., I.J.D., D.H.R.B. and S.G.N.G. were funded by the Wellcome Trust Genes to Cognition Programme [grant number 066717]. N.P.C. and R.R. were funded by the Wellcome Trust [grant number WT077008].

Acknowledgments

We thank the International Schizophrenia Consortium for CNV genotype data from 39 individuals with schizophrenia called on microarrays.

Footnotes

  • Synaptopathies: Dysfunction of Synaptic Function: A Biochemical Society Focused Meeting held at The Hotel Victoria, Newquay, U.K., 2–4 September 2009. Organized and Edited by Nils Brose (Max Planck Institute for Experimental Medicine, Göttingen, Germany), Vincent O'Connor (Southampton, U.K.) and Paul Skehel (Centre For Integrative Physiology, Edinburgh, U.K.)

Abbreviations: aCGH, array comparative genome hybridization; BIN1, bridging integrator 1; CHL1, cell adhesion molecule with homology with L1CAM; CHRNA7, cholinergic receptor, nicotinic, α7; CNV, copy number variation; CNVR, CNV region; Cy3, indocarbocyanine; Cy5, indodicarbocyanine; CYFIP1, cytoplasmic FMR1 (Fragile X mental retardation 1)-interacting protein 1; FAM7A, family with sequence similarity 7A; CHRFAM7A, CHRNA7 and FAM7A fusion; HDAC9, histone deacetylase 9; ISC, International Schizophrenia Consortium; LBC, Lothian Birth Cohort; PDE10A, phosphodiesterase 10A; PITPNA, phosphatidylinositol-transfer protein α; PSD, postsynaptic density; RCAN1, regulator of calcineurin 1; SCZ, Schizophrenia Cohort; SGCE, sarcoglycan ε; WGTP, whole genome tile path
