Significant Locus and Metabolic Genetic Correlations Revealed in Genome-Wide Association Study of Anorexia Nervosa
The authors conducted a genome-wide association study of anorexia nervosa and calculated genetic correlations with a series of psychiatric, educational, and metabolic phenotypes.
Following uniform quality control and imputation procedures using the 1000 Genomes Project (phase 3) in 12 case-control cohorts comprising 3,495 anorexia nervosa cases and 10,982 controls, the authors performed standard association analysis followed by a meta-analysis across cohorts. Linkage disequilibrium score regression was used to calculate genome-wide common variant heritability (single-nucleotide polymorphism [SNP]-based heritability [h2SNP]), partitioned heritability, and genetic correlations (rg) between anorexia nervosa and 159 other phenotypes.
Results were obtained for 10,641,224 SNPs and insertion-deletion variants with minor allele frequencies >1% and imputation quality scores >0.6. The h2SNP of anorexia nervosa was 0.20 (SE=0.02), suggesting that a substantial fraction of the twin-based heritability arises from common genetic variation. The authors identified one genome-wide significant locus on chromosome 12 (rs4622308) in a region harboring a previously reported type 1 diabetes and autoimmune disorder locus. Significant positive genetic correlations were observed between anorexia nervosa and schizophrenia, neuroticism, educational attainment, and high-density lipoprotein cholesterol, and significant negative genetic correlations were observed between anorexia nervosa and body mass index, insulin, glucose, and lipid phenotypes.
Anorexia nervosa is a complex heritable phenotype for which this study has uncovered the first genome-wide significant locus. Anorexia nervosa also has large and significant genetic correlations with both psychiatric phenotypes and metabolic traits. The study results encourage a reconceptualization of this frequently lethal disorder as one with both psychiatric and metabolic etiology.
Anorexia nervosa is a serious eating disorder characterized by restriction of energy intake relative to requirements, resulting in abnormally low body weight. It has a lifetime prevalence of approximately 1% and disproportionately affects females (1, 2). Despite its high morbidity and mortality, no well-replicated evidence of effective pharmacological or psychological treatments has been identified (3, 4). Twin studies consistently support a genetic basis for the observed familial aggregation in anorexia nervosa, with heritability estimates in the range of 48%–74% (5). Although initial genome-wide association studies (GWASs) were underpowered (6, 7), the available evidence strongly suggested that signals for anorexia nervosa would be detected with increased sample size (6).
Our aim in the present study was to combine existing samples to conduct a more powerful GWAS of anorexia nervosa. To further characterize the nature of the illness, we applied linkage disequilibrium (LD) score regression (8) to calculate genome-wide common variant heritability (single-nucleotide polymorphism [SNP]-based heritability [h2SNP]), partitioned heritability, and genetic correlations (rg) between anorexia nervosa and other phenotypes. These include the other major psychiatric disorders with large GWASs, namely schizophrenia, bipolar disorder, major depressive disorder, autism, and attention deficit hyperactivity disorder (ADHD), as well as medical, educational, and personality phenotypes. We then used rg estimates between anorexia nervosa and 159 additional phenotypes to characterize the phenome-wide genetic architecture of anorexia nervosa.
Cases and Controls
Our sample included 3,495 anorexia nervosa cases and 10,982 controls. Case definition required a lifetime diagnosis of anorexia nervosa (restricting or binge-purge subtype) or lifetime eating disorder not otherwise specified, anorexia nervosa subtype (i.e., exhibiting the core features of anorexia nervosa). A lifetime history of bulimia nervosa was allowed, given the frequency of diagnostic crossover (9). Amenorrhea was not required, because it does not increase diagnostic specificity (10) (and it was removed as a diagnostic criterion in DSM-5). Extensive information on diagnostic and consensus procedures for the samples included in the Children’s Hospital of Philadelphia/Price Foundation Collaborative Group (CHOP/PFCG) cohort is available elsewhere (7). The cases included from the Genetic Consortium for Anorexia Nervosa/Wellcome Trust Case Control Consortium–3 (GCAN/WTCCC3) GWAS came from 12 previously collected clinical or population cohorts. Given that these were archived samples, the calculation of reliability statistics on diagnoses was not possible. Mitigating that concern, however, is that anorexia nervosa is a highly homogeneous phenotype with high interrater agreement for diagnosis (typical kappa values range from 0.81 to 0.97). Moreover, the approach taken here is consistent with successful GWAS meta-analysis efforts across psychiatric diagnoses, in which larger samples are used to detect the modest effects of typical single common genetic risk variants.
Individuals with schizophrenia, intellectual disability, and medical or neurological conditions causing weight loss were excluded, as in previous studies (6, 7). All sites had documented permission from local ethical committees, and all participants provided informed consent.
Consistent with procedures established by the Psychiatric Genomics Consortium (13, 14), we collected individual-level genotype (GWAS array) and phenotype (binary case-control status) data from contributing previous GWAS consortia and groups (for a description, see Table S1 in the data supplement that accompanies the online edition of this article). In particular, the previous reports on anorexia nervosa GWASs from CHOP/PFCG data (7) and the GCAN/WTCCC3 (6) provide further details about cohort ascertainment and participant characteristics not described below or in the online data supplement.
Although most of the cases included in the published anorexia nervosa GWASs were included in this analysis, many of the controls used in previous GWASs could not be used for subsequent analyses. To summarize, our analysis includes the CHOP/PFCG data (7) plus cases from 12 of the 15 strata included in the GCAN/WTCCC3 analysis of anorexia nervosa. Three data sets (Italy-North, Sweden, and Poland) from the Boraska et al. study (6) were dropped from our analysis either because appropriately matched controls could not be found or because case plus control numbers were <100. After removing these three data sets and combining the U.S. and Canadian cases, we included 11 GCAN/WTCCC3–based data sets plus the CHOP/PFCG data set in our analyses. For the nine data sets requiring new controls, we first evaluated diverse control data sets from Psychiatric Genomics Consortium collaborators for potentially suitable controls based on geographic location and Illumina genotyping. We then performed quality control steps (see below; additional details are provided in the data supplement), using visual inspection of principal component plots (comparing cases to controls) as well as quantile-quantile (QQ) and Manhattan plots (for evidence of systematic bias) to identify suitably matched controls. All samples in the present study are of European ancestry. As shown in Figure S1 in the data supplement, all of the data sets (except the one from Finland) form a gradient of clusters when visualized in a scatterplot of the first two principal components, as expected based on known population genetic features (15).
Quality Control and Analysis
After uniform quality control and imputation using the 1000 Genomes Project (phase 3) (16) in the anorexia nervosa case-control cohorts, we performed association analysis with an additive model using the dosage (the expected count of one of the alleles) for each genotype for each individual for each cohort. After adjustment for unbalanced case and control numbers across our 12 strata (see reference 17), our summed effective balanced sample size was 5,082 cases and 5,082 controls. Accordingly, our power was 83.1% for a genotype relative risk of 1.25, at an allele frequency of 0.2 at p<5×10−8 (http://zzz.bwh.harvard.edu/gpc). Analysis within data sets was performed in PLINK with the first 10 principal components as covariates. METAL (17) was used to conduct fixed-effects meta-analysis across the 12 data sets using inverse-variance weighting. Results were obtained for 10,641,224 SNPs and insertion-deletion variants with minor allele frequencies >1% and imputation quality scores >0.6 (for the QQ plot, see Figure S2 in the data supplement). The GWAS statistic inflation factor (λ) was 1.080, with a sample-size-adjusted λ1000 of 1.008, consistent with minimal population stratification or other systematic biases. Plotting was performed in R (18) and with LocusZoom (19). See the data supplement for additional methods and quality control details, and Table S1 for individual study details.
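The adjustment for unbalanced case and control numbers can be illustrated with a brief sketch. It assumes the commonly used formula N_eff = 4/(1/N_cases + 1/N_controls) for each stratum, with half of N_eff counted as effective cases and half as effective controls; the text cites reference 17 for the adjustment but does not spell out the formula. The function names and the per-stratum counts are hypothetical (the actual cohort sizes are in Table S1 in the data supplement).

```python
def effective_cases(n_cases: int, n_controls: int) -> float:
    """Effective number of cases (equal to effective controls) for one stratum.

    Assumes N_eff = 4 / (1/n_cases + 1/n_controls); half of N_eff is cases.
    """
    return 2.0 / (1.0 / n_cases + 1.0 / n_controls)


def summed_effective_cases(strata) -> float:
    """Sum the per-stratum effective case counts across all strata."""
    return sum(effective_cases(cases, controls) for cases, controls in strata)


# Hypothetical (cases, controls) per stratum; NOT the real cohort sizes.
example_strata = [(300, 900), (150, 1200), (500, 500)]
example_total = summed_effective_cases(example_strata)

# Pooling the reported totals ignores stratification, so it is only a ceiling
# on the value obtained by summing over strata.
pooled_ceiling = effective_cases(3495, 10982)

print(round(example_total, 1), round(pooled_ceiling, 1))
```

Under this assumption, pooling the reported totals of 3,495 cases and 10,982 controls gives about 5,302 effective cases. That is slightly above the 5,082 reported after summing across strata, which is the expected direction, because the per-stratum sum cannot exceed the pooled value.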
The primary analysis in this study is the GWAS, which analyzes each SNP for association with phenotype. The international standard for statistical significance is p<5×10−8, which corrects for the approximately 1 million independent statistical tests conducted. Focused secondary analyses are now the expectation for primary GWAS reports, and we describe statistical significance thresholds for them individually. For LD score regression genetic correlations (rg), we used the false discovery rate. Gene-based and pathway analyses were also conducted. For these analyses, statistical significance was set using the Bonferroni correction, which is conservative given nonindependence among the gene-based and pathway statistical tests. For the gene-based analyses, we defined statistical significance as a gene p value <2.6×10−6 (0.05/19,222 genes tested), and for pathway analyses, a p value <1.8×10−5 (0.05/2,714 pathways tested).
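The three significance thresholds quoted above follow from the same Bonferroni arithmetic, which the short sketch below reproduces. The function name is illustrative only; the test counts are those stated in the text.

```python
ALPHA = 0.05  # family-wise error rate used for every threshold


def bonferroni_threshold(n_tests: int, alpha: float = ALPHA) -> float:
    """Per-test p-value threshold after Bonferroni correction."""
    return alpha / n_tests


thresholds = {
    "genome-wide (about 1 million independent tests)": bonferroni_threshold(1_000_000),
    "gene-based (19,222 genes)": bonferroni_threshold(19_222),
    "pathway (2,714 pathways)": bonferroni_threshold(2_714),
}

for label, value in thresholds.items():
    print(f"{label}: p < {value:.1e}")
```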
Analytical methods for estimating heritability and genetic correlations and for gene-based and pathway analyses are presented in the data supplement. Regarding the rationale for these particular secondary analyses, we note that these are often considered to be standard analyses for GWAS reports across medicine. In this particular application, we estimate SNP heritability for anorexia nervosa because it is important to quantify the combined effects of common variants on anorexia nervosa and compare it with other complex disorders and traits, within and outside psychiatry.
One locus achieved genome-wide significance for a single variant, as shown in the Manhattan plot in Figure 1, in which the threshold for significance (p<5×10−8) is denoted with a dotted line. The top locus (chromosome 12q13.2) overlaps six genes (IKZF4, RPS26, ERBB3, PA2G4, RPL41, and ZC3H10) and is located near six additional genes (ESYT1, SUOX, RAB5B, CDK2, PMEL, and DGKA). The top SNP was rs4622308 (p=4.3×10−9, odds ratio=1.2, SE=0.03; minor allele frequency in cases [MAFcases]=0.48, minor allele frequency in controls [MAFcontrols]=0.44). We found no evidence for heterogeneity in effect sizes across cohorts (Q=12.58, p=0.32), and we estimated that 12.59% of the variation was due to heterogeneity rather than chance (I2=12.59). The effects across studies are shown in the forest plot of rs4622308 in Figure S3 in the data supplement.
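For readers less familiar with the heterogeneity statistics reported above, the following sketch shows how an inverse-variance weighted fixed-effects estimate, Cochran's Q, and I2 relate to one another. It is a minimal illustration rather than the METAL implementation; the function names and the per-cohort effect sizes are hypothetical. I2 is computed with the usual definition, (Q − df)/Q, truncated at zero, where df is the number of cohorts minus one.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis.

    betas: per-cohort log odds ratios; ses: their standard errors.
    Returns the pooled estimate, its standard error, and Cochran's Q.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, pooled_se, q


def i_squared(q: float, n_studies: int) -> float:
    """Percentage of variation attributed to heterogeneity rather than chance."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical per-cohort log odds ratios and standard errors (illustration only).
example_betas = [0.20, 0.15, 0.25]
example_ses = [0.10, 0.10, 0.10]
pooled, pooled_se, q = fixed_effects_meta(example_betas, example_ses)
print(round(pooled, 3), round(pooled_se, 3), round(q, 3), i_squared(q, 3))

# I2 recomputed from the reported Q, assuming 12 cohorts (11 degrees of freedom).
print(round(i_squared(12.58, 12), 1))
```

Applied to the reported Q of 12.58 across 12 cohorts, this definition gives an I2 of about 12.6%, in line with the reported 12.59% once rounding of Q is taken into account.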
The results of conditional regression analyses are consistent with the existence of one signal at the top locus (see Figure S4 in the data supplement). The top SNP, rs4622308, is in high LD (r2=0.86; D′>0.99) with rs11171739, which was found to be associated with type 1 diabetes (20) and rheumatoid arthritis (21) in previous GWASs. The risk-associated alleles of both SNPs are typically found on the same haplotype (C-C); that is, the direction of effect for the risk allele is consistent across anorexia nervosa and these other two disorders. Several other immune-related phenotypes—vitiligo, alopecia areata, and asthma (see Figure S5 in the data supplement)—also have associations in the region, although these are somewhat LD independent of rs4622308.
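The combination of a D′ close to 1 with an r2 below 1 arises whenever two variants sit on the same haplotype background but differ in allele frequency. The sketch below computes both statistics from haplotype and allele frequencies using the standard definitions. The function name and the frequencies are hypothetical and are not estimates for rs4622308 or rs11171739.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (r2, D') for two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    r2 = d ** 2 / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


# Hypothetical frequencies: every haplotype carrying B also carries A,
# but A is somewhat more common than B.
r2, d_prime = ld_stats(p_ab=0.40, p_a=0.44, p_b=0.40)
print(round(r2, 2), round(d_prime, 2))
```

In this hypothetical case D′ equals 1 because one of the four possible haplotypes is absent, whereas r2 is about 0.85 because the allele frequencies differ.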
Information for the top six loci is provided in Table S2 in the data supplement. The second (rs200312312 on chromosome 5, p=6.7×10−8), third (rs117957029 on chromosome 12, p=1.6×10−7), and fourth (rs11174202 on chromosome 12, p=3.1×10−7) most significant loci in our analyses also have consistent evidence for association across multiple cohorts (see Figure S6 in the data supplement for area plots of these loci). The fourth best locus is intronic in the FAM19A2 gene. Summary statistics are available at https://www.med.unc.edu/pgc.
Gene-Based and Pathway Analyses
Multiple genes, all but one of which were in the region around the top SNP (rs4622308), reached gene-based significance (reflecting the high LD in the region). The remaining significant gene was FAM19A2, a putative chemokine/cytokine and the fourth best locus in our SNP-based analyses. No pathways were significant (see Table S3 in the data supplement for the complete gene-based and pathway analysis results). As has typically been reported for other psychiatric disorders, candidate genes from previous studies did not reach gene-based significance (for a detailed review of the candidate gene literature, see reference 5).
Interrogation of databases such as GTEx (22) did not indicate that any of the genes in the top region have distinct patterns of brain gene expression. Searches using both GTEx and the SNP tag lookup function in MR-Base (www.mrbase.org/beta) indicated that the top SNP (rs4622308) is not, directly or via LD tagging, an expression or methylation quantitative trait locus. In addition, differential expression in an exploratory mouse model did not suggest a distinct pattern of gene expression (see Figure S7 in the data supplement).
Linkage Disequilibrium Score Regression
In our cohort, h2SNP for anorexia nervosa was 0.20 (SE=0.021), comparable to h2SNP estimates for other psychiatric disorders (see Figure S8 in the data supplement). Partitioned heritability estimates for annotation categories and cell types were not significant after correction for multiple testing (for complete results, see Table S4 in the data supplement).
A wide range of both positive and negative genetic correlations between anorexia nervosa and other phenotypes were statistically significant. Of 159 phenotypes tested, 29 had a false discovery rate <0.05 (uncorrected p values reported below). See Figure 2 for a depiction of these genetic correlations, and the text below for selected examples. All 159 genetic correlations and relevant references are available in Table S5 in the data supplement.
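The false discovery rate procedure is not specified in the text. As an illustration, the sketch below assumes the widely used Benjamini-Hochberg step-up method and applies it to a handful of hypothetical p values; the function name and the inputs are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical uncorrected p values for five phenotypes (illustration only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
example_q = benjamini_hochberg(example_p)
n_significant = sum(q < 0.05 for q in example_q)
print([round(q, 5) for q in example_q], n_significant)
```

In this toy example, two of the four nominally significant p values (p<0.05) remain significant at a false discovery rate below 0.05, which is the kind of distinction the text draws between corrected and nominal significance.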
Notable significant genetic correlations between anorexia nervosa and psychiatric traits and disorders were those with neuroticism (rg=0.39, SE=0.14, p=4.4×10−3), schizophrenia (rg=0.29, SE=0.07, p=4.4×10−5), and the results of a meta-analysis across psychiatric phenotypes (rg=0.22, SE=0.07, p=3.4×10−3). Genetic correlations between anorexia nervosa and the educational phenotypes of years of education (rg=0.34, SE=0.08, p=5.2×10−6) and attending college (rg=0.30, SE=0.07, p=4.4×10−5) were also positive and significant. Obsessive-compulsive disorder (OCD) GWAS data were unavailable to us, but a previous analysis (24) reported a positive rg with anorexia nervosa of 0.53 (SE=0.12, p=5.5×10−6).
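The relationship between each rg, its standard error, and its p value can be checked approximately with a two-sided normal test of rg/SE. The sketch below assumes that this is how the p values were derived; the function name is illustrative. Because the inputs are the rounded values printed above, the recomputed p values agree with the reported ones only approximately.

```python
import math


def rg_z_and_p(rg: float, se: float):
    """Return the z score and two-sided normal p value for a genetic correlation."""
    z = rg / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Rounded values as reported in the text.
reported = {
    "neuroticism": (0.39, 0.14),
    "schizophrenia": (0.29, 0.07),
    "years of education": (0.34, 0.08),
}

for trait, (rg, se) in reported.items():
    z, p = rg_z_and_p(rg, se)
    print(f"{trait}: z={z:.2f}, p={p:.1e}")
```

For schizophrenia, for example, the rounded inputs give a z score of about 4.1 and a p value on the order of 10−5, the same order of magnitude as the reported 4.4×10−5.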
Several significant negative genetic correlations emerged between anorexia nervosa and weight-related phenotypes, suggesting shared genetic loci underlying these phenotypes and opposing effects for relevant alleles. Extreme high body mass index (BMI) was significantly negatively correlated with anorexia nervosa (rg=−0.29, SE=0.08, p=2.0×10−4) as were obesity, BMI in the normal range, overweight, and hip circumference, with genetic correlations ranging from −0.2 to −0.3.
We also observed significant negative genetic correlations between anorexia nervosa and insulin- and glucose-related traits: insulin resistance (HOMA-IR) (rg=−0.50, SE=0.11, p=1.3×10−5) and fasting insulin (rg=−0.41, SE=0.09, p=5.2×10−6) were the largest-magnitude genetic correlations observed, aside from the previous report of OCD (24). A negative genetic correlation with fasting glucose (rg=−0.26, SE=0.07, p=3.0×10−4) was also observed. Although genome-wide BMI-corrected HOMA-IR GWAS statistics were not available, we observed a negative rg of BMI-corrected GWAS results for the closely related trait of leptin levels (rg=−0.24, SE=0.11, p=0.03), which suggests a role for BMI-independent glucose-related metabolism in anorexia nervosa. Regarding cholesterol and lipid measures, a distinction between different lipid fractions emerges when comparing high-density lipoprotein (HDL), low-density lipoprotein (LDL), and very-low-density lipoprotein (VLDL) phenotypes. Genetic correlations between anorexia nervosa and HDL phenotypes were positive—for example, total cholesterol in large HDL particles (rg=0.39, SE=0.12, p=1.6×10−3), free cholesterol in large HDL particles (rg=0.37, SE=0.12, p=2.2×10−3), and phospholipids in large HDL particles (rg=0.30, SE=0.11, p=6.7×10−3). In contrast, VLDL and LDL cholesterol phenotypes were negatively correlated with anorexia nervosa, albeit with nominal significance (i.e., uncorrected p<0.05)—for example, total lipids in VLDL (rg=−0.30, SE=0.12, p=0.01), phospholipids in VLDL (rg=−0.33, SE=0.13, p=4.4×10−3), and LDL cholesterol (rg=−0.20, SE=0.08, p=0.011).
To our knowledge, this is the first report of a genome-wide significant association for anorexia nervosa. As is typical of many GWAS loci for complex disorders, the region has a common top variant (MAFcontrols=0.44) that shows a modest odds ratio of 1.2 and implicates a broad region encompassing multiple genes (25). Consistent with other GWASs (26), our genome-wide h2SNP estimate of 20% for anorexia nervosa supports a substantial role for common genetic variation, which accounts for a sizable portion of twin-based heritability (h2Twin=48%–74%) (6). Furthermore, these results fit with the expectation that h2Twin should exceed h2SNP, because the former captures the effects of all types of genetic variation (common and rare, as well as variation not captured with current methods).
The observed pattern of genetic correlations with psychiatric, personality, educational, and metabolic phenotypes provides grounds for broadening our conceptualization of the disorder. First, the strong positive genetic correlations of anorexia nervosa with OCD and neuroticism reinforce clinical and epidemiological observations. Anorexia nervosa is commonly comorbid with OCD, and twin studies have reported high twin-based genetic correlations (27). High neuroticism in adolescence predicts subsequent onset of anorexia nervosa (1). In addition, anorexia nervosa is commonly comorbid with multiple anxiety phenotypes, which often predate the onset of anorexia nervosa (28).
Second, the positive genetic correlations seen with schizophrenia and the cross–psychiatric disorder phenotype firmly anchor anorexia nervosa with other psychiatric disorders and reflect the substantial evidence for partially shared genetic risk across many psychiatric disorders (29). Third, congruent with our results, positive associations between anorexia nervosa and educational attainment have been reported (30) and have been conjectured to reflect greater internal and external demands for academic success in highly educated families. Our results, in contrast, suggest that genetic factors may partially account for these reported associations.
Fourth, the identification of significant negative correlations between anorexia nervosa and BMI-related and anthropometric measures could potentially serve as an important first step toward gaining a better understanding of the shared biology underlying extremes of weight dysregulation (i.e., obesity versus anorexia nervosa). This is of critical importance because adequate explanations for how individuals with anorexia nervosa reach, sustain, and revert to exceedingly low BMIs have been elusive. Clinically, one of the most perplexing features of anorexia nervosa is how patients’ bodies seem to revert rapidly to a “low set point” after renourishment, which may represent the biological inverse of the reversion to high set points commonly seen in the unsuccessful treatment of obesity (31, 32). As noted by Bulik-Sullivan et al. (23) and Hinney et al. (33), these observations extend our understanding that the same genetic factors that influence normal variation in BMI, body shape, and body composition may also influence extreme dysregulation of these weight-related features in anorexia nervosa. This pattern of observations complements prior strong evidence for the involvement of neural mechanisms in obesity (34). Finally, positive correlations with “favorable” metabolic phenotypes (i.e., HDL and lipid measures) and negative correlations with “unfavorable” metabolic phenotypes (i.e., fasting insulin level, fasting glucose level, HOMA-IR) encourage additional exploration of the role metabolic factors may play in extreme dysregulation of appetite and weight in anorexia nervosa.
The genome-wide significant locus we identify to be associated with anorexia nervosa is broad and multigenic (chr12:56,372,585–56,482,185). Mechanistic explanations about the role of the associated variant require additional functional data; nevertheless, we note the possible role for genes at this locus in the pathophysiology of anorexia nervosa. PA2G4 is involved in growth regulation and acts as a co-repressor of the androgen receptor (35). ESYT1 (extended synaptotagmin-1, which binds and transports lipids) is enriched in the postsynaptic density, which is implicated in the etiology of schizophrenia (37). Perhaps more convincing is that the sentinel marker for this locus, rs4622308, is in high LD with a known GWAS hit for type 1 diabetes (20) and rheumatoid arthritis (21), and the region around it harbors multiple other autoimmune associations. Multiple reports of shared effects between anorexia nervosa and immune phenotypes fit into a broader pattern of above-chance comorbidity across psychiatric and immune phenotypes (38, 39). Evidence suggests that this shared risk is at least partly genetic in origin (23, 39). A negative genetic correlation between anorexia nervosa and rheumatoid arthritis was previously reported (23), and our LD score regression estimate, though only nominally significant, is in the negative direction as well (see Table S5 in the data supplement).
The primary strength of this investigation is that it extends previous work by increasing sample size through collaboration. Nevertheless, contemporary understanding of complex trait genetics suggests that even larger samples are needed. Since our collection represents all of the currently GWAS-genotyped anorexia nervosa samples in the world, no known genotyped replication samples exist. We therefore expect this to be the beginning of genomic discovery in eating disorders (25). Future work with additional and better-powered anorexia nervosa GWASs will clarify the magnitude of genetic relationships among metabolic and psychiatric phenotypes, and methods such as that proposed by Pickrell et al. (40) will provide clues about the direction of causal relationships.
In summary, we identified the first robust genome-wide significant locus for anorexia nervosa, which is also a previously reported type 1 diabetes and general autoimmune disorder locus. Perhaps of greater importance is that we find anorexia nervosa to be a complex heritable phenotype with intriguingly large and significant genetic correlations not only with psychiatric disorders but also with multiple metabolic traits. This encourages a reconceptualization of this frequently lethal disorder as both psychiatric and metabolic. Just as obesity is increasingly considered to be both a metabolic/endocrine and psychiatric disorder, approaching anorexia nervosa as both a psychiatric and metabolic condition could ignite interest in developing or repositioning pharmacological agents for its treatment.
16 : The International Genome Sample Resource (IGSR): a worldwide collection of genome variation incorporating the 1000 Genomes Project data. Nucleic Acids Res 2016; 45(D1):D854–D859
18 : A language for data analysis and graphics. J Comput Graph Stat 1996; 5:299–314
24 : Analysis of shared heritability in common disorders of the brain [Internet]. bioRxiv 2016. http://biorxiv.org/content/early/2016/04/16/048991.article-info