Efforts to prevent depression, the leading cause of disability worldwide, have focused on a limited number of candidate factors. Using phenotypic and genomic data from over 100,000 UK Biobank participants, the authors sought to systematically screen and validate a wide range of potential modifiable factors for depression.


Baseline data were extracted for 106 modifiable factors, including lifestyle (e.g., exercise, sleep, media, diet), social (e.g., support, engagement), and environmental (e.g., green space, pollution) variables. Incident depression was defined as minimal depressive symptoms at baseline and clinically significant depression at follow-up. At-risk individuals for incident depression were identified by polygenic risk scores or by reported traumatic life events. An exposure-wide association scan was conducted to identify factors associated with incident depression in the full sample and among at-risk individuals. Two-sample Mendelian randomization was then used to validate potentially causal relationships between identified factors and depression.


Numerous factors across social, sleep, media, dietary, and exercise-related domains were prospectively associated with depression, even among at-risk individuals. However, only a subset of factors was supported by Mendelian randomization evidence, including confiding in others (odds ratio=0.76, 95% CI=0.67, 0.86), television watching time (odds ratio=1.09, 95% CI=1.05, 1.13), and daytime napping (odds ratio=1.34, 95% CI=1.17, 1.53).


Using a two-stage approach, this study validates several actionable targets for preventing depression. It also demonstrates that not all factors associated with depression in observational research may translate into robust targets for prevention. A large-scale exposure-wide approach combined with genetically informed methods for causal inference may help prioritize strategies for multimodal prevention in psychiatry.

Depression is the leading cause of disability worldwide (1), but knowledge of actionable strategies that could mitigate depression risk remains relatively limited. A number of critical research gaps have remained. First, the literature to date has focused on validating a limited set of hypothesized modifiable factors for prevention of depression, such as physical activity (2, 3) and social support (4). Without broader investigation, additional factors may be overlooked or unknown. Investigating a wide range of factors could help confirm existing relationships and also identify novel potential prevention targets. Systematically testing the relationship between many variables and a single outcome for hypothesis-free discovery is now common practice in other fields in the form of genome- or phenome-wide association studies and has led to new insights about underlying associations (5, 6), but has not yet been applied to identifying modifiable factors for prevention of depression.

Second, few studies, to our knowledge, have appraised the relative influences of multiple modifiable factors within the same population. Some factors (e.g., specific nutrients or foods) that show statistically significant effects when studied alone may not prove robust or as clinically relevant when considered alongside other factors (7). Understanding the relative importance of different modifiable factors that could be integrated into prevention packages has been limited by modest sample sizes for multiple testing and lack of comprehensive measurements in a single study. The availability of large cohort studies, such as UK Biobank (8), now makes comprehensive and well-powered inquiries possible.

Third, we do not know which modifiable factors may help prevent depression among individuals at elevated risk. Two of the best substantiated risk factors for depression—genetic vulnerability and early-life adversity (9, 10)—are effectively unmodifiable in adults. What generally helps prevent depression in most people may not necessarily be most relevant for those with specific risk profiles (11) and vice versa. Depression is now recognized as a polygenic condition (12)—influenced by many variants across the genome with individually small effects (13). As we are increasingly able to quantify polygenic risk for depression (14) and may even communicate polygenic risk information to individuals in the future (15), it becomes vital to expand knowledge of effective actionable measures for those identified as having an elevated risk. Similarly, life history factors such as traumatic events are known to increase risk for depression (16). As we more systematically assess established sources of genetic and environmental risk in a precision medicine framework (17), evidence of modifiable factors that benefit high-risk individuals could guide recommendations to offset preexisting vulnerabilities for depression (18).

Finally, modifiable factors may be associated with depression for noncausal reasons, including unmeasured third variables (i.e., residual confounding) and reverse causation (e.g., whereby depression risk influences behavioral patterns). To strengthen conclusions about which modifiable factors may be high-priority intervention targets, Mendelian randomization analyses can be used to further test relationships between identified factors and depression. Mendelian randomization is an alternative strategy for causal inference that uses genetic variants inherited at birth as statistical “instruments” to approximate a natural experiment in which individuals are assigned to varying average lifetime levels of an exposure (e.g., social affiliation) in relation to an outcome of interest (e.g., depression) (19). This use of genetic data bypasses typical sources of confounding in observational data and allows for triangulation of findings (20). We previously leveraged Mendelian randomization to validate a protective relationship between objectively measured physical activity and depression risk (3). Here, we extend the Mendelian randomization approach to evaluate a wide range of possible modifiable factors.

In this study, using phenotypic and genomic data from over 100,000 UK Biobank participants without active depressive symptoms at baseline, we used an exposure-wide association study design to test the relationship between 106 modifiable factors and clinically significant depression at follow-up (Figure 1). Given the established role of genetics and traumatic life events in depression risk, we also aimed to identify factors that may influence depression even in the context of these risks. Finally, we used two-sample Mendelian randomization to further assess directional effects and potential causal relationships between identified factors and depression.


FIGURE 1. Overview of two-stage analytic design^a

^a Associations between modifiable factors and incident depression were tested in three analytic samples: the full sample, individuals at risk based on polygenic risk, and individuals at risk based on reported traumatic life events. To reduce bias in associations from contemporaneous reporting, modifiable factors were selected from those indexed to the baseline assessment, and subsequent depression was assessed at the follow-up survey approximately 6 to 8 years later, after removing individuals with elevated depressive symptoms at baseline. Relationships between identified factors and depression risk were then examined in bidirectional Mendelian randomization analyses.


Our initial sample consisted of 123,794 adults of white British ancestry who enrolled in UK Biobank, had high-quality genomic data (for quality control procedures, see the Supplementary Methods in the online supplement), and completed an online follow-up mental health survey approximately 6 to 8 years after their initial enrollment (Figure 1). Data analytic procedures were approved by the institutional review board at Partners HealthCare and conducted as part of UK Biobank application 32568. Primary data processing and statistical analyses were conducted between October 2018 and August 2019.


Incident Depression.

Participants who endorsed depressed mood and/or anhedonia (for details, see the Supplementary Methods in the online supplement) for more than half the days in the past 2 weeks at baseline (N=5,416) were considered to have elevated depressive symptoms (21) and were excluded from this study, leaving 118,378 participants. At follow-up, symptoms of depression were measured using all nine items of the Patient Health Questionnaire–9 (PHQ-9) (22), summed to create an overall score ranging from 0 to 27. To derive predicted probabilities of depression to stratify at-risk groups, we created a binary variable for clinically significant incident depression based on a score cut-off of ≥10 (23).
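The outcome definition above can be illustrated with a minimal sketch. It assumes the standard PHQ-9 item scoring of 0 to 3 per item; the function names are hypothetical and are not taken from the study's analysis code.

```python
from typing import Sequence

# Cut-off described in the text: a total score of 10 or more is treated
# as clinically significant depression.
PHQ9_CUTOFF = 10


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum the nine PHQ-9 items (each scored 0-3) into a 0-27 total."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def is_incident_depression(item_scores: Sequence[int]) -> bool:
    """Binary outcome: True when the summed score meets the cut-off."""
    return phq9_total(item_scores) >= PHQ9_CUTOFF
```

A participant scoring 1 on eight items and 2 on the ninth would total 10 and be classified as a case under this rule.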

Modifiable Factors.

We curated data on 106 potentially modifiable factors (see Table S1A in the online supplement) as measured or derived at baseline. These factors included behavioral (e.g., exercise, sleep, media use, diet), social (e.g., activities, support), and environmental (e.g., green space, pollution) variables. We selected these variables by inspecting the UK Biobank data showcase. After review by three authors (K.W.C., J.W.S., K.M.N.), we included any variables in a domain that were 1) unlikely to be a close comorbidity of mental health problems (e.g., excluding substance use and cognitive functioning); 2) putatively modifiable at an individual or societal level (e.g., including behavioral and environmental factors); and 3) largely available for most participants and not just collected for a small subset (e.g., excluding branched questions that were only administered to individuals who had endorsed an earlier item). Potentially correlated variables within a category (e.g., 16-hour and 24-hour noise pollution) were retained to assess the relative influences of all available variables. As negative controls, we also selected two nonmodifiable variables hypothesized to be unrelated to depression: natural hair color and skin tanning ability. Data processing was performed on all variables, as described in the Supplementary Methods and Table S1A in the online supplement.

Traumatic Life Experiences.

In the online follow-up, participants reported on their history of traumatic life experiences, including childhood physical, sexual, and emotional abuse; partner-based physical, sexual, and emotional abuse; and other traumatic events, including exposure to sexual assault, violent crime, life-threatening accident, and witnessing violent death (for details, see the Supplementary Methods).


Baseline variables were extracted for participant characteristics (sex, age, assessment center), sociodemographic factors (socioeconomic deprivation, employment status, household income, completion of higher education, urbanicity, household size), and physical health factors (body mass index and reported physical illness or disability) (for details and inclusion rationale, see the Supplementary Methods in the online supplement).

Polygenic Risk Scoring

Polygenic risk scores (PRSs) were generated from large-scale genome-wide association results for major depression (12). Specifically, we used summary statistics (discovery genome-wide association study [GWAS], N=431,394) from the Psychiatric Genomics Consortium, leaving out UK Biobank data to minimize sample overlap and including 23andMe data for improved statistical power. We retained single-nucleotide polymorphisms (SNPs) with minor allele frequency >0.01 and INFO quality score >0.80. To generate PRSs, we applied PRS-CS (24)—a Bayesian polygenic prediction method that places a continuous shrinkage (CS) prior on effect sizes for all HapMap3 SNPs and infers posterior SNP weights using GWAS summary statistics combined with an external linkage disequilibrium reference panel, such as the 1000 Genomes Project European sample (for more details and comparison with conventional clumping and thresholding, see the Supplementary Methods in the online supplement). We set the global shrinkage parameter at 0.01 to reflect the likely polygenic architecture of major depression. Scores were calculated by summing the number of risk alleles at each SNP multiplied by the posterior SNP weight inferred using PRS-CS, with a total of 1,090,207 included SNPs. (For the distribution of scores, see the Supplementary Methods.) We then extracted residuals from a model in which PRSs were regressed on the top 10 European ancestry principal components provided by UK Biobank for use as stratification-adjusted PRSs in subsequent analyses.
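The scoring step described above (a weighted sum of risk-allele counts, followed by residualization on ancestry components) can be sketched as follows. This is an illustration only: the function names are hypothetical, and the residualization uses a single covariate for brevity, whereas the study regressed scores on the top 10 principal components.

```python
from typing import List, Sequence


def polygenic_score(allele_counts: Sequence[float],
                    snp_weights: Sequence[float]) -> float:
    """Sum of risk-allele counts multiplied by per-SNP posterior weights."""
    if len(allele_counts) != len(snp_weights):
        raise ValueError("one weight is required per SNP")
    return sum(count * weight
               for count, weight in zip(allele_counts, snp_weights))


def residualize(scores: Sequence[float],
                covariate: Sequence[float]) -> List[float]:
    """Residuals from a simple linear regression of scores on one covariate.

    Simplifying assumption: one covariate stands in for the 10 ancestry
    principal components used in the study.
    """
    n = len(scores)
    mean_x = sum(covariate) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        raise ValueError("covariate has no variance")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(covariate, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, scores)]
```

The residuals are the part of each score not explained by the covariate, which is what the stratification-adjusted PRS is meant to capture.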

Stratifying Participants at Risk for Incident Depression

Among individuals with available data on later depression and risk variables (i.e., polygenic risk and reported traumatic life events) (N=113,589; 4.3% with incident depression), we removed a holdout training sample of 1,000 participants consisting of an even split of randomly selected case and control subjects for incident depression (for the rationale, see the Supplementary Methods in the online supplement). In this holdout training sample, we regressed incident depression against polygenic risk or reported traumatic life events (for descriptive distributions, see Table S1B in the online supplement). Here, each traumatic life event was entered as a separate independent variable within a multivariable model to estimate relative weights of each event on depression risk, rather than assuming equal influences. We obtained regression coefficients for each set of risk variables from the training sample (see the Supplementary Methods) and used these coefficients as weights to generate predicted probability scores for incident depression for individuals in the testing sample (N=112,589), based on polygenic risk or reported traumatic life events. Selecting individuals with high predicted probability scores (>90th percentile), we obtained three groups: individuals in the full sample unselected for risk (maximum N=112,589), individuals at risk based on genetic factors (the PRS group; maximum N=11,258), and individuals at risk based on reported traumatic life events (maximum N=11,258). Only 1,563 individuals belonged to both the PRS and traumatic life events groups (13.9% of each), suggesting modest overlap and potentially distinct influences on depression (for exploratory results in this reduced sample, see Tables S2J–L and Figures S4A–C in the online supplement).
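As a rough sketch of the stratification step, training-sample coefficients can be turned into predicted probabilities with the logistic function, and the individuals above the 90th percentile can then be selected by rank. The helper names are hypothetical, and the rank-based handling of the percentile cut is an assumption; it does, however, reproduce the group size of 11,258 reported for a testing sample of 112,589.

```python
import math
from typing import Dict, List, Sequence


def predicted_probability(intercept: float,
                          coefs: Sequence[float],
                          features: Sequence[float]) -> float:
    """Logistic-model probability from training-sample coefficients."""
    linear = intercept + sum(c * x for c, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-linear))


def n_above_percentile(n: int, percentile: int = 90) -> int:
    """Number of individuals ranked above the given percentile (floored)."""
    return n * (100 - percentile) // 100


def at_risk_group(probabilities: Dict[str, float],
                  percentile: int = 90) -> List[str]:
    """IDs with the highest predicted probabilities, highest first."""
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    return ranked[:n_above_percentile(len(ranked), percentile)]
```

Because the coefficients come from a separate holdout sample, the probabilities in the testing sample are not fitted to the same individuals they are used to stratify.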

Exposure-Wide Association Scan

Using an exposure-wide association approach with logistic regression (see the Supplementary Methods in the online supplement), we tested associations between each baseline modifiable factor and incident depression in each of these samples (Figure 1), with a conservative Bonferroni-corrected significance threshold for establishing top hits (0.05 divided by 106 tests across three main analytic samples, or 0.000157). All associations were adjusted for sex, baseline age, and assessment center (model 0). We further adjusted for the sociodemographic factors described earlier (model 1) and also added physical health factors (model 2). All analytic samples were restricted to participants who had not withdrawn from UK Biobank as of February 2020 and had full covariate data (full sample: maximum N=100,517; PRS group: maximum N=10,093; traumatic life events group: maximum N=10,154) to ensure that differences in results between successively adjusted models reflected the addition of covariates rather than varying sample size. We also descriptively assessed the overlap between significant factors in each at-risk sample compared with the full sample and between at-risk samples.
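The significance threshold quoted above follows directly from the Bonferroni correction, as this short sketch shows. The constant and function names are illustrative only.

```python
N_FACTORS = 106   # modifiable factors tested
N_SAMPLES = 3     # full sample, PRS group, traumatic life events group
ALPHA = 0.05      # family-wise error rate


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p threshold that controls the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


THRESHOLD = bonferroni_threshold(ALPHA, N_FACTORS * N_SAMPLES)


def is_top_hit(p_value: float) -> bool:
    """True when an association passes the corrected threshold."""
    return p_value < THRESHOLD
```

With 318 tests in total, the threshold works out to approximately 0.000157, matching the value stated in the text.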

Mendelian Randomization Analyses

We performed bidirectional two-sample Mendelian randomization analyses (see the Supplementary Methods in the online supplement) between depression and modifiable factors identified in the fully adjusted exposure-wide association scan (model 2) in the overall sample. For each factor, we accessed the GWAS Atlas database (25) to obtain publicly available UK Biobank–based summary statistics. For depression, we retained the Psychiatric Genomics Consortium summary statistics used for polygenic scoring (12). As instruments for each factor, we extracted highly associated SNPs (p<5×10^−7; for the rationale, see the Supplementary Methods) that were clumped for independence at r^2>0.001. Using the TwoSampleMR package in R (26), we conducted Mendelian randomization analyses to estimate the effect of each modifiable factor on the risk of depression and vice versa. For primary Mendelian randomization analyses, we combined per-SNP effects using inverse-variance weighted meta-analysis, where the resulting estimate represents the slope of a weighted regression of SNP-outcome effects on SNP-exposure effects in which the intercept is constrained to zero. We applied MR-PRESSO (27) with additional tests (i.e., Cook’s distance, studentized residuals, Q-value outliers) to detect statistical outliers reflecting potential bias (28), and removed these outliers to generate reported estimates. We relaxed the instrument p threshold (p<5×10^−6) for several traits lacking sufficient SNPs (≤3) after outlier removal (e.g., vitamin B supplementation, walking frequency). We then compared the pattern of inverse-variance weighted results to other established Mendelian randomization methods, namely the weighted median approach (29) and Mendelian randomization-Egger (MR-Egger) regression (30), whose estimates rely on different assumptions and are relatively robust to horizontal pleiotropy, i.e., violation of the assumption that genetic instruments act on the outcome only via their effects on the exposure.
For significant results, we further assessed horizontal pleiotropy using leave-one-SNP-out analyses, the modified Cochran’s Q statistic, the MR-Egger intercept test (31), and manual SNP lookups (further details are provided in the Supplementary Methods in the online supplement). Reported estimates were converted to odds ratios where the outcome was binary, and interpreted using a conservative p threshold (0.05/number of factors with available summary statistics).
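The inverse-variance weighted estimate described above, a zero-intercept weighted regression of SNP-outcome effects on SNP-exposure effects, can be written out in a few lines. This is a minimal sketch and not the TwoSampleMR implementation: the function names are hypothetical, weights are assumed to be the inverse of the squared SNP-outcome standard errors, and the standard error shown is the simple fixed-effect form, which may differ from what the package reports.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(beta_exposure: Sequence[float],
                 beta_outcome: Sequence[float],
                 se_outcome: Sequence[float]) -> Tuple[float, float]:
    """Slope and fixed-effect SE of a zero-intercept weighted regression."""
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per SNP")
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by
                    in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    if denominator == 0:
        raise ValueError("instruments have no effect on the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


def to_odds_ratio(log_odds: float, se: float,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log-odds estimate to an odds ratio with a 95% CI."""
    return (math.exp(log_odds),
            math.exp(log_odds - z * se),
            math.exp(log_odds + z * se))
```

Constraining the intercept to zero encodes the assumption that a SNP with no effect on the exposure has no effect on the outcome, which is the assumption MR-Egger relaxes by estimating the intercept.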


Modifiable Factors Prospectively Associated With Incident Depression in the Full Sample

In the full sample (for descriptive data, see the Supplementary Methods and Table S1C in the online supplement), 49 factors spanning multiple domains (e.g., physical activity, media use, sleep, social, environmental, and dietary variables) were significantly associated with depression (model 0) (see Figure S1A,C and Table S2A in the online supplement). After adjusting for sociodemographic factors (model 1), 39 factors were significantly associated with depression (see Figure S1B,D and Table S2B in the online supplement). After further adjusting for physical health factors (model 2), 29 factors remained significantly associated with depression (Figures 2 and 3; see also Figure S1E and Table S2C in the online supplement). Of these, 18 factors were associated with reduced odds of depression and 11 were associated with increased odds of depression (all continuous factors were standardized to a mean of zero and a standard deviation of 1; for variable types, see Table S1A in the online supplement). The top 10 included six factors that appeared protective: confiding in others (adjusted odds ratio=0.83, 95% CI=0.82, 0.85, p=9.66×10^−100), sleep duration (adjusted odds ratio=0.83, 95% CI=0.80, 0.85, p=5.37×10^−33), engaging in exercises like swimming or cycling (adjusted odds ratio=0.70, 95% CI=0.66, 0.75, p=2.91×10^−25), walking pace (adjusted odds ratio=0.79, 95% CI=0.74, 0.84, p=3.37×10^−15), being part of a sports club or gym (adjusted odds ratio=0.77, 95% CI=0.72, 0.83, p=3.98×10^−12), and cereal intake (adjusted odds ratio=0.89, 95% CI=0.87, 0.92, p=9.57×10^−12). The remaining four factors appeared to increase risk: daytime napping (adjusted odds ratio=1.29, 95% CI=1.22, 1.37, p=1.20×10^−19), computer use time (adjusted odds ratio=1.10, 95% CI=1.07, 1.13, p=9.36×10^−12), television watching time (adjusted odds ratio=1.12, 95% CI=1.08, 1.16, p=6.07×10^−12), and cell phone use time (adjusted odds ratio=1.10, 95% CI=1.07, 1.13, p=1.25×10^−11).


FIGURE 2. Association results between modifiable factors and incident depression in the full sample, adjusted for all covariates^a

^a Panel A is an association plot for modifiable factors in relation to incident depression, with the x-axis organized by conceptual domains and the y-axis showing statistical significance as −log10 of the p value; the long-dash horizontal line indicates the significance threshold corrected for multiple testing, and the short-dash horizontal line indicates the p<0.05 threshold. A selection of top factors is annotated for legibility; the full set of association results is provided in Tables S2A–C in the online supplement. Panel B presents the adjusted odds ratios for significant factors, in ascending order (i.e., from risk-reducing to risk-increasing). IPAQ=International Physical Activity Questionnaire classification of physical activity level (low, moderate, high).


FIGURE 3. Top factors associated with incident depression across levels of covariate adjustment^a

^a The top associated factors in the full sample are shown in order of consistency patterns across one, two, or three models, in descending alphabetical order within each pattern. Results are shown only for factors with significant associations in at least one model. (The full set of association results is provided in Tables S2A–C in the online supplement.) Model 0=adjusted for participant characteristics; model 1=model 0 further adjusted for sociodemographic factors; model 2=model 1 further adjusted for health factors; IPAQ=International Physical Activity Questionnaire classification of physical activity level (low, moderate, high).

Factors Associated With Depression Among At-Risk Individuals Based on Polygenic Risk

Among individuals at high predicted probability of depression based on PRS, 12 factors were significantly associated with depression (model 0) (see Figure S2A,E and Table S2D in the online supplement). These were reduced to 10 (model 1; see Figure S2B,F and Table S2E) and four top factors (model 2; see Figure S2C,G and Table S2F) after adjustment for sociodemographic and health factors, respectively. Notably, these factors had been identified in the full sample. Of these, two appeared to be protective: frequency of confiding in others (adjusted odds ratio=0.85, 95% CI=0.81, 0.89, p=2.87×10^−13) and sleep duration (adjusted odds ratio=0.81, 95% CI=0.75, 0.88, p=4.07×10^−7). The other two appeared to increase risk: computer use time (adjusted odds ratio=1.17, 95% CI=1.09, 1.26, p=1.19×10^−5) and salt intake (adjusted odds ratio=1.21, 95% CI=1.10, 1.33, p=1.31×10^−4).

Factors Associated With Depression Among Individuals at Risk Based on Traumatic Life Events

Among individuals with high predicted probability of depression based on their reported traumatic life events, 18 factors were significantly associated with depression (model 0) (see Figure S3A,E and Table S2G in the online supplement). These were reduced to 16 (model 1; see Figure S3B,F and Table S2H) and four top factors (model 2; see Figure S3C,G and Table S2I) after adjustment for sociodemographic and health factors, respectively. Again, these factors had been identified in the full sample. Of these, three appeared to be protective: frequency of confiding in others (adjusted odds ratio=0.85, 95% CI=0.82, 0.88, p=2.00×10^−22), engaging in exercises like swimming or cycling (adjusted odds ratio=0.66, 95% CI=0.59, 0.75, p=2.31×10^−10), and sleep duration (adjusted odds ratio=0.83, 95% CI=0.79, 0.89, p=3.93×10^−9). One factor appeared to increase risk: television watching time (adjusted odds ratio=1.15, 95% CI=1.08, 1.23, p=5.85×10^−6). Two of these factors (confiding in others and sleep duration) had also been identified as top factors in the PRS group, and television watching time showed a similar estimate in the PRS group (adjusted odds ratio=1.17, 95% CI=1.08, 1.27) as well. The remaining top factors (computer use, salt intake, and exercises like swimming or cycling) showed overlapping confidence intervals between the PRS and traumatic life events groups, suggesting that associations may be relatively comparable across genetic or environmental risk despite not meeting the defined threshold.
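The descriptive comparison of confidence intervals between at-risk groups can be expressed as a simple overlap check. The sketch below uses the television watching time intervals reported above; the helper name is hypothetical, and interval overlap is a descriptive heuristic rather than a formal test of a difference between groups.

```python
from typing import Tuple

Interval = Tuple[float, float]


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# 95% CIs for television watching time, as reported in the text.
tv_trauma_group: Interval = (1.08, 1.23)
tv_prs_group: Interval = (1.08, 1.27)

tv_overlap = intervals_overlap(tv_trauma_group, tv_prs_group)
```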

Follow-Up Mendelian Randomization Analyses

We tested all modifiable factors identified in the adjusted full sample (model 2) with available GWAS summary statistics. Bidirectional Mendelian randomization analyses between each factor and depression revealed a number of findings suggesting possible causal relationships (see Figure 4 for inverse-variance weighted results; weighted median results are shown in Figures S5A,B and all estimates in Table S3 in the online supplement).


FIGURE 4. Mendelian randomization estimates of top modifiable factors in relation to depression risk^a

^a Panel A presents Mendelian randomization estimates of the relationship between modifiable factors (exposures) and the risk of depression (outcome), based on the inverse-variance weighted method with outliers removed (for the weighted median method, see Figure S26 in the online supplement). Panel B presents Mendelian randomization estimates of the relationship between depression (exposure) and modifiable factors (outcomes), based on the inverse-variance weighted method with outliers removed (for the weighted median method, see Figure S27 in the online supplement). Odds ratio estimates are shown on the left for dichotomous factors as outcomes, and beta estimates on the right for nondichotomous factors as outcomes.

Mendelian randomization evidence supported a beneficial effect of confiding in others on depression risk (odds ratio=0.76, 95% CI=0.67, 0.86, p=2.53×10^−5; 10 SNPs; see Figure S6A in the online supplement), with nonsignificant effects in the reverse direction. We also found Mendelian randomization evidence supporting a deleterious effect of television use on depression risk (odds ratio=1.09, 95% CI=1.05, 1.13, p=6.81×10^−6; 145 SNPs; see Figure S6B in the online supplement), with nonsignificant effects in the reverse direction. No evidence of effect heterogeneity or horizontal pleiotropy was observed for either factor (see the Supplementary Methods in the online supplement). Daytime napping showed bidirectional effects with depression, such that it was linked to higher odds of depression (odds ratio=1.34, 95% CI=1.17, 1.53, p=1.82×10^−5; 91 SNPs; see Figure S6C in the online supplement), but depression was also associated with increased daytime napping (β=0.05, 95% CI=0.03, 0.06, p=8.45×10^−11; 43 SNPs), with no evidence of effect heterogeneity or horizontal pleiotropy in either direction. Surprisingly, Mendelian randomization evidence suggested that multivitamin use was also linked to increased odds of depression (odds ratio=1.28, 95% CI=1.11, 1.47, p=6.04×10^−4; six SNPs; see Figure S6D in the online supplement). Given the lower number of SNPs tested, this effect was notably attenuated when the instrument SNP threshold was further relaxed to p<5×10^−6 (odds ratio=1.07, 95% CI=1.0, 1.14, p=0.0498; 30 SNPs). Depression was also nominally associated with increased intake of multivitamins (odds ratio=1.06, 95% CI=1.003, 1.13, p=0.0407; 44 SNPs).

Other nominal results at the p<0.05 threshold included potential risk-reducing effects of tea intake, visits with family and friends, and exercises such as swimming or cycling, as well as risk-increasing effects of salt intake on depression. In the reverse direction, nominal results suggested potential effects of depression risk on reducing social participation and increasing tendencies for computer-related behavior and walking (Figure 4 and Table S3; see also the Supplementary Methods in the online supplement).


Although depression is a major source of suffering and lost productivity globally, successful prevention remains challenging. Using phenotypic and genomic data from UK Biobank, we applied a novel two-stage approach to screen and validate a broad panel of modifiable factors as potential prevention targets. Consistent with the multifactorial nature of depression (32), we first identified a range of factors across social, media, sleep, dietary, and physical activity–related domains that were associated with incident depression over the course of study participation, both in general and among at-risk individuals. In subsequent Mendelian randomization analyses, we identified factors with convergent support across both methods, and others with discrepant evidence that may require further validation before they are targeted in resource-intensive trials or policy.

Among factors with convergent support, confiding in others showed the strongest phenotypic associations, even among at-risk individuals, and these associations were substantiated by robust Mendelian randomization results, validating the impact of trusted social connections as causally protective for depression. Visiting with family and friends was also supported by nominally significant Mendelian randomization results, pointing to frequent social interactions as an additional key facet of social engagement that may be protective. Findings align with the literature on social connections and mental well-being (4) and with our recent study in military personnel demonstrating that greater social cohesion was linked to reduced risk of incident depression despite high genetic or environmental risk (33). Emergence of social factors as the most robust among many other modifiable targets suggests that efforts to counteract disconnection at the societal and individual levels—whether by social activity prescriptions (34) or reducing the stigma of seeking emotional support—should be central to an effective depression prevention agenda. Our two-stage analyses also validated television use as a risk factor for depression (35). Further work is needed to determine whether this effect is due to screen time or media exposure per se, or whether television watching time serves more generally as a proxy for sedentary behavior, which was not explicitly measured in the full sample but has been identified as a risk factor for depression (36). Regardless, our findings suggest that health care provider assessment of media use patterns in adult patients and providing psychoeducation on the potential mood impacts of excess television watching could represent another effective component of depression prevention. 
Finally, daytime napping emerged unexpectedly with bidirectional influences in the Mendelian randomization context; that is, a tendency for daytime napping in adults appeared to increase risk of depression, but depression itself may be a cause of increased napping.

A substantial number of associated factors were not supported by current Mendelian randomization evidence, for several possible reasons. First, not all modifiable factors—even those prospectively related to depression—may be causal in their effects on depression risk and thus represent weaker targets for prevention. For example, bidirectional Mendelian randomization evidence suggested that factors such as increased computer use and vitamin B supplementation are more likely to be consequences of depression than causes, such that depressed individuals may tend to spend more time on the computer or be more likely to take supplements. It may be useful to leverage these factors as early indicators of depression rather than direct modifiers of depression risk. Causality notwithstanding, the co-occurrence of depression risk with a range of health-relevant behaviors highlights a potential mechanism for physical morbidities (e.g., cardiometabolic disease, premature mortality) often associated with depression, which could inform preventive interventions to reduce health disparities in individuals with or at risk of depression (37).

Second, the relationship between certain modifiable factors and depression may not be straightforward, and more nuanced study is required. For example, overall reported sleep duration, which was related to incident depression but not substantiated by current Mendelian randomization evidence, may have complex and nonlinear effects (38) that could not be fully explored in this study but could be probed in future studies with more detailed sleep-related phenotypes (39). Geocoded environmental exposures such as pollution and natural space also showed associations that did not persist after adjusting for sociodemographic factors and were thus not tested in Mendelian randomization. It may be that such environmental exposures exert stronger influences earlier in development (40) or their effects depend on heterogeneous features (e.g., tree canopy versus grass coverage [41]).

Third, although we adjusted for sociodemographic and health factors, residual confounding could explain some observed associations. For example, various dietary factors associated with depression (e.g., cereal consumption, lamb intake, vitamin B supplementation) were not supported in Mendelian randomization and may instead reflect behavioral patterns such as daily routines, social rituals, or health concerns that affect mental health more broadly. Despite popular views of vitamin B as a mood-boosting supplement, our findings align with a current lack of randomized trial evidence supporting beneficial effects on depression risk (42). Among the more surprising findings, multivitamin use was not only associated with increased depression but also supported by Mendelian randomization evidence, although it was attenuated in sensitivity analyses. Given sparse evidence to date (43), this finding should be interpreted with caution unless supported by additional data, although an intriguing but nonsignificant association of multivitamin supplementation with higher odds of depression was recently observed in a multisite randomized trial for depression prevention (44). We also found evidence suggesting reverse causation, whereby depressed individuals may be more likely to take multivitamins.

Fourth, the strength of current genetic instruments may have contributed to discrepancies between phenotypic and Mendelian randomization associations, and results may require updating as new genetic discoveries emerge. Although physical activity variables showed some of the largest protective relationships with incident depression, their effects were not bolstered in Mendelian randomization. We previously observed that while influences of objectively measured physical activity (not included here) on depression were validated in Mendelian randomization (3), self-report measures did not show these patterns. Objective measures—capturing a broad tendency for movement—demonstrate higher heritability (45) and may yield more powerful genetic instruments. Indeed, self-reported activity variables, as well as dietary factors, tended to have fewer genome-wide significant SNPs than other traits (e.g., media use). Nonetheless, nominal Mendelian randomization results suggested that liability to engaging in exercises like swimming or cycling (protective) and salt intake (risk) may affect depression risk, meriting further inquiry.
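
To illustrate what "instrument strength" means in practice, the following is a minimal sketch of a commonly used approximation: the per-SNP F-statistic computed as the squared ratio of the SNP-exposure effect to its standard error, with values below about 10 conventionally flagged as weak. This is not taken from the study's analysis pipeline; the function names, the threshold default, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def snp_f_statistics(betas, ses):
    """Approximate per-SNP F-statistic as (beta / se) ** 2."""
    if len(betas) != len(ses):
        raise ValueError("betas and ses must have the same length")
    return [(b / s) ** 2 for b, s in zip(betas, ses)]


def summarize_instrument(betas, ses, threshold=10.0):
    """Summarize an instrument: SNP count, mean F, and number of weak SNPs."""
    f_stats = snp_f_statistics(betas, ses)
    return {
        "n_snps": len(f_stats),
        "mean_f": mean(f_stats),
        "n_weak": sum(f < threshold for f in f_stats),
    }


# Hypothetical SNP-exposure effects and standard errors (not study data).
strong = summarize_instrument([0.05, 0.04, 0.06], [0.005, 0.005, 0.006])
weak = summarize_instrument([0.02, 0.015], [0.008, 0.007])

print(strong)
print(weak)
```

Under these made-up inputs, the second trait has fewer SNPs and every one of them falls below the conventional threshold, which is the kind of situation in which a null Mendelian randomization result is hard to interpret.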

Our study should be evaluated in light of several limitations. First, while we considered a wide array of lifestyle and environmental factors, we were limited by available variables in the UK Biobank database. These did not include modifiable psychological factors (e.g., coping styles) that could also influence depression risk. Second, although the exposure-wide design is a major strength, some associations, potentially noteworthy if studied alone, may be obscured by correction for multiple testing. For instance, physical activity variables (e.g., exercises like swimming and cycling, or heavy outdoor chores) were protectively associated with incident depression even among individuals at high genetic risk, as shown elsewhere (46), but were not interpreted as “top” factors for this group because of conservative thresholds. While we highlight some of the most strongly associated factors, the full results should be reviewed in Table S2 in the online supplement. Third, our study relied largely on self-report measures, which can be subject to reporting biases. Our assessment of depression was based on a survey measure that, while widely used, does not constitute a clinical diagnostic interview. In addition, a self-reported outcome could explain stronger associations with factors that were also self-reported and have an emotional component (e.g., social factors). Given that depression may occur across the life course and that this was a sample of only older adults, we focused on any clinically significant incident depression over the follow-up period; however, future longitudinal research could distinguish between new-onset depression and relapse. Fourth, confirmation of causal effects may require randomized controlled trials of preventive interventions. In some cases, such trials might be prohibitively costly, require long follow-up periods, or be otherwise unfeasible. Mendelian randomization provides an important alternative for verifying effects; however, estimates reflect lifelong average effects of genetic variants and should not be interpreted in the same way as effects from a discrete intervention trial or within a briefer period. Moreover, absence of a Mendelian randomization result does not disconfirm the potential importance of a factor operating within shorter time frames, but points to a need to further investigate discrepancies. As mentioned above, horizontal pleiotropy is a common threat to the validity of Mendelian randomization estimates, which we attempted to rule out using multiple sensitivity analyses; notably, significant results for confiding in others, television watching time, and daytime napping persisted even when retaining only genetic instruments with no known associations with other phenotypes, including depression-relevant traits. Finally, this study was restricted to an older white British sample that volunteered for research and thus represents a more engaged and healthy population (47), and the findings may not be generalizable to other populations.
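
As a rough illustration of why sensitivity analyses matter for horizontal pleiotropy, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-SNP summary statistics and compares it with an MR-Egger style weighted regression that includes an intercept (30, 31). It is a simplified sketch under stated assumptions, not the authors' analysis: the function names are hypothetical, the summary statistics are invented, and standard errors for the Egger fit are omitted.

```python
import math


def ivw_estimate(bx, by, se_y):
    """Fixed-effect IVW estimate and standard error from summary statistics."""
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]
    ratios = [y / x for x, y in zip(bx, by)]  # per-SNP Wald ratios
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    return beta, math.sqrt(1.0 / total)


def egger_fit(bx, by, se_y):
    """Weighted regression of outcome effects on exposure effects with an intercept."""
    # Orient every SNP so that its exposure effect is positive.
    xs = [abs(x) for x in bx]
    ys = [y if x >= 0 else -y for x, y in zip(bx, by)]
    w = [1.0 / (s * s) for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope  # (intercept, slope)


# Hypothetical summary statistics (not from the study).
bx = [0.02, 0.04, 0.06, 0.08]          # SNP-exposure effects
by = [-0.006, -0.012, -0.018, -0.024]  # SNP-outcome effects (log odds)
se_y = [0.004, 0.004, 0.004, 0.004]

beta, se = ivw_estimate(bx, by, se_y)
odds_ratio = math.exp(beta)
intercept, slope = egger_fit(bx, by, se_y)

# Adding the same direct effect to every SNP mimics directional pleiotropy.
by_pleio = [y + 0.01 for y in by]
beta_pleio, _ = ivw_estimate(bx, by_pleio, se_y)
intercept_pleio, slope_pleio = egger_fit(bx, by_pleio, se_y)

print(f"IVW log-odds {beta:.3f} (SE {se:.3f}), odds ratio {odds_ratio:.3f}")
print(f"Egger intercept {intercept:.4f}, slope {slope:.3f}")
print(f"With pleiotropy: IVW {beta_pleio:.3f}, "
      f"Egger intercept {intercept_pleio:.4f}, slope {slope_pleio:.3f}")
```

In the invented pleiotropic scenario, the IVW estimate shifts toward the null, whereas the Egger slope is unchanged and the intercept absorbs the added direct effect. A nonzero intercept is the usual signal of directional pleiotropy, although in real data the Egger estimate is typically much less precise than IVW.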


Systematic large-scale research on modifiable factors for prevention of depression has thus far been limited. In this study, in over 100,000 individuals for whom genomic and wide-ranging lifestyle and environmental measures were available, we screened more than 100 potentially modifiable factors for their association with incident depression, including among at-risk individuals, and then tested potential causal effects in a Mendelian randomization framework. Our two-stage results prioritize an array of potential targets for prevention—most robustly, social support factors, media use, and circadian habits—with the potential to reduce the risk of depression even in the face of genetic or environmental vulnerability. Not all factors associated with depression in observational research may represent potent targets for prevention. A large-scale systematic approach combined with genetically informed methods for causal inference could help prioritize candidates for multimodal prevention in psychiatry.

Department of Psychiatry, Massachusetts General Hospital, Boston (Choi, Chen, Zheutlin, Dunn, Koenen, Smoller); Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston (Choi, Chen, Zheutlin, Dunn, Koenen, Smoller); Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston (Choi, Nishimi, Koenen, Smoller); Biogen, Cambridge, Mass. (Chen); Departments of Psychiatry and Family Medicine and Public Health, University of California, San Diego, La Jolla (Stein); Social, Genetic, and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology, and Neuroscience, King’s College London (Coleman, Breen); and Department of Epidemiology, Columbia University Mailman School of Public Health, New York (Ratanatharathorn).
Send correspondence to Dr. Choi and Dr. Smoller.

Presented at the 27th annual World Congress of Psychiatric Genetics, Anaheim, Calif., October 26–31, 2019; and the 35th annual meeting of the International Society for Traumatic Stress Studies, Boston, November 14–16, 2019.

The views expressed in this article are those of the authors and not necessarily those of NIH, NHS, NIHR or the U.K. Department of Health and Social Care.

Dr. Stein has served as a consultant for Actelion, Aptinyx, Bionomics, Dart Neuroscience, EpiVario, GW Pharma, GABA Therapeutics, Healthcare Management Technologies, Janssen, Jazz Pharmaceuticals, Neurocrine Biosciences, Oxeia Biopharmaceuticals, and Pfizer. Dr. Zheutlin receives a salary from Sema4, a health intelligence company. Dr. Breen has served on an advisory board for Otsuka Pharmaceutical. Dr. Smoller is an unpaid member of the Bipolar/Depression Research Community Advisory Panel of 23andMe and a member of the Leon Levy Foundation Neuroscience Advisory Board, and he has received an honorarium for an internal seminar at Biogen, Inc. The other authors report no financial relationships with commercial interests.

This research was conducted using the UK Biobank resource under an approved data request (no. 32568). This work involved the use of the Enterprise Research Infrastructure and Services at Partners HealthCare. Dr. Choi was supported in part by an NIMH T32 Training Fellowship (T32MH017119). Dr. Smoller is a Tepper Family MGH Research Scholar and is supported in part by the Demarest Lloyd, Jr., Foundation. Dr. Ge is supported in part by National Institute on Aging grant K99AG054573. Drs. Coleman and Breen are funded partly by the U.K. National Institute for Health Research (NIHR) and partly by a grant from Cohen Veterans Bioscience. This study represents independent research funded in part by the NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London.

The authors thank the research participants and employees of 23andMe for making this work possible. The following members of the 23andMe Research Team contributed to this study: Michelle Agee, Babak Alipanahi, Adam Auton, Robert K. Bell, Katarzyna Bryc, Sarah L. Elson, Pierre Fontanillas, Nicholas A. Furlotte, Barry Hicks, David A. Hinds, Karen E. Huber, Ethan M. Jewett, Yunxuan Jiang, Aaron Kleinman, Keng-Han Lin, Nadia K. Litterman, Jennifer C. McCreight, Matthew H. McIntyre, Kimberly F. McManus, Joanna L. Mountain, Elizabeth S. Noblin, Carrie A.M. Northover, Steven J. Pitts, G. David Poznik, J. Fah Sathirapongsasuti, Janie F. Shelton, Suyash Shringarpure, Chao Tian, Joyce Y. Tung, Vladimir Vacic, Xin Wang, and Catherine H. Wilson.


1 World Health Organization: Fact sheets: Depression. January 2020.

2 Schuch FB, Vancampfort D, Firth J, et al.: Physical activity and incident depression: a meta-analysis of prospective cohort studies. Am J Psychiatry 2018; 175:631–648

3 Choi KW, Chen C-Y, Stein MB, et al.: Assessment of bidirectional relationships between physical activity and depression among adults: a 2-sample Mendelian randomization study. JAMA Psychiatry 2019; 76:399–408

4 Santini ZI, Koyanagi A, Tyrovolas S, et al.: The association between social relationships and depression: a systematic review. J Affect Disord 2015; 175:53–65

5 Visscher PM, Wray NR, Zhang Q, et al.: 10 Years of GWAS discovery: biology, function, and translation. Am J Hum Genet 2017; 101:5–22

6 Denny JC, Bastarache L, Roden DM: Phenome-wide association studies as a tool to advance precision medicine. Annu Rev Genomics Hum Genet 2016; 17:353–373

7 Ioannidis JPA: Neglecting major health problems and broadcasting minor, uncertain issues in lifestyle science. JAMA 2019; 1:1–2

8 Sudlow C, Gallacher J, Allen N, et al.: UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 2015; 12:e1001779

9 Sullivan PF, Neale MC, Kendler KS: Genetic epidemiology of major depression: review and meta-analysis. Am J Psychiatry 2000; 157:1552–1562

10 Köhler CA, Evangelou E, Stubbs B, et al.: Mapping risk factors for depression across the lifespan: an umbrella review of evidence from meta-analyses and Mendelian randomization studies. J Psychiatr Res 2018; 103:189–207

11 Patel V, Goodman A: Researching protective and promotive factors in mental health. Int J Epidemiol 2007; 36:703–707

12 Wray NR, Ripke S, Mattheisen M, et al.: Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat Genet 2018; 50:668–681

13 Wray NR, Lee SH, Mehta D, et al.: Research review: polygenic methods and their application to psychiatric traits. J Child Psychol Psychiatry 2014; 55:1068–1087

14 McIntosh AM, Sullivan PF, Lewis CM: Uncovering the genetic architecture of major depression. Neuron 2019; 102:91–103

15 Palk AC, Dalvie S, de Vries J, et al.: Potential use of clinical polygenic risk scores in psychiatry: ethical implications and communicating high polygenic risk. Philos Ethics Humanit Med 2019; 14:4

16 Tennant C: Life events, stress, and depression: a review of recent findings. Aust N Z J Psychiatry 2002; 36:173–182

17 Prendes-Alvarez S, Nemeroff CB: Personalized medicine: prediction of disease vulnerability in mood disorders. Neurosci Lett 2018; 669:10–13

18 Choi KW, Stein MB, Dunn EC, et al.: Genomics and psychological resilience: a research agenda. Mol Psychiatry 2019; 24:1770–1778

19 Byrne EM, Yang J, Wray NR: Inference in psychiatry via 2-sample Mendelian randomization: from association to causal pathway? JAMA Psychiatry 2017; 74:1191–1192

20 Munafò MR, Davey Smith G: Robust research needs many lines of evidence. Nature 2018; 553:399–401

21 Arroll B, Goodyear-Smith F, Crengle S, et al.: Validation of PHQ-2 and PHQ-9 to screen for major depression in the primary care population. Ann Fam Med 2010; 8:348–353

22 Kroenke K, Spitzer RL, Williams JB: The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16:606–613

23 Manea L, Gilbody S, McMillan D: Optimal cut-off score for diagnosing depression with the Patient Health Questionnaire (PHQ-9): a meta-analysis. CMAJ 2012; 184:E191–E196

24 Ge T, Chen C-Y, Ni Y, et al.: Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 2019; 10:1776

25 Watanabe K, Stringer S, Frei O, et al.: A global overview of pleiotropy and genetic architecture in complex traits. Nat Genet 2019; 51:1339–1348

26 Hemani G, Zheng J, Elsworth B, et al.: The MR-Base platform supports systematic causal inference across the human phenome. eLife 2018; 7:e34408

27 Verbanck M, Chen C-Y, Neale B, et al.: Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet 2018; 50:693–698

28 Hemani G, Bowden J, Davey Smith G: Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum Mol Genet 2018; 27(R2):R195–R208

29 Bowden J, Davey Smith G, Haycock PC, et al.: Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol 2016; 40:304–314

30 Bowden J, Davey Smith G, Burgess S: Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol 2015; 44:512–525

31 Burgess S, Thompson SG: Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol 2017; 32:377–389

32 Sjöholm L, Lavebratt C, Forsell Y: A multifactorial developmental model for the etiology of major depression in a population-based sample. J Affect Disord 2009; 113:66–76

33 Choi KW, Chen C-Y, Ursano RJ, et al.: Prospective study of polygenic risk, protective factors, and incident depression following combat deployment in US Army soldiers. Psychol Med 2019; 50:737–745

34 Drinkwater C, Wildman J, Moffatt S: Social prescribing. BMJ 2019; 364:l1285

35 de Wit L, van Straten A, Lamers F, et al.: Are sedentary television watching and computer use behaviors associated with anxiety and depressive disorders? Psychiatry Res 2011; 186:239–243

36 Teychenne M, Ball K, Salmon J: Sedentary behavior and depression among adults: a review. Int J Behav Med 2010; 17:246–254

37 Firth J, Siddiqi N, Koyanagi A, et al.: The Lancet Psychiatry Commission: a blueprint for protecting physical health in people with mental illness. Lancet Psychiatry 2019; 6:675–712

38 Zhai L, Zhang H, Zhang D: Sleep duration and depression among adults: a meta-analysis of prospective studies. Depress Anxiety 2015; 32:664–670

39 Jones SE, van Hees VT, Mazzotti DR, et al.: Genetic studies of accelerometer-based sleep measures yield new insights into human sleep behaviour. Nat Commun 2019; 10:1585

40 Engemann K, Pedersen CB, Arge L, et al.: Residential green space in childhood is associated with lower risk of psychiatric disorders from adolescence into adulthood. Proc Natl Acad Sci USA 2019; 116:5188–5193

41 Astell-Burt T, Feng X: Association of urban green space with mental health and general health among adults in Australia. JAMA Netw Open 2019; 2:e198209

42 Firth J, Teasdale SB, Allott K, et al.: The efficacy and safety of nutrient supplements in the treatment of mental disorders: a meta-review of meta-analyses of randomized controlled trials. World Psychiatry 2019; 18:308–324

43 Long S-J, Benton D: Effects of vitamin and mineral supplementation on stress, mild psychiatric symptoms, and mood in nonclinical samples: a meta-analysis. Psychosom Med 2013; 75:144–153

44 Bot M, Brouwer IA, Roca M, et al.: Effect of multinutrient supplementation and food-related behavioral activation therapy on prevention of major depressive disorder among overweight or obese adults with subsyndromal depressive symptoms: the MooDFOOD randomized clinical trial. JAMA 2019; 321:858–868

45 Klimentidis YC, Raichlen DA, Bea J, et al.: Genome-wide association study of habitual physical activity in over 377,000 UK Biobank participants identifies multiple variants including CADM2 and APOE. Int J Obes 2018; 42:1161–1176

46 Choi KW, Zheutlin AB, Karlson RA, et al.: Physical activity offsets genetic risk for incident depression assessed via electronic health records in a biobank cohort study. Depress Anxiety 2020; 37:106–114

47 Adams MJ, Hill WD, Howard DM, et al.: Factors associated with sharing e-mail information and mental health survey participation in large population cohorts. Int J Epidemiol 2020; 49:410–421