A central question in the etiology of drug abuse is the extent to which the risk factors for the use or misuse of a particular class of psychoactive substances are specific to that class or are nonspecific in that they predispose the individual to the use or misuse of a wide range of such compounds (1). Because psychoactive substance abuse is strongly familial (2–7), the analysis of the pattern of drug use and abuse of multiple substances in family or twin samples can provide particular insight into this issue (7–9). To further clarify this question, we used multivariate twin analysis to examine the genetic and environmental sources of comorbidity for the use and abuse/dependence of six classes of commonly used illicit psychoactive substances in 1,196 male-male twin pairs from the Virginia Twin Registry.
Sample and Assessment Procedures
This report is based on data collected in the second wave of interviews in a study of adult male twins from the Virginia Twin Registry, details of which have been outlined previously (10). Briefly, twins were eligible for participation in this study if one or both twins were successfully matched to birth records, were a member of a multiple birth with at least one male, were Caucasian, and were born between 1940 and 1974. Of 9,417 eligible individuals for the first wave, 6,814 (72.4%) completed the initial interviews. At least 1 year later, we contacted those who had completed the initial interview to schedule a second-wave interview. The second-wave interview was completed for 5,629 (82.6%) of those who had completed the first interview. Where possible, this interview was completed face-to-face (for 79.4% of sample). After a full explanation of the research protocol, signed informed consent was obtained before all face-to-face interviews and verbal assent before all telephone interviews.
The current report is based on 1,196 male-male pairs (704 monozygotic and 492 dizygotic pairs) with complete data on substance use and abuse/dependence. To reduce the complexity of the analyses, we randomly excluded one member of five all-male triplet sets and excluded data from 540 twins whose co-twins did not have valid data. Of these 540 individuals, 318 were twins whose co-twins had never participated in the study and 222 had co-twins who participated at wave 1 but did not have valid drug data at wave 2.
At the second-wave interview (1994–1998), subjects were an average age of 36.6 years (SD=9.07, range=20–58 years) and had a mean of 13.6 years of education (SD=2.64). Interviewers had a master’s degree in a mental health-related field or a bachelor’s degree in this area plus 2 years of clinical experience. The two members of a twin pair were each interviewed by different interviewers who were blind to clinical information about the co-twin. Zygosity diagnosis was performed with a discriminant function analysis on the basis of six standard zygosity questions. The algorithm was developed with data from 227 twin pairs genotyped with eight or more highly polymorphic DNA markers (10).
Lifetime use, abuse, and dependence were assessed separately for cannabis, cocaine, hallucinogens, sedatives, stimulants, opiates, inhalants, and "over-the-counter" medications by using an adaptation of the Structured Clinical Interview for DSM-III-R—Patient Version (11). For substances that could be legally obtained, we emphasized that our interest was solely in nonmedical use, defined as use 1) without a doctor’s prescription, 2) in greater amounts or more often than prescribed, or 3) for any other reason than a doctor said it should be taken. Use and misuse of inhalants and over-the-counter medications were too rare to analyze usefully. Drug abuse and dependence were diagnosed by using DSM-IV criteria. We reported previously on the high test-retest reliability of our assessment of drug use, abuse, and dependence and the similarity of our prevalence estimates to those reported from other large U.S. samples (10).
We reviewed elsewhere our approach to multivariate twin analysis (12, 13). We assumed a liability-threshold model, the strengths and limitations of which have been outlined previously (13, 14). Like traditional factor analysis, multivariate genetic analysis seeks to explain covariation among multiple variables (as manifested, for example, in patterns of comorbidity across multiple disorders) with a small number of factors. However, while exploratory factor analysis is purely descriptive, multivariate genetic analysis provides insight into the sources of resemblance.
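As a rough illustration of the liability-threshold idea, a binary outcome such as lifetime use can be treated as an unobserved, normally distributed liability that exceeds a threshold. The sketch below assumes a standard normal liability and converts a prevalence into the implied threshold. The function name and the prevalence values are hypothetical illustrations, not figures from this study.

```python
from statistics import NormalDist

def liability_threshold(prevalence: float) -> float:
    """Return the z-score threshold above which a standard-normal
    liability produces the given lifetime prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    # P(liability > threshold) = prevalence, so invert the CDF at 1 - prevalence.
    return NormalDist().inv_cdf(1.0 - prevalence)

# Hypothetical prevalences (illustrative only, not taken from the study tables).
for label, prev in [("common outcome", 0.50), ("rare outcome", 0.05)]:
    print(f"{label}: prevalence={prev:.2f}, threshold={liability_threshold(prev):.2f}")
```

Rarer outcomes correspond to higher thresholds on the liability scale, which is why a single latent liability can describe substances with very different prevalences.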
Using the software package Mx (15), we fitted models by the method of maximum likelihood to data from individual twins in complete pairs. We examined possible biases introduced by this approach by comparing twins from complete versus incomplete pairs. Compared to unpaired twins, those from complete pairs were younger (mean=36.3, SD=9.1, versus mean=39.4, SD=9.5) (F=25.53, df=1, 2930, p<0.001), had more years of education (mean=13.6, SD=2.6, versus mean=12.7, SD=2.8) (F=50.10, df=1, 2927, p<0.001), and were more likely to be monozygotic (χ²=25.08, df=1, p<0.001). However, on the 12 measures of use and abuse/dependence for six substances, unpaired twins had a significantly higher prevalence only for sedative use, a pattern consistent with chance effects (16).
For these analyses, we utilized primarily independent pathway models, our full model containing substance-nonspecific (common) factors (two factors for additive genetic effects, two for shared environmental effects, and two for unique environmental effects) and substance-specific factors (for additive genetic, shared environmental, and unique environmental effects). Two common factors were used for each type of effect to permit testing for independent genetic or environmental effects while allowing the models to converge within realistic time limits.
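The structure of an independent pathway model can be sketched as follows. Each source of variance (additive genetic A, shared environmental C, unique environmental E) contributes a covariance matrix built from its common-factor loadings plus squared substance-specific loadings on the diagonal. The cross-twin covariance is then A + C for monozygotic pairs and 0.5 A + C for dizygotic pairs. This is a minimal sketch on the liability scale with two substances and hypothetical loadings, not the authors' Mx script or their estimates.

```python
import math

def component_cov(common, specific):
    """Covariance from one source (A, C, or E): the sum over common factors
    of the loading outer products, plus squared specific loadings on the
    diagonal. `common` holds one list of loadings per common factor."""
    n = len(specific)
    cov = [[0.0] * n for _ in range(n)]
    for loadings in common:
        for i in range(n):
            for j in range(n):
                cov[i][j] += loadings[i] * loadings[j]
    for i in range(n):
        cov[i][i] += specific[i] ** 2
    return cov

def cross_twin_cov(A, C, zygosity):
    """Expected covariance between twin 1 and twin 2 liabilities."""
    if zygosity not in ("MZ", "DZ"):
        raise ValueError("zygosity must be 'MZ' or 'DZ'")
    g = 1.0 if zygosity == "MZ" else 0.5  # genetic resemblance of co-twins
    n = len(A)
    return [[g * A[i][j] + C[i][j] for j in range(n)] for i in range(n)]

# Hypothetical loadings for two substances, chosen so total variance is 1.
A = component_cov(common=[[0.7, 0.6]], specific=[0.3, 0.2])
C = component_cov(common=[[0.4, 0.5]], specific=[0.0, 0.0])
E = component_cov(common=[[0.3, 0.3]],
                  specific=[math.sqrt(0.17), math.sqrt(0.26)])

total_var = [A[i][i] + C[i][i] + E[i][i] for i in range(2)]
mz = cross_twin_cov(A, C, "MZ")
dz = cross_twin_cov(A, C, "DZ")
print("total variances:", [round(v, 2) for v in total_var])
print("MZ cross-twin:", [[round(v, 2) for v in row] for row in mz])
print("DZ cross-twin:", [[round(v, 2) for v in row] for row in dz])
```

The off-diagonal entries of the cross-twin matrices (twin 1's liability for one substance with twin 2's liability for another) are what allow genetic and shared environmental sources of comorbidity to be separated.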
We then attempted to simplify the full model, first by reducing the number of common factors and then by eliminating the substance-specific genetic and shared environmental factors. We did not eliminate the substance-specific unique environmental factors because to do so assumes that the substance-related disorders were assessed without error. Twice the difference in log likelihood between the models yields a statistic that is asymptotically distributed as chi-square with degrees of freedom equal to the difference in their number of parameters. We used Akaike’s information criterion (17, 18) for our model selection; the lower the value, the better the balance between explanatory power and parsimony. After model fitting, if more than one factor was identified, factor loadings were reestimated by using a varimax rotation as operationalized in the SAS routine PROC FACTOR (19).
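The two comparison tools described above can be sketched in a few lines. The example uses the textbook definition AIC = -2 ln L + 2k. Software may report Akaike's information criterion on a chi-square-based scale instead, which shifts every model by the same constant and so leaves the ranking unchanged. The log likelihoods and parameter counts below are hypothetical.

```python
def likelihood_ratio_test(loglik_full, k_full, loglik_reduced, k_reduced):
    """Return (statistic, df) for a nested comparison: twice the difference
    in log likelihood, with df equal to the difference in parameter count."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    df = k_full - k_reduced
    return statistic, df

def aic(loglik, k):
    """Akaike's information criterion; lower values indicate a better
    balance between fit and parsimony."""
    return -2.0 * loglik + 2.0 * k

# Hypothetical (log likelihood, number of parameters) for two nested models.
models = {"full": (-5000.0, 60), "reduced": (-5003.0, 54)}

stat, df = likelihood_ratio_test(*models["full"], *models["reduced"])
best = min(models, key=lambda name: aic(*models[name]))
print(f"chi-square difference = {stat:.2f} on {df} df")
print(f"model with lowest AIC: {best}")
```

In this made-up case, dropping six parameters costs only 6.0 chi-square units, so the reduced model has the lower criterion value and would be preferred.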
Prevalences and Phenotypic Correlations
Table 1 shows the prevalence of use and abuse/dependence for each of the six substance types in the study sample. Both use and abuse/dependence were most common for cannabis and rarest for opiates. Dizygotic twins had slightly higher prevalences than monozygotic twins for nearly all measures. We tested the significance of these differences in our multivariate models. Allowing thresholds to differ by zygosity did not significantly improve the fit over a model that constrained the thresholds to be equal in the two twin types (Δχ²=8.72, df=6, p=0.18, for use; Δχ²=12.04, df=6, p=0.06, for abuse/dependence). Therefore, we used equal thresholds for all subsequent models.
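The p values for the threshold-equality tests can be checked approximately from the reported chi-square differences. The sketch below uses the closed-form upper-tail probability of the chi-square distribution, which is valid only for even degrees of freedom. The helper name is an assumption for illustration, and small discrepancies from the reported p values can arise from rounding of the test statistics.

```python
import math

def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with an
    even number of degrees of freedom, using the closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

# Chi-square differences reported for the tests of equal thresholds.
p_use = chi2_upper_tail_even_df(8.72, 6)
p_abuse = chi2_upper_tail_even_df(12.04, 6)
print(f"use: p = {p_use:.3f}; abuse/dependence: p = {p_abuse:.3f}")
```

Both values exceed 0.05, consistent with the decision to constrain thresholds to be equal across zygosity groups.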
Table 2 shows the pattern of resemblance for use and abuse/dependence of the six substance classes (above and below the diagonal, respectively), expressed by tetrachoric correlations (odds ratios are available from the authors). The correlations are high (with a mean of 0.78 for use and 0.77 for abuse/dependence) and relatively uniform, ranging from 0.60 to 0.85 for use and from 0.67 to 0.85 for abuse/dependence.
Model Fitting—Substance Use
We outline the results of model fitting in some detail for substance use and more briefly for abuse/dependence. The full model (model I in Table 3) includes, for each of the three sources of variance in liability (genetic, shared environmental, and unique environmental), two common factors (which potentially influence risk for use of all substances) and a factor specific for each individual substance. For simplicity, we refer to this as a 2-2-2 model where the digits indicate, respectively, the number of genetic, shared environmental, and unique environmental common factors. Our attempt to simplify this full model, details of which are outlined in Table 3, began by reducing the number of common factors and then examining evidence for disorder-specific factors.
In models II through IV, we reduced the number of common factors from two to one for genes (model II—a 1-2-2 model), shared environment (model III—a 2-1-2 model), and unique environment (model IV—a 2-2-1 model). Of these, model II provided the best fit.
We then simplified model II by reducing the common factors from one to none for genes (model V—a 0-2-2 model), from two to one for shared environment (model VI—a 1-1-2 model), and from two to one for unique environment (model VII—a 1-2-1 model). Only model VI produced a further improvement in Akaike’s information criterion.
Working from model VI, we further simplified the model by reducing the common factors from one to none for genes (model VIII, a 0-1-2 model), from one to none for shared environment (model IX, a 1-0-2 model), and from two to one for unique environment (model X, a 1-1-1 model). None of these models produced any improvement in Akaike’s information criterion.
Working still from model VI, we set to zero the genetic specific loadings (model XI), the shared environmental specific loadings (model XII), and both sets of specific loadings (model XIII). Model XII provided the lowest Akaike’s information criterion, and we selected it as the best-fit model.
Parameter Estimates—Substance Use
As depicted in Figure 1 and Table 4, the best-fit model for substance use was a 1-1-2 model with genetic and unique environmental specific loadings. The loadings of the individual substances on the genetic common factor were similar, ranging from 0.58 for cannabis to 0.74 for stimulants. These loadings were substantially greater than those estimated for the substance-specific factors, which ranged from zero for opiates to 0.32 for hallucinogens. The proportion of the genetic variation that came from the genetic common factor averaged 91% and ranged from 82% (for hallucinogens) to 100% (for opiates).
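The proportions quoted above follow from squaring the standardized path coefficients. A minimal sketch of that arithmetic is shown below. The specific loadings of 0.32 and zero are the reported extremes, while the common-factor loading of 0.68 is an assumed value within the reported range, not a figure taken from the study tables.

```python
def share_from_common(common_loadings, specific_loading):
    """Proportion of one source of variance (e.g., genetic) that comes from
    the common factor(s): squared common loadings divided by the sum of
    squared common and squared specific loadings."""
    common_var = sum(loading ** 2 for loading in common_loadings)
    total_var = common_var + specific_loading ** 2
    if total_var == 0.0:
        raise ValueError("no variance from this source")
    return common_var / total_var

# Assumed common loading of 0.68 with the reported specific loading of 0.32.
with_specific = share_from_common([0.68], 0.32)
# A specific loading of zero means all variance comes from the common factor.
no_specific = share_from_common([0.68], 0.0)
print(f"specific loading 0.32: {with_specific:.0%} from the common factor")
print(f"specific loading 0.00: {no_specific:.0%} from the common factor")
```

The same function accepts two common loadings, which is how the share due to the two unique environmental common factors would be computed.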
Loadings of the individual substances on the shared environmental common factor were variable—ranging from 0.26 for sedatives to 0.59 for cannabis. In the best-fit model, all shared environmental effects were mediated through the common factor. The best-fit model contained two unique environmental common factors. The first had high loadings on sedatives and opiates and the second on cannabis and cocaine. The variance accounted for by the unique environmental substance-specific factors was generally modest. The proportion of the unique environmental effect on the illicit use of these substances that was due to the two common factors ranged from 52% for stimulants to 99% for cannabis.
Model Fitting—Substance Abuse/Dependence
As outlined in Table 5, working from the full or 2-2-2 model, improvements in Akaike’s information criterion were seen with the reduction from two to one of the number of genetic common factors (model II), the number of unique environmental common factors (model VII), and the number of shared environmental common factors (model IX). No further simplification of the common factor structure was possible (models XI–XIII). The best-fit model was obtained by beginning with model IX and then setting the substance-specific genetic and substance-specific shared environmental factors to zero (model XVI).
Parameter Estimates—Substance Abuse/Dependence
As depicted in Figure 2 and Table 4, the best-fit model for substance abuse/dependence was a 1-1-1 model with unique environmental specific loadings. All of the genetic influences on abuse/dependence came from the genetic common factor. For five of the six substance classes, these loadings were high and relatively uniform, ranging from 0.72 for sedatives to 0.86 for cannabis. For opiates, the loading on the genetic common factor was considerably lower (0.48).
Shared environmental effects also came solely from one common factor, with very low loadings for cannabis, cocaine, and opiates; a modestly positive loading for hallucinogens; and modestly negative loadings for sedatives and stimulants. The single unique environmental common factor had a very high loading on opiates (0.80) and moderate loadings, ranging from 0.30 to 0.58, on the other substance classes. The proportion of the total unique environmental effect on abuse/dependence of these substances that was due to the common factor ranged from 35% for cannabis to 82% for opiates.
Multivariate models allow us to divide the causes of comorbidity between disorders into those due to genetic, shared environmental, and unique environmental components. Figure 3 illustrates such results for substance abuse/dependence. Except for relationships involving opiates, our best-fit model suggests that a substantial majority of the observed comorbidity is due to genetic effects. For example, the phenotypic correlation between cannabis and cocaine abuse/dependence is estimated at 0.80. This phenotypic correlation can be decomposed into 0.68, 0.01, and 0.11 due to the effects of, respectively, genetic, shared environmental, and unique environmental risk factors that affect liability to both disorders. Our model predicts that 0.68/0.80 or 85% of the comorbidity between cannabis and cocaine abuse/dependence results from genes that predispose individuals to both conditions.
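The cannabis and cocaine example can be reproduced with simple arithmetic. The sketch below takes the three reported components of the phenotypic correlation and expresses each as a share of the total; the variable names are illustrative only.

```python
# Components of the cannabis-cocaine abuse/dependence correlation,
# as reported in the text for the best-fit model.
components = {
    "genetic": 0.68,
    "shared environmental": 0.01,
    "unique environmental": 0.11,
}

# The model-implied phenotypic correlation is the sum of the components.
phenotypic_r = sum(components.values())

# Share of the comorbidity attributable to each source.
shares = {name: value / phenotypic_r for name, value in components.items()}

print(f"phenotypic correlation = {phenotypic_r:.2f}")
for name, share in shares.items():
    print(f"{name}: {components[name]:.2f} ({share:.0%} of the correlation)")
```

In a one-common-factor model, each component is itself the product of the two substances' loadings on the corresponding common factor, so the shares depend only on the common-factor loadings and not on the substance-specific ones.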
Our goal was to determine, in a population-based sample of male-male twin pairs, the level of specificity of the genetic and environmental risk factors for the use and the abuse/dependence of common classes of illicit psychoactive substances. Although the answers differed slightly for use and for abuse/dependence, the overall pattern was similar. The genetic and shared environmental factors that affect risk for use and misuse of these six classes of illicit substances are largely or entirely nonspecific in their effect.
The specificity of the familial risk factors for drug use or abuse has been examined in four prior large-scale studies. First, using similar methods, we examined the factors influencing illicit use of these same substance categories in the female-female pairs from the Virginia Twin Registry (9). The best-fit model in that analysis, identical to the best-fit model for abuse/dependence in this paper, found that both the genetic and the shared environmental risk factors were entirely nonspecific in their effect. Second and third were two large-scale family studies with probands found through the Collaborative Study on the Genetics of Alcoholism (7) and drug treatment clinics (6). While the Collaborative Study on the Genetics of Alcoholism found substantial specificity of familial transmission of marijuana and cocaine dependence (7), the other large family study found more evidence for nonspecificity of familial effects with elevated rates of both "soft" and "hard" drug use in the relatives of probands with opiate, cocaine, and cannabis dependence (6). Fourth, the one prior multivariate twin study of abuse/dependence was carried out in the Vietnam Era Twin Registry (8). The substance classes examined in that study were similar to those in our study, although stimulants and cocaine were combined into one category. The researchers in the prior study did not test two-factor models nor try to simplify their one-factor model by eliminating substance-specific genetic or shared environmental factors. Contrary to the present analyses, they found that a one-factor common pathway model fit best. Despite these differences, in at least two important ways, their results were similar to ours. First, for four of their five substance classes, a majority of the genetic variance came from the common factor, ranging from 67% for cannabis to 100% for psychedelics. 
Second, for all five of the substance categories, the large majority of the familial environmental effects came from the common factor.
Our results therefore are broadly congruent with those of the two previous twin studies (8, 9) and one of the two family studies (6) in suggesting that most of the genetic and shared environmental risk factors for illicit substance use and substance abuse/dependence are nonspecific. The one exception to this trend was the finding in the Vietnam era study (8)—in marked contrast to the present investigation—that most of the genetic effects on heroin abuse were substance-specific. This difference could result from two features of the Vietnam twin cohort: 1) obligatory service in the military, with many seeing service in Southeast Asia, where heroin was widely available and 2) first exposure to drugs in the unique historical period of the late 1960s and early 1970s. The discrepancy between our findings and those of the Collaborative Study on the Genetics of Alcoholism (7) is both larger and more perplexing. We found a strong genetic correlation between cannabis and cocaine abuse/dependence, while they found no evidence of familial coaggregation of the two syndromes. Perhaps it is of significance that our sample was ascertained through birth certificates, while the families in their study were found through patients’ being treated for alcohol dependence.
Genetic Effects on Drug Use and Abuse/Dependence
For both use and abuse/dependence, we found evidence for only a single genetic common factor. We predicted, for example, based on the similar mode of action of cocaine and stimulants (20), that liability to cocaine and stimulant use and abuse would be influenced by genetic factors that were unique to these two substance classes. However, we found no evidence for this. Indeed, for abuse/dependence, we found no evidence for any substance-specific genetic effects. Put more concretely, we could not find evidence for genetic factors that increase risk for individuals to abuse substance A and not also to abuse substances B, C, and D.
If true, these results have substantial implications for the nature of the genetic vulnerability to illicit psychoactive substance abuse (21). Interindividual differences in this vulnerability appear to be due largely or entirely to factors that increase or decrease risk for the abuse of all substance classes. In general population samples, genetic variation in biological systems that affect the action of only one or a small number of substance classes (e.g., specific drug receptor sites) does not appear to be an important source of differences in vulnerability.
Although twin studies such as this one are not designed to help localize susceptibility genes on the human genome, these findings do have implications for such research. In particular, they suggest that the search for genetic variants affecting human drug abuse should include systems that have a wide range of action across many substance classes. For example, genetic variation in personality or in the liability to externalizing disorders may influence risk for the use and abuse of most psychoactive compounds (22, 23). Alternatively, genetic variation may exist in biological systems that are activated by most or all substances of abuse (24).
Environmental Effects on Drug Use and Abuse/Dependence
In contrast to the findings of studies examining more typical psychiatric disorders such as major depression or the anxiety disorders, most twin studies of drug use and some of drug abuse have found evidence for the etiologic importance of shared environmental experiences (e.g., references 8, 25, 26). Our analyses provide insight into the nature of those experiences. We found evidence for a shared environmental common factor but no substance-specific shared environmental effects for both substance use and abuse/dependence, although these effects were considerably stronger for substance use. These findings suggest that, like genetic factors, shared environmental effects are acting in a nonspecific manner on the risk for drug use and abuse/dependence. For example, our findings would not be consistent with a social learning model in which parental or peer use of cannabis increased the risk in individuals for the specific use of cannabis but not of other illicit psychoactive substances.
An unexpected result for abuse/dependence was that loadings on the shared environmental common factor were positive for four substance categories but negative for sedatives and stimulants. Follow-up analyses indicated that these effects were not statistically significant. Constraining to zero the path from the shared environmental common factor to hallucinogens produced an improvement in Akaike’s information criterion and positive loadings for all other substances.
For drug use and drug abuse/dependence, risk factors specific to individual or small groups of substances came solely from environmental experiences not shared with the co-twin. For drug use, one set of unique environmental risk factors predisposed to sedative and opiate use and another to cannabis and cocaine. Modest substance-specific unique environmental risk factors were also seen. By contrast, for a twin with a high liability to drug abuse or dependence, our results suggest that which substance classes are abused is determined entirely by substance-specific environmental risks not shared with the co-twin.
These results need to be interpreted in the context of 11 possible methodologic limitations. First, this sample was restricted to white males born in Virginia. Although their rates of substance use and dependence were typical for other U.S. populations (10), these findings may not be generalizable. Second, diagnostic assessments were done at a single interview and include error variance (27). In multivariate models, measurement error is largely confounded with true disorder-specific unique environmental effects and produces downward biases on other parameter estimates. Third, parameter estimates from structural equation modeling should ideally be presented with confidence intervals. However, the added computational burden required to estimate the confidence intervals would have rendered these analyses unfeasible. Fourth, our twin model assumed that comorbidity results from the effect of latent genetic and environmental risk factors. Other models of comorbidity are possible (28) but were not examined here. Fifth, these models assumed that exposure to environmental factors that influence twin similarity for substance use and abuse/dependence is approximately equal in monozygotic and dizygotic pairs. We examined that assumption previously and found it to be supported (10). Sixth, although drug abuse/dependence is a conditional process that requires prior initiation (29), this conditionality was not incorporated into current modeling. Seventh, our analyses did not include cohort effects that could be confounded with estimates of shared environment. Eighth, our analyses examined independent pathway models, although Tsuang et al. (8), using only one-factor models, concluded that a common pathway model was superior. We repeated their analyses in our data for drug abuse/dependence. In our data, the 1-1-1 independent pathway model provided a much better (lower) Akaike’s information criterion (–209.91) than did a one-factor common pathway model (–185.29).
Ninth, for substance use, the Akaike’s information criterion value for our model XI was very close to that of the best-fit model XII. Overall, the parameter estimates were quite similar for the two models, with most of the shared environmental variance for model XI coming from the common factor. Tenth, with low-powered studies, best-fit models can substantially distort the true pattern of findings (30). However, an examination of the parameter estimates for the full models from our two analyses indicated that this was not the case here. Finally, we examined abuse and dependence together because the higher prevalence rate produced more stable and robust parameter estimates. However, since drug abuse is a broadly defined syndrome, we fit the same multivariate models to our data using dependence alone. The best-fit model (a 1-0-1 model with only unique environmental specific loadings) was similar to that seen in Figure 2, albeit missing the shared environmental common factor. The loadings of these six substances on the genetic common factor were broadly similar in magnitude to those seen with abuse/dependence, being again lowest for opiates (0.61) and ranging from 0.68 to 0.83 for the remaining substances. Our conclusion about the nonspecificity of genetic risk factors applies also to more narrowly defined drug dependence.
Received Sept. 10, 2002; revision received Dec. 8, 2002; accepted Jan. 22, 2003. From the Virginia Institute for Psychiatric and Behavioral Genetics and the Departments of Psychiatry and Human Genetics, Medical College of Virginia of Virginia Commonwealth University. Address reprint requests to Dr. Kendler, Department of Psychiatry, Medical College of Virginia of Virginia Commonwealth University, P.O. Box 980126, Richmond, VA 23298-0126; email@example.com (e-mail). Supported by NIH grants DA-11287, MH/AA/DA-49492, MH-01458, and AA-00236. The authors thank Indrani Ray and Steven Aggen, Ph.D., for database assistance and acknowledge the contribution of the Virginia Twin Registry, now part of the Mid-Atlantic Twin Registry, to ascertainment of subjects for this study. The Mid-Atlantic Twin Registry, directed by Dr. L. Corey, has received support from NIH, the Carman Trust, and the W.M. Keck, John Templeton, and Robert Wood Johnson Foundations.
Figure 1. Best-Fit Model for the Lifetime Use of Six Illicit Substance Classes by Monozygotic and Dizygotic Twins (N=2,392 Individuals) From a Population-Based Registry [a]
[a] Latent variables (all of which have variance of 1.0) are depicted in circles, and observed variables (types of substance use) are depicted in rectangles. The path coefficients represent standardized partial regression coefficients, so they must be squared to equal the amount of variance in the dependent (downstream) variable that is accounted for by the independent (upstream) variable.
Figure 2. Best-Fit Model for the Lifetime Abuse/Dependence of Six Illicit Substance Classes by Monozygotic and Dizygotic Twins (N=2,392 Individuals) From a Population-Based Registry [a]
[a] Latent variables (all of which have variance of 1.0) are depicted in circles, and observed variables (types of substance abuse/dependence) are depicted in rectangles. The path coefficients represent standardized partial regression coefficients, so they must be squared to equal the amount of variance in the dependent (downstream) variable that is accounted for by the independent (upstream) variable.
Figure 3. Sources of Correlation Predicted by the Best-Fit Model for Liability to Lifetime Use and Abuse/Dependence of Pairs of Illicit Substance Classes by Monozygotic and Dizygotic Twins (N=2,392 Individuals) From a Population-Based Registry