Major depression occurs in at least 1% to 3% of the general elderly population (1–3), and an additional 8% to 16% have clinically significant depressive symptoms (1, 4). In primary care settings, the prevalence of depressive disorders is 5% to 17% (1, 5); the prevalence of depressive symptoms is 11% to 29% (6, 7); one longitudinal study estimated the 9-month incidence rate of depressive symptoms to be 11.7% (6).
Studies of depressed adults report that those with depressive symptoms, with or without depressive disorder, have poorer functioning than nondepressed adults (8–10), and their functioning is comparable to or worse than that of adults with chronic medical conditions such as heart and lung disease, arthritis, hypertension, and diabetes (11, 12). In addition to poor functioning, depression increases the perception of poor health (11), the utilization of medical services (13), and health care costs (14, 15).
These findings suggest that depression in elderly community and primary care populations is a serious problem. Yet, probably less than 20% of depressed elders in these settings are detected or adequately treated (2). In fact, underdetection and undertreatment of depression have been recognized as such important problems that they have become targets of a campaign by geriatric psychiatrists to improve detection and treatment (16).
Fundamental to understanding the potential impact of detection and treatment strategies is a knowledge of the course and outcome of depression in these elderly populations. Thus, the purpose of this study was to determine the prognosis of depression in elderly community residents and primary care patients by systematically reviewing original research on this topic. The review process, modified from the one described by Oxman et al. (17), involved systematic selection of articles, assessment of validity, abstraction of data, and qualitative and quantitative synthesis of results.
The selection process involved four steps. First, two computer databases, Medline and PsycINFO, were searched by one of us (M.G.C.) for potentially relevant articles published from January 1981 to November 1996 and from January 1984 to November 1996, respectively, using the key words "depression" and "prognosis" or "course" or "follow-up" and "aged." Second, relevant articles (selected on the basis of the title and abstract) were retrieved for more detailed evaluation. Third, the bibliographies of relevant articles were searched for additional references. Finally, all retrieved articles were screened by one of us (M.G.C.) to ensure that they met the following five inclusion criteria: 1) original research, 2) published in English or French, 3) study population of community residents or primary care patients, 4) subjects’ mean age of 60 years and over, and 5) reported affective state as an outcome.
To determine validity, two of us (M.G.C. and A.M.) independently assessed the methods and design of each study according to the following seven criteria for prognostic studies described by the Evidence-Based Medicine Working Group (18): formation of an inception cohort, description of referral pattern, adequate length of follow-up to determine outcome, completion of follow-up (determination of outcomes for at least 80% of the inception cohort), objective outcome criteria, blind outcome assessment, and adjustment for extraneous prognostic factors (e.g., severity of physical illness, cognitive impairment). Each study was scored with respect to meeting, not meeting, or partially meeting each of these criteria. Interrater agreement was calculated for each criterion as the percent of studies in which independent assessments of both raters were exactly the same; thereafter, in instances of disagreement, articles were reexamined to reach a consensus.
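The percent-agreement measure described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the rating labels and the example ratings are hypothetical.

```python
from typing import List

# Ratings allowed for each validity criterion (labels are illustrative).
RATINGS = {"met", "partial", "not_met"}


def percent_agreement(rater_a: List[str], rater_b: List[str]) -> float:
    """Percent of studies on which two raters gave exactly the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of studies")
    if not (set(rater_a) | set(rater_b)) <= RATINGS:
        raise ValueError("unknown rating label")
    same = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * same / len(rater_a)


# Hypothetical ratings of four studies on one criterion (not data from the review).
rater_1 = ["met", "partial", "not_met", "met"]
rater_2 = ["met", "met", "not_met", "met"]
agreement = percent_agreement(rater_1, rater_2)  # 3 of 4 identical -> 75.0
```

Simple percent agreement does not correct for agreement expected by chance, which is one reason it can look high when most studies receive the same rating.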
Information about the population, sample size, diagnostic criteria, proportion of depressed subjects detected and treated by primary care physicians, length of follow-up, affective outcomes, and prognostic factors was independently abstracted by two of us (M.G.C. and A.M.) from each report. Interobserver agreement was calculated; thereafter, in instances of disagreement, articles were reexamined to reach a consensus. To compare the results of different studies, the percentage of subjects in each reported outcome category was calculated by using the number of subjects in the inception cohort as the denominator; when this number was not explicitly reported, it was estimated.
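The choice of denominator matters for comparing studies, as the short sketch below illustrates. The function name and all counts are hypothetical and serve only to show the calculation.

```python
from typing import Dict


def outcome_percentages(counts: Dict[str, int], inception_n: int) -> Dict[str, float]:
    """Express each outcome count as a percentage of the inception cohort."""
    if inception_n <= 0:
        raise ValueError("inception cohort size must be positive")
    if sum(counts.values()) > inception_n:
        raise ValueError("outcome counts exceed the inception cohort")
    return {name: round(100.0 * n / inception_n, 1) for name, n in counts.items()}


# Hypothetical study: 50 enrolled, 40 with a known outcome at follow-up.
counts = {"well": 15, "depressed": 18, "died": 7}
by_inception = outcome_percentages(counts, inception_n=50)
# Dividing by the 40 subjects followed up instead would inflate every category.
by_followed = outcome_percentages(counts, inception_n=40)
```

In this made-up example the percent well is 30.0 with the inception cohort as denominator and 37.5 with only the followed-up subjects, so a common denominator is needed before results are combined.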
Information about the population, diagnostic criteria, proportion of depressed subjects detected and treated by primary care physicians, length of follow-up, outcomes, and prognostic factors was tabulated. A qualitative meta-analysis was conducted by comparing and contrasting abstracted data.
To analyze the results of the different studies, we selected the outcome categories that were consistent across most of the studies. We then used a mixed effects regression model (19) to combine the results of each outcome category (e.g., percent of subjects well) across the different studies at the end of follow-up. The study population (i.e., community or primary care), length of follow-up, lower age limit for enrollment, gender, diagnostic criteria, and percent of subjects treated with antidepressants were included in the regression model as covariates by using the stepwise selection strategy. Missing values for covariates were handled in the following ways: first, only studies with no missing covariates were considered; second, missing values were replaced by the average, and all studies were considered in the model. The parameter estimates were computed by using the method of moments with the weighted least squares approach. Each outcome category was modeled separately. Finally, we performed a test of homogeneity of the outcomes across studies by testing that the random effects variance of the regression model was null. All statistical analyses were conducted through use of SAS statistical software, version 6.12 (20).
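The weighted least squares step can be illustrated with a single covariate (length of follow-up). This is a simplified sketch under stated assumptions, not the SAS analysis used in the review: each study is weighted by the inverse of its within-study variance plus a between-study variance `tau2`, which is taken as given here rather than estimated by the method of moments, and all numbers are hypothetical.

```python
from typing import List, Tuple


def wls_line(x: List[float], y: List[float], w: List[float]) -> Tuple[float, float]:
    """Weighted least squares fit of y = intercept + slope * x."""
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("need at least two studies with x, y, and a weight")
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("covariate has no variation")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical studies: follow-up in months, percent well, within-study variance.
months = [12.0, 24.0, 36.0, 48.0]
pct_well = [40.0, 34.0, 28.0, 22.0]
within_var = [20.0, 15.0, 25.0, 30.0]
tau2 = 10.0  # assumed between-study variance, not estimated here

weights = [1.0 / (v + tau2) for v in within_var]
intercept, slope = wls_line(months, pct_well, weights)
```

A negative slope corresponds to the kind of inverse relationship between follow-up length and percent well that the model was used to test; the full model would include further covariates and an estimated random effects variance.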
The search strategy yielded 711 potentially relevant studies; 27 were retrieved for more detailed evaluation. Four studies of primary care patients, involving 843 patients with depression (6, 21–23), and eight studies of community residents, involving 425 subjects with depression (24–32), met all the inclusion criteria. The other 15 studies were excluded for the following reasons: one was not original research, in two the subjects’ mean age was lower than 60 years, nine did not report affective state as an outcome, and three did not meet two or more of the inclusion criteria.
The results of the validity assessment of the primary care and community studies are presented in Table 1. Interrater agreement ranged from 50% to 100% for the seven criteria. All studies had some methodologic limitations. For the primary care studies, the limitations were related to formation of the inception cohort, description of referral pattern, completion of follow-up, and adjustment for extraneous prognostic factors; for the community studies, the limitations were related to formation of the inception cohort, blind outcome assessment, and adjustment for extraneous prognostic factors. In both types of studies, the inception cohort failed to identify depressed subjects at an early and uniform point in the course of their illness (e.g., beginning of first episode) and may have included a disproportionately large number of subjects with multiple episodes or chronic depression.
The results of the 12 studies are summarized in Table 2. Interobserver agreement for nine items of abstracted data ranged from 58% to 100%. The lower level of agreement for the outcome variable (58%) reflected the way it was calculated: for each study, both raters had to have exactly the same percentages in each outcome category. When the criterion of agreement was relaxed (percentages the same, give or take 3%), interobserver agreement increased to 92%.
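The effect of relaxing the agreement criterion can be sketched as below. This assumes "give or take 3%" means an absolute difference of at most 3 percentage points in every outcome category; the function names and the ratings are hypothetical.

```python
from typing import Dict, List

Outcome = Dict[str, float]  # outcome category -> percent of inception cohort


def same_outcomes(a: Outcome, b: Outcome, tolerance: float = 0.0) -> bool:
    """True when both raters report every category within `tolerance` points."""
    if a.keys() != b.keys():
        return False
    return all(abs(a[k] - b[k]) <= tolerance for k in a)


def outcome_agreement(rater_a: List[Outcome], rater_b: List[Outcome],
                      tolerance: float = 0.0) -> float:
    """Percent of studies on which the two raters agree within the tolerance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must abstract the same non-empty set of studies")
    hits = sum(same_outcomes(a, b, tolerance) for a, b in zip(rater_a, rater_b))
    return 100.0 * hits / len(rater_a)


# Hypothetical abstractions of four studies (not data from the review).
rater_1 = [
    {"well": 30.0, "depressed": 40.0},
    {"well": 33.0, "depressed": 35.0},
    {"well": 25.0, "depressed": 50.0},
    {"well": 40.0, "depressed": 20.0},
]
rater_2 = [
    {"well": 30.0, "depressed": 40.0},  # identical
    {"well": 35.0, "depressed": 35.0},  # differs by 2 points
    {"well": 25.0, "depressed": 50.0},  # identical
    {"well": 45.0, "depressed": 20.0},  # differs by 5 points
]
exact = outcome_agreement(rater_1, rater_2)          # 2 of 4 -> 50.0
relaxed = outcome_agreement(rater_1, rater_2, 3.0)   # 3 of 4 -> 75.0
```

Because exact agreement requires every category to match, a single small discrepancy counts a whole study as a disagreement, which is why the strict figure is much lower than the relaxed one.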
Qualitative: primary care studies. One study used DSM-III criteria (major depression or dysthymia), and three used cutoffs on depression symptom rating scales: in two instances, a score of 16 or more on the Center for Epidemiologic Studies Depression Scale (34) and, in the third instance, a score of 60 or more on the Zung Self-Rating Depression Scale (33). Study groups ranged from 42 to 410 patients. Patients’ mean ages were reported in two studies (65.6 and 75.6 years). One study included men only, and in two others, 80% or more of the patients were women. Lengths of reported follow-up varied from 9 to 33 months. One study reported the rate of detection by primary care physicians (28%). Two studies reported rates of eventual antidepressant treatment (9% and 10%, respectively).
Qualitative: community studies. One study each used the Present State Examination-CATEGO (35), the Cambridge Examination for Mental Disorders in the Elderly (36), DSM-III-R criteria (major depression), or a score of 6 or more for anxiety-depression on the General Health Questionnaire (38); two studies each used DSM-III criteria (major depression) or the Geriatric Mental State-AGECAT (37). Study groups ranged from 12 to 129 patients. Subjects’ mean ages were reported in two studies (70 and 72.4 years). Four reported gender distribution: most subjects were women. Lengths of reported follow-up varied from 12 to 60 months. Three studies reported rates of detection of depression by primary care physicians (0% to 32%). Six studies reported rates of eventual antidepressant treatment (4% to 37%).
Qualitative: prognostic factors. A variety of prognostic factors were reported in 10 studies, although measurement of these factors varied from one study to the next. Older age (22), added supports (22), poor perceived health (6), and total number of life events (26, 27) were associated with poor outcome in one study each; however, alcohol abuse (21), major life events (23), and social factors (education, marital status, social participation [26, 27]) were not. Physical disability (26, 27, 32) and cognitive impairment (23, 31) were associated with poor outcome in two studies each. Physical illness was associated with poor outcome in four studies (22, 29, 31, 32) but not in two others (21, 26, 27). Finally, severe depression was associated with poor outcome in two studies (6, 32) but not in two others (26, 27, 31).
Quantitative. Three outcome categories—well, depressed, and died—were consistent across most of the studies. The other outcomes (e.g., dementia, partial remission) were categorized as "other" in this analysis. Some specific outcome categories (e.g., dead) were not evaluated in a few studies; in such cases, we removed these studies from the calculation of the estimates of the regression model parameters for these outcome categories. The percent of subjects well, depressed, dead, and "other" at the end of follow-up was modeled by using a mixed effects regression model.
There was significant heterogeneity in the outcomes across studies (Table 3). Differences in the length of follow-up and lower age limit for enrollment explained part of this heterogeneity for the outcome category of well, but differences in gender distribution, population, diagnostic criteria, and percent receiving antidepressant treatment did not. However, significant variation among the random effects was still unexplained even after control for length of follow-up and lower age limit for enrollment.
Figure 1 illustrates the significant inverse relationships (p<0.05) of percent well with length of follow-up and with lower age limit for enrollment. The percent of subjects well decreases as length of follow-up increases and is lower when the lower age limit for enrollment is 75. There was no significant difference between lower age limits of 60 and 65; therefore, they were pooled into a single category in the final mixed effects regression model.
Table 3 presents the combined estimate of the percent well and the 95% confidence interval (CI) for a 24-month follow-up and 60 or 65 lower age limit for enrollment. No significant relationship was observed between the other three outcome categories and the covariates. Thus, the final mixed effects regression model for these other outcome categories included only the intercept and the random effect, which is equivalent to simply combining the results of all studies through use of a random effects model (19). The combined estimates for these other outcomes and their 95% CIs are presented in Table 3. Note that only nine studies had "dead" as an outcome category; only seven had an "other" category.
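An intercept-only random effects combination of this kind can be sketched as follows. The sketch uses the DerSimonian-Laird moment estimator, a common method-of-moments choice that may differ in detail from the method of reference 19. It also reports Cochran's Q as a standard homogeneity statistic, whereas the review tested whether the random effects variance was null. All inputs are hypothetical.

```python
import math
from typing import List, Tuple


def pool_random_effects(y: List[float], v: List[float]) -> Tuple[float, float, float, float]:
    """Return (pooled estimate, standard error, tau2, Q) via DerSimonian-Laird."""
    k = len(y)
    if k < 2 or k != len(v):
        raise ValueError("need matching estimates and variances for at least two studies")
    w = [1.0 / vi for vi in v]                      # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q, df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)              # moment estimate, truncated at zero
    w_star = [1.0 / (vi + tau2) for vi in v]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q


# Hypothetical percent-depressed estimates and within-study variances.
estimates = [20.0, 30.0, 40.0]
variances = [25.0, 25.0, 25.0]
pooled, se, tau2, q = pool_random_effects(estimates, variances)
ci_low, ci_high = pooled - 1.96 * se, pooled + 1.96 * se
```

A large between-study variance widens the confidence interval around the pooled estimate, which is consistent with the imprecision of combined estimates when results are heterogeneous.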
To date, 12 studies of the prognosis of depression in elderly community and primary care populations have been published in the English and French literature. The combined results of these studies indicated that 24 months after enrollment, 33% were well, 33% were depressed, and 21% had died. Thus, the prognosis of depression in these populations appears to be poor: 24 months after enrollment, almost half of those alive were depressed, probably reflecting the chronic and relapsing course of the disorder. This finding echoes the conclusions of two well-designed studies (22, 28) included in the meta-analysis.
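The "almost half of those alive" figure follows from the combined estimates above. The sketch below treats the pooled percentages as if they described a single cohort, which is an approximation because each outcome category was modeled separately.

```python
# Combined 24-month estimates reported in the text (percent of inception cohort).
well, depressed, died = 33.0, 33.0, 21.0

alive = 100.0 - died  # 79% of the cohort survived to 24 months
depressed_among_alive = 100.0 * depressed / alive
well_among_alive = 100.0 * well / alive

print(round(depressed_among_alive, 1))  # 41.8
```

On this approximation roughly 42% of survivors were depressed and the same share were well, with the remainder in the "other" category.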
Length of follow-up and lower age limit for enrollment were inversely associated with one outcome, percent well, but not with the other three outcomes. Increasing both length of follow-up and lower age limit for enrollment made it less likely that elderly subjects would be well and more likely that they would fall into one of the other three outcome categories; however, their distribution in the other three categories did not appear to be influenced by either length of follow-up or lower age limit.
Differences in gender distribution, diagnostic criteria, population, or percent receiving antidepressant treatment were not related to outcomes. The absence of a relationship between population (community or primary care) and outcome is not surprising, since elderly subjects in many of the community studies were in regular contact with a primary care physician (24, 26–28, 30, 32) and had rates of detection and treatment comparable to those of subjects in the primary care studies. The absence of a relationship between the percent receiving antidepressant treatment and outcome may reflect the relatively low rates of treatment in these studies.
The outcome of depression in these populations contrasts with that of elderly depressed patients in hospital-based psychiatric services, where 60% were well (or had relapses with recovery) and only 12%–22% were continuously ill over a mean 13 to 52 months of follow-up (39). The differences in outcome may reflect fundamental differences between community/primary care subjects and psychiatric patients or the modest effectiveness of available treatments in the community/primary care population (40). Alternatively, the differences may reflect the fact that rates of detection and treatment of depression were low in community/primary care subjects, whereas psychiatric patients were treated, in most cases with antidepressants and psychotherapy. Thus, increased attention to detection and treatment of depression in community and primary care settings may improve outcome.
This review has nine potential limitations. First, the literature search was conducted by only one reviewer. Second, the search was limited to articles published in English and French because we did not have the resources to translate articles written in other languages. Third, we did not assess publication bias, although it is unlikely that this bias influences publication of studies of prognosis. Fourth, even though agreement between raters on the validity criteria was generally high, it was low for "blind outcome assessment" because we had not anticipated a "not applicable" rating for four studies. Fifth, the abstraction of data was highly reliable except for the outcomes; however, when the criterion of agreement for outcomes was relaxed (percentages the same, give or take 3%), agreement was very good (92%); notably, the outcome data, presented in Table 2 and used in the meta-analysis, were consensus data. Sixth, examination of outcomes was complicated by differences in the length of follow-up and outcome categories from one study to the next. Seventh, outcomes were usually based on a cross-sectional assessment at the time of follow-up; affective status between baseline and follow-up assessments was not reported. Eighth, there was significant heterogeneity in the outcomes that could not be explained either by differences in length of follow-up and lower age limit for enrollment or by differences in gender, diagnostic criteria, population, and the percent receiving antidepressant treatment. Other explanations for this heterogeneity could be unreported differences in the subjects enrolled or differences in study methods. Finally, because of the significant heterogeneity in the results, the combined estimates of outcomes were not very precise.
To conclude, depression in elderly community and primary care subjects 1) has a poor prognosis; 2) is perhaps chronic, relapsing, or both; and 3) is probably undertreated. Despite the methodologic limitations of the studies and this meta-analysis, these findings seem to support efforts to develop detection and treatment programs for depression in these populations.
Received April 27, 1998; revisions received Oct. 6, 1998, and Feb. 10, 1999; accepted Feb. 24, 1999. From the Division of Geriatric Psychiatry, St. Mary’s Hospital, and Department of Psychiatry, McGill University, Montreal; and École de Hautes Études Commerciales, Montreal, affiliated with l’Université de Montréal. Reprints of this article are not available. Address correspondence to Dr. Cole, Division of Geriatric Psychiatry, St. Mary’s Hospital Center, 3830 Ave. Lacombe, Montreal, Que. H3T 1M5, Canada.
Figure 1. Percent of Well Subjects and Length of Follow-Up by Population and Lower Age Limit for Enrollment, From Studies of Prognosis of Depression in Elderly Primary Care Patients and Community Residents