Rising mental health spending has triggered cost-containment efforts primarily aimed at decreasing inpatient utilization (1). In this fiscally sensitive environment, partial hospitalization, a nonresidential treatment modality capable of providing tertiary-level care to mentally ill adults, has enjoyed renewed popularity (2–4). Favorable Medicare reimbursement policies and the advent of managed care have played a key role in driving up partial hospitalization utilization (3). The significance of this development is magnified by the fact that partial care grew at a stubbornly slow rate through the mid-1980s (2, 5), which was the result of tepid policy and insurance support (6) as well as ambivalence on the part of clinicians, patients, and families (7, 8).
The expanding use of partial hospitalization as a treatment option for patients requiring intensive services should be followed by evaluative efforts to ascertain its effectiveness relative to full hospitalization, the standard for tertiary-level psychiatric care. This development also invites a reanalysis of the decades-old body of research on this topic. While such research has focused primarily on partial and full hospitalization as alternative forms of treatment, a smaller number of studies have examined the comparative effectiveness of partial care as a supplement to brief inpatient care versus standard full hospitalization. Despite the lack of hard data on the relative importance of these two modalities of acute partial care (Center for Mental Health Services, personal communication, April 2000), evidence of opposite utilization trends for partial and full hospitalization (8) suggests that acute partial care is being increasingly used as a substitute for inpatient care.
Although the relative effectiveness of partial and full hospitalization as alternative treatment modalities has been the subject of multiple investigations, several questions remain unanswered. Because of methodological shortcomings, reviewers of this literature have declared that no firm conclusions can be drawn (9–14). Deficiencies include inadequate description of the programs, program heterogeneity, small sample sizes, high exclusion rates, nonrandom assignment, unblinded assessments, nonstandardized or inadequate response measures, inadequacy of statistical methods, and high attrition rates.
Despite these shortcomings, published reviews have offered interpretations of the research evidence that have likely influenced public policy and care trends. While some reviewers have proclaimed the superiority of partial over full hospitalization, either globally (15–17) or in terms of social adjustment (13, 18, 19), others have concluded that the programs have similar outcomes in terms of psychopathology and functional adjustment (20, 21).
Although not a cure for all the aforementioned methodological deficiencies, meta-analysis provides a powerful tool for overcoming problems related to small sample size and for integrating the results of multiple investigations. We therefore performed a systematic review of English-language studies on the relative merits of partial and full hospitalization as first-line treatment for mentally ill adults. Our objectives were to systematize the knowledge base and provide directions for future research. Specific goals were to resolve conflicting results on various measures of outcome and to establish whether selected patient, illness, and program characteristics were associated with superior outcome.
We performed manual (Psychological Abstracts 1945–1966) as well as computerized searches (MEDLINE [1966–October 1998] and PsycLit [1984–October 1998]) to identify relevant studies. Search terms included "day hospital," "day treatment," "day care," or "partial hospitalization," and "mental" or "psychiatric" disorders. References contained in the retrieved literature were also reviewed.
We restricted our review to studies of partial hospitalization as an alternative to full hospitalization for adults with primary psychiatric diagnoses other than substance abuse disorders (7, 22–48). We excluded studies of partial hospitalization programs specifically designed for children, adolescents, or patients aged 65 and over, as well as studies in which service utilization was the only measure. Studies were included regardless of their design. If the study design called for hospitalizing all patients before randomization, the maximum inpatient stay could not exceed 4 days.
Data Extraction and Definition of Variables
The reviewer and data abstractor was a psychiatrist with health services research training (M.H.-L.) who consulted with another author (S.-L.T.N.) to resolve statistical questions. Data were extracted by using a standardized instrument adapted from Glass et al. (49). If a study reported more than one significant result favoring a particular program on a given outcome domain, we selected the most conservative result (i.e., the lowest level of significance as denoted by the largest p value). All statistics were adjusted so that higher scores reflected better outcomes.
Qualitative study variables such as demographic and clinical characteristics of patients and type of program were abstracted. Demographic information included percentage of male subjects, median age, and percentage of subjects lacking involved family. The percentage of nonaffective psychoses was calculated as a proxy for the programs’ enrollment of severely mentally ill patients. On the basis of Armstrong et al.’s classification (50), treatment programs were categorized into four mutually exclusive types, depending on the predominant intervention: medication-based, directive (program with focused behavioral activities), nondirective (milieu-based program heavily reliant on community meetings and verbal therapies), and eclectic (balanced program where all or most therapies are equally emphasized).
Key quantitative study variables such as exclusion, attrition, and transfer rates were calculated according to preoperationalized definitions. For randomized studies, the numerator for the exclusion rate was the sum of prerandomization exclusions and postrandomization exclusions; for nonrandomized studies, the numerator was the number of patients excluded due to program ineligibility. The denominator for all studies was the total number of potential study subjects, which was not necessarily equivalent to the study sample size. The numerator for the attrition rate included those missed by researchers or with no informant available, study refusers, postrandomization exclusions, and patients lost to follow-up. The denominator was the study sample size. Transfer rate was calculated only for the partial hospitalization group. Transfers were defined as patients who, at some point during their treatment, had spent more than 2 days in an inpatient facility (51). If information on length of inpatient stay was not available, all boarded patients were considered transfers. Excluded from the numerator were partial hospitalization subjects who required an emergency inpatient admission immediately after randomization. The denominator was the study sample size. Other quantitative variables included whether the posttest results had been adjusted by pretest differences at baseline and type of analysis used by the study (e.g., intent to treat, per protocol).
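The rate definitions above can be sketched in a few lines of Python. This is only an illustration: the `StudyCounts` container, its field names, and the example counts are hypothetical and do not come from the reviewed studies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudyCounts:
    """Hypothetical per-study tallies; all field names are illustrative."""
    randomized: bool
    potential_subjects: int            # everyone considered for the study
    sample_size: int                   # subjects who made up the study sample
    prerandomization_exclusions: int = 0
    postrandomization_exclusions: int = 0
    program_ineligible: int = 0        # used for nonrandomized studies only
    missed_or_no_informant: int = 0
    refusers: int = 0
    lost_to_follow_up: int = 0


def exclusion_rate(s: StudyCounts) -> float:
    """Exclusions divided by all potential subjects (not the sample size)."""
    if s.randomized:
        excluded = s.prerandomization_exclusions + s.postrandomization_exclusions
    else:
        excluded = s.program_ineligible
    return excluded / s.potential_subjects


def attrition_rate(s: StudyCounts) -> float:
    """Subjects without usable outcome data divided by the study sample size."""
    lost = (s.missed_or_no_informant + s.refusers
            + s.postrandomization_exclusions + s.lost_to_follow_up)
    return lost / s.sample_size


def is_transfer(inpatient_days: Optional[float], boarded: bool,
                emergency_admission_at_randomization: bool = False) -> bool:
    """Apply the transfer definition to one partial hospitalization subject."""
    if emergency_admission_at_randomization:
        return False                   # excluded from the numerator
    if inpatient_days is None:
        return boarded                 # no length of stay: boarded counts
    return inpatient_days > 2


# Made-up counts, for illustration only.
example = StudyCounts(randomized=True, potential_subjects=200, sample_size=100,
                      prerandomization_exclusions=90,
                      postrandomization_exclusions=10,
                      refusers=5, lost_to_follow_up=5)
print(exclusion_rate(example), attrition_rate(example))
```

The transfer rate would then be the count of subjects for whom `is_transfer` is true, divided by the study sample size.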
Domains and Measures of Outcome
The relative effectiveness of partial versus full hospitalization was assessed in five domains: psychopathology, social functioning, family burden, satisfaction with services, and service utilization. Consumer satisfaction and use of services, although better characterized as process indicators, were used as proxy outcome domains. Measures for outcome assessment were questionnaires administered to patients or informants or rates computed from administrative data (e.g., rehospitalization rate).
Measures of psychopathology (symptoms of psychotic, affective, and other disorders) and social functioning (interpersonal and occupational adjustment) were either standardized instruments with known reliability or nonstandardized assessments. Examples of the former included the Present State Examination (PSE) (52), which assesses a variety of symptom clusters, and the Social Behavior Assessment Scale (53), which assesses social performance and abnormal behaviors. Nonstandardized measures included assessments of "psychiatric status" (25) and "quality of patient’s family relationships" (42). Rate-based measures of observable behaviors, such as percent of patients with self-mutilatory behavior (32) and percent of patients employed (7, 46), were also used. All assessments of family burden (distress or inconvenience caused by patient’s illness) were completed with the burden subscale of the Social Behavior Assessment Scale (53). Satisfaction with services was assessed both with standardized measures of unknown reliability and nonstandardized assessments. The former included the Satisfaction With Services Scale (37), which measures perceived helpfulness and responsiveness of services, and a questionnaire on "attitudes towards mental institutions" (42). Nonstandardized assessments, both rate-based, measured program satisfaction (32) and preference (42). Service utilization was assessed both at the index admission by using rates of discharge (24, 25, 27, 30) and at follow-up by using cumulative rehospitalization rates (24, 25, 27, 29, 36, 37, 46, 47) and community tenure rates (25, 27, 30).
Stratification of Study Results
Selective reporting of differential findings frequently involved undue attention to isolated subscale results and lesser attention to nondifferential full scale or global results. Along with methodological considerations, the above led us to categorize measures as global (e.g., total PSE score), partial (e.g., the delusions and auditory hallucinations subscale of the PSE), or rate-based (e.g., percentage of patients requiring rehospitalization). Further, because of findings suggesting that treatment effects may be time sensitive (9, 27, 39), study results were also stratified by time of assessment measured from discharge. For studies that only reported findings relative to time of admission, this time variable was adjusted by the mean treatment duration or, if the latter was not available, by an approximate modal length of treatment (e.g., 2 months).
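The time adjustment can be illustrated with a short sketch. The 2-month default is the example given in the text. The function names and the "13-18 months" label are assumptions added here to complete the set of strata named in the results.

```python
from typing import Optional

# Fallback when no mean treatment duration is reported (the example in the text).
DEFAULT_TREATMENT_MONTHS = 2.0


def months_from_discharge(months_from_admission: float,
                          mean_treatment_months: Optional[float] = None) -> float:
    """Shift an admission-based assessment time to a discharge-based one."""
    duration = (mean_treatment_months if mean_treatment_months is not None
                else DEFAULT_TREATMENT_MONTHS)
    return max(0.0, months_from_admission - duration)


def time_stratum(months: float) -> str:
    """Assign a discharge-based time to a stratum (13-18 label is assumed)."""
    if months <= 6:
        return "0-6 months"
    if months <= 12:
        return "7-12 months"
    if months <= 18:
        return "13-18 months"
    return "after 18 months"


# An assessment 8 months after admission, with no reported treatment duration.
print(time_stratum(months_from_discharge(8)))  # 0-6 months
```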
Estimation of Effect Sizes
Test statistics, means, and other reported information were used to calculate the standardized mean difference, defined as the mean difference between partial and full hospitalization groups divided by the pooled standard deviation. This parameter, estimated using Glass’s effect size (Δ), may be interpreted as a measure of the outcome differential between average partial hospitalization subjects and average full hospitalization subjects, expressed in standard deviations (49). Thus, a value of 0.6 would indicate that the average partial hospitalization patient was 0.6 standard deviations better off than the average inpatient on a given outcome. A value of 0.0 would indicate that there was no difference between the groups.
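As a minimal sketch of the definition above, the following computes a standardized mean difference from group means and standard deviations, using the usual pooled standard deviation. The function and parameter names are illustrative, and the numbers are invented to reproduce the 0.6 example.

```python
from math import sqrt


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def standardized_mean_difference(mean_partial: float, sd_partial: float, n_partial: int,
                                 mean_full: float, sd_full: float, n_full: int) -> float:
    """Partial minus full, in pooled-SD units; higher scores mean better outcomes."""
    return (mean_partial - mean_full) / pooled_sd(sd_partial, n_partial, sd_full, n_full)


# Invented data: a 3-point advantage with a common SD of 5 gives 0.6.
d = standardized_mean_difference(10.0, 5.0, 30, 7.0, 5.0, 30)
print(round(d, 2))  # 0.6
```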
Effect sizes were computed by using a hierarchical method. We computed standardized measures when study means and standard deviations were reported. When these quantities were not available, we resorted to other formulas based on Student’s t statistics, F values, proportions, and p values. Computations that employed p values were deemed of last resort and used only if no other statistics were available. The precision associated with each estimate was also calculated (54).
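The fallback hierarchy can be sketched as follows. The conversion formulas are common textbook approximations for two independent groups and are assumptions here, not necessarily the exact formulas used in the review. The probit treatment of proportions is likewise one convention among several, and the variance formula is a standard large-sample approximation rather than the method of reference 54.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def _scale(n1: int, n2: int) -> float:
    return sqrt(1.0 / n1 + 1.0 / n2)


def d_from_t(t: float, n1: int, n2: int) -> float:
    return t * _scale(n1, n2)


def d_from_f(f: float, n1: int, n2: int, sign: int = 1) -> float:
    """Two-group F (F = t**2); the direction of the effect must be supplied."""
    return sign * sqrt(f) * _scale(n1, n2)


def d_from_proportions(p1: float, p2: float) -> float:
    """Probit difference; proportions must lie strictly between 0 and 1."""
    return _Z.inv_cdf(p1) - _Z.inv_cdf(p2)


def d_from_p(p_two_sided: float, n1: int, n2: int, sign: int = 1) -> float:
    """Last resort: treat the p value as if it came from a z test."""
    return sign * _Z.inv_cdf(1.0 - p_two_sided / 2.0) * _scale(n1, n2)


def fallback_effect_size(report: dict, n1: int, n2: int) -> float:
    """Use the first statistic available, with p values tried last."""
    sign = report.get("sign", 1)
    if "t" in report:
        return d_from_t(report["t"], n1, n2)
    if "f" in report:
        return d_from_f(report["f"], n1, n2, sign)
    if "proportions" in report:
        return d_from_proportions(*report["proportions"])
    if "p" in report:
        return d_from_p(report["p"], n1, n2, sign)
    raise ValueError("no usable statistic reported")


def approx_variance(d: float, n1: int, n2: int) -> float:
    """Large-sample variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))


print(round(fallback_effect_size({"p": 0.05, "t": 2.0}, 50, 50), 2))  # 0.4
```

When a study reports both a t statistic and a p value, the t statistic takes precedence, mirroring the rule that p-value computations were used only when nothing else was available.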
A random-effects framework (55) was used to combine Glass’s Δ across studies within each outcome domain, stratified by type of measure and time of assessment. Between-study variance was calculated by using the DerSimonian and Laird estimate (56). Across all studies, the overall difference between programs was estimated by using Cochran’s semiweighted estimator (û), with weights based on the within- and between-study variations. Ninety-five percent confidence intervals (CIs) were calculated for the overall difference for each type of measure, adjusting for multiple comparisons by using a Bonferroni correction and assuming approximate normality of the estimator. The results were deemed statistically significant if, upon the Bonferroni adjustment, the CI excluded zero. The clinical significance of the summary estimates was interpreted by using the guidelines suggested by Cohen (57), with values of 0.20, 0.50, and 0.80 corresponding to small, medium, and large effects, respectively.
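The pooling step can be illustrated with a compact sketch of the DerSimonian and Laird moment estimator and a Bonferroni-adjusted interval. It assumes each effect size comes with a within-study variance and that the pooled estimate is approximately normal. The function name, argument names, and input values are illustrative and are not taken from the review.

```python
from math import sqrt
from statistics import NormalDist


def dersimonian_laird(effects, variances, n_comparisons=1, alpha=0.05):
    """Return (pooled estimate, between-study variance, (ci_low, ci_high))."""
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Moment estimate of between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Weights combine within- and between-study variation.
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sw_star
    se = sqrt(1.0 / sw_star)
    # Bonferroni: split alpha across the number of comparisons.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * n_comparisons))
    return pooled, tau2, (pooled - z * se, pooled + z * se)


pooled, tau2, ci = dersimonian_laird([0.0, 1.0], [0.04, 0.04])
print(round(pooled, 2), round(tau2, 2))  # 0.5 0.46
```

Under this rule a summary estimate would count as statistically significant only if the adjusted interval excludes zero. Its magnitude could then be read against the 0.20, 0.50, and 0.80 benchmarks.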
We analyzed the entire set of results, as well as results stratified according to whether or not the reported results had been adjusted for baseline differences. In addition, sensitivity analyses were performed to establish the robustness of our conclusions. Thus, with the aim of determining if isolated observations had an undue influence on the results, effect sizes were sequentially eliminated and the overall estimate of effect was recomputed for each outcome domain and for each stratum (type of measure by time of assessment). Findings were deemed positive if they remained significant after performing sensitivity analyses.
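The sequential-elimination check can be sketched as a leave-one-out loop. To keep the example short, a simple inverse-variance interval stands in for the random-effects pooling. That substitution, the function names, and the input values are all assumptions made for illustration.

```python
from math import sqrt


def inverse_variance_ci(effects, variances, z=1.96):
    """Stand-in pooling rule (fixed weights); returns a (low, high) interval."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    half = z * sqrt(1.0 / total)
    return mean - half, mean + half


def excludes_zero(ci):
    low, high = ci
    return low > 0 or high < 0


def is_robust(effects, variances, pool=inverse_variance_ci):
    """True if the estimate is significant and stays so after each removal."""
    if not excludes_zero(pool(effects, variances)):
        return False
    if len(effects) < 2:
        return True  # a single effect size leaves nothing to remove
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        if not excludes_zero(pool(rest_e, rest_v)):
            return False
    return True


# Invented inputs: the second set depends on one large effect size.
print(is_robust([0.8, 0.9, 1.0], [0.04, 0.04, 0.04]))    # True
print(is_robust([0.05, 0.05, 1.5], [0.04, 0.04, 0.04]))  # False
```

Any pooling function that returns a confidence interval can be passed as `pool`, so the same loop applies to each stratum.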
One of us (M.H.-L.) evaluated the methodological quality of the studies by means of a 15-point instrument that rated validity, generalizability, quality of assessment methods, and quality of statistics (58, 59) (available upon request). Studies were characterized as poor, fair, good, or excellent if their quality scores were 0–5, 6–8, 9–12, or 13–15, respectively.
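The mapping from quality score to category is simple enough to state as a lookup. The function name is illustrative, and the scoring instrument itself is not reproduced here.

```python
def quality_category(score: int) -> str:
    """Map a 0-15 methodological quality score to its category."""
    if not 0 <= score <= 15:
        raise ValueError("score must be between 0 and 15")
    if score <= 5:
        return "poor"
    if score <= 8:
        return "fair"
    if score <= 12:
        return "good"
    return "excellent"


print([quality_category(s) for s in (5, 6, 9, 13)])
# ['poor', 'fair', 'good', 'excellent']
```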
Eight studies initially identified as meeting review criteria were excluded either because the partial hospitalization program was not an alternative to its counterpart (60–66) or because the outcome investigation was limited to service utilization (67). Our review covered 17 studies, 18 investigations (one study conducted two parallel investigations), and a total of 2,450 observations (Table 1).
Overview of Published Studies
Only one investigation conducted blinded assessments (46). A similar number of studies used intent-to-treat (N=8) and per protocol (N=10) techniques for statistical analyses. Twelve of the 18 investigations made reasonable attempts to control for baseline differences. Thus, they either adjusted most posttest results, conducted posttests upon finding pretests to be statistically comparable, or did not need to adjust, given the nature of the domain being assessed (e.g., satisfaction).
With one exception, all nonrandomized studies failed to report exclusion rates. Among randomized studies, the median exclusion rate for both treatment arms was 56%. In addition, of the three randomized studies with no exclusions, one (25) disenrolled 30% of the partial hospitalization subjects upon randomization (included in the exclusion rate; see Method section) and eventually transferred 22% of partial care enrollees, while another (36) transferred 61% of its partial hospitalization subjects (Table 1). Regardless of study design, the most frequent exclusion criterion was, by far, "too severely ill" (i.e., dangerousness to self or others; disruptive behavior). Other frequent exclusion criteria were cognitive impairment and antisocial behavior.
With the exception of matched-design studies, full hospitalization programs tended to have a greater share of male subjects and nonaffective psychoses, a pattern better discernible among randomized investigations. Illness severity varied widely across studies, regardless of design (range=0%–96%). Extent of family involvement also varied substantially, with rates lowest in matched-design studies (range=19%–33%), intermediate in observational studies (range=28%–60%), and highest in randomized studies (range=42%–87%). Treatment sites offered similar interventions in eight of the 12 studies with program information; most programs were either eclectic or nondirective. Residential services were readily available for partial hospitalization patients in only one investigation (37).
Methodological quality was characterized as fair or poor in 11 and two studies, respectively. Validity was typically compromised by lack of baseline comparability between groups, unblinded assessments, or high rates of attrition, while high volumes of exclusions or transfers typically resulted in low generalizability. Other common deficiencies were insufficient reporting of key information, selective reporting of significant results, and use of measures with unknown reliability, most of which were nonstandardized.
Most investigations detected comparable improvement for partial and full hospitalization patients in both psychopathology and social functioning, domains assessed by 18 and 16 studies, respectively. Exceptions to this general trend were reports of poor functional outcomes across the board (36), differential speed of recovery on symptom (27, 46) or functional (30, 39) measures, or greater social adjustment gains for partial hospitalization patients (24, 27, 42, 43, 46, 47). Two investigations that assessed family burden found a delayed advantage for partial hospitalization (39, 48), whereas the other two found no differences (30, 31). Four of the five studies that assessed satisfaction found an advantage for partial hospitalization (25, 29, 32, 42). Service utilization was assessed by 10 studies. While most investigations found comparable rehospitalization rates (24, 25, 27, 29, 30, 32, 37), of the three studies with discharge information (24, 27, 30), two (24, 27) found that partial hospitalization patients were discharged sooner. Only five investigations performed subgroup analyses by diagnosis (25, 30, 36, 37, 43).
Four of the 18 studies (32, 37, 39, 47) provided sufficient information to estimate effect sizes for all published results. Conversely, inadequacies in data reporting prevented us from computing effect sizes for any of the results published by two studies (31, 48) that had suggested nondifferential psychopathology and social functioning outcomes.
Outcome domains were variably represented in the overall pool of effect sizes (N=95) (Table 2). While 29% (N=28) and 41% (N=39) of effect sizes corresponded to assessments of psychopathology and social functioning, only 7% (N=7) and 4% (N=4) of effect sizes reflected assessments of satisfaction and family burden, respectively. In addition, most effect sizes (77%, N=73) were assessed within 1 year of discharge, and the satisfaction and family burden domains had sparse or no long-term assessments.
For psychopathology and social functioning, summary estimates of global, partial, and rate-based measures suggested nondifferential treatment effects at all time points (Table 3). For social functioning, the initially significant summary estimate of rate-based measures that suggested superior results for partial hospitalization at 0–6 months was sensitive to the iterative removal of one of three contributing effect sizes. Of note, the nonsignificant summary estimates of partial measures at 7–12 months and after 18 months were sensitive to the removal of two of seven and one of three effect sizes, respectively.
One study (37) contributed all results for assessment of family burden. Neither of the two summary estimates suggested differential effects.
Summary estimates of global measures suggested no differences between treatments in terms of satisfaction with services. However, significant summary estimates for rate-based measures indicated that greater satisfaction was associated with partial hospitalization within 1 year of discharge. Only one adjusted result contributed to the significant estimate at 0–6 months. The largest significant estimate, a combination of two unadjusted results at 7–12 months, indicated that the average partial hospitalization patient was one and one-half standard deviations more satisfied with services than the average inpatient. This difference rendered the average partial hospitalization patient more satisfied than roughly 94% of his or her counterparts. The initially significant estimate seen after 18 months was sensitive to the removal of one of two effect sizes.
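The percentile reading of an effect size can be checked with the standard normal distribution. This sketch assumes that scores in both groups are approximately normal with equal spread, which the text does not state, and the function name is illustrative. A value of exactly 1.5 maps to about 0.93, so the percentage quoted above may rest on an unrounded estimate.

```python
from statistics import NormalDist


def share_of_comparison_group_below(effect_size: float) -> float:
    """Fraction of the comparison group scoring below the average treated patient."""
    return NormalDist().cdf(effect_size)


print(round(share_of_comparison_group_below(1.5), 3))  # 0.933
print(round(share_of_comparison_group_below(0.0), 3))  # 0.5
```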
For service utilization, all summary estimates of rate-based measures suggested nondifferential treatment effects (Table 3). The finding from an adjusted analysis in which partial hospitalization was associated with lower service utilization at 7–12 months (312 observations from two studies; û=0.64, 95% CI=0.15–1.12) was sensitive to the removal of one of the three contributing effect sizes.
This review offers no evidence that partial hospitalization is less effective than full hospitalization in the provision of tertiary-level care to mentally ill adults of moderate diagnostic severity, although the generalizability of this conclusion is limited by the exclusion of a significant fraction of patients on the basis of prespecified criteria. In addition, our work suggests that patients and their families are more satisfied with partial hospitalization within 1 year of discharge. Limitations of the data prevented us from elucidating the factors underlying this positive finding or those underlying the dissipation of the satisfaction differential after 18 months. Although we found a robust result, methodological shortcomings of the original data call for caution in its interpretation. Thus, while the vast transformation of the American health care system and changing expectations by consumers affect the generalizability of findings published 15 to 40 years ago, the use of largely nonstandardized satisfaction measures compromises their validity.
The presumption of no difference in treatment effects is not applicable to the entire population of patients requiring intensive services because, as inferred from reported exclusion rates, a little over half of eligible patients were prevented from entering the studies because partial care placement was not considered viable. On the basis of our observation that studies with high exclusion rates had comparatively lower transfer rates than two of the three studies with nonexclusionary protocols, it would appear that, in the absence of exclusion criteria, many of the same patients who otherwise would have been excluded end up being transferred. At least one study has attempted to shed light on the factors affecting successful allocation to partial hospitalization by studying the dynamics of transfer to inpatient care in a partial hospitalization program with no a priori exclusion criteria (36). The best predictor of successful allocation to partial hospitalization was a proxy variable for illness severity. Illness severity was also the most frequent exclusion criterion among the studies reviewed. According to the findings of the two most recent studies (36, 67) from the small set of randomized investigations without a priori exclusions (one of which [67] did not meet review criteria), 61% to approximately 80% of partial care patients will eventually be transferred and fully hospitalized. Conversely, approximately 20% to 39% of acutely ill patients may be solely treated with partial hospitalization.
Because partial hospitalization has been shown to be less expensive than full hospitalization when used as an alternative form of treatment (29, 37, 39, 68), our results also support the notion that partial care is a cost-effective alternative to inpatient care for the subset of patients included in the studies reviewed. It must be noted, however, that typical costs of partial hospitalization may rise if services are upgraded to meet advocated standards of care. Process improvements include appropriate level of staffing (7, 30, 31, 69), highly structured treatment programs (20), and quick and easy access to inpatient care (7, 32, 68, 69) and to residential facilities (32, 68).
Unfortunately, the relatively small number of comparisons and inadequate reporting of the data did not allow us to "capitalize on study-level variation" (70) in order to investigate the manner in which outcome may correlate with variables such as publication date, diagnostic severity, extent of family involvement, length of stay, or type of program. The only study (32) that systematically investigated some of these associations found no factors significantly predictive of differential psychopathology or social functioning outcomes. Moreover, neither this nor another study (37), both of which employed milieu-based and rehabilitative methods in both settings, found clinically significant differences in functional outcomes between the groups. The larger literature on the association between patient characteristics and successful partial hospitalization placement has been similarly unsuccessful (71).
Set against a background of rising utilization of partial hospitalization and system-wide pressures to decrease inpatient utilization, this study has systematized available knowledge on the relative merits of partial and full hospitalization as alternative systems of care from information accrued by 18 investigations. The majority of programs were eclectic or nondirective, and full hospitalization patient groups tended to have more male subjects and psychotic patients. Two-thirds of the studies had been published over a decade ago, and a similar proportion were rated to be of substandard quality. Methodological limitations of the studies and inadequacies in their reporting prevented us from accomplishing all of our original goals. Furthermore, the tendency to exclude certain categories of patients from the studies reduced the generalizability of our findings. We were further limited by the paucity of recent results on context-sensitive domains (i.e., satisfaction with services, service utilization) and insufficient research on the family burden domain. These limitations notwithstanding, the fact that our meta-analysis offered no evidence of inferior outcomes among partial hospitalization patients is noteworthy inasmuch as it represents the collective results of 40 years of research that currently stands as the only source of scientific evidence for policy makers and administrators.
Although a clearer definition of the role of partial hospitalization has been advocated for the past 30 years (9), widely varying programs serving patients with widely varying diagnoses, illness severity, and treatment needs continue to be classified as forms of partial hospitalization. A joint effort among professional associations, regulatory agencies, and the health care industry aimed at clarifying patients served and types of services delivered by partial hospitalization programs is in order. Such an effort will not only improve the process of allocating patients and health care dollars but will also facilitate the gathering of critical health utilization information and the evaluation of "usual partial care."
The changing nature of inpatient care will likely affect the process and outcome expectations for future partial hospitalization programs and possibly lead to a larger role for interventions typically considered "medical" (15). Thus, future studies may not assess domains that require longer lengths of stay. Symptom reduction, community tenure, and satisfaction with services, key endpoints for the evaluation of acute care (72), may suffice as outcome indicators. In addition, the purportedly longer-than-optimal duration of partial hospitalization admissions (20) may justify tracking lengths of stay and discharge rates for evaluative purposes.
Future research should attempt to avoid the methodological pitfalls that have beset the investigations reviewed, thus enhancing the strength of the evidence. In this respect, the investigators behind the next generation of studies ought to describe their methods and present their data with the level of detail that allows readers to fully comprehend the science underlying the results. In addition, future studies ought to broaden the scope of their inquiry. Thus, more research needs to be conducted to identify the variables that predict successful partial care placement for patients requiring intensive services. Also, possible associations between patient, illness, or program characteristics and outcome need to be systematically investigated. The relative cost-effectiveness of partial and full hospitalization programs should be studied in a way that takes into account the likelihood that a fraction of partial hospitalization patients will require transfer to an inpatient facility. In addition, more research needs to be conducted on the comparative effectiveness and cost-effectiveness of standard full hospitalization and a package of services that includes a brief inpatient admission supplemented by acute partial hospitalization. Finally, user satisfaction with partial and full hospitalization programs needs to be ascertained under current conditions, heavily influenced by forces and constraints that were much less relevant two decades ago.
Most of the nonrandomized studies failed to report whether patients had been excluded, a highly possible scenario for observational studies in which the partial care program may have had built-in, programmatic exclusions. It is then possible that the overall pool of partial hospitalization patients may have been less severely ill, or otherwise more treatment-responsive, than the pool of inpatients, which poses an unascertainable threat to the validity of our findings.
As with other meta-analyses, data extraction bias cannot be ruled out. However, the fact that our conclusion of no difference is in keeping with most studies’ results and that we arrived at this conclusion despite some of the studies’ differential findings, largely confined to our category of partial measures, provides some assurance that this potential source of bias did not materially affect our results.
Because patients may have contributed to more than one measure within a given outcome domain and time period, our estimates of the measures’ precision may be overstated. Furthermore, although we employed a random-effects framework for combining study effects, the method we used to estimate between-study variation (DerSimonian and Laird) does not fully account for the uncertainty associated with that variance component. However, because our results were largely suggestive of nondifferential treatment effects, these methodological issues do not invalidate our findings.
Received Dec. 16, 1999; revisions received May 30 and Aug. 14, 2000; accepted Oct. 24, 2000. From the Department of Psychiatry, The Cambridge Hospital; the Department of Health Care Policy, Harvard Medical School, Boston; and the Department of Biostatistics, Harvard School of Public Health, Boston. Address reprint requests to Dr. Horvitz-Lennon, Department of Psychiatry, The Cambridge Hospital and Harvard Medical School, 26 Central St., Somerville, MA 02143; firstname.lastname@example.org (e-mail). Supported in part by the Department of Psychiatry, Johns Hopkins Medical School. The authors thank James Tonascia, Ph.D., for statistical assistance and Debbie Collins for programming support.