Placebo effects are among the most poorly understood issues in medicine. The power of nonspecific effects in medical treatments is recognized theoretically but is often ignored or denied in practice. Placebo effects are the bane of clinical trials, which must demonstrate that the studied treatment effect significantly exceeds the nonspecific effects in improvement of the treated condition. But placebo effects also positively influence and enhance patient outcomes, mimic or reverse effects of active drugs, and can be superior to standard treatments for some conditions (1, 2).
Placebo effects in the treatment of premenstrual syndrome (PMS) and its severe form, termed premenstrual dysphoric disorder in DSM-IV, have generally been viewed as troublesome and masking responses to possible treatments. However, to our knowledge there has been no investigation of the onset, duration, and degree of symptom relief that is experienced by placebo-treated patients in randomized controlled trials, issues that are relevant for study designs and for patient management in clinical practice.
A recent meta-analysis of controlled treatment trials for premenstrual dysphoric disorder showed rates of placebo response ranging from 6% to 35% (3). Earlier PMS studies indicated rates of placebo response as high as 94% (4). The high placebo response rates may result in part from differences in subject selection, but such rates are also consistent with the estimated power of placebos in natural clinical settings, where both clinicians and patients expect a positive outcome (5). In contrast to natural clinical settings, double-blind studies may not fully reflect the power of placebos for various reasons, but particularly if the study treatments are devoid of the "optimism, conviction and persuasive abilities of the skilled physician" (5).
A long-held but not well-supported view in medicine is that a placebo response is brief and limited to partial improvement that does not exceed one-third of the possible relief obtained with proven treatments. While brief and partial responses are clearly observed in clinical trials, investigation of the extent of sustained placebo improvement is important for clinical trial designs and for better understanding of the non-drug-related improvements in clinical care.
To obtain information on the frequency and duration of placebo responses, we examined the responses to placebo medication of patients with clearly defined PMS who were randomly assigned in double-blind treatment protocols. On the basis of clinical observations, we hypothesized that a subgroup would show a sustained response, another subgroup would experience only partial improvement, and the remaining subjects would experience little or no improvement in response to placebo medication. A secondary aim was to investigate the standard medical and background variables in the diagnostic evaluation of PMS for possible predictors of placebo response.
The subjects in this investigation were pooled from two clinical trials, each with three parallel treatment arms and identical designs: a double-blind trial of oral progesterone, alprazolam, and placebo for treatment of severe PMS in 185 subjects (6) and a comparison of serotonergic and noradrenergic antidepressants for treatment of PMS in 145 subjects (7). In both studies, significant placebo-drug differences in treatment response were demonstrated. In the first study, alprazolam was significantly better than placebo or progesterone (6). In the second study, sertraline was significantly better than placebo or desipramine (7).
The subjects in both studies were unimproved according to the study criteria after single-blind placebo medication and were randomly assigned to 3 months of double-blind treatment. Of 514 subjects who entered the single-blind placebo cycle before random assignment in the two studies, 35% discontinued treatment in the single-blind phase—18% because of improvement as reported by the subjects or as defined by the study criteria and 17% for other reasons. In this report of subjects randomly assigned to the placebo arms (N=101), 55 subjects were from study 1 and 46 subjects were from study 2.
Before the two placebo groups were combined, the mean premenstrual scores on the Daily Symptom Report (8) were compared, and they showed no significant differences in any of the 6 study months. At month 1 the mean scores were 138 (SD=75) for study 1 (N=55) (6) and 155 (SD=77) for study 2 (N=46) (unpublished data) (t=1.15, df=99, p=0.25). At month 6 the mean Daily Symptom Report scores were 111 (SD=58) and 118 (SD=94), respectively (t=0.35, df=79, p=0.73). Demographic background variables did not differ between the two study groups.
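As an illustration of the month 1 comparison above, the following minimal sketch computes a pooled-variance Student t statistic from the reported means, standard deviations, and group sizes. The function name `pooled_t` is hypothetical, and the equal-variance form is an assumption (the text reports df=99, which is consistent with it). Because the inputs are rounded summary values, the result is only approximately equal to the reported t=1.15.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student t statistic (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df

# Month 1 premenstrual scores: study 1 (N=55) versus study 2 (N=46).
t_stat, df = pooled_t(138, 75, 55, 155, 77, 46)
print(f"t = {t_stat:.2f}, df = {df}")
```

The small gap between this value and the published statistic is what one would expect when working from rounded means and standard deviations rather than the subject-level data.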
The study designs for the two groups were identical and consisted of six menstrual cycles in four study phases.
1. Prescreening—one cycle of daily symptom ratings before the first clinical visit.
2. Screening—one cycle of evaluation with no medications and two clinical visits, one postmenstrual and one premenstrual.
3. Placebo lead-in—one cycle of single-blind administration of placebo capsules.
4. Randomized, double-blind treatment—three cycles of placebo capsules.
The studies were approved by the Institutional Review Board of the University of Pennsylvania. After complete description of the study to the subjects, written informed consent was obtained.
All subjects were 18–45 years of age, had regular menstrual cycles of 22 to 35 days, had experienced PMS for at least 1 year, reported moderate to severe interference with work, relationships, or social activities, and were in overall good health as determined by medical history, physical examination, laboratory tests of blood count, and complete blood chemistry with differential. The subjects had no current major psychiatric illness as determined by the Structured Clinical Interview for DSM-IV (SCID) (9) or the SCID for DSM-III-R (10), used for the earlier study group. According to the SCID interviews, 48% of the subjects met the DSM criteria for one or more previous illnesses: major depression (40%), alcohol or drug abuse or dependency (17%), eating disorders (3%), panic disorder (3%), manic syndrome (3%), or dysthymia (1%). The subjects took no concomitant treatments for PMS during the study.
The PMS criteria for study eligibility were a premenstrual total score of 70 or higher on the Daily Symptom Report with an increase of 50% or more from the postmenstrual total score, premenstrual symptom interference with functioning at a moderate to severe level as rated by the subjects, no current diagnosis of a major mental disorder as determined by SCID interview (9, 10), and confirmation of symptom status by daily ratings in the 3-month screening period. Subjects who did not continue to meet these criteria after the single-blind placebo month were not assigned to double-blind treatment. In addition to meeting these criteria, 71% of the subjects met the DSM-IV criteria for premenstrual dysphoric disorder. Those who did not meet those criteria had a mean number of 2.3 qualifying symptoms of premenstrual dysphoric disorder, rather than the required five symptoms.
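The score-based part of these eligibility criteria can be sketched as a simple check. This is an illustrative helper only: the name `meets_symptom_criteria` is hypothetical, and it assumes that the "increase of 50% or more" is computed relative to the postmenstrual total. The interference rating, SCID findings, and confirmation across the screening period are not modeled.

```python
def meets_symptom_criteria(premenstrual_total, postmenstrual_total):
    """Check only the Daily Symptom Report score criteria for eligibility."""
    # Criterion 1: premenstrual total score of 70 or higher.
    if premenstrual_total < 70:
        return False
    # Criterion 2 (assumed form): premenstrual total at least 50% above
    # the postmenstrual total, i.e. pre >= 1.5 * post.
    return premenstrual_total >= 1.5 * postmenstrual_total

print(meets_symptom_criteria(140, 60))  # high premenstrual score, large increase
print(meets_symptom_criteria(90, 70))   # score is high, but the increase is under 50%
```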
The outcome measure for this report was the premenstrual score on the Daily Symptom Report, a 17-item scale with reported reliability and validity (8). The symptoms covered by the Daily Symptom Report are irritability, anxiety and nervous tension, mood swings, feeling out of control, feeling worthless or hopeless, difficulty concentrating or confusion, swelling, depression, fatigue, insomnia, decreased interest, poor coordination, food craving, headache, aches, breast tenderness, and cramps. The subjects rated the symptoms daily throughout screening and treatment by using a 5-point scale (0 for none, 4 for severe) with descriptors for each rating. Scores were calculated for each menstrual cycle by summing the ratings of cycle days 5–10 for the postmenstrual score (day 1 is the first day of menses) and the ratings of the last 6 days of the cycle for the premenstrual score.
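The scoring rule described above can be sketched as follows. The data layout (one list of 17 item ratings per cycle day, with the first entry being day 1 of menses) and the names `cycle_scores` and `N_ITEMS` are assumptions made for illustration, not part of the published instrument.

```python
N_ITEMS = 17  # number of Daily Symptom Report items

def cycle_scores(daily_ratings):
    """Return (postmenstrual, premenstrual) totals for one menstrual cycle.

    daily_ratings: list of days; each day is a list of 17 ratings from 0 to 4.
    Index 0 corresponds to cycle day 1 (first day of menses).
    """
    if len(daily_ratings) < 16:
        raise ValueError("cycle too short for non-overlapping scoring windows")
    day_totals = []
    for day in daily_ratings:
        if len(day) != N_ITEMS or any(r not in range(5) for r in day):
            raise ValueError("each day needs 17 ratings between 0 and 4")
        day_totals.append(sum(day))
    postmenstrual = sum(day_totals[4:10])  # cycle days 5-10
    premenstrual = sum(day_totals[-6:])    # last 6 days of the cycle
    return postmenstrual, premenstrual

# Hypothetical 28-day cycle: symptom-free until the last 6 days,
# then every item rated 2 on each premenstrual day.
cycle = [[0] * N_ITEMS for _ in range(22)] + [[2] * N_ITEMS for _ in range(6)]
post, pre = cycle_scores(cycle)
print(post, pre)
```

Under this scheme the maximum possible score for either 6-day window is 17 items × 4 points × 6 days = 408.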
Definition of Improvement
Improvement in this study was defined as a decrease of at least 50% in the total premenstrual Daily Symptom Report score from the first month of daily symptom ratings (the prescreen month) and was calculated for each menstrual cycle in the study. Three improvement groups were defined. Sustained improvement was defined as improvement lasting 2–4 consecutive months, and 20 subjects met this criterion, 18 of whom were improved for at least 3 of the 4 placebo months; all of these 20 were improved in the last 3 months. Partial improvement was 30% to 49% improvement in any month or at least 50% improvement in 1 month; 42 subjects fell into this category, 15 of whom were improved at least 50% for 1 month only and 27 of whom were improved 30% to 49% in 1 or more months. Unimproved was defined as a less than 30% improvement throughout the study; 39 subjects were classified as unimproved.
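The three-group definition above can be sketched as a small classifier. Several points are assumptions rather than statements from the study: the function names are hypothetical, the classification is applied to the four placebo months (months 3–6), and a subject with 50% or greater improvement in two or more nonconsecutive months is placed in the partial group, a case the definition does not address explicitly.

```python
def percent_improvement(baseline, score):
    """Percentage decrease from the prescreen (month 1) premenstrual score."""
    return 100.0 * (baseline - score) / baseline

def classify(baseline, placebo_scores):
    """Classify one subject from her premenstrual scores in the placebo months."""
    pct = [percent_improvement(baseline, s) for s in placebo_scores]
    # Find the longest run of consecutive months with >= 50% improvement.
    longest = run = 0
    for p in pct:
        run = run + 1 if p >= 50 else 0
        longest = max(longest, run)
    if longest >= 2:
        return "sustained"
    # Assumed: any month at 30% or more (including a single month at
    # >= 50%, or nonconsecutive months at >= 50%) counts as partial.
    if any(p >= 30 for p in pct):
        return "partial"
    return "unimproved"

# Hypothetical subjects with a month 1 premenstrual score of 150.
print(classify(150, [140, 70, 60, 55]))     # three consecutive improved months
print(classify(150, [140, 100, 120, 130]))  # one month at about 33% improvement
print(classify(150, [140, 130, 120, 110]))  # never reaches 30% improvement
```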
The change from the prescreening baseline in each subsequent cycle was calculated for the premenstrual Daily Symptom Report scores. The change scores were analyzed by using repeated measures analysis with group (the three improvement groups), time, and the interaction of group and time. The repeated measures analyses were also conducted by using the actual premenstrual Daily Symptom Report scores and the percentage change from baseline in each of the six menstrual cycles. The results were consistent in all analyses. Tukey’s Studentized range test and least squares means were examined in pairwise comparisons. Results are reported for endpoint, with the last observation for each discontinuer carried forward. Completer analysis yielded the same results. Diagnostic and demographic variables were tested with analysis of variance, Student t test, and chi-square test as appropriate for the variable. Probability values of 0.05 or less in two-tailed tests were considered significant. The SAS statistical package (11) was used for the computer analyses.
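The endpoint rule mentioned above (last observation carried forward) can be illustrated with a short sketch. The representation of missing cycles as `None` and the function names are assumptions for illustration; the study's analyses were run in SAS, and this is not a reconstruction of that code.

```python
def carry_forward(scores):
    """Fill cycles missing after discontinuation with the last observed score."""
    filled, last = [], None
    for s in scores:
        if s is not None:
            last = s
        filled.append(last)
    return filled

def change_from_baseline(scores):
    """Change in each later cycle relative to the prescreen (first) cycle."""
    filled = carry_forward(scores)
    baseline = filled[0]
    return [s - baseline for s in filled[1:]]

# Hypothetical subject who discontinued after month 4 of 6.
scores = [150, 160, 140, 90, None, None]
print(carry_forward(scores))         # months 5 and 6 take the month 4 value
print(change_from_baseline(scores))  # negative values indicate improvement
```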
Percent of Subjects Improved and Timing of Improvement
Table 1 shows the number of subjects improved during placebo medication administered in study months 3–6. By the criterion of a 50% decrease in premenstrual symptom scores from cycle 1, the largest number of subjects first reached this improvement level in the first month of double-blind treatment, when 17% reported initial improvement. The maximum number of improved subjects overall (29%) was reached in the second month of double-blind treatment (month 5).
According to the definitions of improvement described in the Method section, 20% (20 of 101) of the subjects had sustained improvement. Another 42% partially improved, and 39% were clearly unimproved throughout the 6 study months. Table 2 shows the symptom scores of the three improvement groups in each of the 6 months and the total group score in each month, which indicates the overall placebo response during the study period. As expected, the interaction of time and group for the Daily Symptom Report scores was significant (Table 2); improvement for the subgroup with sustained improvement was significantly greater than that of either the group with partial improvement or the unimproved group throughout double-blind treatment.
Analysis of the Daily Symptom Report change scores (Figure 1) and the percentage change from baseline (data not shown) showed similar results. The addition of premenstrual dysphoric disorder (yes/no) as a covariate in the repeated measures model did not alter the results; there was no significant interaction between premenstrual dysphoric disorder and improvement with placebo.
Changes in Severity During Screening Period
The mean Daily Symptom Report change score significantly worsened in month 2 (screening) and returned to the initial baseline level in month 3 (single-blind placebo phase). When the changes in symptom severity during the screening period in the three improvement groups were compared, the unimproved and partial improvement groups accounted for the observed worsening in the second month (Figure 1 and Table 3). In contrast, the subjects with sustained improvement showed slight improvement in the second month, which increased in the third month. However, this improvement differed significantly only from the change score for the unimproved group and not from that of the group with partial improvement. This result was further confirmed by review of the frequency of improvement in each group after the single-blind placebo phase (month 3), when nine subjects improved at least 50%: five in the sustained improvement group, four in the partial improvement group, and none in the unimproved group (χ²=10.21, df=2, p<0.01). However, it can be clearly seen that the group with sustained improvement did not differ substantially from those with partial improvement at the single-blind placebo time point (Table 3). It is important to note that the subjects with partial improvement (improvement in no more than one cycle) in the screening period did not remain improved after the screening period.
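The chi-square statistic reported above for improvement after the single-blind placebo phase can be checked with a short sketch. It assumes the group sizes given in the Definition of Improvement section (20 sustained, 42 partial, 39 unimproved) and uses a plain Pearson chi-square without continuity correction; the function name `pearson_chi_square` is hypothetical.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: sustained, partial, unimproved groups.
# Columns: improved at least 50% at month 3, not improved.
counts = [[5, 15], [4, 38], [0, 39]]
chi2, df = pearson_chi_square(counts)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

With these counts the statistic comes out at about 10.2 on 2 degrees of freedom, in line with the value reported in the text.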
Further information on the duration of the placebo-related improvement is only suggestive and limited by the study designs. The subjects had the option of continuing treatment for an additional 3 months (double-blind) only in study 1 (6). Of the subjects in the present report with sustained improvement, 11 entered the maintenance treatment and continued for 3 additional months (double-blind) and therefore had a possible total of 7 months of improvement. The Daily Symptom Report scores of these 11 subjects did not change significantly during the additional 3 months of treatment; their mean premenstrual score was 51 at month 6 and 66 at month 9 (paired t test: t=1.13, df=10, p=0.29). Only one of the 11 improved subjects reported the return of symptoms to her baseline level.
Possible Predictor Variables
Diagnostic and demographic background variables of the three improvement groups were compared to identify possible predictors of placebo response or nonresponse. The only variable that statistically differed was age (F=2.99, df=2, 99, p<0.05). The subjects with sustained improvement were slightly younger (mean age=31 years) than those with partial improvement (mean=34 years), a significant difference in the Tukey pairwise comparison, but did not differ from the unimproved subjects (mean=34 years). Because age did not discriminate improvers from nonimprovers, further analysis was not conducted. The other variables that were examined, but did not differ among the three improvement groups, were as follows, by time of assessment:
1. In the screening cycle (month 2)
The total postmenstrual score on the Daily Symptom Report, the pre- and postmenstrual scores for the factor dimensions of the Daily Symptom Report (mood, behavior, pain, physical changes, and food cravings), the Hamilton Depression Rating Scale scores (total of 17 items and single items for depressed mood, work/interest, and psychological anxiety) (12), and the state and trait scores on the Spielberger State-Trait Anxiety Inventory (13).
2. At the end of the placebo lead-in cycle (month 3)
The clinician rating of symptom severity (Clinical Global Impression) (rated 1–7) and the subject global ratings of symptom interference in relationships and work (rated 0–4).
3. In the medical interview and history
Psychiatric history (yes/no from SCID interview), duration of PMS (years), previous medical treatment for PMS (yes/no), number of work days lost, education (high school/more than high school), employment (yes/no), number of children, and marital status (single/ever married).
Last, to address the question of whether the rate of placebo response was unique to this study group, we applied the same definition of improvement to subjects in another PMS treatment study that was not included in the analysis because of a different design. The study (14) had the same prescreening and screening months followed by 2 months of single-blind placebo administration and then 2 months each of placebo and active medication, with the order of placebo and medication randomly assigned. Of the 80 subjects assigned to the placebo arm first, 31% were improved with the placebo medication, a rate notably similar to the 29% improvement in the present study group at endpoint, as shown in Table 1. It is noteworthy that the rates of placebo response after treatment assignment were nearly identical regardless of whether there were 1 or 2 months of placebo lead-in treatment.
In this study of the frequency and duration of placebo responses in PMS treatment, 20% of the subjects sustained major improvement during the 4 months of placebo treatment. Another 42% partially improved, and the remaining 39% had no improvement as assessed by the subjects’ daily symptom ratings. These results contrast with previous reports of high rates of response to placebo in PMS studies but are consistent with findings in other randomized controlled trials for premenstrual dysphoric disorder, in which the reported placebo response rates were 34% (15), 38% (16), and 23% (17). It is also noteworthy that the placebo response rate of 29% (29 of 101 subjects) at endpoint in the present investigation was nearly identical to the endpoint rates in two other randomized, double-blind efficacy studies of PMS treatments that we have previously reported (14, 18), further suggesting that this placebo response rate is not unique to this study.
The symptom fluctuations during the screening period were of particular interest but did not strongly predict subsequent improvement among these subjects who were randomly assigned to treatment. The more important issue that appears to warrant the placebo washout period is establishing a reliable baseline. These data show that, overall, the premenstrual symptoms increased significantly in the screening cycle, which was the second month of symptom reporting and included two evaluation visits, but returned to the initial baseline in month 3 (single-blind placebo). This suggests that the initial symptom reports, which were done by the subjects before the first clinical assessment, may more accurately reflect symptom status independent of visit effects but, more important, that averaging several months of symptom reports may provide a more reliable baseline.
Possible predictors of placebo response were examined but did not differentiate the improved from the unimproved groups. Symptom severity has previously been reported as a significant predictor, with subjects who do not respond to placebo having more severe symptoms (19–22) or having longer or more chronic illnesses (23–29). However, symptom severity did not differentiate the response groups in the present study, possibly because the study required a severe symptom level and/or because the symptoms were of long duration, averaging 10 years in this study group. A previous PMS study (30) showed that placebo responders were less likely to have had prior medical treatment for PMS, and similarly, in a study of depressed patients (27), placebo responders had had less psychiatric medication previously than had the nonresponders. In the present study, previous treatment did not differentiate placebo response and nonresponse, possibly because relatively few subjects had had previous medical treatment or psychotropic medications for PMS. Brown et al. (24) found no differences between depressed patients who did and did not respond to placebo and suggested that "although clinically indistinguishable, the groups have different illnesses at the pathophysiological level," a hypothesis that continues to warrant further study.
Research suggests that placebo effects may account for as many as 70% of positive outcomes in clinical practice (5), but the mechanisms of placebo responses remain unclear. Endogenous opioids may have a role (31), and numerous other possible sources of placebo effects have been investigated: expectations of treatment, the desire for relief, classical conditioning, spontaneous recovery, regression to the mean, suggestibility, deception, and psychophysiological states that may change during the course of treatment and alter symptom reports, e.g., anxiety, arousal, and relaxation levels (1, 5, 32). Alterations in one or more of these sources may account for differences in response to identical clinical treatments.
The rate of placebo responses in this study may be an underestimate because the study included only subjects who remained eligible at the end of the screening period and were randomly assigned to the treatment phase. Another 18% of the subjects who were eligible for the single-blind placebo phase were not included, because either they said they were "improved" and withdrew consent or they did not meet the symptom severity criteria. However, these "improvers" did not fulfill the more stringent 50% criterion of the present study, and whether the "improvement" was sustained is not known. Another limitation of this study is the absence of an untreated comparison group to provide information on the natural changes over time. However, the mean duration of PMS reported by these subjects, 10 years (SD=7), fails to suggest that any sizable number of women would improve simply with time during a several-month treatment period. Other recent data (33) show that symptoms of PMS or premenstrual dysphoric disorder are stable on an individual basis and replicable across cycles, further suggesting that women with severe premenstrual distress do not improve by time alone over several months.
In summary, these results show that about 20% of PMS patients randomly assigned to clinical trials after a 3-month screening period experience sustained improvement with placebo medication. The data further suggest that subjects who improve for at least 2 months are likely to remain improved at a level similar to that achieved with drug treatment. Other subjects reported partial or brief improvement, which affects placebo response rates in short-term studies. The lack of change in the symptoms of the unimproved group for the 6-month study period is also noteworthy. This information is important for the design of randomized controlled trials. The evidence for sustained placebo response warrants further study of the role of non-drug-related improvement in clinical care.
Received Jan. 29, 1998; revisions received Aug. 13 and Nov. 9, 1998; accepted March 4, 1999. From the Departments of Obstetrics/Gynecology and Psychiatry, University of Pennsylvania Medical Center. Address reprint requests to Dr. Freeman, Department of Obstetrics/Gynecology, University of Pennsylvania Medical Center, Mudd Suite, 2 Dulles Bldg., 3400 Spruce St., Philadelphia, PA 19104; email@example.com (e-mail). Supported in part by grant HD-18633 from the National Institute of Child Health and Human Development. The authors thank Beatriz Garcia, M.A., for the computer analyses.
Changes in Premenstrual Score on Daily Symptom Report Over 6 Months for Women With PMS Who Experienced Sustained, Partial, or No Improvement (a) During Placebo Treatment (b)
(a) Improvement was defined as a decrease of at least 50% from month 1 in total premenstrual score on the Daily Symptom Report. Sustained improvement: improvement lasting 2–4 consecutive months. Partial improvement: 30%–49% improvement in any month or ≥50% improvement in 1 month. No improvement: <30% improvement throughout study.
(b) Repeated measures analysis showed a significant time-by-group interaction (F=7.38, df=8, 190, p<0.001), time effect (F=30.80, df=4, 95, p<0.001), and group effect (F=36.23, df=2, 98, p<0.001).
(c) Significant difference between sustained and no improvement (p<0.05, least squares).
(d) Significant difference between partial and no improvement (p<0.05, least squares).
(e) Significant difference between sustained and partial improvement (p<0.05, least squares).