Nonequivalent Pretest-Posttest Control Group Design
There are many research situations in the social, clinical, and behavioral sciences where groups are found intact or where subjects are self-selected. The former case is common in a clinic or school where patients or students belong to fixed groups or classes. The latter case will apply when attribute variables are studied or when volunteers are recruited. The nonequivalent pretest-posttest control group design (see Figure 11.6) is similar to the pretest-posttest experimental design, except that subjects are not assigned to groups randomly. This design can be structured with one treatment group and one control group or with multiple treatment and control groups.
Figure 11.6 Nonequivalent pretest-posttest control group design.
Example of a Nonequivalent Pretest-Posttest Control Group Design Based on Intact Groups
A study was done to determine the effectiveness of an individualized physical therapy intervention in treating neck pain.17 One treatment group of 30 patients with neck pain completed physical therapy treatment. The control group of convenience was formed by a cohort of 27 subjects who also had neck pain but, for various reasons, did not receive treatment. There were no significant differences between groups in demographic data or in initial scores on the outcome measures. A physical therapist delivered the intervention to the treatment group based on a clinical decision-making algorithm. Treatment effectiveness was examined by assessing changes in range of motion, pain, endurance, and function. Both the treatment and control groups completed the initial and follow-up examinations, with an average of 4 weeks between tests.
Example of a Nonequivalent Pretest-Posttest Control Group Design Based on Subject Preferences
A study was designed to examine the influence of regular participation in chair exercises on postoperative deconditioning following hip fracture.18 Subjects were distinguished by their willingness to participate, and were not randomly assigned to groups. A control group received usual care following discharge. Physiological, psychological, and anthropometric variables were measured before and after intervention.
In the study of therapy for neck pain, the patients are members of intact groups by virtue of their diagnosis. In the chair exercise study, subjects self-selected their group membership.
Although the nonequivalent pretest-posttest control group design is limited by the lack of randomization, it still has several strengths. Because it includes a pretest and a control group, there is some control over history, testing and instrumentation effects. The pretest scores can be used to test the assumption of initial equivalence on the dependent variable, based on average scores and measures of variability. The major threat to internal validity is the interaction of selection with history and maturation. For instance, if those who chose to participate in chair exercises were stronger or more motivated patients, changes in outcomes may have been related to physiological or psychological characteristics of subjects. These characteristics could affect general activity level or rate of healing. Such interactions might be mistaken for the effect of the exercise program. These types of interactions can occur even when the groups are identical on pretest scores.
Analysis of Nonequivalent Pretest-Posttest Designs. Several statistical methods are suggested for use with nonequivalent groups, including the unpaired t-test (with two groups), analysis of variance, analysis of covariance, analysis of variance with matching, and analysis of variance with gain scores. Ordinal data can be analyzed using the Mann-Whitney U test. Nonparametric tests may be more appropriate with nonequivalent groups, as variances are likely to be unequal. Preference for one approach depends in large part on how the groups were formed and what steps the researcher can take to ensure or document initial equivalence. Tests such as the t-test, analysis of variance, and chi-square are often used to test for differences in baseline measures. Regression analysis or discriminant analysis may be the most applicable approach for determining how the dependent variable differentiates the treatment groups while adjusting for other variables. Statistical strategies must include mechanisms for controlling for group differences on potentially confounding variables.
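Two of these options can be sketched briefly in Python. The scores below are hypothetical, and `scipy` is assumed to be available; the sketch tests initial equivalence on the pretest with an unpaired t-test, then compares gain scores between groups, with the Mann-Whitney U test as a nonparametric alternative:

```python
# Sketch: analyzing a nonequivalent pretest-posttest design with gain scores.
# All scores are hypothetical, purely for illustration.
from scipy import stats

# Pretest and posttest scores for a treatment group and a control group
treat_pre  = [42, 38, 45, 40, 36, 44, 41, 39]
treat_post = [55, 50, 58, 53, 49, 57, 54, 51]
ctrl_pre   = [41, 37, 44, 39, 38, 43, 40, 42]
ctrl_post  = [44, 40, 47, 42, 41, 46, 43, 45]

# Step 1: test initial equivalence on the pretest (unpaired t-test)
t_base, p_base = stats.ttest_ind(treat_pre, ctrl_pre)

# Step 2: compare gain scores (post minus pre) between groups
treat_gain = [post - pre for pre, post in zip(treat_pre, treat_post)]
ctrl_gain  = [post - pre for pre, post in zip(ctrl_pre, ctrl_post)]
t_gain, p_gain = stats.ttest_ind(treat_gain, ctrl_gain)

# Nonparametric alternative for ordinal data: Mann-Whitney U on the gains
u_stat, p_u = stats.mannwhitneyu(treat_gain, ctrl_gain)

print(f"baseline equivalence: p = {p_base:.3f}")
print(f"gain score comparison: p = {p_gain:.4f}")
```

A nonsignificant baseline test supports (but does not prove) initial equivalence on the dependent variable; the gain score comparison then addresses the treatment effect.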
Another strategy for comparing treatments involves the use of historical controls who received a different treatment during an earlier time period.
Example of a Nonequivalent Design Based on Historical Controls
Concern exists that prednisone-free maintenance immunosuppression in kidney transplant recipients will increase acute and/or chronic rejection. Over a 5-year period from 1999 to 2004, researchers worked with 477 kidney transplant recipients who discontinued prednisone on postoperative day 6 and then followed a regimen of immunosuppressive therapy.19 The outcomes were compared with those of 388 historical controls from the same institution (1996 to 2000) who did not discontinue prednisone. Outcomes included changes in serum creatinine levels, weight, and cholesterol, as well as patient and graft survival rates.
As this example illustrates, a nonconcurrent control group may best serve the purpose of comparison when ethical concerns preclude a true control group. When the researcher truly believes that the experimental intervention is more effective than standard care, the use of historical controls provides a reasonable alternative.20 This approach has been used in cancer trials, for example, when protocols in one trial act as a control for subsequent studies.21 The major advantage of this approach is its efficiency. Because all subjects are assigned to the experimental condition, the total sample can be smaller and results can be obtained in a shorter period of time.
The disadvantages of using historical controls must be considered carefully, however. Studies that have compared outcomes based on historical controls versus randomly allocated controls have found positive treatment effects with historical controls that randomized trials have not been able to replicate.22,23 The most obvious problem, therefore, is the potential for confounding because of imbalances in characteristics of the experimental and historical control groups. For this approach to work, then, the researcher must be diligent in establishing a logical basis for group comparisons. This means that the historical controls should not simply be any patients described in the literature, or those treated at another time or another clinic.20,24 It is reasonable, however, as in the kidney transplant example, to consider using groups that were treated within the same environment, under similar conditions, where records of protocols were kept and demographics of subjects can be obtained. This approach may prove useful as large clinical databases are accumulated within a given treatment setting.
Analysis of Designs with Historical Controls. Researchers often use the independent samples t-test to compare current subjects with historical subjects, although this approach is inherently flawed because equivalence between the groups cannot be assumed. The Mann-Whitney U test may be used with ordinal data. Chi-square will allow the researcher to determine if there is an association between categorical variables. Multiple regression, logistic regression, or discriminant analysis can be done, using group membership as a variable, to analyze differences between the groups while accounting for other variables.
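The chi-square approach can be sketched as follows (Python with `scipy`; the counts are hypothetical and are not taken from the transplant study). The test asks whether cohort membership, current subjects versus historical controls, is associated with a categorical outcome such as graft survival:

```python
# Sketch: chi-square test of association between cohort and a categorical
# outcome. Counts are hypothetical, purely for illustration.
from scipy.stats import chi2_contingency

# Rows: current cohort, historical controls
# Columns: graft surviving, graft failed
table = [[450, 27],
         [340, 48]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
```

A significant result indicates an association between cohort and outcome, but with historical controls it cannot distinguish a treatment effect from secular changes in patient mix or care between the two time periods.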
Nonequivalent Posttest-Only Control Group Design
Nonequivalent designs are less interpretable when only posttest measures are available. The nonequivalent posttest-only control group design (see Figure 11.7), also called a static group comparison,25 is a quasi-experimental design that can be expanded to include any number of treatment levels, with or without a control group. This design uses existing groups who have and have not received treatment.
Figure 11.7 Nonequivalent posttest-only control group design.
Example of a Nonequivalent Posttest-Only Control Group Design
Researchers were interested in studying the effects of a cardiac rehabilitation program on self-esteem and mobility skill in 152 patients who had undergone cardiac surgery.26 They studied 37 subjects who participated in a 2-month exercise program and another 115 subjects who chose not to attend the program, forming the control group. Measurements were taken at the end of the 2-month study period. Outcomes were based on the Adult Source of Self-esteem Inventory and the New York Heart Association Classification.
To draw conclusions from this comparison, we would have to determine if variables other than the exercise program could be responsible for outcomes. Confounding factors should be identified and analyzed in relation to the dependent variable. For instance, in the cardiac rehabilitation example, researchers considered the subjects' age, years of education and occupational skill.26 Although the static group comparison affords some measure of control in that there is a control group, internal validity is severely threatened by selection biases and attrition. This design is inherently weak because it provides no evidence of equivalence of groups before treatment. Therefore, it should be used only in an exploratory capacity, where it may serve to generate hypotheses for future testing. It is essentially useless in the search for causal relationships.
Analysis of Nonequivalent Posttest-Only Designs. Because this design does not allow interpretation of cause and effect, the most appropriate analysis is a regression approach, such as discriminant analysis. Essentially, this design allows the researcher to determine if there is a relationship between the presence of the group attribute and the measured response. An analysis of covariance may be used to account for the effect of confounding variables.
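The covariate-adjusted comparison can be sketched with ordinary least squares, the regression form that analysis of covariance takes (Python with `numpy`; the data, and the choice of age as the confounder, are hypothetical):

```python
# Sketch: adjusting a posttest-only group comparison for a confounder
# using ordinary least squares. All values are hypothetical.
import numpy as np

group = np.array([1, 1, 1, 1, 0, 0, 0, 0])          # 1 = attended program
age   = np.array([58, 61, 55, 63, 70, 67, 72, 66])  # confounding variable
score = np.array([82, 78, 85, 76, 60, 64, 57, 66])  # posttest outcome

# Design matrix: intercept, group indicator, covariate
X = np.column_stack([np.ones_like(age, dtype=float), group, age])
coef, *_ = np.linalg.lstsq(X, score.astype(float), rcond=None)

# coef[1] is the group difference adjusted for age;
# coef[2] is the within-group slope of score on age
print(f"adjusted group effect = {coef[1]:.2f}")
```

Because the group indicator and the covariate sit in the same model, the group coefficient estimates the difference in outcome after removing the portion attributable to the confounder; it describes an association, not a causal effect, for the reasons stated above.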
COMMENTARY Generalization to the Real World
In the pursuit of evidence-based practice, clinicians must be able to read research literature with a critical eye, assessing not only the validity of the study's design, but also the generalizability of its findings. Published research must be applicable to clinical situations and individual patients to be useful. The extent to which any study can be applied to a given patient is always a matter of judgment.
Purists will claim that generalization of intervention studies requires random selection and random assignment—in other words, a randomized controlled trial (RCT). In this view, quasi-experimental studies are less generalizable because they do not provide sufficient control of extraneous variables. In fact, such designs may be especially vulnerable to all the factors that affect internal validity.
One might reasonably argue, however, that the rigid structure and rules of the RCT do not represent real-world situations, making it difficult for clinicians to apply or generalize research findings. The results of an RCT may not apply to a particular patient who does not meet inclusion or exclusion criteria, or who cannot be randomly assigned to a treatment protocol. Many of the quasi-experimental models provide an opportunity to look at comparisons in a more natural context. In the hierarchy of evidence that is often used to qualify the rigor and weight of a study's findings, the RCT is considered the highest level. But the quasi-experimental study should not be dismissed; it can be a valuable source of information. As with any study, however, it is the clinician's responsibility to judge the applicability of the findings to the individual patient.