The statistical procedures that have been described thus far have focused on comparisons, generally applied to experimental and quasi-experimental designs. We will now begin to examine procedures for exploratory analysis, where the purpose of the research question is to evaluate the relationship between two or more measured variables. Where studies of group differences ask if group A is different from group B, measures of association ask, “What is the relationship between A and B?”
Many such research questions involve categorical variables that are measured on a nominal or ordinal scale. Rather than using means, these questions deal with analysis of proportions or frequencies within various categories. The purpose of this chapter is to describe the use of several nonparametric tests based on the chi-square distribution. These tests have many applications in clinical research, in both experimental and descriptive analysis, including tests of goodness of fit to determine if a set of observed frequencies differs from a theoretical distribution and tests of independence to determine if two classification variables are related to each other.
Many research questions in clinical and behavioral science involve categorical variables that are measured on a nominal or ordinal scale. Such data are analyzed by determining if there is a difference between the proportions observed within a set of categories and the proportions that would be expected by chance.
For example, suppose we tossed a coin 100 times. The null hypothesis states that the coin is not biased, and therefore we would theoretically expect a 50:50 outcome—50 heads and 50 tails. But we actually observe 47 heads and 53 tails. Does this deviation from the null hypothesis occur because the coin is biased, or is it only a matter of chance? In other words, is the difference between the observed and expected frequencies sufficiently large to justify rejection of the null hypothesis?
The chi-square statistic, χ², tests the difference between observed and expected frequencies:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O represents the observed frequency and E represents the expected frequency. As the difference between observed and expected frequencies increases, the value of χ² will increase. If observed and expected frequencies are the same, χ² will equal zero.
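The behavior of the statistic described above can be illustrated with a short Python sketch. The function name `chi_square` and the frequencies used here are hypothetical, chosen only to show that the statistic is zero when observed and expected frequencies agree and grows as they diverge.

```python
def chi_square(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies for two categories with a 50:50 expectation.
expected = [50, 50]

no_deviation = chi_square([50, 50], expected)     # observed equals expected
small_deviation = chi_square([45, 55], expected)  # each category is off by 5
large_deviation = chi_square([40, 60], expected)  # each category is off by 10

print(no_deviation, small_deviation, large_deviation)  # 0.0 1.0 4.0
```

Because each deviation is squared, doubling the deviation in every category quadruples the statistic, which is why the third value is four times the second.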
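Using the coin example, we calculate χ² by substituting values for each category.

$$\text{Heads: } \frac{(47 - 50)^2}{50} = \frac{9}{50} = 0.18 \qquad \text{Tails: } \frac{(53 - 50)^2}{50} = \frac{9}{50} = 0.18$$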
The sum of these terms for all categories is the value of χ². Therefore,

$$\chi^2 = 0.18 + 0.18 = 0.36$$
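The coin calculation can be checked with a few lines of Python. This is a minimal sketch using the observed and expected frequencies from the example; the variable names are illustrative only.

```python
# Observed and expected frequencies from the coin-toss example (100 tosses).
observed = {"heads": 47, "tails": 53}
expected = {"heads": 50, "tails": 50}

# One (O - E)^2 / E term per category.
terms = {
    category: (observed[category] - expected[category]) ** 2 / expected[category]
    for category in observed
}

# The chi-square statistic is the sum of the per-category terms.
chi_sq = sum(terms.values())

for category, term in terms.items():
    print(f"{category}: {term:.2f}")
print(f"chi-square = {chi_sq:.2f}")
```

Each category contributes 0.18, so the printed total is 0.36.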
We analyze the significance of this value using critical values of χ² found in Appendix Table A-5. Along the top of the table we identify the desired α level, ...