Many statistical procedures, such as the t-test, analysis of variance, and linear regression, are based on assumptions about homogeneity of variance and normality that should be met to ensure the validity of the test. Although most parametric statistical procedures are considered robust to moderate violations of these assumptions, some modification to the analysis is usually necessary when departures are striking. When this occurs, the researcher can choose one of two approaches. The analytic procedure can be modified, by using nonparametric statistics or nonlinear regression, or the dependent variable, X, can be transformed to a new variable, X', that more closely satisfies the necessary assumptions. The new variable is created by changing the scale of measurement for X. In this appendix we introduce five approaches to data transformation.
The three most common reasons for using data transformation are to satisfy the assumption of homogeneity of variance, to conform data to a normal distribution, and to create a more linear distribution that will fit the linear regression model. Fortunately, the same transformation will often accomplish more than one of these goals.1
The most commonly used transformations are the square root transformation, the square transformation, the log transformation, the reciprocal transformation, and the arc sine transformation. The choice of which method to use will depend on characteristics of the data. Before we describe the guidelines for using each of these approaches, it may be helpful to illustrate the transformation process using the square root transformation.
The square root transformation replaces each score in a distribution with its square root. This method is most appropriate when variances are roughly proportional to group means, that is, when the ratio s²/X̄ is similar for all samples. The square root transformation will typically have the effect of equalizing variances.
TABLE D.1 EFFECT OF SQUARE ROOT TRANSFORMATION
| | Original Data (X) | | Transformed Data (√X) | |
| | A | B | A | B |
| | 1 | 8 | 1.00 | 2.83 |
| | 3 | 7 | 1.73 | 2.65 |
| | 8 | 12 | 2.83 | 3.46 |
| | 6 | 5 | 2.45 | 2.24 |
| | 2 | 18 | 1.41 | 4.24 |
| Σ | 20 | 50 | 9.42 | 15.42 |
| X̄ | 4 | 10 | 1.88 | 3.08 |
| s² | 8.5 | 26.5 | .56 | .61 |
| s²/X̄ | 2.125 | 2.65 | | |
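The summary rows of Table D.1 can be checked with a short script. The following is a minimal sketch in Python using only the standard library; the names `groups`, `describe`, and `summary` are illustrative and do not come from the text. It assumes the variances in the table are sample variances (n − 1 in the denominator), which agrees with the tabled values of 8.5 and 26.5. Because the script works from unrounded square roots, a result may differ in the last decimal place from a value computed from the rounded scores shown in the table.

```python
import math
from statistics import mean, variance

# Scores for groups A and B, copied from Table D.1.
groups = {
    "A": [1, 3, 8, 6, 2],
    "B": [8, 7, 12, 5, 18],
}


def describe(scores):
    """Return (mean, sample variance, variance-to-mean ratio) for a list of scores."""
    m = mean(scores)
    s2 = variance(scores)  # sample variance, n - 1 denominator (assumed)
    return m, s2, s2 / m


summary = {}
for name, scores in groups.items():
    roots = [math.sqrt(x) for x in scores]  # square root transformation
    summary[name] = {"original": describe(scores), "sqrt": describe(roots)}

for name, stats in summary.items():
    m, s2, ratio = stats["original"]
    print(f"{name} original: mean={m:.2f}  s2={s2:.2f}  s2/mean={ratio:.3f}")
    m_t, s2_t, _ = stats["sqrt"]
    print(f"{name} sqrt:     mean={m_t:.2f}  s2={s2_t:.2f}")

# Ratio of the larger to the smaller variance, before and after transformation.
# A value closer to 1 indicates more nearly equal variances.
ratio_before = summary["B"]["original"][1] / summary["A"]["original"][1]
ratio_after = summary["B"]["sqrt"][1] / summary["A"]["sqrt"][1]
print(f"variance ratio B/A: before={ratio_before:.2f}  after={ratio_after:.2f}")
```

Comparing the two variance ratios is one simple way to see whether a transformation has brought the group variances closer together; it is a descriptive check, not a formal test of homogeneity of variance.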
On the right side of Table D.1, each score in both distributions is transformed to its square root. As we can see, the effect of this transformation is a reduction in the discrepancy between the two variances; ...