In Chapter 5 we introduced basic concepts of reliability and described how different forms of reliability can be addressed in the planning of research protocols. The purpose of this chapter is to expand on these concepts by presenting the statistical bases for estimates of reliability, including measures of correlation, agreement, internal consistency, response stability and method comparison for alternate forms. We have waited until this point in the book to present these procedures because they require application of statistical concepts that have been covered in the preceding chapters.
RELIABILITY THEORY AND MEASUREMENT ERROR
Recall from Chapter 5 that classical reliability theory partitions an observed measurement or score, X, into two components: a true component, T, which represents the real value under ideal and infallible conditions, and an error component, E, which includes all other sources of variance that influence the outcome of measurement. This theoretical relationship is expressed in the equation
$$X = T + E$$
We can also examine the statistical nature of this relationship by restating it in terms of variance ($s^2$). The total variance within a set of observed scores ($s^2_X$) is a function of both the true variance between scores ($s^2_T$) and the variance in the errors of measurement, or error variance ($s^2_E$):
$$s^2_X = s^2_T + s^2_E$$
Although it is an unknown quantity, we assume that $s^2_T$ is fixed, because true scores will theoretically remain constant. Therefore, in a set of perfectly reliable scores, all observed differences between individual scores should be attributable to true differences between scores; that is, there is no error variance. Conversely, if we look at a set of repeated measurements from one person, and assume that the true response has not changed, then all observed variance should be the result of error. The essence of reliability, then, is based on the amount of error that is present in a set of scores. A measurement is considered more reliable if a greater proportion of the total observed variance is represented by the true score variance. Thus, reliability is defined by the ratio:
$$\text{reliability} = \frac{\text{true score variance}}{\text{true score variance} + \text{error variance}}$$
In statistical terminology, this relationship can be expressed as
$$r_{XX} = \frac{s^2_T}{s^2_T + s^2_E}$$
where $r_{XX}$ is the symbol for a reliability coefficient.
The coefficient of reliability can take values from 0.00 to 1.00. Zero reliability indicates that all measurement variation is attributed to error. Reliability of 1.00 means that the measurement has no error, or $s^2_E = 0$. As the coefficient nears 1.00, we are more confident that the observed score is representative of the true score.
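The behavior of the reliability ratio can be explored with a small simulation. The following is a minimal sketch under stated assumptions: true scores and errors are drawn from normal distributions, errors are independent of true scores, and the function names, sample size, means, and standard deviations are hypothetical choices made only for illustration. True scores cannot be observed in practice, so a direct calculation of this kind is possible only with simulated data.

```python
import random
import statistics


def simulate_scores(n, true_sd, error_sd, mean=50.0, seed=0):
    """Simulate observed scores X = T + E (hypothetical helper).

    Assumes normally distributed true scores and errors, with errors
    drawn independently of the true scores.
    """
    rng = random.Random(seed)
    true = [rng.gauss(mean, true_sd) for _ in range(n)]
    error = [rng.gauss(0.0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true, error)]
    return true, error, observed


def reliability(true_var, error_var):
    """Reliability as true variance over (true variance + error variance)."""
    return true_var / (true_var + error_var)


# Illustrative values: true-score SD of 10 and error SD of 5, so the
# population-level ratio is 100 / (100 + 25) = 0.80.
true, error, observed = simulate_scores(n=10_000, true_sd=10.0, error_sd=5.0)

var_t = statistics.variance(true)
var_e = statistics.variance(error)
var_x = statistics.variance(observed)

r_xx = reliability(var_t, var_e)

# In a finite sample the observed variance is only approximately the sum
# of the two components, because the sample covariance of T and E is
# rarely exactly zero.
print(f"true variance     = {var_t:.2f}")
print(f"error variance    = {var_e:.2f}")
print(f"observed variance = {var_x:.2f}")
print(f"reliability       = {r_xx:.3f}")

# Boundary cases: no error variance gives 1.0; no true variance gives 0.0.
print(reliability(100.0, 0.0))
print(reliability(0.0, 25.0))
```

With these settings the estimated coefficient should fall close to 0.80. Increasing `error_sd` while holding `true_sd` constant moves the coefficient toward 0.00, and shrinking it moves the coefficient toward 1.00.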
To illustrate this application, consider the set of hypothetical data presented in Table 26.1A. These values represent ratings for six patients on ...