Measurement validity concerns the extent to which an instrument measures what it is intended to measure. Validity places an emphasis on the objectives of a test and the ability to make inferences from test scores or measurements. For instance, a goniometer is considered a valid instrument for testing range of motion because we can assess joint range from angular measurements. A ruler is considered a valid instrument for measuring length because we can judge how long an object is by measuring inches or centimeters. We would, however, question the validity of assessing low back pain by measuring leg length because we cannot make reasonable inferences about back pain based on that measurement.
Therefore, validity addresses what we are able to do with test results. Tests are usually devised for purposes of discrimination, evaluation, or prediction. For instance, we may ask, Is the test capable of discriminating among individuals with and without certain traits? Can it evaluate change in the magnitude or quality of a variable from one time to another? Can we make useful and accurate predictions or diagnoses about a patient's potential function based on the outcome of the test? These are all questions of test validity.
The determination of validity for any test instrument can be made in a variety of contexts, depending on how the instrument will be used, the type of data it will generate, and the precision of the response variables. The purpose of this chapter is to define different types of validity and to describe the application of validity testing for clinical measurements. Statistical procedures related to validity are discussed in Chapter 27.
Validity implies that a measurement is relatively free from error; that is, a valid test is also reliable. An instrument that is inconsistent cannot produce meaningful measurements. If we use a goniometer with a loose axis that alters alignment, our results will no longer be valid indicators of joint range. Random measurement error will make it difficult to determine a true reading.
An invalid test can be reliable, however. For instance, we could obtain reliable measures of leg length time after time, but those measurements would still not tell us anything about back pain. Similarly, we might be able to establish that the reliability of leg length measurements is greater than the reliability of scores on a less objective test, such as a graphic pain scale, but this fact could not be used to support the validity of leg length as a measure of back pain. In addition, we must consider the effect of systematic error or bias in the recording of data. If a tape measure is incorrectly marked, so that readings are consistently one inch more than the actual length, we may see strong reliability, but we will not have a valid measure of length.
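The distinction between random and systematic error can be made concrete with a small numerical sketch. The readings below are invented purely for illustration, as are the names `summarize`, `TRUE_LENGTH`, `accurate_tape`, `mismarked_tape`, and `loose_instrument`; none of them come from actual clinical data. The sketch treats the spread of repeated readings as a rough stand-in for reliability, and the gap between the average reading and the true value as a rough stand-in for bias.

```python
import statistics

def summarize(readings, true_value):
    """Return (bias, spread) for repeated readings of the same object.

    bias   = mean reading minus the true value (systematic error)
    spread = sample standard deviation of the readings (random error)
    """
    bias = statistics.mean(readings) - true_value
    spread = statistics.stdev(readings)
    return bias, spread

# Hypothetical true length of the object being measured, in inches.
TRUE_LENGTH = 30.0

# Invented repeated readings (assumptions for illustration only).
accurate_tape = [30.1, 29.9, 30.0, 30.1, 29.9]
# Same tape, but marked so every reading is one inch too long.
mismarked_tape = [r + 1.0 for r in accurate_tape]
# An instrument with large random error but no consistent offset.
loose_instrument = [28.0, 32.5, 29.0, 31.5, 29.0]

results = {
    "accurate tape": summarize(accurate_tape, TRUE_LENGTH),
    "mismarked tape": summarize(mismarked_tape, TRUE_LENGTH),
    "loose instrument": summarize(loose_instrument, TRUE_LENGTH),
}

for name, (bias, spread) in results.items():
    print(f"{name}: bias = {bias:+.2f} in, spread = {spread:.2f} in")
```

Under these assumed numbers, the mismarked tape is as consistent as the accurate one yet is off by about one inch on average, which is the pattern of a reliable but invalid measure. The loose instrument shows little average offset but a much larger spread, so no single reading can be trusted as the true length.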