In the investigation of most clinical research questions, some form of quantitative data will be collected. Initially these data exist in *raw form*, which means that they are nothing more than a compilation of numbers representing empirical observations from a group of individuals. For these data to be useful as measures of group performance, they must be organized, summarized, and analyzed, so that their meaning can be communicated. These are the functions of the branch of mathematics called statistics. **Descriptive statistics** are used to characterize the shape, central tendency, and variability within a set of data, often with the intent to describe a population. Measures of population characteristics are called **parameters**. A descriptive index computed from sample data is called a **statistic**. When researchers generalize sample data to populations, they use statistics to estimate population parameters. In this chapter we introduce the basic elements of statistical analysis for describing quantitative data.

Because the numerical data collected during a study exist in unanalyzed, unsorted form, a structure is needed that allows us to recognize trends or averages. Table 17.1A presents a set of hypothetical scores of 48 therapists on a test of attitudes toward working with geriatric clients. For this example, a maximum score of 20 indicates an overall positive attitude; zero indicates a strong negative bias. The total set of scores for a particular variable is called a **distribution**. The total number of scores in the distribution is given the symbol **n**. In this sample,

*n* = 48.

Although visual inspection of a distribution allows us to see all the scores, this list is long and unwieldy, and inadequate for describing this group of therapists or comparing them with any other group. We can begin to summarize the data by presenting them in a **frequency distribution**. A frequency distribution is a table of rank ordered scores that shows the number of times each value occurred, or its *frequency* (*f*). The first two columns in Table 17.1B show the frequency distribution for the attitude scores. Now we can tell more readily how the scores are distributed. We can see the lowest and highest scores, where the scores tend to cluster, and which scores occurred most often. The sum of the numbers in the frequency column (*f*) equals *n*, the number of subjects or scores in the distribution.

Sometimes frequencies are more meaningfully expressed as percentages of the total distribution. We can look at the percentage represented by each score in the distribution, or at the *cumulative percentage* obtained by adding the percentage value for each score to all percentages that fall below that score. For example, it may be useful to know that 18.8% of the sample had a score of 15 or that 56.3% of the sample had scores of 15 and below. Percentages are useful for describing distributions because they are independent of sample size. For example, suppose we tested another sample with 150 therapists, and found that 84 individuals obtained scores of 15 or below. Although there are more people in this second sample with such scores than in the first sample, both groups represent the same percentage of their total samples (56%). Therefore, the samples may be more similar than frequencies would indicate.
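The construction of a frequency distribution with percentages and cumulative percentages can be sketched in a few lines of Python. The scores below are hypothetical stand-ins, not the actual Table 17.1 data.

```python
from collections import Counter

# Hypothetical attitude scores (not the actual Table 17.1 data)
scores = [12, 15, 15, 13, 14, 15, 16, 12, 14, 15]
n = len(scores)

freq = Counter(scores)                # frequency (f) of each score
cum = 0
for score in sorted(freq):
    f = freq[score]
    cum += f
    pct = 100 * f / n                 # percentage for this score
    cum_pct = 100 * cum / n           # cumulative % at or below this score
    print(f"{score:>3}  f={f}  %={pct:.1f}  cum%={cum_pct:.1f}")
```

As in Table 17.1B, the sum of the frequency column equals *n*, and the final cumulative percentage is always 100.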

When clinical data are collected, researchers will often find that very few subjects, if any, obtain the exact same score. Consider a hypothetical sample of 30 patients for whom we obtained measurements of shoulder abduction range of motion, shown in Table 17.2A. Obviously, creating a frequency distribution is a useless process if almost every score has a frequency of one. In this situation, a *grouped frequency distribution* can be constructed by grouping the scores into *classes*, or intervals, where each class represents a unique range of scores within the distribution. Frequencies are then assigned to each interval.

Table 17.2B shows a grouped frequency distribution for the range of motion data. The classes represent ranges of 10 degrees. The classes are *mutually exclusive* (no overlap) and *exhaustive* within the range of scores obtained. The choice of the number of classes to be used and the range within each class is an arbitrary decision. It depends on the overall range of scores, the number of observations, and how much detail is relevant for the intended audience. Although information is inherently lost in grouped data, this approach is often the only feasible way to present comprehensible data when large amounts of information are collected for continuous data. The groupings should be clustered to reveal the important features of the data. The researcher must recognize that the choice of the number of classes and the range within each class can influence the interpretation of how a variable is distributed.
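A grouped frequency distribution follows the same pattern, with each score first mapped into a mutually exclusive class. The 10-degree class width below mirrors Table 17.2B; the scores themselves are hypothetical.

```python
from collections import Counter

# Hypothetical ROM scores grouped into 10-degree classes
rom = [60, 68, 72, 77, 77, 80, 82, 84, 85, 86, 91, 95, 103, 110]

def class_label(x, width=10):
    """Map a score to its class interval, e.g., 72 -> '70-79'."""
    lo = (x // width) * width
    return f"{lo}-{lo + width - 1}"

grouped = Counter(class_label(x) for x in rom)
for label in sorted(grouped, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>8}  f={grouped[label]}")
```

Changing `width` illustrates the point made above: the choice of class range changes how the distribution appears.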

Graphic representation of data often communicates information about trends and general characteristics of distributions more clearly than a tabular frequency distribution. The most common methods of graphing frequency distributions are the stem-and-leaf plot, histogram, and frequency polygon.

The **stem-and-leaf plot** is a refined grouped frequency distribution that is most useful for presenting the pattern of distribution of a continuous variable. The pattern is derived by separating each score into two parts. The *leaf* consists of the last or rightmost single digit of each score, and the *stem* consists of the remaining leftmost digits. Table 17.2C illustrates a stem-and-leaf plot for the shoulder range of motion data. The scores have leftmost digits of 6 through 13. These values become the stem. The last digit in each score becomes the leaf. To read the stem-and-leaf plot, we look across each row, attaching each single leaf digit to the stem. Therefore, the first row represents the scores 60 and 68; the second row, 72, 77 and 77; the third row, 80, 82, 84, 85 and 86; and so on.

This display provides a concise summary of the data, while maintaining the integrity of the original data. If we compare this plot with the grouped frequency distribution, it is clear how much more information is provided by the stem-and-leaf plot in a small space, and how it provides elements of both tabular and graphic displays.
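The stem-and-leaf construction described above (stem = leftmost digits, leaf = final digit) can be sketched as follows, using hypothetical data:

```python
# Build a stem-and-leaf plot: stem = leftmost digits, leaf = last digit
data = [60, 68, 72, 77, 77, 80, 82, 84, 85, 86, 91, 95, 103, 110]

stems = {}
for x in sorted(data):
    stem, leaf = divmod(x, 10)        # e.g., 77 -> stem 7, leaf 7
    stems.setdefault(stem, []).append(leaf)

for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in stems[stem])
    print(f"{stem:>3} | {leaves}")
```

Reading across the first printed row (6 | 08) recovers the original scores 60 and 68, just as in Table 17.2C.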

A **histogram** is a bar graph, composed of a series of columns, each representing one score or class interval. Figure 17.1A is a histogram showing the distribution of attitude scores given in Table 17.1. The frequency for each score is plotted on the Y-axis (vertical), and the measured variable, in this case attitude score, is on the X-axis (horizontal). The bars are centered over the scores.

A **frequency polygon** is a line plot, where each point on the line represents frequency or percentage. Figure 17.1B illustrates a frequency polygon for the attitude data. When grouped data are used, the dots in the graph are located at the midpoint of each class interval to represent the frequency in that class.

When graphs of frequency distributions are drawn, the distributions can be characterized by their shape. Although real data seldom achieve smooth curves, minor discrepancies are often ignored in an effort to describe the overall shape of a distribution.

Some distributions are symmetrical; that is, each half is a mirror image of the other. Curves A and B in Figure 17.2 are symmetrical. When scores are equal throughout the distribution, the shape is described as *uniform*, or rectangular, as shown in Curve A. Curve B represents a special case of the symmetrical distribution called the **normal distribution**. In statistical terminology, "normal" refers to a specific type of bell-shaped distribution where most of the scores fall in the middle of the scale and progressively fewer fall at the extremes. The unique characteristics of this distribution curve are discussed in greater detail later in this chapter.

A **skewed distribution** is asymmetrical. The degree to which the distribution deviates from symmetry is its *skewness.* Curve C in Figure 17.2 is *positively skewed*, or skewed to the right, because most of the scores cluster at the low end and only a few scores at the high end have caused the tail of the curve to point toward the right. If we were to plot a distribution for annual family income in the United States, for example, it would be positively skewed, because most families have low to moderate incomes. When the curve "tails off" to the left, the distribution is *negatively skewed*, or skewed to the left, as in Curve D. We might see a negatively skewed distribution if we plotted exam scores for an easy test, on which relatively few students achieved a low score.

Although frequency distributions enable us to order data and identify group patterns, they do not provide a practical quantitative summary of a group's characteristics. Numerical indices are needed to describe the "typical" nature of the data and to reflect different concepts of the "center" of a distribution. These indices are called measures of **central tendency**, or averages. The term *average* can denote three different measures of central tendency: the mode, the median, and the mean.

The **mode** is the score that occurs most frequently in a distribution. It is most easily determined by inspection of a frequency distribution. The frequency distribution in Table 17.1 shows that the mode for the attitude data is 15 because it occurs nine times, more than any other score. When class intervals are used, the mode is taken as the midpoint of the interval with the largest frequency. When more than one score occurs with the highest frequency, a distribution is considered *bimodal* (with two modes) or *multimodal* (with more than two modes). Many distributions of continuous variables do not have a mode.

The mode has only limited application as a measure of central tendency for continuous data, but can be useful in the assessment of categorical variables. For example, it may be of interest to determine the diagnostic category seen most often in a clinic.
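Python's standard library can report the mode directly, and `statistics.multimode` handles the bimodal and multimodal cases noted above. The scores are hypothetical.

```python
import statistics

scores = [15, 12, 15, 14, 15, 13, 12]         # hypothetical
print(statistics.mode(scores))                # most frequent score: 15

# multimode returns every most-frequent value, exposing bimodal data
print(statistics.multimode([3, 3, 7, 7, 9]))  # [3, 7]
```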

The **median** of a series of observations is that value above which there are as many scores as below it; that is, it divides a rank-ordered distribution into two equal halves. When a distribution contains an odd number of scores, such as 4, 5, 6, 7, 8, the middle score, 6, is the median. With an even number of scores, the midpoint between the two middle scores is the median, so that for the series 4, 5, 6, 7, 8, 9, the median lies halfway between 6 and 7. Therefore, the median equals 6.5. For the distribution of attitude scores given in Table 17.1, with *n* = 48, the median will lie midway between the 24th and 25th scores. As both of these are 15, the median is 15.
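The two cases above, an odd and an even number of scores, can be checked directly:

```python
import statistics

# Odd number of scores: the middle score is the median
print(statistics.median([4, 5, 6, 7, 8]))       # 6

# Even number of scores: midway between the two middle scores
print(statistics.median([4, 5, 6, 7, 8, 9]))    # 6.5
```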

The advantage of the median as a measure of central tendency is that it is unaffected by the value of extreme scores. It is an index of average *position* in a distribution, not amount. It is therefore a useful measure in describing skewed distributions. For instance, the average cost of a house is usually cited in terms of the median, because the distribution tends to be skewed to the right.

The **mean** is the sum of a set of scores divided by the number of scores, *n.* This is the value most people refer to as the "average." The symbol used to represent the mean of a population is the Greek letter mu, *μ*, and the mean of a sample is represented by *X̄*.

The bar above the *X* indicates that the value is an average score. The formula for calculation of the sample mean from raw data is

*X̄* = Σ*X*/*n*

where the Greek letter Σ (sigma) stands for "the sum of." This is read, "the mean equals the sum of *X* divided by *n*", where *X* represents each individual score in the distribution. For example, we can apply this formula to the ROM scores shown in Table 17.2. In this distribution of thirty scores, the sum of scores is 2,848. Therefore, *X̄* = 2,848/30 = 94.9.
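The formula translates directly into code; the scores below are hypothetical.

```python
scores = [82, 77, 95, 88, 70]            # hypothetical
mean = sum(scores) / len(scores)         # X-bar = (sum of X) / n
print(mean)
```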

Determining which measure of central tendency is most appropriate for describing a distribution depends on several factors. Foremost is the intended application of the data. The scale of measurement of the variable is another important consideration. All three measures of central tendency can be applied to variables on the interval or ratio scales, although the mean is most useful. For data on the nominal scale, only the mode is meaningful. If data are ordinal, both the median and mode can be applied.

It is necessary to consider how the summary measure will be used statistically. Of the three measures of central tendency, the mean is considered the most stable; that is, if we were to repeatedly draw random samples from a population, the means of those samples would fluctuate less than the mode or median. Only the mean can be subjected to arithmetic manipulations, making it the most reasonable estimate of population characteristics. For this reason, the mean is used more often than the median or mode for statistical analysis of ratio or interval data.

We can also consider the utility of the three measures of central tendency for describing distributions of different shapes. With uniform and normal distributions, any of the three averages can be applied with validity. With skewed distributions, however, the mean is limited as a descriptive measure because, unlike the median and mode, it is affected by the quantitative value of every score in a distribution and can be biased by extreme scores. For instance, in the previous example of ROM scores (see Table 17.2), if the first subject obtained a score of 20 instead of 60, the mean would decrease from 94.9 to 93.6. The median and mode would be unaffected by this change.

The curves in Figure 17.3 illustrate how measures of central tendency are affected by skewness. The median will typically fall between the mode and the mean in a skewed curve, and the mean will be pulled toward the tail. Because of these properties, the choice of which index to report with skewed distributions depends on what facet of information is appropriate to the analysis. It is often reasonable to report all three values, to present a complete picture of a distribution's characteristics.

The shape and central tendency of a distribution are useful but incomplete descriptors of a sample. To illustrate this point, consider the following dilemma: You are responsible for planning the musical entertainment for a party of seven individuals, but you don't know what kind of music to choose—so you decide to use their average age as a guide. The guests' ages are 3, 3, 13, 14, 59, 70, and 78 years. If you based your decision on the mode of 3 years, you would bring in characters from Sesame Street. Using the median of 14 years, you might hire a heavy metal band. And according to the mean age of 34.3 years, you might decide to play soft rock, although nobody in the group is actually in that age range. And the Tommy Dorsey fans are completely overlooked! What we are ignoring is the spread of ages within this group.
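The three averages for the party guests can be verified with the `statistics` module:

```python
import statistics

ages = [3, 3, 13, 14, 59, 70, 78]
print(statistics.mode(ages))              # 3  (Sesame Street)
print(statistics.median(ages))            # 14 (heavy metal)
print(round(statistics.mean(ages), 1))    # 34.3 (soft rock)
```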

Consider now a more serious example, using the hypothetical exam scores reported in Table 17.3, obtained from two different groups of students. If we were to describe these two distributions using measures of central tendency only, they would appear identical; however, a careful glance reveals that the scores for Group B are more widely scattered than those for Group A. This difference in **variability**, or dispersion of scores, is an essential element in data analysis. The description of a sample is not complete unless we can characterize the differences that exist *among* the scores as well as the central tendency of the data. In this section we describe five commonly used statistical measures of variability: range, percentiles, variance, standard deviation and coefficient of variation.

The simplest measure of variability is the **range,** which is the difference between the highest and lowest values in a distribution. For the test scores reported in Table 17.3, the range for Group A is 88 − 78 = 10, and for Group B, 98 − 65 = 33.^{∗} These values suggest that the first group was more homogeneous. Although the range is a relatively simple statistical measure, its applicability is limited because it is determined using only the two extreme scores in the distribution. It reflects nothing about the dispersion of scores between the two extremes. One aberrant extreme score can greatly increase the range, even though the variability within the rest of the data set is unchanged. In addition, the range of scores tends to increase with larger samples, making it an ineffective value for comparing distributions with different numbers of scores. Therefore, although it is easily computed, the range is usually employed only as a rough descriptive measure, and is typically reported in conjunction with other indices of variability.

^{∗}Research reports will usually report range by providing the actual minimum and maximum scores, rather than their difference.

**Percentiles** are used to describe a score's position within a distribution. Percentiles divide data into 100 equal portions. A particular score is located in one of these portions, which represents its position relative to all other scores. For example, if a student taking a college entrance examination scores in the 92nd percentile (P_{92}), that individual's score was higher than 92% of those who took the test. Percentiles are helpful for converting actual scores into comparative scores or for providing a reference point for interpreting a particular score. For instance, a child who scores in the 20th percentile for weight in his age group can be evaluated relative to his peer group, rather than considering only the absolute value of his weight.

**Quartiles** divide a distribution into four equal parts, or quarters. Therefore, three quartiles exist for any data set. Quartiles *Q*_{1}, *Q*_{2}, and *Q*_{3} correspond to percentiles at 25%, 50%, and 75% of the distribution (*P*_{25}, *P*_{50}, *P*_{75}). The score at the 50th percentile or *Q*_{2} is the median. The distance between the first and third quartiles, *Q*_{3} − *Q*_{1}, is called the **interquartile range,** which represents the boundaries of the middle 50% of the distribution. A **box plot**, also called a box-and-whisker plot (Figure 17.4), is a useful way to demonstrate visually the spread of scores in a distribution, including the median and interquartile range.^{1} Box plots may be drawn with the "whiskers" representing highest and lowest scores. The whiskers may also be drawn to represent the 90th and 10th percentiles, as shown in Figure 17.4, and outliers beyond those values may be indicated as circles outside the whiskers.
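Quartiles and the interquartile range can be computed with `statistics.quantiles`; the `inclusive` method interpolates between observed scores, which matches the convention used by most statistical software. The data are hypothetical.

```python
import statistics

data = [2, 4, 4, 5, 6, 7, 8]             # hypothetical
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
print(q1, q2, q3)                        # Q2 is the median
print(q3 - q1)                           # interquartile range
```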

###### FIGURE 17.4

These box plots show four distributions of scores of functional level based on the Gross Motor Function Classification System (GMFCS). The distributions compare the ratio of medium to low activity levels (%) among children who were developing normally and children with cerebral palsy at functional levels I, II and III. The upper and lower margins of the box indicate the interquartile range (Q_{3}−Q_{1}), demarcating the 25th and 75th percentiles. The center line sits at the median score (50th percentile). The outer bars (whiskers) indicate the range of scores at each end of the distribution, with circles indicating outliers beyond 3 standard deviations from the mean. (From Bjornson KF et al. Ambulatory physical activity performance in youth with cerebral palsy and youth who are developing typically. *Phys Ther* 2007;87:248–257, Figure 4, p. 255, Used with permission of the American Physical Therapy Association.)

Quartiles are often used in clinical research as a basis for differentiating subgroups within a sample. For example, researchers studied the relationship between bone density and walking habits in 239 postmenopausal women.^{2} The sample was grouped into quartiles based on year-round distance walked, and these four groups were compared on bone density and several anthropometric variables. Quartiles provided the structure for creating comparison groups where no obvious criteria were available.

Measures of range have limited application as indices of variability because they are not influenced by every score in a distribution and they are sensitive to extreme scores. To more completely describe a distribution we need an index that reflects the variation within a full set of scores. This value should be small if scores are close together and large if they are spread out. It should also be objective so that we can compare samples of different sizes and determine if one is more variable than another.

We can begin to examine variability by looking at the deviation of each score from the mean; that is, we subtract the mean from each score in the distribution to obtain a *deviation score*, *X* − *X̄*. Obviously, samples with larger deviation scores will be more variable around the mean. For instance, consider the distribution of test scores from Group B in Table 17.3. The deviation scores for this sample are shown in Table 17.4A. The mean of the distribution is 83.63. For the score *X* = 65, the deviation score will be 65 − 83.63 = −18.63. Note that the first three deviation scores are negative values because these scores are smaller than the mean.

As a measure of variability, the deviation score has intuitive appeal, as these scores will obviously be larger as scores become more heterogeneous and farther from the mean. It might seem reasonable, then, to take the average of these values, or the mean deviation, as an index of dispersion within the sample. This is a useless exercise, however, because the sum of the deviation scores will always equal zero, Σ(*X* − *X̄*) = 0, as illustrated in the second column in Table 17.4A. If we think of the mean as a central balance point for a distribution, then it makes sense that the scores will be equally dispersed above and below that central point.

This dilemma is solved by squaring each deviation score to get rid of the minus signs, as shown in the third column of Table 17.4A. The sum of the squared deviation scores, Σ(*X* − *X̄*)^{2}, is called the **sum of squares** (*SS*). As variability increases, the sum of squares will be larger.

We now have a number we can use to describe the sample's variability. In this case, Σ(*X* − *X̄*)^{2} = 1044.63. As an index of relative variability, however, the sum of squares is limited because it can be influenced by the sample size; that is, as *n* increases, the sum will also tend to increase simply because there are more scores. To eliminate this problem, the sum of squares is divided by *n*, to obtain the mean of the squared deviation scores (shortened to **mean square**, *MS*). This value is a true measure of variability and is called the **variance**.
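Both properties, deviation scores summing to zero and the sum of squares growing with spread, can be verified numerically. The data are hypothetical.

```python
data = [65, 70, 78, 82, 88, 90, 94, 98]      # hypothetical, n = 8
mean = sum(data) / len(data)

deviations = [x - mean for x in data]        # X - X-bar for each score
print(round(sum(deviations), 10))            # always 0: deviations balance

ss = sum(d ** 2 for d in deviations)         # sum of squares, SS
print(ss)
```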

For population data, the variance is symbolized by σ^{2} (lowercase Greek sigma squared). When the population mean is known, deviation scores are obtained by *X* − *μ*. Therefore, the population variance is defined by

σ^{2} = Σ(*X* − *μ*)^{2}/*N*

where *N* is the number of scores in the population.

With sample data, deviation scores are obtained using *X̄*, not *μ*. Because sample data do not include all the observations in a population, the sample mean is only an estimate of the population mean. This substitution results in a sample variance slightly smaller than the true population variance. To compensate for this bias, the sum of squares is divided by *n* − 1 to calculate the sample variance, given the symbol *s*^{2}:

*s*^{2} = Σ(*X* − *X̄*)^{2}/(*n* − 1)

This corrected statistic is considered an *unbiased estimate* of the parameter *σ*^{2}. For the data in Table 17.4, *SS* = 1044.63 and *n* = 8. Therefore,

*s*^{2} = 1044.63/7 = 149.23

When means are not whole numbers, calculation of deviation scores can be biased by rounding. Computational formulae provide more accurate answers. See Table 17.4C for calculations using the computational formula for variance.
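The standard library implements both forms: `pvariance` divides *SS* by *n* (population form), while `variance` divides by *n* − 1 (unbiased sample estimate). The data are hypothetical.

```python
import statistics

data = [65, 70, 78, 82, 88, 90, 94, 98]      # hypothetical
n = len(data)
mean = statistics.mean(data)
ss = sum((x - mean) ** 2 for x in data)      # sum of squares

print(statistics.pvariance(data))            # SS / n       (population form)
print(statistics.variance(data))             # SS / (n - 1) (sample form)
print(statistics.stdev(data))                # square root of the sample variance
```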

The limitation of variance as a descriptive measure of a sample's variability is that it was calculated using the squares of the deviation scores. It is generally not useful to describe sample variability in terms of squared units, such as degrees squared or pounds squared. Therefore, to bring the index back into the original units of measurement, we take the positive square root of the variance. This value is called the **standard deviation**, symbolized by *s*. The formula for standard deviation is

*s* = √*s*^{2} = √(Σ(*X* − *X̄*)^{2}/(*n* − 1))

For the preceding example,

*s* = √149.23 = 12.22

See Table 17.4C for the corresponding computational formula.

The standard deviation of sample data is usually reported along with the mean so that the data are characterized according to both central tendency and variability. A mean may be expressed as *X̄* = 83.63 ± 12.22, which tells us that the average of the deviations on either side of the mean is 12.22. An *error bar graph* can display the mean and standard deviation for each group, illustrating differences in variability (see Figure 17.5).

The standard deviation can be used as a basis for comparing samples. The results shown in Table 17.4D show the standard deviations for both Groups A and B (from Table 17.3). The *error bar graph* in Figure 17.5 illustrates the comparison of means and standard deviations for these two groups. Because the standard deviation for Group A is smaller, we know that the Group B scores were more spread out around the mean. In clinical studies it may be relevant to describe the degree of variability among subjects as a way of estimating the generalizability of responses. Variance and standard deviation are fundamental components of any analysis of data. We explore the application of these concepts to many statistical procedures throughout the coming chapters.

The **coefficient of variation** (*CV*) is another measure of variability that can be used to describe data measured on the interval or ratio scale. It is the ratio of the standard deviation to the mean, expressed as a percentage:

*CV* = (*s*/*X̄*) × 100%

There are two major advantages to this index. First, it is independent of units of measurement because units will mathematically cancel out. Therefore, it is a practical statistic for comparing distributions recorded in different units. Second, the coefficient of variation expresses the standard deviation as a proportion of the mean, thereby accounting for differences in the magnitude of the mean. The coefficient of variation is, therefore, a measure of *relative variation*, most meaningful when comparing two distributions.^{†}

These advantages can be illustrated using data from a study of normal values of lumbar spine range of motion, in which data were recorded in both degrees and inches of excursion.^{3} The mean ranges for 20- to 29-year-olds were *X̄* = 41.2 ± 9.6 degrees, and *X̄* = 3.7 ± 0.72 inches, respectively. The absolute values of the standard deviations for these two measurements suggest that the measure of inches, using a tape measure, was much less variable; however, because the means and units are substantially different, we would expect the standard deviations to be different as well. By calculating the coefficient of variation, we get a better idea of the relative variation of these two measurements:

*CV* = (9.6/41.2) × 100% = 23.3% for degrees
*CV* = (0.72/3.7) × 100% = 19.5% for inches

Now we can see that the variability within these two distributions is actually fairly comparable. As this example illustrates, the coefficient of variation is a useful measure for making comparisons among patient groups or different clinical assessments to determine if some are more stable than others.
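The CV calculation can be reproduced from the reported means and standard deviations; the helper function below is a generic sketch for raw data.

```python
import statistics

def cv(data):
    """Coefficient of variation: standard deviation as a % of the mean."""
    return 100 * statistics.stdev(data) / statistics.mean(data)

# Using the reported summary values from the lumbar ROM study:
cv_degrees = 100 * 9.6 / 41.2
cv_inches = 100 * 0.72 / 3.7
print(round(cv_degrees, 1))   # ~23.3%
print(round(cv_inches, 1))    # ~19.5%
```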

^{†}The coefficient of variation cannot be used when a variable's mean is negative. Because *CV* is expressed as a percentage, it cannot be interpreted as a negative value.

Earlier in this chapter we discussed the symmetrical distribution known as the **normal distribution**. This distribution represents an important statistical concept because so many biological, psychological and social phenomena manifest themselves in populations according to this shape. If we were to graph the population frequency distribution of variables such as height or intelligence, the graph would resemble the bell-shaped curve. Unfortunately, in the real world we can only estimate such data from samples and, therefore, cannot expect data to fit the normal curve exactly. For practical purposes, then, the normal curve represents a theoretical concept only, with well-defined properties that allow us to make statistical estimates about populations using sample data.

The fact that the normal curve is important to statistical theory should not imply, however, that data are not useful or valid if they are not normally distributed. Many sociological variables, such as socioeconomic class, income, ethnic background and age, are skewed. Such data can be handled using statistics appropriate to nonnormal distributions (see Chapter 22).

The statistical appeal of the normal distribution is that its characteristics are constant and, therefore, predictable. As shown in Figure 17.6, the curve is smooth, symmetrical and bell-shaped, with most of the scores clustered around the mean. The mean, median and mode have the same value. The vertical axis of the curve represents the frequency of data. The frequency of scores decreases steadily as scores move in a negative or positive direction away from the mean, with relatively rare observations at the extremes. Theoretically, there are no boundaries to the curve; that is, scores potentially exist with infinite magnitude above and below the mean. Therefore, the tails of the curve approach but never quite touch the baseline.

Because of these standard properties, we can also determine the proportional areas under the curve represented by the standard deviations in a normal distribution. Statisticians have shown that 34.13% of the area under the normal curve is bounded by the mean and the score one standard deviation above or below the mean. Therefore, 68.26% of the total distribution (the majority) will have scores within ±1 standard deviation (±1*s*) from the mean. Similarly, ±2*s* from the mean will encompass 95.45%, and ±3*s* will cover 99.73% of the total area under the curve. At ±3*s* we have accounted for virtually the entire distribution. Because we can never discount extreme values at either end, we never account for the full 100%. This information can be used as a basis for interpreting standard deviations. For example, if we are given *X̄* = 65 ± 6.06, we can estimate that approximately 68% of the individuals in the sample have scores between 58.94 and 71.06.
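These proportions follow from the normal curve itself and can be computed from the error function in the standard library; small differences in the last decimal (e.g., 99.73 vs. 99.74) are rounding artifacts.

```python
from math import erf, sqrt

def area_within(k):
    """Proportion of a normal distribution within ±k standard deviations."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"±{k}s: {100 * area_within(k):.2f}%")
```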

Statistical data are meaningful only when they are applied in some quantitative context. For example, if a patient has a pulse rate of 58 beats/min, the implication of that value is evident only if we know where that score falls in relation to a distribution of normal pulse rates. If we know that *X̄* = 68 and *s* = 10 for a given sample, then we know that an individual score of 58 is one standard deviation below the mean. This gives us a clearer interpretation of the score. When we express scores in terms of standard deviation units, we are using **standardized scores**, also called ***z*-scores**. For this example, a score of 58 can be expressed as a *z*-score of −1.0, the minus sign indicating that it is one standard deviation unit below the mean. A score of 88 is similarly transformed to a *z*-score of +2.0, or two standard deviations above the mean.

A *z*-score is computed by dividing the deviation of an individual score from the mean by the standard deviation:

*z* = (*X* − *X̄*)/*s*

Using the example of pulse rates, for an individual score of 85 beats/minute, with *X̄* = 68 and *s* = 10,

*z* = (85 − 68)/10 = 1.7

Thus, 85 beats/minute is 1.7 standard deviations above the mean.
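The z-score computation is one line of code:

```python
def z_score(x, mean, s):
    """Standardized score: deviation from the mean in standard deviation units."""
    return (x - mean) / s

print(z_score(85, 68, 10))    # 1.7 standard deviations above the mean
print(z_score(58, 68, 10))    # -1.0, one standard deviation below
```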

The normal distribution can also be described in terms of standardized scores. Theoretically, there are an infinite number of normal distributions, corresponding to every combination of means and standard deviations. The mean of a normal distribution of *z*-scores will always equal zero (no deviation from the mean), and the standard deviation will always be 1.0. As shown in Figure 17.6, the area under the standardized normal curve between *z* = 0 and *z* = +1.0 is approximately 34%, the same as that defined by the area between the mean (*z* = 0) and one standard deviation. The total area within *z* = ±1.00 is 68.26%. Similarly, the total area within *z* = ±2.00 is 95.45%. Using this model, we can determine the proportional area under the curve bounded by any two points in a normal distribution. These values are given in Appendix Table A.1.

We can illustrate this process using hypothetical values for pulse rates, with *X̄* = 68 and *s* = 10. Suppose we want to determine what percentage of our sample has a pulse rate above 80 beats/minute. First, we determine the *z*-score for 80 beats/minute:

*z* = (80 − 68)/10 = 1.2

Therefore, 80 beats/minute is slightly more than one standard deviation above the mean.

We want to determine the proportion of our total sample that is represented by all scores above 80, or above *z* = 1.2. This is the shaded area above 80 in Figure 17.7. We can now refer to Table A.1. This table is arranged in three columns, one containing *z*-scores and the other two representing areas either from 0 to *z* or above *z* (in one tail of the curve). For this example, we are interested in the area above *z*, or above 1.2. If we look to the right of *z* = 1.20 in Table A.1, we find that the area above *z* equals .1151. Therefore, scores above 80 beats/minute represent 11.51% of the total distribution.
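The tabled tail area can also be computed directly from the complementary error function (a minimal sketch; `area_above` is an illustrative name, not a column in Table A.1):

```python
from math import erfc, sqrt

def area_above(z):
    """Upper-tail area of the standard normal curve (the 'above z' area)."""
    return 0.5 * erfc(z / sqrt(2))

z = (80 - 68) / 10              # 1.2
print(round(area_above(z), 4))  # 0.1151
```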

###### FIGURE 17.7

Distribution of pulse rates with *X̄* = 68 and *s* = 10 showing the area under the normal curve above 80 beats/minute, or *z* = 1.2. The shaded area in the tail of the curve represents 11.51% of the curve (from Table A.1).

We might also be interested in determining the area above 50 beats/minute. First we determine the *z*-score for 50 beats/minute:

*z* = (50 − 68)/10 = −1.8

Therefore, 50 beats/minute is slightly less than two standard deviations below the mean.

Now we want to determine the proportion of our total sample that is represented by all scores above 50, or above *z* = −1.8. We already know that the scores above the mean (above *z* = 0) represent 50% of the curve, as shown by the dark green area in Figure 17.8. Therefore, we are now concerned with the light green area between 68 and 50, which is equal to the area from *z* = 0 to *z* = −1.8. Together these two shaded areas represent the total area above 50. Table A.1 uses only absolute values of *z*; because the curve is symmetrical, the proportional area from 0 to *z* = +1.8 is the same as the area from 0 to *z* = −1.8. The area between *z* = 0 and *z* = −1.8 is .4641. Therefore, the total area under the curve for all scores above 50 beats/minute will be .50 + .4641 = .9641, or 96.41%.
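The same direct computation handles negative *z*-scores without invoking the symmetry argument, since the complementary error function covers the whole real line (again an illustrative sketch, not a method from Table A.1):

```python
from math import erfc, sqrt

def area_above(z):
    """Upper-tail area of the standard normal curve above a given z."""
    return 0.5 * erfc(z / sqrt(2))

z = (50 - 68) / 10              # -1.8
print(round(area_above(z), 4))  # 0.9641
```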

###### FIGURE 17.8

Distribution of pulse rates with *X̄* = 68 and *s* = 10, showing the area under the normal curve above 50 beats/minute, or *z* = −1.8. The light green area represents *z* = 0 to −1.8 = .4641 (from Table A.1). Together, the green areas represent 96.41% of the total area under the curve.

Standardized scores are very useful for interpreting an individual's standing relative to a normalized group. For example, many standardized tests, such as psychological, developmental, and intelligence tests, use *z*-scores to show whether an individual's score falls above or below the "norm" (the standardized mean), or to show what proportion of the subjects in a distribution fall within a certain range of scores.

The validity of estimates using the standard normal curve depends on the closeness with which a sample approximates the normal distribution. Many clinical samples are too small to provide an adequate approximation and are more accurately described as skewed. Most computer programs for descriptive statistics can compute measures of skewness. A value close to zero indicates a normal (or near-normal) distribution. As values become increasingly positive or negative, they indicate the extent to which the data are skewed.

Because many statistical procedures are based on assumptions related to the normal distribution, researchers should evaluate the shape of the data as part of their initial analysis. Alternative statistical operations can be used with skewed data, or data may be transformed to better reflect the characteristics of a normal distribution (see Appendix D). Unfortunately, many researchers do not test for skewness as part of their initial analysis, running the risk of invalid statistical conclusions.^{4} Skewness should be reported to help readers understand the shape of a distribution, and to evaluate the appropriate use of statistical procedures.^{5}
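As a sketch of such a check, the Fisher-Pearson skewness coefficient can be computed with only the standard library (statistical packages typically offer this measure, often with a small-sample correction; the data below are invented for illustration):

```python
from statistics import mean, stdev

def skewness(data):
    """Fisher-Pearson skewness: 0 for a symmetric distribution,
    positive when the right tail is longer, negative for a left tail."""
    m = mean(data)
    s = stdev(data)
    n = len(data)
    return sum(((x - m) / s) ** 3 for x in data) / n

print(skewness([1, 2, 3, 4, 5]))        # 0.0 (symmetric)
print(skewness([1, 1, 2, 2, 3, 10]))    # positive (right-skewed)
```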

Descriptive statistics are the building blocks for data analysis. They serve an obvious function in that they summarize important features of quantitative data. Every study will include a description of subjects and responses using one or more measures of central tendency and variance, as a way of understanding the study sample, and establishing the framework for further analysis.

Descriptive measures are limited in their interpretation because they do not attempt to infer anything that goes beyond the data themselves. Therefore, if we collect information about the average performance of a group of patients, we are using a descriptive process; however, if we use this information to predict future performance of these patients or to make generalizations about the effectiveness of the treatment they received, we are going beyond the scope of descriptive data. We must remain cognizant of the limitations of descriptive information for generalization. Interpretations that go beyond sample data are based on inferential statistics. It is also essential to understand, however, the assumptions underlying most inferential procedures, which are based on descriptive characteristics of a distribution, including central tendency, variance, and the degree to which the distribution approaches the normal curve.

Although descriptive values cannot be used alone for generalizations, they do provide essential information about structure and patterns in data. While most inferential statistical analyses are used to test specific preset hypotheses (an approach called **confirmatory data analysis**), descriptive measures are also used to gain insight into data as part of an approach called **exploratory data analysis (EDA)**.^{6,7} Graphic representations of data, such as box plots, stem-and-leaf plots, and histograms, are used to scrutinize the data, to reveal the shape of distributions, and to examine the variability in different subgroups of a sample. The visual analysis of graphics provides the opportunity to inspect and interpret data, often allowing the researcher to see patterns that might not have otherwise been clear. Graphs are more powerful than summary statistics, for example, to find gaps in scores within a certain range, or to identify a particular score that is "somewhere in left field," or to show that there is a "pile-up" of scores at one point in the distribution.^{8} This type of analysis can be used to generate hypotheses, or to suggest alternative questions of the data. Other, more complex, statistical procedures can also be used to explore data structures, such as factor analysis and multivariate regression (see Chapter 29).
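The numerical skeleton of a box plot, the quartiles plus the conventional 1.5 × IQR whisker fences, can be sketched directly; the pulse values below are invented for illustration, and flag exactly the kind of "left field" score described above:

```python
from statistics import quantiles

# Hypothetical pulse-rate sample (illustrative values only)
pulses = sorted([52, 58, 60, 63, 65, 66, 68, 68, 70, 71, 74, 76, 80, 95])
q1, med, q3 = quantiles(pulses, n=4)       # quartiles (default 'exclusive' method)
iqr = q3 - q1
lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr    # box-plot whisker fences
outliers = [x for x in pulses if x < lo or x > hi]
print(q1, med, q3)   # 62.25 68.0 74.5
print(outliers)      # [95]
```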

The take-home message, however, is the importance of descriptive statistics as the basis for sound statistical reasoning.^{5} Descriptive analyses are necessary to demonstrate that statistical tests are used appropriately, and that their interpretations are valid.^{4}

*Ann Int Med*1989;110:916–921. [PubMed: 2719423]

*Am J Med*1994;96:20–26. [PubMed: 8304358]

*Phys Ther*1983;63:1776–1781. [PubMed: 6634943]

*Am J Phys Med Rehabil*1991;70:S84–S93.

*Am J Phys Med Rehabil*2001;80:141–146. [PubMed: 11212015]