# Chapter 17: Descriptive Statistics

In the investigation of most clinical research questions, some form of quantitative data will be collected. Initially these data exist in *raw form*, which means that they are nothing more than a compilation of numbers representing empirical observations from a group of individuals. For these data to be useful as measures of group performance, they must be organized, summarized, and analyzed, so that their meaning can be communicated. These are the functions of the branch of mathematics called statistics. **Descriptive statistics** are used to characterize the shape, central tendency, and variability within a set of data, often with the intent to describe a population. Measures of population characteristics are called **parameters**. A descriptive index computed from sample data is called a **statistic**. When researchers generalize sample data to populations, they use statistics to estimate population parameters. In this chapter we introduce the basic elements of statistical analysis for describing quantitative data.

Because the numerical data collected during a study exist in unanalyzed, unsorted form, a structure is needed that allows us to recognize trends or averages. Table 17.1A presents a set of hypothetical scores of 48 therapists on a test of attitudes toward working with geriatric clients. For this example, a maximum score of 20 indicates an overall positive attitude; zero indicates a strong negative bias. The total set of scores for a particular variable is called a **distribution**. The total number of scores in the distribution is given the symbol ***n***. In this sample, *n* = 48.

Although visual inspection of a distribution allows us to see all the scores, this list is long and unwieldy, and inadequate for describing this group of therapists or comparing them with any other group. We can begin to summarize the data by presenting them in a **frequency distribution**. A frequency distribution is a table of rank-ordered scores that shows the number of times each value occurred, or its *frequency* (*f*). The first two columns in Table 17.1B show the frequency distribution for the attitude scores. Now we can tell more readily how the scores are distributed. We can see the lowest and highest scores, where the scores tend to cluster, and which scores occurred most often. The sum of the numbers in the frequency column (*f*) equals *n*, the number of subjects or scores in the distribution.
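For readers who work with data in software, the tallying described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `frequency_distribution` and the ten scores in `example_scores` are invented for this sketch and are not the values in Table 17.1A, and the scores are ranked from lowest to highest, which may differ from the order used in Table 17.1B.

```python
from collections import Counter


def frequency_distribution(scores):
    """Return (score, frequency) pairs in rank order, lowest score first."""
    counts = Counter(scores)       # tally how many times each score occurs
    return sorted(counts.items())  # rank-order the distinct scores


# Hypothetical scores for illustration only; NOT the data in Table 17.1A.
example_scores = [12, 15, 9, 15, 18, 12, 15, 7, 18, 12]

table = frequency_distribution(example_scores)

# The frequencies must sum to n, the number of scores in the distribution.
n = sum(f for _, f in table)

print("score  f")
for score, f in table:
    print(f"{score:>5}  {f}")
print(f"n = {n}")
```

With these made-up scores, the table shows at a glance that the lowest score is 7, the highest is 18, and the scores 12 and 15 occur most often. The final line confirms that the frequency column sums to *n*.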

Sometimes frequencies are more meaningfully expressed as percentages of the total distribution. We can look at the percentage represented by each score in the distribution, or at the *cumulative percentage* obtained by adding the percentage value for each score to all percentages that fall below that ...