
COMMON MISTAKES IN USING STATISTICS: Spotting and Avoiding Them

# Simple Random Samples

The simplest type of random sample is a simple random sample, often called an SRS. Moore and McCabe define a simple random sample as follows:

"A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected." (See Note 1.)

Here, population refers to the collection of people, animals, locations, etc. that the study is focusing on.
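To make the phrase "every set of n individuals has an equal chance" concrete, here is a minimal sketch that lists every possible sample of size n = 2 from a tiny population. The four individuals "A" through "D" are hypothetical and chosen only to keep the list short; under simple random sampling each listed set would have the same chance of being the sample.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Hypothetical population of M = 4 individuals (illustration only).
population = ["A", "B", "C", "D"]
n = 2  # sample size

# Every possible set of n individuals that could end up being the sample.
possible_samples = list(combinations(population, n))

# Under simple random sampling, each set is equally likely, so each one
# has chance 1 / (number of possible sets).
chance_each = Fraction(1, len(possible_samples))

for s in possible_samples:
    print(s, chance_each)

# The number of possible sets is the binomial coefficient "M choose n".
print(len(possible_samples) == comb(len(population), n))
```

For realistic populations the number of possible sets is far too large to list, but the idea is the same: no set of n individuals is favored over any other.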

Some examples:

1. In a medical study, the population might be all adults over age 50 who have high blood pressure.
2. In another study, the population might be all hospitals in the U.S. that perform heart bypass surgery.
3. If we are studying whether a certain die is fair or weighted, the population would be all possible tosses of the die.

In Example 3, it is fairly easy to get a simple random sample: Just toss the die n times, and record each outcome.
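As a rough illustration of what the recorded data in Example 3 might look like, the sketch below simulates n tosses. It assumes a fair die and uses Python's pseudo-random generator as a stand-in for physical tosses; the function name `toss_die` and the seed are made up for this example. In a real study of a particular die, the tosses would of course have to be made with that die.

```python
import random
from collections import Counter

def toss_die(n, seed=None):
    """Simulate n tosses of a fair six-sided die and return the outcomes.

    The fair die is an assumption of this sketch; a real study would
    record physical tosses of the die being examined.
    """
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]

# Record each outcome of n = 60 simulated tosses.
tosses = toss_die(60, seed=1)

# Tally how often each face came up.
counts = Counter(tosses)
for face in range(1, 7):
    print(face, counts[face])
```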

Selecting a simple random sample in Examples 1 and 2 is much harder. A good way to select a simple random sample for Example 2 would proceed as follows:

First, obtain or make a list of all hospitals in the U.S. that perform heart bypass surgery. Number them 1, 2, ..., up to the total number M of hospitals in the population. (Such a list is called a sampling frame.)

Then use some sort of random number generating process (see Note 2) to obtain a simple random sample of size n from the population of integers 1, 2, ..., M. The simple random sample of hospitals then consists of the hospitals in the list whose numbers appear in the SRS of numbers.
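The two steps above can be sketched in code. This is only an illustration under stated assumptions: the sampling frame here is a made-up list of ten placeholder names rather than real hospitals, the function name `select_srs` is hypothetical, and Python's `random.sample` (which draws without replacement using a pseudo-random generator) stands in for the random number generating process.

```python
import random

def select_srs(frame, n, seed=None):
    """Select a simple random sample of n items from a sampling frame.

    frame: list of all members of the population, in any fixed order.
    n:     desired sample size (0 <= n <= len(frame)).
    seed:  optional seed so the selection can be reproduced.
    """
    M = len(frame)
    if not 0 <= n <= M:
        raise ValueError("sample size must be between 0 and the frame size")
    rng = random.Random(seed)
    # Draw an SRS of size n from the integers 1, 2, ..., M
    # (sampling without replacement, so no number repeats).
    chosen_numbers = rng.sample(range(1, M + 1), n)
    # Map each chosen number back to the corresponding frame entry.
    return [frame[k - 1] for k in sorted(chosen_numbers)]

# Hypothetical sampling frame: placeholder names, not real hospitals.
frame = [f"Hospital {i}" for i in range(1, 11)]
sample = select_srs(frame, 3, seed=42)
print(sample)
```

Fixing the seed makes the selection reproducible, which can help when the sampling procedure needs to be documented or audited.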

In theory, the same process could be used in Example 1. However, obtaining the sampling frame would be much harder, probably impossible, so some compromises may need to be made. Unfortunately, these compromises can easily lead to a sample that is biased or otherwise not close enough to random to be suitable for the statistical procedures used.

Indeed, even the sampling procedure described above is a compromise and may not be suitable in some situations, as described in the next section.

## Why Is Random Sampling Important?

Notes
1. Moore, David S. and George P. McCabe (2006), Introduction to the Practice of Statistics, fifth edition, Freeman, p. 219. The same definition appears on p. 196 of Moore, David S. (2007), The Basic Practice of Statistics, fourth edition, Freeman. These and other introductory texts by Moore and co-authors are among the best introductory texts for pointing out many of the common errors in using statistics.

2. Think of the process that is used in selecting winning numbers in some lotteries: Put M balls, labeled 1, 2, ..., M, in a container that can mix the balls up thoroughly. After mixing, select one ball (without looking at the number on it or any other ball), mix again, select a second ball (without looking at numbers), mix again, and continue until n balls have been selected. In practice, computer processes that are (we hope) close enough to random are used.