Lesson 1.5 Sampling
Because it may be too expensive, or even impossible, to collect information about an entire population, we usually use a sample of that population. A sample should have the same characteristics as the population it represents.
Sampling can be divided into random sampling and non-random sampling.
Random Sampling
Simple Random Sample
In a simple random sample, each member of the population has an equal chance of being chosen.
For example, to get a simple random sample representing the Jones family, we could assign a number from one to five to each of the family members:
1=Mr. Jones
2=Suzie Jones
3=Billy Jones
4=Annie Jones
5=Mrs. Jones
Then, we could roll a die to see which number comes up, rolling again whenever a "six" appears, because no family member has that number. If we roll a "five," then Mrs. Jones is our sample representing the population called the "Jones family."
This is an example of simple random sampling because every member of the Jones family had an equal chance of being selected as our sample.
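The die roll can also be imitated on a computer. The following is a minimal sketch, assuming Python's standard `random` module stands in for the die; the function name `simple_random_sample` is made up for this illustration.

```python
import random

# The population: the five members of the Jones family listed above.
jones_family = ["Mr. Jones", "Suzie Jones", "Billy Jones", "Annie Jones", "Mrs. Jones"]

def simple_random_sample(population, size, rng=random):
    """Pick `size` different members; every member has an equal chance."""
    return rng.sample(population, size)

# Draw a sample of one person, like the single die roll in the example.
chosen = simple_random_sample(jones_family, 1)
print(chosen)
```

Running the sketch several times will usually print different family members, which is what we expect from a random selection.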
Stratified Sample
In a stratified sample, the population is divided into strata (groups), and then some members of each stratum are randomly selected.
Consider a Population:
For example, consider the population to be all restaurants in Silicon Valley in California.
Divide the Population into Strata:
Let's take a look at the price of their entrées and divide this population into the following strata:
Strata | Restaurants in These Strata
Restaurants with all entrées under $10 | Pasta Mia, Taxi's, Taj India, Armadillo Willy's
Restaurants with all entrées from $10 to under $15 | Ming's, Santa Barbara Grill, Cafe Camaroon, Dawit
Restaurants with all entrées from $15 to under $20 | Fontana's, Truya Sushi, Gervais
Restaurants with all entrées over $20 | Germania, La Maison Du Cafe
Choose One Simple Random Sample:
Then, we could choose one simple random sample from each of the above strata.
Strata | Simple Random Sample
Entrées under $10 | Taj India
Entrées from $10 to under $15 | Ming's
Entrées from $15 to under $20 | Truya Sushi
Entrées over $20 | Germania
This, then, would be our stratified sample:
Stratified Sample: Taj India, Ming's, Truya Sushi, Germania
Something to think about: What strata could you create to divide your classmates into groups from which you could form a stratified sample?
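The restaurant example for stratified sampling can be sketched in code. This is only an illustration, assuming Python's standard `random` module; the names `strata` and `stratified_sample` are invented for this sketch, and the restaurants are the ones listed in the lesson.

```python
import random

# Each stratum (price range) and the restaurants that belong to it.
strata = {
    "under $10": ["Pasta Mia", "Taxi's", "Taj India", "Armadillo Willy's"],
    "$10 to under $15": ["Ming's", "Santa Barbara Grill", "Cafe Camaroon", "Dawit"],
    "$15 to under $20": ["Fontana's", "Truya Sushi", "Gervais"],
    "over $20": ["Germania", "La Maison Du Cafe"],
}

def stratified_sample(strata, per_stratum=1, rng=random):
    """Take a simple random sample of `per_stratum` members from every stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

print(stratified_sample(strata))
```

Notice that every stratum contributes to the sample, so each price range is always represented.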
Cluster Sample
In a cluster sample, the population is divided into groups and then some of the groups are selected randomly.
Divide the Population into Groups:
As in the above example, we could divide Silicon Valley restaurants into different groups according to the price of their entrées.
Groups | Restaurants in These Groups
Restaurants with all entrées under $10 | Pasta Mia, Taxi's, Taj India, Armadillo Willy's
Restaurants with all entrées from $10 to under $15 | Ming's, Santa Barbara Grill, Cafe Camaroon, Dawit
Restaurants with all entrées from $15 to under $20 | Fontana's, Truya Sushi, Gervais
Restaurants with all entrées over $20 | Germania, La Maison Du Cafe
Randomly Select Some Groups:
Then, if we randomly selected some of these groups, we would have a cluster sample.
Cluster Sample:
Restaurants with all entrées under $10
Restaurants with all entrées from $15 to under $20
Our cluster sample would include all the restaurants in those groups selected:
Cluster Sample | Members of This Sample
Restaurants with all entrées under $10 | Pasta Mia, Taxi's, Taj India, Armadillo Willy's
Restaurants with all entrées from $15 to under $20 | Fontana's, Truya Sushi, Gervais
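Cluster sampling for the same restaurants can be sketched as follows. This is a rough illustration, assuming Python's standard `random` module; `groups` and `cluster_sample` are names made up for this sketch. Unlike the stratified sketch, whole groups are chosen at random and every member of a chosen group is kept.

```python
import random

# The same price groups used in the lesson.
groups = {
    "under $10": ["Pasta Mia", "Taxi's", "Taj India", "Armadillo Willy's"],
    "$10 to under $15": ["Ming's", "Santa Barbara Grill", "Cafe Camaroon", "Dawit"],
    "$15 to under $20": ["Fontana's", "Truya Sushi", "Gervais"],
    "over $20": ["Germania", "La Maison Du Cafe"],
}

def cluster_sample(groups, n_groups, rng=random):
    """Randomly choose `n_groups` whole groups and keep all of their members."""
    chosen = rng.sample(list(groups), n_groups)
    return {name: list(groups[name]) for name in chosen}

# Select two of the four groups, as in the example.
for name, members in cluster_sample(groups, 2).items():
    print(name, "->", members)
```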
Systematic Sample
In a systematic sample, a starting point is selected randomly, and then every nth member is taken from a listing of the population. For example, the Juneau, Alaska phone book contains approximately 30,000 home phone listings. Let's say we want to collect a systematic sample of 40 members. Because 30,000 ÷ 40 = 750, we would take every 750th listing. If we randomly choose the name H. Davis as our starting point, then we could select the other 39 names by choosing every 750th phone listing after H. Davis, continuing from the top of the listings if we reach the end of the book.
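Here is one possible sketch of systematic sampling, assuming Python's standard `random` module. The phone book is a stand-in list of 30,000 made-up entries, and `systematic_sample` is a hypothetical helper. It computes the step as the population size divided by the sample size, which for 30,000 listings and 40 members is 750, and it wraps around to the top of the list when it passes the end.

```python
import random

def systematic_sample(listing, sample_size, rng=random):
    """Pick a random starting point, then take every nth entry.

    Assumes sample_size is no larger than the number of entries.
    """
    step = len(listing) // sample_size      # the "n" in "every nth"
    start = rng.randrange(len(listing))     # random starting position
    # The % wraps around to the top of the listing after the last entry.
    return [listing[(start + i * step) % len(listing)] for i in range(sample_size)]

# A made-up phone book with 30,000 numbered listings.
phone_book = [f"Listing {i}" for i in range(1, 30001)]

sample = systematic_sample(phone_book, 40)
print(len(sample))
print(sample[:3])
```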
Non-Random Sampling
Convenience Sample
A convenience sample uses results that are readily available and, therefore, convenient to collect.
For example, the management of an ice cream store wants to know what ice cream flavor people like the best. They do a survey by asking the first 50 customers who walk into their store on a Saturday. In this way, they have conveniently gathered a sample of people's favorite ice cream flavors.
Variations in Samples
Two or more samples taken from the same population will usually be different. For example, if two people each roll a die 20 times, their results will probably differ. However, as they increase the size of their samples (the number of times they roll the die), the percentages of 1's, 2's, 3's, 4's, 5's, and 6's that each person rolls should get closer and closer to one another. In the example data below, Ana and Robert each roll one 6-sided die 20 times. "Frequency" reflects the number of times that a particular numbered die face (1-6) appears.
Frequency (the number of times each number occurred)
Number on the Die Face | Ana's Data | Robert's Data
1 | 2 | 2
2 | 3 | 4
3 | 3 | 3
4 | 4 | 3
5 | 3 | 4
6 | 5 | 4
Total | 20 | 20
If we look at the above results, we see that Ana rolled the number "six" five times in twenty tries, and Robert rolled this number four times. Ana rolled "two" three times and Robert four times. Who had the correct results? They both did! Variation in samples is natural and expected.
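To watch this variation shrink as samples grow, we can simulate the two sets of rolls. This is a small sketch, assuming Python's standard `random` and `collections` modules; the seeds, the roll counts, and the function name `roll_percentages` are choices made only for this illustration, and the simulated rolls are not Ana's and Robert's actual data.

```python
import random
from collections import Counter

def roll_percentages(n_rolls, rng):
    """Roll a six-sided die n_rolls times and return the percent for each face."""
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: 100 * counts[face] / n_rolls for face in range(1, 7)}

# Two separate random number generators stand in for the two people.
ana = random.Random(1)
robert = random.Random(2)

for n_rolls in (20, 2000, 200000):
    a = roll_percentages(n_rolls, ana)
    r = roll_percentages(n_rolls, robert)
    # Largest difference between the two people's percentages for any face.
    largest_gap = max(abs(a[face] - r[face]) for face in a)
    print(f"{n_rolls} rolls: largest gap = {largest_gap:.1f} percentage points")
```

With only 20 rolls the two sets of percentages can be far apart, while with many more rolls they should usually be much closer to each other.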