Example: Can small sample sizes tell you something about large populations?
The government of Minnesota wants to know what the residents of Minnesota think about raising taxes to improve education facilities (infrastructure) in the state.
Before serious discussions begin, they want to take the pulse of the residents by conducting a simple poll.
To do this, they decide to conduct a phone poll: they will call 500 residents of Minnesota, choosing phone numbers at random, and ask each one to pick the answer that best represents their position on the question "Should Minnesota raise state taxes to improve the infrastructure of the schools in Minnesota?" The choices offered are:
Minnesota has a population of about 5 million people (census). This is the population.
I ran my own computer simulation: I constructed a population of 5 million, each member having a response to the question from 1 to 4, and then sampled the population at 500 randomly chosen positions. The way I constructed my data ignores diversity, which might play an important role in a real-world application of this method. Some of the things that might keep my study from sampling diversity properly are:
Since we are interested in seeing how accurate our sampling techniques are, here is the histogram for the entire population. This information is, of course, unknown to the person taking the poll.
[Figure: histogram of responses (1 to 4) for the entire population]
Here are the results from my "poll" of 500 people chosen randomly from the population:
[Figure: histogram of responses (1 to 4) for the sample of 500]
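A simulation of the kind described above could be sketched roughly as follows. This is only an illustration, not the program used to produce the histograms: the response shares in ASSUMED_SHARES are made-up values (the actual distribution used in the study is not stated), and the function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical shares for answers 1 to 4. The distribution used in the
# original simulation is not given, so these numbers are assumptions.
ASSUMED_SHARES = {1: 0.35, 2: 0.30, 3: 0.20, 4: 0.15}


def build_population(size, shares, rng):
    """Return a list of `size` responses drawn according to `shares`."""
    answers = list(shares)
    weights = [shares[a] for a in answers]
    return rng.choices(answers, weights=weights, k=size)


def run_poll(population, sample_size, rng):
    """Pick `sample_size` members at random, without replacement."""
    return rng.sample(population, sample_size)


def proportions(responses):
    """Return the fraction of responses for each answer that occurs."""
    counts = Counter(responses)
    total = len(responses)
    return {answer: counts[answer] / total for answer in sorted(counts)}


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the run is repeatable
    population = build_population(5_000_000, ASSUMED_SHARES, rng)
    sample = run_poll(population, 500, rng)

    pop_props = proportions(population)
    sample_props = proportions(sample)
    for answer in pop_props:
        print(f"answer {answer}: population {pop_props[answer]:.3f}, "
              f"sample {sample_props.get(answer, 0.0):.3f}")
```

Running it prints the population share and the sample share for each answer side by side, which is the same comparison the two histograms make.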
Notice that we actually get quite good results from using "only" 500 people. Well-designed samples can produce good estimates for the behaviour of even very large populations.
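One way to put a number on "quite good" is the usual margin of error for a sample proportion. The sketch below is not part of the original study; it assumes simple random sampling, the normal approximation, and a 95% confidence level (z = 1.96). The function name is hypothetical.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes simple random sampling and the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Worst case (p = 0.5) for a sample of 500 people.
print(round(margin_of_error(0.5, 500), 3))  # 0.044
```

Under these assumptions, each estimated share from 500 respondents is within about 4.4 percentage points of the true share. The population size does not appear in the formula: when the sample is a tiny fraction of the population, as 500 out of 5 million is, the precision depends almost entirely on the sample size.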