Wednesday | 02 March, 2011

Ask CRRC | Sample Size

Q: In the last posting you said that in order for the sample to be representative of the entire population, every member of the population had to have some chance of being selected for the sample. However, you didn’t say anything about sample size. Doesn’t sample size matter?

A: As long as the sample size is not tiny, then the sample can be representative of the population – having 200 respondents or 2,000 respondents does not make a difference in whether you can call the sample representative of the population. Where sample size does make a difference is in how accurate your conclusions about the population of interest will be. Let’s explain what that means with an example:

Suppose we are interested in the population of registered voters in Rustavi and want to know the proportion of them who find the availability of gas to be an important local issue. We take a list of the 98,492 registered voters in Rustavi and randomly select a sample for interview. Now, let’s imagine two different scenarios: In the first, we randomly select 200 respondents and interview them. In the second, we randomly select 2,000 respondents and interview them. Now, imagine that in the first scenario, 64 respondents mentioned the availability of gas as an important local issue and 136 did not. Imagine that in the second scenario 640 respondents mentioned it and 1,360 did not. Because 64/200=0.32 and 640/2,000=0.32, in both scenarios exactly 32% of the respondents said that the availability of gas is an important local issue.

Both of these samples are representative of the population of registered voters in Rustavi because every registered voter had a chance to be in the sample. In both cases, our best estimate of the proportion of Rustavi voters who consider the availability of gas to be an important issue is the same. This is the proportion that we encountered in each sample: 32%.

However, the two different sample sizes allow us to say two different things about the greater population of Rustavi. This is because, in general, the larger the sample size, the smaller the margin of error. The margin of error tells us how wide the range is within which we can be confident, at a stated level such as 95%, that the true value for the entire population lies. For example, in the first scenario, using statistical formulas we can calculate that we can be 95% confident that the proportion of the entire population of 98,492 registered voters that considers the availability of gas to be an important issue is between 25.5% and 38.5%. However, in the second scenario, our calculations will tell us that we can be 95% confident that the proportion is between 30% and 34%.
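The post does not say which statistical formula was used. As a minimal sketch, the commonly used normal-approximation formula for a proportion, margin = z * sqrt(p * (1 - p) / n), reproduces the ranges quoted above. The function name `margin_of_error` and the choice z = 1.96 (the usual multiplier for 95% confidence) are assumptions made for illustration, and the sketch ignores the small correction for sampling from a finite list of 98,492 voters.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation; z=1.96 corresponds to 95% confidence.
    This is an illustrative formula, not necessarily the one used in the post.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.32  # share of respondents who mentioned gas in both scenarios
for n in (200, 2000):
    moe = margin_of_error(p_hat, n)
    low, high = p_hat - moe, p_hat + moe
    print(f"n={n}: {p_hat:.0%} +/- {moe:.1%} (from {low:.1%} to {high:.1%})")
```

Under these assumptions the margin comes out at roughly 6.5 percentage points for 200 respondents and roughly 2 percentage points for 2,000. Because the sample size sits under a square root, ten times as many respondents shrinks the margin by a factor of about 3.2 rather than 10.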

That is, in the first scenario, we were 95% confident that the proportion was between 32% - 6.5% and 32% + 6.5%. In the second scenario, we were 95% confident that the proportion was between 32% - 2% and 32% + 2%. In other words, in the first scenario, the margin of error is 6.5 percentage points and in the second scenario the margin of error is 2 percentage points. To conclude, samples of different sizes can still be representative of a population. However, the margin of error shrinks as the sample size grows, and it tells us how precise our conclusions about the population of interest are.
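One way to see what "95% confident" means in practice is to simulate the survey many times. The sketch below builds a hypothetical list of 98,492 voters in which 32% care about gas (an assumption made purely for illustration, since in a real survey the true share is unknown), draws repeated simple random samples, and counts how often the sample estimate lands within the stated margin of the true share. The names `simulate_coverage`, `POPULATION_SIZE` and `TRUE_SHARE` are invented for this example.

```python
import random

POPULATION_SIZE = 98_492  # registered voters in the example
TRUE_SHARE = 0.32         # assumed true share; unknown in a real survey

def simulate_coverage(sample_size, margin, repetitions=1000, seed=0):
    """Share of simulated samples whose estimate is within `margin` of the truth."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    concerned = round(POPULATION_SIZE * TRUE_SHARE)
    # True = voter considers gas availability an important issue
    population = [True] * concerned + [False] * (POPULATION_SIZE - concerned)
    true_proportion = concerned / POPULATION_SIZE

    hits = 0
    for _ in range(repetitions):
        # Simple random sample without replacement: every voter can be chosen
        sample = rng.sample(population, sample_size)
        estimate = sum(sample) / sample_size
        if abs(estimate - true_proportion) <= margin:
            hits += 1
    return hits / repetitions

for n, margin in ((200, 0.065), (2000, 0.02)):
    share = simulate_coverage(n, margin)
    print(f"n={n}: {share:.1%} of simulated samples fall within +/- {margin:.1%}")
```

If the margins in the example are right, both lines should report a share close to 95%: the smaller sample reaches that level of confidence only by allowing a much wider range.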