Q: In the posting on representativeness, you said that every member of the population must have some chance of being selected for the sample. In the next posting about sample size, your Rustavi example had every member of the population with an equal chance of being selected. What if everyone has a chance, but not an equal chance? In this case, is it possible to make a sample be representative of the population?

A: This is a very important question! The short answer is yes: the sample can still be representative of the population, but you need to do a little extra work. Let’s use a simple example:

Suppose we are interested in comparing the experiences of male and female students in an engineering program. The program has 800 men and 200 women. If we randomly select a sample of 200 students (20% of the total student population in the engineering program), then we should expect only about 40 women in our sample, which is not many for a comparison. Suppose instead that we randomly select 100 men and then randomly select 100 women. This means that every man has an equal chance of being selected for the sample and every woman has an equal chance of being selected, but men and women do not have the same chance as each other. If we want to use the responses of the men to say something only about male students, or the responses of the women to say something only about female students, then we can do this using some simple formulas from statistics. However, what if we want to use all of the information that we have to say something about the entire population of students?

In this case, different members of the population have different chances of being selected. Every man has a 1 in 8 chance of being selected (100 out of 800), while every woman has a 1 in 2 chance (100 out of 200). We can turn this around and say that every man who is interviewed represents 8 people including himself, and every woman who is interviewed represents 2 people including herself. This is what is known as a sampling weight: every man in the sample has a sampling weight of 8, while every woman in the sample has a sampling weight of 2.
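The arithmetic behind these weights can be sketched in a few lines of Python. This is only an illustration using the numbers from the example; the function name `sampling_weights` is made up for this sketch and does not come from any statistics package.

```python
# Population and sample sizes from the engineering program example.
population = {"men": 800, "women": 200}
sample = {"men": 100, "women": 100}


def sampling_weights(population, sample):
    """Return the weight for each group: population size / sample size.

    This is the inverse of the chance of being selected.
    (Hypothetical helper written for this illustration.)
    """
    return {group: population[group] / sample[group] for group in population}


weights = sampling_weights(population, sample)

for group, weight in weights.items():
    chance = sample[group] / population[group]
    print(f"{group}: chance of selection = {chance}, sampling weight = {weight}")

# Sanity check: the people "represented" by the sample should add up
# to the whole population of 1,000 students.
represented = sum(sample[group] * weights[group] for group in sample)
print("students represented:", represented)
```

A useful check is the last line: 100 men times 8 plus 100 women times 2 gives back all 1,000 students.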

We need to use sampling weights when making estimates about an entire population. This means that we need different statistical formulas from the simple ones mentioned above. We also need a computer program that has built-in functions for making estimates about populations using data with sampling weights (e.g., SPSS for estimates, or Stata for estimates and associated margins of error). As long as we do that, our sample is still representative of our population, even though not every member of the population had the same chance of being selected for an interview.
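To see why the weights matter, here is a minimal sketch in Python that compares an unweighted and a weighted estimate. The survey results are invented for illustration: assume 60 of the 100 sampled men and 30 of the 100 sampled women say they are satisfied with the program. The margin of error uses the standard textbook formula for a stratified random sample with a finite population correction; it is a sketch under those assumptions, not what SPSS or Stata necessarily compute by default.

```python
import math

# N = population size, n = sample size (both from the example).
# The "satisfied" counts are hypothetical, made up for this sketch.
strata = {
    "men": {"N": 800, "n": 100, "satisfied": 60},
    "women": {"N": 200, "n": 100, "satisfied": 30},
}


def unweighted_proportion(strata):
    """Treat every respondent the same (wrong for this sample design)."""
    yes = sum(s["satisfied"] for s in strata.values())
    n = sum(s["n"] for s in strata.values())
    return yes / n


def weighted_proportion(strata):
    """Count each respondent as N / n people (the sampling weight)."""
    yes = sum(s["satisfied"] * s["N"] / s["n"] for s in strata.values())
    total = sum(s["n"] * s["N"] / s["n"] for s in strata.values())
    return yes / total


def margin_of_error(strata, z=1.96):
    """Approximate 95% margin of error for the weighted proportion."""
    population = sum(s["N"] for s in strata.values())
    variance = 0.0
    for s in strata.values():
        share = s["N"] / population      # group's share of the population
        p = s["satisfied"] / s["n"]      # proportion within the group
        fpc = 1 - s["n"] / s["N"]        # finite population correction
        variance += share**2 * fpc * p * (1 - p) / (s["n"] - 1)
    return z * math.sqrt(variance)


unweighted = unweighted_proportion(strata)
weighted = weighted_proportion(strata)
moe = margin_of_error(strata)

print(f"unweighted estimate: {unweighted:.2f}")  # 0.45
print(f"weighted estimate:   {weighted:.2f}")    # 0.54
print(f"margin of error:     {moe:.3f}")
```

With these made-up numbers the unweighted estimate is 45%, while the weighted estimate is 54%. The unweighted figure is too low because women make up half of the sample but only a fifth of the population, so their answers count for too much unless the weights are applied.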