Sampling
Ideally, each level is contained in the next one:
Sampling unit < Sample < Sampling frame < Population
Terms:
- The target population
- all subjects of interest (to which you want to be able to apply a conclusion) e.g. "all cows in the world".
- The sampling frame
- the subset of the population from which you actually draw your sample, e.g. "all cows in Skane county".
- Sampling design or sampling procedure
- the overall plan for how the sample is drawn from the frame; it may involve techniques such as stratified sampling.
Population definition
Be specific!
- All current customers -> Everyone who's bought from us in the last year.
- SDSU students loyal to the football team -> SDSU students who were registered in Fall 2019 and have attended at least one game while a student
- Unsatisfied customers -> Customers with an active paid account who have either given us a score of 3 or lower in the last year or registered a complaint by e-mail
A good population definition uses clear, unambiguous, objective criteria to distinguish between population members and non-members.
A good population definition suits the needs of the audience requesting the research.
- It is common for the client you're working with to have only a hazy idea of what makes someone a member of the population of interest. Before you undertake the study, get verification from them so that all parties agree on what the population definition is!
A good population definition matches previous research, so that the present study can be directly compared with earlier ones.
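One way to check that a definition really is clear and objective is to try writing it as a yes/no rule. The sketch below does that for the "unsatisfied customers" example. The Customer record and its field names are hypothetical, and it assumes the "in the last year" window applies only to the scores, not to the e-mail complaints (exactly the kind of detail to confirm with the client).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Customer:
    # Hypothetical fields, invented for illustration only.
    active_paid_account: bool
    scores_last_year: List[int] = field(default_factory=list)  # scores given in the last year
    emailed_complaint: bool = False


def is_unsatisfied(c: Customer) -> bool:
    """Population membership rule for 'unsatisfied customers'."""
    low_score = any(score <= 3 for score in c.scores_last_year)
    return c.active_paid_account and (low_score or c.emailed_complaint)


customers = [
    Customer(True, [5, 2]),                      # active, gave a low score
    Customer(True, [4, 5]),                      # active, no low score, no complaint
    Customer(False, [1]),                        # low score but no active paid account
    Customer(True, [], emailed_complaint=True),  # active, complained by e-mail
]
members = [is_unsatisfied(c) for c in customers]
print(members)
```

If two people reading the definition would write different rules, the definition is still too vague.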
Sampling frame
Leslie Kish posited four basic problems of sampling frames:[7]
- "Errors of omission": Missing elements: Some members of the population are not included in the frame.
- "Errors of inclusion": Foreign elements: Non-members of the population are included in the frame.
- Duplicate entries: A member of the population appears in the frame more than once, and so may be surveyed more than once.
- Groups or clusters: The frame lists clusters instead of individuals.
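The first three problems can be pictured as simple set comparisons. This is only a sketch: audit_frame is a made-up helper, and it assumes you have an ID list for the whole population, which in practice you usually don't. The fourth problem (clusters) is not covered.

```python
from collections import Counter


def audit_frame(population_ids, frame_entries):
    """Compare a frame (a list that may contain repeats) with the target population."""
    population = set(population_ids)
    counts = Counter(frame_entries)
    frame = set(counts)
    return {
        "missing": sorted(population - frame),   # errors of omission
        "foreign": sorted(frame - population),   # errors of inclusion
        "duplicates": sorted(e for e, n in counts.items() if n > 1),
    }


report = audit_frame(
    population_ids=["a", "b", "c", "d"],
    frame_entries=["a", "b", "b", "x"],
)
print(report)
```

Here "c" and "d" are missing from the frame, "x" is a foreign element, and "b" is listed twice.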
Example 1: errors of omission
Pop. of interest: all customers who have been on our tele-therapy platform in the last 30 days
Sampling frame: A list of 5,000 customer phone numbers, which were collected from customers after their 90-day free trial expired.
The two sets will overlap, probably substantially (many individuals are in both). However, there are some potential sources of error:
- Maybe the company didn't provide a complete customer list, just a list of 5,000. This kind of error of omission is okay: if the 5,000 were selected randomly, it costs some precision but won't bias conclusions.
- People still in the trial period haven't had to submit phone numbers yet. This kind of error of omission is a problem: the cause of the omission systematically differentiates the in-frame population from the out-of-frame population.
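A small simulation can make the difference between the two kinds of omission concrete. Everything here is invented for illustration: the population size, the 30% share of trial users, and the satisfaction scores (trial users are assumed to score higher on average). It is a sketch of the idea, not a model of the real platform.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: (in_trial, satisfaction_score) per customer.
population = []
for _ in range(20_000):
    in_trial = random.random() < 0.30
    score = random.gauss(8.0 if in_trial else 6.0, 1.0)
    population.append((in_trial, score))

pop_mean = mean(score for _, score in population)

# Random omission: the frame is a random 5,000 of all customers.
random_frame = random.sample(population, 5_000)
random_mean = mean(score for _, score in random_frame)

# Systematic omission: the frame only contains customers past the trial.
past_trial = [p for p in population if not p[0]]
systematic_frame = random.sample(past_trial, 5_000)
systematic_mean = mean(score for _, score in systematic_frame)

print(f"population mean:        {pop_mean:.2f}")
print(f"random-omission frame:  {random_mean:.2f}")
print(f"systematic-omission:    {systematic_mean:.2f}")
```

Under these assumptions the randomly thinned frame lands close to the population mean, while the frame that excludes trial users is pulled toward the paying customers' scores.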
Potential remedies
- If some population characteristics are known, measure the same characteristics in your sample and check that they are approximately equal. A close match is evidence (not proof) that the sample is representative.
- E.g. if you KNEW from other research that 65% of the target population is 60+ years old, but only 5% of your frame is 60+ years old, there's an issue!
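One possible way to make the comparison less of an eyeball judgment is a one-sample z-test for a proportion. This is a swapped-in technique, not something these notes prescribe. The sample size (400) and the count of 60+ respondents (20) below are hypothetical; only the 65% and 5% figures come from the example above.

```python
import math


def proportion_z_test(successes, n, p0):
    """Two-sided z-test of an observed proportion against a known value p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)            # standard error under p0
    z = (p_hat - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return p_hat, z, p_value


# Hypothetical sample: 20 of 400 respondents are 60+ (5%); known population share is 65%.
p_hat, z, p_value = proportion_z_test(successes=20, n=400, p0=0.65)
print(f"sample share = {p_hat:.2f}, z = {z:.1f}, p = {p_value:.3g}")
```

A very large |z| (tiny p-value) says the gap is far bigger than random sampling would explain, so the frame or the sample is not representative on that characteristic. The normal approximation assumes a reasonably large sample.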