Sample size is the number of individuals included in a research study to represent a population. It refers to the total number of respondents in a study, and that number is often broken down into sub-groups by demographics such as age, gender, and location so that the total sample represents the entire population. Determining the appropriate sample size is one of the most important factors in statistical analysis. If the sample size is too small, it will not yield valid results or adequately represent the realities of the population being studied. On the other hand, while larger sample sizes yield smaller margins of error and are more representative, a sample size that is too large may significantly increase the cost and time taken to conduct the research.
This article discusses the factors to consider when determining your sample size and shows how to calculate it.
Confidence Interval and Confidence Level
As we have noted before, when selecting a sample there are multiple factors that can impact the reliability and validity of results, including sampling and non-sampling errors. When thinking about sample size, the two measures of error most closely tied to it are the confidence interval and the confidence level.
Confidence Interval (Margin of Error)
The confidence interval, also called the margin of error, expresses how much uncertainty surrounds a particular statistic drawn from a sample. In simple terms, it tells you how far the results from a study may be from what you would expect to find if it were possible to survey the entire population being studied. The confidence interval is usually a plus or minus (±) figure. For example, if your confidence interval is 6 and 60% of your sample picks an answer, you can be confident that if you had asked the entire population, between 54% (60-6) and 66% (60+6) would have picked that answer.
Confidence Level
The confidence level is the probability that the confidence interval would contain the true population parameter when you draw a random sample many times. It is expressed as a percentage and represents how often the true percentage of the population who would pick an answer lies within the confidence interval. For example, a 99% confidence level means that if you repeated a survey over and over again, about 99 percent of the time the confidence interval you obtained would contain the true population value.
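To make the "repeat the survey many times" idea concrete, here is a minimal simulation sketch. It assumes a simple random sample of yes/no answers and the usual normal-approximation interval for a proportion; the function name `coverage`, the true share of 60%, the sample size of 400, and the fixed seed are all illustrative choices, not part of the original article.

```python
import math
import random


def coverage(true_p=0.6, n=400, z=1.96, trials=2000, seed=1):
    """Return the share of simulated surveys whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Simulate n yes/no answers from a population where true_p say "yes".
        yes = sum(rng.random() < true_p for _ in range(n))
        p_hat = yes / n
        # Normal-approximation margin of error for this one survey.
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


# With z = 1.96 the result should land close to 0.95.
print(coverage())
```

Rerunning with `z=2.576` widens every interval, so the share of intervals that capture the true value rises toward 99%.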
The larger your sample size, the more confident you can be that the respondents' answers truly reflect the population. In other words, for a given confidence level, the larger your sample, the smaller your confidence interval.
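The relationship between sample size and confidence interval can be sketched in a few lines. This example assumes a simple random sample, a 95% confidence level (z-score of 1.96), and the most conservative response split of 50/50; `margin_of_error` is an illustrative helper name.

```python
import math


def margin_of_error(n, z=1.96, p=0.5):
    """Approximate margin of error (as a decimal) for a proportion from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


for n in (100, 400, 1600):
    # Print the sample size and its margin of error in percentage points.
    print(n, round(100 * margin_of_error(n), 2))
```

Under these assumptions the margin of error is roughly 9.8, 4.9, and 2.45 percentage points respectively: because it shrinks with the square root of the sample size, halving it takes about four times as many respondents.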
Standard Deviation
Another critical measure when determining the sample size is the standard deviation, which measures how spread out a data set is around its mean. In calculating the sample size, the standard deviation is useful in estimating how much the responses you receive will vary from each other and from the mean, and the standard deviation of a sample can be used to approximate the standard deviation of a population.
The more the responses vary, the greater the standard deviation. For example, how much do you expect the responses to your survey to differ from one another? That variation in responses is what the standard deviation captures.
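For a yes/no question, the spread of responses depends only on the share who answer "yes". The sketch below assumes answers are coded 1 for yes and 0 for no, in which case the standard deviation is the square root of p(1 − p); `binary_sd` is an illustrative name.

```python
import math


def binary_sd(p):
    """Standard deviation of a yes/no answer coded 1/0 when a share p says yes."""
    return math.sqrt(p * (1 - p))


for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(binary_sd(p), 3))
```

The value peaks at 0.5 when responses are split evenly, which is why 0.5 is the cautious figure to use when the real variation is unknown: any other split would call for a smaller sample.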
Population Size
The other important consideration when determining your sample size is the size of the entire population you want to study. A population is the entire group that you want to draw conclusions about. It is from the population that a sample is selected, using probability or non-probability sampling methods. The population size may be known (such as the total number of employees in a company) or unknown (such as the number of pet owners in a country), but you need a close estimate, especially when dealing with a relatively small or easy-to-measure group of people.
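One common way to account for a known, small population is the finite population correction. It is not part of the formula used later in this article, so treat the sketch below as a supplementary assumption; `adjust_for_population` is an illustrative name, and 384.16 is the unadjusted figure from the example calculation further down (95% confidence level, 5% margin of error).

```python
import math


def adjust_for_population(n0, population):
    """Apply the finite population correction to an unadjusted sample size n0."""
    return math.ceil(n0 / (1 + (n0 - 1) / population))


n0 = 384.16  # unadjusted size for 95% confidence and a 5% margin of error
for population in (500, 1_000, 1_000_000):
    print(population, adjust_for_population(n0, population))
```

Under these assumptions a population of 500 needs about 218 respondents and a population of 1,000 about 278, while for a population of one million the figure is back at 385. Population size matters mostly when the group is small.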
As demonstrated through the calculation below, a sample of about 385 respondents is enough to draw conclusions about nearly any population size at the 95% confidence level with a 5% margin of error, which is why samples of 400 and 500 are often used in research. However, if you are looking to draw comparisons between different sub-groups, for example, provinces within a country, a larger sample size is required. GeoPoll typically recommends a sample size of 400 per country as the minimum viable sample for a research project, 800 per country for conducting a study with analysis by a second-level breakdown such as females versus males, and 1200+ per country for doing third-level breakdowns such as males aged 18-24 in Nairobi.
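A rough way to see why sub-group analysis needs a larger total sample is to look at the margin of error inside each sub-group. The sketch below assumes simple random sampling, a 95% confidence level, a 50/50 response split, and a hypothetical even split of the total sample across sub-groups; real sub-groups are rarely equal in size.

```python
import math


def margin_of_error(n, z=1.96, p=0.5):
    """Approximate margin of error (as a decimal) for a proportion from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


# (total sample, number of equal sub-groups): hypothetical even splits
for total, groups in ((400, 1), (400, 2), (800, 2)):
    per_group = total // groups
    print(total, groups, per_group, round(100 * margin_of_error(per_group), 1))
```

With these assumptions, 400 respondents give about ±4.9 points overall, but splitting them into two groups of 200 widens each group's margin to about ±6.9 points. Doubling the total to 800 brings each group back to about ±4.9.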
How to Calculate Sample Size
Now that we have defined all the necessary terms, let us briefly look at how to determine the sample size using a sample size calculation formula known as Andrew Fisher’s Formula.
- Determine the population size (if known).
- Determine the confidence interval.
- Determine the confidence level.
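- Determine the standard deviation (a standard deviation of 0.5 is a safe choice where the figure is unknown).
- Convert the confidence level into a Z-score. The Z-scores for the most common confidence levels are 1.645 for 90%, 1.96 for 95%, and 2.576 for 99%.
- Put these figures into the sample size formula, $n = Z^2 \times s(1 - s) / e^2$, where $Z$ is the Z-score, $s$ is the standard deviation, and $e$ is the confidence interval (margin of error) written as a decimal, to get your sample size.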
Here is an example calculation:
Say you choose to work with a 95% confidence level (a Z-score of 1.96), a standard deviation of 0.5, and a confidence interval (margin of error) of ± 5%. You just need to substitute these values into the formula:
$$n = \frac{(1.96)^2 \times 0.5(1 - 0.5)}{(0.05)^2} = \frac{3.8416 \times 0.25}{0.0025} = \frac{0.9604}{0.0025} = 384.16$$
Rounding up to the next whole respondent, your sample size should be 385.
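The same calculation can be sketched as a small function. The name `sample_size` and its parameters are illustrative, and the Z-score is looked up with Python's standard-library `statistics.NormalDist` rather than taken from a table, so it uses 1.95996 instead of the rounded 1.96.

```python
import math
from statistics import NormalDist


def sample_size(confidence_level=0.95, margin_of_error=0.05, std_dev=0.5):
    """Sample size from the formula above, rounded up to a whole respondent."""
    # Two-sided Z-score for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = (z ** 2) * std_dev * (1 - std_dev) / margin_of_error ** 2
    return math.ceil(n)


print(sample_size())                       # 95% level, ±5%: 385
print(sample_size(confidence_level=0.99))  # 99% level, ±5%: 664
print(sample_size(confidence_level=0.90))  # 90% level, ±5%: 271
```

Raising the confidence level or tightening the margin of error both increase the required sample. For instance, keeping the 95% level but asking for ±3% instead of ±5% pushes the figure above 1,000.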
Fortunately, there are several online tools available to help you with this calculation. Here’s an online sample size calculator from Easy Calculation. Just put in the confidence level, the population size, and the confidence interval, and the required sample size is calculated for you.
GeoPoll’s Sampling Techniques
With the largest mobile panel in Africa, Asia, and Latin America, and reliable mobile technologies, GeoPoll develops unique samples that accurately represent any population. See our country coverage here, or contact our team to discuss your upcoming project.