Work out how many respondents, plots, or units you need - for a single estimate, a comparison between two groups, or a stratified survey. Every result shows the working and the formula used.
One group - Estimate a single value
Compare two groups - Detect a difference
Stratified sample - Known subgroups
Use this when you want to estimate one number for a population - e.g. "what proportion of farmers have adopted a new variety?" or "what is the average yield per hectare?" - within a chosen margin of error.
What are you estimating?
Expected proportion (p)
0.50
Your best guess of how common the thing you're measuring is in the population - not the size of any subgroup. If you have no prior data, leave it at 0.50; this is the most conservative choice and gives the largest (safest) sample size. If a pilot study or earlier research suggests, say, 30%, enter 0.30 instead - it will usually reduce the required sample.
Estimated standard deviation (σ)
How spread out you expect individual values to be. Use the standard deviation from a pilot study or earlier similar research if you have one. If not, a rough rule of thumb is: (expected maximum − expected minimum) ÷ 4.
Margin of error (E) - same units as your variable
How close you want your sample average to be to the true population average. E.g. if you're measuring yield in kg and set E = 2, your sample average should fall within ±2 kg of the true mean at your chosen confidence level.
Margin of error (E)
5%
How much error either side of the true value you're willing to accept. A 5% margin means your result could differ from the true population value by up to 5 percentage points, in either direction. Smaller margin = larger required sample.
Confidence level
How sure you want to be that the true population value falls within your margin of error, across repeated samples. 95% is the conventional default for most field and social-science research.
Population size (N) - optional, but important if your population is small
The total number of units that actually exist - e.g. total farmers, plots, or animals in the area you're studying. If this number is large or unknown, leave it blank. A practical guideline: the finite population correction is usually negligible when your required sample is below about 5% of N, so in that range leaving N blank makes little practical difference. As N gets smaller relative to your required sample, the correction reduces the required sample more noticeably, since you cannot sample more units than exist.
Advanced adjustments - optional, both default to "none"
These inflate your final sample for real-world survey conditions. Apply after deciding your core statistical design.
Design effect (DEFF) - for cluster/multistage sampling
Expected non-response rate (%)
Required sample size
Step by step
Theory: estimating a single value
For a proportion, the base sample size assuming an infinite population is:
n₀ = Z² × p × (1 − p) / E²
For a mean, the equivalent formula is:
n₀ = Z² × σ² / E²
If the population size N is finite and known, a finite population correction (FPC) is applied, since you cannot sample more units than exist. As a rule of thumb, FPC is usually negligible once the sampling fraction (n₀/N) is below about 5%, but the calculator applies it exactly regardless:
n_fpc = n₀ / [1 + (n₀ − 1) / N]
If using cluster or multistage sampling, a design effect (DEFF) inflates the sample to account for intra-cluster correlation:
n_deff = n_fpc × DEFF
Finally, an anticipated non-response rate r inflates the sample so that the achieved (responding) sample still meets target precision:
n_final = n_deff / (1 − r)
Z - the standard normal value for your chosen confidence level (e.g. 1.96 for 95%)
p - expected proportion with the characteristic of interest (0.5 is most conservative)
σ - expected standard deviation of the variable, for means
E - acceptable margin of error, as a decimal for proportions or in raw units for means
N - total population size, if known and finite
DEFF - design effect; 1 for simple random sampling, >1 for cluster/multistage designs (often estimated from a pilot or similar prior study)
r - anticipated proportion of sampled units who will not respond
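The chain of formulas above can be sketched in Python as follows. This is a minimal illustration, not the calculator's own code: the function names and the choice to round up only once, at the very end, are assumptions made for the example.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided standard normal critical value (0.95 gives about 1.96)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def single_group_sample_size(e, confidence=0.95, p=0.5, sigma=None,
                             population=None, deff=1.0, nonresponse=0.0):
    """Apply n0 -> FPC -> DEFF -> non-response and return every step.

    Pass sigma to size a mean; otherwise the proportion formula is used.
    nonresponse is a fraction (0.10 for 10%), matching r in the text.
    """
    z = z_for_confidence(confidence)
    if sigma is not None:
        n0 = z ** 2 * sigma ** 2 / e ** 2        # mean
    else:
        n0 = z ** 2 * p * (1 - p) / e ** 2       # proportion
    # Finite population correction, only when N is supplied.
    n_fpc = n0 / (1 + (n0 - 1) / population) if population else n0
    n_deff = n_fpc * deff
    n_final = n_deff / (1 - nonresponse)
    # Assumption: round up once, after all adjustments.
    return {"n0": n0, "n_fpc": n_fpc, "n_deff": n_deff,
            "n_final": math.ceil(n_final)}


if __name__ == "__main__":
    print(single_group_sample_size(e=0.05))
    print(single_group_sample_size(e=0.05, population=2000))
```

With p = 0.50, E = 0.05 and 95% confidence, n₀ is about 384.1, which rounds up to 385; supplying N = 2,000 brings the requirement down to 323.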
References: Cochran, W.G. (1977). Sampling Techniques (3rd ed.). New York: John Wiley & Sons. Kish, L. (1965). Survey Sampling. New York: John Wiley & Sons (design effect).
Use this when you're comparing two independent groups on some outcome - e.g. treatment vs control, adopters vs non-adopters of a practice, variety A vs variety B - and want enough data to detect a real difference if one exists.
Important: what goes into p₁ and p₂?
p₁ and p₂ are not the sizes of your two groups, and not "proportion who belong to Group 1 / Group 2." They are the expected rate of your outcome, measured separately within each group.
Example - wrong: "350 of 2,000 farmers are in the adopter group, so p₁ = 350/2000."
Example - right: If your outcome is "achieved above-average yield," then p₁ = expected % of adopters with above-average yield, and p₂ = expected % of non-adopters with above-average yield. Use a pilot study, similar published research, or a reasoned guess for both.
What outcome are you comparing?
Group 1 name - optional, for clarity
Expected outcome rate - Group 1 (p₁)
0.55
Group 2 name - optional, for clarity
Expected outcome rate - Group 2 (p₂)
0.35
Both values describe the same outcome, measured separately in each group. The bigger the real gap between p₁ and p₂, the smaller the sample you need to detect it reliably. If you have no prior data, base your guess on the smallest difference that would actually matter for your study's purpose.
Calculation method
The standard (large-sample) formula can understate the required sample when proportions are close to 0 or 1, or when sample sizes are modest - common in epidemiology and public health work. The continuity-corrected version (Fleiss, Levin & Paik) adds a small correction term that is more conservative and generally preferred whenever feasible. Use standard for quick planning estimates; switch to corrected for a study design you intend to publish or defend.
Smallest difference worth detecting (Δ)
The smallest gap between the two group averages that you'd consider practically meaningful - not zero, and not necessarily the difference you expect, but the smallest one worth being able to detect.
Estimated standard deviation (σ) - assumed similar in both groups
Two-sided significance level (α) - shown as its critical value Zα/2
α is the risk you accept of declaring a "difference" that is really just random noise (a false positive) - 5% is the conventional default. The number in brackets, Zα/2, is the standard normal critical value that α converts to inside the formula (the test is two-sided, so α is split across both tails). The calculator uses Zα/2 directly in its arithmetic.
Statistical power (1−β) - shown as its critical value Zβ
Power (1−β) is the chance of correctly detecting the difference, if it truly exists at the size you specified. Zβ is the matching standard normal critical value used in the formula. Higher power needs a larger sample; 80–90% is standard practice.
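To make the conversion from α and power to their critical values concrete, here is a small Python sketch using the standard library's normal distribution. The function name is hypothetical and chosen only for this illustration.

```python
from statistics import NormalDist


def critical_values(alpha=0.05, power=0.80):
    """Return (Z_alpha/2, Z_beta) for a two-sided test.

    alpha is split across both tails, so the quantile is taken at
    1 - alpha/2; Z_beta is the standard normal quantile at the power.
    """
    std_normal = NormalDist()
    z_alpha_2 = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return z_alpha_2, z_beta


if __name__ == "__main__":
    for power in (0.80, 0.90):
        z_a, z_b = critical_values(0.05, power)
        print(f"power={power:.2f}: Z_alpha/2={z_a:.3f}, Z_beta={z_b:.3f}")
```

For α = 0.05 this gives Zα/2 of about 1.960; 80% power gives Zβ of about 0.842 and 90% power about 1.282, which is why raising power increases the required sample.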
Allocation ratio - how many in Group 2 per person in Group 1
Use unequal allocation when one group is harder or more expensive to recruit than the other, or when one group's population is capped (see below) - you can compensate with a larger comparison group.
Is Group 1's population capped? - e.g. only 350 units actually exist with this trait
If Group 1 is a naturally limited subgroup (e.g. only 350 farms in the area actually grow this variety, or only 350 animals carry this trait), you cannot sample more than that number from Group 1. Enter that cap here: the calculator applies a finite population correction (FPC) to Group 1's required sample, which will reduce the number needed, since sampling from a small, fully-enumerable population is more efficient than the formula assumes by default. The FPC-corrected figure on its own cannot exceed the cap, but design-effect and non-response inflation applied afterwards can push it above; if the final number still exceeds the cap, you'll get practical guidance instead.
Note: this corrects Group 1's sample size on its own (a single-group FPC), which is the standard practical approach. It does not re-derive the full two-sample power calculation under finite-population sampling, which is a more advanced, less commonly used method.
Advanced adjustments - optional, both default to "none"
These two adjustments inflate your final sample to account for real-world survey conditions. Apply them after deciding your statistical design, not instead of it.
Design effect (DEFF) - for cluster/multistage sampling
Expected non-response rate (%)
Required sample size per group (before allocation adjustment)
Step by step
Theory: comparing two groups
For two proportions, the per-group sample size (equal allocation, pooled variance) is:
where p̄ = (p₁ + p₂) / 2 is the pooled average proportion, Zα/2 is the two-sided critical value for your chosen significance level α, and Zβ is the critical value for your chosen power.
When allocation is unequal (ratio k = n₂/n₁), the calculator generalises this to:
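One widely used form of this calculation is sketched below in Python. It is an assumption that this matches the calculator's exact expression: the sketch uses the textbook form with a pooled variance under the null hypothesis and separate variances under the alternative, and for k ≠ 1 it weights the pooled proportion as p̄ = (p₁ + k·p₂) / (1 + k). At k = 1 this reduces to p̄ = (p₁ + p₂) / 2, the equal-allocation case. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_n1(p1, p2, alpha=0.05, power=0.80, k=1.0):
    """Group 1 sample size for comparing two proportions; n2 = k * n1.

    Assumed form (not confirmed as the calculator's own):
      n1 = [Za * sqrt((1 + 1/k) * pbar * (1 - pbar))
            + Zb * sqrt(p1*(1 - p1) + p2*(1 - p2)/k)]^2 / (p1 - p2)^2
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + k * p2) / (1 + k)            # weighted pooled proportion
    null_term = z_a * math.sqrt((1 + 1 / k) * p_bar * (1 - p_bar))
    alt_term = z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2) / k)
    return (null_term + alt_term) ** 2 / (p1 - p2) ** 2


if __name__ == "__main__":
    n_equal = two_proportion_n1(0.55, 0.35)
    n_unequal = two_proportion_n1(0.55, 0.35, k=2)
    print(math.ceil(n_equal), math.ceil(n_unequal))
```

With p₁ = 0.55, p₂ = 0.35, α = 0.05 and 80% power, equal allocation needs 96 per group under this form. Setting k = 2 lowers the Group 1 requirement, at the cost of a Group 2 twice as large.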
Continuity correction (optional, proportions only). The standard approximation above can understate the required sample for small samples or proportions near 0 or 1. The Fleiss, Levin & Paik continuity-corrected version is:
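A hedged Python sketch of the continuity correction in the form usually attributed to Fleiss, Levin & Paik follows. It takes the uncorrected Group 1 size n as input and inflates it; the function name is hypothetical, and whether the calculator uses precisely this expression is an assumption.

```python
import math


def continuity_corrected(n, p1, p2, k=1.0):
    """Inflate an uncorrected per-group size n (Group 1) for continuity.

    Assumed form:
      n_cc = (n / 4) * [1 + sqrt(1 + 2*(k + 1) / (n * k * |p1 - p2|))]^2
    where k = n2 / n1 is the allocation ratio.
    """
    gap = abs(p1 - p2)
    inner = 1 + 2 * (k + 1) / (n * k * gap)
    return (n / 4) * (1 + math.sqrt(inner)) ** 2


if __name__ == "__main__":
    # 96 is the uncorrected size for p1 = 0.55, p2 = 0.35, 80% power.
    print(math.ceil(continuity_corrected(96, 0.55, 0.35)))
```

For an uncorrected n of 96 with p₁ = 0.55 and p₂ = 0.35, the corrected size is 106 per group. The correction is always upward, and it matters most when n or the gap between the proportions is small.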
Capped Group 1 population (optional). If Group 1 cannot exceed a known population size, a finite population correction is applied to Group 1's required sample after the formula above:
n₁_capped = n₁ / [1 + (n₁ − 1) / cap]
This is a practical, single-group correction - not a full re-derivation of two-sample power under finite-population sampling, which would also require adjusting the variance terms jointly and is rarely done in applied work.
Design effect and non-response are then applied to Group 1's sample exactly as in the single-group case (n × DEFF, then ÷ (1 − r)), with Group 2 set to n₁ × ratio afterwards.
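The order of operations described above (cap correction, then DEFF, then non-response, then Group 2 from the ratio) can be sketched as follows. The function name, the rounding points, and the boolean flag for "exceeds the cap" are assumptions made for illustration.

```python
import math


def two_group_final_sizes(n1, ratio=1.0, cap=None, deff=1.0, nonresponse=0.0):
    """Return (n1_final, n2_final, exceeds_cap) from an unadjusted n1.

    nonresponse is a fraction (0.10 for 10%).
    """
    if cap is not None:
        # Single-group finite population correction on Group 1 only.
        n1 = n1 / (1 + (n1 - 1) / cap)
    n1 = n1 * deff / (1 - nonresponse)
    n1_final = math.ceil(n1)                 # assumption: round up here
    n2_final = math.ceil(n1_final * ratio)   # Group 2 follows the ratio
    exceeds_cap = cap is not None and n1_final > cap
    return n1_final, n2_final, exceeds_cap


if __name__ == "__main__":
    print(two_group_final_sizes(96, ratio=2, cap=350))
    print(two_group_final_sizes(400, cap=100, deff=2.0))
```

Starting from n₁ = 96 with a cap of 350 and a 2:1 ratio, Group 1 drops to 76 and Group 2 becomes 152. In the second call the correction alone gives about 80, but a design effect of 2 pushes the final figure to 161, above the cap of 100.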
Zα/2 - two-sided critical value for your significance level
Zβ - critical value for your chosen statistical power
p₁, p₂ - expected outcome rates in each group
Δ - smallest mean difference considered meaningful
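For the means case, which uses Δ and a common σ, a standard textbook form is n₁ = (1 + 1/k) × σ² × (Zα/2 + Zβ)² / Δ². The Python sketch below applies it; treating this as the calculator's formula is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_mean_n1(delta, sigma, alpha=0.05, power=0.80, k=1.0):
    """Group 1 sample size to detect a mean difference delta; n2 = k * n1.

    Assumes the same standard deviation sigma in both groups.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return (1 + 1 / k) * sigma ** 2 * (z_a + z_b) ** 2 / delta ** 2


if __name__ == "__main__":
    print(math.ceil(two_mean_n1(delta=5, sigma=10)))
```

With σ = 10, Δ = 5, α = 0.05 and 80% power, this gives 63 per group under equal allocation. Halving Δ quadruples the requirement, because Δ enters the formula squared.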
Use this when your population is split into known, non-overlapping subgroups (strata) - like districts, crop zones, or farm-size categories - and you want a total sample that's fairly distributed across them.
Overall margin of error (E)
5%
Same concept as the single-group margin of error, applied once results from all strata are combined back into an overall estimate.
Expected proportion (p) - overall, if you don't have per-stratum estimates
0.50
Confidence level
Your strata - add each subgroup with its population size
A stratum is a subgroup of your population that doesn't overlap with the others - e.g. "North zone" and "South zone," or "small," "medium," and "large" farms. Enter how many units exist in each. The calculator computes an overall sample size, then splits it across strata in proportion to their size (proportional allocation).
Advanced adjustments - optional, both default to "none"
These inflate the total sample for real-world survey conditions. They are applied to the overall total before it is split across strata by proportional allocation.
Design effect (DEFF) - for cluster/multistage sampling
Expected non-response rate (%)
Total required sample size
Step by step
Theory: stratified sampling
First, the overall sample size is calculated exactly as in the single-group case, using the combined population N (sum of all strata) and finite population correction:
n₀ = Z² × p × (1 − p) / E²
n_fpc = n₀ / [1 + (n₀ − 1) / N]
Design effect and non-response adjustments, if specified, are applied next exactly as in the single-group case:
n_final = (n_fpc × DEFF) / (1 − r)
This total (rounded up to n) is then divided across strata using proportional allocation - each stratum receives a share of the sample equal to its share of the total population:
nₕ (exact) = n × (Nₕ / N)
Controlled rounding. Rounding each stratum's allocation independently (e.g. with simple rounding) can leave the stratum allocations summing to slightly more or less than n. To guarantee Σnₕ = n, the calculator uses the largest-remainder method: take the integer (floor) part of each nₕ, then distribute the remaining unallocated units one at a time to the strata with the largest fractional remainders, until the total matches n exactly.
Nₕ - population size of stratum h
N - total population across all strata
nₕ - sample size allocated to stratum h, guaranteed to sum to the total
DEFF - design effect for cluster/multistage designs (default 1)
r - anticipated non-response rate (default 0)
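Proportional allocation with largest-remainder rounding, as described above, can be sketched in Python like this. The function name, the example strata, and the tie-breaking rule (earlier strata win ties) are assumptions for the illustration; the sketch also does not check that a stratum's allocation stays below its population.

```python
def proportional_allocation(n, strata):
    """Split a total sample n across strata in proportion to their size.

    strata maps a stratum name to its population N_h (integers).
    Uses the largest-remainder method so the allocations sum to n.
    """
    total = sum(strata.values())
    floors = {}
    remainders = {}
    for name, size in strata.items():
        # Integer arithmetic: n * N_h / N = quotient + remainder / N.
        floors[name], remainders[name] = divmod(n * size, total)
    leftover = n - sum(floors.values())
    # Strata with the largest fractional parts get one extra unit each.
    ranked = sorted(strata, key=lambda name: remainders[name], reverse=True)
    for name in ranked[:leftover]:
        floors[name] += 1
    return floors


if __name__ == "__main__":
    zones = {"North": 900, "South": 700, "East": 400}
    print(proportional_allocation(323, zones))
```

For n = 323 and strata of 900, 700 and 400 units, the exact shares are 145.35, 113.05 and 64.60. The floors sum to 322, so the one remaining unit goes to the stratum with the largest remainder (East), giving 145, 113 and 65.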
Note: this tool uses proportional allocation, which works well when variability is similar across strata. If variability differs substantially between strata (e.g. yields are far more variable in one zone than another), Neyman (optimal) allocation - which also weights by each stratum's standard deviation - gives a more efficient design. That refinement needs per-stratum variance estimates and isn't included here.
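For orientation only, Neyman allocation in its basic form sets nₕ = n × Nₕσₕ / Σ(Nₕσₕ). The Python sketch below shows that formula; it is not part of this tool, the function name and example figures are hypothetical, and it returns unrounded shares.

```python
def neyman_allocation(n, strata):
    """Split n across strata in proportion to N_h * sigma_h.

    strata maps a stratum name to a (population, std_dev) pair.
    Returns exact (unrounded) allocations.
    """
    weights = {name: size * sd for name, (size, sd) in strata.items()}
    total_weight = sum(weights.values())
    return {name: n * w / total_weight for name, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical zones: the smaller zone is three times as variable.
    zones = {"Zone A": (600, 10.0), "Zone B": (400, 30.0)}
    print(neyman_allocation(100, zones))
```

In this made-up example, proportional allocation would give Zone A 60 of 100 units, whereas weighting by standard deviation shifts about two thirds of the sample to the more variable Zone B.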
References: Cochran, W.G. (1977). Sampling Techniques (3rd ed.), Chapter 5. New York: John Wiley & Sons. Largest-remainder (Hamilton) apportionment method, standard in survey allocation and electoral apportionment literature.