svy is a Python package for designing and analyzing complex survey data.
Samples for large-scale surveys are often selected using complex random mechanisms such as stratification, clustering, or unequal selection probabilities. To produce accurate estimates, statistical analyses must account for the sampling design. svy provides a unified toolkit for working with these designs, so that inferences correctly reflect the underlying population.
The package covers the main stages of quantitative survey research, from sample design and selection through weighting, estimation, and categorical data analysis.
Key Features
Sample size calculation and allocation
Functions to determine required sample sizes and distribute them efficiently across strata or clusters.
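
To make the idea concrete, here is a minimal sketch of a sample size calculation followed by proportional allocation. It uses only the Python standard library and does not show svy's own API; the function names `sample_size_proportion` and `proportional_allocation`, the design effect of 1.5, and the stratum sizes are all assumptions chosen for illustration.

```python
import math

def sample_size_proportion(p, margin, z=1.96, deff=1.0):
    """Sample size needed to estimate a proportion p within +/- margin.

    Uses the normal approximation under simple random sampling and
    inflates the result by an assumed design effect (deff).
    """
    n_srs = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n_srs * deff)

def proportional_allocation(n, stratum_sizes):
    """Split n across strata in proportion to their population sizes."""
    total = sum(stratum_sizes.values())
    exact = {h: n * size / total for h, size in stratum_sizes.items()}
    alloc = {h: math.floor(x) for h, x in exact.items()}
    leftover = n - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in by_fraction[:leftover]:
        alloc[h] += 1
    return alloc

# Hypothetical inputs: 5-point margin of error, assumed design effect of 1.5.
n = sample_size_proportion(p=0.5, margin=0.05, deff=1.5)
allocation = proportional_allocation(n, {"urban": 60_000, "rural": 40_000})
print(n, allocation)
```

Proportional allocation is only one option; allocations that also account for stratum variances or costs would change the second function, not the first.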
Sampling methods
Built-in algorithms for:
- Simple random sampling (SRS)
- Systematic selection (SYS)
- Probability proportional to size (PPS)
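
The three selection schemes listed above can be sketched in plain Python as follows. This is an illustration of the techniques, not svy's implementation: the function names, the frame, and the cluster sizes are hypothetical, and the PPS variant shown is the systematic one (other PPS algorithms exist).

```python
import random

def srs(frame, n, rng):
    """Simple random sampling without replacement."""
    return rng.sample(frame, n)

def systematic(frame, n, rng):
    """Systematic selection: random start, then every (N/n)-th unit."""
    step = len(frame) / n
    start = rng.random() * step
    # min() guards against a floating-point index equal to len(frame).
    return [frame[min(int(start + i * step), len(frame) - 1)] for i in range(n)]

def pps_systematic(frame, sizes, n, rng):
    """Systematic PPS: step through the cumulated size measures.

    A unit whose size exceeds the sampling interval can be hit more
    than once; this sketch does not treat such certainty units separately.
    """
    step = sum(sizes) / n
    start = rng.random() * step
    picks, cum, idx = [], 0.0, 0
    for i in range(n):
        target = start + i * step
        # Advance until the target falls inside the current unit's size range.
        while idx < len(frame) - 1 and cum + sizes[idx] <= target:
            cum += sizes[idx]
            idx += 1
        picks.append(frame[idx])
    return picks

rng = random.Random(2024)          # fixed seed so the draw is reproducible
frame = list(range(1, 101))        # hypothetical frame of 100 unit ids
srs_sample = srs(frame, 10, rng)
sys_sample = systematic(frame, 10, rng)

clusters = ["A", "B", "C", "D", "E", "F"]
sizes = [120, 80, 200, 50, 150, 100]   # hypothetical measures of size
pps_sample = pps_systematic(clusters, sizes, 3, rng)
```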
Sample weighting
Tools to derive design weights and adjust them for nonresponse or to calibrate against population totals.
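
As a rough sketch of what these weighting steps involve, the example below derives design weights as inverse selection probabilities, applies a weighting-class nonresponse adjustment, and then post-stratifies to known totals (the simplest form of calibration). The function names, adjustment classes, and control totals are assumptions made for illustration and are not taken from svy.

```python
def design_weights(selection_probs):
    """Design weight = inverse of the selection probability."""
    return [1.0 / p for p in selection_probs]

def adjust_nonresponse(weights, responded, classes):
    """Weighting-class adjustment.

    Within each class, respondents' weights are inflated so that they
    also carry the weight of the nonrespondents, who get weight 0.
    Assumes every class has at least one respondent.
    """
    total, resp = {}, {}
    for w, r, c in zip(weights, responded, classes):
        total[c] = total.get(c, 0.0) + w
        if r:
            resp[c] = resp.get(c, 0.0) + w
    return [w * total[c] / resp[c] if r else 0.0
            for w, r, c in zip(weights, responded, classes)]

def poststratify(weights, classes, control_totals):
    """Scale weights so each class sums to its known population total."""
    sums = {}
    for w, c in zip(weights, classes):
        sums[c] = sums.get(c, 0.0) + w
    return [w * control_totals[c] / sums[c] for w, c in zip(weights, classes)]

# Hypothetical data: four sampled units in two adjustment classes.
probs = [0.1, 0.1, 0.2, 0.2]
responded = [True, False, True, True]
classes = ["a", "a", "b", "b"]

base = design_weights(probs)
adjusted = adjust_nonresponse(base, responded, classes)
final = poststratify(adjusted, classes, {"a": 25.0, "b": 15.0})
```

The nonresponse step preserves the total weight within each class, while the post-stratification step deliberately changes it to match the external totals.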
Estimation of population parameters
Procedures for means, totals, proportions, and regression estimates using:
- Taylor-based methods (linearization)
- Replication-based methods (Bootstrap, BRR, Jackknife)
- Regression-based methods (e.g., generalized regression, GREG)
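
To illustrate the Taylor-based approach, the sketch below estimates a weighted mean and its standard error by linearization, treating primary sampling units (PSUs) as sampled with replacement within strata. It is a standalone illustration, not svy's estimator; the function name `weighted_mean_taylor` and the data are hypothetical, and finite population corrections are ignored.

```python
from collections import defaultdict

def weighted_mean_taylor(y, w, strata, psu):
    """Weighted mean with a Taylor-linearized standard error.

    The mean is a ratio (sum of w*y over sum of w), so each observation
    is replaced by its linearized value w*(y - mean)/sum(w). These are
    totalled by PSU, and the between-PSU variance is summed over strata.
    """
    w_total = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / w_total

    # Linearized PSU totals, grouped by stratum.
    totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j in zip(y, w, strata, psu):
        totals[h][j] += wi * (yi - mean) / w_total

    variance = 0.0
    for h, psu_totals in totals.items():
        n_h = len(psu_totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least two PSUs")
        z_bar = sum(psu_totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2
                                          for z in psu_totals.values())
    return mean, variance ** 0.5

# Hypothetical data: two strata, two PSUs each, two units per PSU.
y      = [12.0, 15.0, 9.0, 11.0, 20.0, 22.0, 18.0, 25.0]
w      = [50.0, 50.0, 40.0, 40.0, 30.0, 30.0, 35.0, 35.0]
strata = ["s1", "s1", "s1", "s1", "s2", "s2", "s2", "s2"]
psu    = [1, 1, 2, 2, 3, 3, 4, 4]

mean, se = weighted_mean_taylor(y, w, strata, psu)
```

With equal weights, a single stratum, and every unit as its own PSU, this reduces to the familiar standard error of a simple random sample mean, which is a convenient sanity check.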
Categorical data analysis
Support for survey-weighted:
- Chi-square tests
- Logistic regression
- Multinomial regression
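
As a small illustration of what "survey-weighted" means for a chi-square test, the sketch below builds a weighted two-way table of estimated proportions and computes the Pearson statistic from it. This is only the first step: a design-based test would further correct the statistic for the sampling design (for example with a Rao-Scott adjustment), which is not shown here. The function name and data are hypothetical and do not reflect svy's API.

```python
def weighted_pearson(rows, cols, weights):
    """Pearson chi-square statistic computed from weighted cell proportions.

    Returns n * sum((p_rc - p_r * p_c)^2 / (p_r * p_c)), where the
    proportions are weight-based estimates and n is the sample size.
    No design correction is applied.
    """
    n = len(weights)
    w_total = sum(weights)
    cell, row_m, col_m = {}, {}, {}
    for r, c, w in zip(rows, cols, weights):
        p = w / w_total
        cell[(r, c)] = cell.get((r, c), 0.0) + p
        row_m[r] = row_m.get(r, 0.0) + p
        col_m[c] = col_m.get(c, 0.0) + p

    stat = 0.0
    for r, p_r in row_m.items():
        for c, p_c in col_m.items():
            expected = p_r * p_c              # proportion under independence
            observed = cell.get((r, c), 0.0)
            stat += (observed - expected) ** 2 / expected
    return n * stat

# Hypothetical responses with unequal weights.
smoker  = ["yes", "yes", "no", "no", "no", "yes"]
region  = ["north", "south", "north", "south", "south", "north"]
weights = [120.0, 80.0, 150.0, 95.0, 110.0, 60.0]

chi2_uncorrected = weighted_pearson(smoker, region, weights)
```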