Nonresponse adjustment, poststratification, calibration, raking, and replicate weights
Learn how to calculate and adjust survey weights in Python using the svy library. Covers nonresponse adjustment, poststratification, calibration, raking, and replicate weights for variance estimation.
Sample weighting allows analysts to generalize results from a survey sample to the target population. Design weights (also called base weights) are derived as the inverse of the final probability of selection. In large-scale surveys, these design weights are often further adjusted to correct for nonresponse, extreme values, or to align auxiliary variables with known population controls.
This tutorial covers two main topics:
Weight adjustment techniques — nonresponse adjustment, poststratification, calibration, and raking to improve representativeness and reduce bias
Replicate weights for variance estimation — Bootstrap, Balanced Repeated Replication (BRR), and Jackknife methods
For more on sample-weight adjustments, see Valliant and Dever (2018), which provides a step-by-step guide to calculating survey weights.
Setting Up the Sample Data
This tutorial uses the World Bank (2023) synthetic sample data.
```python
import numpy as np
from rich import print as rprint

import svy

hld_data = svy.load_dataset(name="hld_sample_wb_2023", limit=None)
print(f"The number of records in the household sample data is {hld_data.shape[0]}")
```
The number of records in the household sample data is 8000
Weight Adjustment Methods
In practice, base weights derived from selection probabilities are routinely adjusted to:
Correct for nonresponse and unknown eligibility
Temper the influence of extreme or large weights
Align the weighted sample with known auxiliary controls
This section demonstrates five key methods available in the svy library:
| Method | Function | Purpose |
| --- | --- | --- |
| Nonresponse adjustment | adjust_nr() | Account for unit nonresponse and unknown eligibility |
| Poststratification | poststratify() | Match weights to known control totals |
| Calibration | calibrate() | Adjust weights using the GREG framework with auxiliary variables |
| Raking | rake() | Rescale weights to match multiple marginal totals |
| Normalization | normalize() | Rescale weights to sum to a chosen constant |
Creating the Design Weight
The design weight (or base weight) represents the inverse of the overall probability of selection—the product of first-stage and second-stage selection probabilities, as explained in the Sample Selection tutorial.
```python
# Define the sampling design
hld_design = svy.Design(stratum=["geo1", "urbrur"], psu=["ea"], wgt="hhweight")

# Create the sample
hld_sample = svy.Sample(data=hld_data, design=hld_design)
```
The dataset includes a household-level base weight variable named hhweight. Let’s rename it to base_wgt for clarity:
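A minimal sketch of one way to do this, assuming hld_data is a polars DataFrame as elsewhere in this tutorial (svy may also offer its own renaming helper); the design is simply rebuilt with the new name:

```python
# Rename the base-weight column in the underlying data
hld_data = hld_data.rename({"hhweight": "base_wgt"})

# Rebuild the design and sample so the weight reference points to base_wgt
hld_design = svy.Design(stratum=["geo1", "urbrur"], psu=["ea"], wgt="base_wgt")
hld_sample = svy.Sample(data=hld_data, design=hld_design)
print(hld_sample)
```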
```
Sample
Survey Data:
  Number of rows:    8000
  Number of columns: 52
  Number of strata:  19
  Number of PSUs:    320

Survey Design:
  Field              Value
  ─────────────────  ──────────────
  Row index          svy_row_index
  Stratum            (geo1, urbrur)
  PSU                (ea,)
  SSU                None
  Weight             base_wgt
  With replacement   False
  Prob               None
  Hit                None
  MOS                None
  Population size    None
  Replicate weights  None
```
Understanding Response Status Categories
The core idea of nonresponse adjustment is to redistribute the survey weights of eligible non-respondents to eligible respondents within defined adjustment classes.
In practice, some units have unknown eligibility. How their weights are handled is survey-specific. Common options include:
Treat unknowns like eligibles — redistribute their weights to respondents in the same class
Partition unknowns — allocate a fraction to ineligibles and the remainder to eligibles
Exclude unknowns from redistribution — leave their weights unchanged
The svy library classifies records into four response categories:
| Code | Meaning |
| --- | --- |
| rr | Respondent |
| nr | Nonrespondent |
| uk | Unknown eligibility |
| in | Ineligible |
By default, unknowns are treated as potentially ineligible, so their weights are redistributed to the ineligible group as well.
Simulating Response Status
The World Bank simulated data has a 100% observed response rate. For demonstration purposes, we’ll simulate ineligibility and nonresponse:
```python
import numpy as np
import polars as pl

rng = np.random.default_rng(12345)
RESPONSE_STATUS = rng.choice(
    ("ineligible", "respondent", "non-respondent", "unknown"),
    p=(0.03, 0.82, 0.10, 0.05),
    size=hld_sample.n_records,
)
hld_sample = hld_sample.wrangling.mutate({"resp_status": RESPONSE_STATUS})

# Show 9 eligible non-respondent records in geo_01
print(
    hld_sample.show_records(
        columns=["hid", "geo1", "urbrur", "resp_status"],
        where=[
            svy.col("resp_status") == "non-respondent",
            svy.col("geo1").is_in(["geo_01"]),
        ],
        n=9,
    )
)
```
If your dataset uses different labels, provide a mapping to the canonical values:
```python
# Mapping of canonical response status codes to descriptive labels
status_mapping = {
    "in": "ineligible",
    "rr": "respondent",
    "nr": "non-respondent",
    "uk": "unknown",
}
```
Nonresponse Adjustment
Tip: Implementation in svy
Use Sample.adjust_nr() to adjust sample weights for nonresponse. The method computes adjusted weights and stores them in the sample object for downstream estimation.
The adjust_nr() method includes an unknown_to_inelig parameter that controls where unknowns’ weights go:
unknown_to_inelig=True (default) — Unknowns’ weights are redistributed to ineligibles. Respondents’ adjusted weights are generally smaller.
unknown_to_inelig=False — Unknowns’ weights are not given to ineligibles. Respondents’ adjusted weights are larger.
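A minimal sketch of the call follows. The wgt_name and unknown_to_inelig arguments are described in this tutorial; the argument names for the status column and the label mapping are assumptions:

```python
# Sketch of a nonresponse adjustment; resp_status= and mapping= are assumed
# argument names, not confirmed svy API
hld_sample.adjust_nr(
    resp_status="resp_status",   # column holding the status labels (assumed)
    mapping=status_mapping,      # canonical codes -> labels (assumed)
    unknown_to_inelig=True,      # default handling of unknown eligibility
    wgt_name="nr_wgt",           # name for the adjusted weight
)
```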
```python
# Show a random sample with the key columns
out = hld_sample.show_data(
    columns=["hid", "geo1", "geo2", "base_wgt", "resp_status", "nr_wgt"],
    how="sample",
    n=10,
    rstate=rng,
)
print(out)

# The sample design now references the adjusted weight
print(hld_sample)
```
```
Sample
Survey Data:
  Number of rows:    6629
  Number of columns: 54
  Number of strata:  19
  Number of PSUs:    320

Survey Design:
  Field              Value
  ─────────────────  ──────────────
  Row index          svy_row_index
  Stratum            (geo1, urbrur)
  PSU                (ea,)
  SSU                None
  Weight             nr_wgt
  With replacement   False
  Prob               None
  Hit                None
  MOS                None
  Population size    None
  Replicate weights  None
```
If you don’t specify wgt_name, svy creates the adjusted weight automatically as svy_adjusted_<base_weight_name>
Set replace=True to replace the pre-adjusted variable with the adjusted one
svy updates the sample design internally so the weight reference points to the adjusted weight
Poststratification
Poststratification compensates for under- or over-representation in the sample by adjusting weights so that weighted sums within poststratification classes match known control totals from reliable sources.
Poststratification classes need not mirror the sampling design—they can be formed from additional variables. Common choices include age group, gender, race/ethnicity, and education.
Warning
Use current, reliable controls: Poststratifying to out-of-date or unreliable totals may introduce bias rather than reduce it. Document your sources and reference dates.
Tip: Implementation in svy
Use Sample.poststratify() to adjust sample weights to match known population totals.
Let’s assume we have reliable control totals (e.g., from a recent census) for households per administrative region:
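As a sketch, with hypothetical region totals (the by= and controls= argument names and the totals themselves are assumptions, shown only to illustrate the shape of the call):

```python
# Hypothetical control totals of households per geo1 region (illustrative)
region_controls = {
    "geo_01": 120_000,
    "geo_02": 95_000,
    "geo_03": 150_000,  # and so on for the remaining regions
}

# Sketch of the poststratification call; by= and controls= are assumed names
hld_sample.poststratify(
    by="geo1",
    controls=region_controls,
    wgt_name="ps_wgt",
)
```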
```python
# Show a random sample with the key columns
out = hld_sample.show_data(
    columns=["hid", "geo1", "nr_wgt", "ps_wgt"],
    how="sample",
    n=10,
    sort_by="geo1",
    rstate=rng,
)
print(out)
```
The sample design is automatically updated with the new weight:
```python
print(hld_sample)
```
```
Sample
Survey Data:
  Number of rows:    6629
  Number of columns: 55
  Number of strata:  19
  Number of PSUs:    320

Survey Design:
  Field              Value
  ─────────────────  ──────────────
  Row index          svy_row_index
  Stratum            (geo1, urbrur)
  PSU                (ea,)
  SSU                None
  Weight             ps_wgt
  With replacement   False
  Prob               None
  Hit                None
  MOS                None
  Population size    None
  Replicate weights  None
```
Calibration (GREG)
Calibration adjusts sample weights so that certain totals align with known population values. The Generalized Regression (GREG) approach is a model-assisted version that assumes the survey variable of interest relates to auxiliary variables through a regression-type relationship. See Deville and Särndal (1992) for the foundational work on calibration, and Särndal, Swensson, and Wretman (1992) for a thorough treatment of model-assisted survey sampling.
GREG calibration finds weights that:
Stay as close as possible to the original design weights
Make the weighted totals of auxiliary variables match their known population values
When auxiliary variables correlate strongly with the study variable, GREG estimates tend to be more stable and efficient than simple design-based estimates.
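For the common chi-square distance, the calibration problem can be written as follows (Deville and Särndal 1992):

$$
\min_{w}\; \sum_{i \in s} \frac{(w_i - d_i)^2}{d_i q_i}
\quad \text{subject to} \quad
\sum_{i \in s} w_i \, \mathbf{x}_i = \mathbf{t}_x,
$$

where $d_i$ is the design weight of unit $i$, $\mathbf{x}_i$ its vector of auxiliary variables, $\mathbf{t}_x$ the vector of known population totals, and $q_i$ an optional unit-level scale factor.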
Tip: Implementation in svy
Use Sample.calibrate() or Sample.calibrate_matrix() to apply GREG calibration.
Tip: Use the by parameter in calibrate() to apply the calibration separately within domains.
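A minimal sketch, assuming calibrate() takes auxiliary variables and their known totals (the aux_vars= and controls= argument names, the hhsize variable, and the totals are all assumptions):

```python
# Sketch of a GREG calibration call; argument names and totals are
# illustrative, not confirmed svy API
hld_sample.calibrate(
    aux_vars=["hhsize"],             # auxiliary variable(s) (assumed name)
    controls={"hhsize": 1_250_000},  # known population total (illustrative)
    wgt_name="cal_wgt",
)
```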
Raking (Iterative Proportional Fitting)
Raking (also called iterative proportional fitting or IPF) adjusts survey weights so that weighted sample distributions match known population margins for several categorical variables.
Unlike calibration, which aligns multiple totals simultaneously, raking updates weights iteratively—adjusting one margin at a time until all specified margins agree with population controls within tolerance.
Raking is especially useful when only marginal totals are available (e.g., totals by age group and totals by gender, but not their cross-tabulation).
Tip: Implementation in svy
Use Sample.rake() to apply iterative proportional fitting.
First, create a categorical variable for household size:
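The raking call itself might look like the following sketch, using the occupancy-status and electricity margins that appear in the output below (the margins= argument name, the category labels, and the control totals are all assumptions):

```python
# Sketch of a raking call; margins= is an assumed argument name, and the
# category labels and totals are illustrative
hld_sample.rake(
    margins={
        "statocc": {"owner": 900_000, "tenant": 300_000},
        "electricity": {"yes": 950_000, "no": 250_000},
    },
    wgt_name="rake_wgt",
)
```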
```python
# Show a random sample with the key columns
out = hld_sample.show_data(
    columns=["hid", "statocc", "electricity", "ps_wgt", "rake_wgt"],
    how="sample",
    n=10,
    sort_by=("statocc", "electricity"),
    rstate=rng,
)
print(out)
```
Normalization
Surveys sometimes normalize weights to a convenient constant (e.g., the sample size or 1,000) so results are easier to compare across analyses.
Normalization multiplies every weight by the same factor. It does not change weighted means, proportions, or regression coefficients (the factor cancels), but it does change level estimates such as totals—and their standard errors—by the same factor.
Tip: Implementation in svy
Use Sample.normalize() to scale sample weights to a target sum.
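A minimal sketch, scaling the weights to sum to the sample size (the control= argument name is an assumption):

```python
# Sketch of weight normalization; control= is an assumed argument name
hld_sample.normalize(
    control=hld_sample.n_records,  # target sum: here, the sample size
    wgt_name="norm_wgt",
)
```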
Replicate Weights for Variance Estimation
Replicate weights are constructed primarily for variance (uncertainty) estimation. They are especially useful when:
Estimating non-linear parameters where Taylor linearization may be inaccurate
The number of PSUs per stratum is small, making linearization unstable
This section demonstrates three replication methods using the svy library:
| Method | Function | Requirements |
| --- | --- | --- |
| Balanced Repeated Replication (BRR) | create_brr_wgts() | Exactly 2 PSUs per stratum |
| Jackknife (JK) | create_jk_wgts() | ≥2 PSUs per stratum |
| Bootstrap (BS) | create_bs_wgts() | ≥2 PSUs per stratum |
Sample Data for Replicate Weights
BRR assumes exactly two PSUs per stratum after any collapsing. To demonstrate the syntax without complex data engineering, we’ll construct a small BRR-compatible dummy sample:
```python
rows = []
y_means = {
    "S1_P1": 10, "S1_P2": 12,
    "S2_P1": 8, "S2_P2": 9,
    "S3_P1": 15, "S3_P2": 13,
    "S4_P1": 11, "S4_P2": 10,
}
for s in range(1, 5):  # S1..S4
    for p in range(1, 3):  # P1..P2 (2 PSUs per stratum)
        label = f"S{s}_P{p}"
        for i in range(3):  # 3 units per PSU
            rows.append(
                {
                    "unit_id": f"S{s}P{p}U{i + 1}",
                    "stratum": f"S{s}",
                    "cluster": f"P{p}",
                    "weight": 1.0,  # base weight
                    "y": rng.normal(y_means[label], 1.0),  # outcome
                }
            )
df_rep = pl.DataFrame(rows)
print(df_rep)
```
Balanced Repeated Replication (BRR)
BRR forms balanced half-samples within each stratum using a Hadamard design. It requires exactly 2 PSUs per stratum.
By default, svy sets the number of replicates to the smallest multiple of 4 strictly greater than the number of strata. You can request more by passing n_reps.
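Putting it together on the dummy data (a sketch; whether create_brr_wgts() modifies the sample in place, as the stored RepWeights in the output below suggests, is an assumption):

```python
# Build the demo sample: 4 strata, 2 PSUs each
rep_design = svy.Design(stratum=["stratum"], psu=["cluster"], wgt="weight")
rep_sample = svy.Sample(data=df_rep, design=rep_design)

# With 4 strata, svy defaults to 8 BRR replicates (the smallest multiple
# of 4 strictly greater than the number of strata)
rep_sample.create_brr_wgts()
print(rep_sample)
```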
```
Sample
Survey Data:
  Number of rows:    24
  Number of columns: 16
  Number of strata:  4
  Number of PSUs:    8

Survey Design:
  Field              Value
  ─────────────────  ─────────────────────────────────────────
  Row index          svy_row_index
  Stratum            stratum
  PSU                cluster
  SSU                None
  Weight             weight
  With replacement   False
  Prob               None
  Hit                None
  MOS                None
  Population size    None
  Replicate weights  RepWeights(method=BRR, prefix='brr_rep_wgt',
                     n_reps=8, df=4, fay=0.0)
```
Fay-BRR (Damped BRR)
Fay-BRR is a damped version where each replicate weight combines the full weight and the BRR half-sample weight. Choose a Fay factor ρ ∈ (0,1), commonly between 0.3 and 0.5, to reduce perturbation and improve stability:
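As a sketch, with ρ = 0.3 (the fay= argument name follows the RepWeights display used in this tutorial, but the exact signature is an assumption):

```python
# Sketch of Fay-BRR replicate weights with damping factor rho = 0.3
rep_sample.create_brr_wgts(fay=0.3)
```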
```
Sample
Survey Data:
  Number of rows:    24
  Number of columns: 36
  Number of strata:  4
  Number of PSUs:    8

Survey Design:
  Field              Value
  ─────────────────  ─────────────────────────────────────────
  Row index          svy_row_index
  Stratum            stratum
  PSU                cluster
  SSU                None
  Weight             weight
  With replacement   False
  Prob               None
  Hit                None
  MOS                None
  Population size    None
  Replicate weights  RepWeights(method=BRR, prefix='brr_rep_wgt',
                     n_reps=8, df=4, fay=0.0)
```
Bootstrap (BS)
Bootstrap replicates are formed by re-sampling PSUs with replacement within each stratum, drawing the same number of PSUs as observed in the sample for every replicate. The selection is independent across replicates, and weights are rescaled (e.g., Rao–Wu rescaled bootstrap) so estimators remain unbiased under the design.
If n_reps is omitted, create_bs_wgts() defaults to 500 replicates. Increase this for highly non-linear targets.
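A minimal sketch requesting 50 replicates, matching the output shown below (whether the weights are stored in place is an assumption, as with BRR):

```python
# Sketch: create 50 bootstrap replicate weights on the demo sample
rep_sample.create_bs_wgts(n_reps=50)
print(rep_sample)
```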
```
Sample
Survey Data:
  Number of rows:    24
  Number of columns: 86
  Number of strata:  4
  Number of PSUs:    8

Survey Design:
  Field              Value
  ─────────────────  ─────────────────────────────────────────
  Row index          svy_row_index
  Stratum            stratum
  PSU                cluster
  SSU                None
  Weight             weight
  With replacement   False
  Prob               None
  Hit                None
  MOS                None
  Population size    None
  Replicate weights  RepWeights(method=Bootstrap, prefix='bs_rep_wgt',
                     n_reps=50, df=49)
```
Adjustment of Replicate Weights
Coming Soon: This section will cover how to apply nonresponse and calibration adjustments to replicate weights.
Summary
This tutorial covered the essential techniques for survey weight adjustment and variance estimation:
Weight Adjustments:
Nonresponse adjustment with adjust_nr() — redistributes weights from non-respondents to respondents
Poststratification with poststratify() — aligns weights to known population totals
Calibration with calibrate() — uses GREG framework with auxiliary variables
Raking with rake() — iteratively matches multiple marginal distributions
Normalization with normalize() — scales weights to a convenient total
Replicate Weights:
BRR with create_brr_wgts() — requires exactly 2 PSUs per stratum
Jackknife with create_jk_wgts() — flexible, works with ≥2 PSUs per stratum
Bootstrap with create_bs_wgts() — most flexible, good for complex designs
Next Steps
Now that you understand how to create and adjust survey weights, continue to the Estimation tutorial to learn how to compute point estimates and standard errors using these weights.