Replicate Weights for Variance Estimation in Python
BRR, Jackknife, Bootstrap, and SDR methods for complex survey designs
Tutorials
Survey Weighting
Variance Estimation
Python
Learn how to create and adjust replicate weights for survey variance estimation using the svy library. Covers BRR, Jackknife (JKn and JK2), Bootstrap, and SDR methods with examples of propagating weight adjustments to replicates.
Replicate weights provide a flexible approach to variance estimation for complex survey designs. Rather than relying on analytical formulas (Taylor linearization), replication methods estimate variance by repeatedly perturbing the sample weights and observing the resulting variation in estimates.
This tutorial covers:
When to use replicate weights — comparing replication vs. Taylor linearization
Creating replicate weights — BRR, Jackknife (JKn and JK2), Bootstrap, and SDR methods
Preparing designs for replication — using create_variance_strata() to create paired PSU structures
Adjusting replicate weights — propagating nonresponse, calibration, and other adjustments
This tutorial assumes you’ve already created base weights and applied any necessary adjustments. See the Weight Adjustment tutorial for details on nonresponse adjustment, poststratification, calibration, and raking.
When to Use Replicate Weights
Replicate weights are especially useful when:
Estimating non-linear statistics (medians, percentiles, ratios) where Taylor linearization may be inaccurate
The number of PSUs per stratum is small, making linearization-based variance estimates unstable
Software limitations prevent proper specification of the complex design
Sharing data with secondary analysts who may not have access to design details
Approach
Strengths
Limitations
Taylor linearization
Computationally efficient; no additional columns needed
Requires correct design specification; may be inaccurate for non-linear statistics
Replication
Flexible; works well for non-linear statistics; easy to share
Increases file size; computationally heavier for many replicates
Setting Up the Sample Data
We’ll create a sample dataset that satisfies the requirements for all replication methods. BRR requires exactly 2 PSUs per stratum, which is the most restrictive requirement.
import numpy as npimport polars as plimport svyrng = np.random.default_rng(42)# Create sample with 4 strata, 2 PSUs each, 5 units per PSUrows = []y_means = {"S1_P1": 10, "S1_P2": 12,"S2_P1": 8, "S2_P2": 9,"S3_P1": 15, "S3_P2": 13,"S4_P1": 11, "S4_P2": 10,}for s inrange(1, 5):for p inrange(1, 3): label =f"S{s}_P{p}"for i inrange(5): rows.append({"unit_id": f"S{s}P{p}U{i +1}","stratum": f"S{s}","psu": f"S{s}_P{p}","base_wgt": rng.uniform(1.0, 3.0),"y": rng.normal(y_means[label], 2.0),"x_cat": rng.choice(["A", "B", "C"]),"resp_status": rng.choice( ["respondent", "non-respondent"], p=[0.85, 0.15] ), })df = pl.DataFrame(rows)print(f"Sample size: {df.shape[0]} units")print(df.head(10))
╭─────────────────────────── Sample ────────────────────────────╮│Survey Data:││ Number of rows: 40 ││ Number of columns: 10 ││ Number of strata: 4 ││ Number of PSUs: 8 ││││Survey Design:││││Field Value ││ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ││ Row index svy_row_index ││ Stratum stratum ││ PSU psu ││ SSU None ││ Weight base_wgt ││ With replacement False ││ Prob None ││ Hit None ││ MOS None ││ Population size None ││ Replicate weights None │││╰───────────────────────────────────────────────────────────────╯
Replication Methods
The svy library supports four replication methods:
Method
Function
Requirements
Typical Use Case
Balanced Repeated Replication (BRR)
create_brr_wgts()
Exactly 2 PSUs per stratum
Designs with paired PSUs
Jackknife (JKn)
create_jk_wgts(paired=False)
≥1 PSU per stratum
General purpose; moderate # of PSUs
Jackknife (JK2)
create_jk_wgts(paired=True)
2–3 PSUs per stratum
Paired designs; fewer replicates than JKn
Bootstrap
create_bs_wgts()
≥2 PSUs per stratum
Complex designs; non-linear statistics
Successive Difference Replication (SDR)
create_sdr_wgts()
Ordered/systematic samples
Systematic samples; time series
Balanced Repeated Replication (BRR)
BRR constructs balanced half-samples using a Hadamard matrix design. In each replicate, one PSU is selected from each stratum, and the weights of selected PSUs are doubled while the other PSU receives zero weight.
Requirements:
Exactly 2 PSUs per stratum (after any stratum collapsing)
Number of replicates defaults to the smallest Hadamard matrix size ≥ number of strata
Notice how within each stratum, one PSU has doubled weights (2×) and the other has zero weights in each replicate.
BRR Parameters
Parameter
Type
Default
Description
n_reps
int | None
Auto
Number of replicates (rounded to valid Hadamard size)
rep_prefix
str | None
"svy_brr_wgt"
Prefix for replicate weight column names
fay_coef
float
0.0
Fay coefficient ρ ∈ [0, 1) for damped BRR
rstate
int | None
None
Random seed for PSU ordering within strata
drop_nulls
bool
False
Drop rows with missing values in design columns
Reproducible BRR with rstate
The rstate parameter controls how PSUs are ordered within each stratum before applying the Hadamard matrix. This affects which PSU gets the +1 (doubled) vs -1 (zeroed) treatment in each replicate.
With fay_coef=0.5, selected PSUs get weight × 1.5 and non-selected get weight × 0.5 (instead of ×2 and ×0).
Jackknife Methods (JKn and JK2)
The svy library supports two jackknife variants controlled by the paired parameter:
Variant
paired
# Replicates
Description
JKn (delete-one-PSU)
False
Total # of PSUs
Delete one PSU at a time across entire sample
JK2 (paired/stratified)
True
# of strata
Delete one PSU per stratum per replicate
JKn: Delete-One-PSU Jackknife
JKn creates replicates by systematically deleting one PSU at a time and re-weighting the remaining PSUs within each stratum. The number of replicates equals the total number of PSUs across all strata.
JK2 is designed for paired PSU designs (2–3 PSUs per stratum). It creates one replicate per stratum, where in each replicate one PSU is deleted from that stratum and the remaining PSUs are upweighted.
Adjustment factor: When deleting one PSU from a stratum with n PSUs: - Deleted PSU: weight = 0 - Remaining PSUs: weight × (n / (n-1))
For pairs (n=2): factor = 2.0 For triplets (n=3): factor = 1.5
Notice that: - Replicate 1 deletes a PSU from stratum S1, other strata unchanged - Replicate 2 deletes a PSU from stratum S2, other strata unchanged - And so on…
Reproducible JK2 with rstate
For JK2, the rstate parameter controls which PSU is selected for deletion in each stratum. This is particularly useful when strata have more than 2 PSUs.
# Create JK2 with different seedsjk2_a = sample.weighting.create_jk_wgts(paired=True, rep_prefix="jk2a", rstate=0)jk2_b = sample.weighting.create_jk_wgts(paired=True, rep_prefix="jk2b", rstate=0)# Same seed produces identical resultsarr_a = jk2_a.data["jk2a1"].to_numpy()arr_b = jk2_b.data["jk2b1"].to_numpy()print(f"Same seed produces identical weights: {np.allclose(arr_a, arr_b)}")
Same seed produces identical weights: True
Jackknife Parameters
Parameter
Type
Default
Description
paired
bool
False
If False, use JKn; if True, use JK2
rep_prefix
str | None
"svy_jkn_wgt" or "svy_jk2_wgt"
Prefix for replicate weight column names
rstate
int | None
None
Random seed for PSU selection (JK2 only)
drop_nulls
bool
False
Drop rows with missing values in design columns
Bootstrap
Bootstrap replication draws PSUs with replacement within each stratum. The Rao-Wu rescaled bootstrap adjusts weights to maintain unbiasedness under the sampling design.
Requirements:
At least 2 PSUs per stratum
Stratum is optional
bs_sample = sample.weighting.create_bs_wgts( n_reps=200, rep_prefix="bs_wgt", rstate=rng, # For reproducibility)print(bs_sample)
╭─────────────────────────── Sample ────────────────────────────╮│Survey Data:││ Number of rows: 40 ││ Number of columns: 250 ││ Number of strata: 4 ││ Number of PSUs: 8 ││││Survey Design:││││Field Value ││ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ││ Row index svy_row_index ││ Stratum stratum ││ PSU psu ││ SSU None ││ Weight base_wgt ││ With replacement False ││ Prob None ││ Hit None ││ MOS None ││ Population size None ││ Replicate weights RepWeights(method=Bootstrap, ││ prefix='bs_wgt', n_reps=200, ││ df=199.0) │││╰───────────────────────────────────────────────────────────────╯
For simple statistics (means, totals), 200–500 replicates usually suffice. For percentiles or other non-linear statistics, consider 1,000+ replicates. More replicates reduce Monte Carlo error but increase computation time and file size.
Successive Difference Replication (SDR)
SDR is designed for systematic samples where units are ordered (e.g., by geography or time). It creates replicates based on successive differences between adjacent units, which better captures the correlation structure in ordered samples.
Requirements:
Units should be meaningfully ordered within strata
Works best with systematic or implicitly stratified samples
# Add an order column to simulate systematic samplingdf_ordered = df.with_columns( pl.arange(0, df.height).alias("sort_order"))ordered_sample = svy.Sample( data=df_ordered, design=svy.Design(stratum="stratum", psu="psu", wgt="base_wgt"),)sdr_sample = ordered_sample.weighting.create_sdr_wgts( n_reps=4, rep_prefix="sdr_wgt", order_col="sort_order",)print(sdr_sample)
╭─────────────────────────── Sample ────────────────────────────╮│Survey Data:││ Number of rows: 40 ││ Number of columns: 15 ││ Number of strata: 4 ││ Number of PSUs: 8 ││││Survey Design:││││Field Value ││ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ││ Row index svy_row_index ││ Stratum stratum ││ PSU psu ││ SSU None ││ Weight base_wgt ││ With replacement False ││ Prob None ││ Hit None ││ MOS None ││ Population size None ││ Replicate weights RepWeights(method=SDR, ││ prefix='sdr_wgt', n_reps=4, df=4.0) │││╰───────────────────────────────────────────────────────────────╯
Many survey designs have more than 2 PSUs per stratum, which makes them incompatible with BRR (requires exactly 2) or produces many replicates with JK2. The create_variance_strata() method helps by creating new variance strata with paired PSUs.
The Problem
# Create a sample with 4 PSUs per stratumrows_multi = []for s inrange(1, 4): # 3 stratafor p inrange(1, 5): # 4 PSUs eachfor i inrange(3): # 3 units per PSU rows_multi.append({"stratum": f"S{s}","psu": f"S{s}_P{p}","wgt": 1.0,"y": rng.normal(10, 2), })df_multi = pl.DataFrame(rows_multi)multi_sample = svy.Sample( data=df_multi, design=svy.Design(stratum="stratum", psu="psu", wgt="wgt"),)print(f"Original design: 3 strata × 4 PSUs = 12 PSUs total")
Original design: 3 strata × 4 PSUs = 12 PSUs total
# BRR will fail - requires exactly 2 PSUs per stratumtry: multi_sample.weighting.create_brr_wgts()exceptValueErroras e:print(f"BRR Error: {e}")
BRR Error: BRR requires 2 PSUs per stratum, stratum 2 has 4
Solution: Create Variance Strata
The create_variance_strata() method pairs PSUs within each original stratum to create new variance strata suitable for BRR or JK2:
# Create variance strata for BRR (pairs PSUs: 1-2, 3-4)brr_ready = multi_sample.weighting.create_variance_strata( method="brr", into="var_stratum", # Name of new stratum column)print(brr_ready)
╭─────────────────────────── Sample ────────────────────────────╮│Survey Data:││ Number of rows: 36 ││ Number of columns: 8 ││ Number of strata: 6 ││ Number of PSUs: 12 ││││Survey Design:││││Field Value ││ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ││ Row index svy_row_index ││ Stratum var_stratum ││ PSU psu ││ SSU None ││ Weight wgt ││ With replacement False ││ Prob None ││ Hit None ││ MOS None ││ Population size None ││ Replicate weights None │││╰───────────────────────────────────────────────────────────────╯
# Now BRR works!brr_ready = brr_ready.weighting.create_brr_wgts(rep_prefix="brr")print(f"Created {brr_ready.design.rep_wgts.n_reps} BRR replicates")
# JK2 handles odd counts by creating tripletsjk2_ready = odd_sample.weighting.create_variance_strata( method="jk2", into="var_stratum",)# Now JK2 works with tripletsjk2_ready = jk2_ready.weighting.create_jk_wgts(paired=True, rep_prefix="jk2")print(f"Created {jk2_ready.design.rep_wgts.n_reps} JK2 replicates")
Created 3 JK2 replicates
Variance Strata Parameters
Parameter
Type
Default
Description
method
"brr" | "jk2"
Required
Target replication method
order_by
str | Sequence[str] | None
None
Column(s) to sort PSUs before pairing
shuffle
bool
False
Randomly shuffle PSUs before pairing
into
str
"svy_var_stratum"
Name for new variance stratum column
rstate
int | None
None
Random seed for shuffle
Ordering PSUs Before Pairing
You can control how PSUs are paired by sorting them first:
# Add a size variable for orderingdf_multi_size = df_multi.with_columns( pl.lit(rng.uniform(100, 1000, df_multi.height)).alias("pop_size"))sized_sample = svy.Sample( data=df_multi_size, design=svy.Design(stratum="stratum", psu="psu", wgt="wgt"),)# Pair similar-sized PSUs togetherordered_strata = sized_sample.weighting.create_variance_strata( method="brr", order_by="pop_size", # Sort by population size before pairing into="var_stratum",)print("PSUs paired by population size within each stratum")
PSUs paired by population size within each stratum
Random Pairing
For reproducible random pairing:
random_strata = multi_sample.weighting.create_variance_strata( method="brr", shuffle=True, rstate=12345, into="var_stratum",)print("PSUs randomly paired within each stratum")
PSUs randomly paired within each stratum
Adjusting Replicate Weights
When you apply weight adjustments (nonresponse, poststratification, calibration, raking), you must also apply those same adjustments to each replicate weight. This ensures that variance estimates properly reflect the additional uncertainty introduced by the adjustment process.
The svy library handles this automatically. All adjustment methods include parameters for replicate weight handling:
Parameter
Type
Default
Description
rep_wgts_prefix
str | None
Auto-generated
Prefix for adjusted replicate weight columns
ignore_reps
bool
False
If True, skip replicate weight adjustment
update_design_wgts
bool
True
Update the design to reference new weights
Example: Full Adjustment Pipeline
Let’s walk through a complete pipeline: create replicate weights, then apply nonresponse adjustment and raking to both the main weight and all replicates.
╭─────────────────────────── Sample ────────────────────────────╮│Survey Data:││ Number of rows: 40 ││ Number of columns: 60 ││ Number of strata: 4 ││ Number of PSUs: 8 ││││Survey Design:││││Field Value ││ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ││ Row index svy_row_index ││ Stratum stratum ││ PSU psu ││ SSU None ││ Weight base_wgt ││ With replacement False ││ Prob None ││ Hit None ││ MOS None ││ Population size None ││ Replicate weights RepWeights(method=Bootstrap, ││ prefix='rep_wgt', n_reps=50, df=49.0) │││╰───────────────────────────────────────────────────────────────╯
Step 2: Nonresponse Adjustment
Apply nonresponse adjustment to the main weight and all 50 replicates:
╭─────────────────────────── Sample ────────────────────────────╮│Survey Data:││ Number of rows: 37 ││ Number of columns: 162 ││ Number of strata: 4 ││ Number of PSUs: 8 ││││Survey Design:││││Field Value ││ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ││ Row index svy_row_index ││ Stratum stratum ││ PSU psu ││ SSU None ││ Weight final_wgt ││ With replacement False ││ Prob None ││ Hit None ││ MOS None ││ Population size None ││ Replicate weights RepWeights(method=Bootstrap, ││ prefix='final_rep_wgt', n_reps=50, ││ df=49.0) │││╰───────────────────────────────────────────────────────────────╯
Step 4: Verify the Results
# Check main weightprint("Main weight (final_wgt) vs controls:")main_wgt_sums = pipeline_sample.data.group_by("x_cat").agg( pl.col("final_wgt").sum().alias("weighted_sum")).sort("x_cat")for row in main_wgt_sums.iter_rows(named=True): cat = row["x_cat"] actual = row["weighted_sum"] target = controls["x_cat"][cat] diff = actual - targetprint(f" {cat}: target={target:.2f}, actual={actual:.4f}, diff={diff:.6f}")# Check a few replicate weightsprint("\nReplicate weights vs controls:")for rep_col in ["final_rep_wgt1", "final_rep_wgt2", "final_rep_wgt3"]: rep_sums = pipeline_sample.data.group_by("x_cat").agg( pl.col(rep_col).sum().alias("weighted_sum") ).sort("x_cat") total_diff =sum(abs(row["weighted_sum"] - controls["x_cat"][row["x_cat"]])for row in rep_sums.iter_rows(named=True) )print(f" {rep_col}: max total deviation = {total_diff:.6f}")# Overall totalsprint(f"\nOverall total (should be {sum(controls['x_cat'].values())}):")print(f" Main weight sum: {pipeline_sample.data['final_wgt'].sum():.4f}")
Main weight (final_wgt) vs controls:
A: target=15.00, actual=15.0000, diff=0.000000
B: target=12.00, actual=12.0000, diff=0.000000
C: target=13.00, actual=13.0000, diff=-0.000000
Replicate weights vs controls:
final_rep_wgt1: max total deviation = 0.000000
final_rep_wgt2: max total deviation = 0.000000
final_rep_wgt3: max total deviation = 0.000000
Overall total (should be 40.0):
Main weight sum: 40.0000
Skipping Replicate Adjustment
In some cases, you may want to skip replicate weight adjustment:
Preliminary analysis: Quick exploration before final weights are needed
Memory constraints: Large datasets with many replicates
External replicates: When replicates will be adjusted separately
# Apply adjustment to main weight onlyquick_sample = sample.weighting.create_bs_wgts(n_reps=10, rep_prefix="rep", rstate=rng)quick_sample = quick_sample.weighting.normalize( controls=100, wgt_name="norm_wgt", ignore_reps=True, # Skip replicate adjustment)print(quick_sample)
╭─────────────────────────── Sample ────────────────────────────╮│Survey Data:││ Number of rows: 40 ││ Number of columns: 261 ││ Number of strata: 4 ││ Number of PSUs: 8 ││││Survey Design:││││Field Value ││ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ││ Row index svy_row_index ││ Stratum stratum ││ PSU psu ││ SSU None ││ Weight norm_wgt ││ With replacement False ││ Prob None ││ Hit None ││ MOS None ││ Population size None ││ Replicate weights RepWeights(method=Bootstrap, ││ prefix='rep', n_reps=10, df=9.0) │││╰───────────────────────────────────────────────────────────────╯
WarningImportant
If you skip replicate adjustment (ignore_reps=True), variance estimates computed from those replicates will not account for the additional variability introduced by the weight adjustment. This typically leads to underestimated standard errors.
Design Metadata
When you create replicate weights, svy stores metadata in the Sample.design.rep_wgts object:
bs_sample = sample.weighting.create_bs_wgts(n_reps=100, rep_prefix="bs", rstate=rng)print(f"Method: {bs_sample.design.rep_wgts.method}")print(f"Prefix: {bs_sample.design.rep_wgts.prefix}")print(f"Number of replicates: {bs_sample.design.rep_wgts.n_reps}")print(f"Degrees of freedom: {bs_sample.design.rep_wgts.df}")print(f"Fay coefficient: {bs_sample.design.rep_wgts.fay_coef}")
Method: Bootstrap
Prefix: bs
Number of replicates: 100
Degrees of freedom: 99.0
Fay coefficient: 0.0
This metadata is used by estimation functions to automatically compute replicate-based variance estimates.
Choosing a Replication Method
If your design has…
Recommended method
Exactly 2 PSUs per stratum
BRR (most efficient)
2–3 PSUs per stratum
JK2 (paired jackknife)
Varying PSUs per stratum
JKn or Bootstrap
Many strata, few PSUs each
Fay-BRR (dampened)
Systematic/ordered sample
SDR
Complex non-linear statistics
Bootstrap (≥500 reps)
Need to minimize file size
JK2 (fewest columns)
Need to convert multi-PSU design
create_variance_strata() first
Summary
This tutorial covered replicate weight methods for variance estimation:
Creating Replicate Weights:
Method
Function
Key Feature
BRR
create_brr_wgts()
Balanced half-samples; requires 2 PSUs/stratum
Fay-BRR
create_brr_wgts(fay_coef=...)
Damped BRR for stability
JKn
create_jk_wgts(paired=False)
Delete-one-PSU; # replicates = # PSUs
JK2
create_jk_wgts(paired=True)
Paired deletion; # replicates = # strata
Bootstrap
create_bs_wgts()
Resample with replacement; most flexible
SDR
create_sdr_wgts()
For systematic/ordered samples
Preparing Designs:
Method
Function
Use Case
Variance Strata
create_variance_strata(method="brr")
Convert multi-PSU to paired for BRR
Variance Strata
create_variance_strata(method="jk2")
Convert multi-PSU to 2–3 PSUs for JK2
Adjusting Replicate Weights:
All adjustment methods (adjust_nr, poststratify, rake, calibrate, normalize) automatically propagate to replicates
Use rep_wgts_prefix to control naming of adjusted replicates
Use ignore_reps=True to skip replicate adjustment (not recommended for final analysis)
Next Steps
With your replicate weights created and adjusted, you’re ready to compute estimates with proper variance estimation. Continue to the Estimation tutorial to learn how to use these weights.
Fay, R. E. (1989). Theory and application of replicate weighting for variance calculations. Proceedings of the Survey Research Methods Section, American Statistical Association, 212–217.
Judkins, D. R. (1990). Fay’s method for variance estimation. Journal of Official Statistics, 6(3), 223–239.
Rao, J. N. K., & Wu, C. F. J. (1988). Resampling inference with complex survey data. Journal of the American Statistical Association, 83(401), 231–241.
Rust, K. F., & Rao, J. N. K. (1996). Variance estimation for complex surveys using replication techniques. Statistical Methods in Medical Research, 5(3), 283–310.
Valliant, R., Dever, J. A., & Kreuter, F. (2018). Practical Tools for Designing and Weighting Survey Samples (2nd ed.). Springer.