Replicate Weights for Variance Estimation in Python

BRR, Jackknife, Bootstrap, and SDR methods for complex survey designs

Tutorials

Survey Weighting

Variance Estimation

Python

Learn how to create and adjust replicate weights for survey variance estimation using the svy library. Covers BRR, Jackknife (JKn and JK2), Bootstrap, and SDR methods with examples of propagating weight adjustments to replicates.

Author

Mamadou S. Diallo, Ph.D.

Published

January 18, 2026

Modified

January 26, 2026

Keywords

replicate weights variance, bootstrap weights survey, jackknife weights Python, BRR balanced repeated replication, JK2 paired jackknife, SDR successive difference replication, Fay BRR damped replication, survey variance estimation, replicate weight adjustment, svy replication tutorial

Replicate weights provide a flexible approach to variance estimation for complex survey designs. Rather than relying on analytical formulas (Taylor linearization), replication methods estimate variance by repeatedly perturbing the sample weights and observing the resulting variation in estimates.

This tutorial covers:

When to use replicate weights — comparing replication vs. Taylor linearization
Creating replicate weights — BRR, Jackknife (JKn and JK2), Bootstrap, and SDR methods
Preparing designs for replication — using create_variance_strata() to create paired PSU structures
Adjusting replicate weights — propagating nonresponse, calibration, and other adjustments

This tutorial assumes you’ve already created base weights and applied any necessary adjustments. See the Weight Adjustment tutorial for details on nonresponse adjustment, poststratification, calibration, and raking.

When to Use Replicate Weights

Replicate weights are especially useful when:

Estimating non-linear statistics (medians, percentiles, ratios) where Taylor linearization may be inaccurate
The number of PSUs per stratum is small, making linearization-based variance estimates unstable
Software limitations prevent proper specification of the complex design
Sharing data with secondary analysts who may not have access to design details

Approach	Strengths	Limitations
Taylor linearization	Computationally efficient; no additional columns needed	Requires correct design specification; may be inaccurate for non-linear statistics
Replication	Flexible; works well for non-linear statistics; easy to share	Increases file size; computationally heavier for many replicates

Setting Up the Sample Data

We’ll create a sample dataset that satisfies the requirements for all replication methods. BRR requires exactly 2 PSUs per stratum, which is the most restrictive requirement.

import numpy as np
import polars as pl
import svy

rng = np.random.default_rng(42)

# Create sample with 4 strata, 2 PSUs each, 5 units per PSU
rows = []
y_means = {
    "S1_P1": 10, "S1_P2": 12,
    "S2_P1": 8,  "S2_P2": 9,
    "S3_P1": 15, "S3_P2": 13,
    "S4_P1": 11, "S4_P2": 10,
}

for s in range(1, 5):
    for p in range(1, 3):
        label = f"S{s}_P{p}"
        for i in range(5):
            rows.append({
                "unit_id": f"S{s}P{p}U{i + 1}",
                "stratum": f"S{s}",
                "psu": f"S{s}_P{p}",
                "base_wgt": rng.uniform(1.0, 3.0),
                "y": rng.normal(y_means[label], 2.0),
                "x_cat": rng.choice(["A", "B", "C"]),
                "resp_status": rng.choice(
                    ["respondent", "non-respondent"],
                    p=[0.85, 0.15]
                ),
            })

df = pl.DataFrame(rows)
print(f"Sample size: {df.shape[0]} units")
print(df.head(10))

Sample size: 40 units
shape: (10, 7)
┌─────────┬─────────┬───────┬──────────┬───────────┬───────┬────────────────┐
│ unit_id ┆ stratum ┆ psu   ┆ base_wgt ┆ y         ┆ x_cat ┆ resp_status    │
│ ---     ┆ ---     ┆ ---   ┆ ---      ┆ ---       ┆ ---   ┆ ---            │
│ str     ┆ str     ┆ str   ┆ f64      ┆ f64       ┆ str   ┆ str            │
╞═════════╪═════════╪═══════╪══════════╪═══════════╪═══════╪════════════════╡
│ S1P1U1  ┆ S1      ┆ S1_P1 ┆ 2.547912 ┆ 7.920032  ┆ B     ┆ respondent     │
│ S1P1U2  ┆ S1      ┆ S1_P1 ┆ 1.188355 ┆ 7.395641  ┆ C     ┆ respondent     │
│ S1P1U3  ┆ S1      ┆ S1_P1 ┆ 2.572129 ┆ 9.966398  ┆ C     ┆ respondent     │
│ S1P1U4  ┆ S1      ┆ S1_P1 ┆ 2.85353  ┆ 10.132061 ┆ B     ┆ respondent     │
│ S1P1U5  ┆ S1      ┆ S1_P1 ┆ 1.886828 ┆ 8.281415  ┆ A     ┆ respondent     │
│ S1P2U1  ┆ S1      ┆ S1_P2 ┆ 2.655262 ┆ 11.900148 ┆ B     ┆ respondent     │
│ S1P2U2  ┆ S1      ┆ S1_P2 ┆ 1.709052 ┆ 14.445083 ┆ B     ┆ respondent     │
│ S1P2U3  ┆ S1      ┆ S1_P2 ┆ 1.389277 ┆ 13.064618 ┆ C     ┆ respondent     │
│ S1P2U4  ┆ S1      ┆ S1_P2 ┆ 1.308579 ┆ 12.861642 ┆ C     ┆ non-respondent │
│ S1P2U5  ┆ S1      ┆ S1_P2 ┆ 1.651651 ┆ 10.372455 ┆ C     ┆ respondent     │
└─────────┴─────────┴───────┴──────────┴───────────┴───────┴────────────────┘

# Define the survey design
sample = svy.Sample(
    data=df,
    design=svy.Design(
        stratum="stratum",
        psu="psu",
        wgt="base_wgt"
    ),
)

print(sample)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 10                                       │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                          │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              base_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   None                                   │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Replication Methods

The svy library supports four replication methods:

Method	Function	Requirements	Typical Use Case
Balanced Repeated Replication (BRR)	`create_brr_wgts()`	Exactly 2 PSUs per stratum	Designs with paired PSUs
Jackknife (JKn)	`create_jk_wgts(paired=False)`	≥1 PSU per stratum	General purpose; moderate # of PSUs
Jackknife (JK2)	`create_jk_wgts(paired=True)`	2–3 PSUs per stratum	Paired designs; fewer replicates than JKn
Bootstrap	`create_bs_wgts()`	≥2 PSUs per stratum	Complex designs; non-linear statistics
Successive Difference Replication (SDR)	`create_sdr_wgts()`	Ordered/systematic samples	Systematic samples; time series

Balanced Repeated Replication (BRR)

BRR constructs balanced half-samples using a Hadamard matrix design. In each replicate, one PSU is selected from each stratum, and the weights of selected PSUs are doubled while the other PSU receives zero weight.

Requirements:

Exactly 2 PSUs per stratum (after any stratum collapsing)
Number of replicates defaults to the smallest Hadamard matrix size ≥ number of strata

brr_sample = sample.weighting.create_brr_wgts(
    rep_prefix="brr_wgt",
)

print(brr_sample)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 14                                       │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              base_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=BRR,                 │
│                        prefix='brr_wgt', n_reps=4, df=4.0,    │
│                        fay=0.0)                               │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Examine the replicate weight pattern:

print(
    brr_sample.show_data(
        columns=["stratum", "psu", "base_wgt", "brr_wgt1", "brr_wgt2", "brr_wgt3", "brr_wgt4"],
        n=20,
    )
)

shape: (20, 7)
┌─────────┬───────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ stratum ┆ psu   ┆ base_wgt ┆ brr_wgt1 ┆ brr_wgt2 ┆ brr_wgt3 ┆ brr_wgt4 │
│ ---     ┆ ---   ┆ ---      ┆ ---      ┆ ---      ┆ ---      ┆ ---      │
│ str     ┆ str   ┆ f64      ┆ f64      ┆ f64      ┆ f64      ┆ f64      │
╞═════════╪═══════╪══════════╪══════════╪══════════╪══════════╪══════════╡
│ S1      ┆ S1_P1 ┆ 2.547912 ┆ 5.095824 ┆ 5.095824 ┆ 5.095824 ┆ 5.095824 │
│ S1      ┆ S1_P1 ┆ 1.188355 ┆ 2.376709 ┆ 2.376709 ┆ 2.376709 ┆ 2.376709 │
│ S1      ┆ S1_P1 ┆ 2.572129 ┆ 5.144257 ┆ 5.144257 ┆ 5.144257 ┆ 5.144257 │
│ S1      ┆ S1_P1 ┆ 2.85353  ┆ 5.70706  ┆ 5.70706  ┆ 5.70706  ┆ 5.70706  │
│ S1      ┆ S1_P1 ┆ 1.886828 ┆ 3.773657 ┆ 3.773657 ┆ 3.773657 ┆ 3.773657 │
│ …       ┆ …     ┆ …        ┆ …        ┆ …        ┆ …        ┆ …        │
│ S2      ┆ S2_P2 ┆ 2.329702 ┆ 0.0      ┆ 4.659403 ┆ 0.0      ┆ 4.659403 │
│ S2      ┆ S2_P2 ┆ 1.917832 ┆ 0.0      ┆ 3.835663 ┆ 0.0      ┆ 3.835663 │
│ S2      ┆ S2_P2 ┆ 2.336806 ┆ 0.0      ┆ 4.673612 ┆ 0.0      ┆ 4.673612 │
│ S2      ┆ S2_P2 ┆ 2.529998 ┆ 0.0      ┆ 5.059995 ┆ 0.0      ┆ 5.059995 │
│ S2      ┆ S2_P2 ┆ 1.6079   ┆ 0.0      ┆ 3.2158   ┆ 0.0      ┆ 3.2158   │
└─────────┴───────┴──────────┴──────────┴──────────┴──────────┴──────────┘

Notice how within each stratum, one PSU has doubled weights (2×) and the other has zero weights in each replicate.

BRR Parameters

Parameter	Type	Default	Description
`n_reps`	`int \| None`	Auto	Number of replicates (rounded to valid Hadamard size)
`rep_prefix`	`str \| None`	`"svy_brr_wgt"`	Prefix for replicate weight column names
`fay_coef`	`float`	`0.0`	Fay coefficient ρ ∈ [0, 1) for damped BRR
`rstate`	`int \| None`	`None`	Random seed for PSU ordering within strata
`drop_nulls`	`bool`	`False`	Drop rows with missing values in design columns

Reproducible BRR with `rstate`

The rstate parameter controls how PSUs are ordered within each stratum before applying the Hadamard matrix. This affects which PSU gets the +1 (doubled) vs -1 (zeroed) treatment in each replicate.

# Create reproducible BRR weights
brr_sample1 = sample.weighting.create_brr_wgts(rep_prefix="brr1", rstate=42)
brr_sample2 = sample.weighting.create_brr_wgts(rep_prefix="brr2", rstate=42)

# Same seed produces identical results
arr1 = brr_sample1.data["brr11"].to_numpy()
arr2 = brr_sample2.data["brr21"].to_numpy()
print(f"Same seed produces identical weights: {np.allclose(arr1, arr2)}")

Same seed produces identical weights: False

Fay-BRR (Generalized BRR)

Standard BRR can produce unstable estimates when PSU contributions vary greatly. Fay-BRR dampens the perturbation by using a coefficient ρ ∈ (0, 1):

Selected PSU weight: multiplied by (2 − ρ)
Non-selected PSU weight: multiplied by ρ

Common choices are ρ = 0.3 to 0.5. As ρ → 0, Fay-BRR approaches standard BRR; as ρ → 1, perturbation vanishes.

fay_sample = sample.weighting.create_brr_wgts(
    n_reps=8,
    rep_prefix="fay_wgt",
    fay_coef=0.5,
)

print(
    fay_sample.show_data(
        columns=["stratum", "psu", "base_wgt", "fay_wgt1", "fay_wgt2"],
        n=10,
    )
)

shape: (10, 5)
┌─────────┬───────┬──────────┬──────────┬──────────┐
│ stratum ┆ psu   ┆ base_wgt ┆ fay_wgt1 ┆ fay_wgt2 │
│ ---     ┆ ---   ┆ ---      ┆ ---      ┆ ---      │
│ str     ┆ str   ┆ f64      ┆ f64      ┆ f64      │
╞═════════╪═══════╪══════════╪══════════╪══════════╡
│ S1      ┆ S1_P1 ┆ 2.547912 ┆ 3.821868 ┆ 3.821868 │
│ S1      ┆ S1_P1 ┆ 1.188355 ┆ 1.782532 ┆ 1.782532 │
│ S1      ┆ S1_P1 ┆ 2.572129 ┆ 3.858193 ┆ 3.858193 │
│ S1      ┆ S1_P1 ┆ 2.85353  ┆ 4.280295 ┆ 4.280295 │
│ S1      ┆ S1_P1 ┆ 1.886828 ┆ 2.830243 ┆ 2.830243 │
│ S1      ┆ S1_P2 ┆ 2.655262 ┆ 1.327631 ┆ 1.327631 │
│ S1      ┆ S1_P2 ┆ 1.709052 ┆ 0.854526 ┆ 0.854526 │
│ S1      ┆ S1_P2 ┆ 1.389277 ┆ 0.694639 ┆ 0.694639 │
│ S1      ┆ S1_P2 ┆ 1.308579 ┆ 0.654289 ┆ 0.654289 │
│ S1      ┆ S1_P2 ┆ 1.651651 ┆ 0.825825 ┆ 0.825825 │
└─────────┴───────┴──────────┴──────────┴──────────┘

With fay_coef=0.5, selected PSUs get weight × 1.5 and non-selected get weight × 0.5 (instead of ×2 and ×0).

Jackknife Methods (JKn and JK2)

The svy library supports two jackknife variants controlled by the paired parameter:

Variant	`paired`	# Replicates	Description
JKn (delete-one-PSU)	`False`	Total # of PSUs	Delete one PSU at a time across entire sample
JK2 (paired/stratified)	`True`	# of strata	Delete one PSU per stratum per replicate

JKn: Delete-One-PSU Jackknife

JKn creates replicates by systematically deleting one PSU at a time and re-weighting the remaining PSUs within each stratum. The number of replicates equals the total number of PSUs across all strata.

jkn_sample = sample.weighting.create_jk_wgts(
    paired=False,  # JKn (delete-one-PSU)
    rep_prefix="jkn_wgt",
)

print(jkn_sample)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 38                                       │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              base_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=BRR,                 │
│                        prefix='brr_wgt', n_reps=4, df=4.0,    │
│                        fay=0.0)                               │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

# With 4 strata × 2 PSUs = 8 PSUs, we get 8 replicates
print(
    jkn_sample.show_data(
        columns=["stratum", "psu", "base_wgt", "jkn_wgt1", "jkn_wgt2", "jkn_wgt3", "jkn_wgt4"],
        n=12,
    )
)

shape: (12, 7)
┌─────────┬───────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ stratum ┆ psu   ┆ base_wgt ┆ jkn_wgt1 ┆ jkn_wgt2 ┆ jkn_wgt3 ┆ jkn_wgt4 │
│ ---     ┆ ---   ┆ ---      ┆ ---      ┆ ---      ┆ ---      ┆ ---      │
│ str     ┆ str   ┆ f64      ┆ f64      ┆ f64      ┆ f64      ┆ f64      │
╞═════════╪═══════╪══════════╪══════════╪══════════╪══════════╪══════════╡
│ S1      ┆ S1_P1 ┆ 2.547912 ┆ 0.0      ┆ 5.095824 ┆ 2.547912 ┆ 2.547912 │
│ S1      ┆ S1_P1 ┆ 1.188355 ┆ 0.0      ┆ 2.376709 ┆ 1.188355 ┆ 1.188355 │
│ S1      ┆ S1_P1 ┆ 2.572129 ┆ 0.0      ┆ 5.144257 ┆ 2.572129 ┆ 2.572129 │
│ S1      ┆ S1_P1 ┆ 2.85353  ┆ 0.0      ┆ 5.70706  ┆ 2.85353  ┆ 2.85353  │
│ S1      ┆ S1_P1 ┆ 1.886828 ┆ 0.0      ┆ 3.773657 ┆ 1.886828 ┆ 1.886828 │
│ …       ┆ …     ┆ …        ┆ …        ┆ …        ┆ …        ┆ …        │
│ S1      ┆ S1_P2 ┆ 1.389277 ┆ 2.778555 ┆ 0.0      ┆ 1.389277 ┆ 1.389277 │
│ S1      ┆ S1_P2 ┆ 1.308579 ┆ 2.617158 ┆ 0.0      ┆ 1.308579 ┆ 1.308579 │
│ S1      ┆ S1_P2 ┆ 1.651651 ┆ 3.303301 ┆ 0.0      ┆ 1.651651 ┆ 1.651651 │
│ S2      ┆ S2_P1 ┆ 1.378943 ┆ 1.378943 ┆ 1.378943 ┆ 0.0      ┆ 2.757885 │
│ S2      ┆ S2_P1 ┆ 2.339628 ┆ 2.339628 ┆ 2.339628 ┆ 0.0      ┆ 4.679256 │
└─────────┴───────┴──────────┴──────────┴──────────┴──────────┴──────────┘

JK2: Paired/Stratified Jackknife

JK2 is designed for paired PSU designs (2–3 PSUs per stratum). It creates one replicate per stratum, where in each replicate one PSU is deleted from that stratum and the remaining PSUs are upweighted.

Adjustment factor: When deleting one PSU from a stratum with n PSUs: - Deleted PSU: weight = 0 - Remaining PSUs: weight × (n / (n-1))

For pairs (n=2): factor = 2.0 For triplets (n=3): factor = 1.5

jk2_sample = sample.weighting.create_jk_wgts(
    paired=True,  # JK2 (paired/stratified)
    rep_prefix="jk2_wgt",
)

print(jk2_sample)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 42                                       │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              base_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=BRR,                 │
│                        prefix='brr_wgt', n_reps=4, df=4.0,    │
│                        fay=0.0)                               │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

# With 4 strata, we get 4 replicates (one per stratum)
print(
    jk2_sample.show_data(
        columns=["stratum", "psu", "base_wgt", "jk2_wgt1", "jk2_wgt2", "jk2_wgt3", "jk2_wgt4"],
        n=20,
    )
)

shape: (20, 7)
┌─────────┬───────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ stratum ┆ psu   ┆ base_wgt ┆ jk2_wgt1 ┆ jk2_wgt2 ┆ jk2_wgt3 ┆ jk2_wgt4 │
│ ---     ┆ ---   ┆ ---      ┆ ---      ┆ ---      ┆ ---      ┆ ---      │
│ str     ┆ str   ┆ f64      ┆ f64      ┆ f64      ┆ f64      ┆ f64      │
╞═════════╪═══════╪══════════╪══════════╪══════════╪══════════╪══════════╡
│ S1      ┆ S1_P1 ┆ 2.547912 ┆ 0.0      ┆ 2.547912 ┆ 2.547912 ┆ 2.547912 │
│ S1      ┆ S1_P1 ┆ 1.188355 ┆ 0.0      ┆ 1.188355 ┆ 1.188355 ┆ 1.188355 │
│ S1      ┆ S1_P1 ┆ 2.572129 ┆ 0.0      ┆ 2.572129 ┆ 2.572129 ┆ 2.572129 │
│ S1      ┆ S1_P1 ┆ 2.85353  ┆ 0.0      ┆ 2.85353  ┆ 2.85353  ┆ 2.85353  │
│ S1      ┆ S1_P1 ┆ 1.886828 ┆ 0.0      ┆ 1.886828 ┆ 1.886828 ┆ 1.886828 │
│ …       ┆ …     ┆ …        ┆ …        ┆ …        ┆ …        ┆ …        │
│ S2      ┆ S2_P2 ┆ 2.329702 ┆ 2.329702 ┆ 4.659403 ┆ 2.329702 ┆ 2.329702 │
│ S2      ┆ S2_P2 ┆ 1.917832 ┆ 1.917832 ┆ 3.835663 ┆ 1.917832 ┆ 1.917832 │
│ S2      ┆ S2_P2 ┆ 2.336806 ┆ 2.336806 ┆ 4.673612 ┆ 2.336806 ┆ 2.336806 │
│ S2      ┆ S2_P2 ┆ 2.529998 ┆ 2.529998 ┆ 5.059995 ┆ 2.529998 ┆ 2.529998 │
│ S2      ┆ S2_P2 ┆ 1.6079   ┆ 1.6079   ┆ 3.2158   ┆ 1.6079   ┆ 1.6079   │
└─────────┴───────┴──────────┴──────────┴──────────┴──────────┴──────────┘

Notice that: - Replicate 1 deletes a PSU from stratum S1, other strata unchanged - Replicate 2 deletes a PSU from stratum S2, other strata unchanged - And so on…

Reproducible JK2 with `rstate`

For JK2, the rstate parameter controls which PSU is selected for deletion in each stratum. This is particularly useful when strata have more than 2 PSUs.

# Create JK2 with different seeds
jk2_a = sample.weighting.create_jk_wgts(paired=True, rep_prefix="jk2a", rstate=0)
jk2_b = sample.weighting.create_jk_wgts(paired=True, rep_prefix="jk2b", rstate=0)

# Same seed produces identical results
arr_a = jk2_a.data["jk2a1"].to_numpy()
arr_b = jk2_b.data["jk2b1"].to_numpy()
print(f"Same seed produces identical weights: {np.allclose(arr_a, arr_b)}")

Same seed produces identical weights: True

Jackknife Parameters

Parameter	Type	Default	Description
`paired`	`bool`	`False`	If `False`, use JKn; if `True`, use JK2
`rep_prefix`	`str \| None`	`"svy_jkn_wgt"` or `"svy_jk2_wgt"`	Prefix for replicate weight column names
`rstate`	`int \| None`	`None`	Random seed for PSU selection (JK2 only)
`drop_nulls`	`bool`	`False`	Drop rows with missing values in design columns

Bootstrap

Bootstrap replication draws PSUs with replacement within each stratum. The Rao-Wu rescaled bootstrap adjusts weights to maintain unbiasedness under the sampling design.

Requirements:

At least 2 PSUs per stratum
Stratum is optional

bs_sample = sample.weighting.create_bs_wgts(
    n_reps=200,
    rep_prefix="bs_wgt",
    rstate=rng,  # For reproducibility
)

print(bs_sample)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 250                                      │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              base_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=Bootstrap,           │
│                        prefix='bs_wgt', n_reps=200,           │
│                        df=199.0)                              │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

# Bootstrap weights vary randomly across replicates
print(
    bs_sample.show_data(
        columns=["stratum", "psu", "base_wgt", "bs_wgt1", "bs_wgt2", "bs_wgt3"],
        n=10,
    )
)

shape: (10, 6)
┌─────────┬───────┬──────────┬──────────┬──────────┬──────────┐
│ stratum ┆ psu   ┆ base_wgt ┆ bs_wgt1  ┆ bs_wgt2  ┆ bs_wgt3  │
│ ---     ┆ ---   ┆ ---      ┆ ---      ┆ ---      ┆ ---      │
│ str     ┆ str   ┆ f64      ┆ f64      ┆ f64      ┆ f64      │
╞═════════╪═══════╪══════════╪══════════╪══════════╪══════════╡
│ S1      ┆ S1_P1 ┆ 2.547912 ┆ 0.0      ┆ 5.095824 ┆ 2.547912 │
│ S1      ┆ S1_P1 ┆ 1.188355 ┆ 0.0      ┆ 2.376709 ┆ 1.188355 │
│ S1      ┆ S1_P1 ┆ 2.572129 ┆ 0.0      ┆ 5.144257 ┆ 2.572129 │
│ S1      ┆ S1_P1 ┆ 2.85353  ┆ 0.0      ┆ 5.70706  ┆ 2.85353  │
│ S1      ┆ S1_P1 ┆ 1.886828 ┆ 0.0      ┆ 3.773657 ┆ 1.886828 │
│ S1      ┆ S1_P2 ┆ 2.655262 ┆ 5.310525 ┆ 0.0      ┆ 2.655262 │
│ S1      ┆ S1_P2 ┆ 1.709052 ┆ 3.418104 ┆ 0.0      ┆ 1.709052 │
│ S1      ┆ S1_P2 ┆ 1.389277 ┆ 2.778555 ┆ 0.0      ┆ 1.389277 │
│ S1      ┆ S1_P2 ┆ 1.308579 ┆ 2.617158 ┆ 0.0      ┆ 1.308579 │
│ S1      ┆ S1_P2 ┆ 1.651651 ┆ 3.303301 ┆ 0.0      ┆ 1.651651 │
└─────────┴───────┴──────────┴──────────┴──────────┴──────────┘

Bootstrap Parameters

Parameter	Type	Default	Description
`n_reps`	`int`	`500`	Number of bootstrap replicates
`rep_prefix`	`str \| None`	`"svy_boot_wgt"`	Prefix for replicate weight column names
`drop_nulls`	`bool`	`False`	Drop rows with missing values in design columns
`rstate`	`RandomState`	`None`	Random state for reproducibility

Choosing the Number of Bootstrap Replicates

For simple statistics (means, totals), 200–500 replicates usually suffice. For percentiles or other non-linear statistics, consider 1,000+ replicates. More replicates reduce Monte Carlo error but increase computation time and file size.

Successive Difference Replication (SDR)

SDR is designed for systematic samples where units are ordered (e.g., by geography or time). It creates replicates based on successive differences between adjacent units, which better captures the correlation structure in ordered samples.

Requirements:

Units should be meaningfully ordered within strata
Works best with systematic or implicitly stratified samples

# Add an order column to simulate systematic sampling
df_ordered = df.with_columns(
    pl.arange(0, df.height).alias("sort_order")
)

ordered_sample = svy.Sample(
    data=df_ordered,
    design=svy.Design(stratum="stratum", psu="psu", wgt="base_wgt"),
)

sdr_sample = ordered_sample.weighting.create_sdr_wgts(
    n_reps=4,
    rep_prefix="sdr_wgt",
    order_col="sort_order",
)

print(sdr_sample)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 15                                       │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              base_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=SDR,                 │
│                        prefix='sdr_wgt', n_reps=4, df=4.0)    │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

print(
    sdr_sample.show_data(
        columns=["stratum", "sort_order", "base_wgt", "sdr_wgt1", "sdr_wgt2", "sdr_wgt3", "sdr_wgt4"],
        n=12,
    )
)

shape: (12, 7)
┌─────────┬────────────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ stratum ┆ sort_order ┆ base_wgt ┆ sdr_wgt1 ┆ sdr_wgt2 ┆ sdr_wgt3 ┆ sdr_wgt4 │
│ ---     ┆ ---        ┆ ---      ┆ ---      ┆ ---      ┆ ---      ┆ ---      │
│ str     ┆ i64        ┆ f64      ┆ f64      ┆ f64      ┆ f64      ┆ f64      │
╞═════════╪════════════╪══════════╪══════════╪══════════╪══════════╪══════════╡
│ S1      ┆ 0          ┆ 2.547912 ┆ 4.349558 ┆ 4.349558 ┆ 4.349558 ┆ 4.349558 │
│ S1      ┆ 1          ┆ 1.188355 ┆ 0.594177 ┆ 0.101945 ┆ 0.594177 ┆ 0.101945 │
│ S1      ┆ 2          ┆ 2.572129 ┆ 1.286064 ┆ 7.495732 ┆ 0.220654 ┆ 1.286064 │
│ S1      ┆ 3          ┆ 2.85353  ┆ 1.426765 ┆ 0.244794 ┆ 1.426765 ┆ 8.315796 │
│ S1      ┆ 4          ┆ 1.886828 ┆ 0.943414 ┆ 5.498621 ┆ 5.498621 ┆ 0.943414 │
│ …       ┆ …          ┆ …        ┆ …        ┆ …        ┆ …        ┆ …        │
│ S1      ┆ 7          ┆ 1.389277 ┆ 0.694639 ┆ 0.119181 ┆ 0.694639 ┆ 4.048651 │
│ S1      ┆ 8          ┆ 1.308579 ┆ 0.654289 ┆ 3.813479 ┆ 3.813479 ┆ 0.654289 │
│ S1      ┆ 9          ┆ 1.651651 ┆ 0.483757 ┆ 0.483757 ┆ 0.483757 ┆ 0.483757 │
│ S2      ┆ 10         ┆ 1.378943 ┆ 2.354002 ┆ 2.354002 ┆ 2.354002 ┆ 2.354002 │
│ S2      ┆ 11         ┆ 2.339628 ┆ 1.169814 ┆ 0.200708 ┆ 1.169814 ┆ 0.200708 │
└─────────┴────────────┴──────────┴──────────┴──────────┴──────────┴──────────┘

SDR Parameters

Parameter	Type	Default	Description
`n_reps`	`int`	`4`	Number of replicates (typically 4+)
`rep_prefix`	`str \| None`	`"svy_sdr_wgt"`	Prefix for replicate weight column names
`order_col`	`str \| None`	`None`	Column specifying sort order within strata
`drop_nulls`	`bool`	`False`	Drop rows with missing values in design columns

Preparing Designs for BRR and JK2

Many survey designs have more than 2 PSUs per stratum, which makes them incompatible with BRR (requires exactly 2) or produces many replicates with JK2. The create_variance_strata() method helps by creating new variance strata with paired PSUs.

The Problem

# Create a sample with 4 PSUs per stratum
rows_multi = []
for s in range(1, 4):  # 3 strata
    for p in range(1, 5):  # 4 PSUs each
        for i in range(3):  # 3 units per PSU
            rows_multi.append({
                "stratum": f"S{s}",
                "psu": f"S{s}_P{p}",
                "wgt": 1.0,
                "y": rng.normal(10, 2),
            })

df_multi = pl.DataFrame(rows_multi)
multi_sample = svy.Sample(
    data=df_multi,
    design=svy.Design(stratum="stratum", psu="psu", wgt="wgt"),
)

print(f"Original design: 3 strata × 4 PSUs = 12 PSUs total")

Original design: 3 strata × 4 PSUs = 12 PSUs total

# BRR will fail - requires exactly 2 PSUs per stratum
try:
    multi_sample.weighting.create_brr_wgts()
except ValueError as e:
    print(f"BRR Error: {e}")

BRR Error: BRR requires 2 PSUs per stratum, stratum 2 has 4

Solution: Create Variance Strata

The create_variance_strata() method pairs PSUs within each original stratum to create new variance strata suitable for BRR or JK2:

# Create variance strata for BRR (pairs PSUs: 1-2, 3-4)
brr_ready = multi_sample.weighting.create_variance_strata(
    method="brr",
    into="var_stratum",  # Name of new stratum column
)

print(brr_ready)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 36                                          │
│   Number of columns: 8                                        │
│   Number of strata: 6                                         │
│   Number of PSUs: 12                                          │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                          │
│    Row index           svy_row_index                          │
│    Stratum             var_stratum                            │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              wgt                                    │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   None                                   │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

# Now BRR works!
brr_ready = brr_ready.weighting.create_brr_wgts(rep_prefix="brr")

print(f"Created {brr_ready.design.rep_wgts.n_reps} BRR replicates")

Created 8 BRR replicates

# Examine the variance strata structure
print(
    brr_ready.show_data(
        columns=["stratum", "psu", "var_stratum", "brr1", "brr2"],
        n=24,
    )
)

shape: (24, 5)
┌─────────┬───────┬─────────────┬──────┬──────┐
│ stratum ┆ psu   ┆ var_stratum ┆ brr1 ┆ brr2 │
│ ---     ┆ ---   ┆ ---         ┆ ---  ┆ ---  │
│ str     ┆ str   ┆ i64         ┆ f64  ┆ f64  │
╞═════════╪═══════╪═════════════╪══════╪══════╡
│ S1      ┆ S1_P1 ┆ 0           ┆ 2.0  ┆ 2.0  │
│ S1      ┆ S1_P1 ┆ 0           ┆ 2.0  ┆ 2.0  │
│ S1      ┆ S1_P1 ┆ 0           ┆ 2.0  ┆ 2.0  │
│ S1      ┆ S1_P2 ┆ 0           ┆ 0.0  ┆ 0.0  │
│ S1      ┆ S1_P2 ┆ 0           ┆ 0.0  ┆ 0.0  │
│ …       ┆ …     ┆ …           ┆ …    ┆ …    │
│ S2      ┆ S2_P3 ┆ 3           ┆ 2.0  ┆ 0.0  │
│ S2      ┆ S2_P3 ┆ 3           ┆ 2.0  ┆ 0.0  │
│ S2      ┆ S2_P4 ┆ 3           ┆ 0.0  ┆ 2.0  │
│ S2      ┆ S2_P4 ┆ 3           ┆ 0.0  ┆ 2.0  │
│ S2      ┆ S2_P4 ┆ 3           ┆ 0.0  ┆ 2.0  │
└─────────┴───────┴─────────────┴──────┴──────┘

Handling Odd PSU Counts with JK2

When a stratum has an odd number of PSUs, BRR cannot be used. JK2, however, allows 2–3 PSUs per variance stratum:

# Create sample with 3 PSUs per stratum (odd count)
rows_odd = []
for s in range(1, 4):  # 3 strata
    for p in range(1, 4):  # 3 PSUs each (odd)
        for i in range(3):
            rows_odd.append({
                "stratum": f"S{s}",
                "psu": f"S{s}_P{p}",
                "wgt": 1.0,
                "y": rng.normal(10, 2),
            })

df_odd = pl.DataFrame(rows_odd)
odd_sample = svy.Sample(
    data=df_odd,
    design=svy.Design(stratum="stratum", psu="psu", wgt="wgt"),
)

print("Original: 3 strata × 3 PSUs = 9 PSUs (odd count per stratum)")

Original: 3 strata × 3 PSUs = 9 PSUs (odd count per stratum)

# BRR fails with odd PSU counts
try:
    odd_sample.weighting.create_variance_strata(method="brr")
except Exception as e:
    print(f"BRR Error: {type(e).__name__}")

BRR Error: DimensionError

# JK2 handles odd counts by creating triplets
jk2_ready = odd_sample.weighting.create_variance_strata(
    method="jk2",
    into="var_stratum",
)

# Now JK2 works with triplets
jk2_ready = jk2_ready.weighting.create_jk_wgts(paired=True, rep_prefix="jk2")

print(f"Created {jk2_ready.design.rep_wgts.n_reps} JK2 replicates")

Created 3 JK2 replicates

Variance Strata Parameters

Parameter	Type	Default	Description
`method`	`"brr" \| "jk2"`	Required	Target replication method
`order_by`	`str \| Sequence[str] \| None`	`None`	Column(s) to sort PSUs before pairing
`shuffle`	`bool`	`False`	Randomly shuffle PSUs before pairing
`into`	`str`	`"svy_var_stratum"`	Name for new variance stratum column
`rstate`	`int \| None`	`None`	Random seed for shuffle

Ordering PSUs Before Pairing

You can control how PSUs are paired by sorting them first:

# Add a size variable for ordering
df_multi_size = df_multi.with_columns(
    pl.lit(rng.uniform(100, 1000, df_multi.height)).alias("pop_size")
)

sized_sample = svy.Sample(
    data=df_multi_size,
    design=svy.Design(stratum="stratum", psu="psu", wgt="wgt"),
)

# Pair similar-sized PSUs together
ordered_strata = sized_sample.weighting.create_variance_strata(
    method="brr",
    order_by="pop_size",  # Sort by population size before pairing
    into="var_stratum",
)

print("PSUs paired by population size within each stratum")

PSUs paired by population size within each stratum

Random Pairing

For reproducible random pairing:

random_strata = multi_sample.weighting.create_variance_strata(
    method="brr",
    shuffle=True,
    rstate=12345,
    into="var_stratum",
)

print("PSUs randomly paired within each stratum")

PSUs randomly paired within each stratum

Adjusting Replicate Weights

When you apply weight adjustments (nonresponse, poststratification, calibration, raking), you must also apply those same adjustments to each replicate weight. This ensures that variance estimates properly reflect the additional uncertainty introduced by the adjustment process.

The svy library handles this automatically. All adjustment methods include parameters for replicate weight handling:

Parameter	Type	Default	Description
`rep_wgts_prefix`	`str \| None`	Auto-generated	Prefix for adjusted replicate weight columns
`ignore_reps`	`bool`	`False`	If `True`, skip replicate weight adjustment
`update_design_wgts`	`bool`	`True`	Update the design to reference new weights

Example: Full Adjustment Pipeline

Let’s walk through a complete pipeline: create replicate weights, then apply nonresponse adjustment and raking to both the main weight and all replicates.

Step 1: Create Replicate Weights

# Start fresh with base sample
pipeline_sample = svy.Sample(
    data=df,
    design=svy.Design(stratum="stratum", psu="psu", wgt="base_wgt"),
)

# Create bootstrap replicates
pipeline_sample = pipeline_sample.weighting.create_bs_wgts(
    n_reps=50,
    rep_prefix="rep_wgt",
    rstate=rng,
)

print(f"After creating replicates:")
print(pipeline_sample)

After creating replicates:

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 60                                       │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              base_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=Bootstrap,           │
│                        prefix='rep_wgt', n_reps=50, df=49.0)  │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Step 2: Nonresponse Adjustment

Apply nonresponse adjustment to the main weight and all 50 replicates:

# Map data labels to response codes:
# 1 = respondent (rr), 2 = non-respondent (nr), 3 = ineligible (in), 4 = unknown (uk)
status_mapping = {
    "rr": "respondent",
    "nr": "non-respondent",
}

pipeline_sample = pipeline_sample.weighting.adjust_nr(
    resp_status="resp_status",
    by="stratum",  # Adjustment cells by stratum
    resp_mapping=status_mapping,
    wgt_name="nr_wgt",
    rep_wgts_prefix="nr_rep_wgt",  # Adjusted replicates get this prefix
    # ignore_reps=False,  # Default: adjust replicates too
)

print(f"After nonresponse adjustment:")
print(pipeline_sample)

After nonresponse adjustment:

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 37                                          │
│   Number of columns: 111                                      │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              nr_wgt                                 │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=Bootstrap,           │
│                        prefix='nr_rep_wgt', n_reps=50,        │
│                        df=49.0)                               │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

The design now references: - Main weight: nr_wgt - Replicate weights: nr_rep_wgt1, nr_rep_wgt2, …, nr_rep_wgt50

Step 3: Raking

Apply raking to match population margins:

# Define marginal controls
controls = {
    "x_cat": {"A": 15.0, "B": 12.0, "C": 13.0},
}

pipeline_sample = pipeline_sample.weighting.rake(
    controls=controls,
    wgt_name="final_wgt",
    rep_wgts_prefix="final_rep_wgt",
)

print(f"After raking:")
print(pipeline_sample)

After raking:

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 37                                          │
│   Number of columns: 162                                      │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              final_wgt                              │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=Bootstrap,           │
│                        prefix='final_rep_wgt', n_reps=50,     │
│                        df=49.0)                               │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Step 4: Verify the Results

# Check main weight
print("Main weight (final_wgt) vs controls:")
main_wgt_sums = pipeline_sample.data.group_by("x_cat").agg(
    pl.col("final_wgt").sum().alias("weighted_sum")
).sort("x_cat")

for row in main_wgt_sums.iter_rows(named=True):
    cat = row["x_cat"]
    actual = row["weighted_sum"]
    target = controls["x_cat"][cat]
    diff = actual - target
    print(f"  {cat}: target={target:.2f}, actual={actual:.4f}, diff={diff:.6f}")

# Check a few replicate weights
print("\nReplicate weights vs controls:")
for rep_col in ["final_rep_wgt1", "final_rep_wgt2", "final_rep_wgt3"]:
    rep_sums = pipeline_sample.data.group_by("x_cat").agg(
        pl.col(rep_col).sum().alias("weighted_sum")
    ).sort("x_cat")

    total_diff = sum(
        abs(row["weighted_sum"] - controls["x_cat"][row["x_cat"]])
        for row in rep_sums.iter_rows(named=True)
    )
    print(f"  {rep_col}: max total deviation = {total_diff:.6f}")

# Overall totals
print(f"\nOverall total (should be {sum(controls['x_cat'].values())}):")
print(f"  Main weight sum: {pipeline_sample.data['final_wgt'].sum():.4f}")

Main weight (final_wgt) vs controls:
  A: target=15.00, actual=15.0000, diff=0.000000
  B: target=12.00, actual=12.0000, diff=0.000000
  C: target=13.00, actual=13.0000, diff=-0.000000

Replicate weights vs controls:
  final_rep_wgt1: max total deviation = 0.000000
  final_rep_wgt2: max total deviation = 0.000000
  final_rep_wgt3: max total deviation = 0.000000

Overall total (should be 40.0):
  Main weight sum: 40.0000

Skipping Replicate Adjustment

In some cases, you may want to skip replicate weight adjustment:

Preliminary analysis: Quick exploration before final weights are needed
Memory constraints: Large datasets with many replicates
External replicates: When replicates will be adjusted separately

# Apply adjustment to main weight only
quick_sample = sample.weighting.create_bs_wgts(n_reps=10, rep_prefix="rep", rstate=rng)

quick_sample = quick_sample.weighting.normalize(
    controls=100,
    wgt_name="norm_wgt",
    ignore_reps=True,  # Skip replicate adjustment
)

print(quick_sample)

╭─────────────────────────── Sample ────────────────────────────╮
│ Survey Data:                                                  │
│   Number of rows: 40                                          │
│   Number of columns: 261                                      │
│   Number of strata: 4                                         │
│   Number of PSUs: 8                                           │
│                                                               │
│ Survey Design:                                                │
│                                                               │
│    Field               Value                                  │
│    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │
│    Row index           svy_row_index                          │
│    Stratum             stratum                                │
│    PSU                 psu                                    │
│    SSU                 None                                   │
│    Weight              norm_wgt                               │
│    With replacement    False                                  │
│    Prob                None                                   │
│    Hit                 None                                   │
│    MOS                 None                                   │
│    Population size     None                                   │
│    Replicate weights   RepWeights(method=Bootstrap,           │
│                        prefix='rep', n_reps=10, df=9.0)       │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Important

If you skip replicate adjustment (ignore_reps=True), variance estimates computed from those replicates will not account for the additional variability introduced by the weight adjustment. This typically leads to underestimated standard errors.

Design Metadata

When you create replicate weights, svy stores metadata in the Sample.design.rep_wgts object:

bs_sample = sample.weighting.create_bs_wgts(n_reps=100, rep_prefix="bs", rstate=rng)

print(f"Method: {bs_sample.design.rep_wgts.method}")
print(f"Prefix: {bs_sample.design.rep_wgts.prefix}")
print(f"Number of replicates: {bs_sample.design.rep_wgts.n_reps}")
print(f"Degrees of freedom: {bs_sample.design.rep_wgts.df}")
print(f"Fay coefficient: {bs_sample.design.rep_wgts.fay_coef}")

Method: Bootstrap
Prefix: bs
Number of replicates: 100
Degrees of freedom: 99.0
Fay coefficient: 0.0

This metadata is used by estimation functions to automatically compute replicate-based variance estimates.

Choosing a Replication Method

If your design has…	Recommended method
Exactly 2 PSUs per stratum	BRR (most efficient)
2–3 PSUs per stratum	JK2 (paired jackknife)
Varying PSUs per stratum	JKn or Bootstrap
Many strata, few PSUs each	Fay-BRR (dampened)
Systematic/ordered sample	SDR
Complex non-linear statistics	Bootstrap (≥500 reps)
Need to minimize file size	JK2 (fewest columns)
Need to convert multi-PSU design	`create_variance_strata()` first

Summary

This tutorial covered replicate weight methods for variance estimation:

Creating Replicate Weights:

Method	Function	Key Feature
BRR	`create_brr_wgts()`	Balanced half-samples; requires 2 PSUs/stratum
Fay-BRR	`create_brr_wgts(fay_coef=...)`	Damped BRR for stability
JKn	`create_jk_wgts(paired=False)`	Delete-one-PSU; # replicates = # PSUs
JK2	`create_jk_wgts(paired=True)`	Paired deletion; # replicates = # strata
Bootstrap	`create_bs_wgts()`	Resample with replacement; most flexible
SDR	`create_sdr_wgts()`	For systematic/ordered samples

Preparing Designs:

Method	Function	Use Case
Variance Strata	`create_variance_strata(method="brr")`	Convert multi-PSU to paired for BRR
Variance Strata	`create_variance_strata(method="jk2")`	Convert multi-PSU to 2–3 PSUs for JK2

Adjusting Replicate Weights:

All adjustment methods (adjust_nr, poststratify, rake, calibrate, normalize) automatically propagate to replicates
Use rep_wgts_prefix to control naming of adjusted replicates
Use ignore_reps=True to skip replicate adjustment (not recommended for final analysis)

Next Steps

With your replicate weights created and adjusted, you’re ready to compute estimates with proper variance estimation. Continue to the Estimation tutorial to learn how to use these weights.

Ready to compute estimates?
Learn estimation methods in Survey Estimation →

References

Fay, R. E. (1989). Theory and application of replicate weighting for variance calculations. Proceedings of the Survey Research Methods Section, American Statistical Association, 212–217.
Judkins, D. R. (1990). Fay’s method for variance estimation. Journal of Official Statistics, 6(3), 223–239.
Rao, J. N. K., & Wu, C. F. J. (1988). Resampling inference with complex survey data. Journal of the American Statistical Association, 83(401), 231–241.
Rust, K. F., & Rao, J. N. K. (1996). Variance estimation for complex surveys using replication techniques. Statistical Methods in Medical Research, 5(3), 283–310.
Valliant, R., Dever, J. A., & Kreuter, F. (2018). Practical Tools for Designing and Weighting Survey Samples (2nd ed.). Springer.

When to Use Replicate Weights

Setting Up the Sample Data

Replication Methods

Balanced Repeated Replication (BRR)

BRR Parameters

Reproducible BRR with rstate

Fay-BRR (Generalized BRR)

Jackknife Methods (JKn and JK2)

JKn: Delete-One-PSU Jackknife

JK2: Paired/Stratified Jackknife

Reproducible JK2 with rstate

Jackknife Parameters

Bootstrap

Bootstrap Parameters

Successive Difference Replication (SDR)

SDR Parameters

Preparing Designs for BRR and JK2

The Problem

Solution: Create Variance Strata

Handling Odd PSU Counts with JK2

Variance Strata Parameters

Ordering PSUs Before Pairing

Random Pairing

Adjusting Replicate Weights

Example: Full Adjustment Pipeline

Step 1: Create Replicate Weights

Step 2: Nonresponse Adjustment

Step 3: Raking

Step 4: Verify the Results

Skipping Replicate Adjustment

Design Metadata

Choosing a Replication Method

Summary

Next Steps

References

Reproducible BRR with `rstate`

Reproducible JK2 with `rstate`