Sample Weighting

Sample weighting is the mechanism that allows analysts to generalize results from the sample to the target population. The design (or base) weights are derived as the inverse of the final probability of selection. In large-scale surveys, these design weights are often further adjusted to correct for nonresponse, extreme values, or to align auxiliary variables with known population controls.

The weighting tutorial is presented in two parts. The first section introduces common adjustment techniques—such as nonresponse adjustment, poststratification, and calibration—that improve representativeness and reduce potential bias. The second section demonstrates how to create and use replicate weights for variance estimation using Bootstrap (BS), Balanced Repeated Replication (BRR), and Jackknife (JK) methods.

For more on sample-weight adjustments, see Valliant and Dever (2018), which provides a step-by-step guide to calculating survey weights.

To run the code in this notebook, we will use the World Bank (2023) synthetic sample data.

import numpy as np

from rich import print as rprint

import svy
from svy import load_dataset

hld_data = load_dataset(name="hld_sample_wb_2023", limit=None)

print(
    f"The number of records in the household sample data is {hld_data.shape[0]}"
)
The number of records in the household sample data is 8000

Weight Adjustment Methods

In practice, base (design) weights derived from the selection probabilities are routinely adjusted to (i) correct for nonresponse and unknown eligibility, (ii) temper the influence of extreme/large weights, and (iii) align the weighted sample with known auxiliary controls. In this tutorial, you will:

  • Use adjust_nr() to adjust weights to account for unit nonresponse and unknown eligibility.
  • Use poststratify() to make weights match known control totals (or to force domain shares to known distributions).
  • Use calibrate() to adjust weights using auxiliary variables under a regression (GREG) framework, satisfying calibration constraints.
  • Use rake() to iteratively adjust weights so that weighted margins match known marginal totals.
  • Use normalize() to rescale weights so they sum to a chosen constant (e.g., the sample size, a population total, or domain totals).

Design (Base) Weight

The design or base weight represents the inverse of the overall probability of selection. This probability is the product of the first-stage and second-stage selection probabilities, as explained in previous tutorials.
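As a quick numeric illustration (plain numpy, with hypothetical stage probabilities), the base weight is the reciprocal of the product of the stage selection probabilities:

```python
import numpy as np

# Hypothetical stage selection probabilities for three sampled households
p_stage1 = np.array([0.02, 0.05, 0.04])   # P(PSU selected)
p_stage2 = np.array([0.10, 0.08, 0.125])  # P(household selected | PSU selected)

# Overall inclusion probability and the resulting design (base) weight
p_overall = p_stage1 * p_stage2
base_wgt = 1.0 / p_overall

print(base_wgt)  # each weight = number of population households represented
```

In hld_data this role is played by the hhweight column.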

Let’s create the sample as follows:

from svy import Design, Sample

# Define the sampling design
hld_design = Design(stratum=["geo1", "urbrur"], psu=["ea"], wgt="hhweight")

# Create the sample
hld_sample = Sample(data=hld_data, design=hld_design)

The dataset includes a household-level base weight variable named hhweight; we will rename it to base_wgt.

# Rename hhweight -> base_wgt
hld_sample = hld_sample.wrangling.rename_columns({"hhweight": "base_wgt"})

print(hld_sample)
╭─────────────────────────── Sample ────────────────────────────╮
 Survey Data:                                                  
   Number of rows: 8000                                        
   Number of columns: 52                                       
   Number of strata: 19                                        
   Number of PSUs: 320                                         
                                                               
 Survey Design:                                                
                                                               
    Field               Value                                  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                         
    Row index           svy_row_index                          
    Stratum             (geo1, urbrur)                         
    PSU                 (ea,)                                  
    SSU                 None                                   
    Weight              base_wgt                               
    With replacement    False                                  
    Prob                None                                   
    Hit                 None                                   
    MOS                 None                                   
    Population size     None                                   
    Replicate weights   None                                   
                                                               
╰───────────────────────────────────────────────────────────────╯

Response Status

The core idea of nonresponse adjustment is to redistribute the survey weights of eligible non-respondents to eligible respondents, and to do so within defined adjustment classes.

In practice, some units have unknown eligibility. How their weights are handled is survey-specific. Common options include:

  • Treat unknowns like eligibles: redistribute their weights to respondents in the same class.
  • Partition unknowns: allocate a fraction to ineligibles (reflecting the belief that some unknowns would have been ineligible) and the remainder to eligibles.
  • Exclude unknowns from redistribution: leave ineligible weights unchanged and redistribute only non-respondent (known eligible) weights.

In svy, we classify records into four response categories (eligible respondent, eligible non-respondent, ineligible, unknown). By default, unknowns are treated as potentially ineligible; therefore their weights are redistributed to the ineligible group as well.
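Before turning to svy, here is a minimal sketch of the arithmetic for a single adjustment class, following the convention just described: the unknowns' weight is first spread over the known-status groups (so ineligibles absorb a share), then the eligible non-respondents' weight is given to respondents. Plain Python with hypothetical weight totals; this is an illustration, not svy's internal algorithm.

```python
# Hypothetical total weights by response status in one adjustment class
w = {"rr": 600.0, "nr": 150.0, "uk": 50.0, "in": 200.0}

# Step 1: spread the unknowns' weight over the known-status groups in
# proportion to their weight (ineligibles absorb part of the unknown mass)
known = w["rr"] + w["nr"] + w["in"]
f1 = (known + w["uk"]) / known
w1 = {k: w[k] * f1 for k in ("rr", "nr", "in")}

# Step 2: redistribute the eligible non-respondents' weight to respondents
f2 = (w1["rr"] + w1["nr"]) / w1["rr"]
w_final = {"rr": w1["rr"] * f2, "in": w1["in"]}

# The class's total weight (respondents + ineligibles) is preserved
print(w_final)
```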

In the World Bank simulated data, the observed response rate is 100%. For this tutorial, we will simulate ineligibility and nonresponse and then compute nonresponse-adjusted weights by weighting class.

import numpy as np
import polars as pl

rng = np.random.default_rng(12345)

RESPONSE_STATUS = rng.choice(
    ("ineligible", "respondent", "non-respondent", "unknown"),
    p=(0.03, 0.82, 0.10, 0.05),
    size=hld_sample.n_records,
)

hld_sample = hld_sample.wrangling.mutate({"resp_status": RESPONSE_STATUS})

# Show 9 eligible non-respondent records in geo_01
print(
    hld_sample.show_records(
        columns=["hid", "geo1", "urbrur", "resp_status"],
        where=[
            svy.col("resp_status") == "non-respondent",
            svy.col("geo1").is_in(["geo_01"]),
        ],
        n=9,
    )
)
shape: (9, 4)
┌─────────────┬────────┬────────┬────────────────┐
│ hid         ┆ geo1   ┆ urbrur ┆ resp_status    │
│ ---         ┆ ---    ┆ ---    ┆ ---            │
│ str         ┆ str    ┆ str    ┆ str            │
╞═════════════╪════════╪════════╪════════════════╡
│ 0424d8c9572 ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 09a4ff6c721 ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 0f8b12b37f6 ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 1bd825df0cc ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 1e2c7908b70 ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 1ffddef4ebe ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 22dfb939642 ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 258b968d304 ┆ geo_01 ┆ Urban  ┆ non-respondent │
│ 2c62bd0966f ┆ geo_01 ┆ Urban  ┆ non-respondent │
└─────────────┴────────┴────────┴────────────────┘

svy expects a single variable containing standardized response statuses:

  Code   Meaning
  rr     Respondent (eligible)
  nr     Non-respondent (eligible)
  uk     Unknown eligibility
  in     Ineligible

If your dataset uses different labels or codes, you must provide a mapping to these canonical values before using weighting or adjustment functions in svy. Hence, we create the mapping as follows:

# Map canonical response status codes to the labels used in resp_status
status_mapping = {
    "in": "ineligible",
    "rr": "respondent",
    "nr": "non-respondent",
    "uk": "unknown",
}

Nonresponse Adjustment

Tip: Implementation in svy

Use Sample.adjust_nr() to adjust the sample weights for nonresponse. The method computes the adjusted weights and stores them in the sample object for downstream estimation.

The method adjust_nr() has a boolean parameter unknown_to_inelig that controls where the unknowns’ weights go:

  • unknown_to_inelig=True (default): Unknowns’ weights are redistributed to ineligibles (as well as being accounted for in the class totals). Because ineligibles absorb some of the unknown mass, the respondents’ adjusted weights are generally smaller than when this flag is false.

  • unknown_to_inelig=False: Unknowns’ weights are not given to ineligibles. Ineligibles’ weights therefore do not change due to unknowns, and—other things equal—respondents’ adjusted weights are larger than in the default setting.

hld_sample = hld_sample.weighting.adjust_nr(
    resp_status="resp_status",
    adj_class=("geo1", "geo2"),
    resp_mapping=status_mapping,
    wgt_name="nr_wgt",
    unknown_to_inelig=True,
)

We now display a small sample of records to confirm the nr_wgt column was created.

# Show a random sample with the key columns
out = hld_sample.show_data(
    columns=["hid", "geo1", "geo2", "base_wgt", "resp_status", "nr_wgt"],
    how="sample",
    n=10,
    rstate=rng,
)

print(out)
shape: (10, 6)
┌─────────────┬────────┬───────────┬────────────┬─────────────┬────────────┐
│ hid         ┆ geo1   ┆ geo2      ┆ base_wgt   ┆ resp_status ┆ nr_wgt     │
│ ---         ┆ ---    ┆ ---       ┆ ---        ┆ ---         ┆ ---        │
│ str         ┆ str    ┆ str       ┆ f64        ┆ str         ┆ f64        │
╞═════════════╪════════╪═══════════╪════════════╪═════════════╪════════════╡
│ 5eb5523fa44 ┆ geo_05 ┆ geo_05_02 ┆ 344.371944 ┆ respondent  ┆ 428.014848 │
│ a0ab8fab866 ┆ geo_09 ┆ geo_09_07 ┆ 332.450508 ┆ respondent  ┆ 377.061843 │
│ 198103a45ac ┆ geo_03 ┆ geo_03_05 ┆ 214.050903 ┆ respondent  ┆ 243.654166 │
│ 80218e46fc6 ┆ geo_02 ┆ geo_02_08 ┆ 354.277313 ┆ respondent  ┆ 415.771682 │
│ 6cab3b5f92a ┆ geo_10 ┆ geo_10_02 ┆ 463.188978 ┆ respondent  ┆ 536.761013 │
│ e620f2ad89f ┆ geo_01 ┆ geo_01_03 ┆ 358.183456 ┆ respondent  ┆ 417.155668 │
│ ebc9a78a43d ┆ geo_01 ┆ geo_01_03 ┆ 264.084412 ┆ respondent  ┆ 307.563925 │
│ 9ef8186c24c ┆ geo_01 ┆ geo_01_01 ┆ 256.192235 ┆ respondent  ┆ 310.089333 │
│ 4e423b25fc8 ┆ geo_07 ┆ geo_07_02 ┆ 519.38285  ┆ respondent  ┆ 638.576398 │
│ d8f163cb9de ┆ geo_04 ┆ geo_04_01 ┆ 371.100796 ┆ respondent  ┆ 437.941559 │
└─────────────┴────────┴───────────┴────────────┴─────────────┴────────────┘
print(hld_sample)
╭─────────────────────────── Sample ────────────────────────────╮
 Survey Data:                                                  
   Number of rows: 6629                                        
   Number of columns: 54                                       
   Number of strata: 19                                        
   Number of PSUs: 320                                         
                                                               
 Survey Design:                                                
                                                               
    Field               Value                                  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                         
    Row index           svy_row_index                          
    Stratum             (geo1, urbrur)                         
    PSU                 (ea,)                                  
    SSU                 None                                   
    Weight              nr_wgt                                 
    With replacement    False                                  
    Prob                None                                   
    Hit                 None                                   
    MOS                 None                                   
    Population size     None                                   
    Replicate weights   None                                   
                                                               
╰───────────────────────────────────────────────────────────────╯
  • If you don’t specify wgt_name=…, svy creates the adjusted weight automatically by prefixing the original weight name with svy_adjusted_. For example, if the original weight column is base_wgt, the new column will be svy_adjusted_base_wgt.
  • If you want to replace the pre-adjusted variable with the adjusted one, set replace=True.
  • svy also updates the sample design internally so that the design’s weight reference points to the adjusted weight.

Poststratification

Poststratification helps compensate for under- or over-representation in the sample and can mitigate certain non-sampling errors. The most common approach adjusts sample weights so that, within poststratification classes, the weighted sums match known control totals from reliable sources.

Poststratification classes need not mirror the sampling design; they can be formed from additional variables. Typical choices in social science research include age group, gender, race/ethnicity, and education—alone or in combination.

Warning

Use current, reliable controls: Poststratifying to out-of-date or unreliable totals may not improve estimates and can introduce bias. Proceed with caution and document sources and reference dates.

Tip: Implementation in svy

Use Sample.poststratify() to poststratify the sample weights. The method computes the adjusted weights and stores them in the sample object for downstream estimation.

Let’s assume that we have a reliable source, e.g. a recent census, that provides the number of households per admin 1 level.

Note that the control totals are provided as a Python dictionary, hld_control_totals. Let’s assume the following values.

hld_control_totals = {
    "geo_01": 342_000,
    "geo_02": 240_000,
    "geo_03": 282_000,
    "geo_04": 370_000,
    "geo_05": 210_000,
    "geo_06": 185_000,
    "geo_07": 183_000,
    "geo_08": 105_000,
    "geo_09": 300_000,
    "geo_10": 290_000,
}

Then poststratify the nonresponse-adjusted weights:

hld_sample = hld_sample.weighting.poststratify(
    controls=hld_control_totals,
    adj_class="geo1",
    wgt_name="ps_wgt",
)

We now display a small sample of records to confirm the ps_wgt column was created.

# Show a random sample with the key columns
out = hld_sample.show_data(
    columns=[
        "hid",
        "geo1",
        "nr_wgt",
        "ps_wgt",
    ],
    how="sample",
    n=10,
    sort_by="geo1",
    rstate=rng,
)

print(out)
shape: (10, 4)
┌─────────────┬────────┬────────────┬────────────┐
│ hid         ┆ geo1   ┆ nr_wgt     ┆ ps_wgt     │
│ ---         ┆ ---    ┆ ---        ┆ ---        │
│ str         ┆ str    ┆ f64        ┆ f64        │
╞═════════════╪════════╪════════════╪════════════╡
│ e620f2ad89f ┆ geo_01 ┆ 417.155668 ┆ 426.809278 │
│ ebc9a78a43d ┆ geo_01 ┆ 307.563925 ┆ 314.681417 │
│ 9ef8186c24c ┆ geo_01 ┆ 310.089333 ┆ 317.265267 │
│ 80218e46fc6 ┆ geo_02 ┆ 415.771682 ┆ 428.223709 │
│ 198103a45ac ┆ geo_03 ┆ 243.654166 ┆ 252.938827 │
│ d8f163cb9de ┆ geo_04 ┆ 437.941559 ┆ 447.723837 │
│ 5eb5523fa44 ┆ geo_05 ┆ 428.014848 ┆ 438.615403 │
│ 4e423b25fc8 ┆ geo_07 ┆ 638.576398 ┆ 665.818285 │
│ a0ab8fab866 ┆ geo_09 ┆ 377.061843 ┆ 400.657013 │
│ 6cab3b5f92a ┆ geo_10 ┆ 536.761013 ┆ 544.50475  │
└─────────────┴────────┴────────────┴────────────┘

And once again, svy updates the sampling design so that the weight reference points to the current “final” sample weight.

print(hld_sample)
╭─────────────────────────── Sample ────────────────────────────╮
 Survey Data:                                                  
   Number of rows: 6629                                        
   Number of columns: 55                                       
   Number of strata: 19                                        
   Number of PSUs: 320                                         
                                                               
 Survey Design:                                                
                                                               
    Field               Value                                  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                         
    Row index           svy_row_index                          
    Stratum             (geo1, urbrur)                         
    PSU                 (ea,)                                  
    SSU                 None                                   
    Weight              ps_wgt                                 
    With replacement    False                                  
    Prob                None                                   
    Hit                 None                                   
    MOS                 None                                   
    Population size     None                                   
    Replicate weights   None                                   
                                                               
╰───────────────────────────────────────────────────────────────╯

Calibration

Calibration, see Deville and Särndal (1992), is a general method used in survey sampling to adjust sample weights so that certain totals in the sample align with known values for the entire population. The idea is to make the sample “look like” the population in terms of key auxiliary variables, while keeping the adjusted weights as close as possible to the original design weights.

The Generalized Regression (GREG) approach is a model-assisted version of calibration; see Särndal, Swensson, and Wretman (1992) for a thorough exposition of model-assisted survey sampling. It assumes that the survey variable of interest—such as income or health status—is related to one or more auxiliary variables (like age, region, or education) through a regression-type relationship. Rather than relying solely on the design weights, GREG calibration adjusts those weights using information from both:

  • the survey data (what we observe in the sample), and
  • external data (known population totals for the auxiliary variables).

In essence, GREG finds a new set of weights that stay as close as possible to the original design weights and, at the same time, make the weighted totals of the auxiliary variables match their known population values. When the auxiliary variables are strongly correlated with the study variable, the resulting GREG estimates tend to be more stable and efficient than simple design-based estimates.
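Under the common chi-square (linear) distance, calibration has a closed form: the adjusted weight is w_i(1 + x_i'lambda), where lambda is chosen so the calibration constraints hold. A minimal numpy sketch with hypothetical auxiliary data and totals (not svy's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

w = np.full(n, 50.0)                                     # design weights
X = np.column_stack([np.ones(n), rng.uniform(1, 5, n)])  # intercept + one aux
totals = np.array([12_000.0, 33_000.0])                  # known pop. totals

# Chi-square distance: lambda solves (X' W X) lambda = totals - X' w,
# and the g-weights are g_i = 1 + x_i' lambda
WX = X * w[:, None]
lam = np.linalg.solve(X.T @ WX, totals - X.T @ w)
g = 1.0 + X @ lam
w_cal = w * g

# The calibrated weights reproduce the control totals exactly
print(X.T @ w_cal)
```

The g-weights stay close to 1 when the controls are close to the weighted sample totals, which keeps the calibrated weights near the design weights.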

Tip: Implementation in svy

Use Sample.calibrate() or Sample.calibrate_matrix() to adjust the sample weights under the GREG approach. The methods compute the adjusted weights and store them in the sample object for downstream estimation.

svy.Table.PRINT_WIDTH = 95

hld_sample.categorical.tabulate(rowvar="statocc", colvar="electricity", units=svy.TableUnits.COUNT)
Table(type=TWO_WAY, rowvar='statocc', colvar='electricity', levels=3x2, n=6, alpha=0.05)
controls = hld_sample.weighting.control_aux_template(
    cat_vars=("statocc", "electricity"), cat_na="level"
)

rprint(controls)
{
    ('Occupied for free', 'No'): nan,
    ('Occupied for free', 'Yes'): nan,
    ('Owned', 'No'): nan,
    ('Owned', 'Yes'): nan,
    ('Rented', 'No'): nan,
    ('Rented', 'Yes'): nan
}
controls.update(
    {
        ("Occupied for free", "No"): 40_000,
        ("Occupied for free", "Yes"): 210_000,
        ("Owned", "No"): 360_000,
        ("Owned", "Yes"): 1_572_000,
        ("Rented", "No"): 25_000,
        ("Rented", "Yes"): 300_000,
    }
)

rprint(controls)
{
    ('Occupied for free', 'No'): 40000,
    ('Occupied for free', 'Yes'): 210000,
    ('Owned', 'No'): 360000,
    ('Owned', 'Yes'): 1572000,
    ('Rented', 'No'): 25000,
    ('Rented', 'Yes'): 300000
}

If you want to enforce control totals by domain, you can do so using the by parameter of calibrate().

Raking

Raking, also known as iterative proportional fitting (IPF), is a method used to adjust survey weights so that the weighted sample distributions match known population margins for several categorical variables.

Unlike calibration, which can align with multiple totals simultaneously through a regression framework, raking updates the weights iteratively—adjusting one margin at a time until all specified margins agree (within tolerance) with the population controls.

Raking is especially useful when only marginal totals are available (for example, totals by age group and totals by gender, but not their cross-tabulation). It preserves the relative structure of the sample while bringing the weighted distributions into alignment with known population information.
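A minimal numpy sketch of the iterative fitting on a weighted two-way table (hypothetical margins; not svy's implementation):

```python
import numpy as np

# Weighted sample counts in the cells of two categorical variables
cell = np.array([[300.0, 500.0],
                 [200.0, 400.0]])
row_controls = np.array([900.0, 500.0])  # known margin, variable A
col_controls = np.array([600.0, 800.0])  # known margin, variable B

# Iterative proportional fitting: scale rows, then columns, until both
# weighted margins agree with the controls
for _ in range(100):
    cell *= (row_controls / cell.sum(axis=1))[:, None]
    cell *= col_controls / cell.sum(axis=0)
    if np.allclose(cell.sum(axis=1), row_controls) and np.allclose(
        cell.sum(axis=0), col_controls
    ):
        break

print(cell.sum(axis=1), cell.sum(axis=0))
```

Note that the two sets of marginal controls must share the same grand total (here 1,400) for the iteration to converge.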

Tip: Implementation in svy

Use Sample.rake() to rake the sample weights. The method computes the adjusted weights and stores them in the sample object for downstream estimation.

hld_sample = hld_sample.wrangling.categorize(
    "hhsize",
    bins=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30],
    into="hhsize_cat",
    labels=("1", "2", "3", "4", "5", "6", "7", "8", "9", "10+"),
)
print(
    hld_sample.categorical.tabulate(
        rowvar="hhsize_cat",
        drop_nulls=True,
        units=svy.TableUnits.COUNT,
    )
)    
╭─────────────────────────────────────────── Table ───────────────────────────────────────────╮
 Type=One-Way                                                                                
 Alpha=0.05                                                                                  
                                                                                             
 Row      Estimate      Std Err       CV         Lower         Upper                         
 ───────────────────────────────────────────────────────────────────                         
 1     223585.8102   26841.7337   0.1201   170764.5924   276407.0281                         
 2     370807.5011   34866.4367   0.0940   302194.6587   439420.3436                         
 3     510909.8103   27845.6883   0.0545   456112.9337   565706.6869                         
 4     540769.5741   28386.4805   0.0525   484908.4853   596630.6629                         
 5     369959.4288   22050.2962   0.0596   326567.1684   413351.6891                         
 6     194964.9906   14982.3973   0.0768   165481.4827   224448.4986                         
 7     125078.6089   13765.1648   0.1101    97990.4643   152166.7536                         
 8      87823.8785   12582.3734   0.1433    63063.3212   112584.4359                         
 9      34838.6570    5924.7016   0.1701    23179.5758    46497.7382                         
 10+    48261.7404   12211.3078   0.2530    24231.3943    72292.0865                         
╰─────────────────────────────────────────────────────────────────────────────────────────────╯
hld_control_totals = {
    "statocc": {
        "Occupied for free": 250_000,
        "Owned": 1_932_000,
        "Rented": 325_000,
    },
    "electricity": {"No": 425_000, "Yes": 2_082_000},
}

rprint(hld_control_totals)    
{
    'statocc': {'Occupied for free': 250000, 'Owned': 1932000, 'Rented': 325000},
    'electricity': {'No': 425000, 'Yes': 2082000}
}
hld_sample = hld_sample.weighting.rake(
    controls=hld_control_totals, wgt_name="rake_wgt"
)
# Show a random sample with the key columns
out = hld_sample.show_data(
    columns=[
        "hid",
        "statocc",
        "electricity",
        "ps_wgt",
        "rake_wgt",
    ],
    how="sample",
    n=10,
    sort_by=("statocc", "electricity"),
    rstate=rng,
)

print(out)
shape: (10, 5)
┌─────────────┬───────────────────┬─────────────┬────────────┬────────────┐
│ hid         ┆ statocc           ┆ electricity ┆ ps_wgt     ┆ rake_wgt   │
│ ---         ┆ ---               ┆ ---         ┆ ---        ┆ ---        │
│ str         ┆ str               ┆ str         ┆ f64        ┆ f64        │
╞═════════════╪═══════════════════╪═════════════╪════════════╪════════════╡
│ d8f163cb9de ┆ Occupied for free ┆ No          ┆ 447.723837 ┆ 438.174327 │
│ a0ab8fab866 ┆ Occupied for free ┆ Yes         ┆ 400.657013 ┆ 392.944597 │
│ 6cab3b5f92a ┆ Occupied for free ┆ Yes         ┆ 544.50475  ┆ 534.023348 │
│ e620f2ad89f ┆ Occupied for free ┆ Yes         ┆ 426.809278 ┆ 418.593446 │
│ 5eb5523fa44 ┆ Owned             ┆ Yes         ┆ 438.615403 ┆ 436.724988 │
│ 198103a45ac ┆ Owned             ┆ Yes         ┆ 252.938827 ┆ 251.848671 │
│ 80218e46fc6 ┆ Owned             ┆ Yes         ┆ 428.223709 ┆ 426.378081 │
│ ebc9a78a43d ┆ Owned             ┆ Yes         ┆ 314.681417 ┆ 313.325153 │
│ 9ef8186c24c ┆ Owned             ┆ Yes         ┆ 317.265267 ┆ 315.897866 │
│ 4e423b25fc8 ┆ Owned             ┆ Yes         ┆ 665.818285 ┆ 662.948634 │
└─────────────┴───────────────────┴─────────────┴────────────┴────────────┘

Normalization

Surveys sometimes normalize weights to a convenient constant (e.g., the sample size n or 1,000) so results are easier to compare across analyses. Normalization multiplies every weight by the same factor. It does not change weighted means, proportions, or regression coefficients (the factor cancels), but it does change level estimates such as totals—and their standard errors—by the same factor.
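The adjustment itself is a single scalar factor applied to every weight; a minimal sketch with hypothetical weights:

```python
import numpy as np

w = np.array([410.5, 388.2, 512.9, 275.3])  # hypothetical final weights
target = 1_000.0

# Multiply every weight by the same factor so they sum to the target
norm_w = w * (target / w.sum())
print(norm_w.sum())  # ~1000.0, up to floating point
```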

Tip: Implementation in svy

Use Sample.normalize() to scale the sample weights. The method computes the adjusted weights and stores them in the sample object for downstream estimation.

hld_sample = hld_sample.weighting.normalize(
    controls=1_000, wgt_name="norm_wgt"
)


print(hld_sample.data["norm_wgt"].sum())
999.9999999999999

Replicate Weights

This section introduces three common replication methods for estimating sampling variance: Bootstrap (BS), Balanced Repeated Replication (BRR), and Jackknife (JK).

Replicate weights are constructed primarily for variance (uncertainty) estimation. They are especially useful when:

  • Estimating non-linear parameters, for which Taylor linearization may be inaccurate.
  • The number of PSUs per stratum is small (low degrees of freedom), making linearization unstable.

In this tutorial, we demonstrate how to create replicate weights using the ReplicateWeight class. Three replication methods are available:

  • Balanced Repeated Replication (BRR), including Fay-BRR
  • Bootstrap (BS)
  • Jackknife (JK)

BRR assumes exactly two PSUs per stratum after any collapsing. The World Bank synthetic dataset includes strata with more than two PSUs; to use BRR you would pair PSUs within each stratum to create 2-PSU pseudo-strata (e.g., sort by size/geography and pair 1–2, 3–4, …). To keep this tutorial focused on demonstrating svy syntax rather than data engineering, we’ll instead construct a small BRR-compatible dummy sample. (If you need to retain all PSUs without pairing, consider jackknife or bootstrap replicates, which handle ≥2 PSUs per stratum.)

Let’s consider the following dataset.

rows = []
y_means = {
    "S1_P1": 10,
    "S1_P2": 12,
    "S2_P1": 8,
    "S2_P2": 9,
    "S3_P1": 15,
    "S3_P2": 13,
    "S4_P1": 11,
    "S4_P2": 10,
}

for s in range(1, 5):  # S1..S4
    for p in range(1, 3):  # P1..P2  (2 PSUs per stratum)
        label = f"S{s}_P{p}"
        for i in range(3):  # 3 units per PSU
            rows.append(
                {
                    "unit_id": f"S{s}P{p}U{i + 1}",
                    "stratum": f"S{s}",
                    "cluster": f"P{p}",
                    "weight": 1.0,  # base weight
                    "y": rng.normal(y_means[label], 1.0),  # outcome
                }
            )

df_rep = pl.DataFrame(rows)

print(df_rep)
shape: (24, 5)
┌─────────┬─────────┬─────────┬────────┬───────────┐
│ unit_id ┆ stratum ┆ cluster ┆ weight ┆ y         │
│ ---     ┆ ---     ┆ ---     ┆ ---    ┆ ---       │
│ str     ┆ str     ┆ str     ┆ f64    ┆ f64       │
╞═════════╪═════════╪═════════╪════════╪═══════════╡
│ S1P1U1  ┆ S1      ┆ P1      ┆ 1.0    ┆ 7.780013  │
│ S1P1U2  ┆ S1      ┆ P1      ┆ 1.0    ┆ 9.211001  │
│ S1P1U3  ┆ S1      ┆ P1      ┆ 1.0    ┆ 10.354935 │
│ S1P2U1  ┆ S1      ┆ P2      ┆ 1.0    ┆ 11.277252 │
│ S1P2U2  ┆ S1      ┆ P2      ┆ 1.0    ┆ 12.26242  │
│ …       ┆ …       ┆ …       ┆ …      ┆ …         │
│ S4P1U2  ┆ S4      ┆ P1      ┆ 1.0    ┆ 9.844337  │
│ S4P1U3  ┆ S4      ┆ P1      ┆ 1.0    ┆ 10.741156 │
│ S4P2U1  ┆ S4      ┆ P2      ┆ 1.0    ┆ 10.305219 │
│ S4P2U2  ┆ S4      ┆ P2      ┆ 1.0    ┆ 8.991012  │
│ S4P2U3  ┆ S4      ┆ P2      ┆ 1.0    ┆ 9.013348  │
└─────────┴─────────┴─────────┴────────┴───────────┘

Balanced Repeated Replication (BRR)

BRR forms balanced half-samples within each stratum using a Hadamard design and—after any collapsing—requires exactly 2 PSUs per stratum.

By default, svy sets the number of replicates to the smallest multiple of 4 strictly greater than the number of strata. You can request more by passing n_reps (values ≤28 must be multiples of 4; larger values are rounded up to the next power of two).

rep_sample = Sample(
    data=df_rep,
    design=Design(stratum="stratum", wgt="weight", psu="cluster"),
)

brr_sample = rep_sample.weighting.create_brr_wgts(rep_prefix="brr_rep_wgt")

print(brr_sample)
╭─────────────────────────── Sample ────────────────────────────╮
 Survey Data:                                                  
   Number of rows: 24                                          
   Number of columns: 16                                       
   Number of strata: 4                                         
   Number of PSUs: 8                                           
                                                               
 Survey Design:                                                
                                                               
    Field               Value                                  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━    
    Row index           svy_row_index                          
    Stratum             stratum                                
    PSU                 cluster                                
    SSU                 None                                   
    Weight              weight                                 
    With replacement    False                                  
    Prob                None                                   
    Hit                 None                                   
    MOS                 None                                   
    Population size     None                                   
    Replicate weights   BRR (n_reps=8, df=4, fay=0, cols=8)    
                                                               
╰───────────────────────────────────────────────────────────────╯

Fay–BRR is a damped BRR: each replicate weight is a combination of the full-sample weight and the BRR half-sample weight. Choose a Fay factor \(\rho \in (0,1)\), commonly between 0.3 and 0.5, to reduce perturbation and improve stability.
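Concretely, with Fay coefficient rho, PSUs in the selected half-sample receive weight w*(2 - rho) and the remaining PSUs receive w*rho (ordinary BRR corresponds to rho = 0); the variance estimator then divides by (1 - rho)^2 to undo the damping. A minimal sketch of the weight construction for one stratum and one replicate (illustration only, not svy's implementation):

```python
import numpy as np

w = np.array([100.0, 100.0])  # base weights of the 2 PSUs in one stratum
rho = 0.45
half = np.array([1, 0])       # PSU 1 is in this replicate's half-sample

# Fay-BRR replicate weights: inflate the half-sample, deflate the rest
rep_w = np.where(half == 1, w * (2 - rho), w * rho)
print(rep_w)  # the stratum's total weight is preserved
```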

fay_sample = rep_sample.weighting.create_brr_wgts(
    n_reps=12, rep_prefix="fay_rep_wgt", fay_coef=0.45
)

print(fay_sample)
╭─────────────────────────── Sample ────────────────────────────╮
 Survey Data:                                                  
   Number of rows: 24                                          
   Number of columns: 28                                       
   Number of strata: 4                                         
   Number of PSUs: 8                                           
                                                               
 Survey Design:                                                
                                                               
    Field               Value                                  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━    
    Row index           svy_row_index                          
    Stratum             stratum                                
    PSU                 cluster                                
    SSU                 None                                   
    Weight              weight                                 
    With replacement    False                                  
    Prob                None                                   
    Hit                 None                                   
    MOS                 None                                   
    Population size     None                                   
    Replicate weights   BRR (n_reps=12, df=4, fay=0.45, cols=12)
                                                               
╰───────────────────────────────────────────────────────────────╯
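To see what the Fay adjustment does mechanically, here is a minimal NumPy sketch, independent of svy (the base weights and half-sample flags are made up for illustration): units inside a replicate's half-sample get their weight multiplied by \(2 - \rho\), units outside by \(\rho\), so with \(\rho = 0\) the factors collapse to the standard BRR values of 2 and 0.

```python
import numpy as np

rho = 0.45                                       # Fay factor
base_wgt = np.array([10.0, 12.0, 9.0, 11.0])     # toy base weights
in_half = np.array([True, False, True, False])   # hypothetical half-sample flags

# Fay-BRR: damp the all-or-nothing BRR factors (2 or 0) toward 1
rep_wgt = np.where(in_half, (2 - rho) * base_wgt, rho * base_wgt)
print(rep_wgt)  # [15.5   5.4  13.95  4.95]
```

Note that the replicate weights never drop to zero, which is why Fay's method behaves better when some cells are sparse.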

Jackknife (JK)

Jackknife forms replicates by deleting one PSU at a time and re-weighting the remaining PSUs in that stratum. Each stratum must have two or more PSUs.

jk_sample = rep_sample.weighting.create_jk_wgts(rep_prefix="jk_rep_wgt")

print(jk_sample)
╭─────────────────────────── Sample ────────────────────────────╮
 Survey Data:                                                  
   Number of rows: 24                                          
   Number of columns: 36                                       
   Number of strata: 4                                         
   Number of PSUs: 8                                           
                                                               
 Survey Design:                                                
                                                               
    Field               Value                                  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━    
    Row index           svy_row_index                          
    Stratum             stratum                                
    PSU                 cluster                                
    SSU                 None                                   
    Weight              weight                                 
    With replacement    False                                  
    Prob                None                                   
    Hit                 None                                   
    MOS                 None                                   
    Population size     None                                   
    Replicate weights   JACKKNIFE (n_reps=8, df=4, fay=0, cols=8) 
                                                               
╰───────────────────────────────────────────────────────────────╯
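The delete-one-PSU reweighting can be sketched directly with NumPy (a toy single-stratum example, independent of svy): the dropped PSU's weights go to zero and the survivors in the same stratum are scaled up by \(n_h / (n_h - 1)\), so each replicate preserves the stratum's weight total in expectation.

```python
import numpy as np

# Toy stratum with 3 PSUs, one weight per PSU (hypothetical values)
psu = np.array([1, 2, 3])
wgt = np.array([20.0, 25.0, 15.0])
n_h = len(psu)

# Replicate that deletes PSU 2: zero it out, rescale the survivors
dropped = 2
rep_wgt = np.where(psu == dropped, 0.0, wgt * n_h / (n_h - 1))
print(rep_wgt)  # [30.   0.  22.5]
```

With 4 strata of 2 PSUs each, this yields the 8 replicates (one per deleted PSU) reported in the output above.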

Bootstrap (BS)

Bootstrap replicates are formed by resampling PSUs with replacement within each stratum, drawing the same number of PSUs as observed in the sample for every replicate. Draws are independent across replicates. The weights are then rescaled (e.g., the Rao–Wu rescaled bootstrap) so that estimators remain unbiased under the design.

Replicate count: if n_reps is omitted, create_bs_wgts() defaults to 500 (increase for highly non-linear targets).

bs_sample = rep_sample.weighting.create_bs_wgts(
    n_reps=50, rep_prefix="bs_rep_wgt"
)

print(bs_sample)
╭─────────────────────────── Sample ────────────────────────────╮
 Survey Data:                                                  
   Number of rows: 24                                          
   Number of columns: 86                                       
   Number of strata: 4                                         
   Number of PSUs: 8                                           
                                                               
 Survey Design:                                                
                                                               
    Field               Value                                  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  
    Row index           svy_row_index                          
    Stratum             stratum                                
    PSU                 cluster                                
    SSU                 None                                   
    Weight              weight                                 
    With replacement    False                                  
    Prob                None                                   
    Hit                 None                                   
    MOS                 None                                   
    Population size     None                                   
    Replicate weights   BOOTSTRAP (n_reps=50, df=49, fay=0,    
                        cols=50)                               
                                                               
╰───────────────────────────────────────────────────────────────╯
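Whichever method produced the replicate weights, variance estimation follows the same recipe: re-compute the estimate with each replicate weight column and combine the squared deviations from the full-sample estimate. A generic sketch with made-up replicate estimates (the svy estimation API itself is covered in the next tutorial); for the plain bootstrap the combining factor is simply \(1/R\):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical full-sample estimate and 50 bootstrap replicate estimates
theta_hat = 3.20
theta_reps = theta_hat + rng.normal(0, 0.1, size=50)

# Bootstrap variance: mean squared deviation of replicate estimates
var_bs = np.mean((theta_reps - theta_hat) ** 2)
se_bs = np.sqrt(var_bs)
print(f"SE = {se_bs:.3f}")
```

BRR and jackknife use the same sum of squared deviations with method-specific scale factors (for Fay–BRR, for example, the sum is divided by \(R(1-\rho)^2\)).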

Adjustment of Replicate Weights

COMING SOON!!!

Next Steps

Next, we discuss the estimation methods: Estimation.

References

Deville, Jean-Claude, and Carl-Erik Särndal. 1992. “Calibration Estimators in Survey Sampling.” Journal of the American Statistical Association 87 (418): 376–82. https://doi.org/10.1080/01621459.1992.10475217.
Särndal, Carl-Erik, Bengt Swensson, and Jan Wretman. 1992. Model Assisted Survey Sampling. Springer-Verlag New York, Inc. https://link.springer.com/book/9780387406206.
Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. https://www.stata-press.com/books/survey-weights/.
World Bank. 2023. “Synthetic Data for an Imaginary Country, Sample, 2023.” World Bank, Development Data Group. https://doi.org/10.48529/MC1F-QH23.