Tutorials - Small Area Estimation with svy-sae

Step-by-step tutorials for Small Area Estimation in Python. Learn Fay-Herriot models, EBLUP, and advanced SAE techniques with practical examples.

Small Area Estimation (SAE) refers to a class of statistical methods designed to produce reliable estimates for domains (or “small areas”) where direct survey estimates are unreliable because the sample is small, or unavailable because no units were sampled. SAE methods improve precision by fitting statistical models that borrow strength across domains or units, which makes estimation possible even when domain-level sample sizes are small or zero.

In practice, SAE methods are commonly grouped into two broad categories. Area-level models are fitted to domain-level aggregates, such as direct estimates and domain-specific summary measures. Unit-level models are fitted to individual records, so information is shared across units and domains through a common model structure. svy-sae supports both frameworks, each with several methodological options and estimation workflows. Further details are introduced progressively throughout the tutorials.
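
To make "borrowing strength" concrete for the area-level case, the following is a minimal sketch in plain Python of a Fay-Herriot-style composite estimate. It does not use the svy-sae API: the function name `composite_estimates` and all numbers are hypothetical, and the model variance is treated as known here, whereas in practice it is estimated from the data (which is what makes the predictor an EBLUP).

```python
# Illustrative sketch only: plain Python, NOT the svy-sae API.
# Assumption: the between-domain model variance is known (normally it is estimated).

def composite_estimates(direct, sampling_var, synthetic, model_var):
    """Shrink each direct estimate toward its model-based (synthetic) prediction.

    For each domain d:
        gamma_d    = model_var / (model_var + sampling_var_d)
        estimate_d = gamma_d * direct_d + (1 - gamma_d) * synthetic_d
    Returns a list of (gamma_d, estimate_d) pairs.
    """
    results = []
    for y, psi, pred in zip(direct, sampling_var, synthetic):
        gamma = model_var / (model_var + psi)
        results.append((gamma, gamma * y + (1.0 - gamma) * pred))
    return results


# Hypothetical inputs for three domains (made-up numbers).
direct = [10.0, 20.0, 30.0]       # direct survey estimates
sampling_var = [1.0, 4.0, 16.0]   # sampling variances of the direct estimates
synthetic = [12.0, 18.0, 24.0]    # regression-based predictions
model_var = 4.0                   # assumed known for this sketch

for gamma, estimate in composite_estimates(direct, sampling_var, synthetic, model_var):
    print(f"gamma={gamma:.2f}  estimate={estimate:.2f}")
```

The weight `gamma` shrinks as the sampling variance grows, so domains with noisy direct estimates lean more heavily on the model-based prediction, while well-sampled domains stay close to their direct estimate.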

Note

This tutorial series focuses on the practical application of Small Area Estimation methods rather than their underlying statistical theory. For a comprehensive theoretical treatment of SAE, we recommend:

Rao, J. N. K., & Molina, I. (2015). Small Area Estimation (2nd ed.). Wiley.


How to use these tutorials

The tutorials provide a hands-on introduction to small area estimation workflows using svy-sae. Examples are self-contained and build progressively, but sections can also be read independently depending on the reader’s background and objectives.

The emphasis is on end-to-end workflows: from data preparation and direct estimation to model fitting, diagnostics, and uncertainty assessment.


Tutorial structure

The tutorials are organized as follows:

1. Getting started

  • Installation and setup
  • Package structure and basic concepts

2. Area-level models

  • Area-level modeling workflows
  • Estimation and diagnostics
  • Uncertainty measures

3. Unit-level models

  • Unit-level modeling workflows
  • Prediction and aggregation
  • Practical considerations

Conventions used in the tutorials

  • Code examples assume Python 3.11 or newer
  • Data are represented using polars DataFrames
  • Survey design objects and direct estimators are created using svy