svy-io: Python bindings for ReadStat
Read and write SAS, Stata, and SPSS files
svy-io provides fast, Pythonic I/O for statistical software file formats. It wraps the ReadStat C library to read and write SAS, Stata, and SPSS files with full metadata support.
Installation
pip install svy-io
Features
- Fast: native C library performance via ReadStat
- Polars-native: returns Polars DataFrames for modern data analysis
- Full metadata: preserves variable labels, value labels, and file attributes
- Multiple formats: SAS (.sas7bdat, .xpt), Stata (.dta), SPSS (.sav, .zsav, .por)
Quick Start
import polars as pl
from svy_io.stata import read_dta, write_dta
from svy_io.spss import read_sav, write_sav
from svy_io.sas import read_sas, read_xpt, write_xpt
# Read files: each reader returns a (DataFrame, metadata) pair
df_stata, meta_stata = read_dta("data.dta")
df_spss, meta_spss = read_sav("data.sav")
df_sas, meta_sas = read_sas("data.sas7bdat")
# Write files: any Polars DataFrame works, here the one read from Stata
df = df_stata
write_dta(df, "output.dta", var_labels={"age": "Age in years"})
write_sav(df, "output.sav", var_labels={"age": "Age in years"})
write_xpt(df, "output.xpt", label="My Dataset")
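Each reader returns a DataFrame together with a metadata object. The sketch below shows one way to pull variable labels out of that metadata. It assumes the layout used in the Preserve Metadata pattern on this page, where `meta['vars']` is a list of dicts with a `name` key and an optional `label` key. The helper name `variable_labels` and the hand-written `sample_meta` are hypothetical and only stand in for what a reader returns.

```python
from typing import Any, Iterable, Mapping, Optional


def variable_labels(
    meta: Mapping[str, Any],
    columns: Optional[Iterable[str]] = None,
) -> dict:
    """Map variable name -> label, skipping variables that have no label.

    Assumes meta["vars"] is a list of dicts with "name" and optional "label".
    If `columns` is given, only those variables are considered.
    """
    wanted = None if columns is None else set(columns)
    labels = {}
    for var in meta.get("vars", []):
        name, label = var["name"], var.get("label")
        if label and (wanted is None or name in wanted):
            labels[name] = label
    return labels


# Hand-written stand-in for the metadata a reader such as read_dta returns.
sample_meta = {
    "vars": [
        {"name": "age", "label": "Age in years"},
        {"name": "sex", "label": "Sex of respondent"},
        {"name": "wt"},  # no label recorded
    ]
}

all_labels = variable_labels(sample_meta)
subset_labels = variable_labels(sample_meta, columns=["age", "wt"])
```

The resulting dict has the same shape as the `var_labels` argument passed to `write_dta` and `write_sav` in the examples on this page.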
Supported Formats
| Format | Read | Write | Extensions |
|---|---|---|---|
| Stata | ✅ | ✅ | .dta |
| SPSS | ✅ | ✅ | .sav, .zsav, .por |
| SAS | ✅ | ✅* | .sas7bdat, .xpt |
*SAS write support is for XPT (transport) format only.
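The table above can be turned into a small dispatch helper that picks a reader or writer from a file extension. This is a sketch, not part of svy-io: the `IO_FUNCTIONS` table and the `lookup`, `load_reader`, and `load_writer` helpers are hypothetical. Only functions that appear in the examples on this page are listed, so `.zsav` and `.por` are left out because their function names are not shown here. The `svy_io` modules are imported lazily, only when a reader or writer is actually requested.

```python
from importlib import import_module
from pathlib import Path
from typing import Callable, NamedTuple, Optional


class IOEntry(NamedTuple):
    module: str
    reader: str
    writer: Optional[str]  # None where the table lists no write support


# Extension -> function names, taken from the imports shown on this page.
IO_FUNCTIONS = {
    ".dta": IOEntry("svy_io.stata", "read_dta", "write_dta"),
    ".sav": IOEntry("svy_io.spss", "read_sav", "write_sav"),
    ".sas7bdat": IOEntry("svy_io.sas", "read_sas", None),
    ".xpt": IOEntry("svy_io.sas", "read_xpt", "write_xpt"),
}


def lookup(path: str) -> IOEntry:
    """Return the table entry for a path, matching the extension case-insensitively."""
    suffix = Path(path).suffix.lower()
    try:
        return IO_FUNCTIONS[suffix]
    except KeyError:
        raise ValueError(f"unsupported extension: {suffix!r}") from None


def load_reader(path: str) -> Callable:
    """Import and return the reader function for this path (requires svy-io)."""
    entry = lookup(path)
    return getattr(import_module(entry.module), entry.reader)


def load_writer(path: str) -> Callable:
    """Import and return the writer function, or raise if the format is read-only."""
    entry = lookup(path)
    if entry.writer is None:
        raise ValueError(f"writing {Path(path).suffix} files is not supported")
    return getattr(import_module(entry.module), entry.writer)
```

With svy-io installed, `load_reader("input.dta")` would be expected to return `read_dta` itself, so calling it on the path behaves like the direct call in the Quick Start.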
Documentation
Why svy-io?
| Feature | svy-io | pandas | pyreadstat |
|---|---|---|---|
| Returns Polars DataFrame | ✅ | ❌ | ❌ |
| Full metadata support | ✅ | ❌ | ✅ |
| Value labels | ✅ | ❌ | ✅ |
| User-defined missing | ✅ | ❌ | ✅ |
| Write support | ✅ | Limited | ✅ |
| Zip archive support | ✅ | ❌ | ❌ |
Common Patterns
Convert Between Formats
from svy_io.stata import read_dta
from svy_io.spss import write_sav
# Stata → SPSS
df, meta = read_dta("input.dta")
write_sav(df, "output.sav")
Preserve Metadata
import polars as pl
from svy_io.stata import read_dta, write_dta
# Read with metadata
df, meta = read_dta("input.dta")
# Transform data (a row filter keeps every column)
df_filtered = df.filter(pl.col("age") > 18)
# Extract labels for remaining columns
var_labels = {
    v['name']: v.get('label')
    for v in meta['vars']
    if v['name'] in df_filtered.columns
}
# Write with preserved metadata
write_dta(df_filtered, "output.dta", var_labels=var_labels)
Export to Multiple Formats
from svy_io.stata import write_dta
from svy_io.spss import write_sav
from svy_io.sas import write_xpt
# Export an existing Polars DataFrame `df` to all formats
write_dta(df, "data.dta")
write_sav(df, "data.sav")
write_xpt(df, "data.xpt")
df.write_parquet("data.parquet")
df.write_csv("data.csv")
Differences from Stata/SPSS I/O
| Feature | SAS | Stata | SPSS |
|---|---|---|---|
| File formats | .sas7bdat, .xpt | .dta | .sav, .zsav, .por |
| Value labels | Separate catalog file | In file | In file |
| Write support | XPT only | Full | Full |
| Tagged missing | .A–.Z | .a–.z | User-defined |
| Default temporal coercion | False (SAS7BDAT), True (XPT) | False | True |
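The tagged-missing row can be made concrete with a short lookup. The sketch below only enumerates the tag ranges listed in the table (`.A` to `.Z` for SAS, `.a` to `.z` for Stata). It does not show how svy-io represents these values in a DataFrame, which this page does not specify. The names `TAGGED_MISSING` and `is_tagged_missing` are hypothetical.

```python
import string

# Tag ranges as listed in the table above. SPSS is omitted because its
# missing values are user-defined rather than drawn from a fixed set.
TAGGED_MISSING = {
    "sas": ["." + letter for letter in string.ascii_uppercase],
    "stata": ["." + letter for letter in string.ascii_lowercase],
}


def is_tagged_missing(code: str, fmt: str) -> bool:
    """Return True if `code` is one of the tagged-missing codes for `fmt`."""
    return code in TAGGED_MISSING.get(fmt.lower(), [])


sas_tags = TAGGED_MISSING["sas"]      # ".A" through ".Z"
stata_tags = TAGGED_MISSING["stata"]  # ".a" through ".z"
```

The case difference is the practical point: a code such as `.A` is a tag in the SAS range but not in the Stata one, which is worth keeping in mind when converting between the two formats.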
Requirements
- Python 3.9+
- Polars
- ReadStat (bundled)
Links
- GitHub Repository
- Issue Tracker
- ReadStat — Upstream C library
- Polars — DataFrame library