DATASET
Data & Downloads
SynPop-DE provides 40,235,916 synthetic households and 81,629,116 persons across all 400 German districts — in open Parquet format, directly readable with DuckDB, pandas, or R.
Downloads
400 Parquet files — one per district — plus the full national dataset. Stream directly with DuckDB, pandas, or R.
Download by District
All 45 attributes per person and household. File size: ~15–30 MB per district.
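A per-district read with pandas could be sketched as follows. The base URL and the `<ags>.parquet` file naming are placeholders, not the project's actual layout, and the helper names are made up for illustration; take the real links from the district picker. The detail worth copying is keeping the AGS as a zero-padded five-character string, since the schema stores `ags` as a string and codes in some states begin with `0`.

```python
# Hypothetical placeholder; replace with the real download location.
BASE_URL = "https://example.org/synpop-de/districts"


def normalize_ags(value) -> str:
    """Return a 5-character district AGS, restoring any lost leading zero."""
    text = str(value).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a valid district AGS: {value!r}")
    return text.zfill(5)


def district_url(ags, base_url: str = BASE_URL) -> str:
    """Build the per-district Parquet URL (file naming is an assumption)."""
    return f"{base_url}/{normalize_ags(ags)}.parquet"


def load_district(ags, columns=None, base_url: str = BASE_URL):
    """Read one district file into a DataFrame, optionally only some columns."""
    # Imported lazily so the helpers above work without pandas installed.
    # pandas needs pyarrow or fastparquet to read Parquet.
    import pandas as pd

    return pd.read_parquet(district_url(ags, base_url), columns=columns)


# Example call (needs network access and a real base URL):
# df = load_district("11000", columns=["household_id", "age_group", "aeq"])
```

Passing `columns` keeps memory use down when only a few of the 45 attributes are needed.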
Full Dataset
All 400 districts in one file. Best for cross-district analyses.
DuckDB streams via HTTPS byte-range requests — no full download needed. For pandas/R, use per-district files.
CSV format is available per district only (section above). The national dataset is available as Parquet or via DuckDB streaming.
Machine-readable index: catalog.json · DOI: 10.5281/zenodo.20439915
Direct Access & Streaming
Read Parquet files directly from the URL — DuckDB transfers only the needed data via byte-range requests.
Streaming suits filtered or programmatic export; ready-to-download CSV files are available in the district picker above.
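A streamed read from Python might look like the sketch below. The query names only the columns it needs and filters on `ags`, which is what allows DuckDB to skip parts of the remote file. The URL in the usage comment is a placeholder and the function names are invented for illustration; only the column names come from the schema on this page.

```python
def build_district_query(parquet_url: str, columns: list[str]) -> str:
    """Build a SELECT over a remote Parquet file, filtered by district.

    The district code is left as a `?` placeholder and bound at execution.
    """
    for name in columns:
        # Column names are interpolated into SQL, so accept identifiers only.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected column name: {name!r}")
    quoted_url = parquet_url.replace("'", "''")  # escape quotes for the SQL literal
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list} "
        f"FROM read_parquet('{quoted_url}') "
        f"WHERE ags = ?"
    )


def stream_district(parquet_url: str, ags: str, columns: list[str]):
    """Run the query with DuckDB and return the matching rows as tuples."""
    import duckdb  # imported lazily; requires the duckdb package

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # HTTP(S) support for remote files
    con.execute("LOAD httpfs")
    return con.execute(build_district_query(parquet_url, columns), [ags]).fetchall()


# Example call (placeholder URL, needs network access):
# rows = stream_district(
#     "https://example.org/synpop-de/national.parquet",
#     "11000",
#     ["household_id", "person_id", "age_group"],
# )
```

How much data is actually transferred depends on how the file is laid out; a filter on `ags` saves the most when rows of one district sit close together.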
Data Preview
Select a district and explore 10 sample records — one row per person, with all household attributes.
Preview columns: AGS, building type, floor area (m²) and household income (€/yr) at household level; age group, gender and education at person level.
Data Schema
All 45 columns in the Parquet files — one row per person, household attributes shared within each household.
| Column | Level | Type | Description / Values |
|---|---|---|---|
| **Identifiers** | | | |
| household_id | ID | string | Unique household identifier |
| person_id | ID | string | Unique person identifier |
| person_rank | ID | int | Person's position within household (1 = head) |
| ags | ID | string | AGS district code (5-digit, e.g. "11000" for Berlin) |
| state | ID | int | Federal state code: 1–16 |
| region_type | ID | int | BKG settlement type: 1=urban core, 2=rural district, 3=rural area |
| original_household_id | ID | string | Row index in the GAN-generated state-region donor pool. Not a link to any EVS survey respondent. |
| source_household_id | ID | string | Identical to original_household_id (donor pool index; both columns are always equal) |
| **Person attributes** | | | |
| gender | Person | int | Gender: 1=male, 2=female |
| age_group | Person | int | Age group: 0=0–5, 1=6–14, 2=15–17, 3=18–29, 4=30–44, 5=45–59, 6=60–74, 7=75+ |
| education | Person | int | Educational attainment: 1=no qualification, 2=primary, 3=secondary, 4=vocational, 5=tertiary |
| employment | Person | int | Employment status: 1=unemployed, 2=part-time, 3=full-time, 4=civil servant, 5=self-employed |
| **Household attributes** | | | |
| household_size | Household | int | Number of persons in household |
| household_type | Household | int | Household type: 1=single, 2=couple, 3=single parent, 4=couple+children, 5=other |
| household_type_27 | Household | int | Household type (5 categories, identical to household_type after harmonisation; the name is a legacy artefact of the raw EVS variable) |
| building_type | Household | int | Building type: 1=detached, 2=semi-detached, 3=multi-family, 4=other |
| building_ownership | Household | int | Ownership: 1=owner-occupied, 2=rented |
| building_age | Household | int | Construction period: 1=pre-1949, 2=1949–1978, 3=1979–2001, 4=post-2001 |
| building_size | Household | float | Living floor area in m² |
| heating_type | Household | int | Heating system: 1=district heating, 2=central, 3=single room |
| heating_energy | Household | int | Primary energy carrier: 0=none, 1=electricity, 2=gas, 3=oil, 4=solid fuel, 5=renewables |
| household_income | Household | int | Income bracket (10 classes, Destatis classification): 1 (lowest) to 10 (highest) |
| household_income_num | Household | float | Annual net household income (€) |
| aeq | Household | float | Equivalized income (OECD-modified scale, €/year) |
| exp_quota | Household | float | Consumption ratio: total annual household expenditure divided by annual net household income |
| **Expenditures (€/year)** | | | |
| expenditure | Household | float | Total annual household expenditure in EUR (sum of all expenditure categories after calibration) |
| exp_food | Household | float | Food and non-alcoholic beverages |
| exp_clothes | Household | float | Clothing and footwear |
| exp_housing_total | Household | float | Total housing costs |
| exp_housing_rent | Household | float | Rent (tenants) or imputed rent (owners) |
| exp_housing_electricity | Household | float | Electricity |
| exp_housing_heating | Household | float | Heating fuel |
| exp_housing_imputed | Household | float | Imputed rent (owner-occupied households only) |
| exp_housing_maintenance | Household | float | Maintenance and repair |
| exp_transport_total | Household | float | Total transport |
| exp_transport_fuel | Household | float | Fuel |
| exp_transport_own | Household | float | Own vehicle purchase and leasing |
| exp_transport_public | Household | float | Public transport |
| exp_health | Household | float | Health |
| exp_leisure | Household | float | Leisure and culture |
| exp_telecom | Household | float | Telecommunications |
| exp_furniture | Household | float | Furniture and home equipment |
| exp_hospitality | Household | float | Restaurants and accommodation |
| exp_education | Household | float | Education |
| exp_other | Household | float | Other goods and services |
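
Because the files hold one row per person and repeat the household attributes on each of them, household-level statistics need one row per household first. The sketch below shows this on invented toy rows using only the standard library; the function names are illustrative, and the numbers are made up rather than taken from the dataset. It also decodes `age_group` with the labels from the table and recomputes the consumption ratio as defined for `exp_quota`.

```python
# Labels copied from the age_group row of the schema table.
AGE_GROUP_LABELS = {
    0: "0–5", 1: "6–14", 2: "15–17", 3: "18–29",
    4: "30–44", 5: "45–59", 6: "60–74", 7: "75+",
}


def one_row_per_household(person_rows):
    """Keep the first person row seen for each household_id."""
    households = {}
    for row in person_rows:
        households.setdefault(row["household_id"], row)
    return list(households.values())


def mean_household_income(person_rows):
    """Average household_income_num over households, not over persons."""
    households = one_row_per_household(person_rows)
    return sum(h["household_income_num"] for h in households) / len(households)


def consumption_ratio(household):
    """Total expenditure divided by net income (the exp_quota definition)."""
    income = household["household_income_num"]
    if income == 0:
        return float("nan")  # ratio is undefined without income
    return household["expenditure"] / income


# Invented toy data: a three-person household and a single-person household.
rows = [
    {"household_id": "h1", "person_id": "p1", "age_group": 4,
     "household_income_num": 40000.0, "expenditure": 30000.0},
    {"household_id": "h1", "person_id": "p2", "age_group": 4,
     "household_income_num": 40000.0, "expenditure": 30000.0},
    {"household_id": "h1", "person_id": "p3", "age_group": 1,
     "household_income_num": 40000.0, "expenditure": 30000.0},
    {"household_id": "h2", "person_id": "p4", "age_group": 7,
     "household_income_num": 20000.0, "expenditure": 18000.0},
]

# Averaging over person rows counts the larger household three times.
naive_mean = sum(r["household_income_num"] for r in rows) / len(rows)
household_mean = mean_household_income(rows)
```

With pandas, `df.drop_duplicates("household_id")` performs the same collapse before aggregating.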