Data & Downloads

SynPop-DE provides 40,235,916 synthetic households and 81,629,116 persons across all 400 German districts — in open Parquet format, directly readable with DuckDB, pandas, or R.

82M synthetic persons
40M households
400 districts

Downloads

400 Parquet files, one per district, plus the full national dataset. DuckDB can stream them directly over HTTPS; pandas and R read the per-district files.

Download by District

All 45 attributes per person and household. File size: ~15–30 MB per district.
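
A per-district file is small enough to load whole with pandas. The sketch below assumes the file has already been downloaded into a local folder; the file name pattern `synpop_<AGS>.parquet` and the helper names are hypothetical and should be adjusted to the actual file names. It also treats the AGS as a string, because codes can start with a zero and lose it when handled as integers.

```python
from pathlib import Path


def normalize_ags(ags) -> str:
    """Return a district code as a zero-padded 5-digit string."""
    text = str(ags).strip()
    if not (text.isascii() and text.isdigit()) or len(text) > 5:
        raise ValueError(f"not a valid 5-digit AGS code: {ags!r}")
    return text.zfill(5)


def district_path(data_dir, ags) -> Path:
    # Hypothetical naming scheme, not taken from the dataset itself.
    return Path(data_dir) / f"synpop_{normalize_ags(ags)}.parquet"


def load_district(data_dir, ags, columns=None):
    """Load one district file, optionally restricted to some columns."""
    # Imported lazily so the helpers above work without pandas installed.
    import pandas as pd  # needs pandas plus a Parquet engine such as pyarrow

    return pd.read_parquet(district_path(data_dir, ags), columns=columns)
```

For example, `load_district("data", "11000", columns=["household_id", "age_group"])` would read only those two columns for Berlin, provided a matching file exists in `data/`.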

    Full Dataset

    All 400 districts in one file. Best for cross-district analyses.

    DuckDB streams via HTTPS byte-range requests — no full download needed. For pandas/R, use per-district files.

    CSV format is available per district only (section above). The national dataset is available as Parquet or via DuckDB streaming.

    Machine-readable index: catalog.json · DOI: 10.5281/zenodo.20439915

    Direct Access & Streaming

    Read Parquet files directly from the URL — DuckDB transfers only the needed data via byte-range requests.
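
As a minimal sketch of such a streamed read from Python: the query names only the columns it needs and filters on `ags`, which is what lets DuckDB skip most of the remote file. The URL in the usage note is a placeholder, not the real download address, and the function names are illustrative.

```python
def build_query(url, columns, where=None):
    """Build a SELECT that reads only the listed columns from a remote Parquet file."""
    select_list = ", ".join(columns)
    safe_url = url.replace("'", "''")  # escape single quotes inside the SQL literal
    sql = f"SELECT {select_list} FROM read_parquet('{safe_url}')"
    if where:
        # The filter is inserted verbatim, so pass trusted SQL only.
        sql += f" WHERE {where}"
    return sql


def stream_rows(url, columns, where=None):
    """Run the query with DuckDB and return the result rows as tuples."""
    # Imported lazily so build_query works without duckdb installed.
    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # extension that enables HTTPS reads
    con.execute("LOAD httpfs")
    return con.execute(build_query(url, columns, where)).fetchall()
```

A call such as `stream_rows("https://example.org/synpop_de.parquet", ["ags", "household_income_num"], "ags = '11000'")` would then return only the Berlin rows for those two columns, once the placeholder URL is replaced by the real one.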

    Use this route for filtered or programmatic export. Ready-to-download CSV files are available in the district picker above.

    Data Schema

    All 45 columns in the Parquet files — one row per person, household attributes shared within each household.
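
Because household attributes repeat on every person row, household-level statistics need one row per household first; otherwise large households are counted several times. The sketch below illustrates this on a tiny made-up sample (not real SynPop-DE records), keeping the first row seen for each `household_id`. Filtering on `person_rank == 1` should give the same result, assuming every household has exactly one head.

```python
from statistics import mean


def household_rows(person_rows):
    """Keep one row per household: the first row seen for each household_id."""
    first_seen = {}
    for row in person_rows:
        first_seen.setdefault(row["household_id"], row)
    return list(first_seen.values())


# Made-up sample: a three-person household and a single-person household.
sample = [
    {"household_id": "h1", "person_id": "p1", "person_rank": 1, "household_income_num": 30000.0},
    {"household_id": "h1", "person_id": "p2", "person_rank": 2, "household_income_num": 30000.0},
    {"household_id": "h1", "person_id": "p3", "person_rank": 3, "household_income_num": 30000.0},
    {"household_id": "h2", "person_id": "p4", "person_rank": 1, "household_income_num": 60000.0},
]

# Averaging over person rows weights each household by its size.
person_weighted = mean(r["household_income_num"] for r in sample)

# Averaging over deduplicated rows gives the mean across households.
household_level = mean(r["household_income_num"] for r in household_rows(sample))

print(person_weighted, household_level)
```

Both averages are legitimate, but they answer different questions: the first describes the income of the household a typical person lives in, the second the income of a typical household.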

    Column                   Level      Type    Description / Values

    Identifiers
    household_id             ID         string  Unique household identifier
    person_id                ID         string  Unique person identifier
    person_rank              ID         int     Person's position within household (1 = head)
    ags                      ID         string  AGS district code (5-digit, e.g. "11000" for Berlin)
    state                    ID         int     Federal state code: 1–16
    region_type              ID         int     BKG settlement type: 1=urban core, 2=rural district, 3=rural area
    original_household_id    ID         string  Row index in the GAN-generated state-region donor pool. Not a link to any EVS survey respondent.
    source_household_id      ID         string  Identical to original_household_id (donor pool index; both columns are always equal)

    Person attributes
    gender                   Person     int     Gender: 1=male, 2=female
    age_group                Person     int     Age group: 0=0–5, 1=6–14, 2=15–17, 3=18–29, 4=30–44, 5=45–59, 6=60–74, 7=75+
    education                Person     int     Educational attainment: 1=no qualification, 2=primary, 3=secondary, 4=vocational, 5=tertiary
    employment               Person     int     Employment status: 1=unemployed, 2=part-time, 3=full-time, 4=civil servant, 5=self-employed

    Household attributes
    household_size           Household  int     Number of persons in household
    household_type           Household  int     Household type: 1=single, 2=couple, 3=single parent, 4=couple+children, 5=other
    household_type_27        Household  int     Household type (5 categories, identical to household_type after harmonisation; the name is a legacy artefact of the raw EVS variable)
    building_type            Household  int     Building type: 1=detached, 2=semi-detached, 3=multi-family, 4=other
    building_ownership       Household  int     Ownership: 1=owner-occupied, 2=rented
    building_age             Household  int     Construction period: 1=pre-1949, 2=1949–1978, 3=1979–2001, 4=post-2001
    building_size            Household  float   Living floor area in m²
    heating_type             Household  int     Heating system: 1=district heating, 2=central, 3=single room
    heating_energy           Household  int     Primary energy carrier: 0=none, 1=electricity, 2=gas, 3=oil, 4=solid fuel, 5=renewables
    household_income         Household  int     Income bracket (10 classes, Destatis classification): 1 (lowest) to 10 (highest)
    household_income_num     Household  float   Annual net household income (€)
    aeq                      Household  float   Equivalized income (OECD-modified scale, €/year)
    exp_quota                Household  float   Consumption ratio: ratio of total annual household expenditure to annual net household income

    Expenditures (€/year)
    expenditure              Household  float   Total annual household expenditure in EUR (sum of all expenditure categories after calibration)
    exp_food                 Household  float   Food and non-alcoholic beverages
    exp_clothes              Household  float   Clothing and footwear
    exp_housing_total        Household  float   Total housing costs
    exp_housing_rent         Household  float   Rent (tenants) or imputed rent (owners)
    exp_housing_electricity  Household  float   Electricity
    exp_housing_heating      Household  float   Heating fuel
    exp_housing_imputed      Household  float   Imputed rent (owner-occupied households only)
    exp_housing_maintenance  Household  float   Maintenance and repair
    exp_transport_total      Household  float   Total transport
    exp_transport_fuel       Household  float   Fuel
    exp_transport_own        Household  float   Own vehicle purchase and leasing
    exp_transport_public     Household  float   Public transport
    exp_health               Household  float   Health
    exp_leisure              Household  float   Leisure and culture
    exp_telecom              Household  float   Telecommunications
    exp_furniture            Household  float   Furniture and home equipment
    exp_hospitality          Household  float   Restaurants and accommodation
    exp_education            Household  float   Education
    exp_other                Household  float   Other goods and services
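
Most categorical columns are stored as integer codes, so a small lookup step makes records readable. The sketch below copies a few codebooks from the schema above and recomputes the consumption ratio from `expenditure` and `household_income_num` as defined for `exp_quota`. The helper names and the sample row are made up for illustration and are not part of the dataset or its tooling.

```python
# Codebooks transcribed from the schema; extend with further columns as needed.
CODEBOOKS = {
    "gender": {1: "male", 2: "female"},
    "age_group": {0: "0–5", 1: "6–14", 2: "15–17", 3: "18–29",
                  4: "30–44", 5: "45–59", 6: "60–74", 7: "75+"},
    "building_type": {1: "detached", 2: "semi-detached",
                      3: "multi-family", 4: "other"},
    "building_ownership": {1: "owner-occupied", 2: "rented"},
}


def decode(row, codebooks=CODEBOOKS):
    """Return a copy of row with known integer codes replaced by labels."""
    out = dict(row)
    for column, labels in codebooks.items():
        if column in out:
            out[column] = labels.get(out[column], f"unknown ({out[column]})")
    return out


def consumption_ratio(expenditure, income):
    """Total annual expenditure divided by annual net household income."""
    if income == 0:
        raise ValueError("income must be non-zero")
    return expenditure / income


# Made-up record, not taken from the dataset.
row = {"ags": "11000", "gender": 2, "age_group": 4, "building_type": 3,
       "expenditure": 24000.0, "household_income_num": 32000.0}

decoded = decode(row)
ratio = consumption_ratio(row["expenditure"], row["household_income_num"])
print(decoded["gender"], decoded["building_type"], ratio)
```

Codes missing from a codebook are kept visible as `unknown (<code>)` rather than dropped, which makes unexpected values easy to spot.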