Synthestat · UN · population QA

UN 1:1 population synthesis QA cycle

Country-specific layer for synthetic people in households, dwellings, real building stock where available, hidden-population overlays, and work/school assignment evidence.

Board: synthestat-population-qa · Tenant: synthestat · Country: UN · Task workflow status: ready · Artifact completion: no_review_bundle

Ideal-country quality criteria: impossible 1:1 benchmark

This is the common gold-standard benchmark for an ideal country. It is intentionally impossible to fully satisfy: complete success would mean a 1:1 replica of the real population where every person, household, dwelling, attribute, and assignment is exactly represented. The QA page uses it as an asymptote and gap taxonomy, not as a release promise.

Apply this same rubric to this country’s latest run, then report which needs are measured, constrained, modelled, unavailable, or blocked.

Need | Unachievable ideal | QA evidence we require instead | Why perfection cannot be achieved
Complete de jure resident coverage | Every real resident represented exactly once in the right country, municipality, small area, household, and dwelling. | Synthetic person count equals official population at all enforced geographies; no unexplained duplicate, missing, or out-of-universe people. | A true 1:1 resident list is a confidential population register and changes continuously; Synthestat can only match official aggregates and declared source universes.
Complete attribute truth | Each synthetic person has the same age, sex, household role, education, occupation, industry, origin, health proxy, income proxy, and lifecycle state as the corresponding real person. | Published marginal and cross-tab constraints pass within HARD/FIRM/SOFT tolerances; modelled fields carry uncertainty and measured/constrained/modelled provenance. | Official releases do not expose a complete individual joint distribution, and many attributes are survey-derived, lagged, suppressed, or unavailable at fine geography.
Perfect household and family structure | Every household contains the exact real members and relationships, including multi-generation, partnership, child, shared, institutional, and edge-case arrangements. | Household totals, household-type distributions, age/sex/role consistency, fertility/child constraints, and structural invariants pass with explicit residuals. | Household membership is sensitive microdata; public sources usually expose only aggregate household/family tables and partial cross-tabs.
Exact dwelling and building grounding | Every household is assigned to its real dwelling and building with exact occupancy, vacancy, dwelling type, floor area, tenure, and address-level geography. | Dwelling/building capacity checks pass; vacancy/second-home/institutional dwellings are represented or explicitly unavailable; building links have source provenance. | Many countries lack open address-level registers; dwelling occupancy is confidential and time-varying.
Complete de facto and hidden-population overlays | Homeless, undocumented, refugees, students away from home, seasonal, institutional, tourists, and daytime populations are all represented with exact location and timing. | Overlay layers use interval estimates, source-specific quality flags, and never silently modify de jure HARD constraints. | Hidden populations are partly unobserved by definition; ethical/privacy constraints forbid exact person-level labels.
Exact school, workplace, facility, and mobility assignment | Every person is assigned to the real school, workplace, care provider, commute, and daily activity chain they use. | Assignment layers use official registers/OD flows where available; modelled assignments are flagged and validated only against aggregate flows/capacities. | Operational assignments are usually protected registers or dynamic behavioural data; Phase 1 must not imply they are known.
Full joint-distribution realism | The full multivariate joint distribution is identical to reality across all attributes, households, geography, and rare subgroups. | High-priority marginals/cross-tabs pass; sparse zones and prior-dominated attributes are clearly marked with quality tiers and credible intervals. | The joint distribution is non-identifiable from published marginals; IPF/BN/hierarchical pooling choose plausible distributions, not truth.
Zero uncertainty and zero lag | All values are current today and known without error. | Every output records reference period, retrieval timestamp, lag, confidence, uncertainty bounds, and degradation decisions. | Official statistics are lagged, revised, sampled, suppressed, and harmonized after collection.
Privacy-safe yet maximally detailed release | The system releases maximum useful detail while creating zero re-identification risk. | Release mode, k-anonymity/cell safeguards, perturbation/aggregation policy, and sensitive-field treatment are explicit. | Fine-area synthetic microdata can still create structurally unique records; synthetic does not mean anonymous.
Perfect reproducibility and auditability | Any user can trace every output record to exact source snapshots, transformations, constraints, relaxations, seeds, and code versions. | Run manifests, source provenance, checksums, frozen extracts, seeds, versioned crosswalks, validation reports, and relaxation logs are complete. | This is approachable but never final: source portals, classifications, geography, and code keep changing, so audits must be continuously renewed.
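
The evidence required in the first row (synthetic person counts equal to official population at every enforced geography, with no duplicate or out-of-universe people) could be checked roughly as follows. This is a minimal sketch and not Synthestat's implementation: the function name coverage_gaps, the (person_id, zone) record layout, and the official mapping are all assumptions introduced for illustration.

```python
from collections import Counter


def coverage_gaps(persons, official):
    """Compare synthetic person counts with official totals per zone.

    persons:  iterable of (person_id, zone) tuples (hypothetical layout)
    official: dict mapping zone -> official population count (hypothetical)

    Returns (duplicates, residuals):
      duplicates: sorted list of person IDs that appear more than once
      residuals:  {zone: synthetic - official} for zones that do not match
    """
    persons = list(persons)

    # A resident must be represented exactly once.
    id_counts = Counter(pid for pid, _ in persons)
    duplicates = sorted(pid for pid, n in id_counts.items() if n > 1)

    # Take the union of zones so that out-of-universe zones (present in the
    # synthetic data but absent from the official table) are reported too.
    zone_counts = Counter(zone for _, zone in persons)
    residuals = {}
    for zone in sorted(set(zone_counts) | set(official)):
        diff = zone_counts.get(zone, 0) - official.get(zone, 0)
        if diff != 0:
            residuals[zone] = diff
    return duplicates, residuals
```

An empty duplicate list together with an empty residual mapping corresponds to the "no unexplained duplicate, missing, or out-of-universe people" condition; any non-empty result would still need an explicit explanation before it could count as evidence.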

Population artifact output status (separate from task status)

Artifact completion | Row count source | People | Target population | National coverage | Absolute shortfall | Households | Dwellings | Houses/buildings | Max marginal deviation | HARD status | Run
no_review_bundle | none | | | | | | | | | | no_bundle_yet

This table describes emitted population artifacts only. It is intentionally independent from the Kanban task workflow status below: a country can have all tasks done while its artifact is still only a seeded slice, or a passing review bundle can still lack national target completion. Deviation is the maximum absolute relative error across collected HARD/FIRM/SOFT marginal constraints in the latest review bundle. GUIDE/INFORMATIONAL priors are excluded. National target/coverage are read from build_manifest.json when available and override any visual impression of completion.
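
The deviation and coverage columns can be illustrated with a short sketch. It assumes each collected constraint is available as a record with hypothetical keys tier, target, and synthetic; it also assumes that coverage is people divided by the national target and that a zero target counts as passing only on an exact match. None of these details are confirmed by the review bundle format.

```python
# Only these tiers are scored; GUIDE/INFORMATIONAL priors are excluded.
SCORED_TIERS = {"HARD", "FIRM", "SOFT"}


def max_marginal_deviation(constraints):
    """Largest absolute relative error over scored marginal constraints.

    constraints: iterable of dicts with hypothetical keys
                 'tier', 'target', 'synthetic'.
    Returns None when no scored constraint is present (e.g. no bundle yet).
    """
    worst = None
    for c in constraints:
        if c["tier"] not in SCORED_TIERS:
            continue
        target = c["target"]
        if target == 0:
            # Assumption: relative error is undefined for a zero target,
            # so treat an exact match as 0 and anything else as infinite.
            err = 0.0 if c["synthetic"] == 0 else float("inf")
        else:
            err = abs(c["synthetic"] - target) / abs(target)
        worst = err if worst is None else max(worst, err)
    return worst


def national_coverage(people, target_population):
    """Return (coverage ratio, absolute shortfall) against the national target.

    Assumption: coverage = emitted people / target population.
    """
    shortfall = max(target_population - people, 0)
    return people / target_population, shortfall
```

Under these assumptions a seeded slice with exact marginals would show a deviation of 0 and still report low national coverage, which is the distinction the table is meant to keep visible.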

Kanban task workflow status (not artifact completion)

ready: 3 · todo: 1 · done: 12

These cards count board tasks only. They do not certify that the country-level population artifact is nationally complete or reviewer-approved.

Datasets and distributions

Lists come from the latest run bundle: source_provenance.json, distribution_diagnostics.json, and build_manifest.json.
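
When every list below is empty, it helps to know whether the bundle files are missing or merely contain no entries. The following is a minimal sketch of that check; only the three file names come from this page, while the function name missing_bundle_files and the idea of passing a run directory are assumptions.

```python
from pathlib import Path

# File names as listed on this page.
BUNDLE_FILES = (
    "source_provenance.json",
    "distribution_diagnostics.json",
    "build_manifest.json",
)


def missing_bundle_files(run_dir):
    """Return the bundle file names that are absent from run_dir.

    run_dir is a hypothetical path to a run's output directory. A directory
    that does not exist reports all three files as missing.
    """
    run_dir = Path(run_dir)
    return [name for name in BUNDLE_FILES if not (run_dir / name).is_file()]
```

A non-empty result would point to a missing bundle rather than to a bundle that genuinely lists zero datasets.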

Summary

Datasets used: 0
Distributions available: 0
Constraints/distributions used in synthesis: 0
Constraint types
Dataset variants
Finest-geography status

Source gaps

  • No source gaps listed.

Datasets used

Dataset/source ID
None listed yet.

Best source by distribution family

Distribution family | Dataset/source ID
None listed yet.

Available distributions / priors in registry

Spec | Label | Type | Geo | Status | Variant | Confidence | Data URI
None listed yet.

Constraints/distributions used in synthesis manifest

Constraint or distribution ID
None listed yet.

Current country tasks

ID | Title | Assignee | Status | Created | Latest summary
t_f12df715 | Resolve GR/EL publication mapping for Greece population QA status | synth-modeler | ready | 2026-05-19 22:05:23 CEST |
t_1aacf742 | Add guardrail test: release_review cannot pass on tiny seeded population artifacts | synth-modeler | ready | 2026-05-19 22:05:22 CEST |
t_757abce4 | Fix release/status language so seeded population slices cannot surface as national PASS | synth-modeler | ready | 2026-05-19 22:05:21 CEST |
t_e0e72e3c | manager: supervise Italy/Greece/Baltic population-QA expansion wave | synth-manager | todo | 2026-05-19 20:17:46 CEST |
t_1aeadf3a | Reviewer sanity sweep: national-scale claims versus actual artifact row counts | synth-reviewer | done | 2026-05-19 22:00:13 CEST | Completed reviewer sanity sweep and wrote the country-level matrix to /home/synthestat/output/reviews/national_scale_claims_sanity_sweep_2026-05-19.md. Verdict is NEEDS_MODEL_FIX for release/status presentation: top-level release surfaces still expose seeded-slice population counts and DK/FI/NL/NO/S
t_49963fe5 | manager: immediate active control loop after memory-scale spawn request | synth-manager | done | 2026-05-19 21:59:28 CEST | Completed active population-QA manager control loop: checked population-QA plus synthestat infra board, verified country status by actual parquet row counts/latest-run artifacts, routed PASS countries to final internal notes, and built continuation chains around stale blocked parents for IT/NL/LU/MT
t_42457a02 | Re-review CZ repaired household-prior bundle | synth-reviewer | done | 2026-05-19 21:42:47 CEST | Re-reviewed CZ repaired household-prior bundle at /home/synthestat/output/runs/CZ/cz_population_fixed_household_priors_36c1d87e_seed420987. Verdict PASS for the declared repaired household-prior/internal-review scope: required bundle files are present, HARD persons and FIRM households are exact, hou
t_4afa8f5b | Fix CZ household type-size/dependent-child prior integration | synth-modeler | done | 2026-05-19 21:00:24 CEST | Built repaired CZ household-prior review bundle at /home/synthestat/output/runs/CZ/cz_population_fixed_household_priors_36c1d87e_seed420987. HARD persons/FIRM households pass exact, household size/exclusive leaf type/dependent-child residuals pass exact, composition and person-household size mismatc
t_5bc37015 | Re-review XK candidate after residual bucket diagnostics fix | synth-reviewer | done | 2026-05-19 20:55:51 CEST | Re-reviewed XK rebuilt national candidate after the residual-bucket diagnostics fix and returned PASS for the declared internal municipality-level C-tier national-core scope. Independent parquet/JSON checks verified required bundle presence, exact HARD person/household totals, +12 FIRM dwelling resi
t_bc746218 | Fix XK candidate unresolved residual bucket diagnostics inconsistency | synth-modeler | done | 2026-05-19 20:47:46 CEST | Fixed and rebuilt the XK national candidate residual bucket diagnostics: parquet readback, household_diagnostics, uncertainty_summary, national_candidate_summary, and model_notes now agree on 432 unresolved bucket rows / 21,510 persons using the emitted-row size-sum definition. Added a parquet-vs-JS
t_efd44c77 | Review fixed XK national candidate household realism and diagnostics | synth-reviewer | done | 2026-05-19 20:43:20 CEST | Reviewed the fixed XK national candidate and returned BLOCKED_INVALID_OUTPUT. The row-level household realism fixes are verified, but the bundle is internally inconsistent: diagnostics/uncertainty report 18,497 unresolved residual persons while actual unresolved bucket household rows sum to 21,510;
t_40e91e8d | Review LT cycle-4 household-family realism model-fix bundle | synth-reviewer | done | 2026-05-19 20:39:53 CEST | Reviewed LT cycle-4 bundle at /home/synthestat/output/runs/LT/lt_population_review_cycle4_888b7387_seed420987 and returned PASS for the declared internal research-review scope. Independent checks confirmed complete contract files, 2,810,761 persons / 1,215,360 households / 1,215,360 dwellings, exact
t_20be5c35 | Review CZ household/non-demographic prior integration | synth-reviewer | done | 2026-05-19 20:26:46 CEST | Reviewed CZ targeted-prior bundle and returned NEEDS_MODEL_FIX. National person/household counts are exact and unsupported fine joints/assignments remain flagged, but household type-size/dependent-child prior integration is invalid with large residuals and impossible household compositions; handoff
t_9465d714 | manager: reconcile country-level population QA status vs fixture outputs | synth-manager | done | 2026-05-19 20:12:25 CEST | Reconciled country-level population QA status against current persons.parquet counts and reviewer/task history for CZ, SE, NL, LT, LU, AD, PT, MT, SI, BG, and XK. Wrote the status matrix at /home/synthestat/workspace/manager_handoffs/population_qa_country_status/latest.md, appended manager_updates.m
t_67b5031e | Route LU seeded-slice PASS to final human review/delivery | synth-manager | done | 2026-05-19 20:11:06 CEST | Routed the LU cycle-2 PASS into a final human-facing delivery/readiness-note task without expanding scope. The downstream note task must preserve that the PASS applies only to the internal seeded two-zone LU slice and is not nationwide Luxembourg 1:1 production readiness or external-release approval
t_e09e08ce | manager: continuous board oversight and reaction loop | synth-manager | done | 2026-05-19 19:56:07 CEST | Completed a population-QA board supervisor pass: inspected active/done/blocked tasks, routed stalled NL/AD/LU/LT/SE continuations, created final PASS delivery-note tasks for XK/PT/MT/SI, and kept CZ pressure on the >8-person national synthesis blocker with a dependent re-review. Wrote the concise su

Process

Manager kickoff

synth-manager creates and controls the country loop.

Model build

synth-modeler generates the review bundle: people, households, dwellings/buildings or unavailable markers, overlays, assignments, manifests, residuals, diagnostics, uncertainty, provenance.

Reviewer gate

synth-reviewer audits constraints, marginals, household/family realism, hidden populations, dwelling/building grounding, work/school assignment, uncertainty, provenance, and privacy.

Branch

PASS finalizes; NEEDS_MODEL_FIX routes back to modeler; NEEDS_MORE_SOURCES routes to marginal/distribution researchers then downloader; exhausted evidence/model plateau stops for human decision.
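
The branch rule can be read as a small routing function. The sketch below is one possible reading, not the board's actual logic: the Verdict enum, the next_steps function, the exhausted flag, and the step labels other than synth-modeler are hypothetical names chosen for illustration. Only the three verdicts named in the branch rule are handled.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    NEEDS_MODEL_FIX = "NEEDS_MODEL_FIX"
    NEEDS_MORE_SOURCES = "NEEDS_MORE_SOURCES"


def next_steps(verdict, exhausted=False):
    """Return the ordered next steps for a reviewer verdict.

    exhausted: hypothetical flag meaning evidence is exhausted or the model
               has plateaued, in which case the loop stops for a human.
    """
    if verdict is Verdict.PASS:
        return ["finalize"]
    if exhausted:
        # Stop condition: no automatic rerouting, a human decides.
        return ["human-decision"]
    if verdict is Verdict.NEEDS_MODEL_FIX:
        return ["synth-modeler"]
    if verdict is Verdict.NEEDS_MORE_SOURCES:
        # Researchers first, then the downloader, in that order.
        return ["marginal/distribution researchers", "downloader"]
    raise ValueError(f"unhandled verdict: {verdict!r}")
```

Checking the stop condition before rerouting keeps a plateaued country from cycling between modeler and reviewer indefinitely; whether the real loop orders these checks the same way is an assumption.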

Quality gates and stop conditions