2026-05-19 16:05:12 CEST · synth-reviewer
QA report — SE population synthesis QA cycle 1
country: SE
run_id: se_population_review_cycle1_a5ad12d7_seed420987
artifact_reviewed: /home/synthestat/output/runs/SE/se_population_review_cycle1_a5ad12d7_seed420987
verdict: NEEDS_MORE_SOURCES
confidence_in_verdict: high
summary: Bundle is contract-complete and honest enough to review, but it is an 8-person seeded/two-zone fixture, not a reviewable Sweden 1:1 country population. HARD residuals pass exactly and unavailable hidden/work-school layers are explicit, but production-scale SCB/Lantmäteriet evidence, hidden-population evidence, assignment evidence, and clearer geography/degradation metadata are needed before another model pass can plausibly reach PASS.
constraint_fit:
hard: pass_exact in constraint_residuals.json and /home/synthestat/output/SE/validation_report.parquet; 4/4 HARD rows pass with relative_error 0.0.
firm: 2/2 FIRM rows pass with max_abs_relative_error 0, against a declared policy tolerance of normally <=2%.
soft: no SOFT rows are reported under residuals_by_constraint_type in constraint_residuals, despite SOFT specs in the distribution registry; treat the current slice as too small/sparse to validate SOFT fit.
household_family_checks: 8 households / 8 persons, all single-person households; no children-alone/couple-age-gap issue can appear, but family composition and parent-child age-gap realism are explicitly unavailable/not materially validated.
dwelling_building_checks: 8 assignments, 8 dwellings, 3 seeded buildings, capacity_check ok; building source is seeded Lantmäteriet-style B2, not live building stock. Deterministic check found synthetic_dwellings.household_id is null for all dwellings while households and building_assignments carry dwelling links; canonical mapping still exists, but the backlink inconsistency should be fixed or documented.
hidden_population_checks: hidden_population_overlays.unavailable.json is explicit, names categories, required evidence, and states no effect on de jure population. This is honest, but not adequate for PASS unless manager/human narrows scope or sources are exhausted.
work_school_assignment_checks: work_school_assignments.unavailable.json is explicit and does not hallucinate assignments; Skolverket/workplace/facility/OD evidence and anchor integration remain missing.
distribution_checks: distribution_diagnostics reports 38/38 registry coverage but many fine/joint components are constrained/modelled, including 11 modelled correlational/health components and 7 guide entries below 0.7 confidence; uncertainty metadata is present, but the seeded output is not a distributional validation of Sweden.
geography_checks: only DESO_SE_TEST_001 and DESO_SE_TEST_002 are present. geography_quality_tiers reports tier A/well_constrained and degraded_zone_count 0 while degradation_notes say aggregate release review should treat both seeded zones as C/degraded; this is inconsistent and risks overclaiming geography quality.
uncertainty_provenance_checks: modelled_or_weakly_measured_components all have uncertainty metadata (credible_level plus CV or bounds URI). source_provenance has required fields for 9 sources, but all retrieval_timestamp values are null under an explicit policy; acceptable for seeded review, not sufficient for frozen production provenance.
privacy_release_checks: internal research release only; no full release risk review. Fine-geography uniqueness risk remains material because the output is tiny and fixture-like; do not treat as anonymous or production-safe.
critical_failures:
- Tiny seeded scope (8 persons, 8 households, 2 test zones) is not a reviewable national Sweden synthetic population.
- Hidden-population overlays are unavailable rather than evidence-backed/uncertainty-bounded.
- Work/school/facility assignments are unavailable rather than evidence-backed/uncertainty-bounded.
- Real building/dwelling grounding is seeded fixture only, not live Lantmäteriet/SCB production grounding.
- Geography quality metadata overclaims A/well_constrained while notes say seeded zones should be treated as C/degraded for aggregate release review.
model_fix_requests:
- After source acquisition, regenerate with production-scale SE population/household/building coverage rather than 8-person fixtures.
- Align geography_quality_tiers, zones_degraded, and model_notes so seeded/test/degraded zones cannot show as production-grade A without an explicit release-tier caveat.
- Fix or explicitly document null synthetic_dwellings.household_id values when household->dwelling and building_assignment mappings exist elsewhere.
- Add SOFT residual summaries where SOFT constraints are present in the registry, or explicitly mark them not exercised in the seeded validation slice.
source_gap_requests:
marginals:
- Freeze/catalogue production SCB DeSO/municipality population, household-dwelling, education, labour/RAMS-LISA, income, and geography extracts with retrieval timestamps/checksums.
- Freeze/catalogue Lantmäteriet building/geodata and any SCB dwelling-register-compatible evidence needed for production real-house grounding.
- Find and category-tag Swedish hidden-population counts/definitions for homelessness, asylum/refugees, Ukrainian/Syrian displaced people, undocumented/seasonal populations, students, and institutional populations, including reference periods and allocation geographies.
- Acquire school/facility register and workplace/commuting OD or destination marginals suitable for assignment calibration.
distributions:
- Improve or document unavailable joint priors for family composition, parent-child age gaps, couple gaps, coresidence, occupation/education/industry/income, commuting mode/distance, and hidden-pop allocation with uncertainty.
stopping_condition_assessment: This is cycle 1 and findings are not a repeated plateau. Evidence is not exhausted: concrete official/source families remain to freeze or integrate. Do not route to human exhaustion yet; route to source acquisition/downloader lanes before another modeler cycle.
recommended_next_cards:
- assignee: synth-marginals-researcher
title: SE production marginal/source freeze for SCB + Lantmäteriet + hidden/assignment categories
reason: Current bundle is seeded and lacks production-scale frozen sources/timestamps/checksums.
depends_on: t_33ff07f7
- assignee: synth-distributions-researcher
title: SE household-family, assignment, hidden-population, and correlation prior evidence review
reason: Family composition, hidden overlays, work/school assignment, and sparse joint attributes are unavailable or prior-dominated.
depends_on: t_33ff07f7
- assignee: synth-downloader
title: Freeze/catalogue concrete SE official source extracts after researcher source IDs are selected
reason: Provenance currently has null timestamps and seeded inventory references; production review needs immutable artefacts.
depends_on: source research outputs
- assignee: synth-modeler
title: Regenerate SE review bundle from production sources and aligned quality metadata
reason: Only after source freeze should modeler replace seeded fixture with reviewable production-scale artefacts.
depends_on: source/downloader outputs
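
illustrative_check_sketch: The two metadata inconsistencies flagged in this report (null synthetic_dwellings.household_id backlinks, and tier A reported for zones that the notes treat as C/degraded) could be caught by a deterministic pre-review check along the following lines. This is a minimal sketch, not the pipeline's actual check: the in-memory row layout, the dwelling_id field on households, the per-zone tier dictionaries, and the function names find_backlink_gaps and tier_conflicts are all assumptions made for illustration, and the sample rows are invented rather than taken from the bundle.

```python
def find_backlink_gaps(households, dwellings):
    """Return dwelling_ids whose household_id backlink disagrees with households.

    Assumption: households rows carry the canonical household -> dwelling link.
    """
    # Expected backlink per dwelling, derived from the households table.
    expected = {h["dwelling_id"]: h["household_id"] for h in households}
    gaps = []
    for d in dwellings:
        want = expected.get(d["dwelling_id"])
        # A null or mismatched backlink is a gap only if a household claims the dwelling.
        if want is not None and d.get("household_id") != want:
            gaps.append(d["dwelling_id"])
    return sorted(gaps)


def tier_conflicts(reported_tiers, note_tiers):
    """Return zone ids whose reported tier differs from the tier stated in notes."""
    return sorted(
        zone
        for zone, tier in reported_tiers.items()
        if zone in note_tiers and note_tiers[zone] != tier
    )


# Hypothetical sample rows (not read from the reviewed artifact).
households = [
    {"household_id": "H1", "dwelling_id": "D1"},
    {"household_id": "H2", "dwelling_id": "D2"},
]
dwellings = [
    {"dwelling_id": "D1", "household_id": None},  # null backlink, as flagged
    {"dwelling_id": "D2", "household_id": "H2"},  # consistent backlink
]
reported = {"DESO_SE_TEST_001": "A", "DESO_SE_TEST_002": "A"}
notes = {"DESO_SE_TEST_001": "C", "DESO_SE_TEST_002": "C"}

print(find_backlink_gaps(households, dwellings))  # ['D1']
print(tier_conflicts(reported, notes))  # ['DESO_SE_TEST_001', 'DESO_SE_TEST_002']
```

A non-empty result from either function would be surfaced as a model_fix_request rather than silently corrected, so the modeler can either fix the artefact or document the exception.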