2026-05-19 20:52:18 CEST · synth-reviewer
QA report for EE national-scale non-fixture rerun
country: EE
run_id: ee_population_private_household_national_2021_seed420987
artifact_reviewed: /home/synthestat/output/runs/EE/ee_population_private_household_national_2021_seed420987
verdict: NEEDS_MODEL_FIX
confidence_in_verdict: high
summary: The artifact is a national-scale private-household run, not the old 8-person fixture: actual parquet row counts are persons=1,317,666, households=561,655, dwellings=561,655. It matches the declared 2021 Statistics Estonia private-household HARD totals exactly and documents the 14,158 total-population residual as unavailable/non-private rather than fabricating hidden persons. However, it fails household-family realism badly because age-sex blocks are streamed sequentially into household slots: 174,570 households contain children and no adult; 132,650 of those are single-person child households; 174,571 child records are household reference_person. The verdict is therefore NEEDS_MODEL_FIX, not a pass for internal review.
constraint_fit:
hard: PASS for declared private-household counts only. Independent raw freeze check: RL21707 households=561,655; RL21703 private-household members=1,317,666; RL21001 total population=1,331,824, residual=14,158. Parquet row counts match declared private-household persons/households. Age-sex generated counts match distribution_diagnostics totals.
firm: No FIRM constraints declared.
soft: No SOFT constraints declared.
household_family_checks: FAIL. Per-size member sums are internally consistent, but household composition is impossible for ordinary private households: 174,570 child-without-adult households, 132,650 single-person child households, 174,571 child reference persons. Root cause is visible in the builder: person_rows streams age-sex blocks sequentially into household slots and assigns the first member as reference_person without adult/role safeguards.
dwelling_building_checks: DEGRADED/NOT PASS. One synthetic dwelling per private household is internally linked; real building assignment is explicitly unavailable pending Maa-amet reconciliation. This is acceptable as an explicit unavailable layer for a scoped review bundle, but not as real-house grounding.
hidden_population_checks: PASS for honesty of scope, not completeness. The 14,158 total-vs-private residual is documented as aggregate unavailable/non-private and not silently injected or labelled as hidden persons.
work_school_assignment_checks: UNAVAILABLE, honestly declared. No work/school assignments are claimed.
distribution_checks: MIXED. Age-sex counts are exact at declared private-household aggregate level. Socioeconomic attributes are unassigned/null and explicitly marked unavailable/modelled. Large-household diagnostics are inaccurate: raw RL21707 open class is 6-10 households=17,764 and members=114,596; actual generated split is size 6=9,752 and size 7=8,012, but household_diagnostics.json lists 6=16,966 and 7=798, so diagnostics do not match the parquet.
geography_checks: Scoped to country EE only. geography_quality_tiers.json labels country B and buildings C; no subnational geography is synthesized. This is explicit, not a silent dropped-zone issue.
uncertainty_provenance_checks: Mostly adequate for scoped candidate. Required review files are present (16 files). Provenance includes frozen Statistics Estonia and Maa-amet records with paths/checksums/timestamps. Uncertainty summary notes midpoint ages, unassigned socioeconomic attributes, missing collective/hidden/work-school/building layers. Gap: household-family structural uncertainty/invalidity is not surfaced as a failed diagnostic.
privacy_release_checks: Low immediate sensitivity because no fine geography, real building IDs, occupation, origin, nationality, or work/school assignments are present. Still, the synthetic person rows map 1:1 onto the national private-household count, so the bundle should not be described as anonymous by default.
critical_failures:
- Ordinary private-household composition is structurally invalid: 174,570 child-without-adult households, of which 132,650 are single-person child households.
- 174,571 minor records are household reference_person in ordinary private households.
- household_diagnostics.json mismatches actual generated size distribution for large households and does not report the child-alone failure.
model_fix_requests:
- Rebuild person-to-household assignment so household roles and adult/child composition are plausible under private-household rules while preserving HARD private-household totals.
- Add validation metrics for minor reference persons, child-alone households, household type/member composition coherence, and role/age plausibility.
- Correct household_diagnostics.json to distinguish measured open-class source controls from generated deterministic splits.
- Keep explicit unavailable statuses for real building assignment, hidden/non-private overlay, and work/school assignment until validated.
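
The validation metrics requested above could be sketched roughly as follows. This is a minimal illustration, not the builder's actual code: the keys `household_id`, `age`, and `role`, the function name, and the adult cutoff of 18 are all assumptions, since this report does not state the schema or the child/adult threshold.

```python
from collections import defaultdict

ADULT_AGE = 18  # assumption: the report does not state the child/adult cutoff


def household_composition_metrics(persons):
    """Count the structural failures named in this report.

    `persons` is an iterable of dicts with hypothetical keys
    `household_id`, `age`, and `role`.
    """
    ages_by_household = defaultdict(list)
    minor_reference_persons = 0
    for p in persons:
        ages_by_household[p["household_id"]].append(p["age"])
        if p["role"] == "reference_person" and p["age"] < ADULT_AGE:
            minor_reference_persons += 1

    # Households in which every member is below the adult cutoff.
    child_only = [ages for ages in ages_by_household.values()
                  if all(a < ADULT_AGE for a in ages)]
    return {
        "child_without_adult_households": len(child_only),
        "single_person_child_households": sum(1 for ages in child_only if len(ages) == 1),
        "minor_reference_persons": minor_reference_persons,
    }


# Toy data: household 1 is valid, households 2 and 3 contain no adult.
toy = [
    {"household_id": 1, "age": 40, "role": "reference_person"},
    {"household_id": 1, "age": 9, "role": "member"},
    {"household_id": 2, "age": 7, "role": "reference_person"},
    {"household_id": 3, "age": 12, "role": "reference_person"},
    {"household_id": 3, "age": 10, "role": "member"},
]
metrics = household_composition_metrics(toy)
print(metrics)
```

A rebuilt bundle would be expected to report zero for all three counters, or to document any deliberate exception explicitly.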
source_gap_requests:
marginals: []
distributions: []
stopping_condition_assessment: Neither evidence-exhausted nor model-improvement-exhausted. The failure lies in model logic and diagnostics, so route to synth-modeler.
recommended_next_cards:
- assignee: synth-modeler
title: EE model fix: repair invalid child-alone household assignment in national private-household bundle
reason: Required before any internal-review pass; created as t_1bf7ff30.
depends_on: t_8f902059
Evidence commands run: pyarrow parquet row/schema checks; independent frozen CSV target extraction; household composition audit; builder source inspection.
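
The large-household and residual figures quoted in this report can be re-derived with plain arithmetic. The sketch below uses only numbers stated above; the helper name `split_totals` is hypothetical, and it assumes sizes 6 and 7 are the only sizes present in each split.

```python
def split_totals(split):
    """Return (households, members) for a {household_size: household_count} split."""
    households = sum(split.values())
    members = sum(size * count for size, count in split.items())
    return households, members


# RL21707 open-class control as quoted in distribution_checks.
source_control = (17_764, 114_596)

generated = {6: 9_752, 7: 8_012}   # split measured in the parquet
diagnostics = {6: 16_966, 7: 798}  # split listed in household_diagnostics.json

generated_totals = split_totals(generated)
diagnostics_totals = split_totals(diagnostics)

print(generated_totals == source_control)  # True
print(diagnostics_totals)                  # (17764, 107382)

# Total population minus private-household members gives the documented residual.
residual = 1_331_824 - 1_317_666
print(residual)                            # 14158
```

Under that assumption, the generated split reproduces both the household and member controls, while the split in household_diagnostics.json reproduces the household count but implies 107,382 members, which cannot match the parquet.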