ADR 0005: Snapshot metadata and progress distribution
Date: 2025-09-21 Status: accepted
Context
The service periodically produces point-in-time analytics snapshots (/stats) which are used
by external aggregators to compute weekly rollups. When multiple service instances publish
snapshots, aggregators need lightweight metadata to deduplicate or attribute samples to instances.
Additionally, consumers requested a compact representation of project progress distribution
that is efficient to transmit and simple to compute on the producer side.
Decision
- Add optional instance metadata to each `SnapshotStats` payload:
  - `instance_id` (string, optional): an identifier for the producing service instance. Populated from the `INSTANCE_ID` environment variable when present.
  - `env` (string, optional): the runtime environment name (e.g., `prod`, `staging`). Populated from the `ENV` environment variable when present.
- Add a compact `progress_distribution` histogram to `SnapshotStats`:
  - Representation: array of 10 integers where index 0 counts projects with progress in [0..9]%, index 1 counts [10..19]%, …, index 9 counts [90..100]%. This is intentionally coarse but sufficient for visualizations and rollup computations.
- Producers will compute the histogram by inspecting `Project.Progress` (clamped to 0..100) and incrementing the appropriate bucket. Consumers can expand buckets as needed.
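The producer-side steps above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the `Project` and `SnapshotStats` types are reduced to the fields named in this ADR, `Progress` is assumed to be an integer percentage, and the helper names `bucketIndex`, `progressDistribution`, and `buildSnapshot` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Project is a hypothetical stand-in for the service's project type.
// Assumption: Progress is an integer percentage.
type Project struct {
	Progress int
}

// SnapshotStats lists only the fields added by this ADR.
// The JSON names follow the ADR; the Go field names are assumptions.
type SnapshotStats struct {
	InstanceID           string  `json:"instance_id,omitempty"`
	Env                  string  `json:"env,omitempty"`
	ProgressDistribution [10]int `json:"progress_distribution"`
}

// bucketIndex clamps progress to 0..100 and maps it to a bucket 0..9.
// Bucket 9 covers 90..100, so 100 is folded into the last bucket.
func bucketIndex(progress int) int {
	if progress < 0 {
		progress = 0
	}
	if progress > 100 {
		progress = 100
	}
	idx := progress / 10
	if idx > 9 {
		idx = 9
	}
	return idx
}

// progressDistribution builds the histogram in a single pass over projects.
func progressDistribution(projects []Project) [10]int {
	var hist [10]int
	for _, p := range projects {
		hist[bucketIndex(p.Progress)]++
	}
	return hist
}

// buildSnapshot fills the optional metadata from the environment.
// Unset variables yield empty strings, which omitempty drops from the JSON.
func buildSnapshot(projects []Project) SnapshotStats {
	return SnapshotStats{
		InstanceID:           os.Getenv("INSTANCE_ID"),
		Env:                  os.Getenv("ENV"),
		ProgressDistribution: progressDistribution(projects),
	}
}

func main() {
	projects := []Project{{Progress: 0}, {Progress: 9}, {Progress: 10}, {Progress: 95}, {Progress: 100}, {Progress: 130}}
	out, err := json.Marshal(buildSnapshot(projects))
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Using a fixed-size array rather than a slice keeps the bucket count part of the type, so a payload with the wrong number of buckets cannot be constructed by accident.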
Rationale:
- `instance_id` is lightweight and gives aggregators a mechanism to deduplicate or track per-instance contribution without introducing distributed consensus or durable instance registries.
- A fixed 10-bucket histogram is simple to compute, small to transmit, and covers the common visualization needs without needing full percentiles or heavy-weight sketches.
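On the aggregator side, deduplication and per-instance attribution might look like the sketch below. It rests on assumptions this ADR does not state: the `Snapshot` type and its `TakenAt` sample timestamp are hypothetical, and a duplicate is taken to mean the same `instance_id` reporting the same timestamp twice.

```go
package main

import "fmt"

// Snapshot is a hypothetical consumer-side view of a SnapshotStats payload.
// TakenAt is an assumed field; this ADR does not define a timestamp.
type Snapshot struct {
	InstanceID string
	TakenAt    string // assumed: opaque sample timestamp, e.g. RFC 3339
}

// dedupe drops snapshots whose (InstanceID, TakenAt) pair was already seen,
// preserving input order. Snapshots without an InstanceID are all kept,
// because there is nothing reliable to key on.
func dedupe(snaps []Snapshot) []Snapshot {
	type key struct{ id, at string }
	seen := make(map[key]bool)
	var out []Snapshot
	for _, s := range snaps {
		if s.InstanceID != "" {
			k := key{s.InstanceID, s.TakenAt}
			if seen[k] {
				continue
			}
			seen[k] = true
		}
		out = append(out, s)
	}
	return out
}

// samplesPerInstance counts samples per instance.
// The empty key collects snapshots that carried no instance_id.
func samplesPerInstance(snaps []Snapshot) map[string]int {
	counts := make(map[string]int)
	for _, s := range snaps {
		counts[s.InstanceID]++
	}
	return counts
}

func main() {
	snaps := []Snapshot{
		{InstanceID: "a", TakenAt: "2025-09-21T10:00:00Z"},
		{InstanceID: "a", TakenAt: "2025-09-21T10:00:00Z"}, // repeated delivery
		{InstanceID: "b", TakenAt: "2025-09-21T10:00:00Z"},
		{InstanceID: "", TakenAt: "2025-09-21T10:00:00Z"}, // no metadata
	}
	unique := dedupe(snaps)
	fmt.Println(len(unique), samplesPerInstance(unique))
}
```

No coordination between instances is needed here: the aggregator only compares fields that each producer set locally.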
Consequences
- Positive:
  - Aggregators can detect duplicate samples and attribute samples to instances.
  - Reduced bandwidth for progress distribution compared to sending full sample lists.
  - Simple producer implementation (single pass over projects).
- Negative / trade-offs:
  - Lossy histogram: percentiles can only be estimated to within one bucket, so they are less accurate than percentiles computed from the full distribution.
  - Reliance on the `INSTANCE_ID` environment variable being set correctly for deduplication. If it is absent, aggregators must fall back on other signals (e.g., source IP, service name + timestamp), which may be less reliable.
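To make the lossiness concrete, the sketch below estimates a percentile from the 10-bucket histogram using the nearest-rank method. The function name `percentileBucket` is hypothetical. Note that the result is a bucket range rather than a single value, since the histogram carries no information about where samples fall inside a bucket.

```go
package main

import (
	"fmt"
	"math"
)

// percentileBucket returns the progress range [lo, hi] of the bucket that
// contains the q-quantile (0 < q <= 1) under the nearest-rank method.
// ok is false for an empty histogram or an out-of-range q.
func percentileBucket(hist [10]int, q float64) (lo, hi int, ok bool) {
	total := 0
	for _, c := range hist {
		total += c
	}
	if total == 0 || q <= 0 || q > 1 {
		return 0, 0, false
	}
	// Nearest rank: the smallest 1-based rank covering a fraction q of samples.
	rank := int(math.Ceil(q * float64(total)))
	cum := 0
	for i, c := range hist {
		cum += c
		if cum >= rank {
			lo = i * 10
			hi = lo + 9
			if i == 9 {
				hi = 100 // the last bucket covers 90..100
			}
			return lo, hi, true
		}
	}
	return 0, 0, false
}

func main() {
	hist := [10]int{2, 1, 0, 0, 0, 0, 0, 0, 0, 3}
	lo, hi, ok := percentileBucket(hist, 0.5)
	fmt.Println(lo, hi, ok)
}
```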
Next steps
- Update `SPEC.md` and `docs/examples.md` to document `instance_id`, `env`, and `progress_distribution` (done).
- Add OpenAPI schema examples for the new fields (optional).
- Consider an ADR or note if a different histogram resolution or sketch (e.g., HDR or DDSketch) is later required.