Registry stack docs v0 · draft

Existing data is not service-ready

A district focal point in the Ministry of Education updates a school directory spreadsheet every quarter. The deworming team emails for the latest copy. The exam board emails for the latest copy. The EMIS pilot emails for the latest copy. Three months later, two of those teams are publishing numbers that do not match the focal point’s records, and nobody can say which list is authoritative.

The data exists and is being used. What is missing is a single place to ask for the current version, on the owner’s terms.

Every new integration starts with an email, a meeting, or a copy. Every integration ends with a private copy living in someone else’s system, beyond the original owner’s ability to audit, version, or revoke. The list of sources looks technical (spreadsheets, CSV exports, Parquet files, PostgreSQL tables, legacy applications), but the problem is institutional: there is no public contract for who can ask, for what, with what freshness, and on what authority.

The data owner becomes the permanent integration bottleneck. Every new caller is a meeting, an email thread, and a one-off agreement. The first integration takes months; the second re-runs the same negotiation; the third finds out the field meanings have drifted in the meantime.

The owner also loses control of what was shared. Copies of the spreadsheet land in other ministries’ systems, processed by tools the original owner cannot see. Asking those callers to “use the new version” is a request, not a control.

A reviewer arriving later finds no record of who asked, when, for what purpose, or what was disclosed. The integrations work, but the owner cannot say who has the current data, who has a stale copy, or whether anyone is still using it at all.

Data warehouses centralise the data for analysis but turn the warehouse team into the new gatekeeper. The source owner loses visibility into how the data is being used downstream, and the analytical copy drifts from the operational source.

Generic API wrappers put the source online quickly. They also expose internal column names, storage layout, and filter behaviour that nobody reviewed. The first security review usually finds these exposures.
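To make the exposure concrete, here is a minimal Python sketch. The row, the column names, and both functions are hypothetical illustrations, not the behaviour of any specific wrapper tool. The first function returns whatever the storage layer holds; the second returns only fields that someone has reviewed and declared.

```python
# Hypothetical source row, as a generic wrapper might read it from storage.
ROW = {
    "sch_id": 1042,
    "sch_nm": "Hillside Primary",
    "dist_cd": "D07",
    "head_phone_internal": "555-0100",  # never meant to leave the ministry
    "_import_batch": "2024Q1-b",        # storage detail
}

# Hypothetical reviewed mapping: public field name -> internal column.
DECLARED_FIELDS = {
    "school_id": "sch_id",
    "name": "sch_nm",
    "district": "dist_cd",
}

def generic_wrapper(row):
    """Return the row as stored: every column, under its internal name."""
    return dict(row)

def declared_projection(row, declared=DECLARED_FIELDS):
    """Return only the declared fields, under their public names."""
    return {public: row[internal] for public, internal in declared.items()}

# Columns the generic wrapper exposes that nobody declared or reviewed.
leaked = set(generic_wrapper(ROW)) - set(DECLARED_FIELDS.values())
print(sorted(leaked))
print(declared_projection(ROW))
```

The set difference at the end is the kind of list a security review would produce: columns that went online only because they happened to be in the table.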

One-time exports work well for a pilot and break at the second source change. By the time the third integration team requests the data, the export script lives on one laptop and nobody is sure when it last ran.

All three approaches (warehouses, API wrappers, and one-time exports) work locally. They stall when the same registry must serve multiple callers with different purposes, field access, freshness expectations, and audit requirements.

The source stays where it is. The contract over it becomes public.

Registry Manifest publishes the description of what the registry actually offers: which datasets, which entities, which fields are authoritative, which policy applies, which evidence shapes are available.

Registry Relay binds that description to the underlying source and serves it as protected, read-only, entity-shaped routes. Callers are authorised at the API layer; the source owner does not ship copies.

Registry Atlas lets integrators and reviewers inspect the published description before any code is written.

The school directory now has a public contract: who may ask, for what, with what freshness, on what authority, and with what audit trail. The spreadsheet still lives where it has always lived. For the broader model, see the architecture overview.

Registry Manifest renders discovery artifacts for catalogues, JSON Schemas, SHACL shapes, policies, and evidence offerings. Registry Relay can read configured CSV, XLSX, Parquet, and PostgreSQL sources, project declared entity fields, enforce caller scopes, and write audit records. Registry Atlas is a supporting inspection tool for metadata and service-discovery review.
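The Relay behaviours listed above (project declared entity fields, enforce caller scopes, write audit records) can be sketched in a few lines of Python. This is an illustration under assumptions, not Registry Relay's implementation or configuration format: the `ENTITY` declaration, the scope names, the `serve` function, and the audit record shape are all hypothetical, and an in-memory CSV stands in for the configured source.

```python
import csv
import io
from datetime import datetime, timezone

# Stand-in for the owner's source; in practice it stays where it lives.
SOURCE_CSV = """sch_id,sch_nm,dist_cd,enrolment
1042,Hillside Primary,D07,412
1043,Riverside Secondary,D07,980
"""

# Hypothetical entity declaration: public field -> (source column, required scope).
ENTITY = {
    "school_id": ("sch_id", "schools:read"),
    "name": ("sch_nm", "schools:read"),
    "district": ("dist_cd", "schools:read"),
    "enrolment": ("enrolment", "schools:enrolment"),
}

AUDIT_LOG = []  # a real deployment would use durable storage

def serve(caller, scopes, purpose):
    """Read-only: return only the fields the caller's scopes allow, and log the request."""
    allowed = {f: col for f, (col, scope) in ENTITY.items() if scope in scopes}
    # Record the request whether or not it is granted.
    AUDIT_LOG.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "caller": caller,
        "purpose": purpose,
        "fields": sorted(allowed),
        "granted": bool(allowed),
    })
    if not allowed:
        raise PermissionError(f"{caller} holds no scope for this entity")
    rows = csv.DictReader(io.StringIO(SOURCE_CSV))
    return [{f: row[col] for f, col in allowed.items()} for row in rows]

records = serve("deworming-team", {"schools:read"}, "campaign planning")
print(records[0])
print(AUDIT_LOG[-1]["fields"])
```

In this sketch every request, granted or refused, leaves an audit record, which is what would let a later reviewer answer who asked, when, for what purpose, and what was disclosed.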

The current surfaces are documented in the Registry Manifest overview, Registry Relay overview, and Registry Atlas overview.

Registry Stack does not replace the source registry, clean source data automatically, or provide a full master data management program. Write APIs, provisioning workflows, stream ingestion, and automated data repair are out of scope for the current Registry Relay pages.

Registry Forge is future-facing preparation work for data readiness. It is not a current product promise on this site.