Registry stack docs v0 · draft

Registry Relay overview

Registry Relay turns sensitive government tabular files and selected database tables into protected, read-only, domain-oriented APIs. It sits between a caller (a social protection program system, a statistics office, an identity verification service) and the raw source data. For every request it enforces auth, scopes what the caller can see, shapes records into declared entities, and writes a tamper-evident audit record. It is neither an open-data portal nor a spreadsheet wrapper.

The upstream source is registry-relay. Shared auth, audit, OpenID Connect (OIDC), HTTP security, outbound HTTP, crypto, and SD-JWT helper primitives come from registry-platform.

Stack commitments: safeguards, interoperability.

Registry Relay maps private storage to public APIs through two layers.

Storage tables read CSV, XLSX, Parquet, or PostgreSQL sources at startup or on refresh. Table identifiers are private; they never appear in public URLs or responses.

Entities are the public domain resources: things like household or individual, with field projection, declared relationships, configured aggregates, semantic metadata, and audit records. Entity names are what callers use in paths such as /datasets/social_registry/individual.

This split means the physical shape of the source data can change without altering the public API contract. Only fields that appear in entities[].fields are exposed. Filters are accepted only when declared under entities[].api.allowed_filters. Arbitrary SQL is never passed through to the database.
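The projection and filter allow-list described above can be sketched as follows. This is only an illustration, not Registry Relay's implementation: the EntityConfig class, the project and validate_filters helpers, and the sample field names are hypothetical and merely mirror the entities[].fields and entities[].api.allowed_filters settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityConfig:
    """Hypothetical stand-in for one entities[] entry in the runtime config."""
    name: str
    fields: tuple           # mirrors entities[].fields
    allowed_filters: tuple  # mirrors entities[].api.allowed_filters


class FilterNotAllowed(ValueError):
    """Raised when a caller filters on a field that was not declared."""


def validate_filters(entity, requested):
    """Accept only filters declared on the entity; reject everything else."""
    for name in requested:
        if name not in entity.allowed_filters:
            raise FilterNotAllowed(name)
    return dict(requested)


def project(entity, row):
    """Return only declared fields, so other source columns never leak."""
    return {key: row[key] for key in entity.fields if key in row}


# Illustrative data; none of these names come from a real deployment.
individual = EntityConfig(
    name="individual",
    fields=("id", "birth_year"),
    allowed_filters=("birth_year",),
)
source_row = {"id": "ind-001", "birth_year": 1990, "case_worker_notes": "internal"}
public_row = project(individual, source_row)
```

In this sketch an undeclared column such as case_worker_notes is dropped from responses and also rejected as a filter, which is the behavior the allow-list is meant to guarantee.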

Registry Relay reads four source types.

  • CSV files, with a configurable header row.
  • XLSX files, with optional sheet, header_row, and data_range hints for worksheets that have header notes around the data table.
  • Parquet files, inferred from the .parquet extension.
  • PostgreSQL in two materialization modes: snapshot (loaded at startup or on refresh) and live (structured table only, with column projection pushdown for filter-free scans).

Stream ingestion (Kafka, pub/sub, event streaming) is not supported in v0.

Registry Relay runs in exactly one auth mode at a time, configured at startup. Two modes are supported: API key (auth.mode: api_key) and OIDC resource-server (auth.mode: oidc). Full configuration details, scope assignment, and key provisioning are in Authorize callers.

Scopes are independent and granular. The scope suffixes are metadata, rows, aggregate, evidence_verification, and admin. A key or token must carry <dataset_id>:<suffix> for the operation it needs. A metadata-only caller cannot read rows; a rows caller cannot run aggregates.
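A minimal sketch of this independence, assuming scopes are compared as exact <dataset_id>:<suffix> strings. The function names are hypothetical; only the suffix list comes from the text above.

```python
# Scope suffixes as listed above.
SCOPE_SUFFIXES = frozenset(
    {"metadata", "rows", "aggregate", "evidence_verification", "admin"}
)


def required_scope(dataset_id, suffix):
    """Build the <dataset_id>:<suffix> scope string an operation needs."""
    if suffix not in SCOPE_SUFFIXES:
        raise ValueError(f"unknown scope suffix: {suffix}")
    return f"{dataset_id}:{suffix}"


def is_authorized(granted, dataset_id, suffix):
    """Exact match only: holding one suffix never implies another."""
    return required_scope(dataset_id, suffix) in granted


# A caller holding only the rows scope on one dataset.
rows_caller = {"social_registry:rows"}
can_read_rows = is_authorized(rows_caller, "social_registry", "rows")
can_aggregate = is_authorized(rows_caller, "social_registry", "aggregate")
```

Because the check is a plain set membership test, there is no hierarchy to reason about: a caller that needs both rows and aggregates must be granted both scopes.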

Every authenticated request produces one JSONL audit record written to a configured sink (stdout, file with in-process rotation, or syslog). Audit records are separate from operational logs (which go to stderr).

Key fields in each record include:

  • ts: ISO-8601 UTC with millisecond precision.
  • request_id: ULID, identical to the X-Request-Id response header.
  • principal_id: stable identifier of the authenticated caller.
  • auth_mode: api_key or oidc.
  • dataset_id, entity_name, table_id: resolved from the request path.
  • scopes_used: scopes actually checked on this request.
  • query_params: redacted parameter inventory (names and operators, never values).
  • purpose: verbatim Data-Purpose header value when present.
  • status_code, error_code, duration_ms.
  • verification_id, verification_decision, claim_hash, evidence_hash: populated on evidence-verification requests.

Registry Platform audit envelopes wrap records with prev_hash and record_hash fields for tamper evidence.
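To illustrate how prev_hash and record_hash can provide tamper evidence, here is a generic hash-chain sketch. It is not the Registry Platform envelope format: the SHA-256 digest, the sorted-key JSON serialization, the all-zero genesis value, and the record key are assumptions made for the example.

```python
import hashlib
import json

# Assumed starting value for the first envelope in a chain.
GENESIS = "0" * 64


def _digest(prev_hash, record):
    """Hash the previous hash together with a stable serialization of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def wrap(prev_hash, record):
    """Wrap an audit record in an envelope linked to its predecessor."""
    return {
        "prev_hash": prev_hash,
        "record_hash": _digest(prev_hash, record),
        "record": record,
    }


def verify_chain(envelopes):
    """Recompute every link; any edited, removed, or reordered record breaks it."""
    prev = GENESIS
    for envelope in envelopes:
        if envelope["prev_hash"] != prev:
            return False
        if envelope["record_hash"] != _digest(prev, envelope["record"]):
            return False
        prev = envelope["record_hash"]
    return True


first = wrap(GENESIS, {"request_id": "req-1", "status_code": 200})
second = wrap(first["record_hash"], {"request_id": "req-2", "status_code": 403})
chain = [first, second]
```

Changing any field of an earlier record changes its recomputed hash, so verification fails at that record and the alteration is detectable.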

Audit records never contain raw API keys, raw query values for sensitive fields, or row-level data. Identifier fields marked sensitive: true in entity config are deterministically hashed in audit rather than logged verbatim.
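One common way to hash identifiers deterministically is a keyed HMAC, sketched below. The text above does not say which construction Registry Relay uses, so the HMAC-SHA256 choice, the key handling, and the helper names are all assumptions for illustration.

```python
import hashlib
import hmac


def hash_identifier(value, key):
    """Keyed, deterministic hash: the same input and key give the same digest."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def audit_safe(values, sensitive, key):
    """Replace values of fields marked sensitive with their keyed hash."""
    return {
        name: hash_identifier(value, key) if name in sensitive else value
        for name, value in values.items()
    }


# Hypothetical key and field names; a real key would come from secret storage.
demo_key = b"example-only-key"
logged = audit_safe(
    {"national_id": "1234567890", "region": "north"},
    sensitive={"national_id"},
    key=demo_key,
)
```

Determinism keeps audit entries correlatable (the same identifier always maps to the same digest) without writing the identifier itself to the audit sink.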

Registry Relay publishes metadata at runtime through /metadata/* routes, using the same renderers provided by registry-manifest-core. These routes expose caller-scoped views filtered by the authenticated principal’s metadata scopes.

The /metadata/* routes are the canonical standards-facing metadata surface. They do not grant row, aggregate, evidence-verification, or admin access.

Metadata can also be published statically, without running Registry Relay, using the registry-manifest-cli publish command. The static bundle includes catalog.json, dcat.jsonld (DCAT JSON-LD), shacl.jsonld (SHACL node shapes), entity JSON Schemas, and an index.json discovery entry point.

The publishing pipeline explanation covers the portable-vs-runtime split and when to use each path.

The portable manifest and runtime config split

You keep two files: a portable metadata manifest (metadata.yaml) and a runtime config (config.yaml). Registry Relay validates that they match at startup.

The metadata.yaml manifest describes datasets, entities, fields, policies, and evidence offerings in standards-facing terms. It must not contain source paths, table names, scopes, or Relay runtime backend URLs.

The runtime config.yaml binds those logical concepts to actual files, database tables, scopes, filters, aggregates, and refresh settings.

Startup failures produce stable error codes such as runtime.binding.dataset_missing and metadata.manifest.validation_failed; the process exits non-zero so misconfiguration is immediately visible.
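The fail-fast behavior can be sketched as a startup check that compares the two files and exits non-zero on any mismatch. This is an assumption-laden illustration: the function names are hypothetical, and it assumes runtime.binding.dataset_missing means a dataset declared in the manifest has no binding in the runtime config.

```python
def check_bindings(manifest_datasets, config_datasets):
    """Report each manifest dataset that has no runtime binding (assumed meaning)."""
    missing = sorted(set(manifest_datasets) - set(config_datasets))
    return [f"runtime.binding.dataset_missing: {dataset}" for dataset in missing]


def exit_status(errors):
    """Non-zero exit status when any startup error was found."""
    return 1 if errors else 0


# Illustrative dataset identifiers, not taken from a real deployment.
errors = check_bindings(
    manifest_datasets=["social_registry", "civil_registry"],
    config_datasets=["social_registry"],
)
status = exit_status(errors)
```

A real entry point would print the errors to stderr and pass the status to sys.exit, so a supervisor or container runtime sees the failure immediately.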

Two additional capability families are available but not active by default. Each requires a binary built with the corresponding compile-time feature flag.

OGC API Records: exposes a standards-conformant records surface at /ogc/v1/records with a single datasets collection. Each item describes a visible dataset; no row data is exposed through this surface.

SP DCI sync adapter: maps Social Protection Digital Convergence Initiative (SP DCI) query and response shapes to normal Relay entity reads. Without this capability in the build, any standards.spdci config block is rejected with spdci.config.feature_disabled.

Use Registry Relay for read-only, entity-shaped consultation APIs over structured sources. For claim evaluation and credential issuance, see Registry Notary.

Registry Relay’s current release tag is v0.1.1. The Cargo package version remains 0.1.0 at that tag. The following capabilities are not yet supported:

  • Live keyring reload. Key and JWKS reloads require a process restart.
  • Config reload without restart. Dataset data can be reloaded via the admin listener, but the config file itself is not re-read at runtime.
  • PostgreSQL live mode for query sources. Live materialization is supported only for structured table sources, not for query sources.
  • Stream ingestion. Kafka, pub/sub, and event-streaming backends are not supported.
  • Remote signing backend (KMS). The only supported production signing path is local software Ed25519. The kms signer kind is reserved and rejected by v1 config validation.

The rename from the legacy underscore form registry_relay to the hyphenated registry-relay is complete in the current release line. No legacy underscore references appear in the API, config surface, binary name, or Cargo.toml.