Source and claim modeling

This guide helps adopter teams design the source connections and claims that Registry Notary will evaluate. It complements the config reference by focusing on modeling choices: what belongs in an upstream source, what belongs in a Notary claim, and how to avoid accidental over-collection.

Registry Notary does four separate jobs:

  1. Authenticate the caller and check scopes.
  2. Read the minimum required data from configured source registries.
  3. Evaluate a configured claim from that source data or dependent claims.
  4. Return a claim result, render a supported format, or issue a credential.

The source registry remains the system of record. Notary should not become a copy of the registry, and a sidecar should not decide whether a claim is true. Keep source connectors narrow and keep claim semantics in Notary config.

| Connector | Use when | Config value |
| --- | --- | --- |
| DCI | The upstream speaks a DCI-style search envelope | connector: dci |
| Registry Data API | The upstream exposes /v1/datasets/{dataset}/entities/{entity}/records lookups | connector: registry_data_api |
| OpenFn sidecar | A pinned OpenFn adaptor or workflow must execute outside Notary for single reads or batch matching | connector: openfn_sidecar |

Prefer the simplest direct source. Add an OpenFn sidecar when the target system needs adaptor code, request shaping, credential handling, or normalization that does not belong inside Notary itself.

A source connection is a reusable upstream target:

evidence:
  source_connections:
    civil_registry:
      base_url: https://registry.example.gov
      source_auth:
        type: oauth2_client_credentials
        token_url: https://registry.example.gov/oauth2/client/token
        client_id_env: CIVIL_REGISTRY_CLIENT_ID
        client_secret_env: CIVIL_REGISTRY_CLIENT_SECRET
        request_format: json
      max_in_flight: 8
      retry_on_5xx: true
      bulk_mode: none
      dci:
        search_path: /registry/sync/search
        sender_id: registry-notary
        query_type: idtype-value
        records_path: /message/search_response/0/data/reg_records

Design rules:

  • Configure exactly one of token_env or source_auth.
  • Use HTTPS source URLs in shared environments.
  • Keep max_in_flight below the upstream’s safe concurrency limit.
  • Leave retry_on_5xx: true for idempotent reads.
  • Set retry_on_5xx: false for sidecar worker flows that must not repeat.
  • Use bulk_mode: none until the source contract has been tested.
  • Use bulk_mode: openfn_sidecar_batch only for OpenFn sidecar batch matching, after the sidecar contract and per-item cardinality have been tested.
  • Keep field_paths and claim-level fields limited to what claims need.
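
Some of these rules can be checked mechanically before a config is deployed. The sketch below is a minimal illustration, not part of Registry Notary: `check_source_connection` is a hypothetical helper that takes one source connection as a plain mapping (for example, loaded from the YAML above). Two points are assumptions: `allow_insecure_localhost` is treated as an exemption from the HTTPS rule, and `retry_on_5xx` is treated as defaulting to true when absent.

```python
from urllib.parse import urlparse


def check_source_connection(name: str, conn: dict) -> list[str]:
    """Return human-readable problems found in one source connection mapping."""
    problems = []

    # Rule: configure exactly one of token_env or source_auth.
    has_token = "token_env" in conn
    has_auth = "source_auth" in conn
    if has_token == has_auth:
        problems.append(f"{name}: set exactly one of token_env or source_auth")

    # Rule: use HTTPS source URLs.
    # Assumption: allow_insecure_localhost marks a deliberate localhost exception.
    scheme = urlparse(conn.get("base_url", "")).scheme
    if scheme != "https" and not conn.get("allow_insecure_localhost", False):
        problems.append(f"{name}: base_url should use https")

    # Rule: sidecar batch flows must not repeat, so retries stay off.
    # Assumption: retry_on_5xx counts as true when the key is absent.
    if conn.get("bulk_mode", "none") == "openfn_sidecar_batch" and conn.get("retry_on_5xx", True):
        problems.append(f"{name}: set retry_on_5xx: false for sidecar batch mode")

    return problems


civil_registry = {
    "base_url": "https://registry.example.gov",
    "source_auth": {"type": "oauth2_client_credentials"},
    "retry_on_5xx": True,
    "bulk_mode": "none",
}
print(check_source_connection("civil_registry", civil_registry))  # []
```

A check like this complements the rules that need human judgment, such as the upstream's safe concurrency limit; it does not replace them.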

DCI sources use a search endpoint and an envelope shape. Check these fields with the source owner:

  • search_path: DCI search path relative to base_url.
  • sender_id: Notary identity sent to the source.
  • receiver_id: optional source receiver identity.
  • query_type: usually idtype-value.
  • registry_type, registry_event_type, record_type: source-specific envelope fields.
  • records_path: JSON Pointer to records in a single response.
  • bulk_records_path: JSON Pointer used inside each batched response item.
  • max_results: default is 2 so Notary can distinguish not found, exactly one, and ambiguous.
  • field_paths: source-level JSON Pointer aliases for fields used by claims.
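
To make `records_path` and the `max_results` default concrete, here is a small sketch of how a JSON Pointer selects records from a response and how a result capped at two separates the three outcomes. The response body is invented to fit the example `records_path` shown earlier, and the outcome labels (`not_found`, `exactly_one`, `ambiguous`) are illustrative names, not Notary's actual result codes.

```python
def resolve_pointer(document, pointer: str):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON (dicts and lists)."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = document
    for raw in pointer[1:].split("/"):
        # Unescape ~1 before ~0, as RFC 6901 requires.
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def classify(records: list) -> str:
    """Map a record count (capped at two by the source) onto three outcomes."""
    if not records:
        return "not_found"
    return "exactly_one" if len(records) == 1 else "ambiguous"


# Hypothetical response body shaped to match the example records_path.
response = {
    "message": {
        "search_response": [
            {"data": {"reg_records": [{"UIN": "123", "birth_date": "2001-05-17"}]}}
        ]
    }
}
records = resolve_pointer(response, "/message/search_response/0/data/reg_records")
print(classify(records))  # exactly_one
```

Asking the source for at most two records is enough to tell "one" from "more than one" without retrieving every match.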

For OpenCRVS-style DCI, confirm whether the token endpoint expects JSON or form encoding. The config default is form; the OpenCRVS demo uses request_format: json.

Registry Data API sources expose lookup-style reads:

GET /v1/datasets/{dataset}/entities/{entity}/records?{lookup_field}={lookup_value}&fields=a,b&limit=2
Authorization: Bearer <source-token>
Data-Purpose: <purpose>

Successful responses use:

{ "data": [{ "field": "value" }] }

Use this connector when an upstream already exposes this shape, or when an internal sidecar normalizes a target system into it.
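
For source owners who want to exercise this contract from a test script, the request can be assembled as in the sketch below. `build_lookup_request` and `records_from` are hypothetical helpers, and the dataset, entity, field names, and token are placeholders; only the path template, query parameters, headers, and response envelope come from the contract above.

```python
from urllib.parse import quote, urlencode


def build_lookup_request(base_url, dataset, entity, lookup_field, lookup_value,
                         fields, purpose, token, limit=2):
    """Return (url, headers) for one lookup-style read."""
    path = (
        f"/v1/datasets/{quote(dataset, safe='')}"
        f"/entities/{quote(entity, safe='')}/records"
    )
    # Keep commas literal so the fields list reads as fields=a,b.
    query = urlencode(
        {lookup_field: lookup_value, "fields": ",".join(fields), "limit": limit},
        safe=",",
    )
    headers = {"Authorization": f"Bearer {token}", "Data-Purpose": purpose}
    return f"{base_url.rstrip('/')}{path}?{query}", headers


def records_from(body: dict) -> list:
    """Pull the record list out of a successful response body."""
    return body["data"]


url, headers = build_lookup_request(
    "https://registry.example.gov", "civil_registry", "civil_person",
    "national_id", "AB 123", ["birth_date"],
    "benefit_eligibility_check", "example-token",
)
print(url)
print(records_from({"data": [{"field": "value"}]}))
```

The `limit=2` default mirrors the contract's example request, which lets the caller detect an ambiguous lookup without fetching more records.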

The OpenFn sidecar is a separate process that runs pinned worker code and normalizes a target system into Notary’s source-read contracts. Use the first-class connector for new configs:

evidence:
  source_connections:
    openfn_crvs:
      base_url: http://127.0.0.1:9191
      allow_insecure_localhost: true
      token_env: OPENFN_SIDECAR_TOKEN
      retry_on_5xx: false
  claims:
    - id: date-of-birth
      title: Date of birth
      version: 2026-06
      subject_type: person
      value:
        type: date
      inputs:
        - name: target.identifiers.national_id
          type: string
      source_bindings:
        crvs:
          connector: openfn_sidecar
          connection: openfn_crvs
          required_scope: civil_registry:evidence_verification
          dataset: civil_registry
          entity: civil_person
          lookup:
            input: target.identifiers.national_id
            field: national_id
            op: eq
            cardinality: one
          fields:
            birth_date:
              field: birth_date
              type: date
              required: true
      rule:
        type: extract
        source: crvs
        field: birth_date

Use the sidecar when the target system needs:

  • An adaptor or workflow to fetch data.
  • Credential material that should stay out of Notary config.
  • Output normalization.
  • A private worker process boundary.
  • Per-source smoke checks before Notary depends on it.

Boundary rules:

  • Notary owns caller policy, matching policy, minimization, error collapsing, audit, disclosure, credential issuance, and the decision about whether a source result satisfies a claim.
  • The sidecar owns adaptor execution, target-service credentials, source comparison, output normalization, runtime/adaptor pinning, and worker isolation.
  • OpenFn sidecar batch matching is a source-read optimization. It is not a new matching model, authorization model, disclosure model, identity proof model, or credential issuance path. A batch match is semantically equivalent to running the same source binding as single reads for each item.
  • The sidecar must be reachable only over localhost or a private pod network from Notary. Do not expose it publicly or place it behind an internet-facing ingress.
  • Pin worker runtime and adaptor versions.
  • Store sidecar target credentials in sidecar env, not in Notary config.
  • Return no more than two records for a lookup.
  • Return only normalized fields needed by Notary.
  • Do not put claim logic in the sidecar.
  • Set retry_on_5xx: false on the Notary source connection. Notary does not retry OpenFn worker execution failures.
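
The two output rules for the sidecar (at most two records, only the needed fields) amount to a small projection step at the end of a worker. Real OpenFn workers are not written in Python, so the sketch below only illustrates the logic; `minimize_records` and the sample field names are hypothetical.

```python
def minimize_records(records: list[dict], needed_fields: list[str], limit: int = 2) -> list[dict]:
    """Keep at most `limit` records and only the fields Notary asked for."""
    trimmed = records[:limit]
    return [
        {key: record[key] for key in needed_fields if key in record}
        for record in trimmed
    ]


# Hypothetical raw rows from a target system, with more detail than a claim needs.
raw = [
    {"national_id": "123", "birth_date": "2001-05-17", "address": "12 Example Road"},
    {"national_id": "123", "birth_date": "2001-05-17", "address": "9 Other Street"},
    {"national_id": "123", "birth_date": "2001-05-17", "address": "4 Third Avenue"},
]
print(minimize_records(raw, ["national_id", "birth_date"]))
```

The step drops fields and extra rows but makes no judgment about the result: deciding whether two returned records mean "ambiguous" stays with Notary.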

See ../crates/registry-notary-openfn-sidecar/README.md for sidecar manifest and worker details.

OpenFn sidecar batch matching uses a dedicated contract: POST /v1/datasets/{dataset}/entities/{entity}/records:batchMatch. Notary calls this route when bulk_mode: openfn_sidecar_batch is set on a source connection and the request contains multiple subjects. For the full request and response shapes, field rules, cardinality semantics, and HTTP error codes, see the OpenFn Sidecar Source API section of the API reference.

Use bulk_mode: openfn_sidecar_batch on the source connection and connector: openfn_sidecar on every binding that points to that connection. The binding may use either single-field lookup or multi-field query_fields.

evidence:
  source_connections:
    openfn_crvs:
      base_url: http://127.0.0.1:9191
      allow_insecure_localhost: true
      token_env: OPENFN_SIDECAR_TOKEN
      retry_on_5xx: false
      bulk_mode: openfn_sidecar_batch
      bulk_timeout_max_ms: 30000
  claims:
    - id: birth-record-exists
      title: Birth record exists
      version: 2026-06
      subject_type: person
      value:
        type: boolean
      operations:
        batch_evaluate:
          enabled: true
          max_subjects: 100
      inputs:
        - name: target.attributes.given_name
          type: string
        - name: target.attributes.family_name
          type: string
        - name: target.attributes.birthdate
          type: date
      source_bindings:
        crvs:
          connector: openfn_sidecar
          connection: openfn_crvs
          required_scope: civil_registry:evidence_verification
          dataset: civil_registry
          entity: civil_person
          lookup:
            input: target.attributes.birthdate
            field: birthdate
            op: eq
            cardinality: one
          query_fields:
            - input: target.attributes.given_name
              field: given_name
              op: eq
            - input: target.attributes.family_name
              field: family_name
              op: eq
            - input: target.attributes.birthdate
              field: birthdate
              op: eq
          matching:
            policy_id: civil-person-name-birthdate-v1
            method: exact_name_birthdate
            target_type: Person
            allowed_purposes:
              - benefit_eligibility_check
            sufficient_target_inputs:
              - [target.attributes.given_name, target.attributes.family_name, target.attributes.birthdate]
            allowed_target_inputs:
              - target.attributes.given_name
              - target.attributes.family_name
              - target.attributes.birthdate
            collapse_matching_errors: true
            confidence: high
          fields:
            national_id:
              field: national_id
              type: string
              required: true
            birth_date:
              field: birth_date
              type: date
              required: true
      rule:
        type: exists
        source: crvs

A claim should express one decision or one extracted value. Good examples:

  • birth-record-exists
  • date-of-birth
  • farmer-under-4ha
  • household-enrolled-in-program

Avoid claims such as person-profile or full-registry-record. Those tend to over-collect, over-disclose, and become hard to authorize safely.

Every claim should answer:

  • Which target entity is being evaluated?
  • Is requester identity or relationship context needed?
  • Which caller scope may evaluate it?
  • Which source fields are required?
  • What happens when no record is found?
  • What happens when multiple records are found?
  • Is the output a value, a predicate, or a redacted assertion?
  • Can this claim be issued as a credential?

A source binding connects a claim to one source read:

source_bindings:
  birth_record:
    connector: dci
    connection: civil_registry
    required_scope: civil_registry:evidence_verification
    dataset: civil_registry
    entity: birth_registration
    lookup:
      input: target.identifiers.national_id
      field: UIN
      op: eq
      cardinality: one
    query_fields:
      - input: target.identifiers.national_id
        field: UIN
        op: eq
    fields:
      birth_date:
        field: birth_date
        type: date
        required: true

Important choices:

  • required_scope: scope the caller must have before this binding can read the source.
  • lookup.input: request lookup path, such as target.id, target.identifiers.<scheme>, target.attributes.<name>, requester.id, requester.identifiers.<scheme>, requester.attributes.<name>, or relationship.attributes.<name>.
  • lookup.field: upstream identifier field.
  • lookup.cardinality: use one when the claim needs exactly one record.
  • query_fields: optional multi-field lookup override. Use it when the source supports querying by more than one request path, such as first name, last name, and date of birth. Leave it empty for single-field lookup.
  • fields: only fields needed by the rule.
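
To illustrate how `lookup.input` and `lookup.field` relate, the sketch below walks a dotted request path and pairs the value it finds with the upstream field name. The request shape (nested mappings under `target`) and the helper `resolve_input` are assumptions made for illustration; the API reference defines the real request schema.

```python
def resolve_input(request: dict, path: str):
    """Walk a dotted lookup path such as target.identifiers.national_id."""
    node = request
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # missing input: the caller decides how to reject
        node = node[key]
    return node


# Lookup portion of a binding, mirroring the example above.
binding_lookup = {"input": "target.identifiers.national_id", "field": "UIN", "op": "eq"}

# Hypothetical request body; only the path convention comes from this guide.
request = {"target": {"identifiers": {"national_id": "1234567890"}}}

value = resolve_input(request, binding_lookup["input"])
source_filter = {binding_lookup["field"]: value}
print(source_filter)  # {'UIN': '1234567890'}
```

The request-side name (`national_id`) and the upstream name (`UIN`) stay decoupled, so a source can rename its identifier field without changing what callers send.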

Use separate bindings when a claim needs data from multiple registries. Use claim dependencies when a rule can reuse previous claim outputs instead of reading the same source again.

Use exists when the fact is the presence of exactly one source record:

rule:
  type: exists
  source: birth_record

Use extract when the claim returns a source field:

rule:
  type: extract
  source: birth_record
  field: birth_date

Use cel when the claim is derived from source fields or dependent claim results:

depends_on:
  - farmed-land-size
rule:
  type: cel
  expression: "claims.farmed_land_size.value < 4.0"
  bindings:
    claims:
      farmed_land_size:
        claim: farmed-land-size

CEL-enabled builds evaluate expressions in a hardened worker process and apply Notary-owned limits to expressions, root bindings, and worker frames. Prefer exists or extract when they express the claim clearly.
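
The difference between exists and extract can be shown with a toy evaluator over records that have already been read. This is a sketch of the semantics described above, not Notary's implementation: `evaluate_rule` is hypothetical, and how it treats zero or multiple records (false for exists, an error for extract) is an assumption. CEL rules are left out because they run in a separate worker.

```python
def evaluate_rule(rule: dict, sources: dict[str, list[dict]]):
    """Evaluate an exists or extract rule against already-read source records."""
    records = sources[rule["source"]]
    if rule["type"] == "exists":
        # The fact is the presence of exactly one source record.
        return len(records) == 1
    if rule["type"] == "extract":
        # Assumption: anything other than one record is an error, not a value.
        if len(records) != 1:
            raise LookupError("extract needs exactly one source record")
        return records[0][rule["field"]]
    raise ValueError(f"unsupported rule type: {rule['type']}")


sources = {"birth_record": [{"birth_date": "2001-05-17"}]}
print(evaluate_rule({"type": "exists", "source": "birth_record"}, sources))
print(evaluate_rule({"type": "extract", "source": "birth_record", "field": "birth_date"}, sources))
```

An exists rule never touches field values, which is one reason it is the least revealing choice when a yes/no answer is enough.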

Disclosure config controls what the caller can ask Notary to reveal:

disclosure:
  default: redacted
  allowed:
    - value
    - redacted

For privacy-sensitive claims, prefer redacted or predicate outputs. Allow value only when the relying party genuinely needs the value.
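
One way to reason about this config is as a default plus an allowlist. The sketch below assumes that a request without an explicit disclosure mode gets the default and that a mode outside `allowed` is rejected; `resolve_disclosure` is a hypothetical helper, and Notary's real behavior for a disallowed mode is defined in the API reference.

```python
from typing import Optional


def resolve_disclosure(config: dict, requested: Optional[str] = None) -> str:
    """Pick the disclosure mode for one request from a claim's disclosure config."""
    mode = config["default"] if requested is None else requested
    # Assumption: a mode outside the allowlist is refused rather than downgraded.
    if mode not in config["allowed"]:
        raise PermissionError(f"disclosure mode not allowed for this claim: {mode}")
    return mode


disclosure = {"default": "redacted", "allowed": ["value", "redacted"]}
print(resolve_disclosure(disclosure))           # redacted
print(resolve_disclosure(disclosure, "value"))  # value
```

Keeping `default` at the least revealing mode means a caller has to ask for the raw value explicitly.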

The formats key controls which response formats can be rendered for the claim. Include application/vnd.registry-notary.claim-result+json for standard JSON claim results. Add SD-JWT VC issuance through a credential profile rather than by adding broad render formats.

A claim can be issued as a credential only when both sides agree:

claims:
  - id: birth-record-exists
    credential_profiles:
      - birth_record_sd_jwt
credential_profiles:
  birth_record_sd_jwt:
    allowed_claims:
      - birth-record-exists

This two-way relationship prevents a profile from accidentally issuing from a claim that was not designed for that credential, and prevents a claim from being issued by an unrelated profile.
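
The two-way rule reduces to a conjunction: the claim must list the profile, and the profile must list the claim. A minimal sketch, with `can_issue` as a hypothetical helper operating on the config fragments above loaded as plain mappings:

```python
def can_issue(claim: dict, profile_id: str, profiles: dict) -> bool:
    """True only when the claim names the profile and the profile names the claim."""
    claim_allows = profile_id in claim.get("credential_profiles", [])
    profile_allows = claim["id"] in profiles.get(profile_id, {}).get("allowed_claims", [])
    return claim_allows and profile_allows


claim = {"id": "birth-record-exists", "credential_profiles": ["birth_record_sd_jwt"]}
profiles = {"birth_record_sd_jwt": {"allowed_claims": ["birth-record-exists"]}}

print(can_issue(claim, "birth_record_sd_jwt", profiles))  # True
# A claim that does not list the profile cannot be issued, even if the profile lists it.
print(can_issue({"id": "birth-record-exists"}, "birth_record_sd_jwt", profiles))  # False
```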

Batch evaluation lets one request evaluate many target items for a claim. It should be enabled only when the source and caller are ready for that access pattern:

operations:
  batch_evaluate:
    enabled: true
    max_subjects: 100

evidence.inline_batch_limit sets a general default. The claim-level max_subjects key caps how many target entries a batch request's items[] may contain for that claim. Set it lower when a source is sensitive or slow.

Bulk source modes are separate from API batch evaluation:

  • none: one source read per target item.
  • dci_batched_search: DCI source supports a batched search envelope.
  • rda_in_filter: Registry Data API source supports an in-style filter, and the operator attests that each lookup is unique.
  • openfn_sidecar_batch: OpenFn sidecar source supports POST /v1/datasets/{dataset}/entities/{entity}/records:batchMatch with a shared query_signature.

Do not enable bulk modes until contract tests prove response shape, cardinality, and source limits. Notary does not retry OpenFn worker execution failures; keep retry_on_5xx: false on OpenFn sidecar connections.
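
A contract test for a bulk mode can lean on the rule that a batch read must return the same results as the equivalent single reads. The sketch below shows the shape of such a test against an in-memory fake. `single_read` and `batch_read` are hypothetical stand-ins for calls to a test source; a real test would replace them with calls to the source under test.

```python
FAKE_REGISTRY = [
    {"national_id": "1", "given_name": "Ana"},
    {"national_id": "2", "given_name": "Ben"},
    {"national_id": "2", "given_name": "Benjamin"},  # deliberate ambiguous case
]


def single_read(item: dict) -> list[dict]:
    """One source read per item, capped at two records."""
    matches = [r for r in FAKE_REGISTRY if r["national_id"] == item["national_id"]]
    return matches[:2]


def batch_read(items: list[dict]) -> list[list[dict]]:
    """One batched read; results are returned in the same order as the items."""
    by_id: dict[str, list[dict]] = {}
    for record in FAKE_REGISTRY:
        by_id.setdefault(record["national_id"], []).append(record)
    return [by_id.get(item["national_id"], [])[:2] for item in items]


def mismatched_items(items: list[dict]) -> list[int]:
    """Return indexes where the batch result differs from the single-read result."""
    batch = batch_read(items)
    return [i for i, item in enumerate(items) if batch[i] != single_read(item)]


items = [{"national_id": "1"}, {"national_id": "2"}, {"national_id": "404"}]
print(mismatched_items(items))  # []
```

Including found, ambiguous, and not-found items in the same batch exercises the per-item cardinality that the bulk mode has to preserve.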

Claims and source bindings carry purpose through the request path. Use stable, human-reviewable purpose values such as:

  • benefit_eligibility_check
  • wallet_credential_issuance
  • program_enrollment_verification

Avoid using free-form user text as purpose. Purpose values should be part of the deployment’s policy review, source-owner agreement, and audit review.
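
In code that calls Notary, a reviewed purpose list can be enforced with a simple allowlist check before any request is sent. The sketch is client-side illustration only; `ALLOWED_PURPOSES` and `check_purpose` are hypothetical names, and the values are the examples listed above.

```python
ALLOWED_PURPOSES = frozenset({
    "benefit_eligibility_check",
    "wallet_credential_issuance",
    "program_enrollment_verification",
})


def check_purpose(purpose: str, allowed: frozenset = ALLOWED_PURPOSES) -> str:
    """Return the purpose unchanged if it is on the reviewed list, else raise."""
    if purpose not in allowed:
        raise ValueError(f"purpose is not on the reviewed list: {purpose!r}")
    return purpose


print(check_purpose("benefit_eligibility_check"))
```

Because the list is a fixed set of identifiers, free-form user text fails the check instead of reaching the source or the audit trail.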

Before enabling a claim, confirm that:

  • The claim id is stable and specific.
  • The claim reads the fewest possible source fields.
  • The source owner has confirmed lookup field, cardinality, and response shape.
  • Missing, ambiguous, and upstream-error behavior are acceptable to the relying party.
  • Caller scopes match source-owner access policy.
  • Disclosure defaults to the least revealing useful output.
  • Credential issuance is explicitly allowed by both claim and profile.
  • Batch and bulk modes are disabled until source contracts are tested.
  • OpenFn sidecars normalize data only and do not decide claims.
  • OpenFn sidecars run on localhost or a private pod network, never as a public endpoint.
  • doctor --live passes against a controlled test target.

Run non-live checks first:

registry-notary doctor --config registry-notary.yaml

Then run a live probe only with a controlled test target:

registry-notary doctor \
--config registry-notary.yaml \
--live

Live doctor probes can contact the upstream source. Use test data, document the purpose with the source owner, and keep probe output out of screenshots or support tickets unless it has been reviewed for disclosure.