Registry stack documentation: machine-readable Markdown.
Index of all pages: https://docs.registrystack.org/llms.txt
Full corpus: https://docs.registrystack.org/llms-full.txt

# Source and claim modeling

> This guide helps adopter teams design the source connections and claims that

This guide helps adopter teams design the source connections and claims that
Registry Notary will evaluate. It complements the config reference by focusing
on modeling choices: what belongs in an upstream source, what belongs in a
Notary claim, and how to avoid accidental over-collection.

## Mental Model

Registry Notary does four separate jobs:

1. Authenticate the caller and check scopes.
2. Read the minimum required data from configured source registries.
3. Evaluate a configured claim from that source data or dependent claims.
4. Return a claim result, render a supported format, or issue a credential.

The source registry remains the system of record. Notary should not become a
copy of the registry, and a sidecar should not decide whether a claim is true.
Keep source connectors narrow and keep claim semantics in Notary config.

## Pick The Source Connector

| Connector | Use when | Config value |
| --- | --- | --- |
| DCI | The upstream speaks a DCI-style search envelope | `connector: dci` |
| Registry Data API | The upstream exposes `/v1/datasets/{dataset}/entities/{entity}/records` lookups | `connector: registry_data_api` |
| OpenFn sidecar | A pinned OpenFn adaptor or workflow must execute outside Notary for single reads or batch matching | `connector: openfn_sidecar` |

Prefer the simplest direct source. Add an OpenFn sidecar when the target system
needs adaptor code, request shaping, credential handling, or normalization that
does not belong inside Notary itself.

## Source Connection Design

A source connection is a reusable upstream target:

```yaml
evidence:
  source_connections:
    civil_registry:
      base_url: https://registry.example.gov
      source_auth:
        type: oauth2_client_credentials
        token_url: https://registry.example.gov/oauth2/client/token
        client_id_env: CIVIL_REGISTRY_CLIENT_ID
        client_secret_env: CIVIL_REGISTRY_CLIENT_SECRET
        request_format: json
      max_in_flight: 8
      retry_on_5xx: true
      bulk_mode: none
      dci:
        search_path: /registry/sync/search
        sender_id: registry-notary
        query_type: idtype-value
        records_path: /message/search_response/0/data/reg_records
```

Design rules:

- Configure exactly one of `token_env` or `source_auth`.
- Use HTTPS source URLs in shared environments.
- Keep `max_in_flight` below the upstream's safe concurrency limit.
- Leave `retry_on_5xx: true` for idempotent reads.
- Set `retry_on_5xx: false` for sidecar worker flows that must not repeat.
- Use `bulk_mode: none` until the source contract has been tested.
- Use `bulk_mode: openfn_sidecar_batch` only for OpenFn sidecar batch matching,
  after the sidecar contract and per-item cardinality have been tested.
- Keep `field_paths` and claim-level `fields` limited to what claims need.

## DCI Sources

DCI sources use a search endpoint and an envelope shape. Check these fields with
the source owner:

- `search_path`: DCI search path relative to `base_url`.
- `sender_id`: Notary identity sent to the source.
- `receiver_id`: optional source receiver identity.
- `query_type`: usually `idtype-value`.
- `registry_type`, `registry_event_type`, `record_type`: source-specific
  envelope fields.
- `records_path`: JSON Pointer to records in a single response.
- `bulk_records_path`: JSON Pointer used inside each batched response item.
- `max_results`: default is 2 so Notary can distinguish not found, exactly one,
  and ambiguous.
- `field_paths`: source-level JSON Pointer aliases for fields used by claims.

For OpenCRVS-style DCI, confirm whether the token endpoint expects JSON or form
encoding. The config default is form; the OpenCRVS demo uses
`request_format: json`.

## Registry Data API Sources

Registry Data API sources expose lookup-style reads:

```text
GET /v1/datasets/{dataset}/entities/{entity}/records?{lookup_field}={lookup_value}&fields=a,b&limit=2
Authorization: Bearer <source-token>
Data-Purpose: <purpose>
```

Successful responses use:

```json
{ "data": [{ "field": "value" }] }
```

Use this connector when an upstream already has the shape or when an internal
sidecar normalizes a target system into that shape.

## OpenFn Sidecar Sources

The OpenFn sidecar is a separate process that runs pinned worker code and
normalizes a target system into Notary's source-read contracts. Use the
first-class connector for new configs:

```yaml
evidence:
  source_connections:
    openfn_crvs:
      base_url: http://127.0.0.1:9191
      allow_insecure_localhost: true
      token_env: OPENFN_SIDECAR_TOKEN
      retry_on_5xx: false

  claims:
    - id: date-of-birth
      title: Date of birth
      version: 2026-06
      subject_type: person
      value:
        type: date
      inputs:
        - name: target.identifiers.national_id
          type: string
      source_bindings:
        crvs:
          connector: openfn_sidecar
          connection: openfn_crvs
          required_scope: civil_registry:evidence_verification
          dataset: civil_registry
          entity: civil_person
          lookup:
            input: target.identifiers.national_id
            field: national_id
            op: eq
            cardinality: one
          fields:
            birth_date:
              field: birth_date
              type: date
              required: true
      rule:
        type: extract
        source: crvs
        field: birth_date
```

Use the sidecar when the target system needs:

- An adaptor or workflow to fetch data.
- Credential material that should stay out of Notary config.
- Output normalization.
- A private worker process boundary.
- Per-source smoke checks before Notary depends on it.

Boundary rules:

- Notary owns caller policy, matching policy, minimization, error collapsing,
  audit, disclosure, credential issuance, and the decision about whether a
  source result satisfies a claim.
- The sidecar owns adaptor execution, target-service credentials, source
  comparison, output normalization, runtime/adaptor pinning, and worker
  isolation.
- OpenFn sidecar batch matching is a source-read optimization. It is not a new
  matching model, authorization model, disclosure model, identity proof model,
  or credential issuance path. A batch match is semantically equivalent to
  running the same source binding as single reads for each item.
- The sidecar must be reachable only over localhost or a private pod network
  from Notary. Do not expose it publicly or place it behind an internet-facing
  ingress.
- Pin worker runtime and adaptor versions.
- Store sidecar target credentials in sidecar env, not in Notary config.
- Return no more than two records for a lookup.
- Return only normalized fields needed by Notary.
- Do not put claim logic in the sidecar.
- Set `retry_on_5xx: false` on the Notary source connection. Notary does not
  retry OpenFn worker execution failures.

See
[`../crates/registry-notary-openfn-sidecar/README.md`](https://github.com/jeremi/registry-notary/blob/313a226b92a6e2075003b04353dcbdab335c2c10/crates/registry-notary-openfn-sidecar/README.md)
for sidecar manifest and worker details.

### OpenFn Batch Matching Contract

OpenFn sidecar batch matching uses a dedicated POST contract. Notary calls this
route when `bulk_mode: openfn_sidecar_batch` is set on a source connection and
the request contains multiple subjects. The contract is semantically equivalent
to running the same source binding as single reads for each item. For the full
request and response shapes, field rules, cardinality semantics, and HTTP error
codes, see the
[OpenFn Sidecar Source API section of the API reference](https://github.com/jeremi/registry-notary/blob/f182385a5065873aac030c41d9fe020704afc4e2/docs/api-reference.md#openfn-sidecar-source-api).

### OpenFn Batch Config

Use `bulk_mode: openfn_sidecar_batch` on the source connection and
`connector: openfn_sidecar` on every binding that points to that connection.
The binding may use either single-field `lookup` or multi-field `query_fields`.

```yaml
evidence:
  source_connections:
    openfn_crvs:
      base_url: http://127.0.0.1:9191
      allow_insecure_localhost: true
      token_env: OPENFN_SIDECAR_TOKEN
      retry_on_5xx: false
      bulk_mode: openfn_sidecar_batch
      bulk_timeout_max_ms: 30000

  claims:
    - id: birth-record-exists
      title: Birth record exists
      version: 2026-06
      subject_type: person
      value:
        type: boolean
      operations:
        batch_evaluate:
          enabled: true
          max_subjects: 100
      inputs:
        - name: target.attributes.given_name
          type: string
        - name: target.attributes.family_name
          type: string
        - name: target.attributes.birthdate
          type: date
      source_bindings:
        crvs:
          connector: openfn_sidecar
          connection: openfn_crvs
          required_scope: civil_registry:evidence_verification
          dataset: civil_registry
          entity: civil_person
          lookup:
            input: target.attributes.birthdate
            field: birthdate
            op: eq
            cardinality: one
          query_fields:
            - input: target.attributes.given_name
              field: given_name
              op: eq
            - input: target.attributes.family_name
              field: family_name
              op: eq
            - input: target.attributes.birthdate
              field: birthdate
              op: eq
          matching:
            policy_id: civil-person-name-birthdate-v1
            method: exact_name_birthdate
            target_type: Person
            allowed_purposes:
              - benefit_eligibility_check
            sufficient_target_inputs:
              - [target.attributes.given_name, target.attributes.family_name, target.attributes.birthdate]
            allowed_target_inputs:
              - target.attributes.given_name
              - target.attributes.family_name
              - target.attributes.birthdate
            collapse_matching_errors: true
            confidence: high
          fields:
            national_id:
              field: national_id
              type: string
              required: true
            birth_date:
              field: birth_date
              type: date
              required: true
      rule:
        type: exists
        source: crvs
```

## Claim Boundaries

A claim should express one decision or one extracted value. Good examples:

- `birth-record-exists`
- `date-of-birth`
- `farmer-under-4ha`
- `household-enrolled-in-program`

Avoid claims such as `person-profile` or `full-registry-record`. Those tend to
over-collect, over-disclose, and become hard to authorize safely.

Every claim should answer:

- Which target entity is being evaluated?
- Is requester identity or relationship context needed?
- Which caller scope may evaluate it?
- Which source fields are required?
- What happens when no record is found?
- What happens when multiple records are found?
- Is the output a value, a predicate, or a redacted assertion?
- Can this claim be issued as a credential?

## Source Bindings

A source binding connects a claim to one source read:

```yaml
source_bindings:
  birth_record:
    connector: dci
    connection: civil_registry
    required_scope: civil_registry:evidence_verification
    dataset: civil_registry
    entity: birth_registration
    lookup:
      input: target.identifiers.national_id
      field: UIN
      op: eq
      cardinality: one
    query_fields:
      - input: target.identifiers.national_id
        field: UIN
        op: eq
    fields:
      birth_date:
        field: birth_date
        type: date
        required: true
```

Important choices:

- `required_scope`: scope the caller must have before this binding can read the
  source.
- `lookup.input`: request lookup path, such as `target.id`,
  `target.identifiers.<scheme>`, `target.attributes.<name>`, `requester.id`,
  `requester.identifiers.<scheme>`, `requester.attributes.<name>`, or
  `relationship.attributes.<name>`.
- `lookup.field`: upstream identifier field.
- `lookup.cardinality`: use `one` when the claim needs exactly one record.
- `query_fields`: optional multi-field lookup override. Use it when the source
  supports querying by more than one request path, such as first name, last
  name, and date of birth. Leave it empty for single-field lookup.
- `fields`: only fields needed by the rule.

Use separate bindings when a claim needs data from multiple registries. Use
claim dependencies when a rule can reuse previous claim outputs instead of
reading the same source again.

## Rule Types

Use `exists` when the fact is the presence of exactly one source record:

```yaml
rule:
  type: exists
  source: birth_record
```

Use `extract` when the claim returns a source field:

```yaml
rule:
  type: extract
  source: birth_record
  field: birth_date
```

Use `cel` when the claim is derived from source fields or dependent claim
results:

```yaml
depends_on:
  - farmed-land-size
rule:
  type: cel
  expression: "claims.farmed_land_size.value < 4.0"
  bindings:
    claims:
      farmed_land_size:
        claim: farmed-land-size
```

CEL-enabled builds evaluate expressions in a hardened worker process and apply
Notary-owned limits to expressions, root bindings, and worker frames. Prefer
`exists` or `extract` when they express the claim clearly.

## Disclosure And Formats

Disclosure config controls what the caller can ask Notary to reveal:

```yaml
disclosure:
  default: redacted
  allowed:
    - value
    - redacted
```

For privacy-sensitive claims, prefer redacted or predicate outputs. Allow
`value` only when the relying party genuinely needs the value.

`formats` controls renderable response formats for the claim. Include
`application/vnd.registry-notary.claim-result+json` for standard JSON claim
results. Add SD-JWT VC issuance through a credential profile rather than by
adding broad render formats.

## Credential Eligibility

A claim can be issued as a credential only when both sides agree:

```yaml
claims:
  - id: birth-record-exists
    credential_profiles:
      - birth_record_sd_jwt

credential_profiles:
  birth_record_sd_jwt:
    allowed_claims:
      - birth-record-exists
```

This two-way relationship prevents a profile from accidentally issuing from a
claim that was not designed for that credential, and prevents a claim from being
issued by an unrelated profile.

## Batch And Bulk Reads

Batch evaluation lets one request evaluate many target items for a claim. It
should be enabled only when the source and caller are ready for that access
pattern:

```yaml
operations:
  batch_evaluate:
    enabled: true
    max_subjects: 100
```

`evidence.inline_batch_limit` sets a general default. The claim-level
`max_subjects` config key caps the number of batch `items[]` target entries for
a claim, and should be lower when a source is sensitive or slow.

Bulk source modes are separate from API batch evaluation:

- `none`: one source read per target item.
- `dci_batched_search`: DCI source supports a batched search envelope.
- `rda_in_filter`: Registry Data API source supports an `in` style filter and
  the operator attests that each lookup is unique.
- `openfn_sidecar_batch`: OpenFn sidecar source supports
  `POST /v1/datasets/{dataset}/entities/{entity}/records:batchMatch` with a
  shared `query_signature`.

Do not enable bulk modes until contract tests prove response shape,
cardinality, and source limits. Notary does not retry OpenFn worker execution
failures; keep `retry_on_5xx: false` on OpenFn sidecar connections.

## Purpose Propagation

Claims and source bindings carry purpose through the request path. Use stable,
human-reviewable purpose values such as:

- `benefit_eligibility_check`
- `wallet_credential_issuance`
- `program_enrollment_verification`

Avoid using free-form user text as purpose. Purpose values should be part of the
deployment's policy review, source-owner agreement, and audit review.

## Modeling Checklist

- The claim id is stable and specific.
- The claim reads the fewest possible source fields.
- The source owner has confirmed lookup field, cardinality, and response shape.
- Missing, ambiguous, and upstream-error behavior are acceptable to the relying
  party.
- Caller scopes match source-owner access policy.
- Disclosure defaults to the least revealing useful output.
- Credential issuance is explicitly allowed by both claim and profile.
- Batch and bulk modes are disabled until source contracts are tested.
- OpenFn sidecars normalize data only and do not decide claims.
- OpenFn sidecars run on localhost or a private pod network, never as a public
  endpoint.
- `doctor --live` passes against a controlled test target.

## Testing With Doctor

Run non-live checks first:

```sh
registry-notary doctor --config registry-notary.yaml
```

Then run a live probe only with a controlled test target:

```sh
registry-notary doctor \
  --config registry-notary.yaml \
  --live
```

Live doctor probes can contact the upstream source. Use test data, document the
purpose with the source owner, and keep probe output out of screenshots or
support tickets unless it has been reviewed for disclosure.