MVP demo: register one real recording end-to-end (ENG-1063)

This runbook drives the first real-R2 ingest of a recording through Ursa and verifies it from a different machine. It is the runnable proof that Ursa’s “user can call an API and get data anywhere” promise holds end-to-end on production R2.

Scope. Production-only. The blobs being registered are already in r2://constellation-data/recordings/<rec_id>/ from the rig push; ursa.register.modality is catalog-only in Phase 1a, so we never upload bytes. The testing-profile rehearsal is gated on ENG-1071 and is deliberately out of scope for this demo.

The demo ships as three scripts under examples/:

| Script | Role | Where it runs |
| --- | --- | --- |
| mvp_demo_stage.py | List + download all segments for one rec_id; compute sha256 per segment; write a sha256.txt table. | Producer machine (MBP) |
| mvp_demo_register.py | Pre-condition check + dry-run plan + (with --yes) register.{participant,recording,modality}. | Producer machine (MBP) |
| verify_mvp_demo.py | ursa.query + ursa.get + ursa.download; assert byte counts and sha256s against the table. | Verifier machine (green-mantis) |

Prerequisites

  • 1Password desktop + op CLI signed in (biometric session unlocked).

  • CONSTELLATION_PROFILE=production resolves r2_raw_ro (reads constellation-data) and r2_assets_rw (writes Ursa catalog under r2://constellation-assets/ursa/catalog/).

  • A reachable verifier machine (this runbook uses green-mantis at 100.64.0.14 over Tailscale).

  • Single-writer assumption: post in #engineering-support that the ENG-1063 demo is registering against production R2 before starting.

Step 1 — Pick the recording

Browse production:

CONSTELLATION_PROFILE=production op signin
uv run python -c '
from ursa.store import get_store
s = get_store("raw_ro")
for p in list(s.list_prefixes("recordings/"))[:30]:
    print(p)
'

Pick one rec_* that:

  • has EEG + at least one other modality (audio / video / eyetracking),

  • has recordings/<rec_id>/manifest.json in R2 (the recording manifest),

  • is < 2 GB total (< 5 GB hard cap).

Note: the upload commit-marker prefix manifests/<rec_id>/ is empty on production recordings — manifests live inside recordings/<rec_id>/.

Record the chosen rec_id, the participant name from the manifest, and the CatalogID you want to use for participant_id (must match ^[A-Za-z0-9_-]+$; the manifest’s participant is a display name, not a valid CatalogID).
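The display-name-to-CatalogID step is easy to get wrong by hand. A minimal sketch of the validation, assuming a hypothetical helper (to_catalog_id is not part of Ursa; only the ^[A-Za-z0-9_-]+$ rule comes from the runbook):

```python
import re

# CatalogID rule from this runbook: letters, digits, underscore, hyphen only.
CATALOG_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")

def to_catalog_id(display_name: str) -> str:
    """Hypothetical helper: derive a CatalogID candidate from a manifest
    display name by collapsing runs of disallowed characters into '_'."""
    candidate = re.sub(r"[^A-Za-z0-9_-]+", "_", display_name.strip()).strip("_")
    if not CATALOG_ID_RE.match(candidate):
        raise ValueError(f"cannot derive a valid CatalogID from {display_name!r}")
    return candidate
```

Whatever you choose, check it against CATALOG_ID_RE before running Step 3; the register script will reject an invalid participant_id anyway, but catching it here saves a round trip.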

Step 2 — Stage on the producer

CONSTELLATION_PROFILE=production op signin
uv run python examples/mvp_demo_stage.py \
    --rec-id <rec_id> \
    --staging-dir <your-staging-dir>

Produces <staging-dir>/<rec_id>/sha256.txt with sha256  key  size  etag rows for every segment under recordings/<rec_id>/. This file is the only artifact transferred to the verifier machine.
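For reference, the per-segment digest can be computed in constant memory and emitted as one table row. A sketch under the row format stated above (sha256, key, size, etag, whitespace-separated); the real script may format rows differently:

```python
import hashlib
import io

def sha256_of(fileobj, chunk_size: int = 1 << 20) -> str:
    """Stream a segment and return its hex sha256 without loading it whole."""
    h = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

def table_row(digest: str, key: str, size: int, etag: str) -> str:
    # One sha256.txt row: sha256, key, size, etag, whitespace-separated.
    return f"{digest}  {key}  {size}  {etag}"

# Demo on an in-memory "segment"; on the producer this would be an open file.
demo_digest = sha256_of(io.BytesIO(b"segment bytes"))
```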

Step 3 — Register (dry-run, then commit)

# dry run — prints the action plan and exits 0 without writing anything
uv run python examples/mvp_demo_register.py \
    --rec-id <rec_id> \
    --participant-id <pid>

# commit — only after reviewing the dry-run output
uv run python examples/mvp_demo_register.py \
    --rec-id <rec_id> \
    --participant-id <pid> \
    --yes

Pre-condition: the production catalog must have no existing recording row for <rec_id>; the script aborts non-zero if one exists.

Pre-existing state handling:

  • Participant: checked before the write loop; if already present, it is skipped silently (participant ID comes from --participant-id, not the manifest).

  • Recording: a pre-condition check at the top of the script aborts immediately if recording_hash already exists. If a concurrent write races past the check, the script exits non-zero with an error message.

  • Modality: fails loud (exit 3) on CatalogRowExists. ModalityRow carries mutable state (format, ingestion_status, domain_intervals), and silently skipping on a field change would leave a stale catalog.

Real re-runs after partial failure need manual catalog cleanup; tracked under ENG-1069.

Step 4 — Verify on green-mantis

ssh green-mantis@100.64.0.14
mkdir -p ~/workspace && cd ~/workspace
git clone git@github.com:constellationlab/ursa.git ursa-eng-1063 && cd ursa-eng-1063
uv sync --all-extras

op signin
export CONSTELLATION_PROFILE=production

# Copy the sha256 table from the producer machine (rsync, scp, or NAS).
# Example: scp from MBP to green-mantis ~/sha256.txt
uv run python examples/verify_mvp_demo.py \
    --rec-id <rec_id> \
    --participant-id <pid> \
    --sha256-table ~/sha256.txt

The verifier asserts, in order:

  1. ursa.query(participants=[pid]) returns the recording.

  2. ursa.query(recording_hash=rec_id) returns exactly one result.

  3. ursa.get(qr) materializes every modality as a RawBytes carrier whose per-segment byte counts match the table.

  4. ursa.download(qr, layout="by_recording") writes files for one non-EEG modality whose sha256 hashes match the table.

Exit 0 ⇒ ENG-1063 acceptance met.
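The table-comparison half of checks 3 and 4 reduces to parsing sha256.txt and matching per-segment digests and sizes. A minimal sketch, assuming the whitespace-separated row format from Step 2 (the real verifier also drives ursa.query/get/download, which is omitted here):

```python
def parse_sha256_table(text: str) -> dict[str, tuple[str, int]]:
    """Map R2 key -> (sha256, size) from sha256.txt rows: sha256 key size etag."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, key, size, _etag = line.split()
        table[key] = (digest, int(size))
    return table

def check_segment(table: dict, key: str, got_digest: str, got_size: int) -> bool:
    # True when the downloaded segment matches the producer's staged values.
    want_digest, want_size = table[key]
    return (got_digest, got_size) == (want_digest, want_size)
```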

Recovery from partial Step 3 failure

If mvp_demo_register.py errors after one or more rows have been written: do not retry the script. M2 Catalog.delete raises NotImplementedError (ENG-1069), so there is no programmatic cleanup. Treat as severity-1: post in #engineering-support and coordinate manual catalog cleanup with the data-platform owner. Document the partial state in the demo runbook before re-attempting.

Divergence triage

  • Abort + escalate if ursa.get returns wrong byte counts, the verifier cannot import ursa, or Step 3 fails after the pre-condition passes.

  • Continue + file follow-up (mvp-divergence label in Linear) for non-blocking schema oddities or a single-segment sha256 mismatch on a multi-segment modality (treat as an investigation, not a blocker).

Follow-ups expected

  • ENG-1071 — validator support for -test bucket suffix so future demos can rehearse against testing R2.

  • ENG-1069 — Catalog.delete and an mvp_demo_register.py --reset mode so re-runs aren’t manual-cleanup-only.

  • ENG-892 — pin the chosen recording_hash into the quickstart doctest once this demo completes.