ursa.data_interface

Credential isolation

A fresh :class:DataInterface opens against the assets_ro credential role only. Calling any register_* method before :meth:enable_writes raises :class:~ursa.catalog.WritesNotEnabled; production callers must explicitly request the assets_rw role.

:meth:enable_writes lazily builds a second :class:Catalog handle against assets_rw. The public read surface (query, list_*, get_*, :attr:catalog) always uses the RO handle. register_* runs on the RW handle, including its internal idempotency get_* lookup in :mod:ursa.register, so the check and the write happen under a single credential and the read-modify-write race window stays tight.
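The RO/RW split described above can be illustrated with a toy model. This is a minimal sketch, not the real implementation: ToyCatalog, ToyInterface, and register_thing are hypothetical stand-ins, and the local WritesNotEnabled class only mimics the role of ursa.catalog.WritesNotEnabled.

```python
from typing import Optional


class WritesNotEnabled(RuntimeError):
    """Stand-in for ursa.catalog.WritesNotEnabled (hypothetical base class)."""


class ToyCatalog:
    """Hypothetical catalog handle that only remembers its credential role."""

    def __init__(self, role: str) -> None:
        self.role = role


class ToyInterface:
    """Toy model of the RO/RW split; not the real DataInterface."""

    def __init__(self) -> None:
        self._ro = ToyCatalog("assets_ro")
        self._rw: Optional[ToyCatalog] = None  # built only by enable_writes()

    @property
    def catalog(self) -> ToyCatalog:
        # The read surface always returns the RO handle.
        return self._ro

    @property
    def writes_enabled(self) -> bool:
        return self._rw is not None

    def enable_writes(self) -> None:
        # Idempotent: a second call leaves the existing RW handle in place.
        if self._rw is None:
            self._rw = ToyCatalog("assets_rw")

    def _write_catalog(self) -> ToyCatalog:
        if self._rw is None:
            raise WritesNotEnabled("call enable_writes() first")
        return self._rw

    def register_thing(self) -> str:
        # Hypothetical write verb: reports which role it ran under.
        return self._write_catalog().role
```

In this model, calling register_thing() before enable_writes() raises, and :attr:catalog keeps reporting assets_ro after promotion, which mirrors the isolation guarantee stated above.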

Lazy construction

R2 profiles do not resolve 1Password credentials at construction; the underlying :class:Catalog is built on first access to :attr:catalog (reads) or :meth:enable_writes (writes). This keeps offline / sandboxed environments (CI, doc builds) able to import and instantiate :class:DataInterface without an op session. Argument validation (profile value, path constraints) still runs eagerly so caller mistakes surface immediately.
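A rough sketch of the "validate eagerly, build lazily" pattern follows. LazyInterface, _build_catalog, and build_count are hypothetical names, the dict stands in for a real catalog handle, and the use of ValueError is an assumption; the source does not say which exception the real constructor raises.

```python
from pathlib import Path
from typing import Optional, Union

_VALID_PROFILES = ("r2", "r2-test", "local")


class LazyInterface:
    """Toy model: arguments are checked in __init__, the handle is built later."""

    def __init__(
        self,
        profile: str = "r2",
        *,
        path: Optional[Union[str, Path]] = None,
    ) -> None:
        # Eager validation: cheap checks that need no credentials.
        if profile not in _VALID_PROFILES:
            raise ValueError(f"unknown profile: {profile!r}")
        if profile == "local" and path is None:
            raise ValueError("path is required when profile='local'")
        if profile != "local" and path is not None:
            raise ValueError("path is forbidden for R2 profiles")
        self.profile = profile
        self.path = Path(path) if path is not None else None
        self._catalog: Optional[dict] = None
        self.build_count = 0  # instrumentation for this sketch only

    @property
    def catalog(self) -> dict:
        # Lazy construction: the expensive step runs on first access only.
        if self._catalog is None:
            self._catalog = self._build_catalog()
        return self._catalog

    def _build_catalog(self) -> dict:
        # A real implementation would resolve credentials here.
        self.build_count += 1
        return {"profile": self.profile, "role": "assets_ro"}
```

Constructing LazyInterface("r2") touches no credentials in this model; the build step runs once, on the first read of catalog, and the cached handle is returned afterwards.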

Coexistence with the existing surface

The module-level :func:ursa.query / :func:ursa.get / :func:ursa.download / :mod:ursa.register functions remain wired up; :class:DataInterface is purely additive in PR 1. Internal callers migrate in PR 3a; the module-level forms are deleted in PR 3b.

Profile threading caveat (PR 1)

:meth:get and :meth:download delegate to module-level :func:ursa.get.get and :func:ursa.download.download, which resolve the segment store via the process-ambient URSA_PROFILE env var rather than the DataInterface(profile=...) argument. Callers must ensure URSA_PROFILE agrees with the profile passed here. Threading profile through those module-level functions is tracked as a follow-up under ENG-1107.
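Until the follow-up lands, a caller could guard against the mismatch with a small pre-flight check. The helper below is a hypothetical sketch and not part of Ursa; it also assumes that an unset URSA_PROFILE should be treated as a mismatch, since the source does not state what the module-level functions default to.

```python
import os
from typing import Mapping, Optional


def check_profile_agreement(
    profile: str,
    environ: Optional[Mapping[str, str]] = None,
) -> None:
    """Raise if the ambient URSA_PROFILE differs from the given profile.

    Hypothetical helper. `environ` defaults to os.environ and exists
    mainly so the check can be exercised without mutating the process.
    """
    env = os.environ if environ is None else environ
    ambient = env.get("URSA_PROFILE")
    if ambient != profile:
        raise RuntimeError(
            f"URSA_PROFILE={ambient!r} does not match profile={profile!r}; "
            "get/download would resolve a different segment store"
        )
```

A caller would run check_profile_agreement("r2-test") immediately before constructing the interface with the same profile value, so a disagreement fails fast instead of silently reading from another store.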

User-facing class surface for Ursa (ENG-1108).

:class:DataInterface packages the three M2 read verbs (:mod:ursa.query, :mod:ursa.get, :mod:ursa.download), the four :mod:ursa.register write verbs, and the typed per-table list_<table> / get_<table> lookups behind a single object that owns its :class:~ursa.catalog.Catalog handles.

Module Contents

Classes

DataInterface

Class-based entry point for Ursa’s read + register API.

Data

API

ursa.data_interface.__all__

['DataInterface']

ursa.data_interface._Profile

None

class ursa.data_interface.DataInterface(profile: ursa.data_interface._Profile = 'r2', *, path: str | pathlib.Path | None = None)[source]

Class-based entry point for Ursa’s read + register API.

Construction validates arguments but defers credential resolution (R2 profiles) until first use. Call :meth:enable_writes once before invoking any register_* method; that promotion is idempotent.

Parameters

profile: "r2" (default) for the production catalog, "r2-test" for the testing bucket, "local" for an on-disk lance directory.

path: Required when profile="local". Forbidden for R2 profiles, whose URI is resolved from the configured object store.

Initialization

property catalog: ursa.catalog.Catalog

The read-only catalog handle.

Constructed on first access; subsequent accesses return the cached handle. Always returns the RO catalog, even after :meth:enable_writes; credential isolation is enforced. register_* methods use a separate write handle internally.

property writes_enabled: bool
enable_writes() None[source]

Open the assets_rw write handle.

Idempotent — a second call is a no-op. For profile="local", local lance has no credential distinction, so the bool flip is the only observable effect; register_* methods fall back to the read handle.

_write_catalog() ursa.catalog.Catalog[source]

The write-capable catalog handle.

Raises :class:WritesNotEnabled if :meth:enable_writes hasn’t been called. For profile="local", falls back to the same underlying handle as :attr:catalog — there is no credential distinction to enforce locally.

register_* callers route both the idempotency get_* check and the add_* write through this handle so the check-and-write run under one credential.
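The check-then-write routing can be sketched as follows. ToyWriteHandle and this register_participant function are hypothetical stand-ins, and returning the existing row on a repeat call is an assumption about the idempotency behavior; the source only says that the get_* check and the add_* write share one handle.

```python
from typing import Any, Dict, List, Optional


class ToyWriteHandle:
    """Hypothetical write-capable handle backed by an in-memory dict."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}
        self.calls: List[str] = []  # records which verbs hit this handle

    def get_participant(self, participant_id: str) -> Optional[Dict[str, Any]]:
        self.calls.append("get")
        return self.rows.get(participant_id)

    def add_participant(self, row: Dict[str, Any]) -> Dict[str, Any]:
        self.calls.append("add")
        self.rows[row["participant_id"]] = row
        return row


def register_participant(
    handle: ToyWriteHandle,
    participant_id: str,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    # Both the idempotency lookup and the write use the same handle,
    # so they run under one credential.
    existing = handle.get_participant(participant_id)
    if existing is not None:
        return existing  # assumed behavior: repeat registration is a no-op
    return handle.add_participant(
        {"participant_id": participant_id, "metadata": metadata or {}}
    )
```

Because every call in the sketch goes through the single handle argument, there is no point at which the lookup could run under a different credential than the write.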

_require_writes() None[source]
query(spec: ursa.query.QuerySpec | None = None, **kwargs: Any) ursa.query.QueryResultList[source]

Delegate to :func:ursa.query.query with the RO catalog.

get(target: ursa.query.QueryResult | collections.abc.Iterable[ursa.query.QueryResult], *, concat: bool = False) ursa.temporal.Data | list[ursa.temporal.Data][source]

Delegate to :func:ursa.get.get.

See the module docstring’s profile-threading caveat — URSA_PROFILE must match DataInterface(profile=...).

download(target: ursa.query.QueryResult | collections.abc.Iterable[ursa.query.QueryResult], dest: str | os.PathLike[str], *, layout: Literal['by_recording', 'by_modality', 'flat'] = 'by_recording', overwrite: bool = False) list[pathlib.Path][source]

Delegate to :func:ursa.download.download.

See the module docstring’s profile-threading caveat — URSA_PROFILE must match DataInterface(profile=...).

register_participant(*, participant_id: str, enrolled_at: int | datetime.datetime, metadata: dict[str, Any] | None = None) ursa.catalog.ParticipantRow[source]
register_recording(*, recording_hash: str, participant_ids: list[str], start_time: int | datetime.datetime, duration: datetime.timedelta, device_info: dict[str, Any] | None = None, metadata: dict[str, Any] | None = None) ursa.catalog.RecordingRow[source]
register_modality(*, recording_hash: str, modality: str, raw_storage_uri: str, storage_uri: str | None = None, ingestion_status: ursa.catalog.IngestionStatus = IngestionStatus.RAW, format: ursa.catalog.StorageFormat | None = None, domain_intervals: list[tuple[float, float]] | None = None, sampling_rate: float | None = None, channel_spec: dict[str, Any] | None = None, metadata: dict[str, Any] | None = None) ursa.catalog.ModalityRow[source]
register_event(*, event_id: str, recording_hash: str, event_time: float, event_type: str, prompt: str | None = None, response: str | None = None, metadata: dict[str, Any] | None = None) ursa.catalog.EventRow[source]
list_participants(**kwargs: Any) list[ursa.catalog.ParticipantRow][source]
list_recordings(**kwargs: Any) list[ursa.catalog.RecordingRow][source]
list_modalities(**kwargs: Any) list[ursa.catalog.ModalityRow][source]
list_events(**kwargs: Any) list[ursa.catalog.EventRow][source]
list_embeddings(**kwargs: Any) list[ursa.catalog.EmbeddingRow][source]
list_virgo_assets(**kwargs: Any) list[ursa.catalog.VirgoAssetRow][source]
list_checkpoints(**kwargs: Any) list[ursa.catalog.CheckpointRow][source]
list_benchmark_suites(**kwargs: Any) list[ursa.catalog.BenchmarkSuiteRow][source]
list_benchmark_results(**kwargs: Any) list[ursa.catalog.BenchmarkResultRow][source]
get_participant(participant_id: str) ursa.catalog.ParticipantRow | None[source]
get_recording(recording_hash: str) ursa.catalog.RecordingRow | None[source]
get_modality(recording_hash: str, modality: str) ursa.catalog.ModalityRow | None[source]
get_event(event_id: str) ursa.catalog.EventRow | None[source]
get_embedding(embedding_id: str) ursa.catalog.EmbeddingRow | None[source]
get_virgo_asset(asset_id: str) ursa.catalog.VirgoAssetRow | None[source]
get_checkpoint(checkpoint_id: str) ursa.catalog.CheckpointRow | None[source]
get_benchmark_suite(suite_name: str, suite_version: int) ursa.catalog.BenchmarkSuiteRow | None[source]
get_benchmark_result(result_id: str) ursa.catalog.BenchmarkResultRow | None[source]