Ursa

Database / data-access layer for Constellation’s research stack.

Ursa is a multimodal time-series database. It owns ingestion, indexing, lazy time-windowed reads, vector search, and lifecycle management (retention, garbage collection, Polaris cache sync) for all of Constellation’s multimodal data: EEG, video, eye tracking, biometrics, questionnaires, keyboard/mouse/screen captures, and more.

The data model is intentionally minimal: Participant → Recording (recording_hash) → Modality + Events + flexible metadata. No sessions, no trials, no stimuli.
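The hierarchy above can be sketched in a few lines. This is a minimal illustration, not Ursa's schema: apart from recording_hash, every class and field name here is hypothetical, and stdlib dataclasses stand in for the Pydantic models of the real API so the sketch has no dependencies.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Event:
    """A labeled point or interval on a recording's timeline (seconds)."""
    label: str
    start_s: float
    end_s: Optional[float] = None  # None marks an instantaneous event


@dataclass
class Modality:
    """One data stream of a recording, e.g. 'eeg' or 'eye_tracking'."""
    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Recording:
    """A recording, identified by its recording_hash."""
    recording_hash: str
    modalities: Dict[str, Modality] = field(default_factory=dict)
    events: List[Event] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)  # flexible metadata


@dataclass
class Participant:
    """Top of the hierarchy; owns recordings keyed by recording_hash."""
    participant_id: str
    recordings: Dict[str, Recording] = field(default_factory=dict)


# Usage: build one participant with one recording (all values are made up).
recording = Recording(recording_hash="abc123", metadata={"site": "lab-a"})
recording.modalities["eeg"] = Modality(name="eeg", metadata={"channels": 64})
recording.events.append(Event(label="blink", start_s=12.5))

participant = Participant(participant_id="p001")
participant.recordings[recording.recording_hash] = recording
```

Note what is absent: there is no Session, Trial, or Stimulus type. Anything of that kind would have to be expressed as events or metadata on a recording.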

Data lives in Cloudflare R2, organized as a Lance metadata catalog and a Zarr time-series store; the user-facing API is Pydantic-typed and built on temporaldata primitives.
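To make "lazy time-windowed reads" concrete, here is a rough sketch of the idea, assuming a uniformly sampled signal. It is not Ursa's API: window_to_slice, LazySignal, and the load callback are hypothetical names, and a plain Python list stands in for the chunked store.

```python
import math
from typing import Callable, List


def window_to_slice(start_s: float, end_s: float,
                    sampling_rate_hz: float, n_samples: int) -> slice:
    """Map the half-open time window [start_s, end_s) to sample indices.

    Sample i is assumed to sit at time i / sampling_rate_hz.
    """
    if end_s < start_s:
        raise ValueError("end_s must be >= start_s")
    first = max(0, math.ceil(start_s * sampling_rate_hz))
    stop = min(n_samples, math.ceil(end_s * sampling_rate_hz))
    return slice(first, max(first, stop))  # empty if the window misses the data


class LazySignal:
    """Holds only shape information; data is fetched per requested window."""

    def __init__(self, n_samples: int, sampling_rate_hz: float,
                 load: Callable[[slice], List[float]]):
        self.n_samples = n_samples
        self.sampling_rate_hz = sampling_rate_hz
        self._load = load  # hypothetical hook; would wrap a store read

    def read(self, start_s: float, end_s: float) -> List[float]:
        window = window_to_slice(start_s, end_s,
                                 self.sampling_rate_hz, self.n_samples)
        return self._load(window)


# Usage: a 4-second signal at 250 Hz, backed by an in-memory list.
backing = [float(i) for i in range(1000)]
load_calls: List[slice] = []


def load(window: slice) -> List[float]:
    load_calls.append(window)  # record what was actually fetched
    return backing[window]


signal = LazySignal(n_samples=1000, sampling_rate_hz=250.0, load=load)
chunk = signal.read(1.0, 2.0)  # fetches samples 250..499 only
```

The point of the design is that constructing the signal costs nothing; only the samples inside the requested window are ever loaded.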

Where this fits

Ursa is one of three packages in Constellation’s research stack:

  • Ursa (this site) — database / data layer

  • Virgo — DAG-based preprocessing

  • Orion — research / training / benchmarking

Full architecture: Research Stack Architecture (Notion).

Status

🌱 Early bootstrap. Implementation tracked in the Linear Ursa project.

Phasing (mirrors the Linear project milestones):

  • M1 — Foundations (in progress)

  • M2 — Underbuilt MVP (Phase 1a, solo)

  • M3 — Production Database (Phase 1b, with Killian)

  • M4 — Production-scale (Phase 4)

  • M5 — Benchmarks Integration (Phase 5)

  • M6 — Polish & Onboarding (Phase 6)