ursa.recovery.first_epoch

Derive a modality’s first-sample wall-clock epoch from its raw files.

Companion to :mod:ursa.recovery.timing, which derives recording end times for duration fixes. scripts/fix_modalities_start_epoch.py re-anchors legacy zero-origin ModalityRow.domain_intervals onto the shared recording origin (RecordingRow.start_time). Doing so requires each modality’s own first-sample absolute epoch, the value Virgo’s shared-origin parsers record as start_epoch_ns at ingest. The per-modality derivations here replicate those parsers’ first-sample anchoring (virgo/src/virgo/adapters/data_engine_parsers/*) from the raw timestamp sidecars alone. No bulk data is read, so a full-catalog backfill touches only headers, first CSV rows, and 16-byte binary anchors.

Probe semantics mirror timing.py: a probe returns None when no parseable signal exists (missing or empty sidecar, unknown header); read or parse failures raise and are caught per row by the caller. Derived epochs are expected to agree with the parser’s value only to within sub-millisecond tolerance, because the formulas are identical but the float paths are independent. The consuming script additionally bounds every epoch against the recording’s own time window, so a glitched first row (e.g. a Pupil-SDK clock step) downgrades its row to underivable instead of committing a wild shift.
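The window bound described above can be sketched as follows. This is a minimal illustration, not the consuming script’s actual code: the function name bound_epoch_ns, its parameters, and the slack_ns tolerance are assumptions introduced here.

```python
from typing import Optional


def bound_epoch_ns(
    epoch_ns: Optional[int],
    window_start_ns: int,
    window_end_ns: int,
    slack_ns: int = 0,
) -> Optional[int]:
    """Return epoch_ns if it lies inside the recording window, else None.

    Hypothetical helper: the real script's bounds and tolerance are not
    documented on this page.
    """
    if epoch_ns is None:
        # The probe found no parseable signal; nothing to bound.
        return None
    if window_start_ns - slack_ns <= epoch_ns <= window_end_ns + slack_ns:
        return epoch_ns
    # Out-of-window epoch (e.g. a clock step on the first row):
    # report the row as underivable rather than commit the shift.
    return None
```

Returning None for an out-of-window value keeps the caller’s handling uniform: an implausible epoch is treated the same way as a missing sidecar.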

Modality routing:

  • video-webcam / camera / screen are excluded — their MP4 assets must be re-ingested (the sidecar timestamps change with the origin), so the catalog-only fix does not apply.

  • pupillabs-scene is excluded — its Pupil Labs world_timestamps_*.csv carries no joinable PTS column, so the modality has no aligned first-sample epoch to record.

Per-modality first-sample sources (segment 0 = lowest segment index):

==================== ======================================== =====================================
Modality slug(s)     Raw source (segment 0)                   First-sample epoch formula
==================== ======================================== =====================================
eeg (new fmt)        eeg_0000_timestamps.bin pair 0           time (float64 Unix s) × 1e9
eeg-old-format       amp0 …_timestamps.bin anchor 0           (time_s − offset/rate) × 1e9
eeg-cascaded         …_<serial>_timestamps.bin anchor 0       (time_s − offset/rate) × 1e9
microphone/mic       mic_timestamps_0000_*.csv anchor 0       epoch_ns − offset × 1e9/rate
samsung-watch-…      imu_0000_*.csv row 0 col0 (int ns)       value
pupillabs-gaze       gaze_0000_*.csv/gaze.csv row 0           timestamp_unix_s × 1e9
pupillabs-imu        imu_0000_*.csv row 0                     timestamp_unix_s × 1e9
samsungwatch         min across stream CSVs’ row 0            timestamp ns / ts s × 1e9
mouse/…/battery      first data row timestamp (float s)       value × 1e9
notes                min ts_capture across all segments       value × 1e9
browser              events.jsonl line 0 timestamp            value (ms) × 1e6
location/env…        environment_0000_*.jsonl line 0          timestamp ns / legacy ts × 1e9
==================== ======================================== =====================================
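As a rough illustration of the unit conversions in the table, the sketch below converts seconds and milliseconds to nanoseconds and extrapolates a binary anchor back to sample 0. Several details are assumptions rather than facts taken from the Virgo parsers: the helper names, the 16-byte layout (little-endian float64 time followed by an int64 sample offset), the subtraction of offset/rate, and rounding to the nearest nanosecond.

```python
import struct
from typing import Tuple

NS_PER_S = 1_000_000_000
NS_PER_MS = 1_000_000


def seconds_to_ns(value_s: float) -> int:
    """Float Unix seconds to integer ns (rounding mode is an assumption)."""
    return round(value_s * NS_PER_S)


def millis_to_ns(value_ms: float) -> int:
    """Milliseconds to integer ns, as in the browser row (value × 1e6)."""
    return round(value_ms * NS_PER_MS)


def unpack_anchor(raw: bytes) -> Tuple[float, int]:
    """Decode one 16-byte anchor.

    ASSUMPTION: little-endian float64 time_s followed by int64 offset.
    The real on-disk layout is not documented on this page.
    """
    time_s, offset = struct.unpack("<dq", raw[:16])
    return time_s, offset


def anchor_first_sample_ns(time_s: float, offset: int, rate_hz: float) -> int:
    """Extrapolate an anchor back to sample 0: (time_s - offset/rate) × 1e9."""
    return round((time_s - offset / rate_hz) * NS_PER_S)
```

A float64 holding a present-day Unix time in seconds resolves to roughly 240 ns, so two independent float computations of the same formula can plausibly differ by a few hundred nanoseconds.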

Module Contents

Functions

derive_first_sample_epoch_ns

First-sample absolute epoch (ns) for modality under its raw prefix.

supported_slugs

Modality slugs the catalog-only backfill can derive an epoch for.

Data

API

ursa.recovery.first_epoch.PERMANENT_NULL_SLUGS

‘frozenset(…)’

ursa.recovery.first_epoch.VIDEO_REINGEST_SLUGS

‘frozenset(…)’

ursa.recovery.first_epoch.derive_first_sample_epoch_ns(store: ursa.store.ObjectStore, raw_storage_uri: str, modality: str) → int | None

First-sample absolute epoch (ns) for modality under its raw prefix.

Returns None both when the signal is absent or unparseable and when the modality slug has no probe (video → re-ingest; pupillabs-scene → permanently unaligned; unknown slugs). Read or parse failures raise, and the caller handles them per row.
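One way a caller might separate the three outcomes (derived, underivable, failed) per row is sketched below. It is only an illustration: probe_row, DeriveFn, and the status strings are hypothetical, and the derive function is passed in as a parameter so the sketch does not depend on importing ursa.

```python
from typing import Callable, Optional, Tuple

# Same positional shape as derive_first_sample_epoch_ns(store, raw_storage_uri, modality).
DeriveFn = Callable[[object, str, str], Optional[int]]


def probe_row(
    derive: DeriveFn,
    store: object,
    raw_storage_uri: str,
    modality: str,
) -> Tuple[str, Optional[int]]:
    """Classify one catalog row without letting a failure abort the batch."""
    try:
        epoch_ns = derive(store, raw_storage_uri, modality)
    except Exception as exc:
        # Read or parse failure: record it for this row and move on.
        return (f"error: {type(exc).__name__}", None)
    if epoch_ns is None:
        # No parseable signal, or a slug with no probe.
        return ("underivable", None)
    return ("derived", epoch_ns)
```

Catching the exception at row granularity means one corrupt sidecar affects only its own row in a full-catalog pass.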

ursa.recovery.first_epoch.supported_slugs() → frozenset[str]

Modality slugs the catalog-only backfill can derive an epoch for.