ursa.recovery.timing
Derive a recording’s ended_at from per-modality segment metadata.
When the catalog’s RecordingRow.duration is wrong or missing (e.g.
the 198 legacy rows whose manifests were reconstructed with
ended_at = max(R2 last_modified) = upload time, not session end),
this module re-derives the true session end from the segment files the
rigs wrote at recording time.
Several signal sources are supported, split into data-stream probes (read from actual recorded sample timestamps) and a universal worker-report fallback:
- EEG (eeg_*/eeg_*_timestamps.bin): per-chunk (int64 offset, float64 unix_ts) pairs packed by data-engine/eeg/recorder.py. The last 16-byte struct gives the timestamp of the most recent data chunk written before the recorder stopped.
- Camera (camera_*/camera_timestamps_NNNN_<first_frame_ts>.csv OR camera_*/segment_NNNN_<first_frame_ts>.csv): per-frame frame, pts_ns, wall_clock, epoch_ns rows written by data-engine/camera/timestamps.py. The last row’s epoch_ns gives the most recent frame’s wall-clock time. Two filename conventions are accepted because the legacy rig writers emit segment_*.csv while newer writers emit camera_timestamps_*.csv, with the same column shape. The filter still requires one of those two prefixes (not just .csv) so that a stray debug dump cannot be picked as the lexicographically greatest filename and silently downgrade the recording.
- Microphone (mic_*/mic_NNNN_<unix_ts>.wav): the filename encodes the first-frame unix timestamp; the RIFF data chunk size, sample rate, channels, and bits-per-sample give the duration. End = start + duration.
- Screen (screen_*/screen_timestamps_NNNN_<first_frame_ts>.csv): the same frame, pts_ns, wall_clock, epoch_ns columns as camera. The last row’s epoch_ns is the most recent captured frame.
- Samsung watch (samsungwatch_*/<sensor>_NNNN_<start_ts>.csv for imu / ppg / eda / heart_rate / battery): each sensor stream’s first column is a nanosecond timestamp; the max last-row timestamp across sensors is the last sample received.
- Environment (environment_*/environment_NNNN_<start_ts>.jsonl): newline-delimited {"timestamp": <ns>, "type": ...} poll records; the max timestamp is the last poll.
- Worker report (<worker>/worker_report*.json): a universal fallback across every worker type (notes, pupillabs, location, keyboard, mouse, battery, etc.). Each worker writes a report on clean shutdown carrying an explicit stopped_at_utc ISO-8601 timestamp. Probed for every worker dir alongside the data-stream probe, so a recording whose eeg_*_timestamps.bin is 0 bytes (failed recorder) can still derive ended_at from a sibling worker that shut down cleanly.
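As an illustration of the EEG probe described in the list above, here is a minimal sketch that decodes the last complete 16-byte record of a timestamps buffer. The little-endian layout, the helper name last_eeg_timestamp, and the choice to ignore a torn trailing write are assumptions; the source only states that each record is an (int64 offset, float64 unix_ts) pair.

```python
from __future__ import annotations

import struct
from datetime import datetime, timezone

# Assumed layout: little-endian int64 offset followed by float64 unix_ts (16 bytes).
RECORD = struct.Struct("<qd")


def last_eeg_timestamp(buf: bytes) -> datetime | None:
    """Return the unix_ts of the last complete record, or None if there is none."""
    # Ignore any trailing bytes that do not form a whole record (assumed policy).
    usable = len(buf) - (len(buf) % RECORD.size)
    if usable < RECORD.size:
        return None  # 0-byte or truncated file: no signal
    _offset, unix_ts = RECORD.unpack_from(buf, usable - RECORD.size)
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc)
```

In practice only the final bytes of the object would need to be fetched, since every record has a fixed size.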
derive_recording_end_time returns a ModalitySignals holding the per-source candidates; the recording’s true ended_at is derived via ModalitySignals.max_end, which prefers the latest data-stream end (eeg / camera / mic / screen / samsungwatch / environment) when any such signal exists, falling back to worker_report only when none fired (recordings made up solely of worker types that still lack a sample-timestamp probe).
Conservative by design: parsers return None when no parseable signal exists (no _timestamps.bin present, no camera_timestamps_/segment_ CSV, no valid WAV header, no worker_report*.json with a parseable stopped_at_utc). This is the absent-signal case. Exceptions raised by read or parse failures are caught by derive_recording_end_time and appended to ModalitySignals.errors. This is the failed-parse case. A None end-time with an empty errors list therefore means that no signal was present, not that a parse failed. A single broken segment file downgrades only its own recording to indeterminate rather than aborting a multi-recording rebuild.
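The two cases above (None for an absent signal, a recorded error for a failed parse) can be sketched as follows. The Signals class is a hypothetical two-field stand-in for ModalitySignals, and the probe callables and the error string format are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class Signals:
    """Hypothetical stand-in for ModalitySignals with only two sources."""
    eeg: datetime | None = None
    camera: datetime | None = None
    errors: list[str] = field(default_factory=list)


def collect(probes: dict[str, Callable[[], datetime | None]]) -> Signals:
    out = Signals()
    for name, probe in probes.items():
        try:
            # A probe returning None is the absent-signal case.
            setattr(out, name, probe())
        except Exception as exc:
            # Failed-parse case: record the error and keep walking.
            out.errors.append(f"{name}: {exc!r}")
    return out
```

Catching per probe rather than around the whole walk is what keeps one broken file from hiding the signals of its siblings.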
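Module Contents
Classes
ModalitySignals | End-time candidates per signal source for one recording.
Functions
derive_recording_end_time | Probe every modality worker under recordings/<rec_id>/ and return per-modality end-time candidates.
Data
CSV_TAIL_BYTES
WAV_HEADER_BYTES
API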
- ursa.recovery.timing.CSV_TAIL_BYTES
8192
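This page does not describe how CSV_TAIL_BYTES is used. One plausible reading, offered only as an assumption, is that the camera and screen probes read the final 8192 bytes of a segment CSV instead of the whole file. Under that assumption, a sketch of extracting the last row’s epoch_ns from such a tail could look like this (last_epoch_ns is a hypothetical helper, and treating only newline-terminated rows as complete is an assumed policy):

```python
from __future__ import annotations

CSV_TAIL_BYTES = 8192  # value documented on this page


def last_epoch_ns(tail: bytes) -> int | None:
    """Return epoch_ns from the last complete frame,pts_ns,wall_clock,epoch_ns row."""
    # The piece after the final newline is either empty or a torn write: drop it.
    rows = tail.split(b"\n")[:-1]
    for raw in reversed(rows):
        cols = raw.decode("ascii", errors="replace").strip().split(",")
        # The header row and any partial first row fail this check and are skipped.
        if len(cols) == 4 and cols[3].isdigit():
            return int(cols[3])
    return None
```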
- ursa.recovery.timing.WAV_HEADER_BYTES
256
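For the microphone probe, the duration follows from the RIFF header fields alone. The sketch below assumes that the fmt and data chunk headers both fall within the first WAV_HEADER_BYTES of the file and that the audio is uncompressed PCM; neither is stated on this page, and wav_duration_seconds is a hypothetical helper.

```python
from __future__ import annotations

import struct

WAV_HEADER_BYTES = 256  # value documented on this page


def wav_duration_seconds(header: bytes) -> float | None:
    """Compute duration from the fmt and data chunks of a RIFF/WAVE header."""
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        return None
    pos, fmt = 12, None
    while pos + 8 <= len(header):
        chunk_id, size = struct.unpack_from("<4sI", header, pos)
        body = pos + 8
        if chunk_id == b"fmt " and body + 16 <= len(header):
            _tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", header, body
            )
            fmt = (channels, rate, bits)
        elif chunk_id == b"data":
            if fmt is None:
                return None  # data before fmt: treat as unparseable
            channels, rate, bits = fmt
            bytes_per_second = rate * channels * bits // 8
            return size / bytes_per_second if bytes_per_second else None
        pos = body + size + (size & 1)  # RIFF chunks are word-aligned
    return None
```

The recording end is then the start timestamp encoded in the filename plus this duration.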
- class ursa.recovery.timing.ModalitySignals
End-time candidates per signal source for one recording.
Populated by derive_recording_end_time. Use max_end to collapse the modality candidates into a single ended_at value; None indicates no signal source produced a parseable result (the row is indeterminate).
- eeg: datetime.datetime | None
None
- camera: datetime.datetime | None
None
- mic: datetime.datetime | None
None
- screen: datetime.datetime | None
None
- samsungwatch: datetime.datetime | None
None
- environment: datetime.datetime | None
None
- worker_report: datetime.datetime | None
None
- errors: list[str]
‘field(…)’
- max_end() -> datetime.datetime | None
Best ended_at estimate, or None if no signal produced a candidate (the row is indeterminate).
Prefers the latest data-stream end (eeg / camera / mic / screen / samsungwatch / environment), each read from actual recorded sample timestamps. worker_report (stopped_at_utc) is the wall-clock time at worker process teardown, not the last sample. Workers routinely linger for hours (even into the next day) past the last data, so including it in the max inflated durations (e.g. a 5h session reported as 30h, or short single-modality sessions reported as a uniform ~8h once the recorder’s idle timeout fires). It is therefore used only as a fallback when no data-stream signal exists, i.e. for worker types that still lack a sample-timestamp probe (notes / pupillabs / keyboard / mouse / location / battery).
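The preference rule described for max_end can be sketched over a plain dict of candidates. This illustrates the documented behaviour and is not the module’s actual code; pick_end and the dict-based input are hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

DATA_STREAMS = ("eeg", "camera", "mic", "screen", "samsungwatch", "environment")


def pick_end(candidates: dict[str, datetime | None]) -> datetime | None:
    """Latest data-stream end if any exists, else the worker_report fallback."""
    data_ends = [
        candidates[name] for name in DATA_STREAMS if candidates.get(name) is not None
    ]
    if data_ends:
        return max(data_ends)  # worker_report is deliberately excluded here
    return candidates.get("worker_report")
```

With an EEG end 5 hours after the start and a worker report 30 hours after it, this returns the 5-hour value; the worker report is used only when every data-stream entry is None.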
- ursa.recovery.timing.derive_recording_end_time(store: ursa.store.ObjectStore, rec_id: str) -> ursa.recovery.timing.ModalitySignals
Probe every modality worker under recordings/<rec_id>/ and return per-modality end-time candidates.
ModalitySignals.max_end collapses them into a single ended_at value. A single broken segment file degrades only its own worker: the error is recorded in ModalitySignals.errors and the walk continues. Callers should log errors regardless of the max_end() outcome so future parser regressions don’t hide behind a successful sibling modality.
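On the caller side, the logging advice above might look like the following sketch. resolve_ended_at is a hypothetical wrapper; it accepts any object exposing errors and max_end(), so it does not depend on how the store or the probes are implemented.

```python
from __future__ import annotations

import logging
from datetime import datetime


def resolve_ended_at(
    signals, rec_id: str, log: logging.Logger | None = None
) -> datetime | None:
    """Log every probe error, then return the collapsed end time (or None)."""
    log = log or logging.getLogger("ursa.recovery")  # logger name is an assumption
    # Log errors first, whether or not a sibling modality succeeded.
    for err in signals.errors:
        log.warning("recording %s: probe error: %s", rec_id, err)
    end = signals.max_end()
    if end is None:
        log.info("recording %s: indeterminate (no parseable signal)", rec_id)
    return end
```

A caller would pass the result of derive_recording_end_time(store, rec_id) as signals.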