ursa.backends.video
MP4-backed lazy video reader for StorageFormat.MP4_INDEX modalities.
Ported from neuro-galaxy/temporaldata@ian/lazy-everything/temporaldata/lazy_video.py
(2026-05 snapshot) with four adaptations:
- video_filepaths → uri: str + ObjectStore. Segments are downloaded to a process-global LRU cache keyed by (uri, etag) on first slice; PyAV decodes the local copy. Cache directory and quota are controlled by the module-level constants _DEFAULT_CACHE_DIR and _DEFAULT_CACHE_QUOTA_GB; tests and operators override them by monkeypatching and then calling _reset_cache_for_tests().
- segment_frame_counts / segment_pts_indices → LanceFrameIndex. The frame index is a sibling Lance table at ursa.layout.lance_frame_index_uri(uri) with columns (segment_idx, local_frame_idx, pts, timestamp). The original PyAV demux fallback survives for callers that open an MP4 without an index (tests, ad-hoc inspection); it is a transitional path slated for removal once every processed recording carries an index.
- to_hdf5 / from_hdf5 removed. Persistent state is the catalog ModalityRow plus the storage URI; HDF5 would only duplicate it.
- metadata: ModalityRow | None slot, mirroring the rest of ursa.temporal. .slice() propagates the same instance to the returned IrregularTimeSeries (identity-preserving; pinned by a test).
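The first adaptation describes a quota-bounded LRU cache keyed by (uri, etag). A minimal in-memory sketch of that idea follows; the class name SegmentCache, the fetch callback, and the byte-based quota are assumptions made for illustration and do not describe the module's actual on-disk cache.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy byte-quota LRU keyed by (uri, etag). Hypothetical, in-memory only."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self._entries = OrderedDict()  # (uri, etag) -> bytes, oldest first
        self._used = 0

    def get(self, uri, etag, fetch):
        key = (uri, etag)
        if key in self._entries:
            # Cache hit: mark as most recently used, no download.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: download once, then evict old entries if over quota.
        data = fetch(uri)
        self._entries[key] = data
        self._used += len(data)
        while self._used > self.quota_bytes and len(self._entries) > 1:
            _, evicted = self._entries.popitem(last=False)
            self._used -= len(evicted)
        return data
```

Keying on the etag as well as the URI means a re-uploaded object with the same URI is treated as a new entry instead of serving stale bytes.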
Fork safety: stream.thread_count = 1 and stream.thread_type = "NONE" are kept
verbatim. FFmpeg’s frame/slice worker threads deadlock in
avcodec_free_context if the codec context was created in the parent process and
released in a forked child (PyTorch DataLoader(num_workers>0) defaults to
fork on Linux). This module applies the constraint on the read side; the
writer side applies the same setting to its own PyAV containers.
PyAV is an optional dependency (ursa[video]); if it is missing, construction
raises a clear ImportError that includes the install hint.
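The optional-dependency behaviour can be sketched as a small import guard. The helper name require_optional and the exact wording of the hint (including the pip command) are assumptions; only the ImportError and the ursa[video] extra come from the text above.

```python
import importlib


def require_optional(module_name, extra):
    """Import module_name or raise ImportError naming the extra to install.

    Hypothetical helper; the real module may phrase its hint differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install '{extra}'"
        ) from exc
```

A caller would invoke something like require_optional("av", "ursa[video]") at construction time, so that merely importing the package does not require PyAV.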
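Module Contents
Classes
LazyVideo | Lazy-decoded video for StorageFormat.MP4_INDEX modalities.
LanceFrameIndex | Sibling Lance table at <mp4_uri>.lance describing one MP4’s frames.
API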
- class ursa.backends.video.LazyVideo(uri: str, *, store: ursa.store.base.ObjectStore, frame_index: ursa.backends.video.LanceFrameIndex | None = None, metadata: ursa.catalog.schemas.ModalityRow | None = None, segment_uris: Sequence[str] | None = None, resize: tuple[int, int] | None = None, colorspace: str = 'RGB', channel_format: str = 'NCHW')
Lazy-decoded video for StorageFormat.MP4_INDEX modalities.

Construction is metadata-only when a LanceFrameIndex is provided: no MP4 GET and no frame decode happens until the first slice() call, which downloads the required segment(s) into the process-global LRU cache and decodes the requested frames via PyAV.

When frame_index=None, the constructor falls back to PyAV-demuxing every segment up front to recover frame counts and PTS tables; this path DOES download bytes and probe PTS at construction time. Use the Lance-frame-index path (the default through from_uri()) for the metadata-only contract.

Seeks are keyframe-aligned (container.seek(pts, any_frame=False, backward=True)) and the decoder is then advanced frame by frame to the target PTS, so frames returned mid-GOP are bit-correct (no "mmco: unref short failure" corruption).

Picklable: no PyAV container/stream/reformatter handles are stored on self (they live as locals inside _load_frames()), so a LazyVideo survives the pickle.dumps that DataLoader workers perform across the fork boundary.

Args:
uri: r2:// URI of the MP4 (or the first segment if multi-segment).
store: ObjectStore to read from, typically DataInterface.assets_ro_store.
frame_index: LanceFrameIndex for the sibling sidecar, or None to PyAV-demux the segments at construction (test-only fallback; see module docstring).
metadata: optional ModalityRow to carry through to slice() results.
segment_uris: additional URIs for multi-segment videos. If absent, uri is the sole segment.
resize: (height, width) to resize frames to, or None for original dimensions.
colorspace: "RGB" or "G".
channel_format: "NCHW" or "NHWC".

Initialization
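The seek-then-advance behaviour described for LazyVideo can be illustrated without PyAV. The sketch below is a simulation under simplifying assumptions: PTS values are sorted in presentation order, B-frame reordering is ignored, and the function name frames_to_decode is hypothetical. It only shows which frames a decoder would have to walk through to reach a mid-GOP target.

```python
from bisect import bisect_right


def frames_to_decode(keyframe_pts, all_pts, target_pts):
    """Return the PTS values walked from the last keyframe <= target to target.

    Simulates a backward keyframe-aligned seek followed by forward decoding.
    Both input lists are assumed sorted ascending.
    """
    k = bisect_right(keyframe_pts, target_pts) - 1
    if k < 0:
        raise ValueError("target precedes the first keyframe")
    seek_point = keyframe_pts[k]
    # Everything from the keyframe up to and including the target is decoded;
    # only the last element is the frame the caller asked for.
    return [p for p in all_pts if seek_point <= p <= target_pts]
```

Decoding forward from the keyframe is what keeps the reference frames intact; jumping straight to a non-keyframe would leave the decoder without the pictures the target frame depends on.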
- property metadata: ursa.catalog.schemas.ModalityRow | None
- classmethod from_uri(uri: str, *, store: ursa.store.base.ObjectStore, metadata: ursa.catalog.schemas.ModalityRow | None = None) → ursa.backends.video.LazyVideo
Open uri as a lazy MP4 video.

Mirrors ursa.RegularTimeSeries.from_uri / ursa.LazyIrregularTimeSeries.from_uri so callers can use a uniform construction surface across backends. Resolves the sibling Lance frame index via ursa.layout.lance_frame_index_uri; on a missing sidecar, falls back to demuxing the MP4 itself.

Video is intentionally lazy-only. An eager equivalent would decode every frame at construction, which defeats the point of MP4_INDEX for any realistic recording. DataInterface.materialize(..., lazy=False) therefore still returns a LazyVideo for video subfields.

Video isn’t dispatched through _BackendOpeners because the regular/irregular split doesn’t fit per-frame video access. ursa.DataInterface invokes this classmethod directly when ModalityRow.format == StorageFormat.MP4_INDEX.
- classmethod concat(videos: Sequence[ursa.backends.video.LazyVideo]) → ursa.backends.video.LazyVideo
Concatenate segments of one logical video.

Single-recording only: cross-recording concat is not supported because aligned time domains across recordings are only established by the per-recording processed path.

Multi-video concat with frame indexes is also rejected: the receiver would have to merge the LanceFrameIndex instances from each input, and the writer-side helper for that has not been built yet. Calling concat([a]) (a single video, i.e. a metadata-only re-wrap) stays supported.
- lazy_slice(start: float, end: float) → ursa.backends.video.LazyVideo
Return a new LazyVideo windowed to [start, end) with no PyAV decode at call time.

The returned object is still a LazyVideo carrying a recorded frame-range window; frames decode only on the next slice() call, and only for the windowed range. _apply_time_window calls this form so that stream(time_range=…) never triggers a segment download or decode.

Implementation: np.searchsorted on the in-memory timestamp array to find idx_l:idx_r, then object.__new__ + attribute shallow-copy with the windowed timestamps, frame_indices, and frame_count substituted. All per-segment metadata (segment_frame_counts, segment_frame_offsets, _pts_cache, etc.) is shared by reference; these use global frame indices, so they remain correct for the windowed object’s _segment_for_frame lookups.

reset_origin is always False: the returned series keeps original recording-relative coordinates.

If start >= end or the window covers no frames, an empty LazyVideo (frame_count=0) is returned rather than raising.
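The windowing step can be sketched with the standard library. Here bisect.bisect_left is used as a stand-in for np.searchsorted (which behaves the same way with its default side="left"); the function name window_bounds is hypothetical, and the empty-window convention of returning (0, 0) is an assumption chosen to mirror the "empty rather than raising" behaviour.

```python
from bisect import bisect_left


def window_bounds(timestamps, start, end):
    """Return (idx_l, idx_r) such that timestamps[idx_l:idx_r] lies in [start, end).

    timestamps must be sorted ascending. An inverted or empty request
    yields an empty range instead of raising.
    """
    if start >= end:
        return 0, 0
    idx_l = bisect_left(timestamps, start)  # first index with t >= start
    idx_r = bisect_left(timestamps, end)    # first index with t >= end (exclusive)
    return idx_l, idx_r
```

Because only two binary searches on an in-memory array are involved, the window can be computed without touching any segment bytes.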
- slice(start: float, end: float, reset_origin: bool = True) → temporaldata.IrregularTimeSeries
Return an IrregularTimeSeries of decoded frames in [start, end) (end-exclusive).

reset_origin=True (default) shifts the returned timestamps to be relative to start; False keeps absolute camera time.
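The effect of reset_origin on the returned timestamps can be shown on plain lists. This is a sketch of the timestamp arithmetic only: the function name slice_timestamps is hypothetical and no frames are decoded.

```python
def slice_timestamps(timestamps, start, end, reset_origin=True):
    """Select timestamps in [start, end) and optionally re-base them on start."""
    selected = [t for t in timestamps if start <= t < end]
    if reset_origin:
        # Shift so that the window start becomes t = 0.
        return [t - start for t in selected]
    return selected
```

With timestamps [10.0, 10.5, 11.0, 11.5] and the window [10.5, 11.5), the default returns values relative to 10.5, while reset_origin=False returns the selected values unchanged.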
- _resolve_segment_path(segment_idx: int) → str
Cache-resolve a segment URI to a local-fs path for PyAV.
- _load_frames(frame_indices: numpy.ndarray) → numpy.ndarray
Decode the requested presentation-ordered frames.

Implementation notes (verbatim from Ian’s port; keep these):

- Indices are sorted by (segment, local_index) so each segment is walked forward in presentation order; the only seeks are at segment boundaries (or when the caller passes a non-monotonic sequence).
- Single-threaded decode (stream.thread_count = 1, stream.thread_type = "NONE") avoids the libavcodec frame/slice thread + os.fork() deadlock in avcodec_free_context. This shows up in any consumer that loads LazyVideo from a forked child (PyTorch DataLoader(num_workers>0) defaults to fork on Linux).
- A single av.video.reformatter.VideoReformatter does colorspace + resize via libswscale.
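The first implementation note, sorting requests by (segment, local_index), can be sketched as follows. The function name plan_decode is hypothetical, and the mapping from global frame index to segment via cumulative offsets is an assumption about how a structure like segment_frame_offsets would be used; restoring the caller's original order after decoding is not shown.

```python
from bisect import bisect_right
from itertools import accumulate


def plan_decode(frame_indices, segment_frame_counts):
    """Map global frame indices to sorted (segment, local_index) pairs."""
    # offsets[i] is the global index of the first frame in segment i.
    offsets = [0, *accumulate(segment_frame_counts)]
    total = offsets[-1]
    plan = []
    for g in frame_indices:
        if not 0 <= g < total:
            raise IndexError(f"frame index {g} out of range [0, {total})")
        seg = bisect_right(offsets, g) - 1
        plan.append((seg, g - offsets[seg]))
    # Sorting lets each segment be walked forward exactly once.
    return sorted(plan)
```

For segment counts [3, 2, 4], the request [8, 0, 3, 4] becomes [(0, 0), (1, 0), (1, 1), (2, 3)], so the decoder visits segments 0, 1, 2 in order and never seeks backward within a segment.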
- class ursa.backends.video.LanceFrameIndex(dataset: lance.LanceDataset, uri: str)
Sibling Lance table at <mp4_uri>.lance describing one MP4’s frames.

Schema (the writer side emits this; this module only reads):

.. code-block:: text

   segment_idx: int64       # which underlying segment file
   local_frame_idx: int64   # 0..segment_frame_count-1
   pts: int64               # PyAV PTS (codec-time-base units)
   timestamp: float64       # recording-relative seconds

Sorted by (segment_idx, local_frame_idx) (i.e. presentation order within each segment). Construction is metadata-only; per-segment PTS arrays are loaded lazily.

Rebuildable from the source MP4 via _probe_segment as a fallback.

Initialization
- classmethod open(uri: str, store: ursa.store.base.ObjectStore) → ursa.backends.video.LanceFrameIndex
Open the frame-index dataset. Lance pulls only metadata + footer.
- segment_frame_counts() → numpy.ndarray
Frames per segment as an int64 ndarray of length n_segments.
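How per-segment frame counts follow from the segment_idx column can be sketched on a plain list. The function name count_frames_per_segment is hypothetical, and a Python list stands in for the int64 ndarray the real method returns; how the actual method reads the column from Lance is not shown here.

```python
def count_frames_per_segment(segment_idx_column):
    """Count rows per segment; result index i holds the count for segment i."""
    if not segment_idx_column:
        return []
    counts = [0] * (max(segment_idx_column) + 1)
    for seg in segment_idx_column:
        counts[seg] += 1
    return counts
```

For a column [0, 0, 0, 1, 1, 2] this yields [3, 2, 1], one entry per segment.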