ursa.store.base¶
Public surface of the Ursa object-store layer.
ObjectStore is the single abstraction Ursa code calls into for blob IO.
Backends live under ursa.store.backends; the Pydantic config layer in
ursa.store.config and the factory in ursa.store.factory decide which
backend a given role binds to.
The wrapper hides the configured prefix from every caller: keys passed in
and returned out are always relative to the store’s prefix, which is set
once on construction inside the underlying obstore handle. raw_obstore()
exposes the prefix-scoped obstore handle for Lance/Zarr backends that
consume it natively.
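The prefix contract can be illustrated with a minimal sketch. `PrefixedStore` and its local-directory backing below are hypothetical stand-ins, not the real backend layer; they only demonstrate that keys passed in and yielded out never contain the configured prefix.

```python
from pathlib import Path
import tempfile

class PrefixedStore:
    """Toy illustration of the prefix contract: callers pass and receive
    keys relative to the configured prefix; only the store sees full paths."""

    def __init__(self, root: Path, prefix: str):
        self._base = root / prefix  # prefix is fixed once, at construction

    def put(self, key: str, data: bytes) -> None:
        path = self._base / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def list(self, prefix: str = ""):
        # Yield keys relative to the store prefix, never the full path.
        for p in sorted(self._base.rglob("*")):
            if p.is_file():
                key = p.relative_to(self._base).as_posix()
                if key.startswith(prefix):
                    yield key

root = Path(tempfile.mkdtemp())
store = PrefixedStore(root, prefix="tenant-a")
store.put("tables/users.parquet", b"...")
print(list(store.list()))  # ['tables/users.parquet'] -- no 'tenant-a/' visible
```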
Module Contents¶
Classes¶
- ObjectMeta: Metadata for one object in a store.
- ObjectStore: Generic blob-IO surface used by all Ursa code.
Data¶
API¶
- ursa.store.base.__all__¶
['ObjectMeta', 'ObjectStore', 'ObjectStoreError', 'ObjectNotFound', 'ObjectExists', 'ETagMismatch', …
- class ursa.store.base.ObjectMeta[source]¶
Metadata for one object in a store.
`key` is always relative to the store’s configured prefix; callers do not see the prefix at any boundary. `sha256` is lifted from `x-amz-meta-sha256` user metadata when present; absent on objects written without it.
- key: str¶
- size: int¶
- etag: str | None¶
- last_modified: datetime.datetime¶
- sha256: str | None¶
- exception ursa.store.base.ObjectStoreError[source]¶
Bases: `RuntimeError`

Base class for all Ursa object-store errors.
- exception ursa.store.base.ObjectNotFound[source]¶
Bases: `ursa.store.base.ObjectStoreError`

Raised by `head`, `get`, `get_range`, `open`, `delete`, and `copy` (for `src`) when the key does not exist in the store.
- exception ursa.store.base.ObjectExists[source]¶
Bases: `ursa.store.base.ObjectStoreError`

Raised when an `if_none_match` precondition fails (the object already exists, or its ETag matched).
- exception ursa.store.base.ETagMismatch[source]¶
Bases: `ursa.store.base.ObjectStoreError`

Raised when an `if_match` precondition fails (the object’s ETag does not match the expected value; a concurrent write was detected).
- exception ursa.store.base.InvalidMetadataError[source]¶
Bases: `ursa.store.base.ObjectStoreError`

Raised by `put` when `extra_metadata` violates S3/R2 user-metadata constraints (key charset, total header size). Validated locally before any network call.
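A local pre-flight check of this kind might look like the sketch below. The exact size limit is an assumption here (S3 caps user-defined metadata at roughly 2 KB total); the real validator may differ in both the limit and the error messages.

```python
import re

_KEY_RE = re.compile(r"[a-z0-9_-]+\Z")
_MAX_METADATA_BYTES = 2048  # assumed limit; S3/R2 cap user metadata near 2 KB

class InvalidMetadataError(RuntimeError):
    pass

def validate_extra_metadata(extra_metadata: dict[str, str]) -> None:
    """Local pre-flight validation, run before any network call."""
    total = 0
    for key, value in extra_metadata.items():
        if not _KEY_RE.match(key):
            raise InvalidMetadataError(
                f"metadata key {key!r} violates charset [a-z0-9_-]+"
            )
        total += len(key.encode()) + len(value.encode())
    if total > _MAX_METADATA_BYTES:
        raise InvalidMetadataError(
            f"metadata totals {total} bytes, limit {_MAX_METADATA_BYTES}"
        )

validate_extra_metadata({"sha256": "deadbeef"})  # passes silently
```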
- exception ursa.store.base.ObjectAccessDenied[source]¶
Bases: `ursa.store.base.ObjectStoreError`

Raised when the underlying R2/S3 service rejects the request as unauthorized (HTTP 401/403). Common causes: writing through a read-only credential, or a credential whose policy does not cover the target prefix.
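Because every error above derives from `ObjectStoreError` (itself a `RuntimeError`), call sites can handle one condition narrowly while letting a boundary catch the whole family. A minimal sketch, with the hierarchy restated locally and a hypothetical `head_or_none` helper:

```python
class ObjectStoreError(RuntimeError): ...
class ObjectNotFound(ObjectStoreError): ...
class ObjectExists(ObjectStoreError): ...
class ETagMismatch(ObjectStoreError): ...
class InvalidMetadataError(ObjectStoreError): ...
class ObjectAccessDenied(ObjectStoreError): ...

def head_or_none(store, key):
    # Narrow handling: absence is expected, everything else propagates
    # (and is still catchable upstream as ObjectStoreError).
    try:
        return store.head(key)
    except ObjectNotFound:
        return None
```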
- class ursa.store.base.ObjectStore[source]¶
Bases: `typing.Protocol`

Generic blob-IO surface used by all Ursa code.
Sync-primary; an async sibling will land in a follow-up issue when a real caller (e.g. ENG-893’s prefetcher) needs it.
Prefix semantics: the configured prefix is invisible to callers. All keys are relative to it; `raw_obstore()` returns a handle that is also prefix-scoped, so Lance/Zarr backends consuming the handle write under the same logical namespace as wrapper-owned objects.
- get(key: str) bytes[source]¶
Return the full object as bytes. For multi-MB objects use `open()` (streaming) or `get_range()` (random access).
- get_range(key: str, *, start: int, length: int) bytes[source]¶
Return `length` bytes starting at byte offset `start`.
- open(key: str) contextlib.AbstractContextManager[BinaryIO][source]¶
Return a context manager over a forward-only `BinaryIO` view.

The reader does not support `seek()`. Use `get_range()` for random access.
- put(key: str, data: bytes | typing.BinaryIO, *, content_type: str | None = None, sha256: str | None = None, extra_metadata: typing.Mapping[str, str] | None = None, if_none_match: typing.Literal['*'] | None = None, if_match: str | None = None) ursa.store.base.ObjectMeta[source]¶
Write `data` to `key`.

Conditional writes (mutually exclusive):

- `if_none_match="*"`: fail with `ObjectExists` if the key already exists. Create-if-not-exists semantics.
- `if_match=etag`: fail with `ETagMismatch` if the current object’s ETag is not `etag`. Compare-and-swap updates.

`extra_metadata` keys are validated locally against the S3/R2 user-metadata charset (`[a-z0-9_-]+`) and total header size.
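The two conditional-write modes can be modeled with a toy in-memory store; `MemoryStore` and `get_with_etag` below are invented for illustration and are not part of the real API. The second half shows the compare-and-swap retry loop that `if_match` enables.

```python
import uuid

class ObjectExists(RuntimeError): ...
class ETagMismatch(RuntimeError): ...

class MemoryStore:
    """Toy in-memory model of the two conditional-write modes."""

    def __init__(self):
        self._objects: dict[str, tuple[str, bytes]] = {}  # key -> (etag, data)

    def get_with_etag(self, key):
        return self._objects[key]

    def put(self, key, data, *, if_none_match=None, if_match=None):
        if if_none_match == "*" and key in self._objects:
            raise ObjectExists(key)          # create-if-not-exists failed
        if if_match is not None:
            current = self._objects.get(key)
            if current is None or current[0] != if_match:
                raise ETagMismatch(key)      # concurrent write detected
        etag = uuid.uuid4().hex
        self._objects[key] = (etag, data)
        return etag

store = MemoryStore()
store.put("counter", b"0", if_none_match="*")  # create-if-not-exists

# Compare-and-swap update: reread and retry on concurrent writes.
while True:
    etag, data = store.get_with_etag("counter")
    new = str(int(data) + 1).encode()
    try:
        store.put("counter", new, if_match=etag)
        break
    except ETagMismatch:
        continue  # someone else won the race; reread and retry

print(store.get_with_etag("counter")[1])  # b'1'
```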
- copy(src: str, dst: str, *, overwrite: bool = False) ursa.store.base.ObjectMeta[source]¶
Server-side copy from `src` to `dst` (no host round-trip). `overwrite=False` (the default) raises `ObjectExists` if `dst` already exists. ETag-conditional copy is not supported by the underlying obstore API; if you need CAS semantics, do a `get` + `put(if_match=...)` instead.
- head(key: str) ursa.store.base.ObjectMeta[source]¶
Return metadata for `key`. Raises `ObjectNotFound` if absent.
- list(prefix: str = '') Iterator[ursa.store.base.ObjectMeta][source]¶
Recursively yield objects whose keys start with `prefix`.

Returned `ObjectMeta.key` values are relative to the store’s configured prefix.
- list_prefixes(prefix: str = '') Iterator[str][source]¶
Yield one-level common prefixes under `prefix` (delimiter ‘/’).

Returns `str` rather than `ObjectMeta` because prefix entries have no size / ETag / last-modified.
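The one-level delimiter semantics can be sketched over a plain key list. This is an illustration only; whether the real implementation includes the trailing `/` on returned prefixes is an assumption here.

```python
def list_prefixes(keys, prefix=""):
    """One-level common prefixes under `prefix`, delimiter '/'."""
    seen = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if "/" in rest:
            # Keep only the first path segment below `prefix`.
            seen.add(prefix + rest.split("/", 1)[0] + "/")
    return sorted(seen)

keys = ["a/x.bin", "a/y.bin", "b/c/z.bin", "top.bin"]
print(list_prefixes(keys))        # ['a/', 'b/']  -- one level, not recursive
print(list_prefixes(keys, "b/"))  # ['b/c/']
```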
- raw_obstore() obstore.store.ObjectStore[source]¶
Return the underlying, already-prefix-scoped obstore handle.
For Lance/Zarr backends only — those libraries accept obstore handles natively. Common callers should use the wrapped surface above.
- lance_connection() tuple[str, dict[str, str]][source]¶
Return `(uri, storage_options)` for `lancedb.connect()`.

`uri` is the lancedb-compatible root for this store (filesystem path for local, `s3://<bucket>/<prefix>` for R2, with no trailing slash). Callers append their own subpath if they want to nest data under a sub-namespace; `Catalog` appends `/catalog`, for example.

`storage_options` carries everything lancedb needs beyond the URI scheme. For R2 this MUST include `endpoint=` (without it the S3 client targets AWS and fails opaquely), plus `access_key_id`, `secret_access_key`, and `region`. Local stores return an empty dict. The returned dict is a fresh copy on each call; mutations are not reflected back into the store.
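A call site might compose the pair like the sketch below. `catalog_uri` is a hypothetical helper showing the subpath-appending convention; the commented-out `lancedb.connect` line is the shape of the eventual call, not executed here.

```python
def catalog_uri(uri: str, storage_options: dict[str, str]) -> tuple[str, dict[str, str]]:
    # Nest data under a sub-namespace by appending a subpath to the store
    # root, mirroring how Catalog appends /catalog.
    return uri.rstrip("/") + "/catalog", dict(storage_options)

# Local store: filesystem path, empty options.
print(catalog_uri("/data/ursa", {}))  # ('/data/ursa/catalog', {})

# R2 store: s3:// root plus the credentials and endpoint lancedb needs.
uri, opts = catalog_uri(
    "s3://my-bucket/my-prefix",
    {
        "endpoint": "https://<account>.r2.cloudflarestorage.com",  # required for R2
        "access_key_id": "<key-id>",
        "secret_access_key": "<secret>",
        "region": "auto",
    },
)
# lancedb.connect(uri, storage_options=opts)
```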