ursa.participant
Participant name → catalog-safe slug.
The Ursa catalog requires every identifier to match
[A-Za-z0-9_-]+ (see ursa.catalog.schemas.ID_PATTERN). Operator
input from data-engine’s dashboard is free-form ("Alice",
"José Smith", etc.); this module is the single place that converts
it into the canonical participant_id written to the
participants table and into RecordingRow.participant_ids.
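As a minimal illustration of the constraint described above, the check below tests a string against the documented character class. The names ID_PATTERN_SKETCH and is_catalog_safe are hypothetical and exist only for this example; the real pattern lives in ursa.catalog.schemas.ID_PATTERN, and its exact type (string or compiled regex) is not shown here.

```python
import re

# Assumption: a local copy of the documented character class, compiled here
# for illustration. The real constant is ursa.catalog.schemas.ID_PATTERN.
ID_PATTERN_SKETCH = re.compile(r"[A-Za-z0-9_-]+")


def is_catalog_safe(identifier: str) -> bool:
    """Return True if the whole string consists of catalog-safe characters."""
    # fullmatch requires the entire string to match, so "" and strings with
    # spaces or non-ASCII letters are rejected.
    return ID_PATTERN_SKETCH.fullmatch(identifier) is not None


print(is_catalog_safe("Alice"))       # plain ASCII name
print(is_catalog_safe("José Smith"))  # free-form dashboard input
```

Free-form input such as "José Smith" fails this check, which is why it has to be slugified before it is written to the catalog.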
Ownership rationale (v0.1.7): pre-v0.1.7, the data-engine uploader
slugified before calling ingest() and ursa trusted the input.
That left “where does slugification happen” ambiguous — two
implementations of the same rule, two opportunities for them to drift.
v0.1.7 moves the rule into ursa so every catalog write goes through one
codepath, and DataInterface.register_participant and
DataInterface.ingest(participant=...) both accept raw display names.
Module Contents
Functions
slugify_to_catalog_id | Slug a free-form string into an ursa-catalog-safe identifier.
Data
UNKNOWN_SLUG
API
- ursa.participant.UNKNOWN_SLUG: str
'unknown'
- ursa.participant.slugify_to_catalog_id(label: str | None) → str
Slug a free-form string into an ursa-catalog-safe identifier.
Rules:
- None / empty / whitespace-only → UNKNOWN_SLUG.
- NFKD-normalize so diacritics decompose to base letters (José → Jose, André Smith → Andre_Smith). Two recordings of the same person spelled with vs. without a diacritic land under the same catalog ID.
- Lower-casing is not applied (the catalog is case-sensitive; Alice and alice remain distinct).
- Replace runs of non-[A-Za-z0-9_-] characters with a single _; strip leading/trailing _.
Returns
str A catalog-safe slug (matches ID_PATTERN).
Examples
>>> slugify_to_catalog_id("Alice")
'Alice'
>>> slugify_to_catalog_id("José")
'Jose'
>>> slugify_to_catalog_id("André Smith")
'Andre_Smith'
>>> slugify_to_catalog_id("山田")  # non-ASCII-only → fallback
'unknown'
>>> slugify_to_catalog_id("")
'unknown'
>>> slugify_to_catalog_id(None)
'unknown'
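The documented rules can be sketched with the standard library alone. This is not the ursa source: slugify_sketch is a hypothetical name, and two details are assumptions inferred from the examples rather than stated in the rules, namely that combining marks are dropped after NFKD decomposition and that an input which slugs down to an empty string falls back to the unknown slug.

```python
import re
import unicodedata

# Mirrors the documented value of ursa.participant.UNKNOWN_SLUG.
UNKNOWN_SLUG = "unknown"

# One or more characters outside the catalog-safe class.
_UNSAFE_RUN = re.compile(r"[^A-Za-z0-9_-]+")


def slugify_sketch(label):
    """Illustrative re-implementation of the documented slug rules."""
    # Rule 1: None, empty, or whitespace-only input maps to the fallback.
    if label is None or not label.strip():
        return UNKNOWN_SLUG
    # Rule 2: NFKD splits accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", label)
    # Assumption: combining marks are dropped so "é" becomes "e" and does not
    # leave a stray "_" in the middle of a word.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Rule 3: no lower-casing; case is preserved as-is.
    # Rule 4: collapse unsafe runs to "_" and trim "_" from both ends.
    slug = _UNSAFE_RUN.sub("_", base).strip("_")
    # Assumption: nothing catalog-safe survived (e.g. CJK-only input).
    return slug or UNKNOWN_SLUG


for name in ["Alice", "José", "André Smith", "山田", "", None]:
    print(repr(name), "->", slugify_sketch(name))
```

Because case is preserved, slugify_sketch("Alice") and slugify_sketch("alice") stay distinct, matching the case-sensitive catalog described in the rules.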