ursa.participant

Participant name → catalog-safe slug.

The Ursa catalog requires every identifier to match [A-Za-z0-9_-]+ (:data:ursa.catalog.schemas.ID_PATTERN). Operator input from data-engine’s dashboard is free-form ("Alice", "José Smith", etc.); this module is the single place that converts it into the canonical participant_id written to the participants table and into RecordingRow.participant_ids.
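As a rough illustration of the constraint above, the check below rebuilds the quoted character class locally. The name is_catalog_safe is hypothetical, and the local ID_PATTERN is only a stand-in: the real ursa.catalog.schemas.ID_PATTERN may be a plain string rather than a compiled pattern.

```python
import re

# Assumption: local stand-in for ursa.catalog.schemas.ID_PATTERN,
# rebuilt from the character class quoted in the text above.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_catalog_safe(identifier: str) -> bool:
    """Return True if the whole string consists of catalog-safe characters."""
    # fullmatch anchors at both ends, so "José Smith" or "" cannot pass.
    return ID_PATTERN.fullmatch(identifier) is not None


print(is_catalog_safe("Andre_Smith"))  # catalog-safe already
print(is_catalog_safe("José Smith"))   # free-form operator input, not safe
```

Free-form operator input such as "José Smith" fails this check, which is why it has to pass through the slugifier before any catalog write.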

Ownership rationale (v0.1.7): pre-v0.1.7, the data-engine uploader slugified before calling ingest() and ursa trusted the input. That left “where does slugification happen” ambiguous — two implementations of the same rule, two opportunities for them to drift. v0.1.7 moves the rule into ursa so every catalog write goes through one codepath, and DataInterface.register_participant and DataInterface.ingest(participant=...) both accept raw display names.

Module Contents

Functions

slugify_to_catalog_id

Slug a free-form string into an ursa-catalog-safe identifier.

Data

UNKNOWN_SLUG

API

ursa.participant.UNKNOWN_SLUG: str = 'unknown'

ursa.participant.slugify_to_catalog_id(label: str | None) → str

Slug a free-form string into an ursa-catalog-safe identifier.

Rules:

  1. None / empty / whitespace-only → :data:UNKNOWN_SLUG.

  2. NFKD-normalize so diacritics decompose to base letters (José → Jose, André Smith → Andre_Smith). Two recordings of the same person, spelled with and without a diacritic, then land under the same catalog ID.

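  3. Preserve case (the catalog is case-sensitive; Alice and alice remain distinct). Replace each run of characters outside [A-Za-z0-9_-] with a single _, then strip leading and trailing _. If nothing remains (for example, non-ASCII-only input such as 山田), fall back to :data:UNKNOWN_SLUG.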
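The rules above can be sketched with the standard library alone. This is an illustrative re-implementation, not ursa's source: the name slugify_sketch is hypothetical, and dropping combining marks outright after NFKD (rather than replacing them with _) is an assumption made so that a diacritic in the middle of a word leaves no underscore behind.

```python
import re
import unicodedata
from typing import Optional

UNKNOWN_SLUG = "unknown"  # mirrors the documented fallback value
_UNSAFE_RUN = re.compile(r"[^A-Za-z0-9_-]+")


def slugify_sketch(label: Optional[str]) -> str:
    """Hypothetical sketch of the documented rules; not the library's code."""
    # Rule 1: None, empty, or whitespace-only input falls back immediately.
    if label is None or not label.strip():
        return UNKNOWN_SLUG
    # Rule 2: NFKD splits "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", label)
    # Assumption: combining marks are discarded instead of becoming "_".
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Rule 3: keep case, collapse unsafe runs to one "_", trim edge "_".
    slug = _UNSAFE_RUN.sub("_", base).strip("_")
    # Nothing catalog-safe survived (e.g. CJK-only input): use the fallback.
    return slug or UNKNOWN_SLUG


for name in ["Alice", "José", "André Smith", "山田", "", None]:
    print(repr(name), "->", repr(slugify_sketch(name)))
```

Under these assumptions the sketch reproduces the documented examples, including the 'unknown' fallback for non-ASCII-only input.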

Returns

str A catalog-safe slug (matches ID_PATTERN).

Examples

>>> slugify_to_catalog_id("Alice")
'Alice'
>>> slugify_to_catalog_id("José")
'Jose'
>>> slugify_to_catalog_id("André Smith")
'Andre_Smith'
>>> slugify_to_catalog_id("山田")  # non-ASCII-only → fallback
'unknown'
>>> slugify_to_catalog_id("")
'unknown'
>>> slugify_to_catalog_id(None)
'unknown'