ADR-0040: Controlled-vocabulary tags for governance artifacts
Status: accepted | Date: 2026-04-09
Tags: schema
References: RFC-0002, ADR-0039
Context
As the govctl artifact corpus grows (currently 200+ artifacts), finding related artifacts by domain becomes difficult. Users resort to grep or memorizing IDs.
Problem Statement
There is no structured way to answer “show me everything related to caching” or “which ADRs touch the parser”. Artifact titles provide some signal, but titles are inconsistent and not designed for cross-cutting categorization.
Constraints
- RFC-0002:C-RESOURCES defines the field surface for each artifact type, so adding tags requires a schema amendment
- RFC-0002:C-CRUD-VERBS governs how fields are mutated, so tags must follow existing add/remove verb semantics
- Tags must be diffable and reviewable in PRs (no hidden state)
- The system should prevent tag sprawl — typos and near-duplicates degrade signal
Options Considered
Two tagging models were weighed: controlled vocabulary (registry-first) and free-form (tag-on-use), alongside the option of adding no tags at all. See Alternatives Considered for the analysis.
Decision
We will use a controlled-vocabulary tag system where tags must be registered in a project-level allowed list before any artifact can reference them.
Why Controlled Vocabulary
The core trade-off is between friction and signal quality. Free-form tags have zero friction but degrade rapidly — typos, case variants, and synonyms fragment the taxonomy. In a governed workflow where artifacts are meant to be auditable and cross-referenced, unreliable metadata defeats the purpose.
A controlled vocabulary enforces consistency at the cost of a one-time registration step for each new tag. This cost is intentional: introducing a new domain category is a project-level decision that should be visible and reviewable.
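The fragmentation argument can be made concrete with a small sketch. The tag spellings below are hypothetical, chosen to mirror the typo, case-variant, and synonym cases named above; nothing here reflects govctl's actual implementation.

```python
from collections import Counter

# Hypothetical free-form tags, as four contributors might type them
# when all of them mean "this artifact is about caching".
free_form = ["cache", "caching", "Cache", "cache"]

# Exact-match filtering treats every spelling as its own tag.
buckets = Counter(free_form)
matches_for_cache = buckets["cache"]  # 2 of the 4 intended hits

# With a registry, only the registered spelling is accepted,
# so the other variants are surfaced instead of silently stored.
allowed = {"caching"}
rejected = sorted(set(free_form) - allowed)
```

A query for any single spelling misses part of the intended set, which is the signal loss the registration step is meant to prevent.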
Design Outline
- Registry: a [tags] allowed list in gov/config.toml; flat, lowercase kebab-case strings
- Artifact field: an optional tags array in the [govctl] section of RFCs, clauses, ADRs, work items, and guards (releases do not carry tags)
- Management: registry-level new/delete/list commands; artifact-level tagging via existing add/remove verbs
- Filtering: --tag flag on existing list commands for taggable resource types
- Validation: govctl check rejects tags not in the allowed set; add rejects unregistered tags immediately
Detailed command syntax, schema changes, and validation rules will be specified in an RFC-0002 amendment.
Constraints
- No maximum tag count per artifact — signal quality is maintained by the controlled vocabulary, not by limiting labels
- The initial seed list of allowed tags is a separate operational decision from the mechanism itself
- Tags complement but do not replace potential future full-text search (see ADR-0039)
Consequences
Positive
- Cross-cutting discovery becomes a first-class operation — “show me everything about caching” is a single command
- Controlled vocabulary prevents tag sprawl — consistency is enforced, not hoped for
- Tags are part of the TOML source — diffable, reviewable in PRs, greppable
- Agents can enumerate available tags and use them programmatically
- Extends existing add/remove/list verb model: minimal new CLI grammar
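As a rough illustration of the discovery benefit, tag filtering reduces to a membership test over each artifact's tag list. The Artifact type, the filter_by_tag helper, and every ID and tag other than this ADR's own schema tag are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """Hypothetical in-memory view of a taggable artifact."""
    id: str
    tags: list[str] = field(default_factory=list)


def filter_by_tag(artifacts: list[Artifact], tag: str) -> list[str]:
    """Return the IDs of artifacts that carry the given tag."""
    return [a.id for a in artifacts if tag in a.tags]


corpus = [
    Artifact("ADR-0040", ["schema"]),
    Artifact("ADR-9001", ["caching"]),            # made-up artifact
    Artifact("RFC-9002", ["schema", "parser"]),   # made-up artifact
    Artifact("WI-9003"),                          # untagged: never matches
]

schema_ids = filter_by_tag(corpus, "schema")
```

Because the vocabulary is controlled, one exact-match test is enough; no normalisation or fuzzy matching is needed at query time.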
Negative
- Friction to introduce a new tag — requires a config edit before first use (mitigation: this friction is intentional and the operation is a one-liner)
- Retroactive tagging of existing artifacts requires effort (mitigation: incremental adoption — untagged artifacts simply don’t appear in filtered queries)
- Schema change across all five taggable artifact types (mitigation: tags is optional with an empty-array default, so existing artifacts remain valid without modification)
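The empty-array default can be sketched as a reader that treats a missing field the same as an empty list. The dictionary layout and the tags_of helper are assumptions for illustration; the artifact IDs are made up.

```python
def tags_of(artifact: dict) -> list[str]:
    """Return an artifact's tags, treating a missing field as an empty list."""
    return list(artifact.get("govctl", {}).get("tags", []))


# Hypothetical parsed artifacts: one written before the field existed,
# one that has since been tagged.
legacy = {"govctl": {"id": "ADR-9001"}}
tagged = {"govctl": {"id": "ADR-9002", "tags": ["schema"]}}

legacy_tags = tags_of(legacy)
tagged_tags = tags_of(tagged)
```

Under this reading, an untagged artifact passes validation unchanged and simply never matches a tag filter.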
Neutral
- govctl tag becomes a new top-level command namespace for registry management
- The tag vocabulary will need periodic curation as the project evolves; orphaned or overly broad tags should be pruned
- Tags complement but do not replace full-text search; ADR-0039 remains a viable future option if content-level discovery is needed
- An RFC-0002 amendment is a prerequisite before implementation — this ADR authorizes the design direction but not the schema change
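The curation point above could be supported by a simple usage report. The sketch below counts how often each registered tag is used, so that orphaned tags (zero uses) and very broad tags (high counts) stand out. The tag_usage helper and the sample data are hypothetical.

```python
from collections import Counter


def tag_usage(allowed: set[str], artifact_tags: list[list[str]]) -> dict[str, int]:
    """Count uses of each registered tag, including tags with zero uses."""
    counts = Counter(tag for tags in artifact_tags for tag in tags)
    return {tag: counts[tag] for tag in sorted(allowed)}


# Hypothetical registry and the tag lists of three artifacts.
allowed = {"caching", "parser", "schema"}
artifact_tags = [["schema"], ["schema", "parser"], []]

usage = tag_usage(allowed, artifact_tags)
orphaned = [tag for tag, n in usage.items() if n == 0]
```

What counts as overly broad is a project judgment; the report only supplies the numbers.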
Alternatives Considered
Controlled vocabulary: tags registered in gov/config.toml before use, enforced by govctl check. Lowercase kebab-case, flat list. (accepted)
- Pros: Prevents tag sprawl (typos and near-duplicates are caught at check time); registry is diffable and reviewable in PRs; tag list is enumerable, so agents and CLI completion can offer suggestions; removing a tag from the registry is an explicit, auditable decision
- Cons: Friction to add a new tag — requires a config edit before first use
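One way an enumerable tag list could feed suggestions is a closest-match lookup against the registry. The sketch below uses Python's standard difflib for this. The suggest helper, the cutoff, and the tag names are assumptions; this ADR does not specify how suggestions would be produced.

```python
import difflib


def suggest(tag: str, allowed: set[str], limit: int = 3) -> list[str]:
    """Return registered tags that closely resemble an unregistered one."""
    return difflib.get_close_matches(
        tag.lower(), sorted(allowed), n=limit, cutoff=0.6
    )


# Hypothetical registry and a misspelled tag.
allowed = {"caching", "parser", "schema"}
hints = suggest("cacheing", allowed)
```

A rejection message could then point the author at the nearest registered tag, which softens the registration friction without loosening the vocabulary.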
Free-form tags: any string can be used as a tag on any artifact. No registry. Tags are created implicitly on first use. (rejected)
- Pros: Zero friction — tag immediately without config changes
- Cons: Tag sprawl is inevitable (cache, caching, and Cache are all different tags); no way to enforce consistency across contributors; removing a stale tag requires finding and editing every artifact that uses it
- Rejected because: In a governed workflow, uncontrolled metadata defeats the purpose of structured artifacts. Tag sprawl would quickly make filtering unreliable.
No tags — improve search and filtering instead: rely on title grep, rendered markdown search tools (rg, qmd), or future FTS (ADR-0039) to find artifacts by content rather than adding structured metadata. (rejected)
- Pros: Zero schema changes (no new fields, no config section, no validation rules); no tagging discipline burden on authors
- Cons: Finding all artifacts related to a topic requires remembering the right search terms; no enumerable taxonomy, so agents cannot discover what categories exist; cross-cutting queries remain ad hoc and fragile
- Rejected because: Search finds text matches, not intentional categorization. Tags express author intent about which domain an artifact belongs to — a dimension that free-text search cannot reliably recover.