Ontology lifecycle
What happens to an ontology in Crosswalker
An ontology (framework, taxonomy, any structured knowledge system) goes through five lifecycle phases within the Crosswalker ecosystem. This isn’t a one-shot pipeline — it’s an ongoing lifecycle where ontologies are acquired, imported, enriched with evidence and crosswalks, maintained as they evolve, and shared with the community.
| Phase | What happens | Current state |
|---|---|---|
| Acquire | Get structured data from the outside world | CSV file picker in wizard |
| Import | Parse, transform, generate vault structure | CSV parser + generation engine complete; transforms 0% |
| Enrich | Link, crosswalk, attach evidence | Basic WikiLinks; typed links planned |
| Maintain | Handle updates, detect staleness, migrate versions | Research complete; implementation planned |
| Share | Export, report, contribute configs back | Not started |
The cycle repeats: shared configs make the next Acquire easier. Framework updates trigger Maintain, which loops back to Enrich. Evidence mapping lives in the Enrich phase. Ontology evolution lives in Maintain.
Acquire
What: Get structured data from the outside world into the system.
Current: File picker in the import wizard accepting CSV files from the local filesystem.
Planned:
- XLSX file selection with sheet picker
- JSON/JSONL file import
- Future: URL-based import (fetch a framework CSV from a public URL)
- Future: OSCAL catalog import (machine-readable framework format)
- Future: community config registry that bundles framework data with import configs
Decision point: How much do we automate acquisition? The progressive classification UX starts here — known frameworks could auto-fetch their data.
Import
What: Parse, transform, and generate — turn raw files into organized vault structure with metadata.
This phase has three sub-steps:
Parse
Convert raw file bytes into structured data. This step relies on established libraries — it’s a solved problem:
- PapaParse for CSV with streaming (files >5MB)
- xlsx package for Excel (installed, not yet integrated)
- Native JSON parsing (planned)
- Column analysis auto-detects types (hierarchy, ID, text, numeric, date, tags, URL)
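Column-type auto-detection can be sketched as a cascade of predicates over sampled cell values. The function below is illustrative only — the names, the ordering of checks, and the ID regex are assumptions, not Crosswalker’s actual analyzer:

```typescript
// Hypothetical column-type detector: classify a column from sample values.
type ColumnType = "id" | "numeric" | "date" | "url" | "tags" | "text";

function detectColumnType(samples: string[]): ColumnType {
  const cells = samples.filter((s) => s.trim() !== "");
  if (cells.length === 0) return "text";
  const all = (pred: (s: string) => boolean) => cells.every(pred);

  if (all((s) => /^https?:\/\//.test(s))) return "url";
  // Control-style IDs such as AC-2, AU-12, A.5.1
  if (all((s) => /^[A-Z]{1,4}[-.]?[\dA-Za-z.()-]+$/.test(s))) return "id";
  if (all((s) => !Number.isNaN(Number(s)))) return "numeric";
  if (all((s) => !Number.isNaN(Date.parse(s)))) return "date";
  if (all((s) => s.includes(","))) return "tags";
  return "text";
}

console.log(detectColumnType(["AC-1", "AC-2", "AU-12"])); // → "id"
```

The cascade order matters: URL and ID patterns are checked before the looser numeric and date parses so that values like “AC-1” are never misread as dates.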
Transform
Clean, normalize, and restructure parsed data. 24 transform types are defined (0% implemented). Research evaluated 14 engines; a custom build was chosen (~2KB bundle, Obsidian-native output, under 25ms). Optional escape hatches: Arquero, JSONata.
Per-framework transforms: hierarchical forward-fill, tag aggregation, ID normalization, preamble extraction. See helper functions and ChunkyCSV research.
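To make one of these concrete, hierarchical forward-fill can be sketched in a few lines: blank cells in hierarchy columns inherit the last non-blank value above them, which is how flattened framework spreadsheets usually encode parent groupings. The function and column names below are assumptions for illustration, not the shipped transform API:

```typescript
// Hypothetical forward-fill transform over parsed CSV rows.
type Row = Record<string, string>;

function forwardFill(rows: Row[], columns: string[]): Row[] {
  const last: Record<string, string> = {};
  return rows.map((row) => {
    const out: Row = { ...row };
    for (const col of columns) {
      if (out[col]?.trim()) last[col] = out[col]; // remember latest non-blank
      else if (last[col] !== undefined) out[col] = last[col]; // fill blank
    }
    return out;
  });
}

const rows = [
  { family: "Access Control", control: "AC-1" },
  { family: "", control: "AC-2" },
  { family: "Audit", control: "AU-1" },
];
const filled = forwardFill(rows, ["family"]);
// filled[1].family is "Access Control"; filled[2].family is "Audit"
```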
Generate
Produce folders, notes, frontmatter, and _crosswalker metadata. The generation engine is production-ready — folder hierarchies, YAML serialization, link formatting, template resolution, path sanitization. Will evolve to support FrameworkConfig v2 and metadata v2 tiers.
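A minimal sketch of what Generate produces per control — a note body with YAML frontmatter — assuming hypothetical field names and a toy serializer (the real engine additionally handles templates, link formatting, and path sanitization):

```typescript
// Toy YAML frontmatter generation for one imported control.
function toYamlValue(v: string | string[]): string {
  if (Array.isArray(v)) return `[${v.map((s) => JSON.stringify(s)).join(", ")}]`;
  return JSON.stringify(v);
}

function generateNote(fields: Record<string, string | string[]>, body: string): string {
  const yaml = Object.entries(fields)
    .map(([key, value]) => `${key}: ${toYamlValue(value)}`)
    .join("\n");
  return `---\n${yaml}\n---\n\n${body}\n`;
}

const note = generateNote(
  { id: "AC-2", framework: "NIST 800-53", tags: ["access-control"] },
  "# AC-2 Account Management",
);
```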
Enrich
What: Connect the imported ontology to other frameworks and to your evidence. This is the evidence mapping phase — the core value for GRC teams.
- Cross-framework crosswalking — generate typed WikiLinks between frameworks using matching modes (exact, array-contains, regex)
- Typed link syntax — framework_here.implements:: [[AC-2]] {"sufficient": true}
- Link insertion commands — “Insert framework link” with search modal and metadata form
- Evidence linking — attach policies, audit findings, and technical docs to framework controls with structured edge metadata
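The three matching modes named above can be sketched as a small discriminated union. The MatchMode shape is an assumption about how the config might express this, not the shipped API:

```typescript
// Hypothetical matcher for crosswalk generation: does a source control ID
// match a target field under a given matching mode?
type MatchMode =
  | { kind: "exact" }
  | { kind: "array-contains" }
  | { kind: "regex"; pattern: string };

function matches(source: string, target: string | string[], mode: MatchMode): boolean {
  switch (mode.kind) {
    case "exact":
      return target === source;
    case "array-contains":
      return Array.isArray(target) && target.includes(source);
    case "regex":
      return typeof target === "string" && new RegExp(mode.pattern).test(target);
  }
}
```

Each positive match would then be emitted as a typed WikiLink in the target note, in the syntax shown above.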
Ecosystem note: Obsidian Bases can query frontmatter but cannot traverse typed links. Edge metadata requires DataviewJS or Datacore. This is a known trade-off.
Enrichment is ongoing — you keep adding evidence links and crosswalks as your compliance posture evolves. It’s not a one-time step.
Maintain
What: Handle framework updates, version migration, stale crosswalk detection, and long-term data model resilience.
This is the novel contribution — no existing tool solves the ontology evolution meta-problem.
- EvolutionPattern taxonomy — classify how each ontology evolves (release cadence, breaking changes, ID stability, changelog format). Standalone spec; see research.
- Migration strategy engine — given old + new version → recommended SCD type + handling strategy
- Progressive classification UX — community pre-classified → guided wizard → auto-detect
- Constraint enforcement — lazy detection of orphaned notes, broken links, stale metadata
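The strategy-engine idea can be sketched as a pure decision function over an EvolutionPattern. The fields and strategy names below are assumptions drawn from the taxonomy described above, not a published spec:

```typescript
// Hypothetical mapping from how an ontology evolves to how a version
// bump should be handled.
interface EvolutionPattern {
  idStability: "stable" | "renumbered";
  breakingChanges: boolean;
}

type Strategy = "in-place-update" | "versioned-copy" | "guided-migration";

function recommendStrategy(p: EvolutionPattern): Strategy {
  if (p.idStability === "renumbered") return "guided-migration"; // IDs change: needs an explicit mapping step
  if (p.breakingChanges) return "versioned-copy"; // keep old version alongside new (SCD-style)
  return "in-place-update"; // additive release: safe to update notes in place
}
```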
Maintain loops back to Enrich — when a framework updates, crosswalk links may need updating, and evidence mappings need re-validation.
Share
What: Export, report, and contribute back to the community.
- Community config registry — share FrameworkConfig files so others don’t repeat the per-framework configuration work
- OSCAL export — machine-readable output for GRC tool integration
- Compliance dashboards — Bases views for gap analysis
- Report generation — exportable compliance reports for auditors
- Spec publication — the EvolutionPattern taxonomy as a standalone standard
Share loops back to Acquire — community-contributed configs make the next person’s import easier.
Lifecycle as architecture
The lifecycle maps to the layered architecture:
| Lifecycle phase | Architecture layer | Who owns it |
|---|---|---|
| Acquire | Integration (plugin/CLI) | Platform wrapper |
| Import (parse/transform/generate) | Library (SDK) | Core library + transform providers |
| Enrich | Library + Integration | Core library + Obsidian WikiLink API |
| Maintain | Spec + Library | Spec defines patterns, library implements |
| Share | Integration + Community | Plugin UI + config registry |
The config-as-code format (FrameworkConfig v2) configures every phase. Community framework drivers are JSON configs that tell the system how to handle a specific ontology across its entire lifecycle.
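To illustrate “configures every phase”, a framework driver might look roughly like the type below. Every field name here is an assumption for illustration; FrameworkConfig v2 is the authoritative schema:

```typescript
// Hypothetical shape of a community framework driver spanning the lifecycle.
interface FrameworkDriver {
  name: string;
  acquire: { source: "csv" | "xlsx" | "url"; location?: string };
  import: { transforms: string[]; idColumn: string };
  enrich: { crosswalkTargets: string[] };
  maintain: { releaseCadence: "annual" | "irregular"; idStability: "stable" | "renumbered" };
}

const exampleDriver: FrameworkDriver = {
  name: "NIST SP 800-53 rev5",
  acquire: { source: "csv" },
  import: { transforms: ["forward-fill", "id-normalize"], idColumn: "Control Identifier" },
  enrich: { crosswalkTargets: ["ISO 27001"] },
  maintain: { releaseCadence: "irregular", idStability: "stable" },
};
```

Because the driver is plain JSON-shaped data, it can be shared through the community registry (Share) and consumed by the wizard (Acquire) without code changes.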
Resources
- Architecture — component-level system design
- Layered architecture vision — spec → library → integrations
- Why Obsidian, why files — the filesystem-first decision
- Roadmap — which stages are being built when
- File-based graph databases — what the pipeline produces
- Consistency models — how the pipeline handles failure