
Ontology-web querying


Crosswalker is a general ontology-web query engine built on top of an Obsidian vault.

That sentence has three parts:

  1. Ontology-web: Crosswalker’s data model is a graph of concepts connected by typed predicates. Concepts can be controls, taxonomies, threats, library subjects, or biomedical terms; anything with stable identifiers and structured relationships qualifies. Predicates can be SKOS-style (broader, narrower, related), SSSOM-style mapping vocabulary (equivalent_to, narrower_match, closely_matches), STRM (Crosswalker’s 5-relationship vocabulary), or domain-specific.
  2. Query engine: a way to ask questions over that graph and get useful answers. The query engine has three layers: primitives (filter / project / traversal / closure / anti-join / pivot / aggregate), view shapes (table / list / pivot / graph / hierarchy / timeline), and recipes (marketplace instances that compose primitives with a shape).
  3. Built on top of an Obsidian vault: the storage tier is canonical Markdown with YAML frontmatter and wikilinks. This makes the data portable across tools, diffable in git, and syncable via any Obsidian sync mechanism.
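To make the first two parts concrete, here is a minimal sketch of a concept graph with typed predicates, plus the filter and traversal primitives. Everything in it is an assumption for illustration: the `Edge` class, the `filter_concepts` and `traverse` helpers, the identifier scheme, and the sample mappings are hypothetical and are not Crosswalker’s actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One typed relationship between two concepts (hypothetical shape)."""
    subject: str    # identifier of the source concept
    predicate: str  # typed relationship, e.g. "closely_matches"
    object: str     # identifier of the target concept


# Illustrative concepts keyed by a stable identifier, tagged with their ontology.
CONCEPTS = {
    "nist:AC-2": {"ontology": "nist-800-53", "label": "Account Management"},
    "iso:A.5.16": {"ontology": "iso-27001", "label": "Identity management"},
    "iso:A.5.18": {"ontology": "iso-27001", "label": "Access rights"},
}

# Illustrative mapping edges; not an authoritative crosswalk.
EDGES = [
    Edge("nist:AC-2", "closely_matches", "iso:A.5.16"),
    Edge("nist:AC-2", "related_to", "iso:A.5.18"),
]


def filter_concepts(concepts, ontology):
    """Filter primitive: identifiers of concepts belonging to one ontology."""
    return sorted(cid for cid, meta in concepts.items() if meta["ontology"] == ontology)


def traverse(edges, start, predicates=None):
    """Traversal primitive: one hop from `start`, optionally limited to some predicates."""
    return [
        e.object
        for e in edges
        if e.subject == start and (predicates is None or e.predicate in predicates)
    ]
```

Calling `traverse(EDGES, "nist:AC-2", {"closely_matches"})` follows only the strong mapping, while omitting the predicate set follows every outgoing edge.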

Compliance is the launch market — not the project’s identity


GRC compliance (NIST 800-53 ↔ ISO 27001 ↔ CIS ↔ SOC 2 ↔ HIPAA ↔ …) is Crosswalker’s launch market. It’s where the first canonical recipes ship, where the first user community lives, and where the project description on GitHub leads. The “Coverage Matrix” recipe is the canonical compliance instance.

But Crosswalker’s architecture is general-domain. The positioning recorded in the feedback_general_ontology_positioning.md memory note comes down to three rules:

  • Internal vocabulary stays general — ontology / concept / subject / object / predicate. The Tier 1 schema, the recipe grammar, the query primitives, the view shapes — all are ontology-neutral.
  • User-facing surfaces stay GRC-first — control / framework / evidence / coverage / audit. The README, the import wizard, the marketplace recipes lead with compliance vocabulary because that’s where launch users start.
  • The recipe layer is the bridge — recipes compose general primitives + shape into named domain-specific reports. “Coverage Matrix” is a recipe; the underlying view shape is pivot (mechanism-neutral); the underlying primitives are filter + traversal + anti-join + pivot.
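The composition named in the last bullet can be sketched as follows. This is an illustration under stated assumptions, not the recipe grammar itself: `coverage_matrix`, the tuple-shaped edges, and the sample controls are hypothetical, and the mappings are invented for the example.

```python
# Hypothetical (subject, predicate, object) mapping edges; not an authoritative crosswalk.
EDGES = [
    ("nist:AC-2", "closely_matches", "iso:A.5.16"),
    ("nist:AC-2", "related_to", "iso:A.5.18"),
    ("nist:AU-2", "equivalent_to", "iso:A.8.15"),
]
SOURCE = ["nist:AC-2", "nist:AU-2", "nist:CP-9"]      # rows of the matrix
TARGET = ["iso:A.5.16", "iso:A.5.18", "iso:A.8.15"]   # columns of the matrix


def coverage_matrix(edges, source, target):
    """Compose filter + traversal + pivot + anti-join into one report."""
    src, tgt = set(source), set(target)

    # filter: keep only edges running from a source control to a target control
    kept = [(s, p, o) for s, p, o in edges if s in src and o in tgt]

    # traversal: record which targets each source reaches in one hop, and how
    reached = {}
    for s, p, o in kept:
        reached.setdefault(s, {})[o] = p

    # pivot: one row per source, one column per target, predicate name in the cell
    matrix = {s: {t: reached.get(s, {}).get(t, "") for t in target} for s in source}

    # anti-join: source controls with no mapping edge at all
    gaps = [s for s in source if s not in reached]
    return matrix, gaps


matrix, gaps = coverage_matrix(EDGES, SOURCE, TARGET)
```

Nothing in the function mentions controls or frameworks; only the recipe name and the sample identifiers carry the compliance vocabulary.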

This matters because if we treat compliance as the project’s identity, we lock the architecture to compliance semantics. The 13-field junction-note schema, the 5-mechanism recipe grammar, the SSSOM/SKOS/STRM cross-domain vocabulary — all of these are durable specifically because they were designed general-domain from the start.

Cross-domain examples — where ontology-web querying applies


Crosswalker’s launch market is GRC, but the same engine applies anywhere there’s an ontology web. Examples:

| Domain | Example ontologies | Why ontology-web querying matters |
| --- | --- | --- |
| Compliance / GRC (LAUNCH) | NIST 800-53, ISO 27001, CIS Controls, SOC 2 TSC, HIPAA, PCI-DSS, NIST CSF, FedRAMP | Crosswalking control sets across frameworks; evidence linking; audit deliverables |
| Threat intelligence | MITRE ATT&CK, MITRE D3FEND, MITRE ENGAGE, CAPEC, CWE, CVE | Mapping threats to defenses; tactic ↔ technique ↔ mitigation traversal |
| Biomedical | OBO Foundry (Gene Ontology, ChEBI, MONDO disease ontology, UBERON anatomy), UMLS, SNOMED CT, ICD-10 | Cross-ontology mapping for clinical informatics; concept reconciliation across vocabularies |
| Library / information science | LCSH (Library of Congress Subject Headings), MeSH (Medical Subject Headings), Dewey Decimal, SKOS thesauri | Subject taxonomy alignment; hierarchical browsing; cross-vocabulary search |
| Ontology mapping research | NIST OLIR (Online Informative References), OxO2 (Ontology Cross-Reference Service), BioPortal | The discipline of crosswalking IS the use case |
| Cyber risk frameworks | FAIR, ISO 31000, NIST RMF, COBIT, OWASP risk-rating | Mapping risk semantics across enterprise frameworks |
| Privacy frameworks | GDPR, CCPA, NIST Privacy Framework, ISO 29100 | Cross-jurisdiction privacy control mapping |

For each of these domains, the same query primitives apply: filter concepts by ontology, traverse the mapping edges, compute the closure for transitive reachability, anti-join to find gaps, and pivot for cross-tabulation. The primitives don’t change; the recipes (and their domain naming) do.
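As a rough illustration of that last point, the sketch below runs one anti-join helper under two differently named recipes. The `anti_join` function, the `RECIPES` table, the `mitigated_by` predicate, and the sample edges are all hypothetical; they stand in for whatever the recipe layer actually defines.

```python
def anti_join(candidates, edges, predicates):
    """Return candidates that are not the subject of any edge with a listed predicate."""
    matched = {s for s, p, _o in edges if p in predicates}
    return [c for c in candidates if c not in matched]


# Two hypothetical recipes: same primitive, different domain naming and predicates.
RECIPES = {
    "Coverage gaps": {"equivalent_to", "closely_matches"},
    "Unmitigated techniques": {"mitigated_by"},
}

# Illustrative edges only; not authoritative mappings.
grc_edges = [("nist:AC-2", "closely_matches", "iso:A.5.16")]
threat_edges = [("attack:T1566", "mitigated_by", "attack:M1017")]

coverage_gaps = anti_join(
    ["nist:AC-2", "nist:CP-9"], grc_edges, RECIPES["Coverage gaps"]
)
unmitigated = anti_join(
    ["attack:T1566", "attack:T1059"], threat_edges, RECIPES["Unmitigated techniques"]
)
```

The GRC call reports the control with no strong mapping and the threat-intelligence call reports the technique with no mitigation edge, using identical code.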

The cross-domain vocabulary — SSSOM, SKOS, STRM, OLIR


Crosswalker uses W3C and standards-track vocabularies to express ontology mappings. This is what makes the architecture general — we didn’t invent the cross-domain semantics; we adopted what the ontology-mapping community has converged on.

| Standard | What it is | Where used in Crosswalker |
| --- | --- | --- |
| SSSOM (Simple Standard for Sharing Ontology Mappings) | TSV-shaped envelope for ontology mappings: subject/object/predicate/justification/confidence/author/license. Maintained by the Mapping Commons community. | Tier 1 junction-note frontmatter shape; recipe kind: crosswalk-edge validation envelope |
| SKOS (Simple Knowledge Organization System) | W3C standard for thesauri, taxonomies, classification schemes. Defines broader, narrower, related, exactMatch, closeMatch, broadMatch, narrowMatch. | Recipe graph_edges semantically aligned with SKOS broader/narrower for hierarchical predicates |
| STRM (Secure Trust Reference Mapping, Crosswalker-specific) | The 5-relationship predicate vocabulary settled in v0.1: equivalent_to, closely_matches, narrower_match, broader_match, related_to. SSSOM-shaped. | The closed predicate set Crosswalker emits at v0.1; user-facing surfaces show STRM names |
| OLIR (Online Informative References) | NIST’s submission-based catalog of crosswalks between cybersecurity frameworks. Cross-jurisdiction reuse. | Reference example for cross-framework mapping at scale; pending Ch 35 rerun considers OLIR-scale graph→tabular |
| OxO2 (EBI Ontology Cross-Reference Service) | EBI’s mapping catalog across biomedical ontologies | Reference architecture for ontology-mapping query engine; pending Ch 36 rerun considers OxO2’s query language |
| OWL transitive properties | Standard semantic web vocabulary for transitively-closed predicates | Conceptual basis for closure primitive; Crosswalker doesn’t ship OWL reasoner but borrows the semantics |
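A mapping record in this style might be checked roughly as follows. The field names follow SSSOM’s column names (`subject_id`, `predicate_id`, `object_id`, `mapping_justification`, `confidence`, `author_id`, `license`), and the predicate set is the five STRM names from the table. The `Mapping` class, the `validate` function, and the sample values are assumptions made for this sketch; they are not Crosswalker’s 13-field junction-note schema or its validator.

```python
from dataclasses import dataclass

# The closed STRM predicate set listed in the table above.
STRM_PREDICATES = frozenset(
    {"equivalent_to", "closely_matches", "narrower_match", "broader_match", "related_to"}
)


@dataclass(frozen=True)
class Mapping:
    """SSSOM-shaped mapping record (hypothetical subset of fields)."""
    subject_id: str
    predicate_id: str
    object_id: str
    mapping_justification: str
    confidence: float
    author_id: str
    license: str


def validate(m):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if not m.subject_id or not m.object_id:
        problems.append("subject_id and object_id are required")
    if m.predicate_id not in STRM_PREDICATES:
        problems.append(f"unknown predicate: {m.predicate_id}")
    if not 0.0 <= m.confidence <= 1.0:
        problems.append(f"confidence out of range: {m.confidence}")
    return problems


good = Mapping(
    subject_id="nist:AC-2",
    predicate_id="closely_matches",
    object_id="iso:A.5.16",
    mapping_justification="semapv:ManualMappingCuration",
    confidence=0.8,
    author_id="orcid:0000-0000-0000-0000",  # placeholder identifier
    license="https://creativecommons.org/licenses/by/4.0/",
)
```

Keeping the predicate set closed means a typo such as `closely_match` is rejected at validation time instead of silently creating a new edge type.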

Important non-claims to set expectations:

  • Not a triple store: we do not implement RDF storage or SPARQL. The ontology-web data model is implemented in Markdown frontmatter plus a sqlite-wasm sidecar, not in a triple store. (Whether to add SPARQL via Oxigraph-WASM as a Tier 3 option is open per Ch 33.)
  • Not an OWL reasoner: we borrow OWL semantics (transitive properties, equivalence) but don’t implement DL reasoning. Recipes can declare predicates as transitive and closure follows them; that is the limit of what we infer.
  • Not a knowledge-graph database product: Neo4j, Stardog, GraphDB, AllegroGraph, and TerminusDB are all valid alternatives for users who want server-side graph databases. Crosswalker’s value is vault-native ontology-web management, where your data stays in Obsidian and remains portable, diffable, and syncable.
  • Not a federated query engine across vaults: explicitly out of scope per the Ch 27 and Ch 28 anti-patterns. One ontology web per vault. Cross-vault federation may be revisited in v0.3+ if user demand emerges.
  • Not a replacement for GRC compliance tools: Crosswalker is the substrate on which GRC workflows can be expressed in Obsidian. It does not replace ticketing, evidence-collection automation, audit-readiness scoring engines, or controls-as-code platforms. It complements them by being the canonical structured-knowledge tier.
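The limit described in the OWL item, where closure follows only predicates that are declared transitive, can be sketched as a breadth-first walk. The `closure` function, the tuple-shaped edges, and the sample hierarchy are hypothetical; this is one plausible reading of the primitive, not its implementation.

```python
from collections import deque


def closure(edges, start, transitive_predicates):
    """Concepts reachable from `start` via edges whose predicate is declared transitive."""
    # Index only the edges the recipe has declared transitive.
    adjacency = {}
    for s, p, o in edges:
        if p in transitive_predicates:
            adjacency.setdefault(s, []).append(o)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            # Skip already-visited nodes so cycles terminate; never report the start itself.
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Illustrative hierarchy with one non-transitive edge and one cycle back to the start.
EDGES = [
    ("term:a", "broader", "term:b"),
    ("term:b", "broader", "term:c"),
    ("term:c", "broader", "term:a"),
    ("term:a", "related_to", "term:d"),
]

reachable = closure(EDGES, "term:a", {"broader"})
```

Because `related_to` is not in the declared set, `term:d` is never reached; no further inference is attempted beyond following the declared edges.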

The hard question: why build an ontology-web query engine on top of an Obsidian vault, when the ontology-mapping community has BioPortal / OLS / OxO2 / NIST OLIR as established platforms?

Three reasons:

  1. Vault-native = local-first + portable. Your ontology web lives as Markdown files you own. Diffable. Git-versionable. Sync-friendly. Survives any tool transition. BioPortal et al. are vendor-hosted; if they go away, you re-do the work.
  2. Knowledge graph + working notes in one tool. Obsidian users already have notes, daily journals, project plans, evidence dumps. Putting the ontology web in the same vault lets you wikilink between a control and the evidence note that proves it, between a SKOS taxonomy term and the research note that uses it. Cross-context queries become trivially answerable.
  3. AI-agent-readable substrate. Plain Markdown + YAML frontmatter is the universal substrate AI agents (Claude, ChatGPT, local models) can read and write without any specialized API. The MCP server pattern (kepano/obsidian-skills) makes the vault directly queryable by agents — and that vault includes ontology-web data alongside everything else the user has.

The tradeoff: Crosswalker won’t match a dedicated ontology-management platform on raw scale (BioPortal hosts 700+ ontologies; UMLS has 3.5M concepts). The substrate scaling question is open per Ch 33 and Ch 37. Crosswalker’s sweet spot is medium-scale (tens of ontologies, tens of thousands of concepts) where vault-native portability outweighs server-side scale.
