Query primitives

Crosswalker’s query engine separates concerns into three layers:

| Layer | What it is | Where vocabulary lives |
| --- | --- | --- |
| A: Query primitives (this page) | Mechanism-neutral irreducible operations (the verbs) | Here |
| B: View shapes | Mechanism-neutral visual presentations (the visuals) | view-shapes |
| C: Recipes | Compositions: primitives + view shape + params | Recipe registry; user-facing surfaces |

Recipes name domain-specific compositions like “Coverage Matrix” (filter NIST controls + traverse maps-to edge to ISO + anti-join evidence + Layer B pivot shape) — but every recipe decomposes into the same 8 primitives.

A fourth axis — mechanism (Bases-native / registerBasesView custom view / codeblock processor / materialized snapshot) — describes how a primitive actually executes in Obsidian. See §How primitives map to mechanism below.

| # | Primitive | What it does | Cross-domain precedent | Crosswalker mechanism |
| --- | --- | --- | --- | --- |
| 1 | filter | Restrict a row-set by a predicate over its attributes. “Concepts where framework == NIST.” | SPARQL FILTER, SQL WHERE, Datalog body literals, Bases filter | Bases-native (frontmatter filter); Tier 2 SQL WHERE |
| 2 | traverse | Hop along a typed edge: single-step OR transitive (depth=* subsumes the old standalone closure per Ch 29; same shape as SPARQL property paths :p+/:p*, Cypher [*1..n], Datalog recursion). “Follow maps-to from this concept (1 hop or transitively).” | SPARQL property path, SKOS broader/narrower, Cypher pattern, Datalog rule body | Tier 2 SQL via mappings table + recursive CTE (closureFromConcept shipped v0.1.5 P3); Obsidian metadataCache.resolvedLinks |
| 3 | bind | Add a derived column from a formula over existing columns. “Compute age_days = today - last_reviewed.” Required for evidence-freshness, predicate-normalization, and confidence-threshold queries. | SPARQL BIND / (?old AS ?new), SQL computed columns / AS, Datalog head expressions, pandas assign | Bases-native (formulas); Tier 2 SQL computed columns; view-internal compute |
| 4 | project | Choose which attributes to surface from a row. Thin output-shaping primitive distinct from bind. “Show only control_id, title, status.” | SPARQL SELECT, SQL SELECT col1, col2, Bases columns | Bases-native (column set); Tier 2 SQL projection |
| 5 | aggregate | Count / sum / avg / min / max / density / freshness over a row-set, optionally grouped. “Average mappings per control.” | SPARQL COUNT/AVG/MIN/MAX, SQL aggregate functions + GROUP BY, Bases summaries (1-D), SSSOM mapping-density metrics | Bases summaries (1-D); Tier 2 SQL aggregates (multi-D); view-internal computation |
| 6 | anti-join | “X without Y.” LEFT rows that have NO matching record in RIGHT. “Controls without evidence.” | SPARQL MINUS / FILTER NOT EXISTS, SQL LEFT JOIN ... NULL / EXCEPT, OLAP “gaps” report | Tier 2 SQL (Bases can’t express); load-bearing for sparse-relation views. Implemented in src/views/join-primitives.ts (Phase 5). |
| 7 | set-op | Union (∪), intersection (∩), or difference (⊖) of two row-sets keyed on a shared identifier. “Concepts in BOTH NIST and CIS”; “concepts in NIST union CIS.” Not derivable from filter + anti-join alone (anti-join is one-sided). | SPARQL UNION, SQL UNION/INTERSECT/EXCEPT, Codd ∪/∩/⊖, Datalog disjunction | Tier 2 SQL primarily; view-internal for small sets. Single parameterized set-op(left, right, mode). |
| 8 | diff | Versioned ontology delta: what changed between two snapshots? Returns typed change records: added concepts, removed concepts, changed concepts (with field-level diffs). Required for audit-trail (v0.1.8). | OWL-ecco, CODEX, DynDiff, git diff, Unix diff | Tier 2 SQL + version-addressing in query params. Implemented in src/views/diff-primitive.ts (Phase 6). |
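To make the traverse row concrete, here is a minimal in-memory sketch of how a single `depth` parameter can cover both the one-hop case and the transitive case. The `Edge` shape, the `traverse` signature, and the sample edges are assumptions for illustration only; per the table, Crosswalker's shipped path goes through Tier 2 SQL with a recursive CTE, which this sketch does not reproduce.

```typescript
// Hypothetical edge shape; not Crosswalker's actual mappings schema.
interface Edge {
  subject: string;
  predicate: string;
  object: string;
}

// Follow `predicate` edges outward from `start`.
// depth = 1 is a single hop; depth = "*" keeps hopping until no new
// nodes appear, which is the transitive closure.
function traverse(
  edges: Edge[],
  start: string,
  predicate: string,
  depth: number | "*" = 1,
): string[] {
  const maxDepth = depth === "*" ? Infinity : depth;
  const seen = new Set<string>([start]); // guards against cycles
  const reached: string[] = [];
  let frontier: string[] = [start];

  for (let hop = 0; hop < maxDepth && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const e of edges) {
      if (e.predicate !== predicate) continue;
      if (!frontier.includes(e.subject) || seen.has(e.object)) continue;
      seen.add(e.object);
      reached.push(e.object);
      next.push(e.object);
    }
    frontier = next;
  }
  return reached;
}

const edges: Edge[] = [
  { subject: "a", predicate: "maps-to", object: "b" },
  { subject: "b", predicate: "maps-to", object: "c" },
  { subject: "c", predicate: "maps-to", object: "a" }, // cycle back to start
  { subject: "a", predicate: "broader", object: "z" }, // different edge type
];

const oneHop = traverse(edges, "a", "maps-to", 1); // ["b"]
const closure = traverse(edges, "a", "maps-to", "*"); // ["b", "c"]
```

The `seen` set plays the role that cycle protection plays in a recursive CTE: without it, the `c → a` edge would loop forever under `depth = "*"`.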

Net changes from the original 7-primitive candidate set (per Ch 29):

  • Dropped standalone closure — folded into traverse(depth=*). Same shape as SPARQL property paths.
  • Demoted pivot to Layer B (view shape, not value-producing primitive — see view-shapes).
  • Added bind — required for any computed column (evidence-freshness, predicate normalization).
  • Added set-op — required for union/intersection-style queries; cannot be composed from anti-join alone.
  • Added diff — required for v0.1.8 audit-trail and ontology-version delta queries.

Crosswalker has four execution mechanisms where these primitives run:

| Mechanism | Primitives that run here | When |
| --- | --- | --- |
| Bases-native (filter expressions, formulas, summaries) | filter, bind, project, aggregate (1-D) | Default for simple queries; desktop + mobile + (future) Publish |
| Tier 2 SQL helpers (plugin.queryConcepts/Crosswalk/Closure + Phase 5 join + Phase 6 set-op/diff) | filter, traverse, bind, project, aggregate, anti-join, set-op, diff | When Bases can’t express the operation (joins, recursion, anti-joins, set-ops, deltas) |
| Custom Bases view (e.g. crosswalkerPivot via registerBasesView) | Whatever Layer B shape consumes Layer A output | When a custom rendering is the right Layer B presentation and Bases’ built-in views can’t express it |
| Materialized snapshot (_crosswalker/queries/<slug>/materialized/result.json) | All primitives; output is fully resolved before write | Audit-trail attestation; Publish parity; offline use. Deferred to v0.1.8 per D1 |

Recipes (Layer C) name a primitive composition + a Layer B view shape; the engine picks the right mechanism for each primitive based on what’s expressible at that layer.
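One way to read the mechanism table as a selection rule is sketched below. The capability sets are copied from the table; everything else is an assumption: the `pickMechanism` name, the decision to choose one mechanism per query rather than per primitive, and the "prefer Bases-native, otherwise fall back to Tier 2 SQL" policy. The source says only that the engine picks based on what is expressible.

```typescript
type Primitive =
  | "filter" | "traverse" | "bind" | "project"
  | "aggregate" | "anti-join" | "set-op" | "diff";

type Mechanism = "bases-native" | "tier2-sql";

// Capability sets taken from the mechanism table. Bases-native aggregate
// is 1-D only; this sketch does not model that restriction.
const SUPPORTS: Record<Mechanism, ReadonlySet<Primitive>> = {
  "bases-native": new Set<Primitive>(["filter", "bind", "project", "aggregate"]),
  "tier2-sql": new Set<Primitive>([
    "filter", "traverse", "bind", "project",
    "aggregate", "anti-join", "set-op", "diff",
  ]),
};

// Assumed policy: use Bases-native only when every primitive in the
// query is expressible there; otherwise run the query through Tier 2 SQL.
function pickMechanism(primitives: Primitive[]): Mechanism {
  const basesOk = primitives.every((p) => SUPPORTS["bases-native"].has(p));
  return basesOk ? "bases-native" : "tier2-sql";
}

const simpleQuery = pickMechanism(["filter", "project"]); // "bases-native"
const gapsQuery = pickMechanism(["filter", "anti-join"]); // "tier2-sql"
```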

Worked example: “Coverage gaps for NIST 800-53”

User question: “Which NIST 800-53 controls have NO evidence covering them?”

Decomposition into primitives:

  1. filter: concepts WHERE ontology_id == "nist-800-53" → all NIST controls
  2. anti-join: that row-set MINUS concepts WHERE EXISTS junction WHERE junction.subject == concept.curie AND junction.object.kind == "evidence" → controls without evidence

Mechanism: anti-join is not Bases-native, so this query runs through Tier 2 SQL. Result renders via:

  • A Bases-native table view (Bases reads the SQL result via plugin.queryAntiJoin())
  • The crosswalkerPivot view as a “gaps” filter
  • A materialized snapshot at _crosswalker/queries/<slug>/materialized/result.json for v0.1.8 audit
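The two-step decomposition can be sketched in memory as follows. This is an illustration under assumptions, not the Tier 2 SQL implementation in src/views/join-primitives.ts: the `antiJoin` signature, the `Concept` and `Junction` shapes, and the sample rows are all hypothetical, with field names borrowed from the decomposition above.

```typescript
// Hypothetical row shapes; field names follow the decomposition above.
interface Concept {
  curie: string;
  ontology_id: string;
}
interface Junction {
  subject: string;
  object: { kind: string };
}

// Keep LEFT rows whose key has no match among the RIGHT keys ("X without Y").
function antiJoin<L, R>(
  left: L[],
  right: R[],
  leftKey: (row: L) => string,
  rightKey: (row: R) => string,
): L[] {
  const matched = new Set(right.map(rightKey));
  return left.filter((row) => !matched.has(leftKey(row)));
}

const concepts: Concept[] = [
  { curie: "nist:AC-1", ontology_id: "nist-800-53" },
  { curie: "nist:AC-2", ontology_id: "nist-800-53" },
  { curie: "cis:1.1", ontology_id: "cis-v8" },
];
const junctions: Junction[] = [
  { subject: "nist:AC-1", object: { kind: "evidence" } },
  { subject: "nist:AC-2", object: { kind: "concept" } }, // a mapping, not evidence
];

// Step 1: filter down to the NIST 800-53 controls.
const nistControls = concepts.filter((c) => c.ontology_id === "nist-800-53");

// Step 2: anti-join against the junctions that point at evidence.
const evidence = junctions.filter((j) => j.object.kind === "evidence");
const gaps = antiJoin(nistControls, evidence, (c) => c.curie, (j) => j.subject);
// gaps contains only nist:AC-2
```

Building the `matched` set once keeps the anti-join linear in the size of both inputs, which matters for the sparse-relation views the table calls out.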

Worked example: “Concepts in both NIST CSF and CIS”

User question: “Which concepts appear in BOTH NIST CSF and CIS Controls v8?”

Decomposition:

  1. filter: concepts WHERE ontology_id == "nist-csf" → row-set A
  2. filter: concepts WHERE ontology_id == "cis-v8" → row-set B
  3. set-op: setOp(A, B, mode: 'intersection', keyOf: <shared-curie>)

This query is inexpressible without set-op: anti-join is one-sided (A minus B), and filter cannot cross row-sets. Hence Ch 29’s verdict that set-op is a true Layer A primitive.
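A minimal sketch of a single parameterized set-op, mirroring the `setOp(A, B, mode, keyOf)` call in step 3. The positional signature, the `sharedId` field standing in for the shared key, and the sample rows are assumptions for illustration; the source does not specify how the shared identifier is stored.

```typescript
type SetOpMode = "union" | "intersection" | "difference";

// One function, three modes. Rows are matched on keyOf(row); when both
// sides carry the same key, the LEFT row is the one that is kept.
// Duplicate keys within one side are not deduplicated in this sketch.
function setOp<T>(
  left: T[],
  right: T[],
  mode: SetOpMode,
  keyOf: (row: T) => string,
): T[] {
  const leftKeys = new Set(left.map(keyOf));
  const rightKeys = new Set(right.map(keyOf));
  if (mode === "intersection") {
    return left.filter((row) => rightKeys.has(keyOf(row)));
  }
  if (mode === "difference") {
    return left.filter((row) => !rightKeys.has(keyOf(row)));
  }
  // union: all of left, plus the right rows whose key left does not have
  return [...left, ...right.filter((row) => !leftKeys.has(keyOf(row)))];
}

// Hypothetical rows; `sharedId` is an assumed cross-framework identifier.
interface ConceptRow {
  curie: string;
  ontology_id: string;
  sharedId: string;
}

const A: ConceptRow[] = [
  { curie: "csf:1", ontology_id: "nist-csf", sharedId: "asset-inventory" },
  { curie: "csf:2", ontology_id: "nist-csf", sharedId: "access-control" },
];
const B: ConceptRow[] = [
  { curie: "cis:1", ontology_id: "cis-v8", sharedId: "asset-inventory" },
  { curie: "cis:2", ontology_id: "cis-v8", sharedId: "secure-config" },
];

const inBoth = setOp(A, B, "intersection", (row) => row.sharedId);
const inEither = setOp(A, B, "union", (row) => row.sharedId);
```

Here `inBoth` holds the single `csf:1` row and `inEither` holds three rows, one per distinct `sharedId`.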

Worked example: “What changed in NIST CSF v1.1 → v2.0?”

User question: “Tell me which concepts were added, removed, or modified between two snapshots of an ontology.”

Decomposition:

  1. filter: concepts WHERE ontology_id == "nist-csf" AND version == "v1.1" → row-set BEFORE
  2. filter: concepts WHERE ontology_id == "nist-csf" AND version == "v2.0" → row-set AFTER
  3. diff: diff(BEFORE, AFTER, keyOf: curie) → {added: [], removed: [], changed: [{before, after, changedFields}]}

diff is the load-bearing primitive for v0.1.8 audit-trail attestations. Without it, ontology-version delta queries require ad-hoc composition that doesn’t survive snapshot semantics (the BEFORE and AFTER sets live in different versions of the same ontology, not different relations within one graph).
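The result shape in step 3 can be produced by a keyed comparison like the one below. This is a sketch under assumptions, not the code in src/views/diff-primitive.ts: the `Row`, `Change`, and `Delta` types, the shallow field comparison, and the `ex:` sample rows are hypothetical. The sample rows omit the version field on the assumption that the two filter steps have already separated the snapshots.

```typescript
type Row = Record<string, unknown>;

interface Change {
  before: Row;
  after: Row;
  changedFields: string[];
}
interface Delta {
  added: Row[];
  removed: Row[];
  changed: Change[];
}

function diff(before: Row[], after: Row[], keyOf: (row: Row) => string): Delta {
  const beforeByKey = new Map<string, Row>();
  for (const row of before) beforeByKey.set(keyOf(row), row);
  const afterKeys = new Set(after.map(keyOf));

  const delta: Delta = { added: [], removed: [], changed: [] };
  for (const row of after) {
    const prev = beforeByKey.get(keyOf(row));
    if (prev === undefined) {
      delta.added.push(row); // key exists only in AFTER
      continue;
    }
    // Shallow comparison per field; nested values would need a deep compare.
    const fields = Array.from(new Set([...Object.keys(prev), ...Object.keys(row)]));
    const changedFields = fields.filter((f) => prev[f] !== row[f]);
    if (changedFields.length > 0) {
      delta.changed.push({ before: prev, after: row, changedFields });
    }
  }
  // Keys that exist only in BEFORE.
  delta.removed = before.filter((row) => !afterKeys.has(keyOf(row)));
  return delta;
}

// Hypothetical snapshots, already filtered by version.
const BEFORE: Row[] = [
  { curie: "ex:A", title: "Alpha" },
  { curie: "ex:B", title: "Beta" },
];
const AFTER: Row[] = [
  { curie: "ex:A", title: "Alpha (revised)" },
  { curie: "ex:C", title: "Gamma" },
];

const delta = diff(BEFORE, AFTER, (row) => String(row["curie"]));
// added: ex:C, removed: ex:B, changed: ex:A with changedFields ["title"]
```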

Worked example: “Evidence older than 1 year”

User question: “Which evidence records have a last_reviewed date older than 365 days?”

Decomposition:

  1. filter: junctions WHERE object.kind == "evidence" → all evidence junctions
  2. bind: bind('age_days', row => today - row.last_reviewed) → adds computed column
  3. filter: bind_result WHERE age_days > 365 → stale evidence

bind is required here. Without it, you can’t introduce computed dimensions without changing the source data (ETL would have to pre-compute the age field — a violation of the Layer A “transform at query time” contract).
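The filter, bind, filter chain can be sketched as below. The `bind` helper's signature, the `Junction` shape, the ISO-string date format for `last_reviewed`, and the sample rows are assumptions for illustration. `today` is pinned to a fixed date so the example is reproducible; a real query would use the current date.

```typescript
// Hypothetical junction shape; last_reviewed is assumed to be an ISO date string.
interface Junction {
  subject: string;
  object: { kind: string };
  last_reviewed: string;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return new rows extended with one computed column; source rows are not mutated.
function bind<T extends object, K extends string, V>(
  rows: T[],
  name: K,
  fn: (row: T) => V,
): Array<T & { [P in K]: V }> {
  return rows.map((row) => {
    const extended = { ...row, [name]: fn(row) };
    return extended as unknown as T & { [P in K]: V };
  });
}

const today = Date.parse("2025-01-01"); // fixed for reproducibility

const junctions: Junction[] = [
  { subject: "ev:1", object: { kind: "evidence" }, last_reviewed: "2024-11-01" },
  { subject: "ev:2", object: { kind: "evidence" }, last_reviewed: "2023-06-01" },
  { subject: "map:1", object: { kind: "concept" }, last_reviewed: "2020-01-01" },
];

// Step 1: filter to evidence junctions.
const evidence = junctions.filter((j) => j.object.kind === "evidence");

// Step 2: bind the computed age_days column at query time.
const withAge = bind(evidence, "age_days", (row) =>
  Math.floor((today - Date.parse(row.last_reviewed)) / MS_PER_DAY),
);

// Step 3: filter on the computed column.
const stale = withAge.filter((row) => row.age_days > 365);
// stale contains only ev:2
```

Because `bind` copies each row rather than writing `age_days` back, the source data stays untouched, which is the "transform at query time" contract described above.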

| Looks like a primitive but isn’t | Why not | What it actually is |
| --- | --- | --- |
| Sort / order by | Display concern, not value-producing | Layer B view-shape config |
| Limit / pagination | Display concern | Layer B |
| Inner / outer / left / right join (positive sense) | Decomposes into traverse (1-hop join via predicate) + filter | Compositional; Phase 5’s join-primitives.ts implements these as a unified executeJoin for performance, but algebraically they reduce to traverse + filter |
| Pivot (2D crosstab) | Demoted to Layer B per Ch 29: presentation, not value-producing | Layer B view shape; consumes Layer A row-set output |
| Equivalence / OWL reasoning | Domain-specific to OWL; out of scope (Ch 29 explicit non-goal) | External reasoner’s output becomes input skos:exactMatch triples to traverse |
| Rank / similarity scoring | Embedding-derived; same shape as bind (computed column) | Use bind('similarity', vectorFn) then filter + Layer B sort |
| Window functions | Presentation concern over a result set | Layer B (analytics over a result) |
| Constraint propagation / satisfaction | Reasoning, not querying | Out of scope per Ch 29 |
| Subquery / nesting | Property of the algebra (closed under composition), not a primitive | The algebra by definition supports it |
| Federation / SERVICE | Crosswalker is a single-vault tool; cross-vault is a deployment concern | Out of scope |
| CONSTRUCT / graph-output | Output-shape concern over a query result | Layer C (serialization); exporters in v0.1.7 |

The 8 primitives are closed under composition: every primitive takes row-sets to row-sets, so the output of one can always feed the next and primitives can be chained arbitrarily. This closure property is the formal sense in which the algebra is complete:

filter → row-set
traverse → row-set
bind → row-set (extended with new column)
project → row-set (with subset of columns)
aggregate → row-set (often 1-row; group-by produces N-rows)
anti-join → row-set (subset of left)
set-op → row-set (union/intersection/difference of two sets)
diff → row-set (3-record structure: added/removed/changed)

Any “query” is a directed acyclic graph of primitive applications terminating at a Layer B view shape (or a materialized JSON snapshot).
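Closure under composition is what lets a query be assembled from small row-set-to-row-set steps. The sketch below shows a linear chain only; the `Step` type, the `pipe` helper, the curried `filter` and `project`, the `countBy` stand-in for a grouped aggregate, and the sample rows are all assumptions for illustration. Two-input primitives (anti-join, set-op, diff) would take their second row-set as a parameter, which is where the chain becomes a DAG.

```typescript
type Row = Record<string, unknown>;
type Step = (rows: Row[]) => Row[];

// Because every step maps a row-set to a row-set, steps compose freely.
const pipe = (...steps: Step[]): Step => (rows) =>
  steps.reduce((acc, step) => step(acc), rows);

const filter = (pred: (row: Row) => boolean): Step => (rows) => rows.filter(pred);

const project = (...cols: string[]): Step => (rows) =>
  rows.map((row) => {
    const out: Row = {};
    for (const c of cols) out[c] = row[c];
    return out;
  });

// Grouped count: one output row per distinct value of `col`.
const countBy = (col: string): Step => (rows) => {
  const counts = new Map<string, number>();
  for (const row of rows) {
    const key = String(row[col]);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts, ([key, count]) => ({ [col]: key, count }));
};

// Hypothetical control rows.
const controls: Row[] = [
  { control_id: "c1", framework: "NIST", status: "implemented" },
  { control_id: "c2", framework: "NIST", status: "planned" },
  { control_id: "c3", framework: "NIST", status: "implemented" },
  { control_id: "c4", framework: "CIS", status: "implemented" },
];

const statusCounts = pipe(
  filter((row) => row["framework"] === "NIST"),
  project("control_id", "status"),
  countBy("status"),
)(controls);
// [{ status: "implemented", count: 2 }, { status: "planned", count: 1 }]
```

The final row-set is what a Layer B view shape, or a materialized snapshot, would consume.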

Per Commitment #5 (runtime-agnostic recipe schema), the 8 primitives are mechanism-neutral. The same recipe declaring traverse(depth=*) + anti-join can be evaluated by:

  • Crosswalker’s bundled TypeScript engine (today, v0.1.6+)
  • An external Python CLI (Path C, deferred to v0.5+)
  • An MCP server delegating to a SQL backend
  • A future Datalog runtime (Nemo / Soufflé)

The primitives are the portable contract. Recipe authors don’t pick a runtime — they pick verbs.
