Query primitives
What this page is
Crosswalker’s query engine separates concerns into three layers:
| Layer | What it is | Where vocabulary lives |
|---|---|---|
| A — Query primitives (this page) | Mechanism-neutral irreducible operations (the verbs) | Here |
| B — View shapes | Mechanism-neutral visual presentations (the visuals) | view-shapes |
| C — Recipes | Compositions: primitives + view shape + params | Recipe registry; user-facing surfaces |
Recipes name domain-specific compositions like “Coverage Matrix” (filter NIST controls + traverse maps-to edge to ISO + anti-join evidence + Layer B pivot shape) — but every recipe decomposes into the same 8 primitives.
A fourth axis — mechanism (Bases-native / registerBasesView custom view / codeblock processor / materialized snapshot) — describes how a primitive actually executes in Obsidian. See §How primitives map to mechanism below.
The 8 primitives
| # | Primitive | What it does | Cross-domain precedent | Crosswalker mechanism |
|---|---|---|---|---|
| 1 | filter | Restrict a row-set by a predicate over its attributes. “Concepts where framework == NIST.” | SPARQL FILTER, SQL WHERE, Datalog body literals, Bases filter | Bases-native (frontmatter filter); Tier 2 SQL WHERE |
| 2 | traverse | Hop along a typed edge — single-step OR transitive (depth=* subsumes the old standalone closure per Ch 29; same shape as SPARQL property paths :p+/:p*, Cypher [*1..n], Datalog recursion). “Follow maps-to from this concept (1 hop or transitively).” | SPARQL property path, SKOS broader/narrower, Cypher pattern, Datalog rule body | Tier 2 SQL via mappings table + recursive CTE (closureFromConcept shipped v0.1.5 P3); Obsidian metadataCache.resolvedLinks |
| 3 | bind | Add a derived column from a formula over existing columns. “Compute age_days = today - last_reviewed.” Required for evidence-freshness, predicate-normalization, and confidence-threshold queries. | SPARQL BIND / (?old AS ?new), SQL computed columns / AS, Datalog head expressions, pandas assign | Bases-native (formulas); Tier 2 SQL computed columns; view-internal compute |
| 4 | project | Choose which attributes to surface from a row. Thin output-shaping primitive distinct from bind. “Show only control_id, title, status.” | SPARQL SELECT, SQL SELECT col1, col2, Bases columns | Bases-native (column set); Tier 2 SQL projection |
| 5 | aggregate | Count / sum / avg / min / max / density / freshness over a row-set, optionally grouped. “Average mappings per control.” | SPARQL COUNT/AVG/MIN/MAX, SQL aggregate functions + GROUP BY, Bases summaries (1-D), SSSOM mapping-density metrics | Bases summaries (1-D); Tier 2 SQL aggregates (multi-D); view-internal computation |
| 6 | anti-join | ”X without Y.” LEFT rows that have NO matching record in RIGHT. “Controls without evidence.” | SPARQL MINUS / FILTER NOT EXISTS, SQL LEFT JOIN ... NULL / EXCEPT, OLAP “gaps” report | Tier 2 SQL (Bases can’t express); load-bearing for sparse-relation views. Implemented in src/views/join-primitives.ts (Phase 5). |
| 7 | set-op | Union (∪), intersection (∩), or difference (⊖) of two row-sets keyed on a shared identifier. “Concepts in BOTH NIST and CIS”; “concepts in NIST union CIS.” Not derivable from filter + anti-join alone (anti-join is one-sided). | SPARQL UNION, SQL UNION/INTERSECT/EXCEPT, Codd ∪/∩/⊖, Datalog disjunction | Tier 2 SQL primarily; view-internal for small sets. Single parameterized set-op(left, right, mode). |
| 8 | diff | Versioned ontology delta — what changed between two snapshots? Returns typed change records: added concepts, removed concepts, changed concepts (with field-level diffs). Required for audit-trail (v0.1.8). | OWL-ecco, CODEX, DynDiff, git diff, Unix diff | Tier 2 SQL + version-addressing in query params. Implemented in src/views/diff-primitive.ts (Phase 6). |
Net changes from the original 7-primitive candidate set (per Ch 29):
- Dropped standalone `closure`: folded into `traverse(depth=*)`. Same shape as SPARQL property paths.
- Demoted `pivot` to Layer B (view shape, not value-producing primitive; see view-shapes).
- Added `bind`: required for any computed column (evidence-freshness, predicate normalization).
- Added `set-op`: required for union/intersection-style queries; cannot be composed from `anti-join` alone.
- Added `diff`: required for v0.1.8 audit-trail and ontology-version delta queries.
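One way to picture the resulting 8-primitive set is as a discriminated union, with the old standalone closure expressed as a `traverse` parameter. The TypeScript below is only an illustrative sketch: the type name `PrimitiveCall`, the helper `closureOf`, and every parameter field (`where`, `edge`, `depth`, `mode`, and so on) are assumptions made for this example, not Crosswalker's actual schema.

```typescript
// Hypothetical shapes: field names are illustrative, not Crosswalker's schema.
type Predicate = string;   // e.g. 'framework == "NIST"'
type Depth = number | "*"; // "*" means transitive

type PrimitiveCall =
  | { op: "filter"; where: Predicate }
  | { op: "traverse"; edge: string; depth: Depth }
  | { op: "bind"; column: string; formula: string }
  | { op: "project"; columns: string[] }
  | { op: "aggregate"; fn: "count" | "sum" | "avg" | "min" | "max"; groupBy?: string[] }
  | { op: "anti-join"; right: string; on: string }
  | { op: "set-op"; right: string; mode: "union" | "intersection" | "difference" }
  | { op: "diff"; before: string; after: string; keyOf: string };

// Every verb listed once, so tooling can enumerate the set.
const PRIMITIVE_OPS: PrimitiveCall["op"][] = [
  "filter", "traverse", "bind", "project",
  "aggregate", "anti-join", "set-op", "diff",
];

// The dropped standalone closure is just traverse with depth "*".
function closureOf(edge: string): PrimitiveCall {
  return { op: "traverse", edge, depth: "*" };
}
```

Because `pivot` lives in Layer B, it has no member in this union; a recipe would name it as a view shape instead.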
How primitives map to mechanism
Crosswalker has four execution mechanisms where these primitives run:
| Mechanism | Primitives that run here | When |
|---|---|---|
| Bases-native (filter expressions, formulas, summaries) | filter, bind, project, aggregate (1-D) | Default for simple queries; desktop + mobile + (future) Publish |
| Tier 2 SQL helpers (plugin.queryConcepts/Crosswalk/Closure + Phase 5 join + Phase 6 set-op/diff) | filter, traverse, bind, project, aggregate, anti-join, set-op, diff | When Bases can’t express the operation (joins, recursion, anti-joins, set-ops, deltas) |
| Custom Bases view (e.g. crosswalkerPivot via registerBasesView) | Whatever Layer B shape consumes Layer A output | When a custom rendering is the right Layer B presentation and Bases’ built-in views can’t express it |
| Materialized snapshot (_crosswalker/queries/<slug>/materialized/result.json) | All primitives; output is fully resolved before write | Audit-trail attestation; Publish parity; offline use. Deferred to v0.1.8 per D1 |
Recipes (Layer C) name a primitive composition + a Layer B view shape; the engine picks the right mechanism for each primitive based on what’s expressible at that layer.
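The selection logic can be sketched as a lookup against the table above. This is a minimal illustration under stated assumptions: `pickMechanism`, `BASES_NATIVE`, and the all-or-nothing rule (fall back to Tier 2 SQL as soon as one step is not Bases-expressible) are hypothetical, and the real engine may choose at a finer grain.

```typescript
type PrimitiveName =
  | "filter" | "traverse" | "bind" | "project"
  | "aggregate" | "anti-join" | "set-op" | "diff";

type Mechanism = "bases-native" | "tier2-sql";

// Primitives the table lists as expressible in Bases-native.
// aggregate is only 1-D there; this sketch ignores dimensionality.
const BASES_NATIVE: ReadonlySet<PrimitiveName> = new Set<PrimitiveName>([
  "filter", "bind", "project", "aggregate",
]);

// Assumed rule: stay Bases-native only if every step is expressible there.
function pickMechanism(steps: PrimitiveName[]): Mechanism {
  return steps.every((step) => BASES_NATIVE.has(step)) ? "bases-native" : "tier2-sql";
}

const simpleQuery = pickMechanism(["filter", "project"]);   // "bases-native"
const gapsQuery = pickMechanism(["filter", "anti-join"]);   // "tier2-sql"
```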
Worked example: “Coverage gaps for NIST 800-53”
User question: “Which NIST 800-53 controls have NO evidence covering them?”
Decomposition into primitives:
- filter: `concepts WHERE ontology_id == "nist-800-53"` → all NIST controls
- anti-join: that row-set MINUS `concepts WHERE EXISTS junction WHERE junction.subject == concept.curie AND junction.object.kind == "evidence"` → controls without evidence
Mechanism: anti-join is not Bases-native, so this query runs through Tier 2 SQL. Result renders via:
- A Bases-native table view (Bases reads the SQL result via `plugin.queryAntiJoin()`)
- The `crosswalkerPivot` view as a “gaps” filter
- A materialized snapshot at `_crosswalker/queries/<slug>/materialized/result.json` for v0.1.8 audit
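The filter and anti-join steps of this example can be sketched over in-memory arrays. This is not the Tier 2 SQL implementation: the `antiJoin` helper, the row shapes, and the sample data are all made up for illustration.

```typescript
// Hypothetical row shapes for this sketch only.
interface ControlRow { curie: string; ontology_id: string; }
interface JunctionRow { subject: string; objectKind: string; }

// anti-join: keep LEFT rows whose key has no match among the RIGHT keys.
function antiJoin<L>(left: L[], rightKeys: string[], keyOf: (row: L) => string): L[] {
  const seen = new Set(rightKeys);
  return left.filter((row) => !seen.has(keyOf(row)));
}

// Made-up sample data.
const allConcepts: ControlRow[] = [
  { curie: "nist:AC-1", ontology_id: "nist-800-53" },
  { curie: "nist:AC-2", ontology_id: "nist-800-53" },
  { curie: "iso:A.5.1", ontology_id: "iso-27001" },
];
const allJunctions: JunctionRow[] = [
  { subject: "nist:AC-1", objectKind: "evidence" },
  { subject: "nist:AC-2", objectKind: "concept" },
];

// Step 1 (filter): all NIST 800-53 controls.
const nistControls = allConcepts.filter((c) => c.ontology_id === "nist-800-53");

// Step 2 (anti-join): drop controls that are the subject of an evidence junction.
const subjectsWithEvidence = allJunctions
  .filter((j) => j.objectKind === "evidence")
  .map((j) => j.subject);
const coverageGaps = antiJoin(nistControls, subjectsWithEvidence, (c) => c.curie);
// coverageGaps holds only nist:AC-2
```

Note that nist:AC-2 has a junction, but not to evidence, so it still counts as a gap.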
Worked example: “Concepts in both NIST CSF and CIS”
User question: “Which concepts appear in BOTH NIST CSF and CIS Controls v8?”
Decomposition:
- filter: `concepts WHERE ontology_id == "nist-csf"` → row-set A
- filter: `concepts WHERE ontology_id == "cis-v8"` → row-set B
- set-op: `setOp(A, B, mode: 'intersection', keyOf: <shared-curie>)`
This query is inexpressible without set-op — anti-join is one-sided (A minus B), and filter cannot cross row-sets. Hence Ch 29’s verdict that set-op is a true Layer A primitive.
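A single parameterized set-op could look like the sketch below. The signature mirrors the `setOp(A, B, mode, keyOf)` call in the decomposition, but the implementation, the `sharedCurie` field, and the sample rows are assumptions for illustration. The `difference` mode is treated here as left-minus-right, in the spirit of SQL `EXCEPT`.

```typescript
type SetOpMode = "union" | "intersection" | "difference";

// Hypothetical in-memory set-op keyed on a shared identifier.
// Rows are assumed to be unique per key within each input.
function setOp<T>(left: T[], right: T[], mode: SetOpMode, keyOf: (row: T) => string): T[] {
  const leftKeys = new Set(left.map(keyOf));
  const rightKeys = new Set(right.map(keyOf));
  switch (mode) {
    case "intersection":
      return left.filter((row) => rightKeys.has(keyOf(row)));
    case "difference":
      return left.filter((row) => !rightKeys.has(keyOf(row)));
    case "union":
      return [...left, ...right.filter((row) => !leftKeys.has(keyOf(row)))];
  }
}

// Made-up rows; sharedCurie stands in for the shared key.
interface MappedRow { ontology_id: string; sharedCurie: string; }

const rowSetA: MappedRow[] = [
  { ontology_id: "nist-csf", sharedCurie: "ex:asset-inventory" },
  { ontology_id: "nist-csf", sharedCurie: "ex:risk-assessment" },
];
const rowSetB: MappedRow[] = [
  { ontology_id: "cis-v8", sharedCurie: "ex:asset-inventory" },
  { ontology_id: "cis-v8", sharedCurie: "ex:secure-config" },
];

const inBoth = setOp(rowSetA, rowSetB, "intersection", (r) => r.sharedCurie);
const inEither = setOp(rowSetA, rowSetB, "union", (r) => r.sharedCurie);
// inBoth has 1 row (ex:asset-inventory); inEither has 3 rows
```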
Worked example: “What changed in NIST CSF v1.1 → v2.0?”
User question: “Tell me which concepts were added, removed, or modified between two snapshots of an ontology.”
Decomposition:
- filter: `concepts WHERE ontology_id == "nist-csf" AND version == "v1.1"` → row-set BEFORE
- filter: `concepts WHERE ontology_id == "nist-csf" AND version == "v2.0"` → row-set AFTER
- diff: `diff(BEFORE, AFTER, keyOf: curie)` → `{added: [], removed: [], changed: [{before, after, changedFields}]}`
diff is the load-bearing primitive for v0.1.8 audit-trail attestations. Without it, ontology-version delta queries require ad-hoc composition that doesn’t survive snapshot semantics (the BEFORE and AFTER sets live in different versions of the same ontology, not different relations within one graph).
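The result shape `{added, removed, changed}` can be produced by a keyed comparison of the two snapshots. The sketch below is an assumption-laden illustration, not the code in src/views/diff-primitive.ts: the function name `diffSnapshots`, the `SnapshotRow` type, and the placeholder data are hypothetical.

```typescript
interface SnapshotRow { curie: string; [field: string]: string; }
interface ChangeRecord { before: SnapshotRow; after: SnapshotRow; changedFields: string[]; }
interface DiffResult { added: SnapshotRow[]; removed: SnapshotRow[]; changed: ChangeRecord[]; }

function diffSnapshots(
  before: SnapshotRow[],
  after: SnapshotRow[],
  keyOf: (row: SnapshotRow) => string,
): DiffResult {
  const beforeByKey = new Map<string, SnapshotRow>(
    before.map((row): [string, SnapshotRow] => [keyOf(row), row]),
  );
  const afterByKey = new Map<string, SnapshotRow>(
    after.map((row): [string, SnapshotRow] => [keyOf(row), row]),
  );
  const result: DiffResult = { added: [], removed: [], changed: [] };

  before.forEach((oldRow) => {
    const newRow = afterByKey.get(keyOf(oldRow));
    if (!newRow) {
      result.removed.push(oldRow);
      return;
    }
    // Compare every field that appears in either version of the row.
    const fieldNames = Object.keys({ ...oldRow, ...newRow });
    const changedFields = fieldNames.filter((f) => oldRow[f] !== newRow[f]);
    if (changedFields.length > 0) {
      result.changed.push({ before: oldRow, after: newRow, changedFields });
    }
  });
  after.forEach((newRow) => {
    if (!beforeByKey.has(keyOf(newRow))) result.added.push(newRow);
  });
  return result;
}

// Placeholder snapshots (not real NIST CSF content).
const snapshotBefore: SnapshotRow[] = [
  { curie: "ex:A", title: "Alpha" },
  { curie: "ex:B", title: "Beta" },
];
const snapshotAfter: SnapshotRow[] = [
  { curie: "ex:A", title: "Alpha revised" },
  { curie: "ex:C", title: "Gamma" },
];
const delta = diffSnapshots(snapshotBefore, snapshotAfter, (row) => row.curie);
// delta.added = [ex:C], delta.removed = [ex:B], delta.changed = [ex:A with changedFields ["title"]]
```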
Worked example: “Evidence older than 1 year”
User question: “Which evidence records have a last_reviewed date older than 365 days?”
Decomposition:
- filter: `junctions WHERE object.kind == "evidence"` → all evidence junctions
- bind: `bind('age_days', row => today - row.last_reviewed)` → adds computed column
- filter: `bind_result WHERE age_days > 365` → stale evidence
bind is required here. Without it, you can’t introduce computed dimensions without changing the source data (ETL would have to pre-compute the age field — a violation of the Layer A “transform at query time” contract).
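The bind-then-filter chain can be sketched as follows. The helper `bindColumn`, the `EvidenceRow` shape, the fixed `asOf` date, and the sample records are assumptions for illustration; the point is only that the computed column is added at query time and the source rows are left untouched.

```typescript
interface EvidenceRow { id: string; last_reviewed: string; } // ISO 8601 timestamp

// bind: return new rows carrying one extra computed column.
function bindColumn<T extends object, K extends string, V>(
  rows: T[],
  column: K,
  formula: (row: T) => V,
): Array<T & Record<K, V>> {
  return rows.map((row) => {
    const extra = { [column]: formula(row) } as Record<K, V>;
    return { ...row, ...extra };
  });
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;
// Fixed reference date so the example is deterministic; a real query would use "now".
const asOf = new Date("2025-01-01T00:00:00Z");

// Made-up evidence records.
const evidenceRows: EvidenceRow[] = [
  { id: "ev-1", last_reviewed: "2023-06-01T00:00:00Z" },
  { id: "ev-2", last_reviewed: "2024-10-01T00:00:00Z" },
];

const withAge = bindColumn(evidenceRows, "age_days", (row) =>
  Math.floor((asOf.getTime() - new Date(row.last_reviewed).getTime()) / MS_PER_DAY),
);
const staleEvidence = withAge.filter((row) => row.age_days > 365);
// staleEvidence holds only ev-1 (580 days old as of 2025-01-01)
```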
What’s NOT a primitive (and why)
| Looks like a primitive but isn’t | Why not | What it actually is |
|---|---|---|
| Sort / order by | Display concern, not value-producing | Layer B view-shape config |
| Limit / pagination | Display concern | Layer B |
| Inner / outer / left / right join (positive sense) | Decomposes into traverse (1-hop join via predicate) + filter | Compositional; Phase 5’s join-primitives.ts implements these as a unified executeJoin for performance, but algebraically they reduce to traverse + filter |
| Pivot (2D crosstab) | Demoted to Layer B per Ch 29 — presentation, not value-producing | Layer B view shape; consumes Layer A row-set output |
| Equivalence / OWL reasoning | Domain-specific to OWL; out of scope (Ch 29 explicit non-goal) | External reasoner’s output becomes input skos:exactMatch triples to traverse |
| Rank / similarity scoring | Embedding-derived; same shape as bind (computed column) | Use bind('similarity', vectorFn) then filter + Layer B sort |
| Window functions | Presentation concern over a result set | Layer B (analytics over a result) |
| Constraint propagation / satisfaction | Reasoning, not querying | Out of scope per Ch 29 |
| Subquery / nesting | Property of the algebra (closed under composition), not a primitive | The algebra by definition supports it |
| Federation / SERVICE | Crosswalker is a single-vault tool; cross-vault is a deployment concern | Out of scope |
| CONSTRUCT / graph-output | Output-shape concern over a query result | Layer C (serialization) — exporters in v0.1.7 |
Algebraic closure
The 8 primitives are closed under composition: every primitive takes row-sets to row-sets, so primitives can be chained arbitrarily. This closure property gives every query the same shape:
Any “query” is a directed acyclic graph of primitive applications terminating at a Layer B view shape (or a materialized JSON snapshot).
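Closure under composition can be shown in a few lines: if every step maps a row-set to a row-set, then any chain of steps is itself a step. The sketch below covers only the linear case (a chain is the simplest DAG), and the names `RowLike`, `Step`, and `compose` are hypothetical.

```typescript
type RowLike = { [column: string]: unknown };
type Step = (rows: RowLike[]) => RowLike[];

// Composing steps yields another Step, which is what "closed" means here.
function compose(...steps: Step[]): Step {
  return (rows) => steps.reduce((acc, step) => step(acc), rows);
}

// Three illustrative steps: filter, bind, project.
const onlyNist: Step = (rows) => rows.filter((r) => r["framework"] === "NIST");
const addLabel: Step = (rows) =>
  rows.map((r) => ({ ...r, label: `${r["framework"]}:${r["control_id"]}` }));
const keepLabel: Step = (rows) => rows.map((r) => ({ label: r["label"] }));

const labelledNist = compose(onlyNist, addLabel, keepLabel);

// Made-up input rows.
const inputRows: RowLike[] = [
  { framework: "NIST", control_id: "AC-1" },
  { framework: "CIS", control_id: "1.1" },
];
const outputRows = labelledNist(inputRows);
// outputRows is [{ label: "NIST:AC-1" }]
```

Since `labelledNist` has the same type as its parts, it can be passed back into `compose` to build larger queries.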
Engine-neutrality
Per Commitment #5 (runtime-agnostic recipe schema), the 8 primitives are mechanism-neutral. The same recipe declaring traverse(depth=*) + anti-join can be evaluated by:
- Crosswalker’s bundled TypeScript engine (today, v0.1.6+)
- An external Python CLI (Path C, deferred to v0.5+)
- An MCP server delegating to a SQL backend
- A future Datalog runtime (Nemo / Soufflé)
The primitives are the portable contract. Recipe authors don’t pick a runtime — they pick verbs.
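The "portable contract" idea can be illustrated by treating a recipe as plain data that any runtime can inspect. Everything in this sketch is hypothetical (the `PortableRecipe` and `QueryRuntime` shapes, the field names, and the two example runtimes); it is not Crosswalker's recipe schema.

```typescript
// All names here are assumptions for illustration.
interface RecipeStep {
  op: string;
  params: { [paramName: string]: string | number };
}
interface PortableRecipe {
  title: string;
  steps: RecipeStep[];
  viewShape: string;
}
interface QueryRuntime {
  label: string;
  supportedOps: string[];
}

// A recipe names verbs and a view shape, never a runtime.
const transitiveGaps: PortableRecipe = {
  title: "Transitive coverage gaps",
  steps: [
    { op: "traverse", params: { edge: "maps-to", depth: "*" } },
    { op: "anti-join", params: { right: "evidence" } },
  ],
  viewShape: "table",
};

// A runtime can evaluate a recipe if it implements every verb the recipe uses.
function canEvaluate(runtime: QueryRuntime, recipe: PortableRecipe): boolean {
  return recipe.steps.every((step) => runtime.supportedOps.includes(step.op));
}

const fullRuntime: QueryRuntime = {
  label: "full engine (hypothetical)",
  supportedOps: ["filter", "traverse", "bind", "project", "aggregate", "anti-join", "set-op", "diff"],
};
const filterOnlyRuntime: QueryRuntime = {
  label: "filter-only runtime (hypothetical)",
  supportedOps: ["filter", "project"],
};

// Plain data survives a JSON round trip, so it can cross process boundaries.
const roundTripped: PortableRecipe = JSON.parse(JSON.stringify(transitiveGaps));
```

Here `canEvaluate(fullRuntime, transitiveGaps)` is true and `canEvaluate(filterOnlyRuntime, transitiveGaps)` is false, while the recipe itself is unchanged in both cases.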
Related
Concept pillars:
- View shapes — Layer B (visual presentations)
- Ontology-web querying — positioning page
- Metadata ecosystem — Bases capabilities + limits
- System architecture — Layer 4 (Query)
- Hierarchy primitives — emission-side primitives (Crosswalker’s render() function)
- Terminology
Decision logs:
- Bases query layer architecture synthesis: Settled #14 locks the 8-primitive set
- Phase 5 scope log: implements `anti-join` (primitive #6) + join sub-modes (inner/left/right/full)
- Query state location synthesis: where primitive outputs persist (Layout B+)
- v0.1.5 Tier 2 sidecar shipped: `closureFromConcept` recursive CTE (implements `traverse(depth=*)`)
Research deliverables:
- Ch 29 — 8-primitive validation — adversarial cross-reference to SPARQL/Datalog/OLAP/SKOS/SSSOM prior art; locked verdict above
- Ch 36 — Query language rerun — compositional language stack (YAML recipes → SQL recursive CTE → Bases YAML)
- Ch 33 — Multi-modal engine audit — how other engines expose primitive sets