
Direction commitments (TL;DR) — what's locked in, what's still researching

Created May 2, 2026 Updated Jun 1, 2026

Two more deliverables (Ch 18 + Ch 19) landed later on 2026-05-02 and produced a v0.1 build-target pivot. The third-wave commitments captured below are now back-pocket research (opt-in companion plugins for v1.0+); the actual v0.1 build target is in the v0.1 stack-pivot log.

Specifically: v0.1 ships Tier 1 (markdown + YAML STRM, in-memory JS Map) + Tier 2 (@sqlite.org/sqlite-wasm sidecar, ~600 KB) bundled together (~1.2 MB total). The third wave’s layered Tier 2 stack (DuckDB + Oxigraph + Nemo, ~5 MB), Tier 3 server matrix, and 4-tier audit model with T2 OTS default are reframed as researched and ready when needed, not v0.1 defaults. The architectural safety guarantee (Tier 2 and Tier 3 are projections of Tier 1, fully recoverable) is now explicit and load-bearing.

The third-wave research is not invalidated; it’s reframed at the build-target boundary. Sections below remain valid as research direction; for what gets built first, see the v0.1 pivot log.

Status at a glance

#	Item	Status	Where
1	Tier 2 layered stack (DuckDB-WASM + Oxigraph + Nemo) — confirmed; extended with Tier 2-Lite alternate + Comunica federation add-on by Ch 14	✅ Confirmed	§2.1
2	Tier 1 audit-trail — 4-tier model adopted by Ch 15; T2 OpenTimestamps default; git stack repositioned as one of three T3 options	✅ Confirmed	§3.1
3	Datalog (Nemo) for SSSOM derivation — placement explained	✅ Committed	§2.2
4	Sigstore/gitsign as configurable alternative (now scoped to T3 architecture C)	✅ Committed	§2.3
5	SLSA targeting L1→L2→L3 — explanation added	✅ Committed	§2.4
6	Materialized-folder Tier 1 generator	✅ Committed	§2.5
7	Tier 3 default — flipped by Ch 16: Apache Jena Fuseki primary + oxigraph-server same-API alternative; AGE retained as fallback	✅ Confirmed	§3.2
8	TerminusDB as vault-mirror only — small-vendor (DFRNT) risk explicitly flagged by Ch 16	✅ Committed	§2.6
9	LinkML as canonical schema substrate	🅿️ Parked (idea bucket)	§4.1
10	OSCAL native support — wire format on export/import boundary	✅ Yes (after core); deferred to Phase 2+; document via `registry/oscal` page mapping OSCAL into Crosswalker mental model	§3.3
11	Grafeo evaluation — resolved by Ch 14 deliverable; track in long-horizon list with explicit migration triggers; do not adopt yet	✅ Resolved	§3.4

Score after third wave + sign-off: 9 confirmed (#1, #2, #3, #4, #5, #6, #7, #8, #10). 1 parked (#9 LinkML). 1 resolved-track (#11 Grafeo). #10 OSCAL deferred to Phase 2+ but committed-in-principle.

See the third-wave architectural shifts log for the deltas behind #1, #2, #7, and #11.

v0.1 build-target reframe: rows #1 (Tier 2 layered stack), #2 (Tier 1 audit-trail T2 OTS default), #4 (Sigstore/gitsign Tier 3 audit option), #7 (Tier 3 default), #8 (TerminusDB vault-mirror), and #10 (OSCAL native) are all back-pocket research, opt-in companion plugins, or future phases, not the v0.1 build target. The v0.1 stack is in the v0.1 stack-pivot log §3. Rows #3 (Datalog placement), #5 (SLSA targeting), #6 (materialized-folder generator), #9 (LinkML parked), and #11 (Grafeo) are unaffected by the v0.1 pivot.

§2 Confirmed commitments

2.1 Tier 2 layered stack — DuckDB-WASM + Oxigraph + Nemo (confirmed)

Confirmed by Challenge 14 deliverable — see third-wave log §2. None of the seven candidate engines (Grafeo, Minigraf, CozoDB, SurrealDB-WASM, Comunica, cr-sqlite, sqlite-wasm) succeeded at “collapse to one engine”; SurrealDB busted the bundle budget; Grafeo and Minigraf earned watchlist slots with explicit migration triggers.

What Ch 14 added:

Tier 2-Lite alternate stack (sqlite-wasm + sqlite-vec + simple-graph + recursive-CTE; ~1.5 MB compressed) for Obsidian Mobile / low-end / restricted-CSP environments. SSSOM rule subset and scale ceiling need a dedicated brief — listed as Ch 18 candidate.
Comunica + N3 + HDT federation add-on (~250–300 KB gzipped) for cross-vault, cross-org, external SPARQL endpoint queries. Genuinely additive — Oxigraph stays primary for local queries.

What was chosen and why (with alternatives that lost)

Analytical / SQL surface — DuckDB-WASM

	DuckDB-WASM (chosen)	Polars-WASM	sql.js	DataFusion-WASM	ClickHouse-local
Bundle	~3.2 MB Brotli (lazy-loaded)	tens of MB (Pyodide)	~1.5 MB	~10s of MB	n/a (no WASM)
License	MIT	MIT	Public domain	Apache-2.0	Apache-2.0
Project health	DuckDB Foundation, weekly releases	Alpha; “not for production”	Active but slow	Experimental WASM playground	Active; no WASM port
Joins/pivots	Native PIVOT/UNPIVOT/window	Native	Limited	Yes	Yes
Recursive CTE	Yes (USING KEY since May 2025)	No	Yes (basic)	Yes	n/a
Verdict	Pick	Reject (alpha)	Fallback only	Reject (size + experimental)	Reject (no WASM)

RDF / SPARQL surface — Oxigraph-WASM

	Oxigraph-WASM (chosen)	Comunica + N3 + HDT	Apache Jena Fuseki	GraphDB (Ontotext)	Stardog
Bundle	~3-4 MB	~200 KB gzipped	n/a (JVM server)	n/a	n/a
License	Apache-2.0/MIT	Apache-2.0/MIT	Apache-2.0	Commercial	Commercial
SPARQL 1.1	Full + RDF 1.2 preview	Full + federation	Full + RDFS/OWL inference	Full + best OWL reasoning	Full + OWL+ICV
Browser/Embedded	Yes (in-memory only on Wasm)	Yes (TS-native, lighter)	No	No	No
Verdict	Pick (current)	Re-evaluate in Ch 14 — could be lighter	Tier 3 sidecar option	Reject (commercial)	Reject (commercial)

Datalog / SSSOM derivation engine — Nemo

	Nemo (chosen)	Soufflé	CozoDB	RDFox	Differential Datalog
License	Apache-2.0/MIT	UPL-1.0	MPL-2.0	Commercial	MIT (archived)
WASM	Yes (shipping)	No (C++→native)	Yes (in-memory only)	No (JVM/native)	Archived
Production validation	EBI’s OxO2 (1.16M mappings → 49.5K inferences in 17 min on a laptop)	Industry-grade for static analysis	Slowing in 2024–2025	Highest-quality OWL 2 RL reasoner	VMware archived
Provenance support	Native (“tracing”)	Yes	Yes	Best-in-class (why-provenance)	Yes
Verdict	Pick	Reject (no WASM)	Re-evaluate in Ch 14	Reject (commercial)	Reject (archived)

Why “layered” and not “single engine”?

Crosswalker has three different mathematics at Tier 2:

Tabular pivots (analyst spreadsheet) → SQL → DuckDB
Ontology semantics (RDF, SKOS, SSSOM standards) → SPARQL → Oxigraph
Logical derivation (SSSOM chain rules over crosswalk edges) → Datalog → Nemo

Forcing all three onto a single engine produces 30% great-fit code and 70% awkward workarounds.

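Caveat: Grafeo (evaluated in Challenge 14) claims to cover all three (SQL/PGQ + SPARQL + Cypher + GQL + vector). If that claim holds up over time, the layered stack could collapse to one engine. The current verdict is recorded in §3.4: track it, do not adopt it yet.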

2.2 Datalog (Nemo) for SSSOM chain-rule derivation

Datalog is a declarative logic-programming language — you write rules like “if A maps to B and B maps to C, then A maps to C with min confidence” once, and the engine derives all consequences. See the verbose Datalog glossary entry for the full explainer of why we use it instead of plain SQL.
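To make the quoted rule concrete, here is a minimal TypeScript sketch of what a Datalog engine does with it: a naive fixpoint that keeps composing edges until nothing new (or better) can be derived. This is not Nemo and not Nemo's rule syntax; the `Edge` shape and the `deriveClosure` name are assumptions made for illustration, and a real engine evaluates rules far more efficiently.

```typescript
// Hypothetical edge shape; real SSSOM rows carry many more columns.
interface Edge {
  subject: string;
  object: string;
  confidence: number;
}

const edgeKey = (s: string, o: string): string => `${s}\u0000${o}`;

// Naive fixpoint for: "if A maps to B and B maps to C,
// then A maps to C with min confidence".
function deriveClosure(asserted: Edge[]): Edge[] {
  const best = new Map<string, Edge>();
  for (const e of asserted) {
    const seen = best.get(edgeKey(e.subject, e.object));
    if (!seen || seen.confidence < e.confidence) {
      best.set(edgeKey(e.subject, e.object), e);
    }
  }
  let changed = true;
  while (changed) {
    changed = false;
    const current = Array.from(best.values());
    for (const ab of current) {
      for (const bc of current) {
        // Only compose edges that share a middle node; skip self-loops.
        if (ab.object !== bc.subject || ab.subject === bc.object) continue;
        const confidence = Math.min(ab.confidence, bc.confidence);
        const k = edgeKey(ab.subject, bc.object);
        const existing = best.get(k);
        if (!existing || existing.confidence < confidence) {
          best.set(k, { subject: ab.subject, object: bc.object, confidence });
          changed = true;
        }
      }
    }
  }
  return Array.from(best.values());
}
```

The loop terminates because every update strictly raises the confidence stored for a pair, and confidences are only ever drawn from the finite set of asserted values. Writing this by hand for every rule is the work the declarative rule replaces.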

Where Nemo lives in the architecture (web-of-webs mapping)

Nemo is not a query engine for live user queries. It is a build-pipeline tier between Tier 1 (canonical files) and Tier 2 (query surface):

Tier 1 (canonical files)
    │
    │  Markdown + SSSOM TSV
    │  (asserted crosswalk edges only)
    ▼
┌─────────────────────────────────────────────────────────┐
│  Build-pipeline tier  ← NEMO LIVES HERE                 │
│                                                         │
│  Nemo applies SSSOM chain rules:                        │
│    if A→[skos:exactMatch]→B  AND  B→[skos:closeMatch]→C │
│    then derive A→[skos:closeMatch]→C                    │
│    with provenance: derivation_path = [edge1, edge2]    │
│                                                         │
│  Output: NEW derived edges saved alongside              │
│  asserted edges in the canonical files                  │
└─────────────────────────────────────────────────────────┘
    │
    │  Asserted + derived edges (both in files)
    ▼
Tier 2 (query surface, in-Obsidian)
    │
    │  DuckDB-WASM queries asserted + derived edges as SQL
    │  Oxigraph queries the same graph as RDF/SPARQL
    │
    ▼
End user sees a complete crosswalk
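The chain rule in the diagram can be sketched as a single composition pass that records provenance. This is a TypeScript illustration under stated assumptions, not Nemo's implementation: the rule table contains only the one rule shown above, and `AssertedEdge`, `DerivedEdge`, and `applyChainRules` are hypothetical names. Only `derivation_path` comes from the diagram.

```typescript
interface AssertedEdge {
  id: string; // hypothetical edge identifier used for provenance
  subject: string;
  predicate: string;
  object: string;
}

interface DerivedEdge {
  subject: string;
  predicate: string;
  object: string;
  derivation_path: string[]; // ids of the edges that were composed
}

// Only the rule from the diagram; a real SSSOM rule set has more entries.
const CHAIN_RULES = new Map<string, string>([
  ["skos:exactMatch|skos:closeMatch", "skos:closeMatch"],
]);

// One composition pass over asserted edges (no fixpoint iteration here).
function applyChainRules(edges: AssertedEdge[]): DerivedEdge[] {
  const derived: DerivedEdge[] = [];
  for (const first of edges) {
    for (const second of edges) {
      if (first.object !== second.subject) continue;
      const predicate = CHAIN_RULES.get(`${first.predicate}|${second.predicate}`);
      if (predicate === undefined) continue;
      derived.push({
        subject: first.subject,
        predicate,
        object: second.object,
        derivation_path: [first.id, second.id],
      });
    }
  }
  return derived;
}
```

Because each derived edge names the asserted edges it came from, a reviewer can trace any inferred mapping back to the human-asserted rows that justify it.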

In web-of-webs terms

From the orientation log’s web-of-webs framing:

The source-ontology webs (NIST/CIS/MITRE/etc.) start with only their own internal edges (hierarchy, references)
Crosswalk edges between source-ontology webs are asserted by the user (or imported from SCF/OLIR)
Derived crosswalk edges are produced by Nemo composing pairwise mappings through chains — including chains that route through the optional pivot/spine web

So Nemo’s job: densify the crosswalk web by computing transitive closures over asserted edges using SSSOM chain rules. It runs as a build step (not on every query); derived edges are saved to canonical files alongside asserted ones.

This matches EBI’s OxO2 architecture exactly. Crosswalker is essentially “OxO2 for compliance frameworks.”

2.3 Sigstore/gitsign as configurable alternative

This applies to commit signing when git is in use. GPG/SSH stays the default; Sigstore is an optional swap-in for teams that want federated OIDC-backed signing (the cleanest path to SLSA L3).

Caveat: assumes git is in use at all — see §3.1 on whether that assumption holds in some environments.

2.4 SLSA targeting L1→L2→L3 (explained)

SLSA = Supply-chain Levels for Software Artifacts (slsa.dev/spec/v1.0/). Originally a software-build-integrity framework; we apply it analogically to compliance evidence pipelines.

The progression Crosswalker targets:

Version	SLSA Target	What’s required	Cost
v0.1	L1	Documented build process; emit a provenance file alongside each commit (just an in-toto SLSA-Provenance JSON).	Trivial — already implicit in git+commit-signing; just needs an in-toto Provenance predicate emitted on commit
v1.0	L2	Provenance generation runs in a “hosted” context (CI workflow), digitally signed with a key the user can’t forge.	Modest — requires a Crosswalker-managed pre-commit hook or CI workflow signing the provenance
v2.0+	L3	Hardened build with non-extractable signing keys (Fulcio’s ephemeral certs are the cleanest way). Requires gitsign + Fulcio + Rekor (Sigstore).	Significant — requires either a managed Crosswalker SaaS-style verifier service OR gitsign with OIDC; this is the architectural argument for Sigstore at v2.0+
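As a rough illustration of the L1 row, the sketch below builds an in-toto Statement carrying a SLSA v1 provenance predicate for one artifact. The `_type` and `predicateType` URIs and the overall field layout follow the public in-toto and SLSA v1.0 specifications; the `buildType` and `builder.id` URIs, the `externalParameters` content, and the function name are placeholders assumed for this example, not Crosswalker's actual values.

```typescript
import { createHash } from "node:crypto";

// Builds an unsigned provenance statement for a single artifact.
function buildProvenance(name: string, content: string, startedOn: string) {
  const sha256 = createHash("sha256").update(content).digest("hex");
  return {
    _type: "https://in-toto.io/Statement/v1",
    subject: [{ name, digest: { sha256 } }],
    predicateType: "https://slsa.dev/provenance/v1",
    predicate: {
      buildDefinition: {
        // Placeholder URI: an assumption, not a registered build type.
        buildType: "https://example.org/crosswalker/build/v0",
        externalParameters: { source: name }, // hypothetical parameter
        internalParameters: {},
        resolvedDependencies: [],
      },
      runDetails: {
        // Placeholder builder identity: also an assumption.
        builder: { id: "https://example.org/crosswalker/local-builder" },
        metadata: { startedOn },
      },
    },
  };
}

const statement = buildProvenance(
  "mappings/example.sssom.tsv", // hypothetical artifact path
  "abc",
  new Date().toISOString(),
);
console.log(JSON.stringify(statement, null, 2));
```

At L1 the statement only needs to exist alongside the commit. Signing it in a hosted context is what the L2 row adds.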

Why this matters: L1/L2/L3 is auditor vocabulary. Saying “we target SLSA Build L2 by v1.0” gives a SOC 2 / ISO 27001 auditor a familiar handle. It’s a credibility lever, not an engineering checklist.

2.5 Materialized-folder Tier 1 generator

The plugin auto-generates Bases-compatible folders containing pre-joined/merged data so users can browse cross-tabs that Bases cannot otherwise compute (Bases has no joins and no pivots). This commitment survives any Tier 2 engine choice because the concept is independent of engine selection.
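A minimal sketch of the idea, assuming a simple shape for controls and mappings: the join happens once at generation time, and the result is written into each note's YAML front matter so that a join-free viewer can still filter on it. The folder name, property names, and sample data are all hypothetical; the function returns path/content pairs instead of touching the vault.

```typescript
interface Control {
  id: string;
  framework: string;
  title: string;
}

interface Mapping {
  subject: string;
  predicate: string;
  object: string;
}

// Replace characters that are awkward in file names.
const safeName = (id: string): string => id.replace(/[^A-Za-z0-9._-]/g, "_");

// Pre-joins each control with its mapped targets and renders one note per control.
function materialize(
  controls: Control[],
  mappings: Mapping[],
  folder = "_materialized", // hypothetical folder name
): Map<string, string> {
  const byId = new Map<string, Control>();
  for (const c of controls) byId.set(c.id, c);

  const files = new Map<string, string>();
  for (const control of controls) {
    const targets: Control[] = [];
    for (const m of mappings) {
      const target = m.subject === control.id ? byId.get(m.object) : undefined;
      if (target) targets.push(target);
    }
    const frameworks = Array.from(new Set(targets.map((t) => t.framework)));
    // JSON strings and arrays are valid YAML flow values.
    const note = [
      "---",
      `control_id: ${JSON.stringify(control.id)}`,
      `framework: ${JSON.stringify(control.framework)}`,
      `mapped_to: ${JSON.stringify(targets.map((t) => t.id))}`,
      `mapped_frameworks: ${JSON.stringify(frameworks)}`,
      "---",
      "",
      `# ${control.title}`,
      "",
    ].join("\n");
    files.set(`${folder}/${safeName(control.id)}.md`, note);
  }
  return files;
}
```

Since the generated notes are derived entirely from canonical data, the folder can be deleted and regenerated at any time.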

2.6 TerminusDB as optional vault-mirror only

If a user wants Git-style branch/diff/merge over the curated crosswalk graph, TerminusDB can be deployed as a parallel governance database that reads from the canonical Markdown vault. It is not the system of record; not the default Tier 3.

§3 Needs more research (with new challenges spun up)

3.1 Tier 1 audit-trail — 4-tier model with OpenTimestamps T2 default

Resolved by Challenge 15 deliverable — see third-wave log §4.

What was committed by Ch 15

A 4-tier audit-trail model (T0 floor / T1 credible / T2 defensible / T3 court-defensible) with .audit/chain.jsonl as the universal substrate that works in both git and non-git modes.

Tier	Default?	Substrate
T0 Floor	—	Edit History plugin or Obsidian Sync version history (no cryptographic guarantee)
T1 Credible	—	`.audit/chain.jsonl` with prev_hash links + Ed25519 signatures, vault-anchored
T2 Defensible	✅ New default	T1 + OpenTimestamps `.ots` on signed chain checkpoints (free, decentralized, offline-buffered, license-free)
T3 Court-Defensible	—	T2 + (a) FRE 902(13) PDF + S3 Object Lock; or (b) eIDAS qualified TSA + W3C VC; or (c) Sigstore Rekor v2 + in-toto
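The T1 row (prev_hash links plus Ed25519 signatures) can be sketched with Node's built-in `crypto` module. This is an illustrative sketch, not the actual `.audit/chain.jsonl` format: apart from `prev_hash`, the field names (`seq`, `payload`, `signature`), the genesis value, and the serialization are assumptions.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface ChainEntry {
  seq: number;        // hypothetical field
  prev_hash: string;  // hash of the previous entry (named in the table above)
  payload: string;    // hypothetical field: what was recorded
  signature: string;  // base64 Ed25519 signature over seq, prev_hash, payload
}

const GENESIS = "0".repeat(64); // assumed marker for the first entry

const signedBytes = (e: { seq: number; prev_hash: string; payload: string }): Buffer =>
  Buffer.from(JSON.stringify([e.seq, e.prev_hash, e.payload]));

const entryHash = (e: ChainEntry): string =>
  createHash("sha256")
    .update(JSON.stringify([e.seq, e.prev_hash, e.payload, e.signature]))
    .digest("hex");

function appendEntry(chain: ChainEntry[], payload: string, privateKey: KeyObject): ChainEntry[] {
  const prev = chain.length > 0 ? chain[chain.length - 1] : undefined;
  const unsigned = {
    seq: chain.length,
    prev_hash: prev ? entryHash(prev) : GENESIS,
    payload,
  };
  // Ed25519 takes no separate digest algorithm, hence `null`.
  const signature = sign(null, signedBytes(unsigned), privateKey).toString("base64");
  return [...chain, { ...unsigned, signature }];
}

function verifyChain(chain: ChainEntry[], publicKey: KeyObject): boolean {
  let expectedPrev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    if (e.seq !== i || e.prev_hash !== expectedPrev) return false;
    const sig = Buffer.from(e.signature, "base64");
    if (!verify(null, signedBytes(e), publicKey, sig)) return false;
    expectedPrev = entryHash(e);
  }
  return true;
}

// Usage: one JSON object per line, as in a .jsonl file.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
let chain = appendEntry([], "mapping added", privateKey);
chain = appendEntry(chain, "mapping reviewed", privateKey);
const jsonl = chain.map((e) => JSON.stringify(e)).join("\n");
console.log(jsonl);
console.log(verifyChain(chain, publicKey));
```

Editing any earlier entry changes its hash and invalidates its signature, so verification fails from that point on. T2 then anchors checkpoints of this chain with OpenTimestamps, which this sketch does not cover.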

Where the Ch 08+13 git stack went

The signed-commit + RFC 3161 + S3 Object Lock + FRE 902(13) + in-toto stack from Ch 08+13 is kept as a first-class option but no longer the default. It is now Tier 3 architecture A — recommended for users who already have signed-commit workflows, deploy to AWS or another Object-Lock-capable provider, and whose auditors specifically request “WORM + 902(13) PDF”. For the great majority of GRC consultancy work, T2 with OpenTimestamps reaches a comparable evidentiary standard without requiring git.

What’s new

PQC migration plan: 2026 (Ed25519) → 2027 (dual-sign Ed25519 + ML-DSA-44) → 2030 (deprecate Ed25519-only) → 2032 (fully PQC), well ahead of NIST IR 8547’s 2035 deadline.
Single audit-ready badge with progressive disclosure: gray T0 / blue T1 / green T2 / gold T3, plus honest tier-floor messaging (“T0 — version history only” is not “audit-ready”).
External CLI verifier (crosswalker-verify) with zero Obsidian dependencies, runnable on the auditor’s machine.
Per-persona tier mapping — see third-wave log §4.4 for the full table (solo consultant US/EU, locked-down enterprise, federal/air-gapped, multi-tenant team, EU AI Act / DORA / NIS2).

3.2 Tier 3 stack — default flipped from AGE to Fuseki/oxigraph-server

Resolved by Challenge 16 deliverable — see third-wave log §3.

The flip

Profile	New Tier 3 default	Why
Default — small GRC team, ≤500k mappings	Apache Jena Fuseki	Apache TLP governance (multi-employer PMC, no key-person risk); ~2 decades of releases; SPARQL 1.1 + RDFS/OWL inference + SHACL; safest 5–10-year bet
Same-API lighter alternative	oxigraph-server	Architectural symmetry with Tier 2 (same engine, just `oxigraph serve`); single Rust binary or Docker container; smaller footprint than Fuseki’s JVM
Power user — multi-team, mixed SQL+graph, multi-million mappings	Layered Fuseki + DuckDB-on-server	Federated via SPARQL `SERVICE` and DuckDB `httpfs` / `postgres_scanner`. Crossover point: above ~250k mappings with mixed workloads
Postgres-standardized shop	Apache AGE (kept as supported fallback) or plain Postgres + JSONB + recursive CTEs	AGE is the only option that lives inside an existing Postgres; the boring SQL option remains viable for ≥90% of queries

Why AGE was demoted (not dropped)

The user’s concern was substantiated: sponsor pivot (Bitnine → SKAI Worldwide moved into AI advertising), the November 2024 PG 17.1 ABI break that hit AGE alongside TimescaleDB, slow per-PG-line release cadence (PG 18 support landed late 2025/early 2026), and Apache board minutes reporting “reduced activity year-over-year” with no new committers. AGE remains supported as a fallback because its killer feature — running graph queries inside an existing Postgres instance with shared transactions and indexes — has no substitute for Postgres-standardized environments.

Migration is a re-projection, not a translation

The architectural payoff of files-canonical: because mappings are canonically SSSOM (markdown + YAML in the vault), any database is by definition a projection of the canonical files, not the source of truth. AGE→Fuseki migration is “re-run Crosswalker’s SSSOM-to-RDF projector against the new engine,” not “translate AGE data to Fuseki data.” See third-wave log §3.6 for concrete steps.
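To illustrate why migration is a re-projection, here is a minimal sketch of an SSSOM-to-RDF projector that emits N-Triples. It is not Crosswalker's projector: the prefix map uses placeholder `example.org` namespaces for the framework prefixes, the function names are hypothetical, and only the three core SSSOM columns (`subject_id`, `predicate_id`, `object_id`) are handled.

```typescript
interface SssomRow {
  subject_id: string;
  predicate_id: string;
  object_id: string;
}

// The SKOS namespace is the real one; the framework namespaces are placeholders.
const PREFIXES = new Map<string, string>([
  ["skos", "http://www.w3.org/2004/02/skos/core#"],
  ["NIST", "https://example.org/nist/"],
  ["CIS", "https://example.org/cis/"],
]);

// Expands a CURIE such as "skos:closeMatch" into a full IRI.
function expandCurie(curie: string): string {
  const idx = curie.indexOf(":");
  const base = idx > 0 ? PREFIXES.get(curie.slice(0, idx)) : undefined;
  if (base === undefined) throw new Error(`Unknown prefix in CURIE: ${curie}`);
  return base + curie.slice(idx + 1);
}

// Deterministic projection: same canonical rows in, same triples out,
// regardless of row order or which engine loads the result.
function projectToNTriples(rows: SssomRow[]): string {
  const lines = rows.map(
    (r) =>
      `<${expandCurie(r.subject_id)}> <${expandCurie(r.predicate_id)}> <${expandCurie(r.object_id)}> .`,
  );
  return Array.from(new Set(lines)).sort().join("\n");
}
```

Under these assumptions, moving between engines means loading the same projected output into the new store. Nothing is read back out of the old database.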

Watch but do not adopt

HelixDB (AGPL + YC-stage + custom DSL); ArcadeDB (small contributor base, but Apache-2.0 and built-in MCP); SurrealDB (BSL license is the procurement blocker). Each is interesting; none is mature/governance-stable enough to sit under a small open-source GRC tool with a 5–10-year horizon today.

3.3 OSCAL native support — placement in the web-of-webs

Where OSCAL fits in the architecture

OSCAL is not an internal data model. It is a federally-recognized wire format for export/import:

Tier 1 (canonical Crosswalker vault)
    │
    │  Markdown + SSSOM TSV (internal source of truth)
    │
    ▼
═════════════════════════════════════════════════════════
EXPORT/IMPORT BOUNDARY  ← OSCAL LIVES HERE
═════════════════════════════════════════════════════════
    │
    │  Crosswalker exports source-ontology webs as OSCAL `catalog` JSON
    │  Crosswalker exports crosswalk edges as OSCAL `mapping` records
    │     (Control Mapping Model, currently NIST pre-release)
    │  Crosswalker exports evidence-link junction notes as
    │     `assessment-result/observation` records
    │
    │  AND vice versa (OSCAL imports into Crosswalker vault)
    │
    ▼
External GRC tooling, federal authorisation packages,
agency consumers (FedRAMP, ATO packages, etc.)

In web-of-webs terms

OSCAL is the wire format between Crosswalker’s web-of-webs and external federal GRC systems. It is not part of the web-of-webs itself; it is the edge of it — the export/import boundary where the internal SSSOM-flavored data is reshaped into NIST’s preferred JSON/XML/YAML formats.

In the orientation log’s web-of-webs diagram, OSCAL is the port through which:

Source-ontology webs ↔ NIST OSCAL catalogs (machine-readable control catalogs)
Crosswalk edges ↔ OSCAL Control Mapping Model records
Evidence-vault web ↔ OSCAL Assessment Results
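The first of these mappings (source-ontology web to OSCAL catalog) could look roughly like the sketch below. The top-level layout (`catalog` with `uuid`, `metadata`, and `controls`, each control with `id`, `title`, and `parts`) follows the published OSCAL catalog JSON model as far as this example goes, but the output has not been validated against the official schema. The `SourceControl` shape, the id normalization, and the pinned `oscal-version` value are assumptions.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical internal shape for one node of a source-ontology web.
interface SourceControl {
  id: string;
  title: string;
  statement: string;
}

function toOscalCatalog(
  title: string,
  version: string,
  controls: SourceControl[],
  now: Date = new Date(),
) {
  return {
    catalog: {
      uuid: randomUUID(),
      metadata: {
        title,
        "last-modified": now.toISOString(),
        version,
        "oscal-version": "1.1.2", // assumed target version
      },
      controls: controls.map((c) => {
        const id = c.id.toLowerCase(); // assumption: lower-case token ids
        return {
          id,
          title: c.title,
          parts: [{ id: `${id}_smt`, name: "statement", prose: c.statement }],
        };
      }),
    },
  };
}

const exported = toOscalCatalog("Example catalog", "0.1.0", [
  { id: "AC-2", title: "Account Management", statement: "Example statement text." },
]);
console.log(JSON.stringify(exported, null, 2));
```

The import direction would be the inverse function. A round-trip test would compare the vault content before export and after re-import.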

Why we’d promote it from “feature” to “architectural concern”

Consideration	Argument
FedRAMP RFC-0024	Mandates machine-readable authorisation packages by Sept 2026. Federal customers will require OSCAL.
Credibility multiplier	Even non-federal customers see OSCAL native support as evidence of auditor-grade data modeling.
Adjacent ecosystem	Many commercial GRC tools (Hyperproof, Drata, Vanta, AuditBoard, RegScale) already produce OSCAL output. Crosswalker as OSCAL bridge = useful glue.
No internal compromise	We don’t change SSSOM-internally; OSCAL is just a serializer/deserializer.

What “promotion” actually means

Add `crosswalker import oscal <file>` and `crosswalker export oscal --type {catalog,profile,ssp,ar,...}` commands
Make OSCAL round-trip a tested feature (not best-effort)
Add Foundation roadmap item for OSCAL native support

Decision (received 2026-05-02): ✅ yes, but only after the core is working. The user’s read on OSCAL: it sounds a lot like a custom schema we map to (i.e., a wire format on the export/import boundary, not an internal model). Treat it as that, document it in Crosswalker’s mental-model vocabulary, and don’t promote it to an architectural concern until the core SSSOM-internal pipeline is solid.

Action items:

Update reference/registry/oscal.mdx to map OSCAL into Crosswalker’s mental model (web-of-webs framing; map OSCAL-native terms to Crosswalker synonyms — catalog ↔ source-ontology web, Control Mapping Model ↔ crosswalk edges, Assessment Result ↔ junction notes / evidence-vault web)
Cross-link from any page that mentions OSCAL back to that page
Roadmap placement: deferred until core export/import boundary is well-defined; flagged for Phase 2+ rather than Foundation

3.4 Grafeo evaluation — resolved by Ch 14

Resolved. Folded into the Ch 14 deliverable §2.1. Verdict: track in long-horizon list with explicit migration triggers; do not adopt yet. Genuinely impressive surface area (LPG+RDF+six query languages+HNSW+CDC+WASM+IndexedDB persistence), but ~6 months old, v0.5.x, ~582 stars, single-sponsor (Supernovae), vendor-only benchmarks, no W3C SPARQL conformance proof. A 3-year survival probability of 50–60% is the honest estimate.

Migration trigger A spells out the conditions under which Grafeo would collapse the Tier 2 stack to a single engine.

§4 Parked

4.1 LinkML as canonical schema substrate (idea bucket)

Major architectural pivot. Ch 12b deliverable B made the strongest argument: LinkML auto-generates JSON Schema, OWL, SHACL, Pydantic, TypeScript from a single YAML; SSSOM is itself defined in LinkML; could be Crosswalker’s “Tier 0”.

Real benefit: decouples engines from schema authority. Every engine becomes a serializer/deserializer plugin against the canonical LinkML schema rather than a competing schema authority.

Cost: cascading commitment across every Crosswalker schema (StewardshipProfile, junction-note 13-field, FrameworkConfig, _crosswalker metadata, etc.).

Park for now. Spin up a future Challenge 17 if interest renews after Ch 14/15/16 land.

§5 Related

2026-05-02 direction third-wave shifts log — captures the deltas behind the third-wave updates to §2.1, §3.1, §3.2, §3.4 above
Ch 14 deliverable: Missed engines evaluation
Ch 15 deliverable: Audit-trail alternatives without external git tooling
Ch 16 deliverable: Tier 3 stack reconsideration
2026-05-02 direction log (bloated, full second wave) — research record for 9 fresh-agent deliverables across Ch 08–13. Don’t read for normal navigation.
05-01 commitments log — predecessor; pairwise+pivot, junction-notes-by-tier, StewardshipProfile rename, meta-schema commitment
05-01 orientation log — the web-of-webs framing referenced in §2.2 (Nemo placement) and §3.3 (OSCAL placement)
Active challenges — Ch 14 (Grafeo, user-driven), Ch 15 NEW (non-git audit trail), Ch 16 NEW (Tier 3 alternatives)
