
Direction — research wave (Challenges 08/09/10), critical gaps, and the roadmap reshape

Created May 2, 2026 Updated Jun 1, 2026

§1 Why this log exists
§2 Research waves landed
- §2.A First wave (Ch 08/09/10)
- §2.B Second wave (Ch 11/12/13)
§3 Critical assessment — Ch 10 gaps, Ch 09 minor, Ch 08 omissions, Ch 11/12/13 reads
§4 Research challenges status — 11 resolved + Ch 14/15/16 spun up
§5 Direction posture — superseded by TL;DR
§6 Phase plan refresh
§7 Proposed roadmap deltas — listed for review; not yet applied
§8 StewardshipProfile rename ripples — listed for review; not yet applied
§9 What’s still deferred
§10 Long-horizon ideas considered, not committed
§11 Related

This is the direction log signaled in the 05-01 decision log §6. It captures both research waves of 2026-05-02:

First wave (Ch 08/09/10): 3 deliverables; covered in §2.A and §3.1 to §3.3.
Second wave (Ch 11/12/13): 6 deliverables (Ch 11 produced 3 independent runs, Ch 12 produced 2, Ch 13 produced 1); covered in §2.B, §3.4 to §3.6, and §4.1 to §4.3.

§5 captures direction posture across four buckets: confirmed commitments (lock in immediately), convergent commitments pending user sign-off, partial-convergence items needing user input, and follow-on research items. §10 lists long-horizon ideas surfaced but not committed (LinkML, IPLD, Tier 1.5 compilation, etc.).

Roadmap edits in §7 and StewardshipProfile rename ripples in §8 remain listed for review only — not yet applied to the roadmap or affected files.

Why this log exists

Yesterday’s pair of logs (orientation AM, decisions PM) set the table: web-of-webs framing, five Foundation commitments, three new research challenges (08/09/10), the StewardshipProfile rename, the meta-schema lifecycle commitment (“Crosswalker eats its own dog food”), and an explicit deferral of roadmap edits to this log.

Today two research waves landed, totalling 9 fresh-agent deliverables:

First wave (Ch 08/09/10): 3 deliverables addressing the gaps surfaced by yesterday’s commitments — git audit-trail tenability, identifier strategy, graph→tabular bridging engine.
Second wave (Ch 11/12/13): 6 deliverables (Ch 11 produced three independent runs with slightly different recommendations on Tier 2/3 engine choice; Ch 12 produced two; Ch 13 produced one). Multi-agent convergence across the three Ch 11 runs is itself load-bearing evidence for the Tier 2 layered stack.

This log:

Summarizes both waves (§2.A first wave, §2.B second wave)
Gives critical reads of each (§3)
Marks the resolved challenges and spins up Challenge 14 for the engines surfaced during Ch 11 research (§4)
Sets out the direction posture: confirmed / pending / partial / follow-on (§5)
Refreshes the phase plan (§6)
Lists roadmap deltas for review (§7)
Lists StewardshipProfile rename ripples (§8), preserved from earlier today
Records what’s still deferred (§9)
NEW: catalogues long-horizon ideas considered but not committed (§10): LinkML, IPLD, Tier 1.5 compilation, AI-augmented mapping, etc.

§2.A First research wave (Ch 08/09/10)

Three fresh-agent research deliverables, all 2026-05-02 morning:

2.1 Challenge 08 — Git history as a compliance audit trail (verdict: augment, not replace)

Full deliverable: Ch 08: Is git history a tenable compliance audit trail? Brief: Challenge 08.

Headline finding: bare signed-commit + branch-protection git fails on four of five core audit-evidence standards (SAS 142 “susceptibility to management bias,” PCAOB AS 1105 IPE controls, ISO 27001 A.8.15 tamper-resistance against privileged insiders, SOX §802 WORM expectation). It satisfies the integrity leg via the Merkle DAG but fails the trusted-time and non-repudiation legs because the author controls the timestamp and the repository is internally written.
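
The integrity-versus-trusted-time distinction can be illustrated with a toy hash chain. This is a minimal sketch, not git's actual object format: `commit_id`, `build_chain`, and `verify_chain` are hypothetical names, and the chain hashes a simplified payload with SHA-256. It shows that a backdated entry still verifies, because the author supplies the timestamp, while a content edit breaks the chain.

```python
import hashlib


def commit_id(parent: str, author_time: str, content: str) -> str:
    """Hash a toy commit: the ID covers the parent ID, the claimed time, and the content."""
    payload = f"{parent}\n{author_time}\n{content}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(entries):
    """Build commits from (author_time, content) pairs, each linked to its parent by ID."""
    chain, parent = [], ""
    for author_time, content in entries:
        cid = commit_id(parent, author_time, content)
        chain.append({"id": cid, "parent": parent,
                      "author_time": author_time, "content": content})
        parent = cid
    return chain


def verify_chain(chain) -> bool:
    """Recompute every ID and parent link; any mismatch means the chain was altered."""
    parent = ""
    for c in chain:
        expected = commit_id(c["parent"], c["author_time"], c["content"])
        if c["parent"] != parent or c["id"] != expected:
            return False
        parent = c["id"]
    return True


honest = build_chain([("2026-05-02T09:00:00Z", "evidence link added")])
# The author picks the timestamp, so a backdated chain is internally consistent.
backdated = build_chain([("2025-01-01T09:00:00Z", "evidence link added")])
# Editing content without recomputing the ID is detected.
tampered = [dict(honest[0], content="evidence link removed")]
```

Here `verify_chain(honest)` and `verify_chain(backdated)` both succeed while `verify_chain(tampered)` fails, which is why the hardening list that follows adds an external time source rather than relying on the chain alone.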

Recommended Tier 1 hardening (all required, none deferred to Tier 2/3):

Signed commits on every evidence-link state change (SSH signing preferred over GPG)
Mandatory remote mirror — refuse to claim “audit trail” for an unmirrored vault
RFC 3161 trusted-timestamp receipts on every audit-relevant commit (free public TSAs make this operationally negligible)
gc.reflogExpire = never and gc.reflogExpireUnreachable = never
Built-in Audit Authenticity Report export (FRE 902(13)/(14)-shaped certification PDF/JSON)
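
The git-side settings in the list above could be applied with a small helper. This is a sketch under assumptions: `hardening_commands` and `apply_hardening` are hypothetical names, and the signing key path is a placeholder. The config keys themselves (`gpg.format`, `user.signingkey`, `commit.gpgsign`, `gc.reflogExpire`, `gc.reflogExpireUnreachable`) are standard git options. The mirror, TSA receipts, and report export are not covered here.

```python
import subprocess


def hardening_commands(signing_key_path: str) -> list:
    """Return the git config invocations for SSH commit signing and reflog retention."""
    return [
        ["git", "config", "gpg.format", "ssh"],                 # sign with an SSH key
        ["git", "config", "user.signingkey", signing_key_path],
        ["git", "config", "commit.gpgsign", "true"],            # sign every commit
        ["git", "config", "gc.reflogExpire", "never"],
        ["git", "config", "gc.reflogExpireUnreachable", "never"],
    ]


def apply_hardening(repo_dir: str, signing_key_path: str) -> None:
    """Run each command inside the repository; raises if git rejects a setting."""
    for cmd in hardening_commands(signing_key_path):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Keeping command construction separate from execution lets the plugin show the user exactly what will change before it touches the vault's repository.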

Tier 2 adds nothing audit-wise on its own (the sql.js sidecar is a derived view, not a system of record) but is the right place to enforce write-through-to-git in code.

Tier 3 adds external monitoring (continuous git verify-commit + TSA receipt verification + fsck --full), per-commit/hourly WORM mirror to S3 Object Lock Compliance (or equivalent — Cohasset confirms compliance with SEC 17a-4, FINRA 4511, CFTC 1.31), and SOC 2-attestable infrastructure controls converting the user’s IPE problem into the server’s IPE problem.

The deliverable ranks an inventory of 14 failure modes by likelihood and mitigation cost. The residual risk after augmentation is a three-party collusion (admin bypass + WORM compromise + forged TSA), which is below the bar that any commercial GRC tool credibly defends against.

2.2 Challenge 09 — UUID / CWUUID cross-cutting identifier strategy

Full deliverable: Ch 09: UUID/CWUUID cross-cutting identifier strategy. Brief: Challenge 09.

Headline finding: layered scheme. UUIDv7 (RFC 9562, May 2024) for almost everything Crosswalker generates; content-addressed sha256 CIDs only where the entity is its content (spine snapshots, schema releases); CURIEs for external references (controls, frameworks, ORCIDs); ORCID CURIEs verbatim for SSSOM author/reviewer slots. “CWUUID” is a display convention, not a new algebra — every CW-minted ID is a canonical UUIDv7 stored in YAML frontmatter; cw: prefix and short hex suffixes are UI affordances only.
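
As a rough illustration of "canonical UUIDv7 plus display convention", the sketch below mints a UUIDv7 following the RFC 9562 bit layout (48-bit millisecond timestamp, version nibble, variant bits, random fill) and derives a short display form. The helper names `mint_uuid7` and `display_id`, and the exact display format (`cw:` plus the last six hex digits), are assumptions for illustration; the source only says the prefix and short suffix are UI affordances.

```python
import os
import time
import uuid
from typing import Optional


def mint_uuid7(now_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7: 48-bit Unix ms timestamp, version 7, RFC variant, random remainder."""
    ts_ms = int(time.time() * 1000) if now_ms is None else now_ms
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits; 74 are used
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits
    value = (ts_ms & ((1 << 48) - 1)) << 80       # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version
    value |= rand_a << 64                         # bits 75..64: rand_a
    value |= 0b10 << 62                           # bits 63..62: variant
    value |= rand_b                               # bits 61..0: rand_b
    return uuid.UUID(int=value)


def display_id(u: uuid.UUID) -> str:
    """Hypothetical display form: the stored ID stays the full UUID."""
    return f"cw:{u.hex[-6:]}"
```

The full UUID string is what would be written to frontmatter; `display_id` is only ever rendered and never parsed back into an identifier.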

Filename strategy: human-readable filenames for browseable classes (ontology nodes, evidence notes); composite filenames with a --cw<6-hex> suffix for collision-prone classes (junction notes, lifecycle records, crosswalk edges with multiple predicates). Aligns with the prevailing Obsidian convention of uid in frontmatter (Advanced URI plugin, etc.).

OSCAL round-trip: preserve incoming @uuid flags verbatim on import; mint UUIDv7 only for entities Crosswalker creates de novo. UUIDv7 is syntactically valid in every OSCAL uuid slot — published OSCAL schemas constrain only the regex grammar, not the version nibble. Risk of NIST tightening to v4-only is low.

Minimum viable Foundation set (v0.1) — six identifier classes:

Vault UUID
Ontology-web UUID
Ontology-node UUID (alongside CURIE natural key)
Junction-note UUID (covers crosswalk + evidence-link)
Spine snapshot CID
SSSOM author CURIE (ORCID-preferred)

The deliverable is substantively complete. Minor gaps flagged in §3.2 below don’t warrant a re-run.
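
For the content-addressed class (spine snapshots, schema releases), a minimal sketch of a digest-based identifier follows. It is deliberately simplified: the `algo:hexdigest` string form and the `content_id` helper are assumptions, not a multiformats CID encoding. The algorithm prefix reflects the algorithm-agility note in §3.2.

```python
import hashlib

# Digest constructors keyed by the prefix used in the identifier (prefix names are assumed).
SUPPORTED = {
    "sha256": hashlib.sha256,
    "sha3-256": hashlib.sha3_256,
}


def content_id(data: bytes, algo: str = "sha256") -> str:
    """Return '<algo>:<hex digest>' so identical bytes always get the same identifier."""
    if algo not in SUPPORTED:
        raise ValueError(f"unsupported digest algorithm: {algo}")
    return f"{algo}:{SUPPORTED[algo](data).hexdigest()}"
```

Because the identifier is derived purely from the bytes, two vaults holding the same snapshot compute the same ID without coordination, which is the property that makes it suitable only where the entity is its content.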

2.3 Challenge 10 — Graph→tabular bridging engine for the web-of-webs

Full deliverable: Ch 10: Graph→tabular bridging engine for the web-of-webs. Brief: Challenge 10.

Headline finding: hybrid 3-tier strategy.

Tier	Strategy	Engine choice
Tier 1	Build	Materialized-folder generator inside the plugin; flattens graph queries into Bases-compatible YAML notes; per-folder `.view.yaml`, `.view.lock.json` dependency manifest, `.view.stale` flag
Tier 2	Integrate	DuckDB-WASM (~3.2 MB compressed shell, MIT, MotherDuck-backed). Recursive CTEs handle multi-hop traversal; PIVOT/UNPIVOT/window functions handle cross-tabs; Apache Arrow zero-copy to a Polars-JS or Arquero renderer-side pivot layer
Tier 3	Integrate	PostgreSQL + Apache AGE (openCypher graph traversal + full SQL, Apache 2.0, ASF governance). Oxigraph as RDF sidecar for SSSOM/SKOS/STRM workloads needing SPARQL property-path closure

KuzuDB explicitly rejected despite Cypher elegance and native property-graph fit, on the basis that the upstream project was archived 10 October 2025 with no upstream maintenance commitment. The deliverable claims this is a load-bearing supply-chain risk for a multi-year compliance tool. ⚠️ Verify before acting on it (see §3.1).

Five data-flow invariants — file canonicity, determinism/idempotency, explicit staleness, writes-always-land-in-files, transparent cross-tier query routing. Same posture as the 05-01 §2.5 meta-schema commitment: files are the only writable surface; everything else is a content-addressed cache.
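
The determinism and explicit-staleness invariants can be sketched with a dependency manifest. The source names a `.view.lock.json` manifest and a `.view.stale` flag but does not specify their contents, so the format below (source path mapped to a SHA-256 of the file bytes) and the helper names are assumptions.

```python
import hashlib
import json


def build_lock(dependencies: dict) -> dict:
    """Map each source note path to the SHA-256 of its bytes, in sorted path order."""
    return {path: hashlib.sha256(dependencies[path]).hexdigest()
            for path in sorted(dependencies)}


def serialize_lock(lock: dict) -> str:
    """Serialize with sorted keys so the same inputs always yield the same file text."""
    return json.dumps(lock, sort_keys=True, indent=2)


def is_stale(lock: dict, dependencies: dict) -> bool:
    """A view is stale when any dependency was added, removed, or changed since the lock."""
    return build_lock(dependencies) != lock
```

A generator following this pattern would write the flag file whenever `is_stale` returns true and regenerate the folder from the canonical notes, never from the cached view.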

Cost ceilings sketched per tier (e.g., Tier 1 caps at ~10K vault notes / 30K crosswalk edges; Tier 2 at ~250K notes / 5M edges; Tier 3 effectively unbounded). Marked design-time targets, not benchmarks.
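
Read as a routing rule, those ceilings amount to a simple lookup. The sketch below hard-codes the approximate design-time figures quoted above; `suggest_tier` is a hypothetical helper, and treating both limits as hard cut-offs is an assumption.

```python
# (tier name, max vault notes, max crosswalk edges); design-time targets, not benchmarks.
TIER_CEILINGS = [
    ("Tier 1", 10_000, 30_000),
    ("Tier 2", 250_000, 5_000_000),
]


def suggest_tier(notes: int, edges: int) -> str:
    """Return the lowest tier whose note and edge ceilings both cover the vault."""
    for name, max_notes, max_edges in TIER_CEILINGS:
        if notes <= max_notes and edges <= max_edges:
            return name
    return "Tier 3"  # described in the source as effectively unbounded
```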

The deliverable is technically thorough on the shortlist it considered, but the shortlist is incomplete in ways that matter — see §3.1.

§2.B Second research wave (Ch 11/12/13)

Six fresh-agent research deliverables, all 2026-05-02 (afternoon/evening):

2.4 Challenge 11 — Tier 2/3 engine deep survey (3 deliverables)

Three independent fresh-agent runs, each producing slightly different recommendations. Multi-agent convergence is itself load-bearing evidence.

Deliverable	Distinguishing recommendation
Ch 11a — TerminusDB-as-Tier-3 emphasis	Keep DuckDB-WASM Tier 2; drop AGE; adopt TerminusDB as default Tier 3 (git-style branch/diff/merge). Includes a Grafeo follow-up (potential game-changer engine that could collapse the layered Tier 2 stack to one engine).
Ch 11b — Layered Tier 2 stack	Layer Tier 2: DuckDB-WASM + Oxigraph-WASM + Nemo-WASM (each lazy-loaded). AGE remains optional Tier 3.
Ch 11c — Layered + OSCAL/FedRAMP angle	Same layered Tier 2 as 11b. Tier 3 = AGE+Jena Fuseki + optional TerminusDB vault-mirror. Strategic insight: FedRAMP RFC-0024 mandates machine-readable authorisation packages by Sept 2026 → OSCAL native support is a 10× value-multiplier.

Convergence summary (multi-agent evidence):

Recommendation	11a	11b	11c	Convergence
KuzuDB upstream dead, no fork stable	✅	✅	✅	3-of-3 strong
DuckDB-WASM stays Tier 2 default	✅	✅	✅	3-of-3 strong
Datalog (Nemo) for SSSOM derivation	✅	✅	✅	3-of-3 strong
AGE alone is insufficient for Tier 3 (no RDF)	✅	✅	✅	3-of-3 strong
Polars-WASM not viable today	✅	✅	✅	3-of-3 strong
TerminusDB cannot be Tier 2 (no embedded WASM)	✅	✅	✅	3-of-3 strong
Tier 2 layered stack (DuckDB + Oxigraph + Nemo)	❌ (single-engine + Tier 3 swap)	✅	✅	2-of-3 — 11a takes a different shape
Tier 3 default = AGE+Jena Fuseki	❌ (TerminusDB-as-default)	✅	✅	2-of-3 — split
TerminusDB as optional Tier 3 vault-mirror	✅ (as primary)	✅ (as alternative)	✅ (as optional vault-mirror)	3-of-3 strong (role varies)

New engines surfaced during research (absent from original Ch 11 brief): Grafeo (potential game-changer — pure-Rust LPG+RDF+vector with WASM bindings, all major query languages, could collapse layered Tier 2 to single engine), Minigraf (embedded bi-temporal Datalog with WASM), CozoDB (Datalog+graph+vector embedded), SurrealDB-WASM (multi-model unified, BSL 1.1 license), Comunica + N3 + HDT (TS-native SPARQL meta-engine, ~200 KB gzipped vs Oxigraph’s ~3 MB), cr-sqlite (CRDT SQLite), simple-graph + sqlite-vec stack. All flagged for follow-on Challenge 14.

2.5 Challenge 12 — Datalog vs SQL for SSSOM chain rules (2 deliverables)

Deliverable	Distinguishing recommendation
Ch 12a — Focused fork-in-the-road	Hybrid: rules expressed as Datalog DSL, executable via either Datalog engine (Nemo, browser/CLI) or compiled to SQL recursive CTEs (DuckDB-WASM/SQLite-WASM). OxO2 reference architecture validates Datalog at scale (1.16M mappings → 49.5K inferences, ~17 min, ~380 MB on a laptop).
Ch 12b — Beyond the known engine landscape	Long-horizon: LinkML as canonical schema substrate (auto-generates JSON Schema, OWL, SHACL, Pydantic, TypeScript from one YAML). IPLD content-addressed crosswalks. Tier 1.5 compilation pipeline producing Parquet/HDT/JSON-LD/OSCAL/IPLD-CAR. AI-augmented mapping per OAEI 2025/26 (LogMap-LLM, GenOM). Many engines absent from prior surveys.

Convergence: Both deliverables converge on Datalog (Nemo) as primary derivation engine, OxO2 as the reference architecture. Both endorse the rules-as-data DSL pattern compiled to either Datalog or SQL. Deliverable B expands well beyond the Ch 12 brief into long-horizon architectural ideas — those land in §10.
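
The rules-as-data idea can be sketched as a rule table plus a generic fixpoint loop. This is plain Python standing in for a Datalog engine, not Nemo's syntax or API. The rule subset in `CHAIN_RULES` is illustrative and should be checked against SSSOM's documented chaining rules before use, and `derive` is a hypothetical name.

```python
EXACT = "skos:exactMatch"
BROAD = "skos:broadMatch"

# Rules as data: (p, q) -> r means "A p B and B q C implies A r C". Illustrative subset only.
CHAIN_RULES = {
    (EXACT, EXACT): EXACT,
    (EXACT, BROAD): BROAD,
    (BROAD, EXACT): BROAD,
    (BROAD, BROAD): BROAD,
}


def derive(facts: set) -> set:
    """Naive fixpoint: apply every chain rule until no new (subject, predicate, object) appears."""
    derived = set(facts)
    while True:
        new = set()
        for (a, p, b) in derived:
            for (b2, q, c) in derived:
                if b == b2 and a != c and (p, q) in CHAIN_RULES:
                    candidate = (a, CHAIN_RULES[(p, q)], c)
                    if candidate not in derived:
                        new.add(candidate)
        if not new:
            return derived
        derived |= new


asserted = {("A", EXACT, "B"), ("B", EXACT, "C"), ("C", BROAD, "D")}
closure = derive(asserted)
```

Because the rules live in a table rather than in code, the same table could in principle be emitted as Datalog clauses or as recursive SQL, which is the property both deliverables endorse.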

2.6 Challenge 13 — Modern attestation primitives (confirms Ch 08, adds in-toto)

Full deliverable: Ch 13 — Modern attestation primitives. Brief: Challenge 13.

Verdict by primitive:

Primitive	Decision	Why
Sigstore / gitsign / Rekor	Complement (configurable alternative for commit signing)	Cleanest path to SLSA L3; non-air-gap only; offer alongside GPG/SSH.
in-toto attestations	Complement (mandatory, Tier 1)	Replaces commit messages as authoritative review-chain record; backward-compatible with PDF Audit Authenticity Report.
SLSA targeting	Adopt as framing model	L1 for v0.1, L2 for v1.0, L3 for v2.0+ (via gitsign).
OpenTimestamps	Skip Tier 1; offer Tier 2	Latency + auditor unfamiliarity outweigh marginal benefit for 7-year retention; permanence value re-emerges only for >25-year horizon.
W3C Verifiable Credentials	Skip near-term; track for v2.0+	Standards-stable since May 2025 but not in audit toolchains today; viable for cross-vault federation later.
AWS QLDB	Drop entirely	Service ended 2025-07-31.
Azure Confidential Ledger	Skip Tier 1; document as Tier 3	High cost (~$3/day per ledger), narrow incremental benefit, vendor-locked.
immudb	Skip Tier 1; document as Tier 3	Useful for high-volume scale but not auditor-recognized as external party.
Challenge 08 stack	Confirm + extend with in-toto	Auditor-familiar floor; in-toto is the missing review/approval schema.

Net architectural impact: Ch 13 adds exactly one mandatory primitive (in-toto) to Ch 08’s Tier 1 stack. Tier 1 minimum bar is now: signed commits + RFC 3161 TSA + S3 Object Lock WORM + FRE 902(13) PDF cert + in-toto attestations.

Critical assessment of the research waves

3.1 Challenge 10 has substantive gaps

The deliverable evaluates ~9 engines and makes the DuckDB+AGE call on the basis of that shortlist. Whole classes of relevant systems are not engaged with at all:

Class	Specific systems missed	Why this matters for Crosswalker
Datalog engines	Soufflé, Nemo, Differential Datalog, Datomic, RDFox	Datalog is the native fit for SSSOM chain rules. Recursive Datalog with provenance is mathematically cleaner than recursive CTE for transitive crosswalk derivation. The OxO2 paper Crosswalker already cites uses Nemo (Datalog). Direct fork in the road that the deliverable doesn’t even pose
Production triple stores	Apache Jena Fuseki, GraphDB (Ontotext), Virtuoso, RDF4J, Stardog, AnzoGraph, Blazegraph	The project is RDF-flavored (SSSOM, SKOS, STRM) by design. Oxigraph is “the WASM one we found”; no comparative analysis against the mature RDF stack
Versioned graph databases	TerminusDB (Git-style branching/diff/merge over RDF)	Terminus’s versioned-graph-with-diff-and-merge model is uncannily aligned with Crosswalker’s “files canonical, derived stores rebuildable” ethos. Should arguably be a top contender; not mentioned at all
Other property graphs	Memgraph, NebulaGraph, ArangoDB, Dgraph, FalkorDB, OrientDB	The “Kuzu rejected → AGE accepted” leap skips half the relevant landscape
Embedded analytical	Polars-WASM (Tier 1.5 candidate), DataFusion (Apache), LanceDB, ClickHouse-local, Velox	Polars-WASM as a Tier 1.5 join/pivot layer without SQL is a real alternative the deliverable mentions as a renderer-side helper but doesn’t seriously evaluate as a primary engine
Vector + graph hybrids	Weaviate, Qdrant, Milvus, FalkorDB+vec	Becomes critical for AI-assisted schema matching (deferred future workstream). No architectural slot reserved
Streaming / incremental MV	Materialize, Differential Dataflow, Snowflake Dynamic Tables, ksqlDB	Deliverable cites Postgres MV/BQ/Redshift but skips state-of-the-art incremental view maintenance — directly relevant for the materialized-folder Tier 1 design
Virtual / federated	Ontop (SPARQL-over-relational), Trino, Dremio	Could turn “files → derived store” into “files → virtual SQL view” without materializing
Query unification	GraphQL gateway, Substrait	A unified query layer that abstracts the engine choice across tiers. Not mentioned at all

Architectural questions left unasked:

Datalog vs recursive CTE for the core SSSOM chain-rule derivation. The deliverable picks the harder, less-expressive option without justifying it
TerminusDB’s versioned-graph model deserves first-class evaluation given Crosswalker’s ethos
Polars-WASM as Tier 1.5 (without DuckDB) — bundle-size argument deserves deeper treatment
GraphQL as a tier-agnostic query surface (compiles to SQL/Cypher/SPARQL per tier)
CRDT layer for the deferred live-edit team mode (Yjs / Automerge / Loro)
Concrete WASM bundle optimization strategies (tree-shaking, code-splitting, on-demand loading)
LLM/NL-query architecture for AI-assisted features
No real benchmarks against representative GRC data — explicitly out-of-scope per the brief, but means the choices are theoretical against an unmeasured workload
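
For the first question in the list above, the recursive-CTE side of the comparison looks roughly like this. The sketch uses Python's built-in `sqlite3` as a stand-in for DuckDB-WASM; the table layout, the hop limit, and the `reachable` helper are assumptions, and the identifiers are made-up placeholders rather than real framework mappings.

```python
import sqlite3


def reachable(edges, start, max_hops=10):
    """Return (node, fewest hops) pairs reachable from start by following edges transitively."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (subject TEXT, object TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node, hops) AS (
            SELECT object, 1 FROM edge WHERE subject = ?
            UNION
            SELECT edge.object, reach.hops + 1
            FROM reach JOIN edge ON edge.subject = reach.node
            WHERE reach.hops < ?          -- hop limit guarantees termination on cycles
        )
        SELECT node, MIN(hops) FROM reach GROUP BY node ORDER BY node
        """,
        (start, max_hops),
    ).fetchall()
    conn.close()
    return rows


placeholder_edges = [("a:1", "b:1"), ("b:1", "c:1"), ("c:1", "d:1")]
```

Note what the query does not express: predicate composition and provenance per derived edge have to be bolted on as extra columns and joins, which is the expressiveness gap the Datalog question is about.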

Empirical claims worth verifying before acting:

“KuzuDB archived 10 October 2025” — load-bearing for the entire engine choice. Verify upstream + the “bighorn” community fork status
“DuckDB-WASM ~3.2 MB compressed” — confirm against current build; bundle has grown over releases
“DuckPGQ extension not yet WASM-friendly” — check current state
Apache AGE PostgreSQL version compatibility window

3.2 Challenge 09 minor gaps (no new research session needed)

Challenge 09 is substantively complete. Minor flags for implementation only:

SHA-256 vs SHA-3 for content addressing: sha256 fine, but if a downstream user mandates SHA-3 (some federal contexts), allow algorithm agility in the CID prefix
UUIDv7 timestamp leakage: every UUIDv7 reveals creation millisecond — a side-channel in adversarial/forensic settings. Document explicitly
UUIDv7-from-mtime migration entropy: the migration script derives UUIDs from file mtime, which has near-zero entropy. Should be flagged that migrated UUIDs are predictable and unsuitable as security tokens
Cross-vault federation protocol: punted to Phase 2 in the deliverable; the URN form urn:crosswalker:<vault-uuid>:<entity-uuid> is sketched but not analyzed against DID:web for vault identity or IPFS CIDs for content
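
The timestamp-leakage flag is easy to demonstrate: anyone holding a UUIDv7 can read its creation time straight from the top 48 bits. A minimal sketch follows, using the example UUID from RFC 9562's test vectors; `uuid7_created_at` is a hypothetical helper name.

```python
import uuid
from datetime import datetime, timezone


def uuid7_created_at(u: uuid.UUID) -> datetime:
    """Recover the embedded creation time (UTC, millisecond precision) from a UUIDv7."""
    if (u.int >> 76) & 0xF != 7:
        raise ValueError("not a UUIDv7")
    unix_ms = u.int >> 80  # top 48 bits hold the Unix timestamp in milliseconds
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


example = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
created = uuid7_created_at(example)
```

For the RFC example this yields 2022-02-22 19:22:22 UTC. The same property means an ID derived from a file's mtime is guessable by anyone who knows roughly when the file was written.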

Action: lessons-learned annotation on the existing Challenge 09 brief; no new research session.

3.3 Challenge 08 — two real omissions

Challenge 08 is broadly complete but skips a category of modern attestation primitives that meaningfully change the design:

Sigstore + in-toto attestations + SLSA framework — federated OIDC-backed signing (Sigstore alone could replace the entire “manage GPG/SSH signing keys” UX); in-toto is the standard for “this evidence was reviewed by X using process Y” attestations; SLSA frames the whole supply-chain integrity story. None of these are engaged with
OpenTimestamps (Bitcoin-anchored, free, decentralized) — gets one passing mention in the failure-mode table but isn’t compared against RFC 3161 TSAs as an alternative or complement
W3C Verifiable Credentials for the qualified-person certification — flexible alternative to the proposed PDF Audit Authenticity Report
AWS QLDB / Azure Confidential Ledger rejected without deep analysis

Action: spin up a narrower follow-on research challenge focused specifically on these primitives — resolved by Challenge 13 (deliverable summarized in §2.6 above; critical read in §3.6 below).

3.4 Critical read of Ch 11

Convergence is strong across the three independent runs:

3-of-3 rejected KuzuDB upstream and all four forks (Bighorn, Ladybug, RyuGraph, Vela) as not stable enough for a multi-year compliance tool
3-of-3 kept DuckDB-WASM as Tier 2 default
3-of-3 committed to Datalog (Nemo, OxO2 architecture) for SSSOM derivation
3-of-3 found AGE alone insufficient at Tier 3 (no RDF; SSSOM/SKOS/STRM are RDF-native)
3-of-3 rejected Polars-WASM as Tier 1.5 (Pyodide-only path, alpha)

Divergence on the layered Tier 2 stack — 11b and 11c explicitly recommend layered Tier 2 (DuckDB-WASM + Oxigraph-WASM + Nemo-WASM lazy-loaded). 11a takes a different shape: keep DuckDB single-engine at Tier 2 and replace AGE with TerminusDB at Tier 3. Both architectures are coherent. Decision pending user input — see §5.B.

Divergence on Tier 3 default:

11a: TerminusDB as default (versioning is the load-bearing requirement)
11b/11c: AGE+Jena Fuseki as default; TerminusDB as optional vault-mirror

New engines surfaced — Grafeo (most consequential potential game-changer), Minigraf, CozoDB, SurrealDB-WASM, Comunica + N3 + HDT, cr-sqlite, sqlite-vec stack. Spin up Challenge 14.

FedRAMP RFC-0024 (mandates machine-readable authorisation packages by Sept 2026) flagged by 11c as a 10× value-multiplier for Crosswalker’s federal market — argues for elevating OSCAL native support from feature to architectural concern.

3.5 Critical read of Ch 12

Strong convergence across 12a and 12b plus all three Ch 11 deliverables: Datalog (Nemo) is the right derivation engine, OxO2 is the reference architecture, rules-as-data DSL is the architectural pattern.

Ch 12b expansion beyond brief: deliverable B explicitly went beyond the narrow Datalog-vs-SQL question and proposed major architectural ideas not previously on the table:

LinkML as canonical schema substrate — auto-generates JSON Schema, OWL, SHACL, Pydantic, TypeScript from one YAML; SSSOM is itself defined in LinkML; could become Crosswalker’s “Tier 0”
IPLD content-addressed crosswalks — every SSSOM bundle hashes to a CID; distribution via Merkle-DAG; signed with W3C VCs
Tier 1.5 compilation pipeline — Markdown+SSSOM TSV → Parquet/HDT/JSON-LD/OSCAL/IPLD-CAR multi-target artifact compiler
AI-augmented mapping — LogMap-LLM, GenOM, BERTMap; OAEI 2025/2026 results validate hybrid LLM approach

These are major architectural pivots, not adopted today. Catalogued in §10 (long-horizon ideas considered, not committed).

3.6 Critical read of Ch 13

Highest confidence of the second wave. Ch 13 directly confirms Ch 08’s stack with one explicit addition (in-toto attestations as mandatory Tier 1) and several explicit non-adoptions (skip OpenTimestamps for Tier 1, drop QLDB entirely because the service has ended, skip Azure Confidential Ledger, immudb, and Verifiable Credentials at Tier 1).

Single new commitment vector: in-toto attestations as the mandatory schema for review/approval evidence. Custom Crosswalker predicate type https://crosswalker.dev/predicates/evidence-review/v1 plus reuse of SLSA Provenance v1.
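
A review attestation under this scheme might look like the sketch below. The envelope fields (`_type`, `subject`, `predicateType`, `predicate`) follow the in-toto Statement v1 layout and the predicate type URI is the one quoted above. The predicate body (`reviewer`, `decision`) is a hypothetical shape, since the source does not define it, and signing the statement is out of scope here.

```python
import hashlib
import json

PREDICATE_TYPE = "https://crosswalker.dev/predicates/evidence-review/v1"


def evidence_review_statement(path: str, content: bytes,
                              reviewer: str, decision: str) -> dict:
    """Build an unsigned in-toto Statement binding a review decision to a file digest."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{
            "name": path,
            "digest": {"sha256": hashlib.sha256(content).hexdigest()},
        }],
        "predicateType": PREDICATE_TYPE,
        # Hypothetical predicate fields; the real schema is not specified in this log.
        "predicate": {"reviewer": reviewer, "decision": decision},
    }


statement = evidence_review_statement(
    "evidence/example-note.md", b"example evidence body",
    reviewer="reviewer-placeholder", decision="approved",
)
serialized = json.dumps(statement, sort_keys=True)
```

Binding the review to a content digest rather than to a commit message is what lets the record survive history rewrites or repository moves.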

No surprises, no contradictions, no major divergences from Ch 08. Lift to commit in §5.B.

Research challenges status

4.1 Challenge 11 — Tier 2/3 engine deep survey

✅ RESOLVED by 3 fresh-agent deliverables (one of the strongest convergence signals in the project so far). Summarized in §2.4; critical read in §3.4.

Ch 11a — TerminusDB-as-Tier-3 emphasis
Ch 11b — Layered Tier 2 stack (DuckDB + Oxigraph + Nemo)
Ch 11c — Layered + OSCAL/FedRAMP angle

Net direction: layered Tier 2 stack (2-of-3 explicit + alignment from 11a’s Grafeo follow-up); Tier 3 default open between AGE+Jena and TerminusDB-as-primary; new engines surfaced → Challenge 14.

4.2 Challenge 12 — Datalog vs SQL for SSSOM chain-rule derivation

✅ RESOLVED by 2 fresh-agent deliverables. Strong convergence on Datalog (Nemo) primary + OxO2 reference architecture. Summarized in §2.5; critical read in §3.5.

Ch 12a — focused fork-in-the-road analysis; rules-as-data DSL
Ch 12b — long-horizon expansion (LinkML, IPLD, Tier 1.5 compilation, AI-augmented mapping); long-horizon ideas catalogued in §10

4.3 Challenge 13 — Modern attestation primitives

✅ RESOLVED by 1 fresh-agent deliverable. Confirms Ch 08 Tier 1 stack; adds in-toto attestations as mandatory; offers Sigstore/gitsign as configurable alternative; SLSA L1→L2→L3 progression. Summarized in §2.6; critical read in §3.6.

Ch 13 — modern attestation primitives

4.4 Challenge 14 — Missed engines evaluation

✅ RESOLVED by 1 fresh-agent deliverable later on 2026-05-02 (third wave). Verdict: keep Ch 11 layered Tier 2 stack as production; add Tier 2-Lite (sqlite-wasm + sqlite-vec + simple-graph + recursive-CTE) and Comunica + N3 + HDT federation as additive extensions; track Grafeo and Minigraf with explicit, falsifiable migration triggers; reject SurrealDB (BSL + 12.6 MB bundle), cr-sqlite (stalled Oct 2024), CozoDB (no release since v0.7 in 2023). Synthesized in third-wave log §2.

Ch 14 deliverable — missed engines evaluation
Archived Challenge 14 brief

4.5 Challenge 15 — Audit-trail alternatives without external git tooling

✅ RESOLVED by 1 fresh-agent deliverable later on 2026-05-02 (third wave). Verdict: adopt 4-tier model (T0/T1/T2/T3) with OpenTimestamps .ots on signed chain checkpoints as new T2 default; reposition the Ch 08+13 git+RFC3161+S3-Object-Lock+FRE 902 stack as one of three T3 options (others: eIDAS QTSA + W3C VC for EU; Sigstore Rekor v2 + in-toto for supply-chain). Crypto-agile PQC migration plan 2026→2032 ahead of NIST IR 8547 2035 deadline. Synthesized in third-wave log §4.

Ch 15 deliverable — non-git audit-trail alternatives
Archived Challenge 15 brief

4.6 Challenge 16 — Tier 3 stack reconsideration

✅ RESOLVED by 1 fresh-agent deliverable later on 2026-05-02 (third wave). Verdict: demote Apache AGE from default to optional fallback; promote Apache Jena Fuseki as new Tier 3 default; document oxigraph-server as the lighter same-API alternative (architectural symmetry: same engine as Tier 2, just oxigraph serve); document layered Fuseki + DuckDB-on-server as power-user upgrade path; TerminusDB v12 as opt-in vault-mirror with small-vendor (DFRNT) risk explicitly flagged. Synthesized in third-wave log §3.

Ch 16 deliverable — Tier 3 stack reconsideration
Archived Challenge 16 brief

4.7 (informational) Possible future challenges identified

Not spun up; flagged for user signal:

Ch 17 candidate — LinkML adoption as canonical schema substrate (per Ch 12b §2.3)
Ch 18 candidate — Tier 2-Lite SSSOM rule subset and scale ceiling (per Ch 14 §2.7) — most actionable follow-on
Ch 19 candidate — PQC dual-sign protocol detail (per Ch 15 §5.6) — defer toward 2027
Ch 20 candidate — eIDAS 2.0 / EUDI Wallet integration profile (per Ch 15 §3.1) — defer to 2027+
(older) IPLD content-addressed crosswalk distribution (per Ch 12b §4.2); reactive/incremental computation for derived crosswalks (Feldera/DBSP/Materialize)

Direction posture

The user signal: lift convergent items in this log; embed inline checkpoint questions for explicit yes/no per item; do follow-on research where convergence is partial.

Four buckets:

5.A Confirmed commitments (locked in this log)

Items with strong convergence and no objections expected; lock in immediately.

Topic	Source
Identifier strategy — UUIDv7 + sha256 CIDs + CURIEs + ORCID for SSSOM authors	Ch 09 deliverable; single deliverable, substantively complete
KuzuDB and forks: do NOT adopt as Tier 2 primary (no fork has 12+ months stability)	3-of-3 Ch 11 convergence
Polars-WASM: NOT viable as Tier 1.5 today (alpha, Pyodide-only path)	3-of-3 Ch 11 convergence; reaffirmed by Ch 12b
AWS QLDB: drop entirely (service ended 2025-07-31)	Ch 13
Pairwise + optional pivot crosswalk architecture	05-01 §2.1 commitment stays
Junction-note evidence-link form factor	04-10 synthesis + 05-01 §2.2 commitment stays
OxO2 reference architecture for SSSOM chain-rule derivation	3-of-3 Ch 11 + 2-of-2 Ch 12

5.B Convergent commitments — pending user sign-off

Strong convergence; flagged for explicit yes/no per item before locking. Each row ends with a checkpoint question.

Topic	Convergence	Checkpoint
Tier 2 layered stack: DuckDB-WASM + Oxigraph-WASM + Nemo-WASM (all lazy-loaded). Total compressed ~10 MB worst case; under 100 KB plugin shell + ~3 MB on first analytical query.	2-of-3 Ch 11 (11a takes a different shape — single-engine + TerminusDB-as-Tier-3); reaffirmed by Ch 12 + Ch 14 brief	[Confirm? Y/N] Adopt layered Tier 2 (DuckDB + Oxigraph + Nemo) over single-engine Tier 2?
Tier 1 audit-trail bar: signed commits + RFC 3161 TSA + S3 Object Lock WORM mirror + FRE 902(13) PDF cert + in-toto attestations (mandatory).	Total convergence (Ch 13 confirms Ch 08 + adds in-toto only)	[Confirm? Y/N] Adopt the 5-layer Tier 1 audit-trail bar including in-toto?
Datalog (Nemo) for SSSOM chain-rule derivation per OxO2 architecture (build-pipeline derivation, not live query). DuckDB recursive CTE remains for ad-hoc queries over already-derived facts.	Total convergence (3-of-3 Ch 11 + 2-of-2 Ch 12)	[Confirm? Y/N] Lock in Nemo-as-derivation-engine, OxO2-as-reference-architecture?
Sigstore/gitsign as configurable alternative for commit signing (path to SLSA L3); GPG/SSH remains default.	Ch 13 explicit	[Confirm? Y/N] Offer Sigstore/gitsign as alternative-not-replacement?
SLSA targeting: L1 for v0.1, L2 for v1.0, L3 for v2.0+ via gitsign.	Ch 13 explicit	[Confirm? Y/N] Adopt SLSA progression (L1→L2→L3 per version)?
Tier 1 materialized-folder generator for Bases-compatible pre-joined views (Ch 10 §2 design).	Stays sound across Ch 11 deliverables; concept doesn’t change with engine choice	[Confirm? Y/N] Lock in materialized-folder-generator as Tier 1 pattern?

5.C Partial convergence — need user input

Items where multiple acceptable answers emerged from the research. User input needed.

Topic	Convergence pattern	User input needed
Tier 3 default stack: AGE+Jena Fuseki (Ch 11b, 11c) vs TerminusDB-as-primary (Ch 11a)	2-of-3 favors AGE+Jena; 1-of-3 favors TerminusDB	Pick: AGE+Jena Fuseki primary, TerminusDB optional vault-mirror? OR TerminusDB primary, AGE fallback?
TerminusDB role: vault-mirror only / Tier 3 alternative / Tier 3 default?	3-of-3 mention TerminusDB; role varies	Pick role for TerminusDB.
LinkML as canonical schema substrate (Ch 12b major architectural pivot)	1-of-2 Ch 12 deliverables; substantial implications	Adopt LinkML as Tier 0 schema layer (auto-generates JSON Schema/OWL/SHACL/Pydantic/TypeScript)? Or stick with bespoke schemas? Or spin up a follow-on challenge to evaluate (the Ch 17 candidate in §4.7)?
OSCAL native support priority (Ch 11c flag: FedRAMP RFC-0024 mandates by Sept 2026)	1-of-3 Ch 11 explicitly flagged; demand-side validation event	Promote OSCAL import/export from feature to architectural concern? Add roadmap item?
Grafeo evaluation timing (Ch 11a follow-up + Ch 14 brief)	Identified as potential game-changer; Ch 14 will evaluate	Wait for Ch 14 deliverable, OR also evaluate now via narrower in-session research?

5.D Identified for follow-on research

Items where the convergence pointed at a gap. New research challenges spun up:

Challenge 14 (created today): missed engines (Grafeo, Minigraf, CozoDB, SurrealDB-WASM, Comunica, cr-sqlite, sqlite-vec). Brief published.
Possible follow-on challenge (await user signal; the Ch 17 candidate in §4.7): LinkML adoption as canonical schema substrate.
Possible follow-on challenge (await user signal; listed unnumbered in §4.7): IPLD content-addressed crosswalk distribution.
Possible follow-on challenge (await user signal; listed unnumbered in §4.7): reactive/incremental computation for derived crosswalks.

Phase plan refresh

The two research waves of 2026-05-02 collapsed Phases B and C into the same calendar day:

Phase A — Today’s housekeeping (✅ done)
- Adopted the Challenge 09 identifier strategy — locked in §5.A
- Spun up Ch 11, Ch 12, Ch 13 (earlier today) and Ch 14 (this log)
- StewardshipProfile rename ripples — listed in §8 (still not yet applied; awaits user)
- Roadmap deltas — listed in §7 (still not yet applied; awaits user)
Phase B — Run the second research wave (✅ done, today)
- Ch 11 produced 3 independent fresh-agent runs
- Ch 12 produced 2 deliverables (one focused, one long-horizon)
- Ch 13 produced 1 deliverable
Phase C — Synthesize (✅ in flight, this log update)
- This log captures convergences (§5.A and §5.B) and divergences (§5.C)
- Key open question: Tier 3 stack default (AGE+Jena vs TerminusDB-as-primary)
Phase B′ — Run the third research wave
- Hand Challenge 14 to a fresh agent (engines that surfaced during Ch 11)
- Optionally spin up further follow-on challenges if the user signals interest in LinkML / IPLD / reactive computation (§5.C, §5.D)
Phase D — Implementation begins (gated on Phase B′ + §5.B sign-off)
- Tier 1 materialized-folder generator (committed in §5.B pending sign-off)
- Tier 1 audit-trail hardening: signed commits + TSA + WORM + cert export + in-toto (committed in §5.B pending sign-off)
- Tier 2 engine integration: layered stack (DuckDB-WASM + Oxigraph-WASM + Nemo-WASM) — committed in §5.B pending sign-off and contingent on Ch 14 not surfacing a single-engine collapse
- Tier 3 engine integration: gated on §5.C user input
- Schemas (StewardshipProfile, junction-note, FrameworkConfig v2, _crosswalker v2) — implementation following the meta-schema versioning policy once that policy is concretized
Phase E — Out of scope for this Foundation cycle
- Marketplace mechanics
- Cross-vault federation (Phase 2)
- LLM/AI-assisted features (informed by Ch 11 §2.6 + Ch 12b § AI-augmented mapping; defer concrete commitment)
- Multi-user collaboration / CRDT layer (informed by Yjs/Loro/Automerge/cr-sqlite analyses across Ch 11 deliverables)

The big change vs yesterday’s plan: Phases B and C collapsed into one calendar day because the user ran multiple fresh-agent sessions in parallel. The new gate before Phase D is (a) §5.B explicit sign-off from the user, (b) Ch 14 deliverable on missed engines, (c) §5.C resolution of partial-convergence items.

Proposed roadmap deltas

Concrete edits to land in docs/src/content/docs/reference/roadmap/index.mdx — listed here for review; not yet applied. Edits are bucketed by section of the roadmap.

A. Foundation — “Get the architecture right” — existing items to update

A1. “Pairwise crosswalks vs synthetic spine architecture” (currently the biggest open Foundation question)

✅ Mark as resolved by 05-01 §2.1 deferred-pivot hybrid commitment
Update item description to summarize the commitment: pairwise primary, optional inheritable pivot (default SCF), SSSOM-on-markdown persistence, derived-mappings-computed-not-stored
Update internal challenge link from /zz-challenges/06-synthetic-spine to /zz-challenges/archive/06-synthetic-spine (already done in 05-01 commit; verify still correct)

A2. “EvolutionPattern vs transformation recipes”

Rename to: “StewardshipProfile (formerly EvolutionPattern) vs transformation recipes”
Update body text: every “EvolutionPattern” → “StewardshipProfile” with first-mention “(formerly EvolutionPattern)” parenthetical
Add link to 05-01 §3.2 rename

A3. “Evidence-framework edge model (the other edge type)” (currently marked “Direction committed”)

Add a sentence: Tier 1 audit-trail hardening (TSA + WORM + cert-export per Challenge 08) is deferred pending Challenge 13 — modern attestation primitives.
Rest of the existing item (junction notes, 13-field schema, OSCAL by-component, etc.) stands

A4. “Crosswalk edge semantics commitment (STRM + SSSOM)”

Mark as ✅ committed direction per 05-01 §2.1 and the Challenge 06 deliverable
No body changes; status marker only

A5. “Progressive tier architecture (pillar)”

Add: Tier 2 engine choice deferred pending Challenge 11; Datalog vs SQL fork for derivation deferred pending Challenge 12.
Rename “Tier 2 (sql.js sidecar)” descriptor to “Tier 2 (embedded analytical engine, choice TBD per Ch 11)”
Existing 3-tier framing stands

A6. “Obsidian Bases direction research”

No change — still active research; coordinates with Challenge 11 but doesn’t get superseded

B. Foundation — new items to add

B1. NEW: “Identifier strategy (UUIDv7 + sha256 CIDs + CURIEs)” ✅ committed

Per Challenge 09 deliverable: UUIDv7 default, sha256 multibase CIDs for content-addressed (spine snapshots, schema releases), CURIEs for external references (controls, frameworks, ORCIDs)
“CWUUID” is display convention only, not a new algebra
Filename rule: human-readable + --cw<6-hex> suffix on collision-prone classes
Six-class minimum viable Foundation set (vault, ontology web, ontology node, junction note, spine snapshot CID, SSSOM author CURIE)
OSCAL round-trip: preserve incoming @uuid verbatim; mint UUIDv7 only for new entities
Links to deliverable
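
To make the three identifier families concrete, here is a minimal Python sketch. It is illustrative only and rests on assumptions not confirmed by the Challenge 09 deliverable: the function names are hypothetical, the CID is encoded as a CIDv1 with the raw codec and base32 multibase, and the six hex digits in the filename suffix are taken from the tail of the entity's UUID.

```python
import base64
import hashlib
import os
import re
import time
import uuid


def uuid7(now_ms=None):
    """Build a UUIDv7 by hand: 48-bit Unix-ms timestamp followed by random bits."""
    ms = int(time.time() * 1000) if now_ms is None else now_ms
    rand = int.from_bytes(os.urandom(10), "big")      # 80 random bits
    value = ((ms & ((1 << 48) - 1)) << 80) | rand
    value = (value & ~(0xF << 76)) | (0x7 << 76)       # version nibble = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)       # variant bits = 10
    return uuid.UUID(int=value)


def sha256_cid(content: bytes) -> str:
    """Assumed encoding: CIDv1 + raw codec + sha2-256 multihash, base32 multibase."""
    digest = hashlib.sha256(content).digest()
    raw = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


def filename_for(title: str, entity_id: uuid.UUID) -> str:
    """Human-readable slug plus a --cw suffix (assumed: last 6 hex digits of the UUID)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}--cw{entity_id.hex[-6:]}.md"


def split_curie(curie: str):
    """Split a CURIE such as 'orcid:0000-0002-1825-0097' into (prefix, local id)."""
    prefix, sep, local = curie.partition(":")
    if not (sep and prefix and local):
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix, local
```

For OSCAL round-trips the same split applies: an incoming @uuid would be stored as-is, and `uuid7()` would be called only when a new entity is minted.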

B2. NEW: “Tier 2/3 engine deep survey” — research item

Links to Challenge 11
Required to complete before Tier 2 implementation can start

B3. NEW: “Datalog vs SQL for SSSOM chain-rule derivation” — research item

Links to Challenge 12
Coordinates with Challenge 11 (Ch 12 picks the paradigm; Ch 11 picks the engine)

B4. NEW: “Modern attestation primitives evaluation (Sigstore, in-toto, SLSA, OpenTimestamps, VCs)” — research item

Links to Challenge 13
Required before the Tier 1 audit-trail bar from Challenge 08 can be locked in

B5. NEW: “Crosswalker-internal schema versioning and migration policy”

Per 05-01 §2.5 dog-food commitment: every Crosswalker-internal schema (StewardshipProfile, junction-note 13-field schema, FrameworkConfig, _crosswalker metadata, pivot snapshot manifest, lifecycle change record, SSSOM crosswalk record) is versioned and migration-aware
Versioning convention TBD (semver vs content-addressed vs release-tag-aligned)
Direction log next pass to commit the convention

B6. NEW: “Tier 2 layered engine stack (DuckDB-WASM + Oxigraph-WASM + Nemo-WASM)” (PENDING §5.B sign-off)

Per Ch 11 deliverables b+c convergence — DuckDB-WASM for tabular SQL; Oxigraph-WASM for SPARQL/SKOS; Nemo-WASM for SSSOM chain-rule derivation
Bundle target: under 100 KB plugin shell + ~3 MB on first analytical query + lazy-load Oxigraph (~3 MB) and Nemo (~3–4 MB) on demand
Alternative collapse-to-single-engine pending Ch 14 evaluation of Grafeo

B7. NEW: “SSSOM chain-rule derivation engine (Nemo, OxO2 architecture)” (PENDING §5.B sign-off)

Per Ch 12 convergence — Nemo (Rust → WASM) as the canonical Datalog derivation engine
OxO2 reference architecture: Markdown vault → SSSOM facts → Nemo (Datalog with chain rules) → derived facts (with provenance) → DuckDB Parquet shard / TerminusDB graph commit
Rules expressed as data (Datalog DSL), not code; compilable to either Nemo or SQL recursive CTE per deployment
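
As a rough illustration of "rules as data" and of derived facts carrying provenance, the sketch below evaluates a small chain-rule table to a fixed point in plain Python. It is not Nemo syntax and not the OxO2 implementation: the composition table, the identifiers, and the provenance shape are all assumptions chosen for the example.

```python
# Hypothetical composition table: (predicate A->B, predicate B->C) -> predicate A->C.
CHAIN_RULES = {
    ("skos:exactMatch", "skos:exactMatch"): "skos:exactMatch",
    ("skos:exactMatch", "skos:broadMatch"): "skos:broadMatch",
    ("skos:broadMatch", "skos:exactMatch"): "skos:broadMatch",
    ("skos:broadMatch", "skos:broadMatch"): "skos:broadMatch",
}


def derive(asserted):
    """Naive fixed-point evaluation: apply the chain rules until nothing new appears.

    Returns a dict mapping each (subject, predicate, object) fact to its provenance:
    ("asserted",) or ("derived", left_premise, right_premise).
    """
    facts = {fact: ("asserted",) for fact in asserted}
    changed = True
    while changed:
        changed = False
        for (s1, p1, o1) in list(facts):
            for (s2, p2, o2) in list(facts):
                if o1 != s2 or s1 == o2:      # must chain, and must not loop back
                    continue
                p3 = CHAIN_RULES.get((p1, p2))
                derived = (s1, p3, o2)
                if p3 and derived not in facts:
                    facts[derived] = ("derived", (s1, p1, o1), (s2, p2, o2))
                    changed = True
    return facts


# Made-up identifiers; these are not real framework mappings.
asserted = [
    ("fwA:1", "skos:exactMatch", "fwB:1"),
    ("fwB:1", "skos:exactMatch", "fwC:1"),
    ("fwC:1", "skos:broadMatch", "fwD:1"),
]
facts = derive(asserted)
```

Because the derived entries are recomputed from the asserted rows, only the asserted rows need to live in the vault, which matches the derived-mappings-computed-not-stored commitment.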

B8. NEW: “Tier 1 audit-trail hardening (Ch 08 + in-toto)” (PENDING §5.B sign-off)

Per Ch 08 + Ch 13 convergence — five-layer Tier 1 stack:
1. Signed commits (GPG/SSH default; gitsign as configurable alternative)
2. RFC 3161 trusted timestamps on every commit
3. S3 Object Lock WORM mirror
4. FRE 902(13) qualified-person certification PDF
5. in-toto attestations (mandatory) for review/approval evidence — custom predicate type https://crosswalker.dev/predicates/evidence-review/v1
Tier 2: Rekor cross-publication when gitsign in use; OpenTimestamps as parallel .ots proof for high-retention vaults
Tier 3 (optional): Azure Confidential Ledger or immudb for high-volume installations
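
For layer 5, a review attestation could look roughly like the following in-toto Statement. This is a sketch under assumptions: only the statement envelope fields and the predicate type URL come from the in-toto format and the item above; the predicate fields (`reviewer`, `decision`, `reviewedAt`) and the function name are hypothetical, and signing (for example wrapping the statement in a DSSE envelope) is not shown.

```python
import hashlib
import json

# Predicate type as named in the B8 item.
PREDICATE_TYPE = "https://crosswalker.dev/predicates/evidence-review/v1"


def evidence_review_statement(name, content, reviewer, decision, reviewed_at):
    """Build an unsigned in-toto Statement describing one review of one artifact."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {"name": name, "digest": {"sha256": hashlib.sha256(content).hexdigest()}}
        ],
        "predicateType": PREDICATE_TYPE,
        # Hypothetical predicate body; the real schema is not defined in this log.
        "predicate": {
            "reviewer": reviewer,
            "decision": decision,
            "reviewedAt": reviewed_at,
        },
    }


statement = evidence_review_statement(
    name="evidence/example-junction-note.md",   # hypothetical vault path
    content=b"example junction note body",
    reviewer="orcid:0000-0002-1825-0097",        # example CURIE
    decision="approved",
    reviewed_at="2026-05-02T12:00:00Z",
)
serialized = json.dumps(statement, sort_keys=True)
```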

B9. NEW: “Sigstore/gitsign as configurable alternative; SLSA targeting (L1→L2→L3)” (PENDING §5.B sign-off)

Per Ch 13 — Sigstore (Fulcio/Rekor/gitsign) as alternative-not-replacement for GPG/SSH
SLSA Build level targets by Crosswalker release: L1 at v0.1, L2 at v1.0, L3 at v2.0+ (gitsign + Fulcio + ephemeral keys)

B10. NEW: “Tier 3 engine stack” (PENDING §5.C user input)

Per Ch 11 partial convergence — Tier 3 default is split between AGE+Jena Fuseki (deliverables b+c) and TerminusDB-as-primary (deliverable a)
One row blocked on user signal; not yet a roadmap delta

B11. NEW: “OSCAL native support (FedRAMP RFC-0024 demand-side validation)” (PENDING §5.C user input)

Per Ch 11c — FedRAMP RFC-0024 mandates machine-readable authorisation packages by Sept 2026; OSCAL native import/export becomes a 10× value-multiplier for Crosswalker’s federal market
Could promote OSCAL from “feature” to “architectural concern”; not yet a roadmap delta

C. Decision log (bottom of roadmap) — append entries

The current “Decision log” section at the bottom of the roadmap lists 04-03 through 04-09 entries. Append:

Foundation state of play (orientation, 05-01 AM) — web-of-webs framing, six open questions
Foundation commitments + follow-on research (decisions, 05-01 PM) — five commitments, three new challenges, StewardshipProfile rename, meta-schema commitment
Direction — research wave + roadmap reshape (this log, 05-02) — the present log

D. Inventory

Total proposed roadmap edits:

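6 existing items reviewed (A1–A6); A6 needs no change
9 new items added (B1–B9; B6–B9 pending §5.B sign-off), plus 2 candidates not yet counted as deltas (B10–B11, pending §5.C user input)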
3 decision-log entries appended (C)

These are deltas to the active Foundation section only. The Formats / Crosswalks / Evolution / Community sections of the roadmap are not touched in this pass.

StewardshipProfile rename ripples (proposed edits)

A grep across docs/src/content/docs finds 27 files containing “EvolutionPattern” (last counted 2026-05-02). The rename strategy preserves history but updates the canonical present:

E1. Concept pages (canonical present — full update)

These pages define the term. On first mention, replace “EvolutionPattern” with “StewardshipProfile (formerly EvolutionPattern)”; use “StewardshipProfile” thereafter:

concepts/terminology.mdx (3 mentions) — entry header
concepts/ontology-evolution.mdx (3 mentions)
concepts/ontology-lifecycle.mdx (2 mentions)
concepts/institutional-landscape.mdx (2 mentions)
concepts/operational-landscape.mdx (4 mentions)
concepts/what-makes-crosswalker-unique.mdx (8 mentions — most-affected concept page)

E2. Reference pages — full update (registry + roadmap)

reference/registry/cis.mdx (1)
reference/registry/mitre.mdx (1)
reference/registry/nist.mdx (1)
reference/registry/oscal.mdx (1)
reference/roadmap/index.mdx (3) — covered by §7 A2 above; same rename pass

E3. Active research challenges — full update

agent-context/zz-challenges/03-competitive-landscape.mdx (1)
agent-context/zz-challenges/04-first-principles-audit.mdx (1)
agent-context/zz-challenges/05-transformation-problem.mdx (1)

E4. Historical log entries — top-of-file rename callout, body preserved

These logs were written before the rename. Add a single :::tip callout at the top of each linking to 05-01 §3.2; leave body alone (preserves history). The most-prominent callout goes on the original EvolutionPattern taxonomy draft.

agent-context/zz-log/2026-04-03-evolution-pattern-taxonomy-draft.mdx (4) — prominent rename callout
agent-context/zz-log/2026-04-03-deep-research-synthesis.mdx (1)
agent-context/zz-log/2026-04-03-distribution-architecture-research.mdx (1)
agent-context/zz-log/2026-04-03-layered-architecture-vision.mdx (2)
agent-context/zz-log/2026-04-03-vision-alignment-decisions.mdx (2)
agent-context/zz-log/2026-04-04-volatility-and-registry.mdx (1)
agent-context/zz-log/2026-04-08-ontology-evolution-first-principles.mdx (5)
agent-context/zz-log/2026-04-09-primitives-depth-and-pluggable-layers.mdx (1)
agent-context/zz-log/2026-04-09-user-first-ontology-maintenance.mdx (2)
agent-context/zz-log/2026-04-10-foundation-research-synthesis.mdx (12) — most-mentioned log; prominent callout

E5. Current logs (05-01 and 05-02) — already use the “StewardshipProfile (formerly EvolutionPattern)” pattern; no edit

agent-context/zz-log/2026-05-01-foundation-state-of-play.mdx (6) — already updated
agent-context/zz-log/2026-05-01-foundation-commitments-and-followon-research.mdx (12) — the rename log itself
agent-context/zz-log/2026-05-02-direction-research-wave-and-roadmap-reshape.mdx (3) — this log; references the rename in §3.2 link only

Rename mechanics

Categories E1–E3 (canonical present, 14 files): surgical Edits replacing “EvolutionPattern” → “StewardshipProfile (formerly EvolutionPattern)” on first mention per file, “StewardshipProfile” thereafter
Category E4 (historical, 10 files): a single inserted :::tip[Renamed: EvolutionPattern → StewardshipProfile] callout at the top of each, ~3 lines. Body unchanged
Total = ~24 file edits in the rename pass

The rename pass should be its own commit, separate from this log’s commit, so the diff is reviewable.
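
The two mechanics can be sketched as plain string transforms. This is a hedged illustration rather than the script that will be used: the function names are hypothetical, the callout body text is a placeholder, and the frontmatter handling assumes the usual `---` delimited block at the top of an .mdx file. Plurals and code identifiers would still need a manual look.

```python
import re

OLD, NEW = "EvolutionPattern", "StewardshipProfile"

# Callout title as listed for category E4; the body sentence is a placeholder.
CALLOUT = (
    ":::tip[Renamed: EvolutionPattern → StewardshipProfile]\n"
    "This log predates the rename; see the 05-01 decision log §3.2.\n"
    ":::\n\n"
)


def rename_canonical(text: str) -> str:
    """E1-E3: first mention gets the '(formerly ...)' parenthetical, later ones do not."""
    seen = 0

    def swap(_match):
        nonlocal seen
        seen += 1
        return f"{NEW} (formerly {OLD})" if seen == 1 else NEW

    return re.sub(re.escape(OLD), swap, text)


def add_history_callout(text: str) -> str:
    """E4: insert the callout after any frontmatter and leave the body untouched."""
    if text.startswith("---\n"):
        end = text.find("\n---\n", 3)
        if end != -1:
            split = end + len("\n---\n")
            return text[:split] + "\n" + CALLOUT + text[split:]
    return CALLOUT + text
```

Running `rename_canonical` over the 14 canonical-present files and `add_history_callout` over the 10 historical logs would produce the ~24 file edits counted above.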

What’s still deferred

The deferred list is shorter now that Ch 11/12/13 have been resolved. Remaining items:

| Deferred topic | Gates on |
| --- | --- |
| Tier 3 default stack (AGE+Jena vs TerminusDB-as-primary) | §5.C user input |
| TerminusDB role (vault-mirror only / Tier 3 alternative / primary) | §5.C user input |
| OSCAL native support architectural priority | §5.C user input |
| Layered Tier 2 vs single-engine collapse | Challenge 14 deliverable (Grafeo viability) |
| Bi-temporal Datalog (Minigraf) for SSSOM | Challenge 14 |
| LinkML adoption as canonical schema substrate | Possible Challenge 15 (await user signal) |
| IPLD content-addressed crosswalk distribution | Possible Challenge 16 (await user signal) |
| Reactive/incremental computation (Feldera/DBSP) | Possible Challenge 17 (await user signal); see §10 |
| Concrete versioning convention for Crosswalker-internal schemas | Phase B′ synthesis |
| StewardshipProfile keep/replace/stack (formal grounding via Stojanovic & Flouris) | Already on Foundation roadmap |
| Marketplace mechanics (Obsidian plugin distribution, signing, sandboxing) | Post-Foundation (Phase 2) |
| Cross-vault federation protocol | Phase 2 (informed by Linked Data Fragments + Comunica federated SPARQL per Ch 11/12) |
| LLM/AI-assisted features (NL query, schema-matching assistance, mapping suggestion) | Post-Foundation (informed by OAEI 2025/26 hybrid LLM patterns; see §10) |
| Multi-user collaboration / CRDT layer | Post-Foundation (informed by Yjs/Loro/Automerge/cr-sqlite analyses; see §10) |
| Marketplace meta-schema (schema-of-schema question) | Resolved in spirit by §2.5 dog-food; concrete mechanics still deferred |

§10 Long-horizon ideas considered, not committed

This section catalogs architectural ideas that surfaced during the second research wave but are not adopted in this log. They are worth surfacing for future decisions and may seed individual research challenges if the user signals interest.

LinkML as canonical schema substrate (Tier 0)

Source: Ch 12b §2.3.

The idea: LinkML is the schema language SSSOM itself is defined in. From a single YAML LinkML schema, codegen produces JSON Schema, ShEx, SHACL, OWL, Python dataclasses, Pydantic models, TypeScript types, and more. If Crosswalker adopted LinkML as a “Tier 0,” every engine choice (DuckDB / Oxigraph / Nemo / TerminusDB / etc.) would become a serializer/deserializer plugin against the canonical LinkML schema rather than a competing schema authority.

Why interesting: Resolves the schema-substrate question that the §2.5 dog-food commitment opened. Future-proof against engine churn (engines come and go; LinkML schemas persist).

Why deferred: Adopting LinkML as Tier 0 is a meaningful architectural commitment with cascading implications across every Crosswalker schema (StewardshipProfile, junction-note 13-field, FrameworkConfig, _crosswalker metadata, etc.). Worth a dedicated Challenge 15 if the user signals interest.

Possible Challenge 15 brief: “Adopt LinkML as Crosswalker’s canonical schema substrate? Evaluate codegen overhead, contributor learning curve, alignment with SSSOM-py / OAK / Mondo prior art, migration path from current ad-hoc schemas.”

IPLD content-addressed crosswalks

Source: Ch 12b §4.2.

The idea: Every SSSOM row, bundle, and mapping_set hashes to an IPLD CID. Crosswalk releases become immutable Merkle DAGs. An audit can verify “we used exactly these mappings on the assessment date” by checking one hash. Distribution via CAR files; signing via W3C VCs.

Why interesting: Solves provenance permanently; couples cleanly with the cid: content-addressed identifier convention from Ch 09; enables federated distribution without a central registry.

Why deferred: Adds operational complexity; depends on at least some users adopting IPLD-aware tooling.

Possible Challenge 16 brief: “IPLD/CAR content-addressed distribution for SSSOM mapping sets — feasibility, tooling overhead, integration with existing Git-based vault.”
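
The "check one hash" property can be shown without any IPLD tooling. The sketch below is a simplified stand-in, not IPLD: it uses plain SHA-256 hex digests and an ad-hoc canonical row encoding instead of DAG-CBOR blocks and CIDs, and the function names and row values are hypothetical.

```python
import hashlib


def row_hash(row: dict) -> str:
    """Hash one SSSOM row using a simple canonical form (keys sorted, tab-joined)."""
    canonical = "\t".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def mapping_set_hash(rows) -> str:
    """Hash the sorted row hashes, so the result ignores row order."""
    leaves = sorted(row_hash(row) for row in rows)
    return hashlib.sha256("\n".join(leaves).encode("utf-8")).hexdigest()


# Made-up rows; these are not real framework mappings.
rows = [
    {"subject_id": "fwA:1", "predicate_id": "skos:exactMatch", "object_id": "fwB:1"},
    {"subject_id": "fwA:2", "predicate_id": "skos:broadMatch", "object_id": "fwB:7"},
]
release_hash = mapping_set_hash(rows)

# An auditor later recomputes the hash over the rows they were given.
same_rows_reordered = list(reversed(rows))
tampered = [dict(rows[0], object_id="fwB:9"), rows[1]]
```

Recomputing the hash over the reordered rows reproduces `release_hash`, while the tampered set yields a different value, which is the property an audit would rely on.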

Tier 1.5 compilation pipeline

Source: Ch 12b §2.2.

The idea: A Rust CLI (crosswalker compile) ingests the vault and emits multi-target artifacts in one pass: mappings.parquet (DuckDB/Polars/DataFusion), mappings.hdt and mappings.ttl (Comunica/Oxigraph), mappings.json-ld (web), oscal.json and sssom.tsv (regulator-facing exports), mappings.car (IPLD content-addressed bundle), rules.wasm (Ascent-compiled Datalog rules for browser inference).

Why interesting: Decouples the canonical layer (Markdown + SSSOM TSV) from any specific query engine. Engines become consumers of the compiled artifacts rather than competing storage layers. Each tier loads only the artifact it needs.

Why deferred: Substantial new architectural component. Worth its own decision after §5.B is locked in.

Reactive/incremental computation for derived crosswalk views

Source: Ch 12b §2.4 + Ch 11b §4.7.

The idea: Adopt a Differential-Dataflow / Materialize-style execution model so derived views (coverage matrices, gap reports, MITRE ATT&CK Navigator overlays) update partially on each Markdown save rather than rebuilding from scratch. Realistic substrates: declarative-dataflow, Materialize OSS, Feldera/DBSP.

Why interesting: Live crosswalk editing is naturally a streaming workload; partial updates would dramatically improve UX at scale.

Why deferred: None of these run in the browser yet. Crosswalker’s “files canonical, derived stores rebuildable” principle already provides coarse incrementalism. Revisit if rebuild times become unbearable. Possible Challenge 17.
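
The difference between rebuilding and updating can be seen on a toy coverage matrix. This sketch only illustrates applying deltas to a derived view; it is not Differential Dataflow or DBSP, and the class name, row shape, and identifiers are assumptions.

```python
from collections import Counter


class CoverageView:
    """Counts mappings per (subject framework, object framework) pair."""

    def __init__(self):
        self.counts = Counter()

    @staticmethod
    def _key(row):
        # Framework = CURIE prefix of each side of the mapping.
        return (row["subject_id"].split(":", 1)[0], row["object_id"].split(":", 1)[0])

    def add(self, row):
        self.counts[self._key(row)] += 1

    def remove(self, row):
        key = self._key(row)
        self.counts[key] -= 1
        if self.counts[key] == 0:
            del self.counts[key]


def rebuild(rows):
    """Full rebuild: the coarse path that 'derived stores rebuildable' already allows."""
    view = CoverageView()
    for row in rows:
        view.add(row)
    return view


# Made-up rows for illustration.
r1 = {"subject_id": "fwA:1", "object_id": "fwB:1"}
r2 = {"subject_id": "fwA:2", "object_id": "fwB:7"}
r3 = {"subject_id": "fwA:3", "object_id": "fwC:4"}

live = rebuild([r1, r2])
live.add(r3)       # a save that adds one mapping touches one counter
live.remove(r1)    # a save that deletes one mapping touches one counter
```

After both deltas, `live.counts` equals what a full `rebuild([r2, r3])` would produce, at the cost of two counter updates instead of a scan over every row.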

CRDT-based collaborative editing

Source: Ch 11 §2.4 (all three deliverables), Ch 12b §4.1.

The idea: Yjs (~10–30 KB gzipped, mature ecosystem) or Loro (newer, Rust-native, full editing-history DAG) for live multi-analyst editing of SSSOM rows. Or cr-sqlite (CRDT SQLite, preserves SQL shape over CRDT semantics).

Why interesting: Live-edit team mode is on the deferred list; this is the architectural option for it.

Why deferred: Live-edit is post-Foundation (Phase 2). When the time comes, Yjs is the safe default; cr-sqlite is the architecturally-intriguing alternative because it preserves SQL semantics natively.

AI-augmented mapping (LogMap-LLM, GenOM)

Source: Ch 12b §2.5 + Ch 11 §2.6.

The idea: OAEI 2025/2026 results show that a hybrid of a symbolic skeleton (LogMap/AML), embedding retrieval (sqlite-vec or LanceDB-WASM), and an LLM oracle outperforms pure symbolic baselines by 5–16% F1 on biomedical alignments. The same pattern transfers to GRC frameworks. Architecture: vector-search the nearest unmapped controls → pass the top-K candidates to the LLM → the LLM emits an SSSOM row with mapping_justification, confidence, and predicate_modifier: candidate for review.

Why interesting: LLMs are now empirically validated for ontology alignment. Combined with sqlite-vec or LanceDB at Tier 2, in-Obsidian “suggest next mapping” UX becomes feasible.

Why deferred: Crosswalker’s GRC audience is acutely privacy-sensitive — LLM placement (in-browser via WebLLM, local sidecar via Ollama, BYOK cloud) is a separate architectural question. Defer until the basic three-tier engine stack is shipping.
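
The retrieval-then-oracle flow can be sketched with toy vectors and a stub in place of the LLM. Everything here is illustrative: the function names, the `oracle` callable, the `review_status` field, and the choice of `semapv:UnspecifiedMatching` as justification are assumptions, and a real build would use an embedding index such as sqlite-vec rather than a Python loop.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, candidates, k):
    """Return the ids of the k candidate controls most similar to the query."""
    scored = sorted(
        candidates.items(), key=lambda item: cosine(query_vec, item[1]), reverse=True
    )
    return [control_id for control_id, _ in scored[:k]]


def suggest(subject_id, subject_vec, candidates, oracle, k=3):
    """Ask the oracle about each retrieved candidate; keep accepted ones as SSSOM-like rows."""
    rows = []
    for object_id in top_k(subject_vec, candidates, k):
        verdict = oracle(subject_id, object_id)   # stand-in for the LLM call
        if verdict is None:
            continue
        predicate, confidence = verdict
        rows.append({
            "subject_id": subject_id,
            "predicate_id": predicate,
            "object_id": object_id,
            "mapping_justification": "semapv:UnspecifiedMatching",  # assumed value
            "confidence": confidence,
            "review_status": "candidate",  # hypothetical marker for human review
        })
    return rows


# Toy embeddings and a stub oracle; none of this is real mapping data.
candidates = {"fwB:1": [1.0, 0.0], "fwB:2": [0.9, 0.1], "fwB:3": [0.0, 1.0]}
stub_oracle = lambda s, o: ("skos:closeMatch", 0.8) if o == "fwB:1" else None
suggestions = suggest("fwA:1", [1.0, 0.0], candidates, stub_oracle, k=2)
```

Keeping the oracle behind a plain callable leaves the LLM placement question (in-browser, local sidecar, or BYOK cloud) open, since any of them can be wrapped in the same signature.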

Verifiable Credentials for mapping_set provenance

Source: Ch 13 §5 + Ch 12b §4.7.

The idea: W3C VCs v2.0 (Recommendation since May 2025) on the mapping_set so consumers can cryptographically verify “this mapping was signed by the official NIST OLIR submitter.” Combines naturally with IPLD CID-anchored bundles.

Why deferred: Track for v2.0+; auditor-familiarity is “very low” today (per Ch 13 table). EU eIDAS 2 + EUDI Wallet rollout is the catalyst that will eventually mainstream VCs in audit toolchains.

OSCAL native support (FedRAMP RFC-0024 demand-side validation)

Source: Ch 11c §4.4.

The idea: FedRAMP RFC-0024 mandates machine-readable authorisation packages by September 2026. NIST OSCAL (catalog/profile/component-definition/SSP/AP/AR/POA&M models) is the format federal agencies will require. A Crosswalker that natively imports/exports OSCAL becomes 10× more valuable to the federal market.

Decision flagged in §5.C — this could become an architectural concern, not just a feature, depending on user input.

Federated crosswalks via Linked Data Fragments

Source: Ch 12b §4.7.

The idea: Every Crosswalker installation publishes a Triple Pattern Fragments (TPF) endpoint; Comunica federates queries across them client-side. No central registry. Combined with W3C VCs for signed mapping_sets, this enables a fully federated crosswalk distribution model without a central authority.

Why deferred: Cross-vault federation is Phase 2.

§11 Related

The 05-01 pair (load-bearing for today’s direction):

Foundation state of play (orientation, 05-01 AM) — web-of-webs framing, six open questions
Foundation commitments and follow-on research (decisions, 05-01 PM) — five commitments, three new challenges, the StewardshipProfile rename, the meta-schema commitment

First-wave deliverables (Ch 08/09/10 — 2026-05-02 morning):

Second-wave deliverables (Ch 11/12/13 — 2026-05-02 afternoon/evening):

Ch 11 deliverable A: Engine survey, TerminusDB-as-Tier-3 emphasis — includes Grafeo follow-up
Ch 11 deliverable B: Engine survey, layered Tier 2 stack — DuckDB-WASM + Oxigraph-WASM + Nemo-WASM
Ch 11 deliverable C: Engine survey, layered + OSCAL/FedRAMP angle — multi-agent validation; FedRAMP RFC-0024 strategic insight
Ch 12 deliverable A: Datalog vs SQL focused fork-in-the-road
Ch 12 deliverable B: Beyond the known engine landscape (long-horizon) — LinkML, IPLD, Tier 1.5 compilation, AI-augmented mapping
Ch 13 deliverable: Modern attestation primitives — Sigstore/in-toto/SLSA/OpenTimestamps/VCs/QLDB (dead)

The challenges (briefs):

Challenge 08: Git audit-trail tenability — ✅ resolved (Ch 13 follow-on)
Challenge 09: UUID/CWUUID strategy — ✅ resolved
Challenge 10: Graph→tabular bridging engine — ✅ resolved (Ch 11+12 follow-on)
Challenge 11: Tier 2/3 engine deep survey — ✅ resolved (3 deliverables)
Challenge 12: Datalog vs SQL for SSSOM chain rules — ✅ resolved (2 deliverables)
Challenge 13: Modern attestation primitives — ✅ resolved
🆕 Challenge 14: Missed engines evaluation — Phase B′ follow-on; spun up today

Roadmap (target of this log’s deltas):