Import primitive formal foundation — Ch 20 synthesis (wargaming setup, not a decision log)
§1 Why this log exists
The user ran Challenge 20: Import primitive formal foundation and received three independent fresh-agent deliverables plus a substantive user/agent dialog that crystallized two architectural insights. The four artifacts:
- Ch 20a, T1TMA (Tier-1 Term-Map Algebra): 6 primitives, YARRRML-retargeted, MTT-justified, lens-contracted
- Ch 20b, Boundary semantics (ref/resolve/bind/seal): 4-primitive boundary contract, Backpack/Nix-style content addressing
- Ch 20c, 5+4 primitive set (RML retargeted): 5 import primitives + 4 output sinks; justified by s-t tgds, MTT, and functorial migration
- Ch 20 dialog, ETL + fundamental forms: user/agent exchange validating “data has fundamental forms; ETL is mapping between forms; the primitives are tree transducers”
This synthesis log is the wargaming setup the user explicitly asked for. It anchors the discussion at concrete examples and walks through scenarios where each architectural choice exhibits its tradeoffs.
§2 The reference baseline — what we’re improving on
Per user: “the Obsidian Importer plugin is an example of an overly simplified process of what importing could look like.”
The Obsidian Importer plugin (and most vendor importers) demonstrates what a “simple” import looks like:
| What it does | What’s wrong with it |
|---|---|
| File-format-specific code paths (one for Notion, one for Bear, one for Roam, etc.) | Adding a new format means writing new code; no recipe abstraction |
| Hardcoded mapping from source structure to vault structure | No schema; the mapping rules live in the plugin’s source code, not as data |
| No provenance recording — generated notes don’t know where they came from | Re-import is destructive; no Merkle history; “what changed” is unanswerable |
| No identity model — two imports of the same source produce different vaults | Collaborators can’t merge; sharing recipes is impossible |
| No protocol surface — only Obsidian-internal logic can pull data in | External systems (MCP servers, agents, custom CLIs, APIs) have no standardized way to push data into a Crosswalker vault |
| Limited to file-on-disk sources | Can’t ingest from URLs, Git repos, MCP resources, or live APIs without extra plumbing |
This is the bar to clear. Any v0.x architecture for Crosswalker’s import side must solve at least the provenance, identity, recipe-as-data, and protocol-surface problems — because those are the four that durably differentiate Crosswalker from “yet another importer plugin.” The deliverables below all address these to varying depths.
§3 The user’s framing — graph-aware ETL with a protocol surface
Three architectural insights from the user/agent dialog:
3.1 Crosswalker is graph-aware, semantically-constrained, format-diverse ETL
User question: “So what I’m essentially building is an ETL engine for importing frameworks and ontologies into my Crosswalker system.” Confirmed. The agent sharpened the point: Crosswalker is distinct from generic tabular ETL (dbt, dlt, Singer) in three structural ways:
- Graph-aware output: records aren’t flat rows, they’re nodes in a graph. A NIST 800-53 control has typed relationships (parent family, child enhancements, related controls, mapped-to controls in other frameworks). Generic ETL frameworks assume tabular-in tabular-out; Crosswalker’s ETL must produce a graph.
- Semantically-constrained target: the target schema isn’t arbitrary — it’s the Tier-1 representation that downstream systems (STRM, SSSOM, ontology diff, Nemo derivation) consume with expectations. The target shape is closed by design.
- Format-diverse sources: the same conceptual entity arrives in many serializations. NIST 800-53 r5 ships as CSV, XLSX, OSCAL JSON, OSCAL XML, PDF. The ETL primitive must abstract over physical format while preserving conceptual mapping.
3.2 Data has fundamental forms; ETL is mapping between forms
User question: “Am I oversimplifying when I think of ETL as changing the shape of the data?” No. The agent confirmed the insight and sharpened it to its formal foundation:
- Every data format reduces to a small number of structural forms: tables (depth-2 trees), trees (the universal form), graphs (trees-of-adjacency-lists when serialized).
- Tables and graphs are special cases of labeled trees. The universal structural form is the labeled tree.
- ETL is tree-to-tree transformation.
- The primitive operations of tree-to-tree transformation are macro tree transducer rules — provably 5 irreducible operations (Engelfriet/Vogler 1985, Fülöp/Vogler 1998) that compute exactly the primitive recursive functions on trees.
- ChunkyCSV (the user’s earlier tool) is a tree transducer specialized for the table↔tree depth crossing. JSONaut (the user’s other tool) is a tree-to-tree transducer specialized for JSON manipulation. Both are concrete instances of the general primitive.
The user arrived at this insight from first-principles intuition; the formal literature converges on the same answer from the theoretical direction. Both directions licensing the same primitive set is the strongest first-principles evidence available.
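As a rough illustration of the “tables are depth-2 trees” and “ETL is tree-to-tree transformation” claims, the sketch below encodes a small table as a labeled tree and applies one reshaping rule that nests rows under a shared column value. The `Node` type, the function names, and the sample rows are hypothetical; this is not the actual code of ChunkyCSV or JSONaut, and it is an ordinary recursive function rather than a formal macro tree transducer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A labeled tree node: a label, an optional leaf value, and children."""
    label: str
    value: Optional[str] = None
    children: list["Node"] = field(default_factory=list)


def table_to_tree(rows: list[dict[str, str]]) -> Node:
    """Encode a table as a depth-2 tree: table -> row -> cell."""
    return Node("table", children=[
        Node("row", children=[Node(col, val) for col, val in row.items()])
        for row in rows
    ])


def group_by(tree: Node, key: str) -> Node:
    """Tree-to-tree rule: nest rows under the value of one column (depth 2 -> 3)."""
    groups: dict[str, Node] = {}
    for row in tree.children:
        cells = {cell.label: cell.value for cell in row.children}
        group = groups.setdefault(cells[key], Node(key, cells[key]))
        # Keep every cell except the grouping key, which moved up a level.
        group.children.append(
            Node("row", children=[c for c in row.children if c.label != key])
        )
    return Node("grouped", children=list(groups.values()))


def depth(node: Node) -> int:
    """Number of edges on the longest root-to-leaf path."""
    return 0 if not node.children else 1 + max(depth(c) for c in node.children)


rows = [
    {"Family": "AC", "Control Identifier": "AC-1"},
    {"Family": "AC", "Control Identifier": "AC-2"},
    {"Family": "AU", "Control Identifier": "AU-1"},
]
flat = table_to_tree(rows)
nested = group_by(flat, "Family")
```

Here `flat` has depth 2 and `nested` has depth 3, which is the table-to-tree depth crossing the bullet list describes.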
3.3 The import surface is potentially a protocol, not just Obsidian-internal logic
User: “The goal is to have a system where you could define transforms or maybe in the future idk — we can architect other connections (doesn’t need to be logic that lives in Obsidian) to pull data in.”
This is significant. The deliverables describe the import primitive as a recipe/transformation algebra living inside Crosswalker. But the user’s framing imagines the import side as a protocol surface: external systems (MCP servers, agent extractors, custom CLIs, APIs, third-party tools) connect to Crosswalker’s import side directly and push data in. The recipe layer remains, but it can also be invoked by external systems through a typed protocol.
This insight strengthens the case for Run B’s boundary-semantics layer — when sources are external systems rather than local files, content-addressing, sealed manifests, capability-typed effects, and Merkle provenance matter more, not less. It also activates the platform-not-monolith pillar (“Spec / Library / Integrations”) that’s already a Foundation commitment.
§4 The convergence — Layer B (transformation algebra)
Runs A and C give essentially the same answer at the transformation-algebra layer. Side-by-side:
| Aspect | Run A (T1TMA) | Run C (5+4 primitive set) |
|---|---|---|
| Primitive count | 6 (ITERATE, REFERENCE, TEMPLATE, BIND, JOIN, INVERT) | 5 + 4 sinks (Source, Term, Map, Join, Function + path/frontmatter/body/wikilink) |
| Output vocabulary | Closed Tier-1 slot vocab: id, label, body.section, frontmatter.k, links.role, folder, aliases, tags, metadata.sssom-key | 4 sinks: path, frontmatter[k], body[region], wikilink[role] |
| Bidirectionality | INVERT primitive (sixth op) for opt-in lens put | bidirectional: true annotation flag at NoteMap level |
| Surface DSL | YARRRML-shaped YAML | YARRRML-flavored YAML |
| Expression sub-language | JSONata | JSONata (in Function primitive) |
| Tabular type profile | CSVW | CSVW (via Source.options) |
| Bundle target | ~480 KB total | ~620 KB total |
| Theoretical justification | MTT primitive recursion + Foster/Pierce lens semantics | s-t tgds (Fagin/Kolaitis/Popa) + MTT + functorial data migration |
| Reject list | CQL/FDM (too academic), full Boomerang (no JS impl), Datalog-as-DSL (Datalog stays for derivation), pure JSONata/Jolt (no source/sink abstraction) | Same |
| Adopt list | RML/YARRRML shape; CSVW; JSONata; SSSOM/T-style filter→action | Same |
| NIST 800-53 worked example | Yes, OSCAL CSV path | Yes, OSCAL JSON path |
The two are essentially the same recommendation stated from slightly different angles. Run A names INVERT as a sixth primitive; Run C makes bidirectionality an annotation. Otherwise the architectural shape is identical: YARRRML retargeted from RDF triples to Tier-1 Notes, with the Tier-1 sink vocabulary as the closure constraint, JSONata as the expression layer, CSVW as the tabular type profile, MTT as the completeness theorem, and Foster lenses as the round-trip contract.
This convergence is the strongest signal. When two independent fresh-agent runs reach the same architectural recommendation through slightly different paths (Run A starts from “what’s the minimal complete primitive for tree-to-tree transformation”; Run C starts from “what does data exchange theory say is the minimal s-t tgd for this problem”), it’s evidence that the count and shape are right.
§5 The complementary layer — Layer A (boundary semantics)
Run B operates at a different layer: not transformation algebra but boundary semantics. It answers a different question.
The two layers compose cleanly. Layer A handles what makes the import legal at the vault boundary; Layer B handles what makes the import expressible as a transformation. Recipe authors interact primarily with Layer B (the YARRRML-shaped DSL); Layer A is enforced by the runtime (hash-pinning, sandbox, manifest-sealing).
The user’s “external connections / protocol surface” insight strengthens Layer A’s case dramatically. When sources are external systems (MCP servers, APIs, agent extractors, custom CLIs) rather than local files, the boundary semantics matter more — provenance, content-addressing, sandboxed effects, and capability typing become the load-bearing properties. A vendor-style local-file-only importer can ignore Layer A; a protocol-based system cannot.
§6 Concrete worked examples — the wargaming gallery
This section is the centerpiece. Walk through each scenario and observe what each architectural choice does.
Example 1 — Importing a NIST 800-53 r5 CSV (the simplest case)
The source: a CSV with columns Control Identifier, Control Name, Control Text, Family, Discussion, Related Controls.
Vendor importer (Obsidian Importer baseline):
Mapping rules live in the plugin source code. No recipe. No provenance. Re-imports overwrite.
v0.1 practical ImportRecipe (current spec §4):
Recipe-as-data; column roles + 24 transform types; works for the 90% case.
v0.2 primitive-grounded recipe (Runs A + C convergent):
What Path A vs Path B does to this example:
- Path A (ship v0.1 practical, migrate to v0.2): authors write the v0.1 form now, and it is auto-transpiled to v0.2 in 6 months. Recipe migration is mechanical.
- Path B (v0.1 primitive-grounded from start): authors write the v0.2 form from day one. ~5 lines longer than v0.1.
What Layer A adds (orthogonal to Path A/B): the recipe references nist-800-53-r5.csv by content-digest, not just filename. Provenance is recorded. Re-imports are typed deltas, not destructive overwrites.
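To make “recipe as data” concrete, here is a minimal sketch of a recipe applied to a CSV with the columns listed above. The recipe layout (`terms`, `sinks`), the `source_digest` frontmatter key, the output path pattern, and the single sample row are all assumptions made for illustration; neither the v0.1 nor the v0.2 recipe syntax is reproduced here, and `str.format` templates stand in for whatever expression language the real recipe uses.

```python
import csv
import hashlib
import io

# Stand-in for nist-800-53-r5.csv: one illustrative row, not the real catalog text.
CSV_TEXT = (
    "Control Identifier,Control Name,Control Text,Family,Discussion,Related Controls\n"
    'AC-1,Policy and Procedures,Develop and document policy.,AC,Background.,"PM-9, PS-8"\n'
)

# Hypothetical recipe: pure data, no format-specific code path.
RECIPE = {
    "terms": {  # term name -> source column
        "id": "Control Identifier",
        "name": "Control Name",
        "text": "Control Text",
        "family": "Family",
        "related": "Related Controls",
    },
    "sinks": {
        "path": "Frameworks/NIST-800-53/{family}/{id}.md",
        "frontmatter": {"id": "{id}", "label": "{name}"},
        "body": {"statement": "{text}"},
        "wikilink": {"related": "related"},  # role -> term holding a comma list
    },
}


def run_recipe(recipe: dict, raw: bytes) -> list[dict]:
    """Apply the recipe to raw CSV bytes and return one note dict per row."""
    digest = "sha256:" + hashlib.sha256(raw).hexdigest()
    sinks = recipe["sinks"]
    notes = []
    for row in csv.DictReader(io.StringIO(raw.decode("utf-8"))):
        terms = {name: row[col] for name, col in recipe["terms"].items()}
        notes.append({
            "path": sinks["path"].format(**terms),
            "frontmatter": {
                **{k: t.format(**terms) for k, t in sinks["frontmatter"].items()},
                "source_digest": digest,  # provenance: which bytes produced this note
            },
            "body": {k: t.format(**terms) for k, t in sinks["body"].items()},
            "wikilinks": {
                role: [f"[[{x.strip()}]]" for x in terms[term].split(",") if x.strip()]
                for role, term in sinks["wikilink"].items()
            },
        })
    return notes


notes = run_recipe(RECIPE, CSV_TEXT.encode("utf-8"))
```

Because the digest is computed over the source bytes and stamped on every note, a later re-import can tell whether it is looking at the same artifact or a new one.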
Example 2 — Importing OSCAL JSON (tree-shaped source, format diversity)
Same recipe primitives, different formulation:
The conceptual mapping is the same as Example 1. Only the formulation and the access expressions change. The recipe author doesn’t rewrite their entire recipe just because the source format changed. This is the format-diversity property: the third structural point of §3.1 made concrete.
What this shows: the primitive set’s formulation parameterization absorbs format diversity for free. Vendor importer pattern requires a new code path; v0.1 practical requires column-role rewrites; v0.2 primitive-grounded requires only the formulation switch.
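The formulation switch can be sketched as follows: one format-independent mapping, plus a per-format pair of iterator and access expressions. Everything named here (`FORMULATIONS`, `conceptual_map`, the path pattern) is hypothetical, the JSON is a heavily simplified OSCAL-like shape rather than a real OSCAL catalog, and plain lambdas stand in for the JSONata expressions a real recipe would carry.

```python
import csv
import io
import json

CSV_TEXT = "Control Identifier,Control Name,Family\nAC-1,Policy and Procedures,AC\n"

# Heavily simplified, OSCAL-like shape; real OSCAL catalogs carry far more structure.
JSON_TEXT = json.dumps({"catalog": {"groups": [
    {"id": "ac", "controls": [{"id": "ac-1", "title": "Policy and Procedures"}]}
]}})


def iterate_csv(text):
    """CSV formulation: one record per row."""
    yield from csv.DictReader(io.StringIO(text))


def iterate_json(text):
    """JSON formulation: one record per control, with its parent group attached."""
    for group in json.loads(text)["catalog"]["groups"]:
        for control in group["controls"]:
            yield {"group": group, "control": control}


# Only the formulation (iterator + access expressions) differs per format.
FORMULATIONS = {
    "csv": (iterate_csv, {
        "id": lambda r: r["Control Identifier"],
        "label": lambda r: r["Control Name"],
        "family": lambda r: r["Family"],
    }),
    "json": (iterate_json, {
        "id": lambda r: r["control"]["id"].upper(),
        "label": lambda r: r["control"]["title"],
        "family": lambda r: r["group"]["id"].upper(),
    }),
}


def conceptual_map(terms):
    """The format-independent part: terms -> note path and frontmatter."""
    return {
        "path": f"Frameworks/NIST-800-53/{terms['family']}/{terms['id']}.md",
        "frontmatter": {"id": terms["id"], "label": terms["label"]},
    }


def run(formulation, text):
    """Iterate the source with the chosen formulation, then apply the shared map."""
    iterate, access = FORMULATIONS[formulation]
    return [
        conceptual_map({name: get(record) for name, get in access.items()})
        for record in iterate(text)
    ]


from_csv = run("csv", CSV_TEXT)
from_json = run("json", JSON_TEXT)
```

Both runs produce identical notes, and `conceptual_map` never learns which serialization it was fed.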
Example 3 — Importing from an MCP server / external API
The protocol-surface case. A Crosswalker recipe whose Source is a URI pointing at an MCP resource:
Layer A is what makes this work safely. The MCP server returns an artifact; Layer A:
- ref constructs a typed reference: (mcp://compliance-research-server/scf/2025-Q4, FrameworkSig, sha256:abc...).
- resolve calls into the MCP server in a sandboxed environment with declared trust roots, fetches bytes, canonicalizes, hashes, and verifies that the digest matches the pin.
- bind materializes Tier-1 notes via Layer B’s transformation algebra.
- seal records the manifest the import was sealed against.
If the MCP server returns different bytes the next time, the digest mismatch is caught. If the server is compromised, the trust-root verification catches it. If two collaborators import from the same MCP source, content-addressing collapses the result. Without Layer A, none of these properties hold for an external-system source.
This is also the example that activates the user’s “other connections, doesn’t need to be logic that lives in Obsidian” framing. The MCP server is not Obsidian-internal logic; it’s an external system that speaks the import protocol. The recipe is invariant under whether the source is a local file or an external system — Layer A handles the boundary, Layer B handles the reshape.
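The four boundary steps can be sketched end to end under some loud assumptions: the `Ref` fields, the manifest layout, the in-memory fake server, and the control ids are all invented for illustration; there is no real MCP client or sandbox here; and sorted-key JSON is used as a simplified stand-in for a proper canonicalization scheme.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Typed reference: where the artifact lives, what it claims to be, and its pin."""
    uri: str
    signature: str
    digest: str


def canonical_digest(payload: dict) -> str:
    """Hash a canonical byte form (sorted keys, fixed separators; a simplification)."""
    data = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(data).hexdigest()


def resolve(ref: Ref, fetch, trust_roots: set[str]) -> dict:
    """Fetch through an injected fetcher, then verify trust root and pinned digest."""
    host = ref.uri.split("://", 1)[1].split("/", 1)[0]
    if host not in trust_roots:
        raise PermissionError(f"untrusted source: {host}")
    payload = fetch(ref.uri)
    actual = canonical_digest(payload)
    if actual != ref.digest:
        raise ValueError(f"digest mismatch: pinned {ref.digest}, got {actual}")
    return payload


def bind(payload: dict) -> list[dict]:
    """Stand-in for Layer B: reshape the payload into Tier-1 note dicts."""
    return [{"path": f"SCF/{c['id']}.md", "frontmatter": {"id": c["id"]}}
            for c in payload["controls"]]


def seal(ref: Ref, notes: list[dict]) -> dict:
    """Record what was imported and from which pinned artifact."""
    return {"source": ref.uri, "digest": ref.digest,
            "notes": sorted(n["path"] for n in notes)}


# A fake in-memory "server" so the sketch runs without any network access.
uri = "mcp://compliance-research-server/scf/2025-Q4"
SERVER = {uri: {"controls": [{"id": "GOV-01"}, {"id": "GOV-02"}]}}
TRUSTED = {"compliance-research-server"}

ref = Ref(uri, "FrameworkSig", canonical_digest(SERVER[uri]))
payload = resolve(ref, SERVER.__getitem__, TRUSTED)
manifest = seal(ref, bind(payload))

# If the server later returns different bytes, resolve refuses the import.
SERVER[uri] = {"controls": [{"id": "GOV-01"}]}
try:
    resolve(ref, SERVER.__getitem__, TRUSTED)
    tampered_accepted = True
except ValueError:
    tampered_accepted = False
```

Injecting `fetch` keeps the boundary check independent of the transport, which is the property that lets the same recipe run against a local file or an external system.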
Example 4 — Re-import / version bump
NIST CSF 1.1 → 2.0 (a real upgrade that broke crosswalks across the GRC industry).
Vendor importer: re-runs the importer; either silently overwrites the old vault content, or duplicates with version suffixes the user has to clean up. No record of what changed.
v0.1 practical: re-imports produce a new set of notes; the user runs git diff to figure out what changed. Possible but manual.
v0.2 + Layer A: the import is a typed delta. Layer A computes:
- Old digest: sha256:abc... (NIST CSF 1.1)
- New digest: sha256:def... (NIST CSF 2.0)
- Both are first-class artifacts in the vault’s import history
- A Migration Crosswalk artifact is automatically proposed: edges where both endpoints have a content-equivalent in the old version map automatically; edges where the target was renamed/restructured surface as user-review tasks
- The ontology diff engine (the existing 9 atomic graph-edit primitives) decomposes the change into adds/removes/relabels/restructures
- Provenance is monotonically extended; nothing is destroyed
This is what “re-import as typed delta against an immutable substrate” means in practice. Without Layer A, this scenario is the most painful re-import experience in compliance work today.
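A toy version of a typed delta is sketched below. It assumes each version is reduced to a map from node id to label and uses only three edit kinds (add, remove, relabel); the real ontology diff engine has 9 atomic primitives that are not specified in this log, so this is a deliberately smaller stand-in. The ids and labels are made up and are not actual NIST CSF content.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edit:
    """One atomic change between two framework versions."""
    op: str  # "add", "remove", or "relabel"
    node_id: str
    old: Optional[str] = None
    new: Optional[str] = None


def typed_delta(old: dict[str, str], new: dict[str, str]) -> list[Edit]:
    """Diff two {node_id: label} snapshots into an ordered edit script."""
    edits = []
    for node_id in sorted(set(old) - set(new)):
        edits.append(Edit("remove", node_id, old=old[node_id]))
    for node_id in sorted(set(new) - set(old)):
        edits.append(Edit("add", node_id, new=new[node_id]))
    for node_id in sorted(set(old) & set(new)):
        if old[node_id] != new[node_id]:
            edits.append(Edit("relabel", node_id, old[node_id], new[node_id]))
    return edits


def propose_migration(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Unchanged nodes map automatically; removed or relabeled nodes need review."""
    auto = sorted(n for n in old if new.get(n) == old[n])
    review = sorted(n for n in old if new.get(n) != old[n])
    return {"auto": auto, "review": review}


# Toy snapshots with invented ids and labels.
v_old = {"T-1": "Asset inventory", "T-2": "Access control", "T-3": "Legacy reporting"}
v_new = {"T-1": "Asset inventory", "T-2": "Identity and access control",
         "T-4": "Governance oversight"}

delta = typed_delta(v_old, v_new)
migration = propose_migration(v_old, v_new)
```

Neither snapshot is mutated, so both versions remain available and the edit script is the only new artifact.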
Example 5 — Two collaborators importing the same framework
Without Layer A: Alice imports NIST CSF 2.0 from https://nist.gov/csf-2.0.json on Monday. Bob imports the same URL on Tuesday. NIST silently fixed a typo Tuesday morning. Alice’s vault and Bob’s vault have different notes. Their crosswalks reference different controls. Merging their vaults is a nightmare.
With Layer A: both Alice and Bob’s recipes pin the source to sha256:abc.... If NIST changed the bytes, Bob’s import is refused with a digest mismatch error — he must explicitly accept the new digest, which creates a new typed artifact. Alice and Bob’s vaults are observationally indistinguishable; merging is trivial.
This is the collaboration safety property. It’s foundational for the user’s “platform architecture, not plugin monolith” pillar — sharing recipes across organizations only works if recipes are content-addressed.
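The pin-and-refuse behaviour can be sketched in a few lines. The function names, the `accept_new` flag, and the sample documents are hypothetical, and sorted-key JSON serialization is only an approximation of a real canonicalization scheme such as RFC 8785.

```python
import hashlib
import json


def digest(doc: dict) -> str:
    """Content digest over a canonical-ish JSON form (sorted keys, fixed separators)."""
    data = json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")
    return "sha256:" + hashlib.sha256(data).hexdigest()


class DigestMismatch(Exception):
    """Raised when the fetched source no longer matches the pinned digest."""


def import_pinned(doc: dict, pinned: str, accept_new: bool = False) -> str:
    """Return the digest the import is recorded under, or refuse on mismatch."""
    actual = digest(doc)
    if actual == pinned:
        return pinned
    if accept_new:
        return actual  # explicit opt-in: recorded as a new, distinct artifact
    raise DigestMismatch(f"pinned {pinned} but source is now {actual}")


monday = {"id": "GV", "title": "Govern"}
# Same content, different key order on the wire: still the same artifact.
monday_reordered = {"title": "Govern", "id": "GV"}
# Upstream changed the text on Tuesday: different content, different artifact.
tuesday = {"id": "GV", "title": "GOVERN"}

pin = digest(monday)
alice = import_pinned(monday, pin)
bob_same_bytes = import_pinned(monday_reordered, pin)

try:
    import_pinned(tuesday, pin)
    bob_refused = False
except DigestMismatch:
    bob_refused = True

bob_accepted = import_pinned(tuesday, pin, accept_new=True)
```

Alice and Bob end up with the same recorded digest whenever the content is the same, and Bob only gets a different one by opting in.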
Example 6 — An AI agent reasoning over Crosswalker imports
Without Layer A or v0.2: agent reads notes one by one, infers structure from frontmatter heuristics, frequently hallucinates schema details.
With Layer B (v0.2 primitive-grounded): agent loads the recipe and instantly knows the shape of every generated note (closed slot vocabulary; declared frontmatter keys; declared body sections; declared wikilink roles). No hallucination space.
With Layer A (sealed manifests): agent loads the vault’s crosswalker.yaml plus manifest files (a few KB total) and knows the shape of the entire vault without reading any framework body. Progressive disclosure is built into the architecture. The agent can run “find all controls where evidence is older than 90 days” by reading materialized note frontmatter, with no need to load body text. If it needs a body, it reads exactly one note.
This addresses one of the user’s stated audience requirements: AI agents need to reason over Crosswalker without hallucinating. The closed slot vocabulary plus sealed manifests give them ground truth.
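The “older than 90 days” query mentioned above might look like the following when it runs over frontmatter alone. The index structure and the `evidence_date` key are assumptions; the log does not define how evidence dates are stored.

```python
from datetime import date, timedelta

# Hypothetical frontmatter index: path -> frontmatter only; note bodies are never loaded.
FRONTMATTER = {
    "Controls/AC-1.md": {"id": "AC-1", "evidence_date": "2025-01-10"},
    "Controls/AC-2.md": {"id": "AC-2", "evidence_date": "2025-05-20"},
    "Controls/AU-1.md": {"id": "AU-1"},  # no evidence recorded yet
}


def stale_controls(index: dict, today: date, max_age_days: int = 90) -> list[str]:
    """Return ids of controls whose recorded evidence is older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for frontmatter in index.values():
        recorded = frontmatter.get("evidence_date")
        if recorded is not None and date.fromisoformat(recorded) < cutoff:
            stale.append(frontmatter["id"])
    return sorted(stale)


result = stale_controls(FRONTMATTER, today=date(2025, 6, 1))
```

Controls with no evidence date are skipped here rather than reported; whether a missing date should count as stale is a policy choice this sketch leaves open.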
Example 7 — External system pulls data INTO Crosswalker via a protocol
The user’s “other connections” insight made concrete. An external compliance-data scraper (running outside Obsidian, perhaps as a daemon in a CI pipeline or an MCP server) wants to push data into a Crosswalker vault.
Without a protocol surface: external systems can only write files to the vault directory and hope Obsidian picks them up. No typing, no validation, no provenance, no schema enforcement.
With Layer A as a protocol: the external system speaks Crosswalker’s import protocol:
The recipe is the same recipe a human author would write. The external system is just another source — bound by the same boundary semantics, the same transformation algebra, the same Tier-1 contract. This is what “platform architecture, not plugin monolith” means for the import side specifically.
What this scenario reveals: Layer A is the protocol and Layer B is the language. Together they make Crosswalker’s import side a typed surface that any client (human author, MCP server, scraper, agent extractor) can speak. Without both layers, the import side is Obsidian-internal logic with all the brittleness that implies.
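One way to picture a typed import surface is a request check that every client goes through, human or machine. The request envelope below (`source`, `recipe`, `sinks`) is entirely hypothetical, since no wire format is defined in this log; only the four sink names come from the Run C column of the §4 table.

```python
# Closed sink vocabulary, taken from the Run C column of the comparison table.
ALLOWED_SINKS = {"path", "frontmatter", "body", "wikilink"}
REQUIRED_SOURCE_KEYS = {"uri", "digest"}


def validate_request(request: dict) -> list[str]:
    """Return a list of problems; an empty list means the request is accepted."""
    problems = []
    source = request.get("source", {})
    missing = REQUIRED_SOURCE_KEYS - set(source)
    if missing:
        problems.append(f"source missing: {sorted(missing)}")
    elif not source["digest"].startswith("sha256:"):
        problems.append("source.digest must be a sha256 pin")
    sinks = request.get("recipe", {}).get("sinks", {})
    unknown = set(sinks) - ALLOWED_SINKS
    if unknown:
        problems.append(f"unknown sinks: {sorted(unknown)}")
    if "path" not in sinks:
        problems.append("recipe must declare a path sink")
    return problems


good = {
    "source": {"uri": "mcp://compliance-research-server/scf/2025-Q4",
               "digest": "sha256:abc"},
    "recipe": {"sinks": {"path": "SCF/{id}.md", "frontmatter": {"id": "{id}"}}},
}
bad = {
    "source": {"uri": "file:///tmp/scrape.json"},
    "recipe": {"sinks": {"path": "SCF/{id}.md", "raw_file": "dump.bin"}},
}

good_problems = validate_request(good)
bad_problems = validate_request(bad)
```

The same check applies whether the request comes from a human-authored recipe or a scraper, so an external client cannot write outside the closed vocabulary.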
§7 Composability with existing first-principles representations
Each deliverable confirmed strict orthogonality with Crosswalker’s existing primitives. Consolidated:
| Existing primitive | Relationship to Ch 20 import primitive |
|---|---|
| STRM (5 set-theory predicates) | STRM operates on edges between Tier-1 entities. The import primitive produces those entities and the links.<role> edges that STRM may then label. Strictly orthogonal. |
| SSSOM (canonical row-schema envelope) | SSSOM is one possible output schema the import primitive emits into. A NoteMap whose path sink is *.sssom.tsv produces SSSOM rows. The 22 SSSOM chain rules continue to live in Nemo (downstream). |
| Junction notes (13-field schema, Ch 07) | Junction notes are generated by the import primitive when a Map binds into the canonical 13-field shape. The schema lives in the slot vocabulary; the production lives in the import primitive. |
| Ontology diff primitives (9 atomic graph-edit ops) | Diff consumes two Tier-1 vault states. The import primitive produces those states. When a re-import runs, diff compares old vs new Tier-1 trees; the 9-atom edit script is the change set. |
| Nemo Datalog (SSSOM derivation) | Nemo runs over already-imported Tier-1 facts. Import produces facts; Nemo derives further facts. Same OxO2 architectural split (declarative ingest, then Datalog inference). |
| StewardshipProfile + meta-schema lifecycle | Recipes themselves are first-class versioned schemas; the meta-schema lifecycle commitment (“Crosswalker eats own dog food”) applies to the import recipe schema. |
The Ch 20 primitives sit strictly upstream of all existing Crosswalker primitives. They produce the substrate the existing primitives consume. No competition; no overlap.
§8 Wargaming questions to walk through
Per the user’s directive to wargame before deciding. Each question maps to one or more worked examples in §6. For each, evaluate what each architectural choice (Path A vs Path B; Layer A scope: full / minimal / deferred) does to the scenario.
W1: “What does it cost when an upstream framework changes?”
- Vendor importer: Re-import; silent overwrite or duplicates. No provenance. Manual reconciliation.
- v0.1 practical: Re-import; user runs git diff. Can work if the user is disciplined.
- v0.2 + Layer A: Typed delta against immutable substrate; Migration Crosswalk auto-proposed; ontology diff engine decomposes change into 9 atoms. (Example 4)
Wargame: how often will frameworks actually change in Crosswalker’s deployment context? NIST CSF: every few years (1.1 → 2.0). NIST 800-53: every 4–5 years (Rev 4 → Rev 5). MITRE ATT&CK: continuously updated. CIS Controls: every 1–2 years. SCF: monthly. The answer determines how load-bearing Layer A is. If users are typically on a single framework for years, Layer A’s cost may exceed its benefit. If they’re aggregating dozens of fast-moving frameworks (SCF case), Layer A becomes essential.
W2: “What happens when two recipes target the same vault?”
- Vendor importer: Whichever ran last wins. No conflict detection.
- v0.1 practical: Recipes have output paths; collisions result in overwrites or import errors.
- v0.2 + Layer A: Recipes produce Tier-1 deltas; the sheaf-theoretic gluing model from Run B catches overlapping assertions on shared sub-contexts as well-defined conflicts (pushout fails to be a sheaf). (Example 5 generalized)
Wargame: how often will vaults have overlapping import recipes? Very common in GRC: a vault with NIST 800-53 + ISO 27001 + SCF will have controls that are referenced from multiple recipes (the SCF recipe wants to write links.maps_to_nist on every NIST control). Without a conflict-detection model, this becomes painful at scale.
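A much simpler stand-in for the conflict check is sketched below: treat each recipe’s output as a set of writes keyed by (note path, slot), merge them, and flag any key where two recipes disagree. This is not Run B’s sheaf-theoretic gluing model, only a first-order approximation of the behaviour it is meant to guarantee; the recipe names, slot names, and values are invented.

```python
def merge_deltas(deltas: dict[str, dict]) -> tuple[dict, list]:
    """Merge per-recipe writes keyed by (note path, slot); report disagreements."""
    merged = {}
    owners = {}
    conflicts = []
    for recipe, writes in deltas.items():
        for key, value in writes.items():
            if key not in merged:
                merged[key] = value
                owners[key] = recipe
            elif merged[key] != value:
                # Two recipes assert different values for the same slot.
                conflicts.append((key, owners[key], recipe))
    return merged, conflicts


deltas = {
    "nist": {("NIST/AC-1.md", "label"): "Policy and Procedures"},
    "scf": {
        ("NIST/AC-1.md", "label"): "Policy and Procedures",      # agrees with nist
        ("NIST/AC-1.md", "links.maps_to_nist"): "[[SCF/IAC-01]]",  # new slot, no clash
    },
    "iso": {("NIST/AC-1.md", "label"): "Access control policy"},  # disagrees
}

merged, conflicts = merge_deltas(deltas)
```

Agreeing writes and writes to distinct slots merge silently; only the disagreeing `label` write surfaces, along with the names of both recipes involved.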
W3: “What happens when an agent is wrong about which version it imported?”
- Vendor importer: Agent reads notes; no version metadata; agent guesses; agent hallucinates.
- v0.1 practical: Agent can read _crosswalker.framework_version frontmatter; better, but still requires the agent to know the schema.
- v0.2 + Layer A: Agent reads sealed manifest; typed knowledge of which version, which digest, which provenance chain. (Example 6)
Wargame: how often will agents be the consumer of the import? Increasingly: the user’s stated audience includes “AI agents that increasingly assist [GRC teams].” Agents need typed context, not heuristic frontmatter. This is where Layer A’s progressive-disclosure property pays dividends.
W4: “What does it look like when an external system wants to push data INTO Crosswalker?”
- Vendor importer: External system writes raw markdown files to the vault directory; no validation; no provenance; no protocol.
- v0.1 practical: Same — there’s no protocol surface; external systems are reduced to filesystem writes.
- v0.2 + Layer A: External system speaks the import protocol; recipes are invariant; boundary semantics hold. (Example 7 — the user’s “other connections” insight)
Wargame: does the user actually want external systems to push data into Crosswalker? Yes — that was the explicit framing. “The goal is to have a system where you could define transforms or maybe in the future idk — we can architect other connections (doesn’t need to be logic that lives in Obsidian) to pull data in.” This is a v1.0+ feature, but it’s a load-bearing architectural concern now because retrofitting a protocol surface onto a system that wasn’t designed for one is dramatically more expensive than designing it in from the start.
W5: “What’s the failure mode of each architectural choice at 1×, 10×, 100× current scale?”
- At 1× (one framework, single user): vendor importer suffices; v0.1 practical and v0.2 are both overkill.
- At 10× (handful of frameworks, small team): v0.1 practical handles it; vendor importer breaks on collaboration; v0.2 is well-scoped.
- At 100× (dozens of frameworks, large org, multi-tenant, agents-in-the-loop): vendor importer is unusable; v0.1 practical strains; v0.2 + Layer A is the only architecture that durably scales.
Wargame: which scale does Crosswalker target? The user’s stated audience (GRC consultants, internal auditors, security architects, AI agents) plus the “build something DURABLE” directive points firmly at 100×. That argues for v0.2 + Layer A as the v0.1 build target — but the cost is real.
§9 Open architectural questions (not decisions yet)
What’s deferred for user discussion. Each is consequential; none is being decided in this commit.
| Question | Options | Considerations |
|---|---|---|
| v0.1 path | Path A: ship v0.1 practical, transpile to v0.2 later. Path B: pivot v0.1 to primitive-grounded from the start. | Affects build velocity vs avoid-throwaway-vocabulary. Both deliverables explicitly recommend Path A; user has not decided. |
| Layer A scope | (a) Full ref/resolve/bind/seal + content-addressing + Manifest sealing + sandboxed effects. (b) Minimal subset (digest + provenance recording only). (c) Defer entirely to v1.0+. | Run B makes a strong case but adds substantial cognitive and implementation cost. The protocol-surface insight tilts toward (a); pragmatism tilts toward (b) or (c). |
| Surface DSL flavor | YARRRML-shaped (Run A + C convergence); Dhall-typed (Run B compatible); hybrid. | YARRRML has community + tooling; Dhall has stronger typing; hybrid is the most flexibility but adds complexity. |
| Manifest language choice | Dhall (typed, total, importable, hash-pinnable; small JS/TS impl exists); JSON Schema (huge ecosystem, weaker types, no native imports); CUE (between, growing JS support). | Run B leans Dhall but acknowledges tooling cost. JSON Schema is pragmatic. CUE is the conservative middle. |
| Markdown+frontmatter canonicalization standard | RFC 8785 covers JSON; nothing equivalent for Markdown. Crosswalker may need to define one. | Small but real spec work. Required for any content-addressing of Markdown notes. |
| “Concept” / “node” / “control” rename for the generated note | Likely moot because the primitive set’s domain-neutral sink vocabulary handles this implicitly. | Worth a final confirmation when the schema spec is rewritten. |
| Spin up Ch 21 specifically for “external connections / protocol surface” research? | Given the user’s directional input on protocol surface, a dedicated brief is reasonable. | Defer pending user signal. |
§10 Updated schema-spec implications (listed; not applied)
What would change in the v0.1 schema spec §4 (ImportRecipe) under each path. This is preparation, not application.
If Path A (ship v0.1 practical, migrate to v0.2 later):
- v0.1 schema stays as-is.
- A new reference/spec/import/T1TMA-1.0.md document captures the v0.2 target spec.
- A v0.1 → v0.2 transpiler is tracked as a planned tooling deliverable.
If Path B (v0.1 primitive-grounded from the start):
- Replace v0.1 schema spec §4 (ImportRecipe) with the T1TMA / 5+4 primitive set.
- v0.1 = Source / Term / Map / Join / Function over path / frontmatter / body / wikilink sinks.
- The 8 column-roles + 24 transform types are eliminated; the 24 transforms become a stdlib of named JSONata + GREL functions.
- Bundle target adjusts from “under 500 KB plugin core” to “~480 KB recipe runtime + JSONata lazy-loaded ~140 KB peer dep + ~80 KB FNML stdlib” ≈ ~700 KB plugin core.
If Layer A also adopted at v0.1:
- Add a new schema spec section §0 covering boundary semantics (crosswalker.yaml vault root manifest, imports/*.import.md import declarations, content-digest provenance frontmatter on every materialized note).
- Bundle adds: canonicalization library (~100 KB), digest computation (~50 KB), manifest validator (Dhall ~200 KB or JSON Schema ~80 KB).
§11 Related
Inputs:
- Ch 20a deliverable: T1TMA
- Ch 20b deliverable: Boundary semantics
- Ch 20c deliverable: 5+4 primitive set
- Ch 20 dialog: ETL + fundamental forms
- Original Ch 20 brief (archived)
Crosswalker primitives the import primitive composes with:
- STRM registry entry
- SSSOM registry entry
- Junction notes (Ch 07 resolution)
- Ontology diff primitives (atomic operations research)
- v0.1 schema spec — current ImportRecipe section
Related research deliverables cited in the Ch 20 work:
- Ch 11 deliverables (engine survey) — Nemo as the Datalog derivation tier downstream of import
- Ch 12 deliverables (Datalog vs SQL for SSSOM chain rules) — confirms the architectural split between import and derivation tiers
- Ch 14 deliverable (missed engines) — Comunica federation as a downstream consumer
Project framing:
- What makes Crosswalker unique — particularly the “Platform architecture, not plugin monolith” pillar that the protocol-surface insight activates
- v0.1 stack-pivot log — the “DURABLE / built from the ground up” directive comes from here
- Roadmap: Foundation phase — where the import primitive eventually lands
User’s prior tools (cited as practical-precedent touchstones):