
Challenge 35: Graph→tabular bridging (RERUN of Ch 10 under ontology-web framing)

Created May 8, 2026 · Updated Jun 1, 2026

Predecessor + what’s different now

Original Ch 10 (resolved 2026-05-02 via cascade Ch 11 + Ch 14 + Ch 16 → v0.1 stack pivot):

Asked: how does Crosswalker bridge graph data (RDF/SSSOM mappings) into tabular views (Bases / DataView / .base)?
Verdict: Hybrid 3-tier — materialized folders (T1) + DuckDB-WASM (T2) + Apache AGE (T3)
Then revised by Ch 14 → Tier 2-Lite (sqlite-wasm + sqlite-vec + simple-graph) for mobile/low-end
Then revised by Ch 16 → AGE demoted; Fuseki/oxigraph-server promoted

What’s different now (2026-05-08 rerun):

Framing shifted: the original was GRC/SSSOM-specific; the rerun asks about arbitrary ontology webs (BioPortal, OBO Foundry, OLS, UMLS, NIST OLIR, OxO2)
Scale shifted: the original assumed a bounded vault size (~5K controls × 8 frameworks); the rerun assumes ontology-web scale (BioPortal: 700+ ontologies; UMLS: 3.5M concepts)
Substrate shifted: Crosswalker pivoted to plain sqlite-wasm + recursive CTE (per the WASM-A pivot), so the verdict needs re-examination
User intuition: the user explicitly framed graph→tabular as the central question (“we’re pulling tabular views from graph/web-connected networks of ontologies/hierarchies”); the original Ch 10 didn’t fully engage with this
Sister challenges: Ch 33 (landscape audit) + Ch 34 (streaming) bracket this challenge with broader context that was unavailable on 2026-05-02

Why this exists (under the new framing)

Ontology webs are graphs. Bases queries are tabular. The graph→tabular bridge is THE load-bearing operation of Crosswalker’s query engine: every recipe ultimately produces a flat tabular result that Bases (or a custom view) renders. The original Ch 10 answered this for SSSOM/GRC mappings at small-to-medium scale. The rerun asks: does the same architecture survive at ontology-web scale?

What we already have

Asset	What it gives us
Ch 10 archived brief	Original framing + 3-tier verdict
Ch 10 deliverable	Original deliverable (verbatim historical record)
Ch 14 + Ch 16 syntheses	Subsequent revisions of the 3-tier stack
v0.1.5 Tier 2 sidecar shipped	What’s actually running today (sqlite-wasm + recursive CTE)
`concepts/query-primitives`	The candidate primitive set; pivot is THE graph→tabular operation

What to investigate

1. Confirm or revise the 3-tier architecture under ontology-web framing

Original Ch 10 verdict: Hybrid 3-tier. Re-examine each tier:

Tier 1 (materialized folders): does this scale to BioPortal size (700 ontologies × thousands of concepts each)?
Tier 2 (sqlite-wasm + recursive CTE; was DuckDB-WASM): does this survive closure queries at ontology-web scale?
Tier 3 (Fuseki/oxigraph-server; was AGE): is the SPARQL/RDF stack the right Tier 3 for ontology webs (vs. Cozo, Stardog, Datomic)?

For each tier, argue REAFFIRMED / REVISED / DEFERRED.
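As a reference point for the Tier 2 question, the closure query under examination can be sketched as a recursive CTE. This is a minimal sketch, not Crosswalker's shipped code: it assumes a hypothetical `edge(child, parent)` table and uses Python's stdlib `sqlite3` as a stand-in for sqlite-wasm. The table, column names, and the `closure_from` helper are all illustrative.

```python
import sqlite3

def closure_from(conn, start, max_depth=32):
    """Return (concept, depth) rows for every ancestor reachable from `start`."""
    sql = """
    WITH RECURSIVE reach(concept, depth) AS (
        SELECT parent, 1 FROM edge WHERE child = :start
        UNION
        SELECT e.parent, r.depth + 1
        FROM edge AS e JOIN reach AS r ON e.child = r.concept
        WHERE r.depth < :max_depth   -- bounds the walk even if the graph has cycles
    )
    SELECT concept, MIN(depth) AS depth FROM reach
    GROUP BY concept ORDER BY depth, concept
    """
    return conn.execute(sql, {"start": start, "max_depth": max_depth}).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (child TEXT, parent TEXT)")
conn.execute("CREATE INDEX edge_child ON edge (child)")
conn.executemany("INSERT INTO edge VALUES (?, ?)", [
    ("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),  # hypothetical hierarchy
])
rows = closure_from(conn, "a")
print(rows)
```

The result is already tabular (one row per reachable concept), which is the property the Tier 2 verdict depends on. What the sketch does not answer is how the same query behaves with millions of edges; that is the part to measure.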

2. Graph→tabular projection patterns

For arbitrary ontology webs, what’s the canonical pattern for projecting graph data into tabular form?

SPARQL SELECT (RDF triple → tabular): the W3C standard
SPARQL CONSTRUCT (RDF triple → RDF triple, then tabular): reshape graph before projection
Cypher RETURN clauses (Neo4j; property graph → tabular)
DuckPGQ GRAPH_TABLE (… MATCH … COLUMNS …) (DuckDB graph extension; SQL/PGQ)
Datalog rule heads (Cozo, Datomic; logic → tabular)
Property graph traversal API (TinkerPop, Gremlin; chained API → result)

For each: how naturally does it express Crosswalker’s pivot operation? How developer-friendly is it?
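The patterns listed above share one core step: bind variables against a graph pattern, then project the bindings as columns. A minimal sketch of that step, assuming a hypothetical `triple(s, p, o)` table in stdlib `sqlite3` and made-up `ex:` identifiers, expresses a two-triple SPARQL-style pattern as a self-join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triple (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", [
    # Hypothetical data; the ex: identifiers are illustrative only.
    ("ex:C1", "skos:prefLabel", "Access control"),
    ("ex:C1", "skos:exactMatch", "ex:T9"),
    ("ex:C2", "skos:prefLabel", "Audit logging"),
    ("ex:C2", "skos:closeMatch", "ex:T4"),
])

# Roughly equivalent to the SPARQL query:
#   SELECT ?concept ?label ?target
#   WHERE { ?concept skos:prefLabel ?label ; skos:exactMatch ?target }
rows = conn.execute("""
    SELECT l.s AS concept, l.o AS label, m.o AS target
    FROM triple AS l
    JOIN triple AS m ON m.s = l.s
    WHERE l.p = 'skos:prefLabel' AND m.p = 'skos:exactMatch'
    ORDER BY concept
""").fetchall()
print(rows)
```

Each additional triple in the pattern costs one more self-join in SQL, whereas SPARQL, Cypher, and Datalog state the pattern directly. That difference is one concrete dimension for the survey.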

3. Pivot at ontology-web scale

The user’s framing was specifically about pivot. At BioPortal scale, 700 ontologies yield 700 × 700 = 490,000 ordered ontology pairs (244,650 distinct unordered pairs), so a pivot over the full cross-product is infeasible.

What’s the pragmatic approach? (top-N pivots? user-selected pairs? pre-computed indexes?)
Does the substrate need to support sparse pivot (most cells empty)?
Is materialization (per Ch 28 Pattern B) required at this scale?
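To make the questions above concrete, the sketch below first checks the pair arithmetic and then shows one possible shape for a sparse, user-selected pivot. It rests on stated assumptions: the `(source_id, target_ontology, target_id)` row layout, the `sparse_pivot` helper, and all identifiers are hypothetical and not taken from Crosswalker.

```python
from collections import defaultdict
from math import comb

n = 700
ordered_pairs = n * n        # counts (A, B) and (B, A) separately, plus self-pairs
distinct_pairs = comb(n, 2)  # unordered pairs of two different ontologies

def sparse_pivot(mappings, targets):
    """Pivot long-format mapping rows into one dict of columns per source concept.

    Only the requested target ontologies become columns, and empty cells are
    simply absent, so storage grows with the mappings, not the cross-product.
    """
    wanted = set(targets)
    table = defaultdict(lambda: defaultdict(list))
    for source_id, target_ontology, target_id in mappings:
        if target_ontology in wanted:
            table[source_id][target_ontology].append(target_id)
    return {src: dict(cols) for src, cols in table.items()}

mappings = [  # hypothetical mapping rows
    ("GO:1", "MONDO", "MONDO:7"),
    ("GO:1", "CHEBI", "CHEBI:3"),
    ("GO:2", "MONDO", "MONDO:9"),
    ("GO:2", "HP", "HP:5"),
]
wide = sparse_pivot(mappings, targets=["MONDO", "CHEBI"])
print(ordered_pairs, distinct_pairs, wide)
```

Restricting the pivot to user-selected targets keeps the column count small, and leaving empty cells out avoids allocating a dense matrix. Whether the substrate should do this natively or the view layer should do it is part of what this section asks.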

4. Cross-domain examples

Re-do the Ch 28a §4 query-routing matrix but for non-GRC ontology webs:

OBO Foundry: Gene Ontology terms → MONDO disease → ChEBI compound (3-hop closure across biomedical ontologies)
SKOS taxonomy: top-level subject heading → narrower → narrower (deep hierarchy traversal)
MITRE ATT&CK: technique → mitigation → control (threat → defense pivot)
NIST OLIR: cross-framework crosswalk catalog (massive crosswalk inventory)
Library science: LCSH ↔ MeSH ↔ Dewey (cross-vocabulary subject mapping)

For each: what query primitive composition? What graph→tabular projection? What scale ceiling?

5. Reconcile with v0.1.5 shipped reality

v0.1.5 P3 shipped crosswalkBetween and closureFromConcept against sqlite-wasm. These ARE Crosswalker’s graph→tabular bridge today. Audit:

Are they substrate-neutral (could swap to Cozo / DuckDB / Polars / Oxigraph without API change)?
Do they handle ontology-web scale OR are they bounded to small/medium scale?
What’s the migration path if Ch 33/35 surfaces the need to swap substrates?
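One way to frame the substrate-neutrality audit is to ask whether callers depend only on an interface like the one sketched below. The brief does not give the real signatures of `crosswalkBetween` and `closureFromConcept`, so the parameter lists, return shapes, `GraphTabularBridge`, and `InMemoryBridge` here are assumptions made for illustration.

```python
from typing import Protocol

Row = tuple[object, ...]

class GraphTabularBridge(Protocol):
    """Hypothetical substrate-neutral surface: callers see rows, never SQL or SPARQL."""
    def crosswalk_between(self, source: str, target: str) -> list[Row]: ...
    def closure_from_concept(self, concept: str, max_depth: int = 32) -> list[Row]: ...

class InMemoryBridge:
    """Toy substrate over plain Python structures, used only to show swappability."""
    def __init__(self, mappings, parents):
        self.mappings = mappings  # (source_fw, source_id, target_fw, target_id) tuples
        self.parents = parents    # dict: concept -> list of parent concepts

    def crosswalk_between(self, source, target):
        return sorted((s_id, t_id) for s_fw, s_id, t_fw, t_id in self.mappings
                      if s_fw == source and t_fw == target)

    def closure_from_concept(self, concept, max_depth=32):
        seen, frontier, depth = {}, [concept], 0
        while frontier and depth < max_depth:  # breadth-first walk, level by level
            depth += 1
            next_frontier = []
            for node in frontier:
                for parent in self.parents.get(node, []):
                    if parent not in seen:
                        seen[parent] = depth
                        next_frontier.append(parent)
            frontier = next_frontier
        return sorted(seen.items(), key=lambda row: (row[1], row[0]))

def render(bridge: GraphTabularBridge, concept: str) -> list[Row]:
    # Depends only on the Protocol, so any conforming substrate can be passed in.
    return bridge.closure_from_concept(concept)

bridge = InMemoryBridge(
    mappings=[("FW-A", "A-1", "FW-B", "B-7")],  # hypothetical frameworks and IDs
    parents={"a": ["b"], "b": ["c"]},
)
closure_rows = render(bridge, "a")
crosswalk_rows = bridge.crosswalk_between("FW-A", "FW-B")
print(closure_rows, crosswalk_rows)
```

If the shipped helpers leak substrate details, such as SQL fragments in their arguments or SQLite-specific row objects in their results, they would not satisfy an interface of this kind. The audit can check for that directly.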

6. Materialization as a graph→tabular escape hatch

Ch 28 settled-item commitments place materialization in v0.1.8 (deferred). Under ontology-web framing where pivot at full cross-product is infeasible, materialization may be the only path. Argue:

Is materialization the pragmatic graph→tabular bridge for ontology-web scale?
Should materialization move to v0.1.6 if Ch 35 surfaces this need?
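For the argument above, it helps to pin down what materialization could mean on the current substrate: compute the closure once, store it as a plain table, and let tabular views read it without recursion. The sketch below assumes a hypothetical `edge(child, parent)` table and stdlib `sqlite3` as a stand-in for sqlite-wasm. It is not a rendering of Ch 28 Pattern B, which this brief does not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (child TEXT, parent TEXT)")
conn.executemany("INSERT INTO edge VALUES (?, ?)", [
    ("a", "b"), ("b", "c"), ("x", "c"),  # hypothetical hierarchy
])

# Precompute the full ancestor closure once; rebuild it when the edge table changes.
# UNION (not UNION ALL) removes duplicate rows, so the recursion ends even on cycles.
conn.execute("""
    CREATE TABLE closure AS
    WITH RECURSIVE reach(descendant, ancestor) AS (
        SELECT child, parent FROM edge
        UNION
        SELECT r.descendant, e.parent
        FROM reach AS r JOIN edge AS e ON e.child = r.ancestor
    )
    SELECT descendant, ancestor FROM reach
""")
conn.execute("CREATE INDEX closure_desc ON closure (descendant)")

# Reads are now flat lookups with no recursion at query time.
ancestors_of_a = conn.execute(
    "SELECT ancestor FROM closure WHERE descendant = 'a' ORDER BY ancestor"
).fetchall()
total_rows = conn.execute("SELECT COUNT(*) FROM closure").fetchone()[0]
print(ancestors_of_a, total_rows)
```

The trade-off is storage and staleness. The closure table can be much larger than the edge table for deep hierarchies, and it must be rebuilt or incrementally maintained whenever edges change. Both costs belong in the timing argument.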

Anti-patterns to reject upfront

The deliverable must NOT recommend:

Migrating off sqlite-wasm without a concrete trigger: Ch 33 might surface one; Ch 35 alone shouldn’t, unless the evidence is overwhelming
Reintroducing Apache AGE: Ch 16 explicitly demoted it
Reintroducing libSQL / Turso / Limbo: Ch 24 rejected them
Adding a speculative tier (T4? T5?): 3 tiers was the design commitment
Forgetting that Crosswalker is a vault-native plugin: server-side substrates are acceptable only as Tier 3 escape hatches
Cross-vault federation: explicitly out of scope per the Ch 27/28 anti-patterns

Success criteria for the deliverable

The deliverable must produce:

Per-tier verdict: for each of T1/T2/T3, REAFFIRMED / REVISED / DEFERRED, with rationale
Graph→tabular projection pattern survey: 6+ patterns × 4+ dimensions
Pivot scale analysis: what is feasible at small, medium, large, and ontology-web scale?
Non-GRC query catalog: 5+ representative ontology-web queries × decomposition × scale
Substrate-neutrality audit: are the current Tier 2 helpers swappable?
Materialization timing: should it move to v0.1.6, stay at v0.1.8, or change again?
Updated migration triggers: modifications to Ch 24’s 5 triggers

Anchored references

Predecessor:

Ch 10 archived brief
Ch 10 deliverable

Project context:

concepts/query-primitives
concepts/ontology-web-querying
WASM-A pivot synthesis — current substrate state
Ch 24 synthesis — substrate + migration triggers
In-progress synthesis log §9

Sister challenges:

Ch 33 (landscape audit)
Ch 34 (streaming)

External:

BioPortal — ontology-web scale reference
UMLS — 3.5M concepts
OBO Foundry — biomedical ontology web
NIST OLIR — crosswalk catalog at scale

Hand-off

Write the deliverable to docs/.../zz-research/YYYY-MM-DD-challenge-35-deliverable-a-<slug>.md.

After the deliverable lands:

Flip the Ch 35 row in synthesis log §9 from ⏳ to ✅
Update the Ch 10 archived brief with a :::note callout pointing to this rerun and the new deliverable
If the verdict revises the original Ch 10 verdict, document that explicitly in the synthesis log
Archive this brief