Challenge 35: Graph→tabular bridging (RERUN of Ch 10 under ontology-web framing)
Predecessor + what’s different now
Original Ch 10 (resolved 2026-05-02 via cascade Ch 11 + Ch 14 + Ch 16 → v0.1 stack pivot):
- Asked: how does Crosswalker bridge graph data (RDF/SSSOM mappings) into tabular views (Bases / DataView / .base)?
- Verdict: Hybrid 3-tier — materialized folders (T1) + DuckDB-WASM (T2) + Apache AGE (T3)
- Then revised by Ch 14 → Tier 2-Lite (sqlite-wasm + sqlite-vec + simple-graph) for mobile/low-end
- Then revised by Ch 16 → AGE demoted; Fuseki/oxigraph-server promoted
What’s different now (2026-05-08 rerun):
- Framing shifted — original was GRC/SSSOM-specific; rerun asks for arbitrary ontology webs (BioPortal, OBO Foundry, OLS, UMLS, NIST OLIR, OxO2)
- Scale shifted — original assumed bounded vault size (~5K controls × 8 frameworks); rerun assumes ontology-web scale (BioPortal: 700+ ontologies; UMLS: 3.5M concepts)
- Substrate shifted — Crosswalker pivoted to plain sqlite-wasm + recursive CTE (per WASM-A pivot); the verdict needs re-examination
- User intuition — the user explicitly framed graph→tabular as the central question: “we’re pulling tabular views from graph/web-connected networks of ontologies/hierarchies”; original Ch 10 didn’t engage this fully
- Sister challenges — Ch 33 (landscape audit) + Ch 34 (streaming) bracket this challenge with broader context that was unavailable on 2026-05-02
Why this exists (under the new framing)
Ontology webs are graphs. Bases queries are tabular. The graph→tabular bridge is THE load-bearing operation of Crosswalker’s query engine — every recipe ultimately produces a flat tabular result that Bases (or a custom view) renders. Original Ch 10 answered this for SSSOM/GRC mappings at small-medium scale. The rerun asks: does the same architecture survive at ontology-web scale?
What we already have
| Asset | What it gives us |
|---|---|
| Ch 10 archived brief | Original framing + 3-tier verdict |
| Ch 10 deliverable | Original deliverable (verbatim historical record) |
| Ch 14 + Ch 16 syntheses | Subsequent revisions of the 3-tier stack |
| v0.1.5 Tier 2 sidecar shipped | What’s actually running today (sqlite-wasm + recursive CTE) |
| concepts/query-primitives | The candidate primitive set; pivot is THE graph→tabular operation |
What to investigate
1. Confirm or revise the 3-tier architecture under ontology-web framing
Original Ch 10 verdict: Hybrid 3-tier. Re-examine each tier:
- Tier 1 (materialized folders): does this scale to BioPortal-scale (700 ontologies × thousands of concepts each)?
- Tier 2 (sqlite-wasm + recursive CTE; was DuckDB-WASM): does this survive ontology-web scale closure queries?
- Tier 3 (Fuseki/oxigraph-server; was AGE): is the SPARQL/RDF stack the right Tier 3 for ontology webs (vs. Cozo, Stardog, Datomic)?
For each tier, argue REAFFIRMED / REVISED / DEFERRED.
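To make the Tier 2 question concrete, here is a minimal sketch of a closure query as a recursive CTE. It uses Python's built-in `sqlite3` as a stand-in for sqlite-wasm. The `edge` table, its columns, the concept IDs, and `closure_rows` are all hypothetical names for illustration, not Crosswalker's shipped schema.

```python
import sqlite3

# Hypothetical schema: one row per directed is-a edge (child -> parent).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edge (child TEXT NOT NULL, parent TEXT NOT NULL);
    CREATE INDEX edge_child ON edge (child);
    INSERT INTO edge VALUES
        ('c3', 'c2'),
        ('c2', 'c1'),
        ('c4', 'c2'),
        ('c1', 'c0');
""")

# Walk upward from :start. Because depth is part of each row, UNION alone
# does not stop a cycle, so the :max_depth bound guarantees termination.
CLOSURE_SQL = """
WITH RECURSIVE ancestors(node, depth) AS (
    SELECT parent, 1 FROM edge WHERE child = :start
    UNION
    SELECT e.parent, a.depth + 1
    FROM edge AS e JOIN ancestors AS a ON e.child = a.node
    WHERE a.depth < :max_depth
)
SELECT node, MIN(depth) AS depth
FROM ancestors
GROUP BY node
ORDER BY depth, node;
"""


def closure_rows(start, max_depth=32):
    """Return (ancestor, shortest_depth) rows: the graph closure as a flat table."""
    params = {"start": start, "max_depth": max_depth}
    return conn.execute(CLOSURE_SQL, params).fetchall()


print(closure_rows("c3"))
```

The result is already tabular, which is why this shape suits Bases. Whether it holds up at ontology-web scale depends on edge count and traversal depth, and this sketch does not measure either.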
2. Graph→tabular projection patterns
For arbitrary ontology webs, what’s the canonical pattern for projecting graph data into tabular form?
- SPARQL SELECT (RDF triple → tabular): the W3C standard
- SPARQL CONSTRUCT (RDF triple → RDF triple, then tabular): reshape graph before projection
- Cypher RETURN clauses (Neo4j; property graph → tabular)
- DuckPGQ MATCH…RETURN (DuckDB graph extension; SQL/PGQ)
- Datalog rule heads (Cozo, Datomic; logic → tabular)
- Property graph traversal API (TinkerPop, Gremlin; chained API → result)
For each: how naturally does it express Crosswalker’s pivot operation? What’s the developer-friendliness?
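As a shared reference point for comparing these patterns, the sketch below shows what a SELECT-style projection does conceptually: match triple patterns against a set of triples, then emit the chosen variable bindings as rows. It is a toy in plain Python, not an RDF engine. The `ex:` identifiers, `match_pattern`, and `select` are hypothetical names for illustration.

```python
Triple = tuple[str, str, str]

# Hypothetical data: two mapping triples plus labels for the subjects.
TRIPLES: list[Triple] = [
    ("ex:a1", "skos:exactMatch", "ex:b1"),
    ("ex:a2", "skos:closeMatch", "ex:b2"),
    ("ex:a1", "rdfs:label", "Access control"),
    ("ex:a2", "rdfs:label", "Audit logging"),
]


def is_var(term: str) -> bool:
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def match_pattern(triples, patterns):
    """Return one binding dict per way all patterns match the triples."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                candidate = dict(binding)
                ok = True
                for p_term, t_term in zip(pattern, triple):
                    if is_var(p_term):
                        # Bind the variable, or check it against its earlier value.
                        if candidate.setdefault(p_term, t_term) != t_term:
                            ok = False
                            break
                    elif p_term != t_term:
                        ok = False
                        break
                if ok:
                    next_bindings.append(candidate)
        bindings = next_bindings
    return bindings


def select(triples, variables, patterns):
    """Project the matched bindings onto the requested variables (the table columns)."""
    return [tuple(b[v] for v in variables) for b in match_pattern(triples, patterns)]


rows = select(
    TRIPLES,
    ["?label", "?target"],
    [("?s", "skos:exactMatch", "?target"), ("?s", "rdfs:label", "?label")],
)
print(rows)
```

The other patterns in the list differ mainly in how the match step is expressed (path syntax, rule bodies, chained traversal calls). The final step, choosing which bindings become columns, is common to all of them.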
3. Pivot at ontology-web scale
The user’s framing was specifically about pivot. At BioPortal scale (700 ontologies; 700 × 700 = 490,000 ordered ontology pairs), pivoting over the full cross-product becomes infeasible.
- What’s the pragmatic approach? (top-N pivots? user-selected pairs? pre-computed indexes?)
- Does the substrate need to support sparse pivot (most cells empty)?
- Is materialization (per Ch 28 Pattern B) required at this scale?
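One way to picture the "user-selected pairs" and "sparse pivot" options is sketched below. It assumes mappings are stored in long form, one row per mapping, and pivots only the requested target ontologies, so empty cells are never stored. The tuple layout, the IDs, and the function names are hypothetical.

```python
from collections import defaultdict


def pair_counts(n):
    """Count ontology pairs for n ontologies, ordered and unordered."""
    return {"ordered_with_self": n * n, "unordered_distinct": n * (n - 1) // 2}


# Hypothetical long-form mappings:
# (subject_id, subject_ontology, object_id, object_ontology)
MAPPINGS = [
    ("A:1", "A", "B:9", "B"),
    ("A:1", "A", "C:4", "C"),
    ("A:2", "A", "B:7", "B"),
    ("A:1", "A", "B:8", "B"),
    ("D:1", "D", "B:9", "B"),
]


def sparse_pivot(mappings, source, targets):
    """Pivot one source ontology against the selected target ontologies only.

    Rows exist only for subjects with at least one mapping. A missing cell
    is rendered as an empty list at output time and never stored.
    """
    cells = defaultdict(lambda: defaultdict(list))
    for s_id, s_onto, o_id, o_onto in mappings:
        if s_onto == source and o_onto in targets:
            cells[s_id][o_onto].append(o_id)
    return [
        {"subject": s_id, **{t: sorted(row.get(t, [])) for t in targets}}
        for s_id, row in sorted(cells.items())
    ]


print(pair_counts(700))
print(sparse_pivot(MAPPINGS, "A", ["B", "C"]))
```

Here the cost follows the number of mappings that touch the selected ontologies, not the 490,000-pair cross-product.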
4. Cross-domain examples
Re-do the Ch 28a §4 query-routing matrix but for non-GRC ontology webs:
- OBO Foundry: gene-ontology terms → MONDO disease → ChEBI compound (3-hop closure across biomedical ontologies)
- SKOS taxonomy: top-level subject heading → narrower → narrower (deep hierarchy traversal)
- MITRE ATT&CK: technique → mitigation → control (threat → defense pivot)
- NIST OLIR: cross-framework crosswalk catalog (massive crosswalk inventory)
- Library science: LCSH ↔ MeSH ↔ Dewey (cross-vocabulary subject mapping)
For each: what query primitive composition? What graph→tabular projection? What scale ceiling?
5. Reconcile with v0.1.5 shipped reality
v0.1.5 P3 shipped crosswalkBetween and closureFromConcept against sqlite-wasm. These ARE Crosswalker’s graph→tabular bridge today. Audit:
- Are they substrate-neutral (could swap to Cozo / DuckDB / Polars / Oxigraph without API change)?
- Do they handle ontology-web scale OR are they bounded to small/medium scale?
- What’s the migration path if Ch 33/35 surfaces the need to swap substrates?
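The substrate-neutrality question can be framed as a contract test: callers depend only on an interface, and two backends must return identical rows. The sketch below is in Python for illustration. The brief does not show the real signatures of `crosswalkBetween` and `closureFromConcept`, so the method names, parameters, return shapes, and schema here are all assumptions.

```python
import sqlite3
from typing import Protocol


class GraphTabularBridge(Protocol):
    """Hypothetical contract; the shipped helpers' real signatures may differ."""

    def crosswalk_between(self, source: str, target: str) -> list[tuple[str, str]]: ...
    def closure_from_concept(self, concept: str) -> list[str]: ...


# Hypothetical fixtures shared by both backends.
MAPPINGS = [("a:1", "a", "b:1", "b"), ("a:2", "a", "b:2", "b"), ("a:1", "a", "c:1", "c")]
IS_A = [("a:3", "a:2"), ("a:2", "a:1")]


class SqliteBridge:
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE mapping (s TEXT, s_onto TEXT, o TEXT, o_onto TEXT)")
        self.conn.execute("CREATE TABLE is_a (child TEXT, parent TEXT)")
        self.conn.executemany("INSERT INTO mapping VALUES (?, ?, ?, ?)", MAPPINGS)
        self.conn.executemany("INSERT INTO is_a VALUES (?, ?)", IS_A)

    def crosswalk_between(self, source, target):
        sql = "SELECT s, o FROM mapping WHERE s_onto = ? AND o_onto = ? ORDER BY s, o"
        return self.conn.execute(sql, (source, target)).fetchall()

    def closure_from_concept(self, concept):
        # UNION over a single column deduplicates nodes, so cycles terminate.
        sql = """
            WITH RECURSIVE up(node) AS (
                SELECT parent FROM is_a WHERE child = ?
                UNION
                SELECT e.parent FROM is_a AS e JOIN up ON e.child = up.node
            )
            SELECT node FROM up ORDER BY node
        """
        return [row[0] for row in self.conn.execute(sql, (concept,))]


class InMemoryBridge:
    def crosswalk_between(self, source, target):
        return sorted((s, o) for s, so, o, oo in MAPPINGS if so == source and oo == target)

    def closure_from_concept(self, concept):
        seen, frontier = set(), [concept]
        while frontier:
            node = frontier.pop()
            for child, parent in IS_A:
                if child == node and parent not in seen:
                    seen.add(parent)
                    frontier.append(parent)
        return sorted(seen)


def report(bridge: GraphTabularBridge):
    """Caller code that touches only the contract, never the backend."""
    return {
        "crosswalk": bridge.crosswalk_between("a", "b"),
        "closure": bridge.closure_from_concept("a:3"),
    }


print(report(SqliteBridge()) == report(InMemoryBridge()))
```

If the shipped helpers leak SQL strings, SQLite row objects, or pagination semantics through their return types, a test like this would fail for a non-SQL backend. That is the kind of finding the audit is after.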
6. Materialization as a graph→tabular escape hatch
Ch 28 settled-item commitments place materialization in v0.1.8 (deferred). Under ontology-web framing where pivot at full cross-product is infeasible, materialization may be the only path. Argue:
- Is materialization the pragmatic graph→tabular bridge for ontology-web scale?
- Should materialization move to v0.1.6 if Ch 35 surfaces this need?
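For reference, one reading of "materialization" is sketched below: compute the transitive closure once, store it as an ordinary table, and serve later tabular queries with a plain indexed lookup. This brief does not reproduce what Ch 28 Pattern B actually specifies, so treat this as an assumption. The table names, IDs, and functions are hypothetical, and `sqlite3` again stands in for sqlite-wasm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE is_a (child TEXT NOT NULL, parent TEXT NOT NULL);
    INSERT INTO is_a VALUES ('t3', 't2'), ('t2', 't1'), ('t1', 't0'), ('u1', 't1');
""")


def materialize_closure(conn):
    """Rebuild the closure table from is_a and return its row count."""
    conn.executescript("""
        DROP TABLE IF EXISTS closure;
        CREATE TABLE closure AS
        WITH RECURSIVE walk(descendant, ancestor) AS (
            SELECT child, parent FROM is_a
            UNION
            SELECT w.descendant, e.parent
            FROM walk AS w JOIN is_a AS e ON e.child = w.ancestor
        )
        SELECT descendant, ancestor FROM walk;
        CREATE INDEX closure_by_descendant ON closure (descendant);
    """)
    return conn.execute("SELECT COUNT(*) FROM closure").fetchone()[0]


def ancestors(conn, concept):
    """Read path: a flat indexed lookup, with no recursion at query time."""
    rows = conn.execute(
        "SELECT ancestor FROM closure WHERE descendant = ? ORDER BY ancestor",
        (concept,),
    )
    return [row[0] for row in rows]


print(materialize_closure(conn))
print(ancestors(conn, "t3"))
```

The trade-off to weigh is storage and rebuild cost against read latency. A closure table can grow far larger than the edge table for deep hierarchies, and it must be rebuilt or incrementally maintained whenever the source ontologies change.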
Anti-patterns to reject upfront
The deliverable must NOT recommend:
- Migrating off sqlite-wasm without a concrete trigger — Ch 33 might surface one; Ch 35 alone shouldn’t unless it’s overwhelming
- Reintroducing Apache AGE — Ch 16 demoted explicitly
- Reintroducing libSQL / Turso / Limbo — Ch 24 rejected
- Speculative tier added (T4? T5?) — 3 tiers was the design commitment
- Forgetting that Crosswalker is a vault-native plugin — server-side substrates only as Tier 3 escape hatches
- Cross-vault federation — explicitly out of scope per Ch 27/28 anti-patterns
Success criteria for the deliverable
The deliverable must produce:
- Per-tier verdict — for each of T1/T2/T3: REAFFIRMED / REVISED / DEFERRED-TO-LATER, with rationale
- Graph→tabular projection pattern survey — 6+ patterns × 4+ dimensions
- Pivot scale analysis — at small/medium/large/ontology-web scale, what’s feasible?
- Non-GRC query catalog — 5+ representative ontology-web queries × decomposition × scale
- Substrate-neutrality audit — are current Tier 2 helpers swappable?
- Materialization timing — should it move to v0.1.6, stay v0.1.8, or change again?
- Updated migration triggers — modifications to Ch 24’s 5 triggers
Anchored references
Predecessor:
Project context:
- concepts/query-primitives
- concepts/ontology-web-querying
- WASM-A pivot synthesis — current substrate state
- Ch 24 synthesis — substrate + migration triggers
- In-progress synthesis log §9
Sister challenges:
- Ch 33 — Multi-modal landscape audit
- Ch 34 — Streaming / chunked execution
- Ch 36 — Query language rerun
- Ch 37 — Tier 2 scale model rerun
External:
- BioPortal — ontology-web scale reference
- UMLS — 3.5M concepts
- OBO Foundry — biomedical ontology web
- NIST OLIR — crosswalk catalog at scale
Hand-off
Write the deliverable to docs/.../zz-research/YYYY-MM-DD-challenge-35-deliverable-a-<slug>.md. After deliverable lands: flip synthesis log §9 status Ch 35 row from ⏳ to ✅; update Ch 10 archived brief with :::note callout pointing to this rerun + new deliverable; if verdict revises original Ch 10 verdict, document explicitly in synthesis log; archive this brief.