
Challenge 35: Graph→tabular bridging (RERUN of Ch 10 under ontology-web framing)


Original Ch 10 (resolved 2026-05-02 via cascade Ch 11 + Ch 14 + Ch 16 → v0.1 stack pivot):

  • Asked: how does Crosswalker bridge graph data (RDF/SSSOM mappings) into tabular views (Bases / DataView / .base)?
  • Verdict: Hybrid 3-tier — materialized folders (T1) + DuckDB-WASM (T2) + Apache AGE (T3)
  • Then revised by Ch 14 → Tier 2-Lite (sqlite-wasm + sqlite-vec + simple-graph) for mobile/low-end
  • Then revised by Ch 16 → AGE demoted; Fuseki/oxigraph-server promoted

What’s different now (2026-05-08 rerun):

  1. Framing shifted: the original was GRC/SSSOM-specific; the rerun asks about arbitrary ontology webs (BioPortal, OBO Foundry, OLS, UMLS, NIST OLIR, OxO2)
  2. Scale shifted: the original assumed a bounded vault size (~5K controls × 8 frameworks); the rerun assumes ontology-web scale (BioPortal: 700+ ontologies; UMLS: 3.5M concepts)
  3. Substrate shifted: Crosswalker pivoted to plain sqlite-wasm + recursive CTE (per the WASM-A pivot), so the verdict needs re-examination
  4. User intuition: the user explicitly framed graph→tabular as the central question (“we’re pulling tabular views from graph/web-connected networks of ontologies/hierarchies”); the original Ch 10 didn’t engage this fully
  5. Sister challenges: Ch 33 (landscape audit) + Ch 34 (streaming) bracket this challenge with broader context that was unavailable on 2026-05-02

Ontology webs are graphs. Bases queries are tabular. The graph→tabular bridge is THE load-bearing operation of Crosswalker’s query engine — every recipe ultimately produces a flat tabular result that Bases (or a custom view) renders. Original Ch 10 answered this for SSSOM/GRC mappings at small-medium scale. The rerun asks: does the same architecture survive at ontology-web scale?

| Asset | What it gives us |
| --- | --- |
| Ch 10 archived brief | Original framing + 3-tier verdict |
| Ch 10 deliverable | Original deliverable (verbatim historical record) |
| Ch 14 + Ch 16 syntheses | Subsequent revisions of the 3-tier stack |
| v0.1.5 Tier 2 sidecar shipped | What’s actually running today (sqlite-wasm + recursive CTE) |
| concepts/query-primitives | The candidate primitive set; pivot is THE graph→tabular operation |

1. Confirm or revise the 3-tier architecture under ontology-web framing


Original Ch 10 verdict: Hybrid 3-tier. Re-examine each tier:

  • Tier 1 (materialized folders): does this scale to BioPortal size (700 ontologies × thousands of concepts each)?
  • Tier 2 (sqlite-wasm + recursive CTE; was DuckDB-WASM): does this survive closure queries at ontology-web scale?
  • Tier 3 (Fuseki/oxigraph-server; was AGE): is the SPARQL/RDF stack the right Tier 3 for ontology webs (vs. Cozo, Stardog, Datomic)?

For each tier, argue REAFFIRMED / REVISED / DEFERRED.
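To make the Tier 2 question concrete, here is a minimal sketch of a closure query written as a recursive CTE. It runs against Python's built-in `sqlite3` rather than sqlite-wasm, and the `edge` table, its columns, and the `closure` helper are hypothetical names chosen for illustration, not Crosswalker's actual schema or API. The depth bound is what guarantees termination when the graph contains cycles.

```python
import sqlite3

# Hypothetical schema: one row per directed edge between concepts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B"), ("X", "Y")],  # D -> B closes a cycle
)

CLOSURE_SQL = """
WITH RECURSIVE reach(node, depth) AS (
    SELECT :start, 0
    UNION
    SELECT e.dst, r.depth + 1
    FROM reach AS r
    JOIN edge AS e ON e.src = r.node
    WHERE r.depth < :max_depth      -- bound the walk so cycles cannot loop forever
)
SELECT node, MIN(depth) AS hops     -- keep the shortest distance per node
FROM reach
GROUP BY node
ORDER BY hops, node
"""


def closure(start, max_depth=10):
    """Return (node, hops) rows for everything reachable from `start`."""
    params = {"start": start, "max_depth": max_depth}
    return conn.execute(CLOSURE_SQL, params).fetchall()


print(closure("A"))
```

The result is already tabular (one row per reachable concept), which is the shape a Bases view consumes. Whether this pattern holds up at ontology-web scale is exactly what the tier verdict has to argue; the sketch says nothing about performance.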

For arbitrary ontology webs, what’s the canonical pattern for projecting graph data into tabular form?

  • SPARQL SELECT (RDF triple → tabular): the W3C standard
  • SPARQL CONSTRUCT (RDF triple → RDF triple, then tabular): reshape graph before projection
  • Cypher RETURN clauses (Neo4j; property graph → tabular)
  • DuckPGQ GRAPH_TABLE with MATCH … COLUMNS (DuckDB graph extension; SQL/PGQ)
  • Datalog rule heads (Cozo, Datomic; logic → tabular)
  • Property graph traversal API (TinkerPop, Gremlin; chained API → result)

For each: how naturally does it express Crosswalker’s pivot operation? What’s the developer-friendliness?
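As a reference point for comparing these patterns, the sketch below shows one two-hop graph pattern projected to rows. The SPARQL SELECT form appears only as a comment for comparison; the executed form is the equivalent self-join over a generic triple table in SQLite. The `triple` table and the `ex:` identifiers are assumptions made for illustration.

```python
import sqlite3

# SPARQL form of the same projection (shown for comparison, not executed):
#   SELECT ?src ?parent WHERE {
#     ?src skos:exactMatch ?mid .
#     ?mid skos:broader    ?parent .
#   }

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triple (s TEXT, p TEXT, o TEXT)")  # hypothetical triple store
conn.executemany(
    "INSERT INTO triple VALUES (?, ?, ?)",
    [
        ("ex:a1", "skos:exactMatch", "ex:b1"),
        ("ex:b1", "skos:broader", "ex:b0"),
        ("ex:a2", "skos:exactMatch", "ex:b2"),  # ex:b2 has no broader term, so no row
    ],
)

# Each triple pattern becomes one alias of the table; the shared variable
# (?mid) becomes the join condition; the SELECT list is the projection.
rows = conn.execute(
    """
    SELECT t1.s AS src, t2.o AS parent
    FROM triple AS t1
    JOIN triple AS t2 ON t2.s = t1.o
    WHERE t1.p = 'skos:exactMatch'
      AND t2.p = 'skos:broader'
    ORDER BY src
    """
).fetchall()

print(rows)
```

Fixed-length patterns like this one translate mechanically between the surveyed languages; the differences that matter for the survey show up with variable-length paths and optional matches.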

The user’s framing was specifically about pivot. At BioPortal scale (700 ontologies; 700 × 700 = 490,000 ordered ontology pairs, self-pairs included), a full cross-product pivot is infeasible.

  • What’s the pragmatic approach? (top-N pivots? user-selected pairs? pre-computed indexes?)
  • Does the substrate need to support sparse pivot (most cells empty)?
  • Is materialization (per Ch 28 Pattern B) required at this scale?
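The arithmetic behind the scale concern, plus one way to think about a sparse pivot, can be sketched as follows. The 490,000 figure counts ordered pairs including self-pairs; counting each unordered pair of distinct ontologies once gives a smaller number. The mapping rows and the `sparse_pivot` helper are hypothetical and exist only to show that a sparse representation stores populated cells rather than the full grid.

```python
from collections import defaultdict
from math import comb

n = 700
ordered_pairs = n * n        # full cross-product, self-pairs included
distinct_pairs = comb(n, 2)  # unordered pairs of two different ontologies

# Hypothetical mapping rows:
# (source ontology, target ontology, source concept, target concept)
mappings = [
    ("GO", "MONDO", "GO:1", "MONDO:9"),
    ("GO", "MONDO", "GO:2", "MONDO:7"),
    ("MONDO", "CHEBI", "MONDO:9", "CHEBI:3"),
]


def sparse_pivot(rows):
    """Count mappings per (source, target) ontology pair, storing only non-empty cells."""
    cells = defaultdict(int)
    for src_onto, dst_onto, _src_id, _dst_id in rows:
        cells[(src_onto, dst_onto)] += 1
    return dict(cells)


pivot = sparse_pivot(mappings)
print(ordered_pairs, distinct_pairs, len(pivot))
```

Storage for a pivot built this way grows with the number of pairs that actually have mappings, not with the square of the ontology count, which is why the sparse-pivot question bears on substrate choice.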

Re-do the Ch 28a §4 query-routing matrix but for non-GRC ontology webs:

  • OBO Foundry: gene-ontology terms → MONDO disease → ChEBI compound (3-hop closure across biomedical ontologies)
  • SKOS taxonomy: top-level subject heading → narrower → narrower (deep hierarchy traversal)
  • MITRE ATT&CK: technique → mitigation → control (threat → defense pivot)
  • NIST OLIR: cross-framework crosswalk catalog (massive crosswalk inventory)
  • Library science: LCSH ↔ MeSH ↔ Dewey (cross-vocabulary subject mapping)

For each: what query primitive composition? What graph→tabular projection? What scale ceiling?

v0.1.5 P3 shipped crosswalkBetween and closureFromConcept against sqlite-wasm. These ARE Crosswalker’s graph→tabular bridge today. Audit:

  • Are they substrate-neutral (could swap to Cozo / DuckDB / Polars / Oxigraph without API change)?
  • Do they handle ontology-web scale OR are they bounded to small/medium scale?
  • What’s the migration path if Ch 33/35 surfaces the need to swap substrates?
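One way to frame the substrate-neutrality audit is to ask whether callers depend only on an interface or also on SQL details. The sketch below is an assumption-heavy illustration: the brief does not give the real signatures of crosswalkBetween or closureFromConcept, so `crosswalk_between`, its parameters, the `mapping` table, and both backend classes are hypothetical stand-ins written in Python.

```python
import sqlite3
from typing import Protocol

Row = tuple[str, str]  # (source concept id, target concept id)


class CrosswalkBackend(Protocol):
    """Hypothetical substrate-neutral contract; callers see rows, never SQL."""

    def crosswalk_between(self, src: str, dst: str) -> list[Row]: ...


class SqliteBackend:
    """Backend that answers the query with SQL."""

    def __init__(self, mappings):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE mapping (src_fw TEXT, dst_fw TEXT, src_id TEXT, dst_id TEXT)"
        )
        self.conn.executemany("INSERT INTO mapping VALUES (?, ?, ?, ?)", mappings)

    def crosswalk_between(self, src, dst):
        cur = self.conn.execute(
            "SELECT src_id, dst_id FROM mapping "
            "WHERE src_fw = ? AND dst_fw = ? ORDER BY src_id, dst_id",
            (src, dst),
        )
        return cur.fetchall()


class InMemoryBackend:
    """Backend that answers the same query without any SQL engine."""

    def __init__(self, mappings):
        self.mappings = list(mappings)

    def crosswalk_between(self, src, dst):
        return sorted(
            (s, t) for sf, df, s, t in self.mappings if sf == src and df == dst
        )


data = [
    ("NIST", "ISO", "AC-2", "A.9.2"),
    ("NIST", "ISO", "AC-1", "A.9.1"),
    ("NIST", "CIS", "AC-2", "5.1"),
]
backends = [SqliteBackend(data), InMemoryBackend(data)]
results = [b.crosswalk_between("NIST", "ISO") for b in backends]
print(results[0] == results[1])
```

If the shipped helpers accept or return anything SQL-specific (raw query strings, cursor objects, SQLite type affinities), that is the kind of leak the audit should flag.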

6. Materialization as a graph→tabular escape hatch


Ch 28 settled-item commitments place materialization in v0.1.8 (deferred). Under ontology-web framing where pivot at full cross-product is infeasible, materialization may be the only path. Argue:

  • Is materialization the pragmatic graph→tabular bridge for ontology-web scale?
  • Should materialization move to v0.1.6 if Ch 35 surfaces this need?
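For readers weighing this trade-off, a minimal sketch of what materialization means at the SQL level: compute the transitive closure once, store it as an ordinary indexed table, and serve later reads as plain lookups. This is an illustration using Python's built-in `sqlite3`; the `edge` and `closure_mat` tables and the `descendants` helper are hypothetical names, not the Ch 28 Pattern B design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")  # hypothetical edge list
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [("A", "B"), ("B", "C"), ("X", "Y")],
)

# One-time build step: store every (root, reachable node) pair.
# UNION removes duplicate pairs, which also makes the recursion stop on cycles.
conn.execute(
    """
    CREATE TABLE closure_mat AS
    WITH RECURSIVE reach(root, node) AS (
        SELECT src, dst FROM edge
        UNION
        SELECT r.root, e.dst
        FROM reach AS r
        JOIN edge AS e ON e.src = r.node
    )
    SELECT root, node FROM reach
    """
)
conn.execute("CREATE INDEX closure_mat_root ON closure_mat (root)")


def descendants(root):
    """Read path: an indexed lookup with no recursion at query time."""
    cur = conn.execute(
        "SELECT node FROM closure_mat WHERE root = ? ORDER BY node", (root,)
    )
    return [node for (node,) in cur]


print(descendants("A"))
```

The cost moves from query time to build time and storage, and the stored table must be refreshed when edges change; those are the factors the timing argument has to weigh.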

The deliverable must NOT:

  1. Recommend migrating off sqlite-wasm without a concrete trigger (Ch 33 might surface one; Ch 35 alone shouldn’t unless the evidence is overwhelming)
  2. Reintroduce Apache AGE (Ch 16 demoted it explicitly)
  3. Reintroduce libSQL / Turso / Limbo (Ch 24 rejected them)
  4. Add a speculative tier (T4? T5?); three tiers was the design commitment
  5. Forget that Crosswalker is a vault-native plugin; server-side substrates are acceptable only as Tier 3 escape hatches
  6. Propose cross-vault federation, which is explicitly out of scope per the Ch 27/28 anti-patterns

The deliverable must produce:

  1. Per-tier verdict — for each of T1/T2/T3: REAFFIRMED / REVISED / DEFERRED-TO-LATER, with rationale
  2. Graph→tabular projection pattern survey — 6+ patterns × 4+ dimensions
  3. Pivot scale analysis — at small/medium/large/ontology-web scale, what’s feasible?
  4. Non-GRC query catalog — 5+ representative ontology-web queries × decomposition × scale
  5. Substrate-neutrality audit — are current Tier 2 helpers swappable?
  6. Materialization timing — should it move to v0.1.6, stay v0.1.8, or change again?
  7. Updated migration triggers — modifications to Ch 24’s 5 triggers

Predecessor:

Project context:

Sister challenges:

External:

Write the deliverable to docs/.../zz-research/YYYY-MM-DD-challenge-35-deliverable-a-<slug>.md. After the deliverable lands:

  • Flip the Ch 35 row in synthesis log §9 from ⏳ to ✅
  • Update the Ch 10 archived brief with a :::note callout pointing to this rerun and the new deliverable
  • If the verdict revises the original Ch 10 verdict, document that explicitly in the synthesis log
  • Archive this brief