How to enforce round-trip correctness at CI time
The core problem
Principle 2 says Obsidian semantics must round-trip between the Web CMS and the vault without corruption. That’s enforceable only if:
- We have a test corpus of real Obsidian-flavored pages
- Each page has a canonical “expected after round-trip” snapshot
- CI runs the full pipeline: parse → render → simulate an edit → save → parse again → compare
The comparison step is where this gets hard. “Equal” for markdown isn’t string equality — whitespace, frontmatter ordering, link aliasing, callout syntax variants all produce non-corrupting differences. A naive diff will flag every benign reformat as a regression.
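The pipeline steps above can be sketched as one harness function with the comparison passed in as a parameter. This is a minimal sketch, not the project's implementation: `Pipeline`, `editAndSave`, `checkRoundTrip`, and the toy pipeline are all hypothetical names, and the real parse and save stages are assumed to be injectable as plain functions.

```typescript
// Hypothetical stage signatures. The real parser and saver are not specified
// here, so they are injected as plain functions.
interface Pipeline<Doc> {
  parse: (markdown: string) => Doc;
  // Stands in for "render → simulate an edit → save": turns a Doc back into markdown.
  editAndSave: (doc: Doc) => string;
}

interface RoundTripResult {
  ok: boolean;
  actual: string; // markdown produced by the round-trip, for diffing on failure
}

function checkRoundTrip<Doc>(
  pipeline: Pipeline<Doc>,
  input: string,
  expected: string,
  equivalent: (a: string, b: string) => boolean,
): RoundTripResult {
  const saved = pipeline.editAndSave(pipeline.parse(input));
  // "Parse again": the saved output must survive a second pass unchanged
  // (modulo equivalence), otherwise content would drift on every save.
  const savedAgain = pipeline.editAndSave(pipeline.parse(saved));
  const ok = equivalent(saved, expected) && equivalent(savedAgain, saved);
  return { ok, actual: saved };
}

// Toy pipeline for illustration only: a "document" is a list of lines, and
// saving rewrites CRLF line endings as LF (a benign reformat).
const toyPipeline: Pipeline<string[]> = {
  parse: (markdown) => markdown.split(/\r?\n/),
  editAndSave: (lines) => lines.join("\n"),
};

const sampleInput = "# Title\r\nBody";
const strictResult = checkRoundTrip(toyPipeline, sampleInput, sampleInput, (a, b) => a === b);
const lenientResult = checkRoundTrip(
  toyPipeline,
  sampleInput,
  sampleInput,
  (a, b) => a.replace(/\r\n/g, "\n") === b.replace(/\r\n/g, "\n"),
);
// strictResult.ok is false (line endings changed); lenientResult.ok is true.
```

Keeping `equivalent` as a parameter means the harness itself stays trivial, and all the difficulty lives in the equivalence rules.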
Why it’s an open challenge, not just a test-harness problem
Frame it as a testing problem and the answer is “write unit tests.” But the actual blocker is defining what “round-trip correct” means for each feature:
- Wikilinks: is `[[Foo]]` ≡ `[[Foo|Foo]]` ≡ `[[Foo| Foo ]]`? Probably yes. Codify.
- Callouts: is `> [!note]` ≡ `> [!note]-`? Different: the trailing `-` means collapsed. Not equivalent.
- Frontmatter: YAML key ordering is not semantic; re-ordering on save is benign.
- Embeds: `![[image.png]]` should round-trip as the same literal. But what about `![[image.png|200]]`? That’s Obsidian’s size hint, so it probably needs to survive.
- Block refs: in `[[Note#^xyz]]` the block ID must be preserved exactly; a re-key would break inbound references.
Each of these needs a written equivalence rule before we can automate the test.
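As an illustration of what a written rule might look like once codified, here is a hedged sketch for two of the features above. The function names are hypothetical, and the rules encode only the tentative answers given in the list (redundant aliases collapse, embeds and block refs stay verbatim, the fold marker is significant). They are not settled decisions.

```typescript
// Matches [[target]] and [[target|suffix]], with an optional leading "!" for embeds.
const WIKILINK = /(!?)\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]/g;

// Rule sketch: an alias that merely repeats the target (ignoring surrounding
// spaces) is redundant, so [[Foo|Foo]] and [[Foo| Foo ]] both become [[Foo]].
function canonicalizeWikilinks(markdown: string): string {
  return markdown.replace(
    WIKILINK,
    (match: string, bang: string, target: string, suffix?: string) => {
      // Embeds are left verbatim: in ![[image.png|200]] the suffix is a size hint.
      if (bang === "!") return match;
      if (suffix !== undefined && suffix.trim() === target) return `[[${target}]]`;
      // Everything else, including block refs such as [[Note#^xyz]], is untouched.
      return match;
    },
  );
}

interface CalloutHeader {
  type: string;
  fold: string; // "" (no marker), "-" (collapsed) or "+" (expanded)
}

function parseCalloutHeader(line: string): CalloutHeader | null {
  const m = /^>\s*\[!([^\]]+)\]([+-]?)/.exec(line);
  if (m === null) return null;
  return { type: m[1] ?? "", fold: m[2] ?? "" };
}

// Two callout headers are equivalent only if both type and fold marker agree.
function calloutHeadersEquivalent(a: string, b: string): boolean {
  const x = parseCalloutHeader(a);
  const y = parseCalloutHeader(b);
  return x !== null && y !== null && x.type === y.type && x.fold === y.fold;
}
```

Writing each rule as a small pure function keeps it reviewable next to its prose definition, and makes disagreements about a rule show up as a one-line diff.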
Candidate approaches
Approach A — Property-based testing
Use fast-check or similar. Generate synthetic Obsidian markdown, run it through the pipeline twice, assert equality modulo equivalence rules. Pros: catches generalization bugs. Cons: generators for realistic markdown are non-trivial to write; false positives on edge cases that aren’t real-world.
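The shape of such a property can be shown without any dependency. The sketch below is a hand-rolled randomized check, not fast-check: a library like fast-check would replace the seeded generator and add shrinking of failing inputs. `makeRng`, `FRAGMENTS`, `generateDocument`, and `findCounterexample` are hypothetical names, and the fragment list is a deliberately tiny stand-in for a realistic generator.

```typescript
// Tiny linear congruential generator so a failing seed can be replayed.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // in [0, 1)
  };
}

// Hypothetical building blocks; a real generator would cover far more syntax.
const FRAGMENTS: readonly string[] = [
  "plain text",
  "[[Foo]]",
  "[[Foo|Bar]]",
  "![[image.png|200]]",
  "[[Note#^xyz]]",
  "> [!note]",
  "> [!note]-",
];

function generateDocument(rand: () => number): string {
  const lineCount = 1 + Math.floor(rand() * 5);
  const lines: string[] = [];
  for (let i = 0; i < lineCount; i++) {
    lines.push(FRAGMENTS[Math.floor(rand() * FRAGMENTS.length)] ?? "");
  }
  return lines.join("\n");
}

// Returns the first generated document whose round-trip is not equivalent to
// the original under `normalize`, or null if every run passes.
function findCounterexample(
  roundTrip: (markdown: string) => string,
  normalize: (markdown: string) => string,
  runs = 200,
  seed = 1,
): string | null {
  const rand = makeRng(seed);
  for (let i = 0; i < runs; i++) {
    const doc = generateDocument(rand);
    if (normalize(roundTrip(doc)) !== normalize(doc)) return doc;
  }
  return null;
}
```

A pipeline that silently drops the callout fold marker, for example `(md) => md.replace(/\]-$/gm, "]")`, should yield a counterexample containing `> [!note]-`, while an identity pipeline should yield `null`.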
Approach B — Fixture corpus
Hand-write a set of vault pages covering every Tier 1 / Tier 2 feature. Each is a .in.md file paired with a .expected.md file (the state after one round-trip). CI runs the round-trip and does a semantic diff. Pros: concrete, understandable. Cons: corpus maintenance burden grows with feature list.
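One way the pairing convention could be enforced is a small pure function that takes a directory listing and reports both the usable pairs and any fixture missing its partner. This is a sketch under the naming convention described above; `pairFixtures`, `FixturePair`, and `FixtureScan` are hypothetical names, and reading the listing from disk is left to the caller.

```typescript
interface FixturePair {
  name: string;     // shared stem, e.g. "wikilinks"
  input: string;    // "<name>.in.md"
  expected: string; // "<name>.expected.md"
}

interface FixtureScan {
  pairs: FixturePair[];
  orphans: string[]; // fixture files whose partner is missing
}

function pairFixtures(fileNames: string[]): FixtureScan {
  const present = new Set(fileNames);
  const pairs: FixturePair[] = [];
  const orphans: string[] = [];
  // Sort a copy so the result does not depend on directory listing order.
  for (const file of [...fileNames].sort()) {
    if (file.endsWith(".in.md")) {
      const name = file.slice(0, -".in.md".length);
      const expected = `${name}.expected.md`;
      if (present.has(expected)) pairs.push({ name, input: file, expected });
      else orphans.push(file);
    } else if (file.endsWith(".expected.md")) {
      const name = file.slice(0, -".expected.md".length);
      if (!present.has(`${name}.in.md`)) orphans.push(file);
    }
    // Any other file (README, notes) is ignored.
  }
  return { pairs, orphans };
}
```

Failing CI on a non-empty `orphans` list is one cheap guard against the maintenance burden: a fixture added without its snapshot is caught immediately rather than silently skipped.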
Approach C — Real vault diffing
Clone cybersader/cyberbase at a known commit, run the full pipeline, render, simulate edits on a random 1% of pages, save, compare. Pros: catches real-world regressions. Cons: requires network I/O at test time, slow, flaky.
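If this approach were pursued, the sampling step is one place where flakiness could be reduced. The sketch below swaps the random 1% for a deterministic hash-based sample, so the same commit always selects the same pages. That substitution is an assumption of this example, not something the approach prescribes, and `fnv1a` and `samplePages` are hypothetical helpers.

```typescript
// 32-bit FNV-1a over UTF-16 code units; good enough for bucketing paths.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Keeps roughly `percent`% of pages. The choice depends only on the commit and
// the path, so re-running against the same commit reproduces the same sample.
function samplePages(paths: string[], commit: string, percent: number): string[] {
  return paths.filter((path) => fnv1a(`${commit}:${path}`) % 100 < percent);
}
```

Because the commit is part of the hash input, a different pinned commit exercises a different slice of the vault. The sample size is only approximately `percent`% and varies with the paths.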
Approach D — Nothing, document the risk
Accept that round-trip correctness is an aspiration, not a property. Document each failure mode as it’s discovered. Pros: zero upfront cost. Cons: Principle 2 becomes a lie; contributors will eventually lose trust.
Leaning
Approach B + Approach A, in that order. Fixture corpus first (concrete, testable, documents expectations). Then property-based testing layered on top once the equivalence rules are codified.
What needs to happen next
- Define equivalence rules for each Tier 1 feature (wikilinks, callouts, embeds, code blocks, math, Mermaid, tables). One Markdown document per feature, documenting what “equal” means.
- Write 3–5 fixture pages covering the common cases. Store in `docs/tests/fixtures/` (or wherever, TBD).
- Build a minimal semantic-diff tool, initially just `normalize(a) === normalize(b)`, where `normalize` canonicalizes whitespace, frontmatter order, and link aliases.
- Wire into CI as a Playwright test or a standalone script.
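A first cut of the semantic-diff step could look like the sketch below. It is deliberately limited and rests on stated assumptions: frontmatter is treated as simple YAML (unindented `key:` lines with indented continuations, no comments or anchors) and is reordered textually rather than parsed, and link-alias canonicalization is left as a pluggable step passed in by the caller. All function names are hypothetical.

```typescript
// Line endings and trailing blank lines only. Trailing spaces inside a line are
// left alone because two trailing spaces are a hard line break in Markdown.
function normalizeWhitespace(markdown: string): string {
  return markdown.replace(/\r\n/g, "\n").replace(/\n+$/, "") + "\n";
}

interface FrontmatterEntry {
  key: string;
  lines: string[]; // the key line plus any indented continuation lines
}

// Sorts top-level frontmatter keys by name. Not a YAML parser.
function sortFrontmatter(markdown: string): string {
  const m = /^---\n([\s\S]*?)\n---(?:\n|$)/.exec(markdown);
  if (m === null) return markdown; // no frontmatter block
  const entries: FrontmatterEntry[] = [];
  let current: FrontmatterEntry | undefined;
  for (const line of (m[1] ?? "").split("\n")) {
    if (/^[A-Za-z0-9_-]+\s*:/.test(line) || current === undefined) {
      const colon = line.indexOf(":");
      const key = colon === -1 ? line : line.slice(0, colon).trim();
      current = { key, lines: [line] };
      entries.push(current);
    } else {
      current.lines.push(line); // continuation of the previous key
    }
  }
  entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  const sorted = entries.map((entry) => entry.lines.join("\n")).join("\n");
  return `---\n${sorted}\n---\n${markdown.slice(m[0].length)}`;
}

// `extraSteps` is where per-feature rules (e.g. link aliases) would plug in.
function normalize(
  markdown: string,
  extraSteps: Array<(markdown: string) => string> = [],
): string {
  let result = sortFrontmatter(normalizeWhitespace(markdown));
  for (const step of extraSteps) result = step(result);
  return result;
}
```

With this in place, two pages that differ only in line endings, trailing blank lines, or top-level frontmatter key order compare equal, while a changed frontmatter value or body line still fails the comparison.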
This is probably 2–3 sessions of work once Phase R exits and Phase 1 begins. Not doing it during Phase R because the principle needs to be grounded first.
Related
- Translation Layer — the subsystem this challenge is about
- Q04 in Open Questions — the precise question form
- Principle 2 — the principle this enforces