Skip to content
🚧 Early alpha — building the foundation. See the roadmap →

Research: atomic operations on graphs — are the 13 primitives primitive enough?

Created Updated

Are the 13 structural change primitives truly primitive? Or are some of them composite operations that decompose into simpler atoms? What does the literature say about the provably complete minimal set of operations for transforming one labeled directed graph into another?

Answer: 6 mathematical atoms, our 9 are a valid refinement

Section titled “Answer: 6 mathematical atoms, our 9 are a valid refinement”

The literature converges on 6 atomic operations as the provably complete minimal set:

#OperationProven by
1Node insertionGED (Bunke & Allermann 1983), Algebraic Graph Transformation (Ehrig et al. 2006)
2Node deletionSame
3Node relabelingSame
4Edge insertionSame
5Edge deletionSame
6Edge relabelingSame

Completeness proof: For any two finite labeled directed graphs G1 and G2, there exists a finite sequence of these 6 operations that transforms G1 into G2. Proven via category theory (pushout constructions) and constructively (delete everything in G1, insert everything in G2).

Our refinement of “node relabeling” into three sub-types (id_changed, properties_changed, type_changed) and “edge relabeling” into two sub-types (type_changed, properties_changed) is a pragmatic design choice supported by schema evolution practice (RENAME is universally treated as distinct from ALTER) and ontology tools (PROMPT, OntoDiff). It’s a superset of the minimal complete set, so completeness is preserved.

Bunke & Allermann 1983, Sanfeliu & Fu 1983

Six operations. Provably complete. GED is NP-hard in general but polynomial for special graph classes. The six operations are the mathematical gold standard.

Ontology diff tools (OntoDiff, PROMPT, ContentCVS)

Section titled “Ontology diff tools (OntoDiff, PROMPT, ContentCVS)”

Klein 2004, Noy & Musen 2004

All use a two-tier model: atomic changes + detected composite changes. Atomic operations align with GED’s six. Composites include merge, split, move, pull-up, push-down, flatten, deepen. This two-tier model validates our approach of having both atomic primitives and recognized composites.

ISO 25964-1:2011

Thesauri management recognizes add, deprecate (NOT delete), merge, split, restructure. Key insight: deprecation as a first-class concept — thesauri never truly delete, they deprecate and redirect. This is a properties_changed + edge additions in our model, but worth recognizing as a pattern.

Curino et al. 2008

PRISM proved 11 Schema Modification Operators are complete for relational schemas. Key insight: RENAME is always treated as atomic and distinct from ALTER — validates our separation of id_changed from properties_changed.

Chawathe et al. 1996, Falleri et al. 2014

Four operations for ordered trees: insert, delete, update, move. Move is treated as atomic even though it decomposes into delete + insert — because detecting it as a unit produces dramatically better change descriptions. This is the strongest challenge to our primitive set: we should add node.moved as a recognized operation.

Category theory / algebraic graph transformation

Section titled “Category theory / algebraic graph transformation”

Ehrig et al. 2006

The definitive mathematical framework. Six elementary operations via Double Pushout (DPO) construction. Provably complete. Everything else is provably a composition.

Change ontologies (Stojanovic 2004, Papavassiliou et al. 2009)

Section titled “Change ontologies (Stojanovic 2004, Papavassiliou et al. 2009)”

Stojanovic 2004

15 elementary changes for OWL ontologies. When you strip away OWL-specific semantics, they reduce to the GED six plus domain refinements. Universally distinguish atomic from composite.

Refinement of the mathematical 6, preserving completeness:

#OperationMathematical basis
1node.addednode insertion
2node.removednode deletion
3node.id_changednode relabeling (identity)
4node.properties_changednode relabeling (data)
5node.type_changednode relabeling (type)
6edge.addededge insertion
7edge.removededge deletion
8edge.type_changededge relabeling (type)
9edge.properties_changededge relabeling (data)

Tier 2 — Recognized composites (detected patterns)

Section titled “Tier 2 — Recognized composites (detected patterns)”

Not primitive but recognized as units for UX and change comprehension:

#OperationDecomposes intoWhy recognize it
10node.movededge.removed + edge.added30 years of tree-diff research says this is essential for readable diffs (Chawathe 1996, GumTree 2014)
11subgraph.mergednode.removed + edge ops + property mergeUniversal in ontology evolution literature
12subgraph.splitnode.added + edge ops + property splitInverse of merge
13hierarchy.restructuredMultiple edge operationsCatch-all for complex reorganizations
  • Dropped: hierarchy.depth_changed — redundant with hierarchy.restructured; depth change is a measurable property of restructuring, not a separate operation
  • Added: node.moved — strong literature support as a recognized composite (replaces the depth_changed slot)
  • Explicitly tiered: 9 true atomics vs. 4 recognized composites
  • Deprecation: Sub-case of properties_changed (adding deprecated flag) + edge additions (redirect). Common pattern, not distinct operation.
  • Copy/clone: Decomposes into insert + property copy. Not atomic.
  • Edge reversal: Decomposes into delete + insert with swapped endpoints. Not atomic.
  • Edge retargeting (change source/target): Interesting for preserving edge identity through moves. May add later as edge.source_changed / edge.target_changed if needed.
  1. The 9 atomic operations are the foundation. They’re a provably complete set. Every possible change to a labeled directed graph can be expressed as a sequence of these 9.

  2. The 4 recognized composites are the UX layer. They exist because users and agents need readable change descriptions, not because the math requires them.

  3. This is exactly the “tight to first principles while allowing extension of opinionation” architecture — the atoms are non-negotiable math, the composites are pragmatic engineering, and the per-framework decisioning is opinionated extension on top.

  4. The schema can confidently build on this. The diff engine produces a sequence of tier-1 atoms. The composite detector groups them into tier-2 patterns. The decisioning layer maps each to a handling strategy. Each layer is independently testable and independently extensible.

  • Bunke, H. & Allermann, G. (1983). “Inexact graph matching for structural pattern recognition.” Pattern Recognition Letters, 1(4), 245–253.
  • Sanfeliu, A. & Fu, K.S. (1983). “A distance measure between attributed relational graphs.” IEEE Trans. SMC, 13(3), 353–362.
  • Zhang, K. & Shasha, D. (1989). “Simple fast algorithms for the editing distance between trees.” SIAM J. Computing, 18(6), 1245–1262.
  • Chawathe, S.S. et al. (1996). “Change detection in hierarchically structured information.” ACM SIGMOD, 25(2), 493–504.
  • Noy, N.F. & Musen, M.A. (2004). “Ontology versioning in an ontology management framework.” IEEE Intelligent Systems, 19(4), 6–13.
  • Klein, M. (2004). Change Management for Distributed Ontologies. PhD Thesis, VU Amsterdam.
  • Stojanovic, L. (2004). Methods and Tools for Ontology Evolution. PhD Thesis, University of Karlsruhe.
  • Ehrig, H. et al. (2006). Fundamentals of Algebraic Graph Transformation. Springer.
  • Curino, C. et al. (2008). “Graceful database schema evolution: The PRISM workbench.” VLDB, 1(1), 761–772.
  • Falleri, J-R. et al. (2014). “Fine-grained and accurate source code differencing.” ASE, 313–324.
  • Gao, X. et al. (2010). “A survey of graph edit distance.” Pattern Analysis and Applications, 13(1), 113–129.
  • ISO 25964-1:2011. “Information and documentation — Thesauri and interoperability with other vocabularies.”