Helper function specifications
These functions were developed during the Python tool phase and define the core transformation logic. The TypeScript plugin equivalents follow the same algorithms.
match_value(source, target, mode)
Section titled “match_value(source, target, mode)”Compares two values using one of three matching strategies. Used for crosswalk linking — determining whether a control in Framework A corresponds to one in Framework B.
Parameters:
source— the value to search fortarget— the value to search withinmode— matching strategy:"exact","array-contains", or"regex"
Pseudocode:
Edge cases:
- Null/empty values always return false (no match)
array-containssplits on both commas AND whitespace —"AC-1, AC-2 AC-3"produces["AC-1", "AC-2", "AC-3"]regextreats the source as the pattern and target as the string to search
Used in: Crosswalk linking pipeline, config matching fingerprints.
build_full_path_components(code)
Section titled “build_full_path_components(code)”Builds cumulative path components from a dotted/hyphenated ID. Used to create the folder hierarchy from a single control ID.
Input/Output examples:
Pseudocode:
Why cumulative? Each path component represents a level in the hierarchy. "ID" is the function, "ID.AM" is the category, "ID.AM-1" is the subcategory. The cumulative approach preserves the full context at each level.
split_folders(code)
Section titled “split_folders(code)”Simple split for folder creation — separates an ID into individual segments.
Input/Output examples:
Pseudocode:
Difference from build_full_path_components: split_folders produces folder names (ID/AM/1/), while build_full_path_components produces cumulative IDs for hierarchy preservation. Use split_folders when each segment is a standalone folder name; use build_full_path_components when folder names should show their full path context.
sanitize_column_name(column)
Section titled “sanitize_column_name(column)”Normalizes column headers from source spreadsheets into clean frontmatter property keys.
Transformation rules:
- Strip leading/trailing whitespace
- Convert to lowercase
- Replace whitespace and
/with_ - Remove all characters except word characters,
-, and_
Input/Output examples:
Pseudocode:
Used in: Column mapping during import wizard, frontmatter key generation.
hierarchical_ffill(dataframe, columns)
Section titled “hierarchical_ffill(dataframe, columns)”Grouped forward-fill for hierarchical data with merged cells. This is the most complex helper — standard forward-fill fails for hierarchical data because it doesn’t respect group boundaries.
The problem: Framework spreadsheets often use merged cells for hierarchy levels. When loaded into a table, merged cells become empty:
| Function | Category | Subcategory |
|---|---|---|
| GOVERN | Organizational Context | GV.OC-01 |
| GV.OC-02 | ||
| Risk Management | GV.RM-01 | |
| GV.RM-02 | ||
| PROTECT | Access Control | PR.AC-01 |
A naive forward-fill on “Function” correctly fills GOVERN down. But a naive forward-fill on “Category” would incorrectly fill Organizational Context into the Risk Management rows if “Function” changed.
The solution: Fill each column grouped by its parent columns.
Pseudocode:
After hierarchical forward-fill:
| Function | Category | Subcategory |
|---|---|---|
| GOVERN | Organizational Context | GV.OC-01 |
| GOVERN | Organizational Context | GV.OC-02 |
| GOVERN | Risk Management | GV.RM-01 |
| GOVERN | Risk Management | GV.RM-02 |
| PROTECT | Access Control | PR.AC-01 |
Used for: D3FEND ontology (deeply nested hierarchy), CRI Profile structure, any framework with merged Excel cells.
Resources
Section titled “Resources”Related pages
Section titled “Related pages”- Prior art — Python tool implementation history
- Framework data sources — which frameworks use which helpers
- Transformation system — column roles and naming styles