2026-04-24 19:44:52 +00:00
---
description: Dependency & topology mapping — call graphs, data lineage, batch flows, rendered as navigable diagrams
argument-hint: <system-dir>
---

Build a **dependency and topology map** of `legacy/$1` and render it visually.

The assessment gave us domains. Now go one level deeper: how do the *pieces*
connect? This is the map an engineer needs before touching anything.

## What to produce

Write a one-off analysis script (Python or shell, your choice) that parses
the source under `legacy/$1` and extracts the four datasets below. Three
principles apply across stacks; getting them wrong produces a misleading map:

1. **Edges live in two places**: direct calls in source, *and* dispatcher/
   router calls whose targets are variables (config tables, route maps,
   dependency injection, dynamic dispatch). Resolve variables against config
   before declaring an edge unresolvable.
2. **The code↔storage join is usually external configuration**, not source:
   job/deployment descriptors map logical names to physical stores.
3. **Entry points usually live in deployment config**, not source. Without
   parsing it, every top-level module looks unreachable.

Extract:

- **Program/module call graph**: direct calls (`CALL`, method invocations,
  `import`/`require`) *and* dispatcher calls (`EXEC CICS LINK/XCTL`, DI
  container wiring, framework routing, reflection/factory). Resolve variable
  call targets against route tables, copybooks, config, or constant pools.
- **Data dependency graph**: which modules read/write which data stores,
  joined through the relevant config: `SELECT…ASSIGN TO` ↔ JCL `DD` (batch
  COBOL), `EXEC CICS READ/WRITE…FILE()` ↔ CSD `DEFINE FILE` (CICS online),
  `EXEC SQL` table refs (embedded SQL), ORM annotations/mappings (Java/.NET),
  model files (Node/Python/Ruby). Include UI/screen bindings (BMS maps, JSPs,
  templates); they're dependencies too.
- **Entry points**: whatever the stack's outermost invoker is, read from
  where it's defined: JCL `EXEC PGM=` and CICS CSD `DEFINE TRANSACTION`
  (mainframe), `web.xml`/route annotations/route files (web), `main()`/argv
  parsing (CLI), queue/scheduler subscriptions (event-driven).
- **Dead-end candidates**: modules with no inbound edges that are not
  themselves entry points. **Only meaningful once all the entry-point and
  call-edge types above are in the graph.** Suppress the dead claim for
  anything that could be the target of an unresolved dynamic call. A
  grep-only graph will mark most dispatcher-driven modules (CICS programs,
  Spring controllers, ORM-bound DAOs) dead when they aren't.
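
The dead-end rule above can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the function name `classify_modules`, the shape of its inputs, and the conservative policy (if any dynamic call stays unresolved, every module without inbound edges is reported as "uncertain" instead of dead) are all assumptions made for the example.

```python
from collections import defaultdict


def classify_modules(modules, direct_calls, dynamic_calls, route_table, entry_points):
    """Split modules with no inbound edges into dead-end vs. uncertain.

    direct_calls:  (caller, callee) pairs found literally in source.
    dynamic_calls: (caller, variable) pairs from dispatcher-style calls.
    route_table:   variable -> module name, parsed from config (hypothetical shape).
    """
    inbound = defaultdict(set)
    for caller, callee in direct_calls:
        inbound[callee].add(caller)

    unresolved = []
    for caller, variable in dynamic_calls:
        target = route_table.get(variable)
        if target is None:
            # Keep the edge visible instead of silently dropping it.
            unresolved.append((caller, variable))
        else:
            inbound[target].add(caller)

    no_inbound = sorted(
        m for m in modules if m not in entry_points and not inbound.get(m)
    )
    if unresolved:
        # Assumed policy: any of these could be the unresolved target,
        # so none of them is claimed dead.
        return {"dead_end": [], "uncertain": no_inbound, "unresolved": unresolved}
    return {"dead_end": no_inbound, "uncertain": [], "unresolved": []}
```

A finer-grained policy could narrow the "uncertain" set, for example to modules whose names match the value range of the unresolved variable, but that depends on what the config exposes.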

If the source is fixed-column (COBOL columns 8–72, RPG, etc.), slice the
code area and strip comment lines before regex matching, or you'll match
sequence numbers and commented-out code.
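
For fixed-format COBOL, the slicing step might look like the sketch below. It assumes the conventional layout (columns 1–6 sequence area, column 7 indicator, columns 8–72 code, columns 73–80 identification) and that tabs have already been expanded; the names `code_lines`, `call_targets`, and the `CALL` pattern are illustrative only and cover literal call targets, not data-name targets.

```python
import re

# Matches CALL 'PROGNAME' with a literal target (illustrative pattern).
CALL_LITERAL = re.compile(r"\bCALL\s+'([A-Z0-9-]+)'")


def code_lines(source_text):
    """Yield columns 8-72 of every non-comment, non-blank line."""
    for raw in source_text.splitlines():
        line = raw.ljust(72)           # pad short lines so slicing is safe
        indicator = line[6]            # column 7
        if indicator in ("*", "/"):    # comment line
            continue
        code = line[7:72].rstrip()     # columns 8-72, without the ID area
        if code:
            yield code


def call_targets(source_text):
    """Return literal CALL targets found in the code area only."""
    return [
        name
        for code in code_lines(source_text)
        for name in CALL_LITERAL.findall(code.upper())
    ]
```

Running the regex over `code_lines(...)` instead of the raw text is what keeps sequence numbers and commented-out calls out of the edge list.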

Save the script as `analysis/$1/extract_topology.py` (or `.sh`) so it can be
re-run and audited. Have it write a machine-readable
`analysis/$1/topology.json` and print a human summary. Run it; show the
summary (cap at ~200 lines for very large estates).
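
One possible shape for the output step is sketched here. The JSON keys used in the test data (`modules`, `calls`, `entry_points`, `dead_end_candidates`) are a hypothetical schema, since the text above does not fix one, and `write_topology` is an illustrative name. Only the file name `topology.json` and the ~200-line cap come from the instructions.

```python
import json
from pathlib import Path


def write_topology(out_dir, topology, max_summary_lines=200):
    """Write topology.json and return a capped human-readable summary."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "topology.json").write_text(
        json.dumps(topology, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )

    # One count line per top-level key, then one line per dead-end candidate.
    lines = [f"{key}: {len(value)}" for key, value in sorted(topology.items())]
    lines += [
        f"dead-end candidate: {name}"
        for name in topology.get("dead_end_candidates", [])
    ]

    if len(lines) > max_summary_lines:
        omitted = len(lines) - (max_summary_lines - 1)
        lines = lines[: max_summary_lines - 1] + [f"... {omitted} more lines omitted"]
    return "\n".join(lines)
```

Sorting keys in the JSON keeps diffs between runs small, which helps when the script is re-run and audited.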

## Render

From the extracted data, generate **three Mermaid diagrams** and write them
to `analysis/$1/TOPOLOGY.html` as a self-contained page that renders in any
browser.

The HTML page must use: dark `#1e1e1e` background, `#d4d4d4` text,
`#cc785c` for `<h2>`/accents, `system-ui` font, all CSS **inline** (no
external stylesheets). Load Mermaid from a CDN in `<head>`:

```html
<script type="module">
  import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
  mermaid.initialize({ startOnLoad: true, theme: 'dark' });
</script>
```

Each diagram goes in a `<pre class="mermaid">...</pre>` block. Do **not**
wrap diagrams in markdown ` ``` ` fences inside the HTML.
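
A page builder along these lines would satisfy the constraints above. It is a sketch under assumptions: `render_page` and its `(title, mermaid_source)` input are invented for the example, "inline" is read as `style` attributes, and the Mermaid source is HTML-escaped on the assumption that Mermaid decodes entities when it reads the `<pre>` content.

```python
from html import escape

MERMAID_LOADER = """<script type="module">
  import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
  mermaid.initialize({ startOnLoad: true, theme: 'dark' });
</script>"""


def render_page(system, diagrams):
    """Return a self-contained HTML page for (title, mermaid_source) pairs."""
    sections = []
    for title, mermaid_source in diagrams:
        sections.append(
            f'<h2 style="color:#cc785c">{escape(title)}</h2>\n'
            # Raw Mermaid text goes in <pre>, never inside markdown fences.
            f'<pre class="mermaid">\n{escape(mermaid_source, quote=False)}\n</pre>'
        )
    body = "\n".join(sections)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        f"<title>{escape(system)} topology</title>\n{MERMAID_LOADER}\n</head>\n"
        '<body style="background:#1e1e1e;color:#d4d4d4;font-family:system-ui">\n'
        f"{body}\n</body>\n</html>\n"
    )
```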

1. **`graph TD`: Module call graph.** Cluster by domain (use `subgraph`).
   Highlight entry points in a distinct style. Cap at ~40 nodes; if the
   graph is larger, show a domain-level view with one domain expanded.

2. **`graph LR`: Data lineage.** Programs → data stores.
   Mark read vs write edges.

3. **`flowchart TD`: Critical path.** Trace ONE end-to-end business flow
   (e.g., "monthly billing run" or "process payment") through every program
   and data store it touches, in execution order. If production telemetry is
   available (see `/modernize-assess` Step 4), annotate each step with its
   p50/p99 wall-clock time.
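
As an illustration of the data-lineage diagram, the sketch below turns `(program, store, mode)` triples into `graph LR` text, with dotted labelled links for reads and solid labelled links for writes. The function names, the `p_`/`d_` ID prefixes, and the choice of dotted-for-read are assumptions for the example; the program and store names in any usage are placeholders.

```python
import re


def node_id(prefix, name):
    """Build a Mermaid-safe node ID; the prefix avoids reserved words like 'end'."""
    safe = re.sub(r"\W", "_", name)
    return f"{prefix}_{safe}"


def lineage_mermaid(edges):
    """edges: iterable of (program, store, mode), mode being 'read' or 'write'."""
    lines = ["graph LR"]
    declared = set()
    for program, store, mode in sorted(set(edges)):
        if mode not in ("read", "write"):
            raise ValueError(f"unknown mode: {mode!r}")
        p, d = node_id("p", program), node_id("d", store)
        if p not in declared:
            lines.append(f'  {p}["{program}"]')       # rectangle for programs
            declared.add(p)
        if d not in declared:
            lines.append(f'  {d}[("{store}")]')       # cylinder for data stores
            declared.add(d)
        arrow = "-. read .->" if mode == "read" else "-- write -->"
        lines.append(f"  {p} {arrow} {d}")
    return "\n".join(lines)
```

The same text can be written unchanged to a `.mmd` file and embedded in the HTML page.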

Also export the three diagrams as standalone `.mmd` files for re-use:
`analysis/$1/call-graph.mmd`, `analysis/$1/data-lineage.mmd`,
`analysis/$1/critical-path.mmd`.

## Annotate

Below each `<pre class="mermaid">` block in TOPOLOGY.html, add a `<ul>`
with 3-5 **architect observations**: tight coupling clusters, single
points of failure, candidates for service extraction, data stores
touched by too many writers.
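
Some of these observations can be seeded from the extracted data. As one hedged example, the helper below lists data stores with more distinct writers than a threshold; the name `heavily_written_stores` and the default threshold of 3 are arbitrary choices for illustration, not values given by this document.

```python
from collections import defaultdict


def heavily_written_stores(data_edges, max_writers=3):
    """Return {store: sorted writers} for stores with more than max_writers writers.

    data_edges: iterable of (program, store, mode), mode being 'read' or 'write'.
    """
    writers = defaultdict(set)
    for program, store, mode in data_edges:
        if mode == "write":
            writers[store].add(program)
    return {
        store: sorted(programs)
        for store, programs in writers.items()
        if len(programs) > max_writers
    }
```

The result is a starting point for the `<ul>`; whether a given writer count is actually a problem is still a judgment call for the architect.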

## Present

Tell the user to open `analysis/$1/TOPOLOGY.html` in a browser.