Instrument-first for the macOS Python-3.9 cohort.
v2.0.6 telemetry: ~13.6% of macOS sessions (~6,337 users) run on Apple's
Python 3.9 → HOOK_PY_INCOMPATIBLE → the agentic reviewer can't load (needs
3.10+ syntax). That's ~12x macOS's build-failure rate and the single
biggest macOS degradation. sg-python.sh only probes `python3.1x` on PATH,
so these users have nothing newer ON PATH — but they may still have a
3.10+ installed at a standard location that isn't on the hook's PATH
(Homebrew /opt/homebrew, python.org framework, etc.).
Before building an explicit-path interpreter search, size the RECOVERABLE
fraction: `_probe_alt_python()` checks Homebrew / python.org / distro
locations for a 3.10+ binary and emits the highest found as `sdk_alt_py`
(major*100+minor, or 0 = genuinely 3.9-only). Telemetry-only; probed ONLY
on the HOOK_PY_INCOMPATIBLE path, so healthy sessions never run it.
After a data cycle: non-zero sdk_alt_py = recoverable by an explicit-path
search in sg-python.sh; 0 = needs a user-side Python install (the one-time
notice is the only lever). That decides whether the search is worth building.
Verified locally on macOS Python 3.13:
- py_compile clean; probe returns 314 on this Mac (homebrew 3.14 present).
- 7 new tests (test_altpython_probe.py): highest-version selection,
0-when-none (mocked os.access), framework/distro path parsing, only
counts 3.10+, and emit gated on outcome==HOOK_PY_INCOMPATIBLE.
- Full suite 575/575 + 2 skipped.
No behavior change — purely additive telemetry on the incompatible path.
Version 2.0.6 -> 2.0.7 per the per-PR-bump policy (#2114).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The real dominant Linux failure, identified by a CCR Linux repro.
A CCR container reproduced the production signature — non-zero exit +
EMPTY stdout + EMPTY stderr (~60k fires/day, 4,485 Linux users on 2.0.4):
running `python -m venv` under a tight memory limit (ulimit -v) kills the
memory-heavy venv+ensurepip/pip subprocess with SIGSEGV (-11, RLIMIT_AS)
or SIGKILL (-9, kernel OOM-killer) BEFORE it writes anything. This is
NOT the ensurepip/packaging case (that always writes to stderr, code 11)
and NOT fixable by --target (a --target pip install is also memory-heavy
and gets killed too). Three earlier hypotheses (stdout, packaging,
Option A fixes Linux) were wrong — the repro corrected them.
Changes:
- Detect the signal kill (rc<0, or 128+sig: 134/137/139) in the venv/pip
and --target paths → err_kind "signal_killed:<rc>" (new code 16). The
returncode rides in a new sdk_bootstrap_rc metric so prod confirms
which signal dominates (-9 OOM-killer vs -11 RLIMIT_AS).
- Cooldown: on a signal kill, write a marker and return the new
SKIP_COOLDOWN outcome (9) on subsequent sessions for 24h — stops the
retry storm (every session was re-attempting a build that just gets
re-killed, burning the user's memory/CPU). Retries once per window so a
machine that frees memory still recovers.
- --no-cache-dir on both pip installs (venv + --target) trims pip's peak
memory; may get marginal machines under the OOM threshold.
No happy-path change: signal detection is at the top of the existing
failure handler; cooldown is checked only after all no-op probes
(NOOP_SYSTEM/VENV/TARGET short-circuit first).
Verified locally on macOS Python 3.13:
- py_compile clean.
- 35 new tests (test_signal_kill_cooldown.py): _is_signal_kill across
signals/exit-codes, rc decode, signal_killed→code 16, cooldown
lifecycle (none→write→expire), and an integration flow — simulated
SIGKILL'd venv → BUILD_FAILED/signal_killed:-9 + cooldown written →
2nd run SKIP_COOLDOWN without re-attempting → retry after window;
non-signal failure does NOT cool down; --no-cache-dir present on both
pip paths; sdk_bootstrap_rc emitted conditionally.
- End-to-end harness: the full kill→categorize→cooldown→skip→retry
chain confirmed in-process.
The original CCR repro (ulimit -v ≤7000 KB → rc=-11, empty streams) is
the ground truth this fix is built on. Can be re-validated on CCR with the
same ulimit approach.
Version 2.0.5 -> 2.0.6 per the per-PR-bump policy (#2114).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Option A, the data-gated fix for venv_ensurepip_fail (#2154 follow-up).
v2.0.4 telemetry made the call: of the venv_ensurepip_fail cohort, ~95%
HAVE pip (sdk_has_pip=true) and run Python 3.11–3.14 — so it's not the
Apple-3.9 problem; it's modern interpreters where `python -m venv` can't
bootstrap pip (Debian python3-venv absent, or python.org/pyenv builds
without ensurepip) but pip itself works. `pip install --target` needs only
pip, so it recovers the agentic reviewer for them instead of degrading to
pattern + single-shot review.
Producer (ensure_agent_sdk.py):
- New outcomes BUILT_TARGET=7, NOOP_TARGET=8; new phase pip_target=5.
- _build_via_target(): `pip install --target <state>/agent-sdk-libs
--upgrade --prefer-binary claude-agent-sdk`. Failures categorized via
_pip_err_from_stderr (sibling of main()'s pip chain — kept separate to
avoid disturbing the working venv categorizer); errno embedded for
OSError-family exceptions.
- _target_sdk_importable(): probes a prior target install → NOOP_TARGET.
Dir-check short-circuits before any subprocess, and it's only reached
when there's no working venv, so the 81% NOOP_VENV cohort never pays.
- main() falls through to the target build ONLY on venv_ensurepip_fail;
every other venv/pip failure stays terminal BUILD_FAILED. The sentinel
is released before the target build so a retry isn't seen as SKIP_SENTINEL.
Consumer (llm.py):
- _inject_agent_sdk_venv_into_syspath() adds the flat agent-sdk-libs dir
(packages sit directly in it, not under site-packages). The existing
pywin32 .pth bootstrap applies (target installs don't run .pth either).
No change to the happy path — the new branch is taken only on the
ensurepip failure, and the extra candidate dir is a no-op when absent.
Verified locally on macOS Python 3.13:
- py_compile clean.
- 30 new tests (test_venv_target_fallback.py): outcome/phase codes
(append-only, 4 stays retired), _pip_err_from_stderr categories,
_build_via_target success/CalledProcessError/timeout/exc+errno (mocked
subprocess), _target_sdk_importable dir-short-circuit, main() wiring
(ensurepip→target fallthrough + NOOP_TARGET probe + sentinel release),
consumer adds the flat dir. Full suite 533/533 pass + 2 skipped.
- END-TO-END harness (real install, simulated ensurepip failure):
main() → BUILT_TARGET, target dir has claude_agent_sdk; 2nd run →
NOOP_TARGET; consumer _inject → `import claude_agent_sdk` resolves
FROM the --target dir. Full chain proven without needing a
broken-ensurepip box.
- Real `pip install --target` + import confirmed independently (exit 0,
SDK imports from the flat layout).
NOT validated in tmux: the ensurepip failure can't be reproduced on macOS
(working ensurepip), so the fallback was proven via the real-install
harness above instead. The happy path (NOOP_VENV / normal agentic review)
is unchanged and covered by the existing hook-smoke suite.
Version 2.0.4 -> 2.0.5 per the per-PR-bump policy (#2114).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Legacy code often cannot build locally by nature — CICS/IMS programs
have no local translator and the real runtime may be a mainframe the
user doesn't have. Stopping transform on a failed legacy smoke compile
would block exactly those systems.
- transform Step 0a: the target toolchain remains required (its tests
cannot run without it); a failed or impossible legacy compile no
longer stops the run — the equivalence strategy switches to recorded
traces / golden-master fixtures, and that downgrade is stated in the
plan and in TRANSFORMATION_NOTES.md so reviewers know the strength of
the proof
- preflight: a red legacy toolchain now yields Ready-with-gaps for
transform/reimagine instead of Not-ready
The previous commit round-tripped the catalog through a JSON serializer,
which escaped non-ASCII characters in every other plugin's description.
Restore the file from main and change only the code-modernization entry.
Viewer (assets/topology-viewer.html):
- inline a minified d3 subset (hierarchy/pack, zoom, selection,
interpolateZoom, ease; ISC license) instead of loading from a CDN —
the page is now fully self-contained and works on air-gapped networks
- handle duplicate node ids (unique-suffix; edges bind to the first
occurrence) and store parent references directly, fixing
level-of-detail and selection corruption with messy generated data
- share one reveal rule between drawing, edge culling, and hit-testing
so edges no longer draw into collapsed containers
- pre-bucket edges by kind and keep a per-node adjacency map; the
hover/selection pass no longer scans every edge each frame
- cancel in-flight fly-to animations when a new one starts; clamp
fly-to zoom to the zoom extent; derive max zoom from the smallest
leaf so deep estates stay reachable
- render dead-end candidates (new deadEnds field) with a dashed
outline and a sidebar badge
- clicking a node during a flow walkthrough exits the walkthrough;
search results clear on selection and Escape; surrogate-safe label
truncation; clearer stats line; explicit empty-topology message
Commands:
- new /modernize-status: read-only progress report — artifact inventory
with timestamps, staleness flags, secrets-hygiene checks, next step
- map: deadEnds in the topology schema; datastore names must be logical
identifiers with credentials stripped from URLs/DSNs
- brief: read topology.json + .mmd files (not the interactive HTML);
staleness check against inputs; effort unit aligned to person-months
- transform: secret-safe characterization-test prompt; diff -y fallback
when delta is missing; credential-safe diff selection
- reimagine: target vision is everything after the first argument (was
silently truncated to one word); masking rules in spec/scaffold/
handoff prompts
- brief/transform/reimagine: human-approval gates phrased as explicit
stop-and-wait instead of 'enter plan mode'
- preflight: delta in the tool table; brief added to the verdict list
- README: preflight/status in the workflow; legacy/ deny list also
covers Write; plugin + marketplace descriptions updated
Fixes from an adversarial review of the new viewer:
- pin d3 to 7.9.0 and load it via dynamic import with an explicit
error panel when the CDN is unreachable (previously a blocked CDN
produced a silent dark page — a real concern for restricted networks)
- coerce ids/names/loc at intake: a single missing name or non-numeric
loc previously threw inside the render loop or propagated NaN through
the pack layout, blanking the canvas with no error
- normalize flows/steps/edges defensively (null entries, missing steps,
numeric ids vs string lookups)
- mirror the level-of-detail reveal rule in the hit test so clicks
can't select nodes that aren't drawn
- scope the Escape shortcut so clearing the search box doesn't reset
the viewport; set zoom clickDistance(4) so trackpad jitter doesn't
swallow selection clicks
- round canvas backing-store size (fractional devicePixelRatio caused
a reallocation every frame on 125%/150% display scaling)
- modernize-map: use braced ${CLAUDE_PLUGIN_ROOT} so substitution
actually happens, assert the injection marker exists in the template,
and correct the documented failure mode
modernize-map previously rendered the call graph and data lineage as
static Mermaid diagrams, which become unreadable once a node has ~10+
edges — exactly the shape of real legacy systems. It now builds an
interactive viewer from a shipped template (assets/topology-viewer.html):
a zoomable circle-pack of domains/modules sized by LOC, rendered to
canvas with level-of-detail reveal, dependency edges with per-kind
toggles, search with fly-to, a per-node detail sidebar, and a flow
walkthrough mode. Small domain-level .mmd exports remain for docs.
- topology.json now has a documented schema (hierarchy + edges + entry
points + observations + flows) consumed by the viewer
- map traces 2-4 business flows anchored to personas (claimant,
operator, auditor), each step in plain business language mapped to
the modules that implement it; the viewer plays them as numbered
paths
- brief gains a Business Walkthroughs section connecting each persona
flow to the phase that replaces it
- new modernize-preflight command: detects the stack, checks analysis
tooling, smoke-compiles a real source file with the legacy toolchain,
inventories missing copybooks/descriptors/binary-only artifacts, and
writes a per-command readiness verdict
- transform now verifies legacy + target toolchains before its plan
gate instead of failing at test time
- README: commands updated, optional-tooling section reframed as 'what
to give Claude'
assess only added SECRETS.local.md to analysis/.gitignore, leaving
*.local.patch uncovered until harden's own Step 0 ran. Both patterns are
now written by whichever command runs first.
A red-team pass found four ways credential values still reached
shareable artifacts after the initial redaction:
- the remediation patch: a diff removing a hardcoded secret carries the
raw value on its '-' lines by construction. harden now splits output:
non-credential hunks in the shareable security_remediation.patch,
credential hunks in a gitignored security_remediation.local.patch
with comment-only placeholders in the shareable file
- the other four agents had no secret-handling rules. legacy-analyst
(hardcoded-config evidence in tech-debt findings),
business-rules-extractor (credentials recorded as rule parameters),
test-engineer (legacy literals becoming committed test fixtures), and
architecture-critic (quoted code in notes files) now all mask values
and cite file:line; assess's tech-debt prompt and ASSESSMENT.md
masking now cover every section, not just Security Findings
- non-git projects: a .gitignore protects nothing under SVN/Mercurial.
Both commands now refuse --show-secrets without git and write the
quarantine file to ~/.modernize/<system>/ outside the project tree
- the patch-apply instruction was wrong in both documented layouts
(symlinked legacy/ broke relative paths). Patches are now written
with project-root-relative paths and applied from the project root
Also: --show-secrets is now position-independent in both commands, and
the README documents the full model.
Legacy systems often contain live credentials, and assessment/findings
files get committed and shared. Previously the security-auditor agent
reported hardcoded secrets verbatim into ASSESSMENT.md and
SECURITY_FINDINGS.md.
- security-auditor: mandatory secret-handling rules — mask all credential
values (file:line + 2-4 char preview), redact secrets from echoed tool
output, recommend rotation for anything that looks live
- assess/harden: gitignore-verified SECRETS.local.md quarantine file for
the per-credential inventory; findings files get masked entries and a
pointer only
- new --show-secrets flag opts into raw values in the quarantine file
(and only there)
- README: document the behavior and advise users of earlier versions to
check for already-committed findings and rotate
Follow-up to #2154. v2.0.3 telemetry showed the venv BUILD_FAILED bucket
splits into two unexplained groups; this PR instruments both.
## 1. The exc: bucket — exception type + errno
The dominant remaining venv BUILD_FAILED (phase=venv, err=99) is ~99%
sdk_bootstrap_stderr_sig=NULL — Python exceptions caught by the generic
`except Exception` ("exc:<TypeName>"), not CalledProcessErrors with
categorizable stderr. ~56k/30h, all opaque (stderr_sig only covers
"other:<tail>").
- Handler embeds errno for OSError-family: "exc:OSError:28", etc.
- SDK_BOOTSTRAP_EXC_CODES maps the type → sdk_bootstrap_exc
(FileNotFoundError=1 … OSError=6 … 99=other).
- errno decoded → sdk_bootstrap_errno (ENOENT/EACCES/ENOSPC/…).
## 2. venv_ensurepip_fail instrumentation (the other category)
venv_ensurepip_fail (code 11) is the top categorizable venv failure, and
telemetry flipped the naive assumption: it's NOT just Debian/Ubuntu —
macOS has the MOST distinct affected users (466 vs 121 linux), and linux
is a retry storm (~172 fires/user). Before committing to a `pip install
--target` fallback (Option A) we need to know (a) which interpreter these
users run and (b) whether that interpreter even has pip (→ whether
--target would work, vs needing a system package).
- sdk_hook_py (always emitted): interpreter version as major*100+minor
(309/312). Disambiguates Apple-3.9 vs a 3.10+-with-broken-ensurepip,
and also recovers the version for HOOK_PY_INCOMPATIBLE (whose "py_3.9"
err_kind otherwise collapses to err=99).
- sdk_has_pip (only on err==11, to avoid an extra subprocess per healthy
session): whether `<interpreter> -m pip --version` works. has_pip=true
→ the --target fallback would fix them; has_pip=false → they need a
system package (python3-venv / a complete Python).
Both #1 and #2 are purely additive telemetry on the existing BUILD_FAILED
path — no behavior change to the bootstrap. They de-risk the Option A
decision: ship A only if the affected cohort has pip.
Verified locally on macOS Python 3.13:
- py_compile clean.
- 39 tests in test_exc_failure_encoding.py (34 exc/errno + 5 ensurepip
instrumentation): type-code map, errno extraction + round-trip,
APPEND-ONLY stability, handler-embeds-errno, _probe_has_pip returns
bool + true-on-this-machine, sdk_hook_py always-emitted as
major*100+minor, sdk_has_pip gated on err==11.
- Full suite: 503/503 pass + 2 skipped.
Version 2.0.3 -> 2.0.4 per the per-PR-bump policy (#2114).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Refreshes the marketplace listing description for the AWS Startup Advisor
plugin at the maintainer's request.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds an additional marketplace entry named `vanta` pointing to the same
source as the existing `vanta-mcp-plugin` entry (VantaInc/vanta-mcp-plugin
at the pinned SHA `345d86b5`). No code change — same upstream the existing
entry already uses; this exposes the plugin under the developer's canonical
plugin name (`vanta`, per the repo's plugin.json) alongside the existing slug.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 14:19:00 -07:00
24 changed files with 1971 additions and 226 deletions
@@ -7,7 +7,7 @@ A structured workflow and set of specialist agents for modernizing legacy codeba
Legacy modernization fails most often not because the target technology is wrong, but because teams skip steps: they transform code before understanding it, reimagine architecture before extracting business rules, or ship without a harness that would catch behavior drift. This plugin enforces a sequence:
The discovery commands (`assess`, `map`, `extract-rules`) build artifacts under `analysis/<system>/`. The `brief` command synthesizes them into an approval gate. The build commands (`reimagine`, `transform`) write new code under `modernized/`. The `harden` command audits the legacy system and produces a reviewable remediation patch. Each step has a dedicated slash command, and specialist agents (legacy analyst, business rules extractor, architecture critic, security auditor, test engineer) are invoked from within those commands — or directly — to keep the work honest.
@@ -20,25 +20,40 @@ Commands take a `<system-dir>` argument and assume the system being modernized l
`/modernize-assess` works best with [`scc`](https://github.com/boyter/scc) (LOC + complexity + COCOMO) or [`cloc`](https://github.com/AlDanial/cloc), and falls back to `find`/`wc` if neither is installed. Portfolio mode also benefits from [`lizard`](https://github.com/terryyin/lizard) (cyclomatic complexity). The commands degrade gracefully without them, but the metrics will be coarser.
The commands degrade gracefully, but each of these makes the output meaningfully better — run `/modernize-preflight <system-dir>` to check all of them at once and get a readiness report:
- **Analysis tools**: [`scc`](https://github.com/boyter/scc) (LOC + complexity + COCOMO) or [`cloc`](https://github.com/AlDanial/cloc); [`lizard`](https://github.com/terryyin/lizard) for portfolio mode. Without them, metrics fall back to `find`/`wc` and get coarser.
- **A working build toolchain** for the legacy stack (e.g. GnuCOBOL for COBOL) — required before `/modernize-transform` can prove behavioral equivalence, and verified by preflight with a real smoke compile against your code.
- **The whole system in the tree**: deployment descriptors (JCL, CICS definitions, route configs), copybooks/includes, and DDL/schemas. Entry-point detection and data lineage in `/modernize-map` are guesswork without them.
- **Production telemetry** (optional): an observability MCP server or batch job logs enable the runtime overlay in `/modernize-assess` and timing annotations on critical paths.
## Secret handling
Legacy systems routinely contain live credentials, and assessment artifacts get committed and shared. **Every agent in this plugin masks credential values** — findings, rule-card parameters, architecture notes, and test fixtures cite `file:line` with a masked preview (`AKIA****`), never the value. When credentials are found, a per-credential inventory (type, location, blast radius, rotation recommendation) is written to `analysis/<system>/SECRETS.local.md`, which the commands gitignore before writing; on non-git projects the quarantine file goes to `~/.modernize/<system>/` instead. `/modernize-harden` splits its remediation diff so credential-removal hunks (which necessarily contain the raw value) land in a gitignored `security_remediation.local.patch`, never the shareable patch. Pass `--show-secrets` to include raw values in the quarantine file (and only there). If you ran an earlier version of this plugin on a real system, check whether `analysis/` artifacts containing credentials were committed or shared, and rotate anything that was.
## Commands
The commands are designed to be run in order, but each produces a standalone artifact so you can stop, review, and resume.
Environment readiness check, meant to run first: detects the legacy stack, checks analysis tooling, **smoke-compiles a real source file** with the legacy toolchain (the errors this surfaces — missing copybooks, wrong dialect flags — are the ones that otherwise appear mid-transform), inventories missing includes / deployment descriptors / binary-only artifacts, and probes for telemetry. Produces `analysis/<system>/PREFLIGHT.md` with a per-command Ready / Ready-with-gaps / Not-ready verdict.
### `/modernize-assess <system-dir>` — or — `/modernize-assess --portfolio <parent-dir>`
Inventory the legacy codebase: languages, line counts, complexity, build system, integrations, technical debt, security posture, documentation gaps, and a COCOMO-derived effort estimate. Produces `analysis/<system>/ASSESSMENT.md` and `analysis/<system>/ARCHITECTURE.mmd`. Spawns `legacy-analyst` (×2) and `security-auditor` in parallel for deep reads. With `--portfolio`, sweeps every subdirectory of a parent directory and writes a sequencing heat-map to `analysis/portfolio.html`.
### `/modernize-map <system-dir>`
Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis/<system>/topology.json` (machine-readable), `analysis/<system>/TOPOLOGY.html` (rendered Mermaid + architect observations), and standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`.

Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and 2–4 traced business flows each anchored to a persona (the claimant, the operator, the auditor — not the maintainer). Writes a re-runnable extraction script and produces `analysis/<system>/topology.json` plus `analysis/<system>/TOPOLOGY.html` — an **interactive zoomable map** (circle-pack of domains/modules sized by LOC, dependency edges with per-kind toggles, search, click-for-details sidebar, and a walkthrough mode that plays each persona flow as a numbered path with a plain-language narrative). Built from a template shipped with the plugin, so it works on systems far too dense for a static diagram. Small domain-level `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd` are still exported for docs and PRs.
Mine the business rules embedded in the legacy code — calculations, validations, eligibility, state transitions, policies — into Given/When/Then "Rule Cards" with `file:line` citations and confidence ratings. Spawns three `business-rules-extractor` agents in parallel (calculations, validations, lifecycle). Produces `analysis/<system>/BUSINESS_RULES.md` and `analysis/<system>/DATA_OBJECTS.md`.
Synthesize the discovery artifacts into a phased **Modernization Brief** — the single document a steering committee approves and engineering executes: target architecture, strangler-fig phase plan with entry/exit criteria, behavior contract, validation strategy, open questions, and an approval block. Reads `ASSESSMENT.md`, `TOPOLOGY.html`, and `BUSINESS_RULES.md` and **stops if any are missing** — run the discovery commands first. Produces `analysis/<system>/MODERNIZATION_BRIEF.md` and enters plan mode as a human-in-the-loop gate.
Synthesize the discovery artifacts into a phased **Modernization Brief** — the single document a steering committee approves and engineering executes: target architecture, strangler-fig phase plan with entry/exit criteria, persona-based business walkthroughs (the section non-technical approvers actually read), behavior contract, validation strategy, open questions, and an approval block. Reads `ASSESSMENT.md`, `TOPOLOGY.html`, and `BUSINESS_RULES.md` and **stops if any are missing** — run the discovery commands first. Produces `analysis/<system>/MODERNIZATION_BRIEF.md` and enters plan mode as a human-in-the-loop gate.
Greenfield rebuild from extracted intent rather than a structural port. Mines a spec (`analysis/<system>/AI_NATIVE_SPEC.md`), designs a target architecture and has it adversarially reviewed (`analysis/<system>/REIMAGINED_ARCHITECTURE.md`), then **scaffolds services with executable acceptance tests** under `modernized/<system>-reimagined/` and writes a `CLAUDE.md` knowledge handoff for the new system. Two human-in-the-loop checkpoints. Spawns `business-rules-extractor`, `legacy-analyst` (×2), `architecture-critic`, and general-purpose scaffolding agents.
@@ -46,6 +61,9 @@ Greenfield rebuild from extracted intent rather than a structural port. Mines a
Surgical, single-module strangler-fig rewrite. Plans first (HITL gate), then writes characterization tests via `test-engineer`, then an idiomatic target implementation under `modernized/<system>/<module>/`, proves equivalence by running the tests, and produces `TRANSFORMATION_NOTES.md` mapping legacy → modern with deliberate deviations called out. Reviewed by `architecture-critic`.
### `/modernize-status <system-dir>`
Read-only progress report: artifact inventory with timestamps per workflow stage, staleness flags (e.g. a brief older than the assessment it was built from), secrets-hygiene checks (quarantine file gitignored and never committed), and the single most useful next command. Run it anytime you come back to a modernization after a break.
### `/modernize-harden <system-dir>`
Security hardening pass on the **legacy** system: OWASP/CWE scan, dependency CVEs, secrets, injection. Spawns `security-auditor`. Produces `analysis/<system>/SECURITY_FINDINGS.md` ranked Critical / High / Medium / Low and a reviewed `analysis/<system>/security_remediation.patch` with minimal fixes for the Critical/High findings. The patch is reviewed by a second `security-auditor` pass before you see it. **Never edits `legacy/`** — you review and apply the patch yourself when ready, then re-run to verify. Useful as a pre-modernization step when the legacy system will keep running in production during the migration.
@@ -81,17 +99,21 @@ This plugin ships commands and agents, but modernization projects benefit from a
"Edit(modernized/**)"
],
"deny":[
"Edit(legacy/**)"
"Edit(legacy/**)",
"Write(legacy/**)"
]
}
}
```
Adjust `legacy/` and `modernized/` to match your actual layout. The key invariants: `Edit` under `legacy/`is denied, and writes are scoped to `analysis/` (for documents) and `modernized/` (for the new code). Every command in this plugin respects this — `/modernize-harden` writes a patch to `analysis/` rather than editing `legacy/` in place.
Adjust `legacy/` and `modernized/` to match your actual layout. The key invariants: `Edit`/`Write` under `legacy/`are denied, and writes are scoped to `analysis/` (for documents) and `modernized/` (for the new code). Note this guards the file tools — shell commands that mutate files (`sed -i`, `git apply`) still go through the normal Bash permission prompt, so review those prompts with the same invariant in mind. Every command in this plugin respects this — `/modernize-harden` writes a patch to `analysis/` rather than editing `legacy/` in place.
## Typical Workflow
```bash
# 0. Check the environment is ready (tools, toolchain, source completeness)
/modernize-preflight billing
# 1. Inventory the legacy system (or sweep a portfolio of them)
/modernize-assess billing
@@ -112,6 +134,9 @@ Adjust `legacy/` and `modernized/` to match your actual layout. The key invarian
# 6. Security-harden the legacy system that's still in production
description: Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
description: Guidance for distinctive, intentional visual design when building new UI or reshaping an existing one. Helps with aesthetic direction, typography, and making choices that don't read as templated defaults.
license: Complete terms in LICENSE.txt
---
This skill guides creation of distinctive, production-grade frontend interfaces that avoid generic "AI slop" aesthetics. Implement real working code with exceptional attention to aesthetic details and creative choices.
# Frontend Design
The user provides frontend requirements: a component, page, application, or interface to build. They may include context about the purpose, audience, or technical constraints.
Approach this as the design lead at a small studio known for giving every client a visual identity that could not be mistaken for anyone else's. This client has already rejected proposals that felt templated, and is paying for a distinctive point of view: make deliberate, opinionated choices about palette, typography, and layout that are specific to this brief, and take one real aesthetic risk you can justify.
## Design Thinking
## Ground it in the subject
Before coding, understand the context and commit to a BOLD aesthetic direction:
- **Purpose**: What problem does this interface solve? Who uses it?
- **Tone**: Pick an extreme: brutally minimal, maximalist chaos, retro-futuristic, organic/natural, luxury/refined, playful/toy-like, editorial/magazine, brutalist/raw, art deco/geometric, soft/pastel, industrial/utilitarian, etc. There are so many flavors to choose from. Use these for inspiration but design one that is true to the aesthetic direction.
- **Differentiation**: What makes this UNFORGETTABLE? What's the one thing someone will remember?
If the brief does not pin down what the product or subject is, pin it yourself before designing: name one concrete subject, its audience, and the page's single job, and state your choice. If there's any information in your memory about the human's preferences, context about what they're building, or designs you've made before – use that as a hint. The subject's own world, its materials, instruments, artifacts, and vernacular, is where distinctive choices come from. Build with the brief's real content and subject matter throughout.
**CRITICAL**: Choose a clear conceptual direction and execute it with precision. Bold maximalism and refined minimalism both work - the key is intentionality, not intensity.
## Design principles
Then implement working code (HTML/CSS/JS, React, Vue, etc.) that is:
- Production-grade and functional
- Visually striking and memorable
- Cohesive with a clear aesthetic point-of-view
- Meticulously refined in every detail
For web designs, the hero is a thesis. Open with the most characteristic thing in the subject's world, in whatever form makes sense for it: a headline, an image, an animation, a live demo, an interactive moment. Be deliberate with your choice: a big number with a small label, supporting stats, and a gradient accent is the template answer, only use if that's truly the best option.
## Frontend Aesthetics Guidelines
Typography carries the personality of the page. Pair the display and body faces deliberately, not the same families you would reach for on any other project, and set a clear type scale with intentional weights, widths, and spacing. Make the type treatment itself a memorable part of the design, not a neutral delivery vehicle for the content.
Focus on:
- **Typography**: Choose fonts that are beautiful, unique, and interesting. Avoid generic fonts like Arial and Inter; opt instead for distinctive choices that elevate the frontend's aesthetics; unexpected, characterful font choices. Pair a distinctive display font with a refined body font.
- **Color & Theme**: Commit to a cohesive aesthetic. Use CSS variables for consistency. Dominant colors with sharp accents outperform timid, evenly-distributed palettes.
- **Motion**: Use animations for effects and micro-interactions. Prioritize CSS-only solutions for HTML. Use Motion library for React when available. Focus on high-impact moments: one well-orchestrated page load with staggered reveals (animation-delay) creates more delight than scattered micro-interactions. Use scroll-triggering and hover states that surprise.
- **Spatial Composition**: Unexpected layouts. Asymmetry. Overlap. Diagonal flow. Grid-breaking elements. Generous negative space OR controlled density.
- **Backgrounds & Visual Details**: Create atmosphere and depth rather than defaulting to solid colors. Add contextual effects and textures that match the overall aesthetic. Apply creative forms like gradient meshes, noise textures, geometric patterns, layered transparencies, dramatic shadows, decorative borders, custom cursors, and grain overlays.
Structure is information. Structural devices, numbering, eyebrows, dividers, labels, should encode something true about the content, not decorate it. Many generic designs use numbered markers (01 / 02 / 03), but that's only appropriate if the content actually is a sequence - like a real process or a typed timeline where order carries information the reader needs. Question if choices like numbered markers actually make sense before incorporating them.
NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white backgrounds), predictable layouts and component patterns, and cookie-cutter design that lacks context-specific character.
Leverage motion deliberately. Think about where and if animation can serve the subject: a page-load sequence, a scroll-triggered reveal, hover micro-interactions, ambient atmosphere. An orchestrated moment usually lands harder than scattered effects; choose what the direction calls for. However, sometimes less is more, and extra animation contributes to the feeling that the design is AI-generated.
Interpret creatively and make unexpected choices that feel genuinely designed for the context. No design should be the same. Vary between light and dark themes, different fonts, different aesthetics. NEVER converge on common choices (Space Grotesk, for example) across generations.
Match complexity to the vision. Maximalist directions need elaborate execution; minimal directions need precision in spacing, type, and detail. Elegance is executing the chosen vision well.
**IMPORTANT**: Match implementation complexity to the aesthetic vision. Maximalist designs need elaborate code with extensive animations and effects. Minimalist or refined designs need restraint, precision, and careful attention to spacing, typography, and subtle details. Elegance comes from executing the vision well.
Consider written content carefully. Often a design brief may not contain real content, and it's up to you to come up with copy. Copy can make a design feel as templated as the design itself. See the below section on writing for more guidance.
Remember: Claude is capable of extraordinary creative work. Don't hold back, show what can truly be created when thinking outside the box and committing fully to a distinctive vision.
## Process: brainstorm, explore, plan, critique, build, critique again
For calibration: AI-generated design right now clusters around three looks: (1) a warm cream background (near #F4F1EA) with a high-contrast serif display and a terracotta accent; (2) a near-black background with a single bright acid-green or vermilion accent; (3) a broadsheet-style layout with hairline rules, zero border-radius, and dense newspaper-like columns. All three are legitimate for some briefs, but they are defaults rather than choices, and they appear regardless of subject. Where the brief pins down a visual direction, follow it exactly — the brief's own words always win, including when it asks for one of these looks. Where it leaves an axis free, don't spend that freedom on one of these defaults. Just like a human designer who's hired, there's often a careful balance between doing what you're good at and taking each project as a chance to experiment and learn.
Work in two passes. First, brainstorm a short design plan based on the human's design brief: create a compact token system with color, type, layout, and signature. Color: describe the palette as 4–6 named hex values. Type: the typefaces for 2+ roles (a characterful display face that's used with restraint, a complementary body face, and a utility face for captions or data if needed). Layout: a layout concept, using one-sentence prose descriptions and ASCII wireframes to ideate and compare. Signature: the single unique element this page will be remembered by that embodies the brief in an appropriate way.
Then review that plan against the brief before building: if any part of it reads like the generic default you would produce for any similar page (work through a similar prompt to see if you arrive somewhere similar) rather than a choice made for this specific brief — revise that part, say what you changed and why. Only after you've confirmed the relative uniqueness of your design plan should you start to write the code, following the revised plan exactly and deriving every color and type decision from it.
When writing the code, be careful of structuring your CSS selector specificities. It's easy to generate CSS classes that cancel each other out (especially with a type-based selector like .section and a element-based selector like .cta). This can happen often with paddings/margins between sections.
Try to do a lot of this planning and iteration in your thinking, and only show ideas to the user when you have higher confidence it'll delight them.
## Restraint and self-critique
Spend your boldness in one place. Let the signature element be the one memorable thing, keep everything around it quiet and disciplined, and cut any decoration that does not serve the brief. Not taking a risk can be a risk itself! Build to a quality floor without announcing it: responsive down to mobile, visible keyboard focus, reduced motion respected. Critique your own work as you build, taking screenshots if your environment supports it – a picture is worth 1000 tokens. Consider Chanel's advice: before leaving the house, take a look in the mirror and remove one accessory. Human creators have memory and always try to do something new, so if you have a space to quickly jot down notes about what you've tried, it can help you in future passes.
## More on writing in design
Words appear in a design for one reason: to make it easier to understand, and therefore easier to use. They are design material, not decoration. Bring the same intentionality to copy that you would bring to spacing and color. Before writing anything, ask what the design needs to say, and how it can best be said to help the person navigate the experience.
Write from the end user's side of the screen. Name things by what people control and recognize, never by how the system is built. A person manages notifications, not webhook config. Describe what something does in plain terms rather than selling it. Being specific is always better than being clever.
Use active voice as default. A control should say exactly what happens when it's used: "Save changes," not "Submit." An action keeps the same name through the whole flow, so the button that says "Publish" produces a toast that says "Published." The vocabulary of an interface is the signposting for someone navigating the product. Cohesion and consistency are how people learn their way around.
Treat failure and emptiness as moments for direction, not mood. Explain what went wrong and how to fix it, in the interface's voice rather than a person's. Errors don't apologize, and they are never vague about what happened. An empty screen is an invitation to act.
Keep the register conversational and tuned: plain verbs, sentence case, no filler, with tone matched to the brand and the audience. Let each element do exactly one job. A label labels, an example demonstrates, and nothing quietly does double duty.
"description":"Security review for Claude-generated code. Pattern-based warnings on edits, LLM-powered diff review on Stop, and an agentic commit reviewer that catches injection, XSS, SSRF, hardcoded secrets, and 25+ other vulnerability classes.",
# `pip install --target` fallback (ensure_agent_sdk BUILT_TARGET, used
# when venv can't bootstrap pip): a FLAT layout — packages sit directly
# in agent-sdk-libs/, not under a site-packages subdir. See #2154
# follow-up. The pywin32 .pth bootstrap below applies here too (target
# installs don't process .pth at runtime, same as a manual venv insert).
+[os.path.join(state_dir,"agent-sdk-libs")]
)
added=False
forspincandidates:
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.