Compare commits

65 Commits

Author SHA1 Message Date
Jesse Vincent
7cefe7498c Add workflow checkpoints and scaling to SDD, refine handoffs
- brainstorming: check in with user before transitioning to writing-plans
- writing-plans: structured execution handoff (record context, advise
  compaction, give exact continuation prompt with subagent detection)
- SDD: add scaling paragraph with GATE + orchestrator boundary, graphviz
  decision diamonds for review elision, check-in before finishing,
  replace TodoWrite with generic task list language, tighten Red Flags
2026-02-21 08:54:23 -08:00
Jesse Vincent
41512d6f3a Add GATE marker pattern and flowchart-as-gate guidance to writing-skills
Two small additions based on tested findings:
- Flowchart Usage: document that decision diamonds in process flows act as
  behavioral enforcement mechanisms (citing 2/5 → 5/5 compliance evidence)
- Bulletproofing: add GATE marker technique for non-optional decision points

Tested 3/3: workers correctly retrieve and apply both techniques.
2026-02-20 14:28:37 -08:00
Jesse Vincent
073efbaa8e Add scaling guidance and review-loop gate to writing-plans skill
Minimal additions: one paragraph in Overview about scaling effort to task,
a graphviz process flow diagram with the review-loop gate as an explicit
decision point, and GATE wording requiring permission before eliding the
review loop. Tested 5/5 workers now explicitly ask before skipping review.
2026-02-20 14:28:37 -08:00
Jesse Vincent
1d63b880fb Scale brainstorming skill effort to task complexity
Targeted edits to original skill (not a rewrite): scaling paragraph in
Overview, hard gate reframed around confirmed understanding, anti-pattern
refocused on understanding vs ceremony, scalable checklist with GATE
wording, graphviz with decision diamonds at gates, and design doc marked
as 'when warranted'. CSO fix removes workflow summary from description.
Tested 5/5: all confirm understanding, none create rigid 6-task checklist.
2026-02-20 14:26:50 -08:00
Drew Ritter
bf6d336950 minor change to unload a page from viz brainstorm after the user makes a decision 2026-02-19 16:53:30 -08:00
Drew Ritter
ce0f9a28be Refactor visual brainstorming: browser displays, terminal commands (#509)
* Refactor visual brainstorming: browser displays, terminal commands

Replaces the blocking TaskOutput/wait-for-feedback.sh pattern with a
non-blocking model where the browser is an interactive display and the
terminal stays available for conversation.

Server changes:
- Write user click events to .events JSONL file (per-screen, cleared on
  new screen push) so Claude reads them on its next turn
- Replace regex-based wrapInFrame with <!-- CONTENT --> placeholder
- Add --foreground flag to start-server.sh for sandbox environments
- Harden startup with nohup/disown and liveness check

UI changes:
- Remove feedback footer (textarea + Send button)
- Add selection indicator bar ("Option X selected — return to terminal")
- Narrow click handler to [data-choice] elements only

Skill changes:
- Rewrite visual-companion.md for non-blocking loop
- Fix visual companion being skipped on Codex (no browser tools needed)
- Make visual companion offer a standalone question (one question rule)

Deletes wait-for-feedback.sh entirely.

* Add visual companion offer to brainstorming checklist for UX topics

The visual companion was a disconnected section at the bottom of SKILL.md
that agents never reached because it wasn't in the mandatory checklist.
Now step 2 evaluates whether the topic involves visual/UX decisions and
offers the companion if so. Non-visual topics (APIs, data models, etc.)
skip the step entirely.

* Add multi-select support to visual companion

Containers with data-multiselect allow toggling multiple selections.
Without it, behavior is unchanged (single-select). Indicator bar shows
count when multiple items are selected.
2026-02-19 16:31:51 -08:00
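
The non-blocking flow described in this commit (click events appended to a per-screen `.events` JSONL file, cleared on each new screen push, read on the next turn) might look roughly like the sketch below. The function names, the event fields, and the temp-directory layout are assumptions made for illustration; the commit message does not show the server's actual code.

```javascript
// Sketch only: names and event shape are hypothetical, not the project's API.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Append one click event as a single JSON line (JSONL).
function appendEvent(eventsFile, event) {
  fs.appendFileSync(eventsFile, JSON.stringify(event) + '\n');
}

// Clear recorded events when a new screen is pushed, so stale clicks are not read.
function clearEvents(eventsFile) {
  fs.rmSync(eventsFile, { force: true });
}

// Read every event recorded since the last screen push.
function readEvents(eventsFile) {
  if (!fs.existsSync(eventsFile)) return [];
  return fs.readFileSync(eventsFile, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Usage: two clicks on one screen, then a new screen push.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'brainstorm-sketch-'));
const eventsFile = path.join(dir, '.events');
appendEvent(eventsFile, { type: 'click', choice: 'a' });
appendEvent(eventsFile, { type: 'click', choice: 'b' });
const beforePush = readEvents(eventsFile);
clearEvents(eventsFile);
const afterPush = readEvents(eventsFile);
console.log(beforePush.length, afterPush.length);
```

Because the reader only inspects the file when it is asked to, nothing blocks the terminal while the user is clicking in the browser.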
Drew Ritter
3a254ba002 Remove session scope language from SUBAGENT-STOP
Drop 'top-level sessions only' which may cause the model to treat
using-superpowers as a first-turn-only skill, breaking skill chaining
on follow-up turns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Drew Ritter
192cb7db8e Soften SUBAGENT-STOP to only skip this skill, not suppress all behavior
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Drew Ritter
a014baf5b9 Remove skill override directives from dispatch templates
Let the SUBAGENT-STOP gate in using-superpowers handle skill leakage
instead of per-template directives. This avoids blocking non-superpowers
skills that users may want subagents to use.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Drew Ritter
49bbad7084 Move Codex tool mapping to progressive disclosure reference file
Addresses Jesse's review feedback on PR #450:
- Move inline routing table from using-superpowers to references/codex-tools.md,
  leveraging Codex's native progressive disclosure for companion files
- Narrow SUBAGENT-STOP from "Do not invoke skills" to "Do not invoke
  superpowers skills" so subagents can still use non-superpowers skills

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Drew Ritter
79dbce8b10 Sharpen subagent skill override directive in dispatch templates
Replace verbose scope explanations with a direct override statement
that explicitly claims priority over prior guidance (i.e. the
using-superpowers 1% rule injected by Codex's skill discovery).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Drew Ritter
2e12d4fb34 Document collab feature requirement for Codex subagent skills
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Drew Ritter
5535e2d0b3 Prevent Codex subagent skill leakage via gate check and dispatch routing table
Codex subagents inherit filesystem access and can discover superpowers skills
via native discovery. Without guidance, they activate the 1% rule and invoke
full skill workflows instead of executing their assigned task.

- Add SUBAGENT-STOP gate check above the 1% rule in using-superpowers
- Add Codex dispatch routing table (spawn_agent/wait/close_agent)
- Add scope directives to all 4 subagent dispatch templates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
7400d43599 fix: restore polyglot wrapper to fix Windows hook window spawning
Claude Code spawns hook commands with shell:true + windowsHide:true,
but on Windows the execution chain cmd.exe -> bash.exe causes Git
Bash (MSYS2) to allocate its own console window, bypassing the hide
flag. This creates visible terminal windows that steal focus on every
SessionStart event (startup, resume, clear, compact).

The fix:
- Rename session-start.sh to session-start (no extension) so Claude
  Code's .sh auto-detection regex doesn't fire and prepend "bash"
- Restore run-hook.cmd polyglot wrapper to control bash invocation
  on Windows (tries known Git Bash paths, then PATH, then exits
  silently if no bash found)
- On Unix, the polyglot's shell portion runs the script directly

This avoids Claude Code's broken .sh auto-prepend, gives us control
over how bash is invoked on Windows, and gracefully handles missing
bash instead of erroring.

Addresses: #440, #414, #354, #417, #293
Upstream: anthropics/claude-code#14828
2026-02-19 09:35:45 -08:00
Jesse Vincent
cb75011d91 fix: move scope assessment into understanding phase
Testing showed the model skipped scope assessment when it was a
separate step after "Understanding the idea." Inlining it as the
first thing in understanding ensures it fires before detailed questions.
2026-02-19 09:35:45 -08:00
Jesse Vincent
56e22b2afc feat: add project-level scope assessment to brainstorming pipeline
Brainstorming now assesses whether a project is too large for a single
spec and helps decompose into sub-projects. Spec reviewer checks scope.
Writing-plans has a backstop if brainstorming missed it.
2026-02-19 09:35:45 -08:00
Jesse Vincent
a0dc300a6e feat: add architecture and file size checks to review loops
Spec reviewer now checks for unit decomposition with clear boundaries.
Plan reviewer now checks file structure and whether files will grow
too large to reason about.
2026-02-19 09:35:45 -08:00
Jesse Vincent
1e3353b232 feat: add file growth check to code quality reviewer
Focus on whether this implementation grew or created large files,
not pre-existing file sizes in brownfield codebases.
2026-02-19 09:35:45 -08:00
Jesse Vincent
21c7284d54 fix: address review feedback on architecture guidance
- Define DONE_WITH_CONCERNS handling in SDD controller flow
- Make implementer action explicit when file grows beyond plan intent
- Reword writing-plans file size reasoning (avoid tooling-artifact language)
- Add decomposition awareness to code quality reviewer prompt
2026-02-19 09:35:45 -08:00
Jesse Vincent
8d996aa829 feat: add architecture guidance and capability-aware escalation to skills
Brainstorming: design-for-isolation guidance and brownfield codebase awareness
Writing-plans: file structure section requiring decomposition before task definition
Implementer prompt: code organization awareness, structured escalation protocol
  (DONE/DONE_WITH_CONCERNS/BLOCKED/NEEDS_CONTEXT), explicit permission to stop
Subagent-driven-development: provider-agnostic model selection tiers, escalation handling
2026-02-19 09:35:45 -08:00
Jesse Vincent
14df703a51 chore: gitignore triage directory 2026-02-19 09:35:45 -08:00
Jesse Vincent
17f80fd0ce docs: add brainstorm visual companion improvements to release notes 2026-02-19 09:35:45 -08:00
Jesse Vincent
2716e12781 docs: restructure brainstorming skill with progressive disclosure
SKILL.md is now minimal: process, principles, and a prompt that notes
the visual companion is new/token-intensive/slow. All visual companion
details move to visual-companion.md as a progressive disclosure document
read only when the user opts in.

Delete CLAUDE-INSTRUCTIONS.md (content folded into visual-companion.md).
Document fragment vs full-document behavior and --project-dir persistence.
2026-02-19 09:35:45 -08:00
Jesse Vincent
b52af8427c feat: persist brainstorm mockups to .superpowers/ directory
start-server.sh now accepts --project-dir to store session files under
.superpowers/brainstorm/ instead of /tmp. stop-server.sh only deletes
ephemeral /tmp sessions, keeping persistent ones for later review.

Fix test race condition with polling-based server startup wait.
2026-02-19 09:35:45 -08:00
Jesse Vincent
5b00c6eb50 refactor: server-side frame wrapping and helper.js consolidation
Move toggleSelect/send/selectedChoice from frame-template.html inline
script to helper.js so they're auto-injected. Server now detects bare
HTML fragments (no DOCTYPE/html tag) and wraps them in the frame
template automatically. Full documents pass through as before.

Fix dark mode in sendToClaude confirmation (was using hardcoded colors).
Fix test env var bug (BRAINSTORM_SCREEN -> BRAINSTORM_DIR).
Add tests for fragment wrapping, full doc passthrough, and helper.js.
2026-02-19 09:35:45 -08:00
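
A minimal sketch of the fragment detection described in this commit: content without a DOCTYPE or `html` tag is treated as a fragment and wrapped, while full documents pass through. The helper names and the inline template are hypothetical, and the sketch borrows the `<!-- CONTENT -->` placeholder that a later commit in this list introduces; at the time of this commit the wrapping was regex-based.

```javascript
// Sketch only: FRAME_TEMPLATE stands in for frame-template.html.
const FRAME_TEMPLATE =
  '<!DOCTYPE html><html><body><div id="claude-content"><!-- CONTENT --></div></body></html>';

// A full document starts with a DOCTYPE or an <html> tag.
function isFullDocument(html) {
  const head = html.trimStart().toLowerCase();
  return head.startsWith('<!doctype') || head.startsWith('<html');
}

// Wrap bare fragments in the frame; leave full documents untouched.
function wrapIfFragment(html, template = FRAME_TEMPLATE) {
  if (isFullDocument(html)) return html;
  // A replacer function keeps "$" sequences in the content literal.
  return template.replace('<!-- CONTENT -->', () => html);
}

const wrapped = wrapIfFragment('<h2>Pick a layout</h2>');
const passthrough = wrapIfFragment('<!DOCTYPE html><html><body>Full</body></html>');
console.log(wrapped.includes('<h2>Pick a layout</h2>'), passthrough.startsWith('<!DOCTYPE'));
```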
Jesse Vincent
7398af9947 test: rewrite document review test as proper integration test
- Creates test project with spec containing intentional errors
- Runs Claude to actually review using spec-document-reviewer template
- Verifies reviewer catches TODO and "specified later" deferrals
- Checks review format and verdict

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
6e6b1ae546 test: add end-to-end tests for document review system
Tests verify:
- Spec document reviewer checks (completeness, TODOs)
- Plan document reviewer checks (spec alignment, task decomposition)
- Review loops exist in brainstorming and writing-plans skills
- Chunk-by-chunk review for plans with 1000-line limit
- Iteration guidance (5 iterations, escalate to human)
- Checkbox syntax on steps only (not task headings)
- Correct directories (docs/superpowers/specs, docs/superpowers/plans)
- Reviewers are advisory
- Same agent fixes issues (preserves context)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
8d73d0fee2 fix: remove checkbox from task headings, keep on steps only
The `- [ ] ### Task N:` syntax was unusual and might not render
correctly in all markdown parsers. Now only steps have checkboxes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
c9f5a393da docs: add document review system spec and plan
- Spec: docs/superpowers/specs/2026-01-22-document-review-system-design.md
- Plan: docs/superpowers/plans/2026-01-22-document-review-system.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
a3d45dc32b docs: update plan header to reference checkbox syntax
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
21673839c0 feat: add plan review loop and checkbox syntax to writing-plans skill
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
800d6a2405 feat: add plan document reviewer prompt template
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
1473d86800 feat: add spec review loop to brainstorming skill
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
bdfa38509d feat: add spec document reviewer prompt template
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
fd90ed5dd9 feat: enforce subagent-driven-development on capable harnesses
- Subagent-driven-development is now mandatory when harness supports it
- No longer offer choice between subagent-driven and executing-plans
- Executing-plans reserved for harnesses without subagent capability
- Update plan header to reference both execution paths

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
fbcf9475e4 feat: enforce brainstorming → writing-plans transition
- Make writing-plans REQUIRED after design approval
- Explicitly forbid platform planning features (EnterPlanMode, etc.)
- Forbid direct implementation without writing-plans skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
14c17242cc refactor: restructure specs and plans directories
- Specs (brainstorming output) now go to docs/superpowers/specs/
- Plans (writing-plans output) now go to docs/superpowers/plans/
- User preferences for locations override these defaults
- Update all skill references and test files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
786aa5cf44 fix: Windows hook execution for Claude Code 2.1.x (#331)
* fix: convert shell scripts from CRLF to LF line endings

Add .gitattributes to enforce LF line endings for shell scripts,
preventing bash errors like "/usr/bin/bash: line 1: : command not found"
when scripts are checked out on Windows with CRLF.

Fixes #317 (SessionStart hook fails due to CRLF line endings)

Files converted:
- hooks/session-start.sh
- lib/brainstorm-server/start-server.sh
- lib/brainstorm-server/stop-server.sh
- lib/brainstorm-server/wait-for-feedback.sh
- skills/systematic-debugging/find-polluter.sh

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: update Windows hook execution for Claude Code 2.1.x

Claude Code 2.1.x changed the Windows execution model: it now auto-detects
.sh files in hook commands and prepends "bash " automatically. This broke
the polyglot wrapper because:

  Before: "run-hook.cmd" session-start.sh  (wrapper executes)
  After:  bash "run-hook.cmd" session-start.sh  (bash can't run .cmd)

Changes:
- hooks.json now calls session-start.sh directly (Claude Code handles bash)
- Added deprecation comment to run-hook.cmd explaining the change
- Updated RELEASE-NOTES.md

Fixes #317, #313, #275, #292

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-19 09:35:45 -08:00
Jesse Vincent
8ab3eec9bc feat(opencode): use native skills and fix agent reset bug (#226) (#330)
* fix use_skill agent context (#290)

* fix: respect OPENCODE_CONFIG_DIR for personal skills lookup (#297)

* fix: respect OPENCODE_CONFIG_DIR for personal skills lookup

The plugin was hardcoded to look for personal skills in ~/.config/opencode/skills,
ignoring users who set OPENCODE_CONFIG_DIR to a custom path (e.g., for dotfiles management).

Now uses OPENCODE_CONFIG_DIR if set, falling back to the default path.

* fix: update help text to use dynamic paths

Use configDir and personalSkillsDir variables in help text so paths
are accurate when OPENCODE_CONFIG_DIR is set.

* fix: normalize OPENCODE_CONFIG_DIR before use

Handle edge cases where the env var might be:
- Empty or whitespace-only
- Using ~ for home directory (common in .env files)
- A relative path

Now trims, expands ~, and resolves to absolute path.

* feat(opencode): use native skills and fix agent reset bug (#226)

- Replace custom use_skill/find_skills tools with OpenCode's native skill tool
- Use experimental.chat.system.transform hook instead of session.prompt
  (fixes #226 agent reset on first message)
- Symlink skills directory into ~/.config/opencode/skills/superpowers/
- Update installation docs with comprehensive Windows support:
  - Command Prompt, PowerShell, and Git Bash instructions
  - Proper symlink vs junction handling
  - Reinstall safety with cleanup steps
  - Verification commands for each shell

* Add OpenCode native skills changes to release notes

Documents:
- Breaking change: switch to native skill tool
- Fix for agent reset bug (#226)
- Fix for Windows installation (#232)

---------

Co-authored-by: Vinicius da Motta <viniciusmotta8@gmail.com>
Co-authored-by: oribi <oribarilan@gmail.com>
2026-02-19 09:35:45 -08:00
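
The `OPENCODE_CONFIG_DIR` normalization described in this squashed commit (trim, expand `~`, resolve to an absolute path, fall back to the default when unset or blank) can be sketched as follows. `resolveConfigDir` and its `homeDir` parameter are hypothetical names chosen for illustration; the plugin's real implementation is not shown in the commit message.

```javascript
// Sketch only: helper name and signature are assumptions.
const os = require('node:os');
const path = require('node:path');

function resolveConfigDir(envValue, homeDir = os.homedir()) {
  // Default location named in the commit message: ~/.config/opencode
  const fallback = path.join(homeDir, '.config', 'opencode');
  const trimmed = (envValue ?? '').trim();
  if (trimmed === '') return fallback; // unset, empty, or whitespace-only

  let expanded = trimmed;
  if (trimmed === '~') {
    expanded = homeDir;
  } else if (trimmed.startsWith('~/')) {
    expanded = path.join(homeDir, trimmed.slice(2));
  }
  // Relative paths are resolved against the current working directory.
  return path.resolve(expanded);
}

const configDir = resolveConfigDir(process.env.OPENCODE_CONFIG_DIR);
const personalSkillsDir = path.join(configDir, 'skills');
console.log(personalSkillsDir);
```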
Jesse Vincent
630c0f7b54 Add instruction priority hierarchy to using-superpowers skill
Clarifies that user instructions (CLAUDE.md, direct requests) always
take precedence over Superpowers skills, which in turn override
default system prompt behavior. Ensures users remain in control.

Also updates RELEASE-NOTES.md with unreleased changes including
the visual companion feature.
2026-02-19 09:35:45 -08:00
Jesse Vincent
b3e922c10a Use semantic filenames for visual companion screens
Server now watches directory for new .html files instead of a single
screen file. Claude writes to semantically named files like
platform.html, style.html, layout.html - each screen is a new file.

Benefits:
- No need to read before write (files are always new)
- Semantic filenames describe what's on screen
- History preserved in directory for debugging
- Server serves newest file by mtime automatically

Updated: index.js, start-server.sh, and all documentation.
2026-02-19 09:35:44 -08:00
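
Serving "the newest file by mtime", as this commit puts it, could be done along the lines below. This is a sketch under stated assumptions: `newestScreen` is a hypothetical helper, and the real server watches the directory rather than scanning it on demand.

```javascript
// Sketch only: helper name and directory layout are illustrative.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Return the most recently modified .html file in dir, or null if there is none.
function newestScreen(dir) {
  const screens = fs.readdirSync(dir)
    .filter((name) => name.endsWith('.html'))
    .map((name) => {
      const file = path.join(dir, name);
      return { file, mtimeMs: fs.statSync(file).mtimeMs };
    })
    .sort((a, b) => b.mtimeMs - a.mtimeMs);
  return screens.length > 0 ? screens[0].file : null;
}

// Usage: two semantically named screens written in sequence.
const screenDir = fs.mkdtempSync(path.join(os.tmpdir(), 'screens-sketch-'));
fs.writeFileSync(path.join(screenDir, 'platform.html'), '<h2>Platform</h2>');
fs.writeFileSync(path.join(screenDir, 'style.html'), '<h2>Style</h2>');
// Set explicit mtimes so the example does not depend on timestamp resolution.
fs.utimesSync(path.join(screenDir, 'platform.html'), new Date(1000), new Date(1000));
fs.utimesSync(path.join(screenDir, 'style.html'), new Date(2000), new Date(2000));
console.log(path.basename(newestScreen(screenDir)));
```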
Jesse Vincent
e5a7c05528 docs: improve terminal UX for visual companion
- Never use cat/heredoc for HTML (dumps noise into terminal)
- Read screen_file first before Write tool to avoid errors
- Remind user of URL on every step, not just first
- Give text summary of what's on screen before they look
2026-02-19 09:35:44 -08:00
Jesse Vincent
9ab744b2a9 refactor: simplify visual companion workflow, improve guidance
Scripts:
- Rename show-and-wait.sh -> wait-for-feedback.sh (just waits, no HTML piping)
- Remove wait-for-event.sh (used hanging tail -f)
- Workflow now: Write tool for HTML, wait-for-feedback.sh to block

Documentation rewrite:
- Broader "when to use" (UI, architecture, complex choices, spatial)
- Always ask user first before starting
- Scale fidelity to the question being asked
- Explain the question on each page
- Iterate before moving on - validate changes address feedback
- Use real content (Unsplash images) when it matters
2026-02-19 09:35:44 -08:00
Jesse Vincent
745ea8c71c feat: add show-and-wait.sh helper, fix race condition
- New show-and-wait.sh combines write + wait into one command
- Uses polling instead of tail -f (which hangs on macOS)
- Docs updated: start watcher BEFORE writing screen to avoid race
- Reduces terminal noise by consolidating operations
2026-02-19 09:35:44 -08:00
Jesse Vincent
d494f5a483 fix: session isolation and blocking wait for visual companion
- Each session gets unique temp directory (/tmp/brainstorm-{pid}-{timestamp})
- Server outputs screen_dir and screen_file in startup JSON
- stop-server.sh takes screen_dir arg and cleans up session directory
- Document blocking TaskOutput pattern: 10-min timeouts, retry up to 3x,
  then prompt user "let me know when you want to continue"
2026-02-19 09:35:44 -08:00
Jesse Vincent
ac3af07af0 feat: add visual companion for brainstorming skill
Adds browser-based mockup display to replace ASCII art during
brainstorming sessions. Key components:

- Frame template with OS-aware light/dark theming
- CSS helpers for options, cards, mockups, split views
- Server lifecycle scripts (start/stop with random high port)
- Event watcher using tail+grep for feedback loop
- Claude instructions for using the visual companion

The skill now asks users if they want browser mockups and only
runs in Claude Code environments.
2026-02-19 09:35:44 -08:00
Jesse Vincent
fd51e8a6b7 feat: add sendToClaude helper and wait-for-event tool
- Add sendToClaude() function to browser helper that shows confirmation
- Add wait-for-event.sh script for watching server output (tail -f | grep -m 1)
- Enables clean event-driven loop: background bash waits for event, completion triggers Claude's turn
2026-02-19 09:35:44 -08:00
Jesse Vincent
7c1ac86878 docs: add visual brainstorming implementation plan 2026-02-19 09:35:44 -08:00
Jesse Vincent
91fbffc994 fix: preserve original event type, use source field for wrapper 2026-02-19 09:35:44 -08:00
Jesse Vincent
3bcf5db8a1 fix: correct visual companion documentation issues 2026-02-19 09:35:44 -08:00
Jesse Vincent
ec46999224 feat: add visual companion to brainstorming skill 2026-02-19 09:35:44 -08:00
Jesse Vincent
e76ffffc1d test: add brainstorm server integration tests 2026-02-19 09:35:44 -08:00
Jesse Vincent
d20bdb8ec4 fix: ensure user-event type is preserved in WebSocket message output
The spread operator order was causing incoming event types to overwrite
the user-event type marker.
2026-02-19 09:35:44 -08:00
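
The spread-order bug this commit describes is easy to reproduce in isolation. In the sketch below the message fields other than `type` are made up for illustration; only the ordering behavior is the point.

```javascript
// Sketch only: the incoming message shape is hypothetical.
const incoming = { type: 'click', choice: 'a' };

// Buggy order: the spread comes last, so incoming.type overwrites the marker.
const buggy = { type: 'user-event', ...incoming };

// Fixed order: spread first, then set the marker so it always wins.
const fixed = { ...incoming, type: 'user-event' };

console.log(buggy.type); // click
console.log(fixed.type); // user-event
```

In an object literal, later properties override earlier ones with the same key, which is why moving the marker after the spread is sufficient.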
Jesse Vincent
fc10bf6586 feat: add browser helper library for event capture 2026-02-19 09:35:44 -08:00
Jesse Vincent
91debf6100 feat: add brainstorm server foundation
Create the initial server for the visual brainstorming companion:
- Express server with WebSocket support for browser communication
- File watcher (chokidar) to detect screen.html changes
- Auto-injects helper.js into served HTML for event capture
- Binds to localhost only (127.0.0.1) for security
- Outputs JSON events to stdout for Claude consumption
2026-02-19 09:35:44 -08:00
Drew Ritter
a0b9ecce2b update 'Verify Installation' section
Updates the 'Verify Installation' section with revised instructions.
2026-02-17 11:46:28 -08:00
ericzakariasson
772ec9f834 Add Cursor plugin manifest and hook response compatibility
Enable native Cursor plugin discovery with a .cursor-plugin manifest, and make the SessionStart hook emit both Cursor and Claude response shapes so context injection works across both platforms. Document Cursor install usage in the README while keeping Claude-first wording.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-17 11:42:34 -08:00
Jesse Vincent
e16d611eee Release v4.3.0: Enforce brainstorming workflow, prevent unintended plan mode 2026-02-12 11:03:32 -08:00
Jesse Vincent
b7cad76134 Merge pull request #462 from obra/enforce-brainstorming-workflow
Enforce brainstorming workflow with hard gates and process flow
2026-02-12 11:01:55 -08:00
Jesse Vincent
4c836817da Make SessionStart hook synchronous so using-superpowers loads before first turn
When async is true, the hook may not complete before the model starts
responding, meaning the using-superpowers skill instructions aren't
in context for the first message.
2026-02-12 10:57:41 -08:00
Jesse Vincent
7f2ee614b6 Enforce brainstorming workflow with hard gates and process flow
The brainstorming skill described a process but didn't enforce it. Models
would skip the design phase and jump straight to implementation skills
like frontend-design, or collapse the entire brainstorming process into
a single text block.

Changes to brainstorming skill:
- Add HARD-GATE: no implementation until design is approved
- Add explicit checklist that maps to task items
- Add graphviz process flow with writing-plans as terminal state
- Add anti-pattern callout for "too simple to need a design"
- Scale design sections by section complexity, not project complexity
- Make writing-plans the only valid next skill after brainstorming

Changes to using-superpowers skill:
- Add EnterPlanMode intercept to workflow graph
- Route plan mode attempts through brainstorming skill instead

Tested with claude -p --plugin-dir across three variants (no skill,
original skill, updated skill) to verify behavioral compliance.
2026-02-12 10:51:12 -08:00
Jesse Vincent
b97b5f228d Merge pull request #457 from ColtWindy/fix/writing-plans-nested-code-fence
fix(writing-plans): use 4-backtick fence for nested code blocks in Task Structure template
2026-02-12 08:21:59 -08:00
Jesse Vincent
93c8966cab Merge pull request #452 from heliusjing/fix/add-verbose-flag-for-stream-json
Fix: add --verbose flag for stream-json output in SDD test runner
2026-02-12 08:21:09 -08:00
coltwindy
19df3db59b fix(writing-plans): use 4-backtick fence for nested code blocks in Task Structure template 2026-02-12 12:40:35 +09:00
chengfei.jin
f8cf545bc5 Fix stream-json output requiring --verbose flag
Claude CLI now requires --verbose when using --output-format stream-json
with -p (print mode). Without it, the test fails with:
"Error: When using --print, --output-format=stream-json requires --verbose"
2026-02-11 15:34:35 +08:00
29 changed files with 1293 additions and 941 deletions

@@ -9,7 +9,7 @@
{
"name": "superpowers",
"description": "Core skills library for Claude Code: TDD, debugging, collaboration patterns, and proven techniques",
-"version": "4.2.0",
+"version": "4.3.0",
"source": "./",
"author": {
"name": "Jesse Vincent",

@@ -1,7 +1,7 @@
{
"name": "superpowers",
"description": "Core skills library for Claude Code: TDD, debugging, collaboration patterns, and proven techniques",
-"version": "4.2.0",
+"version": "4.3.0",
"author": {
"name": "Jesse Vincent",
"email": "jesse@fsck.com"

@@ -0,0 +1,18 @@
{
"name": "superpowers",
"displayName": "Superpowers",
"description": "Core skills library: TDD, debugging, collaboration patterns, and proven techniques",
"version": "4.3.0",
"author": {
"name": "Jesse Vincent",
"email": "jesse@fsck.com"
},
"homepage": "https://github.com/obra/superpowers",
"repository": "https://github.com/obra/superpowers",
"license": "MIT",
"keywords": ["skills", "tdd", "debugging", "collaboration", "best-practices", "workflows"],
"skills": "./skills/",
"agents": "./agents/",
"commands": "./commands/",
"hooks": "./hooks/hooks.json"
}

.gitattributes

@@ -1,5 +1,6 @@
# Ensure shell scripts always have LF line endings
*.sh text eol=lf
+hooks/session-start text eol=lf
# Ensure the polyglot wrapper keeps LF (it's parsed by both cmd and bash)
*.cmd text eol=lf

@@ -26,7 +26,8 @@ Thanks!
## Installation
-**Note:** Installation differs by platform. Claude Code has a built-in plugin system. Codex and OpenCode require manual setup.
+**Note:** Installation differs by platform. Claude Code and Cursor have built-in plugin marketplaces. Codex and OpenCode require manual setup.
### Claude Code (via Plugin Marketplace)
@@ -42,9 +43,13 @@ Then install the plugin from this marketplace:
/plugin install superpowers@superpowers-marketplace
```
-### Verify Installation
+### Cursor (via Plugin Marketplace)
-Start a new session and ask Claude to help with something that would trigger a skill (e.g., "help me plan this feature" or "let's debug this issue"). Claude should automatically invoke the relevant superpowers skill.
+In Cursor Agent chat, install from marketplace:
+```text
+/plugin-add superpowers
+```
### Codex
@@ -66,6 +71,10 @@ Fetch and follow instructions from https://raw.githubusercontent.com/obra/superp
**Detailed docs:** [docs/README.opencode.md](docs/README.opencode.md)
+### Verify Installation
+Start a new session in your chosen platform and ask for something that should trigger a skill (for example, "help me plan this feature" or "let's debug this issue"). The agent should automatically invoke the relevant superpowers skill.
## The Basic Workflow
1. **brainstorming** - Activates before writing code. Refines rough ideas through questions, explores alternatives, presents design in sections for validation. Saves design document.

@@ -89,6 +89,32 @@ Added explicit instruction priority hierarchy to prevent conflicts with user pre
This ensures users remain in control. If CLAUDE.md says "don't use TDD" and a skill says "always use TDD," CLAUDE.md wins.
## v4.3.0 (2026-02-12)
This fix should dramatically improve superpowers skills compliance and should reduce the chances of Claude entering its native plan mode unintentionally.
### Changed
**Brainstorming skill now enforces its workflow instead of describing it**
Models were skipping the design phase and jumping straight to implementation skills like frontend-design, or collapsing the entire brainstorming process into a single text block. The skill now uses hard gates, a mandatory checklist, and a graphviz process flow to enforce compliance:
- `<HARD-GATE>`: no implementation skills, code, or scaffolding until design is presented and user approves
- Explicit checklist (6 items) that must be created as tasks and completed in order
- Graphviz process flow with `writing-plans` as the only valid terminal state
- Anti-pattern callout for "this is too simple to need a design" — the exact rationalization models use to skip the process
- Design section sizing based on section complexity, not project complexity
**Using-superpowers workflow graph intercepts EnterPlanMode**
Added an `EnterPlanMode` intercept to the skill flow graph. When the model is about to enter Claude's native plan mode, it checks whether brainstorming has happened and routes through the brainstorming skill instead. Plan mode is never entered.
### Fixed
**SessionStart hook now runs synchronously**
Changed `async: true` to `async: false` in hooks.json. When async, the hook could fail to complete before the model's first turn, meaning using-superpowers instructions weren't in context for the first message.
## v4.2.0 (2026-02-05)
### Breaking Changes

@@ -32,6 +32,12 @@ Fetch and follow instructions from https://raw.githubusercontent.com/obra/superp
3. Restart Codex.
+4. **For subagent skills** (optional): Skills like `dispatching-parallel-agents` and `subagent-driven-development` require Codex's collab feature. Add to your Codex config:
+```toml
+[features]
+collab = true
+```
### Windows
Use a junction instead of a symlink (works without Developer Mode):

@@ -0,0 +1,523 @@
# Visual Brainstorming Refactor Implementation Plan
> **For Claude:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Refactor visual brainstorming from blocking TUI feedback model to non-blocking "Browser Displays, Terminal Commands" architecture.
**Architecture:** Browser becomes an interactive display; terminal stays the conversation channel. Server writes user events to a per-screen `.events` file that Claude reads on its next turn. Eliminates `wait-for-feedback.sh` and all `TaskOutput` blocking.
**Tech Stack:** Node.js (Express, ws, chokidar), vanilla HTML/CSS/JS
**Spec:** `docs/superpowers/specs/2026-02-19-visual-brainstorming-refactor-design.md`
---
## File Map
| File | Action | Responsibility |
|------|--------|---------------|
| `lib/brainstorm-server/index.js` | Modify | Server: add `.events` file writing, clear on new screen, replace `wrapInFrame` |
| `lib/brainstorm-server/frame-template.html` | Modify | Template: remove feedback footer, add content placeholder + selection indicator |
| `lib/brainstorm-server/helper.js` | Modify | Client JS: remove send/feedback functions, narrow to click capture + indicator updates |
| `lib/brainstorm-server/wait-for-feedback.sh` | Delete | No longer needed |
| `skills/brainstorming/visual-companion.md` | Modify | Skill instructions: rewrite loop to non-blocking flow |
| `tests/brainstorm-server/server.test.js` | Modify | Tests: update for new template structure and helper.js API |
---
## Chunk 1: Server, Template, Client, Tests, Skill
### Task 1: Update `frame-template.html`
**Files:**
- Modify: `lib/brainstorm-server/frame-template.html`
- [ ] **Step 1: Remove the feedback footer HTML**
Replace the feedback-footer div (lines 227-233) with a selection indicator bar:
```html
<div class="indicator-bar">
<span id="indicator-text">Click an option above, then return to the terminal</span>
</div>
```
Also replace the default content inside `#claude-content` (lines 220-223) with the content placeholder:
```html
<div id="claude-content">
<!-- CONTENT -->
</div>
```
- [ ] **Step 2: Replace feedback footer CSS with indicator bar CSS**
Remove the `.feedback-footer`, `.feedback-footer label`, `.feedback-row`, and the textarea/button styles within `.feedback-footer` (lines 82-112).
Add indicator bar CSS:
```css
.indicator-bar {
background: var(--bg-secondary);
border-top: 1px solid var(--border);
padding: 0.5rem 1.5rem;
flex-shrink: 0;
text-align: center;
}
.indicator-bar span {
font-size: 0.75rem;
color: var(--text-secondary);
}
.indicator-bar .selected-text {
color: var(--accent);
font-weight: 500;
}
```
- [ ] **Step 3: Verify template renders**
Run the test suite to check the template still loads:
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
Expected: Tests 1-5 should still pass. Tests 6-8 may fail (expected — they assert old structure).
- [ ] **Step 4: Commit**
```bash
git add lib/brainstorm-server/frame-template.html
git commit -m "Replace feedback footer with selection indicator bar in brainstorm template"
```
---
### Task 2: Update `index.js` — content injection and `.events` file
**Files:**
- Modify: `lib/brainstorm-server/index.js`
- [ ] **Step 1: Write failing test for `.events` file writing**
Add a new test to `tests/brainstorm-server/server.test.js`, after Test 4, that sends a WebSocket event with a `choice` field and verifies that the `.events` file is written:
```javascript
// Test: Choice events written to .events file
console.log('Test: Choice events written to .events file');
const ws3 = new WebSocket(`ws://localhost:${TEST_PORT}`);
await new Promise(resolve => ws3.on('open', resolve));
ws3.send(JSON.stringify({ type: 'click', choice: 'a', text: 'Option A' }));
await sleep(300);
const eventsFile = path.join(TEST_DIR, '.events');
assert(fs.existsSync(eventsFile), '.events file should exist after choice click');
const lines = fs.readFileSync(eventsFile, 'utf-8').trim().split('\n');
const event = JSON.parse(lines[lines.length - 1]);
assert.strictEqual(event.choice, 'a', 'Event should contain choice');
assert.strictEqual(event.text, 'Option A', 'Event should contain text');
ws3.close();
console.log(' PASS');
```
- [ ] **Step 2: Run test to verify it fails**
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
Expected: New test FAILS — `.events` file doesn't exist yet.
- [ ] **Step 3: Write failing test for `.events` file clearing on new screen**
Add another test:
```javascript
// Test: .events cleared on new screen
console.log('Test: .events cleared on new screen');
// .events file should still exist from previous test
assert(fs.existsSync(path.join(TEST_DIR, '.events')), '.events should exist before new screen');
fs.writeFileSync(path.join(TEST_DIR, 'new-screen.html'), '<h2>New screen</h2>');
await sleep(500);
assert(!fs.existsSync(path.join(TEST_DIR, '.events')), '.events should be cleared after new screen');
console.log(' PASS');
```
- [ ] **Step 4: Run test to verify it fails**
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
Expected: New test FAILS — `.events` not cleared on screen push.
- [ ] **Step 5: Implement `.events` file writing in `index.js`**
In the WebSocket `message` handler (lines 74-77 of `index.js`), after the `console.log`, add:
```javascript
// Write user events to .events file for Claude to read
if (event.choice) {
const eventsFile = path.join(SCREEN_DIR, '.events');
fs.appendFileSync(eventsFile, JSON.stringify(event) + '\n');
}
```
In the chokidar `add` handler (lines 104-111), add `.events` clearing:
```javascript
if (filePath.endsWith('.html')) {
// Clear events from previous screen
const eventsFile = path.join(SCREEN_DIR, '.events');
if (fs.existsSync(eventsFile)) fs.unlinkSync(eventsFile);
console.log(JSON.stringify({ type: 'screen-added', file: filePath }));
// ... existing reload broadcast
}
```
- [ ] **Step 6: Replace `wrapInFrame` with comment placeholder injection**
Replace the `wrapInFrame` function (lines 27-32 of `index.js`) with:
```javascript
function wrapInFrame(content) {
// Replacer function: inserts content literally, even if it contains `$&` or `$$`
return frameTemplate.replace('<!-- CONTENT -->', () => content);
}
```
- [ ] **Step 7: Run all tests**
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
Expected: New `.events` tests PASS. Existing tests may still have failures from old assertions (fixed in Task 4).
- [ ] **Step 8: Commit**
```bash
git add lib/brainstorm-server/index.js tests/brainstorm-server/server.test.js
git commit -m "Add .events file writing and comment-based content injection to brainstorm server"
```
---
### Task 3: Simplify `helper.js`
**Files:**
- Modify: `lib/brainstorm-server/helper.js`
- [ ] **Step 1: Remove `sendToClaude` function**
Delete the `sendToClaude` function (lines 92-106) — the function body and the page takeover HTML.
- [ ] **Step 2: Remove `window.send` function**
Delete the `window.send` function (lines 120-129) — was tied to the removed Send button.
- [ ] **Step 3: Remove form submission and input change handlers**
Delete the form submission handler (lines 57-71) and the input change handler (lines 73-89) including the `inputTimeout` variable.
- [ ] **Step 4: Remove `pageshow` event listener**
Delete the `pageshow` listener we added earlier (no textarea to clear anymore).
- [ ] **Step 5: Narrow click handler to `[data-choice]` only**
Replace the click handler (lines 36-55) with a narrower version:
```javascript
// Capture clicks on choice elements
document.addEventListener('click', (e) => {
const target = e.target.closest('[data-choice]');
if (!target) return;
sendEvent({
type: 'click',
text: target.textContent.trim(),
choice: target.dataset.choice,
id: target.id || null
});
});
```
- [ ] **Step 6: Add indicator bar update on choice click**
After the `sendEvent` call in the click handler, add:
```javascript
// Update indicator bar (built from DOM nodes so label text is never parsed as HTML)
const indicator = document.getElementById('indicator-text');
if (indicator) {
const label = target.querySelector('h3, .content h3, .card-body h3')?.textContent?.trim() || target.dataset.choice;
const selected = document.createElement('span');
selected.className = 'selected-text';
selected.textContent = label + ' selected';
indicator.replaceChildren(selected, ' \u2014 return to terminal to continue');
}
```
- [ ] **Step 7: Remove `sendToClaude` from `window.brainstorm` API**
Update the `window.brainstorm` object (lines 132-136) to remove `sendToClaude`:
```javascript
window.brainstorm = {
send: sendEvent,
choice: (value, metadata = {}) => sendEvent({ type: 'choice', value, ...metadata })
};
```
- [ ] **Step 8: Run tests**
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
- [ ] **Step 9: Commit**
```bash
git add lib/brainstorm-server/helper.js
git commit -m "Simplify helper.js: remove feedback functions, narrow to choice capture + indicator"
```
---
### Task 4: Update tests for new structure
**Files:**
- Modify: `tests/brainstorm-server/server.test.js`
**Note:** Line references below are from the _original_ file. Task 2 inserted new tests earlier in the file, so actual line numbers will be shifted. Find tests by their `console.log` labels (e.g., "Test 5:", "Test 6:").
- [ ] **Step 1: Update Test 5 (full document assertion)**
Find the Test 5 assertion `!fullRes.body.includes('feedback-footer')` and replace it with the assertion below. Full documents are served as-is, so the response should not contain the indicator bar unless the document itself already does:
```javascript
assert(!fullRes.body.includes('indicator-bar') || fullDoc.includes('indicator-bar'),
'Should not wrap full documents in frame template');
```
- [ ] **Step 2: Update Test 6 (fragment wrapping)**
Line 125: Replace `feedback-footer` assertion with indicator bar assertion:
```javascript
assert(fragRes.body.includes('indicator-bar'), 'Fragment should get indicator bar from frame');
```
Also verify content placeholder was replaced (fragment content appears, placeholder comment doesn't):
```javascript
assert(!fragRes.body.includes('<!-- CONTENT -->'), 'Content placeholder should be replaced');
```
- [ ] **Step 3: Update Test 7 (helper.js API)**
Lines 140-142: Update assertions to reflect the new API surface:
```javascript
assert(helperContent.includes('toggleSelect'), 'helper.js should define toggleSelect');
assert(helperContent.includes('sendEvent'), 'helper.js should define sendEvent');
assert(helperContent.includes('selectedChoice'), 'helper.js should track selectedChoice');
assert(helperContent.includes('brainstorm'), 'helper.js should expose brainstorm API');
assert(!helperContent.includes('sendToClaude'), 'helper.js should not contain sendToClaude');
```
- [ ] **Step 4: Replace Test 8 (sendToClaude theming) with indicator bar test**
Replace Test 8 (lines 145-149) — `sendToClaude` no longer exists. Test the indicator bar instead:
```javascript
// Test 8: Indicator bar uses CSS variables (theme support)
console.log('Test 8: Indicator bar uses CSS variables');
const templateContent = fs.readFileSync(
path.join(__dirname, '../../lib/brainstorm-server/frame-template.html'), 'utf-8'
);
assert(templateContent.includes('indicator-bar'), 'Template should have indicator bar');
assert(templateContent.includes('indicator-text'), 'Template should have indicator text element');
console.log(' PASS');
```
- [ ] **Step 5: Run full test suite**
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
Expected: ALL tests PASS.
- [ ] **Step 6: Commit**
```bash
git add tests/brainstorm-server/server.test.js
git commit -m "Update brainstorm server tests for new template structure and helper.js API"
```
---
### Task 5: Delete `wait-for-feedback.sh`
**Files:**
- Delete: `lib/brainstorm-server/wait-for-feedback.sh`
- [ ] **Step 1: Verify no other files import or reference `wait-for-feedback.sh`**
Search the codebase:
```bash
grep -r "wait-for-feedback" /Users/drewritter/prime-rad/superpowers/ --include="*.js" --include="*.md" --include="*.sh" --include="*.json"
```
Expected references: only `visual-companion.md` (rewritten in Task 6) and possibly release notes (historical, leave as-is).
- [ ] **Step 2: Delete the file**
```bash
rm lib/brainstorm-server/wait-for-feedback.sh
```
- [ ] **Step 3: Run tests to confirm nothing breaks**
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
Expected: All tests PASS (no test referenced this file).
- [ ] **Step 4: Commit**
```bash
git add -u lib/brainstorm-server/wait-for-feedback.sh
git commit -m "Delete wait-for-feedback.sh: replaced by .events file"
```
---
### Task 6: Rewrite `visual-companion.md`
**Files:**
- Modify: `skills/brainstorming/visual-companion.md`
- [ ] **Step 1: Update "How It Works" description (line 18)**
Replace the sentence about receiving feedback "as JSON" with:
```markdown
The server watches a directory for HTML files and serves the newest one to the browser. You write HTML content, the user sees it in their browser and can click to select options. Selections are recorded to a `.events` file that you read on your next turn.
```
- [ ] **Step 2: Update fragment description (line 20)**
Remove "feedback footer" from the description of what the frame template provides:
```markdown
**Content fragments vs full documents:** If your HTML file starts with `<!DOCTYPE` or `<html`, the server serves it as-is (just injects the helper script). Otherwise, the server automatically wraps your content in the frame template — adding the header, CSS theme, selection indicator, and all interactive infrastructure. **Write content fragments by default.** Only write full documents when you need complete control over the page.
```
- [ ] **Step 3: Rewrite "The Loop" section (lines 36-61)**
Replace the entire "The Loop" section with:
```markdown
## The Loop
1. **Write HTML** to a new file in `screen_dir`:
- Use semantic filenames: `platform.html`, `visual-style.html`, `layout.html`
- **Never reuse filenames** — each screen gets a fresh file
- Use Write tool — **never use cat/heredoc** (dumps noise into terminal)
- Server automatically serves the newest file
2. **Tell user what to expect and end your turn:**
- Remind them of the URL (every step, not just first)
- Give a brief text summary of what's on screen (e.g., "Showing 3 layout options for the homepage")
- Ask them to respond in the terminal: "Take a look and let me know what you think. Click to select an option if you'd like."
3. **On your next turn** — after the user responds in the terminal:
- Read `$SCREEN_DIR/.events` if it exists — this contains the user's browser interactions (clicks, selections) as JSON lines
- Merge with the user's terminal text to get the full picture
- The terminal message is the primary feedback; `.events` provides structured interaction data
4. **Iterate or advance:** if feedback changes the current screen, write a new file (e.g., `layout-v2.html`). Only move to the next question when the current step is validated.
5. Repeat until done.
```
- [ ] **Step 4: Replace "User Feedback Format" section (lines 165-174)**
Replace with:
````markdown
## Browser Events Format
When the user clicks options in the browser, their interactions are recorded to `$SCREEN_DIR/.events` (one JSON object per line). The file is cleared automatically when you push a new screen.
```jsonl
{"type":"click","choice":"a","text":"Option A - Simple Layout","timestamp":1706000101}
{"type":"click","choice":"c","text":"Option C - Complex Grid","timestamp":1706000108}
{"type":"click","choice":"b","text":"Option B - Hybrid","timestamp":1706000115}
```
The full event stream shows the user's exploration path: they may click multiple options before settling. The last `choice` event is typically the final selection, but the pattern of clicks can reveal hesitation or preferences worth asking about.
If `.events` doesn't exist, the user didn't interact with the browser, so use only their terminal text.
````
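As an illustration of how the exploration path might be condensed, here is a minimal sketch. `summarizeChoices` and its return shape are hypothetical and not part of the skill or the server; the sample lines are the ones shown in the format example.

```javascript
// Hypothetical helper: condense an event stream into an exploration summary.
function summarizeChoices(events) {
  // Keep only events that carry a `choice` field.
  const clicks = events.filter((e) => e && e.choice);
  const order = clicks.map((e) => e.choice);
  const last = clicks.length ? clicks[clicks.length - 1] : null;
  return {
    order,                                // every choice click, in order
    finalChoice: last ? last.choice : null,
    changedMind: new Set(order).size > 1  // more than one distinct option clicked
  };
}

const sampleEvents = [
  '{"type":"click","choice":"a","text":"Option A - Simple Layout","timestamp":1706000101}',
  '{"type":"click","choice":"c","text":"Option C - Complex Grid","timestamp":1706000108}',
  '{"type":"click","choice":"b","text":"Option B - Hybrid","timestamp":1706000115}'
].map((line) => JSON.parse(line));

const summary = summarizeChoices(sampleEvents);
// order is ['a', 'c', 'b'], finalChoice is 'b', changedMind is true
console.log(summary);
```

A summary like this is only a convenience; the terminal message remains the primary feedback.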
- [ ] **Step 5: Update "Writing Content Fragments" description (line 65)**
Remove "feedback footer" reference:
```markdown
Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, selection indicator, and all interactive infrastructure).
```
- [ ] **Step 6: Update Reference section (lines 200-203)**
Remove the helper.js reference description about "JS API" — the API is now minimal. Keep the path reference:
```markdown
## Reference
- Frame template (CSS reference): `${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/frame-template.html`
- Helper script (client-side): `${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/helper.js`
```
- [ ] **Step 7: Commit**
```bash
git add skills/brainstorming/visual-companion.md
git commit -m "Rewrite visual-companion.md for non-blocking browser-displays-terminal-commands flow"
```
---
### Task 7: Final verification
- [ ] **Step 1: Run full test suite**
```bash
cd /Users/drewritter/prime-rad/superpowers && node tests/brainstorm-server/server.test.js
```
Expected: ALL tests PASS.
- [ ] **Step 2: Manual smoke test**
Start the server manually and verify the flow works end-to-end:
```bash
cd /Users/drewritter/prime-rad/superpowers && lib/brainstorm-server/start-server.sh --project-dir /tmp/brainstorm-smoke-test
```
Write a test fragment, open in browser, click an option, verify `.events` file is written, verify indicator bar updates. Then stop the server:
```bash
lib/brainstorm-server/stop-server.sh <screen_dir from start output>
```
- [ ] **Step 3: Verify no stale references remain**
```bash
grep -r "wait-for-feedback\|sendToClaude\|feedback-footer\|send-to-claude\|TaskOutput.*block.*true" /Users/drewritter/prime-rad/superpowers/ --include="*.js" --include="*.md" --include="*.sh" --include="*.html" | grep -v node_modules | grep -v RELEASE-NOTES | grep -v "\.md:.*spec\|plan"
```
Expected: No hits outside of release notes and the spec/plan docs (which are historical).
- [ ] **Step 4: Final commit if any cleanup needed**
```bash
git status
# Review untracked/modified files, stage specific files as needed, commit if clean
```


@@ -0,0 +1,162 @@
# Visual Brainstorming Refactor: Browser Displays, Terminal Commands
**Date:** 2026-02-19
**Status:** Approved
**Scope:** `lib/brainstorm-server/`, `skills/brainstorming/visual-companion.md`, `tests/brainstorm-server/`
## Problem
During visual brainstorming, Claude runs `wait-for-feedback.sh` as a background task and blocks on `TaskOutput(block=true, timeout=600s)`. This seizes the TUI entirely — the user cannot type to Claude while visual brainstorming is running. The browser becomes the only input channel.
Claude Code's execution model is turn-based. There is no way for Claude to listen on two channels simultaneously within a single turn. The blocking `TaskOutput` pattern was the wrong primitive — it simulates event-driven behavior the platform doesn't support.
## Design
### Core Model
**Browser = interactive display.** Shows mockups, lets the user click to select options. Selections are recorded server-side.
**Terminal = conversation channel.** Always unblocked, always available. The user talks to Claude here.
### The Loop
1. Claude writes an HTML file to the session directory
2. Server detects it via chokidar, pushes WebSocket reload to the browser (unchanged)
3. Claude ends its turn — tells the user to check the browser and respond in the terminal
4. User looks at browser, optionally clicks to select an option, then types feedback in the terminal
5. On the next turn, Claude reads `$SCREEN_DIR/.events` for the browser interaction stream (clicks, selections), merges with the terminal text
6. Iterate or advance
No background tasks. No `TaskOutput` blocking. No polling scripts.
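Step 5 of the loop can be sketched as a small reader. This is an illustrative sketch only: `readEvents` is a hypothetical name, and skipping blank or malformed lines is an assumption rather than something this design specifies.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: read the per-screen event stream, if there is one.
function readEvents(screenDir) {
  const file = path.join(screenDir, '.events');
  // A missing file means no browser interaction: fall back to terminal text only.
  if (!fs.existsSync(file)) return [];
  const events = [];
  for (const line of fs.readFileSync(file, 'utf-8').split('\n')) {
    if (!line.trim()) continue;
    try {
      events.push(JSON.parse(line));
    } catch {
      // Assumption: ignore a malformed or partially written line.
    }
  }
  return events;
}
```

Returning an empty array for a missing file keeps the calling code the same whether or not the user touched the browser.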
### Key Deletion: `wait-for-feedback.sh`
Deleted entirely. Its purpose was to bridge "server logs events to stdout" and "Claude needs to receive those events." The `.events` file replaces this — the server writes user interaction events directly, and Claude reads them with whatever file-reading mechanism the platform provides.
### Key Addition: `.events` File (Per-Screen Event Stream)
The server writes all user interaction events to `$SCREEN_DIR/.events`, one JSON object per line. This gives Claude the full interaction stream for the current screen — not just the final selection, but the user's exploration path (clicked A, then B, settled on C).
Example contents after a user explores options:
```jsonl
{"type":"click","choice":"a","text":"Option A - Preset-First Wizard","timestamp":1706000101}
{"type":"click","choice":"c","text":"Option C - Manual Config","timestamp":1706000108}
{"type":"click","choice":"b","text":"Option B - Hybrid Approach","timestamp":1706000115}
```
- Append-only within a screen. Each user event is appended as a new line.
- The file is cleared (deleted) when chokidar detects a new HTML file (new screen pushed), preventing stale events from carrying over.
- If the file doesn't exist when Claude reads it, no browser interaction occurred — Claude uses only the terminal text.
- The file contains only user events (`click`, etc.) — not server lifecycle events (`server-started`, `screen-added`). This keeps it small and focused.
- Claude can read the full stream to understand the user's exploration pattern, or just look at the last `choice` event for the final selection.
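The file lifecycle in the bullets above can be sketched with plain `fs` calls. This is a minimal illustration, not the server's actual code: `appendUserEvent`, `clearEvents`, and the temporary directory are hypothetical, and the `choice` check is an assumed stand-in for however the server identifies user events.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: append one user event as a JSON line.
function appendUserEvent(screenDir, event) {
  // Assumption: only events that carry a `choice` count as user interaction.
  if (!event || !event.choice) return false;
  fs.appendFileSync(path.join(screenDir, '.events'), JSON.stringify(event) + '\n');
  return true;
}

// Hypothetical helper: drop the stream when a new screen is pushed.
function clearEvents(screenDir) {
  // `force: true` makes this a no-op when the file does not exist.
  fs.rmSync(path.join(screenDir, '.events'), { force: true });
}

// Demo in a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'events-demo-'));
appendUserEvent(demoDir, { type: 'click', choice: 'a', text: 'Option A' });
appendUserEvent(demoDir, { type: 'server-started' }); // ignored: not a user event
const demoLines = fs.readFileSync(path.join(demoDir, '.events'), 'utf-8').trim().split('\n');
console.log(demoLines.length); // 1
clearEvents(demoDir);
console.log(fs.existsSync(path.join(demoDir, '.events'))); // false
```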
## Changes by File
### `index.js` (server)
**A. Write user events to `.events` file.**
In the WebSocket `message` handler, after logging the event to stdout: append the event as a JSON line to `$SCREEN_DIR/.events` via `fs.appendFileSync`. Only write user interaction events (those with `source: 'user-event'`), not server lifecycle events.
**B. Clear `.events` on new screen.**
In the chokidar `add` handler (new `.html` file detected), delete `$SCREEN_DIR/.events` if it exists. This is the definitive "new screen" signal — better than clearing on GET `/` which fires on every reload.
**C. Replace `wrapInFrame` content injection.**
The current regex anchors on `<div class="feedback-footer">`, which is being removed. Replace it with a comment placeholder: remove the existing default content inside `#claude-content` (the `<h2>Visual Brainstorming</h2>` and subtitle paragraph) and put a single `<!-- CONTENT -->` marker in its place. Content injection becomes `frameTemplate.replace('<!-- CONTENT -->', () => content)`; the replacer function keeps `$` sequences in the content from being interpreted as replacement patterns. This is simpler and won't break if template formatting changes.
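One detail worth keeping in mind with placeholder injection: `String.prototype.replace` expands `$` sequences in a replacement string even when the pattern is a plain string. The sketch below contrasts a string replacement with a replacer function. The `frameTemplate` here is a stand-in, and `wrapNaive` and `wrapLiteral` are hypothetical names used only for the comparison.

```javascript
// Stand-in for the real frame template; only the placeholder matters here.
const frameTemplate = '<div id="claude-content">\n<!-- CONTENT -->\n</div>';

// String replacement: `$$` and `$&` in the content are expanded.
function wrapNaive(content) {
  return frameTemplate.replace('<!-- CONTENT -->', content);
}

// Replacer function: the returned string is inserted literally.
function wrapLiteral(content) {
  return frameTemplate.replace('<!-- CONTENT -->', () => content);
}

const fragment = '<p>Price: $$5, match: $&</p>';
console.log(wrapNaive(fragment).includes(fragment));   // false
console.log(wrapLiteral(fragment).includes(fragment)); // true
```

Fragments that contain inline scripts or prices are the likely place for such sequences to appear.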
### `frame-template.html` (UI frame)
**Remove:**
- The `feedback-footer` div (textarea, Send button, label, `.feedback-row`)
- Associated CSS (`.feedback-footer`, `.feedback-footer label`, `.feedback-row`, textarea and button styles within it)
**Add:**
- `<!-- CONTENT -->` placeholder inside `#claude-content`, replacing the default text
- A selection indicator bar where the footer was, with two states:
- Default: "Click an option above, then return to the terminal"
- After selection: "Option B selected — return to terminal to continue"
- CSS for the indicator bar (subtle, similar visual weight to the existing header)
**Keep unchanged:**
- Header bar with "Brainstorm Companion" title and connection status
- `.main` wrapper and `#claude-content` container
- All component CSS (`.options`, `.cards`, `.mockup`, `.split`, `.pros-cons`, placeholders, mock elements)
- Dark/light theme variables and media query
### `helper.js` (client-side script)
**Remove:**
- `sendToClaude()` function and the "Sent to Claude" page takeover
- `window.send()` function (was tied to the removed Send button)
- Form submission handler — no purpose without the feedback textarea, adds log noise
- Input change handler — same reason
- `pageshow` event listener (was added to fix textarea persistence — no textarea anymore)
**Keep:**
- WebSocket connection, reconnect logic, event queue
- Reload handler (`window.location.reload()` on server push)
- `window.toggleSelect()` for selection highlighting
- `window.selectedChoice` tracking
- `window.brainstorm.send()` and `window.brainstorm.choice()` — these are distinct from the removed `window.send()`. They call `sendEvent` which logs to the server via WebSocket. Useful for custom full-document pages.
**Narrow:**
- Click handler: capture only `[data-choice]` clicks, not all buttons/links. The broad capture was needed when the browser was a feedback channel; now it's just for selection tracking.
**Add:**
- On `data-choice` click, update the selection indicator bar text to show which option was selected.
**Remove from `window.brainstorm` API:**
- `brainstorm.sendToClaude` — no longer exists
### `visual-companion.md` (skill instructions)
**Rewrite "The Loop" section** to the non-blocking flow described above. Remove all references to:
- `wait-for-feedback.sh`
- `TaskOutput` blocking
- Timeout/retry logic (600s timeout, 30-minute cap)
- "User Feedback Format" section describing `send-to-claude` JSON
**Replace with:**
- The new loop (write HTML → end turn → user responds in terminal → read `.events` → iterate)
- `.events` file format documentation
- Guidance that the terminal message is the primary feedback; `.events` provides the full browser interaction stream for additional context
**Keep:**
- Server startup/shutdown instructions
- Content fragment vs full document guidance
- CSS class reference and available components
- Design tips (scale fidelity to the question, 2-4 options per screen, etc.)
### `wait-for-feedback.sh`
**Deleted entirely.**
### `tests/brainstorm-server/server.test.js`
Tests that need updating:
- Test asserting `feedback-footer` presence in fragment responses — update to assert the selection indicator bar or `<!-- CONTENT -->` replacement
- Test asserting `helper.js` contains `send` — update to reflect narrowed API
- Test asserting `sendToClaude` CSS variable usage — remove (function no longer exists)
## Platform Compatibility
The server code (`index.js`, `helper.js`, `frame-template.html`) is fully platform-agnostic — pure Node.js and browser JavaScript. No Claude Code-specific references. Already proven to work on Codex via background terminal interaction.
The skill instructions (`visual-companion.md`) are the platform-adaptive layer. Each platform's Claude uses its own tools to start the server, read `.events`, etc. The non-blocking model works naturally across platforms since it doesn't depend on any platform-specific blocking primitive.
## What This Enables
- **TUI always responsive** during visual brainstorming
- **Mixed input** — click in browser + type in terminal, naturally merged
- **Graceful degradation** — browser down or user doesn't open it? Terminal still works
- **Simpler architecture** — no background tasks, no polling scripts, no timeout management
- **Cross-platform** — same server code works on Claude Code, Codex, and any future platform
## What This Drops
- **Pure-browser feedback workflow** — user must return to the terminal to continue. The selection indicator bar guides them, but it's one extra step compared to the old click-Send-and-wait flow.
- **Inline text feedback from browser** — the textarea is gone. All text feedback goes through the terminal. This is intentional — the terminal is a better text input channel than a small textarea in a frame.
- **Immediate response on browser Send** — the old system had Claude respond the moment the user clicked Send. Now there's a gap while the user switches to the terminal. In practice this is seconds, and the user gets to add context in their terminal message.


@@ -6,8 +6,8 @@
 "hooks": [
 {
 "type": "command",
-"command": "${CLAUDE_PLUGIN_ROOT}/hooks/session-start.sh",
-"async": true
+"command": "${CLAUDE_PLUGIN_ROOT}/hooks/run-hook.cmd session-start",
+"async": false
 }
 ]
 }


@@ -1,43 +1,46 @@
: << 'CMDBLOCK'
@echo off
REM ============================================================================
REM DEPRECATED: This polyglot wrapper is no longer used as of Claude Code 2.1.x
REM ============================================================================
REM Cross-platform polyglot wrapper for hook scripts.
REM On Windows: cmd.exe runs the batch portion, which finds and calls bash.
REM On Unix: the shell interprets this as a script (: is a no-op in bash).
REM
REM Claude Code 2.1.x changed the Windows execution model for hooks:
REM Hook scripts use extensionless filenames (e.g. "session-start" not
REM "session-start.sh") so Claude Code's Windows auto-detection -- which
REM prepends "bash" to any command containing .sh -- doesn't interfere.
REM
REM Before (2.0.x): Hooks ran with shell:true, using the system default shell.
REM This wrapper provided cross-platform compatibility by
REM being both a valid .cmd file (Windows) and bash script.
REM
REM After (2.1.x): Claude Code now auto-detects .sh files in hook commands
REM and prepends "bash " on Windows. This broke the wrapper
REM because the command:
REM "run-hook.cmd" session-start.sh
REM became:
REM bash "run-hook.cmd" session-start.sh
REM ...and bash cannot execute a .cmd file.
REM
REM The fix: hooks.json now calls session-start.sh directly. Claude Code 2.1.x
REM handles the bash invocation automatically on Windows.
REM
REM This file is kept for reference and potential backward compatibility.
REM ============================================================================
REM
REM Original purpose: Polyglot wrapper to run .sh scripts cross-platform
REM Usage: run-hook.cmd <script-name> [args...]
REM The script should be in the same directory as this wrapper
if "%~1"=="" (
echo run-hook.cmd: missing script name >&2
exit /b 1
)
"C:\Program Files\Git\bin\bash.exe" -l "%~dp0%~1" %2 %3 %4 %5 %6 %7 %8 %9
exit /b
set "HOOK_DIR=%~dp0"
REM Try Git for Windows bash in standard locations
if exist "C:\Program Files\Git\bin\bash.exe" (
"C:\Program Files\Git\bin\bash.exe" "%HOOK_DIR%%~1" %2 %3 %4 %5 %6 %7 %8 %9
exit /b %ERRORLEVEL%
)
if exist "C:\Program Files (x86)\Git\bin\bash.exe" (
"C:\Program Files (x86)\Git\bin\bash.exe" "%HOOK_DIR%%~1" %2 %3 %4 %5 %6 %7 %8 %9
exit /b %ERRORLEVEL%
)
REM Try bash on PATH (e.g. user-installed Git Bash, MSYS2, Cygwin)
where bash >nul 2>nul
if %ERRORLEVEL% equ 0 (
bash "%HOOK_DIR%%~1" %2 %3 %4 %5 %6 %7 %8 %9
exit /b %ERRORLEVEL%
)
REM No bash found - exit silently rather than error
REM (plugin still works, just without SessionStart context injection)
exit /b 0
CMDBLOCK
# Unix shell runs from here
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
# Unix: run the named script directly
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
SCRIPT_NAME="$1"
shift
"${SCRIPT_DIR}/${SCRIPT_NAME}" "$@"
exec bash "${SCRIPT_DIR}/${SCRIPT_NAME}" "$@"


@@ -32,13 +32,18 @@ escape_for_json() {
 using_superpowers_escaped=$(escape_for_json "$using_superpowers_content")
 warning_escaped=$(escape_for_json "$warning_message")
+session_context="<EXTREMELY_IMPORTANT>\nYou have superpowers.\n\n**Below is the full content of your 'superpowers:using-superpowers' skill - your introduction to using skills. For all other skills, use the 'Skill' tool:**\n\n${using_superpowers_escaped}\n\n${warning_escaped}\n</EXTREMELY_IMPORTANT>"
-# Output context injection as JSON
+# Output context injection as JSON.
+# Keep both shapes for compatibility:
+# - Cursor hooks expect additional_context.
+# - Claude hooks expect hookSpecificOutput.additionalContext.
 cat <<EOF
 {
+"additional_context": "${session_context}",
 "hookSpecificOutput": {
 "hookEventName": "SessionStart",
-"additionalContext": "<EXTREMELY_IMPORTANT>\nYou have superpowers.\n\n**Below is the full content of your 'superpowers:using-superpowers' skill - your introduction to using skills. For all other skills, use the 'Skill' tool:**\n\n${using_superpowers_escaped}\n\n${warning_escaped}\n</EXTREMELY_IMPORTANT>"
+"additionalContext": "${session_context}"
 }
 }
 EOF
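
The dual-shape payload emitted by this hunk can be pictured as follows. This is a JavaScript sketch of the output shape only; the hook itself is a bash heredoc that relies on `escape_for_json`. `buildSessionStartOutput` is a hypothetical name, and `JSON.stringify` stands in for the manual escaping.

```javascript
// Hypothetical helper: one payload that carries both context shapes.
function buildSessionStartOutput(sessionContext) {
  return JSON.stringify({
    // Top-level key, the shape the hunk's comment attributes to Cursor hooks.
    additional_context: sessionContext,
    // Nested key, the shape the hunk's comment attributes to Claude hooks.
    hookSpecificOutput: {
      hookEventName: 'SessionStart',
      additionalContext: sessionContext
    }
  }, null, 2);
}

const payload = JSON.parse(
  buildSessionStartOutput('<EXTREMELY_IMPORTANT>\nYou have "superpowers".\n</EXTREMELY_IMPORTANT>')
);
console.log(payload.additional_context === payload.hookSpecificOutput.additionalContext); // true
```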


@@ -8,12 +8,11 @@
*
* This template provides a consistent frame with:
* - OS-aware light/dark theming
* - Fixed header and feedback footer
* - Fixed header and selection indicator bar
* - Scrollable main content area
* - CSS helpers for common UI patterns
*
* CLAUDE: Replace the contents of #claude-content with your content.
* Keep the header, main wrapper, and feedback-footer intact.
* Content is injected via placeholder comment in #claude-content.
*/
* { box-sizing: border-box; margin: 0; padding: 0; }
@@ -79,37 +78,21 @@
.main { flex: 1; overflow-y: auto; }
#claude-content { padding: 2rem; min-height: 100%; }
.feedback-footer {
.indicator-bar {
background: var(--bg-secondary);
border-top: 1px solid var(--border);
padding: 0.75rem 1.5rem;
padding: 0.5rem 1.5rem;
flex-shrink: 0;
text-align: center;
}
.feedback-footer label { display: block; font-size: 0.65rem; color: var(--text-secondary); margin-bottom: 0.4rem; text-transform: uppercase; letter-spacing: 0.05em; }
.feedback-row { display: flex; gap: 0.5rem; }
.feedback-footer textarea {
flex: 1;
background: var(--bg-primary);
border: 1px solid var(--border);
border-radius: 6px;
padding: 0.5rem 0.75rem;
color: var(--text-primary);
font-family: inherit;
font-size: 0.85rem;
resize: none;
height: 36px;
.indicator-bar span {
font-size: 0.75rem;
color: var(--text-secondary);
}
.feedback-footer textarea:focus { outline: none; border-color: var(--accent); }
.feedback-footer button {
background: var(--accent);
color: white;
border: none;
padding: 0 1rem;
border-radius: 6px;
font-size: 0.8rem;
cursor: pointer;
.indicator-bar .selected-text {
color: var(--accent);
font-weight: 500;
}
.feedback-footer button:hover { background: var(--accent-hover); }
/* ===== TYPOGRAPHY ===== */
h2 { font-size: 1.5rem; font-weight: 600; margin-bottom: 0.5rem; }
@@ -218,18 +201,12 @@
<div class="main">
<div id="claude-content">
<!-- CLAUDE: Replace this content -->
<h2>Visual Brainstorming</h2>
<p class="subtitle">Claude will show mockups and options here.</p>
<!-- CONTENT -->
</div>
</div>
<div class="feedback-footer">
<label>Feedback for Claude</label>
<div class="feedback-row">
<textarea id="feedback" placeholder="Add notes (optional)..." onkeydown="if(event.key==='Enter'&&!event.shiftKey){event.preventDefault();send()}"></textarea>
<button onclick="send()">Send</button>
</div>
<div class="indicator-bar">
<span id="indicator-text">Click an option above, then return to the terminal</span>
</div>
</body>

@@ -32,107 +32,56 @@
}
}
// Auto-capture clicks on interactive elements
// Capture clicks on choice elements
document.addEventListener('click', (e) => {
const target = e.target.closest('button, a, [data-choice], [role="button"], input[type="submit"]');
const target = e.target.closest('[data-choice]');
if (!target) return;
// Don't capture regular link navigation
if (target.tagName === 'A' && !target.dataset.choice) return;
// Don't capture the Send feedback button (handled by send())
if (target.id === 'send-feedback') return;
e.preventDefault();
sendEvent({
type: 'click',
text: target.textContent.trim(),
choice: target.dataset.choice || null,
id: target.id || null,
className: target.className || null
choice: target.dataset.choice,
id: target.id || null
});
// Update indicator bar (defer so toggleSelect runs first)
setTimeout(() => {
const indicator = document.getElementById('indicator-text');
if (!indicator) return;
const container = target.closest('.options') || target.closest('.cards');
const selected = container ? container.querySelectorAll('.selected') : [];
if (selected.length === 0) {
indicator.textContent = 'Click an option above, then return to the terminal';
} else if (selected.length === 1) {
const label = selected[0].querySelector('h3, .content h3, .card-body h3')?.textContent?.trim() || selected[0].dataset.choice;
indicator.innerHTML = '<span class="selected-text">' + label + ' selected</span> — return to terminal to continue';
} else {
indicator.innerHTML = '<span class="selected-text">' + selected.length + ' selected</span> — return to terminal to continue';
}
}, 0);
});
// Auto-capture form submissions
document.addEventListener('submit', (e) => {
e.preventDefault();
const form = e.target;
const formData = new FormData(form);
const data = {};
formData.forEach((value, key) => { data[key] = value; });
sendEvent({
type: 'submit',
formId: form.id || null,
formName: form.name || null,
data: data
});
});
// Auto-capture input changes (debounced)
let inputTimeout = null;
document.addEventListener('input', (e) => {
const target = e.target;
if (!target.matches('input, textarea, select')) return;
clearTimeout(inputTimeout);
inputTimeout = setTimeout(() => {
sendEvent({
type: 'input',
name: target.name || null,
id: target.id || null,
value: target.value,
inputType: target.type || target.tagName.toLowerCase()
});
}, 500);
});
// Send to Claude - triggers feedback delivery
function sendToClaude(feedback) {
sendEvent({
type: 'send-to-claude',
feedback: feedback || null
});
// Show themed confirmation page
document.body.innerHTML = `
<div style="display: flex; align-items: center; justify-content: center; height: 100vh; font-family: system-ui, -apple-system, BlinkMacSystemFont, sans-serif; background: var(--bg-primary, #f5f5f7);">
<div style="text-align: center; color: var(--text-secondary, #86868b);">
<h2 style="color: var(--text-primary, #1d1d1f); margin-bottom: 0.5rem;">Sent to Claude</h2>
<p>Return to the terminal to see Claude's response.</p>
</div>
</div>
`;
}
// Frame UI: selection tracking and feedback send
// Frame UI: selection tracking
window.selectedChoice = null;
window.toggleSelect = function(el) {
const container = el.closest('.options') || el.closest('.cards');
if (container) {
const multi = container && container.dataset.multiselect !== undefined;
if (container && !multi) {
container.querySelectorAll('.option, .card').forEach(o => o.classList.remove('selected'));
}
el.classList.add('selected');
if (multi) {
el.classList.toggle('selected');
} else {
el.classList.add('selected');
}
window.selectedChoice = el.dataset.choice;
};
window.send = function() {
const feedbackEl = document.getElementById('feedback');
const feedback = feedbackEl ? feedbackEl.value.trim() : '';
const payload = {};
if (window.selectedChoice) payload.choice = window.selectedChoice;
if (feedback) payload.feedback = feedback;
if (Object.keys(payload).length === 0) return;
sendToClaude(payload);
if (feedbackEl) feedbackEl.value = '';
};
// Expose API for explicit use
window.brainstorm = {
send: sendEvent,
choice: (value, metadata = {}) => sendEvent({ type: 'choice', value, ...metadata }),
sendToClaude: sendToClaude
choice: (value, metadata = {}) => sendEvent({ type: 'choice', value, ...metadata })
};
connect();
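
The reworked `toggleSelect` distinguishes single-select containers from those marked `data-multiselect`. The rule can be modeled without the DOM as a pure function over a set of choices. This is a sketch of the behavior as it reads from the diff, not code from the repository; `nextSelection` is a hypothetical name.

```javascript
// Hypothetical model of the selection rule in toggleSelect:
// - single-select: clear everything, then select the clicked choice
// - multi-select: toggle the clicked choice, leave the others alone
function nextSelection(selected, choice, multi) {
  const next = new Set(multi ? selected : []);
  if (multi && selected.has(choice)) {
    next.delete(choice);
  } else {
    next.add(choice);
  }
  return next;
}

// Single-select: clicking "b" replaces "a".
const single = nextSelection(new Set(['a']), 'b', false);
// Multi-select: clicking "b" adds it, clicking "a" again removes it.
const both = nextSelection(new Set(['a']), 'b', true);
const onlyB = nextSelection(both, 'a', true);
console.log([...single], [...both], [...onlyB]);
```

The size of the resulting set corresponds to the count that the indicator bar reports when more than one item is selected.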

@@ -6,6 +6,8 @@ const fs = require('fs');
const path = require('path');
const PORT = process.env.BRAINSTORM_PORT || (49152 + Math.floor(Math.random() * 16383));
const HOST = process.env.BRAINSTORM_HOST || '127.0.0.1';
const URL_HOST = process.env.BRAINSTORM_URL_HOST || (HOST === '127.0.0.1' ? 'localhost' : HOST);
const SCREEN_DIR = process.env.BRAINSTORM_DIR || '/tmp/brainstorm';
if (!fs.existsSync(SCREEN_DIR)) {
@@ -25,10 +27,7 @@ function isFullDocument(html) {
// Wrap a content fragment in the frame template
function wrapInFrame(content) {
return frameTemplate.replace(
/(<div id="claude-content">)[\s\S]*?(<\/div>\s*<\/div>\s*<div class="feedback-footer">)/,
`$1\n ${content}\n $2`
);
return frameTemplate.replace('<!-- CONTENT -->', content);
}
// Find the newest .html file in the directory by mtime
@@ -74,6 +73,11 @@ wss.on('connection', (ws) => {
ws.on('message', (data) => {
const event = JSON.parse(data.toString());
console.log(JSON.stringify({ source: 'user-event', ...event }));
// Write user events to .events file for Claude to read
if (event.choice) {
const eventsFile = path.join(SCREEN_DIR, '.events');
fs.appendFileSync(eventsFile, JSON.stringify(event) + '\n');
}
});
});
@@ -103,6 +107,9 @@ app.get('/', (req, res) => {
chokidar.watch(SCREEN_DIR, { ignoreInitial: true })
.on('add', (filePath) => {
if (filePath.endsWith('.html')) {
// Clear events from previous screen
const eventsFile = path.join(SCREEN_DIR, '.events');
if (fs.existsSync(eventsFile)) fs.unlinkSync(eventsFile);
console.log(JSON.stringify({ type: 'screen-added', file: filePath }));
clients.forEach(ws => {
if (ws.readyState === WebSocket.OPEN) {
@@ -122,11 +129,13 @@ chokidar.watch(SCREEN_DIR, { ignoreInitial: true })
}
});
server.listen(PORT, '127.0.0.1', () => {
server.listen(PORT, HOST, () => {
console.log(JSON.stringify({
type: 'server-started',
port: PORT,
url: `http://localhost:${PORT}`,
host: HOST,
url_host: URL_HOST,
url: `http://${URL_HOST}:${PORT}`,
screen_dir: SCREEN_DIR
}));
});
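
One detail worth keeping in mind with the new `wrapInFrame`: when the second argument to `String.prototype.replace` is a string, sequences such as `$&` and `$$` are still interpreted as replacement patterns, even if the search value is a plain string. A content fragment containing those characters would be altered. The sketch below shows the effect and one possible guard (a function replacer); the template string here is a stand-in, not the real frame template.

```javascript
// Stand-in template for illustration; the real frame template is a full HTML page.
const frameTemplate = '<div id="claude-content">\n<!-- CONTENT -->\n</div>';
const fragment = '<p>Price: $& up</p>';

// String replacement: "$&" expands to the matched placeholder text.
const naive = frameTemplate.replace('<!-- CONTENT -->', fragment);

// Function replacer: the return value is inserted literally.
function wrapInFrame(content) {
  return frameTemplate.replace('<!-- CONTENT -->', () => content);
}
const html = wrapInFrame(fragment);

console.log(naive.includes('<!-- CONTENT -->')); // true: placeholder leaked back in
console.log(html.includes('Price: $& up'));      // true: fragment preserved verbatim
```

Whether this matters in practice depends on what the fragments contain; mockups that show prices or shell snippets are the likeliest to include a `$`.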

@@ -1,6 +1,6 @@
#!/bin/bash
# Start the brainstorm server and output connection info
# Usage: start-server.sh [--project-dir <path>]
# Usage: start-server.sh [--project-dir <path>] [--host <bind-host>] [--url-host <display-host>] [--foreground] [--background]
#
# Starts server on a random high port, outputs JSON with URL.
# Each session gets its own directory to avoid conflicts.
@@ -8,17 +8,42 @@
# Options:
# --project-dir <path> Store session files under <path>/.superpowers/brainstorm/
# instead of /tmp. Files persist after server stops.
# --host <bind-host> Host/interface to bind (default: 127.0.0.1).
# Use 0.0.0.0 in remote/containerized environments.
# --url-host <host> Hostname shown in returned URL JSON.
# --foreground Run server in the current terminal (no backgrounding).
# --background Force background mode (overrides Codex auto-foreground).
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
# Parse arguments
PROJECT_DIR=""
FOREGROUND="false"
FORCE_BACKGROUND="false"
BIND_HOST="127.0.0.1"
URL_HOST=""
while [[ $# -gt 0 ]]; do
case "$1" in
--project-dir)
PROJECT_DIR="$2"
shift 2
;;
--host)
BIND_HOST="$2"
shift 2
;;
--url-host)
URL_HOST="$2"
shift 2
;;
--foreground|--no-daemon)
FOREGROUND="true"
shift
;;
--background|--daemon)
FORCE_BACKGROUND="true"
shift
;;
*)
echo "{\"error\": \"Unknown argument: $1\"}"
exit 1
@@ -26,6 +51,19 @@ while [[ $# -gt 0 ]]; do
esac
done
if [[ -z "$URL_HOST" ]]; then
if [[ "$BIND_HOST" == "127.0.0.1" || "$BIND_HOST" == "localhost" ]]; then
URL_HOST="localhost"
else
URL_HOST="$BIND_HOST"
fi
fi
# Codex environments may reap detached/background processes. Prefer foreground by default.
if [[ -n "${CODEX_CI:-}" && "$FOREGROUND" != "true" && "$FORCE_BACKGROUND" != "true" ]]; then
FOREGROUND="true"
fi
# Generate unique session directory
SESSION_ID="$$-$(date +%s)"
@@ -48,15 +86,38 @@ if [[ -f "$PID_FILE" ]]; then
rm -f "$PID_FILE"
fi
# Start server, capturing output to log file
cd "$SCRIPT_DIR"
BRAINSTORM_DIR="$SCREEN_DIR" node index.js > "$LOG_FILE" 2>&1 &
# Foreground mode for environments that reap detached/background processes.
if [[ "$FOREGROUND" == "true" ]]; then
echo "$$" > "$PID_FILE"
env BRAINSTORM_DIR="$SCREEN_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" node index.js
exit $?
fi
# Start server, capturing output to log file
# Use nohup to survive shell exit; disown to remove from job table
nohup env BRAINSTORM_DIR="$SCREEN_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" node index.js > "$LOG_FILE" 2>&1 &
SERVER_PID=$!
disown "$SERVER_PID" 2>/dev/null
echo "$SERVER_PID" > "$PID_FILE"
# Wait for server-started message (check log file)
for i in {1..50}; do
if grep -q "server-started" "$LOG_FILE" 2>/dev/null; then
# Verify server is still alive after a short window (catches process reapers)
alive="true"
for _ in {1..20}; do
if ! kill -0 "$SERVER_PID" 2>/dev/null; then
alive="false"
break
fi
sleep 0.1
done
if [[ "$alive" != "true" ]]; then
echo "{\"error\": \"Server started but was killed. Retry in a persistent terminal with: $SCRIPT_DIR/start-server.sh${PROJECT_DIR:+ --project-dir $PROJECT_DIR} --host $BIND_HOST --url-host $URL_HOST --foreground\"}"
exit 1
fi
grep "server-started" "$LOG_FILE" | head -1
exit 0
fi
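
The script and `index.js` apply the same defaulting rule for the hostname shown in the URL: an explicit `--url-host` wins, a loopback bind address is displayed as `localhost`, and any other bind address is displayed as-is. A small sketch of that rule, written in JavaScript to match the server code; `resolveUrlHost` is a hypothetical helper, not a function in the repository.

```javascript
// Hypothetical helper mirroring the URL-host defaulting shown in the diff.
function resolveUrlHost(bindHost, urlHost) {
  if (urlHost) return urlHost; // explicit --url-host takes priority
  if (bindHost === '127.0.0.1' || bindHost === 'localhost') return 'localhost';
  return bindHost; // non-loopback bind: show it unchanged
}

const port = 50000; // arbitrary example port
const cases = [
  ['127.0.0.1', ''],
  ['0.0.0.0', ''],
  ['0.0.0.0', 'localhost']
];
const urls = cases.map(([bind, shown]) => `http://${resolveUrlHost(bind, shown)}:${port}`);
console.log(urls.join('\n'));
```

The second case suggests why `--url-host` is useful when binding `0.0.0.0`: without it, the printed URL would use `0.0.0.0` as the hostname.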

@@ -1,27 +0,0 @@
#!/bin/bash
# Wait for user feedback from the brainstorm browser
# Usage: wait-for-feedback.sh <screen_dir>
#
# Blocks until user sends feedback, then outputs the JSON.
# Write HTML to screen_file BEFORE calling this.
SCREEN_DIR="${1:?Usage: wait-for-feedback.sh <screen_dir>}"
LOG_FILE="${SCREEN_DIR}/.server.log"
if [[ ! -d "$SCREEN_DIR" ]]; then
echo '{"error": "Screen directory not found"}' >&2
exit 1
fi
# Record current position in log file
LOG_POS=$(wc -l < "$LOG_FILE" 2>/dev/null || echo 0)
# Poll for new lines containing the event
while true; do
RESULT=$(tail -n +$((LOG_POS + 1)) "$LOG_FILE" 2>/dev/null | grep -m 1 "send-to-claude")
if [[ -n "$RESULT" ]]; then
echo "$RESULT"
exit 0
fi
sleep 0.2
done

@@ -1,17 +1,85 @@
---
name: brainstorming
description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation."
description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior."
---
# Brainstorming Ideas Into Designs
Help turn ideas into fully formed designs and specs through natural collaborative dialogue.
## Overview
Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design in small sections (200-300 words), checking after each section whether it looks right so far.
Help turn ideas into fully formed designs and specs through natural collaborative dialogue. Scale your effort to the task — a link in a header needs a different process than a new subsystem — but always confirm you understand what the user wants before you build anything.
Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval.
<HARD-GATE>
Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until:
1. You have stated your understanding of the user's intent
2. The user has confirmed that understanding
This applies to every task regardless of size. The confirmation can be brief ("I'll add a GitHub icon-link to the header, styled to match the existing theme — sound right?"), but you must get it.
</HARD-GATE>
## Anti-Pattern: Skipping Understanding
The failure mode is not "too little ceremony." It is jumping to implementation with unchecked assumptions. Simple tasks are where this happens most — you assume you know what the user wants and start editing. Even when you're right about the *what*, you miss preferences about the *how*.
## Checklist
Create tasks to track the steps you'll execute. For a small change, that might be steps 1–3 only. For a large project, all seven.
1. **Explore project context** — check files, docs, recent commits
2. **Offer visual companion** (if topic will involve visual questions) — this is its own message, not combined with a clarifying question. See the Visual Companion section below.
3. **Ask clarifying questions** — one at a time, understand purpose/constraints/success criteria
4. **Propose approaches** — with trade-offs and your recommendation
5. **Present design** — in sections scaled to their complexity, get user approval after each section
6. **Write design doc** — save to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` and commit
7. **Transition to implementation** — invoke writing-plans skill to create implementation plan
Steps 1–4 always happen. Steps 5–7 scale to the task. **GATE — when you believe a step can be safely elided, ask the user for permission.** Do not skip silently. For example: "This is straightforward — I don't think we need a design doc. Want me to go straight to planning?"
## Process Flow
```dot
digraph brainstorming {
"Explore context" [shape=box];
"Visual questions?" [shape=diamond];
"Offer Visual Companion" [shape=box];
"Understand intent" [shape=box];
"User confirms understanding?" [shape=diamond];
"Propose approaches" [shape=box];
"Present design" [shape=box];
"User approves?" [shape=diamond];
"Design doc warranted?" [shape=diamond];
"Ask user permission\nto elide" [shape=box];
"Write design doc" [shape=box];
"Spec review\n(when warranted)" [shape=box];
"Invoke writing-plans" [shape=doublecircle];
"Explore context" -> "Visual questions?";
"Visual questions?" -> "Offer Visual Companion" [label="yes"];
"Visual questions?" -> "Understand intent" [label="no"];
"Offer Visual Companion" -> "Understand intent";
"Understand intent" -> "User confirms understanding?";
"User confirms understanding?" -> "Understand intent" [label="no, refine"];
"User confirms understanding?" -> "Propose approaches" [label="yes"];
"Propose approaches" -> "Present design";
"Present design" -> "User approves?";
"User approves?" -> "Present design" [label="no, revise"];
"User approves?" -> "Design doc warranted?" [label="yes"];
"Design doc warranted?" -> "Write design doc" [label="yes"];
"Design doc warranted?" -> "Ask user permission\nto elide" [label="no — may be\noverkill"];
"Ask user permission\nto elide" -> "Invoke writing-plans";
"Write design doc" -> "Spec review\n(when warranted)";
"Spec review\n(when warranted)" -> "Invoke writing-plans";
}
```
**The terminal state is invoking writing-plans.** Do NOT invoke frontend-design, mcp-builder, or any other implementation skill. The ONLY skill you invoke after brainstorming is writing-plans.
## The Process
**Understanding the idea:**
- Check out the current project state first (files, docs, recent commits)
- Before asking detailed questions, assess scope: if the request describes multiple independent subsystems (e.g., "build a platform with chat, file storage, billing, and analytics"), flag this immediately. Don't spend questions refining details of a project that needs to be decomposed first.
- If the project is too large for a single spec, help the user decompose into sub-projects: what are the independent pieces, how do they relate, what order should they be built? Then brainstorm the first sub-project through the normal design flow. Each sub-project gets its own spec → plan → implementation cycle.
@@ -21,62 +89,81 @@ Start by understanding the current project context, then ask questions one at a
- Focus on understanding: purpose, constraints, success criteria
**Exploring approaches:**
- Propose 2-3 different approaches with trade-offs
- Present options conversationally with your recommendation and reasoning
- Lead with your recommended option and explain why
**Presenting the design:**
- Once you believe you understand what you're building, present the design
- Break it into sections of 200-300 words
- Scale each section to its complexity: a few sentences if straightforward, up to 200-300 words if nuanced
- Ask after each section whether it looks right so far
- Cover: architecture, components, data flow, error handling, testing
- Be ready to go back and clarify if something doesn't make sense
## After the Design
**Documentation (when warranted):**
- Write the validated design (spec) to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`
- (User preferences for spec location override this default)
- Use elements-of-style:writing-clearly-and-concisely skill if available
- Commit the design document to git
- **GATE — for small changes, the design doc may be unnecessary.** Ask the user before skipping it.
**Spec Review Loop (when warranted):**
After writing the spec document:
1. Dispatch spec-document-reviewer subagent (see spec-document-reviewer-prompt.md)
2. If Issues Found: fix, re-dispatch, repeat until Approved
3. If loop exceeds 5 iterations, surface to human for guidance
**GATE — for small changes, the spec review may be unnecessary.** Ask the user before skipping it.
**Implementation:**
- Check in with the user before transitioning: "The design is ready. Want me to move on to writing the implementation plan?"
- On confirmation, invoke the writing-plans skill
- Do NOT invoke any other skill. writing-plans is the next step.
**Design for isolation and clarity:**
- Break the system into smaller units that each have one clear purpose, communicate through well-defined interfaces, and can be understood and tested independently
- For each unit, you should be able to answer: what does it do, how do you use it, and what does it depend on?
- Can someone understand what a unit does without reading its internals? Can you change the internals without breaking consumers? If not, the boundaries need work.
- Smaller, well-bounded units are also easier for you to work with - you reason better about code you can hold in context at once, and your edits are more reliable when files are focused. When a file grows large, that's often a signal that it's doing too much.
**Working in existing codebases:**
- Explore the current structure before proposing changes. Follow existing patterns.
- Where existing code has problems that affect the work (e.g., a file that's grown too large, unclear boundaries, tangled responsibilities), include targeted improvements as part of the design - the way a good developer improves code they're working in.
- Don't propose unrelated refactoring. Stay focused on what serves the current goal.
## After the Design
**Documentation:**
- Write the validated design (spec) to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`
- (User preferences for spec location override this default)
- Use elements-of-style:writing-clearly-and-concisely skill if available
- Commit the design document to git
**Spec Review Loop:**
After writing the spec document:
1. Dispatch spec-document-reviewer subagent (see spec-document-reviewer-prompt.md)
2. If Issues Found: fix, re-dispatch, repeat until Approved
3. If loop exceeds 5 iterations, surface to human for guidance
**Implementation (if continuing):**
When the user approves the design and wants to build:
1. **Invoke `superpowers:writing-plans` using the Skill tool.** Not EnterPlanMode. Not plan mode. Not direct implementation. The Skill tool.
2. After the plan is written, use superpowers:using-git-worktrees to create an isolated workspace for implementation.
## Key Principles
- **One question at a time** - Don't overwhelm with multiple questions
- **Multiple choice preferred** - Easier to answer than open-ended when possible
- **YAGNI ruthlessly** - Remove unnecessary features from all designs
- **Explore alternatives** - Always propose 2-3 approaches before settling
- **Incremental validation** - Present design in sections, validate each
- **Be flexible** - Go back and clarify when something doesn't make sense
- **One question at a time** — don't overwhelm with multiple questions
- **Multiple choice preferred** — easier to answer than open-ended when possible
- **YAGNI ruthlessly** — remove unnecessary features from all designs
- **Explore alternatives** — propose approaches before settling
- **Incremental validation** — present, get approval, then move on
- **Be flexible** — go back and clarify when something doesn't make sense
## Visual Companion (Claude Code Only)
## Visual Companion
A browser-based visual companion for showing mockups, diagrams, and options during brainstorming. Use it whenever visual representation would make feedback easier than text descriptions alone.
A browser-based companion for showing mockups, diagrams, and visual options during brainstorming. Available as a tool — not a mode. Accepting the companion means it's available for questions that benefit from visual treatment; it does NOT mean every question goes through the browser.
**When the topic involves visual decisions, ask:**
> "This involves some visual decisions. I can show mockups in a browser window so you can see options and give feedback visually. This feature is still new — it can be token-intensive and a bit slow, but it works well for layout, design, and architecture questions. Want to try it? (Requires opening a local URL)"
**Offering the companion:** When you anticipate that upcoming questions will involve visual content (mockups, layouts, diagrams), offer it once for consent:
> "Some of the upcoming design questions would benefit from visual mockups. I can show those in a browser window so you can see and compare options visually. This feature is still new — it can be token-intensive and a bit slow, but it works well for layout and design questions. Want to try it? (Requires opening a local URL)"
If they agree, read the detailed guide before proceeding:
`${CLAUDE_PLUGIN_ROOT}/skills/brainstorming/visual-companion.md`
**This offer MUST be its own message.** Do not combine it with clarifying questions, context summaries, or any other content. The message should contain ONLY the offer above and nothing else. Wait for the user's response before continuing. If they decline, proceed with text-only brainstorming.
**Per-question decision:** Even after the user accepts, decide FOR EACH QUESTION whether to use the browser or the terminal. The test: **would the user understand this better by seeing it than reading it?**
- **Use the browser** for content that IS visual — mockups, wireframes, layout comparisons, architecture diagrams, side-by-side visual designs
- **Use the terminal** for content that is text — requirements questions, conceptual choices, tradeoff lists, A/B/C/D text options, scope decisions
A question about a UI topic is not automatically a visual question. "What does personality mean in this context?" is a conceptual question — use the terminal. "Which wizard layout works better?" is a visual question — use the browser.
If they agree to the companion, read the detailed guide before proceeding:
`skills/brainstorming/visual-companion.md`

@@ -4,20 +4,31 @@ Browser-based visual brainstorming companion for showing mockups, diagrams, and
## When to Use
Use the visual companion when seeing beats describing:
- **UI mockups** — layouts, navigation, component designs
- **Architecture diagrams** — system components, data flow, relationships
- **Complex choices** — multi-option decisions with visual trade-offs
- **Design polish** — when the question is about look and feel
- **Spatial relationships** — file structures, database schemas, state machines
Decide per-question, not per-session. The test: **would the user understand this better by seeing it than reading it?**
Don't use it for simple text questions, code review, or when the user prefers terminal-only interaction.
**Use the browser** when the content itself is visual:
- **UI mockups** — wireframes, layouts, navigation structures, component designs
- **Architecture diagrams** — system components, data flow, relationship maps
- **Side-by-side visual comparisons** — comparing two layouts, two color schemes, two design directions
- **Design polish** — when the question is about look and feel, spacing, visual hierarchy
- **Spatial relationships** — state machines, flowcharts, entity relationships rendered as diagrams
**Use the terminal** when the content is text or tabular:
- **Requirements and scope questions** — "what does X mean?", "which features are in scope?"
- **Conceptual A/B/C choices** — picking between approaches described in words
- **Tradeoff lists** — pros/cons, comparison tables
- **Technical decisions** — API design, data modeling, architectural approach selection
- **Clarifying questions** — anything where the answer is words, not a visual preference
A question *about* a UI topic is not automatically a visual question. "What kind of wizard do you want?" is conceptual — use the terminal. "Which of these wizard layouts feels right?" is visual — use the browser.
## How It Works
The server watches a directory for HTML files and serves the newest one to the browser. You write HTML content, the user sees it in their browser, clicks options or types feedback, and you receive their response as JSON.
The server watches a directory for HTML files and serves the newest one to the browser. You write HTML content, the user sees it in their browser and can click to select options. Selections are recorded to a `.events` file that you read on your next turn.
**Content fragments vs full documents:** If your HTML file starts with `<!DOCTYPE` or `<html`, the server serves it as-is (just injects the helper script). Otherwise, the server automatically wraps your content in the frame template — adding the header, CSS theme, feedback footer, and all interactive infrastructure. **Write content fragments by default.** Only write full documents when you need complete control over the page.
**Content fragments vs full documents:** If your HTML file starts with `<!DOCTYPE` or `<html`, the server serves it as-is (just injects the helper script). Otherwise, the server automatically wraps your content in the frame template — adding the header, CSS theme, selection indicator, and all interactive infrastructure. **Write content fragments by default.** Only write full documents when you need complete control over the page.
## Starting a Session
@@ -33,38 +44,66 @@ Save `screen_dir` from the response. Tell user to open the URL.
**Note:** Pass the project root as `--project-dir` so mockups persist in `.superpowers/brainstorm/` and survive server restarts. Without it, files go to `/tmp` and get cleaned up. Remind the user to add `.superpowers/` to `.gitignore` if it's not already there.
**Codex behavior:** In Codex (`CODEX_CI=1`), `start-server.sh` auto-switches to foreground mode by default because background jobs may be reaped. Use `--background` only if your environment reliably preserves detached processes.
**If background processes are reaped in your environment:** run in foreground from a persistent terminal session:
```bash
${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/start-server.sh --project-dir /path/to/project --foreground
```
In `--foreground` mode, the command stays attached and serves until interrupted.
If the URL is unreachable from your browser (common in remote/containerized setups), bind a non-loopback host:
```bash
${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/start-server.sh \
--project-dir /path/to/project \
--host 0.0.0.0 \
--url-host localhost
```
Use `--url-host` to control what hostname is printed in the returned URL JSON.
## The Loop
1. **Start watcher first** (background bash) — avoids race condition:
```bash
${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/wait-for-feedback.sh $SCREEN_DIR
```
2. **Write HTML** to a new file in `screen_dir`:
1. **Write HTML** to a new file in `screen_dir`:
- Use semantic filenames: `platform.html`, `visual-style.html`, `layout.html`
- **Never reuse filenames** — each screen gets a fresh file
- Use Write tool — **never use cat/heredoc** (dumps noise into terminal)
- Server automatically serves the newest file
3. **Tell user what to expect:**
2. **Tell user what to expect and end your turn:**
- Remind them of the URL (every step, not just first)
- Give a brief text summary of what's on screen (e.g., "Showing 3 layout options for the homepage")
- Ask them to respond in the terminal: "Take a look and let me know what you think. Click to select an option if you'd like."
4. **Wait for feedback** — call `TaskOutput(task_id, block=true, timeout=600000)`
- If timeout, call TaskOutput again (watcher still running)
- After 3 timeouts (30 min), say "Let me know when you want to continue"
3. **On your next turn** — after the user responds in the terminal:
- Read `$SCREEN_DIR/.events` if it exists — this contains the user's browser interactions (clicks, selections) as JSON lines
- Merge with the user's terminal text to get the full picture
- The terminal message is the primary feedback; `.events` provides structured interaction data
5. **Process feedback** — returns JSON like `{"choice": "a", "feedback": "make header smaller"}`
4. **Iterate or advance** — if feedback changes current screen, write a new file (e.g., `layout-v2.html`). Only move to the next question when the current step is validated.
6. **Iterate or advance** — if feedback changes current screen, write a new file (e.g., `layout-v2.html`). Only move to the next question when the current step is validated.
5. **Unload when returning to terminal** — when the next step doesn't need the browser (e.g., a clarifying question, a tradeoff discussion), push a waiting screen to clear the stale content:
7. Repeat until done.
```html
<!-- filename: waiting.html (or waiting-2.html, etc.) -->
<div style="display:flex;align-items:center;justify-content:center;min-height:60vh">
<p class="subtitle">Continuing in terminal...</p>
</div>
```
This prevents the user from staring at a resolved choice while the conversation has moved on. When the next visual question comes up, push a new content file as usual.
6. Repeat until done.
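
Step 3 of the loop above reads `.events` as JSON lines. A minimal sketch of that read, assuming one JSON object per line as written by the server's `appendFileSync` call; `readEvents` is a hypothetical helper and the demo events are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical reader: returns [] when no selections were recorded.
function readEvents(screenDir) {
  const eventsFile = path.join(screenDir, '.events');
  if (!fs.existsSync(eventsFile)) return [];
  return fs.readFileSync(eventsFile, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '') // skip the trailing empty line
    .map((line) => JSON.parse(line));
}

// Demo in a throwaway directory, simulating two clicks.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'brainstorm-'));
const before = readEvents(demoDir);
for (const choice of ['a', 'b']) {
  const event = { type: 'click', text: `Option ${choice}`, choice, id: null };
  fs.appendFileSync(path.join(demoDir, '.events'), JSON.stringify(event) + '\n');
}
const events = readEvents(demoDir);
console.log(events.map((e) => e.choice).join(', '));
fs.rmSync(demoDir, { recursive: true, force: true });
```

The last line of the file reflects the most recent click, but the user's terminal message remains the primary feedback, as noted above.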
## Writing Content Fragments
Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, feedback footer, interactive JS).
Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, selection indicator, and all interactive infrastructure).
**Minimal example:**
```html
<h2>Which layout works better?</h2>
<p class="subtitle">Consider readability and visual hierarchy</p>
@@ -94,6 +133,7 @@ That's it. No `<html>`, no CSS, no `<script>` tags needed. The server provides a
The frame template provides these CSS classes for your content:
### Options (A/B/C choices)
```html
<div class="options">
<div class="option" data-choice="a" onclick="toggleSelect(this)">
@@ -106,7 +146,16 @@ The frame template provides these CSS classes for your content:
</div>
```
**Multi-select:** Add `data-multiselect` to the container to let users select multiple options. Each click toggles the item. The indicator bar shows the count.
```html
<div class="options" data-multiselect>
<!-- same option markup — users can select/deselect multiple -->
</div>
```
### Cards (visual designs)
```html
<div class="cards">
<div class="card" data-choice="design1" onclick="toggleSelect(this)">
@@ -120,6 +169,7 @@ The frame template provides these CSS classes for your content:
```
### Mockup container
```html
<div class="mockup">
<div class="mockup-header">Preview: Dashboard Layout</div>
@@ -128,6 +178,7 @@ The frame template provides these CSS classes for your content:
```
### Split view (side-by-side)
```html
<div class="split">
<div class="mockup"><!-- left --></div>
@@ -136,6 +187,7 @@ The frame template provides these CSS classes for your content:
```
### Pros/Cons
```html
<div class="pros-cons">
<div class="pros"><h4>Pros</h4><ul><li>Benefit</li></ul></div>
@@ -144,6 +196,7 @@ The frame template provides these CSS classes for your content:
```
### Mock elements (wireframe building blocks)
```html
<div class="mock-nav">Logo | Home | About | Contact</div>
<div style="display: flex;">
@@ -156,22 +209,26 @@ The frame template provides these CSS classes for your content:
```
### Typography and sections
- `h2` — page title
- `h3` — section heading
- `.subtitle` — secondary text below title
- `.section` — content block with bottom margin
- `.label` — small uppercase label text
## User Feedback Format
## Browser Events Format
```json
{
"choice": "option-id",
"feedback": "user notes"
}
When the user clicks options in the browser, their interactions are recorded to `$SCREEN_DIR/.events` (one JSON object per line). The file is cleared automatically when you push a new screen.
```jsonl
{"type":"click","choice":"a","text":"Option A - Simple Layout","timestamp":1706000101}
{"type":"click","choice":"c","text":"Option C - Complex Grid","timestamp":1706000108}
{"type":"click","choice":"b","text":"Option B - Hybrid","timestamp":1706000115}
```
Both fields are optional — user may select without notes, or send notes without a selection.
The full event stream shows the user's exploration path — they may click multiple options before settling. The last `choice` event is typically the final selection, but the pattern of clicks can reveal hesitation or preferences worth asking about.
If `.events` doesn't exist, the user didn't interact with the browser — use only their terminal text.
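Reading the event stream can be sketched as follows, assuming the JSONL shape shown in the example events. The names `parseEvents` and `finalChoice` are hypothetical and are not part of the plugin; this is an illustration, not the server's implementation.

```javascript
// Hypothetical helper: parse the text of a .events file (one JSON object per line).
function parseEvents(text) {
  return text
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0)
    .map(line => JSON.parse(line));
}

// Hypothetical helper: return the most recent choice, or null if none was recorded.
function finalChoice(events) {
  for (let i = events.length - 1; i >= 0; i--) {
    if (events[i].choice !== undefined) return events[i].choice;
  }
  return null;
}

// Sample data copied from the documented event format.
const sample = [
  '{"type":"click","choice":"a","text":"Option A - Simple Layout","timestamp":1706000101}',
  '{"type":"click","choice":"c","text":"Option C - Complex Grid","timestamp":1706000108}',
  '{"type":"click","choice":"b","text":"Option B - Hybrid","timestamp":1706000115}',
].join('\n');

const events = parseEvents(sample);
console.log(finalChoice(events));                    // prints b
console.log(events.map(e => e.choice).join(' > '));  // prints a > c > b
```

The second line of output is the exploration path; a long or oscillating path may be worth asking the user about, while the terminal message remains the primary feedback.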
## Design Tips
@@ -200,4 +257,4 @@ If the session used `--project-dir`, mockup files persist in `.superpowers/brain
## Reference
- Frame template (CSS reference): `${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/frame-template.html`
- Helper script (JS API): `${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/helper.js`
- Helper script (client-side): `${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/helper.js`


@@ -7,6 +7,8 @@ description: Use when executing implementation plans with independent tasks in t
Execute plan by dispatching fresh subagent per task, with two-stage review after each: spec compliance review first, then code quality review.
Scale the review process to the task. A one-line config change doesn't need the same review rigor as a new subsystem. **GATE — when you believe review stages or the final reviewer can be safely collapsed or elided, ask the user for permission.** Do not elide silently, and do not replace a skipped review subagent with orchestrator judgment — the orchestrator never implements or reviews code.
**Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration
## When to Use
@@ -47,26 +49,34 @@ digraph process {
"Implementer subagent asks questions?" [shape=diamond];
"Answer questions, provide context" [shape=box];
"Implementer subagent implements, tests, commits, self-reviews" [shape=box];
"Two-stage review warranted?" [shape=diamond];
"Ask user permission\nto elide or collapse reviews" [shape=box];
"Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [shape=box];
"Spec reviewer subagent confirms code matches spec?" [shape=diamond];
"Implementer subagent fixes spec gaps" [shape=box];
"Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [shape=box];
"Code quality reviewer subagent approves?" [shape=diamond];
"Implementer subagent fixes quality issues" [shape=box];
"Mark task complete in TodoWrite" [shape=box];
"Mark task complete in your task list" [shape=box];
}
"Read plan, extract all tasks with full text, note context, create TodoWrite" [shape=box];
"Read plan, extract all tasks with full text, note context, create your task list" [shape=box];
"More tasks remain?" [shape=diamond];
"Final reviewer warranted?" [shape=diamond];
"Ask user permission\nto elide final review" [shape=box];
"Dispatch final code reviewer subagent for entire implementation" [shape=box];
"Check in with user\nbefore finishing" [shape=box];
"Use superpowers:finishing-a-development-branch" [shape=box style=filled fillcolor=lightgreen];
"Read plan, extract all tasks with full text, note context, create TodoWrite" -> "Dispatch implementer subagent (./implementer-prompt.md)";
"Read plan, extract all tasks with full text, note context, create your task list" -> "Dispatch implementer subagent (./implementer-prompt.md)";
"Dispatch implementer subagent (./implementer-prompt.md)" -> "Implementer subagent asks questions?";
"Implementer subagent asks questions?" -> "Answer questions, provide context" [label="yes"];
"Answer questions, provide context" -> "Dispatch implementer subagent (./implementer-prompt.md)";
"Implementer subagent asks questions?" -> "Implementer subagent implements, tests, commits, self-reviews" [label="no"];
"Implementer subagent implements, tests, commits, self-reviews" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)";
"Implementer subagent implements, tests, commits, self-reviews" -> "Two-stage review warranted?";
"Two-stage review warranted?" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [label="yes"];
"Two-stage review warranted?" -> "Ask user permission\nto elide or collapse reviews" [label="no — may be\noverkill"];
"Ask user permission\nto elide or collapse reviews" -> "Mark task complete in your task list";
"Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" -> "Spec reviewer subagent confirms code matches spec?";
"Spec reviewer subagent confirms code matches spec?" -> "Implementer subagent fixes spec gaps" [label="no"];
"Implementer subagent fixes spec gaps" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [label="re-review"];
@@ -74,11 +84,15 @@ digraph process {
"Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" -> "Code quality reviewer subagent approves?";
"Code quality reviewer subagent approves?" -> "Implementer subagent fixes quality issues" [label="no"];
"Implementer subagent fixes quality issues" -> "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [label="re-review"];
"Code quality reviewer subagent approves?" -> "Mark task complete in TodoWrite" [label="yes"];
"Mark task complete in TodoWrite" -> "More tasks remain?";
"Code quality reviewer subagent approves?" -> "Mark task complete in your task list" [label="yes"];
"Mark task complete in your task list" -> "More tasks remain?";
"More tasks remain?" -> "Dispatch implementer subagent (./implementer-prompt.md)" [label="yes"];
"More tasks remain?" -> "Dispatch final code reviewer subagent for entire implementation" [label="no"];
"Dispatch final code reviewer subagent for entire implementation" -> "Use superpowers:finishing-a-development-branch";
"More tasks remain?" -> "Final reviewer warranted?" [label="no"];
"Final reviewer warranted?" -> "Dispatch final code reviewer subagent for entire implementation" [label="yes"];
"Final reviewer warranted?" -> "Ask user permission\nto elide final review" [label="no — may be\noverkill"];
"Ask user permission\nto elide final review" -> "Check in with user\nbefore finishing";
"Dispatch final code reviewer subagent for entire implementation" -> "Check in with user\nbefore finishing";
"Check in with user\nbefore finishing" -> "Use superpowers:finishing-a-development-branch";
}
```
@@ -128,7 +142,7 @@ You: I'm using Subagent-Driven Development to execute this plan.
[Read plan file once: docs/superpowers/plans/feature-plan.md]
[Extract all 5 tasks with full text and context]
[Create TodoWrite with all tasks]
[Create your task list with all tasks]
Task 1: Hook installation script
@@ -233,7 +247,7 @@ Done!
**Never:**
- Start implementation on main/master branch without explicit user consent
- Skip reviews (spec compliance OR code quality)
- Skip any review without explicit user permission
- Proceed with unfixed issues
- Dispatch multiple implementation subagents in parallel (conflicts)
- Make subagent read plan file (provide full text instead)
@@ -258,7 +272,7 @@ Done!
**If subagent fails task:**
- Dispatch fix subagent with specific instructions
- Don't try to fix manually (context pollution)
- Don't try to fix manually — the orchestrator never implements or reviews code (context pollution)
## Integration


@@ -3,6 +3,10 @@ name: using-superpowers
description: Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions
---
<SUBAGENT-STOP>
If you were dispatched as a subagent to execute a specific task, skip this skill.
</SUBAGENT-STOP>
<EXTREMELY-IMPORTANT>
If you think there is even a 1% chance a skill might apply to what you are doing, you ABSOLUTELY MUST invoke the skill.
@@ -27,6 +31,10 @@ If CLAUDE.md says "don't use TDD" and a skill says "always use TDD," follow CLAU
**In other environments:** Check your platform's documentation for how skills are loaded.
## Platform Adaptation
Skills use Claude Code tool names. Non-CC platforms: see `references/codex-tools.md` for tool equivalents.
# Using Skills
## The Rule
@@ -36,6 +44,9 @@ If CLAUDE.md says "don't use TDD" and a skill says "always use TDD," follow CLAU
```dot
digraph skill_flow {
"User message received" [shape=doublecircle];
"About to EnterPlanMode?" [shape=doublecircle];
"Already brainstormed?" [shape=diamond];
"Invoke brainstorming skill" [shape=box];
"Might any skill apply?" [shape=diamond];
"Invoke Skill tool" [shape=box];
"Announce: 'Using [skill] to [purpose]'" [shape=box];
@@ -44,6 +55,11 @@ digraph skill_flow {
"Follow skill exactly" [shape=box];
"Respond (including clarifications)" [shape=doublecircle];
"About to EnterPlanMode?" -> "Already brainstormed?";
"Already brainstormed?" -> "Invoke brainstorming skill" [label="no"];
"Already brainstormed?" -> "Might any skill apply?" [label="yes"];
"Invoke brainstorming skill" -> "Might any skill apply?";
"User message received" -> "Might any skill apply?";
"Might any skill apply?" -> "Invoke Skill tool" [label="yes, even 1%"];
"Might any skill apply?" -> "Respond (including clarifications)" [label="definitely not"];
@@ -73,7 +89,6 @@ These thoughts mean STOP—you're rationalizing:
| "I'll just do this one thing first" | Check BEFORE doing anything. |
| "This feels productive" | Undisciplined action wastes time. Skills prevent this. |
| "I know what that means" | Knowing the concept ≠ using the skill. Invoke it. |
| "I should use EnterPlanMode / plan mode" | If a loaded skill specifies the next step, follow the skill. EnterPlanMode is a platform default — skills override defaults. |
## Skill Priority


@@ -0,0 +1,25 @@
# Codex Tool Mapping
Skills use Claude Code tool names. When you encounter these in a skill, use your platform equivalent:
| Skill references | Codex equivalent |
|-----------------|------------------|
| `Task` tool (dispatch subagent) | `spawn_agent` |
| Multiple `Task` calls (parallel) | Multiple `spawn_agent` calls |
| Task returns result | `wait` |
| Task completes automatically | `close_agent` to free slot |
| `TodoWrite` (task tracking) | `update_plan` |
| `Skill` tool (invoke a skill) | Skills load natively — just follow the instructions |
| `Read`, `Write`, `Edit` (files) | Use your native file tools |
| `Bash` (run commands) | Use your native shell tools |
## Subagent dispatch requires collab
Add to your Codex config (`~/.codex/config.toml`):
```toml
[features]
collab = true
```
This enables `spawn_agent`, `wait`, and `close_agent` for skills like `dispatching-parallel-agents` and `subagent-driven-development`.
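As an illustration only, the mapping table could be expressed as a small lookup. The names `CODEX_EQUIVALENTS`, `NATIVE_TOOLS`, and `toCodex` are hypothetical, and the `'native'` return value is an assumption standing in for "use your platform's own tooling".

```javascript
// Hypothetical lookup built from the mapping table; not part of the plugin.
const CODEX_EQUIVALENTS = new Map([
  ['Task', 'spawn_agent'],
  ['TodoWrite', 'update_plan'],
]);

// Tools the table says to replace with whatever the platform provides natively.
const NATIVE_TOOLS = new Set(['Skill', 'Read', 'Write', 'Edit', 'Bash']);

// Return the Codex tool name, 'native' for platform-provided tools,
// or null when the table has no entry for the given name.
function toCodex(toolName) {
  if (CODEX_EQUIVALENTS.has(toolName)) return CODEX_EQUIVALENTS.get(toolName);
  if (NATIVE_TOOLS.has(toolName)) return 'native';
  return null;
}

console.log(toCodex('Task'));      // prints spawn_agent
console.log(toCodex('TodoWrite')); // prints update_plan
```

The `wait` and `close_agent` rows describe lifecycle steps rather than one-to-one tool names, so they are left out of this sketch.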


@@ -1,19 +1,21 @@
---
name: writing-plans
description: Use when you have a spec or requirements for a multi-step task, before touching code. After brainstorming, ALWAYS use this — not EnterPlanMode or plan mode.
description: Use when you have a spec or requirements for a multi-step task, before touching code
---
# Writing Plans
## Overview
Scale the plan to the task. A one-file change doesn't need the same plan as a new subsystem. When you believe steps can be safely elided, ask the user for permission — don't elide silently, and don't follow the full process rigidly when it doesn't serve the work.
Write comprehensive implementation plans assuming the engineer has zero context for our codebase and questionable taste. Document everything they need to know: which files to touch for each task, code, testing, docs they might need to check, how to test it. Give them the whole plan as bite-sized tasks. DRY. YAGNI. TDD. Frequent commits.
Assume they are a skilled developer, but know almost nothing about our toolset or problem domain. Assume they don't know good test design very well.
**Announce at start:** "I'm using the writing-plans skill to create the implementation plan."
**Context:** This runs in the main workspace after brainstorming, while context is fresh. The worktree is created afterward for implementation.
**Context:** This should be run in a dedicated worktree (created by brainstorming skill).
**Save plans to:** `docs/superpowers/plans/YYYY-MM-DD-<feature-name>.md`
- (User preferences for plan location override this default)
@@ -62,7 +64,7 @@ This structure informs the task decomposition. Each task should produce self-con
## Task Structure
```markdown
````markdown
### Task N: [Component Name]
**Files:**
@@ -101,7 +103,7 @@ Expected: PASS
git add tests/path/test.py src/path/file.py
git commit -m "feat: add specific feature"
```
```
````
## Remember
- Exact file paths always
@@ -110,8 +112,39 @@ git commit -m "feat: add specific feature"
- Reference relevant skills with @ syntax
- DRY, YAGNI, TDD, frequent commits
## Process Flow
```dot
digraph writing_plans {
rankdir=TB;
node [shape=box];
announce [label="Announce skill usage"];
scope [label="Scope check"];
file_structure [label="Map file structure"];
write_tasks [label="Write bite-sized tasks\n(header, task structure, code)"];
review_needed [label="Review loop warranted?" shape=diamond];
ask_user [label="Ask user permission\nto elide review loop" shape=box];
user_says [label="User approves\neliding?" shape=diamond];
review_loop [label="Dispatch plan-document-reviewer\nper chunk; fix until ✅"];
save_plan [label="Save plan to\ndocs/superpowers/plans/"];
handoff [label="Execution handoff:\n\"Ready to execute?\""];
announce -> scope -> file_structure -> write_tasks -> review_needed;
review_needed -> review_loop [label="yes"];
review_needed -> ask_user [label="no — may be\noverkill"];
ask_user -> user_says;
user_says -> review_loop [label="no, do the review"];
user_says -> save_plan [label="yes, elide it"];
review_loop -> save_plan;
save_plan -> handoff;
}
```
## Plan Review Loop
**GATE — Do not elide without permission.** For small, single-file changes, the review loop may be unnecessary. If you believe it can be safely elided, you MUST ask the user before proceeding without it. Do not silently skip the review loop. Do not treat this as optional. Present your reasoning and wait for the user's answer.
After completing each chunk of the plan:
1. Dispatch plan-document-reviewer subagent (see plan-document-reviewer-prompt.md) for the current chunk
@@ -133,15 +166,19 @@ After completing each chunk of the plan:
After saving the plan:
**"Plan complete and saved to `docs/superpowers/plans/<filename>.md`. Ready to execute?"**
**1. Record context.** Before anything else, verify all artifacts are saved and the plan is self-contained:
- Spec document path (if one was written)
- Plan document path
- Key architectural decisions, constraints, or user preferences that affect implementation but aren't captured in the plan — add them to the plan now
**Execution path depends on harness capabilities:**
**2. Advise compaction.** Execution works better with a fresh window. Tell the user:
**If harness has subagents (Claude Code, etc.):**
- **REQUIRED:** Use superpowers:subagent-driven-development
- Do NOT offer a choice - subagent-driven is the standard approach
- Fresh subagent per task + two-stage review
> "The plan is saved to `docs/superpowers/plans/<filename>.md`. Before we start implementation, I recommend compacting this session — execution works better with a fresh window."
**If harness does NOT have subagents:**
- Execute plan in current session using superpowers:executing-plans
- Batch execution with checkpoints for review
**3. Give exact continuation prompt.** Tell the user exactly what to say after compacting. Use the actual filename, not a placeholder.
If you can dispatch subagents (Claude Code, etc.):
> "After compacting, say: **Execute the plan at `docs/superpowers/plans/<filename>.md` using subagent-driven-development.**"
If you cannot dispatch subagents, ask the user: "The plan is ready. I can't dispatch subagents in this environment — should I execute the tasks in this thread?"


@@ -306,6 +306,7 @@ digraph when_flowchart {
- Non-obvious decision points
- Process loops where you might stop too early
- "When to use A vs B" decisions
- **Behavioral gates** — decision diamonds in process flows act as enforcement mechanisms, not just documentation. Testing showed agents follow graphviz gates more reliably than prose instructions alone (writing-plans: 2/5 → 5/5 compliance after adding a process diagram with a gate diamond).
**Never use flowcharts for:**
- Reference material → Tables, lists
@@ -484,6 +485,18 @@ Write code before test? Delete it. Start over.
```
</Good>
### Use GATE Markers for Non-Optional Decision Points
Label decision points that must not be silently bypassed with `**GATE —**` followed by the constraint:
```markdown
**GATE — Do not elide without permission.** If you believe
the review loop can be safely skipped, you MUST ask the user
before proceeding. Present your reasoning and wait for their answer.
```
Agents treat GATE-marked instructions as harder constraints than unmarked prose. Pair with a decision diamond in the process flow diagram for strongest effect.
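A quick way to audit a skill for gates might look like the sketch below. It assumes the marker always begins with bold `GATE` followed by a space and a dash character, as in the example; `GATE_PATTERN` and `findGates` are hypothetical names, not an existing tool.

```javascript
// Hypothetical lint helper: list the lines of a skill file that carry a GATE marker.
// \u2014 is the dash character used in the marker text.
const GATE_PATTERN = /\*\*GATE \u2014/;

function findGates(markdown) {
  return markdown
    .split('\n')
    .map((text, index) => ({ line: index + 1, text }))
    .filter(entry => GATE_PATTERN.test(entry.text));
}

// Sample skill text; the gate wording is copied from the example in this section.
const skill = [
  '## Plan Review Loop',
  '',
  '**GATE \u2014 Do not elide without permission.** If you believe',
  'the review loop can be safely skipped, you MUST ask the user.',
].join('\n');

console.log(findGates(skill).map(g => g.line)); // prints [ 3 ]
```

Each reported line can then be checked by hand against the process diagram to confirm a matching decision diamond exists.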
### Address "Spirit vs Letter" Arguments
Add foundational principle early:


@@ -100,6 +100,32 @@ async function runTests() {
ws2.close();
console.log(' PASS');
// Test: Choice events written to .events file
console.log('Test: Choice events written to .events file');
const ws3 = new WebSocket(`ws://localhost:${TEST_PORT}`);
await new Promise(resolve => ws3.on('open', resolve));
ws3.send(JSON.stringify({ type: 'click', choice: 'a', text: 'Option A' }));
await sleep(300);
const eventsFile = path.join(TEST_DIR, '.events');
assert(fs.existsSync(eventsFile), '.events file should exist after choice click');
const lines = fs.readFileSync(eventsFile, 'utf-8').trim().split('\n');
const event = JSON.parse(lines[lines.length - 1]);
assert.strictEqual(event.choice, 'a', 'Event should contain choice');
assert.strictEqual(event.text, 'Option A', 'Event should contain text');
ws3.close();
console.log(' PASS');
// Test: .events cleared on new screen
console.log('Test: .events cleared on new screen');
// .events file should still exist from previous test
assert(fs.existsSync(path.join(TEST_DIR, '.events')), '.events should exist before new screen');
fs.writeFileSync(path.join(TEST_DIR, 'new-screen.html'), '<h2>New screen</h2>');
await sleep(500);
assert(!fs.existsSync(path.join(TEST_DIR, '.events')), '.events should be cleared after new screen');
console.log(' PASS');
// Test 5: Full HTML document served as-is (not wrapped)
console.log('Test 5: Full HTML document served without frame wrapping');
const fullDoc = '<!DOCTYPE html>\n<html><head><title>Custom</title></head><body><h1>Custom Page</h1></body></html>';
@@ -109,8 +135,8 @@ async function runTests() {
const fullRes = await fetch(`http://localhost:${TEST_PORT}/`);
assert(fullRes.body.includes('<h1>Custom Page</h1>'), 'Should contain original content');
assert(fullRes.body.includes('WebSocket'), 'Should still inject helper.js');
// Should NOT have the frame template's feedback footer
assert(!fullRes.body.includes('feedback-footer') || fullDoc.includes('feedback-footer'),
// Should NOT have the frame template's indicator bar
assert(!fullRes.body.includes('indicator-bar') || fullDoc.includes('indicator-bar'),
'Should not wrap full documents in frame template');
console.log(' PASS');
@@ -122,9 +148,8 @@ async function runTests() {
const fragRes = await fetch(`http://localhost:${TEST_PORT}/`);
// Should have the frame template structure
assert(fragRes.body.includes('feedback-footer'), 'Fragment should get feedback footer from frame');
assert(fragRes.body.includes('Brainstorm Companion'), 'Fragment should get header from frame');
assert(fragRes.body.includes('--bg-primary'), 'Fragment should get theme CSS from frame');
assert(fragRes.body.includes('indicator-bar'), 'Fragment should get indicator bar from frame');
assert(!fragRes.body.includes('<!-- CONTENT -->'), 'Content placeholder should be replaced');
// Should have the original content inside
assert(fragRes.body.includes('Pick a layout'), 'Fragment content should be present');
assert(fragRes.body.includes('data-choice="a"'), 'Fragment content should be intact');
@@ -138,14 +163,19 @@ async function runTests() {
path.join(__dirname, '../../lib/brainstorm-server/helper.js'), 'utf-8'
);
assert(helperContent.includes('toggleSelect'), 'helper.js should define toggleSelect');
assert(helperContent.includes('send'), 'helper.js should define send function');
assert(helperContent.includes('sendEvent'), 'helper.js should define sendEvent');
assert(helperContent.includes('selectedChoice'), 'helper.js should track selectedChoice');
assert(helperContent.includes('brainstorm'), 'helper.js should expose brainstorm API');
assert(!helperContent.includes('sendToClaude'), 'helper.js should not contain sendToClaude');
console.log(' PASS');
// Test 8: sendToClaude confirmation uses CSS variables (dark mode support)
console.log('Test 8: sendToClaude confirmation respects theming');
assert(!helperContent.includes('color: #666'), 'Should not use hardcoded light-mode colors');
assert(!helperContent.includes('color: #333'), 'Should not use hardcoded light-mode colors');
// Test 8: Indicator bar uses CSS variables (theme support)
console.log('Test 8: Indicator bar uses CSS variables');
const templateContent = fs.readFileSync(
path.join(__dirname, '../../lib/brainstorm-server/frame-template.html'), 'utf-8'
);
assert(templateContent.includes('indicator-bar'), 'Template should have indicator bar');
assert(templateContent.includes('indicator-text'), 'Template should have indicator text element');
console.log(' PASS');
console.log('\nAll tests passed!');


@@ -58,7 +58,6 @@ while [[ $# -gt 0 ]]; do
echo ""
echo "Tests:"
echo " test-subagent-driven-development.sh Test skill loading and requirements"
echo " test-brainstorm-handoff.sh Test brainstorm→writing-plans handoff"
echo ""
echo "Integration Tests (use --integration):"
echo " test-subagent-driven-development-integration.sh Full workflow execution"
@@ -75,7 +74,6 @@ done
# List of skill tests to run (fast unit tests)
tests=(
"test-subagent-driven-development.sh"
"test-brainstorm-handoff.sh"
)
# Integration tests (slow, full execution)


@@ -1,330 +0,0 @@
#!/usr/bin/env bash
# Test: Brainstorm-to-plan handoff (end-to-end)
#
# Full brainstorming flow that builds enough context distance to reproduce
# the EnterPlanMode failure. Simulates a real brainstorming session with
# multiple turns of Q&A before the "build it" moment.
#
# This test takes 5-10 minutes to run.
#
# PASS: Skill tool invoked with "writing-plans" AND EnterPlanMode NOT invoked
# FAIL: EnterPlanMode invoked OR writing-plans not invoked
#
# Usage:
# ./test-brainstorm-handoff-e2e.sh # With fix (expects PASS)
# ./test-brainstorm-handoff-e2e.sh --without-fix # Strip fix, reproduce failure
# ./test-brainstorm-handoff-e2e.sh --verbose # Show full output
#
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PLUGIN_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Parse flags
VERBOSE=false
WITHOUT_FIX=false
while [[ $# -gt 0 ]]; do
case $1 in
--verbose|-v) VERBOSE=true; shift ;;
--without-fix) WITHOUT_FIX=true; shift ;;
*) echo "Unknown flag: $1"; exit 1 ;;
esac
done
TIMESTAMP=$(date +%s)
OUTPUT_DIR="/tmp/superpowers-tests/${TIMESTAMP}/brainstorm-handoff-e2e"
mkdir -p "$OUTPUT_DIR"
echo "=== Brainstorm-to-Plan Handoff E2E Test ==="
echo "Mode: $([ "$WITHOUT_FIX" = true ] && echo "WITHOUT FIX (expect failure)" || echo "WITH FIX (expect pass)")"
echo "Output: $OUTPUT_DIR"
echo "This test takes 5-10 minutes."
echo ""
# --- Project Setup ---
PROJECT_DIR="$OUTPUT_DIR/project"
mkdir -p "$PROJECT_DIR/src" "$PROJECT_DIR/test"
cat > "$PROJECT_DIR/package.json" << 'PROJ_EOF'
{
"name": "my-express-app",
"version": "1.0.0",
"type": "module",
"scripts": {
"start": "node src/index.js",
"test": "vitest run"
},
"dependencies": {
"express": "^4.18.0",
"better-sqlite3": "^9.0.0"
},
"devDependencies": {
"vitest": "^1.0.0",
"supertest": "^6.0.0"
}
}
PROJ_EOF
cat > "$PROJECT_DIR/src/index.js" << 'PROJ_EOF'
import express from 'express';
const app = express();
app.use(express.json());
app.get('/health', (req, res) => res.json({ status: 'ok' }));
const PORT = process.env.PORT || 3000;
if (process.env.NODE_ENV !== 'test') {
app.listen(PORT, () => console.log(`Listening on ${PORT}`));
}
export default app;
PROJ_EOF
cd "$PROJECT_DIR"
git init -q
git add -A
git commit -q -m "Initial commit"
# --- Plugin Setup ---
EFFECTIVE_PLUGIN_DIR="$PLUGIN_DIR"
if [ "$WITHOUT_FIX" = true ]; then
echo "Creating plugin copy without the handoff fix..."
EFFECTIVE_PLUGIN_DIR="$OUTPUT_DIR/plugin-without-fix"
cp -R "$PLUGIN_DIR" "$EFFECTIVE_PLUGIN_DIR"
python3 << PYEOF
import pathlib
# Strip fix from brainstorming SKILL.md
p = pathlib.Path('$EFFECTIVE_PLUGIN_DIR/skills/brainstorming/SKILL.md')
content = p.read_text()
content = content.replace(
'**Implementation (if continuing):**\nWhen the user approves the design and wants to build:\n1. **Invoke \`superpowers:writing-plans\` using the Skill tool.** Not EnterPlanMode. Not plan mode. Not direct implementation. The Skill tool.\n2. After the plan is written, use superpowers:using-git-worktrees to create an isolated workspace for implementation.',
'**Implementation (if continuing):**\n- Ask: "Ready to set up for implementation?"\n- Use superpowers:using-git-worktrees to create isolated workspace\n- **REQUIRED:** Use superpowers:writing-plans to create detailed implementation plan'
)
p.write_text(content)
# Strip fix from using-superpowers
p = pathlib.Path('$EFFECTIVE_PLUGIN_DIR/skills/using-superpowers/SKILL.md')
lines = p.read_text().splitlines(keepends=True)
lines = [l for l in lines if 'I should use EnterPlanMode' not in l]
p.write_text(''.join(lines))
# Strip fix from writing-plans
p = pathlib.Path('$EFFECTIVE_PLUGIN_DIR/skills/writing-plans/SKILL.md')
content = p.read_text()
content = content.replace(
'description: Use when you have a spec or requirements for a multi-step task, before touching code. After brainstorming, ALWAYS use this — not EnterPlanMode or plan mode.',
'description: Use when you have a spec or requirements for a multi-step task, before touching code'
)
content = content.replace(
'**Context:** This runs in the main workspace after brainstorming, while context is fresh. The worktree is created afterward for implementation.',
'**Context:** This should be run in a dedicated worktree (created by brainstorming skill).'
)
p.write_text(content)
PYEOF
echo "Plugin copy created."
echo ""
fi
# --- Helper ---
run_turn() {
local turn_num="$1"
local prompt="$2"
local max_turns="$3"
local label="$4"
local continue_flag="${5:-}"
local log_file="$OUTPUT_DIR/turn${turn_num}.json"
echo ">>> Turn $turn_num: $label"
local cmd="timeout 300 claude -p \"$prompt\""
cmd="$cmd --plugin-dir \"$EFFECTIVE_PLUGIN_DIR\""
cmd="$cmd --dangerously-skip-permissions"
cmd="$cmd --max-turns $max_turns"
cmd="$cmd --output-format stream-json"
if [ -n "$continue_flag" ]; then
cmd="$cmd --continue"
fi
eval "$cmd" > "$log_file" 2>&1 || true
echo " Done."
if [ "$VERBOSE" = true ]; then
echo " ---"
grep '"type":"assistant"' "$log_file" 2>/dev/null | tail -1 | \
jq -r '.message.content[0].text // empty' 2>/dev/null | \
head -c 600 || true
echo ""
echo " ---"
fi
echo "$log_file"
}
# --- Run Full Brainstorming Flow ---
cd "$PROJECT_DIR"
# Turn 1: Start brainstorming - this loads the skill and begins Q&A
T1=$(run_turn 1 \
"I want to add URL shortening to this Express app. Help me think through the design." \
5 "Starting brainstorming")
# Turn 2: Answer first question (whatever it is) generically
T2=$(run_turn 2 \
"Good question. Here is what I want: POST /api/shorten that takes a URL and returns a short code. GET /:code that redirects. GET /api/stats/:code for click tracking. Random 6-char alphanumeric codes. SQLite storage using better-sqlite3 which is already in package.json. No auth needed." \
5 "Answering first question" --continue)
# Turn 3: Agree with recommendations
T3=$(run_turn 3 \
"Yes, that sounds right. Go with your recommendation." \
5 "Agreeing with recommendation" --continue)
# Turn 4: Continue agreeing
T4=$(run_turn 4 \
"Looks good. I agree with that approach." \
5 "Continuing to agree" --continue)
# Turn 5: Push toward completion
T5=$(run_turn 5 \
"Perfect. I am happy with all of that. Please wrap up the design and write the spec." \
8 "Requesting spec write-up" --continue)
# Turn 6: Approve the spec
T6=$(run_turn 6 \
"The spec looks great. I approve it." \
5 "Approving spec" --continue)
# Turn 7: THE CRITICAL MOMENT - "build it"
T7=$(run_turn 7 \
"Yes, build it." \
5 "Critical handoff: build it" --continue)
# Turn 8: Safety net in case turn 7 asked a follow-up
T8=$(run_turn 8 \
"Yes. Go ahead and build it now." \
5 "Safety net: build it" --continue)
echo ""
# --- Assertions ---
echo "=== Results ==="
echo ""
# Combine all logs
ALL_LOGS="$OUTPUT_DIR/all-turns.json"
cat "$OUTPUT_DIR"/turn*.json > "$ALL_LOGS" 2>/dev/null
# Check handoff turns (6-8, where approval + "build it" happens)
HANDOFF_LOGS="$OUTPUT_DIR/handoff-turns.json"
cat "$OUTPUT_DIR/turn6.json" "$OUTPUT_DIR/turn7.json" "$OUTPUT_DIR/turn8.json" > "$HANDOFF_LOGS" 2>/dev/null
# Detection: writing-plans skill invoked in handoff turns?
HAS_WRITING_PLANS=false
if grep -q '"name":"Skill"' "$HANDOFF_LOGS" 2>/dev/null && grep -q 'writing-plans' "$HANDOFF_LOGS" 2>/dev/null; then
HAS_WRITING_PLANS=true
fi
# Detection: EnterPlanMode invoked in handoff turns?
HAS_ENTER_PLAN_MODE=false
if grep -q '"name":"EnterPlanMode"' "$HANDOFF_LOGS" 2>/dev/null; then
HAS_ENTER_PLAN_MODE=true
fi
# Also check across ALL turns (might happen earlier)
HAS_ENTER_PLAN_MODE_ANYWHERE=false
if grep -q '"name":"EnterPlanMode"' "$ALL_LOGS" 2>/dev/null; then
HAS_ENTER_PLAN_MODE_ANYWHERE=true
fi
# Report
echo "Skills invoked (all turns):"
grep -o '"skill":"[^"]*"' "$ALL_LOGS" 2>/dev/null | sort -u || echo " (none)"
echo ""
echo "Skills invoked (handoff turns 6-8):"
grep -o '"skill":"[^"]*"' "$HANDOFF_LOGS" 2>/dev/null | sort -u || echo " (none)"
echo ""
echo "Tools invoked in handoff turns (6-8):"
grep -o '"name":"[A-Z][^"]*"' "$HANDOFF_LOGS" 2>/dev/null | sort | uniq -c | sort -rn | head -10 || echo " (none)"
echo ""
if [ "$HAS_ENTER_PLAN_MODE_ANYWHERE" = true ]; then
echo "WARNING: EnterPlanMode was invoked somewhere in the conversation."
echo "Turns containing EnterPlanMode:"
for f in "$OUTPUT_DIR"/turn*.json; do
if grep -q '"name":"EnterPlanMode"' "$f" 2>/dev/null; then
echo " $(basename "$f")"
fi
done
echo ""
fi
# Determine result
PASSED=false
if [ "$WITHOUT_FIX" = true ]; then
echo "--- Without-Fix Mode (reproducing failure) ---"
if [ "$HAS_ENTER_PLAN_MODE" = true ] || [ "$HAS_ENTER_PLAN_MODE_ANYWHERE" = true ]; then
echo "REPRODUCED: Claude used EnterPlanMode (the bug we're fixing)"
PASSED=true
elif [ "$HAS_WRITING_PLANS" = true ]; then
echo "NOT REPRODUCED: Claude used writing-plans even without the fix"
echo "(The old guidance was sufficient in this run)"
PASSED=false
else
echo "INCONCLUSIVE: Claude used neither writing-plans nor EnterPlanMode"
echo "The brainstorming flow may not have reached the handoff point."
PASSED=false
fi
else
echo "--- With-Fix Mode (verifying fix) ---"
if [ "$HAS_WRITING_PLANS" = true ] && [ "$HAS_ENTER_PLAN_MODE_ANYWHERE" = false ]; then
echo "PASS: Claude used writing-plans skill (correct handoff)"
PASSED=true
# Check the combined case first; after the EnterPlanMode-only branch it would be unreachable
elif [ "$HAS_WRITING_PLANS" = true ] && [ "$HAS_ENTER_PLAN_MODE_ANYWHERE" = true ]; then
echo "FAIL: Claude used BOTH writing-plans AND EnterPlanMode"
PASSED=false
elif [ "$HAS_ENTER_PLAN_MODE_ANYWHERE" = true ]; then
echo "FAIL: Claude used EnterPlanMode instead of writing-plans"
PASSED=false
else
echo "INCONCLUSIVE: Claude used neither writing-plans nor EnterPlanMode"
echo "The brainstorming flow may not have reached the handoff."
echo "Check logs to see where the conversation stopped."
PASSED=false
fi
fi
echo ""
# Show what happened in each turn
echo "Turn-by-turn summary:"
for i in 1 2 3 4 5 6 7 8; do
local_log="$OUTPUT_DIR/turn${i}.json"
if [ -f "$local_log" ]; then
local_skills=$(grep -o '"skill":"[^"]*"' "$local_log" 2>/dev/null | tr '\n' ' ' || true)
local_tools=$(grep -o '"name":"EnterPlanMode\|"name":"Skill"' "$local_log" 2>/dev/null | tr '\n' ' ' || true)
local_size=$(wc -c < "$local_log" | tr -d ' ')
printf " Turn %d: %s bytes" "$i" "$local_size"
[ -n "$local_skills" ] && printf " | skills: %s" "$local_skills"
[ -n "$local_tools" ] && printf " | tools: %s" "$local_tools"
echo ""
fi
done
echo ""
echo "Logs: $OUTPUT_DIR"
echo ""
if [ "$PASSED" = true ]; then
exit 0
else
exit 1
fi
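
Note on the detection in this script: the writing-plans check succeeds when `"name":"Skill"` and `writing-plans` each appear anywhere in the handoff logs, so a Skill call for a different skill plus a plain-text mention of writing-plans would also count as a hit. A minimal sketch of a stricter check follows. It assumes stream-json writes one JSON object per line, so a tool call and its input land on the same line; `skill_invoked_in` is a hypothetical helper name, not something the script defines.

```shell
# Hypothetical helper (not part of the original script).
# Succeeds only if a single log line contains both the Skill tool call
# and the given skill name.
# Assumption: each stream-json event occupies exactly one line.
#   $1 = log file, $2 = skill name (e.g. writing-plans)
skill_invoked_in() {
    [ -f "$1" ] || return 1
    # A single awk process avoids SIGPIPE surprises under "set -o pipefail".
    awk -v skill="$2" '
        index($0, "\"name\":\"Skill\"") && index($0, skill) { found = 1; exit }
        END { exit !found }
    ' "$1"
}
```

It could stand in for the two-grep test, for example `if skill_invoked_in "$HANDOFF_LOGS" writing-plans; then HAS_WRITING_PLANS=true; fi`.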


@@ -1,317 +0,0 @@
#!/usr/bin/env bash
# Test: Brainstorm-to-plan handoff
#
# Verifies that after brainstorming, Claude invokes the writing-plans skill
# instead of using EnterPlanMode.
#
# The failure mode this catches:
# User says "build it" after brainstorming -> Claude calls EnterPlanMode
# (because the system prompt's planning guidance overpowers the brainstorming
# skill's instructions, which were loaded many turns ago)
#
# PASS: Skill tool invoked with "writing-plans" AND EnterPlanMode NOT invoked
# FAIL: EnterPlanMode invoked OR writing-plans not invoked
#
# Usage:
# ./test-brainstorm-handoff.sh # Normal test (expects PASS)
# ./test-brainstorm-handoff.sh --without-fix # Strip fix, reproduce failure
# ./test-brainstorm-handoff.sh --verbose # Show full output
#
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PLUGIN_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Parse flags
VERBOSE=false
WITHOUT_FIX=false
while [[ $# -gt 0 ]]; do
case $1 in
--verbose|-v) VERBOSE=true; shift ;;
--without-fix) WITHOUT_FIX=true; shift ;;
*) echo "Unknown flag: $1"; exit 1 ;;
esac
done
TIMESTAMP=$(date +%s)
OUTPUT_DIR="/tmp/superpowers-tests/${TIMESTAMP}/brainstorm-handoff"
mkdir -p "$OUTPUT_DIR"
echo "=== Brainstorm-to-Plan Handoff Test ==="
echo "Mode: $([ "$WITHOUT_FIX" = true ] && echo "WITHOUT FIX (expect failure)" || echo "WITH FIX (expect pass)")"
echo "Output: $OUTPUT_DIR"
echo ""
# --- Project Setup ---
PROJECT_DIR="$OUTPUT_DIR/project"
mkdir -p "$PROJECT_DIR/src"
mkdir -p "$PROJECT_DIR/docs/superpowers/specs"
cat > "$PROJECT_DIR/package.json" << 'PROJ_EOF'
{
"name": "my-express-app",
"version": "1.0.0",
"type": "module",
"dependencies": {
"express": "^4.18.0",
"better-sqlite3": "^9.0.0"
}
}
PROJ_EOF
cat > "$PROJECT_DIR/src/index.js" << 'PROJ_EOF'
import express from 'express';
const app = express();
app.use(express.json());
app.get('/health', (req, res) => res.json({ status: 'ok' }));
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Listening on ${PORT}`));
PROJ_EOF
# Pre-create a spec document (simulating completed brainstorming)
cat > "$PROJECT_DIR/docs/superpowers/specs/2025-01-15-url-shortener-design.md" << 'SPEC_EOF'
# URL Shortener Design Spec
## Overview
Add URL shortening capability to the existing Express.js API.
## Features
- POST /api/shorten accepts { url } and returns { shortCode, shortUrl }
- GET /:code redirects to the original URL (302)
- GET /api/stats/:code returns { clicks, createdAt, originalUrl }
## Technical Design
### Database
Single SQLite table via better-sqlite3:
```sql
CREATE TABLE urls (
id INTEGER PRIMARY KEY AUTOINCREMENT,
short_code TEXT UNIQUE NOT NULL,
original_url TEXT NOT NULL,
clicks INTEGER DEFAULT 0,
created_at TEXT DEFAULT (datetime('now'))
);
CREATE INDEX idx_short_code ON urls(short_code);
```
### File Structure
- `src/index.js` — modified to mount new routes
- `src/db.js` — database initialization and query functions
- `src/shorten.js` — route handlers for all three endpoints
- `src/code-generator.js` — random 6-char alphanumeric code generation
### Code Generation
Random 6-character alphanumeric codes using crypto.randomBytes.
Check for collisions and retry (rare while the table is small relative to the 36^6 ≈ 2.2 billion code space).
### Validation
- URL must be present and start with http:// or https://
- Return 400 with { error: "..." } for invalid input
### Error Handling
- 404 with { error: "Not found" } for unknown short codes
- 500 with { error: "Internal server error" } for database failures
## Decisions
- 302 redirects (not 301) so browsers don't cache and we always track clicks
- Database path configurable via DATABASE_PATH env var, defaults to ./data/urls.db
- No auth, no custom codes, no expiry — keeping it simple
SPEC_EOF
# Initialize git so brainstorming can inspect project state
cd "$PROJECT_DIR"
git init -q
git add -A
git commit -q -m "Initial commit with URL shortener spec"
# --- Plugin Setup ---
EFFECTIVE_PLUGIN_DIR="$PLUGIN_DIR"
if [ "$WITHOUT_FIX" = true ]; then
echo "Creating plugin copy without the handoff fix..."
EFFECTIVE_PLUGIN_DIR="$OUTPUT_DIR/plugin-without-fix"
cp -R "$PLUGIN_DIR" "$EFFECTIVE_PLUGIN_DIR"
# Strip fix from brainstorming SKILL.md: revert to old implementation section
python3 << PYEOF
import pathlib
p = pathlib.Path('$EFFECTIVE_PLUGIN_DIR/skills/brainstorming/SKILL.md')
content = p.read_text()
content = content.replace(
'**Implementation (if continuing):**\nWhen the user approves the design and wants to build:\n1. **Invoke \`superpowers:writing-plans\` using the Skill tool.** Not EnterPlanMode. Not plan mode. Not direct implementation. The Skill tool.\n2. After the plan is written, use superpowers:using-git-worktrees to create an isolated workspace for implementation.',
'**Implementation (if continuing):**\n- Ask: "Ready to set up for implementation?"\n- Use superpowers:using-git-worktrees to create isolated workspace\n- **REQUIRED:** Use superpowers:writing-plans to create detailed implementation plan'
)
p.write_text(content)
PYEOF
# Strip fix from using-superpowers: remove EnterPlanMode red flag
python3 << PYEOF
import pathlib
p = pathlib.Path('$EFFECTIVE_PLUGIN_DIR/skills/using-superpowers/SKILL.md')
lines = p.read_text().splitlines(keepends=True)
lines = [l for l in lines if 'I should use EnterPlanMode' not in l]
p.write_text(''.join(lines))
PYEOF
# Strip fix from writing-plans: revert description and context
python3 << PYEOF
import pathlib
p = pathlib.Path('$EFFECTIVE_PLUGIN_DIR/skills/writing-plans/SKILL.md')
content = p.read_text()
content = content.replace(
'description: Use when you have a spec or requirements for a multi-step task, before touching code. After brainstorming, ALWAYS use this — not EnterPlanMode or plan mode.',
'description: Use when you have a spec or requirements for a multi-step task, before touching code'
)
content = content.replace(
'**Context:** This runs in the main workspace after brainstorming, while context is fresh. The worktree is created afterward for implementation.',
'**Context:** This should be run in a dedicated worktree (created by brainstorming skill).'
)
p.write_text(content)
PYEOF
echo "Plugin copy created at $EFFECTIVE_PLUGIN_DIR"
echo ""
fi
# --- Run Conversation ---
cd "$PROJECT_DIR"
# Turn 1: Load brainstorming and establish that we finished the design
# The key is that brainstorming gets loaded into context, and we're at the handoff point
echo ">>> Turn 1: Loading brainstorming skill and establishing context..."
TURN1_LOG="$OUTPUT_DIR/turn1.json"
TURN1_PROMPT='I want to add URL shortening to this Express app. I already have the full design worked out and written to docs/superpowers/specs/2025-01-15-url-shortener-design.md. Please read the spec.'
timeout 300 claude -p "$TURN1_PROMPT" \
--plugin-dir "$EFFECTIVE_PLUGIN_DIR" \
--dangerously-skip-permissions \
--max-turns 5 \
--output-format stream-json \
> "$TURN1_LOG" 2>&1 || true
echo "Turn 1 complete."
if [ "$VERBOSE" = true ]; then
echo "---"
grep '"type":"assistant"' "$TURN1_LOG" | tail -1 | jq -r '.message.content[0].text // empty' 2>/dev/null | head -c 800 || true
echo ""
echo "---"
fi
echo ""
# Turn 2: Approve and ask to build - this is the critical handoff moment
echo ">>> Turn 2: 'The spec is done. Build it.' (critical handoff)..."
TURN2_LOG="$OUTPUT_DIR/turn2.json"
TURN2_PROMPT='The spec is complete and I am happy with the design. Build it.'
timeout 300 claude -p "$TURN2_PROMPT" \
--continue \
--plugin-dir "$EFFECTIVE_PLUGIN_DIR" \
--dangerously-skip-permissions \
--max-turns 5 \
--output-format stream-json \
> "$TURN2_LOG" 2>&1 || true
echo "Turn 2 complete."
if [ "$VERBOSE" = true ]; then
echo "---"
grep '"type":"assistant"' "$TURN2_LOG" | tail -1 | jq -r '.message.content[0].text // empty' 2>/dev/null | head -c 800 || true
echo ""
echo "---"
fi
echo ""
# --- Assertions ---
echo "=== Results ==="
echo ""
# Combine all turn logs for analysis
ALL_LOGS="$OUTPUT_DIR/all-turns.json"
cat "$TURN1_LOG" "$TURN2_LOG" > "$ALL_LOGS"
# Detection: writing-plans skill invoked?
HAS_WRITING_PLANS=false
if grep -q '"name":"Skill"' "$ALL_LOGS" 2>/dev/null && grep -q 'writing-plans' "$ALL_LOGS" 2>/dev/null; then
HAS_WRITING_PLANS=true
fi
# Detection: EnterPlanMode invoked?
HAS_ENTER_PLAN_MODE=false
if grep -q '"name":"EnterPlanMode"' "$ALL_LOGS" 2>/dev/null; then
HAS_ENTER_PLAN_MODE=true
fi
# Report what skills were invoked
echo "Skills invoked:"
grep -o '"skill":"[^"]*"' "$ALL_LOGS" 2>/dev/null | sort -u || echo " (none)"
echo ""
echo "Notable tools invoked:"
grep -o '"name":"[A-Z][^"]*"' "$ALL_LOGS" 2>/dev/null | sort | uniq -c | sort -rn | head -10 || echo " (none)"
echo ""
# Determine result
PASSED=false
if [ "$WITHOUT_FIX" = true ]; then
# In without-fix mode, we EXPECT the failure (EnterPlanMode)
echo "--- Without-Fix Mode (reproducing failure) ---"
if [ "$HAS_ENTER_PLAN_MODE" = true ]; then
echo "REPRODUCED: Claude used EnterPlanMode (the bug we're fixing)"
PASSED=true
elif [ "$HAS_WRITING_PLANS" = true ]; then
echo "NOT REPRODUCED: Claude used writing-plans even without the fix"
echo "(The model may have followed the old guidance anyway)"
PASSED=false
else
echo "INCONCLUSIVE: Claude used neither writing-plans nor EnterPlanMode"
echo "The brainstorming flow may not have reached the handoff point."
PASSED=false
fi
else
# Normal mode: expect writing-plans, not EnterPlanMode
echo "--- With-Fix Mode (verifying fix) ---"
if [ "$HAS_WRITING_PLANS" = true ] && [ "$HAS_ENTER_PLAN_MODE" = false ]; then
echo "PASS: Claude used writing-plans skill (correct handoff)"
PASSED=true
# Check the combined case first; after the EnterPlanMode-only branch it would be unreachable
elif [ "$HAS_WRITING_PLANS" = true ] && [ "$HAS_ENTER_PLAN_MODE" = true ]; then
echo "FAIL: Claude used BOTH writing-plans AND EnterPlanMode"
PASSED=false
elif [ "$HAS_ENTER_PLAN_MODE" = true ]; then
echo "FAIL: Claude used EnterPlanMode instead of writing-plans"
PASSED=false
else
echo "INCONCLUSIVE: Claude used neither writing-plans nor EnterPlanMode"
echo "The brainstorming flow may not have reached the handoff point."
echo "Check logs - brainstorming may still be asking questions."
PASSED=false
fi
fi
echo ""
# Show the critical turn 2 response
echo "Turn 2 response (first 500 chars):"
grep '"type":"assistant"' "$TURN2_LOG" 2>/dev/null | tail -1 | \
jq -r '.message.content[0].text // .message.content' 2>/dev/null | \
head -c 500 || echo " (could not extract)"
echo ""
echo ""
echo "Logs:"
echo " Turn 1: $TURN1_LOG"
echo " Turn 2: $TURN2_LOG"
echo " Combined: $ALL_LOGS"
echo ""
if [ "$PASSED" = true ]; then
exit 0
else
exit 1
fi
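
Note on the with-fix verdict: it depends on two flags, and branch order matters, because the combined case has to be tested before the EnterPlanMode-only case or it can never be reported. One way to keep the ordering explicit is to move the decision into a small function, sketched below. `classify_handoff` and its verdict strings are hypothetical names for illustration, not part of the script.

```shell
# Hypothetical helper (not part of the original script).
# Maps the two detection flags to a verdict string.
#   $1 = "true" if the writing-plans skill was invoked
#   $2 = "true" if EnterPlanMode was invoked
classify_handoff() {
    if [ "$1" = true ] && [ "$2" = true ]; then
        echo "FAIL_BOTH"            # most specific case is checked first
    elif [ "$2" = true ]; then
        echo "FAIL_PLAN_MODE"
    elif [ "$1" = true ]; then
        echo "PASS"
    else
        echo "INCONCLUSIVE"
    fi
}
```

A caller could then branch on `$(classify_handoff "$HAS_WRITING_PLANS" "$HAS_ENTER_PLAN_MODE")` with a `case` statement and set `PASSED` in one place.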


@@ -77,6 +77,7 @@ claude -p "$PROMPT" \
--plugin-dir "$PLUGIN_DIR" \
--dangerously-skip-permissions \
--output-format stream-json \
--verbose \
> "$LOG_FILE" 2>&1 || true
# Extract final stats