Compare commits

10 Commits

Author SHA1 Message Date
Mohamed Hegazy
37ffc76005 security-guidance: emit findings via hookSpecificOutput.additionalContext (#1358 #1375 #1783)
Fixes #1358, #1375, and #1783 — three related complaints about the
hook output protocol used at the three asyncRewake exit-2 sites
(handle_commit_review_posttooluse, handle_push_sweep_posttooluse,
handle_stop_hook).

The old shape at each site was:

  emit_metrics({...})                              # JSON to stdout (metrics)
  sys.stderr.write(banner + guidance + suffix)     # plain text to stderr
  sys.exit(2)                                      # asyncRewake trigger

That triggered three reported problems:

  #1375: CC's hook system parsing stdout for a SyncHookJSONOutput sees
         only the bare metrics dict — no findings reason — and on older
         CC versions surfaces a 'json output validation failed' error
         because stderr's plain text isn't valid JSON.
  #1783: CC's UI shows 'Permission to use Edit has been denied' with no
         permissionDecisionReason — the stderr text is invisible to that
         UI surface; CC only renders fields it can find in the JSON.
  #1358: Reporters experienced the exit(2) as 'gating' behavior rather
         than 'warning' behavior. The pattern-warning path in main()
         was migrated to exit(0) + hookSpecificOutput.additionalContext
         long ago; these three asyncRewake sites were never updated.

Fix: extend emit_metrics() to accept additional_context, system_message,
and hook_event_name kwargs, and emit them in the same SyncHookJSONOutput
line as the metrics. CC's parser stops scanning stdout after the first
{-prefixed line, so the findings must ride in that same line — calling
emit_metrics twice or adding a second print(json.dumps(...)) would
silently drop the second emission.
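
A minimal sketch of the single-line shape described above, not the plugin's actual emit_metrics. The function name emit_metrics_sketch, the top-level "metrics" key, and the default event name are illustrative assumptions; only hookSpecificOutput.additionalContext and systemMessage are named in this commit message, and "hookEventName" is assumed to be the JSON spelling of hook_event_name.

```python
import json
import sys


def emit_metrics_sketch(metrics, additional_context=None, system_message=None,
                        hook_event_name="PostToolUse", out=sys.stdout):
    """Write metrics and optional findings as ONE JSON line (hypothetical helper)."""
    payload = {"metrics": dict(metrics)}  # assumed key; the real shape is not shown here
    if additional_context:
        # Empty or None context adds no hookSpecificOutput block at all.
        payload["hookSpecificOutput"] = {
            "hookEventName": hook_event_name,  # assumed spelling
            "additionalContext": additional_context,
        }
    if system_message:
        payload["systemMessage"] = system_message
    # json.dumps without indent never emits a raw newline, so the findings
    # and the metrics always share a single stdout line.
    line = json.dumps(payload, ensure_ascii=False)
    out.write(line + "\n")
    return line
```

Because newlines inside the guidance text are escaped by json.dumps, multi-line findings still travel on one physical line.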

At each of the three call sites: route the guidance text that used to
go to stderr through additional_context instead. The stderr.write is
dropped — additionalContext carries the same text to the model via the
JSON channel, and the legacy stderr surface is what triggered #1375's
JSON validation error on older CC clients.

exit(2) is preserved at all three sites. That's the documented mechanism
for triggering the asyncRewake 'force fix' feedback loop (per the
inline comment at the stop-hook site); switching to exit(0) without
verifying CC's protocol-version support risks dropping the rewake
entirely and silently losing all the findings the hook just computed.

For push-sweep specifically: emit_metrics had to move from an
unconditional pre-emission (line ~1680) to two conditional sites (one
in the no-vulns branch with exit(0), one in the with-vulns branch with
exit(2)) because the with-vulns branch needs to attach additional_context
and CC reads only the first JSON line — a second emit would be ignored.
Behavior is preserved: every push-sweep fire emits exactly one metrics
line, just at a slightly later point in the function body.
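
To illustrate why a second emission would be lost, here is a toy model of the "first {-prefixed line wins" behavior this message attributes to CC. It is an assumption-based sketch for reasoning about the emit order, not CC's actual parser; first_json_line is a hypothetical name.

```python
import json


def first_json_line(stdout_text):
    """Parse the first line starting with '{' and ignore everything after it."""
    for line in stdout_text.splitlines():
        if line.lstrip().startswith("{"):
            return json.loads(line)
    return None


# Two emissions: under this model only the first one is ever seen.
two_emits = '{"metrics": {"n": 1}}\n{"hookSpecificOutput": {"additionalContext": "x"}}\n'
seen = first_json_line(two_emits)
print(seen)
```

Under this model the findings in the second line never reach the consumer, which is why each push-sweep branch emits exactly once.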

Verified locally on macOS Python 3.13:

  - py_compile clean.
  - Existing 45 smoke + extensibility tests still pass.
  - 21 new tests in test_hook_output_protocol.py (added to internal
    test suite at sg-staging/tests/, not in this PR):

      * 6 backward-compat: emit_metrics with metrics only, with
        rewake_summary, etc. — verifies the legacy callers still
        produce the same output shape.
      * 5 additional_context shape: lands in hookSpecificOutput,
        round-trips the value, default hook_event_name is sensible,
        empty/None doesn't pollute the JSON with an empty hSO block.
      * 3 system_message shape: lands in systemMessage, empty/None
        suppressed, round-trips.
      * 1 combined: metrics + rewake_summary + additional_context +
        system_message + hook_event_name all merge into one JSON line.
      * 6 round-trip safety: emoji, quotes, backslashes, newlines,
        Unicode (山田太郎 + 🎉), tabs, null bytes — all survive the
        json.dumps cycle.
      * 6 static-shape: each of the three asyncRewake handlers
        (commit_review, push_sweep, stop_hook) is checked to confirm
        it passes additional_context to emit_metrics and no longer
        writes the PROVENANCE_BANNER guidance to stderr. Catches the
        regression class where a new exit(2) site forgets to plumb
        guidance through the JSON channel.

  - 66/66 pass total (45 existing + 21 new) in 2.57s.

NOT verified end-to-end with a real CC instance triggering all three
hooks. The static-shape tests + the JSON round-trip tests should catch
any regression in the emit_metrics output, but the actual interaction
with CC's asyncRewake / rewakeMessage flow (especially: does
hookSpecificOutput.additionalContext successfully appear in the
rewakeMessage that CC sends to the model?) needs runtime verification
against a CC version that supports the modern protocol.

The reporter for #1375 specifically called out that CC's older
versions surfaced 'json output validation failed' on the old stderr-
only output; this fix changes the stdout shape to valid JSON with the
findings included, which should resolve that error class.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:53:04 -07:00
Mohamed Hegazy
68a700837c Merge pull request #2075 from anthropics/fix-2056-windows-unicode-decode
security-guidance: lenient UTF-8 decode in 6 git-subprocess helpers (#2056)
2026-05-28 23:36:36 -07:00
Mohamed Hegazy
3d349d40b9 Merge pull request #2074 from anthropics/fix-xss-rules-non-js-false-positives
security-guidance: gate XSS pattern rules to JS-family files
2026-05-28 23:18:17 -07:00
Mohamed Hegazy
6a63e35e75 security-guidance: lenient UTF-8 decode in 6 git-subprocess helpers (#2056)
Fixes anthropics/claude-plugins-official#2056. On Windows, when the
worktree contains an untracked file whose name, encoded as UTF-8, includes
a byte that is undefined in cp1252 (accented capitals like Á Í Ï Ð Ý, most
CJK, emoji), the UserPromptSubmit hook crashes:

  Exception in thread Thread-5 (_readerthread):
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81
  Traceback (most recent call last):
    File diffstate.py, line 338, in _list_untracked
      for p in r.stdout.split('\0'):
  AttributeError: 'NoneType' object has no attribute 'split'

Non-blocking (UPS failures still let the prompt through) but the
baseline-untracked snapshot is silently lost, so the Stop-hook review
mis-handles pre-existing untracked files.

Root cause (reporter's diagnosis, verified):

1. core.quotePath=false makes git emit raw UTF-8 for non-ASCII filenames.
2. subprocess.run(..., text=True) decodes via
   locale.getpreferredencoding(False) in strict mode — on Windows that
   is cp1252, in which 0x81 / 0x8D / 0x8F / 0x90 / 0x9D are undefined.
   Those bytes appear in the UTF-8 encodings of Á (C3 81), Í (C3 8D),
   Ï (C3 8F), Ð (C3 90), Ý (C3 9D), and a large fraction of CJK / emoji
   codepoints.
3. The decode runs in the subprocess reader thread. The thread raises
   UnicodeDecodeError, threading prints 'Exception in thread Thread-N',
   subprocess.run returns with stdout=None. The handler then does
   None.split('\0') -> AttributeError, which is NOT in the narrow
   except (TimeoutExpired, FileNotFoundError, OSError) tuple, so it
   escapes the helper, propagates out of UserPromptSubmit's
   ThreadPoolExecutor.result(), and exits the hook non-zero.
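
The decode failure in step 2 can be reproduced on any platform by naming the codec explicitly. This is a small illustration of the mechanism only; decode_strict_cp1252 is a hypothetical helper and does not appear in the plugin.

```python
# Bytes git would emit for this filename when core.quotePath=false.
raw = "Ávila_report.txt".encode("utf-8")  # begins with C3 81


def decode_strict_cp1252(data):
    """Mimic a strict cp1252 decode; return (text, error)."""
    try:
        return data.decode("cp1252"), None
    except UnicodeDecodeError as exc:
        return None, exc


text, err = decode_strict_cp1252(raw)
print(err)  # the 'charmap' codec error names the undefined byte

# The lenient decode recovers the name, and maps stray bytes to U+FFFD.
lenient = raw.decode("utf-8", errors="replace")
print(lenient)
```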

This is internally inconsistent: gitutil._git_diff_range,
security_reminder_hook._reflog_amend_lookup (line ~540), and the commit
diff loop (line ~1115) already do bytes + decode utf-8/replace, with
comments explicitly noting that text=True would crash. The fix below
extends that established pattern to the helpers that were holdouts.

Affected helpers (6 total):

  - diffstate._list_untracked            <- reporter, hot path, CRITICAL
  - diffstate.capture_git_baseline       <- reporter, latent
  - diffstate.get_baseline_file_content  <- audit, file content read, HIGH
  - gitutil._git_name_only                <- reporter, latent
  - gitutil._git_status_porcelain         <- reporter, latent
  - gitutil._git_reflog_recent_commits    <- audit, embeds %gs commit msg, HIGH

For each one:

  - Drop text=True from subprocess.run.
  - Decode r.stdout / r.stderr as .decode('utf-8', errors='replace').
  - Add ValueError to the except tuple as defense against any future
    strict-decode regression (UnicodeDecodeError is a ValueError
    subclass; including it explicitly degrades the helper to its
    empty/None return instead of escaping out of the hook).
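
The three steps above can be sketched as one generic helper. run_lenient is a hypothetical name used for illustration; the real helpers each keep their own return types and logging.

```python
import subprocess
import sys


def run_lenient(cmd, cwd=None, timeout=15):
    """Run cmd and return stdout decoded leniently, or None on any failure."""
    try:
        # No text=True: capture raw bytes so no locale-dependent strict
        # decode happens in the subprocess reader thread.
        r = subprocess.run(cmd, cwd=cwd, capture_output=True, timeout=timeout)
        if r.returncode != 0:
            return None
        return (r.stdout or b"").decode("utf-8", errors="replace")
    except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError):
        # ValueError also covers UnicodeDecodeError, its subclass.
        return None


# Demo: a child process writes NUL-separated names, one with an invalid byte.
child = "import sys; sys.stdout.buffer.write(b'\\xc3\\x81vila.txt\\0\\x81bad\\0')"
names = [p for p in (run_lenient([sys.executable, "-c", child]) or "").split("\0") if p]
print(names)
```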

Verified locally on macOS Python 3.13:

  - py_compile clean on both files.
  - 45 existing smoke + extensibility tests still pass.
  - 21 new internal tests (not in this PR — added to the team's local
    test suite at staging/tests/test_unicode_decode.py):
      * 18 static-shape parametrized: each of the 6 fixed helpers has
        no text=True in its subprocess calls, contains errors='replace',
        and lists ValueError in its except.
      * Deterministic end-to-end: create real git repo + Ávila_report.txt
        untracked, call _list_untracked, verify it returns
        {'Ávila_report.txt': <mtime>} without crashing.
      * Deterministic end-to-end: same for capture_git_baseline (verifies
        the latent stderr-warning case stays valid).
      * Deterministic end-to-end: get_baseline_file_content on a file
        whose content has 山田太郎 + 🎉; verify the bytes round-trip
        through the decode.
  - 66/66 tests pass total (45 existing + 21 new).

NOT verified end-to-end on Windows — would need actual cp1252 strict
decode to fire. Reporter has the deterministic repro and will
re-verify on their Win11 / Python 3.14.x setup before merge.

Not in this PR (defense-in-depth, lower risk):

  - 3 git rev-parse calls returning path output (gitutil._find_git_index,
    _git_toplevel, _git_dir) could fail on Windows if cwd is in a
    non-ASCII install directory. Same fix shape but unreported and
    much lower probability — worth a separate follow-up if anyone
    actually hits it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:15:16 -07:00
Mohamed Hegazy
12a5376e20 security-guidance: gate XSS pattern rules to JS-family files
Closes #410, #2037, #2045, #1640, #1280, #1329, #1341, #255,
anthropics/claude-code#46720 (partial closes on overlap with other rules).

The plugin's substring-only XSS / browser-DOM rules
(new_function_injection, react_dangerously_set_html, document_write_xss,
innerHTML_xss, outerHTML_xss, insertAdjacentHTML_xss) fired on any file
containing the trigger substring — including:

  * Markdown documentation explaining XSS sinks
  * Blog posts / READMEs that name browser APIs
  * Python tutorials referencing dangerouslySetInnerHTML
  * Plugin skill files with example HTML strings
  * .yaml / .json configs that happen to contain the literal string
  * .gitignore / Dockerfile / Makefile

These constructs have no meaning outside JS/TS source. Add a
path_filter: lambda p: p.endswith(_JS_EXTS) to each so they fire only
on .js, .jsx, .ts, .tsx, .mjs, .cjs, .mts, .cts, .vue, .svelte.

Cross-checked against the existing _JS_EXTS-gated rules
(regex_exec_substring, child_process_exec, exec_substring) — same
pattern, same constant, same intent. Uses the module-level _JS_EXTS
tuple so future extension changes propagate to all 6 rules atomically.
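
A minimal sketch of the gating idea, assuming a rule is a dict with a name, a trigger substring, and an optional path_filter. The dict layout, the ".innerHTML" trigger, and the matching_rules helper are assumptions for illustration; only the _JS_EXTS name, its extension list, and the endswith-based filter come from this message.

```python
_JS_EXTS = (".js", ".jsx", ".ts", ".tsx", ".mjs", ".cjs",
            ".mts", ".cts", ".vue", ".svelte")


def is_js_family(path):
    # str.endswith accepts a tuple, so one call covers every extension.
    return path.endswith(_JS_EXTS)


# Hypothetical rule shape; the plugin's real rule table is not shown here.
RULES = [
    {"name": "innerHTML_xss", "substring": ".innerHTML", "path_filter": is_js_family},
]


def matching_rules(path, content):
    """Return names of rules whose substring matches and whose filter accepts path."""
    hits = []
    for rule in RULES:
        path_filter = rule.get("path_filter")
        if path_filter is not None and not path_filter(path):
            continue  # gated out: the construct has no meaning in this file type
        if rule["substring"] in content:
            hits.append(rule["name"])
    return hits
```

Sharing one module-level tuple means a later extension change reaches every gated rule at once.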

Verified locally on macOS Python 3.13:
  - py_compile clean.
  - 45-test existing smoke + extensibility suite still passes.
  - 151 new parametrized tests in test_xss_gate.py (added to internal
    test suite this PR doesn't ship): each gated rule x every
    JS-family extension accepts, x every non-JS path (.md / .py /
    .yaml / .json / .txt / .html / Dockerfile / Makefile / .gitignore
    / .sh / .go / .rs / .rb) rejects. 196 tests pass total.

Doesn't address everything in the false-positive cluster — issues that
require Python-rule gating (#1114 .env.schema exec), tighter substring
scoping (#660 pickle in usernames), or hook-protocol changes (#1358
exit-2 vs warning, #1375 plain-text-vs-JSON output) need separate PRs.
This PR covers the JS-substring subset cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:07:53 -07:00
Mohamed Hegazy
04127de5d1 Merge pull request #2073 from anthropics/fix-2071-macos-python-39
security-guidance: enable LLM review on default macOS Python 3.9 (#2071)
2026-05-28 22:59:23 -07:00
Mohamed Hegazy
a67587c816 security-guidance: enable LLM review on default macOS Python 3.9
Fixes anthropics/claude-plugins-official#2071 — on macOS where the
default `python3` is Apple's Command Line Tools Python 3.9.6, the
plugin's agentic commit reviewer silently does not run, even when the
user has a newer Python installed.

Three compounding factors in the bug:

1. `sg-python.sh` only checks the major version (`3`), so it always
   picks 3.9 even when 3.10+ is on PATH.
2. `claude_agent_sdk` requires Python >=3.10 — pip install on 3.9
   returns "No matching distribution" -> bootstrap returns BUILD_FAILED.
3. Even with a hand-built 3.12 venv, `llm.py` imports the SDK
   in-process into the hook's interpreter (still 3.9), which raises
   SyntaxError. The existing venv-probe in `ensure_agent_sdk.py` uses
   the venv's own Python (3.12) so it reports NOOP_VENV (healthy) while
   the consumer fails — misleading telemetry on top of silent feature
   degradation.

Per BQ telemetry, 14,073 external macOS users hit
sdk_bootstrap=BUILD_FAILED in the past 4 days (the default-macOS
cohort), out of ~86K total external installed users. Combined with
~20K other users in similar broken-bootstrap states (Windows pre-#2055,
Linux <3.10), roughly 34K users, about 40% of the installed base, have
a silently-broken agentic reviewer.

This PR implements the reporter's items #1, #3, and #4. Item #2
(running the SDK out-of-process) is deferred as a bigger refactor.

Item #1 — hooks/sg-python.sh — prefer >=3.10 binaries via 3-pass probe:

  Pass 1: python3.13 / 3.12 / 3.11 / 3.10 (>=3.10 by name, highest wins)
  Pass 2: bare python3 / python / py -3 (accept only if reported >=3.10)
  Pass 3: bare python3 / python / py -3 (any Python 3, FALLBACK so
          pattern checks still work on macOS-default 3.9 — no regression
          vs today; SDK-dependent paths detect the version mismatch
          inside Python and degrade cleanly via item #4)
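
The real probe lives in the sg-python.sh shell shim; the sketch below restates the same three-pass ordering in Python so the selection logic is easy to test. pick_python, query_version, and the injectable version_of parameter are hypothetical names, and the version check via `--version` is an assumption about how a probe could be done.

```python
import re
import shutil
import subprocess

MIN_VERSION = (3, 10)
NAMED = [["python3.13"], ["python3.12"], ["python3.11"], ["python3.10"]]
BARE = [["python3"], ["python"], ["py", "-3"]]


def query_version(argv):
    """Return (major, minor) reported by `argv --version`, or None if unusable."""
    if shutil.which(argv[0]) is None:
        return None
    try:
        r = subprocess.run([*argv, "--version"], capture_output=True, timeout=5)
    except (subprocess.TimeoutExpired, OSError):
        return None
    text = (r.stdout + r.stderr).decode("utf-8", errors="replace")
    m = re.search(r"Python (\d+)\.(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def pick_python(version_of=query_version):
    # Pass 1: versioned names that are >= 3.10 by name, highest first.
    for argv in NAMED:
        if version_of(argv) is not None:
            return argv
    # Pass 2: bare names, accepted only if they report >= 3.10.
    for argv in BARE:
        v = version_of(argv)
        if v is not None and v >= MIN_VERSION:
            return argv
    # Pass 3: fall back to any Python 3 so pattern checks keep working.
    for argv in BARE:
        v = version_of(argv)
        if v is not None and v[0] == 3:
            return argv
    return None
```

Passing a fake version_of makes each pass checkable without installing several interpreters.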

Item #4 — ensure_agent_sdk.py — health-check honesty:

Added HOOK_PY_INCOMPATIBLE=6 outcome with short-circuit at top of main():

  if sys.version_info < (3, 10):
      return HOOK_PY_INCOMPATIBLE, "hook_py", f"py_{...}"

Telemetry consequences after rollout: sdk_bootstrap=6 is a new clean
bucket; some users currently miscounted in sdk_bootstrap=3 BUILD_FAILED
(wasted pip cycles) and sdk_bootstrap=1 NOOP_VENV (falsely-healthy)
move to sdk_bootstrap=6. The remaining NOOP_VENV count becomes
trustworthy.

Item #3 — ensure_agent_sdk.py — one-time user-visible notice:

When outcome == HOOK_PY_INCOMPATIBLE and a marker file at
`~/.claude/security/.agentic_unavailable_notice_v<pv>` doesn't exist,
the SessionStart response includes hookSpecificOutput.additionalContext
+ systemMessage explaining the situation. Marker file is plugin-
version-keyed so a future fix (e.g. shipping out-of-process SDK) can
bump pv and re-notify users.
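
A minimal sketch of the version-keyed marker check, assuming the marker is an empty file whose existence alone suppresses the notice. should_show_notice and its parameters are hypothetical; only the file-name pattern comes from this message.

```python
from pathlib import Path


def should_show_notice(marker_dir, plugin_version):
    """Return True the first time a given plugin version asks, False afterwards."""
    marker = Path(marker_dir) / f".agentic_unavailable_notice_v{plugin_version}"
    if marker.exists():
        return False
    # Record that the notice was shown; a bumped version gets a new marker.
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return True
```

In the plugin the directory would be ~/.claude/security; the sketch takes it as a parameter so it can be exercised against a temporary directory.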

BUILD_FAILED is intentionally excluded from the notice — it covers
transient causes where a permanent banner would mislead.

Verified locally on macOS Python 3.13:
  - py_compile clean on both files.
  - Existing 45-test smoke + extensibility suite: 45/45 PASS in 2.50s.
  - Unit test of simulated 3.9 path: HOOK_PY_INCOMPATIBLE returned with
    correct phase/kind; notice shown on first call, suppressed on
    second, reshown on bumped pv; BUILD_FAILED correctly does NOT
    trigger notice.

NOT verified: actual Python 3.9 behavior end-to-end (would need a 3.9
install). Worth a follow-up smoke test in a 3.9 venv before next
release. The unit test simulating 3.9 covers the logic but not the
runtime invocation through the shim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 22:58:01 -07:00
Bryan Thompson
502de97746 Add vibe-prospecting plugin (#1997)
2026-05-28 15:30:04 -07:00
Bryan Thompson
679f52da9e feat(scan): emit per-entry sticky verdict comments (#2009)
Adds an `emit-verdict` job to scan-plugins.yml that posts a sticky
comment per scanned entry to the corresponding bump PR, with marker
`<!-- bump-pr-verdict:<name> -->`. The body is a schema_v1 JSON block,
the same shape `anthropics/claude-plugins-community-internal`'s
`scan-external-plugins.yml` already emits, so any consumer that already
reads verdicts from that schema works uniformly across both repos.
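
The sticky-comment convention can be sketched as follows: render the marker on the first line, then look for an existing comment whose body starts with it. The helper names are hypothetical, and comments are assumed to be dicts with "id" and "body" keys as returned by a comment listing.

```python
import json

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block


def marker_for(name):
    return f"<!-- bump-pr-verdict:{name} -->"


def render_comment(name, verdict):
    """Marker on the first line, then the verdict as a fenced JSON block."""
    return f"{marker_for(name)}\n{FENCE}json\n{json.dumps(verdict, indent=2)}\n{FENCE}"


def find_sticky_comment(comments, name):
    """Return the id of the first comment whose body starts with the marker, else None."""
    marker = marker_for(name)
    for comment in comments:
        if (comment.get("body") or "").startswith(marker):
            return comment["id"]
    return None
```

A caller would update the returned comment id when one exists and create a new comment otherwise, so each entry keeps a single comment across reruns.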

What this enables
-----------------

Lets downstream consumers (label automation, dashboards, anything that
wants per-entry verdict signal) read verdicts directly from the PR
rather than scraping job logs or downloading artifacts. The current
options are log-scraping (truncated after log retention) or fetching
the `scan-verdicts` artifact (retention-limited and only after upload
succeeds).

What does NOT change
--------------------

- The `scan` required check is unaffected (emit-verdict is
  `continue-on-error: true` at the job level — failures here MUST NOT
  block the required gate).
- Verdict cache, scan flow, and revert-failed-bumps.yml are unchanged.
- No new permission scopes (uses `pull-requests: write` at the job
  level, identical to other PR-commenting jobs in this repo).

Schema notes
------------

- `scan.*` axes (clone, schema, binaries, etc.) emit as "skipped" —
  this workflow runs the policy review only, not per-entry static
  checks. Shape kept compatible with -internal's schema_v1 so the
  same consumers work uniformly on both repos.
- `policy.has_broad_scope_hooks`, `has_undisclosed_telemetry`,
  `description_matches_behavior` emit as null — those granular axes
  aren't surfaced by this workflow's per-entry artifact yet. Consumers
  that map `null → "?"` for display already handle this gracefully.
- `policy.status` is execution state (not outcome). Map source →
  status: scan-action-run → "ran"; cache-served → "cached". Outcome
  lives in `policy.passes`. policy.status vocabulary matches the
  `ran|cached|missing|gated_out|infra_error` convention from
  -internal's emit-verdict.
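
The schema notes above can be sketched as a small builder. The workflow itself assembles this with jq; build_verdict is a hypothetical Python restatement, with the axis names taken from the workflow's scan stub and the field layout from the notes above.

```python
SCAN_AXES = ("clone", "subpath_missing", "schema", "zombie",
             "tool_allowlist", "binaries", "unique", "mcp")


def build_verdict(entry, ran_at, run_id):
    """Build a schema_v1-shaped verdict dict from one run-verdicts entry."""
    source = entry.get("source") or "scan"
    # status is execution state, not outcome: cache-served -> "cached",
    # anything else -> "ran". The outcome stays in policy.passes.
    status = "cached" if source == "cache" else "ran"
    policy = {
        "passes": entry.get("passes"),
        # Granular axes are not surfaced by this workflow yet, so they are null.
        "has_broad_scope_hooks": None,
        "has_undisclosed_telemetry": None,
        "description_matches_behavior": None,
        "summary": entry.get("summary") or "",
        "violations": entry.get("violations") or "",
        "source": source,
        "status": status,
    }
    return {
        "schema_version": 1,
        "ran_at": ran_at,
        "run_id": str(run_id),
        "scan": {axis: "skipped" for axis in SCAN_AXES},
        "policy": policy,
    }
```

A failing cached verdict therefore reads status "cached" with passes false, keeping the two questions separate for consumers.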

PR resolution
-------------

`pull_request` events carry the PR number directly. The bump workflow
creates bump PRs via GITHUB_TOKEN (which doesn't fire `pull_request`
triggers — recursion guard) and dispatches this scan via
`workflow_dispatch` on the bump branch; in that case the job looks up
the open PR by head ref via REST. No PR found (scan_all dispatch on
main, etc.) → no-op with notice.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 15:29:59 -07:00
Bryan Thompson
13a0208f38 Add Skill-bundle plugins section to README (#2067)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 15:29:53 -07:00
9 changed files with 526 additions and 66 deletions


@@ -2135,22 +2135,6 @@
       },
       "homepage": "https://github.com/SAP/open-ux-tools/tree/main/packages/fiori-mcp-server"
     },
-    {
-      "name": "sap-hana-cli",
-      "description": "150+ SAP HANA database tools for AI assistants. Query tables, import/export data, profile data quality, compare schemas, manage backups, monitor performance, and more. Connects to SAP HANA Cloud and on-premise databases.",
-      "author": {
-        "name": "SAP SE",
-        "email": "ospo@sap.com",
-        "url": "https://www.sap.com"
-      },
-      "category": "database",
-      "source": {
-        "source": "url",
-        "url": "https://github.com/SAP-samples/hana-cli-claude-plugin.git",
-        "sha": "160ae47efaffea2e1dd9d6877ab9ec49b78542a0"
-      },
-      "homepage": "https://github.com/SAP-samples/hana-cli-claude-plugin"
-    },
     {
       "name": "sap-mdk-server",
       "description": "MCP server for SAP Mobile Development Kit (MDK). Build and modify MDK applications with AI assistance — schema lookups, action validation, rule editing, and project scaffolding.",
@@ -2601,6 +2585,20 @@
       },
       "homepage": "https://github.com/vercel/vercel-plugin"
     },
+    {
+      "name": "vibe-prospecting",
+      "description": "Vibe Prospecting connects Claude to live B2B company and contact data so users can search, match, enrich, filter, and export prospects at scale. It turns natural-language requests into structured GTM workflows for lead generation, CRM enrichment, company research, executive discovery, and multi-step prospecting automation inside Claude Cowork and Claude Code.",
+      "author": {
+        "name": "vibeprospecting.ai"
+      },
+      "category": "productivity",
+      "source": {
+        "source": "url",
+        "url": "https://github.com/explorium-ai/vibeprospecting-plugin.git",
+        "sha": "ada4d569dbf70194fe18750ecbc5170e9a3f120a"
+      },
+      "homepage": "https://www.vibeprospecting.ai/product/claude-plugin"
+    },
     {
       "name": "windsor-ai",
       "description": "Connect Claude Code to 325+ business data sources via Windsor.ai. Query marketing, sales, CRM, ecommerce, finance, and analytics data from Google Ads, Meta, HubSpot, Salesforce, Shopify, Stripe, and hundreds more — directly from your terminal.",


@@ -381,3 +381,166 @@ jobs:
echo "::error::Scan step failed without a parseable policy verdict (likely an infra error)."
exit 1
fi
# ─────────────────────────────────────────────────────────────────────────────
# emit-verdict: post a sticky comment per entry to the bump PR with the
# structured verdict, so downstream tooling (label automation, delist
# authoring) can read verdicts directly instead of scraping job logs.
# Sticky comment marker: `<!-- bump-pr-verdict:<name> -->`.
#
# Mirrors the schema_v1 contract from
# anthropics/claude-plugins-community-internal#3908 so the triage scripts
# in mcp-local-directory/scripts/triage/ work uniformly across both repos.
# -official doesn't run per-entry static checks (zombie, schema, binaries,
# etc.) so the `scan.*` axes are emitted as "skipped". The granular policy
# booleans (`has_broad_scope_hooks`, `has_undisclosed_telemetry`,
# `description_matches_behavior`) aren't surfaced by this workflow's
# per-entry artifact yet, so they're emitted as null; the triage
# `triage_bool_to_str` helper maps null → "?" so display is graceful.
# Status describes the execution state, not the outcome — `ran` when the
# scan action evaluated this SHA fresh, `cached` when a prior verdict was
# reused (cf. run-verdicts.json's `source` field). Outcome lives in
# `policy.passes`. policy-sweep.sh dispatches on this exact vocabulary.
#
# PR resolution: pull_request events carry the PR number directly. The
# bump workflow creates bump PRs via GITHUB_TOKEN (which doesn't fire
# pull_request triggers — recursion guard) and dispatches this scan via
# workflow_dispatch on the bump branch. In that case we look up the
# open PR by head ref. No PR (scan_all dispatch on main, etc.) → no-op.
#
# continue-on-error at the job level: emit failure must NOT block the
# `scan` required check. Consumers fall back to log-scraping if the
# comment is absent (gradual migration; no flag day).
# ─────────────────────────────────────────────────────────────────────────────
emit-verdict:
needs: [scan]
if: always() && needs.scan.result != 'skipped' && needs.scan.result != 'cancelled'
runs-on: ubuntu-latest
continue-on-error: true
permissions:
contents: read
pull-requests: write
steps:
- name: Download scan verdicts
uses: actions/download-artifact@v4
with:
name: scan-verdicts
path: /tmp/scan-verdicts
continue-on-error: true
- name: Resolve PR number for this ref
id: pr
env:
GH_TOKEN: ${{ github.token }}
EVENT_NAME: ${{ github.event_name }}
PR_FROM_EVENT: ${{ github.event.pull_request.number }}
REF: ${{ github.ref_name }}
REPO: ${{ github.repository }}
run: |
set -euo pipefail
if [[ "$EVENT_NAME" == "pull_request" && -n "$PR_FROM_EVENT" ]]; then
echo "number=$PR_FROM_EVENT" >> "$GITHUB_OUTPUT"
exit 0
fi
# workflow_dispatch on the bump branch: find the open PR for it.
# head filter takes the form owner:branch.
owner="${REPO%%/*}"
pr=$(gh api "/repos/${REPO}/pulls?state=open&head=${owner}:${REF}&per_page=1" \
--jq '.[0].number // ""')
if [[ -z "$pr" ]]; then
echo "::notice::No open PR for ref ${REF} — sticky comments skipped (verdicts still in scan-verdicts artifact)"
fi
echo "number=$pr" >> "$GITHUB_OUTPUT"
- name: Build and post sticky comments
if: steps.pr.outputs.number != ''
env:
GH_TOKEN: ${{ github.token }}
REPO: ${{ github.repository }}
PR: ${{ steps.pr.outputs.number }}
RUN_ID: ${{ github.run_id }}
run: |
set -euo pipefail
verdicts_path=/tmp/scan-verdicts/run-verdicts.json
# Missing/empty artifact: scan job ran but didn't produce verdicts
# (e.g. the relevance gate said "no changes"). Nothing to comment;
# exit clean.
if [[ ! -s "$verdicts_path" ]]; then
echo "::notice::No run-verdicts.json artifact — nothing to emit"
exit 0
fi
count=$(jq 'length' "$verdicts_path")
if [[ "$count" == "0" ]]; then
echo "::notice::run-verdicts.json is empty — nothing to emit"
exit 0
fi
ran_at=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# scan.* axes: -official doesn't run per-entry static checks; emit
# "skipped" for each so the schema is shape-compatible with -internal.
scan_stub='{"clone":"skipped","subpath_missing":"skipped","schema":"skipped","zombie":"skipped","tool_allowlist":"skipped","binaries":"skipped","unique":"skipped","mcp":"skipped"}'
# Pre-fetch all PR comments once (paginated) for the marker lookup.
gh api --paginate "/repos/$REPO/issues/$PR/comments" \
--jq '.[] | {id, body}' > /tmp/comments.ndjson
jq -c '.[]' "$verdicts_path" | while read -r entry; do
name=$(jq -r '.name' <<< "$entry")
passes=$(jq -r '.passes' <<< "$entry")
summary=$(jq -r '.summary // ""' <<< "$entry")
violations=$(jq -r '.violations // ""' <<< "$entry")
source=$(jq -r '.source // "scan"' <<< "$entry")
# status = execution state (cf. -internal#3908 vocabulary).
# Outcome is in `passes`. Map source → status: scan-action-run
# → "ran"; cache-served → "cached". Anything else falls through
# as "ran" (only those two values appear in run-verdicts.json).
case "$source" in
cache) status="cached" ;;
scan) status="ran" ;;
*) status="ran" ;;
esac
policy=$(jq -n \
--argjson passes "$passes" \
--arg summary "$summary" \
--arg violations "$violations" \
--arg source "$source" \
--arg status "$status" \
'{passes: $passes,
has_broad_scope_hooks: null,
has_undisclosed_telemetry: null,
description_matches_behavior: null,
summary: $summary,
violations: $violations,
source: $source,
status: $status}')
verdict=$(jq -n \
--argjson scan "$scan_stub" \
--argjson policy "$policy" \
--arg ran_at "$ran_at" \
--arg run_id "$RUN_ID" \
'{schema_version: 1, ran_at: $ran_at, run_id: $run_id, scan: $scan, policy: $policy}')
marker="<!-- bump-pr-verdict:$name -->"
body=$(printf '%s\n```json\n%s\n```' "$marker" "$verdict")
# jq's first() short-circuits and avoids SIGPIPE under pipefail if
# duplicate markers exist (shouldn't, but a prior buggy run could
# double-post). -s slurps NDJSON; `// empty` yields no output when
# no match.
existing=$(jq -rs --arg m "$marker" \
'first(.[] | select(.body | startswith($m)) | .id) // empty' \
/tmp/comments.ndjson)
if [[ -n "$existing" ]]; then
gh api -X PATCH "/repos/$REPO/issues/comments/$existing" -f body="$body" >/dev/null
echo "Updated comment $existing for $name"
else
gh api -X POST "/repos/$REPO/issues/$PR/comments" -f body="$body" >/dev/null
echo "Created comment for $name"
fi
done


@@ -42,6 +42,37 @@ plugin-name/
└── README.md # Documentation
```
## Skill-bundle plugins
When a plugin's source repository ships skills (`SKILL.md` files) without a `.claude-plugin/plugin.json` manifest, the marketplace entry can declare the skills directly using `strict: false` and an explicit `skills` array.
```json
{
"name": "example-bundle",
"description": "Brief description of the bundled skills.",
"author": { "name": "Author Name" },
"category": "development",
"source": {
"source": "git-subdir",
"url": "https://github.com/example-org/sdk.git",
"path": "packages/agent-skills",
"ref": "main",
"sha": "<commit sha>"
},
"strict": false,
"skills": [
"./skill-a",
"./skill-b",
"./skill-c"
],
"homepage": "https://github.com/example-org/sdk"
}
```
Each path in `skills` is relative to `source.path` and points at a directory containing a `SKILL.md`. Paths can reach deeper than a single level — for example, `["./libA/skill-1", "./libB/skill-2"]` exposes a curated subset across multiple library subdirectories. Each skill is registered as `<plugin-name>:<skill-name>` in Claude Code.
For the underlying schema, see [Strict mode](https://code.claude.com/docs/en/plugin-marketplaces) in the marketplace documentation.
## License
Please see each linked plugin for the relevant LICENSE file.


@@ -138,7 +138,17 @@ def restore_unreviewed_stop_state(session_id, paths, baseline_sha):
 def get_baseline_file_content(session_id, file_path, cwd):
-    """Get the content of a file at the baseline SHA. Returns None if unavailable."""
+    """Get the content of a file at the baseline SHA. Returns None if unavailable.
+    Decode the file content as UTF-8 with errors="replace" rather than using
+    text=True: source files in user repos can be latin-1 / cp1252 / shift-jis
+    / etc., and on Windows text=True would decode via locale.getpreferredencoding()
+    in strict mode and raise UnicodeDecodeError in the subprocess reader
+    thread — leaving result.stdout=None and propagating AttributeError when
+    the caller tries to use it. Same class as the existing migrations at
+    security_reminder_hook.py:540 (reflog subjects) and :1115 (commit
+    diffs); this helper was missed in that pass. See
+    anthropics/claude-plugins-official#2056."""
     baseline_sha = load_baseline_sha(session_id)
     if not baseline_sha:
         return None
@@ -151,12 +161,12 @@ def get_baseline_file_content(session_id, file_path, cwd):
             return None
         result = subprocess.run(
             [*GIT_CMD, "show", f"{baseline_sha}:{rel_path}"],
-            cwd=cwd, capture_output=True, text=True, timeout=5
+            cwd=cwd, capture_output=True, timeout=5
         )
         if result.returncode == 0:
-            return result.stdout
+            return (result.stdout or b"").decode("utf-8", errors="replace")
         return None
-    except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+    except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError):
         return None
@@ -173,11 +183,16 @@ def capture_git_baseline(cwd):
     and `compute_v2_review_set` subtracts that set so pre-existing untracked
     files are not reviewed as Claude-authored.
     """
+    # stdout is a SHA so text=True is safe on stdout, but a non-ASCII
+    # filename in `git stash create`'s STDERR warning (e.g. a worktree
+    # with `Ávila_report.txt` triggers a quotePath/locale warning) would
+    # trip the stderr reader thread on Windows cp1252. Decode both streams
+    # leniently for symmetry with _list_untracked. See #2056.
     try:
         # Check if HEAD exists (i.e., repo has at least one commit)
         head_check = subprocess.run(
             [*GIT_CMD, "rev-parse", "HEAD"],
-            cwd=cwd, capture_output=True, text=True, timeout=5
+            cwd=cwd, capture_output=True, timeout=5
         )
         if head_check.returncode != 0:
             # No commits yet — skip review rather than creating commits in the user's repo
@@ -186,20 +201,20 @@ def capture_git_baseline(cwd):
         result = subprocess.run(
             [*GIT_CMD, "stash", "create"],
-            cwd=cwd, capture_output=True, text=True, timeout=15
+            cwd=cwd, capture_output=True, timeout=15
         )
-        sha = result.stdout.strip()
+        sha = (result.stdout or b"").decode("utf-8", errors="replace").strip()
         if sha:
             return sha
         # Working tree is clean — stash create returns empty. Use HEAD.
         result = subprocess.run(
             [*GIT_CMD, "rev-parse", "HEAD"],
-            cwd=cwd, capture_output=True, text=True, timeout=5
+            cwd=cwd, capture_output=True, timeout=5
         )
-        sha = result.stdout.strip()
+        sha = (result.stdout or b"").decode("utf-8", errors="replace").strip()
         return sha if sha else None
-    except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
+    except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
         debug_log(f"Failed to capture git baseline: {e}")
         return None
@@ -323,19 +338,35 @@ def _list_untracked(cwd):
     mtime is captured so an in-place edit during the turn is still reviewed.
     Uses ls-files (not status) for the UPS path: the index diff isn't needed,
-    and ls-files --others only walks the worktree against .gitignore."""
+    and ls-files --others only walks the worktree against .gitignore.
+    Decodes stdout/stderr as UTF-8 with errors="replace" instead of using
+    text=True. With core.quotePath=false git emits raw UTF-8 bytes for
+    non-ASCII filenames; text=True decodes via locale.getpreferredencoding()
+    in strict mode — on Windows that's cp1252 with several undefined bytes
+    (0x81/0x8D/0x8F/0x90/0x9D), all of which appear in UTF-8 encodings of
+    common accented capitals (Á Í Ï Ð Ý) and most CJK/emoji codepoints.
+    A non-ASCII filename in the worktree crashed the subprocess reader
+    thread, left r.stdout=None, and propagated AttributeError out of the
+    helper — silently losing the baseline snapshot every UserPromptSubmit.
+    See anthropics/claude-plugins-official#2056. The sibling helpers in
+    gitutil.py already follow the lenient pattern; this function and
+    capture_git_baseline / _git_name_only / _git_status_porcelain were
+    the holdouts."""
     try:
         repo = _git_toplevel(cwd) or cwd
         r = subprocess.run(
             [*GIT_CMD, "-c", "core.quotePath=false", "ls-files",
              "--others", "--exclude-standard", "-z"],
-            cwd=repo, capture_output=True, text=True, timeout=15,
+            cwd=repo, capture_output=True, timeout=15,
         )
         if r.returncode != 0:
-            debug_log(f"_list_untracked rc={r.returncode}: {r.stderr[:200]}")
+            stderr_str = (r.stderr or b"").decode("utf-8", errors="replace")
+            debug_log(f"_list_untracked rc={r.returncode}: {stderr_str[:200]}")
             return {}
+        stdout = (r.stdout or b"").decode("utf-8", errors="replace")
         out = {}
-        for p in r.stdout.split("\0"):
+        for p in stdout.split("\0"):
             if not p:
                 continue
             try:
@@ -346,7 +377,9 @@ def _list_untracked(cwd):
                 debug_log(f"_list_untracked: capped at {UNTRACKED_BASELINE_CAP}")
                 break
         return out
-    except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
+    except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
# ValueError guards against any future strict-decode regression
# so the helper degrades to {} instead of crashing the hook.
debug_log(f"_list_untracked error: {e}")
return {}
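
Note on the hunks above: the lenient-decode pattern they converge on can be sketched in isolation. This is a minimal illustration, not the plugin's code. `decode_lenient` and `split_nul_paths` are hypothetical names, and the sample bytes stand in for what `git ls-files -z` might emit with `core.quotePath=false`.

```python
def decode_lenient(raw):
    """Decode subprocess output captured as bytes; never raises on bad input."""
    # `raw or b""` covers the case where the stream came back as None.
    return (raw or b"").decode("utf-8", errors="replace")


def split_nul_paths(raw):
    """Split NUL-delimited path output (as from `-z`) into non-empty strings."""
    return [p for p in decode_lenient(raw).split("\0") if p]


# Stand-in for git output: one valid UTF-8 name, one name with an invalid byte.
sample = "Ávila_report.txt".encode("utf-8") + b"\0" + b"bad\xff.txt\0"
paths = split_nul_paths(sample)

# A strict cp1252 decode of the same bytes fails, because "Á" encodes to
# 0xC3 0x81 and 0x81 is undefined in Python's cp1252 codec.
try:
    sample.decode("cp1252")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```

Invalid bytes become U+FFFD rather than raising, so a single odd filename degrades one entry instead of losing the whole listing.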

@@ -32,6 +32,8 @@ BUILD_FAILED = 3 # venv create or pip install raised/timed out
# llm.py also matches Windows venv layout (Lib/site-packages). Don't reuse the
# value — telemetry rows from older plugin builds still emit 4.
SKIP_SENTINEL = 5 # another SessionStart is currently building
HOOK_PY_INCOMPATIBLE = 6 # hook interpreter is <3.10 — SDK syntax can't load
# here no matter how the venv was built. See #2071.
def _sdk_on_syspath() -> bool:
@@ -62,6 +64,29 @@ def main() -> tuple[int, str, str]:
err_phase / err_kind are non-empty only on BUILD_FAILED — they let
telemetry split bootstrap failures by root cause.
"""
# Honesty check (fixes the misleading NOOP_VENV in #2071): the SDK
# requires Python >=3.10 and uses 3.10+ syntax (match statements,
# PEP 604 unions). On a 3.9 hook interpreter we CANNOT import it no
# matter how the venv was built — llm.py runs in this same interpreter
# and the syntax-level import will SyntaxError. macOS ships 3.9.6 as
# the default `python3` and `/usr/bin` precedes Homebrew in PATH, so
# this case is the default state for a large share of macOS users.
#
# sg-python.sh now prefers python3.10+ binaries so most users won't
# reach this branch; the fallback to 3.9 is preserved for the
# pattern-warning hooks that don't need the SDK. Reporting
# HOOK_PY_INCOMPATIBLE here:
# (a) avoids 30-60s of wasted pip install,
# (b) avoids the lie where the venv_py probe says NOOP_VENV but the
# consumer import fails, and
# (c) gives telemetry a clean bucket to size the affected fleet.
if sys.version_info < (3, 10):
return (
HOOK_PY_INCOMPATIBLE,
"hook_py",
f"py_{sys.version_info[0]}.{sys.version_info[1]}",
)
if _sdk_on_syspath():
return NOOP_SYSTEM, "", ""
@@ -195,6 +220,56 @@ def main() -> tuple[int, str, str]:
sentinel.unlink(missing_ok=True)
def _maybe_emit_user_notice(outcome: int, pv: int) -> str | None:
"""Return a one-time user-visible notice when the agentic reviewer is
in a persistent broken state on this machine, or None if we've already
shown the notice for this plugin version (or shouldn't show one).
The marker file is plugin-version-keyed: a future plugin update can
re-notify if behavior changes (e.g. we ship out-of-process SDK in v3
and want to tell affected users it's fixed). Failures to write the
marker degrade to "skip the notice this session" so we don't spam
every SessionStart on a read-only home dir.
Currently only HOOK_PY_INCOMPATIBLE qualifies. BUILD_FAILED is
intentionally excluded — it covers transient causes (network failure,
pip registry hiccup, in-flight rebuild) where the next session may
succeed and a permanent notice would mislead.
"""
if outcome != HOOK_PY_INCOMPATIBLE:
return None
try:
state_dir = Path(
os.environ.get("SECURITY_WARNINGS_STATE_DIR")
or os.path.expanduser("~/.claude/security")
)
marker = state_dir / f".agentic_unavailable_notice_v{pv or 0}"
if marker.exists():
return None
state_dir.mkdir(parents=True, exist_ok=True)
# Write timestamp + Python version so the marker is self-documenting
# if a user goes looking. O_EXCL would be racier with no real win
# (two concurrent SessionStarts both showing the notice once is fine).
marker.write_text(
f"{time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())} "
f"py={sys.version_info[0]}.{sys.version_info[1]}\n"
)
except OSError:
return None
return (
f"⚠ security-guidance plugin: the cross-file commit reviewer "
f"(layer 3 of 3 — catches IDOR, auth-bypass, cross-file SSRF) "
f"is unavailable in this environment. It requires Python ≥3.10, "
f"but the hook is running on "
f"{sys.version_info[0]}.{sys.version_info[1]}.\n\n"
f"Pattern checks and the single-shot LLM diff review are still "
f"active. To enable the deeper reviewer, install Python 3.10+ "
f"(e.g. `brew install python` on macOS) and restart Claude Code.\n\n"
f"This notice is shown once per plugin version. "
f"See: github.com/anthropics/claude-plugins-official/issues/2071"
)
if __name__ == "__main__":
# Tell the harness this is async — venv create + pip install can take
# 30-60s on a cold cache, well past the default sync hook timeout.
@@ -231,4 +306,18 @@ if __name__ == "__main__":
pv = _plugin_version_int()
if pv:
metrics["pv"] = pv
print(json.dumps({"metrics": metrics}), flush=True)
response: dict[str, object] = {"metrics": metrics}
# One-time user-visible notice when the agentic reviewer is dead on
# arrival. Uses hookSpecificOutput.additionalContext (SessionStart's
# supported channel for surfacing text to both the model and the user)
# plus systemMessage as a belt-and-suspenders. Marker-file-gated so
# this fires exactly once per plugin version per install — see
# _maybe_emit_user_notice.
notice = _maybe_emit_user_notice(outcome, pv)
if notice:
response["hookSpecificOutput"] = {
"hookEventName": "SessionStart",
"additionalContext": notice,
}
response["systemMessage"] = notice
print(json.dumps(response), flush=True)
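
Note on the hunks above: the marker-file gating described in `_maybe_emit_user_notice` can be sketched on its own. This is a simplified illustration under stated assumptions: `should_show_notice` and the marker filename are hypothetical, and the real helper additionally checks the outcome code and builds the notice text.

```python
from pathlib import Path


def should_show_notice(state_dir, plugin_version):
    """Return True only the first time it is called for a given version."""
    # Keying the marker on the version lets a later release notify again.
    marker = Path(state_dir) / f".notice_shown_v{plugin_version or 0}"
    try:
        if marker.exists():
            return False
        marker.parent.mkdir(parents=True, exist_ok=True)
        marker.write_text("shown\n")
    except OSError:
        # Unwritable state dir: stay quiet rather than repeat every session.
        return False
    return True
```

The check-then-write is deliberately not atomic, matching the comment in the diff: two concurrent sessions both showing the notice once is an acceptable outcome.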

@@ -259,19 +259,29 @@ def _git_reflog_recent_commits(repo_root, max_age_s=120, max_n=5):
# %gs (the reflog subject) is `commit: <commit-msg first line>` and can
# contain `|`; put it LAST so split("|", 2) leaves it intact. %H is
# hex and %ct is integer, so the first two fields are delimiter-safe.
#
# Bytes + decode utf-8/replace: %gs embeds commit-message subjects
# which git stores as raw bytes — commits can be authored in
# latin-1 / cp1252 / shift-jis etc., and text=True would raise
# UnicodeDecodeError in the subprocess reader thread on Windows
# cp1252 (subprocess.run returns r.stdout=None, then
# r.stdout.splitlines() AttributeErrors). Mirrors the existing
# migration at security_reminder_hook.py:540 — same pattern was
# missed here. See anthropics/claude-plugins-official#2056.
r = subprocess.run(
[*GIT_CMD, "log", "-g", "-n", str(max_n),
"--format=%H|%ct|%gs", "HEAD"],
cwd=repo_root, capture_output=True, text=True, timeout=5,
cwd=repo_root, capture_output=True, timeout=5,
)
except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError):
return [], 0
if r.returncode != 0:
return [], 0
stdout = (r.stdout or b"").decode("utf-8", errors="replace")
import time as _time
now = int(_time.time())
fresh, stale = [], 0
for idx, line in enumerate(r.stdout.splitlines()):
for idx, line in enumerate(stdout.splitlines()):
parts = line.split("|", 2)
if len(parts) != 3:
continue
@@ -306,23 +316,31 @@ def _git_name_only(cwd, base, include_untracked=False):
must distinguish None (error → don't trust as a filter) from set()
(genuinely nothing changed). `-c core.quotePath=false -z` keeps non-ASCII
and space-containing paths intact."""
# Decode stdout/stderr as UTF-8 with errors="replace" instead of using
# text=True. core.quotePath=false makes git emit raw UTF-8 for non-ASCII
# paths, and text=True on Windows decodes via cp1252 strict — a non-ASCII
# changed path would crash the subprocess reader thread, leave
# result.stdout=None, and propagate AttributeError out of the helper.
# Same fix shape as diffstate._list_untracked. See #2056.
def _run(env):
result = subprocess.run(
[*GIT_CMD, "-c", "core.quotePath=false", "diff", "--name-only", "-z", base],
cwd=cwd, capture_output=True, text=True, timeout=30,
cwd=cwd, capture_output=True, timeout=30,
env=env,
)
if result.returncode != 0:
debug_log(f"_git_name_only({base!r}) rc={result.returncode}: {result.stderr[:200]}")
stderr_str = (result.stderr or b"").decode("utf-8", errors="replace")
debug_log(f"_git_name_only({base!r}) rc={result.returncode}: {stderr_str[:200]}")
return None
return {p for p in result.stdout.split("\0") if p}
stdout = (result.stdout or b"").decode("utf-8", errors="replace")
return {p for p in stdout.split("\0") if p}
try:
if not include_untracked:
return _run(None)
with _temp_index(cwd) as env:
return _run(env)
except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
debug_log(f"_git_name_only({base!r}) error: {e}")
return None
@@ -339,17 +357,22 @@ def _git_status_porcelain(cwd):
collapses to `dir/`). Required so the untracked set subtracts cleanly
against the UPS-time `_list_untracked` snapshot, which uses ls-files and
therefore always lists individual files."""
# Lenient decode: same UTF-8 + errors="replace" pattern as the
# sibling helpers — a non-ASCII path in the worktree would otherwise
# crash the cp1252 reader thread on Windows. See #2056.
try:
r = subprocess.run(
[*GIT_CMD, "-c", "core.quotePath=false", "status",
"--porcelain=v1", "-uall", "-z"],
cwd=cwd, capture_output=True, text=True, timeout=30,
cwd=cwd, capture_output=True, timeout=30,
)
if r.returncode != 0:
debug_log(f"_git_status_porcelain rc={r.returncode}: {r.stderr[:200]}")
stderr_str = (r.stderr or b"").decode("utf-8", errors="replace")
debug_log(f"_git_status_porcelain rc={r.returncode}: {stderr_str[:200]}")
return None, None
tracked, untracked = set(), set()
entries = r.stdout.split("\0")
stdout = (r.stdout or b"").decode("utf-8", errors="replace")
entries = stdout.split("\0")
i = 0
while i < len(entries):
e = entries[i]
@@ -368,7 +391,9 @@ def _git_status_porcelain(cwd):
i += 1
i += 1
return tracked, untracked
except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
# ValueError guards against any future strict-decode regression
# so the helper degrades to (None, None) instead of crashing.
debug_log(f"_git_status_porcelain error: {e}")
return None, None
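
Note on the hunks above: the field-ordering argument in `_git_reflog_recent_commits` (free-text `%gs` goes last so that `split("|", 2)` leaves it intact) can be checked with a small parser. `parse_reflog_line` is a hypothetical name used only for this sketch, and the sample line is invented.

```python
def parse_reflog_line(line):
    """Parse one `%H|%ct|%gs` line into (sha, commit_time, subject), or None."""
    # maxsplit=2 means only the first two "|" act as delimiters, so any "|"
    # inside the reflog subject stays part of the third field.
    parts = line.split("|", 2)
    if len(parts) != 3:
        return None
    sha, ts, subject = parts
    try:
        return sha, int(ts), subject
    except ValueError:
        return None


parsed = parse_reflog_line("abc123|1700000000|commit: fix a|b parsing")
```

Had the subject been placed first, a commit message containing `|` would shift the hash and timestamp into the wrong fields.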

@@ -94,6 +94,9 @@ Only use exec() if you absolutely need shell features and the input is guarantee
},
{
"ruleName": "new_function_injection",
# JS-only construct: gate to JS/TS files so docs/.md and other prose
# mentioning "new Function" don't trip the warning.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": ["new Function"],
"reminder": "\u26a0\ufe0f Security Warning: Using new Function() with string interpolation is a CODE INJECTION vulnerability. If any variable is concatenated or interpolated into the function body string, an attacker controlling that variable can execute arbitrary code. Use safe alternatives: for property access use obj[key] or array.reduce((o, k) => o[k], root); for computation use a safe expression parser. NEVER interpolate untrusted strings into new Function() bodies.",
},
@@ -107,16 +110,24 @@ Only use exec() if you absolutely need shell features and the input is guarantee
},
{
"ruleName": "react_dangerously_set_html",
# JS/TS-only (React); gate so .md docs / .py / .go files don't trip.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": ["dangerouslySetInnerHTML"],
"reminder": "⚠️ Security Warning: dangerouslySetInnerHTML can lead to XSS vulnerabilities if used with untrusted content. Ensure all content is properly sanitized using an HTML sanitizer library like DOMPurify, or use safe alternatives.",
},
{
"ruleName": "document_write_xss",
# Browser DOM API: only meaningful in JS/TS source.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": ["document.write"],
"reminder": "⚠️ Security Warning: document.write() can be exploited for XSS attacks and has performance issues. Use DOM manipulation methods like createElement() and appendChild() instead.",
},
{
"ruleName": "innerHTML_xss",
# Browser DOM API: only meaningful in JS/TS source. Closes FPs like
# docs/example HTML, playground/self-contained skills that hardcode
# innerHTML strings with zero user input (#410).
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": [".innerHTML =", ".innerHTML="],
"reminder": "⚠️ Security Warning: Setting innerHTML with untrusted content can lead to XSS vulnerabilities. Use textContent for plain text or safe DOM methods for HTML content. If you need HTML support, consider using an HTML sanitizer library such as DOMPurify.",
},
@@ -217,11 +228,15 @@ Additionally, validate user inputs:
},
{
"ruleName": "outerHTML_xss",
# Browser DOM API: only meaningful in JS/TS source.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": [".outerHTML =", ".outerHTML="],
"reminder": "⚠️ Security Warning: Use textContent or sanitize with DOMPurify. outerHTML assignment is an XSS sink equivalent to innerHTML.",
},
{
"ruleName": "insertAdjacentHTML_xss",
# Browser DOM API: only meaningful in JS/TS source.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": [".insertAdjacentHTML("],
"reminder": "⚠️ Security Warning: Use insertAdjacentText() or sanitize with DOMPurify. insertAdjacentHTML is an XSS sink.",
},
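
Note on the hunks above: the effect of the new `path_filter` key can be sketched with a tiny matcher. Several things here are assumptions: the contents of `JS_EXTS` (the diff references `_JS_EXTS` without showing its definition), the `matching_rules` function, and the rule that a missing filter means "apply everywhere".

```python
# Assumed extension list; the plugin's own _JS_EXTS may differ.
JS_EXTS = (".js", ".jsx", ".ts", ".tsx")

RULES = [
    {
        "ruleName": "innerHTML_xss",
        "path_filter": lambda p: p.endswith(JS_EXTS),
        "substrings": [".innerHTML =", ".innerHTML="],
    },
    {
        # No path_filter: in this sketch the rule applies to every file.
        "ruleName": "example_unfiltered",
        "substrings": ["eval("],
    },
]


def matching_rules(path, content):
    """Return the names of rules whose filter accepts `path` and whose
    substrings occur in `content`."""
    hits = []
    for rule in RULES:
        path_filter = rule.get("path_filter")
        if path_filter is not None and not path_filter(path):
            continue
        if any(s in content for s in rule["substrings"]):
            hits.append(rule["ruleName"])
    return hits
```

`str.endswith` accepts a tuple of suffixes, which is why a single lambda covers every extension in the list.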

@@ -190,7 +190,13 @@ CONTINUATION_SUFFIX = (
"response."
)
def emit_metrics(metrics, rewake_summary=None):
def emit_metrics(
metrics,
rewake_summary=None,
additional_context=None,
system_message=None,
hook_event_name="PostToolUse",
):
"""
Write a SyncHookJSONOutput line to stdout for Claude Code to pick up.
For asyncRewake (Stop) hooks, CC scans stdout for the first {-prefixed line
@@ -213,6 +219,27 @@ def emit_metrics(metrics, rewake_summary=None):
rewakeSummary in hooks.json, shown to the user in the terminal as the
task-notification one-liner. Must be in the same JSON line as the metrics
because CC stops scanning stdout after the first {-prefixed line.
`additional_context` (asyncRewake findings): model-visible guidance text
that CC surfaces via the modern hook-output protocol
(hookSpecificOutput.additionalContext) instead of the legacy stderr +
exit(2) pair. The caller passes the finding-explanation text it would
have written to stderr; the JSON channel carries it cleanly so CC's UI
shows the reason properly instead of "Permission denied with no reason".
See anthropics/claude-plugins-official#1375 and #1783. Empty/None
means no hookSpecificOutput field is emitted (preserves backward compat
for legacy emit-sites that only want metrics).
`system_message` (optional, asyncRewake only): user-visible TUI message,
distinct from rewakeSummary which is the task-notification one-liner.
Use sparingly — the rewakeMessage in hooks.json is the primary user
surface; systemMessage adds a per-fire override when the static
rewakeMessage isn't specific enough for the finding being shown.
`hook_event_name` (used only when additional_context is set): which event
the hookSpecificOutput attaches to. Defaults to "PostToolUse" since the
commit-review and push-sweep handlers are the most common callers;
handle_stop_hook explicitly passes "Stop".
"""
head = {}
if _PV and "pv" not in metrics:
@@ -223,6 +250,17 @@ def emit_metrics(metrics, rewake_summary=None):
out = {"metrics": metrics}
if rewake_summary:
out["rewakeSummary"] = rewake_summary
if additional_context:
# Wrap in hookSpecificOutput per CC's modern hook-output contract.
# Drops the legacy `sys.stderr.write(...) + sys.exit(2)` shape that
# left CC's UI showing "denied with no reason" (#1783) and triggered
# "json output validation failed" on older CC versions (#1375).
out["hookSpecificOutput"] = {
"hookEventName": hook_event_name,
"additionalContext": additional_context,
}
if system_message:
out["systemMessage"] = system_message
print(json.dumps(out), flush=True)
# =====================================================================
@@ -1361,18 +1399,26 @@ def handle_commit_review_posttooluse(input_data):
if s in sev:
sev[s] += 1
# Rebuild guidance from new_vulns only — concrete_guidance from the LLM
# still lists deduped entries. Pass via additional_context so CC surfaces
# the reason via hookSpecificOutput.additionalContext instead of empty
# stdout (#1783) / stderr-only "json output validation failed" (#1375).
_commit_guidance = (PROVENANCE_BANNER + "\n\n"
+ _format_vulns_guidance(new_vulns)
+ CONTINUATION_SUFFIX + "\n")
emit_metrics({
"vulns_found": len(new_vulns), **_base, **_agentic_m,
"critical_count": sev["critical"], "high_count": sev["high"],
"files_reviewed": len(diff_files), "review_ms": review_ms,
**({"deduped": n_deduped} if n_deduped else {}),
}, rewake_summary=_format_vulns_summary(new_vulns, prefix="Commit security review found"))
}, rewake_summary=_format_vulns_summary(new_vulns, prefix="Commit security review found"),
additional_context=_commit_guidance,
hook_event_name="PostToolUse")
# Rebuild guidance from new_vulns only — concrete_guidance from the LLM
# still lists deduped entries.
sys.stderr.write(PROVENANCE_BANNER + "\n\n"
+ _format_vulns_guidance(new_vulns)
+ CONTINUATION_SUFFIX + "\n")
# exit(2) is preserved per the asyncRewake protocol — it's what CC
# uses as the "force fix" signal that triggers the rewakeMessage flow.
# The stderr.write was removed; additional_context above now carries
# the same text via the modern JSON channel. See #1358/#1375/#1783.
sys.exit(2)
def handle_push_sweep_posttooluse(input_data):
@@ -1629,17 +1675,23 @@ def handle_push_sweep_posttooluse(input_data):
# Metrics — keep within the 10-key cap; agentic sub-metrics are dropped
# here in favour of the push-sweep funnel keys (telemetry can join on session_id
# to the per-commit fires for agentic detail). rewake_summary must ride
# this line (CC reads only the first {-prefixed stdout line); it's a
# no-op when new_vulns is empty since we exit 0 below.
emit_metrics({
# this line (CC reads only the first {-prefixed stdout line); the emit
# is deferred to the two exit points below so the with-vulns path can
# also pass additional_context in the same JSON line (#1375/#1783) —
# the by-design "CC keeps only the first JSON line" constraint means
# we can't emit twice. Builds the shared metrics dict here; vulns path
# adds additional_context, no-vulns path emits as-is.
_push_metrics = {
**_base, "pushed": len(push_range), "unreviewed": len(tail),
"prefix_advanced": prefix_advanced, "vulns_found": len(new_vulns),
"files_reviewed": len(diff_files), "review_ms": review_ms,
**({"deduped": n_deduped} if n_deduped else {}),
}, rewake_summary=_format_vulns_summary(new_vulns, prefix="Push security review found"))
}
_push_rewake_summary = _format_vulns_summary(new_vulns, prefix="Push security review found")
if not new_vulns:
debug_log("Push sweep: no new findings")
emit_metrics(_push_metrics, rewake_summary=_push_rewake_summary)
sys.exit(0)
# First-push of a big branch can surface many findings at once across
@@ -1692,9 +1744,14 @@ def handle_push_sweep_posttooluse(input_data):
guidance = _format_vulns_guidance(reported) or ""
else:
guidance = concrete_guidance or _format_vulns_guidance(reported) or ""
sys.stderr.write(
PROVENANCE_BANNER + "\n\n" + guidance + CONTINUATION_SUFFIX + "\n"
)
# Emit metrics + additional_context together — single JSON line is the
# contract CC's hook parser expects. exit(2) preserved as the asyncRewake
# "force fix" trigger (see comment near handle_commit_review_posttooluse).
# See #1358 / #1375 / #1783.
emit_metrics(_push_metrics, rewake_summary=_push_rewake_summary,
additional_context=(PROVENANCE_BANNER + "\n\n"
+ guidance + CONTINUATION_SUFFIX + "\n"),
hook_event_name="PostToolUse")
sys.exit(2)
def handle_stop_hook(input_data):
@@ -1927,6 +1984,11 @@ def handle_stop_hook(input_data):
# untracked_baseline_n is the signal for whether the UPS-time
# untracked-snapshot capture actually ran.
sweep_trimmed = {k: v for k, v in sweep.items() if k != "warn_unresolved_mask"}
# Pass guidance via additional_context so CC surfaces the findings via
# hookSpecificOutput.additionalContext instead of stderr-only (which
# was the cause of "json output validation failed" / empty-reason UI in
# #1375 / #1783). exit(2) preserved as the asyncRewake "force fix"
# signal — that's the documented mechanism. See #1358 / #1375 / #1783.
emit_metrics({
"vulns_found": len(vulns),
"untracked_baseline_n": len(untracked_at_baseline),
@@ -1940,10 +2002,10 @@ def handle_stop_hook(input_data):
**({"diff_truncated": llm._last_review_truncated_bytes}
if llm._last_review_truncated_bytes else {}),
**sweep_trimmed,
}, rewake_summary=_format_vulns_summary(vulns))
# Exit code 2 with stderr forces Claude to continue and fix
sys.stderr.write(PROVENANCE_BANNER + "\n\n" + concrete_guidance + CONTINUATION_SUFFIX + "\n")
}, rewake_summary=_format_vulns_summary(vulns),
additional_context=(PROVENANCE_BANNER + "\n\n"
+ concrete_guidance + CONTINUATION_SUFFIX + "\n"),
hook_event_name="Stop")
sys.exit(2)
if llm._last_call_claude_http_error is not None:
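
Note on the hunks above: the "everything rides in one JSON line" constraint can be illustrated with a pure function that builds the line instead of printing it. `build_hook_line` is a hypothetical name. The field names mirror those in the diff, but this sketch leaves out the plugin-version handling that the real `emit_metrics` performs.

```python
import json


def build_hook_line(metrics, rewake_summary=None, additional_context=None,
                    system_message=None, hook_event_name="PostToolUse"):
    """Serialize metrics and optional findings into a single JSON line."""
    out = {"metrics": metrics}
    if rewake_summary:
        out["rewakeSummary"] = rewake_summary
    if additional_context:
        # Findings travel in the same object as the metrics, because only
        # the first {-prefixed stdout line is read.
        out["hookSpecificOutput"] = {
            "hookEventName": hook_event_name,
            "additionalContext": additional_context,
        }
    if system_message:
        out["systemMessage"] = system_message
    # json.dumps escapes newlines inside strings, so the result is one line.
    return json.dumps(out)


line = build_hook_line(
    {"vulns_found": 1},
    rewake_summary="Stop review found 1 issue",
    additional_context="banner\n\nguidance\n",
    hook_event_name="Stop",
)
```

Returning the string rather than printing it keeps the sketch testable; a caller would print it once with `flush=True` before exiting.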

@@ -47,21 +47,65 @@ fi
probe() {
# $1..N: the interpreter command (may be multi-word like `py -3`)
# Probe writes the major version to stdout and exits 0 iff it's >=3.
"$@" -c 'import sys; print(sys.version_info[0])' 2>/dev/null
# Writes "<major>.<minor>" to stdout and exits 0 iff at least Python 3.
"$@" -c 'import sys; print(f"{sys.version_info[0]}.{sys.version_info[1]}")' 2>/dev/null
}
# True iff arg is a "M.m" version string >= 3.10. claude_agent_sdk requires
# Python >= 3.10; below that, pip install fails ("No matching distribution")
# and the LLM-powered review (Stop / commit / push) silently no-ops while
# pattern checks (PostToolUse regex) keep working. macOS ships 3.9.6 as the
# default `python3` on current versions, so this guard matters in practice.
# See anthropics/claude-plugins-official#2071.
is_sdk_compatible() {
case "$1" in
3.1[0-9]|3.[2-9][0-9]|[4-9].*|[1-9][0-9].*) return 0 ;;
*) return 1 ;;
esac
}
# Pass 1 — try minor-versioned binaries in descending order. These are only
# present if the user explicitly installed them (Homebrew / python.org / pyenv),
# so picking one here always upgrades over the system `python3`. Highest
# available wins; the user doesn't have to PATH-prefer it.
for cmd in "python3.13" "python3.12" "python3.11" "python3.10"; do
v=$(probe "$cmd") || continue
if is_sdk_compatible "$v"; then
exec "$cmd" "$@"
fi
done
# Pass 2 — bare interpreters, but only if SDK-compatible. Covers Linux distros
# that ship 3.10+ as the default `python3`, and Windows where `python` /
# `py -3` resolves to the user's python.org install.
for cmd in "python3" "python" "py -3"; do
# Word-split intentionally so `py -3` works
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
if [ "$v" = "3" ]; then
if is_sdk_compatible "$v"; then
# shellcheck disable=SC2086
exec $cmd "$@"
fi
done
# Pass 3 — fallback to any Python 3, even <3.10. Pattern-based checks
# (PostToolUse regex on Edit/Write) only need 3.6+ and are useful on their
# own; the SDK-dependent paths will detect the version mismatch and degrade
# inside the Python code. Without this fallback, the entire plugin would
# stop working on default macOS, which is a regression vs today.
for cmd in "python3" "python" "py -3"; do
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
# Accept anything that successfully reported a "M.m" string.
case "$v" in
[0-9]*.[0-9]*)
# shellcheck disable=SC2086
exec $cmd "$@"
;;
esac
done
echo "security-guidance: no working Python 3 interpreter found." >&2
echo " tried: python3, python, py -3" >&2
echo " tried: python3.13, python3.12, python3.11, python3.10, python3, python, py -3" >&2
echo " on Windows, install Python from https://python.org (NOT the Microsoft Store)" >&2
echo " on macOS, install Python 3.10+ via Homebrew (\`brew install python\`)" >&2
exit 1
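
Note on the script above: the selection order can be restated in Python to make the precedence easier to reason about. This is a sketch rather than a port. `is_sdk_compatible` mirrors the shell function of the same name, `pick_interpreter` is hypothetical, and the probe results are passed in as data instead of being obtained by running interpreters. It also collapses passes 1 and 2 into one loop and lets the fallback consider every candidate, whereas the script's pass 3 only tries the bare names.

```python
def is_sdk_compatible(version):
    """True iff `version` is an "M.m" string denoting Python >= 3.10."""
    try:
        major, minor = (int(part) for part in version.split(".", 1))
    except ValueError:
        # Covers non-numeric parts and strings without exactly one split.
        return False
    return (major, minor) >= (3, 10)


def pick_interpreter(probed):
    """Choose a command from an ordered list of (command, version_or_None).

    Preference order: the first SDK-compatible candidate, then the first
    candidate that reported any version at all, else None.
    """
    for cmd, version in probed:
        if version and is_sdk_compatible(version):
            return cmd
    for cmd, version in probed:
        if version:
            return cmd
    return None


choice = pick_interpreter([
    ("python3.12", None),   # not installed
    ("python3", "3.9"),     # system default, too old for the SDK
    ("python", "3.11"),     # compatible, so it wins over python3
])
```

Comparing `(major, minor)` tuples avoids the pitfall of comparing version strings lexically, where "3.9" would sort after "3.10".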