Compare commits

...

6 Commits

Author SHA1 Message Date
Mohamed Hegazy
38b298d5b2 security-guidance: pass core.quotePath=false to diff feeders (#2082)
Fixes anthropics/claude-plugins-official#2082 — diff feeders use git's
default quotePath setting, which C-quotes any path with a non-ASCII
byte. The downstream parsers in gitutil.parse_diff_into_files /
gitutil.extract_file_paths_from_diff match the diff header with
`re.match(r'^a/(.+?) b/(.+)$', ...)`, which only sees the raw
`a/path b/path` form. The C-quoted `"a/\303\201vila/..."` form
slips past the regex, the `continue` fires, and the file is silently
dropped from review.

Effect: a vulnerable file like `Ávila/payment.py` with
`os.system('curl ' + user_input)` never reaches the LLM reviewer.
False negative in exactly the direction the plugin exists to catch.

Sibling of #2056 / #2075: those fixed the UTF-8 decode of the
subprocess output (text=True crashed the reader thread on Windows
cp1252). This one fixes the diff-feeder commands themselves — the
name-only helpers (_git_name_only, _git_status_porcelain) already
pass core.quotePath=false for this exact reason; the diff-text
feeders were the holdouts.

Fix: add `-c core.quotePath=false` to 4 git invocations:

  - gitutil._git_diff_range            (push-sweep feed)
  - gitutil.get_git_diff                (Stop-hook feed)
  - security_reminder_hook commit-review `git diff` (amend delta)
  - security_reminder_hook commit-review `git show`  (post-amend)

With the flag, git emits raw UTF-8 in the diff header
(`a/Ávila/payment.py`), the regex matches, and both files (the
non-ASCII vulnerable one + any ASCII control file) flow through to
review correctly.

Verified locally on macOS Python 3.13:

  - py_compile clean on both files.
  - Existing 45 smoke + extensibility tests still pass.
  - 8 new tests in test_diff_parser_non_ascii.py (added to internal
    test suite at sg-staging/tests/, not in this PR):

      * 2 static-shape: gitutil._git_diff_range and get_git_diff both
        contain `core.quotePath=false` in their source.
      * 2 commit-review static: every subprocess.run in
        handle_commit_review_posttooluse that mentions `"diff"` or
        `"show"` also passes the flag. Catches the regression
        class where a new diff/show call site is added without
        plumbing the flag through.
      * 4 end-to-end with a real git repo containing a
        `Ávila/payment.py` baseline-and-edit:
          - WITHOUT flag: header is C-quoted, both parsers drop the
            non-ASCII file (demonstrates the bug).
          - WITH flag: header is raw UTF-8, both parsers see the file.
          - parse_diff_into_files (the other parse path) also keeps
            the file with the flag.
          - get_git_diff end-to-end produces unquoted output whose
            file list includes the non-ASCII path.

  - 53/53 pass total (45 existing + 8 new) in 3.41s.

NOT verified end-to-end with a real CC commit-review fire on a
non-ASCII path. The static-shape tests catch the regression and the
end-to-end git-repo tests pin parser behavior, but the actual
LLM-review-with-vuln-found path requires runtime verification against
an Anthropic-API-credentialed CC session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 07:56:22 -07:00
Mohamed Hegazy
68a700837c Merge pull request #2075 from anthropics/fix-2056-windows-unicode-decode
security-guidance: lenient UTF-8 decode in 6 git-subprocess helpers (#2056)
2026-05-28 23:36:36 -07:00
Mohamed Hegazy
3d349d40b9 Merge pull request #2074 from anthropics/fix-xss-rules-non-js-false-positives
security-guidance: gate XSS pattern rules to JS-family files
2026-05-28 23:18:17 -07:00
Mohamed Hegazy
12a5376e20 security-guidance: gate XSS pattern rules to JS-family files
Closes #410, #2037, #2045, #1640, #1280, #1329, #1341, #255,
anthropics/claude-code#46720 (partial closes on overlap with other rules).

The plugin's substring-only XSS / browser-DOM rules
(new_function_injection, react_dangerously_set_html, document_write_xss,
innerHTML_xss, outerHTML_xss, insertAdjacentHTML_xss) fired on any file
containing the trigger substring — including:

  * Markdown documentation explaining XSS sinks
  * Blog posts / READMEs that name browser APIs
  * Python tutorials referencing dangerouslySetInnerHTML
  * Plugin skill files with example HTML strings
  * .yaml / .json configs that happen to contain the literal string
  * .gitignore / Dockerfile / Makefile

These constructs have no meaning outside JS/TS source. Add a
path_filter: lambda p: p.endswith(_JS_EXTS) to each so they fire only
on .js, .jsx, .ts, .tsx, .mjs, .cjs, .mts, .cts, .vue, .svelte.

Cross-checked against the existing _JS_EXTS-gated rules
(regex_exec_substring, child_process_exec, exec_substring) — same
pattern, same constant, same intent. Uses the module-level _JS_EXTS
tuple so future extension changes propagate to all 6 rules atomically.

Verified locally on macOS Python 3.13:
  - py_compile clean.
  - 45-test existing smoke + extensibility suite still passes.
  - 151 new parametrized tests in test_xss_gate.py (added to internal
    test suite this PR doesn't ship): each gated rule x every
    JS-family extension accepts, x every non-JS path (.md / .py /
    .yaml / .json / .txt / .html / Dockerfile / Makefile / .gitignore
    / .sh / .go / .rs / .rb) rejects. 196 tests pass total.

Doesn't address everything in the false-positive cluster — issues that
require Python-rule gating (#1114 .env.schema exec), tighter substring
scoping (#660 pickle in usernames), or hook-protocol changes (#1358
exit-2 vs warning, #1375 plain-text-vs-JSON output) need separate PRs.
This PR covers the JS-substring subset cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:07:53 -07:00
Mohamed Hegazy
04127de5d1 Merge pull request #2073 from anthropics/fix-2071-macos-python-39
security-guidance: enable LLM review on default macOS Python 3.9 (#2071)
2026-05-28 22:59:23 -07:00
Mohamed Hegazy
a67587c816 security-guidance: enable LLM review on default macOS Python 3.9
Fixes anthropics/claude-plugins-official#2071 — on macOS where the
default `python3` is Apple's Command Line Tools Python 3.9.6, the
plugin's agentic commit reviewer silently does not run, even when the
user has a newer Python installed.

Three compounding factors in the bug:

1. `sg-python.sh` only checks the major version (`3`), so it always
   picks 3.9 even when 3.10+ is on PATH.
2. `claude_agent_sdk` requires Python >=3.10 — pip install on 3.9
   returns "No matching distribution" -> bootstrap returns BUILD_FAILED.
3. Even with a hand-built 3.12 venv, `llm.py` imports the SDK
   in-process into the hook's interpreter (still 3.9), which raises
   SyntaxError. The existing venv-probe in `ensure_agent_sdk.py` uses
   the venv's own Python (3.12) so it reports NOOP_VENV (healthy) while
   the consumer fails — misleading telemetry on top of silent feature
   degradation.

Per BQ telemetry, 14,073 external macOS users hit
sdk_bootstrap=BUILD_FAILED in the past 4 days (the default-macOS
cohort), out of ~86K total external installed users. Combined with
~20K other users in similar broken-bootstrap states (Windows pre-#2055,
Linux <3.10), about half the installed base has a silently-broken
agentic reviewer.

This PR implements the reporter's items #1, #3, and #4. Item #2
(running the SDK out-of-process) is deferred as a bigger refactor.

Item #1 — hooks/sg-python.sh — prefer >=3.10 binaries via 3-pass probe:

  Pass 1: python3.13 / 3.12 / 3.11 / 3.10 (>=3.10 by name, highest wins)
  Pass 2: bare python3 / python / py -3 (accept only if reported >=3.10)
  Pass 3: bare python3 / python / py -3 (any Python 3, FALLBACK so
          pattern checks still work on macOS-default 3.9 — no regression
          vs today; SDK-dependent paths detect the version mismatch
          inside Python and degrade cleanly via item #4)

Item #4 — ensure_agent_sdk.py — health-check honesty:

Added HOOK_PY_INCOMPATIBLE=6 outcome with short-circuit at top of main():

  if sys.version_info < (3, 10):
      return HOOK_PY_INCOMPATIBLE, "hook_py", f"py_{...}"

Telemetry consequences after rollout: sdk_bootstrap=6 is a new clean
bucket; some users currently miscounted in sdk_bootstrap=3 BUILD_FAILED
(wasted pip cycles) and sdk_bootstrap=1 NOOP_VENV (falsely-healthy)
move to sdk_bootstrap=6. The remaining NOOP_VENV count becomes
trustworthy.

Item #3 — ensure_agent_sdk.py — one-time user-visible notice:

When outcome == HOOK_PY_INCOMPATIBLE and a marker file at
`~/.claude/security/.agentic_unavailable_notice_v<pv>` doesn't exist,
the SessionStart response includes hookSpecificOutput.additionalContext
+ systemMessage explaining the situation. Marker file is plugin-
version-keyed so a future fix (e.g. shipping out-of-process SDK) can
bump pv and re-notify users.

BUILD_FAILED is intentionally excluded from the notice — it covers
transient causes where a permanent banner would mislead.

Verified locally on macOS Python 3.13:
  - py_compile clean on both files.
  - Existing 45-test smoke + extensibility suite: 45/45 PASS in 2.50s.
  - Unit test of simulated 3.9 path: HOOK_PY_INCOMPATIBLE returned with
    correct phase/kind; notice shown on first call, suppressed on
    second, reshown on bumped pv; BUILD_FAILED correctly does NOT
    trigger notice.

NOT verified: actual Python 3.9 behavior end-to-end (would need a 3.9
install). Worth a follow-up smoke test in a 3.9 venv before next
release. The unit test simulating 3.9 covers the logic but not the
runtime invocation through the shim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 22:58:01 -07:00
5 changed files with 174 additions and 10 deletions

View File

@@ -32,6 +32,8 @@ BUILD_FAILED = 3 # venv create or pip install raised/timed out
# llm.py also matches Windows venv layout (Lib/site-packages). Don't reuse the
# value — telemetry rows from older plugin builds still emit 4.
SKIP_SENTINEL = 5 # another SessionStart is currently building
HOOK_PY_INCOMPATIBLE = 6 # hook interpreter is <3.10 — SDK syntax can't load
# here no matter how the venv was built. See #2071.
def _sdk_on_syspath() -> bool:
@@ -62,6 +64,29 @@ def main() -> tuple[int, str, str]:
err_phase / err_kind are non-empty only on BUILD_FAILED — they let
telemetry split bootstrap failures by root cause.
"""
# Honesty check (fixes the misleading NOOP_VENV in #2071): the SDK
# requires Python >=3.10 and uses 3.10+ syntax (match statements,
# PEP 604 unions). On a 3.9 hook interpreter we CANNOT import it no
# matter how the venv was built — llm.py runs in this same interpreter
# and the syntax-level import will SyntaxError. macOS ships 3.9.6 as
# the default `python3` and `/usr/bin` precedes Homebrew in PATH, so
# this case is the default state for a large share of macOS users.
#
# sg-python.sh now prefers python3.10+ binaries so most users won't
# reach this branch; the fallback to 3.9 is preserved for the
# pattern-warning hooks that don't need the SDK. Reporting
# HOOK_PY_INCOMPATIBLE here:
# (a) avoids 30-60s of wasted pip install,
# (b) avoids the lie where the venv_py probe says NOOP_VENV but the
# consumer import fails, and
# (c) gives telemetry a clean bucket to size the affected fleet.
if sys.version_info < (3, 10):
return (
HOOK_PY_INCOMPATIBLE,
"hook_py",
f"py_{sys.version_info[0]}.{sys.version_info[1]}",
)
if _sdk_on_syspath():
return NOOP_SYSTEM, "", ""
@@ -195,6 +220,56 @@ def main() -> tuple[int, str, str]:
sentinel.unlink(missing_ok=True)
def _maybe_emit_user_notice(outcome: int, pv: int) -> str | None:
"""Return a one-time user-visible notice when the agentic reviewer is
in a persistent broken state on this machine, or None if we've already
shown the notice for this plugin version (or shouldn't show one).
The marker file is plugin-version-keyed: a future plugin update can
re-notify if behavior changes (e.g. we ship out-of-process SDK in v3
and want to tell affected users it's fixed). Failures to write the
marker degrade to "skip the notice this session" so we don't spam
every SessionStart on a read-only home dir.
Currently only HOOK_PY_INCOMPATIBLE qualifies. BUILD_FAILED is
intentionally excluded — it covers transient causes (network failure,
pip registry hiccup, in-flight rebuild) where the next session may
succeed and a permanent notice would mislead.
"""
if outcome != HOOK_PY_INCOMPATIBLE:
return None
try:
state_dir = Path(
os.environ.get("SECURITY_WARNINGS_STATE_DIR")
or os.path.expanduser("~/.claude/security")
)
marker = state_dir / f".agentic_unavailable_notice_v{pv or 0}"
if marker.exists():
return None
state_dir.mkdir(parents=True, exist_ok=True)
# Write timestamp + Python version so the marker is self-documenting
# if a user goes looking. O_EXCL would be racier with no real win
# (two concurrent SessionStarts both showing the notice once is fine).
marker.write_text(
f"{time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())} "
f"py={sys.version_info[0]}.{sys.version_info[1]}\n"
)
except OSError:
return None
return (
f"⚠ security-guidance plugin: the cross-file commit reviewer "
f"(layer 3 of 3 — catches IDOR, auth-bypass, cross-file SSRF) "
f"is unavailable in this environment. It requires Python ≥3.10, "
f"but the hook is running on "
f"{sys.version_info[0]}.{sys.version_info[1]}.\n\n"
f"Pattern checks and the single-shot LLM diff review are still "
f"active. To enable the deeper reviewer, install Python 3.10+ "
f"(e.g. `brew install python` on macOS) and restart Claude Code.\n\n"
f"This notice is shown once per plugin version. "
f"See: github.com/anthropics/claude-plugins-official/issues/2071"
)
if __name__ == "__main__":
# Tell the harness this is async — venv create + pip install can take
# 30-60s on a cold cache, well past the default sync hook timeout.
@@ -231,4 +306,18 @@ if __name__ == "__main__":
pv = _plugin_version_int()
if pv:
metrics["pv"] = pv
print(json.dumps({"metrics": metrics}), flush=True)
response: dict[str, object] = {"metrics": metrics}
# One-time user-visible notice when the agentic reviewer is dead on
# arrival. Uses hookSpecificOutput.additionalContext (SessionStart's
# supported channel for surfacing text to both the model and the user)
# plus systemMessage as a belt-and-suspenders. Marker-file-gated so
# this fires exactly once per plugin version per install — see
# _maybe_emit_user_notice.
notice = _maybe_emit_user_notice(outcome, pv)
if notice:
response["hookSpecificOutput"] = {
"hookEventName": "SessionStart",
"additionalContext": notice,
}
response["systemMessage"] = notice
print(json.dumps(response), flush=True)

View File

@@ -199,8 +199,15 @@ def _git_diff_range(repo_root, base, head="HEAD"):
them reviewed — otherwise unreviewed commits get permanently silenced.
"""
try:
# core.quotePath=false makes git emit raw UTF-8 in `diff --git a/... b/...`
# headers instead of C-quoting non-ASCII path bytes (`"a/\303\201vila/..."`
# vs `a/Ávila/...`). The downstream `re.match(r'^a/(.+?) b/(.+)$', ...)`
# in parse_diff_into_files / extract_file_paths_from_diff matches the
# raw form only — quoted headers slip past and the entire file is
# silently dropped from review. See #2082 (sibling of #2056 / #2075).
r = subprocess.run(
[*GIT_CMD, "diff", "-p", "--no-color", "--no-ext-diff", base, head],
[*GIT_CMD, "-c", "core.quotePath=false",
"diff", "-p", "--no-color", "--no-ext-diff", base, head],
cwd=repo_root, capture_output=True, timeout=30,
)
if r.returncode != 0:
@@ -436,7 +443,11 @@ def get_git_diff(cwd, baseline_sha, full_context=False, paths=None, untracked_pa
# change exists to fix.
return ""
cmd = [*GIT_CMD, "diff", "--no-color", "--no-ext-diff", baseline_sha] + (["--unified=99999"] if full_context else []) + pathspec
# core.quotePath=false: emit raw UTF-8 in `diff --git a/... b/...` headers
# so non-ASCII paths aren't C-quoted past the downstream parse_diff_into_files
# regex. See #2082 (sibling of #2056 / #2075).
cmd = [*GIT_CMD, "-c", "core.quotePath=false",
"diff", "--no-color", "--no-ext-diff", baseline_sha] + (["--unified=99999"] if full_context else []) + pathspec
try:
with _temp_index(cwd, untracked_paths) as env:
# env is None when no index could be found (bare repo / not a

View File

@@ -94,6 +94,9 @@ Only use exec() if you absolutely need shell features and the input is guarantee
},
{
"ruleName": "new_function_injection",
# JS-only construct: gate to JS/TS files so docs/.md and other prose
# mentioning "new Function" don't trip the warning.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": ["new Function"],
"reminder": "\u26a0\ufe0f Security Warning: Using new Function() with string interpolation is a CODE INJECTION vulnerability. If any variable is concatenated or interpolated into the function body string, an attacker controlling that variable can execute arbitrary code. Use safe alternatives: for property access use obj[key] or array.reduce((o, k) => o[k], root); for computation use a safe expression parser. NEVER interpolate untrusted strings into new Function() bodies.",
},
@@ -107,16 +110,24 @@ Only use exec() if you absolutely need shell features and the input is guarantee
},
{
"ruleName": "react_dangerously_set_html",
# JS/TS-only (React); gate so .md docs / .py / .go files don't trip.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": ["dangerouslySetInnerHTML"],
"reminder": "⚠️ Security Warning: dangerouslySetInnerHTML can lead to XSS vulnerabilities if used with untrusted content. Ensure all content is properly sanitized using an HTML sanitizer library like DOMPurify, or use safe alternatives.",
},
{
"ruleName": "document_write_xss",
# Browser DOM API: only meaningful in JS/TS source.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": ["document.write"],
"reminder": "⚠️ Security Warning: document.write() can be exploited for XSS attacks and has performance issues. Use DOM manipulation methods like createElement() and appendChild() instead.",
},
{
"ruleName": "innerHTML_xss",
# Browser DOM API: only meaningful in JS/TS source. Closes FPs like
# docs/example HTML, playground/self-contained skills that hardcode
# innerHTML strings with zero user input (#410).
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": [".innerHTML =", ".innerHTML="],
"reminder": "⚠️ Security Warning: Setting innerHTML with untrusted content can lead to XSS vulnerabilities. Use textContent for plain text or safe DOM methods for HTML content. If you need HTML support, consider using an HTML sanitizer library such as DOMPurify.",
},
@@ -217,11 +228,15 @@ Additionally, validate user inputs:
},
{
"ruleName": "outerHTML_xss",
# Browser DOM API: only meaningful in JS/TS source.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": [".outerHTML =", ".outerHTML="],
"reminder": "⚠️ Security Warning: Use textContent or sanitize with DOMPurify. outerHTML assignment is an XSS sink equivalent to innerHTML.",
},
{
"ruleName": "insertAdjacentHTML_xss",
# Browser DOM API: only meaningful in JS/TS source.
"path_filter": lambda p: p.endswith(_JS_EXTS),
"substrings": [".insertAdjacentHTML("],
"reminder": "⚠️ Security Warning: Use insertAdjacentText() or sanitize with DOMPurify. insertAdjacentHTML is an XSS sink.",
},

View File

@@ -1118,16 +1118,21 @@ def handle_commit_review_posttooluse(input_data):
resolved = 0
for sha in shas:
try:
# core.quotePath=false: emit raw UTF-8 in `diff --git a/... b/...`
# headers so non-ASCII paths aren't C-quoted past the downstream
# parse_diff_into_files regex (sibling of #2056 / #2075). See #2082.
if pre_amend_sha:
# Delta review: pre-amend → post-amend. `git diff` (not show)
# so the output is a pure unified diff with no commit header.
result = subprocess.run(
[*GIT_CMD, "diff", "--no-color", "--no-ext-diff", pre_amend_sha, sha, "--"],
[*GIT_CMD, "-c", "core.quotePath=false",
"diff", "--no-color", "--no-ext-diff", pre_amend_sha, sha, "--"],
cwd=repo_root, capture_output=True, timeout=15
)
else:
result = subprocess.run(
[*GIT_CMD, "show", "-p", "--no-color", "--no-ext-diff", sha, "--"],
[*GIT_CMD, "-c", "core.quotePath=false",
"show", "-p", "--no-color", "--no-ext-diff", sha, "--"],
cwd=repo_root, capture_output=True, timeout=15
)
except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:

View File

@@ -47,21 +47,65 @@ fi
probe() {
# $1..N: the interpreter command (may be multi-word like `py -3`)
# Probe writes the major version to stdout and exits 0 iff it's >=3.
"$@" -c 'import sys; print(sys.version_info[0])' 2>/dev/null
# Writes "<major>.<minor>" to stdout and exits 0 iff at least Python 3.
"$@" -c 'import sys; print(f"{sys.version_info[0]}.{sys.version_info[1]}")' 2>/dev/null
}
# True iff arg is a "M.m" version string >= 3.10. claude_agent_sdk requires
# Python >= 3.10; below that, pip install fails ("No matching distribution")
# and the LLM-powered review (Stop / commit / push) silently no-ops while
# pattern checks (PostToolUse regex) keep working. macOS ships 3.9.6 as the
# default `python3` on current versions, so this guard matters in practice.
# See anthropics/claude-plugins-official#2071.
is_sdk_compatible() {
case "$1" in
3.1[0-9]|3.[2-9][0-9]|[4-9].*|[1-9][0-9].*) return 0 ;;
*) return 1 ;;
esac
}
# Pass 1 — try minor-versioned binaries in descending order. These are only
# present if the user explicitly installed them (Homebrew / python.org / pyenv),
# so picking one here always upgrades over the system `python3`. Highest
# available wins; the user doesn't have to PATH-prefer it.
for cmd in "python3.13" "python3.12" "python3.11" "python3.10"; do
v=$(probe "$cmd") || continue
if is_sdk_compatible "$v"; then
exec "$cmd" "$@"
fi
done
# Pass 2 — bare interpreters, but only if SDK-compatible. Covers Linux distros
# that ship 3.10+ as the default `python3`, and Windows where `python` /
# `py -3` resolves to the user's python.org install.
for cmd in "python3" "python" "py -3"; do
# Word-split intentionally so `py -3` works
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
if [ "$v" = "3" ]; then
if is_sdk_compatible "$v"; then
# shellcheck disable=SC2086
exec $cmd "$@"
fi
done
# Pass 3 — fallback to any Python 3, even <3.10. Pattern-based checks
# (PostToolUse regex on Edit/Write) only need 3.6+ and are useful on their
# own; the SDK-dependent paths will detect the version mismatch and degrade
# inside the Python code. Without this fallback, the entire plugin would
# stop working on default macOS, which is a regression vs today.
for cmd in "python3" "python" "py -3"; do
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
# Accept anything that successfully reported a "M.m" string.
case "$v" in
[0-9]*.[0-9]*)
# shellcheck disable=SC2086
exec $cmd "$@"
;;
esac
done
echo "security-guidance: no working Python 3 interpreter found." >&2
echo " tried: python3, python, py -3" >&2
echo " tried: python3.13, python3.12, python3.11, python3.10, python3, python, py -3" >&2
echo " on Windows, install Python from https://python.org (NOT the Microsoft Store)" >&2
echo " on macOS, install Python 3.10+ via Homebrew (\`brew install python\`)" >&2
exit 1