Compare commits

...

1 Commits

Author SHA1 Message Date
Mohamed Hegazy
a67587c816 security-guidance: enable LLM review on default macOS Python 3.9
Fixes anthropics/claude-plugins-official#2071 — on macOS where the
default `python3` is Apple's Command Line Tools Python 3.9.6, the
plugin's agentic commit reviewer silently does not run, even when the
user has a newer Python installed.

Three compounding factors in the bug:

1. `sg-python.sh` only checks the major version (`3`), so it always
   picks 3.9 even when 3.10+ is on PATH.
2. `claude_agent_sdk` requires Python >=3.10 — pip install on 3.9
   returns "No matching distribution" -> bootstrap returns BUILD_FAILED.
3. Even with a hand-built 3.12 venv, `llm.py` imports the SDK
   in-process into the hook's interpreter (still 3.9), which raises
   SyntaxError. The existing venv-probe in `ensure_agent_sdk.py` uses
   the venv's own Python (3.12) so it reports NOOP_VENV (healthy) while
   the consumer fails — misleading telemetry on top of silent feature
   degradation.

Per BQ telemetry, 14,073 external macOS users hit
sdk_bootstrap=BUILD_FAILED in the past 4 days (the default-macOS
cohort), out of ~86K total external installed users. Combined with
~20K other users in similar broken-bootstrap states (Windows pre-#2055,
Linux <3.10), about half the installed base has a silently-broken
agentic reviewer.

This PR implements the reporter's items #1, #3, and #4. Item #2
(running the SDK out-of-process) is deferred as a bigger refactor.

Item #1 — hooks/sg-python.sh — prefer >=3.10 binaries via 3-pass probe:

  Pass 1: python3.13 / 3.12 / 3.11 / 3.10 (>=3.10 by name, highest wins)
  Pass 2: bare python3 / python / py -3 (accept only if reported >=3.10)
  Pass 3: bare python3 / python / py -3 (any Python 3, FALLBACK so
          pattern checks still work on macOS-default 3.9 — no regression
          vs today; SDK-dependent paths detect the version mismatch
          inside Python and degrade cleanly via item #4)

Item #4 — ensure_agent_sdk.py — health-check honesty:

Added HOOK_PY_INCOMPATIBLE=6 outcome with short-circuit at top of main():

  if sys.version_info < (3, 10):
      return HOOK_PY_INCOMPATIBLE, "hook_py", f"py_{...}"

Telemetry consequences after rollout: sdk_bootstrap=6 is a new clean
bucket; some users currently miscounted in sdk_bootstrap=3 BUILD_FAILED
(wasted pip cycles) and sdk_bootstrap=1 NOOP_VENV (falsely-healthy)
move to sdk_bootstrap=6. The remaining NOOP_VENV count becomes
trustworthy.

Item #3 — ensure_agent_sdk.py — one-time user-visible notice:

When outcome == HOOK_PY_INCOMPATIBLE and a marker file at
`~/.claude/security/.agentic_unavailable_notice_v<pv>` doesn't exist,
the SessionStart response includes hookSpecificOutput.additionalContext
+ systemMessage explaining the situation. Marker file is plugin-
version-keyed so a future fix (e.g. shipping out-of-process SDK) can
bump pv and re-notify users.

BUILD_FAILED is intentionally excluded from the notice — it covers
transient causes where a permanent banner would mislead.

Verified locally on macOS Python 3.13:
  - py_compile clean on both files.
  - Existing 45-test smoke + extensibility suite: 45/45 PASS in 2.50s.
  - Unit test of simulated 3.9 path: HOOK_PY_INCOMPATIBLE returned with
    correct phase/kind; notice shown on first call, suppressed on
    second, reshown on bumped pv; BUILD_FAILED correctly does NOT
    trigger notice.

NOT verified: actual Python 3.9 behavior end-to-end (would need a 3.9
install). Worth a follow-up smoke test in a 3.9 venv before next
release. The unit test simulating 3.9 covers the logic but not the
runtime invocation through the shim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 22:58:01 -07:00
2 changed files with 139 additions and 6 deletions

View File

@@ -32,6 +32,8 @@ BUILD_FAILED = 3 # venv create or pip install raised/timed out
# llm.py also matches Windows venv layout (Lib/site-packages). Don't reuse the
# value — telemetry rows from older plugin builds still emit 4.
SKIP_SENTINEL = 5 # another SessionStart is currently building
HOOK_PY_INCOMPATIBLE = 6 # hook interpreter is <3.10 — SDK syntax can't load
# here no matter how the venv was built. See #2071.
def _sdk_on_syspath() -> bool:
@@ -62,6 +64,29 @@ def main() -> tuple[int, str, str]:
err_phase / err_kind are non-empty only on BUILD_FAILED — they let
telemetry split bootstrap failures by root cause.
"""
# Honesty check (fixes the misleading NOOP_VENV in #2071): the SDK
# requires Python >=3.10 and uses 3.10+ syntax (match statements,
# PEP 604 unions). On a 3.9 hook interpreter we CANNOT import it no
# matter how the venv was built — llm.py runs in this same interpreter
# and the syntax-level import will SyntaxError. macOS ships 3.9.6 as
# the default `python3` and `/usr/bin` precedes Homebrew in PATH, so
# this case is the default state for a large share of macOS users.
#
# sg-python.sh now prefers python3.10+ binaries so most users won't
# reach this branch; the fallback to 3.9 is preserved for the
# pattern-warning hooks that don't need the SDK. Reporting
# HOOK_PY_INCOMPATIBLE here:
# (a) avoids 30-60s of wasted pip install,
# (b) avoids the lie where the venv_py probe says NOOP_VENV but the
# consumer import fails, and
# (c) gives telemetry a clean bucket to size the affected fleet.
if sys.version_info < (3, 10):
return (
HOOK_PY_INCOMPATIBLE,
"hook_py",
f"py_{sys.version_info[0]}.{sys.version_info[1]}",
)
if _sdk_on_syspath():
return NOOP_SYSTEM, "", ""
@@ -195,6 +220,56 @@ def main() -> tuple[int, str, str]:
sentinel.unlink(missing_ok=True)
def _maybe_emit_user_notice(outcome: int, pv: int) -> str | None:
"""Return a one-time user-visible notice when the agentic reviewer is
in a persistent broken state on this machine, or None if we've already
shown the notice for this plugin version (or shouldn't show one).
The marker file is plugin-version-keyed: a future plugin update can
re-notify if behavior changes (e.g. we ship out-of-process SDK in v3
and want to tell affected users it's fixed). Failures to write the
marker degrade to "skip the notice this session" so we don't spam
every SessionStart on a read-only home dir.
Currently only HOOK_PY_INCOMPATIBLE qualifies. BUILD_FAILED is
intentionally excluded — it covers transient causes (network failure,
pip registry hiccup, in-flight rebuild) where the next session may
succeed and a permanent notice would mislead.
"""
if outcome != HOOK_PY_INCOMPATIBLE:
return None
try:
state_dir = Path(
os.environ.get("SECURITY_WARNINGS_STATE_DIR")
or os.path.expanduser("~/.claude/security")
)
marker = state_dir / f".agentic_unavailable_notice_v{pv or 0}"
if marker.exists():
return None
state_dir.mkdir(parents=True, exist_ok=True)
# Write timestamp + Python version so the marker is self-documenting
# if a user goes looking. O_EXCL would be racier with no real win
# (two concurrent SessionStarts both showing the notice once is fine).
marker.write_text(
f"{time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())} "
f"py={sys.version_info[0]}.{sys.version_info[1]}\n"
)
except OSError:
return None
return (
f"⚠ security-guidance plugin: the cross-file commit reviewer "
f"(layer 3 of 3 — catches IDOR, auth-bypass, cross-file SSRF) "
f"is unavailable in this environment. It requires Python ≥3.10, "
f"but the hook is running on "
f"{sys.version_info[0]}.{sys.version_info[1]}.\n\n"
f"Pattern checks and the single-shot LLM diff review are still "
f"active. To enable the deeper reviewer, install Python 3.10+ "
f"(e.g. `brew install python` on macOS) and restart Claude Code.\n\n"
f"This notice is shown once per plugin version. "
f"See: github.com/anthropics/claude-plugins-official/issues/2071"
)
if __name__ == "__main__":
# Tell the harness this is async — venv create + pip install can take
# 30-60s on a cold cache, well past the default sync hook timeout.
@@ -231,4 +306,18 @@ if __name__ == "__main__":
pv = _plugin_version_int()
if pv:
metrics["pv"] = pv
print(json.dumps({"metrics": metrics}), flush=True)
response: dict[str, object] = {"metrics": metrics}
# One-time user-visible notice when the agentic reviewer is dead on
# arrival. Uses hookSpecificOutput.additionalContext (SessionStart's
# supported channel for surfacing text to both the model and the user)
# plus systemMessage as a belt-and-suspenders. Marker-file-gated so
# this fires exactly once per plugin version per install — see
# _maybe_emit_user_notice.
notice = _maybe_emit_user_notice(outcome, pv)
if notice:
response["hookSpecificOutput"] = {
"hookEventName": "SessionStart",
"additionalContext": notice,
}
response["systemMessage"] = notice
print(json.dumps(response), flush=True)

View File

@@ -47,21 +47,65 @@ fi
probe() {
# $1..N: the interpreter command (may be multi-word like `py -3`)
# Probe writes the major version to stdout and exits 0 iff it's >=3.
"$@" -c 'import sys; print(sys.version_info[0])' 2>/dev/null
# Writes "<major>.<minor>" to stdout and exits 0 iff at least Python 3.
"$@" -c 'import sys; print(f"{sys.version_info[0]}.{sys.version_info[1]}")' 2>/dev/null
}
# True iff arg is a "M.m" version string >= 3.10. claude_agent_sdk requires
# Python >= 3.10; below that, pip install fails ("No matching distribution")
# and the LLM-powered review (Stop / commit / push) silently no-ops while
# pattern checks (PostToolUse regex) keep working. macOS ships 3.9.6 as the
# default `python3` on current versions, so this guard matters in practice.
# See anthropics/claude-plugins-official#2071.
is_sdk_compatible() {
case "$1" in
3.1[0-9]|3.[2-9][0-9]|[4-9].*|[1-9][0-9].*) return 0 ;;
*) return 1 ;;
esac
}
# Pass 1 — try minor-versioned binaries in descending order. These are only
# present if the user explicitly installed them (Homebrew / python.org / pyenv),
# so picking one here always upgrades over the system `python3`. Highest
# available wins; the user doesn't have to PATH-prefer it.
for cmd in "python3.13" "python3.12" "python3.11" "python3.10"; do
v=$(probe "$cmd") || continue
if is_sdk_compatible "$v"; then
exec "$cmd" "$@"
fi
done
# Pass 2 — bare interpreters, but only if SDK-compatible. Covers Linux distros
# that ship 3.10+ as the default `python3`, and Windows where `python` /
# `py -3` resolves to the user's python.org install.
for cmd in "python3" "python" "py -3"; do
# Word-split intentionally so `py -3` works
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
if [ "$v" = "3" ]; then
if is_sdk_compatible "$v"; then
# shellcheck disable=SC2086
exec $cmd "$@"
fi
done
# Pass 3 — fallback to any Python 3, even <3.10. Pattern-based checks
# (PostToolUse regex on Edit/Write) only need 3.6+ and are useful on their
# own; the SDK-dependent paths will detect the version mismatch and degrade
# inside the Python code. Without this fallback, the entire plugin would
# stop working on default macOS, which is a regression vs today.
for cmd in "python3" "python" "py -3"; do
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
# Accept anything that successfully reported a "M.m" string.
case "$v" in
[0-9]*.[0-9]*)
# shellcheck disable=SC2086
exec $cmd "$@"
;;
esac
done
echo "security-guidance: no working Python 3 interpreter found." >&2
echo " tried: python3, python, py -3" >&2
echo " tried: python3.13, python3.12, python3.11, python3.10, python3, python, py -3" >&2
echo " on Windows, install Python from https://python.org (NOT the Microsoft Store)" >&2
echo " on macOS, install Python 3.10+ via Homebrew (\`brew install python\`)" >&2
exit 1