Compare commits

..

1 Commits

Author SHA1 Message Date
Mohamed Hegazy
6a63e35e75 security-guidance: lenient UTF-8 decode in 6 git-subprocess helpers (#2056)
Fixes anthropics/claude-plugins-official#2056 — on Windows, when the
worktree contains an untracked file whose name has a character undefined
in cp1252 (accented capitals like Á Í Ï Ð Ý, most CJK, emoji), the
UserPromptSubmit hook crashes:

  Exception in thread Thread-5 (_readerthread):
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81
  Traceback (most recent call last):
    File diffstate.py, line 338, in _list_untracked
      for p in r.stdout.split('\\0'):
  AttributeError: 'NoneType' object has no attribute 'split'

Non-blocking (UPS failures still let the prompt through) but the
baseline-untracked snapshot is silently lost, so the Stop-hook review
mis-handles pre-existing untracked files.

Root cause (reporter's diagnosis, verified):

1. core.quotePath=false makes git emit raw UTF-8 for non-ASCII filenames.
2. subprocess.run(..., text=True) decodes via
   locale.getpreferredencoding(False) in strict mode — on Windows that
   is cp1252, in which 0x81 / 0x8D / 0x8F / 0x90 / 0x9D are undefined.
   Those bytes appear in the UTF-8 encodings of Á (C3 81), Í (C3 8D),
   Ï (C3 8F), Ð (C3 90), Ý (C3 9D), and a large fraction of CJK / emoji
   codepoints.
3. The decode runs in the subprocess reader thread. The thread raises
   UnicodeDecodeError, threading prints 'Exception in thread Thread-N',
   subprocess.run returns with stdout=None. The handler then does
   None.split('\\0') -> AttributeError, which is NOT in the narrow
   except (TimeoutExpired, FileNotFoundError, OSError) tuple, so it
   escapes the helper, propagates out of UserPromptSubmit's
   ThreadPoolExecutor.result(), and exits the hook non-zero.

This is internally inconsistent: gitutil._git_diff_range,
security_reminder_hook._reflog_amend_lookup (line ~540), and the commit
diff loop (line ~1115) already do bytes + decode utf-8/replace, with
comments explicitly noting that text=True would crash. The fix below
extends that established pattern to the helpers that were holdouts.

Affected helpers (6 total):

  - diffstate._list_untracked            <- reporter, hot path, CRITICAL
  - diffstate.capture_git_baseline       <- reporter, latent
  - diffstate.get_baseline_file_content  <- audit, file content read, HIGH
  - gitutil._git_name_only                <- reporter, latent
  - gitutil._git_status_porcelain         <- reporter, latent
  - gitutil._git_reflog_recent_commits    <- audit, embeds %gs commit msg, HIGH

For each one:

  - Drop text=True from subprocess.run.
  - Decode r.stdout / r.stderr as .decode('utf-8', errors='replace').
  - Add ValueError to the except tuple as defense against any future
    strict-decode regression (UnicodeDecodeError is a ValueError
    subclass; including it explicitly degrades the helper to its
    empty/None return instead of escaping out of the hook).

Verified locally on macOS Python 3.13:

  - py_compile clean on both files.
  - 45 existing smoke + extensibility tests still pass.
  - 21 new internal tests (not in this PR — added to the team's local
    test suite at staging/tests/test_unicode_decode.py):
      * 18 static-shape parametrized: each of the 6 fixed helpers has
        no text=True in its subprocess calls, contains errors='replace',
        and lists ValueError in its except.
      * Deterministic end-to-end: create real git repo + Ávila_report.txt
        untracked, call _list_untracked, verify it returns
        {'Ávila_report.txt': <mtime>} without crashing.
      * Deterministic end-to-end: same for capture_git_baseline (verifies
        the latent stderr-warning case stays valid).
      * Deterministic end-to-end: get_baseline_file_content on a file
        whose content has 山田太郎 + 🎉; verify the bytes round-trip
        through the decode.
  - 66/66 tests pass total (45 existing + 21 new).

NOT verified end-to-end on Windows — would need actual cp1252 strict
decode to fire. Reporter has the deterministic repro and will
re-verify on their Win11 / Python 3.14.x setup before merge.

Not in this PR (defense-in-depth, lower risk):

  - 3 git rev-parse calls returning path output (gitutil._find_git_index,
    _git_toplevel, _git_dir) could fail on Windows if cwd is in a
    non-ASCII install directory. Same fix shape but unreported and
    much lower probability — worth a separate follow-up if anyone
    actually hits it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:15:16 -07:00
4 changed files with 90 additions and 165 deletions

View File

@@ -138,7 +138,17 @@ def restore_unreviewed_stop_state(session_id, paths, baseline_sha):
def get_baseline_file_content(session_id, file_path, cwd):
"""Get the content of a file at the baseline SHA. Returns None if unavailable."""
"""Get the content of a file at the baseline SHA. Returns None if unavailable.
Decode the file content as UTF-8 with errors="replace" rather than using
text=True: source files in user repos can be latin-1 / cp1252 / shift-jis
/ etc., and on Windows text=True would decode via locale.getpreferredencoding()
in strict mode and raise UnicodeDecodeError in the subprocess reader
thread — leaving result.stdout=None and propagating AttributeError when
the caller tries to use it. Same class as the existing migrations at
security_reminder_hook.py:540 (reflog subjects) and :1115 (commit
diffs); this helper was missed in that pass. See
anthropics/claude-plugins-official#2056."""
baseline_sha = load_baseline_sha(session_id)
if not baseline_sha:
return None
@@ -151,12 +161,12 @@ def get_baseline_file_content(session_id, file_path, cwd):
return None
result = subprocess.run(
[*GIT_CMD, "show", f"{baseline_sha}:{rel_path}"],
cwd=cwd, capture_output=True, text=True, timeout=5
cwd=cwd, capture_output=True, timeout=5
)
if result.returncode == 0:
return result.stdout
return (result.stdout or b"").decode("utf-8", errors="replace")
return None
except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError):
return None
@@ -173,11 +183,16 @@ def capture_git_baseline(cwd):
and `compute_v2_review_set` subtracts that set so pre-existing untracked
files are not reviewed as Claude-authored.
"""
# stdout is a SHA so text=True is safe on stdout, but a non-ASCII
# filename in `git stash create`'s STDERR warning (e.g. a worktree
# with `Ávila_report.txt` triggers a quotePath/locale warning) would
# trip the stderr reader thread on Windows cp1252. Decode both streams
# leniently for symmetry with _list_untracked. See #2056.
try:
# Check if HEAD exists (i.e., repo has at least one commit)
head_check = subprocess.run(
[*GIT_CMD, "rev-parse", "HEAD"],
cwd=cwd, capture_output=True, text=True, timeout=5
cwd=cwd, capture_output=True, timeout=5
)
if head_check.returncode != 0:
# No commits yet — skip review rather than creating commits in the user's repo
@@ -186,20 +201,20 @@ def capture_git_baseline(cwd):
result = subprocess.run(
[*GIT_CMD, "stash", "create"],
cwd=cwd, capture_output=True, text=True, timeout=15
cwd=cwd, capture_output=True, timeout=15
)
sha = result.stdout.strip()
sha = (result.stdout or b"").decode("utf-8", errors="replace").strip()
if sha:
return sha
# Working tree is clean — stash create returns empty. Use HEAD.
result = subprocess.run(
[*GIT_CMD, "rev-parse", "HEAD"],
cwd=cwd, capture_output=True, text=True, timeout=5
cwd=cwd, capture_output=True, timeout=5
)
sha = result.stdout.strip()
sha = (result.stdout or b"").decode("utf-8", errors="replace").strip()
return sha if sha else None
except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
debug_log(f"Failed to capture git baseline: {e}")
return None
@@ -323,19 +338,35 @@ def _list_untracked(cwd):
mtime is captured so an in-place edit during the turn is still reviewed.
Uses ls-files (not status) for the UPS path: the index diff isn't needed,
and ls-files --others only walks the worktree against .gitignore."""
and ls-files --others only walks the worktree against .gitignore.
Decodes stdout/stderr as UTF-8 with errors="replace" instead of using
text=True. With core.quotePath=false git emits raw UTF-8 bytes for
non-ASCII filenames; text=True decodes via locale.getpreferredencoding()
in strict mode — on Windows that's cp1252 with several undefined bytes
(0x81/0x8D/0x8F/0x90/0x9D), all of which appear in UTF-8 encodings of
common accented capitals (Á Í Ï Ð Ý) and most CJK/emoji codepoints.
A non-ASCII filename in the worktree crashed the subprocess reader
thread, left r.stdout=None, and propagated AttributeError out of the
helper — silently losing the baseline snapshot every UserPromptSubmit.
See anthropics/claude-plugins-official#2056. The sibling helpers in
gitutil.py already follow the lenient pattern; this function and
capture_git_baseline / _git_name_only / _git_status_porcelain were
the holdouts."""
try:
repo = _git_toplevel(cwd) or cwd
r = subprocess.run(
[*GIT_CMD, "-c", "core.quotePath=false", "ls-files",
"--others", "--exclude-standard", "-z"],
cwd=repo, capture_output=True, text=True, timeout=15,
cwd=repo, capture_output=True, timeout=15,
)
if r.returncode != 0:
debug_log(f"_list_untracked rc={r.returncode}: {r.stderr[:200]}")
stderr_str = (r.stderr or b"").decode("utf-8", errors="replace")
debug_log(f"_list_untracked rc={r.returncode}: {stderr_str[:200]}")
return {}
stdout = (r.stdout or b"").decode("utf-8", errors="replace")
out = {}
for p in r.stdout.split("\0"):
for p in stdout.split("\0"):
if not p:
continue
try:
@@ -346,7 +377,9 @@ def _list_untracked(cwd):
debug_log(f"_list_untracked: capped at {UNTRACKED_BASELINE_CAP}")
break
return out
except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
# ValueError guards against any future strict-decode regression
# so the helper degrades to {} instead of crashing the hook.
debug_log(f"_list_untracked error: {e}")
return {}

View File

@@ -32,8 +32,6 @@ BUILD_FAILED = 3 # venv create or pip install raised/timed out
# llm.py also matches Windows venv layout (Lib/site-packages). Don't reuse the
# value — telemetry rows from older plugin builds still emit 4.
SKIP_SENTINEL = 5 # another SessionStart is currently building
HOOK_PY_INCOMPATIBLE = 6 # hook interpreter is <3.10 — SDK syntax can't load
# here no matter how the venv was built. See #2071.
def _sdk_on_syspath() -> bool:
@@ -64,29 +62,6 @@ def main() -> tuple[int, str, str]:
err_phase / err_kind are non-empty only on BUILD_FAILED — they let
telemetry split bootstrap failures by root cause.
"""
# Honesty check (fixes the misleading NOOP_VENV in #2071): the SDK
# requires Python >=3.10 and uses 3.10+ syntax (match statements,
# PEP 604 unions). On a 3.9 hook interpreter we CANNOT import it no
# matter how the venv was built — llm.py runs in this same interpreter
# and the syntax-level import will SyntaxError. macOS ships 3.9.6 as
# the default `python3` and `/usr/bin` precedes Homebrew in PATH, so
# this case is the default state for a large share of macOS users.
#
# sg-python.sh now prefers python3.10+ binaries so most users won't
# reach this branch; the fallback to 3.9 is preserved for the
# pattern-warning hooks that don't need the SDK. Reporting
# HOOK_PY_INCOMPATIBLE here:
# (a) avoids 30-60s of wasted pip install,
# (b) avoids the lie where the venv_py probe says NOOP_VENV but the
# consumer import fails, and
# (c) gives telemetry a clean bucket to size the affected fleet.
if sys.version_info < (3, 10):
return (
HOOK_PY_INCOMPATIBLE,
"hook_py",
f"py_{sys.version_info[0]}.{sys.version_info[1]}",
)
if _sdk_on_syspath():
return NOOP_SYSTEM, "", ""
@@ -220,56 +195,6 @@ def main() -> tuple[int, str, str]:
sentinel.unlink(missing_ok=True)
def _maybe_emit_user_notice(outcome: int, pv: int) -> str | None:
"""Return a one-time user-visible notice when the agentic reviewer is
in a persistent broken state on this machine, or None if we've already
shown the notice for this plugin version (or shouldn't show one).
The marker file is plugin-version-keyed: a future plugin update can
re-notify if behavior changes (e.g. we ship out-of-process SDK in v3
and want to tell affected users it's fixed). Failures to write the
marker degrade to "skip the notice this session" so we don't spam
every SessionStart on a read-only home dir.
Currently only HOOK_PY_INCOMPATIBLE qualifies. BUILD_FAILED is
intentionally excluded — it covers transient causes (network failure,
pip registry hiccup, in-flight rebuild) where the next session may
succeed and a permanent notice would mislead.
"""
if outcome != HOOK_PY_INCOMPATIBLE:
return None
try:
state_dir = Path(
os.environ.get("SECURITY_WARNINGS_STATE_DIR")
or os.path.expanduser("~/.claude/security")
)
marker = state_dir / f".agentic_unavailable_notice_v{pv or 0}"
if marker.exists():
return None
state_dir.mkdir(parents=True, exist_ok=True)
# Write timestamp + Python version so the marker is self-documenting
# if a user goes looking. O_EXCL would be racier with no real win
# (two concurrent SessionStarts both showing the notice once is fine).
marker.write_text(
f"{time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())} "
f"py={sys.version_info[0]}.{sys.version_info[1]}\n"
)
except OSError:
return None
return (
f"⚠ security-guidance plugin: the cross-file commit reviewer "
f"(layer 3 of 3 — catches IDOR, auth-bypass, cross-file SSRF) "
f"is unavailable in this environment. It requires Python ≥3.10, "
f"but the hook is running on "
f"{sys.version_info[0]}.{sys.version_info[1]}.\n\n"
f"Pattern checks and the single-shot LLM diff review are still "
f"active. To enable the deeper reviewer, install Python 3.10+ "
f"(e.g. `brew install python` on macOS) and restart Claude Code.\n\n"
f"This notice is shown once per plugin version. "
f"See: github.com/anthropics/claude-plugins-official/issues/2071"
)
if __name__ == "__main__":
# Tell the harness this is async — venv create + pip install can take
# 30-60s on a cold cache, well past the default sync hook timeout.
@@ -306,18 +231,4 @@ if __name__ == "__main__":
pv = _plugin_version_int()
if pv:
metrics["pv"] = pv
response: dict[str, object] = {"metrics": metrics}
# One-time user-visible notice when the agentic reviewer is dead on
# arrival. Uses hookSpecificOutput.additionalContext (SessionStart's
# supported channel for surfacing text to both the model and the user)
# plus systemMessage as a belt-and-suspenders. Marker-file-gated so
# this fires exactly once per plugin version per install — see
# _maybe_emit_user_notice.
notice = _maybe_emit_user_notice(outcome, pv)
if notice:
response["hookSpecificOutput"] = {
"hookEventName": "SessionStart",
"additionalContext": notice,
}
response["systemMessage"] = notice
print(json.dumps(response), flush=True)
print(json.dumps({"metrics": metrics}), flush=True)

View File

@@ -259,19 +259,29 @@ def _git_reflog_recent_commits(repo_root, max_age_s=120, max_n=5):
# %gs (the reflog subject) is `commit: <commit-msg first line>` and can
# contain `|`; put it LAST so split("|", 2) leaves it intact. %H is
# hex and %ct is integer, so the first two fields are delimiter-safe.
#
# Bytes + decode utf-8/replace: %gs embeds commit-message subjects
# which git stores as raw bytes — commits can be authored in
# latin-1 / cp1252 / shift-jis etc., and text=True would raise
# UnicodeDecodeError in the subprocess reader thread on Windows
# cp1252 (subprocess.run returns r.stdout=None, then
# r.stdout.splitlines() AttributeErrors). Mirrors the existing
# migration at security_reminder_hook.py:540 — same pattern was
# missed here. See anthropics/claude-plugins-official#2056.
r = subprocess.run(
[*GIT_CMD, "log", "-g", "-n", str(max_n),
"--format=%H|%ct|%gs", "HEAD"],
cwd=repo_root, capture_output=True, text=True, timeout=5,
cwd=repo_root, capture_output=True, timeout=5,
)
except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError):
return [], 0
if r.returncode != 0:
return [], 0
stdout = (r.stdout or b"").decode("utf-8", errors="replace")
import time as _time
now = int(_time.time())
fresh, stale = [], 0
for idx, line in enumerate(r.stdout.splitlines()):
for idx, line in enumerate(stdout.splitlines()):
parts = line.split("|", 2)
if len(parts) != 3:
continue
@@ -306,23 +316,31 @@ def _git_name_only(cwd, base, include_untracked=False):
must distinguish None (error → don't trust as a filter) from set()
(genuinely nothing changed). `-c core.quotePath=false -z` keeps non-ASCII
and space-containing paths intact."""
# Decode stdout/stderr as UTF-8 with errors="replace" instead of using
# text=True. core.quotePath=false makes git emit raw UTF-8 for non-ASCII
# paths, and text=True on Windows decodes via cp1252 strict — a non-ASCII
# changed path would crash the subprocess reader thread, leave
# result.stdout=None, and propagate AttributeError out of the helper.
# Same fix shape as diffstate._list_untracked. See #2056.
def _run(env):
result = subprocess.run(
[*GIT_CMD, "-c", "core.quotePath=false", "diff", "--name-only", "-z", base],
cwd=cwd, capture_output=True, text=True, timeout=30,
cwd=cwd, capture_output=True, timeout=30,
env=env,
)
if result.returncode != 0:
debug_log(f"_git_name_only({base!r}) rc={result.returncode}: {result.stderr[:200]}")
stderr_str = (result.stderr or b"").decode("utf-8", errors="replace")
debug_log(f"_git_name_only({base!r}) rc={result.returncode}: {stderr_str[:200]}")
return None
return {p for p in result.stdout.split("\0") if p}
stdout = (result.stdout or b"").decode("utf-8", errors="replace")
return {p for p in stdout.split("\0") if p}
try:
if not include_untracked:
return _run(None)
with _temp_index(cwd) as env:
return _run(env)
except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
debug_log(f"_git_name_only({base!r}) error: {e}")
return None
@@ -339,17 +357,22 @@ def _git_status_porcelain(cwd):
collapses to `dir/`). Required so the untracked set subtracts cleanly
against the UPS-time `_list_untracked` snapshot, which uses ls-files and
therefore always lists individual files."""
# Lenient decode: same UTF-8 + errors="replace" pattern as the
# sibling helpers — a non-ASCII path in the worktree would otherwise
# crash the cp1252 reader thread on Windows. See #2056.
try:
r = subprocess.run(
[*GIT_CMD, "-c", "core.quotePath=false", "status",
"--porcelain=v1", "-uall", "-z"],
cwd=cwd, capture_output=True, text=True, timeout=30,
cwd=cwd, capture_output=True, timeout=30,
)
if r.returncode != 0:
debug_log(f"_git_status_porcelain rc={r.returncode}: {r.stderr[:200]}")
stderr_str = (r.stderr or b"").decode("utf-8", errors="replace")
debug_log(f"_git_status_porcelain rc={r.returncode}: {stderr_str[:200]}")
return None, None
tracked, untracked = set(), set()
entries = r.stdout.split("\0")
stdout = (r.stdout or b"").decode("utf-8", errors="replace")
entries = stdout.split("\0")
i = 0
while i < len(entries):
e = entries[i]
@@ -368,7 +391,9 @@ def _git_status_porcelain(cwd):
i += 1
i += 1
return tracked, untracked
except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError) as e:
# ValueError guards against any future strict-decode regression
# so the helper degrades to (None, None) instead of crashing.
debug_log(f"_git_status_porcelain error: {e}")
return None, None

View File

@@ -47,65 +47,21 @@ fi
probe() {
# $1..N: the interpreter command (may be multi-word like `py -3`)
# Writes "<major>.<minor>" to stdout and exits 0 iff at least Python 3.
"$@" -c 'import sys; print(f"{sys.version_info[0]}.{sys.version_info[1]}")' 2>/dev/null
# Probe writes the major version to stdout and exits 0 iff it's >=3.
"$@" -c 'import sys; print(sys.version_info[0])' 2>/dev/null
}
# True iff arg is a "M.m" version string >= 3.10. claude_agent_sdk requires
# Python >= 3.10; below that, pip install fails ("No matching distribution")
# and the LLM-powered review (Stop / commit / push) silently no-ops while
# pattern checks (PostToolUse regex) keep working. macOS ships 3.9.6 as the
# default `python3` on current versions, so this guard matters in practice.
# See anthropics/claude-plugins-official#2071.
is_sdk_compatible() {
case "$1" in
3.1[0-9]|3.[2-9][0-9]|[4-9].*|[1-9][0-9].*) return 0 ;;
*) return 1 ;;
esac
}
# Pass 1 — try minor-versioned binaries in descending order. These are only
# present if the user explicitly installed them (Homebrew / python.org / pyenv),
# so picking one here always upgrades over the system `python3`. Highest
# available wins; the user doesn't have to PATH-prefer it.
for cmd in "python3.13" "python3.12" "python3.11" "python3.10"; do
v=$(probe "$cmd") || continue
if is_sdk_compatible "$v"; then
exec "$cmd" "$@"
fi
done
# Pass 2 — bare interpreters, but only if SDK-compatible. Covers Linux distros
# that ship 3.10+ as the default `python3`, and Windows where `python` /
# `py -3` resolves to the user's python.org install.
for cmd in "python3" "python" "py -3"; do
# Word-split intentionally so `py -3` works
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
if is_sdk_compatible "$v"; then
if [ "$v" = "3" ]; then
# shellcheck disable=SC2086
exec $cmd "$@"
fi
done
# Pass 3 — fallback to any Python 3, even <3.10. Pattern-based checks
# (PostToolUse regex on Edit/Write) only need 3.6+ and are useful on their
# own; the SDK-dependent paths will detect the version mismatch and degrade
# inside the Python code. Without this fallback, the entire plugin would
# stop working on default macOS, which is a regression vs today.
for cmd in "python3" "python" "py -3"; do
# shellcheck disable=SC2086
v=$(probe $cmd) || continue
# Accept anything that successfully reported a "M.m" string.
case "$v" in
[0-9]*.[0-9]*)
# shellcheck disable=SC2086
exec $cmd "$@"
;;
esac
done
echo "security-guidance: no working Python 3 interpreter found." >&2
echo " tried: python3.13, python3.12, python3.11, python3.10, python3, python, py -3" >&2
echo " tried: python3, python, py -3" >&2
echo " on Windows, install Python from https://python.org (NOT the Microsoft Store)" >&2
echo " on macOS, install Python 3.10+ via Homebrew (\`brew install python\`)" >&2
exit 1