Compare commits

...

1 Commits

Author SHA1 Message Date
Mohamed Hegazy
43fcf6d513 security-guidance: encode exception type + errno + ensurepip instrumentation for venv BUILD_FAILED (2.0.3 → 2.0.4)
Follow-up to #2154. v2.0.3 telemetry showed the venv BUILD_FAILED bucket
splits into two unexplained groups; this PR instruments both.

## 1. The exc: bucket — exception type + errno

The dominant remaining venv BUILD_FAILED (phase=venv, err=99) is ~99%
sdk_bootstrap_stderr_sig=NULL — Python exceptions caught by the generic
`except Exception` ("exc:<TypeName>"), not CalledProcessErrors with
categorizable stderr. ~56k/30h, all opaque (stderr_sig only covers
"other:<tail>").

  - Handler embeds errno for OSError-family: "exc:OSError:28", etc.
  - SDK_BOOTSTRAP_EXC_CODES maps the type → sdk_bootstrap_exc
    (FileNotFoundError=1 … OSError=6 … 99=other).
  - errno decoded → sdk_bootstrap_errno (ENOENT/EACCES/ENOSPC/…).

## 2. venv_ensurepip_fail instrumentation (the other category)

venv_ensurepip_fail (code 11) is the top categorizable venv failure, and
telemetry flipped the naive assumption: it's NOT just Debian/Ubuntu —
macOS has the MOST distinct affected users (466 vs 121 linux), and linux
is a retry storm (~172 fires/user). Before committing to a `pip install
--target` fallback (Option A) we need to know (a) which interpreter these
users run and (b) whether that interpreter even has pip (→ whether
--target would work, vs needing a system package).

  - sdk_hook_py (always emitted): interpreter version as major*100+minor
    (309/312). Disambiguates Apple-3.9 vs a 3.10+-with-broken-ensurepip,
    and also recovers the version for HOOK_PY_INCOMPATIBLE (whose "py_3.9"
    err_kind otherwise collapses to err=99).
  - sdk_has_pip (only on err==11, to avoid an extra subprocess per healthy
    session): whether `<interpreter> -m pip --version` works. has_pip=true
    → the --target fallback would fix them; has_pip=false → they need a
    system package (python3-venv / a complete Python).

Both #1 and #2 are purely additive telemetry on the existing BUILD_FAILED
path — no behavior change to the bootstrap. They de-risk the Option A
decision: ship A only if the affected cohort has pip.

Verified locally on macOS Python 3.13:
  - py_compile clean.
  - 39 tests in test_exc_failure_encoding.py (34 exc/errno + 5 ensurepip
    instrumentation): type-code map, errno extraction + round-trip,
    APPEND-ONLY stability, handler-embeds-errno, _probe_has_pip returns
    bool + true-on-this-machine, sdk_hook_py always-emitted as
    major*100+minor, sdk_has_pip gated on err==11.
  - Full suite: 503/503 pass + 2 skipped.

Version 2.0.3 -> 2.0.4 per the per-PR-bump policy (#2114).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-04 14:55:55 -07:00
3 changed files with 117 additions and 2 deletions

View File

@@ -2261,7 +2261,7 @@
{
"name": "security-guidance",
"description": "Security review for Claude-generated code. Pattern-based warnings on edits, LLM-powered diff review on Stop, and an agentic commit reviewer that catches injection, XSS, SSRF, hardcoded secrets, and 25+ other vulnerability classes.",
"version": "2.0.3",
"version": "2.0.4",
"author": {
"name": "Anthropic",
"email": "support@anthropic.com"

View File

@@ -1,6 +1,6 @@
{
"name": "security-guidance",
"version": "2.0.3",
"version": "2.0.4",
"description": "Security review for Claude-generated code. Pattern-based warnings on edits, LLM-powered diff review on Stop, and an agentic commit reviewer that catches injection, XSS, SSRF, hardcoded secrets, and 25+ other vulnerability classes.",
"author": {
"name": "David Dworken",

View File

@@ -102,6 +102,41 @@ SDK_BOOTSTRAP_ERR_CODES = {
"_uncategorized": 99,
}
# Exception-type encoding for the "exc:<TypeName>" err_kinds (the generic
# `except Exception` path — venv/pip raised a Python exception rather than
# a CalledProcessError with categorizable stderr).
#
# #2154 telemetry surfaced that the dominant remaining venv BUILD_FAILED
# bucket (phase=venv, err=99) is ~99% `exc:` with stderr_sig=NULL — i.e.
# exceptions, not stderr-bearing subprocess failures — so the stderr_sig
# hash couldn't distinguish them. This maps the exception TYPE to a stable
# code so BQ can tell FileNotFoundError (python/venv binary missing) from
# PermissionError (read-only home) from a bare OSError, etc.
#
# All the FileNotFoundError/PermissionError/etc. entries are OSError
# subclasses, so they ALSO carry an errno (see _encode_errno) — the type
# code gives the Python class, errno gives the OS-level cause. APPEND-ONLY.
SDK_BOOTSTRAP_EXC_CODES = {
"FileNotFoundError": 1, # interpreter/venv path component missing
"PermissionError": 2, # read-only home, sandboxed FS
"NotADirectoryError": 3,
"IsADirectoryError": 4,
"FileExistsError": 5, # (sentinel race is handled separately; this
# is FileExistsError from elsewhere in venv)
"OSError": 6, # bare OSError — errno carries the real cause
"BlockingIOError": 7,
"BrokenPipeError": 8,
"ConnectionError": 9,
"TimeoutError": 10, # distinct from subprocess.TimeoutExpired
"InterruptedError": 11,
"MemoryError": 12,
"UnicodeDecodeError": 13,
"ValueError": 14,
"RuntimeError": 15,
# 1698 reserved; APPEND-ONLY.
"_other_exc": 99, # an exception type not in this map
}
def _encode_phase(s):
"""Map err_phase string to its telemetry integer code, or 0 if unset.
@@ -158,6 +193,55 @@ def _encode_stderr_sig(err_kind):
return int.from_bytes(h[:2], "big") % 1000
def _encode_exc_kind(err_kind):
"""Map an "exc:<TypeName>[:errno]" err_kind to its exception-type code
(SDK_BOOTSTRAP_EXC_CODES). Returns 0 for non-exc err_kinds (so the
sdk_bootstrap_exc field auto-omits on stderr/categorized failures).
Unmapped exception types → 99 (_other_exc)."""
if not err_kind or not err_kind.startswith("exc:"):
return 0
# "exc:OSError:28" → "OSError"; "exc:RuntimeError" → "RuntimeError"
name = err_kind[len("exc:"):].split(":", 1)[0].strip()
if not name:
return 0
return SDK_BOOTSTRAP_EXC_CODES.get(name, SDK_BOOTSTRAP_EXC_CODES["_other_exc"])
def _encode_errno(err_kind):
"""Extract the OS errno from an "exc:<TypeName>:<errno>" err_kind.
OSError-family exceptions embed their errno (ENOENT=2, EACCES=13,
ENOSPC=28, …) — the OS-level cause is far more actionable than the
Python class alone. Returns 0 when absent/non-numeric (field omitted)."""
if not err_kind or not err_kind.startswith("exc:"):
return 0
parts = err_kind.split(":")
if len(parts) < 3:
return 0
try:
return int(parts[2])
except (ValueError, IndexError):
return 0
def _probe_has_pip() -> bool:
"""True iff the current interpreter can run pip (`-m pip --version`).
Probed only on the venv_ensurepip_fail path (see __main__), NOT on the
happy path — it's an extra subprocess we only want when diagnosing a
failure. The result decides whether a `pip install --target` fallback
(Option A) is even viable for this machine: ensurepip/venv missing but
pip present → --target would work; pip also missing → it wouldn't, and
the user needs a system package (python3-venv / a complete Python)."""
try:
r = subprocess.run(
[sys.executable, "-m", "pip", "--version"],
capture_output=True, timeout=10,
)
return r.returncode == 0
except Exception:
return False
def _sdk_on_syspath() -> bool:
# find_spec is ~10ms; actually importing the SDK pulls in
# transitive deps and costs ~800ms — too heavy for a
@@ -364,6 +448,13 @@ def main() -> tuple[int, str, str]:
except subprocess.TimeoutExpired:
return BUILD_FAILED, err_phase, "subprocess_timeout"
except Exception as e:
# Embed errno for OSError-family exceptions ("exc:OSError:28") so
# telemetry can decode the OS-level cause (ENOENT/EACCES/ENOSPC/…),
# not just the Python class. #2154 follow-up: this is the dominant
# remaining venv BUILD_FAILED bucket. See _encode_exc_kind/_encode_errno.
errno = getattr(e, "errno", None)
if isinstance(errno, int):
return BUILD_FAILED, err_phase, f"exc:{type(e).__name__}:{errno}"
return BUILD_FAILED, err_phase, f"exc:{type(e).__name__}"
finally:
# Only remove the sentinel if THIS process created it. The
@@ -467,6 +558,30 @@ if __name__ == "__main__":
sig = _encode_stderr_sig(err_kind)
if sig:
metrics["sdk_bootstrap_stderr_sig"] = sig
# Exception-type + errno for the "exc:" bucket (the dominant
# remaining venv BUILD_FAILED mode per #2154 telemetry). Both
# auto-omit (0) on stderr/categorized failures.
exc = _encode_exc_kind(err_kind)
if exc:
metrics["sdk_bootstrap_exc"] = exc
exc_errno = _encode_errno(err_kind)
if exc_errno:
metrics["sdk_bootstrap_errno"] = exc_errno
# venv_ensurepip_fail (code 11) is the top categorizable venv
# failure, and telemetry shows it's NOT just Debian — macOS has the
# most distinct affected users. Probe whether this interpreter has
# pip so we know if a `pip install --target` fallback (Option A)
# would actually help, vs the user needing a system package. Probed
# only here (not on the happy path) to avoid an extra subprocess
# per healthy session.
if _encode_err_kind(err_kind) == 11:
metrics["sdk_has_pip"] = _probe_has_pip()
# Interpreter version (major*100 + minor, e.g. 309 / 312), emitted on
# every bootstrap. Disambiguates the macOS cohort (Apple 3.9 vs a 3.10+
# with broken ensurepip) for both venv_ensurepip_fail AND
# HOOK_PY_INCOMPATIBLE (whose "py_3.9" err_kind otherwise collapses to
# err=99, losing the version). Cheap — no subprocess, just sys.version_info.
metrics["sdk_hook_py"] = sys.version_info[0] * 100 + sys.version_info[1]
pv = _plugin_version_int()
if pv:
metrics["pv"] = pv