From b7c0654137cdc5f2c9d668de15563ffe3d7e2a2d Mon Sep 17 00:00:00 2001
From: Tobin South
Date: Mon, 18 May 2026 12:55:20 -0700
Subject: [PATCH] Raise bump cap with verdict cache and skip-and-revert (#1913)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Cache scan verdicts and drop policy-failing entries from bump PRs

Three changes that together let the nightly bump clear any backlog in a
single run without blocking on a single bad upstream or re-burning Claude
time on already-scanned SHAs:

- bump-plugin-shas.yml: raise max-bumps default 20 -> 130 (above the
  external entry count, so a single run can clear a full backlog) and add
  an explicit 60-min job timeout. The cap was the only thing bounding the
  blast radius of a single policy failure; the changes below take over
  that role so the cap can be lifted.

- scan-plugins.yml: add a verdict cache keyed on (plugin, sha, policy
  hash). The bump action force-resets bump/plugin-shas every night, which
  makes the same SHAs reappear in the diff on consecutive nights — without
  the cache the scan would re-burn ~90s of Claude time per entry per
  night. Cached verdicts (pass and fail) are served from disk; only
  uncached SHAs are scanned. The job still fails on cached failures so the
  required check stays honest.

- revert-failed-bumps.yml (new): after a Scan Plugins workflow_run on
  bump/plugin-shas concludes with a failure, drop just the failing
  entries' source.sha back to main's pin via a follow-up signed commit and
  re-dispatch the scan. The re-dispatch finds only cached-pass entries and
  goes green in seconds. Bounded at 3 passes/night, restricted to SHA-only
  diffs, and aborts if the bump branch was tampered with.

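The cached/uncached split described in the scan-plugins.yml bullet can be sketched in plain shell. This is only an illustration under stated assumptions: the `CACHE` string, `lookup`, and `classify` are hypothetical names that do not appear in the workflow, the real cache is a JSON file processed with jq, and the policy-hash and source-equality parts of the key are left out here.

```shell
#!/bin/sh
# Hypothetical stand-in for the on-disk verdict map: one "name@sha verdict"
# pair per line. The real workflow stores JSON and also compares the full
# source object; both are omitted in this sketch.
CACHE='alpha@1111 pass
beta@2222 fail'

# Print the cached verdict for a key, or nothing on a cache miss.
lookup() {
  printf '%s\n' "$CACHE" | awk -v key="$1" '$1 == key { print $2 }'
}

# Decide what to do with one changed entry: reuse a cached pass, reuse a
# cached fail (which should still fail the job), or scan it fresh.
classify() {
  case "$(lookup "$1")" in
    pass) echo "cached-pass" ;;
    fail) echo "cached-fail" ;;
    *)    echo "scan" ;;
  esac
}

for key in alpha@1111 beta@2222 gamma@3333; do
  echo "$key -> $(classify "$key")"
done
```

Only `gamma@3333` would be handed to the scanner in this sketch; `beta@2222` is served from the cache but is still reported as a failure, which is what keeps the required check honest.
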
* Harden bump cache and revert workflows after review

- revert-failed-bumps: replace the time-based revert budget (anchored on
  the PR head, which a revert commit immediately replaces — never
  accumulating past 1) with a commit count: every nightly bump
  force-resets to one commit and every revert pass adds exactly one, so
  commits > MAX+1 is the budget without date math, pagination, or exposure
  to comment spoofing.
- revert-failed-bumps: filter the bump PR by head owner so a fork PR with
  a branch named bump/plugin-shas can't be selected.
- revert-failed-bumps: continue-on-error on the artifact download so a
  scan that died before uploading (infra error) doesn't fail the revert
  job — the missing-file guard downstream handles it.
- scan-plugins: add a per-ref concurrency group so concurrent scans don't
  lose one another's cache writes; key the cache on run_attempt so a
  re-run can save its own verdicts.
- scan-plugins: store the full source object in the cache and require
  source equality on lookup, so a repo/path change at the same SHA misses
  the cache instead of getting a stale verdict.
- scan-plugins / revert-failed-bumps: strip markdown control chars, wrap
  model-generated text in code spans (neutralizes auto-linked URLs), and
  redact key-shaped tokens before they reach the step summary, artifact,
  cache, or PR comment.
---
 .github/workflows/bump-plugin-shas.yml    |  16 +-
 .github/workflows/revert-failed-bumps.yml | 284 +++++++++++++++++++
 .github/workflows/scan-plugins.yml        | 317 +++++++++++++++++++++-
 3 files changed, 610 insertions(+), 7 deletions(-)
 create mode 100644 .github/workflows/revert-failed-bumps.yml

diff --git a/.github/workflows/bump-plugin-shas.yml b/.github/workflows/bump-plugin-shas.yml
index 111c622..dab017e 100644
--- a/.github/workflows/bump-plugin-shas.yml
+++ b/.github/workflows/bump-plugin-shas.yml
@@ -13,6 +13,14 @@ name: Bump Plugin SHAs
 # the scan ourselves on the bump branch after the PR is opened.
The check run # lands on the branch HEAD — the same SHA as the PR head — and satisfies the # required check. +# +# max-bumps is set above the external-entry count so a single run can clear +# any backlog. The cost-control mechanisms are downstream: +# - scan-plugins.yml caches verdicts by (plugin, sha) so an unchanged SHA +# is never re-scanned across nightly force-resets. +# - revert-failed-bumps.yml drops policy-failing entries from the bump PR +# so one bad upstream can't block the rest. +# See those files for details. on: schedule: @@ -22,7 +30,7 @@ on: max_bumps: description: Cap on plugins bumped this run required: false - default: '20' + default: '130' permissions: contents: write @@ -35,6 +43,10 @@ concurrency: jobs: bump: runs-on: ubuntu-latest + # Per-bump cost is ~2s (ls-remote + shallow clone + validate); 130 entries + # is ~5 min. The 60 min ceiling absorbs slow upstreams without letting a + # pathological run consume the default 360 min budget. + timeout-minutes: 60 steps: - uses: actions/checkout@v4 @@ -44,7 +56,7 @@ jobs: id: bump with: marketplace-path: .claude-plugin/marketplace.json - max-bumps: ${{ inputs.max_bumps || '20' }} + max-bumps: ${{ inputs.max_bumps || '130' }} claude-cli-version: latest # `bump/plugin-shas` is the action's default `pr-branch`. The scan diffs diff --git a/.github/workflows/revert-failed-bumps.yml b/.github/workflows/revert-failed-bumps.yml new file mode 100644 index 0000000..37cbb4c --- /dev/null +++ b/.github/workflows/revert-failed-bumps.yml @@ -0,0 +1,284 @@ +name: Revert Failed Bumps + +# Drops policy-failing entries from a bump PR so one bad upstream can't +# block the rest. Runs after a Scan Plugins workflow_run on bump/plugin-shas +# concludes with a failure: read the per-entry verdicts the scan uploaded, +# revert just the failing entries' source.sha back to main's pin, push a +# follow-up signed commit, and re-dispatch the scan. 
The re-dispatched scan +# finds only cached-pass entries in the new diff and goes green in seconds. +# +# Scope and guardrails — this job has contents:write so it must be tight: +# - Only acts on bump/plugin-shas (literal branch match). +# - Only acts when the scan was dispatched (workflow_dispatch event), i.e. +# by bump-plugin-shas.yml. A scan on a regular PR never triggers this. +# - Only reverts source.sha. If any other field in a failing entry differs +# from main, the run aborts — that means the bump branch was tampered +# with and a human needs to look. +# - Bounded at MAX_REVERT_PASSES per night by counting commits on the bump +# PR; a persistent loop means the cache or scan is broken and a human +# needs to look. +# - The revert commit is created with createCommitOnBranch (GitHub-signed, +# compare-and-swap via expectedHeadOid) — no signing key on the runner. + +on: + workflow_run: + workflows: ["Scan Plugins"] + types: [completed] + +permissions: + contents: read + +env: + MARKETPLACE: .claude-plugin/marketplace.json + BUMP_BRANCH: bump/plugin-shas + MAX_REVERT_PASSES: '3' + REVERT_MARKER: '' + +jobs: + revert: + # Tight gate: the triggering scan must be a workflow_dispatch run on the + # bump branch (i.e. the one bump-plugin-shas.yml dispatched) that failed. + # A scan on a regular PR, a passing scan, or a manual dispatch on another + # branch must never reach this job.
+ if: > + github.event.workflow_run.conclusion == 'failure' && + github.event.workflow_run.event == 'workflow_dispatch' && + github.event.workflow_run.head_branch == 'bump/plugin-shas' + runs-on: ubuntu-latest + timeout-minutes: 15 + permissions: + contents: write # createCommitOnBranch on bump/plugin-shas + pull-requests: write # comment on / close the bump PR + actions: write # gh workflow run scan-plugins.yml --ref bump/plugin-shas + concurrency: + group: revert-failed-bumps + cancel-in-progress: false + steps: + # The artifact carries run-failed.json (just plugin names) and + # run-verdicts.json (full per-entry verdicts for the PR comment). It is + # uploaded by scan-plugins.yml for every relevant run so we can tell + # "policy failures found" from "scan never ran" (infra error → no revert). + # The artifact won't exist when the scan died before the upload step + # (cache restore error, jq failure, timeout) — that is an infra error, + # not a policy failure, so the right move is to do nothing. The + # download must not fail the job; the next step handles the missing file. + - name: Download scan verdicts + continue-on-error: true + uses: actions/download-artifact@v4 + with: + name: scan-verdicts + run-id: ${{ github.event.workflow_run.id }} + github-token: ${{ github.token }} + path: scan-out + + - name: Determine revert set + id: plan + run: | + set -euo pipefail + if [[ ! -f scan-out/run-failed.json ]]; then + echo "::warning::No run-failed.json in scan artifact — nothing to revert." + echo "act=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + if ! jq -e 'type == "array"' scan-out/run-failed.json >/dev/null 2>&1; then + echo "::warning::run-failed.json is not a JSON array — refusing to act." + echo "act=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + fail_count="$(jq 'length' scan-out/run-failed.json)" + if [[ "$fail_count" -eq 0 ]]; then + # The scan job failed but reported zero policy failures: that is + # an infra error (API key missing, clone failure, schema break). 
+ # Reverting nothing is correct; surfacing the infra error is the + # scan job's responsibility. + echo "::notice::Scan failed with zero parsed policy failures — infra error, not a policy failure. Not reverting." + echo "act=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + echo "act=true" >> "$GITHUB_OUTPUT" + echo "fail_count=$fail_count" >> "$GITHUB_OUTPUT" + echo "Failing entries:" + jq -r '.[]' scan-out/run-failed.json + + - name: Locate bump PR and check revert budget + if: steps.plan.outputs.act == 'true' + id: pr + env: + GH_TOKEN: ${{ github.token }} + REPO: ${{ github.repository }} + run: | + set -euo pipefail + # Resolve the bump PR by head ref. `gh pr list --head ` matches + # by ref name across forks, so reject any PR whose head repo isn't + # ours — a fork PR named bump/plugin-shas must never reach the + # contents:write paths below. + pr_json="$(gh api "repos/$REPO/pulls?head=${REPO%%/*}:$BUMP_BRANCH&base=main&state=open&per_page=1" \ + --jq '.[0] // empty')" + if [[ -z "$pr_json" ]]; then + echo "::warning::No open bump PR on $BUMP_BRANCH — nothing to revert." + echo "act=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + pr_number="$(jq -r '.number' <<<"$pr_json")" + head_repo="$(jq -r '.head.repo.full_name' <<<"$pr_json")" + head_sha="$(jq -r '.head.sha' <<<"$pr_json")" + # The list endpoint omits `commits`; the single-PR endpoint has it. + commit_count="$(gh api "repos/$REPO/pulls/$pr_number" --jq '.commits')" + if [[ "$head_repo" != "$REPO" ]]; then + echo "::error::Bump PR head is from $head_repo, not $REPO — refusing to act." + echo "act=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + # Loop bound: every nightly bump force-resets the branch to a single + # commit and every revert pass adds exactly one. Counting commits is + # therefore the per-night pass count + 1, with no date math, no + # pagination, and no exposure to comment spoofing. 
+ if [[ "$commit_count" -gt $(( MAX_REVERT_PASSES + 1 )) ]]; then + echo "::error::Revert budget exhausted ($((commit_count - 1))/$MAX_REVERT_PASSES passes on this PR). The cache or scan is likely broken — needs a human." + gh pr comment "$pr_number" --repo "$REPO" --body \ + "$REVERT_MARKER"$'\n\n'"⚠️ Revert budget exhausted ($((commit_count - 1)) passes). The scan keeps failing after reverting — likely a cache or scan bug. Pausing automatic reverts until the next nightly bump." + echo "act=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + echo "Bump PR #$pr_number @ $head_sha ($commit_count commit(s))" + { + echo "act=true" + echo "number=$pr_number" + echo "head_sha=$head_sha" + } >> "$GITHUB_OUTPUT" + + - name: Revert failing SHAs + if: steps.plan.outputs.act == 'true' && steps.pr.outputs.act == 'true' + id: revert + env: + GH_TOKEN: ${{ github.token }} + REPO: ${{ github.repository }} + HEAD_SHA: ${{ steps.pr.outputs.head_sha }} + run: | + set -euo pipefail + mkdir -p work + + gh api "repos/$REPO/contents/${MARKETPLACE}?ref=$HEAD_SHA" --jq '.content' | base64 -d > work/head.json + gh api "repos/$REPO/contents/${MARKETPLACE}?ref=main" --jq '.content' | base64 -d > work/base.json + + # Build the reverted marketplace: for each failing plugin, restore + # source.sha to main's value. Refuse if anything else differs — a + # difference outside source.sha on a bump-branch entry means the + # branch was tampered with. + jq -c -s \ + '.[0] as $head | .[1] as $base | (.[2] | map({(.): true}) | add // {}) as $fail + | ($base.plugins | map({(.name): .}) | add // {}) as $b + | $head | .plugins = [ + .plugins[] | + if ($fail[.name] // false) and ($b[.name] // null) != null then + # Verify the only delta is source.sha — never silently + # accept a structural change masquerading as a bump. + if (. 
| del(.source.sha)) == ($b[.name] | del(.source.sha)) then + .source.sha = $b[.name].source.sha + else + error("entry \(.name) differs from main beyond source.sha — refusing to revert") + end + else . end + ]' \ + work/head.json work/base.json scan-out/run-failed.json > work/reverted.json.compact + + # Match the marketplace's existing pretty-print so the diff is + # human-reviewable. + jq --indent 2 '.' work/reverted.json.compact > work/reverted.json + + # Two no-action cases: + # - nothing actually reverted (failed names not in this PR's diff) + # - everything reverted (the file is back to main → PR is empty) + if cmp -s work/reverted.json.compact <(jq -c '.' work/head.json); then + echo "::notice::No entries to revert (failing names not in this PR)." + echo "committed=false" >> "$GITHUB_OUTPUT" + echo "empty=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + if cmp -s work/reverted.json.compact <(jq -c '.' work/base.json); then + echo "::warning::Every bumped entry failed policy — the PR would be empty." + echo "committed=false" >> "$GITHUB_OUTPUT" + echo "empty=true" >> "$GITHUB_OUTPUT" + exit 0 + fi + + # Vendored entries have a string `source` — restrict to object + # sources or `.source.sha` errors. + reverted="$(jq -c -s \ + '.[0] as $head | .[1] as $rev + | ($head.plugins | map(select(.source | type == "object") | {(.name): .source.sha}) | add // {}) as $h + | [$rev.plugins[] | select(.source | type == "object") + | select(($h[.name] // null) != .source.sha) | .name]' \ + work/head.json work/reverted.json.compact)" + echo "Reverted: $reverted" + echo "reverted=$reverted" >> "$GITHUB_OUTPUT" + + msg="Drop $(jq 'length' <<<"$reverted") policy-failing entries from bump" + # createCommitOnBranch: GitHub-signed, expectedHeadOid CAS so a + # concurrent force-reset from the nightly bump fails this push + # loudly instead of being clobbered. The base64'd marketplace can + # exceed MAX_ARG_STRLEN, so the body travels via stdin. 
+ oid="$(jq -n \ + --rawfile content work/reverted.json \ + --arg repo "$REPO" \ + --arg branch "$BUMP_BRANCH" \ + --arg oid "$HEAD_SHA" \ + --arg msg "$msg" \ + --arg path "$MARKETPLACE" \ + '{ + query: "mutation($repo:String!,$branch:String!,$oid:GitObjectID!,$msg:String!,$path:String!,$contents:Base64String!){createCommitOnBranch(input:{branch:{repositoryNameWithOwner:$repo,branchName:$branch},message:{headline:$msg},fileChanges:{additions:[{path:$path,contents:$contents}]},expectedHeadOid:$oid}){commit{oid}}}", + variables: { repo: $repo, branch: $branch, oid: $oid, msg: $msg, path: $path, contents: ($content | @base64) } + }' \ + | gh api graphql --input - --jq '.data.createCommitOnBranch.commit.oid')" + [[ "$oid" =~ ^[0-9a-f]{40}$ ]] || { echo "::error::createCommitOnBranch did not return a commit OID."; exit 1; } + echo "committed=true" >> "$GITHUB_OUTPUT" + echo "empty=false" >> "$GITHUB_OUTPUT" + echo "::notice::Pushed revert commit $oid to $BUMP_BRANCH." + + - name: Close empty bump PR + if: steps.revert.outputs.empty == 'true' + env: + GH_TOKEN: ${{ github.token }} + REPO: ${{ github.repository }} + PR: ${{ steps.pr.outputs.number }} + run: | + set -euo pipefail + gh pr comment "$PR" --repo "$REPO" --body \ + "$REVERT_MARKER"$'\n\n'"Every bumped entry failed the policy scan. Closing — the next nightly run will retry." + gh pr close "$PR" --repo "$REPO" + + - name: Comment with revert detail + if: steps.revert.outputs.committed == 'true' + env: + GH_TOKEN: ${{ github.token }} + REPO: ${{ github.repository }} + PR: ${{ steps.pr.outputs.number }} + REVERTED: ${{ steps.revert.outputs.reverted }} + SCAN_RUN_URL: ${{ github.event.workflow_run.html_url }} + run: | + set -euo pipefail + { + printf '%s\n\n' "$REVERT_MARKER" + echo "Dropped $(jq 'length' <<<"$REVERTED") entrie(s) that failed the policy scan. The remaining bumps were unaffected." 
+ echo + echo "| Plugin | Violations |" + echo "|---|---|" + # `violations` is model-generated text shaped by a cloned external + # repo. Strip markdown control characters and wrap in a code span + # so a prompt-injected upstream can't smuggle links/images/table + # breakouts into a public PR comment. + jq -r --argjson rev "$REVERTED" \ + 'def neutralize: gsub("[|\n\r\\[\\]<>`]"; " "); + .[] | select(.name as $n | $rev | index($n)) + | "| \(.name) | `\(.violations | neutralize | .[0:200])` |"' \ + scan-out/run-verdicts.json + echo + echo "These entries will be retried at their next upstream SHA. See the [scan run]($SCAN_RUN_URL) for full verdicts." + } > /tmp/comment.md + gh pr comment "$PR" --repo "$REPO" --body-file /tmp/comment.md + + - name: Re-dispatch scan on revised bump branch + if: steps.revert.outputs.committed == 'true' + env: + GH_TOKEN: ${{ github.token }} + run: gh workflow run scan-plugins.yml --ref "$BUMP_BRANCH" diff --git a/.github/workflows/scan-plugins.yml b/.github/workflows/scan-plugins.yml index 14bf9b4..f54764a 100644 --- a/.github/workflows/scan-plugins.yml +++ b/.github/workflows/scan-plugins.yml @@ -7,6 +7,19 @@ name: Scan Plugins # PRs blocked forever — so this workflow runs on every PR and skips the heavy # scan setup at the step level when nothing scan-relevant changed. The check # always reports. +# +# Verdict cache: each (plugin, sha) pair is scanned at most once. The bump +# workflow force-resets bump/plugin-shas every night, which makes the same +# SHAs reappear in the diff on consecutive nights — without a cache, the +# scan would re-burn ~90s of Claude time per entry per night. The cache is +# keyed on the policy hash so a prompt or schema change invalidates all +# verdicts and triggers a clean re-scan. +# +# Failure handling: a cached `passes:false` verdict still fails the job. 
The +# Revert Failed Bumps workflow (revert-failed-bumps.yml) reacts to that by +# dropping the failing entries from the bump PR, so one bad upstream can't +# block the rest. After the revert, the re-dispatched scan finds only +# cached-pass entries and goes green in seconds. on: pull_request: @@ -20,6 +33,18 @@ on: permissions: contents: read +# Serialize scans per ref so concurrent runs (a re-dispatch racing the +# original, or a manual dispatch) don't both restore the same cache, scan +# overlapping sets, and lose one another's verdicts on save. +concurrency: + group: scan-plugins-${{ github.event.pull_request.number || github.ref }} + cancel-in-progress: false + +env: + MARKETPLACE: .claude-plugin/marketplace.json + CACHE_DIR: ${{ github.workspace }}/.scan-cache + CACHE_TTL_DAYS: '30' + jobs: scan: runs-on: ubuntu-latest @@ -37,11 +62,14 @@ jobs: EVENT_NAME: ${{ github.event_name }} BASE_SHA: ${{ github.event.pull_request.base.sha }} run: | + set -euo pipefail if [[ "$EVENT_NAME" == "workflow_dispatch" ]]; then echo "relevant=true" >> "$GITHUB_OUTPUT" + echo "base_ref=origin/main" >> "$GITHUB_OUTPUT" exit 0 fi - if git diff --quiet "$BASE_SHA" HEAD -- .claude-plugin/marketplace.json .github/policy/; then + echo "base_ref=$BASE_SHA" >> "$GITHUB_OUTPUT" + if git diff --quiet "$BASE_SHA" HEAD -- "$MARKETPLACE" .github/policy/; then echo "relevant=false" >> "$GITHUB_OUTPUT" echo "::notice::No changes to marketplace.json or policy/ — skipping policy scan." else @@ -61,13 +89,292 @@ jobs: exit 1 fi - # Blocking: policy failures fail the job. Loosen by removing - # fail-on-findings if the false-positive rate is too high. - - if: steps.changes.outputs.relevant == 'true' + # Verdict cache, keyed on the policy content hash. A prompt change + # invalidates every cached verdict — that is intentional. The save key + # includes run_id so each run writes a fresh cache; restore-keys picks + # the most recent one. 
Verdicts older than CACHE_TTL_DAYS are pruned on + # restore to bound cache size as the marketplace grows. + - name: Restore verdict cache + if: steps.changes.outputs.relevant == 'true' + id: cache-restore + uses: actions/cache/restore@v4 + with: + path: .scan-cache + # run_attempt so a re-run can save its own verdicts (cache keys are + # immutable; without it a re-run would silently fail to save). + key: scan-verdicts-${{ hashFiles('.github/policy/**') }}-${{ github.run_id }}-${{ github.run_attempt }} + restore-keys: | + scan-verdicts-${{ hashFiles('.github/policy/**') }}- + + # Split the diff into cached (skip) and uncached (scan) entries. The + # cache key is "@" — a SHA is immutable, so a verdict for a + # given (plugin, sha) is permanent under a fixed policy. + - name: Filter scan targets against cache + if: steps.changes.outputs.relevant == 'true' + id: filter + env: + BASE_REF: ${{ steps.changes.outputs.base_ref }} + SCAN_ALL: ${{ inputs.scan_all || 'false' }} + TTL_DAYS: ${{ env.CACHE_TTL_DAYS }} + run: | + set -euo pipefail + mkdir -p "$CACHE_DIR" + + # Initialize / prune the verdict map. + if [[ -f "$CACHE_DIR/verdicts.json" ]] && jq -e 'type == "object"' "$CACHE_DIR/verdicts.json" >/dev/null 2>&1; then + # Drop entries older than TTL. Verdicts are immutable per (plugin, sha) + # but pruning keeps the cache from accumulating forever. + cutoff="$(date -u -d "-${TTL_DAYS} days" +%Y-%m-%dT%H:%M:%SZ)" + jq --arg cutoff "$cutoff" \ + 'with_entries(select(.value.scanned_at >= $cutoff))' \ + "$CACHE_DIR/verdicts.json" > "$CACHE_DIR/verdicts.json.tmp" + mv "$CACHE_DIR/verdicts.json.tmp" "$CACHE_DIR/verdicts.json" + else + echo '{}' > "$CACHE_DIR/verdicts.json" + fi + + # Build the change set: entries in HEAD whose object differs from base. + # scan_all overrides to "every external entry" (full re-review). 
+ if [[ "$SCAN_ALL" == "true" ]]; then + jq -c '[.plugins[] | select(.source | type == "object")]' "$MARKETPLACE" \ + > "$CACHE_DIR/changed.json" + else + if git cat-file -e "${BASE_REF}:${MARKETPLACE}" 2>/dev/null; then + git show "${BASE_REF}:${MARKETPLACE}" > "$CACHE_DIR/base.json" + else + echo '{"plugins":[]}' > "$CACHE_DIR/base.json" + fi + jq -c -s \ + '(.[0].plugins | map({(.name): .}) | add // {}) as $b + | [.[1].plugins[] + | select(.source | type == "object") + | select(($b[.name] // null) != .)]' \ + "$CACHE_DIR/base.json" "$MARKETPLACE" > "$CACHE_DIR/changed.json" + fi + + changed_count="$(jq 'length' "$CACHE_DIR/changed.json")" + + # Split changed entries into cached vs uncached. A hit requires the + # *whole* source object (repo, sha, path, ref) to match the cached + # entry, not just name@sha — a repo migration or path change with the + # same SHA is different scan content and must miss the cache. + jq -c -s \ + '.[0] as $cache + | (.[1] | map(. + {key: (.name + "@" + (.source.sha // "")) })) as $entries + | { + to_scan: [$entries[] | select(($cache[.key].source // null) != .source)], + cached: [$entries[] | select(($cache[.key].source // null) == .source) + | . + {verdict: $cache[.key]}] + }' \ + "$CACHE_DIR/verdicts.json" "$CACHE_DIR/changed.json" > "$CACHE_DIR/split.json" + + jq -c '.to_scan' "$CACHE_DIR/split.json" > "$CACHE_DIR/to-scan.json" + jq -c '.cached' "$CACHE_DIR/split.json" > "$CACHE_DIR/cached.json" + + to_scan_count="$(jq 'length' "$CACHE_DIR/to-scan.json")" + cached_count="$(jq 'length' "$CACHE_DIR/cached.json")" + cached_fail_count="$(jq '[.[] | select(.verdict.passes == false)] | length' "$CACHE_DIR/cached.json")" + + # Build a filtered marketplace containing only the uncached entries. + # Passing this as the action's marketplace-path means the action's own + # base diff (which can't resolve a path outside git) falls back to an + # empty base and scans everything in the file — which is exactly the + # to-scan set. 
Annotations point to the temp file rather than the real + # marketplace, but the per-entry verdicts still land in the artifact + # and the step summary. + jq -c '{plugins: .}' "$CACHE_DIR/to-scan.json" > "$CACHE_DIR/scan-targets.json" + + { + echo "changed=$changed_count" + echo "to_scan=$to_scan_count" + echo "cached=$cached_count" + echo "cached_failures=$cached_fail_count" + } >> "$GITHUB_OUTPUT" + + echo "::notice::$changed_count changed entrie(s): $cached_count cached ($cached_fail_count failing), $to_scan_count to scan." + + - name: Scan uncached entries + if: steps.changes.outputs.relevant == 'true' && steps.filter.outputs.to_scan != '0' + id: scan + # Capture the action's per-entry outputs even when it exits nonzero. + # The verdict (cached + fresh) is what gates the job, not the action's + # exit code, and the revert workflow needs the artifact even on failure. + continue-on-error: true uses: anthropics/claude-plugins-community/.github/actions/scan-plugins@b277757588871fe55b2620de8c6dfda470e2e9d8 with: anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }} + marketplace-path: .scan-cache/scan-targets.json policy-prompt: .github/policy/prompt.md fail-on-findings: "true" - scan-all-external: ${{ inputs.scan_all || 'false' }} claude-cli-version: latest + + # Merge fresh verdicts into the cache and assemble this run's full + # verdict set (cached + fresh) for downstream consumers. Runs even when + # the scan step failed so that fail verdicts are also cached — that is + # what lets the revert workflow drop them and what stops the same + # failing SHA from being re-scanned every night. + - name: Merge verdicts and assemble run report + if: steps.changes.outputs.relevant == 'true' + id: report + # The action's `scanned` output travels here via an env var, which is + # subject to the OS argv/envp size limit (~128 KiB on Linux). 
At ~300 + # bytes/entry that is ~400 entries — an order of magnitude above the + # cold-start case, and steady state with the cache is ~10/night. If + # the limit is ever hit the runner fails the step before the script + # runs ("argument list too long") — the right response is to clear + # the cache key and lower max-bumps temporarily. Documented here so + # nobody has to rediscover it. + env: + SCANNED_JSON: ${{ steps.scan.outputs.scanned || '[]' }} + run: | + set -euo pipefail + mkdir -p "$CACHE_DIR" + [[ -f "$CACHE_DIR/cached.json" ]] || echo '[]' > "$CACHE_DIR/cached.json" + [[ -f "$CACHE_DIR/changed.json" ]] || echo '[]' > "$CACHE_DIR/changed.json" + + # Defensive: a partial or unparseable action output must not poison + # the cache. Treat it as "scanned nothing". + printf '%s' "$SCANNED_JSON" > "$CACHE_DIR/scanned-raw.json" + if ! jq -e 'type == "array"' "$CACHE_DIR/scanned-raw.json" >/dev/null 2>&1; then + echo "::warning::scan action output is not a valid JSON array — treating as empty." + echo '[]' > "$CACHE_DIR/scanned-raw.json" + fi + + # Defense in depth: the scan action runs Claude with Read access over + # a cloned external repo and ANTHROPIC_API_KEY in its process env. A + # successful prompt injection could coerce the model to put key + # material into `summary`/`violations`. The action's own step summary + # already carries that risk; this workflow adds an artifact and a PR + # comment, both public sinks. Scrub any key-shaped token here so it + # never reaches the cache, artifact, or comment. + jq -c '(.. | strings) |= gsub("sk-ant-[A-Za-z0-9_-]{8,}"; "[REDACTED]")' \ + "$CACHE_DIR/scanned-raw.json" > "$CACHE_DIR/scanned-raw.json.tmp" + mv "$CACHE_DIR/scanned-raw.json.tmp" "$CACHE_DIR/scanned-raw.json" + + now="$(date -u +%Y-%m-%dT%H:%M:%SZ)" + + # The action's `scanned` output has no SHA or source — join it with + # the change set by name to recover both for the cache key + the + # source-equality lookup guard. 
+ jq -c -s --arg now "$now" \ + '.[0] as $changed + | (.[1] // []) as $scanned + | ($changed | map({(.name): .source}) | add // {}) as $srcs + | [$scanned[] + | . + {source: ($srcs[.name] // null), sha: ($srcs[.name].sha // ""), scanned_at: $now}]' \ + "$CACHE_DIR/changed.json" "$CACHE_DIR/scanned-raw.json" \ + > "$CACHE_DIR/fresh.json" + + # Merge fresh verdicts into the cache, keyed by name@sha. The + # full source object is stored so a future repo/path change with the + # same SHA fails the lookup guard. summary/violations are model + # output — truncate to bound cache size (the artifact carries the + # full text for the run that produced it). + jq -c -s \ + '.[0] + ([.[1][] | select(.sha != "") | {(.name + "@" + .sha): { + source: .source, + passes: .passes, + summary: ((.summary // "") | .[0:300]), + violations: ((.violations // "") | .[0:500]), + scanned_at: .scanned_at + }}] | add // {})' \ + "$CACHE_DIR/verdicts.json" "$CACHE_DIR/fresh.json" \ + > "$CACHE_DIR/verdicts.json.tmp" + mv "$CACHE_DIR/verdicts.json.tmp" "$CACHE_DIR/verdicts.json" + + # The full per-entry verdict for THIS run's diff: cached verdicts + # plus freshly-scanned verdicts. The revert workflow consumes the + # `failed` list to know exactly which SHAs to drop. 
+ jq -c -s \ + '(.[0] | map({name, sha: .source.sha, passes: .verdict.passes, + summary: (.verdict.summary // ""), + violations: (.verdict.violations // ""), + source: "cache"})) + + (.[1] | map({name, sha, passes, + summary: (.summary // ""), + violations: (.violations // ""), + source: "scan"}))' \ + "$CACHE_DIR/cached.json" "$CACHE_DIR/fresh.json" \ + > "$CACHE_DIR/run-verdicts.json" + + jq -c '[.[] | select(.passes == false) | .name]' "$CACHE_DIR/run-verdicts.json" \ + > "$CACHE_DIR/run-failed.json" + + fail_count="$(jq 'length' "$CACHE_DIR/run-failed.json")" + total="$(jq 'length' "$CACHE_DIR/run-verdicts.json")" + + { + echo "failed_count=$fail_count" + echo "total=$total" + } >> "$GITHUB_OUTPUT" + + # `summary` and `violations` are model-generated text shaped by a + # cloned external repo. Strip markdown control characters AND wrap + # in code spans before they hit a publicly-rendered sink — code + # spans neutralize auto-linked bare URLs that a prompt-injected + # upstream could smuggle in. Stripping backticks first stops a + # breakout from the code span. 
+ { + echo "## Policy scan (with verdict cache)" + echo + echo "Changed entries: ${total} · cached: $(jq 'length' "$CACHE_DIR/cached.json") · scanned fresh: $(jq 'length' "$CACHE_DIR/fresh.json") · failures: ${fail_count}" + echo + if [[ "$total" -gt 0 ]]; then + echo "| Plugin | SHA | Passes | Source | Summary |" + echo "|---|---|---|---|---|" + jq -r 'def neutralize: gsub("[|\n\r\\[\\]<>`]"; " "); + .[] | "| \(.name) | `\(.sha[0:8])` | \(if .passes then "✅" else "❌" end) | \(.source) | `\(.summary | neutralize | .[0:120])` |"' \ + "$CACHE_DIR/run-verdicts.json" + fi + if [[ "$fail_count" -gt 0 ]]; then + echo + echo "### Violations" + jq -r 'def neutralize: gsub("[|\n\r\\[\\]<>`]"; " "); + .[] | select(.passes == false) | "- **\(.name)** — `\(.violations | neutralize | .[0:500])`"' "$CACHE_DIR/run-verdicts.json" + fi + } >> "$GITHUB_STEP_SUMMARY" + + # Used by revert-failed-bumps.yml to know which entries to drop. Always + # uploaded when relevant so the revert workflow can distinguish "scan + # found policy failures" from "scan never ran" (infra error → no revert). + - name: Upload scan verdicts artifact + if: steps.changes.outputs.relevant == 'true' + uses: actions/upload-artifact@v4 + with: + name: scan-verdicts + path: | + .scan-cache/run-verdicts.json + .scan-cache/run-failed.json + retention-days: 7 + + # Save even when the scan failed — fail verdicts are what stop us from + # re-burning Claude time on a known-bad SHA every night. + - name: Save verdict cache + if: always() && steps.changes.outputs.relevant == 'true' + uses: actions/cache/save@v4 + with: + path: .scan-cache + key: scan-verdicts-${{ hashFiles('.github/policy/**') }}-${{ github.run_id }}-${{ github.run_attempt }} + + # Required-check gate. Fails on either fresh or cached policy failures — + # a known-bad SHA must keep failing until it is reverted or upstream + # fixes it (a new SHA is a new cache key and gets a fresh scan). 
+ - name: Gate on policy verdict + if: steps.changes.outputs.relevant == 'true' + env: + FAILED: ${{ steps.report.outputs.failed_count || '0' }} + SCAN_OUTCOME: ${{ steps.scan.outcome }} + run: | + set -euo pipefail + if [[ "$FAILED" != "0" ]]; then + echo "::error::$FAILED entrie(s) fail policy. See the run summary for verdicts." + exit 1 + fi + # The action can also fail without a policy verdict (clone error, + # API error, schema mismatch). With zero parsed failures and a + # nonzero exit, that is an infra error — fail loudly so the revert + # workflow does NOT misread it as "everything passed". + if [[ "$SCAN_OUTCOME" == "failure" ]]; then + echo "::error::Scan step failed without a parseable policy verdict (likely an infra error)." + exit 1 + fi