Compare commits

...

54 Commits

Author SHA1 Message Date
Bryan Thompson
a1d0fa6d93 Update coderabbit plugin source URL 2026-04-28 07:22:26 -05:00
Tobin South
18113ade5c Add adobe-for-creativity plugin (#1625)
Adobe's Creative Cloud skills for image editing, design automation,
background removal, vectorization, and retouching. Points at the
plugins/creative-cloud/adobe-for-creativity subdir of adobe/skills.
2026-04-28 06:43:59 -05:00
Octavian Guzu
99832739a1 Merge pull request #1621 from anthropics/fix/validate-frontmatter-shell-injection
Harden validate-frontmatter workflow
2026-04-28 11:41:07 +01:00
Bryan Thompson
c5837a2c23 Add aws-dev-toolkit plugin (#1617) 2026-04-28 06:40:28 +01:00
Tobin South
f4b5494fb4 mcp-server-dev: hosting, payload-cap, lifecycle, and directory guidance (#1566) 2026-04-28 04:46:23 +01:00
Dickson Tsai
068a59e000 Fix shell injection in validate-frontmatter workflow
The 'Validate frontmatter' step interpolated step output directly into a
double-quoted shell string, allowing a fork PR that adds a file named
e.g. agents/$(curl ...).md to execute arbitrary commands on the runner.

- Pass the file list via env: and reference as "$FILES" so the shell
  never re-evaluates the contents
- Pass PR number via env: for consistency (no ${{ }} inside run:)
- Gate the job on same-repo PRs only, since fork PRs are auto-closed by
  close-external-prs.yml anyway

Impact was bounded (fork PRs get a read-only token with no secrets), but
this closes the RCE-on-runner vector entirely.
2026-04-27 17:38:18 -07:00
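The two patterns the message contrasts can be sketched outside of Actions (hypothetical filename and `xargs validate` pipeline; GitHub substitutes `${{ }}` into the script text before the shell ever runs it):

```python
import os
import subprocess

# A filename an attacker controls, as it would arrive from a fork PR.
malicious = 'agents/$(touch /tmp/pwned).md'

# Vulnerable shape (shown, not run): the untrusted name is pasted into
# shell TEXT, so the shell evaluates the $( ) inside the double quotes:
#   subprocess.run(f'echo "{malicious}" | xargs validate', shell=True)

# Hardened shape: the value travels as data in an environment variable;
# the script only ever contains the literal text "$FILES", which the
# shell expands once without re-evaluating its contents.
r = subprocess.run(
    ["bash", "-c", 'printf "%s\\n" "$FILES"'],
    env={**os.environ, "FILES": malicious},
    capture_output=True, text=True,
)
print(r.stdout, end="")  # the name is printed literally; /tmp/pwned is never created
```

Parameter expansion is a single pass, which is exactly why the env-var route closes the injection.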
Bryan Thompson
1c81b81299 Add logfire plugin (#1613) 2026-04-27 12:37:20 -07:00
Bryan Thompson
7d42fe2132 Add 42crunch-api-security-testing plugin (#1580) 2026-04-27 12:37:15 -07:00
Bryan Thompson
71545a2994 Add datarobot-agent-skills plugin (#1579) 2026-04-27 12:37:11 -07:00
Bryan Thompson
458b2799c5 Add aiven plugin (#1578) 2026-04-27 12:37:07 -07:00
Bryan Thompson
26973b887b Add fullstory plugin (#1577) 2026-04-27 12:37:03 -07:00
Bryan Thompson
6fc0a4b36a Add jfrog plugin (#1576) 2026-04-27 12:36:58 -07:00
Bryan Thompson
27cab8ee35 Add rails-query plugin (#1575) 2026-04-27 12:36:54 -07:00
Bryan Thompson
020446a429 Add quarkus-agent plugin (#1534) 2026-04-23 22:45:48 +01:00
Bryan Thompson
740e9d5513 Add vanta-mcp-plugin (#1563) 2026-04-23 22:29:25 +01:00
Noah Zweben
5a71459c03 telegram: gate /start, /help, /status behind dmPolicy (#894)
The bot command handlers bypassed access control — they responded to
any DM user regardless of dmPolicy, leaking bot presence and
contradicting ACCESS.md's "Drop silently. No reply." contract for
allowlist mode.

Add dmCommandGate() that applies the same disabled/allowlist checks
as gate() without the pairing side effects, and route all three
handlers through it. Also prune expired pending codes before /status
iterates them.

Fixes #854

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-23 12:02:34 -07:00
Noah Zweben
ae54b113c4 Add Apache 2.0 LICENSE to math-olympiad plugin (#868)
Co-authored-by: Claude <noreply@anthropic.com>
2026-04-23 12:02:30 -07:00
jschwar2552
2a40fd2e7c skill-creator: sync from anthropics/skills (drop ANTHROPIC_API_KEY requirement) (#1523)
Ports anthropics/skills#547 (b0cbd3d) so this repo matches the upstream
skills repo.

improve_description.py and run_loop.py now shell out to `claude -p` instead
of using the Anthropic SDK directly, so the description optimizer uses the
session's existing Claude Code auth and no longer requires a separate
ANTHROPIC_API_KEY. SKILL.md drops the stale extended-thinking reference and
adds guidance for updating an existing skill.

Several enterprise customers sync exclusively from this repo (not
anthropics/skills, whose README disclaims production use), so they have been
stuck on the old SDK-based path.
2026-04-23 12:02:26 -07:00
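The shell-out pattern the message describes can be sketched as follows; `ask_claude` is a hypothetical helper, not code from improve_description.py, and only the `claude -p` invocation is taken from the message:

```python
import subprocess

def ask_claude(prompt: str, runner=subprocess.run) -> str:
    """Send one prompt through the Claude Code CLI (`claude -p`) so the call
    rides on the session's existing auth instead of requiring a separate
    ANTHROPIC_API_KEY. `runner` is injectable so the wrapper is testable
    without the CLI installed."""
    result = runner(
        ["claude", "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The trade-off versus the SDK path is fewer knobs (no per-call model options here) in exchange for zero credential management.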
Bryan Thompson
95f6172405 Add zscaler plugin (#1552) 2026-04-23 12:01:19 -07:00
Bryan Thompson
7bbdb8434e Add data-agent-kit-starter-pack plugin (#1551) 2026-04-23 12:01:12 -07:00
Bryan Thompson
4bbf944de1 Add atlassian-forge-skills plugin (#1539) 2026-04-23 12:01:06 -07:00
Bryan Thompson
06830b2ccd Add apollo plugin (#1538) 2026-04-23 12:01:00 -07:00
Bryan Thompson
bd6f1d7f48 Add windsor-ai plugin (#1536) 2026-04-23 12:00:53 -07:00
Bryan Thompson
808e70ffb9 Add auth0 plugin (#1535) 2026-04-23 12:00:47 -07:00
Bryan Thompson
187a267738 Add cloud-sql-postgresql plugin (#1533)
* Add cloud-sql-postgresql plugin

* Remove SHA pin from cloud-sql-postgresql entry
2026-04-23 12:00:37 -07:00
Bryan Thompson
42e980340d Add alloydb plugin (#1532)
* Add alloydb plugin

* Remove SHA pin from alloydb entry
2026-04-23 12:00:31 -07:00
Bryan Thompson
c15eada2e9 Add qt-development-skills plugin (#1519) 2026-04-23 12:00:25 -07:00
Bryan Thompson
f9f07aa2d3 Add versori-skills plugin (#1501) 2026-04-23 12:00:18 -07:00
Bryan Thompson
81952cabc5 Merge pull request #1499 from anthropics/add-exa
Add exa plugin
2026-04-23 13:59:15 -05:00
Bryan Thompson
0852f6647a Merge pull request #1437 from anthropics/rename-azure-skills-to-azure
Rename azure-skills to azure
2026-04-23 13:58:58 -05:00
Bryan Thompson
b0724d7a16 Rename azure-skills to azure per developer request 2026-04-22 06:42:35 -05:00
Bryan Thompson
cf62a6c02d Merge pull request #1439 from anthropics/add-datadog
Add datadog plugin
2026-04-21 16:04:18 -05:00
Bryan Thompson
3bd94cc810 Bump SHA pins for 39 plugins (>7d stale) (#1502)
Rebased on latest main to resolve conflict with cockroachdb unpin (#1514)
and liquid-lsp addition (#1520). Excludes netsuite-suitecloud (4d).
2026-04-21 20:56:31 +01:00
Bryan Thompson
a8be018317 Merge pull request #1514 from anthropics/update-cockroachdb
Update cockroachdb plugin — add author + category, bump SHA
2026-04-21 14:38:05 -05:00
Bryan Thompson
33e62b9bd6 Remove SHA pin from cockroachdb entry
Let installs follow the repo's default branch instead of a fixed SHA.
Removes the plugin from the weekly SHA-bump rotation and lets developer
updates reach users directly on `claude plugin install`.
2026-04-21 14:32:10 -05:00
Bryan Thompson
9f103c621d Add liquid-lsp plugin (#1520) 2026-04-21 19:07:52 +01:00
Bryan Thompson
caa8c1a539 Update cockroachdb description + author per developer request
- Description: expand to reflect current capabilities (14 tools, 2 MCP
  backends, 3 agents, 32 skills, safety hooks)
- Author: "CockroachDB" → "Cockroach Labs" (company name)
2026-04-21 12:33:39 -05:00
Bryan Thompson
33fd73c8b9 Update cockroachdb plugin — add author + category, bump SHA 2026-04-21 07:03:06 -05:00
Bryan Thompson
777db5c30b Add liquid-skills plugin (#1507) 2026-04-20 22:01:55 +01:00
Karandeep Johar
aeecad8f43 fix(amplitude): use git-subdir source to point at plugins/amplitude (#1505)
The amplitude entry used source type "url" which clones the root of
https://github.com/amplitude/mcp-marketplace — a multi-plugin repo
where the actual plugin lives at plugins/amplitude/. Claude Code found
no skills there, so /reload-plugins loaded 0 skills for amplitude.

Switching to "git-subdir" with path "plugins/amplitude" (the same
pattern used by awslabs, bigdata-com, zapier, etc.) makes Claude Code
resolve the correct subdirectory and load all 27 amplitude skills.

Removing the pinned sha so the plugin tracks main, consistent with
how posthog and other unpinned entries behave.
2026-04-20 20:34:29 +01:00
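The before/after entry shapes read roughly like this — a sketch, with field names inferred from the message and from the `source`/`url`/`path`/`sha` fields the bump script reads, not copied from marketplace.json:

```jsonc
// Before: a "url" source clones the repo root, where no skills live.
{ "name": "amplitude",
  "source": { "source": "url",
              "url": "https://github.com/amplitude/mcp-marketplace" } }

// After: "git-subdir" plus "path" resolves plugins/amplitude; omitting
// "sha" lets installs track the default branch, like posthog.
{ "name": "amplitude",
  "source": { "source": "git-subdir",
              "url": "https://github.com/amplitude/mcp-marketplace",
              "path": "plugins/amplitude" } }
```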
Bryan Thompson
f1938a2dc2 Add exa plugin 2026-04-20 08:03:24 -05:00
Bryan Thompson
bb7730114d Consolidate Oracle NetSuite skills into a single plugin (#1464) 2026-04-17 22:03:25 +01:00
Bryan Thompson
3df5394ee9 Merge pull request #1463 from anthropics/add-netsuite-plugins
Add Oracle NetSuite agent skills (3 plugins)
2026-04-17 15:39:51 -05:00
Bryan Thompson
12401af104 Add Oracle NetSuite agent skills (3 plugins)
Adds three NetSuite agent skills to the official marketplace:

- netsuite-aiconnector-service-skill: runtime guidance for the NetSuite
  AI Service Connector (tool selection, output formatting, SuiteQL
  safety checklist)
- netsuite-sdf-roles-and-permissions: SDF permission ID lookup and
  least-privilege role authoring (ADMI_, LIST_, REGT_, REPO_, TRAN_)
- netsuite-uif-spa-reference: API/type reference for @uif-js/core and
  @uif-js/component

All three ship from oracle/netsuite-suitecloud-sdk (packages/agent-skills/)
using git-subdir + strict:false + skills[] — the same shape stagehand uses
for skill-only distributions.
2026-04-17 15:10:22 -05:00
Bryan Thompson
167f01f2e0 Add auto-SHA-bump workflow for marketplace plugins (#1392)
* Add auto-SHA-bump workflow for marketplace plugins

Weekly CI action that discovers stale SHA pins in marketplace.json
and opens a batched PR with updated SHAs. Adapted from the
claude-plugins-community-internal bump-plugin-shas workflow for
the single-file marketplace.json format.

- discover_bumps.py: checks 56 SHA-pinned plugins against upstream
  repos, oldest-stale-first rotation, capped at 20 bumps/run
- bump-plugin-shas.yml: weekly Monday schedule + manual dispatch
  with dry_run and per-plugin targeting options

Entries without SHA pins (intentionally tracking HEAD) are never
touched. Existing validate-marketplace CI runs on the resulting PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix input interpolation and add BASE_BRANCH overlay

- Pass workflow_dispatch inputs through env vars instead of direct
  ${{ inputs.* }} interpolation in run blocks (avoids shell injection)
- Add marketplace.json overlay from main so the workflow can be tested
  via dispatch from a feature branch against main's real plugin data

Both patterns match claude-plugins-community-internal's implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Use GitHub App token for PR creation

The anthropics org disables "Allow GitHub Actions to create and approve
pull requests", so GITHUB_TOKEN cannot call gh pr create. Split the
workflow: GITHUB_TOKEN pushes the branch, then the same GitHub App
used by -internal's bump workflow (app-id 2812036) creates the PR.

Prerequisite: app must be installed on this repo and the PEM secret
(CLAUDE_DIRECTORY_BOT_PRIVATE_KEY) must exist in repo settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Use --force-with-lease for bump branch push

Prevents push failure if the branch exists from a previous same-day
run whose PR was merged but whose branch wasn't auto-deleted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 09:12:36 -07:00
Noah Zweben
637c6b3b6a Add Apache 2.0 license to session-report plugin (#1346)
Co-authored-by: Claude <noreply@anthropic.com>
2026-04-17 09:11:39 -07:00
Bryan Thompson
811c9b5394 Merge pull request #1453 from anthropics/add-miro
Add miro plugin
2026-04-17 11:08:43 -05:00
Tobin South
b00abee24e Align mcp-server-dev skills with claude.com/docs connector guidance (#1418)
- build-mcp-server: load llms-full.txt for Claude-specific context;
  add Phase 6 (test in Claude, review checklist, submit, ship plugin)
- references/auth.md: add Claude auth-type table, callback URL,
  not-supported list
- references/tool-design.md: add Anthropic Directory hard requirements
  (annotations, name length, read/write split, prompt-injection rule)
- build-mcp-app: add Claude host specifics (prefersBorder,
  safeAreaInsets, CSP) and submission asset specs; testing via
  custom connector
- build-mcpb: note remote servers are the recommended directory path
2026-04-17 17:02:48 +01:00
Bryan Thompson
5c5c5f9896 Remove SHA pin from miro entry 2026-04-17 08:57:39 -05:00
Bryan Thompson
8518bfc43d Add miro plugin 2026-04-17 08:53:58 -05:00
Bryan Thompson
db52e65c44 Add datadog plugin 2026-04-17 07:28:58 -05:00
Tobin South
b992a65037 Refresh AWS plugins: add amplify/databases/sagemaker, remove migration (#1226)
Adds three plugins from awslabs/agent-plugins:
- aws-amplify (development)
- databases-on-aws (database)
- sagemaker-ai (development)

Removes migration-to-aws (deprecated by the AWS team).
2026-04-17 06:29:23 -05:00
Dickson Tsai
de39da5ba2 Point supabase plugin to supabase-community/supabase-plugin (#1442)
Remove the in-repo supabase stub; source from the external repo.

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-16 21:01:47 +01:00
Bryan Thompson
cb8c857a5e Normalize git-subdir source URLs to full HTTPS format (#1422)
Standardize 12 git-subdir plugin entries from owner/repo shorthand to
full https://github.com/owner/repo.git URLs for consistency with the
existing HTTPS entries.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:01:08 +01:00
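The rewrite rule can be sketched as a hypothetical helper (not the actual migration script); the shorthand test mirrors the `owner/repo` character class used elsewhere in this repo's tooling:

```python
import re

def normalize_source_url(url: str) -> str:
    """Expand owner/repo shorthand to the full HTTPS clone URL; leave
    anything that is already a full URL untouched."""
    if re.fullmatch(r"[\w.-]+/[\w.-]+", url):
        return f"https://github.com/{url}.git"
    return url
```

Full URLs contain `://`, so they never fullmatch the shorthand pattern and pass through unchanged.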
20 changed files with 1267 additions and 186 deletions

File diff suppressed because it is too large

.github/scripts/discover_bumps.py (new file, 229 lines)

@@ -0,0 +1,229 @@
#!/usr/bin/env python3
"""Discover plugins in marketplace.json whose upstream repo has moved past
their pinned SHA, update the file in place, and emit a summary.

Adapted from claude-plugins-community-internal's discover_bumps.py for the
single-file marketplace.json format used by claude-plugins-official.

Usage: discover_bumps.py [--plugin NAME] [--max N] [--dry-run]
"""
import argparse
import json
import os
import re
import subprocess
import sys
from datetime import datetime, timezone
from typing import Any

MARKETPLACE_PATH = ".claude-plugin/marketplace.json"


def gh_api(path: str) -> Any:
    """GET from the GitHub API. None on not-found; raises on other errors.

    "Not found" covers both 404 (resource gone) and 422 "No commit found
    for SHA" (force-pushed away). Both mean the thing we asked for isn't
    there — treating them the same lets callers handle dead refs uniformly.
    """
    r = subprocess.run(
        ["gh", "api", path], capture_output=True, text=True
    )
    if r.returncode != 0:
        combined = r.stdout + r.stderr
        if any(s in combined for s in ("404", "Not Found", "No commit found")):
            return None
        raise RuntimeError(f"gh api {path}: {r.stderr.strip() or r.stdout.strip()}")
    return json.loads(r.stdout)


def parse_github_repo(url: str) -> tuple[str, str] | None:
    """Extract (owner, repo) from a URL or owner/repo shorthand."""
    # Full URL: https://github.com/owner/repo(.git)(/...)
    m = re.match(r"https?://github\.com/([^/]+)/([^/]+?)(?:\.git)?(?:/|$)", url)
    if m:
        return m.group(1), m.group(2)
    # Shorthand: owner/repo
    m = re.match(r"^([\w.-]+)/([\w.-]+)$", url)
    if m:
        return m.group(1), m.group(2)
    return None


def latest_sha(owner: str, repo: str, *, ref: str | None, path: str | None) -> str | None:
    """Latest commit SHA for the repo, optionally scoped to a ref and/or path."""
    if path:
        # Scoped to a subdirectory — use the commits list endpoint with path filter.
        q = f"repos/{owner}/{repo}/commits?per_page=1&path={path}"
        if ref:
            q += f"&sha={ref}"
        commits = gh_api(q)
        if not commits:
            return None
        return commits[0]["sha"]
    # Whole repo — the single-ref endpoint is cheaper.
    if not ref:
        meta = gh_api(f"repos/{owner}/{repo}")
        if not meta:
            return None
        ref = meta["default_branch"]
    c = gh_api(f"repos/{owner}/{repo}/commits/{ref}")
    return c["sha"] if c else None


def pinned_age_days(owner: str, repo: str, sha: str) -> int | None:
    """Days since the pinned commit was authored. Used for oldest-first rotation."""
    c = gh_api(f"repos/{owner}/{repo}/commits/{sha}")
    if not c:
        return None
    dt = datetime.fromisoformat(
        c["commit"]["committer"]["date"].replace("Z", "+00:00")
    )
    return (datetime.now(timezone.utc) - dt).days


def main() -> int:
    ap = argparse.ArgumentParser()
    ap.add_argument("--plugin", help="only check this plugin")
    ap.add_argument("--max", type=int, default=20, help="cap bumps emitted")
    ap.add_argument("--dry-run", action="store_true", help="don't write marketplace.json")
    args = ap.parse_args()

    with open(MARKETPLACE_PATH) as f:
        marketplace = json.load(f)
    plugins = marketplace.get("plugins", [])

    bumps: list[dict] = []
    dead: list[str] = []
    skipped_non_github = 0
    checked = 0

    for plugin in plugins:
        name = plugin.get("name", "?")
        src = plugin.get("source")
        # Only process object sources with a sha field
        if not isinstance(src, dict) or "sha" not in src:
            continue
        # Filter to specific plugin if requested
        if args.plugin and name != args.plugin:
            continue
        checked += 1

        kind = src.get("source")
        url = src.get("url", "")
        path = src.get("path")
        ref = src.get("ref")
        pinned = src.get("sha")

        slug = parse_github_repo(url)
        if not slug:
            skipped_non_github += 1
            continue
        owner, repo = slug

        try:
            latest = latest_sha(owner, repo, ref=ref, path=path)
        except RuntimeError as e:
            print(f"::warning::{name}: {e}", file=sys.stderr)
            continue
        if latest is None:
            dead.append(f"{name} ({owner}/{repo})")
            continue
        if latest == pinned:
            continue  # up to date

        # Age lookup for rotation — oldest-pinned first prevents starvation.
        try:
            age = pinned_age_days(owner, repo, pinned) if pinned else None
        except RuntimeError as e:
            print(f"::warning::{name}: age lookup failed: {e}", file=sys.stderr)
            age = None

        bumps.append({
            "name": name,
            "kind": kind,
            "url": url,
            "path": path or "",
            "ref": ref or "",
            "old_sha": pinned or "",
            "new_sha": latest,
            "age_days": age if age is not None else 10**6,
        })

    # Oldest-pinned first so nothing starves under the cap.
    bumps.sort(key=lambda b: -b["age_days"])
    emitted = bumps[: args.max]

    # Apply bumps to marketplace data
    if emitted and not args.dry_run:
        bump_map = {b["name"]: b["new_sha"] for b in emitted}
        for plugin in plugins:
            name = plugin.get("name")
            src = plugin.get("source")
            if isinstance(src, dict) and name in bump_map:
                src["sha"] = bump_map[name]
        with open(MARKETPLACE_PATH, "w") as f:
            json.dump(marketplace, f, indent=2, ensure_ascii=False)
            f.write("\n")

    # Write GitHub outputs
    out = os.environ.get("GITHUB_OUTPUT")
    if out:
        bumped_names = ",".join(b["name"] for b in emitted)
        with open(out, "a") as fh:
            fh.write(f"count={len(emitted)}\n")
            fh.write(f"bumped_names={bumped_names}\n")

    # Write GitHub step summary
    summary = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary:
        with open(summary, "a") as fh:
            fh.write("## SHA Bump Discovery\n\n")
            fh.write(f"- Checked: {checked} SHA-pinned entries\n")
            fh.write(f"- Stale: {len(bumps)} (applying {len(emitted)}, cap {args.max})\n")
            if skipped_non_github:
                fh.write(f"- Skipped non-GitHub: {skipped_non_github}\n")
            if dead:
                fh.write(f"- **Dead upstream** ({len(dead)}): {', '.join(dead)}\n")
            if emitted:
                fh.write("\n| Plugin | Old | New | Age |\n|---|---|---|---|\n")
                for b in emitted:
                    old = b["old_sha"][:8] if b["old_sha"] else "(unpinned)"
                    fh.write(f"| {b['name']} | `{old}` | `{b['new_sha'][:8]}` | {b['age_days']}d |\n")

    # Write PR body for the workflow to use
    pr_body_path = os.environ.get("PR_BODY_PATH", "/tmp/bump-pr-body.md")
    if emitted:
        with open(pr_body_path, "w") as fh:
            fh.write("Upstream repos moved. Bumping pinned SHAs so plugins track latest.\n\n")
            fh.write("| Plugin | Old | New | Upstream |\n")
            fh.write("|--------|-----|-----|----------|\n")
            for b in emitted:
                old = b["old_sha"][:8] if b["old_sha"] else "(unpinned)"
                slug_str = re.sub(r"https?://github\.com/", "", b["url"])
                slug_str = re.sub(r"\.git$", "", slug_str)
                compare = f"https://github.com/{slug_str}/compare/{b['old_sha'][:12]}...{b['new_sha'][:12]}"
                fh.write(f"| `{b['name']}` | `{old}` | `{b['new_sha'][:8]}` | [diff]({compare}) |\n")
            fh.write(f"\n---\n_Auto-generated by `bump-plugin-shas.yml` on {datetime.now(timezone.utc).strftime('%Y-%m-%d')}_\n")

    # Console summary
    print(f"Checked {checked} SHA-pinned plugins", file=sys.stderr)
    print(f"Stale: {len(bumps)}, applying: {len(emitted)}", file=sys.stderr)
    if dead:
        print(f"Dead upstream: {', '.join(dead)}", file=sys.stderr)
    for b in emitted:
        old = b["old_sha"][:8] if b["old_sha"] else "unpinned"
        print(f"  {b['name']}: {old} -> {b['new_sha'][:8]} ({b['age_days']}d)", file=sys.stderr)
    return 0


if __name__ == "__main__":
    sys.exit(main())
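The URL parser above accepts both full HTTPS URLs and `owner/repo` shorthand. Extracted standalone (regexes copied verbatim from the script), its behavior can be checked in isolation:

```python
import re

def parse_github_repo(url):
    # Same two regexes as discover_bumps.py's parse_github_repo.
    m = re.match(r"https?://github\.com/([^/]+)/([^/]+?)(?:\.git)?(?:/|$)", url)
    if m:
        return m.group(1), m.group(2)
    m = re.match(r"^([\w.-]+)/([\w.-]+)$", url)
    if m:
        return m.group(1), m.group(2)
    return None

# The lazy group plus optional (?:\.git)? means the ".git" suffix is stripped.
print(parse_github_repo("https://github.com/amplitude/mcp-marketplace.git"))
# → ('amplitude', 'mcp-marketplace')
```

Non-GitHub URLs fall through both patterns and return None, which is what feeds the `skipped_non_github` counter.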

.github/workflows/bump-plugin-shas.yml (new file, 133 lines)

@@ -0,0 +1,133 @@
name: Bump plugin SHAs

# Weekly sweep of marketplace.json — for each entry whose upstream repo has
# moved past its pinned SHA, open a PR against main with updated SHAs. The
# validate-marketplace workflow then runs on the PR to confirm the file is
# still well-formed.
#
# Adapted from claude-plugins-community-internal's bump-plugin-shas.yml
# for the single-file marketplace.json format. Key difference: all bumps
# are batched into one PR (since they all modify the same file).

on:
  schedule:
    - cron: '23 7 * * 1' # Monday 07:23 UTC
  workflow_dispatch:
    inputs:
      plugin:
        description: Only bump this plugin (for testing)
        required: false
      max_bumps:
        description: Cap on plugins bumped this run
        required: false
        default: '20'
      dry_run:
        description: Discover only, don't open PR
        type: boolean
        default: true

concurrency:
  group: bump-plugin-shas
  cancel-in-progress: false

permissions:
  contents: write
  pull-requests: write

jobs:
  bump:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4

      - name: Check for existing bump PR
        id: existing
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          existing=$(gh pr list --label sha-bump --state open --json number --jq 'length')
          echo "count=$existing" >> "$GITHUB_OUTPUT"
          if [ "$existing" -gt 0 ]; then
            echo "::notice::Open sha-bump PR already exists — skipping"
          fi

      - name: Ensure sha-bump label exists
        if: steps.existing.outputs.count == '0'
        env:
          GH_TOKEN: ${{ github.token }}
        run: gh label create sha-bump --color 0e8a16 --description "Automated SHA bump" 2>/dev/null || true

      - name: Overlay marketplace data from main
        if: steps.existing.outputs.count == '0'
        run: |
          git fetch origin main --depth=1 --quiet
          git checkout origin/main -- .claude-plugin/marketplace.json

      - name: Discover and apply SHA bumps
        if: steps.existing.outputs.count == '0'
        id: discover
        env:
          GH_TOKEN: ${{ github.token }}
          PR_BODY_PATH: /tmp/bump-pr-body.md
          PLUGIN: ${{ inputs.plugin }}
          MAX_BUMPS: ${{ inputs.max_bumps }}
          DRY_RUN: ${{ inputs.dry_run }}
        run: |
          args=(--max "${MAX_BUMPS:-20}")
          [[ -n "$PLUGIN" ]] && args+=(--plugin "$PLUGIN")
          [[ "$DRY_RUN" = "true" ]] && args+=(--dry-run)
          python3 .github/scripts/discover_bumps.py "${args[@]}"

      - uses: oven-sh/setup-bun@v2
        if: steps.existing.outputs.count == '0' && steps.discover.outputs.count != '0' && inputs.dry_run != true

      - name: Validate marketplace.json
        if: steps.existing.outputs.count == '0' && steps.discover.outputs.count != '0' && inputs.dry_run != true
        run: |
          bun .github/scripts/validate-marketplace.ts .claude-plugin/marketplace.json
          bun .github/scripts/check-marketplace-sorted.ts

      - name: Push bump branch
        if: steps.existing.outputs.count == '0' && steps.discover.outputs.count != '0' && inputs.dry_run != true
        id: push
        run: |
          branch="auto/bump-shas-$(date +%Y%m%d)"
          echo "branch=$branch" >> "$GITHUB_OUTPUT"
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git checkout -b "$branch"
          git add .claude-plugin/marketplace.json
          git commit -m "Bump SHA pins for ${{ steps.discover.outputs.count }} plugin(s)

          Plugins: ${{ steps.discover.outputs.bumped_names }}"
          git push -u origin "$branch" --force-with-lease

      # GITHUB_TOKEN cannot create PRs (org policy: "Allow GitHub Actions to
      # create and approve pull requests" is disabled). Use the same GitHub App
      # that -internal's bump workflow uses.
      #
      # Prerequisite: app 2812036 must be installed on this repo. The PEM
      # secret must exist in this repo's settings (shared with -internal).
      - name: Generate bot token
        if: steps.push.outcome == 'success'
        id: app-token
        uses: actions/create-github-app-token@v1
        with:
          app-id: 2812036
          private-key: ${{ secrets.CLAUDE_DIRECTORY_BOT_PRIVATE_KEY }}
          owner: ${{ github.repository_owner }}
          repositories: ${{ github.event.repository.name }}

      - name: Create pull request
        if: steps.push.outcome == 'success'
        env:
          GH_TOKEN: ${{ steps.app-token.outputs.token }}
        run: |
          gh pr create \
            --base main \
            --head "${{ steps.push.outputs.branch }}" \
            --title "Bump SHA pins (${{ steps.discover.outputs.count }} plugins)" \
            --body-file /tmp/bump-pr-body.md \
            --label sha-bump


@@ -9,6 +9,10 @@ on:
 jobs:
   validate:
+    # Fork PRs are auto-closed by close-external-prs.yml, so skip validation
+    # for them entirely. This also prevents untrusted filenames from forks
+    # from ever reaching the shell steps below.
+    if: github.event.pull_request.head.repo.full_name == github.repository
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
@@ -20,16 +24,19 @@ jobs:
       - name: Get changed frontmatter files
         id: changed
+        env:
+          GH_TOKEN: ${{ github.token }}
+          PR_NUMBER: ${{ github.event.pull_request.number }}
         run: |
           # Use diff-filter=AMRC to exclude deleted files (D) - only Added, Modified, Renamed, Copied
-          FILES=$(gh pr diff ${{ github.event.pull_request.number }} --name-only --diff-filter=AMRC | grep -E '(agents/.*\.md|skills/.*/SKILL\.md|commands/.*\.md)$' || true)
+          FILES=$(gh pr diff "$PR_NUMBER" --name-only --diff-filter=AMRC | grep -E '(agents/.*\.md|skills/.*/SKILL\.md|commands/.*\.md)$' || true)
           echo "files<<EOF" >> "$GITHUB_OUTPUT"
           echo "$FILES" >> "$GITHUB_OUTPUT"
           echo "EOF" >> "$GITHUB_OUTPUT"
-        env:
-          GH_TOKEN: ${{ github.token }}
       - name: Validate frontmatter
         if: steps.changed.outputs.files != ''
+        env:
+          FILES: ${{ steps.changed.outputs.files }}
         run: |
-          echo "${{ steps.changed.outputs.files }}" | xargs bun .github/scripts/validate-frontmatter.ts
+          printf '%s\n' "$FILES" | xargs bun .github/scripts/validate-frontmatter.ts


@@ -1,7 +0,0 @@
-{
-  "name": "supabase",
-  "description": "Supabase MCP integration for database operations, authentication, storage, and real-time subscriptions. Manage your Supabase projects, run SQL queries, and interact with your backend directly.",
-  "author": {
-    "name": "Supabase"
-  }
-}


@@ -1,6 +0,0 @@
-{
-  "supabase": {
-    "type": "http",
-    "url": "https://mcp.supabase.com/mcp"
-  }
-}


@@ -284,6 +284,19 @@ function gate(ctx: Context): GateResult {
   return { action: 'drop' }
 }
 
+// Like gate() but for bot commands: no pairing side effects, just allow/drop.
+function dmCommandGate(ctx: Context): { access: Access; senderId: string } | null {
+  if (ctx.chat?.type !== 'private') return null
+  if (!ctx.from) return null
+  const senderId = String(ctx.from.id)
+  const access = loadAccess()
+  const pruned = pruneExpired(access)
+  if (pruned) saveAccess(access)
+  if (access.dmPolicy === 'disabled') return null
+  if (access.dmPolicy === 'allowlist' && !access.allowFrom.includes(senderId)) return null
+  return { access, senderId }
+}
+
 function isMentioned(ctx: Context, extraPatterns?: string[]): boolean {
   const entities = ctx.message?.entities ?? ctx.message?.caption_entities ?? []
   const text = ctx.message?.text ?? ctx.message?.caption ?? ''
@@ -669,12 +682,7 @@ setInterval(() => {
 // the gate's behavior for unrecognized groups.
 bot.command('start', async ctx => {
-  if (ctx.chat?.type !== 'private') return
-  const access = loadAccess()
-  if (access.dmPolicy === 'disabled') {
-    await ctx.reply(`This bot isn't accepting new connections.`)
-    return
-  }
+  if (!dmCommandGate(ctx)) return
   await ctx.reply(
     `This bot bridges Telegram to a Claude Code session.\n\n` +
     `To pair:\n` +
@@ -685,7 +693,7 @@ bot.command('start', async ctx => {
 })
 
 bot.command('help', async ctx => {
-  if (ctx.chat?.type !== 'private') return
+  if (!dmCommandGate(ctx)) return
   await ctx.reply(
     `Messages you send here route to a paired Claude Code session. ` +
     `Text and photos are forwarded; replies and reactions come back.\n\n` +
@@ -695,14 +703,12 @@ bot.command('help', async ctx => {
 bot.command('status', async ctx => {
-  if (ctx.chat?.type !== 'private') return
-  const from = ctx.from
-  if (!from) return
-  const senderId = String(from.id)
-  const access = loadAccess()
+  const gated = dmCommandGate(ctx)
+  if (!gated) return
+  const { access, senderId } = gated
   if (access.allowFrom.includes(senderId)) {
-    const name = from.username ? `@${from.username}` : senderId
+    const name = ctx.from!.username ? `@${ctx.from!.username}` : senderId
     await ctx.reply(`Paired as ${name}.`)
     return
   }


@@ -10,6 +10,20 @@ An MCP app is a standard MCP server that **also serves UI resources** — intera
 The UI layer is **additive**. Under the hood it's still tools, resources, and the same wire protocol. If you haven't built a plain MCP server before, the `build-mcp-server` skill covers the base layer. This skill adds widgets on top.
 
+> **Testing in Claude:** Add the server as a custom connector in claude.ai (via a Cloudflare tunnel for local dev) — this exercises the real iframe sandbox and `hostContext`. See https://claude.com/docs/connectors/building/testing.
+
+## Claude host specifics
+
+| `_meta.ui.*` key | Where | Effect |
+|---|---|---|
+| `resourceUri` | tool | Which `ui://` resource the host renders for this tool's results. |
+| `visibility: ["app"]` | tool | Hide a widget-only helper tool (e.g. geometry/image fetcher called via `callServerTool`) from Claude's tool list. |
+| `prefersBorder: false` | resource | Drop the host's outer card border (mobile). |
+| `csp.{connectDomains, resourceDomains, baseUriDomains}` | resource | Declare external origins; default is block-all. `frameDomains` is currently restricted in Claude. |
+
+- `hostContext.safeAreaInsets: {top, right, bottom, left}` (px) — honor these for notches and the composer overlay.
+- Directory submission requires OAuth or **authless** (`none`) — static bearer is private-deploy only and blocks listing — plus tool `annotations` and 35 PNG screenshots; see `references/directory-checklist.md`.
 
 ---
 
 ## When a widget beats plain text
@@ -95,6 +109,7 @@ const server = new McpServer({ name: "contacts", version: "1.0.0" });
 // 1. The tool — returns DATA, declares which UI to show
 registerAppTool(server, "pick_contact", {
   description: "Open an interactive contact picker",
+  annotations: { title: "Pick Contact", readOnlyHint: true },
   inputSchema: { filter: z.string().optional() },
   _meta: { ui: { resourceUri: "ui://widgets/contact-picker.html" } },
 }, async ({ filter }) => {
@@ -163,7 +178,10 @@ The `/*__EXT_APPS_BUNDLE__*/` placeholder gets replaced by the server at startup
 | `app.updateModelContext({...})` | Widget → host | Update context silently (no visible message) |
 | `app.callServerTool({name, arguments})` | Widget → server | Call another tool on your server |
 | `app.openLink({url})` | Widget → host | Open a URL in a new tab (sandbox blocks `window.open`) |
-| `app.getHostContext()` / `app.onhostcontextchanged` | Host → widget | Theme (`light`/`dark`), locale, etc. |
+| `app.getHostContext()` / `app.onhostcontextchanged` | Host → widget | Theme, host CSS vars, `containerDimensions`, `displayMode`, `deviceCapabilities` |
+| `app.requestDisplayMode({mode})` | Widget → host | Ask for `inline` / `pip` / `fullscreen` |
+| `app.downloadFile({name, mimeType, content})` | Widget → host | Host-mediated download (base64 content) |
+| `new App(info, caps, {autoResize: true})` | — | Iframe height tracks rendered content |
 
 `sendMessage` is the typical "user picked something, tell Claude" path. `updateModelContext` is for state that Claude should know about but shouldn't clutter the chat. `openLink` is **required** for any outbound navigation — `window.open` and `<a target="_blank">` are blocked by the sandbox attribute.
@@ -216,6 +234,7 @@ const pickerHtml = readFileSync("./widgets/picker.html", "utf8")
 registerAppTool(server, "pick_contact", {
   description: "Open an interactive contact picker. User selects one contact.",
+  annotations: { title: "Pick Contact", readOnlyHint: true },
   inputSchema: { filter: z.string().optional().describe("Name/email prefix filter") },
   _meta: { ui: { resourceUri: "ui://widgets/picker.html" } },
 }, async ({ filter }) => {
@@ -339,6 +358,24 @@ Desktop caches UI resources aggressively. After editing widget HTML, **fully qui
 The `sleep` keeps stdin open long enough to collect all responses. Parse the jsonl output with `jq` or a Python one-liner.
 
+**Widget dev loop** — avoid the ⌘Q-relaunch cycle entirely by serving the inlined widget HTML at a plain GET route with a fake `ExtApps` shim that fires `ontoolresult` from a query param:
+
+```ts
+app.get("/widget-preview", (_req, res) => {
+  const shim = `globalThis.ExtApps={applyHostStyleVariables:()=>{},App:class{
+    constructor(){this.h={}} ontoolresult;onhostcontextchanged;
+    async connect(){const p=new URLSearchParams(location.search).get("payload");
+    if(p)this.ontoolresult?.({content:[{type:"text",text:p}]});}
+    getHostContext(){return{theme:"light"}}
+    sendMessage(m){console.log("sendMessage",m)} updateModelContext(){}
+    callServerTool(){return Promise.resolve({content:[]})} openLink(){} downloadFile(){}
+  }};`;
+  res.type("html").send(widgetHtml.replace("/*__EXT_APPS_BUNDLE__*/", shim));
+});
+```
+
+Open `http://localhost:3000/widget-preview?payload={"rows":[...]}` in a normal browser tab and iterate with ordinary devtools.
+
 **Host fallback** — use a host without the apps surface (or MCP Inspector) and confirm the tool's text content degrades gracefully.
 
 **CSP debugging** — open the iframe's own devtools console. CSP violations are the #1 reason widgets silently fail (blank rectangle, no error in the main console). See `references/iframe-sandbox.md`.
@@ -347,6 +384,9 @@ The `sleep` keeps stdin open long enough to collect all responses. Parse the jso
## Reference files
- `references/iframe-sandbox.md` — CSP/sandbox constraints, the bundle-inlining pattern, image handling
- `references/iframe-sandbox.md` — CSP/sandbox constraints, the bundle-inlining pattern, image handling, host theming
- `references/widget-templates.md` — reusable HTML scaffolds for picker / confirm / progress / display
- `references/apps-sdk-messages.md` — the `App` class API: widget ↔ host ↔ server messaging
- `references/apps-sdk-messages.md` — the `App` class API: widget ↔ host ↔ server messaging, lifecycle & supersession
- `references/payload-budgeting.md` — host tool-result size caps, prune-then-truncate, heavy assets via `callServerTool`
- `references/abuse-protection.md` — Anthropic egress CIDRs, tiered rate limiting, `trust proxy`, response caching
- `references/directory-checklist.md` — pre-flight for connector-directory submission

View File

@@ -0,0 +1,60 @@
# Abuse protection for authless hosted servers
An authless StreamableHTTP server is reachable by anything on the internet.
There are three resources to protect: your compute, any upstream API quota
your tools consume, and egress bandwidth for large `callServerTool` payloads.
## You don't get a per-user identity
In authless mode there is no token and stateless transport gives no session
ID. Traffic from claude.ai is proxied through Anthropic's egress — every web
user arrives from the same small set of IPs:
```
160.79.104.0/21
2607:6bc0::/48
```
(See https://platform.claude.com/docs/en/api/ip-addresses.)
Claude Desktop, Claude Code, and other hosts connect **directly from the
user's machine**, so those *do* have distinct per-user IPs. Per-IP limiting
therefore works for direct-connect clients; for claude.ai you can only limit
the aggregate Anthropic pool. If true per-user limits matter, that's the
trigger to add OAuth.
## Tiered token-bucket (per-replica backstop)
```ts
const ANTHROPIC_CIDRS = ["160.79.104.0/21", "2607:6bc0::/48"];
const TIERS = {
anthropic: { capacity: 600, refillPerSec: 100 }, // shared pool
other: { capacity: 30, refillPerSec: 2 }, // per-IP
};
```
Match `req.ip` against the CIDRs, pick a bucket (`"anthropic"` or
`"ip:<addr>"`), 429 + `Retry-After` on exhaust. This is a per-replica
backstop — cross-replica enforcement belongs at the edge (Cloudflare, Cloud
Armor), which keeps the containers stateless.
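A per-replica sketch of that bucket logic. The CIDR match itself is stubbed out as a boolean parameter (in practice, match `req.ip` with a proper CIDR library); names and numbers beyond the `TIERS` table above are illustrative:

```ts
type Tier = { capacity: number; refillPerSec: number };

const TIERS: Record<string, Tier> = {
  anthropic: { capacity: 600, refillPerSec: 100 }, // shared pool
  other: { capacity: 30, refillPerSec: 2 },        // per-IP
};

class Bucket {
  tokens: number;
  last: number;
  constructor(private tier: Tier, now = Date.now()) {
    this.tokens = tier.capacity;
    this.last = now;
  }
  take(now = Date.now()): boolean {
    // Lazy refill: top up proportionally to elapsed time, capped.
    const elapsed = (now - this.last) / 1000;
    this.tokens = Math.min(
      this.tier.capacity,
      this.tokens + elapsed * this.tier.refillPerSec,
    );
    this.last = now;
    if (this.tokens < 1) return false; // caller should 429 + Retry-After
    this.tokens -= 1;
    return true;
  }
}

const buckets = new Map<string, Bucket>();

// isAnthropic = result of matching the client IP against the CIDRs.
function allow(ip: string, isAnthropic: boolean, now = Date.now()): boolean {
  const key = isAnthropic ? "anthropic" : `ip:${ip}`;
  let b = buckets.get(key);
  if (!b) {
    b = new Bucket(isAnthropic ? TIERS.anthropic : TIERS.other, now);
    buckets.set(key, b);
  }
  return b.take(now);
}
```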
## `trust proxy` must match your topology
`req.ip` only honours `X-Forwarded-For` if `app.set('trust proxy', N)` is
set. `true` trusts every hop, which lets a direct client send
`X-Forwarded-For: 160.79.108.42` and claim the Anthropic tier. Set it to the
exact number of trusted hops (e.g. `1` behind a single LB, `2` behind
Cloudflare → origin LB) and **never `true` in production**.
## Hard-allowlisting Anthropic IPs is a product decision
Blocking everything outside `160.79.104.0/21` locks out Desktop, Claude Code,
and every other MCP host. Use the CIDRs to **tier** rate limits, not to gate
access, unless claude.ai-only is an explicit goal.
## Cache upstream responses
For tools that wrap a third-party API, an in-process LRU keyed on the
normalized query (TTL hours, no secrets in the key) is the primary cost
control — repeat queries become free and thundering-herd spikes are absorbed. Rate limits
are the safety net, not the first line.
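One possible shape for that cache — a `Map`-based LRU with TTL, sketched from scratch here rather than taken from any particular library:

```ts
type Entry<V> = { value: V; expires: number };

class LruCache<V> {
  private map = new Map<string, Entry<V>>();
  constructor(private maxSize: number, private ttlMs: number) {}

  get(key: string, now = Date.now()): V | undefined {
    const e = this.map.get(key);
    if (!e) return undefined;
    if (e.expires < now) {
      this.map.delete(key);
      return undefined;
    }
    // Re-insert to mark as most recently used.
    this.map.delete(key);
    this.map.set(key, e);
    return e.value;
  }

  set(key: string, value: V, now = Date.now()): void {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.maxSize) {
      // Map iterates in insertion order, so the first key is the LRU.
      this.map.delete(this.map.keys().next().value as string);
    }
    this.map.set(key, { value, expires: now + this.ttlMs });
  }
}

// Key on the normalized query only — never include auth material.
const upstreamCache = new LruCache<string>(500, 6 * 60 * 60 * 1000); // 6h TTL
```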

View File

@@ -2,6 +2,18 @@
The `@modelcontextprotocol/ext-apps` package provides the `App` class (browser side) and `registerAppTool`/`registerAppResource` helpers (server side). Messaging is bidirectional and persistent.
## Construction
```js
const app = new App(
{ name: "MyWidget", version: "1.0.0" },
{}, // capabilities
{ autoResize: true }, // options
);
```
`autoResize: true` wires a `ResizeObserver` that emits `ui/notifications/size-changed` so the host iframe height tracks your rendered content. Without it the frame is fixed-height and tall renders get clipped — set it for any widget whose height depends on data.
---
## Widget → Host
@@ -63,6 +75,26 @@ card.querySelector("a").addEventListener("click", (e) => {
Host-mediated download (sandbox blocks direct `<a download>`). `content` is a base64 string.
```js
const csv = rows.map((r) => Object.values(r).join(",")).join("\n");
app.downloadFile({
name: "export.csv",
mimeType: "text/csv",
content: btoa(unescape(encodeURIComponent(csv))),
});
```
### `app.requestDisplayMode({ mode })`
Ask the host to switch the widget between `"inline"`, `"pip"`, or `"fullscreen"`. Check `getHostContext().availableDisplayModes` first; hide the control if the mode isn't offered. The host responds by firing `onhostcontextchanged` with new `displayMode` and `containerDimensions` — re-render at the new size.
```js
if (app.getHostContext()?.availableDisplayModes?.includes("fullscreen")) {
expandBtn.hidden = false;
expandBtn.onclick = () => app.requestDisplayMode({ mode: "fullscreen" });
}
```
---
## Host → Widget
@@ -84,9 +116,22 @@ app.ontoolresult = ({ content }) => {
Fires with the arguments Claude passed to the tool. Useful if the widget needs to know what was asked for (e.g., highlight the search term).
### `app.ontoolinputpartial = ({ arguments }) => {...}` / `app.ontoolcancelled = () => {...}`
`ontoolinputpartial` fires while Claude is still streaming arguments — use it to show a skeleton ("Preparing: <title>…") before the result lands. `ontoolcancelled` fires if the call is aborted; clear the skeleton.
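A minimal sketch of that skeleton sequence. `app` here is a bare object standing in for a connected `App` instance, and the skeleton helpers are hypothetical widget-local functions:

```js
// Sketch: show a skeleton while arguments stream, clear it on cancel.
let skeletonText = null;
const showSkeleton = (text) => { skeletonText = text; };
const clearSkeleton = () => { skeletonText = null; };

const app = {}; // stand-in for a connected ExtApps App instance
app.ontoolinputpartial = ({ arguments: args }) => {
  // Arguments are still streaming, so any field may be missing.
  showSkeleton(args?.title ? `Preparing: ${args.title}…` : "Preparing…");
};
app.ontoolcancelled = () => {
  clearSkeleton(); // call aborted — don't leave a stale skeleton up
};
```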
### `app.getHostContext()` / `app.onhostcontextchanged = (ctx) => {...}`
Read and subscribe to host context: `theme` (`"light"` / `"dark"`), locale, etc. Call `getHostContext()` **after** `connect()`. Subscribe for live updates (user toggles dark mode mid-conversation).
Read and subscribe to host context. Call `getHostContext()` **after** `connect()`. Subscribe for live updates (user toggles dark mode, expands to fullscreen).
| `ctx.` field | Use |
|---|---|
| `theme` | `"light"` / `"dark"` — toggle a `.dark` class |
| `styles.variables` | Host CSS tokens — pass to `applyHostStyleVariables()` so colors/fonts match host chrome |
| `displayMode` / `availableDisplayModes` | Current mode and which `requestDisplayMode` targets are valid |
| `containerDimensions.{maxHeight,width}` | Size your render to this instead of hard-coded px |
| `deviceCapabilities.touch` | Switch hover-only affordances to tap (`pointerdown`) |
| `safeAreaInsets` | Padding for notches / composer overlay |
```js
const applyTheme = (t) =>
@@ -129,14 +174,36 @@ No `{ notify }` destructure — `extra` is `RequestHandlerExtra`; progress goes
## Lifecycle
1. Claude calls a tool with `_meta.ui.resourceUri` declared
2. Host fetches the resource (your HTML) and renders it in an iframe
2. Host fetches the resource (your HTML) and mounts a **fresh iframe** for this call
3. Widget script runs, sets handlers, calls `await app.connect()`
4. Host pipes the tool's return value → `ontoolresult` fires
5. Widget renders, user interacts
6. Widget calls `sendMessage` / `updateModelContext` / `callServerTool` as needed
7. Widget persists until conversation context moves on — subsequent calls to the same tool reuse the iframe and fire `ontoolresult` again
7. Iframe persists in the transcript; **the next call to the same tool mounts another iframe** alongside it
There's no explicit "submit and close" — the widget is a long-lived surface.
There's no explicit "submit and close" — each instance is long-lived, but instances are not reused across calls.
### Supersession
Because earlier instances stay mounted, a click on a stale widget can `sendMessage` after a newer one has rendered. Detect this with a `BroadcastChannel` and make older instances inert:
```js
let superseded = false;
const seq = Date.now() + Math.random();
const bc = new BroadcastChannel("my-widget");
bc.onmessage = (e) => {
if (e.data?.seq > seq) {
superseded = true;
document.body.classList.add("superseded"); // opacity:.45; pointer-events:none
}
};
bc.postMessage({ seq });
// Guard outbound calls:
function safeSend(msg) {
if (!superseded) app.sendMessage(msg);
}
```
---

View File

@@ -0,0 +1,18 @@
# Connector-directory submission checklist
Pre-flight before submitting a remote MCP app to the Claude connector
directory. Each item is a hard review criterion.
| Area | Requirement |
|---|---|
| **Auth** | OAuth (DCR or CIMD) or **`none`** (authless). Static bearer tokens are private-deploy only and block listing. Authless is valid for public-data servers — the server holds any upstream API keys. |
| **Tool annotations** | Every tool sets `annotations.title` plus the relevant hints: `readOnlyHint: true` for fetch/search tools, `destructiveHint` / `idempotentHint` for writes, `openWorldHint: true` if the tool reaches an external system. |
| **Tool names** | ≤ 64 characters, snake/kebab case. |
| **Widget layout** | Inline height ≤ 500px, no nested scroll containers, 44pt minimum touch targets, WCAG-AA contrast in both themes. |
| **Theming** | `html, body { background: transparent }`, `<meta name="color-scheme" content="light dark">`, adopt host CSS tokens via `applyHostStyleVariables`. |
| **External links** | Use `app.openLink`. Declare each origin (e.g. `https://api.example.com`) in the connector's *Allowed link URIs* so the link skips the confirm modal. |
| **Helper tools** | Widget-only tools (geometry/image fetchers) carry `_meta.ui.visibility: ["app"]` so they don't appear in Claude's tool list. |
| **Screenshots** | 3-5 PNGs, ≥ 1000px wide, cropped to the app response only — no prompt text in frame. |
See `abuse-protection.md` for rate-limit and IP-tiering guidance once the
authless endpoint is public.

View File

@@ -122,23 +122,38 @@ that survives un-inlined.
---
## Dark mode
## Theme & host styles
```js
const applyTheme = (theme) =>
document.documentElement.classList.toggle("dark", theme === "dark");
The host renders the iframe inside its own card chrome — paint a **transparent** background and adopt host CSS tokens so the widget blends in across light/dark and across hosts.
app.onhostcontextchanged = (ctx) => applyTheme(ctx.theme);
await app.connect();
applyTheme(app.getHostContext()?.theme);
```html
<meta name="color-scheme" content="light dark" />
```
```css
:root { --ink:#0f1111; --bg:#fff; color-scheme:light; }
:root.dark { --ink:#e6e6e6; --bg:#1f2428; color-scheme:dark; }
:root {
--ink: var(--color-text-primary, #0f1111);
--sub: var(--color-text-secondary, #5a6270);
--line: var(--color-border-default, #e3e6ea);
}
html, body { background: transparent; color: var(--ink); }
:root.dark .thumb { mix-blend-mode: normal; } /* multiply → images vanish in dark */
```
```js
const { App, applyHostStyleVariables } = globalThis.ExtApps;
function applyHostContext(ctx) {
document.documentElement.classList.toggle("dark", ctx?.theme === "dark");
if (ctx?.styles?.variables) applyHostStyleVariables(ctx.styles.variables);
}
app.onhostcontextchanged = applyHostContext;
await app.connect();
applyHostContext(app.getHostContext());
```
`applyHostStyleVariables` writes the host's `--color-*` / `--font-*` / `--border-radius-*` tokens onto `:root`; the hex values above are fallbacks for hosts that don't supply them.
---
## Debugging

View File

@@ -0,0 +1,54 @@
# Payload budgeting
Hosts cap tool-result text. claude.ai and Claude Desktop truncate at roughly
**150,000 characters**; Claude Code at ~25k tokens. When a tool result exceeds
the cap, the host substitutes a file-pointer string in place of your JSON. The
widget then receives non-JSON in `ontoolresult`, `JSON.parse` throws, and the
user sees something like *"Bad payload: SyntaxError: Unexpected token 'E'"*
with no hint that size was the cause.
## Symptom → cause
| Symptom | Likely cause |
|---|---|
| Widget shows a JSON parse error on `content[0].text` | Result over the host cap; host swapped in a file-pointer string |
| Works for one query, breaks for "all of X" | Row count × column count crossed the cap |
| Works in MCP Inspector, breaks in Desktop | Inspector has no cap; Desktop does |
## Strategy
Cap your own payload at ~130KB and degrade in order:
1. **Ship full rows** when `JSON.stringify(rows).length` is under the cap.
2. **Prune columns** to those the rendering spec actually references. Walk the
spec for both `field: "..."` keys *and* `datum.X` / `datum['X']` inside
expression strings — if the spec aliases a column via a `calculate`
transform, the alias appears as `field:` but the source column only appears
as `datum.X`, and dropping it leaves the widget with NaN.
3. **Truncate rows** as a last resort and include `{ truncated: N }` in the
payload so the widget can label it.
```ts
const MAX = 130_000;
let out = rows;
if (JSON.stringify(out).length > MAX) {
const keep = referencedFields(spec); // field: + datum.X refs
out = rows.map((r) => pick(r, keep));
if (JSON.stringify(out).length > MAX) {
const per = JSON.stringify(out[0] ?? {}).length || 1;
out = out.slice(0, Math.floor(MAX / per));
}
}
```
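The `referencedFields(spec)` helper above is assumed; one way to sketch it — a recursive walk that collects `field:` values plus `datum.X` / `datum['X']` references inside expression strings, per the pruning caveat in step 2 (names and regexes are illustrative, not from the source):

```ts
// Hypothetical referencedFields: collect every column a Vega-Lite-style
// spec actually uses, whether as a `field:` value or inside a `datum`
// expression reference (e.g. in a `calculate` transform).
function referencedFields(spec: unknown): Set<string> {
  const out = new Set<string>();
  const visit = (node: unknown): void => {
    if (Array.isArray(node)) {
      node.forEach(visit);
      return;
    }
    if (node && typeof node === "object") {
      for (const [k, v] of Object.entries(node)) {
        if (k === "field" && typeof v === "string") out.add(v);
        else visit(v);
      }
      return;
    }
    if (typeof node === "string") {
      // Expression strings: datum.X and datum['X'] / datum["X"]
      for (const m of node.matchAll(/datum\.([A-Za-z_$][\w$]*)/g)) out.add(m[1]);
      for (const m of node.matchAll(/datum\[['"]([^'"\]]+)['"]\]/g)) out.add(m[1]);
    }
  };
  visit(spec);
  return out;
}
```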
## Heavy assets go via `callServerTool`, not the result
Geometry, image bytes, or any blob the widget needs but Claude doesn't should
be served by a separate tool the widget calls after mount:
```js
const topo = await app.callServerTool({ name: "get-topojson", arguments: { level } });
```
Mark that helper tool with `_meta.ui.visibility: ["app"]` so it doesn't appear
in Claude's tool list.

View File

@@ -8,6 +8,8 @@ version: 0.1.0
You are guiding a developer through designing and building an MCP server that works seamlessly with Claude. MCP servers come in many forms — picking the wrong shape early causes painful rewrites later. Your first job is **discovery, not code**.
**Load Claude-specific context first.** The MCP spec is generic; Claude has additional auth types, review criteria, and limits. Before answering questions or scaffolding, fetch `https://claude.com/docs/llms-full.txt` (the full export of the Claude connector docs) so your guidance reflects Claude's actual constraints.
Do not start scaffolding until you have answers to the questions in Phase 1. If the user's opening message already answers them, acknowledge that and skip straight to the recommendation.
---
@@ -182,6 +184,17 @@ Tools are one of three server primitives. Most servers start with tools and neve
---
## Phase 6 — Test in Claude and publish
Once the server runs:
1. **Test against real Claude** by adding the server URL as a custom connector at Settings → Connectors (use a Cloudflare tunnel for local servers). Claude identifies itself with `clientInfo.name: "claude-ai"` on initialize. → https://claude.com/docs/connectors/building/testing
2. **Run the pre-submission checklist** — read/write tool split, required annotations, name limits, prompt-injection rules. → https://claude.com/docs/connectors/building/review-criteria
3. **Submit to the Anthropic Directory.** → https://claude.com/docs/connectors/building/submission
4. **Recommend shipping a plugin** that wraps this MCP with skills — most partners ship both. → https://claude.com/docs/connectors/building/what-to-build
---
## Quick reference: decision matrix
| Scenario | Deployment | Tool pattern |

View File

@@ -2,6 +2,22 @@
Auth is the reason most people end up needing a **remote** server even when a local one would be simpler. OAuth redirects, token storage, and refresh all work cleanly when there's a real hosted endpoint to redirect back to.
## Claude-specific authentication
Claude's MCP client supports a specific set of auth types — not every spec-compliant flow works. Full reference: https://claude.com/docs/connectors/building/authentication
| Type | Notes |
|---|---|
| `oauth_dcr` | Supported. For high-volume directory entries, prefer CIMD or Anthropic-held creds — DCR registers a new client on every fresh connection. |
| `oauth_cimd` | Supported, recommended over DCR for directory entries. |
| `oauth_anthropic_creds` | Partner provides `client_id`/`client_secret` to Anthropic; user-consent-gated. Contact `mcp-review@anthropic.com`. |
| `custom_connection` | User supplies URL/creds at connect time (Snowflake-style). Contact `mcp-review@anthropic.com`. |
| `none` | Authless. |
**Not supported:** user-pasted bearer tokens (`static_bearer`); pure machine-to-machine `client_credentials` grant without user consent.
**Callback URL** (single, all surfaces): `https://claude.ai/api/mcp/auth_callback`
---
## The three tiers

View File

@@ -2,6 +2,16 @@
Tool schemas and descriptions are prompt engineering. They land directly in Claude's context and determine whether Claude picks the right tool with the right arguments. Most MCP integration bugs trace back to vague descriptions or loose schemas.
## Anthropic Directory hard requirements
If this server will be submitted to the Anthropic Directory, the following are pass/fail review criteria (full list: https://claude.com/docs/connectors/building/review-criteria):
- Every tool **must** include `readOnlyHint`, `destructiveHint`, and `title` annotations — these determine auto-permissions in Claude.
- Tool names **must** be ≤64 characters.
- Read and write operations **must** be in separate tools. A single tool accepting both GET and POST/PUT/PATCH/DELETE is rejected — documenting safe vs unsafe within one tool's description does not satisfy this.
- Tool descriptions **must not** instruct Claude how to behave (e.g. "always do X", "you must call Y first", overriding system instructions, promoting products) — treated as prompt injection at review.
- Tools that accept freeform API endpoints/params **must** reference the target API's documentation in their description.
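As a sketch of what those criteria look like on a compliant read/write pair — tool names and hint values here are illustrative, not from the source:

```ts
// Annotation sets for a read/write split that passes the criteria above:
// separate tools, every required hint present, names well under 64 chars.
const searchInvoicesAnnotations = {
  title: "Search Invoices",
  readOnlyHint: true,       // fetch-only: eligible for auto-permission
  destructiveHint: false,
  openWorldHint: true,      // reaches the invoicing provider's API
};

const deleteInvoiceAnnotations = {
  title: "Delete Invoice",
  readOnlyHint: false,
  destructiveHint: true,    // write tool, kept separate from the search
  idempotentHint: true,     // deleting the same invoice twice is a no-op
  openWorldHint: true,
};

const toolNames = ["search_invoices", "delete_invoice"];
```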
---
## Descriptions

View File

@@ -8,6 +8,8 @@ version: 0.1.0
MCPB is a local MCP server **packaged with its runtime**. The user installs one file; it runs without needing Node, Python, or any toolchain on their machine. It's the sanctioned way to distribute local MCP servers.
> MCPB is the **secondary** distribution path. Anthropic recommends remote MCP servers for directory listing — see https://claude.com/docs/connectors/building/what-to-build.
**Use MCPB when the server must run on the user's machine** — reading local files, driving a desktop app, talking to localhost services, OS-level APIs. If your server only hits cloud APIs, you almost certainly want a remote HTTP server instead (see `build-mcp-server`). Don't pay the MCPB packaging tax for something that could be a URL.
---

View File

@@ -1,6 +1,6 @@
---
name: skill-creator
description: Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, update or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
description: Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
---
# Skill Creator
@@ -391,7 +391,7 @@ Use the model ID from your system prompt (the one powering the current session)
While it runs, periodically tail the output to give the user updates on which iteration it's on and what the scores look like.
This handles the full optimization loop automatically. It splits the eval set into 60% train and 40% held-out test, evaluates the current description (running each query 3 times to get a reliable trigger rate), then calls Claude with extended thinking to propose improvements based on what failed. It re-evaluates each new description on both train and test, iterating up to 5 times. When it's done, it opens an HTML report in the browser showing the results per iteration and returns JSON with `best_description` — selected by test score rather than train score to avoid overfitting.
This handles the full optimization loop automatically. It splits the eval set into 60% train and 40% held-out test, evaluates the current description (running each query 3 times to get a reliable trigger rate), then calls Claude to propose improvements based on what failed. It re-evaluates each new description on both train and test, iterating up to 5 times. When it's done, it opens an HTML report in the browser showing the results per iteration and returns JSON with `best_description` — selected by test score rather than train score to avoid overfitting.
### How skill triggering works
@@ -435,6 +435,11 @@ In Claude.ai, the core workflow is the same (draft → test → review → impro
**Packaging**: The `package_skill.py` script works anywhere with Python and a filesystem. On Claude.ai, you can run it and the user can download the resulting `.skill` file.
**Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. In this case:
- **Preserve the original name.** Note the skill's directory name and `name` frontmatter field -- use them unchanged. E.g., if the installed skill is `research-helper`, output `research-helper.skill` (not `research-helper-v2`).
- **Copy to a writeable location before editing.** The installed skill path may be read-only. Copy to `/tmp/skill-name/`, edit there, and package from the copy.
- **If packaging manually, stage in `/tmp/` first**, then copy to the output directory -- direct writes may fail due to permissions.
---
## Cowork-Specific Instructions
@@ -447,6 +452,7 @@ If you're in Cowork, the main things to know are:
- Feedback works differently: since there's no running server, the viewer's "Submit All Reviews" button will download `feedback.json` as a file. You can then read it from there (you may have to request access first).
- Packaging works — `package_skill.py` just needs Python and a filesystem.
- Description optimization (`run_loop.py` / `run_eval.py`) should work in Cowork just fine since it uses `claude -p` via subprocess, not a browser, but please save it until you've fully finished making the skill and the user agrees it's in good shape.
- **Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. Follow the update guidance in the claude.ai section above.
---

View File

@@ -2,22 +2,52 @@
"""Improve a skill description based on eval results.
Takes eval results (from run_eval.py) and generates an improved description
using Claude with extended thinking.
by calling `claude -p` as a subprocess (same auth pattern as run_eval.py —
uses the session's Claude Code auth, no separate ANTHROPIC_API_KEY needed).
"""
import argparse
import json
import os
import re
import subprocess
import sys
from pathlib import Path
import anthropic
from scripts.utils import parse_skill_md
def _call_claude(prompt: str, model: str | None, timeout: int = 300) -> str:
"""Run `claude -p` with the prompt on stdin and return the text response.
Prompt goes over stdin (not argv) because it embeds the full SKILL.md
body and can easily exceed comfortable argv length.
"""
cmd = ["claude", "-p", "--output-format", "text"]
if model:
cmd.extend(["--model", model])
# Remove CLAUDECODE env var to allow nesting claude -p inside a
# Claude Code session. The guard is for interactive terminal conflicts;
# programmatic subprocess usage is safe. Same pattern as run_eval.py.
env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"}
result = subprocess.run(
cmd,
input=prompt,
capture_output=True,
text=True,
env=env,
timeout=timeout,
)
if result.returncode != 0:
raise RuntimeError(
f"claude -p exited {result.returncode}\nstderr: {result.stderr}"
)
return result.stdout
def improve_description(
client: anthropic.Anthropic,
skill_name: str,
skill_content: str,
current_description: str,
@@ -99,7 +129,7 @@ Based on the failures, write a new and improved description that is more likely
1. Avoid overfitting
2. The list might get loooong and it's injected into ALL queries and there might be a lot of skills, so we don't want to blow too much space on any given description.
Concretely, your description should not be more than about 100-200 words, even if that comes at the cost of accuracy.
Concretely, your description should not be more than about 100-200 words, even if that comes at the cost of accuracy. There is a hard limit of 1024 characters — descriptions over that will be truncated, so stay comfortably under it.
Here are some tips that we've found to work well in writing these descriptions:
- The skill should be phrased in the imperative -- "Use this skill for" rather than "this skill does"
@@ -111,70 +141,41 @@ I'd encourage you to be creative and mix up the style in different iterations si
Please respond with only the new description text in <new_description> tags, nothing else."""
response = client.messages.create(
model=model,
max_tokens=16000,
thinking={
"type": "enabled",
"budget_tokens": 10000,
},
messages=[{"role": "user", "content": prompt}],
)
text = _call_claude(prompt, model)
# Extract thinking and text from response
thinking_text = ""
text = ""
for block in response.content:
if block.type == "thinking":
thinking_text = block.thinking
elif block.type == "text":
text = block.text
# Parse out the <new_description> tags
match = re.search(r"<new_description>(.*?)</new_description>", text, re.DOTALL)
description = match.group(1).strip().strip('"') if match else text.strip().strip('"')
# Log the transcript
transcript: dict = {
"iteration": iteration,
"prompt": prompt,
"thinking": thinking_text,
"response": text,
"parsed_description": description,
"char_count": len(description),
"over_limit": len(description) > 1024,
}
# If over 1024 chars, ask the model to shorten it
# Safety net: the prompt already states the 1024-char hard limit, but if
# the model blew past it anyway, make one fresh single-turn call that
# quotes the too-long version and asks for a shorter rewrite. (The old
# SDK path did this as a true multi-turn; `claude -p` is one-shot, so we
# inline the prior output into the new prompt instead.)
if len(description) > 1024:
shorten_prompt = f"Your description is {len(description)} characters, which exceeds the hard 1024 character limit. Please rewrite it to be under 1024 characters while preserving the most important trigger words and intent coverage. Respond with only the new description in <new_description> tags."
shorten_response = client.messages.create(
model=model,
max_tokens=16000,
thinking={
"type": "enabled",
"budget_tokens": 10000,
},
messages=[
{"role": "user", "content": prompt},
{"role": "assistant", "content": text},
{"role": "user", "content": shorten_prompt},
],
shorten_prompt = (
f"{prompt}\n\n"
f"---\n\n"
f"A previous attempt produced this description, which at "
f"{len(description)} characters is over the 1024-character hard limit:\n\n"
f'"{description}"\n\n'
f"Rewrite it to be under 1024 characters while keeping the most "
f"important trigger words and intent coverage. Respond with only "
f"the new description in <new_description> tags."
)
shorten_thinking = ""
shorten_text = ""
for block in shorten_response.content:
if block.type == "thinking":
shorten_thinking = block.thinking
elif block.type == "text":
shorten_text = block.text
shorten_text = _call_claude(shorten_prompt, model)
match = re.search(r"<new_description>(.*?)</new_description>", shorten_text, re.DOTALL)
shortened = match.group(1).strip().strip('"') if match else shorten_text.strip().strip('"')
transcript["rewrite_prompt"] = shorten_prompt
transcript["rewrite_thinking"] = shorten_thinking
transcript["rewrite_response"] = shorten_text
transcript["rewrite_description"] = shortened
transcript["rewrite_char_count"] = len(shortened)
@@ -216,9 +217,7 @@ def main():
print(f"Current: {current_description}", file=sys.stderr)
print(f"Score: {eval_results['summary']['passed']}/{eval_results['summary']['total']}", file=sys.stderr)
client = anthropic.Anthropic()
new_description = improve_description(
client=client,
skill_name=name,
skill_content=content,
current_description=current_description,

View File

@@ -15,8 +15,6 @@ import time
import webbrowser
from pathlib import Path
import anthropic
from scripts.generate_report import generate_html
from scripts.improve_description import improve_description
from scripts.run_eval import find_project_root, run_eval
@@ -75,7 +73,6 @@ def run_loop(
train_set = eval_set
test_set = []
client = anthropic.Anthropic()
history = []
exit_reason = "unknown"
@@ -200,7 +197,6 @@ def run_loop(
for h in history
]
new_description = improve_description(
client=client,
skill_name=name,
skill_content=content,
current_description=current_description,