claude-plugins-official/plugins/math-olympiad/README.md
Tobin South 9d468adfb8 math-olympiad: housekeeping (#1172)
* math-olympiad: add LICENSE, marketplace entry, and prettier formatting

- Add Apache 2.0 LICENSE file
- Register plugin in marketplace.json
- Run prettier (prose-wrap=always, 80 cols) over all plugin markdown
- Simplify model tier naming in reference docs

🏠 Remote-Dev: homespace

* Update .claude-plugin/marketplace.json
2026-03-30 20:56:21 +01:00

# math-olympiad
Competition math solver with adversarial verification.
## The problem
Self-verification gets fooled: a verifier that sees the reasoning is biased
toward agreement. arXiv:2503.21934 ("Proof or Bluff") reported that an 85.7%
self-verified IMO success rate drops to <5% under human grading.
## The approach
- **Context-isolated verification**: verifier sees only the clean proof, never
  the reasoning trace
- **Pattern-armed adversarial checks**: not "is this correct?" but "does this
  accidentally prove RH?" / "extract the general lemma, find a 2×2
  counterexample"
- **Calibrated abstention**: says "no confident solution" rather than bluff
- **Presentation pass**: produces clean LaTeX/PDF after verification passes
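
The first three points can be illustrated with a minimal Python sketch. This is
not the plugin's implementation: `Attempt`, `Verdict`, `ADVERSARIAL_CHECKS`,
`verify_isolated`, and `solve` are hypothetical names, and the verifier is
assumed to be any callable that takes a proof and a check prompt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    reasoning_trace: str  # scratch work; must never reach the verifier
    proof: str            # clean final write-up


@dataclass
class Verdict:
    accepted: bool
    objection: str = ""


# Hypothetical prompts paraphrasing the adversarial checks listed above.
ADVERSARIAL_CHECKS = [
    "Does this argument accidentally prove a famous open problem?",
    "Extract the most general lemma used and look for a small counterexample.",
]

# Assumed verifier interface: (proof, check) -> Verdict.
Verifier = Callable[[str, str], Verdict]


def verify_isolated(attempt: Attempt, verifier: Verifier) -> Optional[str]:
    """Return the proof if every adversarial check passes, else None."""
    for check in ADVERSARIAL_CHECKS:
        # Context isolation: only the clean proof crosses this boundary.
        verdict = verifier(attempt.proof, check)
        if not verdict.accepted:
            return None
    return attempt.proof


def solve(attempt: Attempt, verifier: Verifier) -> str:
    """Calibrated abstention: report failure instead of an unverified proof."""
    proof = verify_isolated(attempt, verifier)
    return proof if proof is not None else "no confident solution"
```

In this sketch a single failed check is enough to abstain, which trades some
recall for a lower false-positive rate; a real pipeline might retry or revise
the proof before giving up.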
## Validation
17/18 IMO+Putnam 2025 problems solved, 0 false positives, 2 novel proofs found.
See the skill's eval data in the
[anthropic monorepo](https://github.com/anthropics/anthropic/tree/staging/sandbox/sandbox/ralph/math_skills/eval_harness).
## Install
```
/plugin install math-olympiad@claude-plugins-official
```
## Use
```
> Solve this IMO problem: [statement]
```
The skill auto-triggers on "IMO", "Putnam", "olympiad", "verify this proof",
etc.