multivon-eval ships three Anthropic Agent Skills so Claude Code knows when to bootstrap an eval suite, when to gate a PR on it, and how to explain why a particular evaluator was picked. The skills are plain Markdown with YAML frontmatter (no DSL), and they live in multivon_eval/_skills/ inside the PyPI package.
What is an Agent Skill?
A skill is a directory containing a SKILL.md file that Claude Code auto-loads from ~/.claude/skills/. The frontmatter declares name, description, and allowed-tools; the body is instructions Claude Code reads when the description matches the user’s request. Without skills, Claude Code has to infer evaluator selection and CLI flags from your docs, and it hallucinates command names half the time. With skills, the tool’s own team writes the workflow once and every Claude Code session inherits it.
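To make the SKILL.md layout concrete, here is a minimal sketch that splits a skill file into its frontmatter fields and its instruction body. The sample file content is hypothetical (the real skills' descriptions and allowed-tools values are not reproduced here), and the parser handles only flat `key: value` lines, not full YAML.

```python
def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md string into (frontmatter fields, body).

    Handles only flat `key: value` lines between two `---` fences.
    This is a simplification, not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    fields: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing fence is the instruction body.
            body = "\n".join(lines[i + 1:]).strip()
            return fields, body
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    # No closing fence found: treat the whole text as body.
    return {}, text


# Hypothetical SKILL.md content, for illustration only.
EXAMPLE_SKILL = """\
---
name: eval-explain
description: Explain why a particular evaluator was picked.
allowed-tools: Read
---
Read DISCOVERY_REPORT.md and the evaluator docstring, then answer in three sentences.
"""

fields, body = parse_frontmatter(EXAMPLE_SKILL)
print(fields["name"])
print(fields["description"])
```

The three frontmatter keys are the ones named above; the body is whatever follows the closing fence.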
The bootstrap to audit to explain loop
The three skills form one iterative loop. You start cold, run bootstrap, ask explain to surface what just got picked, then on every subsequent PR audit gates the change against the suite bootstrap generated.
eval-bootstrap
Cold-start an eval suite from a product description plus sample traces. Emits eval_suite.py, seed_cases.jsonl, thresholds.yaml, DISCOVERY_REPORT.md.
eval-audit
Pre-ship gate on a PR diff. Runs only the cases that stress the changed surface. Blocks safety-class regressions at p < 0.05.
eval-explain
Three-sentence answer to “why did multivon pick this evaluator”. Reads DISCOVERY_REPORT.md plus the evaluator docstring.
Install
The skills ship inside the multivon-eval PyPI package (>= 0.9.8). One command writes them into ~/.claude/skills/.
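As a rough picture of what an installer of this kind does, the sketch below links each skill directory into a target directory, falls back to a recursive copy when the symlink is refused, supports a dry run, and refuses to overwrite unless forced. It is not the package's implementation: the function name, parameters, and log messages are hypothetical, and the demo runs against a temporary directory rather than ~/.claude/skills/.

```python
import os
import shutil
import tempfile
from pathlib import Path

SKILL_NAMES = ("eval-bootstrap", "eval-audit", "eval-explain")


def install_skills(source_root: Path, target_root: Path,
                   dry_run: bool = False, force: bool = False) -> list[str]:
    """Hypothetical installer: link (or copy) each skill; return a log of actions."""
    log: list[str] = []
    collisions = [n for n in SKILL_NAMES if os.path.lexists(target_root / n)]
    if collisions and not force and not dry_run:
        raise FileExistsError(f"refusing to overwrite: {', '.join(collisions)}")
    for name in SKILL_NAMES:
        src, dst = source_root / name, target_root / name
        if dry_run:
            # Report only; never touch the filesystem.
            note = " (exists; needs force)" if name in collisions and not force else ""
            log.append(f"would install {src} -> {dst}{note}")
            continue
        target_root.mkdir(parents=True, exist_ok=True)
        if os.path.lexists(dst):
            # Only reached with force=True: remove the existing entry first.
            if dst.is_symlink() or dst.is_file():
                dst.unlink()
            else:
                shutil.rmtree(dst)
        try:
            os.symlink(src, dst, target_is_directory=True)
            log.append(f"symlinked {dst}")
        except OSError:
            # Symlink refused (e.g. Windows without privileges): copy instead.
            shutil.copytree(src, dst)
            log.append(f"copied {dst} (re-run after upgrades)")
    return log


# Demo against throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "pkg" / "_skills"
    target = Path(tmp) / "home" / ".claude" / "skills"
    for name in SKILL_NAMES:
        (source / name).mkdir(parents=True)
        (source / name / "SKILL.md").write_text(f"---\nname: {name}\n---\n")
    for line in install_skills(source, target, dry_run=True):
        print(line)
    for line in install_skills(source, target):
        print(line)
```

A symlink keeps the installed skill pointing at the package's own files, which is why a copy needs refreshing after an upgrade and a link does not.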
install-skills prefers a directory symlink into ~/.claude/skills/eval-bootstrap (and the two siblings) so a later pip install -U multivon-eval propagates SKILL.md edits without re-running the command. On Windows or filesystems that refuse directory symlinks, it falls back to a recursive copy and prints a note that you’ll need to re-run the command after upgrades.
A dry run prints what would happen (which source paths, which targets, symlink vs copy) without touching the filesystem. Run this first if you’re unsure what ~/.claude/skills/ already contains.
--force replaces existing entries at the three target paths. Without --force, the command refuses to overwrite anything already on disk and tells you which entries collided.
Auto-discovery flow
Once the symlinks exist under ~/.claude/skills/, Claude Code auto-loads them on session start: no config edit, no restart of an existing session required (a new session picks them up on next launch). Each session’s tool list includes the skill names; Claude Code matches user phrases against each skill’s description and invokes the matching skill before answering. The auto-invoke triggers are spelled out on each skill’s detail page.
Verify the install:
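One way to check from Python, as a sketch: confirm that each of the three skill directories is present under ~/.claude/skills/ and contains a SKILL.md. The function name and status strings are hypothetical, and this only inspects the filesystem; it does not confirm that a Claude Code session has loaded the skills.

```python
from pathlib import Path

SKILL_NAMES = ("eval-bootstrap", "eval-audit", "eval-explain")


def check_install(skills_root: Path) -> dict[str, str]:
    """Report, per skill, whether skills_root/<name>/SKILL.md is present."""
    report: dict[str, str] = {}
    for name in SKILL_NAMES:
        entry = skills_root / name
        if (entry / "SKILL.md").is_file():
            kind = "symlink" if entry.is_symlink() else "copy"
            report[name] = f"ok ({kind})"
        elif entry.is_symlink():
            # Link exists but does not lead to a SKILL.md.
            report[name] = "dangling or incomplete symlink"
        else:
            report[name] = "missing"
    return report


if __name__ == "__main__":
    for name, status in check_install(Path.home() / ".claude" / "skills").items():
        print(f"{name}: {status}")
```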
Manual fallback
If you can’t run install-skills (older versions, or you want to vendor into a different directory), the symlinks are three lines:
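A Python equivalent, as a minimal sketch. It assumes the package imports as multivon_eval and that each skill sits in a same-named directory under _skills/; both are inferred from the paths mentioned above rather than confirmed, and the helper names are hypothetical.

```python
import importlib.util
import os
from pathlib import Path

SKILL_NAMES = ("eval-bootstrap", "eval-audit", "eval-explain")


def packaged_skills_dir() -> Path:
    """Locate multivon_eval/_skills/ inside the installed package (assumed layout)."""
    spec = importlib.util.find_spec("multivon_eval")
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError("multivon_eval is not installed in this environment")
    # spec.origin points at the package's __init__.py.
    return Path(spec.origin).parent / "_skills"


def link_skills(source_root: Path, target_root: Path) -> None:
    """Create one directory symlink per skill; fails if a target already exists."""
    target_root.mkdir(parents=True, exist_ok=True)
    for name in SKILL_NAMES:
        os.symlink(source_root / name, target_root / name, target_is_directory=True)
```

Calling `link_skills(packaged_skills_dir(), Path.home() / ".claude" / "skills")` would create the three links; pass a different second argument to vendor the skills elsewhere.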

