Prompt attribution

multivon-eval attribution is the structured-diff substrate that the /eval-audit skill uses to figure out which prompts actually changed in a PR before deciding what to re-run. It walks a Python repo with an AST visitor, finds every anthropic.messages.create / openai.chat.completions.create / litellm.completion call site, extracts the string-literal prompts, fingerprints them, and emits a PR-comment-ready Markdown diff. Shipped in 0.9.4. Implementation lives in multivon_eval/attribution/ — five small modules, no extra dependencies, pure stdlib.

This is Phase 1 — descriptive, not causal. Output says “these N prompts changed between base and head”; it never says “this prompt change caused this regression.” The hardened calibration spike of 2026-05-30 showed Haiku-based causal attribution failing catastrophically on mixed-cause regressions (14% HIGH-confidence-and-wrong rate). Causal attribution is gated on a future non-prompt-change sidecar signal — see the package docstring for the gate.

When to use it

In CI, on every PR: run attribution diff base head and post the Markdown output as a PR comment. Reviewers see exactly which prompts changed without scrolling the unified diff.
From /eval-audit: the skill calls scan to identify which call sites the PR touched, then runs only the eval cases that stress those surfaces.
One-off audits: attribution scan . lists every prompt literal in your codebase. Useful when onboarding to a repo you didn’t write.

What it captures

The AST extractor in ast_extractor.py matches three call shapes:

# anthropic
anthropic.messages.create(system="<literal>", messages=[{"role": "user", "content": "<literal>"}])
client.messages.create(...)              # any *.messages.create

# openai
openai.chat.completions.create(messages=[...])
client.chat.completions.create(...)      # any *.chat.completions.create

# litellm
litellm.completion(messages=[...])
litellm.acompletion(messages=[...])

Matching is method-name-based, not type-inferred: any object whose method chain ends in one of these is captured. This trades some precision for simplicity; false matches without a system or messages kwarg are silently dropped.

Literals captured as static:

Plain string literals: system="..."
f-strings with zero runtime interpolation (their content is fully known at parse time)
Module-level string constants referenced by name at the call site (one hop, same file; see below)

Everything else (Attribute lookups, runtime f-strings, .join(...) calls, names that don’t resolve under the rules below) is recorded as a PromptRecord with is_dynamic=True and a placeholder text. The count is preserved and the gap is visible in the diff, but the actual text isn’t compared.
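The matching and classification rules above can be sketched with the standard-library ast module. This is a minimal illustration, not the code in ast_extractor.py: the helper names (attribute_chain, sdk_for_call, classify) are hypothetical, the placeholder format is modeled on the <dynamic:KwargsUnpack> example, and aliased imports and module-constant resolution are left out.

```python
from __future__ import annotations

import ast

# Method-chain suffixes treated as SDK call sites (from the call shapes listed above).
SDK_SUFFIXES = {
    ("messages", "create"): "anthropic",
    ("chat", "completions", "create"): "openai",
}
LITELLM_FUNCS = {"completion", "acompletion"}


def attribute_chain(node: ast.expr) -> tuple[str, ...]:
    """Return the dotted call target, e.g. ('client', 'messages', 'create')."""
    parts: list[str] = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    # A non-Name receiver (e.g. get_client().messages) is kept as an opaque marker.
    parts.append(node.id if isinstance(node, ast.Name) else "<expr>")
    return tuple(reversed(parts))


def sdk_for_call(call: ast.Call) -> str | None:
    """Name-based match: only the end of the method chain is inspected."""
    chain = attribute_chain(call.func)
    if len(chain) == 2 and chain[0] == "litellm" and chain[1] in LITELLM_FUNCS:
        return "litellm"
    for suffix, sdk in SDK_SUFFIXES.items():
        if len(chain) > len(suffix) and chain[-len(suffix):] == suffix:
            return sdk
    return None


def classify(value: ast.expr) -> tuple[bool, str]:
    """Return (is_dynamic, text) for one prompt argument."""
    if isinstance(value, ast.Constant) and isinstance(value.value, str):
        return False, value.value  # plain string literal
    if isinstance(value, ast.JoinedStr) and all(
        isinstance(part, ast.Constant) for part in value.values
    ):
        # f-string with zero interpolation: fully known at parse time
        return False, "".join(part.value for part in value.values)
    return True, f"<dynamic:{type(value).__name__}>"


source = '''
client.messages.create(system="You are terse.", messages=msgs)
client.messages.create(system=f"Hello {name}")
'''
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.Call) and sdk_for_call(node):
        for kw in node.keywords:
            if kw.arg == "system":
                print(sdk_for_call(node), classify(kw.value))
```

Because only the tail of the chain is checked, an unrelated object that happens to expose messages.create would match too, which is the simplicity trade-off described above.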

What the extractor does not see: prompts in Jinja templates, LangChain ChatPromptTemplate objects, prompts loaded from a database, runtime-assembled strings, or names that need cross-module or multi-hop resolution. If the prompt isn’t statically resolvable at a recognized SDK call site, it doesn’t land in the diff. This is deliberate: the v1 adversarial-fix discipline dropped fuzzy name-regex capture entirely.

Scanner v3 (0.10.1)

0.10.1 shipped SCANNER_VERSION = 3. v2 (0.10.0) added the first two items; v3 closed three detection blind spots found by running the scanner against five real open-source repos (aider, gpt-researcher, open-interpreter, letta, pr-agent), where v2 saw zero call sites in four of the five:

Aliased litellm imports (import litellm as llm, from litellm import completion as c) now resolve to litellm call sites.
**kwargs-unpacked calls are recorded as dynamic records with placeholder <dynamic:KwargsUnpack> instead of being invisible.
messages=<variable> (the dominant real-world pattern) produces an honest dynamic record instead of nothing. A statically-known empty messages=[] still produces nothing — there is no prompt there.
One-hop module-constant resolution. SYSTEM_PROMPT = "..." at module scope followed by system=SYSTEM_PROMPT at a call site now resolves to the real text instead of a dynamic placeholder. The rules are deliberately conservative, because a false “static” poisons every downstream verdict while a false “dynamic” is merely cautious. A name assigned more than once, assigned under a conditional/loop/try, augmented, tuple-unpacked, shadowed in the enclosing scope, declared global, or defined in another module stays dynamic. One hop only: X = Y chains do not resolve.
loose_fingerprint on every record: a hash of the whitespace-collapsed prompt text, alongside the strict fingerprint. It exists solely to label a change as formatting-only, never to suppress it.

Scanner v4 (0.11.1)

The scanner that ships with 0.11.1+ is versioned SCANNER_VERSION = 4. v4 is hardening from an adversarial audit. Every fix replaces a crash or a false verdict with an accurate one:

Fingerprints are NFC-normalized (the version bump). Composed vs decomposed unicode (“é” as one codepoint vs e + combining accent) is an editor/OS artifact, not a prompt change — it previously fingerprinted as drift.
match-statement capture patterns disqualify module constants. case PROMPT: rebinds the name through a path the constant resolver didn’t see, letting a rebound constant read as static. Such names now stay dynamic.
Unscannable files surface as UNSCANNABLE, never REMOVED. A file the scanner can’t parse (syntax error, non-UTF8 encoding) previously returned zero records, so the staleness report marked every baselined site in it REMOVED, and --fail-on removed failed CI with a misleading verdict. They now report as a distinct UNSCANNABLE tier (“file exists but could not be parsed — verdict unknown, NOT removed”): a warning names each file with its reason in all three renderers, the JSON report gains skipped_files, and --fail-on removed no longer trips. Skipped files are a report-time concept, never written into baselines.
Symlinks resolving outside the repo root are skipped, not recorded. They previously wrote machine-specific absolute paths into the baseline, producing false REMOVED+ADDED churn on every other checkout.
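The two fingerprints can be illustrated with a short sketch. The hash function is an assumption (SHA-256 is used here only as a stand-in; the source does not say which hash the scanner uses), and applying NFC before the loose fingerprint is likewise assumed. The formatting_only helper is hypothetical.

```python
import hashlib
import unicodedata


def strict_fingerprint(text: str) -> str:
    """Hash of the NFC-normalized prompt text (hash choice is an assumption)."""
    normalized = unicodedata.normalize("NFC", text)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def loose_fingerprint(text: str) -> str:
    """Hash of the NFC-normalized text with all whitespace runs collapsed."""
    normalized = unicodedata.normalize("NFC", text)
    collapsed = " ".join(normalized.split())
    return hashlib.sha256(collapsed.encode("utf-8")).hexdigest()


def formatting_only(old: str, new: str) -> bool:
    """True when the strict fingerprints differ but the loose ones agree."""
    return (
        strict_fingerprint(old) != strict_fingerprint(new)
        and loose_fingerprint(old) == loose_fingerprint(new)
    )


composed = "caf\u00e9"      # "é" as a single codepoint
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent
print(strict_fingerprint(composed) == strict_fingerprint(decomposed))
print(formatting_only("Be brief.\n  Cite sources.", "Be brief. Cite sources."))
```

In this sketch a formatting-only change still produces a different strict fingerprint, so it is labeled rather than hidden.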

The scanner version is recorded in prompt_baseline.json. When a newer scanner reads a baseline written by an older scanner version, it prints a “rescan recommended” warning instead of reporting fake drift.

Attribution is the scanner; the staleness report (also 0.10.0) is the drift detector built on it. It keeps a committed baseline of every call site plus per-case provenance, answering “which prompts changed since my cases were authored?” rather than “what changed between these two checkouts?”.

`attribution scan`

Walks a Python repo and lists every prompt call site it finds.

multivon-eval attribution scan ./my-repo

multivon-eval attribution scan ./my-repo --format json > prompts.json

Text output (default):

Found 7 prompt(s) across 3 file(s):

  src/agent/planner.py:42:anthropic.system
      qualname=Planner.build_prompt  fp=a3f1b2c8d4e0…
      first line: 'You are a careful planning agent.'
  src/agent/planner.py:78:anthropic.user#0
      qualname=Planner.build_prompt  fp=9c2e7a1f6d3b…
      first line: 'Plan the following task step by step:'
  src/extractors/invoice.py:91:anthropic.system  [dynamic]
      qualname=extract_invoice  fp=000000000000…
  ...

The [dynamic] marker flags call sites where the prompt is built at runtime; the extractor records the location but can’t compare the text across refs.

JSON output is a flat list of records, each carrying the fingerprint, file/line, sdk, role, role position, qualname, and a 200-char text preview. It is suitable for piping into jq or feeding into a CI step that wants to fail on unfingerprinted prompts.

The walk skips .venv/, node_modules/, __pycache__/, and other build directories automatically.
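A CI check over the JSON output might look like the sketch below. The key names (file_path, line, is_dynamic) are assumptions based on the PromptRecord fields mentioned in this page, not a documented schema, so inspect real --format json output before relying on them. The sample data is made up for illustration.

```python
import json

# Hypothetical sample shaped like the documented record fields; the real key
# names in `--format json` output may differ.
SAMPLE = """
[
  {"file_path": "src/agent/planner.py", "line": 42, "sdk": "anthropic",
   "role": "system", "is_dynamic": false},
  {"file_path": "src/extractors/invoice.py", "line": 91, "sdk": "anthropic",
   "role": "system", "is_dynamic": true}
]
"""


def dynamic_sites(records: list[dict]) -> list[str]:
    """Return 'path:line' for every record flagged as dynamic."""
    return [f"{r['file_path']}:{r['line']}" for r in records if r.get("is_dynamic")]


def exit_code(records: list[dict]) -> int:
    """Non-zero when at least one call site could not be fingerprinted."""
    return 1 if dynamic_sites(records) else 0


records = json.loads(SAMPLE)
for site in dynamic_sites(records):
    print(f"dynamic prompt at {site}")
```

In a real pipeline the records would be loaded from the prompts.json file written by the scan command, and the script would end with sys.exit(exit_code(records)).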

`attribution diff`

Compute the structured prompt diff between two repo checkouts.

multivon-eval attribution diff ./repo-base ./repo-head

multivon-eval attribution diff ./repo-base ./repo-head --format text

multivon-eval attribution diff ./repo-base ./repo-head --format json

The default --format markdown emits a PR-comment-ready block. The render.py output truncates each prompt body to the first six lines or 400 chars (whichever hits first) with an explicit … (truncated) marker — so the PR comment stays readable on a long system prompt. Diffs are ordered: modified first, then added, then removed, then dynamic, sorted by call_site_id within each group. That order is stable across runs and surfaces the highest-signal changes (modified literals) first.
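The truncation and ordering rules can be sketched as follows. This is an illustration of the behavior described above, not the code in render.py: plain dicts stand in for PromptDiff, the function names are hypothetical, and the exact placement of the truncation marker is an assumption.

```python
# Group order described above: modified, added, removed, dynamic.
GROUP_ORDER = {"modified": 0, "added": 1, "removed": 2, "dynamic": 3}


def truncate(body: str, max_lines: int = 6, max_chars: int = 400) -> str:
    """Keep the first six lines or 400 chars, whichever limit is hit first."""
    lines = body.splitlines()
    head = "\n".join(lines[:max_lines])
    clipped = head[:max_chars]
    if len(lines) > max_lines or len(head) > max_chars:
        return clipped + "\n… (truncated)"
    return clipped


def order_diffs(diffs: list[dict]) -> list[dict]:
    """Sort by change-type group, then by call_site_id within each group."""
    return sorted(
        diffs, key=lambda d: (GROUP_ORDER[d["change_type"]], d["call_site_id"])
    )


diffs = [
    {"change_type": "removed", "call_site_id": "a.py:1"},
    {"change_type": "modified", "call_site_id": "b.py:9"},
    {"change_type": "modified", "call_site_id": "a.py:3"},
]
for d in order_diffs(diffs):
    print(d["change_type"], d["call_site_id"])
```

Sorting on a fixed (group, call_site_id) key is what makes the output stable across runs.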

How identity works

Two PromptRecords are considered “the same call site across refs” iff (file_path, line, sdk, role, role_position) all match. This is the Tier-1 identity used by diff_records. A renamed file or a shifted line will register as removed + added rather than modified — Tier-2 file-move / call-site-shift detection is a future addition (see schema.py).

change_type values

Type	Meaning
`modified`	Both refs have a static record at the same call site; fingerprints differ.
`added`	The call site exists only in head.
`removed`	The call site exists only in base.
`dynamic`	Either record has `is_dynamic=True` and the recorded text differs. The actual prompt cannot be reliably compared.

A purely-dynamic call site with identical placeholder text on both sides is not emitted — there’s no meaningful change to surface.
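The identity and change_type rules above can be summarized in a few lines. The Record dataclass below is a hypothetical stand-in for PromptRecord (only the fields named in this page are included), and classify_changes is an illustrative helper, not the real diff_records.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Record:  # hypothetical stand-in for PromptRecord
    file_path: str
    line: int
    sdk: str
    role: str
    role_position: int
    text: str
    fingerprint: str
    is_dynamic: bool = False

    @property
    def key(self) -> tuple:
        """Tier-1 identity: all five fields must match across refs."""
        return (self.file_path, self.line, self.sdk, self.role, self.role_position)


def classify_changes(base: list[Record], head: list[Record]) -> dict[tuple, str]:
    """Map each changed call-site key to its change_type."""
    base_by_key = {r.key: r for r in base}
    head_by_key = {r.key: r for r in head}
    changes: dict[tuple, str] = {}
    for key in base_by_key.keys() | head_by_key.keys():
        old, new = base_by_key.get(key), head_by_key.get(key)
        if old is None:
            changes[key] = "added"
        elif new is None:
            changes[key] = "removed"
        elif old.is_dynamic or new.is_dynamic:
            if old.text != new.text:  # identical placeholders are not emitted
                changes[key] = "dynamic"
        elif old.fingerprint != new.fingerprint:
            changes[key] = "modified"
    return changes


old = Record("a.py", 10, "anthropic", "system", 0, "Be brief.", "f1")
shifted = Record("a.py", 11, "anthropic", "system", 0, "Be brief.", "f1")
print(classify_changes([old], [shifted]))
```

The example shows the Tier-1 limitation: the same prompt moved down one line is reported as one removed and one added call site.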

Programmatic API

Same surface as the CLI, importable from any Python code:

from multivon_eval.attribution import (
    scan,
    diff_records,
    render_markdown,
)

base = scan("/path/to/repo-base")
head = scan("/path/to/repo-head")
diffs = diff_records(base, head)
md = render_markdown(diffs)
print(md)

scan returns list[PromptRecord]. diff_records returns list[PromptDiff]. render_markdown emits the same string the CLI prints with --format markdown. The dataclasses are frozen and exposed from the package root, so you can build your own renderer if the bundled one doesn’t fit your PR-comment style.

Wiring it into CI

A minimal pair of GitHub Actions steps (assuming the workflow already checks out base and head into separate directories):

.github/workflows/eval-pr.yml

- name: Prompt diff
  run: |
    multivon-eval attribution diff ./base ./head --format markdown \
      > prompt-diff.md
- uses: marocchino/sticky-pull-request-comment@v2
  with:
    path: prompt-diff.md

For a more end-to-end PR gate that also runs the changed-surface eval cases, install multivon-ai/eval-action, which composes attribution scan and attribution diff with the suite runner and the calibrated threshold gate.

What’s next

The Phase 2 sidecar that gates causal attribution on a non-prompt-change signal is under design — see the package docstring in attribution/__init__.py.
Tier-2 identity (file-move / line-shift detection) is on the roadmap; until it lands, a refactor that moves a prompt to a different line will show up as removed + added.
