# Multivon Docs

## Docs

- [Compliance — Audit trail](https://docs.multivon.ai/compliance/audit-trail.md): The hash-chain algorithm, how the verifier proves integrity end-to-end, what the auditor zip contains, and how to anchor the chain tip to an external witness.
- [Compliance Bundle — Early Access](https://docs.multivon.ai/compliance/bundle.md): What the paid Compliance Bundle adds on top of the open-source library, what it does not include today, and how to engage.
- [Compliance — DPDP (India)](https://docs.multivon.ai/compliance/dpdp.md): How multivon-eval helps with the Digital Personal Data Protection Act, 2023 (India). PII patterns covered, what the library produces, what remains your obligation.
- [Compliance — EU AI Act](https://docs.multivon.ai/compliance/eu-ai-act.md): Article-by-article coverage of multivon-eval against Regulation (EU) 2024/1689. What each article requires, what evidence the library produces, what it does not, and where the source lives.
- [Compliance — Overview](https://docs.multivon.ai/compliance/overview.md): What multivon-eval is and is not in compliance terms. Frameworks mapped today, what an auditor receives, and where organizational measures take over.
- [Agent Evaluators](https://docs.multivon.ai/evaluators/agent.md): Evaluate tool use, planning, and task completion in agentic systems.
- [Compliance Evaluators](https://docs.multivon.ai/evaluators/compliance.md): Local PII detection and schema validation — no API calls required.
- [Consistency Evaluators](https://docs.multivon.ai/evaluators/consistency.md): Zero-resource hallucination detection via repeated sampling.
- [Conversation Evaluators](https://docs.multivon.ai/evaluators/conversation.md): Evaluate multi-turn chat quality across an entire session.
- [Deterministic Evaluators](https://docs.multivon.ai/evaluators/deterministic.md): Instant, free checks that need no LLM.
- [LLM Judge Evaluators](https://docs.multivon.ai/evaluators/llm-judge.md): QAG-based scoring for quality you can't measure with strings.
- [Multimodal Evaluators](https://docs.multivon.ai/evaluators/multimodal.md): Vision-grounded faithfulness for image- and document-AI outputs.
- [Agent traces](https://docs.multivon.ai/guides/agent-trace.md): Capture multi-step agent execution and score it with eight trace-aware evaluators.
- [Prompt attribution](https://docs.multivon.ai/guides/attribution.md): Structured prompt-diff for PR review — find every LLM-SDK call site in a Python repo, then diff two checkouts to surface every prompt change.
- [Bootstrap an eval suite from your product](https://docs.multivon.ai/guides/bootstrap.md): Cold-start your evals — `multivon-eval bootstrap` proposes a tuned suite from a product description + sample traces in under 60 seconds.
- [CI/CD Integration](https://docs.multivon.ai/guides/ci-cd.md): Run evals as a quality gate in your deployment pipeline.
- [Compliance & Privacy](https://docs.multivon.ai/guides/compliance.md): PII detection, schema validation, and tamper-evident audit trails — all local, no API calls.
- [Custom Evaluators](https://docs.multivon.ai/guides/custom-evaluators.md): Build your own evaluators by extending the base class.
- [Loading Datasets](https://docs.multivon.ai/guides/datasets.md): Load eval cases from JSONL and CSV files.
- [Real-World Examples](https://docs.multivon.ai/guides/examples.md): End-to-end walkthroughs for support bots, RAG pipelines, and coding agents.
- [Experiment Tracking](https://docs.multivon.ai/guides/experiments.md): Compare eval runs across model versions and catch regressions before they ship.
- [Factory Suites](https://docs.multivon.ai/guides/factory-suites.md): Pre-configured eval suites by use case — no evaluator selection needed.
- [Synthetic Dataset Generation](https://docs.multivon.ai/guides/generate.md): Generate eval cases from your docs — no labeled data required.
- [Install Claude Code skills](https://docs.multivon.ai/guides/install-skills.md): Wire the bundled eval-bootstrap, eval-audit, and eval-explain skills into ~/.claude/skills/ with one command.
- [Framework Integrations](https://docs.multivon.ai/guides/integrations.md): Capture agent traces from LangChain, LangSmith, or any custom agent with a consistent OOP interface.
- [Intelligent eval primitives](https://docs.multivon.ai/guides/intelligent-eval.md): The multivon_eval.auto module — auto_evaluators, generate_adversarial_cases, and validate_adversarial_cases. The primitives the `multivon-eval bootstrap` CLI composes.
- [Production targets](https://docs.multivon.ai/guides/production-targets.md): Run evals against deployed REST APIs, multi-turn sessions, and real browsers — not just local functions.
- [Reliability & Flakiness Detection](https://docs.multivon.ai/guides/reliability.md): Handle LLM non-determinism with multi-run evaluation, flakiness detection, and statistical significance testing.
- [Statistical Rigor](https://docs.multivon.ai/guides/statistical-rigor.md): Confidence intervals, power analysis, multiple comparison correction, and judge calibration.
- [Synthetic dataset generation](https://docs.multivon.ai/guides/synthetic-data.md): Generate eval cases from raw text or files — no labeled data required.
- [Introduction](https://docs.multivon.ai/introduction.md): AI evaluation for teams that need to prove their model is safe to ship.
- [Agent recipes](https://docs.multivon.ai/mcp/agent-recipes.md): Three short patterns that turn an MCP-connected agent into a working eval loop.
- [Configuration](https://docs.multivon.ai/mcp/configuration.md): Wire multivon-mcp into Claude Desktop, Claude Code, Cursor, and Cline.
- [multivon-mcp](https://docs.multivon.ai/mcp/overview.md): MCP server that gives AI coding agents direct access to the multivon-eval and pdfhell evaluation surface.
- [Tool reference](https://docs.multivon.ai/mcp/tool-reference.md): Every multivon-mcp tool, its arguments, and the JSON shape it returns.
- [CI integration](https://docs.multivon.ai/pdfhell/ci-integration.md): Wire pdfhell into GitHub Actions, GitLab CI, and the audit pack into procurement workflows.
- [CLI reference](https://docs.multivon.ai/pdfhell/cli-reference.md): Every pdfhell subcommand and flag.
- [FAQ](https://docs.multivon.ai/pdfhell/faq.md): Methodology questions, scoring, contamination, statistical power, and how pdfhell relates to existing eval frameworks.
- [Quickstart](https://docs.multivon.ai/pdfhell/quickstart.md): Install pdfhell and run your first adversarial PDF benchmark in 30 seconds.
- [Trap families](https://docs.multivon.ai/pdfhell/trap-families.md): The three adversarial trap families in the 0.1 mini suite — what each tests, how it's generated, and what specifically breaks.
- [Quickstart](https://docs.multivon.ai/quickstart.md): Run your first eval in 5 minutes — or let multivon-eval pick your evaluators for you.
- [EvalReport API reference](https://docs.multivon.ai/reference/eval-report.md): Programmatic interface for the object returned by suite.run(). Every attribute and method, with types and one-line descriptions.
- [eval-audit](https://docs.multivon.ai/skills/eval-audit.md): Claude Code skill that gates a PR on eval regressions — runs only the cases that stress the changed surface and blocks safety-class drops.
- [eval-bootstrap](https://docs.multivon.ai/skills/eval-bootstrap.md): Claude Code skill that turns a product description plus a handful of traces into a runnable eval suite in under three minutes.
- [eval-explain](https://docs.multivon.ai/skills/eval-explain.md): Claude Code skill that explains in exactly 3 sentences why multivon-eval picked a particular evaluator, threshold, or methodology.
- [Claude Code Skills](https://docs.multivon.ai/skills/index.md): Three skills that teach Claude Code how to bootstrap, audit, and explain your evals — installed with one command.
- [Why multivon-eval](https://docs.multivon.ai/why-multivon-eval.md): Head-to-head numbers vs DeepEval, RAGAS, and Promptfoo — with reproducible benchmarks you can rerun.

## OpenAPI Specs

- [openapi](https://docs.multivon.ai/api-reference/openapi.json)