Compliance Evaluators

Compliance evaluators run entirely within your environment. No data leaves your infrastructure.

PIIEvaluator

Scans LLM outputs for personally identifiable information using local regex patterns plus checksum validation (Luhn, Verhoeff, Mod-97, Mod-11). Zero API calls by default. Optional NER fallback via Presidio when installed.

When to use: Any regulated deployment (healthcare, finance, legal, government) where PII in model outputs is a compliance risk. Run it in CI/CD so regressions are caught before production.

Passes when no PII is detected. Fails with a per-type breakdown.

from multivon_eval import PIIEvaluator

# Default — all jurisdictions, strict checksum validation, no NER.
suite.add_evaluators(PIIEvaluator())

# Healthcare with name detection (lazy-imports presidio_analyzer if installed).
suite.add_evaluators(PIIEvaluator(jurisdiction="hipaa", use_ner=True))

# India-specific with custom identifier overlay.
suite.add_evaluators(PIIEvaluator(
    jurisdiction="dpdp",
    patterns={"employee_id": r"EMP-\d{6}"},
))

# Reason field uses [REDACTED-TYPE] tokens; original output is never mutated.
suite.add_evaluators(PIIEvaluator(redact=True))

Parameters

Parameter	Default	Description
`jurisdiction`	`"all"`	`"hipaa"`, `"gdpr"`, `"dpdp"`, `"ccpa"`, `"pipeda"`, or `"all"`.
`patterns`	`None`	Additional `{name: regex}` overlay. Merged after jurisdiction patterns.
`redact`	`False`	Reason field shows `[REDACTED-TYPE]` tokens instead of substrings. Original output is never mutated.
`threshold`	`1.0`	Pass threshold — any PII match fails by default.
`strict`	`True`	Apply checksum validators (Luhn, Verhoeff, Mod-97, Mod-11, PAN structural, SSN structural) to drop false positives. Pass `False` to see raw regex hits.
`use_ner`	`False`	Lazy-import `presidio_analyzer` (if installed) to additionally catch PERSON, LOCATION, DATE_TIME. Partial coverage for HIPAA Safe Harbor categories regex can’t reach. Silent no-op when Presidio isn’t installed.
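
As a rough illustration of how the `patterns` overlay and `redact` options could interact, here is a minimal sketch. It is not the library's implementation: `BASE_PATTERNS`, `merge_patterns`, and `redact` are hypothetical names, the email regex is a stand-in for a built-in pattern, and the uppercase token format and the "overlay wins on a name clash" behavior are assumptions.

```python
import re

# Hypothetical stand-in for one jurisdiction's built-in pattern set.
BASE_PATTERNS = {"email": r"[\w.+-]+@[\w-]+\.[\w.-]+"}


def merge_patterns(base, overlay=None):
    """Apply overlay entries after the base set (assumed: same-named keys win)."""
    merged = dict(base)
    merged.update(overlay or {})
    return merged


def redact(text, patterns):
    """Return a copy of text with every match replaced by a [REDACTED-<NAME>] token."""
    redacted = text
    for name, pattern in patterns.items():
        redacted = re.sub(pattern, f"[REDACTED-{name.upper()}]", redacted)
    return redacted


patterns = merge_patterns(BASE_PATTERNS, {"employee_id": r"EMP-\d{6}"})
original = "Contact jane@example.com about EMP-123456."
print(redact(original, patterns))
# The original string is left untouched; only the returned copy is redacted.
```

Because `redact` builds a new string, the input passed in is never mutated, which mirrors the guarantee stated for the `redact` parameter.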

Standards covered

Every pattern in the evaluator carries a citation to its source standard. The table below is exhaustive — if your jurisdiction needs an identifier not listed here, supply it via patterns={...} (or open a PR — additions are easy).

HIPAA Safe Harbor — 45 CFR § 164.514(b)(2)

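Eighteen identifier categories. The evaluator covers thirteen via regex. The remaining five (free-text names, geographic subdivisions, photographs, biometrics, “other unique IDs”) get only partial coverage: names, addresses, and biometric references through use_ner=True, other unique IDs through a patterns={...} overlay, and photographs not at all, since they are not text.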

#	Identifier	Detection
1	Names (with `Patient`, `Mr`, `Dr` etc. prefix)	Regex (context-led, high precision)
1	Names without prefix	NER (`use_ner=True`)
2	Street addresses	NER (`use_ner=True`); loose regex baseline
2	US ZIP codes (`ZIP 94103`, `Zipcode 12345-6789`)	Regex (context-anchored)
3	Dates of admission / discharge / death / birth	Regex (context-anchored)
3	Age > 89 (`aged 92`, `years old: 95`)	Regex
4	Telephone numbers	Regex (NANP + international)
5	Fax numbers	Regex (NANP shape, `Fax:` prefix)
6	Email addresses	Regex (RFC 5322 simplified)
7	Social Security Numbers	Regex + structural validator (drops `000-`, `666-`, `9xx-`, all-same decoys)
8	Medical record numbers (4–15 digits)	Regex
9	Health plan beneficiary numbers (`HPN`, `Group No`, `Policy No`)	Regex
10	Account numbers (`Acct`, `Account`)	Regex
11	Certificate / license numbers (`NPI`, `DEA`, `License`, `Cert`)	Regex
12	Vehicle identifiers (VINs, 17-char)	Regex
13	Device identifiers (`UDI`, `Device ID`, `Implant No`)	Regex
14	Web URLs	Regex
15	IP addresses (IPv4)	Regex
16	Biometric identifiers	NER (`use_ner=True`); partial
17	Full-face photographs	Not text — must be screened upstream
18	Other unique identifying numbers	`patterns={...}` overlay
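
The structural SSN check in row 7 can be sketched as follows. This is an illustration that implements only the four rules named in the table (`000-`, `666-`, `9xx-` prefixes and all-same digits); the function name `ssn_passes_structure` is hypothetical, and the evaluator's real validator may apply further rules.

```python
import re

# Dashed NNN-NN-NNNN shape only; ASCII digits.
SSN_SHAPE = re.compile(r"(\d{3})-(\d{2})-(\d{4})", re.ASCII)


def ssn_passes_structure(candidate: str) -> bool:
    """Return False for SSN-shaped strings that match a known decoy rule."""
    match = SSN_SHAPE.fullmatch(candidate)
    if match is None:
        return False
    area = match.group(1)
    digits = "".join(match.groups())
    # Area numbers 000, 666 and 900-999 are rejected.
    if area in ("000", "666") or area.startswith("9"):
        return False
    # All-same decoys such as 111-11-1111.
    if len(set(digits)) == 1:
        return False
    return True


for sample in ("000-12-3456", "666-12-3456", "912-34-5678", "111-11-1111", "234-56-7890"):
    print(sample, ssn_passes_structure(sample))
```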

GDPR — Regulation (EU) 2016/679, Art.4(1)

National identification numbers across EU member states and the UK, plus base PII.

Identifier	Country	Validator
NI Number (`AB123456C`)	UK	Structural (excludes D/F/I/Q/U/V prefixes)
NHS Number	UK	Mod-11 (10-digit, drops 3-3-4 grouping false positives)
DNI (`12345678Z`)	Spain	Letter (mod-23) when strict
NIE (`X1234567L`)	Spain	Letter (mod-23) when strict
Codice Fiscale (16 alphanumeric)	Italy	Structural
NIR / INSEE (15-digit Sécurité Sociale)	France	Structural
Steuer-IdNr (`Steuer-IdNr: 12345678901`)	Germany	Context-anchored
BSN (`BSN: 12345678`)	Netherlands	Context-anchored; 11-test optional
PESEL (11-digit, `PESEL` context)	Poland	Structural
Personnummer (`YYMMDD-XXXX`)	Sweden	Context-anchored
CPR (`DDMMYY-XXXX`)	Denmark	Context-anchored
PPSN (`7 digits + 1-2 letters`)	Ireland	Context-anchored
HETU (`DDMMYY[-+A]NNNX`)	Finland	Structural
EU VAT (country prefix + digits)	EU	Structural
IBAN	All	Mod-97 per ISO 13616

Article 9 special categories (race, religion, health data, sex life, trade-union membership): these are content categories, not identifier formats. Use a topic classifier or NER pipeline; the regex evaluator can’t reach them.
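
Two of the validators from the table above, the IBAN Mod-97 check and the Spanish DNI control letter (mod-23), can be sketched as shown here. The function names are hypothetical and the shape regexes are simplified assumptions; the sketch follows the published algorithms rather than the evaluator's own code.

```python
import re

IBAN_SHAPE = re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", re.ASCII)
DNI_SHAPE = re.compile(r"(\d{8})([A-Z])", re.ASCII)
# Control-letter lookup for the Spanish DNI, indexed by number % 23.
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"


def iban_valid(candidate: str) -> bool:
    """Mod-97 check: move the first four characters to the end, map A-Z to 10-35."""
    iban = candidate.replace(" ", "").upper()
    if IBAN_SHAPE.fullmatch(iban) is None:
        return False
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def dni_valid(candidate: str) -> bool:
    """The trailing letter must equal DNI_LETTERS[number % 23]."""
    match = DNI_SHAPE.fullmatch(candidate.upper())
    if match is None:
        return False
    return DNI_LETTERS[int(match.group(1)) % 23] == match.group(2)


print(iban_valid("GB82 WEST 1234 5698 7654 32"))
print(dni_valid("12345678Z"))
```

Both checks only prove that a string is well-formed, not that the identifier was ever issued, which is why they work as false-positive filters rather than as lookups.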

DPDP India — Act 22 of 2023

Indian government-issued identifiers + Indian PII formats.

Identifier	Format	Validator
Aadhaar	UIDAI 12-digit	Verhoeff dihedral checksum
PAN	Income Tax `[A-Z]{5}\d{4}[A-Z]`	Structural — 4th char ∈
GSTIN	`\d{2}[A-Z]{5}\d{4}[A-Z][A-Z0-9]Z[A-Z0-9]`	Structural
IFSC	`[A-Z]{4}0[A-Z0-9]{6}`	Structural
Voter ID (EPIC)	`[A-Z]{3}\d{7}`	Structural
India mobile (+91)	10-digit, starts 6-9	Structural
Driving License	State + RTO + year + serial	Structural
India Passport	Letter (A–P, R–W, Y) + 7 digits	Structural
Vehicle Registration	`KA-01-AB-1234` style	Structural
Ration Card	Context-anchored	—
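
The Verhoeff checksum listed for Aadhaar can be sketched as below. This follows the standard published Verhoeff tables, not the evaluator's source; `verhoeff_valid` is a hypothetical name, and the 12-digit length rule for Aadhaar is left out so the function checks the checksum only.

```python
# Multiplication table of the dihedral group D5.
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]

# Position-dependent permutation table (repeats every 8 positions).
P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
    [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
    [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
    [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]


def verhoeff_valid(candidate: str) -> bool:
    """Return True when the digit string (check digit last) passes Verhoeff."""
    stripped = candidate.replace("-", "").replace(" ", "")
    if not (stripped.isascii() and stripped.isdigit()):
        return False
    checksum = 0
    # Walk the digits right to left, permuting by position and folding through D.
    for position, char in enumerate(reversed(stripped)):
        checksum = D[checksum][P[position % 8][int(char)]]
    return checksum == 0


print(verhoeff_valid("2363"))            # textbook example: 236 with check digit 3
print(verhoeff_valid("1234-5678-9012"))  # Aadhaar-shaped decoy
```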

CCPA — Cal. Civ. Code § 1798.140(o)

Identifier	Detection
Bank account / routing (context-anchored: `account`, `acct`, `routing`)	Regex
California Driver’s License (`[A-Z]\d{7}`)	Regex

(Other CCPA categories — biometric data, geolocation, browsing history — need pipeline-level controls; regex can’t cover them.)

PIPEDA (Canada)

Schedule 1 categories overlap entirely with the base PII set (name, email, phone, address, SSN/SIN, financial). No Canada-specific identifier format needs to be added.

Strict mode (default)

When strict=True (the default), regex hits are filtered through checksum and structural validators before being reported. Matches that have the right shape but fail validation are dropped, which cuts false positives:

Identifier	Validator
Credit card (Visa/MC/Amex/Discover/JCB)	Luhn Mod-10
Aadhaar	Verhoeff
IBAN	Mod-97 (ISO 13616)
NHS Number	Mod-11
PAN India	Structural (holder-type)
SSN (US)	Structural (drops `000-`, `666-`, `9xx-`, all-same decoys)
GSTIN	Structural

Pass strict=False to see raw matches without validation — useful for debugging and for jurisdictions where checksum specs aren’t published.
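
To make the filtering step concrete, here is a minimal sketch of the Luhn Mod-10 check used for card numbers. The function name `luhn_valid` is hypothetical and the separator handling is an assumption; it illustrates the algorithm, not the evaluator's internal code.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True when the digit string passes the Luhn Mod-10 checksum."""
    stripped = candidate.replace("-", "").replace(" ", "")
    if len(stripped) < 2 or not (stripped.isascii() and stripped.isdigit()):
        return False
    total = 0
    # Starting from the rightmost digit, double every second digit.
    for position, char in enumerate(reversed(stripped)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4532-0151-1283-0366"))
print(luhn_valid("4532-0151-1283-0367"))
```

A card-shaped string whose checksum fails, like the second one, is the kind of regex hit that strict mode would discard.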

Optional NER (`use_ner=True`)

When use_ner=True, the evaluator additionally invokes presidio_analyzer on the output to catch PERSON, LOCATION, DATE_TIME, NRP, ORGANIZATION, etc. This provides partial coverage for HIPAA Safe Harbor categories that regex can’t reach (unprefixed names, free-form addresses, biometric references). Presidio is an optional dependency:

pip install presidio_analyzer
python -m spacy download en_core_web_lg

When Presidio isn’t installed, use_ner=True is a silent no-op — the evaluator just runs the regex/checksum pipeline.
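
The lazy-import, silent no-op behavior described above could look roughly like this. It is a sketch under assumptions: `load_optional` and `ner_entities` are hypothetical helpers, and only `AnalyzerEngine().analyze(text=..., language=...)` is taken from Presidio's public API.

```python
import importlib


def load_optional(module_name: str):
    """Import a module on first use; return None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def ner_entities(text: str, use_ner: bool = False) -> list:
    """Return NER entity types found in text, or [] when NER is off or unavailable."""
    if not use_ner:
        return []
    presidio = load_optional("presidio_analyzer")
    if presidio is None:
        return []  # silent no-op: the caller falls back to regex/checksum results
    analyzer = presidio.AnalyzerEngine()
    results = analyzer.analyze(text=text, language="en")
    return [result.entity_type for result in results]


print(ner_entities("Seen by Dr Alice Jones", use_ner=False))
```

Deferring the import to call time means installations without Presidio never pay its import cost and never see an ImportError.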

Sample output

PII detected (3 type(s)):
  patient_name: "John Smith" (1 match)
  medical_record_number: "MRN 12345" (1 match)
  email: "[email protected]" (1 match)

With strict=True, decoys like 1234-5678-9012 (not Verhoeff-valid Aadhaar), 4532-0151-1283-0367 (Luhn-invalid Visa shape), and 123-45-6789 (test SSN) are dropped from the report.

SchemaEvaluator

Validates that LLM outputs conform to a Pydantic model or JSON Schema dict. Zero API calls: validation is purely local.

When to use: Structured output tasks (extraction, classification, API response generation) where you need per-field failure breakdowns rather than a binary pass/fail.

Passes when output is valid JSON matching the schema. Fails with per-field error messages.

from pydantic import BaseModel
from multivon_eval import SchemaEvaluator

class Summary(BaseModel):
    title: str
    score: float
    tags: list[str]

suite.add_evaluators(SchemaEvaluator(Summary))

# JSON Schema alternative
suite.add_evaluators(SchemaEvaluator({
    "type": "object",
    "required": ["title", "score"],
    "properties": {
        "title": {"type": "string"},
        "score": {"type": "number", "minimum": 0, "maximum": 1},
    }
}))

Parameters:

Parameter	Type	Default	Description
`schema`	`type | dict`	required	Pydantic model class or JSON Schema dict
`strict`	`bool`	`False`	If `True`, extra fields not in the schema are also treated as failures
`threshold`	`float`	`1.0`	Minimum score to pass (default: any field error = fail)

strict mode behavior: When strict=False (default), extra keys in the JSON output are ignored; only required fields and type constraints are checked. When strict=True, any key present in the output that is not declared in the schema is counted as a violation. Use strict mode when you need to enforce that the model doesn’t leak internal fields or hallucinate extra properties.

Strips markdown code fences automatically before parsing.

For JSON Schema, scoring is proportional: score = max(0.0, 1.0 - errors/10). For Pydantic models, any validation error returns score 0.0.

Sample output:

Schema validation failed:
  score: Input should be a valid number
  tags: Field required
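
The fence stripping, strict-mode extra-key check, and proportional JSON Schema score described in this section can be sketched as follows. Only the formula `max(0.0, 1.0 - errors/10)` comes from the text above; the helper names and the fence regex are assumptions, not the library's implementation.

```python
import json
import re

TICKS = "`" * 3  # built programmatically so this example can sit inside a fence
FENCE = re.compile(
    r"^\s*" + TICKS + r"[\w-]*[ \t]*\n(.*?)\n?[ \t]*" + TICKS + r"\s*$",
    re.DOTALL,
)


def strip_code_fences(text: str) -> str:
    """Return the body of a single fenced block, or the stripped text if unfenced."""
    match = FENCE.match(text)
    return match.group(1) if match else text.strip()


def extra_keys(output: dict, schema: dict) -> list:
    """Keys present in the output but not declared under the schema's properties."""
    return sorted(set(output) - set(schema.get("properties", {})))


def json_schema_score(error_count: int) -> float:
    """Proportional score: each error costs 0.1, floored at 0.0."""
    return max(0.0, 1.0 - error_count / 10)


schema = {"type": "object", "properties": {"title": {"type": "string"}}}
raw = TICKS + 'json\n{"title": "x", "debug": 1}\n' + TICKS
parsed = json.loads(strip_code_fences(raw))
violations = extra_keys(parsed, schema)  # counted as errors only in strict mode
print(parsed, violations, json_schema_score(len(violations)))
```

With the default threshold of 1.0, a single error already fails the check; the proportional score matters only when the threshold is lowered.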

ComplianceReporter

Not an evaluator but a report writer. Produces tamper-evident NDJSON audit trails. See the Compliance & Privacy guide for full documentation.
