Model Card Required Policy

The model-card-required policy enforces OWASP LLM03 Supply Chain and aligns with NIST AI RMF GV-1.1 (Govern → Model Risk Classification) and EU AI Act Article 13 (transparency). Every model used by an agent must have a declared model card on file with a risk tier of low, medium, high, or critical. A model with no card, or with a tier above the policy ceiling, is blocked (or flagged with a warning when action_on_violation is "warn").

Card lookup checks two sources in order and stops at the first one that returns a card. If neither does, the model is treated as uncarded:

Inline registry (rules.model_cards): simplest path for a single policy.
Injected registry function (_registry_fn): wired by the controlplane to SystemModelCost or an external registry.
Empty (neither source has a card): BLOCK when require_card is true, or WARN when action_on_violation is "warn".
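The lookup order above can be sketched in a few lines of Python. This is an illustration, not the handler's actual code: the function name lookup_card and the assumed registry signature (model ID in, card dict or None out) are hypothetical. The fallback on exceptions follows the note under Common Gotchas.

```python
from typing import Callable, Optional

Card = dict  # e.g. {"risk_tier": "low", "owner": "openai"}


def lookup_card(
    model_id: str,
    inline_cards: dict,
    registry_fn: Optional[Callable[[str], Optional[Card]]] = None,
) -> Optional[Card]:
    """Return the first card found, or None when no source has one."""
    # 1. The inline registry (rules.model_cards) is consulted first.
    card = inline_cards.get(model_id)
    if card is not None:
        return card

    # 2. Then the injected registry function, if one has been wired in.
    if registry_fn is not None:
        try:
            return registry_fn(model_id)
        except Exception:
            # A raising registry is treated as "no card found".
            return None

    # 3. No source produced a card; the caller applies require_card.
    return None
```

Returning None rather than raising keeps the block-or-warn decision in one place, where require_card and action_on_violation are applied.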

Rules

Rule	Type	Default	Description
`max_risk_tier`	string	`"medium"`	Block models with tier higher than this (`low` < `medium` < `high` < `critical`)
`allowed_models`	string[]	`[]`	Positive allowlist — non-empty means ONLY these models may be used
`blocked_models`	string[]	`[]`	Deny-list (wins over `allowed_models` and `max_risk_tier`)
`require_card`	boolean	`true`	When true, models with no declared card are blocked
`model_cards`	object	`{}`	Inline map `model_id → {risk_tier, owner, ...}`
`action_on_violation`	string	`"block"`	`"block"` or `"warn"`
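As a rough illustration of how the defaults in the table combine with a user-supplied policy, the sketch below merges overrides onto the documented defaults and rejects values outside the documented sets. The helper name resolve_rules and the choice to reject unknown keys are assumptions for this example; the real handler may be more lenient.

```python
import copy

# Defaults copied from the Rules table.
DEFAULT_RULES = {
    "max_risk_tier": "medium",
    "allowed_models": [],
    "blocked_models": [],
    "require_card": True,
    "model_cards": {},
    "action_on_violation": "block",
}
VALID_TIERS = ("low", "medium", "high", "critical")
VALID_ACTIONS = ("block", "warn")


def resolve_rules(overrides: dict) -> dict:
    """Merge overrides onto the defaults and validate enumerated values."""
    unknown = set(overrides) - set(DEFAULT_RULES)
    if unknown:
        # Assumption: unknown keys are an error rather than silently ignored.
        raise ValueError(f"unknown rule(s): {sorted(unknown)}")

    # Deep-copy so callers never mutate the shared defaults.
    rules = {**copy.deepcopy(DEFAULT_RULES), **overrides}

    if rules["max_risk_tier"] not in VALID_TIERS:
        raise ValueError(f"max_risk_tier must be one of {VALID_TIERS}")
    if rules["action_on_violation"] not in VALID_ACTIONS:
        raise ValueError(f"action_on_violation must be one of {VALID_ACTIONS}")
    return rules
```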

How It Works

The handler runs at all three phases, since models_used accumulates as the workflow progresses. Evaluation order per model:

Deny-list — blocked_models match → BLOCK.
Allowlist — if allowed_models is non-empty and the model isn't in it → BLOCK.
Card lookup — inline model_cards then injected _registry_fn. Missing card + require_card=true → BLOCK.
Risk-tier ceiling — model tier rank > max_risk_tier rank → BLOCK.
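The four checks above can be sketched as a single function that returns the first violation it finds. This is a minimal illustration under stated assumptions: only the inline model_cards source is consulted, the signal names blocked_model and not_in_allowlist are hypothetical (no_model_card and risk_tier_exceeded appear in the Observability examples), and letting an invalid tier pass when require_card is false is a guess.

```python
from typing import Optional

TIER_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def evaluate_model(model_id: str, rules: dict) -> Optional[str]:
    """Return a violation signal for one model, or None if it passes."""
    require_card = rules.get("require_card", True)

    # 1. Deny-list: wins over the allowlist and the tier ceiling.
    if model_id in rules.get("blocked_models", []):
        return "blocked_model"  # hypothetical signal name

    # 2. Allowlist: only enforced when it is non-empty.
    allowed = rules.get("allowed_models", [])
    if allowed and model_id not in allowed:
        return "not_in_allowlist"  # hypothetical signal name

    # 3. Card lookup (inline registry only in this sketch).
    card = rules.get("model_cards", {}).get(model_id)
    tier = card.get("risk_tier") if card else None
    if tier not in TIER_RANK:
        # Missing card, or a card with an invalid tier.
        return "no_model_card" if require_card else None

    # 4. Risk-tier ceiling: compare ordinal ranks.
    ceiling = rules.get("max_risk_tier", "medium")
    if TIER_RANK[tier] > TIER_RANK[ceiling]:
        return "risk_tier_exceeded"
    return None


policy = {
    "max_risk_tier": "medium",
    "allowed_models": ["gpt-4o-mini", "gpt-4o"],
    "blocked_models": ["gpt-4-32k"],
    "model_cards": {
        "gpt-4o-mini": {"risk_tier": "low"},
        "gpt-4o": {"risk_tier": "high"},
    },
}
for model in ["gpt-4o-mini", "gpt-4o", "gpt-4-32k", "gpt-4-turbo"]:
    print(model, evaluate_model(model, policy))
```

Because the checks short-circuit, a model on the deny-list is reported as blocked even if it also has a valid low-tier card.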

Phase	Behavior
`before_workflow`	Check pre-declared models
`mid_execution`	Check `models_used` accumulated so far
`after_workflow`	Final audit of all models used

Context Attributes Read

Attribute	Phase	Purpose
`context.models_used`	all	List of model IDs invoked during the run

Example Policy

Allow only carded models at or below medium risk; block gpt-4-32k and gpt-3.5-turbo-0301 outright:

{
  "max_risk_tier": "medium",
  "allowed_models": ["gpt-4o-mini", "gpt-4o", "claude-3-5-sonnet-20241022"],
  "blocked_models": ["gpt-4-32k", "gpt-3.5-turbo-0301"],
  "require_card": true,
  "model_cards": {
    "gpt-4o-mini": {"risk_tier": "low", "owner": "openai"},
    "gpt-4o": {"risk_tier": "medium", "owner": "openai"},
    "claude-3-5-sonnet-20241022": {"risk_tier": "medium", "owner": "anthropic"}
  },
  "action_on_violation": "block"
}

SDK Integration

import waxell_observe as waxell

waxell.init()

@waxell.observe(agent_name="research-agent", enforce_policy=True)
async def research(query: str) -> str:
    # Each LLM call adds its model ID to context.models_used.
    # If an uncarded or over-tier model is invoked, the mid_execution check blocks the run.
    # run_research is a placeholder for your own application code.
    return await run_research(query)

Observability

Field	Example
Category	`model-card-required`
Action	`block`
Reason	"Model 'gpt-4-turbo-preview' has no declared model card. OWASP LLM03 requires risk classification."
Metadata	`{"signal": "no_model_card", "model": "gpt-4-turbo-preview", "owasp": "LLM03"}`

Risk-tier exceeded:

Field	Example
Reason	"Model 'gpt-4o' risk tier 'high' exceeds policy ceiling 'medium'."
Metadata	`{"signal": "risk_tier_exceeded", "model": "gpt-4o", "model_risk_tier": "high", "policy_max_tier": "medium"}`
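For consumers that parse these records, the sketch below shows how the two violations might be assembled. The reason strings and metadata keys are copied from the examples above; the surrounding dict shape (category, action, reason, metadata as top-level keys) and the function names are assumptions for illustration.

```python
CATEGORY = "model-card-required"


def no_card_violation(model: str, action: str = "block") -> dict:
    """Build a record for a model that has no declared card."""
    return {
        "category": CATEGORY,
        "action": action,  # "block" or "warn", per action_on_violation
        "reason": (
            f"Model '{model}' has no declared model card. "
            "OWASP LLM03 requires risk classification."
        ),
        "metadata": {"signal": "no_model_card", "model": model, "owasp": "LLM03"},
    }


def tier_exceeded_violation(
    model: str, model_tier: str, max_tier: str, action: str = "block"
) -> dict:
    """Build a record for a model whose tier is above the policy ceiling."""
    return {
        "category": CATEGORY,
        "action": action,
        "reason": (
            f"Model '{model}' risk tier '{model_tier}' "
            f"exceeds policy ceiling '{max_tier}'."
        ),
        "metadata": {
            "signal": "risk_tier_exceeded",
            "model": model,
            "model_risk_tier": model_tier,
            "policy_max_tier": max_tier,
        },
    }
```

Filtering on metadata.signal rather than on the reason text is likely the more robust choice, since the reason is written for human readers.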

Common Gotchas

Risk-tier comparison uses ordinal rank. low=0, medium=1, high=2, critical=3. max_risk_tier="medium" permits low and medium and blocks high and critical.
Card lookup is case-insensitive but exact-id-first. gpt-4o is checked literally, then case-folded against keys. Aliases (e.g., openai/gpt-4o) won't auto-resolve — register both forms.
require_card=false is rarely advisable. It permits uncarded models silently and defeats the OWASP LLM03 goal.
A card with an invalid risk_tier is treated like a missing card when require_card=true. Stick to the four canonical tier strings.
An empty models_used always passes. If your runtime doesn't populate models_used, this policy is effectively a no-op.
The injected _registry_fn is set class-wide. Controlplane wires it once during app startup; if it raises, the handler logs and treats the result as "no card found".
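Several of these gotchas are easier to see in code. The sketch below illustrates the exact-then-case-folded key match, the ordinal tier comparison, and the handling of invalid tier strings. The helper names are hypothetical and the real handler may differ in detail.

```python
from typing import Optional

TIER_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def find_card(model_id: str, cards: dict) -> Optional[dict]:
    """Exact key match first, then a case-folded match. No alias resolution."""
    if model_id in cards:
        return cards[model_id]
    folded = model_id.casefold()
    for key, card in cards.items():
        if key.casefold() == folded:
            return card
    return None


def usable_tier(card: Optional[dict]) -> Optional[str]:
    """Return the card's tier only if it is one of the four canonical strings."""
    tier = card.get("risk_tier") if card else None
    return tier if tier in TIER_RANK else None


def within_ceiling(tier: str, max_tier: str) -> bool:
    """True when the tier's rank is at or below the ceiling's rank."""
    return TIER_RANK[tier] <= TIER_RANK[max_tier]


cards = {"gpt-4o": {"risk_tier": "medium"}, "legacy": {"risk_tier": "Severe"}}
print(find_card("GPT-4o", cards))         # found through the case-folded match
print(find_card("openai/gpt-4o", cards))  # None: aliases are not resolved
print(usable_tier(cards["legacy"]))       # None: not a canonical tier string
print(within_ceiling("medium", "medium"), within_ceiling("high", "medium"))
```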

Next Steps

Policy Categories — All 49 categories
LLM Policy — Model allowlist + per-call constraints
Compliance Policy — Meta-validator for regulatory frameworks
Provenance Required — Per-claim citation enforcement (LLM09b)
