Data Access Policy

The data-access policy category controls which data sources an agent may access. Use it to prevent agents from touching sensitive databases, enforce read-only access to production data, or cap the number of records an agent can pull per query.

Rules

| Rule | Type | Default | Description |
|---|---|---|---|
| allowed_data_sources | string[] | [] | If non-empty, agents may only access sources in this list. Acts as an allowlist. |
| blocked_data_sources | string[] | [] | Sources that agents are never allowed to access, regardless of the allowlist. |
| read_only_sources | string[] | [] | Sources that agents may read from but not write to. |
| max_records_per_query | integer | 1000 | Maximum number of records an agent may retrieve in a single query. Violations produce a WARN (never a block). |
| action_on_violation | string | "block" | "block" raises a PolicyViolationError; "warn" logs the violation and lets the agent continue. Applies to source-level violations only — record-limit violations always WARN. |

How It Works

The data-access handler runs at three phases:

before_workflow

Runs before the agent does any work. Checks context.data_sources_configured — if the agent is pre-configured to use a blocked source, it is stopped before any LLM call or tool use occurs.

mid_execution

Triggered every time the agent calls ctx.record_data_access(...). This is where source-level and write violations are caught:

  1. For each source in context.data_sources_accessed:
    • If the source appears in blocked_data_sources → violation
    • If allowed_data_sources is non-empty and the source is not in it → violation
  2. For each source in context.data_sources_written:
    • If the source appears in read_only_sources → violation
  3. If context.records_queried exceeds max_records_per_query → WARN (agent continues regardless of action_on_violation)
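As a sketch, the mid_execution pass above could be written like this. This is a hypothetical re-implementation for illustration only; the real handler lives inside the SDK and its internals may differ.

```python
# Illustrative re-implementation of the mid_execution checks (not SDK code).
def evaluate_access(rules, accessed, written, records_queried):
    """Return (violations, warnings) for one mid_execution pass."""
    violations, warnings = [], []
    allowed = rules.get("allowed_data_sources", [])
    blocked = rules.get("blocked_data_sources", [])
    read_only = rules.get("read_only_sources", [])
    limit = rules.get("max_records_per_query", 1000)

    for source in accessed:
        if source in blocked:  # blocklist wins; checked first
            violations.append(f"Access to blocked data source '{source}'")
        elif allowed and source not in allowed:  # allowlist applies only when non-empty
            violations.append(f"Data source '{source}' is not in allowed list")

    for source in written:
        if source in read_only:
            violations.append(f"Write to read-only data source '{source}'")

    if records_queried > limit:  # always a warning, never a block
        warnings.append(f"Records queried ({records_queried}) exceeds limit ({limit})")

    return violations, warnings
```

Note how the blocklist check precedes the allowlist check, and how the record-limit result goes into a separate warnings list that is never escalated to a block.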

after_workflow

Runs after the agent completes. Produces a final audit summary listing sources accessed and written. Warnings are emitted if blocked or read-only sources were accessed during execution (belt-and-suspenders check after mid_execution).

Rule Evaluation Order

| Check | When triggered | Configurable action |
|---|---|---|
| Blocked source | mid_execution, per record_data_access call | action_on_violation |
| Not in allowlist | mid_execution, per record_data_access call | action_on_violation |
| Write to read-only source | mid_execution, per record_data_access call | action_on_violation |
| Record limit | mid_execution, per record_data_access call | Always WARN |

Blocked sources are checked before allowlist membership. A source that appears in both allowed_data_sources and blocked_data_sources is always blocked.

Example Policies

Customer Data Policy (strict)

Allow reads from approved sources only; block HR and payroll entirely; make the production database read-only:

```json
{
  "allowed_data_sources": ["postgres", "redis", "product_catalog"],
  "blocked_data_sources": ["hr_records", "payroll"],
  "read_only_sources": ["postgres"],
  "max_records_per_query": 1000,
  "action_on_violation": "block"
}
```

Analytics Agent (high volume, warn on violations)

Allow large record pulls but log violations rather than blocking:

```json
{
  "allowed_data_sources": ["analytics_db", "data_warehouse"],
  "blocked_data_sources": ["pii_store"],
  "read_only_sources": [],
  "max_records_per_query": 50000,
  "action_on_violation": "warn"
}
```

Internal-Only Agent (blocklist only)

Block specific sensitive sources without restricting everything else:

```json
{
  "allowed_data_sources": [],
  "blocked_data_sources": ["hr_records", "payroll", "executive_compensation"],
  "read_only_sources": [],
  "max_records_per_query": 5000,
  "action_on_violation": "block"
}
```
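Before uploading a policy, it can help to sanity-check the rules dict locally. The following validator is not part of the SDK; it is a sketch that mirrors the field names and types from the Rules table above, and flags the allowlist/blocklist overlap case (where the blocklist wins).

```python
# Hypothetical local validator for a data-access rules dict (not SDK code).
def validate_rules(rules: dict) -> list:
    problems = []
    list_fields = ["allowed_data_sources", "blocked_data_sources", "read_only_sources"]
    for field in list_fields:
        value = rules.get(field, [])
        if not isinstance(value, list) or not all(isinstance(s, str) for s in value):
            problems.append(f"{field} must be a list of strings")
    if not isinstance(rules.get("max_records_per_query", 1000), int):
        problems.append("max_records_per_query must be an integer")
    if rules.get("action_on_violation", "block") not in ("block", "warn"):
        problems.append('action_on_violation must be "block" or "warn"')
    # Not an error, but worth surfacing: blocked always beats allowed.
    overlap = set(rules.get("allowed_data_sources", [])) & set(rules.get("blocked_data_sources", []))
    if overlap:
        problems.append(f"sources in both allow and block lists (blocklist wins): {sorted(overlap)}")
    return problems
```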

SDK Integration

Recording Data Access Events

Call ctx.record_data_access() after each data operation. The handler evaluates the access immediately at mid_execution:

```python
import waxell_observe as waxell
from waxell_observe.errors import PolicyViolationError

waxell.init()

try:
    async with waxell.WaxellContext(
        agent_name="data-agent",
        enforce_policy=True,
    ) as ctx:

        # Read from a data source — triggers mid_execution governance
        rows = db.query("SELECT * FROM customers LIMIT 500")
        ctx.record_data_access(
            source="postgres",
            operation="read",
            records=len(rows),
        )

        # Write to a data source
        db.execute("UPDATE customers SET status = 'active' WHERE id = ?", customer_id)
        ctx.record_data_access(
            source="postgres",
            operation="write",
            records=1,
        )

        ctx.set_result({"rows": rows})

except PolicyViolationError as e:
    print(f"Data access blocked: {e}")
    # e.g. "Write to read-only data source 'postgres'"
    # e.g. "Access to blocked data source 'hr_records'"
    # e.g. "Data source 'staging_db' is not in allowed list"
```

Method Signature

```python
ctx.record_data_access(
    source: str,       # Data source name — must match your policy config exactly
    operation: str,    # "read" or "write"
    records: int = 0,  # Number of records accessed/modified
) -> None
```

The source name is compared exactly (case-sensitive) against allowed_data_sources, blocked_data_sources, and read_only_sources. Use consistent naming conventions across your codebase.
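One way to keep names consistent is to define each source string once and reference it everywhere, so the value passed to ctx.record_data_access() always matches the policy configuration exactly. The module below is a hypothetical convention, not part of the SDK; the source names are taken from the examples above.

```python
# Hypothetical constants module: define each policy source name exactly once.
class DataSources:
    POSTGRES = "postgres"
    REDIS = "redis"
    PRODUCT_CATALOG = "product_catalog"

# Usage (sketch):
# ctx.record_data_access(source=DataSources.POSTGRES, operation="read", records=10)
```

This avoids the case-sensitivity gotcha: a stray "Postgres" can no longer slip into one call site while the policy says "postgres".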

Using the Decorator

```python
@waxell.observe(
    agent_name="data-agent",
    enforce_policy=True,
)
async def run_query(ctx, query: str):
    rows = db.query(query)
    ctx.record_data_access(source="postgres", operation="read", records=len(rows))
    return rows
```

Enforcement Flow

```
Agent starts (WaxellContext.__aenter__)
│
└── before_workflow governance
    └── Check data_sources_configured vs blocked_data_sources
        └── Pre-configured blocked source? → BLOCK (always, regardless of action_on_violation)

Agent calls ctx.record_data_access(source="hr_records", operation="read", records=200)
│
└── mid_execution governance
    ├── source in blocked_data_sources? → action_on_violation (BLOCK or WARN)
    ├── allowed_data_sources non-empty AND source not in it? → action_on_violation
    ├── source in read_only_sources AND operation == "write"? → action_on_violation
    └── records_queried > max_records_per_query? → always WARN

Agent completes
│
└── after_workflow governance
    └── Audit summary — warns if blocked or read-only sources were accessed
```

Creating via Dashboard

  1. Navigate to Governance > Policies
  2. Click New Policy
  3. Select category Data Access
  4. Configure source lists, record limit, and violation action
  5. Set scope to target specific agents (e.g., data-access-agent)
  6. Enable

Creating via API

```bash
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  https://acme.waxell.dev/waxell/v1/policies/ \
  -d '{
    "name": "Customer Data Policy",
    "category": "data-access",
    "rules": {
      "allowed_data_sources": ["postgres", "redis", "product_catalog"],
      "blocked_data_sources": ["hr_records", "payroll"],
      "read_only_sources": ["postgres"],
      "max_records_per_query": 1000,
      "action_on_violation": "block"
    },
    "scope": {
      "agents": ["data-access-agent"]
    },
    "enabled": true
  }'
```
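The same payload can be built in Python and posted with any HTTP client. The snippet below only constructs and serializes the request body; the endpoint URL and bearer token are the placeholders from the curl example above.

```python
# Build the policy payload in Python (sketch; no network call made here).
import json

policy = {
    "name": "Customer Data Policy",
    "category": "data-access",
    "rules": {
        "allowed_data_sources": ["postgres", "redis", "product_catalog"],
        "blocked_data_sources": ["hr_records", "payroll"],
        "read_only_sources": ["postgres"],
        "max_records_per_query": 1000,
        "action_on_violation": "block",
    },
    "scope": {"agents": ["data-access-agent"]},
    "enabled": True,
}

body = json.dumps(policy)
# Then, with a client such as requests:
# requests.post("https://acme.waxell.dev/waxell/v1/policies/",
#               headers={"Authorization": f"Bearer {token}",
#                        "Content-Type": "application/json"},
#               data=body)
```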

Observability

Governance Tab

Data access evaluations appear with:

| Field | Example |
|---|---|
| Policy name | Customer Data Policy |
| Action | allow, warn, or block |
| Category | data-access |
| Reason | "Access to blocked data source 'hr_records'" |
| Metadata | {"blocked_source": "hr_records"} |

For allow cases:

| Field | Example |
|---|---|
| Reason | "Data access within policy (2 source(s) accessed)" |
| Metadata | {"sources_accessed": ["postgres", "redis"], "sources_written": []} |

Record Limit Warnings

When records_queried exceeds max_records_per_query, the governance tab shows:

| Field | Example |
|---|---|
| Action | warn |
| Reason | "Records queried (15000) exceeds limit (1000)" |
| Metadata | {"records_queried": 15000, "limit": 1000} |

Record limit warnings never stop the agent — the action_on_violation setting does not apply to them.

Common Gotchas

  1. allowed_data_sources is an allowlist when non-empty. An empty list means "no restriction." As soon as you add one entry, all other sources are blocked (unless action_on_violation is "warn").

  2. blocked_data_sources is checked before allowed_data_sources. A source in both lists is always blocked. This makes blocklists safe to use alongside allowlists without interaction surprises.

  3. max_records_per_query always WARNS, never blocks. The handler hardcodes WARN for record limit violations. Setting action_on_violation: "block" does not change this behavior. Use source-level controls (allowlists and blocklists) for hard enforcement.

  4. Source names are case-sensitive and exact-matched. "Postgres" and "postgres" are different sources. Use consistent lowercase naming in your ctx.record_data_access() calls and policy configuration.

  5. write in operation populates data_sources_written, not data_sources_accessed. Read-only enforcement only fires when the operation is "write". If you accidentally pass operation="read" for a write operation, the read-only check is bypassed.

  6. Each record_data_access() call triggers mid_execution immediately. The handler evaluates the entire accumulated access buffer on every call. If a second access is the violating one, the first access is still recorded in the trace.

  7. before_workflow only checks data_sources_configured. This field is rarely populated in practice — it requires the agent framework to pre-declare which sources it uses. Most enforcement happens at mid_execution.
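Gotcha 5 is easiest to see with a minimal mock of the context's bookkeeping. This is a hypothetical sketch of the internals, for illustration only: the point is that only operation="write" lands in data_sources_written, so a write mislabeled as a read never reaches the read-only check.

```python
# Minimal mock of the context's access bookkeeping (hypothetical, not SDK code).
class MockContext:
    def __init__(self):
        self.data_sources_accessed = set()
        self.data_sources_written = set()

    def record_data_access(self, source, operation, records=0):
        self.data_sources_accessed.add(source)
        if operation == "write":  # only writes feed the read-only check
            self.data_sources_written.add(source)

ctx = MockContext()
# A write mislabeled as a read:
ctx.record_data_access("postgres", operation="read")
assert "postgres" not in ctx.data_sources_written  # read-only check never fires
```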

Combining with Other Policies

The data-access policy works well alongside:

  • Audit policy — logs every data access with timestamp and user for compliance records
  • Compliance policy — HIPAA and PCI-DSS compliance profiles often require a data-access policy in required_categories
  • Scope policy — combine with data-access to limit both which sources and how many records can be modified in a single run

Next Steps