AWS Bedrock

A multi-agent pipeline using Amazon Bedrock's Converse API with Amazon Nova models. The orchestrator preprocesses the query, decides the Nova model variant (lite/pro/micro), dispatches a classifier child agent using amazon.nova-lite-v1:0 and a synthesizer child agent using amazon.nova-pro-v1:0, then evaluates response quality. Demonstrates manual record_llm_call() for boto3's synchronous API (which lacks auto-instrumentation).
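The pipeline's control flow can be sketched in plain Python. Every name below is a hypothetical stand-in for illustration, not the demo's actual code:

```python
# Illustrative control-flow sketch of the pipeline described above.
# All function names here are hypothetical stand-ins.

def choose_variant(query: str) -> str:
    # Route by rough query length, as the decision step does.
    n = len(query.split())
    return "micro" if n < 10 else ("lite" if n < 30 else "pro")

def classify(query: str) -> str:
    # Stand-in for the nova-lite classifier child agent.
    return "question" if query.endswith("?") else "statement"

def synthesize(query: str, classification: str) -> str:
    # Stand-in for the nova-pro synthesizer child agent.
    return f"[{classification}] answer for: {query}"

def run_pipeline(query: str) -> dict:
    cleaned = query.strip()               # preprocess
    variant = choose_variant(cleaned)     # lite/pro/micro decision
    label = classify(cleaned)             # child agent 1
    answer = synthesize(cleaned, label)   # child agent 2
    quality = 1.0 if label in answer else 0.0  # crude quality check
    return {"variant": variant, "classification": label,
            "answer": answer, "quality": quality}
```

The real demo wraps each of these stages in waxell decorators; this sketch only shows the ordering of the steps.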

Environment variables

This example requires AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to run without any API keys.
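For local runs these can be exported in the shell; the values below are placeholders, and the region is only an example:

```shell
# Placeholder values -- substitute your own credentials and endpoint.
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"   # example region
export WAXELL_API_KEY="..."
export WAXELL_API_URL="..."
```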

Architecture

Key Code

Bedrock Converse API with manual LLM call recording

Since boto3 is synchronous and may not be auto-instrumented, the agent uses waxell_ctx.record_llm_call() to manually record token usage and cost.

```python
@waxell.observe(agent_name="bedrock-classifier", workflow_name="bedrock-classify")
async def run_classifier(query: str, bedrock_client: object, waxell_ctx=None):
    waxell.tag("provider", "aws_bedrock")
    waxell.tag("agent_role", "classifier")

    classify_model = "amazon.nova-lite-v1:0"
    waxell.metadata("model", classify_model)

    classify_response = bedrock_client.converse(
        modelId=classify_model,
        messages=[{
            "role": "user",
            "content": [{"text": f"Classify the following query: {query}"}],
        }],
    )
    classification = classify_response["output"]["message"]["content"][0]["text"]

    if waxell_ctx:
        waxell_ctx.record_llm_call(
            model=classify_model,
            tokens_in=classify_response["usage"]["inputTokens"],
            tokens_out=classify_response["usage"]["outputTokens"],
            task="classify_query",
        )
    return {"classification": classification, "model": classify_model}
```
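The response unpacking follows the Converse API's output shape. As a standalone sketch (the helper name is hypothetical), the same extraction can be exercised against a canned response dict:

```python
def unpack_converse(response: dict) -> dict:
    """Extract the first text block and token usage from a Converse response.

    Assumes the response shape used above: output.message.content[0].text
    plus usage.inputTokens / usage.outputTokens.
    """
    return {
        "text": response["output"]["message"]["content"][0]["text"],
        "tokens_in": response["usage"]["inputTokens"],
        "tokens_out": response["usage"]["outputTokens"],
    }

# Canned response in the Converse output shape, for illustration only.
sample = {
    "output": {"message": {"role": "assistant",
                           "content": [{"text": "category: billing"}]}},
    "usage": {"inputTokens": 42, "outputTokens": 7},
}
```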

Nova model variant decision

The decision decorator selects between Nova Micro (lightweight), Nova Lite (balanced), and Nova Pro (premium) based on query length.

```python
@waxell.decision(name="choose_nova_model", options=["nova-lite", "nova-pro", "nova-micro"])
def choose_nova_model(query: str) -> dict:
    word_count = len(query.split())
    if word_count < 10:
        chosen = "nova-micro"
        reasoning = "Short query -- lightweight model sufficient"
    elif word_count < 30:
        chosen = "nova-lite"
        reasoning = "Medium complexity -- Nova Lite balances speed and quality"
    else:
        chosen = "nova-pro"
        reasoning = "Complex query -- Nova Pro for best synthesis quality"
    return {"chosen": chosen, "reasoning": reasoning, "confidence": 0.85}
```
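Stripped of the decorator, the routing thresholds can be exercised directly. A minimal check of the three branches (function name hypothetical):

```python
def pick_variant(query: str) -> str:
    # Same word-count thresholds as choose_nova_model above, minus the decorator.
    n = len(query.split())
    if n < 10:
        return "nova-micro"
    if n < 30:
        return "nova-lite"
    return "nova-pro"
```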

What this demonstrates

  • @waxell.observe -- parent orchestrator with 2 child agents
  • @waxell.step_dec -- query preprocessing
  • @waxell.decision -- Nova model variant selection
  • @waxell.reasoning_dec -- cross-response quality evaluation
  • record_llm_call() -- manual LLM call recording for non-auto-instrumented SDKs
  • waxell.tag() -- AWS provider and agent role tagging
  • waxell.score() -- classification and synthesis quality scores
  • waxell.metadata() -- Converse API and model metadata
  • Bedrock Converse API -- boto3's structured message format
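As a concrete sketch of the Converse API's structured message format, the request payload can be assembled separately from the call itself (the commented-out client call requires AWS credentials and Bedrock model access; the helper name is hypothetical):

```python
def build_converse_request(model_id: str, prompt: str) -> dict:
    """Assemble keyword arguments for bedrock-runtime's converse()."""
    return {
        "modelId": model_id,
        "messages": [{
            "role": "user",
            "content": [{"text": prompt}],
        }],
    }

request = build_converse_request("amazon.nova-lite-v1:0", "Classify this query")

# With credentials configured, the call would look like:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
```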

Run it

```shell
cd dev/waxell-dev
python -m app.demos.bedrock_agent --dry-run
```

Source

dev/waxell-dev/app/demos/bedrock_agent.py