# AI21 Labs
A multi-agent analysis pipeline using AI21 Labs' Jamba models -- a hybrid Mamba-Transformer architecture. The orchestrator dispatches a fast analyzer backed by `jamba-1.5-mini` and a deep synthesizer backed by `jamba-1.5-large`, demonstrating tiered model selection for a cost-quality tradeoff within a single trace.
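The tiered fan-out described above can be sketched with plain `asyncio`, independent of any SDK. The agent functions here are hypothetical stand-ins, not the demo's real implementations:

```python
import asyncio

# Hypothetical stand-ins for the two child agents: a fast analyzer
# (the jamba-1.5-mini tier) whose output feeds a deeper synthesizer
# (the jamba-1.5-large tier).
async def fast_analyze(query: str) -> str:
    return f"analysis of: {query}"

async def deep_synthesize(query: str, analysis: str) -> str:
    return f"synthesis from ({analysis})"

async def orchestrate(query: str) -> dict:
    # Tier 1: run the cheap, fast analysis first.
    analysis = await fast_analyze(query)
    # Tier 2: the expensive synthesizer only sees the analyzer's output.
    detail = await deep_synthesize(query, analysis)
    return {"analysis": analysis, "detail": detail}

result = asyncio.run(orchestrate("explain Mamba state-space models"))
```

The point of the tiering is that the large model is invoked once, on pre-digested input, rather than on every raw query.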
## Environment variables
This example requires `AI21_API_KEY`, `WAXELL_API_KEY`, and `WAXELL_API_URL`. Use `--dry-run` to run without any API keys.
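A typical way to supply these, assuming a POSIX shell (all values below are placeholders):

```shell
export AI21_API_KEY="your-ai21-key"
export WAXELL_API_KEY="your-waxell-key"
export WAXELL_API_URL="https://your-waxell-instance.example.com"
```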
## Architecture

## Key Code

### Jamba model size decision
The decision selects between Jamba 1.5 Mini (fast, cheap) and Jamba 1.5 Large (thorough) based on query complexity and technical depth.
```python
@waxell.decision(name="choose_jamba_model_size", options=["mini", "large"])
def choose_jamba_model_size(query_info: dict) -> dict:
    if query_info.get("is_technical") and query_info.get("word_count", 0) > 10:
        chosen = "large"
        reasoning = "Technical query with detail -- use Jamba 1.5 Large for thorough synthesis"
    elif query_info.get("word_count", 0) > 20:
        chosen = "large"
        reasoning = "Complex query -- Jamba 1.5 Large for depth"
    else:
        chosen = "mini"
        reasoning = "Standard query -- Jamba 1.5 Mini for fast analysis"
    return {"chosen": chosen, "reasoning": reasoning, "confidence": 0.85}
```
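Stripped of the decorator, the same heuristic can be exercised directly. A minimal standalone sketch (reasoning strings shortened for brevity):

```python
# The selection heuristic from above, minus the @waxell.decision
# decorator, so it can be called and inspected on its own.
def choose_model_size(query_info: dict) -> dict:
    if query_info.get("is_technical") and query_info.get("word_count", 0) > 10:
        return {"chosen": "large", "reasoning": "technical + detailed"}
    if query_info.get("word_count", 0) > 20:
        return {"chosen": "large", "reasoning": "long query"}
    return {"chosen": "mini", "reasoning": "standard query"}

print(choose_model_size({"is_technical": True, "word_count": 15})["chosen"])  # large
print(choose_model_size({"word_count": 6})["chosen"])                         # mini
```

Note the ordering: the technical check fires at a lower word count (10) than the pure-length check (20), so short but technical queries still get the large model.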
### Child agents with AI21's chat completions API

The AI21 client uses the standard `chat.completions.create()` interface, making it easy to swap with other OpenAI-compatible providers.
```python
@waxell.observe(agent_name="jamba-synthesizer", workflow_name="jamba-synthesis", capture_io=True)
async def run_jamba_synthesizer(query: str, analysis: str, client, *, dry_run=False, waxell_ctx=None) -> dict:
    waxell.tag("task", "detailed_synthesis")
    waxell.tag("model", "jamba-1.5-large")
    response = await client.chat.completions.create(
        model="jamba-1.5-large",
        messages=[
            {"role": "system", "content": "Provide a detailed technical response."},
            {"role": "user", "content": f"Analysis: {analysis}\n\nOriginal query: {query}"},
        ],
    )
    detail = response.choices[0].message.content
    depth = evaluate_jamba_response(detail)
    waxell.score("synthesis_quality", depth["depth_score"])
    return {"detail": detail, "depth": depth, "model": response.model}
```
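`evaluate_jamba_response` is defined elsewhere in the demo. A hypothetical depth scorer in the same spirit might look like the following; the keyword list and weights are illustrative, not the demo's actual logic:

```python
# Hypothetical depth evaluator: scores a response by length and by
# presence of technical vocabulary. Illustrative only -- the demo's
# real evaluate_jamba_response may use different signals.
TECH_TERMS = ("architecture", "state-space", "attention", "latency")

def evaluate_response_depth(text: str) -> dict:
    words = text.lower().split()
    term_hits = sum(1 for term in TECH_TERMS if term in words)
    # Normalize to [0, 1]: up to 0.5 for length, up to 0.5 for terminology.
    length_score = min(len(words) / 200, 1.0) * 0.5
    term_score = min(term_hits / len(TECH_TERMS), 1.0) * 0.5
    return {"depth_score": round(length_score + term_score, 3),
            "term_hits": term_hits}
```

Whatever the scoring function, its `depth_score` output is what `waxell.score("synthesis_quality", ...)` records on the trace.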
## What this demonstrates

- `@waxell.observe` -- parent orchestrator with 2 child agents
- `@waxell.step_dec` -- query preprocessing with technical topic detection
- `@waxell.decision` -- Jamba model size selection (mini vs large)
- `@waxell.reasoning_dec` -- response depth evaluation
- `waxell.tag()` -- task and model tagging
- `waxell.score()` -- analysis and synthesis quality scores
- `waxell.metadata()` -- SDK and model metadata
- AI21 Jamba architecture -- hybrid Mamba-Transformer models
## Run it

```shell
cd dev/waxell-dev
python -m app.demos.ai21_agent --dry-run
```