# AI21 Labs
A multi-agent analysis pipeline using AI21 Labs' Jamba models -- a hybrid Mamba-Transformer architecture. The orchestrator dispatches a fast analyzer backed by `jamba-1.5-mini` and a deep synthesizer backed by `jamba-1.5-large`, demonstrating tiered model selection for a cost-quality tradeoff within a single trace.
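The tiered fan-out described above can be sketched with plain `asyncio`, independent of any SDK. The agent functions here are hypothetical stand-ins, not the demo's real implementations:

```python
import asyncio

# Hypothetical stand-ins for the two child agents: a fast analyzer
# (the jamba-1.5-mini tier) whose output feeds a deeper synthesizer
# (the jamba-1.5-large tier).
async def fast_analyze(query: str) -> str:
    return f"analysis of: {query}"

async def deep_synthesize(query: str, analysis: str) -> str:
    return f"synthesis from ({analysis})"

async def orchestrate(query: str) -> dict:
    # Tier 1: run the cheap, fast analysis first.
    analysis = await fast_analyze(query)
    # Tier 2: the expensive synthesizer only sees the analyzer's output.
    detail = await deep_synthesize(query, analysis)
    return {"analysis": analysis, "detail": detail}

result = asyncio.run(orchestrate("explain Mamba state-space models"))
```

The point of the tiering is that the large model is invoked once, on pre-digested input, rather than on every raw query.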
## Environment variables
This example requires `AI21_API_KEY`, `WAXELL_API_KEY`, and `WAXELL_API_URL`. Use `--dry-run` to run without any API keys.
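A typical way to supply these, assuming a POSIX shell (all values below are placeholders):

```shell
export AI21_API_KEY="your-ai21-key"
export WAXELL_API_KEY="your-waxell-key"
export WAXELL_API_URL="https://your-waxell-instance.example.com"
```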
## Architecture

## Key Code

### Jamba model size decision
The decision selects between Jamba 1.5 Mini (fast, cheap) and Jamba 1.5 Large (thorough) based on query complexity and technical depth.
```python
@waxell.decision(name="choose_jamba_model_size", options=["mini", "large"])
def choose_jamba_model_size(query_info: dict) -> dict:
    if query_info.get("is_technical") and query_info.get("word_count", 0) > 10:
        chosen = "large"
        reasoning = "Technical query with detail -- use Jamba 1.5 Large for thorough synthesis"
    elif query_info.get("word_count", 0) > 20:
        chosen = "large"
        reasoning = "Complex query -- Jamba 1.5 Large for depth"
    else:
        chosen = "mini"
        reasoning = "Standard query -- Jamba 1.5 Mini for fast analysis"
    return {"chosen": chosen, "reasoning": reasoning, "confidence": 0.85}
```
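Stripped of the decorator, the same heuristic can be exercised directly. A minimal standalone sketch (reasoning strings shortened for brevity):

```python
# The selection heuristic from above, minus the @waxell.decision
# decorator, so it can be called and inspected on its own.
def choose_model_size(query_info: dict) -> dict:
    if query_info.get("is_technical") and query_info.get("word_count", 0) > 10:
        return {"chosen": "large", "reasoning": "technical + detailed"}
    if query_info.get("word_count", 0) > 20:
        return {"chosen": "large", "reasoning": "long query"}
    return {"chosen": "mini", "reasoning": "standard query"}

print(choose_model_size({"is_technical": True, "word_count": 15})["chosen"])  # large
print(choose_model_size({"word_count": 6})["chosen"])                         # mini
```

Note the ordering: the technical check fires at a lower word count (10) than the pure-length check (20), so short but technical queries still get the large model.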
### Child agents with AI21's chat completions API

The AI21 client uses the standard `chat.completions.create()` interface, making it easy to swap with other OpenAI-compatible providers.
```python
@waxell.observe(agent_name="jamba-synthesizer", workflow_name="jamba-synthesis", capture_io=True)
async def run_jamba_synthesizer(query: str, analysis: str, client, *, dry_run=False, waxell_ctx=None) -> dict:
    waxell.tag("task", "detailed_synthesis")
    waxell.tag("model", "jamba-1.5-large")
    response = await client.chat.completions.create(
        model="jamba-1.5-large",
        messages=[
            {"role": "system", "content": "Provide a detailed technical response."},
            {"role": "user", "content": f"Analysis: {analysis}\n\nOriginal query: {query}"},
        ],
    )
    detail = response.choices[0].message.content
    depth = evaluate_jamba_response(detail)
    waxell.score("synthesis_quality", depth["depth_score"])
    return {"detail": detail, "depth": depth, "model": response.model}
```
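`evaluate_jamba_response` is defined elsewhere in the demo. A hypothetical depth scorer in the same spirit might look like the following; the keyword list and weights are illustrative, not the demo's actual logic:

```python
# Hypothetical depth evaluator: scores a response by length and by
# presence of technical vocabulary. Illustrative only -- the demo's
# real evaluate_jamba_response may use different signals.
TECH_TERMS = ("architecture", "state-space", "attention", "latency")

def evaluate_response_depth(text: str) -> dict:
    words = text.lower().split()
    term_hits = sum(1 for term in TECH_TERMS if term in words)
    # Normalize to [0, 1]: up to 0.5 for length, up to 0.5 for terminology.
    length_score = min(len(words) / 200, 1.0) * 0.5
    term_score = min(term_hits / len(TECH_TERMS), 1.0) * 0.5
    return {"depth_score": round(length_score + term_score, 3),
            "term_hits": term_hits}
```

Whatever the scoring function, its `depth_score` output is what `waxell.score("synthesis_quality", ...)` records on the trace.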
## What this demonstrates

- `@waxell.observe` -- parent orchestrator with 2 child agents
- `@waxell.step_dec` -- query preprocessing with technical topic detection
- `@waxell.decision` -- Jamba model size selection (mini vs large)
- `@waxell.reasoning_dec` -- response depth evaluation
- `waxell.tag()` -- task and model tagging
- `waxell.score()` -- analysis and synthesis quality scores
- `waxell.metadata()` -- SDK and model metadata
- AI21 Jamba architecture -- hybrid Mamba-Transformer models
## Run it

```shell
cd dev/waxell-dev
python -m app.demos.ai21_agent --dry-run
```