Cohere

A multi-agent classify-and-generate pipeline using Cohere's V2 chat API. The orchestrator dispatches a classifier child agent backed by command-r for fast intent classification, and a generator child agent backed by command-r-plus for detailed, recommendation-rich response generation. Demonstrates Cohere's V2 API with client.v2.chat().

Environment variables

This example requires CO_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to run without any API keys.
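For a live (non-dry-run) invocation, the three variables can be exported before launching. The values below are placeholders, not real credentials:

```shell
# Placeholder values -- substitute your real credentials before a live run.
export CO_API_KEY="<your-cohere-api-key>"
export WAXELL_API_KEY="<your-waxell-api-key>"
export WAXELL_API_URL="<your-waxell-endpoint>"
```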

Architecture

Key Code

Cohere V2 chat API integration

The child agents use Cohere's V2 API with client.v2.chat(), accessing response text via response.message.content[0].text.

@waxell.observe(agent_name="cohere-classifier", workflow_name="intent-classification", capture_io=True)
async def run_classifier(query: str, client, *, dry_run=False, waxell_ctx=None) -> dict:
    waxell.tag("task", "classification")
    waxell.tag("model", "command-r")

    response = await client.v2.chat(
        model="command-r",
        messages=[
            {"role": "system", "content": "Classify the query into: technical, general, opinion, or research."},
            {"role": "user", "content": query},
        ],
    )
    classification = response.message.content[0].text
    waxell.score("classification_confidence", 0.85)
    return {"classification": classification, "model": response.model}

Response depth decision and quality assessment

The orchestrator decides on response depth (brief, standard, or comprehensive) before dispatch, and the generator assesses quality after generation.

@waxell.decision(name="choose_response_depth", options=["brief", "standard", "comprehensive"])
def choose_response_depth(query_info: dict) -> dict:
    wc = query_info.get("word_count", 0)
    if wc < 8:
        chosen, reasoning = "brief", "Short query -- concise response"
    elif wc < 20:
        chosen, reasoning = "standard", "Medium query -- standard depth"
    else:
        chosen, reasoning = "comprehensive", "Complex query -- comprehensive response"
    return {"chosen": chosen, "reasoning": reasoning, "confidence": 0.85}


@waxell.reasoning_dec(step="assess_response_quality")
def assess_response_quality(answer: str) -> dict:
    word_count = len(answer.split())
    has_examples = any(w in answer.lower() for w in ["example", "for instance", "such as"])
    has_recommendations = any(w in answer.lower() for w in ["recommend", "suggest", "best practice"])
    quality = 0.6 + (0.15 if has_examples else 0) + (0.15 if has_recommendations else 0)
    return {"quality_score": round(min(quality, 1.0), 2)}
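Stripped of the waxell decorators, both heuristics are plain Python and can be exercised directly. A quick sketch using undecorated copies (for illustration only, not the demo's actual entry points):

```python
# Undecorated copies of the two heuristics above, for quick local checks.
def choose_depth(word_count: int) -> str:
    if word_count < 8:
        return "brief"
    return "standard" if word_count < 20 else "comprehensive"

def quality_score(answer: str) -> float:
    has_examples = any(w in answer.lower() for w in ["example", "for instance", "such as"])
    has_recs = any(w in answer.lower() for w in ["recommend", "suggest", "best practice"])
    return round(min(0.6 + (0.15 if has_examples else 0) + (0.15 if has_recs else 0), 1.0), 2)

print(choose_depth(5))                                # -> brief
print(quality_score("For instance, I recommend X."))  # -> 0.9
```

A base score of 0.6 plus two 0.15 bonuses caps the heuristic at 0.9, so only the explicit min() guard keeps the return value within [0, 1] if more bonuses are added later.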

What this demonstrates

  • @waxell.observe -- parent orchestrator with 2 child agents
  • @waxell.step_dec -- query preprocessing
  • @waxell.decision -- response depth selection
  • @waxell.reasoning_dec -- quality assessment with domain-specific heuristics
  • waxell.tag() -- task and model tagging
  • waxell.score() -- classification and response quality scores
  • waxell.metadata() -- SDK and model metadata
  • Cohere V2 API -- client.v2.chat() with structured message format

Run it

cd dev/waxell-dev
python -m app.demos.cohere_agent --dry-run

Source

dev/waxell-dev/app/demos/cohere_agent.py