# HuggingFace

A multi-agent summarize-and-explain pipeline using HuggingFace's Inference API with the `meta-llama/Llama-3.2-3B-Instruct` model. The orchestrator dispatches a summarizer child agent for concise topic summaries and an explainer child agent that evaluates the summary quality before generating a detailed technical explanation.
## Environment variables

This example requires `HF_TOKEN`, `WAXELL_API_KEY`, and `WAXELL_API_URL`. Use `--dry-run` to run without any API keys.
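A typical setup might look like the following (the values shown are placeholders, not real credentials; the URL is an assumed example of a Waxell deployment):

```shell
# Placeholder values -- substitute your real HuggingFace token and Waxell credentials.
export HF_TOKEN="hf_xxxxxxxxxxxxxxxx"
export WAXELL_API_KEY="wx_xxxxxxxxxxxxxxxx"
export WAXELL_API_URL="https://waxell.example.com"
```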
## Architecture
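The dispatch order described above (summarize first, then explain using the summary) can be sketched as follows. The stand-in functions here are hypothetical simplifications; the real child agents are decorated with `@waxell.observe` and call the HuggingFace Inference API, and `run_explainer` is an assumed name for the explainer agent:

```python
import asyncio

# Hypothetical stand-ins for the real decorated child agents.
async def run_summarizer(query: str, client=None) -> dict:
    return {"summary": f"summary of {query}"}

async def run_explainer(summary: str, client=None) -> dict:
    return {"explanation": f"explanation based on: {summary}"}

async def orchestrate(query: str, client=None) -> dict:
    # Dispatch order: summarizer first, then explainer consuming the summary.
    summary = (await run_summarizer(query, client))["summary"]
    explanation = (await run_explainer(summary, client))["explanation"]
    return {"summary": summary, "explanation": explanation}

result = asyncio.run(orchestrate("transformer attention"))
```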
## Key Code

### HuggingFace Inference API integration

The child agents use HuggingFace's `text_generation` API, which takes a prompt string and returns generated text directly.
```python
@waxell.observe(agent_name="hf-summarizer", workflow_name="hf-summarization", capture_io=True)
async def run_summarizer(query: str, client, *, dry_run=False, waxell_ctx=None) -> dict:
    waxell.tag("task", "summarization")
    waxell.tag("model", "meta-llama/Llama-3.2-3B-Instruct")

    prompt = f"Summarize the following topic in 2-3 sentences: {query}"
    summary = client.text_generation(prompt, max_new_tokens=150)

    waxell.score("summary_quality", 0.80, comment="HF summary quality")
    return {"summary": summary, "model": "meta-llama/Llama-3.2-3B-Instruct"}
```
### Summary quality assessment before explanation

The explainer evaluates the summary's quality before generating a detailed explanation, checking for key technical terms and conciseness.
```python
@waxell.reasoning_dec(step="assess_summary_quality")
def assess_summary_quality(summary: str) -> dict:
    word_count = len(summary.split())
    has_key_terms = any(w in summary.lower() for w in ["mechanism", "layer", "weight", "query", "key", "value"])
    is_concise = 20 < word_count < 100
    quality = 0.5 + (0.2 if has_key_terms else 0) + (0.15 if is_concise else 0)
    return {
        "word_count": word_count,
        "has_key_terms": has_key_terms,
        "quality_score": round(min(quality, 1.0), 2),
    }
```
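Standalone, the scoring logic behaves like this (decorator omitted so the function runs by itself; the sample summary is illustrative):

```python
def assess_summary_quality(summary: str) -> dict:
    # Same scoring logic as above, without the @waxell.reasoning_dec decorator.
    word_count = len(summary.split())
    has_key_terms = any(w in summary.lower() for w in ["mechanism", "layer", "weight", "query", "key", "value"])
    is_concise = 20 < word_count < 100
    quality = 0.5 + (0.2 if has_key_terms else 0) + (0.15 if is_concise else 0)
    return {
        "word_count": word_count,
        "has_key_terms": has_key_terms,
        "quality_score": round(min(quality, 1.0), 2),
    }

sample = (
    "Self-attention is the core mechanism of transformers. Each token produces "
    "query, key, and value vectors, and attention weights determine how much "
    "each token attends to every other token in the sequence."
)
result = assess_summary_quality(sample)
# Concise (31 words) and contains key terms, so quality_score is 0.85.
```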
## What this demonstrates

- `@waxell.observe` -- parent orchestrator with 2 child agents
- `@waxell.step_dec` -- query preprocessing with technical topic detection
- `@waxell.decision` -- model size selection (small, medium, large)
- `@waxell.reasoning_dec` -- summary quality evaluation before the next step
- `waxell.tag()` -- task and model tagging
- `waxell.score()` -- summary and explanation quality scores
- `waxell.metadata()` -- model hub and model metadata
- HuggingFace Inference API -- `text_generation` with `max_new_tokens`
## Run it

```shell
cd dev/waxell-dev
python -m app.demos.huggingface_agent --dry-run
```