Image Generation Agent
A multi-agent comparison of three image generation systems: Stable Diffusion (text-to-image + img2img), Flux (high-resolution generation), and Fal AI (serverless inference). An image-generator child agent runs all pipelines while an image-evaluator compares quality via @reasoning and LLM analysis.
Environment variables
This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to skip real API calls.
Architecture
Key Code
Three Image Generation Pipelines
Five @tool-decorated functions exercise Stable Diffusion (text2img + img2img), Flux (standard + guided), and Fal AI.
@waxell.tool(tool_type="image_generation")
def run_stable_diffusion(pipeline, prompt: str, steps: int = 50, guidance: float = 7.5,
                         width: int = 512, height: int = 512) -> dict:
    """Generate image with Stable Diffusion (StableDiffusionPipeline.__call__)."""
    result = pipeline(prompt=prompt, num_inference_steps=steps, guidance_scale=guidance,
                      width=width, height=height)
    return {"model_id": pipeline.name_or_path, "num_images": len(result.images),
            "dimensions": f"{width}x{height}"}

@waxell.tool(tool_type="image_generation")
def run_flux(pipeline, prompt: str, steps: int = 28, width: int = 1024, height: int = 1024) -> dict:
    """Generate image with Flux (FluxPipeline.__call__)."""
    result = pipeline(prompt=prompt, num_inference_steps=steps, width=width, height=height)
    return {"model": pipeline.name_or_path, "variant": "flux-dev", "num_images": len(result.images)}

@waxell.tool(tool_type="image_generation")
def run_fal_ai(fal_client, endpoint: str, prompt: str, image_size: str = "landscape_4_3") -> dict:
    """Generate image with Fal AI serverless (fal_client.run)."""
    result = fal_client.run(endpoint=endpoint, arguments={"prompt": prompt, "image_size": image_size})
    return {"endpoint": endpoint, "output_count": len(result.get("images", []))}
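The three tools cover different strengths, so the child agent has to route each request to the right pipeline. A minimal sketch of that routing logic is below; the `pick_pipeline` helper and its heuristics are illustrative assumptions, not part of the demo's code.

```python
# Hypothetical routing helper (not part of the demo): choose a generator
# based on the request, mirroring the trade-offs the evaluator reports.
def pick_pipeline(task: str, needs_serverless: bool = False) -> str:
    # Fal AI when no local GPU is available (serverless inference)
    if needs_serverless:
        return "fal_ai"
    # Stable Diffusion is the only pipeline here with an img2img variant
    if task == "img2img":
        return "stable_diffusion"
    # Flux for highest-quality 1024x1024 text-to-image
    return "flux"
```

For example, `pick_pipeline("img2img")` returns `"stable_diffusion"`, while a serverless text-to-image request routes to `"fal_ai"`.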
Quality Comparison and Recommendation
The evaluator compares generators and recommends the best approach.
@waxell.reasoning_dec(step="quality_comparison")
async def compare_generation_quality(comparison: dict) -> dict:
    return {
        "thought": "SD at 512x512 for img2img, Flux at 1024x1024 for quality, Fal AI for serverless",
        "conclusion": "Flux best quality/speed; Fal AI best for serverless; SD best for img2img",
    }

@waxell.decision(name="recommended_generator", options=["stable_diffusion", "flux", "fal_ai", "hybrid"])
async def recommend_generator(comparison: dict) -> dict:
    return {"chosen": "hybrid", "reasoning": "Each generator excels in different scenarios", "confidence": 0.82}
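The "hybrid" recommendation follows from the observation that a different generator wins each scenario. A toy sketch of that logic is below; the scores and the `best_per_scenario` helper are made up for illustration, since the demo's evaluator derives its comparison via LLM analysis rather than fixed numbers.

```python
# Illustrative only: per-scenario scores (0-1) for each generator.
SCORES = {
    "stable_diffusion": {"img2img": 0.9, "quality": 0.6, "serverless": 0.2},
    "flux":             {"img2img": 0.1, "quality": 0.9, "serverless": 0.3},
    "fal_ai":           {"img2img": 0.4, "quality": 0.7, "serverless": 0.9},
}

def best_per_scenario(scores: dict) -> dict:
    # When each scenario has a different winner, no single generator
    # dominates, and "hybrid" is the natural recommendation.
    return {
        scenario: max(scores, key=lambda g: scores[g][scenario])
        for scenario in next(iter(scores.values()))
    }
```

With the scores above, each scenario picks a different winner (SD for img2img, Flux for quality, Fal AI for serverless), which is exactly the situation where `recommend_generator` chooses "hybrid".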
What this demonstrates
- Stable Diffusion instrumentor -- StableDiffusionPipeline.__call__ for text2img and StableDiffusionImg2ImgPipeline.__call__ for img2img.
- Flux instrumentor -- FluxPipeline.__call__ with standard and enhanced guidance settings.
- Fal AI instrumentor -- fal_client.run, fal_client.submit, and fal_client.subscribe for serverless inference.
- 5 image generation tool calls -- SD text2img, SD img2img, Flux standard, Flux guided, Fal AI.
- @reasoning + @decision -- quality comparison and hybrid generator recommendation.
Run it
# Dry-run mode (no API key needed)
cd dev/waxell-dev
python -m app.demos.image_gen_agent --dry-run
# Live mode (requires all three variables from "Environment variables")
export OPENAI_API_KEY="sk-..."
export WAXELL_API_KEY="..."
export WAXELL_API_URL="..."
python -m app.demos.image_gen_agent