
Session Analytics

Group agent interactions into sessions to track multi-turn conversations, analyze patterns, and identify expensive or slow interactions.

Prerequisites

  • Python 3.10+
  • waxell-observe installed and configured with an API key
  • Some recorded runs with session_id set (follow the RAG pipeline tutorial first)

What You'll Learn

  • Pass a session_id to @waxell.observe at call time to group runs
  • Build a multi-turn chatbot with session-aware observability using decorators
  • Analyze sessions in the dashboard
  • Query session data via the REST API
  • Identify expensive and slow sessions

Step 1: Instrument with Session IDs

A session groups related runs together. With the decorator pattern, you pass a consistent session_id as a keyword argument every time the agent function is called -- @waxell.observe intercepts it and applies it to the run:

import waxell_observe as waxell

waxell.init() # before LLM SDK imports

import openai
client = openai.OpenAI()

from waxell_observe import generate_session_id

session_id = generate_session_id() # e.g., "sess_a1b2c3d4e5f6g7h8"

@waxell.observe(agent_name="chatbot")
async def chat(message: str) -> str:
    response = client.chat.completions.create(  # auto-captured
        model="gpt-4o",
        messages=[{"role": "user", "content": message}],
    )
    return response.choices[0].message.content

# Every call in this conversation passes the same session_id
await chat("Hi there!", session_id=session_id, user_id="user_alice")
await chat("Tell me more.", session_id=session_id, user_id="user_alice")

Info: generate_session_id() produces IDs like sess_a1b2c3d4e5f6g7h8. You can also use your own session identifiers -- any string works.
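
Because any string works, one option is to derive the session ID from identifiers your application already has, so the same conversation always maps to the same session. The sketch below is only an illustration: make_session_id is a hypothetical helper, not part of waxell-observe, and the user and conversation IDs are made-up values.

```python
import hashlib


def make_session_id(user_id: str, conversation_id: str) -> str:
    """Hypothetical helper: derive a stable session ID from app-level identifiers."""
    # Hash the pair so the ID is deterministic but does not expose raw identifiers.
    digest = hashlib.sha256(f"{user_id}:{conversation_id}".encode("utf-8")).hexdigest()
    # Keep 16 hex characters, mirroring the length of the sess_... example.
    return f"sess_{digest[:16]}"


session_id = make_session_id("user_alice", "conv_42")
print(session_id)
```

A deterministic ID like this lets a stateless web handler rejoin an existing session without storing the ID separately, at the cost of never starting a fresh session for the same pair.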

Step 2: Build a Multi-Turn Chatbot

Here is a complete example of a chatbot that maintains conversation state and tracks each turn as a separate observed run within the same session. The OpenAI call is auto-captured by waxell.init() -- no record_llm_call needed.

import asyncio

import waxell_observe as waxell

waxell.init() # before openai import

import openai
from waxell_observe import generate_session_id

oai = openai.OpenAI()


@waxell.observe(agent_name="chatbot")
async def send_turn(message: str, history: list[dict]) -> str:
    """One turn of a chat conversation. session_id/user_id come in via kwargs."""
    history.append({"role": "user", "content": message})

    # Inline enrichment -- no ctx handle needed
    waxell.tag("session_type", "chat")
    waxell.metadata("turn_number", len(history) // 2)

    response = oai.chat.completions.create(  # auto-captured
        model="gpt-4o",
        messages=history,
    )
    answer = response.choices[0].message.content
    history.append({"role": "assistant", "content": answer})

    waxell.score("answered", True, data_type="boolean")
    return answer


class ChatSession:
    """Holds conversation state across turns."""

    def __init__(self, user_id: str):
        self.session_id = generate_session_id()
        self.user_id = user_id
        self.history: list[dict] = [
            {"role": "system", "content": "You are a helpful assistant."},
        ]

    async def send_message(self, message: str) -> str:
        return await send_turn(
            message,
            self.history,
            session_id=self.session_id,
            user_id=self.user_id,
        )


async def main():
    session = ChatSession(user_id="user_alice")

    a1 = await session.send_message("What is Waxell?")
    print(f"Bot: {a1}\n")

    a2 = await session.send_message("How does it handle governance?")
    print(f"Bot: {a2}\n")

    a3 = await session.send_message("Can you give me an example?")
    print(f"Bot: {a3}\n")

    print(f"Session: {session.session_id}")
    print(f"Turns: {len(session.history) // 2}")


if __name__ == "__main__":
    asyncio.run(main())

Each call to send_message creates a separate observed run, but they are all linked by the same session_id.

Step 3: View Sessions in the Dashboard

Open your Waxell dashboard and navigate to Observability > Sessions:

  1. Session list -- See all sessions with their total run count, total cost, and duration
  2. Click a session -- View the timeline of all runs in that session, ordered chronologically
  3. Inspect individual runs -- Click any run to see its inputs, outputs, LLM calls, and scores

The session detail view shows a timeline that looks like a conversation flow, making it easy to follow the user's journey through your application.

Step 4: Query Sessions via the REST API

List sessions with filtering and sorting:

# List recent sessions
curl "https://acme.waxell.dev/api/v1/observability/sessions/?limit=20" \
-H "X-Wax-Key: wax_sk_..."

Response:

{
  "results": [
    {
      "session_id": "sess_a1b2c3d4e5f6g7h8",
      "run_count": 3,
      "total_cost": 0.0045,
      "total_tokens": 1250,
      "first_run_at": "2025-01-15T10:00:00Z",
      "last_run_at": "2025-01-15T10:05:30Z",
      "user_id": "user_alice"
    }
  ]
}
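
The fields in this response are enough to derive a few per-session figures on the client side. The following is a minimal sketch that assumes the response shape shown above; summarize_session and parse_ts are illustrative names, not part of any Waxell SDK. Since the prerequisites allow Python 3.10, where datetime.fromisoformat() does not accept a trailing "Z", the timestamp is rewritten with an explicit UTC offset before parsing.

```python
import json
from datetime import datetime

# Sample payload copied from the example response.
payload = json.loads("""
{
  "results": [
    {
      "session_id": "sess_a1b2c3d4e5f6g7h8",
      "run_count": 3,
      "total_cost": 0.0045,
      "total_tokens": 1250,
      "first_run_at": "2025-01-15T10:00:00Z",
      "last_run_at": "2025-01-15T10:05:30Z",
      "user_id": "user_alice"
    }
  ]
}
""")


def parse_ts(value: str) -> datetime:
    # Python 3.10's fromisoformat() rejects a trailing "Z",
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_session(session: dict) -> dict:
    """Illustrative helper: derive the time span and per-run averages for one session."""
    span = parse_ts(session["last_run_at"]) - parse_ts(session["first_run_at"])
    runs = session["run_count"]
    return {
        "session_id": session["session_id"],
        "span_seconds": span.total_seconds(),
        "cost_per_run": session["total_cost"] / runs,
        "tokens_per_run": session["total_tokens"] / runs,
    }


summary = summarize_session(payload["results"][0])
print(summary)
```

Note that the span between first_run_at and last_run_at includes the time the user spent reading and typing between turns, so it is an upper bound on how long the agent itself was working.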

Filter by user or date range:

# Sessions for a specific user
curl "https://acme.waxell.dev/api/v1/observability/sessions/?user_id=user_alice" \
-H "X-Wax-Key: wax_sk_..."

# Sessions since a given timestamp (here, 24 hours before 2025-01-15T10:00:00Z)
curl "https://acme.waxell.dev/api/v1/observability/sessions/?since=2025-01-14T10:00:00Z" \
-H "X-Wax-Key: wax_sk_..."
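
A hard-coded since value goes stale, so a script would normally compute it. The sketch below builds the query URL for a rolling window using only the standard library. It assumes the since and user_id parameters can be combined in one request, which the examples above do not confirm; sessions_url is a hypothetical helper, and the base URL is the acme placeholder from the curl commands.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Base URL taken from the curl examples; replace "acme" with your own tenant.
BASE_URL = "https://acme.waxell.dev/api/v1/observability/sessions/"


def sessions_url(now: datetime, hours: int = 24, **filters: str) -> str:
    """Hypothetical helper: build a sessions query URL with a rolling `since` window.

    `now` must be timezone-aware so the UTC conversion is unambiguous.
    """
    since = (now - timedelta(hours=hours)).astimezone(timezone.utc)
    params = {"since": since.strftime("%Y-%m-%dT%H:%M:%SZ"), **filters}
    # urlencode percent-escapes the colons in the timestamp.
    return f"{BASE_URL}?{urlencode(params)}"


url = sessions_url(datetime(2025, 1, 15, 10, 0, tzinfo=timezone.utc), user_id="user_alice")
print(url)
```

In a real script, pass datetime.now(timezone.utc) as now; the fixed date here only keeps the example reproducible.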

Step 5: Identify Expensive Sessions

Sort sessions by cost to find the most expensive conversations:

curl "https://acme.waxell.dev/api/v1/observability/sessions/?ordering=-total_cost&limit=10" \
-H "X-Wax-Key: wax_sk_..."

Investigate expensive sessions by looking at:

  • High turn count -- Long conversations accumulate tokens (especially with growing context windows)
  • Large context -- RAG pipelines that retrieve too many documents per turn
  • Expensive models -- Sessions using GPT-4 where GPT-4o-mini would suffice

Tip: Track the turn_number in metadata (as shown in Step 2) to understand how conversation length correlates with cost. If most value comes in the first 3 turns, consider prompting users to start new conversations.
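
One way to triage the first two causes is to compare each session's turn count and tokens per run against thresholds. The sketch below uses made-up session records shaped like the Step 4 response, and the thresholds are arbitrary placeholders to tune against your own traffic. The session summary shown in Step 4 has no model field, so the third cause is not covered here and would need inspection of individual runs.

```python
# Hypothetical session records shaped like the API response in Step 4.
sessions = [
    {"session_id": "sess_a", "run_count": 12, "total_cost": 0.0600, "total_tokens": 18000},
    {"session_id": "sess_b", "run_count": 2, "total_cost": 0.0500, "total_tokens": 16000},
    {"session_id": "sess_c", "run_count": 3, "total_cost": 0.0045, "total_tokens": 1250},
]

# Illustrative thresholds, not recommendations.
MAX_TURNS = 8
MAX_TOKENS_PER_RUN = 4000


def likely_cost_driver(session: dict) -> str:
    """Label a session with the most likely reason for its cost."""
    tokens_per_run = session["total_tokens"] / session["run_count"]
    if tokens_per_run > MAX_TOKENS_PER_RUN:
        return "large context"
    if session["run_count"] > MAX_TURNS:
        return "high turn count"
    return "within budget"


drivers = {s["session_id"]: likely_cost_driver(s) for s in sessions}
print(drivers)
```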

Step 6: Identify Slow Sessions

Sort by duration to find sessions where users waited a long time:

curl "https://acme.waxell.dev/api/v1/observability/sessions/?ordering=-total_duration&limit=10" \
-H "X-Wax-Key: wax_sk_..."

Common causes of slow sessions:

  • Sequential tool calls -- An agent that calls multiple tools one after another
  • Large retrieval -- Searching over a large corpus takes time
  • Rate limiting -- The LLM API throttles requests during high load
  • Retries -- Failed LLM calls that get retried add latency
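
To illustrate the first cause, the sketch below contrasts awaiting independent tool calls one after another with running them concurrently through asyncio.gather. fake_tool is a stand-in that only sleeps; real tools may depend on each other's output, in which case they cannot be overlapped this way.

```python
import asyncio
import time


async def fake_tool(name: str, delay: float = 0.05) -> str:
    """Stand-in for a tool call; sleeps instead of doing real I/O."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_sequential(names: list[str]) -> list[str]:
    # Each call waits for the previous one, so latencies add up.
    return [await fake_tool(n) for n in names]


async def run_concurrent(names: list[str]) -> list[str]:
    # Independent calls overlap, so total latency is roughly the slowest call.
    return list(await asyncio.gather(*(fake_tool(n) for n in names)))


async def timed(coro) -> tuple[list[str], float]:
    """Await a coroutine and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start


async def compare() -> tuple[float, float]:
    tools = ["search", "lookup", "summarize"]
    seq_result, seq_time = await timed(run_sequential(tools))
    con_result, con_time = await timed(run_concurrent(tools))
    assert seq_result == con_result  # same answers, different wall-clock time
    return seq_time, con_time


seq_time, con_time = asyncio.run(compare())
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```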

Step 7: Session-Based Patterns

Use sessions to answer business questions:

Average session length: How many turns do users typically need?

# Pseudocode for analyzing session patterns.
# fetch_sessions is a placeholder: it should return the "results" list
# from the sessions endpoint shown in Step 4.
from collections import Counter

sessions = fetch_sessions(limit=1000)

turn_counts = [s["run_count"] for s in sessions]
avg_turns = sum(turn_counts) / len(turn_counts)
print(f"Average turns per session: {avg_turns:.1f}")

# Distribution of turns per session
distribution = Counter(turn_counts)
for turns, count in sorted(distribution.items()):
    print(f"  {turns} turns: {count} sessions")
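
The fetch_sessions call in the snippet above is left undefined. A minimal sketch of one possible implementation follows, using only the standard library with the endpoint and X-Wax-Key header from Step 4. Several things here are assumptions: the WAXELL_API_KEY environment variable name is hypothetical, the sketch reads a single page and does not handle pagination, and it is not confirmed that the API accepts a limit as large as 1000. The get_json parameter exists so the HTTP call can be swapped out in tests.

```python
import json
import os
import urllib.request
from typing import Callable

# Endpoint and header name are taken from the curl examples in Step 4.
SESSIONS_URL = "https://acme.waxell.dev/api/v1/observability/sessions/"


def http_get_json(url: str, api_key: str) -> dict:
    """GET a URL with the API key header and decode the JSON body."""
    request = urllib.request.Request(url, headers={"X-Wax-Key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_sessions(
    limit: int = 1000,
    get_json: Callable[[str, str], dict] = http_get_json,
) -> list[dict]:
    """Return the "results" list from a single page of the sessions endpoint."""
    # WAXELL_API_KEY is a hypothetical variable name used for this sketch.
    api_key = os.environ.get("WAXELL_API_KEY", "")
    payload = get_json(f"{SESSIONS_URL}?limit={limit}", api_key)
    return payload.get("results", [])
```

With this in place, fetch_sessions(limit=1000) returns a list of dictionaries that the analysis snippets in this step can consume directly.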

Abandonment rate: How many sessions have only 1 run (user asked once and left)?

single_turn = sum(1 for s in sessions if s["run_count"] == 1)
abandonment_rate = single_turn / len(sessions) * 100
print(f"Abandonment rate: {abandonment_rate:.1f}%")

Cost per session: What does each conversation cost?

costs = [s["total_cost"] for s in sessions]
avg_cost = sum(costs) / len(costs)
p95_cost = sorted(costs)[int(len(costs) * 0.95)]
print(f"Average cost per session: ${avg_cost:.4f}")
print(f"P95 cost per session: ${p95_cost:.4f}")
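
The index-based P95 above picks a single observed value. As an alternative, the standard library's statistics.quantiles interpolates between neighbouring values. The sketch below uses made-up costs so the result is reproducible; in practice, costs would come from the session list. On Python 3.10, statistics.quantiles needs at least two data points.

```python
import statistics

# Hypothetical per-session costs in USD: 0.001, 0.002, ..., 0.100.
costs = [n / 1000 for n in range(1, 101)]

# n=20 splits the data into 20 equal groups; the last of the 19 cut points is P95.
# method="inclusive" treats the data as the full population rather than a sample.
cuts = statistics.quantiles(costs, n=20, method="inclusive")
p95_cost = cuts[-1]
median_cost = statistics.median(costs)

print(f"Median cost per session: ${median_cost:.4f}")
print(f"P95 cost per session: ${p95_cost:.4f}")
```

Reporting the median next to the mean is a cheap way to see whether a few very expensive sessions are skewing the average.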

Advanced: Long-Running Sessions with WaxellContext

For batch loops or orchestration that needs to start/stop a run mid-function (rather than once per function call), WaxellContext is available -- but it is not the default path. The decorator pattern above covers nearly all chatbot and multi-turn use cases. See the Context Manager reference if you need to drop down to it.

Next Steps