Replay Engine
Understand how TraceLLM reconstructs execution history.
Why Replay Exists
AI systems are non-deterministic. The same prompt can produce different responses, different latency, different tool call paths, and different failure modes on every run. When something goes wrong in production — a hallucination, a timeout, a broken tool chain — you cannot simply re-run and hope to see the same behavior.
Replay solves this by storing every step of every execution as structured data. Instead of guessing what happened, you open a past trace and watch it unfold again — same steps, same timing, same inputs, same outputs — reconstructed from the captured trace document.
What Is Replay
Replay is the process of reconstructing a past execution from its stored trace. Given a trace ID, TraceLLM fetches the complete trace document from MongoDB and replays each step in sequence, rendering a live execution tree and step detail panel in the terminal.
tracellm replay tr_2kf9q3m1
The replay command takes a trace ID and two optional flags, listed below. It does not call any LLM; it reads the captured data and re-renders the execution timeline at a human-readable speed.
| Flag | Default | Description |
|---|---|---|
| --speed | 1.0 | Replay speed multiplier (min 0.1) |
| --show-response | false | Print the full saved response after replay |
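As an illustration of how the flags in the table fit together, here is a minimal sketch using Python's standard argparse module. This is an assumption for illustration only: the source does not say which CLI framework TraceLLM uses, and the helper names speed_value and build_parser are hypothetical. Rejecting values below 0.1 (rather than clamping them) is also an assumption.

```python
import argparse

MIN_SPEED = 0.1  # minimum taken from the flag table


def speed_value(raw: str) -> float:
    """Parse --speed and reject values below the documented minimum."""
    value = float(raw)
    if value < MIN_SPEED:
        # Assumption: too-small values are rejected rather than clamped.
        raise argparse.ArgumentTypeError(f"--speed must be >= {MIN_SPEED}")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="tracellm replay")
    parser.add_argument("trace_id", help="ID of the stored trace")
    parser.add_argument("--speed", type=speed_value, default=1.0,
                        help="Replay speed multiplier (min 0.1)")
    parser.add_argument("--show-response", action="store_true",
                        help="Print the full saved response after replay")
    return parser


args = build_parser().parse_args(["tr_2kf9q3m1", "--speed", "2"])
print(args.trace_id, args.speed, args.show_response)
```

With these arguments the parsed namespace carries the trace ID, a speed of 2.0, and show_response left at its default of False.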
How Replay Works Internally
The replay engine follows a precise sequence to reconstruct the execution:
- Fetch trace: The trace document is retrieved from MongoDB via fetch_trace(trace_id). This returns a validated TraceSchema object with all steps, metadata, and timing information.
- Render metadata header: The trace ID, status, latency, retry count, and step count are displayed in a header panel so you know what you are about to watch.
- Iterate steps: For each step (1-indexed), two panels are rendered inside a rich.live.Live display:
  - An execution tree showing all steps with the current one highlighted and previous ones marked complete
  - A step detail panel showing tool name, duration, status, input (clipped to 200 chars), and output (clipped to 200 chars)
- Throttle timing: The engine sleeps between steps to simulate the original pacing. The delay is derived from the step's stored duration, divided by the speed multiplier, and clamped to a range that keeps the replay readable.
- Render final report: After all steps are replayed, a summary table with the full trace report is printed. If --show-response is set, the complete model response is displayed in a separate panel.
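The sequence above can be sketched in plain Python. This is a simplified illustration, not TraceLLM's implementation: an in-memory dictionary stands in for MongoDB, the document fields (steps, tool, duration_ms, status, response) are hypothetical, and text lines stand in for the Rich panels. The clamp bounds are the ones given in the Timing Mechanism section.

```python
import time
from typing import Callable, Dict, List

# In-memory stand-in for the MongoDB collection. The document shape and
# field names are hypothetical, not TraceLLM's actual schema.
TRACES: Dict[str, dict] = {
    "tr_demo": {
        "status": "SUCCESS",
        "steps": [
            {"tool": "query.embed", "duration_ms": 180, "status": "OK"},
            {"tool": "vector.search", "duration_ms": 340, "status": "OK"},
        ],
    }
}


def replay(trace_id: str, speed: float = 1.0, show_response: bool = False,
           sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Return the lines a replay would emit, in order."""
    trace = TRACES[trace_id]                         # 1. fetch trace
    steps = trace["steps"]
    lines = [f"header {trace_id} {trace['status']} steps={len(steps)}"]  # 2. header
    for index, step in enumerate(steps, start=1):    # 3. iterate, 1-indexed
        lines.append(f"step {index}/{len(steps)} {step['tool']} {step['status']}")
        delay = step["duration_ms"] / 1000 / speed
        sleep(max(0.08, min(0.55, delay)))           # 4. throttle
    lines.append("final report")                     # 5. final report
    if show_response:
        lines.append(trace.get("response", ""))
    return lines


# Pass a no-op sleep so the example runs instantly.
for line in replay("tr_demo", speed=2.0, sleep=lambda seconds: None):
    print(line)
```

Injecting the sleep function keeps the loop easy to test: a test can record the requested pauses instead of waiting for them.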
Timing Mechanism
The replay engine preserves the relative pacing of the original execution by using the stored step duration as a guide. The actual sleep logic is:
delay = step_duration_ms / 1000 / speed
sleep = max(0.08, min(0.55, delay))
| Scenario | Stored Duration | Delay at 1x Speed | Delay at 2x Speed |
|---|---|---|---|
| Very fast step | 50 ms | 80 ms (clamped) | 80 ms (clamped) |
| Normal step | 340 ms | 340 ms | 170 ms |
| Slow step | 1200 ms | 550 ms (clamped) | 550 ms (clamped) |
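The 80 ms floor keeps even sub-millisecond steps visible. The 550 ms ceiling prevents a single slow step from stalling the replay. The speed multiplier divides the unclamped delay, so --speed 2 halves it and --speed 0.5 doubles it. The clamp is applied afterwards, which is why the very fast and slow steps in the table show the same delay at both speeds.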
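The formula above translates directly into a small function. The sketch below is illustrative: the name replay_sleep and the constant names are hypothetical, while the bounds and the arithmetic come from the formula. The loop recomputes the three scenarios from the table.

```python
MIN_SLEEP_S = 0.08  # 80 ms floor
MAX_SLEEP_S = 0.55  # 550 ms ceiling


def replay_sleep(step_duration_ms: float, speed: float = 1.0) -> float:
    """Seconds to pause after a step: scale by speed, then clamp."""
    delay = step_duration_ms / 1000 / speed
    return max(MIN_SLEEP_S, min(MAX_SLEEP_S, delay))


# Recompute the scenarios from the table: 50 ms, 340 ms, and 1200 ms steps.
for duration_ms in (50, 340, 1200):
    at_1x = replay_sleep(duration_ms, speed=1.0)
    at_2x = replay_sleep(duration_ms, speed=2.0)
    print(f"{duration_ms} ms -> {at_1x * 1000:.0f} ms at 1x, {at_2x * 1000:.0f} ms at 2x")
```

Only the 340 ms step changes between 1x and 2x; the other two are pinned by the floor and the ceiling.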
Execution Tree
During replay, each step iteration renders a tree visualization of the entire execution. The tree uses four visual states:
agent:start
├── ✓ query.embed 180ms OK
├── ✓ vector.search 340ms OK
├── ▶ context.rerank 280ms ← active step
├── agent.plan 210ms
├── context.allocate 120ms
├── tool.chain 450ms
├── llm.generate 1240ms
└── ✓ done
| Icon | Meaning | Style |
|---|---|---|
| ✓ | Step completed (previous step) | Green, dimmed |
| ▶ | Step currently replaying | Cyan, bold |
| (space) | Step pending (future step) | Dimmed |
| ✗ | Failed or retried step | Red |
The tree is rendered by build_execution_tree() in the tree_renderer module. It uses Rich Tree components with guide lines to show the execution flow. A final done node is appended at the bottom and colored green (all steps successful) or yellow (any warnings).
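The icon rules in the table can be expressed as a small selection function. The sketch below is a plain-text approximation and does not reproduce build_execution_tree() or use Rich. The function names are hypothetical, and the precedence (the active marker wins over the failure marker, and only already-replayed steps show ✗) is an assumption based on the examples on this page.

```python
from typing import List, Tuple

PROBLEM_STATUSES = {"RETRY", "FAILED"}


def step_icon(position: int, active: int, status: str) -> str:
    """Pick the tree icon for a step; positions are 1-indexed."""
    if position > active:
        return " "   # pending: not replayed yet
    if position == active:
        return "▶"   # currently replaying
    if status in PROBLEM_STATUSES:
        return "✗"   # replayed, but failed or retried
    return "✓"       # replayed successfully


def tree_lines(steps: List[Tuple[str, str]], active: int) -> List[str]:
    """Render (tool, status) pairs as text lines with guide characters."""
    lines = ["agent:start"]
    for position, (tool, status) in enumerate(steps, start=1):
        lines.append(f"├── {step_icon(position, active, status)} {tool}")
    lines.append("└── done")
    return lines


steps = [("query.embed", "OK"), ("vector.search", "OK"),
         ("context.rerank", "OK"), ("agent.plan", "OK")]
print("\n".join(tree_lines(steps, active=3)))
```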
How Failures Appear
Traces with failures or warnings produce different replay output:
╭── Replay ────────────────────────────────────────────────╮
│ │
│ trace_id tr_f9k2m4x7 │
│ status WARNING │
│ latency 4,120.00 ms │
│ retries 2 │
│ steps 6 │
│ │
╰──────────────────────────────────────────────────────────╯
agent:start
├── ✓ query.embed 190ms OK
├── ✓ vector.search 310ms OK
├── ▶ tool_schema_lookup 250ms RETRY ← failed attempt
├── retry_guard 150ms
├── tool_schema_lookup 280ms OK ← successful retry
├── response_generation 2450ms OK
└── ✗ done
Failed steps show RETRY or FAILED in the status column. The step detail panel shows the error message in the output field. After replay completes, the console displays a warning message (yellow) instead of the standard success message (green).
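One way to decide between the warning and success messages is to scan the step statuses after the last step. The sketch below assumes each step is a dictionary with tool and status keys. Those keys, the function name, and the warning wording are hypothetical; only the RETRY and FAILED statuses, the two colors, and the "Replay complete" text come from this page.

```python
from typing import Dict, List, Tuple

PROBLEM_STATUSES = {"RETRY", "FAILED"}


def completion_message(steps: List[Dict[str, str]]) -> Tuple[str, str]:
    """Return (style, text) for the message printed after replay."""
    problems = [s["tool"] for s in steps if s["status"] in PROBLEM_STATUSES]
    if problems:
        # Hypothetical wording; the source only says the message is yellow.
        return "yellow", f"Replay complete with warnings: {', '.join(problems)}"
    return "green", "Replay complete"


steps = [
    {"tool": "query.embed", "status": "OK"},
    {"tool": "tool_schema_lookup", "status": "RETRY"},
    {"tool": "tool_schema_lookup", "status": "OK"},
]
style, text = completion_message(steps)
print(style, text)
```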
Step Detail Panel
Alongside the execution tree, each step shows a detail panel with the data captured for that step:
╭── Step Detail ──────────────────────────────────────────╮
│ │
│ step 3/9 │
│ tool context.rerank │
│ duration 340 ms │
│ status OK │
│ input {"query": "Explain transformers", │
│ "strategy": "cross-encoder", │
│ "candidate_count": 8} │
│ output {"reranked_chunks": 5, │
│ "coverage_score": 0.912} │
│ │
╰──────────────────────────────────────────────────────────╯
Input and output values are truncated to 200 characters to keep the display readable. The duration is the actual captured duration from the original execution, not a replay estimate.
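The 200-character truncation could look like the sketch below. The function name clip is hypothetical, and two details are assumptions rather than documented behavior: non-string values are serialized with json.dumps, and a trailing "..." is counted toward the limit.

```python
import json

CLIP_LIMIT = 200  # character limit stated in the text


def clip(value: object, limit: int = CLIP_LIMIT) -> str:
    """Render a captured value as text no longer than `limit` characters."""
    text = value if isinstance(value, str) else json.dumps(value)
    if len(text) <= limit:
        return text
    # Assumption: mark truncation with an ellipsis that fits inside the limit.
    return text[:limit - 3] + "..."


short = clip({"query": "Explain transformers", "candidate_count": 8})
long_text = clip("x" * 500)
print(short)
print(len(long_text))
```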
Complete Replay Output
Here is the terminal output of a replay from start to finish, with the live panel captured at step 3 of 9:
$ tracellm replay tr_2kf9q3m1
╭──────────────────────────────────────────────────────────╮
│ 🦖 Replaying execution timeline... │
╰──────────────────────────────────────────────────────────╯
╭── Replay ────────────────────────────────────────────────╮
│ │
│ trace_id tr_2kf9q3m1 │
│ status SUCCESS │
│ latency 3,420.00 ms │
│ retries 1 │
│ steps 9 │
│ │
╰──────────────────────────────────────────────────────────╯
╭── Replaying step 3/9 ────────────────────────────────────╮
│ │
│ ╭── Execution Tree ─────────────────────────────────╮ │
│ │ │ │
│ │ agent:start │ │
│ │ ├── ✓ query.embed 180ms OK │ │
│ │ ├── ✓ vector.search 340ms OK │ │
│ │ ├── ▶ context.rerank 280ms │ │
│ │ ├── agent.plan 210ms │ │
│ │ ├── context.allocate 120ms │ │
│ │ ├── tool.chain 450ms │ │
│ │ ├── llm.generate 1240ms │ │
│ │ └── ✓ done │ │
│ │ │ │
│ ╰────────────────────────────────────────────────────╯ │
│ │
│ ╭── Step Detail ────────────────────────────────────╮ │
│ │ │ │
│ │ step 3/9 │ │
│ │ tool context.rerank │ │
│ │ duration 340 ms │ │
│ │ status OK │ │
│ │ input {"query": "Explain transformers",...} │ │
│ │ output {"reranked": true, "matches": 5} │ │
│ │ │ │
│ ╰────────────────────────────────────────────────────╯ │
│ │
╰──────────────────────────────────────────────────────────╯
🦖 Replay complete
╭── TraceLLM Trace ───────────────────────────── SUCCESS ──╮
│ │
│ Trace ID tr_2kf9q3m1 │
│ Prompt Explain transformers │
│ Model gpt-4.1-mini │
│ Project default │
│ Environment development │
│ Latency 3,420.00 ms │
│ Token Count 1,247 │
│ Retries 1 │
│ Steps 9 │
│ Status SUCCESS │
│ │
╰──────────────────────────────────────────────────────────╯