Documentation

Execution Timeline

Understand ordering, spans, events, and state transitions.

What Is the Execution Timeline

The execution timeline is the ordered sequence of events that occur from the moment a traced function starts to the moment it completes. It includes the decorator lifecycle, every step execution, every retry, and every state transition. The timeline is the core abstraction that makes replay, debugging, and analytics possible.

Every trace has exactly one timeline. The timeline is composed of spans (measured units of work) and events (timestamped state changes), both captured automatically by the @trace decorator and its companion tools.

The Six-Stage Lifecycle

Every traced function progresses through six distinct stages. These stages form the backbone of the execution timeline:

1

Start Time Capture

The decorator records datetime.now(timezone.utc) for the human-readable timestamp and time.perf_counter() for high-precision latency measurement. These two values anchor the entire timeline.
2

Context Resolution

Project ID, project name, environment, and API key are resolved. If an API key was provided, it is looked up in the database. Otherwise the decorator arguments or defaults are used.
3

Context Variable Initialization

A contextvars.ContextVar is set with an empty collected_steps list and a retry_count of zero. From this point, any @trace_tool or integration call that executes within this function will append its steps to this context.
4

Function Execution

The wrapped function runs. For sync functions this is a direct call; for async functions it is awaited. During execution, child decorators and integrations append step records to the context variable. If the function raises an exception, it is captured for the trace but re-raised.
5

Trace Finalization

In the finally block, the latency delta is computed. The trace payload is built from the captured data, validated against the schema, and persisted to MongoDB. A WebSocket broadcast notifies all connected clients. The console summary is rendered.
6

Context Reset

The context variable is reset to its previous state. This ensures that sibling or subsequent traced calls start with a clean context, preventing step leakage between unrelated traces.

Event Stream

During live execution, the TraceStream context manager provides a real-time event stream that renders timestamped events to the console:

Live event streamCopy
text
  ╭── Live Trace ───────────────────────────────────────────╮
  │                                                          │
  │  Event Stream                                            │
  │  [14:22:10]  trace.start                                 │
  │  [14:22:10]  query.embed          Embedding prompt       │
  │  [14:22:11]  vector.search        Searching vector index │
  │  [14:22:11]  context.rerank       Reranking context      │
  │  [14:22:11]  agent.plan           Planning tool execution│
  │  [14:22:11]  context.allocate     Allocating context     │
  │  [14:22:12]  tool.chain           Running tool chain     │
  │  [14:22:12]  llm.generate         Generating answer      │
  │  [14:22:14]  trace.complete                              │
  │                                                          │
  │  Prompt     Explain transformers                         │
  │  Model      gpt-4.1-mini                                 │
  │  Elapsed    4.2s                                         │
  │  Events     9                                            │
  │                                                          │
  ╰──────────────────────────────────────────────────────────╯

Each event is timestamped with HH:MM:SS precision. The stream is transient — it appears during execution and is replaced by the final trace summary when the trace completes.

How Steps Relate to the Timeline

The timeline is a linear sequence, but the relationship between steps depends on how the code was instrumented:

  • Sequential steps — Most traces consist of steps that run one after another. Each step has its own duration, and the sum of step durations is typically close to (but less than) the total trace latency.
  • Retry steps — When a tool fails and is retried, each attempt appears as a separate step in the timeline. Failed attempts are marked success=False and interleaved with backoff delays.
  • Parallel steps — When using asyncio.gather, steps from different coroutines appear in the timeline in the order they were appended (which may or may not match execution order). The ContextVar isolation ensures steps from different parallel branches are not interleaved at the step level.

How Timing Is Preserved

Timing is the most critical dimension of the execution timeline. TraceLLM preserves three layers of timing information:

LayerMeasurementPrecisionPurpose
Total latencyperf_counter delta2 decimal msTop-level duration of the entire traced function
Per-step durationperf_counter delta per step2 decimal msWall-clock time of each individual tool or operation
Timestampsdatetime.now(utc)ISO 8601Absolute wall-clock time for ordering and correlation

The slow_request flag is set to true when total latency exceeds 1500ms. This threshold is configurable and used by the dashboard to highlight slow traces. Step durations are independently flagged as slow (red) when they exceed the same threshold.

How Failures Appear in the Timeline

Failures and retries produce distinct patterns in the execution timeline:

Timeline with retryCopy
text
  agent:start
  ├── ✓  query.embed              180ms  OK
  ├── ✓  vector.search            310ms  OK
  ├── ✗  tool_schema_lookup       250ms  RETRY    ← attempt 1, failed
  ├── ✓  retry_guard              150ms  OK       ← backoff + retry decision
  ├── ✓  tool_schema_lookup       280ms  OK       ← attempt 2, succeeded
  ├── ✓  response_generation     2450ms  OK
  └── ✓  done

  Retries: 1
  Status:  WARNING

When a trace has retries, the status becomes "warning" and the retry count is incremented. The timeline shows each attempt as a separate step with its own duration and success flag. The step detail for a failed attempt includes the error message in the output field.

Warning

Retry steps are counted in _infer_retry_count() by detecting duplicate tool names. A tool that appears more than once in the step list is counted as having been retried.

Timeline in the Dashboard

The dashboard renders the execution timeline as an interactive trace list and detail view. Each trace appears with its timestamp, status, model, latency, and token count. Clicking a trace opens the inspector, which displays the full step list with timing, input, and output for every step.

Dashboard timeline screenshot placeholder

The dashboard also provides an analytics view that aggregates timeline data across all traces: average latency, P95 latency, error rates, token usage trends, and breakdowns by model, project, and environment.

Timeline vs. Steps

It is important to distinguish between the execution timeline and the step list, though they are closely related:

DimensionTimelineStep List
DefinitionThe complete lifecycle of the traced function (6 stages)The ordered list of tool operations within that lifecycle
IncludesContext resolution, latency measurement, persistence, broadcastTool name, input, output, duration, success, timestamp
GranularityTwo measurements: total latency and per-stagePer-step measurements with independent timing
VisibilityExposed through the live event stream and trace summaryExposed through the step detail panel and dashboard
PersistenceCaptured implicitly in the trace document (latency, timestamps)Stored explicitly as the steps array in the trace document