OpenAI Example

Trace real OpenAI chat completion calls.

Overview

This example traces a real OpenAI chat completion and captures the full request-response cycle: prompt, response content, model name, latency, and token usage, plus streaming chunks for streamed calls. The wrap_openai function monkey-patches the OpenAI client so that every chat.completions.create call is traced automatically.
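To make the monkey-patching idea concrete, here is a minimal sketch of how a wrapper of this kind could work. It is not TraceLLM's actual implementation: wrap_client_sketch, the records list, and the stand-in client are all hypothetical names introduced for illustration, and the stand-in lets the sketch run without network access or an API key.

```python
import time
from types import SimpleNamespace


def wrap_client_sketch(client, records):
    """Patch client.chat.completions.create so each call is recorded.

    Hypothetical illustration only; the real wrap_openai may work differently.
    """
    completions = client.chat.completions
    original_create = completions.create

    def traced_create(*args, **kwargs):
        start = time.perf_counter()
        status = "ERROR"
        try:
            response = original_create(*args, **kwargs)
            status = "OK"
            return response
        finally:
            # Record one step per call, whether it succeeded or raised.
            records.append({
                "tool": "openai_chat",
                "model": kwargs.get("model"),
                "duration_ms": (time.perf_counter() - start) * 1000,
                "status": status,
            })

    # Shadow the original method on this instance with the traced version.
    completions.create = traced_create
    return client


def _fake_create(**kwargs):
    # Stand-in for the real API call; returns a response-shaped object.
    message = SimpleNamespace(content="stub reply")
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

records = []
wrapped = wrap_client_sketch(fake_client, records)
reply = wrapped.chat.completions.create(model="gpt-4.1-mini", messages=[])
print(reply.choices[0].message.content, records[0]["status"])
```

The caller keeps using the same client object and the same method name, which is why code written against the plain OpenAI client does not need to change after wrapping.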

Code

openai_example.py
python
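import os
import sys

# Make the parent directory importable so the example can find the tracellm
# package when it is run from a source checkout.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

from openai import OpenAI
from tracellm import trace
from tracellm.integrations.openai import wrap_openai


# OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()
# After wrapping, every chat.completions.create call on this client is traced.
client = wrap_openai(client)


@trace(project="openai-demo", environment="development")
def ask_llm(prompt: str) -> str:
    """Send a single-turn prompt and return the model's reply text."""
    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=500,
        temperature=0.7,
    )
    # message.content can be None, so fall back to an empty string to keep
    # the declared return type and the len() call below safe.
    return response.choices[0].message.content or ""


if __name__ == "__main__":
    result = ask_llm(
        "Explain how transformer attention works in three sentences."
    )
    print(f"\nResponse received ({len(result)} chars)")
    print(result)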
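
The example above makes a non-streaming call. For the streamed case, a request might look like the sketch below. It uses the OpenAI client's stream=True option and reads each chunk's delta content; stream_answer and the stand-in client are hypothetical names added for illustration, and how TraceLLM records the individual chunks is not shown in this page.

```python
from types import SimpleNamespace


def stream_answer(client, prompt, model="gpt-4.1-mini"):
    """Request a streamed completion and join the text deltas."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


def _fake_chunk(text):
    # Stand-in for a streamed chunk, shaped like the OpenAI chunk objects.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def _fake_create(**kwargs):
    # Stand-in for the API call so the sketch runs offline.
    return iter([_fake_chunk("Hel"), _fake_chunk(None), _fake_chunk("lo")])


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

streamed = stream_answer(fake_client, "Say hello")
print(streamed)  # Hello
```

With a real client, pass the wrapped client from the example instead of fake_client.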

Tip

Run export OPENAI_API_KEY="sk-..." before running the example, and make sure the TraceLLM stack is running (tracellm start) so that traces are persisted.
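
A small pre-flight check can catch a missing key before any request is sent. This is a sketch only: api_key_is_set is a hypothetical helper, not part of TraceLLM or the OpenAI library.

```python
import os


def api_key_is_set(env=None):
    """Return True if OPENAI_API_KEY is present and not blank.

    Hypothetical helper; pass a dict as env to check something other
    than the real process environment.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not api_key_is_set():
    print("OPENAI_API_KEY is not set; export it before running the example.")
```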

Expected Output

Console output
text
  ╭── TraceLLM Trace ───────────────────────────────── SUCCESS ──╮
  │                                                              │
  │  Trace ID     tr_f9e2a1b7                                    │
  │  Prompt       Explain how transformer attention works in     │
  │               three sentences.                               │
  │  Model        gpt-4.1-mini                                   │
  │  Project      openai-demo                                    │
  │  Environment  development                                    │
  │  Latency      1,873.42 ms                                    │
  │  Token Count  142                                            │
  │  Retries      0                                              │
  │  Steps        1                                              │
  │  Status        SUCCESS                                       │
  │                                                              │
  ╰──────────────────────────────────────────────────────────────╯

  #  Tool              Duration  Status  Detail
  1  openai_chat         1873ms     OK

Response received (486 chars)
Transformer attention works by computing three vectors — Query, Key,
and Value — from each input token. It calculates attention scores by
taking the dot product of every Query with every Key, then applies a
softmax to produce a probability distribution. These scores determine
how much each token contributes to the output, allowing the model to
focus on relevant parts of the input when generating each token.
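
To cross-check the Token Count shown in the trace, the usage block on the OpenAI response can be read directly. The sketch below assumes the trace's count corresponds to usage.total_tokens, which this page does not confirm; summarize_usage is a hypothetical helper and the numbers in the stand-in response are placeholders, not real measurements.

```python
from types import SimpleNamespace


def summarize_usage(response):
    """Return the token counts from a chat completion response, or None."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return {
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }


# Stand-in response with placeholder numbers so the sketch runs offline.
fake_response = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=10, completion_tokens=20, total_tokens=30)
)

usage_summary = summarize_usage(fake_response)
print(usage_summary["total_tokens"])  # 30
```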

Dashboard Result

Open http://localhost:3000/traces to see the trace in the dashboard:

Dashboard UI
text
TraceLLM Dashboard  >  Traces

  Status  Trace ID        Prompt                                      Model         Latency    Tokens    Time
  ─────── ─────────────── ─────────────────────────────────────────── ───────────── ────────── ──────── ─────────────────────
  ● Success  tr_f9e2a1b7  Explain how transformer attention works in  gpt-4.1-mini  1,873 ms   142      2026-05-31 14:22:10
                          three sentences.

  > Clicking the trace opens the detail view:

  ┌─ tr_f9e2a1b7 ───────────────────────────────────────────── [Success] ─┐
  │  Model  gpt-4.1-mini  |  Latency  1,873 ms  |  Tokens  142           │
  │  Retries  0  |  Steps  1  |  At  2026-05-31 14:22:10                  │
  └───────────────────────────────────────────────────────────────────────┘

  ┌─ Step Timeline ───────────────────────────────────────────────────────┐
  │                                                                       │
  │    openai_chat ─────────────────────────────────────── 1,873ms  OK    │
  │                                                                       │
  └───────────────────────────────────────────────────────────────────────┘

  ┌─ Prompt ────────────────┐  ┌─ Response ───────────────────────────────┐
  │ Explain how transformer │  │ Transformer attention works by computing │
  │ attention works in      │  │ three vectors — Query, Key, and Value —  │
  │ three sentences.        │  │ from each input token...                  │
  └─────────────────────────┘  └──────────────────────────────────────────┘

Replay Result

Use the CLI to replay the trace step-by-step:

terminal
bash
tracellm replay tr_f9e2a1b7

Replay output
text
╭────────────────── Replaying execution timeline... ──────────────────╮
│                                                                      │
│  ╭─ Replay ───────────────────────────────────────────────────────╮ │
│  │                                                                 │ │
│  │  trace_id  tr_f9e2a1b7                                         │ │
│  │  status    SUCCESS                                              │ │
│  │  latency   1873.42 ms                                           │ │
│  │  retries   0                                                    │ │
│  │  steps     1                                                    │ │
│  │                                                                 │ │
│  ╰─────────────────────────────────────────────────────────────────╯ │
│                                                                      │
│  ╭─ Step 1/1 ────────────────────────────────╮                      │
│  │                                           │                      │
│  │  step     1/1                             │                      │
│  │  tool     openai_chat                     │                      │
│  │  duration 1873 ms                         │                      │
│  │  status   OK                              │                      │
│  │  input    {'model': 'gpt-4.1-mini', ...}  │                      │
│  │  output   {'content': 'Transformer att... │                      │
│  │                                           │                      │
│  ╰───────────────────────────────────────────╯                      │
╰──────────────────────────────────────────────────────────────────────╯

  ╭── TraceLLM Trace ───────────────────────────── SUCCESS ──╮
  │                                                          │
  │  Trace ID     tr_f9e2a1b7                                │
  │  Prompt       Explain how transformer attention...       │
  │  Model        gpt-4.1-mini                               │
  │  Project      openai-demo                                │
  │  Environment  development                                │
  │  Latency      1,873.42 ms                                │
  │  Token Count  142                                        │
  │  Retries      0                                          │
  │  Steps        1                                          │
  │  Status        SUCCESS                                   │
  │                                                          │
  ╰──────────────────────────────────────────────────────────╯

Replay complete
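
If the replay needs to be triggered from a script rather than typed by hand, the same CLI command can be launched with subprocess. This is a sketch under the assumption that the tracellm executable is on PATH; build_replay_command and replay are hypothetical helper names, and only the tracellm replay command itself comes from this page.

```python
import shutil
import subprocess


def build_replay_command(trace_id):
    """Build the argv list for `tracellm replay <trace_id>`."""
    if not trace_id or any(ch.isspace() for ch in trace_id):
        raise ValueError("trace_id must be a non-empty string without spaces")
    return ["tracellm", "replay", trace_id]


def replay(trace_id):
    """Run the replay and return its stdout, or None if the CLI is missing."""
    if shutil.which("tracellm") is None:
        return None
    completed = subprocess.run(
        build_replay_command(trace_id),
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout


print(build_replay_command("tr_f9e2a1b7"))
```

Passing the arguments as a list avoids shell quoting issues with the trace ID.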