
Overview

The step() function creates a new step to track an individual LLM call within a run. Steps capture detailed information about model interactions, including prompts, responses, token usage, costs, and tool definitions.

Function Signature

from contextcompany import step

step(
    run_id: str,
    step_id: Optional[str] = None,
    api_key: Optional[str] = None,
    tcc_url: Optional[str] = None,
) -> Step

Parameters

run_id
str
required
The run ID that this step belongs to. Get this from r.run_id.
step_id
str
Unique identifier for this step. If not provided, a UUID will be automatically generated.
api_key
str
Observatory API key. If not provided, uses the TCC_API_KEY environment variable.
tcc_url
str
Custom Observatory endpoint URL. If not provided, uses the TCC_URL environment variable or defaults to production.
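Both credentials can come from the environment instead of being passed explicitly. A minimal sketch of configuring them before any step() call (the key and URL values here are placeholders, not real endpoints):

```python
import os

# step() falls back to these variables when api_key / tcc_url are omitted
os.environ["TCC_API_KEY"] = "sk-example-placeholder"          # placeholder key
os.environ["TCC_URL"] = "https://observatory.example.com"     # hypothetical self-hosted endpoint

# Later, steps can be created with no explicit credentials:
# s = step(run_id=r.run_id)
```

Setting `TCC_URL` is only needed for self-hosted deployments; omitting it targets production.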

Returns

Returns a Step object with the following methods:

prompt()

Set the prompt sent to the LLM:
s.prompt(text: str) -> Step
text
str
required
The prompt text or serialized messages array sent to the LLM
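The text argument accepts either a plain string or a serialized messages array. A sketch of the latter, assuming OpenAI-style role/content messages:

```python
import json

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this text..."},
]

# Serialize the messages array to the single string s.prompt() expects
prompt_text = json.dumps(messages)
```

The serialized form round-trips losslessly, so the original message structure can be recovered downstream.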

response()

Set the response from the LLM:
s.response(text: str) -> Step
text
str
required
The response text from the LLM

model()

Set the model information:
s.model(
    requested: Optional[str] = None,
    used: Optional[str] = None
) -> Step
requested
str
The model that was requested (e.g., "gpt-4")
used
str
The model that was actually used (e.g., "gpt-4-0613")

finish_reason()

Set the reason why the LLM stopped generating:
s.finish_reason(reason: str) -> Step
reason
str
required
Finish reason: "stop", "length", "tool_calls", "content_filter", etc.

tokens()

Set token usage information:
s.tokens(
    prompt_uncached: Optional[int] = None,
    prompt_cached: Optional[int] = None,
    completion: Optional[int] = None
) -> Step
prompt_uncached
int
Number of uncached prompt tokens
prompt_cached
int
Number of cached prompt tokens
completion
int
Number of completion tokens generated

cost()

Set the actual cost of this LLM call:
s.cost(real_total: float) -> Step
real_total
float
required
Total cost in dollars (e.g., 0.002 for $0.002)
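s.cost() records a dollar amount you compute (or read from your provider's billing data) yourself. A sketch of deriving it from the same counts passed to s.tokens(), using hypothetical per-million-token prices; check your provider's actual rates:

```python
# Hypothetical prices in dollars per million tokens -- illustrative only
PRICE_PER_M = {
    "prompt_uncached": 30.00,
    "prompt_cached": 15.00,   # cached input is often discounted
    "completion": 60.00,
}

def estimate_cost(prompt_uncached: int, prompt_cached: int, completion: int) -> float:
    """Estimate total cost in dollars from token counts."""
    return (
        prompt_uncached * PRICE_PER_M["prompt_uncached"]
        + prompt_cached * PRICE_PER_M["prompt_cached"]
        + completion * PRICE_PER_M["completion"]
    ) / 1_000_000

total = estimate_cost(prompt_uncached=25, prompt_cached=0, completion=10)
# total would then be recorded with s.cost(total)
```

Prefer the actual billed amount from the provider response when it is available; an estimate like this is a fallback.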

tool_definitions()

Set the tool/function definitions provided to the LLM:
s.tool_definitions(definitions: str) -> Step
definitions
str
required
JSON string or text representation of tool definitions

status()

Set the status code and optional message:
s.status(
    code: int,
    message: Optional[str] = None
) -> Step
code
int
required
Status code: 0 = success, 1 = partial success, 2 = error
message
str
Human-readable status message

tool_call()

Create a child tool call:
s.tool_call(
    tool_name: Optional[str] = None,
    tool_call_id: Optional[str] = None
) -> ToolCall
See tool_call() for full documentation.

end()

Finalize and send the step data:
s.end() -> None
You must call both s.prompt() and s.response() before calling s.end(), or a ValueError will be raised.

error()

Mark the step as failed and send immediately:
s.error(status_message: str = "") -> None
status_message
str
Error message describing what went wrong

Usage Examples

Creating Steps from a Run

from contextcompany import run

r = run()

# Create a step from the run object (recommended)
s = r.step()
s.prompt("Summarize this text...")
s.response("Here is a summary...")
s.end()

r.prompt(user_prompt="User query")
r.response("Final response")
r.end()

Creating Steps Independently

from contextcompany import run, step

r = run()

# Create a step using the run_id
s = step(run_id=r.run_id)
s.prompt("Analyze this data...")
s.response("Based on the analysis...")
s.end()

r.prompt(user_prompt="User query")
r.response("Final response")
r.end()

Complete Step with All Fields

from contextcompany import run
import json

r = run()
s = r.step()

# Set prompt and response
s.prompt(json.dumps([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"}
]))
s.response("The answer is 4.")

# Set model information
s.model(requested="gpt-4", used="gpt-4-0613")

# Set token usage
s.tokens(
    prompt_uncached=25,
    prompt_cached=0,
    completion=10
)

# Set cost
s.cost(0.001)

# Set finish reason
s.finish_reason("stop")

s.end()

r.prompt(user_prompt="What is 2+2?")
r.response("The answer is 4.")
r.end()

Step with Tool Definitions

from contextcompany import run
import json

r = run()
s = r.step()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

s.prompt("What's the weather in SF?")
s.tool_definitions(json.dumps(tools))
s.response("I'll check the weather for you.")
s.finish_reason("tool_calls")
s.end()

r.prompt(user_prompt="What's the weather?")
r.response("It's sunny!")
r.end()

Multiple Steps in Sequence

from contextcompany import run

r = run()

# First step: planning
s1 = r.step()
s1.prompt("Plan how to solve this problem...")
s1.response("I will first analyze the data, then...")
s1.model(requested="gpt-4", used="gpt-4")
s1.tokens(prompt_uncached=50, completion=30)
s1.end()

# Second step: execution
s2 = r.step()
s2.prompt("Execute the plan...")
s2.response("Analysis complete. Results show...")
s2.model(requested="gpt-4", used="gpt-4")
s2.tokens(prompt_uncached=100, completion=75)
s2.end()

# Finalize run
r.prompt(user_prompt="Solve this problem")
r.response("Problem solved successfully.")
r.end()

Error Handling

from contextcompany import run

r = run()
s = r.step()

s.prompt("Generate a summary...")

try:
    response = call_llm()
    s.response(response)
    s.end()
except Exception as e:
    s.error(f"LLM call failed: {str(e)}")

# Continue with run even if step failed
r.prompt(user_prompt="User query")
r.response("Unable to complete due to error.")
r.status(2, "Error in LLM call")
r.end()

Tracking Cached vs Uncached Tokens

from contextcompany import run

r = run()
s = r.step()

s.prompt("Reuse previous context...")
s.response("Using cached context...")

# Track cache usage
s.tokens(
    prompt_uncached=10,   # Only 10 new tokens
    prompt_cached=500,    # 500 tokens from cache
    completion=50
)

# Significant cost savings from caching
s.cost(0.0005)  # Much lower cost than without cache

s.end()

r.prompt(user_prompt="User query")
r.response("Response")
r.end()
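To see why the cached/uncached split matters, here is a back-of-the-envelope comparison using the token counts above and hypothetical prices (cached input assumed at a 50% discount; these are not real rates):

```python
# Hypothetical per-million-token prices -- illustrative only
UNCACHED_PRICE = 30.00    # $/M uncached prompt tokens
CACHED_PRICE = 15.00      # $/M cached prompt tokens (assumed 50% discount)
COMPLETION_PRICE = 60.00  # $/M completion tokens

def call_cost(uncached: int, cached: int, completion: int) -> float:
    """Total dollar cost of one call under the prices above."""
    return (
        uncached * UNCACHED_PRICE
        + cached * CACHED_PRICE
        + completion * COMPLETION_PRICE
    ) / 1_000_000

with_cache = call_cost(10, 500, 50)     # 10 new + 500 cached prompt tokens
without_cache = call_cost(510, 0, 50)   # same prompt, nothing cached
savings = without_cache - with_cache
```

Under these assumed prices the cached call costs roughly 40% less, which is why recording both counts separately is worthwhile.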

Best Practices

  1. Always set prompt and response: Both are required before calling s.end().
  2. Track token usage: Use s.tokens() to monitor and optimize token consumption.
  3. Record actual costs: Use s.cost() to track real spending, not just estimated costs.
  4. Use tool_definitions: When using function calling, record the tools provided to the LLM.
  5. Handle errors gracefully: Use s.error() to capture and report failures without breaking your agent.
  6. Create steps from runs: Use r.step() instead of step(run_id=...) for cleaner code.
