Goose includes comprehensive observability through OpenTelemetry (OTel), providing traces, metrics, and logs for production monitoring and debugging.
Overview
Goose’s telemetry system provides:
- Distributed tracing: Track requests across agents, providers, and extensions
- Metrics: Monitor performance, token usage, and error rates
- Structured logs: Debug issues with contextual information
- Flexible exporters: OTLP, console, or custom backends
Quick Start
Enable OpenTelemetry
# Set OTLP endpoint
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
# Run Goose
goose session start
Telemetry data is automatically sent to your OTLP collector.
Local Testing with Console Exporter
# Output to stdout for debugging
export OTEL_TRACES_EXPORTER=console
export OTEL_METRICS_EXPORTER=console
export OTEL_LOGS_EXPORTER=console
goose session start
Configuration
Environment Variables
Global Settings
| Variable | Default | Description |
|---|---|---|
| `OTEL_SDK_DISABLED` | `false` | Disable all telemetry |
| `OTEL_SERVICE_NAME` | `goose` | Service name in traces |
| `OTEL_RESOURCE_ATTRIBUTES` | - | Additional resource attributes |
Endpoint Configuration
# All signals
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
# Per-signal endpoints
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces
export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://localhost:4318/v1/metrics
export OTEL_EXPORTER_OTLP_LOGS_ENDPOINT=http://localhost:4318/v1/logs
Signal-Specific Settings
# Enable/disable individual signals
export OTEL_TRACES_EXPORTER=otlp # or "console", "none"
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
# Timeout
export OTEL_EXPORTER_OTLP_TIMEOUT=10000 # milliseconds
# Metrics temporality (cumulative, delta, lowmemory)
export OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE=delta
# Log level
export OTEL_LOG_LEVEL=info # trace, debug, info, warn, error
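The temporality preference above controls how counter values are reported: cumulative exports running totals since start, while delta exports only the increment since the last export. A minimal sketch of the difference, with illustrative values rather than actual Goose metrics:

```rust
// Convert cumulative counter readings into per-interval deltas.
fn to_delta(cumulative: &[u64]) -> Vec<u64> {
    cumulative.windows(2).map(|w| w[1] - w[0]).collect()
}

fn main() {
    // Cumulative readings at three export intervals: running totals.
    let cumulative = [100, 250, 400];
    // Delta temporality reports only the increment per interval.
    let delta = to_delta(&cumulative);
    assert_eq!(delta, vec![150, 150]);
    println!("cumulative: {:?} -> delta: {:?}", cumulative, delta);
}
```

Delta temporality is typically preferred by backends like Datadog; Prometheus-based pipelines expect cumulative.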
Configuration File
Set in ~/.config/goose/config.yaml:
# OTLP endpoint
otel_exporter_otlp_endpoint: http://localhost:4318
# Timeout (milliseconds)
otel_exporter_otlp_timeout: 10000
# Service name
otel_service_name: goose-production
# Resource attributes
otel_resource_attributes: deployment.environment=prod,team=ai
Configuration file settings are promoted to environment variables at startup. Environment variables take precedence.
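The precedence rule above can be sketched as a simple fallback chain; the function and its arguments are illustrative, not Goose's actual API:

```rust
// Environment variable (if set) wins over the config file entry.
fn resolve_endpoint(env_value: Option<String>, config_value: Option<&str>) -> Option<String> {
    env_value.or_else(|| config_value.map(String::from))
}

fn main() {
    let from_file = Some("http://localhost:4318");
    // Env var present: it takes precedence over the config file.
    assert_eq!(
        resolve_endpoint(Some("http://collector:4318".to_string()), from_file),
        Some("http://collector:4318".to_string())
    );
    // Env var absent: fall back to the config file value.
    assert_eq!(
        resolve_endpoint(None, from_file),
        Some("http://localhost:4318".to_string())
    );
}
```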
Signal Configuration
Traces
Traces track request flows through Goose:
# Enable traces
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces
Trace structure:
Session Start
├─ User Message
│ ├─ Provider Request (Anthropic/OpenAI)
│ │ └─ HTTP Call
│ ├─ Tool Call (developer__list_files)
│ │ └─ MCP Server Request
│ └─ Agent Response
└─ Session End
Trace attributes:
- `session.id`: Session identifier
- `provider.name`: AI provider (anthropic, openai, etc.)
- `model.name`: Model used
- `tool.name`: Tool invoked
- `extension.name`: MCP extension
Metrics
Metrics track performance and usage:
# Enable metrics
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://localhost:4318/v1/metrics
Built-in metrics:
- `goose.session.duration`: Session length (histogram)
- `goose.provider.tokens.input`: Input tokens per request (counter)
- `goose.provider.tokens.output`: Output tokens per request (counter)
- `goose.provider.latency`: Provider response time (histogram)
- `goose.tool.invocations`: Tool call count (counter)
- `goose.tool.duration`: Tool execution time (histogram)
- `goose.errors`: Error count by type (counter)
Logs
Structured logs with trace correlation:
# Enable logs
export OTEL_LOGS_EXPORTER=otlp
export OTEL_LOG_LEVEL=info
Log levels:
- `trace`: Very detailed debugging
- `debug`: Debugging information
- `info`: General informational messages
- `warn`: Warning messages
- `error`: Error messages
Log attributes:
- `trace_id`: Correlate with traces
- `span_id`: Specific span within trace
- `session_id`: Session identifier
- `module`: Rust module path
Integration Examples
Jaeger (Tracing)
# Run Jaeger all-in-one
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest
# Configure Goose
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_TRACES_EXPORTER=otlp
# View traces at http://localhost:16686
Prometheus + Grafana (Metrics)
# Run OpenTelemetry Collector
docker run -d --name otel-collector \
  -p 4318:4318 \
  -v $(pwd)/otel-config.yaml:/etc/otel-config.yaml \
  otel/opentelemetry-collector:latest \
  --config=/etc/otel-config.yaml
# Configure Goose
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_METRICS_EXPORTER=otlp
otel-config.yaml:
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
Elastic Stack (Full Observability)
# docker-compose.yml
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"
  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
  apm-server:
    image: docker.elastic.co/apm/apm-server:8.11.0
    ports:
      - "8200:8200"
    environment:
      - output.elasticsearch.hosts=["elasticsearch:9200"]
# Configure Goose
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200
export OTEL_TRACES_EXPORTER=otlp
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
Datadog
# Run Datadog Agent
docker run -d --name datadog-agent \
  -e DD_API_KEY=<your-api-key> \
  -e DD_SITE=datadoghq.com \
  -e DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT=0.0.0.0:4318 \
  -p 4318:4318 \
  datadog/agent:latest
# Configure Goose
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=goose
export OTEL_RESOURCE_ATTRIBUTES=env=production,version=1.0.0
Implementation Details
Source Code
- OTel setup: `crates/goose/src/otel/otlp.rs`
- Module: `crates/goose/src/otel/mod.rs`
- Initialization: called from the `goose-cli` and `goose-server` main functions
Initialization
From crates/goose/src/otel/otlp.rs:
pub fn init_otlp_layers(
    config: &crate::config::Config,
) -> Vec<Box<dyn Layer<Registry> + Send + Sync>> {
    let mut layers = Vec::new();

    // Create layers for enabled signals
    if let Ok(layer) = create_otlp_tracing_layer() {
        layers.push(layer.boxed());
    }
    if let Ok(layer) = create_otlp_metrics_layer() {
        layers.push(layer.boxed());
    }
    if let Ok(layer) = create_otlp_logs_layer() {
        layers.push(layer.boxed());
    }

    // Set up trace context propagation
    if !layers.is_empty() {
        global::set_text_map_propagator(TraceContextPropagator::new());
    }

    layers
}
Resource Attributes
Goose automatically includes:
Resource::builder_empty()
    .with_attributes([
        KeyValue::new("service.name", "goose"),
        KeyValue::new("service.version", env!("CARGO_PKG_VERSION")),
        KeyValue::new("service.namespace", "goose"),
    ])
    .with_detector(Box::new(EnvResourceDetector::new()))
    .build()
Additional attributes from OTEL_RESOURCE_ATTRIBUTES:
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=prod,team=ai,region=us-west"
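Per the OTel specification, this variable is a comma-separated list of `key=value` pairs. A minimal sketch of how such a string breaks down (the function is illustrative, not Goose's parser):

```rust
// Split "k1=v1,k2=v2,..." into key/value pairs, skipping malformed entries.
fn parse_resource_attributes(raw: &str) -> Vec<(String, String)> {
    raw.split(',')
        .filter_map(|pair| {
            let (k, v) = pair.split_once('=')?;
            Some((k.trim().to_string(), v.trim().to_string()))
        })
        .collect()
}

fn main() {
    let attrs = parse_resource_attributes("deployment.environment=prod,team=ai,region=us-west");
    assert_eq!(attrs.len(), 3);
    assert_eq!(attrs[0], ("deployment.environment".to_string(), "prod".to_string()));
    println!("{:?}", attrs);
}
```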
Signal Detection
Goose determines which signals to enable:
pub fn signal_exporter(signal: &str) -> Option<ExporterType> {
    // 1. Check OTEL_SDK_DISABLED
    if env::var("OTEL_SDK_DISABLED") == Ok("true".to_string()) {
        return None;
    }

    // 2. Check OTEL_{SIGNAL}_EXPORTER
    if let Ok(val) = env::var(&format!("OTEL_{}_EXPORTER", signal.to_uppercase())) {
        return match val.as_str() {
            "otlp" => Some(ExporterType::Otlp),
            "console" => Some(ExporterType::Console),
            _ => None,
        };
    }

    // 3. Check for OTLP endpoint
    if env::var(&format!("OTEL_EXPORTER_OTLP_{}_ENDPOINT", signal.to_uppercase())).is_ok()
        || env::var("OTEL_EXPORTER_OTLP_ENDPOINT").is_ok()
    {
        return Some(ExporterType::Otlp);
    }

    None
}
Filtering
Trace Filtering
By default, Goose captures:
- All spans at INFO level and above
- DEBUG level for Goose modules
Customize with RUST_LOG:
export RUST_LOG=debug,goose::providers=trace
Metrics Filtering
Metrics are captured for:
- INFO level and above
- Events marked with metric fields
Log Filtering
Log level determined by:
1. `RUST_LOG` (highest priority)
2. `OTEL_LOG_LEVEL`
3. Default: `info`
export OTEL_LOG_LEVEL=debug
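The precedence chain above amounts to a simple fallback; a sketch where the arguments stand in for the two environment variables (note that in practice `RUST_LOG` is a full filter spec, not just a level):

```rust
// RUST_LOG beats OTEL_LOG_LEVEL, which beats the built-in "info" default.
fn effective_log_level(rust_log: Option<&str>, otel_log_level: Option<&str>) -> String {
    rust_log.or(otel_log_level).unwrap_or("info").to_string()
}

fn main() {
    // Neither variable set: default applies.
    assert_eq!(effective_log_level(None, None), "info");
    // Only OTEL_LOG_LEVEL set.
    assert_eq!(effective_log_level(None, Some("debug")), "debug");
    // Both set: RUST_LOG wins.
    assert_eq!(effective_log_level(Some("trace"), Some("debug")), "trace");
}
```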
Custom Telemetry
Add custom spans and metrics in your extensions:
use tracing::{info, info_span, instrument};

#[instrument(name = "my_custom_operation", skip(data))]
async fn process_data(data: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Attach a nested span with contextual attributes
    let _span = info_span!("processing", data_len = data.len()).entered();
    // Your logic
    info!(operation = "process", "Processing started");
    Ok(())
}
Disabling Telemetry
Complete Disable
# Disable all OpenTelemetry
export OTEL_SDK_DISABLED=true
# Or disable individual signals
export OTEL_TRACES_EXPORTER=none
export OTEL_METRICS_EXPORTER=none
export OTEL_LOGS_EXPORTER=none
Custom Distributions
For custom Goose distributions, disable in code:
// Skip OTel initialization
let layers = vec![]; // Don't call init_otlp_layers
Performance
Approximate OpenTelemetry overhead:
- Console exporter: ~5-10% CPU
- OTLP exporter: ~2-5% CPU + network I/O
- Memory: ~10-50 MB for buffering
For production, use OTLP with batching (default) for minimal overhead.
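Batching is what keeps overhead low: grouping spans into batches of size B turns N individual export calls into ceil(N / B) network requests. A back-of-envelope sketch (512 is the standard OTel default for `OTEL_BSP_MAX_EXPORT_BATCH_SIZE`):

```rust
// Number of export requests needed for `spans` spans at a given batch size.
fn export_requests(spans: u64, batch_size: u64) -> u64 {
    // Integer ceiling division: ceil(spans / batch_size).
    (spans + batch_size - 1) / batch_size
}

fn main() {
    // 10,000 spans at the default batch size of 512 -> 20 requests.
    assert_eq!(export_requests(10_000, 512), 20);
    // Without batching, every span would be its own request.
    assert_eq!(export_requests(10_000, 1), 10_000);
}
```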
Troubleshooting
No telemetry data
# Verify endpoint is reachable
curl http://localhost:4318/v1/traces -X POST
# Check environment variables
env | grep OTEL
# Enable debug logging
export RUST_LOG=debug,opentelemetry=trace
Spans not correlating
Ensure trace context propagation:
# Goose sets this automatically, but verify
export OTEL_PROPAGATORS=tracecontext
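The `tracecontext` propagator carries trace identity in a W3C `traceparent` header of the form `version-traceid-spanid-flags`; if spans are not correlating, inspecting this header on outbound requests is a useful check. A sketch of its structure (stdlib-only, not Goose's propagation code):

```rust
// Extract the trace ID (32 hex chars) and span ID (16 hex chars)
// from a W3C traceparent header, rejecting malformed input.
fn parse_traceparent(header: &str) -> Option<(String, String)> {
    let parts: Vec<&str> = header.split('-').collect();
    if parts.len() != 4 || parts[1].len() != 32 || parts[2].len() != 16 {
        return None;
    }
    Some((parts[1].to_string(), parts[2].to_string()))
}

fn main() {
    // Example header from the W3C Trace Context specification.
    let header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
    let (trace_id, span_id) = parse_traceparent(header).unwrap();
    assert_eq!(trace_id, "4bf92f3577b34da6a3ce929d0e0e4736");
    assert_eq!(span_id, "00f067aa0ba902b7");
    assert!(parse_traceparent("not-a-traceparent").is_none());
}
```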
High memory usage
Reduce batch size:
export OTEL_BSP_MAX_QUEUE_SIZE=1024
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=256
Resources