Common LLM Issues
Errors & Timeouts
API failures, rate limits, timeouts, and provider outages
Quality Issues
Wrong answers, inconsistent outputs, hallucinations, and context loss
Performance Problems
Slow responses, high latency, and token inefficiency
Cost Overruns
Unexpected spending, inefficient prompts, and model selection
Debugging Workflow
Filter by Status Codes
Start by identifying failed requests using status code filters.
Common status codes:
- 200 - Success
- 400 - Bad request (malformed input)
- 401 - Authentication failed
- 429 - Rate limit exceeded
- 500 - Provider error
- 503 - Provider unavailable
Inspect Request Details
Click any request to see its complete details.
Key information available:
- Full request body - Exact prompt and parameters sent
- Complete response - What the model returned
- Timing breakdown - Where latency occurred
- Token usage - Input/output token counts
- Cost - Exact cost of this request
- Custom properties - Your metadata for filtering
Use Playground for Testing
Test fixes immediately without redeploying code.
The Playground allows you to:
- Modify the prompt and see new results
- Change model parameters (temperature, max tokens)
- Switch models to compare outputs
- Test different approaches quickly
Currently, only OpenAI models are supported in the Playground
Debugging Specific Issues
API Errors & Rate Limits
When you see 429 or 500 errors:
- Implement Retries
- Add Rate Limiting
- Use Fallback Providers
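Retries and fallback can be sketched together as follows; `call_model` and `TransientError` are hypothetical stand-ins for your client call and the exception you raise on 429/500/503 (rate limiting follows the same shape with a token bucket):

```python
import random
import time

class TransientError(Exception):
    """Raised on retryable provider failures (429, 500, 503)."""

def call_with_retries(call_model, max_retries=3, base_delay=1.0):
    """Retry transient failures with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call_model()
        except TransientError:
            if attempt == max_retries:
                raise  # out of retries: surface the error
            # Backoff doubles each attempt (base, 2x, 4x, ...) plus jitter.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))

def call_with_fallback(primary, fallback, **kwargs):
    """Try the primary provider first; switch if it keeps failing."""
    try:
        return call_with_retries(primary, **kwargs)
    except TransientError:
        return call_with_retries(fallback, **kwargs)
```

Jitter matters: without it, many clients that hit a rate limit at the same moment retry at the same moment and hit it again.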
Quality Issues
When responses are wrong or inconsistent:
Compare Across Sessions
Tag requests with custom properties to identify patterns, then filter in the dashboard to see:
- Do technical queries fail more often?
- Are premium users having different issues?
- Which features have the most quality problems?
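A minimal sketch of property tagging, assuming a header-based convention; the `X-Property-*` names are illustrative, not a specific tool's API — substitute whatever your observability tool expects:

```python
def request_properties(feature, user_tier, environment):
    """Build metadata headers to attach to each LLM request.

    The "X-Property-*" header names are placeholders for the
    convention your logging tool defines.
    """
    return {
        "X-Property-Feature": feature,
        "X-Property-User-Tier": user_tier,
        "X-Property-Environment": environment,
    }

# Example: tag a support-bot request from a premium user in production,
# then pass these headers with the API call.
headers = request_properties("support-bot", "premium", "production")
```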
Track Model Versions
Tag requests with model versions to compare quality. This helps you:
- A/B test prompt changes
- Track quality regressions
- Identify which version works best
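One way to drive such an A/B test is deterministic version assignment, so each user always sees the same variant; the version names and prompts below are placeholders:

```python
import hashlib

# Hypothetical prompt variants under comparison.
PROMPT_VERSIONS = {
    "v2.2": "You are a helpful assistant.",
    "v2.3": "You are a concise, helpful assistant. Cite sources.",
}

def assign_version(user_id, versions=("v2.2", "v2.3")):
    """Deterministically bucket a user into a prompt version.

    Hashing the user ID keeps assignment stable across sessions,
    so version tags on logged requests stay consistent per user.
    """
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % len(versions)
    return versions[bucket]
```

Tag each logged request with the assigned version so the dashboard can split quality metrics by variant.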
Use Score Tracking
Add quality scores to track improvements:
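Scoring APIs vary by tool, so here is a tool-agnostic sketch of collecting per-version quality scores in memory:

```python
from collections import defaultdict
from statistics import mean

class ScoreTracker:
    """In-memory sketch: collect 0-100 quality scores per prompt version."""

    def __init__(self):
        self._scores = defaultdict(list)

    def add(self, version, score):
        self._scores[version].append(score)

    def average(self, version):
        return mean(self._scores[version])

# Record scores (e.g., from user feedback or an eval) against a version.
tracker = ScoreTracker()
tracker.add("v2.3", 90)
tracker.add("v2.3", 70)
# tracker.average("v2.3") -> 80
```

In practice you would send the score to your observability tool alongside the request ID rather than keep it in memory.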
Performance Problems
When responses are slow:
- Analyze Latency
- Optimize Token Usage
- Use Faster Models
Check the request details for timing breakdown:
- Queue time - How long before processing started
- Processing time - Model inference time
- Network time - Transfer latency
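Client-side, you can at least separate time-to-first-token from total time when streaming; `chunks` below is a stand-in for a streaming response iterator:

```python
import time

def timed_stream(chunks):
    """Measure time-to-first-token and total time for a streamed response.

    `chunks` is any iterable of response text chunks; here it stands in
    for a streaming API call.
    """
    start = time.perf_counter()
    first_token_at = None
    collected = []
    for chunk in chunks:
        if first_token_at is None:
            # Long time-to-first-token points at queueing or prompt
            # processing; a long tail after it points at generation length.
            first_token_at = time.perf_counter() - start
        collected.append(chunk)
    total = time.perf_counter() - start
    return "".join(collected), first_token_at, total
```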
Cost Overruns
When costs are higher than expected:
- Filter by feature to find expensive operations
- Check session costs to see complete workflows
- Review token usage to identify inefficient prompts
- Compare model costs to find cheaper alternatives
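A back-of-the-envelope cost check from token counts makes the model comparison concrete; the per-1K-token prices below are illustrative and change over time, so verify them against your provider's pricing page:

```python
# Illustrative per-1K-token prices (USD); check current provider pricing.
PRICES = {
    "gpt-4o":      {"input": 0.0025,  "output": 0.0100},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
}

def request_cost(model, input_tokens, output_tokens):
    """Estimate the dollar cost of one request from its token usage."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# Comparing the same workload across models shows where savings come from:
# request_cost("gpt-4o", 2000, 500) vs request_cost("gpt-4o-mini", 2000, 500)
```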
Advanced Debugging Techniques
Custom Request IDs
Use predictable IDs to correlate with your own logs.
Property-Based Filtering
Tag requests with rich metadata for powerful filtering:
- “Show me production errors for premium users”
- “Compare v2.3 vs v2.2 response times”
- “Which A/B test variant has better quality?”
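Predictable request IDs can be sketched as plain string construction; the ID format is an assumption, and the header you send it in depends on your tool:

```python
import uuid

def make_request_id(session_id, turn):
    """Build a predictable request ID (illustrative format: "<session>-turn-<n>").

    Writing the same ID to your application logs and sending it with the
    LLM request lets you jump from a dashboard entry straight to the
    matching log lines.
    """
    return f"{session_id}-turn-{turn}"

session = uuid.uuid4().hex[:8]
request_id = make_request_id(session, 3)
```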
Session Replay
Replay entire sessions to reproduce issues:
- Find the problematic session in the dashboard
- Click “Replay Session”
- View the exact sequence of requests
- Test fixes against the same inputs
Session replay is especially useful for debugging multi-turn conversations where context matters.
Debugging Checklist
When investigating an issue:
- Check status codes for obvious errors
- Review request/response in detail
- Test fixes in Playground
- Look at session context if multi-turn
- Filter by custom properties to find patterns
- Compare with working requests
- Check timing breakdown for performance
- Review token usage for cost issues
- Add more logging for future debugging
Proactive Debugging
Prevent issues before they happen:
Set Up Alerts
Add Comprehensive Logging
Monitor Key Metrics
Track these metrics weekly:
- Error rate - Should stay below 2%
- P95 latency - Should be under 3 seconds
- Average cost per session - Watch for increases
- Cache hit rate - Should be above 50% for cacheable content
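Two of these metrics can be computed directly from logged request records; the record shape here is an assumption:

```python
from math import ceil

def weekly_metrics(requests):
    """Compute error rate and P95 latency from logged request records.

    Each record is assumed to look like {"status": 200, "latency_s": 1.2}.
    """
    errors = sum(1 for r in requests if r["status"] >= 400)
    error_rate = errors / len(requests)
    latencies = sorted(r["latency_s"] for r in requests)
    # Nearest-rank P95: the latency below which 95% of requests fall.
    p95 = latencies[ceil(0.95 * len(latencies)) - 1]
    return {"error_rate": error_rate, "p95_latency_s": p95}
```

Alerting on these two numbers catches most regressions before users report them.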
Debugging Tools Reference
Request Filters
Filter by status, model, properties, and more
Sessions
Track multi-turn conversations and workflows
Custom Properties
Add metadata for powerful filtering
Alerts
Get notified of issues immediately
Next Steps
Agent Tracing
Debug complex agent workflows with tool calls
Cost Tracking
Identify and optimize expensive operations
Experiments
A/B test fixes before deploying to production
