Export your LLM request and response data from Helicone for analysis, backup, compliance, or migration to other systems.
Why Export Data
Common use cases:
Fine-tuning preparation: Export production data as training examples
Custom analytics: Analyze in your own BI tools (Tableau, PowerBI)
Compliance: Meet data retention and audit requirements
Backup: Keep local copies of critical data
Migration: Move data between systems or regions
Export Methods
Helicone provides three ways to export data:
NPM Tool: Command-line tool with resume support
REST API: Programmatic access for automation
Dashboard: Manual export via UI
Method 1: NPM Export Tool
The easiest and most reliable way to export large datasets.
Quick Start
# No installation required - use npx
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--start-date 2024-01-01 \
--end-date 2024-12-31 \
--limit 10000 \
--include-body
Features
Auto-Recovery: Resumes from the last checkpoint if interrupted
Retry Logic: Exponential backoff for transient failures
Progress Tracking: Real-time progress with ETA
Multiple Formats: JSON, JSONL, or CSV output
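The retry behavior can be approximated as follows. This is a minimal sketch of exponential backoff around a flaky async operation, not the tool's actual implementation; `withRetry`, `maxRetries`, and `baseDelayMs` are illustrative names:

```typescript
// Minimal exponential-backoff sketch: retry a flaky async operation,
// doubling the delay after each transient failure.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 5,
  baseDelayMs = 200
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // give up after maxRetries
      const delay = baseDelayMs * 2 ** attempt; // 200, 400, 800, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Example: an operation that succeeds on the third attempt.
let calls = 0;
const result = await withRetry(async () => {
  calls++;
  if (calls < 3) throw new Error("transient failure");
  return "ok";
}, 5, 1);
console.log(result, calls); // "ok" 3
```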
Common Usage Examples
Export all requests from a date range:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--start-date 2024-01-01 \
--end-date 2024-12-31 \
--format jsonl \
--output ./data/helicone-export.jsonl \
--include-body
Output:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Helicone Data Export Tool ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
Fetching total count...
Total records: 45,231
Exporting to: ./data/helicone-export.jsonl
Progress: [====================] 100% | 45,231/45,231 | ETA: 0s
✅ Export complete!
├── Records exported: 45,231
├── Output file: ./data/helicone-export.jsonl
├── File size: 1.2 GB
└── Duration: 3m 42s
Export a specific feature or environment:
# Export production data for a specific feature
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--property Environment=production \
--property Feature=chat \
--start-date 2024-12-01 \
--format csv \
--include-body
Multiple properties:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--property Environment=production \
--property Feature=document-analysis \
--property UserTier=premium \
--start-date 2024-01-01
If the export is interrupted, resume from the checkpoint:
# First run (interrupted at 30%)
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--start-date 2024-01-01 \
--limit 100000
# ^C (interrupted)
# Resume automatically
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--resume
# Continues from 30% (offset 30,000)
# Or clean state and restart
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--clean-state \
--start-date 2024-01-01 \
--limit 100000
Export from the EU region:
HELICONE_API_KEY="sk-xxx-eu" npx @helicone/export \
--region eu \
--start-date 2024-01-01 \
--include-body
Configuration Options
| Option | Description | Default | Example |
|---|---|---|---|
| --start-date | Start date (ISO 8601) | 30 days ago | 2024-01-01 |
| --end-date | End date (ISO 8601) | Now | 2024-12-31 |
| --limit | Max records to export | Unlimited | 10000 |
| --format | Output format | jsonl | json, jsonl, csv |
| --output | Output file path | helicone-export.* | ./data/export.jsonl |
| --include-body | Include request/response bodies | false | (flag) |
| --property | Filter by property | None | Environment=prod |
| --region | API region | us | us, eu |
| --batch-size | Records per API call | 1000 | 500 |
| --resume | Resume from checkpoint | false | (flag) |
| --clean-state | Clear checkpoint and restart | false | (flag) |
| --log-level | Logging verbosity | normal | quiet, verbose |
Method 2: REST API
For programmatic export and automation.
Basic Query
import fs from 'fs';

const HELICONE_API_KEY = process.env.HELICONE_API_KEY;

async function exportData(
  startDate: string,
  endDate: string,
  limit: number = 1000
) {
  const response = await fetch(
    "https://api.helicone.ai/v1/request/query-clickhouse",
    {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${HELICONE_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        filter: {
          request_response_rmt: {
            request_created_at: {
              gte: startDate,
              lte: endDate,
            },
          },
        },
        limit,
      }),
    }
  );

  const data = await response.json();
  return data.data;
}

// Export and save
const requests = await exportData(
  "2024-01-01T00:00:00Z",
  "2024-12-31T23:59:59Z",
  10000
);

fs.writeFileSync(
  "export.jsonl",
  requests.map(r => JSON.stringify(r)).join("\n")
);

console.log(`Exported ${requests.length} requests`);
Advanced Filtering
By Properties:
{
  "filter": {
    "request_response_rmt": {
      "properties": {
        "Environment": { "equals": "production" },
        "Feature": { "equals": "chat" }
      },
      "request_created_at": {
        "gte": "2024-01-01T00:00:00Z"
      }
    }
  },
  "limit": 1000
}
By User:
{
  "filter": {
    "request_response_rmt": {
      "user_id": { "equals": "user-123" },
      "request_created_at": {
        "gte": "2024-01-01T00:00:00Z"
      }
    }
  },
  "limit": 1000
}
By Model:
{
  "filter": {
    "request_response_rmt": {
      "model": { "equals": "gpt-4o" },
      "request_created_at": {
        "gte": "2024-01-01T00:00:00Z"
      }
    }
  },
  "limit": 1000
}
By Status:
// Only successful requests
{
  "filter": {
    "request_response_rmt": {
      "status": { "gte": 200, "lt": 300 },
      "request_created_at": {
        "gte": "2024-01-01T00:00:00Z"
      }
    }
  },
  "limit": 1000
}

// Only errors
{
  "filter": {
    "request_response_rmt": {
      "status": { "gte": 400 },
      "request_created_at": {
        "gte": "2024-01-01T00:00:00Z"
      }
    }
  },
  "limit": 1000
}
Paginated Export
To export all records, page through results with an offset until an empty batch is returned:
async function exportAllData(
  startDate: string,
  endDate: string
) {
  const allRequests = [];
  let offset = 0;
  const batchSize = 1000;

  while (true) {
    console.log(`Fetching batch at offset ${offset}...`);

    const response = await fetch(
      "https://api.helicone.ai/v1/request/query-clickhouse",
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${HELICONE_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          filter: {
            request_response_rmt: {
              request_created_at: {
                gte: startDate,
                lte: endDate,
              },
            },
          },
          limit: batchSize,
          offset,
        }),
      }
    );

    const data = await response.json();
    const batch = data.data;

    if (batch.length === 0) {
      break; // No more data
    }

    allRequests.push(...batch);
    offset += batch.length;
    console.log(`Total fetched: ${allRequests.length}`);

    // Respect rate limits
    await new Promise(resolve => setTimeout(resolve, 100));
  }

  return allRequests;
}

// Usage
const allData = await exportAllData(
  "2024-01-01T00:00:00Z",
  "2024-12-31T23:59:59Z"
);
console.log(`Exported ${allData.length} total requests`);
Method 3: Dashboard Export
Manual export for small datasets.
Apply Filters
Filter data to export:
Date range
Properties (Environment, Feature, etc.)
User ID
Model
Status
Export
Click “Export” button and choose format:
Dashboard export is limited to 10,000 records. For larger datasets, use the NPM tool or API.
Export Formats
JSONL (one JSON object per line):
{"request_id": "req_abc123", "created_at": "2024-01-15T10:30:00Z", "model": "gpt-4o", "prompt_tokens": 50, "completion_tokens": 100, "cost_usd": 0.015}
{"request_id": "req_def456", "created_at": "2024-01-15T10:31:00Z", "model": "gpt-4o-mini", "prompt_tokens": 30, "completion_tokens": 80, "cost_usd": 0.003}
Benefits:
Streamable (process line by line)
Efficient for large files
Easy to split/merge
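Because JSONL is line-oriented, an export can be processed one record at a time without loading the whole file into memory. A sketch using Node's built-in readline (the sample file and its fields are illustrative):

```typescript
import fs from "fs";
import os from "os";
import path from "path";
import readline from "readline";

// Write a tiny sample export so the example is self-contained.
const file = path.join(os.tmpdir(), "helicone-sample.jsonl");
fs.writeFileSync(
  file,
  [
    '{"model":"gpt-4o","cost_usd":0.015}',
    '{"model":"gpt-4o-mini","cost_usd":0.003}',
  ].join("\n")
);

// Stream the file one record at a time, accumulating total cost.
let totalCost = 0;
const rl = readline.createInterface({ input: fs.createReadStream(file) });
for await (const line of rl) {
  if (!line.trim()) continue; // skip blank lines
  const record = JSON.parse(line);
  totalCost += record.cost_usd;
}
console.log(totalCost.toFixed(3)); // 0.018
```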
JSON (array of objects):
[
  {
    "request_id": "req_abc123",
    "created_at": "2024-01-15T10:30:00Z",
    "model": "gpt-4o",
    "prompt_tokens": 50,
    "completion_tokens": 100,
    "cost_usd": 0.015
  },
  {
    "request_id": "req_def456",
    "created_at": "2024-01-15T10:31:00Z",
    "model": "gpt-4o-mini",
    "prompt_tokens": 30,
    "completion_tokens": 80,
    "cost_usd": 0.003
  }
]
CSV (comma-separated values):
request_id,created_at,model,prompt_tokens,completion_tokens,cost_usd
req_abc123,2024-01-15T10:30:00Z,gpt-4o,50,100,0.015
req_def456,2024-01-15T10:31:00Z,gpt-4o-mini,30,80,0.003
Best for:
Excel/Google Sheets
BI tools (Tableau, PowerBI)
Simple analysis
Included Fields
| Field | Description | Type |
|---|---|---|
| request_id | Unique request identifier | string |
| created_at | Timestamp (ISO 8601) | string |
| user_id | User identifier | string |
| model | Model name | string |
| prompt_tokens | Input tokens | number |
| completion_tokens | Output tokens | number |
| total_tokens | Total tokens | number |
| cost_usd | Cost in USD | number |
| latency | Response time (ms) | number |
| status | HTTP status code | number |
| properties | Custom properties | object |
| request_body | Request payload (if --include-body) | object |
| response_body | Response payload (if --include-body) | object |
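For downstream processing, the exported record can be modeled with a TypeScript interface built from the fields above. This is a sketch: which fields are optional is an assumption, and the type guard checks only a few key fields:

```typescript
// Shape of one exported record (sketch; optionality is assumed).
interface ExportedRequest {
  request_id: string;
  created_at: string; // ISO 8601
  user_id?: string;
  model: string;
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  cost_usd: number;
  latency: number; // ms
  status: number; // HTTP status code
  properties?: Record<string, string>;
  request_body?: unknown; // only with --include-body
  response_body?: unknown; // only with --include-body
}

// Lightweight sanity check when parsing untrusted export files.
function isExportedRequest(x: any): x is ExportedRequest {
  return (
    typeof x?.request_id === "string" &&
    typeof x?.model === "string" &&
    typeof x?.cost_usd === "number"
  );
}

const sample = JSON.parse(
  '{"request_id":"req_abc123","created_at":"2024-01-15T10:30:00Z","model":"gpt-4o","prompt_tokens":50,"completion_tokens":100,"total_tokens":150,"cost_usd":0.015,"latency":850,"status":200}'
);
console.log(isExportedRequest(sample)); // true
```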
Use Case Examples
Fine-Tuning Dataset
Export successful requests for training:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--property Task=sentiment-analysis \
--property Environment=production \
--start-date 2024-01-01 \
--format jsonl \
--include-body \
--output training-data.jsonl
# Post-process to OpenAI format
node convert-to-openai-format.js training-data.jsonl
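The conversion step might look like the following sketch. It assumes OpenAI-style chat bodies inside request_body/response_body and maps them to OpenAI's fine-tuning format ({ "messages": [...] }); your actual body shapes may differ:

```typescript
// Sketch: map one exported record to an OpenAI fine-tuning example.
// Assumes OpenAI-style chat request/response bodies (an assumption).
function toTrainingExample(record: {
  request_body?: { messages?: { role: string; content: string }[] };
  response_body?: {
    choices?: { message: { role: string; content: string } }[];
  };
}): { messages: { role: string; content: string }[] } | null {
  const prompt = record.request_body?.messages;
  const reply = record.response_body?.choices?.[0]?.message;
  if (!prompt || !reply) return null; // skip incomplete records
  return { messages: [...prompt, reply] };
}

const example = toTrainingExample({
  request_body: {
    messages: [{ role: "user", content: "Classify: great product!" }],
  },
  response_body: {
    choices: [{ message: { role: "assistant", content: "positive" } }],
  },
});
console.log(JSON.stringify(example));
```

Running the records through this mapper and writing one example per line produces a JSONL file ready for upload.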
Cost Analysis
Export for custom analytics:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
--start-date 2024-01-01 \
--end-date 2024-12-31 \
--format csv \
--output costs-2024.csv
# Import into Excel/Tableau for analysis
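As a starting point before reaching for a BI tool, per-model cost totals can be computed straight from the CSV. A sketch with inline sample rows (the naive split-based parsing is fine here because these fields contain no commas; quoted fields need a real CSV parser):

```typescript
// Aggregate cost per model from exported CSV rows.
const csv = `request_id,created_at,model,prompt_tokens,completion_tokens,cost_usd
req_abc123,2024-01-15T10:30:00Z,gpt-4o,50,100,0.015
req_def456,2024-01-15T10:31:00Z,gpt-4o-mini,30,80,0.003
req_ghi789,2024-01-15T10:32:00Z,gpt-4o,40,90,0.012`;

const [header, ...rows] = csv.trim().split("\n");
const cols = header.split(",");
const modelIdx = cols.indexOf("model");
const costIdx = cols.indexOf("cost_usd");

const costByModel: Record<string, number> = {};
for (const row of rows) {
  const fields = row.split(",");
  const model = fields[modelIdx];
  costByModel[model] = (costByModel[model] ?? 0) + Number(fields[costIdx]);
}
console.log(costByModel);
```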
Compliance Backup
Monthly backup for audit trail:
#!/bin/bash
# backup-monthly.sh
MONTH=$(date -d "last month" +%Y-%m)
START_DATE="${MONTH}-01T00:00:00Z"
END_DATE=$(date -d "${START_DATE} +1 month" +%Y-%m-%dT00:00:00Z)

HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date "$START_DATE" \
  --end-date "$END_DATE" \
  --format jsonl \
  --include-body \
  --output "backups/helicone-${MONTH}.jsonl"

# Compress the export for storage
gzip -f "backups/helicone-${MONTH}.jsonl"

echo "Backup complete for $MONTH"
User Data Export (GDPR)
Export all data for a specific user:
import fs from 'fs';

const response = await fetch(
  "https://api.helicone.ai/v1/request/query-clickhouse",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${HELICONE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      filter: {
        request_response_rmt: {
          user_id: { equals: "user-123" },
        },
      },
      limit: 100000,
    }),
  }
);

const userData = await response.json();

// Save for GDPR request
fs.writeFileSync(
  "user-123-data-export.json",
  JSON.stringify(userData.data, null, 2)
);
Best Practices
Use JSONL for large exports: More efficient than JSON arrays
Export incrementally: Daily or weekly exports are easier to manage than one large export
Compress backups: JSONL compresses well with gzip (80-90% reduction)
Filter early: Apply filters at export time to reduce data size
Request bodies can be large: Only use --include-body when needed
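Compression can be scripted with Node's built-in zlib. A self-contained sketch (file names hypothetical) that writes a repetitive JSONL file and gzips it:

```typescript
import fs from "fs";
import os from "os";
import path from "path";
import zlib from "zlib";

// Write a small, repetitive JSONL file, then gzip it.
const src = path.join(os.tmpdir(), "export-sample.jsonl");
const dst = src + ".gz";
const line = JSON.stringify({ model: "gpt-4o", cost_usd: 0.015 });
fs.writeFileSync(src, Array(1000).fill(line).join("\n"));

fs.writeFileSync(dst, zlib.gzipSync(fs.readFileSync(src)));

const before = fs.statSync(src).size;
const after = fs.statSync(dst).size;
console.log(`compressed ${before} -> ${after} bytes`);
// Repetitive JSONL like this compresses dramatically; real exports
// with varied bodies typically still shrink substantially.
```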
Troubleshooting
Export is slow
Tips to speed it up:
Use a smaller --batch-size (e.g. 500) if individual requests are timing out
Apply filters to reduce data volume
Export during off-peak hours
Check your network connection
Export was interrupted
Use --resume to continue:
HELICONE_API_KEY="sk-xxx" npx @helicone/export --resume
Or clean state and restart:
HELICONE_API_KEY="sk-xxx" npx @helicone/export --clean-state ...
Rate limit errors
Reduce the batch size:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --batch-size 250 \
  ...
Or add delays in custom scripts:
await new Promise(resolve => setTimeout(resolve, 500));
Property filter not working
Ensure the property name matches exactly:
# Correct
--property Environment=production
# Wrong (property names are case-sensitive)
--property environment=production
Check that the property exists in your data:
Go to Helicone dashboard
View a request
Check exact property names
Automated Exports
Schedule regular exports:
Cron Job (Linux/Mac)
# Add to crontab (crontab -e)
# Run daily at 2 AM
0 2 * * * cd /path/to/project && HELICONE_API_KEY=sk-xxx npx @helicone/export --start-date $(date -d "yesterday" +\%Y-\%m-\%d) --output backups/daily-$(date +\%Y-\%m-\%d).jsonl
GitHub Actions
name: Daily Helicone Backup
on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC

jobs:
  export:
    runs-on: ubuntu-latest
    steps:
      - name: Export Helicone data
        env:
          HELICONE_API_KEY: ${{ secrets.HELICONE_API_KEY }}
        run: |
          npx @helicone/export \
            --start-date $(date -d "yesterday" +%Y-%m-%d) \
            --format jsonl \
            --output backup-$(date +%Y-%m-%d).jsonl
      - name: Upload to S3
        uses: aws-actions/aws-cli@v2
        with:
          args: s3 cp backup-$(date +%Y-%m-%d).jsonl s3://my-backups/helicone/
Next Steps
Query API Docs: Full API documentation for queries
Fine-Tuning Prep: Use exported data for fine-tuning
Custom Properties: Add metadata for better filtering
Sessions: Export complete workflows