Use the familiar OpenAI SDK to access 100+ LLM models across OpenAI, Anthropic, Google, and more with automatic logging, observability, and fallbacks built in.
Step 1: Create your account

Sign up for Helicone

  1. Sign up for free (10,000 requests/month on the free tier)
  2. Complete the onboarding flow
  3. Generate your Helicone API key on the API Keys page
Free tier includes: 10K requests/month, all core features, and no credit card required
Step 2: Add credits (optional)

Use the AI Gateway with credits

For the easiest experience, add credits to access 100+ models without signing up for each provider:
  1. Go to helicone.ai/credits
  2. Add funds to your account (we charge exactly what providers charge, with 0% markup)
  3. Use any model from any provider with a single API key
Instead of managing API keys for each provider (OpenAI, Anthropic, Google, etc.), Helicone maintains the keys for you. You simply add credits to your account, and we handle the rest.
Benefits:
  • 0% markup - Pay exactly what providers charge, no hidden fees
  • No need to sign up for multiple LLM providers
  • Switch between 100+ models by just changing the model name
  • Automatic fallbacks if a provider is down
  • Unified billing across all providers
Want more control? You can bring your own provider keys instead.
Skip this step and use your own API keys for OpenAI, Anthropic, or other providers. Configure them at Provider Keys. You’ll still get full observability, but you’ll manage provider relationships directly. See the “Bring Your Own Keys” tab in Step 3.
Step 3: Send your first request

Choose your integration method

Helicone’s AI Gateway is OpenAI-compatible, so you can use the OpenAI SDK with any provider.
Using Helicone credits to access any model:
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://ai-gateway.helicone.ai",
  apiKey: process.env.HELICONE_API_KEY, // Your Helicone API key
});

const response = await client.chat.completions.create({
  model: "gpt-4o-mini", // Or any of 100+ models
  messages: [
    { role: "user", content: "Explain Helicone in one sentence" }
  ],
});

console.log(response.choices[0].message.content);
Switch providers instantly:
// OpenAI
model: "gpt-4o-mini"

// Anthropic
model: "claude-sonnet-4"

// Google
model: "gemini-2.0-flash"

// Groq
model: "llama-3.3-70b-versatile"
Step 4: View your logs

See your request in the dashboard

Once you run the code, you’ll see your request appear in the Requests tab within seconds.
[Screenshot: Helicone dashboard showing request logs with cost, latency, and full details]
What you’ll see:
  • Full request and response details
  • Token usage (input, output, cached)
  • Exact cost per request
  • Latency and processing time
  • Model and provider information
  • Custom properties and user tracking
Click any request to see the complete conversation, including all messages, tokens, costs, and metadata.

You’re All Set! 🎉

Congratulations! You’ve successfully integrated Helicone and logged your first LLM request. Now let’s explore what you can do with the platform.

What’s Next?

Understand the Platform

Learn how Helicone solves production AI challenges with an architecture overview

Track Sessions & Agents

Debug multi-step AI workflows with session trees and full visibility

Add Custom Properties

Segment requests by user, feature, or environment for better insights

Set Up Fallbacks

Configure automatic failover when providers go down

Manage Prompts

Version control prompts and deploy without code changes

Cost Tracking

Understand your LLM economics and optimize spending

Common Use Cases

Add a Helicone-User-Id header to tag requests with user IDs:
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-User-Id": "user-123",
    },
  }
);
Then filter by user in the dashboard to see per-user costs and usage.
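Once requests are tagged, the per-user view behaves like a group-by over the logged requests. A rough local sketch of that aggregation (the record shape and cost unit here are illustrative, not Helicone's actual log schema):

```typescript
// Illustrative shape of a logged request; not Helicone's actual schema.
// Cost is in micro-USD here to keep the arithmetic exact.
interface LoggedRequest {
  userId: string;
  cost: number;
}

// Group logged requests by user and sum costs, as the dashboard's
// per-user filter effectively does.
function costPerUser(logs: LoggedRequest[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const { userId, cost } of logs) {
    totals.set(userId, (totals.get(userId) ?? 0) + cost);
  }
  return totals;
}

const totals = costPerUser([
  { userId: "user-123", cost: 2000 },
  { userId: "user-456", cost: 1000 },
  { userId: "user-123", cost: 3000 },
]);
console.log(totals.get("user-123")); // 5000
```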
Use sessions to group related requests and trace multi-step workflows:
const sessionId = "research-task-" + Date.now();

// Step 1: Web search
await client.chat.completions.create(
  { model: "gpt-4o-mini", messages: [...] },
  { headers: { 
    "Helicone-Session-Id": sessionId,
    "Helicone-Session-Path": "/research/web_search"
  }}
);

// Step 2: Summarize
await client.chat.completions.create(
  { model: "gpt-4o-mini", messages: [...] },
  { headers: { 
    "Helicone-Session-Id": sessionId,
    "Helicone-Session-Path": "/research/summarize"
  }}
);
View the complete workflow tree in the Sessions tab.
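Session paths are slash-delimited, which is what lets the Sessions tab render steps as a tree. A minimal sketch of that grouping (just the tree-building idea, not Helicone's implementation):

```typescript
// Build a nested tree from slash-delimited session paths, the way
// the Sessions tab groups steps (a sketch, not Helicone's code).
type Tree = Map<string, Tree>;

function buildTree(paths: string[]): Tree {
  const root: Tree = new Map();
  for (const path of paths) {
    let node = root;
    for (const part of path.split("/").filter(Boolean)) {
      if (!node.has(part)) node.set(part, new Map());
      node = node.get(part)!;
    }
  }
  return root;
}

const tree = buildTree(["/research/web_search", "/research/summarize"]);
// "research" has two children: web_search and summarize
console.log(tree.get("research")?.size); // 2
```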
Specify multiple models separated by commas; Helicone will try them in order:
const response = await client.chat.completions.create({
  // Try OpenAI first, fallback to Anthropic if it fails
  model: "gpt-4o-mini,claude-sonnet-4",
  messages: [{ role: "user", content: "Hello!" }],
});
Your app stays online even during provider outages.
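Conceptually, the gateway tries each model in the comma-separated list until one succeeds. A simplified local sketch of that ordering logic (real routing also handles rate limits, timeouts, and retries, which this omits):

```typescript
// Try each model in order until one call succeeds: a simplified
// sketch of fallback routing, not the gateway's actual logic.
async function tryInOrder<T>(
  modelList: string, // e.g. "gpt-4o-mini,claude-sonnet-4"
  call: (model: string) => Promise<T>
): Promise<T> {
  const models = modelList.split(",").map((m) => m.trim());
  let lastError: unknown;
  for (const model of models) {
    try {
      return await call(model);
    } catch (err) {
      lastError = err; // this provider failed; fall through to the next model
    }
  }
  throw lastError; // every model in the list failed
}
```

With the gateway, you don't write this loop yourself; passing `model: "gpt-4o-mini,claude-sonnet-4"` makes Helicone apply the same try-in-order behavior server-side.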
Enable caching with a header to reuse identical responses:
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "What is 2+2?" }],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",
    },
  }
);
Identical requests are served from cache instantly at zero cost.
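The cache behaves like a lookup keyed on the request body: an identical body returns the stored response instead of triggering a new provider call. A toy local sketch of that idea (Helicone's actual cache key derivation and TTL handling are not shown here):

```typescript
// A toy response cache keyed on the serialized request body,
// illustrating why identical requests are instant and free.
const cache = new Map<string, string>();
let providerCalls = 0;

function complete(request: { model: string; prompt: string }): string {
  const key = JSON.stringify(request);
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // served from cache: no provider call
  providerCalls++; // cache miss: pay for one real call
  const response = `answer for: ${request.prompt}`; // stand-in for a real call
  cache.set(key, response);
  return response;
}

complete({ model: "gpt-4o-mini", prompt: "What is 2+2?" });
complete({ model: "gpt-4o-mini", prompt: "What is 2+2?" }); // cache hit
console.log(providerCalls); // 1
```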

Need Help?

We’re here to help you succeed:

Join Discord

Chat with 2000+ developers in our community

Email Support

Contact [email protected] with questions

Documentation

Explore integration guides for all frameworks

GitHub

Star us and contribute to the project
Pro tip: Start with basic request logging, then add custom properties, sessions, and prompts as your needs grow. Each feature builds on the others to give you complete observability.
