NAVAI lets you customize the voice, tone, accent, and instructions of your AI agent so the voice experience fits your users.

Voice Selection

NAVAI uses OpenAI’s Realtime API voices. You can configure the voice through environment variables or programmatically.
If no voice is specified, the default is marin.

Backend Configuration

Set the voice in your backend environment variables:
.env
OPENAI_REALTIME_VOICE=marin
Available voices from OpenAI Realtime API:
  • marin (default)
  • alloy
  • echo
  • shimmer
  • ash
  • ballad
  • coral
  • sage
  • verse

Runtime Voice Override

You can override the voice at runtime when creating a client secret:
import { createRealtimeClientSecret } from "@navai/voice-backend";

const clientSecret = await createRealtimeClientSecret(
  {
    openaiApiKey: process.env.OPENAI_API_KEY,
    defaultVoice: "marin"
  },
  {
    voice: "alloy" // Override for this session
  }
);
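When the override comes from user input, it can help to check it against the voices listed above before passing it on. The following is a minimal sketch, not part of @navai/voice-backend: the KNOWN_VOICES constant and the resolveVoice helper are hypothetical names, and the list simply mirrors this page, so it may lag behind what OpenAI offers.

```typescript
// Voices listed in this guide (assumption: the list is current).
const KNOWN_VOICES = [
  "marin", "alloy", "echo", "shimmer", "ash",
  "ballad", "coral", "sage", "verse",
] as const;

type KnownVoice = (typeof KNOWN_VOICES)[number];

// Hypothetical helper: return the requested voice when it is on the list,
// otherwise fall back to the given default.
function resolveVoice(
  requested: string | undefined,
  fallback: KnownVoice = "marin"
): KnownVoice {
  const candidate = requested?.trim().toLowerCase();
  const match = KNOWN_VOICES.find((voice) => voice === candidate);
  return match ?? fallback;
}
```

The result can then be passed as the voice value of the second argument to createRealtimeClientSecret, for example `{ voice: resolveVoice(userChoice) }`, where userChoice is whatever your app collected from the user.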

Model Selection

Configure the OpenAI Realtime model to use:
.env
OPENAI_REALTIME_MODEL=gpt-realtime
If no model is specified, the default is gpt-realtime.

Frontend Model Override

To override the model from the frontend, set NAVAI_REALTIME_MODEL in the frontend environment:
.env
NAVAI_REALTIME_MODEL=gpt-4o-realtime-preview

Tone Customization

Control the speaking tone of your AI agent. The tone is injected into the session instructions.
1. Configure tone in environment

.env
OPENAI_REALTIME_VOICE_TONE=friendly and professional

2. How it works

The tone is automatically appended to the instructions:
Use a friendly and professional tone while speaking.

Common Tone Examples

OPENAI_REALTIME_VOICE_TONE=professional and concise
Great for business applications, customer service, or enterprise tools.
OPENAI_REALTIME_VOICE_TONE=friendly and conversational
Perfect for consumer apps, social platforms, or casual interactions.
OPENAI_REALTIME_VOICE_TONE=enthusiastic and energetic
Ideal for fitness apps, gaming, or motivational tools.
OPENAI_REALTIME_VOICE_TONE=calm and reassuring
Best for meditation apps, healthcare, or therapeutic applications.

Accent Customization

Specify the accent for the AI voice to use while speaking.
.env
OPENAI_REALTIME_VOICE_ACCENT=neutral American English

Accent Examples

  • neutral American English
  • British English
  • neutral Latin American Spanish
  • European Spanish
  • French
  • German
  • Italian
The accent setting is a hint to the AI model and may not always be perfectly followed. Results can vary based on the underlying model capabilities.

Instructions Customization

Customize the base instructions to define your agent’s behavior and personality.

Basic Instructions

.env
OPENAI_REALTIME_INSTRUCTIONS=You are a helpful assistant embedded in a web app.

Advanced Instructions Example

server.ts
import express from "express";
import { registerNavaiExpressRoutes } from "@navai/voice-backend";

// The Express app that the NAVAI routes are registered on.
const app = express();

const backendOptions = {
  openaiApiKey: process.env.OPENAI_API_KEY,
  // Base instructions that define the agent's behavior and personality.
  defaultInstructions: `You are a helpful shopping assistant.
    Help users find products, answer questions about items, and guide them through checkout.
    Be concise and friendly. Always confirm before making purchases.`
};

registerNavaiExpressRoutes(app, { backendOptions });

Combined Configuration Example

Here’s a complete example combining all customization options:
.env
# Voice Configuration
OPENAI_API_KEY=sk-...
OPENAI_REALTIME_MODEL=gpt-realtime
OPENAI_REALTIME_VOICE=coral

# Tone and Accent
OPENAI_REALTIME_VOICE_TONE=friendly and professional
OPENAI_REALTIME_VOICE_ACCENT=neutral American English

# Instructions
OPENAI_REALTIME_INSTRUCTIONS=You are a voice navigation assistant. Help users navigate the app efficiently and answer their questions clearly.
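
To make the defaults concrete, here is a minimal sketch of how these variables could be read into one configuration object. It is an illustration only: the VoiceConfig interface and loadVoiceConfig function are hypothetical names, not NAVAI exports, and the fallback values are the defaults stated on this page.

```typescript
interface VoiceConfig {
  model: string;
  voice: string;
  voiceTone?: string;
  voiceAccent?: string;
  instructions?: string;
}

// Hypothetical loader: reads the variables from this guide and applies the
// documented defaults (gpt-realtime and marin) when a value is missing.
function loadVoiceConfig(env: Record<string, string | undefined>): VoiceConfig {
  // Treat empty or whitespace-only values as unset.
  const read = (name: string): string | undefined => {
    const value = env[name]?.trim();
    return value ? value : undefined;
  };

  return {
    model: read("OPENAI_REALTIME_MODEL") ?? "gpt-realtime",
    voice: read("OPENAI_REALTIME_VOICE") ?? "marin",
    voiceTone: read("OPENAI_REALTIME_VOICE_TONE"),
    voiceAccent: read("OPENAI_REALTIME_VOICE_ACCENT"),
    instructions: read("OPENAI_REALTIME_INSTRUCTIONS"),
  };
}
```

In a Node.js backend you would typically call `loadVoiceConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test with a plain object.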

How Instructions Are Built

NAVAI automatically combines your customizations using the buildSessionInstructions function:
// From packages/voice-backend/src/index.ts:134-158
function buildSessionInstructions(input: {
  baseInstructions: string;
  language?: string;
  voiceAccent?: string;
  voiceTone?: string;
}): string {
  const lines = [input.baseInstructions.trim()];
  
  // Each optional setting adds one sentence to the instructions.
  if (input.language) {
    lines.push(`Always reply in ${input.language}.`);
  }
  
  if (input.voiceAccent) {
    lines.push(`Use a ${input.voiceAccent} accent while speaking.`);
  }
  
  if (input.voiceTone) {
    lines.push(`Use a ${input.voiceTone} tone while speaking.`);
  }
  
  return lines.join("\n");
}
You can override any of these settings at runtime by including them in the request passed to createRealtimeClientSecret.
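
As a worked example, the sketch below repeats the function so that it runs on its own and shows the instructions produced for one set of inputs. The only liberty taken is reading each field through `input`, and the input values are illustrative.

```typescript
// Standalone copy of buildSessionInstructions, reading fields through `input`.
function buildSessionInstructions(input: {
  baseInstructions: string;
  language?: string;
  voiceAccent?: string;
  voiceTone?: string;
}): string {
  const lines = [input.baseInstructions.trim()];
  if (input.language) lines.push(`Always reply in ${input.language}.`);
  if (input.voiceAccent) lines.push(`Use a ${input.voiceAccent} accent while speaking.`);
  if (input.voiceTone) lines.push(`Use a ${input.voiceTone} tone while speaking.`);
  return lines.join("\n");
}

// No language is given, so no "Always reply in ..." line is added.
const instructions = buildSessionInstructions({
  baseInstructions: "  You are a meditation guide. Speak slowly and calmly.  ",
  voiceAccent: "British English",
  voiceTone: "calm and reassuring",
});

console.log(instructions);
// You are a meditation guide. Speak slowly and calmly.
// Use a British English accent while speaking.
// Use a calm and reassuring tone while speaking.
```

Each optional setting contributes exactly one sentence, and settings that are omitted leave no trace in the final instructions.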

Dynamic Customization

Customize voice settings per user or session:
// Client sends customization preferences
const response = await fetch('/navai/realtime/client-secret', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    voice: 'sage',
    voiceTone: 'calm and reassuring',
    voiceAccent: 'British English',
    instructions: 'You are a meditation guide. Speak slowly and calmly.'
  })
});
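
If preferences are stored per user, some fields will often be empty. The sketch below builds the request body from whatever is set and leaves out the rest, so that omitted fields are not sent at all. The VoicePreferences type and buildClientSecretBody helper are hypothetical; only the field names come from the request above.

```typescript
interface VoicePreferences {
  voice?: string;
  voiceTone?: string;
  voiceAccent?: string;
  instructions?: string;
}

// Hypothetical helper: serialize only the preferences that have a
// non-empty value, trimming surrounding whitespace.
function buildClientSecretBody(prefs: VoicePreferences): string {
  const body: Record<string, string> = {};
  for (const [key, value] of Object.entries(prefs)) {
    if (typeof value === "string" && value.trim() !== "") {
      body[key] = value.trim();
    }
  }
  return JSON.stringify(body);
}
```

The returned string can be used directly as the `body` of the fetch call shown above.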

Testing Different Voices

Quickly test different voice configurations:
1. Update environment variables

Change the voice settings in your .env file.

2. Restart the backend

npm run dev

3. Test in your app

Start a new voice session to hear the changes.

Next Steps

Multilingual Support

Learn how to configure language-specific settings

Debugging

Troubleshoot voice and audio issues
