Voice Customization

NAVAI lets you customize the voice, tone, accent, and instructions of your AI agent, and choose which OpenAI Realtime model it runs on.

Voice Selection

NAVAI uses OpenAI’s Realtime API voices. You can configure the voice through environment variables or programmatically.

If no voice is specified, the default is marin.

Backend Configuration

Set the voice in your backend environment variables:

.env

OPENAI_REALTIME_VOICE=marin

Available voices from OpenAI Realtime API:

marin (default)
alloy
echo
shimmer
ash
ballad
coral
sage
verse

Runtime Voice Override

You can override the voice at runtime when creating a client secret:

import { createRealtimeClientSecret } from "@navai/voice-backend";

const clientSecret = await createRealtimeClientSecret(
  {
    openaiApiKey: process.env.OPENAI_API_KEY,
    defaultVoice: "marin"
  },
  {
    voice: "alloy" // Override for this session
  }
);
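Because the override value often comes from user input, it can help to check it against the voice list above before passing it on. The following is a minimal sketch, not part of @navai/voice-backend: the resolveVoice helper and the KNOWN_VOICES constant are hypothetical names, and the list is copied from this guide, so the Realtime API may accept other voices.

```typescript
// Voice names copied from the list in this guide (assumption: the API may accept more).
const KNOWN_VOICES = [
  "marin", "alloy", "echo", "shimmer", "ash", "ballad", "coral", "sage", "verse",
] as const;

type KnownVoice = (typeof KNOWN_VOICES)[number];

// Hypothetical helper: returns the requested voice if it is in the list,
// otherwise falls back to the given default.
function resolveVoice(requested: unknown, fallback: KnownVoice = "marin"): KnownVoice {
  if (typeof requested !== "string") {
    return fallback;
  }
  const normalized = requested.trim().toLowerCase();
  const index = (KNOWN_VOICES as readonly string[]).indexOf(normalized);
  return index === -1 ? fallback : KNOWN_VOICES[index];
}
```

The result of resolveVoice(userInput) could then be passed as the voice field of the override object shown above.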

Model Selection

Configure the OpenAI Realtime model to use:

.env

OPENAI_REALTIME_MODEL=gpt-realtime

The default model is gpt-realtime if not specified.

Frontend Model Override

To override the model from the frontend, set NAVAI_REALTIME_MODEL in the frontend environment:

.env

NAVAI_REALTIME_MODEL=gpt-4o-realtime-preview

Tone Customization

Control the speaking tone of your AI agent. The tone is injected into the session instructions.

Configure tone in environment

.env

OPENAI_REALTIME_VOICE_TONE=friendly and professional

How it works

With the value above, NAVAI appends the following sentence to the session instructions:

Use a friendly and professional tone while speaking.

Common Tone Examples

Professional & Business

OPENAI_REALTIME_VOICE_TONE=professional and concise

Great for business applications, customer service, or enterprise tools.

Friendly & Casual

OPENAI_REALTIME_VOICE_TONE=friendly and conversational

Perfect for consumer apps, social platforms, or casual interactions.

Enthusiastic & Energetic

OPENAI_REALTIME_VOICE_TONE=enthusiastic and energetic

Ideal for fitness apps, gaming, or motivational tools.

Calm & Soothing

OPENAI_REALTIME_VOICE_TONE=calm and reassuring

Best for meditation apps, healthcare, or therapeutic applications.

Accent Customization

Specify the accent the AI voice should use when speaking.

.env

OPENAI_REALTIME_VOICE_ACCENT=neutral American English

Accent Examples

neutral American English
British English
neutral Latin American Spanish
European Spanish
French
German
Italian

The accent setting is a hint to the AI model and may not always be perfectly followed. Results can vary based on the underlying model capabilities.

Instructions Customization

Customize the base instructions to define your agent’s behavior and personality.

Basic Instructions

.env

OPENAI_REALTIME_INSTRUCTIONS=You are a helpful assistant embedded in a web app.

Advanced Instructions Example

server.ts

import express from "express";
import { registerNavaiExpressRoutes } from "@navai/voice-backend";

const app = express();

// Base instructions that define the agent's role and behavior.
const backendOptions = {
  openaiApiKey: process.env.OPENAI_API_KEY,
  defaultInstructions: `You are a helpful shopping assistant.
    Help users find products, answer questions about items, and guide them through checkout.
    Be concise and friendly. Always confirm before making purchases.`
};

// Mount the NAVAI routes on the Express app.
registerNavaiExpressRoutes(app, { backendOptions });

Combined Configuration Example

Here’s a complete example combining all customization options:

.env

# Voice Configuration
OPENAI_API_KEY=sk-...
OPENAI_REALTIME_MODEL=gpt-realtime
OPENAI_REALTIME_VOICE=coral

# Tone and Accent
OPENAI_REALTIME_VOICE_TONE=friendly and professional
OPENAI_REALTIME_VOICE_ACCENT=neutral American English

# Instructions
OPENAI_REALTIME_INSTRUCTIONS=You are a voice navigation assistant. Help users navigate the app efficiently and answer their questions clearly.
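To make the defaults concrete, here is a minimal sketch of how these variables could be read into one configuration object. It is not NAVAI's actual loader: readVoiceConfig, VoiceConfig, and Env are hypothetical names, and only the variable names and the two defaults (gpt-realtime and marin) come from this guide.

```typescript
// Hypothetical shape for the settings described in this guide.
interface VoiceConfig {
  model: string;
  voice: string;
  voiceTone: string | undefined;
  voiceAccent: string | undefined;
  instructions: string | undefined;
}

type Env = Record<string, string | undefined>;

// Treats missing or whitespace-only values as "not set".
function clean(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Hypothetical helper: reads the variables and applies the documented defaults.
function readVoiceConfig(env: Env): VoiceConfig {
  return {
    model: clean(env.OPENAI_REALTIME_MODEL) ?? "gpt-realtime",
    voice: clean(env.OPENAI_REALTIME_VOICE) ?? "marin",
    voiceTone: clean(env.OPENAI_REALTIME_VOICE_TONE),
    voiceAccent: clean(env.OPENAI_REALTIME_VOICE_ACCENT),
    instructions: clean(env.OPENAI_REALTIME_INSTRUCTIONS),
  };
}
```

In a Node.js backend this would be called as readVoiceConfig(process.env). Taking the environment as a parameter keeps the function easy to test with plain objects.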

How Instructions Are Built

NAVAI automatically combines your customizations using the buildSessionInstructions function:

// From packages/voice-backend/src/index.ts:134-158
function buildSessionInstructions(input: {
  baseInstructions: string;
  language?: string;
  voiceAccent?: string;
  voiceTone?: string;
}): string {
  // Start with the base instructions, then append one line per optional setting.
  const lines = [input.baseInstructions.trim()];

  if (input.language) {
    lines.push(`Always reply in ${input.language}.`);
  }

  if (input.voiceAccent) {
    lines.push(`Use a ${input.voiceAccent} accent while speaking.`);
  }

  if (input.voiceTone) {
    lines.push(`Use a ${input.voiceTone} tone while speaking.`);
  }

  return lines.join("\n");
}

You can override any of these settings at runtime by passing them to createRealtimeClientSecret, typically taken from the request body.
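As a worked example of the assembly logic, the sketch below uses a local stand-in that mirrors buildSessionInstructions, since this guide does not say whether the package exports that function. The name buildInstructionsSketch and the sample values are illustrative assumptions.

```typescript
interface InstructionInput {
  baseInstructions: string;
  language?: string;
  voiceAccent?: string;
  voiceTone?: string;
}

// Local stand-in mirroring the logic shown above (hypothetical name).
function buildInstructionsSketch(input: InstructionInput): string {
  const lines = [input.baseInstructions.trim()];
  if (input.language) lines.push(`Always reply in ${input.language}.`);
  if (input.voiceAccent) lines.push(`Use a ${input.voiceAccent} accent while speaking.`);
  if (input.voiceTone) lines.push(`Use a ${input.voiceTone} tone while speaking.`);
  return lines.join("\n");
}

const sessionInstructions = buildInstructionsSketch({
  baseInstructions: "  You are a meditation guide.  ",
  language: "English",
  voiceAccent: "British English",
  voiceTone: "calm and reassuring",
});

console.log(sessionInstructions);
// You are a meditation guide.
// Always reply in English.
// Use a British English accent while speaking.
// Use a calm and reassuring tone while speaking.
```

Settings that are left out add no line, so passing only baseInstructions returns the trimmed base text unchanged.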

Dynamic Customization

Customize voice settings per user or session:

// Client sends customization preferences
const response = await fetch('/navai/realtime/client-secret', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    voice: 'sage',
    voiceTone: 'calm and reassuring',
    voiceAccent: 'British English',
    instructions: 'You are a meditation guide. Speak slowly and calmly.'
  })
});
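Since these values arrive from the client, a backend may want to keep only the expected fields before using them. The sketch below shows one way to do that. It is not NAVAI's implementation, and the guide does not say whether the built-in route already validates input. pickSessionOverrides, SessionOverrides, and the 500-character cap are hypothetical choices; the field names come from the request above.

```typescript
type SessionOverrides = {
  voice?: string;
  voiceTone?: string;
  voiceAccent?: string;
  instructions?: string;
};

// Field names taken from the request body shown in this guide.
const OVERRIDE_KEYS = ["voice", "voiceTone", "voiceAccent", "instructions"] as const;

// Arbitrary length cap chosen for this sketch.
const MAX_OVERRIDE_LENGTH = 500;

// Hypothetical helper: keeps only known, non-empty string fields from an untrusted body.
function pickSessionOverrides(body: unknown): SessionOverrides {
  const overrides: SessionOverrides = {};
  if (typeof body !== "object" || body === null) {
    return overrides;
  }
  const record = body as Record<string, unknown>;
  for (const key of OVERRIDE_KEYS) {
    const value = record[key];
    if (typeof value !== "string") continue;
    const trimmed = value.trim();
    if (trimmed.length > 0 && trimmed.length <= MAX_OVERRIDE_LENGTH) {
      overrides[key] = trimmed;
    }
  }
  return overrides;
}
```

Unknown keys and non-string values are dropped silently here. A stricter variant could reject the request instead.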

Testing Different Voices

Quickly test different voice configurations:

1. Update environment variables: change the voice settings in your .env file.

2. Restart the backend:

npm run dev

3. Test in your app: start a new voice session to hear the changes.

Next Steps

Multilingual Support

Learn how to configure language-specific settings

Debugging

Troubleshoot voice and audio issues
