Basic Audio Examples
Sending WAV Audio
Send audio from a WAV file into a meeting using a virtual microphone device.
File: demos/audio/wav_audio_send.py
-c, --channels: Number of audio channels (default: 1)
-r, --rate: Sample rate in Hz (default: 16000)
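As a rough sketch of what a script like this has to do before any audio reaches the meeting, the following parses the two options above and reads a WAV file in fixed-size chunks using only the Python standard library. The helper names `build_parser` and `read_wav_chunks` and the 10 ms chunk size are assumptions for illustration and are not taken from wav_audio_send.py; the step that writes each chunk to the virtual microphone is left out.

```python
import argparse
import wave


def build_parser():
    """Parser with the two documented options and their documented defaults."""
    parser = argparse.ArgumentParser(description="WAV send sketch")
    parser.add_argument("-c", "--channels", type=int, default=1,
                        help="Number of audio channels")
    parser.add_argument("-r", "--rate", type=int, default=16000,
                        help="Sample rate in Hz")
    return parser


def read_wav_chunks(path, channels, rate, chunk_ms=10):
    """Yield raw PCM chunks of about chunk_ms milliseconds from a WAV file.

    Raises ValueError (on first iteration) if the file does not match
    the requested channel count or sample rate.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != channels or wav.getframerate() != rate:
            raise ValueError("WAV file does not match requested channels/rate")
        frames_per_chunk = rate * chunk_ms // 1000
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk  # hypothetical: pass this to the virtual microphone
```

Checking the file against the requested layout up front avoids silently sending audio at the wrong speed or with interleaved channels misread as mono.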
Receiving WAV Audio
Capture audio from a meeting participant and save it to a WAV file.
File: demos/audio/wav_audio_receive.py
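The saving half of this demo can be sketched with the standard-library `wave` module. The function name `save_wav` and the idea that received audio arrives as a sequence of raw 16-bit PCM byte buffers are assumptions for illustration; how the buffers are read from the meeting is not shown here.

```python
import wave


def save_wav(path, buffers, channels=1, rate=16000):
    """Write raw 16-bit PCM buffers to a WAV file and return the frame count."""
    total_bytes = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        for buf in buffers:
            wav.writeframes(buf)
            total_bytes += len(buf)
    # One frame holds one sample per channel.
    return total_bytes // (2 * channels)
```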
RAW Audio Processing
Work directly with raw audio buffers for custom processing:
- Send RAW Audio (raw_audio_send.py): Send raw PCM audio data
- Receive RAW Audio (raw_audio_receive.py): Receive and process raw audio buffers
- Async WAV Send (async_wav_audio_send.py): Asynchronous audio transmission
- Timed WAV Receive (timed_wav_audio_receive.py): Time-based audio capture
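To make "custom processing" on raw buffers concrete, here is a minimal standard-library sketch that generates a mono 16-bit PCM tone and measures the level of a buffer. The function names `sine_pcm` and `rms_level` are hypothetical and do not come from the demos; the code assumes samples in the machine's native byte order, which is little-endian on common platforms.

```python
import array
import math


def sine_pcm(freq_hz, duration_s, rate=16000, amplitude=0.5):
    """Return mono 16-bit PCM bytes for a sine tone."""
    n = int(rate * duration_s)
    samples = array.array("h", (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        for i in range(n)
    ))
    return samples.tobytes()


def rms_level(pcm_bytes):
    """Return the RMS level of mono 16-bit PCM, normalized to 0.0..1.0."""
    samples = array.array("h")
    samples.frombytes(pcm_bytes)
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0
```

A buffer produced by `sine_pcm` could stand in for microphone input when testing a send path, and `rms_level` is one simple example of per-buffer processing on the receive side.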
Speech-to-Text (STT) Integration
Google Cloud Speech-to-Text
Transcribe spoken audio to text using Google’s Speech-to-Text API.
File: demos/google/google_speech_to_text.py
Prerequisites:
- Google Cloud credentials configured
- See Google Cloud Speech-to-Text docs
Text-to-Speech (TTS) Integration
Google Cloud Text-to-Speech
Convert text to speech and stream it into a meeting.
File: demos/google/google_text_to_speech.py
Deepgram Text-to-Speech
Use Deepgram’s TTS API for high-quality voice synthesis.
File: demos/deepgram/deepgram_text_to_speech.py
Prerequisites:
- Set the DEEPGRAM_API_KEY environment variable
- See Deepgram TTS docs
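A script that depends on this variable can fail early with a readable message instead of a later authentication error. The helper below is a hypothetical sketch, not code from the demo; only the variable name `DEEPGRAM_API_KEY` comes from the text above.

```python
import os


def get_deepgram_api_key(environ=os.environ):
    """Return the Deepgram API key, or raise a clear error if it is unset."""
    key = environ.get("DEEPGRAM_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the DEEPGRAM_API_KEY environment variable")
    return key
```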
Hardware Audio Integration
PyAudio: Real Microphone and Speaker
Capture audio from your system microphone and play meeting audio through your speakers.
File: demos/pyaudio/record_and_play.py
- Captures real microphone input
- Plays meeting audio through speakers
- Supports audio processing features (AGC, noise suppression, echo cancellation)
- Configurable channels (mono/stereo)
-c, --channels: Number of channels (1 or 2)
-r, --rate: Sample rate in Hz
Voice Activity Detection (VAD)
Detect when someone is speaking in a meeting.
File: demos/vad/native_vad.py
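For intuition about what voice activity detection decides, the sketch below uses a simple energy threshold with a short hangover period. This is a swapped-in illustrative technique, not the native VAD that native_vad.py relies on; the function names, the 0.02 threshold, and the hangover of 3 frames are all assumptions.

```python
import array
import math


def frame_is_speech(pcm_bytes, threshold=0.02):
    """Return True when the RMS level of a 16-bit mono frame exceeds threshold."""
    samples = array.array("h")
    samples.frombytes(pcm_bytes)
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0
    return rms > threshold


def detect_speech(frames, threshold=0.02, hangover=3):
    """Label each frame as speaking or not.

    After a loud frame, up to `hangover` quiet frames are still labeled
    as speaking, which avoids flicker during short pauses.
    """
    labels = []
    quiet = hangover + 1  # start in the silent state
    for frame in frames:
        if frame_is_speech(frame, threshold):
            quiet = 0
        else:
            quiet += 1
        labels.append(quiet <= hangover)
    return labels
```

An energy threshold reacts to any loud sound, not only speech, so it is only a stand-in for a trained detector.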
Key Concepts
Virtual Audio Devices
Create virtual microphones and speakers for audio I/O.
Audio Configuration
Configure microphone settings when joining.
Sample Rate and Format
Most demos use these audio settings:
- Sample Rate: 16000 Hz (optimal for speech)
- Channels: 1 (mono) or 2 (stereo)
- Format: 16-bit PCM (LINEAR16)
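These settings determine how many bytes each audio buffer holds. The arithmetic can be sketched as follows; the function names are hypothetical, and the 10 ms and 20 ms durations are only example values.

```python
def bytes_per_second(rate=16000, channels=1, sample_width=2):
    """Data rate of uncompressed PCM; sample_width=2 means 16-bit samples."""
    return rate * channels * sample_width


def chunk_size_bytes(duration_ms, rate=16000, channels=1, sample_width=2):
    """Size in bytes of a buffer holding duration_ms of audio."""
    frames = rate * duration_ms // 1000
    return frames * channels * sample_width


print(bytes_per_second())                 # 32000
print(chunk_size_bytes(10))               # 320
print(chunk_size_bytes(20, channels=2))   # 1280
```

At 16000 Hz mono, one second of 16-bit audio is 32000 bytes, so a 10 ms buffer is 160 frames or 320 bytes.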
Next Steps
- Explore Video Applications for video streaming examples
- Check out Integration Examples for combining audio with other services
- Read the Virtual Devices guide for in-depth documentation
- Browse all examples on GitHub