AssemblyAI Real-Time Transcription Browser Example

This open-source project demonstrates how to implement real-time speech transcription directly in the browser using AssemblyAI’s real-time WebSocket API. The application captures audio from your microphone, processes it with the Web Audio API’s AudioWorklet, and streams it to AssemblyAI for transcription. Results are displayed in real time as you speak.

What You’ll Learn

This example shows you how to:
  • Set up an Express server to generate temporary authentication tokens
  • Capture and process microphone audio using the Web Audio API
  • Convert audio to the correct format (PCM16 at 16 kHz) using AudioWorklet
  • Establish a WebSocket connection to AssemblyAI’s real-time transcription service
  • Handle real-time transcription results with turn-based ordering
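
The format conversion step above can be illustrated with a small sketch. The Web Audio API delivers samples as 32-bit floats in the range [-1, 1], while PCM16 expects 16-bit signed integers. The function name `floatTo16BitPCM` is hypothetical and is not taken from the project's source; resampling to 16 kHz is a separate step that is not shown here.

```javascript
// Hypothetical helper (not from the project's source): converts Float32
// samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Negative values scale by 32768 and positive values by 32767,
    // so the result covers the full int16 range.
    pcm[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return pcm;
}
```

Inside an AudioWorklet processor, a conversion like this would typically run on each block of input samples before the resulting bytes are posted back to the main thread for streaming.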

Key Features

Real-Time Transcription

Stream audio and receive transcriptions with minimal latency

Browser-Native

Runs entirely in the browser with no plugins or extensions required

Secure Authentication

Uses temporary tokens to keep your API key secure on the server

AudioWorklet Processing

Low-latency audio processing using modern Web Audio API features

Architecture Overview

The application consists of three main components:
  1. Express Server - Generates temporary authentication tokens to keep your API key secure
  2. Client-Side Audio Processing - Captures microphone input and converts it to the required format
  3. WebSocket Connection - Streams audio data to AssemblyAI and receives transcription results

Before running this example, you need an upgraded AssemblyAI account. The real-time API is only available to accounts with billing enabled. Learn more in the Account Upgrade guide.
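
As a rough illustration of how the client might handle results arriving over the WebSocket connection, the sketch below keeps the latest transcript for each turn and renders them in turn order. The message shape (`type`, `turn_order`, `transcript`) is an assumption made for this example and should be checked against the API Reference; `createTurnStore` is a hypothetical helper, not part of the project.

```javascript
// Assumed message shape (verify against the API Reference):
//   { type: "Turn", turn_order: 0, transcript: "hello" }
// createTurnStore is a hypothetical helper written for this sketch.
function createTurnStore() {
  const turns = new Map(); // turn_order -> latest transcript for that turn

  return {
    // Record a parsed message. A later update for the same turn
    // overwrites the earlier partial transcript.
    apply(message) {
      if (message.type !== "Turn") return;
      turns.set(message.turn_order, message.transcript);
    },
    // Join all turns in ascending turn order, regardless of arrival order.
    text() {
      return [...turns.keys()]
        .sort((a, b) => a - b)
        .map((order) => turns.get(order))
        .join(" ");
    },
  };
}
```

In the browser, a WebSocket `onmessage` handler could pass `JSON.parse(event.data)` to `apply` and then write the result of `text()` into the page. Keying on the turn number means partial results replace each other in place rather than being appended.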

Prerequisites

  • Node.js installed on your system
  • An AssemblyAI account with billing enabled
  • A modern browser with support for:
    • Web Audio API
    • AudioWorklet
    • WebSocket
    • getUserMedia
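
A page can check for these browser features before starting a session. The following is a minimal sketch, not code from the project: `missingFeatures` is a hypothetical helper, and it takes the global object as a parameter so it can also be exercised outside a browser.

```javascript
// Hypothetical helper: returns the names of required browser features
// that are missing from `env`. `env` defaults to globalThis, so the
// function can be called with no arguments in a page.
function missingFeatures(env = globalThis) {
  const checks = {
    "Web Audio API": () => typeof env.AudioContext === "function",
    "AudioWorklet": () => typeof env.AudioWorkletNode === "function",
    "WebSocket": () => typeof env.WebSocket === "function",
    "getUserMedia": () =>
      typeof env.navigator?.mediaDevices?.getUserMedia === "function",
  };
  // Keep only the feature names whose check fails.
  return Object.keys(checks).filter((name) => !checks[name]());
}
```

An empty array means every listed feature was found. Note that browsers expose `navigator.mediaDevices` only in secure contexts (HTTPS or localhost), so `getUserMedia` can be reported missing on a plain HTTP page even in a modern browser.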

Quickstart

Get up and running in 5 minutes

Core Concepts

Understand how the system works

Implementation Guide

Step-by-step implementation details

API Reference

Detailed API documentation

Technology Stack

  • Backend: Express.js, Node.js
  • Frontend: Vanilla JavaScript
  • Audio Processing: Web Audio API, AudioWorklet
  • Real-Time Communication: WebSocket
  • Transcription: AssemblyAI Real-Time API

Next Steps

Ready to get started? Head over to the Quickstart guide to clone the repository, configure your API key, and run the application.