This feature is coming in version 0.5.0 and is not yet available in the current release.
Overview
Speech Enhancement will improve audio quality by removing noise, echo, and other artifacts. Used as a preprocessing step, it is expected to improve STT accuracy and audio clarity, particularly on noisy recordings.
Planned Features
Noise Reduction - Remove background noise from recordings
Echo Cancellation - Eliminate echo and reverb artifacts
Audio Normalization - Normalize volume levels
Before/After Preview - Compare original and enhanced audio
Expected API (Preview)
While the API is not finalized, the expected interface will be:
import { createEnhancement } from 'react-native-sherpa-onnx/enhancement';
// Create enhancement engine
const enhancer = await createEnhancement({
  modelPath: { type: 'asset', path: 'models/rnnoise' },
  noiseReductionLevel: 0.8, // 0..1
});
// Enhance audio file
const enhanced = await enhancer.processFile('/path/to/noisy.wav');
// Save enhanced audio
await saveAudioToFile(enhanced, '/path/to/clean.wav');
// Cleanup
await enhancer.destroy();
Use Cases
1. Pre-processing for STT
Improve transcription accuracy by cleaning audio first:
// Planned API
const enhancer = await createEnhancement(config);
const stt = await createSTT(sttConfig);
const enhanced = await enhancer.processFile('/path/to/noisy.wav');
const result = await stt.transcribeSamples(
  enhanced.samples,
  enhanced.sampleRate
);
console.log('Clean transcript:', result.text);
await enhancer.destroy();
await stt.destroy();
2. Podcast Cleanup
Remove background noise from recordings:
// Planned API
const enhancer = await createEnhancement({
  modelPath: { type: 'asset', path: 'models/rnnoise' },
  noiseReductionLevel: 0.9,
});
const enhanced = await enhancer.processFile('/path/to/podcast.wav');
await saveAudioToFile(enhanced, '/path/to/podcast-clean.wav');
await enhancer.destroy();
3. Real-time Enhancement
Enhance streaming audio:
// Planned API
const enhancer = await createEnhancement(config);
const recorder = startRecording();
recorder.on('chunk', async (samples) => {
  const enhanced = await enhancer.processSamples(samples, 16000);
  // Forward to STT or playback
  await sttStream.acceptWaveform(enhanced, 16000);
});
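Recorder chunks rarely arrive in the exact frame size an enhancement model works on, so a streaming pipeline usually buffers them first. The following is a minimal sketch of that buffering step only; `FrameAccumulator` is a hypothetical helper and not part of the planned API, and the frame size is whatever the chosen model requires.

```typescript
// Hypothetical helper, not part of react-native-sherpa-onnx.
// Buffers incoming chunks and emits only complete fixed-size frames.
class FrameAccumulator {
  private pending: number[] = [];
  private readonly frameSize: number;

  constructor(frameSize: number) {
    if (!Number.isInteger(frameSize) || frameSize <= 0) {
      throw new RangeError('frameSize must be a positive integer');
    }
    this.frameSize = frameSize;
  }

  // Append a chunk and return every complete frame now available.
  push(chunk: number[]): number[][] {
    for (const s of chunk) this.pending.push(s);
    const frames: number[][] = [];
    let offset = 0;
    while (this.pending.length - offset >= this.frameSize) {
      frames.push(this.pending.slice(offset, offset + this.frameSize));
      offset += this.frameSize;
    }
    // Keep the leftover samples for the next call.
    this.pending = this.pending.slice(offset);
    return frames;
  }

  // Number of samples still waiting for the next complete frame.
  get buffered(): number {
    return this.pending.length;
  }
}
```

Each frame returned by `push` could then be passed to the enhancer, while leftover samples stay buffered until the next chunk arrives.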
4. Call Quality Improvement
// Planned API
const enhancer = await createEnhancement({
  modelPath: { type: 'asset', path: 'models/rnnoise' },
  enableEchoCancel: true,
  enableNoiseReduction: true,
});
const enhanced = await enhancer.processFile('/path/to/call.wav');
await saveAudioToFile(enhanced, '/path/to/call-enhanced.wav');
await enhancer.destroy();
Planned Configuration
// Expected configuration options
interface EnhancementConfig {
  modelPath: ModelPathConfig;
  // Noise reduction
  noiseReductionLevel?: number; // 0 (off) to 1 (max), default 0.5
  enableNoiseReduction?: boolean; // default true
  // Echo cancellation
  enableEchoCancel?: boolean; // default false
  echoSuppressionLevel?: number; // 0..1
  // Normalization
  enableNormalization?: boolean; // default false
  targetLevel?: number; // dB, e.g., -20
  // Advanced
  frameSize?: number; // Samples per frame
  hopSize?: number; // Samples between the starts of consecutive frames
}
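The defaults noted in the comments above can be made explicit in application code. This is a minimal sketch, assuming the listed defaults hold in the final release; `EnhancementOptionsSketch`, `clampLevel`, and `resolveEnhancementOptions` are hypothetical names covering only a subset of the planned options.

```typescript
// Hypothetical subset of the planned options; the final EnhancementConfig may differ.
interface EnhancementOptionsSketch {
  noiseReductionLevel?: number;
  enableNoiseReduction?: boolean;
  enableEchoCancel?: boolean;
  enableNormalization?: boolean;
}

// Clamp a level into the documented 0..1 range.
function clampLevel(value: number): number {
  if (Number.isNaN(value)) throw new RangeError('level must be a number');
  return Math.min(1, Math.max(0, value));
}

// Fill in the defaults listed in the planned configuration comments.
function resolveEnhancementOptions(
  options: EnhancementOptionsSketch = {},
): Required<EnhancementOptionsSketch> {
  return {
    noiseReductionLevel: clampLevel(options.noiseReductionLevel ?? 0.5),
    enableNoiseReduction: options.enableNoiseReduction ?? true,
    enableEchoCancel: options.enableEchoCancel ?? false,
    enableNormalization: options.enableNormalization ?? false,
  };
}
```

Resolving options up front keeps out-of-range levels from reaching the engine and makes the effective settings easy to log.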
Expected Output
interface EnhancedAudio {
  samples: number[]; // Enhanced PCM samples
  sampleRate: number; // Sample rate
  noiseLevelBefore?: number; // Noise estimate before
  noiseLevelAfter?: number; // Noise estimate after
}
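How the library will compute the noise estimates is not specified. As a rough stand-in until then, the level of a noise-only segment can be measured as an RMS value in dBFS. The sketch below assumes float samples in the range [-1, 1]; `rmsLevelDb` is a hypothetical helper, not a library function.

```typescript
// Hypothetical helper: RMS level of PCM samples in dBFS,
// assuming samples are floats in the range [-1, 1].
function rmsLevelDb(samples: number[]): number {
  if (samples.length === 0) return -Infinity;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / samples.length);
  // Digital silence has no finite level on a logarithmic scale.
  return rms === 0 ? -Infinity : 20 * Math.log10(rms);
}
```

Measuring the same silent passage before and after processing gives a simple, if crude, indication of how much background noise was removed.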
Enhancement Levels
// Planned presets
const presets = {
  light: { noiseReductionLevel: 0.3 },
  moderate: { noiseReductionLevel: 0.6 },
  aggressive: { noiseReductionLevel: 0.9 },
};
const enhancer = await createEnhancement({
  modelPath: { type: 'asset', path: 'models/rnnoise' },
  ...presets.moderate,
});
Expected Models
Likely model support:
RNNoise - Lightweight noise suppression
DeepFilterNet - Deep learning-based enhancement
Speex - Classic noise reduction
Custom sherpa-onnx models - Optimized for mobile
Timeline
Enhancement support is planned for:
Version 0.5.0 - Initial enhancement with basic noise reduction
Future versions - Advanced features like echo cancellation and real-time processing
Stay Updated
To track progress or contribute:
Current Workarounds
While enhancement is not available, you can:
External libraries - Use JavaScript audio libraries (e.g., Web Audio API)
Pre-process offline - Use desktop tools (Audacity, FFmpeg) before importing
Cloud services - Use enhancement APIs from cloud providers
Simple Normalization Example
function normalizeAudio(samples: number[]): number[] {
  // Find peak with a loop: Math.max(...array) can exceed the
  // engine's argument limit on long recordings
  let peak = 0;
  for (const s of samples) {
    const magnitude = Math.abs(s);
    if (magnitude > peak) peak = magnitude;
  }
  if (peak === 0) return samples;
  // Normalize to 0.8 (-1.9 dB)
  const targetPeak = 0.8;
  const gain = targetPeak / peak;
  return samples.map((s) => s * gain);
}
// Usage
const samples = getPcmSamples();
const normalized = normalizeAudio(samples);
// Now use for STT
const result = await stt.transcribeSamples(normalized, 16000);
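The -1.9 dB figure in the normalization comment follows from the standard amplitude-to-decibel conversion, 20 * log10(0.8). The sketch below shows that conversion in both directions and a variant of peak normalization that takes its target in dB, similar in spirit to the planned `targetLevel` option. All three function names are hypothetical.

```typescript
// Convert a level in decibels to a linear amplitude ratio.
function dbToAmplitude(db: number): number {
  return Math.pow(10, db / 20);
}

// Convert a linear amplitude ratio to decibels.
function amplitudeToDb(amplitude: number): number {
  if (amplitude <= 0) return -Infinity;
  return 20 * Math.log10(amplitude);
}

// Scale samples so that the largest absolute value sits at targetDb (dBFS).
function normalizePeakToDb(samples: number[], targetDb: number): number[] {
  let peak = 0;
  for (const s of samples) peak = Math.max(peak, Math.abs(s));
  if (peak === 0) return samples.slice(); // silence: nothing to scale
  const gain = dbToAmplitude(targetDb) / peak;
  return samples.map((s) => s * gain);
}
```

Note that this targets the peak, not the average loudness; a -20 dB target applied to the peak produces much quieter audio than a -20 dB RMS target would.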
Simple High-pass Filter
function highPassFilter(samples: number[], alpha: number = 0.95): number[] {
  const filtered: number[] = [];
  let prev = 0;
  for (let i = 0; i < samples.length; i++) {
    const current = samples[i];
    filtered[i] = alpha * (prev + current - (samples[i - 1] || 0));
    prev = filtered[i];
  }
  return filtered;
}
// Usage (removes low-frequency noise)
const samples = getPcmSamples();
const filtered = highPassFilter(samples);
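The `alpha` parameter is easier to choose when expressed as a cutoff frequency. Treating the filter above as a discretized first-order RC high-pass, alpha = RC / (RC + dt) with RC = 1 / (2 * pi * fc) and dt = 1 / sampleRate. The helpers below are a sketch under that assumption, and their names are hypothetical.

```typescript
// Smoothing factor for a first-order RC high-pass filter:
// alpha = RC / (RC + dt), with RC = 1 / (2*pi*fc) and dt = 1 / sampleRate.
function highPassAlpha(cutoffHz: number, sampleRate: number): number {
  if (cutoffHz <= 0 || sampleRate <= 0) {
    throw new RangeError('cutoffHz and sampleRate must be positive');
  }
  const rc = 1 / (2 * Math.PI * cutoffHz);
  const dt = 1 / sampleRate;
  return rc / (rc + dt);
}

// Inverse: the approximate cutoff frequency implied by a given alpha.
function highPassCutoffHz(alpha: number, sampleRate: number): number {
  if (alpha <= 0 || alpha >= 1) {
    throw new RangeError('alpha must be between 0 and 1 (exclusive)');
  }
  return (sampleRate * (1 - alpha)) / (2 * Math.PI * alpha);
}
```

Under this model, the default `alpha` of 0.95 at a 16 kHz sample rate corresponds to a cutoff of roughly 134 Hz, which attenuates hum and rumble while leaving most speech energy intact.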
Integration with STT Pipeline
When available, enhancement will integrate seamlessly:
// Future combined API (preview)
import { createEnhancement } from 'react-native-sherpa-onnx/enhancement';
import { createSTT } from 'react-native-sherpa-onnx/stt';
const enhancer = await createEnhancement(enhancementConfig);
const stt = await createSTT(sttConfig);
// Process pipeline
const enhanced = await enhancer.processFile('/path/to/noisy.wav');
const transcript = await stt.transcribeSamples(
  enhanced.samples,
  enhanced.sampleRate
);
console.log('Transcript:', transcript.text);
console.log(
  'Noise reduction:',
  `${enhanced.noiseLevelBefore} -> ${enhanced.noiseLevelAfter} dB`
);
await enhancer.destroy();
await stt.destroy();
Expected before/after comparison:
// Planned API
const enhancer = await createEnhancement(config);
const result = await enhancer.processFile('/path/to/noisy.wav', {
  returnOriginal: true, // Include original for comparison
});
// Play original
await playAudio(result.original.samples, result.original.sampleRate);
// Play enhanced
await playAudio(result.enhanced.samples, result.enhanced.sampleRate);
console.log('SNR improvement:', result.snrImprovement, 'dB');
Speech-to-Text - Transcribe enhanced audio
Source Separation - Separate voice from background (coming in v0.6.0)