Overview
Streaming TTS generates speech incrementally, delivering audio chunks as they are produced. This enables lower time-to-first-byte and immediate playback while synthesis continues.
Quick Start
import { createStreamingTTS } from 'react-native-sherpa-onnx/tts';

// 1) Create streaming TTS engine
const tts = await createStreamingTTS({
  modelPath: { type: 'asset', path: 'models/vits-piper-en' },
  modelType: 'vits',
});

// 2) Generate speech with streaming callbacks
const controller = await tts.generateSpeechStream(
  'Hello, this is streaming TTS.',
  undefined,
  {
    onChunk: (chunk) => {
      // chunk.samples: number[] of float PCM in [-1, 1]
      // chunk.sampleRate: number
      // chunk.progress: 0..1
      // chunk.isFinal: boolean
      playAudio(chunk.samples, chunk.sampleRate);
    },
    onEnd: () => console.log('Generation complete'),
    onError: (err) => console.error('Error:', err.message),
  }
);

// 3) Cleanup once the stream has finished (after onEnd or onError has fired)
await tts.destroy();
Built-in PCM Player
Use the native PCM player for minimal latency:
const sampleRate = await tts . getSampleRate ();
await tts . startPcmPlayer ( sampleRate , 1 ); // mono
const controller = await tts . generateSpeechStream (
'Hello, world!' ,
undefined ,
{
onChunk : ( chunk ) => {
if ( chunk . samples . length > 0 ) {
tts . writePcmChunk ( chunk . samples );
}
},
onEnd : () => tts . stopPcmPlayer (),
onError : () => tts . stopPcmPlayer (),
}
);
Engine Creation
Create a streaming TTS engine (same as batch TTS):
const tts = await createStreamingTTS ({
modelPath: { type: 'asset' , path: 'models/vits-piper-en' },
modelType: 'auto' , // or explicit: 'vits', 'matcha', etc.
// Performance
numThreads: 4 ,
provider: 'cpu' ,
// Model options
modelOptions: {
vits: {
noiseScale: 0.667 ,
noiseScaleW: 0.8 ,
lengthScale: 1.0 ,
},
},
// Config-level options
maxNumSentences: 1 , // Sentences per callback
silenceScale: 0.2 ,
});
Generate Speech Stream
const controller = await tts . generateSpeechStream (
text ,
options , // TtsGenerationOptions or undefined
handlers // TtsStreamHandlers
);
Generation Options
Same as batch TTS:
const controller = await tts . generateSpeechStream (
'Hello, world!' ,
{
sid: 0 , // Speaker ID
speed: 1.2 , // Speed multiplier
silenceScale: 0.3 ,
},
handlers
);
Stream Handlers
interface TtsStreamHandlers {
onChunk ?: ( chunk : TtsStreamChunk ) => void ;
onEnd ?: ( event : TtsStreamEnd ) => void ;
onError ?: ( event : TtsStreamError ) => void ;
}
Chunk Event
interface TtsStreamChunk {
instanceId ?: string ;
requestId ?: string ;
samples : number []; // Float PCM in [-1, 1]
sampleRate : number ; // Sample rate in Hz
progress : number ; // 0..1
isFinal : boolean ; // True for last chunk
}
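When debugging a stream it can help to log how much audio each chunk carries and whether its samples stay inside the documented [-1, 1] range. The following is a minimal sketch under that assumption; `chunkStats` and `ChunkStats` are hypothetical names and not part of the library.

```typescript
// Hypothetical debugging helper; not part of react-native-sherpa-onnx.
interface ChunkStats {
  durationMs: number; // playback length of the chunk (mono)
  peak: number;       // largest absolute sample value
  outOfRange: number; // count of samples outside [-1, 1]
}

function chunkStats(samples: number[], sampleRate: number): ChunkStats {
  let peak = 0;
  let outOfRange = 0;
  for (const value of samples) {
    const magnitude = Math.abs(value);
    if (magnitude > peak) peak = magnitude;
    if (magnitude > 1) outOfRange += 1;
  }
  // Guard against a zero sample rate so the result is never NaN or Infinity.
  const durationMs = sampleRate > 0 ? (samples.length / sampleRate) * 1000 : 0;
  return { durationMs, peak, outOfRange };
}
```

Inside `onChunk`, a call such as `chunkStats(chunk.samples, chunk.sampleRate)` gives a quick view of chunk size without retaining the samples.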
End Event
interface TtsStreamEnd {
instanceId ?: string ;
requestId ?: string ;
cancelled : boolean ; // True if cancelled
}
Error Event
interface TtsStreamError {
instanceId ?: string ;
requestId ?: string ;
message : string ;
}
Stream Controller
The controller manages the streaming generation:
interface TtsStreamController {
cancel : () => Promise < void >; // Stop generation
unsubscribe : () => void ; // Remove listeners
}
Cancel Generation
const controller = await tts . generateSpeechStream ( text , undefined , handlers );
// User taps "Stop"
await controller . cancel ();
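A "Stop" button can be tapped more than once before the stream has finished winding down. One possible guard, sketched below, wraps the controller so that `cancel()` is invoked at most once and later taps reuse the same promise. `createStopHandler` is a hypothetical helper, and the local `StreamController` interface is only a copy of the controller shape documented above.

```typescript
// Local copy of the documented controller shape, so the sketch is self-contained.
interface StreamController {
  cancel: () => Promise<void>;
  unsubscribe: () => void;
}

// Hypothetical helper: returns a stop function that calls cancel() at most once.
function createStopHandler(controller: StreamController): () => Promise<void> {
  let pending: Promise<void> | null = null;
  return () => {
    if (pending === null) {
      pending = controller.cancel();
    }
    // Repeated calls share the promise of the first cancel().
    return pending;
  };
}
```

The returned function can be passed directly to a button's press handler.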
Unsubscribe Listeners
// Automatically called after onEnd/onError
// Manually call if discarding controller early
controller . unsubscribe ();
PCM Player API
Start Player
const sampleRate = await tts . getSampleRate ();
const numChannels = 1 ; // mono
await tts . startPcmPlayer ( sampleRate , numChannels );
Write Chunks
onChunk: (chunk) => {
  // Samples must be in [-1, 1]
  // No await here: the handler is not async, so the write is fire-and-forget
  tts.writePcmChunk(chunk.samples);
}
Stop Player
await tts . stopPcmPlayer ();
Complete Example
import { createStreamingTTS } from 'react-native-sherpa-onnx/tts' ;
async function streamSpeech(text: string) {
  const tts = await createStreamingTTS({
    modelPath: { type: 'asset', path: 'models/vits-piper-en_US' },
    modelType: 'vits',
    numThreads: 4,
  });

  // Stop playback and release the engine once the stream has ended or failed.
  // Destroying the engine right after generateSpeechStream() returns would
  // tear it down while chunks are still being produced.
  const cleanup = async () => {
    await tts.stopPcmPlayer();
    await tts.destroy();
  };

  try {
    const sampleRate = await tts.getSampleRate();
    await tts.startPcmPlayer(sampleRate, 1);
    const controller = await tts.generateSpeechStream(
      text,
      { speed: 1.0 },
      {
        onChunk: (chunk) => {
          console.log(`Progress: ${(chunk.progress * 100).toFixed(0)}%`);
          if (chunk.samples.length > 0) {
            tts.writePcmChunk(chunk.samples);
          }
        },
        onEnd: (e) => {
          console.log(e.cancelled ? 'Generation cancelled' : 'Generation complete');
          cleanup().catch(console.error);
        },
        onError: (err) => {
          console.error('TTS error:', err.message);
          cleanup().catch(console.error);
        },
      }
    );
    // Return controller for potential cancellation
    return controller;
  } catch (err) {
    // Setup failed before streaming started, so release the engine here
    await tts.destroy();
    throw err;
  }
}
// Usage
const controller = await streamSpeech ( 'Hello, world!' );
// Cancel if needed
// await controller.cancel();
Recording Streamed Audio
Accumulate chunks to save after generation:
const chunks : number [] = [];
let sampleRate = 0 ;
const controller = await tts . generateSpeechStream ( text , undefined , {
onChunk : ( chunk ) => {
sampleRate = chunk . sampleRate ;
for (const sample of chunk.samples) chunks.push(sample); // avoids spreading a large array into push()
// Also play live
tts . writePcmChunk ( chunk . samples );
},
onEnd : async () => {
tts . stopPcmPlayer ();
// Save accumulated audio
if ( chunks . length > 0 ) {
await saveAudioToFile (
{ samples: chunks , sampleRate },
'/path/to/output.wav'
);
}
},
onError : () => tts . stopPcmPlayer (),
});
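If the accumulated samples need to leave the app as raw WAV bytes (for an upload, say) rather than go through `saveAudioToFile`, the container can be built by hand. The sketch below writes a standard 44-byte RIFF/WAVE header followed by 16-bit little-endian PCM. `encodeWav16` is a hypothetical helper, not a library function, and it assumes the input holds floats in [-1, 1] that are already interleaved if more than one channel is used.

```typescript
// Hypothetical helper; not part of react-native-sherpa-onnx.
function encodeWav16(samples: number[], sampleRate: number, numChannels = 1): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);   // remaining file size after this field
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true);             // fmt chunk size for PCM
  view.setUint16(20, 1, true);              // audio format 1 = linear PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * numChannels * bytesPerSample, true); // byte rate
  view.setUint16(32, numChannels * bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);             // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);

  samples.forEach((value, i) => {
    // Clamp, then scale so that -1 maps to -32768 and +1 maps to 32767.
    const s = Math.max(-1, Math.min(1, value));
    const scaled = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
    view.setInt16(44 + i * bytesPerSample, scaled, true);
  });

  return new Uint8Array(buffer);
}
```

The result of `encodeWav16(chunks, sampleRate)` can be handed to whatever file or network API the app already uses for binary data.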
Voice Cloning (Pocket TTS)
Stream with voice cloning for Kotlin-engine models:
const tts = await createStreamingTTS ({
modelPath: { type: 'asset' , path: 'models/pocket-tts' },
modelType: 'pocket' ,
});
const controller = await tts . generateSpeechStream (
'Target text in cloned voice' ,
{
referenceAudio: { samples: refSamples , sampleRate: 22050 },
referenceText: 'Reference transcript' ,
numSteps: 20 ,
extra: { temperature: '0.7' },
},
handlers
);
Note: Streaming with reference audio is not supported for ZipVoice. Use batch generateSpeech for ZipVoice voice cloning.
Multiple Concurrent Requests
Only one stream per engine is allowed at a time. For concurrent requests:
Option A: Sequential
Wait for onEnd before starting the next:
await tts . generateSpeechStream ( text1 , undefined , handlers1 );
// Wait for onEnd...
await tts . generateSpeechStream ( text2 , undefined , handlers2 );
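Awaiting `generateSpeechStream` only yields the controller, so "wait for onEnd" needs its own promise. One way to express that is sketched below: a hypothetical `streamToCompletion` wrapper that resolves when `onEnd` fires and rejects when `onError` fires. The interfaces are trimmed local copies of the shapes documented on this page so that the block is self-contained; the wrapper is not a library API.

```typescript
// Trimmed local copies of the documented shapes.
interface StreamChunk { samples: number[]; sampleRate: number; progress: number; isFinal: boolean; }
interface StreamEnd { cancelled: boolean; }
interface StreamError { message: string; }
interface StreamHandlers {
  onChunk?: (chunk: StreamChunk) => void;
  onEnd?: (event: StreamEnd) => void;
  onError?: (event: StreamError) => void;
}
interface StreamController { cancel: () => Promise<void>; unsubscribe: () => void; }
interface StreamingEngine {
  generateSpeechStream(
    text: string,
    options: undefined,
    handlers: StreamHandlers,
  ): Promise<StreamController>;
}

// Hypothetical helper: settles only when the stream has ended or failed.
function streamToCompletion(
  tts: StreamingEngine,
  text: string,
  onChunk?: (chunk: StreamChunk) => void,
): Promise<StreamEnd> {
  return new Promise<StreamEnd>((resolve, reject) => {
    tts
      .generateSpeechStream(text, undefined, {
        onChunk,
        onEnd: (event) => resolve(event),
        onError: (event) => reject(new Error(event.message)),
      })
      .catch(reject); // the call itself may fail before any event fires
  });
}

// Speak several texts one after another on a single engine.
async function speakInOrder(tts: StreamingEngine, texts: string[]): Promise<void> {
  for (const text of texts) {
    const end = await streamToCompletion(tts, text);
    if (end.cancelled) break; // stop the queue if a stream was cancelled
  }
}
```

With this shape, `await speakInOrder(tts, [text1, text2])` starts the second stream only after the first has reported `onEnd`.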
Option B: Multiple Engines
Create separate engines:
const tts1 = await createStreamingTTS ( config );
const tts2 = await createStreamingTTS ( config );
await tts1 . generateSpeechStream ( text1 , undefined , handlers1 );
await tts2 . generateSpeechStream ( text2 , undefined , handlers2 );
await tts1 . destroy ();
await tts2 . destroy ();
Threading
const tts = await createStreamingTTS ({
modelPath: { type: 'asset' , path: 'models/vits-piper' },
modelType: 'vits' ,
numThreads: 4 , // Use multiple cores
});
Chunk Size
Control via maxNumSentences:
const tts = await createStreamingTTS ({
modelPath: { type: 'asset' , path: 'models/vits-piper' },
modelType: 'vits' ,
maxNumSentences: 2 , // Larger chunks, less frequent callbacks
});
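To judge whether a given `maxNumSentences` or `numThreads` setting suits an app, it may help to measure time to first chunk and total audio produced. The sketch below is one way to collect those numbers from `onChunk`; `createStreamMetrics` is a hypothetical helper, and the clock is passed in so the logic can be exercised without real timers.

```typescript
// Hypothetical measurement helper; not part of react-native-sherpa-onnx.
interface StreamMetrics {
  timeToFirstChunkMs: number | null; // null until the first chunk arrives
  audioMs: number;                   // total audio duration received so far (mono)
  chunkCount: number;
}

function createStreamMetrics(now: () => number = () => Date.now()) {
  const startedAt = now();
  const metrics: StreamMetrics = { timeToFirstChunkMs: null, audioMs: 0, chunkCount: 0 };
  return {
    // Call from onChunk with chunk.samples.length and chunk.sampleRate.
    record(sampleCount: number, sampleRate: number): void {
      if (metrics.timeToFirstChunkMs === null) {
        metrics.timeToFirstChunkMs = now() - startedAt;
      }
      metrics.chunkCount += 1;
      if (sampleRate > 0) {
        metrics.audioMs += (sampleCount / sampleRate) * 1000;
      }
    },
    // Returns a copy so callers cannot mutate the running totals.
    snapshot(): StreamMetrics {
      return { ...metrics };
    },
  };
}
```

Create the metrics object just before calling `generateSpeechStream`, call `record` in `onChunk`, and log `snapshot()` in `onEnd` to compare configurations.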
Memory
Avoid accumulating all chunks in JS for very long texts
Use native player to minimize JS memory usage
Save incrementally to files if needed
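When chunks do have to be kept in JS, storing them as typed arrays is usually lighter than growing one large `number[]`, and it avoids spreading a big array into `push()`, which can exceed the engine's argument limit. The following is a sketch of that idea; `PcmAccumulator` is a hypothetical class, not a library API.

```typescript
// Hypothetical helper; not part of react-native-sherpa-onnx.
class PcmAccumulator {
  private parts: Float32Array[] = [];
  private total = 0;

  // Copy one chunk into a compact Float32Array (4 bytes per sample).
  append(samples: number[]): void {
    if (samples.length === 0) return;
    this.parts.push(Float32Array.from(samples));
    this.total += samples.length;
  }

  get length(): number {
    return this.total;
  }

  // Concatenate all stored chunks into a single buffer.
  toFloat32Array(): Float32Array {
    const out = new Float32Array(this.total);
    let offset = 0;
    for (const part of this.parts) {
      out.set(part, offset);
      offset += part.length;
    }
    return out;
  }

  // Drop stored chunks, for example after flushing them to a file.
  clear(): void {
    this.parts = [];
    this.total = 0;
  }
}
```

Calling `append(chunk.samples)` in `onChunk` and `clear()` after each incremental save keeps the amount of audio held in JS bounded.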
Error Handling
const controller = await tts . generateSpeechStream (
text ,
undefined ,
{
onChunk: (chunk) => playAudio(chunk.samples, chunk.sampleRate),
onEnd : ( e ) => {
if ( ! e . cancelled ) {
console . log ( 'Success' );
}
},
onError : ( e ) => {
console . error ( 'TTS streaming error:' , e . message );
// Cleanup, stop playback, show error UI
},
}
);
Cleanup
Always clean up resources:
// Create the engine outside the try block so it is in scope for finally
const tts = await createStreamingTTS({ /* ... */ });
try {
  // Use streaming TTS
  const controller = await tts.generateSpeechStream(text, undefined, handlers);
  // Wait for completion (onEnd/onError) or cancel before leaving this block
  // ...
} finally {
  await tts.destroy();
}
Listeners are automatically removed after onEnd or onError. Call controller.unsubscribe() manually only if discarding the controller before completion.
Supported Models
All TTS model types support streaming:
VITS (Piper)
Matcha
Kokoro
Kitten
Pocket
ZipVoice (batch generateSpeech only for voice cloning)
Next Steps
Batch TTS: Generate complete audio buffers
Model Setup: Download and configure TTS models