A Max TTS Batch Size of 0 keeps the current default behavior. Set Max TTS Batch Size to 1 to force split chunks to run one at a time.
Buffered generation keeps chunk order and decodes codec sub-batches no larger than the current TTS batch.
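As a rough illustration of the buffered path, the sketch below splits one TTS batch into codec sub-batches without reordering chunks. The function name iter_codec_sub_batches and the max_codec_batch_size parameter are hypothetical, and treating 0 as "no extra codec limit" is an assumption, not something the app is confirmed to do.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_codec_sub_batches(
    tts_batch: Sequence[T], max_codec_batch_size: int = 0
) -> Iterator[List[T]]:
    """Yield order-preserving slices of one TTS batch for codec decoding.

    Hypothetical helper: a sub-batch never exceeds the size of the TTS
    batch it came from. A positive max_codec_batch_size (assumed meaning)
    lowers that cap further; 0 is assumed to mean "no extra limit".
    """
    if not tts_batch:
        return  # nothing to decode
    limit = len(tts_batch)
    if max_codec_batch_size > 0:
        limit = min(limit, max_codec_batch_size)
    # Walk the batch left to right so chunk order is preserved.
    for start in range(0, len(tts_batch), limit):
        yield list(tts_batch[start:start + limit])
```

With five chunks and a codec cap of 2, this yields groups of 2, 2, and 1 in the original order.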
Realtime Streaming Decode keeps output order and uses the smallest active chunk-group width among auto batching, Max TTS Batch Size, and Max Codec Batch Size.
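The width rule for Realtime Streaming Decode can be sketched as a minimum over the limits that are active. The helper name effective_group_width is hypothetical. Reading "active" as "set to a positive value" and falling back to a width of 1 when no limit is active are both assumptions made for this sketch.

```python
from typing import Optional


def effective_group_width(
    auto_width: Optional[int],
    max_tts_batch_size: Optional[int],
    max_codec_batch_size: Optional[int],
) -> int:
    """Return the smallest active chunk-group width.

    Assumption: a limit is "active" only when it is a positive integer;
    0 or None means the limit is not applied.
    """
    candidates = (auto_width, max_tts_batch_size, max_codec_batch_size)
    active = [w for w in candidates if w is not None and w > 0]
    # Assumed fallback: process one chunk at a time if nothing is active.
    return min(active) if active else 1
```

For example, an auto batching width of 4 with Max TTS Batch Size 0 and Max Codec Batch Size 2 gives a width of 2 under these assumptions.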
This ONNX app uses the execution provider selected at server start. CPU Threads selects which cached ONNX Runtime instance handles that request.
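One way to picture the per-request selection is a small cache keyed by thread count, with the provider fixed once at startup. This is only a sketch: RuntimeCache and the factory callback are hypothetical names, and the app's real cache may be built differently. The sketch also ignores locking, which a multi-threaded server would likely need.

```python
from typing import Callable, Dict, Generic, TypeVar

S = TypeVar("S")


class RuntimeCache(Generic[S]):
    """Hypothetical cache of runtime instances keyed by CPU thread count."""

    def __init__(self, provider: str, factory: Callable[[str, int], S]) -> None:
        self.provider = provider  # chosen once at server start, never per request
        self._factory = factory   # builds one runtime for (provider, cpu_threads)
        self._by_threads: Dict[int, S] = {}

    def get(self, cpu_threads: int) -> S:
        """Return the cached instance for this thread count, building it once."""
        if cpu_threads not in self._by_threads:
            self._by_threads[cpu_threads] = self._factory(self.provider, cpu_threads)
        return self._by_threads[cpu_threads]
```

In a real deployment the factory would presumably create an ONNX Runtime session with the requested thread count. Here it is left as a callback so the sketch stays self-contained.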
fixed uses the baked-in ONNX sampling constants.
WeTextProcessing and normalize_tts_text can now be toggled independently for each request.
WeTextProcessing is preloaded during startup so enabling it does not add first-request graph-build latency.