Replies: 1 comment
Hi @baduyne! I'm Dosu and I'm helping the sherpa-onnx team.

sherpa-onnx supports batch inference (batch size > 1) for ASR models, which can reduce latency and improve throughput when several diarization segments are processed together. The websocket servers expose a `--max-batch-size` option to control this behavior, and the system is designed to process multiple inputs in parallel batches.

The documentation gives explicit batch examples for models like Paraformer and Transducer, but it does not specifically mention Whisper. Since batch processing is a general feature in sherpa-onnx, however, it is likely supported for Whisper as well when the model is integrated the same way. If the model requires it, make sure your segments are padded or preprocessed to the same length, as is common in batch ASR workflows.

In summary: batching diarization segments for Whisper ASR in sherpa-onnx should be possible and makes more efficient use of your hardware. Use the `--max-batch-size` parameter to tune performance for your use case.

To reply, just mention @dosu.
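For reference, here is a minimal sketch of what batched decoding could look like with the sherpa-onnx Python offline API, assuming a Whisper model exported to ONNX. The model paths and the `segments` list are placeholders for your own diarization output, and this is not a verified recipe for your setup:

```python
import sherpa_onnx

# Placeholder paths: point these at your exported Whisper ONNX files.
recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(
    encoder="whisper-encoder.onnx",
    decoder="whisper-decoder.onnx",
    tokens="tokens.txt",
)

# `segments` is assumed to be a list of (samples, sample_rate) tuples,
# one per diarization segment, with samples as float32 arrays in [-1, 1].
streams = []
for samples, sample_rate in segments:
    s = recognizer.create_stream()
    s.accept_waveform(sample_rate, samples)
    streams.append(s)

# Decode all segments in one batched call instead of one at a time.
recognizer.decode_streams(streams)

for s in streams:
    print(s.result.text)
```

The key point is that `decode_streams` takes a list of streams, so all diarization segments can be decoded together rather than in a per-segment loop.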