
[TTS][Magpietts] Unify Longform and Standard Inference logic#15375

Merged
subhankar-ghosh merged 26 commits intomainfrom
magpietts_longform_unify
Mar 7, 2026
Conversation

@subhankar-ghosh
Collaborator

Important

The Update branch button must only be pressed on very rare occasions.
An outdated branch is never blocking the merge of a PR.
Please reach out to the automation team before pressing that button.

What does this PR do ?

This pull request refactors and unifies the text chunking and inference logic for TTS (Text-to-Speech) in the MagpieTTS pipeline. The main change is the replacement of the previous "longform" inference logic with a new, language-aware, unified chunked inference path. This affects dataset preparation, model state management, argument parsing, and the inference runner, making the codebase simpler and more robust for both short and long texts.

Key changes:

Unified Inference and Text Chunking

  • Replaced the old longform inference logic with a unified, automatic text chunking approach that determines chunking based on per-sample language thresholds. Short texts are processed as single chunks, while long texts are split into sentences automatically. (examples/tts/magpietts_inference.py, nemo/collections/tts/data/text_to_speech_dataset.py, nemo/collections/tts/models/magpietts.py)
  • Removed all command-line arguments related to explicit longform control (--longform_mode, --longform_word_threshold, etc.), simplifying the inference interface. (examples/tts/magpietts_inference.py)
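The per-sample decision described above can be sketched as follows. This is a hedged illustration only: the threshold values, the language codes, and the internals of the `chunk_text_for_inference()` helper (named in the review summary below, but whose real signature and splitting strategy live in nemo/collections/tts/parts/utils/tts_dataset_utils.py) are assumptions, not the actual implementation.

```python
import re

# Hypothetical per-language word-count thresholds; the real values and
# language codes in NeMo's tts_dataset_utils may differ.
CHUNK_WORD_THRESHOLDS = {"en": 50, "hi": 40}
DEFAULT_THRESHOLD = 50


def chunk_text_for_inference(text, language="en"):
    """Return the text as a single chunk if it is short enough for the
    given language; otherwise split it into sentence chunks."""
    threshold = CHUNK_WORD_THRESHOLDS.get(language, DEFAULT_THRESHOLD)
    if len(text.split()) <= threshold:
        return [text]
    # Naive sentence splitter on terminal punctuation; the real utility
    # likely uses a more robust, language-aware sentence segmenter.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s]
```

A short input comes back as one chunk, so the downstream inference loop can treat short and long texts uniformly: every sample is just a list of one or more chunks.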

Dataset and Collation Refactor

  • Introduced ChunkedTTSInferenceDataset (replacing LongFormTTSInferenceDataset) with per-sample, language-aware chunking and tokenizer selection. The dataset now automatically decides the chunking strategy based on language and text length. (nemo/collections/tts/data/text_to_speech_dataset.py)
  • Updated the dataset's collate_fn to handle variable-length chunked batches, padding as needed, and to generalize beyond the previous longform-specific logic. (nemo/collections/tts/data/text_to_speech_dataset.py)
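The core problem such a collate_fn solves is that different samples in one batch may have different numbers of chunks, each of a different length. A minimal stdlib sketch of that padding logic is below; the function name and return shape are hypothetical, and the real ChunkedTTSInferenceDataset.collate_fn operates on tensors rather than Python lists.

```python
def collate_chunked_batch(batch, pad_id=0):
    """Pad a batch where each sample is a list of token-id chunks of
    varying count and length, so every sample ends up with the same
    number of chunks and every chunk the same length. Returns the
    rectangular batch plus the true chunk counts and lengths, which
    downstream code needs to mask out the padding."""
    max_chunks = max(len(sample) for sample in batch)
    max_len = max(len(chunk) for sample in batch for chunk in sample)
    padded, chunk_counts, chunk_lens = [], [], []
    for sample in batch:
        chunk_counts.append(len(sample))
        chunk_lens.append([len(c) for c in sample])
        # Pad each chunk to the longest chunk in the batch.
        sample_out = [c + [pad_id] * (max_len - len(c)) for c in sample]
        # Pad missing chunks with all-pad rows so the batch is rectangular.
        sample_out += [[pad_id] * max_len] * (max_chunks - len(sample))
        padded.append(sample_out)
    return padded, chunk_counts, chunk_lens
```

Because a single-chunk (short-text) sample is just the degenerate case of a multi-chunk one, this shape generalizes the previous longform-specific collation rather than special-casing it.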

Model and State Naming Consistency

  • Renamed all "longform" classes and configs to "chunked" (e.g., LongformDecoderState → ChunkedDecoderState, LongformConfig → ChunkedInferenceConfig) throughout the model code for clarity and consistency with the new unified approach. (nemo/collections/tts/models/magpietts.py)
  • Removed the _needs_longform_inference method and all language threshold logic from the model, as chunking is now handled in a unified, language-aware way. (nemo/collections/tts/models/magpietts.py)

Utility and Import Updates

  • Added and updated utility imports for chunked inference and tokenizer selection to support the new pipeline. (nemo/collections/tts/models/magpietts.py, nemo/collections/tts/data/text_to_speech_dataset.py)

These changes make the TTS inference pipeline easier to use and maintain, while improving support for multilingual and variable-length text inputs.

Collection: TTS

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and re-add the label.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: subhankar-ghosh <subhankar-ghosh@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

Refactors MagpieTTS inference to use a single “chunked” inference path for both short and long texts, with dataset-driven automatic sentence chunking based on per-sample language thresholds.

Changes:

  • Introduces language-aware thresholding + unified chunk_text_for_inference() chunking utility (replacing prior longform detection logic).
  • Replaces LongFormTTSInferenceDataset with ChunkedTTSInferenceDataset and updates the inference runner to always use the unified multi/single-chunk loop.
  • Updates CLI/example script to remove explicit longform args and align with the unified inference flow.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.

Summary per file:

  • nemo/collections/tts/parts/utils/tts_dataset_utils.py: Adds language-aware sentence splitting, thresholds, tokenizer mapping, and a unified chunking helper.
  • nemo/collections/tts/data/text_to_speech_dataset.py: Replaces the longform inference dataset with a unified chunked inference dataset plus mixed-chunk collation.
  • nemo/collections/tts/modules/magpietts_inference/inference.py: Removes standard/longform branching; always runs the unified chunk loop via generate_speech().
  • nemo/collections/tts/models/magpietts.py: Renames longform state/config to chunked equivalents; updates do_tts() to the unified chunked generation path.
  • examples/tts/magpietts_inference.py: Removes longform CLI controls and updates messaging for the unified chunking behavior.
  • tests/collections/tts/parts/utils/test_tts_dataset_utils.py: Adds unit tests for the new thresholds and unified chunking helper.


@blisc
Collaborator

blisc commented Feb 10, 2026

  • Can you resolve conflicts?
  • Can you add a unit test for magpietts.generate_speech that contains a batch of short and long texts?
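A unit test of the kind requested above might look like the following sketch. Everything here is hypothetical: the real magpietts.generate_speech signature, parameters, and return type are assumptions, and a stub model stands in for an actual checkpoint so that only the mixed short/long batch handling is exercised.

```python
class StubMagpieTTS:
    """Stand-in for a MagpieTTS model; generate_speech's real signature
    and return type in NeMo may differ."""

    def generate_speech(self, texts, language="en", word_threshold=5):
        """Return one dummy waveform (a list of floats) per input text,
        regardless of how many chunks each text was split into."""
        outputs = []
        for text in texts:
            words = text.split()
            # Long texts are processed chunk-by-chunk, then concatenated.
            n_chunks = 1 if len(words) <= word_threshold else len(words) // word_threshold
            outputs.append([0.0] * (100 * n_chunks))  # dummy audio per chunk
        return outputs


def test_generate_speech_mixed_batch():
    model = StubMagpieTTS()
    short_text = "Hello there."
    long_text = " ".join(["word"] * 20) + "."
    audio = model.generate_speech([short_text, long_text])
    assert len(audio) == 2                 # one output per input text
    assert len(audio[1]) > len(audio[0])   # long text yields longer audio


test_generate_speech_mixed_batch()
```

The point of the mixed batch is to verify that single-chunk and multi-chunk samples can coexist in one call and each still produces exactly one output.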

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
@blisc
Collaborator

blisc commented Feb 12, 2026

The GitHub UI still says that there are conflicts.

subhankar-ghosh and others added 2 commits February 12, 2026 09:58
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: subhankar-ghosh <subhankar-ghosh@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Signed-off-by: Subhankar Ghosh <subhankar2321@gmail.com>
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Collaborator

@blisc blisc left a comment


Please fix the linting errors

Comment on lines +24 to +25
--hparams_files /home/TestData/tts/2602_MagpieTTS/hprams_hi_char.yaml \
--checkpoint_files /home/TestData/tts/2602_MagpieTTS/Magpie-TTS-ML-V1--val_cer_gt=0.3258-step=1000.ckpt \
Collaborator


If this is the same checkpoint as tests/functional_tests/L2_TTS_InferEvaluate_Magpietts_SeenSpeakers.sh, we can remove this test. This test was only kept to test a different checkpoint

Collaborator


We can remove this test now since it's identical to L2_TTS_InferEvaluate_Magpietts_SeenSpeakers.sh

For this PR it is important to merge main into this branch, as it is out of date.
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
@subhankar-ghosh
Collaborator Author

Fixed the necessary linting issues; the current linting failures are due to CI issues.

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
@github-actions
Contributor

github-actions bot commented Mar 6, 2026

[🤖]: Hi @subhankar-ghosh 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

@subhankar-ghosh subhankar-ghosh merged commit 3bfbcb6 into main Mar 7, 2026
131 checks passed
@subhankar-ghosh subhankar-ghosh deleted the magpietts_longform_unify branch March 7, 2026 01:02

6 participants