add transformer class for review #11491

Closed
wants to merge 78 commits

Conversation

paarthneekhara (Collaborator)

Added the transformer stack currently being used in T5TTS. Goals: identify unused code paths, clean up the code, and see which modules can be reused.

Signed-off-by: Paarth Neekhara <[email protected]>
@github-actions github-actions bot added the TTS label Dec 5, 2024
@blisc blisc requested review from XuesongYang and rlangman December 9, 2024 18:33

self.d_model = d_model
self.non_linearity = nn.GELU(approximate="tanh")
self.proj = ConvNorm(d_model, d_model * 4, bias=bias, kernel_size=kernel_size, is_causal=is_causal)

Collaborator

Should the FFN size be configurable instead of hardcoded to 4 * d_model?
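
A hedged sketch of how that could look (everything except the names already shown in the snippet above is illustrative, not the PR's final API):

import torch.nn as nn

class PositionwiseConvFF(nn.Module):
    def __init__(self, d_model, kernel_size, d_ffn=None, bias=False, is_causal=False):
        super().__init__()
        # Default to the current behavior (4 * d_model) but allow overriding it.
        d_ffn = d_ffn if d_ffn is not None else 4 * d_model
        self.d_model = d_model
        self.non_linearity = nn.GELU(approximate="tanh")
        # ConvNorm here stands for the convolution wrapper already used by this module.
        self.proj = ConvNorm(d_model, d_ffn, bias=bias, kernel_size=kernel_size, is_causal=is_causal)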

Collaborator

Paarth changed this

Comment on lines 268 to 277
q = self.q_net(query).reshape(Bq, Tq, self.n_heads, self.d_head)
kv = self.kv_net(memory).reshape(Bkv, Tkv, 2, self.n_heads, self.d_head)
if self.pos_emb_name == 'rope':
    q, kv = self.rope(q, kv)
elif self.pos_emb_name == 'alibi':
    alibi_slopes = self.m[:, 0, 0]
q = q[~query_mask].reshape(-1, self.n_heads, self.d_head)
kv = kv[~memory_mask].reshape(-1, 2, self.n_heads, self.d_head)
lengths_q = (~query_mask).sum(1)
lengths_k = (~memory_mask).sum(1)

Collaborator

Code like this will be a lot easier to read if we replace .reshape() with einops rearrange(), and add comments with the output shapes for operations that are not reshape/rearrange.
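
For illustration only, this is roughly what the two projections above could look like with einops (a sketch, not the final code):

from einops import rearrange

# (B, Tq, n_heads * d_head) -> (B, Tq, n_heads, d_head)
q = rearrange(self.q_net(query), 'b t (h d) -> b t h d', h=self.n_heads, d=self.d_head)
# (B, Tkv, 2 * n_heads * d_head) -> (B, Tkv, 2, n_heads, d_head)
kv = rearrange(self.kv_net(memory), 'b t (two h d) -> b t two h d', two=2, h=self.n_heads, d=self.d_head)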

        if self.has_xattn:
            self.cross_attention.reset_cache(use_cache)

    def forward(self, x, x_mask, cond, cond_mask, dump_attention=False, attn_prior=None, idx=None):

Collaborator

If cond and cond_mask are optional, we should default them to None.

Should we throw an error if cond is provided, but self.has_xattn is False? Or if cond is not provided, but self.has_xattn is True?
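
A minimal sketch of the suggested signature and check (the exact behavior and message are up for discussion, not the final implementation):

def forward(self, x, x_mask, cond=None, cond_mask=None, dump_attention=False, attn_prior=None, idx=None):
    if cond is not None and not self.has_xattn:
        raise ValueError("cond was provided but this layer was constructed with has_xattn=False")
    ...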

Collaborator Author

We can default them to None, but I wouldn't raise an error if has_xattn is True and cond is None. I use that feature to pretrain the decoder with the context set to None, while keeping the same architecture and parameters for use as the pretrained T5 decoder for TTS.

p_dropout=p_dropout,
is_causal=False,
is_self_attention=False,
d_memory=params['d_heads'],

Collaborator

We should rename d_heads in params to d_memory here. d_memory is supposed to be the dimension of the context information for cross attention. d_heads refers to the size of each attention head, which this code hardcodes to d_memory // n_heads.
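
To illustrate the distinction (values are made up):

d_memory = 512                # dimension of the cross-attention memory/context
n_heads = 8
d_head = d_memory // n_heads  # 64, the per-head size this code currently derives and hardcodes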

Collaborator

We no longer use a params dict, so this should no longer happen.

paarthneekhara and others added 2 commits December 13, 2024 19:16
…d suggest careful review of changes

Signed-off-by: Paarth Neekhara <[email protected]>
Signed-off-by: Jason <[email protected]>
@blisc blisc self-requested a review December 17, 2024 19:04
use_flash_self_attention=True,
use_flash_x_attention=True,
deterministic=False,
pos_emb={"name": "learnable"},

Collaborator

Can we make the pos_emb argument more structured? Either a dataclass or, similar to xattn, flattened into parameters like pos_emb_name, pos_emb_base, pos_emb_kwargs, etc.

Nitpick: Mutable objects like dictionaries should not be used as default arguments.
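
A hedged sketch of the dataclass option (field names are hypothetical, not taken from the PR):

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PosEmbConfig:
    name: str = "learnable"                      # e.g. "learnable", "rope", "alibi"
    base: Optional[int] = None                   # e.g. RoPE base frequency, if applicable
    kwargs: dict = field(default_factory=dict)   # also avoids the mutable-default-argument nitpick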

Collaborator

Vote for @dataclass to group the configs.

Collaborator

We got rid of this parameter and changed it to a bool.

Comment on lines 376 to 380
has_xattn,
xa_d_memory=None,
xa_n_heads=None,
xa_pos_emb=None,
xa_max_length_causal_mask=None,

Collaborator

Should we group these into a CrossAttentionConfig dataclass, to make it clear which arguments are related/optional? Then we can check whether the config is None rather than relying on the has_xattn flag.
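
A minimal sketch of that idea (field names mirror the xa_* arguments above; this is illustrative, not the merged design):

from dataclasses import dataclass
from typing import Optional

@dataclass
class CrossAttentionConfig:
    d_memory: int
    n_heads: int
    pos_emb: Optional[dict] = None
    max_length_causal_mask: Optional[int] = None

# The layer could then accept `xattn: Optional[CrossAttentionConfig] = None` and enable
# cross-attention when it is not None, instead of checking a has_xattn flag.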

paarthneekhara and others added 2 commits December 19, 2024 02:43
…n pos emb and x-attn causal mask args

Signed-off-by: Paarth Neekhara <[email protected]>
XuesongYang and others added 6 commits January 18, 2025 00:53
Signed-off-by: Jason <[email protected]>
It requires that `xa_d_memory` and `xa_n_heads` are specified when `has_xattn` is True

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
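
A hedged sketch of the check described by the commit above (its exact placement and wording in __init__ are assumptions):

if has_xattn:
    assert xa_d_memory is not None and xa_n_heads is not None, \
        "xa_d_memory and xa_n_heads must be specified when has_xattn is True"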

@github-advanced-security github-advanced-security bot left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@github-actions github-actions bot removed the ASR label Jan 18, 2025
@XuesongYang XuesongYang removed the audio label Jan 18, 2025

XuesongYang commented Jan 18, 2025

Added a unit test. This is the necessary test to ensure the forward pass of the Transformer class succeeds.

The multiple-conditions case from different encoders failed the test (test_forward_causal_self_attn_and_has_xattn). It seems a list of tensors is not supported. @paarthneekhara could you please verify?

Please run pytest -s -vvv tests/collections/tts/modules/test_tts_new_transformer.py locally to test the code.

@paarthneekhara (Collaborator Author)

@XuesongYang We need to pass multi_encoder_mapping to the forward function as well for the multi-encoder case. I have updated the test case with some comments (that still need to be incorporated) and fixes. I also corrected the x = (x + x_) * x_mask.unsqueeze(-1) bug, which I believe was introduced when the mask was flipped.

FYI, I tested training and inference with this new transformer code in t5tts locally, and it seems to be working fine. Also, for a fixed set of weights, the transformer implementations in experimentalt5tts and this branch give the same output, so I think we should be good.

blisc added 2 commits January 21, 2025 08:41
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>

Contributor

beep boop 🤖: 🚨 The following files must be fixed before merge!


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.collections.tts.modules.transformer_2412
nemo/collections/tts/modules/transformer_2412.py:26:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/modules/transformer_2412.py:85:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/modules/transformer_2412.py:94:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/modules/transformer_2412.py:138:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/modules/transformer_2412.py:171:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/modules/transformer_2412.py:191:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/modules/transformer_2412.py:195:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/modules/transformer_2412.py:280:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/modules/transformer_2412.py:340:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/modules/transformer_2412.py:398:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/modules/transformer_2412.py:471:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/modules/transformer_2412.py:536:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/modules/transformer_2412.py:627:4: C0116: Missing function or method docstring (missing-function-docstring)

-----------------------------------
Your code has been rated at 9.47/10

Mitigation guide:

  • Add sensible and useful docstrings to functions and methods
  • For trivial methods like getter/setters, consider adding # pylint: disable=C0116 inside the function itself
  • To disable multiple functions/methods at once, put a # pylint: disable=C0116 before the first and a # pylint: enable=C0116 after the last (see the sketch after this list).
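
A small illustration of the disable/enable pattern from the last bullet (the function names are placeholders):

# pylint: disable=C0116
def get_d_model(self):
    return self.d_model

def set_d_model(self, value):
    self.d_model = value
# pylint: enable=C0116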

By applying these rules, we reduce the occurrence of this message in the future.

Thank you for improving NeMo's documentation!


blisc commented Jan 21, 2025

Closed in favour of #11911

@blisc blisc closed this Jan 21, 2025