Pull requests: NVIDIA/TransformerEngine

Pull requests list

[JAX] Consolidate the distributed fused attention test code
#1405 opened Jan 12, 2025 by mgoldfarb-nvidia
8 of 13 tasks
[PyTorch] Avoid parameters function in op backward pass (label: bug)
#1403 opened Jan 11, 2025 by timmoon10
3 of 13 tasks
Fix "refractor" typo in the PR template
#1402 opened Jan 11, 2025 by kit1980
Use log1p(x) instead of log(1+x)
#1401 opened Jan 11, 2025 by kit1980
1 of 6 tasks
[PyTorch] Fix AttentionParams comparison logic
#1397 opened Jan 9, 2025 by cyanguwa
8 of 13 tasks
Better cuBLAS handle management
#1389 opened Jan 2, 2025 by ptrendx
8 of 13 tasks
Update README.rst
#1385 opened Dec 23, 2024 by sbhavani
1 of 6 tasks
Don't touch or send messages to the root logger
#1380 opened Dec 19, 2024 by sagostinho-nvidia
4 of 13 tasks
[MoE][PyTorch] Add mask-based MoE permutation
#1373 opened Dec 13, 2024 by hxbai
8 of 13 tasks
Add paged attention support
#1355 opened Dec 4, 2024 by cyanguwa
8 of 13 tasks
[PyTorch] Adding TP overlap support for te.Linear with parallel_mode="column" (labels: 1.14.0, enhancement)
#1343 opened Nov 20, 2024 by denera
8 of 13 tasks
[PyTorch] Bugfix for wgrad bulk overlap conflict when dgrad overlap is reduce-scatter (label: bug)
#1341 opened Nov 18, 2024 by denera
6 of 13 tasks
[C/JAX] Comm+GEMM Overlap API for TE/JAX (labels: enhancement, jax)
#1337 opened Nov 15, 2024 by denera Draft
3 of 13 tasks
Build with uv instead of just pip
#1324 opened Nov 8, 2024 by jennifgcrl
5 of 13 tasks
TP communication overlap: enable the overlap between GEMM chunk at Ho…
#1311 opened Nov 4, 2024 by erhoo82
1 of 13 tasks
[PyTorch] Add heuristics for initializing FP8 params (label: enhancement)
#1300 opened Oct 30, 2024 by timmoon10
8 of 13 tasks
Offloading example
#1299 opened Oct 29, 2024 by sanandaraj5597
[PyTorch] Fix autocast deprecation warnings
#1277 opened Oct 21, 2024 by yaox12
13 tasks
attention_mask fill with -inf for UnfusedDotProductAttention
#1268 opened Oct 18, 2024 by Agoniii
1 of 13 tasks
Draft: reduce cudagraph mem via preallocations
#1253 opened Oct 15, 2024 by JimmyZhang12
13 tasks