Add non-mcore fsdp2 strategy #11525
Signed-off-by: Boxiang Wang <[email protected]>
Signed-off-by: BoxiangW <[email protected]>
Signed-off-by: Boxiang Wang <[email protected]>
Signed-off-by: BoxiangW <[email protected]>
Signed-off-by: Boxiang Wang <[email protected]>
Signed-off-by: BoxiangW <[email protected]>
* Initial commit Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> --------- Signed-off-by: Piotr Kaminski <[email protected]> Signed-off-by: Laplasjan107 <[email protected]> Co-authored-by: Piotr Kaminski <[email protected]> Co-authored-by: Laplasjan107 <[email protected]>
* Make HfDatasetDataModule a datasets.load_dataset wrapper Signed-off-by: Alexandros Koumparoulis <[email protected]> * add logging Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * Update HFDatasetDataModule Signed-off-by: Alexandros Koumparoulis <[email protected]> * refactor Signed-off-by: Alexandros Koumparoulis <[email protected]> * refactor fixup Signed-off-by: Alexandros Koumparoulis <[email protected]> * refactor fixup #2 Signed-off-by: Alexandros Koumparoulis <[email protected]> * do not expand Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * doc Signed-off-by: Alexandros Koumparoulis <[email protected]> * doc Signed-off-by: Alexandros Koumparoulis <[email protected]> * add synonym Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * typo Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * Add train/val/test attributes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Add test for hf-datamodule Signed-off-by: Alexandros Koumparoulis <[email protected]> * Import lazily to avoid breaking with older megatron versions Signed-off-by: Alexandros Koumparoulis <[email protected]> * bot happy Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * bot happy2 Signed-off-by: 
Alexandros Koumparoulis <[email protected]> * add doc-strings and collate-fn arg Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]>
Signed-off-by: Oliver Koenig <[email protected]>
Signed-off-by: Oliver Koenig <[email protected]>
Signed-off-by: ashors1 <[email protected]>
Signed-off-by: Oliver Koenig <[email protected]>
Signed-off-by: Oliver Koenig <[email protected]>
* ci: Remove token from checkout Signed-off-by: Oliver Koenig <[email protected]> * bump version Signed-off-by: Oliver Koenig <[email protected]> --------- Signed-off-by: Oliver Koenig <[email protected]>
Signed-off-by: Oliver Koenig <[email protected]>
Signed-off-by: Oliver Koenig <[email protected]>
* Fix llm.deploy api Signed-off-by: Hemil Desai <[email protected]> * fix Signed-off-by: Hemil Desai <[email protected]> * fix Signed-off-by: Hemil Desai <[email protected]> * fix Signed-off-by: Hemil Desai <[email protected]> * fix Signed-off-by: Hemil Desai <[email protected]> * fix Signed-off-by: Hemil Desai <[email protected]> * Apply isort and black reformatting Signed-off-by: hemildesai <[email protected]> * PR feedback Signed-off-by: Hemil Desai <[email protected]> * fix Signed-off-by: Hemil Desai <[email protected]> --------- Signed-off-by: Hemil Desai <[email protected]> Signed-off-by: hemildesai <[email protected]> Co-authored-by: hemildesai <[email protected]>
Signed-off-by: Malay Nagda <[email protected]> Co-authored-by: oliver könig <[email protected]>
* update recipe Signed-off-by: yaoyu-33 <[email protected]> * fix mllama mock ds Signed-off-by: yaoyu-33 <[email protected]> * update to use attention bias Signed-off-by: yaoyu-33 <[email protected]> * remove example Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix docstring mock.py Signed-off-by: yaoyu-33 <[email protected]> * fix docstring language.py Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix docstring language.py Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix docstring mllama/base.py Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix docstring mllama/language.py Signed-off-by: yaoyu-33 <[email protected]> * bump mcore Signed-off-by: Oliver Koenig <[email protected]> * Add scripts for mllama Signed-off-by: yaoyu-33 <[email protected]> * fix Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update script Signed-off-by: yaoyu-33 <[email protected]> * fix pylint Signed-off-by: yaoyu-33 <[email protected]> * revert Dockerfile.ci Signed-off-by: Yu Yao <[email protected]> * add scripts Signed-off-by: yaoyu-33 <[email protected]> * add vlm training test in ci Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix docstring issues Signed-off-by: yaoyu-33 <[email protected]> * update script match recipe Signed-off-by: yaoyu-33 <[email protected]> * update recipes Signed-off-by: yaoyu-33 <[email protected]> * Update mllama_train.py Signed-off-by: Yu Yao <[email protected]> * update mllama 90b recipe Signed-off-by: 
yaoyu-33 <[email protected]> * update to use tmp in ci tests Signed-off-by: yaoyu-33 <[email protected]> * update default llava config Signed-off-by: yaoyu-33 <[email protected]> * add nemo run scripts Signed-off-by: yaoyu-33 <[email protected]> * fix vpp issue Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix cicd Signed-off-by: yaoyu-33 <[email protected]> * fix cicd Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * remove duplicated script Signed-off-by: yaoyu-33 <[email protected]> * ci: Add HF cache Signed-off-by: oliver könig <[email protected]> * update to use SP in recipe Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix Signed-off-by: yaoyu-33 <[email protected]> * upgrade Signed-off-by: yaoyu-33 <[email protected]> * Revert "upgrade" This reverts commit f6ad2cd. * update neva api Signed-off-by: yaoyu-33 <[email protected]> * update neva api Signed-off-by: yaoyu-33 <[email protected]> * fix neva processing Signed-off-by: yaoyu-33 <[email protected]> * fix lint Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix data fields Signed-off-by: yaoyu-33 <[email protected]> * few fixes Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Oliver Koenig <[email protected]> Signed-off-by: Yu Yao <[email protected]> Signed-off-by: oliver könig <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Oliver Koenig <[email protected]>
* Add from_dict method Signed-off-by: Alexandros Koumparoulis <[email protected]> * add test_load_from_dict Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * add test_load_from_dict Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]>
* prevent llama3.1 from using linear interpolation * Apply isort and black reformatting Signed-off-by: suiyoubi <[email protected]> --------- Signed-off-by: suiyoubi <[email protected]> Co-authored-by: suiyoubi <[email protected]>
Signed-off-by: Ryan <[email protected]>
* update for nest release Signed-off-by: stevehuang52 <[email protected]> * make pylint happier Signed-off-by: stevehuang52 <[email protected]> * fix for lhotse dataloader Signed-off-by: stevehuang52 <[email protected]> * update yaml Signed-off-by: stevehuang52 <[email protected]> * minor refactor Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: stevehuang52 <[email protected]>
* Port changes related to SFT text+speech dataloading Signed-off-by: Piotr Żelasko <[email protected]> * Revert changes from Canary(nonLLM) code Signed-off-by: Piotr Żelasko <[email protected]> * Add joint text/audio dataloading capability to speechllm Signed-off-by: Piotr Żelasko <[email protected]> * include text-only into fprop of training and eval; TODO: text-only predict Signed-off-by: zhehuaichen <[email protected]> * Actually working forward step Signed-off-by: Piotr Żelasko <[email protected]> * Support for source-target text file pair training for MT+speech Signed-off-by: Piotr Żelasko <[email protected]> * Include supervision text tokens in audio example's num tokens Signed-off-by: Piotr Żelasko <[email protected]> * Disable conformer seq len NCCL sync Signed-off-by: Piotr Żelasko <[email protected]> * Preliminary sampler fusion stragies support: mux/zip/round_robin/randomized_round_robin Signed-off-by: Piotr Żelasko <[email protected]> * Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together). 
Signed-off-by: Piotr Żelasko <[email protected]> * Add missing config Signed-off-by: Piotr Żelasko <[email protected]> * Revert multimodal grad accum and fix mask padding issue Signed-off-by: Piotr Żelasko <[email protected]> * Add modality weights support via cfg.model.modality_weights Signed-off-by: Piotr Żelasko <[email protected]> * Fix for V2 dataloader shuffling CRITICAL Signed-off-by: Piotr Żelasko <[email protected]> * Restore multimodal grad accum Signed-off-by: Piotr Żelasko <[email protected]> * Fix unit tests for multi-sampler configurations Signed-off-by: Piotr Żelasko <[email protected]> * Apply isort and black reformatting Signed-off-by: pzelasko <[email protected]> * nemo gemma to hf conversion (#9629) * adding script for gemma nemo to hf Signed-off-by: Krishna Puvvada <[email protected]> * adding verification for convert_gemma_nemo_to_hf Signed-off-by: Krishna Puvvada <[email protected]> * Apply isort and black reformatting Signed-off-by: krishnacpuvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: krishnacpuvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: krishnacpuvvada <[email protected]> * support FSDP (thank Yifan for early trying) (#10062) Note: as of now, this is still not fully working on the cluster. See above doc for details. 
Signed-off-by: zhehuaichen <[email protected]> * Fix unit tests after rebasing on recent main Signed-off-by: Piotr Żelasko <[email protected]> * support megatron_amp_O2 and tp (#10599) * Port changes related to SFT text+speech dataloading Signed-off-by: Piotr Żelasko <[email protected]> * Revert changes from Canary(nonLLM) code Signed-off-by: Piotr Żelasko <[email protected]> * Add joint text/audio dataloading capability to speechllm Signed-off-by: Piotr Żelasko <[email protected]> * include text-only into fprop of training and eval; TODO: text-only predict Signed-off-by: zhehuaichen <[email protected]> * Actually working forward step Signed-off-by: Piotr Żelasko <[email protected]> * Support for source-target text file pair training for MT+speech Signed-off-by: Piotr Żelasko <[email protected]> * Include supervision text tokens in audio example's num tokens Signed-off-by: Piotr Żelasko <[email protected]> * Disable conformer seq len NCCL sync Signed-off-by: Piotr Żelasko <[email protected]> * Preliminary sampler fusion stragies support: mux/zip/round_robin/randomized_round_robin Signed-off-by: Piotr Żelasko <[email protected]> * Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together). 
Signed-off-by: Piotr Żelasko <[email protected]> * Add missing config Signed-off-by: Piotr Żelasko <[email protected]> * Revert multimodal grad accum and fix mask padding issue Signed-off-by: Piotr Żelasko <[email protected]> * Add modality weights support via cfg.model.modality_weights Signed-off-by: Piotr Żelasko <[email protected]> * Fix for V2 dataloader shuffling CRITICAL Signed-off-by: Piotr Żelasko <[email protected]> * Restore multimodal grad accum Signed-off-by: Piotr Żelasko <[email protected]> * Fix unit tests for multi-sampler configurations Signed-off-by: Piotr Żelasko <[email protected]> * Apply isort and black reformatting Signed-off-by: pzelasko <[email protected]> * nemo gemma to hf conversion (#9629) * adding script for gemma nemo to hf Signed-off-by: Krishna Puvvada <[email protected]> * adding verification for convert_gemma_nemo_to_hf Signed-off-by: Krishna Puvvada <[email protected]> * Apply isort and black reformatting Signed-off-by: krishnacpuvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: krishnacpuvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: krishnacpuvvada <[email protected]> * support FSDP (thank Yifan for early trying) Signed-off-by: zhehuaichen <[email protected]> * debug TP deadlock Signed-off-by: zhehuaichen <[email protected]> * some fixes for fsdp and tp /lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs2048_mbs16_ep200/error-1417621-0.out /lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_tp_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs128_mbs16_ep200/error-1421103-3.out Signed-off-by: zhehuaichen <[email protected]> * nit fix 
Signed-off-by: zhehuaichen <[email protected]> * fix for llama3.1 Signed-off-by: zhehuaichen <[email protected]> * for llama3.1 Signed-off-by: zhehuaichen <[email protected]> * fix for inference Signed-off-by: zhehuaichen <[email protected]> * fix inference Signed-off-by: zhehuaichen <[email protected]> * fix grad accu Signed-off-by: zhehuaichen <[email protected]> * fix inference Signed-off-by: zhehuaichen <[email protected]> * initial impl to support megatron_amp_O2 in salm, bestow, salm-t5 Signed-off-by: zhehuaichen <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: zhehuaichen <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: pzelasko <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: krishnacpuvvada <[email protected]> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: pzelasko <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: krishnacpuvvada <[email protected]> * minor change in dataloader (#10601) * Speechllm dataset basic unit test (#10631) * Basic unit test for speechllm lhotse dataset Signed-off-by: Piotr Żelasko <[email protected]> * cleanup Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Unit test for existing speechllm dataset with llama2 prompt format (#10634) Signed-off-by: Piotr Żelasko <[email protected]> * [speechllm] Replace TextProcessing with PromptFormatter (#10639) * [speechllm] Replace TextProcessing with PromptFormatter Signed-off-by: Piotr Żelasko <[email protected]> * Test for tokens_to_generate Signed-off-by: Piotr Żelasko <[email protected]> * Padding optimization for speechlm dataset Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Multimodal 
conversation format dataloading (#10683) * Draft implementation of NeMo Multimodal Conversation format Signed-off-by: Piotr Żelasko <[email protected]> * Fully working data parsing and iteration Signed-off-by: Piotr Żelasko <[email protected]> * Fully working dataloading with tokenization + prompting Signed-off-by: Piotr Żelasko <[email protected]> * Collapse consecutive user turns into single turn Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * a few fixes for the new prompt template based dataloader and lora+distributed fused adam (#10701) * Draft implementation of NeMo Multimodal Conversation format Signed-off-by: Piotr Żelasko <[email protected]> * Fully working data parsing and iteration Signed-off-by: Piotr Żelasko <[email protected]> * Fully working dataloading with tokenization + prompting Signed-off-by: Piotr Żelasko <[email protected]> * Collapse consecutive user turns into single turn Signed-off-by: Piotr Żelasko <[email protected]> * compatible with previous expts Signed-off-by: zhehuaichen <[email protected]> * support gemma Signed-off-by: zhehuaichen <[email protected]> * handle the case max_seq_length is smaller than input_id length Signed-off-by: zhehuaichen <[email protected]> * fix max seq case Signed-off-by: zhehuaichen <[email protected]> * fix lora ckpt storing and loading Signed-off-by: zhehuaichen <[email protected]> * temp fix for distributed fused adam Signed-off-by: zhehuaichen <[email protected]> * revert changes in nemo_adapters.py Signed-off-by: zhehuaichen <[email protected]> * Fix tokenize_with_prompt Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: zhehuaichen <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: zhehuaichen <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: Piotr Żelasko <[email protected]> * Mechanism to insert BOS/EOS at the beginning/end of dialog (#10923) * 
Mechanism to insert BOS/EOS at the beginning/end of dialog Signed-off-by: Piotr Żelasko <[email protected]> * Fix Gemma prompt formatter test Signed-off-by: Piotr Żelasko <[email protected]> * Add a test specifically for multiturn insertion of bos/eos Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Add options to override default map/iterable dataset style selection in lhotse dataloader Signed-off-by: Piotr Żelasko <[email protected]> * Feature/conversations tarred (#11086) * Multimodal conversation tarring script Signed-off-by: Piotr Żelasko <[email protected]> * Fix sharding logic Signed-off-by: Piotr Żelasko <[email protected]> * Fix dir creation Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * EMMeTT support in SpeechLLM + tutorial for Lhotse Multimodal Dataloading (#10927) * Preliminary support for oomptimizer Signed-off-by: Piotr Żelasko <[email protected]> * OOMptimizer for SpeechLLM Signed-off-by: Piotr Żelasko <[email protected]> * Initial version of estimate token bins script Signed-off-by: Piotr Żelasko <[email protected]> * Initial support for multimodal 2d bucketing Signed-off-by: Piotr Żelasko <[email protected]> * Extend to text-to-text oomptimizer Signed-off-by: Piotr Żelasko <[email protected]> * Preliminary support for Llama2 prompt format in ast+mt Signed-off-by: Piotr Żelasko <[email protected]> * Support for 1D estimate token bins Signed-off-by: Piotr Żelasko <[email protected]> * Support for 1D estimate token bins Signed-off-by: Piotr Żelasko <[email protected]> * Fix Signed-off-by: Piotr Żelasko <[email protected]> * Fix Signed-off-by: Piotr Żelasko <[email protected]> * Minor tweaks Signed-off-by: Piotr Żelasko <[email protected]> * Add min/max tokens filter Signed-off-by: Piotr Żelasko <[email protected]> * Change to bisect_left for bucket idx selection Signed-off-by: Piotr Żelasko <[email protected]> * Add 
reconfigure_num_microbatches_calculator at the start of train epoch for modular models Signed-off-by: Piotr Żelasko <[email protected]> * Update lhotse multi-sampler config and make validation datasets finite Signed-off-by: Piotr Żelasko <[email protected]> * Initial implementation of text+audio training for T5 modular models Signed-off-by: Piotr Żelasko <[email protected]> * megatron t5 nmt prompt formatter Signed-off-by: Piotr Żelasko <[email protected]> * Fixes for MT+AST T5 oomptimizer and training Signed-off-by: Piotr Żelasko <[email protected]> * configs, fixes, token-per-token filtering * Support text modality in predict_step Signed-off-by: Piotr Żelasko <[email protected]> * Support text data in val/test dl Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix infinite Signed-off-by: Piotr Żelasko <[email protected]> * prompt format fixes Signed-off-by: Piotr Żelasko <[email protected]> * Fixes in audio supervision Signed-off-by: Piotr Żelasko <[email protected]> * remove superficial padding Signed-off-by: Piotr Żelasko <[email protected]> * test config and prompt context fetching fixes Signed-off-by: Piotr Żelasko <[email protected]> * support text-only decoding for salm/bestow Signed-off-by: Piotr Żelasko <[email protected]> * Add unit tests for EMMETT / refactor prompt_format_fn Signed-off-by: Piotr Żelasko <[email protected]> * make t5nmt prompt formatter auto discoverable Signed-off-by: Piotr Żelasko <[email protected]> * include token count / tpt filtering in estimate_token_bins Signed-off-by: Piotr Żelasko <[email 
protected]> * fix max token filter Signed-off-by: Piotr Żelasko <[email protected]> * some fixes Signed-off-by: Piotr Żelasko <[email protected]> * custom mixin for text adapters Signed-off-by: Piotr Żelasko <[email protected]> * Warmup in oomptimizer-speechlm Signed-off-by: Piotr Żelasko <[email protected]> * Move oomptimizer-speechllm to separate directory Signed-off-by: Piotr Żelasko <[email protected]> * Initial cleanup Signed-off-by: Piotr Żelasko <[email protected]> * Refactoring of prompt format fn and length measurement and filtering for data types; improved unit test coverage Signed-off-by: Piotr Żelasko <[email protected]> * Refactor sampler constraints / filters into sampling.py Signed-off-by: Piotr Żelasko <[email protected]> * Tests and support for sampler length measurement of multimodal conversations Signed-off-by: Piotr Żelasko <[email protected]> * Update estimate_token_bins.py Signed-off-by: Piotr Żelasko <[email protected]> * Move estimate_token_bins.py to speech_llm scripts Signed-off-by: Piotr Żelasko <[email protected]> * Minor tweaks Signed-off-by: Piotr Żelasko <[email protected]> * Fixes for SpeechLLM dataset Signed-off-by: Piotr Żelasko <[email protected]> * Apply isort and black reformatting Signed-off-by: pzelasko <[email protected]> * Add missing emmett tests Signed-off-by: Piotr Żelasko <[email protected]> * Add tutorial about multimodal lhotse dataloading Signed-off-by: Piotr Żelasko <[email protected]> * Updated documentation for multimodal dataloading Signed-off-by: Piotr Żelasko <[email protected]> * Prompt Formatter tutorial Signed-off-by: Piotr Żelasko <[email protected]> * Review comments Signed-off-by: Piotr Żelasko <[email protected]> * Fixes for sampling filters None values Signed-off-by: Piotr Żelasko <[email protected]> * Changes requested by Steve: moving some args to main config namespace in multi config sampler Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * 
Update default configs to the modified config schema Signed-off-by: Piotr Żelasko <[email protected]> * Fix omegaconf use issue Signed-off-by: Piotr Żelasko <[email protected]> * Update the docs to the modified multi config format Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: pzelasko <[email protected]> Co-authored-by: pzelasko <[email protected]> * Remove old TODO comments Signed-off-by: Piotr Żelasko <[email protected]> * Remove prompts/fn.py Signed-off-by: Piotr Żelasko <[email protected]> * Copyright notices Signed-off-by: Piotr Żelasko <[email protected]> * Make linter happy Signed-off-by: Piotr Żelasko <[email protected]> * Make linter happy Signed-off-by: Piotr Żelasko <[email protected]> * Fix megatron test Signed-off-by: Piotr Żelasko <[email protected]> * Fix megatron test Signed-off-by: Piotr Żelasko <[email protected]> * Disable plugin for high entropy strings in secrets detector Signed-off-by: Piotr Żelasko <[email protected]> * Fix CodeQL errors Signed-off-by: Piotr Żelasko <[email protected]> * fix unit tests Signed-off-by: Piotr Żelasko <[email protected]> * fix another unit test Signed-off-by: Piotr Żelasko <[email protected]> * Fix multimodal tests Signed-off-by: Piotr Żelasko <[email protected]> * Apply isort and black reformatting Signed-off-by: pzelasko <[email protected]> * fixes after merging canary2 pr to main Signed-off-by: Piotr Żelasko <[email protected]> * fix headers Signed-off-by: Piotr Żelasko <[email protected]> * fix canary integration test + formatting Signed-off-by: Piotr Żelasko <[email protected]> * Address reviews - add sync_max_audio_length flag for conformer encoder Signed-off-by: Piotr Żelasko <[email protected]> * Revert change in secrets detector Signed-off-by: Piotr Żelasko <[email protected]> * Revert change in secrets detector Signed-off-by: Piotr Żelasko <[email protected]> * Revert change in 
secrets detector Signed-off-by: Piotr Żelasko <[email protected]> * Address code review Signed-off-by: Piotr Żelasko <[email protected]> * Address Steve's review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: zhehuaichen <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: pzelasko <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: krishnacpuvvada <[email protected]> Co-authored-by: zhehuaichen <[email protected]> Co-authored-by: pzelasko <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: krishnacpuvvada <[email protected]> Co-authored-by: zhehuaichen <[email protected]>
* Sync validation metrics for ASRModel Signed-off-by: Piotr Żelasko <[email protected]> * support sync for single-dataloader case Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]>
* nemo 2 support Signed-off-by: Onur Yilmaz <[email protected]> * Remove unwanted params in DDP init in Megatron Parallel Signed-off-by: Hemil Desai <[email protected]> * nemo2 working with query Signed-off-by: Onur Yilmaz <[email protected]> * Apply isort and black reformatting Signed-off-by: oyilmaz-nvidia <[email protected]> * multigpu deployment with nemo2 works Signed-off-by: Onur Yilmaz <[email protected]> * Apply isort and black reformatting Signed-off-by: oyilmaz-nvidia <[email protected]> * add max output lenght Signed-off-by: Onur Yilmaz <[email protected]> * Remove prints Signed-off-by: Onur Yilmaz <[email protected]> * Fix merge conflicts Signed-off-by: Onur Yilmaz <[email protected]> * readded this file Signed-off-by: Onur Yilmaz <[email protected]> --------- Signed-off-by: Onur Yilmaz <[email protected]> Signed-off-by: Hemil Desai <[email protected]> Signed-off-by: oyilmaz-nvidia <[email protected]> Co-authored-by: Hemil Desai <[email protected]> Co-authored-by: oyilmaz-nvidia <[email protected]>
* Add SFT/PEFT HF tests Signed-off-by: Alexandros Koumparoulis <[email protected]> * move hf examples to examples dir Signed-off-by: Alexandros Koumparoulis <[email protected]> * bot Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * use mini_squad Signed-off-by: Alexandros Koumparoulis <[email protected]> * use mini_squad Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * add 2gpu DDP Signed-off-by: Alexandros Koumparoulis <[email protected]> * refactor Signed-off-by: Alexandros Koumparoulis <[email protected]> * use labels as passed by the user Signed-off-by: Alexandros Koumparoulis <[email protected]> * update samples/ tests Signed-off-by: Alexandros Koumparoulis <[email protected]> * rm unused imports Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * Add tests with subset split names, e.g. train[:100] Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * add --disable-ckpt Signed-off-by: Alexandros Koumparoulis <[email protected]> * use self-hosted-azure-gpus-1 for single-gpu test Signed-off-by: Alexandros Koumparoulis <[email protected]> * Add TRANSFORMERS_OFFLINE=1 to hf tests Signed-off-by: Alexandros Koumparoulis <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]>
Signed-off-by: BoxiangW <[email protected]>
beep boop 🤖: 🚨 The following files must be fixed before merge! Your code was analyzed with PyLint. The following annotations have been identified:
Mitigation guide:
By applying these rules, we reduce the occurrence of this message in the future. Thank you for improving NeMo's documentation!
[🤖]: Hi @BoxiangW 👋, we wanted to let you know that a CICD pipeline for this PR just finished successfully, so it might be time to merge this PR or get some approvals. I'm just a bot, so I'll leave it up to you what to do next. //cc @pablo-garay @ko3n1g
Please add the tests to the final step:)
Signed-off-by: Alexandros Koumparoulis <[email protected]>
beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base. Your code was analyzed with PyLint. The following annotations have been identified:
Mitigation guide:
By applying these rules, we reduce the occurrence of this message in the future. Thank you for improving NeMo's documentation!
beep boop 🤖: 🚨 The following files must be fixed before merge! Your code was analyzed with PyLint. The following annotations have been identified:
Mitigation guide:
By applying these rules, we reduce the occurrence of this message in the future. Thank you for improving NeMo's documentation!
What does this PR do?
Add a one line overview of what this PR aims to accomplish.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
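The label-based re-run described above could be scripted roughly as follows. This is only a sketch: it assumes the GitHub CLI (`gh`) is installed and authenticated, that it is run from a checkout of the repository, and that the caller is allowed to edit labels on the PR. The helper name `rerun_ci_commands` is hypothetical and not part of NeMo; it only builds the two commands and does not execute them.

```python
from __future__ import annotations

import shlex

# Label name as stated in the CI notes above.
CI_LABEL = "Run CICD"


def rerun_ci_commands(pr_number: int, label: str = CI_LABEL) -> list[list[str]]:
    """Build the two `gh` invocations that remove and then re-add the CI label.

    Hypothetical helper: it returns argument lists and runs nothing itself.
    """
    pr = str(pr_number)
    return [
        ["gh", "pr", "edit", pr, "--remove-label", label],
        ["gh", "pr", "edit", pr, "--add-label", label],
    ]


if __name__ == "__main__":
    # Print shell-quoted commands for this PR so they can be reviewed first.
    for cmd in rerun_ci_commands(11525):
        print(shlex.join(cmd))
```

Printing the commands instead of running them keeps the sketch side-effect free; passing each list to `subprocess.run(cmd, check=True)` would apply the label change.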
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information