-
Notifications
You must be signed in to change notification settings - Fork 352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable FP8 in Mcore integration test on older GPUs #1357
Conversation
Avoid FP8 on Ampere and older. Generate synthetic data instead of depending on external data. Signed-off-by: Tim Moon <[email protected]>
/te-ci pytorch L1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor nitpicks/questions. Ow, lgtm
if [[ ${DEVICE_ARCH} -ge 89 ]]; then | ||
WITH_FP8=1 | ||
fi | ||
|
||
# Download Megatron-LM if needed | ||
if [ ! -d "${MCORE_PATH}" ]; then | ||
pushd $(dirname ${MCORE_PATH}) | ||
git clone -b core_r0.9.0 https://github.com/NVIDIA/Megatron-LM.git Megatron-LM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess then we have to manually update this branch. Is this okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's an argument for just downloading the Megatron-LM main
branch, but that is orthogonal to this PR.
printf "{" >> ${VOCAB_FILE} | ||
printf "\"<|endoftext|>\": 0" >> ${VOCAB_FILE} | ||
seq 1 4095 | awk '{ printf(", \"%d\": %d", $1, $1) }' >> ${VOCAB_FILE} | ||
printf "}" >> ${VOCAB_FILE} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that's enough to generate a vocab file!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat specific to the Mcore mock GPT dataset: https://github.com/NVIDIA/Megatron-LM/blob/bd677bfb13ac2f19deaa927adc6da6f9201d66aa/megatron/core/datasets/gpt_dataset.py#L693
The model figures out the number of embeddings based on the vocab size. However, the mock dataset just generates indices with arange
(up to a max sequence length of 4096), so the vocab can be junk.
--vocab-file /data/gpt3/pile-cc1-cc2-shuf/bpe/gpt2-vocab.json | ||
--merge-file /data/gpt3/pile-cc1-cc2-shuf/bpe/gpt2-merges.txt | ||
--vocab-file ${VOCAB_FILE} | ||
--merge-file ${TE_PATH}/qa/L1_pytorch_mcore_integration/merges.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's the purpose of this file, but is it okay for it to only have version info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm treating merges.txt
as a magic config for the tokenizer: huggingface/transformers#1083 (comment)
Description
We have experienced test failures in the Mcore integration tests on older GPUs, so this PR checks the GPU arch before enabling FP8. It also generates synthetic data so it can be run without needing to access a specific data path.
Type of change
Changes
Checklist: