[Bug] enable_dp_attention not worked with DeepSeek-V3 2x8 H20 benchmarking #2808

liz-badada · 2025-01-09T08:12:08Z

Checklist

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
5. Please use English, otherwise it will be closed.

Describe the bug

Hi,

We are trying to reproduce the DeepSeek-V3 deployment on 2x8 H20 GPUs as described in the example-serving-with-2-h208 section.

However, we found that --enable-dp-attention is not supported in multi-node setups.

When will this feature be supported?

Reproduction

python3 -m sglang.launch_server --model-path /workspace/DeepSeek-V3 --dist-init-addr $3:5000 --nnodes 2 --node-rank $NODERANK --trust-remote-code --tp 16 --enable-dp-attention

Environment

2x8 NVIDIA H20 GPUs


liz-badada · 2025-01-09T13:18:42Z

BTW, I would really appreciate any recommended SGLang benchmark parameters that fully utilize the H20 resources!

zhyncs assigned ispobock Jan 9, 2025
