[Bug] How to load an lmdeploy-accelerated model on a specified GPU #3091

qingchunlizhi · 2025-01-26T09:14:44Z

Checklist

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

When CUDA_VISIBLE_DEVICES is set so that the visible GPU is not GPU 0, loading the InternVL2_5-38B-MPO-AWQ model with pipeline fails with:
ValueError: At least one of the model submodule will be offloaded to disk, please pass along an offload_folder.
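While debugging, I found that some of the ViT weights are placed on the CPU, and my attempt to move them to the GPU directly with the to method failed. How can this be resolved?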

Reproduction

from lmdeploy import pipeline, TurbomindEngineConfig
pipe = pipeline(model_path, backend_config=TurbomindEngineConfig(session_len=10000))
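
The reproduction does not show how the GPU was selected. A fuller sketch of the reported setup might look like the following. The GPU index `1`, the helper names `select_gpu` and `build_pipeline`, and the model path in the usage comment are assumptions for illustration, not details from the report. This only makes the setup explicit and is not a confirmed fix.

```python
import os


def select_gpu(index: int) -> str:
    """Expose only one physical GPU to CUDA.

    This has to run before CUDA is initialized in the process, so call it
    before importing lmdeploy or torch.
    """
    value = str(index)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


def build_pipeline(model_path: str, session_len: int = 10000):
    """Create the pipeline from the report (hypothetical wrapper)."""
    # Imported lazily so that CUDA_VISIBLE_DEVICES is already set
    # when lmdeploy and its dependencies are loaded.
    from lmdeploy import pipeline, TurbomindEngineConfig

    return pipeline(
        model_path,
        backend_config=TurbomindEngineConfig(session_len=session_len),
    )


# Usage (assumed GPU index and a placeholder model path):
#   select_gpu(1)
#   pipe = build_pipeline("path/to/InternVL2_5-38B-MPO-AWQ")
```

With this setting, the selected physical GPU appears inside the process as device 0, which is worth keeping in mind when reading device placements while debugging.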

Environment

InternVL2_5-38B-MPO-AWQ

Error traceback