Motivation
When users have a codebase built on another inference engine (e.g. vLLM) and want to try SGLang (e.g. because SGLang may be faster for their workload), they currently have to refactor the codebase by hand, because the two frameworks expose different APIs.
One option is to add a compatibility layer on top of SGLang so that users can do a drop-in replacement. For example, suppose the existing user code is:
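```python
from vllm import LLM, SamplingParams

# Example inputs, added so the snippet is complete.
prompts = ["Hello, my name is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```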
Then the only line the user needs to change is the import:
```python
from sglang.vllm_compatible import LLM, SamplingParams
```
and everything else should stay the same.
Of course, the layer has to do more than re-export names, such as converting SamplingParams into SGLang's sampling parameter dict and converting the outputs. We will certainly run into incompatible features; for those we can simply raise an error that tells the user so, [UPDATE] or provide a simple-to-migrate API that lets users split supported and unsupported operations.
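Under the hood, such a layer could look roughly like the minimal sketch below. It is not an existing SGLang module: `StubEngine`, `engine_factory`, and the two output dataclasses are hypothetical names introduced for illustration, and the stub assumes the backend returns one dict with a `"text"` key per prompt. Sampling parameters are passed through as a plain dict here; only the output conversion and the "error on unsupported options" behaviour are shown.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class CompletionOutput:
    """Mimics the part of a vLLM completion that the user code reads: `.text`."""
    text: str


@dataclass
class RequestOutput:
    """Mimics the part of a vLLM request output that the user code reads: `.outputs`."""
    prompt: str
    outputs: List[CompletionOutput]


class StubEngine:
    """Stand-in backend so the sketch runs without a model or GPU.

    Assumption: the real backend exposes generate(prompts, params_dict)
    and returns one dict with a "text" key per prompt.
    """

    def __init__(self, model: str) -> None:
        self.model = model

    def generate(
        self, prompts: List[str], sampling_params: Dict[str, Any]
    ) -> List[Dict[str, Any]]:
        return [{"text": f"<{self.model} completion for {p!r}>"} for p in prompts]


class LLM:
    """Hypothetical vLLM-shaped facade over a backend engine."""

    def __init__(
        self,
        model: str,
        engine_factory: Callable[[str], Any] = StubEngine,
        **kwargs: Any,
    ) -> None:
        if kwargs:
            # Fail loudly instead of silently ignoring options we cannot map.
            raise NotImplementedError(f"unsupported LLM options: {sorted(kwargs)}")
        self._engine = engine_factory(model)

    def generate(
        self, prompts: List[str], sampling_params: Optional[Dict[str, Any]] = None
    ) -> List[RequestOutput]:
        # Parameter conversion is a pass-through in this sketch.
        raw = self._engine.generate(prompts, dict(sampling_params or {}))
        # Wrap each backend result so `output.outputs[0].text` keeps working.
        return [
            RequestOutput(prompt=p, outputs=[CompletionOutput(text=r["text"])])
            for p, r in zip(prompts, raw)
        ]


llm = LLM(model="facebook/opt-125m")
for output in llm.generate(["Hello"], {"temperature": 0.8}):
    print(output.outputs[0].text)
```

Injecting the engine through a factory keeps the facade testable without loading a model; a real layer would presumably default to SGLang's own engine instead of the stub.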
There seem to be precedents in other fields as well. For example, TiDB claims to be (mostly) compatible with MySQL, so users who run MySQL but face performance issues can migrate to TiDB if they want. A quick search also turns up libraries that claim to be compatible with Redis, such as Apache Kvrocks.
One very direct use case is integration with OpenRLHF (e.g. OpenRLHF/OpenRLHF#661) and verl, both of which already support vLLM but lack SGLang support.
We may also provide conversion utility functions for users who want lower-level control. For example, when verl integrates SGLang (in addition to the existing vLLM), it may want a `convert_vllm_sampling_params_to_sglang()` function, and the same conversion is needed when OpenRLHF integrates SGLang. Then, instead of implementing the logic separately in both libraries (and indeed in almost every user library), we can provide a single implementation in SGLang.
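As a rough illustration of what such a shared helper might do, here is a minimal sketch. The field tables are assumptions: the `max_tokens` to `max_new_tokens` rename and the same-name fields reflect my understanding of the two libraries and should be checked against their documentation, and the "unsupported" list is purely illustrative. A `SimpleNamespace` stands in for a vLLM `SamplingParams` object so the example runs without either library installed.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Assumed to carry the same name and meaning in both libraries.
_SAME_NAME = (
    "temperature",
    "top_p",
    "top_k",
    "presence_penalty",
    "frequency_penalty",
    "stop",
)
# Assumed renames: vLLM attribute -> SGLang dict key.
_RENAMED = {"max_tokens": "max_new_tokens"}
# Illustrative only: fields this sketch refuses to convert.
_ASSUMED_UNSUPPORTED = ("logits_processors",)


def convert_vllm_sampling_params_to_sglang(params: Any) -> Dict[str, Any]:
    """Build an SGLang-style sampling dict from a vLLM-style params object."""
    for name in _ASSUMED_UNSUPPORTED:
        if getattr(params, name, None):
            # Error out and tell the user, rather than silently dropping it.
            raise NotImplementedError(f"{name} has no SGLang equivalent in this sketch")

    converted: Dict[str, Any] = {}
    for name in _SAME_NAME:
        value = getattr(params, name, None)
        if value is not None:  # skip unset fields so backend defaults apply
            converted[name] = value
    for old_name, new_name in _RENAMED.items():
        value = getattr(params, old_name, None)
        if value is not None:
            converted[new_name] = value
    return converted


vllm_like = SimpleNamespace(temperature=0.8, top_p=0.95, max_tokens=64, stop=None)
print(convert_vllm_sampling_params_to_sglang(vllm_like))
```

Reading fields with `getattr` keeps the helper independent of the exact vLLM version, at the cost of not validating field names.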
Related resources
No response