[Bugfix] [Easy] Fixed a bug in the multiprocessing GPU executor. #6770
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge). To run full CI, you can do one of these: … 🚀
Hi, why would this be a problem? When would this code be called from another thread?
We are using …
I looked into this a bit more and the main issue is that we have a FastAPI service and we are initializing the vLLM engine in a separate thread so that the API can still respond with something like "Model is being loaded" while the model loads, which can take a while for some models. It looks something like this:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

loop = asyncio.get_event_loop()
loop.set_default_executor(ThreadPoolExecutor())
model_download_task = loop.run_in_executor(None, load_model)
model_download_task.add_done_callback(done_callback)
```

where …
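For context on why this only surfaces when the engine is constructed off the main thread: CPython only allows `signal.signal()` to be called from the main thread of the main interpreter. A minimal sketch (illustrative, not taken from vLLM) that reproduces the same class of error inside an executor worker:

```python
import signal
from concurrent.futures import ThreadPoolExecutor

def install_handler() -> None:
    # CPython rejects signal handler registration outside the main thread.
    signal.signal(signal.SIGTERM, lambda signum, frame: None)

with ThreadPoolExecutor() as pool:
    future = pool.submit(install_handler)
    try:
        future.result()
    except ValueError as exc:
        # Typically: "signal only works in main thread of the main interpreter"
        print(f"failed as expected: {exc}")
```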
@youkaichao I believe that the change introduced in this PR should be sufficient to address this use case and innocuous for other use cases, but I'm not sure if there are consequences I'm not aware of.
Got it. So you are indeed calling it from another thread. This is not the common use case. I can accept this change, but since it is not common usage, we will not make a release just for it. It can be in the next release.
That is understandable, thank you! Do you have an estimated timeline for the next release?
We are usually on a bi-weekly release cadence.
youkaichao left a comment
LGTM!
Sounds good, thank you!
This was broken in the `0.5.3` release when these signal calls were introduced. It results in the following error when deploying on our machines: … The fix is borrowed from here.

It would be great if you could cut a `0.5.3.post2` release after this is merged to unblock us (and I assume others as well) from using vLLM with Llama 3.1. Thank you!
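For reference, this is the general shape of the guard being described: only register signal handlers when running in the main thread, and skip registration otherwise. This is a sketch of the pattern, not the exact diff; `shutdown_handler` is a hypothetical callback, and the specific signals used by the vLLM executor may differ.

```python
import signal
import threading

def maybe_register_signal_handlers(shutdown_handler) -> None:
    # signal.signal() may only be called from the main thread of the main
    # interpreter, so skip handler registration when the executor is
    # constructed inside a worker thread (e.g. via loop.run_in_executor()).
    if threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGINT, shutdown_handler)
        signal.signal(signal.SIGTERM, shutdown_handler)
```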