Collaborator

@al-rigazzi al-rigazzi commented Jul 25, 2024

This PR adds the RequestDispatcher to the MLI. The RequestDispatcher batches inference requests together.
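
As a rough illustration of what batching inference requests can look like, here is a minimal sketch that groups incoming requests by model and releases a batch once it reaches a size limit. The names `InferenceRequest`, `SimpleBatcher`, `model_key`, and `batch_size` are hypothetical and are not taken from this PR; the actual RequestDispatcher may work quite differently.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class InferenceRequest:
    """Hypothetical request: the model to run and its input payload."""

    model_key: str
    payload: Any


@dataclass
class SimpleBatcher:
    """Groups requests per model and releases a batch once it is full.

    Illustrative only; this is not the RequestDispatcher from the PR.
    """

    batch_size: int = 4
    _pending: Dict[str, List[InferenceRequest]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def submit(self, request: InferenceRequest) -> Optional[List[InferenceRequest]]:
        """Queue a request; return a full batch if one is ready, else None."""
        queue = self._pending[request.model_key]
        queue.append(request)
        if len(queue) >= self.batch_size:
            batch = list(queue)
            queue.clear()
            return batch
        return None

    def flush(self, model_key: str) -> List[InferenceRequest]:
        """Return whatever is pending for a model, even if the batch is partial."""
        return self._pending.pop(model_key, [])
```

A production dispatcher would presumably also flush partial batches after a timeout so that a lone request is not held indefinitely; `flush` stands in for that path here.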

The implementation can be improved, especially by adding:

  • An abstraction for memory, so that Dragon's MemoryPool can be wrapped in a SmartSim class and different types of memory can be injected (especially at unit-testing time)
  • More parameters for Torch threads and intra-op threads
  • A queue removal mechanism in RequestDispatcher
  • A model removal mechanism for when an OOM error occurs while loading a model
  • Tests for RequestDispatcher, BatchQueue, DeviceManager, and so on
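
The memory abstraction in the first bullet could be sketched along these lines: code depends on a small interface, and tests inject an in-process implementation instead of a Dragon-backed one. Everything named here (`MemoryAllocator`, `InMemoryAllocator`, `stage_input`) is a hypothetical illustration; no Dragon API is shown or assumed, and a real wrapper around MemoryPool would need its own class.

```python
from typing import Protocol


class MemoryAllocator(Protocol):
    """Hypothetical interface that a Dragon-backed wrapper could also satisfy."""

    def allocate(self, size: int) -> memoryview: ...


class InMemoryAllocator:
    """Test double that hands out plain process-local buffers."""

    def __init__(self) -> None:
        self.allocated_bytes = 0

    def allocate(self, size: int) -> memoryview:
        if size <= 0:
            raise ValueError("size must be positive")
        self.allocated_bytes += size
        # A bytearray-backed memoryview is writable, unlike one over bytes.
        return memoryview(bytearray(size))


def stage_input(allocator: MemoryAllocator, data: bytes) -> memoryview:
    """Copy input data into a buffer obtained from the injected allocator."""
    buffer = allocator.allocate(len(data))
    buffer[:] = data
    return buffer
```

In a unit test, `stage_input(InMemoryAllocator(), b"abc")` exercises the staging logic without any shared-memory runtime being available.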

There is no mechanism to address model versions right now.

We can decide what to do now and what to put up a ticket for.

@al-rigazzi al-rigazzi requested review from AlyssaCote, ankona and mellis13 and removed request for ankona August 27, 2024 18:06
Contributor

@AlyssaCote AlyssaCote left a comment

LGTM! One tiny comment about potentially removing a timing line, but not worth holding up the approval!


codecov bot commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 0% with 623 lines in your changes missing coverage. Please review.

Please upload report for BASE (mli-feature@6d5518b). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...re/mli/infrastructure/control/requestdispatcher.py | 0.00% | 254 Missing ⚠️ |
| .../_core/mli/infrastructure/control/workermanager.py | 0.00% | 95 Missing ⚠️ |
| smartsim/_core/utils/timings.py | 0.00% | 86 Missing ⚠️ |
| smartsim/_core/mli/infrastructure/worker/worker.py | 0.00% | 66 Missing ⚠️ |
| ...im/_core/mli/infrastructure/worker/torch_worker.py | 0.00% | 63 Missing ⚠️ |
| .../_core/mli/infrastructure/control/devicemanager.py | 0.00% | 42 Missing ⚠️ |
| ..._core/mli/infrastructure/control/error_handling.py | 0.00% | 16 Missing ⚠️ |
| smartsim/_core/launcher/dragon/dragonBackend.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@              Coverage Diff               @@
##             mli-feature     #647   +/-   ##
==============================================
  Coverage               ?   71.34%           
==============================================
  Files                  ?      102           
  Lines                  ?     8525           
  Branches               ?        0           
==============================================
  Hits                   ?     6082           
  Misses                 ?     2443           
  Partials               ?        0           

| Files with missing lines | Coverage Δ |
|---|---|
| smartsim/_core/launcher/dragon/dragonBackend.py | 1.96% <0.00%> (ø) |
| ..._core/mli/infrastructure/control/error_handling.py | 0.00% <0.00%> (ø) |
| .../_core/mli/infrastructure/control/devicemanager.py | 0.00% <0.00%> (ø) |
| ...im/_core/mli/infrastructure/worker/torch_worker.py | 0.00% <0.00%> (ø) |
| smartsim/_core/mli/infrastructure/worker/worker.py | 0.00% <0.00%> (ø) |
| smartsim/_core/utils/timings.py | 0.00% <0.00%> (ø) |
| .../_core/mli/infrastructure/control/workermanager.py | 0.00% <0.00%> (ø) |
| ...re/mli/infrastructure/control/requestdispatcher.py | 0.00% <0.00%> (ø) |

Contributor

@mellis13 mellis13 left a comment

Fantastic first implementation of the queue-based architecture. Thanks!

@al-rigazzi al-rigazzi merged commit 5d85995 into CrayLabs:mli-feature Aug 28, 2024
@al-rigazzi al-rigazzi deleted the queue-wm branch August 28, 2024 15:48