Batching crashtest #310
b60dfe7 to 918b81b
Thank you @hhsecond!
I'm wondering why the non-GPU build is not crashing. Can you confirm this is the case?
918b81b to e758f34
Confirming that it's crashing on CPU. The tests are probably misbehaving because of timing. I have added a thread join to synchronize the execution flow.
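A minimal sketch of what a thread join buys in a test like this, assuming the blocking command runs in a worker thread. The names `run_model` and `results` are hypothetical stand-ins and are not taken from the PR's test code.

```python
import threading


def run_model(results, key):
    # Hypothetical stand-in for a blocking model-run command sent to the server.
    results[key] = "done"


results = {}
worker = threading.Thread(target=run_model, args=(results, "out"))
worker.start()

# join() blocks until the worker has finished, so the check below cannot
# run before the worker has written its result.
worker.join()

assert results["out"] == "done"
```

Without the `join()`, the assertion could run before the worker has stored anything, which makes the outcome depend on timing.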
Codecov Report
@@            Coverage Diff             @@
##           batching    #310     +/-   ##
==========================================
+ Coverage     55.52%   55.79%   +0.26%
==========================================
  Files            25       25
  Lines          5021     5022       +1
==========================================
+ Hits           2788     2802      +14
+ Misses         2233     2220      -13
Ok, that's just weird. All the tests are passing now.
e758f34 to 21883bd
Ok, so if you take a look at the logs, the server is still crashing. But the test conditions are met because the test fetches the result from a key that was filled by a previous operation. I have changed the test, so it should fail now. One thing still puzzles me: how did the crashed server auto-heal for the rest of the test cases in the pipeline? Would you have any idea, @lantiga?
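The false pass described above can be reproduced with a plain dictionary standing in for the server's key space. This is only an illustration under that assumption: `crashing_run`, `fetch_result`, and the key names are hypothetical and do not come from the PR.

```python
def crashing_run(store, key):
    # Hypothetical stand-in for a model run during which the server dies:
    # nothing is ever written to `key`.
    raise ConnectionError("server went away")


def fetch_result(store, key):
    return store.get(key)


def check_reusing_key(store):
    # Reuses a key that an earlier operation already filled.
    try:
        crashing_run(store, "result")
    except ConnectionError:
        pass
    return fetch_result(store, "result") is not None


def check_fresh_key(store):
    # Uses a key that nothing else has written to.
    try:
        crashing_run(store, "result_crashtest")
    except ConnectionError:
        pass
    return fetch_result(store, "result_crashtest") is not None


store = {"result": [2.0, 3.0]}  # left over from a previous operation

print(check_reusing_key(store))  # the check passes despite the crash
print(check_fresh_key(store))    # the check fails, exposing the crash
```

Reading from a key that only the operation under test can fill is one way to make the check fail when the run never completes.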
@hhsecond I think I got it: Note: I had to port the test to multiprocessing so I can now kill the pending
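A hedged sketch of why multiprocessing helps here: Python threads cannot be stopped from the outside, whereas a `multiprocessing.Process` can be ended with `terminate()`. The function `pending_command` is a hypothetical stand-in for a client call that never returns, and the example assumes a POSIX system where the `fork` start method is available.

```python
import multiprocessing
import time


def pending_command():
    # Hypothetical stand-in for a client call that blocks indefinitely
    # because the server never answers.
    time.sleep(60)


def run_and_terminate(wait=0.2):
    # Assumption: POSIX with the "fork" start method available.
    ctx = multiprocessing.get_context("fork")
    proc = ctx.Process(target=pending_command)
    proc.start()
    time.sleep(wait)          # give the child time to start blocking
    was_alive = proc.is_alive()
    proc.terminate()          # sends SIGTERM to the child on POSIX
    proc.join(timeout=5)      # reap the child so it does not linger
    return was_alive, proc.is_alive(), proc.exitcode


if __name__ == "__main__":
    print(run_and_terminate())
```

On POSIX, a child ended by a signal reports a negative `exitcode`, which a test can use to confirm that the pending call was actually stopped and did not finish on its own.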
Add support for batching (take two) (#270)

* Add support for automated batching
  Add support for inspection and eviction to queue
  Mock run info batching
  Mock run info batching
  Make TF tests work
  Add batching for ONNX and ONNX-ML
  Fix torch API, still WIP
  Fix torch backend
  Fixes after rebasing
  Add auto-batching to TFLite backend
  Fix from rebase
  Add batching args to command and change API accordingly
  Add batching heuristics [WIP]
  Fix TFLite test by accessing first tensor in first batch safely
  Temporarily comment out wrong_bg test check
  Implement batching heuristics
  Introduce autobatch tests, tflite still fails
  Fix segfault when error was generated from the backend
  Fix tflite autobatch test
  Updated documentation with auto batching
  Remove stale comments
  Avoid making extra copies of inputs and outputs when batch count is 1
  Address review comments re const-correctness
  Add tests to detect failures
  Fix slicing and concatenation
  Fix tensor slicing and concatenating
  Temporarily disable tflite autobatch test due to tflite limitation
  Disable support for autobatching for TFLITE
* Fix TFLite and tests after rebase
* Temporarily disable macos CI build
* Add synchronization to autobatch tests
* Add synchronization to autobatch thread
* Add synchronization to autobatch thread
* Batching crashtest (#310)
  * test cases for crash test
  * Fix issue with evict. Port test to multiprocessing to allow killing pending command.
  * Use terminate instead of kill

Co-authored-by: Luca Antiga
Co-authored-by: Sherin Thomas
Test case for testing the batching crash