Using the Hugging Face Accelerate or DeepSpeed engine for inference on SageMaker:
- Introductory LLM deployment notebook (Accelerate / DeepSpeed): https://github.com/aws/amazon-sagemaker-examples/blob/main/inference/generativeai/llm-workshop/lab1-deploy-llm/intro_to_llm_deployment.ipynb
- Compare with this DeepSpeed model-parallel example for GPT-J-6B: https://github.com/aws/amazon-sagemaker-examples/blob/main/advanced_functionality/pytorch_deploy_large_GPT_model/GPT-J-6B-model-parallel-inference-DJL.ipynb
- Both are packaged and served with DJL Serving: https://github.com/deepjavalibrary/djl-serving
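
As a rough illustration of the two engines referenced above (a minimal sketch, not taken from the notebooks), the snippet below loads a causal LM either sharded across devices with Accelerate's `device_map="auto"`, or wrapped in DeepSpeed's inference engine for tensor-parallel kernels. The model name, GPU count, and generation settings are placeholder assumptions.

```python
# Sketch, assuming `transformers`, `accelerate`, and (optionally) `deepspeed` are installed
# and a GPU is available. Model id and tensor-parallel degree are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"  # placeholder checkpoint; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Option 1: Hugging Face Accelerate -- shard layers across available GPUs/CPU automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires `accelerate`; places layers across devices
)

# Option 2 (alternative): DeepSpeed inference engine over the same model.
# import deepspeed
# model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
# model = deepspeed.init_inference(
#     model,
#     mp_size=2,                       # assumed tensor-parallel degree (2 GPUs)
#     dtype=torch.float16,
#     replace_with_kernel_inject=True, # swap in DeepSpeed's fused inference kernels
# )

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In the SageMaker notebooks, the same choice is expressed in DJL Serving's model configuration (e.g. which engine to use and the tensor-parallel degree) rather than in inline Python, with DJL Serving hosting the resulting handler behind an endpoint.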