EIA deploy error no libcuda.so.1 #613

@austinmw

Description

I'm trying to deploy a TF Script Mode trained model with TF Serving to a CPU + Elastic Inference (EIA) endpoint, but endpoint creation fails. The logs say:

tensorflow_model_server: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory

Does TF Serving require the CUDA Toolkit? And if so, is that toolkit not installed in the official SageMaker CPU/EIA Docker images?
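
For context, here is a minimal sketch of the kind of deploy call involved. The actual estimator setup isn't shown in this issue, so the script name, IAM role variable, instance type, and accelerator type below are assumptions for illustration:

```python
from sagemaker.tensorflow import TensorFlow

# Hypothetical Script Mode estimator (SageMaker Python SDK v1 style, TF 1.12)
estimator = TensorFlow(
    entry_point='train.py',            # hypothetical training script
    role=role,                         # assumed IAM role variable
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge',
    framework_version='1.12.0',
    py_version='py3',
    script_mode=True,
)
estimator.fit(inputs)

# CPU instance plus an Elastic Inference accelerator: the endpoint should
# pull a CPU/EIA serving image, not the -gpu image reported in the logs.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',      # CPU instance (assumed)
    accelerator_type='ml.eia1.medium', # EIA accelerator (assumed)
)
```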

Edit: This now looks like a duplicate of a previous bug I reported. Working backwards, the model created by estimator.deploy lists the pulled image as the GPU version, 520713654638.dkr.ecr.us-west-2.amazonaws.com/sagemaker-tensorflow-serving:1.12.0-gpu, even though I specified a CPU instance type.
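
If the problem really is image selection in estimator.deploy, one possible workaround sketch is to build a serving Model object directly from the training artifact and deploy that instead, letting the serving Model class pick the image for the CPU/EIA configuration. Names such as role and the instance/accelerator types are assumptions, not from the issue:

```python
from sagemaker.tensorflow.serving import Model

# Wrap the trained artifact in a TF Serving Model rather than deploying
# straight from the estimator.
model = Model(
    model_data=estimator.model_data,   # S3 path to the trained model artifact
    role=role,                         # assumed IAM role variable
    framework_version='1.12.0',
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',      # CPU instance (assumed)
    accelerator_type='ml.eia1.medium', # EIA accelerator (assumed)
)
```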
