Description
Describe the bug
There appears to be a bug when setting a custom image for the PyTorchModel object. The PyTorch estimator sets the inference image for the PyTorchModel to the same name as the training image. This is a problem because training and inference now use different images. The create_model() method should check whether the image parameter is passed in and use it instead of the training image name (image_name).
To reproduce
from sagemaker.pytorch import PyTorch

# `role` (IAM role) and `path` (local training data directory) are assumed to be defined earlier.
hyperparameters = {'epochs': 8}
estimator = PyTorch(source_dir='container/oxford-pets',
                    entry_point='oxford-pets.py',
                    role=role,
                    train_instance_count=1,
                    train_instance_type='local_gpu',
                    framework_version='1.3.1',
                    hyperparameters=hyperparameters,
                    image_name='fastai2-oxford-pets-sm-example-training')
estimator.fit('file://' + str(path))
# Deploy from the estimator (`model` was undefined at this point); `image` is forwarded to create_model().
predictor = estimator.deploy(1, 'local', image='fastai2-oxford-pets-sm-example-inference')
The code above will produce an error because it attempts to launch a local Docker container from the image fastai2-oxford-pets-sm-example-training instead of fastai2-oxford-pets-sm-example-inference.
Inspecting the PyTorch model object shows that its image is incorrectly set to fastai2-oxford-pets-sm-example-training instead of fastai2-oxford-pets-sm-example-inference.
The problematic line of code is found here. It should check whether the image parameter is passed in before setting the image on the model, instead of always assigning from image_name.
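The requested check could look roughly like the sketch below. This is not the SDK's code: resolve_inference_image is a hypothetical helper written only to illustrate the fallback logic, and it assumes the caller's image arrives as a keyword argument while the estimator's image_name holds the training image.

```python
from typing import Any, Optional


def resolve_inference_image(image_name: Optional[str], **kwargs: Any) -> Optional[str]:
    """Pick the image a model should use for inference.

    Hypothetical helper, not part of the SageMaker SDK.
    `image_name` stands in for the estimator's training image;
    `kwargs` stands in for the extra arguments given to create_model().
    """
    # Prefer an explicitly passed inference image when one is given.
    explicit_image = kwargs.get("image")
    if explicit_image:
        return explicit_image
    # Otherwise fall back to the training image, as happens today.
    return image_name


if __name__ == "__main__":
    print(resolve_inference_image(
        "fastai2-oxford-pets-sm-example-training",
        image="fastai2-oxford-pets-sm-example-inference",
    ))
```

With this logic, an explicit image wins over image_name, and callers that pass nothing keep the existing behavior.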
The only workaround is to create the model from the estimator and then override its image attribute. An example is shown below:
model = estimator.create_model(role=role,
                               entry_point='oxford-pets.py',
                               source_dir='container/oxford-pets',
                               image='fastai2-oxford-pets-sm-example-inference')
model.image = 'fastai2-oxford-pets-sm-example-inference'
predictor = model.deploy(1, 'local')
Expected behavior
The SDK should launch a container from the image named fastai2-oxford-pets-sm-example-inference.
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 1.50.9.post0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
- Framework version: 1.3.1
- Python version: 3.6
- CPU or GPU: Both
- Custom Docker image (Y/N): Y