Describe the bug
Good morning. After updating the sagemaker SDK, I get the following error when trying to create a pipeline execution:
...Traceback Messages...
File "blah/blah", line xxxx, in get_estimator
estimator = Estimator(
File "/var/task/sagemaker/estimator.py", line 2896, in __init__
super(Estimator, self).__init__(
File "/var/task/sagemaker/estimator.py", line 622, in __init__
raise ValueError(f"Bad value for instance type: '{instance_type}'")
File "/var/task/sagemaker/workflow/entities.py", line 86, in __str__
raise TypeError(
TypeError: Pipeline variables do not support __str__ operation. Please use `.to_string()` to convert it to string type in execution timeor use `.expr` to translate it to Json for display purpose in Python SDK.
Firstly, the suggestion in the TypeError doesn't fit my case: I don't want to call `.to_string()` on the pipeline variable right away, because I don't yet know which instance type I will need. I want to assign an instance_type dynamically after estimating the resources required to run the job; that resource estimation is itself part of our pipeline.
To reproduce
Try to create an Estimator with instance_type set to a PipelineVariable (e.g. a ParameterString) on the recent release sagemaker==2.171.0; a minimal sketch follows.
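A minimal sketch of the failing call, assuming the instance type comes from a ParameterString pipeline parameter; the image URI and role below are placeholders:

```python
from sagemaker.estimator import Estimator
from sagemaker.workflow.parameters import ParameterString

# Instance type to be decided at execution time.
instance_type = ParameterString(
    name="TrainingInstanceType", default_value="ml.m5.xlarge"
)

# On sagemaker==2.171.0 this raises the TypeError above: the instance-type
# validation in Estimator.__init__ formats the value into an error string,
# which calls __str__ on the PipelineVariable.
estimator = Estimator(
    image_uri="<training-image-uri>",              # placeholder
    role="arn:aws:iam::123456789012:role/MyRole",  # placeholder
    instance_count=1,
    instance_type=instance_type,
)
```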
Expected behavior
Before the recent changes I was able to assign an instance type dynamically. Now I have to resort to a workaround, such as a ConditionStep that uses JsonGet to read the recommended type and branch to a step with a fixed instance_type (roughly sketched below), or something else entirely; if you could suggest an alternative I would appreciate it.
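For reference, this is roughly what that ConditionStep/JsonGet workaround looks like. The step name, property file, and the two pre-built TrainingStep objects (`train_small`, `train_large`) are hypothetical placeholders for steps defined elsewhere with fixed instance types:

```python
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.conditions import ConditionEquals
from sagemaker.workflow.functions import JsonGet
from sagemaker.workflow.properties import PropertyFile

# Property file written by a hypothetical "EstimateResources" processing step.
resource_report = PropertyFile(
    name="ResourceReport", output_name="report", path="report.json"
)

# Read the recommended instance type out of that report at execution time.
recommended = JsonGet(
    step_name="EstimateResources",
    property_file=resource_report,
    json_path="recommended_instance_type",
)

# Branch to a training step whose Estimator was built with a fixed
# instance_type, instead of passing the pipeline variable to one Estimator.
choose_instance = ConditionStep(
    name="ChooseInstanceType",
    conditions=[ConditionEquals(left=recommended, right="ml.m5.xlarge")],
    if_steps=[train_small],    # TrainingStep with instance_type="ml.m5.xlarge"
    else_steps=[train_large],  # TrainingStep with a larger fixed instance type
)
```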
Is this expected behavior? Is the intent that each step must have its instance_type fixed when the pipeline execution starts, or is there another reason? Being able to assign instance_type dynamically matters to us because we want to estimate the resources for a training job accurately rather than overprovision or guess too early; some of our input files have high compression ratios, so we cannot accurately gauge their size up front.