s3_input with "compression='Gzip', input_mode='Pipe'" fails with ValidationError #716

@andremoeller

System Information

  • Python Version: 3.6
  • Python SDK Version: 1.18

Describe the problem

I'm trying to fit an Estimator on Gzip-compressed data in Pipe mode. I haven't set input_mode on the Estimator, but I have set it on the s3_input, which, according to the s3_input docstring, should override the Estimator's input_mode:

input_mode (str): Optional override for this channel's input mode (default: None). By default, channels will
use the input mode defined on ``sagemaker.estimator.EstimatorBase.input_mode``, but they will ignore
that setting if this parameter is set.
* None - Amazon SageMaker will use the input mode specified in the ``Estimator``.
* 'File' - Amazon SageMaker copies the training dataset from the S3 location to a local directory.
* 'Pipe' - Amazon SageMaker streams data directly from S3 to the container via a Unix-named pipe.
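To make the expected precedence concrete, here is a minimal sketch of the rule the docstring describes. The helper name effective_input_mode is hypothetical and is not part of the SDK; the 'File' default is assumed to mirror the Estimator's default input_mode.

```python
from typing import Optional

VALID_INPUT_MODES = ("File", "Pipe")


def effective_input_mode(channel_mode: Optional[str],
                         estimator_mode: str = "File") -> str:
    """Hypothetical helper (not an SDK function).

    Returns the input mode a channel should end up with according to the
    docstring: the channel-level value wins whenever it is set, otherwise
    the Estimator-level value applies.
    """
    mode = estimator_mode if channel_mode is None else channel_mode
    if mode not in VALID_INPUT_MODES:
        raise ValueError("Unsupported input mode: {!r}".format(mode))
    return mode


if __name__ == "__main__":
    # Channel override set, Estimator left at its assumed default.
    print(effective_input_mode("Pipe"))
    # No channel override: fall back to the Estimator's mode.
    print(effective_input_mode(None, "File"))
```

Under this reading, the channel in this report should be treated as Pipe even though the Estimator is left at its default.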

My s3_input is:

training_s3_input = s3_input('s3://my_training_data', compression='Gzip', input_mode='Pipe', shuffle_config=ShuffleConfig(1))

Calling fit on the Estimator returns this ValidationException, even though I specified Pipe, not File:

An error occurred (ValidationException) when calling the CreateTrainingJob operation: Invalid compression type for channel training: File mode only supports NONE, got Gzip instead

Setting input_mode='Pipe' directly on the Estimator works as expected.
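The sketch below models the observed behaviour. It is an illustration under an assumption, not a diagnosis: it assumes the compression check is applied per channel against whichever mode is considered effective for that channel. The functions build_request and check_compression are hypothetical; only the field names (TrainingInputMode, InputMode, CompressionType, ChannelName) follow the CreateTrainingJob request shape.

```python
from typing import Optional


def build_request(job_mode: str, channel_mode: Optional[str] = None) -> dict:
    """Hypothetical builder for a pared-down CreateTrainingJob request."""
    channel = {"ChannelName": "training", "CompressionType": "Gzip"}
    if channel_mode is not None:
        # Channel-level override, as set through s3_input(input_mode=...).
        channel["InputMode"] = channel_mode
    return {
        "AlgorithmSpecification": {"TrainingInputMode": job_mode},
        "InputDataConfig": [channel],
    }


def check_compression(request: dict) -> None:
    """Hypothetical model of the validation: File mode rejects compression."""
    job_mode = request["AlgorithmSpecification"]["TrainingInputMode"]
    for channel in request["InputDataConfig"]:
        # Assumption: a channel without InputMode inherits the job-level mode.
        mode = channel.get("InputMode", job_mode)
        compression = channel.get("CompressionType", "None")
        if mode == "File" and compression != "None":
            raise ValueError(
                "Invalid compression type for channel {}: File mode only "
                "supports NONE, got {} instead".format(
                    channel["ChannelName"], compression
                )
            )


if __name__ == "__main__":
    check_compression(build_request("Pipe"))          # Estimator-level Pipe
    check_compression(build_request("File", "Pipe"))  # channel-level override
    try:
        check_compression(build_request("File"))      # override missing
    except ValueError as err:
        print(err)
```

In this model, the error in the report only appears when the channel's mode resolves to File, which is consistent with the Estimator-level setting working. Whether the override is lost in the SDK or handled differently by the service is not established by this issue.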
