Invalid specification of input data in windows local mode #844

@xnaiman

Description

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): Scikit-Learn
  • Framework Version: 0.20.0 (official sagemaker-scikit-learn-container)
  • Python Version: 3.6
  • CPU or GPU: CPU
  • Python SDK Version: 1.26.0
  • Are you using a custom image: No

Describe the problem

When I call the fit method in local mode on Windows, an incorrect training-data path is passed to the Docker container.

Cause

I have already identified the cause: the path separator differs between Windows and Linux. Details follow.

The following line is in sagemaker-python-sdk/src/sagemaker/local/image.py:

self.container_dir = container_dir if container_dir else os.path.join('/opt/ml/input/data', channel)

On Windows, os.path.join uses a backslash as the separator, so when channel is train the result is /opt/ml/input/data\train, which Docker does not recognize as a valid container path.
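The behavior can be reproduced on any platform with the standard library: ntpath is the implementation os.path resolves to on Windows, and posixpath is the one used on Linux. The sketch below is only an illustration of the separator problem and of one possible fix (building container paths with posixpath.join); the helper names and the BASE_DIR constant are hypothetical and are not taken from the SDK.

```python
import ntpath      # what os.path resolves to on Windows
import posixpath   # what os.path resolves to on Linux/macOS

# Base directory quoted in the issue; the constant name is hypothetical.
BASE_DIR = '/opt/ml/input/data'


def container_dir_os_join(channel, path_module):
    """Mimic os.path.join on the platform that path_module represents."""
    return path_module.join(BASE_DIR, channel)


def container_dir_posix(channel):
    """Possible fix: container paths are always POSIX, so join with posixpath."""
    return posixpath.join(BASE_DIR, channel)


windows_result = container_dir_os_join('train', ntpath)
linux_result = container_dir_os_join('train', posixpath)
fixed_result = container_dir_posix('train')

print(windows_result)  # mixed separators, as described in the issue
print(linux_result)
print(fixed_result)    # same on every host platform
```

Because the path refers to a location inside a Linux container, it arguably should not depend on the host's separator at all, which is why posixpath.join (or plain string formatting with '/') seems a reasonable choice here.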

Minimal repro / logs

  • Exact command to reproduce:
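from sagemaker.sklearn.estimator import SKLearn

# role, sagemaker_session and train_input are assumed to be defined beforehand.
sklearn = SKLearn(
    entry_point='scikit_learn_iris.py',
    train_instance_type="ml.c4.xlarge",
    role=role,
    sagemaker_session=sagemaker_session,
    hyperparameters={'max_leaf_nodes': 30})

# The 'train' channel name is what gets joined onto /opt/ml/input/data.
sklearn.fit({'train': train_input})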
