@YYStreet
Contributor

@YYStreet YYStreet commented Mar 3, 2020

Issue #, if available:
The conda package for awscli is updated later than the PyPI release.

Description of changes:
Install awscli from PyPI instead of conda in the PyTorch containers, because the conda package is updated later than the PyPI release.
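
One way to confirm which tool installed awscli inside a built image is to read the `INSTALLER` record in the package metadata. This is a minimal sketch, not part of the PR: the helper name `installer_of` is hypothetical, and it assumes the installer wrote an `INSTALLER` file (pip does, and conda-built packages commonly record `conda`, but that is not guaranteed).

```python
# importlib.metadata ships with Python 3.8+; older interpreters (the container
# in this thread runs Python 3.6) would need the importlib_metadata backport.
from importlib import metadata


def installer_of(package):
    """Return the tool recorded as the installer of `package`, or None if unknown.

    `installer_of` is a hypothetical helper for illustration only.
    """
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None  # package is not installed in this environment
    recorded = dist.read_text("INSTALLER")  # None when the file is absent
    return recorded.strip() if recorded else None
```

Calling `installer_of("awscli")` in the container would be expected to report `pip` after this change, under the assumptions above.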

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@YYStreet YYStreet requested review from a user and TusharKanekiDey March 3, 2020 21:09
@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

@YYStreet
Contributor Author

YYStreet commented Mar 4, 2020

ValueError: Error for Training job sagemaker-test-2020-03-03-22-36-18-979: Failed Reason: CapacityError: Unable to provision requested ML compute capacity. Please retry using a different ML instance type.

ValueError: Error for Training job sagemaker-test-2020-03-03-22-55-18-029: Failed Reason: CapacityError: Unable to provision requested ML compute capacity. Please retry using a different ML instance type.
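
The error message suggests retrying with a different ML instance type. A minimal sketch of that retry loop follows, assuming a hypothetical `launch` callable that starts the training job on a given instance type and raises `ValueError` with the failure reason, as in the log above. Neither `launch` nor `run_with_fallback` comes from the test suite in this repository.

```python
def run_with_fallback(launch, instance_types):
    """Try each instance type in order until `launch` succeeds.

    `launch` is a hypothetical callable: it starts a job on the given
    instance type and raises ValueError when the job fails.
    """
    last_error = None
    for instance_type in instance_types:
        try:
            return launch(instance_type)
        except ValueError as err:
            if "CapacityError" not in str(err):
                raise  # unrelated failure: do not mask it
            last_error = err  # no capacity here, try the next type
    raise RuntimeError(
        "no capacity on any of: " + ", ".join(instance_types)
    ) from last_error
```

Only capacity failures trigger a retry, so a genuine test failure still surfaces on the first attempt.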

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

@YYStreet
Contributor Author

YYStreet commented Mar 4, 2020

The failures may come from PyTorch itself and the external resources it depends on: the MNIST download performed by torchvision in the test script fails with HTTP 403.
pytorch/pytorch#34262

2020-03-03 23:14:04,658 sagemaker-containers ERROR    ExecuteUserScriptError:
Command "/opt/conda/bin/python smdebug_mnist.py --data_dir /codebuild/output/src015489493/src/github.com/aws/sagemaker-pytorch-container/test/resources/mnist/data/training --epochs 1 --num_steps 50 --random_seed True --smdebug_path /opt/ml/output/tensors"
INFO:__main__:Create neural network module
INFO:__main__:Get train data loader
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "smdebug_mnist.py", line 223, in <module>
    main()
  File "smdebug_mnist.py", line 186, in main
    train(model, device, optimizer, hook, opt.epochs, opt.log_interval, training_dir)
  File "smdebug_mnist.py", line 130, in train
    trainloader = _get_train_data_loader(4, training_dir)
  File "smdebug_mnist.py", line 96, in _get_train_data_loader
    transforms.Normalize((0.1307,), (0.3081,))
  File "/opt/conda/lib/python3.6/site-packages/torchvision/datasets/mnist.py", line 70, in __init__
    self.download()
  File "/opt/conda/lib/python3.6/site-packages/torchvision/datasets/mnist.py", line 137, in download
    download_and_extract_archive(url, download_root=self.raw_folder, filename=filename, md5=md5)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/datasets/utils.py", line 264, in download_and_extract_archive
    download_url(url, download_root, filename, md5)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/datasets/utils.py", line 97, in download_url
    raise e
  File "/opt/conda/lib/python3.6/site-packages/torchvision/datasets/utils.py", line 85, in download_url
    reporthook=gen_bar_updater()
  File "/opt/conda/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/opt/conda/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
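
The traceback ends in `urllib.request.urlretrieve` raising `HTTPError` for the dataset URL. One way a test could tolerate such an outage is to try a list of sources in order. The sketch below is illustrative only: `fetch_first_available` and its `fetch` parameter are hypothetical names, and any mirror URLs would have to be supplied by the caller. It is not how torchvision or this repository handles the download.

```python
import urllib.error
import urllib.request


def fetch_first_available(urls, destination, fetch=urllib.request.urlretrieve):
    """Download to `destination` from the first URL that does not fail with an HTTP error.

    Returns the URL that worked. `fetch` is injectable so the logic can be
    exercised without network access.
    """
    errors = []
    for url in urls:
        try:
            fetch(url, destination)
            return url
        except urllib.error.HTTPError as err:
            # Record the status (for example 403) and move on to the next source.
            errors.append("{}: HTTP {}".format(url, err.code))
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

Pre-staging the dataset in the test resources directory would avoid the network dependency entirely; the fallback above only narrows the window for this kind of failure.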

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@YYStreet YYStreet merged commit 62fe007 into aws:master Mar 5, 2020

3 participants