Allow specifying default S3 bucket for a given SageMaker Session #1165

@elvinx

Description

One issue we face with the SDK over and over again in controlled environments where we are not allowed to create S3 buckets: the SDK tries to create and use a bucket named after the following convention:
sagemaker-{region}-{AWS account ID}

Cases where we face this issue:

  • Code location using Tensorflow/Pytorch script mode
  • Code location for Data Processing Jobs
  • Default output directories for created models, etc.

What's even worse, in all of the above cases these locations are optional, so the SDK falls back to the default bucket, which we are not able to create.

As a user, I would like to be able to specify my default S3 bucket for a given session so I stop hitting this issue all the time.

Proposed solution:
Allow specifying a default_bucket during the creation of a Session

class Session(object):  # pylint: disable=too-many-public-methods
    def __init__(
        self,
        boto_session=None,
        sagemaker_client=None,
        sagemaker_runtime_client=None,
        default_bucket=None,
    ):
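The intended fallback logic can be sketched as follows. This is a minimal illustration, not the SDK's actual implementation; the helper name resolve_default_bucket is hypothetical. A user-supplied default_bucket takes precedence, otherwise the SDK's sagemaker-{region}-{AWS account ID} convention applies.

```python
def resolve_default_bucket(region, account_id, default_bucket=None):
    """Return the bucket name a session should use.

    An explicitly supplied default_bucket wins; otherwise fall back
    to the SDK's sagemaker-{region}-{AWS account ID} convention.
    """
    if default_bucket is not None:
        return default_bucket
    return "sagemaker-{}-{}".format(region, account_id)


# With no override, the conventional name is used:
print(resolve_default_bucket("eu-west-1", "123456789012"))
# With an override, the pre-approved bucket is used and no
# bucket creation is attempted:
print(resolve_default_bucket("eu-west-1", "123456789012",
                             default_bucket="my-approved-bucket"))
```

With this in place, environments that forbid bucket creation can point every optional code/output location at a bucket provisioned in advance.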

    Labels

    status: pending release (the fix has been merged but not yet released to PyPI)
