Description
System Information
- Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): TensorFlow Framework
- Framework Version: 1.10, 1.11
- Python Version: 3.x
- CPU or GPU: GPU
- Python SDK Version: latest
- Are you using a custom image: no
Describe the problem
A large data science org has a requirement for encrypted S3 buckets. BlazingText works effectively and pushes data to encrypted S3 buckets. With the same parameters as BlazingText (role, session, subnets, security groups, and output_path) plus a few Framework-specific ones (code_location, output_kms_key), I get an Access Denied error on the PutObject call.
Minimal repro / logs
The problem is specifically in the call chain _prepare_for_training -> _stage_user_code_in_s3 -> tar_and_upload_dir -> upload_file. It looks like the Boto3 call here does not use the KMS key, even when one is provided to the framework. Here's the specific line:

```python
session.resource('s3').Object(bucket, key).upload_file(tar_file)
```
The upload_file call could pass ServerSideEncryption and SSEKMSKeyId as ExtraArgs to the S3 Transfer Manager.
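A minimal sketch of what the fixed call might look like, assuming the framework's KMS key is available at that point in tar_and_upload_dir (the `kms_key` variable and the `sse_kms_extra_args` helper are hypothetical names of mine, not existing SDK code):

```python
def sse_kms_extra_args(kms_key=None):
    """Build the ExtraArgs dict the S3 Transfer Manager accepts for
    server-side encryption; returns {} when no key is configured,
    so unencrypted buckets keep working unchanged."""
    if kms_key is None:
        return {}
    return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key}


# The upload in tar_and_upload_dir could then become:
# session.resource('s3').Object(bucket, key).upload_file(
#     tar_file, ExtraArgs=sse_kms_extra_args(kms_key))
```

ExtraArgs is the documented way to attach SSE headers to `upload_file`, so no change to the Transfer Manager itself should be needed.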
- Exact command to reproduce:
```python
estimator = TensorFlow(
    entry_point='model_file.py',
    role=role,
    output_path='s3://encryptedbucket/key',
    code_location='s3://encryptedbucket/code_key',
    hyperparameters=hyperparameter_dict,
    training_steps=training_steps,
    evaluation_steps=evaluation_steps,
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge',
    sagemaker_session=sagemaker_session,
    subnets=['subnet'],
    security_group_ids=['securitygroup'],
    output_kms_key='alias/aws/s3',
    train_volume_kms_key='alias/aws/s3',
)
estimator.fit(inputs='s3://encryptedbucket/training_key')
```
Please let me know if you need any further information--thank you!