rxin (Contributor) commented Sep 4, 2014

Please merge this at the same time as mesos/spark-ec2#66

@rxin rxin changed the title from "[SPARK-3391] Support attaching more than 1 EBS volumes." to "[SPARK-3391][EC2] Support attaching more than 1 EBS volumes." Sep 4, 2014
rxin (Contributor Author) commented Sep 4, 2014

This needs to be used together with mesos/spark-ec2#65 and mesos/spark-ec2#66

rxin (Contributor Author) commented Sep 4, 2014

Tested by launching 8 EBS volumes on r3.8xlarge instances.

@rxin rxin changed the title from "[SPARK-3391][EC2] Support attaching more than 1 EBS volumes." to "[SPARK-3391][EC2] Support attaching up to 8 EBS volumes." Sep 4, 2014
ec2/spark_ec2.py Outdated
Contributor

What is gp2? Is it applicable to all instance types?

Contributor Author

gp2 is the new General Purpose (SSD) EBS volume type, which is now the recommended one.

Contributor

http://aws.amazon.com/ebs/details/ says that gp2 means attaching SSD-backed EBS volumes, which sounds good. But isn't this 2x more expensive than standard?

Contributor Author

@pdeyhim any comment here?

Contributor

We could just make this configurable. Some people might prefer the spinning disks.

rxin (Contributor Author) commented Sep 5, 2014

OK, I made the EBS volume type configurable.

shivaram (Contributor) commented Sep 5, 2014

LGTM.

pdeyhim commented Sep 5, 2014

For io1, specifying the number of IOPS is required. So we either have to limit this to gp2 and standard, or fully support io1 by allowing users to specify the number of IOPS.
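
For illustration, the two options in this comment can be sketched as a small validation step. This is not code from the PR: the function name, the `iops` and `allow_io1` parameters, and the error messages are hypothetical. Only the rule that io1 needs an IOPS value comes from the comment above.

```python
def check_ebs_volume_type(vol_type, iops=None, allow_io1=False):
    """Validate a requested EBS volume type (hypothetical helper).

    allow_io1=False models "limit this to gp2 and standard".
    allow_io1=True models "fully support io1", which needs an IOPS value.
    """
    supported = {"standard", "gp2"}
    if allow_io1:
        supported.add("io1")
    if vol_type not in supported:
        raise ValueError("unsupported EBS volume type: %r" % vol_type)
    if vol_type == "io1":
        # io1 is provisioned-IOPS storage, so the caller must say how many.
        if iops is None or iops <= 0:
            raise ValueError("io1 volumes require a positive IOPS value")
        return {"volume_type": vol_type, "iops": iops}
    # Assumption of this sketch: any iops value is ignored for other types.
    return {"volume_type": vol_type}


print(check_ebs_volume_type("gp2"))
print(check_ebs_volume_type("io1", iops=1000, allow_io1=True))
```

Restricting the set of types is the smaller change; supporting io1 means adding one more user-facing option and threading it through to the volume request.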

rxin (Contributor Author) commented Sep 5, 2014

OK, merging this (and I removed io1 for now).
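
A minimal sketch of how the resulting options might fit together: a configurable volume type limited to gp2 and standard, and up to 8 volumes per node as in the PR title. It uses `argparse` and plain dicts so it runs on its own. The flag names, defaults, and device names are assumptions for illustration, not the code merged here.

```python
import argparse

MAX_EBS_VOLUMES = 8  # limit taken from the PR title


def build_parser():
    # Flag names and defaults are illustrative assumptions.
    parser = argparse.ArgumentParser(description="EBS volume options (sketch)")
    parser.add_argument("--ebs-vol-size", type=int, default=0,
                        help="Size in GB of each EBS volume; 0 attaches none.")
    parser.add_argument("--ebs-vol-type", choices=["standard", "gp2"],
                        default="standard", help="EBS volume type.")
    parser.add_argument("--ebs-vol-num", type=int, default=1,
                        help="Number of EBS volumes per node (1 to 8).")
    return parser


def plan_volumes(opts):
    """Return one plain dict per requested volume (no AWS calls are made)."""
    if opts.ebs_vol_size <= 0:
        return []
    if not 1 <= opts.ebs_vol_num <= MAX_EBS_VOLUMES:
        raise ValueError("--ebs-vol-num must be between 1 and %d" % MAX_EBS_VOLUMES)
    volumes = []
    for i in range(opts.ebs_vol_num):
        volumes.append({
            # Hypothetical device names: /dev/sds, /dev/sdt, ... /dev/sdz
            "device": "/dev/sd" + chr(ord("s") + i),
            "size_gb": opts.ebs_vol_size,
            "volume_type": opts.ebs_vol_type,
        })
    return volumes


example = build_parser().parse_args(
    ["--ebs-vol-size", "100", "--ebs-vol-type", "gp2", "--ebs-vol-num", "2"])
for vol in plan_volumes(example):
    print(vol)
```

Passing `--ebs-vol-type io1` to this parser fails at argument parsing, which mirrors the decision to leave io1 out for now.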

pdeyhim commented Sep 5, 2014

And what happens when the additional EBS volumes get added? We probably want to configure spark-env.sh and spark_local_dir with the new volumes, correct? The place where this happens is https://github.com/rxin/spark/blob/ec2-ebs-vol/ec2/spark_ec2.py#L674-L678, but that snippet only configures the local disks in spark-env.sh, not the new EBS volumes.
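
As a rough sketch of what this comment proposes, the local-directory list written into spark-env.sh could be built from both the local disks and the EBS mount points. The helper name, the mount paths (`/mnt`, `/mnt2`, `/vol0`, `/vol1`), and the `spark` subdirectory are assumptions for illustration and are not taken from the linked snippet.

```python
def local_dirs(local_mounts, ebs_mounts, include_ebs=True, subdir="spark"):
    """Build a comma-separated directory list (hypothetical helper).

    include_ebs=False reproduces the behaviour described above, where only
    local disks are configured.
    """
    mounts = list(local_mounts)
    if include_ebs:
        mounts.extend(ebs_mounts)
    return ",".join("%s/%s" % (m.rstrip("/"), subdir) for m in mounts)


# Assumed mount points; a real cluster may use different paths.
dirs = local_dirs(["/mnt", "/mnt2"], ["/vol0", "/vol1"])
print('export SPARK_LOCAL_DIRS="%s"' % dirs)
```

Whether shuffle data should land on EBS at all is a separate performance question from how the list is assembled.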

@asfgit asfgit closed this in 1725a1a Sep 5, 2014
rxin (Contributor Author) commented Sep 5, 2014

EBS volumes are not great for shuffle (poor small-write performance). Let's hold off on that for now.

pdeyhim commented Sep 5, 2014

@rxin OK, that's correct for smaller instance types. But FYI, EBS on larger instances (and EBS-optimized instances) should perform well on shuffle read/write.

@rxin rxin deleted the ec2-ebs-vol branch September 5, 2014 07:43