Conversation

@pwendell
Contributor

1. Better error messages when required arguments are missing.
2. Support for unit testing cases where presented arguments are invalid.
3. Bug fix: Only use environment variables when they are set (otherwise this will cause an NPE); a sketch of this pattern follows the list.
4. A verbose mode to aid debugging.
5. Visibility of several variables is set to private.
6. Deprecation warning for existing scripts.
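
Below is a minimal Scala sketch of the patterns this list describes: read environment variables only when they are set, fail with a readable message when a required argument is missing, and support a verbose mode. It is illustrative only, not the actual spark-submit argument parser, and the object and option names are hypothetical.

```scala
// Hypothetical sketch of the robustness fixes described above;
// not the real spark-submit argument-handling code.
object DeployArgumentsSketch {
  def main(args: Array[String]): Unit = {
    val verbose = args.contains("--verbose")

    // Bug-fix pattern: only consult an environment variable when it is set,
    // so an unset variable becomes None instead of a null that later NPEs.
    val sparkHome: Option[String] = Option(System.getenv("SPARK_HOME"))

    // Better error message when a required argument is missing.
    val master: String = parseOption(args, "--master").getOrElse(
      exitWithError("Error: --master is required"))

    // Verbose mode to aid debugging.
    if (verbose) {
      System.err.println(s"master = $master, SPARK_HOME = ${sparkHome.getOrElse("<unset>")}")
    }
  }

  // Returns the value following `name`, if present.
  private def parseOption(args: Array[String], name: String): Option[String] = {
    val i = args.indexOf(name)
    if (i >= 0 && i + 1 < args.length) Some(args(i + 1)) else None
  }

  // Prints the message and exits; a test-friendly variant would throw instead,
  // which is what makes invalid-argument cases unit-testable.
  private def exitWithError(msg: String): Nothing = {
    System.err.println(msg)
    sys.exit(1)
  }
}
```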
@pwendell
Contributor Author

/cc @sryza

@AmplabJenkins

Merged build triggered. Build is starting -or- tests failed to complete.

@AmplabJenkins

Merged build started. Build is starting -or- tests failed to complete.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13583/

@pwendell
Contributor Author

These are mostly fixes and tests, so I'm going to merge them. But @sryza feel free to submit follow-up patches if there is something in here that doesn't seem right.

@asfgit closed this in 841721e Mar 31, 2014
jhartlaub pushed a commit to jhartlaub/spark that referenced this pull request May 27, 2014
…-0.8

Force pseudo-tty allocation in spark-ec2 script.

ssh commands need the -t argument repeated twice if there is no local
tty, e.g. if the process running spark-ec2 uses nohup and the parent
process exits.

Without this change, if you run the script this way (e.g. using nohup from a cron job), it will fail setting up the nodes because some of the ssh commands complain about missing ttys and then fail.

(This version is for the 0.8 branch. I've filed a separate request for master since changes to the script caused the patches to be different.)
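
The underlying detail is an ssh flag: a single -t requests a pseudo-terminal, but when the local process has no tty the flag must be repeated to force allocation. A minimal sketch of such an invocation, with a placeholder host and remote command; the real spark-ec2 logic lives in the ec2 script itself, so this is only an illustration of the ssh behaviour:

```scala
import scala.sys.process._

// Illustration of the ssh detail described above; the user, host and
// remote command are placeholders, not part of the actual spark-ec2 script.
object ForcedTtySsh {
  def main(args: Array[String]): Unit = {
    val cmd = Seq(
      "ssh",
      "-t", "-t",          // repeated -t forces pseudo-tty allocation even with no local tty
      "-o", "StrictHostKeyChecking=no",
      "root@ec2-host",
      "echo connected"
    )
    val exitCode = cmd.!   // runs the command, inheriting stdout/stderr
    println(s"ssh exited with status $exitCode")
  }
}
```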
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
1. Better error messages when required arguments are missing.
2. Support for unit testing cases where presented arguments are invalid.
3. Bug fix: Only use environment variables when they are set (otherwise this will cause an NPE).
4. A verbose mode to aid debugging.
5. Visibility of several variables is set to private.
6. Deprecation warning for existing scripts.

Author: Patrick Wendell <[email protected]>

Closes apache#271 from pwendell/spark-submit and squashes the following commits:

9146def [Patrick Wendell] SPARK-1352: Improve robustness of spark-submit script
liancheng pushed a commit to liancheng/spark that referenced this pull request Mar 17, 2017
Integration test "test.redshift", part of "dogfood_notebook_tests", was failing for Spark branch-2.1 with an SSL-related error.

Root cause:
- Postgres JDBC driver expects all-lowercase options in the URL, while we were providing camelCase ones (e.g. _sslRootCert_ as opposed to _sslrootcert_).

Why this wasn't caught by Spark-side integration tests:
- All tests use the "redshift" subprotocol, which asks for the Redshift driver.
- The Postgres integration tests don't involve Redshift and therefore don't exercise this feature.
- Even after manually changing the subprotocol to "postgresql", the Redshift driver was still being picked up by _Class.forName("org.postgres.Driver")_ because it was present on the classpath with higher priority.

Why the intended way of disabling this feature didn't work:
- Per Postgres JDBC docs, _&ssl=false_ has in fact the exact opposite effect: it enables SSL encryption (!)

## What changes were proposed in this pull request?

- Change ssl-related options to lowercase.
- Introduce DataFrame reader option "autoenablessl" for disabling the feature (see the sketch after this list).
- Extend Redshift SSL integration test suite to verify that this new flag works.
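
A rough sketch of what the lowercase URL options and the new reader flag could look like from the caller's side. The host, database, table, tempdir and certificate path are placeholders, and the format name assumes the Databricks spark-redshift connector, so treat this as illustrative rather than the actual change:

```scala
import org.apache.spark.sql.SparkSession

object RedshiftSslSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("redshift-ssl-sketch").getOrCreate()

    // Postgres JDBC expects all-lowercase URL options: sslrootcert, not sslRootCert.
    val jdbcUrl =
      "jdbc:postgresql://example-host:5439/dev" +
        "?sslmode=verify-full" +
        "&sslrootcert=/tmp/redshift-ca-bundle.crt"

    val df = spark.read
      .format("com.databricks.spark.redshift")
      .option("url", jdbcUrl)
      .option("dbtable", "example_table")
      .option("tempdir", "s3a://example-bucket/tmp/")
      .option("autoenablessl", "false")   // the new flag for opting out of automatic SSL setup
      .load()

    df.show()
  }
}
```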

## How was this patch tested?

- _bazel run //spark/images:2.1.x-scala2.10_dogfood_notebook_tests_: https://dogfood.staging.cloud.databricks.com/#job/186727/run/1
- Existing _redshift-integration-tests_
- New integration test

## To Do in a separate PR:
- Make _redshift-integration-tests_ also exercise the Postgres driver that we actually bundle.

Author: Adrian Ionescu <[email protected]>

Closes apache#271 from adrian-ionescu/redshift-ssl-SC-6101.
rahij pushed a commit to rahij/spark that referenced this pull request Dec 5, 2017
Igosuki pushed a commit to Adikteev/spark that referenced this pull request Jul 31, 2018
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Update ansible-functional-public-clouds job option
arjunshroff pushed a commit to arjunshroff/spark that referenced this pull request Nov 24, 2020
… in thrift server (apache#271)

## What changes were proposed in this pull request?

For the details of the exception please see [SPARK-24062](https://issues.apache.org/jira/browse/SPARK-24062).

The issue is:

Spark on YARN stores the SASL secret in the current UGI's credentials; these credentials are distributed to the AM and the executors, so that the executors and the driver share the same secret for communication. But the STS/Hive library code refreshes the current UGI via UGI's loginFromKeytab() after the Spark application has started. This creates a new UGI in the current driver's context with empty tokens and secret keys, so the secret key is lost from the current context's UGI, which is why the Spark driver throws the secret-key-not-found exception.

In the Spark 2.2 code, Spark also stores this secret key in a SecurityManager class variable, so even if the UGI is refreshed, the secret still exists in that object and STS with SASL still works in Spark 2.2. But in Spark 2.3 we always look the key up in the current UGI, which is why it fails to work in Spark 2.3.

To fix this issue, there are two possible solutions:

1. Fix it in the STS/Hive library: when the UGI is refreshed, copy the secret key from the original UGI to the new one. The difficulty is that some of the code that refreshes the UGI lives in the Hive library, which makes it hard for us to change.
2. Roll back the logic in SecurityManager to match Spark 2.2, so that this issue is fixed.

The 2nd solution seems the simpler one, so I will propose a PR with the 2nd solution.
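
A simplified sketch of the idea behind the 2nd solution: generate the secret once, keep a reference to it inside the manager object, and use the UGI only for distributing it, so a later UGI refresh cannot lose it. It assumes the Hadoop UserGroupInformation/Credentials API and is not the actual SecurityManager code:

```scala
import java.security.SecureRandom
import java.util.Base64

import org.apache.hadoop.io.Text
import org.apache.hadoop.security.UserGroupInformation

// Simplified sketch of the "keep the secret in the object" approach;
// not the actual org.apache.spark.SecurityManager implementation.
class AuthSecretHolder(secretKeyAlias: String = "sparkCookie") {

  // Generate once and keep a reference in this object, so the secret survives
  // even if Hive/STS later replaces the current UGI via loginFromKeytab().
  private val cachedSecret: String = {
    val ugi = UserGroupInformation.getCurrentUser
    val existing = Option(ugi.getCredentials.getSecretKey(new Text(secretKeyAlias)))
    existing.map(new String(_, "UTF-8")).getOrElse {
      val bytes = new Array[Byte](32)
      new SecureRandom().nextBytes(bytes)
      val secret = Base64.getEncoder.encodeToString(bytes)
      // Also store it in the UGI credentials so it gets distributed to executors.
      ugi.getCredentials.addSecretKey(new Text(secretKeyAlias), secret.getBytes("UTF-8"))
      secret
    }
  }

  // Callers read the cached copy rather than searching the current UGI,
  // which may have been refreshed with empty credentials.
  def getSecretKey(): String = cachedSecret
}
```

With this, getSecretKey() keeps returning the original value even after the Hive code installs a fresh UGI.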

## How was this patch tested?

Verified in local cluster.

CC vanzin  tgravescs  please help to review. Thanks!

Author: jerryshao <[email protected]>

Closes apache#21138 from jerryshao/SPARK-24062.

(cherry picked from commit ffaf0f9)
Signed-off-by: jerryshao <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
turboFei added a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…33] Backport insert operation lock (apache#197)

* [HADP-40184] Backport insert operation lock (#15)

[HADP-31946] Fix data duplicate on application retry and support concurrent write to different partitions in the same table.
[HADP-33040][HADP-33041] Optimize merging staging files to output path and detect conflict with HDFS file lease.
[HADP-34738] During commitJob, merge paths with multi threads (apache#218)
[HADP-36251] Enhance the concurrent lock mechanism for insert operation (apache#272)
[HADP-37137] Add option to disable insert operation lock to write partitioned table (apache#286)

* [HADP-46224] Do not overwrite the lock file when creating lock (apache#133)

* [HADP-46868] Fix Spark merge path race condition (apache#161)

* [HADP-50903] Ignore the error message if insert operation lock file has been deleted (apache#271)

* [HADP-50733] Enhance the error message on picking insert operation lock failure (apache#267)

* Fix

* Fix

* Fix

* fix

* Fix

* Fix

* Fix

* Fix

* Fix

* [HADP-50574] Support to create the lock file for EC enabled path (apache#263)

* [HADP-50574][FOLLOWUP] Add parameter type when getting overwrite method (apache#265)

* [HADP-50574][FOLLOWUP] Add UT for creating ec disabled lock file and use underlying DistributedFileSystem for ViewFileSystem (apache#266)

* Fix

* Fix

* Fix

* [HADP-34612][FOLLOWUP] Do not show the insert local error by removing the being written stream from dfs client (apache#288)

* Enabled Hadoop 3

---------

Co-authored-by: fwang12 <[email protected]>