Conversation

@markhamstra

No description provided.

tmagrino and others added 14 commits June 28, 2016 13:38
… Executor ID

## What changes were proposed in this pull request?

Previously, the TaskLocation implementation would not allow executor IDs that include underscores. This tweaks the string split used to separate the hostname from the executor ID, allowing underscores in the executor ID.
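
A minimal sketch of the idea, with illustrative names rather than the actual `TaskLocation` internals: splitting only on the first underscore after the `executor_` prefix leaves any underscores in the executor ID intact.

```scala
// Hypothetical sketch; assumes the encoded form "executor_<host>_<executorId>".
object TaskLocationParsing {
  val executorTag = "executor_"

  def parse(loc: String): (String, String) = {
    require(loc.startsWith(executorTag), s"Unexpected location format: $loc")
    // split("_", 2) splits on the FIRST underscore only, so an executor id
    // like "app_0001_exec_2" is preserved as a whole.
    val Array(host, executorId) = loc.stripPrefix(executorTag).split("_", 2)
    (host, executorId)
  }
}

// Example: parse("executor_host1.example.com_app_0001_exec_2")
//   == ("host1.example.com", "app_0001_exec_2")
```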

This addresses the JIRA found here: https://issues.apache.org/jira/browse/SPARK-16148

This is moved over from a previous PR against branch-1.6: apache#13857

## How was this patch tested?

Ran existing unit tests for core and streaming.  Manually ran a simple streaming job with an executor whose id contained underscores and confirmed that the job ran successfully.

This is my original work and I license the work to the project under the project's open source license.

Author: Tom Magrino <[email protected]>

Closes apache#13858 from tmagrino/fixtasklocation.

(cherry picked from commit ae14f36)
Signed-off-by: Shixiong Zhu <[email protected]>
…n NewHadoopRDD to branch 1.6

## What changes were proposed in this pull request?

This PR backports apache#13759.

(`SqlNewHadoopRDDState` was renamed to `InputFileNameHolder`, and the `spark` API does not exist in branch 1.6.)

## How was this patch tested?

Unit tests in `ColumnExpressionSuite`.

Author: hyukjinkwon <[email protected]>

Closes apache#13806 from HyukjinKwon/backport-SPARK-16044.
….6.3.

## What changes were proposed in this pull request?

- Adds 1.6.2 and 1.6.3 as supported Spark versions within the bundled spark-ec2 script.
- Makes the default Spark version 1.6.3 to keep in sync with the upcoming release.
- Does not touch the newer spark-ec2 scripts in the separate amplabs repository.

## How was this patch tested?

- Manual script execution:

```
export AWS_SECRET_ACCESS_KEY=_snip_
export AWS_ACCESS_KEY_ID=_snip_
$SPARK_HOME/ec2/spark-ec2 \
    --key-pair=_snip_ \
    --identity-file=_snip_ \
    --region=us-east-1 \
    --vpc-id=_snip_ \
    --slaves=1 \
    --instance-type=t1.micro \
    --spark-version=1.6.2 \
    --hadoop-major-version=yarn \
    launch test-cluster
```

- Result: Successful creation of a 1.6.2-based Spark cluster.

This contribution is my original work and I license the work to the project under the project's open source license.

Author: Brian Uri <[email protected]>

Closes apache#13947 from briuri/branch-1.6-bug-spark-16257.
…cess.destroyForcibly() if and only if Process.destroy() fails

## What changes were proposed in this pull request?

Utils.terminateProcess should call `destroy()` first and only fall back to `destroyForcibly()` if that fails. Force-killing executors is bad, and it only happens on Java 8; see the JIRA for an example of the impact: no clean shutdown.

While here: `Utils.waitForProcess` should use the Java 8 method if available instead of a custom implementation.
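
A minimal sketch of the intended ordering, assuming Java 8 (the actual `Utils` code also has to handle Java 7, where `destroyForcibly()` does not exist):

```scala
import java.util.concurrent.TimeUnit

// Hypothetical sketch, not the actual Utils.terminateProcess: ask the
// process to exit first, and only force-kill if it is still alive.
def terminateProcess(process: Process, timeoutMs: Long): Option[Int] = {
  process.destroy() // polite request; lets the child run shutdown hooks
  if (process.waitFor(timeoutMs, TimeUnit.MILLISECONDS)) {
    Some(process.exitValue())
  } else {
    process.destroyForcibly() // Java 8 only: escalate as a last resort
    if (process.waitFor(timeoutMs, TimeUnit.MILLISECONDS)) {
      Some(process.exitValue())
    } else {
      None
    }
  }
}
```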

## How was this patch tested?

Existing tests, which cover the force-kill case, and Amplab tests, which will eventually cover both Java 7 and Java 8. However, I tested locally on Java 8, and the PR builder will try Java 7 here.

Author: Sean Owen <[email protected]>

Closes apache#13973 from srowen/SPARK-16182.

(cherry picked from commit 2075bf8)
Signed-off-by: Sean Owen <[email protected]>
…hon3

## What changes were proposed in this pull request?

I would like to use IPython with Python 3.5. It is annoying when it fails with "IPython requires Python 2.7+; please install python2.7 or set PYSPARK_PYTHON" when I have a version greater than 2.7.

## How was this patch tested?
It now works with IPython and Python 3.

Author: MechCoder <[email protected]>

Closes apache#13503 from MechCoder/spark-15761.

(cherry picked from commit 66283ee)
Signed-off-by: Sean Owen <[email protected]>
… No Column apache#14040

#### What changes were proposed in this pull request?
Star expansion over a table containing zero columns has not worked since 1.6, although it works in Spark 1.5.1. This PR fixes the issue in the master branch.

For example,
```scala
val rddNoCols = sqlContext.sparkContext.parallelize(1 to 10).map(_ => Row.empty)
val dfNoCols = sqlContext.createDataFrame(rddNoCols, StructType(Seq.empty))
dfNoCols.registerTempTable("temp_table_no_cols")
sqlContext.sql("select * from temp_table_no_cols").show
```

Without the fix, users will get the following exception:
```
java.lang.IllegalArgumentException: requirement failed
        at scala.Predef$.require(Predef.scala:221)
        at org.apache.spark.sql.catalyst.analysis.UnresolvedStar.expand(unresolved.scala:199)
```
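
A hedged sketch of the expected behavior (the actual fix lives in `UnresolvedStar.expand`; this helper is illustrative only): expanding `*` over a relation with no columns should produce an empty projection list instead of failing a `require`.

```scala
import org.apache.spark.sql.catalyst.expressions.{Attribute, NamedExpression}

// Illustrative helper, not the real analyzer code: a zero-column input
// expands to zero expressions rather than tripping a require(...).
def expandStar(input: Seq[Attribute]): Seq[NamedExpression] =
  if (input.isEmpty) Seq.empty else input
```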

#### How was this patch tested?
Tests are added

Author: gatorsmile <[email protected]>

Closes apache#14042 from gatorsmile/starExpansionEmpty.
Link to Jira issue: https://issues.apache.org/jira/browse/SPARK-16353

## What changes were proposed in this pull request?

The javadoc options for the java unidoc generation are ignored when generating the java unidoc. For example, the generated `index.html` has the wrong HTML page title. This can be seen at http://spark.apache.org/docs/latest/api/java/index.html.

I changed the relevant setting scope from `doc` to `(JavaUnidoc, unidoc)`.
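
A hedged sbt sketch of the scoping change, assuming the sbt-unidoc plugin's `JavaUnidoc` configuration and `unidoc` task are in scope (the option values are illustrative; the real settings live in Spark's `project/SparkBuild.scala`):

```scala
// Options scoped to (JavaUnidoc, unidoc) are picked up when generating the
// Java unidoc; the same options scoped only to `doc` were being ignored.
javacOptions in (JavaUnidoc, unidoc) := Seq(
  "-windowtitle", "Spark API Documentation", // sets the HTML page title
  "-public",
  "-noqualifier", "java.lang"
)
```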

## How was this patch tested?

I ran `docs/jekyll build` and verified that the java unidoc `index.html` has the correct HTML page title.

Author: Michael Allman <[email protected]>

Closes apache#14031 from mallman/spark-16353.

(cherry picked from commit 7dbffcd)
Signed-off-by: Sean Owen <[email protected]>
…er is no longer published on Apache mirrors

## What changes were proposed in this pull request?

Download Maven 3.3.9 instead of 3.3.3, because the latter is no longer published on Apache mirrors.

## How was this patch tested?

Jenkins

Author: Sean Owen <[email protected]>

Closes apache#14066 from srowen/Maven339Branch16.
… log

## What changes were proposed in this pull request?

The free memory size displayed in the log is wrong (it actually shows the used memory); fix it so the correct value is logged. Backported to 1.6.
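
A minimal sketch of the corrected message, with assumed names (the real change is in the 1.6 memory store's logging):

```scala
// Hypothetical helper; the point is simply that "free" must be computed
// as max minus used, not reported from the used counter.
def freeMemoryMessage(maxMemory: Long, memoryUsed: Long): String =
  s"Free memory: ${maxMemory - memoryUsed} bytes (used $memoryUsed of $maxMemory)"
```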

## How was this patch tested?

N/A

Author: jerryshao <[email protected]>

Closes apache#14043 from jerryshao/memory-log-fix-1.6-backport.
## What changes were proposed in this pull request?

The following Java code fails because of type erasure:

```Java
JavaRDD<Vector> rows = jsc.parallelize(...);
RowMatrix mat = new RowMatrix(rows.rdd());
QRDecomposition<RowMatrix, Matrix> result = mat.tallSkinnyQR(true);
```

We should use `retag` to restore the type information and prevent the following exception:

```Java
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.spark.mllib.linalg.Vector;
```
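
A hedged sketch of the `retag` idea (note that `RDD.retag` is a Spark-internal helper, so code like this only compiles inside Spark itself):

```scala
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Illustrative only: restore the element ClassTag on an RDD that came from
// the Java API, so collecting it yields Array[Vector] instead of Array[Object].
def retagAsVectors(rows: RDD[Vector]): RDD[Vector] =
  rows.retag(classOf[Vector])
```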

## How was this patch tested?

Java unit test

Author: Xusen Yin <[email protected]>

Closes apache#14051 from yinxusen/SPARK-16372.

(cherry picked from commit 4c6f00d)
Signed-off-by: Sean Owen <[email protected]>
…eflection.

Using "Method.invoke" causes an exception to be thrown, not an error, so
Utils.waitForProcess() was always throwing an exception when run on Java 7.
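
A minimal sketch of the distinction, with assumed structure (the real code is in `Utils`): reflection signals a missing method with `NoSuchMethodException`, a checked exception, so that is what the Java 7 path has to catch.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical sketch, not the actual Utils.waitForProcess: look up the
// Java 8 Process.waitFor(long, TimeUnit) reflectively so the class still
// loads on Java 7, and catch the *exception* (not an error) when it is absent.
def waitForProcess(process: Process, timeoutMs: Long): Boolean = {
  try {
    val waitFor = classOf[Process].getMethod(
      "waitFor", classOf[Long], classOf[TimeUnit])
    waitFor.invoke(process, java.lang.Long.valueOf(timeoutMs), TimeUnit.MILLISECONDS)
      .asInstanceOf[Boolean]
  } catch {
    case _: NoSuchMethodException =>
      // Java 7 fallback: poll exitValue() until the process exits or we time out.
      val deadline = System.currentTimeMillis() + timeoutMs
      while (System.currentTimeMillis() < deadline) {
        try {
          process.exitValue()
          return true
        } catch {
          case _: IllegalThreadStateException => Thread.sleep(100)
        }
      }
      false
  }
}
```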

Author: Marcelo Vanzin <[email protected]>

Closes apache#14056 from vanzin/SPARK-16385.

(cherry picked from commit 59f9c1b)
Signed-off-by: Sean Owen <[email protected]>
@markhamstra markhamstra merged commit c261ddb into alteryx:csd-1.6 Jul 13, 2016
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Nov 7, 2017
* Logging for resource deletion

Remove the dangling colon and replace it with an ellipsis and a second log statement

* Update KubernetesResourceCleaner.scala