
Conversation

@CHN-guanxinyu

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

cloud-fan and others added 30 commits August 8, 2017 16:32
…xecutor Process

The Executor tab on the Spark UI shows a task as completed when the executor process running that task is killed with the kill command.
Added the ExecutorLostFailure case, which was previously missing; without it the default case would execute and the task would be marked as completed. The new case covers all the situations where the executor's connection to the Spark driver is lost, e.g. because the executor process was killed or the network connection dropped.

## How was this patch tested?
Manually tested the fix by observing the UI change before and after.
Before:
<img width="1398" alt="screen shot-before" src="https://user-images.githubusercontent.com/22228190/28482929-571c9cea-6e30-11e7-93dd-728de5cdea95.png">
After:
<img width="1385" alt="screen shot-after" src="https://user-images.githubusercontent.com/22228190/28482964-8649f5ee-6e30-11e7-91bd-2eb2089c61cc.png">

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: pgandhi <[email protected]>
Author: pgandhi999 <[email protected]>

Closes #18707 from pgandhi999/master.

(cherry picked from commit f016f5c)
Signed-off-by: Wenchen Fan <[email protected]>
…strong wolfe line search

## What changes were proposed in this pull request?

Update Breeze to 0.13.1 for an emergency bugfix in strong Wolfe line search
scalanlp/breeze#651

## How was this patch tested?

N/A

Author: WeichenXu <[email protected]>

Closes #18797 from WeichenXu123/update-breeze.

(cherry picked from commit b35660d)
Signed-off-by: Yanbo Liang <[email protected]>
… return value

Same PR as #18799 but for branch 2.2. The main discussion is in the other PR.
--------

When I was investigating a flaky test, I realized that many places don't check the return value of `HDFSMetadataLog.get(batchId: Long): Option[T]`. When a batch is supposed to be there, the caller just ignores None rather than throwing an error. If some bug causes a query not to generate a batch metadata file, this behavior hides it, allows the query to keep running, and eventually deletes the metadata logs, making it hard to debug.

This PR ensures that places calling HDFSMetadataLog.get always check the return value.
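
As a rough illustration (a minimal sketch, not code from this PR; the helper name is hypothetical):

```scala
// Hedged sketch of the pattern: surface a missing batch instead of silently ignoring None.
// `getBatchOrFail` is a hypothetical helper; `get` mirrors the Option-returning signature above.
def getBatchOrFail[T](batchId: Long, get: Long => Option[T]): T =
  get(batchId) match {
    case Some(metadata) => metadata
    case None =>
      // Previously some callers fell through here and kept running.
      throw new IllegalStateException(s"Metadata for batch $batchId is missing")
  }
```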

Jenkins

Author: Shixiong Zhu <[email protected]>

Closes #18890 from tdas/SPARK-21596-2.2.
…should call masterTracker.stop() in MapOutputTrackerSuite

Signed-off-by: 10087686 <wang.jiaochunzte.com.cn>

## What changes were proposed in this pull request?
After the unit tests end, masterTracker.stop() should be called to free resources.
(Please fill in changes proposed in this fix)

## How was this patch tested?
Ran the unit tests.
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: 10087686 <[email protected]>

Closes #18867 from wangjiaochun/mapout.

(cherry picked from commit 6426adf)
Signed-off-by: Wenchen Fan <[email protected]>
## What changes were proposed in this pull request?
This patch removes the unused SessionCatalog.getTableMetadataOption and ExternalCatalog.getTableOption.

## How was this patch tested?
Removed the test case.

Author: Reynold Xin <[email protected]>

Closes #18912 from rxin/remove-getTableOption.

(cherry picked from commit 584c7f1)
Signed-off-by: Reynold Xin <[email protected]>
…nalAppendOnlyUnsafeRowArray

## What changes were proposed in this pull request?

[SPARK-21595](https://issues.apache.org/jira/browse/SPARK-21595) reported that there is excessive spilling to disk because the default spill threshold for `ExternalAppendOnlyUnsafeRowArray` is quite small for the WINDOW operator. The old behaviour of the WINDOW operator (pre #16909) was to hold data in an array for the first 4096 records, after which it would switch to `UnsafeExternalSorter` and start spilling to disk after reaching `spark.shuffle.spill.numElementsForceSpillThreshold` (or earlier if memory was scarce due to excessive consumers).

Currently, the switch from in-memory to `UnsafeExternalSorter` and the spilling of `UnsafeExternalSorter` to disk for `ExternalAppendOnlyUnsafeRowArray` are controlled by a single threshold. This PR separates the two to allow more granular control.
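
As a rough sketch of the split (hypothetical class and names, not Spark's actual `ExternalAppendOnlyUnsafeRowArray` or its configuration):

```scala
import scala.collection.mutable.ArrayBuffer

// Illustrative sketch only: one threshold decides when to leave the cheap in-memory array,
// a second, independent threshold decides when the sorter-backed buffer spills to disk.
final class TwoThresholdBuffer[T](inMemoryThreshold: Int, spillThreshold: Int) {
  private val inMemory = ArrayBuffer.empty[T]
  private val sorterBacked = ArrayBuffer.empty[T]  // stands in for UnsafeExternalSorter
  private var usingSorter = false
  private var rowsSinceLastSpill = 0

  def append(value: T): Unit = {
    if (!usingSorter && inMemory.size < inMemoryThreshold) {
      inMemory += value                      // fast path: plain in-memory array
    } else {
      if (!usingSorter) {                    // first switch: move existing rows over
        sorterBacked ++= inMemory
        inMemory.clear()
        usingSorter = true
      }
      sorterBacked += value
      rowsSinceLastSpill += 1
      if (rowsSinceLastSpill >= spillThreshold) {
        rowsSinceLastSpill = 0
        // a real implementation would spill the sorter-backed contents to disk here
      }
    }
  }

  def size: Int = inMemory.size + sorterBacked.size
}
```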

## How was this patch tested?

Added unit tests

Author: Tejas Patil <[email protected]>

Closes #18843 from tejasapatil/SPARK-21595.

(cherry picked from commit 9443999)
Signed-off-by: Herman van Hovell <[email protected]>
…ons and adding jars

## What changes were proposed in this pull request?

Fix the race condition when serializing TaskDescriptions and adding jars by keeping the set of jars and files for a TaskSet constant across the lifetime of the TaskSet. Otherwise TaskDescription serialization can produce invalid output when new files/jars are added concurrently while the TaskDescription is being serialized.
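
A minimal sketch of the approach, with hypothetical names (the real change lives in the TaskSet/TaskDescription code paths):

```scala
import scala.collection.mutable

// Hedged sketch (hypothetical helper): take immutable snapshots of the driver-side
// jar/file maps when the TaskSet is created, so concurrent addJar/addFile calls cannot
// change what gets serialized into a TaskDescription later.
def snapshotForTaskSet(
    addedJars: mutable.Map[String, Long],
    addedFiles: mutable.Map[String, Long]): (Map[String, Long], Map[String, Long]) =
  (addedJars.toMap, addedFiles.toMap)
```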

## How was this patch tested?

Additional unit test ensures jars/files contained in the TaskDescription remain constant throughout the lifetime of the TaskSet.

Author: Andrew Ash <[email protected]>

Closes #18913 from ash211/SPARK-21563.

(cherry picked from commit 6847e93)
Signed-off-by: Wenchen Fan <[email protected]>
…pshot files

## What changes were proposed in this pull request?

Directly writing a snapshot file may produce a partial file. This PR changes it to write to a temp file and then rename it to the target file.
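
A minimal sketch of the write-then-rename pattern, assuming a Hadoop `FileSystem` handle (not the actual state store code; error handling omitted):

```scala
import java.io.OutputStream
import org.apache.hadoop.fs.{FileSystem, Path}

// Hedged sketch: write the snapshot to a temp file, then rename it to the target,
// so readers never observe a partially written file.
def writeSnapshotAtomically(fs: FileSystem, target: Path)(write: OutputStream => Unit): Unit = {
  val temp = new Path(target.getParent, s".${target.getName}.tmp")
  val out = fs.create(temp, /* overwrite = */ true)
  try write(out) finally out.close()
  fs.rename(temp, target)
}
```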

## How was this patch tested?

Jenkins.

Author: Shixiong Zhu <[email protected]>

Closes #18928 from zsxwing/SPARK-21696.

(cherry picked from commit 282f00b)
Signed-off-by: Tathagata Das <[email protected]>
… successfully removed

## What changes were proposed in this pull request?

We put the staging path to delete into the deleteOnExit cache of `FileSystem` in case the path can't be successfully removed. But when we successfully remove the path, we don't remove it from the cache. We should, to avoid the cache growing indefinitely.
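
A minimal sketch of the fix, assuming the staging path is tracked via Hadoop's `FileSystem.deleteOnExit` cache (the helper name is hypothetical):

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Hedged sketch: after the staging directory is removed successfully, also drop it from
// the deleteOnExit cache so the cache does not keep growing in long-running sessions.
def cleanupStagingDir(fs: FileSystem, stagingPath: Path): Unit = {
  if (fs.delete(stagingPath, /* recursive = */ true)) {
    fs.cancelDeleteOnExit(stagingPath)
  }
}
```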

## How was this patch tested?

Added a test.

Author: Liang-Chi Hsieh <[email protected]>

Closes #18934 from viirya/SPARK-21721.

(cherry picked from commit 4c3cf1c)
Signed-off-by: gatorsmile <[email protected]>
Check the option "numFeatures" only when reading LibSVM, not when writing. When writing, Spark was raising an exception; after the change it ignores the option completely. cc liancheng HyukjinKwon

(Maybe the usage should be forbidden when writing, in a major version change?).

Manual test that loading and writing LibSVM files work fine, both with and without the numFeatures option.
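
For illustration, a usage sketch assuming an active SparkSession named `spark`; the paths and feature count are placeholders:

```scala
// Reading honors numFeatures; after this change, writing simply ignores the option
// instead of raising an exception.
val df = spark.read.format("libsvm")
  .option("numFeatures", "780")
  .load("data/sample.libsvm")

df.write.format("libsvm")
  .option("numFeatures", "780")   // ignored on write after this change
  .save("data/out.libsvm")
```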

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Jan Vrsovsky <[email protected]>

Closes #18872 from ProtD/master.

(cherry picked from commit 8321c14)
Signed-off-by: Sean Owen <[email protected]>
…executors when tasks still to run

## What changes were proposed in this pull request?

Right now Spark lets go of executors when they have been idle for 60s (or a configurable time). I have seen Spark let them go when they were idle but still really needed. I have seen this issue when the scheduler was waiting for node locality, which takes longer than the default idle timeout. In these jobs the number of executors drops very low (fewer than 10) while there are still around 80,000 tasks to run.
We should consider not allowing executors to idle-timeout if they are still needed according to the number of tasks remaining.

## How was this patch tested?

Tested by manually adding executors to `executorsIdsToBeRemoved` list and seeing if those executors were removed when there are a lot of tasks and a high `numExecutorsTarget` value.

Code used

In  `ExecutorAllocationManager.start()`

```
    start_time = clock.getTimeMillis()
```

In `ExecutorAllocationManager.schedule()`
```
    val executorIdsToBeRemoved = ArrayBuffer[String]()
    if ( now > start_time + 1000 * 60 * 2) {
      logInfo("--- REMOVING 1/2 of the EXECUTORS ---")
      start_time +=  1000 * 60 * 100
      var counter = 0
      for (x <- executorIds) {
        counter += 1
        if (counter == 2) {
          counter = 0
          executorIdsToBeRemoved += x
        }
      }
    }
```

Author: John Lee <[email protected]>

Closes #18874 from yoonlee95/SPARK-21656.

(cherry picked from commit adf005d)
Signed-off-by: Tom Graves <[email protected]>
…hema in table properties

backport #18907 to branch 2.2

Author: Wenchen Fan <[email protected]>

Closes #18963 from cloud-fan/backport.
…it is called statically to convert something into TimestampType

## What changes were proposed in this pull request?

https://issues.apache.org/jira/projects/SPARK/issues/SPARK-21739

This issue is caused by introducing TimeZoneAwareExpression.
When the **Cast** expression converts something into TimestampType, it should be resolved with `timeZoneId` set. In general, this is resolved in the LogicalPlan phase.

However, there are still some places that use the Cast expression statically to convert data types without setting `timeZoneId`. In such cases, `NoSuchElementException: None.get` will be thrown for TimestampType.

This PR fixes the issue. We checked the whole project and found two such usages (i.e., in `TableReader` and `HiveTableScanExec`).
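
A hedged sketch of the pattern (internal Catalyst API; the literal and time zone values are illustrative):

```scala
import org.apache.spark.sql.catalyst.expressions.{Cast, Literal}
import org.apache.spark.sql.types.TimestampType

// When a Cast to TimestampType is constructed statically (outside the analyzer),
// pass the session time zone explicitly so later resolution does not fail with
// `NoSuchElementException: None.get`.
val cast = Cast(Literal("2017-08-20 00:00:00"), TimestampType, Some("America/Los_Angeles"))
```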

## How was this patch tested?

unit test

Author: donnyzone <[email protected]>

Closes #18960 from DonnyZone/spark-21739.

(cherry picked from commit 310454b)
Signed-off-by: gatorsmile <[email protected]>
…SurvivalRegression

## What changes were proposed in this pull request?

The line `SchemaUtils.appendColumn(schema, $(predictionCol), IntegerType)` did not modify the variable `schema`, hence only the last line had any effect. A temporary variable is used to correctly append the two columns `predictionCol` and `probabilityCol`.
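
A self-contained illustration of the pattern, with `StructType.add` standing in for `SchemaUtils.appendColumn` (column names follow the description above):

```scala
import org.apache.spark.sql.types.{DoubleType, StructType}

// The append call returns a new schema rather than mutating its argument, so the
// result of the first call must be threaded into the second one.
// Buggy shape: the first appended column is silently lost.
//   schema.add("prediction", DoubleType)                 // return value discarded
//   val out = schema.add("probability", DoubleType)      // only this column survives
// Fixed shape: keep the intermediate schema in a temporary variable.
val schema = new StructType().add("features", DoubleType)
val withPrediction = schema.add("prediction", DoubleType)
val outputSchema = withPrediction.add("probability", DoubleType)
```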

## How was this patch tested?

Manually.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Cédric Pelvet <[email protected]>

Closes #18980 from sharp-pixel/master.

(cherry picked from commit 73e04ec)
Signed-off-by: Sean Owen <[email protected]>
… paths are successfully removed

## What changes were proposed in this pull request?

Fix a typo in test.

## How was this patch tested?

Jenkins tests.

Author: Liang-Chi Hsieh <[email protected]>

Closes #19005 from viirya/SPARK-21721-followup.

(cherry picked from commit 28a6cca)
Signed-off-by: Wenchen Fan <[email protected]>
…in Hive metastore.

For Hive tables, the current "replace the schema" code is the correct
path, except that an exception in that path should result in an error, and
not in retrying in a different way.

For data source tables, Spark may generate a non-compatible Hive table;
but for that to work with Hive 2.1, the detection of data source tables needs
to be fixed in the Hive client, to also consider the raw tables used by code
such as `alterTableSchema`.

Tested with existing and added unit tests (plus internal tests with a 2.1 metastore).

Author: Marcelo Vanzin <[email protected]>

Closes #18849 from vanzin/SPARK-21617.

(cherry picked from commit 84b5b16)
Signed-off-by: gatorsmile <[email protected]>
## What changes were proposed in this pull request?

Code in vignettes requires winutils to run on Windows. When publishing to CRAN or building from source, winutils might not be available, so it's better to disable running the code (the resulting vignettes will not have output from the code, but the text and the code itself are still there).

fix * checking re-building of vignette outputs ... WARNING
and
> %LOCALAPPDATA% not found. Please define the environment variable or restart and enter an installation path in localDir.

## How was this patch tested?

jenkins, appveyor, r-hub

before: https://artifacts.r-hub.io/SparkR_2.2.0.tar.gz-49cecef3bb09db1db130db31604e0293/SparkR.Rcheck/00check.log
after: https://artifacts.r-hub.io/SparkR_2.2.0.tar.gz-86a066c7576f46794930ad114e5cff7c/SparkR.Rcheck/00check.log

Author: Felix Cheung <[email protected]>

Closes #19016 from felixcheung/rvigwind.

(cherry picked from commit 43cbfad)
Signed-off-by: Felix Cheung <[email protected]>
This is a bug introduced by https://github.com/apache/spark/pull/11274/files#diff-7adb688cbfa583b5711801f196a074bbL274.

Non-equal join condition should only be applied when the equal-join condition matches.

regression test

Author: Wenchen Fan <[email protected]>

Closes #19036 from cloud-fan/bug.

(cherry picked from commit 2dd37d8)
Signed-off-by: Herman van Hovell <[email protected]>
…td contains zero (backport PR for 2.2)

## What changes were proposed in this pull request?

This is backport PR of #18896

fix bug of MLOR do not work correctly when featureStd contains zero

We can reproduce the bug with a dataset like the following (features include zero variance); it produces a wrong result (all coefficients become 0):
```
    val multinomialDatasetWithZeroVar = {
      val nPoints = 100
      val coefficients = Array(
        -0.57997, 0.912083, -0.371077,
        -0.16624, -0.84355, -0.048509)

      val xMean = Array(5.843, 3.0)
      val xVariance = Array(0.6856, 0.0)  // including zero variance

      val testData = generateMultinomialLogisticInput(
        coefficients, xMean, xVariance, addIntercept = true, nPoints, seed)

      val df = sc.parallelize(testData, 4).toDF().withColumn("weight", lit(1.0))
      df.cache()
      df
    }
```
## How was this patch tested?

testcase added.

Author: WeichenXu <[email protected]>

Closes #19026 from WeichenXu123/fix_mlor_zero_var_bug_2_2.
…ance generate negative result

Because of numerical error, MultivariateOnlineSummarizer.variance can generate a negative variance.

**This is a serious bug: many algorithms in MLlib use the stddev computed from `sqrt(variance)`, and a negative variance will generate NaN and crash the whole algorithm.**

We can reproduce this bug with the following code:
```
    val summarizer1 = (new MultivariateOnlineSummarizer)
      .add(Vectors.dense(3.0), 0.7)
    val summarizer2 = (new MultivariateOnlineSummarizer)
      .add(Vectors.dense(3.0), 0.4)
    val summarizer3 = (new MultivariateOnlineSummarizer)
      .add(Vectors.dense(3.0), 0.5)
    val summarizer4 = (new MultivariateOnlineSummarizer)
      .add(Vectors.dense(3.0), 0.4)

    val summarizer = summarizer1
      .merge(summarizer2)
      .merge(summarizer3)
      .merge(summarizer4)

    println(summarizer.variance(0))
```
This PR fixes the bugs in `mllib.stat.MultivariateOnlineSummarizer.variance` and `ml.stat.SummarizerBuffer.variance`, and in several places in `WeightedLeastSquares`.

test cases added.

Author: WeichenXu <[email protected]>

Closes #19029 from WeichenXu123/fix_summarizer_var_bug.

(cherry picked from commit 0456b40)
Signed-off-by: Sean Owen <[email protected]>
…for launching daemons like History Server

History Server launch uses SparkClassCommandBuilder to launch the server. It is observed that SPARK_CLASSPATH has been removed and deprecated. For spark-submit this takes a different route, and spark.driver.extraClassPath takes care of specifying additional jars in the classpath that were previously specified in SPARK_CLASSPATH. Right now the only way to specify additional jars for launching daemons such as the History Server is SPARK_DIST_CLASSPATH (https://spark.apache.org/docs/latest/hadoop-provided.html), but this I presume is a distribution classpath. It would be nice to have a config similar to spark.driver.extraClassPath for launching daemons such as the History Server.

Added new environment variable SPARK_DAEMON_CLASSPATH to set classpath for launching daemons. Tested and verified for History Server and Standalone Mode.

## How was this patch tested?
Initially, the history server start script would fail because it could not find the required jars for launching the server in the Java classpath. The same was true for running Master and Worker in standalone mode. After adding the environment variable SPARK_DAEMON_CLASSPATH to the Java classpath, both kinds of daemons (History Server, standalone daemons) start up and run.

Author: pgandhi <[email protected]>
Author: pgandhi999 <[email protected]>

Closes #19047 from pgandhi999/master.

(cherry picked from commit 24e6c18)
Signed-off-by: Tom Graves <[email protected]>
…ces in yarn client mode

## What changes were proposed in this pull request?

This is a backport PR to fix issue of re-uploading remote resource in yarn client mode. The original PR is #18962.

## How was this patch tested?

Tested in local UT.

Author: jerryshao <[email protected]>

Closes #19074 from jerryshao/SPARK-21714-2.2-backport.
…e resources in yarn client mode"

This reverts commit 59529b2.
## This is a backport of PR #18783 to the latest released branch 2.2.

## What changes were proposed in this pull request?

As described in the JIRA ticket, the History page takes ~1 min to load when the number of jobs is 10k+.
Most of the time is currently being spent on DOM manipulations and all the additional costs implied by this (browser repaints and reflows).
The PR's goal is not to change any behavior but to optimize the History UI rendering time:

1. The most costly operation is setting `innerHTML` for the `duration` column within a loop, which is [extremely unperformant](https://jsperf.com/jquery-append-vs-html-list-performance/24). [Refactoring](criteo-forks@b7e56ee) this helped get page load time **down to 10-15s**.

2. The second big gain, bringing page load time **down to 4s**, [was achieved](criteo-forks@3630ca2) by detaching the table's DOM before parsing it with the DataTables jQuery plugin.

3. Another chunk of improvements ([1](criteo-forks@aeeeeb5), [2](criteo-forks@e25be9a), [3](criteo-forks@9169707)) was focused on removing unnecessary DOM manipulations that in total contributed ~250ms to page load time.

## How was this patch tested?

Tested by existing Selenium tests in `org.apache.spark.deploy.history.HistoryServerSuite`.

Changes were also tested on Criteo's spark-2.1 fork with 20k+ rows in the table, reducing load time to 4s.

Author: Dmitry Parfenchik <[email protected]>

Closes #18860 from 2ooom/history-ui-perf-fix-2.2.
…ces in yarn client mode

## What changes were proposed in this pull request?

This is a backport PR to fix issue of re-uploading remote resource in yarn client mode. The original PR is #18962.

## How was this patch tested?

Tested in local UT.

Author: jerryshao <[email protected]>

Closes #19074 from jerryshao/SPARK-21714-2.2-backport.
## What changes were proposed in this pull request?

The killExecutor API currently does not allow killing an executor without updating the total number of executors needed. When dynamic allocation is turned on and the allocator tries to kill an executor, the scheduler reduces the total number of executors needed (see https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L635), which is incorrect because the allocator already takes care of setting the required number of executors itself.

## How was this patch tested?

Ran a job on the cluster and made sure the executor request is correct

Author: Sital Kedia <[email protected]>

Closes #19081 from sitalkedia/skedia/oss_fix_executor_allocation.

(cherry picked from commit 6949a9c)
Signed-off-by: Marcelo Vanzin <[email protected]>
…'s input data transient

This PR is to backport #18686 for resolving the issue in #19094

---

## What changes were proposed in this pull request?
This PR marks the parameters `rows` and `unsafeRow` of LocalTableScanExec transient, which avoids serializing unneeded objects.
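
A minimal illustration of the idea (a hypothetical, simplified class, not the real operator):

```scala
import org.apache.spark.sql.Row

// Fields marked @transient are skipped by Java serialization, so the driver-side input
// rows are not shipped to executors along with the serialized plan.
class LocalScanLike(@transient val rows: Seq[Row]) extends Serializable
```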

## How was this patch tested?
N/A

Author: gatorsmile <[email protected]>

Closes #19101 from gatorsmile/backport-21477.
…Exec with sun.io.serialization.extendedDebugInfo=true

## What changes were proposed in this pull request?

If no SparkConf is available to Utils.redact, simply don't redact.

## How was this patch tested?

Existing tests

Author: Sean Owen <[email protected]>

Closes #19123 from srowen/SPARK-21418.

(cherry picked from commit ca59445)
Signed-off-by: Herman van Hovell <[email protected]>
…vior change in Spark 2.2

Forgot to update docs with behavior change.

Author: Burak Yavuz <[email protected]>

Closes #19138 from brkyvz/trigger-doc-fix.

(cherry picked from commit 8c954d2)
Signed-off-by: Tathagata Das <[email protected]>
mgaido91 and others added 25 commits June 12, 2018 16:43
Currently, `spark.ui.filters` are not applied to the handlers added after binding the server. This means that every page which is added after starting the UI will not have the filters configured on it. This can allow unauthorized access to the pages.

The PR adds the filters also to the handlers added after the UI starts.

manual tests (without the patch, starting the thriftserver with `--conf spark.ui.filters=org.apache.hadoop.security.authentication.server.AuthenticationFilter --conf spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.params="type=simple"` you can access `http://localhost:4040/sqlserver`; with the patch, 401 is the response as for the other pages).

Author: Marco Gaido <[email protected]>

Closes #21523 from mgaido91/SPARK-24506.

(cherry picked from commit f53818d)
Signed-off-by: Marcelo Vanzin <[email protected]>
Apply the suggestion on the bug to fix source links. Tested with
the 2.3.1 release docs.

Author: Marcelo Vanzin <[email protected]>

Closes #21521 from vanzin/SPARK-23732.

(cherry picked from commit dc22465)
Signed-off-by: Marcelo Vanzin <[email protected]>
…ator.

When an output stage is retried, it's possible that tasks from the previous
attempt are still running. In that case, there would be a new task for the
same partition in the new attempt, and the coordinator would allow both
tasks to commit their output since it did not keep track of stage attempts.

The change adds more information to the stage state tracked by the coordinator,
so that only one task is allowed to commit the output in the above case.
The stage state in the coordinator is also maintained across stage retries,
so that a stray speculative task from a previous stage attempt is not allowed
to commit.

This also removes some code added in SPARK-18113 that allowed for duplicate
commit requests; with the RPC code used in Spark 2, that situation cannot
happen, so there is no need to handle it.

Author: Marcelo Vanzin <[email protected]>

Closes #21577 from vanzin/SPARK-24552.

(cherry picked from commit c8e909c)
Signed-off-by: Thomas Graves <[email protected]>
stageAttemptId was added to TaskContext, with the corresponding constructor modifications.

Added a new test in TaskContextSuite, two cases are tested:
1. Normal case without failure
2. Exception case with resubmitted stages

Link to [SPARK-22897](https://issues.apache.org/jira/browse/SPARK-22897)
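
A usage sketch, assuming the accessor exposed by this change matches the upstream public API name (`stageAttemptNumber`):

```scala
import org.apache.spark.TaskContext

// Inside a running task: distinguish retries of the same stage.
// (Accessor name assumed from the upstream API; adjust if the backport differs.)
val stageAttempt = TaskContext.get().stageAttemptNumber()
```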

Author: Xianjin YE <advancedxygmail.com>

Closes #20082 from advancedxy/SPARK-22897.

Conflicts:
	project/MimaExcludes.scala

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Xianjin YE <[email protected]>
Author: Thomas Graves <[email protected]>

Closes #21609 from tgravescs/SPARK-22897.
…er for writes .

This passes a unique attempt id to the Hadoop APIs, because attempt
number is reused when stages are retried. When attempt numbers are
reused, sources that track data by partition id and attempt number
may incorrectly clean up data because the same attempt number can
be both committed and aborted.

Author: Marcelo Vanzin <[email protected]>

Closes #21616 from vanzin/SPARK-24552-2.2.
findTightestCommonTypeOfTwo has been renamed to findTightestCommonType

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Fokko Driesprong <[email protected]>

Closes #21597 from Fokko/fd-typo.

(cherry picked from commit 6a97e8e)
Signed-off-by: hyukjinkwon <[email protected]>
…sionsSuite still flaky; fall back to Apache archive

## What changes were proposed in this pull request?

Try only unique ASF mirrors to download the Spark release; fall back to the Apache archive if no mirrors are available or the release is not mirrored

## How was this patch tested?

Existing HiveExternalCatalogVersionsSuite

Author: Sean Owen <[email protected]>

Closes #21779 from srowen/SPARK-24813.2.
The release process for Spark 2.2.2 says to update the test.

Author: Thomas Graves <[email protected]>

Closes #21707 from tgravescs/spark222-release.
## What changes were proposed in this pull request?
When speculation is enabled,
TaskSetManager#markPartitionCompleted should write the successful task duration to the MedianHeap,
not just increase tasksSuccessful.

Otherwise, when TaskSetManager#checkSpeculatableTasks runs, tasksSuccessful is non-zero but the MedianHeap is empty.
Then successfulTaskDurations.median throws java.util.NoSuchElementException: MedianHeap is empty,
which finally leads to stopping the SparkContext.
## How was this patch tested?
TaskSetManagerSuite.scala
unit test:[SPARK-24677] MedianHeap should not be empty when speculation is enabled

Author: sychen <[email protected]>

Closes #21656 from cxzl25/fix_MedianHeap_empty.

(cherry picked from commit c8bee93)
Signed-off-by: Thomas Graves <[email protected]>
…t and Greatest to reduce time in its test

## What changes were proposed in this pull request?

It's minor and trivial, but it looks like an input size of 2000 is good enough to reproduce and test SPARK-22499.

## How was this patch tested?

Manually brought the change and tested.

Locally tested:

Before: 3m 21s 288ms
After: 1m 29s 134ms

Given the latest successful build took:

```
ArithmeticExpressionSuite:
- SPARK-22499: Least and greatest should not generate codes beyond 64KB (7 minutes, 49 seconds)
```

I expect it's going to save 4ish mins.

Author: hyukjinkwon <[email protected]>

Closes #21855 from HyukjinKwon/minor-fix-suite.

(cherry picked from commit 3d5c61e)
Signed-off-by: hyukjinkwon <[email protected]>
…provided"

## What changes were proposed in this pull request?

Please see [SPARK-24927][1] for more details.

[1]: https://issues.apache.org/jira/browse/SPARK-24927

## How was this patch tested?

Manually tested.

Author: Cheng Lian <[email protected]>

Closes #21879 from liancheng/spark-24927.

(cherry picked from commit d5f340f)
Signed-off-by: Xiao Li <[email protected]>
…ult in data error

When the join key is a long or int in a broadcast join, Spark uses `LongToUnsafeRowMap` to store the key-value pairs of the table that will be broadcast. But when `LongToUnsafeRowMap` is broadcast to executors and is too big to hold in memory, it is stored on disk. Because `write` uses a variable `cursor` to determine how many bytes of the `page` of `LongToUnsafeRowMap` will be written out, and `cursor` was not restored when deserializing, the executor would write out nothing from the page to disk.

## What changes were proposed in this pull request?
Restore cursor value when deserializing.

Author: liulijia <[email protected]>

Closes #21772 from liutang123/SPARK-24809.

(cherry picked from commit 2c54aae)
Signed-off-by: Xiao Li <[email protected]>
…hedulerImplSuite

## What changes were proposed in this pull request?

In the `afterEach()` method of both `TaskSetManagerSuite` and `TaskSchedulerImplSuite`, `super.afterEach()` should be called at the end, because it stops the SparkContext.

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93706/testReport/org.apache.spark.scheduler/TaskSchedulerImplSuite/_It_is_not_a_test_it_is_a_sbt_testing_SuiteSelector_/
The test failure above is caused by this: the newly added `barrierCoordinator` requires `rpcEnv`, which had been stopped before `TaskSchedulerImpl` did its cleanup.
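
A minimal sketch of the intended ordering (ScalaTest-style; the suite body is elided):

```scala
import org.scalatest.{BeforeAndAfterEach, FunSuite}

// Hedged sketch: suite-local cleanup runs first; super.afterEach() runs last, so whatever
// the parent's afterEach tears down (here, the shared SparkContext) outlives the cleanup.
class ExampleSuite extends FunSuite with BeforeAndAfterEach {
  override def afterEach(): Unit = {
    try {
      // suite-specific cleanup goes here
    } finally {
      super.afterEach()
    }
  }
}
```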

## How was this patch tested?
Existing tests.

Author: Xingbo Jiang <[email protected]>

Closes #21908 from jiangxb1987/afterEach.

(cherry picked from commit 3695ba5)
Signed-off-by: hyukjinkwon <[email protected]>
…regation returns wrong result

## What changes were proposed in this pull request?

When we do an average, the result is computed by dividing the sum of the values by their count. When the result is a DecimalType, the way we are casting/managing the precision and scale is not really optimized and is not coherent with what we do normally.

In particular, a problem can happen when the Divide operand returns a result whose precision and scale differ from the ones expected as the output of the Divide operand. In the case reported in the JIRA, for instance, the result of the Divide operand is a Decimal(38, 36), while the output data type for Divide is Decimal(38, 22). This is not an issue when the Divide is followed by a CheckOverflow or a Cast to the right data type, as these operations return a decimal with the defined precision and scale. Although the Average operator does have a Cast, it may be bypassed if the result of Divide is the same type it is cast to, hence the issue reported in the JIRA may arise.

The PR proposes to use the normal rules/handling of the arithmetic operators with the Decimal data type, so we both reuse the existing code (having a single logic for operations between decimals) and fix this problem, as the result is always guarded by CheckOverflow.

## How was this patch tested?

added UT

Author: Marco Gaido <[email protected]>

Closes #21949 from mgaido91/SPARK-24957_2.2.
(cherry picked from commit 15fc237)
… the file system

## What changes were proposed in this pull request?

In `SparkHadoopUtil.checkAccessPermission`, we consider only basic permissions when checking whether a user can access a file. This is not a complete check, as it ignores ACLs and other policies a file system may apply internally. So it can wrongly report that a user cannot access a file (when they actually can).

The PR proposes to delegate to the filesystem the check whether a file is accessible or not, in order to return the right result. A caching layer is added for performance reasons.

## How was this patch tested?

added UT

Author: Marco Gaido <[email protected]>

Closes #22022 from mgaido91/SPARK-24948_2.2.
…ails w/java 8 181-b13

## What changes were proposed in this pull request?

- Update DateTimeUtilsSuite so that when testing round-tripping in daysToMillis and millisToDays, multiple skip dates can be specified.
- Updated the test so that both New Year's Eve 2014 and New Year's Day 2015 are skipped for Kiribati time zones. This is necessary as Java versions before 181-b13 considered New Year's Day 2015 to be skipped, while subsequent versions corrected this to New Year's Eve.

## How was this patch tested?
Unit tests

Author: Chris Martin <[email protected]>

Closes #21901 from d80tb7/SPARK-24950_datetimeUtilsSuite_failures.

(cherry picked from commit c5b8d54)
Signed-off-by: Sean Owen <[email protected]>
…crypto.html

Add ECCN notice required by http://www.apache.org/dev/crypto.html
See https://issues.apache.org/jira/browse/LEGAL-398

This should probably be backported to 2.3, 2.2, as that's when the key dep (commons crypto) turned up. BC is actually unused, but still there.

N/A

Closes #22064 from srowen/ECCN.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 91cdab5)
Signed-off-by: Sean Owen <[email protected]>
…access released memory page (branch-2.2)

## What changes were proposed in this pull request?

Backport #22062 to branch-2.2. Just two minor differences in the test:

- branch-2.2 doesn't have `SparkOutOfMemoryError`. It's using `OutOfMemoryError` directly.
- MockitoSugar is in a different package in old scalatest.

## How was this patch tested?

Jenkins

Closes #22072 from zsxwing/SPARK-25081-2.2.

Authored-by: Shixiong Zhu <[email protected]>
Signed-off-by: Xiao Li <[email protected]>
## What changes were proposed in this pull request?

This PR fixes an example for `to_json` in the docs and in the function description; a usage sketch follows the list below.

- http://spark.apache.org/docs/2.3.0/api/sql/#to_json
- `describe function extended`
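
A usage sketch assuming an active SparkSession named `spark` (the sample data is illustrative):

```scala
import org.apache.spark.sql.functions.{struct, to_json}

// Convert a struct of columns to a JSON string column.
import spark.implicits._
val df = Seq((1, "alice")).toDF("id", "name")
df.select(to_json(struct($"id", $"name"))).show(false)
```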

## How was this patch tested?

Pass the Jenkins with the updated test.

Closes #22096 from dongjoon-hyun/minor_json.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit e2ab7de)
Signed-off-by: hyukjinkwon <[email protected]>
@AmplabJenkins

Can one of the admins verify this patch?

@HyukjinKwon

@speful this looks mistakenly opened. Mind closing it, please?

@srowen srowen mentioned this pull request Aug 20, 2018
## What changes were proposed in this pull request?

[SPARK-25144](https://issues.apache.org/jira/browse/SPARK-25144) reports memory leaks on Apache Spark 2.0.2 ~ 2.3.2-RC5.

```scala
scala> case class Foo(bar: Option[String])
scala> val ds = List(Foo(Some("bar"))).toDS
scala> val result = ds.flatMap(_.bar).distinct
scala> result.rdd.isEmpty
18/08/19 23:01:54 WARN Executor: Managed memory leak detected; size = 8650752 bytes, TID = 125
res0: Boolean = false
```

This is a backport of cloud-fan's #21738, which is one of the three commits of SPARK-21743. In addition, I added a test case to prevent regressions in branch-2.3 and branch-2.2. Although SPARK-21743 was reverted due to a regression, this subpatch can go to branch-2.3 and branch-2.2. This will be merged as cloud-fan's commit.

## How was this patch tested?

Pass the jenkins with a newly added test case.

Closes #22156 from dongjoon-hyun/SPARK-25144-2.2.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
@asfgit asfgit closed this in b8788b3 Aug 21, 2018