Skip to content

Conversation

@ajithme
Copy link
Contributor

@ajithme ajithme commented Jan 23, 2020

What changes were proposed in this pull request?

In org.apache.spark.sql.execution.SubqueryExec#relationFuture make a copy of org.apache.spark.SparkContext#localProperties and pass it to the sub-execution thread in org.apache.spark.sql.execution.SubqueryExec#executionContext

Why are the changes needed?

Local properties set via sparkContext are not available as TaskContext properties when executing jobs and threadpools have idle threads which are reused

Explanation:
When SubqueryExec, the relationFuture is evaluated via a separate thread. The threads inherit the localProperties from sparkContext as they are the child threads.
These threads are created in the executionContext (thread pools). Each Thread pool has a default keepAliveSeconds of 60 seconds for idle threads.
Scenarios where the thread pool has threads which are idle and reused for a subsequent new query, the thread local properties will not be inherited from spark context (thread properties are inherited only on thread creation) hence end up having old or no properties set. This will cause taskset properties to be missing when properties are transferred by child thread via sparkContext.runJob/submitJob

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added UT

@ajithme
Copy link
Contributor Author

ajithme commented Jan 23, 2020

cc @dongjoon-hyun @cloud-fan

This is a backport of #27267 fix
I had to do some modification:

  • org.apache.spark.util.Utils#cloneProperties was missing hence added
  • UT to use scalar subquery to reproduce the issue

@cloud-fan
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Jan 23, 2020

Test build #117303 has finished for PR 27340 at commit ede4044.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-30556][SQL][branch2.4] Copy sparkContext.localproperties to child thread inSubqueryExec.executionContext [SPARK-30556][SQL][2.4] Copy sparkContext.localproperties to child thread inSubqueryExec.executionContext Jan 23, 2020
@dongjoon-hyun
Copy link
Member

All tests passed. The R failure is a known incoming feasibility test flakiness. cc @viirya

* checking CRAN incoming feasibility ...Error in .check_package_CRAN_incoming(pkgdir) : 
  dims [product 26] do not match the length of object [0]

@dongjoon-hyun
Copy link
Member

Merged to branch-2.4.

dongjoon-hyun pushed a commit that referenced this pull request Jan 23, 2020
…read inSubqueryExec.executionContext

### What changes were proposed in this pull request?
In `org.apache.spark.sql.execution.SubqueryExec#relationFuture` make a copy of `org.apache.spark.SparkContext#localProperties` and pass it to the sub-execution thread in `org.apache.spark.sql.execution.SubqueryExec#executionContext`

### Why are the changes needed?
Local properties set via sparkContext are not available as TaskContext properties when executing  jobs and threadpools have idle threads which are reused

Explanation:
When `SubqueryExec`, the relationFuture is evaluated via a separate thread. The threads inherit the `localProperties` from `sparkContext` as they are the child threads.
These threads are created in the `executionContext` (thread pools). Each Thread pool has a default keepAliveSeconds of 60 seconds for idle threads.
Scenarios where the thread pool has threads which are idle and reused for a subsequent new query, the thread local properties will not be inherited from spark context (thread properties are inherited only on thread creation) hence end up having old or no properties set. This will cause taskset properties to be missing when properties are transferred by child thread via `sparkContext.runJob/submitJob`

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Added UT

Closes #27340 from ajithme/subquerylocalprop2.

Authored-by: Ajith <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

Thank you, @ajithme and @cloud-fan .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants