Skip to content
This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Conversation

@duyanghao
Copy link

@duyanghao duyanghao commented Aug 30, 2017

What changes were proposed in this pull request?

  1. Fix MB units conversion(spark->k8s) error.
  2. Set the executor ENV_EXECUTOR_MEMORY environment variable to executorMemoryWithOverhead in accordance with driver.

How was this patch tested?

Manual tests show that Memory units conversion is correct and ENV_EXECUTOR_MEMORY environment variable is same with executorMemoryWithOverhead.

…MORY` environment variable to `executorMemoryWithOverhead` in accordance with driver.

Signed-off-by: duyanghao <[email protected]>
@duyanghao duyanghao changed the title Fix MB units conversion(spark->k8s)&Set the executor ENV_EXECUTOR_MEMORY to executorMemoryWithOverhead in accordance with driver. Fix MB units conversion(spark->k8s)&Set the executor ENV_EXECUTOR_MEMORY to executorMemoryWithOverhead Aug 30, 2017
@ash211
Copy link

ash211 commented Aug 30, 2017

Dupe of #470

@ash211 ash211 closed this Aug 30, 2017
@duyanghao
Copy link
Author

duyanghao commented Aug 30, 2017

@ash211 i am afraid #470 is not enough to cover my PR.

// Executor backend expects integral value for executor cores, so round it up to an int.
(ENV_EXECUTOR_CORES, math.ceil(executorCores).toInt.toString),
(ENV_EXECUTOR_MEMORY, executorMemoryString),
(ENV_EXECUTOR_MEMORY, executorMemoryWithOverhead + "m"),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to do this. The point of the memory with overhead is to make it higher than what we give the JVM. The JVM should only have the exact amount of heap given by spark.executor.memory, but the overhead accounts for the fact that the JVM might also allocate memory off-heap, and in Python's case it also covers for the storage of Python's data.

Hence I think #470 does cover everything we need here.

Copy link
Author

@duyanghao duyanghao Aug 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ash211 @mccheah as i said in #467, Right now, the ENV_EXECUTOR_MEMORY is memory but ENV_DRIVER_MEMORY is memory+overhead

If the ENV_EXECUTOR_MEMORY should only be memory, as you said above, the ENV_DRIVER_MEMORY should also only be memory rather than memory+overhead,is that right?

So i suggest that ENV_DRIVER_MEMORY should also be just memory instead of memory+overheads.

I have created a PR for to correct this problem.

ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 26, 2019
### Upstream SPARK-XXXXX ticket and PR link (if not applicable, explain)
No upstream PR - SafeLogging is palantir specific.

### What changes were proposed in this pull request?
If the log in SafeLogging is not marked as transient, it is included in the serialized version of TorrentBroadcast, which causes issues. This excludes the log from being included in the TorrentBroadcast object, by making the log transient.

### How was this patch tested?
No testing for this - the logic is taken from upstream
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants