@andrewor14
Contributor

Problem. Event logs in 1.6 were much bigger than in 1.5. I ran PageRank and the event log in 1.6 was almost 5x the size of the one in 1.5. Bisecting showed that the RDD callsite added in #9398 is largely responsible for this.

Solution. This patch removes the long form of the callsite (which is not used!) from the event log, which reduces the size of the event log significantly.
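
To illustrate why dropping an unused long-form field shrinks the log, here is a minimal Python sketch. It is not the patch itself: the record layout, the field names ("Callsite", "Short Form", "Long Form"), and the stack-trace text are all assumptions made up for the example.

```python
import json

# Hypothetical long-form callsite: a multi-line stack trace (made up for illustration).
LONG_FORM = "\n".join(
    f"org.example.Frame{i}.call(Frame{i}.scala:{100 + i})" for i in range(20)
)


def rdd_info(include_long_form: bool) -> dict:
    """Build an illustrative RDD info record; all field names are assumptions."""
    callsite = {"Short Form": "map at PageRank.scala:42"}
    if include_long_form:
        callsite["Long Form"] = LONG_FORM
    return {"RDD ID": 1, "Name": "MapPartitionsRDD", "Callsite": callsite}


def serialized_size(include_long_form: bool, num_rdds: int = 100) -> int:
    """Total bytes when each record is written as one JSON line."""
    line = json.dumps(rdd_info(include_long_form))
    return num_rdds * (len(line.encode("utf-8")) + 1)  # +1 for the newline


with_long = serialized_size(True)
without_long = serialized_size(False)
print(f"with long form: {with_long} bytes, without: {without_long} bytes")
```

Because the long form is repeated for every RDD record, its cost grows with the number of records logged, which is consistent with an iterative job such as PageRank showing a large difference.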

Note on compatibility: if this patch is merged into 1.6.0, it won't break any compatibility. If it is instead merged into 1.6.1, we might need to add backward-compatibility handling logic, which does not exist yet.
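
For a sense of what such backward-compatibility handling could look like, here is a hedged Python sketch of a reader that accepts both layouts. It assumes, purely for illustration, that older logs store the callsite as an object with "Short Form" and "Long Form" keys and newer logs store a plain string; the function name `read_callsite` is hypothetical and is not part of this patch.

```python
import json


def read_callsite(rdd_info_json: str) -> str:
    """Return the short-form callsite from either (assumed) record layout."""
    callsite = json.loads(rdd_info_json).get("Callsite", "")
    if isinstance(callsite, dict):
        # Assumed older layout: an object carrying both short and long forms.
        return callsite.get("Short Form", "")
    # Assumed newer layout: the short form stored directly as a string.
    return callsite


old_record = json.dumps(
    {"Callsite": {"Short Form": "map at A.scala:1", "Long Form": "stack trace"}}
)
new_record = json.dumps({"Callsite": "map at A.scala:1"})
print(read_callsite(old_record) == read_callsite(new_record))
```

Dispatching on the shape of the value keeps a single read path for both log versions, at the cost of tolerating two layouts indefinitely.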

The long form is not currently used and inflates the size of the
event log significantly.
@andrewor14
Contributor Author

@zsxwing @JoshRosen

@SparkQA

SparkQA commented Dec 3, 2015

Test build #47100 has finished for PR 10115 at commit 5a0ebdf.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class CrossValidator @Since("1.2.0") (@Since("1.4.0") override val uid: String)
      • class ParamGridBuilder @Since("1.2.0")
      • class TrainValidationSplit @Since("1.5.0") (@Since("1.5.0") override val uid: String)

@SparkQA

SparkQA commented Dec 3, 2015

Test build #47103 has finished for PR 10115 at commit 1ddd099.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented Dec 3, 2015

LGTM

@andrewor14
Contributor Author

m1.6

asfgit pushed a commit that referenced this pull request Dec 3, 2015
Author: Andrew Or <[email protected]>

Closes #10115 from andrewor14/smaller-event-logs.

(cherry picked from commit 688e521)
Signed-off-by: Andrew Or <[email protected]>
@asfgit asfgit closed this in 688e521 Dec 3, 2015
@andrewor14 andrewor14 deleted the smaller-event-logs branch December 3, 2015 19:11
@steveloughran
Contributor

Did this go into 1.6.0 or 1.6.1? The JIRA is still open and doesn't say.

@andrewor14
Contributor Author

This went into 1.6.0
