Skip to content

Conversation

@willb
Copy link
Contributor

@willb willb commented May 9, 2014

This patch checks top-level closure arguments to ClosureCleaner.clean for return statements and raises an exception if it finds any. This is mainly a user-friendliness addition, since programs with return statements in closure arguments will currently fail upon RDD actions with a less-than-intuitive error message.

willb added 2 commits May 9, 2014 15:05
This commit ensures that ClosureCleaner.clean will identify
toplevel return statements in closures and raise an exception
if they are encountered.

This is intended to fix SPARK-571.
@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14858/

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This function is pretty simple. Mind moving it into run() directly?

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14859/

@willb
Copy link
Contributor Author

willb commented May 10, 2014

Thanks for the quick feedback, @andrewor14! I'll make those changes. I'll also look to see if I can figure out why that repartition test is failing; oddly, it's working for me locally.

@pwendell
Copy link
Contributor

@willb the re-partition issue is not related to your patch. There was an inadvertent check-in in master that caused it. It has been fixed.

@willb
Copy link
Contributor Author

willb commented May 11, 2014

Thanks for the clarification, @pwendell.

@andrewor14, I've grouped my imports and collapsed that function. Thanks again.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@rxin
Copy link
Contributor

rxin commented May 13, 2014

Why do closures with return fail?

@rxin
Copy link
Contributor

rxin commented May 13, 2014

Ah never mind my quesiton. "return" exits the outer level function instead of the closure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a space after if (also for the if below)

@rxin
Copy link
Contributor

rxin commented May 13, 2014

I left couple comments on style. Thanks for leading this.

@willb
Copy link
Contributor Author

willb commented May 13, 2014

@rxin, yes! (I was firing up a debugger in case you wanted to know exactly where it was returning to.)

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14932/

@rxin
Copy link
Contributor

rxin commented May 13, 2014

@willb can you also add a test case where an inline function is using return statement in its closure to make sure that is allowed? e.g.

sc.parallelize(1 to 100).map { i =>
  def foo(): Int = { return 1; 2 }
  foo()
}

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@willb
Copy link
Contributor Author

willb commented May 13, 2014

@rxin, the latest commit adds such a test.

I'll note that the code I have only finds toplevel return statements, so it won't spuriously reject code like that example, but it also won't reject code like this, which is an invalid use of return:

nums.map {x => 
  // this return is invalid since it will transfer control outside the closure
  val foo = {y: Int => return 2; 1 }
  foo(x)
}

I thought identifying toplevel return statements in closures represented an acceptable tradeoff between complexity and user-friendliness in order to provide better feedback in the common case (but not necessarily to exhaustively flag all bogus closures, no matter how pathological). If this isn't the right tradeoff, I could adapt the code to reject a higher percentage of closures with nonlocal returns.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14946/

@rxin
Copy link
Contributor

rxin commented May 13, 2014

I think the current approach is fine. I'm going to merge this.

@willb
Copy link
Contributor Author

willb commented May 13, 2014

Thanks, @rxin!

@rxin
Copy link
Contributor

rxin commented May 13, 2014

BTW I am merging this only into master (not branch-1.0) since it is too late for changes into Spark core for 1.0 unless they are critical bug fixes.

@asfgit asfgit closed this in 16ffadc May 13, 2014
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
This patch checks top-level closure arguments to `ClosureCleaner.clean` for `return` statements and raises an exception if it finds any.  This is mainly a user-friendliness addition, since programs with return statements in closure arguments will currently fail upon RDD actions with a less-than-intuitive error message.

Author: William Benton <[email protected]>

Closes apache#717 from willb/spark-571 and squashes the following commits:

c41eb7d [William Benton] Another test case for SPARK-571
30c42f4 [William Benton] Stylistic cleanups
559b16b [William Benton] Stylistic cleanups from review
de13b79 [William Benton] Style fixes
295b6a5 [William Benton] Forbid return statements in closure arguments.
b017c47 [William Benton] Added a test for SPARK-571
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants