-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-14914] Fix Resource not closed after using, mostly for unit tests #12693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
1b531f8
aa31dd9
151a1b0
781b1ac
5b83cf6
aa5c099
6108f61
8b98826
16edc2e
1f91a8d
e9399a3
0948fd2
488cd3a
aa74679
69dbceb
3ae2396
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -202,8 +202,6 @@ class EventLoggingListenerSuite extends SparkFunSuite with LocalSparkContext wit | |
|
|
||
| // Make sure expected events exist in the log file. | ||
| val logData = EventLoggingListener.openEventLog(new Path(eventLogger.logPath), fileSystem) | ||
| val logStart = SparkListenerLogStart(SPARK_VERSION) | ||
| val lines = readLines(logData) | ||
| val eventSet = mutable.Set( | ||
| SparkListenerApplicationStart, | ||
| SparkListenerBlockManagerAdded, | ||
|
|
@@ -216,19 +214,25 @@ class EventLoggingListenerSuite extends SparkFunSuite with LocalSparkContext wit | |
| SparkListenerTaskStart, | ||
| SparkListenerTaskEnd, | ||
| SparkListenerApplicationEnd).map(Utils.getFormattedClassName) | ||
| lines.foreach { line => | ||
| eventSet.foreach { event => | ||
| if (line.contains(event)) { | ||
| val parsedEvent = JsonProtocol.sparkEventFromJson(parse(line)) | ||
| val eventType = Utils.getFormattedClassName(parsedEvent) | ||
| if (eventType == event) { | ||
| eventSet.remove(event) | ||
| try { | ||
| val logStart = SparkListenerLogStart(SPARK_VERSION) | ||
| val lines = readLines(logData) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here, I think readLines is the only thing that reads logData? after it's done you could close the stream. Maybe that's tidier than wrapping so much in the try-finally block.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @srowen I tried: val logData = EventLoggingListener.openEventLog(new Path(eventLogger.logPath), fileSystem)
var lines : Seq[String] = null
try {
lines = readLines(logData)
} finally {
logData.close()
}It thrown IOException complaining that Stream closed. I think the readLines is sort of lazy such that it won't read the file until we actually move the iterator.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK sounds fine. How about pulling out |
||
| lines.foreach { line => | ||
| eventSet.foreach { event => | ||
| if (line.contains(event)) { | ||
| val parsedEvent = JsonProtocol.sparkEventFromJson(parse(line)) | ||
| val eventType = Utils.getFormattedClassName(parsedEvent) | ||
| if (eventType == event) { | ||
| eventSet.remove(event) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| assert(JsonProtocol.sparkEventFromJson(parse(lines(0))) === logStart) | ||
| assert(eventSet.isEmpty, "The following events are missing: " + eventSet.toSeq) | ||
| } finally { | ||
| logData.close() | ||
| } | ||
| assert(JsonProtocol.sparkEventFromJson(parse(lines(0))) === logStart) | ||
| assert(eventSet.isEmpty, "The following events are missing: " + eventSet.toSeq) | ||
| } | ||
|
|
||
| private def readLines(in: InputStream): Seq[String] = { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -640,16 +640,18 @@ class CheckpointSuite extends TestSuiteBase with DStreamCheckpointTester | |
| val fileStream = ssc.textFileStream(testDir.toString) | ||
| // Make value 3 take a large time to process, to ensure that the driver | ||
| // shuts down in the middle of processing the 3rd batch | ||
| CheckpointSuite.batchThreeShouldBlockIndefinitely = true | ||
| val mappedStream = fileStream.map(s => { | ||
| CheckpointSuite.batchThreeShouldBlockALongTime = true | ||
| val mappedStream = fileStream.map{ s => | ||
| val i = s.toInt | ||
| if (i == 3) { | ||
| while (CheckpointSuite.batchThreeShouldBlockIndefinitely) { | ||
| Thread.sleep(Long.MaxValue) | ||
| if (CheckpointSuite.batchThreeShouldBlockALongTime) { | ||
| // It's not a good idea to let the thread run forever | ||
| // as resource won't be correctly released | ||
| Thread.sleep(6000) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the logic is that this needs to wait indefinitely, then this needs to be an error if it times out. It may require a different API call to accomplish this.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @srowen It's a bit hacky here, but I can't find a good way. The problem here is that, if the Task here waits indefinitely, and the
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, how about changing the text to "should block a long time" then? as long as this is more than long enough for the driver to shutdown it seems to be in line with the intent of the test. |
||
| } | ||
| } | ||
| i | ||
| }) | ||
| } | ||
|
|
||
| // Reducing over a large window to ensure that recovery from driver failure | ||
| // requires reprocessing of all the files seen before the failure | ||
|
|
@@ -689,7 +691,7 @@ class CheckpointSuite extends TestSuiteBase with DStreamCheckpointTester | |
| } | ||
|
|
||
| // The original StreamingContext has now been stopped. | ||
| CheckpointSuite.batchThreeShouldBlockIndefinitely = false | ||
| CheckpointSuite.batchThreeShouldBlockALongTime = false | ||
|
|
||
| // Create files while the streaming driver is down | ||
| for (i <- Seq(4, 5, 6)) { | ||
|
|
@@ -926,5 +928,5 @@ class CheckpointSuite extends TestSuiteBase with DStreamCheckpointTester | |
| } | ||
|
|
||
| private object CheckpointSuite extends Serializable { | ||
| var batchThreeShouldBlockIndefinitely: Boolean = true | ||
| var batchThreeShouldBlockALongTime: Boolean = true | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use
finallyin contexts like this.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen The
deserializeStreamwill be closed. We only need this when thedeserializeStreammethod throws exception and we have no chance to go to the line 258, in which thedeserializeStreamwill be closed, with the innerfileInputStream.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I mean put this in the block below that closes the deserialization stream, at best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen It seems we need extra code to accommodate the scope problem if putting the close in other clauses. Probably it's cleaner to stick with this solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean -- it's less code. Just open the deserializeStream inside the block and close it in the finally block that follows. Actually, closing deserializeStream will already (should already) close the underlying stream anyway. It doesn't handle errors while making the stream from the original stream, but, constructors of the deserializer streams aren't reading the stream anyway. I suspect it's fine as-is, but it would be extra-defensive to also close the fileInputStream, yes.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen Probably a code snippet will explain things better. Do you mean this:
There are a few things I am concerning
deserializeStreamis nowvarinstead ofval)deserializeStream, in the case of the deserialization throwing exception. I'm pretty sure that if we close thefileInputStreamfirst may cause the closingdeserializeSteamthrowing exception complaining that the input has been closed already.FYI, I found this problem by while running the test case "checkpointing partitioners" in which the
corruptPartitionerFileflag is turned on in theCheckpointSuite.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's what I mean. Well, if I were right that
serializer.deserializeStreamcan't fail, then the existing code would already be fine, so this suggestion doesn't help unlessfileInputStreamgets a similar treatment, and that's too complex. I'm curious how that fails given the current implementation, but, even if it couldn't now, it could in the future.You might normally resolve this with nested try blocks; I think streams aren't supposed to fail if closed twice, so, safe to close the underlying stream for good measure. Still at that point it's no less complex, so I can see that this is as clear as anything.