Conversation

@HyukjinKwon (Member) commented Oct 25, 2016

What changes were proposed in this pull request?

Close `FileStreams`, `ZipFiles`, etc. to release the resources after use. Not closing these resources causes an IOException to be raised while deleting temp files.
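
As a minimal sketch of the failure mode and the fix (file names are illustrative, not from the patch):

    import java.io.{File, FileInputStream}

    val tempFile = new File("temp.bin")  // hypothetical temp file
    val in = new FileInputStream(tempFile)
    try {
      in.read()  // ... use the stream ...
    } finally {
      // Without this close(), the open handle keeps the file locked on
      // Windows, and deleting the temp directory later fails with an
      // IOException.
      in.close()
    }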

How was this patch tested?

Existing tests

@HyukjinKwon (Member Author)

This PR simply rebases #12693, which seems almost or fully done, but it was never rebased and has gone stale.

I guess this should be credited to @taoli91.

cc @tdas @srowen

@SparkQA commented Oct 25, 2016

Test build #67491 has finished for PR 15618 at commit f3713d1.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author)

retest this please

@SparkQA commented Oct 25, 2016

Test build #67507 has finished for PR 15618 at commit f3713d1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author)

retest this please

@HyukjinKwon (Member Author) commented Oct 25, 2016

Let me take a deeper look at this if some tests consistently fail.

@SparkQA commented Oct 25, 2016

Test build #67516 has finished for PR 15618 at commit f3713d1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mridulm (Contributor) commented Oct 25, 2016

There are a bunch of methods in `Utils` that apply nicely to this PR:

  • `Utils.tryWithSafeFinally`
  • `Utils.tryWithResource`, etc.

They also handle exception suppression, etc.
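
For reference, a minimal sketch of how `Utils.tryWithResource` reads (assuming the `org.apache.spark.util.Utils` helpers as of Spark 2.x; the file name is illustrative):

    import java.io.FileInputStream
    import org.apache.spark.util.Utils

    // The resource is created, handed to the body, and closed in a
    // finally block even if the body throws.
    val firstByte = Utils.tryWithResource(new FileInputStream("data.bin")) { in =>
      in.read()
    }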

Inline review thread on this hunk from the patch:

      serializer.deserializeStream(fileInputStream)
    } catch {
      case e: Throwable =>
        fileInputStream.close()
Member:

Shouldn't this be in a `try {} finally {}` instead?

Member:

Not here, because the stream needs to be open afterwards. I had a similar discussion on the original PR.

Member:

I don't mean having the `finally` here on this line...

Member Author:

I am sorry to ask repeatedly, but could you explain a bit more how I should change this?
I was thinking I should not use `finally` on `fileInputStream`, as it should not always be closed at (or around) this point.

Member Author:

Ah, did you mean something like this?

val partitioner = Utils.tryWithSafeFinally[Partitioner] {
  val deserializeStream = serializer.deserializeStream(fileInputStream)
  Utils.tryWithSafeFinally[Partitioner] {
    deserializeStream.readObject[Partitioner]()
  } {
    deserializeStream.close()
  }
} {
  fileInputStream.close()
}

Member:

Yes, I think that's a good idea. Do you even need the generic types on the "try" methods?

Member Author:

Ah-ha, sure. Let me try that, then. Thanks for bearing with me.

@HyukjinKwon (Member Author)

Thanks @mridulm and @felixcheung. Let me try to address your comments.

@HyukjinKwon (Member Author) commented Oct 26, 2016

cc @fuzhouch too (we talked about this via email).

@HyukjinKwon (Member Author) commented Oct 26, 2016

I just took a look at `Utils.tryWithSafeFinally` and `Utils.tryWithResource`. If I understand correctly, both will close the resource in a `finally` block. Could I just leave it as it is?

@mridulm (Contributor) commented Oct 26, 2016

@HyukjinKwon The idea is that you acquire the resources you need by wrapping them in `Utils.tryWithResource`, and you don't need to track them yourself (similar to memory management in the JVM).

As an example, the main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala change would simply acquire the fileInputStream in the try and release it in the finally automatically, without needing to manage it via catch/rethrow, etc. (e.g., what if close() throws an exception?). See the sketch at the end of this comment.

Even the core/src/test/scala/org/apache/spark/FileSuite.scala, core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala, etc. changes can be modelled the same way.
You get the idea :-)

This is essentially analogous to try-with-resources in Java.
Which is not to say it applies everywhere, of course: the drawback is that, unlike in Java, you need to specify the finally action explicitly, which can be a pain compared to Java's idiom, and it can mean multiple levels of nested try blocks...

Since you are going through the pain of making all these changes anyway, it might be a good idea to change the code so that future tests follow the same pattern.
Thoughts?
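
To make that concrete, a sketch of the ReliableCheckpointRDD-style change using the helpers (names such as `fs`, `partitionerFilePath`, and `serializer` are assumed from context, not quoted from the patch):

    // Outer resource handled by tryWithResource: fileInputStream is closed
    // automatically, even if deserialization fails.
    val partitioner = Utils.tryWithResource(fs.open(partitionerFilePath)) { fileInputStream =>
      val deserializeStream = serializer.deserializeStream(fileInputStream)
      // The result type is inferred, so no explicit [Partitioner] is needed here.
      Utils.tryWithSafeFinally {
        deserializeStream.readObject[Partitioner]()
      } {
        deserializeStream.close()
      }
    }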

@srowen (Member) commented Oct 26, 2016

It makes sense to use the Utils methods where possible, sure.

@SparkQA commented Nov 1, 2016

Test build #67903 has finished for PR 15618 at commit 1521572.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author)

retest this please

HyukjinKwon closed this on Nov 1, 2016
HyukjinKwon reopened this on Nov 1, 2016
@HyukjinKwon (Member Author) commented Nov 1, 2016

Oh wait, it seems the failed test is potentially related. Will take a look.

HyukjinKwon changed the title to [WiP][SPARK-14914][CORE] Fix Resource not closed after using, mostly for unit tests on Nov 1, 2016
HyukjinKwon changed the title to [WIP][SPARK-14914][CORE] Fix Resource not closed after using, mostly for unit tests on Nov 1, 2016
@SparkQA commented Nov 2, 2016

Test build #67928 has finished for PR 15618 at commit 1521572.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author) commented Nov 6, 2016

Hm, @srowen, @tdas, and @mridulm: I just removed the changes in ReceiverTracker.scala for now. It seems we can't set this to null in stop() and create a new ReceivedBlockTracker instance in start(), because the instance is still accessed after a graceful stop. It seems the blocks need to remain retrievable after stopping.

One option I am considering is to initialize the ReceivedBlockTracker instance in start() in ReceiverTracker and add null checks to each method in ReceiverTracker that uses the instance; see the sketch below. However, I am not sure it is worth adding such checks, or that this is the right or cleanest way.
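
Purely to illustrate that option, a hypothetical, simplified sketch (the stub class stands in for the real ReceivedBlockTracker; this is not code from the PR):

    // Stub standing in for the real ReceivedBlockTracker.
    class ReceivedBlockTracker {
      def stop(): Unit = ()
    }

    class ReceiverTracker {
      // Created in start() rather than at construction time.
      private var receivedBlockTracker: ReceivedBlockTracker = _

      def start(): Unit = synchronized {
        receivedBlockTracker = new ReceivedBlockTracker
      }

      // Every method that uses the tracker would need a guard like this,
      // since the field is null until start() is called.
      private def tracker: ReceivedBlockTracker = {
        val t = receivedBlockTracker
        require(t != null, "ReceiverTracker has not been started")
        t
      }
    }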

Otherwise, could I ask anyone who is familiar with this code to do a follow-up, if that is fine?

HyukjinKwon changed the title to [SPARK-14914][CORE] Fix Resource not closed after using, mostly for unit tests on Nov 6, 2016
@SparkQA commented Nov 6, 2016

Test build #68231 has finished for PR 15618 at commit a9a5f06.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) left a comment

OK, except perhaps for one last question.


Inline review thread on this hunk from the test changes:

    override def beforeAll(): Unit = {
      super.beforeAll()
      checkpointDir = Utils.createTempDir("checkpoint")
Member:

This might happen to be OK, but now the dir is not deleted between tests. Is that going to be OK?
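
For comparison, a per-test variant would look something like this (a sketch; `CheckpointDirSuite` is a hypothetical suite name, and `Utils.deleteRecursively` is Spark's recursive-delete helper):

    import java.io.File
    import org.apache.spark.util.Utils
    import org.scalatest.{BeforeAndAfterEach, FunSuite}

    class CheckpointDirSuite extends FunSuite with BeforeAndAfterEach {
      private var checkpointDir: File = _

      override def beforeEach(): Unit = {
        super.beforeEach()
        checkpointDir = Utils.createTempDir("checkpoint")  // fresh dir per test
      }

      override def afterEach(): Unit = {
        try {
          Utils.deleteRecursively(checkpointDir)  // clean up between tests
        } finally {
          super.afterEach()
        }
      }
    }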

@HyukjinKwon (Member Author), Nov 7, 2016:

Actually, it seems the original code is already fine here. I took a look, ran several tests, and I think I now understand why @taoli91 tried to fix the code this way.

It seems the state in ReceiverTracker somehow went wrong on Windows and checkpointDir ended up not being closed. The original test apparently failed [1] due to the issue in ReceiverTracker. So I think he tried to create/delete the folder only once [2] and to ensure ReceivedBlockTracker in ReceiverTracker is stopped regardless of its state.

I tested this by manually stopping ReceivedBlockTracker regardless of the state (the original proposal), and it seems fine without the changes here in MapWithStateSuite.scala [3]. Of course, it is also fine with this change [4].

[1]https://ci.appveyor.com/project/spark-test/spark/build/56-F88EDDAF-E576-4787-9530-A4185FC46B1E
[2]https://ci.appveyor.com/project/spark-test/spark/build/57-test-MapWithStateSuite
[3]https://ci.appveyor.com/project/spark-test/spark/build/58-test-MapWithStateSuite
[4]https://ci.appveyor.com/project/spark-test/spark/build/59-test-MapWithStateSuite

Member Author:

I believe this change is not valid. I will get rid of this. Thank you for pointing this out.

@HyukjinKwon (Member Author), Nov 7, 2016:

BTW, the reason [1][2] failed on Windows (without ensuring ReceivedBlockTracker is stopped) seems to be that the checkpointDir directory is still open, so deleting it fails with an exception.

@SparkQA commented Nov 7, 2016

Test build #68273 has finished for PR 15618 at commit d680a2f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mridulm (Contributor) commented Nov 7, 2016

LGTM, merging it into master.
Thx @HyukjinKwon

@srowen (Member) commented Nov 10, 2016

I'm going to merge to 2.1 as well to match #15320

@srowen (Member) commented Nov 10, 2016

Scratch that; it doesn't merge cleanly. Let's leave it.

ghost pushed a commit to dbtsai/spark that referenced this pull request Nov 10, 2016
…ts and example

## What changes were proposed in this pull request?

This is a follow-up work of apache#15618.

Close file source;
For any newly created streaming context outside the withContext, explicitly close the context.

## How was this patch tested?

Existing unit tests.

Author: [email protected] <[email protected]>

Closes apache#15818 from wangmiao1981/rtest.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…nit tests

## What changes were proposed in this pull request?

Close `FileStreams`, `ZipFiles` etc to release the resources after using. Not closing the resources will cause IO Exception to be raised while deleting temp files.
## How was this patch tested?

Existing tests

Author: U-FAREAST\tl <[email protected]>
Author: hyukjinkwon <[email protected]>
Author: Tao LI <[email protected]>

Closes apache#15618 from HyukjinKwon/SPARK-14914-1.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…ts and example

(The same follow-up commit quoted above: closes apache#15818 from wangmiao1981/rtest.)
HyukjinKwon deleted the SPARK-14914-1 branch on January 2, 2018