
Conversation


@zuotingbing commented Dec 15, 2017

What changes were proposed in this pull request?

  1. Start HiveThriftServer2.
  2. Connect to the Thrift server through Beeline.
  3. Close the Beeline connection.
  4. Repeat steps 2 and 3 several times, which causes the memory leak.

We found that many directories under `hive.exec.local.scratchdir` and `hive.exec.scratchdir` are never dropped. The scratch directory is added to `deleteOnExit` when it is created, so the FileSystem `deleteOnExit` set keeps growing until the JVM terminates.
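
For illustration, here is a minimal, standalone Scala sketch (not code from this patch) of the behavior described above: paths registered with the Hadoop `FileSystem.deleteOnExit` API stay in the cached `FileSystem` instance until `cancelDeleteOnExit` is called or the JVM exits, so per-session scratch directories that are registered but never cancelled accumulate for the life of the server. The scratch paths below are made up.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object DeleteOnExitGrowth {
  def main(args: Array[String]): Unit = {
    // The cached FileSystem instance lives for the whole JVM.
    val fs = FileSystem.get(new Configuration())

    // Simulate several beeline sessions, each registering a scratch dir.
    // deleteOnExit only records the path; nothing is removed until JVM exit.
    (1 to 3).foreach { i =>
      val scratch = new Path(s"/tmp/hive-scratch-$i") // illustrative path
      fs.mkdirs(scratch)
      fs.deleteOnExit(scratch)
    }

    // A session that cleans up after itself has to cancel the registration
    // and delete the directory explicitly; otherwise the entry lingers.
    val done = new Path("/tmp/hive-scratch-1")
    fs.cancelDeleteOnExit(done)
    fs.delete(done, true)
  }
}
```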

In addition, using `jmap -histo:live [PID]` to print the live objects in the HiveThriftServer2 process, we can see that the counts of `org.apache.spark.sql.hive.client.HiveClientImpl` and `org.apache.hadoop.hive.ql.session.SessionState` keep increasing even after all the Beeline connections are closed, which causes the memory leak.

How was this patch tested?

manual tests

@AmplabJenkins

Can one of the admins verify this patch?

@mgaido91
Contributor

I am not sure about this change, actually. In this way, all the users would use the same metadataHive, which might also cause concurrency issues. Did you experience an OOM error due to the memory leak?

@gatorsmile
Member

cc @liufengdb

@liufengdb

I think this method can take care of resource cleanup automatically: https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/SessionManager.java#L151

Could you take a heap dump and find out why the sessions are not cleaned up?

@zuotingbing
Author

zuotingbing commented Dec 18, 2017

While debugging, I found that every time I connect to the Thrift server through Beeline, `SessionState.start(state)` is called twice: once in `HiveSessionImpl.open`, and once in `HiveClientImpl` when it runs `use default`.
`SessionManager.java#L151` and `HiveSessionImpl.close` only clean up the first one.
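
For illustration only (this is not the merged patch), a minimal Scala sketch of the kind of cleanup the second `SessionState` needs when the connection goes away; `closeClientSessionState` is a hypothetical helper name:

```scala
import org.apache.hadoop.hive.ql.session.SessionState

// Hypothetical helper: also dispose of the SessionState started for the
// per-connection HiveClientImpl, so its scratch dirs (and the FileSystem
// deleteOnExit entries they registered) are released when the session ends.
def closeClientSessionState(state: SessionState): Unit = {
  try {
    state.close()                // drops the session's scratch directories
  } finally {
    SessionState.detachSession() // clear the thread-local reference as well
  }
}
```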

By the way, the session timeout checker does not work in SessionManager; I created another PR, #20025, to follow up on that. Thanks @liufengdb

@zuotingbing
Author

zuotingbing commented Dec 18, 2017

The cache size of the FileSystem `deleteOnExit` set keeps increasing even though all the Beeline connections have been closed:

[screenshot: FileSystem deleteOnExit entries growing across Beeline connections]
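
One rough way to watch that set without a full heap dump is a reflection probe like the Scala sketch below; note that the private field name `deleteOnExit` is an assumption about the Hadoop release we tested and may differ in other versions.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

object DeleteOnExitProbe {
  // Diagnostic only: read the private deleteOnExit set of a FileSystem
  // instance via reflection (field name assumed, see note above).
  def deleteOnExitCount(fs: FileSystem): Int = {
    val field = classOf[FileSystem].getDeclaredField("deleteOnExit")
    field.setAccessible(true)
    field.get(fs).asInstanceOf[java.util.Set[_]].size()
  }

  def main(args: Array[String]): Unit = {
    // Log this after each beeline connect/disconnect cycle; if sessions
    // clean up after themselves, the count should stay flat instead of growing.
    val fs = FileSystem.get(new Configuration())
    println(s"deleteOnExit entries: ${deleteOnExitCount(fs)}")
  }
}
```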

@zuotingbing changed the title from "[SPARK-22793][SQL]Memory leak in Spark Thrift Server" to "[SPARK-22793][SQL][BACKPORT-2.0]Memory leak in Spark Thrift Server" Dec 20, 2017
@zuotingbing
Author

zuotingbing commented Dec 21, 2017

Could anybody please check this PR or suggest how to fix it? It seems to be a critical bug. Thanks very much! @cloud-fan @rxin @gatorsmile @vanzin @jerryshao @zsxwing

@cloud-fan
Contributor

This is not a backport, as this patch has not been merged to master yet. Let's move the discussion to the primary PR against the master branch.

@zuotingbing changed the title from "[SPARK-22793][SQL][BACKPORT-2.0]Memory leak in Spark Thrift Server" to "[SPARK-22793][SQL]Memory leak in Spark Thrift Server" Dec 21, 2017
@zuotingbing
Author

OK, please move to #20029. Thanks all.

asfgit pushed a commit that referenced this pull request Jan 6, 2018
# What changes were proposed in this pull request?
1. Start HiveThriftServer2.
2. Connect to the Thrift server through Beeline.
3. Close the Beeline connection.
4. Repeat steps 2 and 3 many times.

We found that many directories under `hive.exec.local.scratchdir` and `hive.exec.scratchdir` are never dropped. The scratch directory is added to `deleteOnExit` when it is created, so the FileSystem `deleteOnExit` set keeps growing until the JVM terminates.

In addition, using `jmap -histo:live [PID]` to print the live objects in the HiveThriftServer2 process, we can see that the counts of `org.apache.spark.sql.hive.client.HiveClientImpl` and `org.apache.hadoop.hive.ql.session.SessionState` keep increasing even after all the Beeline connections are closed, which may cause the memory leak.

# How was this patch tested?
Manual tests.

This PR is a follow-up to #19989.

Author: zuotingbing <[email protected]>

Closes #20029 from zuotingbing/SPARK-22793.

(cherry picked from commit be9a804)
Signed-off-by: gatorsmile <[email protected]>
ghost pushed a commit to dbtsai/spark that referenced this pull request Jan 6, 2018
@zuotingbing changed the title from "[SPARK-22793][SQL]Memory leak in Spark Thrift Server" to "[SPARK-22793][SQL][BACKPORT-2.0]Memory leak in Spark Thrift Server" Jan 8, 2018
@zuotingbing
Author

zuotingbing commented Jan 8, 2018

@gatorsmile @liufengdb Could you please also check this PR? It is a [BACKPORT-2.0] of #20029 from master/2.3.

@gatorsmile
Member

@zuotingbing No new 2.0 release is planned, so we will not backport this to 2.0.

@zuotingbing
Author

ok, got it. Thanks!

@zuotingbing closed this Jan 8, 2018
zzcclp pushed a commit to zzcclp/spark that referenced this pull request Dec 6, 2018
