@chenghao-intel
Contributor

We still keep only a single HiveContext within the Thrift server, and we also create an object called SQLSession to isolate the state of different users.

Developers can obtain and release a user session via openSession and closeSession. For backward compatibility, SQLContext and HiveContext also provide a default session if openSession is never called.
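
As a rough sketch of this design (not the code in this PR), the Java snippet below models one shared context that keeps per-session settings and falls back to a default session when `openSession` has not been called. The class names `SharedContext` and `Session`, the `setConf`/`getConf` methods, and the use of a `ThreadLocal` to track the current session are assumptions for illustration; only the names `openSession` and `closeSession` come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-user state; a stand-in for the idea behind SQLSession, not its real API.
final class Session {
    final Map<String, String> conf = new HashMap<>();
}

// Hypothetical single shared context that serves many sessions.
final class SharedContext {
    // Used when the caller never opens a session (the backward-compatible path).
    private final Session defaultSession = new Session();
    // Assumption: the current session is tracked per thread.
    private final ThreadLocal<Session> current = new ThreadLocal<>();

    Session openSession() {
        Session session = new Session();
        current.set(session);
        return session;
    }

    void closeSession() {
        // Later calls on this thread see the default session again.
        current.remove();
    }

    private Session currentSession() {
        Session session = current.get();
        return session != null ? session : defaultSession;
    }

    void setConf(String key, String value) {
        currentSession().conf.put(key, value);
    }

    String getConf(String key, String fallback) {
        return currentSession().conf.getOrDefault(key, fallback);
    }
}
```

In this sketch, a value set before `openSession` lands in the default session, is invisible inside the newly opened session, and becomes visible again after `closeSession`.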

@SparkQA
SparkQA commented Mar 4, 2015

Test build #28253 has started for PR 4885 at commit 5fea724.

  • This patch merges cleanly.

@chenghao-intel
Contributor Author

cc @liancheng @tianyi @guowei2
We have two implementations for supporting multiple sessions in the Thrift server. Can you review the code for me?

@SparkQA
SparkQA commented Mar 4, 2015

Test build #28253 has finished for PR 4885 at commit 5fea724.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28253/

@SparkQA
SparkQA commented Mar 4, 2015

Test build #28256 has started for PR 4885 at commit 0ca4bbd.

  • This patch merges cleanly.

Contributor

I think there's no need to override SQLSession and createSession here, because SessionState itself is a ThreadLocal. We just need to set the SessionState when openSession is called in SparkSQLSessionManager.

Contributor

@guowei2 I think either way is OK for now. Putting all session-specific stuff into a central place (SQLSession) seems cleaner to me. Making SQLSession a thread-local does look a little ugly; however, right now it's not used anywhere other than the Thrift server. When we do decide to move Hive into a separate data source and make our own data-source-neutral Spark SQL server, we can handle the session problem in a cleaner way (e.g., using an actor for each session and keeping all session-specific stuff in the actor instance).
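
To make the thread-local trade-off concrete, here is a minimal Java sketch that is independent of the Hive and Spark classes discussed here. The names `ThreadLocalDemo`, `CURRENT_USER`, and `secondTaskSees` are hypothetical. It shows a general property of thread-locals rather than anything specific to this PR: the value belongs to the thread, so on a pooled thread it survives into the next task unless it is removed explicitly.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalDemo {
    // Hypothetical per-thread slot standing in for "the current session".
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two tasks back to back on one pooled thread and
    // returns what the second task observes in the thread-local.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CURRENT_USER.set("alice");
                if (cleanUp) {
                    CURRENT_USER.remove();
                }
            };
            Callable<String> second = CURRENT_USER::get;
            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the cleanup step, the second task still sees the value left behind by the first one, which is why code that sets a thread-local when a session opens usually needs a matching reset when the session closes.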

@SparkQA
SparkQA commented Mar 4, 2015

Test build #28256 has finished for PR 4885 at commit 0ca4bbd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28256/

Contributor

Indentations are off in this test case.

@liancheng
Contributor

Hey @chenghao-intel, terribly sorry for the delay. In general this LGTM. Left some comments, mostly on styling issues. Thanks!

@chenghao-intel force-pushed the multisessions_singlecontext branch from 0ca4bbd to 815b27a on March 16, 2015 03:15

@SparkQA
SparkQA commented Mar 16, 2015

Test build #28636 has started for PR 4885 at commit 815b27a.

  • This patch merges cleanly.

@chenghao-intel
Contributor Author

Thank you @liancheng @guowei2 for the review; I've updated the code as suggested.

Still, I am thinking about how to handle the temporary functions and tables that are isolated by SQLSession. Maybe life would be easier if we build on the design in this PR (we can do that in a separate PR). Any suggestions, @liancheng @marmbrus @guowei2?

@SparkQA
SparkQA commented Mar 16, 2015

Test build #28636 has finished for PR 4885 at commit 815b27a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28636/

@SparkQA
SparkQA commented Mar 16, 2015

Test build #28638 has started for PR 4885 at commit 1c47b2a.

  • This patch merges cleanly.

@SparkQA
SparkQA commented Mar 16, 2015

Test build #28638 has finished for PR 4885 at commit 1c47b2a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28638/

@chenghao-intel changed the title from "[SPARK-2087] [SQL] [WIP] Multiple thriftserver sessions with single HiveContext instance" to "[SPARK-2087] [SQL] Multiple thriftserver sessions with single HiveContext instance" on Mar 16, 2015

Contributor

It would be nice to add a comment indicating that the expected value should be "<undefined>". I was quite confused at first, as 200 should be the default value of "spark.sql.shuffle.partitions" :)
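
One way to picture the difference between the two values is the Java sketch below. It assumes a hypothetical `SessionConf` class (not Spark's actual configuration code) in which a raw lookup of a key that was never set in the session returns the "<undefined>" placeholder, while a typed accessor substitutes the default of 200.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-session settings holder; names and behavior are assumptions.
final class SessionConf {
    static final String UNDEFINED = "<undefined>";
    static final String SHUFFLE_PARTITIONS = "spark.sql.shuffle.partitions";
    static final int DEFAULT_SHUFFLE_PARTITIONS = 200;

    private final Map<String, String> settings = new HashMap<>();

    void set(String key, String value) {
        settings.put(key, value);
    }

    // Raw lookup: reports only what was explicitly set in this session.
    String rawValue(String key) {
        return settings.getOrDefault(key, UNDEFINED);
    }

    // Typed accessor: applies the default when nothing was set.
    int shufflePartitions() {
        String value = settings.get(SHUFFLE_PARTITIONS);
        return value == null ? DEFAULT_SHUFFLE_PARTITIONS : Integer.parseInt(value);
    }
}
```

Under these assumptions, a fresh session reports "<undefined>" for the raw lookup even though the effective number of shuffle partitions is still 200.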

@liancheng
Contributor

Hey @chenghao-intel, left another 3 minor comments. But I'm gonna merge this. Please fix them in another PR. Also verified locally that both session isolation and cache sharing work as expected. Thanks for the efforts!!

@chenghao-intel
Contributor Author

Thank you very much @liancheng, I will create another PR for the requirements that we discussed above, and also the minor issues.

@chenghao-intel deleted the multisessions_singlecontext branch on July 2, 2015 08:39
5 participants