[SPARK-15084][PYTHON][SQL] Use builder pattern to create SparkSession in PySpark. #12860
@rxin . |
python/pyspark/sql/session.py
we should update SparkSession's doc itself to indicate how to create it.
Yep. It's updated.
|
cc @davies can you take a look at the builder API? |
|
Test build #57589 has finished for PR 12860 at commit
|
|
Test build #57592 has finished for PR 12860 at commit
|
|
Test build #57593 has finished for PR 12860 at commit
|
|
Test build #57602 has finished for PR 12860 at commit
|
python/pyspark/sql/session.py
We could create a builder here, then we can use it like this:
SparkSession.builder.master().getOrCreate()
nvm, we also create a Builder every time in Scala.
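For readers following this exchange, here is a minimal sketch of the two call styles being compared: a builder exposed as an attribute versus a new builder created on every call. All names (`ToyBuilder`, `SharedStyle`, `FreshStyle`) are hypothetical and are not PySpark code; the sketch only illustrates the general trade-off and does not claim to show what the Scala or Python implementation does.

```python
class ToyBuilder:
    """Hypothetical builder that just records options; not PySpark code."""

    def __init__(self):
        self.options = {}

    def master(self, url):
        self.options["master"] = url
        return self  # returning self allows chained calls


class SharedStyle:
    # Attribute style: `SharedStyle.builder.master(...)`, no parentheses.
    # Every caller receives the same ToyBuilder object.
    builder = ToyBuilder()


class FreshStyle:
    # Method style: `FreshStyle.builder().master(...)`.
    # Every call returns a new, empty ToyBuilder.
    @staticmethod
    def builder():
        return ToyBuilder()


SharedStyle.builder.master("local")
FreshStyle.builder().master("local")

shared_options = SharedStyle.builder.options  # options set earlier are still there
fresh_options = FreshStyle.builder().options  # a new builder starts empty
```

In this toy version, the attribute style keeps options set by one caller visible to the next, while the method style starts each chain from a clean builder.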
|
Thank you for the review, @davies . I'll update soon. |
|
@davies . I addressed two comments, but I'm not sure about the first one. |
python/pyspark/sql/session.py
It's weird to use SQLContext to create a SparkSession. Can't we create a SparkSession directly?
We also use that in Scala, it's OK for now.
If we just call Scala's getOrCreate here, then we won't need to fix this in the future.
|
Looks good otherwise |
|
Test build #57649 has finished for PR 12860 at commit
|
|
Thank you, @davies and @andrewor14 . |
|
It's been merged! |
|
Great! Thank you, @rxin |
|
Test build #57683 has finished for PR 12860 at commit
|
|
Test build #57684 has finished for PR 12860 at commit
|
|
Hi, @davies and @andrewor14 . Now, it's updated.
For the calling Scala's |
|
Thanks, merging into master 2.0 |
… in PySpark.
## What changes were proposed in this pull request?
This is a python port of corresponding Scala builder pattern code. `sql.py` is modified as a target example case.
## How was this patch tested?
Manual.
Author: Dongjoon Hyun <[email protected]>
Closes #12860 from dongjoon-hyun/SPARK-15084.
(cherry picked from commit 0903a18)
Signed-off-by: Andrew Or <[email protected]>
|
Thank you, @andrewor14 ! |
What changes were proposed in this pull request?
This is a Python port of the corresponding Scala builder pattern code.
`sql.py` is modified as a target example case.
How was this patch tested?
Manual.
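As an illustration of the builder pattern this PR describes, here is a minimal, self-contained sketch in plain Python. It is not the code from `python/pyspark/sql/session.py`: `ToySession`, its nested `Builder`, and the `toy.*` option keys are hypothetical stand-ins, and the get-or-create behaviour shown (reuse one shared session and merge new options into it) is an assumption made for the example.

```python
import threading


class ToySession:
    """Hypothetical stand-in for a session object; not PySpark's SparkSession."""

    _active = None  # the single shared session, once one has been created
    _lock = threading.Lock()

    def __init__(self, options):
        self.options = dict(options)

    class Builder:
        """Collects options through chained calls, then builds or reuses a session."""

        def __init__(self):
            self._options = {}

        def config(self, key, value):
            self._options[key] = str(value)
            return self  # returning self is what makes the calls chainable

        def master(self, master):
            return self.config("toy.master", master)

        def appName(self, name):
            return self.config("toy.app.name", name)

        def getOrCreate(self):
            # Create the session on first use; afterwards reuse it and
            # merge in whatever options this builder collected.
            with ToySession._lock:
                if ToySession._active is None:
                    ToySession._active = ToySession(self._options)
                else:
                    ToySession._active.options.update(self._options)
                return ToySession._active


first = ToySession.Builder().master("local").appName("demo").getOrCreate()
second = ToySession.Builder().config("toy.option", "x").getOrCreate()
```

Both chains return the same `ToySession` object here, and the second chain's option is merged into it rather than producing a separate session.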