[DOCS] Improve docs for kyuubi-extension-spark-jdbc-dialect #7036

pan3793 · 2025-04-21T03:47:28Z

Why are the changes needed?

This PR removes the page https://kyuubi.readthedocs.io/en/v1.10.1/client/python/pyspark.html and merges most of its content into https://kyuubi.readthedocs.io/en/v1.10.1/extensions/engines/spark/jdbc-dialect.html; some of the latter's original content is also modified.

The current docs are misleading: several users have asked me why accessing data stored in the Hive warehouse is so slow when they follow the Kyuubi PySpark docs.

In fact, accessing HiveServer2/STS through the Spark JDBC data source is discouraged by the Spark community (see SPARK-47482), even though it is technically feasible.
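
For illustration only, here is a minimal sketch of the two access paths being contrasted. The helper names, host, port, and table are hypothetical and not taken from the docs; the `jdbc:kyuubi://` URL scheme and the `org.apache.kyuubi.jdbc.KyuubiHiveDriver` class are assumed to match the Kyuubi JDBC driver on the classpath. Unless partitioning options are set, Spark's JDBC data source reads through a single connection, which is one plausible reason the JDBC path feels slow for warehouse-sized tables.

```python
def kyuubi_jdbc_options(host, port, table, user, password=""):
    """Build options for Spark's JDBC data source (hypothetical helper).

    Assumes the Kyuubi JDBC driver jar is available to the Spark application.
    """
    return {
        "url": f"jdbc:kyuubi://{host}:{port}/",
        "driver": "org.apache.kyuubi.jdbc.KyuubiHiveDriver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_via_jdbc(spark, options):
    # Technically feasible, but rows are pulled through the
    # HiveServer2/STS/Kyuubi Thrift endpoint over JDBC.
    return spark.read.format("jdbc").options(**options).load()


def read_via_catalog(spark, table):
    # Reads the table through Spark's own catalog (for example a
    # Hive-enabled SparkSession), bypassing the JDBC endpoint entirely.
    return spark.table(table)
```

With an existing `SparkSession` named `spark`, the first path would be `read_via_jdbc(spark, kyuubi_jdbc_options("kyuubi.example.com", 10009, "db.tbl", "alice"))` and the second `read_via_catalog(spark, "db.tbl")`.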

How was this patch tested?

This is a docs-only change; review is required.

Was this patch authored or co-authored using generative AI tooling?

No.

codecov-commenter · 2025-04-21T05:16:11Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (cc68cb4) to head (c00ce07).

Additional details and impacted files

@@          Coverage Diff           @@
##           master   #7036   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         695     695           
  Lines       42814   42814           
  Branches     5829    5829           
======================================
  Misses      42814   42814


bowenliang123

LGTM.

…alect

### Why are the changes needed?

This PR removes the page https://kyuubi.readthedocs.io/en/v1.10.1/client/python/pyspark.html and merges most of its content into https://kyuubi.readthedocs.io/en/v1.10.1/extensions/engines/spark/jdbc-dialect.html; some of the latter's original content is also modified.

The current docs are misleading: several users have asked me why accessing data stored in the Hive warehouse is so slow when they follow the [Kyuubi PySpark docs](https://kyuubi.readthedocs.io/en/v1.10.1/client/python/pyspark.html).

In fact, accessing HiveServer2/STS through the Spark JDBC data source is discouraged by the Spark community (see [SPARK-47482](apache/spark#45609)), even though it is technically feasible.

### How was this patch tested?

This is a docs-only change; review is required.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7036 from pan3793/jdbc-ds-docs.

Closes #7036

c00ce07 [Cheng Pan] style
f2676bd [Cheng Pan] [DOCS] Improve docs for kyuubi-extension-spark-jdbc-dialect

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit 6da0e62)
Signed-off-by: Cheng Pan <[email protected]>


[DOCS] Improve docs for kyuubi-extension-spark-jdbc-dialect

f2676bd

github-actions bot added the kind:documentation Documentation is a feature! label Apr 21, 2025

pan3793 requested a review from bowenliang123 April 21, 2025 03:48

style

c00ce07

bowenliang123 approved these changes Apr 21, 2025

View reviewed changes

yaooqinn approved these changes Apr 22, 2025

View reviewed changes

pan3793 self-assigned this Apr 23, 2025

pan3793 added this to the v1.10.2 milestone Apr 23, 2025

pan3793 closed this in 6da0e62 Apr 23, 2025

pan3793 deleted the jdbc-ds-docs branch May 22, 2025 19:23
