@pan3793 pan3793 commented Apr 21, 2025

Why are the changes needed?

This PR removes the page https://kyuubi.readthedocs.io/en/v1.10.1/client/python/pyspark.html and merges most of its content into https://kyuubi.readthedocs.io/en/v1.10.1/extensions/engines/spark/jdbc-dialect.html. Some of the original content of the latter is also modified.

The current docs are misleading: users have asked me several times why accessing data stored in the Hive warehouse is so slow when they follow the Kyuubi PySpark docs.

In fact, accessing HiveServer2/STS (Spark Thrift Server) from the Spark JDBC data source is discouraged by the Spark community (see SPARK-47482, apache/spark#45609), even though it is technically feasible.
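
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of the two access paths. It is illustrative only and is not taken from the removed page: the host, port, table name, and helper function names are hypothetical, and `org.apache.hive.jdbc.HiveDriver` with a `jdbc:hive2://` URL is assumed as the driver setup. The direct catalog read is the usual alternative when the Spark application can reach the Hive metastore; that recommendation is an assumption here, not something this PR states.

```python
def hs2_jdbc_options(host, port, query, user="anonymous", password=""):
    """Build options for Spark's JDBC data source against a HiveServer2-compatible endpoint.

    All values are placeholders; the driver class and URL scheme are assumptions.
    """
    return {
        "driver": "org.apache.hive.jdbc.HiveDriver",
        "url": f"jdbc:hive2://{host}:{port}/",
        "user": user,
        "password": password,
        "query": query,
    }


def read_via_jdbc(spark, options):
    # The discouraged path: without partitioning options, Spark reads a JDBC
    # source through a single connection into one partition.
    return spark.read.format("jdbc").options(**options).load()


def read_via_catalog(spark, table):
    # The direct path: Spark resolves the table through its own catalog and
    # reads the underlying files in parallel.
    return spark.table(table)
```

With a real `SparkSession` named `spark`, the first path would be called as `read_via_jdbc(spark, hs2_jdbc_options("kyuubi.example.com", 10009, "SELECT * FROM testdb.src_table"))`, and the second as `read_via_catalog(spark, "testdb.src_table")`.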

How was this patch tested?

It's a docs-only change; review is required.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the kind:documentation Documentation is a feature! label Apr 21, 2025
@pan3793 pan3793 requested a review from bowenliang123 April 21, 2025 03:48
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (cc68cb4) to head (c00ce07).

Additional details and impacted files
@@          Coverage Diff           @@
##           master   #7036   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         695     695           
  Lines       42814   42814           
  Branches     5829    5829           
======================================
  Misses      42814   42814           

@bowenliang123 bowenliang123 left a comment

LGTM.

@pan3793 pan3793 self-assigned this Apr 23, 2025
@pan3793 pan3793 added this to the v1.10.2 milestone Apr 23, 2025
@pan3793 pan3793 closed this in 6da0e62 Apr 23, 2025
pan3793 added a commit that referenced this pull request Apr 23, 2025
…alect

Closes #7036 from pan3793/jdbc-ds-docs.

Closes #7036

c00ce07 [Cheng Pan] style
f2676bd [Cheng Pan] [DOCS] Improve docs for kyuubi-extension-spark-jdbc-dialect

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit 6da0e62)
Signed-off-by: Cheng Pan <[email protected]>
@pan3793 pan3793 deleted the jdbc-ds-docs branch May 22, 2025 19:23
turboFei pushed a commit to turboFei/kyuubi that referenced this pull request Aug 27, 2025
…dbc-dialect
