@yaooqinn
Member

@yaooqinn yaooqinn commented Mar 5, 2025

What changes were proposed in this pull request?

This PR uses CatalogUtils.URIToString instead of URI.toString to decode the location URI.

Why are the changes needed?

For example, for partition specs like test1=X'16', test3=timestamp'2018-11-17 13:33:33', the stored path includes them as test1=%16/test3=2018-11-17 13%3A33%3A33 because the special characters are escaped. When the whole path string is then resolved to a URI object, the % characters are escaped a second time and this path fragment becomes test1=%2516/test3=2018-11-17 13%253A33%253A33. We therefore need to decode %25 -> % before displaying the location to users.
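The double encoding described above can be reproduced with plain `java.net.URI`. The following is a minimal sketch, not the code from this PR: the class name `LocationUriDemo`, its helper methods, and the `/warehouse/t/...` path are hypothetical, and it does not call Spark's `CatalogUtils.URIToString`. It relies only on the documented behavior that the multi-argument `URI` constructors always quote `%`, that `toString()` returns the escaped form, and that `getPath()` returns the decoded path.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class LocationUriDemo {

    // Build a file URI from a path string that already contains escapes.
    // The multi-argument java.net.URI constructors always quote '%',
    // so an existing "%16" in the path is stored as "%2516".
    static URI toUri(String storedPath) {
        try {
            return new URI("file", null, storedPath, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // What URI.toString shows: the second level of escaping is visible.
    static String encodedForm(String storedPath) {
        return toUri(storedPath).toString();
    }

    // What a decoding accessor shows: "%25" is turned back into "%".
    // Note that getPath() returns only the path component, without the scheme.
    static String decodedForm(String storedPath) {
        return toUri(storedPath).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical stored partition path; the timestamp part is shortened
        // to avoid a space, which the constructor would additionally escape.
        String stored = "/warehouse/t/test1=%16/test3=13%3A33%3A33";
        System.out.println(encodedForm(stored));
        // prints "file:/warehouse/t/test1=%2516/test3=13%253A33%253A33"
        System.out.println(decodedForm(stored));
        // prints "/warehouse/t/test1=%16/test3=13%3A33%3A33"
    }
}
```

The decoded form matches the path as it was stored, which is the form a user expects to see in the table description.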

Does this PR introduce any user-facing change?

Yes, DESC TABLE will no longer show double-encoded paths.

How was this patch tested?

new tests

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the SQL label Mar 5, 2025
Member

@dongjoon-hyun dongjoon-hyun left a comment

Thank you for making a backporting PR.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Pending CIs

@dongjoon-hyun
Member

Sorry, but could you rebase once more, @yaooqinn? The Docker and SparkR CI jobs also failed before.

@yaooqinn
Member Author

yaooqinn commented Mar 6, 2025

Thank you @dongjoon-hyun

yaooqinn added a commit that referenced this pull request Mar 6, 2025
…decoded for display

Closes #50164 from yaooqinn/SPARK-51307-35.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
@yaooqinn yaooqinn closed this Mar 6, 2025
@yaooqinn
Member Author

yaooqinn commented Mar 6, 2025

Merged to branch 3.5, thank you again @dongjoon-hyun

jetoile pushed a commit to criteo-forks/spark that referenced this pull request Jul 15, 2025
…decoded for display

Closes apache#50164 from yaooqinn/SPARK-51307-35.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…ed for display (apache#728)

[SPARK-51307][SQL][3.5] locationUri in CatalogStorageFormat shall be decoded for display

Closes apache#50164 from yaooqinn/SPARK-51307-35.

Authored-by: Kent Yao <[email protected]>

Signed-off-by: Kent Yao <[email protected]>
Co-authored-by: Kent Yao <[email protected]>