Skip to content

Conversation

@HeartSaVioR
Copy link
Contributor

@HeartSaVioR HeartSaVioR commented Feb 4, 2021

What changes were proposed in this pull request?

This PR proposes to expose the number of total paths in Utils.buildLocationMetadata(), with relaxing space usage a bit (around 10+ chars).

Suppose the first 2 of 5 paths are only fit to the threshold, the outputs between the twos are below:

  • before the change: [path1, path2]
  • after the change: (5 paths)[path1, path2, ...]

Why are the changes needed?

SPARK-31793 silently truncates the paths hence end users can't indicate how many paths are truncated, and even more, whether paths are truncated or not.

Does this PR introduce any user-facing change?

Yes, the location metadata will also show how many paths are truncated (not shown), instead of silently truncated.

How was this patch tested?

Modified UTs

@HeartSaVioR
Copy link
Contributor Author

While I've marked SPARK-34339 as improvement, I also feel this may be considered as a bug, as the new output brought by SPARK-31793 brings confusion.

@HeartSaVioR
Copy link
Contributor Author

cc.ing @gengliangwang @cloud-fan @HyukjinKwon @maropu who are author/reviewers of #28610

@HyukjinKwon
Copy link
Member

Yeah, I think this is better

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39436/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39436/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Test build #134849 has finished for PR 31464 at commit f0a02fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR HeartSaVioR changed the title [SPARK-34339][CORE][SQL] Expose the number of truncated paths in Utils.buildLocationMetadata() [SPARK-34339][CORE][SQL] Expose the number of total paths in Utils.buildLocationMetadata() Feb 4, 2021
@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39440/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39440/

@HeartSaVioR
Copy link
Contributor Author

Appreciate another round of review based on the changed output. I'll try out alternative if the output doesn't look good for us. Thanks!

@HeartSaVioR
Copy link
Contributor Author

HeartSaVioR commented Feb 4, 2021

Hmm... I'll need to fix the test as a lot of tests are relying on the previous format (Location: <className>[]). I'd like to hear the voice on the new output format before fixing these.
(If tests are simply checking it via regex then moving (N paths) to the first part of [] would work, but it seems clearer to have it separately.)

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Test build #134853 has finished for PR 31464 at commit 56abfe2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR
Copy link
Contributor Author

I was wrong about the amount of broken UTs. They were just 2 and I just fixed.

@github-actions github-actions bot added the AVRO label Feb 4, 2021
@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39448/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39448/

@SparkQA
Copy link

SparkQA commented Feb 4, 2021

Test build #134861 has finished for PR 31464 at commit e766209.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR
Copy link
Contributor Author

@HyukjinKwon @cloud-fan @gengliangwang Appreciate another round of review. Thanks!

@HyukjinKwon
Copy link
Member

Merged to master.

@HeartSaVioR
Copy link
Contributor Author

Thanks all for reviewing and merging!

@HeartSaVioR HeartSaVioR deleted the SPARK-34339 branch February 5, 2021 02:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants