@HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Apr 4, 2021

What changes were proposed in this pull request?

This PR proposes to set the system encoding to UTF-8. For some reason, GitHub Actions machines appear to have changed their default to ASCII. As a result, default encoding/decoding in Python, e.g. "a".encode(), uses ASCII, and Sphinx appears to depend on that.
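
For anyone reproducing this locally, a minimal sketch of how to inspect the encodings Python derives from the environment. The helper name `describe_encodings` is hypothetical and not part of Spark or Sphinx. One caveat worth checking against the description above: in Python 3, `str.encode()` with no argument uses UTF-8 regardless of locale, so the locale-sensitive piece is more likely the preferred encoding that `open()` falls back to when no `encoding=` is passed.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings Python derives from the current environment."""
    return {
        # Used by open() when no encoding= argument is given; follows LC_ALL/LANG.
        "preferred": locale.getpreferredencoding(False),
        # Used to encode and decode file names.
        "filesystem": sys.getfilesystemencoding(),
        # Default for str.encode() / bytes.decode(); always "utf-8" in Python 3.
        "default": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name}: {value}")
```

Running this once inside the build container and once on a local machine should show whether the two environments disagree on the preferred encoding.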

Why are the changes needed?

To recover the GitHub Actions build.

Does this PR introduce any user-facing change?

No, dev-only.

How was this patch tested?

Tested in #32046

@HyukjinKwon
Member Author

Can you take a quick look please when you guys find some time?

@HyukjinKwon HyukjinKwon changed the title from "[SPARK-34951][INFRA][PYTHON][TESTS]Set the system encoding as UTF-8 to recover the Sphinx build in GitHub Actions" to "[SPARK-34951][INFRA][PYTHON][TESTS] Set the system encoding as UTF-8 to recover the Sphinx build in GitHub Actions" on Apr 4, 2021
@SparkQA

SparkQA commented Apr 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41466/

@SparkQA

SparkQA commented Apr 4, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41466/

Comment on lines -360 to -361
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
Member

interesting, so export doesn't work?

Member Author

Oh, it works. The problem was that we hadn't set both of them when we ran lint-python above, which triggers the Sphinx build.
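
To check this kind of ordering problem, one can look at which encoding a freshly started Python process derives from `LC_ALL` and `LANG`. This is a minimal sketch, assuming a POSIX-like runner; `preferred_encoding_in_child` is a hypothetical helper, not something in the Spark build scripts. The values printed depend on the Python version and on which locales are installed, so no expected output is shown.

```python
import os
import subprocess
import sys


def preferred_encoding_in_child(extra_env):
    """Start a fresh Python process and return its locale-derived encoding."""
    env = dict(os.environ)
    env.update(extra_env)  # e.g. the two variables exported in the workflow
    result = subprocess.run(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding(False))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for value in ("C.UTF-8", "C"):
        encoding = preferred_encoding_in_child({"LC_ALL": value, "LANG": value})
        print(f"LC_ALL=LANG={value}: {encoding}")
```

The point of the sketch is only that the variables must already be in the environment when the process that runs Sphinx starts; exporting them later in the script does not affect steps that have already run.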

Member

I guess we need this still.

runs-on: ubuntu-20.04
env:
  LC_ALL: C.UTF-8
  LANG: C.UTF-8
Member

Ur, interesting. Previously, I moved this from here to line 360 because the linter is executed inside dongjoon/apache-spark-github-action-image:20201025.

@dongjoon-hyun
Member

dongjoon-hyun commented Apr 4, 2021

Anyway, if this passes GA, please proceed to merge. Thank you always, @HyukjinKwon !
(I didn't take a look at the failure in depth)

@SparkQA

SparkQA commented Apr 4, 2021

Test build #136889 has finished for PR 32047 at commit a9d83d6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

MaxGekk pushed a commit that referenced this pull request Apr 4, 2021
…tion

### What changes were proposed in this pull request?

This PR replaces non-ASCII characters with ASCII characters where possible in the PySpark documentation.

### Why are the changes needed?

To avoid unnecessary non-ASCII characters, which could lead to issues such as #32047 or #22782.

### Does this PR introduce _any_ user-facing change?

Virtually no.

### How was this patch tested?

Found via (Mac OS):

```bash
# In Spark root directory
cd python
pcregrep --color='auto' -n "[\x80-\xFF]" `git ls-files .`
```
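
Where `pcregrep` is not available, a rough Python equivalent of the search above might look like the following. It is only a sketch: `non_ascii_lines` and `scan_file` are hypothetical names, not part of Spark's tooling, and the check simply flags any line that contains a byte in the range 0x80 to 0xFF.

```python
import sys


def non_ascii_lines(data):
    """Return (line_number, line) pairs for lines of `data` with bytes >= 0x80."""
    return [
        (number, line)
        for number, line in enumerate(data.splitlines(), start=1)
        if any(byte >= 0x80 for byte in line)
    ]


def scan_file(path):
    # Read in binary mode so the check does not depend on the locale encoding.
    with open(path, "rb") as handle:
        return non_ascii_lines(handle.read())


if __name__ == "__main__":
    # File paths are taken from the command line.
    for path in sys.argv[1:]:
        for number, line in scan_file(path):
            print(f"{path}:{number}:{line!r}")
```

Saved under a hypothetical name such as `find_non_ascii.py`, it could be run from the `python` directory with the output of `git ls-files .` as arguments. Binary files would be reported too, so the results need a manual pass.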

Closes #32048 from HyukjinKwon/minor-fix.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
@HyukjinKwon
Member Author

Merged to master, branch-3.1 and branch-3.0.

HyukjinKwon added a commit that referenced this pull request Apr 4, 2021
…to recover the Sphinx build in GitHub Actions

This PR proposes to set the system encoding to UTF-8. For some reason, GitHub Actions machines appear to have changed their default to ASCII. As a result, default encoding/decoding in Python, e.g. `"a".encode()`, uses ASCII, and Sphinx appears to depend on that.

To recover the GitHub Actions build.

No, dev-only.

Tested in #32046

Closes #32047 from HyukjinKwon/SPARK-34951.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 82ad2f9)
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Apr 4, 2021
…to recover the Sphinx build in GitHub Actions

flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
…to recover the Sphinx build in GitHub Actions

This PR proposes to set the system encoding to UTF-8. For some reason, GitHub Actions machines appear to have changed their default to ASCII. As a result, default encoding/decoding in Python, e.g. `"a".encode()`, uses ASCII, and Sphinx appears to depend on that.

To recover the GitHub Actions build.

No, dev-only.

Tested in apache#32046

Closes apache#32047 from HyukjinKwon/SPARK-34951.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 82ad2f9)
Signed-off-by: HyukjinKwon <[email protected]>
@HyukjinKwon HyukjinKwon deleted the SPARK-34951 branch January 4, 2022 00:54