@HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Mar 24, 2020

What changes were proposed in this pull request?

For a bit of background,
The pip packaging test started to fail (see these logs) as of the setuptools 46.1.0 release. In pypa/setuptools#1424, the maintainers decided not to preserve file modes in package_data.

In the PySpark pip installation, we keep the executable scripts in package_data (spark/python/setup.py, lines 199 to 200 in fc4e56a):

'pyspark.bin': ['*'],
'pyspark.sbin': ['spark-config.sh', 'spark-daemon.sh',

and expose their symbolic links as executable scripts.

So the symbolic links (or copied scripts) execute the scripts copied from package_data, which no longer have the executable permission in their mode:

/tmp/tmp.UmkEGNFdKF/3.6/bin/spark-submit: line 27: /tmp/tmp.UmkEGNFdKF/3.6/lib/python3.6/site-packages/pyspark/bin/spark-class: Permission denied
/tmp/tmp.UmkEGNFdKF/3.6/bin/spark-submit: line 27: exec: /tmp/tmp.UmkEGNFdKF/3.6/lib/python3.6/site-packages/pyspark/bin/spark-class: cannot execute: Permission denied

The current issue is being tracked at pypa/setuptools#2041
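
To make the failure mode concrete, here is a minimal Python sketch, assuming a POSIX file system. The helper names `missing_exec_bit` and `restore_exec_bits` are hypothetical and are not part of PySpark or setuptools. This is not the fix applied by this PR; it only illustrates what "losing the mode" means for a copied script and how the bit could be checked or restored by hand.

```python
import stat
from pathlib import Path


def missing_exec_bit(path: Path) -> bool:
    """Return True if the file lacks the owner-executable permission bit."""
    return not (path.stat().st_mode & stat.S_IXUSR)


def restore_exec_bits(script_dir: Path) -> list:
    """Add execute permission to every regular file directly inside script_dir.

    Returns the names of the files whose mode was changed, sorted by name.
    Hypothetical helper for illustration only.
    """
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    changed = []
    for entry in sorted(script_dir.iterdir()):
        if entry.is_file() and missing_exec_bit(entry):
            # Keep the existing permission bits and add the execute bits.
            current = stat.S_IMODE(entry.stat().st_mode)
            entry.chmod(current | exec_bits)
            changed.append(entry.name)
    return changed
```

Pointing `restore_exec_bits` at an installed `pyspark/bin` directory would be one manual workaround under these assumptions; whether that is appropriate depends on how the package was installed.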


For what this PR proposes:
It pins the setuptools version in the PR builder for now to unblock other PRs. This PR does not solve the issue yet. I will make a fix after monitoring pypa/setuptools#2041.

Why are the changes needed?

The issue currently affects users who use the latest setuptools, so users seem unable to use PySpark with the latest setuptools. See also pypa/setuptools#2041 (comment)

Does this PR introduce any user-facing change?

It makes CI pass for now. No user-facing change yet.

How was this patch tested?

Jenkins will test.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120230 has finished for PR 27995 at commit db53aa5.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120231 has finished for PR 27995 at commit 857cdb4.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120232 has finished for PR 27995 at commit fa9b0e5.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120234 has finished for PR 27995 at commit 37c32d1.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120237 has finished for PR 27995 at commit aa5647f.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120238 has finished for PR 27995 at commit 570a08e.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120245 has finished for PR 27995 at commit cee6d51.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon force-pushed the investigate-pip-packaging branch from cee6d51 to a1fdad8 Compare March 24, 2020 07:12
@SparkQA

SparkQA commented Mar 24, 2020

Test build #120248 has finished for PR 27995 at commit a1fdad8.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon force-pushed the investigate-pip-packaging branch from a1fdad8 to f5a60c6 Compare March 24, 2020 07:37
@HyukjinKwon HyukjinKwon changed the title [DO-NOT-MERGE] Investigate PIP package failure [SPARK-31231][BUILD] Set the upper bound (before 46.1.0) for setuptools in pip package test Mar 24, 2020
@SparkQA

SparkQA commented Mar 24, 2020

Test build #120252 has finished for PR 27995 at commit f5a60c6.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon changed the title [SPARK-31231][BUILD] Set the upper bound (before 46.1.0) for setuptools in pip package test [SPARK-31231][BUILD] Explicitly setuptools version as 46.0.0 in pip package test Mar 24, 2020
@SparkQA

SparkQA commented Mar 24, 2020

Test build #120255 has finished for PR 27995 at commit 01a1cad.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120256 has finished for PR 27995 at commit 0c83437.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Okay .. it passed. I'll clean up and merge to unblock other PRs.

@HyukjinKwon
Member Author

HyukjinKwon commented Mar 24, 2020

Merged to master, branch-3.0, and branch-2.4.

I am not going to resolve the JIRA yet. This PR is only a temporary workaround for CI at the moment.

HyukjinKwon added a commit that referenced this pull request Mar 24, 2020
…ackage test

### What changes were proposed in this pull request?

For a bit of background,
The pip packaging test started to fail (see [these logs](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120218/testReport/)) as of the setuptools 46.1.0 release. In pypa/setuptools#1424, the maintainers decided not to preserve file modes in `package_data`.

In the PySpark pip installation, we keep the executable scripts in `package_data` (https://github.com/apache/spark/blob/fc4e56a54c15e20baf085e6061d3d83f5ce1185d/python/setup.py#L199-L200) and expose their symbolic links as executable scripts.

So the symbolic links (or copied scripts) execute the scripts copied from `package_data`, which no longer have the executable permission in their mode:

```
/tmp/tmp.UmkEGNFdKF/3.6/bin/spark-submit: line 27: /tmp/tmp.UmkEGNFdKF/3.6/lib/python3.6/site-packages/pyspark/bin/spark-class: Permission denied
/tmp/tmp.UmkEGNFdKF/3.6/bin/spark-submit: line 27: exec: /tmp/tmp.UmkEGNFdKF/3.6/lib/python3.6/site-packages/pyspark/bin/spark-class: cannot execute: Permission denied
```

The current issue is being tracked at pypa/setuptools#2041


For what this PR proposes:
It sets an upper bound in the PR builder for now to unblock other PRs. _This PR does not solve the issue yet. I will make a fix after monitoring https://github.com/pypa/setuptools/issues/2041_

### Why are the changes needed?

The issue currently affects users who use the latest setuptools, so _users seem unable to use PySpark with the latest setuptools._ See also pypa/setuptools#2041 (comment)

### Does this PR introduce any user-facing change?

It makes CI pass for now. No user-facing change yet.

### How was this patch tested?

Jenkins will test.

Closes #27995 from HyukjinKwon/investigate-pip-packaging.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit c181c45)
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Mar 24, 2020
…ackage test

@HyukjinKwon
Member Author

@nchammas, @srowen, @dongjoon-hyun, @BryanCutler, FWIW, I think this is one example of the trade-off we discussed at #27928.

setuptools is a dev-only dependency, but it seems the new release exposed a pretty critical issue: judging from pypa/setuptools#2041, users currently can't use pip-installed PySpark. Of course, it also broke the PR builders.

@SparkQA

SparkQA commented Mar 24, 2020

Test build #120259 has finished for PR 27995 at commit b8926bd.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Mar 24, 2020

Was there an issue in 46.1.0? Wouldn't pinning to 46.0.0 have been a good thing then? Or else I misunderstand what broke.

@dongjoon-hyun
Member

Hi, @HyukjinKwon .
This broke branch-2.4 due to the following. I'll revert this from branch-2.4.

UnsatisfiableError: The following specifications were found to be in conflict:
  - python=3.5
  - setuptools=46.0.0 -> python[version='>=3.8,<3.9.0a0'] -> openssl[version='>=1.1.1d,<1.1.2a']
Use "conda search <package> --info" to see the dependencies for each package.

@dongjoon-hyun
Member

Since Apache Spark 2.4 still supports Python 2.7+/3.4+, setuptools 46.0.0 seems to be incompatible with it. Please make a separate PR for branch-2.4.

@nchammas
Contributor

Perhaps the requirement should instead be of the form setuptools < 46.1.0, to allow flexibility for older versions of Python to still find a release of setuptools that satisfies their requirements.
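
The difference between an exact pin and an upper bound can be sketched in a few lines of Python. This is a simplified illustration, not how pip or conda actually resolve versions: it assumes plain dotted numeric releases of equal length, and the function names (`parse_release`, `exact_pin`, `below`, `newest_matching`) as well as the release lists in the usage note are hypothetical.

```python
def parse_release(text):
    """Parse a plain dotted release such as '46.1.0' into a tuple of ints.

    Simplification: real tools follow PEP 440, which also covers
    pre-releases, epochs, and trailing-zero equivalence.
    """
    return tuple(int(part) for part in text.split("."))


def exact_pin(pinned):
    """Accept only the pinned release (like `setuptools==46.0.0`)."""
    return lambda version: parse_release(version) == parse_release(pinned)


def below(bound):
    """Accept any release strictly below the bound (like `setuptools<46.1.0`)."""
    return lambda version: parse_release(version) < parse_release(bound)


def newest_matching(available, accepts):
    """Return the newest release in `available` that `accepts` allows, or None."""
    matches = [version for version in available if accepts(version)]
    return max(matches, key=parse_release) if matches else None
```

For an environment where only an older release is installable (say a made-up list `["40.2.0"]`), `newest_matching(["40.2.0"], exact_pin("46.0.0"))` finds nothing, while `newest_matching(["40.2.0"], below("46.1.0"))` still selects `"40.2.0"`. That is the flexibility the upper bound buys for older Python versions.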

@HyukjinKwon
Member Author

Yeah, I should probably set an upper bound instead. Thanks @dongjoon-hyun and @nchammas

@HyukjinKwon
Member Author

HyukjinKwon commented Mar 24, 2020

@srowen, it seems this can be a problem on the client side with the latest setuptools (see pypa/setuptools#2041). So using a lower version fixed the test for now, but the issue itself persists.

@srowen
Member

srowen commented Mar 25, 2020

I'm still confused. This is an argument for pinning versions, right? Because a new version of setuptools caused a problem.

@HyukjinKwon
Member Author

But we found that issue from the broken PR builder, which seems to be pretty critical.

@HyukjinKwon
Member Author

I simply meant this is an example of the trade-off: the unpinned version caused a problem and broke the PR builder, but it also found the problem early.

HyukjinKwon added a commit that referenced this pull request Mar 26, 2020
…or setuptools in pip package test

## What changes were proposed in this pull request?
This PR is a followup of #27995. Rather than pinning the setuptools version, it sets an upper bound so that the branch-2.4 tests with Python 3.5 can pass too.

## Why are the changes needed?
To make the CI build stable

## Does this PR introduce any user-facing change?
No, dev-only change.

## How was this patch tested?
Jenkins will test.

Closes #28005 from HyukjinKwon/investigate-pip-packaging-followup.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Mar 26, 2020
…or setuptools in pip package test

## What changes were proposed in this pull request?
This PR is a followup of #27995. Rather than pinning the setuptools version, it sets an upper bound so that the branch-2.4 tests with Python 3.5 can pass too.

## Why are the changes needed?
To make the CI build stable

## Does this PR introduce any user-facing change?
No, dev-only change.

## How was this patch tested?
Jenkins will test.

Closes #28005 from HyukjinKwon/investigate-pip-packaging-followup.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 178d472)
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Mar 26, 2020
…or setuptools in pip package test

sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…ackage test

sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…or setuptools in pip package test

@HyukjinKwon HyukjinKwon deleted the investigate-pip-packaging branch July 27, 2020 07:45