@reactormonk
Contributor

Unless the downloaded file is opened in binary mode, the script will crash with

- Downloading boto...
Traceback (most recent call last):
  File "ec2/spark_ec2.py", line 148, in <module>
    setup_external_libs(external_libs)
  File "ec2/spark_ec2.py", line 128, in setup_external_libs
    if hashlib.md5(tar.read()).hexdigest() != lib["md5"]:
  File "/usr/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

This happens when the environment is set to a UTF-8 locale.
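
The patch itself is not shown in this conversation, so the following is only a minimal sketch of the failure and the fix described above, not the actual change to `ec2/spark_ec2.py`. The helper name `md5_of_file` and the fake archive bytes are hypothetical. It assumes Python 3, where text-mode `open` decodes the file contents and fails on the gzip magic bytes (`0x1f 0x8b`), while `"rb"` returns raw bytes that `hashlib.md5` can consume directly.

```python
import hashlib
import os
import tempfile


def md5_of_file(path):
    """Return the hex MD5 digest of a file, reading it as raw bytes."""
    # "rb" yields bytes, so no codec ever tries to decode the archive.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


# Hypothetical stand-in for a downloaded .tar.gz: gzip data starts with 0x1f 0x8b.
payload = b"\x1f\x8b\x08\x00" + b"not a real archive"

with tempfile.NamedTemporaryFile(suffix=".tar.gz", delete=False) as tmp:
    tmp.write(payload)
    archive_path = tmp.name

try:
    # Text mode with UTF-8 (what a UTF-8 locale selects by default)
    # reproduces the UnicodeDecodeError from the traceback.
    text_mode_failed = False
    try:
        with open(archive_path, "r", encoding="utf-8") as f:
            f.read()
    except UnicodeDecodeError:
        text_mode_failed = True

    # Binary mode reads the same file without any decoding step.
    digest = md5_of_file(archive_path)
finally:
    os.remove(archive_path)
```

The encoding is passed explicitly here only so the failure reproduces regardless of the locale of the machine running the sketch.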

@AmplabJenkins

Can one of the admins verify this patch?

@shivaram
Contributor

shivaram commented Jul 3, 2015

Could you open a JIRA for this? See https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark for more details.

cc @nchammas

@reactormonk
Contributor Author

Looks too trivial to jump through the hoops of JIRA.

@JoshRosen
Contributor

I think the motivations for JIRA tickets are:

  • JIRA helps us track where a fix has been applied; this is important if a fix needs to be applied to multiple maintenance branches, and it is also helpful when a fix is reverted.
  • The contributor credits in our release notes are automatically generated from JIRA.

@nchammas
Contributor

nchammas commented Jul 6, 2015

LGTM

@shivaram
Contributor

shivaram commented Jul 6, 2015

Could you add [SPARK-8821] [EC2] to the PR title?

@JoshRosen
Contributor

Jenkins, this is ok to test.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Jul 6, 2015

Test build #36590 has started for PR 7215 at commit e86957a.

@reactormonk changed the title from "Switched to binary mode for file reading" to "[SPARK-8821] [EC2] Switched to binary mode for file reading" on Jul 6, 2015

@SparkQA

SparkQA commented Jul 6, 2015

Test build #36590 has finished for PR 7215 at commit e86957a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@JoshRosen
Contributor

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Jul 6, 2015

Test build #36601 has started for PR 7215 at commit e86957a.

@SparkQA

SparkQA commented Jul 6, 2015

Test build #36601 has finished for PR 7215 at commit e86957a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@reactormonk
Contributor Author

There doesn't even seem to be any reference to ec2 in the test output.

@shivaram
Contributor

shivaram commented Jul 6, 2015

Jenkins, retest this please

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Jul 6, 2015

Test build #36617 has started for PR 7215 at commit e86957a.

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36617 has finished for PR 7215 at commit e86957a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@srowen
Member

srowen commented Jul 7, 2015

Yeah, it's not related. You can see the failure is related to SQL. I don't think ec2 is tested. In fact, I think this whole bit is moving out of apache/spark? So, LGTM

@shivaram
Contributor

shivaram commented Jul 7, 2015

Yeah EC2 is not tested by jenkins -- Merging this

asfgit pushed a commit that referenced this pull request Jul 7, 2015
Otherwise the script will crash with

    - Downloading boto...
    Traceback (most recent call last):
      File "ec2/spark_ec2.py", line 148, in <module>
        setup_external_libs(external_libs)
      File "ec2/spark_ec2.py", line 128, in setup_external_libs
        if hashlib.md5(tar.read()).hexdigest() != lib["md5"]:
      File "/usr/lib/python3.4/codecs.py", line 319, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

In case of an utf8 env setting.

Author: Simon Hafner <[email protected]>

Closes #7215 from reactormonk/branch-1.4 and squashes the following commits:

e86957a [Simon Hafner] [SPARK-8821] [EC2] Switched to binary mode

@asfgit closed this in 70beb80 on Jul 7, 2015
@reactormonk deleted the branch-1.4 branch on July 7, 2015 18:10