Skip to content

Commit f17d43b

Browse files
nchammasJoshRosen
authored andcommitted
[SPARK-6219] [Build] Check that Python code compiles
This PR expands the Python lint checks so that they check for obvious compilation errors in our Python code. For example: ``` $ ./dev/lint-python Python lint checks failed. Compiling ./ec2/spark_ec2.py ... File "./ec2/spark_ec2.py", line 618 return (master_nodes,, slave_nodes) ^ SyntaxError: invalid syntax ./ec2/spark_ec2.py:618:25: E231 missing whitespace after ',' ./ec2/spark_ec2.py:1117:101: E501 line too long (102 > 100 characters) ``` This PR also bumps up the version of `pep8`. It ignores new types of checks introduced by that version bump while fixing problems missed by the older version of `pep8` we were using. Author: Nicholas Chammas <[email protected]> Closes #4941 from nchammas/compile-spark-ec2 and squashes the following commits: 75e31d8 [Nicholas Chammas] upgrade pep8 + check compile b33651c [Nicholas Chammas] PEP8 line length
1 parent 3b5aaa6 commit f17d43b

File tree

2 files changed

+29
-19
lines changed

2 files changed

+29
-19
lines changed

dev/lint-python

Lines changed: 27 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -19,43 +19,53 @@
1919

2020
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
2121
SPARK_ROOT_DIR="$(dirname "$SCRIPT_DIR")"
22-
PEP8_REPORT_PATH="$SPARK_ROOT_DIR/dev/pep8-report.txt"
22+
PATHS_TO_CHECK="./python/pyspark/ ./ec2/spark_ec2.py ./examples/src/main/python/"
23+
PYTHON_LINT_REPORT_PATH="$SPARK_ROOT_DIR/dev/python-lint-report.txt"
2324

2425
cd "$SPARK_ROOT_DIR"
2526

27+
# compileall: https://docs.python.org/2/library/compileall.html
28+
python -B -m compileall -q -l $PATHS_TO_CHECK > "$PYTHON_LINT_REPORT_PATH"
29+
compile_status="${PIPESTATUS[0]}"
30+
2631
# Get pep8 at runtime so that we don't rely on it being installed on the build server.
2732
#+ See: https://github.com/apache/spark/pull/1744#issuecomment-50982162
2833
#+ TODOs:
29-
#+ - Dynamically determine latest release version of pep8 and use that.
30-
#+ - Download this from a more reliable source. (GitHub raw can be flaky, apparently. (?))
34+
#+ - Download pep8 from PyPI. It's more "official".
3135
PEP8_SCRIPT_PATH="$SPARK_ROOT_DIR/dev/pep8.py"
32-
PEP8_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/jcrocholl/pep8/1.5.7/pep8.py"
33-
PEP8_PATHS_TO_CHECK="./python/pyspark/ ./ec2/spark_ec2.py ./examples/src/main/python/"
36+
PEP8_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/jcrocholl/pep8/1.6.2/pep8.py"
3437

38+
# if [ ! -e "$PEP8_SCRIPT_PATH" ]; then
3539
curl --silent -o "$PEP8_SCRIPT_PATH" "$PEP8_SCRIPT_REMOTE_PATH"
36-
curl_status=$?
40+
curl_status="$?"
3741

38-
if [ $curl_status -ne 0 ]; then
42+
if [ "$curl_status" -ne 0 ]; then
3943
echo "Failed to download pep8.py from \"$PEP8_SCRIPT_REMOTE_PATH\"."
40-
exit $curl_status
44+
exit "$curl_status"
4145
fi
42-
46+
# fi
4347

4448
# There is no need to write this output to a file
4549
#+ first, but we do so so that the check status can
4650
#+ be output before the report, like with the
4751
#+ scalastyle and RAT checks.
48-
python "$PEP8_SCRIPT_PATH" $PEP8_PATHS_TO_CHECK > "$PEP8_REPORT_PATH"
49-
pep8_status=${PIPESTATUS[0]} #$?
52+
python "$PEP8_SCRIPT_PATH" --ignore=E402,E731,E241,W503,E226 $PATHS_TO_CHECK >> "$PYTHON_LINT_REPORT_PATH"
53+
pep8_status="${PIPESTATUS[0]}"
54+
55+
if [ "$compile_status" -eq 0 -a "$pep8_status" -eq 0 ]; then
56+
lint_status=0
57+
else
58+
lint_status=1
59+
fi
5060

51-
if [ $pep8_status -ne 0 ]; then
52-
echo "PEP 8 checks failed."
53-
cat "$PEP8_REPORT_PATH"
61+
if [ "$lint_status" -ne 0 ]; then
62+
echo "Python lint checks failed."
63+
cat "$PYTHON_LINT_REPORT_PATH"
5464
else
55-
echo "PEP 8 checks passed."
65+
echo "Python lint checks passed."
5666
fi
5767

58-
rm "$PEP8_REPORT_PATH"
5968
rm "$PEP8_SCRIPT_PATH"
69+
rm "$PYTHON_LINT_REPORT_PATH"
6070

61-
exit $pep8_status
71+
exit "$lint_status"

ec2/spark_ec2.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1159,8 +1159,8 @@ def real_main():
11591159
if EC2_INSTANCE_TYPES[opts.instance_type] != \
11601160
EC2_INSTANCE_TYPES[opts.master_instance_type]:
11611161
print >> stderr, \
1162-
"Error: spark-ec2 currently does not support having a master and slaves with " + \
1163-
"different AMI virtualization types."
1162+
"Error: spark-ec2 currently does not support having a master and slaves " + \
1163+
"with different AMI virtualization types."
11641164
print >> stderr, "master instance virtualization type: {t}".format(
11651165
t=EC2_INSTANCE_TYPES[opts.master_instance_type])
11661166
print >> stderr, "slave instance virtualization type: {t}".format(

0 commit comments

Comments
 (0)