[SPARK-11222][Build][Python] Python document style checker added #20378
Pulling functionality from apache spark
pull latest from apache spark
Pulling functionality from apache spark
Pulling functionality from apache spark
pull request from apache/master
pull latest from apache spark
pull latest from apache spark
Pull apache spark
pull latest apache spark
Apache spark pull latest
|
Test build #86563 has finished for PR 20378 at commit
|
HyukjinKwon
left a comment
The idea itself seems fine, but how many instances are there to fix? I think we should decide which rules to include and which to exclude. Also, I took a quick look, and it seems this scans all files, including non-Python files?
|
Thanks @HyukjinKwon for the review.
|
Please give me a few days. Let me take a closer look at this.
|
So, it seems we got: I think we can take in: Not sure on: and take out (for now) Also, I think we can exclude cloudpickle.py, heapq3.py, shared.py, python/docs/conf.py, work//.py, and ./python/.eggs/*, as we do for pep8.
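For illustration, the exclusion list suggested above could be applied as a simple path filter. This is only a sketch under stated assumptions: `files_to_check` and `EXCLUDED_PATTERNS` are hypothetical names, the glob patterns are one guess at how the named files would be matched, and the `work//.py` entry is left out because its exact pattern is not recoverable from the comment. It also restricts the scan to `.py` files, which addresses the earlier concern about non-Python files being checked.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns derived from the files named in the comment above.
# The "work//.py" entry is intentionally omitted: its pattern is unclear.
EXCLUDED_PATTERNS = [
    "*/cloudpickle.py",
    "*/heapq3.py",
    "*/shared.py",
    "python/docs/conf.py",
    "python/.eggs/*",
]


def files_to_check(paths):
    """Return the Python files that are not excluded from the docstring check."""
    selected = []
    for path in paths:
        # Treat "./python/x.py" and "python/x.py" as the same path.
        normalized = path[2:] if path.startswith("./") else path
        if not normalized.endswith(".py"):
            continue  # skip non-Python files entirely
        if any(fnmatchcase(normalized, pattern) for pattern in EXCLUDED_PATTERNS):
            continue  # skip files on the exclusion list
        selected.append(normalized)
    return selected
```

Note that `fnmatchcase` lets `*` match across directory separators, so `python/.eggs/*` covers nested files as well.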
|
Hey @holdenk, @ueshin, @viirya, @icexelloss, @felixcheung, @BryanCutler and @MrBago. What do you guys think about checking docstrings and the list above? I think this could prevent nitpicking, and the idea itself seems good. One vague concern is that it might make backporting very hard.
|
Thanks @HyukjinKwon for your update. @HyukjinKwon @holdenk @ueshin @viirya @icexelloss @felixcheung @BryanCutler and @MrBago - While you are thinking about it, below is my analysis. As I understand it, there are two things that the JIRA "seems" to be calling out. Please validate.
While working on it, I found that docstring style itself was not enforced at all, and that includes doctest style. Meanwhile, I looked into and tested different configurations of the epytext/sphinx extensions to see if we can achieve surpassing doctests in docs via them. So I played around with sphinx extensions in conf.py and tested different configs, e.g.: None of the options I tried let me surpass doctests in the docs (_build/html) once the build is done. Thanks for thinking this over.
|
I like this idea too, but it seems there are too many violating files, so we can't enable this for now.
if not changed_files or any(f.endswith("lint-python")
                            or f.endswith("tox.ini")
                            or f.endswith(".py")
                            for f in changed_files):
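The condition in this hunk can be read as a small predicate: run the Python lint step when no change list is available, or when any changed file is the lint script, `tox.ini`, or a `.py` file. A minimal sketch, assuming a hypothetical wrapper name `should_run_python_lint` (the surrounding function is not shown in the diff):

```python
def should_run_python_lint(changed_files):
    """Return True when the Python lint step should run for this change set."""
    if not changed_files:
        # No change list available: be conservative and run the lint.
        return True
    # str.endswith accepts a tuple, which is equivalent to the chained "or"s.
    return any(
        f.endswith(("lint-python", "tox.ini", ".py")) for f in changed_files
    )
```

Passing a tuple to `str.endswith` behaves the same as the three chained checks and keeps the list of triggers in one place.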
Can you resolve the conflict? Looks like this change is already merged from #20338.
|
One question I have is: do the current violations cause significant documentation errors? Overall this is a good idea. However, is it worth enforcing this, given the effort of fixing the violations and the added difficulty of backporting in the future?
I think this is a good point. Maybe we could at least enable the rules that fix actual significant problems.
# Using pep257.py which is the single file version of pydocstyle.
PYDOCSTYLE_VERSION="0.2.1"
PYDOCSTYLE_SCRIPT_PATH="$SPARK_ROOT_DIR/dev/pydocstyle-$PYDOCSTYLE_VERSION.py"
PYDOCSTYLE_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/PyCQA/pydocstyle/$PYDOCSTYLE_VERSION/pep257.py"
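To make the intent of these variables concrete, here is a rough Python sketch of fetching the pinned single-file checker. It is not the PR's shell code: the function name `ensure_pydocstyle_script` is hypothetical, the "download only if the file is missing" behaviour is an assumption (the diff shown here does not include the download step), and the `fetch` parameter exists only so the logic can be exercised without network access.

```python
import os
from urllib.request import urlretrieve

PYDOCSTYLE_VERSION = "0.2.1"  # version pinned in the hunk above


def ensure_pydocstyle_script(spark_root, version=PYDOCSTYLE_VERSION,
                             fetch=urlretrieve):
    """Return the local path of the checker, fetching it when absent."""
    script_path = os.path.join(spark_root, "dev",
                               "pydocstyle-%s.py" % version)
    remote_path = ("https://raw.githubusercontent.com/PyCQA/pydocstyle/"
                   "%s/pep257.py" % version)
    # Assumption: skip the download when a copy already exists locally.
    if not os.path.exists(script_path):
        fetch(remote_path, script_path)
    return script_path
```

Keeping the version in the file name means that bumping `PYDOCSTYLE_VERSION` naturally triggers a fresh download instead of reusing a stale copy.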
Just wondering if this is an official channel to get this script from? I see the pep8 download has a note above about using PyPI.
Thanks @BryanCutler for your input. pep8 is replaced by pycodestyle from the official PyPI channel in PR #20338. This PR is for doc style alone, which is not maintained within pycodestyle, as per the maintainers of that project - PyCQA/pycodestyle#723
On the other hand, there is value in this, and we could try to set up the latest pydocstyle from git (it would need to be set up on the fly) rather than using the single-file version.
|
I looked at the example output to see what the errors were. Specifically looking at it gave I think a lot of docstrings are purposely formatted this way. What is the format that it is looking for here?
|
pydocstyle seems to claim to follow PEP 257 - https://www.python.org/dev/peps/pep-0257. One option, given #20378 (comment) and #20378 (comment), might be to note in http://spark.apache.org/contributing.html that we follow PEP 257 and then enable only the rules that catch actual problems.
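As a rough answer to the question of what format is expected, here is a small example written to follow the main PEP 257 conventions: triple double quotes, a one-line summary ending in a period, a blank line before any further description, and closing quotes on their own line for a multi-line docstring. The function `scale` is purely illustrative and is not taken from the Spark code base, and this is not a statement of exactly which checks pep257.py 0.2.1 applies.

```python
def scale(values, factor):
    """Multiply every element of ``values`` by ``factor``.

    The summary line above fits on one line, ends with a period, and is
    separated from this longer description by a blank line.
    """
    return [v * factor for v in values]
```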
|
I share the same concern about backporting. If we decide to make a large number of format changes, should we consider backporting them in one batch so that future backporting is easier?
|
@rekhajoshm, would you mind taking a quick look to identify the ones causing actual problems? It might be kind of a grunt job, though.
|
If you happen to be too busy to work on this one, I can take it when I have some time. That's fine. Either way works.
|
@HyukjinKwon I will check it over the weekend. Thanks.
|
|
# Get PYDOCSTYLE at runtime so that we don't rely on it being installed on the build server.
# Using pep257.py which is the single file version of pydocstyle.
PYDOCSTYLE_VERSION="0.2.1"
Btw, seems like the latest version of pydocstyle is 2.1.1. Should we use it instead?
As called out earlier, this is the single-file Python doc style checker; the latest version does not ship a single-file checker that can be included.
|
@HyukjinKwon Identifying docstyle failures does not help much, as it is not straightforward to exclude them in this version.
|
@HyukjinKwon @holdenk @ueshin @viirya @icexelloss @felixcheung @BryanCutler and @MrBago - This was one of the possible approaches that I was running by you. I have proposed another approach at #20556 with features as below -
|
What changes were proposed in this pull request?
Using pydocstyle for python document style checker
PyCQA/pycodestyle#723
How was this patch tested?
./dev/run-tests