Skip to content

Conversation

@crazybolillo
Copy link
Contributor

Type of Changes

Type
✨ New feature

Description

A new message has been added to suggest refactoring when yielding from a for loop can be replaced by 'yield from'. This message is only emitted when there are no other statements in the containing for loop.

Closes #9229.

@crazybolillo
Copy link
Contributor Author

crazybolillo commented Feb 8, 2024

I may be missing some test cases and also wonder what the appropriate confidence level should be. I don't think it should be UNDEFINED. Any review is appreciated.

@DanielNoord
Copy link
Collaborator

I may be missing some test cases and also wonder what the appropriate confidence level should be. I don't think it should be UNDEFINED. Any review is appreciated.

confidence.HIGH would be appropriate here. As you can see you will need to update some of the existing test cases for this to work.

@Pierre-Sassoulas Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Feb 8, 2024
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you ! The two new messages in pylint make sense and are not false positives. I think we need to fix those two violations and fix the tests to be able to run the primer and see how it fares on more diverse python projects (we do not run the primer if other tests fails).

@crazybolillo
Copy link
Contributor Author

So I think unit tests should be fixed. Also found some cases I had not considered and improved the checker in general. Still I think I need to do a little bit more work to handle cases like:

for item in [1, 2, 3]:
    yield item

@crazybolillo
Copy link
Contributor Author

Also realized I need to add a docs section for the message. Will work on that as well.

@codecov
Copy link

codecov bot commented Feb 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (659a32f) 95.80% compared to head (2a06726) 95.81%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #9419   +/-   ##
=======================================
  Coverage   95.80%   95.81%           
=======================================
  Files         173      173           
  Lines       18776    18788   +12     
=======================================
+ Hits        17989    18001   +12     
  Misses        787      787           
Files Coverage Δ
pylint/checkers/refactoring/refactoring_checker.py 98.26% <100.00%> (+0.02%) ⬆️

@crazybolillo
Copy link
Contributor Author

I tried to add more logic to obtain a human-friendly name for other sort of iterators in emitted messages but I thought the simplicity of simply not showing it outweighed the extra-code, after all the documentation good/bad examples make it very clear and it is not a very complex message (IMO).

Therefore the message now just suggests use 'yield from'. This makes the actual code to implement the code short and straight to the point. Let me know if you agree on this or prefer to show the iterator's name in the emitted message.

Also, I ran my branch over pylint source code and it did not emit use-yield-from messages. Maybe I am missing something? Or what were the two errors that were mentioned?

Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! One last nit.

@@ -0,0 +1,39 @@
# pylint: disable=missing-docstring, import-error
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add

Suggested change
# pylint: disable=missing-docstring, import-error
yield 1

Although it is a syntax error I would like to check that pylint doesn't crash on it.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, the coverage and primers aren't working though. Could you try to rebase on upstream main ?

Emitted when a boolean condition can be simplified to a constant value.
:simplify-boolean-expression (R1709): *Boolean expression may be simplified to %s*
Emitted when redundant pre-python 2.5 ternary syntax is used.
:use-yield-from (R1737): *Consider directly using 'yield from' instead*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a distinction between "use" and "consider using", we should not mix both. Is there something to consider or is using yield from always better ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should only emit messages on very clear cases like this one:

for x in gen:
    yield x

And in those cases I think yield from is always better. I see this message as similar to use-dict-literal.

@jacobtylerwalls
Copy link
Member

jacobtylerwalls commented Feb 10, 2024

Looks great, the coverage and primers aren't working though.

It's because there hasn't been a commit on main in five days, and all of the primer caches were evicted due to space constraints. I just manually reran the primer / main job, so when it finishes, we can try these ones again.

@Pierre-Sassoulas
Copy link
Member

It's because there hasn't been a commit on main in five days,

Damn, it's the first slow week since forever, we never encountered this issue before 😄 (it was a 'pytest' week for me). Thanks for the explanation Jacob.

@github-actions

This comment has been minimized.

@jacobtylerwalls
Copy link
Member

Looks like a FP in black.

@jacobtylerwalls
Copy link
Member

I'm not fluent with async for (see https://github.com/django/django/blob/3580b47ed31ec85ae89b13618f36bb463e97acc8/django/test/client.py#L132). Would that be a behavior change if changed to simply yield from?

A new message has been added to suggest refactoring when yielding from a
for loop can be replaced by 'yield from'. This message is only emitted
when there are no other statements in the containing for loop.

Closes pylint-dev#9229.
It is also possible to call attributes so the case has been handled
regardless of the nesting level.

Yield value may not always be a name as the result of function calls or other elements may be
yielded, therefore nodes will be discarded if they don't yield a name.
There are too many possible generators to use in a for loop and handling
all the cases to provide a human friendly display name in emitted
messages seems too cumbersome compared to the benefit it brings,
specially considering this message will have a doc page with examples.
The corresponding good/bad examples have been added and the message
description touched up a bit.
Got to carried away deleting code and deleted the most important
comparison to see if what is yielded are the items iterated on. Also
added more cases to the test file.
@crazybolillo
Copy link
Contributor Author

crazybolillo commented Feb 10, 2024

I'm not fluent with async for (see https://github.com/django/django/blob/3580b47ed31ec85ae89b13618f36bb463e97acc8/django/test/client.py#L132). Would that be a behavior change if changed to simply yield from?

Good catch, I would not consider myself fluent either but in this case yield from is not even supported inside async functions, so I will work on handling that case.

@github-actions

This comment has been minimized.

It is invalid to use 'yield from' in async functions. Replacing yielding
from an 'async for' loop may not be the same as using 'yield from' so
the message won't be emitted for async loops as well.
@github-actions

This comment has been minimized.

@crazybolillo
Copy link
Contributor Author

So I handled the async case and all primer messages seem valid to me.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing the async case, the primer looks very promising now !

"R1737": (
"Consider directly using 'yield from' instead",
"use-yield-from",
"Emitted when yielding from a loop can be replaced by yielding from the iterator directly.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Emitted when yielding from a loop can be replaced by yielding from the iterator directly.",
"yielding from a loop is slower than yielding from the iterator directly.",

(? not sure if actually true all the time, but I suppose it's the case. At least it's going to be more elegant and understandable than the loop.)

I know almost all our messages talk about the "when" it's raised, when the "why" it's raised is a lot more interesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I had thought about stating possible perf improvements but was not sure whether to say "possibly faster" or just state it. I guess we can go with this because it certainly won't be any slower. Besides explaining it is faster I also mentioned the cleaner code part.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometime it's surprising, it would be better to add something in related.rst Found https://copyprogramming.com/howto/in-python-is-faster-using-yield-from-than-yield-in-a-loop it would be better if it's in depth and by someone like a python core dev but I didn't find this in a reasonable time.

@crazybolillo
Copy link
Contributor Author

Applied the requested changes.

@github-actions

This comment has been minimized.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you. Let me know what you think of the article I found for related.rst, or if you found something more convincing. This is going to be a nice addition to 3.1.0 !

@Pierre-Sassoulas Pierre-Sassoulas added this to the 3.1.0 milestone Feb 10, 2024
@crazybolillo
Copy link
Contributor Author

So I could not find anything too convincing regarding the performance of yield from. I added PEP 380 to the related links, it has a section on "optimizations". However I found this post on stack overflow that states the optimizations have not been implemented (at least of python 3.6):

https://stackoverflow.com/questions/43591542/does-cpython-implement-the-mentioned-optimizations-from-pep-380

I don't trust my CPython reading abilities so I can't confirm whether they have been implemented on newer versions. I did play around with timeit to compare the performance between the two approaches, I added the snippet to details.rst.

As we know measuring performance is not that easy but I think we can say that yield from is at least as fast if not a little bit faster on occasions.

@github-actions
Copy link
Contributor

🤖 Effect of this PR on checked open source code: 🤖

Effect on astroid:
The following messages are now emitted:

  1. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/pylint-dev/astroid/blob/35dba66b7739b8fc0b25e409ca07da9a62d9ee34/astroid/manager.py#L405

Effect on home-assistant:
The following messages are now emitted:

  1. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/home-assistant/core/blob/3441b93c5c1c639dd879a67bc5f82ca125f2cd3a/homeassistant/components/stream/worker.py#L424

Effect on music21:
The following messages are now emitted:

  1. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/cuthbertLab/music21/blob/0927e39231a8a748a78f0f6499dd246942fed0e4/music21/common/objects.py#L70
  2. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/cuthbertLab/music21/blob/0927e39231a8a748a78f0f6499dd246942fed0e4/music21/test/treeYield.py#L69
  3. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/cuthbertLab/music21/blob/0927e39231a8a748a78f0f6499dd246942fed0e4/music21/test/treeYield.py#L77
  4. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/cuthbertLab/music21/blob/0927e39231a8a748a78f0f6499dd246942fed0e4/music21/test/treeYield.py#L98
  5. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/cuthbertLab/music21/blob/0927e39231a8a748a78f0f6499dd246942fed0e4/music21/tree/core.py#L522
  6. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/cuthbertLab/music21/blob/0927e39231a8a748a78f0f6499dd246942fed0e4/music21/tree/core.py#L526
  7. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/cuthbertLab/music21/blob/0927e39231a8a748a78f0f6499dd246942fed0e4/music21/stream/iterator.py#L511

Effect on pytest:
The following messages are now emitted:

  1. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/pytest-dev/pytest/blob/7690a0ddf15785f0af0e9a4cf8524379c53cd061/src/_pytest/_py/path.py#L163
  2. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/pytest-dev/pytest/blob/7690a0ddf15785f0af0e9a4cf8524379c53cd061/src/_pytest/_py/path.py#L170

Effect on pandas:
The following messages are now emitted:

  1. use-yield-from:
    Use 'yield from' directly instead of yielding each element one by one
    https://github.com/pandas-dev/pandas/blob/862d4318b4d0a1f052afd796901bcc4038761f2c/pandas/core/arrays/masked.py#L336

This comment was generated for commit 2a06726

Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, a nice addition!

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great new checker, one of the main new features of 3.1.0 🎉

@Pierre-Sassoulas Pierre-Sassoulas merged commit 18013eb into pylint-dev:main Feb 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Enhancement ✨ Improvement to a component

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New check: use yield from

4 participants