Skip to content

Conversation

@NewUserHa
Copy link
Contributor

Advantages:

  • use new standard library, to encourage users to also use stdlibs.
d=list(range(50000))

list(concurrent.futures.process._get_chunks(d, chunksize=1))
# 105 ms ± 4.43 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

list(_get_chunks(d, chunksize=1))
# 5.87 ms ± 378 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
  • performance. the difference is significant when chunksize is small, and the chunksize is usually =1.

Copy link
Member

@AA-Turner AA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine -- there's a check that chunksize is >=1 in ProcessPoolExecutor.map.

A

@AA-Turner AA-Turner changed the title update concurrent.features.process get_chunks to use new itertools.batched() Use itertools.batched() in concurrent.futures.process._get_chunks Jan 18, 2024
@serhiy-storchaka
Copy link
Member

You can inline the it variable, and the inline _get_chunks() itself. This will remove some overhead for packing/unpacking *iterables.

@NewUserHa
Copy link
Contributor Author

inline _get_chunks() itself means moving its content back to map()?

@serhiy-storchaka
Copy link
Member

Yes, since it is a short one-liner.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@serhiy-storchaka serhiy-storchaka changed the title Use itertools.batched() in concurrent.futures.process._get_chunks Simplify concurrent.futures.process code by using itertools.batched() Jan 26, 2024
@serhiy-storchaka serhiy-storchaka merged commit 547c135 into python:main Jan 27, 2024
@serhiy-storchaka
Copy link
Member

Thank you for your contribution, @NewUserHa.

But note that usually all but the most trivial changes require opening an issue, and if it is an optimization, its effect should be demonstrated in benchmarks. Not only that some line in a tight loop becomes faster, but that the stdlib code that include this line becomes faster.

I accepted this change as exception, because it makes the code simpler and adds a demonstration of a new itertools function. But the bar may be higher for other changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants