GH-140052: Add PyTuple_MakeSingle and PyTuple_MakePair #140132

sergey-miryanov · 2025-10-14T20:29:39Z

As requested by @vstinner, I have opened a separate PR that adds two functions: PyTuple_MakeSingle and PyTuple_MakePair.

Issue: [C API] Replace PyTuple_Pack(1,2) with PyTuple_Make[Single,Pair] to optimize creation of tuples #140052

📚 Documentation preview 📚: https://cpython-previews--140132.org.readthedocs.build/

Misc/NEWS.d/next/Core_and_Builtins/2025-10-15-01-30-28.gh-issue-140052.08spgX.rst

Modules/_testcapi/tuple.c

Doc/c-api/tuple.rst

Objects/tupleobject.c

Doc/data/refcounts.dat

sergey-miryanov · 2025-10-14T21:01:46Z

@vstinner I have made requested changes. Please take a look.

Include/tupleobject.h

sergey-miryanov · 2025-10-14T21:27:19Z

Done, please take a look.

sergey-miryanov · 2025-10-15T21:16:12Z

Microbenchmarks:
Windows 11, i5-11600K @ 3.90GHz

+----------------+---------+-----------------+-----------------------+-----------------------+-----------------------+
| Benchmark      | t       | s               | p                     | a                     | m                     |
+================+=========+=================+=======================+=======================+=======================+
| tuple-1        | 12.8 ns | not significant | 12.2 ns: 1.06x faster | 12.4 ns: 1.04x faster | 11.8 ns: 1.09x faster |
+----------------+---------+-----------------+-----------------------+-----------------------+-----------------------+
| tuple-2        | 14.5 ns | not significant | 13.3 ns: 1.09x faster | 13.2 ns: 1.10x faster | 12.3 ns: 1.18x faster |
+----------------+---------+-----------------+-----------------------+-----------------------+-----------------------+
| Geometric mean | (ref)   | 1.01x slower    | 1.07x faster          | 1.07x faster          | 1.13x faster          |
+----------------+---------+-----------------+-----------------------+-----------------------+-----------------------+

Ubuntu 24.04, gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, same CPU, built with LTO enabled

+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Benchmark      | t       | s                     | p                     | a                     | m                     |
+================+=========+=======================+=======================+=======================+=======================+
| tuple-1        | 15.7 ns | 14.5 ns: 1.08x faster | 12.7 ns: 1.24x faster | 11.5 ns: 1.36x faster | 11.2 ns: 1.40x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| tuple-2        | 22.6 ns | 20.1 ns: 1.12x faster | 16.0 ns: 1.41x faster | 14.2 ns: 1.59x faster | 15.0 ns: 1.51x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Geometric mean | (ref)   | 1.10x faster          | 1.32x faster          | 1.47x faster          | 1.45x faster          |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+

t - PyTuple_New + PyTuple_SetItem
s - PyTuple_New + PyTuple_SET_ITEM
p - PyTuple_Pack
a - PyTuple_FromArray
m - PyTuple_Make[Single,Pair]
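
The "Geometric mean" row can be cross-checked from the per-benchmark speedups. A minimal sketch, assuming the two-decimal ratios printed in the Linux table above are used as input (pyperf itself works from the raw timings, so the last digit could in principle differ):

```python
from statistics import geometric_mean

# Speedups relative to the reference column "t", copied from the Linux table.
# Keys are the legend letters; values are the (tuple-1, tuple-2) ratios.
linux_speedups = {
    "s": (1.08, 1.12),
    "p": (1.24, 1.41),
    "a": (1.36, 1.59),
    "m": (1.40, 1.51),
}


def summarize(speedups):
    """Return the geometric mean of each column, rounded to two decimals."""
    return {
        name: round(geometric_mean(ratios), 2)
        for name, ratios in speedups.items()
    }


if __name__ == "__main__":
    for name, mean in summarize(linux_speedups).items():
        print(f"{name}: {mean:.2f}x faster")
```

With these inputs the computed means agree with the table row (1.10x, 1.32x, 1.47x, 1.45x).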

Microbenchmarks - sergey-miryanov@07f7c6a

run scripts

bench_tuple.py

import pyperf
import _testcapi
import functools
runner = pyperf.Runner()
for size in (1, 2):
    func = functools.partial(_testcapi.bench_tuple, size)
    runner.bench_time_func(f'tuple-{size}', func)

bench_steal.py

import pyperf
import _testcapi
import functools
runner = pyperf.Runner()
for size in (1, 2):
    func = functools.partial(_testcapi.bench_tuple_steal, size)
    runner.bench_time_func(f'tuple-{size}', func)

bench_pack.py

import pyperf
import _testcapi
import functools
runner = pyperf.Runner()
for size in (1, 2):
    func = functools.partial(_testcapi.bench_tuple_pack, size)
    runner.bench_time_func(f'tuple-{size}', func)

bench_from_array.py

import pyperf
import _testcapi
import functools
runner = pyperf.Runner()
for size in (1, 2):
    func = functools.partial(_testcapi.bench_tuple_from_array, size)
    runner.bench_time_func(f'tuple-{size}', func)

bench_make.py

import pyperf
import _testcapi
import functools
runner = pyperf.Runner()
for size in (1, 2):
    func = functools.partial(_testcapi.bench_tuple_make, size)
    runner.bench_time_func(f'tuple-{size}', func)
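
Each script hands pyperf a time function: bench_time_func calls it with a loop count and expects the total elapsed time in seconds back. Because functools.partial binds size first, the C helper ends up being called as helper(size, loops). The sketch below shows that contract with a pure-Python stand-in; bench_tuple_py is a hypothetical name used only for illustration, since the _testcapi helpers exist only on the benchmark branch.

```python
import functools
import time


def bench_tuple_py(size, loops):
    """Hypothetical stand-in for _testcapi.bench_tuple.

    Builds a tuple of `size` items `loops` times and returns the total
    elapsed time in seconds, which is what a pyperf time function returns.
    """
    item = 0
    start = time.perf_counter()
    for _ in range(loops):
        # Build a tuple of the requested size; the result is discarded.
        tuple([item] * size)
    return time.perf_counter() - start


# Same shape as in the scripts above: bind `size`, leave `loops` to the caller.
func = functools.partial(bench_tuple_py, 2)
elapsed = func(10_000)  # pyperf would pass a loop count of its own choosing
print(f"{elapsed / 10_000 * 1e9:.1f} ns per iteration")
```

The numbers from this stand-in measure Python-level overhead and are not comparable with the C results in the tables.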

eendebakpt · 2025-10-21T22:25:48Z

In the microbenchmarks it seems odd that PyTuple_Pack is faster than PyTuple_New + PyTuple_SET_ITEM. I can think of no reasons for this. Which optimization settings did you use? Looking at the implementation of the microbenchmarks: maybe for PyTuple_Pack the code PyObject *one = PyLong_FromLong(0); is optimized away. What happens if you move the PyLong_FromLong outside the loop?
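
The question above is about what falls inside the timed region. A sketch of the two loop shapes in Python, for illustration only: the real benchmarks are C functions in _testcapi, and the function names below are made up for this example.

```python
import time


def timed_with_setup_inside(loops):
    """Item creation happens inside every timed iteration."""
    start = time.perf_counter()
    for _ in range(loops):
        one = object()      # setup cost is measured together with the tuple
        pair = (one, one)   # the operation we actually care about
    return time.perf_counter() - start


def timed_with_setup_hoisted(loops):
    """Item creation is moved out of the loop, so only tuple building is timed."""
    one = object()          # created once, before the clock starts
    start = time.perf_counter()
    for _ in range(loops):
        pair = (one, one)
    return time.perf_counter() - start


if __name__ == "__main__":
    loops = 100_000
    print("setup inside :", timed_with_setup_inside(loops))
    print("setup hoisted:", timed_with_setup_hoisted(loops))
```

Hoisting the setup makes the variants easier to compare, because each one then times the same amount of non-tuple work.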

Objects/tupleobject.c

sergey-miryanov · 2025-10-23T20:40:55Z

Microbenchmark results from Windows machine (Windows 11, i5-11600K @ 3.90GHz)
Results for Tuple(Long) and Tuple(Long, Long) - tuple will not be tracked.

+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Benchmark      | n       | s                     | p                     | a                     | m                     |
+================+=========+=======================+=======================+=======================+=======================+
| tuple-1        | 14.1 ns | not significant       | 9.79 ns: 1.44x faster | 9.39 ns: 1.50x faster | 8.62 ns: 1.64x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| tuple-2        | 15.7 ns | 16.1 ns: 1.02x slower | 12.3 ns: 1.28x faster | 13.1 ns: 1.20x faster | 12.1 ns: 1.30x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Geometric mean | (ref)   | 1.01x slower          | 1.36x faster          | 1.34x faster          | 1.46x faster          |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+

Results for Tuple(EmptyTuple) and Tuple(EmptyTuple, EmptyTuple) - tuple will be tracked (EmptyTuple is a special case: we don't allocate it)

+----------------+---------+-----------------------+-----------------------+-----------------+-----------------------+
| Benchmark      | tn      | ts                    | tp                    | ta              | tm                    |
+================+=========+=======================+=======================+=================+=======================+
| tuple-1        | 14.4 ns | not significant       | 14.1 ns: 1.02x faster | not significant | 13.9 ns: 1.04x faster |
+----------------+---------+-----------------------+-----------------------+-----------------+-----------------------+
| tuple-2        | 15.1 ns | 15.7 ns: 1.04x slower | 16.0 ns: 1.06x slower | not significant | 14.5 ns: 1.04x faster |
+----------------+---------+-----------------------+-----------------------+-----------------+-----------------------+
| Geometric mean | (ref)   | 1.02x slower          | 1.02x slower          | 1.00x slower    | 1.04x faster          |
+----------------+---------+-----------------------+-----------------------+-----------------+-----------------------+

Notes:

Whether the tuple is tracked by the GC has a significant effect.
The version with PyTuple_Make[Single,Pair] is much faster for tuples that are not tracked, and gains little for tracked ones.
The SET_ITEM version is slower than the SetItem one; I don't understand why.
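
Whether a particular object is tracked can be checked from Python with gc.is_tracked. A small sketch: the tuple entries are only printed, not asserted, because the answer may depend on the CPython version and on how the tuple was created.

```python
import gc


def tracking_report(objects):
    """Map each label to gc.is_tracked() for the corresponding object."""
    return {label: gc.is_tracked(obj) for label, obj in objects.items()}


samples = {
    "int": 0,                            # atomic object: not tracked
    "list": [],                          # mutable container: tracked
    "tuple(int)": tuple([0]),            # result depends on the interpreter
    "tuple(empty tuple)": tuple([()]),   # result depends on the interpreter
}

report = tracking_report(samples)
for label, tracked in report.items():
    print(f"{label:20} tracked={tracked}")
```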

Linux benchmarks will follow a bit later.

Benchmarks here - https://github.com/sergey-miryanov/cpython/tree/140052-pytuple-make-pair-bench

vstinner · 2025-10-23T21:03:13Z

I don't know how to read your benchmark. What are the "a", "p", "s", etc. columns?

vstinner · 2025-10-23T21:05:02Z

Oh, I suppose that letters are the same from previous benchmark: #140132 (comment)

sergey-miryanov · 2025-10-23T21:08:37Z

Yes, they are the same, except that the new benchmarks use n instead of t for PyTuple_New + PyTuple_SetItem.
Sorry.

n - PyTuple_New + PyTuple_SetItem
s - PyTuple_New + PyTuple_SET_ITEM
p - PyTuple_Pack
a - PyTuple_FromArray
m - PyTuple_Make[Single,Pair]

tn, ts, tp, ta, tm - the same variants, for the version where the tuple's items are EmptyTuple.

sergey-miryanov · 2025-10-24T05:12:19Z

Microbenchmark results from Linux (Ubuntu 24.04, gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, i5-11600K @ 3.90GHz)
Results for Tuple(Long) and Tuple(Long, Long) - tuple will not be tracked.

+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Benchmark      | n       | s                     | p                     | a                     | m                     |
+================+=========+=======================+=======================+=======================+=======================+
| tuple-1        | 11.9 ns | 11.8 ns: 1.01x faster | 9.26 ns: 1.29x faster | 8.37 ns: 1.43x faster | 8.70 ns: 1.37x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| tuple-2        | 17.2 ns | 16.5 ns: 1.04x faster | 12.6 ns: 1.37x faster | 11.1 ns: 1.56x faster | 10.7 ns: 1.60x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Geometric mean | (ref)   | 1.03x faster          | 1.33x faster          | 1.49x faster          | 1.48x faster          |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+

Results for Tuple(EmptyTuple) and Tuple(EmptyTuple, EmptyTuple):

+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Benchmark      | tn      | ts                    | tp                    | ta                    | tm                    |
+================+=========+=======================+=======================+=======================+=======================+
| tuple-1        | 12.7 ns | 12.2 ns: 1.04x faster | 10.9 ns: 1.17x faster | 9.70 ns: 1.31x faster | 9.94 ns: 1.28x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| tuple-2        | 18.1 ns | 17.4 ns: 1.04x faster | 13.8 ns: 1.31x faster | 11.6 ns: 1.56x faster | 12.0 ns: 1.51x faster |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+
| Geometric mean | (ref)   | 1.04x faster          | 1.24x faster          | 1.43x faster          | 1.39x faster          |
+----------------+---------+-----------------------+-----------------------+-----------------------+-----------------------+

Notes:

On Linux, whether the tuple is tracked makes little difference.
The version with PyTuple_Make[Single,Pair] is a bit slower than the version with PyTuple_FromArray. I suspect this is because PyTuple_Make[Single,Pair] are not covered by PGO + LTO.
Build params:

./configure --enable-optimizations --with-lto=full --prefix=/home/msn/work/cpython/installed/tuple-make-pair-bench

Legend:

n - PyTuple_New + PyTuple_SetItem
s - PyTuple_New + PyTuple_SET_ITEM
p - PyTuple_Pack
a - PyTuple_FromArray
m - PyTuple_Make[Single,Pair]

tn, ts, tp, ta, tm - the same variants, for the version where the tuple's items are EmptyTuple.

sergey-miryanov · 2025-10-24T05:13:36Z

@eendebakpt I have updated microbenchmarks. Could you please take a look? Are they fair enough now?

sergey-miryanov · 2025-10-24T05:14:12Z

@vstinner This is ready for review. Could you please take a look?

efimov-mikhail · 2025-10-25T12:50:29Z

Lib/test/test_capi/test_tuple.py

+        # because we only check type for gc support can't untrack tuple of
+        # immutable tuples, see maybe_tracked
+        self.assertTrue(gc.is_tracked(make_single((1, 2))))


IMO, it'll be better not to check this, since we can make another decision in the future.

sergey-miryanov · 2025-10-27

We have a small disagreement here.
I think the tests should explicitly pin down the current behavior.
@efimov-mikhail would rather not pin down behavior that may change in the near future, because changing it would then require changing the tests.

We need another opinion on this.
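
The test comment under discussion says the tracking decision looks only at whether an item's type supports GC, so a tuple holding an immutable tuple is still tracked. The sketch below is a hypothetical Python model of such a type-only check, not the PR's code: would_track and type_supports_gc are names made up for this example. The one fact it relies on is that Py_TPFLAGS_HAVE_GC is bit 14 of a type's flags in CPython.

```python
# Py_TPFLAGS_HAVE_GC is bit 14 of tp_flags in CPython's headers.
PY_TPFLAGS_HAVE_GC = 1 << 14


def type_supports_gc(obj):
    """Return True if the object's type participates in cyclic GC."""
    return bool(type(obj).__flags__ & PY_TPFLAGS_HAVE_GC)


def would_track(*items):
    """Hypothetical model: track the new tuple if any item's type supports GC.

    Only the item's type is inspected, never its contents, so an immutable
    tuple such as (1, 2) still counts as potentially cyclic.
    """
    return any(type_supports_gc(item) for item in items)


if __name__ == "__main__":
    print(would_track(1))        # int type has no GC support
    print(would_track((1, 2)))   # tuple type has GC support
```

Under this model, make_single((1, 2)) would be tracked even though (1, 2) can never be part of a reference cycle, which is the behavior the assertion in the diff above checks.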

eendebakpt · 2025-10-25T21:50:00Z

> @eendebakpt I have updated microbenchmarks. Could you please take a look? Are they fair enough now?

The benchmarks seem fine, although I still find the results surprising. Why is PyTuple_Pack faster than PyTuple_New + PyTuple_SET_ITEM? Maybe it depends on whether the object one that you add in the performance tests is tracked by the GC.

But I am +1 on the PR, even if we don't go looking into tiny performance details: the methods PyTuple_MakeSingle/PyTuple_MakePair have a clean interface, they can be used quite a bit in the codebase and are faster than the alternative PyTuple_Pack.

vstinner · 2025-10-28T18:32:37Z

I created capi-workgroup/decisions#84 decision issue for the C API Working Group.

Add PyTuple_Make[Single,Pair]

4e46463

bedevere-app bot added the awaiting review label Oct 14, 2025

bedevere-app bot mentioned this pull request Oct 14, 2025

[C API] Replace PyTuple_Pack(1,2) with PyTuple_Make[Single,Pair] to optimize creation of tuples #140052

Open

sergey-miryanov added 2 commits October 15, 2025 01:30

Add news entry

aacb9f2

Fix news entry

ef7d139

vstinner reviewed Oct 14, 2025

Revert refcounts.dat changes

ebb70c9

vstinner reviewed Oct 14, 2025

Doc/data/refcounts.dat (outdated, resolved)

Doc/data/refcounts.dat (outdated, resolved)

sergey-miryanov added 3 commits October 15, 2025 01:47

Revert "Revert refcounts.dat changes"

bb9d532

This reverts commit ebb70c9.

Fix refcounts.dat

ee4056e

Fix news entry

bda2374

sergey-miryanov marked this pull request as draft October 14, 2025 20:51

bedevere-app bot removed the awaiting review label Oct 14, 2025

sergey-miryanov and others added 3 commits October 15, 2025 01:52

Apply suggestions from code review

2076783

Co-authored-by: Victor Stinner <[email protected]>

Fix docs

d45bd2a

Fix tuple_make_single

bd7bc3e

sergey-miryanov marked this pull request as ready for review October 14, 2025 21:01

bedevere-app bot added the awaiting review label Oct 14, 2025

vstinner reviewed Oct 14, 2025

Include/tupleobject.h (outdated, resolved)

sergey-miryanov added 2 commits October 15, 2025 02:24

Fix docs

f5bab3c

Move declarations to Include/cpython/tupleobject.h

6161535

vstinner reviewed Oct 14, 2025

Doc/c-api/tuple.rst (outdated, resolved)

sergey-miryanov and others added 2 commits October 15, 2025 02:35

Update Doc/c-api/tuple.rst

e6d2e0e

Co-authored-by: Victor Stinner <[email protected]>

Remove warning from PyTuple_MakePair docs

3c0ef7c

eendebakpt reviewed Oct 14, 2025

Doc/c-api/tuple.rst (outdated, resolved)

eendebakpt reviewed Oct 14, 2025

Doc/c-api/tuple.rst (outdated, resolved)

Apply suggestions from code review

270141b

Co-authored-by: Pieter Eendebak <[email protected]>

vstinner reviewed Oct 14, 2025

Doc/c-api/tuple.rst (outdated, resolved)

sergey-miryanov added 3 commits October 15, 2025 10:27

Fix indentation for PyTuple_MakePair docs

7c0eaa4

Merge branch 'main' into 140052-pytuple-make-pair

e27f3a1

Merge branch '140052-pytuple-make-pair' of github.com:sergey-miryanov…

ce29809

…/cpython into 140052-pytuple-make-pair

eendebakpt mentioned this pull request Oct 15, 2025

Improve performance by replacing PyTuple_Pack with PyTuple_FromArray #140009

Open

eendebakpt reviewed Oct 21, 2025

Objects/tupleobject.c (outdated, resolved)

sergey-miryanov marked this pull request as draft October 22, 2025 10:04

bedevere-app bot removed the awaiting review label Oct 22, 2025

sergey-miryanov added 3 commits October 24, 2025 00:33

Merge branch 'main' into 140052-pytuple-make-pair

d28311c

Check should we track or no tuple created via PyTuple_Make[Single,Pair]

d075bff

Update tests accordingly

0517223

sergey-miryanov marked this pull request as ready for review October 24, 2025 05:13

bedevere-app bot added the awaiting review label Oct 24, 2025

sergey-miryanov requested a review from efimov-mikhail October 24, 2025 05:14

efimov-mikhail reviewed Oct 25, 2025

vstinner mentioned this pull request Oct 28, 2025

Add PyTuple_MakeSingle() and PyTuple_MakePair() functions capi-workgroup/decisions#84

Open
