
Commit 2a2bc82

gh-130567: Remove optimistic allocation in locale.strxfrm() (GH-137143)
On modern systems, the result of wcsxfrm() is much larger than the size of the input string (from 4+2*n on Windows to 4+5*n on Linux for simple ASCII strings), so optimistic allocation of a buffer of the same size never works. The exception is if the locale is "C" (or unset), but in that case the `wcsxfrm` call should be fast (and calling `locale.strxfrm()` doesn't make too much sense in the first place).
1 parent 3a81313 commit 2a2bc82
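
The two-call pattern described in the commit message can be sketched outside CPython as follows. This is a minimal illustration, not the code from the commit: the helper name `xfrm_dup` is hypothetical, it uses `malloc` in place of `PyMem_New`, and it reports failure by returning NULL instead of raising a Python exception. The overflow guard is an addition for the standalone version.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not part of CPython): returns a malloc'ed,
 * NUL-terminated transformed copy of s, or NULL on failure.
 * On success, *out_len receives the length without the terminator. */
static wchar_t *
xfrm_dup(const wchar_t *s, size_t *out_len)
{
    /* First call: a NULL buffer with size 0 only reports the length
     * the transformed string would need. */
    errno = 0;
    size_t n = wcsxfrm(NULL, s, 0);
    if (errno && errno != ERANGE) {
        return NULL;
    }

    /* Guard the size computation against overflow (assumption: not
     * needed in the original, where PyMem_New performs this check). */
    if (n >= SIZE_MAX / sizeof(wchar_t)) {
        return NULL;
    }
    wchar_t *buf = malloc((n + 1) * sizeof(wchar_t));
    if (buf == NULL) {
        return NULL;
    }

    /* Second call: the buffer is known to be large enough, so any
     * errno value is treated as a real error. */
    errno = 0;
    n = wcsxfrm(buf, s, n + 1);
    if (errno) {
        free(buf);
        return NULL;
    }
    *out_len = n;
    return buf;
}
```

A caller would pass a wide string such as `L"apple"`, compare two transformed results with `wcscmp()`, and `free()` each buffer afterwards. Sizing the buffer up front costs one extra `wcsxfrm()` call but removes the reallocation branch entirely.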

1 file changed, +10 -22 lines changed

Modules/_localemodule.c

Lines changed: 10 additions & 22 deletions
@@ -455,36 +455,24 @@ _locale_strxfrm_impl(PyObject *module, PyObject *str)
         goto exit;
     }
 
-    /* assume no change in size, first */
-    n1 = n1 + 1;
-    /* Yet another +1 is needed to work around a platform bug in wcsxfrm()
-     * on macOS. See gh-130567. */
-    buf = PyMem_New(wchar_t, n1+1);
+    errno = 0;
+    n2 = wcsxfrm(NULL, s, 0);
+    if (errno && errno != ERANGE) {
+        PyErr_SetFromErrno(PyExc_OSError);
+        goto exit;
+    }
+    buf = PyMem_New(wchar_t, n2+1);
     if (!buf) {
         PyErr_NoMemory();
         goto exit;
     }
+
     errno = 0;
-    n2 = wcsxfrm(buf, s, n1);
-    if (errno && errno != ERANGE) {
+    n2 = wcsxfrm(buf, s, n2+1);
+    if (errno) {
         PyErr_SetFromErrno(PyExc_OSError);
         goto exit;
     }
-    if (n2 >= (size_t)n1) {
-        /* more space needed */
-        wchar_t * new_buf = PyMem_Realloc(buf, (n2+1)*sizeof(wchar_t));
-        if (!new_buf) {
-            PyErr_NoMemory();
-            goto exit;
-        }
-        buf = new_buf;
-        errno = 0;
-        n2 = wcsxfrm(buf, s, n2+1);
-        if (errno) {
-            PyErr_SetFromErrno(PyExc_OSError);
-            goto exit;
-        }
-    }
     /* The result is just a sequence of integers, they are not necessary
        Unicode code points, so PyUnicode_FromWideChar() cannot be used
        here. For example, 0xD83D 0xDC0D should not be larger than 0xFF41.
