Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@filipnavara
Copy link
Member

@filipnavara filipnavara commented Apr 18, 2019

Address few regressions from #22378. Avoid using calloc in order to remove overhead from unnecessary zeroing of memory.

Fix reallocation in AddItem (pal_collation.c) to prevent accidental overflow and to use correct element size. Previously it allocated more memory due to incorrectly multiplying by sizeof(UChar*) instead of sizeof(UChar).

Fix performance regression from #22390 (tracked in dotnet/aspnetcore#9404). tsearch performance on glibc is vastly inferior to tfind so prefer to use the later. On macOS the performance of either approach was nearly identical.

int32_t patternLen = udat_toPattern(pFormat, FALSE, NULL, 0, &ignore) + 1;

UChar* pattern = calloc(patternLen, sizeof(UChar));
UChar* pattern = malloc(patternLen * sizeof(UChar));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this an improvement? Don't we then need to be concerned about overflow and potential resulting buffer overflows?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because calloc zeroes the allocated memory. In all the cases that was actually redundant because it was overwritten with valid data just few lines after the allocation. If overflow in the multiplication is valid concern I can wrap that up in a macro or helper function and check for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In CoreFX, add_s and multiply_s (https://github.com/dotnet/corefx/blob/0fd3b7e/src/Native/Unix/Common/pal_safecrt.h#L17) are used for safe arithmetic before calling malloc. The implementation uses compiler intrinsic and provides software fallback. Maybe pal_safecrt.h can be ported from CoreFX?

Copy link
Member Author

@filipnavara filipnavara Apr 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@am11 That makes sense. I was about to write similar helper function, but in the end I decided to write a simple portable code instead. I didn't notice that it already exists in CoreFX.

I actually have an open issue for moving System.Globalization.Native to CoreFX - https://github.com/dotnet/coreclr/issues/22391 - so it makes sense to converge the code.

@filipnavara filipnavara changed the title Minor performance improvements for System.Globalization.Native WIP: Minor performance improvements for System.Globalization.Native Apr 18, 2019
@filipnavara filipnavara changed the title WIP: Minor performance improvements for System.Globalization.Native [System.Globalization.Native] Fix regressions from #22378 and #22390 Apr 18, 2019
@filipnavara
Copy link
Member Author

/cc @janvorli @sebastienros

list->capacity *= 2;
UChar* ptr = (UChar*)realloc(list->items, list->capacity * sizeof(UChar*));
list->capacity += 512;
if (list->capacity < size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is making a handful of unrelated improvements with the goal of improving perf... can you share measurements that demonstrate they move things in a positive direction? I'm not clear on why this change to +512 instead of *2 is definitively better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safety check I added is actually never going to get hit. GetCustomRules has upper bound on how many custom rules can get added, so realistically the only two cases that can happen is the default array size (512) or the next bigger one (1024 both before and after my change).

The maximum size at the moment is something around 860.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll revert it to *2 since it doesn't really matter.

{
pCollator = CloneCollatorWithOptions(pSortHandle->regular, options, pErr);
map->UCollator = pCollator;
tsearch(map, &pSortHandle->collatorsPerOptionRoot, TreeComparer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we separate out the regression fix into its own PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants