-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Convert System.Globalization.Native to C #18614
Conversation
| // All ICU headers need to be included here so that all function prototypes are | ||
| // available before the function pointers are declared below. | ||
| #include <unicode/locid.h> | ||
| //#include <unicode/locid.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is not needed - delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, this is a C++ header, based on http://icu-project.org/apiref/icu4c/files.html
|
It would be nice to mark all the exports with DLLEXPORT like it is done in corefx, and hide everything else. |
| // Couldn't find the data we need for this locale, we should fallback. | ||
| if (localeNameBuf[0] == 0x0) | ||
| { | ||
| ures_close(rootResBundle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to do this sort of error handling using goto done pattern. It avoids duplicating the cleanup and mistakes that the duplication tend to lead to.
|
I have set |
|
@dotnet-bot test this please |
| // doing the KanaType custom rule. | ||
| customRules.reserve(512); | ||
| customRules = (UCharList*)malloc(sizeof(UCharList*)); | ||
| customRules->items = calloc(512, sizeof(UChar*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 512 always going to be enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added an AddItem function that checks and reallocate if it goes beyond the capacity.
|
The final binary seems to still depend on |
|
Setting the language to C at top of CMakeLists wasn't enough, b9e9d3d. Had to do https://stackoverflow.com/a/29204610 and some more. |
| return version; | ||
| } | ||
|
|
||
| __attribute__((destructor)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@janvorli I think that this shutdown method is broken. There can be libicu code running during shutdown. Unloading icu here will cause intermittent crashes during process exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried to make it crash and there does not appear to be a real problem. dlclose during shutdown seems to be ignored - the library is not actually unmapped.
| int32_t length = uloc_getLanguage(locale, NULL, 0, &status) + 1; | ||
|
|
||
| std::vector<char> buf(length + 1, '\0'); | ||
| char* buf = calloc(length, sizeof(char*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be freed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for null would be nice too.
| int32_t icuPatternLength = unum_toPattern(pNumberFormat, false, NULL, 0, &ignore) + 1; | ||
|
|
||
| std::vector<UChar> icuPattern(icuPatternLength + 1, '\0'); | ||
| UChar* icuPattern = calloc(icuPatternLength, sizeof(UChar*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matching free?
|
Tests are failing |
|
@jkotas, windows legs failing is a known/unrelated issue. Is there a way to tell what is causing those test crash from the build output in unix legs? |
It is crashing very early during startup on memory corruption. It is likely bug in your changes. It you run anything locally under debugger, you should be able to find the problem pretty quickly. |
|
I haven't built the interloop testes locally, it takes enormous amount of time and resources. Can i download artifacts from somewhere? |
|
I expect that the crash will repro even if you run a simple hello world. https://github.com/dotnet/coreclr#using-your-build has two ways you can run a simple hello world (I typically use the |
That would mean the non-interloop test legs that are passing are not running any tests? |
|
my disk space is running full and couldn't build the whole thing. I was building only parts (GlobalizationNative) and travis-ci for mac. |
janvorli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will continue reviewing later today, but it seems I may have found the crash reason.
| int32_t patternLen = udatpg_getBestPattern(pGenerator, patternSkeleton, -1, NULL, 0, &ignore) + 1; | ||
|
|
||
| std::vector<UChar> bestPattern(patternLen + 1, '\0'); | ||
| UChar* bestPattern = calloc(patternLen, sizeof(UChar*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you allocating one less char than the original version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added + 1 in the line above.
|
|
||
| std::vector<UChar> bestPattern(patternLen + 1, '\0'); | ||
| UChar* bestPattern = calloc(patternLen, sizeof(UChar*)); | ||
| if(bestPattern == NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit - missing space between the if and the opening parenthese.
|
|
||
| std::vector<UChar> symbolBuf(symbolLen + 1, '\0'); | ||
| UChar* symbolBuf = calloc(symbolLen, sizeof(UChar*)); | ||
| if(symbolBuf == NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit - missing space between the if and the opening parenthese.
| int symbolLen = udat_getSymbols(pFormat, type, i, NULL, 0, &ignore) + 1; | ||
|
|
||
| std::vector<UChar> symbolBuf(symbolLen + 1, '\0'); | ||
| UChar* symbolBuf = calloc(symbolLen, sizeof(UChar*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same question as in a similar case before - why are you allocating one less UChar than the original code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added 1 to symbolLen above
| if (U_FAILURE(err)) | ||
| { | ||
| udat_close(pFormat); | ||
| free(pFormat); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems it should be free(symbolBuf) instead.
|
Try doing this (using the .NET Core 2.1 SDK):
The bug is in |
|
@jkotas, simple cases are working. I am not sure which case will make it fail locally. I have tried with: using System;
using System.Globalization;
using System.Threading;
public class Example
{
public static void Main()
{
CultureInfo culture1 = CultureInfo.CurrentCulture;
CultureInfo culture2 = Thread.CurrentThread.CurrentCulture;
Console.WriteLine("The current culture is {0}", culture1.Name);
Console.WriteLine("The two CultureInfo objects are equal: {0}",
culture1 == culture2);
var val = 222;
Console.WriteLine($">> {val}");
GregorianCalendar calendar = new GregorianCalendar();
Console.WriteLine(calendar.GetMonthsInYear(11));
Console.WriteLine("France".Equals(new RegionInfo("FR").NativeName));
string guidString = "00000001-57ee-1e5c-00b4-d0000bb1e11e";
SortVersion version = new SortVersion(393742, new Guid(guidString));
Console.WriteLine(393742 == version.FullVersion);
}
} |
| if ((*ppSortHandle) == nullptr) | ||
| assert(ppSortHandle != NULL); | ||
|
|
||
| *ppSortHandle = (SortHandle*)malloc(sizeof(SortHandle*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be malloc(sizeof(SortHandle)). Similar bug is in other places too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the main bug. Thanks, now it's all green.
| } SortHandle; | ||
| const TCollatorMap* leftMap = left; | ||
| const TCollatorMap* rightMap = right; | ||
| return leftMap->key - rightMap->key; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can return wrong result in a general case. (Not sure for the range of keys that are possible here.)
Change this to correct by construction pattern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is implemented correctly and in accord with the documentation https://linux.die.net/man/3/tsearch. No issues found so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g. when left key is 0x7FFFFFFF and the right key key is -1, this will return incorrect result.
The correct way to implement this looks like this: https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/shared/System/Int32.cs#L51
This issue causes corner case bugs that are not easy to find.
| UChar* ptr = (UChar*)realloc(list->items, *currentCapacity * sizeof(UChar*)); | ||
| if(ptr == NULL) | ||
| { | ||
| assert(false && "realloc unable to allocate memory."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not assert on OOM.
Are you sure you are running the binary with your changes? I cannot believe that it would work by accident, even for simple cases like this. |
| int32_t length = uloc_getLanguage(locale, nullptr, 0, &status); | ||
| int32_t length = uloc_getLanguage(locale, NULL, 0, &status) + 1; | ||
|
|
||
| char* buf = calloc(length, sizeof(char*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The allocated length is off by one here and in the usage below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@janvorli, like i've commented twice above that +1 is added in previous statement..
| int32_t length = uloc_getCountry(locale, nullptr, 0, &status); | ||
| int32_t length = uloc_getCountry(locale, NULL, 0, &status) + 1; | ||
|
|
||
| char* buf = calloc(length, sizeof(char*)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again off by one allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, see previous like 136
| set_target_properties(System.Globalization.Native PROPERTIES PREFIX "") | ||
|
|
||
| # Force CMake to not link against libstdc++ | ||
| # http://cmake.3232098.n2.nabble.com/setting-LINKER-LANGUAGE-still-adds-lstdc-td7581940.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem that the URL applies to our case. It describes the case when a target contained cpp files. We now have all the files in "C", so the linker should recognize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you try it? This is the only option that works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to remove the changes here and it has built fine on my machine. Maybe you didn't do clean build and something has leaked from the previous state? Just try to deleted the whole bin folder before the build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've used cmake 3.5.1 and clang 3.9.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't about the build failure, if you follow the comments above (#18614 (comment)), it was found with ldd System.Globalization.Native.so that we are still linking against libstdc++. With this change it fixed that issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I am sorry, I've missed that. But I've found the real culprit. The problem is that we are including one .cpp file in the build of the library - the version.cpp that is compiled into all targets to provide info on the version of the library. This is automatically added to all targets, so that's why you have not noticed it. See the _add_library function in functions.cmake.
Modifying the generator of version.cpp to generate file named version.c instead and updating the VERSION_FILE_PATH variable in the root CMakeLists.txt accordingly fixes the problem..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks. I have renamed version.cpp to version.c and removed these hacks. Checked with ldd no libstdc++.
|
|
||
| if (U_FAILURE(err)) | ||
| { | ||
| ucal_close(pCal); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of failure, the pCal should not be closed (see the original holder). That's why the holder gets the err as argument too.
The same at many other places below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is ok. The ICU close methods deal with null pointers fine. Their examples use similar patterns. Also I have checked the ICU sources and they indeed handle null pointers fine.
|
Just a quick thought - I wonder if we could use the |
|
I do not think we should be using non-standard attributes. My strong preference for doing this sort of cleanup is https://github.com/dotnet/corefx/blob/master/src/Native/Unix/System.Security.Cryptography.Native/pal_ecc_import_export.c#L135 It is the most readable, portable and robust pattern among the different options. |
Yes. |
|
I do not see any obvious bug by just looking at the code. |
|
Type casting UChar ( |
|
@kasper3, what was the syntax of the bpmd command you have tried to use? |
|
@janvorli, I was trying with dotnet new console -n /testapp
cd testapp
cat >Program.cs <<EOF
using System.Globalization;
static class Program
{
static void Main(string [] args)
{
new CultureInfo("ja-JP") { DateTimeFormat = { Calendar = new JapaneseCalendar() } };
}
}
EOF
dotnet publish -r linux-x64
cp \
/coreclr/bin/Product/Linux.x64.Debug/System.Globalization.Native.so \
/coreclr/bin/Product/Linux.x64.Debug/libcoreclr.so \
/coreclr/bin/Product/Linux.x64.Debug/libclrjit.so \
/coreclr/bin/Product/Linux.x64.Debug/libmscordaccore.so \
/coreclr/bin/Product/Linux.x64.Debug/PDB/System.Private.CoreLib.pdb \
/coreclr/bin/Product/Linux.x64.Debug/System.Private.CoreLib.dll \
/testapp/bin/Debug/netcoreapp2.1/linux-x64/publish/
lldb bin/Debug/netcoreapp2.1/linux-x64/publish/testapp
(lldb) target create "bin/Debug/netcoreapp2.1/linux-x64/publish/testapp"
Current executable set to 'bin/Debug/netcoreapp2.1/linux-x64/publish/testapp' (x86_64).
(lldb) plugin load /coreclr/bin/Product/Linux.x64.Debug/libsosplugin.so
(lldb) sos bpmd /testapp/bin/Debug/netcoreapp2.1/linux-x64/publish/System.Private.CoreLib.dll System.Globalization.CultureInfo.GetCultureByName
Failed to load data access DLL, 0x80131c4f
You can run the debugger command 'setclrpath' to control the load of libmscordaccore.so.
If that succeeds, the SOS command should work on retry.
bpmd /testapp/bin/Debug/netcoreapp2.1/linux-x64/publish/System.Private.CoreLib.dll System.Globalization.CultureInfo.GetCultureByName failedthough I couldn't quite figure out how to set the line number with bpmd in managed code. For example in lldb, usually the native code |
|
Couple of thoughts:
|
janvorli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did one more pass to see if I can spot something suspicious that would cause the japanese culture test to fail. I have not found anything like this, but I've found few nits and details.
| const UChar* localeRules = ucol_getRules(pCollator, &localeRulesLength); | ||
| int32_t completeRulesLength = localeRulesLength + customRuleLength + 1; | ||
|
|
||
| UChar* completeRules = calloc(completeRulesLength, sizeof(UChar)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you are missing deletion of the completeRules in this function
| delete (*ppSortHandle); | ||
| (*ppSortHandle) = nullptr; | ||
| free(*ppSortHandle); | ||
| (*ppSortHandle) = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that the original code was not doing that, but we are missing pthread_mutex_destroy call here - the CreateSortHandle has created the mutex.
| || (0xff61 <= character && character <= 0xff65); | ||
| } | ||
|
|
||
| static bool AddItem(UCharList* list, size_t* currentCapacity, const UChar item) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit - it seems the currentCapacity could be part of the UCharList too. It seems a bit strange that the size is part of it and the capacity is not.
|
@janvorli, is there a way to set a breakpoint (or tracepoint) by line number with bpmd? |
|
@dotnet-bot test Ubuntu x64 Checked corefx_baseline |
|
Japanese calendar problem also appeared while porting partial changes to C++ for #18733. It was the enum alignment. It is interesting that we can't use @jkotas, this PR is back in business; all green. |
|
@krwq @tarekgh Does this look good to you?
|
| int symbolLen = udat_getSymbols(pFormat, type, i, NULL, 0, &ignore) + 1; | ||
|
|
||
| std::vector<UChar> symbolBuf(symbolLen + 1, '\0'); | ||
| UChar* symbolBuf = calloc(symbolLen, sizeof(UChar)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UChar* symbolBuf = calloc(symbolLen, sizeof(UChar)); [](start = 8, length = 52)
can we optimize this, instead of allocating inside every time inside the loop, we can just allocate once outside (maybe with 100 UChar) and then check inside the loop if we need to resize this buffer if needed. in most cases I wouldn't expect we'll resize anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Current approach is to provide 1:1 replacement for existing code and reap the performance benefit of reduced code size of C vs. C++. If we need additional performance tweaks, that can be done separately?
| UResourceBundle* calResBundle = ures_getByKey(rootResBundle, "calendar", NULL, &status); | ||
| UResourceBundle* targetCalResBundle = ures_getByKey(calResBundle, name, NULL, &status); | ||
| UResourceBundle* erasColResBundle = ures_getByKey(targetCalResBundle, "eras", NULL, &status); | ||
| UResourceBundle* erasResBundle = ures_getByKey(erasColResBundle, "narrow", NULL, &status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this logic work fine? I mean the status values would be accumulated and note overridden? for example if ures_open failed and then some how the other subsequent calls succeed for any reason which will mark status as success.
would it be safer to check the status after every call? and don't have to continue of one failed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic works and it is actually the suggested error handling pattern if you look at ICU samples. We can use it in more places if we wanted it to make the code smaller and a bit faster. I agree that it looks like a suspect outlier now since nothing else in the file is using the ICU error handling pattern. There should be either comment, or it should use the same error handling pattern like the rest of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use it in more places if we wanted it to make the code smaller and a bit faster
Yes @jkotas, that was your comment and I cleaned it up from all places. Do you have any other place in mind which can be cleaned up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have commented about the ICU style error handling, but there was not any cleanup done to use it. The only place that uses the ICU style error handling in the change right now is this one.
You can read more about the design of the ICU error handling here: http://userguide.icu-project.org/design#TOC-Error-Handling. Here is one example from many where the ICU style error handling can be used to make the code smaller and a bit faster:
int32_t GlobalizationNative_GetLatestJapaneseEra()
{
UErrorCode err = U_ZERO_ERROR;
UCalendar* pCal = ucal_open(NULL, 0, JAPANESE_LOCALE_AND_CALENDAR, UCAL_TRADITIONAL, &err);
int32_t ret = ucal_getLimit(pCal, UCAL_ERA, UCAL_MAXIMUM, &err);
ucal_close(pCal);
return U_SUCCESS(err) ? ret : 0;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you want to address it as part of this PR? I thought this pull request is approved..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs one of the following:
- Change the error handling style here to match the style used in the rest of the change
- Or change the error handling style throughout the change to use ICU-style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this pull request is approved.
I have approved it because of I did not have any more comments. However, I have looked at this change multiple times and I knew that I am likely missing things (it is "reviewers fatigue") since this change had many bugs and comments already. It is why I have asked more folks for review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant is there are many comments in this PR thread. It is hard for me to follow as well. I can do a follow up PR to address the rest of comments from green point marker onwards.
Here is one example from many where the ICU style error handling can be used to make the code smaller and a bit faster:
is it about removing:
if (U_FAILURE(err))
return 0;after *_open APIs calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the if (U_FAILURE(err)) checks are not needed in some cases after *_open calls thanks to the ICU error handling model. http://userguide.icu-project.org/design#TOC-Error-Handling.
I can do a follow up PR to address the rest of comments from green point marker onwards.
sounds good.
| UCollator* pClonedCollator; | ||
| std::vector<UChar> customRules = GetCustomRules(options, strength, isIgnoreSymbols); | ||
| if (customRules.empty()) | ||
| UCharList* customRules = GetCustomRules(options, strength, isIgnoreSymbols); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UCharList* customRules = GetCustomRules(options, strength, isIgnoreSymbols); [](start = 4, length = 76)
This maybe ok but I am wondering if this is kind of behavior change? I mean before if we had out of memory exception in C++ we wouldn't continue coming to this code. but now we are kind of ignoring the out of memory and just continue without using any of custom rules.
I just thinking loud what is the best we can do here? should we ignore the OOM or we should fail the whole call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can change the signature from:
UCharList* GetCustomRules(int32_t, UColAttributeValue, bool)to:
RESULT_CODE GetCustomRules(int32_t, UColAttributeValue, bool, UCharList**)and return U_MEMORY_ALLOCATION_ERROR like we are doing elsewhere; including this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be reasonable
| const UChar* localeRules = ucol_getRules(pCollator, &localeRulesLength); | ||
| int32_t completeRulesLength = localeRulesLength + customRuleLength + 1; | ||
|
|
||
| UChar* completeRules = calloc(completeRulesLength, sizeof(UChar)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
completeRules [](start = 15, length = 13)
we should check if the allocation succeeded after calling calloc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
| } | ||
|
|
||
| extern "C" ResultCode GlobalizationNative_GetSortHandle(const char* lpLocaleName, SortHandle** ppSortHandle) | ||
| void CreateSortHandle(SortHandle** ppSortHandle) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CreateSortHandle [](start = 5, length = 16)
what is the benefit making this a method by itself and not part of GlobalizationNative_GetSortHandle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ResultCode GlobalizationNative_GetSortHandle(const char* lpLocaleName, SortHandle** ppSortHandle)
is on line 373
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was @janvorli's suggestion to make it a separate method in order to keep the initialization parts of sortHandle in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong feeling to change it but this initialization is not used anywhere else anyway and also when something fails we clean up in GlobalizationNative_GetSortHandle as we don't have a clean up method matching the allocation method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have a clean up method matching the allocation method
we have GlobalizationNative_CloseSortHandle, that does the cleanup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean, in CreateSortHandle we allocate the memory and then initialize the mutex there. and then in case of the failure, we free the mutex and memory in GlobalizationNative_GetSortHandle which is weird for me. my point is, either to have everything encapsulated inside GlobalizationNative_GetSortHandle or you create CreateSortHandle and DestroySortHandle kind.
As I mentioned I am not strongly feeling about it so I'll leave it to you to decide if you need to change it or keep the current code.
|
|
||
| TCollatorMap::iterator entry = pSortHandle->collatorsPerOption.find(options); | ||
| if (entry == pSortHandle->collatorsPerOption.end()) | ||
| TCollatorMap* map = (TCollatorMap*)malloc(sizeof(TCollatorMap)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
map [](start = 22, length = 3)
we should check if the allocation succeeded here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
| if (isNegative && !minusAdded) | ||
| { | ||
| size_t length = (iEnd - iStart) + 2; | ||
| destPattern = calloc(length, sizeof(char)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calloc [](start = 22, length = 6)
should check for allocation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
| assert(U_SUCCESS(err)); | ||
|
|
||
| std::string normalizedPattern = NormalizeNumericPattern(icuPattern.data(), isNegative); | ||
| char* normalizedPattern = NormalizeNumericPattern(icuPattern, isNegative); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NormalizeNumericPattern [](start = 30, length = 23)
Maybe we can male this method return the length as we already know it there and avoid calling strlen after
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will do that.
| */ | ||
| typedef enum | ||
| { | ||
| C = 0x1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C [](start = 4, length = 1)
nit: Maybe naming these to FormC, FormD... would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, not something that is diverging from master
coreclr/src/corefx/System.Globalization.Native/normalization.cpp
Lines 14 to 17 in 1310065
| C = 0x1, | |
| D = 0x2, | |
| KC = 0x5, | |
| KD = 0x6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is diverging from master. The C++ code in master is using the C++ style enum class for this. It means that you qualify the enum value using the enum name, like NormalizationForm::C. C without qualification does not compile.
You have change it to regular enum. It means that the references to the enum values are just C, D, etc. and it is not possible to qualify them. The standard naming convention for C enum is to prefix all value with the enum name or something close to it. Take a look at the naming pattern used for C enums in corefx shims.
|
@kasper3 why did you close this PR? I think you were close to finish it |
|
There are too many comments here and it is hard to navigate them. @kasper3 will open follow up PR: #18614 (comment) |
No description provided.