Skip to content

Conversation

@taooceros
Copy link
Member

@taooceros taooceros commented Oct 18, 2020

Moved from #179

Wox has changed the Pinyin Library to ToolGood.Words. #2962 Thank you for the change @bao-qian .
This pull request has done the same thing, but optimize a few.

It keeps the full pinyin query, which seems perform the same as current pinyin library.
It keeps using the Concurrent Dictionary as cache, instead of Wox's Memory Cache. I think changing the cache implementation should be done in separately. However, the cache for Pinyin is changed from store single chinese character to whole word.
IAlphabet interface is kept in this pull request.
I also merge the query of win32 and uwp in Program plugin into one query by adding enabled property to the IProgram interface and cast win32 and uwp to IProgram.
I have checked the previous cache storage and it actually makes the loading time quite long, so I remove the Save and Loading cache in this pull request because it seems that the library is fast enough.

@jjw24
Copy link
Member

jjw24 commented Oct 18, 2020

Thanks @taooceros! If PR is ready, let us know via request review :)

@taooceros
Copy link
Member Author

taooceros commented Oct 18, 2020

I think this should be ready now. You are welcome!

@jjw24 jjw24 added enhancement New feature or request review in progress Indicates that a review is in progress for this PR labels Oct 19, 2020
select $"{a1};{a2}"
).ToArray();
return combination;
return string.Concat(content.Split(';').Select(x => x.First()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a question, why is this code only get the first char, do you know what its used for?

Copy link
Member Author

@taooceros taooceros Oct 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for getting the acronym for pinyin, just like our discussion in the acronym search. That should be removed once we have done the acronym search in fuzzy search.

If you are talking why this can get the first char of pinyin, it is because when I get the pinyin, I called it with WordsHelper.GetPinyin(content, ";"), which split Chinese character with ";".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm if this is for acronym search, could we remove this and add it in the other acronym search PR, I have a feeling we going to forget to remove this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done the part of removing this in the acronym branch. I put it here because I think this will be merged earlier so that I won't like to change the behavior since on the past pinyin acronym search is done with this method.

Copy link
Member

@jjw24 jjw24 Oct 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure as long as this acronym search code is only specific to Pinyin

@taooceros
Copy link
Member Author

Thank you for the reviews! @jjw24

Comment on lines 75 to 79
lock (IndexLock)
{ // just take the reference inside the lock to eliminate query time issues.
win32 = _win32s;
uwps = _uwps;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bao has removed the lock in Wox, so I think removing it won't be an issue in Flow, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont suppose those two will be updated often unless a reindex is run. Can the user reindex and also query at the same time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider even if they query and reindex at the same time, it won't affect each other since index only change the memory index after finishing the whole index process (which is time consuming), but assigning values should not be an issue. On the time that Flow is reindexing, the variable won't change until it finish.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have provide a small test and it seems that won't fine of querying when reindexing.

@jjw24
Copy link
Member

jjw24 commented Oct 24, 2020

Pinyin changes are looking good.

I think if we can rename the Alphabet class that would be sweet.

Can you do me a favour, can we pull out the changes related to Programs plugin - locking and program query optimization. Let's stick to one change set per PR so it's easier/quicker to review and test.

@jjw24
Copy link
Member

jjw24 commented Oct 24, 2020

also please bump the version in Plugin.Program's plugin.json file

@taooceros
Copy link
Member Author

taooceros commented Oct 25, 2020

Sure, I will rollback the changes for programs plug-ins and create a new one for it. I think if we don't include the optimization of the query, this PR won't include change for program plugin. Should we still bump the version?

@taooceros taooceros force-pushed the PinyinLibraryChange branch from 8a212a6 to 80a2f0d Compare October 25, 2020 02:22
@jjw24
Copy link
Member

jjw24 commented Nov 1, 2020

Just done some testing around the usage, I think the acronym search caused the Chinese to Pinyin search to fail:

lets so i have two entries as below.

searching pinyin is good
image

using acronym is good
image

but Chinese to Pinyin, hmmm:
image

The acronym search has changed the pinyin translation
image

@taooceros
Copy link
Member Author

taooceros commented Nov 1, 2020

Hmmmm you are right, I think the original implementation of the acronym search in the pinyin search is done by adding the acronym character to the front of the translated query, which will lead to the weird query translation.

   var combination = PinyinCombination(source);
            
   var pinyinArray=combination.Select(x => string.Join("", x));
   var acronymArray = combination.Select(Acronym).Distinct();

   var joinedSingleStringCombination = new StringBuilder();
   var all = acronymArray.Concat(pinyinArray);
   all.ToList().ForEach(x => joinedSingleStringCombination.Append(x));

   return joinedSingleStringCombination.ToString();

You can see that the current implementation is joining all the combination together including acronym into a large string. I think it doesn't allow we use Chinese query to search for pinyin result as well because of the query translation is also the one that include every combination of polyphone and acronym.

The acronym part will be removed after the native acronym search has been finished, which means that the translation of query will be clean.

@jjw24
Copy link
Member

jjw24 commented Nov 1, 2020

The acronym part will be removed after the native acronym search has been finished, which means that the translation of query will be clean.

yeah that's what I was thinking, once the acronym search is implemented in string matcher, this shouldnt be needed here. Is that PR ready to go?

@taooceros
Copy link
Member Author

taooceros commented Nov 1, 2020

The acronym part will be removed after the native acronym search has been finished, which means that the translation of query will be clean.

yeah that's what I was thinking, once the acronym search is implemented in string matcher, this shouldnt be needed here. Is that PR ready to go?

It should be ready to go except the highlighted data is hard to control due to the translation. Do you think it is ok when sometimes the highlighted data is not on the right place? Or maybe if it is translated, we don't include the highlight data.

@jjw24
Copy link
Member

jjw24 commented Nov 1, 2020

The acronym part will be removed after the native acronym search has been finished, which means that the translation of query will be clean.

yeah that's what I was thinking, once the acronym search is implemented in string matcher, this shouldnt be needed here. Is that PR ready to go?

It should be ready to go except the highlighted data is hard to control due to the translation. Do you think it is ok when sometimes the highlighted data is not on the right place? Or maybe if it is translated, we don't include the highlight data.

is this when translating Pinyin and matching a Chinese character program? I think the original code doesnt highlight Chinese

@taooceros
Copy link
Member Author

The acronym part will be removed after the native acronym search has been finished, which means that the translation of query will be clean.

yeah that's what I was thinking, once the acronym search is implemented in string matcher, this shouldnt be needed here. Is that PR ready to go?

It should be ready to go except the highlighted data is hard to control due to the translation. Do you think it is ok when sometimes the highlighted data is not on the right place? Or maybe if it is translated, we don't include the highlight data.

is this when translating Pinyin and matching a Chinese character program? I think the original code doesnt highlight Chinese

Yes, it only highlights if search with acronym since they are the character that lies one the first few characters and match the chinese character. We can discuss this on #185 .

@jjw24
Copy link
Member

jjw24 commented Nov 1, 2020

The acronym part will be removed after the native acronym search has been finished, which means that the translation of query will be clean.

yeah that's what I was thinking, once the acronym search is implemented in string matcher, this shouldnt be needed here. Is that PR ready to go?

It should be ready to go except the highlighted data is hard to control due to the translation. Do you think it is ok when sometimes the highlighted data is not on the right place? Or maybe if it is translated, we don't include the highlight data.

is this when translating Pinyin and matching a Chinese character program? I think the original code doesnt highlight Chinese

Yes, it only highlights if search with acronym since they are the character that lies one the first few characters and match the chinese character. We can discuss this on #185 .

Ok sounds good.

I havent had a chance to look at 185 yet, will do so soon. Lets not forget to remove this acronym search once 185 is finished.

@taooceros
Copy link
Member Author

Of course, the integration has been finished, except for the highlight, and once those two pr has been merged, I will create a new pr to do the job.

@jjw24
Copy link
Member

jjw24 commented Nov 1, 2020

Lets get rid of that white space and then we can merge this in and get a release going :)

@jjw24 jjw24 merged commit aa3177d into Flow-Launcher:dev Nov 1, 2020
@jjw24 jjw24 removed the review in progress Indicates that a review is in progress for this PR label Nov 1, 2020
@taooceros taooceros deleted the PinyinLibraryChange branch November 2, 2020 03:08
@taooceros taooceros restored the PinyinLibraryChange branch November 2, 2020 03:30
@taooceros taooceros deleted the PinyinLibraryChange branch November 2, 2020 03:39
@taooceros taooceros linked an issue Dec 9, 2020 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Merge performance improvements from Wox

4 participants