@ricardojaferreira
Contributor

Removes the use of ICUCollationKeyFilter at IcuCollationTokenFilterFactory and removes the class ICUCollationKeyFilter.java which no longer has usages.

Relates to #15827

Member

@cbuescher cbuescher left a comment

Hi @ricardojaferreira, first of all thanks for picking up the issue and opening this PR. Unfortunately, I don't think the changes you made are sufficient to solve the issue at hand. First, I don't think ICUCollationDocValuesField can be used this way; there are test failures, e.g. in SimpleIcuCollationTokenFilterTests, which you can see when you run just the unit tests for the analysis-icu plugin (./gradlew -p plugins/analysis-icu check).
Furthermore, I'm not sure the goal of the issue was to swap out the internal implementation of IcuCollationTokenFilterFactory. I think it was rather to first deprecate it in Elasticsearch (so that users using it in newly created indices get a warning) and then later remove it completely. I might be wrong on this point though; maybe others can correct me in that case.
If you want to work on the above points, please let us know if you have questions.

  @Override
  public TokenStream create(TokenStream tokenStream) {
-     return new ICUCollationKeyFilter(tokenStream, collator);
+     return new ICUCollationDocValuesField(tokenStream.toString(), collator).tokenStreamValue();
Member

I doubt this is going to be sufficient. ICUCollationDocValuesField's first argument is the field name (per Javadoc), and I'm not sure tokenStreamValue() will work either, but maybe it does.
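
To illustrate the field-name point, here is a minimal sketch. It is not the Lucene class: CollatedFieldSketch is a hypothetical stand-in that only mirrors the (field name, collator) constructor shape, and it uses the JDK's java.text.Collator instead of the ICU collator so that it runs without extra dependencies. The field name "title_sort" is made up for the example.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical stand-in, not the Lucene class: it only mirrors the
// (field name, collator) constructor shape mentioned above.
final class CollatedFieldSketch {
    private final String name;       // the field name, fixed at construction
    private final Collator collator; // JDK collator, standing in for ICU's
    private byte[] key = new byte[0];

    CollatedFieldSketch(String name, Collator collator) {
        this.name = name;
        this.collator = collator;
    }

    // The text to collate is supplied separately from the field name.
    void setStringValue(String value) {
        key = collator.getCollationKey(value).toByteArray();
    }

    String name() {
        return name;
    }

    // Returns a copy of the binary sort key for the current value.
    byte[] binaryValue() {
        return key.clone();
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        CollatedFieldSketch field = new CollatedFieldSketch("title_sort", collator);
        field.setStringValue("Elasticsearch");
        System.out.println(field.name() + " -> " + field.binaryValue().length + " key bytes");
    }
}
```

With this shape, passing tokenStream.toString() as the first argument would only choose a field name; it would not feed the stream's tokens through the collator.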

@cbuescher cbuescher added :Search Foundations/Mapping Index mappings, including merging and defining field types >refactoring labels Nov 26, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search

@ricardojaferreira
Contributor Author

Hi @cbuescher, thanks for the comments. Maybe I haven't understood the purpose of the fix. Can you please point me in the right direction? Thanks.

@cbuescher
Member

Hi @ricardojaferreira. To answer your question: we cannot immediately remove the filter in version 7.0, since it might be used in indices created in 6.x, which we still need to support. We should first add deprecation warnings when this filter is used in one of the current 6.x versions.
As a next step, we should forbid using this filter for any new indices in 7.0, while still allowing its use in old indices. Finally, we can remove the filter, but this cannot be done until version 8.0, which we don't even have a branch for yet.
If you are interested in working further on this, I'd start by adding deprecation warnings when this filter is used. You can take a look at e.g. #33468 for an example that does something I believe is quite similar. Since the ICU collation token filter lives in the icu-module, I assume the changes would be somewhere in AnalysisICUPlugin. Also, please make sure to add some tests similar to the ones in that PR to check the deprecation logging. You can open a PR against either the current 6.x branch or master; we can backport the change and then work on the part forbidding the use of this filter in 7.0 next.
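
To make the staged plan above more concrete, here is a rough, self-contained sketch of the version-gated check. None of these names are Elasticsearch APIs: the class, the numeric version ids, and the WARNINGS list (standing in for a deprecation logger) are all hypothetical, and the real change would go through the plugin and Elasticsearch's own version and logging classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the staged deprecation; not Elasticsearch code.
final class CollationFilterDeprecationSketch {

    // Hypothetical numeric ids for the versions an index was created on.
    static final int V_6_0_0 = 6_000_000;
    static final int V_7_0_0 = 7_000_000;

    // Stand-in for a deprecation logger; collects messages so tests can check them.
    static final List<String> WARNINGS = new ArrayList<>();

    // Validates use of a deprecated filter for an index created on the given version.
    static String checkFilterUsage(String filterName, int indexCreatedVersion) {
        if (indexCreatedVersion >= V_7_0_0) {
            // Step 2: new indices may no longer use the filter.
            throw new IllegalArgumentException(
                "[" + filterName + "] cannot be used in indices created on or after 7.0");
        }
        // Step 1: older indices keep working but emit a deprecation warning.
        WARNINGS.add("[" + filterName + "] is deprecated and will be removed in a future version");
        return filterName;
    }

    public static void main(String[] args) {
        // An index created on 6.x is still accepted, with a warning.
        checkFilterUsage("icu_collation", V_6_0_0);
        System.out.println(WARNINGS);

        // An index created on 7.0 is rejected.
        try {
            checkFilterUsage("icu_collation", V_7_0_0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The final step, removing the filter entirely, has no code in the sketch: once no supported index version can reference the filter, the check and the filter can be deleted together.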

@cbuescher cbuescher self-assigned this Jan 7, 2019
@cbuescher
Member

@ricardojaferreira are you still working on this? Otherwise I'd like to close this PR to open the issue up for other people who might be interested.

@cbuescher
Member

@ricardojaferreira I'm closing this for now, assuming you are not actively working on it at the moment, so that others can pick up the issue. If you'd like to resume work, please reopen it and we'll see whether anybody else is already working on it by then.

@AlexAxeman
Contributor

AlexAxeman commented Dec 2, 2020

@nik9000 @cbuescher Hey guys, I'm new to the game and looking for a first issue to pick up. I read the PR from a previous attempt to address this, and it seems there is a deprecation process to be followed here. I looked at ICUCollationKeyFilter and it's annotated @Deprecated. It doesn't say since when (well... 2014-11-06) or when it's scheduled to be removed. Can anyone point me to some information that will help me understand your deprecation process?

Labels

>enhancement >refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types v7.0.0-beta1

5 participants