Fix kuromoji default stoptags #26600
The order was reversed, as the expected value was given for the actual value and vice versa. This led to a confusing assertion error message:

```
FAILURE 0.04s J1 | KuromojiAnalysisTests.testPartOfSpeechFilter <<< FAILURES!
   > Throwable elastic#1: java.lang.AssertionError: expected different term at index 1
   > Expected: "が"
   >      but: was "おいしい"
```

when the string "が" was actually not expected.
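To illustrate why the argument order matters, here is a minimal sketch. `TermAssertSketch` and `describeMismatch` are hypothetical names made up for this example, not part of the Elasticsearch or Hamcrest test APIs; the helper only imitates the shape of the failure message quoted above.

```java
// Hypothetical helper (assumption): imitates a Hamcrest-style mismatch message.
final class TermAssertSketch {

    /** Returns null when the terms match, otherwise a failure message. */
    static String describeMismatch(String expected, String actual, int index) {
        if (expected.equals(actual)) {
            return null;
        }
        return "expected different term at index " + index
            + "\nExpected: \"" + expected + "\""
            + "\n     but: was \"" + actual + "\"";
    }

    public static void main(String[] args) {
        String produced = "が";       // term the token stream actually produced
        String wanted = "おいしい";    // term the test wants at that position

        // Reversed arguments: the message wrongly claims "が" was expected.
        System.out.println(describeMismatch(produced, wanted, 1));

        // Correct order: the message names "おいしい" as the expected term.
        System.out.println(describeMismatch(wanted, produced, 1));
    }
}
```

With the arguments swapped, the check still fails in the same situations, but the message blames the wrong term, which is what made the original failure hard to read.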
* add new test which checks that part-of-speech tokens are removed when using the kuromoji_part_of_speech filter
* initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the `stoptags` are not given in the config
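The fallback described in the second bullet can be sketched roughly as follows. This is not the actual `KuromojiPartOfSpeechFilterFactory` code: `StopTagsSketch`, `resolveStopTags`, and the two placeholder default tags are assumptions made for illustration, and the real default list comes from the Lucene kuromoji module.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the factory logic; names and tag values are illustrative.
final class StopTagsSketch {

    // Placeholder defaults (assumption), not the real default stop-tag list.
    static final Set<String> DEFAULT_STOP_TAGS =
        new HashSet<>(Arrays.asList("助詞-格助詞-一般", "助動詞"));

    /** Uses the configured tags when present, otherwise falls back to the defaults. */
    static Set<String> resolveStopTags(List<String> configured) {
        if (configured == null) {
            // No `stoptags` in the config: initialize with the defaults
            // instead of leaving the filter with nothing to remove.
            return DEFAULT_STOP_TAGS;
        }
        return new HashSet<>(configured);
    }

    public static void main(String[] args) {
        System.out.println(resolveStopTags(null));
        System.out.println(resolveStopTags(Arrays.asList("記号-一般")));
    }
}
```

In this sketch an explicitly configured list always wins, even an empty one; whether the real factory treats an empty list the same way is not stated in this pull request.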
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
@elasticmachine test this please
@johtani looks like you might be the right person to look at this. Please un-assign yourself if not.
johtani
left a comment
LGTM
@cbuescher This PR is a bug fix, so we can cherry-pick it to 6.x, right?
@johtani if it is a bug fix and doesn't change existing behaviour I'd even pick it to 6.0. Looks low risk to me, do you agree?
@cbuescher I agree with you.
@avdv thanks a lot for this fix, I will merge this to master and the current 6.x branches
Initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the `stoptags` are not given in the config. Also adding a test which checks that part-of-speech tokens are removed when using the kuromoji_part_of_speech filter.
Thanks, @cbuescher and @johtani! Any chance of getting this into 5.6.x, too?
@avdv you are right, since this is a bugfix, I think we should also merge to the 5.6 branch. Will do so. Not sure though in which of the next minor releases it is going to end up.
* master:
  fix testSniffNodes to use the new error message
  Add check for invalid index in WildcardExpressionResolver (elastic#26409)
  Docs: Use single-node discovery.type for dev example
  Filter unsupported relation for range query builder (elastic#26620)
  Fix kuromoji default stoptags (elastic#26600)
  [Docs] Add description for missing fields in Reindex/Update/Delete By Query (elastic#26618)
  [Docs] Update ingest.asciidoc (elastic#26599)
  Better message text for ResponseException
  [DOCS] Remove edit link from ML node
  enable bwc testing
  fix StartRecoveryRequestTests.testSerialization
  Add bad_request to the rest-api-spec catch params (elastic#26539)
  Introduce a History UUID as a requirement for ops based recovery (elastic#26577)
  Add missing catch arguments to the rest api spec (elastic#26536)
Fixes #26519