Skip to content

Conversation

@talevy
Copy link
Contributor

@talevy talevy commented Jun 15, 2017

#24133 added the ability to specify a targetField in SortProcessor. This results in some interesting behavior that was missed in the review.
This processor sorts in-place, so there is a side-effect in both the original field and the target field.
Another bug was that the targetField was not being set if the list being sorted was fewer than two elements.

The new behavior works like this: If targetField and fieldName are not the same, we copy the list.

to reproduce original test failure:

gradle :modules:ingest-common:test -Dtests.seed=812BA08DE627020C -Dtests.class=org.elasticsearch.ingest.common.SortProcessorTests -Dtests.method="testSortWithTargetField" -Dtests.security.manager=true -Dtests.jvm.argline="-XX:+AggressiveOpts" -Dtests.locale=bg -Dtests.timezone=America/Pangnirtung

@talevy talevy added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >bug review >test Issues or PRs that are addressing/adding tests v5.6.0 v6.0.0 labels Jun 15, 2017
@talevy talevy requested a review from martijnvg June 15, 2017 02:01
@rjernst
Copy link
Member

rjernst commented Jun 15, 2017

Can you just always copy the list? Modifying a list from somewhere else is sneaky. Also, should be no need for the size check.

@talevy
Copy link
Contributor Author

talevy commented Jun 15, 2017

@rjernst guess I was just preserving that "optimizing" logic, but yeah, super minimal. I'll simplify

@talevy
Copy link
Contributor Author

talevy commented Jun 15, 2017

updated. thanks for looking @rjernst

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should not have random logic like this anywhere in ES. Do we really need random field names to test whether sorting works? I would rather see explicit fields names "field1" and "field2" and actually test both sort orders than always have these random field names. They have their place, but for things where the field name matters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @talevy. Looks good. I would still have an explicit test for both sort orders, so as not to rely on randomization since it is easy to cover these 2 cases.

@davidkyle
Copy link
Member

davidkyle commented Jun 15, 2017

There was another occurrence of this failure on 5.x

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-darwin-compatibility/572/console

gradle :modules:ingest-common:test -Dtests.seed=5FE7727E3E91839F -Dtests.class=org.elasticsearch.ingest.common.SortProcessorTests -Dtests.method="testSortWithTargetField" -Dtests.security.manager=true -Dtests.locale=fi-FI -Dtests.timezone=Asia/Omsk

And master:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-intake/1424/console

gradle :modules:ingest-common:test -Dtests.seed=14E3C30AB5E521B6 -Dtests.class=org.elasticsearch.ingest.common.SortProcessorTests -Dtests.method="testSortWithTargetField" -Dtests.security.manager=true -Dtests.locale=tr-TR -Dtests.timezone=Asia/Damascus

Both reproduced for me

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, +1 to Ryan's test suggestion.

to specify a `targetField`. This results in some interesting behavior that was missed in the review.
This processor sorts in-place, so there is a side-effect in both the original field and the target field.
Another bug was that the targetField was not being set if the list being sorted was fewer than two elements.

The new behavior works like this: If targetField and fieldName are not the same, we copy the list.
@talevy talevy merged commit 2cd771a into elastic:master Jun 15, 2017
@talevy talevy deleted the fix-24133 branch June 15, 2017 12:28
talevy added a commit that referenced this pull request Jun 15, 2017
…25237)

to specify a `targetField`. This results in some interesting behavior that was missed in the review.
This processor sorts in-place, so there is a side-effect in both the original field and the target field.
Another bug was that the targetField was not being set if the list being sorted was fewer than two elements.

The new behavior works like this: If targetField and fieldName are not the same, we copy the list.
jasontedor added a commit to glefloch/elasticsearch that referenced this pull request Jun 15, 2017
* master:
  [Test] restore BWC for parent-join now that the new mapping format is in 5.x
  Add a section named "relations" in the ParentJoinFieldMapper (elastic#25248)
  test: Ported more OldIndexBackwardsCompatibilityIT tests to full cluster restart qa tests. (elastic#25173)
  fix: Sort Processor does not have proper behavior with targetField (elastic#25237)
  Allow reader wrappers to have different live docs but the same cache key.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 16, 2017
…y-context

* 'master' of github.com:elastic/elasticsearch: (21 commits)
  [DOCS] Clarify expected availability of HDFS for the HDFS Repository (elastic#25220)
  Remove some redundant 140 character checkstyle suppressions
  [Docs] more fix for the parent-join docs
  [Docs] Fix cross reference for parent-join field
  More advices around search speed and disk usage. (elastic#25252)
  Add documentation for the new parent-join field (elastic#25227)
  [analysis-icu] Allow setting unicodeSetFilter (elastic#20814)
  Introduce translog size and age based retention policies (elastic#25147)
  Add needs methods for specific variables to Painless script context factories. (elastic#25267)
  Improves snapshot logging and snapshoth deletion error handling (elastic#25264)
  Add unit test for PathHierarchyTokenizerFactory (elastic#24984)
  Deprecate tribe service
  Moved more token filters to analysis-common module.
  [Test] Make sure that SearchAfterSortedDocQueryTests uses a single threaded searcher
  [DOCS] Defined es-test-dir and plugins-examples-dir in index.asciidoc.  (elastic#25232)
  Test fix - removed superfluous assertion (elastic#25247)
  [Test] restore BWC for parent-join now that the new mapping format is in 5.x
  Add a section named "relations" in the ParentJoinFieldMapper (elastic#25248)
  test: Ported more OldIndexBackwardsCompatibilityIT tests to full cluster restart qa tests. (elastic#25173)
  fix: Sort Processor does not have proper behavior with targetField (elastic#25237)
  ...
@clintongormley clintongormley changed the title fix: Sort Processor does not have proper behavior with targetField Sort Processor does not have proper behavior with targetField Jul 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >test Issues or PRs that are addressing/adding tests v5.6.0 v6.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants