support numeric bounds with decimal parts for long/integer/short/byte datatypes #21972

scampi · 2016-12-05T12:05:24Z

… datatypes

scampi · 2016-12-05T12:09:55Z

I have a question about the code. The {byte,short} rangeQuery methods rely on the INTEGER implementation of rangeQuery. Therefore, the call to parse uses the INTEGER.parse implementation, not the BYTE.parse one. Is this normal ?

jpountz

the call to parse uses the INTEGER.parse implementation, not the BYTE.parse one. Is this normal?

Yes: we do not optimize storage for bytes or shorts at the moment, so it is fine to share the same code as far as queries are concerned. The indexing code is different however since we want to fail when someone indexes eg. a large integer in a byte field.

If I am not mistaken, a side effect of this pull request is that searching a decimal value on an integer field used to raise an error while it would not silently round the value down. I am fine with not throwing an error, but could we make sure to create a query that does not match anything if the decimal part is not null rather than silently rounding down? I think both term and terms are affected.

jpountz · 2016-12-05T18:27:56Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+                double doubleValue = ((Number) number).doubleValue();
+                return doubleValue % 1 != 0;
+            }
+            return false;


Is it safe to return false otherwise? Maybe return Double.parseDouble(number) % 1 != 0?

If number is a string like 1.1 then the parsing would fail with java.lang.NumberFormatException: For input string: "1.1". This is already the case in the existing parse methods. So I think it should be fine to return false here.

jpountz · 2016-12-05T18:31:34Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

                    }
-                    if (doubleValue % 1 != 0) {
-                        throw new IllegalArgumentException("Value [" + value + "] has a decimal part");
-                    }


I'd still like an exception to be thrown if this is called from parseCreateField. Maybe add a boolean coerce parameter (consistent with the rest of this class) and only perform this check if coerce is false?

scampi · 2016-12-07T17:18:09Z

@jpountz Thanks for the explanation of the rangeQuery.
Ready for another look!

jpountz

Sorry for the lag, I just had anothen look at your changes. I think it does not work with negative values, since your code assumes that calling longValue() on a decimal rounds down, while it actually rounds up for negative values.

jpountz · 2016-12-14T14:34:37Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

                float[] v = new float[values.size()];
                for (int i = 0; i < values.size(); ++i) {
-                    v[i] = parse(values.get(i));
+                    v[i] = (float) parse(values.get(i), false);


the cast looks unnecessary?

jpountz · 2016-12-14T14:35:52Z

I have a recreation if you want to look into it:

PUT index 
{
  "mappings": {
    "type": {
      "properties": {
        "field": {
          "type": "integer"
        }
      }
    }
  }
}

PUT index/type/1
{
  "field": -3
}

GET index/_search
{
  "query": {
    "range": {
      "field": {
        "gte": -3.5,
        "lte": -2.5
      }
    }
  }
}

GET index/_search
{
  "query": {
    "range": {
      "field": {
        "gte": -4.5,
        "lte": -3.5
      }
    }
  }
}

The document matches the second query but not the first one.

jpountz · 2016-12-14T14:41:25Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+
+                int[] v = new int[nonDecimalValues.size()];
+                for (int i = 0; i < nonDecimalValues.size(); ++i) {
+                    v[i] = parse(nonDecimalValues.get(i), true);


Maybe we could avoid generating intermediate garbage because of boxed objects by directly creating the primitive array, eg. something like:

int[] v = new int[values.size()]; int upTo = 0; for (Object value : values) { if (hasDecimalPart(value) == false) { v[upTo++] = parse(value, true); } } if (upTo != v.length) { v = Arrays.copyOf(v, upTo); }

scampi · 2016-12-18T21:46:41Z

@jpountz thanks for the review! I have resolved your comments in the last two PRs. Ready for another round

jpountz

The changes to terms queries look good, but I think there are still issues with range query generation.

jpountz · 2016-12-19T10:13:45Z

core/src/test/java/org/elasticsearch/index/mapper/NumberFieldTypeTests.java

+        MappedFieldType ftInt = new NumberFieldMapper.NumberFieldType(NumberType.INTEGER);
+        ftInt.setName("field");
+        ftInt.setIndexOptions(IndexOptions.DOCS);
+        assertEquals(IntPoint.newRangeQuery("field", -3, -2), ftInt.rangeQuery(-3.5, -2.5, true, true, null));


Sorry I may be a bit confused, but if the range is [-3.5, -2.5] then -2 should not match?

sorry about that, i'll correct this in next commit

jpountz · 2016-12-19T10:18:11Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+                    // - if the bound is negative then we leave it as is:
+                    //      if lowerTerm=-1.5 then the (inclusive) bound becomes -1 due to the call to longValue
+                    boolean lowerTermHasDecimalPart = hasDecimalPart(lowerTerm);
+                    if ((includeLower == false && !lowerTermHasDecimalPart) || (lowerTermHasDecimalPart && l > 0)) {


I suspect this approach cannot work since -0.5 and +0.5 both parse to 0 when coerce is true, so you have no way to know whether the original value was positive or negative?

scampi · 2016-12-21T00:07:37Z

@jpountz I corrected those two issues you pointed out.

jpountz

It looks good to me. I left some comments about formatting if you don't mind applying them and then I'll merge. Thanks @scampi!

jpountz · 2016-12-21T15:32:34Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+                    // - if the bound is negative then we leave it as is:
+                    //      if lowerTerm=-1.5 then the (inclusive) bound becomes -1 due to the call to longValue
+                    boolean lowerTermHasDecimalPart = hasDecimalPart(lowerTerm);
+                    if ((!lowerTermHasDecimalPart && includeLower == false) ||


can you either use ! in both cases or == false?

(others tend to have a preference for == false so I'd recommend using that)

jpountz · 2016-12-21T15:33:23Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+                    //      if lowerTerm=-1.5 then the (inclusive) bound becomes -1 due to the call to longValue
+                    boolean lowerTermHasDecimalPart = hasDecimalPart(lowerTerm);
+                    if ((!lowerTermHasDecimalPart && includeLower == false) ||
+                        (lowerTermHasDecimalPart && signum(lowerTerm) > 0)) {


can you indent one level more so that it is easier to figure out what is part of the if statement and what is part of the inner block?

jpountz · 2016-12-21T15:35:11Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+                    u = parse(upperTerm, true);
+                    boolean upperTermHasDecimalPart = hasDecimalPart(upperTerm);
+                    if ((!upperTermHasDecimalPart && includeUpper == false) ||
+                        (upperTermHasDecimalPart && signum(upperTerm) < 0)) {


jpountz · 2016-12-21T15:35:41Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+                    //      if lowerTerm=-1.5 then the (inclusive) bound becomes -1 due to the call to longValue
+                    boolean lowerTermHasDecimalPart = hasDecimalPart(lowerTerm);
+                    if ((!lowerTermHasDecimalPart && includeLower == false) ||
+                        (lowerTermHasDecimalPart && signum(lowerTerm) > 0)) {


jpountz · 2016-12-21T15:35:50Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+                    u = parse(upperTerm, true);
+                    boolean upperTermHasDecimalPart = hasDecimalPart(upperTerm);
+                    if ((!upperTermHasDecimalPart && includeUpper == false) ||
+                        (upperTermHasDecimalPart && signum(upperTerm) < 0)) {


jpountz · 2016-12-21T15:36:33Z

core/src/main/java/org/elasticsearch/index/mapper/NumberFieldMapper.java

+        /**
+         * Returns -1, 0, or 1 if the value is lower than, equal to, or greater than 0
+         */
+        protected double signum(Object value) {


let's make those two methods private, I don't think we need to extend them?

scampi · 2016-12-21T21:40:23Z

@jpountz I have addressed your comments, thanks for the review ;o)

… datatypes (#21972) Close #21600

* master: (22 commits) Support negative numbers in writeVLong (elastic#22314) UnicastZenPing's PingingRound should prevent opening connections after being closed Add task to clean idea build directory. Make cleanIdea task invoke it. add trace logging to UnicastZenPingTests.testResolveReuseExistingNodeConnections Adds ingest processor headers to exception for unknown processor. (elastic#22315) Remove much ceremony from parsing client yaml test suites (elastic#22311) Support numeric bounds with decimal parts for long/integer/short/byte datatypes (elastic#21972) inner hits: Don't inline inner hits if the query the inner hits is inlined into can't resolve mappings and ignore_unmapped has been set to true Fix stackoverflow error on InternalNumericMetricAggregation Date detection should not rely on a hardcoded set of characters. (elastic#22171) `value_type` is useful regardless of scripting. (elastic#22160) Improve concurrency of ShardCoreKeyMap. (elastic#22316) fixed jdocs and removed already fixed norelease Adds abstract test classes for serialisation (elastic#22281) Introduce translog no-op Provide helpful error message if a plugin exists Clear static variable after suite Repeated language analyzers (elastic#22240) Restore deprecation warning for invalid match_mapping_type values (elastic#22304) Make `-0` compare less than `+0` consistently. (elastic#22173) ...

support numeric bounds with decimal parts for long/integer/short/byte…

21db4b4

… datatypes

jpountz suggested changes Dec 5, 2016

View reviewed changes

jpountz self-assigned this Dec 5, 2016

scampi force-pushed the range-query-decimal-part branch from ce43e31 to 19cdc1b Compare December 7, 2016 17:16

do not silently round down in {term,terms}Query

19cdc1b

clintongormley added :Query DSL >enhancement labels Dec 7, 2016

jpountz suggested changes Dec 14, 2016

View reviewed changes

jpountz reviewed Dec 14, 2016

View reviewed changes

scampi added 2 commits December 18, 2016 14:18

avoid intermediate garbage from arraylist

db51414

support negative bounds in range queries

12d599c

jpountz suggested changes Dec 19, 2016

View reviewed changes

corrected range query generation

3c3946f

jpountz approved these changes Dec 21, 2016

View reviewed changes

improved syntax

ec1fbf5

jpountz merged commit e1b8528 into elastic:master Dec 22, 2016

scampi deleted the range-query-decimal-part branch December 22, 2016 14:54

jpountz pushed a commit that referenced this pull request Dec 22, 2016

Support numeric bounds with decimal parts for long/integer/short/byte…

6f45f8e

… datatypes (#21972) Close #21600

jpountz added v5.2.0 v6.0.0-alpha1 labels Dec 22, 2016

jpountz mentioned this pull request Jan 27, 2017

search time match coercing behaves differently than index time coercing #22776

Closed

clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Query DSL labels Feb 14, 2018

support numeric bounds with decimal parts for long/integer/short/byte datatypes #21972

support numeric bounds with decimal parts for long/integer/short/byte datatypes #21972

Uh oh!

Conversation

scampi commented Dec 5, 2016

Uh oh!

scampi commented Dec 5, 2016

Uh oh!

jpountz left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

scampi commented Dec 7, 2016

Uh oh!

jpountz left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jpountz commented Dec 14, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

scampi commented Dec 18, 2016

Uh oh!

jpountz left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

scampi commented Dec 21, 2016

Uh oh!

jpountz left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

scampi commented Dec 21, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants