Description
I've seen a couple of cases where all the write threads are heroically trying to parse some very large numbers on the way into a long field, consuming far too much CPU in the process. The hot threads output reported many threads stuck doing BigInteger shenanigans such as this:
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.squareToomCook3(BigInteger.java:2053)
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.squareToomCook3(BigInteger.java:2051)
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.squareToomCook3(BigInteger.java:2051)
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.squareToomCook3(BigInteger.java:2053)
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.pow(BigInteger.java:2306)
java.math.BigDecimal.bigTenToThe(BigDecimal.java:3543)
java.math.BigDecimal.bigMultiplyPowerTen(BigDecimal.java:3676)
java.math.BigDecimal.setScale(BigDecimal.java:2445)
java.math.BigDecimal.toBigInteger(BigDecimal.java:3025)
org.elasticsearch.common.xcontent.support.AbstractXContentParser.toLong(AbstractXContentParser.java:195)
org.elasticsearch.common.xcontent.support.AbstractXContentParser.longValue(AbstractXContentParser.java:220)
org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.parse(NumberFieldMapper.java:679)
org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.parse(NumberFieldMapper.java:655)
org.elasticsearch.index.mapper.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:996)
org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:297)
One example was at https://discuss.elastic.co/t/high-cpu-usage-in-elasticsearch-nodes/161504 and another was with a customer.
We fall back to BigInteger if a naive conversion to long fails, for instance when the value uses scientific notation, but then reject the result if it doesn't fit into a long:
Lines 157 to 180 in 92b2e1a
private static long toLong(String stringValue, boolean coerce) {
    try {
        return Long.parseLong(stringValue);
    } catch (NumberFormatException e) {
        // we will try again with BigDecimal
    }
    final BigInteger bigIntegerValue;
    try {
        BigDecimal bigDecimalValue = new BigDecimal(stringValue);
        bigIntegerValue = coerce ? bigDecimalValue.toBigInteger() : bigDecimalValue.toBigIntegerExact();
    } catch (ArithmeticException e) {
        throw new IllegalArgumentException("Value [" + stringValue + "] has a decimal part");
    } catch (NumberFormatException e) {
        throw new IllegalArgumentException("For input string: \"" + stringValue + "\"");
    }
    if (bigIntegerValue.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0 ||
        bigIntegerValue.compareTo(BigInteger.valueOf(Long.MIN_VALUE)) < 0) {
        throw new IllegalArgumentException("Value [" + stringValue + "] is out of range for a long");
    }
    return bigIntegerValue.longValue();
}
This means that if you try to insert a short string such as "1e99999999" into a long field then we spend a lot of resources materializing the enormous power of ten behind it before deciding it's too big. I think we should not try so hard to parse values like "1e99999999" as a long.
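One cheap fix along those lines, sketched with hypothetical names (this is a suggestion, not the actual Elasticsearch patch): Long.MAX_VALUE has only 19 decimal digits, and BigDecimal's precision() minus its scale() gives the number of digits left of the decimal point without computing any power of ten, so obviously-out-of-range values can be rejected up front.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical guard; names are ours, not Elasticsearch's.
public class LongParseGuard {
    // Long.MAX_VALUE = 9223372036854775807 has 19 decimal digits, so any
    // value with more than 19 digits left of the decimal point cannot fit.
    private static final int MAX_LONG_DIGITS = 19;

    static long toLongGuarded(String stringValue, boolean coerce) {
        try {
            return Long.parseLong(stringValue);
        } catch (NumberFormatException e) {
            // fall through and try again with BigDecimal
        }
        final BigDecimal bigDecimalValue;
        try {
            bigDecimalValue = new BigDecimal(stringValue);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("For input string: \"" + stringValue + "\"");
        }
        if (bigDecimalValue.signum() == 0) {
            return 0L; // e.g. "0e99999999"
        }
        // Digits left of the decimal point, available without materializing
        // anything: the value is in [10^(d-1), 10^d) for d = precision - scale.
        int integerDigits = bigDecimalValue.precision() - bigDecimalValue.scale();
        if (integerDigits > MAX_LONG_DIGITS) {
            throw new IllegalArgumentException("Value [" + stringValue + "] is out of range for a long");
        }
        if (integerDigits <= 0) {
            // |value| < 1: truncates to 0 when coercing, otherwise it
            // necessarily has a decimal part. This also skips the equally
            // expensive tiny-fraction path (e.g. "1e-99999999").
            if (coerce) {
                return 0L;
            }
            throw new IllegalArgumentException("Value [" + stringValue + "] has a decimal part");
        }
        final BigInteger bigIntegerValue;
        try {
            bigIntegerValue = coerce ? bigDecimalValue.toBigInteger() : bigDecimalValue.toBigIntegerExact();
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException("Value [" + stringValue + "] has a decimal part");
        }
        // At most 19 digits reach this exact range check, so it is cheap.
        if (bigIntegerValue.bitLength() > 63) {
            throw new IllegalArgumentException("Value [" + stringValue + "] is out of range for a long");
        }
        return bigIntegerValue.longValue();
    }
}
```

With this guard, "1e99999999" fails in constant time from the precision/scale arithmetic alone, while all in-range inputs take the same path as before.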