Feeding non-numeric data into a long field may consume excessive CPU #40323

Description

@DaveCTurner

I've seen a couple of cases where all the write threads are heroically trying to parse some very large numbers on the way into a long field, consuming far too much CPU in the process. The hot threads output reported many threads stuck doing BigInteger shenanigans such as this:

       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2053)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2051)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2051)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2053)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.pow(BigInteger.java:2306)
       java.math.BigDecimal.bigTenToThe(BigDecimal.java:3543)
       java.math.BigDecimal.bigMultiplyPowerTen(BigDecimal.java:3676)
       java.math.BigDecimal.setScale(BigDecimal.java:2445)
       java.math.BigDecimal.toBigInteger(BigDecimal.java:3025)
       org.elasticsearch.common.xcontent.support.AbstractXContentParser.toLong(AbstractXContentParser.java:195)
       org.elasticsearch.common.xcontent.support.AbstractXContentParser.longValue(AbstractXContentParser.java:220)
       org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.parse(NumberFieldMapper.java:679)
       org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.parse(NumberFieldMapper.java:655)
       org.elasticsearch.index.mapper.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:996)
       org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:297)

One example was at https://discuss.elastic.co/t/high-cpu-usage-in-elasticsearch-nodes/161504 and another was with a customer.
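
For reference, the expensive step reproduces easily in isolation. Here is a minimal standalone sketch (the class name is mine, timings will vary by machine, and the larger exponents can run for a very long time):

import java.math.BigDecimal;

// Hypothetical demo, not Elasticsearch code: shows how the cost of
// BigDecimal.toBigInteger() explodes as the exponent in a tiny input string grows.
public class SlowLongParseDemo {
    public static void main(String[] args) {
        for (String value : new String[] { "1e100000", "1e1000000", "1e10000000" }) {
            long start = System.nanoTime();
            new BigDecimal(value).toBigInteger(); // materializes 10^N in full
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(value + " -> " + elapsedMs + " ms");
        }
    }
}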

We fall back to BigDecimal (and from there to BigInteger) if the naive Long.parseLong fails, for instance when the value uses scientific notation, but we only reject the result after converting it, if it doesn't fit into a long:

private static long toLong(String stringValue, boolean coerce) {
    try {
        return Long.parseLong(stringValue);
    } catch (NumberFormatException e) {
        // we will try again with BigDecimal
    }
    final BigInteger bigIntegerValue;
    try {
        BigDecimal bigDecimalValue = new BigDecimal(stringValue);
        bigIntegerValue = coerce ? bigDecimalValue.toBigInteger() : bigDecimalValue.toBigIntegerExact();
    } catch (ArithmeticException e) {
        throw new IllegalArgumentException("Value [" + stringValue + "] has a decimal part");
    } catch (NumberFormatException e) {
        throw new IllegalArgumentException("For input string: \"" + stringValue + "\"");
    }
    if (bigIntegerValue.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0 ||
            bigIntegerValue.compareTo(BigInteger.valueOf(Long.MIN_VALUE)) < 0) {
        throw new IllegalArgumentException("Value [" + stringValue + "] is out of range for a long");
    }
    return bigIntegerValue.longValue();
}

This means that if you try to index a short string such as "1e99999999" into a long field then we spend a lot of resources converting it into a very large BigInteger before deciding it's too big. Constructing the BigDecimal itself is cheap (it just records an unscaled value of 1 and a scale of -99999999) but toBigInteger() then materializes the full expansion, a number with 100 million decimal digits, only for the range check to reject it immediately. I think we should not try so hard to parse values like "1e99999999" as a long.
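
One way out, sketched below as a suggestion rather than a finished patch: BigDecimal.compareTo is cheap even for huge exponents, because it can decide the comparison from the signs and adjusted exponents without expanding the value, so the range check can run before the conversion instead of after:

// Hypothetical reordering of the method above, not a shipped fix: compare the
// BigDecimal against the long range *before* converting it to a BigInteger.
private static long toLong(String stringValue, boolean coerce) {
    try {
        return Long.parseLong(stringValue);
    } catch (NumberFormatException e) {
        // we will try again with BigDecimal
    }
    final BigDecimal bigDecimalValue;
    try {
        bigDecimalValue = new BigDecimal(stringValue);
    } catch (NumberFormatException e) {
        throw new IllegalArgumentException("For input string: \"" + stringValue + "\"");
    }
    // Rejects "1e99999999" without ever expanding it: compareTo settles the
    // comparison from the adjusted exponents when the magnitudes clearly differ.
    if (bigDecimalValue.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) > 0 ||
            bigDecimalValue.compareTo(BigDecimal.valueOf(Long.MIN_VALUE)) < 0) {
        throw new IllegalArgumentException("Value [" + stringValue + "] is out of range for a long");
    }
    final BigInteger bigIntegerValue;
    try {
        // Safe now: the value lies within [Long.MIN_VALUE, Long.MAX_VALUE].
        bigIntegerValue = coerce ? bigDecimalValue.toBigInteger() : bigDecimalValue.toBigIntegerExact();
    } catch (ArithmeticException e) {
        throw new IllegalArgumentException("Value [" + stringValue + "] has a decimal part");
    }
    return bigIntegerValue.longValue();
}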
