Description
Elasticsearch version 7.5.0
Kibana 7.5.0 plugin installed
Windows Server 2016 Datacenter
After moving from 6.3.2 to 7.5.0, spatial intersections on a geo_shape field in an index with 12 million polygons are significantly slower than before. The index is about 20GB. The polygons all represent land parcels and commonly touch and/or slightly overlap neighboring shapes. I am doing the intersection with a bounding box roughly the size of the southern United States.
The first intersection query I run after a fresh ES 7.5.0 server restart (see the end of this post for an example) takes 2+ minutes. After a fresh ES 6.3.2 server restart, the exact same query against the same data (using quadtree geo_shape) takes 800ms. I see very heavy disk read activity during the 7.5.0 query; this does not happen during the 6.3.2 query. Note that this only happens with polygon geo_shapes; point geo_shapes do not have this problem from what I can tell.
After this first query (with a hot cache?), the situation improves somewhat but the 7.5.0 query still takes over a second to run while the 6.3.2 query takes 100ms.
Interestingly, restarting only ES, as opposed to the full server, does not bring back the 2+ minute query. I believe a full server restart clears the Windows disk page cache, which would explain the difference between 2+ minutes and about 1 second.
If it matters, here is my query (this is /_count but I get similar results with /_search):
POST myindex/_count
{"query":{"bool":{"filter":{"bool":{"must":[{"geo_shape":{"geography":{"shape":{"type":"polygon","coordinates":[[[-118.74701654704592,38.554294590584455],[-118.74701654704592,22.052177425063828],[-76.559516547045916,22.052177425063828],[-76.559516547045916,38.554294590584455],[-118.74701654704592,38.554294590584455]]]},"relation":"intersects"}}}]}}}}}
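For anyone trying to reproduce this, here is a minimal Python sketch that rebuilds the same request body from the four bounding-box edges. The helper name `bbox_intersects_query` and its parameters are made up for illustration; only the field name `geography` and the coordinates come from the query above. It does not contact Elasticsearch, it only builds and prints the JSON.

```python
import json


def bbox_intersects_query(west, south, east, north, field="geography"):
    """Build a geo_shape 'intersects' query for a rectangular polygon.

    Hypothetical helper, not part of any Elasticsearch client library.
    """
    # Closed ring in [lon, lat] order, same corner order as the query above:
    # the first and last positions must be identical.
    ring = [
        [west, north],
        [west, south],
        [east, south],
        [east, north],
        [west, north],
    ]
    geo_shape = {
        field: {
            "shape": {"type": "polygon", "coordinates": [ring]},
            "relation": "intersects",
        }
    }
    return {"query": {"bool": {"filter": {"bool": {"must": [{"geo_shape": geo_shape}]}}}}}


# Bounding box taken from the query in this report.
body = bbox_intersects_query(
    west=-118.74701654704592,
    south=22.052177425063828,
    east=-76.559516547045916,
    north=38.554294590584455,
)
print(json.dumps(body, indent=2))
```

The printed body can be sent to either `myindex/_count` or `myindex/_search`, which makes it easy to try smaller or larger boxes against both versions.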