-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Normalise polygons only when necessary #84229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Pinging @elastic/es-analytics-geo (Team:Analytics) |
|
Hi @iverase, I've created a changelog YAML for you. |
|
@elasticmachine update branch |
...java/org/elasticsearch/xpack/spatial/index/mapper/GeoShapeWithDocValuesFieldMapperTests.java
Outdated
Show resolved
Hide resolved
|
Hi @iverase, I've updated the changelog YAML for you. |
craigtaverner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks like an improvement for the error checking, but am worried about performance. Do we have regular benchmarks running that would catch a regression?
| return Collections.emptyList(); | ||
| } | ||
| geometry = GeometryNormalizer.apply(orientation, geometry); | ||
| if (GeometryNormalizer.needsNormalize(orientation, geometry)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code, needsNormalize does quite an extensive check, so I presume this adds some performance cost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needsNormalize should run several times faster than the normalisation.
| @Override | ||
| public Void visit(Polygon polygon) { | ||
| addFields(LatLonShape.createIndexableFields(name, GeoShapeUtils.toLucenePolygon(polygon))); | ||
| addFields(LatLonShape.createIndexableFields(name, GeoShapeUtils.toLucenePolygon(polygon), true)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The additional checks in the tesselator also add some performance cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the hope is that the overhead of checking for intersection in the tessellator is compensated with skipping the normalisation step.
|
We don't have a rally track for geo_shape that runs daily but we are working on it: elastic/rally-tracks#238 Just run indexing manually and all-in-all it seems the effect in performance is very small, cumulative time just increased by ~1.75%. |
imotov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
With the upgrade to Lucene 9.1, the Tessellator supports validity checks of the polygons and it will return a nice message to the user. That allows calling the normaliser only when necessary as the checks it was doing to the polygon are covered by the Tessellator.
In general this change of strategy should result in better error messages back to the user when a polygon is invalid.
closes #35349