Update ShapeBuilder and GeoPolygonQueryParser to accept non-closed GeoJSON #11161
|
How can you tell the difference between a user leaving off the closing point and their JSON serialization failing to write the entire polygon? Maybe this behavior should be controlled by a setting, since it relaxes restrictions but allows potentially incorrect data to be added? |
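To make the concern concrete, here is a minimal sketch (not the code in this PR) of what auto-closing does to two different inputs. The `auto_close` helper and the sample coordinates are hypothetical and exist only for illustration.

```python
def auto_close(ring):
    """Return a copy of `ring`, appending the first point if the ring is not closed."""
    if ring and ring[0] != ring[-1]:
        return ring + [ring[0]]
    return list(ring)


# A square whose closing point was deliberately left off.
open_square = [[0, 0], [0, 1], [1, 1], [1, 0]]

# The same square after a hypothetical serializer bug dropped its last corner.
truncated_square = [[0, 0], [0, 1], [1, 1]]

closed_square = auto_close(open_square)
closed_truncated = auto_close(truncated_square)

# Both results are well-formed closed rings, so nothing downstream can tell
# that the second one is a triangle that was supposed to be a square.
print(closed_square)
print(closed_truncated)
```

Both outputs pass a "first point equals last point" check, which is why the truncated case cannot be detected once the ring has been closed automatically.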
|
@rjernst I understand what you mean. That then puts the onus on the person managing elasticsearch to have the correct configuration or setting. I believe the responsibility should be on the person handling the data in the first place. If they write data to elasticsearch and find it's incorrect, they can check their serialization method to verify it's producing correct output. This will still fail if the polygon contains only 1, 2, or 3 points. I can see why this type of validation is necessary, but again, I believe that validation belongs in the json serialization, not in elasticsearch. |
|
Since before ElastiCon there has been a lot of talk about maturing ES to make it harder to "hurt yourself". The more I think about this issue the more I feel like we're opening a door for users to do exactly that. In the spirit of user safety I like the idea of adding a setting to control this behavior. Though we're also trying to thin out the already massive number of settings for geo types. @clintongormley thoughts? |
|
@nknize @clintongormley I hear what you're saying. If you do add a setting, is the default going to allow "loose" GeoPoints, such as the non-closed ones? To elaborate: |
|
@nknize we have two settings on (eg) numeric fields:
Wondering if we should allow the same settings for geo-shapes, where |
|
+1 to reusing settings (with documentation for what it means to "coerce" for geo points) and to keeping the default sane (i.e. don't coerce by default). |
|
++ I like this idea for consistency across mappings. |
|
What's the status on this merge? |
c8c4b9f to
d5e6767
Compare
|
Code updated to use new mappings enhancements and optional coerce parameter |
Why do we need a map to store two options? Can we not just store the options in two local variables here? If/when this expands into more parse options and it's too big to pass around as separate variables, we could create a class to hold these options.
@colings86 absolutely. I debated this myself. In the end I made the decision to put it in now for the flexibility of adding future options by just adding them to the hashmap. Since the overhead of the map is greater than the current parameters I can take it out and add it later when needed.
|
Left a comment. I actually think |
|
I think I agree. It'll also give a better OOB experience with the multitude of bad geodata out there. |
|
I disagree. This would give a worse OOB experience. The point of this setting is to allow not following the standard (closing the shapes) which leaves room for corrupt serialization of shapes (eg omitted some points) leaving a corrupt shape (because we "auto closed" it). We shouldn't be silent about it, the shape is corrupt! We should call the setting something different if we can't agree on this. The OOB experience needs to be "follow the standard or you get an error", otherwise we would silently allow corrupt data into the index. |
|
I agree that geodata differs from numerics in the context of 'coerce'. Even though it's fundamentally two numeric elements, they're also bounded by the spherical coordinate system. Defaulting coerce to true means ES will happily do more work just to accommodate nonsensical geodata. If the user really wants ES to "fix" bad data then they can explicitly set coerce to true. But IMHO we shouldn't make it the default expectation that all geodata ES receives will be bad. |
|
OK - then coerce false |
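The behavior agreed on above (strict by default, auto-close only when `coerce` is explicitly enabled) can be sketched roughly as follows. This is an illustrative Python sketch, not the Java code in this PR: `parse_linear_ring` and `ShapeParseError` are hypothetical names, and the four-position minimum is taken from the GeoJSON definition of a LinearRing rather than from the actual implementation.

```python
class ShapeParseError(ValueError):
    """Hypothetical error type for rejected rings."""


def parse_linear_ring(points, coerce=False):
    """Validate a polygon ring; close it automatically only if `coerce` is True."""
    ring = list(points)

    if ring and ring[0] != ring[-1]:
        if not coerce:
            # Default: follow the standard and reject unclosed rings loudly.
            raise ShapeParseError("invalid LinearRing: coordinates are not closed")
        # Opt-in: append the first point to close the ring.
        ring.append(ring[0])

    # GeoJSON defines a LinearRing as a closed ring with four or more positions.
    if len(ring) < 4:
        raise ShapeParseError(
            "invalid LinearRing: found %d points, need at least 4" % len(ring)
        )
    return ring


unclosed = [[0, 0], [0, 1], [1, 1], [1, 0]]

# With coerce enabled, the ring is closed for the user.
coerced = parse_linear_ring(unclosed, coerce=True)

# With the default (coerce=False), the same input is rejected.
try:
    parse_linear_ring(unclosed)
    strict_rejected = False
except ShapeParseError:
    strict_rejected = True
```

Keeping `coerce=False` as the default means an unclosed ring produces an error unless the user has opted in, which matches the "don't be silent about it" position taken in the thread.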
|
Bumping the version up to 1.7.1 for today's release. |
d5e6767 to
5c81f5e
Compare
|
LGTM |
…d GeoJSON While the GeoJSON spec does say a polygon is represented as an array of LinearRings (where a LinearRing is defined as a 'closed' array of points), the coerce parameter provides users with flexibility to have ES automatically close polygons. This addresses situations like those integrated with twitter (where GeoJSON polygons are not closed) such that our users do not have to write extra code to close the polygon. This code change adds the optional coerce parameter to the GeoShapeFieldMapper. closes elastic#11131
5c81f5e to
bc9a470
Compare
|
Regarding coerce (https://www.elastic.co/guide/en/elasticsearch/reference/current/coerce.html): is it only supported in the latest version? How about 1.7.3? |
While the GeoJSON spec does say a polygon is represented as an array of LinearRings (where a LinearRing is defined as a 'closed' array of points), it is a simple change to close the polygon for users. This addresses situations like those integrated with twitter (where GeoJSON polygons are not closed) such that our users do not have to write extra code to close the polygon.
closes #11131