-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Integrate GeoHexGridAggregation with vector tiles API #84553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit adds a new optional parameter on the vector tiles API called `grid_agg` with two possible values, geotile (default) and geohex. This will allow to build the aggs layer using different grid aggregations, for example we can have a grid aggregation that is built using hexagons. Note the GeoHexGridAggregation only support data indexed using geo_point field type so this limitation applies here too.
|
Pinging @elastic/es-analytics-geo (Team:Analytics) |
|
Hi @iverase, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
@jrodewig I love what you did, thanks! |
| // zoom 27: 13 | 14 | 14 | 15 | 15 | 15 | 15 | 15 | | ||
| // zoom 28: 14 | 14 | 15 | 15 | 15 | 15 | 15 | 15 | | ||
| // zoom 29: 14 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | | ||
| return Math.min(H3.MAX_H3_RES, Math.max(2, (z + gridPrecision - 1) / 2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @iverase
@nreese and I were discussing this mapping-function earlier, in brining this into Kibana.
We're wondering if this is the right heuristic on how to map a geohex-level to a zoom-level.
What we would like to achieve is a heuristic that approximates density of hex-bins in the tile to that of what we have using grid or clusters.
This heuristic
target_hex_resolution = (<zoom> + grid_precision - 1) / 2
generally tends to create too fine hex-grids for a given grid_precision, compared to comparable tiles using clusters or grid, especially at lower zoom.
How grid_precision interacts with the existing mvt-api is simple: zoom + grid_precision = target_geotile_level. This mapping to target_geotile_level is easy to reason about. Hoping for something similar with geohex.
We would prefer a heuristic that maps target_geotile_zoom to the geohex_zoom based on density of bins.
So more something like:
target_hex_resolution = mapGeotileToHexbin(zoom + grid_precision)
where mapGeoTileToHexbin() is the mapping from level->h3 resolution from this table:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right that the hex bins resolution for high zooms is too low and can be improved. Still what you propose would not wok either because the density of hex bin in your example is too big.
For geotile and grid_precision 8 we can generate up to 65536 bins which is the default limit for the number of buckets in a bucket aggregation. Therefore the density of hex bins should always be <= density of geotile, otherwise we might go over the limit of the number of buckets.
In your proposal, the density of hex bins can be up to 2.5 times higher than tile bins. We should make sure density is not bigger than 1 (it can be slightly bigger as the number of hex bins consider hexes around the poles which are not part of the mercator projection and for small zoom level when the number of bins < 65536).
Here is my approximation, what I wonder is how to document this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated how we compute the H3 resolution by using the table above. Let me know what you think @thomasneirynck
@craigtaverner
imotov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
# Conflicts: # x-pack/plugin/vector-tile/src/main/java/org/elasticsearch/xpack/vectortile/feature/FeatureFactory.java # x-pack/plugin/vector-tile/src/test/java/org/elasticsearch/xpack/vectortile/feature/FeatureFactoriesConsistencyTests.java
|
@jrodewig Could you have a look into the docs again? We update the way we compute the precision for GeoHex so we needed to update the docs as well. |
nreese
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
tried hex bins out with Kibana, elastic/kibana#127170, and it looks great. Adding this to vector tile service makes it so easy to integrate into Kibana, thanks.
jrodewig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
I open #84993 @elasticmachine run elasticsearch-ci/part-2 |
|
@elasticmachine run elasticsearch-ci/packaging-tests-windows-sample |
craigtaverner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few comments and suggestions, but otherwise I think it looks good.
| // 26 4503599627370500 15 569707381193162 0.1265004504 | ||
| // 27 18014398509482000 15 569707381193162 0.03162511259 | ||
| // 28 72057594037927900 15 569707381193162 0.007906278149 | ||
| // 29 288230376151712000 15 569707381193162 0.001976569537 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since H3 only goes down to 1m² and clearly elastic square tiles go down to much smaller tiles, how feasible would it be to increase the number of H3 levels to match the resolution of the tiles? Perhaps two more H3 levels would suffice? Or is it that in H3 the 0..15 range is very clearly hard-coded (eg. only 4 bits used to store the H3 level)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is just hard coded, e.g if you increase the max resolution all the test fail so there is much more into it. Note that hexagons don't divide the sphere perfectly so you have some pentagons (and other shapes) that need to fail the gaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed in some Uber videos they were describing H3 tiles with fixed length strings, and tiles with less than level 15 would have suffixes made of the character f, so it appears that it is likely that the 0..15 range is correlated to the string length chosen for this encoding. I'd be curious to investigate more at some point.
|
|
||
| @Override | ||
| public int gridPrecisionToAggPrecision(int z, int gridPrecision) { | ||
| return ZOOM2RESOLUTION[GEOTILE.gridPrecisionToAggPrecision(z, gridPrecision)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry that we are trying to be too clever here, and trying to force the zoom levels of the H3 tiles to match the zoom levels of the grid. What about pushing this problem to the client? So when doing geotiles, the possible zoom values would be 1 to 29, but when doing H3 the possible values are reduced to 0 to 15. The server side code is much simpler, and clients like kibana need to worry about just what works best for visualization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did consider having different precision meaning for geotile and geohex but I think it will make the API hard to use. In addition, we are already not exposing the geotile zoom while aggregating but a grid precision. Therefore I think it makes sense to adjust the grid precision to the density of bins.
| } | ||
| final int aggPrecision = gridPrecisionToAggPrecision(z, gridPrecision); | ||
| if (aggPrecision < 2) { | ||
| // we need to consider all data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this? For H3 even level 0 divides the world into 20 tiles, so looking at all data seems to be an inefficient option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that some hex bins will be going over the pole and at the moment we are not able to compute the buffering using cartesian bounding boxes. This is an edge case that only happens when the grid_precision is <= 2 and you are at zoom = 0 or zoom = 1, therefore let consider it if there is some use case where performance is needed. wdyt?
| // we need to consider all data | ||
| return new Rectangle(-180, 180, GeoTileUtils.LATITUDE_MASK, -GeoTileUtils.LATITUDE_MASK); | ||
| } | ||
| return new Rectangle( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, the constructor of Rectangle was confusion. I thought you had made a mistake with the 3rd and 4th arguments swapped around, but when I looked at the code for Rectangle it really does have minX, maxX, maxY, minY! I guess this is related to the convention to have the original at the top-left instead of bottom left?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is legacy and like you I found it confusing.
...lugin/vector-tile/src/main/java/org/elasticsearch/xpack/vectortile/rest/GridAggregation.java
Outdated
Show resolved
Hide resolved
| lats[i] = latLng.getLatDeg(); | ||
| lons[i] = latLng.getLonDeg(); | ||
| } | ||
| lats[boundary.numPoints()] = boundary.getLatLon(0).getLatDeg(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| lats[boundary.numPoints()] = boundary.getLatLon(0).getLatDeg(); | |
| lats[boundary.numPoints()] = lats[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
x-pack/plugin/vector-tile/src/main/java/org/elasticsearch/xpack/vectortile/rest/GridType.java
Outdated
Show resolved
Hide resolved
…k/vectortile/rest/GridAggregation.java Co-authored-by: Craig Taverner <[email protected]>
…k/vectortile/rest/GridType.java Co-authored-by: Craig Taverner <[email protected]>
|
@elasticmachine run elasticsearch-ci/part-2 |


This commit adds a new optional parameter on the vector tiles API called
grid_aggwith twopossible values, geotile (default) and geohex. This will allow to build the aggs layer using different
grid aggregations, for example we can have a grid aggregation that is built using hexagons.
Note the GeoHexGridAggregation only support data indexed using geo_point field type so this limitation
applies here too.
Docs preview
https://elasticsearch_84553.docs-preview.app.elstc.co/guide/en/elasticsearch/reference/master/search-vector-tile-api.html