-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Closed
Labels
:Analytics/SQLSQL queryingSQL querying>enhancementTeam:QL (Deprecated)Meta label for query languages teamMeta label for query languages team
Description
Description
Certain OR expressions are translated into a number of nested bool queries which, unless balanced, end up being too deep and rejected by ES/Lucene.
Reproduced on 8.7.1 thanks to @e-parth-pathak:
- Create an Index and ingest data:
PUT /library/_bulk?refresh
{"index":{"_id":"Leviathan Wakes"}}
{"name":"Leviathan Wakes","author":"James S.A. Corey","release_date":"2011-06-02","page_count":561}
{"index":{"_id":"Hyperion"}}
{"name":"Hyperion","author":"Dan Simmons","release_date":"1989-05-26","page_count":482}
{"index":{"_id":"Dune"}}
{"name":"Dune","author":"Frank Herbert","release_date":"1965-06-01","page_count":604}
{"index":{"_id":"Persuation"}}
{"name":"Persuation","author":"Robert Cialdini","release_date":"2023-06-01","page_count":304}
{"index":{"_id":"Sapiens"}}
{"name":"Sapiens","author":"Yuval Noah Harari","release_date":"1969-06-01","page_count":204}
{"index":{"_id":"Hooked"}}
{"name":"Hooked","author":"Nir Eyal","release_date":"2022-06-01","page_count":408}
- Run a query:
POST /_sql/translate
{
"query": "SELECT * FROM library WHERE name like '%nir%' OR name like '%yuval%' OR name like '%dan%' OR name like '%frank%' OR name like '%robert%' OR name like '%leviathan%'"
}
- The response that I got:
{
"size": 1000,
"query": {
"bool": {
"should": [
{
"bool": {
"should": [
{
"bool": {
"should": [
{
"bool": {
"should": [
{
"bool": {
"should": [
{
"wildcard": {
"name.keyword": {
"wildcard": "*nir*",
"boost": 1
}
}
},
{
"wildcard": {
"name.keyword": {
"wildcard": "*yuval*",
"boost": 1
}
}
}
],
"boost": 1
}
},
{
"wildcard": {
"name.keyword": {
"wildcard": "*dan*",
"boost": 1
}
}
}
],
"boost": 1
}
},
{
"wildcard": {
"name.keyword": {
"wildcard": "*frank*",
"boost": 1
}
}
}
],
"boost": 1
}
},
{
"wildcard": {
"name.keyword": {
"wildcard": "*robert*",
"boost": 1
}
}
}
],
"boost": 1
}
},
{
"wildcard": {
"name.keyword": {
"wildcard": "*leviathan*",
"boost": 1
}
}
}
],
"boost": 1
}
},
"_source": false,
"fields": [
{
"field": "author"
},
{
"field": "name"
},
{
"field": "page_count"
},
{
"field": "release_date",
"format": "strict_date_optional_time_nanos"
}
],
"sort": [
{
"_doc": {
"order": "asc"
}
}
],
"track_total_hits": -1
}Note that while the lowest nested bool query uses two entries, the rest do not and instead rely on another bool wrapper.
Either all entries should be put into a should clause OR, all the OR entries need to be balanced and be wrapped on the same level.
P.S. This has been addressed in the past by #30267 however it still seems to occur when dealing with like comparisons.
Metadata
Metadata
Assignees
Labels
:Analytics/SQLSQL queryingSQL querying>enhancementTeam:QL (Deprecated)Meta label for query languages teamMeta label for query languages team