Make index_prefixes work with span_multi #31056

@jpountz

Currently it fails because span_multi expects a MultiTermQuery, while prefix queries might create term queries when index_prefixes is enabled. Maybe span_multi should also accept term queries wrapped in a constant_score? There might also be things we need to change regarding how we decide which index options to use on the prefix field type.
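To make the failure mode concrete, here is a rough toy model of the type check and of the relaxation suggested above. The classes are simplified stand-ins, not the real Lucene or Elasticsearch classes, and the function names and the `._index_prefix` sub-field name are assumptions made for illustration only.

```python
from dataclasses import dataclass


# Toy stand-ins for the query classes named in the error message.
# They only model the type hierarchy, not any real search behavior.
class Query:
    pass


class MultiTermQuery(Query):
    pass


@dataclass
class PrefixQuery(MultiTermQuery):
    field: str
    prefix: str


@dataclass
class TermQuery(Query):
    field: str
    term: str


@dataclass
class ConstantScoreQuery(Query):
    inner: Query


def build_prefix_query(field, value, index_prefixes_enabled):
    """Hypothetical prefix query builder for a text field."""
    if index_prefixes_enabled:
        # Assumption: the prefix is looked up as a single term in a
        # dedicated sub-field and wrapped in a constant score query.
        return ConstantScoreQuery(TermQuery(field + "._index_prefix", value))
    return PrefixQuery(field, value)


def span_multi_strict(inner):
    """Models the current check: only a MultiTermQuery is accepted."""
    if not isinstance(inner, MultiTermQuery):
        raise TypeError(
            "unsupported inner query, should be MultiTermQuery but was "
            + type(inner).__name__
        )
    return ("span_multi", inner)


def span_multi_lenient(inner):
    """Models the proposal: also accept constant_score(term)."""
    if isinstance(inner, ConstantScoreQuery) and isinstance(inner.inner, TermQuery):
        # A single term can be used directly as a span term clause.
        return ("span_term", inner.inner)
    return span_multi_strict(inner)
```

In this model, `span_multi_strict(build_prefix_query("foo", "jumps", True))` raises, mirroring the error in the recreation, while `span_multi_lenient` turns the same input into a span term clause. The sketch ignores the index options question raised above: a span clause on the prefix field would presumably need positions to be indexed there.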

I think it would be a major improvement for users who need to run queries of the form a NEAR b*, and who currently kill their clusters because the span_multi expands to too many terms.
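As a small illustration of why the expansion is costly and why indexed prefixes help, the sketch below contrasts expanding a prefix against a term dictionary with looking up one pre-indexed prefix. The term set and helper names are made up, and the `min_chars=2` / `max_chars=5` bounds are assumed to match the index_prefixes defaults.

```python
def expand_prefix(terms, prefix):
    """Every term matching the prefix becomes its own span clause."""
    return sorted(t for t in terms if t.startswith(prefix))


def indexed_prefixes(term, min_chars=2, max_chars=5):
    """Prefixes of a term that would be stored in the prefix field."""
    longest = min(len(term), max_chars)
    return [term[:n] for n in range(min_chars, longest + 1)]


# Hypothetical term dictionary for the field.
terms = {"jump", "jumped", "jumper", "jumping", "jumps", "fox"}

# Without index_prefixes: one clause per matching term.
expanded = expand_prefix(terms, "jum")

# With index_prefixes: the prefix itself is a single indexed term.
prefix_field = {p for t in terms for p in indexed_prefixes(t)}
single_lookup = "jum" in prefix_field
```

Here the number of clauses in `expanded` grows with the size of the term dictionary, whereas the lookup in `prefix_field` stays a single term regardless of how many terms share the prefix.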

Here is a recreation:

PUT index
{
  "mappings": {
    "_doc": {
      "properties": {
        "foo": {
          "type": "text",
          "index_prefixes": {}
        }
      }
    }
  }
}

PUT index/_doc/1
{
  "foo": "the quick fox jumps over the lazy dog"
}

GET index/_search
{
  "query": {
    "span_near": {
      "clauses": [
        {
          "span_multi": {
            "match": {
              "prefix": {
                "foo": {
                  "value": "jumps"
                }
              }
            }
          }
        },
        {
          "span_term": {
            "foo": "fox"
          }
        }
      ],
      "slop": 2
    }
  }
}

which triggers:

{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: {\n  \"span_near\" : {\n    \"clauses\" : [\n      {\n        \"span_multi\" : {\n          \"match\" : {\n            \"prefix\" : {\n              \"foo\" : {\n                \"value\" : \"jumps\",\n                \"boost\" : 1.0\n              }\n            }\n          },\n          \"boost\" : 1.0\n        }\n      },\n      {\n        \"span_term\" : {\n          \"foo\" : {\n            \"value\" : \"fox\",\n            \"boost\" : 1.0\n          }\n        }\n      }\n    ],\n    \"slop\" : 2,\n    \"in_order\" : true,\n    \"boost\" : 1.0\n  }\n}",
        "index_uuid": "L67oony8REGtPcAHC0C8WQ",
        "index": "index"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "index",
        "node": "MYCVYLGiSiy2tZzASlk7uA",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n  \"span_near\" : {\n    \"clauses\" : [\n      {\n        \"span_multi\" : {\n          \"match\" : {\n            \"prefix\" : {\n              \"foo\" : {\n                \"value\" : \"jumps\",\n                \"boost\" : 1.0\n              }\n            }\n          },\n          \"boost\" : 1.0\n        }\n      },\n      {\n        \"span_term\" : {\n          \"foo\" : {\n            \"value\" : \"fox\",\n            \"boost\" : 1.0\n          }\n        }\n      }\n    ],\n    \"slop\" : 2,\n    \"in_order\" : true,\n    \"boost\" : 1.0\n  }\n}",
          "index_uuid": "L67oony8REGtPcAHC0C8WQ",
          "index": "index",
          "caused_by": {
            "type": "unsupported_operation_exception",
            "reason": "unsupported inner query, should be org.apache.lucene.search.MultiTermQuery but was org.apache.lucene.search.ConstantScoreQuery"
          }
        }
      }
    ]
  },
  "status": 400
}

cc @jimczi @romseygeek

Labels: :Search/Search, >feature
