More Like This queries return 0 results if field in source document has 0 tokens in analyzed field #30148

@kylelyk

Elasticsearch version: 6.2.4

Plugins installed: []

JVM version: 1.8.0_101

OS version: MacOS (Darwin Kernel Version 15.6.0)

Description of the problem including expected versus actual behavior:
"More Like This" queries do not return any results when a field on the source document produces no tokens at index time. Using a keyword field and manually specifying the analyzer at query time works as expected.

Steps to reproduce:

PUT test
{
  "mappings": {
    "type": {
      "properties": {
        "myField": {
          "type": "text"
        },
        "empty": {
          "type": "text"
        }
      }
    }
  }
}
POST /_bulk
{ "index":  { "_index": "test", "_type": "type","_id":1}}
{"myField":"and_foo", "empty":""}
{ "index":  { "_index": "test", "_type": "type","_id":2}}
{"myField":"and_foo", "empty":""}

This query correctly returns 1 result:

GET /_search
{
  "query": {
    "more_like_this": {
      "fields": [
        "myField"
      ],
      "like": [
        {
          "_index": "test",
          "_type": "type",
          "_id": "1"
        }
      ],
      "min_term_freq": 1,
      "min_doc_freq": 1
    }
  }
}

This query returns no results when using both fields:

GET /_search
{
  "query": {
    "more_like_this": {
      "fields": [
        "myField", "empty"
      ],
      "like": [
        {
          "_index": "test",
          "_type": "type",
          "_id": "1"
        }
      ],
      "min_term_freq": 1,
      "min_doc_freq": 1
    }
  }
}

If you update the "empty" field in document 1 to contain only characters that produce no tokens (such as punctuation), the second query still returns 0 results. Changing "empty" to a keyword field makes the query work as expected.
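
To confirm that the field really yields no tokens, the `_analyze` API can be run against the same index and field. This is only a sketch for checking the precondition: it assumes the `test` index from the steps above with its default (standard) analyzer, and the text `"!!!"` is an arbitrary punctuation-only sample, not a value from the original documents.

```
GET /test/_analyze
{
  "field": "empty",
  "text": "!!!"
}
```

With the standard analyzer this should return an empty `tokens` array, which is the condition under which the query using both fields returns nothing.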

Labels: :Search/Search, >bug, good first issue, low hanging fruit
