Term Vectors doesn't work on artificial docs with keyword fields #53494

@matriv

Steps to reproduce:

PUT test_termvector
{
  "mappings": {
    "properties": {
      "words": {
        "type": "keyword"
      }
    }
  }
}

PUT test_termvector/_doc/1?refresh
{
  "words": [
    "b",
    "b",
    "b",
    "a",
    "e",
    "f",
    "f"
  ]
}

# we can get term scores without artificial documents
GET test_termvector/_termvectors/1
{
  "fields": [
    "words"
  ],
  "term_statistics": false,
  "field_statistics": false,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3
  }
}

{
  "_index" : "test_termvector",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "took" : 0,
  "term_vectors" : {
    "words" : {
      "terms" : {
        "b" : {
          "term_freq" : 3,
          "score" : 3.0
        },
        "e" : {
          "term_freq" : 1,
          "score" : 1.0
        },
        "f" : {
          "term_freq" : 2,
          "score" : 2.0
        }
      }
    }
  }
}


# we **cannot** get term scores with artificial documents of keyword type
GET test_termvector/_termvectors
{
  "doc": {
    "words": [
      "b",
      "b",
      "b",
      "a",
      "e",
      "f",
      "f"
    ]
  },
  "fields": [
    "words"
  ],
  "term_statistics": false,
  "field_statistics": false,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3
  }
}

{
  "_index" : "test_termvector",
  "_type" : "_doc",
  "_version" : 0,
  "found" : true,
  "took" : 0,
  "term_vectors" : { }
}

The problem is in ParseContext#getValues(), where field.stringValue() returns null for keyword fields, so no values are collected for the artificial document. We need to check for KeywordFieldType and convert the field's BytesRef to a UTF-8 string.
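A minimal sketch of that fallback, using only the JDK so it can run on its own. `SimpleField`, `KeywordValueSketch`, and this `getValues` are hypothetical stand-ins, not the Elasticsearch or Lucene classes; the byte array plays the role of the BytesRef. In the real code the conversion would presumably go through Lucene's `BytesRef.utf8ToString()`, which is not exercised here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class KeywordValueSketch {

    // Hypothetical stand-in for an indexed field: exactly one of the two values is set.
    static final class SimpleField {
        final String name;
        final String stringValue; // null for keyword-style fields
        final byte[] binaryValue; // UTF-8 bytes for keyword-style fields, otherwise null

        SimpleField(String name, String stringValue, byte[] binaryValue) {
            this.name = name;
            this.stringValue = stringValue;
            this.binaryValue = binaryValue;
        }

        static SimpleField text(String name, String value) {
            return new SimpleField(name, value, null);
        }

        static SimpleField keyword(String name, String value) {
            return new SimpleField(name, null, value.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Collects the values of one field, falling back to the binary value
    // when the string value is null (the keyword case described in the issue).
    static List<String> getValues(List<SimpleField> fields, String name) {
        List<String> values = new ArrayList<>();
        for (SimpleField field : fields) {
            if (!field.name.equals(name)) {
                continue;
            }
            if (field.stringValue != null) {
                values.add(field.stringValue);
            } else if (field.binaryValue != null) {
                values.add(new String(field.binaryValue, StandardCharsets.UTF_8));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<SimpleField> doc = List.of(
            SimpleField.keyword("words", "b"),
            SimpleField.keyword("words", "a"),
            SimpleField.text("title", "ignored"));
        System.out.println(getValues(doc, "words"));
    }
}
```

Without the `else if` branch, the keyword values would be skipped and the list would come back empty, which matches the empty `term_vectors` object in the response above.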

Labels: :Search/Search, >bug