Date processor incorrectly parses week based dates with java.time #58479

@pgomulka

Description

The ingest date processor incorrectly parses week-based date formats.
This was raised by a user on the discuss forum: https://discuss.elastic.co/t/elastic-joda-to-java-time-update-illegal-pattern-component-uuuu/237830
It affects v8, v7, and 6.8.
Class: https://github.com/elastic/elasticsearch/blob/master/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/DateFormat.java#L82

Steps to reproduce in 6.8 (the 8 prefix on the format selects the java.time implementation):

curl --request PUT \
  --url http://localhost:9200/_ingest/pipeline/date_pipeline \
  --header 'authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==' \
  --header 'content-type: application/json' \
  --data '{
  "description": "Pipeline for parsing date",
  "processors": [
    {
      "date": {
        "field": "createdTime",
        "formats": [
         "8YYYY-ww"
        ]
      }
    }
  ]
}'

Simulate the pipeline:

curl --request POST \
  --url http://localhost:9200/_ingest/pipeline/_simulate \
  --header 'authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==' \
  --header 'content-type: application/json' \
  --data '{
  "pipeline": {
    "processors": [
      {
        "pipeline": {
          "name": "date_pipeline"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "createdTime": "2020-33"
      }
    }
  ]
}'

Incorrect response (@timestamp resolves to 2020-01-01 instead of a date in week 33):

{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_type": "_type",
        "_id": "_id",
        "_source": {
          "createdTime": "2020-33",
          "@timestamp": "2020-01-01T00:00:00.000Z"
        },
        "_ingest": {
          "timestamp": "2020-06-24T08:12:19.899156Z"
        }
      }
    }
  ]
}

Ingesting a document through the pipeline fails with:

curl --request PUT \
  --url 'http://localhost:9200/test1/_doc/1?pipeline=date_pipeline' \
  --header 'authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==' \
  --header 'content-type: application/json' \
  --data '{
  "createdTime": "2020-33"
}'
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse field [createdTime] of type [date] in document with id '1'. Preview of field's value: '2020-33'"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse field [createdTime] of type [date] in document with id '1'. Preview of field's value: '2020-33'",
    "caused_by": {
      "type": "date_time_exception",
      "reason": "Invalid value for MonthOfYear (valid values 1 - 12): 33"
    }
  },
  "status": 400
}

It works with the Joda-Time implementation in 6.8 (format without the 8 prefix):

curl --request PUT \
  --url http://localhost:9200/_ingest/pipeline/date_pipeline \
  --header 'authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==' \
  --header 'content-type: application/json' \
  --data '{
  "description": "Pipeline for parsing date",
  "processors": [
    {
      "date": {
        "field": "createdTime",
        "formats": [
         "yyyy-ww"
        ]
      }
    }
  ]
}'
curl --request POST \
  --url http://localhost:9200/_ingest/pipeline/_simulate \
  --header 'authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==' \
  --header 'content-type: application/json' \
  --data '{
  "pipeline": {
    "processors": [
      {
        "pipeline": {
          "name": "date_pipeline"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "createdTime": "2020-33"
      }
    }
  ]
}'

Correct response (see @timestamp):

{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_type": "_type",
        "_id": "_id",
        "_source": {
          "createdTime": "2020-33",
          "@timestamp": "2020-08-10T00:00:00.000Z"
        },
        "_ingest": {
          "timestamp": "2020-06-24T08:44:48.250843Z"
        }
      }
    }
  ]
}
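
For comparison, here is a minimal standalone java.time sketch of the expected result. It is not the Elasticsearch code path: the class and method names are hypothetical, and it assumes ISO-8601 week rules via IsoFields rather than the locale-dependent Y and w pattern letters. The week-based year and week number alone do not identify a date, so the day of week is defaulted to Monday before resolving.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.time.temporal.IsoFields;
import java.util.Locale;

// Hypothetical illustration class, not part of Elasticsearch.
public class WeekDateParseSketch {

    // Parses "<week-based-year>-<week>" using ISO-8601 week rules (assumption).
    static final DateTimeFormatter ISO_WEEK = new DateTimeFormatterBuilder()
        .appendValue(IsoFields.WEEK_BASED_YEAR, 4)
        .appendLiteral('-')
        .appendValue(IsoFields.WEEK_OF_WEEK_BASED_YEAR, 2)
        // Without a day of week the parsed fields cannot resolve to a date,
        // so default to Monday (1), the first day of an ISO week.
        .parseDefaulting(ChronoField.DAY_OF_WEEK, 1)
        .toFormatter(Locale.ROOT);

    static LocalDate parseIsoWeek(String text) {
        return LocalDate.parse(text, ISO_WEEK);
    }

    public static void main(String[] args) {
        // Prints 2020-08-10, the same date as the Joda-based response above.
        System.out.println(parseIsoWeek("2020-33"));
    }
}
```

With the default day of week removed, the same input fails to resolve to a LocalDate, which is why a week-based format needs explicit handling rather than the year/month/day defaults.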