Description
The current date mapping code treats unix timestamps differently from other date formats. We should unify this, even though it requires changing our defaults and requires the user to explicitly configure the unix timestamp use case.
Today we parse dates as follows:
Mapped fields with a format (defaults to dateOptionalTime)
- If number, treat as epoch ms
- If string, try to parse with defined format(s)
- If it fails and is purely numeric, treat as epoch ms
- Else fail
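A minimal Python sketch of the four rules above, to make the fallback order concrete. It is not the actual mapper code: `ASSUMED_FORMAT` and `parse_mapped_date_legacy` are hypothetical names, and a strptime pattern stands in for the Joda dateOptionalTime format (which is more lenient).

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the field's configured format. The real default
# (dateOptionalTime) is a Joda pattern and accepts much more than this.
ASSUMED_FORMAT = "%Y-%m-%d"


def parse_mapped_date_legacy(value):
    """Return epoch milliseconds, following the four rules listed above."""
    # Rule 1: a number is taken as epoch milliseconds.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return int(value)
    # Rule 2: a string is first tried against the configured format.
    try:
        dt = datetime.strptime(value, ASSUMED_FORMAT).replace(tzinfo=timezone.utc)
        return int(dt.timestamp() * 1000)
    except ValueError:
        pass
    # Rule 3: a purely numeric string falls back to epoch milliseconds.
    if value.isdigit():
        return int(value)
    # Rule 4: anything else fails.
    raise ValueError(f"cannot parse date: {value!r}")
```

Note how the number 1420070400000 and the string "1420070400000" both end up as epoch ms, but by different routes (rule 1 versus rule 3).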
Dynamic date detection
- If string,
  - and contains at least two :, -, or /
  - and matches dynamic date formats (defaults to dateOptionalTime || yyyy/MM/dd HH:mm:ss || yyyy/MM/dd)
  - then date, else string
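The detection heuristic above could be sketched roughly as follows. Assumptions: "at least two" is read as two separator characters in total, and the strptime patterns in `ASSUMED_DYNAMIC_FORMATS` are only approximations of the Joda defaults (they are stricter about the year). All names are hypothetical.

```python
from datetime import datetime

# Approximate strptime equivalents of the default dynamic date formats.
# These are assumptions; the Joda patterns they stand in for are more lenient.
ASSUMED_DYNAMIC_FORMATS = [
    "%Y-%m-%d",
    "%Y-%m-%dT%H:%M:%S",
    "%Y/%m/%d %H:%M:%S",
    "%Y/%m/%d",
]


def detect_type_legacy(value: str) -> str:
    """Return "date" or "string" for a string value seen in a new field."""
    # Cheap pre-check: count the separator characters ':', '-' and '/'.
    separators = sum(value.count(ch) for ch in ":-/")
    if separators < 2:
        return "string"
    # Only then try the dynamic date formats, in order.
    for fmt in ASSUMED_DYNAMIC_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return "date"
        except ValueError:
            continue
    return "string"
```

With this sketch, "2015.01.01" and "20150101T000000" both fall out at the separator check, regardless of which formats are configured.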
There are a few issues which can surprise users:
- Joda dates are not strict, so "1/1/1" is detected as a date, and "1" would be interpreted as 0001-01-01 00:00:00
- The distinction between numeric and string values is not always possible, eg query string params are always strings (_timestamp), a date in the query_string query is always a string, and even in the JSON body some languages can render a number as a string and vice versa
- Dates such as 2015.01.01 (german) or 20150101T000000 (iso8601) can never be detected dynamically
Proposals
Make date parsing as unambiguous as possible. Where there is ambiguity, it is because the user chose ambiguous options (which we can warn about in the docs).
For indices created in 2.0:
- Add two formats for parsing epoch: epoch_ms and epoch_seconds (Added epoch date formats to configure parsing of unix dates #11453)
- Add strict Joda formats, where eg the year must have 4 digits (More strict parsing of ISO dates #6227)
- Remove numeric_resolution (not needed with above)
For mapped date field:
- only check the specified formats, which default to strictDateOptionalTime || epoch_ms
- No distinction between numeric and string values for date fields - always parsed as strings (ie coerce from numeric)
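A sketch of what the proposed rule for mapped date fields might look like: coerce the value to a string, then try only the configured formats in order. The `PARSERS` table, the `strict_date` name, and `parse_mapped_date` are hypothetical; `strict_date` is a loose strptime stand-in for strictDateOptionalTime, not an equivalent.

```python
import re
from datetime import datetime, timezone

_INTEGER = re.compile(r"-?\d+")


def _strict_date(text):
    # Stand-in for strictDateOptionalTime: date only, four-digit year.
    dt = datetime.strptime(text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def _epoch_ms(text):
    if not _INTEGER.fullmatch(text):
        raise ValueError(f"not an integer: {text!r}")
    return int(text)


def _epoch_seconds(text):
    return _epoch_ms(text) * 1000


# Hypothetical registry of named formats.
PARSERS = {
    "strict_date": _strict_date,
    "epoch_ms": _epoch_ms,
    "epoch_seconds": _epoch_seconds,
}


def parse_mapped_date(value, formats=("strict_date", "epoch_ms")):
    """Return epoch milliseconds using only the formats configured on the field."""
    text = str(value)  # numeric and string inputs are treated alike
    for name in formats:
        try:
            return PARSERS[name](text)
        except ValueError:
            continue
    raise ValueError(f"{text!r} matches none of {list(formats)}")
```

The point of the design is that a field mapped without epoch_ms rejects bare numbers outright, so any ambiguity comes from the formats the user chose.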
For dynamic date detection:
- only check string values (don't coerce numerics)
- accept any formats except epoch_ms and epoch_seconds
- mapping should add just the matching format (optionally append epoch_ms?)
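For dynamic detection, the proposal could be sketched like this: numerics are skipped entirely, and the first configured format that matches is the one recorded in the mapping. `CANDIDATE_FORMATS` and `detect_date_format` are hypothetical, and the formats are written as strptime patterns rather than Joda ones.

```python
from datetime import datetime

# Hypothetical candidate formats, written as strptime patterns.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%Y/%m/%d %H:%M:%S", "%Y/%m/%d"]


def detect_date_format(value, candidates=CANDIDATE_FORMATS):
    """Return the first matching format, or None if the value is not a date."""
    # Only strings are inspected; numbers are never coerced into dates.
    if not isinstance(value, str):
        return None
    for fmt in candidates:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            continue
        # The mapping would store just this format for the new field.
        return fmt
    return None
```

Under this scheme a user who adds a dotted format to the candidates gets 2015.01.01 detected, since there is no separator pre-check any more.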
For indices created before 2.0:
We need to keep bwc on older indices, so we follow the same rules as specified at the beginning of this comment
Query time
Typically users will always use the same format at index time - they don't mix epoch timestamps with formatted dates, which is why we should only parse the specified formats.
However, at query time it is quite possible that (eg) Kibana may query with epoch timestamps, even though the date field only accepts a formatted date. Today, in the range query we accept a format parameter which is used to parse dates at query time.
There are two options to deal with this situation:
- Add a format parameter to the term, terms, query_string, and simple_query_string queries, and to the range aggregation
- Add a special format for epoch timestamps which is always recognised, eg epoch_ms:123456789
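The second option might work along these lines: check for the epoch_ms: prefix first, and only fall through to the field's own formats if it is absent. This is a sketch only; `parse_query_date` and the `field_parser` callback are hypothetical names.

```python
import re

# Matches the always-recognised form from the proposal, eg epoch_ms:123456789.
_EPOCH_MS_PREFIX = re.compile(r"epoch_ms:(-?\d+)")


def parse_query_date(text, field_parser):
    """Return epoch milliseconds for a date string given at query time.

    field_parser is a hypothetical callback that applies the formats
    configured on the field and returns epoch milliseconds.
    """
    match = _EPOCH_MS_PREFIX.fullmatch(text)
    if match:
        # Prefixed values bypass the field's formats entirely.
        return int(match.group(1))
    return field_parser(text)
```

Compared with the first option, this needs no new parameter on each query type, at the cost of reserving the epoch_ms: prefix in every date string.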