Skip to content

Conversation

@nik9000
Copy link
Member

@nik9000 nik9000 commented Sep 20, 2021

If tsdb is enabled we need an @timestamp field. This automatically
maps the field if it is missing and fails to create indices in
time_series mode that map @timestamp as anything other than date and
date_nanos.

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Sep 20, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

}
if (false == timestamp.get() instanceof DateFieldMapper.Builder) {
throw new IllegalArgumentException("@timestamp must be [date] or [date_nanos]");
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data streams also automatically @timestamp as a date but the way that do it seems really heavy to me. They have a special template called DataStreamTemplate and they automatically apply it if the name starts with .ds-. We could do something similar but it'd take a bunch of extra layers of plumbing to get the setting into the template resolution layer. We'd still want the validation somewhere anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends, do you want the validation outside of data streams, or as a configuration option on the data stream itself? If you want it outside of data streams (for use with aliases for instance), then don't use the DataStreamTemplate.

Also, I notice that you add the mapper, but don't actually enforce that the @timestamp is filled in the way that a data stream does. I think you also need to validate that every document contains an @timestamp field, is that right? Or are you planning on filling it in automatically if it is missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends, do you want the validation outside of data streams, or as a configuration option on the data stream itself?

I want it totally independent of data streams. If a time_series mode index is not in a data stream I still want there to be an @timestamp field. I was more wondering if I should make a second special template or use a mechanism like this one here.

Also, I notice that you add the mapper, but don't actually enforce that the @timestamp is filled in the way that a data stream does. I think you also need to validate that every document contains an @timestamp field, is that right? Or are you planning on filling it in automatically if it is missing?

I was planning on adding that test in a follow up change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach seems fine to me then, since you want it to be independent.

I was planning on adding that test in a follow up change.

This should hopefully be easy as using DataStreamTimestampFieldMapper instead of the DateFieldMapper. I think we should probably unify them also so that they share the same infrastructure for validation, for example, it doesn't help if we add a @timestamp field but the user configures it for indexed: false so we can't use it. The DataStreamTimestampFieldMapper does more validation here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can have a look at unifying them in the follow up, yeah!

If tsdb is enabled we need an `@timestamp` field. This automatically
maps the field if it is missing and fails to create indices in
time_series mode that map `@timestamp` as anything other than `date` and
`date_nanos`.
@nik9000
Copy link
Member Author

nik9000 commented Sep 20, 2021

run elasticsearch-ci/eql-correctness

@nik9000
Copy link
Member Author

nik9000 commented Sep 20, 2021

run elasticsearch-ci/packaging-tests-unix-sample

@nik9000
Copy link
Member Author

nik9000 commented Sep 20, 2021

run elasticsearch-ci/packaging-tests-windows-sample

@nik9000
Copy link
Member Author

nik9000 commented Sep 21, 2021

Adding @romseygeek and @dakrone to talk a bit about whether or not its worth mimicking the way datastreams automatically map timestamp. My instinct is that it isn't worth it but certainly open to learning.

@dakrone
Copy link
Member

dakrone commented Sep 29, 2021

@imotov if I recall I think you said something about why this couldn't be done either this way or the data stream way, am I remembering correctly?

@imotov
Copy link
Contributor

imotov commented Sep 29, 2021

l feel like it is definitely worth talking about it. Since we are already defining a timestamp in data streams my preference would be to reuse it, if that not possible - reimplement it in a consistent manner, or as the last resort come up with a plan to make it consistent in the future. I really don't want to explain users that this is how timestamp works in ordinary data steams but not in time series data streams.

@nik9000
Copy link
Member Author

nik9000 commented Oct 26, 2021

Replaced by #79136.

@nik9000 nik9000 closed this Oct 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants