From 1eebe84620386b0d050d0edf2f2b04bef2599b08 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 14 Sep 2020 17:37:39 -0400 Subject: [PATCH 01/44] First steps in docs for runtime fields. --- docs/reference/mapping/types.asciidoc | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/docs/reference/mapping/types.asciidoc b/docs/reference/mapping/types.asciidoc index e35298946d7fb..85598af9bf275 100644 --- a/docs/reference/mapping/types.asciidoc +++ b/docs/reference/mapping/types.asciidoc @@ -108,11 +108,12 @@ as-you-type completion. === Arrays In {es}, arrays do not require a dedicated field data type. Any field can contain zero or more values by default, however, all values in the array must be of the -same field type. See <>. +same field type. + +See <> to learn more. [discrete] === Multi-fields - It is often useful to index the same field in different ways for different purposes. For instance, a `string` field could be mapped as a `text` field for full-text search, and as a `keyword` field for @@ -124,6 +125,21 @@ the <>, the This is the purpose of _multi-fields_. Most field types support multi-fields via the <> parameter. +[discrete] +=== Runtime +Oftentimes, you just want to make your data available to {es} without indexing +or preprocessing. _Runtime fields_ extend the flexibility of the +<> to quickly make data searchable, even with minimal +structure. + +Runtime fields are not indexed and do have <>, meaning +Lucene is unaware of their existence. However, runtime fields are accessible +from the search API like any other field that has `doc_values` and is +searchable. You can retrieve and query these fields, as well as aggregate and +sort on them. + +See <> to learn more. + include::types/alias.asciidoc[] include::types/array.asciidoc[] From e1c042704be7393445f1a985916aee7abedcdd41 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 14 Sep 2020 17:38:51 -0400 Subject: [PATCH 02/44] Adding new page for runtime fields. 
--- docs/reference/mapping/types/runtime.asciidoc | 2 ++ 1 file changed, 2 insertions(+) create mode 100644 docs/reference/mapping/types/runtime.asciidoc diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc new file mode 100644 index 0000000000000..0a5bd056c7739 --- /dev/null +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -0,0 +1,2 @@ +[[runtime]] +=== Runtime From 47bb110f54f6f8e331b2a07981e24a1e19b55f1c Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 15 Sep 2020 10:21:25 -0400 Subject: [PATCH 03/44] Adding page for runtime fields. --- docs/reference/mapping/types/runtime.asciidoc | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 0a5bd056c7739..c9789856d1923 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -1,2 +1,5 @@ [[runtime]] === Runtime +Typically, you must index fields before they can be retrieved, aggregated, or +searched. Runtime fields provide flexibility in determining which fields to +index over time, rather than indexing all of your data up front. From 0c97b926552d746d227b8b77b1622a9238c71da6 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Thu, 17 Sep 2020 17:28:29 -0400 Subject: [PATCH 04/44] Adding more to the runtime fields topic. --- docs/reference/mapping/types.asciidoc | 4 ++ docs/reference/mapping/types/runtime.asciidoc | 43 +++++++++++++++++-- 2 files changed, 44 insertions(+), 3 deletions(-) diff --git a/docs/reference/mapping/types.asciidoc b/docs/reference/mapping/types.asciidoc index 85598af9bf275..8fcb83263bbaa 100644 --- a/docs/reference/mapping/types.asciidoc +++ b/docs/reference/mapping/types.asciidoc @@ -132,11 +132,13 @@ or preprocessing. _Runtime fields_ extend the flexibility of the <> to quickly make data searchable, even with minimal structure. 
+// tag::runtime-fields-description[] Runtime fields are not indexed and do have <>, meaning Lucene is unaware of their existence. However, runtime fields are accessible from the search API like any other field that has `doc_values` and is searchable. You can retrieve and query these fields, as well as aggregate and sort on them. +// end::runtime-fields-description[] See <> to learn more. @@ -184,6 +186,8 @@ include::types/rank-feature.asciidoc[] include::types/rank-features.asciidoc[] +include::types/runtime.asciidoc[] + include::types/search-as-you-type.asciidoc[] include::types/text.asciidoc[] diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index c9789856d1923..0d6f153ac25c3 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -1,5 +1,42 @@ [[runtime]] === Runtime -Typically, you must index fields before they can be retrieved, aggregated, or -searched. Runtime fields provide flexibility in determining which fields to -index over time, rather than indexing all of your data up front. +Typically, you must index fields to {es} before they can be retrieved, +aggregated, or searched. With runtime fields, you can explicitly define a field +in the mapping and access it at runtime without indexing your data. This +flexibility allows you to more quickly ingest raw data into the Elastic Stack +and immediately access it. By dynamically evaluating runtime fields at query +time, you can choose which fields to index and optimize disk space. + +include::{es-ref-dir}/mapping/types.asciidoc[tag=runtime-fields-description] + +Each runtime field is of the `runtime` data type, and has its own +<>, such as `boolean`, `integer`, or `keyword`. In +the following example, the data type is `runtime` and the runtime field type is +`integer`. 
+ +[source,console] +---- +PUT /my-index/_mappings +{ + “properties” : { + “age” : { + “type” : “runtime”, + “runtime_type” : “integer”, + “script” : { + “source” : "now - doc['date_of_birth'].value" + } + } + } +} +---- + +==== Retrieving runtime fields +Because runtime fields are not part of the +<>, they are not returned in search hits +by default. You can request runtime fields by using the +<>, or by using the +<> parameter to return `doc_values` for one +or more fields in the search response. Then, use the +<> to return runtime fields as +fields that can be searched and aggregated, including the cost of these +operations. From ee8a16fe17d775bd068467ccaa05ed1bf3e63dac Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Fri, 18 Sep 2020 13:15:57 -0400 Subject: [PATCH 05/44] Adding parameters and retrieval options for runtime fields. --- docs/reference/mapping/types/runtime.asciidoc | 81 ++++++++++++++----- 1 file changed, 62 insertions(+), 19 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 0d6f153ac25c3..032a428c128ec 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -9,34 +9,77 @@ time, you can choose which fields to index and optimize disk space. include::{es-ref-dir}/mapping/types.asciidoc[tag=runtime-fields-description] -Each runtime field is of the `runtime` data type, and has its own -<>, such as `boolean`, `integer`, or `keyword`. In -the following example, the data type is `runtime` and the runtime field type is -`integer`. +// Each runtime field is of the `runtime` data type, and has its own +// <>, such as `boolean`, `long`, or `keyword`. The +// field type identifies the data type In +// the following example, the data type is `runtime` and the runtime field type is +// `keyword`. + +==== Specifying runtime fields +Runtime fields do not have access to the +<>. 
You specify runtime fields in the +mapping by <>, which always has +access to the `source` field. At search time, the script runs and generates +values for each scripted field. + +The script in the following request extracts the day of the week from the +`timestamp` field, which is defined as a `date` data type. [source,console] ---- PUT /my-index/_mappings { - “properties” : { - “age” : { - “type” : “runtime”, - “runtime_type” : “integer”, - “script” : { - “source” : "now - doc['date_of_birth'].value" + "properties" : { + "day_of_week" : { + "type" : "runtime", <1> + "runtime_type" : "keyword", <2> + "script" : { + "source" : "emit(doc['timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" } - } + } } } ---- +<1> Runtime fields are of the `runtime` data type. +<2> Each runtime has its own field type, defined by `runtime_type`. + +[[runtime-params]] +==== Parameters for `runtime` fields +Runtime fields accept the following parameters: + +`type`:: +The type of runtime computation to perform at query time. Currently, runtime +fields only support the `runtime` data type. + +`runtime_type`:: +The <> for each scripted field. {es} +supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. + ==== Retrieving runtime fields -Because runtime fields are not part of the -<>, they are not returned in search hits -by default. You can request runtime fields by using the -<>, or by using the +Because runtime fields are not part of the `_source` field, they are not +returned in search hits by default. You can request runtime fields by using the +<> parameter, or by using the <> parameter to return `doc_values` for one -or more fields in the search response. Then, use the -<> to return runtime fields as -fields that can be searched and aggregated, including the cost of these -operations. +or more fields in the search response. + +Use the <> to return runtime fields +like any other field. 
For example, a runtime field with a `runtime_type` of +`keyword` returns as any other field that belongs to the `keyword` family. + +The following request uses the search API to retrieve the `day_of_week` field +that the previous request defined in the mapping. + +[source,console] +---- +GET /my-index/_search +{ + "aggs": { + "days_of_week": { + "terms": { + "field": "day_of_week" + } + } + } +} +---- From f56bde04baa1f7e8b9e4a1769b6dc9ad7a8194b7 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Fri, 18 Sep 2020 13:51:50 -0400 Subject: [PATCH 06/44] Adding TESTSETUP for index creation. --- docs/reference/mapping/types/runtime.asciidoc | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 032a428c128ec..6de7bfa71ab70 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -1,5 +1,14 @@ [[runtime]] === Runtime + +//// +[source,console] +---- +PUT /my-index +---- +// TESTSETUP +//// + Typically, you must index fields to {es} before they can be retrieved, aggregated, or searched. With runtime fields, you can explicitly define a field in the mapping and access it at runtime without indexing your data. This From bb822841debbcb14d0cd43b5ed38e989d6fc82f9 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 21 Sep 2020 17:50:02 -0400 Subject: [PATCH 07/44] Incorporating review feedback. --- docs/reference/mapping/types.asciidoc | 13 ++++--- docs/reference/mapping/types/runtime.asciidoc | 36 +++++++++---------- docs/reference/search/field-caps.asciidoc | 4 +++ 3 files changed, 26 insertions(+), 27 deletions(-) diff --git a/docs/reference/mapping/types.asciidoc b/docs/reference/mapping/types.asciidoc index 8fcb83263bbaa..ce51ec3139532 100644 --- a/docs/reference/mapping/types.asciidoc +++ b/docs/reference/mapping/types.asciidoc @@ -132,13 +132,12 @@ or preprocessing. 
_Runtime fields_ extend the flexibility of the <> to quickly make data searchable, even with minimal structure. -// tag::runtime-fields-description[] -Runtime fields are not indexed and do have <>, meaning -Lucene is unaware of their existence. However, runtime fields are accessible -from the search API like any other field that has `doc_values` and is -searchable. You can retrieve and query these fields, as well as aggregate and -sort on them. -// end::runtime-fields-description[] +Instead of indexing all of your data, you can use runtime fields to +search and aggregate on your data at query time. Runtime fields can incur +performance costs at runtime depending on the runtime type, but don't use the +disk space that is typically required to index your data. By removing the +requirement to index everything, you gain the flexibility of choosing which +fields to index. See <> to learn more. diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 6de7bfa71ab70..4ae736b1fa788 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -16,20 +16,18 @@ flexibility allows you to more quickly ingest raw data into the Elastic Stack and immediately access it. By dynamically evaluating runtime fields at query time, you can choose which fields to index and optimize disk space. -include::{es-ref-dir}/mapping/types.asciidoc[tag=runtime-fields-description] +Runtime fields are not indexed, but are accessible from the search API like any +other field that has <> and is searchable. You can +retrieve and query these fields, as well as aggregate and +sort on them. -// Each runtime field is of the `runtime` data type, and has its own -// <>, such as `boolean`, `long`, or `keyword`. The -// field type identifies the data type In -// the following example, the data type is `runtime` and the runtime field type is -// `keyword`. 
+NOTE: Runtime fields can be incur performance costs at search time, depending +on the <>. ==== Specifying runtime fields -Runtime fields do not have access to the -<>. You specify runtime fields in the -mapping by <>, which always has -access to the `source` field. At search time, the script runs and generates -values for each scripted field. +You specify runtime fields in the mapping by +<>. At search time, the script runs +and generates values for each scripted field. The script in the following request extracts the day of the week from the `timestamp` field, which is defined as a `date` data type. @@ -61,20 +59,18 @@ Runtime fields accept the following parameters: The type of runtime computation to perform at query time. Currently, runtime fields only support the `runtime` data type. +[[runtime-params-runtime-type]] `runtime_type`:: The <> for each scripted field. {es} supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. ++ +Runtime fields with a `runtime_type` of `date` can accept the +<> parameter exactly as the `date` field type. ==== Retrieving runtime fields -Because runtime fields are not part of the `_source` field, they are not -returned in search hits by default. You can request runtime fields by using the -<> parameter, or by using the -<> parameter to return `doc_values` for one -or more fields in the search response. - -Use the <> to return runtime fields -like any other field. For example, a runtime field with a `runtime_type` of -`keyword` returns as any other field that belongs to the `keyword` family. +Use the <> parameter on the `_search` API to retrieve +the values of runtime fields. This API works for all fields, even those that +were not sent as part of the original `_source`. The following request uses the search API to retrieve the `day_of_week` field that the previous request defined in the mapping. 
diff --git a/docs/reference/search/field-caps.asciidoc b/docs/reference/search/field-caps.asciidoc index 9d352273999ce..6a247238ee879 100644 --- a/docs/reference/search/field-caps.asciidoc +++ b/docs/reference/search/field-caps.asciidoc @@ -34,6 +34,10 @@ GET /_field_caps?fields=rating The field capabilities API returns the information about the capabilities of fields among multiple indices. +Use the field capabilities API to return <> like any +other field. For example, a runtime field with a `runtime_type` of +`keyword` returns as any other field that belongs to the `keyword` family. + [[search-field-caps-api-path-params]] ==== {api-path-parms-title} From b2cbf0a7f99b604fa825ab1ffc7cb4360793cbe5 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 22 Sep 2020 17:16:51 -0400 Subject: [PATCH 08/44] Incorporating reviewer feedback. --- docs/reference/mapping/types.asciidoc | 40 ++++---- docs/reference/mapping/types/runtime.asciidoc | 93 +++++++++++++------ 2 files changed, 88 insertions(+), 45 deletions(-) diff --git a/docs/reference/mapping/types.asciidoc b/docs/reference/mapping/types.asciidoc index ce51ec3139532..28c4c69765b9a 100644 --- a/docs/reference/mapping/types.asciidoc +++ b/docs/reference/mapping/types.asciidoc @@ -108,11 +108,10 @@ as-you-type completion. === Arrays In {es}, arrays do not require a dedicated field data type. Any field can contain zero or more values by default, however, all values in the array must be of the -same field type. - -See <> to learn more. +same field type. See <>. [discrete] +[[types-multi-fields]] === Multi-fields It is often useful to index the same field in different ways for different purposes. For instance, a `string` field could be mapped as @@ -126,20 +125,29 @@ This is the purpose of _multi-fields_. Most field types support multi-fields via the <> parameter. [discrete] +[[types-runtime]] === Runtime -Oftentimes, you just want to make your data available to {es} without indexing -or preprocessing. 
_Runtime fields_ extend the flexibility of the -<> to quickly make data searchable, even with minimal -structure. - -Instead of indexing all of your data, you can use runtime fields to -search and aggregate on your data at query time. Runtime fields can incur -performance costs at runtime depending on the runtime type, but don't use the -disk space that is typically required to index your data. By removing the -requirement to index everything, you gain the flexibility of choosing which -fields to index. - -See <> to learn more. +{es} indexes most field types by default to promote faster search. However, +indexing all of your data can be slow and requires more disk space. If you're +experimenting with your data or are unsure which fields you need for search, +use _runtime fields_. + +{es} treats runtime fields like any other field, except that their values are +only extracted or computed at search time. When mapping a runtime field, you +define a script that determines how to extract or compute field values from +your unindexed data. + +Runtime fields use less disk space and provide flexibility in how you access +your data, but impact search performance based on the computation defined in +the runtime script. See <>. + +// Runtime fields optimize disk space by evaluating each runtime script at search +// time, instead of indexing fields and using disk space. +// +// Runtime fields use less disk space and provide flexibility in how you want to +// access your data. Use runtime fields to quickly get data into the Elastic Stack +// +// Runtime fields make searches slower, as computing their values for each document that might match the query is costly, depending on how they are calculated. 
include::types/alias.asciidoc[] diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 4ae736b1fa788..a27a8add18a5f 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -11,24 +11,51 @@ PUT /my-index Typically, you must index fields to {es} before they can be retrieved, aggregated, or searched. With runtime fields, you can explicitly define a field -in the mapping and access it at runtime without indexing your data. This -flexibility allows you to more quickly ingest raw data into the Elastic Stack -and immediately access it. By dynamically evaluating runtime fields at query -time, you can choose which fields to index and optimize disk space. +in the mapping and access it at search time without indexing your data. -Runtime fields are not indexed, but are accessible from the search API like any -other field that has <> and is searchable. You can -retrieve and query these fields, as well as aggregate and -sort on them. +Runtime fields are accessible from the search API like any other field that has +<> and is searchable. You can retrieve and query these +fields, as well as aggregate on them. -NOTE: Runtime fields can be incur performance costs at search time, depending -on the <>. 
+Runtime fields help to alleviate several common issues when using {es}: -==== Specifying runtime fields -You specify runtime fields in the mapping by +* Reindexing your data between development iterations is slow and can make +experimenting on large datasets difficult +* Indexing data before searching makes running one-off searches costly and +resource intensive +* Indexing all of your data instead of just the fields you want to search +requires more disk space to gain search time performance +* Reindexing data for time-based indices to ensure that existing indices +include any new fields in the index template is slow +* Determining how a field is evaluated in {kib} at index or search time is +difficult for scripted fields because they have different needs based on where +they are implemented + +Because runtime fields aren't indexed, you can more quickly ingest raw data +into the Elastic Stack and immediately access it. By dynamically evaluating +runtime fields at search time, you can optimize disk space by choosing which +fields to index. + +Runtime fields incur performance costs at search time, depending +on the <>. For example, let's say +you created an anomaly detection job that operates on the `timestamp` field. +If the `timestamp` field is a runtime field, the search cost would be extremely +high because the data isn't indexed and {es} must compute the value for each +document that matches the query. + +Runtime fields are useful when working with log data, especially when you're +unsure about the data structure. Your search speed decreases, but your index +size is much smaller and you can more quickly process logs without having to +index them. + +[[runtime-mapping-fields]] +==== Mapping a runtime field +You map runtime fields by <>. At search time, the script runs and generates values for each scripted field. +NOTE: When mapping a runtime field, indexing is disabled by default. 
+ The script in the following request extracts the day of the week from the `timestamp` field, which is defined as a `date` data type. @@ -51,27 +78,15 @@ PUT /my-index/_mappings <1> Runtime fields are of the `runtime` data type. <2> Each runtime has its own field type, defined by `runtime_type`. -[[runtime-params]] -==== Parameters for `runtime` fields -Runtime fields accept the following parameters: - -`type`:: -The type of runtime computation to perform at query time. Currently, runtime -fields only support the `runtime` data type. - -[[runtime-params-runtime-type]] -`runtime_type`:: -The <> for each scripted field. {es} -supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. -+ -Runtime fields with a `runtime_type` of `date` can accept the -<> parameter exactly as the `date` field type. - -==== Retrieving runtime fields +[[runtime-retrieving-fields]] +==== Retrieving a runtime field Use the <> parameter on the `_search` API to retrieve the values of runtime fields. This API works for all fields, even those that were not sent as part of the original `_source`. +NOTE: We highly recommend using the <> +to run searches that use runtime fields. + The following request uses the search API to retrieve the `day_of_week` field that the previous request defined in the mapping. @@ -88,3 +103,23 @@ GET /my-index/_search } } ---- + +[[runtime-params]] +++++ +Parameters +++++ +==== Parameters for `runtime` fields +Runtime fields accept the following parameters: + +[[runtime-params-type]] +`type`:: +The type of runtime computation to perform at query time. Currently, runtime +fields only support the `runtime` data type. + +[[runtime-params-runtime-type]] +`runtime_type`:: +The <> for each scripted field. {es} +supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. ++ +Runtime fields with a `runtime_type` of `date` can accept the +<> parameter exactly as the `date` field type. 
From d119cbd38e00cf698d5f3d675644c215d85e16c4 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Wed, 23 Sep 2020 16:50:58 -0400 Subject: [PATCH 09/44] Adding examples for runtime fields. --- docs/reference/mapping/types/runtime.asciidoc | 196 +++++++++++++++++- 1 file changed, 192 insertions(+), 4 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index a27a8add18a5f..1121ecd12136a 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -11,7 +11,8 @@ PUT /my-index Typically, you must index fields to {es} before they can be retrieved, aggregated, or searched. With runtime fields, you can explicitly define a field -in the mapping and access it at search time without indexing your data. +in the mapping and access it at search time without indexing your data during +ingest time. Runtime fields are accessible from the search API like any other field that has <> and is searchable. You can retrieve and query these @@ -105,9 +106,6 @@ GET /my-index/_search ---- [[runtime-params]] -++++ -Parameters -++++ ==== Parameters for `runtime` fields Runtime fields accept the following parameters: @@ -123,3 +121,193 @@ supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. + Runtime fields with a `runtime_type` of `date` can accept the <> parameter exactly as the `date` field type. + +[[runtime-examples]] +==== Examples +Consider a large set of log data that you want to extract fields from. +Indexing the data is time consuming and uses a lot of disk space, and you just +want to explore the data structure without committing to a schema up front. + +You know that your log data contains specific fields that you want to extract. +By using runtime fields, you can define scripts to calculate values at search +time for the `clientip`, `request`, `status`, and `size` fields. 
+ +[source,console] +---- +PUT /my-index/_mappings +{ + "properties": { + "@timestamp": { + "format": "strict_date_optional_time||epoch_second", + "type": "date" + }, + "message": { + "type": "keyword", + "index": false, + "doc_values": true + }, + "clientip": { + "type": "runtime", + "runtime_type": "ip", + "script" : { + "source" : "String m = doc[\"message\"].value; int end = m.indexOf(\" \"); emit(m.substring(0, end));" + } + }, + "request": { + "type": "runtime", + "runtime_type": "keyword", + "script" : { + "source" : "String m = doc[\"message\"].value; int start = m.indexOf(\"\\\"\") + 1; int end = m.indexOf(\"\\\"\", start); emit(m.substring(start, end));" + } + }, + "status": { + "type": "runtime", + "runtime_type": "long", + "script" : { + "source" : "String m = doc[\"message\"].value; int end = m.lastIndexOf(\" \"); int start = m.lastIndexOf(\" \", end - 1) + 1; emit(Long.parseLong(m.substring(start, end)));" + } + }, + "size": { + "type": "runtime", + "runtime_type": "long", + "script" : { + "source" : "String m = doc[\"message\"].value; int start = m.lastIndexOf(\" \") + 1; emit(Long.parseLong(m.substring(start)));" + } + } + } +} +---- + +After mapping the fields you want to retrieve, you can index a few records from +your log data into {es}. The following request uses the <> +to index raw log data into `my-index`. Instead of indexing all of your log +data, you can use a small sample to experiment with the runtime fields defined +in the mapping. 
+ +[source,console] +---- +POST /my-index/_bulk +{ "index": {}} +{ "timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} +{ "index": {}} +{ "timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:30:17-05:00", "message" : "40.135.0.0 - - [2020-04-30T14:30:17-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:30:53-05:00", "message" : "232.0.0.0 - - [2020-04-30T14:30:53-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:31:12-05:00", "message" : "26.1.0.0 - - [2020-04-30T14:31:12-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:31:19-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:19-05:00] \"GET /french/splash_inet.html HTTP/1.0\" 200 3781"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:31:27-05:00", "message" : "252.0.0.0 - - [2020-04-30T14:31:27-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_brdl.gif HTTP/1.0\" 304 0"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_arw.gif HTTP/1.0\" 304 0"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:31:32-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:32-05:00] \"GET /images/nav_bg_top.gif HTTP/1.0\" 200 929"} +{ "index": {}} +{ "timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} +---- + +Using the `clientip` field, you can define a simple query to run a search for a +specific IP address and return all related fields. 
+ +[source,console] +---- +GET my-index/_search +{ + "query": { + "match": { + "clientip": "211.11.9.0" + } + }, + "fields" : ["*"] +} +---- + +The API returns the following result. + +[source,console-result] +---- +{ + "took" : 1, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 1.0, + "hits" : [ + { + "_index" : "my-index", + "_type" : "_doc", + "_id" : "8GRivHQBddAsv33YHS9W", + "_score" : 1.0, + "_source" : { + "timestamp" : "2020-06-21T15:00:01-05:00", + "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + }, + "fields" : { + "request" : [ + "GET /english/index.html HTTP/1.0" + ], + "size" : [ + 0 + ], + "clientip" : [ + "211.11.9.0" + ], + "message" : [ + """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + ], + "timestamp" : [ + "2020-06-21T20:00:01.000Z" + ], + "status" : [ + 304 + ] + } + }, + { + "_index" : "my-index", + "_type" : "_doc", + "_id" : "8WRivHQBddAsv33YHS9W", + "_score" : 1.0, + "_source" : { + "timestamp" : "2020-06-21T15:00:01-05:00", + "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + }, + "fields" : { + "request" : [ + "GET /english/index.html HTTP/1.0" + ], + "size" : [ + 0 + ], + "clientip" : [ + "211.11.9.0" + ], + "message" : [ + """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + ], + "timestamp" : [ + "2020-06-21T20:00:01.000Z" + ], + "status" : [ + 304 + ] + } + } + ] + } +} +---- From a287895592edf4a5bb9c2a2033eb7a6080c2997b Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Thu, 24 Sep 2020 16:32:08 -0400 Subject: [PATCH 10/44] Adding more context and simplifying the example. 
--- docs/reference/mapping/types/runtime.asciidoc | 177 +++++++++++------- 1 file changed, 114 insertions(+), 63 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 1121ecd12136a..0c4c10b7316ba 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -35,7 +35,11 @@ they are implemented Because runtime fields aren't indexed, you can more quickly ingest raw data into the Elastic Stack and immediately access it. By dynamically evaluating runtime fields at search time, you can optimize disk space by choosing which -fields to index. +fields to index. You can also fix errors in indexed fields by overriding them +with runtime fields, rather than reindexing all of your data. If you later +decide that a runtime field is increasingly used for filtering or aggregation, +you can remove the runtime field from the mapping and index the field to gain +faster search speed. Runtime fields incur performance costs at search time, depending on the <>. For example, let's say @@ -51,11 +55,17 @@ index them. [[runtime-mapping-fields]] ==== Mapping a runtime field -You map runtime fields by -<>. At search time, the script runs -and generates values for each scripted field. +When mapping a runtime field, indexing is disabled by default. You map runtime +fields by <>. At search +time, the script runs and generates values for each scripted field. Runtime +scripts have access to the entire context of a document, the original +`_source`, and the mapped field plus its values (`doc_values`). -NOTE: When mapping a runtime field, indexing is disabled by default. +IMPORTANT: Updating a script while a query that relies on the script is running +can return inconsistent results. ++ +Additionally, existing queries or visualizations that rely on runtime fields +can break if scripts are updated. 
The script in the following request extracts the day of the week from the `timestamp` field, which is defined as a `date` data type. @@ -130,7 +140,11 @@ want to explore the data structure without committing to a schema up front. You know that your log data contains specific fields that you want to extract. By using runtime fields, you can define scripts to calculate values at search -time for the `clientip`, `request`, `status`, and `size` fields. +time for these fields. + +You can start with a simple example by adding the `timestamp` and `message` +fields to the `my-index` mapping. To remain flexible, use `wildcard` as the +field type for `message`. [source,console] ---- @@ -142,47 +156,16 @@ PUT /my-index/_mappings "type": "date" }, "message": { - "type": "keyword", - "index": false, - "doc_values": true - }, - "clientip": { - "type": "runtime", - "runtime_type": "ip", - "script" : { - "source" : "String m = doc[\"message\"].value; int end = m.indexOf(\" \"); emit(m.substring(0, end));" - } - }, - "request": { - "type": "runtime", - "runtime_type": "keyword", - "script" : { - "source" : "String m = doc[\"message\"].value; int start = m.indexOf(\"\\\"\") + 1; int end = m.indexOf(\"\\\"\", start); emit(m.substring(start, end));" - } - }, - "status": { - "type": "runtime", - "runtime_type": "long", - "script" : { - "source" : "String m = doc[\"message\"].value; int end = m.lastIndexOf(\" \"); int start = m.lastIndexOf(\" \", end - 1) + 1; emit(Long.parseLong(m.substring(start, end)));" - } - }, - "size": { - "type": "runtime", - "runtime_type": "long", - "script" : { - "source" : "String m = doc[\"message\"].value; int start = m.lastIndexOf(\" \") + 1; emit(Long.parseLong(m.substring(start)));" - } + "type": "wildcard" } } } ---- -After mapping the fields you want to retrieve, you can index a few records from +After mapping the fields you want to retrieve, index a few records from your log data into {es}. 
The following request uses the <> to index raw log data into `my-index`. Instead of indexing all of your log -data, you can use a small sample to experiment with the runtime fields defined -in the mapping. +data, you can use a small sample to experiment with runtime fields. [source,console] ---- @@ -211,8 +194,59 @@ POST /my-index/_bulk { "timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} ---- -Using the `clientip` field, you can define a simple query to run a search for a -specific IP address and return all related fields. +At this point, you can view how {es} stores your raw data. + +[source,console] +---- +GET /my-index +---- + +The mapping contains two fields: `timestamp` and `message`. + +[source,console-result] +---- +{ + "my-index" : { + "aliases" : { }, + "mappings" : { + "properties" : { + "@timestamp" : { + "type" : "date", + "format" : "strict_date_optional_time||epoch_second" + }, + "message" : { + "type" : "wildcard" + }, + "timestamp" : { + "type" : "date" + } + } + }, +... +---- + +If you want to retrieve results that include `clientip`, you can add that field +as a runtime field in the mapping. The runtime script operates on the `clientip` +field at runtime to calculate values for that field. + +[source,console] +---- +PUT /my-index/_mappings +{ + "properties": { + "clientip": { + "type": "runtime", + "runtime_type": "ip", + "script" : { + "source" : "String m = doc[\"message\"].value; int end = m.indexOf(\" \"); emit(m.substring(0, end));" + } + } + } +} +---- + +Using the `clientip` runtime field, you can define a simple query to run a +search for a specific IP address and return all related fields. [source,console] ---- @@ -227,7 +261,9 @@ GET my-index/_search } ---- -The API returns the following result. +The API returns the following result. 
Without building your data structure in +advance, you can search and explore your data in meaningful ways to experiment +and determine which fields to index. [source,console-result] ---- @@ -250,19 +286,13 @@ The API returns the following result. { "_index" : "my-index", "_type" : "_doc", - "_id" : "8GRivHQBddAsv33YHS9W", + "_id" : "m4d6wXQBQVoWbakQ_rGg", "_score" : 1.0, "_source" : { "timestamp" : "2020-06-21T15:00:01-05:00", "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" }, "fields" : { - "request" : [ - "GET /english/index.html HTTP/1.0" - ], - "size" : [ - 0 - ], "clientip" : [ "211.11.9.0" ], @@ -271,28 +301,19 @@ The API returns the following result. ], "timestamp" : [ "2020-06-21T20:00:01.000Z" - ], - "status" : [ - 304 ] } }, { "_index" : "my-index", "_type" : "_doc", - "_id" : "8WRivHQBddAsv33YHS9W", + "_id" : "nId6wXQBQVoWbakQ_rGg", "_score" : 1.0, "_source" : { "timestamp" : "2020-06-21T15:00:01-05:00", "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" }, "fields" : { - "request" : [ - "GET /english/index.html HTTP/1.0" - ], - "size" : [ - 0 - ], "clientip" : [ "211.11.9.0" ], @@ -301,9 +322,6 @@ The API returns the following result. ], "timestamp" : [ "2020-06-21T20:00:01.000Z" - ], - "status" : [ - 304 ] } } @@ -311,3 +329,36 @@ The API returns the following result. } } ---- + +If you add the `day_of_week` field to the mapping using the request in +<>, you can re-run the previous +search request and also retrieve the day of the week based on the `timestamp` +field. + +The value for this field was never indexed, and is calculated dynamically at +runtime. This flexibility allows you to modify the mapping without changing +any field values. + +[source,console-result] +---- +... 
+ "clientip" : [ + "211.11.9.0" + ], + "message" : [ + """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + ], + "day_of_week" : [ + "Sunday" <1> + ], + "timestamp" : [ + "2020-06-21T20:00:01.000Z" + ], + "status" : [ + 304 + ] +... +---- + +<1> This value was calculated at search time using the runtime script defined +in the mapping. From 2b533f7a25ed53d78123f83164cf493555f72072 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 28 Sep 2020 11:09:01 -0400 Subject: [PATCH 11/44] Changing timestamp to @timestamp throughout. --- docs/reference/mapping/types/runtime.asciidoc | 48 +++++++++---------- 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 0c4c10b7316ba..97e37445a7758 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -43,8 +43,8 @@ faster search speed. Runtime fields incur performance costs at search time, depending on the <>. For example, let's say -you created an anomaly detection job that operates on the `timestamp` field. -If the `timestamp` field is a runtime field, the search cost would be extremely +you created an anomaly detection job that operates on the `@timestamp` field. +If the `@timestamp` field is a runtime field, the search cost would be extremely high because the data isn't indexed and {es} must compute the value for each document that matches the query. @@ -68,7 +68,7 @@ Additionally, existing queries or visualizations that rely on runtime fields can break if scripts are updated. The script in the following request extracts the day of the week from the -`timestamp` field, which is defined as a `date` data type. +`@timestamp` field, which is defined as a `date` data type. 
[source,console] ---- @@ -79,7 +79,7 @@ PUT /my-index/_mappings "type" : "runtime", <1> "runtime_type" : "keyword", <2> "script" : { - "source" : "emit(doc['timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" + "source" : "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" } } } @@ -142,7 +142,7 @@ You know that your log data contains specific fields that you want to extract. By using runtime fields, you can define scripts to calculate values at search time for these fields. -You can start with a simple example by adding the `timestamp` and `message` +You can start with a simple example by adding the `@timestamp` and `message` fields to the `my-index` mapping. To remain flexible, use `wildcard` as the field type for `message`. @@ -171,27 +171,27 @@ data, you can use a small sample to experiment with runtime fields. ---- POST /my-index/_bulk { "index": {}} -{ "timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} +{ "@timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} { "index": {}} -{ "timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} +{ "@timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} { "index": {}} -{ "timestamp": "2020-04-30T14:30:17-05:00", "message" : "40.135.0.0 - - [2020-04-30T14:30:17-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "@timestamp": "2020-04-30T14:30:17-05:00", "message" : "40.135.0.0 - - [2020-04-30T14:30:17-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} { "index": {}} -{ "timestamp": "2020-04-30T14:30:53-05:00", "message" : "232.0.0.0 - - [2020-04-30T14:30:53-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 
24736"} +{ "@timestamp": "2020-04-30T14:30:53-05:00", "message" : "232.0.0.0 - - [2020-04-30T14:30:53-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} { "index": {}} -{ "timestamp": "2020-04-30T14:31:12-05:00", "message" : "26.1.0.0 - - [2020-04-30T14:31:12-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "@timestamp": "2020-04-30T14:31:12-05:00", "message" : "26.1.0.0 - - [2020-04-30T14:31:12-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} { "index": {}} -{ "timestamp": "2020-04-30T14:31:19-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:19-05:00] \"GET /french/splash_inet.html HTTP/1.0\" 200 3781"} +{ "@timestamp": "2020-04-30T14:31:19-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:19-05:00] \"GET /french/splash_inet.html HTTP/1.0\" 200 3781"} { "index": {}} -{ "timestamp": "2020-04-30T14:31:27-05:00", "message" : "252.0.0.0 - - [2020-04-30T14:31:27-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "@timestamp": "2020-04-30T14:31:27-05:00", "message" : "252.0.0.0 - - [2020-04-30T14:31:27-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} { "index": {}} -{ "timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_brdl.gif HTTP/1.0\" 304 0"} +{ "@timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_brdl.gif HTTP/1.0\" 304 0"} { "index": {}} -{ "timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_arw.gif HTTP/1.0\" 304 0"} +{ "@timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_arw.gif HTTP/1.0\" 304 0"} { "index": {}} -{ "timestamp": "2020-04-30T14:31:32-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:32-05:00] \"GET /images/nav_bg_top.gif HTTP/1.0\" 200 929"} +{ "@timestamp": "2020-04-30T14:31:32-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:32-05:00] \"GET 
/images/nav_bg_top.gif HTTP/1.0\" 200 929"} { "index": {}} -{ "timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} +{ "@timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} ---- At this point, you can view how {es} stores your raw data. @@ -201,7 +201,7 @@ At this point, you can view how {es} stores your raw data. GET /my-index ---- -The mapping contains two fields: `timestamp` and `message`. +The mapping contains two fields: `@timestamp` and `message`. [source,console-result] ---- @@ -217,7 +217,7 @@ The mapping contains two fields: `timestamp` and `message`. "message" : { "type" : "wildcard" }, - "timestamp" : { + "@timestamp" : { "type" : "date" } } @@ -289,7 +289,7 @@ and determine which fields to index. "_id" : "m4d6wXQBQVoWbakQ_rGg", "_score" : 1.0, "_source" : { - "timestamp" : "2020-06-21T15:00:01-05:00", + "@timestamp" : "2020-06-21T15:00:01-05:00", "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" }, "fields" : { @@ -299,7 +299,7 @@ and determine which fields to index. "message" : [ """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" ], - "timestamp" : [ + "@timestamp" : [ "2020-06-21T20:00:01.000Z" ] } @@ -310,7 +310,7 @@ and determine which fields to index. "_id" : "nId6wXQBQVoWbakQ_rGg", "_score" : 1.0, "_source" : { - "timestamp" : "2020-06-21T15:00:01-05:00", + "@timestamp" : "2020-06-21T15:00:01-05:00", "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" }, "fields" : { @@ -320,7 +320,7 @@ and determine which fields to index. 
"message" : [ """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" ], - "timestamp" : [ + "@timestamp" : [ "2020-06-21T20:00:01.000Z" ] } @@ -332,7 +332,7 @@ and determine which fields to index. If you add the `day_of_week` field to the mapping using the request in <>, you can re-run the previous -search request and also retrieve the day of the week based on the `timestamp` +search request and also retrieve the day of the week based on the `@timestamp` field. The value for this field was never indexed, and is calculated dynamically at @@ -351,7 +351,7 @@ any field values. "day_of_week" : [ "Sunday" <1> ], - "timestamp" : [ + "@timestamp" : [ "2020-06-21T20:00:01.000Z" ], "status" : [ From d2d8551ced48c4f162889fa7b633d2ba76d6d311 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 28 Sep 2020 13:08:29 -0400 Subject: [PATCH 12/44] Removing duplicate @timestamp field. --- docs/reference/mapping/types/runtime.asciidoc | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 97e37445a7758..a6b3f9d95ac5a 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -14,7 +14,7 @@ aggregated, or searched. With runtime fields, you can explicitly define a field in the mapping and access it at search time without indexing your data during ingest time. -Runtime fields are accessible from the search API like any other field that has +Runtime fields beta[] are accessible from the search API like any other field that has <> and is searchable. You can retrieve and query these fields, as well as aggregate on them. @@ -217,9 +217,6 @@ The mapping contains two fields: `@timestamp` and `message`. "message" : { "type" : "wildcard" }, - "@timestamp" : { - "type" : "date" - } } }, ... 
From 282a77756fac10932ad7239da5135edca3567dca Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 28 Sep 2020 13:25:13 -0400 Subject: [PATCH 13/44] Expanding example to hopefully fix CI builds. --- docs/reference/mapping/types/runtime.asciidoc | 24 +++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index a6b3f9d95ac5a..ee403a1de9bfe 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -216,10 +216,30 @@ The mapping contains two fields: `@timestamp` and `message`. }, "message" : { "type" : "wildcard" - }, + } } }, -... + "settings" : { + "index" : { + "routing" : { + "allocation" : { + "include" : { + "_tier" : "data_hot" + } + } + }, + "number_of_shards" : "1", + "provided_name" : "my-index", + "creation_date" : "1601313565318", + "number_of_replicas" : "1", + "uuid" : "zpdFg3pXSNm4UBinOUc75A", + "version" : { + "created" : "7100099" + } + } + } + } +} ---- If you want to retrieve results that include `clientip`, you can add that field From 50a30f68035da0c6c253df268f0d31f6cd708357 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 28 Sep 2020 14:21:46 -0400 Subject: [PATCH 14/44] Adding skip test for result. --- docs/reference/mapping/types/runtime.asciidoc | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index ee403a1de9bfe..f78b21faca015 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -359,6 +359,10 @@ any field values. [source,console-result] ---- ... + "fields" : { + "@timestamp" : [ + "2020-06-21T20:00:01.000Z" + ], "clientip" : [ "211.11.9.0" ], @@ -366,16 +370,13 @@ any field values. 
"""211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" ], "day_of_week" : [ - "Sunday" <1> - ], - "@timestamp" : [ - "2020-06-21T20:00:01.000Z" - ], - "status" : [ - 304 + "Sunday" ] -... + } + }, + ... ---- +// TEST[skip:not a complete result] <1> This value was calculated at search time using the runtime script defined in the mapping. From f8a3d503c655ecaa2736b816e4e316fd3885a70e Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 28 Sep 2020 14:41:09 -0400 Subject: [PATCH 15/44] Adding missing callout. --- docs/reference/mapping/types/runtime.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index f78b21faca015..07aee7a37af1c 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -370,7 +370,7 @@ any field values. """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" ], "day_of_week" : [ - "Sunday" + "Sunday" <1> ] } }, From 301abd19191c4de27ab94ec0f489720eef46bb4e Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 28 Sep 2020 17:12:11 -0400 Subject: [PATCH 16/44] Adding TESTRESPONSEs, which are currently broken. --- docs/reference/mapping/types/runtime.asciidoc | 93 ++++++++----------- 1 file changed, 40 insertions(+), 53 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 07aee7a37af1c..a5612703f1951 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -1,14 +1,5 @@ [[runtime]] === Runtime - -//// -[source,console] ----- -PUT /my-index ----- -// TESTSETUP -//// - Typically, you must index fields to {es} before they can be retrieved, aggregated, or searched. 
With runtime fields, you can explicitly define a field in the mapping and access it at search time without indexing your data during @@ -72,14 +63,16 @@ The script in the following request extracts the day of the week from the [source,console] ---- -PUT /my-index/_mappings +PUT /my-index { - "properties" : { - "day_of_week" : { - "type" : "runtime", <1> - "runtime_type" : "keyword", <2> - "script" : { - "source" : "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" + "mappings": { + "properties": { + "day_of_week": { + "type": "runtime", <1> + "runtime_type": "keyword", <2> + "script": { + "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" + } } } } @@ -114,6 +107,7 @@ GET /my-index/_search } } ---- +// TEST[continued] [[runtime-params]] ==== Parameters for `runtime` fields @@ -148,17 +142,19 @@ field type for `message`. [source,console] ---- -PUT /my-index/_mappings +PUT /my-index/ { - "properties": { - "@timestamp": { - "format": "strict_date_optional_time||epoch_second", - "type": "date" - }, - "message": { - "type": "wildcard" + "mappings": { + "properties": { + "@timestamp": { + "format": "strict_date_optional_time||epoch_second", + "type": "date" + }, + "message": { + "type": "wildcard" + } + } } - } } ---- @@ -169,7 +165,7 @@ data, you can use a small sample to experiment with runtime fields. [source,console] ---- -POST /my-index/_bulk +POST /my-index/_bulk?refresh { "index": {}} { "@timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} { "index": {}} @@ -193,6 +189,7 @@ POST /my-index/_bulk { "index": {}} { "@timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} ---- +// TEST[continued] At this point, you can view how {es} stores your raw data. 
@@ -200,6 +197,7 @@ At this point, you can view how {es} stores your raw data. ---- GET /my-index ---- +// TEST[continued] The mapping contains two fields: `@timestamp` and `message`. @@ -207,7 +205,6 @@ The mapping contains two fields: `@timestamp` and `message`. ---- { "my-index" : { - "aliases" : { }, "mappings" : { "properties" : { "@timestamp" : { @@ -219,28 +216,11 @@ The mapping contains two fields: `@timestamp` and `message`. } } }, - "settings" : { - "index" : { - "routing" : { - "allocation" : { - "include" : { - "_tier" : "data_hot" - } - } - }, - "number_of_shards" : "1", - "provided_name" : "my-index", - "creation_date" : "1601313565318", - "number_of_replicas" : "1", - "uuid" : "zpdFg3pXSNm4UBinOUc75A", - "version" : { - "created" : "7100099" - } - } - } + ... } } ---- +// TESTRESPONSE[s/\.\.\./"aliases": $body.my-index.aliases, "settings": $body.my-index.settings/] If you want to retrieve results that include `clientip`, you can add that field as a runtime field in the mapping. The runtime script operates on the `clientip` @@ -261,6 +241,7 @@ PUT /my-index/_mappings } } ---- +// TEST[continued] Using the `clientip` runtime field, you can define a simple query to run a search for a specific IP address and return all related fields. @@ -277,6 +258,7 @@ GET my-index/_search "fields" : ["*"] } ---- +// TEST[continued] The API returns the following result. Without building your data structure in advance, you can search and explore your data in meaningful ways to experiment @@ -285,7 +267,7 @@ and determine which fields to index. [source,console-result] ---- { - "took" : 1, + "took" : 150, "timed_out" : false, "_shards" : { "total" : 1, @@ -303,18 +285,18 @@ and determine which fields to index. 
{ "_index" : "my-index", "_type" : "_doc", - "_id" : "m4d6wXQBQVoWbakQ_rGg", + "_id" : "8Jh81nQBp2DRDXdiOxVt", "_score" : 1.0, "_source" : { "@timestamp" : "2020-06-21T15:00:01-05:00", - "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" }, "fields" : { "clientip" : [ "211.11.9.0" ], "message" : [ - """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" ], "@timestamp" : [ "2020-06-21T20:00:01.000Z" @@ -324,18 +306,18 @@ and determine which fields to index. { "_index" : "my-index", "_type" : "_doc", - "_id" : "nId6wXQBQVoWbakQ_rGg", + "_id" : "8Zh81nQBp2DRDXdiOxVu", "_score" : 1.0, "_source" : { "@timestamp" : "2020-06-21T15:00:01-05:00", - "message" : """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" }, "fields" : { "clientip" : [ "211.11.9.0" ], "message" : [ - """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" ], "@timestamp" : [ "2020-06-21T20:00:01.000Z" @@ -346,6 +328,11 @@ and determine which fields to index. 
} } ---- +// TESTRESPONSE[s/"took": 150/"took": $body.took/] +// TESTRESPONSE[s/"_type": "_doc"/"_type": $body.$_path/] +// TESTRESPONSE[s/"_type": "_doc"/"_type": $body.$_path/] +// TESTRESPONSE[s/"_id": "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] +// TESTRESPONSE[s/"_id": "8Zh81nQBp2DRDXdiOxVu"/"_id": $body.hits.hits.1._id/] If you add the `day_of_week` field to the mapping using the request in <>, you can re-run the previous From 62a5393d1ffdb422169e7e8feabeeda10856fae4 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 29 Sep 2020 08:58:39 -0400 Subject: [PATCH 17/44] Fixing TESTRESPONSEs. --- docs/reference/mapping/types/runtime.asciidoc | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index a5612703f1951..f4a023aaad4f4 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -284,7 +284,6 @@ and determine which fields to index. "hits" : [ { "_index" : "my-index", - "_type" : "_doc", "_id" : "8Jh81nQBp2DRDXdiOxVt", "_score" : 1.0, "_source" : { @@ -305,7 +304,6 @@ and determine which fields to index. }, { "_index" : "my-index", - "_type" : "_doc", "_id" : "8Zh81nQBp2DRDXdiOxVu", "_score" : 1.0, "_source" : { @@ -328,11 +326,9 @@ and determine which fields to index. 
} } ---- -// TESTRESPONSE[s/"took": 150/"took": $body.took/] -// TESTRESPONSE[s/"_type": "_doc"/"_type": $body.$_path/] -// TESTRESPONSE[s/"_type": "_doc"/"_type": $body.$_path/] -// TESTRESPONSE[s/"_id": "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] -// TESTRESPONSE[s/"_id": "8Zh81nQBp2DRDXdiOxVu"/"_id": $body.hits.hits.1._id/] +// TESTRESPONSE[s/"took" : 150/"took": $body.took/] +// TESTRESPONSE[s/"_id" : "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] +// TESTRESPONSE[s/"_id" : "8Zh81nQBp2DRDXdiOxVu"/"_id": $body.hits.hits.1._id/] If you add the `day_of_week` field to the mapping using the request in <>, you can re-run the previous From 36c92442308f8430dbeb881729c542b0fafde793 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 29 Sep 2020 17:40:47 -0400 Subject: [PATCH 18/44] Incorporating review feedback. --- docs/reference/mapping/types.asciidoc | 6 +-- docs/reference/mapping/types/runtime.asciidoc | 54 ++++++++++++------- 2 files changed, 38 insertions(+), 22 deletions(-) diff --git a/docs/reference/mapping/types.asciidoc b/docs/reference/mapping/types.asciidoc index 409ac3cce2588..c725515ec89c4 100644 --- a/docs/reference/mapping/types.asciidoc +++ b/docs/reference/mapping/types.asciidoc @@ -132,12 +132,12 @@ via the <> parameter. {es} indexes most field types by default to promote faster search. However, indexing all of your data can be slow and requires more disk space. If you're experimenting with your data or are unsure which fields you need for search, -use _runtime fields_. +use _runtime fields_ beta[]. {es} treats runtime fields like any other field, except that their values are only extracted or computed at search time. When mapping a runtime field, you -define a script that determines how to extract or compute field values from -your unindexed data. +define a script that determines how to extract or compute values from existing +fields. 
Runtime fields use less disk space and provide flexibility in how you access your data, but impact search performance based on the computation defined in diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index f4a023aaad4f4..14d41c7c77218 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -5,9 +5,13 @@ aggregated, or searched. With runtime fields, you can explicitly define a field in the mapping and access it at search time without indexing your data during ingest time. -Runtime fields beta[] are accessible from the search API like any other field that has -<> and is searchable. You can retrieve and query these -fields, as well as aggregate on them. +Runtime fields beta[] are accessible from the search API like any other field. +Use the <> to return runtime fields +as searchable and aggregatable. + +including the cost of these operations. +that is indexed and has <>. You can retrieve and query +these fields, as well as sort and aggregate on them. Runtime fields help to alleviate several common issues when using {es}: @@ -26,11 +30,9 @@ they are implemented Because runtime fields aren't indexed, you can more quickly ingest raw data into the Elastic Stack and immediately access it. By dynamically evaluating runtime fields at search time, you can optimize disk space by choosing which -fields to index. You can also fix errors in indexed fields by overriding them -with runtime fields, rather than reindexing all of your data. If you later -decide that a runtime field is increasingly used for filtering or aggregation, -you can remove the runtime field from the mapping and index the field to gain -faster search speed. +fields to index. If you later decide that a runtime field is increasingly used +for filtering or aggregation, you can add the field to `_source` to gain faster +search speed. 
Runtime fields incur performance costs at search time, depending on the <>. For example, let's say @@ -46,17 +48,26 @@ index them. [[runtime-mapping-fields]] ==== Mapping a runtime field -When mapping a runtime field, indexing is disabled by default. You map runtime -fields by <>. At search -time, the script runs and generates values for each scripted field. Runtime +You map runtime fields by +<>. At search time, the +script runs and generates values for each scripted field. Runtime scripts have access to the entire context of a document, the original `_source`, and the mapped field plus its values (`doc_values`). -IMPORTANT: Updating a script while a query that relies on the script is running -can return inconsistent results. -+ -Additionally, existing queries or visualizations that rely on runtime fields -can break if scripts are updated. +[[runtime-updating-scripts]] +.Updating runtime scripts +**** + +Updating a script while a dependent query is running can return +inconsistent results. Each shard might have access to different versions of the +script, depending on when the mapping change takes effect. + +Existing queries or visualizations in {kib} that rely on runtime fields can +fail if you change the `runtime_type`. For example, a bar chart visualization +that uses a runtime field of `ip` will fail if the `runtime_type` is changed +to `boolean`. + +**** The script in the following request extracts the day of the week from the `@timestamp` field, which is defined as a `date` data type. @@ -126,6 +137,11 @@ supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. Runtime fields with a `runtime_type` of `date` can accept the <> parameter exactly as the `date` field type. +[[runtime-params-script]] +`script`:: +The <> that is evaluated at search +time to produce the value of the runtime field. + [[runtime-examples]] ==== Examples Consider a large set of log data that you want to extract fields from. 
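As a rough illustration of the kind of field extraction these examples rely on, the following Python sketch mirrors the string slicing that the Painless scripts in this patch series apply to the `message` field. It is only a sketch under stated assumptions: the function name `extract_fields` is hypothetical, Python stands in for Painless, and the log lines are assumed to follow the same layout as the sample data indexed in the examples.

```python
# Hypothetical helper (not part of Elasticsearch): mirrors in Python the
# slicing logic of the Painless runtime scripts shown in this patch series.
def extract_fields(message: str) -> dict:
    # clientip: everything before the first space
    clientip = message[: message.index(" ")]

    # request: the text between the first pair of double quotes
    start = message.index('"') + 1
    end = message.index('"', start)
    request = message[start:end]

    # size: the integer after the last space
    last_space = message.rindex(" ")
    size = int(message[last_space + 1:])

    # status: the integer between the second-to-last and the last space
    status_start = message.rindex(" ", 0, last_space) + 1
    status = int(message[status_start:last_space])

    return {
        "clientip": clientip,
        "request": request,
        "status": status,
        "size": size,
    }


sample = (
    '211.11.9.0 - - [2020-06-21T15:00:01-05:00] '
    '"GET /english/index.html HTTP/1.0" 304 0'
)
fields = extract_fields(sample)
```

Each value is derived from the raw `message` string alone, which is why a runtime field can expose it at search time without the value ever being indexed. The same work is repeated for every document a query touches, which is the search-time cost described earlier.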
@@ -335,9 +351,9 @@ If you add the `day_of_week` field to the mapping using the request in search request and also retrieve the day of the week based on the `@timestamp` field. -The value for this field was never indexed, and is calculated dynamically at -runtime. This flexibility allows you to modify the mapping without changing -any field values. +The value for this field is calculated dynamically at runtime without +reindexing the document or adding the `day_of_week` field. This flexibility +allows you to modify the mapping without changing any field values. [source,console-result] ---- From 804f10f5f0e51da6d4905c0f3c09f38dca43ae4e Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Wed, 30 Sep 2020 17:21:55 -0400 Subject: [PATCH 19/44] Several clarifications, better test cases, and other changes. --- docs/reference/mapping/types/runtime.asciidoc | 206 ++++++++++++++---- 1 file changed, 161 insertions(+), 45 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 14d41c7c77218..89378165c1ae6 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -6,13 +6,9 @@ in the mapping and access it at search time without indexing your data during ingest time. Runtime fields beta[] are accessible from the search API like any other field. -Use the <> to return runtime fields +The <> returns runtime fields as searchable and aggregatable. -including the cost of these operations. -that is indexed and has <>. You can retrieve and query -these fields, as well as sort and aggregate on them. - Runtime fields help to alleviate several common issues when using {es}: * Reindexing your data between development iterations is slow and can make @@ -41,6 +37,10 @@ If the `@timestamp` field is a runtime field, the search cost would be extremely high because the data isn't indexed and {es} must compute the value for each document that matches the query. 
+NOTE: Computing values for runtime fields in each document that might match a +query impacts search speed. Use the <> +to run searches that include runtime fields. + Runtime fields are useful when working with log data, especially when you're unsure about the data structure. Your search speed decreases, but your index size is much smaller and you can more quickly process logs without having to @@ -81,7 +81,7 @@ PUT /my-index "day_of_week": { "type": "runtime", <1> "runtime_type": "keyword", <2> - "script": { + "script": { <3> "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" } } @@ -92,6 +92,29 @@ PUT /my-index <1> Runtime fields are of the `runtime` data type. <2> Each runtime has its own field type, defined by `runtime_type`. +<3> The script defines the evaluation to calculate at search time. + +[[runtime-params]] +==== Parameters for `runtime` fields +Runtime fields accept the following parameters: + +[[runtime-params-type]] +`type`:: +The type of runtime computation to perform at query time. Currently, runtime +fields only support the `runtime` data type. + +[[runtime-params-runtime-type]] +`runtime_type`:: +The <> for each scripted field. {es} +supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. ++ +Runtime fields with a `runtime_type` of `date` can accept the +<> parameter exactly as the `date` field type. + +[[runtime-params-script]] +`script`:: +The <> that is evaluated at search +time to produce the value of the runtime field. [[runtime-retrieving-fields]] ==== Retrieving a runtime field @@ -99,9 +122,6 @@ Use the <> parameter on the `_search` API to retrieve the values of runtime fields. This API works for all fields, even those that were not sent as part of the original `_source`. -NOTE: We highly recommended using the <> -to run searches that use runtime fields. 
- The following request uses the search API to retrieve the `day_of_week` field that the previous request defined in the mapping. @@ -120,28 +140,6 @@ GET /my-index/_search ---- // TEST[continued] -[[runtime-params]] -==== Parameters for `runtime` fields -Runtime fields accept the following parameters: - -[[runtime-params-type]] -`type`:: -The type of runtime computation to perform at query time. Currently, runtime -fields only support the `runtime` data type. - -[[runtime-params-runtime-type]] -`runtime_type`:: -The <> for each scripted field. {es} -supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. -+ -Runtime fields with a `runtime_type` of `date` can accept the -<> parameter exactly as the `date` field type. - -[[runtime-params-script]] -`script`:: -The <> that is evaluated at search -time to produce the value of the runtime field. - [[runtime-examples]] ==== Examples Consider a large set of log data that you want to extract fields from. @@ -283,14 +281,7 @@ and determine which fields to index. [source,console-result] ---- { - "took" : 150, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, + ... "hits" : { "total" : { "value" : 2, @@ -342,7 +333,7 @@ and determine which fields to index. } } ---- -// TESTRESPONSE[s/"took" : 150/"took": $body.took/] +// TESTRESPONSE[s/\.\.\./"took" : $body.took,"timed_out" : $body.timed_out,"_shards" : $body._shards,/] // TESTRESPONSE[s/"_id" : "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] // TESTRESPONSE[s/"_id" : "8Zh81nQBp2DRDXdiOxVu"/"_id": $body.hits.hits.1._id/] @@ -355,9 +346,105 @@ The value for this field is calculated dynamically at runtime without reindexing the document or adding the `day_of_week` field. This flexibility allows you to modify the mapping without changing any field values. 
+//// +[source,console] +---- +PUT /my-index/ +{ + "mappings": { + "properties": { + "@timestamp": { + "format": "strict_date_optional_time||epoch_second", + "type": "date" + }, + "message": { + "type": "wildcard" + } + } + } +} + +POST /my-index/_bulk?refresh +{ "index": {}} +{ "@timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} +{ "index": {}} +{ "@timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:30:17-05:00", "message" : "40.135.0.0 - - [2020-04-30T14:30:17-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:30:53-05:00", "message" : "232.0.0.0 - - [2020-04-30T14:30:53-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:31:12-05:00", "message" : "26.1.0.0 - - [2020-04-30T14:31:12-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:31:19-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:19-05:00] \"GET /french/splash_inet.html HTTP/1.0\" 200 3781"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:31:27-05:00", "message" : "252.0.0.0 - - [2020-04-30T14:31:27-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_brdl.gif HTTP/1.0\" 304 0"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_arw.gif HTTP/1.0\" 304 0"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:31:32-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:32-05:00] \"GET /images/nav_bg_top.gif HTTP/1.0\" 200 929"} +{ "index": {}} +{ "@timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - 
[2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} + +PUT /my-index/_mappings +{ + "properties": { + "clientip": { + "type": "runtime", + "runtime_type": "ip", + "script" : { + "source" : "String m = doc[\"message\"].value; int end = m.indexOf(\" \"); emit(m.substring(0, end));" + } + } + } +} + +PUT /my-index/_mappings +{ + "properties": { + "day_of_week": { + "type": "runtime", + "runtime_type": "keyword", + "script": { + "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" + } + } + } +} + +GET my-index/_search +{ + "query": { + "match": { + "clientip": "211.11.9.0" + } + }, + "fields" : ["*"] +} +---- +//// + [source,console-result] ---- -... +{ + ... + "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 1.0, + "hits" : [ + { + "_index" : "my-index", + "_id" : "8Jh81nQBp2DRDXdiOxVt", + "_score" : 1.0, + "_source" : { + "@timestamp" : "2020-06-21T15:00:01-05:00", + "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" + }, "fields" : { "@timestamp" : [ "2020-06-21T20:00:01.000Z" @@ -366,16 +453,45 @@ allows you to modify the mapping without changing any field values. "211.11.9.0" ], "message" : [ - """211.11.9.0 - - [2020-06-21T15:00:01-05:00] "GET /english/index.html HTTP/1.0" 304 0""" + "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" ], "day_of_week" : [ - "Sunday" <1> + "Sunday" ] } }, - ... 
+ { + "_index" : "my-index", + "_id" : "8Zh81nQBp2DRDXdiOxVu", + "_score" : 1.0, + "_source" : { + "@timestamp" : "2020-06-21T15:00:01-05:00", + "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" + }, + "fields" : { + "@timestamp" : [ + "2020-06-21T20:00:01.000Z" + ], + "clientip" : [ + "211.11.9.0" + ], + "message" : [ + "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" + ], + "day_of_week" : [ + "Sunday" + ] + } + } + ] + } +} ---- -// TEST[skip:not a complete result] +// TESTRESPONSE[s/\.\.\./"took" : $body.took,"timed_out" : $body.timed_out,"_shards" : $body._shards,/] +// TESTRESPONSE[s/"_id" : "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] +// TESTRESPONSE[s/"_id" : "8Zh81nQBp2DRDXdiOxVu"/"_id": $body.hits.hits.1._id/] +// TESTRESPONSE[s/"day_of_week" : \[\n\s+"Sunday"\n\s\]/"day_of_week": $body.hits.hits.0.fields.day_of_week/] +// TESTRESPONSE[s/"day_of_week" : \[\n\s+"Sunday"\n\s\]/"day_of_week": $body.hits.hits.1.fields.day_of_week/] <1> This value was calculated at search time using the runtime script defined in the mapping. From 875713c0c86eca35f1282c9dce94ffa5f3b9c0e3 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Thu, 1 Oct 2020 10:21:10 -0400 Subject: [PATCH 20/44] Adding missing callout in example. 
--- docs/reference/mapping/types/runtime.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 89378165c1ae6..cebc39c137d37 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -456,7 +456,7 @@ GET my-index/_search "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" ], "day_of_week" : [ - "Sunday" + "Sunday" <1> ] } }, From 54d529adfd526d9f84d709328ecade6617517ded Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Thu, 1 Oct 2020 11:17:09 -0400 Subject: [PATCH 21/44] Adding substitutions to TESTRESPONSE for shorter results shown. --- docs/reference/mapping/types/runtime.asciidoc | 27 ++----------------- 1 file changed, 2 insertions(+), 25 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index cebc39c137d37..53c8b8f55e3c5 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -460,38 +460,15 @@ GET my-index/_search ] } }, - { - "_index" : "my-index", - "_id" : "8Zh81nQBp2DRDXdiOxVu", - "_score" : 1.0, - "_source" : { - "@timestamp" : "2020-06-21T15:00:01-05:00", - "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" - }, - "fields" : { - "@timestamp" : [ - "2020-06-21T20:00:01.000Z" - ], - "clientip" : [ - "211.11.9.0" - ], - "message" : [ - "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" - ], - "day_of_week" : [ - "Sunday" - ] - } - } + *** ] } } ---- // TESTRESPONSE[s/\.\.\./"took" : $body.took,"timed_out" : $body.timed_out,"_shards" : $body._shards,/] // TESTRESPONSE[s/"_id" : "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] -// TESTRESPONSE[s/"_id" : "8Zh81nQBp2DRDXdiOxVu"/"_id": $body.hits.hits.1._id/] // 
TESTRESPONSE[s/"day_of_week" : \[\n\s+"Sunday"\n\s\]/"day_of_week": $body.hits.hits.0.fields.day_of_week/] -// TESTRESPONSE[s/"day_of_week" : \[\n\s+"Sunday"\n\s\]/"day_of_week": $body.hits.hits.1.fields.day_of_week/] +// TESTRESPONSE[s/\*\*\*/$body.hits.hits.1/] <1> This value was calculated at search time using the runtime script defined in the mapping. From 7c1a7e084ffb94e89d0264c0fdc3fa0a55d3d376 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 26 Oct 2020 17:08:09 -0400 Subject: [PATCH 22/44] Shuffling some information and adding link to script-fields. --- docs/reference/mapping/types/runtime.asciidoc | 30 ++++++++++--------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 53c8b8f55e3c5..79addf7056294 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -1,13 +1,18 @@ [[runtime]] === Runtime Typically, you must index fields to {es} before they can be retrieved, -aggregated, or searched. With runtime fields, you can explicitly define a field -in the mapping and access it at search time without indexing your data during -ingest time. +aggregated, or searched. With _runtime fields_ beta[], you can explicitly +define a field in the mapping and access it at search time without indexing +your data during ingest time. -Runtime fields beta[] are accessible from the search API like any other field. -The <> returns runtime fields -as searchable and aggregatable. +Because runtime fields aren't indexed, you can more quickly ingest raw data +into the Elastic Stack and access it immediately. Runtime fields are accessible +from the search API like any other field. The <> returns runtime fields as both searchable and aggregatable. + +By dynamically evaluating runtime fields at search time, you can optimize disk +space by choosing which fields to index. 
If you later decide that a runtime +field is increasingly used for filtering or aggregation, you can add the field +to `_source` to gain faster search speed. Runtime fields help to alleviate several common issues when using {es}: @@ -16,20 +21,13 @@ experimenting on large datasets difficult * Indexing data before searching makes running one-off searches costly and resource intensive * Indexing all of your data instead of just the fields you want to search -requires more disk space to gain search time performance +requires more disk space to gain performance at search time * Reindexing data for time-based indices to ensure that existing indices include any new fields in the index template is slow * Determining how a field is evaluated in {kib} at index or search time is difficult for scripted fields because they have different needs based on where they are implemented -Because runtime fields aren't indexed, you can more quickly ingest raw data -into the Elastic Stack and immediately access it. By dynamically evaluating -runtime fields at search time, you can optimize disk space by choosing which -fields to index. If you later decide that a runtime field is increasingly used -for filtering or aggregation, you can add the field to `_source` to gain faster -search speed. - Runtime fields incur performance costs at search time, depending on the <>. For example, let's say you created an anomaly detection job that operates on the `@timestamp` field. @@ -54,6 +52,10 @@ script runs and generates values for each scripted field. Runtime scripts have access to the entire context of a document, the original `_source`, and the mapped field plus its values (`doc_values`). +Runtime fields are similar to the <> parameter +of the `_search` request. You can retrieve the results of running a script, but +also make the script results available for queries and aggregations. 
+ [[runtime-updating-scripts]] .Updating runtime scripts **** From d3f3e4bae0b42e99dcc0816e64f22a910ad7011e Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 26 Oct 2020 17:43:05 -0400 Subject: [PATCH 23/44] Fixing typo. --- docs/reference/mapping/types/runtime.asciidoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 79addf7056294..0217c60cd8ab7 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -52,9 +52,9 @@ script runs and generates values for each scripted field. Runtime scripts have access to the entire context of a document, the original `_source`, and the mapped field plus its values (`doc_values`). -Runtime fields are similar to the <> parameter +Runtime fields are similar to the <> parameter of the `_search` request. You can retrieve the results of running a script, but -also make the script results available for queries and aggregations. +also make the script results available for queries and aggregations. [[runtime-updating-scripts]] .Updating runtime scripts From 3fe154187217bd3f6d0f1450c1f567289685d9b7 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Fri, 30 Oct 2020 10:42:46 -0400 Subject: [PATCH 24/44] Updates for API redesign -- will break builds. --- docs/reference/mapping/types/runtime.asciidoc | 58 ++++++++++--------- 1 file changed, 32 insertions(+), 26 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 0217c60cd8ab7..48dd6f398638e 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -46,11 +46,12 @@ index them. [[runtime-mapping-fields]] ==== Mapping a runtime field -You map runtime fields by -<>. At search time, the -script runs and generates values for each scripted field. 
Runtime -scripts have access to the entire context of a document, the original -`_source`, and the mapped field plus its values (`doc_values`). +You map runtime fields by adding a `"runtime"` section under the mapping +definition. Within that section, you +<>, which has access to the +entire context of a document, the original `_source`, and the mapped field plus +its values (`doc_values`). At search time, the script runs and generates values +for each scripted field. Runtime fields are similar to the <> parameter of the `_search` request. You can retrieve the results of running a script, but @@ -79,21 +80,24 @@ The script in the following request extracts the day of the week from the PUT /my-index { "mappings": { - "properties": { + "runtime": { <1> "day_of_week": { - "type": "runtime", <1> - "runtime_type": "keyword", <2> + "type": "keyword", <2> "script": { <3> "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" } } + }, + "properties": { + "timestamp": {"type": "date"} } } } ---- -<1> Runtime fields are of the `runtime` data type. -<2> Each runtime has its own field type, defined by `runtime_type`. +<1> Runtime fields are defined in the `runtime` section of the mapping +definition. +<2> Each runtime has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. [[runtime-params]] @@ -161,7 +165,7 @@ field type for `message`. PUT /my-index/ { "mappings": { - "properties": { + "runtime": { "@timestamp": { "format": "strict_date_optional_time||epoch_second", "type": "date" @@ -222,7 +226,7 @@ The mapping contains two fields: `@timestamp` and `message`. { "my-index" : { "mappings" : { - "properties" : { + "runtime" : { "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_second" @@ -246,10 +250,9 @@ field at runtime to calculate values for that field. 
---- PUT /my-index/_mappings { - "properties": { + "runtime": { "clientip": { - "type": "runtime", - "runtime_type": "ip", + "type": "ip", "script" : { "source" : "String m = doc[\"message\"].value; int end = m.indexOf(\" \"); emit(m.substring(0, end));" } @@ -353,7 +356,7 @@ allows you to modify the mapping without changing any field values. ---- PUT /my-index/ { - "mappings": { + "runtime": { "properties": { "@timestamp": { "format": "strict_date_optional_time||epoch_second", @@ -392,10 +395,9 @@ POST /my-index/_bulk?refresh PUT /my-index/_mappings { - "properties": { + "runtime": { "clientip": { - "type": "runtime", - "runtime_type": "ip", + "type": "ip", "script" : { "source" : "String m = doc[\"message\"].value; int end = m.indexOf(\" \"); emit(m.substring(0, end));" } @@ -403,15 +405,19 @@ PUT /my-index/_mappings } } -PUT /my-index/_mappings +PUT /my-index { - "properties": { - "day_of_week": { - "type": "runtime", - "runtime_type": "keyword", - "script": { - "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" + "mappings": { + "runtime": { + "day_of_week": { + "type": "keyword", + "script": { + "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" + } } + }, + "properties": { + "timestamp": {"type": "date"} } } } From 6d495f024678e0ace4009169019da6bd9a13257f Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 23 Nov 2020 17:00:50 -0500 Subject: [PATCH 25/44] Updating examples and including info about overriding fields. --- docs/reference/mapping/types/runtime.asciidoc | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 48dd6f398638e..7e36392c1faa1 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -14,6 +14,11 @@ space by choosing which fields to index. 
If you later decide that a runtime field is increasingly used for filtering or aggregation, you can add the field to `_source` to gain faster search speed. +Alternatively, you can specify a `runtime_mappings` section in a search request +to use runtime fields that exist only as part of the query. This +flexibility allows you to override fields in `_source` for the duration of the +query without modifying the field itself. + Runtime fields help to alleviate several common issues when using {es}: * Reindexing your data between development iterations is slow and can make @@ -165,7 +170,7 @@ field type for `message`. PUT /my-index/ { "mappings": { - "runtime": { + "properties": { "@timestamp": { "format": "strict_date_optional_time||epoch_second", "type": "date" @@ -225,8 +230,9 @@ The mapping contains two fields: `@timestamp` and `message`. ---- { "my-index" : { + "aliases" : { }, "mappings" : { - "runtime" : { + "properties" : { "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_second" @@ -356,7 +362,7 @@ allows you to modify the mapping without changing any field values. ---- PUT /my-index/ { - "runtime": { + "mappings": { "properties": { "@timestamp": { "format": "strict_date_optional_time||epoch_second", From e00f80f6424280c8c545f2a82e5244bdcca504c2 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 24 Nov 2020 10:09:46 -0500 Subject: [PATCH 26/44] Updating examples. --- docs/reference/mapping/types/runtime.asciidoc | 36 ++++++++++++------- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 7e36392c1faa1..3ddc3430568e1 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -254,7 +254,7 @@ field at runtime to calculate values for that field. 
[source,console] ---- -PUT /my-index/_mappings +PUT /my-index/_mapping { "runtime": { "clientip": { @@ -399,7 +399,7 @@ POST /my-index/_bulk?refresh { "index": {}} { "@timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} -PUT /my-index/_mappings +PUT /my-index/_mapping { "runtime": { "clientip": { @@ -411,19 +411,29 @@ PUT /my-index/_mappings } } -PUT /my-index +GET my-index/_search { - "mappings": { - "runtime": { - "day_of_week": { - "type": "keyword", - "script": { - "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" - } + "query": { + "match": { + "clientip": "211.11.9.0" + } + }, + "fields" : ["*"] +} + +PUT /my-index/_mapping +{ + "runtime": { + "day_of_week": { + "type": "keyword", + "script": { + "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" } - }, - "properties": { - "timestamp": {"type": "date"} + } + }, + "properties": { + "timestamp": { + "type": "date" } } } From 8824beaf87b5399eb3e60491788a48b256b2e38b Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 24 Nov 2020 15:56:58 -0500 Subject: [PATCH 27/44] Adding info for using runtime fields in the search request. --- docs/reference/mapping/types/runtime.asciidoc | 106 ++++++++++++++++-- 1 file changed, 94 insertions(+), 12 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 3ddc3430568e1..502acbc4588f2 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -16,8 +16,8 @@ to `_source` to gain faster search speed. Alternatively, you can specify a `runtime_mappings` section in a search request to use runtime fields that exist only as part of the query. 
This -flexibility allows you to override fields in `_source` for the duration of the -query without modifying the field itself. +flexibility allows you to create or override field values in `_source` for the +duration of the query without modifying the field itself. Runtime fields help to alleviate several common issues when using {es}: @@ -105,6 +105,82 @@ definition. <2> Each runtime has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. +[[runtime-search-request]] +==== Defining runtime fields in a search request +You can specify a `runtime_mappings` section in a search request to create +runtime fields that exist only as part of the query. You must specify a script +as part of the `runtime_mappings` section, just as you would if adding a +runtime field to the mappings. + +In the following request, the values for the `day_of_week` field are calculated +dynamically, and only within the context of this search request. + +[source,console] +---- +GET my-index/_search +{ + "runtime_mappings": { + "day_of_week": { + "type": "keyword", + "script": { + "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))" + } + } + }, + "aggs": { + "day_of_week": { + "terms": { + "field": "day_of_week" + } + } + } +} +---- +// TEST[continued] + +[[runtime-overriding-fields]] +===== Overriding fields in the search request +You can use the `runtime_mappings` section of the `_search` request to override +field values inside objects by naming the runtime fields with dot notation. For +example, index the following documents. + +[source, console] +---- +POST my-index/_bulk?refresh=true +{"index":{}} +{"name":{"first":"Jose","last":"Hickman"}} +{"index":{}} +{"name":{"first":"India","last":"Avila","suffix":"II"}} +---- + +In the `runtime_mappings` section of the `_search` request, define a script +that operates on the `name.first` field. 
The values you specify will override +fields in `_source` for the duration of the query without modifying the field +itself. + +The following request evaluates the `name.last` field and overrides the +`name.first` field in the search request based on the script valuation. + +[source,console] +---- +POST my-index/_search +{ + "runtime_mappings": { + "name.first": { + "type": "keyword", + "script": { + "source": "if (\"Hickman\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Guile\");} else if (\"Avila\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Anastasia\");}" + } + } + }, + "query": { + "match": { + "name.first": "Anastasia" + } + } +} +---- + [[runtime-params]] ==== Parameters for `runtime` fields Runtime fields accept the following parameters: @@ -130,23 +206,29 @@ time to produce the value of the runtime field. [[runtime-retrieving-fields]] ==== Retrieving a runtime field Use the <> parameter on the `_search` API to retrieve -the values of runtime fields. This API works for all fields, even those that -were not sent as part of the original `_source`. +the values of runtime fields. Runtime fields won't display in `_source`, but +the `fields` API works for all fields, even those that were not sent as part of +the original `_source`. The following request uses the search API to retrieve the `day_of_week` field -that the previous request defined in the mapping. +that the previous request defined as a runtime field in the mapping. The value +for the `day_of_week` field is calculated dynamically at search time, and the +following search request retrieves any documents where the calculated value is +equal to `Thursday`. 
[source,console] ---- -GET /my-index/_search +GET my-index/_search { - "aggs": { - "days_of_week": { - "terms": { - "field": "day_of_week" - } + "query": { + "match": { + "day_of_week": "Thursday" } - } + }, + "fields": [ + "@timestamp", "day_of_week" + ], + "_source": false } ---- // TEST[continued] From 72cf35429ec99c248e7b3f3e914931b0dc936a05 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Wed, 25 Nov 2020 12:16:17 -0500 Subject: [PATCH 28/44] Adding that queries against runtime fields are expensive. --- docs/reference/mapping/types/runtime.asciidoc | 25 +++++++++++-------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/types/runtime.asciidoc index 502acbc4588f2..699f4f9f2c45c 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/types/runtime.asciidoc @@ -142,7 +142,7 @@ GET my-index/_search ===== Overriding fields in the search request You can use the `runtime_mappings` section of the `_search` request to override field values inside objects by naming the runtime fields with dot notation. For -example, index the following documents. +example, index the following documents into `my-index`. [source, console] ---- @@ -153,10 +153,10 @@ POST my-index/_bulk?refresh=true {"name":{"first":"India","last":"Avila","suffix":"II"}} ---- -In the `runtime_mappings` section of the `_search` request, define a script -that operates on the `name.first` field. The values you specify will override -fields in `_source` for the duration of the query without modifying the field -itself. +In the `runtime_mappings` section of the `_search` request, you can define a +script that operates on the `name.first` field. The values you specify will +override fields in `_source` for the duration of the query without modifying +the field itself. 
The following request evaluates the `name.last` field and overrides the `name.first` field in the search request based on the script valuation. @@ -182,7 +182,7 @@ POST my-index/_search ---- [[runtime-params]] -==== Parameters for `runtime` fields +==== Parameters for runtime fields Runtime fields accept the following parameters: [[runtime-params-type]] @@ -210,11 +210,16 @@ the values of runtime fields. Runtime fields won't display in `_source`, but the `fields` API works for all fields, even those that were not sent as part of the original `_source`. +IMPORTANT: Queries against runtime fields are considered expensive. If +<> is set +to `false`, expensive queries are not allowed and {es} will reject any queries +against runtime fields. + The following request uses the search API to retrieve the `day_of_week` field -that the previous request defined as a runtime field in the mapping. The value -for the `day_of_week` field is calculated dynamically at search time, and the -following search request retrieves any documents where the calculated value is -equal to `Thursday`. +that the <> defined as a runtime field +in the mapping. The value for the `day_of_week` field is calculated dynamically +at search time, and the following search request retrieves any documents where +the calculated value is equal to `Thursday`. [source,console] ---- From 35d7b6a7a41c1af5395a7b2492445342cd6324c3 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 30 Nov 2020 17:10:55 -0500 Subject: [PATCH 29/44] Incorporating feedback from reviewers. 
--- docs/reference/mapping.asciidoc | 14 +- .../mapping/{types => }/runtime.asciidoc | 131 +++++++++--------- docs/reference/mapping/types.asciidoc | 27 ---- docs/reference/search/field-caps.asciidoc | 4 +- 4 files changed, 83 insertions(+), 93 deletions(-) rename docs/reference/mapping/{types => }/runtime.asciidoc (86%) diff --git a/docs/reference/mapping.asciidoc b/docs/reference/mapping.asciidoc index e146b1194f6a4..2e2717f8dc0fa 100644 --- a/docs/reference/mapping.asciidoc +++ b/docs/reference/mapping.asciidoc @@ -13,7 +13,8 @@ are stored and indexed. For instance, use mappings to define: * custom rules to control the mapping for <>. -A mapping definition has: +A mapping definition includes metadata fields and fields, and can also include +runtime fields: <>:: @@ -30,6 +31,12 @@ document. Each field has its own <>. NOTE: Before 7.0.0, the 'mappings' definition used to include a type name. For more details, please see <>. +<>:: + +Runtime fields are not indexed, which saves disk space and makes data ingest +faster. You can add fields to existing documents without reindexing your data +and calculate field values dynamically at search time. + [[mapping-limit-settings]] [discrete] === Settings to prevent mappings explosion @@ -92,6 +99,7 @@ If your field mappings contain a large, arbitrary set of keys, consider using th `Long.MAX_VALUE` (no limit). [discrete] +[[dynamic-mapping-intro]] == Dynamic mapping Fields and mapping types do not need to be defined before being used. Thanks @@ -114,7 +122,7 @@ You can create field mappings when you <> and [discrete] [[create-mapping]] -== Create an index with an explicit mapping +=== Create an index with an explicit mapping You can use the <> API to create a new index with an explicit mapping. 
@@ -262,3 +270,5 @@ include::mapping/fields.asciidoc[] include::mapping/params.asciidoc[] include::mapping/dynamic-mapping.asciidoc[] + +include::mapping/runtime.asciidoc[] diff --git a/docs/reference/mapping/types/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc similarity index 86% rename from docs/reference/mapping/types/runtime.asciidoc rename to docs/reference/mapping/runtime.asciidoc index 699f4f9f2c45c..c0e4ab8e5ca9b 100644 --- a/docs/reference/mapping/types/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -1,24 +1,37 @@ [[runtime]] -=== Runtime -Typically, you must index fields to {es} before they can be retrieved, -aggregated, or searched. With _runtime fields_ beta[], you can explicitly -define a field in the mapping and access it at search time without indexing -your data during ingest time. - +== Runtime fields +Typically, you index data into {es} to promote faster search. However, indexing +can be slow and requires more disk space, and you have to reindex your data to +add fields to existing documents. With _runtime fields_ beta[], you can add +fields to documents already indexed to {es} without reindexing your data. + +[discrete] +[[runtime-benefits]] +=== Benefits Because runtime fields aren't indexed, you can more quickly ingest raw data -into the Elastic Stack and access it immediately. Runtime fields are accessible -from the search API like any other field. The <> returns runtime fields as both searchable and aggregatable. +into the Elastic Stack and access it immediately. By dynamically evaluating +runtime fields at search time, you can optimize disk space by choosing which +fields to index. If you later decide that a runtime field is increasingly used +for filtering or aggregation, you can add the field to `_source` to gain faster +search speed. -By dynamically evaluating runtime fields at search time, you can optimize disk -space by choosing which fields to index. 
If you later decide that a runtime -field is increasingly used for filtering or aggregation, you can add the field -to `_source` to gain faster search speed. +Instead of reindexing your data to add fields, add runtime fields to the +mapping definition. You can access runtime fields from the search API like any +other field, and {es} sees runtime fields no differently. Alternatively, you can specify a `runtime_mappings` section in a search request to use runtime fields that exist only as part of the query. This flexibility allows you to create or override field values in `_source` for the duration of the query without modifying the field itself. +[discrete] +[[runtime-use-cases]] +=== Use cases +Runtime fields are useful when working with log data, especially when you're +unsure about the data structure. Your search speed decreases, but your index +size is much smaller and you can more quickly process logs without having to +index them. + Runtime fields help to alleviate several common issues when using {es}: * Reindexing your data between development iterations is slow and can make @@ -33,35 +46,52 @@ include any new fields in the index template is slow difficult for scripted fields because they have different needs based on where they are implemented -Runtime fields incur performance costs at search time, depending -on the <>. For example, let's say -you created an anomaly detection job that operates on the `@timestamp` field. -If the `@timestamp` field is a runtime field, the search cost would be extremely -high because the data isn't indexed and {es} must compute the value for each -document that matches the query. - -NOTE: Computing values for runtime fields in each document that might match a -query impacts search speed. Use the <> -to run searches that include runtime fields. 
+[discrete] +[[runtime-compromises]] +=== Compromises +Runtime fields use less disk space and provide flexibility in how you access +your data, but can impact search performance based on the computation defined in +the runtime script. + +To balance search performance and flexibility, index fields that you'll +commonly search for and filter on, such as a timestamp. {es} automatically uses +these indexed fields first when running a query, which produces a fast response +time. You can then use runtime fields to limit the number of fields that {es} +needs to calculate values for. Using indexed fields in tandem with runtime +fields provides flexibility in the data that you index and how you define +queries for other fields. + +Use the <> to run searches that include +runtime fields. This method of search helps to offset the performance impacts +of computing values for runtime fields in each document containing that field. -Runtime fields are useful when working with log data, especially when you're -unsure about the data structure. Your search speed decreases, but your index -size is much smaller and you can more quickly process logs without having to -index them. +IMPORTANT: Queries against runtime fields are considered expensive. If +<> is set +to `false`, expensive queries are not allowed and {es} will reject any queries +against runtime fields. +[discrete] [[runtime-mapping-fields]] -==== Mapping a runtime field +=== Mapping a runtime field You map runtime fields by adding a `"runtime"` section under the mapping definition. Within that section, you <>, which has access to the -entire context of a document, the original `_source`, and the mapped field plus -its values (`doc_values`). At search time, the script runs and generates values -for each scripted field. +original `_source` and the mapped field plus its values (`doc_values`). At +search time, the script runs and generates values for each scripted field. 
+ +The `"runtime"` section supports `boolean`, `date`, `double`, `ip`, `keyword`, +and `long` data types. Runtime fields with a `type` of `date` can accept the +<> parameter exactly as the `date` field type. Runtime fields are similar to the <> parameter of the `_search` request. You can retrieve the results of running a script, but also make the script results available for queries and aggregations. +IMPORTANT: Objects are not supported in the `"runtime"` section under the +mapping definition. If you want to map a runtime field under an object, you +can +<>. + [[runtime-updating-scripts]] .Updating runtime scripts **** @@ -71,8 +101,8 @@ inconsistent results. Each shard might have access to different versions of the script, depending on when the mapping change takes effect. Existing queries or visualizations in {kib} that rely on runtime fields can -fail if you change the `runtime_type`. For example, a bar chart visualization -that uses a runtime field of `ip` will fail if the `runtime_type` is changed +fail if you change the field type. For example, a bar chart visualization +that uses a runtime field of type `ip` will fail if the type is changed to `boolean`. **** @@ -105,8 +135,9 @@ definition. <2> Each runtime has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. +[discrete] [[runtime-search-request]] -==== Defining runtime fields in a search request +=== Defining runtime fields in a search request You can specify a `runtime_mappings` section in a search request to create runtime fields that exist only as part of the query. 
You must specify a script as part of the `runtime_mappings` section, just as you would if adding a @@ -138,8 +169,9 @@ GET my-index/_search ---- // TEST[continued] +[discrete] [[runtime-overriding-fields]] -===== Overriding fields in the search request +==== Overriding fields in the search request You can use the `runtime_mappings` section of the `_search` request to override field values inside objects by naming the runtime fields with dot notation. For example, index the following documents into `my-index`. @@ -181,28 +213,7 @@ POST my-index/_search } ---- -[[runtime-params]] -==== Parameters for runtime fields -Runtime fields accept the following parameters: - -[[runtime-params-type]] -`type`:: -The type of runtime computation to perform at query time. Currently, runtime -fields only support the `runtime` data type. - -[[runtime-params-runtime-type]] -`runtime_type`:: -The <> for each scripted field. {es} -supports `boolean`, `date`, `double`, `ip`, `keyword`, and `long`. -+ -Runtime fields with a `runtime_type` of `date` can accept the -<> parameter exactly as the `date` field type. - -[[runtime-params-script]] -`script`:: -The <> that is evaluated at search -time to produce the value of the runtime field. - +[discrete] [[runtime-retrieving-fields]] ==== Retrieving a runtime field Use the <> parameter on the `_search` API to retrieve @@ -210,11 +221,6 @@ the values of runtime fields. Runtime fields won't display in `_source`, but the `fields` API works for all fields, even those that were not sent as part of the original `_source`. -IMPORTANT: Queries against runtime fields are considered expensive. If -<> is set -to `false`, expensive queries are not allowed and {es} will reject any queries -against runtime fields. - The following request uses the search API to retrieve the `day_of_week` field that the <> defined as a runtime field in the mapping. 
The value for the `day_of_week` field is calculated dynamically @@ -238,8 +244,9 @@ GET my-index/_search ---- // TEST[continued] +[discrete] [[runtime-examples]] -==== Examples +=== Examples Consider a large set of log data that you want to extract fields from. Indexing the data is time consuming and uses a lot of disk space, and you just want to explore the data structure without committing to a schema up front. diff --git a/docs/reference/mapping/types.asciidoc b/docs/reference/mapping/types.asciidoc index de0919a304853..4a3b87b5e9931 100644 --- a/docs/reference/mapping/types.asciidoc +++ b/docs/reference/mapping/types.asciidoc @@ -126,31 +126,6 @@ the <>, the This is the purpose of _multi-fields_. Most field types support multi-fields via the <> parameter. -[discrete] -[[types-runtime]] -=== Runtime -{es} indexes most field types by default to promote faster search. However, -indexing all of your data can be slow and requires more disk space. If you're -experimenting with your data or are unsure which fields you need for search, -use _runtime fields_ beta[]. - -{es} treats runtime fields like any other field, except that their values are -only extracted or computed at search time. When mapping a runtime field, you -define a script that determines how to extract or compute values from existing -fields. - -Runtime fields use less disk space and provide flexibility in how you access -your data, but impact search performance based on the computation defined in -the runtime script. See <>. - -// Runtime fields optimize disk space by evaluating each runtime script at search -// time, instead of indexing fields and using disk space. -// -// Runtime fields use less disk space and provide flexibility in how you want to -// access your data. Use runtime fields to quickly get data into the Elastic Stack -// -// Runtime fields make searches slower, as computing their values for each document that might match the query is costly, depending on how they are calculated. 
- include::types/alias.asciidoc[] include::types/array.asciidoc[] @@ -195,8 +170,6 @@ include::types/rank-feature.asciidoc[] include::types/rank-features.asciidoc[] -include::types/runtime.asciidoc[] - include::types/search-as-you-type.asciidoc[] include::types/shape.asciidoc[] diff --git a/docs/reference/search/field-caps.asciidoc b/docs/reference/search/field-caps.asciidoc index a750b7c86ba49..6bbd238acd23e 100644 --- a/docs/reference/search/field-caps.asciidoc +++ b/docs/reference/search/field-caps.asciidoc @@ -34,8 +34,8 @@ GET /_field_caps?fields=rating The field capabilities API returns the information about the capabilities of fields among multiple indices. -Use the field capabilities API to return <> like any -other field. For example, a runtime field with a `runtime_type` of +The field capabilities API returns <> like any +other field. For example, a runtime field with a type of `keyword` returns as any other field that belongs to the `keyword` family. From 4cd5b20d1714587ca95f9719b89a86eea4c461a0 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 1 Dec 2020 15:21:30 -0500 Subject: [PATCH 30/44] Minor changes from reviews. --- docs/reference/mapping/runtime.asciidoc | 46 +++++++++++-------------- 1 file changed, 21 insertions(+), 25 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index c0e4ab8e5ca9b..7b7951ecdf44f 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -84,29 +84,14 @@ and `long` data types. Runtime fields with a `type` of `date` can accept the <> parameter exactly as the `date` field type. Runtime fields are similar to the <> parameter -of the `_search` request. You can retrieve the results of running a script, but -also make the script results available for queries and aggregations. +of the `_search` request, but also make the script results available for +queries and aggregations. 
IMPORTANT: Objects are not supported in the `"runtime"` section under the mapping definition. If you want to map a runtime field under an object, you can <>. -[[runtime-updating-scripts]] -.Updating runtime scripts -**** - -Updating a script while a dependent query is running can return -inconsistent results. Each shard might have access to different versions of the -script, depending on when the mapping change takes effect. - -Existing queries or visualizations in {kib} that rely on runtime fields can -fail if you change the field type. For example, a bar chart visualization -that uses a runtime field of type `ip` will fail if the type is changed -to `boolean`. - -**** - The script in the following request extracts the day of the week from the `@timestamp` field, which is defined as a `date` data type. @@ -135,6 +120,21 @@ definition. <2> Each runtime has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. +[[runtime-updating-scripts]] +.Updating runtime scripts +**** + +Updating a script while a dependent query is running can return +inconsistent results. Each shard might have access to different versions of the +script, depending on when the mapping change takes effect. + +Existing queries or visualizations in {kib} that rely on runtime fields can +fail if you change the field type. For example, a bar chart visualization +that uses a runtime field of type `ip` will fail if the type is changed +to `boolean`. + +**** + [discrete] [[runtime-search-request]] === Defining runtime fields in a search request @@ -212,6 +212,7 @@ POST my-index/_search } } ---- +// TEST[continued] [discrete] [[runtime-retrieving-fields]] @@ -231,13 +232,9 @@ the calculated value is equal to `Thursday`. 
---- GET my-index/_search { - "query": { - "match": { - "day_of_week": "Thursday" - } - }, "fields": [ - "@timestamp", "day_of_week" + "@timestamp", + "day_of_week" ], "_source": false } @@ -324,7 +321,6 @@ The mapping contains two fields: `@timestamp` and `message`. ---- { "my-index" : { - "aliases" : { }, "mappings" : { "properties" : { "@timestamp" : { @@ -340,7 +336,7 @@ The mapping contains two fields: `@timestamp` and `message`. } } ---- -// TESTRESPONSE[s/\.\.\./"aliases": $body.my-index.aliases, "settings": $body.my-index.settings/] +// TESTRESPONSE[s/\.\.\./"settings": $body.my-index.settings/] If you want to retrieve results that include `clientip`, you can add that field as a runtime field in the mapping. The runtime script operates on the `clientip` From 99b2720ff878a45b64e6b2fed0846ac7c3833bc5 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 1 Dec 2020 15:43:13 -0500 Subject: [PATCH 31/44] Adding alias for test case. --- docs/reference/mapping/runtime.asciidoc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 7b7951ecdf44f..aed4ea50dbc13 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -321,6 +321,7 @@ The mapping contains two fields: `@timestamp` and `message`. ---- { "my-index" : { + "aliases" : { }, "mappings" : { "properties" : { "@timestamp" : { @@ -336,7 +337,7 @@ The mapping contains two fields: `@timestamp` and `message`. } } ---- -// TESTRESPONSE[s/\.\.\./"settings": $body.my-index.settings/] +// TESTRESPONSE[s/\.\.\./"aliases": $body.my-index.aliases, "settings": $body.my-index.settings/] If you want to retrieve results that include `clientip`, you can add that field as a runtime field in the mapping. 
The runtime script operates on the `clientip` From 43dd29d804c3a789fead62faae4bae5f337b173d Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 1 Dec 2020 15:55:28 -0500 Subject: [PATCH 32/44] Adding aliases to PUT example. --- docs/reference/mapping/runtime.asciidoc | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index aed4ea50dbc13..90e8bf0c32a63 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -260,6 +260,7 @@ field type for `message`. ---- PUT /my-index/ { + "aliases": {}, "mappings": { "properties": { "@timestamp": { From 994b1c22d7e388718cad56b47cfcc651115641e1 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 1 Dec 2020 16:52:33 -0500 Subject: [PATCH 33/44] Fixing test cases, for real this time. --- docs/reference/mapping/runtime.asciidoc | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 90e8bf0c32a63..97ed6f607cc2f 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -260,7 +260,6 @@ field type for `message`. ---- PUT /my-index/ { - "aliases": {}, "mappings": { "properties": { "@timestamp": { @@ -338,7 +337,7 @@ The mapping contains two fields: `@timestamp` and `message`. } } ---- -// TESTRESPONSE[s/\.\.\./"aliases": $body.my-index.aliases, "settings": $body.my-index.settings/] +// TESTRESPONSE[s/\.\.\./"settings": $body.my-index.settings/] If you want to retrieve results that include `clientip`, you can add that field as a runtime field in the mapping. The runtime script operates on the `clientip` From 38194b28817ba9d72ce119d86401347bb6ba661b Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 1 Dec 2020 17:42:27 -0500 Subject: [PATCH 34/44] Updating use cases and introducing overlay throughout. 
--- docs/reference/mapping/runtime.asciidoc | 51 ++++++++++++------------- 1 file changed, 24 insertions(+), 27 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 97ed6f607cc2f..71ac62a1b5674 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -21,30 +21,27 @@ other field, and {es} sees runtime fields no differently. Alternatively, you can specify a `runtime_mappings` section in a search request to use runtime fields that exist only as part of the query. This -flexibility allows you to create or override field values in `_source` for the +flexibility allows you to create or overlay field values in `_source` for the duration of the query without modifying the field itself. [discrete] [[runtime-use-cases]] === Use cases -Runtime fields are useful when working with log data, especially when you're -unsure about the data structure. Your search speed decreases, but your index -size is much smaller and you can more quickly process logs without having to -index them. - -Runtime fields help to alleviate several common issues when using {es}: - -* Reindexing your data between development iterations is slow and can make -experimenting on large datasets difficult -* Indexing data before searching makes running one-off searches costly and -resource intensive -* Indexing all of your data instead of just the fields you want to search -requires more disk space to gain performance at search time -* Reindexing data for time-based indices to ensure that existing indices -include any new fields in the index template is slow -* Determining how a field is evaluated in {kib} at index or search time is -difficult for scripted fields because they have different needs based on where -they are implemented +Runtime fields are useful when working with <>, +especially when you're unsure about the data structure. 
Your search speed +decreases, but your index size is much smaller and you can more quickly process +logs without having to index them. + +Runtime fields are especially useful in the following contexts: + +* Adding fields to documents that are already indexed without having to reindex +data +* Immediately begin working on a new data stream without fully understanding +the data it contains +* Overlaying an indexed field with a runtime field to fix a mistake after +indexing documents +* Defining fields that are only relevant for a particular context (such as a +visualization in {kib}) without influencing the underlying schema [discrete] [[runtime-compromises]] @@ -90,7 +87,7 @@ queries and aggregations. IMPORTANT: Objects are not supported in the `"runtime"` section under the mapping definition. If you want to map a runtime field under an object, you can -<>. +<>. The script in the following request extracts the day of the week from the `@timestamp` field, which is defined as a `date` data type. @@ -170,9 +167,9 @@ GET my-index/_search // TEST[continued] [discrete] -[[runtime-overriding-fields]] -==== Overriding fields in the search request -You can use the `runtime_mappings` section of the `_search` request to override +[[runtime-overlaying-fields]] +==== Overlaying fields in the search request +You can use the `runtime_mappings` section of the `_search` request to overlay field values inside objects by naming the runtime fields with dot notation. For example, index the following documents into `my-index`. @@ -187,11 +184,11 @@ POST my-index/_bulk?refresh=true In the `runtime_mappings` section of the `_search` request, you can define a script that operates on the `name.first` field. The values you specify will -override fields in `_source` for the duration of the query without modifying -the field itself. +overlay field values in `_source` for the duration of the query without +modifying the field itself. 
-The following request evaluates the `name.last` field and overrides the -`name.first` field in the search request based on the script valuation. +The following request evaluates the `name.last` field and overlays the value for +the `name.first` field in the search request based on the script valuation. [source,console] ---- From 70a82e85987f974df55ace134e42b6f48a4da8d6 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Wed, 2 Dec 2020 18:06:35 -0500 Subject: [PATCH 35/44] Edits, adding 'shadowing', and explaining shadowing better. --- docs/reference/mapping/runtime.asciidoc | 45 +++++++++++++++---------- 1 file changed, 28 insertions(+), 17 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 71ac62a1b5674..8b013212902ef 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -2,7 +2,7 @@ == Runtime fields Typically, you index data into {es} to promote faster search. However, indexing can be slow and requires more disk space, and you have to reindex your data to -add fields to existing documents. With _runtime fields_ beta[], you can add +add fields to existing documents. With _runtime fields_, you can add fields to documents already indexed to {es} without reindexing your data. [discrete] @@ -21,7 +21,7 @@ other field, and {es} sees runtime fields no differently. Alternatively, you can specify a `runtime_mappings` section in a search request to use runtime fields that exist only as part of the query. This -flexibility allows you to create or overlay field values in `_source` for the +flexibility allows you to create or shadow field values in `_source` for the duration of the query without modifying the field itself. 
[discrete] @@ -38,7 +38,7 @@ Runtime fields are especially useful in the following contexts: data * Immediately begin working on a new data stream without fully understanding the data it contains -* Overlaying an indexed field with a runtime field to fix a mistake after +* Shadowing an indexed field with a runtime field to fix a mistake after indexing documents * Defining fields that are only relevant for a particular context (such as a visualization in {kib}) without influencing the underlying schema @@ -76,19 +76,14 @@ definition. Within that section, you original `_source` and the mapped field plus its values (`doc_values`). At search time, the script runs and generates values for each scripted field. -The `"runtime"` section supports `boolean`, `date`, `double`, `ip`, `keyword`, -and `long` data types. Runtime fields with a `type` of `date` can accept the -<> parameter exactly as the `date` field type. +NOTE: You can define a runtime field in the mapping definition without a +script. {es} will look in `_source` for a field with the same name as the +runtime field and use values from that field at query time. Runtime fields are similar to the <> parameter of the `_search` request, but also make the script results available for queries and aggregations. -IMPORTANT: Objects are not supported in the `"runtime"` section under the -mapping definition. If you want to map a runtime field under an object, you -can -<>. - The script in the following request extracts the day of the week from the `@timestamp` field, which is defined as a `date` data type. @@ -117,6 +112,15 @@ definition. <2> Each runtime has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. +The `"runtime"` section supports `boolean`, `date`, `double`, `ip`, `keyword`, +and `long` data types. Runtime fields with a `type` of `date` can accept the +<> parameter exactly as the `date` field type. 
+ +IMPORTANT: Objects are not supported in the `"runtime"` section under the +mapping definition. If you want to map a runtime field under an object, you +can +<>. + [[runtime-updating-scripts]] .Updating runtime scripts **** @@ -167,10 +171,17 @@ GET my-index/_search // TEST[continued] [discrete] -[[runtime-overlaying-fields]] -==== Overlaying fields in the search request -You can use the `runtime_mappings` section of the `_search` request to overlay -field values inside objects by naming the runtime fields with dot notation. For +[[runtime-shadowing-fields]] +==== Shadowing fields in the search request +If you create a runtime field with the same name as a field that +already exists in the mapping, the runtime field shadows the mapped field. At +search time, {es} calculates the value of the runtime field and returns it as +part of the query. Because the runtime field shadows the mapped field, you can +modify the value returned in search without modifying the mapped field. + +Objects are not supported in the `"runtime"` section under the +mapping definition. However, you can use the `runtime_mappings` section of the +`_search` request to shadow field values inside objects. For example, index the following documents into `my-index`. [source, console] @@ -184,10 +195,10 @@ POST my-index/_bulk?refresh=true In the `runtime_mappings` section of the `_search` request, you can define a script that operates on the `name.first` field. The values you specify will -overlay field values in `_source` for the duration of the query without +shadow field values in `_source` for the duration of the query without modifying the field itself. -The following request evaluates the `name.last` field and overlays the value for +The following request evaluates the `name.last` field and shadows the value for the `name.first` field in the search request based on the script valuation. 
[source,console] From c27ce4c177f14774bcdc4d616ff530d12c977cfb Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Thu, 3 Dec 2020 13:00:43 -0500 Subject: [PATCH 36/44] Streamlining tests and other changes. --- docs/reference/mapping/runtime.asciidoc | 152 ++++++------------------ 1 file changed, 37 insertions(+), 115 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 8b013212902ef..4296721cca4f8 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -27,10 +27,10 @@ duration of the query without modifying the field itself. [discrete] [[runtime-use-cases]] === Use cases -Runtime fields are useful when working with <>, -especially when you're unsure about the data structure. Your search speed -decreases, but your index size is much smaller and you can more quickly process -logs without having to index them. +Runtime fields are useful when working with log data +(see <>), especially when you're unsure about the +data structure. Your search speed decreases, but your index size is much +smaller and you can more quickly process logs without having to index them. Runtime fields are especially useful in the following contexts: @@ -52,7 +52,7 @@ the runtime script. To balance search performance and flexibility, index fields that you'll commonly search for and filter on, such as a timestamp. {es} automatically uses -these indexed fields first when running a query, which produces a fast response +these indexed fields first when running a query, resulting in a fast response time. You can then use runtime fields to limit the number of fields that {es} needs to calculate values for. Using indexed fields in tandem with runtime fields provides flexibility in the data that you index and how you define @@ -73,7 +73,7 @@ against runtime fields. You map runtime fields by adding a `"runtime"` section under the mapping definition. 
Within that section, you <>, which has access to the -original `_source` and the mapped field plus its values (`doc_values`). At +original `_source` and `doc_values` (the mapped field plus its values). At search time, the script runs and generates values for each scripted field. NOTE: You can define a runtime field in the mapping definition without a @@ -85,7 +85,7 @@ of the `_search` request, but also make the script results available for queries and aggregations. The script in the following request extracts the day of the week from the -`@timestamp` field, which is defined as a `date` data type. +`@timestamp` field, which is defined as a `date` type: [source,console] ---- @@ -107,7 +107,7 @@ PUT /my-index } ---- -<1> Runtime fields are defined in the `runtime` section of the mapping +<1> Runtime fields are defined in the `"runtime"` section of the mapping definition. <2> Each runtime has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. @@ -140,12 +140,12 @@ to `boolean`. [[runtime-search-request]] === Defining runtime fields in a search request You can specify a `runtime_mappings` section in a search request to create -runtime fields that exist only as part of the query. You must specify a script +runtime fields that exist only as part of the query. You specify a script as part of the `runtime_mappings` section, just as you would if adding a runtime field to the mappings. In the following request, the values for the `day_of_week` field are calculated -dynamically, and only within the context of this search request. +dynamically, and only within the context of this search request: [source,console] ---- @@ -182,7 +182,7 @@ modify the value returned in search without modifying the mapped field. Objects are not supported in the `"runtime"` section under the mapping definition. However, you can use the `runtime_mappings` section of the `_search` request to shadow field values inside objects. 
For -example, index the following documents into `my-index`. +example, index the following documents into `my-index`: [source, console] ---- @@ -199,7 +199,7 @@ shadow field values in `_source` for the duration of the query without modifying the field itself. The following request evaluates the `name.last` field and shadows the value for -the `name.first` field in the search request based on the script valuation. +the `name.first` field in the search request based on the script valuation: [source,console] ---- @@ -231,7 +231,7 @@ the `fields` API works for all fields, even those that were not sent as part of the original `_source`. The following request uses the search API to retrieve the `day_of_week` field -that the <> defined as a runtime field +that <> defined as a runtime field in the mapping. The value for the `day_of_week` field is calculated dynamically at search time, and the following search request retrieves any documents where the calculated value is equal to `Thursday`. @@ -242,7 +242,7 @@ GET my-index/_search { "fields": [ "@timestamp", - "day_of_week" + "day_of_week": "Thursday" ], "_source": false } @@ -262,7 +262,7 @@ time for these fields. You can start with a simple example by adding the `@timestamp` and `message` fields to the `my-index` mapping. To remain flexible, use `wildcard` as the -field type for `message`. +field type for `message`: [source,console] ---- @@ -374,6 +374,7 @@ search for a specific IP address and return all related fields. ---- GET my-index/_search { + "size": 1, "query": { "match": { "clientip": "211.11.9.0" @@ -401,41 +402,21 @@ and determine which fields to index. 
"hits" : [ { "_index" : "my-index", - "_id" : "8Jh81nQBp2DRDXdiOxVt", + "_id" : "oWs5KXYB-XyJbifr9mrz", "_score" : 1.0, "_source" : { "@timestamp" : "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" }, "fields" : { - "clientip" : [ - "211.11.9.0" - ], - "message" : [ - "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" - ], "@timestamp" : [ "2020-06-21T20:00:01.000Z" - ] - } - }, - { - "_index" : "my-index", - "_id" : "8Zh81nQBp2DRDXdiOxVu", - "_score" : 1.0, - "_source" : { - "@timestamp" : "2020-06-21T15:00:01-05:00", - "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" - }, - "fields" : { + ], "clientip" : [ "211.11.9.0" ], "message" : [ "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0" - ], - "@timestamp" : [ - "2020-06-21T20:00:01.000Z" ] } } @@ -444,82 +425,13 @@ and determine which fields to index. } ---- // TESTRESPONSE[s/\.\.\./"took" : $body.took,"timed_out" : $body.timed_out,"_shards" : $body._shards,/] -// TESTRESPONSE[s/"_id" : "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] -// TESTRESPONSE[s/"_id" : "8Zh81nQBp2DRDXdiOxVu"/"_id": $body.hits.hits.1._id/] - -If you add the `day_of_week` field to the mapping using the request in -<>, you can re-run the previous -search request and also retrieve the day of the week based on the `@timestamp` -field. +// TESTRESPONSE[s/"_id" : "oWs5KXYB-XyJbifr9mrz"/"_id": $body.hits.hits.0._id/] -The value for this field is calculated dynamically at runtime without -reindexing the document or adding the `day_of_week` field. This flexibility -allows you to modify the mapping without changing any field values. 
+You can add the `day_of_week` field to the mapping using the request from +<>: -//// [source,console] ---- -PUT /my-index/ -{ - "mappings": { - "properties": { - "@timestamp": { - "format": "strict_date_optional_time||epoch_second", - "type": "date" - }, - "message": { - "type": "wildcard" - } - } - } -} - -POST /my-index/_bulk?refresh -{ "index": {}} -{ "@timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} -{ "index": {}} -{ "@timestamp": "2020-06-21T15:00:01-05:00", "message" : "211.11.9.0 - - [2020-06-21T15:00:01-05:00] \"GET /english/index.html HTTP/1.0\" 304 0"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:30:17-05:00", "message" : "40.135.0.0 - - [2020-04-30T14:30:17-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:30:53-05:00", "message" : "232.0.0.0 - - [2020-04-30T14:30:53-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:31:12-05:00", "message" : "26.1.0.0 - - [2020-04-30T14:31:12-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:31:19-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:19-05:00] \"GET /french/splash_inet.html HTTP/1.0\" 200 3781"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:31:27-05:00", "message" : "252.0.0.0 - - [2020-04-30T14:31:27-05:00] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_brdl.gif HTTP/1.0\" 304 0"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:31:29-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:29-05:00] \"GET /images/hm_arw.gif HTTP/1.0\" 304 0"} -{ "index": {}} -{ "@timestamp": "2020-04-30T14:31:32-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:32-05:00] \"GET /images/nav_bg_top.gif HTTP/1.0\" 200 929"} -{ "index": 
{}} -{ "@timestamp": "2020-04-30T14:31:43-05:00", "message" : "247.37.0.0 - - [2020-04-30T14:31:43-05:00] \"GET /french/images/nav_venue_off.gif HTTP/1.0\" 304 0"} - -PUT /my-index/_mapping -{ - "runtime": { - "clientip": { - "type": "ip", - "script" : { - "source" : "String m = doc[\"message\"].value; int end = m.indexOf(\" \"); emit(m.substring(0, end));" - } - } - } -} - -GET my-index/_search -{ - "query": { - "match": { - "clientip": "211.11.9.0" - } - }, - "fields" : ["*"] -} - PUT /my-index/_mapping { "runtime": { @@ -536,9 +448,17 @@ PUT /my-index/_mapping } } } +---- +// TEST[continued] + +Then, you can re-run the previous search request and also retrieve the day of +the week based on the `@timestamp` field: +[source,console] +---- GET my-index/_search { + "size": 1, "query": { "match": { "clientip": "211.11.9.0" @@ -547,7 +467,11 @@ GET my-index/_search "fields" : ["*"] } ---- -//// +// TEST[continued] + +The value for this field is calculated dynamically at runtime without +reindexing the document or adding the `day_of_week` field. This flexibility +allows you to modify the mapping without changing any field values. [source,console-result] ---- @@ -562,7 +486,7 @@ GET my-index/_search "hits" : [ { "_index" : "my-index", - "_id" : "8Jh81nQBp2DRDXdiOxVt", + "_id" : "oWs5KXYB-XyJbifr9mrz", "_score" : 1.0, "_source" : { "@timestamp" : "2020-06-21T15:00:01-05:00", @@ -582,16 +506,14 @@ GET my-index/_search "Sunday" <1> ] } - }, - *** + } ] } } ---- // TESTRESPONSE[s/\.\.\./"took" : $body.took,"timed_out" : $body.timed_out,"_shards" : $body._shards,/] -// TESTRESPONSE[s/"_id" : "8Jh81nQBp2DRDXdiOxVt"/"_id": $body.hits.hits.0._id/] +// TESTRESPONSE[s/"_id" : "oWs5KXYB-XyJbifr9mrz"/"_id": $body.hits.hits.0._id/] // TESTRESPONSE[s/"day_of_week" : \[\n\s+"Sunday"\n\s\]/"day_of_week": $body.hits.hits.0.fields.day_of_week/] -// TESTRESPONSE[s/\*\*\*/$body.hits.hits.1/] <1> This value was calculated at search time using the runtime script defined in the mapping. 
From d9a927167824d778de20c5db84f6701406ec5893 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Thu, 3 Dec 2020 14:45:25 -0500 Subject: [PATCH 37/44] Fix formatting in example for test. --- docs/reference/mapping/runtime.asciidoc | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 4296721cca4f8..0a0e915ffc1e4 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -233,8 +233,7 @@ the original `_source`. The following request uses the search API to retrieve the `day_of_week` field that <> defined as a runtime field in the mapping. The value for the `day_of_week` field is calculated dynamically -at search time, and the following search request retrieves any documents where -the calculated value is equal to `Thursday`. +at search time based on the evaluation of the defined script. [source,console] ---- @@ -242,7 +241,7 @@ GET my-index/_search { "fields": [ "@timestamp", - "day_of_week": "Thursday" + "day_of_week" ], "_source": false } From 4d5452d2840d25f9f0e1cf355bd14741c2a7c8f6 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 7 Dec 2020 17:36:42 -0500 Subject: [PATCH 38/44] Apply suggestions from code review Co-authored-by: Gilad Gal --- docs/reference/mapping/runtime.asciidoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 0a0e915ffc1e4..7b80cd4d9fad6 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -74,7 +74,7 @@ You map runtime fields by adding a `"runtime"` section under the mapping definition. Within that section, you <>, which has access to the original `_source` and `doc_values` (the mapped field plus its values). At -search time, the script runs and generates values for each scripted field. 
+search time, the script runs and generates values for each scripted field that is required for the query. NOTE: You can define a runtime field in the mapping definition without a script. {es} will look in `_source` for a field with the same name as the @@ -109,7 +109,7 @@ PUT /my-index <1> Runtime fields are defined in the `"runtime"` section of the mapping definition. -<2> Each runtime has its own field type, just like any other field. +<2> Each runtime field has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. The `"runtime"` section supports `boolean`, `date`, `double`, `ip`, `keyword`, From 817591707969001ef5b9982efdedfdacb485ad02 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Mon, 7 Dec 2020 18:44:58 -0500 Subject: [PATCH 39/44] Incorporating reviewer feedback 7 Dec --- docs/reference/mapping.asciidoc | 83 ++++--------------- docs/reference/mapping/runtime.asciidoc | 31 ++++--- .../settings-mapping-explosion.asciidoc | 59 +++++++++++++ docs/reference/mapping/types/nested.asciidoc | 4 +- docs/reference/search/field-caps.asciidoc | 2 +- 5 files changed, 90 insertions(+), 89 deletions(-) create mode 100644 docs/reference/mapping/settings-mapping-explosion.asciidoc diff --git a/docs/reference/mapping.asciidoc b/docs/reference/mapping.asciidoc index 2e2717f8dc0fa..80bf4ab61880a 100644 --- a/docs/reference/mapping.asciidoc +++ b/docs/reference/mapping.asciidoc @@ -13,8 +13,7 @@ are stored and indexed. For instance, use mappings to define: * custom rules to control the mapping for <>. -A mapping definition includes metadata fields and fields, and can also include -runtime fields: +A mapping definition includes metadata fields and fields: <>:: @@ -31,72 +30,16 @@ document. Each field has its own <>. NOTE: Before 7.0.0, the 'mappings' definition used to include a type name. For more details, please see <>. -<>:: - -Runtime fields are not indexed, which saves disk space and makes data ingest -faster. 
You can add fields to existing documents without reindexing your data -and calculate field values dynamically at search time. - -[[mapping-limit-settings]] [discrete] -=== Settings to prevent mappings explosion - -Defining too many fields in an index can lead to a -mapping explosion, which can cause out of memory errors and difficult -situations to recover from. - -Consider a situation where every new document inserted -introduces new fields, such as with <>. -Each new field is added to the index mapping, which can become a -problem as the mapping grows. - -Use the following settings to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion: - -`index.mapping.total_fields.limit`:: - The maximum number of fields in an index. Field and object mappings, as well as - field aliases count towards this limit. The default value is `1000`. -+ -[IMPORTANT] -==== -The limit is in place to prevent mappings and searches from becoming too -large. Higher values can lead to performance degradations and memory issues, -especially in clusters with a high load or few resources. - -If you increase this setting, we recommend you also increase the -<> setting, which -limits the maximum number of <> in a query. -==== -+ -[TIP] -==== -If your field mappings contain a large, arbitrary set of keys, consider using the <> data type. -==== - -`index.mapping.depth.limit`:: - The maximum depth for a field, which is measured as the number of inner - objects. For instance, if all fields are defined at the root object level, - then the depth is `1`. If there is one object mapping, then the depth is - `2`, etc. Default is `20`. - -// tag::nested-fields-limit[] -`index.mapping.nested_fields.limit`:: - The maximum number of distinct `nested` mappings in an index. The `nested` type should only be used in special cases, when arrays of objects need to be queried independently of each other. 
To safeguard against poorly designed mappings, this setting - limits the number of unique `nested` types per index. Default is `50`. -// end::nested-fields-limit[] - -// tag::nested-objects-limit[] -`index.mapping.nested_objects.limit`:: - The maximum number of nested JSON objects that a single document can contain across all - `nested` types. This limit helps to prevent out of memory errors when a document contains too many nested - objects. Default is `10000`. -// end::nested-objects-limit[] - -`index.mapping.field_name_length.limit`:: - Setting for the maximum length of a field name. This setting isn't really something that addresses - mappings explosion but might still be useful if you want to limit the field length. - It usually shouldn't be necessary to set this setting. The default is okay - unless a user starts to add a huge number of fields with really long names. Default is - `Long.MAX_VALUE` (no limit). +[[runtime-fields]] +== Runtime fields +Typically, you index data into {es} to promote faster search. However, indexing +can be slow and requires more disk space, and you have to reindex your data to +add fields to existing documents. + +<> are not indexed, which saves disk space and makes +data ingest faster. You can add fields to existing documents without reindexing +your data and calculate field values dynamically at search time. 
[discrete] [[dynamic-mapping-intro]] @@ -263,12 +206,14 @@ The API returns the following response: include::mapping/removal_of_types.asciidoc[] +include::mapping/settings-mapping-explosion.asciidoc[] + include::mapping/types.asciidoc[] +include::mapping/runtime.asciidoc[] + include::mapping/fields.asciidoc[] include::mapping/params.asciidoc[] include::mapping/dynamic-mapping.asciidoc[] - -include::mapping/runtime.asciidoc[] diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 7b80cd4d9fad6..ab4b74ef078cc 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -67,10 +67,9 @@ IMPORTANT: Queries against runtime fields are considered expensive. If to `false`, expensive queries are not allowed and {es} will reject any queries against runtime fields. -[discrete] [[runtime-mapping-fields]] === Mapping a runtime field -You map runtime fields by adding a `"runtime"` section under the mapping +You map runtime fields by adding a `runtime` section under the mapping definition. Within that section, you <>, which has access to the original `_source` and `doc_values` (the mapped field plus its values). At @@ -81,8 +80,8 @@ script. {es} will look in `_source` for a field with the same name as the runtime field and use values from that field at query time. Runtime fields are similar to the <> parameter -of the `_search` request, but also make the script results available for -queries and aggregations. +of the `_search` request, but also make the script results available anywhere +in a search request. The script in the following request extracts the day of the week from the `@timestamp` field, which is defined as a `date` type: @@ -107,16 +106,17 @@ PUT /my-index } ---- -<1> Runtime fields are defined in the `"runtime"` section of the mapping +<1> Runtime fields are defined in the `runtime` section of the mapping definition. 
<2> Each runtime field has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. -The `"runtime"` section supports `boolean`, `date`, `double`, `ip`, `keyword`, -and `long` data types. Runtime fields with a `type` of `date` can accept the -<> parameter exactly as the `date` field type. +The `runtime` section supports `boolean`, `date`, `double`, `geo_point` `ip`, +`keyword`, and `long` data types. Runtime fields with a `type` of `date` can +accept the <> parameter exactly as the `date` +field type. -IMPORTANT: Objects are not supported in the `"runtime"` section under the +IMPORTANT: Objects are not supported in the `runtime` section under the mapping definition. If you want to map a runtime field under an object, you can <>. @@ -136,7 +136,6 @@ to `boolean`. **** -[discrete] [[runtime-search-request]] === Defining runtime fields in a search request You can specify a `runtime_mappings` section in a search request to create @@ -170,16 +169,15 @@ GET my-index/_search ---- // TEST[continued] -[discrete] [[runtime-shadowing-fields]] -==== Shadowing fields in the search request +=== Shadowing fields in the search request If you create a runtime field with the same name as a field that already exists in the mapping, the runtime field shadows the mapped field. At search time, {es} calculates the value of the runtime field and returns it as part of the query. Because the runtime field shadows the mapped field, you can modify the value returned in search without modifying the mapped field. -Objects are not supported in the `"runtime"` section under the +Objects are not supported in the `runtime` section under the mapping definition. However, you can use the `runtime_mappings` section of the `_search` request to shadow field values inside objects. 
For example, index the following documents into `my-index`: @@ -222,9 +220,9 @@ POST my-index/_search ---- // TEST[continued] -[discrete] + [[runtime-retrieving-fields]] -==== Retrieving a runtime field +=== Retrieving a runtime field Use the <> parameter on the `_search` API to retrieve the values of runtime fields. Runtime fields won't display in `_source`, but the `fields` API works for all fields, even those that were not sent as part of @@ -248,9 +246,8 @@ GET my-index/_search ---- // TEST[continued] -[discrete] [[runtime-examples]] -=== Examples +=== Runtime fields examples Consider a large set of log data that you want to extract fields from. Indexing the data is time consuming and uses a lot of disk space, and you just want to explore the data structure without committing to a schema up front. diff --git a/docs/reference/mapping/settings-mapping-explosion.asciidoc b/docs/reference/mapping/settings-mapping-explosion.asciidoc new file mode 100644 index 0000000000000..1306d8ff5ed28 --- /dev/null +++ b/docs/reference/mapping/settings-mapping-explosion.asciidoc @@ -0,0 +1,59 @@ +[[mapping-limit-settings]] +== Settings to prevent mappings explosion + +Defining too many fields in an index can lead to a +mapping explosion, which can cause out of memory errors and difficult +situations to recover from. + +Consider a situation where every new document inserted +introduces new fields, such as with <>. +Each new field is added to the index mapping, which can become a +problem as the mapping grows. + +Use the following settings to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion: + +`index.mapping.total_fields.limit`:: + The maximum number of fields in an index. Field and object mappings, as well as + field aliases count towards this limit. The default value is `1000`. ++ +[IMPORTANT] +==== +The limit is in place to prevent mappings and searches from becoming too +large. 
Higher values can lead to performance degradations and memory issues, +especially in clusters with a high load or few resources. + +If you increase this setting, we recommend you also increase the +<> setting, which +limits the maximum number of <> in a query. +==== ++ +[TIP] +==== +If your field mappings contain a large, arbitrary set of keys, consider using the <> data type. +==== + +`index.mapping.depth.limit`:: + The maximum depth for a field, which is measured as the number of inner + objects. For instance, if all fields are defined at the root object level, + then the depth is `1`. If there is one object mapping, then the depth is + `2`, etc. Default is `20`. + +// tag::nested-fields-limit[] +`index.mapping.nested_fields.limit`:: + The maximum number of distinct `nested` mappings in an index. The `nested` type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting + limits the number of unique `nested` types per index. Default is `50`. +// end::nested-fields-limit[] + +// tag::nested-objects-limit[] +`index.mapping.nested_objects.limit`:: + The maximum number of nested JSON objects that a single document can contain across all + `nested` types. This limit helps to prevent out of memory errors when a document contains too many nested + objects. Default is `10000`. +// end::nested-objects-limit[] + +`index.mapping.field_name_length.limit`:: + Setting for the maximum length of a field name. This setting isn't really something that addresses + mappings explosion but might still be useful if you want to limit the field length. + It usually shouldn't be necessary to set this setting. The default is okay + unless a user starts to add a huge number of fields with really long names. Default is + `Long.MAX_VALUE` (no limit). 
diff --git a/docs/reference/mapping/types/nested.asciidoc b/docs/reference/mapping/types/nested.asciidoc index ddfb32471d4ce..7ac71c0048802 100644 --- a/docs/reference/mapping/types/nested.asciidoc +++ b/docs/reference/mapping/types/nested.asciidoc @@ -220,11 +220,11 @@ then 101 Lucene documents would be created: one for the parent document, and one nested object. Because of the expense associated with `nested` mappings, Elasticsearch puts settings in place to guard against performance problems: -include::{es-repo-dir}/mapping.asciidoc[tag=nested-fields-limit] +include::{es-repo-dir}/mapping/settings-mapping-explosion.asciidoc[tag=nested-fields-limit] In the previous example, the `user` mapping would count as only 1 towards this limit. -include::{es-repo-dir}/mapping.asciidoc[tag=nested-objects-limit] +include::{es-repo-dir}/mapping/settings-mapping-explosion.asciidoc[tag=nested-objects-limit] To illustrate how this setting works, consider adding another `nested` type called `comments` to the previous example mapping. For each document, the combined number of `user` and `comment` diff --git a/docs/reference/search/field-caps.asciidoc b/docs/reference/search/field-caps.asciidoc index 6bbd238acd23e..f09f3503f3873 100644 --- a/docs/reference/search/field-caps.asciidoc +++ b/docs/reference/search/field-caps.asciidoc @@ -36,7 +36,7 @@ fields among multiple indices. The field capabilities API returns <> like any other field. For example, a runtime field with a type of -`keyword` returns as any other field that belongs to the `keyword` family. +`keyword` is returned as any other field that belongs to the `keyword` family. [[search-field-caps-api-path-params]] From b732ae2997f8abd4b812960b440f3e095eb120d7 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 8 Dec 2020 10:17:47 -0500 Subject: [PATCH 40/44] Shifting structure of mapping page to fix cross links. 
--- docs/reference/mapping.asciidoc | 17 ++++++++++++++++- ...asciidoc => mapping-settings-limit.asciidoc} | 14 ++------------ docs/reference/mapping/types/nested.asciidoc | 4 ++-- 3 files changed, 20 insertions(+), 15 deletions(-) rename docs/reference/mapping/{settings-mapping-explosion.asciidoc => mapping-settings-limit.asciidoc} (84%) diff --git a/docs/reference/mapping.asciidoc b/docs/reference/mapping.asciidoc index 80bf4ab61880a..8c5deb0ad5877 100644 --- a/docs/reference/mapping.asciidoc +++ b/docs/reference/mapping.asciidoc @@ -30,6 +30,21 @@ document. Each field has its own <>. NOTE: Before 7.0.0, the 'mappings' definition used to include a type name. For more details, please see <>. +[discrete] +[[mapping-limit-settings]] +== Settings to prevent mapping explosion +Defining too many fields in an index can lead to a mapping explosion, which can +cause out of memory errors and difficult situations to recover from. + +Consider a situation where every new document inserted +introduces new fields, such as with <>. +Each new field is added to the index mapping, which can become a +problem as the mapping grows. + +Use the <> to limit the number +of field mappings (created manually or dynamically) and prevent documents from +causing a mapping explosion. 
+ [discrete] [[runtime-fields]] == Runtime fields @@ -206,7 +221,7 @@ The API returns the following response: include::mapping/removal_of_types.asciidoc[] -include::mapping/settings-mapping-explosion.asciidoc[] +include::mapping/mapping-settings-limit.asciidoc[] include::mapping/types.asciidoc[] diff --git a/docs/reference/mapping/settings-mapping-explosion.asciidoc b/docs/reference/mapping/mapping-settings-limit.asciidoc similarity index 84% rename from docs/reference/mapping/settings-mapping-explosion.asciidoc rename to docs/reference/mapping/mapping-settings-limit.asciidoc index 1306d8ff5ed28..9099b3029a7f2 100644 --- a/docs/reference/mapping/settings-mapping-explosion.asciidoc +++ b/docs/reference/mapping/mapping-settings-limit.asciidoc @@ -1,15 +1,5 @@ -[[mapping-limit-settings]] -== Settings to prevent mappings explosion - -Defining too many fields in an index can lead to a -mapping explosion, which can cause out of memory errors and difficult -situations to recover from. - -Consider a situation where every new document inserted -introduces new fields, such as with <>. -Each new field is added to the index mapping, which can become a -problem as the mapping grows. - +[[mapping-settings-limit]] +== Mapping limit settings Use the following settings to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion: `index.mapping.total_fields.limit`:: diff --git a/docs/reference/mapping/types/nested.asciidoc b/docs/reference/mapping/types/nested.asciidoc index 7ac71c0048802..0aec51d2b6d31 100644 --- a/docs/reference/mapping/types/nested.asciidoc +++ b/docs/reference/mapping/types/nested.asciidoc @@ -220,11 +220,11 @@ then 101 Lucene documents would be created: one for the parent document, and one nested object. 
Because of the expense associated with `nested` mappings, Elasticsearch puts settings in place to guard against performance problems: -include::{es-repo-dir}/mapping/settings-mapping-explosion.asciidoc[tag=nested-fields-limit] +include::{es-repo-dir}/mapping/mapping-settings-limit.asciidoc[tag=nested-fields-limit] In the previous example, the `user` mapping would count as only 1 towards this limit. -include::{es-repo-dir}/mapping/settings-mapping-explosion.asciidoc[tag=nested-objects-limit] +include::{es-repo-dir}/mapping/mapping-settings-limit.asciidoc[tag=nested-objects-limit] To illustrate how this setting works, consider adding another `nested` type called `comments` to the previous example mapping. For each document, the combined number of `user` and `comment` From 8870aabdbe3f3bc1b519bf042c473b81a37d0dc0 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Tue, 8 Dec 2020 18:17:12 -0500 Subject: [PATCH 41/44] Revisions for shadowing, overview, and other sections. --- docs/reference/mapping/runtime.asciidoc | 152 +++++++++++++++++++----- 1 file changed, 120 insertions(+), 32 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index ab4b74ef078cc..b14eac6a51829 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -5,24 +5,29 @@ can be slow and requires more disk space, and you have to reindex your data to add fields to existing documents. With _runtime fields_, you can add fields to documents already indexed to {es} without reindexing your data. +You access runtime fields from the search API like any other field, and {es} +sees runtime fields no differently. + [discrete] [[runtime-benefits]] === Benefits -Because runtime fields aren't indexed, you can more quickly ingest raw data -into the Elastic Stack and access it immediately. By dynamically evaluating -runtime fields at search time, you can optimize disk space by choosing which -fields to index. 
If you later decide that a runtime field is increasingly used -for filtering or aggregation, you can add the field to `_source` to gain faster -search speed. - -Instead of reindexing your data to add fields, add runtime fields to the -mapping definition. You can access runtime fields from the search API like any -other field, and {es} sees runtime fields no differently. - -Alternatively, you can specify a `runtime_mappings` section in a search request -to use runtime fields that exist only as part of the query. This -flexibility allows you to create or shadow field values in `_source` for the -duration of the query without modifying the field itself. +Because runtime fields aren't indexed, adding a runtime field doesn't increase +the index size. You define runtime fields directly in the index mapping, saving +storage costs and increasing ingestion speed. You can more quickly ingest +data into the Elastic Stack and access it right away. + +When you define a runtime field, you can immediately use it in search requests, +aggregations, filtering, and sorting. If you later decide that the runtime +field is increasingly used for filtering or aggregations, you can modify the +index template to make the runtime field an indexed field. You'll achieve better +query performance, and the field will be indexed for any new document matching +an index generated by the index template. + +If you make a runtime field an indexed field, you don't need to modify any +queries that refer to the runtime field. Better yet, you can refer to some +indices where the field is a runtime field, and other indices where the field +is an indexed field. You have the flexibility to choose which fields to index +and which ones to keep as runtime fields. [discrete] [[runtime-use-cases]] @@ -61,6 +66,8 @@ queries for other fields. Use the <> to run searches that include runtime fields. 
This method of search helps to offset the performance impacts of computing values for runtime fields in each document containing that field. +If the query can't return the result set synchronously, you'll get results +asynchronously as they become available. IMPORTANT: Queries against runtime fields are considered expensive. If <> is set @@ -72,8 +79,9 @@ against runtime fields. You map runtime fields by adding a `runtime` section under the mapping definition. Within that section, you <>, which has access to the -original `_source` and `doc_values` (the mapped field plus its values). At -search time, the script runs and generates values for each scripted field that is required for the query. +entire context of a document, including the original `_source` and any mapped +fields plus their values. At search time, the script runs and generates values +for each scripted field that is required for the query. NOTE: You can define a runtime field in the mapping definition without a script. {es} will look in `_source` for a field with the same name as the @@ -116,10 +124,72 @@ The `runtime` section supports `boolean`, `date`, `double`, `geo_point` `ip`, accept the <> parameter exactly as the `date` field type. -IMPORTANT: Objects are not supported in the `runtime` section under the -mapping definition. If you want to map a runtime field under an object, you -can -<>. +You can define a runtime field that isn't at the top level of a document. If +you want to map a runtime field under an object, you can use dot notation +instead of recreating the entire object structure. 
+ +For example, let's say you add the following document: + +[source, console] +---- +POST my-index/_doc/1 +{ + "company": { + "name": "Elastic" + } +} +---- + +If you retrieve the mapping for that document, you'll see that the `name` field +is nested under the `company` object: + +[source, console] +---- +GET my-index/_mapping +---- +//TEST[continued] + +[source, console-result] +---- +{ + "my-index" : { + "mappings" : { + "properties" : { + "company" : { + "properties" : { + "name" : { + "type" : "text", + "fields" : { + "keyword" : { + "type" : "keyword", + "ignore_above" : 256 + } + } + } + } + } + } + } + } +} +---- +//TEST[continued] + +You could create a runtime field in the mapping using `company.name` instead of +recreating the object structure for that field: + +[source,console] +---- +PUT my-index/_mapping +{ + "runtime": { + "company.name": { + "type": "keyword" + } + } +} +---- +//TEST[continued] [[runtime-updating-scripts]] .Updating runtime scripts @@ -143,6 +213,13 @@ runtime fields that exist only as part of the query. You specify a script as part of the `runtime_mappings` section, just as you would if adding a runtime field to the mappings. +Fields defined in the search request take precedence over fields defined with +the same name in the index mappings. This flexibility allows you to shadow +existing fields and calculate a different value in the search request, without +modifying the field itself. If you made a mistake in your index mapping, you +can use runtime fields to calculate values that override values in the mapping +during the search request. + In the following request, the values for the `day_of_week` field are calculated dynamically, and only within the context of this search request: @@ -169,18 +246,27 @@ GET my-index/_search ---- // TEST[continued] +Defining a runtime field in a search request uses the same format as defining +a runtime field in the index mapping. 
That consistency means you can promote a +runtime field from a search request to the index mapping by moving the field +definition from `runtime_mappings` in the search request to the `runtime` +section of the index mapping. + [[runtime-shadowing-fields]] -=== Shadowing fields in the search request +=== Shadowing fields If you create a runtime field with the same name as a field that already exists in the mapping, the runtime field shadows the mapped field. At -search time, {es} calculates the value of the runtime field and returns it as -part of the query. Because the runtime field shadows the mapped field, you can -modify the value returned in search without modifying the mapped field. +search time, {es} evaluates the runtime field, calculates a value based on the +script, and returns the value as part of the query. Because the runtime field +shadows the mapped field, you can modify the value returned in search without +modifying the mapped field. + +If you define a runtime field that does not include a script, {es} evaluates the +field at search time, looks at each document containing that field, retrieves +the `_source`, and returns a value if one exists. -Objects are not supported in the `runtime` section under the -mapping definition. However, you can use the `runtime_mappings` section of the -`_search` request to shadow field values inside objects. For -example, index the following documents into `my-index`: +As mentioned in <>, you can +shadow field values inside objects. For example, index the following documents into `my-index`: [source, console] ---- @@ -193,11 +279,13 @@ POST my-index/_bulk?refresh=true In the `runtime_mappings` section of the `_search` request, you can define a script that operates on the `name.first` field. The values you specify will -shadow field values in `_source` for the duration of the query without +shadow field values in the index mapping for the duration of the query without modifying the field itself. 
-The following request evaluates the `name.last` field and shadows the value for -the `name.first` field in the search request based on the script valuation: +The following request defines a runtime field that retrieves values based on +the script valuation. The field defined in the search request shadows the +indexed `name.first` field and substitutes a value for that field based on the +logic defined in the script: [source,console] ---- From 1475e9de11530dc2a7531ba86be7105919f43625 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Wed, 9 Dec 2020 12:15:57 -0500 Subject: [PATCH 42/44] Removing dot notation section and incorporating review changes. --- docs/reference/mapping.asciidoc | 4 +- docs/reference/mapping/runtime.asciidoc | 91 +++---------------------- 2 files changed, 11 insertions(+), 84 deletions(-) diff --git a/docs/reference/mapping.asciidoc b/docs/reference/mapping.asciidoc index 8c5deb0ad5877..ebea396cd279e 100644 --- a/docs/reference/mapping.asciidoc +++ b/docs/reference/mapping.asciidoc @@ -53,8 +53,8 @@ can be slow and requires more disk space, and you have to reindex your data to add fields to existing documents. <> are not indexed, which saves disk space and makes -data ingest faster. You can add fields to existing documents without reindexing -your data and calculate field values dynamically at search time. +data ingest faster. You can add runtime fields to existing documents without +reindexing your data and calculate field values dynamically at search time. [discrete] [[dynamic-mapping-intro]] diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index b14eac6a51829..14e8ec3009af5 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -18,10 +18,8 @@ data into the Elastic Stack and access it right away. When you define a runtime field, you can immediately use it in search requests, aggregations, filtering, and sorting. 
If you later decide that the runtime -field is increasingly used for filtering or aggregations, you can modify the -index template to make the runtime field an indexed field. You'll achieve better -query performance, and the field will be indexed for any new document matching -an index generated by the index template. +field is increasingly used for filtering or aggregations, you can make the +runtime field an indexed field to achieve better query performance. If you make a runtime field an indexed field, you don't need to modify any queries that refer to the runtime field. Better yet, you can refer to some @@ -77,15 +75,16 @@ against runtime fields. [[runtime-mapping-fields]] === Mapping a runtime field You map runtime fields by adding a `runtime` section under the mapping -definition. Within that section, you -<>, which has access to the +definition and defining +<>. This script has access to the entire context of a document, including the original `_source` and any mapped fields plus their values. At search time, the script runs and generates values for each scripted field that is required for the query. NOTE: You can define a runtime field in the mapping definition without a -script. {es} will look in `_source` for a field with the same name as the -runtime field and use values from that field at query time. +script. If you define a runtime field without a script, {es} evaluates the +field at search time, looks at each document containing that field, retrieves +the `_source`, and returns a value if one exists. Runtime fields are similar to the <> parameter of the `_search` request, but also make the script results available anywhere @@ -119,78 +118,11 @@ definition. <2> Each runtime field has its own field type, just like any other field. <3> The script defines the evaluation to calculate at search time. 
-The `runtime` section supports `boolean`, `date`, `double`, `geo_point` `ip`, +The `runtime` section supports `boolean`, `date`, `double`, `geo_point`, `ip`, `keyword`, and `long` data types. Runtime fields with a `type` of `date` can accept the <> parameter exactly as the `date` field type. -You can define a runtime field that isn't at the top level of a document. If -you want to map a runtime field under an object, you can use dot notation -instead of recreating the entire object structure. - -For example, let's say you add the following document: - -[source, console] ----- -POST my-index/_doc/1 -{ - "company": { - "name": "Elastic" - } -} ----- - -If you retrieve the mapping for that document, you'll see that the `name` field -is nested under the `company` object: - -[source, console] ----- -GET my-index/_mapping ----- -//TEST[continued] - -[source, console-result] ----- -{ - "my-index" : { - "mappings" : { - "properties" : { - "company" : { - "properties" : { - "name" : { - "type" : "text", - "fields" : { - "keyword" : { - "type" : "keyword", - "ignore_above" : 256 - } - } - } - } - } - } - } - } -} ----- -//TEST[continued] - -You could create a runtime field in the mapping using `company.name` instead of -recreating the object structure for that field: - -[source,console] ----- -PUT my-index/_mapping -{ - "runtime": { - "company.name": { - "type": "keyword" - } - } -} ----- -//TEST[continued] - [[runtime-updating-scripts]] .Updating runtime scripts **** @@ -261,12 +193,7 @@ script, and returns the value as part of the query. Because the runtime field shadows the mapped field, you can modify the value returned in search without modifying the mapped field. -If you define a runtime field that does not include a script, {es} evaluates the -field at search time, looks at each document containing that field, retrieves -the `_source`, and returns a value if one exists. - -As mentioned in <>, you can -shadow field values inside objects. 
For example, index the following documents into `my-index`: +For example, index the following documents into `my-index`: [source, console] ---- From 06241b7923628b7c4fc44728e86278afd770a920 Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Wed, 9 Dec 2020 17:00:46 -0500 Subject: [PATCH 43/44] Adding updated example for shadowing. --- docs/reference/mapping/runtime.asciidoc | 161 ++++++++++++++++++++---- 1 file changed, 137 insertions(+), 24 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 14e8ec3009af5..6e24b6d24dcea 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -14,12 +14,9 @@ sees runtime fields no differently. Because runtime fields aren't indexed, adding a runtime field doesn't increase the index size. You define runtime fields directly in the index mapping, saving storage costs and increasing ingestion speed. You can more quickly ingest -data into the Elastic Stack and access it right away. - -When you define a runtime field, you can immediately use it in search requests, -aggregations, filtering, and sorting. If you later decide that the runtime -field is increasingly used for filtering or aggregations, you can make the -runtime field an indexed field to achieve better query performance. +data into the Elastic Stack and access it right away. When you define a runtime +field, you can immediately use it in search requests, aggregations, filtering, +and sorting. If you make a runtime field an indexed field, you don't need to modify any queries that refer to the runtime field. Better yet, you can refer to some @@ -193,48 +190,164 @@ script, and returns the value as part of the query. Because the runtime field shadows the mapped field, you can modify the value returned in search without modifying the mapped field. 
-For example, index the following documents into `my-index`: +For example, let's say you indexed the following documents into `my-index`: -[source, console] +[source,console] ---- POST my-index/_bulk?refresh=true {"index":{}} -{"name":{"first":"Jose","last":"Hickman"}} +{"timestamp":1516729294000,"model_number":"QVKC92Q","measures":{"voltage":5.2}} +{"index":{}} +{"timestamp":1516642894000,"model_number":"PW83H7X","measures":{"voltage":5.8}} +{"index":{}} +{"timestamp":1516556494000,"model_number":"ELF7YV2","measures":{"voltage":5.1}} +{"index":{}} +{"timestamp":1516470094000,"model_number":"5NMDTMF","measures":{"voltage":5.6}} {"index":{}} -{"name":{"first":"India","last":"Avila","suffix":"II"}} +{"timestamp":1516383694000,"model_number":"1N0TH44","measures":{"voltage":4.2}} +{"index":{}} +{"timestamp":1516297294000,"model_number":"HG537PU","measures":{"voltage":4.0}} ---- -In the `runtime_mappings` section of the `_search` request, you can define a -script that operates on the `name.first` field. The values you specify will -shadow field values in the index mapping for the duration of the query without -modifying the field itself. +You later realize that the voltage for the sensor matching model number +`HG537PU` is incorrect. The indexed value is `4.0`, but is supposed to be 1.7 +times higher. Instead of reindexing your data, you can define a script in the +`runtime_mappings` section of the `_search` request to shadow the `voltage` +field and calculate a new value at search time. -The following request defines a runtime field that retrieves values based on -the script valuation. 
The field defined in the search request shadows the -indexed `name.first` field and substitutes a value for that field based on the -logic defined in the script: +If you search for documents where the model number matches `HG537PU`: + +[source,console] +---- +GET my-index/_search +{ + "query": { + "match": { + "model_number": "HG537PU" + } + } +} +---- +//TEST[continued] + +The response shows that the voltage is indeed `4.0`: + +[source,console-result] +---- +{ + "took" : 5, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 1, + "relation" : "eq" + }, + "max_score" : 1.540445, + "hits" : [ + { + "_index" : "my-index", + "_id" : "F1BeSXYBg_szTodcYCmk", + "_score" : 1.540445, + "_source" : { + "timestamp" : 1516297294000, + "model_number" : "HG537PU", + "measures" : { + "voltage" : 4.0 + } + } + } + ] + } +} +---- +// TESTRESPONSE[s/"took" : 5/"took": $body.took/] +// TESTRESPONSE[s/"_id" : "F1BeSXYBg_szTodcYCmk"/"_id": $body.hits.hits.0._id/] + +The following request defines a runtime field where the script evaluates the +`model_number` field where the value is `HG537PU`. For each match, the script +multiplies the value for the `voltage` field by `1.7`. The original voltage was +`4.0`, so the resulting voltage should be `6.8` (`4.0*1.7`). 
+ +Using the <> parameter on the `_search` API, you can +retrieve the value that the script calculates for the `measures.voltage` field +for documents matching the search request: [source,console] ---- POST my-index/_search { "runtime_mappings": { - "name.first": { - "type": "keyword", + "measures.voltage": { + "type": "double", "script": { - "source": "if (\"Hickman\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Guile\");} else if (\"Avila\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Anastasia\");}" + "source": + """if (doc['model_number.keyword'].value.equals('HG537PU')) + {emit(1.7 * params._source['measures']['voltage']);} + else{emit(params._source['measures']['voltage']);}""" } } }, "query": { "match": { - "name.first": "Anastasia" + "model_number": "HG537PU" } - } + }, + "fields": ["measures.voltage"] } ---- -// TEST[continued] +//TEST[continued] +Looking at the response, the value for `measures.voltage` is `6.8`. The runtime +field calculated this value as part of the search request without modifying +the mapped value, which still returns in the response as `4.0`: + +[source,console-result] +---- +{ + "took" : 41, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 1, + "relation" : "eq" + }, + "max_score" : 1.540445, + "hits" : [ + { + "_index" : "my-index", + "_id" : "F1BeSXYBg_szTodcYCmk", + "_score" : 1.540445, + "_source" : { + "timestamp" : 1516297294000, + "model_number" : "HG537PU", + "measures" : { + "voltage" : 4.0 + } + }, + "fields" : { + "measures.voltage" : [ + 6.8 + ] + } + } + ] + } +} +---- +// TESTRESPONSE[s/"took" : 41/"took": $body.took/] +// TESTRESPONSE[s/"_id" : "F1BeSXYBg_szTodcYCmk"/"_id": $body.hits.hits.0._id/] [[runtime-retrieving-fields]] === Retrieving a runtime field From c1b61ee1d709fa0b64699e88863bf4858651276d Mon Sep 17 00:00:00 2001 From: Adam Locke Date: Wed, 9 Dec 2020 17:39:23 -0500 Subject: [PATCH 44/44] 
Streamlining shadowing example and TESTRESPONSEs. --- docs/reference/mapping/runtime.asciidoc | 94 +++++++++++++++---------- 1 file changed, 56 insertions(+), 38 deletions(-) diff --git a/docs/reference/mapping/runtime.asciidoc b/docs/reference/mapping/runtime.asciidoc index 6e24b6d24dcea..f53ebb9e3edc4 100644 --- a/docs/reference/mapping/runtime.asciidoc +++ b/docs/reference/mapping/runtime.asciidoc @@ -198,21 +198,21 @@ POST my-index/_bulk?refresh=true {"index":{}} {"timestamp":1516729294000,"model_number":"QVKC92Q","measures":{"voltage":5.2}} {"index":{}} -{"timestamp":1516642894000,"model_number":"PW83H7X","measures":{"voltage":5.8}} +{"timestamp":1516642894000,"model_number":"QVKC92Q","measures":{"voltage":5.8}} {"index":{}} -{"timestamp":1516556494000,"model_number":"ELF7YV2","measures":{"voltage":5.1}} +{"timestamp":1516556494000,"model_number":"QVKC92Q","measures":{"voltage":5.1}} {"index":{}} -{"timestamp":1516470094000,"model_number":"5NMDTMF","measures":{"voltage":5.6}} +{"timestamp":1516470094000,"model_number":"QVKC92Q","measures":{"voltage":5.6}} {"index":{}} -{"timestamp":1516383694000,"model_number":"1N0TH44","measures":{"voltage":4.2}} +{"timestamp":1516383694000,"model_number":"HG537PU","measures":{"voltage":4.2}} {"index":{}} {"timestamp":1516297294000,"model_number":"HG537PU","measures":{"voltage":4.0}} ---- -You later realize that the voltage for the sensor matching model number -`HG537PU` is incorrect. The indexed value is `4.0`, but is supposed to be 1.7 -times higher. Instead of reindexing your data, you can define a script in the -`runtime_mappings` section of the `_search` request to shadow the `voltage` +You later realize that the `HG537PU` sensors aren't reporting their true +voltage. The indexed values are supposed to be 1.7 times higher than +the reported values! 
Instead of reindexing your data, you can define a script in +the `runtime_mappings` section of the `_search` request to shadow the `voltage` field and calculate a new value at search time. If you search for documents where the model number matches `HG537PU`: @@ -230,30 +230,36 @@ GET my-index/_search ---- //TEST[continued] -The response shows that the voltage is indeed `4.0`: +The response includes indexed values for documents matching model number +`HG537PU`: [source,console-result] ---- { - "took" : 5, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, + ... "hits" : { "total" : { - "value" : 1, + "value" : 2, "relation" : "eq" }, - "max_score" : 1.540445, + "max_score" : 1.0296195, "hits" : [ { "_index" : "my-index", "_id" : "F1BeSXYBg_szTodcYCmk", - "_score" : 1.540445, + "_score" : 1.0296195, + "_source" : { + "timestamp" : 1516383694000, + "model_number" : "HG537PU", + "measures" : { + "voltage" : 4.2 + } + } + }, + { + "_index" : "my-index", + "_id" : "l02aSXYBkpNf6QRDO62Q", + "_score" : 1.0296195, "_source" : { "timestamp" : 1516297294000, "model_number" : "HG537PU", @@ -266,13 +272,13 @@ The response shows that the voltage is indeed `4.0`: } } ---- -// TESTRESPONSE[s/"took" : 5/"took": $body.took/] +// TESTRESPONSE[s/\.\.\./"took" : $body.took,"timed_out" : $body.timed_out,"_shards" : $body._shards,/] // TESTRESPONSE[s/"_id" : "F1BeSXYBg_szTodcYCmk"/"_id": $body.hits.hits.0._id/] +// TESTRESPONSE[s/"_id" : "l02aSXYBkpNf6QRDO62Q"/"_id": $body.hits.hits.1._id/] The following request defines a runtime field where the script evaluates the `model_number` field where the value is `HG537PU`. For each match, the script -multiplies the value for the `voltage` field by `1.7`. The original voltage was -`4.0`, so the resulting voltage should be `6.8` (`4.0*1.7`). +multiplies the value for the `voltage` field by `1.7`. 
Using the <> parameter on the `_search` API, you can retrieve the value that the script calculates for the `measures.voltage` field @@ -303,32 +309,43 @@ POST my-index/_search ---- //TEST[continued] -Looking at the response, the value for `measures.voltage` is `6.8`. The runtime -field calculated this value as part of the search request without modifying -the mapped value, which still returns in the response as `4.0`: +Looking at the response, the calculated values for `measures.voltage` on each +result are `7.14` and `6.8`. That's more like it! The runtime field calculated +this value as part of the search request without modifying the mapped value, +which still returns in the response: [source,console-result] ---- { - "took" : 41, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, + ... "hits" : { "total" : { - "value" : 1, + "value" : 2, "relation" : "eq" }, - "max_score" : 1.540445, + "max_score" : 1.0296195, "hits" : [ { "_index" : "my-index", "_id" : "F1BeSXYBg_szTodcYCmk", - "_score" : 1.540445, + "_score" : 1.0296195, + "_source" : { + "timestamp" : 1516383694000, + "model_number" : "HG537PU", + "measures" : { + "voltage" : 4.2 + } + }, + "fields" : { + "measures.voltage" : [ + 7.14 + ] + } + }, + { + "_index" : "my-index", + "_id" : "l02aSXYBkpNf6QRDO62Q", + "_score" : 1.0296195, "_source" : { "timestamp" : 1516297294000, "model_number" : "HG537PU", @@ -346,8 +363,9 @@ the mapped value, which still returns in the response as `4.0`: } } ---- -// TESTRESPONSE[s/"took" : 41/"took": $body.took/] +// TESTRESPONSE[s/\.\.\./"took" : $body.took,"timed_out" : $body.timed_out,"_shards" : $body._shards,/] // TESTRESPONSE[s/"_id" : "F1BeSXYBg_szTodcYCmk"/"_id": $body.hits.hits.0._id/] +// TESTRESPONSE[s/"_id" : "l02aSXYBkpNf6QRDO62Q"/"_id": $body.hits.hits.1._id/] [[runtime-retrieving-fields]] === Retrieving a runtime field