Add calculated numeric fields #69531
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -117,6 +117,7 @@ The following parameters are accepted by numeric types: | |
|
|
||
| Try to convert strings to numbers and truncate fractions for integers. | ||
| Accepts `true` (default) and `false`. Not applicable for `unsigned_long`. | ||
| Note that this cannot be set if the `script` parameter is used. | ||
|
|
||
| <<doc-values,`doc_values`>>:: | ||
|
|
||
|
|
@@ -127,7 +128,8 @@ The following parameters are accepted by numeric types: | |
| <<ignore-malformed,`ignore_malformed`>>:: | ||
|
|
||
| If `true`, malformed numbers are ignored. If `false` (default), malformed | ||
| - numbers throw an exception and reject the whole document. | ||
| + numbers throw an exception and reject the whole document. Note that this | ||
| + cannot be set if the `script` parameter is used. | ||
|
|
||
| <<mapping-index,`index`>>:: | ||
|
|
||
|
|
@@ -137,7 +139,26 @@ The following parameters are accepted by numeric types: | |
|
|
||
| Accepts a numeric value of the same `type` as the field which is | ||
| substituted for any explicit `null` values. Defaults to `null`, which | ||
| - means the field is treated as missing. | ||
| + means the field is treated as missing. Note that this cannot be set | ||
| + if the `script` parameter is used. | ||
|
|
||
| `on_script_error`:: | ||
|
|
||
| Defines what to do if the script defined by the `script` parameter | ||
| throws an error at indexing time. Accepts `reject` (default), which | ||
| will cause the entire document to be rejected, and `ignore`, which | ||
| will register the field in the document's | ||
| <<mapping-ignored-field,`_ignored`>> metadata field and continue | ||
| indexing. This parameter can only be set if the `script` field is | ||
| also set. | ||
|
|
||
| `script`:: | ||
|
|
||
| If this parameter is set, then the field will index values generated | ||
| by this script, rather than reading the values directly from the | ||
|
Contributor
Do we have tests for this case (I couldn't find them from a quick look, but I might have just missed them), as well as for the case where the script computes values based on the content of the document under the same field name?
Member
We reject documents that have a value for the field if the field declares a script. We should probably clarify that in the docs.
Contributor
Author
+1 to explicitly documenting this; I will open a follow-up. |
||
| source. Scripts are in the same format as their | ||
| <<runtime-mapping-fields,runtime equivalent>>. Scripts can only be | ||
| configured on `long` and `double` field types. | ||
|
|
||
| <<mapping-store,`store`>>:: | ||
|
|
||
|
|
||
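
The parameter rules in the documentation diff above can be summarised as a small check. The sketch below is only an illustration of those stated rules, not Elasticsearch code: `check_numeric_field` is a hypothetical helper, and the mapping body simply reuses the `voltage_times_ten` field from this PR's tests with `on_script_error` added.

```python
import json

# Parameters the docs say cannot be set when `script` is used.
INCOMPATIBLE_WITH_SCRIPT = ("coerce", "ignore_malformed", "null_value")
# Field types the docs say can carry a script.
SCRIPTABLE_TYPES = ("long", "double")


def check_numeric_field(name, field):
    """Return rule violations for one numeric field mapping (hypothetical helper)."""
    problems = []
    has_script = "script" in field
    if has_script and field.get("type") not in SCRIPTABLE_TYPES:
        problems.append(f"{name}: script requires type long or double")
    if has_script:
        for param in INCOMPATIBLE_WITH_SCRIPT:
            if param in field:
                problems.append(f"{name}: {param} cannot be set together with script")
    if "on_script_error" in field:
        if not has_script:
            problems.append(f"{name}: on_script_error requires script")
        if field["on_script_error"] not in ("reject", "ignore"):
            problems.append(f"{name}: on_script_error must be reject or ignore")
    return problems


# Mapping body modelled on the `sensor` index used in the tests.
mapping = {
    "mappings": {
        "properties": {
            "voltage": {"type": "double"},
            "voltage_times_ten": {
                "type": "long",
                "on_script_error": "ignore",
                "script": {
                    "source": "for (double v : doc['voltage']) { emit((long)(v * params.multiplier)); }",
                    "params": {"multiplier": 10},
                },
            },
        }
    }
}

for field_name, field_def in mapping["mappings"]["properties"].items():
    assert check_numeric_field(field_name, field_def) == []

print(json.dumps(mapping, indent=2))
```

The printed JSON is the kind of body one would send when creating the index; the server performs its own validation, so the helper is useful only as a reading aid for the rules.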
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,152 @@ | ||
| --- | ||
| setup: | ||
| - do: | ||
| indices.create: | ||
| index: sensor | ||
| body: | ||
| settings: | ||
| number_of_shards: 1 | ||
| number_of_replicas: 0 | ||
| mappings: | ||
| properties: | ||
| timestamp: | ||
| type: date | ||
| temperature: | ||
| type: long | ||
| voltage: | ||
| type: double | ||
| node: | ||
| type: keyword | ||
| voltage_times_ten: | ||
| type: long | ||
| script: | ||
| source: | | ||
| for (double v : doc['voltage']) { | ||
| emit((long)(v * params.multiplier)); | ||
| } | ||
| params: | ||
| multiplier: 10 | ||
| voltage_times_ten_no_dv: | ||
| type: long | ||
| doc_values: false | ||
| script: | ||
| source: | | ||
| for (double v : doc['voltage']) { | ||
| emit((long)(v * params.multiplier)); | ||
| } | ||
| params: | ||
| multiplier: 10 | ||
| # test multiple values | ||
| temperature_digits: | ||
| type: long | ||
| script: | ||
| source: | | ||
| for (long temperature : doc['temperature']) { | ||
| long t = temperature; | ||
| while (t != 0) { | ||
| emit(t % 10); | ||
| t /= 10; | ||
| } | ||
| } | ||
|
|
||
| - do: | ||
| bulk: | ||
| index: sensor | ||
| refresh: true | ||
| body: | | ||
| {"index":{}} | ||
| {"timestamp": 1516729294000, "temperature": 200, "voltage": 5.2, "node": "a"} | ||
| {"index":{}} | ||
| {"timestamp": 1516642894000, "temperature": 201, "voltage": 5.8, "node": "b"} | ||
| {"index":{}} | ||
| {"timestamp": 1516556494000, "temperature": 202, "voltage": 5.1, "node": "a"} | ||
| {"index":{}} | ||
| {"timestamp": 1516470094000, "temperature": 198, "voltage": 5.6, "node": "b"} | ||
| {"index":{}} | ||
| {"timestamp": 1516383694000, "temperature": 200, "voltage": 4.2, "node": "c"} | ||
| {"index":{}} | ||
| {"timestamp": 1516297294000, "temperature": 202, "voltage": 4.0, "node": "c"} | ||
|
|
||
| --- | ||
| "get mapping": | ||
| - do: | ||
| indices.get_mapping: | ||
| index: sensor | ||
| - match: {sensor.mappings.properties.voltage_times_ten.type: long } | ||
| - match: | ||
| sensor.mappings.properties.voltage_times_ten.script.source: | | ||
| for (double v : doc['voltage']) { | ||
| emit((long)(v * params.multiplier)); | ||
| } | ||
| - match: {sensor.mappings.properties.voltage_times_ten.script.params: {multiplier: 10} } | ||
| - match: {sensor.mappings.properties.voltage_times_ten.script.lang: painless } | ||
|
|
||
| --- | ||
| "fetch fields": | ||
| - do: | ||
| search: | ||
| index: sensor | ||
| body: | ||
| sort: timestamp | ||
| fields: | ||
| - voltage_times_ten | ||
| - voltage_times_ten_no_dv | ||
| - temperature_digits | ||
| - match: {hits.total.value: 6} | ||
| - match: {hits.hits.0.fields.voltage_times_ten: [40] } | ||
| - match: {hits.hits.0.fields.temperature_digits: [2, 0, 2] } | ||
| - match: {hits.hits.0.fields.voltage_times_ten: [40] } | ||
| - match: {hits.hits.0.fields.voltage_times_ten_no_dv: [40] } | ||
| - match: {hits.hits.1.fields.voltage_times_ten: [42] } | ||
| - match: {hits.hits.2.fields.voltage_times_ten: [56] } | ||
| - match: {hits.hits.3.fields.voltage_times_ten: [51] } | ||
| - match: {hits.hits.4.fields.voltage_times_ten: [58] } | ||
| - match: {hits.hits.5.fields.voltage_times_ten: [52] } | ||
|
|
||
| --- | ||
| "docvalue_fields": | ||
| - do: | ||
| search: | ||
| index: sensor | ||
| body: | ||
| sort: timestamp | ||
| docvalue_fields: | ||
| - voltage_times_ten | ||
| - temperature_digits | ||
| - match: {hits.total.value: 6} | ||
| - match: {hits.hits.0.fields.voltage_times_ten: [40] } | ||
| - match: {hits.hits.0.fields.temperature_digits: [0, 2, 2] } | ||
| - match: {hits.hits.0.fields.voltage_times_ten: [40] } | ||
| - match: {hits.hits.1.fields.voltage_times_ten: [42] } | ||
| - match: {hits.hits.2.fields.voltage_times_ten: [56] } | ||
| - match: {hits.hits.3.fields.voltage_times_ten: [51] } | ||
| - match: {hits.hits.4.fields.voltage_times_ten: [58] } | ||
| - match: {hits.hits.5.fields.voltage_times_ten: [52] } | ||
|
|
||
| --- | ||
| "terms agg": | ||
| - do: | ||
| search: | ||
| index: sensor | ||
| body: | ||
| aggs: | ||
| v10: | ||
| terms: | ||
| field: voltage_times_ten | ||
| - match: {hits.total.value: 6} | ||
| - match: {aggregations.v10.buckets.0.key: 40.0} | ||
| - match: {aggregations.v10.buckets.0.doc_count: 1} | ||
| - match: {aggregations.v10.buckets.1.key: 42.0} | ||
| - match: {aggregations.v10.buckets.1.doc_count: 1} | ||
|
|
||
| --- | ||
| "term query": | ||
| - do: | ||
| search: | ||
| index: sensor | ||
| body: | ||
| query: | ||
| term: | ||
| voltage_times_ten: 58 | ||
| - match: {hits.total.value: 1} | ||
| - match: {hits.hits.0._source.voltage: 5.8} |
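
To see where the expected values in the assertions above come from, the two Painless scripts can be mirrored in plain Python. This is a sketch for checking the arithmetic only. The function names are made up for illustration, and the sorted output for `docvalue_fields` is an assumption that matches the `[0, 2, 2]` assertion, not something the diff states.

```python
def java_div(a, b):
    """Integer division truncating toward zero, as in Java and Painless."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def temperature_digits(temperature):
    """Mirror the Painless loop: emit decimal digits, least significant first."""
    emitted = []
    t = temperature
    while t != 0:
        emitted.append(t - 10 * java_div(t, 10))  # Java-style t % 10
        t = java_div(t, 10)
    return emitted


def voltage_times_ten(voltage, multiplier=10):
    """Mirror emit((long)(v * params.multiplier)) for one voltage value."""
    return int(voltage * multiplier)  # int() truncates toward zero, like a (long) cast


# (timestamp, temperature, voltage) rows from the bulk request in the test setup.
docs = [
    (1516729294000, 200, 5.2),
    (1516642894000, 201, 5.8),
    (1516556494000, 202, 5.1),
    (1516470094000, 198, 5.6),
    (1516383694000, 200, 4.2),
    (1516297294000, 202, 4.0),
]
docs.sort(key=lambda d: d[0])  # the searches sort by timestamp, ascending

first = docs[0]
fetch_order = temperature_digits(first[1])    # order used by the "fetch fields" test
docvalue_order = sorted(fetch_order)          # order used by the "docvalue_fields" test
times_ten = [voltage_times_ten(d[2]) for d in docs]

print(fetch_order, docvalue_order, times_ten)
```

For the oldest document (temperature 202, voltage 4.0) this reproduces the `[2, 0, 2]`, `[0, 2, 2]` and `[40]` values asserted for `hits.hits.0`.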
Sorry to be leaving a comment so late -- I was just looking at how this turned out! I was wondering if we considered reusing the `ignore_malformed` boolean flag here instead of introducing a new parameter? That feels simple and consistent to me, and avoids introducing a new enum parameter.

Thanks Julie, good question. It all started from "let's support some `ignore_malformed` behaviour", but then we decided to make it specific to scripts, especially as we were not convinced that the two would have the same default values, as far as I recall. Another idea could be to support the new parameter as part of the definition of the script:
There is still time to change this if we want to, so let's discuss this.

We did discuss reusing `ignore_malformed`, but there are some important differences in the semantics of the two parameters. In particular, `ignore_malformed` is specifically targeted at handling input data that cannot be converted into a number; because the scripts are typed, there is no way that they can return a malformed object, and so the same logic doesn't really apply. In addition, a script could throw an error for any number of reasons (e.g. logic errors, array index out of bounds exceptions), and `ignore_malformed` feels too specific to cover all of these.

It's nice that we are reusing the `_ignored` meta field though. 👍

Thanks everyone for the context. I agree the name would feel a bit off; maybe if it had been called `ignore_parse_errors` we could have reused it!
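
The behaviour discussed in this thread can be pictured with a toy model. The sketch below is not Elasticsearch code: `index_with_scripts`, `DocumentRejected` and `first_reading` are hypothetical names, scripts are stood in for by Python callables, and always rejecting a document that supplies its own value for a scripted field is an assumption based on the earlier review comment.

```python
class DocumentRejected(Exception):
    """Raised by this toy model when a document would be rejected."""


def index_with_scripts(source, scripted_fields):
    """Toy model of index-time scripts (assumed semantics, for illustration).

    scripted_fields maps a field name to (script, on_script_error), where
    script is a callable taking the source dict and returning a list of values.
    """
    indexed = dict(source)
    ignored = []
    for name, (script, on_script_error) in scripted_fields.items():
        if name in source:
            # Assumption from the review thread: supplied values are rejected.
            raise DocumentRejected(f"{name} is calculated by a script")
        try:
            indexed[name] = script(source)
        except Exception as exc:  # any failure, e.g. a logic error or bad index
            if on_script_error == "ignore":
                ignored.append(name)  # record the field and keep indexing
            else:
                raise DocumentRejected(f"script for {name} failed: {exc}") from exc
    if ignored:
        indexed["_ignored"] = ignored
    return indexed


def first_reading(source):
    """Stand-in script that fails with IndexError on an empty list."""
    return [source["readings"][0]]


lenient = {"first_reading": (first_reading, "ignore")}
strict = {"first_reading": (first_reading, "reject")}

ok_doc = index_with_scripts({"readings": [7]}, lenient)
ignored_doc = index_with_scripts({"readings": []}, lenient)

try:
    index_with_scripts({"readings": []}, strict)
    rejected = False
except DocumentRejected:
    rejected = True

print(ok_doc, ignored_doc, rejected)
```

The failure here is an out-of-bounds access rather than unparseable input, which is the kind of error the thread argues `ignore_malformed` would describe poorly.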