From d3e31f56a526d86d7aed2d7f7458d74e6cb97b4c Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Tue, 19 Mar 2019 23:55:55 -0400 Subject: [PATCH 01/32] Keep intro only one page --- docs/index.asciidoc | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 6660ff3317..592d50c668 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -13,6 +13,7 @@ ingesting data into Elasticsearch. A common schema helps you correlate data from sources like logs and metrics or IT operations analytics and security analytics. +[float] [[ecs-maturity]] === Maturity From 060e6d0d3978ec623be23dbf51e17e62d6b640f7 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 13:19:01 -0400 Subject: [PATCH 02/32] Fix sidebar issue --- docs/index.asciidoc | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 592d50c668..768ec5976f 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -14,7 +14,6 @@ data from sources like logs and metrics or IT operations analytics and security analytics. [float] -[[ecs-maturity]] === Maturity With ECS turning 1.0, the team will approach improvements by following @@ -27,7 +26,6 @@ https://github.com/elastic/ecs/blob/master/CONTRIBUTING.md[Contribution Guidelines]. [float] -[[ecs-field-types]] === Types of fields * *Core fields.* Fields that are most common across all use cases. From ca432eabab651014b9fc57fb151569e2598ffed4 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 13:23:42 -0400 Subject: [PATCH 03/32] Temporarily hide use cases section. 
--- docs/index.asciidoc | 2 +- docs/{use-cases.asciidoc => use-cases.asciidoc.disabled} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename docs/{use-cases.asciidoc => use-cases.asciidoc.disabled} (100%) diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 768ec5976f..6b0951548f 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -42,7 +42,7 @@ include::fields.asciidoc[] include::conventions.asciidoc[] include::guidelines.asciidoc[] include::convert.asciidoc[] -include::use-cases.asciidoc[] +// include::use-cases.asciidoc[] include::faq.asciidoc[] include::contributing.asciidoc[] include::glossary.asciidoc[] diff --git a/docs/use-cases.asciidoc b/docs/use-cases.asciidoc.disabled similarity index 100% rename from docs/use-cases.asciidoc rename to docs/use-cases.asciidoc.disabled From 99b04f701c4c88880b79c210550f3c5288f26be7 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 13:55:42 -0400 Subject: [PATCH 04/32] Add a 'what is ecs' section at the top of the index page --- docs/index.asciidoc | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 6b0951548f..7160fabe7c 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -8,15 +8,32 @@ include::{asciidoc-dir}/../../shared/attributes.asciidoc[] [[ecs-reference]] == Overview -The Elastic Common Schema (ECS) defines a common set of fields for -ingesting data into Elasticsearch. A common schema helps you correlate -data from sources like logs and metrics or IT operations -analytics and security analytics. +[float] +=== What is ECS? + +The Elastic Common Schema (ECS) defines a common set of fields, +their datatype, and gives guidance on their correct usage. +ECS is used to improve uniformity of event data ingested into Elasticsearch. 
+ +Following ECS ensures your monitoring events follow a predictable schema, at all levels: + +- *Event source*: whether the source of your event is an Elastic product, + a third party product, or custom events generated by your application. +- *Event pipeline*: in any kind of event pipeline, such as + Beats processors, Logstash or Elasticsearch ingest node. +- *Consumption*: API consumers, Kibana applications and Kibana dashboards are + all simpler to build, maintain or share, when they are based on ECS. + +Following ECS reduces dependencies between unrelated parts of your event pipeline. + +The ultimate goal of ECS is to help you correlate data from various sources +like logs, metrics, IT operations analytics, and security analytics together. + [float] === Maturity -With ECS turning 1.0, the team will approach improvements by following +With ECS turning 1.0, the team will release improvements to the schema by following https://semver.org/[Semantic Versioning]. Generally major ECS releases are planned to be aligned with major Elastic Stack releases. @@ -46,4 +63,3 @@ include::convert.asciidoc[] include::faq.asciidoc[] include::contributing.asciidoc[] include::glossary.asciidoc[] - From dee7a714674284c4393ccccd537a051d073bc4f8 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 14:18:27 -0400 Subject: [PATCH 05/32] Let's address the main objection right at the beginning --- docs/index.asciidoc | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 7160fabe7c..2f724d17f0 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -29,6 +29,12 @@ Following ECS reduces dependencies between unrelated parts of your event pipelin The ultimate goal of ECS is to help you correlate data from various sources like logs, metrics, IT operations analytics, and security analytics together. +[float] +=== My events don't map with ECS + +ECS is a permissive schema. 
If your events have additional data that cannot be +mapped to ECS, you can simply add them to your events, using non-ECS field names. + [float] === Maturity From 76085201df621bd50c9ac2c1e6ea361a4db866db Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 14:36:55 -0400 Subject: [PATCH 06/32] Space. The final frontier. --- docs/conventions.asciidoc | 8 ++++---- docs/guidelines.asciidoc | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/conventions.asciidoc b/docs/conventions.asciidoc index e4b694239a..3b5684f9a5 100644 --- a/docs/conventions.asciidoc +++ b/docs/conventions.asciidoc @@ -9,18 +9,18 @@ Elasticsearch can index text using: * *Text.* Text indexing allows for full text search, or searching arbitrary words that - are part of the field. + are part of the field. See {ref}/text.html[Text datatype] in the {es} Reference Guide. * *Keywords.* Keyword indexing offers faster exact match filtering and prefix search, and makes aggregations (for {kib} visualizations) possible. - See the {es} Reference Guide for more information on + See the {es} Reference Guide for more information on {ref}/query-dsl-term-query.html[exact match filtering], - {ref}/query-dsl-prefix-query.html[prefix search], or + {ref}/query-dsl-prefix-query.html[prefix search], or {ref}/search-aggregations.html[aggregations]. [float] ==== Default Elasticsearch convention - + Unless your index mapping or index template specifies otherwise (as the ECS index template does), Elasticsearch indexes text field as `text` at the canonical field name, diff --git a/docs/guidelines.asciidoc b/docs/guidelines.asciidoc index 5abc0ce9f5..41addcfc4f 100644 --- a/docs/guidelines.asciidoc +++ b/docs/guidelines.asciidoc @@ -27,7 +27,7 @@ practices. * *Singular or plural.* Use singular and plural names properly to reflect the field content. For example, use `requests_per_sec` rather than `request_per_sec`. 
* *General to specific.* Organise the prefixes from general to specific to allow grouping fields into objects with a prefix like `host.*`. * *Avoid repetition.* Avoid stuttering of words. If part of the field name is already in the prefix, do not repeat it. Example: `host.host_ip` should be `host.ip`. -* *Use prefixes.* Fields must be prefixed except for the base fields. For example, all `host` fields are prefixed with `host.`. +* *Use prefixes.* Fields must be prefixed except for the base fields. For example, all `host` fields are prefixed with `host.`. See <> for more details. + The document structure should be nested JSON objects. If you use Beats or From 6a4aa6d3785dedf0b29f4474d2cb8e086c5b3e90 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 15:16:58 -0400 Subject: [PATCH 07/32] Restructure part 1 --- docs/additional.asciidoc | 4 ++ docs/convert.asciidoc | 20 ++++----- docs/faq.asciidoc | 4 +- docs/fields.asciidoc | 10 ++++- docs/index.asciidoc | 15 +------ ...ns.asciidoc => using-conventions.asciidoc} | 44 ++++++++++--------- ...nes.asciidoc => using-guidelines.asciidoc} | 27 ++++++++---- docs/using.asciidoc | 12 +++++ scripts/generators/asciidoc_fields.py | 10 ++++- 9 files changed, 87 insertions(+), 59 deletions(-) create mode 100644 docs/additional.asciidoc rename docs/{conventions.asciidoc => using-conventions.asciidoc} (89%) rename docs/{guidelines.asciidoc => using-guidelines.asciidoc} (69%) create mode 100644 docs/using.asciidoc diff --git a/docs/additional.asciidoc b/docs/additional.asciidoc new file mode 100644 index 0000000000..972fe4ac33 --- /dev/null +++ b/docs/additional.asciidoc @@ -0,0 +1,4 @@ +[[ecs-additional-information]] +== Additional Information + +To be written diff --git a/docs/convert.asciidoc b/docs/convert.asciidoc index 6ae351e75b..074aae2e46 100644 --- a/docs/convert.asciidoc +++ b/docs/convert.asciidoc @@ -1,7 +1,7 @@ [[convert-to-ecs]] == Converting an implementation to ECS -A common schema helps you correlate 
and use data from various sources. +A common schema helps you correlate and use data from various sources. Fields for most Elastic modules and solutions (version 7.0 and later) are mapped to the Elastic Common Schema. You may want to map data from other @@ -14,24 +14,24 @@ Before you start a conversion, be sure that you understand the basics below. [[core-or-ext]] === Core and extended fields -* *Core fields.* Fields that are most common across all use cases are called *core fields*. +* *Core fields.* Fields that are most common across all use cases are called *core fields*. + These generalized fields are used by analysis content (searches, visualizations, dashboards, alerts, machine learning jobs, reports) across use cases. Analysis content designed to operate on these -fields should work properly on data from any relevant source. +fields should work properly on data from any relevant source. + -Focus on populating these fields first. +Focus on populating these fields first. -* *Extended fields.* Any field that is not a core field is called an *extended field*. +* *Extended fields.* Any field that is not a core field is called an *extended field*. Extended fields may apply to more narrow use cases, or may be more open to interpretation depending on the use case. Extended fields are more likely to change over time. -Each {ecs} <> in a table is identified as core or extended. +Each {ecs} <> in a table is identified as core or extended. [float] -[[ecs-comv]] +[[ecs-conv]] === An approach to mapping an existing implementation Here's the recommended approach for converting an existing implementation to {ecs}. @@ -39,14 +39,14 @@ Here's the recommended approach for converting an existing implementation to {ec . Start with Core fields. + Populate core fields first. Look at your set of event fields, and find -the appropriate ECS field name for each one. +the appropriate ECS field name for each one. . Move on to Extended fields. 
+ Map fields that may be specific to various data sources using {ecs} extended fields. Look at {ecs} extended fields, and decide how to populate these fields with the data you have available. Even if you have already mapped a field to an -{ecs} core field, you can still map it to an extended field. +{ecs} core field, you can still map it to an extended field. Populating both core and extended fields helps ensure reusability of ECS analysis -content. +content. diff --git a/docs/faq.asciidoc b/docs/faq.asciidoc index a251d90e6d..e218383a02 100644 --- a/docs/faq.asciidoc +++ b/docs/faq.asciidoc @@ -44,7 +44,7 @@ There are two common key formats for ingesting data into Elasticsearch: * Dot notation: `user.firstname: Nicolas`, `user.lastname: Ruflin` * Underline notation: `user_firstname: Nicolas`, `user_lastname: Ruflin` -ECS uses the dot notation to represent nested objects. +ECS uses the dot notation to represent nested objects. [float] [[notation-diff]] @@ -100,5 +100,3 @@ the ECS data itself, this is not an issue because all fields are predefined. As long as there are no conflicts, underline notation and ECS dot notation can coexist in the same document. - - diff --git a/docs/fields.asciidoc b/docs/fields.asciidoc index 833d1aaf00..5bbcd0d5e4 100644 --- a/docs/fields.asciidoc +++ b/docs/fields.asciidoc @@ -1,6 +1,12 @@ -[[ecs-fields]] -== {ecs} Fields +[[ecs-field-reference]] +== {ecs} Field Reference + +ECS defines multiple groups of related fields. They are called "field sets". +The <> field set is the only one whose fields are defined +at the top level of the events. +All other field sets are defined as objects in {es}, under which +all fields are defined. [float] [[ecs-fieldsets]] diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 2f724d17f0..4c3dc7025e 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -48,22 +48,11 @@ For contributions please read the https://github.com/elastic/ecs/blob/master/CONTRIBUTING.md[Contribution Guidelines]. 
-[float] -=== Types of fields - -* *Core fields.* Fields that are most common across all use cases. -Focus on populating these fields first. - -* *Extended fields.* Any fields that are not a core field. -Extended fields may apply to more narrow use cases, or may be more open -to interpretation depending on the use case. Extended fields are more likely to -change over time. - +include::using.asciidoc[] include::fields.asciidoc[] -include::conventions.asciidoc[] -include::guidelines.asciidoc[] +include::additional.asciidoc[] include::convert.asciidoc[] // include::use-cases.asciidoc[] include::faq.asciidoc[] diff --git a/docs/conventions.asciidoc b/docs/using-conventions.asciidoc similarity index 89% rename from docs/conventions.asciidoc rename to docs/using-conventions.asciidoc index 3b5684f9a5..df3456d7e6 100644 --- a/docs/conventions.asciidoc +++ b/docs/using-conventions.asciidoc @@ -1,10 +1,27 @@ -//[[ecs-conventions]] -== {ecs} Conventions +[[ecs-conventions]] +=== {ecs} Conventions {ecs} is most effective when you understand and follow these guidelines and conventions. +==== Datatype for integers + +Unless otherwise noted, the datatype used for integer fields should be `long`. + +[float] +==== IDs and most codes are keywords, not integers + +Despite the fact that IDs and codes (such as error codes) are often integers, +this is not always the case. +Since we want to make it possible to map as many systems and data sources +to ECS as possible, we default to using the `keyword` type for IDs and codes. + +Some specific kinds of codes are always integers, like HTTP status codes. +If those have a specific corresponding specific field (as HTTP status does), +its type can safely be an integer type. +But generic field like `error.code` cannot have this guarantee, and are therefore `keyword`. 
+ [float] -=== Multi-fields text indexing +==== Multi-fields text indexing Elasticsearch can index text using: @@ -19,7 +36,7 @@ Elasticsearch can index text using: {ref}/search-aggregations.html[aggregations]. [float] -==== Default Elasticsearch convention +===== Default Elasticsearch convention Unless your index mapping or index template specifies otherwise (as the ECS index template does), @@ -32,7 +49,7 @@ Default Elasticsearch convention: * Multi-field: `myfield.keyword` is `keyword` [float] -==== ECS multi-field convention for text +===== ECS multi-field convention for text For monitoring use cases, `keyword` indexing is needed almost exclusively, with full text search on very few fields. Given this premise, ECS defaults @@ -48,7 +65,7 @@ ECS multi-field convention for text: * Multi-field: `myfield.text` is `text` [float] -==== Exceptions +===== Exceptions The only exceptions to this convention are fields `message` and `error.message`, which are indexed for full text search only, with no multi-field. @@ -57,18 +74,3 @@ of a breaking change with these two widely used fields in Beats. Any future field that will be indexed for full text search in ECS will however follow the multi-field convention where `text` indexing is nested in the multi-field. - -[float] -=== IDs and most codes are keywords, not integers - -Despite the fact that IDs and codes (such as error codes) are often integers, -this is not always the case. -Since we want to make it possible to map as many systems and data sources -to ECS as possible, we default to using the `keyword` type for IDs and codes. - -Some specific kinds of codes are always integers, like HTTP status codes. -If those have a specific corresponding specific field (as HTTP status does), -its type can safely be an integer type. -But generic field like `error.code` cannot have this guarantee, and are therefore `keyword`. 
- - diff --git a/docs/guidelines.asciidoc b/docs/using-guidelines.asciidoc similarity index 69% rename from docs/guidelines.asciidoc rename to docs/using-guidelines.asciidoc index 41addcfc4f..0c47d85722 100644 --- a/docs/guidelines.asciidoc +++ b/docs/using-guidelines.asciidoc @@ -1,11 +1,25 @@ -//[[ecs-guidelines]] -== Guidelines and Best Practices +[[ecs-guidelines]] +=== Guidelines and Best Practices The {ecs} schema serves best when you follow schema guidelines and best practices. [float] -=== General guidelines +==== Types of fields + +ECS defines "Core" and "Extended" fields. + +* *Core fields.* Fields that are the most common across all use cases. + Focus on populating these fields first. If consuming ECS events, expect + these fields to be populated in most situations. + +* *Extended fields.* Any fields that are not a core field. + Extended fields may apply to more narrow use cases, or may be more open + to interpretation depending on the use case. Extended fields are more likely to + change over time. + +[float] +==== General guidelines * The document MUST have the `@timestamp` field. * Use the {ref}/mapping-types.html[data types] @@ -14,14 +28,14 @@ practices. * Map as many fields as possible to ECS. [float] -==== Guidelines for writing fields +===== Guidelines for writing fields * All fields must be lower case * Combine words using underscore * No special characters except `_` [float] -==== Guidelines for naming fields +===== Guidelines for naming fields * *Present tense.* Use present tense unless field describes historical information. * *Singular or plural.* Use singular and plural names properly to reflect the field content. For example, use `requests_per_sec` rather than `request_per_sec`. @@ -36,6 +50,3 @@ ingesting to Elasticsearch using the API, your fields must be nested objects, not strings containing dots. * *Avoid abbreviations when possible*. A few exceptions like `ip` exist. 
- - - diff --git a/docs/using.asciidoc b/docs/using.asciidoc new file mode 100644 index 0000000000..dc70d201f6 --- /dev/null +++ b/docs/using.asciidoc @@ -0,0 +1,12 @@ +[[ecs-using-ecs]] +== Using ECS + +ECS fields follow a series of guidelines, to ensure a consistent and predictable +feel, across various use cases. + +Whether you're trying to recall a field name, implementing a solution that +follows ECS, or proposing a change to ECS, the <> and +<> will help get you there. + +include::using-guidelines.asciidoc[] +include::using-conventions.asciidoc[] diff --git a/scripts/generators/asciidoc_fields.py b/scripts/generators/asciidoc_fields.py index 6d0bd0c8a7..b7e923728f 100644 --- a/scripts/generators/asciidoc_fields.py +++ b/scripts/generators/asciidoc_fields.py @@ -148,8 +148,14 @@ def table_footer(): def index_header(): return ''' -[[ecs-fields]] -== {ecs} Fields +[[ecs-field-reference]] +== {ecs} Field Reference + +ECS defines multiple groups of related fields. They are called "field sets". +The <> field set is the only one whose fields are defined +at the top level of the events. +All other field sets are defined as objects in {es}, under which +all fields are defined. 
[float] [[ecs-fieldsets]] From fc3d0345000b646c48e70e1f58c650bcd8f3d4eb Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 15:38:49 -0400 Subject: [PATCH 08/32] Finish the structuring of the docs --- docs/additional.asciidoc | 5 ++++- docs/contributing.asciidoc | 3 +-- docs/faq.asciidoc | 18 +++++++++--------- docs/glossary.asciidoc | 9 +++------ docs/index.asciidoc | 6 +----- 5 files changed, 18 insertions(+), 23 deletions(-) diff --git a/docs/additional.asciidoc b/docs/additional.asciidoc index 972fe4ac33..56bdee8c57 100644 --- a/docs/additional.asciidoc +++ b/docs/additional.asciidoc @@ -1,4 +1,7 @@ [[ecs-additional-information]] == Additional Information -To be written +// include::use-cases.asciidoc[] +include::faq.asciidoc[] +include::glossary.asciidoc[] +include::contributing.asciidoc[] diff --git a/docs/contributing.asciidoc b/docs/contributing.asciidoc index 86f09e72de..998c8d4c4d 100644 --- a/docs/contributing.asciidoc +++ b/docs/contributing.asciidoc @@ -1,8 +1,7 @@ [[ecs-contributing]] -== Contributing to {ecs} +=== Contributing to {ecs} All information related to ECS is versioned in the https://github.com/elastic/ecs[elastic/ecs] repository. All changes to ECS happen through Pull Requests submitted through Git. See the https://github.com/elastic/ecs/blob/master/CONTRIBUTING.md[Contribution Guidelines]. - diff --git a/docs/faq.asciidoc b/docs/faq.asciidoc index e218383a02..4b250ff056 100644 --- a/docs/faq.asciidoc +++ b/docs/faq.asciidoc @@ -1,9 +1,9 @@ [[ecs-faq]] -== FAQ +=== Questions and Answers [float] [[ecs-benefits]] -=== What are the benefits of using ECS? +==== What are the benefits of using ECS? The benefits to a user adopting these fields and names in their clusters are: @@ -18,7 +18,7 @@ The benefits to a user adopting these fields and names in their clusters are: [float] [[conflict]] -=== What if I have fields that conflict with ECS? +==== What if I have fields that conflict with ECS? 
The {ref}/rename-processor.html[rename @@ -30,14 +30,14 @@ field. If your field does not match ECS, you can rename your field to [float] [[addl-fields]] -=== What if my events have additional fields? +==== What if my events have additional fields? Events may contain fields in addition to ECS fields. These fields can follow the ECS naming and writing rules, but this is not a requirement. [float] [[dot-notation]] -=== Why does ECS use a dot notation instead of an underline notation? +==== Why does ECS use a dot notation instead of an underline notation? There are two common key formats for ingesting data into Elasticsearch: @@ -48,7 +48,7 @@ ECS uses the dot notation to represent nested objects. [float] [[notation-diff]] -==== What is the difference between the two notations? +===== What is the difference between the two notations? Ingesting `user.firstname: Nicolas` and `user.lastname: Ruflin` is identical to ingesting the following JSON: @@ -68,7 +68,7 @@ datatypes], which are arrays of objects. [float] [[dot-adv]] -==== Advantages of dot notation +===== Advantages of dot notation With dot notation, each prefix in Elasticsearch is an object. Each object can have {ref}/object.html#object-params[parameters] @@ -87,7 +87,7 @@ modifying each part of the final event easier. [float] [[dot-disadv]] -==== Disadvantage of dot notation +===== Disadvantage of dot notation In Elasticsearch, each key can have only one type. For example, if `user` is an `object`, you can't use it as a `keyword` type in the same index, like `{"user": @@ -96,7 +96,7 @@ the ECS data itself, this is not an issue because all fields are predefined. [float] [[underline]] -==== What if I already use the underline notation? +===== What if I already use the underline notation? As long as there are no conflicts, underline notation and ECS dot notation can coexist in the same document. 
diff --git a/docs/glossary.asciidoc b/docs/glossary.asciidoc index c9c018049b..f6ea10a137 100644 --- a/docs/glossary.asciidoc +++ b/docs/glossary.asciidoc @@ -1,11 +1,8 @@ -//[[ecs-glossary]] -== Glossary of {ecs} Terms +[[ecs-glossary]] +=== Glossary of {ecs} Terms -[[glossary-ecs]] +[[ecs-glossary-ecs]] ECS:: Elastic Common Schema. A common set of document fields, field names, and their respective entity relationships to be used in the storage of log messages and other data in Elasticsearch. - - - diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 4c3dc7025e..a4d159ea20 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -52,9 +52,5 @@ Guidelines]. include::using.asciidoc[] include::fields.asciidoc[] -include::additional.asciidoc[] include::convert.asciidoc[] -// include::use-cases.asciidoc[] -include::faq.asciidoc[] -include::contributing.asciidoc[] -include::glossary.asciidoc[] +include::additional.asciidoc[] From 5ec021fe91ae0a1ddc235a2fa1f12ab0fde18f07 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 16:03:43 -0400 Subject: [PATCH 09/32] Create a broader 'migrating to ecs' section --- docs/{convert.asciidoc => converting.asciidoc} | 8 ++++---- docs/index.asciidoc | 2 +- docs/logstash.asciidoc | 4 ++++ docs/migrating.asciidoc | 12 ++++++++++++ docs/solutions.asciidoc | 12 ++++++++++++ 5 files changed, 33 insertions(+), 5 deletions(-) rename docs/{convert.asciidoc => converting.asciidoc} (92%) create mode 100644 docs/logstash.asciidoc create mode 100644 docs/migrating.asciidoc create mode 100644 docs/solutions.asciidoc diff --git a/docs/convert.asciidoc b/docs/converting.asciidoc similarity index 92% rename from docs/convert.asciidoc rename to docs/converting.asciidoc index 074aae2e46..e6b345420c 100644 --- a/docs/convert.asciidoc +++ b/docs/converting.asciidoc @@ -1,5 +1,5 @@ -[[convert-to-ecs]] -== Converting an implementation to ECS +[[ecs-converting]] +=== Converting a Custom Implementation A common schema helps you 
correlate and use data from various sources. @@ -12,7 +12,7 @@ Before you start a conversion, be sure that you understand the basics below. [float] [[core-or-ext]] -=== Core and extended fields +==== Core and extended fields * *Core fields.* Fields that are most common across all use cases are called *core fields*. + @@ -32,7 +32,7 @@ Each {ecs} <> in a table is identified as core or ext [float] [[ecs-conv]] -=== An approach to mapping an existing implementation +==== An approach to mapping an existing implementation Here's the recommended approach for converting an existing implementation to {ecs}. diff --git a/docs/index.asciidoc b/docs/index.asciidoc index a4d159ea20..2032cd6f0b 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -52,5 +52,5 @@ Guidelines]. include::using.asciidoc[] include::fields.asciidoc[] -include::convert.asciidoc[] +include::migrating.asciidoc[] include::additional.asciidoc[] diff --git a/docs/logstash.asciidoc b/docs/logstash.asciidoc new file mode 100644 index 0000000000..ff48d2ad82 --- /dev/null +++ b/docs/logstash.asciidoc @@ -0,0 +1,4 @@ +[[ecs-logstash]] +=== Using ECS with Logstash + +To be written diff --git a/docs/migrating.asciidoc b/docs/migrating.asciidoc new file mode 100644 index 0000000000..43e692bf5a --- /dev/null +++ b/docs/migrating.asciidoc @@ -0,0 +1,12 @@ +[[migrating-to-ecs]] +== Migrating to ECS + +There are multiple ways to reap the benefit of ECS. +The simplest is to use <>. + +If you have a custom pipeline or application you would like to convert to ECS, +please have a look at <>. 
+ +include::solutions.asciidoc[] +include::logstash.asciidoc[] +include::converting.asciidoc[] diff --git a/docs/solutions.asciidoc b/docs/solutions.asciidoc new file mode 100644 index 0000000000..cc073e2a32 --- /dev/null +++ b/docs/solutions.asciidoc @@ -0,0 +1,12 @@ +[[ecs-solutions]] +=== Solutions that Support ECS + +The following Elastic solutions support ECS out of the box, as of the 7.0 release: + +* {beats} +* APM +* Infrastructure UI and Logs UI + +// TODO Insert community & partner solutions here + + From 13bd8f20d95655eedd7a88c748e7c6b7a4f593de Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 16:28:28 -0400 Subject: [PATCH 10/32] Flesh out the ecs-solutions page a little bit. Bogus links, though. --- docs/solutions.asciidoc | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/solutions.asciidoc b/docs/solutions.asciidoc index cc073e2a32..517ae21615 100644 --- a/docs/solutions.asciidoc +++ b/docs/solutions.asciidoc @@ -1,10 +1,16 @@ [[ecs-solutions]] === Solutions that Support ECS -The following Elastic solutions support ECS out of the box, as of the 7.0 release: +The following Elastic solutions support ECS out of the box: * {beats} +** Supported out of the box as of release 7.0 +** If you are migrating from an older version, please visit + http://example.com[EXAMPLE Beats Migration Guide] * APM +** Supported out of the box as of release 7.0 +** If you are migrating from an older version, please visit + http://example.com[EXAMPLE APM Migration Guide] * Infrastructure UI and Logs UI // TODO Insert community & partner solutions here From eaed27262706559e5f43b57e8c309adc45b51767 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Wed, 20 Mar 2019 16:28:39 -0400 Subject: [PATCH 11/32] Minor wording tweak --- docs/using.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/using.asciidoc b/docs/using.asciidoc index dc70d201f6..b16397fe42 100644 --- a/docs/using.asciidoc +++ 
b/docs/using.asciidoc @@ -5,7 +5,7 @@ ECS fields follow a series of guidelines, to ensure a consistent and predictable feel, across various use cases. Whether you're trying to recall a field name, implementing a solution that -follows ECS, or proposing a change to ECS, the <> and +follows ECS, or proposing a change to the schema, the <> and <> will help get you there. include::using-guidelines.asciidoc[] From 951498b37fee80591473937f824ce320ea0ddb90 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 10:11:50 -0400 Subject: [PATCH 12/32] All of the feedback + my take on `converting.asciidoc` --- docs/converting.asciidoc | 37 ++++++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 15 deletions(-) diff --git a/docs/converting.asciidoc b/docs/converting.asciidoc index e6b345420c..a72ddf28c3 100644 --- a/docs/converting.asciidoc +++ b/docs/converting.asciidoc @@ -12,9 +12,9 @@ Before you start a conversion, be sure that you understand the basics below. [float] [[core-or-ext]] -==== Core and extended fields +==== Core and extended levels -* *Core fields.* Fields that are most common across all use cases are called *core fields*. +* *Core fields.* Fields that are most common across all use cases are defined as *core fields*. + These generalized fields are used by analysis content (searches, visualizations, dashboards, alerts, machine learning jobs, reports) @@ -23,7 +23,7 @@ fields should work properly on data from any relevant source. + Focus on populating these fields first. -* *Extended fields.* Any field that is not a core field is called an *extended field*. +* *Extended fields.* Any field that is not a core field is defined as an *extended field*. Extended fields may apply to more narrow use cases, or may be more open to interpretation depending on the use case. Extended fields are more likely to change over time. 
@@ -36,17 +36,24 @@ Each {ecs} <> in a table is identified as core or ext Here's the recommended approach for converting an existing implementation to {ecs}. -. Start with Core fields. -+ -Populate core fields first. Look at your set of event fields, and find -the appropriate ECS field name for each one. +. Review each field in your original event and map it to the relevant ECS field. -. Move on to Extended fields. -+ -Map fields that may be specific to various data sources using {ecs} extended -fields. Look at {ecs} extended fields, and decide how to populate these fields -with the data you have available. Even if you have already mapped a field to an -{ecs} core field, you can still map it to an extended field. + - Start by mapping your field to the relevant ECS Core field. + - If a relevant ECS Core field does not exist, map your field to the relevant ECS Extended field. + - If no relevant ECS Extended field exists, consider keeping your field with its original details, + or possibly renaming it using ECS naming guidelines and attempt to map one + or more of your original event fields to it. + +. Review each ECS Core field, and attempt to populate it. + + - Review your original event data again + - Consider populating the field based on additional meta-data such as static + information (e.g. add `event.type:syslog` even if syslog events don't mention this fact), + or information gathered from the environment (e.g. host information). + +. Review other extended fields from any field set you are already using, and + attempt to populate it as well. -Populating both core and extended fields helps ensure reusability of ECS analysis -content. +. Set `ecs.version` to the version of the schema you are conforming to. This will + allow you to upgrade your sources, pipelines and content (like dashboards) + smoothly in the future. 
From 4c773232a594e3c82e27d6ef914b14eed776d32a Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 10:28:13 -0400 Subject: [PATCH 13/32] `make reload_doc` runs generator, then generates documentation --- Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Makefile b/Makefile index 280c5880f6..9d937287e8 100644 --- a/Makefile +++ b/Makefile @@ -112,6 +112,9 @@ readme: cat docs/about.md >> README.md cat docs/generated-files.md >> README.md +.PHONY: reload_docs +reload_docs: generator docs + # Download and setup tooling dependencies. .PHONY: setup setup: ve From df0a22a0a744f240337eb9c6c492e6fe6067aa6c Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 10:29:00 -0400 Subject: [PATCH 14/32] base wording and ECS definition --- docs/fields.asciidoc | 3 ++- docs/glossary.asciidoc | 7 ++++--- scripts/generators/asciidoc_fields.py | 3 ++- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/docs/fields.asciidoc b/docs/fields.asciidoc index 5bbcd0d5e4..cc88f9d9dd 100644 --- a/docs/fields.asciidoc +++ b/docs/fields.asciidoc @@ -4,7 +4,8 @@ ECS defines multiple groups of related fields. They are called "field sets". The <> field set is the only one whose fields are defined -at the top level of the events. +at the root of the event. + All other field sets are defined as objects in {es}, under which all fields are defined. diff --git a/docs/glossary.asciidoc b/docs/glossary.asciidoc index f6ea10a137..84acf4b8c1 100644 --- a/docs/glossary.asciidoc +++ b/docs/glossary.asciidoc @@ -3,6 +3,7 @@ [[ecs-glossary-ecs]] ECS:: -Elastic Common Schema. A common set of document fields, field names, and their respective entity -relationships to be used in the storage of log messages and other data in -Elasticsearch. +Elastic Common Schema. The Elastic Common Schema (ECS) defines a common set of fields, +their datatype, and gives guidance on their correct usage. 
+ECS is used to improve uniformity of event data ingested into Elasticsearch, +such as logs and metrics. diff --git a/scripts/generators/asciidoc_fields.py b/scripts/generators/asciidoc_fields.py index b7e923728f..5ed933e2ba 100644 --- a/scripts/generators/asciidoc_fields.py +++ b/scripts/generators/asciidoc_fields.py @@ -153,7 +153,8 @@ def index_header(): ECS defines multiple groups of related fields. They are called "field sets". The <> field set is the only one whose fields are defined -at the top level of the events. +at the root of the event. + All other field sets are defined as objects in {es}, under which all fields are defined. From cefa0f172aafd7c398a43a41ad2cbc0c61675241 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 11:34:32 -0400 Subject: [PATCH 15/32] Small tweaks based on review --- docs/glossary.asciidoc | 2 +- docs/index.asciidoc | 2 +- docs/solutions.asciidoc | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/glossary.asciidoc b/docs/glossary.asciidoc index 84acf4b8c1..e148b1fea1 100644 --- a/docs/glossary.asciidoc +++ b/docs/glossary.asciidoc @@ -3,7 +3,7 @@ [[ecs-glossary-ecs]] ECS:: -Elastic Common Schema. The Elastic Common Schema (ECS) defines a common set of fields, +*Elastic Common Schema*. The Elastic Common Schema (ECS) defines a common set of fields, their datatype, and gives guidance on their correct usage. ECS is used to improve uniformity of event data ingested into Elasticsearch, such as logs and metrics. diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 2032cd6f0b..22cd6e77c5 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -33,7 +33,7 @@ like logs, metrics, IT operations analytics, and security analytics together. === My events don't map with ECS ECS is a permissive schema. If your events have additional data that cannot be -mapped to ECS, you can simply add them to your events, using non-ECS field names. 
+mapped to ECS, you can simply add them to your events, using custom field names. [float] diff --git a/docs/solutions.asciidoc b/docs/solutions.asciidoc index 517ae21615..8b909f0957 100644 --- a/docs/solutions.asciidoc +++ b/docs/solutions.asciidoc @@ -1,7 +1,7 @@ [[ecs-solutions]] === Solutions that Support ECS -The following Elastic solutions support ECS out of the box: +The following Elastic products support ECS out of the box: * {beats} ** Supported out of the box as of release 7.0 From 1f3874b1920fa681be48627135884b370c0c4c39 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 11:34:58 -0400 Subject: [PATCH 16/32] Merge 2 sections in guidelines, reorder and reword to make more consistent --- docs/using-guidelines.asciidoc | 55 +++++++++++++++++++++------------- 1 file changed, 35 insertions(+), 20 deletions(-) diff --git a/docs/using-guidelines.asciidoc b/docs/using-guidelines.asciidoc index 0c47d85722..f9e2cf0c12 100644 --- a/docs/using-guidelines.asciidoc +++ b/docs/using-guidelines.asciidoc @@ -18,6 +18,7 @@ ECS defines "Core" and "Extended" fields. to interpretation depending on the use case. Extended fields are more likely to change over time. + [float] ==== General guidelines @@ -27,26 +28,40 @@ ECS defines "Core" and "Extended" fields. * Use the `ecs.version` field to define which version of ECS is used. * Map as many fields as possible to ECS. + [float] -===== Guidelines for writing fields +===== Guidelines for field names -* All fields must be lower case -* Combine words using underscore -* No special characters except `_` +* *Field names must be lower case* -[float] -===== Guidelines for naming fields - -* *Present tense.* Use present tense unless field describes historical information. -* *Singular or plural.* Use singular and plural names properly to reflect the field content. For example, use `requests_per_sec` rather than `request_per_sec`. 
-* *General to specific.* Organise the prefixes from general to specific to allow grouping fields into objects with a prefix like `host.*`. -* *Avoid repetition.* Avoid stuttering of words. If part of the field name is already in the prefix, do not repeat it. Example: `host.host_ip` should be `host.ip`. -* *Use prefixes.* Fields must be prefixed except for the base fields. For example, all `host` fields are prefixed with `host.`. -See <> for more details. -+ -The document structure should be nested JSON objects. If you use Beats or -Logstash, the nesting of JSON objects is done for you automatically. If you're -ingesting to Elasticsearch using the API, your fields must be nested -objects, not strings containing dots. - -* *Avoid abbreviations when possible*. A few exceptions like `ip` exist. +* *Combine words using underscore* + +* *No special characters except underscore* + +* *Use present tense* unless field describes historical information. + +* *Use singular and plural names properly* to reflect the field content. +** For example, use `requests_per_sec` rather than `request_per_sec`. + +* *Use prefixes for all fields*, except for the base fields. +** For example, all `host` fields are prefixed with `host.` + +* *Separate prefixes by nesting* with dots +** The document structure should be nested JSON objects. + If you use Beats or Logstash, the nesting of JSON objects is done for you automatically. + If you're ingesting to Elasticsearch using the API, your fields must be nested + objects, not strings containing dots. +** See <> for more details. + +* *General to specific*. Organise the prefixes from general to specific to + allow grouping fields into objects with a prefix like `host.*`. + +* *Avoid repetition* or stuttering of words +** If part of the field name is already in the prefix, + avoid repeat it. Example: `host.host_ip` should be `host.ip`. 
+** Exceptions can be made, when changing the name of the field would break a + strong convention in the community. Example: `host.hostname` is an exception to this rule. + +* *Avoid abbreviations when possible* +** Exceptions can be made, when the name used for the concept is too strongly + in favor of the abbreviation. Example: all `ip` fields are an exception to this rule. From 0b6477c19e0e2fe6e510941b0af849e86713c3ab Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 12:27:24 -0400 Subject: [PATCH 17/32] Try to replace the `prefix` wording with `field set` wording. --- docs/using-guidelines.asciidoc | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/docs/using-guidelines.asciidoc b/docs/using-guidelines.asciidoc index f9e2cf0c12..d53fbf41ad 100644 --- a/docs/using-guidelines.asciidoc +++ b/docs/using-guidelines.asciidoc @@ -44,24 +44,27 @@ ECS defines "Core" and "Extended" fields. ** For example, use `requests_per_sec` rather than `request_per_sec`. * *Use prefixes for all fields*, except for the base fields. -** For example, all `host` fields are prefixed with `host.` +** For example, all `host` fields are prefixed with `host.`. Such a grouping is + called a field set. -* *Separate prefixes by nesting* with dots +* *Nest fields inside a field set* with dots ** The document structure should be nested JSON objects. If you use Beats or Logstash, the nesting of JSON objects is done for you automatically. If you're ingesting to Elasticsearch using the API, your fields must be nested objects, not strings containing dots. ** See <> for more details. -* *General to specific*. Organise the prefixes from general to specific to - allow grouping fields into objects with a prefix like `host.*`. +* *General to specific*. Organise the nesting of field sets from general to specific, + to allow grouping fields into objects with a prefix like `host.*`. 
* *Avoid repetition* or stuttering of words -** If part of the field name is already in the prefix, - avoid repeat it. Example: `host.host_ip` should be `host.ip`. +** If part of the field name is already in the name of the field set, + avoid repeating it. Example: `host.host_ip` should be `host.ip`. ** Exceptions can be made, when changing the name of the field would break a - strong convention in the community. Example: `host.hostname` is an exception to this rule. + strong convention in the community. + Example: `host.hostname` is an exception to this rule. * *Avoid abbreviations when possible* ** Exceptions can be made, when the name used for the concept is too strongly - in favor of the abbreviation. Example: all `ip` fields are an exception to this rule. + in favor of the abbreviation. + Example: `ip` fields, or field sets such as `os`, `geo`. From 7f827c61b0f137036d1a9222230f510c0cda26a5 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 16:29:22 -0400 Subject: [PATCH 18/32] Disable Logstash section for now --- docs/{logstash.asciidoc => logstash.asciidoc.disabled} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/{logstash.asciidoc => logstash.asciidoc.disabled} (100%) diff --git a/docs/logstash.asciidoc b/docs/logstash.asciidoc.disabled similarity index 100% rename from docs/logstash.asciidoc rename to docs/logstash.asciidoc.disabled From d48e50cb0c4d4966dc00ec20433970b26927e239 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 16:30:40 -0400 Subject: [PATCH 19/32] Address the 'solutions' wording and remove links we don't have --- docs/migrating.asciidoc | 4 ++-- docs/products-solutions.asciidoc | 15 +++++++++++++++ docs/solutions.asciidoc | 18 ------------------ 3 files changed, 17 insertions(+), 20 deletions(-) create mode 100644 docs/products-solutions.asciidoc delete mode 100644 docs/solutions.asciidoc diff --git a/docs/migrating.asciidoc b/docs/migrating.asciidoc index 43e692bf5a..c6a0ce6441 100644 
--- a/docs/migrating.asciidoc +++ b/docs/migrating.asciidoc @@ -2,11 +2,11 @@ == Migrating to ECS There are multiple ways to reap the benefit of ECS. -The simplest is to use <>. +The simplest is to use <>. If you have a custom pipeline or application you would like to convert to ECS, please have a look at <>. -include::solutions.asciidoc[] +include::products-solutions.asciidoc[] include::logstash.asciidoc[] include::converting.asciidoc[] diff --git a/docs/products-solutions.asciidoc b/docs/products-solutions.asciidoc new file mode 100644 index 0000000000..3acd1968ce --- /dev/null +++ b/docs/products-solutions.asciidoc @@ -0,0 +1,15 @@ +[[ecs-products-solutions]] +=== Products and Solutions that Support ECS + +The following Elastic products support ECS out of the box: + +* {beats} +** Supported out of the box as of version 7.0 +* APM +** Supported out of the box as of version 7.0 +* Infrastructure UI and Logs UI +** Supported out of the box as of version 7.0 + +// TODO Insert community & partner solutions here + + diff --git a/docs/solutions.asciidoc b/docs/solutions.asciidoc deleted file mode 100644 index 8b909f0957..0000000000 --- a/docs/solutions.asciidoc +++ /dev/null @@ -1,18 +0,0 @@ -[[ecs-solutions]] -=== Solutions that Support ECS - -The following Elastic products support ECS out of the box: - -* {beats} -** Supported out of the box as of release 7.0 -** If you are migrating from an older version, please visit - http://example.com[EXAMPLE Beats Migration Guide] -* APM -** Supported out of the box as of release 7.0 -** If you are migrating from an older version, please visit - http://example.com[EXAMPLE APM Migration Guide] -* Infrastructure UI and Logs UI - -// TODO Insert community & partner solutions here - - From 4ce188f5359c6cdd3edd903c1c6b42bb74a2dc08 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 16:31:54 -0400 Subject: [PATCH 20/32] Remove logstash include --- docs/migrating.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 
deletion(-) diff --git a/docs/migrating.asciidoc b/docs/migrating.asciidoc index c6a0ce6441..5a7b09eaf6 100644 --- a/docs/migrating.asciidoc +++ b/docs/migrating.asciidoc @@ -8,5 +8,5 @@ If you have a custom pipeline or application you would like to convert to ECS, please have a look at <>. include::products-solutions.asciidoc[] -include::logstash.asciidoc[] +// include::logstash.asciidoc[] include::converting.asciidoc[] From af2c92fef0b1c83716c3e0c9fbdf98ff8abb0428 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Thu, 21 Mar 2019 16:33:17 -0400 Subject: [PATCH 21/32] Make the list of products much less chatty --- docs/products-solutions.asciidoc | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/docs/products-solutions.asciidoc b/docs/products-solutions.asciidoc index 3acd1968ce..043cc85f62 100644 --- a/docs/products-solutions.asciidoc +++ b/docs/products-solutions.asciidoc @@ -1,14 +1,11 @@ [[ecs-products-solutions]] === Products and Solutions that Support ECS -The following Elastic products support ECS out of the box: +The following Elastic products support ECS out of the box, as of version 7.0: * {beats} -** Supported out of the box as of version 7.0 * APM -** Supported out of the box as of version 7.0 * Infrastructure UI and Logs UI -** Supported out of the box as of version 7.0 // TODO Insert community & partner solutions here From 27d54df3fddb44fcb3a0623bb05d8bfeb00f54bd Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 11:15:45 -0400 Subject: [PATCH 22/32] Do a pass on conventions page. 
--- docs/using-conventions.asciidoc | 38 ++++++++++++++++++++------------- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/docs/using-conventions.asciidoc b/docs/using-conventions.asciidoc index df3456d7e6..06dac9113e 100644 --- a/docs/using-conventions.asciidoc +++ b/docs/using-conventions.asciidoc @@ -1,8 +1,10 @@ [[ecs-conventions]] -=== {ecs} Conventions +=== Conventions -{ecs} is most effective when you understand and follow these guidelines and conventions. +The implementation of ECS follows a few conventions. Understanding them will +help you understand ECS better. +[float] ==== Datatype for integers Unless otherwise noted, the datatype used for integer fields should be `long`. @@ -18,29 +20,30 @@ to ECS as possible, we default to using the `keyword` type for IDs and codes. Some specific kinds of codes are always integers, like HTTP status codes. If those have a specific corresponding specific field (as HTTP status does), its type can safely be an integer type. -But generic field like `error.code` cannot have this guarantee, and are therefore `keyword`. +But generic fields like `error.code` cannot have this guarantee, and are therefore `keyword`. [float] -==== Multi-fields text indexing +==== Text indexing and multi-fields -Elasticsearch can index text using: +Elasticsearch can index text using datatypes: -* *Text.* Text indexing allows for full text search, or searching arbitrary words that +* *`text`* Text indexing allows for full text search, or searching arbitrary words that are part of the field. See {ref}/text.html[Text datatype] in the {es} Reference Guide. -* *Keywords.* Keyword indexing offers faster exact match filtering and prefix search, - and makes aggregations (for {kib} visualizations) possible. +* *`keyword`* Keyword indexing offers faster exact match filtering, + prefix search (like autocomplete), + and makes aggregations (like {kib} visualizations) possible. 
See the {es} Reference Guide for more information on {ref}/query-dsl-term-query.html[exact match filtering], {ref}/query-dsl-prefix-query.html[prefix search], or {ref}/search-aggregations.html[aggregations]. [float] -===== Default Elasticsearch convention +===== Default Elasticsearch convention for indexing text fields Unless your index mapping or index template specifies otherwise (as the ECS index template does), -Elasticsearch indexes text field as `text` at the canonical field name, +Elasticsearch indexes a text field as `text` at the canonical field name, and indexes a second time as `keyword`, nested in a multi-field. Default Elasticsearch convention: @@ -49,23 +52,28 @@ Default Elasticsearch convention: * Multi-field: `myfield.keyword` is `keyword` [float] -===== ECS multi-field convention for text +===== ECS convention for indexing text fields + +ECS flips the convention around. For monitoring use cases, `keyword` indexing is needed almost exclusively, with -full text search on very few fields. Given this premise, ECS defaults +full text search needed on very few fields. +Moreover, indexing for full text search on lots of fields, where it's not expected +to be used is wasteful of resources. + +Given these two premises, ECS defaults all text indexing to `keyword` at the top level (with very few exceptions). Any use case that requires full text search indexing on additional fields can add a {ref}/multi-fields.html[multi-field] for full text search. Doing so does not conflict with ECS, as the canonical field name will remain `keyword` indexed. -ECS multi-field convention for text: +So the ECS multi-field convention for text is: * Canonical field: `myfield` is `keyword` * Multi-field: `myfield.text` is `text` -[float] -===== Exceptions +**Exceptions** The only exceptions to this convention are fields `message` and `error.message`, which are indexed for full text search only, with no multi-field. 
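The two multi-field conventions described in this patch can be compared side by side. The sketch below is an illustration only: the mapping fragments are written as plain Python dicts in the shape of an Elasticsearch `properties` object, `myfield` is the placeholder name used in the text above, and `indexed_names` is a hypothetical helper, not part of ECS or of any Elasticsearch client.

```python
# Illustrative sketch only: mapping fragments as plain dicts.
# "myfield" is a placeholder name; indexed_names is a hypothetical helper.

# Default Elasticsearch convention: text at the canonical name,
# keyword in a multi-field.
default_properties = {
    "myfield": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}},
    }
}

# ECS convention: keyword at the canonical name; a text multi-field
# can be added where full text search is needed.
ecs_properties = {
    "myfield": {
        "type": "keyword",
        "fields": {"text": {"type": "text"}},
    }
}

# Exception described above: indexed for full text search only,
# with no multi-field.
message_properties = {"message": {"type": "text"}}


def indexed_names(properties):
    """Return {searchable field name: datatype}, including multi-fields."""
    names = {}
    for field, spec in properties.items():
        names[field] = spec["type"]
        for sub, sub_spec in spec.get("fields", {}).items():
            names[f"{field}.{sub}"] = sub_spec["type"]
    return names
```

Calling `indexed_names` on each fragment makes the flip visible: the canonical name resolves to `text` under the default convention and to `keyword` under the ECS convention, with the other datatype reachable through the multi-field.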
From 8c289e54fe4815a070d23c3c718cbaf998185df3 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 12:57:58 -0400 Subject: [PATCH 23/32] I think it's important to mention the version of ECS this documentation is about. The sidebar selector is probably not there in the PDF books :-) --- docs/fields.asciidoc | 2 ++ docs/index.asciidoc | 2 ++ scripts/generators/asciidoc_fields.py | 11 +++++++---- 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/docs/fields.asciidoc b/docs/fields.asciidoc index cc88f9d9dd..220ce80133 100644 --- a/docs/fields.asciidoc +++ b/docs/fields.asciidoc @@ -2,6 +2,8 @@ [[ecs-field-reference]] == {ecs} Field Reference +This is the documentation of ECS version 1.1.0-dev. + ECS defines multiple groups of related fields. They are called "field sets". The <> field set is the only one whose fields are defined at the root of the event. diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 22cd6e77c5..95687bbd85 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -8,6 +8,8 @@ include::{asciidoc-dir}/../../shared/attributes.asciidoc[] [[ecs-reference]] == Overview +This is the documentation of ECS version 1.1.0-dev. + [float] === What is ECS? 
diff --git a/scripts/generators/asciidoc_fields.py b/scripts/generators/asciidoc_fields.py index 5ed933e2ba..43c0b645d8 100644 --- a/scripts/generators/asciidoc_fields.py +++ b/scripts/generators/asciidoc_fields.py @@ -4,7 +4,7 @@ def generate(ecs_nested, ecs_version): - save_asciidoc('docs/fields.asciidoc', page_field_index(ecs_nested)) + save_asciidoc('docs/fields.asciidoc', page_field_index(ecs_nested, ecs_version)) save_asciidoc('docs/field-details.asciidoc', page_field_details(ecs_nested)) # Helpers @@ -23,8 +23,8 @@ def save_asciidoc(file, text): # Field Index -def page_field_index(ecs_nested): - page_text = index_header() +def page_field_index(ecs_nested, ecs_version): + page_text = index_header(ecs_version) for fieldset in ecs_helpers.dict_sorted_by_keys(ecs_nested, ['group', 'name']): page_text += render_field_index_row(fieldset) page_text += table_footer() @@ -146,11 +146,14 @@ def table_footer(): # Field Index -def index_header(): +def index_header(ecs_version): + # Not using format() because then asciidoc {ecs}, {es}, etc are resolved. return ''' [[ecs-field-reference]] == {ecs} Field Reference +This is the documentation of ECS version ''' + ecs_version + '''. + ECS defines multiple groups of related fields. They are called "field sets". The <> field set is the only one whose fields are defined at the root of the event. From 36f9739bd4fd0cc911f2402b56e740ad8dfccae5 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 13:07:30 -0400 Subject: [PATCH 24/32] Update ECS definition with my latest attempt... Featuring what ECS actually **is**, as opposed to what it does :-) --- docs/glossary.asciidoc | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/glossary.asciidoc b/docs/glossary.asciidoc index e148b1fea1..a1e143f981 100644 --- a/docs/glossary.asciidoc +++ b/docs/glossary.asciidoc @@ -3,7 +3,8 @@ [[ecs-glossary-ecs]] ECS:: -*Elastic Common Schema*. 
The Elastic Common Schema (ECS) defines a common set of fields, -their datatype, and gives guidance on their correct usage. -ECS is used to improve uniformity of event data ingested into Elasticsearch, -such as logs and metrics. +*Elastic Common Schema*. The Elastic Common Schema (ECS) is a document schema +for Elasticsearch, for use cases such as logging and metrics. +ECS defines a common set of fields, their datatype, +and gives guidance on their correct usage. +ECS is used to improve uniformity of event data coming from different sources. From 61a6d161f5fe4cc82ad0240e446c0cba80a66304 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 13:17:06 -0400 Subject: [PATCH 25/32] Move the updated definitions of core and extended to 'guidelines' Other page simply links to it, to reduce duplication of content. The 'converting' page is just a series of pointers, and shouldn't be the canonical source of any fundamental information that defines the schema. --- docs/converting.asciidoc | 17 ++--------------- docs/using-guidelines.asciidoc | 24 ++++++++++++++++-------- 2 files changed, 18 insertions(+), 23 deletions(-) diff --git a/docs/converting.asciidoc b/docs/converting.asciidoc index a72ddf28c3..1db1fc7818 100644 --- a/docs/converting.asciidoc +++ b/docs/converting.asciidoc @@ -14,21 +14,8 @@ Before you start a conversion, be sure that you understand the basics below. [[core-or-ext]] ==== Core and extended levels -* *Core fields.* Fields that are most common across all use cases are defined as *core fields*. -+ -These generalized fields are used by analysis content -(searches, visualizations, dashboards, alerts, machine learning jobs, reports) -across use cases. Analysis content designed to operate on these -fields should work properly on data from any relevant source. -+ -Focus on populating these fields first. - -* *Extended fields.* Any field that is not a core field is defined as an *extended field*. 
-Extended fields may apply to more narrow use cases, or may be more open -to interpretation depending on the use case. Extended fields are more likely to -change over time. - -Each {ecs} <> in a table is identified as core or extended. +Make sure you understand the distinction between Core and Extended fields, +as explained in the <>. [float] [[ecs-conv]] diff --git a/docs/using-guidelines.asciidoc b/docs/using-guidelines.asciidoc index d53fbf41ad..36847357df 100644 --- a/docs/using-guidelines.asciidoc +++ b/docs/using-guidelines.asciidoc @@ -5,18 +5,26 @@ The {ecs} schema serves best when you follow schema guidelines and best practices. [float] -==== Types of fields +==== ECS Field Levels ECS defines "Core" and "Extended" fields. -* *Core fields.* Fields that are the most common across all use cases. - Focus on populating these fields first. If consuming ECS events, expect - these fields to be populated in most situations. +* *Core fields.* Fields that are most common across all use cases are defined as *core fields*. ++ +These generalized fields are used by analysis content +(searches, visualizations, dashboards, alerts, machine learning jobs, reports) +across use cases. Analysis content designed to operate on these +fields should work properly on data from any relevant source. ++ +Focus on populating these fields first. + +* *Extended fields.* Any field that is not a core field is defined as an *extended field*. +Extended fields may apply to more narrow use cases, or may be more open +to interpretation depending on the use case. Extended fields are more likely to +change over time. + +Each {ecs} <> in a table is identified as core or extended. -* *Extended fields.* Any fields that are not a core field. - Extended fields may apply to more narrow use cases, or may be more open - to interpretation depending on the use case. Extended fields are more likely to - change over time. 
[float] From 4f0b407b018aaf95bd7e50f23800de4eb9a83849 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 13:28:56 -0400 Subject: [PATCH 26/32] Additional details about field levels in overview paragraph --- docs/index.asciidoc | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 95687bbd85..e341d34b15 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -13,23 +13,27 @@ This is the documentation of ECS version 1.1.0-dev. [float] === What is ECS? -The Elastic Common Schema (ECS) defines a common set of fields, -their datatype, and gives guidance on their correct usage. -ECS is used to improve uniformity of event data ingested into Elasticsearch. +The Elastic Common Schema (ECS) is an open source specification, +developed with support from the Elastic user community. +ECS defines a common set of fields to be used when storing event data in Elasticsearch, +such as logs and metrics. + +ECS specifies field names and Elasticsearch datatypes for each field, +and provides descriptions and example usage. +ECS also groups fields into ECS levels, which are used to signal how much a field +is expected to be present. You can learn more about ECS levels in <>. +Finally, ECS also provides a set of naming guidelines for adding custom fields. + +The goal of ECS is to enable and encourage users of Elasticsearch to normalize their event data, +so that they can better analyze, visualize, and correlate the data represented in their events. +ECS has been scoped to accommodate a wide variety of events, spanning: + +- *Event sources*: whether the source of your event is an Elastic product, + a third-party product, or a custom application built by your organization. +- *Ingestion architectures*: whether the ingestion path for your events includes Beats processors, + Logstash, Elasticsearch ingest node, all of the above, or none of the above.
+- *Consumers*: whether consumed by API, Kibana queries, dashboards, apps, or other means. -Following ECS ensures your monitoring events follow a predictable schema, at all levels: - -- *Event source*: whether the source of your event is an Elastic product, - a third party product, or custom events generated by your application. -- *Event pipeline*: in any kind of event pipeline, such as - Beats processors, Logstash or Elasticsearch ingest node. -- *Consumption*: API consumers, Kibana applications and Kibana dashboards are - all simpler to build, maintain or share, when they are based on ECS. - -Following ECS reduces dependencies between unrelated parts of your event pipeline. - -The ultimate goal of ECS is to help you correlate data from various sources -like logs, metrics, IT operations analytics, and security analytics together. [float] === My events don't map with ECS From 7b68602cbce6df545f3a2a09db9ac44fe259414e Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 16:47:19 -0400 Subject: [PATCH 27/32] Ensure multi-line field set definitions actually render multiple paragraphs --- docs/field-details.asciidoc | 22 ++++++++++++++++++++++ scripts/generators/asciidoc_fields.py | 6 +++++- 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/docs/field-details.asciidoc b/docs/field-details.asciidoc index 15d1b216c0..fd7ca0f6a7 100644 --- a/docs/field-details.asciidoc +++ b/docs/field-details.asciidoc @@ -69,6 +69,7 @@ example: `["production", "env2"]` === Agent Fields The agent fields contain the data about the software entity, if any, that collects, detects, or observes events on a host, or takes measurements on a host. + Examples include Beats. Agents may also run on observers. ECS agent.* fields shall be populated with details of the agent running on the host or observer where the event happened or the measurement was taken. 
==== Agent Field Details @@ -145,7 +146,9 @@ example: `6.0.0-rc2` === Client Fields A client is defined as the initiator of a network connection for events regarding sessions, connections, or bidirectional flow records. + For TCP events, the client is the initiator of the TCP connection that sends the SYN packet(s). For other protocols, the client is generally the initiator or requestor in the network transaction. Some systems use the term "originator" to refer the client in TCP connections. The client fields describe details about the system acting as the client in the network event. Client fields are usually populated in conjunction with server fields. Client fields are generally not populated for packet-level events. + Client / server representations can add semantic context to an exchange, which is helpful to visualize the data in certain situations. If your context falls in that category, you should still ensure that source and destination are filled appropriately. ==== Client Field Details @@ -363,6 +366,7 @@ example: `us-east-1` === Container Fields Container fields are used for meta information about the specific container that is the source of information. + These fields help correlate data based containers from any runtime. ==== Container Field Details @@ -445,6 +449,7 @@ example: `docker` === Destination Fields Destination fields describe details about the destination of a packet/event. + Destination fields are usually populated in conjunction with source fields. ==== Destination Field Details @@ -596,6 +601,7 @@ example: `1.0.0` === Error Fields These fields can represent errors of any kind. + Use them for errors that happen while fetching events or in cases where the event itself contains an error. ==== Error Field Details @@ -645,6 +651,7 @@ type: text === Event Fields The event fields are used for context information about the log or metric event itself. + A log is defined as an event containing details of something that happened. 
Log events must include the time at which the thing happened. Examples of log events include a process starting on a host, a network packet being sent from a source to a destination, or a network connection between a client and a server being initiated or closed. A metric is defined as an event containing one or more numerical or categorical measurements and the time at which the measurement was taken. Examples of metric events include memory pressure measured on a host, or vulnerabilities measured on a scanned host. ==== Event Field Details @@ -873,6 +880,7 @@ type: keyword === File Fields A file is defined as a set of information that has been created on, or has existed on a filesystem. + File objects can be associated with host events, network events, and/or file events (e.g., those produced by File Integrity Monitoring [FIM] products or services). File fields provide details about the affected file associated with the event or metric. ==== File Field Details @@ -1044,6 +1052,7 @@ type: keyword === Geo Fields Geo fields can carry data about a specific location related to an event. + This geolocation information can be derived from techniques such as Geo IP, or be user-supplied. ==== Geo Field Details @@ -1205,6 +1214,7 @@ Note also that the `group` fields may be used directly at the top level. === Host Fields A host is defined as a general computing instance. + ECS host.* fields should be populated with details about the host on which the event happened, or from which the measurement was taken. Host types include hardware, virtual machines, Docker containers, and Kubernetes nodes. ==== Host Field Details @@ -1504,6 +1514,7 @@ example: `Sep 19 08:26:10 localhost My log` === Network Fields The network is defined as the communication path over which a host or network event happens. + The network.* fields should be populated with details about the network activity associated with an event. 
==== Network Field Details @@ -1657,6 +1668,7 @@ example: `ipv4` === Observer Fields An observer is defined as a special network, security, or application device used to detect, observe, or create network, security, or application-related events and metrics. + This could be a custom hardware appliance or a server that has been configured to run special network, security, or application software. Examples include firewalls, intrusion detection/prevention systems, network monitoring sensors, web application firewalls, data loss prevention systems, and APM servers. The observer.* fields shall be populated with details of the system, if any, that detects, observes and/or creates a network, security, or application event or metric. Message queues and ETL components used in processing events or metrics are not considered observers in ECS. ==== Observer Field Details @@ -1780,6 +1792,7 @@ type: keyword === Organization Fields The organization fields enrich data with information about the company or entity the data is associated with. + These fields help you arrange or filter data stored in an index by one or multiple organizations. ==== Organization Field Details @@ -1908,6 +1921,7 @@ Note also that the `os` fields are not expected to be used directly at the top l === Process Fields These fields contain information about a process. + These fields can help you correlate metrics information with a process id/name from a log message. The `process.pid` often stays in the metric itself and is copied to the global field for correlation. ==== Process Field Details @@ -2026,7 +2040,9 @@ example: `/home/alice` === Related Fields This field set is meant to facilitate pivoting around a piece of data. + Some pieces of information can be seen in many places in an ECS event. To facilitate searching for them, store an array of all seen values to their corresponding field in `related.`. 
+ A concrete example is IP addresses, which can be under host, observer, source, destination, client, server, and network.forwarded_ip. If you append all IPs to `related.ip`, you can then search for a given IP trivially, no matter where it appeared, by querying `related.ip:a.b.c.d`. ==== Related Field Details @@ -2054,7 +2070,9 @@ type: ip === Server Fields A Server is defined as the responder in a network connection for events regarding sessions, connections, or bidirectional flow records. + For TCP events, the server is the receiver of the initial SYN packet(s) of the TCP connection. For other protocols, the server is generally the responder in the network transaction. Some systems actually use the term "responder" to refer the server in TCP connections. The server fields describe details about the system acting as the server in the network event. Server fields are usually populated in conjunction with client fields. Server fields are generally not populated for packet-level events. + Client / server representations can add semantic context to an exchange, which is helpful to visualize the data in certain situations. If your context falls in that category, you should still ensure that source and destination are filled appropriately. ==== Server Field Details @@ -2179,6 +2197,7 @@ type: long === Service Fields The service fields describe the service for or from which the data was collected. + These fields help you find and correlate logs for a specific service and version. ==== Service Field Details @@ -2270,6 +2289,7 @@ example: `3.2.4` === Source Fields Source fields describe details about the source of a packet/event. + Source fields are usually populated in conjunction with destination fields. ==== Source Field Details @@ -2525,6 +2545,7 @@ type: keyword === User Fields The user fields describe information about the user that is relevant to the event. + Fields can have one entry or multiple entries. 
If a user has more than one id, provide an array that includes all of them. ==== User Field Details @@ -2624,6 +2645,7 @@ Note also that the `user` fields may be used directly at the top level. === User agent Fields The user_agent fields normally come from a browser request. + They often show up in web service logs coming from the parsed user agent string. ==== User agent Field Details diff --git a/scripts/generators/asciidoc_fields.py b/scripts/generators/asciidoc_fields.py index 43c0b645d8..af765685d6 100644 --- a/scripts/generators/asciidoc_fields.py +++ b/scripts/generators/asciidoc_fields.py @@ -53,7 +53,7 @@ def render_fieldset(fieldset, ecs_nested): text = field_details_table_header().format( fieldset_title=fieldset['title'], fieldset_name=fieldset['name'], - fieldset_description=fieldset['description'] + fieldset_description=render_asciidoc_paragraphs(fieldset['description']) ) for field in ecs_helpers.dict_sorted_by_keys(fieldset['fields'], 'flat_name'): @@ -67,6 +67,10 @@ def render_fieldset(fieldset, ecs_nested): return text +def render_asciidoc_paragraphs(string): + '''Simply double the \n''' + return string.replace("\n", "\n\n") + def render_field_details_row(field): example = '' if 'example' in field: From b863e600b561e2e06b1b65fb8e831e71699b8a1a Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 16:49:55 -0400 Subject: [PATCH 28/32] Ensure field descriptions also render paragraphs correctly --- docs/field-details.asciidoc | 84 +++++++++++++++++++++++++++ scripts/generators/asciidoc_fields.py | 2 +- 2 files changed, 85 insertions(+), 1 deletion(-) diff --git a/docs/field-details.asciidoc b/docs/field-details.asciidoc index fd7ca0f6a7..a1f391d5e6 100644 --- a/docs/field-details.asciidoc +++ b/docs/field-details.asciidoc @@ -14,8 +14,11 @@ The `base` field set contains all fields which are on the top level. These field | @timestamp | Date/time when the event originated. 
+ This is the date/time extracted from the event, typically representing when the event was generated by the source. + If the event source has no original timestamp, this value is typically populated by the first time the event was received by the pipeline. + Required field for all events. type: date @@ -28,7 +31,9 @@ example: `2016-05-23T08:05:34.853Z` | labels | Custom key/value pairs. + Can be used to add meta information to events. Should not contain nested objects. All values are stored as keyword. + Example: `docker` and `k8s` labels. type: object @@ -41,7 +46,9 @@ example: `{'application': 'foo-bar', 'env': 'production'}` | message | For log events the message field contains the log message, optimized for viewing in a log viewer. + For structured logs without an original message field, other fields can be concatenated to form a human-readable summary of the event. + If multiple messages exist, they can be combined into one message. type: text @@ -82,6 +89,7 @@ Examples include Beats. Agents may also run on observers. ECS agent.* fields sha | agent.ephemeral_id | Ephemeral identifier of this agent (if one exists). + This id normally changes across restarts, but `agent.id` does not. type: keyword @@ -94,6 +102,7 @@ example: `8a4f500f` | agent.id | Unique identifier of this agent (if one exists). + Example: For Beats this would be beat.id. type: keyword @@ -106,7 +115,9 @@ example: `8a4f500d` | agent.name | Custom name of the agent. + This is a name that can be given to an agent. This can be helpful if for example two Filebeat instances are running on the same host but a human readable separation is needed on which Filebeat instance data is coming from. + If no name is given, the name is often left empty. type: keyword @@ -119,6 +130,7 @@ example: `foo` | agent.type | Type of the agent. + The agent type stays always the same and should be given by the agent used. 
In case of Filebeat the agent would always be Filebeat also if two Filebeat instances are run on the same machine. type: keyword @@ -161,6 +173,7 @@ Client / server representations can add semantic context to an exchange, which i | client.address | Some event client addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. + Then it should be duplicated to `.ip` or `.domain`, depending on which one it is. type: keyword @@ -195,6 +208,7 @@ type: keyword | client.ip | IP address of the client. + Can be one or multiple IPv4 or IPv6 addresses. type: ip @@ -284,6 +298,7 @@ Fields related to the cloud or infrastructure the events are coming from. | cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. + Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. type: keyword @@ -462,6 +477,7 @@ Destination fields are usually populated in conjunction with source fields. | destination.address | Some event destination addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. + Then it should be duplicated to `.ip` or `.domain`, depending on which one it is. type: keyword @@ -496,6 +512,7 @@ type: keyword | destination.ip | IP address of the destination. + Can be one or multiple IPv4 or IPv6 addresses. type: ip @@ -585,6 +602,7 @@ Meta-information specific to ECS. | ecs.version | ECS version this event conforms to. `ecs.version` is a required field and must exist in all events. + When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events. type: keyword @@ -664,6 +682,7 @@ A log is defined as an event containing details of something that happened. 
Log | event.action | The action captured by the event. + This describes the information in the event. It is more specific than `event.category`. Examples are `group-add`, `process-started`, `file-created`. The value is normally defined by the implementer. type: keyword @@ -676,6 +695,7 @@ example: `user-password-change` | event.category | Event category. + This contains high-level information about the contents of the event. It is more generic than `event.action`, in the sense that typically a category contains multiple actions. Warning: In future versions of ECS, we plan to provide a list of acceptable values for this field, please use with caution. type: keyword @@ -688,8 +708,11 @@ example: `user-management` | event.created | event.created contains the date/time when the event was first read by an agent, or by your pipeline. + This field is distinct from @timestamp in that @timestamp typically contain the time extracted from the original event. + In most situations, these two timestamps will be slightly different. The difference can be used to calculate the delay between your source generating an event, and the time when your agent first processed it. This can be used to monitor your agent's or pipeline's ability to keep up with your event source. + In case the two timestamps are identical, @timestamp should be used. type: date @@ -702,6 +725,7 @@ type: date | event.dataset | Name of the dataset. + The concept of a `dataset` (fileset / metricset) is used in Beats as a subset of modules. It contains the information which is currently stored in metricset.name and metricset.module or fileset.name. type: keyword @@ -714,6 +738,7 @@ example: `stats` | event.duration | Duration of the event in nanoseconds. + If event.start and event.end are known this value should be the difference between the end and start time. type: long @@ -759,6 +784,7 @@ example: `8a4f500d` | event.kind | The kind of the event. 
+ This gives information about what type of information the event contains, without being specific to the contents of the event. Examples are `event`, `state`, `alarm`. Warning: In future versions of ECS, we plan to provide a list of acceptable values for this field, please use with caution. type: keyword @@ -771,6 +797,7 @@ example: `state` | event.module | Name of the module this data is coming from. + This information is coming from the modules used in Beats or Logstash. type: keyword @@ -783,6 +810,7 @@ example: `mysql` | event.original | Raw text message of entire event. Used to demonstrate log integrity. + This field is not indexed and doc_values are disabled. It cannot be searched, but it can be retrieved from `_source`. type: keyword @@ -795,6 +823,7 @@ example: `Sep 19 08:26:10 host CEF:0|Security| threatmanager|1.0& | event.outcome | The outcome of the event. + If the event describes an action, this fields contains the outcome of that action. Examples outcomes are `success` and `failure`. Warning: In future versions of ECS, we plan to provide a list of acceptable values for this field, please use with caution. type: keyword @@ -818,6 +847,7 @@ type: float | event.risk_score_norm | Normalized risk score or priority of the event, on a scale of 0 to 100. + This is mainly useful if you use more than one system that assigns risk scores, and you want to see a normalized value across all systems. type: float @@ -852,6 +882,7 @@ type: date | event.timezone | This field should be populated when the event's timestamp does not include timezone information already (e.g. default Syslog timestamps). It's optional otherwise. + Acceptable timezone formats are: a canonical ID (e.g. "Europe/Amsterdam"), abbreviated (e.g. "EST") or an HH:mm differential (e.g. "-05:00"). type: keyword @@ -864,6 +895,7 @@ type: keyword | event.type | Reserved for future usage. + Please avoid using this field for user data. 
type: keyword @@ -915,6 +947,7 @@ type: keyword | file.extension | File extension. + This should allow easy filtering by file extensions. type: keyword @@ -1120,7 +1153,9 @@ example: `{ "lon": -73.614830, "lat": 45.505918 }` | geo.name | User-defined description of a location, at the level of granularity they care about. + Could be the name of their data centers, the floor number, if this describes a local physical entity, city names. + Not typically used in automated geolocation. type: keyword @@ -1238,6 +1273,7 @@ example: `x86_64` | host.hostname | Hostname of the host. + It normally contains what the `hostname` command returns on the host machine. type: keyword @@ -1250,7 +1286,9 @@ type: keyword | host.id | Unique host id. + As hostname is not always unique, use values that are meaningful in your environment. + Example: The current usage of `beat.name`. type: keyword @@ -1285,6 +1323,7 @@ type: keyword | host.name | Name of the host. + It can contain what `hostname` returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use. type: keyword @@ -1297,6 +1336,7 @@ type: keyword | host.type | Type of host. + For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment. type: keyword @@ -1392,6 +1432,7 @@ example: `1437` | http.request.method | HTTP request method. + The field value must be normalized to lowercase for querying. See the documentation section "Implementing ECS". type: keyword @@ -1485,6 +1526,7 @@ Fields which are specific to log events. | log.level | Original log level of the log event. + Some examples are `warn`, `error`, `i`. type: keyword @@ -1497,7 +1539,9 @@ example: `err` | log.original | This is the original log message and contains the full log message before splitting it up in multiple parts. 
+ In contrast to the `message` field which can contain an extracted part of the log message, this field contains the original, full log message. It can have already some modifications applied like encoding or new lines removed to clean up the log message. + This field is not indexed and doc_values are disabled so it can't be queried but the value can be retrieved from `_source`. type: keyword @@ -1527,6 +1571,7 @@ The network.* fields should be populated with details about the network activity | network.application | A name given to an application level protocol. This can be arbitrarily assigned for things like microservices, but also apply to things like skype, icq, facebook, twitter. This would be used in situations where the vendor or service can be decoded such as from the source/dest IP owners, ports, or wire format. + The field value must be normalized to lowercase for querying. See the documentation section "Implementing ECS". type: keyword @@ -1539,6 +1584,7 @@ example: `aim` | network.bytes | Total bytes transferred in both directions. + If `source.bytes` and `destination.bytes` are known, `network.bytes` is their sum. type: long @@ -1551,6 +1597,7 @@ example: `368` | network.community_id | A hash of source and destination IPs and ports, as well as the protocol used in a communication. This is a tool-agnostic standard to identify flows. + Learn more at https://github.com/corelight/community-id-spec. type: keyword @@ -1563,14 +1610,23 @@ example: `1:hO+sN4H+MG5MY/8hIrXPqc4ZQz0=` | network.direction | Direction of the network traffic. + Recommended values are: + * inbound + * outbound + * internal + * external + * unknown + + When mapping events from a host-based monitoring context, populate this field from the host's point of view. + When mapping events from a network or perimeter-based monitoring context, populate this field from the point of view of your network perimeter. 
type: keyword @@ -1616,6 +1672,7 @@ example: `Guest Wifi` | network.packets | Total packets transferred in both directions. + If `source.packets` and `destination.packets` are known, `network.packets` is their sum. type: long @@ -1628,6 +1685,7 @@ example: `24` | network.protocol | L7 Network protocol name. ex. http, lumberjack, transport protocol. + The field value must be normalized to lowercase for querying. See the documentation section "Implementing ECS". type: keyword @@ -1640,6 +1698,7 @@ example: `http` | network.transport | Same as network.iana_number, but instead using the Keyword name of the transport layer (udp, tcp, ipv6-icmp, etc.) + The field value must be normalized to lowercase for querying. See the documentation section "Implementing ECS". type: keyword @@ -1652,6 +1711,7 @@ example: `tcp` | network.type | In the OSI Model this would be the Network Layer. ipv4, ipv6, ipsec, pim, etc + The field value must be normalized to lowercase for querying. See the documentation section "Implementing ECS". type: keyword @@ -1725,6 +1785,7 @@ type: keyword | observer.type | The type of the observer the data is coming from. + There is no predefined list of observer types. Some examples are `forwarder`, `firewall`, `ids`, `ips`, `proxy`, `poller`, `sensor`, `APM server`. type: keyword @@ -1934,6 +1995,7 @@ These fields can help you correlate metrics information with a process id/name f | process.args | Array of process arguments. + May be filtered to protect sensitive information. type: keyword @@ -1957,6 +2019,7 @@ example: `/usr/bin/ssh` | process.name | Process name. + Sometimes called program name or similar. type: keyword @@ -2013,6 +2076,7 @@ example: `4242` | process.title | Process title. + The proctitle, some times the same as process name. Can also be different: for example a browser setting its title to the web page currently opened. 
type: keyword @@ -2085,6 +2149,7 @@ Client / server representations can add semantic context to an exchange, which i | server.address | Some event server addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. + Then it should be duplicated to `.ip` or `.domain`, depending on which one it is. type: keyword @@ -2119,6 +2184,7 @@ type: keyword | server.ip | IP address of the server. + Can be one or multiple IPv4 or IPv6 addresses. type: ip @@ -2210,6 +2276,7 @@ These fields help you find and correlate logs for a specific service and version | service.ephemeral_id | Ephemeral identifier of this service (if one exists). + This id normally changes across restarts, but `service.id` does not. type: keyword @@ -2222,7 +2289,9 @@ example: `8a4f500f` | service.id | Unique identifier of the running service. + This id should uniquely identify this service. This makes it possible to correlate logs and metrics for one specific service. + Example: If you are experiencing issues with one redis instance, you can filter on that id to see metrics and logs for that single instance. type: keyword @@ -2235,8 +2304,11 @@ example: `d37e5ebfe0ae6c4972dbe9f0174a1637bb8247f6` | service.name | Name of the service data is collected from. + The name of the service is normally user given. This allows if two instances of the same service are running on the same machine they can be differentiated by the `service.name`. + Also it allows for distributed services that run on multiple hosts to correlate the related instances based on the name. + In the case of Elasticsearch the service.name could contain the cluster name. For Beats the service.name is by default a copy of the `service.type` field if no name is specified. type: keyword @@ -2260,7 +2332,9 @@ type: keyword | service.type | The type of the service data is collected from. 
+ The type can be used to group and correlate logs and metrics from one service type. + Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. type: keyword @@ -2273,6 +2347,7 @@ example: `elasticsearch` | service.version | Version of the service the data was collected from. + This allows to look at a data set only for a specific version of a service. type: keyword @@ -2302,6 +2377,7 @@ Source fields are usually populated in conjunction with destination fields. | source.address | Some event source addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the `.address` field. + Then it should be duplicated to `.ip` or `.domain`, depending on which one it is. type: keyword @@ -2336,6 +2412,7 @@ type: keyword | source.ip | IP address of the source. + Can be one or multiple IPv4 or IPv6 addresses. type: ip @@ -2425,6 +2502,7 @@ URL fields provide support for complete or partial URLs, and supports the breaki | url.domain | Domain of the url, such as "www.elastic.co". + In some cases a URL may refer to an IP and/or port directly, without a domain name. In this case, the IP address would go to the `domain` field. type: keyword @@ -2437,6 +2515,7 @@ example: `www.elastic.co` | url.fragment | Portion of the url after the `#`, such as "top". + The `#` is not part of the fragment. type: keyword @@ -2460,7 +2539,9 @@ example: `https://www.elastic.co:443/search?q=elasticsearch#top` | url.original | Unmodified original url as seen in the event source. + Note that in network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. + This field is meant to represent the URL as it was observed, complete or not. type: keyword @@ -2506,6 +2587,7 @@ example: `443` | url.query | The query field describes the query string of the request, such as "q=elasticsearch". 
+ The `?` is excluded from the query string. If a URL contains no `?`, there is no query field. If there is a `?` but no query, the query field exists with an empty string. The `exists` query can be used to differentiate between the two cases. type: keyword @@ -2518,6 +2600,7 @@ type: keyword | url.scheme | Scheme of the request, such as "https". + Note: The `:` is not part of the scheme. type: keyword @@ -2580,6 +2663,7 @@ example: `Albert Einstein` | user.hash | Unique user hash to correlate information for a user in anonymized form. + Useful if `user.id` or `user.name` contain confidential information and cannot be used. type: keyword diff --git a/scripts/generators/asciidoc_fields.py b/scripts/generators/asciidoc_fields.py index af765685d6..199979d5da 100644 --- a/scripts/generators/asciidoc_fields.py +++ b/scripts/generators/asciidoc_fields.py @@ -77,7 +77,7 @@ def render_field_details_row(field): example = "example: `{}`".format(str(field['example'])) text = field_details_row().format( field_flat_name=field['flat_name'], - field_description=field['description'], + field_description=render_asciidoc_paragraphs(field['description']), field_example=example, field_level=field['level'], field_type=field['type'], From ce1edacc662d5c6cac599f7e4aa49015f1511aeb Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Fri, 22 Mar 2019 16:59:24 -0400 Subject: [PATCH 29/32] Fix JSON code block in faq --- docs/faq.asciidoc | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/docs/faq.asciidoc b/docs/faq.asciidoc index 4b250ff056..c427d6f24c 100644 --- a/docs/faq.asciidoc +++ b/docs/faq.asciidoc @@ -52,12 +52,11 @@ ECS uses the dot notation to represent nested objects. 
Ingesting `user.firstname: Nicolas` and `user.lastname: Ruflin` is identical to ingesting the following JSON: -``` -"user": { - "firstname": "Nicolas", - "lastname": "Ruflin" -} -``` +[source,json] + "user": { + "firstname": "Nicolas", + "lastname": "Ruflin" + } In Elasticsearch, `user` is represented as an {ref}/object.html[object datatype]. In the case of the underline notation, both are just From 238e07e93f1da5a953af3dfeb2975ea5dca7b2e5 Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Mon, 25 Mar 2019 08:55:28 -0400 Subject: [PATCH 30/32] make fmt --- scripts/generators/asciidoc_fields.py | 1 + 1 file changed, 1 insertion(+) diff --git a/scripts/generators/asciidoc_fields.py b/scripts/generators/asciidoc_fields.py index 199979d5da..372508aa85 100644 --- a/scripts/generators/asciidoc_fields.py +++ b/scripts/generators/asciidoc_fields.py @@ -71,6 +71,7 @@ def render_asciidoc_paragraphs(string): '''Simply double the \n''' return string.replace("\n", "\n\n") + def render_field_details_row(field): example = '' if 'example' in field: From 9f659739ae7ec5fc2e9c821b87f237117168999e Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Mon, 25 Mar 2019 10:27:40 -0400 Subject: [PATCH 31/32] Streamline the SemVer text in the overview page --- docs/index.asciidoc | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/docs/index.asciidoc b/docs/index.asciidoc index e341d34b15..f5ed3f6782 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -45,9 +45,8 @@ mapped to ECS, you can simply add them to your events, using custom field names. [float] === Maturity -With ECS turning 1.0, the team will release improvements to the schema by following -https://semver.org/[Semantic Versioning]. -Generally major ECS releases are planned to be aligned with major Elastic Stack releases. +ECS improvements are released following https://semver.org/[Semantic Versioning]. +Major ECS releases are planned to be aligned with major Elastic Stack releases. 
Any feedback on the general structure, missing fields, or existing fields is appreciated. For contributions please read the From 8f81b521a70ba1288100296abcc5f1115dbb3edc Mon Sep 17 00:00:00 2001 From: Mathieu Martin Date: Mon, 25 Mar 2019 10:44:51 -0400 Subject: [PATCH 32/32] Minor wording tweaks in conventions page --- docs/using-conventions.asciidoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/using-conventions.asciidoc b/docs/using-conventions.asciidoc index 06dac9113e..2972a0df68 100644 --- a/docs/using-conventions.asciidoc +++ b/docs/using-conventions.asciidoc @@ -54,7 +54,7 @@ Default Elasticsearch convention: [float] ===== ECS convention for indexing text fields -ECS flips the convention around. +ECS flips the above convention around. For monitoring use cases, `keyword` indexing is needed almost exclusively, with full text search needed on very few fields. @@ -62,7 +62,7 @@ Moreover, indexing for full text search on lots of fields, where it's not expect to be used is wasteful of resources. Given these two premises, ECS defaults -all text indexing to `keyword` at the top level (with very few exceptions). +all text indexing to `keyword` datatype (with very few exceptions). Any use case that requires full text search indexing on additional fields can add a {ref}/multi-fields.html[multi-field] for full text search. Doing so does not conflict with ECS,