Commit bd59006

Update logs docs for consistency and formatting (#3183) (#3195)
(cherry picked from commit 6a5abf3) Co-authored-by: Mike Birnstiehl <[email protected]>
1 parent ad05823 commit bd59006

File tree

1 file changed: +40 -47 lines changed


docs/en/observability/logs-stream.asciidoc

Lines changed: 40 additions & 47 deletions
Original file line number | Diff line number | Diff line change
@@ -165,7 +165,7 @@ POST logs-example-default/_doc
165165
}
166166
----
167167

168-
The previous command stores the document in `logs-example-default`. You can retrieve it with the following search:
168+
The previous command stores the document in `logs-example-default`. Retrieve it with the following search:
169169

170170
[source,console]
171171
----
@@ -194,7 +194,7 @@ You see something like this:
194194
}
195195
----
196196

197-
{es} indexes the `message` field by default. This means you can search for phrases like `WARN` or `Disk usage exceeds`, but you can't use the `message` field for sorting or filtering. The following command searches for `WARN` and shows the document as result.
197+
{es} indexes the `message` field by default, meaning you can search for phrases like `WARN` or `Disk usage exceeds`. For example, the following command searches for the phrase `WARN` in the log `message` field:
198198

199199
[source,console]
200200
----
@@ -210,7 +210,7 @@ GET logs-example-default/_search
210210
}
211211
----
212212

213-
Your message, however, contains all of the following potential fields. Extracting these will allow you to filter and aggregate based on these fields:
213+
While you can search for phrases in the `message` field, you can't use this field to filter log data. Your message, however, contains all of the following potential fields you can extract and use to filter and aggregate your log data:
214214

215215
- *@timestamp* – `2023-08-08T13:45:12.123Z` – Extracting this field lets you sort logs by date and time. This is helpful when you want to view your logs in the order that they occurred or identify when issues happened.
216216
- *log.level* – `WARN` – Extracting this field lets you filter logs by severity. This is helpful if you want to focus on high-severity WARN or ERROR-level logs, and reduce noise by filtering out low-severity INFO-level logs.
@@ -223,7 +223,7 @@ NOTE: These fields are part of the {ecs-ref}/ecs-reference.html[Elastic Common S
223223
[[logs-stream-extract-timestamp]]
224224
== Extract the `@timestamp` field
225225

226-
When you ingested the document in the previous section, you'll notice the `@timestamp` field in the resulting document shows when you added the data to {es}, not when the log occurred:
226+
When you ingested the document in the previous section, you might have noticed that the `@timestamp` field shows when you added the data to {es}, not when the log occurred:
227227

228228
[source,JSON]
229229
----
@@ -235,29 +235,24 @@ When you ingested the document in the previous section, you'll notice the `@time
235235
...
236236
----
237237

238-
This section shows you how to extract the `@timestamp` field from the example log so you can filter by when logs occurred and when issues happened.
238+
This section shows you how to extract the `@timestamp` field from the log message so you can filter by when the logs and issues actually occurred.
239239

240-
[source,log]
241-
----
242-
2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%.
243-
----
244-
245-
To extract the timestamp you need to:
240+
To extract the timestamp, you need to:
246241

247-
- <<logs-stream-ingest-pipeline>>
248-
- <<logs-stream-simulate-api>>
249-
- <<logs-stream-index-template>>
250-
- <<logs-stream-create-data-stream>>
242+
. <<logs-stream-ingest-pipeline>>
243+
. <<logs-stream-simulate-api>>
244+
. <<logs-stream-index-template>>
245+
. <<logs-stream-create-data-stream>>
251246

252247
[discrete]
253248
[[logs-stream-ingest-pipeline]]
254249
=== Use an ingest pipeline to extract the `@timestamp`
255250

256-
To extract the `@timestamp` field from the example log, use an ingest pipeline with a dissect processor. Ingest pipelines in {es} are used to process incoming documents. The {ref}/dissect-processor.html[dissect processor] is one of the available processors that extracts structured fields from your unstructured log message based the pattern you set. In the following example command, the dissect processor extracts the timestamp to the `@timestamp` field.
251+
Ingest pipelines consist of a series of processors that perform common transformations on incoming documents before they are indexed. To extract the `@timestamp` field from the example log, use an ingest pipeline with a dissect processor. The {ref}/dissect-processor.html[dissect processor] extracts structured fields from unstructured log messages based on a pattern you set.
257252

258-
{es} can parse string timestamps that are in `yyyy-MM-dd'T'HH:mm:ss.SSSZ` and `yyyy-MM-dd` formats into date fields. Since the log example's timestamp is in one of these formats, you don't need additional processors. If your log timestamps are more complex or use a nonstandard format, you need a {ref}/date-processor.html[date processor] to parse the timestamp into a date field. You can also use a date processor to set the timezone, change the target field, and change the output format of the timestamp.
253+
{es} can parse string timestamps that are in `yyyy-MM-dd'T'HH:mm:ss.SSSZ` and `yyyy-MM-dd` formats into date fields. Since the example log's timestamp is in one of these formats, you don't need additional processors. More complex or nonstandard timestamps require a {ref}/date-processor.html[date processor] to parse the timestamp into a date field. Date processors can also set the timezone, change the target field, and change the output format of the timestamp.
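For instance, here is a minimal sketch that pairs a dissect processor with a date processor, assuming a hypothetical log whose timestamp looks like `08/Aug/2023:13:45:12.123` (no timezone in the string); the pipeline name, format, and timezone are illustrative assumptions, not values from this guide:

[source,console]
----
PUT _ingest/pipeline/logs-example-nonstandard
{
  "description": "Hypothetical example: extract and parse a nonstandard timestamp",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{timestamp} %{message}"
      }
    },
    {
      "date": {
        "field": "timestamp",
        "formats": ["dd/MMM/yyyy:HH:mm:ss.SSS"],
        "target_field": "@timestamp",
        "timezone": "Europe/Vienna"
      }
    }
  ]
}
----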
259254

260-
This command creates an ingest pipeline with a dissect processor:
255+
In the following command, the dissect processor extracts the timestamp from the `message` field to the `@timestamp` field and leaves the rest of the message in the `message` field:
261256

262257
[source,console]
263258
----
@@ -275,16 +270,17 @@ PUT _ingest/pipeline/logs-example-default
275270
}
276271
----
277272

278-
Set these values for your pipeline:
273+
The previous command sets the following values for your ingest pipeline:
274+
279275
- `_ingest/pipeline/logs-example-default` – The name of the pipeline, `logs-example-default`, needs to match the name of your data stream. You'll set up your data stream in the next section. See the {fleet-guide}/data-streams.html#data-streams-naming-scheme[data stream naming scheme] for more information.
280276
- `field` – The field you're extracting data from, `message` in this case.
281-
- `pattern`– The pattern of the elements in your log data. The following pattern extracts the timestamp, `2023-08-08T13:45:12.123Z`, to the `@timestamp` field, while the rest of the message, `WARN 192.168.1.101 Disk usage exceeds 90%.`, stays in the `message` field. This works because the dissect processor looks for the space as a separator defined by the pattern `%{timestamp} %{message}`.
277+
- `pattern` – The pattern of the elements in your log data. The following pattern extracts the timestamp, `2023-08-08T13:45:12.123Z`, to the `@timestamp` field, while the rest of the message, `WARN 192.168.1.101 Disk usage exceeds 90%.`, stays in the `message` field. The dissect processor uses the space as a separator, as defined by the pattern `%{timestamp} %{message}`.
282278

283279
[discrete]
284280
[[logs-stream-simulate-api]]
285281
=== Test your pipeline with the simulate pipeline API
286282

287-
You can test that your ingest pipeline works as expected with the {ref}/simulate-pipeline-api.html#ingest-verbose-param[simulate pipeline API]. This runs the pipeline without storing any documents, and is great for testing your pipeline with different documents. Run this command to test your pipeline:
283+
The {ref}/simulate-pipeline-api.html#ingest-verbose-param[simulate pipeline API] runs the ingest pipeline without storing any documents. This lets you verify that your pipeline works as expected with different documents. Run the following command to test your ingest pipeline with the simulate pipeline API:
288284

289285
[source,console]
290286
----
@@ -322,7 +318,7 @@ The results should show the `@timestamp` field extracted from the `message` fiel
322318
}
323319
----
324320

325-
NOTE: Create the index pipeline using the `PUT` command in the previous section before using the simulate pipeline API.
321+
NOTE: Make sure you've created the ingest pipeline using the `PUT` command in the previous section before using the simulate pipeline API.
326322

327323
[discrete]
328324
[[logs-stream-index-template]]
@@ -352,15 +348,13 @@ PUT _index_template/logs-example-default-template
352348
}
353349
----
354350

355-
356-
357-
Set the following values for the index template:
351+
The previous command sets the following values for your index template:
358352

359353
- `index_patterns` – The index pattern needs to match your log data stream. Naming conventions for data streams are `<type>-<dataset>-<namespace>`. In this example, your logs data stream is named `logs-example-default`. Data that matches this pattern will go through your pipeline.
360354
- `data_stream` – Enables data streams.
361-
- `priority` – Index templates with higher priority take precedence over lower priority. If a data stream matches multiple index templates, the template with the higher priority is used. Built-in templates have a priority of `200`, so we recommend a priority higher than `200`.
355+
- `priority` – Index templates with a higher priority take precedence over those with a lower priority. If a data stream matches multiple index templates, {es} uses the template with the higher priority. Built-in templates have a priority of `200`, so use a priority higher than `200` for custom templates.
362356
- `index.default_pipeline` – The name of your ingest pipeline. `logs-example-default` in this case.
363-
- `composed_of` – Here you can set component templates. Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases. Elastic has several built-in templates that help when ingesting your data. See the following list for more information.
357+
- `composed_of` – Here you can set component templates. Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases. Elastic has several built-in templates that help when ingesting your data.
364358

365359
The component templates that are set in the previous index template are defined as follows:
366360

@@ -378,7 +372,7 @@ The component templates that are set in the previous index template are defined
378372
[[logs-stream-create-data-stream]]
379373
=== Create your data stream
380374

381-
Create your data stream using the {fleet-guide}/data-streams.html#data-streams-naming-scheme[data stream naming scheme]. The name needs to match the name of your pipeline. For this example, we'll name the data stream `logs-example-default` and use the example log:
375+
Create your data stream using the {fleet-guide}/data-streams.html#data-streams-naming-scheme[data stream naming scheme]. Since the name needs to match the name of your pipeline, name the data stream `logs-example-default`. Post the example log to your data stream with this command:
382376

383377
[source,console]
384378
----
@@ -388,14 +382,14 @@ POST logs-example-default/_doc
388382
}
389383
----
390384

391-
Now look at your document's details using this command:
385+
View your documents using this command:
392386

393387
[source,console]
394388
----
395389
GET /logs-example-default/_search
396390
----
397391

398-
You can see the pipeline extracted the `@timestamp` field:
392+
You should see that the pipeline has extracted the `@timestamp` field:
399393

400394
[source,JSON]
401395
----
@@ -426,9 +420,9 @@ You can now use the `@timestamp` field to sort your logs by the date and time th
426420
[[logs-stream-timestamp-troubleshooting]]
427421
=== Troubleshoot your `@timestamp` field
428422

429-
Check the following common issues for possible solutions:
423+
Check the following common timestamp issues and solutions (a date processor sketch illustrating these options follows this list):
430424

431-
- *Timestamp failure* – If your data has inconsistent date formats, you can set `ignore_failure` to `true` for your date processor. This processes logs with correctly formatted dates and ignores those with issues.
425+
- *Timestamp failure* – If your data has inconsistent date formats, set `ignore_failure` to `true` for your date processor. This processes logs with correctly formatted dates and ignores those with issues.
432426
- *Incorrect timezone* – Set your timezone using the `timezone` option on the {ref}/date-processor.html[date processor].
433427
- *Incorrect timestamp format* – Your timestamp can be a Java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. See the {ref}/mapping-date-format.html[mapping date format] for more information on timestamp formats.
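If you hit several of these issues at once, the options can be combined on a single date processor. The following command is a sketch only: the pipeline name `logs-example-tolerant-dates`, the `timestamp` field (assumed to be populated by an earlier dissect processor), the formats, and the timezone are all illustrative assumptions:

[source,console]
----
PUT _ingest/pipeline/logs-example-tolerant-dates
{
  "description": "Hypothetical example: tolerant date parsing with multiple formats",
  "processors": [
    {
      "date": {
        "field": "timestamp",
        "formats": ["ISO8601", "UNIX_MS", "dd/MMM/yyyy:HH:mm:ss.SSS"],
        "timezone": "Europe/Vienna",
        "ignore_failure": true
      }
    }
  ]
}
----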
434428

@@ -477,7 +471,7 @@ Now your pipeline will extract these fields:
477471
- The `log.level` field – `WARN`
478472
- The `message` field – `192.168.1.101 Disk usage exceeds 90%.`
479473

480-
After creating your pipeline, an index template points your log data to your pipeline. You can use the index template you created in the <<logs-stream-index-template, Extract the `@timestamp` field>> section.
474+
After creating your pipeline, an index template points your log data to your pipeline. Use the index template you created in the <<logs-stream-index-template, Extract the `@timestamp` field>> section.
481475

482476
[discrete]
483477
[[logs-stream-log-level-simulate]]
@@ -608,12 +602,11 @@ You should see the following results showing only your high-severity logs:
608602
}
609603
----
610604

611-
612605
[discrete]
613606
[[logs-stream-extract-host-ip]]
614607
== Extract the `host.ip` field
615608

616-
Extracting the `host.ip` field lets you filter logs by host IP addresses. This way you can focus on specific hosts that you’re having issues with or find disparities between hosts.
609+
Extracting the `host.ip` field lets you filter logs by host IP addresses, allowing you to focus on specific hosts that you’re having issues with or find disparities between hosts.
617610

618611
The `host.ip` field is part of the {ecs-ref}/ecs-reference.html[Elastic Common Schema (ECS)]. Through the ECS, the `host.ip` field is mapped as an {ref}/ip.html[`ip` field type]. `ip` field types allow range queries so you can find logs with IP addresses in a specific range. You can also query `ip` field types using CIDR notation to find logs from a particular network or subnet.
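For example, once `host.ip` is mapped as an `ip` field, a term query that uses CIDR notation matches any address in a subnet. This is a sketch assuming the `logs-example-default` data stream used throughout this guide, with `192.168.1.0/24` chosen to cover the example address `192.168.1.101`:

[source,console]
----
GET logs-example-default/_search
{
  "query": {
    "term": {
      "host.ip": "192.168.1.0/24"
    }
  }
}
----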
619612

@@ -662,7 +655,7 @@ Your pipeline will extract these fields:
662655
- The `host.ip` field – `192.168.1.101`
663656
- The `message` field – `Disk usage exceeds 90%.`
664657

665-
After creating your pipeline, an index template points your log data to your pipeline. You can use the index template you created in the <<logs-stream-index-template, Extract the `@timestamp` field>> section.
658+
After creating your pipeline, an index template points your log data to your pipeline. Use the index template you created in the <<logs-stream-index-template, Extract the `@timestamp` field>> section.
666659

667660
[discrete]
668661
[[logs-stream-host-ip-simulate]]
@@ -684,7 +677,7 @@ POST _ingest/pipeline/logs-example-default/_simulate
684677
}
685678
----
686679

687-
The results should show the `@timestamp`, `log.level`, and `host.ip` fields extracted from the `message` field:
680+
The results should show the `host.ip`, `@timestamp`, and `log.level` fields extracted from the `message` field:
688681

689682
[source,JSON]
690683
----
@@ -714,7 +707,7 @@ The results should show the `@timestamp`, `log.level`, and `host.ip` fields extr
714707
[[logs-stream-host-ip-query]]
715708
=== Query logs based on `host.ip`
716709

717-
You can query your logs based on the `host.ip` field in different ways. The following sections detail querying your logs using CIDR notation and range queries.
710+
You can query your logs based on the `host.ip` field in different ways, including using CIDR notation and range queries.
718711

719712
Before querying your logs, add them to your data stream using this command:
720713

@@ -827,7 +820,7 @@ Because all of the example logs are in this range, you'll get the following resu
827820
[[logs-stream-range-query]]
828821
==== Range queries
829822

830-
You can use {ref}/query-dsl-range-query.html[range queries] to query logs in a specific range.
823+
Use {ref}/query-dsl-range-query.html[range queries] to query logs in a specific range.
831824

832825
The following command searches for IP addresses greater than or equal to `192.168.1.100` and less than or equal to `192.168.1.102`.
833826

@@ -894,7 +887,7 @@ You'll get the following results matching the range you've set:
894887
[[logs-stream-ip-ignore-malformed]]
895888
=== Ignore malformed IP addresses
896889

897-
When you're ingesting a large batch of log data, a single malformed IP address can cause the entire batch to fail. You can prevent this by setting `ignore_malformed` to `true` for the `host.ip` field. Update the `host.ip` field to ignore malformed IPs using the {ref}/indices-put-mapping.html[update mapping API]:
890+
When you're ingesting a large batch of log data, a single malformed IP address can cause the entire batch to fail. Prevent this by setting `ignore_malformed` to `true` for the `host.ip` field. Update the `host.ip` field to ignore malformed IPs using the {ref}/indices-put-mapping.html[update mapping API]:
898891

899892
[source,console]
900893
----
@@ -915,9 +908,9 @@ PUT /logs-example-default/_mapping
915908

916909
preview::[]
917910

918-
By default, an ingest pipeline sends your log data to a single data stream. To simplify log data management, you can use a {ref}/reroute-processor.html[reroute processor] to route data from the generic data stream to a target data stream. For example, you might want to send high-severity logs to a specific data stream that's different from low-severity logs to help with categorization.
911+
By default, an ingest pipeline sends your log data to a single data stream. To simplify log data management, use a {ref}/reroute-processor.html[reroute processor] to route data from the generic data stream to a target data stream. For example, you might want to send high-severity logs to a specific data stream to help with categorization.
919912

920-
This section shows you how to use a reroute processor to send the high-severity logs (`WARN` or `ERROR`) from the following log examples to a specific data stream and keep regular logs (`DEBUG` and `INFO`) in the default data stream:
913+
This section shows you how to use a reroute processor to send the high-severity logs (`WARN` or `ERROR`) from the following example logs to a specific data stream and keep the regular logs (`DEBUG` and `INFO`) in the default data stream:
921914

922915
[source,log]
923916
----
@@ -939,7 +932,7 @@ To use a reroute processor:
939932
[[logs-stream-reroute-pipeline]]
940933
=== Add a reroute processor to your ingest pipeline
941934

942-
You can add a reroute processor to your ingest pipeline with the following command:
935+
Add a reroute processor to your ingest pipeline with the following command:
943936

944937
[source,console]
945938
----
@@ -962,13 +955,13 @@ PUT _ingest/pipeline/logs-example-default
962955
}
963956
----
964957

965-
Set these values for the reroute processor:
958+
The previous command sets the following values for your reroute processor:
966959

967-
- `tag` – Identifier for the processor that you can use for debugging and metrics. In the example, that tag is set to `high_severity_logs`.
968-
- `if` – Conditionally runs the processor. In the example, ` "if" : "$('log.level', '') == 'WARN' || $('log.level', '') == 'ERROR'"` means the processor runs when the `log.level` field is `WARN` or `ERROR`.
960+
- `tag` – Identifier for the processor that you can use for debugging and metrics. In the example, the tag is set to `high_severity_logs`.
961+
- `if` – Conditionally runs the processor. In the example, `"ctx.log?.level == 'WARN' || ctx.log?.level == 'ERROR'"` means the processor runs when the `log.level` field is `WARN` or `ERROR`.
969962
- `dataset` – The data stream dataset to route your document to if the previous condition is `true`. In the example, logs with a `log.level` of `WARN` or `ERROR` are routed to the `logs-critical-default` data stream (see the sketch below).
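After you post log documents through this pipeline (covered in the following section), a quick way to confirm the routing is to search the target data stream directly. A minimal sketch, assuming the `logs-critical-default` stream from the example above:

[source,console]
----
GET logs-critical-default/_search
----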
970963

971-
After creating your pipeline, an index template points your log data to your pipeline. You can use the index template you created in the <<logs-stream-index-template, Extract the `@timestamp` field>> section.
964+
After creating your pipeline, an index template points your log data to your pipeline. Use the index template you created in the <<logs-stream-index-template, Extract the `@timestamp` field>> section.
972965

973966
[discrete]
974967
[[logs-stream-reroute-add-logs]]
