Merged
16 changes: 16 additions & 0 deletions docs/reference/data-rollup-transform.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
[[data-rollup-transform]]
= Roll up or transform your data

[partintro]
--

{es} offers the following methods for manipulating your data:

* <<xpack-rollup,Rolling up your historical data>>
+
include::rollup/index.asciidoc[tag=rollup-intro]
* {stack-ov}/ml-dataframes.html[Transforming your data]

--

include::rollup/index.asciidoc[]
4 changes: 2 additions & 2 deletions docs/reference/index.asciidoc
@@ -48,10 +48,10 @@ include::sql/index.asciidoc[]

include::monitoring/index.asciidoc[]

include::rollup/index.asciidoc[]

include::frozen-indices.asciidoc[]

include::data-rollup-transform.asciidoc[]

Contributor

I'd match the revised anchor-text.

Contributor Author

Done, thanks!

include::high-availability.asciidoc[]

include::commands/index.asciidoc[]
11 changes: 7 additions & 4 deletions docs/reference/rollup/api-quickref.asciidoc
@@ -1,7 +1,10 @@
[role="xpack"]
[testenv="basic"]
[[rollup-api-quickref]]
== API Quick Reference
=== {rollup-cap} API quick reference
++++
<titleabbrev>API quick reference</titleabbrev>
++++

experimental[]

@@ -15,7 +18,7 @@ Most rollup endpoints have the following base:

[float]
[[rollup-api-jobs]]
=== /job/
==== /job/

Contributor

Are the slashes consistent with other APIs?

* {ref}/rollup-put-job.html[PUT /_rollup/job/<job_id+++>+++]: Create a {rollup-job}
* {ref}/rollup-get-job.html[GET /_rollup/job]: List {rollup-jobs}
@@ -26,13 +29,13 @@ Most rollup endpoints have the following base:

[float]
[[rollup-api-data]]
=== /data/
==== /data/

* {ref}/rollup-get-rollup-caps.html[GET /_rollup/data/<index_pattern+++>/_rollup_caps+++]: Get Rollup Capabilities
* {ref}/rollup-get-rollup-index-caps.html[GET /<index_name+++>/_rollup/data/+++]: Get Rollup Index Capabilities

[float]
[[rollup-api-index]]
=== /<index_name>/
==== /<index_name>/

* {ref}/rollup-search.html[GET /<index_name>/_rollup_search]: Search rollup data
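
The job and search endpoints in this quick reference follow a small set of path patterns. A minimal Python sketch of how a client might build those paths is shown here; the helper names are hypothetical and not part of any {es} client, and `sensor_rollup` in the usage note is an assumed index name (only the `sensor` job ID appears in this diff).

```python
from urllib.parse import quote

# Base shared by most rollup endpoints, as stated in the quick reference.
ROLLUP_BASE = "/_rollup"


def job_path(job_id=None):
    """Return the path that lists all jobs, or that addresses one job by ID."""
    if job_id is None:
        return f"{ROLLUP_BASE}/job"
    return f"{ROLLUP_BASE}/job/{quote(job_id, safe='')}"


def start_job_path(job_id):
    """Return the path used to start a job (POST), e.g. for the `sensor` job."""
    return f"{job_path(job_id)}/_start"


def rollup_search_path(index_name):
    """Return the rollup search path for an index name or pattern."""
    # Commas and wildcards are left unescaped so index patterns stay readable.
    return f"/{quote(index_name, safe=',*')}/_rollup_search"
```

For example, `start_job_path("sensor")` yields `/_rollup/job/sensor/_start`, and `rollup_search_path("sensor_rollup")` yields `/sensor_rollup/_rollup_search`.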
25 changes: 11 additions & 14 deletions docs/reference/rollup/index.asciidoc
@@ -1,30 +1,27 @@
[role="xpack"]
[testenv="basic"]
[[xpack-rollup]]
= Rolling up historical data

[partintro]
--
== Rolling up historical data

experimental[]

Keeping historical data around for analysis is extremely useful but often avoided due to the financial cost of
archiving massive amounts of data. Retention periods are thus driven by financial realities rather than by the
usefulness of extensive historical data.

The Rollup feature in {xpack} provides a means to summarize and store historical data so that it can still be used
for analysis, but at a fraction of the storage cost of raw data.

// tag::rollup-intro[]
The {stack} {rollup-features} provide a means to summarize and store historical
data so that it can still be used for analysis, but at a fraction of the storage
cost of raw data.
// end::rollup-intro[]

* <<rollup-overview, Overview>>
* <<rollup-getting-started,Getting Started>>
* <<rollup-api-quickref, API Quick Reference>>
* <<rollup-understanding-groups,Understanding Rollup Grouping>>
* <<rollup-overview,Overview>>
* <<rollup-getting-started,Getting started>>
* <<rollup-api-quickref, API quick reference>>
* <<rollup-understanding-groups,Understanding rollup grouping>>
* <<rollup-agg-limitations,Rollup aggregation limitations>>
* <<rollup-search-limitations,Rollup Search limitations>>

* <<rollup-search-limitations,Rollup search limitations>>

--

include::overview.asciidoc[]
include::api-quickref.asciidoc[]
13 changes: 8 additions & 5 deletions docs/reference/rollup/overview.asciidoc
@@ -1,7 +1,10 @@
[role="xpack"]
[testenv="basic"]
[[rollup-overview]]
== Overview
=== {rollup-cap} overview
++++
Contributor

Structurally, I think it would be better if we could incorporate the "overview" material into the top-level section landing pages, instead of just making them link farms. Or make the top-level headings just landmarks and not actually navigable links in the TOC (as @gchaps suggested). Then this content would have the "overview" keyword for SEO, but we'd get rid of the extra click to get to meaningful content. For now, this is consistent with other topics.

Contributor Author

I agree that this will be worth discussing in more detail as we roll out the reorg and strive for consistency, but for now I'll leave as-is.

<titleabbrev>Overview</titleabbrev>
++++

experimental[]

@@ -23,7 +26,7 @@ reading often diminishes with time. It's not useless -- it could easily contrib
value often leads to deletion rather than paying the fixed storage cost.

[float]
=== Rollup store historical data at reduced granularity
==== Rollup stores historical data at reduced granularity

Contributor

"Rollup store historical data" is clunky. Rollup stores? Roll up to store? Or maybe "Storing historical data at reduced granularity"? (The pattern here of repeating Rollup in every heading kind of seems like overkill.)

Contributor Author

I've updated it to match the other sections, but I agree this page could do with an edit. Will defer and just stick to structural changes here, however.

That's where Rollup comes into play. The Rollup functionality summarizes old, high-granularity data into a reduced
granularity format for long-term storage. By "rolling" the data up into a single summary document, historical data
@@ -39,7 +42,7 @@ automates this process of summarizing historical data.
Details about setting up and configuring Rollup are covered in <<rollup-put-job,Create Job API>>

[float]
=== Rollup uses standard query DSL
==== Rollup uses standard query DSL

The Rollup feature exposes a new search endpoint (`/_rollup_search` vs the standard `/_search`) which knows how to search
over rolled-up data. Importantly, this endpoint accepts 100% normal {es} Query DSL. Your application does not need to learn
@@ -53,7 +56,7 @@ But if your queries, aggregations and dashboards only use the available function
data is trivial.

[float]
=== Rollup merges "live" and "rolled" data
==== Rollup merges "live" and "rolled" data

A useful feature of Rollup is the ability to query both "live", realtime data in addition to historical "rolled" data
in a single query.
@@ -67,7 +70,7 @@ It will take the results from both data sources and merge them together. If the
"rolled" data, live data is preferred to increase accuracy.

[float]
=== Rollup is multi-interval aware
==== Rollup is multi-interval aware

Finally, Rollup is capable of intelligently utilizing the best interval available. If you've worked with summarizing
features of other products, you'll find that they can be limiting. If you configure rollups at daily intervals... your
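
The overview text in this file says that rollup search merges results from "live" and "rolled" data and prefers live data where the two cover the same period. A conceptual Python sketch of that preference rule follows. It is not the {es} implementation: the function name, the bucket keys, and the doc-count values are all hypothetical, and it assumes overlap means "both sources contain a bucket with the same key".

```python
def merge_buckets(rolled, live):
    """Merge two {bucket_key: doc_count} mappings, preferring live data.

    Both arguments map a bucket key (here an ISO date string) to a count.
    When the same key appears in both, the live value wins.
    """
    merged = dict(rolled)   # start from the historical, rolled-up buckets
    merged.update(live)     # live buckets overwrite any overlapping keys
    return dict(sorted(merged.items()))  # keep buckets in key order


# Hypothetical sample data: 2019-01-02 exists in both sources.
rolled_buckets = {"2019-01-01": 10, "2019-01-02": 7}
live_buckets = {"2019-01-02": 9, "2019-01-03": 4}
combined = merge_buckets(rolled_buckets, live_buckets)
```

In this sample, `combined` keeps the rolled value for `2019-01-01`, takes the live value for the overlapping `2019-01-02`, and adds the live-only `2019-01-03`.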
4 changes: 2 additions & 2 deletions docs/reference/rollup/rollup-agg-limitations.asciidoc
@@ -1,15 +1,15 @@
[role="xpack"]
[testenv="basic"]
[[rollup-agg-limitations]]
== Rollup Aggregation Limitations
=== {rollup-cap} aggregation limitations

experimental[]

There are some limitations to how fields can be rolled up / aggregated. This page highlights the major limitations so that
you are aware of them.

[float]
=== Limited aggregation components
==== Limited aggregation components

The Rollup functionality allows fields to be grouped with the following aggregations:

13 changes: 8 additions & 5 deletions docs/reference/rollup/rollup-getting-started.asciidoc
@@ -1,7 +1,10 @@
[role="xpack"]
[testenv="basic"]
[[rollup-getting-started]]
== Getting Started
=== Getting started with {rollups}
++++
Contributor

Ultimately, I think this is an example of a place we don't want to use the "Getting started" terminology.

<titleabbrev>Getting started</titleabbrev>
++++

experimental[]

@@ -23,7 +26,7 @@ look like this:
// NOTCONSOLE

[float]
=== Creating a Rollup Job
==== Creating a rollup job

We'd like to rollup these documents into hourly summaries, which will allow us to generate reports and dashboards with any time interval
one hour or greater. A rollup job might look like this:
@@ -103,7 +106,7 @@ After you execute the above command and create the job, you'll receive the follo
----

[float]
=== Starting the job
==== Starting the job

After the job is created, it will be sitting in an inactive state. Jobs need to be started before they begin processing data (this allows
you to stop them later as a way to temporarily pause, without deleting the configuration).
@@ -117,7 +120,7 @@ POST _rollup/job/sensor/_start
// TEST[setup:sensor_rollup_job]

[float]
=== Searching the Rolled results
==== Searching the rolled results

After the job has run and processed some data, we can use the <<rollup-search>> endpoint to do some searching. The Rollup feature is designed
so that you can use the same Query DSL syntax that you are accustomed to... it just happens to run on the rolled up data instead.
@@ -292,7 +295,7 @@ In addition to being more complicated (date histogram and a terms aggregation, p
the date_histogram uses a `7d` interval instead of `60m`.

[float]
=== Conclusion
==== Conclusion

This quickstart should have provided a concise overview of the core functionality that Rollup exposes. There are more tips and things
to consider when setting up Rollups, which you can find throughout the rest of this section. You may also explore the <<rollup-api-quickref,REST API>>
12 changes: 6 additions & 6 deletions docs/reference/rollup/rollup-search-limitations.asciidoc
@@ -1,7 +1,7 @@
[role="xpack"]
[testenv="basic"]
[[rollup-search-limitations]]
== Rollup Search Limitations
=== {rollup-cap} search limitations

experimental[]

@@ -11,7 +11,7 @@ live data is thrown away, you will always lose some flexibility.
This page highlights the major limitations so that you are aware of them.

[float]
=== Only one Rollup index per search
==== Only one {rollup} index per search

When using the <<rollup-search>> endpoint, the `index` parameter accepts one or more indices. These can be a mix of regular, non-rollup
indices and rollup indices. However, only one rollup index can be specified. The exact list of rules for the `index` parameter are as
@@ -33,7 +33,7 @@ may be able to open this up to multiple rollup jobs.

[float]
[[aggregate-stored-only]]
=== Can only aggregate what's been stored
==== Can only aggregate what's been stored

A perhaps obvious limitation, but rollups can only aggregate on data that has been stored in the rollups. If you don't configure the
rollup job to store metrics about the `price` field, you won't be able to use the `price` field in any query or aggregation.
@@ -81,7 +81,7 @@ The response will tell you that the field and aggregation were not possible, bec
// TESTRESPONSE[s/"stack_trace": \.\.\./"stack_trace": $body.$_path/]

[float]
=== Interval Granularity
==== Interval granularity

Rollups are stored at a certain granularity, as defined by the `date_histogram` group in the configuration. This means you
can only search/aggregate the rollup data with an interval that is greater-than or equal to the configured rollup interval.
@@ -111,7 +111,7 @@ That said, if multiple jobs are present in a single rollup index with varying in
with the largest interval to satisfy the search request.

[float]
=== Limited querying components
==== Limited querying components

The Rollup functionality allows `query`'s in the search request, but with a limited subset of components. The queries currently allowed are:

@@ -128,7 +128,7 @@ If you attempt to use an unsupported query, or the query references a field that
thrown. We expect the list of support queries to grow over time as more are implemented.

[float]
=== Timezones
==== Timezones

Rollup documents are stored in the timezone of the `date_histogram` group configuration in the job. If no timezone is specified, the default
is to rollup timestamps in `UTC`.
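
The "Interval granularity" section of this file states that rollup data can only be searched with an interval greater than or equal to the configured rollup interval, and that when several jobs share a rollup index the job with the largest interval is used to satisfy the request. A simplified Python sketch of those two rules follows. The function names are hypothetical, only fixed units (`s`, `m`, `h`, `d`, `w`) are handled, and any further constraints that {es} applies (for example calendar-aware intervals) are deliberately ignored.

```python
import re

# Seconds per unit; a simplification that ignores calendar-aware intervals.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def interval_seconds(interval):
    """Convert an interval such as '60m' or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdw])", interval)
    if match is None:
        raise ValueError(f"unsupported interval: {interval!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def can_search(rollup_interval, requested_interval):
    """True if the requested interval is >= the configured rollup interval."""
    return interval_seconds(requested_interval) >= interval_seconds(rollup_interval)


def best_job_interval(job_intervals, requested_interval):
    """Pick the largest job interval that can still satisfy the request.

    Returns None when no job is coarse enough to be usable.
    """
    limit = interval_seconds(requested_interval)
    usable = [i for i in job_intervals if interval_seconds(i) <= limit]
    return max(usable, key=interval_seconds) if usable else None
```

With a job configured at `60m`, a `7d` request passes the check while a `30m` request does not; given jobs at `60m` and `1d`, a `7d` request would be served from the `1d` job under this sketch.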
6 changes: 3 additions & 3 deletions docs/reference/rollup/understanding-groups.asciidoc
@@ -1,7 +1,7 @@
[role="xpack"]
[testenv="basic"]
[[rollup-understanding-groups]]
== Understanding Groups
=== Understanding groups

experimental[]

@@ -121,7 +121,7 @@ Ultimately, when configuring `groups` for a job, think in terms of how you might
then include those in the config. Because Rollup Search allows any order or combination of the grouped fields, you just need to decide
if a field is useful for aggregating later, and how you might wish to use it (terms, histogram, etc)

=== Grouping Limitations with heterogeneous indices
==== Grouping limitations with heterogeneous indices

There was previously a limitation in how Rollup could handle indices that had heterogeneous mappings (multiple, unrelated/non-overlapping
mappings). The recommendation at the time was to configure a separate job per data "type". For example, you might configure a separate
@@ -192,7 +192,7 @@ PUT _rollup/job/combined
--------------------------------------------------
// NOTCONSOLE

=== Doc counts and overlapping jobs
==== Doc counts and overlapping jobs

There was previously an issue with document counts on "overlapping" job configurations, driven by the same internal implementation detail.
If there were two Rollup jobs saving to the same index, where one job is a "subset" of another job, it was possible that document counts