Skip to content

Conversation

@andreidan
Copy link
Contributor

@andreidan andreidan commented Dec 20, 2021

This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing _tier_preference configuration.

Previously we would simply remove the node attributes routing if there
was a _tier_preference configured for the index.

With this commit, we'll look if either the require.data or
include.data custom routings are colder than the existing _tier_preference
configuration (ie. cold vs data_warm,data_hot) and update the tier
routing accordingly.

eg.

{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}

will be migrated to:

{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the require.data
configuration take precedence over a possible include.data
configuration, and will now migrate the coldest configuration to the
corresponding _tier_preference.

eg.

{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}

will be migrated to:

{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

Fixes #81633

This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.

Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.

With this commit, we'll look if either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (ie. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}
@andreidan andreidan requested a review from dakrone December 20, 2021 16:02
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Dec 20, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code LGTM, but I think we need to update the docs also?

@andreidan
Copy link
Contributor Author

@elasticmachine update branch

@andreidan andreidan requested a review from dakrone December 21, 2021 13:39
Comment on lines +204 to +209
if (tier1.equals(DATA_CONTENT)) {
tier1 = DATA_HOT;
}
if (tier2.equals(DATA_CONTENT)) {
tier2 = DATA_HOT;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dakrone handling the case where an index has the following configuration. Not entirely happy with treating data_content as equal to data_hot (as they are not comparable, or shouldn't be) but I guess it'll have to do unless you have a better suggestion?

"routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            },
            "require" : {
              "data" : "warm"
            }
          }
        }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's any way around it unfortunately. Functionally though, we consider them to be on the same hardware, so I don't think this is so bad, since it's only for the purpose of (internal) ordering rather than overwriting.

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM (with the data_content fix), but we still need to change the docs for the migrate API to document this behavior

Comment on lines +204 to +209
if (tier1.equals(DATA_CONTENT)) {
tier1 = DATA_HOT;
}
if (tier2.equals(DATA_CONTENT)) {
tier2 = DATA_HOT;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's any way around it unfortunately. Functionally though, we consider them to be on the same hardware, so I don't think this is so bad, since it's only for the purpose of (internal) ordering rather than overwriting.

@andreidan
Copy link
Contributor Author

@elasticmachine update branch

@andreidan
Copy link
Contributor Author

@dakrone the documentation for the API doesn't go into details about how exactly it migrates the entities. We have the manual migration guide that does. I've updated the guide in 8ba044c

Maybe we should rework the docs for the API to be more detailed? Especially with #82170 coming in soon? Or maybe it's ok we're keeping the docs rather high level? @debadair what do you think?

@dakrone in any case, I think if we decide to refactor the docs we should do that in a follow-up and take #82170 into account once that's done. What do you think?

@andreidan andreidan requested a review from dakrone January 4, 2022 16:02
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left one comment about Latin pedantry :)

I think if we decide to refactor the docs we should do that in a follow-up and take #82170 into account once that's done. What do you think?

This sounds good to me 👍

Co-authored-by: Lee Hinman <[email protected]>
@andreidan andreidan merged commit 332e6d4 into elastic:master Jan 5, 2022
andreidan added a commit to andreidan/elasticsearch that referenced this pull request Jan 5, 2022
This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.

Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.

With this commit, we'll look if either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (ie. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}
andreidan added a commit to andreidan/elasticsearch that referenced this pull request Jan 5, 2022
This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.

Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.

With this commit, we'll look if either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (ie. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.0
7.17

elasticsearchmachine pushed a commit that referenced this pull request Jan 5, 2022
This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.

Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.

With this commit, we'll look if either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (ie. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}
elasticsearchmachine pushed a commit that referenced this pull request Jan 5, 2022
This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.

Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.

With this commit, we'll look if either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (ie. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}
astefan pushed a commit to astefan/elasticsearch that referenced this pull request Jan 7, 2022
This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.

Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.

With this commit, we'll look if either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (ie. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}
astefan pushed a commit to astefan/elasticsearch that referenced this pull request Jan 7, 2022
This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.

Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.

With this commit, we'll look if either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (ie. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold",
  index.routing.allocation.include._tier_preference: "data_hot"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}

This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.

eg.
{
  index.routing.allocation.require.data: "warm",
  index.routing.allocation.include.data: "cold"
}
will be migrated to:
{
  index.routing.allocation.include._tier_preference: "data_cold,data_warm,data_hot"
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team v7.17.0 v8.0.0-rc1 v8.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug in MetadataMigrateToDataTiersRoutingService - _tier_preference could be incorrect

5 participants