Upgrading to 5.6 Review #29818

@elasticmachine

Original comment by @bleskes:

Since the feature was originally implemented (LINK REDACTED, LINK REDACTED), things have changed slightly. I recently spoke to @jaymode, @pickypg and @spinscale to document how things currently work. Here's a summary that we can use for future reference, plus some follow-ups at the end of each section.

@jaymode, @pickypg and @spinscale - please read carefully and correct if needed.

Security

Native Realm (.security index)

  • ES 5.6 introduced a new field in the .security index. Because dynamic fields are disabled, new nodes cannot write to the index until the field has been added to the mapping (see below for when that happens). Until then the native realm is read-only on new nodes.
  • Once the cluster master moves to version 5.6, xpack.security.support.NativeRealmMigrator does the following:
    • The security code issues a PUT template request to update the .security index template. It also updates the mapping of the existing index, if one exists.
    • Some internal users are updated, depending on the ES version.
  • MetaDataUpgrader is used to upgrade the index template as well.
  • Users need to manually call the _xpack/migration/upgrade API to reindex and remove types.
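
The read-only window described above can be modelled with a small sketch. This is an illustration only, not the NativeRealmMigrator code: the field name `NEW_FIELD` and both helper functions are hypothetical, since the summary does not name the field that 5.6 adds.

```python
from typing import Set, Tuple

Version = Tuple[int, int, int]

# Hypothetical placeholder: the summary does not name the field added in 5.6.
NEW_FIELD = "field_added_in_5_6"


def native_realm_writable(mapping_fields: Set[str]) -> bool:
    """With dynamic fields disabled, writes only work once the field is mapped."""
    return NEW_FIELD in mapping_fields


def migrate_if_master_upgraded(master_version: Version,
                               mapping_fields: Set[str]) -> Set[str]:
    """Model of the migration step: it only runs once the master is on 5.6+."""
    if master_version >= (5, 6, 0):
        return mapping_fields | {NEW_FIELD}
    return mapping_fields
```

In this model a 5.6 node in a cluster whose master is still on 5.5 sees an unchanged mapping, so `native_realm_writable` stays false until the master itself is upgraded.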

.audit-* indices

  • The mapping never changed and thus there is no need to upgrade the template
  • We rely on non-6.x compatible indices to fall out of the retention policy and thus have no upgrade scheme for these.

TODOs

  • Document that the native realm is read only until the master has been upgraded.
  • We need to choose who updates the .security index template. It should only be done in one place.
  • We need better testing to make sure that .security keeps working after users have manually used the _xpack/migration/upgrade API. Such tests don't exist yet.
    • Also take into account the upgrade of an index that was created on ES < 5.6 and upgraded to 5.6 first.
  • Test that .security works in read-only mode in a mixed-version cluster.
  • Test that we can write new user credentials after the cluster has been upgraded.

Watcher

Watch CRUD & Execution

  • Adding a watch first tries to use the new doc type. If this fails, it tries again using the pre-5.6 types. This seems to result in ugly log messages (see the TODOs below).
  • Watch execution runs on the master only and thus has a clean transition between old and new.
  • The .watch and .triggered-watches* indices are manually upgraded via the _xpack/migration/upgrade API.
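
The try-new-then-fall-back behaviour for adding a watch can be sketched as follows. This is a toy model, not Watcher's implementation: `FakeWatchIndex`, `TypeMissingError`, and `put_watch` are hypothetical names, and the legacy type name `watch` is an assumption (the summary only says "pre-5.6 types").

```python
class TypeMissingError(Exception):
    """Stand-in for the TypeMissingException that appears in the logs."""


class FakeWatchIndex:
    """Toy index that only accepts document types present in its mapping."""

    def __init__(self, types):
        self.types = set(types)
        self.docs = {}

    def index(self, doc_type, doc_id, source):
        if doc_type not in self.types:
            raise TypeMissingError(f"type[{doc_type}] missing")
        self.docs[(doc_type, doc_id)] = source


def put_watch(index, watch_id, source, new_type="doc", legacy_type="watch"):
    """Try the new type first; retry with the legacy type if it is missing."""
    try:
        index.index(new_type, watch_id, source)
        return new_type
    except TypeMissingError:
        # On a not-yet-upgraded index the first attempt always fails, which is
        # the likely source of the repeated log messages noted in the TODOs.
        index.index(legacy_type, watch_id, source)
        return legacy_type
```

The sketch also shows why the noise persists: every write to an index that has not been upgraded pays for one failed attempt before the fallback succeeds.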

Watch history

  • The template is automatically updated by the TemplateUpgradeService.
  • Since the watch execution is on the master, we only use the new template once the master moves to a new version.

TODOs

  • The following messages are currently logged until the user manually upgrades the .watch index. We should find a way to avoid them:
[2017-10-06T06:42:14,598][DEBUG][o.e.a.b.TransportShardBulkAction] [.watches][0] failed to execute bulk item (index) BulkShardRequest [[.watches][0]] containing [index {[.watches][doc][wAuN5cXhTiyjyCm58tH6ag_xpack_license_expiration], source[n/a, actual length: [4.5kb], max length: 2kb]}] and a refresh
org.elasticsearch.indices.TypeMissingException: type[doc] missing
        at org.elasticsearch.index.mapper.MapperService.documentMapperWithAu
  • Delay watch execution on the master until the required template version is visible in the cluster state. (LINK REDACTED)

Monitoring

Exporting

Local exporter:

  • The monitoring indices template is upgraded by the monitoring service, when the master moves to 5.6.
  • The exporter waits until it sees the new template in place (i.e., until the master is on 5.6)
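
A minimal sketch of the "wait until the new template is visible" rule for the local exporter follows. The template name `monitoring-template`, the version number, and both functions are hypothetical; the real exporter inspects the cluster state, which is modelled here as a plain dictionary of template name to version.

```python
from typing import Dict, List


def template_ready(templates: Dict[str, int], name: str,
                   required_version: int) -> bool:
    """True once the cluster state shows the template at the required version."""
    return templates.get(name, -1) >= required_version


def export_batch(docs: List[dict], templates: Dict[str, int],
                 name: str = "monitoring-template",
                 required_version: int = 2) -> List[dict]:
    """Ship docs only when the template is in place; otherwise ship nothing."""
    if not template_ready(templates, name, required_version):
        return []
    return list(docs)
```

Gating on the observed template version rather than on the local node's version means a 5.6 node stays quiet while the master is still on an older release.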

Http Exporter:

  • Always tries to update the remote template when it sets up.

Monitoring indices

  • ES 5.6 uses a new index naming scheme, i.e., new indices are created next to the old indices as soon as the first data is shipped.
  • Old indices are not upgraded. We let them age and fall out of the retention policy.

TODOs

  • Move template upgrades to the centralized TemplateUpgradeService. Beats will take over this responsibility for Monitoring.
  • Research the following log messages, which repeatedly appear in the logs until the upgrade is complete:
13:32:43 [2017-10-04T15:32:20,567][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [node-0] collector [cluster_stats] failed to collect data
13:32:43 java.lang.IllegalStateException: Security index is not on the current version - the native realm will not be operational until the upgrade API is run on the security index

and

[2017-10-06T06:42:14,621][ERROR][o.e.x.m.e.l.LocalExporter] failed to set monitoring watch [wAuN5cXhTiyjyCm58tH6ag_elasticsearch_version_mismatch]
org.elasticsearch.indices.TypeMissingException: type[doc] missing
        at org.elasticsearch.index.mapper.MapperService.documentMapperWithAutoCreate(MapperService.java:765) ~[elasticsearch-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
        at org.elasticsearch.index.shard.IndexShard.docMapper(IndexShard.java:2147) ~[elasticsearch-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
        at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:677) ~[elasticsearch-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]

TemplateUpgradeService

The service is in charge of checking index template versions and upgrading them if needed. It currently tries to do so when the first 5.6 node joins the cluster. This fails because the _system user only has permission to do so once the master has moved to a 5.6 node (see LINK REDACTED). The result is repeated ugly messages about the _system user not having the right permissions. Given the review of the different features above, we're safe to move to a simpler model where the templates are updated only on the current master (i.e., when the master is on the 5.6 version).
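
The simpler model proposed here can be sketched as a pair of pure functions. This is an illustration under stated assumptions, not the TemplateUpgradeService code: the function names and the template names used in the usage note are hypothetical, and template versions are modelled as plain integers.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]


def should_upgrade_templates(local_node_is_master: bool,
                             master_version: Version,
                             min_version: Version = (5, 6, 0)) -> bool:
    """Only the elected master acts, and only once it runs 5.6 or later."""
    return local_node_is_master and master_version >= min_version


def templates_to_upgrade(installed: Dict[str, int],
                         required: Dict[str, int]) -> List[str]:
    """Names of templates that are missing or older than the required version."""
    return sorted(name for name, version in required.items()
                  if installed.get(name, -1) < version)
```

For example, with `installed = {"template-a": 1}` and `required = {"template-a": 2, "template-b": 1}`, a 5.6 master would upgrade both templates, while a 5.6 node that is not the master would do nothing and so never trigger the permission errors.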
