Skip to content

Conversation

@kingherc
Copy link
Contributor

@kingherc kingherc commented Jul 4, 2022

But DEBUG (silent) logging of snapshot restore/completion when
done in the context of CCR or searchable snapshots.

Fixes #86610

But DEBUG (silent) logging of snapshot restore/completion when
done in the context of CCR or searchable snapshots.

Fixes elastic#86610
boolean changed = false;
for (RestoreInProgress.Entry entry : currentState.custom(RestoreInProgress.TYPE, RestoreInProgress.EMPTY)) {
if (entry.state().completed()) {
logger.log(entry.silent() ? Level.DEBUG : Level.INFO, "restore of snapshot [{}] completed", entry.snapshot());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fcofdez suggested to me this is a nice place to log the completion of a snapshot restore. This meant passing the silent property from the RestoreSnapshotRequest to RestoreInProgress.Entry so that we can read the silent parameter here.

@kingherc kingherc marked this pull request as ready for review July 4, 2022 15:45
@kingherc
Copy link
Contributor Author

kingherc commented Jul 4, 2022

Hi. Adding @DaveCTurner as reviewer (since he created the issue) and @original-brownbear as another reviewer (since you have experience with snapshot restores AFAIK). I manually tested this in this way:

  • Executed org.elasticsearch.snapshots.RestoreSnapshotIT#testParallelRestoreOperations (for regular snapshot restore use case) and saw that it emits the INFO log messages
  • Executed org.elasticsearch.xpack.ccr.AutoFollowIT#testAutoFollow (for CCR use case) and saw it does not emit the INFO log messages
  • Executed org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshotsIntegTests#testCreateAndRestoreSearchableSnapshot (for searchable snapshot use case) and saw it does not emit the INFO log messages

Feel free to tell me:

  • if only one of you is needed as reviewer
  • whether we would like to add some sort of logging testing (if it exists for this part of the code, similar to logging tests that I saw in CoordinatorTests.java).

@kingherc kingherc self-assigned this Jul 4, 2022
@kingherc kingherc added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jul 4, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@kingherc kingherc added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Supportability Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better. labels Jul 4, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @kingherc, I've created a changelog YAML for you.

@kingherc
Copy link
Contributor Author

kingherc commented Jul 5, 2022

OK, I understand I should have added you as reviewers after the CI passes -- will keep that in mind.

After conversing a bit, I also understand that in order to fix the CI bwc failures, I should (a) not expose the silent parameter in XContent, and (b) add ifs around the serialization parts. I'll try that 🤞 .

@DaveCTurner
Copy link
Contributor

DaveCTurner commented Jul 5, 2022

++ yes we shouldn't see this parameter in anything user-facing (particularly this means neither in docs nor XContent1) and the wire-format change needs version guards. When receiving a transport message from an older node I suggest just treating the missing silent parameter as true.

Footnotes

  1. NB XContent is not always user-facing, it's also used for writing the cluster metadata to disk (e.g. for cluster coordination and snapshots) but in this case neither of those cases apply.

@kingherc
Copy link
Contributor Author

kingherc commented Jul 5, 2022

Hi @original-brownbear and @DaveCTurner , CI passed. Feel free to review.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left a couple of small suggestions.

I'd like to have a test which asserts that we do actually report the log message we're adding at the right level. I think you can add a test case in RestoreSnapshotIT for that.

@kingherc kingherc requested a review from DaveCTurner July 7, 2022 10:48
@kingherc
Copy link
Contributor Author

kingherc commented Jul 7, 2022

@DaveCTurner incorporated your feedback. Feel free to review. @original-brownbear feel free to also review if you'd like. Thanks!

@kingherc
Copy link
Contributor Author

Gentle reminder for review @original-brownbear , @DaveCTurner

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay here @kingherc, this all looks good although I've one question about the test you added.

Comment on lines 182 to 204
mockLogAppender.addExpectation(new MockLogAppender.LoggingExpectation() {
String startMessageSnapshot;
Object startMessageIndicesList;
String completionMessageSnapshot;

@Override
public void match(LogEvent event) {
final String message = event.getMessage().getFormattedMessage();
if (message.startsWith("started restore of snapshot")) {
startMessageSnapshot = event.getMessage().getParameters()[0].toString();
startMessageIndicesList = event.getMessage().getParameters()[1];
} else if (message.startsWith("completed restore of snapshot")) {
completionMessageSnapshot = event.getMessage().getParameters()[0].toString();
}
}

@Override
public void assertMatched() {
assertThat(startMessageSnapshot, startsWith(snapshotName));
assertEquals(Arrays.asList(indexName), startMessageIndicesList);
assertThat(completionMessageSnapshot, startsWith(snapshotName));
}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be clearer to write this as two regular SeenEventExpectation expectations with the actual message we expect (maybe with some * wildcards) instead of doing this. But maybe I've missed some subtlety about why we're doing it this way.

Done this way I think the fields would need to be volatile right?

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kingherc kingherc merged commit 1e07915 into elastic:master Jul 12, 2022
@kingherc kingherc deleted the feature/log-snapshot-restore branch July 12, 2022 13:52
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Jul 13, 2022
* upstream/master: (38 commits)
  Simplify map copying (elastic#88432)
  Make DiffableUtils.diff implementation agnostic (elastic#88403)
  Ingest: Start separating Metadata from IngestSourceAndMetadata (elastic#88401)
  Move runtime fields base scripts out of scripting fields api package. (elastic#88488)
  Enable TRACE Logging for test and increase timeout (elastic#88477)
  Mute ReactiveStorageIT#testScaleDuringSplitOrClone (elastic#88480)
  Track the count of failed invocations since last successful policy snapshot (elastic#88398)
  Avoid noisy exceptions on data nodes when aborting snapshots (elastic#88476)
  Fix ReactiveStorageDeciderServiceTests testNodeSizeForDataBelowLowWatermark (elastic#88452)
  INFO logging of snapshot restore and completion (elastic#88257)
  unmute test (elastic#88454)
  Updatable API keys - noop check (elastic#88346)
  Corrected an incomplete sentence. (elastic#86542)
  Use consistent shard map type in IndexService (elastic#88465)
  Stop registering TestGeoShapeFieldMapperPlugin in ESIntegTestCase (elastic#88460)
  TSDB: RollupShardIndexer logging improvements (elastic#88416)
  Audit API key ID when create or grant API keys (elastic#88456)
  Bound random negative size test in SearchSourceBuilderTests#testNegativeSizeErrors (elastic#88457)
  Updatable API keys - logging audit trail event (elastic#88276)
  Polish reworked LoggedExec task (elastic#88424)
  ...

# Conflicts:
#	x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/v2/RollupShardIndexer.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Supportability Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better. Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Log snapshot restores at INFO level

4 participants