@martijnvg
Member

The following stats are tracked:

  1. The total number of times that auto following a leader index succeeded.
  2. The total number of times that auto following a leader index failed.
  3. The total number of times that fetching a remote cluster state failed.
  4. The most recent 256 auto follow failures per auto-followed leader index
    (e.g. the create_and_follow API call fails) or per cluster alias
    (e.g. fetching the remote cluster state fails).

Each auto follow run now produces a result that is used to update
the stats tracked in AutoFollowCoordinator.

The transport and REST actions will be added in a follow-up PR.

Relates to #33007
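
As a rough illustration of the bookkeeping described above, here is a minimal sketch; it is not the code in this PR. The class name AutoFollowStatsTracker, its method names, and the choice of keying failures by a plain string are all hypothetical. Only the four tracked quantities and the limit of 256 come from the description. The sketch assumes that a LinkedHashMap with removeEldestEntry is an acceptable way to bound the recent failures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical tracker; names are illustrative and not taken from the PR.
class AutoFollowStatsTracker {
    static final int MAX_RECENT_FAILURES = 256; // limit stated in the description

    private long successfulFollows;
    private long failedFollows;
    private long failedRemoteClusterStateRequests;

    // Insertion-ordered map that evicts its oldest entry once it grows past the limit.
    private final Map<String, Exception> recentFailures = new LinkedHashMap<String, Exception>() {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, Exception> eldest) {
            return size() > MAX_RECENT_FAILURES;
        }
    };

    synchronized void onFollowSucceeded() {
        successfulFollows++;
    }

    // Assumption: failures to follow are keyed by the leader index name.
    synchronized void onFollowFailed(String leaderIndex, Exception e) {
        failedFollows++;
        recentFailures.put(leaderIndex, e);
    }

    // Assumption: cluster state fetch failures are keyed by the cluster alias.
    synchronized void onClusterStateFetchFailed(String clusterAlias, Exception e) {
        failedRemoteClusterStateRequests++;
        recentFailures.put(clusterAlias, e);
    }

    synchronized long successfulFollows() {
        return successfulFollows;
    }

    synchronized long failedFollows() {
        return failedFollows;
    }

    synchronized long failedRemoteClusterStateRequests() {
        return failedRemoteClusterStateRequests;
    }

    synchronized int recentFailureCount() {
        return recentFailures.size();
    }
}
```

The counters only ever grow, while the failure map stays bounded, so a long-running coordinator would not accumulate exceptions without limit under this scheme.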

@martijnvg added the review and :Distributed Indexing/CCR labels on Sep 13, 2018
@martijnvg requested a review from dnhatn on September 13, 2018 18:10
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@martijnvg
Member Author

\cc @jasontedor

Member

@dnhatn left a comment

@martijnvg This looks good. I left some comments.

LOGGER.warn("failure occurred during auto-follower coordination", e);
Consumer<List<AutoFollowResult>> handler = results -> {
for (AutoFollowResult result : results) {
if (result.clusterStateFetchException != null) {
Member

This flow is quite similar to updateStats. Should we combine these into a single method?


private final CountDown autoFollowPatternsCountDown;
private final AtomicReference<Exception> autoFollowPatternsErrorHolder = new AtomicReference<>();
private final AtomicArray<AutoFollowResult> clusterAliasResults;
Member

@dnhatn Sep 17, 2018

maybe just call it autoFollowResults?

for (Index indexToFollow : leaderIndicesToFollow) {
final AtomicArray<Tuple<Index, Exception>> results = new AtomicArray<>(leaderIndicesToFollow.size());
for (int i = 0; i < leaderIndicesToFollow.size(); i++) {
final int slot = i;
Member

I am wondering if we can use another data structure (instead of AtomicArray) to avoid passing the slot index around. It may be less error-prone, because we now have two slot indexes (clusterAliasSlot and slot) in the handleClusterAlias method.

Member Author

Maybe it would be better if we just did not pass the clusterAliasSlot around. Let me see if this would be cleaner.

Member Author

I think it is better now: 90fb198?

Member Author

@dnhatn I split up the handleClusterAlias() method as we discussed privately: f501c10
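
To make the idea raised in this thread concrete, here is a minimal sketch of one way to gather results from concurrent callbacks without handing out slot indexes. It is not what commits 90fb198 or f501c10 do; their contents are not shown here. ResultCollector and its methods are hypothetical names, and the sketch assumes that the order of the collected results does not matter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper, not part of the PR: gathers a fixed number of results
// from concurrent callbacks without handing out slot indexes.
final class ResultCollector<T> {
    private final Queue<T> results = new ConcurrentLinkedQueue<>();
    private final AtomicInteger remaining;
    private final Consumer<List<T>> onComplete;

    ResultCollector(int expected, Consumer<List<T>> onComplete) {
        if (expected <= 0) {
            throw new IllegalArgumentException("expected must be positive");
        }
        this.remaining = new AtomicInteger(expected);
        this.onComplete = onComplete;
    }

    // Called once per callback. The caller that delivers the last expected
    // result is the one that invokes the completion handler.
    void add(T result) {
        results.add(result);
        if (remaining.decrementAndGet() == 0) {
            onComplete.accept(new ArrayList<>(results));
        }
    }
}
```

The trade-off is that results arrive in completion order rather than in a fixed position, so this only fits cases where each result carries its own identity (for example the cluster alias or the index) instead of relying on its slot.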

}
if (leaderIndicesCountDown.countDown()) {
finalise(leaderIndicesErrorHolder.get());
finalise(clusterAliasSlot, new AutoFollowResult(clusterAlias, results.asList()));
Member

Can we assert that every slot is assigned?

clusterAliasResults.set(slot, result);
if (autoFollowPatternsCountDown.countDown()) {
handler.accept(autoFollowPatternsErrorHolder.get());
handler.accept(clusterAliasResults.asList());
Member

Can we assert that every slot is assigned?
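
The check requested here and in the earlier comment could look roughly like the sketch below. It is an illustration under assumptions, not the assertion that was eventually committed: SlotChecks and allSlotsAssigned are hypothetical names, and the JDK's AtomicReferenceArray stands in for Elasticsearch's AtomicArray so that the example compiles on its own.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Stand-in helper: Elasticsearch's AtomicArray is replaced here by the JDK's
// AtomicReferenceArray so the sketch compiles on its own.
final class SlotChecks {
    private SlotChecks() {}

    // Returns true only if every slot holds a non-null value.
    static <T> boolean allSlotsAssigned(AtomicReferenceArray<T> slots) {
        for (int i = 0; i < slots.length(); i++) {
            if (slots.get(i) == null) {
                return false;
            }
        }
        return true;
    }
}
```

At the point where the countdown reaches zero, the call site could then use a plain Java assertion such as assert SlotChecks.allSlotsAssigned(results) : "unassigned slot"; which only runs when assertions are enabled (-ea).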


AutoFollowResult(String clusterAlias) {
    this.clusterAlias = clusterAlias;
    this.clusterStateFetchException = null;
Member

Maybe just delegate to this(clusterAlias, null)?
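
A sketch of the suggested delegation follows. AutoFollowResultSketch is a simplified, hypothetical stand-in for the class in the diff; in particular, the two two-argument constructors (one taking a list of follow failures, one taking a cluster state fetch exception) are an assumption based on the surrounding diff context. Under that assumption, a bare this(clusterAlias, null) does not compile because the null literal matches both overloads, so the sketch adds a cast.

```java
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical version of the result class from the diff context.
final class AutoFollowResultSketch {
    final String clusterAlias;
    final Exception clusterStateFetchException;
    final List<Exception> followFailures;

    // Canonical constructor: every other constructor funnels into this one.
    private AutoFollowResultSketch(String clusterAlias,
                                   Exception clusterStateFetchException,
                                   List<Exception> followFailures) {
        this.clusterAlias = clusterAlias;
        this.clusterStateFetchException = clusterStateFetchException;
        this.followFailures = followFailures;
    }

    AutoFollowResultSketch(String clusterAlias, List<Exception> followFailures) {
        this(clusterAlias, null, followFailures);
    }

    AutoFollowResultSketch(String clusterAlias, Exception clusterStateFetchException) {
        this(clusterAlias, clusterStateFetchException, Collections.emptyList());
    }

    // A bare this(clusterAlias, null) would be ambiguous between the two
    // two-argument overloads, so the cast selects the Exception one.
    AutoFollowResultSketch(String clusterAlias) {
        this(clusterAlias, (Exception) null);
    }
}
```

Funnelling every constructor through one canonical constructor keeps the field assignments in a single place, which is the main benefit of the delegation.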

public class AutoFollowStats implements Writeable, ToXContentObject {

    private static final ParseField NUMBER_OF_SUCCESSFUL_INDICES_AUTO_FOLLOWED =
        new ParseField("number_of_successful_indices_auto_followed");
Member

I am not sure if indices_auto_followed is the right term. @jasontedor WDYT?

Member Author

maybe just number_of_successful_followed_indices?

Member

Yes, I like this name. And I think we don't need the "ed" suffix here.

Member Author

I pushed: 3a15fe6

@martijnvg mentioned this pull request on Sep 17, 2018
Member

@dnhatn left a comment

LGTM. Thanks @martijnvg.

getLeaderIndicesToFollow(autoFollowPattern, leaderClusterState, followerClusterState, followedIndices);
if (leaderIndicesToFollow.isEmpty()) {
finalise(slot, new AutoFollowResult(clusterAlias));
}else {
Member

nit: a space before else.

@martijnvg merged commit 47b86d6 into elastic:master on Sep 18, 2018
martijnvg added a commit that referenced this pull request on Sep 18, 2018
…cs (#33684)