Skip to content

Conversation

@eisbilir
Copy link
Member

@eisbilir eisbilir commented Jun 3, 2021

No description provided.

Copy link
Contributor

@maxceem maxceem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eisbilir I'm still testing it, though there is one thing that looks suspicioious for me is using retryMap to keep the retry value.

It was expected that retry value would be used from the taas.action.retry topic payload.

The current implementation with retryMap has a few potentials issues:

  • when we send 2 events for WP in the same RB and both of them fail, it would use the same retry value for both of them because they have the same RB.id
  • even if we use WP.id in retryMap there could be 2 events for the same WP failed and they again use the same retry from the retryMap

To make the logic bulletproof I guess it's better to use retry from the taas.action.retry topic payload.

What do you think?

@eisbilir
Copy link
Member Author

eisbilir commented Jun 7, 2021

@maxceem
I was torn between using retry value from payload or using a retryMap.

There are two main methods for the retry service:

  1. processRetry that receives the payload and forwards it to the relevant service.
  2. processCreate that receives failed events from services and sends new retry event for the failed events.

The reasons I didn't choose using retry value from payload:

  1. We have to pass the retry value to relevant services and I didn't want to change the signatures of existing services. So it would be simple to implement retry logic for any service.
  2. Using the retry in payload obliges us to keep track of retries separately for each events, even if all of events' failure reason is related to same entity. (This might seem like exactly what wee need). I tried to avoid that because I don't think it's a good idea to send retry events for a single not-created RB for WP count of RB * max_retry_limit times. I think that If RB hasn't created yet after 3-4 WP retries at most, we can't really expect that the RB will be created, there must be some other reason other than unordered messages issue.
when we send 2 events for WP in the same RB and both of them fail, it would use the same retry value for both of them because they have the same RB.id
even if we use WP.id in retryMap there could be 2 events for the same WP failed and they again use the same retry from the retryMap

Yes. By using retryMap we make the keys global and after 2 failure, the entity won't have a chance to fail again even for some other reason. This looks like a bad idea, but our problem is with creating. After a successful creation, we are unlikely to face an issue about this entity again.

Well, If it's okay to pass retry to services, I propose to try with using retry value from payload this time. This will make the logic simpler for our new service.

@eisbilir eisbilir requested a review from maxceem June 8, 2021 18:13
@maxceem maxceem changed the base branch from dev to feature/retry-action June 9, 2021 10:17
Copy link
Contributor

@maxceem maxceem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eisbilir looks good as per quick local testing. Merging to a separate branch for more testing.

@maxceem maxceem merged commit 3538258 into topcoder-platform:feature/retry-action Jun 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants