Closed
Labels
keepalive (Skipped by stale bot)
Description
Describe the bug
Cortex can return 5xx due to a single ingester failure when a tenant is being throttled (4xx). In this case, the distributor can return the error from the bad ingester (5xx) even though the other two returned 4xx. See this.
Looking at this code, it seems that with replication factor = 3, one ingester down and the other two returning 4xx, the result depends on the order in which the responses arrive. For example:
4xx + 5xx + 4xx = 5xx
or
5xx + 4xx + 4xx = 4xx
etc
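The order dependence in the examples above can be reproduced with a small model. This is only a sketch, not the Cortex code: it assumes the request fails as soon as the number of ingester errors exceeds the tolerated failures (1 for three replicas) and that the error which crosses that threshold is the one returned. The function name `firstErrorPastTolerance` and the use of plain integer status codes are hypothetical.

```go
package main

import "fmt"

// firstErrorPastTolerance is a hypothetical model of order-dependent error
// selection: it walks the responses in arrival order and returns the status
// of whichever error pushes the failure count above maxFailures.
// It returns 200 if the failure count never exceeds maxFailures.
func firstErrorPastTolerance(statuses []int, maxFailures int) int {
	failures := 0
	for _, s := range statuses {
		if s >= 400 {
			failures++
			if failures > maxFailures {
				return s
			}
		}
	}
	return 200
}

func main() {
	// Same three responses, different arrival order, different result.
	fmt.Println(firstErrorPastTolerance([]int{429, 500, 429}, 1))
	fmt.Println(firstErrorPastTolerance([]int{500, 429, 429}, 1))
}
```

Under this model, the status returned to the client is decided by whichever error arrives second, not by what the majority of ingesters reported.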
To Reproduce
Steps to reproduce the behavior:
I was able to create a unit test that reproduces the behavior:
alanprot@fd36d97
- Start Cortex (SHA or version): a4bf103
- Perform Operations (Read/Write/Others): Write
Expected behavior
Cortex should return the error that reflects the quorum of the ingester responses.
So, if two ingesters return 4xx and one returns 5xx, Cortex should return 4xx. This means that if the distributor receives one 4xx and one 5xx, it needs to wait for the response of the third ingester.
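The expected behavior could be sketched as follows. This is an illustration under stated assumptions, not a proposed patch: responses are modeled as integer status codes, `quorumStatus` is a hypothetical name, and the choice to report the 5xx when the failures are mixed and neither family reaches quorum on its own is an assumption that the issue does not specify.

```go
package main

import "fmt"

// quorumStatus is a hypothetical quorum-aware selector. It counts 4xx and 5xx
// responses separately and only returns an error family once that family
// alone exceeds maxFailures. Responses are processed in arrival order.
func quorumStatus(statuses []int, maxFailures int) int {
	var ok, clientErrs, serverErrs, lastClient, lastServer int
	for _, s := range statuses {
		switch {
		case s >= 500:
			serverErrs++
			lastServer = s
		case s >= 400:
			clientErrs++
			lastClient = s
		default:
			ok++
		}
		if clientErrs > maxFailures {
			return lastClient // a quorum of ingesters returned 4xx
		}
		if serverErrs > maxFailures {
			return lastServer // a quorum of ingesters returned 5xx
		}
		if ok >= len(statuses)-maxFailures {
			return 200 // enough successes to satisfy the write quorum
		}
	}
	// All responses are in and no family reached quorum (for example one
	// success, one 4xx and one 5xx). Assumption: report the server error.
	return lastServer
}

func main() {
	// With one 4xx and one 5xx received, the third response decides.
	fmt.Println(quorumStatus([]int{429, 500, 429}, 1))
	fmt.Println(quorumStatus([]int{500, 429, 429}, 1))
}
```

With separate counters, one 4xx plus one 5xx is not enough to decide, so the outcome no longer depends on arrival order when two of three ingesters agree.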
Environment:
- Infrastructure: Kubernetes
- Deployment tool: Helm
Storage Engine
- Blocks
- Chunks
Additional Context