Commit 93e4e88
Loïc Hoguin
CQ: Fix entry missing from cache leading to crash on read
The issue comes from a mechanic that allows us to avoid writing
to disk when a message has already been consumed. It works fine
in normal circumstances, but fan-out makes things trickier.
When multiple queues write and read the same message, we could
get a crash. Let's say queues A and B both handle message Msg.
* Queue A asks store to write Msg
* Queue B asks store to write Msg
* Queue B asks store to delete Msg (message was immediately consumed)
* Store processes Msg write from queue A
* Store writes Msg to current file
* Store processes Msg write from queue B
* Store notices queue B doesn't need Msg anymore; doesn't write
* Store clears Msg from the cache
* Queue A tries to read Msg
* Msg is missing from the cache
* Queue A tries to read from disk
* Msg is in the current write file and may not be on disk yet
* Crash
The problem is that the store clears Msg from the cache. We need
all messages written to the current file to remain in the cache
as we can't guarantee the data is on disk when the time comes
to read. That is, until we roll over to the next file.
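
The sequence above can be replayed with a small toy model. This is only a sketch under stated assumptions, not RabbitMQ's actual code (the real store is written in Erlang and the real fix was a one-line pattern-match correction). Every name here (`ToyStore`, `ask_write`, `ask_delete`, `process_next`, `roll_over`, `keep_current_file_in_cache`) is hypothetical, and a `KeyError` stands in for the crash.

```python
from collections import deque


class ToyStore:
    """Toy model of a shared message store (hypothetical, for illustration)."""

    def __init__(self, keep_current_file_in_cache):
        # False models the buggy behaviour, True models the intended invariant.
        self.keep_current_file_in_cache = keep_current_file_in_cache
        self.cache = {}          # msg_id -> payload, always readable
        self.current_file = {}   # written, but not guaranteed to be on disk
        self.on_disk = {}        # safely readable after a rollover
        self.inbox = deque()     # write requests not yet processed
        self.cancelled = set()   # (queue, msg_id) consumed before being written

    def ask_write(self, queue, msg_id, payload):
        # The message is cached immediately; the write itself is deferred.
        self.cache[msg_id] = payload
        self.inbox.append((queue, msg_id, payload))

    def ask_delete(self, queue, msg_id):
        # The message was consumed before its write was processed.
        self.cancelled.add((queue, msg_id))

    def process_next(self):
        queue, msg_id, payload = self.inbox.popleft()
        if (queue, msg_id) not in self.cancelled:
            self.current_file[msg_id] = payload
            return
        # This queue no longer needs the message, so skip the write.
        self.cancelled.discard((queue, msg_id))
        if self.keep_current_file_in_cache and msg_id in self.current_file:
            return  # another queue may still read it before it reaches disk
        self.cache.pop(msg_id, None)

    def roll_over(self):
        # Once the file is closed its contents are on disk, so the
        # corresponding cache entries can finally be dropped.
        self.on_disk.update(self.current_file)
        for msg_id in self.current_file:
            self.cache.pop(msg_id, None)
        self.current_file.clear()

    def read(self, msg_id):
        if msg_id in self.cache:
            return self.cache[msg_id]
        return self.on_disk[msg_id]  # KeyError stands in for the crash


def run_scenario(keep_current_file_in_cache):
    store = ToyStore(keep_current_file_in_cache)
    store.ask_write("A", "Msg", b"payload")
    store.ask_write("B", "Msg", b"payload")
    store.ask_delete("B", "Msg")
    store.process_next()  # write from queue A goes to the current file
    store.process_next()  # write from queue B is skipped
    try:
        return store.read("Msg")
    except KeyError:
        return "crash"
```

With these assumptions, `run_scenario(False)` ends in the simulated crash, while `run_scenario(True)` lets queue A read the message from the cache.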
The issue was that a match was wrong: instead of matching a single
location from the index, the code was matching against a list. The
error was present in the code for almost 13 years, since commit
2ef30dc.
1 parent 8f19a04
1 file changed: +1, -1 (line 910 replaced)