Commit fde8c5c

lge authored and gregkh committed
drbd: fix potential silent data corruption
commit f4329d1 upstream.

Scenario:
---------

bio chain generated by blk_queue_split().
Some split bio fails and propagates its error status to the "parent" bio.
But then the (last part of the) parent bio itself completes without error.

We would clobber the already recorded error status with BLK_STS_OK,
causing silent data corruption.

Reproducer:
-----------

How to trigger this in the real world within seconds:

DRBD on top of degraded parity raid,
small stripe_cache_size, large read_ahead setting.
Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").

Cause significant read ahead.

Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
would need to wait for a "stripe_head" to become available,
are rejected immediately.

For larger read ahead requests that are split in many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.

Signed-off-by: Lars Ellenberg <[email protected]>
Signed-off-by: Christoph Böhmwalder <[email protected]>
Cc: <[email protected]> # 4.13.x
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1 parent b101e74 commit fde8c5c

1 file changed (+2 -1 lines changed)

drivers/block/drbd/drbd_req.c

Lines changed: 2 additions & 1 deletion
@@ -177,7 +177,8 @@ void start_new_tl_epoch(struct drbd_connection *connection)
 void complete_master_bio(struct drbd_device *device,
 		struct bio_and_error *m)
 {
-	m->bio->bi_status = errno_to_blk_status(m->error);
+	if (unlikely(m->error))
+		m->bio->bi_status = errno_to_blk_status(m->error);
 	bio_endio(m->bio);
 	dec_ap_bio(device);
 }
