* [PATCH v2] md/raid5: protect bitmap batch counters aka seq_flush/seq_write consistency
@ 2026-06-17 14:28 Chen Cheng
2026-06-17 14:43 ` sashiko-bot
0 siblings, 1 reply; 2+ messages in thread
From: Chen Cheng @ 2026-06-17 14:28 UTC (permalink / raw)
To: linux-raid, yukuai, yukuai; +Cc: chencheng, chenchneg33, linux-kernel
From: Chen Cheng <chencheng@fnnas.com>
kcsan detect race :
- raid5d() closes the current bitmap batch by updating
conf->seq_flush under conf->device_lock.
- __add_stripe_bio() read conf->seq_flush without that
lock when assigning sh->bm_seq.
so, protect seq_flush/seq_write consistency for multiple CPUs by
READ_ONCE()/WRITE_ONCE() under the path without held device_lock.
re-explain the stripe batch sequence number update flow:
1. sh->bm_seq declare which batch number the stripe belongs to
when perform bitmap-related write.
==> bm_seq = seq_flush+1
2. stripe be handled,
* if sh->bm_seq - conf->seq_write > 0, means the
batch stripes **newer than** the last written
batch, it cannot proceed yet, queued on bitmap_list.
* otherwise , has already proceed.
3. raid5d() `++seq_flush` to closes the current batch, means
* no more stripes join that old batch
* just-closed batch ready to write-out to disk
4. raid5d() calls bitmap hooks unplug() or writeout, then,
`++seq_write` to the same as bm_seq.
- seq_flush - for producer, to close batches.
- seq_write - for consumer, the checkpoint number.
the report:
====================================
BUG: KCSAN: data-race in __add_stripe_bio / raid5d
write to 0xffff88ba5625d470 of 4 bytes by task 82401 on cpu 0:
raid5d+0x1d9/0xba0
[.....]
read to 0xffff88ba5625d470 of 4 bytes by task 82421 on cpu 8:
__add_stripe_bio+0x332/0x400
raid5_make_request+0x6ac/0x2930
md_handle_request+0x4a2/0xa40
md_submit_bio+0x109/0x1a0
__submit_bio+0x2ec/0x390
[.....]
v1 -> v2:
- remove WRITE_ONCE(conf->seq_write) in held device_lock path.
- remove READ_ONCE(conf->seq_flush) in held device_lock path.
Signed-off-by: Chen Cheng <chencheng@fnnas.com>
---
drivers/md/raid5.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index a320b71d7117..b2c5a1930841 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3536,11 +3536,11 @@ static void __add_stripe_bio(struct stripe_head *sh, struct bio *bi,
pr_debug("added bi b#%llu to stripe s#%llu, disk %d, logical %llu\n",
(*bip)->bi_iter.bi_sector, sh->sector, dd_idx,
sh->dev[dd_idx].sector);
if (conf->mddev->bitmap && firstwrite && !sh->batch_head) {
- sh->bm_seq = conf->seq_flush+1;
+ sh->bm_seq = READ_ONCE(conf->seq_flush) + 1;
set_bit(STRIPE_BIT_DELAY, &sh->state);
}
}
/*
@@ -5827,11 +5827,11 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
md_write_inc(mddev, bi);
sh->overwrite_disks++;
}
spin_unlock_irq(&sh->stripe_lock);
if (conf->mddev->bitmap) {
- sh->bm_seq = conf->seq_flush + 1;
+ sh->bm_seq = READ_ONCE(conf->seq_flush) + 1;
set_bit(STRIPE_BIT_DELAY, &sh->state);
}
set_bit(STRIPE_HANDLE, &sh->state);
clear_bit(STRIPE_DELAYED, &sh->state);
@@ -6877,16 +6877,17 @@ static void raid5d(struct md_thread *thread)
clear_bit(R5_DID_ALLOC, &conf->cache_state);
if (
!list_empty(&conf->bitmap_list)) {
/* Now is a good time to flush some bitmap updates */
- conf->seq_flush++;
+ int seq = conf->seq_flush + 1;
+ WRITE_ONCE(conf->seq_flush, seq);
spin_unlock_irq(&conf->device_lock);
if (md_bitmap_enabled(mddev, true))
mddev->bitmap_ops->unplug(mddev, true);
spin_lock_irq(&conf->device_lock);
- conf->seq_write = conf->seq_flush;
+ conf->seq_write = seq;
activate_bit_delay(conf, conf->temp_inactive_list);
}
raid5_activate_delayed(conf);
while ((bio = remove_bio_from_retry(conf, &offset))) {
--
2.54.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH v2] md/raid5: protect bitmap batch counters aka seq_flush/seq_write consistency
2026-06-17 14:28 [PATCH v2] md/raid5: protect bitmap batch counters aka seq_flush/seq_write consistency Chen Cheng
@ 2026-06-17 14:43 ` sashiko-bot
0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-17 14:43 UTC (permalink / raw)
To: Chen Cheng; +Cc: yukuai, linux-raid
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [Critical] Direct integer comparison of sequence numbers `bm_seq` in `stripe_add_to_batch_list` fails to handle integer wrap-around.
--
commit d138461da7f53106a83faa762ae81dbb822cbfca
Author: Chen Cheng <chencheng@fnnas.com>
md/raid5: protect bitmap batch counters aka seq_flush/seq_write consistency
This patch protects seq_flush and seq_write consistency across multiple CPUs.
It uses READ_ONCE() and WRITE_ONCE() to prevent data races detected by KCSAN
when assigning sh->bm_seq without holding device_lock.
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index d0d7f96fd7cd9..891c2cb605d30 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3538,7 +3538,7 @@ static void __add_stripe_bio(struct stripe_head *sh, struct bio *bi,
> sh->dev[dd_idx].sector);
>
> if (conf->mddev->bitmap && firstwrite && !sh->batch_head) {
> - sh->bm_seq = conf->seq_flush+1;
> + sh->bm_seq = READ_ONCE(conf->seq_flush) + 1;
> set_bit(STRIPE_BIT_DELAY, &sh->state);
> }
> }
[Severity: Critical]
This is a pre-existing issue, but does stripe_add_to_batch_list() handle
32-bit integer wrap-around correctly when comparing sequence numbers?
Since seq_flush, seq_write, and sh->bm_seq increment for each batch, they
will eventually overflow and wrap around to negative values. In other parts
of the code, this is handled using signed subtraction (e.g.,
sh->bm_seq - conf->seq_write > 0).
However, in stripe_add_to_batch_list(), there is a direct comparison:
drivers/md/raid5.c:stripe_add_to_batch_list() {
...
if (test_and_clear_bit(STRIPE_BIT_DELAY, &sh->state)) {
int seq = sh->bm_seq;
if (test_bit(STRIPE_BIT_DELAY, &sh->batch_head->state) &&
sh->batch_head->bm_seq > seq)
seq = sh->batch_head->bm_seq;
...
}
At the wrap-around boundary, a newer sequence number is numerically less than
an older sequence number. Does this direct comparison incorrectly evaluate to
false, assigning the batch head an older sequence number?
Could this cause the batch to be released early when conf->seq_write catches
up, completing before its newer bitmap updates are fully flushed? Should this
comparison use subtraction instead, such as sh->batch_head->bm_seq - seq > 0?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260617142839.882378-1-chencheng@fnnas.com?part=1
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-06-17 14:43 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-17 14:28 [PATCH v2] md/raid5: protect bitmap batch counters aka seq_flush/seq_write consistency Chen Cheng
2026-06-17 14:43 ` sashiko-bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox