* [PATCH v2] md/raid5: read batch_head under stripe_lock in make_stripe_request
@ 2026-06-19 8:11 Chen Cheng
0 siblings, 0 replies; only message in thread
From: Chen Cheng @ 2026-06-19 8:11 UTC (permalink / raw)
To: linux-raid, yukuai, yukuai; +Cc: chencheng, linux-kernel
From: Chen Cheng <chencheng@fnnas.com>
KCSAN reports race in raid5_make_request() vs. stripe_add_to_batch_list()
Writer flow (stripe_add_to_batch_list):
1. grab `head` stripe;
2. lock_two_stripes(head, sh);
3. re-check stripe_can_batch() for both head and sh, which requires
STRIPE_BATCH_READY set on both;
4. write head->batch_head = head and sh->batch_head = head;
5. unlock_two_stripes.
STRIPE_BATCH_READY is cleared in two places:
- clear_batch_ready(), at the entry of handle_stripe();
- __add_stripe_bio(), for non-batchable bios.
And, both need to acquire `stripe_lock`.
Under stripe_lock, if STRIPE_BATCH_READY is clear, then:
- New writers cannot install a batch_head;
- Existing writers have already finished.
So .. handle_stripe() readers can ready `batch_head` locklessly.
Fix way:
Writer side make_stripe_request() under STRIPE_BATCH_READY, so , need
to be protected by stripe_lock when read something..
v1 -> v2:
- re-expalin how stripe_lock and batch_head work in commit message , and ,
- modify comment in raid5.h.
Fixs: f4aec6a097387
KCSAN report:
======================================
BUG: KCSAN: data-race in raid5_make_request / raid5_make_request
write to 0xffff8f03062432d8 of 8 bytes by task 210246 on cpu 6:
raid5_make_request+0x175e/0x2ab0
md_handle_request+0x2c5/0x700
md_submit_bio+0x126/0x320
[.........]
btrfs_sync_file+0x181/0x970
vfs_fsync_range+0x71/0x110
do_fsync+0x46/0xa0
__x64_sys_fsync+0x20/0x30
read to 0xffff8f03062432d8 of 8 bytes by task 210251 on cpu 0:
raid5_make_request+0x7c7/0x2ab0
md_handle_request+0x2c5/0x700
md_submit_bio+0x126/0x320
[.........]
btrfs_remap_file_range+0x266/0x980
vfs_clone_file_range+0x16d/0x610
ioctl_file_clone+0x64/0xd0
do_vfs_ioctl+0x87f/0xbc0
__x64_sys_ioctl+0xb8/0x130
value changed: 0x0000000000000000 -> 0xffff8f0307798728
Signed-off-by: Chen Cheng <chencheng@fnnas.com>
---
drivers/md/raid5.c | 2 ++
drivers/md/raid5.h | 8 +++++++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5521051a9425..efc63740f867 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6108,14 +6108,16 @@ static enum stripe_result make_stripe_request(struct mddev *mddev,
ctx->do_flush = false;
}
set_bit(STRIPE_HANDLE, &sh->state);
clear_bit(STRIPE_DELAYED, &sh->state);
+ spin_lock_irq(&sh->stripe_lock);
if ((!sh->batch_head || sh == sh->batch_head) &&
(bi->bi_opf & REQ_SYNC) &&
!test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
atomic_inc(&conf->preread_active_stripes);
+ spin_unlock_irq(&sh->stripe_lock);
release_stripe_plug(mddev, sh);
return STRIPE_SUCCESS;
out_release:
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 1c7b710fc9c1..9ff825697ba3 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -221,11 +221,17 @@ struct stripe_head {
enum reconstruct_states reconstruct_state;
spinlock_t stripe_lock;
int cpu;
struct r5worker_group *group;
- struct stripe_head *batch_head; /* protected by stripe lock */
+ /*
+ * Writer protected by stripe_lock.
+ * Reader hold stripe_lock when STRIPE_BATCH_READY is set.
+ * Without STRIPE_BATCH_READY means no concurrent write,
+ * lockless read is ok.
+ */
+ struct stripe_head *batch_head;
spinlock_t batch_lock; /* only header's lock is useful */
struct list_head batch_list; /* protected by head's batch lock*/
union {
struct r5l_io_unit *log_io;
--
2.54.0
^ permalink raw reply related [flat|nested] only message in thread
only message in thread, other threads:[~2026-06-19 8:11 UTC | newest]
Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-19 8:11 [PATCH v2] md/raid5: read batch_head under stripe_lock in make_stripe_request Chen Cheng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox