Linux RAID subsystem development
From: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
To: sashiko-reviews@lists.linux.dev
Cc: linux-raid@vger.kernel.org, yukuai@fygo.io
Subject: Re: [PATCH v2 3/7] md/raid10: consistently fail atomic writes that require splitting
Date: Sun, 28 Jun 2026 23:35:02 +0200
Message-ID: <m2v7b2z6ux.fsf@gmail.com>
In-Reply-To: <20260628143620.64A6C1F000E9@smtp.kernel.org>


Hi Kuai,

On Sun, Jun 28, 2026 at 14:36 +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] The `err_dec_pending` (formerly `err_handle`) cleanup path in `raid10_write_request` can trigger a NULL pointer dereference when a replacement device is promoted concurrently.
> --
>
> commit bc9d81fdb2c33f8c55baad27ab1eedf427ae25ed
> Author: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
>
> md/raid10: consistently fail atomic writes that require splitting
>
> RAID10 handles one badblock path explicitly by failing atomic writes with
> EIO, but another path can also reduce the writable range and force the bio
> through bio_submit_split_bioset(), completing it with EINVAL. This commit
> centralizes the check to fail all atomic writes requiring splitting with
> EIO.
>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index bd322eccdc3ff..3480fc7907f01 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
> [ ... ]
>> @@ -1493,6 +1484,9 @@ static bool raid10_write_request(struct mddev *mddev, struct bio *bio,
>>  		r10_bio->sectors = max_sectors;
>>  
>>  	if (r10_bio->sectors < bio_sectors(bio)) {
>> +		if (atomic)
>> +			goto err_handle;
>> +
>
> [Severity: High]
> This is a pre-existing issue, but can this jump to the error handler cause a
> null pointer dereference if a replacement device is concurrently promoted?
>
> In the error handler path, the code fetches the replacement device without
> checking for null or falling back to the primary slot:
>
> err_handle:
> 	for (k = 0;  k < i; k++) {
> 		int d = r10_bio->devs[k].devnum;
> 		struct md_rdev *rrdev = conf->mirrors[d].replacement;
> 		...
> 		if (r10_bio->devs[k].repl_bio) {
> 			rdev_dec_pending(rrdev, mddev);
> 			r10_bio->devs[k].repl_bio = NULL;
> 		}
> 	}
>
> If raid10_remove_disk() runs concurrently, it can promote the replacement to
> the primary slot and clear conf->mirrors[d].replacement:
>
> raid10_remove_disk() {
> 	...
> 	WRITE_ONCE(*rdevp, NULL);
> 	if (p->replacement) {
> 		WRITE_ONCE(p->rdev, p->replacement);
> 		clear_bit(Replacement, &p->replacement->flags);
> 		WRITE_ONCE(p->replacement, NULL);
> 	}
> 	...
> }
>
At first glance this looks unreachable, because raid10_remove_disk()
should only run while the array is suspended, which would drain any
in-flight write before a promotion. But that assumption does not hold
on the md_start_sync() path. The suspend there is gated on a lock-free
md_spares_need_change() check made once at function entry; we only
suspend if it returns true. We then call remove_spares()
unconditionally, regardless of that earlier decision, and
remove_spares() re-evaluates rdev_removeable() independently under the
lock.
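
To make the check-then-act gap concrete, here is a rough userspace model
of the pattern. It is only a sketch: every model_* name is hypothetical,
none of it is kernel code, and it collapses the locking and the real
rdev_removeable() conditions into a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_slot {
	bool primary_faulty;	/* stands in for Faulty on the primary rdev */
	bool has_replacement;	/* a replacement is attached to this slot */
	bool suspended;		/* stands in for the array being suspended */
};

/* Lock-free check sampled once at entry (models md_spares_need_change()). */
static bool model_needs_change(const struct model_slot *s)
{
	return s->primary_faulty;
}

/*
 * Re-evaluates removability on its own (models remove_spares()).
 * Returns true if a promotion happened while the array was NOT suspended.
 */
static bool model_remove_spares(struct model_slot *s)
{
	assert(s != NULL);
	if (!s->primary_faulty || !s->has_replacement)
		return false;
	/* The replacement is promoted into the primary slot. */
	s->primary_faulty = false;
	s->has_replacement = false;
	return !s->suspended;
}

/*
 * Models the md_start_sync() flow described above. fail_between injects a
 * primary failure after the entry check but before the removal.
 * Returns true if an unsuspended promotion took place.
 */
static bool model_start_sync(struct model_slot *s, bool fail_between)
{
	bool suspend = model_needs_change(s);	/* decision made once */

	if (suspend)
		s->suspended = true;
	if (fail_between)
		s->primary_faulty = true;	/* state changes after the check */
	return model_remove_spares(s);		/* called unconditionally */
}
```

In this model, an unsuspended promotion happens only when the failure
lands between the entry check and the removal.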

So the suspend decision and the actual removal are sampled at different
times with nothing held across them. A primary that was In_sync at the
entry check (so the suspend was skipped) but fails afterward becomes
removable by the time remove_spares() runs, and raid10_remove_disk()
then promotes the replacement and clears conf->mirrors[d].replacement
with no suspend in effect. Meanwhile an in-flight raid10_write_request()
that took a reference only on the replacement (because the primary was
already Faulty when it ran) can be sitting in err_handle, where it reads
conf->mirrors[d].replacement as NULL and passes it to rdev_dec_pending().

So the race looks real. I'll submit a fix for it.
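
For illustration only, the fallback mentioned for
raid10_end_write_request() can be modeled in userspace as below. This is
not the actual patch: the model_* types and helpers are hypothetical, and
the sketch ignores the READ_ONCE()/barrier ordering that real code would
need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct md_rdev and a conf->mirrors[] entry. */
struct model_rdev {
	int nr_pending;
};

struct model_mirror {
	struct model_rdev *rdev;
	struct model_rdev *replacement;
};

/* Models the promotion step quoted from raid10_remove_disk(). */
static void model_promote(struct model_mirror *m)
{
	assert(m != NULL);
	if (m->replacement) {
		m->rdev = m->replacement;
		m->replacement = NULL;
	}
}

/*
 * Returns the device that holds the reference taken for repl_bio.
 * If the replacement slot is empty, assume it was promoted and fall
 * back to the primary slot instead of dereferencing NULL.
 */
static struct model_rdev *model_repl_or_rdev(const struct model_mirror *m)
{
	struct model_rdev *r;

	assert(m != NULL);
	r = m->replacement;
	if (!r)
		r = m->rdev;	/* promoted concurrently */
	return r;
}
```

With this shape, the cleanup path would drop the pending count on
whichever slot currently holds the former replacement.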

> If this concurrent promotion happens, will rrdev be null here? It looks like
> other paths such as raid10_end_write_request() safely handle this exact
> race by falling back to rdev when replacement is null, but this cleanup
> path does not.
>
>>  		allow_barrier(conf);
>>  		bio = bio_submit_split_bioset(bio, r10_bio->sectors,
>>  					      &conf->bio_split);
>
> -- 
> Sashiko AI review · https://sashiko.dev/#/patchset/20260628142420.1051027-1-abd.masalkhi@gmail.com?part=3

-- 
Best Regards,
Abd-Alrhman

