From: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
To: sashiko-reviews@lists.linux.dev
Cc: linux-raid@vger.kernel.org, yukuai@fygo.io
Subject: Re: [PATCH v2 3/7] md/raid10: consistently fail atomic writes that require splitting
Date: Sun, 28 Jun 2026 23:35:02 +0200
Message-ID: <m2v7b2z6ux.fsf@gmail.com>
In-Reply-To: <20260628143620.64A6C1F000E9@smtp.kernel.org>


Hi Kuai,

On Sun, Jun 28, 2026 at 14:36 +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] The `err_dec_pending` (formerly `err_handle`) cleanup path in `raid10_write_request` can trigger a NULL pointer dereference when a replacement device is promoted concurrently.
> --
>
> commit bc9d81fdb2c33f8c55baad27ab1eedf427ae25ed
> Author: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
>
> md/raid10: consistently fail atomic writes that require splitting
>
> RAID10 handles one badblock path explicitly by failing atomic writes with
> EIO, but another path can also reduce the writable range and force the bio
> through bio_submit_split_bioset(), completing it with EINVAL. This commit
> centralizes the check to fail all atomic writes requiring splitting with
> EIO.
>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index bd322eccdc3ff..3480fc7907f01 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
> [ ... ]
>> @@ -1493,6 +1484,9 @@ static bool raid10_write_request(struct mddev *mddev, struct bio *bio,
>>  		r10_bio->sectors = max_sectors;
>>  
>>  	if (r10_bio->sectors < bio_sectors(bio)) {
>> +		if (atomic)
>> +			goto err_handle;
>> +
>
> [Severity: High]
> This is a pre-existing issue, but can this jump to the error handler cause a
> null pointer dereference if a replacement device is concurrently promoted?
>
> In the error handler path, the code fetches the replacement device without
> checking for null or falling back to the primary slot:
>
> err_handle:
> 	for (k = 0;  k < i; k++) {
> 		int d = r10_bio->devs[k].devnum;
> 		struct md_rdev *rrdev = conf->mirrors[d].replacement;
> 		...
> 		if (r10_bio->devs[k].repl_bio) {
> 			rdev_dec_pending(rrdev, mddev);
> 			r10_bio->devs[k].repl_bio = NULL;
> 		}
> 	}
>
> If raid10_remove_disk() runs concurrently, it can promote the replacement to
> the primary slot and clear conf->mirrors[d].replacement:
>
> raid10_remove_disk() {
> 	...
> 	WRITE_ONCE(*rdevp, NULL);
> 	if (p->replacement) {
> 		WRITE_ONCE(p->rdev, p->replacement);
> 		clear_bit(Replacement, &p->replacement->flags);
> 		WRITE_ONCE(p->replacement, NULL);
> 	}
> 	...
> }
>
At first glance this looks unreachable, because raid10_remove_disk()
should only run while the array is suspended, and suspending drains any
in-flight write before a promotion can happen. But that assumption does
not hold on the md_start_sync() path. The suspend there is gated on a
lock-free check of md_spares_need_change(), taken once at function
entry; we suspend only if it returns true. We then call remove_spares()
unconditionally, regardless of that earlier decision, and
remove_spares() re-evaluates rdev_removeable() independently under the
lock.
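
The ordering described above can be modelled in userspace. The following
is a minimal sketch, not md code: model_rdev, model_needs_change() and
model_start_sync() are hypothetical names, locking is omitted, and the
fail_between flag stands in for the primary failing after the entry
check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct md_rdev; only the state the model needs. */
struct model_rdev {
	bool faulty;
};

/* Stand-in for both md_spares_need_change() and rdev_removeable(). */
static bool model_needs_change(const struct model_rdev *rdev)
{
	return rdev->faulty;
}

/*
 * Model of the ordering on the md_start_sync() path.
 * Returns true when a removal would run without the array suspended.
 */
static bool model_start_sync(struct model_rdev *rdev, bool fail_between)
{
	/* Sampled once at entry; decides whether we suspend. */
	bool suspended = model_needs_change(rdev);
	bool removed;

	/* The primary fails after the entry check. */
	if (fail_between)
		rdev->faulty = true;

	/* Re-evaluated independently, regardless of 'suspended'. */
	removed = model_needs_change(rdev);

	return removed && !suspended;
}
```

In this model, a device that is healthy at entry and fails in between
makes model_start_sync() return true, which is the unsuspended removal
case. A device that was already faulty at entry returns false, because
the suspend is taken.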

So the suspend decision and the actual removal are sampled at different
times, with nothing held across them. A primary that was In_sync at the
entry check (so the suspend was skipped) but fails afterward becomes
removable by the time remove_spares() runs, and raid10_remove_disk()
then promotes the replacement and clears the replacement slot with no
suspend in effect. Meanwhile, an in-flight raid10_write_request() that
referenced only the replacement (because the primary was already Faulty
when it ran) can be sitting in err_handle, where it reads
conf->mirrors[d].replacement as NULL.

So the race looks real. I'll submit a fix for it.
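
For reference, the fallback that the review attributes to
raid10_end_write_request() can be sketched in userspace as below. This
is a minimal model, not a proposed patch: model_rdev, model_mirror,
model_promote(), model_repl_holder() and model_drop_repl_ref() are
hypothetical names, and the memory ordering and locking that the real
code depends on are left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct md_rdev and one conf->mirrors[] slot. */
struct model_rdev {
	int nr_pending;
};

struct model_mirror {
	struct model_rdev *rdev;
	struct model_rdev *replacement;
};

/* Model of the promotion step: the replacement becomes the primary. */
static void model_promote(struct model_mirror *m)
{
	if (m->replacement) {
		m->rdev = m->replacement;
		m->replacement = NULL;
	}
}

/*
 * Find the device that holds the reference taken for a repl_bio.
 * If the replacement slot is empty, assume it was promoted and
 * fall back to the primary slot.
 */
static struct model_rdev *model_repl_holder(const struct model_mirror *m)
{
	struct model_rdev *rdev = m->replacement;

	if (!rdev)
		rdev = m->rdev;
	return rdev;
}

/* Drop the replacement reference without dereferencing a NULL slot. */
static void model_drop_repl_ref(const struct model_mirror *m)
{
	struct model_rdev *rdev = model_repl_holder(m);

	if (rdev)
		rdev->nr_pending--;
}
```

With a reference held on the replacement and model_promote() run before
the cleanup, model_drop_repl_ref() still finds the same device through
the primary slot and releases the reference there.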

> If this concurrent promotion happens, will rrdev be null here? It looks like
> other paths such as raid10_end_write_request() safely handle this exact
> race by falling back to rdev when replacement is null, but this cleanup
> path does not.
>
>>  		allow_barrier(conf);
>>  		bio = bio_submit_split_bioset(bio, r10_bio->sectors,
>>  					      &conf->bio_split);
>
> -- 
> Sashiko AI review · https://sashiko.dev/#/patchset/20260628142420.1051027-1-abd.masalkhi@gmail.com?part=3

-- 
Best Regards,
Abd-Alrhman


Thread overview: 12+ messages
2026-06-28 14:24 [PATCH v2 0/7] md/raid10: fixes, atomic write handling, and error-path cleanup Abd-Alrhman Masalkhi
2026-06-28 14:24 ` [PATCH v2 1/7] md/raid10: fix r10bio leak in raid10_write_request() error paths Abd-Alrhman Masalkhi
2026-06-28 14:39   ` sashiko-bot
2026-06-28 14:24 ` [PATCH v2 2/7] md/raid1: advertise atomic write limits and handle runtime constraints Abd-Alrhman Masalkhi
2026-06-28 14:38   ` sashiko-bot
2026-06-28 14:24 ` [PATCH v2 3/7] md/raid10: consistently fail atomic writes that require splitting Abd-Alrhman Masalkhi
2026-06-28 14:36   ` sashiko-bot
2026-06-28 21:35     ` Abd-Alrhman Masalkhi [this message]
2026-06-28 14:24 ` [PATCH v2 4/7] md/raid10: remove unnecessary barrier around bio_submit_split_bioset() Abd-Alrhman Masalkhi
2026-06-28 14:24 ` [PATCH v2 5/7] md/raid10: replace wait loop with wait_event_idle() Abd-Alrhman Masalkhi
2026-06-28 14:24 ` [PATCH v2 6/7] md/raid10: simplify write request error handling Abd-Alrhman Masalkhi
2026-06-28 14:24 ` [PATCH v2 7/7] md/raid10: simplify read " Abd-Alrhman Masalkhi
