From: sashiko-bot@kernel.org
To: "Chen Cheng" <chencheng@fnnas.com>
Cc: yukuai@fygo.io, linux-raid@vger.kernel.org
Subject: Re: [PATCH v2] md/raid5: fix reshape deadlock while failed devices more than max degraded
Date: Mon, 15 Jun 2026 11:49:29 +0000
Message-ID: <20260615114929.E5BAD1F00A3A@smtp.kernel.org>
In-Reply-To: <20260615113450.2088877-1-chencheng@fnnas.com>

Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:
- [Critical] Clearing STRIPE_EXPANDING prematurely in handle_failed_reshape() allows a destination stripe to be freed and reused while asynchronous DMA is still writing to its pages, leading to DMA-after-free memory corruption.
--

commit ec55b78c975b4f9f35e287b83bee72d76b92b33a
Author: Chen Cheng <chencheng@fnnas.com>

md/raid5: fix reshape deadlock while failed devices more than max degraded

This patch fixes a deadlock in raid5 reshape by cleaning up stripe states
(including destination and source stripes) when failed devices exceed the
max degraded count.

> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index d0d7f96fd7cd9..2f4d82bd7c2e6 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3730,6 +3730,78 @@ handle_failed_sync(struct r5conf *conf, struct stripe_head *sh,
> md_sync_error(conf->mddev);
> }
>
> +/*
> + * handle_failed_reshape - handl failed stripes when reshape failed and
> + * degraded devices >= max_degraded
[ ... ]
> +static void handle_failed_reshape(struct r5conf *conf, struct stripe_head *sh,
> +				  struct stripe_head_state *s)
> +{
> +	int i;
[ ... ]
> +	/* release the destination stripes that are waiting to be filled */
> +	if (test_and_clear_bit(STRIPE_EXPAND_SOURCE, &sh->state)) {
> +		for (i = 0; i < sh->disks; i++) {
> +			int dd_idx;
> +			struct stripe_head *sh2;
> +			sector_t bn, sec;
> +
> +			if (i == sh->pd_idx)
> +				continue;
> +			if (conf->level == 6 && i == sh->qd_idx)
> +				continue;
> +
> +			bn = raid5_compute_blocknr(sh, i, 1);
> +			sec = raid5_compute_sector(conf, bn, 0, &dd_idx, NULL);
> +			sh2 = raid5_get_active_stripe(conf, NULL, sec,
> +					R5_GAS_NOBLOCK | R5_GAS_NOQUIESCE);
> +			if (!sh2)
> +				continue;
> +
> +			if (test_and_clear_bit(STRIPE_EXPANDING, &sh2->state)) {

[Severity: Critical]
Could clearing STRIPE_EXPANDING prematurely here lead to a DMA-after-free?

During a reshape, a destination stripe (sh2) might require data from multiple
source stripes. STRIPE_EXPANDING normally prevents the destination stripe from
being returned to the free list before all data is copied.

If Worker 1 is handling a source stripe via handle_stripe_expansion(), it
submits an asynchronous DMA copy to sh2's pages and then calls
raid5_release_stripe(sh2) to drop its reference BEFORE waiting for the DMA to
complete:

drivers/md/raid5.c:handle_stripe_expansion() {
    ...
            raid5_release_stripe(sh2);
        }
    /* done submitting copies, wait for them to complete */
    async_tx_quiesce(&tx);
    ...
}

If the array concurrently fails, Worker 2 could enter handle_failed_reshape()
for a different source stripe, look up sh2, clear its STRIPE_EXPANDING bit,
and release it.

If sh2's refcount drops to 0 without STRIPE_EXPANDING set, could it be
added to the inactive_list and reallocated (or have its pages freed) while
the DMA from Worker 1 is still actively writing to it?

> +				atomic_dec(&conf->reshape_stripes);
> +				wake_up(&conf->wait_for_reshape);
> +				md_done_sync(conf->mddev,
> +					     RAID5_STRIPE_SECTORS(conf));
> +			}
> +
> +			clear_bit(STRIPE_EXPAND_READY, &sh2->state);
> +
> +			raid5_release_stripe(sh2);
> +		}
> +	}
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260615113450.2088877-1-chencheng@fnnas.com?part=1