From: "Yu Kuai" <yukuai@fnnas.com>
To: <linan666@huaweicloud.com>, <song@kernel.org>
Cc: <linux-raid@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<yangerkun@huawei.com>, <yi.zhang@huawei.com>,
<yukuai@fnnas.com>
Subject: Re: [PATCH] md/raid5: Fix a deadlock of reshape and suspend
Date: Thu, 25 Dec 2025 15:32:17 +0800
Message-ID: <13706e9e-541a-4c09-b104-2d7272d0a2fa@fnnas.com>
In-Reply-To: <20251124084559.4097567-1-linan666@huaweicloud.com>
Hi,

On 2025/11/24 16:45, linan666@huaweicloud.com wrote:
> From: Li Nan <linan122@huawei.com>
>
> Commit 868bba54a3bc ("md/raid5: fix a deadlock in the case that reshape is
> interrupted") fixed a raid deadlock of reshape, but a similar issue is hit
> by mdadm test 25raid456-reshape-deadlock.
>
> INFO: task (udev-worker):63822 blocked for more than 122 seconds.
> Not tainted 6.18.0-rc2-g0555b5424915-dirty #153
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> __schedule
> schedule
> schedule_timeout
> wait_woken
> raid5_make_request
> md_handle_request
> md_submit_bio
> [...]
> blkdev_read_iter
> vfs_read
> ksys_read
> __x64_sys_read
>
> It is triggered by:
> 1) normal IO waits for reshape to progress
> 2) user sets ACTION_FROZEN via ioctl
> 3) reshape is interrupted and cannot restart
> 4) users try to suspend array while active IO waits reshape
>
> Following Kuai's previous fix, such IOs should fail in
> make_stripe_request(). Thus, set a timeout for wait_woken() to fix
> the deadlock, and blocked IO will fail in the next cycle.
>
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
> drivers/md/raid5.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cdbc7eba5c54..957e712d2be9 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6185,7 +6185,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
> }
>
> wait_woken(&wait, TASK_UNINTERRUPTIBLE,
> - MAX_SCHEDULE_TIMEOUT);
> + msecs_to_jiffies(10000));
Instead of waking up unconditionally every 10s, can you fix this by waking
up the waiters synchronously once the array is frozen or suspended and
reshape can't continue?
> continue;
> }
>
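To illustrate the suggestion above, here is a rough user-space sketch using
pthreads. It is not kernel code and not a proposed patch: struct array_model,
wait_for_reshape(), freeze_array() and run_demo() are hypothetical names that
only model the idea. The waiter sleeps with no timeout, and the path that
freezes the array wakes it immediately so the blocked IO can fail.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the array state; not the md structures. */
struct array_model {
	pthread_mutex_t lock;
	pthread_cond_t  reshape_wq;       /* plays the role of the reshape wait queue */
	bool reshape_progressed;          /* reshape moved past the blocked region */
	bool reshape_cannot_continue;     /* array was frozen or suspended */
};

static struct array_model model = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.reshape_wq = PTHREAD_COND_INITIALIZER,
};

/* IO path: sleep without a timeout. Returns 0 to retry the IO, -1 to fail it. */
static int wait_for_reshape(struct array_model *m)
{
	int ret;

	pthread_mutex_lock(&m->lock);
	while (!m->reshape_progressed && !m->reshape_cannot_continue)
		pthread_cond_wait(&m->reshape_wq, &m->lock);
	ret = m->reshape_cannot_continue ? -1 : 0;
	pthread_mutex_unlock(&m->lock);
	return ret;
}

/* Freeze/suspend path: record the state change and wake all waiters at once. */
static void freeze_array(struct array_model *m)
{
	pthread_mutex_lock(&m->lock);
	m->reshape_cannot_continue = true;
	pthread_cond_broadcast(&m->reshape_wq);
	pthread_mutex_unlock(&m->lock);
}

/* Thread body for the blocked IO; passes the result back through the return value. */
static void *io_thread(void *arg)
{
	return (void *)(intptr_t)wait_for_reshape(arg);
}

/* Starts one blocked IO, freezes the array, and returns the IO's result. */
static int run_demo(void)
{
	pthread_t t;
	void *res;
	int err = pthread_create(&t, NULL, io_thread, &model);

	assert(err == 0);
	freeze_array(&model);
	pthread_join(t, &res);
	return (int)(intptr_t)res;
}
```

Calling run_demo() from a main() shows the blocked IO returning as soon as the
array is frozen, with no periodic wake-ups. In the kernel the equivalent would
be a wake-up on the relevant wait queue from the freeze/suspend path, which is
left out here on purpose.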
--
Thanks,
Kuai