From: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
To: Yu Kuai <yukuai@fygo.io>, Xiao Ni <xiao@kernel.org>
Cc: song@kernel.org, xni@redhat.com, neilb@suse.com, shli@fb.com,
linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
yukuai@fygo.io
Subject: Re: [PATCH v2 2/3] md/raid1,raid10: fix error-path detection with md_cloned_bio()
Date: Thu, 21 May 2026 11:14:36 +0200 [thread overview]
Message-ID: <m2pl2pkttv.fsf@gmail.com> (raw)
In-Reply-To: <4ff43e8f-6405-4ce8-8cdf-f71e0bce4d3f@fygo.io>
Hi Kuai
On Thu, May 21, 2026 at 15:29 +0800, Yu Kuai wrote:
> Hi,
>
> 在 2026/5/21 15:08, Xiao Ni 写道:
>> On Thu, May 21, 2026 at 9:25 AM Yu Kuai <yukuai@fnnas.com> wrote:
>>> Hi,
>>>
>>> 在 2026/5/1 19:46, Abd-Alrhman Masalkhi 写道:
>>>> Detect the error path using md_cloned_bio() instead of relying
>>>> on r1_bio in raid1 or r10_bio->read_slot in raid10, which may be
>>>> NULL or -1 after splitting and resubmitting a failed bio.
>>>>
>>>> As a result, the error path may not be recognized and memory
>>>> allocations can incorrectly use GFP_NOIO instead of
>>>> (GFP_NOIO | __GFP_HIGH), which can lead to a deadlock under
>>>> memory pressure.
>>> I don't think this patch will fix this problem. Because split bio
>>> is a new bio that will be resubmit by md_submit_bio(), such bio is
>>> not a md_cloned_bio yet until md_account_bio().
>> Hi Kuai
>>
>> bio = bio_submit_split_bioset(bio, max_sectors,
>> &conf->bio_split);
>> bio_submit_split_bioset returns the split bio and the remainder bio
>> will be submitted again which is already a md_cloned_bio.
>
> Okay, I understand now this is a special case that current->bio_list is
> NULL. However, the problem still exist in the case original IO split again
> while resubmitting, and in this case current->bio_list will be set so new
> split io will be resubmit by md_submit_bio(). Which means patch 1 can't
> fix this case as well.
The original bio resubmitted via bio_list is not executing in the raid1d
thread context, so blocking on is_suspended() is correct and will not
cause a deadlock. The suspension deadlock risk only exists for
md_cloned_bio executing in the raid1d thread context in the error path.
And in this case it is always true that if we are executing in the
raid1d thread context the bio will already be a md_cloned_bio on entry
to md_handle_request(), even if current->bio_list was set, because (and as
@Xiao has mentioned) every time we split a bio in the error path we are
resubmitting the md_cloned_bio remainder, never the original bio.
>
> Perhaps it's better to check task_struct instead of bio_set.
>
>>
>> Regards
>> Xiao
>>>> Fixes: 689389a06ce7 ("md/raid1: simplify handle_read_error().")
>>>> Fixes: 545250f24809 ("md/raid10: simplify handle_read_error()")
>>>> Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
>>>> ---
>>>> This patch depends on patch 1.
>>>>
>>>> Changes in v2:
>>>> - New patch.
>>>> ---
>>>> drivers/md/raid1.c | 13 ++++++++++---
>>>> drivers/md/raid10.c | 20 ++++++++++++++------
>>>> 2 files changed, 24 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>>>> index cc9914bd15c1..c52ecd38c163 100644
>>>> --- a/drivers/md/raid1.c
>>>> +++ b/drivers/md/raid1.c
>>>> @@ -1321,11 +1321,18 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
>>>> bool r1bio_existed = !!r1_bio;
>>>>
>>>> /*
>>>> - * If r1_bio is set, we are blocking the raid1d thread
>>>> - * so there is a tiny risk of deadlock. So ask for
>>>> + * An md cloned bio indicates we are in the error path.
>>>> + * This is more reliable than checking r1_bio, which might
>>>> + * be NULL even in the error path if a failed bio was split.
>>>> + */
>>>> + bool err_path = md_cloned_bio(mddev, bio);
>>>> +
>>>> + /*
>>>> + * If we are in the error path, we are blocking the raid1d
>>>> + * thread so there is a tiny risk of deadlock. So ask for
>>>> * emergency memory if needed.
>>>> */
>>>> - gfp_t gfp = r1_bio ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
>>>> + gfp_t gfp = err_path ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
>>>>
>>>> /*
>>>> * Still need barrier for READ in case that whole
>>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>>> index 3a591e60a144..8c6fc398260e 100644
>>>> --- a/drivers/md/raid10.c
>>>> +++ b/drivers/md/raid10.c
>>>> @@ -1155,7 +1155,20 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>>> char b[BDEVNAME_SIZE];
>>>> int slot = r10_bio->read_slot;
>>>> struct md_rdev *err_rdev = NULL;
>>>> - gfp_t gfp = GFP_NOIO;
>>>> +
>>>> + /*
>>>> + * An md cloned bio indicates we are in the error path.
>>>> + * This is more reliable than checking slot, which might
>>>> + * be -1 even in the error path if a failed bio was split.
>>>> + */
>>>> + bool err_path = md_cloned_bio(mddev, bio);
>>>> +
>>>> + /*
>>>> + * If we are in the error path, we are blocking the raid10d
>>>> + * thread so there is a tiny risk of deadlock. So ask for
>>>> + * emergency memory if needed.
>>>> + */
>>>> + gfp_t gfp = err_path ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
>>>>
>>>> if (slot >= 0 && r10_bio->devs[slot].rdev) {
>>>> /*
>>>> @@ -1166,11 +1179,6 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>>> * we lose the device name in error messages.
>>>> */
>>>> int disk;
>>>> - /*
>>>> - * As we are blocking raid10, it is a little safer to
>>>> - * use __GFP_HIGH.
>>>> - */
>>>> - gfp = GFP_NOIO | __GFP_HIGH;
>>>>
>>>> disk = r10_bio->devs[slot].devnum;
>>>> err_rdev = conf->mirrors[disk].rdev;
>>> --
>>> Thansk,
>>> Kuai
>>>
> --
> Thansk,
> Kuai
--
Best Regards,
Abd-Alrhman
next prev parent reply other threads:[~2026-05-21 9:14 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-01 11:46 [PATCH v2 0/3] md/raid1,raid10: fix deadlock and bio accounting in read error path Abd-Alrhman Masalkhi
2026-05-01 11:46 ` [PATCH v2 1/3] md/raid1,raid10: fix deadlock in read error recovery path Abd-Alrhman Masalkhi
2026-05-19 7:46 ` Xiao Ni
2026-05-21 1:28 ` Yu Kuai
2026-05-01 11:46 ` [PATCH v2 2/3] md/raid1,raid10: fix error-path detection with md_cloned_bio() Abd-Alrhman Masalkhi
2026-05-19 8:11 ` Xiao Ni
2026-05-21 1:24 ` Yu Kuai
2026-05-21 7:08 ` Xiao Ni
2026-05-21 7:29 ` Yu Kuai
2026-05-21 9:14 ` Abd-Alrhman Masalkhi [this message]
2026-05-01 11:46 ` [PATCH v2 3/3] md/raid1,raid10: fix bio accounting for split md cloned bios Abd-Alrhman Masalkhi
2026-05-19 8:18 ` Xiao Ni
2026-05-21 1:26 ` Yu Kuai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m2pl2pkttv.fsf@gmail.com \
--to=abd.masalkhi@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-raid@vger.kernel.org \
--cc=neilb@suse.com \
--cc=shli@fb.com \
--cc=song@kernel.org \
--cc=xiao@kernel.org \
--cc=xni@redhat.com \
--cc=yukuai@fygo.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox