public inbox for linux-raid@vger.kernel.org
 help / color / mirror / Atom feed
From: "Yu Kuai" <yukuai@fnnas.com>
To: "FengWei Shih" <dannyshih@synology.com>, <song@kernel.org>
Cc: <linan122@huawei.com>, <linux-raid@vger.kernel.org>,
	 <linux-kernel@vger.kernel.org>, <yukuai@fnnas.com>
Subject: Re: [PATCH] md/raid5: fix race between reshape and chunk-aligned read
Date: Tue, 28 Apr 2026 16:29:18 +0800	[thread overview]
Message-ID: <f3a86781-07da-49c8-a3e9-e41b9d710727@fnnas.com> (raw)
In-Reply-To: <d298e159-1e47-4c97-80cd-41c4c1c74a04@synology.com>

Hi,

在 2026/4/21 15:09, FengWei Shih 写道:
> Hi Kuai,
>
> Yu Kuai 於 2026/4/19 下午 01:14 寫道:
>> Hi,
>>
>> 在 2026/4/9 13:17, FengWei Shih 写道:
>>> raid5_make_request() checks mddev->reshape_position to decide whether
>>> to allow chunk-aligned reads. However in raid5_start_reshape(), the
>>> layout configuration (raid_disks, algorithm, etc.) is updated before
>>> mddev->reshape_position is set:
>>>
>>>     reshape (raid5_start_reshape)        read (raid5_make_request)
>>>     ============================== ===========================
>>>     write_seqcount_begin
>>>     update raid_disks, algorithm...
>>>     set conf->reshape_progress
>>>     write_seqcount_end
>>>                                           check mddev->reshape_position
>>>                                             * still MaxSector, allow
>>> raid5_read_one_chunk()
>>>                                             * use new layout
>>>     raid5_quiesce()
>>>     set mddev->reshape_position
>> While we're here, I think it's pretty ugly to disable 
>> raid5_read_one_chunk
>> when reshape is not fully done. So a better solution should be:
>>
>> - data behind reshape: read with new layout
>> - data ahead of reshape: read with old layout, reshape will also need 
>> to wait
>> for this IO to be done, before reshape can make progress.
>> - data intersecting the active reshape window: wait for reshape to 
>> make progress.
>
> Thanks for the feedback. I agree that using reshape_progress to 
> distinguish
> the three cases (ahead/behind/inside reshape) would be more refined.
>
> But disabling chunk-aligned reads during reshape was already the 
> existing design.
> In the original code, the check is at the caller level:
>
>     if (rw == READ && mddev->degraded == 0 &&
>         mddev->reshape_position == MaxSector) {
>         bi = chunk_aligned_read(mddev, bi);
>     }
>
> My patch is focused on fixing the race condition in the existing 
> lockless check of whether
> reshape is in progress. So just to confirm: are you suggesting we add 
> a mechanism to
> allow chunk-aligned reads during reshape (based on reshape progress), 
> rather
> than simply disabling them?

Yes, I think this is a huge performance improvement. And in this case it's totally
fine to do chunk aligned read with old layout since reshape do not start yet.

>
>>> Since reshape_position is not yet updated, raid5_make_request()
>>> considers no reshape is in progress and proceeds with the
>>> chunk-aligned path, but the layout has already changed, causing
>>> raid5_compute_sector() to return an incorrect physical address.
>>>
>>> Fix this by reading conf->reshape_progress under gen_lock in
>>> raid5_read_one_chunk() and falling back to the stripe path if a
>>> reshape is in progress.
>>>
>>> Signed-off-by: FengWei Shih <dannyshih@synology.com>
>>> ---
>>>    drivers/md/raid5.c | 8 ++++++++
>>>    1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>>> index a8e8d431071b..bded2b86f0ef 100644
>>> --- a/drivers/md/raid5.c
>>> +++ b/drivers/md/raid5.c
>>> @@ -5421,6 +5421,11 @@ static int raid5_read_one_chunk(struct mddev 
>>> *mddev, struct bio *raid_bio)
>>>        sector_t sector, end_sector;
>>>        int dd_idx;
>>>        bool did_inc;
>>> +    int seq;
>>> +
>>> +    seq = read_seqcount_begin(&conf->gen_lock);
>>> +    if (unlikely(conf->reshape_progress != MaxSector))
>>> +        return 0;
>>>           if (!in_chunk_boundary(mddev, raid_bio)) {
>>>            pr_debug("%s: non aligned\n", __func__);
>>> @@ -5431,6 +5436,9 @@ static int raid5_read_one_chunk(struct mddev 
>>> *mddev, struct bio *raid_bio)
>>>                          &dd_idx, NULL);
>>>        end_sector = sector + bio_sectors(raid_bio);
>>>    +    if (read_seqcount_retry(&conf->gen_lock, seq))
>>> +        return 0;
>>> +
>>>        if (r5c_big_stripe_cached(conf, sector))
>>>            return 0;
> Thanks,
> FengWei Shih
>
> Disclaimer: The contents of this e-mail message and any attachments 
> are confidential and are intended solely for addressee. The 
> information may also be legally privileged. This transmission is sent 
> in trust, for the sole purpose of delivery to the intended recipient. 
> If you have received this transmission in error, any use, reproduction 
> or dissemination of this transmission is strictly prohibited. If you 
> are not the intended recipient, please immediately notify the sender 
> by reply e-mail or phone and delete this message and its attachments, 
> if any.

-- 
Thansk,
Kuai

      reply	other threads:[~2026-04-28  8:29 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-09  5:17 [PATCH] md/raid5: fix race between reshape and chunk-aligned read FengWei Shih
2026-04-13  7:19 ` Li Nan
2026-04-14  8:27   ` FengWei Shih
2026-04-19  5:14 ` Yu Kuai
2026-04-21  7:09   ` FengWei Shih
2026-04-28  8:29     ` Yu Kuai [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f3a86781-07da-49c8-a3e9-e41b9d710727@fnnas.com \
    --to=yukuai@fnnas.com \
    --cc=dannyshih@synology.com \
    --cc=linan122@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox