From: Yu Kuai <yukuai1@huaweicloud.com>
To: Xueshi Hu <xueshi.hu@smartx.com>, Yu Kuai <yukuai1@huaweicloud.com>
Cc: linux-raid@vger.kernel.org, pmenzel@molgen.mpg.de,
	song@kernel.org, "yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH v3 1/3] md/raid1: freeze array more strictly when reshape
Date: Tue, 1 Aug 2023 09:24:13 +0800
Message-ID: <1fbbf178-efdb-558e-685e-4e9ac785d5c0@huaweicloud.com>
In-Reply-To: <dxzuor2h2rkkzlkmbvgxcipvumsy7xlitxpnmgj4lcm3rclcuv@thwglgsryebj>

Hi,

On 2023/07/31 22:02, Xueshi Hu wrote:
> On Thu, Jul 20, 2023 at 09:37:38AM +0800, Yu Kuai wrote:
>> Hi,
>>
>> On 2023/07/20 9:36, Yu Kuai wrote:
>>> Hi,
>>>
>>> On 2023/07/19 15:09, Xueshi Hu wrote:
>>>> When an IO error happens, reschedule_retry() will increase
>>>> r1conf::nr_queued, which makes freeze_array() unblocked. However, before
>>>> all r1bio in the memory pool are released, the memory pool should not be
>>>> modified. Introduce freeze_array_totally() to solve the problem. Compared
>>>> to freeze_array(), it's more strict because any in-flight io needs to
>>>> complete including queued io.
>>>>
>>>> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com>
>>>> ---
>>>>    drivers/md/raid1.c | 35 +++++++++++++++++++++++++++++++++--
>>>>    1 file changed, 33 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>>>> index dd25832eb045..5605c9680818 100644
>>>> --- a/drivers/md/raid1.c
>>>> +++ b/drivers/md/raid1.c
>>>> @@ -1072,7 +1072,7 @@ static void freeze_array(struct r1conf *conf, int extra)
>>>>        /* Stop sync I/O and normal I/O and wait for everything to
>>>>         * go quiet.
>>>>         * This is called in two situations:
>>>> -     * 1) management command handlers (reshape, remove disk, quiesce).
>>>> +     * 1) management command handlers (remove disk, quiesce).
>>>>         * 2) one normal I/O request failed.
>>>>         * After array_frozen is set to 1, new sync IO will be blocked at
>>>> @@ -1111,6 +1111,37 @@ static void unfreeze_array(struct r1conf *conf)
>>>>        wake_up(&conf->wait_barrier);
>>>>    }
>>>> +/* conf->resync_lock should be held */
>>>> +static int get_pending(struct r1conf *conf)
>>>> +{
>>>> +    int idx, ret;
>>>> +
>>>> +    ret = atomic_read(&conf->nr_sync_pending);
>>>> +    for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
>>>> +        ret += atomic_read(&conf->nr_pending[idx]);
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static void freeze_array_totally(struct r1conf *conf)
>>>> +{
>>>> +    /*
>>>> +     * freeze_array_totally() is almost the same with freeze_array() except
>>>> +     * it requires there's no queued io. Raid1's reshape will destroy the
>>>> +     * old mempool and change r1conf::raid_disks, which are necessary when
>>>> +     * freeing the queued io.
>>>> +     */
>>>> +    spin_lock_irq(&conf->resync_lock);
>>>> +    conf->array_frozen = 1;
>>>> +    raid1_log(conf->mddev, "freeze totally");
>>>> +    wait_event_lock_irq_cmd(
>>>> +            conf->wait_barrier,
>>>> +            get_pending(conf) == 0,
>>>> +            conf->resync_lock,
>>>> +            md_wakeup_thread(conf->mddev->thread));
>>>> +    spin_unlock_irq(&conf->resync_lock);
>>>> +}
>>>> +
>>>>    static void alloc_behind_master_bio(struct r1bio *r1_bio,
>>>>                           struct bio *bio)
>>>>    {
>>>> @@ -3296,7 +3327,7 @@ static int raid1_reshape(struct mddev *mddev)
>>>>            return -ENOMEM;
>>>>        }
>>>> -    freeze_array(conf, 0);
>>>> +    freeze_array_totally(conf);
>>>
>>> I think this is wrong, raid1_reshape() can't be called with
>> Sorry about this typo, I mean raid1_reshape() can be called with ...
> You're right, this is indeed a deadlock.
> 
> I am wondering whether this approach is viable
> 
> 	if (unlikely(atomic_read(conf->nr_queued))) {
> 		kfree(newpoolinfo);
> 		mempool_exit(&newpool);
> 		unfreeze_array(conf);
> 
> 		set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
> 		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> 		md_wakeup_thread(mddev->thread);
> 		return -EBUSY;
> 	}

This is not okay: 'nr_queued' can be increased at any time when normal io
fails, so reading it once doesn't mean anything. You need to call
freeze_array() before reading it:

freeze_array(conf, 0);
/* guarantees that no new io will be dispatched */
if (atomic_read(conf->nr_queued)) {
	...
	unfreeze_array(conf);
	return -EBUSY;
}
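
For illustration only, the ordering above can be modelled in plain
userspace C. This is a single-threaded sketch, not kernel code:
struct model_conf, MODEL_BUCKETS and every model_* function are
hypothetical stand-ins, and it assumes 'nr_queued' is kept per barrier
bucket in the same way as the nr_pending[] array in the quoted patch.
Where freeze_array() would sleep until the array goes quiet, the model
simply gives up with -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_BUCKETS 4	/* hypothetical stand-in for BARRIER_BUCKETS_NR */

/* Hypothetical userspace model of the few r1conf fields discussed here. */
struct model_conf {
	atomic_bool array_frozen;
	atomic_int nr_pending[MODEL_BUCKETS];	/* io dispatched, not yet finished */
	atomic_int nr_queued[MODEL_BUCKETS];	/* failed io waiting for the daemon */
};

/* Sum one per-bucket counter array; reading a single bucket would miss io. */
static int model_sum(atomic_int *counters)
{
	int idx, ret = 0;

	for (idx = 0; idx < MODEL_BUCKETS; idx++)
		ret += atomic_load(&counters[idx]);
	return ret;
}

/* New io is refused once the array is frozen (the kernel would block it). */
static bool model_submit_io(struct model_conf *conf, int bucket)
{
	assert(bucket >= 0 && bucket < MODEL_BUCKETS);
	if (atomic_load(&conf->array_frozen))
		return false;
	atomic_fetch_add(&conf->nr_pending[bucket], 1);
	return true;
}

/* An in-flight io fails and is queued for retry; it stays pending. */
static void model_fail_io(struct model_conf *conf, int bucket)
{
	assert(bucket >= 0 && bucket < MODEL_BUCKETS);
	atomic_fetch_add(&conf->nr_queued[bucket], 1);
}

/* The daemon has handled a queued io: it is no longer queued or pending. */
static void model_retire_queued_io(struct model_conf *conf, int bucket)
{
	assert(bucket >= 0 && bucket < MODEL_BUCKETS);
	atomic_fetch_sub(&conf->nr_queued[bucket], 1);
	atomic_fetch_sub(&conf->nr_pending[bucket], 1);
}

/* Freeze first, then look at the queued count, then unfreeze. */
static int model_try_reshape(struct model_conf *conf)
{
	int ret = 0;

	/* From here on no new io can be dispatched. */
	atomic_store(&conf->array_frozen, true);

	if (model_sum(conf->nr_pending) != model_sum(conf->nr_queued))
		ret = -EAGAIN;	/* freeze_array() would wait here instead */
	else if (model_sum(conf->nr_queued) != 0)
		ret = -EBUSY;	/* queued io still depends on the old pool */
	/* else: quiet and nothing queued, the pool could be replaced here */

	atomic_store(&conf->array_frozen, false);
	return ret;
}
```

In this model the queued count is only examined after the frozen flag
is set, which is the point of the ordering: once new io is blocked and
every pending io is accounted for as queued, the count cannot grow
behind the caller's back.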

Fortunately, I'm working on another patchset to synchronize io with
array configuration. With it, all the callers of raid1_reshape() will
suspend the array, so no normal io will be in progress and this problem
will no longer exist.

Thanks,
Kuai

> 
> Thanks,
> Hu
> 
>>
>> Thanks,
>> Kuai
>>> 'reconfig_mutex' grabbed, and this will deadlock because failed io needs
>>> this lock to be handled by the daemon thread (see details in [1]).
>>>
>>> Be aware: never hold 'reconfig_mutex' while waiting for io.
>>>
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c4fe7edfc73f750574ef0ec3eee8c2de95324463
>>>
>>>>        /* ok, everything is stopped */
>>>>        oldpool = conf->r1bio_pool;
>>>>
>>>
>>> .
>>>
>>
> .
> 

