Re: [PATCH RFC] md: call md_cluster_stop() in __md_stop()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: NeilBrown <neilb@suse.de>
Cc: Song Liu <song@kernel.org>, linux-raid@vger.kernel.org
Subject: Re: [PATCH RFC] md: call md_cluster_stop() in __md_stop()
Date: Tue, 16 Aug 2022 21:53:53 +0800	[thread overview]
Message-ID: <a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev> (raw)
In-Reply-To: <166061152702.5425.9507699881285566239@noble.neil.brown.name>



On 8/16/22 8:58 AM, NeilBrown wrote:
> On Mon, 15 Aug 2022, Guoqing Jiang wrote:
>> Hi Neil,
>>
>> Sorry for later reply since I was on vacation last week.
>>
>> On 8/12/22 10:16 AM, NeilBrown wrote:
>>> [[ I noticed the e151 patch recently which seems to admit that it broke
>>>      something.  So I looked into it and came up with this.
>> I just noticed it ...
>>
>>>      It seems to make sense, but I'm not in a position to test it.
>>>      Guoqing: does it look OK to you?
>>>      - NeilBrown
>>> ]]
>>>
>>> As described in Commit: 48df498daf62 ("md: move bitmap_destroy to the
>>> beginning of __md_stop") md_cluster_stop() needs to run before the
>>> mdddev->thread is stopped.
>>> The change to make this happen was reverted in Commit: e151db8ecfb0
>>> ("md-raid: destroy the bitmap after destroying the thread") due to
>>> problems it caused.
>>>
>>> To restore correct behaviour, make md_cluster_stop() reentrant and
>>> explicitly call it at the start of __md_stop(), after first calling
>>> md_bitmap_wait_behind_writes().
>>>
>>> Fixes: e151db8ecfb0 ("md-raid: destroy the bitmap after destroying the thread")
>>> Signed-off-by: NeilBrown <neilb@suse.de>
>>> ---
>>>    drivers/md/md-cluster.c | 1 +
>>>    drivers/md/md.c         | 3 +++
>>>    2 files changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
>>> index 742b2349fea3..37bf0aa4ed71 100644
>>> --- a/drivers/md/md-cluster.c
>>> +++ b/drivers/md/md-cluster.c
>>> @@ -1009,6 +1009,7 @@ static int leave(struct mddev *mddev)
>>>    	     test_bit(MD_CLOSING, &mddev->flags)))
>>>    		resync_bitmap(mddev);
>>>    
>>> +	mddev->cluster_info = NULL;
>> The above makes sense.
> Thanks.
>
>>>    	set_bit(MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD, &cinfo->state);
>>>    	md_unregister_thread(&cinfo->recovery_thread);
>>>    	md_unregister_thread(&cinfo->recv_thread);
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index afaf36b2f6ab..a57b2dff64dd 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -6238,6 +6238,9 @@ static void mddev_detach(struct mddev *mddev)
>>>    static void __md_stop(struct mddev *mddev)
>>>    {
>>>    	struct md_personality *pers = mddev->pers;
>>> +
>>> +	md_bitmap_wait_behind_writes(mddev);
>>> +	md_cluster_stop(mddev);
>>>    	mddev_detach(mddev);
>>>    	/* Ensure ->event_work is done */
>>>    	if (mddev->event_work.func)
>> The md_bitmap_destroy is called in __md_stop with or without e151db8ecfb0,
>> and it already invokes md_bitmap_wait_behind_writes and md_cluster_stop by
>> md_bitmap_free. So the above is sort of redundant to me.
> The point is that md_cluster_stop() needs to run before mddev_detach()
> shuts down the thread.  If we don't call all of md_bitmap_destroy()
> before mddev_detach() we need to at least run md_cluster_stop(), and I
> think we need to run md_bitmap_wait_behind_writes() before that.
>
>
>> For the issue described in e151db8ecfb, looks like raid1d was running after
>> __md_stop, I am wondering if dm-raid should stop write first just like
>> normal md-raid.
> That looks like a really good idea, that should make it safe to move
> md_bitmap_destroy() back before mddev_detach().
> Would you like to post a patch to make those two changes, and include
> Mikulas Patocka, or should I?

NP, will send it later, thanks for your review.

BRs,
Guoqing

     prev parent reply	other threads:[~2022-08-16 13:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-12  2:16 [PATCH RFC] md: call md_cluster_stop() in __md_stop() NeilBrown
2022-08-15  6:19 ` Guoqing Jiang
2022-08-16  0:58   ` NeilBrown
2022-08-16 13:53     ` Guoqing Jiang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev \
    --to=guoqing.jiang@linux.dev \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.