From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: NeilBrown <neilb@suse.de>
Cc: Song Liu <song@kernel.org>, linux-raid@vger.kernel.org
Subject: Re: [PATCH RFC] md: call md_cluster_stop() in __md_stop()
Date: Tue, 16 Aug 2022 21:53:53 +0800 [thread overview]
Message-ID: <a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev> (raw)
In-Reply-To: <166061152702.5425.9507699881285566239@noble.neil.brown.name>
On 8/16/22 8:58 AM, NeilBrown wrote:
> On Mon, 15 Aug 2022, Guoqing Jiang wrote:
>> Hi Neil,
>>
>> Sorry for later reply since I was on vacation last week.
>>
>> On 8/12/22 10:16 AM, NeilBrown wrote:
>>> [[ I noticed the e151 patch recently which seems to admit that it broke
>>> something. So I looked into it and came up with this.
>> I just noticed it ...
>>
>>> It seems to make sense, but I'm not in a position to test it.
>>> Guoqing: does it look OK to you?
>>> - NeilBrown
>>> ]]
>>>
>>> As described in Commit: 48df498daf62 ("md: move bitmap_destroy to the
>>> beginning of __md_stop") md_cluster_stop() needs to run before the
>>> mdddev->thread is stopped.
>>> The change to make this happen was reverted in Commit: e151db8ecfb0
>>> ("md-raid: destroy the bitmap after destroying the thread") due to
>>> problems it caused.
>>>
>>> To restore correct behaviour, make md_cluster_stop() reentrant and
>>> explicitly call it at the start of __md_stop(), after first calling
>>> md_bitmap_wait_behind_writes().
>>>
>>> Fixes: e151db8ecfb0 ("md-raid: destroy the bitmap after destroying the thread")
>>> Signed-off-by: NeilBrown <neilb@suse.de>
>>> ---
>>> drivers/md/md-cluster.c | 1 +
>>> drivers/md/md.c | 3 +++
>>> 2 files changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
>>> index 742b2349fea3..37bf0aa4ed71 100644
>>> --- a/drivers/md/md-cluster.c
>>> +++ b/drivers/md/md-cluster.c
>>> @@ -1009,6 +1009,7 @@ static int leave(struct mddev *mddev)
>>> test_bit(MD_CLOSING, &mddev->flags)))
>>> resync_bitmap(mddev);
>>>
>>> + mddev->cluster_info = NULL;
>> The above makes sense.
> Thanks.
>
>>> set_bit(MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD, &cinfo->state);
>>> md_unregister_thread(&cinfo->recovery_thread);
>>> md_unregister_thread(&cinfo->recv_thread);
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index afaf36b2f6ab..a57b2dff64dd 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -6238,6 +6238,9 @@ static void mddev_detach(struct mddev *mddev)
>>> static void __md_stop(struct mddev *mddev)
>>> {
>>> struct md_personality *pers = mddev->pers;
>>> +
>>> + md_bitmap_wait_behind_writes(mddev);
>>> + md_cluster_stop(mddev);
>>> mddev_detach(mddev);
>>> /* Ensure ->event_work is done */
>>> if (mddev->event_work.func)
>> The md_bitmap_destroy is called in __md_stop with or without e151db8ecfb0,
>> and it already invokes md_bitmap_wait_behind_writes and md_cluster_stop by
>> md_bitmap_free. So the above is sort of redundant to me.
> The point is that md_cluster_stop() needs to run before mddev_detach()
> shuts down the thread. If we don't call all of md_bitmap_destroy()
> before mddev_detach() we need to at least run md_cluster_stop(), and I
> think we need to run md_bitmap_wait_behind_writes() before that.
>
>
>> For the issue described in e151db8ecfb, looks like raid1d was running after
>> __md_stop, I am wondering if dm-raid should stop write first just like
>> normal md-raid.
> That looks like a really good idea, that should make it safe to move
> md_bitmap_destroy() back before mddev_detach().
> Would you like to post a patch to make those two changes, and include
> Mikulas Patocka, or should I?
NP, will send it later, thanks for your review.
BRs,
Guoqing
prev parent reply other threads:[~2022-08-16 13:54 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-12 2:16 [PATCH RFC] md: call md_cluster_stop() in __md_stop() NeilBrown
2022-08-15 6:19 ` Guoqing Jiang
2022-08-16 0:58 ` NeilBrown
2022-08-16 13:53 ` Guoqing Jiang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev \
--to=guoqing.jiang@linux.dev \
--cc=linux-raid@vger.kernel.org \
--cc=neilb@suse.de \
--cc=song@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).