Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Xiao Ni <xni@redhat.com>
To: NeilBrown <neilb@suse.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
Date: Mon, 9 Oct 2017 13:32:16 +0800	[thread overview]
Message-ID: <441ae9fe-fd73-2aac-8bb1-c64da28cda27@redhat.com> (raw)
In-Reply-To: <87a810zznc.fsf@notabene.neil.brown.name>



On 10/09/2017 12:57 PM, NeilBrown wrote:
> On Sun, Oct 08 2017, Xiao Ni wrote:
>
>> ----- Original Message -----
>>> From: "NeilBrown" <neilb@suse.com>
>>> To: "Xiao Ni" <xni@redhat.com>
>>> Cc: linux-raid@vger.kernel.org
>>> Sent: Friday, October 6, 2017 12:32:19 PM
>>> Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
>>>
>>> On Fri, Oct 06 2017, Xiao Ni wrote:
>>>
>>>> On 10/05/2017 01:17 PM, NeilBrown wrote:
>>>>> On Thu, Sep 14 2017, Xiao Ni wrote:
>>>>>
>>>>>>> What do
>>>>>>>    cat /proc/8987/stack
>>>>>>>    cat /proc/8983/stack
>>>>>>>    cat /proc/8966/stack
>>>>>>>    cat /proc/8381/stack
>>>>>>>
>>>>>>> show??
>>>>> ...
>>>>>
>>>>>> /usr/sbin/mdadm --grow --continue /dev/md0. Is it the reason to add
>>>>>> lockdep_assert_held(&mddev->reconfig_mutex)?
>>>>>> [root@dell-pr1700-02 ~]# cat /proc/8983/stack
>>>>>> [<ffffffffa0a3464c>] mddev_suspend+0x12c/0x160 [md_mod]
>>>>>> [<ffffffffa0a379ec>] suspend_lo_store+0x7c/0xe0 [md_mod]
>>>>>> [<ffffffffa0a3b7d0>] md_attr_store+0x80/0xc0 [md_mod]
>>>>>> [<ffffffff812ec8da>] sysfs_kf_write+0x3a/0x50
>>>>>> [<ffffffff812ec39f>] kernfs_fop_write+0xff/0x180
>>>>>> [<ffffffff81260457>] __vfs_write+0x37/0x170
>>>>>> [<ffffffff812619e2>] vfs_write+0xb2/0x1b0
>>>>>> [<ffffffff81263015>] SyS_write+0x55/0xc0
>>>>>> [<ffffffff810037c7>] do_syscall_64+0x67/0x150
>>>>>> [<ffffffff81777527>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>
>>>>>> [jbd2/md0-8]
>>>>>> [root@dell-pr1700-02 ~]# cat /proc/8966/stack
>>>>>> [<ffffffffa0a39b20>] md_write_start+0xf0/0x220 [md_mod]
>>>>>> [<ffffffffa0972b49>] raid5_make_request+0x89/0x8b0 [raid456]
>>>>>> [<ffffffffa0a34175>] md_make_request+0xf5/0x260 [md_mod]
>>>>>> [<ffffffff81376427>] generic_make_request+0x117/0x2f0
>>>>>> [<ffffffff81376675>] submit_bio+0x75/0x150
>>>>>> [<ffffffff8129e0b0>] submit_bh_wbc+0x140/0x170
>>>>>> [<ffffffff8129e683>] submit_bh+0x13/0x20
>>>>>> [<ffffffffa0957e29>] jbd2_write_superblock+0x109/0x230 [jbd2]
>>>>>> [<ffffffffa0957f8b>] jbd2_journal_update_sb_log_tail+0x3b/0x80 [jbd2]
>>>>>> [<ffffffffa09517ff>] jbd2_journal_commit_transaction+0x16ef/0x19e0 [jbd2]
>>>>>> [<ffffffffa0955d02>] kjournald2+0xd2/0x260 [jbd2]
>>>>>> [<ffffffff810c73f9>] kthread+0x109/0x140
>>>>>> [<ffffffff817776c5>] ret_from_fork+0x25/0x30
>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>> Thanks for this (and sorry it took so long to get to it).
>>>>> It looks like
>>>>>
>>>>> Commit: cc27b0c78c79 ("md: fix deadlock between mddev_suspend() and
>>>>> md_write_start()")
>>>>>
>>>>> is badly broken.  I wonder how it ever passed testing.
>>>>>
>>>>> In write_start() is change the wait_event() call to
>>>>>
>>>>> 	wait_event(mddev->sb_wait,
>>>>> 		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
>>>>> 		   !mddev->suspended);
>>>>>
>>>>>
>>>>> That should be
>>>>>
>>>>> 	wait_event(mddev->sb_wait,
>>>>> 		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) ||
>>>>> 		   mddev->suspended);
>>>> Hi Neil
>>>>
>>>> Do we want write bio can be handled when mddev->suspended is 1? After
>>>> changing to this,
>>>> write bio can be handled when mddev->suspended is 1.
>>> This is OK.
>>> New write bios will not get past md_handle_request().
>>> A write bios that did get past md_handle_request() is still allowed
>>> through md_write_start().  The mddev_suspend() call won't complete until
>>> that write bio has finished.
>> Hi Neil
>>
>> Thanks for the explanation. I took some time to read the emails about the
>> patch cc27b0c78 which introduced this. It's similar with this problem I
>> countered. But there is a call of function mddev_suspend in level_store.
>> So add the check of mddev->suspended in md_write_start can fix the problem
>> "reshape raid5 -> raid6 atop bcache deadlocks at start on md_attr_store /
>> raid5_make_request".
>>
>> In function suspend_lo_store it doesn't call mddev_suspend under mddev->reconfig_mutex.
> It would if you had applied
>     [PATCH 3/4] md: use mddev_suspend/resume instead of ->quiesce()
>
> Did you apply all 4 patches?

Sorry, it's my mistake. I insmod the wrong module. I'll apply the four 
patches
and do test again.
> Thanks.  I looks suspend_lo_store() is calling raid5_quiesce() directly
> as you say - so a patch is missing.

Yes, thanks for pointing about this.

>>>> Hmm, I have a question. Why can't call md_check_recovery when
>>>> MD_SB_CHANGE_PENDING
>>>> is set in raid5d?
>>> When MD_SB_CHANGE_PENDING is not set, there is no need to call
>>> md_check_recovery().  I wouldn't hurt except that it would be a waste of
>>> time.
>> I'm confused. If we want to call md_check_recovery when MD_SB_CHANGE_PENDING
>> is set, it should be
> Sorry, I described the condition wrongly.
> If any bit is set in ->sb_flags (except MD_SB_CHANGE_PENDING), then
> we need to call md_check_recovery().  If none of those other bits
> are set, there is no need.

Hmm, so it's the first question. Why can't call md_check_recovery when 
MD_SB_CHANGE_PENDING
is set. It needs to update the superblock too when MD_SB_CHANGE_PENDING 
is set. I can't
understand this part.

Can it be:
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6299,7 +6299,7 @@ static void raid5d(struct md_thread *thread)
                         break;
                 handled += batch_size;

-               if (mddev->sb_flags & ~(1 << MD_SB_CHANGE_PENDING)) {
+               if (mddev->sb_flags) {


Best Regards
Xiao

>
> NeilBrown

next prev parent reply	other threads:[~2017-10-09  5:32 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-12  1:49 [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without NeilBrown
2017-09-12  1:49 ` [PATCH 1/4] md: always hold reconfig_mutex when calling mddev_suspend() NeilBrown
2017-09-12  1:49 ` [PATCH 4/4] md: allow metadata update while suspending NeilBrown
2017-09-12  1:49 ` [PATCH 2/4] md: don't call bitmap_create() while array is quiesced NeilBrown
2017-09-12  1:49 ` [PATCH 3/4] md: use mddev_suspend/resume instead of ->quiesce() NeilBrown
2017-09-12  2:51 ` [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without Xiao Ni
2017-09-13  2:11 ` Xiao Ni
2017-09-13 15:09   ` Xiao Ni
2017-09-13 23:05     ` NeilBrown
2017-09-14  4:55       ` Xiao Ni
2017-09-14  5:32         ` NeilBrown
2017-09-14  7:57           ` Xiao Ni
2017-09-16 13:15             ` Xiao Ni
2017-10-05  5:17             ` NeilBrown
2017-10-06  3:53               ` Xiao Ni
2017-10-06  4:32                 ` NeilBrown
2017-10-09  1:21                   ` Xiao Ni
2017-10-09  4:57                     ` NeilBrown
2017-10-09  5:32                       ` Xiao Ni [this message]
2017-10-09  5:52                         ` NeilBrown
2017-10-10  6:05                           ` Xiao Ni
2017-10-10 21:20                             ` NeilBrown
     [not found]                               ` <960568852.19225619.1507689864371.JavaMail.zimbra@redhat.com>
2017-10-13  3:48                                 ` NeilBrown
2017-10-16  4:43                                   ` Xiao Ni
2017-09-30  9:46 ` Xiao Ni
2017-10-05  5:03   ` NeilBrown
2017-10-06  3:40     ` Xiao Ni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=441ae9fe-fd73-2aac-8bb1-c64da28cda27@redhat.com \
    --to=xni@redhat.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).