From: NeilBrown <neilb@suse.com>
To: Xiao Ni <xni@redhat.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
Date: Wed, 11 Oct 2017 08:20:56 +1100
Message-ID: <87vajmwvgn.fsf@notabene.neil.brown.name>
In-Reply-To: <ebf97c38-c8e0-aa87-be84-efc8d56802f0@redhat.com>
On Tue, Oct 10 2017, Xiao Ni wrote:
> On 10/09/2017 01:52 PM, NeilBrown wrote:
>> On Mon, Oct 09 2017, Xiao Ni wrote:
>>
>>> On 10/09/2017 12:57 PM, NeilBrown wrote:
>>>> It would if you had applied
>>>> [PATCH 3/4] md: use mddev_suspend/resume instead of ->quiesce()
>>>>
>>>> Did you apply all 4 patches?
>>> Sorry, it's my mistake. I insmod the wrong module. I'll apply the four
>>> patches
>>> and do test again.
>>>> Thanks. It looks like suspend_lo_store() is calling raid5_quiesce() directly
>>>> as you say - so a patch is missing.
>>> Yes, thanks for pointing this out.
>
> Hi Neil
>
> I applied the four patches plus the patch "md: fix deadlock error in
> recent patch."
> There is a new hang, this time in suspend_hi_store. I have added
> the call trace as an attachment.
>
> I added some printk to print some information.
>
> [12695.993329] mddev suspend : 1
> [12695.996270] mddev ro : 0
> [12695.998790] mddev insync : 0
> [12696.001641] mddev active io: 1
You didn't tell me where (in the code) you printed this information.
That makes it hard to interpret.
If mddev->active_io is 1, then some thread must be in this range
of code:

	atomic_inc(&mddev->active_io);
	rcu_read_unlock();
	if (!mddev->pers->make_request(mddev, bio)) {
		atomic_dec(&mddev->active_io);
		wake_up(&mddev->sb_wait);
		goto check_suspended;
	}

	if (atomic_dec_and_test(&mddev->active_io) && mddev->suspended)
		wake_up(&mddev->sb_wait);

If that thread is blocked (which appears to be the case) it must be in
->make_request() because nothing else there blocks.
None of the threads you showed are in that code.
But you didn't report all the threads - only those which had printed
warnings.

	echo t > /proc/sysrq-trigger

will produce the stack traces of *all* threads. That would be more
useful.
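
For reference, the active_io accounting discussed above can be modelled in
user space roughly as follows. This is only a sketch under assumptions:
struct toy_mddev and the toy_* helpers are hypothetical names, C11 atomics
stand in for the kernel's atomic_t, and the wake_up() call is reduced to a
boolean return value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct mddev fields involved. */
struct toy_mddev {
	atomic_int active_io;	/* requests currently between inc and dec */
	bool suspended;		/* set while a suspend is waiting */
};

/* Mirrors atomic_inc(&mddev->active_io) on entry. */
static void toy_io_enter(struct toy_mddev *m)
{
	atomic_fetch_add(&m->active_io, 1);
}

/*
 * Mirrors atomic_dec_and_test(&mddev->active_io) && mddev->suspended.
 * Returns true when the caller would have to wake the suspender.
 */
static bool toy_io_leave(struct toy_mddev *m)
{
	/* fetch_sub returns the old value, so old == 1 means "now zero". */
	return atomic_fetch_sub(&m->active_io, 1) == 1 && m->suspended;
}

/* A suspender can only proceed once no request is in flight. */
static bool toy_can_suspend(struct toy_mddev *m)
{
	return atomic_load(&m->active_io) == 0;
}
```

In this model, a counter stuck at 1 means one caller has gone through
toy_io_enter() and not yet reached toy_io_leave(), which is the situation
the stack traces should reveal.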
>
> Can it be:
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index b6b7a28..55e9280 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -7777,7 +7777,7 @@ void md_check_recovery(struct mddev *mddev)
> if (mddev->ro && !test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))
> return;
> if ( ! (
> - (mddev->flags & ~ (1<<MD_CHANGE_PENDING)) ||
> + (mddev->flags & (mddev->external == 1 && ~
> (1<<MD_CHANGE_PENDING))) ||
Please read that code again and see how it doesn't make any sense at
all.
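
One way to see the problem, as a user-space sketch: in C, && always yields
0 or 1, so the proposed expression masks mddev->flags with 0 or 1 rather
than with "everything except MD_CHANGE_PENDING". The bit number
TOY_CHANGE_PENDING and both helper functions below are assumptions made
for illustration, not the kernel's definitions.

```c
#include <stdio.h>

/* Hypothetical bit number, chosen only for this illustration. */
enum { TOY_CHANGE_PENDING = 2 };

/* Original test: any flag other than the "pending" bit is set. */
static unsigned long original_test(unsigned long flags)
{
	return flags & ~(1UL << TOY_CHANGE_PENDING);
}

/* Proposed test: the && collapses the mask to 0 or 1. */
static unsigned long proposed_test(unsigned long flags, int external)
{
	/* ~(1UL << n) is non-zero, so the mask is just (external == 1). */
	unsigned long mask = (external == 1 && ~(1UL << TOY_CHANGE_PENDING));

	return flags & mask;
}
```

With flags == 0x6 the original test is non-zero, but the proposed one is
zero whatever the value of external, because only bit 0 can ever survive
the mask.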
Thanks,
NeilBrown