From: Xiao Ni <xni@redhat.com>
To: NeilBrown <neilb@suse.com>, linux-raid <linux-raid@vger.kernel.org>
Cc: shli@kernel.org
Subject: Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared
Date: Wed, 6 Sep 2017 21:37:57 -0400 (EDT)
Message-ID: <624049285.8379021.1504748277805.JavaMail.zimbra@redhat.com>
In-Reply-To: <3a5955de-e6a1-de83-b00b-1984f7125799@redhat.com>
----- Original Message -----
> From: "Xiao Ni" <xni@redhat.com>
> To: "NeilBrown" <neilb@suse.com>, "linux-raid" <linux-raid@vger.kernel.org>
> Cc: shli@kernel.org
> Sent: Tuesday, September 5, 2017 10:15:00 AM
> Subject: Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared
>
>
>
> On 09/05/2017 09:36 AM, NeilBrown wrote:
> > On Mon, Sep 04 2017, Xiao Ni wrote:
> >
> >>
> >> In function handle_stripe:
> >>     if (s.handle_bad_blocks ||
> >>         test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
> >>             set_bit(STRIPE_HANDLE, &sh->state);
> >>             goto finish;
> >>     }
> >>
> >> Because MD_SB_CHANGE_PENDING is set, the stripes can't be handled.
> >>
> > Right, of course. I see what is happening now.
> >
> > - raid5d cannot complete stripes until the metadata is written
> > - the metadata cannot be written until raid5d gets the mddev_lock
> > - mddev_lock is held by the write to suspend_hi
> > - the write to suspend_hi is waiting for raid5_quiesce
> > - raid5_quiesce is waiting for some stripes to complete.
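
The five-step wait chain above can be modelled as a small wait-for graph. This is only a sketch for following the argument, not kernel code: the enum labels, the waits_on table and in_wait_cycle() are hypothetical names made up for illustration, and each party is assumed to wait on exactly one other.

```c
#include <stdbool.h>

/* Hypothetical labels for the five parties in the chain; not kernel identifiers. */
enum party {
    STRIPES,          /* stripes that raid5d must complete */
    METADATA_WRITE,   /* metadata update that clears MD_SB_CHANGE_PENDING */
    MDDEV_LOCK,       /* mddev_lock, needed before the metadata can be written */
    SUSPEND_HI_WRITE, /* the write to suspend_hi, which holds the lock */
    RAID5_QUIESCE,    /* raid5_quiesce, waiting for stripes to complete */
    NPARTIES
};

/* waits_on[x] is the one party x is blocked on, or -1 if x is not blocked. */
static const int waits_on[NPARTIES] = {
    [STRIPES]          = METADATA_WRITE,
    [METADATA_WRITE]   = MDDEV_LOCK,
    [MDDEV_LOCK]       = SUSPEND_HI_WRITE,
    [SUSPEND_HI_WRITE] = RAID5_QUIESCE,
    [RAID5_QUIESCE]    = STRIPES,
};

/* Follow the chain from start; return true if it leads back to start. */
bool in_wait_cycle(const int *edges, int n, int start)
{
    int cur = start;

    for (int steps = 0; steps < n; steps++) {
        cur = edges[cur];
        if (cur < 0)
            return false;   /* chain ends: someone can make progress */
        if (cur == start)
            return true;    /* came back around: start can never proceed */
    }
    return false;           /* any cycle reached does not include start */
}
```

With the table as written, in_wait_cycle(waits_on, NPARTIES, STRIPES) is true. Setting any single entry to -1 breaks the cycle; for example, clearing the MDDEV_LOCK entry corresponds to not holding the lock across ->quiesce(, 1).
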
> >
> > We could declare that ->quiesce(, 1) cannot be called while holding the
> > lock.
> > We could possibly allow it but only if md_update_sb() is called first,
> > though that might still be racy.
> >
> > ->quiesce(, 1) is currently called from:
> >    mddev_suspend
> >    suspend_lo_store
> >    suspend_hi_store
> >    __md_stop_writes
> >    mddev_detach
> >    set_bitmap_file
> >    update_array_info (when setting/removing internal bitmap)
> >    md_do_sync
> >
> > and most of those are called with the lock held, or take the lock.
> >
> > Maybe we should *require* that mddev_lock is held when calling
> > ->quiesce() and have ->quiesce() do the metadata update.
> >
> > Something like the following maybe. Can you test it?
>
> Hi Neil
>
> Thanks for the analysis. I need to think for a while :)
> I already added the patch and the test is running now. It usually needs
> more than 5 hours to reproduce this problem. I'll let it run more than
> 24 hours. I'll update the test result later.

Hi Neil

The problem still exists, but this time there is no call trace. The test
got stuck yesterday, and I didn't notice because no call trace was printed.

I enabled dynamic debug for raid5.c:

    echo file raid5.c +p > /sys/kernel/debug/dynamic_debug/control

The output shows that raid5d is still spinning.

Regards
Xiao

>
> Regards
> Xiao