From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Ni
Subject: Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared
Date: Wed, 6 Sep 2017 21:37:57 -0400 (EDT)
Message-ID: <624049285.8379021.1504748277805.JavaMail.zimbra@redhat.com>
References: <546311999.4473128.1504339295016.JavaMail.zimbra@redhat.com>
 <877exfdx7x.fsf@notabene.neil.brown.name>
 <22698eb3-35f7-04e5-96e8-26470d892655@redhat.com>
 <87y3pvc9ha.fsf@notabene.neil.brown.name>
 <34fedde7-cef9-34ff-1403-9d097267eb55@redhat.com>
 <87k21ec4fn.fsf@notabene.neil.brown.name>
 <3a5955de-e6a1-de83-b00b-1984f7125799@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <3a5955de-e6a1-de83-b00b-1984f7125799@redhat.com>
Sender: linux-raid-owner@vger.kernel.org
To: NeilBrown, linux-raid
Cc: shli@kernel.org
List-Id: linux-raid.ids

----- Original Message -----
> From: "Xiao Ni"
> To: "NeilBrown", "linux-raid"
> Cc: shli@kernel.org
> Sent: Tuesday, September 5, 2017 10:15:00 AM
> Subject: Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared
>
>
>
> On 09/05/2017 09:36 AM, NeilBrown wrote:
> > On Mon, Sep 04 2017, Xiao Ni wrote:
> >
> >>
> >> In function handle_stripe:
> >>     if (s.handle_bad_blocks ||
> >>         test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
> >>             set_bit(STRIPE_HANDLE, &sh->state);
> >>             goto finish;
> >>     }
> >>
> >> Because MD_SB_CHANGE_PENDING is set, so the stripes can't be handled.
> >>
> > Right, of course. I see what is happening now.
> >
> > - raid5d cannot complete stripes until the metadata is written
> > - the metadata cannot be written until raid5d gets the mddev_lock
> > - mddev_lock is held by the write to suspend_hi
> > - the write to suspend_hi is waiting for raid5_quiesce
> > - raid5_quiesce is waiting for some stripes to complete.
> >
> > We could declare that ->quiesce(, 1) cannot be called while holding the
> > lock.
> > We could possibly allow it but only if md_update_sb() is called first,
> > though that might still be racy.
> >
> > ->quiesce(, 1) is currently called from:
> >   mddev_suspend
> >   suspend_lo_store
> >   suspend_hi_store
> >   __md_stop_writes
> >   mddev_detach
> >   set_bitmap_file
> >   update_array_info (when setting/removing internal bitmap)
> >   md_do_sync
> >
> > and most of those are called with the lock held, or take the lock.
> >
> > Maybe we should *require* that mddev_lock is held when calling
> > ->quiesce() and have ->quiesce() do the metadata update.
> >
> > Something like the following maybe. Can you test it?
>
> Hi Neil
>
> Thanks for the analysis. I need to think for a while :)
> I already added the patch and the test is running now. It usually needs
> more than 5 hours to reproduce this problem. I'll let it run more than
> 24 hours. I'll update the test result later.

Hi Neil

The problem still exists, but it doesn't show a call trace this time.
It was already stuck yesterday; I didn't notice because there was no
call trace.

I enabled dynamic debug output for raid5.c:

echo file raid5.c +p > /sys/kernel/debug/dynamic_debug/control

It shows that raid5d is still spinning.

Regards
Xiao

>
> Regards
> Xiao