From: Shaohua Li <shli@kernel.org>
To: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Cc: linux-raid@vger.kernel.org, Jes.Sorensen@redhat.com
Subject: Re: [PATCH] md: wake up personality thread after array state update
Date: Thu, 27 Oct 2016 15:02:35 -0700
Message-ID: <20161027220235.o64c3ei454l2y4xi@kernel.org>
In-Reply-To: <20161027085206.GA11138@proton.igk.intel.com>

On Thu, Oct 27, 2016 at 10:52:06AM +0200, Tomasz Majchrzak wrote:
> On Wed, Oct 26, 2016 at 12:14:55PM -0700, Shaohua Li wrote:
> > On Tue, Oct 25, 2016 at 05:07:08PM +0200, Tomasz Majchrzak wrote:
> > > When a raid1/raid10 array fails to write to one of the drives, the request
> > > is added to bio_end_io_list and finished by the personality thread. The
> > > thread doesn't handle it as long as the MD_CHANGE_PENDING flag is set. In
> > > the case of external metadata this flag is cleared, but the thread is
> > > not woken up. This causes the request to be blocked for a few seconds (until
> > > another action on the array wakes up the thread) or to get stuck
> > > indefinitely.
> > >
> > > Wake up the personality thread once MD_CHANGE_PENDING has been cleared.
> > > Moving the 'restart_array' call after the flag is cleared is not a solution
> > > because in read-write mode the call doesn't wake up the thread.
> >
> > The patch looks good. However, can you elaborate on how userspace handles this
> > case? I'd like to understand what the user interface should be to support
> > external metadata arrays.
>
> 1. The kernel encounters a new bad block that needs to be acknowledged.
>
> sysfs array state == "write-pending" (as MD_CHANGE_PENDING set)
> sysfs rdev state == "blocked" (as unacked_exists + external_bbl set)
>
> 2. mdmon wakes up as there is an update to sysfs array state and unacknowledged
> bad blocks list.
>
> 3. mdmon checks the state of each disk. If any is 'blocked' and there is
> support for bad blocks in the metadata, it reads the unacknowledged bad block
> list and records the new bad blocks in the metadata. If successful, it
> acknowledges the bad blocks by writing to the sysfs bad block file. If all bad
> blocks have been acknowledged, it schedules a disk unblock.
>
> As soon as the kernel marks all bad blocks as acknowledged, it will clear the
> unacked_exists flag.
>
> 4. mdmon checks the 'faulty' flag for each disk. If it is set, the disk is
> removed from the array and an unblock is scheduled.
>
> 5. mdmon requests to unblock the array by writing '-blocked' to sysfs disk
> state.
>
> Requests waiting for bad block confirmation are woken up in the kernel.

Why is this step needed? Step 3 writes the bad block file, which already wakes
up the threads waiting for bad block confirmation.

> 6. mdmon writes 'active' to sysfs array state.
>
> The MD_CHANGE_PENDING flag is cleared by this step, but the personality thread
> is not woken up. The patch resolves this problem.
>
> I hope it answers your question.

This is clear, thanks! I applied this patch.

Thanks,
Shaohua
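
For readers following the thread, the missed wakeup described in step 6 can be
illustrated with a small toy model. This is only a sketch, not kernel code:
the class, its methods, and the idea of a single "woken" flag are assumptions
made for illustration. Only the names MD_CHANGE_PENDING and bio_end_io_list
come from the discussion above. It shows a request staying queued when the
flag is cleared without a wakeup, and completing when the state write also
wakes the thread.

```python
# Toy model only: class and method names are illustrative, not kernel APIs.
class ToyArray:
    def __init__(self, wake_on_state_write):
        self.change_pending = False   # stands in for MD_CHANGE_PENDING
        self.bio_end_io_list = []     # failed writes waiting to be finished
        self.thread_woken = False     # has the personality thread been woken?
        self.completed = []           # requests the thread has finished
        self.wake_on_state_write = wake_on_state_write

    def write_failed(self, request):
        """A write to one drive failed: queue it and wake the thread
        (assumed behaviour for this model)."""
        self.bio_end_io_list.append(request)
        self.change_pending = True
        self.thread_woken = True

    def write_array_state_active(self):
        """Userspace writes 'active' to the array state (step 6)."""
        self.change_pending = False
        if self.wake_on_state_write:
            # Models the behaviour the patch adds: wake after clearing the flag.
            self.thread_woken = True

    def run_thread(self):
        """One scheduling opportunity for the personality thread."""
        if not self.thread_woken:
            return                    # still asleep, nothing is processed
        self.thread_woken = False
        if self.change_pending:
            return                    # flag still set, requests are left queued
        self.completed.extend(self.bio_end_io_list)
        self.bio_end_io_list.clear()


def simulate(wake_on_state_write):
    array = ToyArray(wake_on_state_write)
    array.write_failed("req0")
    array.run_thread()                # woken, but the flag is still set
    array.write_array_state_active()  # flag cleared from userspace
    array.run_thread()                # runs only if something woke the thread
    return array.completed


if __name__ == "__main__":
    print("without wakeup:", simulate(False))
    print("with wakeup:   ", simulate(True))
```

In the model without the wakeup, the request remains on the list until some
other event wakes the thread, which matches the "blocked for a few seconds or
stuck indefinitely" symptom from the patch description.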