From: NeilBrown <neilb@suse.de>
To: Tregaron Bayly <tbayly@bluehost.com>
Cc: Alexander Lyakas <alex.bolshoy@gmail.com>,
linux-raid <linux-raid@vger.kernel.org>,
Shyam Kaushik <shyam@zadarastorage.com>,
yair@zadarastorage.com
Subject: Re: BUG - raid 1 deadlock on handle_read_error / wait_barrier
Date: Wed, 12 Jun 2013 11:30:28 +1000 [thread overview]
Message-ID: <20130612113028.1ee21189@notabene.brown> (raw)
In-Reply-To: <51B0A40A.2010208@bluehost.com>
[-- Attachment #1: Type: text/plain, Size: 4261 bytes --]
On Thu, 06 Jun 2013 09:00:26 -0600 Tregaron Bayly <tbayly@bluehost.com> wrote:
> On 06/03/2013 07:49 PM, NeilBrown wrote:
> > On Sun, 2 Jun 2013 15:43:41 +0300 Alexander Lyakas <alex.bolshoy@gmail.com>
> > wrote:
> >
> >> Hello Neil,
> >> I believe I have found what is causing the deadlock. It happens in two flavors:
> >>
> >> 1)
> >> # raid1d() is called, and conf->pending_bio_list is non-empty at this point
> >> # raid1d() calls md_check_recovery(), which eventually calls
> >> raid1_add_disk(), which calls raise_barrier()
> >> # Now raise_barrier will wait for conf->nr_pending to become 0, but it
> >> cannot become 0, because there are bios sitting in
> >> conf->pending_bio_list, which nobody will flush, because raid1d is the
> >> one supposed to call flush_pending_writes(), either directly or via
> >> handle_read_error. But it is stuck in raise_barrier.
> >>
> >> 2)
> >> # raid1_add_disk() calls raise_barrier(), and waits for
> >> conf->nr_pending to become 0, as before
> >> # new WRITE comes and calls wait_barrier(), but this thread has a
> >> non-empty current->bio_list
> >> # In this case, the code allows the WRITE to go through
> >> wait_barrier(), and trigger WRITEs to mirror legs, but these WRITEs
> >> again end up in conf->pending_bio_list (either via raid1_unplug or
> >> directly). But nobody will flush conf->pending_bio_list, because
> >> raid1d is stuck in raise_barrier.
> >>
> >> Previously, for example in kernel 3.2, raid1_add_disk did not call
> >> raise_barrier, so this problem did not happen.
> >>
> >> Attached is a reproduction with some prints that I added to
> >> raise_barrier and wait_barrier (their code also attached). It
> >> demonstrates case 2. It shows that once raise_barrier got called,
> >> conf->nr_pending drops down, until it equals the number of
> >> wait_barrier calls, that slipped through because of non-empty
> >> current->bio_list. And at this point, this array hangs.
> >>
> >> Can you please comment on how to fix this problem. It looks like a
> >> real deadlock.
> >> We can perhaps call md_check_recovery() after flush_pending_writes(),
> >> but this only makes the window smaller, not closes it entirely. But it
> >> looks like we really should not be calling raise_barrier from raid1d.
> >>
> >> Thanks,
> >> Alex.
> >
> > Hi Alex,
> > thanks for the analysis.
> >
> > Does the following patch fix it? It makes raise_barrier more like
> > freeze_array().
> > If not, could you try making the same change to the first
> > wait_event_lock_irq in raise_barrier?
> >
> > Thanks.
> > NeilBrown
> >
> >
> >
> > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > index 328fa2d..d34f892 100644
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -828,9 +828,9 @@ static void raise_barrier(struct r1conf *conf)
> > conf->barrier++;
> >
> > /* Now wait for all pending IO to complete */
> > - wait_event_lock_irq(conf->wait_barrier,
> > - !conf->nr_pending && conf->barrier < RESYNC_DEPTH,
> > - conf->resync_lock);
> > + wait_event_lock_irq_cmd(conf->wait_barrier,
> > + !conf->nr_pending && conf->barrier < RESYNC_DEPTH,
> > + conf->resync_lock, flush_pending_writes(conf));
> >
> > spin_unlock_irq(&conf->resync_lock);
> > }
> >
>
> Neil,
>
> This deadlock also cropped up in 3.4 between .37 and .38. Passing flush_pending_writes(conf) as cmd to wait_event_lock_irq seems to fix it there as well.
>
> --- linux-3.4.38/drivers/md/raid1.c 2013-03-28 13:12:41.000000000 -0600
> +++ linux-3.4.38.patch/drivers/md/raid1.c 2013-06-04 12:17:35.314194903 -0600
> @@ -751,7 +751,7 @@
> /* Now wait for all pending IO to complete */
> wait_event_lock_irq(conf->wait_barrier,
> !conf->nr_pending && conf->barrier < RESYNC_DEPTH,
> - conf->resync_lock, );
> + conf->resync_lock, flush_pending_writes(conf));
>
> spin_unlock_irq(&conf->resync_lock);
> }
>
I suspect it was already there in 3.4.37 .. in fact I think it was there in
3.4.
I'll mark the fix for including in 3.4-stable
Thanks for the report.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
prev parent reply other threads:[~2013-06-12 1:30 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-02-21 22:58 BUG - raid 1 deadlock on handle_read_error / wait_barrier Tregaron Bayly
2013-02-22 3:44 ` Joe Lawrence
2013-02-22 11:52 ` majianpeng
2013-02-22 16:03 ` Tregaron Bayly
2013-02-22 18:14 ` Joe Lawrence
2013-02-24 22:43 ` NeilBrown
2013-02-25 0:04 ` NeilBrown
2013-02-25 16:11 ` Tregaron Bayly
2013-02-25 22:54 ` NeilBrown
2013-02-26 14:09 ` Joe Lawrence
2013-05-16 14:07 ` Alexander Lyakas
2013-05-20 7:17 ` NeilBrown
2013-05-30 14:30 ` Alexander Lyakas
2013-06-02 12:43 ` Alexander Lyakas
2013-06-04 1:49 ` NeilBrown
2013-06-04 9:52 ` Alexander Lyakas
2013-06-06 15:00 ` Tregaron Bayly
2013-06-08 9:45 ` Alexander Lyakas
2013-06-12 0:42 ` NeilBrown
2013-06-12 1:30 ` NeilBrown [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130612113028.1ee21189@notabene.brown \
--to=neilb@suse.de \
--cc=alex.bolshoy@gmail.com \
--cc=linux-raid@vger.kernel.org \
--cc=shyam@zadarastorage.com \
--cc=tbayly@bluehost.com \
--cc=yair@zadarastorage.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).