From: Neil Brown <neilb@suse.de>
To: Tim Small <tim@seoss.co.uk>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Deadlock in md barrier code? / RAID1 / LVM CoW snapshot + ext3 / Debian 5.0 - lenny 2.6.26 kernel
Date: Thu, 21 Oct 2010 10:04:29 +1100 [thread overview]
Message-ID: <20101021100429.4879a001@notabene> (raw)
In-Reply-To: <4CBF5267.7080304@seoss.co.uk>
On Wed, 20 Oct 2010 21:34:47 +0100
Tim Small <tim@seoss.co.uk> wrote:
> On 19/10/10 20:29, Tim Small wrote:
> > Sprinkled a few more printks....
> >
> > http://buttersideup.com/files/md-raid1-lockup-lvm-snapshot/dmesg-deadlock-instrumented.txt
> >
>
> It seems that when the system is hung, conf->nr_pending gets stuck with
> a value of 2. The resync task ends up stuck in the second
> wait_event_lock_irq within raise_barrier, and everything else gets stuck
> in the first wait_event_lock_irq while waiting for that to complete.
>
> So my assumption is that some IOs either get stuck incomplete, or take a
> path through the code such that they complete without calling allow_barrier.
>
> Does that make any sense?
>
Yes, it is pretty much the same place that my thinking has reached.
I am quite confident that IO requests cannot complete without calling
allow_barrier - if that were possible I think we would be seeing a lot more
problems, and in any case it is a fairly easy code path to verify by
inspection.
So the most likely avenue of exploration is that the IOs get stuck
somewhere. But where?
They could get stuck in the device queue while the queue is plugged. But
queues are meant to auto-unplug after 3msec. And in any case the
raid1_unplug call in wait_event_lock_irq will make sure everything is
unplugged.
If there had been an error (which according to the logs there wasn't) the
request could be stuck in the retry queue, but raid1d will take things off
that queue and handle them. raid1_unplug wakes up raid1d, and the stack
traces show that raid1d is simply waiting to be woken; it isn't blocking on
anything.
I guess there could be an attempt to do a barrier write that failed and
needed to be retried. Maybe you could add a printk if R1BIO_BarrierRetry
ever gets set. I don't expect it will tell us much though.
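Something like this, next to where the flag is set in
raid1_end_write_request(), should do (untested sketch; test_and_set_bit
keeps the behaviour of the plain set_bit but only logs the first time the
flag is raised on a given r1_bio):

```
	/* in raid1_end_write_request(), where the -EOPNOTSUPP barrier
	 * failure is handled -- replaces the plain set_bit() */
	if (!test_and_set_bit(R1BIO_BarrierRetry, &r1_bio->state))
		printk(KERN_INFO "raid1: barrier write failed, will retry\n");
```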
They could be in pending_bio_list, but that is flushed by raid1d too.
Maybe you could add a couple of global atomic variables, one for reads and
one for writes.
Then on each call to generic_make_request in:
flush_pending_writes, make_request, raid1d
increment one or the other depending on whether it is a read or a write.
Then in raid1_end_read_request and raid1_end_write_request decrement them
appropriately.
Then in raid1_unplug (which is called just before the schedule in the
event_wait code) print out these two numbers.
Possibly also print something when you decrement them if they become zero.
That would tell us if the requests were stuck in the underlying devices, or
if they were stuck in raid1 somewhere.
Maybe you could also check that the retry list and the pending list are empty
and print that status somewhere suitable...
NeilBrown
Thread overview: 21+ messages
2010-09-17 14:53 Deadlock in md barrier code? / RAID1 / LVM CoW snapshot + ext3 / Debian 5.0 - lenny 2.6.26 kernel Tim Small
2010-09-17 22:59 ` Neil Brown
2010-09-20 19:59 ` Tim Small
2010-09-21 21:02 ` Tim Small
2010-09-21 22:30 ` Neil Brown
2010-10-12 13:59 ` Tim Small
2010-10-12 14:06 ` Tim Small
2010-10-12 16:48 ` CoolCold
2010-10-13 8:51 ` Tim Small
2010-10-13 13:00 ` CoolCold
2010-10-18 18:52 ` Tim Small
2010-10-19 6:16 ` Neil Brown
2010-10-19 16:24 ` Tim Small
2010-10-19 16:29 ` Tim Small
2010-10-19 19:29 ` Tim Small
2010-10-20 20:34 ` Tim Small
2010-10-20 23:04 ` Neil Brown [this message]
2010-11-18 18:04 ` Tim Small
2010-11-21 23:05 ` Neil Brown
2010-12-06 15:42 ` Tim Small
2010-09-21 22:21 ` Neil Brown