From: Neil Brown <neilb@suse.de>
To: Tim Small <tim@seoss.co.uk>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Deadlock in md barrier code? / RAID1 / LVM CoW snapshot + ext3 / Debian 5.0 - lenny 2.6.26 kernel
Date: Mon, 22 Nov 2010 10:05:09 +1100
Message-ID: <20101122100509.76592b59@notabene.brown>
In-Reply-To: <4CE56AA9.2010905@seoss.co.uk>
On Thu, 18 Nov 2010 18:04:25 +0000
Tim Small <tim@seoss.co.uk> wrote:
> On 10/21/10 00:04, Neil Brown wrote:
> >
> > Maybe you could add a couple of global atomic variables, one for reads and one
> > for writes.
> > Then on each call to generic_make_request in:
> > flush_pending_writes, make_request, raid1d
> > increment one or the other depending on whether it is a read or a write.
> > Then in raid1_end_read_request and raid1_end_write_request decrement them
> > appropriately.
> >
>
>
> Ended up with runs like this:
>
> [ 464.244109] 0 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244109] conf nr_pending is 5
> [ 464.244109] 0 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244109] 2216: Post wait until no IO waiting - barrier: 0, pend:
> 5, wait: 0, queued: 0
> [ 464.244113] 2216: Pre wait pending to complete - barrier: 1, pend: 5,
> wait: 0, queued: 0
> [ 464.244116] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5113 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.244469] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 4
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.246176] In raid1_unplug debug read count: 4 write count: 0
> conf->nr_queued: 0
> [ 464.244469] conf nr_pending is 3
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 2
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.246176] In raid1_unplug debug read count: 2 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 1
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.246176] In raid1_unplug debug read count: 1 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 0
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244469] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] 2216: Post wait pending to complete - barrier: 1, pend:
> 0, wait: 1, queued: 0
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] 2216: Pre wait until no IO waiting - barrier: 1, pend: 0,
> wait: 1, queued: 0
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246633] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 1
> [ 464.244469] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.244469] 5113 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.244469] In raid1_unplug debug read count: 3 write count: 0
> conf->nr_queued: 0
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246176] 2216: Post wait until no IO waiting - barrier: 0, pend:
> 6, wait: 0, queued: 0
> [ 464.246176] 2216: Pre wait pending to complete - barrier: 1, pend: 6,
> wait: 0, queued: 0
> [ 464.246176] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] In raid1_unplug debug read count: 6 write count: 0
> conf->nr_queued: 0
> [ 464.247091] 0 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.247091] conf nr_pending is 5
> [ 464.247091] 0 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.247091] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5113 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246176] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.328828] 3639 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.328834] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 466.065137] 2479 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 466.065137] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 467.051970] 2478 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 467.051975] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 471.293667] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 471.525081] 220 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 471.525081] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 505.308962] 2216 WARNING, schedule_timeout timed out for raise_barrier
> [ 505.308076] 5118 WARNING, schedule_timeout timed out for wait_barrier
>
>
> ... so does that mean that there are read requests going AWOL in the
> block layer?
>
> If so, then the circumstantial evidence from when the lockups occur makes
> it look to me like this is probably an OpenVZ bug, but I'll try and do
> some more digging tomorrow...
>
> Does that make sense?
Yes. Superficially, it appears that there are still 5 outstanding read
requests that are not being completed. I cannot guess how OpenVZ would
cause that, but I don't really know much about OpenVZ.
NeilBrown
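
For readers following the thread, the counting idea quoted above can be
sketched in userspace C roughly as follows. This is only an illustration
under stated assumptions, not the debug patch that produced the log: every
name here (dbg_reads_in_flight, dbg_writes_in_flight, dbg_io_submitted,
dbg_io_completed, dbg_demo) is hypothetical, and C11 atomics stand in for
the kernel's atomic_t. In the kernel the increment would sit next to each
generic_make_request call and the decrement in raid1_end_read_request and
raid1_end_write_request.

```c
#include <assert.h>
#include <stdatomic.h>

enum io_dir { IO_READ, IO_WRITE };

/* Hypothetical debug counters: requests submitted but not yet completed. */
static atomic_int dbg_reads_in_flight;
static atomic_int dbg_writes_in_flight;

static atomic_int *dbg_counter(enum io_dir dir)
{
    return dir == IO_READ ? &dbg_reads_in_flight : &dbg_writes_in_flight;
}

/* Call just before handing a request to the lower layer. */
static void dbg_io_submitted(enum io_dir dir)
{
    atomic_fetch_add(dbg_counter(dir), 1);
}

/* Call from the matching completion handler. */
static void dbg_io_completed(enum io_dir dir)
{
    int before = atomic_fetch_sub(dbg_counter(dir), 1);

    /* A completion without a prior submission means the accounting is wrong. */
    assert(before > 0);
    (void)before;
}

/*
 * Made-up workload: submit 6 reads and 1 write, then complete the write
 * and only 1 of the reads. Returns the number of reads still outstanding.
 */
static int dbg_demo(void)
{
    for (int i = 0; i < 6; i++)
        dbg_io_submitted(IO_READ);
    dbg_io_submitted(IO_WRITE);

    dbg_io_completed(IO_WRITE);
    dbg_io_completed(IO_READ);

    return atomic_load(&dbg_reads_in_flight);
}
```

If the read counter stays above zero while nothing new is being submitted,
as it does at the end of the log above, then requests were passed down
and their completion handlers never ran.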