From: Neil Brown <neilb@suse.de>
To: Tim Small <tim@seoss.co.uk>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Deadlock in md barrier code? / RAID1 / LVM CoW snapshot + ext3 / Debian 5.0 - lenny 2.6.26 kernel
Date: Mon, 22 Nov 2010 10:05:09 +1100 [thread overview]
Message-ID: <20101122100509.76592b59@notabene.brown> (raw)
In-Reply-To: <4CE56AA9.2010905@seoss.co.uk>
On Thu, 18 Nov 2010 18:04:25 +0000
Tim Small <tim@seoss.co.uk> wrote:
> On 10/21/10 00:04, Neil Brown wrote:
> >
> > Maybe you could add a could of global atomic variables, one for reads and one
> > for writes.
> > Then on each call to generic_make_request in:
> > flush_pending_writes, make_request, raid1d
> > increment one or the other depending on whether it is a read or a write.
> > Then in raid1_end_read_request and raid1_end_write_request decrement them
> > appropriately.
> >
>
>
> Ended up with runs like this:
>
> [ 464.244109] 0 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244109] conf nr_pending is 5
> [ 464.244109] 0 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244109] 2216: Post wait until no IO waiting - barrier: 0, pend:
> 5, wait: 0, queued: 0
> [ 464.244113] 2216: Pre wait pending to complete - barrier: 1, pend: 5,
> wait: 0, queued: 0
> [ 464.244116] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5113 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.244469] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 4
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.246176] In raid1_unplug debug read count: 4 write count: 0
> conf->nr_queued: 0
> [ 464.244469] conf nr_pending is 3
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 2
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.246176] In raid1_unplug debug read count: 2 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 1
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.246176] In raid1_unplug debug read count: 1 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5127 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.244469] conf nr_pending is 0
> [ 464.244469] 5127 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.244469] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] 2216: Post wait pending to complete - barrier: 1, pend:
> 0, wait: 1, queued: 0
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] 2216: Pre wait until no IO waiting - barrier: 1, pend: 0,
> wait: 1, queued: 0
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246633] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 1
> [ 464.244469] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246176] In raid1_unplug debug read count: 0 write count: 0
> conf->nr_queued: 0
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.244469] 5113 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.244469] In raid1_unplug debug read count: 3 write count: 0
> conf->nr_queued: 0
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] 5118 Released resync_lock and enabling interrupts for
> wait_barrier
> [ 464.246176] 2216: Post wait until no IO waiting - barrier: 0, pend:
> 6, wait: 0, queued: 0
> [ 464.246176] 2216: Pre wait pending to complete - barrier: 1, pend: 6,
> wait: 0, queued: 0
> [ 464.246176] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.246990] 5118 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246990] In raid1_unplug debug read count: 6 write count: 0
> conf->nr_queued: 0
> [ 464.247091] 0 Obtaining resync_lock and disabling interrupts for
> allow_barrier
> [ 464.247091] conf nr_pending is 5
> [ 464.247091] 0 Released resync_lock and enabling interrupts for
> allow_barrier
> [ 464.247091] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] 5113 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.246176] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.244469] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 464.328828] 3639 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 464.328834] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 466.065137] 2479 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 466.065137] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 467.051970] 2478 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 467.051975] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 471.293667] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 471.525081] 220 Obtaining resync_lock and disabling interrupts for
> wait_barrier
> [ 471.525081] In raid1_unplug debug read count: 5 write count: 0
> conf->nr_queued: 0
> [ 505.308962] 2216 WARNING, schedule_timeout timed out for raise_barrier
> [ 505.308076] 5118 WARNING, schedule_timeout timed out for wait_barrier
>
>
> ... so does that mean that there are are read requests going awol in the
> block layer?
>
> If so, then the circumstantial evidence from when the lockups occur make
> it look to me like this is probably an OpenVZ bug, but I'll try and do
> some more digging tomorrow...
>
> Does that make sense?
Yes. Superficially, it appears that there are still 5 outstanding read
requests that are not being completed. I cannot guess how OpenVZ would
cause that, but I don't really know much about OpenVZ.
NeilBrown
next prev parent reply other threads:[~2010-11-21 23:05 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-09-17 14:53 Deadlock in md barrier code? / RAID1 / LVM CoW snapshot + ext3 / Debian 5.0 - lenny 2.6.26 kernel Tim Small
2010-09-17 22:59 ` Neil Brown
2010-09-20 19:59 ` Tim Small
2010-09-21 21:02 ` Tim Small
2010-09-21 22:30 ` Neil Brown
2010-10-12 13:59 ` Tim Small
2010-10-12 14:06 ` Tim Small
2010-10-12 16:48 ` CoolCold
2010-10-13 8:51 ` Tim Small
2010-10-13 13:00 ` CoolCold
2010-10-18 18:52 ` Tim Small
2010-10-19 6:16 ` Neil Brown
2010-10-19 16:24 ` Tim Small
2010-10-19 16:29 ` Tim Small
2010-10-19 19:29 ` Tim Small
2010-10-20 20:34 ` Tim Small
2010-10-20 23:04 ` Neil Brown
2010-11-18 18:04 ` Tim Small
2010-11-21 23:05 ` Neil Brown [this message]
2010-12-06 15:42 ` Tim Small
2010-09-21 22:21 ` Neil Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20101122100509.76592b59@notabene.brown \
--to=neilb@suse.de \
--cc=linux-raid@vger.kernel.org \
--cc=tim@seoss.co.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).