Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598

From: Dave Chinner <david@fromorbit.com>
To: "Rafał Kupka" <kupson@kupson.net>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
Date: Tue, 27 Mar 2012 14:28:43 +1100	[thread overview]
Message-ID: <20120327032843.GV5091@dastard> (raw)
In-Reply-To: <1332777925.14696.194.camel@utemp.mieszkanie>

On Mon, Mar 26, 2012 at 06:05:25PM +0200, Rafał Kupka wrote:
> On Thu, 2012-03-08 at 20:10 +0100, Rafał Kupka wrote: 
> > On Wed, 2012-03-07 at 07:30 +1100, Dave Chinner wrote:
> 
> Hi,
> 
> Finally I've captured this event trace. Hope it will help.
> 
> > > Also, it might be worthwhile to capture an event trace (e.g. all the
> > > xfs_buf events) to see what occurs just before the error is
> > > triggered. That might tell us exactl what sequence has occurred
> > > leading up to the error.
> > 
> > I will try to provide this information in a few days.
> 
> Kernel version - Linus tree up to:
> 0e6e847f "xfs: stop using the page cache to back the buffer cache"
> 
> It's uniprocessor virtual machine (KVM) with 384 MB RAM.

Ok, so limited memory and no parallelism. Can you send me you
.config so I can have a look at other various config options?

> Full event trace is available at https://gist.github.com/2206109
> (sorry for it's *.perl extension, my mistake).
> 
> Kernel log entry:
> [  495.197571] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80       ("xfs_trans_read_buf") error 5 buf count 8192
> [  495.198283] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> [  501.540477] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80       ("xfs_trans_read_buf") error 5 buf count 8192
> [  501.543047] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> 
> It's possible that this "I/O error" is caused by memory pressure?
> 
> Related trace line:
> xfsdump-2498  [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
> 
> Events related to "bno 0x13d80":
> xfsdump-2498  [000] 24195.782266: xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
> xfsdump-2498  [000] 24195.782266: xfs_buf_read: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_trans_read_buf
> xfsdump-2498  [000] 24195.782267: xfs_buf_iorequest: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller _xfs_buf_read
> xfsdump-2498  [000] 24195.782267: xfs_buf_hold: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
> xfsdump-2498  [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
> xfsdump-2498  [000] 24195.782270: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
> xfsdump-2498  [000] 24195.782985: xfs_buf_unlock: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_buf_relse
> xfsdump-2498  [000] 24195.782986: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_trans_read_buf
> 
> Do this data put some light on this issue?

Yes, that it was errored out because the bio that was built from the
buffer was zero length. That means the buffer was not set up
properly in the first place. Hmmm, this is unusual - the two lines
above your traces:

xfs_buf_cond_lock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|ASYNC|TRYLOCK|PAGES caller _xfs_buf_find
xfs_buf_find: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
xfs_buf_read: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_trans_read_buf

The lookup found an uninitialised buffer in the cache. That's
exceedingly curious - the match is done on bp->b_file_offset, and
after it has been inserted it the bno is then filled out.

Oh, wow, I think I see the bug. The key is the initial state of that
uninitialised buffer: READ|ASYNC|TRYLOCK|PAGES, and the state of
the it from the xfs_buf_get() trace READ|LOCK|DONT_BLOCK. Those two
sets of flags do not match up - the initial set indicate the buffer
has been inserted into the cache for readahead, the second set
indicate a blocking read. 

You've only got a small amount of memory, so I suspect that a memory
allocation has failed in a specific place to cause this. I'm not
exactly sure how the second thread trips over it and initialises it
somewhat (there's 3 initialisation steps to a buffer) but the
initial failure is probably this:

	async readhead started
	_xfs_buf_find fails
	allocate new buffer A
	partially initialise A
	_xfs_buf_find
	   fail to find match
	   insert buffer A into rbtree
	   returns buffer A
	A == A
	   fails allocate pages to buffer
	unlock buffer
	release buffer

	<uninitialised buffer now sitting in cache. THIS IS WRONG!>

That would lead to a second lookup finding it and using it, but I'm
still not quite sure how the buffer bno gets initialised. I can
construct valid scenarios that will cause it to happen, but none
match the event trace you've posted.

And just looking at the way we clean up this error, this race
condition could lead to a use after free if we immediately free
if we fail to allocate the pages to the buffer. Messy. I'll have to
think on this a bit - maybe a "broken" state is necessary.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs