assertion failure with latest xfs

public inbox for linux-xfs@vger.kernel.org

* assertion failure with latest xfs
@ 2008-10-23  9:08 Lachlan McIlroy
  2008-10-23 17:31 ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Lachlan McIlroy @ 2008-10-23  9:08 UTC (permalink / raw)
  To: xfs-oss

Just encountered this after pulling in the latest changes.  We are trying to
initialise an inode that should have an i_count of 1 but instead it is 2.  I
was running XFSQA test 167 when it happened.

Stack traceback for pid 25318
0xffff88004201ddc0    25318    25228  1    3   R  0xffff88004201e228 *fsstress
sp                ip                Function (args)
0xffff88000f409ab8 0xffffffff811c9265 assfail+0x1a (invalid, invalid, invalid)
0xffff88000f409af0 0xffffffff811c3cd5 xfs_setup_inode+0x56 (0xffff88006e1fbc00)
0xffff88000f409b20 0xffffffff8119d7a7 xfs_ialloc+0x53a (0xffff88006c8fd880, 0xffff880069990780, invalid, invalid, 0x100000000, invalid, 0xffff880000000000, 0x1, 0xffff88000f409c28)
0xffff88000f409bb0 0xffffffff811b4952 xfs_dir_ialloc+0xa0 (0xffff88000f409d00, 0xffff880069990780, invalid, invalid, 0x100000000, 0x0, 0xffff880000000000, 0xffff880000000001, 0xffff88000f409d08)
0xffff88000f409c70 0xffffffff811b8a87 xfs_create+0x325 (0xffff880069990780, 0xffff88000f409d68, invalid, invalid, 0xffff88000f409d80, 0x0)
0xffff88000f409d50 0xffffffff811c36eb xfs_vn_mknod+0x14f (0xffff880069990a20, 0xffff88007f564000, invalid, invalid)
0xffff88000f409dc0 0xffffffff811c37db xfs_vn_create+0xb
0xffff88000f409dd0 0xffffffff810b27fe vfs_create+0xdf (0xffff880069990a20, 0xffff88007f564000, invalid, 0xffff88000f409e48)
0xffff88000f409e10 0xffffffff810b4a21 do_filp_open+0x214 (invalid, 0xffff880076003180, invalid, invalid)
0xffff88000f409f30 0xffffffff810a7e9a do_sys_open+0x53 (invalid, invalid, invalid, invalid)
0xffff88000f409f70 0xffffffff810a7f43 sys_open+0x1b (invalid, invalid, invalid)
   not matched: from 0xffffffff8100bfb2 to 0xffffffff8100c02a drop_through 0 bb_jmp[7]
bb_special_case: Invalid bb_reg_state.memory, missing trailing entries
bb_special_case: on transfer to int_with_check
   system_call_fastpath has memory parameters but no register parameters.
   Assuming it is a 'pass through' function that does not refer to its register
   parameters and setting 6 register parameters
kdb_bb: 0xffffffff8100bf3b [kernel]system_call_fastpath failed at 0xffffffff8100bfcd

Using old style backtrace, unreliable with no arguments
sp                ip                Function (args)
0xffff88000f409a70 0xffffffff8104bffe up+0xf
[3]more>
0xffff88000f409ab8 0xffffffff811c9265 assfail+0x1a
0xffff88000f409ae0 0xffffffff811c9265 assfail+0x1a
0xffff88000f409af0 0xffffffff811c3cd5 xfs_setup_inode+0x56
0xffff88000f409b20 0xffffffff8119d7a7 xfs_ialloc+0x53a
0xffff88000f409bb0 0xffffffff811b4952 xfs_dir_ialloc+0xa0
0xffff88000f409c70 0xffffffff811b8a87 xfs_create+0x325
0xffff88000f409d50 0xffffffff811c36eb xfs_vn_mknod+0x14f
0xffff88000f409dc0 0xffffffff811c37db xfs_vn_create+0xb
0xffff88000f409dd0 0xffffffff810b27fe vfs_create+0xdf
0xffff88000f409e10 0xffffffff810b4a21 do_filp_open+0x214
0xffff88000f409e40 0xffffffff810a3581 init_object+0x6e
0xffff88000f409ed0 0xffffffff8155fb01 _spin_unlock+0x26
0xffff88000f409f30 0xffffffff810a7e9a do_sys_open+0x53
0xffff88000f409f38 0xffffffff811f2fc9 selinux_file_free_security+0x1e
0xffff88000f409f70 0xffffffff810a7f43 sys_open+0x1b
[3]kdb>
[3]kdb> dmesg 20
<4>[ 4349.936786] XFS: correcting sb_features alignment problem
<5>[ 4349.947784] XFS mounting filesystem sda4
<7>[ 4350.022280] Ending clean XFS mount for filesystem: sda4
<5>[ 4351.670370] XFS mounting filesystem sda3
<7>[ 4351.795964] Ending clean XFS mount for filesystem: sda3
<5>[ 4353.889829] XFS mounting filesystem sda3
<7>[ 4354.016496] Ending clean XFS mount for filesystem: sda3
<5>[ 4356.163284] XFS mounting filesystem sda3
<7>[ 4356.283840] Ending clean XFS mount for filesystem: sda3
<4>[ 4357.884887] XFS: correcting sb_features alignment problem
<5>[ 4357.895876] XFS mounting filesystem sda4
<7>[ 4357.970481] Ending clean XFS mount for filesystem: sda4
<5>[ 4359.714421] XFS mounting filesystem sda3
<7>[ 4359.835486] Ending clean XFS mount for filesystem: sda3
<4>[ 4361.442033] XFS: correcting sb_features alignment problem
<5>[ 4361.453021] XFS mounting filesystem sda4
<7>[ 4361.527472] Ending clean XFS mount for filesystem: sda4
<4>[ 4460.233979] Assertion failed: atomic_read(&inode->i_count) == 1, file: fs/xfs/linux-2.6/xfs_iops.c, line: 783
<0>[ 4460.253826] ------------[ cut here ]------------
<2>[ 4460.254764] kernel BUG at fs/xfs/support/debug.c:81!
[3]kdb>
[3]kdb> xnode 0xffff88006e1fbc00
mount 0xffff88007748f3f0 vnode 0xffff88006e1fbea0
dev 800004 ino 134342909[2:1e8f:d]
blkno 0x399cb80 len 0x10 boffset 0x1d00
transp 0xffff88006c8fd880 &itemp 0xffff88002f4ee690
&lock 0xffff88006e1fbc88 &iolock 0xffff88006e1fbcf0 &flush 0xffff88006e1fbd58 (1) pincount 0x0
udquotp 0x0000000000000000 gdquotp 0x0000000000000000
new_size 0
flags 0x140 <truncated >
update_core 0 update size 0
gen 0x0 delayed blks 0size 0
  trace 0xffff88007449a000
  bmap_trace 0xffff88007449a060
  bmbt trace 0xffff88007449a0c0
  rw trace 0xffff88007449a120
  ilock trace 0xffff88007449a180
  dir trace 0xffff88007449a1e0

data fork
  bytes 0x0 real_bytes 0x0 lastex 0x0 u1:extents 0x0000000000000000
  broot 0x0000000000000000 broot_bytes 0x0 ext_max 9 flags 0x2 <extents >
  u2 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
attr fork empty
[3]more>

magic 0x494e mode 0100666 (r---rw-rw-rw-) version 0x2 format 0x2 (extents)
nlink 1 uid 0 gid 0 projid 0 flushiter 0
atime 1224743591:233524184 mtime 1224743591d:233524184 ctime 1224743591:233524184
size 0 nblocks 0 extsize 0x0 nextents 0x0 anextents 0x0
forkoff 0 aformat 0x2 (extents) dmevmask 0x0 dmstate 0x0 flags 0x0 <> gen 0x47f454b
--> itrace @ 0xffff88006e1fbc00/0xffff88007449a000
exit from xfs_iget.alloc i_count = 1
   cpu = 3 pid = 25318   ra = xfs_trans_iget+0x205
[3]kdb>
[3]kdb> inode 0xffff88006e1fbea0
struct inode at  0xffff88006e1fbea0
  i_ino = 134342909 i_count = 2 i_size 0
  i_mode = 00  i_nlink = 1  i_rdev = 0x0
  i_hash.nxt = 0x0000000000000000 i_hash.pprev = 0xffffc200002f9328
  i_list.nxt = 0xffff880069990a20 i_list.prv = 0xffffffff817c3810
  i_dentry.nxt = 0xffff88006e1fbe38 i_dentry.prv = 0xffff88006e1fbe38
  i_sb = 0xffff88007748b1b0 i_op = 0xffffffff81eeab80 i_data = 0xffff88006e1fc088 nrpages = 0
  i_fop= 0xffffffff81eeaaa0 i_flock = 0x0000000000000000 i_mapping = 0xffff88006e1fc088
  i_flags 0x0 i_state 0x88 [I_NEW I_LOCK]  fs specific info @ 0xffff88006e1fc288

* Re: assertion failure with latest xfs
  2008-10-23  9:08 assertion failure with latest xfs Lachlan McIlroy
@ 2008-10-23 17:31 ` Christoph Hellwig
  2008-10-23 22:21   ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2008-10-23 17:31 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: xfs-oss

On Thu, Oct 23, 2008 at 07:08:15PM +1000, Lachlan McIlroy wrote:
> Just encountered this after pulling in the latest changes.  We are trying to
> initialise an inode that should have an i_count of 1 but instead it is 2.  I
> was running XFSQA test 167 when it happened.

I think the assert is incorrect.  The inode has been added to the radix
tree in xfs_iget_cache_miss, and starting from that point an igrab can
kick in from the sync code and bump the refcount.

* Re: assertion failure with latest xfs
  2008-10-23 17:31 ` Christoph Hellwig
@ 2008-10-23 22:21   ` Dave Chinner
  2008-10-29  0:43     ` Lachlan McIlroy
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2008-10-23 22:21 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Lachlan McIlroy, xfs-oss

On Thu, Oct 23, 2008 at 01:31:49PM -0400, Christoph Hellwig wrote:
> On Thu, Oct 23, 2008 at 07:08:15PM +1000, Lachlan McIlroy wrote:
> > Just encountered this after pulling in the latest changes.  We are trying to
> > initialise an inode that should have an i_count of 1 but instead it is 2.  I
> > was running XFSQA test 167 when it happened.
> 
> I think the assert is incorrect.  The inode has been added to the radix
> tree in xfs_iget_cache_miss, and starting from that point an igrab can
> kick in from the sync code and bump the refcount.

Actually, it was put there for a reason. The generic code doesn't
allow new inodes to be found in the cache until the I_LOCK flag is
cleared. This is done by calling wait_on_inode() after a successful
lookup (which waits on I_LOCK) and unlock_new_inode() clears the
I_LOCK|I_NEW bits and wakes anyone who was waiting on that inode via
wake_up_inode().  So the assert was put there to catch potential
races in lookup where a second process does a successful igrab()
before the inode is fully initialised.
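
The gating described above can be modelled in userspace. This is only a
minimal sketch, not kernel code: the model_* names are hypothetical, and
the bit values are illustrative, chosen to agree with the
"i_state 0x88 [I_NEW I_LOCK]" line in the original report.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; they agree with the 0x88 seen in the kdb dump. */
enum { M_I_NEW = 0x08, M_I_LOCK = 0x80 };

struct model_inode {
	unsigned int i_state;
	int          i_count;
};

/* What a finder checks after a successful lookup: must it sleep? */
static bool model_must_wait(const struct model_inode *inode)
{
	return (inode->i_state & M_I_LOCK) != 0;
}

/*
 * Model of unlock_new_inode(): clear both bits. The real function also
 * wakes any waiters; this model has no sleepers, so that step is omitted.
 */
static void model_unlock_new_inode(struct model_inode *inode)
{
	assert(inode->i_state & M_I_NEW);
	inode->i_state &= ~(unsigned int)(M_I_NEW | M_I_LOCK);
}
```

A finder that sees model_must_wait() return true has to hold off using the
inode until model_unlock_new_inode() has run. The assert under discussion
is meant to catch a finder that took a reference without honouring that.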

I think the race is in dealing with cache hits and recycling an
XFS_IRECLAIMABLE inode. We set the XFS_INEW flag there under
the radix tree read lock, which means we can have parallel lookups
on the same inode that go:

	thread 1				thread 2
	test XFS_INEW
		-> not set
	test XFS_IRECLAIMABLE
		-> set
						test XFS_INEW
							-> not set
	set XFS_INEW
	clear XFS_IRECLAIMABLE
						test XFS_IRECLAIMABLE
							-> not set
	xfs_setup_inode()
		-> i_state = I_NEW|I_LOCK
						igrab(inode)
							-> I_CLEAR not set
							-> refcount = 2
		-> inode_add_to_lists
		-> assert(refcount == 1)
		.....
		-> clear XFS_INEW
		-> unlock_new_inode()
			-> clear I_NEW|I_LOCK

I thought I'd handled this race with the ordering of setting/clearing
XFS_INEW/XFS_IRECLAIMABLE. Clearly not. I'll add a comment to this
ordering because it is key to actually detecting the race condition
so we can handle it.
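
The interleaving in the diagram can be replayed deterministically in a
single thread. The sketch below is a userspace model under stated
assumptions: the M_* bit values and all names are hypothetical, and the
recycled inode is assumed to start with one reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for the model; the real flags live in XFS headers. */
enum { M_INEW = 1 << 0, M_IRECLAIMABLE = 1 << 1 };

struct model_ip {
	unsigned int flags;   /* stands in for the XFS inode flags */
	int          i_count; /* stands in for the VFS reference count */
};

struct race_result {
	int  count_at_assert; /* i_count thread 1 sees at its assert */
	bool t2_sees_inew;    /* can thread 2 detect the race after igrab()? */
};

static struct race_result replay_race(void)
{
	struct model_ip ip = { M_IRECLAIMABLE, 0 };
	struct race_result res;

	/* thread 1: XFS_INEW not set, XFS_IRECLAIMABLE set -> recycle path */
	bool t1_recycles = !(ip.flags & M_INEW) && (ip.flags & M_IRECLAIMABLE);
	/* thread 2: tests XFS_INEW before thread 1 has set it */
	bool t2_passed_inew_test = !(ip.flags & M_INEW);

	assert(t1_recycles);

	/* thread 1: set XFS_INEW first, then clear XFS_IRECLAIMABLE */
	ip.flags |= M_INEW;
	ip.flags &= ~(unsigned int)M_IRECLAIMABLE;
	ip.i_count = 1; /* model assumption: recycled inode holds one reference */

	/* thread 2: XFS_IRECLAIMABLE is now clear, so it takes the igrab() path */
	if (t2_passed_inew_test && !(ip.flags & M_IRECLAIMABLE))
		ip.i_count++;

	/* thread 1 reaches its assert; thread 2 can still observe XFS_INEW */
	res.t2_sees_inew = (ip.flags & M_INEW) != 0;
	res.count_at_assert = ip.i_count;
	return res;
}
```

In this model replay_race() reports a count of 2 at the assert, and thread 2
can still see the XFS_INEW stand-in after its igrab(), which is why the
set-before-clear ordering matters for detecting the race.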

Hmmmm - there's also another bug in xfs_iget_cache_hit() - we don't
drop the reference we got if we found an unlinked inode after the
igrab() (the ENOENT case). I'll fix that as well.
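
A minimal userspace sketch of that leak, assuming a simplified lookup that
always takes the igrab() path. The model_* names and the
drop_ref_on_enoent switch are hypothetical and exist only to contrast the
two behaviours.

```c
#include <assert.h>
#include <errno.h>

struct model_inode {
	int          i_count; /* stands in for the VFS reference count */
	unsigned int di_mode; /* 0 models an unlinked inode */
};

static void model_igrab(struct model_inode *ip)
{
	ip->i_count++;
}

static void model_iput(struct model_inode *ip)
{
	assert(ip->i_count > 0);
	ip->i_count--;
}

/*
 * Model of the cache-hit path after a successful igrab(). Returns a
 * positive errno, following the convention used in the patch.
 */
static int model_cache_hit(struct model_inode *ip, int create,
			   int drop_ref_on_enoent)
{
	model_igrab(ip);

	if (ip->di_mode == 0 && !create) {
		if (drop_ref_on_enoent)
			model_iput(ip); /* the fix: give the reference back */
		return ENOENT;
	}
	return 0;
}
```

With drop_ref_on_enoent set to 0 the caller gets ENOENT but the count stays
raised, which is the leak. With it set to 1 the count returns to its
starting value.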

Patch below that I'm currently running through xfsqa.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

XFS: Fix race when looking up reclaimable inodes

If we get a race looking up a reclaimable inode, we can
end up with the winner proceeding to use the inode before
it has been completely re-initialised. This is a Bad Thing.

Fix the race by checking whether we are still initialising the
inode once we have a reference to it, and if so wait for the
initialisation to complete before continuing.

While there, fix a leaked reference count in the same code
when encountering an unlinked inode and we are not doing a
lookup for a create operation.
---
 fs/xfs/linux-2.6/xfs_linux.h |    1 +
 fs/xfs/xfs_iget.c            |   32 ++++++++++++++++++++++----------
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_linux.h b/fs/xfs/linux-2.6/xfs_linux.h
index cc0f7b3..947dfa1 100644
--- a/fs/xfs/linux-2.6/xfs_linux.h
+++ b/fs/xfs/linux-2.6/xfs_linux.h
@@ -77,6 +77,7 @@
 #include <linux/spinlock.h>
 #include <linux/random.h>
 #include <linux/ctype.h>
+#include <linux/writeback.h>
 
 #include <asm/page.h>
 #include <asm/div64.h>
diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index 837cae7..bf4dc5e 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -52,7 +52,7 @@ xfs_iget_cache_hit(
 	int			lock_flags) __releases(pag->pag_ici_lock)
 {
 	struct xfs_mount	*mp = ip->i_mount;
-	int			error = 0;
+	int			error = EAGAIN;
 
 	/*
 	 * If INEW is set this inode is being set up
@@ -60,7 +60,6 @@ xfs_iget_cache_hit(
 	 * Pause and try again.
 	 */
 	if (xfs_iflags_test(ip, (XFS_INEW|XFS_IRECLAIM))) {
-		error = EAGAIN;
 		XFS_STATS_INC(xs_ig_frecycle);
 		goto out_error;
 	}
@@ -73,7 +72,6 @@ xfs_iget_cache_hit(
 		 * error immediately so we don't remove it from the reclaim
 		 * list and potentially leak the inode.
 		 */
-
 		if ((ip->i_d.di_mode == 0) && !(flags & XFS_IGET_CREATE)) {
 			error = ENOENT;
 			goto out_error;
@@ -91,27 +89,42 @@ xfs_iget_cache_hit(
 			error = ENOMEM;
 			goto out_error;
 		}
+
+		/*
+		 * We must set the XFS_INEW flag before clearing the
+		 * XFS_IRECLAIMABLE flag so that if a racing lookup does
+		 * not find the XFS_IRECLAIMABLE above but has the igrab()
+		 * below succeed we can safely check XFS_INEW to detect
+		 * that this inode is still being initialised.
+		 */
 		xfs_iflags_set(ip, XFS_INEW);
 		xfs_iflags_clear(ip, XFS_IRECLAIMABLE);
 
 		/* clear the radix tree reclaim flag as well. */
 		__xfs_inode_clear_reclaim_tag(mp, pag, ip);
-		read_unlock(&pag->pag_ici_lock);
 	} else if (!igrab(VFS_I(ip))) {
 		/* If the VFS inode is being torn down, pause and try again. */
-		error = EAGAIN;
 		XFS_STATS_INC(xs_ig_frecycle);
 		goto out_error;
-	} else {
-		/* we've got a live one */
-		read_unlock(&pag->pag_ici_lock);
+	} else if (xfs_iflags_test(ip, XFS_INEW)) {
+		/*
+		 * We are racing with another cache hit that is
+		 * currently recycling this inode out of the XFS_IRECLAIMABLE
+		 * state. Wait for the initialisation to complete before
+		 * continuing.
+		 */
+		wait_on_inode(VFS_I(ip));
 	}
 
 	if (ip->i_d.di_mode == 0 && !(flags & XFS_IGET_CREATE)) {
 		error = ENOENT;
-		goto out;
+		iput(VFS_I(ip));
+		goto out_error;
 	}
 
+	/* We've got a live one. */
+	read_unlock(&pag->pag_ici_lock);
+
 	if (lock_flags != 0)
 		xfs_ilock(ip, lock_flags);
 
@@ -122,7 +135,6 @@ xfs_iget_cache_hit(
 
 out_error:
 	read_unlock(&pag->pag_ici_lock);
-out:
 	return error;
 }
 

* Re: assertion failure with latest xfs
  2008-10-23 22:21   ` Dave Chinner
@ 2008-10-29  0:43     ` Lachlan McIlroy
  2008-10-29  3:29       ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Lachlan McIlroy @ 2008-10-29  0:43 UTC (permalink / raw)
  To: Christoph Hellwig, Lachlan McIlroy, xfs-oss

Dave Chinner wrote:
> On Thu, Oct 23, 2008 at 01:31:49PM -0400, Christoph Hellwig wrote:
>> On Thu, Oct 23, 2008 at 07:08:15PM +1000, Lachlan McIlroy wrote:
>>> Just encountered this after pulling in the latest changes.  We are trying to
>>> initialise an inode that should have an i_count of 1 but instead it is 2.  I
>>> was running XFSQA test 167 when it happened.
>> I think the assert is incorrect.  The inode has been added to the radix
>> tree in xfs_iget_cache_miss, and starting from that point an igrab can
>> kick in from the sync code and bump the refcount.
> 
> Actually, it was put there for a reason. The generic code doesn't
> allow new inodes to be found in the cache until the I_LOCK flag is
> cleared. This is done by calling wait_on_inode() after a successful
> lookup (which waits on I_LOCK) and unlock_new_inode() clears the
> I_LOCK|I_NEW bits and wakes anyone who was waiting on that inode via
> wake_up_inode().  So the assert was put there to catch potential
> races in lookup where a second process does a successful igrab()
> before the inode is fully initialised.
> 
> I think the race is in dealing with cache hits and recycling an
> XFS_IRECLAIMABLE inode. We set the XFS_INEW flag there under
> the radix tree read lock, which means we can have parallel lookups
> on the same inode that go:
> 
> 	thread 1				thread 2
> 	test XFS_INEW
> 		-> not set
> 	test XFS_IRECLAIMABLE
> 		-> set
> 						test XFS_INEW
> 							-> not set
> 	set XFS_INEW
> 	clear XFS_IRECLAIMABLE
> 						test XFS_IRECLAIMABLE
> 							-> not set
> 	xfs_setup_inode()
> 		-> i_state = I_NEW|I_LOCK
> 						igrab(inode)
> 							-> I_CLEAR not set
> 							-> refcount = 2
> 		-> inode_add_to_lists
> 		-> assert(refcount == 1)
> 		.....
> 		-> clear XFS_INEW
> 		-> unlock_new_inode()
> 			-> clear I_NEW|I_LOCK
> 
> I thought I'd handled this race with the ordering of setting/clearing
> XFS_INEW/XFS_IRECLAIMABLE. Clearly not. I'll add a comment to this
> ordering because it is key to actually detecting the race condition
> so we can handle it.
> 
> Hmmmm - there's also another bug in xfs_iget_cache_hit() - we don't
> drop the reference we got if we found an unlinked inode after the
> igrab() (the ENOENT case). I'll fix that as well.
> 
> Patch below that I'm currently running through xfsqa.

I gave this patch a go and it still asserted at the same place running
the same test.

* Re: assertion failure with latest xfs
  2008-10-29  0:43     ` Lachlan McIlroy
@ 2008-10-29  3:29       ` Dave Chinner
  2008-10-30  2:29         ` Lachlan McIlroy
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2008-10-29  3:29 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: Christoph Hellwig, xfs-oss

On Wed, Oct 29, 2008 at 11:43:31AM +1100, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> Hmmmm - there's also another bug in xfs_iget_cache_hit() - we don't
>> drop the reference we got if we found an unlinked inode after the
>> igrab() (the ENOENT case). I'll fix that as well.
>>
>> Patch below that I'm currently running through xfsqa.
>
> I gave this patch a go and it still asserted at the same place running
> the same test.

Can you put more inode trace points in so that we can see where the
extra reference is coming from?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: assertion failure with latest xfs
  2008-10-29  3:29       ` Dave Chinner
@ 2008-10-30  2:29         ` Lachlan McIlroy
  2008-10-30  5:38           ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Lachlan McIlroy @ 2008-10-30  2:29 UTC (permalink / raw)
  To: Lachlan McIlroy, Christoph Hellwig, xfs-oss

Dave Chinner wrote:
> On Wed, Oct 29, 2008 at 11:43:31AM +1100, Lachlan McIlroy wrote:
>> Dave Chinner wrote:
>>> Hmmmm - there's also another bug in xfs_iget_cache_hit() - we don't
>>> drop the reference we got if we found an unlinked inode after the
>>> igrab() (the ENOENT case). I'll fix that as well.
>>>
>>> Patch below that I'm currently running through xfsqa.
>> I gave this patch a go and it still asserted at the same place running
>> the same test.
> 
> Can you put more inode trace points in so that we can see where the
> extra reference is coming from?

xfs_sync_inodes_ag() found the inode before it was completely
initialised.

--> itrace @ 0xffff880078d67800/0xffff880073563e40
ref @fs/xfs/xfs_inode.c:863(xfs_inode_alloc+0x205) i_count = 1
   cpu = 2 pid = 9938   ra = xfs_iread+0x29
exit from xfs_iget.alloc i_count = 1
   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
ref @fs/xfs/xfs_iget.c:218(xfs_iget+0x585) i_count = 1
   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
ref @fs/xfs/xfs_iget.c:305(xfs_iget+0x643) i_count = 1
   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
ref @fs/xfs/linux-2.6/xfs_sync.c:113(xfs_sync_inodes_ag+0x118) i_count = 1
   cpu = 3 pid = 9953   ra = xfs_sync_inodes+0x68
ref @fs/xfs/linux-2.6/xfs_iops.c:780(xfs_setup_inode+0x2c) i_count = 2
   cpu = 2 pid = 9938   ra = xfs_ialloc+0x5d8
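
One way to read the trace mechanically: find the last record logged before
i_count first rises. The sketch below is a hypothetical helper, not an
existing tool. The pid and i_count pairs are copied from the trace above,
and it assumes each trace point logs the count before any reference taken
at that call site.

```c
#include <assert.h>
#include <stddef.h>

struct trace_rec {
	int pid;
	int i_count;
};

/* pid / i_count pairs copied from the itrace output, in order. */
static const struct trace_rec sample[] = {
	{ 9938, 1 }, /* xfs_inode_alloc */
	{ 9938, 1 }, /* exit from xfs_iget.alloc */
	{ 9938, 1 }, /* xfs_iget, xfs_iget.c:218 */
	{ 9938, 1 }, /* xfs_iget, xfs_iget.c:305 */
	{ 9953, 1 }, /* xfs_sync_inodes_ag */
	{ 9938, 2 }, /* xfs_setup_inode */
};

/*
 * Return the index of the last record logged before i_count first
 * increases, or -1 if the count never increases.
 */
static int last_before_jump(const struct trace_rec *t, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (t[i].i_count > t[i - 1].i_count)
			return (int)(i - 1);
	return -1;
}
```

Applied to the sample, the helper picks out the xfs_sync_inodes_ag record
from pid 9953, the only entry that does not come from the creating
process.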

* Re: assertion failure with latest xfs
  2008-10-30  2:29         ` Lachlan McIlroy
@ 2008-10-30  5:38           ` Dave Chinner
  2008-10-31  1:09             ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2008-10-30  5:38 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: Christoph Hellwig, xfs-oss

On Thu, Oct 30, 2008 at 01:29:16PM +1100, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> On Wed, Oct 29, 2008 at 11:43:31AM +1100, Lachlan McIlroy wrote:
>>> Dave Chinner wrote:
>>>> Hmmmm - there's also another bug in xfs_iget_cache_hit() - we don't
>>>> drop the reference we got if we found an unlinked inode after the
>>>> igrab() (the ENOENT case). I'll fix that as well.
>>>>
>>>> Patch below that I'm currently running through xfsqa.
>>> I gave this patch a go and it still asserted at the same place running
>>> the same test.
>>
>> Can you put more inode trace points in so that we can see where the
>> extra reference is coming from?
>
> xfs_sync_inodes_ag() found the inode before it was completely
> initialised.
>
> --> itrace @ 0xffff880078d67800/0xffff880073563e40
> ref @fs/xfs/xfs_inode.c:863(xfs_inode_alloc+0x205) i_count = 1
>   cpu = 2 pid = 9938   ra = xfs_iread+0x29
> exit from xfs_iget.alloc i_count = 1
>   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
> ref @fs/xfs/xfs_iget.c:218(xfs_iget+0x585) i_count = 1
>   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
> ref @fs/xfs/xfs_iget.c:305(xfs_iget+0x643) i_count = 1
>   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
> ref @fs/xfs/linux-2.6/xfs_sync.c:113(xfs_sync_inodes_ag+0x118) i_count = 1
>   cpu = 3 pid = 9953   ra = xfs_sync_inodes+0x68
> ref @fs/xfs/linux-2.6/xfs_iops.c:780(xfs_setup_inode+0x2c) i_count = 2
>   cpu = 2 pid = 9938   ra = xfs_ialloc+0x5d8

Ah - ok, that makes sense now. That should be trivial to fix up;
we just need to avoid XFS_INEW() inodes in xfs_sync_inodes_ag()
and probably also in xfs_qm_dqrele_all(), and that will mean
the assert needs to be removed as well.
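
The idea of skipping inodes that are still being set up can be sketched in
userspace as follows. This is a model under assumptions, not the eventual
patch: M_INEW, struct model_ip and model_sync_walk() are hypothetical
names, and a flat array stands in for the per-AG radix tree.

```c
#include <assert.h>
#include <stddef.h>

enum { M_INEW = 1 << 0 }; /* hypothetical stand-in for XFS_INEW */

struct model_ip {
	unsigned int flags;
	int          i_count;
};

/*
 * Walk every inode and take a reference on each one that is not still
 * being initialised. Returns how many inodes were grabbed. A real walker
 * would do its work and then drop each reference again.
 */
static size_t model_sync_walk(struct model_ip *inodes, size_t n)
{
	size_t grabbed = 0;

	for (size_t i = 0; i < n; i++) {
		if (inodes[i].flags & M_INEW)
			continue; /* leave half-initialised inodes alone */
		inodes[i].i_count++;
		grabbed++;
	}
	return grabbed;
}
```

In this model the walker never touches the reference count of an inode that
is flagged as new, so the creating thread keeps the only reference until
setup finishes.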

Patch soon.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: assertion failure with latest xfs
  2008-10-30  5:38           ` Dave Chinner
@ 2008-10-31  1:09             ` Dave Chinner
  0 siblings, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2008-10-31  1:09 UTC (permalink / raw)
  To: Lachlan McIlroy, Christoph Hellwig, xfs-oss

On Thu, Oct 30, 2008 at 04:38:33PM +1100, Dave Chinner wrote:
> On Thu, Oct 30, 2008 at 01:29:16PM +1100, Lachlan McIlroy wrote:
> > xfs_sync_inodes_ag() found the inode before it was completely
> > initialised.
> >
> > --> itrace @ 0xffff880078d67800/0xffff880073563e40
> > ref @fs/xfs/xfs_inode.c:863(xfs_inode_alloc+0x205) i_count = 1
> >   cpu = 2 pid = 9938   ra = xfs_iread+0x29
> > exit from xfs_iget.alloc i_count = 1
> >   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
> > ref @fs/xfs/xfs_iget.c:218(xfs_iget+0x585) i_count = 1
> >   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
> > ref @fs/xfs/xfs_iget.c:305(xfs_iget+0x643) i_count = 1
> >   cpu = 2 pid = 9938   ra = xfs_trans_iget+0x205
> > ref @fs/xfs/linux-2.6/xfs_sync.c:113(xfs_sync_inodes_ag+0x118) i_count = 1
> >   cpu = 3 pid = 9953   ra = xfs_sync_inodes+0x68
> > ref @fs/xfs/linux-2.6/xfs_iops.c:780(xfs_setup_inode+0x2c) i_count = 2
> >   cpu = 2 pid = 9938   ra = xfs_ialloc+0x5d8
> 
> Ah - ok, that makes sense now. That should be trivial to fix up;
> we just need to avoid XFS_INEW() inodes in xfs_sync_inodes_ag()
> and probably also in xfs_qm_dqrele_all(), and that will mean
> the assert needs to be removed as well.
> 
> Patch soon.

I noticed that the radix tree walk didn't get fixed in
xfs_qm_syscalls.c for the last round of bug fixes. The fix for
this really needs to have that as well. I'll post a series in
a minute....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
