public inbox for linux-xfs@vger.kernel.org
* xfstest 179 ASSERT
@ 2012-10-31 18:23 Mark Tinguely
  2012-11-01  1:49 ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Tinguely @ 2012-10-31 18:23 UTC (permalink / raw)
  To: Dave Chinner, xfs-oss

OSS sources with the xfs: fix buffer shudown reference count mismatch
patch and xfstest 179.

xfstest 179 started to have various asserts starting with the "move the
workers" series, but mostly the b_hold count is zero assert.

Now that the b_hold count is fixed, the assert is:

XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0,
file: /root/xfs/fs/xfs/xfs_mount.c, line: 273

PID: 6741   TASK: ffff880268f46540  CPU: 0   COMMAND: "umount"
  #0 [ffff88034bba1a70] machine_kexec at ffffffff8102a71d
  #1 [ffff88034bba1ae0] crash_kexec at ffffffff810a4703
  #2 [ffff88034bba1bb0] oops_end at ffffffff81432098
  #3 [ffff88034bba1be0] die at ffffffff81005763
  #4 [ffff88034bba1c10] do_trap at ffffffff814319b3
  #5 [ffff88034bba1c70] do_invalid_op at ffffffff81002f50
  #6 [ffff88034bba1d10] invalid_op at ffffffff8143a09e
     [exception RIP: assfail+29]
     RIP: ffffffffa038e7dd  RSP: ffff88034bba1dc8  RFLAGS: 00010292
     RAX: 0000000000000065  RBX: 0000000000000000  RCX: 0000000000000de5
     RDX: 0000000000002013  RSI: 0000000000000092  RDI: 0000000000000246
     RBP: ffff88034bba1dc8   R8: 0000000000000000   R9: 0000000000000000
     R10: 0000000000000352  R11: 0000000000000351  R12: ffff88034d8af1f8
     R13: ffff88034d8af000  R14: ffff88034d8af1e8  R15: ffff88034fbf28c0
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  #7 [ffff88034bba1dd0] xfs_free_perag at ffffffffa03de5d8 [xfs]
  #8 [ffff88034bba1e10] xfs_unmountfs at ffffffffa03e08bc [xfs]
  #9 [ffff88034bba1e60] xfs_fs_put_super at ffffffffa0390e80 [xfs]
#10 [ffff88034bba1e80] generic_shutdown_super at ffffffff8114f37d
#11 [ffff88034bba1eb0] kill_block_super at ffffffff8114f42b
#12 [ffff88034bba1ed0] deactivate_locked_super at ffffffff8114f9e7
#13 [ffff88034bba1ef0] deactivate_super at ffffffff811502d9
#14 [ffff88034bba1f10] mntput_no_expire at ffffffff8116a821
#15 [ffff88034bba1f40] sys_umount at ffffffff8116bd21
#16 [ffff88034bba1f80] system_call_fastpath at ffffffff814390a9
     RIP: 00007f64a30cf5d7  RSP: 00007ffffe2f3798  RFLAGS: 00010202
     RAX: 00000000000000a6  RBX: ffffffff814390a9  RCX: 000000000000c0c8
     RDX: 0000000000000000  RSI: 0000000000000000  RDI: 00007f64a3c0e0f0
     RBP: 00007f64a3c0dff0   R8: 0000000000000000   R9: 00007f64a3a05435
     R10: 00007ffffe2f35c0  R11: 0000000000000246  R12: 00007f64a3c0e0f0
     R13: 00007f64a3c0e0d0  R14: 00007f64a3c0e0f0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a6  CS: 0033  SS: 002b


mount perag information:
   m_perag_tree = {
     height = 0x1,
     gfp_mask = 0x20,
     rnode = 0xffff8803124bb4b1
   },
crash> radix_tree_node ffff8803124bb4b0
struct radix_tree_node {
   height = 0x1,
   count = 0x3,
   {
     parent = 0x0,
     callback_head = {
       next = 0x0,
       func = 0xffffffff812384b0 <radix_tree_node_rcu_free>
     }
   },
   slots = {0x0, 0xffff88034fd6f540, 0xffff88034fd6f840, 0xffff88034fd6f240, 0x0,
  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
   tags = {{0x0}, {0x0}, {0x0}}
}

The inodes at 0xffff88034fd6f540, 0xffff88034fd6f840, 0xffff88034fd6f240
don't look valid.

I haven't spent much time on this.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: xfstest 179 ASSERT
  2012-10-31 18:23 xfstest 179 ASSERT Mark Tinguely
@ 2012-11-01  1:49 ` Dave Chinner
  2012-11-01 13:08   ` Mark Tinguely
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2012-11-01  1:49 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs-oss

On Wed, Oct 31, 2012 at 01:23:54PM -0500, Mark Tinguely wrote:
> OSS sources with the xfs: fix buffer shudown reference count mismatch
> patch and xfstest 179.
> 
> xfstest 179 started to have various asserts starting with the "move the
> workers" series, but mostly the b_hold count is zero assert.
> 
> Now that the b_hold count is fixed, the assert is:
> 
> XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0,
> file: /root/xfs/fs/xfs/xfs_mount.c, line: 273

The only way you are going to track this down is through tracing
perag gets and puts, and finding the object that is missing a put.
There are already tracepoints for these and they tell you the caller
function, so that should be sufficient to find what function has a
get without a put...
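
The get/put bookkeeping described above can be sketched as a small script
over captured trace output. This is only an illustration: it assumes each
event line looks roughly like "xfs_perag_get: dev 8:16 agno 1 refcount 2
caller some_function", which may not match your kernel's exact trace format,
and the sample lines and caller names (example_lookup and so on) are made up.

```python
import re
from collections import Counter

# Assumed event shape (may differ between kernel versions):
#   ... xfs_perag_get: dev 8:16 agno 1 refcount 2 caller some_function
TRACE_RE = re.compile(
    r"xfs_perag_(?P<op>get|put):.*?agno (?P<agno>\d+) "
    r"refcount (?P<ref>-?\d+) caller (?P<caller>\S+)"
)


def perag_balance(lines):
    """Tally perag gets and puts per AG and per calling function.

    Returns (net, by_caller): net[agno] is gets minus puts for that AG,
    by_caller[(agno, op, caller)] counts events from each caller.
    """
    net = Counter()
    by_caller = Counter()
    for line in lines:
        m = TRACE_RE.search(line)
        if m is None:
            continue  # not a perag get/put event
        agno = int(m.group("agno"))
        net[agno] += 1 if m.group("op") == "get" else -1
        by_caller[(agno, m.group("op"), m.group("caller"))] += 1
    return net, by_caller


# Hypothetical sample input, not real trace output.
sample = [
    "umount-1 [000] 1.0: xfs_perag_get: dev 8:16 agno 1 refcount 2 caller example_lookup",
    "umount-1 [000] 1.1: xfs_perag_put: dev 8:16 agno 1 refcount 1 caller example_release",
    "umount-1 [000] 1.2: xfs_perag_get: dev 8:16 agno 2 refcount 2 caller example_leaky",
]

net, by_caller = perag_balance(sample)
for agno, count in sorted(net.items()):
    if count != 0:
        print(f"agno {agno}: {count} more get(s) than put(s)")
        for (ag, op, caller), n in sorted(by_caller.items()):
            if ag == agno:
                print(f"  {op} x{n} from {caller}")
```

An AG with a non-zero net count is the one to look at; the per-caller
listing then narrows down which get has no matching put.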

The other possibility is that there are buffers that have not been
freed properly (or have leaked), but IIRC this failure predates
attaching the perag to buffers....

> mount perag information:
>   m_perag_tree = {
>     height = 0x1,
>     gfp_mask = 0x20,
>     rnode = 0xffff8803124bb4b1
>   },
> crash> radix_tree_node ffff8803124bb4b0
> struct radix_tree_node {
>   height = 0x1,
>   count = 0x3,
>   {
>     parent = 0x0,
>     callback_head = {
>       next = 0x0,
>       func = 0xffffffff812384b0 <radix_tree_node_rcu_free>
>     }
>   },
>   slots = {0x0, 0xffff88034fd6f540, 0xffff88034fd6f840,
> 0xffff88034fd6f240, 0x0,
>  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0,
>  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0,
>  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0,
>  0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
>   tags = {{0x0}, {0x0}, {0x0}}
> }
> 
> The inodes at 0xffff88034fd6f540, 0xffff88034fd6f840, 0xffff88034fd6f240
> don't look valid.

Those aren't inodes, based on the addresses: they are all in the
same page and spaced 0x300 bytes apart, so each object is at most
0x300 bytes (768 bytes). An XFS inode is larger than this, but a radix
tree node is smaller (560 bytes and fits 7 to a page on my x86_64
machines), so I'm not sure what you have there. How big does
/proc/slabinfo tell you a radix tree node is?
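
The address arithmetic behind that estimate can be checked with a few lines.
This is a sketch that assumes 4 KiB pages; the helper names same_page and
min_stride are made up for the example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86_64

# The three non-NULL slot pointers from the radix tree node dump.
slots = [0xffff88034fd6f540, 0xffff88034fd6f840, 0xffff88034fd6f240]


def same_page(addrs, page_size=PAGE_SIZE):
    """True if every address falls within the same page."""
    return len({a // page_size for a in addrs}) == 1


def min_stride(addrs):
    """Smallest gap between neighbouring addresses once sorted."""
    ordered = sorted(addrs)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


print(same_page(slots))        # True: all three share one page
print(hex(min_stride(slots)))  # 0x300, i.e. 768 bytes between objects
print(PAGE_SIZE // 560)        # 7 objects of 560 bytes fit in one page
```

Objects packed 0x300 bytes apart cannot be larger than 0x300 bytes each,
which is the upper bound used above.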

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfstest 179 ASSERT
  2012-11-01  1:49 ` Dave Chinner
@ 2012-11-01 13:08   ` Mark Tinguely
  0 siblings, 0 replies; 3+ messages in thread
From: Mark Tinguely @ 2012-11-01 13:08 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs-oss

On 10/31/12 20:49, Dave Chinner wrote:
> On Wed, Oct 31, 2012 at 01:23:54PM -0500, Mark Tinguely wrote:
>> OSS sources with the xfs: fix buffer shudown reference count mismatch
>> patch and xfstest 179.
>>
>> xfstest 179 started to have various asserts starting with the "move the
>> workers" series, but mostly the b_hold count is zero assert.
>>
>> Now that the b_hold count is fixed, the assert is:
>>
>> XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0,
>> file: /root/xfs/fs/xfs/xfs_mount.c, line: 273
>
> The only way you are going to track this down is through tracing
> perag gets and puts, and finding the object that is missing a put.
> There are already tracepoints for these and they tell you the caller
> function, so that should be sufficient to find what function has a
> get without a put...
>
> The other possibility is that there are buffers that have not been
> freed properly (or have leaked), but IIRC this failure predates
> attaching the perag to buffers....
>
>> mount perag information:
>>    m_perag_tree = {
>>      height = 0x1,
>>      gfp_mask = 0x20,
>>      rnode = 0xffff8803124bb4b1
>>    },
>> crash>  radix_tree_node ffff8803124bb4b0
>> struct radix_tree_node {
>>    height = 0x1,
>>    count = 0x3,
>>    {
>>      parent = 0x0,
>>      callback_head = {
>>        next = 0x0,
>>        func = 0xffffffff812384b0<radix_tree_node_rcu_free>
>>      }
>>    },
>>    slots = {0x0, 0xffff88034fd6f540, 0xffff88034fd6f840,
>> 0xffff88034fd6f240, 0x0,
>>   0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
>> 0x0, 0x0, 0x0,
>>   0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
>> 0x0, 0x0, 0x0,
>>   0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
>> 0x0, 0x0, 0x0,
>>   0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
>>    tags = {{0x0}, {0x0}, {0x0}}
>> }
>>
>> The inodes at 0xffff88034fd6f540, 0xffff88034fd6f840, 0xffff88034fd6f240
>> don't look valid.
>
> Those aren't inodes, based on the addresses: they are all in the
> same page and spaced 0x300 bytes apart, so each object is at most
> 0x300 bytes (768 bytes). An XFS inode is larger than this, but a radix
> tree node is smaller (560 bytes and fits 7 to a page on my x86_64
> machines), so I'm not sure what you have there. How big does
> /proc/slabinfo tell you a radix tree node is?
>
> Cheers,
>
> Dave.


I have been spending too much time in earlier versions of XFS.

Last night, I had a good laugh at my mistake when I realized the perag
was not in an array...duh me, this is the radix tree of the perag
structures. I could not run the crash "kmem" command, which would have
given me a clue.

I *think* Ben is also seeing this assert on his 32-bit test machine.
Thanks for the advice, I will do some investigation this weekend.

--Mark.
