public inbox for linux-xfs@vger.kernel.org
* BUG: task blocked on waiter in xfs_trans_dqlockedjoin()
@ 2012-04-16 13:34 Alex Elder
  2012-04-16 20:41 ` Alex Elder
  2012-04-17  0:09 ` Dave Chinner
  0 siblings, 2 replies; 3+ messages in thread
From: Alex Elder @ 2012-04-16 13:34 UTC
  To: xfs

I am getting the following warning while running xfstests.  I haven't
started looking at it closely yet, but I wanted to report it so others
could have a look.  The XFS code in use was at commit c922bbc819,
with ad637a10f4 cherry-picked on top of it.  The tests all passed,
so now I'm going to have to narrow down which test produces the failure
(it was not near the first, nor the last test...).

Here is the source of the warning:
         DEBUG_LOCKS_WARN_ON(ti->task->blocked_on != waiter);

In this function:
void mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,
                          struct thread_info *ti)
{
         DEBUG_LOCKS_WARN_ON(list_empty(&waiter->list));
         DEBUG_LOCKS_WARN_ON(waiter->task != ti->task);
         DEBUG_LOCKS_WARN_ON(ti->task->blocked_on != waiter);
         ti->task->blocked_on = NULL;

         list_del_init(&waiter->list);
         waiter->task = NULL;
}

called (eventually) via xfs_dqlock2() in this function:

STATIC void
xfs_trans_dqlockedjoin(
         xfs_trans_t     *tp,
         xfs_dqtrx_t     *q)
{
         ASSERT(q[0].qt_dquot != NULL);
         if (q[1].qt_dquot == NULL) {
                 xfs_dqlock(q[0].qt_dquot);
                 xfs_trans_dqjoin(tp, q[0].qt_dquot);
         } else {
                 ASSERT(XFS_QM_TRANS_MAXDQS == 2);
                 xfs_dqlock2(q[0].qt_dquot, q[1].qt_dquot);
                 xfs_trans_dqjoin(tp, q[0].qt_dquot);
                 xfs_trans_dqjoin(tp, q[1].qt_dquot);
         }
}


					-Alex

------------[ cut here ]------------
WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/kernel/mutex-debug.c:65 mutex_remove_waiter+0x93/0x130()
Hardware name: PowerEdge R410
Modules linked in: rbd libceph aesni_intel cryptd aes_x86_64 aes_generic 
ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs 
ipmi_devintf ipmi_si ipmi_msghandler i7core_edac edac_core joydev hed 
serio_raw dcdbas lp parport usbhid hid mptsas ixgbe mptscsih mptbase dca 
mdio scsi_transport_sas bnx2 btrfs zlib_deflate crc32c libcrc32c [last 
unloaded: libceph]
Pid: 5947, comm: kworker/0:4 Not tainted 3.3.0-ceph-00067-gafede88 #1
Call Trace:
  [<ffffffff8104e42f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff8104e48a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff810a59f3>] mutex_remove_waiter+0x93/0x130
  [<ffffffff8161309c>] __mutex_lock_common+0x24c/0x3d0
  [<ffffffffa0293a4f>] ? xfs_trans_dqlockedjoin+0x5f/0x70 [xfs]
  [<ffffffffa0293a4f>] ? xfs_trans_dqlockedjoin+0x5f/0x70 [xfs]
  [<ffffffff81613347>] mutex_lock_nested+0x37/0x50
  [<ffffffffa0293a4f>] xfs_trans_dqlockedjoin+0x5f/0x70 [xfs]
  [<ffffffffa0293aa5>] xfs_trans_apply_dquot_deltas+0x45/0x290 [xfs]
  [<ffffffffa0287257>] xfs_trans_commit+0x67/0x270 [xfs]
  [<ffffffffa023e52b>] xfs_iomap_write_allocate+0x16b/0x380 [xfs]
  [<ffffffffa0232575>] xfs_map_blocks+0x165/0x240 [xfs]
  [<ffffffffa02327ef>] xfs_vm_writepage+0x19f/0x500 [xfs]
  [<ffffffff8112990a>] __writepage+0x1a/0x50
  [<ffffffff8112b66e>] write_cache_pages+0x24e/0x4e0
  [<ffffffff8130e01f>] ? cfq_dispatch_requests+0x18f/0xc10
  [<ffffffff811298f0>] ? set_page_dirty+0x70/0x70
  [<ffffffff8112b954>] generic_writepages+0x54/0x80
  [<ffffffffa02311bc>] xfs_vm_writepages+0x5c/0x80 [xfs]
  [<ffffffff8112b9a4>] do_writepages+0x24/0x40
  [<ffffffff81120b4b>] __filemap_fdatawrite_range+0x5b/0x60
  [<ffffffff81120e23>] filemap_fdatawrite_range+0x13/0x20
  [<ffffffffa023a818>] xfs_flush_pages+0x78/0xc0 [xfs]
  [<ffffffffa0245416>] xfs_sync_inode_data+0x86/0xb0 [xfs]
  [<ffffffffa0245a7e>] xfs_inode_ag_walk+0x20e/0x370 [xfs]
  [<ffffffffa0245b5b>] ? xfs_inode_ag_walk+0x2eb/0x370 [xfs]
  [<ffffffffa0245390>] ? xfs_sync_worker+0x90/0x90 [xfs]
  [<ffffffffa0245390>] ? xfs_sync_worker+0x90/0x90 [xfs]
  [<ffffffffa0245c25>] xfs_inode_ag_iterator+0x45/0xa0 [xfs]
  [<ffffffffa0245ca9>] xfs_sync_data+0x29/0x50 [xfs]
  [<ffffffffa0245cf2>] xfs_flush_worker+0x22/0x40 [xfs]
  [<ffffffff8106b256>] process_one_work+0x1a6/0x520
  [<ffffffff8106b1e7>] ? process_one_work+0x137/0x520
  [<ffffffffa0245cd0>] ? xfs_sync_data+0x50/0x50 [xfs]
  [<ffffffff8106d71e>] worker_thread+0x2fe/0x400
  [<ffffffff8106d420>] ? manage_workers+0x210/0x210
  [<ffffffff8107280e>] kthread+0xbe/0xd0
  [<ffffffff8161f5f4>] kernel_thread_helper+0x4/0x10
  [<ffffffff81616134>] ? retint_restore_args+0x13/0x13
  [<ffffffff81072750>] ? __init_kthread_worker+0x70/0x70
  [<ffffffff8161f5f0>] ? gs_change+0x13/0x13
---[ end trace d27597ce9ccb8690 ]---

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: BUG: task blocked on waiter in xfs_trans_dqlockedjoin()
  2012-04-16 13:34 BUG: task blocked on waiter in xfs_trans_dqlockedjoin() Alex Elder
@ 2012-04-16 20:41 ` Alex Elder
  2012-04-17  0:09 ` Dave Chinner
  1 sibling, 0 replies; 3+ messages in thread
From: Alex Elder @ 2012-04-16 20:41 UTC
  To: xfs

On 04/16/2012 08:34 AM, Alex Elder wrote:
> I am getting the following warning while running xfstests. I haven't
> started looking at it closely yet, but I wanted to report it so others
> could have a look. The XFS code in use was at commit c922bbc819,
> with ad637a10f4 cherry-picked on top of it. The tests all passed,
> so now I'm going to have to narrow down which test produces the failure
> (it was not near the first, nor the last test...).

I have reproduced the problem running only test 232:
# Run fsstress with quotas enabled and verify accounted quotas in the end


There is very little to this test (other than running fsstress).
It's basically:
   quotacheck -u -g $SCRATCH_MNT 2>/dev/null
   quotaon -u -g $SCRATCH_MNT 2>/dev/null
   _check_quota_usage
   _fsstress

Now I'll look a little closer at the XFS path, and will try to
grab some extra information while reproducing it.

					-Alex


* Re: BUG: task blocked on waiter in xfs_trans_dqlockedjoin()
  2012-04-16 13:34 BUG: task blocked on waiter in xfs_trans_dqlockedjoin() Alex Elder
  2012-04-16 20:41 ` Alex Elder
@ 2012-04-17  0:09 ` Dave Chinner
  1 sibling, 0 replies; 3+ messages in thread
From: Dave Chinner @ 2012-04-17  0:09 UTC
  To: Alex Elder; +Cc: xfs

On Mon, Apr 16, 2012 at 08:34:21AM -0500, Alex Elder wrote:
> I am getting the following warning while running xfstests.  I haven't
> started looking at it closely yet, but I wanted to report it so others
> could have a look.  The XFS code in use was at commit c922bbc819,

Which does not modify locking at all.

> with ad637a10f4 cherry-picked on top of it.  The tests all passed,

And that modifies the way we do inode reclaim synchronisation by
ILOCK rather than by IOLOCK|ILOCK. Neither of these touches the
dquot locking at all, so I'm having trouble understanding why these
commits would cause a problem with a dquot mutex....

> so now I'm going to have to narrow down which test produces the failure
> (it was not near the first, nor the last test...).
> 
> Here is the source of the warning:
>         DEBUG_LOCKS_WARN_ON(ti->task->blocked_on != waiter);

That implies that the task is currently trying to acquire two
mutexes at once - which I can't see being possible. How different are
the two values, i.e. are you seeing memory corruption?
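
To make the invariant behind that warning concrete, here is a minimal
user-space sketch of the bookkeeping being checked. It is not the kernel
code: model_task, model_waiter and both functions are hypothetical
stand-ins, and it assumes (the thread does not show this) that the waiter
is recorded in blocked_on before the task sleeps and cleared again in
mutex_remove_waiter().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's task_struct and mutex_waiter. */
struct model_waiter;

struct model_task {
    struct model_waiter *blocked_on;  /* waiter this task sleeps on, or NULL */
};

struct model_waiter {
    struct model_task *task;          /* task that owns this waiter */
};

/* Record that the task is about to block on this waiter. */
void model_add_waiter(struct model_task *t, struct model_waiter *w)
{
    assert(t != NULL && w != NULL);
    w->task = t;
    t->blocked_on = w;
}

/*
 * Undo model_add_waiter(). Returns true if the bookkeeping was
 * consistent, false in the situation the warning reports
 * (blocked_on no longer points at the waiter being removed).
 */
bool model_remove_waiter(struct model_task *t, struct model_waiter *w)
{
    bool consistent;

    assert(t != NULL && w != NULL);
    consistent = (w->task == t) && (t->blocked_on == w);

    t->blocked_on = NULL;
    w->task = NULL;
    return consistent;
}
```

In this model the check only fails if a second waiter is recorded for the
same task before the first one is removed, or if blocked_on is overwritten
by a stray write, which is why the two pointer values are worth comparing.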

Also, what case is the code running through? Is it taking the
xfs_dqlock2() branch, and if so are the two dquots different?
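
Whether the two dquots are different matters because one task taking the
same mutex twice would block on itself, and two tasks taking a pair of
mutexes in opposite orders can deadlock each other. The sketch below shows
a generic ordered two-lock decision keyed on an ID. The type model_dquot,
its id field and model_lock_order() are hypothetical; the thread does not
show how xfs_dqlock2() actually picks its order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dquot; only the ID matters here. */
struct model_dquot {
    unsigned int id;
};

/*
 * Decide the order in which two locks should be taken.
 * On success, order[0] is to be locked first and order[1] second.
 * Returns false if both pointers name the same object, since locking
 * one mutex twice from the same task would never succeed.
 */
bool model_lock_order(struct model_dquot *a, struct model_dquot *b,
                      struct model_dquot *order[2])
{
    assert(a != NULL && b != NULL);

    if (a == b)
        return false;

    /* Lower ID first, so every caller agrees on the order. */
    if (a->id > b->id) {
        order[0] = b;
        order[1] = a;
    } else {
        order[0] = a;
        order[1] = b;
    }
    return true;
}
```

Calling model_lock_order() with the arguments swapped yields the same
order, and passing the same object twice is rejected up front, which is
the case the second question above is probing for.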

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

