From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 0/6 v2] xfs: xfs_iflush_cluster vs xfs_reclaim_inode
Date: Mon, 11 Apr 2016 09:37:18 -0400
Message-ID: <20160411133716.GA47566@bfoster.bfoster>
In-Reply-To: <20160408221706.GB567@dastard>
[-- Attachment #1: Type: text/plain, Size: 4461 bytes --]
On Sat, Apr 09, 2016 at 08:17:06AM +1000, Dave Chinner wrote:
> On Fri, Apr 08, 2016 at 01:18:44PM -0400, Brian Foster wrote:
> > On Fri, Apr 08, 2016 at 09:37:45AM +1000, Dave Chinner wrote:
> > > Hi folks,
> > >
> > > This is the second version of this patch set, first posted and
> > > described here:
> > >
> > > http://oss.sgi.com/archives/xfs/2016-04/msg00069.html
> > >
> > > The only change from the first version is splitting up the first
> > > patch into two as Christoph requested - one for the bug fix, the
> > > other for the variable renaming.
> > >
> >
> > Did your xfstests testing for this series include generic/233? I'm
> > seeing a consistently reproducible test hang. The test is hanging on a
> > "xfs_quota -x -c off -ug /mnt/scratch" command. The stack is as follows:
> >
> > [<ffffffffa0772306>] xfs_qm_dquot_walk.isra.8+0x196/0x1b0 [xfs]
> > [<ffffffffa0774a98>] xfs_qm_dqpurge_all+0x78/0x80 [xfs]
> > [<ffffffffa07713e8>] xfs_qm_scall_quotaoff+0x148/0x640 [xfs]
> > [<ffffffffa077733d>] xfs_quota_disable+0x3d/0x50 [xfs]
> > [<ffffffff812c27e3>] SyS_quotactl+0x3b3/0x8c0
> > [<ffffffff81003e17>] do_syscall_64+0x67/0x190
> > [<ffffffff81763f7f>] return_from_SYSCALL_64+0x0/0x7a
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
> > ... and it looks like the kernel is spinning somehow or another between
> > inode reclaim and xfsaild:
> >
> > ...
> > kworker/1:2-210 [001] ...1 895.750591: xfs_perag_get_tag: dev 253:3 agno 1 refcount 1 caller xfs_reclaim_inodes_ag [xfs]
> > kworker/1:2-210 [001] ...1 895.750609: xfs_perag_put: dev 253:3 agno 1 refcount 0 caller xfs_reclaim_inodes_ag [xfs]
> > kworker/1:2-210 [001] ...1 895.750609: xfs_perag_get_tag: dev 253:3 agno 2 refcount 5 caller xfs_reclaim_inodes_ag [xfs]
> > kworker/1:2-210 [001] ...1 895.750611: xfs_perag_put: dev 253:3 agno 2 refcount 4 caller xfs_reclaim_inodes_ag [xfs]
> > kworker/1:2-210 [001] ...1 895.750612: xfs_perag_get_tag: dev 253:3 agno 3 refcount 1 caller xfs_reclaim_inodes_ag [xfs]
> > kworker/1:2-210 [001] ...1 895.750613: xfs_perag_put: dev 253:3 agno 3 refcount 0 caller xfs_reclaim_inodes_ag [xfs]
> > xfsaild/dm-3-12406 [003] ...2 895.760588: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL
> > xfsaild/dm-3-12406 [003] ...2 895.810595: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL
> > xfsaild/dm-3-12406 [003] ...2 895.860586: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL
> > xfsaild/dm-3-12406 [003] ...2 895.910596: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL
> > ...
>
> No deadlock involving the AIL - it doesn't remove the
> XFS_LI_QUOTAOFF from the AIL - the quota code committing the
> quotaoff-end transactions is what removes that. IOWs, the dquot walk
> has not completed, so quotaoff has not completed, so the
> XFS_LI_QUOTAOFF is still in the AIL.
>
> IOWs, this looks like xfs_qm_dquot_walk() is skipping dquots because
> xfs_qm_dqpurge is hitting this:
>
> 	xfs_dqlock(dqp);
> 	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) {
> 		xfs_dqunlock(dqp);
> 		return -EAGAIN;
> 	}
>
> So that means we've got an inode that probably hasn't been
> reclaimed, because the last thing that happens during reclaim is the
> dquots are detached from the inode and hence the reference counts
> are dropped.
>
> > FWIW, this only occurs with patch 6 applied. The test and scratch
> > devices are both 10GB lvm volumes formatted with mkfs defaults (v5).
>
> I can't see how patch 6 would prevent an inode from being reclaimed,
> as all the changes occur *after* the reclaim decision has been made.
> More investigation needed, I guess...
>
The attached diff addresses the problem for me. Feel free to fold it
into the original patch.
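
As a rough illustration of why the extra argument matters (a userspace
sketch only, not kernel code): with patch 6 applied, ip->i_ino has
presumably already been zeroed by the time __xfs_inode_clear_reclaim()
runs, so XFS_INO_TO_AGNO(ip->i_mount, ip->i_ino) resolves to AG 0 and the
reclaim tag is cleared on the wrong perag. That would be consistent with
the trace above, where reclaim keeps finding AGs 1-3 tagged. The names
model_inode, model_ino_to_agno(), model_make_ino() and AGINO_BITS below
are made up for the example; the real shift depends on the filesystem
geometry.

```c
#include <stdint.h>

/*
 * Userspace model only. AGINO_BITS is a made-up stand-in for the
 * per-filesystem shift that XFS_INO_TO_AGNO() applies; the real value
 * depends on the filesystem geometry.
 */
#define AGINO_BITS 32

struct model_inode {
	uint64_t i_ino;		/* assumed to be zeroed when the inode is invalidated */
};

/* Model of XFS_INO_TO_AGNO(): the AG number lives in the high bits. */
static uint32_t model_ino_to_agno(uint64_t ino)
{
	return (uint32_t)(ino >> AGINO_BITS);
}

/* Build an inode number from an AG number and an AG-relative inode number. */
static uint64_t model_make_ino(uint32_t agno, uint32_t agino)
{
	return ((uint64_t)agno << AGINO_BITS) | agino;
}

/* Before the fix: derive the AG from ip->i_ino at tag-clear time. */
static uint32_t agno_from_inode(const struct model_inode *ip)
{
	return model_ino_to_agno(ip->i_ino);
}

/* After the fix: derive the AG from an inode number the caller saved. */
static uint32_t agno_from_saved_ino(uint64_t ino)
{
	return model_ino_to_agno(ino);
}
```

If the caller saves the inode number for an inode in AG 2 and then zeroes
i_ino, agno_from_inode() yields 0 while agno_from_saved_ino() still yields
2, which is the distinction the hunk relies on.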

The regression test I had running failed with an OOM over the weekend.
I hadn't seen that before, but then again I haven't seen this test run
to completion on this system either due to the original problem. I'll
restart it today with this hunk included.

Brian

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
[-- Attachment #2: __xfs_inode_clear_reclaim.diff --]
[-- Type: text/plain, Size: 1246 bytes --]
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index a60db43..749689c 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -818,14 +818,15 @@ xfs_inode_set_reclaim_tag(
 STATIC void
 __xfs_inode_clear_reclaim(
 	xfs_perag_t *pag,
-	xfs_inode_t *ip)
+	xfs_inode_t *ip,
+	xfs_ino_t ino)
 {
 	pag->pag_ici_reclaimable--;
 	if (!pag->pag_ici_reclaimable) {
 		/* clear the reclaim tag from the perag radix tree */
 		spin_lock(&ip->i_mount->m_perag_lock);
 		radix_tree_tag_clear(&ip->i_mount->m_perag_tree,
-				XFS_INO_TO_AGNO(ip->i_mount, ip->i_ino),
+				XFS_INO_TO_AGNO(ip->i_mount, ino),
 				XFS_ICI_RECLAIM_TAG);
 		spin_unlock(&ip->i_mount->m_perag_lock);
 		trace_xfs_perag_clear_reclaim(ip->i_mount, pag->pag_agno,
@@ -841,7 +842,7 @@ __xfs_inode_clear_reclaim_tag(
 {
 	radix_tree_tag_clear(&pag->pag_ici_root,
 		XFS_INO_TO_AGINO(mp, ip->i_ino), XFS_ICI_RECLAIM_TAG);
-	__xfs_inode_clear_reclaim(pag, ip);
+	__xfs_inode_clear_reclaim(pag, ip, ip->i_ino);
 }

 /*
@@ -1032,7 +1033,7 @@ reclaim:
 	if (!radix_tree_delete(&pag->pag_ici_root,
 				XFS_INO_TO_AGINO(ip->i_mount, ino)))
 		ASSERT(0);
-	__xfs_inode_clear_reclaim(pag, ip);
+	__xfs_inode_clear_reclaim(pag, ip, ino);
 	spin_unlock(&pag->pag_ici_lock);

 	/*
Thread overview: 18+ messages
2016-04-07 23:37 [PATCH 0/6 v2] xfs: xfs_iflush_cluster vs xfs_reclaim_inode Dave Chinner
2016-04-07 23:37 ` [PATCH 1/6] xfs: fix inode validity check in xfs_iflush_cluster Dave Chinner
2016-04-07 23:43 ` Christoph Hellwig
2016-04-07 23:37 ` [PATCH 2/6] xfs: rename variables in xfs_iflush_cluster for clarity Dave Chinner
2016-04-07 23:44 ` Christoph Hellwig
2016-04-07 23:37 ` [PATCH 3/6] xfs: skip stale inodes in xfs_iflush_cluster Dave Chinner
2016-04-07 23:37 ` [PATCH 4/6] xfs: xfs_iflush_cluster has range issues Dave Chinner
2016-04-07 23:37 ` [PATCH 5/6] xfs: xfs_inode_free() isn't RCU safe Dave Chinner
2016-04-07 23:37 ` [PATCH 6/6] xfs: mark reclaimed inodes invalid earlier Dave Chinner
2016-04-07 23:46 ` Christoph Hellwig
2016-04-08 3:28 ` [PATCH 0/6 v2] xfs: xfs_iflush_cluster vs xfs_reclaim_inode Eryu Guan
2016-04-08 11:37 ` Brian Foster
2016-04-10 9:22 ` Eryu Guan
2016-04-11 6:25 ` Eryu Guan
2016-04-08 17:18 ` Brian Foster
2016-04-08 22:17 ` Dave Chinner
2016-04-11 13:37 ` Brian Foster [this message]
2016-04-11 23:31 ` Dave Chinner