* [patch] xfs: properly account for reclaimed inodes
@ 2010-10-01 7:43 Johannes Weiner
2010-10-01 14:02 ` Dave Chinner
2010-10-01 17:17 ` Alex Elder
0 siblings, 2 replies; 9+ messages in thread
From: Johannes Weiner @ 2010-10-01 7:43 UTC
To: xfs; +Cc: John Hawley, linux-kernel, stable

When marking an inode reclaimable, a per-AG counter is increased, the
inode is tagged reclaimable in its per-AG tree, and, when this is the
first reclaimable inode in the AG, the AG entry in the per-mount tree
is also tagged.

When an inode is finally reclaimed, however, it is only deleted from
the per-AG tree. Neither the counter is decreased, nor is the parent
tree's AG entry untagged properly.

Since the tags in the per-mount tree are not cleared, the inode
shrinker iterates over all AGs that have had reclaimable inodes at one
point in time.

The counters on the other hand signal an increasing amount of slab
objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
per-filesystem context" this is not a real issue anymore because the
shrinker bails out after one iteration.

But the problem was observable on a machine running v2.6.34, where the
reclaimable work increased and each process going into direct reclaim
eventually got stuck on the xfs inode shrinking path, trying to scan
several million objects.

Fix this by properly unwinding the reclaimable-state tracking of an
inode when it is reclaimed.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@kernel.org
---
fs/xfs/linux-2.6/xfs_sync.c | 19 ++++++++++++++-----
1 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index d59c4a6..81976ff 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -668,14 +668,11 @@ xfs_inode_set_reclaim_tag(
 	xfs_perag_put(pag);
 }

-void
-__xfs_inode_clear_reclaim_tag(
-	xfs_mount_t *mp,
+STATIC void
+__xfs_inode_clear_reclaim(
 	xfs_perag_t *pag,
 	xfs_inode_t *ip)
 {
-	radix_tree_tag_clear(&pag->pag_ici_root,
-			XFS_INO_TO_AGINO(mp, ip->i_ino), XFS_ICI_RECLAIM_TAG);
 	pag->pag_ici_reclaimable--;
 	if (!pag->pag_ici_reclaimable) {
 		/* clear the reclaim tag from the perag radix tree */
@@ -689,6 +686,17 @@ __xfs_inode_clear_reclaim_tag(
 	}
 }

+void
+__xfs_inode_clear_reclaim_tag(
+	xfs_mount_t *mp,
+	xfs_perag_t *pag,
+	xfs_inode_t *ip)
+{
+	radix_tree_tag_clear(&pag->pag_ici_root,
+			XFS_INO_TO_AGINO(mp, ip->i_ino), XFS_ICI_RECLAIM_TAG);
+	__xfs_inode_clear_reclaim(pag, ip);
+}
+
 /*
  * Inodes in different states need to be treated differently, and the return
  * value of xfs_iflush is not sufficient to get this right. The following table
@@ -838,6 +846,7 @@ reclaim:
 	if (!radix_tree_delete(&pag->pag_ici_root,
 				XFS_INO_TO_AGINO(ip->i_mount, ip->i_ino)))
 		ASSERT(0);
+	__xfs_inode_clear_reclaim(pag, ip);
 	write_unlock(&pag->pag_ici_lock);

 	/*
--
1.7.2.3
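
As an illustration of the accounting the patch above restores, here is a
minimal userspace sketch in C. It is not kernel code: struct toy_mount,
struct toy_perag and the toy_* functions are hypothetical stand-ins, a
plain bool array replaces the per-mount radix tree tag, and all locking
is left out. It only models the symmetry described in the commit
message: whatever marking an inode reclaimable sets up, reclaiming it
has to tear down again.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_AGS 4

/* Hypothetical stand-in for per-AG state; not the kernel's xfs_perag_t. */
struct toy_perag {
	int reclaimable;	/* models pag_ici_reclaimable */
};

/* Hypothetical per-mount state: one "has reclaimable inodes" flag per AG. */
struct toy_mount {
	struct toy_perag ag[NR_AGS];
	bool ag_tagged[NR_AGS];
};

/* Marking: bump the counter and tag the AG on the 0 -> 1 transition. */
static void toy_set_reclaim(struct toy_mount *mp, int agno)
{
	if (mp->ag[agno].reclaimable++ == 0)
		mp->ag_tagged[agno] = true;
}

/* Shared unwind: drop the counter and untag the AG on the 1 -> 0 transition. */
static void toy_clear_reclaim(struct toy_mount *mp, int agno)
{
	assert(mp->ag[agno].reclaimable > 0);
	if (--mp->ag[agno].reclaimable == 0)
		mp->ag_tagged[agno] = false;
}

/* Behaviour before the patch: the inode goes away, nothing is unwound. */
static void toy_reclaim_old(struct toy_mount *mp, int agno)
{
	(void)mp;
	(void)agno;
}

/* Behaviour after the patch: removal is followed by the shared unwind. */
static void toy_reclaim_new(struct toy_mount *mp, int agno)
{
	toy_clear_reclaim(mp, agno);
}
```

Running 1000 mark/reclaim cycles against one AG with toy_reclaim_old()
leaves that AG's counter at 1000 and its tag set although nothing is
left to reclaim; with toy_reclaim_new() the counter returns to 0 and the
tag is cleared.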
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-01 7:43 [patch] xfs: properly account for reclaimed inodes Johannes Weiner
@ 2010-10-01 14:02 ` Dave Chinner
2010-10-01 17:17 ` Alex Elder
1 sibling, 0 replies; 9+ messages in thread
From: Dave Chinner @ 2010-10-01 14:02 UTC
To: Johannes Weiner; +Cc: xfs, John Hawley, linux-kernel, stable
On Fri, Oct 01, 2010 at 09:43:54AM +0200, Johannes Weiner wrote:
> When marking an inode reclaimable, a per-AG counter is increased, the
> inode is tagged reclaimable in its per-AG tree, and, when this is the
> first reclaimable inode in the AG, the AG entry in the per-mount tree
> is also tagged.
>
> When an inode is finally reclaimed, however, it is only deleted from
> the per-AG tree. Neither the counter is decreased, nor is the parent
> tree's AG entry untagged properly.
>
> Since the tags in the per-mount tree are not cleared, the inode
> shrinker iterates over all AGs that have had reclaimable inodes at one
> point in time.
>
> The counters on the other hand signal an increasing amount of slab
> objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
> per-filesystem context" this is not a real issue anymore because the
> shrinker bails out after one iteration.
>
> But the problem was observable on a machine running v2.6.34, where the
> reclaimable work increased and each process going into direct reclaim
> eventually got stuck on the xfs inode shrinking path, trying to scan
> several million objects.
>
> Fix this by properly unwinding the reclaimable-state tracking of an
> inode when it is reclaimed.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: stable@kernel.org
Looks OK to me, and has run through a few hours of testing without
problems.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-01 7:43 [patch] xfs: properly account for reclaimed inodes Johannes Weiner
2010-10-01 14:02 ` Dave Chinner
@ 2010-10-01 17:17 ` Alex Elder
2010-10-04 7:19 ` Dave Chinner
1 sibling, 1 reply; 9+ messages in thread
From: Alex Elder @ 2010-10-01 17:17 UTC
To: Johannes Weiner; +Cc: xfs, John Hawley, linux-kernel, stable
On Fri, 2010-10-01 at 09:43 +0200, Johannes Weiner wrote:
> When marking an inode reclaimable, a per-AG counter is increased, the
> inode is tagged reclaimable in its per-AG tree, and, when this is the
> first reclaimable inode in the AG, the AG entry in the per-mount tree
> is also tagged.
>
> When an inode is finally reclaimed, however, it is only deleted from
> the per-AG tree. Neither the counter is decreased, nor is the parent
> tree's AG entry untagged properly.
>
> Since the tags in the per-mount tree are not cleared, the inode
> shrinker iterates over all AGs that have had reclaimable inodes at one
> point in time.
>
> The counters on the other hand signal an increasing amount of slab
> objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
> per-filesystem context" this is not a real issue anymore because the
> shrinker bails out after one iteration.
>
> But the problem was observable on a machine running v2.6.34, where the
> reclaimable work increased and each process going into direct reclaim
> eventually got stuck on the xfs inode shrinking path, trying to scan
> several million objects.
>
> Fix this by properly unwinding the reclaimable-state tracking of an
> inode when it is reclaimed.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: stable@kernel.org
Yes, this looks right to me. The state was correctly
adjusted in xfs_iget_cache_hit() when a RECLAIMABLE
inode is found in the cache, but it was not done when
reclaim completes.
Reviewed-by: Alex Elder <aelder@sgi.com>
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-01 17:17 ` Alex Elder
@ 2010-10-04 7:19 ` Dave Chinner
2010-10-04 10:22 ` Johannes Weiner
0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2010-10-04 7:19 UTC
To: Alex Elder; +Cc: Johannes Weiner, xfs, John Hawley, linux-kernel, stable
On Fri, Oct 01, 2010 at 12:17:23PM -0500, Alex Elder wrote:
> On Fri, 2010-10-01 at 09:43 +0200, Johannes Weiner wrote:
> > When marking an inode reclaimable, a per-AG counter is increased, the
> > inode is tagged reclaimable in its per-AG tree, and, when this is the
> > first reclaimable inode in the AG, the AG entry in the per-mount tree
> > is also tagged.
> >
> > When an inode is finally reclaimed, however, it is only deleted from
> > the per-AG tree. Neither the counter is decreased, nor is the parent
> > tree's AG entry untagged properly.
> >
> > Since the tags in the per-mount tree are not cleared, the inode
> > shrinker iterates over all AGs that have had reclaimable inodes at one
> > point in time.
> >
> > The counters on the other hand signal an increasing amount of slab
> > objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
> > per-filesystem context" this is not a real issue anymore because the
> > shrinker bails out after one iteration.
> >
> > But the problem was observable on a machine running v2.6.34, where the
> > reclaimable work increased and each process going into direct reclaim
> > eventually got stuck on the xfs inode shrinking path, trying to scan
> > several million objects.
> >
> > Fix this by properly unwinding the reclaimable-state tracking of an
> > inode when it is reclaimed.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: stable@kernel.org
>
> Yes, this looks right to me. The state was correctly
> adjusted in xfs_iget_cache_hit() when a RECLAIMABLE
> inode is found in the cache, but it was not done when
> reclaim completes.
>
> Reviewed-by: Alex Elder <aelder@sgi.com>
Alex, can you push this to Linus ASAP? This needs to go back to
stable kernels as well..
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-04 7:19 ` Dave Chinner
@ 2010-10-04 10:22 ` Johannes Weiner
2010-10-05 9:26 ` Hans-Peter Jansen
2010-10-06 4:53 ` Dave Chinner
0 siblings, 2 replies; 9+ messages in thread
From: Johannes Weiner @ 2010-10-04 10:22 UTC
To: Dave Chinner; +Cc: Alex Elder, xfs, John Hawley, linux-kernel, stable
Hi,
On Mon, Oct 04, 2010 at 06:19:04PM +1100, Dave Chinner wrote:
> On Fri, Oct 01, 2010 at 12:17:23PM -0500, Alex Elder wrote:
> > On Fri, 2010-10-01 at 09:43 +0200, Johannes Weiner wrote:
> > > When marking an inode reclaimable, a per-AG counter is increased, the
> > > inode is tagged reclaimable in its per-AG tree, and, when this is the
> > > first reclaimable inode in the AG, the AG entry in the per-mount tree
> > > is also tagged.
> > >
> > > When an inode is finally reclaimed, however, it is only deleted from
> > > the per-AG tree. Neither the counter is decreased, nor is the parent
> > > tree's AG entry untagged properly.
> > >
> > > Since the tags in the per-mount tree are not cleared, the inode
> > > shrinker iterates over all AGs that have had reclaimable inodes at one
> > > point in time.
> > >
> > > The counters on the other hand signal an increasing amount of slab
> > > objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
> > > per-filesystem context" this is not a real issue anymore because the
> > > shrinker bails out after one iteration.
> > >
> > > But the problem was observable on a machine running v2.6.34, where the
> > > reclaimable work increased and each process going into direct reclaim
> > > eventually got stuck on the xfs inode shrinking path, trying to scan
> > > several million objects.
> > >
> > > Fix this by properly unwinding the reclaimable-state tracking of an
> > > inode when it is reclaimed.
> > >
> > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > Cc: stable@kernel.org
> >
> > Yes, this looks right to me. The state was correctly
> > adjusted in xfs_iget_cache_hit() when a RECLAIMABLE
> > inode is found in the cache, but it was not done when
> > reclaim completes.
> >
> > Reviewed-by: Alex Elder <aelder@sgi.com>
>
> Alex, can you push this to Linus ASAP? This needs to go back to
> stable kernels as well..
Here is my suggestion of a backport to .34. Dave, Alex, do you
approve?
Hannes
diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index 6845db9..3314f2a 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -499,6 +499,7 @@ xfs_ireclaim(
 	write_lock(&pag->pag_ici_lock);
 	if (!radix_tree_delete(&pag->pag_ici_root, agino))
 		ASSERT(0);
+	pag->pag_ici_reclaimable--;
 	write_unlock(&pag->pag_ici_lock);
 	xfs_perag_put(pag);
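
To make the effect of this one-line decrement concrete, here is a small
userspace sketch in C. It rests on assumptions rather than on the .34
sources: struct toy_counters, toy_cycle() and toy_work_estimate() are
hypothetical names, and the sketch assumes the reclaim work estimate is
simply the sum of the per-AG counters, which is one reading of "the
counters ... signal an increasing amount of slab objects to reclaim"
from the original commit message.

```c
#include <assert.h>

#define TOY_NR_AGS 4

/* Hypothetical per-AG counters standing in for pag_ici_reclaimable. */
struct toy_counters {
	long reclaimable[TOY_NR_AGS];
};

/* Assumption: the work estimate is the sum of all per-AG counters. */
static long toy_work_estimate(const struct toy_counters *c)
{
	long total = 0;

	for (int agno = 0; agno < TOY_NR_AGS; agno++)
		total += c->reclaimable[agno];
	return total;
}

/* One inode life cycle in an AG: marked reclaimable, later reclaimed. */
static void toy_cycle(struct toy_counters *c, int agno,
		      int decrement_on_reclaim)
{
	c->reclaimable[agno]++;		/* inode marked reclaimable */
	if (decrement_on_reclaim)
		c->reclaimable[agno]--;	/* models the added decrement */
}
```

With decrement_on_reclaim set to 0, a million cycles spread over the
AGs leave toy_work_estimate() at one million with nothing actually left
to reclaim, which mirrors the "several million objects" symptom reported
for v2.6.34; with it set to 1 the estimate stays at 0.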
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-04 10:22 ` Johannes Weiner
@ 2010-10-05 9:26 ` Hans-Peter Jansen
2010-10-07 3:12 ` Alex Elder
2010-10-06 4:53 ` Dave Chinner
1 sibling, 1 reply; 9+ messages in thread
From: Hans-Peter Jansen @ 2010-10-05 9:26 UTC
To: xfs
Cc: Johannes Weiner, Dave Chinner, stable, John Hawley, linux-kernel,
Alex Elder
On Monday 04 October 2010, 12:22:13 Johannes Weiner wrote:
> Hi,
>
> On Mon, Oct 04, 2010 at 06:19:04PM +1100, Dave Chinner wrote:
> > On Fri, Oct 01, 2010 at 12:17:23PM -0500, Alex Elder wrote:
> > > On Fri, 2010-10-01 at 09:43 +0200, Johannes Weiner wrote:
> > > > When marking an inode reclaimable, a per-AG counter is increased,
> > > > the inode is tagged reclaimable in its per-AG tree, and, when this
> > > > is the first reclaimable inode in the AG, the AG entry in the
> > > > per-mount tree is also tagged.
> > > >
> > > > When an inode is finally reclaimed, however, it is only deleted
> > > > from the per-AG tree. Neither the counter is decreased, nor is the
> > > > parent tree's AG entry untagged properly.
> > > >
> > > > Since the tags in the per-mount tree are not cleared, the inode
> > > > shrinker iterates over all AGs that have had reclaimable inodes at
> > > > one point in time.
> > > >
> > > > The counters on the other hand signal an increasing amount of slab
> > > > objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
> > > > per-filesystem context" this is not a real issue anymore because
> > > > the shrinker bails out after one iteration.
> > > >
> > > > But the problem was observable on a machine running v2.6.34, where
> > > > the reclaimable work increased and each process going into direct
> > > > reclaim eventually got stuck on the xfs inode shrinking path,
> > > > trying to scan several million objects.
> > > >
> > > > Fix this by properly unwinding the reclaimable-state tracking of an
> > > > inode when it is reclaimed.
> > > >
> > > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > > Cc: stable@kernel.org
> > >
> > > Yes, this looks right to me. The state was correctly
> > > adjusted in xfs_iget_cache_hit() when a RECLAIMABLE
> > > inode is found in the cache, but it was not done when
> > > reclaim completes.
> > >
> > > Reviewed-by: Alex Elder <aelder@sgi.com>
> >
> > Alex, can you push this to Linus ASAP? This needs to go back to
> > stable kernels as well..
>
> Here is my suggestion of a backport to .34. Dave, Alex, do you
> approve?
>
> Hannes
>
> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> index 6845db9..3314f2a 100644
> --- a/fs/xfs/xfs_iget.c
> +++ b/fs/xfs/xfs_iget.c
> @@ -499,6 +499,7 @@ xfs_ireclaim(
> write_lock(&pag->pag_ici_lock);
> if (!radix_tree_delete(&pag->pag_ici_root, agino))
> ASSERT(0);
> + pag->pag_ici_reclaimable--;
> write_unlock(&pag->pag_ici_lock);
> xfs_perag_put(pag);
>
>
Ping?
Masters of xfs, please raise your voices!
Pete
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-04 10:22 ` Johannes Weiner
2010-10-05 9:26 ` Hans-Peter Jansen
@ 2010-10-06 4:53 ` Dave Chinner
2010-10-06 23:46 ` J.H.
1 sibling, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2010-10-06 4:53 UTC
To: Johannes Weiner; +Cc: Alex Elder, xfs, John Hawley, linux-kernel, stable
On Mon, Oct 04, 2010 at 12:22:13PM +0200, Johannes Weiner wrote:
> Hi,
>
> On Mon, Oct 04, 2010 at 06:19:04PM +1100, Dave Chinner wrote:
> > On Fri, Oct 01, 2010 at 12:17:23PM -0500, Alex Elder wrote:
> > > On Fri, 2010-10-01 at 09:43 +0200, Johannes Weiner wrote:
> > > > When marking an inode reclaimable, a per-AG counter is increased, the
> > > > inode is tagged reclaimable in its per-AG tree, and, when this is the
> > > > first reclaimable inode in the AG, the AG entry in the per-mount tree
> > > > is also tagged.
> > > >
> > > > When an inode is finally reclaimed, however, it is only deleted from
> > > > the per-AG tree. Neither the counter is decreased, nor is the parent
> > > > tree's AG entry untagged properly.
> > > >
> > > > Since the tags in the per-mount tree are not cleared, the inode
> > > > shrinker iterates over all AGs that have had reclaimable inodes at one
> > > > point in time.
> > > >
> > > > The counters on the other hand signal an increasing amount of slab
> > > > objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
> > > > per-filesystem context" this is not a real issue anymore because the
> > > > shrinker bails out after one iteration.
> > > >
> > > > But the problem was observable on a machine running v2.6.34, where the
> > > > reclaimable work increased and each process going into direct reclaim
> > > > eventually got stuck on the xfs inode shrinking path, trying to scan
> > > > several million objects.
> > > >
> > > > Fix this by properly unwinding the reclaimable-state tracking of an
> > > > inode when it is reclaimed.
> > > >
> > > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > > Cc: stable@kernel.org
> > >
> > > Yes, this looks right to me. The state was correctly
> > > adjusted in xfs_iget_cache_hit() when a RECLAIMABLE
> > > inode is found in the cache, but it was not done when
> > > reclaim completes.
> > >
> > > Reviewed-by: Alex Elder <aelder@sgi.com>
> >
> > Alex, can you push this to Linus ASAP? This needs to go back to
> > stable kernels as well..
>
> Here is my suggestion of a backport to .34. Dave, Alex, do you
> approve?
>
> Hannes
>
> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> index 6845db9..3314f2a 100644
> --- a/fs/xfs/xfs_iget.c
> +++ b/fs/xfs/xfs_iget.c
> @@ -499,6 +499,7 @@ xfs_ireclaim(
> write_lock(&pag->pag_ici_lock);
> if (!radix_tree_delete(&pag->pag_ici_root, agino))
> ASSERT(0);
> + pag->pag_ici_reclaimable--;
> write_unlock(&pag->pag_ici_lock);
> xfs_perag_put(pag);
Looks good to me.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-06 4:53 ` Dave Chinner
@ 2010-10-06 23:46 ` J.H.
0 siblings, 0 replies; 9+ messages in thread
From: J.H. @ 2010-10-06 23:46 UTC
To: Dave Chinner; +Cc: Johannes Weiner, Alex Elder, xfs, linux-kernel, stable
On 10/05/2010 09:53 PM, Dave Chinner wrote:
> On Mon, Oct 04, 2010 at 12:22:13PM +0200, Johannes Weiner wrote:
>> Hi,
>>
>> On Mon, Oct 04, 2010 at 06:19:04PM +1100, Dave Chinner wrote:
>>> On Fri, Oct 01, 2010 at 12:17:23PM -0500, Alex Elder wrote:
>>>> On Fri, 2010-10-01 at 09:43 +0200, Johannes Weiner wrote:
>>>>> When marking an inode reclaimable, a per-AG counter is increased, the
>>>>> inode is tagged reclaimable in its per-AG tree, and, when this is the
>>>>> first reclaimable inode in the AG, the AG entry in the per-mount tree
>>>>> is also tagged.
>>>>>
>>>>> When an inode is finally reclaimed, however, it is only deleted from
>>>>> the per-AG tree. Neither the counter is decreased, nor is the parent
>>>>> tree's AG entry untagged properly.
>>>>>
>>>>> Since the tags in the per-mount tree are not cleared, the inode
>>>>> shrinker iterates over all AGs that have had reclaimable inodes at one
>>>>> point in time.
>>>>>
>>>>> The counters on the other hand signal an increasing amount of slab
>>>>> objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
>>>>> per-filesystem context" this is not a real issue anymore because the
>>>>> shrinker bails out after one iteration.
>>>>>
>>>>> But the problem was observable on a machine running v2.6.34, where the
>>>>> reclaimable work increased and each process going into direct reclaim
>>>>> eventually got stuck on the xfs inode shrinking path, trying to scan
>>>>> several million objects.
>>>>>
>>>>> Fix this by properly unwinding the reclaimable-state tracking of an
>>>>> inode when it is reclaimed.
>>>>>
>>>>> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>>>>> Cc: stable@kernel.org
>>>>
>>>> Yes, this looks right to me. The state was correctly
>>>> adjusted in xfs_iget_cache_hit() when a RECLAIMABLE
>>>> inode is found in the cache, but it was not done when
>>>> reclaim completes.
>>>>
>>>> Reviewed-by: Alex Elder <aelder@sgi.com>
>>>
>>> Alex, can you push this to Linus ASAP? This needs to go back to
>>> stable kernels as well..
>>
>> Here is my suggestion of a backport to .34. Dave, Alex, do you
>> approve?
>>
>> Hannes
>>
>> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
>> index 6845db9..3314f2a 100644
>> --- a/fs/xfs/xfs_iget.c
>> +++ b/fs/xfs/xfs_iget.c
>> @@ -499,6 +499,7 @@ xfs_ireclaim(
>> write_lock(&pag->pag_ici_lock);
>> if (!radix_tree_delete(&pag->pag_ici_root, agino))
>> ASSERT(0);
>> + pag->pag_ici_reclaimable--;
>> write_unlock(&pag->pag_ici_lock);
>> xfs_perag_put(pag);
>
> Looks good to me.
>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
I've got this in production and things seem to be acting a lot more like
I would expect.
Tested-by: John 'Warthog9' Hawley <warthog9@kernel.org>
* Re: [patch] xfs: properly account for reclaimed inodes
2010-10-05 9:26 ` Hans-Peter Jansen
@ 2010-10-07 3:12 ` Alex Elder
0 siblings, 0 replies; 9+ messages in thread
From: Alex Elder @ 2010-10-07 3:12 UTC
To: Hans-Peter Jansen
Cc: xfs, Johannes Weiner, Dave Chinner, stable, John Hawley,
linux-kernel
On Tue, 2010-10-05 at 11:26 +0200, Hans-Peter Jansen wrote:
> On Monday 04 October 2010, 12:22:13 Johannes Weiner wrote:
> > Hi,
> >
> > On Mon, Oct 04, 2010 at 06:19:04PM +1100, Dave Chinner wrote:
> > > On Fri, Oct 01, 2010 at 12:17:23PM -0500, Alex Elder wrote:
> > > > On Fri, 2010-10-01 at 09:43 +0200, Johannes Weiner wrote:
> > > > > When marking an inode reclaimable, a per-AG counter is increased,
> > > > > the inode is tagged reclaimable in its per-AG tree, and, when this
> > > > > is the first reclaimable inode in the AG, the AG entry in the
> > > > > per-mount tree is also tagged.
> > > > >
> > > > > When an inode is finally reclaimed, however, it is only deleted
> > > > > from the per-AG tree. Neither the counter is decreased, nor is the
> > > > > parent tree's AG entry untagged properly.
> > > > >
> > > > > Since the tags in the per-mount tree are not cleared, the inode
> > > > > shrinker iterates over all AGs that have had reclaimable inodes at
> > > > > one point in time.
> > > > >
> > > > > The counters on the other hand signal an increasing amount of slab
> > > > > objects to reclaim. Since "70e60ce xfs: convert inode shrinker to
> > > > > per-filesystem context" this is not a real issue anymore because
> > > > > the shrinker bails out after one iteration.
> > > > >
> > > > > But the problem was observable on a machine running v2.6.34, where
> > > > > the reclaimable work increased and each process going into direct
> > > > > reclaim eventually got stuck on the xfs inode shrinking path,
> > > > > trying to scan several million objects.
> > > > >
> > > > > Fix this by properly unwinding the reclaimable-state tracking of an
> > > > > inode when it is reclaimed.
> > > > >
> > > > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > > > Cc: stable@kernel.org
> > > >
> > > > Yes, this looks right to me. The state was correctly
> > > > adjusted in xfs_iget_cache_hit() when a RECLAIMABLE
> > > > inode is found in the cache, but it was not done when
> > > > reclaim completes.
> > > >
> > > > Reviewed-by: Alex Elder <aelder@sgi.com>
> > >
> > > Alex, can you push this to Linus ASAP? This needs to go back to
> > > stable kernels as well..
> >
> > Here is my suggestion of a backport to .34. Dave, Alex, do you
> > approve?
> >
> > Hannes
> >
> > diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> > index 6845db9..3314f2a 100644
> > --- a/fs/xfs/xfs_iget.c
> > +++ b/fs/xfs/xfs_iget.c
> > @@ -499,6 +499,7 @@ xfs_ireclaim(
> > write_lock(&pag->pag_ici_lock);
> > if (!radix_tree_delete(&pag->pag_ici_root, agino))
> > ASSERT(0);
> > + pag->pag_ici_reclaimable--;
> > write_unlock(&pag->pag_ici_lock);
> > xfs_perag_put(pag);
> >
> >
>
> Ping?
>
> Masters of xfs, please raise your voices!
>
> Pete
I know I'm a little late to the game in saying so, but I do
agree this looks like the right fix for the .34 stable branch.
Reviewed-by: Alex Elder <aelder@sgi.com>