public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Alex Elder <aelder@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: XFS Mailing List <xfs@oss.sgi.com>
Subject: Re: [PATCH 1/5] xfs: track AGs with reclaimable inodes in per-ag radix tree
Date: Thu, 15 Jul 2010 13:01:40 -0500	[thread overview]
Message-ID: <1279216900.2054.28.camel@doink> (raw)
In-Reply-To: <1279154300-2018-2-git-send-email-david@fromorbit.com>

On Thu, 2010-07-15 at 10:38 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=16348
> 
> When the filesystem grows to a large number of allocation groups,
> the summing of recalimable inodes gets expensive. In many cases,
> most AGs won't have any reclaimable inodes and so we are wasting CPU
> time aggregating over these AGs. This is particularly important for
> the inode shrinker that gets called frequently under memory
> pressure.
> 
> To avoid the overhead, track AGs with reclaimable inodes in the
> per-ag radix tree so that we can find all the AGs with reclaimable
> inodes via a simple gang tag lookup. This involves setting the tag
> when the first reclaimable inode is tracked in the AG, and removing
> the tag when the last reclaimable inode is removed from the tree.
> Then the summation process becomes a loop walking the radix tree
> summing AGs with the reclaim tag set.
> 
> This significantly reduces the overhead of scanning - a 6400 AG
> filesystea now only uses about 25% of a cpu in kswapd while slab reclaim
> progresses instead of being permanently stuck at 100% CPU and making little
> progress. Clean filesystems filesystems will see no overhead and the
> overhead only increases linearly with the number of dirty AGs.

Using the same tag represent a perag with reclaimable
inodes as the tag representing an inode is reclaimable
is nicely consistent...

I have a few comments below for your consideration
but they are minor (and don't even require a response).

Overall this looks good.

Reviewed-by: Alex Elder <aelder@sgi.com>

> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/linux-2.6/xfs_sync.c  |   71 +++++++++++++++++++++++++++++++++++++----
>  fs/xfs/linux-2.6/xfs_trace.h |    3 ++
>  2 files changed, 67 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
> index 56fed91..51da595 100644
> --- a/fs/xfs/linux-2.6/xfs_sync.c
> +++ b/fs/xfs/linux-2.6/xfs_sync.c
> @@ -133,6 +133,41 @@ restart:
>  	return last_error;
>  }
>  
> +/*
> + * Select the next per-ag structure to iterate during the walk. The reclaim
> + * walk is optimised only to walk AGs with reclaimable inodes in them.
> + */
> +static struct xfs_perag *
> +xfs_inode_ag_iter_next_pag(
> +	struct xfs_mount	*mp,
> +	xfs_agnumber_t		*first,
> +	int			tag)
> +{
> +	struct xfs_perag	*pag = NULL;
> +
> +	if (tag == XFS_ICI_RECLAIM_TAG) {
> +		int found;
> +		int ref;
> +
> +		spin_lock(&mp->m_perag_lock);
> +		found = radix_tree_gang_lookup_tag(&mp->m_perag_tree,
> +				(void **)&pag, *first, 1, tag);
> +		if (found <= 0) {
> +			spin_unlock(&mp->m_perag_lock);
> +			return NULL;
> +		}
> +		*first = pag->pag_agno + 1;

Maybe move this below, just before the return.  I.e.:
	if (pag)
		*first = pag->pag_agno + 1;

To me it's slightly clearer that we're just setting
up to search next time with the perag following the
one returned.

> +		/* open coded pag reference increment */

By open-coding here you miss the assertions in xfs_perag_get().
(Though a common inline encapsulating them would have to be
in a header because the two functions reside in different files.)

> +		ref = atomic_inc_return(&pag->pag_ref);
> +		spin_unlock(&mp->m_perag_lock);
> +		trace_xfs_perag_get_reclaim(mp, pag->pag_agno, ref, _RET_IP_);
> +	} else {
> +		pag = xfs_perag_get(mp, *first);
> +		(*first)++;
> +	}
> +	return pag;
> +}
> +
>  int
>  xfs_inode_ag_iterator(
>  	struct xfs_mount	*mp,

. . .

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2010-07-15 17:58 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-15  0:38 [PATCH 0/5] xfs: reclaim bug fixes Dave Chinner
2010-07-15  0:38 ` [PATCH 1/5] xfs: track AGs with reclaimable inodes in per-ag radix tree Dave Chinner
2010-07-15 18:01   ` Alex Elder [this message]
2010-07-16  5:17     ` Christoph Hellwig
2010-07-19  0:17       ` Dave Chinner
2010-07-19  0:24     ` Dave Chinner
2010-07-15  0:38 ` [PATCH 2/5] xfs: simplify and remove xfs_ireclaim Dave Chinner
2010-07-15 18:07   ` Alex Elder
2010-07-16  5:16   ` Christoph Hellwig
2010-07-15  0:38 ` [PATCH 3/5] xfs: fix xfs_trans_add_item() lockdep warnings Dave Chinner
2010-07-15 18:09   ` Alex Elder
2010-07-16  5:19   ` Christoph Hellwig
2010-07-19  0:24     ` Dave Chinner
2010-07-15  0:38 ` [PATCH 4/5] xfs: use GFP_NOFS for page cache allocation Dave Chinner
2010-07-15 18:10   ` Alex Elder
2010-07-16  5:21   ` Christoph Hellwig
2010-07-15  0:38 ` [PATCH 5/5] xfs: fix memory reclaim recursion deadlock on locked inode buffer Dave Chinner
2010-07-15 18:42   ` Alex Elder
2010-07-16  5:22   ` Christoph Hellwig
2010-07-16  5:23 ` [PATCH 0/5] xfs: reclaim bug fixes Christoph Hellwig
2010-07-19  0:30   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1279216900.2054.28.camel@doink \
    --to=aelder@sgi.com \
    --cc=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox