From: Alex Elder <aelder@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: XFS Mailing List <xfs@oss.sgi.com>
Subject: Re: [PATCH 1/5] xfs: track AGs with reclaimable inodes in per-ag radix tree
Date: Thu, 15 Jul 2010 13:01:40 -0500 [thread overview]
Message-ID: <1279216900.2054.28.camel@doink> (raw)
In-Reply-To: <1279154300-2018-2-git-send-email-david@fromorbit.com>
On Thu, 2010-07-15 at 10:38 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16348
>
> When the filesystem grows to a large number of allocation groups,
> the summing of recalimable inodes gets expensive. In many cases,
> most AGs won't have any reclaimable inodes and so we are wasting CPU
> time aggregating over these AGs. This is particularly important for
> the inode shrinker that gets called frequently under memory
> pressure.
>
> To avoid the overhead, track AGs with reclaimable inodes in the
> per-ag radix tree so that we can find all the AGs with reclaimable
> inodes via a simple gang tag lookup. This involves setting the tag
> when the first reclaimable inode is tracked in the AG, and removing
> the tag when the last reclaimable inode is removed from the tree.
> Then the summation process becomes a loop walking the radix tree
> summing AGs with the reclaim tag set.
>
> This significantly reduces the overhead of scanning - a 6400 AG
> filesystea now only uses about 25% of a cpu in kswapd while slab reclaim
> progresses instead of being permanently stuck at 100% CPU and making little
> progress. Clean filesystems filesystems will see no overhead and the
> overhead only increases linearly with the number of dirty AGs.
Using the same tag represent a perag with reclaimable
inodes as the tag representing an inode is reclaimable
is nicely consistent...
I have a few comments below for your consideration
but they are minor (and don't even require a response).
Overall this looks good.
Reviewed-by: Alex Elder <aelder@sgi.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/linux-2.6/xfs_sync.c | 71 +++++++++++++++++++++++++++++++++++++----
> fs/xfs/linux-2.6/xfs_trace.h | 3 ++
> 2 files changed, 67 insertions(+), 7 deletions(-)
>
> diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
> index 56fed91..51da595 100644
> --- a/fs/xfs/linux-2.6/xfs_sync.c
> +++ b/fs/xfs/linux-2.6/xfs_sync.c
> @@ -133,6 +133,41 @@ restart:
> return last_error;
> }
>
> +/*
> + * Select the next per-ag structure to iterate during the walk. The reclaim
> + * walk is optimised only to walk AGs with reclaimable inodes in them.
> + */
> +static struct xfs_perag *
> +xfs_inode_ag_iter_next_pag(
> + struct xfs_mount *mp,
> + xfs_agnumber_t *first,
> + int tag)
> +{
> + struct xfs_perag *pag = NULL;
> +
> + if (tag == XFS_ICI_RECLAIM_TAG) {
> + int found;
> + int ref;
> +
> + spin_lock(&mp->m_perag_lock);
> + found = radix_tree_gang_lookup_tag(&mp->m_perag_tree,
> + (void **)&pag, *first, 1, tag);
> + if (found <= 0) {
> + spin_unlock(&mp->m_perag_lock);
> + return NULL;
> + }
> + *first = pag->pag_agno + 1;
Maybe move this below, just before the return. I.e.:
if (pag)
*first = pag->pag_agno + 1;
To me it's slightly clearer that we're just setting
up to search next time with the perag following the
one returned.
> + /* open coded pag reference increment */
By open-coding here you miss the assertions in xfs_perag_get().
(Though a common inline encapsulating them would have to be
in a header because the two functions reside in different files.)
> + ref = atomic_inc_return(&pag->pag_ref);
> + spin_unlock(&mp->m_perag_lock);
> + trace_xfs_perag_get_reclaim(mp, pag->pag_agno, ref, _RET_IP_);
> + } else {
> + pag = xfs_perag_get(mp, *first);
> + (*first)++;
> + }
> + return pag;
> +}
> +
> int
> xfs_inode_ag_iterator(
> struct xfs_mount *mp,
. . .
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2010-07-15 17:58 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-15 0:38 [PATCH 0/5] xfs: reclaim bug fixes Dave Chinner
2010-07-15 0:38 ` [PATCH 1/5] xfs: track AGs with reclaimable inodes in per-ag radix tree Dave Chinner
2010-07-15 18:01 ` Alex Elder [this message]
2010-07-16 5:17 ` Christoph Hellwig
2010-07-19 0:17 ` Dave Chinner
2010-07-19 0:24 ` Dave Chinner
2010-07-15 0:38 ` [PATCH 2/5] xfs: simplify and remove xfs_ireclaim Dave Chinner
2010-07-15 18:07 ` Alex Elder
2010-07-16 5:16 ` Christoph Hellwig
2010-07-15 0:38 ` [PATCH 3/5] xfs: fix xfs_trans_add_item() lockdep warnings Dave Chinner
2010-07-15 18:09 ` Alex Elder
2010-07-16 5:19 ` Christoph Hellwig
2010-07-19 0:24 ` Dave Chinner
2010-07-15 0:38 ` [PATCH 4/5] xfs: use GFP_NOFS for page cache allocation Dave Chinner
2010-07-15 18:10 ` Alex Elder
2010-07-16 5:21 ` Christoph Hellwig
2010-07-15 0:38 ` [PATCH 5/5] xfs: fix memory reclaim recursion deadlock on locked inode buffer Dave Chinner
2010-07-15 18:42 ` Alex Elder
2010-07-16 5:22 ` Christoph Hellwig
2010-07-16 5:23 ` [PATCH 0/5] xfs: reclaim bug fixes Christoph Hellwig
2010-07-19 0:30 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1279216900.2054.28.camel@doink \
--to=aelder@sgi.com \
--cc=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.