public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Alex Elder <aelder@sgi.com>
Cc: XFS Mailing List <xfs@oss.sgi.com>
Subject: Re: [PATCH 1/5] xfs: track AGs with reclaimable inodes in per-ag radix tree
Date: Mon, 19 Jul 2010 10:24:18 +1000	[thread overview]
Message-ID: <20100719002418.GE23223@dastard> (raw)
In-Reply-To: <1279216900.2054.28.camel@doink>

On Thu, Jul 15, 2010 at 01:01:40PM -0500, Alex Elder wrote:
> On Thu, 2010-07-15 at 10:38 +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=16348
> > 
> > When the filesystem grows to a large number of allocation groups,
> > the summing of recalimable inodes gets expensive. In many cases,
> > most AGs won't have any reclaimable inodes and so we are wasting CPU
> > time aggregating over these AGs. This is particularly important for
> > the inode shrinker that gets called frequently under memory
> > pressure.
> > 
> > To avoid the overhead, track AGs with reclaimable inodes in the
> > per-ag radix tree so that we can find all the AGs with reclaimable
> > inodes via a simple gang tag lookup. This involves setting the tag
> > when the first reclaimable inode is tracked in the AG, and removing
> > the tag when the last reclaimable inode is removed from the tree.
> > Then the summation process becomes a loop walking the radix tree
> > summing AGs with the reclaim tag set.
> > 
> > This significantly reduces the overhead of scanning - a 6400 AG
> > filesystea now only uses about 25% of a cpu in kswapd while slab reclaim
> > progresses instead of being permanently stuck at 100% CPU and making little
> > progress. Clean filesystems filesystems will see no overhead and the
> > overhead only increases linearly with the number of dirty AGs.
> 
> Using the same tag represent a perag with reclaimable
> inodes as the tag representing an inode is reclaimable
> is nicely consistent...
> 
> I have a few comments below for your consideration
> but they are minor (and don't even require a response).
> 
> Overall this looks good.
> 
> Reviewed-by: Alex Elder <aelder@sgi.com>
> 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/linux-2.6/xfs_sync.c  |   71 +++++++++++++++++++++++++++++++++++++----
> >  fs/xfs/linux-2.6/xfs_trace.h |    3 ++
> >  2 files changed, 67 insertions(+), 7 deletions(-)
> > 
> > diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
> > index 56fed91..51da595 100644
> > --- a/fs/xfs/linux-2.6/xfs_sync.c
> > +++ b/fs/xfs/linux-2.6/xfs_sync.c
> > @@ -133,6 +133,41 @@ restart:
> >  	return last_error;
> >  }
> >  
> > +/*
> > + * Select the next per-ag structure to iterate during the walk. The reclaim
> > + * walk is optimised only to walk AGs with reclaimable inodes in them.
> > + */
> > +static struct xfs_perag *
> > +xfs_inode_ag_iter_next_pag(
> > +	struct xfs_mount	*mp,
> > +	xfs_agnumber_t		*first,
> > +	int			tag)
> > +{
> > +	struct xfs_perag	*pag = NULL;
> > +
> > +	if (tag == XFS_ICI_RECLAIM_TAG) {
> > +		int found;
> > +		int ref;
> > +
> > +		spin_lock(&mp->m_perag_lock);
> > +		found = radix_tree_gang_lookup_tag(&mp->m_perag_tree,
> > +				(void **)&pag, *first, 1, tag);
> > +		if (found <= 0) {
> > +			spin_unlock(&mp->m_perag_lock);
> > +			return NULL;
> > +		}
> > +		*first = pag->pag_agno + 1;
> 
> Maybe move this below, just before the return.  I.e.:
> 	if (pag)
> 		*first = pag->pag_agno + 1;
> 
> To me it's slightly clearer that we're just setting
> up to search next time with the perag following the
> one returned.

If found <=0, we didn't find anything in the tree. I don't know what
the value of pag is going to be in this case, so we have to check
the return value to determine if pag is valid or not.
The other gang lookups in XFS use the same convention...
>
> > +		/* open coded pag reference increment */
> 
> By open-coding here you miss the assertions in xfs_perag_get().
> (Though a common inline encapsulating them would have to be
> in a header because the two functions reside in different files.)

Yes, it misses them, but we do sufficient calls to xfs_perag_get()
elsewhere that I don't think it makes much difference - we'll still
get the ASSERT()s tripping, and the tracing will still point out
where the problem is...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2010-07-19  0:21 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-15  0:38 [PATCH 0/5] xfs: reclaim bug fixes Dave Chinner
2010-07-15  0:38 ` [PATCH 1/5] xfs: track AGs with reclaimable inodes in per-ag radix tree Dave Chinner
2010-07-15 18:01   ` Alex Elder
2010-07-16  5:17     ` Christoph Hellwig
2010-07-19  0:17       ` Dave Chinner
2010-07-19  0:24     ` Dave Chinner [this message]
2010-07-15  0:38 ` [PATCH 2/5] xfs: simplify and remove xfs_ireclaim Dave Chinner
2010-07-15 18:07   ` Alex Elder
2010-07-16  5:16   ` Christoph Hellwig
2010-07-15  0:38 ` [PATCH 3/5] xfs: fix xfs_trans_add_item() lockdep warnings Dave Chinner
2010-07-15 18:09   ` Alex Elder
2010-07-16  5:19   ` Christoph Hellwig
2010-07-19  0:24     ` Dave Chinner
2010-07-15  0:38 ` [PATCH 4/5] xfs: use GFP_NOFS for page cache allocation Dave Chinner
2010-07-15 18:10   ` Alex Elder
2010-07-16  5:21   ` Christoph Hellwig
2010-07-15  0:38 ` [PATCH 5/5] xfs: fix memory reclaim recursion deadlock on locked inode buffer Dave Chinner
2010-07-15 18:42   ` Alex Elder
2010-07-16  5:22   ` Christoph Hellwig
2010-07-16  5:23 ` [PATCH 0/5] xfs: reclaim bug fixes Christoph Hellwig
2010-07-19  0:30   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100719002418.GE23223@dastard \
    --to=david@fromorbit.com \
    --cc=aelder@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox