public inbox for linux-xfs@vger.kernel.org
From: Ben Myers <bpm@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 10/10] xfs: don't cache inodes read through bulkstat
Date: Wed, 14 Mar 2012 15:44:01 -0500
Message-ID: <20120314204401.GN7762@sgi.com>
In-Reply-To: <1331095828-28742-11-git-send-email-david@fromorbit.com>

On Wed, Mar 07, 2012 at 03:50:28PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When we read inodes via bulkstat, we generally only read them once
> and then throw them away - they never get used again. If we retain
> them in cache, then it simply causes the working set of inodes and
> other cached items to be reclaimed just so the inode cache can grow.
> 
> Avoid this problem by marking inodes read by bulkstat as not to be
> cached and check this flag in .drop_inode to determine whether the
> inode should be added to the VFS LRU or not. If the inode lookup
> hits an already cached inode, then don't set the flag. If the inode
> lookup hits an inode marked with no cache flag, remove the flag and
> allow it to be cached once the current reference goes away.
> 
> Inodes marked as not cached will get cleaned up by the background
> inode reclaim or via memory pressure, so they will still generate
> some short term cache pressure. They will, however, be reclaimed
> much sooner and in preference to cache hot inodes.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Looks good.

Reviewed-by: Ben Myers <bpm@sgi.com>

> ---
>  fs/xfs/xfs_iget.c   |    8 ++++++--
>  fs/xfs/xfs_inode.h  |    4 +++-
>  fs/xfs/xfs_itable.c |    3 ++-
>  fs/xfs/xfs_super.c  |   17 +++++++++++++++++
>  4 files changed, 28 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> index 93fc1dc..20ddb1e 100644
> --- a/fs/xfs/xfs_iget.c
> +++ b/fs/xfs/xfs_iget.c
> @@ -290,7 +290,7 @@ xfs_iget_cache_hit(
>  	if (lock_flags != 0)
>  		xfs_ilock(ip, lock_flags);
>  
> -	xfs_iflags_clear(ip, XFS_ISTALE);
> +	xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
>  	XFS_STATS_INC(xs_ig_found);
>  
>  	return 0;
> @@ -315,6 +315,7 @@ xfs_iget_cache_miss(
>  	struct xfs_inode	*ip;
>  	int			error;
>  	xfs_agino_t		agino = XFS_INO_TO_AGINO(mp, ino);
> +	int			iflags;
>  
>  	ip = xfs_inode_alloc(mp, ino);
>  	if (!ip)
> @@ -359,8 +360,11 @@ xfs_iget_cache_miss(
>  	 * memory barrier that ensures this detection works correctly at lookup
>  	 * time.
>  	 */
> +	iflags = XFS_INEW;
> +	if (flags & XFS_IGET_DONTCACHE)
> +		iflags |= XFS_IDONTCACHE;
>  	ip->i_udquot = ip->i_gdquot = NULL;
> -	xfs_iflags_set(ip, XFS_INEW);
> +	xfs_iflags_set(ip, iflags);
>  
>  	/* insert the new inode */
>  	spin_lock(&pag->pag_ici_lock);
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index eda4937..096b887 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -374,10 +374,11 @@ xfs_set_projid(struct xfs_inode *ip,
>  #define XFS_IFLOCK		(1 << __XFS_IFLOCK_BIT)
>  #define __XFS_IPINNED_BIT	8	 /* wakeup key for zero pin count */
>  #define XFS_IPINNED		(1 << __XFS_IPINNED_BIT)
> +#define XFS_IDONTCACHE		(1 << 9) /* don't cache the inode long term */
>  
>  /*
>   * Per-lifetime flags need to be reset when re-using a reclaimable inode during
> - * inode lookup. Thi prevents unintended behaviour on the new inode from
> + * inode lookup. This prevents unintended behaviour on the new inode from
>   * ocurring.
>   */
>  #define XFS_IRECLAIM_RESET_FLAGS	\
> @@ -544,6 +545,7 @@ do { \
>   */
>  #define XFS_IGET_CREATE		0x1
>  #define XFS_IGET_UNTRUSTED	0x2
> +#define XFS_IGET_DONTCACHE	0x4
>  
>  int		xfs_inotobp(struct xfs_mount *, struct xfs_trans *,
>  			    xfs_ino_t, struct xfs_dinode **,
> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> index 751e94f..b832c58 100644
> --- a/fs/xfs/xfs_itable.c
> +++ b/fs/xfs/xfs_itable.c
> @@ -76,7 +76,8 @@ xfs_bulkstat_one_int(
>  		return XFS_ERROR(ENOMEM);
>  
>  	error = xfs_iget(mp, NULL, ino,
> -			 XFS_IGET_UNTRUSTED, XFS_ILOCK_SHARED, &ip);
> +			 (XFS_IGET_DONTCACHE | XFS_IGET_UNTRUSTED),
> +			 XFS_ILOCK_SHARED, &ip);
>  	if (error) {
>  		*stat = BULKSTAT_RV_NOTHING;
>  		goto out_free;
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index b1df512..c162765 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -953,6 +953,22 @@ xfs_fs_evict_inode(
>  	xfs_inactive(ip);
>  }
>  
> +/*
> + * We do an unlocked check for XFS_IDONTCACHE here because we are already
> + * serialised against cache hits here via the inode->i_lock and igrab() in
> + * xfs_iget_cache_hit(). Hence a lookup that might clear this flag will not be
> + * racing with us, and it avoids needing to grab a spinlock here for every inode
> + * we drop the final reference on.
> + */

I'll try to put this in my own words, just in case it is mystifying for
anyone else.  ;)

In this case it is ok to check ip->i_flags without holding
ip->i_flags_lock because we have exclusion from xfs_iget_cache_hit
as follows:

The 'dropper' would have taken inode->i_lock when the inode's count went
to zero, and if the XFS_IDONTCACHE flag is set, dropper will return 1 to
iput_final which will result in iput_final skipping the inode lru and
setting I_FREEING immediately, before dropping inode->i_lock and evicting
the inode.

A 'cache hitter' must call igrab in order to get a reference on the
inode.  igrab takes the inode->i_lock, and if I_FREEING is set, it
returns NULL, then xfs_iget_cache_hit returns EAGAIN, and is restarted.

So... any 'cache hitter' who could possibly clear the XFS_IDONTCACHE
flag subsequent to 'dropper' checking it would always be unable to get a
reference due to I_FREEING having been set by the dropper.
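
For anyone who prefers to see the argument as code, here is a rough
userspace sketch of the exclusion described above. It is only a model,
not the kernel implementation: the model_* names and the flag values are
made up for illustration, pthread mutexes stand in for the spinlocks, and
the reference count is decremented under i_lock for simplicity rather
than atomically as the VFS does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flags; the values are illustrative. */
#define MODEL_I_FREEING		0x1u	/* models I_FREEING in inode->i_state */
#define MODEL_IDONTCACHE	0x2u	/* models XFS_IDONTCACHE in ip->i_flags */

struct model_inode {
	pthread_mutex_t	i_lock;		/* models inode->i_lock */
	pthread_mutex_t	i_flags_lock;	/* models ip->i_flags_lock */
	unsigned int	i_state;	/* protected by i_lock */
	unsigned int	i_flags;	/* writers hold i_flags_lock */
	int		i_count;	/* protected by i_lock in this model */
	bool		on_lru;		/* true once parked on the "LRU" */
};

/* Set up an inode with one reference held and the given i_flags. */
void model_init(struct model_inode *in, unsigned int flags)
{
	pthread_mutex_init(&in->i_lock, NULL);
	pthread_mutex_init(&in->i_flags_lock, NULL);
	in->i_state = 0;
	in->i_flags = flags;
	in->i_count = 1;
	in->on_lru = false;
}

/*
 * The 'dropper'. Returns true if the inode goes straight to eviction.
 * i_flags is read without i_flags_lock: we only look at it once the count
 * has reached zero under i_lock, and nobody can clear the flag without
 * first taking a reference through model_cache_hit().
 */
bool model_iput(struct model_inode *in)
{
	bool evict = false;

	pthread_mutex_lock(&in->i_lock);
	assert(in->i_count > 0);
	if (--in->i_count == 0) {
		if (in->i_flags & MODEL_IDONTCACHE) {
			in->i_state |= MODEL_I_FREEING;	/* skip the LRU */
			evict = true;
		} else {
			in->on_lru = true;
		}
	}
	pthread_mutex_unlock(&in->i_lock);
	return evict;
}

/*
 * The 'cache hitter'. Returns false where the real lookup would see igrab
 * fail and retry with EAGAIN. The flag is cleared only after a reference
 * has been obtained.
 */
bool model_cache_hit(struct model_inode *in)
{
	bool got_ref = false;

	pthread_mutex_lock(&in->i_lock);
	if (!(in->i_state & MODEL_I_FREEING)) {
		in->i_count++;
		in->on_lru = false;
		got_ref = true;
	}
	pthread_mutex_unlock(&in->i_lock);
	if (!got_ref)
		return false;

	pthread_mutex_lock(&in->i_flags_lock);
	in->i_flags &= ~MODEL_IDONTCACHE;
	pthread_mutex_unlock(&in->i_flags_lock);
	return true;
}
```

In this model, dropping the last reference on an inode that still has
MODEL_IDONTCACHE set marks it freeing, so a later model_cache_hit() fails
and the flag is left alone. If the cache hit comes first, the flag is
cleared while a reference is held and the final model_iput() parks the
inode on the LRU instead.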

I appreciate that you added the comment.

Regards,
	Ben

> +STATIC int
> +xfs_fs_drop_inode(
> +	struct inode		*inode)
> +{
> +	struct xfs_inode	*ip = XFS_I(inode);
> +
> +	return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
> +}
> +
>  STATIC void
>  xfs_free_fsname(
>  	struct xfs_mount	*mp)
> @@ -1431,6 +1447,7 @@ static const struct super_operations xfs_super_operations = {
>  	.dirty_inode		= xfs_fs_dirty_inode,
>  	.write_inode		= xfs_fs_write_inode,
>  	.evict_inode		= xfs_fs_evict_inode,
> +	.drop_inode		= xfs_fs_drop_inode,
>  	.put_super		= xfs_fs_put_super,
>  	.sync_fs		= xfs_fs_sync_fs,
>  	.freeze_fs		= xfs_fs_freeze,
> -- 
> 1.7.9
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

