* [RFC, PATCH] Make memory reclaim from inodes and dentry cache more scalable
@ 2012-05-02 22:06 Tim Chen
2012-05-08 20:00 ` Tim Chen
0 siblings, 1 reply; 2+ messages in thread
From: Tim Chen @ 2012-05-02 22:06 UTC (permalink / raw)
To: Alexander Viro, Matthew Wilcox; +Cc: linux-fsdevel, linux-kernel, Andi Kleen
The following patch detects when the inode and dentry caches are running
very low on free entries, and skips reclaiming memory from them when
doing so is futile. We only resume reclaiming memory from the inode and
dentry caches once a reasonable number of free entries has accumulated
again. This avoids bottlenecking on sb_lock for useless memory
reclamation.
I assume it is okay to check the super block's count of free objects
without holding sb_lock, since we hold the shrinker list's read lock. The
shrinker is still registered, so the super block has not yet been
deactivated, which would require shrinker un-registration. It would be
great if Al could comment on whether this assumption is okay.
In a test scenario where the page cache is putting heavy pressure on
memory with a large number of processes, we saw very heavy contention on
sb_lock while freeing pages, as seen in the following profile. The
patch reduced the runtime by almost a factor of 4.
62.81% cp [kernel.kallsyms] [k] _raw_spin_lock
|
--- _raw_spin_lock
|
|--45.19%-- grab_super_passive
| prune_super
| shrink_slab
| do_try_to_free_pages
| try_to_free_pages
| __alloc_pages_nodemask
| alloc_pages_current
Tim
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
diff --git a/fs/super.c b/fs/super.c
index 8760fe1..e91c7506 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -38,6 +38,9 @@
LIST_HEAD(super_blocks);
DEFINE_SPINLOCK(sb_lock);
+int sb_cache_himark = 100;
+int sb_cache_lowmark = 5;
+
/*
* One thing we have to be careful of with a per-sb shrinker is that we don't
* drop the last active reference to the superblock from within the shrinker.
@@ -60,6 +63,20 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
if (sc->nr_to_scan && !(sc->gfp_mask & __GFP_FS))
return -1;
+ /* Don't do useless reclaim unless we have reasonable amount
+ * of free objects to avoid sb_lock contention.
+ * Should be okay to reference sb content without sb_lock as we are
+ * holding shrinker list's read lock, which means shrinker is still
+ * registered. So sb is not yet deactivated which requires shrinker
+ * un-registration.
+ */
+ if (sb->cache_low) {
+ total_objects = sb->s_nr_dentry_unused +
+ sb->s_nr_inodes_unused + fs_objects;
+ if (total_objects < sb_cache_himark)
+ return 0;
+ }
+
if (!grab_super_passive(sb))
return !sc->nr_to_scan ? 0 : -1;
@@ -69,6 +86,9 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
total_objects = sb->s_nr_dentry_unused +
sb->s_nr_inodes_unused + fs_objects + 1;
+ if (!sb->cache_low && total_objects <= sb_cache_lowmark)
+ sb->cache_low = 1;
+
if (sc->nr_to_scan) {
int dentries;
int inodes;
@@ -96,6 +116,9 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
sb->s_nr_inodes_unused + fs_objects;
}
+ if (sb->cache_low && total_objects > sb_cache_himark)
+ sb->cache_low = 0;
+
total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
drop_super(sb);
return total_objects;
@@ -184,6 +207,7 @@ static struct super_block *alloc_super(struct file_system_type *type)
s->s_shrink.seeks = DEFAULT_SEEKS;
s->s_shrink.shrink = prune_super;
s->s_shrink.batch = 1024;
+ s->cache_low = 0;
}
out:
return s;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 386da09..c0465e3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1496,6 +1496,7 @@ struct super_block {
/* Being remounted read-only */
int s_readonly_remount;
+ int cache_low;
};
/* superblock cache pruning functions */
* Re: [RFC, PATCH] Make memory reclaim from inodes and dentry cache more scalable
2012-05-02 22:06 [RFC, PATCH] Make memory reclaim from inodes and dentry cache more scalable Tim Chen
@ 2012-05-08 20:00 ` Tim Chen
0 siblings, 0 replies; 2+ messages in thread
From: Tim Chen @ 2012-05-08 20:00 UTC (permalink / raw)
To: Alexander Viro; +Cc: Matthew Wilcox, linux-fsdevel, linux-kernel, Andi Kleen
On Wed, 2012-05-02 at 15:06 -0700, Tim Chen wrote:
> The following patch detects when the inode and dentry caches are running
> very low on free entries, and skips reclaiming memory from them when
> doing so is futile. We only resume reclaiming memory from the inode and
> dentry caches once a reasonable number of free entries has accumulated
> again. This avoids bottlenecking on sb_lock for useless memory
> reclamation.
>
> I assume it is okay to check the super block's count of free objects
> without holding sb_lock, since we hold the shrinker list's read lock. The
> shrinker is still registered, so the super block has not yet been
> deactivated, which would require shrinker un-registration. It would be
> great if Al could comment on whether this assumption is okay.
>
> In a test scenario where the page cache is putting heavy pressure on
> memory with a large number of processes, we saw very heavy contention on
> sb_lock while freeing pages, as seen in the following profile. The
> patch reduced the runtime by almost a factor of 4.
>
> 62.81% cp [kernel.kallsyms] [k] _raw_spin_lock
> |
> --- _raw_spin_lock
> |
> |--45.19%-- grab_super_passive
> | prune_super
> | shrink_slab
> | do_try_to_free_pages
> | try_to_free_pages
> | __alloc_pages_nodemask
> | alloc_pages_current
>
>
> Tim
Hi Al,
Pinging you again to see what your thoughts are on this patch, which I
sent a week ago.
Thanks.
Tim