From: Christoph Lameter <clameter@sgi.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Mel Gorman <mel@skynet.ie>,
andi@firstfloor.org, Nick Piggin <npiggin@suse.de>,
Rik van Riel <riel@redhat.com>,
Pekka Enberg <penberg@cs.helsinki.fi>
Subject: Re: [patch 05/18] SLUB: Slab defrag core
Date: Thu, 10 Apr 2008 11:28:35 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0804101126280.12367@schroedinger.engr.sgi.com>
In-Reply-To: <Pine.LNX.4.64.0804081441350.31620@schroedinger.engr.sgi.com>
Here is a patch that gets rid of the timer and instead relies on the
fuzzy count of "objects" freed that the shrinkers return. We add those
up per node (zone reclaim) or globally (direct reclaim), and once the
sum exceeds 100 we call into defrag and reset the counter.

Do we need an additional knob to set the level at which defrag
triggers from reclaim? I just used 100.
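
To make the accounting concrete, here is a rough userspace model of what
the vmscan.c hunk in this patch does. It is only a sketch: the names
freed_counter, account_freed and DEFRAG_THRESHOLD are made up for
illustration and do not appear in the patch, and may_enter_fs merely
stands in for the (gfp_mask & __GFP_FS) test. The counter plays the role
of either zone->slab_objects_freed or the static global counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hard-coded 100 used in the patch. */
#define DEFRAG_THRESHOLD 100UL

/* Models zone->slab_objects_freed or the static global counter. */
struct freed_counter {
	unsigned long objects_freed;
};

/*
 * Add the (fuzzy) shrinker result to the counter. Returns true when the
 * caller should run a defrag pass; the counter is reset in that case.
 * The count keeps accumulating while may_enter_fs is false.
 */
static bool account_freed(struct freed_counter *c, unsigned long ret,
			  bool may_enter_fs)
{
	assert(c != NULL);

	c->objects_freed += ret;
	if (c->objects_freed > DEFRAG_THRESHOLD && may_enter_fs) {
		c->objects_freed = 0;
		return true;
	}
	return false;
}
```

Note that in this model, as in the patch, the count is not discarded when
the caller may not enter the filesystem. It stays pending, so the next
caller that is allowed to do so triggers the defrag pass right away.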
Index: linux-2.6/include/linux/mmzone.h
===================================================================
--- linux-2.6.orig/include/linux/mmzone.h 2008-04-10 11:06:52.000000000 -0700
+++ linux-2.6/include/linux/mmzone.h 2008-04-10 11:08:04.000000000 -0700
@@ -263,6 +263,7 @@
unsigned long nr_scan_active;
unsigned long nr_scan_inactive;
unsigned long pages_scanned; /* since last reclaim */
+ unsigned long slab_objects_freed; /* Since last slab defrag */
unsigned long flags; /* zone flags, see below */
/* Zone statistics */
Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h 2008-04-10 11:06:52.000000000 -0700
+++ linux-2.6/include/linux/slub_def.h 2008-04-10 11:08:04.000000000 -0700
@@ -91,7 +91,6 @@
struct kmem_cache_order_objects min;
gfp_t allocflags; /* gfp flags to use on each alloc */
int refcount; /* Refcount for slab cache destroy */
- unsigned long next_defrag;
void (*ctor)(struct kmem_cache *, void *);
/*
* Called with slab lock held and interrupts disabled.
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2008-04-10 11:06:52.000000000 -0700
+++ linux-2.6/mm/slub.c 2008-04-10 11:08:04.000000000 -0700
@@ -2985,9 +2985,6 @@
list_for_each_entry(s, &slab_caches, list) {
- if (time_before(jiffies, s->next_defrag))
- continue;
-
/*
* Defragmentable caches come first. If the slab cache is not
* defragmentable then we can stop traversing the list.
@@ -3004,11 +3001,6 @@
} else
reclaimed = __kmem_cache_shrink(s, node, MAX_PARTIAL);
- if (reclaimed)
- s->next_defrag = jiffies + HZ / 10;
- else
- s->next_defrag = jiffies + HZ;
-
slabs += reclaimed;
}
up_read(&slub_lock);
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c 2008-04-10 11:06:52.000000000 -0700
+++ linux-2.6/mm/vmscan.c 2008-04-10 11:24:02.000000000 -0700
@@ -234,8 +234,34 @@
shrinker->nr += total_scan;
}
up_read(&shrinker_rwsem);
- if (ret && (gfp_mask & __GFP_FS))
- kmem_cache_defrag(zone ? zone_to_nid(zone) : -1);
+
+ /*
+ * "ret" doesn't really contain the freed object count. The shrinkers
+ * fake it. Gotta go with what we are getting though.
+ *
+ * Handling of the freed object counter is also racy. If we get the
+ * wrong counts then we may unnecessarily do a defrag pass or defer
+ * one. "ret" is already faked. So this is just increasing
+ * the already existing fuzziness to get some notion as to when
+ * to initiate slab defrag which will hopefully be okay.
+ */
+ if (zone) {
+ /* balance_pgdat running on a zone so we only scan one node */
+ zone->slab_objects_freed += ret;
+ if (zone->slab_objects_freed > 100 && (gfp_mask & __GFP_FS)) {
+ zone->slab_objects_freed = 0;
+ kmem_cache_defrag(zone_to_nid(zone));
+ }
+ } else {
+ static unsigned long global_objects_freed = 0;
+
+ /* Direct (and thus global) reclaim. Scan all nodes */
+ global_objects_freed += ret;
+ if (global_objects_freed > 100 && (gfp_mask & __GFP_FS)) {
+ global_objects_freed = 0;
+ kmem_cache_defrag(-1);
+ }
+ }
return ret;
}