From: Glauber Costa <glommer@parallels.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
Mel Gorman <mgorman@suse.de>, Tejun Heo <tj@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
kamezawa.hiroyu@jp.fujitsu.com, Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Pekka Enberg <penberg@kernel.org>,
devel@openvz.org, Glauber Costa <glommer@parallels.com>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Suleiman Souhlal <suleiman@google.com>
Subject: [PATCH v5 17/18] slub: slub-specific propagation changes.
Date: Fri, 19 Oct 2012 18:20:41 +0400 [thread overview]
Message-ID: <1350656442-1523-18-git-send-email-glommer@parallels.com> (raw)
In-Reply-To: <1350656442-1523-1-git-send-email-glommer@parallels.com>
SLUB allows us to tune a particular cache behavior with sysfs-based
tunables. When creating a new memcg cache copy, we'd like to preserve
any tunables the parent cache already had.
This can be done by tapping into the store attribute function provided
by the allocator. We of course don't need to mess with read-only
fields. Since the attributes can have multiple types and are stored
internally by sysfs, the best strategy is to issue a ->show() in the
root cache, and then ->store() in the memcg cache.
The drawback of that, is that sysfs can allocate up to a page in
buffering for show(), that we are likely not to need, but also can't
guarantee. To avoid always allocating a page for that, we can update the
caches at store time with the maximum attribute size ever stored to the
root cache. We will then get a buffer big enough to hold it. The
corolary to this, is that if no stores happened, nothing will be
propagated.
It can also happen that a root cache has its tunables updated during
normal system operation. In this case, we will propagate the change to
all caches that are already active.
Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: Christoph Lameter <cl@linux.com>
CC: Pekka Enberg <penberg@cs.helsinki.fi>
CC: Michal Hocko <mhocko@suse.cz>
CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Johannes Weiner <hannes@cmpxchg.org>
CC: Suleiman Souhlal <suleiman@google.com>
CC: Tejun Heo <tj@kernel.org>
---
include/linux/slub_def.h | 1 +
mm/slub.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 76 insertions(+), 1 deletion(-)
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index ed330df..f41acb9 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -105,6 +105,7 @@ struct kmem_cache {
#endif
#ifdef CONFIG_MEMCG_KMEM
struct memcg_cache_params *memcg_params;
+ int max_attr_size; /* for propagation, maximum size of a stored attr */
#endif
#ifdef CONFIG_NUMA
diff --git a/mm/slub.c b/mm/slub.c
index b5b970b..41c3caf 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -203,13 +203,14 @@ enum track_item { TRACK_ALLOC, TRACK_FREE };
static int sysfs_slab_add(struct kmem_cache *);
static int sysfs_slab_alias(struct kmem_cache *, const char *);
static void sysfs_slab_remove(struct kmem_cache *);
-
+static void memcg_propagate_slab_attrs(struct kmem_cache *s);
#else
static inline int sysfs_slab_add(struct kmem_cache *s) { return 0; }
static inline int sysfs_slab_alias(struct kmem_cache *s, const char *p)
{ return 0; }
static inline void sysfs_slab_remove(struct kmem_cache *s) { }
+static inline void memcg_propagate_slab_attrs(struct kmem_cache *s) { }
#endif
static inline void stat(const struct kmem_cache *s, enum stat_item si)
@@ -3973,6 +3974,7 @@ int __kmem_cache_create(struct kmem_cache *s, unsigned long flags)
if (err)
return err;
+ memcg_propagate_slab_attrs(s);
mutex_unlock(&slab_mutex);
err = sysfs_slab_add(s);
mutex_lock(&slab_mutex);
@@ -5198,6 +5200,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
struct slab_attribute *attribute;
struct kmem_cache *s;
int err;
+ int i __maybe_unused;
attribute = to_slab_attr(attr);
s = to_slab(kobj);
@@ -5206,10 +5209,81 @@ static ssize_t slab_attr_store(struct kobject *kobj,
return -EIO;
err = attribute->store(s, buf, len);
+#ifdef CONFIG_MEMCG_KMEM
+ if (slab_state < FULL)
+ return err;
+ if ((err < 0) || !is_root_cache(s))
+ return err;
+
+ mutex_lock(&slab_mutex);
+ if (s->max_attr_size < len)
+ s->max_attr_size = len;
+
+ for_each_memcg_cache_index(i) {
+ struct kmem_cache *c = cache_from_memcg(s, i);
+ if (c)
+ /* return value determined by the parent cache only */
+ attribute->store(c, buf, len);
+ }
+ mutex_unlock(&slab_mutex);
+#endif
return err;
}
+#ifdef CONFIG_MEMCG_KMEM
+static void memcg_propagate_slab_attrs(struct kmem_cache *s)
+{
+ int i;
+ char *buffer = NULL;
+
+ if (!is_root_cache(s))
+ return;
+
+ /*
+ * This mean this cache had no attribute written. Therefore, no point
+ * in copying default values around
+ */
+ if (!s->max_attr_size)
+ return;
+
+ for (i = 0; i < ARRAY_SIZE(slab_attrs); i++) {
+ char mbuf[64];
+ char *buf;
+ struct slab_attribute *attr = to_slab_attr(slab_attrs[i]);
+
+ if (!attr || !attr->store || !attr->show)
+ continue;
+
+ /*
+ * It is really bad that we have to allocate here, so we will
+ * do it only as a fallback. If we actually allocate, though,
+ * we can just use the allocated buffer until the end.
+ *
+ * Most of the slub attributes will tend to be very small in
+ * size, but sysfs allows buffers up to a page, so they can
+ * theoretically happen.
+ */
+ if (buffer)
+ buf = buffer;
+ else if (s->max_attr_size < ARRAY_SIZE(mbuf))
+ buf = mbuf;
+ else {
+ buffer = (char *) get_zeroed_page(GFP_KERNEL);
+ if (WARN_ON(!buffer))
+ continue;
+ buf = buffer;
+ }
+
+ attr->show(s->memcg_params->root_cache, buf);
+ attr->store(s, buf, strlen(buf));
+ }
+
+ if (buffer)
+ free_page((unsigned long)buffer);
+}
+#endif
+
static const struct sysfs_ops slab_sysfs_ops = {
.show = slab_attr_show,
.store = slab_attr_store,
--
1.7.11.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-10-19 14:22 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-10-19 14:20 [PATCH v5 00/18] slab accounting for memcg Glauber Costa
2012-10-19 14:20 ` [PATCH v5 01/18] move slabinfo processing to slab_common.c Glauber Costa
2012-10-24 6:43 ` Pekka Enberg
2012-10-19 14:20 ` [PATCH v5 02/18] move print_slabinfo_header " Glauber Costa
2012-10-19 14:20 ` [PATCH v5 03/18] sl[au]b: process slabinfo_show in common code Glauber Costa
2012-10-19 14:20 ` [PATCH v5 04/18] slab: don't preemptively remove element from list in cache destroy Glauber Costa
2012-10-19 19:34 ` Christoph Lameter
2012-10-22 8:40 ` Glauber Costa
2012-10-24 6:54 ` Pekka Enberg
2012-10-24 16:19 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 05/18] slab/slub: struct memcg_params Glauber Costa
2012-10-23 17:25 ` JoonSoo Kim
2012-10-24 8:42 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 06/18] consider a memcg parameter in kmem_create_cache Glauber Costa
2012-10-23 17:50 ` JoonSoo Kim
2012-10-24 8:42 ` Glauber Costa
2012-10-25 13:42 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 07/18] Allocate memory for memcg caches whenever a new memcg appears Glauber Costa
2012-10-19 14:20 ` [PATCH v5 08/18] memcg: infrastructure to match an allocation to the right cache Glauber Costa
2012-10-24 18:10 ` JoonSoo Kim
2012-10-25 11:05 ` Glauber Costa
2012-10-25 18:06 ` Tejun Heo
2012-10-25 18:08 ` Tejun Heo
2012-10-19 14:20 ` [PATCH v5 09/18] memcg: skip memcg kmem allocations in specified code regions Glauber Costa
2012-10-19 14:20 ` [PATCH v5 10/18] sl[au]b: always get the cache from its page in kfree Glauber Costa
2012-10-19 19:44 ` Christoph Lameter
2012-10-22 10:13 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 11/18] sl[au]b: Allocate objects from memcg cache Glauber Costa
2012-10-19 19:46 ` Christoph Lameter
2012-10-29 15:14 ` JoonSoo Kim
2012-10-29 15:19 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 12/18] memcg: destroy memcg caches Glauber Costa
2012-10-19 14:20 ` [PATCH v5 13/18] memcg/sl[au]b Track all the memcg children of a kmem_cache Glauber Costa
2012-10-29 15:26 ` JoonSoo Kim
2012-10-30 11:31 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 14/18] memcg/sl[au]b: shrink dead caches Glauber Costa
2012-10-19 19:47 ` Christoph Lameter
2012-10-22 7:37 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 15/18] Aggregate memcg cache values in slabinfo Glauber Costa
2012-10-19 19:50 ` Christoph Lameter
2012-10-22 15:11 ` Glauber Costa
2012-10-19 14:20 ` [PATCH v5 16/18] slab: propagate tunables values Glauber Costa
2012-10-19 19:51 ` Christoph Lameter
2012-10-22 7:48 ` Glauber Costa
2012-10-23 20:44 ` Christoph Lameter
2012-10-19 14:20 ` Glauber Costa [this message]
2012-10-19 14:20 ` [PATCH v5 18/18] Add slab-specific documentation about the kmem controller Glauber Costa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1350656442-1523-18-git-send-email-glommer@parallels.com \
--to=glommer@parallels.com \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=cl@linux.com \
--cc=devel@openvz.org \
--cc=hannes@cmpxchg.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.cz \
--cc=penberg@cs.helsinki.fi \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=suleiman@google.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).