From: Glauber Costa <glommer@parallels.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <cgroups@vger.kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <devel@openvz.org>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Greg Thelen <gthelen@google.com>,
Suleiman Souhlal <suleiman@google.com>,
Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@cs.helsinki.fi>
Subject: Re: [PATCH 11/23] slub: consider a memcg parameter in kmem_create_cache
Date: Wed, 25 Apr 2012 11:37:22 -0300 [thread overview]
Message-ID: <4F980C22.70302@parallels.com> (raw)
In-Reply-To: <4F9755B3.7010606@jp.fujitsu.com>
On 04/24/2012 10:38 PM, KAMEZAWA Hiroyuki wrote:
> (2012/04/21 6:57), Glauber Costa wrote:
>
>> Allow a memcg parameter to be passed during cache creation.
>> The slub allocator will only merge caches that belong to
>> the same memcg.
>>
>> Default function is created as a wrapper, passing NULL
>> to the memcg version. We only merge caches that belong
>> to the same memcg.
>>
>> > From the memcontrol.c side, 3 helper functions are created:
>>
>> 1) memcg_css_id: because slub needs a unique cache name
>> for sysfs. Since this is visible, but not the canonical
>> location for slab data, the cache name is not used, the
>> css_id should suffice.
>>
>> 2) mem_cgroup_register_cache: is responsible for assigning
>> a unique index to each cache, and other general purpose
>> setup. The index is only assigned for the root caches. All
>> others are assigned index == -1.
>>
>> 3) mem_cgroup_release_cache: can be called from the root cache
>> destruction, and will release the index for other caches.
>>
>> This index mechanism was developed by Suleiman Souhlal.
>>
>> Signed-off-by: Glauber Costa<glommer@parallels.com>
>> CC: Christoph Lameter<cl@linux.com>
>> CC: Pekka Enberg<penberg@cs.helsinki.fi>
>> CC: Michal Hocko<mhocko@suse.cz>
>> CC: Kamezawa Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>> CC: Johannes Weiner<hannes@cmpxchg.org>
>> CC: Suleiman Souhlal<suleiman@google.com>
>> ---
>> include/linux/memcontrol.h | 14 ++++++++++++++
>> include/linux/slab.h | 6 ++++++
>> mm/memcontrol.c | 29 +++++++++++++++++++++++++++++
>> mm/slub.c | 31 +++++++++++++++++++++++++++----
>> 4 files changed, 76 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index f94efd2..99e14b9 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -26,6 +26,7 @@ struct mem_cgroup;
>> struct page_cgroup;
>> struct page;
>> struct mm_struct;
>> +struct kmem_cache;
>>
>> /* Stats that can be updated by kernel. */
>> enum mem_cgroup_page_stat_item {
>> @@ -440,7 +441,20 @@ struct sock;
>> #ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
>> void sock_update_memcg(struct sock *sk);
>> void sock_release_memcg(struct sock *sk);
>> +int memcg_css_id(struct mem_cgroup *memcg);
>> +void mem_cgroup_register_cache(struct mem_cgroup *memcg,
>> + struct kmem_cache *s);
>> +void mem_cgroup_release_cache(struct kmem_cache *cachep);
>> #else
>> +static inline void mem_cgroup_register_cache(struct mem_cgroup *memcg,
>> + struct kmem_cache *s)
>> +{
>> +}
>> +
>> +static inline void mem_cgroup_release_cache(struct kmem_cache *cachep)
>> +{
>> +}
>> +
>> static inline void sock_update_memcg(struct sock *sk)
>> {
>> }
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index a5127e1..c7a7e05 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -321,6 +321,12 @@ extern void *__kmalloc_track_caller(size_t, gfp_t, unsigned long);
>> __kmalloc(size, flags)
>> #endif /* DEBUG_SLAB */
>>
>> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
>> +#define MAX_KMEM_CACHE_TYPES 400
>> +#else
>> +#define MAX_KMEM_CACHE_TYPES 0
>> +#endif /* CONFIG_CGROUP_MEM_RES_CTLR_KMEM */
>> +
>
>
> why 400 ?
Quite arbitrary. Just large enough to hold all caches there are
currently in a system + modules. (Right now I have around 140
in a normal fedora installation)
>> +/* Bitmap used for allocating the cache id numbers. */
>> +static DECLARE_BITMAP(cache_types, MAX_KMEM_CACHE_TYPES);
>> +
>> +void mem_cgroup_register_cache(struct mem_cgroup *memcg,
>> + struct kmem_cache *cachep)
>> +{
>> + int id = -1;
>> +
>> + cachep->memcg_params.memcg = memcg;
>> +
>> + if (!memcg) {
>> + id = find_first_zero_bit(cache_types, MAX_KMEM_CACHE_TYPES);
>> + BUG_ON(id< 0 || id>= MAX_KMEM_CACHE_TYPES);
>> + __set_bit(id, cache_types);
>
>
> No lock here ? you need find_first_zero_bit_and_set_atomic() or some.
> Rather than that, I think you can use lib/idr.c::ida_simple_get().
This function is called from within kmem_cache_create(), that usually
already do locking. The slub, for instance, uses the slub_lock() for all
cache creation, and the slab do something quite similar. (All right, I
should have mentioned that in comments)
But as for idr, I don't think it is a bad idea. I will take a look.
>> @@ -3880,7 +3881,7 @@ static int slab_unmergeable(struct kmem_cache *s)
>> return 0;
>> }
>>
>> -static struct kmem_cache *find_mergeable(size_t size,
>> +static struct kmem_cache *find_mergeable(struct mem_cgroup *memcg, size_t size,
>> size_t align, unsigned long flags, const char *name,
>> void (*ctor)(void *))
>> {
>> @@ -3916,21 +3917,29 @@ static struct kmem_cache *find_mergeable(size_t size,
>> if (s->size - size>= sizeof(void *))
>> continue;
>>
>> + if (memcg&& s->memcg_params.memcg != memcg)
>> + continue;
>> +
>> return s;
>> }
>> return NULL;
>> }
>>
>> -struct kmem_cache *kmem_cache_create(const char *name, size_t size,
>> - size_t align, unsigned long flags, void (*ctor)(void *))
>> +struct kmem_cache *
>> +kmem_cache_create_memcg(struct mem_cgroup *memcg, const char *name, size_t size,
>> + size_t align, unsigned long flags, void (*ctor)(void *))
>> {
>> struct kmem_cache *s;
>>
>> if (WARN_ON(!name))
>> return NULL;
>>
>> +#ifndef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
>> + WARN_ON(memcg != NULL);
>> +#endif
>
>
> I'm sorry what's is this warning for ?
this is inside ifndef (not defined), so this means anyone trying to pass
a memcg in that situation, is doing something really wrong.
I was actually going for BUG() on this one, but changed my mind
Thinking again, I could probably do this:
if (WARN_ON(memcg != NULL))
memcg = NULL;
this way we can keep going without killing the kernel as well as
protecting the function.
>
>> @@ -5265,6 +5283,11 @@ static char *create_unique_id(struct kmem_cache *s)
>> if (p != name + 1)
>> *p++ = '-';
>> p += sprintf(p, "%07d", s->size);
>> +
>> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
>> + if (s->memcg_params.memcg)
>> + p += sprintf(p, "-%08d", memcg_css_id(s->memcg_params.memcg));
>> +#endif
>> BUG_ON(p> name + ID_STR_LENGTH - 1);
>> return name;
>> }
>
>
> So, you use 'id' in user interface. Should we provide 'id' as memory.id file ?
We could.
But that is not the cache name, this is for alias files.
The cache name has css_id:dcache_name, so we'll see something like
2:container1
The css_id plays the role of avoiding name duplicates, since all we use
is the last dentry to derive the name.
So I guess if need arises to go search in sysfs for the slub stuff, it
gets easy enough to correlate so we don't need to export the id.
next prev parent reply other threads:[~2012-04-25 14:39 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-20 21:57 [PATCH 00/23] slab+slub accounting for memcg Glauber Costa
2012-04-20 21:57 ` [PATCH 01/23] slub: don't create a copy of the name string in kmem_cache_create Glauber Costa
2012-04-20 21:57 ` [PATCH 02/23] slub: always get the cache from its page in kfree Glauber Costa
2012-04-20 21:57 ` [PATCH 03/23] slab: rename gfpflags to allocflags Glauber Costa
2012-04-20 21:57 ` [PATCH 04/23] memcg: Make it possible to use the stock for more than one page Glauber Costa
2012-04-25 0:59 ` KAMEZAWA Hiroyuki
2012-04-20 21:57 ` [PATCH 05/23] memcg: Reclaim when more than one page needed Glauber Costa
2012-04-25 1:16 ` KAMEZAWA Hiroyuki
2012-04-20 21:57 ` [PATCH 06/23] slab: use obj_size field of struct kmem_cache when not debugging Glauber Costa
2012-04-20 21:57 ` [PATCH 07/23] change defines to an enum Glauber Costa
2012-04-25 1:18 ` KAMEZAWA Hiroyuki
2012-04-20 21:57 ` [PATCH 08/23] don't force return value checking in res_counter_charge_nofail Glauber Costa
2012-04-25 1:28 ` KAMEZAWA Hiroyuki
2012-04-20 21:57 ` [PATCH 09/23] kmem slab accounting basic infrastructure Glauber Costa
2012-04-25 1:32 ` KAMEZAWA Hiroyuki
2012-04-25 14:38 ` Glauber Costa
2012-04-26 0:08 ` KAMEZAWA Hiroyuki
2012-04-30 19:33 ` Suleiman Souhlal
2012-05-02 15:15 ` Glauber Costa
2012-04-20 21:57 ` [PATCH 10/23] slab/slub: struct memcg_params Glauber Costa
2012-04-30 19:42 ` Suleiman Souhlal
2012-04-20 21:57 ` [PATCH 11/23] slub: consider a memcg parameter in kmem_create_cache Glauber Costa
2012-04-24 14:03 ` Frederic Weisbecker
2012-04-24 14:27 ` Glauber Costa
2012-04-25 1:38 ` KAMEZAWA Hiroyuki
2012-04-25 14:37 ` Glauber Costa [this message]
2012-04-30 19:51 ` Suleiman Souhlal
2012-05-02 15:18 ` Glauber Costa
2012-04-22 23:53 ` [PATCH 12/23] slab: pass memcg parameter to kmem_cache_create Glauber Costa
2012-04-30 19:54 ` Suleiman Souhlal
2012-04-22 23:53 ` [PATCH 13/23] slub: create duplicate cache Glauber Costa
2012-04-24 14:18 ` Frederic Weisbecker
2012-04-24 14:37 ` Glauber Costa
2012-04-26 13:10 ` Frederic Weisbecker
2012-04-30 20:15 ` Suleiman Souhlal
2012-04-22 23:53 ` [PATCH 14/23] slub: provide kmalloc_no_account Glauber Costa
2012-04-22 23:53 ` [PATCH 15/23] slab: create duplicate cache Glauber Costa
2012-04-22 23:53 ` [PATCH 16/23] slab: provide kmalloc_no_account Glauber Costa
2012-04-25 1:44 ` KAMEZAWA Hiroyuki
2012-04-25 14:29 ` Glauber Costa
2012-04-26 0:13 ` KAMEZAWA Hiroyuki
2012-04-22 23:53 ` [PATCH 17/23] kmem controller charge/uncharge infrastructure Glauber Costa
2012-04-23 22:25 ` David Rientjes
2012-04-24 14:22 ` Frederic Weisbecker
2012-04-24 14:40 ` Glauber Costa
2012-04-24 20:25 ` David Rientjes
2012-04-24 21:36 ` Glauber Costa
2012-04-24 22:54 ` David Rientjes
2012-04-25 14:43 ` Glauber Costa
2012-04-24 20:21 ` David Rientjes
2012-04-27 11:38 ` Frederic Weisbecker
2012-04-27 18:13 ` David Rientjes
2012-04-25 1:56 ` KAMEZAWA Hiroyuki
2012-04-25 14:44 ` Glauber Costa
2012-04-27 12:22 ` Frederic Weisbecker
2012-04-30 20:56 ` Suleiman Souhlal
2012-05-02 15:34 ` Glauber Costa
2012-04-22 23:53 ` [PATCH 18/23] slub: charge allocation to a memcg Glauber Costa
2012-04-22 23:53 ` [PATCH 19/23] slab: per-memcg accounting of slab caches Glauber Costa
2012-04-30 21:25 ` Suleiman Souhlal
2012-05-02 15:40 ` Glauber Costa
2012-04-22 23:53 ` [PATCH 20/23] memcg: disable kmem code when not in use Glauber Costa
2012-04-22 23:53 ` [PATCH 21/23] memcg: Track all the memcg children of a kmem_cache Glauber Costa
2012-04-22 23:53 ` [PATCH 22/23] memcg: Per-memcg memory.kmem.slabinfo file Glauber Costa
2012-04-22 23:53 ` [PATCH 23/23] slub: create slabinfo file for memcg Glauber Costa
2012-04-22 23:59 ` [PATCH 00/23] slab+slub accounting " Glauber Costa
2012-04-30 9:59 ` [PATCH 0/3] A few fixes for '[PATCH 00/23] slab+slub accounting for memcg' series Anton Vorontsov
2012-04-30 10:01 ` [PATCH 1/3] slab: Proper off-slabs handling when duplicating caches Anton Vorontsov
2012-04-30 10:01 ` [PATCH 2/3] slab: Fix imbalanced rcu locking Anton Vorontsov
2012-04-30 10:02 ` [PATCH 3/3] slab: Get rid of mem_cgroup_put_kmem_cache() Anton Vorontsov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F980C22.70302@parallels.com \
--to=glommer@parallels.com \
--cc=cgroups@vger.kernel.org \
--cc=cl@linux.com \
--cc=devel@openvz.org \
--cc=fweisbec@gmail.com \
--cc=gthelen@google.com \
--cc=hannes@cmpxchg.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
--cc=penberg@cs.helsinki.fi \
--cc=suleiman@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox