* [PATCH 1/2] memcg: fix memcg_size() calculation @ 2013-12-14 8:15 Vladimir Davydov 2013-12-14 8:15 ` [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations Vladimir Davydov 2013-12-16 16:47 ` [PATCH 1/2] memcg: fix memcg_size() calculation Michal Hocko 0 siblings, 2 replies; 7+ messages in thread From: Vladimir Davydov @ 2013-12-14 8:15 UTC (permalink / raw) To: mhocko Cc: glommer, linux-kernel, cgroups, linux-mm, devel, Glauber Costa, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki The mem_cgroup structure contains nr_node_ids pointers to mem_cgroup_per_node objects, not the objects themselves. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> --- mm/memcontrol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index bf5e894..7f1a356 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -338,7 +338,7 @@ struct mem_cgroup { static size_t memcg_size(void) { return sizeof(struct mem_cgroup) + - nr_node_ids * sizeof(struct mem_cgroup_per_node); + nr_node_ids * sizeof(struct mem_cgroup_per_node *); } /* internal only representation about the status of kmem accounting. */ -- 1.7.10.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations 2013-12-14 8:15 [PATCH 1/2] memcg: fix memcg_size() calculation Vladimir Davydov @ 2013-12-14 8:15 ` Vladimir Davydov 2013-12-14 20:13 ` [Devel] " Vladimir Davydov 2013-12-16 16:53 ` Michal Hocko 2013-12-16 16:47 ` [PATCH 1/2] memcg: fix memcg_size() calculation Michal Hocko 1 sibling, 2 replies; 7+ messages in thread From: Vladimir Davydov @ 2013-12-14 8:15 UTC (permalink / raw) To: mhocko Cc: glommer, linux-kernel, cgroups, linux-mm, devel, Glauber Costa, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki The vmalloc was introduced by patch 333279 ("memcgroup: use vmalloc for mem_cgroup allocation"), because at that time MAX_NUMNODES was used for defining the per-node array in the mem_cgroup structure so that the structure could be huge even if the system had the only NUMA node. The situation was significantly improved by patch 45cf7e ("memcg: reduce the size of struct memcg 244-fold"), which made the size of the mem_cgroup structure calculated dynamically depending on the real number of NUMA nodes installed on the system (nr_node_ids), so now there is no point in using vmalloc here: the structure is allocated rarely and on most systems its size is about 1K. Personally I'd like to remove this vmalloc, because I'm considering using wait_on_bit() on mem_cgroup::kmem_account_flags in the kmemcg shrinkers implementation, which is impossible on vmalloc'd areas. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> --- mm/memcontrol.c | 28 ++++++---------------------- 1 file changed, 6 insertions(+), 22 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 7f1a356..205eb7b 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -48,7 +48,6 @@ #include <linux/sort.h> #include <linux/fs.h> #include <linux/seq_file.h> -#include <linux/vmalloc.h> #include <linux/vmpressure.h> #include <linux/mm_inline.h> #include <linux/page_cgroup.h> @@ -335,12 +334,6 @@ struct mem_cgroup { /* WARNING: nodeinfo must be the last member here */ }; -static size_t memcg_size(void) -{ - return sizeof(struct mem_cgroup) + - nr_node_ids * sizeof(struct mem_cgroup_per_node *); -} - /* internal only representation about the status of kmem accounting. */ enum { KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */ @@ -6139,14 +6132,12 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node) static struct mem_cgroup *mem_cgroup_alloc(void) { struct mem_cgroup *memcg; - size_t size = memcg_size(); + size_t size; - /* Can be very big if nr_node_ids is very big */ - if (size < PAGE_SIZE) - memcg = kzalloc(size, GFP_KERNEL); - else - memcg = vzalloc(size); + size = sizeof(struct mem_cgroup); + size += nr_node_ids * sizeof(struct mem_cgroup_per_node *); + memcg = kzalloc(size, GFP_KERNEL); if (!memcg) return NULL; @@ -6157,10 +6148,7 @@ static struct mem_cgroup *mem_cgroup_alloc(void) return memcg; out_free: - if (size < PAGE_SIZE) - kfree(memcg); - else - vfree(memcg); + kfree(memcg); return NULL; } @@ -6178,7 +6166,6 @@ out_free: static void __mem_cgroup_free(struct mem_cgroup *memcg) { int node; - size_t size = memcg_size(); mem_cgroup_remove_from_trees(memcg); @@ -6199,10 +6186,7 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg) * the cgroup_lock. */ disarm_static_keys(memcg); - if (size < PAGE_SIZE) - kfree(memcg); - else - vfree(memcg); + kfree(memcg); } /* -- 1.7.10.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [Devel] [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations 2013-12-14 8:15 ` [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations Vladimir Davydov @ 2013-12-14 20:13 ` Vladimir Davydov 2013-12-16 16:53 ` Michal Hocko 1 sibling, 0 replies; 7+ messages in thread From: Vladimir Davydov @ 2013-12-14 20:13 UTC (permalink / raw) To: mhocko Cc: glommer, Balbir Singh, linux-kernel, Glauber Costa, linux-mm, KAMEZAWA Hiroyuki, Johannes Weiner, cgroups, devel On 12/14/2013 12:15 PM, Vladimir Davydov wrote: > The vmalloc was introduced by patch 333279 ("memcgroup: use vmalloc for > mem_cgroup allocation"), because at that time MAX_NUMNODES was used for > defining the per-node array in the mem_cgroup structure so that the > structure could be huge even if the system had the only NUMA node. > > The situation was significantly improved by patch 45cf7e ("memcg: reduce > the size of struct memcg 244-fold"), which made the size of the > mem_cgroup structure calculated dynamically depending on the real number > of NUMA nodes installed on the system (nr_node_ids), so now there is no > point in using vmalloc here: the structure is allocated rarely and on > most systems its size is about 1K. After a bit of thinking I refused to use wait_on_bit() on kmem_account_flags in the kmemcg shrinkers patchset I'm trying to submit, so if this patch is going to be committed, it is better to remove the paragraph below from the commit message :-) However, I still think this patch is reasonable, because I haven't found a place in the kernel source tree where we used vmalloc() for allocating an array of nr_node_ids pointers. On the other hand, there are plenty of places where we allocate per-node data using kmalloc. For instance, in list_lru_init() we have size_t size = sizeof(struct list_lru_node) * nr_node_ids; lru->node = kzalloc(size, GFP_KERNEL); where the list_lru_node structure is at least 4 words at size. > Personally I'd like to remove this vmalloc, because I'm considering > using wait_on_bit() on mem_cgroup::kmem_account_flags in the kmemcg > shrinkers implementation, which is impossible on vmalloc'd areas. > > Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> > Cc: Michal Hocko <mhocko@suse.cz> > Cc: Glauber Costa <glommer@openvz.org> > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Balbir Singh <bsingharora@gmail.com> > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > --- > mm/memcontrol.c | 28 ++++++---------------------- > 1 file changed, 6 insertions(+), 22 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 7f1a356..205eb7b 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -48,7 +48,6 @@ > #include <linux/sort.h> > #include <linux/fs.h> > #include <linux/seq_file.h> > -#include <linux/vmalloc.h> > #include <linux/vmpressure.h> > #include <linux/mm_inline.h> > #include <linux/page_cgroup.h> > @@ -335,12 +334,6 @@ struct mem_cgroup { > /* WARNING: nodeinfo must be the last member here */ > }; > > -static size_t memcg_size(void) > -{ > - return sizeof(struct mem_cgroup) + > - nr_node_ids * sizeof(struct mem_cgroup_per_node *); > -} > - > /* internal only representation about the status of kmem accounting. */ > enum { > KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */ > @@ -6139,14 +6132,12 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node) > static struct mem_cgroup *mem_cgroup_alloc(void) > { > struct mem_cgroup *memcg; > - size_t size = memcg_size(); > + size_t size; > > - /* Can be very big if nr_node_ids is very big */ > - if (size < PAGE_SIZE) > - memcg = kzalloc(size, GFP_KERNEL); > - else > - memcg = vzalloc(size); > + size = sizeof(struct mem_cgroup); > + size += nr_node_ids * sizeof(struct mem_cgroup_per_node *); > > + memcg = kzalloc(size, GFP_KERNEL); > if (!memcg) > return NULL; > > @@ -6157,10 +6148,7 @@ static struct mem_cgroup *mem_cgroup_alloc(void) > return memcg; > > out_free: > - if (size < PAGE_SIZE) > - kfree(memcg); > - else > - vfree(memcg); > + kfree(memcg); > return NULL; > } > > @@ -6178,7 +6166,6 @@ out_free: > static void __mem_cgroup_free(struct mem_cgroup *memcg) > { > int node; > - size_t size = memcg_size(); > > mem_cgroup_remove_from_trees(memcg); > > @@ -6199,10 +6186,7 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg) > * the cgroup_lock. > */ > disarm_static_keys(memcg); > - if (size < PAGE_SIZE) > - kfree(memcg); > - else > - vfree(memcg); > + kfree(memcg); > } > > /* -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations 2013-12-14 8:15 ` [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations Vladimir Davydov 2013-12-14 20:13 ` [Devel] " Vladimir Davydov @ 2013-12-16 16:53 ` Michal Hocko 1 sibling, 0 replies; 7+ messages in thread From: Michal Hocko @ 2013-12-16 16:53 UTC (permalink / raw) To: Vladimir Davydov Cc: glommer, linux-kernel, cgroups, linux-mm, devel, Glauber Costa, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki On Sat 14-12-13 12:15:34, Vladimir Davydov wrote: > The vmalloc was introduced by patch 333279 ("memcgroup: use vmalloc for > mem_cgroup allocation"), because at that time MAX_NUMNODES was used for > defining the per-node array in the mem_cgroup structure so that the > structure could be huge even if the system had the only NUMA node. > > The situation was significantly improved by patch 45cf7e ("memcg: reduce > the size of struct memcg 244-fold"), which made the size of the > mem_cgroup structure calculated dynamically depending on the real number > of NUMA nodes installed on the system (nr_node_ids), so now there is no > point in using vmalloc here: the structure is allocated rarely and on > most systems its size is about 1K. > > Personally I'd like to remove this vmalloc, because I'm considering > using wait_on_bit() on mem_cgroup::kmem_account_flags in the kmemcg > shrinkers implementation, which is impossible on vmalloc'd areas. > > Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> > Cc: Michal Hocko <mhocko@suse.cz> > Cc: Glauber Costa <glommer@openvz.org> > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Balbir Singh <bsingharora@gmail.com> > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> yes this makes sense to me even without wait_on_bit part. Maybe I would just merge it with 1/2. Acked-by: Michal Hocko <mhocko@suse.cz> Thanks! > --- > mm/memcontrol.c | 28 ++++++---------------------- > 1 file changed, 6 insertions(+), 22 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 7f1a356..205eb7b 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -48,7 +48,6 @@ > #include <linux/sort.h> > #include <linux/fs.h> > #include <linux/seq_file.h> > -#include <linux/vmalloc.h> > #include <linux/vmpressure.h> > #include <linux/mm_inline.h> > #include <linux/page_cgroup.h> > @@ -335,12 +334,6 @@ struct mem_cgroup { > /* WARNING: nodeinfo must be the last member here */ > }; > > -static size_t memcg_size(void) > -{ > - return sizeof(struct mem_cgroup) + > - nr_node_ids * sizeof(struct mem_cgroup_per_node *); > -} > - > /* internal only representation about the status of kmem accounting. */ > enum { > KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */ > @@ -6139,14 +6132,12 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node) > static struct mem_cgroup *mem_cgroup_alloc(void) > { > struct mem_cgroup *memcg; > - size_t size = memcg_size(); > + size_t size; > > - /* Can be very big if nr_node_ids is very big */ > - if (size < PAGE_SIZE) > - memcg = kzalloc(size, GFP_KERNEL); > - else > - memcg = vzalloc(size); > + size = sizeof(struct mem_cgroup); > + size += nr_node_ids * sizeof(struct mem_cgroup_per_node *); > > + memcg = kzalloc(size, GFP_KERNEL); > if (!memcg) > return NULL; > > @@ -6157,10 +6148,7 @@ static struct mem_cgroup *mem_cgroup_alloc(void) > return memcg; > > out_free: > - if (size < PAGE_SIZE) > - kfree(memcg); > - else > - vfree(memcg); > + kfree(memcg); > return NULL; > } > > @@ -6178,7 +6166,6 @@ out_free: > static void __mem_cgroup_free(struct mem_cgroup *memcg) > { > int node; > - size_t size = memcg_size(); > > mem_cgroup_remove_from_trees(memcg); > > @@ -6199,10 +6186,7 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg) > * the cgroup_lock. > */ > disarm_static_keys(memcg); > - if (size < PAGE_SIZE) > - kfree(memcg); > - else > - vfree(memcg); > + kfree(memcg); > } > > /* > -- > 1.7.10.4 > -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] memcg: fix memcg_size() calculation 2013-12-14 8:15 [PATCH 1/2] memcg: fix memcg_size() calculation Vladimir Davydov 2013-12-14 8:15 ` [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations Vladimir Davydov @ 2013-12-16 16:47 ` Michal Hocko 2013-12-17 7:48 ` Glauber Costa 1 sibling, 1 reply; 7+ messages in thread From: Michal Hocko @ 2013-12-16 16:47 UTC (permalink / raw) To: Vladimir Davydov Cc: glommer, linux-kernel, cgroups, linux-mm, devel, Glauber Costa, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki On Sat 14-12-13 12:15:33, Vladimir Davydov wrote: > The mem_cgroup structure contains nr_node_ids pointers to > mem_cgroup_per_node objects, not the objects themselves. Ouch! This is 2k per node which is wasted. What a shame I haven't noticed this back then when reviewing 45cf7ebd5a033 (memcg: reduce the size of struct memcg 244-fold) > Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> > Cc: Michal Hocko <mhocko@suse.cz> > Cc: Glauber Costa <glommer@openvz.org> > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Balbir Singh <bsingharora@gmail.com> > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Thanks! > --- > mm/memcontrol.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index bf5e894..7f1a356 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -338,7 +338,7 @@ struct mem_cgroup { > static size_t memcg_size(void) > { > return sizeof(struct mem_cgroup) + > - nr_node_ids * sizeof(struct mem_cgroup_per_node); > + nr_node_ids * sizeof(struct mem_cgroup_per_node *); > } > > /* internal only representation about the status of kmem accounting. */ > -- > 1.7.10.4 > -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] memcg: fix memcg_size() calculation 2013-12-16 16:47 ` [PATCH 1/2] memcg: fix memcg_size() calculation Michal Hocko @ 2013-12-17 7:48 ` Glauber Costa 2013-12-17 8:12 ` Michal Hocko 0 siblings, 1 reply; 7+ messages in thread From: Glauber Costa @ 2013-12-17 7:48 UTC (permalink / raw) To: Michal Hocko Cc: Vladimir Davydov, LKML, cgroups, linux-mm, devel, Glauber Costa, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki On Mon, Dec 16, 2013 at 8:47 PM, Michal Hocko <mhocko@suse.cz> wrote: > On Sat 14-12-13 12:15:33, Vladimir Davydov wrote: >> The mem_cgroup structure contains nr_node_ids pointers to >> mem_cgroup_per_node objects, not the objects themselves. > > Ouch! This is 2k per node which is wasted. What a shame I haven't > noticed this back then when reviewing 45cf7ebd5a033 (memcg: reduce the > size of struct memcg 244-fold) > IIRC, they weren't pointers back then. I think they were embedded in the structure, and I let them embedded. My mind may be tricking me, but I think I recall that Johannes changed them to pointers in a later time. No ? In any case, this is correct. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] memcg: fix memcg_size() calculation 2013-12-17 7:48 ` Glauber Costa @ 2013-12-17 8:12 ` Michal Hocko 0 siblings, 0 replies; 7+ messages in thread From: Michal Hocko @ 2013-12-17 8:12 UTC (permalink / raw) To: Glauber Costa Cc: Vladimir Davydov, LKML, cgroups, linux-mm, devel, Glauber Costa, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki On Tue 17-12-13 11:48:20, Glauber Costa wrote: > On Mon, Dec 16, 2013 at 8:47 PM, Michal Hocko <mhocko@suse.cz> wrote: > > On Sat 14-12-13 12:15:33, Vladimir Davydov wrote: > >> The mem_cgroup structure contains nr_node_ids pointers to > >> mem_cgroup_per_node objects, not the objects themselves. > > > > Ouch! This is 2k per node which is wasted. What a shame I haven't > > noticed this back then when reviewing 45cf7ebd5a033 (memcg: reduce the > > size of struct memcg 244-fold) > > > IIRC, they weren't pointers back then. I think they were embedded in > the structure, and I let > them embedded. > My mind may be tricking me, but I think I recall that Johannes changed > them to pointers > in a later time. No ? It was wrapped by mem_cgroup_lru_info back then but the memcg_size hasn't changed after 54f72fe022d9 (memcg: clean up memcg->nodeinfo) so the missing * was there since 45cf7ebd5a033 > In any case, this is correct. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-12-17 8:12 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-12-14 8:15 [PATCH 1/2] memcg: fix memcg_size() calculation Vladimir Davydov 2013-12-14 8:15 ` [PATCH 2/2] memcg: do not use vmalloc for mem_cgroup allocations Vladimir Davydov 2013-12-14 20:13 ` [Devel] " Vladimir Davydov 2013-12-16 16:53 ` Michal Hocko 2013-12-16 16:47 ` [PATCH 1/2] memcg: fix memcg_size() calculation Michal Hocko 2013-12-17 7:48 ` Glauber Costa 2013-12-17 8:12 ` Michal Hocko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).