* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 6:46 [PATCH v2] memcg: reduce the size of struct memcg 244-fold Glauber Costa
@ 2013-01-24 7:50 ` Greg Thelen
2013-01-24 7:52 ` Glauber Costa
2013-01-24 23:51 ` Andrew Morton
2013-01-24 10:14 ` Michal Hocko
` (2 subsequent siblings)
3 siblings, 2 replies; 12+ messages in thread
From: Greg Thelen @ 2013-01-24 7:50 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Wed, Jan 23 2013, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain per-zone descriptors.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware helps us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still stay outside the vmalloc area.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Greg Thelen <gthelen@google.com>
> ---
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> };
>
> struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> + struct mem_cgroup_per_node *nodeinfo[0];
It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
nid index is less than nr_node_ids would be good at catching illegal
indexes. I don't see any illegal indexes in your patch, but I fear that
someday a MAX_NUMNODES based for() loop might sneak in.
> };
>
> /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
> */
> struct res_counter kmem;
> /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - */
> - struct mem_cgroup_lru_info info;
> - int last_scanned_node;
> -#if MAX_NUMNODES > 1
> - nodemask_t scan_nodes;
> - atomic_t numainfo_events;
> - atomic_t numainfo_updating;
> -#endif
> - /*
> * Should the accounting and control be hierarchical, per subtree?
> */
> bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
> };
>
> +static inline size_t memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> +
Tangential question: why use inline here? I figure that modern
compilers are good at making inlining decisions.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 7:50 ` Greg Thelen
@ 2013-01-24 7:52 ` Glauber Costa
2013-01-24 23:51 ` Andrew Morton
1 sibling, 0 replies; 12+ messages in thread
From: Glauber Costa @ 2013-01-24 7:52 UTC (permalink / raw)
To: Greg Thelen
Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
>> +static inline size_t memcg_size(void)
>> +{
>> + return sizeof(struct mem_cgroup) +
>> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
>> +}
>> +
>
> Tangential question: why use inline here? I figure that modern
> compilers are good at making inlining decisions.
I was born last century.
No reason, really. Just habit.
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 7:50 ` Greg Thelen
2013-01-24 7:52 ` Glauber Costa
@ 2013-01-24 23:51 ` Andrew Morton
2013-01-25 7:37 ` Lord Glauber Costa of Sealand
` (2 more replies)
1 sibling, 3 replies; 12+ messages in thread
From: Andrew Morton @ 2013-01-24 23:51 UTC (permalink / raw)
To: Greg Thelen
Cc: Glauber Costa, linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Wed, 23 Jan 2013 23:50:31 -0800
Greg Thelen <gthelen@google.com> wrote:
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> > };
> >
> > struct mem_cgroup_lru_info {
> > - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> > + struct mem_cgroup_per_node *nodeinfo[0];
>
> It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
> nid index is less than nr_node_ids would be good at catching illegal
> indexes. I don't see any illegal indexes in your patch, but I fear that
> someday a MAX_NUMNODES based for() loop might sneak in.
Can't hurt ;)
> Tangential question: why use inline here? I figure that modern
> compilers are good at making inlining decisions.
And that'll save some disk space.
This?
--- a/mm/memcontrol.c~memcg-reduce-the-size-of-struct-memcg-244-fold-fix
+++ a/mm/memcontrol.c
@@ -381,7 +381,7 @@ enum {
((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
#ifdef CONFIG_MEMCG_KMEM
-static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
+static void memcg_kmem_set_active(struct mem_cgroup *memcg)
{
set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
}
@@ -645,6 +645,7 @@ static void drain_all_stock_async(struct
static struct mem_cgroup_per_zone *
mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
{
+ VM_BUG_ON((unsigned)nid >= nr_node_ids);
return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
}
_
Glauber, could you please cc me on patches more often? It's a bit of a
pita having to go back to the mailing list to see if there has been
more discussion, and I may end up missing late review comments and acks.
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 23:51 ` Andrew Morton
@ 2013-01-25 7:37 ` Lord Glauber Costa of Sealand
2013-01-25 17:14 ` Greg Thelen
2013-02-05 18:37 ` Johannes Weiner
2 siblings, 0 replies; 12+ messages in thread
From: Lord Glauber Costa of Sealand @ 2013-01-25 7:37 UTC (permalink / raw)
To: Andrew Morton
Cc: Greg Thelen, linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
> #ifdef CONFIG_MEMCG_KMEM
> -static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
> +static void memcg_kmem_set_active(struct mem_cgroup *memcg)
> {
> set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
> }
> @@ -645,6 +645,7 @@ static void drain_all_stock_async(struct
> static struct mem_cgroup_per_zone *
> mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
> {
> + VM_BUG_ON((unsigned)nid >= nr_node_ids);
> return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
> }
>
> _
>
> Glauber, could you please cc me on patches more often? It's a bit of a
> pita having to go back to the mailing list to see if there has been
> more discussion, and I may end up missing late review comments and acks.
>
Sure, absolutely.
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 23:51 ` Andrew Morton
2013-01-25 7:37 ` Lord Glauber Costa of Sealand
@ 2013-01-25 17:14 ` Greg Thelen
2013-02-05 18:37 ` Johannes Weiner
2 siblings, 0 replies; 12+ messages in thread
From: Greg Thelen @ 2013-01-25 17:14 UTC (permalink / raw)
To: Andrew Morton
Cc: Glauber Costa, linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Thu, Jan 24 2013, Andrew Morton wrote:
> On Wed, 23 Jan 2013 23:50:31 -0800
> Greg Thelen <gthelen@google.com> wrote:
>
>> > --- a/mm/memcontrol.c
>> > +++ b/mm/memcontrol.c
>> > @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
>> > };
>> >
>> > struct mem_cgroup_lru_info {
>> > - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
>> > + struct mem_cgroup_per_node *nodeinfo[0];
>>
>> It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
>> nid index is less than nr_node_ids would be good at catching illegal
>> indexes. I don't see any illegal indexes in your patch, but I fear that
>> someday a MAX_NUMNODES based for() loop might sneak in.
>
> Can't hurt ;)
>
>> Tangential question: why use inline here? I figure that modern
>> compilers are good at making inlining decisions.
>
> And that'll save some disk space.
>
> This?
Yup, that looks good to me.
Acked-by: Greg Thelen <gthelen@google.com>
> --- a/mm/memcontrol.c~memcg-reduce-the-size-of-struct-memcg-244-fold-fix
> +++ a/mm/memcontrol.c
> @@ -381,7 +381,7 @@ enum {
> ((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
>
> #ifdef CONFIG_MEMCG_KMEM
> -static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
> +static void memcg_kmem_set_active(struct mem_cgroup *memcg)
> {
> set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
> }
> @@ -645,6 +645,7 @@ static void drain_all_stock_async(struct
> static struct mem_cgroup_per_zone *
> mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
> {
> + VM_BUG_ON((unsigned)nid >= nr_node_ids);
> return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
> }
>
> _
>
> Glauber, could you please cc me on patches more often? It's a bit of a
> pita having to go back to the mailing list to see if there has been
> more discussion, and I may end up missing late review comments and acks.
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 23:51 ` Andrew Morton
2013-01-25 7:37 ` Lord Glauber Costa of Sealand
2013-01-25 17:14 ` Greg Thelen
@ 2013-02-05 18:37 ` Johannes Weiner
2 siblings, 0 replies; 12+ messages in thread
From: Johannes Weiner @ 2013-02-05 18:37 UTC (permalink / raw)
To: Andrew Morton
Cc: Greg Thelen, Glauber Costa, linux-mm, cgroups, Michal Hocko,
Kamezawa Hiroyuki, Hugh Dickins, Ying Han, Mel Gorman,
Rik van Riel
On Thu, Jan 24, 2013 at 03:51:05PM -0800, Andrew Morton wrote:
> On Wed, 23 Jan 2013 23:50:31 -0800
> Greg Thelen <gthelen@google.com> wrote:
>
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> > > };
> > >
> > > struct mem_cgroup_lru_info {
> > > - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> > > + struct mem_cgroup_per_node *nodeinfo[0];
> >
> > It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
> > nid index is less than nr_node_ids would be good at catching illegal
> > indexes. I don't see any illegal indexes in your patch, but I fear that
> > someday a MAX_NUMNODES based for() loop might sneak in.
>
> Can't hurt ;)
>
> > Tangential question: why use inline here? I figure that modern
> > compilers are good at making inlining decisions.
>
> And that'll save some disk space.
>
> This?
>
> --- a/mm/memcontrol.c~memcg-reduce-the-size-of-struct-memcg-244-fold-fix
> +++ a/mm/memcontrol.c
> @@ -381,7 +381,7 @@ enum {
> ((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
>
> #ifdef CONFIG_MEMCG_KMEM
> -static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
> +static void memcg_kmem_set_active(struct mem_cgroup *memcg)
> {
> set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
> }
I don't disapprove, but it's the wrong function for this patch.
Should be memcg_size().
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 6:46 [PATCH v2] memcg: reduce the size of struct memcg 244-fold Glauber Costa
2013-01-24 7:50 ` Greg Thelen
@ 2013-01-24 10:14 ` Michal Hocko
2013-01-29 0:08 ` Kamezawa Hiroyuki
2013-02-05 18:53 ` Johannes Weiner
3 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2013-01-24 10:14 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Kamezawa Hiroyuki, Johannes Weiner,
Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Thu 24-01-13 10:46:35, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain per-zone descriptors.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware helps us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still stay outside the vmalloc area.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
The patch seems fine, and I guess other things would need to be revisited
as well if nr_node_ids could change after NUMA initialization.
Acked-by: Michal Hocko <mhocko@suse.cz>
> ---
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> };
>
> struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> + struct mem_cgroup_per_node *nodeinfo[0];
> };
>
> /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
> */
> struct res_counter kmem;
> /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - */
> - struct mem_cgroup_lru_info info;
> - int last_scanned_node;
> -#if MAX_NUMNODES > 1
> - nodemask_t scan_nodes;
> - atomic_t numainfo_events;
> - atomic_t numainfo_updating;
> -#endif
> - /*
> * Should the accounting and control be hierarchical, per subtree?
> */
> bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
> };
>
> +static inline size_t memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> +
> /* internal only representation about the status of kmem accounting. */
> enum {
> KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
> @@ -5894,9 +5904,9 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> static struct mem_cgroup *mem_cgroup_alloc(void)
> {
> struct mem_cgroup *memcg;
> - int size = sizeof(struct mem_cgroup);
> + size_t size = memcg_size();
>
> - /* Can be very big if MAX_NUMNODES is very big */
> + /* Can be very big if nr_node_ids is very big */
> if (size < PAGE_SIZE)
> memcg = kzalloc(size, GFP_KERNEL);
> else
> @@ -5933,7 +5943,7 @@ out_free:
> static void __mem_cgroup_free(struct mem_cgroup *memcg)
> {
> int node;
> - int size = sizeof(struct mem_cgroup);
> + size_t size = memcg_size();
>
> mem_cgroup_remove_from_trees(memcg);
> free_css_id(&mem_cgroup_subsys, &memcg->css);
> --
> 1.8.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Michal Hocko
SUSE Labs
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 6:46 [PATCH v2] memcg: reduce the size of struct memcg 244-fold Glauber Costa
2013-01-24 7:50 ` Greg Thelen
2013-01-24 10:14 ` Michal Hocko
@ 2013-01-29 0:08 ` Kamezawa Hiroyuki
2013-02-05 18:53 ` Johannes Weiner
3 siblings, 0 replies; 12+ messages in thread
From: Kamezawa Hiroyuki @ 2013-01-29 0:08 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Michal Hocko, Johannes Weiner, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
(2013/01/24 15:46), Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain per-zone descriptors.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware helps us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still stay outside the vmalloc area.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
> ---
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> };
>
> struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> + struct mem_cgroup_per_node *nodeinfo[0];
> };
>
> /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
> */
> struct res_counter kmem;
> /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - */
> - struct mem_cgroup_lru_info info;
> - int last_scanned_node;
> -#if MAX_NUMNODES > 1
> - nodemask_t scan_nodes;
> - atomic_t numainfo_events;
> - atomic_t numainfo_updating;
> -#endif
> - /*
> * Should the accounting and control be hierarchical, per subtree?
> */
> bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
> };
>
> +static inline size_t memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
>
ok, nr_node_ids is made from possible_node_map.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 6:46 [PATCH v2] memcg: reduce the size of struct memcg 244-fold Glauber Costa
` (2 preceding siblings ...)
2013-01-29 0:08 ` Kamezawa Hiroyuki
@ 2013-02-05 18:53 ` Johannes Weiner
2013-02-05 19:04 ` Michal Hocko
3 siblings, 1 reply; 12+ messages in thread
From: Johannes Weiner @ 2013-02-05 18:53 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Thu, Jan 24, 2013 at 10:46:35AM +0400, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain per-zone descriptors.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware helps us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still stay outside the vmalloc area.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Nitpick:
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
I can see myself ignoring comments pertaining to previous members when
adding to a struct. The indirection through mem_cgroup_lru_info can
probably be dropped anyway, and it moves the [0] to a place where it
helps document the struct mem_cgroup layout. What do you think about
the following:
---
Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
and is actively misleading because there is all kinds of per-node
stuff in addition to the LRU info in there. On that note, remove the
incorrect comment as well.
Move comment about the nodeinfo[0] array having to be the last field
in struct mem_cgroup after said array. Should be more visible when
attempting to append new members to the struct.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2382fe9..29cb9e9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
};
-struct mem_cgroup_lru_info {
- struct mem_cgroup_per_node *nodeinfo[0];
-};
-
/*
* Cgroups above their limits are maintained in a RB-Tree, independent of
* their hierarchy representation
@@ -370,14 +366,8 @@ struct mem_cgroup {
atomic_t numainfo_events;
atomic_t numainfo_updating;
#endif
- /*
- * Per cgroup active and inactive list, similar to the
- * per zone LRU lists.
- *
- * WARNING: This has to be the last element of the struct. Don't
- * add new fields after this point.
- */
- struct mem_cgroup_lru_info info;
+ struct mem_cgroup_per_node *nodeinfo[0];
+ /* WARNING: nodeinfo has to be the last member in here */
};
static inline size_t memcg_size(void)
@@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
{
VM_BUG_ON((unsigned)nid >= nr_node_ids);
- return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
+ return &memcg->nodeinfo[nid]->zoneinfo[zid];
}
struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
@@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
mz->on_tree = false;
mz->memcg = memcg;
}
- memcg->info.nodeinfo[node] = pn;
+ memcg->nodeinfo[node] = pn;
return 0;
}
static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
{
- kfree(memcg->info.nodeinfo[node]);
+ kfree(memcg->nodeinfo[node]);
}
static struct mem_cgroup *mem_cgroup_alloc(void)
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-02-05 18:53 ` Johannes Weiner
@ 2013-02-05 19:04 ` Michal Hocko
2013-02-05 19:06 ` Glauber Costa
0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2013-02-05 19:04 UTC (permalink / raw)
To: Johannes Weiner
Cc: Glauber Costa, linux-mm, cgroups, Kamezawa Hiroyuki, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
[...]
> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
>
> Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
> and is actively misleading because there is all kinds of per-node
> stuff in addition to the LRU info in there. On that note, remove the
> incorrect comment as well.
>
> Move comment about the nodeinfo[0] array having to be the last field
> in struct mem_cgroup after said array. Should be more visible when
> attempting to append new members to the struct.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Yes, I like this. The info level is just artificial and without any
good reason.
Acked-by: Michal Hocko <mhocko@suse.cz>
Thanks
> ---
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2382fe9..29cb9e9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
> struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
> };
>
> -struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[0];
> -};
> -
> /*
> * Cgroups above their limits are maintained in a RB-Tree, independent of
> * their hierarchy representation
> @@ -370,14 +366,8 @@ struct mem_cgroup {
> atomic_t numainfo_events;
> atomic_t numainfo_updating;
> #endif
> - /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - *
> - * WARNING: This has to be the last element of the struct. Don't
> - * add new fields after this point.
> - */
> - struct mem_cgroup_lru_info info;
> + struct mem_cgroup_per_node *nodeinfo[0];
> + /* WARNING: nodeinfo has to be the last member in here */
> };
>
> static inline size_t memcg_size(void)
> @@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
> mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
> {
> VM_BUG_ON((unsigned)nid >= nr_node_ids);
> - return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
> + return &memcg->nodeinfo[nid]->zoneinfo[zid];
> }
>
> struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
> @@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> mz->on_tree = false;
> mz->memcg = memcg;
> }
> - memcg->info.nodeinfo[node] = pn;
> + memcg->nodeinfo[node] = pn;
> return 0;
> }
>
> static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> {
> - kfree(memcg->info.nodeinfo[node]);
> + kfree(memcg->nodeinfo[node]);
> }
>
> static struct mem_cgroup *mem_cgroup_alloc(void)
>
--
Michal Hocko
SUSE Labs
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-02-05 19:04 ` Michal Hocko
@ 2013-02-05 19:06 ` Glauber Costa
0 siblings, 0 replies; 12+ messages in thread
From: Glauber Costa @ 2013-02-05 19:06 UTC (permalink / raw)
To: Michal Hocko
Cc: Johannes Weiner, linux-mm, cgroups, Kamezawa Hiroyuki,
Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On 02/05/2013 11:04 PM, Michal Hocko wrote:
> On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
> [...]
>> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
>>
>> Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
>> and is actively misleading because there is all kinds of per-node
>> stuff in addition to the LRU info in there. On that note, remove the
>> incorrect comment as well.
>>
>> Move comment about the nodeinfo[0] array having to be the last field
>> in struct mem_cgroup after said array. Should be more visible when
>> attempting to append new members to the struct.
>>
>> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>
> Yes, I like this. The info level is just artificial and without any
> good reason.
>
> Acked-by: Michal Hocko <mhocko@suse.cz>
>
Acked-by: Glauber Costa <glommer@parallels.com>