* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 6:46 ` Glauber Costa
@ 2013-01-24 10:14 ` Michal Hocko
-1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2013-01-24 10:14 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
Kamezawa Hiroyuki, Johannes Weiner, Greg Thelen, Hugh Dickins,
Ying Han, Mel Gorman, Rik van Riel
On Thu 24-01-13 10:46:35, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Ying Han <yinghan-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>
> Cc: Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
The patch seem fine and I guess other things would need to be revisited
as well if nr_node_ids could change after NUMA initialization code.
Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> ---
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> };
>
> struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> + struct mem_cgroup_per_node *nodeinfo[0];
> };
>
> /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
> */
> struct res_counter kmem;
> /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - */
> - struct mem_cgroup_lru_info info;
> - int last_scanned_node;
> -#if MAX_NUMNODES > 1
> - nodemask_t scan_nodes;
> - atomic_t numainfo_events;
> - atomic_t numainfo_updating;
> -#endif
> - /*
> * Should the accounting and control be hierarchical, per subtree?
> */
> bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
> };
>
> +static inline size_t memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> +
> /* internal only representation about the status of kmem accounting. */
> enum {
> KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
> @@ -5894,9 +5904,9 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> static struct mem_cgroup *mem_cgroup_alloc(void)
> {
> struct mem_cgroup *memcg;
> - int size = sizeof(struct mem_cgroup);
> + size_t size = memcg_size();
>
> - /* Can be very big if MAX_NUMNODES is very big */
> + /* Can be very big if nr_node_ids is very big */
> if (size < PAGE_SIZE)
> memcg = kzalloc(size, GFP_KERNEL);
> else
> @@ -5933,7 +5943,7 @@ out_free:
> static void __mem_cgroup_free(struct mem_cgroup *memcg)
> {
> int node;
> - int size = sizeof(struct mem_cgroup);
> + size_t size = memcg_size();
>
> mem_cgroup_remove_from_trees(memcg);
> free_css_id(&mem_cgroup_subsys, &memcg->css);
> --
> 1.8.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-24 10:14 ` Michal Hocko
0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2013-01-24 10:14 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Kamezawa Hiroyuki, Johannes Weiner,
Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Thu 24-01-13 10:46:35, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
The patch seem fine and I guess other things would need to be revisited
as well if nr_node_ids could change after NUMA initialization code.
Acked-by: Michal Hocko <mhocko@suse.cz>
> ---
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> };
>
> struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> + struct mem_cgroup_per_node *nodeinfo[0];
> };
>
> /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
> */
> struct res_counter kmem;
> /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - */
> - struct mem_cgroup_lru_info info;
> - int last_scanned_node;
> -#if MAX_NUMNODES > 1
> - nodemask_t scan_nodes;
> - atomic_t numainfo_events;
> - atomic_t numainfo_updating;
> -#endif
> - /*
> * Should the accounting and control be hierarchical, per subtree?
> */
> bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
> };
>
> +static inline size_t memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> +
> /* internal only representation about the status of kmem accounting. */
> enum {
> KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
> @@ -5894,9 +5904,9 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> static struct mem_cgroup *mem_cgroup_alloc(void)
> {
> struct mem_cgroup *memcg;
> - int size = sizeof(struct mem_cgroup);
> + size_t size = memcg_size();
>
> - /* Can be very big if MAX_NUMNODES is very big */
> + /* Can be very big if nr_node_ids is very big */
> if (size < PAGE_SIZE)
> memcg = kzalloc(size, GFP_KERNEL);
> else
> @@ -5933,7 +5943,7 @@ out_free:
> static void __mem_cgroup_free(struct mem_cgroup *memcg)
> {
> int node;
> - int size = sizeof(struct mem_cgroup);
> + size_t size = memcg_size();
>
> mem_cgroup_remove_from_trees(memcg);
> free_css_id(&mem_cgroup_subsys, &memcg->css);
> --
> 1.8.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 6:46 ` Glauber Costa
@ 2013-01-29 0:08 ` Kamezawa Hiroyuki
-1 siblings, 0 replies; 20+ messages in thread
From: Kamezawa Hiroyuki @ 2013-01-29 0:08 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
Michal Hocko, Johannes Weiner, Greg Thelen, Hugh Dickins,
Ying Han, Mel Gorman, Rik van Riel
(2013/01/24 15:46), Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Ying Han <yinghan-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>
> Cc: Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> };
>
> struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> + struct mem_cgroup_per_node *nodeinfo[0];
> };
>
> /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
> */
> struct res_counter kmem;
> /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - */
> - struct mem_cgroup_lru_info info;
> - int last_scanned_node;
> -#if MAX_NUMNODES > 1
> - nodemask_t scan_nodes;
> - atomic_t numainfo_events;
> - atomic_t numainfo_updating;
> -#endif
> - /*
> * Should the accounting and control be hierarchical, per subtree?
> */
> bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
> };
>
> +static inline size_t memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
>
ok, nr_node_ids is made from possible_node_map.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-29 0:08 ` Kamezawa Hiroyuki
0 siblings, 0 replies; 20+ messages in thread
From: Kamezawa Hiroyuki @ 2013-01-29 0:08 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Michal Hocko, Johannes Weiner, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
(2013/01/24 15:46), Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
> ---
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> };
>
> struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> + struct mem_cgroup_per_node *nodeinfo[0];
> };
>
> /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
> */
> struct res_counter kmem;
> /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - */
> - struct mem_cgroup_lru_info info;
> - int last_scanned_node;
> -#if MAX_NUMNODES > 1
> - nodemask_t scan_nodes;
> - atomic_t numainfo_events;
> - atomic_t numainfo_updating;
> -#endif
> - /*
> * Should the accounting and control be hierarchical, per subtree?
> */
> bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
> };
>
> +static inline size_t memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
>
ok, nr_node_ids is made from possible_node_map.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 6:46 ` Glauber Costa
@ 2013-02-05 18:53 ` Johannes Weiner
-1 siblings, 0 replies; 20+ messages in thread
From: Johannes Weiner @ 2013-02-05 18:53 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
Michal Hocko, Kamezawa Hiroyuki, Greg Thelen, Hugh Dickins,
Ying Han, Mel Gorman, Rik van Riel
On Thu, Jan 24, 2013 at 10:46:35AM +0400, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Ying Han <yinghan-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>
> Cc: Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
Nitpick:
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
I can see myself ignoring comments pertaining to previous members when
adding to a struct. The indirection through mem_cgroup_lru_info can
probably be dropped anyway, and it moves the [0] in a place where it
helps document the struct mem_cgroup layout. What do you think about
the following:
---
Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
and is actively misleading because there is all kinds of per-node
stuff in addition to the LRU info in there. On that note, remove the
incorrect comment as well.
Move comment about the nodeinfo[0] array having to be the last field
in struct mem_cgroup after said array. Should be more visible when
attempting to append new members to the struct.
Signed-off-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2382fe9..29cb9e9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
};
-struct mem_cgroup_lru_info {
- struct mem_cgroup_per_node *nodeinfo[0];
-};
-
/*
* Cgroups above their limits are maintained in a RB-Tree, independent of
* their hierarchy representation
@@ -370,14 +366,8 @@ struct mem_cgroup {
atomic_t numainfo_events;
atomic_t numainfo_updating;
#endif
- /*
- * Per cgroup active and inactive list, similar to the
- * per zone LRU lists.
- *
- * WARNING: This has to be the last element of the struct. Don't
- * add new fields after this point.
- */
- struct mem_cgroup_lru_info info;
+ struct mem_cgroup_per_node *nodeinfo[0];
+ /* WARNING: nodeinfo has to be the last member in here */
};
static inline size_t memcg_size(void)
@@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
{
VM_BUG_ON((unsigned)nid >= nr_node_ids);
- return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
+ return &memcg->nodeinfo[nid]->zoneinfo[zid];
}
struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
@@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
mz->on_tree = false;
mz->memcg = memcg;
}
- memcg->info.nodeinfo[node] = pn;
+ memcg->nodeinfo[node] = pn;
return 0;
}
static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
{
- kfree(memcg->info.nodeinfo[node]);
+ kfree(memcg->nodeinfo[node]);
}
static struct mem_cgroup *mem_cgroup_alloc(void)
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-02-05 18:53 ` Johannes Weiner
0 siblings, 0 replies; 20+ messages in thread
From: Johannes Weiner @ 2013-02-05 18:53 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Thu, Jan 24, 2013 at 10:46:35AM +0400, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Nitpick:
> @@ -349,8 +338,29 @@ struct mem_cgroup {
> /* Index in the kmem_cache->memcg_params->memcg_caches array */
> int kmemcg_id;
> #endif
> +
> + int last_scanned_node;
> +#if MAX_NUMNODES > 1
> + nodemask_t scan_nodes;
> + atomic_t numainfo_events;
> + atomic_t numainfo_updating;
> +#endif
> + /*
> + * Per cgroup active and inactive list, similar to the
> + * per zone LRU lists.
> + *
> + * WARNING: This has to be the last element of the struct. Don't
> + * add new fields after this point.
> + */
> + struct mem_cgroup_lru_info info;
I can see myself ignoring comments pertaining to previous members when
adding to a struct. The indirection through mem_cgroup_lru_info can
probably be dropped anyway, and it moves the [0] in a place where it
helps document the struct mem_cgroup layout. What do you think about
the following:
---
Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
and is actively misleading because there is all kinds of per-node
stuff in addition to the LRU info in there. On that note, remove the
incorrect comment as well.
Move comment about the nodeinfo[0] array having to be the last field
in struct mem_cgroup after said array. Should be more visible when
attempting to append new members to the struct.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2382fe9..29cb9e9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
};
-struct mem_cgroup_lru_info {
- struct mem_cgroup_per_node *nodeinfo[0];
-};
-
/*
* Cgroups above their limits are maintained in a RB-Tree, independent of
* their hierarchy representation
@@ -370,14 +366,8 @@ struct mem_cgroup {
atomic_t numainfo_events;
atomic_t numainfo_updating;
#endif
- /*
- * Per cgroup active and inactive list, similar to the
- * per zone LRU lists.
- *
- * WARNING: This has to be the last element of the struct. Don't
- * add new fields after this point.
- */
- struct mem_cgroup_lru_info info;
+ struct mem_cgroup_per_node *nodeinfo[0];
+ /* WARNING: nodeinfo has to be the last member in here */
};
static inline size_t memcg_size(void)
@@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
{
VM_BUG_ON((unsigned)nid >= nr_node_ids);
- return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
+ return &memcg->nodeinfo[nid]->zoneinfo[zid];
}
struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
@@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
mz->on_tree = false;
mz->memcg = memcg;
}
- memcg->info.nodeinfo[node] = pn;
+ memcg->nodeinfo[node] = pn;
return 0;
}
static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
{
- kfree(memcg->info.nodeinfo[node]);
+ kfree(memcg->nodeinfo[node]);
}
static struct mem_cgroup *mem_cgroup_alloc(void)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 20+ messages in thread[parent not found: <20130205185324.GB6481-druUgvl0LCNAfugRpC6u6w@public.gmane.org>]
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-02-05 18:53 ` Johannes Weiner
@ 2013-02-05 19:04 ` Michal Hocko
-1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2013-02-05 19:04 UTC (permalink / raw)
To: Johannes Weiner
Cc: Glauber Costa, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
cgroups-u79uwXL29TY76Z2rM5mHXA, Kamezawa Hiroyuki, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
[...]
> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
>
> Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
> and is actively misleading because there is all kinds of per-node
> stuff in addition to the LRU info in there. On that note, remove the
> incorrect comment as well.
>
> Move comment about the nodeinfo[0] array having to be the last field
> in struct mem_cgroup after said array. Should be more visible when
> attempting to append new members to the struct.
>
> Signed-off-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
Yes, I like this. The info level is just artificatial and without any
good reason.
Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
Thanks
> ---
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2382fe9..29cb9e9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
> struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
> };
>
> -struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[0];
> -};
> -
> /*
> * Cgroups above their limits are maintained in a RB-Tree, independent of
> * their hierarchy representation
> @@ -370,14 +366,8 @@ struct mem_cgroup {
> atomic_t numainfo_events;
> atomic_t numainfo_updating;
> #endif
> - /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - *
> - * WARNING: This has to be the last element of the struct. Don't
> - * add new fields after this point.
> - */
> - struct mem_cgroup_lru_info info;
> + struct mem_cgroup_per_node *nodeinfo[0];
> + /* WARNING: nodeinfo has to be the last member in here */
> };
>
> static inline size_t memcg_size(void)
> @@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
> mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
> {
> VM_BUG_ON((unsigned)nid >= nr_node_ids);
> - return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
> + return &memcg->nodeinfo[nid]->zoneinfo[zid];
> }
>
> struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
> @@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> mz->on_tree = false;
> mz->memcg = memcg;
> }
> - memcg->info.nodeinfo[node] = pn;
> + memcg->nodeinfo[node] = pn;
> return 0;
> }
>
> static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> {
> - kfree(memcg->info.nodeinfo[node]);
> + kfree(memcg->nodeinfo[node]);
> }
>
> static struct mem_cgroup *mem_cgroup_alloc(void)
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo-Bw31MaZKKs0EbZ0PF+XxCw@public.gmane.org For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org"> email-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org </a>
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-02-05 19:04 ` Michal Hocko
0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2013-02-05 19:04 UTC (permalink / raw)
To: Johannes Weiner
Cc: Glauber Costa, linux-mm, cgroups, Kamezawa Hiroyuki, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
[...]
> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
>
> Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
> and is actively misleading because there is all kinds of per-node
> stuff in addition to the LRU info in there. On that note, remove the
> incorrect comment as well.
>
> Move comment about the nodeinfo[0] array having to be the last field
> in struct mem_cgroup after said array. Should be more visible when
> attempting to append new members to the struct.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Yes, I like this. The info level is just artificatial and without any
good reason.
Acked-by: Michal Hocko <mhocko@suse.cz>
Thanks
> ---
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2382fe9..29cb9e9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
> struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
> };
>
> -struct mem_cgroup_lru_info {
> - struct mem_cgroup_per_node *nodeinfo[0];
> -};
> -
> /*
> * Cgroups above their limits are maintained in a RB-Tree, independent of
> * their hierarchy representation
> @@ -370,14 +366,8 @@ struct mem_cgroup {
> atomic_t numainfo_events;
> atomic_t numainfo_updating;
> #endif
> - /*
> - * Per cgroup active and inactive list, similar to the
> - * per zone LRU lists.
> - *
> - * WARNING: This has to be the last element of the struct. Don't
> - * add new fields after this point.
> - */
> - struct mem_cgroup_lru_info info;
> + struct mem_cgroup_per_node *nodeinfo[0];
> + /* WARNING: nodeinfo has to be the last member in here */
> };
>
> static inline size_t memcg_size(void)
> @@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
> mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
> {
> VM_BUG_ON((unsigned)nid >= nr_node_ids);
> - return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
> + return &memcg->nodeinfo[nid]->zoneinfo[zid];
> }
>
> struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
> @@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> mz->on_tree = false;
> mz->memcg = memcg;
> }
> - memcg->info.nodeinfo[node] = pn;
> + memcg->nodeinfo[node] = pn;
> return 0;
> }
>
> static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> {
> - kfree(memcg->info.nodeinfo[node]);
> + kfree(memcg->nodeinfo[node]);
> }
>
> static struct mem_cgroup *mem_cgroup_alloc(void)
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 20+ messages in thread[parent not found: <20130205190454.GC3959-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>]
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
2013-02-05 19:04 ` Michal Hocko
@ 2013-02-05 19:06 ` Glauber Costa
-1 siblings, 0 replies; 20+ messages in thread
From: Glauber Costa @ 2013-02-05 19:06 UTC (permalink / raw)
To: Michal Hocko
Cc: Johannes Weiner, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
cgroups-u79uwXL29TY76Z2rM5mHXA, Kamezawa Hiroyuki, Greg Thelen,
Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On 02/05/2013 11:04 PM, Michal Hocko wrote:
> On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
> [...]
>> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
>>
>> Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
>> and is actively misleading because there is all kinds of per-node
>> stuff in addition to the LRU info in there. On that note, remove the
>> incorrect comment as well.
>>
>> Move comment about the nodeinfo[0] array having to be the last field
>> in struct mem_cgroup after said array. Should be more visible when
>> attempting to append new members to the struct.
>>
>> Signed-off-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
>
> Yes, I like this. The info level is just artificatial and without any
> good reason.
>
> Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
>
Acked-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-02-05 19:06 ` Glauber Costa
0 siblings, 0 replies; 20+ messages in thread
From: Glauber Costa @ 2013-02-05 19:06 UTC (permalink / raw)
To: Michal Hocko
Cc: Johannes Weiner, linux-mm, cgroups, Kamezawa Hiroyuki,
Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel
On 02/05/2013 11:04 PM, Michal Hocko wrote:
> On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
> [...]
>> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
>>
>> Remove struct mem_cgroup_lru_info. It only holds the nodeinfo array
>> and is actively misleading because there is all kinds of per-node
>> stuff in addition to the LRU info in there. On that note, remove the
>> incorrect comment as well.
>>
>> Move comment about the nodeinfo[0] array having to be the last field
>> in struct mem_cgroup after said array. Should be more visible when
>> attempting to append new members to the struct.
>>
>> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>
> Yes, I like this. The info level is just artificatial and without any
> good reason.
>
> Acked-by: Michal Hocko <mhocko@suse.cz>
>
Acked-by: Glauber Costa <glommer@parallels.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 20+ messages in thread