* [PATCH] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-23 17:33 Glauber Costa
2013-01-24 0:18 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Glauber Costa @ 2013-01-23 17:33 UTC (permalink / raw)
To: linux-mm
Cc: cgroups, Glauber Costa, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman,
Rik van Riel
In order to maintain all the memcg bookkeeping, we need per-node
descriptors, which will in turn contain a per-zone descriptor.

Because we want to statically allocate those, this array ends up being
very big. Part of the reason is that we allocate something large enough
to hold MAX_NUMNODES, the compile-time constant that holds the maximum
number of nodes we would ever consider.

However, we can do better in some cases if the firmware helps us. This is
true for modern x86 machines, which is coincidentally one of the
architectures in which MAX_NUMNODES tends to be very big.

By using the firmware-provided maximum number of nodes instead of
MAX_NUMNODES, we can reduce the memory footprint of struct memcg
considerably. In the extreme case in which we have only one node, this
reduces the size of the structure from ~64k to ~2k. This is
particularly important because it means that we will no longer resort to
the vmalloc area for the struct memcg on defconfigs. We also have enough
room for an extra node while still staying outside vmalloc.

One also has to keep in mind that with the industry's ability to fit
more processors in a die as fast as the FED prints money, a nodes = 2
configuration is already respectably big.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
---
mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 09255ec..9972fbf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
};
struct mem_cgroup_lru_info {
- struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
+ struct mem_cgroup_per_node *nodeinfo[0];
};
/*
@@ -276,17 +276,6 @@ struct mem_cgroup {
*/
struct res_counter kmem;
/*
- * Per cgroup active and inactive list, similar to the
- * per zone LRU lists.
- */
- struct mem_cgroup_lru_info info;
- int last_scanned_node;
-#if MAX_NUMNODES > 1
- nodemask_t scan_nodes;
- atomic_t numainfo_events;
- atomic_t numainfo_updating;
-#endif
- /*
* Should the accounting and control be hierarchical, per subtree?
*/
bool use_hierarchy;
@@ -349,8 +338,29 @@ struct mem_cgroup {
/* Index in the kmem_cache->memcg_params->memcg_caches array */
int kmemcg_id;
#endif
+
+ int last_scanned_node;
+#if MAX_NUMNODES > 1
+ nodemask_t scan_nodes;
+ atomic_t numainfo_events;
+ atomic_t numainfo_updating;
+#endif
+ /*
+ * Per cgroup active and inactive list, similar to the
+ * per zone LRU lists.
+ *
+ * WARNING: This has to be the last element of the struct. Don't
+ * add new fields after this point.
+ */
+ struct mem_cgroup_lru_info info;
};
+static inline int memcg_size(void)
+{
+ return sizeof(struct mem_cgroup) +
+ nr_node_ids * sizeof(struct mem_cgroup_per_node);
+}
+
/* internal only representation about the status of kmem accounting. */
enum {
KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
@@ -5894,9 +5904,9 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
static struct mem_cgroup *mem_cgroup_alloc(void)
{
struct mem_cgroup *memcg;
- int size = sizeof(struct mem_cgroup);
+ int size = memcg_size();
- /* Can be very big if MAX_NUMNODES is very big */
+ /* Can be very big if nr_node_ids is very big */
if (size < PAGE_SIZE)
memcg = kzalloc(size, GFP_KERNEL);
else
@@ -5933,7 +5943,7 @@ out_free:
static void __mem_cgroup_free(struct mem_cgroup *memcg)
{
int node;
- int size = sizeof(struct mem_cgroup);
+ int size = memcg_size();
mem_cgroup_remove_from_trees(memcg);
free_css_id(&mem_cgroup_subsys, &memcg->css);
--
1.8.1
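
[Editorial illustration, not part of the patch.] The core idea above, a trailing array sized at allocation time from a runtime node count rather than a compile-time maximum, can be sketched in plain userspace C. This is a minimal sketch under stated assumptions: `struct per_node_info`, `struct group`, `group_size()` and `group_alloc()` are hypothetical names invented here, `calloc()` stands in for the kernel allocators, and a C99 flexible array member is used where the patch uses a zero-length array. Because the trailing array in this sketch holds pointers, each slot is sized as a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-node descriptor. */
struct per_node_info {
	unsigned long counters[4];
};

/*
 * Hypothetical container. The flexible array member must be the
 * last field, so nothing may be added after it.
 */
struct group {
	int id;
	struct per_node_info *nodeinfo[];	/* length chosen at allocation time */
};

/* Bytes needed for a group that tracks nr_nodes nodes. */
static size_t group_size(size_t nr_nodes)
{
	/* Guard against overflow in the size computation below. */
	assert(nr_nodes <= (SIZE_MAX - sizeof(struct group)) /
			   sizeof(struct per_node_info *));
	return sizeof(struct group) +
	       nr_nodes * sizeof(struct per_node_info *);
}

/* Allocate a zeroed group with nr_nodes pointer slots; NULL on failure. */
static struct group *group_alloc(size_t nr_nodes)
{
	return calloc(1, group_size(nr_nodes));
}
```

A caller would use `group_alloc(n)` with the node count discovered at run time and release the object with `free()`. The cost of one additional node is then a single pointer slot in the container, instead of a fixed worst-case array in every instance.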
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* Re: [PATCH] memcg: reduce the size of struct memcg 244-fold.
2013-01-23 17:33 [PATCH] memcg: reduce the size of struct memcg 244-fold Glauber Costa
@ 2013-01-24 0:18 ` Andrew Morton
2013-01-24 6:28 ` Glauber Costa
0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2013-01-24 0:18 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman,
Rik van Riel
On Wed, 23 Jan 2013 21:33:46 +0400
Glauber Costa <glommer@parallels.com> wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware helps us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
>
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
Seems sensible.
> +static inline int memcg_size(void)
> +{
> + return sizeof(struct mem_cgroup) +
> + nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> +
> /* internal only representation about the status of kmem accounting. */
> enum {
> KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
> @@ -5894,9 +5904,9 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
> static struct mem_cgroup *mem_cgroup_alloc(void)
> {
> struct mem_cgroup *memcg;
> - int size = sizeof(struct mem_cgroup);
> + int size = memcg_size();
>
> - /* Can be very big if MAX_NUMNODES is very big */
> + /* Can be very big if nr_node_ids is very big */
> if (size < PAGE_SIZE)
> memcg = kzalloc(size, GFP_KERNEL);
> else
> @@ -5933,7 +5943,7 @@ out_free:
> static void __mem_cgroup_free(struct mem_cgroup *memcg)
> {
> int node;
> - int size = sizeof(struct mem_cgroup);
> + int size = memcg_size();
>
> mem_cgroup_remove_from_trees(memcg);
> free_css_id(&mem_cgroup_subsys, &memcg->css);
Really everything here should be using size_t - a minor
cosmetic/readability thing.
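
[Editorial illustration, not part of the thread.] A minimal userspace sketch of what the `size_t` suggestion could look like: the size helper returns `size_t`, the natural type of `sizeof`, and the small-versus-large allocation decision compares two unsigned quantities. The names `object_size()`, `fits_small_allocator()` and `SKETCH_PAGE_SIZE` are hypothetical, and the 4096-byte page size is an assumption made only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Total bytes for a base object plus nr_nodes trailing slots. */
static size_t object_size(size_t base, size_t nr_nodes, size_t per_node)
{
	return base + nr_nodes * per_node;
}

/*
 * True when the object is smaller than one page, which mirrors the
 * "size < PAGE_SIZE" test that selects the small-allocation path.
 */
static bool fits_small_allocator(size_t size)
{
	return size < SKETCH_PAGE_SIZE;
}
```

Keeping every size in `size_t` avoids mixing a signed `int` with the unsigned result of `sizeof`, which is the readability point raised here.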
* Re: [PATCH] memcg: reduce the size of struct memcg 244-fold.
2013-01-24 0:18 ` Andrew Morton
@ 2013-01-24 6:28 ` Glauber Costa
0 siblings, 0 replies; 3+ messages in thread
From: Glauber Costa @ 2013-01-24 6:28 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
Johannes Weiner, Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman,
Rik van Riel
>> struct mem_cgroup *memcg;
>> - int size = sizeof(struct mem_cgroup);
>> + int size = memcg_size();
>>
>> - /* Can be very big if MAX_NUMNODES is very big */
>> + /* Can be very big if nr_node_ids is very big */
>> if (size < PAGE_SIZE)
>> memcg = kzalloc(size, GFP_KERNEL);
>> else
>> @@ -5933,7 +5943,7 @@ out_free:
>> static void __mem_cgroup_free(struct mem_cgroup *memcg)
>> {
>> int node;
>> - int size = sizeof(struct mem_cgroup);
>> + int size = memcg_size();
>>
>> mem_cgroup_remove_from_trees(memcg);
>> free_css_id(&mem_cgroup_subsys, &memcg->css);
>
> Really everything here should be using size_t - a minor
> cosmetic/readability thing.
>
I agree