* [PATCH 0/6] memcg: bypass root memcg page stat accounting
@ 2013-03-12 10:06 Sha Zhengju
2013-03-12 10:08 ` [PATCH 1/6] memcg: use global stat directly for root memcg usage Sha Zhengju
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Sha Zhengju @ 2013-03-12 10:06 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: mhocko, kamezawa.hiroyu, glommer, akpm, mgorman, Sha Zhengju
Hi,
As we all know, if memcg is enabled but no non-root memcg exists, all
allocated pages belong to the root memcg and go through the root memcg
statistics routines, which brings some overhead.
In this patchset we give up accounting the root memcg stats, including
CACHE/RSS/SWAP/FILE_MAPPED/PGFAULT/PGMAJFAULT (a first attempt can be
found here: https://lkml.org/lkml/2012/12/25/103). But we need to pay
special attention when showing these root memcg numbers in
memcg_stat_show(): since we no longer account root memcg stats, the
root_mem_cgroup->stat numbers are actually 0, but we can fake these
figures from the global state and the stats of all other memcgs.
Take the CACHE stat for example; for the root memcg:
nr(MEM_CGROUP_STAT_CACHE) = global_page_state(NR_FILE_PAGES) -
                            sum_of_all_memcg(MEM_CGROUP_STAT_CACHE);
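For reference, a minimal C sketch of that computation (patch 2 open-codes
essentially this inside memcg_stat_show(); the helper name below is made up
for illustration):

	/*
	 * Illustrative only: the root CACHE figure is the global counter
	 * minus what all other memcgs have accounted.
	 */
	static long root_cache_pages(struct mem_cgroup *root)
	{
		long val = global_page_state(NR_FILE_PAGES) -
			   mem_cgroup_recursive_stat(root, MEM_CGROUP_STAT_CACHE);

		/* per-cpu counter drift can make this transiently negative */
		return val < 0 ? 0 : val;
	}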
On a machine with 4GB of memory and a 4-core i5 CPU, we ran Mel's pft
(page fault test) for performance numbers:
nomemcg : memcg compiled out of the kernel.
vanilla : memcg enabled, patches not applied.
optimized: memcg enabled, patches applied.
              optimized    vanilla
User             405.15     431.27
System            71.71      73.00
Elapsed          483.23     510.00
              optimized    nomemcg
User             405.15     390.68
System            71.71      67.21
Elapsed          483.23     466.15
Note that the elapsed time drops considerably, from 510 to 483 seconds
(roughly 5%), after the patches are applied. But there is still some gap
between the patched and the memcg-disabled kernels, and further work is
possible here (the left-over stats such as PGPGIN/PGPGOUT).
I split the patchset into several parts, mainly by accounting entry
function, for the convenience of review:
Sha Zhengju (6):
memcg: use global stat directly for root memcg usage
memcg: Don't account root memcg CACHE/RSS stats
memcg: Don't account root memcg MEM_CGROUP_STAT_FILE_MAPPED stats
memcg: Don't account root memcg swap stats
memcg: Don't account root memcg PGFAULT/PGMAJFAULT
memcg: disable memcg page stat accounting
include/linux/memcontrol.h | 23 +++++++
mm/memcontrol.c | 149 +++++++++++++++++++++++++++++++++++++-------
2 files changed, 149 insertions(+), 23 deletions(-)
* [PATCH 1/6] memcg: use global stat directly for root memcg usage
2013-03-12 10:06 [PATCH 0/6] memcg: bypass root memcg page stat accounting Sha Zhengju
@ 2013-03-12 10:08 ` Sha Zhengju
2013-03-13 1:05 ` Kamezawa Hiroyuki
2013-03-12 10:09 ` [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats Sha Zhengju
` (4 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Sha Zhengju @ 2013-03-12 10:08 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: mhocko, kamezawa.hiroyu, glommer, akpm, mgorman, Sha Zhengju
Since mem_cgroup_recursive_stat(root_mem_cgroup, INDEX) will sum up
all memcg stats without regard to root's use_hierarchy, we may use
global stats instead for simplicity.
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
---
mm/memcontrol.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 669d16a..735cd41 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4987,11 +4987,11 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
return res_counter_read_u64(&memcg->memsw, RES_USAGE);
}
- val = mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_CACHE);
- val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_RSS);
+ val = global_page_state(NR_FILE_PAGES);
+ val += global_page_state(NR_ANON_PAGES);
if (swap)
- val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_SWAP);
+ val += total_swap_pages - atomic_long_read(&nr_swap_pages);
return val << PAGE_SHIFT;
}
--
1.7.9.5
* [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats
2013-03-12 10:06 [PATCH 0/6] memcg: bypass root memcg page stat accounting Sha Zhengju
2013-03-12 10:08 ` [PATCH 1/6] memcg: use global stat directly for root memcg usage Sha Zhengju
@ 2013-03-12 10:09 ` Sha Zhengju
2013-03-13 1:12 ` Kamezawa Hiroyuki
2013-03-20 7:07 ` Glauber Costa
2013-03-12 10:10 ` [PATCH 3/6] memcg: Don't account root memcg MEM_CGROUP_STAT_FILE_MAPPED stats Sha Zhengju
` (3 subsequent siblings)
5 siblings, 2 replies; 13+ messages in thread
From: Sha Zhengju @ 2013-03-12 10:09 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: mhocko, kamezawa.hiroyu, glommer, akpm, mgorman, Sha Zhengju
If memcg is enabled and no non-root memcg exists, all allocated pages
belong to root_mem_cgroup and go through the root memcg statistics
routines, which brings some overhead.
So for the sake of performance, we can give up accounting the root
memcg stats for MEM_CGROUP_STAT_CACHE/RSS, and instead pay special
attention to memcg_stat_show() when showing the root memcg numbers:
since we no longer account root memcg stats, the root_mem_cgroup->stat
numbers are actually 0, so we fake them from the global state and the
stats of all other memcgs. That is, for the root memcg:
nr(MEM_CGROUP_STAT_CACHE) = global_page_state(NR_FILE_PAGES) -
                            sum_of_all_memcg(MEM_CGROUP_STAT_CACHE);
RSS pages are accounted in a similar way.
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
---
mm/memcontrol.c | 50 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 34 insertions(+), 16 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 735cd41..e89204f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -958,26 +958,27 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
{
preempt_disable();
- /*
- * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
- * counted as CACHE even if it's on ANON LRU.
- */
- if (anon)
- __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_RSS],
- nr_pages);
- else
- __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_CACHE],
- nr_pages);
-
/* pagein of a big page is an event. So, ignore page size */
if (nr_pages > 0)
__this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
- else {
+ else
__this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
- nr_pages = -nr_pages; /* for event */
- }
- __this_cpu_add(memcg->stat->nr_page_events, nr_pages);
+ __this_cpu_add(memcg->stat->nr_page_events,
+ nr_pages < 0 ? -nr_pages : nr_pages);
+
+ if (!mem_cgroup_is_root(memcg)) {
+ /*
+ * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
+ * counted as CACHE even if it's on ANON LRU.
+ */
+ if (anon)
+ __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_RSS],
+ nr_pages);
+ else
+ __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_CACHE],
+ nr_pages);
+ }
preempt_enable();
}
@@ -5445,12 +5446,24 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
struct mem_cgroup *mi;
unsigned int i;
+ enum zone_stat_item global_stat[] = {NR_FILE_PAGES, NR_ANON_PAGES};
+ long root_stat[MEM_CGROUP_STAT_NSTATS] = {0};
for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
+ long val = 0;
+
if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
continue;
+
+ if (mem_cgroup_is_root(memcg) && (i == MEM_CGROUP_STAT_CACHE
+ || i == MEM_CGROUP_STAT_RSS)) {
+ val = global_page_state(global_stat[i]) -
+ mem_cgroup_recursive_stat(memcg, i);
+ root_stat[i] = val = val < 0 ? 0 : val;
+ } else
+ val = mem_cgroup_read_stat(memcg, i);
seq_printf(m, "%s %ld\n", mem_cgroup_stat_names[i],
- mem_cgroup_read_stat(memcg, i) * PAGE_SIZE);
+ val * PAGE_SIZE);
}
for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++)
@@ -5478,6 +5491,11 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
continue;
for_each_mem_cgroup_tree(mi, memcg)
val += mem_cgroup_read_stat(mi, i) * PAGE_SIZE;
+
+ /* Adding local stats of root memcg */
+ if (mem_cgroup_is_root(memcg))
+ val += root_stat[i] * PAGE_SIZE;
+
seq_printf(m, "total_%s %lld\n", mem_cgroup_stat_names[i], val);
}
--
1.7.9.5
* [PATCH 3/6] memcg: Don't account root memcg MEM_CGROUP_STAT_FILE_MAPPED stats
2013-03-12 10:06 [PATCH 0/6] memcg: bypass root memcg page stat accounting Sha Zhengju
2013-03-12 10:08 ` [PATCH 1/6] memcg: use global stat directly for root memcg usage Sha Zhengju
2013-03-12 10:09 ` [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats Sha Zhengju
@ 2013-03-12 10:10 ` Sha Zhengju
2013-03-12 10:10 ` [PATCH 4/6] memcg: Don't account root memcg swap stats Sha Zhengju
` (2 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Sha Zhengju @ 2013-03-12 10:10 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: mhocko, kamezawa.hiroyu, glommer, akpm, mgorman, Sha Zhengju
As with the root memcg's CACHE/RSS stats, we no longer account the root
memcg stats counted by mem_cgroup_update_page_stat() (currently
MEM_CGROUP_STAT_FILE_MAPPED only), to improve performance.
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
---
mm/memcontrol.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e89204f..24ce5e6d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2277,6 +2277,10 @@ void mem_cgroup_update_page_stat(struct page *page,
return;
memcg = pc->mem_cgroup;
+
+ if (mem_cgroup_is_root(memcg))
+ return;
+
if (unlikely(!memcg || !PageCgroupUsed(pc)))
return;
@@ -5446,7 +5450,8 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
struct mem_cgroup *mi;
unsigned int i;
- enum zone_stat_item global_stat[] = {NR_FILE_PAGES, NR_ANON_PAGES};
+ enum zone_stat_item global_stat[] = {NR_FILE_PAGES, NR_ANON_PAGES,
+ NR_FILE_MAPPED};
long root_stat[MEM_CGROUP_STAT_NSTATS] = {0};
for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
@@ -5455,8 +5460,7 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
continue;
- if (mem_cgroup_is_root(memcg) && (i == MEM_CGROUP_STAT_CACHE
- || i == MEM_CGROUP_STAT_RSS)) {
+ if (mem_cgroup_is_root(memcg) && (i != MEM_CGROUP_STAT_SWAP)) {
val = global_page_state(global_stat[i]) -
mem_cgroup_recursive_stat(memcg, i);
root_stat[i] = val = val < 0 ? 0 : val;
--
1.7.9.5
* [PATCH 4/6] memcg: Don't account root memcg swap stats
2013-03-12 10:06 [PATCH 0/6] memcg: bypass root memcg page stat accounting Sha Zhengju
` (2 preceding siblings ...)
2013-03-12 10:10 ` [PATCH 3/6] memcg: Don't account root memcg MEM_CGROUP_STAT_FILE_MAPPED stats Sha Zhengju
@ 2013-03-12 10:10 ` Sha Zhengju
2013-03-12 10:11 ` [PATCH 5/6] memcg: Don't account root memcg PGFAULT/PGMAJFAULT events Sha Zhengju
2013-03-12 10:11 ` [PATCH 6/6] memcg: disable memcg page stat accounting Sha Zhengju
5 siblings, 0 replies; 13+ messages in thread
From: Sha Zhengju @ 2013-03-12 10:10 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: mhocko, kamezawa.hiroyu, glommer, akpm, mgorman, Sha Zhengju
As with the root memcg's CACHE/RSS stats, we no longer account its swap
stats, to improve performance.
For the root memcg, memcg_stat_show() then uses:
nr(MEM_CGROUP_STAT_SWAP) = total_swap_pages - nr_swap_pages
                           - sum_of_all_memcg(MEM_CGROUP_STAT_SWAP);
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
---
mm/memcontrol.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 24ce5e6d..b73758e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -934,7 +934,9 @@ static void mem_cgroup_swap_statistics(struct mem_cgroup *memcg,
bool charge)
{
int val = (charge) ? 1 : -1;
- this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAP], val);
+
+ if (!mem_cgroup_is_root(memcg))
+ this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAP], val);
}
static unsigned long mem_cgroup_read_events(struct mem_cgroup *memcg,
@@ -5460,10 +5462,13 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
continue;
- if (mem_cgroup_is_root(memcg) && (i != MEM_CGROUP_STAT_SWAP)) {
- val = global_page_state(global_stat[i]) -
- mem_cgroup_recursive_stat(memcg, i);
- root_stat[i] = val = val < 0 ? 0 : val;
+ if (mem_cgroup_is_root(memcg)) {
+ if (i == MEM_CGROUP_STAT_SWAP)
+ val = total_swap_pages -
+ atomic_long_read(&nr_swap_pages);
+ else
+ val = global_page_state(global_stat[i]);
+ val = val - mem_cgroup_recursive_stat(memcg, i);
} else
val = mem_cgroup_read_stat(memcg, i);
seq_printf(m, "%s %ld\n", mem_cgroup_stat_names[i],
--
1.7.9.5
* [PATCH 5/6] memcg: Don't account root memcg PGFAULT/PGMAJFAULT events
2013-03-12 10:06 [PATCH 0/6] memcg: bypass root memcg page stat accounting Sha Zhengju
` (3 preceding siblings ...)
2013-03-12 10:10 ` [PATCH 4/6] memcg: Don't account root memcg swap stats Sha Zhengju
@ 2013-03-12 10:11 ` Sha Zhengju
2013-03-12 10:11 ` [PATCH 6/6] memcg: disable memcg page stat accounting Sha Zhengju
5 siblings, 0 replies; 13+ messages in thread
From: Sha Zhengju @ 2013-03-12 10:11 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: mhocko, kamezawa.hiroyu, glommer, akpm, mgorman, Sha Zhengju
Handle the root memcg PGFAULT/PGMAJFAULT events in a similar way, so
that for the root memcg:
nr(MEM_CGROUP_EVENTS_PGFAULT/PGMAJFAULT) = global_event_states -
        sum_of_all_memcg(MEM_CGROUP_EVENTS_PGFAULT/PGMAJFAULT);
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
---
mm/memcontrol.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 47 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b73758e..cea4b02 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -53,6 +53,7 @@
#include <linux/page_cgroup.h>
#include <linux/cpu.h>
#include <linux/oom.h>
+#include <linux/vmstat.h>
#include "internal.h"
#include <net/sock.h>
#include <net/ip.h>
@@ -1252,6 +1253,10 @@ void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
rcu_read_lock();
memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
+
+ if (mem_cgroup_is_root(memcg))
+ goto out;
+
if (unlikely(!memcg))
goto out;
@@ -4983,6 +4988,18 @@ static unsigned long mem_cgroup_recursive_stat(struct mem_cgroup *memcg,
return val;
}
+static unsigned long mem_cgroup_recursive_events(struct mem_cgroup *memcg,
+ enum mem_cgroup_events_index idx)
+{
+ struct mem_cgroup *iter;
+ unsigned long val = 0;
+
+ for_each_mem_cgroup_tree(iter, memcg)
+ val += mem_cgroup_read_events(iter, idx);
+
+ return val;
+}
+
static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
{
u64 val;
@@ -5455,6 +5472,7 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
enum zone_stat_item global_stat[] = {NR_FILE_PAGES, NR_ANON_PAGES,
NR_FILE_MAPPED};
long root_stat[MEM_CGROUP_STAT_NSTATS] = {0};
+ unsigned long root_events[MEM_CGROUP_EVENTS_NSTATS] = {0};
for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
long val = 0;
@@ -5475,9 +5493,30 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
val * PAGE_SIZE);
}
- for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++)
- seq_printf(m, "%s %lu\n", mem_cgroup_events_names[i],
- mem_cgroup_read_events(memcg, i));
+ for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++) {
+ unsigned long val = 0;
+
+ if (mem_cgroup_is_root(memcg) &&
+ ((i == MEM_CGROUP_EVENTS_PGFAULT) ||
+ i == MEM_CGROUP_EVENTS_PGMAJFAULT)) {
+ int cpu;
+
+ get_online_cpus();
+ for_each_online_cpu(cpu) {
+ struct vm_event_state *this = &per_cpu(vm_event_states, cpu);
+ if (i == MEM_CGROUP_EVENTS_PGFAULT)
+ val += this->event[PGFAULT];
+ else
+ val += this->event[PGMAJFAULT];
+ }
+ put_online_cpus();
+
+ val = val - mem_cgroup_recursive_events(memcg, i);
+ root_events[i] = val = val < 0 ? 0 : val;
+ } else
+ val = mem_cgroup_read_events(memcg, i);
+ seq_printf(m, "%s %lu\n", mem_cgroup_events_names[i], val);
+ }
for (i = 0; i < NR_LRU_LISTS; i++)
seq_printf(m, "%s %lu\n", mem_cgroup_lru_names[i],
@@ -5513,6 +5552,11 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
for_each_mem_cgroup_tree(mi, memcg)
val += mem_cgroup_read_events(mi, i);
+
+ /* Adding local events of root memcg */
+ if (mem_cgroup_is_root(memcg))
+ val += root_events[i];
+
seq_printf(m, "total_%s %llu\n",
mem_cgroup_events_names[i], val);
}
--
1.7.9.5
* [PATCH 6/6] memcg: disable memcg page stat accounting
2013-03-12 10:06 [PATCH 0/6] memcg: bypass root memcg page stat accounting Sha Zhengju
` (4 preceding siblings ...)
2013-03-12 10:11 ` [PATCH 5/6] memcg: Don't account root memcg PGFAULT/PGMAJFAULT events Sha Zhengju
@ 2013-03-12 10:11 ` Sha Zhengju
2013-03-20 7:09 ` Glauber Costa
5 siblings, 1 reply; 13+ messages in thread
From: Sha Zhengju @ 2013-03-12 10:11 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: mhocko, kamezawa.hiroyu, glommer, akpm, mgorman, Sha Zhengju
Use a jump label to patch the memcg page stat accounting code in or
out depending on whether memcg is in use: when the first non-root memcg
comes to life the code is patched in; otherwise it stays patched out.
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
---
include/linux/memcontrol.h | 23 +++++++++++++++++++++++
mm/memcontrol.c | 34 +++++++++++++++++++++++++++++++++-
2 files changed, 56 insertions(+), 1 deletion(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d6183f0..99dca91 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -42,6 +42,14 @@ struct mem_cgroup_reclaim_cookie {
};
#ifdef CONFIG_MEMCG
+
+extern struct static_key memcg_in_use_key;
+
+static inline bool mem_cgroup_in_use(void)
+{
+ return static_key_false(&memcg_in_use_key);
+}
+
/*
* All "charge" functions with gfp_mask should use GFP_KERNEL or
* (gfp_mask & GFP_RECLAIM_MASK). In current implementatin, memcg doesn't
@@ -145,6 +153,10 @@ static inline void mem_cgroup_begin_update_page_stat(struct page *page,
{
if (mem_cgroup_disabled())
return;
+
+ if (!mem_cgroup_in_use())
+ return;
+
rcu_read_lock();
*locked = false;
if (atomic_read(&memcg_moving))
@@ -158,6 +170,10 @@ static inline void mem_cgroup_end_update_page_stat(struct page *page,
{
if (mem_cgroup_disabled())
return;
+
+ if (!mem_cgroup_in_use())
+ return;
+
if (*locked)
__mem_cgroup_end_update_page_stat(page, flags);
rcu_read_unlock();
@@ -189,6 +205,9 @@ static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
{
if (mem_cgroup_disabled())
return;
+ if (!mem_cgroup_in_use())
+ return;
+
__mem_cgroup_count_vm_event(mm, idx);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -201,6 +220,10 @@ void mem_cgroup_print_bad_page(struct page *page);
#endif
#else /* CONFIG_MEMCG */
struct mem_cgroup;
+static inline bool mem_cgroup_in_use(void)
+{
+ return false;
+}
static inline int mem_cgroup_newpage_charge(struct page *page,
struct mm_struct *mm, gfp_t gfp_mask)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cea4b02..4e08347 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -562,6 +562,14 @@ enum res_type {
*/
static DEFINE_MUTEX(memcg_create_mutex);
+/* static_key used for marking memcg in use or not. We use this jump label to
+ * patch memcg page stat accounting code in or out.
+ * The key will be increased when non-root memcg is created, and be decreased
+ * when memcg is destroyed.
+ */
+struct static_key memcg_in_use_key;
+EXPORT_SYMBOL(memcg_in_use_key);
+
static void mem_cgroup_get(struct mem_cgroup *memcg);
static void mem_cgroup_put(struct mem_cgroup *memcg);
@@ -707,10 +715,21 @@ static void disarm_kmem_keys(struct mem_cgroup *memcg)
}
#endif /* CONFIG_MEMCG_KMEM */
+static void disarm_inuse_keys(void)
+{
+ static_key_slow_dec(&memcg_in_use_key);
+}
+
+static void arm_inuse_keys(void)
+{
+ static_key_slow_inc(&memcg_in_use_key);
+}
+
static void disarm_static_keys(struct mem_cgroup *memcg)
{
disarm_sock_keys(memcg);
disarm_kmem_keys(memcg);
+ disarm_inuse_keys();
}
static void drain_all_stock_async(struct mem_cgroup *memcg);
@@ -936,6 +955,9 @@ static void mem_cgroup_swap_statistics(struct mem_cgroup *memcg,
{
int val = (charge) ? 1 : -1;
+ if (!mem_cgroup_in_use())
+ return;
+
if (!mem_cgroup_is_root(memcg))
this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAP], val);
}
@@ -970,6 +992,11 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
__this_cpu_add(memcg->stat->nr_page_events,
nr_pages < 0 ? -nr_pages : nr_pages);
+ if (!mem_cgroup_in_use()) {
+ preempt_enable();
+ return;
+ }
+
if (!mem_cgroup_is_root(memcg)) {
/*
* Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
@@ -2278,11 +2305,13 @@ void mem_cgroup_update_page_stat(struct page *page,
{
struct mem_cgroup *memcg;
struct page_cgroup *pc = lookup_page_cgroup(page);
- unsigned long uninitialized_var(flags);
if (mem_cgroup_disabled())
return;
+ if (!mem_cgroup_in_use())
+ return;
+
memcg = pc->mem_cgroup;
if (mem_cgroup_is_root(memcg))
@@ -6414,6 +6443,9 @@ mem_cgroup_css_online(struct cgroup *cont)
}
error = memcg_init_kmem(memcg, &mem_cgroup_subsys);
+ if (!error)
+ arm_inuse_keys();
+
mutex_unlock(&memcg_create_mutex);
if (error) {
/*
--
1.7.9.5
* Re: [PATCH 1/6] memcg: use global stat directly for root memcg usage
2013-03-12 10:08 ` [PATCH 1/6] memcg: use global stat directly for root memcg usage Sha Zhengju
@ 2013-03-13 1:05 ` Kamezawa Hiroyuki
2013-03-13 8:50 ` Sha Zhengju
0 siblings, 1 reply; 13+ messages in thread
From: Kamezawa Hiroyuki @ 2013-03-13 1:05 UTC (permalink / raw)
To: Sha Zhengju
Cc: cgroups, linux-mm, mhocko, glommer, akpm, mgorman, Sha Zhengju
(2013/03/12 19:08), Sha Zhengju wrote:
> Since mem_cgroup_recursive_stat(root_mem_cgroup, INDEX) will sum up
> all memcg stats without regard to root's use_hierarchy, we may use
> global stats instead for simplicity.
>
> Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
> ---
> mm/memcontrol.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 669d16a..735cd41 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4987,11 +4987,11 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
> return res_counter_read_u64(&memcg->memsw, RES_USAGE);
> }
>
> - val = mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_CACHE);
> - val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_RSS);
> + val = global_page_state(NR_FILE_PAGES);
> + val += global_page_state(NR_ANON_PAGES);
>
you missed NR_ANON_TRANSPARENT_HUGEPAGES
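(For illustration, a hedged sketch of what folding that counter in might look
like, not anyone's actual code; anon THP pages were not included in
NR_ANON_PAGES in kernels of this era, and the HPAGE_PMD_NR scaling below is
this sketch's assumption:)

	val  = global_page_state(NR_FILE_PAGES);
	val += global_page_state(NR_ANON_PAGES);
	val += global_page_state(NR_ANON_TRANSPARENT_HUGEPAGES) *
	       HPAGE_PMD_NR;	/* THP counted in huge-page units */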
> if (swap)
> - val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_SWAP);
> + val += total_swap_pages - atomic_long_read(&nr_swap_pages);
>
Doesn't this double count mapped SwapCache? Did you see Costa's attempt from a week ago?
Thanks,
-Kame
* Re: [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats
2013-03-12 10:09 ` [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats Sha Zhengju
@ 2013-03-13 1:12 ` Kamezawa Hiroyuki
2013-03-13 9:09 ` Sha Zhengju
2013-03-20 7:07 ` Glauber Costa
1 sibling, 1 reply; 13+ messages in thread
From: Kamezawa Hiroyuki @ 2013-03-13 1:12 UTC (permalink / raw)
To: Sha Zhengju
Cc: cgroups, linux-mm, mhocko, glommer, akpm, mgorman, Sha Zhengju
(2013/03/12 19:09), Sha Zhengju wrote:
> If memcg is enabled and no non-root memcg exists, all allocated pages
> belong to root_mem_cgroup and go through the root memcg statistics
> routines, which brings some overhead.
>
> So for the sake of performance, we can give up accounting the root
> memcg stats for MEM_CGROUP_STAT_CACHE/RSS, and instead pay special
> attention to memcg_stat_show() when showing the root memcg numbers:
> since we no longer account root memcg stats, the root_mem_cgroup->stat
> numbers are actually 0, so we fake them from the global state and the
> stats of all other memcgs. That is, for the root memcg:
>
> nr(MEM_CGROUP_STAT_CACHE) = global_page_state(NR_FILE_PAGES) -
>                             sum_of_all_memcg(MEM_CGROUP_STAT_CACHE);
>
> RSS pages are accounted in a similar way.
>
> Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
> ---
> mm/memcontrol.c | 50 ++++++++++++++++++++++++++++++++++----------------
> 1 file changed, 34 insertions(+), 16 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 735cd41..e89204f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -958,26 +958,27 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
> {
> preempt_disable();
>
> - /*
> - * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
> - * counted as CACHE even if it's on ANON LRU.
> - */
> - if (anon)
> - __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_RSS],
> - nr_pages);
> - else
> - __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_CACHE],
> - nr_pages);
> -
> /* pagein of a big page is an event. So, ignore page size */
> if (nr_pages > 0)
> __this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
> - else {
> + else
> __this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
> - nr_pages = -nr_pages; /* for event */
> - }
>
> - __this_cpu_add(memcg->stat->nr_page_events, nr_pages);
> + __this_cpu_add(memcg->stat->nr_page_events,
> + nr_pages < 0 ? -nr_pages : nr_pages);
> +
> + if (!mem_cgroup_is_root(memcg)) {
> + /*
> + * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
> + * counted as CACHE even if it's on ANON LRU.
> + */
> + if (anon)
> + __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_RSS],
> + nr_pages);
> + else
> + __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_CACHE],
> + nr_pages);
> + }
Hmm. I don't like adding this check to this fast path. IIUC, with Costa's patch, the root
memcg will not make any charges at all and will never call this function. I prefer his
approach to this patch.
Thanks,
-Kame
>
> preempt_enable();
> }
> @@ -5445,12 +5446,24 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
> struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> struct mem_cgroup *mi;
> unsigned int i;
> + enum zone_stat_item global_stat[] = {NR_FILE_PAGES, NR_ANON_PAGES};
> + long root_stat[MEM_CGROUP_STAT_NSTATS] = {0};
>
> for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
> + long val = 0;
> +
> if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
> continue;
> +
> + if (mem_cgroup_is_root(memcg) && (i == MEM_CGROUP_STAT_CACHE
> + || i == MEM_CGROUP_STAT_RSS)) {
> + val = global_page_state(global_stat[i]) -
> + mem_cgroup_recursive_stat(memcg, i);
> + root_stat[i] = val = val < 0 ? 0 : val;
> + } else
> + val = mem_cgroup_read_stat(memcg, i);
> seq_printf(m, "%s %ld\n", mem_cgroup_stat_names[i],
> - mem_cgroup_read_stat(memcg, i) * PAGE_SIZE);
> + val * PAGE_SIZE);
> }
>
> for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++)
> @@ -5478,6 +5491,11 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
> continue;
> for_each_mem_cgroup_tree(mi, memcg)
> val += mem_cgroup_read_stat(mi, i) * PAGE_SIZE;
> +
> + /* Adding local stats of root memcg */
> + if (mem_cgroup_is_root(memcg))
> + val += root_stat[i] * PAGE_SIZE;
> +
> seq_printf(m, "total_%s %lld\n", mem_cgroup_stat_names[i], val);
> }
>
>
* Re: [PATCH 1/6] memcg: use global stat directly for root memcg usage
2013-03-13 1:05 ` Kamezawa Hiroyuki
@ 2013-03-13 8:50 ` Sha Zhengju
0 siblings, 0 replies; 13+ messages in thread
From: Sha Zhengju @ 2013-03-13 8:50 UTC (permalink / raw)
To: Kamezawa Hiroyuki
Cc: cgroups, linux-mm, mhocko, glommer, akpm, mgorman, Sha Zhengju
On Wed, Mar 13, 2013 at 9:05 AM, Kamezawa Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> (2013/03/12 19:08), Sha Zhengju wrote:
>> Since mem_cgroup_recursive_stat(root_mem_cgroup, INDEX) will sum up
>> all memcg stats without regard to root's use_hierarchy, we may use
>> global stats instead for simplicity.
>>
>> Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
>> ---
>> mm/memcontrol.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 669d16a..735cd41 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -4987,11 +4987,11 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>> return res_counter_read_u64(&memcg->memsw, RES_USAGE);
>> }
>>
>> - val = mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_CACHE);
>> - val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_RSS);
>> + val = global_page_state(NR_FILE_PAGES);
>> + val += global_page_state(NR_ANON_PAGES);
>>
> you missed NR_ANON_TRANSPARENT_HUGEPAGES
right..
>
>> if (swap)
>> - val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_SWAP);
>> + val += total_swap_pages - atomic_long_read(&nr_swap_pages);
>>
> Double count mapped SwapCache ? Did you saw Costa's trial in a week ago ?
Yeah, I'm still hesitating over how to handle swapcache. I've replied in that thread. :)
Thanks,
Sha
* Re: [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats
2013-03-13 1:12 ` Kamezawa Hiroyuki
@ 2013-03-13 9:09 ` Sha Zhengju
0 siblings, 0 replies; 13+ messages in thread
From: Sha Zhengju @ 2013-03-13 9:09 UTC (permalink / raw)
To: Kamezawa Hiroyuki
Cc: cgroups, linux-mm, mhocko, glommer, akpm, mgorman, Sha Zhengju
On Wed, Mar 13, 2013 at 9:12 AM, Kamezawa Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> (2013/03/12 19:09), Sha Zhengju wrote:
>> If memcg is enabled and no non-root memcg exists, all allocated pages
>> belong to root_mem_cgroup and go through the root memcg statistics
>> routines, which brings some overhead.
>>
>> So for the sake of performance, we can give up accounting the root
>> memcg stats for MEM_CGROUP_STAT_CACHE/RSS, and instead pay special
>> attention to memcg_stat_show() when showing the root memcg numbers:
>> since we no longer account root memcg stats, the root_mem_cgroup->stat
>> numbers are actually 0, so we fake them from the global state and the
>> stats of all other memcgs. That is, for the root memcg:
>>
>> nr(MEM_CGROUP_STAT_CACHE) = global_page_state(NR_FILE_PAGES) -
>>                             sum_of_all_memcg(MEM_CGROUP_STAT_CACHE);
>>
>> RSS pages are accounted in a similar way.
>>
>> Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
>> ---
>> mm/memcontrol.c | 50 ++++++++++++++++++++++++++++++++++----------------
>> 1 file changed, 34 insertions(+), 16 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 735cd41..e89204f 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -958,26 +958,27 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
>> {
>> preempt_disable();
>>
>> - /*
>> - * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
>> - * counted as CACHE even if it's on ANON LRU.
>> - */
>> - if (anon)
>> - __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_RSS],
>> - nr_pages);
>> - else
>> - __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_CACHE],
>> - nr_pages);
>> -
>> /* pagein of a big page is an event. So, ignore page size */
>> if (nr_pages > 0)
>> __this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
>> - else {
>> + else
>> __this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
>> - nr_pages = -nr_pages; /* for event */
>> - }
>>
>> - __this_cpu_add(memcg->stat->nr_page_events, nr_pages);
>> + __this_cpu_add(memcg->stat->nr_page_events,
>> + nr_pages < 0 ? -nr_pages : nr_pages);
>> +
>> + if (!mem_cgroup_is_root(memcg)) {
>> + /*
>> + * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
>> + * counted as CACHE even if it's on ANON LRU.
>> + */
>> + if (anon)
>> + __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_RSS],
>> + nr_pages);
>> + else
>> + __this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_CACHE],
>> + nr_pages);
>> + }
>
> Hmm. I don't like to add this check to this fast path. IIUC, with Costa's patch, root memcg
> will not make any charges at all and never call this function. I like his one rather than
Yes. But I think that approach still has some other problems, such as
PGPGIN/PGPGOUT and the threshold-event-related handling. I prefer to
improve this as a start.
Thanks,
Sha
> this patching.
>
> Thanks,
> -Kame
>
>
>>
>> preempt_enable();
>> }
>> @@ -5445,12 +5446,24 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
>> struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
>> struct mem_cgroup *mi;
>> unsigned int i;
>> + enum zone_stat_item global_stat[] = {NR_FILE_PAGES, NR_ANON_PAGES};
>> + long root_stat[MEM_CGROUP_STAT_NSTATS] = {0};
>>
>> for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
>> + long val = 0;
>> +
>> if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
>> continue;
>> +
>> + if (mem_cgroup_is_root(memcg) && (i == MEM_CGROUP_STAT_CACHE
>> + || i == MEM_CGROUP_STAT_RSS)) {
>> + val = global_page_state(global_stat[i]) -
>> + mem_cgroup_recursive_stat(memcg, i);
>> + root_stat[i] = val = val < 0 ? 0 : val;
>> + } else
>> + val = mem_cgroup_read_stat(memcg, i);
>> seq_printf(m, "%s %ld\n", mem_cgroup_stat_names[i],
>> - mem_cgroup_read_stat(memcg, i) * PAGE_SIZE);
>> + val * PAGE_SIZE);
>> }
>>
>> for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++)
>> @@ -5478,6 +5491,11 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
>> continue;
>> for_each_mem_cgroup_tree(mi, memcg)
>> val += mem_cgroup_read_stat(mi, i) * PAGE_SIZE;
>> +
>> + /* Adding local stats of root memcg */
>> + if (mem_cgroup_is_root(memcg))
>> + val += root_stat[i] * PAGE_SIZE;
>> +
>> seq_printf(m, "total_%s %lld\n", mem_cgroup_stat_names[i], val);
>> }
>>
>>
>
>
* Re: [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats
2013-03-12 10:09 ` [PATCH 2/6] memcg: Don't account root memcg CACHE/RSS stats Sha Zhengju
2013-03-13 1:12 ` Kamezawa Hiroyuki
@ 2013-03-20 7:07 ` Glauber Costa
1 sibling, 0 replies; 13+ messages in thread
From: Glauber Costa @ 2013-03-20 7:07 UTC (permalink / raw)
To: Sha Zhengju
Cc: cgroups, linux-mm, mhocko, kamezawa.hiroyu, akpm, mgorman,
Sha Zhengju
On 03/12/2013 02:09 PM, Sha Zhengju wrote:
> If memcg is enabled and no non-root memcg exists, all allocated pages
> belong to root_mem_cgroup and go through the root memcg statistics
> routines, which brings some overhead.
>
> So for the sake of performance, we can give up accounting the root
> memcg stats for MEM_CGROUP_STAT_CACHE/RSS, and instead pay special
> attention to memcg_stat_show() when showing the root memcg numbers:
> since we no longer account root memcg stats, the root_mem_cgroup->stat
> numbers are actually 0, so we fake them from the global state and the
> stats of all other memcgs. That is, for the root memcg:
>
> nr(MEM_CGROUP_STAT_CACHE) = global_page_state(NR_FILE_PAGES) -
>                             sum_of_all_memcg(MEM_CGROUP_STAT_CACHE);
>
> RSS pages are accounted in a similar way.
>
Well,
the problem is that statistics are not the only cause of overhead. We
will still incur the whole charging operation, and the same for
uncharge. There is memory overhead from page_cgroup, etc.
So my view is that this patch is far from complete.
* Re: [PATCH 6/6] memcg: disable memcg page stat accounting
2013-03-12 10:11 ` [PATCH 6/6] memcg: disable memcg page stat accounting Sha Zhengju
@ 2013-03-20 7:09 ` Glauber Costa
0 siblings, 0 replies; 13+ messages in thread
From: Glauber Costa @ 2013-03-20 7:09 UTC (permalink / raw)
To: Sha Zhengju
Cc: cgroups, linux-mm, mhocko, kamezawa.hiroyu, akpm, mgorman,
Sha Zhengju
On 03/12/2013 02:11 PM, Sha Zhengju wrote:
> Use a jump label to patch the memcg page stat accounting code in or
> out depending on whether memcg is in use: when the first non-root memcg
> comes to life the code is patched in; otherwise it stays patched out.
>
> Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
> ---
> include/linux/memcontrol.h | 23 +++++++++++++++++++++++
> mm/memcontrol.c | 34 +++++++++++++++++++++++++++++++++-
> 2 files changed, 56 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index d6183f0..99dca91 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -42,6 +42,14 @@ struct mem_cgroup_reclaim_cookie {
> };
>
> #ifdef CONFIG_MEMCG
> +
> +extern struct static_key memcg_in_use_key;
> +
> +static inline bool mem_cgroup_in_use(void)
> +{
> + return static_key_false(&memcg_in_use_key);
> +}
> +
I believe the big advantage of the approach I've taken, which folds this
test into mem_cgroup_disabled(), is that we patch out a lot of things for
free.
We just need to be careful, because some code expected that decision to
be permanent and now that status can change.
But I would still advocate for it.
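(A rough sketch of the alternative Glauber describes, not his actual patch:
fold the in-use test into mem_cgroup_disabled() so every existing caller is
patched out as well. The memcg_in_use_key name comes from this series; the
mem_cgroup_subsys.disabled test is an assumption about the helper of that era.)

	static inline bool mem_cgroup_disabled(void)
	{
		if (mem_cgroup_subsys.disabled)
			return true;
		/* no non-root memcg has been created yet */
		if (!static_key_false(&memcg_in_use_key))
			return true;
		return false;
	}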