From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Glauber Costa <glommer@parallels.com>,
netdev@vger.kernel.org, David Miller <davem@davemloft.net>,
Andrew Morton <akpm@linux-foundation.org>
Subject: [BUGFIX][PATCH 2/3] memcg/tcp: remove static_branch_slow_dec() at changing limit
Date: Thu, 29 Mar 2012 16:07:45 +0900
Message-ID: <4F740A41.6040002@jp.fujitsu.com>
In-Reply-To: <4F7408B7.9090706@jp.fujitsu.com>
tcp memcontrol uses a static branch to optimize the limit=RESOURCE_MAX case.
If every cgroup's limit is RESOURCE_MAX, resource usage is not accounted.
But this is buggy now.
For example, run the following
# while sleep 1;do
echo 9223372036854775807 > /cgroup/memory/A/memory.kmem.tcp.limit_in_bytes;
echo 300M > /cgroup/memory/A/memory.kmem.tcp.limit_in_bytes;
done
and run a network application under A. TCP usage is sometimes accounted
and sometimes not, because the static branch is toggled frequently.
Eventually tcp.usage_in_bytes becomes inconsistent, and WARN_ON() fires
because res_counter->usage would go below 0.
==
kernel: ------------[ cut here ]----------
kernel: WARNING: at kernel/res_counter.c:96 res_counter_uncharge_locked+0x37/0x40()
<snip>
kernel: Pid: 17753, comm: bash Tainted: G W 3.3.0+ #99
kernel: Call Trace:
kernel: <IRQ> [<ffffffff8104cc9f>] warn_slowpath_common+0x7f/0xc0
kernel: [<ffffffff810d7e88>] ? rb_reserve__next_event+0x68/0x470
kernel: [<ffffffff8104ccfa>] warn_slowpath_null+0x1a/0x20
kernel: [<ffffffff810b4e37>] res_counter_uncharge_locked+0x37/0x40
...
==
This patch removes the static_key_slow_dec() call made when the
res_counter limit is changed to RESOURCE_MAX. With this change, once
accounting has started, it continues until the tcp cgroup is destroyed.
I think this will not be a problem in real use.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/net/tcp_memcontrol.h | 1 +
net/ipv4/tcp_memcontrol.c | 24 ++++++++++++++++++------
2 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/include/net/tcp_memcontrol.h b/include/net/tcp_memcontrol.h
index 48410ff..f47e3c7 100644
--- a/include/net/tcp_memcontrol.h
+++ b/include/net/tcp_memcontrol.h
@@ -9,6 +9,7 @@ struct tcp_memcontrol {
/* those two are read-mostly, leave them at the end */
long tcp_prot_mem[3];
int tcp_memory_pressure;
+ bool accounting;
};
struct cg_proto *tcp_proto_cgroup(struct mem_cgroup *memcg);
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index 32764a6..cd0b47d 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -49,6 +49,20 @@ static void memcg_tcp_enter_memory_pressure(struct sock *sk)
}
EXPORT_SYMBOL(memcg_tcp_enter_memory_pressure);
+static void tcp_start_accounting(struct tcp_memcontrol *tcp)
+{
+ if (tcp->accounting)
+ return;
+ tcp->accounting = true;
+ static_key_slow_inc(&memcg_socket_limit_enabled);
+}
+
+static void tcp_end_accounting(struct tcp_memcontrol *tcp)
+{
+ if (tcp->accounting)
+ static_key_slow_dec(&memcg_socket_limit_enabled);
+}
+
int tcp_init_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss)
{
/*
@@ -73,6 +87,7 @@ int tcp_init_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss)
tcp->tcp_prot_mem[1] = net->ipv4.sysctl_tcp_mem[1];
tcp->tcp_prot_mem[2] = net->ipv4.sysctl_tcp_mem[2];
tcp->tcp_memory_pressure = 0;
+ tcp->accounting = false;
parent_cg = tcp_prot.proto_cgroup(parent);
if (parent_cg && mem_cgroup_use_hierarchy(parent))
@@ -110,8 +125,7 @@ void tcp_destroy_cgroup(struct cgroup *cgrp)
val = res_counter_read_u64(&tcp->tcp_memory_allocated, RES_LIMIT);
- if (val != RESOURCE_MAX)
- static_key_slow_dec(&memcg_socket_limit_enabled);
+ tcp_end_accounting(tcp);
}
EXPORT_SYMBOL(tcp_destroy_cgroup);
@@ -142,10 +156,8 @@ static int tcp_update_limit(struct mem_cgroup *memcg, u64 val)
tcp->tcp_prot_mem[i] = min_t(long, val >> PAGE_SHIFT,
net->ipv4.sysctl_tcp_mem[i]);
- if (val == RESOURCE_MAX && old_lim != RESOURCE_MAX)
- static_key_slow_dec(&memcg_socket_limit_enabled);
- else if (old_lim == RESOURCE_MAX && val != RESOURCE_MAX)
- static_key_slow_inc(&memcg_socket_limit_enabled);
+ if (old_lim == RESOURCE_MAX && val != RESOURCE_MAX)
+ tcp_start_accounting(tcp);
return 0;
}
--
1.7.4.1