* [PATCH] netns: correctly use per-netns ipv4 sysctl_tcp_mem
@ 2012-07-09 6:05 Huang Qiang
[not found] ` <4FFA7495.5070702-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 3+ messages in thread
From: Huang Qiang @ 2012-07-09 6:05 UTC (permalink / raw)
To: davem, glommer; +Cc: netdev, containers, yangzhenzhang
From: Yang Zhenzhang <yangzhenzhang@huawei.com>
Now, kernel allows each net namespace to independently set up its levels
for tcp memory pressure thresholds.
But it seems there is a bug, as using the following steps:
[root@host socket]# lxc-start -n test -f config /bin/bash
[root@net-test socket]# ip route add default via 192.168.58.2
[root@net-test socket]# echo 0 0 0 > /proc/sys/net/ipv4/tcp_mem
[root@net-test socket]# scp root@192.168.58.174:/home/tcp_mem_test .
and it still can transport the "tcp_mem_test" file which we hope it
would not.
It's because inet_init() (net/ipv4/af_inet.c)initialize the
tcp_prot.sysctl_mem:
tcp_prot.sysctl_mem = init_net.ipv4.sysctl_tcp_mem;
So when the protocal is TCP, sk->sk_prot->sysctl_mem(following code)
always use the ipv4 sysctl_tcp_mem of init_net namespace rather than
it's own net namespace.
This patch simply set "prot" equal to net->ipv4.sysctl_tcp_mem when
the protocol type is TCP.
Signed-off-by: Yang Zhenzhang <yangzhenzhang@huawei.com>
---
include/net/sock.h | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 4a45216..b62a8d9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -59,6 +59,7 @@
#include <linux/static_key.h>
#include <linux/aio.h>
#include <linux/sched.h>
+#include <linux/in.h>
#include <linux/filter.h>
#include <linux/rculist_nulls.h>
@@ -1062,7 +1063,12 @@ static inline void sk_enter_memory_pressure(struct sock *sk)
static inline long sk_prot_mem_limits(const struct sock *sk, int index)
{
+ struct net *net = sock_net(sk);
long *prot = sk->sk_prot->sysctl_mem;
+
+ if (sk->protocol == IPPROTO_TCP)
+ prot = net->ipv4.sysctl_tcp_mem;
+
if (mem_cgroup_sockets_enabled && sk->sk_cgrp)
prot = sk->sk_cgrp->sysctl_mem;
return prot[index];
--
1.7.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] netns: correctly use per-netns ipv4 sysctl_tcp_mem
[not found] ` <4FFA7495.5070702-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2012-07-09 22:21 ` David Miller
[not found] ` <20120709.152100.571089964662155300.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
0 siblings, 1 reply; 3+ messages in thread
From: David Miller @ 2012-07-09 22:21 UTC (permalink / raw)
To: h.huangqiang-hv44wF8Li93QT0dZR+AlfA
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
From: Huang Qiang <h.huangqiang@huawei.com>
Date: Mon, 9 Jul 2012 14:05:09 +0800
> From: Yang Zhenzhang <yangzhenzhang@huawei.com>
>
> Now, kernel allows each net namespace to independently set up its levels
> for tcp memory pressure thresholds.
>
> But it seems there is a bug, as using the following steps:
>
> [root@host socket]# lxc-start -n test -f config /bin/bash
> [root@net-test socket]# ip route add default via 192.168.58.2
> [root@net-test socket]# echo 0 0 0 > /proc/sys/net/ipv4/tcp_mem
> [root@net-test socket]# scp root@192.168.58.174:/home/tcp_mem_test .
>
> and it still can transport the "tcp_mem_test" file which we hope it
> would not.
>
> It's because inet_init() (net/ipv4/af_inet.c)initialize the
> tcp_prot.sysctl_mem:
> tcp_prot.sysctl_mem = init_net.ipv4.sysctl_tcp_mem;
>
> So when the protocal is TCP, sk->sk_prot->sysctl_mem(following code)
> always use the ipv4 sysctl_tcp_mem of init_net namespace rather than
> it's own net namespace.
> This patch simply set "prot" equal to net->ipv4.sysctl_tcp_mem when
> the protocol type is TCP.
>
> Signed-off-by: Yang Zhenzhang <yangzhenzhang@huawei.com>
Another regression added by the socket memory cgroup code, BIG
SURPRISE.
But this patch is even worse than the bug.
> long *prot = sk->sk_prot->sysctl_mem;
> +
^^^^^^^^
Trailing whitespace please remove.
Also, this patch causes build failures:
In file included from include/linux/tcp.h:227:0,
from include/linux/ipv6.h:221,
from include/net/ipv6.h:16,
from include/linux/sunrpc/clnt.h:26,
from include/linux/nfs_fs.h:57,
from init/do_mounts.c:30:
include/net/sock.h: In function ‘sk_prot_mem_limits’:
include/net/sock.h:1066:9: error: implicit declaration of function ‘sock_net’ [-Werror=implicit-function-declaration]
include/net/sock.h:1066:20: warning: initialization makes pointer from integer without a cast [enabled by default]
include/net/sock.h:1069:8: error: ‘const struct sock’ has no member named ‘protocol’
include/net/sock.h: At top level:
include/net/sock.h:2136:13: error: conflicting types for ‘sock_net’
include/net/sock.h:1066:20: note: previous implicit declaration of ‘sock_net’ was here
cc1: some warnings being treated as errors
It is basically impossible that you even compile tested this patch
because you're using an interface from the same header file before
it's even defined.
This is an incredibly poor patch submission, I hate to tell you.
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] netns: correctly use per-netns ipv4 sysctl_tcp_mem
[not found] ` <20120709.152100.571089964662155300.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-07-20 22:22 ` Glauber Costa
0 siblings, 0 replies; 3+ messages in thread
From: Glauber Costa @ 2012-07-20 22:22 UTC (permalink / raw)
To: David Miller
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On 07/09/2012 07:21 PM, David Miller wrote:
> From: Huang Qiang <h.huangqiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Date: Mon, 9 Jul 2012 14:05:09 +0800
>
>> From: Yang Zhenzhang <yangzhenzhang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>>
>> Now, kernel allows each net namespace to independently set up its levels
>> for tcp memory pressure thresholds.
>>
>> But it seems there is a bug, as using the following steps:
>>
>> [root@host socket]# lxc-start -n test -f config /bin/bash
>> [root@net-test socket]# ip route add default via 192.168.58.2
>> [root@net-test socket]# echo 0 0 0 > /proc/sys/net/ipv4/tcp_mem
>> [root@net-test socket]# scp root-Q0ErXNX1RuabR28l3DCWlg@public.gmane.org:/home/tcp_mem_test .
>>
>> and it still can transport the "tcp_mem_test" file which we hope it
>> would not.
>>
>> It's because inet_init() (net/ipv4/af_inet.c)initialize the
>> tcp_prot.sysctl_mem:
>> tcp_prot.sysctl_mem = init_net.ipv4.sysctl_tcp_mem;
>>
>> So when the protocal is TCP, sk->sk_prot->sysctl_mem(following code)
>> always use the ipv4 sysctl_tcp_mem of init_net namespace rather than
>> it's own net namespace.
>> This patch simply set "prot" equal to net->ipv4.sysctl_tcp_mem when
>> the protocol type is TCP.
>>
>> Signed-off-by: Yang Zhenzhang <yangzhenzhang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>
> Another regression added by the socket memory cgroup code, BIG
> SURPRISE.
>
Back from vacations: If I understand the submission correctly, this is
not a regression, since it seems to be only happening when those values
are set inside the network namespace - which was not possible before.
In any case, I believe from what I can see that the fix is already in
the way (haven't seen the whole backlog yet)
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2012-07-20 22:22 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-07-09 6:05 [PATCH] netns: correctly use per-netns ipv4 sysctl_tcp_mem Huang Qiang
[not found] ` <4FFA7495.5070702-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2012-07-09 22:21 ` David Miller
[not found] ` <20120709.152100.571089964662155300.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-07-20 22:22 ` Glauber Costa
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).