From mboxrd@z Thu Jan 1 00:00:00 1970 From: Roman Gushchin Subject: Re: [PATCH net] net: memcontrol: charge allocated memory after mem_cgroup_sk_alloc() Date: Wed, 31 Jan 2018 21:54:08 +0000 Message-ID: <20180131215401.GA8956@castle> References: <20180125001911.15597-1-guro@fb.com> <20180125.120302.1117695034222616751.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: , , , , , To: David Miller Return-path: Content-Disposition: inline In-Reply-To: <20180125.120302.1117695034222616751.davem@davemloft.net> Sender: linux-kernel-owner@vger.kernel.org List-Id: netdev.vger.kernel.org On Thu, Jan 25, 2018 at 12:03:02PM -0500, David Miller wrote: > From: Roman Gushchin > Date: Thu, 25 Jan 2018 00:19:11 +0000 > > > @@ -476,6 +477,10 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern) > > spin_unlock_bh(&queue->fastopenq.lock); > > } > > mem_cgroup_sk_alloc(newsk); > > + amt = sk_memory_allocated(newsk); > > + if (amt && newsk->sk_memcg) > > + mem_cgroup_charge_skmem(newsk->sk_memcg, amt); > > + > > This looks confusing to me. > > sk_memory_allocated() is the total amount of memory used by all > sockets for a particular "struct proto", not just for that specific > socket. > > Maybe I don't understand how this socket memcg stuff works, but it > seems like you should be looking instead at how much memory is > allocated to this specific socket. So, the patch below takes the per-socket charge into account, and it _almost_ works: css leak is weaker by a couple orders of magnitude, but still exists. I believe, the problem is that we need additional synchronization for sk_memcg and sk_forward_alloc fields; and I'm really out of ideas how to do it without heavy artillery like introducing a new field for unaccounted memcg charge. As I can see, we check the sk_memcg field without socket lock; and we do set it from a concurrent context. Most likely, I do miss something... So I really start thinking that reverting 9f1c2674b328 ("net: memcontrol: defer call to mem_cgroup_sk_alloc()") and fixing the original issue differently might be easier and a proper way to go. Does it makes sense? Thanks! -- diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 4ca46dc08e63..287de1501a30 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -476,6 +476,12 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern) spin_unlock_bh(&queue->fastopenq.lock); } mem_cgroup_sk_alloc(newsk); + if (mem_cgroup_sockets_enabled && newsk->sk_memcg) { + int amt = sk_mem_pages(newsk->sk_forward_alloc); + if (amt > 0) + mem_cgroup_charge_skmem(newsk->sk_memcg, amt); + } + out: release_sock(sk); if (req)