netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>, <netdev@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <kernel-team@fb.com>,
	"David S . Miller" <davem@davemloft.net>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>
Subject: [PATCH net] Revert "defer call to mem_cgroup_sk_alloc()"
Date: Fri, 2 Feb 2018 19:54:02 +0000	[thread overview]
Message-ID: <20180202195357.GA8169@castle.DHCP.thefacebook.com> (raw)
In-Reply-To: <1517600096.3715.138.camel@gmail.com>

On Fri, Feb 02, 2018 at 11:34:56AM -0800, Eric Dumazet wrote:
> On Fri, 2018-02-02 at 19:04 +0000, Roman Gushchin wrote:
> > On Fri, Feb 02, 2018 at 10:39:04AM -0800, Eric Dumazet wrote:
> > > On Fri, 2018-02-02 at 18:06 +0000, Roman Gushchin wrote:
> > > > 
> > > > Idk, how even we can hit it? And if so, what scary will happen?
> > > > 
> > > > If you prefer to have it there, I definitely can return it,
> > > > but I see no profit so far.
> > > 
> > > I was simply curious this was not mentioned in the changelog.
> > > 
> > > A revert is normally a true revert, modulo the changes needed by
> > > conflicts and possible changes.
> > > 
> > > I personally do not care of this BUG_ON(), I had not put it in the
> > > first place.
> > 
> > Technically it's not a true revert, but you're totally right.
> > Let me add a note to the commit description.
> > 
> > Are you ok with the rest?
> 
> Sure !
> 
> Thanks.

Hello, David!

Can you, please, pull the patch below?
It should be applied for 4.14+.

Thank you!

Roman

--

>From a0a07f65a38105562bf424d7dc072a2bc4f1569e Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro@fb.com>
Date: Fri, 2 Feb 2018 15:26:57 +0000
Subject: [PATCH net] Revert "defer call to mem_cgroup_sk_alloc()"

This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol:
defer call to mem_cgroup_sk_alloc()").

Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
memcg socket memory accounting, as packets received before memcg
pointer initialization are not accounted and are causing refcounting
underflow on socket release.

Actually the free-after-use problem was fixed by
commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in
sk_clone_lock()") for the cgroup pointer.

So, let's revert it and call mem_cgroup_sk_alloc() just before
cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
we're cloning, and it holds a reference to the memcg.

Also, let's drop BUG_ON(mem_cgroup_is_root()) check from
mem_cgroup_sk_alloc(). I see no reasons why bumping the root
memcg counter is a good reason to panic, and there are no realistic
ways to hit it.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
---
 mm/memcontrol.c                 | 14 ++++++++++++++
 net/core/sock.c                 |  5 +----
 net/ipv4/inet_connection_sock.c |  1 -
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0ae2dc3a1748..0937f2c52c7d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5747,6 +5747,20 @@ void mem_cgroup_sk_alloc(struct sock *sk)
 	if (!mem_cgroup_sockets_enabled)
 		return;
 
+	/*
+	 * Socket cloning can throw us here with sk_memcg already
+	 * filled. It won't however, necessarily happen from
+	 * process context. So the test for root memcg given
+	 * the current task's memcg won't help us in this case.
+	 *
+	 * Respecting the original socket's memcg is a better
+	 * decision in this case.
+	 */
+	if (sk->sk_memcg) {
+		css_get(&sk->sk_memcg->css);
+		return;
+	}
+
 	rcu_read_lock();
 	memcg = mem_cgroup_from_task(current);
 	if (memcg == root_mem_cgroup)
diff --git a/net/core/sock.c b/net/core/sock.c
index 1033f8ab0547..e50e7b3f2223 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1683,16 +1683,13 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 		newsk->sk_dst_pending_confirm = 0;
 		newsk->sk_wmem_queued	= 0;
 		newsk->sk_forward_alloc = 0;
-
-		/* sk->sk_memcg will be populated at accept() time */
-		newsk->sk_memcg = NULL;
-
 		atomic_set(&newsk->sk_drops, 0);
 		newsk->sk_send_head	= NULL;
 		newsk->sk_userlocks	= sk->sk_userlocks & ~SOCK_BINDPORT_LOCK;
 		atomic_set(&newsk->sk_zckey, 0);
 
 		sock_reset_flag(newsk, SOCK_DONE);
+		mem_cgroup_sk_alloc(newsk);
 		cgroup_sk_alloc(&newsk->sk_cgrp_data);
 
 		rcu_read_lock();
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 12410ec6f7f7..881ac6d046f2 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -475,7 +475,6 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
 		}
 		spin_unlock_bh(&queue->fastopenq.lock);
 	}
-	mem_cgroup_sk_alloc(newsk);
 out:
 	release_sock(sk);
 	if (req)
-- 
2.14.3

      reply	other threads:[~2018-02-02 19:54 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-02 16:57 [PATCH net] Revert "defer call to mem_cgroup_sk_alloc()" Roman Gushchin
2018-02-02 17:59 ` Eric Dumazet
2018-02-02 18:06   ` Roman Gushchin
2018-02-02 18:39     ` Eric Dumazet
2018-02-02 19:04       ` Roman Gushchin
2018-02-02 19:34         ` Eric Dumazet
2018-02-02 19:54           ` Roman Gushchin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180202195357.GA8169@castle.DHCP.thefacebook.com \
    --to=guro@fb.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).