netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: eric.dumazet@gmail.com
Cc: netdev@vger.kernel.org, Vasily Averin <vvs@virtuozzo.com>,
	Christoph Paasch <christoph.paasch@gmail.com>,
	Hao Sun <sunhao.th@gmail.com>, Jakub Kicinski <kuba@kernel.org>
Subject: [RFC net v7] net: skb_expand_head() adjust skb->truesize incorrectly
Date: Fri, 17 Sep 2021 09:24:18 -0700	[thread overview]
Message-ID: <20210917162418.1437772-1-kuba@kernel.org> (raw)

From: Vasily Averin <vvs@virtuozzo.com>

Christoph Paasch reports [1] about incorrect skb->truesize
after skb_expand_head() call in ip6_xmit.
This may happen because of two reasons:
 - skb_set_owner_w() for newly cloned skb is called too early,
   before pskb_expand_head() where truesize is adjusted for (!skb-sk) case.
 - pskb_expand_head() does not adjust truesize in (skb->sk) case.
   In this case sk->sk_wmem_alloc should be adjusted too.

Eric cautions us against increasing sk_wmem_alloc if the old
skb did not hold any wmem references.

[1] https://lkml.org/lkml/2021/8/20/1082

Fixes: f1260ff15a71 ("skbuff: introduce skb_expand_head()")
Reported-by: Christoph Paasch <christoph.paasch@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v7: - shift more magic into helpers
    - follow Eric's advice and don't inherit non-wmem sks for now

Looks like we stalled here, let me try to push this forward.
This builds, is it possible to repro without syzcaller?
Anyone willing to test?
---
 include/net/sock.h |  2 ++
 net/core/skbuff.c  | 50 +++++++++++++++++++++++++++++++++++-----------
 net/core/sock.c    | 10 ++++++++++
 3 files changed, 50 insertions(+), 12 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 66a9a90f9558..102e3e1009d1 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1707,6 +1707,8 @@ void sock_pfree(struct sk_buff *skb);
 #define sock_edemux sock_efree
 #endif
 
+bool is_skb_wmem(const struct sk_buff *skb);
+
 int sock_setsockopt(struct socket *sock, int level, int op,
 		    sockptr_t optval, unsigned int optlen);
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7c2ab27fcbf9..5093321c2b65 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1786,6 +1786,24 @@ struct sk_buff *skb_realloc_headroom(struct sk_buff *skb, unsigned int headroom)
 }
 EXPORT_SYMBOL(skb_realloc_headroom);
 
+static void skb_owner_inherit(struct sk_buff *nskb, struct sk_buff *oskb)
+{
+	if (is_skb_wmem(oskb))
+		skb_set_owner_w(nskb, oskb->sk);
+
+	/* handle rmem sock etc. as needed .. */
+}
+
+static void skb_increase_truesize(struct sk_buff *skb, unsigned int add)
+{
+	if (is_skb_wmem(skb))
+		refcount_add(add, &skb->sk->sk_wmem_alloc);
+	/* handle rmem sock etc. as needed .. */
+	WARN_ON(skb->destructor == sock_rfree);
+
+	skb->truesize += add;
+}
+
 /**
  *	skb_expand_head - reallocate header of &sk_buff
  *	@skb: buffer to reallocate
@@ -1801,6 +1819,7 @@ EXPORT_SYMBOL(skb_realloc_headroom);
 struct sk_buff *skb_expand_head(struct sk_buff *skb, unsigned int headroom)
 {
 	int delta = headroom - skb_headroom(skb);
+	int osize = skb_end_offset(skb);
 
 	if (WARN_ONCE(delta <= 0,
 		      "%s is expecting an increase in the headroom", __func__))
@@ -1810,21 +1829,28 @@ struct sk_buff *skb_expand_head(struct sk_buff *skb, unsigned int headroom)
 	if (skb_shared(skb)) {
 		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
 
-		if (likely(nskb)) {
-			if (skb->sk)
-				skb_set_owner_w(nskb, skb->sk);
-			consume_skb(skb);
-		} else {
-			kfree_skb(skb);
-		}
+		if (unlikely(!nskb))
+			goto err_free;
+
+		skb_owner_inherit(nskb, skb);
+		consume_skb(skb);
 		skb = nskb;
 	}
-	if (skb &&
-	    pskb_expand_head(skb, SKB_DATA_ALIGN(delta), 0, GFP_ATOMIC)) {
-		kfree_skb(skb);
-		skb = NULL;
-	}
+
+	if (pskb_expand_head(skb, SKB_DATA_ALIGN(delta), 0, GFP_ATOMIC))
+		goto err_free;
+	delta = skb_end_offset(skb) - osize;
+
+	/* pskb_expand_head() will adjust truesize itself for non-sk cases
+	 * todo: move the adjustment there at some point?
+	 */
+	if (skb->sk && skb->destructor != sock_edemux)
+		skb_increase_truesize(skb, delta);
+
 	return skb;
+err_free:
+	kfree_skb(skb);
+	return NULL;
 }
 EXPORT_SYMBOL(skb_expand_head);
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 62627e868e03..1483b4f755ef 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2227,6 +2227,16 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
 }
 EXPORT_SYMBOL(skb_set_owner_w);
 
+/* Should clones of this skb count towards skb->sk->sk_wmem_alloc
+ * and use sock_wfree() as their destructor?
+ */
+bool is_skb_wmem(const struct sk_buff *skb)
+{
+	return skb->destructor == sock_wfree ||
+		skb->destructor == __sock_wfree ||
+		(IS_ENABLED(CONFIG_INET) && skb->destructor == tcp_wfree);
+}
+
 static bool can_skb_orphan_partial(const struct sk_buff *skb)
 {
 #ifdef CONFIG_TLS_DEVICE
-- 
2.31.1


             reply	other threads:[~2021-09-17 16:24 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-17 16:24 Jakub Kicinski [this message]
2021-09-17 18:15 ` [RFC net v7] net: skb_expand_head() adjust skb->truesize incorrectly Vasily Averin
2021-09-18 10:05 ` Vasily Averin
2021-09-20 18:12   ` Jakub Kicinski
2021-09-20 21:41     ` Vasily Averin
2021-09-21  0:39       ` Jakub Kicinski
2021-09-21  6:36         ` Vasily Averin
2021-09-21 21:25           ` Jakub Kicinski
2021-09-20 21:41     ` [PATCH net v8] " Vasily Averin
2021-09-21  5:21       ` Vasily Averin
2021-10-04 13:00         ` [PATCH net v9] " Vasily Averin
2021-10-04 13:14           ` Vasily Averin
2021-10-04 19:26           ` Eric Dumazet
2021-10-05  5:57             ` Vasily Averin
2021-10-20 16:14               ` Eric Dumazet
2021-10-20 16:18               ` Eric Dumazet
2021-10-22 10:28                 ` [PATCH net v10] " Vasily Averin
2021-10-22 19:32                   ` Eric Dumazet
2021-10-22 20:50                   ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210917162418.1437772-1-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=christoph.paasch@gmail.com \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=sunhao.th@gmail.com \
    --cc=vvs@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).