public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: anton@samba.org, netdev@vger.kernel.org
Subject: [PATCH] net: sock_queue_err_skb() dont mess with sk_forward_alloc
Date: Mon, 31 May 2010 18:02:46 +0200	[thread overview]
Message-ID: <1275321766.3291.100.camel@edumazet-laptop> (raw)
In-Reply-To: <20100529.002124.149815573.davem@davemloft.net>

Le samedi 29 mai 2010 à 00:21 -0700, David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 26 May 2010 12:12:56 +0200
> 
> > [PATCH] net: fix sk_forward_alloc corruptions
> > 
> > As David found out, sock_queue_err_skb() should be called with socket
> > lock hold, or we risk sk_forward_alloc corruption, since we use non
> > atomic operations to update this field.
> > 
> > This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots.
> > (BH already disabled)
> > 
> > 1) skb_tstamp_tx() 
> > 2) Before calling ip_icmp_error(), in __udp4_lib_err() 
> > 3) Before calling ipv6_icmp_error(), in __udp6_lib_err()
> > 
> > Reported-by: Anton Blanchard <anton@samba.org>
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> This wasn't the direct cause of Anton's problems but is
> a serious legitimate bug.
> 
> So, applied, thanks!

Thats embarrassing...

I believe there is still a problem with sock_queue_err_skb(), sorry
Dave :(

There is also a problem in ip_recv_error(), not called with socket
locked, skb freed -> potential corruption.

If current socket is 'owned' by a user thread, then we can still corrupt
sk_forward_alloc, even if we use bh_lock_sock()

I dont think we need to have another backlog for such case, maybe we
could account for skb->truesize in sk_rmem_alloc (this is atomic), and
not account for sk_mem_charge ?

Another possibility would be to store in skb the backlog function
pointer, so that backlog is generic (normal packets and error packets
handled in same backlog queue), instead of using a protocol provided
pointer. But thats more complex and error prone.

Thanks

[PATCH] net: sock_queue_err_skb() dont mess with sk_forward_alloc

Correct sk_forward_alloc handling for error_queue would need to use a
backlog of frames that softirq handler could not deliver because socket
is owned by user thread. Or extend backlog processing to be able to
process normal and error packets.

Another possibility is to not use mem charge for error queue, this is
what I implemented in this patch.

Note: this reverts commit 29030374
(net: fix sk_forward_alloc corruptions), since we dont need to lock
socket anymore.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/sock.h |   15 +--------------
 net/core/skbuff.c  |   30 ++++++++++++++++++++++++++++--
 net/ipv4/udp.c     |    6 ++----
 net/ipv6/udp.c     |    6 ++----
 4 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index ca241ea..731150d 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1524,20 +1524,7 @@ extern void sk_stop_timer(struct sock *sk, struct timer_list* timer);
 
 extern int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
 
-static inline int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb)
-{
-	/* Cast skb->rcvbuf to unsigned... It's pointless, but reduces
-	   number of warnings when compiling with -W --ANK
-	 */
-	if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
-	    (unsigned)sk->sk_rcvbuf)
-		return -ENOMEM;
-	skb_set_owner_r(skb, sk);
-	skb_queue_tail(&sk->sk_error_queue, skb);
-	if (!sock_flag(sk, SOCK_DEAD))
-		sk->sk_data_ready(sk, skb->len);
-	return 0;
-}
+extern int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb);
 
 /*
  *	Recover an error report and clear atomically
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 4e7ac09..9f07e74 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2965,6 +2965,34 @@ int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer)
 }
 EXPORT_SYMBOL_GPL(skb_cow_data);
 
+static void sock_rmem_free(struct sk_buff *skb)
+{
+	struct sock *sk = skb->sk;
+
+	atomic_sub(skb->truesize, &sk->sk_rmem_alloc);
+}
+
+/*
+ * Note: We dont mem charge error packets (no sk_forward_alloc changes)
+ */
+int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb)
+{
+	if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
+	    (unsigned)sk->sk_rcvbuf)
+		return -ENOMEM;
+
+	skb_orphan(skb);
+	skb->sk = sk;
+	skb->destructor = sock_rmem_free;
+	atomic_add(skb->truesize, &sk->sk_rmem_alloc);
+
+	skb_queue_tail(&sk->sk_error_queue, skb);
+	if (!sock_flag(sk, SOCK_DEAD))
+		sk->sk_data_ready(sk, skb->len);
+	return 0;
+}
+EXPORT_SYMBOL(sock_queue_err_skb);
+
 void skb_tstamp_tx(struct sk_buff *orig_skb,
 		struct skb_shared_hwtstamps *hwtstamps)
 {
@@ -2997,9 +3025,7 @@ void skb_tstamp_tx(struct sk_buff *orig_skb,
 	serr->ee.ee_errno = ENOMSG;
 	serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
 
-	bh_lock_sock(sk);
 	err = sock_queue_err_skb(sk, skb);
-	bh_unlock_sock(sk);
 
 	if (err)
 		kfree_skb(skb);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 50678f9..eec4ff4 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -633,11 +633,9 @@ void __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
 	if (!inet->recverr) {
 		if (!harderr || sk->sk_state != TCP_ESTABLISHED)
 			goto out;
-	} else {
-		bh_lock_sock(sk);
+	} else
 		ip_icmp_error(sk, skb, err, uh->dest, info, (u8 *)(uh+1));
-		bh_unlock_sock(sk);
-	}
+
 	sk->sk_err = err;
 	sk->sk_error_report(sk);
 out:
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3048f90..87be586 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -466,11 +466,9 @@ void __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	if (sk->sk_state != TCP_ESTABLISHED && !np->recverr)
 		goto out;
 
-	if (np->recverr) {
-		bh_lock_sock(sk);
+	if (np->recverr)
 		ipv6_icmp_error(sk, skb, err, uh->dest, ntohl(info), (u8 *)(uh+1));
-		bh_unlock_sock(sk);
-	}
+
 	sk->sk_err = err;
 	sk->sk_error_report(sk);
 out:



  reply	other threads:[~2010-05-31 16:08 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-25 11:58 Warning in net/ipv4/af_inet.c:154 Anton Blanchard
2010-05-25 15:27 ` Eric Dumazet
2010-05-26  3:19   ` Anton Blanchard
2010-05-26  5:18     ` Eric Dumazet
2010-05-26  7:56     ` David Miller
2010-05-26 10:12       ` Eric Dumazet
2010-05-27  3:56         ` Anton Blanchard
2010-05-27  4:06           ` David Miller
2010-05-27  4:21             ` Eric Dumazet
2010-05-27  4:18           ` Eric Dumazet
2010-05-27  4:21             ` David Miller
2010-05-27  5:06               ` [PATCH] net: fix lock_sock_bh/unlock_sock_bh Eric Dumazet
2010-05-27  5:20                 ` Eric Dumazet
2010-05-27  5:23                   ` David Miller
2010-05-27  6:09                     ` Anton Blanchard
2010-05-27  7:29                       ` David Miller
2010-05-29  7:21         ` Warning in net/ipv4/af_inet.c:154 David Miller
2010-05-31 16:02           ` Eric Dumazet [this message]
2010-06-01  6:44             ` [PATCH] net: sock_queue_err_skb() dont mess with sk_forward_alloc David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1275321766.3291.100.camel@edumazet-laptop \
    --to=eric.dumazet@gmail.com \
    --cc=anton@samba.org \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox