netdev.vger.kernel.org archive mirror
From: Eric Dumazet <eric.dumazet@gmail.com>
To: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>,
	Linux Netdev List <netdev@vger.kernel.org>,
	Wei Yongjun <yjwei@cn.fujitsu.com>,
	Takahiro Yasui <tyasui@redhat.com>, Hideo Aoki <haoki@redhat.com>
Subject: [PATCH] udp: Fix udp_poll() and ioctl()
Date: Fri, 09 Oct 2009 16:43:40 +0200
Message-ID: <4ACF4C1C.4050505@gmail.com>
In-Reply-To: <4ACCB6BE.5040602@gmail.com>

Eric Dumazet wrote:
> Eric Dumazet wrote:
>> Eric Dumazet wrote:
>>> Eric Dumazet wrote:
>>>> Rafael J. Wysocki wrote:
>>>>> This message has been generated automatically as a part of a report
>>>>> of regressions introduced between 2.6.30 and 2.6.31.
>>>>>
>>>>> The following bug entry is on the current list of known regressions
>>>>> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
>>>>> be listed and let me know (either way).
>>>>>
>>>>>
>>>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14301
>>>>> Subject		: WARNING: at net/ipv4/af_inet.c:154
>>>>> Submitter	: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>
>>>>> Date		: 2009-09-30 12:24 (2 days old)
>>>>> References	: http://marc.info/?l=linux-kernel&m=125431350218137&w=4
>>>>>
>> Investigation still needed...
>>
> 
> OK, my latest (possibly buggy) hunch is about commit 95766fff6b9a78d1
> 
> [UDP]: Add memory accounting.
> 
> (It's a two-year-old patch, oh well...)
> 
> The problem is in udp_poll():
> 
> We check that the first frame to be dequeued from sk_receive_queue has a
> good checksum.
> 
> If it doesn't, we drop the frame (calling kfree_skb(skb)).
> 
> The problem is that, now that we perform memory accounting on UDP, this
> kfree_skb() should be done with the socket locked, but are we allowed to
> call lock_sock() from this udp_poll() context?
> 

It seems we can lock_sock() from udp_poll() context, so here is a patch.

[PATCH] udp: Fix udp_poll() and ioctl()

udp_poll() can, in some circumstances, drop frames with incorrect checksums.

The problem is that we now have to lock the socket while dropping frames,
or risk corrupting sk_forward_alloc.

This bug has been present since commit 95766fff6b9a78d1
([UDP]: Add memory accounting.)

While we are at it, we can correct ioctl(SIOCINQ) to also drop bad frames.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/ipv4/udp.c |   73 +++++++++++++++++++++++++++--------------------
 1 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6ec6a8a..d0d436d 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -841,6 +841,42 @@ out:
 	return ret;
 }
 
+
+/**
+ *	first_packet_length	- return length of first packet in receive queue
+ *	@sk: socket
+ *
+ *	Drops all bad checksum frames, until a valid one is found.
+ *	Returns the length of found skb, or 0 if none is found.
+ */
+static unsigned int first_packet_length(struct sock *sk)
+{
+	struct sk_buff_head list_kill, *rcvq = &sk->sk_receive_queue;
+	struct sk_buff *skb;
+	unsigned int res;
+
+	__skb_queue_head_init(&list_kill);
+
+	spin_lock_bh(&rcvq->lock);
+	while ((skb = skb_peek(rcvq)) != NULL &&
+		udp_lib_checksum_complete(skb)) {
+		UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS,
+				 IS_UDPLITE(sk));
+		__skb_unlink(skb, rcvq);
+		__skb_queue_tail(&list_kill, skb);
+	}
+	res = skb ? skb->len : 0;
+	spin_unlock_bh(&rcvq->lock);
+
+	if (!skb_queue_empty(&list_kill)) {
+		lock_sock(sk);
+		__skb_queue_purge(&list_kill);
+		sk_mem_reclaim_partial(sk);
+		release_sock(sk);
+	}
+	return res;
+}
+
 /*
  *	IOCTL requests applicable to the UDP protocol
  */
@@ -857,21 +893,16 @@ int udp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 
 	case SIOCINQ:
 	{
-		struct sk_buff *skb;
-		unsigned long amount;
+		unsigned int amount = first_packet_length(sk);
 
-		amount = 0;
-		spin_lock_bh(&sk->sk_receive_queue.lock);
-		skb = skb_peek(&sk->sk_receive_queue);
-		if (skb != NULL) {
+		if (amount)
 			/*
 			 * We will only return the amount
 			 * of this packet since that is all
 			 * that will be read.
 			 */
-			amount = skb->len - sizeof(struct udphdr);
-		}
-		spin_unlock_bh(&sk->sk_receive_queue.lock);
+			amount -= sizeof(struct udphdr);
+
 		return put_user(amount, (int __user *)arg);
 	}
 
@@ -1540,29 +1571,11 @@ unsigned int udp_poll(struct file *file, struct socket *sock, poll_table *wait)
 {
 	unsigned int mask = datagram_poll(file, sock, wait);
 	struct sock *sk = sock->sk;
-	int 	is_lite = IS_UDPLITE(sk);
 
 	/* Check for false positives due to checksum errors */
-	if ((mask & POLLRDNORM) &&
-	    !(file->f_flags & O_NONBLOCK) &&
-	    !(sk->sk_shutdown & RCV_SHUTDOWN)) {
-		struct sk_buff_head *rcvq = &sk->sk_receive_queue;
-		struct sk_buff *skb;
-
-		spin_lock_bh(&rcvq->lock);
-		while ((skb = skb_peek(rcvq)) != NULL &&
-		       udp_lib_checksum_complete(skb)) {
-			UDP_INC_STATS_BH(sock_net(sk),
-					UDP_MIB_INERRORS, is_lite);
-			__skb_unlink(skb, rcvq);
-			kfree_skb(skb);
-		}
-		spin_unlock_bh(&rcvq->lock);
-
-		/* nothing to see, move along */
-		if (skb == NULL)
-			mask &= ~(POLLIN | POLLRDNORM);
-	}
+	if ((mask & POLLRDNORM) && !(file->f_flags & O_NONBLOCK) &&
+	    !(sk->sk_shutdown & RCV_SHUTDOWN) && !first_packet_length(sk))
+		mask &= ~(POLLIN | POLLRDNORM);
 
 	return mask;
 
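
For readers outside the kernel tree, the two-phase idea behind
first_packet_length() can be sketched in userspace. This is only an
illustration under stated assumptions, not the kernel code: struct pkt,
struct rxsock, rxsock_init(), rxsock_enqueue() and rxsock_first_len() are
hypothetical names, a pthread mutex stands in for the receive queue
spinlock, a second mutex stands in for lock_sock()/release_sock(), and
mem_charged stands in for the socket memory accounting.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sk_buff. */
struct pkt {
	struct pkt *next;
	size_t len;
	bool csum_ok;
};

/* Hypothetical stand-in for a socket with a receive queue. */
struct rxsock {
	pthread_mutex_t queue_lock;	/* role of sk_receive_queue.lock */
	pthread_mutex_t owner_lock;	/* role of lock_sock()/release_sock() */
	struct pkt *head, *tail;
	size_t mem_charged;		/* only touched under owner_lock */
};

static void rxsock_init(struct rxsock *s)
{
	pthread_mutex_init(&s->queue_lock, NULL);
	pthread_mutex_init(&s->owner_lock, NULL);
	s->head = s->tail = NULL;
	s->mem_charged = 0;
}

/* Charge the packet to the socket, then append it to the queue. */
static int rxsock_enqueue(struct rxsock *s, size_t len, bool csum_ok)
{
	struct pkt *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->next = NULL;
	p->len = len;
	p->csum_ok = csum_ok;

	pthread_mutex_lock(&s->owner_lock);
	s->mem_charged += len;
	pthread_mutex_unlock(&s->owner_lock);

	pthread_mutex_lock(&s->queue_lock);
	if (s->tail)
		s->tail->next = p;
	else
		s->head = p;
	s->tail = p;
	pthread_mutex_unlock(&s->queue_lock);
	return 0;
}

/* Return the length of the first good packet, or 0 if there is none. */
static size_t rxsock_first_len(struct rxsock *s)
{
	struct pkt *kill = NULL, *p;
	size_t res;

	/* Phase 1: under the queue lock, only unlink bad packets. */
	pthread_mutex_lock(&s->queue_lock);
	while ((p = s->head) != NULL && !p->csum_ok) {
		s->head = p->next;
		if (!s->head)
			s->tail = NULL;
		p->next = kill;		/* move to the private kill list */
		kill = p;
	}
	res = p ? p->len : 0;
	pthread_mutex_unlock(&s->queue_lock);

	/* Phase 2: free and uncharge under the owner lock. */
	if (kill) {
		pthread_mutex_lock(&s->owner_lock);
		while ((p = kill) != NULL) {
			kill = p->next;
			assert(s->mem_charged >= p->len);
			s->mem_charged -= p->len;
			free(p);
		}
		pthread_mutex_unlock(&s->owner_lock);
	}
	return res;
}
```

The point of the split is that the two locks are never held together:
bad packets are only unlinked while the queue lock is held, and the
accounting is only updated later, under the lock that protects it.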

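From the application side, the SIOCINQ behaviour touched by this patch can
be observed with a small test helper. This is a sketch assuming Linux,
where udp(7) documents FIONREAD (SIOCINQ) as returning the size of the next
pending datagram, or 0 when none is pending. udp_loopback_socket() and
udp_next_datagram_len() are hypothetical helper names, and the socket is
connected to itself over loopback purely to keep the example short.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a UDP socket on an ephemeral loopback port, connected to itself. */
static int udp_loopback_socket(void)
{
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	/* Port 0: let the kernel choose, then read the choice back. */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &alen) < 0 ||
	    connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Wait up to timeout_ms for the socket to become readable, then ask for
 * the payload size of the next pending datagram.
 * Returns -1 on error and 0 when nothing is pending.
 */
static int udp_next_datagram_len(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int len = 0;

	if (poll(&pfd, 1, timeout_ms) < 0)
		return -1;
	/* On Linux, SIOCINQ is defined as FIONREAD. */
	if (ioctl(fd, FIONREAD, &len) < 0)
		return -1;
	return len;
}
```

With two datagrams queued, the helper is expected to report only the
length of the first one, since that is all a single recv() would return.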
