* Re: [Bugme-new] [Bug 8536] New: Kernel drops UDP packets silently when reading from certain proc file entries
       [not found] <200705241948.l4OJmTAe031670@fire-2.osdl.org>
From: Andrew Morton @ 2007-05-24 19:59 UTC
To: netdev; +Cc: bugme-daemon@kernel-bugs.osdl.org, andsve

On Thu, 24 May 2007 12:48:29 -0700 bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8536
>
>            Summary: Kernel drops UDP packets silently when reading from
>                     certain proc file entries
>     Kernel Version: 2.6.x
>             Status: NEW
>           Severity: high
>              Owner: acme@ghostprotocols.net
>          Submitter: andsve@gmail.com
>
>
> Most recent kernel where this bug did *NOT* occur:
> I do not know, but I know that it exists in RHEL4 2.6.9.x kernels
> Distribution:
> All
> Hardware Environment:
> Multi core SMP
> Software Environment:
> All
> Problem Description:
> It is possible to introduce UDP packet losses by reading
> the proc file entry /proc/net/tcp. The really strange thing is that
> the error counters for packet drops are not increased.
> This means that the kernel introduces "silent" packet drops by just reading a
> proc statistics entry, which is not a good thing! It can most probably be used for
> denial of service attacks from non-root users.
>
> When looking at the network code it does not seem possible that silent packet
> drops can occur, so it is probably a quite nasty kernel bug.
>
>
> Steps to reproduce:
>
> * Send high speed RTP/UDP multicast traffic towards the system, 50Mbit/s.
>
> * Receive the RTP packets and check/validate the RTP counters and print out when
> the counter is not continuous.
>
> * Do a while loop cat-ing from /proc/net/tcp and see the packets being
> dropped but not accounted for in the counter statistics.
>
> I have reproduced this behavior on all our systems ranging from dual to quad
> core Xeon and Opteron and also on different OS releases, RHEL4, RHEL5, Fedora
> Core 5 and 6
>

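The reproduction steps in the report above can be sketched in user space roughly as follows. This is a minimal sketch, not the submitter's actual test program: the helper names rtp_seq(), rtp_missing() and drain_file() are hypothetical, the sequence-number offset follows the standard RTP fixed header, and reordered or duplicated packets are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* The RTP fixed header is 12 bytes; the 16-bit sequence number sits in
 * bytes 2..3 in network byte order. */
#define RTP_HDR_LEN 12

/* Extract the RTP sequence number, or return -1 if the packet is too short. */
static int rtp_seq(const unsigned char *pkt, size_t len)
{
	if (len < RTP_HDR_LEN)
		return -1;
	return (pkt[2] << 8) | pkt[3];
}

/* Number of packets missing between two consecutively received sequence
 * numbers, allowing for 16-bit wraparound. 0 means "no gap". */
static unsigned int rtp_missing(uint16_t prev, uint16_t cur)
{
	return (uint16_t)(cur - prev - 1);
}

/* Read a file (for example /proc/net/tcp) once from start to end.
 * Returns the number of bytes read, or -1 if it cannot be opened. */
static long drain_file(const char *path)
{
	char buf[4096];
	long total = 0;
	size_t n;
	FILE *f;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
		total += (long)n;
	fclose(f);
	return total;
}
```

A receiver would call rtp_seq() on every datagram it gets from recvfrom() and print a message whenever rtp_missing() is non-zero, while a second process calls drain_file("/proc/net/tcp") in a loop.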
* Re: [Bugme-new] [Bug 8536] New: Kernel drops UDP packets silently when reading from certain proc file entries
From: Herbert Xu @ 2007-05-25 6:23 UTC
To: Andrew Morton, davem; +Cc: netdev, bugme-daemon, andsve

Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> It is possible to introduce UDP packet losses by reading
>> the proc file entry /proc/net/tcp. The really strange thing is that
>> the error counters for packet drops are not increased.

Please try this patch and let us know if it helps.

[TCPv4]: Improve BH latency in /proc/net/tcp

Currently the code for /proc/net/tcp disables BH while iterating
over the entire established hash table.  Even though we call
cond_resched_softirq for each entry, we still won't process
softirqs as regularly as we would otherwise do, which results
in poor performance when the system is loaded near capacity.

This anomaly comes from the 2.4 code where this was all in a
single function and the local_bh_disable might have made sense
as a small optimisation.

The cost of each local_bh_disable is so small when compared
against the increased latency in keeping it disabled over a
large but mostly empty TCP established hash table that we
should just move it to the individual read_lock/read_unlock
calls as we do in inet_diag.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5a3e7f8..9dab06d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2039,10 +2039,7 @@ static void *established_get_first(struct seq_file *seq)
 		struct hlist_node *node;
 		struct inet_timewait_sock *tw;
 
-		/* We can reschedule _before_ having picked the target: */
-		cond_resched_softirq();
-
-		read_lock(&tcp_hashinfo.ehash[st->bucket].lock);
+		read_lock_bh(&tcp_hashinfo.ehash[st->bucket].lock);
 		sk_for_each(sk, node, &tcp_hashinfo.ehash[st->bucket].chain) {
 			if (sk->sk_family != st->family) {
 				continue;
@@ -2059,7 +2056,7 @@ static void *established_get_first(struct seq_file *seq)
 			rc = tw;
 			goto out;
 		}
-		read_unlock(&tcp_hashinfo.ehash[st->bucket].lock);
+		read_unlock_bh(&tcp_hashinfo.ehash[st->bucket].lock);
 		st->state = TCP_SEQ_STATE_ESTABLISHED;
 	}
 out:
@@ -2086,14 +2083,11 @@ get_tw:
 			cur = tw;
 			goto out;
 		}
-		read_unlock(&tcp_hashinfo.ehash[st->bucket].lock);
+		read_unlock_bh(&tcp_hashinfo.ehash[st->bucket].lock);
 		st->state = TCP_SEQ_STATE_ESTABLISHED;
 
-		/* We can reschedule between buckets: */
-		cond_resched_softirq();
-
 		if (++st->bucket < tcp_hashinfo.ehash_size) {
-			read_lock(&tcp_hashinfo.ehash[st->bucket].lock);
+			read_lock_bh(&tcp_hashinfo.ehash[st->bucket].lock);
 			sk = sk_head(&tcp_hashinfo.ehash[st->bucket].chain);
 		} else {
 			cur = NULL;
@@ -2138,7 +2132,6 @@ static void *tcp_get_idx(struct seq_file *seq, loff_t pos)
 
 	if (!rc) {
 		inet_listen_unlock(&tcp_hashinfo);
-		local_bh_disable();
 		st->state = TCP_SEQ_STATE_ESTABLISHED;
 		rc = established_get_idx(seq, pos);
 	}
@@ -2171,7 +2164,6 @@ static void *tcp_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 		rc = listening_get_next(seq, v);
 		if (!rc) {
 			inet_listen_unlock(&tcp_hashinfo);
-			local_bh_disable();
 			st->state = TCP_SEQ_STATE_ESTABLISHED;
 			rc = established_get_first(seq);
 		}
@@ -2203,8 +2195,7 @@ static void tcp_seq_stop(struct seq_file *seq, void *v)
 	case TCP_SEQ_STATE_TIME_WAIT:
 	case TCP_SEQ_STATE_ESTABLISHED:
 		if (v)
-			read_unlock(&tcp_hashinfo.ehash[st->bucket].lock);
-		local_bh_enable();
+			read_unlock_bh(&tcp_hashinfo.ehash[st->bucket].lock);
 		break;
 	}
 }

* Re: [Bugme-new] [Bug 8536] New: Kernel drops UDP packets silently when reading from certain proc file entries
From: Eric Dumazet @ 2007-05-25 6:50 UTC
To: Herbert Xu; +Cc: Andrew Morton, davem, netdev, bugme-daemon, andsve

Herbert Xu wrote:
> Andrew Morton <akpm@linux-foundation.org> wrote:
>>> It is possible to introduce UDP packet losses by reading
>>> the proc file entry /proc/net/tcp. The really strange thing is that
>>> the error counters for packet drops are not increased.
>
> Please try this patch and let us know if it helps.
>
> [TCPv4]: Improve BH latency in /proc/net/tcp
>
> Currently the code for /proc/net/tcp disables BH while iterating
> over the entire established hash table.  Even though we call
> cond_resched_softirq for each entry, we still won't process
> softirqs as regularly as we would otherwise do, which results
> in poor performance when the system is loaded near capacity.
>
> This anomaly comes from the 2.4 code where this was all in a
> single function and the local_bh_disable might have made sense
> as a small optimisation.
>
> The cost of each local_bh_disable is so small when compared
> against the increased latency in keeping it disabled over a
> large but mostly empty TCP established hash table that we
> should just move it to the individual read_lock/read_unlock
> calls as we do in inet_diag.
>

But that is not really true: cond_resched_softirq() is called for each
bucket in the hash table, empty or not.

If this patch really helps, it means cond_resched_softirq() doesn't work
at all and should be fixed, or just zapped, as it is seldom used.

* Re: [Bugme-new] [Bug 8536] New: Kernel drops UDP packets silently when reading from certain proc file entries
From: Herbert Xu @ 2007-05-25 6:57 UTC
To: Eric Dumazet; +Cc: Andrew Morton, davem, netdev, bugme-daemon, andsve

On Fri, May 25, 2007 at 08:50:20AM +0200, Eric Dumazet wrote:
>
> If this patch really helps, this means cond_resched_softirq()
> doesnt work at all and should be fixed, or just zapped as it
> is seldom used.

cond_resched_softirq lets other threads run if they want to.
It doesn't run pending softirqs at all.  In fact, it doesn't
even wake up ksoftirqd.

So if the only work we get comes from softirqs, then we'll just
block them until we're done with /proc/net/tcp.

You can (correctly) argue that cond_resched_softirq is broken,
but it doesn't change the fact that we don't even need to call
it for /proc/net/tcp.

This patch simply changes /proc/net/tcp to be in line with the
behaviour of inet_diag.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

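The explanation above can be made concrete with a toy user-space model. It is only a sketch under stated assumptions, not kernel code: every name (struct cpu_model, model_*, walk_old, walk_new) is hypothetical, one unit of softirq work is assumed to arrive per bucket visited, and cond_resched_softirq is modelled, as described in this thread, as a scheduling point that leaves pending softirq work untouched. The real patched iterator can also return with a bucket lock still held, which this model ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one CPU: not kernel code, all names are hypothetical. */
struct cpu_model {
	unsigned int bh_disabled;	/* nesting depth of "BH disabled" */
	size_t pending;			/* softirq work items waiting */
	size_t peak_pending;		/* largest backlog ever observed */
};

/* New softirq work arrives; it runs at once only if BH is enabled. */
static void model_raise_softirq(struct cpu_model *c)
{
	c->pending++;
	if (c->pending > c->peak_pending)
		c->peak_pending = c->pending;
	if (c->bh_disabled == 0)
		c->pending = 0;
}

static void model_bh_disable(struct cpu_model *c)
{
	c->bh_disabled++;
}

/* Re-enabling BH at the outermost level runs all pending work. */
static void model_bh_enable(struct cpu_model *c)
{
	assert(c->bh_disabled > 0);
	if (--c->bh_disabled == 0)
		c->pending = 0;
}

/* Scheduling point only: the backlog is neither run nor handed off. */
static void model_cond_resched_softirq(struct cpu_model *c)
{
	assert(c->bh_disabled > 0);
	(void)c;
}

/* Old scheme: BH stays disabled for the whole hash table walk. */
static size_t walk_old(size_t nbuckets)
{
	struct cpu_model c = {0};
	size_t i;

	model_bh_disable(&c);
	for (i = 0; i < nbuckets; i++) {
		model_cond_resched_softirq(&c);
		model_raise_softirq(&c);	/* work arrives meanwhile */
	}
	model_bh_enable(&c);
	return c.peak_pending;
}

/* New scheme: BH is disabled and re-enabled around each bucket. */
static size_t walk_new(size_t nbuckets)
{
	struct cpu_model c = {0};
	size_t i;

	for (i = 0; i < nbuckets; i++) {
		model_bh_disable(&c);
		model_raise_softirq(&c);	/* work arrives meanwhile */
		model_bh_enable(&c);
	}
	return c.peak_pending;
}
```

In this model the backlog under the old scheme grows with the number of buckets walked, while with per-bucket disabling it never exceeds one item.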
* Re: [Bugme-new] [Bug 8536] New: Kernel drops UDP packets silently when reading from certain proc file entries
From: Eric Dumazet @ 2007-05-25 7:15 UTC
To: Herbert Xu; +Cc: Andrew Morton, davem, netdev, bugme-daemon, andsve

Herbert Xu wrote:
> On Fri, May 25, 2007 at 08:50:20AM +0200, Eric Dumazet wrote:
>> If this patch really helps, this means cond_resched_softirq()
>> doesnt work at all and should be fixed, or just zapped as it
>> is seldom used.
>
> cond_resched_softirq lets other threads run if they want to.
> It doesn't run pending softirq's at all.  In fact, it doesn't
> even wake up ksoftirqd.

I am very glad you fixed /proc/net/tcp, but I would like to understand
why cond_resched_softirq() even exists.

Its name and behavior don't match at all.

The only remaining use is in __release_sock(). Should we schedule
threads, or ksoftirqd as well, in this function?

* Re: [Bugme-new] [Bug 8536] New: Kernel drops UDP packets silently when reading from certain proc file entries
From: Herbert Xu @ 2007-05-25 7:17 UTC
To: Eric Dumazet, Ingo Molnar
Cc: Andrew Morton, davem, netdev, bugme-daemon, andsve

On Fri, May 25, 2007 at 09:15:17AM +0200, Eric Dumazet wrote:
>
> I am very glad you fixed /proc/net/tcp, but I would like to
> understand why this cond_resched_softirq() even exist.

Well, presumably it lets other threads have a chance to run in
a BH-disabled section.

> Its name and behavior dont match at all.

But yes, it probably makes sense for it to process some softirq
work as well.

Ingo?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt