* TCP connection stops after high load.
@ 2007-04-12 21:11 Robert Iakobashvili
2007-04-12 21:15 ` David Miller
0 siblings, 1 reply; 41+ messages in thread
From: Robert Iakobashvili @ 2007-04-12 21:11 UTC (permalink / raw)
To: netdev; +Cc: Ben Greear
Hi Ben,
On 4/11/07, Ben Greear <greearb@candelatech.com> wrote:
> The problem is that I set up a TCP connection with bi-directional traffic
> of around 800Mbps, doing large (20k - 64k) writes and reads between two ports on
> the same machine (this 2.6.18.2 kernel is tainted with my full patch set,
> but I also reproduced it with only the non-tainted send-to-self patch applied
> last May on the 2.6.16 kernel, so I assume the bug is not particular to my patch
> set).
>
> At first, all is well, but within 5-10 minutes, the TCP connection will stall
> and I only see a massive amount of duplicate ACKs on the link.
>
Just today I faced some problems in a setup with a lighttpd server
(epoll demultiplexing and an increased max-fds number) running against
curl-loader, which generates HTTP client load, both on the same host.
curl-loader adds 1000-8000 secondary IPv4 addresses to the eth0
interface. Then it opens 20-200 virtual HTTP clients per second, up to
the steady-state number. Each client opens its own socket, binds to a
secondary IP address, and connects to the web server, issuing HTTP
GET/POST requests and handling the responses.
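For illustration, each virtual client does essentially the following
(a minimal sketch with hypothetical addresses; curl-loader's real code
differs):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a client socket to one secondary IP, then connect to the
 * server; the HTTP GET/POST traffic then flows over the returned fd. */
static int open_client(const char *src_ip, const char *dst_ip, int port)
{
	struct sockaddr_in src, dst;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(&src, 0, sizeof(src));
	src.sin_family = AF_INET;
	src.sin_port = 0;                       /* any local port */
	inet_pton(AF_INET, src_ip, &src.sin_addr);
	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(port);
	inet_pton(AF_INET, dst_ip, &dst.sin_addr);
	if (bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0 ||
	    connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}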
It works well with 2.6.11.8 and the Debian 2.6.18.3-i686 image.
On the same Intel Pentium 4 PC with roughly the same kernel
configuration (make oldconfig using the Debian config-2.6.18.3-i686),
the setup fails, with TCP connections stalled after 1000 established
connections, when the kernel is 2.6.20.6 or 2.6.19.5.
It stalls even earlier: after 500 connections when lighttpd is used
with the default poll() demultiplexing, or after 100 connections when
the apache2 web server is used (memory?).
I am currently going to try vanilla 2.6.18.3 and, if it also fails, to
look through the Debian patches, trying to figure out what the delta
is.
strace-ing and the logs have actually revealed two failure scenarios.
Connections are established successfully and then:
- a request is sent and there is no response;
- a partial response is received and the connection stalls.
I will also try to collect some streams with tcpdump, filtering by a
client-side source IP.
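For example (the client address here is hypothetical):
tcpdump -ni eth0 -s 0 -w /tmp/stall.pcap host 192.168.1.107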
I already tried switching from BIC to Reno - not helpful - and
generating the load over loopback (lo) - same picture.
Don't feel you are alone; it may be the same problem that we
encounter.
Sincerely,
Robert Iakobashvili,
coroberti %x40 gmail %x2e com
...................................................................
Navigare necesse est, vivere non est necesse
...................................................................
http://curl-loader.sourceforge.net
An open-source HTTP/S, FTP/S traffic
generating, and web testing tool.
* Re: TCP connection stops after high load.
2007-04-12 21:11 TCP connection stops after high load Robert Iakobashvili
@ 2007-04-12 21:15 ` David Miller
2007-04-15 12:14 ` Robert Iakobashvili
2007-04-15 13:52 ` Robert Iakobashvili
0 siblings, 2 replies; 41+ messages in thread
From: David Miller @ 2007-04-12 21:15 UTC (permalink / raw)
To: coroberti; +Cc: netdev, greearb
From: "Robert Iakobashvili" <coroberti@gmail.com>
Date: Thu, 12 Apr 2007 23:11:14 +0200
> It works well with 2.6.11.8 and the Debian 2.6.18.3-i686 image.
>
> On the same Intel Pentium 4 PC with roughly the same kernel
> configuration (make oldconfig using the Debian config-2.6.18.3-i686),
> the setup fails, with TCP connections stalled after 1000 established
> connections, when the kernel is 2.6.20.6 or 2.6.19.5.
>
> It stalls even earlier: after 500 connections when lighttpd is used
> with the default poll() demultiplexing, or after 100 connections when
> the apache2 web server is used (memory?).
>
> I am currently going to try vanilla 2.6.18.3 and, if it also fails,
> to look through the Debian patches, trying to figure out what the
> delta is.
>
> strace-ing and the logs have actually revealed two failure scenarios.
> Connections are established successfully and then:
> - a request is sent and there is no response;
> - a partial response is received and the connection stalls.
The following patch is not the cause, but it likely exacerbates the
problem. Can you revert it from your kernel and see if it changes the
behavior?
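(For example, saving the diff below to a file and backing it out from
the top of your kernel source tree with something like
"patch -p1 -R < tcp_mem_init.diff"; the file name is just an
illustration.)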
commit 7b4f4b5ebceab67ce440a61081a69f0265e17c2a
Author: John Heffner <jheffner@psc.edu>
Date: Sat Mar 25 01:34:07 2006 -0800
[TCP]: Set default max buffers from memory pool size
This patch sets the maximum TCP buffer sizes (available to automatic
buffer tuning, not to setsockopt) based on the TCP memory pool size.
The maximum sndbuf and rcvbuf each will be up to 4 MB, but no more
than 1/128 of the memory pressure threshold.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4b0272c..591e96d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -276,8 +276,8 @@ atomic_t tcp_orphan_count = ATOMIC_INIT(0);
EXPORT_SYMBOL_GPL(tcp_orphan_count);
int sysctl_tcp_mem[3];
-int sysctl_tcp_wmem[3] = { 4 * 1024, 16 * 1024, 128 * 1024 };
-int sysctl_tcp_rmem[3] = { 4 * 1024, 87380, 87380 * 2 };
+int sysctl_tcp_wmem[3];
+int sysctl_tcp_rmem[3];
EXPORT_SYMBOL(sysctl_tcp_mem);
EXPORT_SYMBOL(sysctl_tcp_rmem);
@@ -2081,7 +2081,8 @@ __setup("thash_entries=", set_thash_entries);
void __init tcp_init(void)
{
struct sk_buff *skb = NULL;
- int order, i;
+ unsigned long limit;
+ int order, i, max_share;
if (sizeof(struct tcp_skb_cb) > sizeof(skb->cb))
__skb_cb_too_small_for_tcp(sizeof(struct tcp_skb_cb),
@@ -2155,12 +2156,16 @@ void __init tcp_init(void)
sysctl_tcp_mem[1] = 1024 << order;
sysctl_tcp_mem[2] = 1536 << order;
- if (order < 3) {
- sysctl_tcp_wmem[2] = 64 * 1024;
- sysctl_tcp_rmem[0] = PAGE_SIZE;
- sysctl_tcp_rmem[1] = 43689;
- sysctl_tcp_rmem[2] = 2 * 43689;
- }
+ limit = ((unsigned long)sysctl_tcp_mem[1]) << (PAGE_SHIFT - 7);
+ max_share = min(4UL*1024*1024, limit);
+
+ sysctl_tcp_wmem[0] = SK_STREAM_MEM_QUANTUM;
+ sysctl_tcp_wmem[1] = 16*1024;
+ sysctl_tcp_wmem[2] = max(64*1024, max_share);
+
+ sysctl_tcp_rmem[0] = SK_STREAM_MEM_QUANTUM;
+ sysctl_tcp_rmem[1] = 87380;
+ sysctl_tcp_rmem[2] = max(87380, max_share);
printk(KERN_INFO "TCP: Hash tables configured "
"(established %d bind %d)\n",
* Re: TCP connection stops after high load.
2007-04-12 21:15 ` David Miller
@ 2007-04-15 12:14 ` Robert Iakobashvili
2007-04-15 15:31 ` John Heffner
2007-04-15 13:52 ` Robert Iakobashvili
1 sibling, 1 reply; 41+ messages in thread
From: Robert Iakobashvili @ 2007-04-15 12:14 UTC (permalink / raw)
To: David Miller; +Cc: netdev, greearb
On 4/13/07, David Miller <davem@davemloft.net> wrote:
> From: "Robert Iakobashvili" <coroberti@gmail.com>
> Date: Thu, 12 Apr 2007 23:11:14 +0200
>
> > It works well with 2.6.11.8 and the Debian 2.6.18.3-i686 image.
> >
> > On the same Intel Pentium 4 PC with roughly the same kernel
> > configuration (make oldconfig using the Debian config-2.6.18.3-i686),
> > the setup fails, with TCP connections stalled after 1000 established
> > connections, when the kernel is 2.6.20.6 or 2.6.19.5.
> >
> > It stalls even earlier: after 500 connections when lighttpd is used
> > with the default poll() demultiplexing, or after 100 connections when
> > the apache2 web server is used (memory?).
> >
> > I am currently going to try vanilla 2.6.18.3 and, if it also fails,
> > to look through the Debian patches, trying to figure out what the
> > delta is.
Vanilla 2.6.18.3 works for me perfectly, whereas 2.6.19.5 and
2.6.20.6 do not.
Looking into the TCP /proc entries of 2.6.18.3 versus 2.6.19.5,
tcp_rmem and tcp_wmem are the same, whereas tcp_mem is
much different:
kernel tcp_mem
---------------------------------------
2.6.18.3 12288 16384 24576
2.6.19.5 3072 4096 6144
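(For scale: tcp_mem is counted in pages, so assuming 4 KB pages these
thresholds are about 48/64/96 MB on 2.6.18.3 but only 12/16/24 MB on
2.6.19.5, i.e. all TCP sockets together are capped at 24 MB.)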
Wasn't this done deliberately by the patch below:
commit 9e950efa20dc8037c27509666cba6999da9368e8
Author: John Heffner <jheffner@psc.edu>
Date: Mon Nov 6 23:10:51 2006 -0800
[TCP]: Don't use highmem in tcp hash size calculation.
This patch removes consideration of high memory when determining TCP
hash table sizes. Taking into account high memory results in tcp_mem
values that are too large.
Is it a feature?
My machine has:
MemTotal: 484368 kB
and all the kernel configurations are actually the same, with
CONFIG_HIGHMEM4G=y.
Thanks,
--
Sincerely,
Robert Iakobashvili,
coroberti %x40 gmail %x2e com
...................................................................
Navigare necesse est, vivere non est necesse
...................................................................
http://curl-loader.sourceforge.net
An open-source HTTP/S, FTP/S traffic
generating, and web testing tool.
* Re: TCP connection stops after high load.
2007-04-15 12:14 ` Robert Iakobashvili
@ 2007-04-15 15:31 ` John Heffner
2007-04-15 15:49 ` Robert Iakobashvili
0 siblings, 1 reply; 41+ messages in thread
From: John Heffner @ 2007-04-15 15:31 UTC (permalink / raw)
To: Robert Iakobashvili; +Cc: David Miller, netdev, greearb
Robert Iakobashvili wrote:
> Vanilla 2.6.18.3 works for me perfectly, whereas 2.6.19.5 and
> 2.6.20.6 do not.
>
> Looking into the TCP /proc entries of 2.6.18.3 versus 2.6.19.5,
> tcp_rmem and tcp_wmem are the same, whereas tcp_mem is
> much different:
>
> kernel tcp_mem
> ---------------------------------------
> 2.6.18.3 12288 16384 24576
> 2.6.19.5 3072 4096 6144
>
>
> Wasn't this done deliberately by the patch below:
>
> commit 9e950efa20dc8037c27509666cba6999da9368e8
> Author: John Heffner <jheffner@psc.edu>
> Date: Mon Nov 6 23:10:51 2006 -0800
>
> [TCP]: Don't use highmem in tcp hash size calculation.
>
> This patch removes consideration of high memory when determining TCP
> hash table sizes. Taking into account high memory results in tcp_mem
> values that are too large.
>
> Is it a feature?
>
> My machine has:
> MemTotal: 484368 kB
> and all the kernel configurations are actually the same, with
> CONFIG_HIGHMEM4G=y.
>
> Thanks,
>
Another patch that went in right around that time:
commit 52bf376c63eebe72e862a1a6e713976b038c3f50
Author: John Heffner <jheffner@psc.edu>
Date: Tue Nov 14 20:25:17 2006 -0800
[TCP]: Fix up sysctl_tcp_mem initialization.
Fix up tcp_mem initial settings to take into account the size of the
hash entries (different on SMP and non-SMP systems).
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
(This has been changed again for 2.6.21.)
In the dmesg, there should be some messages like this:
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
What do yours say?
Thanks,
-John
* Re: TCP connection stops after high load.
2007-04-15 15:31 ` John Heffner
@ 2007-04-15 15:49 ` Robert Iakobashvili
2007-04-16 18:07 ` John Heffner
0 siblings, 1 reply; 41+ messages in thread
From: Robert Iakobashvili @ 2007-04-15 15:49 UTC (permalink / raw)
To: John Heffner; +Cc: David Miller, netdev, greearb
Hi John,
On 4/15/07, John Heffner <jheffner@psc.edu> wrote:
> Robert Iakobashvili wrote:
> > Vanilla 2.6.18.3 works for me perfectly, whereas 2.6.19.5 and
> > 2.6.20.6 do not.
> >
> > Looking into the TCP /proc entries of 2.6.18.3 versus 2.6.19.5,
> > tcp_rmem and tcp_wmem are the same, whereas tcp_mem is
> > much different:
> >
> > kernel tcp_mem
> > ---------------------------------------
> > 2.6.18.3 12288 16384 24576
> > 2.6.19.5 3072 4096 6144
> Another patch that went in right around that time:
>
> commit 52bf376c63eebe72e862a1a6e713976b038c3f50
> Author: John Heffner <jheffner@psc.edu>
> Date: Tue Nov 14 20:25:17 2006 -0800
>
> [TCP]: Fix up sysctl_tcp_mem initialization.
> (This has been changed again for 2.6.21.)
>
> In the dmesg, there should be some messages like this:
> IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
> TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
> TCP bind hash table entries: 65536 (order: 6, 262144 bytes)
> TCP: Hash tables configured (established 131072 bind 65536)
>
> What do yours say?
For 2.6.19.5, where we have this problem, from dmesg:
IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
#cat /proc/sys/net/ipv4/tcp_mem
3072 4096 6144
MemTotal: 484368 kB
CONFIG_HIGHMEM4G=y
Thanks,
Sincerely,
Robert Iakobashvili,
coroberti %x40 gmail %x2e com
...................................................................
Navigare necesse est, vivere non est necesse
...................................................................
http://curl-loader.sourceforge.net
An open-source HTTP/S, FTP/S traffic
generating, and web testing tool.
* Re: TCP connection stops after high load.
2007-04-15 15:49 ` Robert Iakobashvili
@ 2007-04-16 18:07 ` John Heffner
2007-04-16 18:51 ` Robert Iakobashvili
0 siblings, 1 reply; 41+ messages in thread
From: John Heffner @ 2007-04-16 18:07 UTC (permalink / raw)
To: Robert Iakobashvili; +Cc: David Miller, netdev, greearb
Robert Iakobashvili wrote:
> Hi John,
>
> On 4/15/07, John Heffner <jheffner@psc.edu> wrote:
>> Robert Iakobashvili wrote:
>> > Vanilla 2.6.18.3 works for me perfectly, whereas 2.6.19.5 and
>> > 2.6.20.6 do not.
>> >
>> > Looking into the TCP /proc entries of 2.6.18.3 versus 2.6.19.5,
>> > tcp_rmem and tcp_wmem are the same, whereas tcp_mem is
>> > much different:
>> >
>> > kernel tcp_mem
>> > ---------------------------------------
>> > 2.6.18.3 12288 16384 24576
>> > 2.6.19.5 3072 4096 6144
>
>> Another patch that went in right around that time:
>>
>> commit 52bf376c63eebe72e862a1a6e713976b038c3f50
>> Author: John Heffner <jheffner@psc.edu>
>> Date: Tue Nov 14 20:25:17 2006 -0800
>>
>> [TCP]: Fix up sysctl_tcp_mem initialization.
>> (This has been changed again for 2.6.21.)
>>
>> In the dmesg, there should be some messages like this:
>> IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
>> TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
>> TCP bind hash table entries: 65536 (order: 6, 262144 bytes)
>> TCP: Hash tables configured (established 131072 bind 65536)
>>
>> What do yours say?
>
> For 2.6.19.5, where we have this problem, from dmesg:
> IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
> TCP established hash table entries: 16384 (order: 5, 131072 bytes)
> TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
>
> #cat /proc/sys/net/ipv4/tcp_mem
> 3072 4096 6144
>
> MemTotal: 484368 kB
> CONFIG_HIGHMEM4G=y
Yes, this difference is caused by the commit above. The old way didn't
really make a lot of sense, since it differed based on SMP/non-SMP and
page size, and had large discontinuities at 512 MB and at every power
of two. When the limit was based on the hash table size, it was hard
to keep it from exceeding the memory pool without also making it too
small.
The current net-2.6 (2.6.21) has a redesigned tcp_mem initialization
that should give you more appropriate values, something like 45408 60546
90816. For reference:
Commit: 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c
Author: John Heffner <jheffner@psc.edu> Fri, 16 Mar 2007 15:04:03 -0700
[TCP]: Fix tcp_mem[] initialization.
Change tcp_mem initialization function. The fraction of total memory
is now a continuous function of memory size, and independent of page
size.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thanks,
-John
* Re: TCP connection stops after high load.
2007-04-16 18:07 ` John Heffner
@ 2007-04-16 18:51 ` Robert Iakobashvili
2007-04-16 19:11 ` John Heffner
2007-04-16 19:15 ` David Miller
0 siblings, 2 replies; 41+ messages in thread
From: Robert Iakobashvili @ 2007-04-16 18:51 UTC (permalink / raw)
To: John Heffner; +Cc: David Miller, netdev, greearb
> >> Robert Iakobashvili wrote:
> >> > Vanilla 2.6.18.3 works for me perfectly, whereas 2.6.19.5 and
> >> > 2.6.20.6 do not.
> >> >
> >> > Looking into the TCP /proc entries of 2.6.18.3 versus 2.6.19.5,
> >> > tcp_rmem and tcp_wmem are the same, whereas tcp_mem is
> >> > much different:
> >> >
> >> > kernel tcp_mem
> >> > ---------------------------------------
> >> > 2.6.18.3 12288 16384 24576
> >> > 2.6.19.5 3072 4096 6144
> >
> >> Another patch that went in right around that time:
> >>
> >> commit 52bf376c63eebe72e862a1a6e713976b038c3f50
> >> Author: John Heffner <jheffner@psc.edu>
> >> Date: Tue Nov 14 20:25:17 2006 -0800
> >>
> >> [TCP]: Fix up sysctl_tcp_mem initialization.
> >> (This has been changed again for 2.6.21.)
> >>
> Yes, this difference is caused by the commit above.
> The current net-2.6 (2.6.21) has a redesigned tcp_mem initialization
> that should give you more appropriate values, something like 45408 60546
> 90816. For reference:
> Commit: 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c
> Author: John Heffner <jheffner@psc.edu> Fri, 16 Mar 2007 15:04:03 -0700
>
> [TCP]: Fix tcp_mem[] initialization.
> Change tcp_mem initialization function. The fraction of total memory
> is now a continuous function of memory size, and independent of page
> size.
The 2.6.19 and 2.6.20 kernel series are effectively broken right now.
Wouldn't you like to patch them?
--
Sincerely,
Robert Iakobashvili,
coroberti %x40 gmail %x2e com
...................................................................
Navigare necesse est, vivere non est necesse
...................................................................
http://curl-loader.sourceforge.net
An open-source HTTP/S, FTP/S traffic
generating, and web testing tool.
* Re: TCP connection stops after high load.
2007-04-16 18:51 ` Robert Iakobashvili
@ 2007-04-16 19:11 ` John Heffner
2007-04-16 19:17 ` David Miller
2007-04-16 19:15 ` David Miller
1 sibling, 1 reply; 41+ messages in thread
From: John Heffner @ 2007-04-16 19:11 UTC (permalink / raw)
To: Robert Iakobashvili; +Cc: David Miller, netdev, greearb
Robert Iakobashvili wrote:
> The 2.6.19 and 2.6.20 kernel series are effectively broken right now.
> Wouldn't you like to patch them?
>
I don't know if this qualifies as an unconditional bug. The commit
above was actually a bugfix so that the limits were not higher than
total memory on some systems, but had the side effect that it made them
even smaller on your particular configuration. Also, having initial
sysctl values that are conservatively small probably doesn't qualify as
a bug (for patching stable trees). You might ask the -stable
maintainers if they have a different opinion.
For most people, 2.6.19 and 2.6.20 work fine. For those who really care
about the tcp_mem values (are using a substantial fraction of physical
memory for TCP connections), the best bet is to set the tcp_mem sysctl
values in the startup scripts, or use the new initialization function in
2.6.21.
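For example, something along these lines in a startup script (the
values are just the ones the 2.6.21 initialization would produce on a
machine of this class):
echo "45408 60546 90816" > /proc/sys/net/ipv4/tcp_mem
or the equivalent "net.ipv4.tcp_mem = 45408 60546 90816" entry in
/etc/sysctl.conf.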
Thanks,
-John
* Re: TCP connection stops after high load.
2007-04-16 19:11 ` John Heffner
@ 2007-04-16 19:17 ` David Miller
0 siblings, 0 replies; 41+ messages in thread
From: David Miller @ 2007-04-16 19:17 UTC (permalink / raw)
To: jheffner; +Cc: coroberti, netdev, greearb
From: John Heffner <jheffner@psc.edu>
Date: Mon, 16 Apr 2007 15:11:07 -0400
> I don't know if this qualifies as an unconditional bug. The commit
> above was actually a bugfix so that the limits were not higher than
> total memory on some systems, but had the side effect that it made them
> even smaller on your particular configuration. Also, having initial
> sysctl values that are conservatively small probably doesn't qualify as
> a bug (for patching stable trees). You might ask the -stable
> maintainers if they have a different opinion.
>
> For most people, 2.6.19 and 2.6.20 work fine. For those who really care
> about the tcp_mem values (are using a substantial fraction of physical
> memory for TCP connections), the best bet is to set the tcp_mem sysctl
> values in the startup scripts, or use the new initialization function in
> 2.6.21.
What's most important is determining if that tcp_mem[] patch actually
fixes his problem, so it is his responsibility to see whether this
is the case.
If it does fix the problem, I'm happy to submit the backport to -stable.
But until such tests are made, it's just speculation whether the patch
fixes the problem or not, and therefore there is zero justification to
submit it to -stable.
* Re: TCP connection stops after high load.
2007-04-16 18:51 ` Robert Iakobashvili
2007-04-16 19:11 ` John Heffner
@ 2007-04-16 19:15 ` David Miller
2007-04-17 7:58 ` Robert Iakobashvili
1 sibling, 1 reply; 41+ messages in thread
From: David Miller @ 2007-04-16 19:15 UTC (permalink / raw)
To: coroberti; +Cc: jheffner, netdev, greearb
From: "Robert Iakobashvili" <coroberti@gmail.com>
Date: Mon, 16 Apr 2007 20:51:54 +0200
> > Commit: 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c
> > Author: John Heffner <jheffner@psc.edu> Fri, 16 Mar 2007 15:04:03 -0700
> >
> > [TCP]: Fix tcp_mem[] initialization.
> > Change tcp_mem initialization function. The fraction of total memory
> > is now a continuous function of memory size, and independent of page
> > size.
>
>
> The 2.6.19 and 2.6.20 kernel series are effectively broken right now.
> Wouldn't you like to patch them?
Can you verify that this patch actually fixes your problem?
* Re: TCP connection stops after high load.
2007-04-16 19:15 ` David Miller
@ 2007-04-17 7:58 ` Robert Iakobashvili
2007-04-17 19:39 ` David Miller
0 siblings, 1 reply; 41+ messages in thread
From: Robert Iakobashvili @ 2007-04-17 7:58 UTC (permalink / raw)
To: David Miller; +Cc: jheffner, netdev, greearb, Michael Moser
David,
On 4/16/07, David Miller <davem@davemloft.net> wrote:
> > > Commit: 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c
> > > Author: John Heffner <jheffner@psc.edu> Fri, 16 Mar 2007 15:04:03 -0700
> > >
> > > [TCP]: Fix tcp_mem[] initialization.
> > > Change tcp_mem initialization function. The fraction of total memory
> > > is now a continuous function of memory size, and independent of page
> > > size.
> >
> >
> > The 2.6.19 and 2.6.20 kernel series are effectively broken right now.
> > Wouldn't you like to patch them?
>
> Can you verify that this patch actually fixes your problem?
Yes, it fixes it.
After the patch, curl-loader works smoothly with patched 2.6.19.7 and
with patched 2.6.20.7, using 3000 simultaneous local connections, and
even better than with the kernel referred to as "good", 2.6.18.3.
Besides that, here is the tcp_mem status for my machine:
kernel tcp_mem
------------------------------------------------------
2.6.19.7 3072 4096 6144
2.6.19.7-patched 45696 60928 91392
2.6.20.7 3072 4096 6144
2.6.20.7-patched 45696 60928 91392
The patch applied smoothly, just with line offsets.
--
Sincerely,
Robert Iakobashvili,
coroberti %x40 gmail %x2e com
...................................................................
Navigare necesse est, vivere non est necesse
...................................................................
http://curl-loader.sourceforge.net
An open-source HTTP/S, FTP/S traffic
generating, and web testing tool.
* Re: TCP connection stops after high load.
2007-04-17 7:58 ` Robert Iakobashvili
@ 2007-04-17 19:39 ` David Miller
2007-04-17 19:47 ` John Heffner
2007-04-17 19:58 ` Robert Iakobashvili
0 siblings, 2 replies; 41+ messages in thread
From: David Miller @ 2007-04-17 19:39 UTC (permalink / raw)
To: coroberti; +Cc: jheffner, netdev, greearb, moser.michael
From: "Robert Iakobashvili" <coroberti@gmail.com>
Date: Tue, 17 Apr 2007 10:58:04 +0300
> David,
>
> On 4/16/07, David Miller <davem@davemloft.net> wrote:
> > > > Commit: 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c
> > > > Author: John Heffner <jheffner@psc.edu> Fri, 16 Mar 2007 15:04:03 -0700
> > > >
> > > > [TCP]: Fix tcp_mem[] initialization.
> > > > Change tcp_mem initialization function. The fraction of total memory
> > > > is now a continuous function of memory size, and independent of page
> > > > size.
> > >
> > >
> > > The 2.6.19 and 2.6.20 kernel series are effectively broken right now.
> > > Wouldn't you like to patch them?
> >
> > Can you verify that this patch actually fixes your problem?
>
> Yes, it fixes it.
Thanks, I will submit it to -stable branch.
* Re: TCP connection stops after high load.
2007-04-17 19:39 ` David Miller
@ 2007-04-17 19:47 ` John Heffner
2007-04-17 19:51 ` David Miller
2007-04-17 19:58 ` Robert Iakobashvili
1 sibling, 1 reply; 41+ messages in thread
From: John Heffner @ 2007-04-17 19:47 UTC (permalink / raw)
To: David Miller; +Cc: coroberti, netdev, greearb, moser.michael
David Miller wrote:
> From: "Robert Iakobashvili" <coroberti@gmail.com>
> Date: Tue, 17 Apr 2007 10:58:04 +0300
>
>> David,
>>
>> On 4/16/07, David Miller <davem@davemloft.net> wrote:
>>>>> Commit: 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c
>>>>> Author: John Heffner <jheffner@psc.edu> Fri, 16 Mar 2007 15:04:03 -0700
>>>>>
>>>>> [TCP]: Fix tcp_mem[] initialization.
>>>>> Change tcp_mem initialization function. The fraction of total memory
>>>>> is now a continuous function of memory size, and independent of page
>>>>> size.
>>>>
>>>> The 2.6.19 and 2.6.20 kernel series are effectively broken right now.
>>>> Wouldn't you like to patch them?
>>> Can you verify that this patch actually fixes your problem?
>> Yes, it fixes it.
>
> Thanks, I will submit it to -stable branch.
My only reservation in submitting this to -stable is that it will in
many cases increase the default tcp_mem values, which in turn can
increase the default tcp_rmem values, and therefore the window scale.
There will be some set of people with broken firewalls who trigger that
problem for the first time by upgrading along the stable branch. While
it's not our fault, it could cause some complaints...
Thanks,
-John
* Re: TCP connection stops after high load.
2007-04-17 19:47 ` John Heffner
@ 2007-04-17 19:51 ` David Miller
0 siblings, 0 replies; 41+ messages in thread
From: David Miller @ 2007-04-17 19:51 UTC (permalink / raw)
To: jheffner; +Cc: coroberti, netdev, greearb, moser.michael
From: John Heffner <jheffner@psc.edu>
Date: Tue, 17 Apr 2007 15:47:58 -0400
> My only reservation in submitting this to -stable is that it will in
> many cases increase the default tcp_mem values, which in turn can
> increase the default tcp_rmem values, and therefore the window scale.
> There will be some set of people with broken firewalls who trigger that
> problem for the first time by upgrading along the stable branch. While
> it's not our fault, it could cause some complaints...
It is a very valid concern.
However, this is fixing a problem where we are in the wrong,
whereas the firewall issues are external and should not
block us from being able to fix our own bugs :-)
* Re: TCP connection stops after high load.
2007-04-17 19:39 ` David Miller
2007-04-17 19:47 ` John Heffner
@ 2007-04-17 19:58 ` Robert Iakobashvili
1 sibling, 0 replies; 41+ messages in thread
From: Robert Iakobashvili @ 2007-04-17 19:58 UTC (permalink / raw)
To: David Miller; +Cc: jheffner, netdev, greearb, moser.michael
> > Yes, it fixes it.
>
> Thanks, I will submit it to -stable branch.
>
David and John,
Thanks for your care and attention.
--
Sincerely,
Robert Iakobashvili,
coroberti %x40 gmail %x2e com
...................................................................
Navigare necesse est, vivere non est necesse
...................................................................
http://curl-loader.sourceforge.net
An open-source HTTP/S, FTP/S traffic
generating, and web testing tool.
* Re: TCP connection stops after high load.
2007-04-12 21:15 ` David Miller
2007-04-15 12:14 ` Robert Iakobashvili
@ 2007-04-15 13:52 ` Robert Iakobashvili
1 sibling, 0 replies; 41+ messages in thread
From: Robert Iakobashvili @ 2007-04-15 13:52 UTC (permalink / raw)
To: David Miller; +Cc: netdev, greearb
On 4/13/07, David Miller <davem@davemloft.net> wrote:
> From: "Robert Iakobashvili" <coroberti@gmail.com>
> Date: Thu, 12 Apr 2007 23:11:14 +0200
>
> > It works well with 2.6.11.8 and the Debian 2.6.18.3-i686 image.
> >
> > On the same Intel Pentium 4 PC with roughly the same kernel
> > configuration (make oldconfig using the Debian config-2.6.18.3-i686),
> > the setup fails, with TCP connections stalled after 1000 established
> > connections, when the kernel is 2.6.20.6 or 2.6.19.5.
> >
> > It stalls even earlier: after 500 connections when lighttpd is used
> > with the default poll() demultiplexing, or after 100 connections when
> > the apache2 web server is used (memory?).
> >
> > I am currently going to try vanilla 2.6.18.3 and, if it also fails,
> > to look through the Debian patches, trying to figure out what the
> > delta is.
> Vanilla 2.6.18.3 works for me perfectly, whereas 2.6.19.5 and
> 2.6.20.6 do not.
>
> Looking into the TCP /proc entries of 2.6.18.3 versus 2.6.19.5,
> tcp_rmem and tcp_wmem are the same, whereas tcp_mem is
> much different:
>
> kernel tcp_mem
> ---------------------------------------
> 2.6.18.3 12288 16384 24576
> 2.6.19.5 3072 4096 6144
>
>
> Wasn't this done deliberately by the patch below:
>
> commit 9e950efa20dc8037c27509666cba6999da9368e8
> Author: John Heffner <jheffner@psc.edu>
> Date: Mon Nov 6 23:10:51 2006 -0800
Sorry, the commit is innocent. Something else has been broken in the
tcp_mem initialization logic.
> My machine has:
> MemTotal: 484368 kB
> and all the kernel configurations are actually the same, with
> CONFIG_HIGHMEM4G=y.
Sincerely,
Robert Iakobashvili,
coroberti %x40 gmail %x2e com
...................................................................
Navigare necesse est, vivere non est necesse
...................................................................
http://curl-loader.sourceforge.net
An open-source HTTP/S, FTP/S traffic
generating, and web testing tool.
* TCP connection stops after high load.
@ 2007-04-11 18:50 Ben Greear
2007-04-11 20:26 ` Ben Greear
` (2 more replies)
0 siblings, 3 replies; 41+ messages in thread
From: Ben Greear @ 2007-04-11 18:50 UTC (permalink / raw)
To: NetDev
Back in May of last year, I reported this problem, but worked
around it at the time by changing the kernel memory settings
in the networking stack. I reproduced the problem again today
with the previously working kernel memory settings... which is not
surprising, since I just papered over the bug last time.
The problem is that I set up a TCP connection with bi-directional
traffic of around 800Mbps, doing large (20k - 64k) writes and reads
between two ports on the same machine (this 2.6.18.2 kernel is tainted
with my full patch set, but I also reproduced it with only the
non-tainted send-to-self patch applied last May on the 2.6.16 kernel,
so I assume the bug is not particular to my patch set).
At first, all is well, but within 5-10 minutes, the TCP connection will stall
and I only see a massive amount of duplicate ACKs on the link. Before,
I sometimes saw OOM messages, but this time there are no OOM messages. The system
has a two-port pro/1000 fibre NIC, 1GB RAM, kernel 2.6.18.2 + hacks, etc.
Stopping and starting the connection allows traffic to flow again (if briefly).
Starting a new connection works fine even if the old one is still stalled,
so it's not a global memory exhaustion problem.
So, I would like to dig into this problem myself since no one else
is reporting this type of problem, but I am quite ignorant of the TCP
stack implementation. Based on the dup-acks I see on the wire, I assume
the TCP state machine is messed up somehow. Could anyone point me to
likely places in the TCP stack to start looking for this bug?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: TCP connection stops after high load.
2007-04-11 18:50 Ben Greear
@ 2007-04-11 20:26 ` Ben Greear
2007-04-11 20:48 ` David Miller
2007-04-11 20:41 ` David Miller
2007-04-12 6:12 ` Ilpo Järvinen
2 siblings, 1 reply; 41+ messages in thread
From: Ben Greear @ 2007-04-11 20:26 UTC (permalink / raw)
To: NetDev
Ben Greear wrote:
> Back in May of last year, I reported this problem, but worked
> around it at the time by changing the kernel memory settings
> in the networking stack. I reproduced the problem again today
> with the previously working kernel memory settings... which is not
> surprising, since I just papered over the bug last time.
So, I have been poking around. Disabling TSO makes the problem happen
sooner (< 1 minute). Changing the tcp_congestion_control does not help.
Interestingly, I found this page mentioning a SACK problem in Linux:
http://www-didc.lbl.gov/TCP-tuning/linux.html
I tried disabling SACK, but the problem still happens. However,
I do see the CWND go to 1 as soon as the connection stalls (I'm not
sure exactly which happens first). Before the stall, I see the CWND
reported in the ~40 range.
Maybe something similar to the SACK bug can happen on very fast, very
low latency links, with large send/receive buffers configured?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: TCP connection stops after high load.
2007-04-11 20:26 ` Ben Greear
@ 2007-04-11 20:48 ` David Miller
2007-04-11 21:06 ` Ben Greear
0 siblings, 1 reply; 41+ messages in thread
From: David Miller @ 2007-04-11 20:48 UTC (permalink / raw)
To: greearb; +Cc: netdev
From: Ben Greear <greearb@candelatech.com>
Date: Wed, 11 Apr 2007 13:26:36 -0700
> Interestingly, I found this page mentioning a SACK problem in Linux:
> http://www-didc.lbl.gov/TCP-tuning/linux.html
Don't read that page; it is the last place in the world you should
take hints and advice from. Most of the problems they speak of there
were fixed years ago.
Please start instrumenting the TCP code instead of "poking around"
hoping you'll hit the grand jackpot by manipulating some sysctl
setting.
It doesn't help us and it won't help you. Start reading and
understanding the TCP code, add debugging printk's, do anything to get
more information about this.
And please don't report anything here until you have some solid piece
of debugging information, else I'll just sit here replying and
prodding you along ever so slowly. :(
* Re: TCP connection stops after high load.
2007-04-11 20:48 ` David Miller
@ 2007-04-11 21:06 ` Ben Greear
2007-04-11 21:11 ` David Miller
` (2 more replies)
0 siblings, 3 replies; 41+ messages in thread
From: Ben Greear @ 2007-04-11 21:06 UTC (permalink / raw)
To: David Miller; +Cc: netdev
David Miller wrote:
> From: Ben Greear <greearb@candelatech.com>
> Date: Wed, 11 Apr 2007 13:26:36 -0700
>
>> Interestingly, I found this page mentioning a SACK problem in Linux:
>> http://www-didc.lbl.gov/TCP-tuning/linux.html
>
> Don't read that page; it is the last place in the world you should
> take hints and advice from. Most of the problems they speak of there
> were fixed years ago.
Many of their memory and buffer settings are similar to what I've
seen elsewhere... and to what I use, but it could be that we're all
getting the same info from the same faulty source. Suggestions of a
proper site for tuning TCP for high-speed/high-latency links are
welcome.
> Please start instrumenting the TCP code instead of "poking around"
> hoping you'll hit the grand jackpot by manipulating some sysctl
> setting.
>
> It doesn't help us and it won't help you. Start reading and
> understanding the TCP code, add debugging printk's, do anything to get
> more information about this.
>
> And please don't report anything here until you have some solid piece
> of debugging information, else I'll just sit here replying and
> prodding you along ever so slowly. :(
Does the CWND == 1 count as solid? Any idea how/why this would go
to 1 in conjunction with the dup acks?
For the dup acks, I see nothing *but* dup acks on the wire...going in
both directions interestingly, at greater than 100,000 packets per second.
I don't mind adding printks...and I've started reading through the code,
but there is a lot of it, and indiscriminate printks will likely just
hide the problem because it will slow down performance so much.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: TCP connection stops after high load.
2007-04-11 21:06 ` Ben Greear
@ 2007-04-11 21:11 ` David Miller
2007-04-11 21:31 ` Ben Greear
2007-04-12 1:06 ` Benjamin LaHaise
2007-04-12 14:48 ` Andi Kleen
2 siblings, 1 reply; 41+ messages in thread
From: David Miller @ 2007-04-11 21:11 UTC (permalink / raw)
To: greearb; +Cc: netdev
From: Ben Greear <greearb@candelatech.com>
Date: Wed, 11 Apr 2007 14:06:31 -0700
> Does the CWND == 1 count as solid? Any idea how/why this would go
> to 1 in conjunction with the dup acks?
>
> For the dup acks, I see nothing *but* dup acks on the wire...going in
> both directions interestingly, at greater than 100,000 packets per second.
>
> I don't mind adding printks...and I've started reading through the code,
> but there is a lot of it, and indiscriminate printks will likely just
> hide the problem because it will slow down performance so much.
If you know that, it doesn't take Einstein to figure out that maybe
you should add logging when CWND is one and we're sending out an ACK.
This is why I think you're very lazy, Ben, and why I get very agitated
with all of your reports: you put zero effort into thinking about how
to debug the problem even though you know full well how to do it.
* Re: TCP connection stops after high load.
2007-04-11 21:11 ` David Miller
@ 2007-04-11 21:31 ` Ben Greear
2007-04-11 21:39 ` David Miller
2007-04-12 2:44 ` SANGTAE HA
0 siblings, 2 replies; 41+ messages in thread
From: Ben Greear @ 2007-04-11 21:31 UTC (permalink / raw)
To: David Miller; +Cc: netdev
David Miller wrote:
> From: Ben Greear <greearb@candelatech.com>
> Date: Wed, 11 Apr 2007 14:06:31 -0700
>
>> Does the CWND == 1 count as solid? Any idea how/why this would go
>> to 1 in conjunction with the dup acks?
>>
>> For the dup acks, I see nothing *but* dup acks on the wire...going in
>> both directions interestingly, at greater than 100,000 packets per second.
>>
>> I don't mind adding printks...and I've started reading through the code,
>> but there is a lot of it, and indiscriminate printks will likely just
>> hide the problem because it will slow down performance so much.
>
> If you know that, it doesn't take Einstein to figure out that maybe
> you should add logging when CWND is one and we're sending out an ACK.
>
> This is why I think you're very lazy, Ben, and why I get very agitated
> with all of your reports: you put zero effort into thinking about how
> to debug the problem even though you know full well how to do it.
I've spent solid weeks tracking down obscure races. I'm hoping that
someone who knows the tcp stack will have some idea of places to look
based on the reported symptoms so that I don't have to spend another
solid week chasing this one. If not, so be it... I'm still working on
this between sending emails. For what it's worth, the problem (or something similar)
is reproducible on a stock FC5 .18-ish kernel as well, running between
two machines, 2 ports each.
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: TCP connection stops after high load.
2007-04-11 21:31 ` Ben Greear
@ 2007-04-11 21:39 ` David Miller
2007-04-12 2:44 ` SANGTAE HA
1 sibling, 0 replies; 41+ messages in thread
From: David Miller @ 2007-04-11 21:39 UTC (permalink / raw)
To: greearb; +Cc: netdev
From: Ben Greear <greearb@candelatech.com>
Date: Wed, 11 Apr 2007 14:31:00 -0700
> I've spent solid weeks tracking down obscure races.
I've spent solid weeks tracking down kernel stack corruption and SCSI
problems on sparc64, as well as attending to my network maintainer
duties. What is your point?
> I'm hoping that someone who knows the tcp stack will have some idea
> of places to look based on the reported symptoms so that I don't
> have to spend another solid week chasing this one.
If you can reproduce the bug and others cannot, you are the one in the
best possible situation to add diagnostics and figure out what's
wrong. Please do this.
You get a lot from Linux in your work, but you sure grumble a lot when
you might need to give even a smidgen back. You just dump random
pieces of information at this list and expect other people to just fix
it for you. It's this part of your attitude that I absolutely do not
like. Other people are able to report bugs in a pleasant and
non-selfish way that makes me want to go and fix the bug for them
proactively, you do not.
* Re: TCP connection stops after high load.
2007-04-11 21:31 ` Ben Greear
2007-04-11 21:39 ` David Miller
@ 2007-04-12 2:44 ` SANGTAE HA
1 sibling, 0 replies; 41+ messages in thread
From: SANGTAE HA @ 2007-04-12 2:44 UTC (permalink / raw)
To: Ben Greear; +Cc: David Miller, netdev
I also noticed this happening with the 2.6.18 kernel, but it was not
severe with Linux 2.6.20.3. So the short-term solution would be
upgrading to the latest FC-6 kernel.
A long blackout is mostly observed when a lot of packet losses happen
in slow start. You can prevent this by applying a (limited slow start)
patch to your slow start. Did you have the same problems with CUBIC,
which employs a less aggressive slow start? I'll leave this debugging
for some later kernel version, but you are welcome to debug this
problem.
I recommend you install tcp_probe and recreate the problem. Whenever
an ACK arrives from the receiver, the probe will print the current
congestion information. You can also easily include any other
information you want in that module. And you can get some information
from the statistics in /proc/net/tcp and /proc/net/netstat.
See http://netsrv.csc.ncsu.edu/wiki/index.php/Efficiency_of_SACK_processing
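Roughly (assuming tcp_probe is built for your kernel; the module
parameters vary a bit between versions), using the port from your
netstat output:
modprobe tcp_probe port=33011
cat /proc/net/tcpprobe > /tmp/tcpprobe.out &
then run the test and look at /tmp/tcpprobe.out.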
Thanks,
Sangtae
On 4/11/07, Ben Greear <greearb@candelatech.com> wrote:
> David Miller wrote:
> > From: Ben Greear <greearb@candelatech.com>
> > Date: Wed, 11 Apr 2007 14:06:31 -0700
> >
> >> Does the CWND == 1 count as solid? Any idea how/why this would go
> >> to 1 in conjunction with the dup acks?
> >>
> >> For the dup acks, I see nothing *but* dup acks on the wire...going in
> >> both directions interestingly, at greater than 100,000 packets per second.
> >>
> >> I don't mind adding printks...and I've started reading through the code,
> >> but there is a lot of it, and indiscriminate printks will likely just
> >> hide the problem because it will slow down performance so much.
> >
> > If you know that, it doesn't take Einstein to figure out that maybe
> > you should add logging when CWND is one and we're sending out an ACK.
> >
> > This is why I think you're very lazy, Ben, and why I get very agitated
> > with all of your reports: you put zero effort into thinking about how
> > to debug the problem even though you know full well how to do it.
>
> I've spent solid weeks tracking down obscure races. I'm hoping that
> someone who knows the tcp stack will have some idea of places to look
> based on the reported symptoms so that I don't have to spend another
> solid week chasing this one. If not, so be it... I'm still working on
> this between sending emails. For what it's worth, the problem (or something similar)
> is reproducible on a stock FC5 .18-ish kernel as well, running between
> two machines, 2 ports each.
>
> Ben
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
>
* Re: TCP connection stops after high load.
2007-04-11 21:06 ` Ben Greear
2007-04-11 21:11 ` David Miller
@ 2007-04-12 1:06 ` Benjamin LaHaise
2007-04-12 14:48 ` Andi Kleen
2 siblings, 0 replies; 41+ messages in thread
From: Benjamin LaHaise @ 2007-04-12 1:06 UTC (permalink / raw)
To: Ben Greear; +Cc: David Miller, netdev
On Wed, Apr 11, 2007 at 02:06:31PM -0700, Ben Greear wrote:
> For the dup acks, I see nothing *but* dup acks on the wire...going in
> both directions interestingly, at greater than 100,000 packets per second.
>
> I don't mind adding printks...and I've started reading through the code,
> but there is a lot of it, and indiscriminate printks will likely just
> hide the problem because it will slow down performance so much.
What do the timestamps look like? PAWS contains logic which will drop
packets if the timestamps are too old compared to what the receiver
expects.
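A simplified sketch of that check (the RFC 1323 logic, not the
kernel's literal code):

#include <stdint.h>
#include <time.h>

/* Reject a segment whose timestamp went backwards, unless the stored
 * timestamp is itself more than 24 days old. */
static int paws_reject(int32_t seg_tsval, int32_t ts_recent,
                       time_t ts_recent_stamp, time_t now)
{
	/* wraparound-safe "seg_tsval < ts_recent" comparison */
	if ((int32_t)(seg_tsval - ts_recent) < 0 &&
	    now - ts_recent_stamp <= 24 * 24 * 60 * 60)
		return 1;	/* drop: timestamp too old */
	return 0;
}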
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.
* Re: TCP connection stops after high load.
2007-04-11 21:06 ` Ben Greear
2007-04-11 21:11 ` David Miller
2007-04-12 1:06 ` Benjamin LaHaise
@ 2007-04-12 14:48 ` Andi Kleen
2007-04-12 17:59 ` Ben Greear
2 siblings, 1 reply; 41+ messages in thread
From: Andi Kleen @ 2007-04-12 14:48 UTC (permalink / raw)
To: Ben Greear; +Cc: David Miller, netdev
Ben Greear <greearb@candelatech.com> writes:
>
> I don't mind adding printks...and I've started reading through the code,
> but there is a lot of it, and indiscriminate printks will likely just
> hide the problem because it will slow down performance so much.
You could add /proc/net/snmp counters for interesting events (e.g. GFP_ATOMIC
allocations failing). Perhaps netstat -s already shows something interesting.
-Andi
* Re: TCP connection stops after high load.
2007-04-12 14:48 ` Andi Kleen
@ 2007-04-12 17:59 ` Ben Greear
2007-04-12 18:19 ` Eric Dumazet
0 siblings, 1 reply; 41+ messages in thread
From: Ben Greear @ 2007-04-12 17:59 UTC (permalink / raw)
To: Andi Kleen; +Cc: netdev, bcrl
Andi Kleen wrote:
> Ben Greear <greearb@candelatech.com> writes:
>> I don't mind adding printks...and I've started reading through the code,
>> but there is a lot of it, and indiscriminate printks will likely just
>> hide the problem because it will slow down performance so much.
>
> You could add /proc/net/snmp counters for interesting events (e.g. GFP_ATOMIC
> allocations failing). Perhaps netstat -s already shows something interesting.
I will look for more interesting events to add counters for; thanks
for the suggestion. Thanks for the rest of the suggestions and patches
from others as well; I will be trying those out today and will let you
know how it goes. I can also try this on the 2.6.20 kernel.
This is on the machine connected to itself, which is by far the
easiest way to reproduce the problem. The output below is from the
stalled state. About 3-5 minutes later (I wasn't watching too
closely), the connection briefly started up again and then stalled
again. While it is stalled and sending ACKs, the netstat -an counters
remain the same. It appears this run/stall behaviour happens
repeatedly, as the overall bits-per-second average overnight was
around 90Mbps, while it runs at ~800Mbps at full speed.
from netstat -an:
tcp 0 759744 20.20.20.30:33012 20.20.20.20:33011 ESTABLISHED
tcp 0 722984 20.20.20.20:33011 20.20.20.30:33012 ESTABLISHED
I'm not sure if netstat -s shows interesting things or not... it does
show a very large number of packets in and out. I ran it twice, about
5 seconds apart. I pasted some values from the second run on the
right-hand side where the numbers looked interesting. This info is at
the bottom of this email.
For GFP_ATOMIC allocation failures, don't those show up as order-X
allocation failure messages in the kernel log (I see no messages of
this type)?
Here is a tcpdump of the connection in the stalled state. As you can
see from the 'time' output, it's running at around 100,000 packets per
second; tcpdump dropped the vast majority of these. Based on the
network interface stats, I believe both sides of the connection are
sending ACKs at about the same rate (about 160 kpps when tcpdump is
not running, it seems).
10:46:46.541490 IP 20.20.20.20.33011 > 20.20.20.30.33012: . ack 48 win 6132 <nop,nop,timestamp 85158912 84963208>
10:46:46.541494 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
10:46:46.541567 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
10:46:46.541653 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
10:46:46.541886 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
10:46:46.541891 IP 20.20.20.20.33011 > 20.20.20.30.33012: . ack 48 win 6132 <nop,nop,timestamp 85158912 84963208>
10:46:46.541895 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
10:46:46.541988 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
10:46:46.542077 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
10:46:46.542307 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158913 84963208>
10:46:46.542312 IP 20.20.20.20.33011 > 20.20.20.30.33012: . ack 48 win 6132 <nop,nop,timestamp 85158913 84963208>
10:46:46.542321 IP 20.20.20.20.33011 > 20.20.20.30.33012: . ack 48 win 6132 <nop,nop,timestamp 85158913 84963208>
10:46:46.542410 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158913 84963208>
10:46:46.542494 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158913 84963208>
10:46:46.542708 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158913 84963208>
10:46:46.542718 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158913 84963208>
10:46:46.542735 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158913 84963208>
10:46:46.542818 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158913 84963208>
10:46:46.542899 IP 20.20.20.20.33011 > 20.20.20.30.33012: . ack 48 win 6132 <nop,nop,timestamp 85158913 84963208>
4214 packets captured
253889 packets received by filter
244719 packets dropped by kernel
real 0m2.640s
user 0m0.067s
sys 0m0.079s
Two netstat -s outputs... about 5 seconds apart.
[root@lf1001-240 ipv4]# netstat -s
Ip:
2823452436 total packets received 2840939253 total packets received
1 with invalid addresses
0 forwarded
0 incoming packets discarded
2823452435 incoming packets delivered 2840939252 incoming packets delivered
1549687963 requests sent out 1565951477 requests sent out
Icmp:
0 ICMP messages received
0 input ICMP message failed.
ICMP input histogram:
0 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
Tcp:
77 active connections openings
74 passive connection openings
0 failed connection attempts
122 connection resets received
10 connections established
2823426197 segments received 2840914122 segments received
1549683727 segments send out 1565948373 segments send out
2171 segments retransmited 2187 segments retransmited
0 bad segments received.
2203 resets sent
Udp:
21739 packets received
0 packets to unknown port received.
0 packet receive errors
4236 packets sent
TcpExt:
1164 invalid SYN cookies received
31323 packets pruned from receive queue because of socket buffer overrun 31337
4 TCP sockets finished time wait in fast timer
8 packets rejects in established connections because of timestamp
91542 delayed acks sent 91645
1902 delayed acks further delayed because of locked socket
Quick ack mode was activated 2201 times
2 packets directly queued to recvmsg prequeue.
1323185164 packets header predicted 1324477473
63077636 acknowledgments not containing data received 63141338
17021279 predicted acknowledgments 17043867
2035 times recovered from packet loss due to fast retransmit
8 times recovered from packet loss due to SACK data
Detected reordering 13 times using reno fast retransmit
Detected reordering 642 times using time stamp
1971 congestion windows fully recovered
16017 congestion windows partially recovered using Hoe heuristic
19 congestion windows recovered after partial ack
0 TCP data loss events
1 timeouts in loss state
225 fast retransmits
3 forward retransmits
151 other TCP timeouts
TCPRenoRecoveryFail: 1
11658529 packets collapsed in receive queue due to low socket buffer 11664170
123 DSACKs sent for old packets
70 DSACKs received
132 connections aborted due to timeout
[root@lf1001-240 ipv4]# netstat -s
Ip:
2840939253 total packets received
1 with invalid addresses
0 forwarded
0 incoming packets discarded
2840939252 incoming packets delivered
1565951477 requests sent out
Icmp:
0 ICMP messages received
0 input ICMP message failed.
ICMP input histogram:
0 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
Tcp:
77 active connections openings
74 passive connection openings
0 failed connection attempts
122 connection resets received
10 connections established
2840914122 segments received
1565948373 segments send out
2187 segments retransmited
0 bad segments received.
2203 resets sent
Udp:
21755 packets received
0 packets to unknown port received.
0 packet receive errors
4239 packets sent
TcpExt:
1164 invalid SYN cookies received
31337 packets pruned from receive queue because of socket buffer overrun
4 TCP sockets finished time wait in fast timer
8 packets rejects in established connections because of timestamp
91645 delayed acks sent
1912 delayed acks further delayed because of locked socket
Quick ack mode was activated 2217 times
2 packets directly queued to recvmsg prequeue.
1324477473 packets header predicted
63141338 acknowledgments not containing data received
17043867 predicted acknowledgments
2037 times recovered from packet loss due to fast retransmit
8 times recovered from packet loss due to SACK data
Detected reordering 13 times using reno fast retransmit
Detected reordering 642 times using time stamp
1973 congestion windows fully recovered
16021 congestion windows partially recovered using Hoe heuristic
19 congestion windows recovered after partial ack
0 TCP data loss events
1 timeouts in loss state
225 fast retransmits
3 forward retransmits
153 other TCP timeouts
TCPRenoRecoveryFail: 1
11664170 packets collapsed in receive queue due to low socket buffer
123 DSACKs sent for old packets
70 DSACKs received
132 connections aborted due to timeout
>
> -Andi
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: TCP connection stops after high load.
2007-04-12 17:59 ` Ben Greear
@ 2007-04-12 18:19 ` Eric Dumazet
2007-04-12 19:12 ` Ben Greear
2007-04-13 16:10 ` Daniel Schaffrath
0 siblings, 2 replies; 41+ messages in thread
From: Eric Dumazet @ 2007-04-12 18:19 UTC (permalink / raw)
To: Ben Greear; +Cc: Andi Kleen, netdev, bcrl
On Thu, 12 Apr 2007 10:59:19 -0700
Ben Greear <greearb@candelatech.com> wrote:
>
> Here is a tcpdump of the connection in the stalled state. As you can
> see from the 'time' output, it's running at around 100,000 packets per
> second; tcpdump dropped the vast majority of these. Based on the
> network interface stats, I believe both sides of the connection are
> sending ACKs at about the same rate (about 160 kpps when tcpdump is
> not running, it seems).
Warning: tcpdump can lie, showing you packets as being transmitted
several times. And yes, tcpdump slows things down, because it enables
accurate timestamping of packets.
>
>
> 10:46:46.541490 IP 20.20.20.20.33011 > 20.20.20.30.33012: . ack 48 win 6132 <nop,nop,timestamp 85158912 84963208>
> 10:46:46.541494 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
> 10:46:46.541567 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
> 10:46:46.541653 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
> 10:46:46.541886 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
> 10:46:46.541891 IP 20.20.20.20.33011 > 20.20.20.30.33012: . ack 48 win 6132 <nop,nop,timestamp 85158912 84963208>
> 10:46:46.541895 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
> 10:46:46.541988 IP 20.20.20.30.33012 > 20.20.20.20.33011: . ack 1 win 114 <nop,nop,timestamp 85158912 84963208>
>
What do
"tc -s -d qdisc"
"ifconfig -a"
"cat /proc/interrupts"
"cat /proc/net/sockstat" and
"cat /proc/net/softnet_stat" show?
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-12 18:19 ` Eric Dumazet
@ 2007-04-12 19:12 ` Ben Greear
2007-04-12 20:41 ` Eric Dumazet
2007-04-13 16:10 ` Daniel Schaffrath
1 sibling, 1 reply; 41+ messages in thread
From: Ben Greear @ 2007-04-12 19:12 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Andi Kleen, netdev, bcrl
Eric Dumazet wrote:
> What
> "tc -s -d qdisc"
> "ifconfig -a"
> "cat /proc/interrupts"
> "cat /proc/net/sockstat" and
> "cat /proc/net/softnet_stat" are telling ?
In this test, eth2 is talking to eth3, using something similar to this
send-to-self patch:
http://www.candelatech.com/oss/sts.patch
[root@lf1001-240 ipv4]# ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:30:48:89:74:60
inet addr:192.168.100.187 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1672220 errors:0 dropped:0 overruns:0 frame:0
TX packets:1560305 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:151896589 (144.8 MiB) TX bytes:1375280163 (1.2 GiB)
Interrupt:17
eth1 Link encap:Ethernet HWaddr 00:30:48:89:74:61
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:18
eth2 Link encap:Ethernet HWaddr 00:07:E9:1F:CE:02
inet addr:20.20.20.20 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2175144684 errors:0 dropped:2 overruns:0 frame:0
TX packets:2196560123 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1321380186 (1.2 GiB) TX bytes:2274982574 (2.1 GiB)
Base address:0xd000 Memory:d0000000-d0020000
eth3 Link encap:Ethernet HWaddr 00:07:E9:1F:CE:03
inet addr:20.20.20.30 Bcast:20.20.20.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2196315966 errors:0 dropped:0 overruns:0 frame:0
TX packets:2174900538 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2257901986 (2.1 GiB) TX bytes:1304493504 (1.2 GiB)
Base address:0xd100 Memory:d0020000-d0040000
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1159378 errors:0 dropped:0 overruns:0 frame:0
TX packets:1159378 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1133646590 (1.0 GiB) TX bytes:1133646590 (1.0 GiB)
[root@lf1001-240 ipv4]# tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 1367521025 bytes 1324808 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 1815657070136 bytes 1536674488 pkt (dropped 0, overlimits 0 requeues 1448094)
rate 0bit 0pps backlog 0b 0p requeues 1448094
qdisc pfifo_fast 0: dev eth3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 1752033393324 bytes 1536906566 pkt (dropped 0, overlimits 0 requeues 1063672)
rate 0bit 0pps backlog 0b 0p requeues 1063672
[root@lf1001-240 ipv4]# cat /proc/interrupts
CPU0 CPU1
0: 46020594 44501954 IO-APIC-edge timer
1: 9 0 IO-APIC-edge i8042
7: 0 0 IO-APIC-edge parport0
8: 1 0 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
12: 96 0 IO-APIC-edge i8042
14: 394023 407282 IO-APIC-edge ide0
16: 0 0 IO-APIC-level uhci_hcd:usb4
17: 1134346 1034006 IO-APIC-level uhci_hcd:usb3, eth0
18: 81605 83739 IO-APIC-level libata, uhci_hcd:usb2, eth1
19: 0 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
20: 53056128 46598235 IO-APIC-level eth2
21: 47534577 52189674 IO-APIC-level eth3
NMI: 0 0
LOC: 90485383 90485382
ERR: 0
MIS: 0
[root@lf1001-240 ipv4]# cat /proc/net/sockstat
sockets: used 334
TCP: inuse 27 orphan 0 tw 0 alloc 27 mem 360
UDP: inuse 12
RAW: inuse 0
FRAG: inuse 0 memory 0
[root@lf1001-240 ipv4]# cat /proc/net/softnet_stat
d58236f1 00000000 023badc3 00000000 00000000 00000000 00000000 00000000 0004ef01
3a4354a1 00000000 01b57b4b 00000000 00000000 00000000 00000000 00000000 0005445f
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-12 19:12 ` Ben Greear
@ 2007-04-12 20:41 ` Eric Dumazet
2007-04-12 21:36 ` Ben Greear
0 siblings, 1 reply; 41+ messages in thread
From: Eric Dumazet @ 2007-04-12 20:41 UTC (permalink / raw)
To: Ben Greear; +Cc: Andi Kleen, netdev, bcrl
Ben Greear wrote:
> Eric Dumazet wrote:
>
>> What "tc -s -d qdisc"
>> "ifconfig -a"
>> "cat /proc/interrupts" "cat /proc/net/sockstat" and
>> "cat /proc/net/softnet_stat" are telling ?
>
>
> In this test, eth2 is talking to eth3, using something similar to this
> send-to-self patch:
> http://www.candelatech.com/oss/sts.patch
>
> [root@lf1001-240 ipv4]# cat /proc/interrupts
> CPU0 CPU1
> 0: 46020594 44501954 IO-APIC-edge timer
> 1: 9 0 IO-APIC-edge i8042
> 7: 0 0 IO-APIC-edge parport0
> 8: 1 0 IO-APIC-edge rtc
> 9: 0 0 IO-APIC-level acpi
> 12: 96 0 IO-APIC-edge i8042
> 14: 394023 407282 IO-APIC-edge ide0
> 16: 0 0 IO-APIC-level uhci_hcd:usb4
> 17: 1134346 1034006 IO-APIC-level uhci_hcd:usb3, eth0
> 18: 81605 83739 IO-APIC-level libata, uhci_hcd:usb2, eth1
> 19: 0 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
> 20: 53056128 46598235 IO-APIC-level eth2
> 21: 47534577 52189674 IO-APIC-level eth3
> NMI: 0 0
> LOC: 90485383 90485382
> ERR: 0
> MIS: 0
>
Hmm, could you try binding the NIC IRQs to separate CPUs?
eth2 -> CPU0 and eth3 -> CPU1
# echo 1 >/proc/irq/20/smp_affinity   # bitmask 0x1 = CPU0 (eth2 is IRQ 20)
# echo 2 >/proc/irq/21/smp_affinity   # bitmask 0x2 = CPU1 (eth3 is IRQ 21)
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-12 20:41 ` Eric Dumazet
@ 2007-04-12 21:36 ` Ben Greear
2007-04-13 7:09 ` Evgeniy Polyakov
0 siblings, 1 reply; 41+ messages in thread
From: Ben Greear @ 2007-04-12 21:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Andi Kleen, netdev, bcrl
Eric Dumazet wrote:
> Hum, could you try to bind nic irqs on separate cpus ?
I just started a run on 2.6.20.4, and so far (~20 minutes) it
is behaving perfectly, running at around 925Mbps in both directions.
CWND averages about 600, bouncing from a low of 300 up to 800, but
that could very well be perfectly normal. I'm quite pleased with
the faster performance in this kernel as well; it seems the old one
would rarely get above 800Mbps even when it was passing traffic!
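(For reference: one way to sample CWND from userspace is the TCP_INFO
getsockopt. The snippet below is a minimal sketch assuming a connected
TCP socket fd; it is illustrative, not the tool that produced the
numbers above.)

#include <linux/tcp.h>		/* struct tcp_info, TCP_INFO */
#include <netinet/in.h>		/* IPPROTO_TCP */
#include <stdio.h>
#include <sys/socket.h>

/* Print the sender's current congestion window and related state. */
static void print_cwnd(int fd)
{
	struct tcp_info ti;
	socklen_t len = sizeof(ti);

	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) == 0)
		printf("snd_cwnd=%u ssthresh=%u rtt=%u us\n",
		       ti.tcpi_snd_cwnd, ti.tcpi_snd_ssthresh, ti.tcpi_rtt);
}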
I am not sure if the problem is fixed or just harder to hit,
but for now it looks good.
I'm going to also try a 2.6.19 kernel and see if the problem hits there
in an attempt to figure out what patch changed the behaviour.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-12 21:36 ` Ben Greear
@ 2007-04-13 7:09 ` Evgeniy Polyakov
2007-04-13 16:42 ` Ben Greear
0 siblings, 1 reply; 41+ messages in thread
From: Evgeniy Polyakov @ 2007-04-13 7:09 UTC (permalink / raw)
To: Ben Greear; +Cc: Eric Dumazet, Andi Kleen, netdev, bcrl
On Thu, Apr 12, 2007 at 02:36:34PM -0700, Ben Greear (greearb@candelatech.com) wrote:
> I am not sure if the problem is fixed or just harder to hit,
> but for now it looks good.
Wasn't the default congestion control algorithm changed between those
kernel releases?
With such a small RTT as in your setup there could be some obscure bug;
try setting a different algorithm and check whether it still works.
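For a per-socket switch there is the TCP_CONGESTION socket option
(available since 2.6.13); the system-wide default lives in
/proc/sys/net/ipv4/tcp_congestion_control. A minimal sketch, assuming
the chosen algorithm is compiled in or loaded as a module:

#include <netinet/in.h>
#include <netinet/tcp.h>	/* TCP_CONGESTION; <linux/tcp.h> on older libcs */
#include <string.h>
#include <sys/socket.h>

/* Select a congestion control algorithm, e.g. "reno" or "bic",
 * for one socket. Returns 0 on success, -1 on error. */
static int set_cc(int fd, const char *algo)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
			  algo, strlen(algo));
}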
--
Evgeniy Polyakov
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-13 7:09 ` Evgeniy Polyakov
@ 2007-04-13 16:42 ` Ben Greear
0 siblings, 0 replies; 41+ messages in thread
From: Ben Greear @ 2007-04-13 16:42 UTC (permalink / raw)
To: Evgeniy Polyakov; +Cc: Eric Dumazet, Andi Kleen, netdev, bcrl
Evgeniy Polyakov wrote:
> On Thu, Apr 12, 2007 at 02:36:34PM -0700, Ben Greear (greearb@candelatech.com) wrote:
>
>> I am not sure if the problem is fixed or just harder to hit,
>> but for now it looks good.
>>
>
> Wasn't the default congestion control algorithm changed between those
> kernel releases?
> With such a small RTT as in your setup there could be some obscure bug;
> try setting a different algorithm and check whether it still works.
>
I had earlier tried changing between bic and reno (the only two I had
compiled into that kernel), and it did not affect anything. I also
realized that I had been reproducing the bug (and the traces I sent to
this list earlier) on a 2.6.17.4 kernel, not 2.6.18 as I had supposed.
So, it's possible that the problem was fixed between 2.6.17.4 and
2.6.18.2 as well.
I also figured out yesterday that rebooting to go to a new kernel makes
it slower to reproduce, even on kernels known to have the problem. This
is probably because lots of memory is available after a reboot. I am
going to set up some long-term tests on 2.6.18, 2.6.19 and 2.6.20 and
let them cook for several days to make sure the problem is truly fixed
in the later kernels.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-12 18:19 ` Eric Dumazet
2007-04-12 19:12 ` Ben Greear
@ 2007-04-13 16:10 ` Daniel Schaffrath
2007-04-13 16:41 ` Eric Dumazet
1 sibling, 1 reply; 41+ messages in thread
From: Daniel Schaffrath @ 2007-04-13 16:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Ben Greear, Andi Kleen, netdev, bcrl
On 2007/04/12, at 20:19, Eric Dumazet wrote:
> On Thu, 12 Apr 2007 10:59:19 -0700
> Ben Greear <greearb@candelatech.com> wrote:
>>
>> Here is a tcpdump of the connection in the stalled state. As you
>> can see by
>> the 'time' output, it's running at around 100,000 packets per
>> second. tcpdump
>> dropped the vast majority of these. Based on the network
>> interface stats, I
>> believe both sides of the connection are sending acks at about the
>> same
>> rate (about 160kpps when tcpdump is not running it seems).
>
> Warning: tcpdump can lie, telling you a packet was transmitted
> several times when it was only sent once.
Do you have any further pointers on why tcpdump reports
duplicated packets?
Thanks a lot,
Daniel
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-13 16:10 ` Daniel Schaffrath
@ 2007-04-13 16:41 ` Eric Dumazet
2007-04-14 4:21 ` Herbert Xu
0 siblings, 1 reply; 41+ messages in thread
From: Eric Dumazet @ 2007-04-13 16:41 UTC (permalink / raw)
To: Daniel Schaffrath; +Cc: Ben Greear, Andi Kleen, netdev, bcrl
On Fri, 13 Apr 2007 18:10:12 +0200
Daniel Schaffrath <danielschaffrath@mac.com> wrote:
>
> On 2007/04/12, at 20:19, Eric Dumazet wrote:
> >
> > Warning: tcpdump can lie, telling you a packet was transmitted
> > several times when it was only sent once.
> Do you have any further pointers on why tcpdump reports
> duplicated packets?
>
dev_queue_xmit_nit() is called before attempting to send the packet to the device.
If the device could not accept the packet (hard_start_xmit() returns an error), the packet is requeued and retried later.
Each retry calls dev_queue_xmit_nit() again, so tcpdump/sniffers can 'see' the packet transmitted several times.
This is why I asked for the "tc -s -d qdisc" results: to check the requeue counter (not its absolute value, but relative to the number of packets sent).
See dev_hard_start_xmit() in net/core/dev.c
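In outline (a condensed sketch of the 2.6-era code in net/core/dev.c
and net/sched/sch_generic.c; the function and constant names are real,
the bodies are simplified and locking/error handling are elided):

/* Condensed sketch, not the literal kernel source. */
static int sketch_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	/* Taps (tcpdump/af_packet) get a copy *before* the driver runs. */
	if (netdev_nit)
		dev_queue_xmit_nit(skb, dev);

	/* The driver may refuse the packet, e.g. NETDEV_TX_BUSY or
	 * NETDEV_TX_LOCKED when its ring or lock is unavailable. */
	return dev->hard_start_xmit(skb, dev);
}

static void sketch_qdisc_restart(struct net_device *dev, struct sk_buff *skb)
{
	if (sketch_hard_start_xmit(skb, dev) != NETDEV_TX_OK)
		/* Requeue and retry later: the tap delivery above runs
		 * again, so sniffers record the same packet twice. */
		dev->qdisc->ops->requeue(skb, dev->qdisc);
}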
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-13 16:41 ` Eric Dumazet
@ 2007-04-14 4:21 ` Herbert Xu
2007-04-14 4:25 ` David Miller
2007-04-14 5:31 ` Eric Dumazet
0 siblings, 2 replies; 41+ messages in thread
From: Herbert Xu @ 2007-04-14 4:21 UTC (permalink / raw)
To: Eric Dumazet; +Cc: danielschaffrath, greearb, andi, netdev, bcrl
Eric Dumazet <dada1@cosmosbay.com> wrote:
>
> dev_queue_xmit_nit() is called before attempting to send the packet to the device.
>
> If the device could not accept the packet (hard_start_xmit() returns an error), the packet is requeued and retried later.
> Each retry calls dev_queue_xmit_nit() again, so tcpdump/sniffers can 'see' the packet transmitted several times.
This should only happen with LLTX drivers. In fact, LLTX drivers are
really more trouble than they're worth. They should all be rewritten
to follow the model used in tg3.
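The contrast, schematically (a sketch only; struct my_priv and its
lock are hypothetical, not taken from any particular driver):

struct my_priv {			/* hypothetical driver state */
	spinlock_t tx_lock;
};

/* LLTX: the core takes no lock before calling hard_start_xmit(), so
 * the driver must trylock its own and may refuse the packet, which
 * forces the requeue/retry path described above. */
static int lltx_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct my_priv *priv = netdev_priv(dev);

	if (!spin_trylock(&priv->tx_lock))
		return NETDEV_TX_LOCKED;	/* requeued; taps may see a dup */
	/* ... hand skb to the hardware ring ... */
	spin_unlock(&priv->tx_lock);
	return NETDEV_TX_OK;
}

/* Non-LLTX (tg3-style): the core serializes on dev->xmit_lock before
 * calling hard_start_xmit(), so the driver never returns
 * NETDEV_TX_LOCKED and the normal path has nothing to duplicate. */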
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 41+ messages in thread* Re: TCP connection stops after high load.
2007-04-14 4:21 ` Herbert Xu
@ 2007-04-14 4:25 ` David Miller
2007-04-14 5:31 ` Eric Dumazet
1 sibling, 0 replies; 41+ messages in thread
From: David Miller @ 2007-04-14 4:25 UTC (permalink / raw)
To: herbert; +Cc: dada1, danielschaffrath, greearb, andi, netdev, bcrl
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat, 14 Apr 2007 14:21:44 +1000
> Eric Dumazet <dada1@cosmosbay.com> wrote:
> >
> > dev_queue_xmit_nit() is called before attempting to send the packet to the device.
> >
> > If the device could not accept the packet (hard_start_xmit() returns an error), the packet is requeued and retried later.
> > Each retry calls dev_queue_xmit_nit() again, so tcpdump/sniffers can 'see' the packet transmitted several times.
>
> This should only happen with LLTX drivers. In fact, LLTX drivers are
> really more trouble than they're worth. They should all be rewritten
> to follow the model used in tg3.
Agreed.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-14 4:21 ` Herbert Xu
2007-04-14 4:25 ` David Miller
@ 2007-04-14 5:31 ` Eric Dumazet
2007-04-14 5:37 ` David Miller
1 sibling, 1 reply; 41+ messages in thread
From: Eric Dumazet @ 2007-04-14 5:31 UTC (permalink / raw)
To: Herbert Xu; +Cc: danielschaffrath, greearb, andi, netdev, bcrl
Herbert Xu wrote:
> Eric Dumazet <dada1@cosmosbay.com> wrote:
>> dev_queue_xmit_nit() is called before attempting to send the packet to the device.
>>
>> If the device could not accept the packet (hard_start_xmit() returns an error), the packet is requeued and retried later.
>> Each retry calls dev_queue_xmit_nit() again, so tcpdump/sniffers can 'see' the packet transmitted several times.
>
> This should only happen with LLTX drivers. In fact, LLTX drivers are
> really more trouble than they're worth. They should all be rewritten
> to follow the model used in tg3.
When did the tg3 model change, exactly?
Because I remember having this 'problem' with tg3 devices not long ago...
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-14 5:31 ` Eric Dumazet
@ 2007-04-14 5:37 ` David Miller
0 siblings, 0 replies; 41+ messages in thread
From: David Miller @ 2007-04-14 5:37 UTC (permalink / raw)
To: dada1; +Cc: herbert, danielschaffrath, greearb, andi, netdev, bcrl
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Sat, 14 Apr 2007 07:31:35 +0200
> When did tg3 model changed exactly ?
June of 2006:
commit 00b7050426da8e7e58c889c5c80a19920d2d41b3
Author: Michael Chan <mchan@broadcom.com>
Date: Sat Jun 17 21:58:45 2006 -0700
[TG3]: Convert to non-LLTX
Herbert Xu pointed out that it is unsafe to call netif_tx_disable()
from LLTX drivers because it uses dev->xmit_lock to synchronize
whereas LLTX drivers use private locks.
Convert tg3 to non-LLTX to fix this issue. tg3 is a lockless driver
where hard_start_xmit and tx completion handling can run concurrently
under normal conditions. A tx_lock is only needed to prevent
netif_stop_queue and netif_wake_queue race conditions when the queue
is full.
So whether we use LLTX or non-LLTX, it makes practically no
difference.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-11 18:50 Ben Greear
2007-04-11 20:26 ` Ben Greear
@ 2007-04-11 20:41 ` David Miller
2007-04-12 6:12 ` Ilpo Järvinen
2 siblings, 0 replies; 41+ messages in thread
From: David Miller @ 2007-04-11 20:41 UTC (permalink / raw)
To: greearb; +Cc: netdev
From: Ben Greear <greearb@candelatech.com>
Date: Wed, 11 Apr 2007 11:50:18 -0700
> So, I would like to dig into this problem myself since no one else
> is reporting this type of problem, but I am quite ignorant of the TCP
> stack implementation. Based on the dup-acks I see on the wire, I assume
> the TCP state machine is messed up somehow. Could anyone point me to
> likely places in the TCP stack to start looking for this bug?
Dup acks mean that packets are being dropped and there are thus holes
in the sequence seen at the receiver.
Likely what happens is that we hit the global memory pressure
limit, start dropping packets, and never recover even after the
memory pressure is within its limits again.
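The limit in question is the three-value tcp_mem sysctl ({low,
pressure, high}, counted in pages). Roughly, the global accounting
behaves like the sketch below (a simplification of
sk_stream_mem_schedule() in net/core/stream.c, with the per-socket
details elided):

/* Simplified model of the global TCP memory thresholds;
 * sysctl_tcp_mem and tcp_memory_pressure are the real 2.6 symbols. */
static int tcp_mem_sketch(long allocated_pages)
{
	if (allocated_pages <= sysctl_tcp_mem[0]) {
		tcp_memory_pressure = 0;  /* under "low": pressure clears */
		return 1;                 /* allocation succeeds */
	}
	if (allocated_pages > sysctl_tcp_mem[1])
		tcp_memory_pressure = 1;  /* over "pressure": economize */
	if (allocated_pages > sysctl_tcp_mem[2])
		return 0;                 /* over "high": allocations fail */
	return 1;
}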
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: TCP connection stops after high load.
2007-04-11 18:50 Ben Greear
2007-04-11 20:26 ` Ben Greear
2007-04-11 20:41 ` David Miller
@ 2007-04-12 6:12 ` Ilpo Järvinen
2 siblings, 0 replies; 41+ messages in thread
From: Ilpo Järvinen @ 2007-04-12 6:12 UTC (permalink / raw)
To: Ben Greear; +Cc: NetDev
On Wed, 11 Apr 2007, Ben Greear wrote:
> The problem is that I set up a TCP connection with bi-directional traffic
> of around 800Mbps, doing large (20k - 64k) writes and reads between two
> ports on the same machine (this 2.6.18.2 kernel is tainted with my full
> patch set, but I also reproduced with only the non-tainted send-to-self
> patch applied last May on the 2.6.16 kernel, so I assume the bug is not
> particular to my patch set).
>
> At first, all is well, but within 5-10 minutes, the TCP connection will
> stall and I only see a massive amount of duplicate ACKs on the link.
> Before, I sometimes saw OOM messages, but this time there are no OOM
> messages. The system has a two-port pro/1000 fibre NIC, 1GB RAM, kernel
> 2.6.18.2 + hacks, etc. Stopping and starting the connection allows
> traffic to flow again (if briefly). Starting a new connection works
> fine even if the old one is still stalled, so it's not a global memory
> exhaustion problem.
>
> So, I would like to dig into this problem myself since no one else
> is reporting this type of problem, but I am quite ignorant of the TCP
> stack implementation. Based on the dup-acks I see on the wire, I assume
> the TCP state machine is messed up somehow. Could anyone point me to
> likely places in the TCP stack to start looking for this bug?
Since you're doing bidirectional traffic, try the patch below (you'll
probably have to apply it manually to the 2.6.18 series, due to
whitespace changes made after it in the net/ hierarchy). I suspect it's
part of the problem, but there could be other things as well, because
this should only hinder TCP before an RTO occurs:
[PATCH] [TCP]: Fix ratehalving with bidirectional flows
Actually, the ratehalving seems to work too well, as cwnd is
reduced on every second ACK even though the packets in flight
remains unchanged. Recoveries in bidirectional flows suffer
quite badly because of this, both NewReno and SACK are affected.
After this patch, rate halving is performed per ACK only if
packets in flight was supposedly changed too.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
net/ipv4/tcp_input.c | 23 +++++++++++++----------
1 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 322e43c..bf0f74c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1823,19 +1823,22 @@ static inline u32 tcp_cwnd_min(const str
 }
 
 /* Decrease cwnd each second ack. */
-static void tcp_cwnd_down(struct sock *sk)
+static void tcp_cwnd_down(struct sock *sk, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	int decr = tp->snd_cwnd_cnt + 1;
 
-	tp->snd_cwnd_cnt = decr&1;
-	decr >>= 1;
+	if ((flag&FLAG_FORWARD_PROGRESS) ||
+	    (IsReno(tp) && !(flag&FLAG_NOT_DUP))) {
+		tp->snd_cwnd_cnt = decr&1;
+		decr >>= 1;
 
-	if (decr && tp->snd_cwnd > tcp_cwnd_min(sk))
-		tp->snd_cwnd -= decr;
+		if (decr && tp->snd_cwnd > tcp_cwnd_min(sk))
+			tp->snd_cwnd -= decr;
 
-	tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+		tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
+		tp->snd_cwnd_stamp = tcp_time_stamp;
+	}
 }
 
 /* Nothing was retransmitted or returned timestamp is less
@@ -2020,7 +2023,7 @@ static void tcp_try_to_open(struct sock
 		}
 		tcp_moderate_cwnd(tp);
 	} else {
-		tcp_cwnd_down(sk);
+		tcp_cwnd_down(sk, flag);
 	}
 }
@@ -2220,7 +2223,7 @@ tcp_fastretrans_alert(struct sock *sk, u
 	if (is_dupack || tcp_head_timedout(sk, tp))
 		tcp_update_scoreboard(sk, tp);
-	tcp_cwnd_down(sk);
+	tcp_cwnd_down(sk, flag);
 	tcp_xmit_retransmit_queue(sk);
 }
--
1.4.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
end of thread, other threads:[~2007-04-17 19:58 UTC | newest]
Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-12 21:11 TCP connection stops after high load Robert Iakobashvili
2007-04-12 21:15 ` David Miller
2007-04-15 12:14 ` Robert Iakobashvili
2007-04-15 15:31 ` John Heffner
2007-04-15 15:49 ` Robert Iakobashvili
2007-04-16 18:07 ` John Heffner
2007-04-16 18:51 ` Robert Iakobashvili
2007-04-16 19:11 ` John Heffner
2007-04-16 19:17 ` David Miller
2007-04-16 19:15 ` David Miller
2007-04-17 7:58 ` Robert Iakobashvili
2007-04-17 19:39 ` David Miller
2007-04-17 19:47 ` John Heffner
2007-04-17 19:51 ` David Miller
2007-04-17 19:58 ` Robert Iakobashvili
2007-04-15 13:52 ` Robert Iakobashvili
-- strict thread matches above, loose matches on Subject: below --
2007-04-11 18:50 Ben Greear
2007-04-11 20:26 ` Ben Greear
2007-04-11 20:48 ` David Miller
2007-04-11 21:06 ` Ben Greear
2007-04-11 21:11 ` David Miller
2007-04-11 21:31 ` Ben Greear
2007-04-11 21:39 ` David Miller
2007-04-12 2:44 ` SANGTAE HA
2007-04-12 1:06 ` Benjamin LaHaise
2007-04-12 14:48 ` Andi Kleen
2007-04-12 17:59 ` Ben Greear
2007-04-12 18:19 ` Eric Dumazet
2007-04-12 19:12 ` Ben Greear
2007-04-12 20:41 ` Eric Dumazet
2007-04-12 21:36 ` Ben Greear
2007-04-13 7:09 ` Evgeniy Polyakov
2007-04-13 16:42 ` Ben Greear
2007-04-13 16:10 ` Daniel Schaffrath
2007-04-13 16:41 ` Eric Dumazet
2007-04-14 4:21 ` Herbert Xu
2007-04-14 4:25 ` David Miller
2007-04-14 5:31 ` Eric Dumazet
2007-04-14 5:37 ` David Miller
2007-04-11 20:41 ` David Miller
2007-04-12 6:12 ` Ilpo Järvinen