From: lm@bitmover.com (Larry McVoy)
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: davem@davemloft.net, wscott@bitmover.com, netdev@vger.kernel.org
Subject: Re: tcp bw in 2.6
Date: Mon, 1 Oct 2007 17:59:18 -0700 [thread overview]
Message-ID: <20071002005917.GB5480@bitmover.com> (raw)
In-Reply-To: <alpine.LFD.0.999.0709291050200.3579@woody.linux-foundation.org>
On Sat, Sep 29, 2007 at 11:02:32AM -0700, Linus Torvalds wrote:
> On Sat, 29 Sep 2007, Larry McVoy wrote:
> > I haven't kept up on switch technology but in the past they were much
> > better than you are thinking. The Kalpana switch that I had modified
> > to support vlans (invented by yours truly), did not store and forward,
> > it was cut through and could handle any load that was theoretically
> > possible within about 1%.
>
> Hey, you may well be right. Maybe my assumptions about cutting corners are
> just cynical and pessimistic.
So I got a netgear switch and it works fine. But my tests are busted.
Catching netdev up, I'm trying to optimize traffic to a server that has
a gbit interface; I moved to a 24 port netgear that is all 10/100/1000
and I have a pile of clients to act as load generators.
I can do this on each of the clients
dd if=/dev/zero bs=1024000 | rsh work "dd of=/dev/null"
and that cranks up to about 47K packets/second which is about 70MB/sec.
One of my clients also has gigabit so I played around with just that
one and it (itanium running hpux w/ broadcom gigabit) can push the load
as well. One weird thing is that it is dependent on the direction the
data is flowing. If the hp is sending then I get 46MB/sec, if linux is
sending then I get 18MB/sec. Weird. Linux is debian, running
Linux work 2.6.18-5-k7 #1 SMP Thu Aug 30 02:52:31 UTC 2007 i686
and dual e1000 cards:
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
I wrote a tiny little program to try and emulate this and I can't get
it to do as well. I've tracked it down, I think, to the read side.
The server sources, the client sinks, the server looks like:
11689 accept(3, {sa_family=AF_INET, sin_port=htons(49376), sin_addr=inet_addr("10.3.1.38")}, [16]) = 4
11689 setsockopt(4, SOL_SOCKET, SO_RCVBUF, [1048576], 4) = 0
11689 setsockopt(4, SOL_SOCKET, SO_SNDBUF, [1048576], 4) = 0
11689 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7ddf708) = 11694
11689 close(4) = 0
11689 accept(3, <unfinished ...>
11694 write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
11694 write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
11694 write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
11694 write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
...
but the client looks like
connect(3, {sa_family=AF_INET, sin_port=htons(31235), sin_addr=inet_addr("10.3.9.1")}, 16) = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1448
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1448
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 2896
which I suspect may be the problem.
I played around with SO_RCVBUF/SO_SNDBUF and that didn't help. So any ideas why
a simple dd piped through rsh is kicking my ass? It must be something simple
but my test program is tiny and does nothing weird that I can see.
--
---
Larry McVoy lm at bitmover.com http://www.bitkeeper.com
next parent reply other threads:[~2007-10-02 1:30 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20070929142517.EC6AB5FB21@work.bitmover.com>
[not found] ` <alpine.LFD.0.999.0709290914410.3579@woody.linux-foundation.org>
[not found] ` <20070929172639.GB7037@bitmover.com>
[not found] ` <alpine.LFD.0.999.0709291050200.3579@woody.linux-foundation.org>
2007-10-02 0:59 ` Larry McVoy [this message]
2007-10-02 2:14 ` tcp bw in 2.6 Linus Torvalds
2007-10-02 2:20 ` Larry McVoy
2007-10-02 3:50 ` David Miller
2007-10-02 4:23 ` Larry McVoy
2007-10-02 15:06 ` John Heffner
2007-10-02 17:14 ` Rick Jones
2007-10-02 17:20 ` Larry McVoy
2007-10-02 18:01 ` Rick Jones
2007-10-02 18:40 ` Larry McVoy
2007-10-02 19:47 ` Rick Jones
2007-10-02 21:32 ` David Miller
2007-10-03 7:19 ` Bill Fink
2007-10-02 10:52 ` Herbert Xu
2007-10-02 15:09 ` Larry McVoy
2007-10-02 15:41 ` Larry McVoy
2007-10-02 16:25 ` Larry McVoy
2007-10-02 16:47 ` Stephen Hemminger
2007-10-02 16:49 ` Larry McVoy
2007-10-02 17:10 ` Stephen Hemminger
2007-10-15 12:40 ` Daniel Schaffrath
2007-10-15 15:49 ` Stephen Hemminger
2007-10-02 16:34 ` Linus Torvalds
2007-10-02 16:48 ` Larry McVoy
2007-10-02 21:16 ` David Miller
2007-10-02 21:26 ` Larry McVoy
2007-10-02 21:47 ` David Miller
2007-10-02 22:17 ` Rick Jones
2007-10-02 22:32 ` David Miller
2007-10-02 22:36 ` Larry McVoy
2007-10-02 22:59 ` Rick Jones
2007-10-03 8:02 ` David Miller
2007-10-02 16:48 ` Ben Greear
2007-10-02 17:11 ` Larry McVoy
2007-10-02 17:18 ` Ben Greear
2007-10-02 17:21 ` Larry McVoy
2007-10-02 17:54 ` Stephen Hemminger
2007-10-02 18:35 ` Larry McVoy
2007-10-02 18:29 ` John Heffner
2007-10-02 19:07 ` Larry McVoy
2007-10-02 19:29 ` Linus Torvalds
2007-10-02 20:31 ` David Miller
2007-10-02 19:33 ` Larry McVoy
2007-10-02 19:53 ` John Heffner
2007-10-02 20:14 ` Larry McVoy
2007-10-02 20:40 ` Rick Jones
2007-10-02 20:42 ` Wayne Scott
2007-10-02 21:56 ` Linus Torvalds
2007-10-02 19:27 ` Linus Torvalds
2007-10-02 19:53 ` Rick Jones
2007-10-02 20:33 ` David Miller
2007-10-02 20:44 ` Roland Dreier
2007-10-02 21:21 ` Larry McVoy
2007-10-03 21:13 ` Pekka Pietikainen
2007-10-03 21:23 ` Larry McVoy
2007-10-03 21:50 ` Pekka Pietikainen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20071002005917.GB5480@bitmover.com \
--to=lm@bitmover.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=wscott@bitmover.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.