From: Andi Kleen <ak@suse.de>
To: Tomasz Torcz <zdzichu@irc.pl>
Cc: netdev@oss.sgi.com
Subject: zero copy TX in benchmarks was Re: [Prism54-devel] Re: TxDescriptors -> 1024 default. Please not for every NIC!
Date: Thu, 20 May 2004 19:13:00 +0200
Message-ID: <20040520171300.GA24379@wotan.suse.de>
In-Reply-To: <20040520164516.GA9913@irc.pl>
On Thu, May 20, 2004 at 06:45:16PM +0200, Tomasz Torcz wrote:
> On Thu, May 20, 2004 at 09:38:11AM -0700, Jean Tourrilhes wrote:
> > I personally would stick with 100. The IrDA stack runs
> > perfectly fine with 15 buffers at 4 Mb/s. If 100 is not enough, I
> > think the problem is not the number of buffers, but somewhere else.
Not sure why you posted this in this thread; it has nothing to do
with the previous message.
>
> I don't know how trollish or true this comment is:
> http://bsd.slashdot.org/comments.pl?sid=106258&cid=9049422
Linux sk_buffs and BSD mbufs are not very different anymore.
The BSD mbufs have been getting more sk_buff'ish over time,
and sk_buffs have grown some properties of mbufs. Both have
changed to optionally pass references to memory around instead of
always copying, which is what counts here.
> but it suggests that Linux's stack, having no BSD-like mbuf functionality,
> is not perfect for fast transmission. Maybe some network guru
> can comment?
I have not read all the details, but I suppose they used sendmsg()
instead of sendfile() for this test. NetBSD can use zero copy TX
in this case; Linux can only do so with sendfile(), and sendmsg() will
copy. Obviously Linux will be slower then, because a copy can cost quite
a lot of CPU. Or rather, it is not really the CPU cost that is the
problem here but the bandwidth usage: very high speed networking is
essentially memory bandwidth limited, and copying through the CPU
adds additional bandwidth requirements to the memory subsystem.
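To make the two paths concrete, here is a minimal sketch in Python,
whose os.sendfile() wraps the sendfile(2) system call on Linux. The
function names and the chunk size are made up for illustration, and
this is not what the benchmark in question actually ran. The first
function pulls the file into a user-space buffer and pushes it back
into the kernel (the copying path); the second asks the kernel to feed
the socket straight from the page cache.

```python
import os
import socket


def send_with_copy(sock, path, chunk=64 * 1024):
    """Copying path: file -> user-space buffer -> socket."""
    total = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)      # copy from page cache to user space
            if not buf:
                break
            sock.sendall(buf)        # copy from user space to socket buffers
            total += len(buf)
    return total


def send_with_sendfile(sock, path):
    """sendfile path: the kernel moves data from the file to the socket."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # os.sendfile(out_fd, in_fd, offset, count) returns bytes sent.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:            # unexpected end of file
                break
            offset += sent
    return offset
```

Whether the second variant really avoids every copy depends on the
socket type and on the NIC and driver (scatter-gather and checksum
offload); the sketch only shows where the user-space copy disappears.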
There was an implementation of zero copy sendmsg() for Linux long ago,
but it was removed because it was fundamentally incompatible with good
SMP scaling: it would require remote TLB flushes over possibly
many CPUs (if you search the archives of this list you will find
long threads about it). It would not be very hard to re-add (Linux
has all the low level infrastructure needed for it), but
it doesn't make sense. NetBSD may have the luxury of not caring
about MP scaling, but Linux doesn't.
The disadvantage of sendfile() is that you can only transmit files
directly; if you want to transmit data directly out of a process's
address space you have to put it into an mmap'ed file and sendfile()
from there. This may be a bit inconvenient if the basic unit
of data in your program isn't files.
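The mmap workaround can be sketched roughly as follows, again in
Python (os.sendfile() and mmap.mmap() wrap the corresponding system
calls). The helper name send_buffer_via_file is hypothetical, and the
use of an anonymous temporary file as the backing store is an
assumption of this sketch; a real program would generate its data
directly in the mapping rather than copying a finished buffer into it.

```python
import mmap
import os
import socket
import tempfile


def send_buffer_via_file(sock, payload):
    """Stage data in a shared file mapping, then sendfile() the file."""
    size = len(payload)
    if size == 0:
        return 0                          # an empty file cannot be mmap'ed
    with tempfile.TemporaryFile() as f:
        f.truncate(size)                  # size the backing file
        # On Unix, mmap.mmap() defaults to a shared, read-write mapping,
        # so stores into it land in the file's page cache.
        with mmap.mmap(f.fileno(), size) as m:
            # Stand-in for the program producing its data in place.
            m[:] = payload
            offset = 0
            while offset < size:
                sent = os.sendfile(sock.fileno(), f.fileno(),
                                   offset, size - offset)
                if sent == 0:
                    break
                offset += sent
    return offset
```

For example, with a connected stream socket `sock`,
send_buffer_via_file(sock, b"some bytes") returns the number of bytes
handed to the kernel. The staging file is exactly the inconvenience
described above: the program has to organize its output around a file
mapping instead of ordinary buffers.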
There was a plan suggested to fix that (implement zero copy TX for
POSIX AIO instead of BSD sockets), which would not have this problem.
POSIX AIO has all the infrastructure to do zero copy IO without
problematic and slow TLB flushes. So far, though, nobody has implemented it.
In practice it is not too big an issue because many tuned servers
(your typical ftpd, httpd or Samba server) use sendfile() already.
-Andi