From: Andi Kleen <ak@suse.de>
To: "David S. Miller" <davem@redhat.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Fire Engine??
Date: Wed, 26 Nov 2003 23:39:18 +0100
Message-ID: <20031126233918.2af3aae5.ak@suse.de>
In-Reply-To: <20031126113040.3b774360.davem@redhat.com>
On Wed, 26 Nov 2003 11:30:40 -0800
"David S. Miller" <davem@redhat.com> wrote:
>
> > - On TX we are inefficient for the same reason. TCP builds one packet
> > at a time and then goes down through all layers taking all locks (queue,
> > device driver etc.) and submits the single packet. Then repeats that for
> > lots of packets because many TCP writes are > MTU. Batching that would
> > likely help a lot, like it was done in the 2.6 VFS. I think it could
> > also make hard_start_xmit in many drivers significantly faster.
>
> This is tricky, because of getting all of the queueing stuff right.
> All of the packet scheduler APIs would need to change, as would
> the classification stuff, not to mention netfilter et al.
You only need to do a fast path for the default scheduler at the beginning.
Every complicated "slow" API like advanced queuing or netfilter can still fall back to
one packet at a time until it is cleaned up (the same strategy that was used for
non-linear skbs).
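To make the fallback idea concrete, here is a minimal userspace sketch, not kernel code. Every name in it (toy_queue, toy_xmit_one, toy_xmit_batch, the can_batch flag) is invented for illustration, and the "lock" is only a counter so that the cost is visible. A queue that cannot batch is simply fed one packet at a time, as before.

```c
#include <stddef.h>

/* Toy stand-in for a TX queue; hypothetical, not a kernel structure. */
struct toy_queue {
    int can_batch;            /* 1 = default scheduler fast path (assumed) */
    unsigned long lock_taken; /* how often the queue lock was taken */
    unsigned long sent;       /* packets handed to the "driver" */
};

/* Slow path: one lock round trip per packet. */
static void toy_xmit_one(struct toy_queue *q)
{
    q->lock_taken++;          /* lock */
    q->sent++;                /* enqueue and kick the driver */
                              /* unlock */
}

/* Submit n packets. Take the lock once if the queue can batch,
 * otherwise fall back to one packet at a time. */
static void toy_xmit_batch(struct toy_queue *q, size_t n)
{
    size_t i;

    if (n == 0)
        return;
    if (!q->can_batch) {
        for (i = 0; i < n; i++)
            toy_xmit_one(q);
        return;
    }
    q->lock_taken++;          /* one lock round trip for the whole batch */
    q->sent += n;
}
```

With this shape the slow users keep their old per-packet behaviour and only the fast path has to be converted first.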
> You're talking about basically redoing the whole TX path if you
> want to really support this.
>
> I'm not saying "don't do this", just that we should be sure we know
> what we're getting if we invest the time into this.
In some profiling I did some time ago, queue locks and device driver
locks were the biggest offenders on TX after the copy.
The only tricky part is getting the state machine in tcp_do_sendmsg()
that decides when to flush right.
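A rough sketch of what such a flush decision could look like, in userspace C. This is an assumption about the shape of the logic, not the actual tcp_do_sendmsg() code: toy_sender, toy_sendmsg, toy_flush and TOY_BATCH_MAX are all hypothetical, and the only two flush triggers modelled are "batch is full" and "the write is fully consumed". A real version would have more conditions to consider.

```c
#include <stddef.h>

#define TOY_BATCH_MAX 16   /* assumed cap on packets held back */

struct toy_sender {
    size_t mss;       /* payload bytes per packet */
    size_t pending;   /* packets built but not yet submitted */
    size_t flushes;   /* trips down the TX path */
    size_t packets;   /* total packets submitted */
};

/* Push everything that is pending down the stack in one go. */
static void toy_flush(struct toy_sender *s)
{
    if (s->pending == 0)
        return;
    s->packets += s->pending;
    s->pending = 0;
    s->flushes++;
}

/* Split one write of len bytes into packets. Flush when the batch
 * is full or when the write has been fully consumed. */
static void toy_sendmsg(struct toy_sender *s, size_t len)
{
    if (s->mss == 0)
        return;                      /* nothing sensible to do */
    while (len > 0) {
        size_t chunk = len < s->mss ? len : s->mss;

        len -= chunk;
        s->pending++;                /* "build" one packet */
        if (s->pending == TOY_BATCH_MAX || len == 0)
            toy_flush(s);
    }
}
```

For a 64K write with a 1460 byte MSS this builds 45 packets but goes down the TX path only 3 times.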
> > - user copy and checksum could probably also be done faster if they were
> > batched for multiple packets. It is hard to optimize properly for
> > <= 1.5K copies.
> > This is especially true for 4/4 split kernels which will eat a
> > page table lookup + lock for each individual copy, but also for others.
>
> I disagree partially, especially in the presence of a chip that provides
> proper implementations of software initiated prefetching.
Especially for prefetching, having a list of packets helps because you
can prefetch the next one while you're working on the current one. The CPU
hardware prefetcher cannot do that for you.
I did look seriously at faster csum-copy/copy-to-user for K8, but the conclusion
was that all the tricks are only worth it when you can work with bigger amounts of data.
1.5K at a time is just too small.
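As an illustration of the "prefetch the next while working on the current one" pattern, here is a small sketch that checksums a list of packets. It assumes GCC or Clang (for __builtin_prefetch), and the toy_pkt structure and function names are made up. It only shows the structure; whether it helps on a given CPU has to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet list node, not struct sk_buff. */
struct toy_pkt {
    const uint8_t *data;
    size_t len;
    struct toy_pkt *next;
};

/* 16-bit ones' complement sum in the style of the Internet checksum.
 * The 32-bit accumulator is fine for packet-sized buffers. */
static uint16_t toy_csum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;   /* pad odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16); /* fold the carries */
    return (uint16_t)~sum;
}

/* Checksum every packet in the list, one result per packet in out[].
 * While working on the current packet, start fetching the next one. */
static void toy_csum_list(const struct toy_pkt *head, uint16_t *out)
{
    const struct toy_pkt *p;
    size_t i = 0;

    for (p = head; p != NULL; p = p->next) {
        if (p->next != NULL)
            __builtin_prefetch(p->next->data, 0, 3); /* read, keep in cache */
        out[i++] = toy_csum(p->data, p->len);
    }
}
```

The point is that the software knows where the next packet lives because it has the list; a stride-based hardware prefetcher has no way to guess that address.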
Ah yes, this could also be added to the list:
- Investigate getting more performance through explicit prefetching
(e.g. in the device drivers, to optimize eth_type_trans() when you can classify the packet
just by looking at the RX ring state. Instead, do a prefetch on the packet data
and hope the data is already in cache when the IP stack gets around to looking at it)
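A minimal sketch of the driver-side idea, again in userspace C and assuming GCC or Clang for __builtin_prefetch. The descriptor layout and the TOY_DESC_IPV4 status bit are hypothetical; real NICs differ in what their RX descriptors report.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_DESC_IPV4 0x1u   /* hypothetical "hardware says IPv4" bit */

/* Hypothetical RX ring entry. */
struct toy_rx_desc {
    uint32_t status;         /* filled in by the "hardware" */
    const uint8_t *data;     /* start of the Ethernet frame */
};

/* Return the EtherType in host order. If the descriptor already tells
 * us the protocol, do not read the frame; just start a prefetch so the
 * data is hopefully cached by the time the IP stack looks at it. */
static uint16_t toy_rx_classify(const struct toy_rx_desc *d)
{
    if (d->status & TOY_DESC_IPV4) {
        __builtin_prefetch(d->data, 0, 3);
        return 0x0800;       /* ETH_P_IP */
    }
    /* Fallback: touch the frame and read the EtherType at offset 12. */
    return (uint16_t)((d->data[12] << 8) | d->data[13]);
}
```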
-Andi (who shuts up now because I don't have any time to code on any of this :-( )