From: Andi Kleen <ak@suse.de>
To: "David S. Miller" <davem@redhat.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Fire Engine??
Date: Wed, 26 Nov 2003 23:39:18 +0100
Message-ID: <20031126233918.2af3aae5.ak@suse.de>
In-Reply-To: <20031126113040.3b774360.davem@redhat.com>

On Wed, 26 Nov 2003 11:30:40 -0800
"David S. Miller" <davem@redhat.com> wrote:
>
> > - On TX we are inefficient for the same reason. TCP builds one packet
> > at a time and then goes down through all layers taking all locks (queue,
> > device driver etc.) and submits the single packet. Then repeats that for
> > lots of packets because many TCP writes are > MTU. Batching that would
> > likely help a lot, like it was done in the 2.6 VFS. I think it could
> > also make hard_start_xmit in many drivers significantly faster.
>
> This is tricky, because of getting all of the queueing stuff right.
> All of the packet scheduler APIs would need to change, as would
> the classification stuff, not to mention netfilter et al.
You only need a fast path for the default scheduler at the beginning.
Every complicated "slow" API like advanced queuing or netfilter can still fall back to
one packet at a time until it is cleaned up (a similar strategy to the one used for
non-linear skbs).
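
A minimal user-space sketch of the fast-path/fallback idea, not kernel code. Every name here (struct pkt, struct txq, xmit_one, xmit_list, the simple_sched flag) is a hypothetical stand-in, and the "lock" is only a counter so the difference in lock round trips is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names are kernel APIs. */
struct pkt {
    struct pkt *next;
    size_t len;
};

struct txq {
    int simple_sched;          /* 1 = default scheduler, 0 = any "slow" API in use */
    unsigned long lock_taken;  /* counts queue-lock acquisitions */
    unsigned long sent;        /* counts packets handed to the driver */
};

static void txq_lock(struct txq *q)   { q->lock_taken++; }
static void txq_unlock(struct txq *q) { (void)q; }

/* Slow path: one lock round trip per packet. */
static void xmit_one(struct txq *q, struct pkt *p)
{
    assert(q != NULL && p != NULL);
    txq_lock(q);
    q->sent++;
    txq_unlock(q);
}

/* Fast path: a single lock round trip covers the whole list. */
static void xmit_list(struct txq *q, struct pkt *head)
{
    struct pkt *p, *next;

    assert(q != NULL);
    if (!q->simple_sched) {
        /* Advanced queueing or filtering: fall back to one packet at a time. */
        for (p = head; p; p = next) {
            next = p->next;
            xmit_one(q, p);
        }
        return;
    }

    txq_lock(q);
    for (p = head; p; p = p->next)
        q->sent++;
    txq_unlock(q);
}
```

With a three-packet list, the fast path takes the counted lock once and the fallback takes it three times, while both hand over the same three packets.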
> You're talking about basically redoing the whole TX path if you
> want to really support this.
>
> I'm not saying "don't do this", just that we should be sure we know
> what we're getting if we invest the time into this.
In some profiling I did some time ago, queue locks and device driver
locks were the biggest offenders on TX after the copy.
The only tricky part is getting the state machine in tcp_do_sendmsg()
right, the one that decides when to flush.
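
One possible shape for the flush decision, as a rough user-space sketch. The policy is an assumption made for illustration only: flush when a fixed number of segments has piled up or when the current write has no data left. BATCH_MAX, struct tx_batch, batch_flush and send_write are hypothetical names, and the real decision would have to account for far more TCP state than this.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 8  /* assumed cap on segments held back before a flush */

struct tx_batch {
    unsigned int queued;     /* segments built but not yet submitted */
    unsigned int flushes;    /* trips down through the lower layers */
    unsigned int submitted;  /* segments submitted so far */
};

/* Submit everything queued in one trip down the stack. */
static void batch_flush(struct tx_batch *b)
{
    if (b->queued == 0)
        return;
    b->submitted += b->queued;
    b->queued = 0;
    b->flushes++;
}

/* Split one write of len bytes into mss-sized segments and batch them. */
static void send_write(struct tx_batch *b, size_t len, size_t mss)
{
    assert(b != NULL && mss > 0);
    while (len > 0) {
        size_t seg = len < mss ? len : mss;

        len -= seg;
        b->queued++;
        /* Flush when the batch is full or the write is finished. */
        if (b->queued == BATCH_MAX || len == 0)
            batch_flush(b);
    }
}
```

Under these assumptions a 16000-byte write with a 1460-byte segment size builds 11 segments but goes down the stack only twice.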
> > - user copy and checksum could probably also done faster if they were
> > batched for multiple packets. It is hard to optimize properly for
> > <= 1.5K copies.
> > This is especially true for 4/4 split kernels which will eat an
> > page table look up + lock for each individual copy, but also for others.
>
> I disagree partially, especially in the presence of a chip that provides
> proper implementations of software initiated prefetching.
Having a list of packets helps especially for prefetching, because you
can prefetch the next one while you're working on the current one. The CPU
hardware prefetcher cannot do that for you.
I did look seriously at faster csum-copy/copy-to-user for K8, but the conclusion
was that all the tricks are only worth it when you can work with larger amounts of data.
1.5K at a time is just too small.
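
A small sketch of prefetching the next packet while working on the current one. It assumes GCC or Clang for __builtin_prefetch; struct buf, csum_fold and csum_list are hypothetical names, and the checksum is a plain byte-wise 16-bit ones' complement sum, not the tuned csum-copy discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet buffer on a singly linked list. */
struct buf {
    struct buf *next;
    const uint8_t *data;
    size_t len;
    uint16_t csum;  /* filled in by csum_list() */
};

/*
 * 16-bit ones' complement sum over big-endian words, not inverted.
 * The 32-bit accumulator is only safe for packet-sized buffers
 * (well under 128 KB).
 */
static uint16_t csum_fold(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(p != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)                      /* odd trailing byte, zero padded */
        sum += (uint32_t)p[i] << 8;
    while (sum >> 16)                 /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Checksum every buffer, prefetching the next one's data first. */
static void csum_list(struct buf *head)
{
    struct buf *b;

    for (b = head; b; b = b->next) {
        if (b->next)
            __builtin_prefetch(b->next->data, 0, 3);  /* read, keep in cache */
        b->csum = csum_fold(b->data, b->len);
    }
}
```

The prefetch is only a hint, so the results are the same with or without it. Any gain depends on the list being long enough and the per-packet work being large enough to hide the memory latency.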
Ah yes, this could also be added to the list:
- Investigate getting more performance through explicit prefetching
(e.g. in the device drivers, to optimize eth_type_trans() when you can classify the
packet just by looking at the RX ring state. Instead of reading the packet there, do a
prefetch on the packet data and hope the data is already in cache when the IP stack
gets around to looking at it)
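
A rough sketch of that RX idea, assuming a made-up NIC whose descriptor status word already says whether the frame is IPv4. The descriptor layout, the RXD_* bits and rx_classify() are all hypothetical; only the EtherType value 0x0800 for IPv4 and its position at bytes 12 and 13 of an Ethernet header are real. __builtin_prefetch assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor status bits for an imaginary NIC. */
enum {
    RXD_DONE = 1u << 0,  /* hardware has finished writing the frame */
    RXD_IPV4 = 1u << 1   /* hardware already parsed the frame as IPv4 */
};

#define PROTO_IPV4 0x0800  /* EtherType for IPv4 */

struct rx_desc {
    uint32_t status;
    const uint8_t *data;  /* start of the Ethernet frame */
};

/* Returns the EtherType in host order, or 0 if the slot is not ready. */
static uint16_t rx_classify(const struct rx_desc *d)
{
    assert(d != NULL);
    if (!(d->status & RXD_DONE))
        return 0;

    /* Start pulling the frame into cache for the upper layers. */
    __builtin_prefetch(d->data, 0, 3);

    /* Fast path: the ring state is enough, the frame is not read here. */
    if (d->status & RXD_IPV4)
        return PROTO_IPV4;

    /* Fallback: read the EtherType from the header itself. */
    return (uint16_t)((d->data[12] << 8) | d->data[13]);
}
```

In the fast path the driver never reads the frame, so a cache miss on the header is not taken in the driver itself, and the prefetch has time to complete before the upper layers look at the data.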
-Andi (who shuts up now because I don't have any time to code on any of this :-( )