From: Andi Kleen <ak@suse.de>
To: "David S. Miller" <davem@redhat.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Fire Engine??
Date: Wed, 26 Nov 2003 23:39:18 +0100
Message-ID: <20031126233918.2af3aae5.ak@suse.de>
In-Reply-To: <20031126113040.3b774360.davem@redhat.com>

On Wed, 26 Nov 2003 11:30:40 -0800
"David S. Miller" <davem@redhat.com> wrote:

>
> > - On TX we are inefficient for the same reason. TCP builds one packet
> > at a time and then goes down through all layers taking all locks (queue,
> > device driver etc.) and submits the single packet. Then repeats that for 
> > lots of packets because many TCP writes are > MTU. Batching that would 
> > likely help a lot, like it was done in the 2.6 VFS. I think it could 
> > also make hard_start_xmit in many drivers significantly faster.
> 
> This is tricky, because of getting all of the queueing stuff right.
> All of the packet scheduler APIs would need to change, as would
> the classification stuff, not to mention netfilter et al.

You only need a fast path for the default scheduler at the beginning.
Every complicated "slow" API, like advanced queueing or netfilter, can still fall back to
one packet at a time until it is cleaned up (the same strategy that was used for
non-linear skbs).
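
A minimal userspace sketch of that fallback strategy, not kernel code. The names struct txq, xmit_one(), xmit_batch() and tx_submit() are hypothetical, and the lock is modelled as a plain counter so the difference in lock round trips is visible:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real kernel structures. */
struct pkt { int len; };

struct txq {
    int simple;      /* 1 = default scheduler, batching allowed */
    int lock_taken;  /* counts queue/driver lock round trips */
    int sent;        /* packets handed to the "driver" */
};

/* Slow path: one lock round trip per packet. */
static void xmit_one(struct txq *q, const struct pkt *p)
{
    (void)p;
    q->lock_taken++;
    q->sent++;
}

/* Fast path: one lock round trip for the whole batch. */
static void xmit_batch(struct txq *q, const struct pkt *pkts, size_t n)
{
    (void)pkts;
    q->lock_taken++;
    q->sent += (int)n;
}

/* Batch on the simple queue, fall back to per-packet submission otherwise. */
static void tx_submit(struct txq *q, const struct pkt *pkts, size_t n)
{
    assert(q != NULL && (pkts != NULL || n == 0));
    if (n == 0)
        return;
    if (q->simple) {
        xmit_batch(q, pkts, n);
        return;
    }
    for (size_t i = 0; i < n; i++)
        xmit_one(q, &pkts[i]);
}
```

With four packets, the simple queue takes the lock once and the fallback queue takes it four times; both deliver all four packets.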
 
> You're talking about basically redoing the whole TX path if you
> want to really support this.
> 
> I'm not saying "don't do this", just that we should be sure we know
> what we're getting if we invest the time into this.

In some profiling I did some time ago, queue locks and device driver
locks were the biggest offenders on TX after the copy.

The only tricky part is to get the state machine in tcp_do_sendmsg()
right that decides when to flush.
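
One possible shape for that flush decision, as a hedged userspace sketch only. The rule used here (flush when the batch is full or when the current write has no more data) is an assumption for illustration, not what tcp_do_sendmsg() does; struct tx_batcher, queue_segment() and send_write() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

struct tx_batcher {
    size_t pending;    /* segments built but not yet submitted */
    size_t max_batch;  /* hypothetical cap on the batch size */
    size_t flushes;    /* trips down the TX path */
    size_t submitted;  /* total segments submitted */
};

/* Submit everything that is pending in one trip. */
static void flush(struct tx_batcher *b)
{
    if (b->pending == 0)
        return;
    b->submitted += b->pending;
    b->pending = 0;
    b->flushes++;
}

/* Queue one segment; flush if the batch is full or the write is done. */
static void queue_segment(struct tx_batcher *b, int last)
{
    b->pending++;
    if (last || b->pending >= b->max_batch)
        flush(b);
}

/* Split one write of len bytes into mss-sized segments and queue them. */
static size_t send_write(struct tx_batcher *b, size_t len, size_t mss)
{
    assert(mss > 0 && b->max_batch > 0);
    size_t nseg = (len + mss - 1) / mss;
    for (size_t i = 0; i < nseg; i++)
        queue_segment(b, i + 1 == nseg);
    return nseg;
}
```

A 64K write with a 1460 byte MSS and a batch cap of 8 becomes 45 segments submitted in 6 trips instead of 45. A real state machine would also have to account for things this sketch ignores, such as window limits and partial segments.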

> > - user copy and checksum could probably also be done faster if they were
> > batched for multiple packets. It is hard to optimize properly for 
> > <= 1.5K copies.
> > This is especially true for 4/4 split kernels which will eat an 
> > page table look up + lock for each individual copy, but also for others.
> 
> I disagree partially, especially in the presence of a chip that provides
> proper implementations of software initiated prefetching.

Especially for prefetching, having a list of packets helps, because you
can prefetch the next one while you're working on the current one. The CPU
hardware prefetcher cannot do that for you.
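
A small userspace sketch of the "prefetch the next while working on the current" idea. It assumes GCC or Clang for __builtin_prefetch(); struct pkt, sum_bytes() and sum_list() are hypothetical, and the plain byte sum only stands in for a real checksum:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet list node, not struct sk_buff. */
struct pkt {
    struct pkt *next;
    const uint8_t *data;
    size_t len;
};

/* Plain byte sum, standing in for a real checksum routine. */
static uint32_t sum_bytes(const uint8_t *p, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += p[i];
    return s;
}

/* Walk the list; start fetching the next packet's data before
 * doing the work on the current one. */
static uint32_t sum_list(const struct pkt *head)
{
    uint32_t total = 0;
    for (const struct pkt *p = head; p != NULL; p = p->next) {
        assert(p->data != NULL || p->len == 0);
        if (p->next != NULL)
            __builtin_prefetch(p->next->data, 0, 3); /* read, keep in cache */
        total += sum_bytes(p->data, p->len);
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same with or without it; whether it pays off depends on the packet sizes and the CPU.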

I did look seriously at faster csum-copy/copy-to-user for K8, but the conclusion
was that all the tricks are only worth it when you can work with bigger amounts of data.
1.5K at a time is just too small.

Ah yes:

- Investigate more performance through explicit prefetching
(e.g. in the device drivers, to optimize eth_type_trans() when you can classify the packet
just by looking at the RX ring state. Instead, do a prefetch on the packet data
and hope the data is already in cache when the IP stack gets around to looking at it)

could also be added to the list.

-Andi (who shuts up now because I don't have any time to code on any of this :-( ) 
