From: linas@austin.ibm.com (Linas Vepstas)
To: Bodo Eggert <7eggert@gmx.de>
Cc: Thomas Klein <tklein@de.ibm.com>,
Jan-Bernd Themann <themann@de.ibm.com>,
netdev <netdev@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Christoph Raisch <raisch@de.ibm.com>,
linux-ppc <linuxppc-dev@ozlabs.org>,
Jan-Bernd Themann <ossthema@de.ibm.com>,
Marcus Eder <meder@de.ibm.com>,
Stefan Roscher <stefan.roscher@de.ibm.com>
Subject: Re: RFC: issues concerning the next NAPI interface
Date: Fri, 24 Aug 2007 15:42:44 -0500 [thread overview]
Message-ID: <20070824204243.GI4282@austin.ibm.com> (raw)
In-Reply-To: <E1IOeSm-0000bm-Jo@be1.lrz>
On Fri, Aug 24, 2007 at 09:04:56PM +0200, Bodo Eggert wrote:
> Linas Vepstas <linas@austin.ibm.com> wrote:
> > On Fri, Aug 24, 2007 at 03:59:16PM +0200, Jan-Bernd Themann wrote:
> >> 3) On modern systems the incoming packets are processed very fast. Especially
> >> on SMP systems when we use multiple queues we process only a few packets
> >> per napi poll cycle. So NAPI does not work very well here and the interrupt
> >> rate is still high.
> >
> > worst-case network ping-pong app: send one
> > packet, wait for reply, send one packet, etc.
>
> Possible solution / possible brainfart:
>
> Introduce a timer, but don't start to use it to combine packets unless you
> receive n packets within the timeframe. If you receive less than m packets
> within one timeframe, stop using the timer. The system should now have a
> decent response time when the network is idle, and when the network is
> busy, nobody will complain about the latency.-)
Ohh, that was inspirational. Let me free-associate some wild ideas.
Suppose we keep a running average of the recent packet arrival rate,
Lets say its 10 per millisecond ("typical" for a gigabit eth runnning
flat-out). If we could poll the driver at a rate of 10-20 per
millisecond (i.e. letting the OS do other useful work for 0.05 millisec),
then we could potentially service the card without ever having to enable
interrupts on the card, and without hurting latency.
If the packet arrival rate becomes slow enough, we go back to an
interrupt-driven scheme (to keep latency down).
The main problem here is that, even for HZ=1000 machines, this amounts
to 10-20 polls per jiffy. Which, if implemented in kernel, requires
using the high-resolution timers. And, umm, don't the HR timers require
a cpu timer interrupt to make them go? So its not clear that this is much
of a win.
The eHEA is a 10 gigabit device, so it can expect 80-100 packets per
millisecond for large packets, and even more, say 1K packets per
millisec, for small packets. (Even the spec for my 1Gb spidernet card
claims its internal rate is 1M packets/sec.)
Another possiblity is to set HZ to 5000 or 20000 or something humongous
... after all cpu's are now faster! But, since this might be wasteful,
maybe we could make HZ be dynamically variable: have high HZ rates when
there's lots of network/disk activity, and low HZ rates when not. That
means a non-constant jiffy.
If all drivers used interrupt mitigation, then the variable-high
frequency jiffy could take thier place, and be more "fair" to everyone.
Most drivers would be polled most of the time when they're busy, and
only use interrupts when they're not.
--linas
next prev parent reply other threads:[~2007-08-24 20:42 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <8VHRR-45R-17@gated-at.bofh.it>
[not found] ` <8VKwj-8ke-27@gated-at.bofh.it>
2007-08-24 19:04 ` RFC: issues concerning the next NAPI interface Bodo Eggert
2007-08-24 20:42 ` Linas Vepstas [this message]
2007-08-24 21:11 ` Jan-Bernd Themann
2007-08-24 21:35 ` Linas Vepstas
2007-08-24 13:59 Jan-Bernd Themann
2007-08-24 15:37 ` akepner
2007-08-24 15:47 ` Jan-Bernd Themann
2007-08-24 15:52 ` Stephen Hemminger
2007-08-24 16:50 ` David Stevens
2007-08-24 21:44 ` David Miller
2007-08-24 21:51 ` Linas Vepstas
2007-08-24 16:51 ` Linas Vepstas
2007-08-24 17:07 ` Rick Jones
2007-08-24 17:45 ` Shirley Ma
2007-08-24 17:16 ` James Chapman
2007-08-24 18:11 ` Jan-Bernd Themann
2007-08-24 21:47 ` David Miller
2007-08-24 22:06 ` akepner
2007-08-26 19:36 ` James Chapman
2007-08-27 1:58 ` David Miller
2007-08-27 9:47 ` Jan-Bernd Themann
2007-08-27 20:37 ` David Miller
2007-08-28 11:19 ` Jan-Bernd Themann
2007-08-28 20:21 ` David Miller
2007-08-29 7:10 ` Jan-Bernd Themann
2007-08-29 8:15 ` James Chapman
2007-08-29 8:43 ` Jan-Bernd Themann
2007-08-29 8:29 ` David Miller
2007-08-29 8:31 ` Jan-Bernd Themann
2007-08-27 15:51 ` James Chapman
2007-08-27 16:02 ` Jan-Bernd Themann
2007-08-27 17:05 ` James Chapman
2007-08-27 21:02 ` David Miller
2007-08-27 21:41 ` James Chapman
2007-08-27 21:56 ` David Miller
2007-08-28 9:22 ` James Chapman
2007-08-28 11:48 ` Jan-Bernd Themann
2007-08-28 12:16 ` Evgeniy Polyakov
2007-08-28 14:55 ` James Chapman
2007-08-28 11:21 ` Jan-Bernd Themann
2007-08-28 20:25 ` David Miller
2007-08-28 20:27 ` David Miller
2007-08-24 16:45 ` Linas Vepstas
2007-08-24 21:43 ` David Miller
2007-08-24 21:32 ` David Miller
2007-08-24 21:37 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070824204243.GI4282@austin.ibm.com \
--to=linas@austin.ibm.com \
--cc=7eggert@gmx.de \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=meder@de.ibm.com \
--cc=netdev@vger.kernel.org \
--cc=ossthema@de.ibm.com \
--cc=raisch@de.ibm.com \
--cc=stefan.roscher@de.ibm.com \
--cc=themann@de.ibm.com \
--cc=tklein@de.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).