From: Jan-Bernd Themann <ossthema@de.ibm.com>
To: David Miller <davem@davemloft.net>
Cc: tklein@de.ibm.com, themann@de.ibm.com, stefan.roscher@de.ibm.com,
netdev@vger.kernel.org, jchapman@katalix.com, raisch@de.ibm.com,
linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org,
akepner@sgi.com, meder@de.ibm.com,
shemminger@linux-foundation.org
Subject: Re: RFC: issues concerning the next NAPI interface
Date: Tue, 28 Aug 2007 13:21:09 +0200 [thread overview]
Message-ID: <200708281321.10679.ossthema@de.ibm.com> (raw)
In-Reply-To: <20070827.140251.95055210.davem@davemloft.net>
Hi
On Monday 27 August 2007 23:02, David Miller wrote:
> But there are huger fish to fry for you I think. Talk to your
> platform maintainers and ask for an interface for obtaining
> a flat static distribution of interrupts to cpus in order to
> support multiqueue NAPI better.
>
> In your previous postings you made arguments saying that the
> automatic placement of interrupts to cpus made everything
> bunch of to a single cpu and you wanted to propagate the
> NAPI work to other cpu's software interrupts from there.
>
> That logic is bogus, because it merely proves that the hardware
> interrupt distribution is broken. If it's a bad cpu to run
> software interrupts on, it's also a bad cpu to run hardware
> interrupts on.
>
As already mentioned some mails were mixed up here. To clarify the interrupt
issue that has nothing to do with the reduction of interrupts:
- Interrupts are distributed the round robin way on IBM POWER6 processors
- Interrupt distribution can be modified by user/daemons (smp_affinity, pinning)
- NAPI is scheduled on CPU where interrupt is catched
- NAPI polls on that CPU as long as poll has packets to process (default)
(David please correct if there is a misunderstanding here)
Problem for multi queue driver with interrupt distribution scheme set to
round robin for this simple example:
Assuming we have 2 SLOW CPUs. CPU_1 is under heavy load (applications). CPU_2
is not under heavy load. Now we receive a lot of packets (high load situation).
Receive queue 1 (RQ1) is scheduled on CPU_1. NAPI-Poll does not manage to empty
RQ1 ever, so it stays on CPU_1. The second receive queue (RQ2) is scheduled on
CPU_2. As that CPU is not under heavy load, RQ2 can be emptied, and the next IRQ
for RQ2 will go to CPU_1. Then both RQs are on CPU_1 and will stay there if
no IRQ is forced at some time as both RQs are never emptied completely.
This is a simplified example to demonstrate the problem. The interrupt scheme
is not bogus, it is just an effect you see if you don't use pinning.
The user can avoid this problem by pinning the interrupts to CPUs.
As pointed out by David, it will be too expensive to schedule NAPI poll on
a different CPU than the one that gets the IRQ. So I guess one solution
is to "force" an HW interrupt when two many RQs are processed on the same CPU
(when no IRQ pinning is used). This is something the driver has to handle.
Regards,
Jan-Bernd
next prev parent reply other threads:[~2007-08-28 11:21 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-08-24 13:59 RFC: issues concerning the next NAPI interface Jan-Bernd Themann
2007-08-24 15:37 ` akepner
2007-08-24 15:47 ` Jan-Bernd Themann
2007-08-24 15:52 ` Stephen Hemminger
2007-08-24 16:50 ` David Stevens
2007-08-24 21:44 ` David Miller
2007-08-24 21:51 ` Linas Vepstas
2007-08-24 16:51 ` Linas Vepstas
2007-08-24 17:07 ` Rick Jones
2007-08-24 17:45 ` Shirley Ma
2007-08-24 17:16 ` James Chapman
2007-08-24 18:11 ` Jan-Bernd Themann
2007-08-24 21:47 ` David Miller
2007-08-24 22:06 ` akepner
2007-08-26 19:36 ` James Chapman
2007-08-27 1:58 ` David Miller
2007-08-27 9:47 ` Jan-Bernd Themann
2007-08-27 20:37 ` David Miller
2007-08-28 11:19 ` Jan-Bernd Themann
2007-08-28 20:21 ` David Miller
2007-08-29 7:10 ` Jan-Bernd Themann
2007-08-29 8:15 ` James Chapman
2007-08-29 8:43 ` Jan-Bernd Themann
2007-08-29 8:29 ` David Miller
2007-08-29 8:31 ` Jan-Bernd Themann
2007-08-27 15:51 ` James Chapman
2007-08-27 16:02 ` Jan-Bernd Themann
2007-08-27 17:05 ` James Chapman
2007-08-27 21:02 ` David Miller
2007-08-27 21:41 ` James Chapman
2007-08-27 21:56 ` David Miller
2007-08-28 9:22 ` James Chapman
2007-08-28 11:48 ` Jan-Bernd Themann
2007-08-28 12:16 ` Evgeniy Polyakov
2007-08-28 14:55 ` James Chapman
2007-08-28 11:21 ` Jan-Bernd Themann [this message]
2007-08-28 20:25 ` David Miller
2007-08-28 20:27 ` David Miller
2007-08-24 16:45 ` Linas Vepstas
2007-08-24 21:43 ` David Miller
2007-08-24 21:32 ` David Miller
2007-08-24 21:37 ` David Miller
[not found] <8VHRR-45R-17@gated-at.bofh.it>
[not found] ` <8VKwj-8ke-27@gated-at.bofh.it>
2007-08-24 19:04 ` Bodo Eggert
2007-08-24 20:42 ` Linas Vepstas
2007-08-24 21:11 ` Jan-Bernd Themann
2007-08-24 21:35 ` Linas Vepstas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200708281321.10679.ossthema@de.ibm.com \
--to=ossthema@de.ibm.com \
--cc=akepner@sgi.com \
--cc=davem@davemloft.net \
--cc=jchapman@katalix.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=meder@de.ibm.com \
--cc=netdev@vger.kernel.org \
--cc=raisch@de.ibm.com \
--cc=shemminger@linux-foundation.org \
--cc=stefan.roscher@de.ibm.com \
--cc=themann@de.ibm.com \
--cc=tklein@de.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).