From: Arthur Kepner
Subject: NAPI and CPU utilization [was: NAPI, e100, and system performance problem]
Date: Tue, 19 Apr 2005 13:38:20 -0700 (PDT)
Message-ID: 
References: <1113855967.7436.39.camel@localhost.localdomain>
 <20050419055535.GA12211@sgi.com>
 <20050419113657.7290d26e.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: Greg Banks, hadi@cyberus.ca, jesse.brandeburg@intel.com,
 netdev@oss.sgi.com
Return-path: 
To: "David S. Miller"
In-Reply-To: <20050419113657.7290d26e.davem@davemloft.net>
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
List-Id: netdev.vger.kernel.org

[Modified the subject line in order not to distract from Jesse's
original thread.]

On Tue, 19 Apr 2005, David S. Miller wrote:

> On Tue, 19 Apr 2005 15:55:35 +1000
> Greg Banks wrote:
>
> > > How do you recognize when system resources are being poorly utilized?
> >
> > An inordinate amount of CPU is being spent running around polling the
> > device instead of dealing with the packets in IP, TCP and NFS land.
> > By inordinate, we mean twice as much or more cpu% than a MIPS/Irix
> > box with slower CPUs.
>
> You haven't answered the "how" yet.
>
> What tools did you run, what did those tools attempt to measure, and
> what results did those tools output for you so that you could determine
> your conclusions with such certainty?
>

I'm answering for myself, not Greg.

Much of the data is essentially from "/proc/". (We use a nice tool
called PCP to gather the data, but that's where PCP gets it, for the
most part.) But I've used several other tools to gather corroborating
data, including the "kernprof" patch, "q-tools", and an ad-hoc patch
which used "get_cycles()" to time things that were happening while
interrupts were disabled. (See the sketch at the end of this mail.)

The data acquired with all of these show that NAPI causes relatively
few packets to be processed per interrupt, so that expensive PIOs are
relatively poorly amortized over a given amount of input from the
network. (When I use "relative(ly)" above, I mean relative to what we
see when using Greg's interrupt coalescence patch from
http://marc.theaimsgroup.com/?l=linux-netdev&m=107183822710263&w=2 )

For example, the following is a comparison of the number of packets
processed per interrupt and the CPU utilization using NAPI and using
Greg's interrupt coalescence patch. This data is pretty old by now;
it was gathered using 2.6.5 on an Altix with 1GHz CPUs, using the tg3
driver to do bulk data transfers with nttcp. (I'm eyeballing the data
from a set of graphs, so it's rough...)

                   Packets/Interrupt      CPU utilization [%]
 Link util [%]     NAPI    Intr Coal.     NAPI    Intr Coal.
 ------------------------------------------------------------
       25            2        3.5          45        17
       40            4        6            52        30
       50            4        6            60        36
       60            4        7            75        41
       70            6       10            80        36
       80            6       16            90        40
       85            7       16           100        45
      100            -       17             -        50

I know more recent kernels have somewhat better performance
(http://marc.theaimsgroup.com/?l=linux-netdev&m=109848080827969&w=2
helped, for one thing).

The reason that CPU utilization is so high with NAPI is that we're
spinning, waiting for PIOs to flush (this can take an impressively
long time when the PCI bus/bridge is busy). I guess that some of us
(at SGI) have seen this so often as a bottleneck that we're surprised
that it's not more generally recognized as a problem, er, uh,
"opportunity for improvement".

--
Arthur
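
For reference, here is a minimal sketch of the sort of ad-hoc
get_cycles() timing mentioned above. This is not the actual patch:
the helper name timed_readl() and the counters are hypothetical, and
readl() stands in for whichever PIO the driver issues in its
interrupt/poll path.

#include <linux/types.h>
#include <asm/io.h>      /* readl() */
#include <asm/timex.h>   /* get_cycles(), cycles_t */

/* Accumulated PIO cost; dump these later, e.g. via a /proc file. */
static unsigned long long pio_cycles;
static unsigned long pio_samples;

/* Wrap a single PIO read and charge its cost to the counters above. */
static inline u32 timed_readl(void __iomem *addr)
{
	cycles_t t0 = get_cycles();
	u32 val = readl(addr);          /* the PIO being timed */

	pio_cycles += get_cycles() - t0;
	pio_samples++;
	return val;
}

Replacing the readl() calls in the driver's interrupt handler with
timed_readl() makes the per-PIO cost visible directly (even with
interrupts disabled), rather than having to infer it from overall
CPU utilization.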