* ppc_irq_dispatch_handler dominating profile?
@ 2003-04-27 19:42 Fred Gray
2003-04-27 22:45 ` Paul Mackerras
0 siblings, 1 reply; 6+ messages in thread
From: Fred Gray @ 2003-04-27 19:42 UTC (permalink / raw)
To: linuxppc-dev
Dear linuxppc-dev,
I'm trying to get a gigabit Ethernet card (SBS Technologies PMC-Gigabit-ST3;
it uses the Intel 82545EM chipset and therefore the Linux e1000 driver) to
work with a MVME2600 board (a PReP board with a 200 MHz PowerPC 604e CPU).
I'm getting surprisingly poor performance and trying to understand why.
I'm running a simple benchmark program that was passed along to me by a kind
soul on the linux-net@vger.kernel.org mailing list. It has two modes, one
that uses the ordinary socket interface, and one that uses the sendfile()
system call for zero-copy transmission. In either case, it simply floods the
destination with TCP data for a fixed amount of time. The results in
non-zero-copy mode agree with standard benchmarks like netperf and iperf,
which I have also tried. In any event, the maximum bandwidth that I have
been able to obtain is about 15 MByte/s, and that level of performance required
16000 byte jumbo frames and zero-copy mode. Transmission was clearly CPU-bound.
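
The zero-copy mode is essentially a timed loop around sendfile(); a minimal
sketch of that kind of sender (not the actual benchmark, which I received
from someone else; the file name, address, and port below are made up) is:

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
	int fd = open("/tmp/payload", O_RDONLY);            /* made-up source file */
	int s = socket(AF_INET, SOCK_STREAM, 0);
	struct sockaddr_in sin;
	time_t stop = time(NULL) + 30;                      /* flood for a fixed time */
	long long sent = 0;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(5001);                         /* arbitrary port */
	sin.sin_addr.s_addr = inet_addr("192.168.1.2");     /* made-up destination */
	if (fd < 0 || s < 0 || connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		perror("setup");
		return 1;
	}
	while (time(NULL) < stop) {
		off_t off = 0;                              /* resend the file from the start */
		ssize_t n = sendfile(s, fd, &off, 1 << 20); /* kernel sends straight from the page cache */
		if (n <= 0)
			break;
		sent += n;
	}
	printf("sent %lld bytes\n", sent);
	close(s);
	close(fd);
	return 0;
}

The ordinary-socket mode would do the same thing with write() from a user
buffer, which is where the extra copy and checksum work comes from.
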
I used the kernel profiling interface (kernel version 2.4.21-pre6 from the
linuxppc_2_4_devel tree) to determine where the hot spot is. Using ordinary
socket calls, these are the leading entries:
  5838 total                          0.0059
  3263 ppc_irq_dispatch_handler       5.7855
  1645 csum_partial_copy_generic      7.4773
   133 e1000_intr                     0.8750
    89 do_softirq                     0.3477
    69 tcp_sendmsg                    0.0149
In zero-copy mode, this is the situation (notice that the copy and checksum
have been successfully offloaded to the gigabit interface):
  5983 total                          0.0061
  4740 ppc_irq_dispatch_handler       8.4043
   614 e1000_intr                     4.0395
    61 e1000_clean_tx_irq             0.1113
    52 do_tcp_sendpages               0.0179
    51 do_softirq                     0.1992
In both cases, ppc_irq_dispatch_handler is the "winner." I'm not very familiar
with the kernel profiler, especially on the PowerPC, so I don't know whether
or not this is likely to be an artifact of piled-up timer interrupts.
Otherwise, it suggests that something dramatically inefficient is
happening in the interrupt handling chain, since it spends twice as much
time here as it does touching all of the outgoing data for the copy and
checksum.
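
For what it's worth, my rough understanding is that the profiler does little
more than bump a histogram bucket keyed on the interrupted program counter at
each timer tick, along the lines of the following paraphrase (not the exact
PowerPC code; the function name is made up):

/*
 * Rough paraphrase of the 2.4 profiling hook run from the timer interrupt.
 * prof_buffer, prof_shift and prof_len are the kernel globals set up from
 * the "profile=" boot option.
 */
static void profile_tick_sketch(struct pt_regs *regs)
{
	unsigned long pc;

	if (user_mode(regs) || !prof_buffer)
		return;
	pc = instruction_pointer(regs) - (unsigned long)&_stext;
	pc >>= prof_shift;
	if (pc >= prof_len)
		pc = prof_len - 1;	/* out-of-range PCs land in the last bucket */
	atomic_inc((atomic_t *)&prof_buffer[pc]);
}

If that is roughly right, then code that runs with interrupts disabled can
never be sampled directly: the timer interrupt is held off, and the sample is
taken at the first point where interrupts come back on.
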
I would appreciate suggestions of what I might check next.
Thanks very much for your help,
-- Fred
-- Fred Gray / Visiting Postdoctoral Researcher --
-- Department of Physics / University of California, Berkeley --
-- fegray@socrates.berkeley.edu / phone 510-642-4057 / fax 510-642-9811 --
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: ppc_irq_dispatch_handler dominating profile?
  2003-04-27 19:42 ppc_irq_dispatch_handler dominating profile? Fred Gray
@ 2003-04-27 22:45 ` Paul Mackerras
  2003-04-28  7:33   ` Fred Gray
  2003-04-28  8:53   ` Gabriel Paubert
  0 siblings, 2 replies; 6+ messages in thread
From: Paul Mackerras @ 2003-04-27 22:45 UTC (permalink / raw)
To: Fred Gray; +Cc: linuxppc-dev

Fred Gray writes:

> In both cases, ppc_irq_dispatch_handler is the "winner." I'm not very familiar
> with the kernel profiler, especially on the PowerPC, so I don't know whether
> or not this is likely to be an artifact of piled-up timer interrupts.
> Otherwise, it suggests that something dramatically inefficient is
> happening in the interrupt handling chain, since it spends twice as much
> time here as it does touching all of the outgoing data for the copy and
> checksum.

ppc_irq_dispatch_handler is the first place where interrupts get
turned on in the interrupt handling path, so all the time spent saving
registers and finding out which interrupt occurred gets attributed to
it.

How many interrupts per second are you handling? A 200MHz 604e isn't
a fast processor by today's standards. Also, how fast is your memory
system? I would be a little surprised if the memory controller could
deliver any more than about 100MB/s.

I think that you will have to use interrupt mitigation to go any
faster, and I will be amazed if you can actually do 1Gb/s with an old
slow system such as you have.

Paul.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: ppc_irq_dispatch_handler dominating profile?
  2003-04-27 22:45 ` Paul Mackerras
@ 2003-04-28  7:33   ` Fred Gray
  2003-04-28  8:53   ` Gabriel Paubert
  1 sibling, 0 replies; 6+ messages in thread
From: Fred Gray @ 2003-04-28 7:33 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev

On Mon, Apr 28, 2003 at 08:45:49AM +1000, Paul Mackerras wrote:
> ppc_irq_dispatch_handler is the first place where interrupts get
> turned on in the interrupt handling path, so all the time spent saving
> registers and finding out which interrupt occurred gets attributed to
> it.
>
> How many interrupts per second are you handling? A 200MHz 604e isn't
> a fast processor by today's standards. Also, how fast is your memory
> system? I would be a little surprised if the memory controller could
> deliver any more than about 100MB/s.
>
> I think that you will have to use interrupt mitigation to go any
> faster, and I will be amazed if you can actually do 1Gb/s with an old
> slow system such as you have.

Hi, Paul,

Interrupt throttling is enabled on the card. According to /proc/interrupts,
there were about 1250 eth0 interrupts per second while running, which doesn't
seem like so terribly many. I can tune this parameter: larger interrupt
throttling rates definitely increase throughput for low MTUs (1500) but not
large MTUs (16000). The profile traces that I posted were for an MTU of 16000.

Fortunately, I don't actually need to saturate the gigabit link for my
application. I need to deliver about 15 MB/s worth of data to a server from
each of a few VME crates, and I need to be able to do this with some CPU time
left over to transfer the data over the VME bus from our custom electronics
(which can in principle be done with zero-copy DMA, though currently the
driver does a kernel-to-user copy).

Bandwidth tests over the loopback interface give 36.6 MB/s for the normal
socket API and 87.5 MB/s for the zero-copy path (which is a one-copy path
when the recipient is on the same host). So, your 100 MB/s estimate is just
about right on.

It seems odd to me that the vast majority of the work involved in handling
the interrupt is in saving registers and finding a handler, and not in the
handler itself. Is there any good way that I can test whether the system is
(e.g.) spending a lot of time waiting on the spinlocks in
ppc_irq_dispatch_handler?

Thanks very much for your help,

-- Fred

-- Fred Gray / Visiting Postdoctoral Researcher --
-- Department of Physics / University of California, Berkeley --
-- fegray@socrates.berkeley.edu / phone 510-642-4057 / fax 510-642-9811 --

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: ppc_irq_dispatch_handler dominating profile?
  2003-04-27 22:45 ` Paul Mackerras
  2003-04-28  7:33   ` Fred Gray
@ 2003-04-28  8:53   ` Gabriel Paubert
  2003-04-28 12:42     ` Fred Gray
  1 sibling, 1 reply; 6+ messages in thread
From: Gabriel Paubert @ 2003-04-28 8:53 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Fred Gray, linuxppc-dev

On Mon, Apr 28, 2003 at 08:45:49AM +1000, Paul Mackerras wrote:
>
> Fred Gray writes:
>
> > In both cases, ppc_irq_dispatch_handler is the "winner." I'm not very familiar
> > with the kernel profiler, especially on the PowerPC, so I don't know whether
> > or not this is likely to be an artifact of piled-up timer interrupts.
> > Otherwise, it suggests that something dramatically inefficient is
> > happening in the interrupt handling chain, since it spends twice as much
> > time here as it does touching all of the outgoing data for the copy and
> > checksum.
>
> ppc_irq_dispatch_handler is the first place where interrupts get
> turned on in the interrupt handling path, so all the time spent saving
> registers and finding out which interrupt occurred gets attributed to
> it.
>
> How many interrupts per second are you handling? A 200MHz 604e isn't
> a fast processor by today's standards. Also, how fast is your memory
> system? I would be a little surprised if the memory controller could
> deliver any more than about 100MB/s.

Hmmm, I get more than 100MB/s on my MVME2600 with a 200MHz 603e,
although not the half GB/s Motorola claims it is capable of. But a 604
should be a bit faster. The chipset is old (1997) but it was rather
fast when it came out, especially because the memory interface is 128
bits wide. This said, putting a gbit Ethernet (PMC module I suppose)
on it is stretching it a bit.

Is the 100Mb/s of the builtin interface too slow for you?

	Gabriel.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: ppc_irq_dispatch_handler dominating profile?
  2003-04-28  8:53 ` Gabriel Paubert
@ 2003-04-28 12:42   ` Fred Gray
  2003-04-29 12:08     ` Gabriel Paubert
  0 siblings, 1 reply; 6+ messages in thread
From: Fred Gray @ 2003-04-28 12:42 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: linuxppc-dev

On Mon, Apr 28, 2003 at 10:53:42AM +0200, Gabriel Paubert wrote:
> Hmmm, I get more than 100MB/s on my MVME2600 with a 200MHz 603e,
> although not the half GB/s Motorola claims it is capable of. But a 604
> should be a bit faster. The chipset is old (1997) but it was rather
> fast when it came out, especially because the memory interface is 128
> bits wide. This said, putting a gbit Ethernet (PMC module I suppose)
> on it is stretching it a bit.
>
> Is the 100Mb/s of the builtin interface too slow for you?

Hi, Gabriel, (...and many thanks for your hard work porting Linux to these
boards in the first place--I am extremely glad to be in a position to leave
vxWorks behind)

The gigabit Ethernet is indeed on a PMC module (from SBS Technologies,
PMC-Gigabit-ST3). Our electronics generates about 15 MB/s per VME crate;
it's digitizing tracks left by muons in a time-projection chamber.
There are two crates, each with this rate, each equipped with an MVME2600
with a gigabit card, and they have to transfer this firehose of data to a
computer that will do some as-yet-undefined online data reduction and
send the result to an LTO 2 tape robot. So, yes, we need about 50% more
throughput than the built-in 10/100 Ethernet port could provide, and we need
it with enough CPU time left over to manage the VME readout.

Fortunately, though, we don't need the whole gigabit. I agree that would
probably be well-nigh impossible. Still, I'm very interested in understanding
why the interrupt overhead seems to be so high at our 10 to 15 MB/s
data rate.

Thanks,

-- Fred

-- Fred Gray / Visiting Postdoctoral Researcher --
-- Department of Physics / University of California, Berkeley --
-- fegray@socrates.berkeley.edu / phone 510-642-4057 / fax 510-642-9811 --

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: ppc_irq_dispatch_handler dominating profile?
  2003-04-28 12:42 ` Fred Gray
@ 2003-04-29 12:08   ` Gabriel Paubert
  0 siblings, 0 replies; 6+ messages in thread
From: Gabriel Paubert @ 2003-04-29 12:08 UTC (permalink / raw)
To: Fred Gray; +Cc: linuxppc-dev

Hi Fred,

On Mon, Apr 28, 2003 at 05:42:24AM -0700, Fred Gray wrote:
>
> On Mon, Apr 28, 2003 at 10:53:42AM +0200, Gabriel Paubert wrote:
> > Hmmm, I get more than 100MB/s on my MVME2600 with a 200MHz 603e,
> > although not the half GB/s Motorola claims it is capable of. But a 604
> > should be a bit faster. The chipset is old (1997) but it was rather
> > fast when it came out, especially because the memory interface is 128
> > bits wide. This said, putting a gbit Ethernet (PMC module I suppose)
> > on it is stretching it a bit.
> >
> > Is the 100Mb/s of the builtin interface too slow for you?
>
> Hi, Gabriel, (...and many thanks for your hard work porting Linux to these
> boards in the first place--I am extremely glad to be in a position to leave
> vxWorks behind)

Glad that it helped. And another "customer" of my MVME port (and perhaps
VME driver too) which I discover right now ;-)

(I'm planning to do a port of late 2.5/early 2.6 in less than a year;
I don't know exactly when, but I shall have to do it.)

> The gigabit Ethernet is indeed on a PMC module (from SBS Technologies,
> PMC-Gigabit-ST3). Our electronics generates about 15 MB/s per VME crate;
> it's digitizing tracks left by muons in a time-projection chamber.
> There are two crates, each with this rate, each equipped with an MVME2600
> with a gigabit card, and they have to transfer this firehose of data to a
> computer that will do some as-yet-undefined online data reduction and
> send the result to an LTO 2 tape robot. So, yes, we need about 50% more
> throughput than the built-in 10/100 Ethernet port could provide, and we need
> it with enough CPU time left over to manage the VME readout.

Ok, I use them differently. The 6 MVME2600s I have doing data acquisition
produce very little data on the network (<300 kB/s, but they process it
quite a lot, including a Fourier transform, between acquisition and
sending). Newer systems use MVME2400, which are way faster.

The main problem I had (solved now) is that 4 of my boards are from the
first batches, so I had to carefully work around the pile of bugs of the
Universe I PCI<->VME bridge.

>
> Fortunately, though, we don't need the whole gigabit. I agree that would
> probably be well-nigh impossible. Still, I'm very interested in understanding
> why the interrupt overhead seems to be so high at our 10 to 15 MB/s
> data rate.

How many interrupts do you have altogether, and how many VME interrupts,
if it's not secret (cat /proc/bus/vme/interrupts if you use my driver)?
Do you have lots of bad interrupts (cat /proc/interrupts)?

Do you know how much time you spend in the VME interrupt routines (which
are run with interrupts masked if you use my driver, but there is really
no other solution, the Universe being essentially a cascaded interrupt
controller)?

Do you know the percentage of bus utilization due to DMA?

Do both machines exhibit the same problem? Which kernel version are you
using?

	Regards,
	Gabriel

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 6+ messages in thread
end of thread, other threads:[~2003-04-29 12:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2003-04-27 19:42 ppc_irq_dispatch_handler dominating profile? Fred Gray
2003-04-27 22:45 ` Paul Mackerras
2003-04-28  7:33   ` Fred Gray
2003-04-28  8:53   ` Gabriel Paubert
2003-04-28 12:42     ` Fred Gray
2003-04-29 12:08       ` Gabriel Paubert