* Question about way that NICs deliver packets to the kernel
@ 2010-07-15 14:24 Junchang Wang
2010-07-15 14:33 ` Ben Hutchings
2010-07-15 21:12 ` Francois Romieu
0 siblings, 2 replies; 10+ messages in thread
From: Junchang Wang @ 2010-07-15 14:24 UTC (permalink / raw)
To: romieu, netdev
Hi list,
My understanding of the way that NICs deliver packets to the kernel is
as follows. Correct me if any of this is wrong. Thanks.

1) The device buffer is fixed. When the kernel is notified of the arrival of a
new packet, it dynamically allocates a new skb and copies the packet into it.
For example, 8139too.

2) The device buffer is mapped with streaming DMA. When the kernel is
notified of the arrival of a new packet, it unmaps the previously mapped region.
There is no memcpy; the additional cost is the streaming DMA
map/unmap operations. For example, e100 and e1000.
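In rough kernel-style code, the two schemes look something like this (a condensed
sketch, not taken from any particular driver; RX_BUF_SIZE, the ring handling and
the function names are placeholders):

#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/pci.h>

#define RX_BUF_SIZE 1536	/* placeholder ring-buffer size */

/* Scheme 1: copy each received frame into a freshly allocated skb and
 * leave the fixed device buffer in place for reuse. */
static void rx_copy_scheme(struct net_device *dev, void *rx_buf, int pkt_size)
{
	struct sk_buff *skb = dev_alloc_skb(pkt_size + NET_IP_ALIGN);

	if (!skb)
		return;				/* drop on allocation failure */
	skb_reserve(skb, NET_IP_ALIGN);		/* align the IP header */
	memcpy(skb->data, rx_buf, pkt_size);
	skb_put(skb, pkt_size);
	skb->protocol = eth_type_trans(skb, dev);
	netif_receive_skb(skb);
	/* rx_buf stays owned by the NIC and is reused as-is */
}

/* Scheme 2: the skb was streaming-DMA mapped when the ring was filled;
 * on completion, unmap it and hand it straight to the stack. */
static void rx_unmap_scheme(struct net_device *dev, struct pci_dev *pdev,
			    struct sk_buff *skb, dma_addr_t addr, int pkt_size)
{
	pci_unmap_single(pdev, addr, RX_BUF_SIZE, PCI_DMA_FROMDEVICE);
	skb_put(skb, pkt_size);
	skb->protocol = eth_type_trans(skb, dev);
	netif_receive_skb(skb);
	/* a replacement skb must now be allocated and pci_map_single()'d
	 * to refill this ring slot */
}

In both cases the per-packet cost is either a memcpy of pkt_size bytes
(scheme 1) or an unmap plus the later mapping of a replacement buffer
(scheme 2).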
Here come my questions:

1) Is there a principle indicating which one is better? Are streaming DMA
map/unmap operations more expensive than a memcpy?

2) Why is r8169 biased towards the first approach even though it supports both? I
converted r8169 to the second one and got a 5% performance boost. Below is the
result of running a netperf TCP_STREAM test with a 1.6 KB message size:
              scheme 1    scheme 2    Imp.
   r8169        683M        718M       5%
The following patch shows what I did:
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 239d7ef..707876f 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4556,15 +4556,9 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 
 			rtl8169_rx_csum(skb, desc);
 
-			if (rtl8169_try_rx_copy(&skb, tp, pkt_size, addr)) {
-				pci_dma_sync_single_for_device(pdev, addr,
-					pkt_size, PCI_DMA_FROMDEVICE);
-				rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
-			} else {
-				pci_unmap_single(pdev, addr, tp->rx_buf_sz,
-						 PCI_DMA_FROMDEVICE);
-				tp->Rx_skbuff[entry] = NULL;
-			}
+			pci_unmap_single(pdev, addr, tp->rx_buf_sz,
+					 PCI_DMA_FROMDEVICE);
+			tp->Rx_skbuff[entry] = NULL;
 
 			skb_put(skb, pkt_size);
 			skb->protocol = eth_type_trans(skb, dev);
Thanks in advance.
--Junchang
* Re: Question about way that NICs deliver packets to the kernel
2010-07-15 14:24 Question about way that NICs deliver packets to the kernel Junchang Wang
@ 2010-07-15 14:33 ` Ben Hutchings
2010-07-15 15:59 ` Stephen Hemminger
2010-07-16 7:05 ` Junchang Wang
2010-07-15 21:12 ` Francois Romieu
1 sibling, 2 replies; 10+ messages in thread
From: Ben Hutchings @ 2010-07-15 14:33 UTC (permalink / raw)
To: Junchang Wang; +Cc: romieu, netdev
On Thu, 2010-07-15 at 22:24 +0800, Junchang Wang wrote:
> Hi list,
> My understanding of the way that NICs deliver packets to the kernel is
> as follows. Correct me if any of this is wrong. Thanks.
>
> 1) The device buffer is fixed. When the kernel is notified of the arrival of a
> new packet, it dynamically allocates a new skb and copies the packet into it.
> For example, 8139too.
>
> 2) The device buffer is mapped with streaming DMA. When the kernel is
> notified of the arrival of a new packet, it unmaps the previously mapped region.
> There is no memcpy; the additional cost is the streaming DMA
> map/unmap operations. For example, e100 and e1000.
>
> Here come my questions:
> 1) Is there a principle indicating which one is better? Are streaming DMA
> map/unmap operations more expensive than a memcpy?
DMA should result in lower CPU usage and higher maximum performance.
> 2) Why is r8169 biased towards the first approach even though it supports both? I
> converted r8169 to the second one and got a 5% performance boost. Below is the
> result of running a netperf TCP_STREAM test with a 1.6 KB message size:
>
>               scheme 1    scheme 2    Imp.
>    r8169        683M        718M       5%
[...]
You should also compare the CPU usage.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* Re: Question about way that NICs deliver packets to the kernel
2010-07-15 14:33 ` Ben Hutchings
@ 2010-07-15 15:59 ` Stephen Hemminger
2010-07-16 7:05 ` Junchang Wang
1 sibling, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2010-07-15 15:59 UTC (permalink / raw)
To: Ben Hutchings; +Cc: Junchang Wang, romieu, netdev
On Thu, 15 Jul 2010 15:33:37 +0100
Ben Hutchings <bhutchings@solarflare.com> wrote:
> On Thu, 2010-07-15 at 22:24 +0800, Junchang Wang wrote:
> > Hi list,
> > My understanding of the way that NICs deliver packets to the kernel is
> > as follows. Correct me if any of this is wrong. Thanks.
> >
> > 1) The device buffer is fixed. When the kernel is notified of the arrival of a
> > new packet, it dynamically allocates a new skb and copies the packet into it.
> > For example, 8139too.
> >
> > 2) The device buffer is mapped with streaming DMA. When the kernel is
> > notified of the arrival of a new packet, it unmaps the previously mapped region.
> > There is no memcpy; the additional cost is the streaming DMA
> > map/unmap operations. For example, e100 and e1000.
> >
> > Here come my questions:
> > 1) Is there a principle indicating which one is better? Are streaming DMA
> > map/unmap operations more expensive than a memcpy?
>
> DMA should result in lower CPU usage and higher maximum performance.
>
> > 2) Why is r8169 biased towards the first approach even though it supports both? I
> > converted r8169 to the second one and got a 5% performance boost. Below is the
> > result of running a netperf TCP_STREAM test with a 1.6 KB message size:
> >
> >               scheme 1    scheme 2    Imp.
> >    r8169        683M        718M       5%
> [...]
>
> You should also compare the CPU usage.
Also, many drivers copy small receives into a new buffer,
which saves space and often gives better performance.
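To make that concrete, the usual "copybreak" pattern looks roughly like this
(a sketch only; rx_copybreak and RX_BUF_SIZE are illustrative, and real drivers
usually expose the threshold as a module parameter):

#include <linux/skbuff.h>
#include <linux/pci.h>

#define RX_BUF_SIZE 1536		/* placeholder ring-buffer size */
static int rx_copybreak = 200;		/* illustrative threshold */

/* Small frames: copy into a right-sized skb and keep the mapped ring
 * buffer for reuse.  Large frames: unmap and pass the ring skb up,
 * leaving the caller to refill the slot. */
static struct sk_buff *rx_copybreak_sketch(struct pci_dev *pdev,
					   struct sk_buff *ring_skb,
					   dma_addr_t addr, int pkt_size)
{
	if (pkt_size < rx_copybreak) {
		struct sk_buff *copy = dev_alloc_skb(pkt_size + NET_IP_ALIGN);

		if (copy) {
			skb_reserve(copy, NET_IP_ALIGN);
			pci_dma_sync_single_for_cpu(pdev, addr, pkt_size,
						    PCI_DMA_FROMDEVICE);
			skb_copy_from_linear_data(ring_skb, copy->data, pkt_size);
			pci_dma_sync_single_for_device(pdev, addr, pkt_size,
						       PCI_DMA_FROMDEVICE);
			return copy;	/* ring_skb stays in the ring, still mapped */
		}
	}
	pci_unmap_single(pdev, addr, RX_BUF_SIZE, PCI_DMA_FROMDEVICE);
	return ring_skb;		/* caller must allocate and map a new skb */
}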
* Re: Question about way that NICs deliver packets to the kernel
2010-07-15 14:24 Question about way that NICs deliver packets to the kernel Junchang Wang
2010-07-15 14:33 ` Ben Hutchings
@ 2010-07-15 21:12 ` Francois Romieu
2010-07-16 7:35 ` Junchang Wang
1 sibling, 1 reply; 10+ messages in thread
From: Francois Romieu @ 2010-07-15 21:12 UTC (permalink / raw)
To: Junchang Wang; +Cc: netdev
Junchang Wang <junchangwang@gmail.com> :
[...]
> 2) Why is r8169 biased towards the first approach even though it supports both?
It is a simple, straightforward fix against an 8169 hardware bug.
See commit c0cd884af045338476b8e69a61fceb3f34ff22f1.
--
Ueimor
* Re: Question about way that NICs deliver packets to the kernel
2010-07-15 14:33 ` Ben Hutchings
2010-07-15 15:59 ` Stephen Hemminger
@ 2010-07-16 7:05 ` Junchang Wang
2010-07-16 17:58 ` Rick Jones
1 sibling, 1 reply; 10+ messages in thread
From: Junchang Wang @ 2010-07-16 7:05 UTC (permalink / raw)
To: Ben Hutchings; +Cc: romieu, netdev
>
> You should also compare the CPU usage.
>
> Ben.
>
Hi Ben,
I added the -c and -C options to netperf's command line. The result is as follows:

                 scheme 1    scheme 2    Imp.
   Throughput:     683M        718M       5%
   CPU usage:      47.8%       45.6%
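(For reference, the invocation was along these lines - the exact address and
message size here are illustrative:

   netperf -H 192.168.2.1 -t TCP_STREAM -l 10 -c -C -- -m 1600

where -c/-C request local and remote CPU utilization and the test-specific
-m option sets the send size.)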
That really surprised me, because the "top" command showed the CPU usage
fluctuating between 0.5% and 1.5% rather than between 45% and 50%.
How can I get the exact CPU usage?
Thanks.
--
--Junchang
* Re: Question about way that NICs deliver packets to the kernel
2010-07-15 21:12 ` Francois Romieu
@ 2010-07-16 7:35 ` Junchang Wang
0 siblings, 0 replies; 10+ messages in thread
From: Junchang Wang @ 2010-07-16 7:35 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
> It is a simple, straightforward fix against a 8169 hardware bug.
>
> See commit c0cd884af045338476b8e69a61fceb3f34ff22f1.
>
Fortunately, it seems my device is unaffected by this issue. :)
Thanks Francois.
--
--Junchang
* Re: Question about way that NICs deliver packets to the kernel
2010-07-16 7:05 ` Junchang Wang
@ 2010-07-16 17:58 ` Rick Jones
2010-07-20 1:15 ` Junchang Wang
0 siblings, 1 reply; 10+ messages in thread
From: Rick Jones @ 2010-07-16 17:58 UTC (permalink / raw)
To: Junchang Wang; +Cc: Ben Hutchings, romieu, netdev
Junchang Wang wrote:
>>You should also compare the CPU usage.
>>
>>Ben.
>>
>
> Hi Ben,
> I added the -c and -C options to netperf's command line. The result is as follows:
>
>                  scheme 1    scheme 2    Imp.
>    Throughput:     683M        718M       5%
>    CPU usage:      47.8%       45.6%
>
> That really surprised me, because the "top" command showed the CPU usage
> fluctuating between 0.5% and 1.5% rather than between 45% and 50%.
Can you tell us a bit more about the system, and which version of netperf you
are using? Any chance that the CPU utilization you were looking at in top was
just that being charged to netperf the process? "Network processing" does not
often get charged to the responsible process, so netperf reports system-wide CPU
utilization on the assumption it is the only thing causing the CPUs to be utilized.
happy benchmarking,
rick jones
* Re: Question about way that NICs deliver packets to the kernel
2010-07-16 17:58 ` Rick Jones
@ 2010-07-20 1:15 ` Junchang Wang
2010-07-20 17:16 ` Rick Jones
0 siblings, 1 reply; 10+ messages in thread
From: Junchang Wang @ 2010-07-20 1:15 UTC (permalink / raw)
To: Rick Jones; +Cc: Ben Hutchings, romieu, netdev
On Fri, Jul 16, 2010 at 10:58:46AM -0700, Rick Jones wrote:
>>Hi Ben,
>>I added the -c and -C options to netperf's command line. The result is as follows:
>>
>>                 scheme 1    scheme 2    Imp.
>>   Throughput:     683M        718M       5%
>>   CPU usage:      47.8%       45.6%
>>
>>That really surprised me, because the "top" command showed the CPU usage
>>fluctuating between 0.5% and 1.5% rather than between 45% and 50%.
>
Hi Rick,
Very sorry for my late reply. I just recovered from final exams. :)
>Can you tell us a bit more about the system, and which version of
>netperf you are using?
The target machine is a Pentium Dual-Core E2200 desktop with an r8169
gigabit NIC. (I couldn't find a better server with an old PCI slot.)

The other machine is a Nehalem-based system with an Intel 82576 NIC.

The target machine runs netserver and the Nehalem machine runs netperf.
The version of netperf is 2.4.5.
>Any chance that the CPU utilization you were
>looking at in top was just that being charged to netperf the process?
What I see on the target machine is as follows:
top - 21:37:12 up 21 min, 6 users, load average: 0.43, 0.28, 0.19
Tasks: 152 total, 2 running, 149 sleeping, 0 stopped, 1 zombie
Cpu(s): 2.3%us, 1.5%sy, 0.1%ni, 89.5%id, 2.7%wa, 0.0%hi, 3.9%si, 0.0%
Mem: 2074064k total, 690200k used, 1383864k free, 39372k buffers
Swap: 2096476k total, 0k used, 2096476k free, 435044k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3916 root 20 0 2228 584 296 R 84.6 0.0 0:07.12 netserver
It shows that the CPU usage of the target machine is around 10%,
while the Nehalem machine's report is as follows:
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.1 (192.168.2.1) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.05       679.79   1.63     48.27    1.571   11.634
It shows that the CPU usage of the target machine is 48.27%.
>"Network processing" does not often get charged to the responsible
>process, so netperf reports system-wide CPU utilization on the
>assumption it is the only thing causing the CPUs to be utilized.
My understanding of your comments is:

1) Except when it runs in ksoftirqd, network processing cannot be counted correctly,
because it runs in interrupt contexts that do not get charged to the right
process. So "top" misses a lot of CPU usage in high-interrupt-rate network
situations.

2) As mentioned in netperf's manual, netperf uses /proc/stat on Linux
to retrieve the time spent idle. In other words, it accounts for CPU time
spent in all other modes, including hardware interrupt, software interrupt,
etc., making the CPU usage more accurate in high-interrupt situations
(see the sketch after this list).

3) Since most processes on the target machine are sleeping, the CPU usage
of network processing is actually very close to 48.27%. Right?

Correct me if any of these are incorrect. Thanks.
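As a rough illustration of that /proc/stat-based accounting, a standalone
userspace sketch (of the idea only, not netperf's actual code) would be:

/* Sample the aggregate "cpu" line of /proc/stat twice and treat everything
 * that is not idle as "used".  Illustrative only -- netperf's own code also
 * handles per-CPU lines, iowait, calibration, etc. */
#include <stdio.h>
#include <unistd.h>

static void read_cpu_line(unsigned long long *idle, unsigned long long *total)
{
	unsigned long long v[10] = { 0 };
	FILE *f = fopen("/proc/stat", "r");
	int i;

	*idle = 0;
	*total = 0;
	if (!f)
		return;
	fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
	       &v[0], &v[1], &v[2], &v[3], &v[4], &v[5],
	       &v[6], &v[7], &v[8], &v[9]);
	fclose(f);

	*idle = v[3];		/* fields: user nice system IDLE iowait irq softirq ... */
	for (i = 0; i < 10; i++)
		*total += v[i];
}

int main(void)
{
	unsigned long long i1, t1, i2, t2;

	read_cpu_line(&i1, &t1);
	sleep(10);		/* measurement interval */
	read_cpu_line(&i2, &t2);
	printf("CPU utilization: %.1f%%\n",
	       100.0 * (1.0 - (double)(i2 - i1) / (double)(t2 - t1)));
	return 0;
}

Run alongside a test, this should report roughly the same system-wide figure
that netperf prints as remote CPU utilization.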
--Junchang
* Re: Question about way that NICs deliver packets to the kernel
2010-07-20 1:15 ` Junchang Wang
@ 2010-07-20 17:16 ` Rick Jones
2010-07-25 14:18 ` Junchang Wang
0 siblings, 1 reply; 10+ messages in thread
From: Rick Jones @ 2010-07-20 17:16 UTC (permalink / raw)
To: Junchang Wang; +Cc: Ben Hutchings, romieu, netdev
Junchang Wang wrote:
> On Fri, Jul 16, 2010 at 10:58:46AM -0700, Rick Jones wrote:
>
>>>Hi Ben,
>>>I added the -c and -C options to netperf's command line. The result is as follows:
>>>
>>>                 scheme 1    scheme 2    Imp.
>>>   Throughput:     683M        718M       5%
>>>   CPU usage:      47.8%       45.6%
>>>
>>>That really surprised me, because the "top" command showed the CPU usage
>>>fluctuating between 0.5% and 1.5% rather than between 45% and 50%.
>>
>
> Hi rick,
> very sorry for my late reply. Just recovered from the final exam.:)
>
>
>>Can you tell us a bit more about the system, and which version of
>>netperf you are using?
>
>
> The target machine is a Pentium Dual-Core E2200 desktop with an r8169
> gigabit NIC. (I couldn't find a better server with an old PCI slot.)
>
> The other machine is a Nehalem-based system with an Intel 82576 NIC.
>
> The target machine runs netserver and the Nehalem machine runs netperf.
> The version of netperf is 2.4.5.
>
>
>>Any chance that the CPU utilization you were
>>looking at in top was just that being charged to netperf the process?
>
>
> What I see on the target machine is as follows:
>
> top - 21:37:12 up 21 min, 6 users, load average: 0.43, 0.28, 0.19
> Tasks: 152 total, 2 running, 149 sleeping, 0 stopped, 1 zombie
> Cpu(s): 2.3%us, 1.5%sy, 0.1%ni, 89.5%id, 2.7%wa, 0.0%hi, 3.9%si, 0.0%
> Mem: 2074064k total, 690200k used, 1383864k free, 39372k buffers
> Swap: 2096476k total, 0k used, 2096476k free, 435044k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 3916 root 20 0 2228 584 296 R 84.6 0.0 0:07.12 netserver
You said this was a dual-core system, right? So two cores, no threads? If so,
then that does look odd - if netserver is consuming 84% of a CPU (core) and
there are only two CPUs (cores) in the system, how the system can be 89.5% idle
is beyond me. The 48% reported by netperf below makes better sense. If you press
"1" while top is running, it should start to show per-CPU statistics.
> It shows that the CPU usage of the target machine is around 10%,
> while the Nehalem machine's report is as follows:
>
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.1 (192.168.2.1) port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.05       679.79   1.63     48.27    1.571   11.634
>
> It shows that the CPU usage of the target machine is 48.27%.
Clearly something is out of joint - let's go off-list (or on to
netperf-talk@netperf.org) and hash that out to see what may be happening. It
will probably involve variations on grabbing the top-of-trunk, adding the debug
option etc.
>
>
>>"Network processing" does not often get charged to the responsible
>>process, so netperf reports system-wide CPU utilization on the
>>assumption it is the only thing causing the CPUs to be utilized.
>
>
> My understanding of your comments is:
> 1) Except when it runs in ksoftirqd, network processing cannot be counted correctly,
> because it runs in interrupt contexts that do not get charged to the right
> process. So "top" misses a lot of CPU usage in high-interrupt-rate network
> situations.
Top *shouldn't* miss it as far as reporting overall CPU utilization goes. It just may
not be charged to the process on whose behalf the work is done.
> 2) As mentioned in netperf's manual, netperf uses /proc/stat on Linux
> to retrieve the time spent idle. In other words, it accounts for CPU time
> spent in all other modes, including hardware interrupt, software interrupt,
> etc., making the CPU usage more accurate in high-interrupt situations.
That is the theory. In practice however... while the top output you've
provided looks like there is an "issue" in top, netperf has been known to have a
bug or three.
> 3) Since most processes on the target machine are sleeping, the CPU usage
> of network processing is actually very close to 48.27%. Right?
I do not expect there to be a huge discrepancy between the overall CPU
utilization reported by top and the CPU utilization reported by netperf. That
there seems to be such a discrepancy has me wanting to make certain that netperf
is operating correctly.
happy benchmarking,
rick jones
>
> Correct me if any of these are incorrect. Thanks.
>
> --Junchang
* Re: Question about way that NICs deliver packets to the kernel
2010-07-20 17:16 ` Rick Jones
@ 2010-07-25 14:18 ` Junchang Wang
0 siblings, 0 replies; 10+ messages in thread
From: Junchang Wang @ 2010-07-25 14:18 UTC (permalink / raw)
To: Rick Jones, netdev; +Cc: Ben Hutchings, romieu
Hi list,
> Clearly something is out of joint - let's go off-list (or on to
> netperf-talk@netperf.org) and hash that out to see what may be happening.
> It will probably involve variations on grabbing the top-of-trunk, adding
> the debug option etc.
>
The discrepancy between netperf and top has been worked out.
It turns out top produces misleading data when I send its output to a file.
For example, "top -b > output" gives the bogus numbers I reported earlier in its
first iteration.
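(Presumably that is because top's first batch-mode sample has nothing earlier to
diff against, so it shows percentages averaged since boot; the later iterations
reflect the current load. Something like

   top -b -n 3 -d 5 > output

and reading the second or third snapshot avoids the problem.)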
Actually, the report of top should be:
top - 21:37:15 up 21 min, 6 users, load average: 0.43, 0.28, 0.19
Tasks: 152 total, 2 running, 149 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.2%us, 5.4%sy, 0.0%ni, 50.9%id, 0.0%wa, 0.0%hi, 43.5%si, 0.0%
Mem: 2074064k total, 690192k used, 1383872k free, 39372k buffers
Swap: 2096476k total, 0k used, 2096476k free, 435056k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3916 root 20 0 2228 584 296 R 86.3 0.0 0:09.72 netserver
I think 50.9% system idle makes sense, because this is a dual-core system
and netserver is consuming 86.3% of a core. On average, the CPU usage
of the whole system reported by top can be regarded as between 46.2% and
50.1%.

netperf's report of 48% is right, and it confirms that "there is no huge
discrepancy between the overall CPU utilization reported by top and the CPU
utilization reported by netperf."
Thanks Rick.
--
--Junchang
Thread overview: 10+ messages
2010-07-15 14:24 Question about way that NICs deliver packets to the kernel Junchang Wang
2010-07-15 14:33 ` Ben Hutchings
2010-07-15 15:59 ` Stephen Hemminger
2010-07-16 7:05 ` Junchang Wang
2010-07-16 17:58 ` Rick Jones
2010-07-20 1:15 ` Junchang Wang
2010-07-20 17:16 ` Rick Jones
2010-07-25 14:18 ` Junchang Wang
2010-07-15 21:12 ` Francois Romieu
2010-07-16 7:35 ` Junchang Wang