* ML405 gigabit ethernet with kernel 2.6.23
@ 2007-11-08 2:16 kentaro
2007-11-08 10:32 ` MingLiu
0 siblings, 1 reply; 3+ messages in thread
From: kentaro @ 2007-11-08 2:16 UTC (permalink / raw)
To: linuxppc-embedded
Dear all,
I have measured Ethernet performance on the ML405 with Linux
kernel 2.6.23-rc2, which I obtained from the secretlab.ca
git tree. I am posting this e-mail to share the data and to
ask some questions about the performance.
Similar e-mails have been posted to this mailing list before:
http://ozlabs.org/pipermail/linuxppc-embedded/2007-June/027328.html
They were also helpful.
My hardware configuration :
-------------------------------------------------------------
ISE, EDK : 9.1SP3(IP update-3) 9.1SP2
-------------------------------------------------------------
Board : ML405
PPC frequency : 300 MHz
TEMAC : SG-DMA, TX/RX checksum offload
TX/RX FIFO depth = 131072
MAC length and Status FIFO Depth = 64
TX/RX DRE = 2
DDR Memory : Support PLB Bursts and Cache = TRUE
-------------------------------------------------------------
Basically, this configuration is exactly the same as in XAPP1023
except for the BRAM (I used 64k BRAM). With this configuration,
Xilinx achieved 400 Mbps ~ 500 Mbps throughput with MontaVista
Linux 4.0. However, my results were
~110 Mbps (TCP) and ~200 Mbps (UDP). I guess the difference
comes from the Linux configuration. Here is my Linux setup.
-------------------------------------------------------------
kernel : 2.6.23-rc2 (from linux-2.6-virtex.git)
gcc, glibc : 4.0.2, 2.3.6
TX,RX threshold = 32, 8 and waitbound = 1, 1
-------------------------------------------------------------
Before compiling the kernel, I needed to modify the checksum
code in adapter.c because the checksum insert address was wrong.
Original (line 1076):
    XTemac_mSgSendBdCsumSetup(bd_ptr, skb->transport_header - skb->data,
            (skb->transport_header - skb->data) + skb->csum);
Modified:
    XTemac_mSgSendBdCsumSetup(bd_ptr, skb_transport_offset(skb),
            skb_transport_offset(skb) + skb->csum_offset);
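The corrected call passes two byte offsets measured from skb->data: where the hardware should start summing (the transport header) and where it should write the result (that start plus the offset of the checksum field inside the transport header). A minimal sketch of the arithmetic in plain C follows. The struct and helper names are hypothetical stand-ins, not kernel or Xilinx driver API, and the constants assume an untagged Ethernet frame carrying IPv4 without options.

```c
#include <stddef.h>

/* Assumed header sizes and checksum field positions (illustration only). */
enum {
    ETH_HDR_BYTES = 14,      /* untagged Ethernet header */
    IPV4_MIN_HDR_BYTES = 20, /* IPv4 header without options */
    TCP_CSUM_FIELD = 16,     /* checksum field offset inside the TCP header */
    UDP_CSUM_FIELD = 6       /* checksum field offset inside the UDP header */
};

/* Hypothetical stand-in for the two values the driver reads from the skb. */
struct tx_csum_layout {
    size_t transport_offset; /* bytes from start of frame to TCP/UDP header */
    size_t csum_offset;      /* bytes from transport header to checksum field */
};

/* Offset at which checksum calculation begins. */
static size_t csum_start(const struct tx_csum_layout *l)
{
    return l->transport_offset;
}

/* Offset at which the computed checksum is written into the frame. */
static size_t csum_insert(const struct tx_csum_layout *l)
{
    return l->transport_offset + l->csum_offset;
}
```

Under these assumptions a TCP segment has a start offset of 14 + 20 = 34 and an insert offset of 34 + 16 = 50; for UDP the insert offset would be 34 + 6 = 40.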
I used "netperf" to measure performance on the built kernel.
The results were:
-------------------------------------------------------------
"netperf -H 192.168.1.1 -t TCP_STREAM" 110 Mbps
"netperf -H 192.168.1.1 -t UDP_STREAM" 210 Mbps
-------------------------------------------------------------
I changed some netperf parameters, but the results
did not change much. The performance seemed to be limited
by the CPU, because "top" reported 99% CPU usage
(71% SYSTEM, 27% IRQ). If I lower the TX threshold
to 16, the split becomes ~50% SYSTEM, ~40% IRQ.
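The shift from SYSTEM to IRQ time is what one would expect if the TX threshold is a frame-count interrupt coalescing setting: halving it roughly doubles the number of TX completion interrupts for the same frame rate. A rough sketch of that relationship, assuming one interrupt per `threshold` frames and ignoring the waitbound timer (the function name and the frame rate used in the note below are illustrative assumptions, not values taken from the driver):

```c
/*
 * Estimated coalesced interrupts per second when the hardware raises
 * one interrupt for every `threshold` completed frames.
 * The waitbound timer is deliberately ignored in this sketch.
 */
static double coalesced_irqs_per_second(double frames_per_second,
                                        unsigned threshold)
{
    if (threshold == 0)
        return 0.0; /* treat 0 as "coalescing not configured" here */
    return frames_per_second / (double)threshold;
}
```

At about 9,400 frames per second (110 Mbps of TCP payload in 1460-byte segments, which is an assumption), this gives roughly 290 interrupts per second with a threshold of 32 and roughly 590 with a threshold of 16.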
Then I changed the MTU to 8000 (on both the PC and the ML405).
This broke everything: the network became very unstable
and I could not run netperf successfully.
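For a CPU-bound sender, the appeal of a larger MTU is that it needs fewer frames for the same throughput, which helps to the extent that the cost is per frame rather than per byte. A back-of-the-envelope sketch, assuming IPv4 and TCP headers of 20 bytes each with no options (the helper names are hypothetical):

```c
/* Assumed header sizes: IPv4 and TCP without options. */
enum { IPV4_HDR_BYTES = 20, TCP_HDR_BYTES = 20 };

/* TCP payload bytes that fit in one frame for a given MTU. */
static unsigned tcp_payload_per_frame(unsigned mtu)
{
    return mtu - IPV4_HDR_BYTES - TCP_HDR_BYTES;
}

/* Frames per second needed to carry `payload_mbps` (10^6 bits/s of payload). */
static double frames_per_second(double payload_mbps, unsigned payload_bytes)
{
    return (payload_mbps * 1e6) / (8.0 * (double)payload_bytes);
}
```

With the measured 110 Mbps, this works out to roughly 9,400 frames per second at MTU 1500 and roughly 1,700 at MTU 8000.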
So, my questions are:
(1) Do I need to apply some optimization to the kernel sources
in order to achieve ~400 Mbps? It seems to me that the difference
comes from the kernel side.
(2) Has anyone else had MTU problems? I would be very glad to
have any advice.
Any suggestions are welcome.
Best regards,
Kentaro.
--------------------------------------------------------------------
PS:
In case it is of interest, I attach my /proc/profile data
obtained while running netperf.
=============== Netperf Test (TCP STREAM) ====================
394 __copy_tofrom_user 0.6888
208 invalidate_dcache_range 4.3333
196 clean_dcache_range 4.0833
173 XDmaV3_SgBdToHw 0.5149
152 tcp_sendmsg 0.0485
105 skb_clone 0.1862
71 tcp_transmit_skb 0.0380
71 ip_queue_xmit 0.0870
67 cpu_idle 0.3102
59 kfree 0.2588
57 tcp_cwnd_validate 0.4191
49 tcp_push_one 0.1551
49 kmem_cache_alloc 0.3063
45 ip_output 0.0622
44 tcp_ack 0.0067
42 xenet_SgSend_internal 0.0587
38 __alloc_skb 0.1418
36 pfifo_fast_enqueue 0.1579
33 __kmalloc 0.1375
30 memset 0.3261
28 _xenet_SgSetupRecvBuffers 0.0493
27 XTemac_IntrSgEnable 0.0938
23 skb_release_data 0.1150
22 tcp_rcv_established 0.0097
=============== Netperf Test (UDP STREAM) ====================
1426 csum_partial_copy_generic 6.4818
961 cpu_idle 4.4491
126 ip_fragment 0.0754
63 xenet_SgSend_internal 0.0880
58 memcpy 0.3718
50 memset 0.5435
48 XDmaV3_SgBdToHw 0.1429
48 __kmalloc 0.2000
46 ip_push_pending_frames 0.0451
38 kfree 0.1667
37 clean_dcache_range 0.7708
36 dev_queue_xmit 0.0536
33 __alloc_skb 0.1231
32 udp_push_pending_frames 0.0452
29 local_bh_enable 0.2071
29 ace_fsm_tasklet 0.3295
24 ip_append_data 0.0100
23 XTemac_SgCommit 0.1027
22 XDmaV3_SgBdAlloc 0.1964
21 skb_release_data 0.1050
21 kmem_cache_alloc 0.1313
20 ip_finish_output2 0.0365
19 XTemac_SgAlloc 0.0679
19 pfifo_fast_dequeue 0.1532
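One way to read the listings above is to ask what share of the listed ticks goes to moving data around. The sketch below sums tick counts and computes a percentage; the struct and function names are hypothetical, and because only the top entries of each profile are listed, any share is relative to the listed ticks rather than to the whole run.

```c
#include <stddef.h>

/* One line of the profile listing: tick count and symbol name. */
struct profile_entry {
    const char *symbol;
    unsigned ticks;
};

/* Sum the tick counts of n entries. */
static unsigned total_ticks(const struct profile_entry *entries, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += entries[i].ticks;
    return sum;
}

/* Share of `part` in `total`, in percent; 0 if total is 0. */
static double share_percent(unsigned part, unsigned total)
{
    return total ? 100.0 * (double)part / (double)total : 0.0;
}
```

For the TCP listing, __copy_tofrom_user, invalidate_dcache_range and clean_dcache_range together account for 798 of the 2019 listed ticks, or about 40%.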
* RE: ML405 gigabit ethernet with kernel 2.6.23
2007-11-08 2:16 ML405 gigabit ethernet with kernel 2.6.23 kentaro
@ 2007-11-08 10:32 ` MingLiu
2007-11-08 16:29 ` Rick Moleres
0 siblings, 1 reply; 3+ messages in thread
From: MingLiu @ 2007-11-08 10:32 UTC (permalink / raw)
To: kentaro, linuxppc-embedded
[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]
Dear Kentaro,
> -------------------------------------------------------------
> "netperf -H 192.168.1.1 -t TCP_STREAM" 110 Mbps
> "netperf -H 192.168.1.1 -t UDP_STREAM" 210 Mbps
> -------------------------------------------------------------
Are these results with or without jumbo frames enabled? If without, I think they are quite good. The MontaVista results were probably obtained with jumbo frames enabled.
For anybody interested in this topic, I recently had a paper accepted that covers part of it. The link is http://web.it.kth.se/~mingliu/publications/co_design(icfpt07).pdf and in section 6.2 I list our measurement results: 300 Mbps for TCP and 400 Mbps for UDP, with jumbo frames enabled. Unfortunately I did not describe our configuration in detail, so these results are for reference only.
BR
Ming
* RE: ML405 gigabit ethernet with kernel 2.6.23
2007-11-08 10:32 ` MingLiu
@ 2007-11-08 16:29 ` Rick Moleres
0 siblings, 0 replies; 3+ messages in thread
From: Rick Moleres @ 2007-11-08 16:29 UTC (permalink / raw)
To: MingLiu, kentaro, linuxppc-embedded
[-- Attachment #1: Type: text/plain, Size: 1851 bytes --]
Hi,
Yes, for 1500 MTU these aren’t too bad. One thing to note, using the TCP_STREAM option does not take advantage of zero-copy and possibly checksum offload on the transmit side. You should use the TCP_SENDFILE option for that. We typically use options such as:
netperf -c -C -H 192.168.1.1 -t TCP_STREAM -- -s131072 -S131072 -m65536
netperf -c -C -H 192.168.1.1 -t UDP_STREAM -- -s131072 -S131072 -m8192 (if using jumbo frames of 8982)
-Rick