linuxppc-dev.lists.ozlabs.org archive mirror
* ML405 gigabit ethernet with kernel 2.6.23
@ 2007-11-08  2:16 kentaro
  2007-11-08 10:32 ` MingLiu
  0 siblings, 1 reply; 3+ messages in thread
From: kentaro @ 2007-11-08  2:16 UTC
  To: linuxppc-embedded

Dear all,

I have measured Ethernet performance on the ML405 with Linux
kernel 2.6.23-rc2, which I obtained from the secretlab.ca
git tree. I am posting this e-mail because I would like to
share the data and also ask a question about the performance.

Similar e-mails have been posted to this mailing list in the past, for example:
http://ozlabs.org/pipermail/linuxppc-embedded/2007-June/027328.html
They were helpful.

My hardware configuration :
-------------------------------------------------------------
ISE, EDK      : 9.1SP3(IP update-3) 9.1SP2
-------------------------------------------------------------
Board         : ML405
PPC frequency : 300 MHz
TEMAC         : SG-DMA, TX/RX checksum offload
                TX/RX FIFO depth = 131072
                MAC length and Status FIFO Depth = 64
                TX/RX DRE = 2
DDR Memory    : Support PLB Bursts and Cache = TRUE
-------------------------------------------------------------

Basically, this configuration is exactly the same as in XAPP1023
except for the BRAM (I used 64k BRAM). With this configuration,
Xilinx achieved 400 Mbps ~ 500 Mbps throughput with MontaVista
Linux 4.0. However, my results were
~110 Mbps (TCP) and ~200 Mbps (UDP). I guess the difference
comes from the Linux configuration. Here is my Linux setup.

-------------------------------------------------------------
kernel     : 2.6.23-rc2 (from linux-2.6-virtex.git)
gcc, glibc : 4.0.2,  2.3.6
TX,RX threshold = 32, 8 and waitbound = 1, 1
-------------------------------------------------------------

Before compiling the kernel, I needed to modify the checksum
code in adapter.c because the checksum insert address was wrong.

Original (line 1076):
XTemac_mSgSendBdCsumSetup(bd_ptr, skb->transport_header
  - skb->data, (skb->transport_header - skb->data) + skb->csum);

Modified :
XTemac_mSgSendBdCsumSetup(bd_ptr, skb_transport_offset(skb),
            skb_transport_offset(skb) + skb->csum_offset);
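
For readers following the fix above, here is a minimal sketch of the two
offsets the descriptor setup needs: where the hardware starts summing and
where it writes the result. The struct and function names are hypothetical
stand-ins, not the driver's or the kernel's own types, and the constants
assume an untagged Ethernet frame carrying IPv4 without options. It is my
assumption, not something stated in this thread, that skb->csum shares
storage with other checksum bookkeeping fields in this kernel, which would
explain why adding skb->csum produced a wrong insert address.

```c
#include <assert.h>
#include <stddef.h>

/* Well-known header sizes and checksum field positions. */
enum {
    ETH_HDR_LEN      = 14, /* untagged Ethernet header */
    IPV4_MIN_HDR_LEN = 20, /* IPv4 header without options */
    TCP_CSUM_FIELD   = 16, /* checksum field offset inside the TCP header */
    UDP_CSUM_FIELD   = 6   /* checksum field offset inside the UDP header */
};

/* Hypothetical stand-in for the two sk_buff-derived values used by the fix. */
struct tx_pkt_model {
    size_t transport_offset; /* plays the role of skb_transport_offset(skb) */
    size_t csum_offset;      /* plays the role of skb->csum_offset */
};

/* Offsets handed to the checksum offload engine, relative to frame start. */
struct csum_setup {
    size_t start;  /* first byte covered by the checksum */
    size_t insert; /* where the computed checksum is written */
};

/* Mirrors the arithmetic of the modified XTemac_mSgSendBdCsumSetup() call. */
struct csum_setup tx_csum_setup(const struct tx_pkt_model *pkt)
{
    struct csum_setup s;

    assert(pkt != NULL);
    s.start = pkt->transport_offset;
    s.insert = pkt->transport_offset + pkt->csum_offset;
    return s;
}
```

Under these assumptions a TCP segment has transport_offset = 14 + 20 = 34
and an insert offset of 34 + 16 = 50; for UDP the insert offset is
34 + 6 = 40.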

I used "netperf" to measure performance on the built kernel.
The results were:
-------------------------------------------------------------
"netperf -H 192.168.1.1 -t TCP_STREAM"          110 Mbps
"netperf -H 192.168.1.1 -t UDP_STREAM"          210 Mbps
-------------------------------------------------------------
I changed some netperf parameters, but the results
did not change much. It seemed to me that the performance
was limited by the CPU, because the "top" command reported
99% CPU usage (71% SYSTEM, 27% IRQ). If I lower the TX threshold
to 16, the split becomes ~50% SYSTEM, ~40% IRQ.
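
One rough way to read these numbers is as a per-frame cycle budget. The
sketch below is only an estimate under stated assumptions: the CPU is fully
busy, every frame carries an MTU-sized payload, and header overhead and ACK
traffic are ignored. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * CPU cycles spent per frame when a CPU running at cpu_hz is saturated
 * while moving bits_per_sec of traffic in frames of frame_bytes each.
 * Integer division truncates the result.
 */
uint64_t cycles_per_frame(uint64_t cpu_hz, uint64_t frame_bytes,
                          uint64_t bits_per_sec)
{
    assert(bits_per_sec > 0);
    return (cpu_hz * frame_bytes * 8u) / bits_per_sec;
}
```

With the figures from this test, cycles_per_frame(300000000, 1500, 110000000)
gives about 32,700 cycles per 1500-byte frame. Since much of the cost is per
frame rather than per byte, this is one way to see why larger frames are
expected to help.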

Then I changed the MTU to 8000 (on both the PC and the ML405).
This upset everything: the network became very unstable
and I could not run netperf successfully.

So, my questions are:
(1) Do I need to apply some optimization to the kernel sources
    in order to achieve ~400 Mbps? It seems to me the difference
    comes from the kernel side.
(2) Has anyone else had an MTU problem? I would be very glad to
    have any advice.

Any suggestions are welcome.

Best regards,
Kentaro.

--------------------------------------------------------------------
PS:
For your interest, I attach my /proc/profile info
obtained while running netperf.

=============== Netperf Test (TCP STREAM) ====================
   394 __copy_tofrom_user                         0.6888
   208 invalidate_dcache_range                    4.3333
   196 clean_dcache_range                         4.0833
   173 XDmaV3_SgBdToHw                            0.5149
   152 tcp_sendmsg                                0.0485
   105 skb_clone                                  0.1862
    71 tcp_transmit_skb                           0.0380
    71 ip_queue_xmit                              0.0870
    67 cpu_idle                                   0.3102
    59 kfree                                      0.2588
    57 tcp_cwnd_validate                          0.4191
    49 tcp_push_one                               0.1551
    49 kmem_cache_alloc                           0.3063
    45 ip_output                                  0.0622
    44 tcp_ack                                    0.0067
    42 xenet_SgSend_internal                      0.0587
    38 __alloc_skb                                0.1418
    36 pfifo_fast_enqueue                         0.1579
    33 __kmalloc                                  0.1375
    30 memset                                     0.3261
    28 _xenet_SgSetupRecvBuffers                  0.0493
    27 XTemac_IntrSgEnable                        0.0938
    23 skb_release_data                           0.1150
    22 tcp_rcv_established                        0.0097

=============== Netperf Test (UDP STREAM) ====================
  1426 csum_partial_copy_generic                  6.4818
   961 cpu_idle                                   4.4491
   126 ip_fragment                                0.0754
    63 xenet_SgSend_internal                      0.0880
    58 memcpy                                     0.3718
    50 memset                                     0.5435
    48 XDmaV3_SgBdToHw                            0.1429
    48 __kmalloc                                  0.2000
    46 ip_push_pending_frames                     0.0451
    38 kfree                                      0.1667
    37 clean_dcache_range                         0.7708
    36 dev_queue_xmit                             0.0536
    33 __alloc_skb                                0.1231
    32 udp_push_pending_frames                    0.0452
    29 local_bh_enable                            0.2071
    29 ace_fsm_tasklet                            0.3295
    24 ip_append_data                             0.0100
    23 XTemac_SgCommit                            0.1027
    22 XDmaV3_SgBdAlloc                           0.1964
    21 skb_release_data                           0.1050
    21 kmem_cache_alloc                           0.1313
    20 ip_finish_output2                          0.0365
    19 XTemac_SgAlloc                             0.0679
    19 pfifo_fast_dequeue                         0.1532


* RE: ML405 gigabit ethernet with kernel 2.6.23
  2007-11-08  2:16 ML405 gigabit ethernet with kernel 2.6.23 kentaro
@ 2007-11-08 10:32 ` MingLiu
  2007-11-08 16:29   ` Rick Moleres
  0 siblings, 1 reply; 3+ messages in thread
From: MingLiu @ 2007-11-08 10:32 UTC
  To: kentaro, linuxppc-embedded


Dear Kentaro,
> -------------------------------------------------------------
> "netperf -H 192.168.1.1 -t TCP_STREAM" 110 Mbps
> "netperf -H 192.168.1.1 -t UDP_STREAM" 210 Mbps
> -------------------------------------------------------------
Are these results with or without jumbo frames enabled? If without, they are quite good, I think. The MontaVista results were probably measured with jumbo frames enabled.
 
For anybody interested in this topic: I recently had a paper accepted that covers part of this. The link is http://web.it.kth.se/~mingliu/publications/co_design(icfpt07).pdf and in section 6.2 I list our measurement results: 300 Mbps for TCP and 400 Mbps for UDP, with jumbo frames enabled. Unfortunately I did not describe our setup and configuration in detail, so these results are only for your reference.
 
BR
Ming

* RE: ML405 gigabit ethernet with kernel 2.6.23
  2007-11-08 10:32 ` MingLiu
@ 2007-11-08 16:29   ` Rick Moleres
  0 siblings, 0 replies; 3+ messages in thread
From: Rick Moleres @ 2007-11-08 16:29 UTC
  To: MingLiu, kentaro, linuxppc-embedded

Hi,

 

Yes, for a 1500 MTU these aren’t too bad.  One thing to note: the TCP_STREAM test does not take advantage of zero-copy, and possibly not of checksum offload, on the transmit side.  You should use the TCP_SENDFILE test for that.  We typically use options such as:

 

            netperf -c -C -H 192.168.1.1 -t TCP_STREAM -- -s131072 -S131072 -m65536

            netperf -c -C -H 192.168.1.1 -t UDP_STREAM -- -s131072 -S131072 -m8192 (if using jumbo frames of 8982)
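
On the zero-copy point above: as far as I understand it, a sendfile-style
test hands the kernel a file descriptor instead of a user-space buffer, so
the payload does not have to be copied in from user space. The following is
a minimal sketch of that call pattern using the Linux sendfile(2) system
call. The helper name send_file_range is hypothetical and is not part of
netperf or of the driver.

```c
#include <assert.h>
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/*
 * Send the first len bytes of in_fd to out_fd with sendfile(2).
 * Returns the number of bytes sent, or -1 on error (errno is set).
 * A short count means the input file ended before len bytes.
 */
ssize_t send_file_range(int out_fd, int in_fd, size_t len)
{
    off_t offset = 0;        /* sendfile advances this as it sends */
    size_t remaining = len;

    assert(out_fd >= 0 && in_fd >= 0);
    while (remaining > 0) {
        ssize_t n = sendfile(out_fd, in_fd, &offset, remaining);

        if (n < 0) {
            if (errno == EINTR)
                continue;    /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            break;           /* end of input file */
        remaining -= (size_t)n;
    }
    return (ssize_t)(len - remaining);
}
```

With a plain send() loop the data is first copied from the user buffer into
the kernel, which is likely the cost that appears as __copy_tofrom_user at
the top of the TCP_STREAM profile earlier in this thread.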

 

-Rick

