netdev.vger.kernel.org archive mirror
* variation in thruput w/ change in mtu on gige
From: Abhijit Karmarkar @ 2004-04-26 12:06 UTC
  To: netdev

Hi,

I have observed that using jumbo frames (MTU=9000) actually decreases
throughput (I am timing one-way ttcp). Trying different MTUs, I find
that 4096 gives me the best numbers:

mtu             throughput
--------------------------------
1500 (default)  ~846 Mbps
4096            ~930 Mbps  <== highest
8192            ~806 Mbps
9000            ~806 Mbps
15K             ~680 Mbps

my setup is:
- 2 nodes connected directly (crossover cable)
- each node: 2-way 2.4 GHz Xeon, 4 GB RAM, running RHEL3 (2.4.21-4.ELsmp)
- Intel GigE (82543GC), e1000 driver version 5.1.11-k1
  (I believe the cards are 64-bit/66 MHz PCI)
- ipv4.tcp_r/wmem and core.r/wmem_max set sufficiently high (512 KB);
  see the command sketch below
- using ttcp to transfer ~8 GB one-way
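
For reference, the tuning and test commands looked roughly like this
(from memory, and ttcp option letters vary a bit between builds, so
treat it as a sketch rather than the exact invocation):

	# on both nodes: raise socket buffer limits, set the MTU under test
	sysctl -w net.core.rmem_max=524288
	sysctl -w net.core.wmem_max=524288
	sysctl -w net.ipv4.tcp_rmem="4096 87380 524288"
	sysctl -w net.ipv4.tcp_wmem="4096 65536 524288"
	ifconfig eth0 mtu 9000

	# receiver
	ttcp -r -s

	# sender: 131072 buffers x 64 KB = 8 GB one-way
	ttcp -t -s -l 65536 -n 131072 <receiver-ip>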

Why doesn't my throughput increase as the MTU increases? Is it because
of the small number of rx/tx descriptors on the 82543GC (max = 256?),
or something else?

Are there any driver parameters I can tune to get better numbers with
larger MTUs?
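
For instance, I gather the e1000 module takes descriptor-ring sizes as
load-time options; if I read the docs right, the 82543 tops out at 256,
so something like the following is as far as that particular knob goes:

	modprobe -r e1000
	modprobe e1000 RxDescriptors=256 TxDescriptors=256

Is there anything beyond that?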

thanks,
abhijit


* Re: variation in thruput w/ change in mtu on gige
From: Steve Modica @ 2004-04-26 15:02 UTC
  To: Abhijit Karmarkar; +Cc: netdev

Probably page size: 4K is one page, so those are probably the most
efficient I/Os. There must be some additional handling required to
squeeze multiple pages into an MTU. Have you profiled things at all to
see what additional code has to run in order to handle multiple pages?
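
Back-of-the-envelope: with 4K pages, a 4096-byte MTU fits in a single
page, while a 9000-byte frame spans ceil(9000/4096) = 3 pages, so every
large frame touches three separate buffers instead of one.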

Steve

-- 
Steve Modica
work: 651-683-3224
MTS-Technical Lead
"Give a man a fish, and he will eat for a day, hit him with a fish and
he leaves you alone" - me


* Re: variation in thruput w/ change in mtu on gige
From: Abhijit Karmarkar @ 2004-04-27  4:46 UTC
  To: Steve Modica; +Cc: netdev

On Mon, 26 Apr 2004, Steve Modica wrote:

> Probably page size. 4k is one page so those are probably the most
> efficient IOs.  There must be some additional handling required to
> squeeze multiple pages into an MTU.

Not sure, though I do see copy_to_user() working harder on the rx side
in the MTU=9K case.
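
Rough numbers: ~930 Mbps at MTU 4096 works out to about 930e6/8/4096
~= 28K frames/sec, versus ~806e6/8/9000 ~= 11K frames/sec at MTU 9000.
So the 9K case moves fewer bytes and far fewer frames, yet spends
proportionally more time copying, which would fit a higher per-byte
copy cost.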

> Have you profiled things at all to see what additional code has to run
> in order to handle multiple pages?

I did collect oprofile samples for my test runs (one-way flow,
transmitting 38 GB of data using ttcp; same setup as earlier).
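
The collection steps were roughly the following (from my notes; the
oprofile shipped with RHEL3 may want slightly different opcontrol
syntax, and the vmlinux path assumes an uncompressed image with
symbols is available):

	opcontrol --vmlinux=/boot/vmlinux-2.4.21-4.ELsmp
	opcontrol --start
	# ... run the ttcp transfer ...
	opcontrol --stop
	opreport -l        # symbol-level breakdown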

Here is a summary (top 10 functions, for vmlinux and e1000):

For MTU=4096 (throughput ~930 Mbps) [at receiver]
----------------------------------------------------
	samples  %           symbol name
    vmlinux:
	28085    12.9316     default_idle
	24891    11.4609     __generic_copy_to_user
	12684    5.84026     tcp_v4_rcv
	11794    5.43047     __kmem_cache_alloc
	10966    5.04922     do_IRQ
	10834    4.98844     __wake_up
	8082     3.7213      try_to_wake_up
	7214     3.32164     __mod_timer
	6029     2.77601     net_rx_action
	5854     2.69544     ip_route_input
    e1000:
	52363    47.0657     e1000_intr
	36977    33.2363     e1000_irq_enable
	7435     6.68285     e1000_clean_tx_irq
	5024     4.51575     e1000_clean_rx_irq
	4764     4.28205     e1000_alloc_rx_buffers
	4037     3.6286      e1000_clean
	261      0.234596    e1000_tx_map
	258      0.2319      e1000_rx_checksum
	83       0.0746034   e1000_tx_queue
	48       0.0431441   e1000_xmit_frame


For MTU=9000 (throughput ~806 Mbps) [at receiver]
----------------------------------------------------
	samples  %           symbol name
    vmlinux:
	22533    20.7672     __generic_copy_to_user
	12178    11.2237     default_idle
	5893     5.43119     tcp_v4_rcv
	5151     4.74733     __wake_up
	5010     4.61738     __kmem_cache_alloc
	4585     4.22569     do_IRQ
	3592     3.31051     try_to_wake_up
	2966     2.73356     __mod_timer
	2683     2.47274     ip_route_input
	2491     2.29579     eth_type_trans
    e1000:
	20504    51.4349     e1000_intr
	10064    25.2458     e1000_irq_enable
	2860     7.17439     e1000_clean_tx_irq
	2292     5.74955     e1000_clean_rx_irq
	2261     5.67178     e1000_alloc_rx_buffers
	1583     3.971       e1000_clean
	132      0.331126    e1000_rx_checksum
	108      0.270921    e1000_tx_map
	35       0.0877985   e1000_tx_queue
	17       0.042645    e1000_xmit_frame


Does that tell us anything? Also note that the share of e1000 time
spent in e1000_intr is slightly higher for the larger MTU (51.4% vs
47.1%).
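
If interrupt overhead is part of it, maybe the driver's interrupt-delay
options would help; assuming this e1000 version honours them, something
like:

	modprobe e1000 RxIntDelay=64 TxIntDelay=64

though I have not tried that yet.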

[In case anybody needs the full profiles, on both rx/tx sides for
 different MTUs, please let me know; I can mail them.]

thanks
abhijit

