* I/O descriptor ring size bottleneck?
@ 2005-03-20 21:47 Diwaker Gupta
  2005-03-21 23:42 ` Nivedita Singhvi
  0 siblings, 1 reply; 5+ messages in thread
From: Diwaker Gupta @ 2005-03-20 21:47 UTC (permalink / raw)
  To: xen-devel

Hi everyone,

I'm doing some networking experiments over high-BDP (bandwidth-delay
product) topologies. Right
now the configuration is quite simple -- two Xen boxes connected via a
dummynet router. The dummynet router is set to limit bandwidth to
500Mbps and simulate an RTT of 80ms.

I'm using the following sysctl values:
net.ipv4.tcp_rmem = 4096        87380   4194304
net.ipv4.tcp_wmem = 4096        65536   4194304
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.ipv4.tcp_bic = 0

(tcp westwood and vegas are also turned off for now)
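
For reference, a rough sketch of the bandwidth-delay arithmetic for this setup. The 500Mbps limit, 80ms RTT, 50 flows, and 4194304-byte maximum come from this message; the helper name `bdp_bytes` and the even split across flows are illustrative assumptions.

```python
# Bandwidth-delay product for the topology described above.
# bdp_bytes is a hypothetical helper; integer math avoids float rounding.

def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the path full."""
    return bandwidth_bps * rtt_ms // (8 * 1000)

LINK_BPS = 500_000_000    # dummynet bandwidth limit
RTT_MS = 80               # simulated round-trip time
FLOWS = 50                # concurrent netperf flows
TCP_MEM_MAX = 4_194_304   # third field of tcp_rmem / tcp_wmem

link_bdp = bdp_bytes(LINK_BPS, RTT_MS)
per_flow_bdp = link_bdp // FLOWS   # ASSUMPTION: even split across flows

print(f"link BDP:     {link_bdp} bytes")
print(f"per-flow BDP: {per_flow_bdp} bytes")
print(f"max buffer covers one flow's share: {TCP_MEM_MAX >= per_flow_bdp}")
print(f"max buffer covers the whole link:   {TCP_MEM_MAX >= link_bdp}")
```

Under these assumptions the link BDP is 5,000,000 bytes and each flow's even share is 100,000 bytes, so the 4MB maximum is ample for 50 flows but would cap a single flow below the line rate.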

Now if I run 50 netperf flows lasting 80 seconds (1000RTTs) from
inside  a VM on one box talking to the netserver on the VM on the
other box, I get a per-flow throughput of about 2.5Mbps (which
sucks, but let's ignore the absolute value for the moment).

If I run the same test, but this time from inside dom0, I get a per
flow throughput of around 6Mbps.
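
To put the two figures side by side, here is a small sketch of the aggregate rates they imply. It assumes all 50 flows see roughly the quoted per-flow rate; the function name is made up for illustration.

```python
# Aggregate throughput implied by the per-flow numbers quoted above.
# ASSUMPTION: every one of the 50 flows achieves about the quoted rate.

def aggregate_mbps(per_flow_mbps: float, flows: int) -> float:
    """Total rate if every flow achieves per_flow_mbps."""
    return per_flow_mbps * flows

LINK_MBPS = 500.0   # dummynet bandwidth limit
FLOWS = 50

results = {"domU": 2.5, "dom0": 6.0}   # per-flow Mbps reported in the thread
for name, per_flow in results.items():
    total = aggregate_mbps(per_flow, FLOWS)
    share = total / LINK_MBPS
    print(f"{name}: {total:.0f} Mbps aggregate, {share:.0%} of the limit")
```

Neither case gets close to the 10Mbps per-flow fair share (500Mbps / 50 flows), and the domU run reaches a bit over 40% of the dom0 figure.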

I'm trying to understand the difference in performance. It seems to me
that the I/O descriptor ring sizes are hard coded to 256 -- could that
be a bottleneck here? If not, have people experienced similar problems?

TIA
-- 
Diwaker Gupta
http://resolute.ucsd.edu/diwaker


-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click


* RE: I/O descriptor ring size bottleneck?
@ 2005-03-20 22:15 Ian Pratt
  2005-03-21  0:06 ` Diwaker Gupta
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Pratt @ 2005-03-20 22:15 UTC (permalink / raw)
  To: Diwaker Gupta, xen-devel; +Cc: ian.pratt

 

> I'm doing some networking experiments over high BDP topologies. Right
> now the configuration is quite simple -- two Xen boxes connected via a
> dummynet router. The dummynet router is set to limit bandwidth to
> 500Mbps and simulate an RTT of 80ms.

> Now if I run 50 netperf flows lasting 80 seconds (1000RTTs) from
> inside  a VM on one box talking to the netserver on the VM on the
> other box, I get a per-flow throughput of about 2.5Mbps (which
> sucks, but let's ignore the absolute value for the moment).
> 
> If I run the same test, but this time from inside dom0, I get a per
> flow throughput of around 6Mbps.
> 
> I'm trying to understand the difference in performance. It seems to me
> that the I/O descriptor ring sizes are hard coded to 256 -- could that
> be a bottleneck here? If not, have people experienced similar problems?

Interesting. I'm not aware of any high BDP testing, and I'm slightly
surprised that it's causing a problem (low-latency situations are more of
a challenge for virtual networking).

The ring size really shouldn't be an issue for this, as it just has the
effect of reducing the number of context switches between dom0 and the
domU.
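
A back-of-the-envelope sketch of how much traffic one full ring could hold. It assumes each of the 256 descriptors carries a single 1500-byte packet, which is an assumption made for illustration and not something stated in this thread; the function name is likewise hypothetical.

```python
# How long a full descriptor ring lasts at a given data rate.
RING_ENTRIES = 256      # ring size quoted in the thread
PACKET_BYTES = 1500     # ASSUMPTION: one MTU-sized packet per descriptor

def ring_drain_time_us(rate_bps: int) -> int:
    """Microseconds of traffic a full ring represents at rate_bps."""
    ring_bits = RING_ENTRIES * PACKET_BYTES * 8
    return ring_bits * 1_000_000 // rate_bps

rates = [
    ("500Mbps limit", 500_000_000),
    ("125Mbps (50 flows x 2.5Mbps)", 125_000_000),
]
for label, rate_bps in rates:
    print(f"{label}: ring holds {ring_drain_time_us(rate_bps)} us of traffic")
```

Under that assumption a full ring is 384,000 bytes, or about 6ms of traffic at the 500Mbps limit, so the ring would need servicing at least that often to sustain the full rate.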

BTW, I'd actually be very suspicious of dummynet's ability to operate at
500Mb/s. It's possible that the reduced bandwidth is due to some bad
interaction between burstiness caused by Xen's context switching and
dummynet.

Are your dom0 and domU running on the same processor? Could you try
using hyperthreading or SMP?

Have you checked that domU <-> domU performance is good on the LAN with
a single TCP connection?

Ian






* Re: I/O descriptor ring size bottleneck?
  2005-03-20 22:15 Ian Pratt
@ 2005-03-21  0:06 ` Diwaker Gupta
  0 siblings, 0 replies; 5+ messages in thread
From: Diwaker Gupta @ 2005-03-21  0:06 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, ian.pratt

> BTW, I'd actually be very suspicious of dummynet's ability to operate at
> 500Mb/s. It's possible that the reduced bandwidth is due to some bad
> interaction between burstiness caused by Xen's context switching and
> dummynet.

Could you elaborate a bit more on this? Even if dummynet can't operate
at 500Mbps, we should at least see the same degradation in performance,
right?

> Are your dom0 and domU running on the same processor? Could you try
> using hyperthreading or SMP?

Yep, same processor. For various other reasons, I wanted to avoid SMP,
so I was running with the nosmp option. I'll try running with SMP and
post an update.

> Have you checked that domU <-> domU performance is good on the LAN with
> a single TCP connection?

I had a long time back, but I think that was with SMP. I'll check again. 
-- 
Diwaker Gupta
http://resolute.ucsd.edu/diwaker



* RE: I/O descriptor ring size bottleneck?
@ 2005-03-21  0:36 Ian Pratt
  0 siblings, 0 replies; 5+ messages in thread
From: Ian Pratt @ 2005-03-21  0:36 UTC (permalink / raw)
  To: Diwaker Gupta; +Cc: xen-devel, ian.pratt

 

> -----Original Message-----
> From: Diwaker Gupta [mailto:diwakergupta@gmail.com] 
> Sent: 21 March 2005 00:07
> To: Ian Pratt
> Cc: xen-devel@lists.sourceforge.net; ian.pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] I/O descriptor ring size bottleneck?
> 
> > BTW, I'd actually be very suspicious of dummynet's ability to operate at
> > 500Mb/s. It's possible that the reduced bandwidth is due to some bad
> > interaction between burstiness caused by Xen's context switching and
> > dummynet.
> 
> Could you elaborate a bit more on this? Even if dummynet can't operate
> at 500Mbps, we should at least see the same degradation in performance,
> right?

Because of the context switching between dom0 and domU, the packets tend
to come out in a more bursty fashion. It's conceivable that this might
cause hiccups.

It might be worth looking at the number of interrupts that are occurring,
and also the Xen s/w perf counters to see the domain switch rate.
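
One way to count interrupts over a test run is to snapshot `/proc/interrupts` before and after and subtract. The sketch below assumes the usual Linux layout (a header row of CPU columns, then one `label: count ...` line per interrupt source); the function names are hypothetical, and it does not touch the Xen perf counters.

```python
# Diff two snapshots of /proc/interrupts-style text.

def parse_interrupts(text: str) -> dict[str, int]:
    """Sum the per-CPU counts on each interrupt line."""
    lines = text.strip().splitlines()
    if not lines:
        return {}
    n_cpus = len(lines[0].split())      # header row: CPU0 CPU1 ...
    totals: dict[str, int] = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                    # not a "label: counts" line
        total = 0
        for field in rest.split()[:n_cpus]:
            if not field.isdigit():
                break                   # reached the controller/device name
            total += int(field)
        totals[label.strip()] = total
    return totals

def interrupt_deltas(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Interrupts that fired between two snapshots, omitting idle sources."""
    deltas = {}
    for irq, count in after.items():
        diff = count - before.get(irq, 0)
        if diff:
            deltas[irq] = diff
    return deltas

# Usage sketch: read /proc/interrupts into a string before and after a
# netperf run, then call
#   interrupt_deltas(parse_interrupts(before_text), parse_interrupts(after_text))
```

Dividing each delta by the run length (80 seconds here) gives a per-second rate that can be compared between the dom0 and domU runs.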

BTW, what Ethernet card are you using?

> > Are your dom0 and domU running on the same processor? Could you try
> > using hyperthreading or SMP?
> 
> Yep, same processor. For various other reasons, I wanted to avoid SMP,
> so I was running with the nosmp option. I'll try running with SMP and
> post an update.

That'll be interesting. 

Ian



* Re: I/O descriptor ring size bottleneck?
  2005-03-20 21:47 I/O descriptor ring size bottleneck? Diwaker Gupta
@ 2005-03-21 23:42 ` Nivedita Singhvi
  0 siblings, 0 replies; 5+ messages in thread
From: Nivedita Singhvi @ 2005-03-21 23:42 UTC (permalink / raw)
  To: Diwaker Gupta; +Cc: xen-devel

Diwaker Gupta wrote:

> Hi everyone,
> 
> I'm doing some networking experiments over high BDP topologies. Right
> now the configuration is quite simple -- two Xen boxes connected via a
> dummynet router. The dummynet router is set to limit bandwidth to
> 500Mbps and simulate an RTT of 80ms.
> 
> I'm using the following sysctl values:
> net.ipv4.tcp_rmem = 4096        87380   4194304
> net.ipv4.tcp_wmem = 4096        65536   4194304

If you're trying to tune TCP traffic, then you might
want to increase the default TCP socket buffer size (the
87380 above) as well, as simply increasing the core
maximums won't help there.
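
A rough sketch of why the default matters at this RTT: a connection can send at most one window per round trip. The calculation treats the whole buffer as usable window, which overstates what the kernel actually advertises, and it assumes the window stays at the given value; the function name is hypothetical.

```python
# Upper bound on per-connection throughput for a fixed window and RTT.

def window_limited_bps(window_bytes: int, rtt_ms: int) -> int:
    """At most one full window can be delivered per round trip."""
    return window_bytes * 8 * 1000 // rtt_ms

RTT_MS = 80   # simulated round-trip time from the original message

buffers = [
    ("default (87380)", 87_380),
    ("maximum (4194304)", 4_194_304),
]
for label, window in buffers:
    mbps = window_limited_bps(window, RTT_MS) / 1e6
    print(f"{label}: at most {mbps:.1f} Mbps per connection")
```

On those assumptions a window stuck at the default would allow at most about 8.7Mbps per connection at 80ms, while the 4MB maximum would allow roughly 419Mbps.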

> Now if I run 50 netperf flows lasting 80 seconds (1000RTTs) from
> inside  a VM on one box talking to the netserver on the VM on the
> other box, I get a per-flow throughput of about 2.5Mbps (which
> sucks, but let's ignore the absolute value for the moment).
> 
> If I run the same test, but this time from inside dom0, I get a per
> flow throughput of around 6Mbps.

Could you get any further information on your test/data?
Which netperf test were you running, btw?

> I'm trying to understand the difference in performance. It seems to me
> that the I/O descriptor ring sizes are hard coded to 256 -- could that
> be a bottleneck here? If not, have people experienced similar problems?

Someone on this list had posted that they would be getting
oprofile working soon - you might want to retry your testing
with that patch.

thanks,
Nivedita



