* DomU network stalling when Dom0 generates a lot of TX
@ 2012-04-08 21:14 Killian De Volder
From: Killian De Volder @ 2012-04-08 21:14 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com

Hello,

I'm experiencing the following problem:
When Dom0 generates a lot of TX traffic, the DomUs often stall (roughly once every 2 minutes, but fairly irregularly) for anywhere from 300 ms up to 1800 ms.
I wrote a small Python script to try to figure out whether the machine was getting CPU time (if this is a correct way to test it).
The time reported by the script below is fairly consistent, which suggests to me that the machine itself is not stalling. (Unless the clock is also stalled, of course, but ntpq -p looks fine.)
"""
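import time

# Busy loop: time a fixed amount of CPU work over and over.
# If the DomU were starved of CPU, some iterations would take much longer.
while True:
    ts = time.time()
    for i in range(0, 10**6):
        i = i + 1 - 1  # dummy work
    te = time.time()
    print(te - ts)  # elapsed seconds for this iteration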
"""

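A variant that may be easier to read than a stream of loop timings: the sketch below samples a monotonic clock at short intervals and reports only the gaps that are much longer than expected. It assumes Python 3 (for time.monotonic); the names find_stalls and sample, and the 10 ms interval and 100 ms tolerance, are illustrative choices and not something taken from the setup above.

```python
import time


def find_stalls(timestamps, expected_interval, tolerance):
    """Return (index, gap) pairs where the gap between two consecutive
    samples exceeds expected_interval + tolerance (all in seconds)."""
    stalls = []
    for idx in range(1, len(timestamps)):
        gap = timestamps[idx] - timestamps[idx - 1]
        if gap > expected_interval + tolerance:
            stalls.append((idx, gap))
    return stalls


def sample(duration, interval):
    """Collect monotonic timestamps roughly every `interval` seconds
    for about `duration` seconds."""
    stamps = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        stamps.append(time.monotonic())
        time.sleep(interval)
    return stamps


if __name__ == "__main__":
    # Assumed values: sample every 10 ms, flag gaps over 110 ms.
    stamps = sample(duration=2.0, interval=0.01)
    for idx, gap in find_stalls(stamps, 0.01, 0.1):
        print("sample %d: gap of %.0f ms" % (idx, gap * 1000))
```

If the DomU really lost the CPU for 300 ms or more, a gap of that size should show up here. No reported gaps would point back at the network path instead.
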
I think it's also safe to exclude the network driver from the equation, as the same problem occurs on a bridge without a physical network interface (bridge: dmz).

"""
# brctl show
bridge name     bridge id               STP enabled     interfaces
dmz             8000.feffffffffff       no              vif1.2
                                                        vif2.0
lan             8000.00c049593d25       no              eth_lan
                                                        vif1.0
                                                        vif11.0
wan             8000.00c049593e3f       no              eth_wan
                                                        vif1.1
"""

Example of a bad ping:
64 bytes from doc (172.17.0.2): icmp_req=3288 ttl=64 time=1130 ms
64 bytes from doc (172.17.0.2): icmp_req=3289 ttl=64 time=920 ms
64 bytes from doc (172.17.0.2): icmp_req=3290 ttl=64 time=710 ms
64 bytes from doc (172.17.0.2): icmp_req=3291 ttl=64 time=500 ms
64 bytes from doc (172.17.0.2): icmp_req=3292 ttl=64 time=300 ms
64 bytes from doc (172.17.0.2): icmp_req=3293 ttl=64 time=91.0 ms
64 bytes from doc (172.17.0.2): icmp_req=3294 ttl=64 time=0.147 ms
64 bytes from doc (172.17.0.2): icmp_req=3295 ttl=64 time=0.180 ms
(However, during the same time Dom0 remains quite responsive.)
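
To pick such episodes out of a longer ping log, something like the following sketch could be used. It parses the icmp_req and time fields from lines in the format shown above and lists the replies whose round-trip time exceeds a threshold. The 50 ms threshold and the names parse_ping, stalled and rtt_steps are assumptions made for illustration. The step calculation shows that the RTTs in this sample drop by roughly 200 ms from one reply to the next. That would be consistent with replies being held back and then released together, but the ping interval is not stated here, so that reading is only a guess.

```python
import re

# Matches lines such as:
# 64 bytes from doc (172.17.0.2): icmp_req=3288 ttl=64 time=1130 ms
RTT_RE = re.compile(r"icmp_req=(\d+) .*time=([\d.]+) ms")

SAMPLE = """\
64 bytes from doc (172.17.0.2): icmp_req=3288 ttl=64 time=1130 ms
64 bytes from doc (172.17.0.2): icmp_req=3289 ttl=64 time=920 ms
64 bytes from doc (172.17.0.2): icmp_req=3290 ttl=64 time=710 ms
64 bytes from doc (172.17.0.2): icmp_req=3291 ttl=64 time=500 ms
64 bytes from doc (172.17.0.2): icmp_req=3292 ttl=64 time=300 ms
64 bytes from doc (172.17.0.2): icmp_req=3293 ttl=64 time=91.0 ms
64 bytes from doc (172.17.0.2): icmp_req=3294 ttl=64 time=0.147 ms
64 bytes from doc (172.17.0.2): icmp_req=3295 ttl=64 time=0.180 ms
"""


def parse_ping(lines):
    """Return (sequence number, RTT in ms) for every reply line."""
    samples = []
    for line in lines:
        match = RTT_RE.search(line)
        if match:
            samples.append((int(match.group(1)), float(match.group(2))))
    return samples


def stalled(samples, threshold_ms=50.0):
    """Return the samples whose RTT exceeds threshold_ms (assumed cutoff)."""
    return [(seq, rtt) for seq, rtt in samples if rtt > threshold_ms]


def rtt_steps(samples):
    """Return the RTT decrease (ms) between consecutive samples."""
    return [a[1] - b[1] for a, b in zip(samples, samples[1:])]


if __name__ == "__main__":
    slow = stalled(parse_ping(SAMPLE.splitlines()))
    for seq, rtt in slow:
        print("icmp_req=%d delayed by %.0f ms" % (seq, rtt))
    print("RTT steps between delayed replies:", rtt_steps(slow))
```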

I also tried turning off offloading (TX, TO, GPO, ...).

The TXqueuelen is 1000.
Dom0 is under CPU load during this time, but generating CPU load manually does not seem to trigger the problem.

Version info:
Kernel: xen 3.2.1-gentoo-r2
Xen-Hyp: Xen version 4.1.2


Does anyone have any idea what could cause this?

Kind regards,
Killian De Volder
