public inbox for kvm@vger.kernel.org
* kvm-72: problems with 8139 under heavy load, lost interrupts?
@ 2008-08-22  7:54 Nikola Ciprich
  2008-08-22 18:16 ` Avi Kivity
  0 siblings, 1 reply; 3+ messages in thread
From: Nikola Ciprich @ 2008-08-22  7:54 UTC
  To: KVM list; +Cc: nikola.ciprich

Hello everybody,
we're running a cluster of two hosts with tens (~45 running) of KVM guests,
and I have now noticed that some nodes are losing link under heavy load.

The following appears in dmesg:
[  422.077128] NETDEV WATCHDOG: eth0: transmit timed out
[  422.077215] eth0: Transmit timeout, status  d   2b    5 80ff

[root@sql1 ~]# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  0:        144          0          0          0   IO-APIC-edge      timer
  1:        539          2          1          2   IO-APIC-edge      i8042
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 10:     756783     362345     372753     751385   IO-APIC-fasteoi   eth0
 11:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb1
 12:        150          4          3          4   IO-APIC-edge      i8042
 14:     518448     528815     172232     348704   IO-APIC-edge      ide0
 15:          0          0          0          0   IO-APIC-edge      ide1
NMI:          0          0          0          0   Non-maskable interrupts
LOC:     829179     775992     505151     458761   Local timer interrupts
RES:     115772      98143      88928      82099   Rescheduling interrupts
CAL:         73        166        138        160   function call interrupts
TLB:     214586     255980      66806     278284   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
SPU:          0          0          0          0   Spurious interrupts
ERR:          0
MIS:       1261

I guess the MIS value might be related to this. So far I have observed this problem
only on 32-bit guests, but that might be a coincidence (the affected ones are heavily used).
It also seems that it *might* be related to SMP guests.
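
One low-risk way to check whether the MIS counter really grows along with the
transmit timeouts is to sample /proc/interrupts inside an affected guest and
compare successive readings. The following is only a minimal Python sketch under
stated assumptions, not something taken from the report: the function names
(parse_interrupts, delta, watch) are hypothetical, and the IRQ label "10" for
eth0 is simply read off the listing above and may differ on other guests.

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: count summed over all CPUs}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header line: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        count = 0
        # Per-CPU counters come first; ERR and MIS carry a single value.
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break  # reached the controller/device description
            count += int(field)
        totals[label.strip()] = count
    return totals


def delta(before, after, labels):
    """Return how much each selected counter grew between two samples."""
    return {l: after.get(l, 0) - before.get(l, 0) for l in labels}


def watch(labels, interval=10, samples=6, path="/proc/interrupts"):
    """Print per-interval growth of the selected counters (read-only)."""
    with open(path) as f:
        prev = parse_interrupts(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            cur = parse_interrupts(f.read())
        print(time.strftime("%H:%M:%S"), delta(prev, cur, labels))
        prev = cur
```

Calling watch(["10", "MIS"]) in a guest only reads /proc/interrupts, so it should
be safe on a production machine. If MIS jumps in the same intervals in which the
eth0 count stalls or the watchdog message appears, that would support the guess
above; if it does not, the two are probably unrelated.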

Hosts are running 2.6.26.2-x86_64 + kvm-72, guests 2.6.24, and are using the 8139 virtual adapter.
I'm not sure whether we had this problem with older KVM versions (and thus whether this is a regression),
as the usage of the machines is growing constantly, so maybe we just didn't notice the problem before.

I CAN try other virtual adapters as well, but both machines are in production, so I have to be
a bit cautious when it comes to experimenting. I'll try to prepare a testing environment where
I can reproduce the problem.

But in the meantime, is there some way I could debug the problem further, but in a safe manner?
I don't see anything related in either host's dmesg or logfiles.

Thanks a lot in advance
BR
nik

-- 
-------------------------------------
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------


end of thread, other threads:[~2008-08-22 20:48 UTC | newest]

Thread overview: 3+ messages
2008-08-22  7:54 kvm-72: problems with 8139 under heavy load, lost interrupts? Nikola Ciprich
2008-08-22 18:16 ` Avi Kivity
2008-08-22 20:48   ` Farkas Levente
