From: Farkas Levente <lfarkas@lfarkas.org>
To: Avi Kivity <avi@qumranet.com>
Cc: Nikola Ciprich <extmaillist@linuxbox.cz>,
    KVM list <kvm@vger.kernel.org>,
    nikola.ciprich@linuxbox.cz
Subject: Re: kvm-72: problems with 8139 under heavy load, lost interrupts?
Date: Fri, 22 Aug 2008 22:48:18 +0200
Message-ID: <48AF2612.5020606@lfarkas.org>
In-Reply-To: <48AF0267.8090805@qumranet.com>

Avi Kivity wrote:
> Nikola Ciprich wrote:
>> Hello everybody,
>> we're running a cluster of two hosts with tens (~45 running) of KVM guests,
>> and I have now noticed that some nodes are losing link under heavy load.
>>
>> following appears in dmesg:
>> [ 422.077128] NETDEV WATCHDOG: eth0: transmit timed out
>> [ 422.077215] eth0: Transmit timeout, status d 2b 5 80ff
>>
>> [root@sql1 ~]# cat /proc/interrupts
>>            CPU0       CPU1       CPU2       CPU3
>>   0:        144          0          0          0   IO-APIC-edge      timer
>>   1:        539          2          1          2   IO-APIC-edge      i8042
>>   9:          0          0          0          0   IO-APIC-fasteoi   acpi
>>  10:     756783     362345     372753     751385   IO-APIC-fasteoi   eth0
>>  11:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb1
>>  12:        150          4          3          4   IO-APIC-edge      i8042
>>  14:     518448     528815     172232     348704   IO-APIC-edge      ide0
>>  15:          0          0          0          0   IO-APIC-edge      ide1
>> NMI:          0          0          0          0   Non-maskable interrupts
>> LOC:     829179     775992     505151     458761   Local timer interrupts
>> RES:     115772      98143      88928      82099   Rescheduling interrupts
>> CAL:         73        166        138        160   function call interrupts
>> TLB:     214586     255980      66806     278284   TLB shootdowns
>> TRM:          0          0          0          0   Thermal event interrupts
>> SPU:          0          0          0          0   Spurious interrupts
>> ERR:          0
>> MIS:       1261
>>
>> I guess the MIS value might be related to this. So far I have observed this problem
>> only on 32bit guests, but that might be a coincidence (those affected are heavily used).
>> It also seems that it *might* be related to SMP guests.
>>
>> Hosts are running 2.6.26.2-x86_64 + kvm-72, guests 2.6.24, and are using the 8139 virt adapter.
>> I'm not sure whether we had this problem with older KVM versions (which would make this a regression),
>> as the usage of the machines is growing constantly, so maybe we just didn't notice the problem before.
>>
>> I CAN try other virt adapters as well, but both machines are in production, so I have to be
>> a bit cautious when it comes to experimenting. I'll try to prepare a testing environment where
>> I can reproduce the problem.
>>
>> But in the meantime, is there some way I could debug the problem further, but in a safe manner?
>> I don't see anything related in either host's dmesg or logfiles.
>>
>>
>
> What would be most useful is to verify that this reproduces reliably,
> and a recipe for us to try out.
>
> Also, how heavy is the load? Maybe it's so heavy that guests don't get
> scheduled and really time out. Does the network recover if you ifdown/ifup?

The same happened to us. An easy way to reproduce it was to create a new
ISO image with revisor when it uses kickstart files from the given KVM
guest's NFS server.
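
On debugging this in a safe manner: a minimal sketch, assuming Python is
available in the guest, of a read-only way to watch the counters from
/proc/interrupts quoted above (IRQ 10 is eth0 there, and MIS is the counter
Nikola suspects). The function names parse_interrupts, delta and watch are
hypothetical, and nothing here touches the guest's network configuration.

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: [per-CPU counts]}."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        # Split on the first colon only; device names may contain colons.
        label, _, rest = line.partition(":")
        nums = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller/device description
            nums.append(int(field))
        # Rows such as ERR and MIS carry a single global count.
        counts[label.strip()] = nums
    return counts


def delta(before, after):
    """Return the increase in total count per label between two samples."""
    return {
        label: sum(after[label]) - sum(before.get(label, []))
        for label in after
    }


def watch(interval=5.0, path="/proc/interrupts"):
    """Print the counters that changed, once per interval (runs until stopped)."""
    with open(path) as f:
        previous = parse_interrupts(f.read())
    while True:
        time.sleep(interval)
        with open(path) as f:
            current = parse_interrupts(f.read())
        changed = {k: v for k, v in delta(previous, current).items() if v}
        print(time.strftime("%H:%M:%S"), changed)
        previous = current
```

Calling watch() in an affected guest while the load is running would show
whether MIS keeps growing, and whether the count for the eth0 line stops
increasing around the time of the transmit timeout.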
--
Levente "Si vis pacem para bellum!"
Thread overview: 3+ messages
2008-08-22 7:54 kvm-72: problems with 8139 under heavy load, lost interrupts? Nikola Ciprich
2008-08-22 18:16 ` Avi Kivity
2008-08-22 20:48 ` Farkas Levente [this message]