From: Martin Wawro <martin.wawro@gmail.com>
To: kvm@vger.kernel.org
Subject: [User Question] Repeated severe performance problems on guest
Date: Fri, 12 Apr 2013 17:04:27 +0200
Message-ID: <5168227B.1050402@googlemail.com>
Hi All,
We regularly experience severe performance problems that require us to
destroy and restart the guest OS. The load average rises well above 50 and
the guest OS becomes largely unresponsive, even though no serious workload
is running on the system. Even a clean reboot command takes so long (around
40 minutes) that we abort it and simply cut off the guest instead.
These problems recur regularly, but with no apparent pattern. They may occur
under larger workloads during the daytime, but also at night when there is
virtually no load on the system. There are no associated kernel messages and
the host seems to work OK, though we think that also rebooting the host after
such an incident prolongs the time until the next occurrence of the problem.
The system serves around 50 GB of incoming and 90 GB of outgoing traffic per
day (it is a kind of file server), with 90% of the traffic occurring during
the daytime.
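
For scale, the daily volumes above correspond to a fairly modest average throughput. A rough back-of-the-envelope sketch, assuming decimal gigabytes (1 GB = 10^9 bytes) and a 12-hour "daytime" window, neither of which is stated above:

```python
# Rough throughput estimate from the daily traffic figures quoted above.
# Assumptions (not stated in the report): 1 GB = 10**9 bytes,
# and "daytime" means a 12-hour window.
GB = 10**9
SECONDS_PER_DAY = 24 * 60 * 60
DAYTIME_HOURS = 12  # hypothetical window length

incoming_gb = 50
outgoing_gb = 90
daytime_share = 0.90


def mbit_per_s(num_bytes, seconds):
    """Average rate in megabits per second."""
    return num_bytes * 8 / seconds / 1e6


total_bytes = (incoming_gb + outgoing_gb) * GB
avg_rate = mbit_per_s(total_bytes, SECONDS_PER_DAY)
daytime_rate = mbit_per_s(total_bytes * daytime_share, DAYTIME_HOURS * 3600)

print(f"24h average:     {avg_rate:.1f} Mbit/s")
print(f"daytime average: {daytime_rate:.1f} Mbit/s")
```

Under these assumptions the combined traffic averages roughly 13 Mbit/s over the whole day and roughly 23 Mbit/s during the daytime window.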
Logging kvm_stat on the host, we obtained the following output during
normal operation:
efer_reload                  0      0
exits               5101302255   1974
fpu_reload           306359390     62
halt_exits           737202541    764
halt_wakeup          327442392     65
host_state_reload   2065997773    912
hypercalls                   0      0
insn_emulation      1702740746    914
insn_emulation_fail          0      0
invlpg                       0      0
io_exits            1352400686    148
irq_exits            736230648     38
irq_injections       881709782    767
irq_window            17402610      0
largepages              326880      0
mmio_exits             2951391      0
mmu_cache_miss         2986088      0
mmu_flooded                  0      0
mmu_pde_zapped               0      0
mmu_pte_updated              0      0
mmu_pte_write           108123      0
mmu_recycled                 0      0
mmu_shadow_zapped      3178728      0
mmu_unsync                   0      0
nmi_injections               0      0
nmi_window                   0      0
pf_fixed              84440791      0
pf_guest                     0      0
remote_tlb_flush      37610010      8
request_irq                  0      0
signal_exits                 0      0
tlb_flush                    0      0
About 90 minutes later, with the guest in a rather unresponsive state,
the output looks like this:
efer_reload                  0      0
exits               5125445200  21349
fpu_reload           307627942    119
halt_exits           741717495    792
halt_wakeup          328747102    108
host_state_reload   2075042930   1330
hypercalls                   0      0
insn_emulation      1711070317   1135
insn_emulation_fail          0      0
invlpg                       0      0
io_exits            1356868798    424
irq_exits            738940729    155
irq_injections       886685967   1012
irq_window            17463827      3
largepages              321488     18
mmio_exits             3062654     90
mmu_cache_miss         3552726   5581
mmu_flooded                  0      0
mmu_pde_zapped               0      0
mmu_pte_updated              0      0
mmu_pte_write           108123      0
mmu_recycled                 0      0
mmu_shadow_zapped      3781317   5396
mmu_unsync                   0      0
nmi_injections               0      0
nmi_window                   0      0
pf_fixed              86464743  18627
pf_guest                     0      0
remote_tlb_flush      37881302    137
request_irq                  0      0
signal_exits                 0      0
tlb_flush                    0      0
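
To see which counters differ most between the two listings, the rate columns can be compared mechanically. The following is a minimal sketch, assuming the last column is kvm_stat's current per-second rate and leaving out the counters that are zero in both listings; the helper names `parse` and `rate_changes` are illustrative only.

```python
# Sketch: rank kvm_stat counters by how much their rate column changed
# between the two listings. Assumption: columns are name, cumulative total,
# current per-second rate. Counters that are 0/0 in both listings are omitted.

NORMAL = """
exits 5101302255 1974
fpu_reload 306359390 62
halt_exits 737202541 764
halt_wakeup 327442392 65
host_state_reload 2065997773 912
insn_emulation 1702740746 914
io_exits 1352400686 148
irq_exits 736230648 38
irq_injections 881709782 767
irq_window 17402610 0
largepages 326880 0
mmio_exits 2951391 0
mmu_cache_miss 2986088 0
mmu_shadow_zapped 3178728 0
pf_fixed 84440791 0
remote_tlb_flush 37610010 8
"""

UNRESPONSIVE = """
exits 5125445200 21349
fpu_reload 307627942 119
halt_exits 741717495 792
halt_wakeup 328747102 108
host_state_reload 2075042930 1330
insn_emulation 1711070317 1135
io_exits 1356868798 424
irq_exits 738940729 155
irq_injections 886685967 1012
irq_window 17463827 3
largepages 321488 18
mmio_exits 3062654 90
mmu_cache_miss 3552726 5581
mmu_shadow_zapped 3781317 5396
pf_fixed 86464743 18627
remote_tlb_flush 37881302 137
"""


def parse(snapshot):
    """Map counter name -> (cumulative total, rate)."""
    stats = {}
    for line in snapshot.strip().splitlines():
        name, total, rate = line.split()
        stats[name] = (int(total), int(rate))
    return stats


def rate_changes(before, after):
    """Return (name, rate_before, rate_after), largest rate increase first."""
    rows = [(name, before[name][1], after[name][1])
            for name in before if name in after]
    return sorted(rows, key=lambda row: row[2] - row[1], reverse=True)


changes = rate_changes(parse(NORMAL), parse(UNRESPONSIVE))
for name, old, new in changes[:4]:
    print(f"{name:<20}{old:>8}{new:>8}")
```

With the values above, the four largest increases are in exits (1974 to 21349), pf_fixed (0 to 18627), mmu_cache_miss (0 to 5581) and mmu_shadow_zapped (0 to 5396); all other rates change by a few hundred at most.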
Our attempts to extract useful information from the logs inside the guest OS
were not successful. Apart from the huge load average, we could not find
anything unusual compared to normal operation.
We are running the following setup:
Host OS:
  RHEL 6.3 (amd64) Kernel 2.6.32-279.22
  qemu-kvm 0.12.1.2-2.295
Guest OS:
  Ubuntu Server 10.04 (amd64) Kernel 2.6.32-45
  Assigned CPU cores: 7 (we also tested single CPU pinning, without success)
  Assigned memory: 32 GB
  All hard drives / network paravirtualized using virtio
  The filesystem in use is mainly xfs.
Hardware:
  IBM BladeServer HS22 with 44 GB memory and 2 Xeon QC CPUs (E5506)
  Only a single guest is running on that machine.
We experimented with a lot of parameters (including clocksource, APICs etc.),
but none of them seems to have any effect on the problem other than
prolonging or shortening the interval to the next catastrophic failure
(and even that we cannot tell for sure, given the randomness involved).
Any hints on how to approach that problem are welcome, since we are out of
ideas over here.
Best regards,
Martin