From: Avi Kivity <avi@redhat.com>
To: habanero@linux.vnet.ibm.com
Cc: kvm@vger.kernel.org, Yan Vugenfirer <yvugenfi@redhat.com>
Subject: Re: Performance data when running Windows VMs
Date: Wed, 26 Aug 2009 18:44:45 +0300
Message-ID: <4A95586D.5010403@redhat.com>
In-Reply-To: <1251298670.9683.65.camel@twinturbo.austin.ibm.com>
On 08/26/2009 05:57 PM, Andrew Theurer wrote:
> I recently gathered some performance data when running Windows Server
> 2008 VMs, and I wanted to share it here. There are 12 Windows
> Server 2008 64-bit VMs (1 vcpu, 2 GB) running which handle the concurrent
> execution of 6 J2EE type benchmarks. Each benchmark needs an App VM and
> a Database VM. The benchmark clients inject a fixed rate of requests
> which yields X% CPU utilization on the host. A different hypervisor was
> compared; KVM used about 60% more CPU cycles to complete the same amount
> of work. Both had their hypervisor specific paravirt IO drivers in the
> VMs.
>
> Server is a 2 socket Core/i7, SMT off, with 72 GB memory
>
Did you use large pages?
> Host kernel used was kvm.git v2.6.31-rc3-3419-g6df4865
> Qemu was kvm-87. I tried a few newer versions of Qemu; none of them
> worked with the RedHat virtIO Windows drivers. I tried:
>
> f3600c589a9ee5ea4c0fec74ed4e06a15b461d52
> 0.11.0-rc1
> 0.10.6
> kvm-88
>
> All but 0.10.6 had "Problem code 10" driver error in the VM. 0.10.6 had
> "a disk read error occurred" very early in the booting of the VM.
>
Yan?
> I/O on the host was not what I would call very high: outbound network
> averaged 163 Mbit/s and inbound 8 Mbit/s, while disk read ops were
> 243/sec and write ops were 561/sec
>
What was the disk bandwidth used? Presumably, direct access to the
volume with cache=off?
linux-aio should help reduce cpu usage.
> Host CPU breakdown was the following:
>
> user   nice   system   irq    softirq   guest   idle    iowait
> 5.67   0.00   11.64    0.09   1.05      31.90   46.06   3.59
>
>
> The amount of kernel time had me concerned. Here is oprofile:
>
user+system is about 55% of guest time, and it's all overhead.
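
The ratio can be checked directly from the quoted breakdown. This is only a minimal sketch: the dictionary and the helper name `overhead_vs_guest` are illustrative, and the values are copied from the table above.

```python
# Host CPU breakdown in percent, copied from the quoted table.
host_cpu = {
    "user": 5.67, "nice": 0.00, "system": 11.64, "irq": 0.09,
    "softirq": 1.05, "guest": 31.90, "idle": 46.06, "iowait": 3.59,
}


def overhead_vs_guest(cpu):
    """Return host user+system time as a fraction of guest time."""
    return (cpu["user"] + cpu["system"]) / cpu["guest"]


ratio = overhead_vs_guest(host_cpu)
print(f"user+system / guest = {ratio:.1%}")
```

This gives roughly 54%, in line with the "about 55%" figure.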
>> samples % app name symbol name
>> 1163422 52.3744 kvm-intel.ko vmx_vcpu_run
>> 103996 4.6816 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 native_set_debugreg
>> 81036 3.6480 kvm.ko kvm_arch_vcpu_ioctl_run
>> 37913 1.7068 qemu-system-x86_64 cpu_physical_memory_rw
>> 34720 1.5630 qemu-system-x86_64 phys_page_find_alloc
>>
We should really optimize these two.
>> 23234 1.0459 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 native_write_msr_safe
>> 20964 0.9437 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 native_get_debugreg
>> 17628 0.7936 libc-2.5.so memcpy
>> 16587 0.7467 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 __down_read
>> 15681 0.7059 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 __up_read
>> 15466 0.6962 kvm.ko find_highest_vector
>> 14611 0.6578 qemu-system-x86_64 qemu_get_ram_ptr
>> 11254 0.5066 kvm-intel.ko vmcs_writel
>> 11133 0.5012 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 copy_user_generic_string
>> 10917 0.4915 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 native_read_msr_safe
>> 10760 0.4844 qemu-system-x86_64 virtqueue_get_head
>> 9025 0.4063 kvm-intel.ko vmx_handle_exit
>> 8953 0.4030 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 schedule
>> 8753 0.3940 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 fget_light
>> 8465 0.3811 qemu-system-x86_64 virtqueue_avail_bytes
>> 8185 0.3685 kvm-intel.ko handle_cr
>> 8069 0.3632 kvm.ko kvm_set_irq
>> 7697 0.3465 kvm.ko kvm_lapic_sync_from_vapic
>> 7586 0.3415 qemu-system-x86_64 main_loop_wait
>> 7480 0.3367 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 do_select
>> 7121 0.3206 qemu-system-x86_64 lduw_phys
>> 7003 0.3153 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 audit_syscall_exit
>> 6062 0.2729 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 kfree
>> 5477 0.2466 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 fput
>> 5454 0.2455 kvm.ko kvm_lapic_get_cr8
>> 5096 0.2294 kvm.ko kvm_load_guest_fpu
>> 5057 0.2277 kvm.ko apic_update_ppr
>> 4929 0.2219 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 up_read
>> 4900 0.2206 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 audit_syscall_entry
>> 4866 0.2191 kvm.ko kvm_apic_has_interrupt
>> 4670 0.2102 kvm-intel.ko skip_emulated_instruction
>> 4644 0.2091 kvm.ko kvm_cpu_has_interrupt
>> 4548 0.2047 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 __switch_to
>> 4328 0.1948 kvm.ko kvm_apic_accept_pic_intr
>> 4303 0.1937 libpthread-2.5.so pthread_mutex_lock
>> 4235 0.1906 vmlinux-2.6.31-rc5-v2.6.31-rc3-3419-g6df4865-autokern1 system_call
>> 4175 0.1879 kvm.ko kvm_put_guest_fpu
>> 4170 0.1877 qemu-system-x86_64 ldl_phys
>> 4098 0.1845 kvm-intel.ko vmx_set_interrupt_shadow
>> 4003 0.1802 qemu-system-x86_64 kvm_run
>>
> I was wondering why the get/set debugreg was so high. I don't recall
> seeing this much with Linux VMs.
>
Could it be that Windows uses the debug registers? Maybe we're
incorrectly deciding to switch them.
Apart from that, nothing really stands out. We'll just have to optimize
things one by one.
> Here is an average of kvm_stat:
>
>
>
>> efer_relo 0
>> exits 1262814
>>
100K exits/sec/vm. This is high.
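
The per-VM figure is simple division. A minimal sketch follows, assuming the kvm_stat average is a host-wide per-second count spread evenly over the 12 VMs; the constant and function names are illustrative.

```python
# "exits" counter from the quoted kvm_stat average (assumed host-wide, per second).
TOTAL_EXITS_PER_SEC = 1262814
NUM_VMS = 12  # number of single-vcpu guests in the test


def exits_per_vm(total_exits, num_vms):
    """Average exit rate per VM, assuming the load is spread evenly."""
    return total_exits / num_vms


per_vm = exits_per_vm(TOTAL_EXITS_PER_SEC, NUM_VMS)
print(f"{per_vm / 1000:.1f}K exits/sec per VM")
```

That works out to a little over 105K exits/sec per VM.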
>> fpu_reloa 103842
>>
So is this -- maybe we're misdetecting fpu usage on EPT.
>> halt_exit 9918
>> halt_wake 9763
>> host_stat 103846
>>
This is presumably due to virtio in qemu.
>> hypercall 0
>> insn_emul 23277
>> insn_emul 23277
>> invlpg 0
>> io_exits 82717
>>
Yes, it is.
>> irq_exits 12797
>> irq_injec 18806
>> irq_windo 1194
>> largepage 12
>> mmio_exit 0
>> mmu_cache 0
>> mmu_flood 0
>> mmu_pde_z 0
>> mmu_pte_u 0
>> mmu_pte_w 0
>> mmu_recyc 0
>> mmu_shado 0
>> mmu_unsyn 0
>> nmi_injec 0
>> nmi_windo 0
>> pf_fixed 12
>> pf_guest 0
>> remote_tl 0
>> request_i 0
>> signal_ex 0
>> tlb_flush 0
>>
> For 12 VMs, does the number of exits/sec seem reasonable?
>
> Comments?
>
Not all of the exits are accounted for, so we're missing a big part of
the picture. 2.6.32 will have better statistics through ftrace.
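
To get a rough sense of how much is missing, the non-zero per-reason counters can be summed and compared with the total. This is only a sketch: which counters count as distinct exit reasons is an assumption made here, and insn_emul is left out because emulation happens as a consequence of an exit that is already counted.

```python
# Total exits from the quoted kvm_stat average.
TOTAL_EXITS = 1262814

# Counters treated here as distinct exit reasons (an assumption, not a
# statement about how kvm_stat itself classifies them). Values are copied
# from the quoted output; counters that read 0 are omitted.
accounted_counters = {
    "halt_exit": 9918,
    "io_exits": 82717,
    "irq_exits": 12797,
    "irq_windo": 1194,
    "pf_fixed": 12,
}

accounted = sum(accounted_counters.values())
unaccounted = TOTAL_EXITS - accounted
fraction_accounted = accounted / TOTAL_EXITS

print(f"accounted:   {accounted}")
print(f"unaccounted: {unaccounted}")
print(f"fraction accounted: {fraction_accounted:.1%}")
```

Under these assumptions, fewer than one exit in ten shows up in a per-reason counter.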
--
error compiling committee.c: too many arguments to function