From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Lieven
Subject: Re: win7 bad i/o performance, high insn_emulation and exists
Date: Tue, 21 Feb 2012 11:50:47 +0100
Message-ID: <4F437707.8070208@dlh.net>
References: <4F428E53.2010602@dlh.net> <20120220184008.GF29601@redhat.com> <20120220190449.GG29601@redhat.com> <4F42A62A.9080503@dlh.net> <20120220204515.GI29601@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org
To: Gleb Natapov
Return-path:
Received: from ssl.dlh.net ([91.198.192.8]:57457 "EHLO ssl.dlh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753487Ab2BUKux (ORCPT ); Tue, 21 Feb 2012 05:50:53 -0500
In-Reply-To: <20120220204515.GI29601@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 20.02.2012 21:45, Gleb Natapov wrote:
> On Mon, Feb 20, 2012 at 08:59:38PM +0100, Peter Lieven wrote:
>> On 20.02.2012 20:04, Gleb Natapov wrote:
>>> On Mon, Feb 20, 2012 at 08:40:08PM +0200, Gleb Natapov wrote:
>>>> On Mon, Feb 20, 2012 at 07:17:55PM +0100, Peter Lieven wrote:
>>>>> Hi,
>>>>>
>>>>> I came across an issue with a Windows 7 (32-bit) as well as with a
>>>>> Windows 2008 R2 (64-bit) guest.
>>>>>
>>>>> If I transfer a file from the VM via CIFS or FTP to a remote machine,
>>>>> I get very poor read performance (around 13MB/s). The VM peaks at 100%
>>>>> CPU and I see a lot of insn_emulations and all kinds of exits in kvm_stat
>>>>>
>>>>> efer_reload                  0       0
>>>>> exits                  2260976   79620
>>>>> fpu_reload                6197      11
>>>>> halt_exits              114734    5011
>>>>> halt_wakeup             111195    4876
>>>>> host_state_reload      1499659   60962
>>>>> hypercalls                   0       0
>>>>> insn_emulation         1577325   58488
>>>>> insn_emulation_fail          0       0
>>>>> invlpg                       0       0
>>>>> io_exits                943949   40249
>>>> Hmm, too many of those.
>>>>
>>>>> irq_exits               108679    5434
>>>>> irq_injections          236545   10788
>>>>> irq_window                7606     246
>>>>> largepages                 672       5
>>>>> mmio_exits              460020   16082
>>>>> mmu_cache_miss             119       0
>>>>> mmu_flooded                  0       0
>>>>> mmu_pde_zapped               0       0
>>>>> mmu_pte_updated              0       0
>>>>> mmu_pte_write            13474       9
>>>>> mmu_recycled                 0       0
>>>>> mmu_shadow_zapped          141       0
>>>>> mmu_unsync                   0       0
>>>>> nmi_injections               0       0
>>>>> nmi_window                   0       0
>>>>> pf_fixed                 22803      35
>>>>> pf_guest                     0       0
>>>>> remote_tlb_flush           239       2
>>>>> request_irq                  0       0
>>>>> signal_exits                 0       0
>>>>> tlb_flush                20933       0
>>>>>
>>>>> If I run the same VM with a Ubuntu 10.04.4 guest I get around 60MB/s
>>>>> throughput. The kvm_stats look a lot more sane.
>>>>>
>>>>> efer_reload                  0       0
>>>>> exits                  6132004   17931
>>>>> fpu_reload               19863       3
>>>>> halt_exits              264961    3083
>>>>> halt_wakeup             236468    2959
>>>>> host_state_reload      1104468    3104
>>>>> hypercalls                   0       0
>>>>> insn_emulation         1417443    7518
>>>>> insn_emulation_fail          0       0
>>>>> invlpg                       0       0
>>>>> io_exits                869380    2795
>>>>> irq_exits               253501    2362
>>>>> irq_injections          616967    6804
>>>>> irq_window              201186    2161
>>>>> largepages                1019       0
>>>>> mmio_exits              205268       0
>>>>> mmu_cache_miss             192       0
>>>>> mmu_flooded                  0       0
>>>>> mmu_pde_zapped               0       0
>>>>> mmu_pte_updated              0       0
>>>>> mmu_pte_write          7440546       0
>>>>> mmu_recycled                 0       0
>>>>> mmu_shadow_zapped          259       0
>>>>> mmu_unsync                   0       0
>>>>> nmi_injections               0       0
>>>>> nmi_window                   0       0
>>>>> pf_fixed                 38529      30
>>>>> pf_guest                     0       0
>>>>> remote_tlb_flush           761       1
>>>>> request_irq                  0       0
>>>>> signal_exits                 0       0
>>>>> tlb_flush                    0       0
>>>>>
>>>>> I use virtio-net (with vhost-net) and virtio-blk. I tried disabling
>>>>> hpet (which basically eliminated the mmio_exits, but does not
>>>>> increase performance) and also commit
>>>>> (39a7a362e16bb27e98738d63f24d1ab5811e26a8) - no improvement.
>>>>>
>>>>> My commandline:
>>>>> /usr/bin/qemu-kvm-1.0 -netdev
>>>>> type=tap,id=guest8,script=no,downscript=no,ifname=tap0,vhost=on
>>>>> -device virtio-net-pci,netdev=guest8,mac=52:54:00:ff:00:d3 -drive format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-eeef4e007-a8a9f3818674f2fc-lieven-windows7-vc-r80788,if=virtio,cache=none,aio=native
>>>>> -m 2048 -smp 2 -monitor tcp:0:4001,server,nowait -vnc :1 -name
>>>>> lieven-win7-vc -boot order=dc,menu=off -k de -pidfile
>>>>> /var/run/qemu/vm-187.pid -mem-path /hugepages -mem-prealloc -cpu
>>>>> host -rtc base=localtime -vga std -usb -usbdevice tablet -no-hpet
>>>>>
>>>>> What further information is needed to debug this further?
>>>>>
>>>> Which kernel version (looks like something recent)?
>>>> Which host CPU (looks like something old)?
>>> Output of cat /proc/cpuinfo
>>>
>>>> Which Windows' virtio drivers are you using?
>>>>
>>>> Take a trace like described here http://www.linux-kvm.org/page/Tracing
>>>> (with -no-hpet please).
>>>>
>>> And also "info pci" output from qemu monitor while we are at it.
>> Here is the output while I was tracing. You can download the trace
>> I took while I did an FTP transfer from the VM:
>>
>> -> http://82.141.21.156/report.txt.gz
>>
> Windows reads PM timer. A lot. 15152 times per second.
>
> Can you try to run this command in Windows guest:
>
> bcdedit /set {default} useplatformclock false
>
> I hope it will make Windows use TSC instead, but you can't be sure
> about anything with Windows :(

Whatever it does now, it eats more CPU, has an almost equal number of
exits, and throughput is about the same (15MB/s). If the pmtimer is at
0xb008, it still reads it like hell. I checked with bcdedit /v that
useplatformclock is set to "No".

I still wonder why both virtio devices are on IRQ 0?

New Trace: http://82.141.21.156/report2.txt.gz

efer_reload                  0       0
exits                  1510993   59343
fpu_reload                6729      10
halt_exits               93603    5913
halt_wakeup              95698    5849
host_state_reload       738523   24727
hypercalls                   0       0
insn_emulation          678416   20107
insn_emulation_fail          0       0
invlpg                       0       0
io_exits                703291   28436
irq_exits               102117    7527
irq_injections          217335   14344
irq_window                9926     650
largepages                 573       8
mmio_exits                  27       0
mmu_cache_miss             148       0
mmu_flooded                  0       0
mmu_pde_zapped               0       0
mmu_pte_updated              0       0
mmu_pte_write                0       0
mmu_recycled                 0       0
mmu_shadow_zapped          190       0
mmu_unsync                   0       0
nmi_injections               0       0
nmi_window                   0       0
pf_fixed                 21938      38
pf_guest                     0       0
remote_tlb_flush            20       0
request_irq                  0       0
signal_exits                 0       0
tlb_flush                11711       0

QEMU 1.0 monitor - type 'help' for more information
(qemu) info pci
info pci
  Bus 0, device 0, function 0:
    Host bridge: PCI device 8086:1237
      id ""
  Bus 0, device 1, function 0:
    ISA bridge: PCI device 8086:7000
      id ""
  Bus 0, device 1, function 1:
    IDE controller: PCI device 8086:7010
      BAR4: I/O at 0xc080 [0xc08f].
      id ""
  Bus 0, device 1, function 2:
    USB controller: PCI device 8086:7020
      IRQ 5.
      BAR4: I/O at 0xc040 [0xc05f].
      id ""
  Bus 0, device 1, function 3:
    Bridge: PCI device 8086:7113
      IRQ 9.
      id ""
  Bus 0, device 2, function 0:
    VGA controller: PCI device 1234:1111
      BAR0: 32 bit prefetchable memory at 0xfd000000 [0xfdffffff].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0000fffe].
      id ""
  Bus 0, device 3, function 0:
    Ethernet controller: PCI device 1af4:1000
      IRQ 0.
      BAR0: I/O at 0xc060 [0xc07f].
      BAR1: 32 bit memory at 0xfebf0000 [0xfebf0fff].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0000fffe].
      id ""
  Bus 0, device 4, function 0:
    SCSI controller: PCI device 1af4:1001
      IRQ 0.
      BAR0: I/O at 0xc000 [0xc03f].
      BAR1: 32 bit memory at 0xfebf1000 [0xfebf1fff].
      id ""
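
For comparing the kvm_stat snapshots quoted in this message, a minimal
sketch in Python follows. It assumes each row has the form "name total rate"
and that the last column is a per-second rate; that reading of the columns
is an assumption here, not something stated in the thread. The helper names
parse_kvm_stat and rate_share are hypothetical, and only a few rows from the
first Windows and the Ubuntu snapshot are included.

```python
def parse_kvm_stat(text):
    """Parse 'name total rate' rows into {name: (total, rate)}."""
    stats = {}
    for line in text.splitlines():
        # Drop mail quote markers such as '>>>>> ' before splitting.
        fields = line.lstrip("> ").split()
        if len(fields) == 3 and fields[1].isdigit() and fields[2].isdigit():
            stats[fields[0]] = (int(fields[1]), int(fields[2]))
    return stats


def rate_share(stats, counter, base="exits"):
    """Return the counter's rate as a fraction of the base counter's rate."""
    base_rate = stats[base][1]
    return stats[counter][1] / base_rate if base_rate else 0.0


# Rows copied from the first Windows 7 snapshot in this thread.
WINDOWS = """
exits 2260976 79620
insn_emulation 1577325 58488
io_exits 943949 40249
mmio_exits 460020 16082
"""

# Rows copied from the Ubuntu 10.04.4 snapshot in this thread.
UBUNTU = """
exits 6132004 17931
insn_emulation 1417443 7518
io_exits 869380 2795
mmio_exits 205268 0
"""

if __name__ == "__main__":
    win, ubu = parse_kvm_stat(WINDOWS), parse_kvm_stat(UBUNTU)
    for name in ("insn_emulation", "io_exits", "mmio_exits"):
        print(f"{name:16s} windows={rate_share(win, name):.2f} "
              f"ubuntu={rate_share(ubu, name):.2f}")
```

Dividing by the overall exits rate is only one possible normalisation; it
makes the two guests comparable even though their absolute exit rates differ.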