From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH] KVM: Use thread debug register storage instead of kvm specific data
Date: Tue, 01 Sep 2009 21:23:15 +0300
Message-ID: <4A9D6693.7040401@redhat.com>
References: <1251798248-13164-1-git-send-email-avi@redhat.com> <4A9CEDBE.8010902@redhat.com> <1251828730.9683.129.camel@twinturbo.austin.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Marcelo Tosatti, kvm@vger.kernel.org, Jan Kiszka
To: habanero@linux.vnet.ibm.com
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:39670 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751865AbZIASWz (ORCPT ); Tue, 1 Sep 2009 14:22:55 -0400
In-Reply-To: <1251828730.9683.129.camel@twinturbo.austin.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 09/01/2009 09:12 PM, Andrew Theurer wrote:
> Here's a run from branch debugreg with thread debugreg storage +
> conditionally reload dr6:
>
> user   nice   system   irq    softirq   guest   idle    iowait
> 5.79   0.00   9.28     0.08   1.00      20.81   58.78   4.26
> total busy: 36.97
>
> Previous run that had avoided calling adjust_vmx_controls twice:
>
> user   nice   system   irq    softirq   guest   idle    iowait
> 5.81   0.00   9.48     0.08   1.04      21.32   57.86   4.41
> total busy: 37.73
>
> A relative reduction CPU cycles of 2%
>

That was an easy fruit to pick.  Too bad it was a regression that we
introduced.

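As a cross-check of the 2% figure, here is a minimal sketch that recomputes it from the two samples quoted above. It assumes "total busy" means every column except idle and iowait, which is consistent with the reported totals but is not stated in the thread; the helper names are made up for illustration.

```python
# Column values copied from the two quoted runs; helper names are hypothetical.
COLUMNS = ("user", "nice", "system", "irq", "softirq", "guest", "idle", "iowait")

def busy(sample):
    """Sum all columns except idle and iowait (assumed meaning of 'total busy')."""
    return sum(v for k, v in sample.items() if k not in ("idle", "iowait"))

def relative_reduction(before, after):
    """Fraction by which 'after' is lower than 'before'."""
    return (before - after) / before

new_run = dict(zip(COLUMNS, (5.79, 0.00, 9.28, 0.08, 1.00, 20.81, 58.78, 4.26)))
old_run = dict(zip(COLUMNS, (5.81, 0.00, 9.48, 0.08, 1.04, 21.32, 57.86, 4.41)))

print(f"busy, thread debugreg run: {busy(new_run):.2f}")
print(f"busy, previous run:        {busy(old_run):.2f}")
# Use the totals as reported in the mail for the headline number.
print(f"relative reduction:        {relative_reduction(37.73, 36.97):.1%}")
```

The recomputed sum for the new run can differ from the reported 36.97 in the last digit, presumably because the per-column values were rounded before being posted.
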
> new oprofile:
>
>
>> samples  %        app name  symbol name
>> 876648   54.1555  kvm-intel.ko  vmx_vcpu_run
>> 37595    2.3225   qemu-system-x86_64  cpu_physical_memory_rw
>> 35623    2.2006   qemu-system-x86_64  phys_page_find_alloc
>> 24874    1.5366   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  native_write_msr_safe
>> 17710    1.0940   libc-2.5.so  memcpy
>> 14664    0.9059   kvm.ko  kvm_arch_vcpu_ioctl_run
>> 14577    0.9005   qemu-system-x86_64  qemu_get_ram_ptr
>> 12528    0.7739   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  native_read_msr_safe
>> 10979    0.6782   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  copy_user_generic_string
>> 9979     0.6165   qemu-system-x86_64  virtqueue_get_head
>> 9371     0.5789   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  schedule
>> 8333     0.5148   qemu-system-x86_64  virtqueue_avail_bytes
>> 7899     0.4880   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  fget_light
>> 7289     0.4503   qemu-system-x86_64  main_loop_wait
>> 7217     0.4458   qemu-system-x86_64  lduw_phys
>>

This is almost entirely host virtio.  I can reduce native_write_msr_safe
by a bit, but not much.

>> 6821     0.4214   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  audit_syscall_exit
>> 6749     0.4169   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  do_select
>> 5919     0.3657   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  audit_syscall_entry
>> 5466     0.3377   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  kfree
>> 4887     0.3019   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  fput
>> 4689     0.2897   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  __switch_to
>> 4636     0.2864   vmlinux-2.6.31-rc5_debugreg_v2.6.31-rc3-3441-g479fa73-autokern1  mwait_idle
>>

Still not idle=poll, it may shave off 0.2%.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.