From mboxrd@z Thu Jan  1 00:00:00 1970
From: Chao Peng
Subject: Re: [PATCH] target-i386: kvm: cache KVM_GET_SUPPORTED_CPUID data
Date: Tue, 14 Jun 2016 13:01:28 +0800
Message-ID: <20160614050128.GA21465@pengc-linux.bj.intel.com>
References: <1465784487-23482-1-git-send-email-chao.p.peng@linux.intel.com>
 <0ad19d18-4ee1-0450-c3fa-b64b34e2cc66@redhat.com>
Reply-To: Chao Peng
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Eduardo Habkost,
 Marcelo Tosatti, Richard Henderson
To: Paolo Bonzini
Return-path:
Received: from mga09.intel.com ([134.134.136.24]:22239 "EHLO mga09.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751945AbcFNFMM (ORCPT ); Tue, 14 Jun 2016 01:12:12 -0400
Content-Disposition: inline
In-Reply-To: <0ad19d18-4ee1-0450-c3fa-b64b34e2cc66@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Mon, Jun 13, 2016 at 12:02:41PM +0200, Paolo Bonzini wrote:
> 
> 
> On 13/06/2016 04:21, Chao Peng wrote:
> > The KVM_GET_SUPPORTED_CPUID ioctl is called frequently when
> > initializing CPUs. Depending on CPU features and CPU count, the
> > number of calls can be extremely high, which slows down QEMU booting
> > significantly. In our testing, we saw 5922 calls with the switches:
> > 
> >     -cpu SandyBridge -smp 6,sockets=6,cores=1,threads=1
> > 
> > In total this ioctl takes more than 100ms, which is almost half of
> > the total QEMU startup time.
> > 
> > In most cases, though, the data returned by two different invocations
> > does not change, which means we can cache the data to avoid trapping
> > into the kernel a second time. To make sure the cache is safe, one
> > assumption is desirable: the ioctl is stateless. This is not true,
> > however, at least for some CPUID leaves.
> 
> Which are the CPUID leaves for which KVM_GET_SUPPORTED_CPUID is not
> stateless?  I cannot find any.

I had thought leaf 0xd, sub-leaf 1 was not stateless, as the size of the
xsave buffer (EBX) is based on XCR0 | IA32_XSS.
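To spell out the dependency I had in mind: the EBX value for leaf 0xd,
sub-leaf 1 is the size of the xsave area for the currently enabled state
components, roughly as in the sketch below (the per-component sizes here
are made up for illustration; the real ones are enumerated by
CPUID.0xD.<i>.EAX):

```c
#include <stdint.h>

/* Rough model of how CPUID.0xD.1.EBX depends on XCR0 | IA32_XSS: the
 * reported save-area size grows with each enabled state component.
 * Component sizes are hypothetical placeholders, not real values. */
static uint32_t xsave_area_size(uint64_t xcr0, uint64_t xss)
{
    /* hypothetical per-component sizes, indexed by feature bit */
    static const uint32_t comp_size[] = { 0, 0, 256, 64, 64 };
    uint64_t mask = xcr0 | xss;
    uint32_t size = 512 + 64;   /* legacy XSAVE region + XSAVE header */
    int i;

    for (i = 2; i < 5; i++) {   /* x87/SSE (bits 0-1) live in the legacy region */
        if (mask & (1ULL << i)) {
            size += comp_size[i];
        }
    }
    return size;
}
```

So if KVM computed this from the guest's XCR0/IA32_XSS, the returned
value would change as the guest writes those registers, which is why I
suspected the ioctl was stateful.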
But after looking at the KVM code more carefully, it seems I was wrong.
The code calculates EBX from the host xcr0, not the guest xcr0 or the
guest IA32_XSS (I am not sure whether this is the correct behavior), so
it always returns constant data on a given machine. If this is not an
issue, then it looks like we can cache it safely.

> > The good part is that even if the ioctl is not fully stateless, we
> > can still cache the return value as long as we know the data is
> > unchanged for the leaves we are interested in. This should actually
> > be true for most invocations, and all the call sites in the current
> > code appear to satisfy it.
> > 
> > A non-cached version can be introduced if a refresh is required in
> > the future.
> 
> [...]
> 
> > +static Notifier kvm_exit_notifier;
> > +static void kvm_arch_destroy(Notifier *n, void *unused)
> > +{
> > +    g_free(cpuid_cache);
> > +}
> > +
> >  int kvm_arch_init(MachineState *ms, KVMState *s)
> >  {
> >      uint64_t identity_base = 0xfffbc000;
> > @@ -1165,6 +1176,9 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >          smram_machine_done.notify = register_smram_listener;
> >          qemu_add_machine_init_done_notifier(&smram_machine_done);
> >      }
> > +
> > +    kvm_exit_notifier.notify = kvm_arch_destroy;
> > +    qemu_add_exit_notifier(&kvm_exit_notifier);
> >      return 0;
> 
> This part is unnecessary; the OS takes care of freeing the heap on exit.

Makes sense. Thanks.

Chao
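P.S. For reference, the caching idea in the patch boils down to
something like the sketch below. The names and the stub are simplified
stand-ins, not the real QEMU code:

```c
#include <stdlib.h>

/* simplified stand-in for struct kvm_cpuid2 */
struct cpuid_data { int nent; };

static struct cpuid_data *cpuid_cache;
static int ioctl_calls;   /* counts trips into the "kernel" for the demo */

/* stand-in for the real KVM_GET_SUPPORTED_CPUID ioctl */
static struct cpuid_data *get_supported_cpuid_uncached(void)
{
    struct cpuid_data *data = malloc(sizeof(*data));
    data->nent = 0;
    ioctl_calls++;
    return data;
}

/* All callers go through here; only the first call traps into the
 * kernel, every later call returns the cached copy.  No explicit free:
 * as noted above, the OS reclaims the heap on exit. */
static struct cpuid_data *get_supported_cpuid(void)
{
    if (!cpuid_cache) {
        cpuid_cache = get_supported_cpuid_uncached();
    }
    return cpuid_cache;
}
```

With this shape, the thousands of per-vCPU lookups during startup
collapse into a single real ioctl.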