From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: Cross vendor migration ideas
Date: Sun, 16 Nov 2008 17:36:30 +0200
Message-ID: <49203DFE.1070101@redhat.com>
References: <982D8D05B6407A49AD506E6C3AC8E7D6BEF936912A@caralain.haven.nynaeve.net>
 <7CCF7468C07AFF4B991DD1528058EC2F042C7283@SSVLEXMB1.amd.com>
 <878wrl16q5.fsf@basil.nowhere.org>
 <1FF5F416-082C-4FA2-8392-8552BFBEDA00@suse.de>
 <491F08F1.40501@firstfloor.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Alexander Graf , "Serebrin, Benjamin (Calendar)" , Skywing ,
 Anthony Liguori , kvm@vger.kernel.org, Amit Shah , "Wahlig, Elsie" ,
 "Nakajima, Jun"
To: Andi Kleen
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:48569 "EHLO mx2.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752193AbYKPPgq (ORCPT ); Sun, 16 Nov 2008 10:36:46 -0500
In-Reply-To: <491F08F1.40501@firstfloor.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

Andi Kleen wrote:
>> Yes, but since we're emulating a CPU anyways we don't want vendor
>> specific setup, since we might live migrate.
>>
>
> I was mainly thinking of tuning.
>
> For example Linux selects the best suitable page copy
> function based on vendor information. Using the wrong page copy
> can have a large impact on your performance.
>

We can tell Linux to retune using a custom ACPI (blech) event.

>>> Also it might break user space, unless you key the fake vendor CPUID
>>> intercept on ring 0 vs ring 3 (but even if that might not be enough
>>> because some kernel modules can call CPUID on their own)
>>>
>> Is there userspace code that relies on GenuineIntel?
>>
>
> Yes. Undoubtedly also the same with others.
>

This will not work of course for userspace.

>>> I think just emulating SYSCALL/SYSENTER would be safer. It shouldn't
>>> be that much slower than int 0x80 hopefully.
>>>
>> Well emulating them means you're leaving the VM on every user<->kernel
>> transition. That's a _huge_ performance hit. I don't have the numbers,
>> but IIRC a roundtrip is ~3000 cycles.
>>
>
> Depends on the CPU. Newer CPUs are faster.
>

I think emulation will be slower than 3000 cycles, even on newer cpus.
In addition to the #vmexit, you have to save all registers, walk the
guest page tables to fetch the instruction, execute the emulator code
including all the checks, go back to the guest. Then do the same again
on sysexit/sysret -- the overhead doubles.

> Also you would select the default based on which CPU you're booting
> on. So initially and as long as you stay on the same
> vendor you'll have full performance. Only if you switch
> vendors later you'll also eat the emulation hit (and get it
> back again when you switch back to a system with the original vendor)
> But everything will still work correctly and that is the
> important part.
>
> Doesn't sound so bad to me.
>

Long running guests on a large farm shouldn't expect to stay on the same
host they were booted on. But at least for Linux, if we notify the
guest telling it to retune (which can include the vsyscall path) we can
avoid the cost entirely.

-- 
error compiling committee.c: too many arguments to function