From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Date: Wed, 30 Sep 2009 09:24:11 +0000
Subject: Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v4
Message-Id: <4AC323BB.80307@redhat.com>
List-Id:
References: <1254212303-8737-1-git-send-email-agraf@suse.de>
In-Reply-To: <1254212303-8737-1-git-send-email-agraf@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: kvm-ppc@vger.kernel.org

On 09/30/2009 11:11 AM, Alexander Graf wrote:
>>> The plan is to get qemu ppc64 guest support in a shape where it can
>>> actually use the KVM support. As it is it's rather useless.
>>> When we have that, a PV interface would be needed to get things fast
>>> and then the next thing on my list is the MMU notifiers.
>>
>> Um. How slow is it today? What paths are problematic? mmu, context
>> switch?
>
> Instruction emulation.
>
> X86 with virtualization extensions doesn't trap often, as most of the
> state can be safely handled within the guest mode.
> Now with PPC we're basically running in "ring 3" (called "problem
> state" in ppc speech) which traps all the time because guests change
> the IF or access some SPRs that we don't really need to trap on, but
> only need to sync state with on #VMEXIT.
>
> So the PV idea here is to have a shared page between host and guest
> that contains guest specific SPRs and other state (an MSR shadow for
> example). That way the guest can patch itself to use that shared page
> and KVM always knows about the most current state on #VMEXIT. At the
> same time we're reducing exits by a _lot_.

But writing those registers often has side effects. For example,
enabling interrupts should also inject an interrupt when one is pending.

On x86 we have the same problem with the TPR on Windows XP, so we copy
it to the guest on entry (along with the pending interrupt state) and
back to the host on exit. The guest uses an atomic operation to change
the TPR and read pending interrupt information, and if an interrupt
becomes unmasked, it calls a hypercall to trigger it. Presumably you'll
be doing something similar?

In any case, I recommend keeping fine-grained control over those bits so
they can be enabled/disabled/expanded as needed.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
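
The shared-TPR scheme described in the reply can be sketched roughly as
follows. This is a minimal user-space C11 illustration, not KVM code: the
word layout, the field packing, and every name (pv_shared_page,
host_publish_pending, guest_set_tpr, pv_hypercall_inject_pending) are
hypothetical, the hypercall is a stub that only counts calls, and the
priority test is simplified to a plain "pending > TPR" comparison.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: bits 0-7 hold the guest's TPR, bits 8-15 hold the
 * priority of the highest pending interrupt (0 means nothing pending).
 * Packing both into one word lets a single atomic operation update the
 * TPR and observe the pending state together. */
struct pv_shared_page {
    _Atomic uint32_t tpr_and_pending;
};

#define PV_TPR_MASK      0x000000ffu
#define PV_PENDING_MASK  0x0000ff00u
#define PV_PENDING_SHIFT 8

/* Stand-in for a real hypercall; here it only counts invocations. */
static unsigned int pv_hypercall_count;

static void pv_hypercall_inject_pending(void)
{
    pv_hypercall_count++;
}

/* Host side, before guest entry: publish the pending interrupt priority
 * without disturbing the TPR field. */
static void host_publish_pending(struct pv_shared_page *page, uint8_t prio)
{
    assert(page != NULL);
    uint32_t old = atomic_load(&page->tpr_and_pending);
    uint32_t next;
    do {
        next = (old & ~PV_PENDING_MASK) |
               ((uint32_t)prio << PV_PENDING_SHIFT);
    } while (!atomic_compare_exchange_weak(&page->tpr_and_pending,
                                           &old, next));
}

/* Guest side: atomically install a new TPR and read the pending priority
 * from the same word. If the pending interrupt is deliverable under the
 * new TPR, ask the host to inject it. Returns true if the hypercall was
 * made. */
static bool guest_set_tpr(struct pv_shared_page *page, uint8_t tpr)
{
    assert(page != NULL);
    uint32_t old = atomic_load(&page->tpr_and_pending);
    uint32_t next;
    do {
        next = (old & ~PV_TPR_MASK) | tpr;
    } while (!atomic_compare_exchange_weak(&page->tpr_and_pending,
                                           &old, next));

    uint8_t pending = (uint8_t)((next & PV_PENDING_MASK) >> PV_PENDING_SHIFT);
    if (pending > tpr) {
        pv_hypercall_inject_pending();
        return true;
    }
    return false;
}
```

With a pending priority of 5 published by the host, raising the TPR to 8
needs no exit at all, while lowering it to 3 unmasks the interrupt and
triggers the single hypercall.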