* [Qemu-devel] [0/2] Bugfixes for ppc KVM @ 2012-09-20 7:08 David Gibson 2012-09-20 7:08 ` [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls David Gibson 2012-09-20 7:08 ` [Qemu-devel] [PATCH 2/2] target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab() David Gibson 0 siblings, 2 replies; 12+ messages in thread From: David Gibson @ 2012-09-20 7:08 UTC (permalink / raw) To: agraf; +Cc: qemu-ppc, qemu-devel, qemu-stable Hi Alex, This series contains two bugfixes for KVM on POWER. 1/2 is a very serious bug where the privilege check for hypercalls will be based on an outdated MSR value. The second has a rather smaller impact; we will do the wrong thing resetting the hash table for certain kernel versions. Both are real bugs, and should be merged ASAP into both mainline and stable. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-20 7:08 [Qemu-devel] [0/2] Bugfixes for ppc KVM David Gibson @ 2012-09-20 7:08 ` David Gibson 2012-09-20 7:38 ` Alexander Graf 2012-09-20 7:39 ` [Qemu-devel] " Jan Kiszka 2012-09-20 7:08 ` [Qemu-devel] [PATCH 2/2] target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab() David Gibson 1 sibling, 2 replies; 12+ messages in thread From: David Gibson @ 2012-09-20 7:08 UTC (permalink / raw) To: agraf; +Cc: David Gibson, qemu-ppc, qemu-devel, qemu-stable Currently the KVM exit path for PAPR hypercalls does not synchronize the qemu cpu state with the KVM state. Mostly this works, because the actual hypercall arguments and return values are explicitly passed through the kvm_run structure. However, the hypercall path includes a privilege check, to ensure that only the guest kernel can invoke hypercalls, not the guest userspace. Because of the lack of sync, this privilege check will use an out of date copy of the MSR, which could lead either to guest userspace being able to invoke hypercalls (a security hole for the guest) or to the guest kernel being incorrectly refused privilege leading to various other failures. This patch fixes the bug by forcing a synchronization on the hypercall exit path. This does mean we have a potentially quite expensive get and set of the state, however performance critical hypercalls are generally already implemented inside KVM so this probably won't matter. If it is a performance problem we can optimize it later by having the kernel perform the privilege check. That will need a new capability, however, since qemu will still need the privilege check for older kernels. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- target-ppc/kvm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 546c116..78a47fb 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -813,6 +813,7 @@ int kvm_arch_handle_exit(CPUPPCState *env, struct kvm_run *run) #ifdef CONFIG_PSERIES case KVM_EXIT_PAPR_HCALL: dprintf("handle PAPR hypercall\n"); + cpu_synchronize_state(env); run->papr_hcall.ret = spapr_hypercall(env, run->papr_hcall.nr, run->papr_hcall.args); ret = 0; -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-20 7:08 ` [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls David Gibson @ 2012-09-20 7:38 ` Alexander Graf 2012-09-20 11:53 ` [Qemu-devel] [Qemu-ppc] " David Gibson 2012-09-20 7:39 ` [Qemu-devel] " Jan Kiszka 1 sibling, 1 reply; 12+ messages in thread From: Alexander Graf @ 2012-09-20 7:38 UTC (permalink / raw) To: David Gibson; +Cc: qemu-ppc, qemu-devel, qemu-stable On 20.09.2012, at 09:08, David Gibson wrote: > Currently the KVM exit path for PAPR hypercalls does not synchronize the > qemu cpu state with the KVM state. Mostly this works, because the actual > hypercall arguments and return values are explicitly passed through the > kvm_run structure. However, the hypercall path includes a privilege check, > to ensure that only the guest kernel can invoke hypercalls, not the guest > userspace. Because of the lack of sync, this privilege check will use an > out of date copy of the MSR, which could lead either to guest userspace > being able to invoke hypercalls (a security hole for the guest) or to the > guest kernel being incorrectly refused privilege leading to various other > failures. > > This patch fixes the bug by forcing a synchronization on the hypercall exit > path. This does mean we have a potentially quite expensive get and set of > the state, however performance critical hypercalls are generally already > implemented inside KVM so this probably won't matter. If it is a > performance problem we can optimize it later by having the kernel perform > the privilege check. That will need a new capability, however, since qemu > will still need the privilege check for older kernels. > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> I would actually prefer to see that one fixed in kernel space. If we send it through -stable there, we should be a lot better off. Also, PR KVM already checks for !MSR_PR in kvm. Alex ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-20 7:38 ` Alexander Graf @ 2012-09-20 11:53 ` David Gibson 2012-09-20 12:44 ` Alexander Graf 0 siblings, 1 reply; 12+ messages in thread From: David Gibson @ 2012-09-20 11:53 UTC (permalink / raw) To: Alexander Graf; +Cc: qemu-ppc, qemu-devel, qemu-stable On Thu, Sep 20, 2012 at 09:38:58AM +0200, Alexander Graf wrote: > > On 20.09.2012, at 09:08, David Gibson wrote: > > > Currently the KVM exit path for PAPR hypercalls does not synchronize the > > qemu cpu state with the KVM state. Mostly this works, because the actual > > hypercall arguments and return values are explicitly passed through the > > kvm_run structure. However, the hypercall path includes a privilege check, > > to ensure that only the guest kernel can invoke hypercalls, not the guest > > userspace. Because of the lack of sync, this privilege check will use an > > out of date copy of the MSR, which could lead either to guest userspace > > being able to invoke hypercalls (a security hole for the guest) or to the > > guest kernel being incorrectly refused privilege leading to various other > > failures. > > > > This patch fixes the bug by forcing a synchronization on the hypercall exit > > path. This does mean we have a potentially quite expensive get and set of > > the state, however performance critical hypercalls are generally already > > implemented inside KVM so this probably won't matter. If it is a > > performance problem we can optimize it later by having the kernel perform > > the privilege check. That will need a new capability, however, since qemu > > will still need the privilege check for older kernels. > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > I would actually prefer to see that one fixed in kernel space. That's a better fix, but we can't fix it purely in the kernel, because there are existing released kernels that don't do the privilege check. > If we > send it through -stable there, we should be a lot better off. Also, > PR KVM already checks for !MSR_PR in kvm. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-20 11:53 ` [Qemu-devel] [Qemu-ppc] " David Gibson @ 2012-09-20 12:44 ` Alexander Graf 2012-09-21 0:22 ` David Gibson 0 siblings, 1 reply; 12+ messages in thread From: Alexander Graf @ 2012-09-20 12:44 UTC (permalink / raw) To: David Gibson; +Cc: qemu-ppc, qemu-devel, qemu-stable On 20.09.2012, at 13:53, David Gibson wrote: > On Thu, Sep 20, 2012 at 09:38:58AM +0200, Alexander Graf wrote: >> >> On 20.09.2012, at 09:08, David Gibson wrote: >> >>> Currently the KVM exit path for PAPR hypercalls does not synchronize the >>> qemu cpu state with the KVM state. Mostly this works, because the actual >>> hypercall arguments and return values are explicitly passed through the >>> kvm_run structure. However, the hypercall path includes a privilege check, >>> to ensure that only the guest kernel can invoke hypercalls, not the guest >>> userspace. Because of the lack of sync, this privilege check will use an >>> out of date copy of the MSR, which could lead either to guest userspace >>> being able to invoke hypercalls (a security hole for the guest) or to the >>> guest kernel being incorrectly refused privilege leading to various other >>> failures. >>> >>> This patch fixes the bug by forcing a synchronization on the hypercall exit >>> path. This does mean we have a potentially quite expensive get and set of >>> the state, however performance critical hypercalls are generally already >>> implemented inside KVM so this probably won't matter. If it is a >>> performance problem we can optimize it later by having the kernel perform >>> the privilege check. That will need a new capability, however, since qemu >>> will still need the privilege check for older kernels. >>> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >> >> I would actually prefer to see that one fixed in kernel space. > > That's a better fix, but we can't fix it purely in the kernel, because > there are existing released kernels that don't do the privilege check. There are security flaws fixed through -stable updates in the kernel any day, why should this one be handled differently? Alex ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-20 12:44 ` Alexander Graf @ 2012-09-21 0:22 ` David Gibson 2012-09-24 14:27 ` Alexander Graf 0 siblings, 1 reply; 12+ messages in thread From: David Gibson @ 2012-09-21 0:22 UTC (permalink / raw) To: Alexander Graf; +Cc: qemu-ppc, qemu-devel, qemu-stable On Thu, Sep 20, 2012 at 02:44:26PM +0200, Alexander Graf wrote: > > On 20.09.2012, at 13:53, David Gibson wrote: > > > On Thu, Sep 20, 2012 at 09:38:58AM +0200, Alexander Graf wrote: > >> > >> On 20.09.2012, at 09:08, David Gibson wrote: > >> > >>> Currently the KVM exit path for PAPR hypercalls does not synchronize the > >>> qemu cpu state with the KVM state. Mostly this works, because the actual > >>> hypercall arguments and return values are explicitly passed through the > >>> kvm_run structure. However, the hypercall path includes a privilege check, > >>> to ensure that only the guest kernel can invoke hypercalls, not the guest > >>> userspace. Because of the lack of sync, this privilege check will use an > >>> out of date copy of the MSR, which could lead either to guest userspace > >>> being able to invoke hypercalls (a security hole for the guest) or to the > >>> guest kernel being incorrectly refused privilege leading to various other > >>> failures. > >>> > >>> This patch fixes the bug by forcing a synchronization on the hypercall exit > >>> path. This does mean we have a potentially quite expensive get and set of > >>> the state, however performance critical hypercalls are generally already > >>> implemented inside KVM so this probably won't matter. If it is a > >>> performance problem we can optimize it later by having the kernel perform > >>> the privilege check. That will need a new capability, however, since qemu > >>> will still need the privilege check for older kernels. > >>> > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > >> > >> I would actually prefer to see that one fixed in kernel space. > > > > That's a better fix, but we can't fix it purely in the kernel, because > > there are existing released kernels that don't do the privilege check. > > There are security flaws fixed through -stable updates in the kernel > any day, why should this one be handled differently? >From the kernel's point of view, this is not obviously a security bug - it passes a hypercall it doesn't know how to handle to qemu, qemu handles it incorrectly. And in any case, even if you do consider it a kernel security bug, there's no reason that qemu should just allow that bug to appear when it's capable of working around buggy kernels in a way that closes the security hole. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-21 0:22 ` David Gibson @ 2012-09-24 14:27 ` Alexander Graf 2012-09-25 7:05 ` David Gibson 0 siblings, 1 reply; 12+ messages in thread From: Alexander Graf @ 2012-09-24 14:27 UTC (permalink / raw) To: David Gibson; +Cc: qemu-ppc, qemu-devel, qemu-stable On 21.09.2012, at 02:22, David Gibson wrote: > On Thu, Sep 20, 2012 at 02:44:26PM +0200, Alexander Graf wrote: >> >> On 20.09.2012, at 13:53, David Gibson wrote: >> >>> On Thu, Sep 20, 2012 at 09:38:58AM +0200, Alexander Graf wrote: >>>> >>>> On 20.09.2012, at 09:08, David Gibson wrote: >>>> >>>>> Currently the KVM exit path for PAPR hypercalls does not synchronize the >>>>> qemu cpu state with the KVM state. Mostly this works, because the actual >>>>> hypercall arguments and return values are explicitly passed through the >>>>> kvm_run structure. However, the hypercall path includes a privilege check, >>>>> to ensure that only the guest kernel can invoke hypercalls, not the guest >>>>> userspace. Because of the lack of sync, this privilege check will use an >>>>> out of date copy of the MSR, which could lead either to guest userspace >>>>> being able to invoke hypercalls (a security hole for the guest) or to the >>>>> guest kernel being incorrectly refused privilege leading to various other >>>>> failures. >>>>> >>>>> This patch fixes the bug by forcing a synchronization on the hypercall exit >>>>> path. This does mean we have a potentially quite expensive get and set of >>>>> the state, however performance critical hypercalls are generally already >>>>> implemented inside KVM so this probably won't matter. If it is a >>>>> performance problem we can optimize it later by having the kernel perform >>>>> the privilege check. That will need a new capability, however, since qemu >>>>> will still need the privilege check for older kernels. >>>>> >>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>>> >>>> I would actually prefer to see that one fixed in kernel space. >>> >>> That's a better fix, but we can't fix it purely in the kernel, because >>> there are existing released kernels that don't do the privilege check. >> >> There are security flaws fixed through -stable updates in the kernel >> any day, why should this one be handled differently? > > From the kernel's point of view, this is not obviously a security bug > - it passes a hypercall it doesn't know how to handle to qemu, qemu > handles it incorrectly. > > And in any case, even if you do consider it a kernel security bug, > there's no reason that qemu should just allow that bug to appear when > it's capable of working around buggy kernels in a way that closes the > security hole. This is the code in the HV kernel side: case BOOK3S_INTERRUPT_SYSCALL: { /* hcall - punt to userspace */ int i; if (vcpu->arch.shregs.msr & MSR_PR) { /* sc 1 from userspace - reflect to guest syscall */ kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_SYSCALL); r = RESUME_GUEST; break; } run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3); for (i = 0; i < 9; ++i) run->papr_hcall.args[i] = kvmppc_get_gpr(vcpu, 4 + i); run->exit_reason = KVM_EXIT_PAPR_HCALL; vcpu->arch.hcall_needed = 1; r = RESUME_HOST; break; } So it already handles hypercalls in user space and deflects them back. Everyone's happy :). The only outstanding bug is that QEMU shouldn't interpret env->msr when handling hypercalls from KVM, since these are already guaranteed to be checked and MSR in QEMU does not reflect the current MSR in the vcpu, so we might end up rejecting hypercalls by accident. Alex ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-24 14:27 ` Alexander Graf @ 2012-09-25 7:05 ` David Gibson 0 siblings, 0 replies; 12+ messages in thread From: David Gibson @ 2012-09-25 7:05 UTC (permalink / raw) To: Alexander Graf; +Cc: qemu-ppc, qemu-devel, qemu-stable On Mon, Sep 24, 2012 at 04:27:20PM +0200, Alexander Graf wrote: > > On 21.09.2012, at 02:22, David Gibson wrote: > > > On Thu, Sep 20, 2012 at 02:44:26PM +0200, Alexander Graf wrote: > >> > >> On 20.09.2012, at 13:53, David Gibson wrote: > >> > >>> On Thu, Sep 20, 2012 at 09:38:58AM +0200, Alexander Graf wrote: > >>>> > >>>> On 20.09.2012, at 09:08, David Gibson wrote: > >>>> > >>>>> Currently the KVM exit path for PAPR hypercalls does not synchronize the > >>>>> qemu cpu state with the KVM state. Mostly this works, because the actual > >>>>> hypercall arguments and return values are explicitly passed through the > >>>>> kvm_run structure. However, the hypercall path includes a privilege check, > >>>>> to ensure that only the guest kernel can invoke hypercalls, not the guest > >>>>> userspace. Because of the lack of sync, this privilege check will use an > >>>>> out of date copy of the MSR, which could lead either to guest userspace > >>>>> being able to invoke hypercalls (a security hole for the guest) or to the > >>>>> guest kernel being incorrectly refused privilege leading to various other > >>>>> failures. > >>>>> > >>>>> This patch fixes the bug by forcing a synchronization on the hypercall exit > >>>>> path. This does mean we have a potentially quite expensive get and set of > >>>>> the state, however performance critical hypercalls are generally already > >>>>> implemented inside KVM so this probably won't matter. If it is a > >>>>> performance problem we can optimize it later by having the kernel perform > >>>>> the privilege check. That will need a new capability, however, since qemu > >>>>> will still need the privilege check for older kernels. > >>>>> > >>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > >>>> > >>>> I would actually prefer to see that one fixed in kernel space. > >>> > >>> That's a better fix, but we can't fix it purely in the kernel, because > >>> there are existing released kernels that don't do the privilege check. > >> > >> There are security flaws fixed through -stable updates in the kernel > >> any day, why should this one be handled differently? > > > > From the kernel's point of view, this is not obviously a security bug > > - it passes a hypercall it doesn't know how to handle to qemu, qemu > > handles it incorrectly. > > > > And in any case, even if you do consider it a kernel security bug, > > there's no reason that qemu should just allow that bug to appear when > > it's capable of working around buggy kernels in a way that closes the > > security hole. > > This is the code in the HV kernel side: > > case BOOK3S_INTERRUPT_SYSCALL: > { > /* hcall - punt to userspace */ > int i; > > if (vcpu->arch.shregs.msr & MSR_PR) { > /* sc 1 from userspace - reflect to guest syscall */ > kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_SYSCALL); > r = RESUME_GUEST; > break; > } > run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3); > for (i = 0; i < 9; ++i) > run->papr_hcall.args[i] = kvmppc_get_gpr(vcpu, 4 + i); > run->exit_reason = KVM_EXIT_PAPR_HCALL; > vcpu->arch.hcall_needed = 1; > r = RESUME_HOST; > break; > } > > So it already handles hypercalls in user space and deflects them > back. Everyone's happy :). Ah, so it does. I was mistaken. > The only outstanding bug is that QEMU shouldn't interpret env->msr > when handling hypercalls from KVM, since these are already > guaranteed to be checked and MSR in QEMU does not reflect the > current MSR in the vcpu, so we might end up rejecting hypercalls by > accident. I've written a suitable patch, just needs a little more testing and I'll send it out. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-20 7:08 ` [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls David Gibson 2012-09-20 7:38 ` Alexander Graf @ 2012-09-20 7:39 ` Jan Kiszka 2012-09-20 11:54 ` [Qemu-devel] [Qemu-ppc] " David Gibson 1 sibling, 1 reply; 12+ messages in thread From: Jan Kiszka @ 2012-09-20 7:39 UTC (permalink / raw) To: David Gibson; +Cc: qemu-stable, qemu-ppc, agraf, qemu-devel On 2012-09-20 09:08, David Gibson wrote: > Currently the KVM exit path for PAPR hypercalls does not synchronize the > qemu cpu state with the KVM state. Mostly this works, because the actual > hypercall arguments and return values are explicitly passed through the > kvm_run structure. However, the hypercall path includes a privilege check, > to ensure that only the guest kernel can invoke hypercalls, not the guest > userspace. Because of the lack of sync, this privilege check will use an > out of date copy of the MSR, which could lead either to guest userspace > being able to invoke hypercalls (a security hole for the guest) or to the > guest kernel being incorrectly refused privilege leading to various other > failures. > > This patch fixes the bug by forcing a synchronization on the hypercall exit > path. This does mean we have a potentially quite expensive get and set of > the state, however performance critical hypercalls are generally already > implemented inside KVM so this probably won't matter. If it is a > performance problem we can optimize it later by having the kernel perform > the privilege check. That will need a new capability, however, since qemu > will still need the privilege check for older kernels. If it's just about reading a small subset of the state (a single MSR?) to allow the privilege check, you can also open-code only that in HCALL exit. If it really matters. Jan > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > --- > target-ppc/kvm.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c > index 546c116..78a47fb 100644 > --- a/target-ppc/kvm.c > +++ b/target-ppc/kvm.c > @@ -813,6 +813,7 @@ int kvm_arch_handle_exit(CPUPPCState *env, struct kvm_run *run) > #ifdef CONFIG_PSERIES > case KVM_EXIT_PAPR_HCALL: > dprintf("handle PAPR hypercall\n"); > + cpu_synchronize_state(env); > run->papr_hcall.ret = spapr_hypercall(env, run->papr_hcall.nr, > run->papr_hcall.args); > ret = 0; > -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls 2012-09-20 7:39 ` [Qemu-devel] " Jan Kiszka @ 2012-09-20 11:54 ` David Gibson 0 siblings, 0 replies; 12+ messages in thread From: David Gibson @ 2012-09-20 11:54 UTC (permalink / raw) To: Jan Kiszka; +Cc: qemu-ppc, qemu-stable, qemu-devel On Thu, Sep 20, 2012 at 09:39:56AM +0200, Jan Kiszka wrote: > On 2012-09-20 09:08, David Gibson wrote: > > Currently the KVM exit path for PAPR hypercalls does not synchronize the > > qemu cpu state with the KVM state. Mostly this works, because the actual > > hypercall arguments and return values are explicitly passed through the > > kvm_run structure. However, the hypercall path includes a privilege check, > > to ensure that only the guest kernel can invoke hypercalls, not the guest > > userspace. Because of the lack of sync, this privilege check will use an > > out of date copy of the MSR, which could lead either to guest userspace > > being able to invoke hypercalls (a security hole for the guest) or to the > > guest kernel being incorrectly refused privilege leading to various other > > failures. > > > > This patch fixes the bug by forcing a synchronization on the hypercall exit > > path. This does mean we have a potentially quite expensive get and set of > > the state, however performance critical hypercalls are generally already > > implemented inside KVM so this probably won't matter. If it is a > > performance problem we can optimize it later by having the kernel perform > > the privilege check. That will need a new capability, however, since qemu > > will still need the privilege check for older kernels. > > If it's just about reading a small subset of the state (a single MSR?) > to allow the privilege check, you can also open-code only that in HCALL > exit. If it really matters. We could, don't think it's worth the trouble. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Qemu-devel] [PATCH 2/2] target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab() 2012-09-20 7:08 [Qemu-devel] [0/2] Bugfixes for ppc KVM David Gibson 2012-09-20 7:08 ` [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls David Gibson @ 2012-09-20 7:08 ` David Gibson 2012-09-20 8:58 ` Alexander Graf 1 sibling, 1 reply; 12+ messages in thread From: David Gibson @ 2012-09-20 7:08 UTC (permalink / raw) To: agraf; +Cc: David Gibson, qemu-ppc, qemu-devel, qemu-stable The kvmppc_reset_htab() function invokes the KVM_PPC_ALLOCATE_HTAB vm ioctl to request KVM to allocate and reset a hash page table for the guest - it returns the size of hash table allocated, or 0 to indicate that qemu needs to allocate the hash table itself. In practice qemu needs to allocate the htab for full emulation and with Book3sPR KVM, but the kernel has to allocate it for Book3sHV KVM (the hash table needs to be physically contiguous in that case). Unfortunately, the logic in this function is incorrect for some existing kernels. Specifically: * at least some PR KVM versions advertise the relevant capability but don't actually implement the ioctl(), returning ENOTTY. * For old kernels which don't have the capability, we currently return 0. This is correct for PV KVM, where we need to allocate the htab, but not for HV KVM - kernels of this era always allocate a 16MB hash table per guest. This patch corrects both of these edge cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- target-ppc/kvm.c | 30 +++++++++++++++++++++++++----- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 78a47fb..ecc7719 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -1128,18 +1128,38 @@ int kvmppc_reset_htab(int shift_hint) { uint32_t shift = shift_hint; - if (kvm_enabled() && - kvm_check_extension(kvm_state, KVM_CAP_PPC_ALLOC_HTAB)) { + if (!kvm_enabled()) { + /* Full emulation, tell caller to allocate htab itself */ + return 0; + } + if (kvm_check_extension(kvm_state, KVM_CAP_PPC_ALLOC_HTAB)) { int ret; ret = kvm_vm_ioctl(kvm_state, KVM_PPC_ALLOCATE_HTAB, &shift); - if (ret < 0) { + if (ret == -ENOTTY) { + /* At least some versions of PR KVM advertise the + * capability, but don't implement the ioctl(). Oops. + * Return 0 so that we allocate the htab in qemu, as is + * correct for PR. */ + return 0; + } else if (ret < 0) { return ret; } return shift; } - /* For now.. */ - return 0; + /* We have a kernel that predates the htab reset calls. For PR + * KVM, we need to allocate the htab ourselves, for an HV KVM of + * this era, it has allocated a 16MB fixed size hash table + * already. Kernels of this era have the GET_PVINFO capability + * only on PR, so we use this hack to determine the right + * answer */ + if (kvm_check_extension(kvm_state, KVM_CAP_PPC_GET_PVINFO)) { + /* PR - tell caller to allocate htab */ + return 0; + } else { + /* HV - assume 16MB kernel allocated htab */ + return 24; + } } static inline uint32_t mfpvr(void) -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab() 2012-09-20 7:08 ` [Qemu-devel] [PATCH 2/2] target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab() David Gibson @ 2012-09-20 8:58 ` Alexander Graf 0 siblings, 0 replies; 12+ messages in thread From: Alexander Graf @ 2012-09-20 8:58 UTC (permalink / raw) To: David Gibson; +Cc: qemu-ppc, qemu-devel, qemu-stable On 20.09.2012, at 09:08, David Gibson wrote: > The kvmppc_reset_htab() function invokes the KVM_PPC_ALLOCATE_HTAB vm ioctl > to request KVM to allocate and reset a hash page table for the guest - it > returns the size of hash table allocated, or 0 to indicate that qemu needs > to allocate the hash table itself. In practice qemu needs to allocate the > htab for full emulation and with Book3sPR KVM, but the kernel has to > allocate it for Book3sHV KVM (the hash table needs to be physically > contiguous in that case). > > Unfortunately, the logic in this function is incorrect for some existing > kernels. Specifically: > * at least some PR KVM versions advertise the relevant capability but > don't actually implement the ioctl(), returning ENOTTY. > * For old kernels which don't have the capability, we currently return 0. > This is correct for PV KVM, where we need to allocate the htab, but not for > HV KVM - kernels of this era always allocate a 16MB hash table per guest. > > This patch corrects both of these edge cases. > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Thanks, applied to ppc-next. Alex ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-09-25 7:08 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-09-20 7:08 [Qemu-devel] [0/2] Bugfixes for ppc KVM David Gibson 2012-09-20 7:08 ` [Qemu-devel] [PATCH 1/2] pseries: Synchronize qemu and KVM state on hypercalls David Gibson 2012-09-20 7:38 ` Alexander Graf 2012-09-20 11:53 ` [Qemu-devel] [Qemu-ppc] " David Gibson 2012-09-20 12:44 ` Alexander Graf 2012-09-21 0:22 ` David Gibson 2012-09-24 14:27 ` Alexander Graf 2012-09-25 7:05 ` David Gibson 2012-09-20 7:39 ` [Qemu-devel] " Jan Kiszka 2012-09-20 11:54 ` [Qemu-devel] [Qemu-ppc] " David Gibson 2012-09-20 7:08 ` [Qemu-devel] [PATCH 2/2] target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab() David Gibson 2012-09-20 8:58 ` Alexander Graf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).