* [PATCH] KVM: PPC: Book3S PR: only call slbmte for valid SLB entries
From: Greg Kurz @ 2017-09-22 9:34 UTC
To: kvm-ppc
Cc: Paul Mackerras, David Gibson, Alexey Kardashevskiy, linuxppc-dev,
qemu-ppc
Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
some of which are valid (i.e., SLB_ESID_V is set) and the rest are
likely all-zeroes (with QEMU at least).

Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
expects to find the SLB index in the low bits of its rb argument.
When passed zeroed arguments, it happily overwrites the 0th SLB entry
with zeroes. This is exactly what happens during live migration with
QEMU, when the destination pushes the incoming SLB descriptors to
KVM PR. When reloading the SLBs at the next synchronization, QEMU first
clears its SLB array and only restores the valid ones, but the 0th one
is now gone and we can no longer access the corresponding memory:
(qemu) x/x $pc
c0000000000b742c: Cannot access memory
To avoid this, let's filter out invalid SLB entries, as we
already do for Book3S HV.
Signed-off-by: Greg Kurz <groug@kaod.org>
---
arch/powerpc/kvm/book3s_pr.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 3beb4ff469d1..cb6894e55f97 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -1328,8 +1328,10 @@ static int kvm_arch_vcpu_ioctl_set_sregs_pr(struct kvm_vcpu *vcpu,
vcpu3s->sdr1 = sregs->u.s.sdr1;
if (vcpu->arch.hflags & BOOK3S_HFLAG_SLB) {
for (i = 0; i < 64; i++) {
- vcpu->arch.mmu.slbmte(vcpu, sregs->u.s.ppc64.slb[i].slbv,
- sregs->u.s.ppc64.slb[i].slbe);
+ u64 rb = sregs->u.s.ppc64.slb[i].slbe;
+ u64 rs = sregs->u.s.ppc64.slb[i].slbv;
+ if (rb & SLB_ESID_V)
+ vcpu->arch.mmu.slbmte(vcpu, rs, rb);
}
} else {
for (i = 0; i < 16; i++) {
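
A minimal standalone sketch of the failure mode described above; the
descriptor struct and the index mask are illustrative stand-ins, not the
kernel's definitions (only the SLB_ESID_V value matches the real header):

#include <stdint.h>
#include <stdio.h>

#define SLB_ESID_V (1ULL << 27)		/* SLB entry-valid bit */

struct slb_desc { uint64_t slbe, slbv; };

int main(void)
{
	/* slot 0 carries a valid entry; the other 63 are all-zeroes */
	struct slb_desc in[64] = { { SLB_ESID_V, 0x1000 } };
	struct slb_desc slb[64] = { { 0 } };

	for (int i = 0; i < 64; i++) {
		unsigned int index = in[i].slbe & 63;	/* index from rb's low bits */
		slb[index] = in[i];	/* a zeroed rb decodes to index 0 */
	}

	/* the valid entry installed on the first iteration has since
	 * been rewritten 63 times with zeroes */
	printf("slb[0].slbe = %#llx\n", (unsigned long long)slb[0].slbe);
	return 0;
}

Compiled and run, this prints slb[0].slbe = 0: the in-program equivalent
of the "Cannot access memory" symptom above.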
* Re: [PATCH] KVM: PPC: Book3S PR: only call slbmte for valid SLB entries
From: David Gibson @ 2017-09-26 3:56 UTC
To: Greg Kurz
Cc: kvm-ppc, Paul Mackerras, Alexey Kardashevskiy, linuxppc-dev,
qemu-ppc
On Fri, Sep 22, 2017 at 11:34:29AM +0200, Greg Kurz wrote:
> Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
> some of which are valid (i.e., SLB_ESID_V is set) and the rest are
> likely all-zeroes (with QEMU at least).
>
> Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
> expects to find the SLB index in the low bits of its rb argument.
> When passed zeroed arguments, it happily overwrites the 0th SLB entry
> with zeroes. This is exactly what happens during live migration with
> QEMU, when the destination pushes the incoming SLB descriptors to
> KVM PR. When reloading the SLBs at the next synchronization, QEMU first
> clears its SLB array and only restores the valid ones, but the 0th one
> is now gone and we can no longer access the corresponding memory:
>
> (qemu) x/x $pc
> c0000000000b742c: Cannot access memory
>
> To avoid this, let's filter out invalid SLB entries, as we
> already do for Book3S HV.
>
> Signed-off-by: Greg Kurz <groug@kaod.org>
This seems like a good idea, but to make it fully correct, don't we
also need to fully flush the SLB before inserting the new entries?
> ---
> arch/powerpc/kvm/book3s_pr.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index 3beb4ff469d1..cb6894e55f97 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -1328,8 +1328,10 @@ static int kvm_arch_vcpu_ioctl_set_sregs_pr(struct kvm_vcpu *vcpu,
> vcpu3s->sdr1 = sregs->u.s.sdr1;
> if (vcpu->arch.hflags & BOOK3S_HFLAG_SLB) {
> for (i = 0; i < 64; i++) {
> - vcpu->arch.mmu.slbmte(vcpu, sregs->u.s.ppc64.slb[i].slbv,
> - sregs->u.s.ppc64.slb[i].slbe);
> + u64 rb = sregs->u.s.ppc64.slb[i].slbe;
> + u64 rs = sregs->u.s.ppc64.slb[i].slbv;
> + if (rb & SLB_ESID_V)
> + vcpu->arch.mmu.slbmte(vcpu, rs, rb);
> }
> } else {
> for (i = 0; i < 16; i++) {
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
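
A sketch of the flush-then-insert ordering suggested here, assuming an
invalidate-all hook along the lines of the existing mmu callbacks; the
slbia name and signature are assumptions, not a confirmed KVM PR interface:

if (vcpu->arch.hflags & BOOK3S_HFLAG_SLB) {
	/* assumed hook: drop every stale entry before reloading */
	vcpu->arch.mmu.slbia(vcpu, true);
	for (i = 0; i < 64; i++) {
		u64 rb = sregs->u.s.ppc64.slb[i].slbe;
		u64 rs = sregs->u.s.ppc64.slb[i].slbv;

		/* then install only the entries marked valid */
		if (rb & SLB_ESID_V)
			vcpu->arch.mmu.slbmte(vcpu, rs, rb);
	}
}

With the flush in place, skipping invalid descriptors no longer risks
leaving stale entries behind, which addresses both halves of the concern.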
* Re: [PATCH] KVM: PPC: Book3S PR: only call slbmte for valid SLB entries
From: Michael Ellerman @ 2017-09-26 5:24 UTC
To: David Gibson, Greg Kurz
Cc: Alexey Kardashevskiy, linuxppc-dev, qemu-ppc, Paul Mackerras,
kvm-ppc
David Gibson <david@gibson.dropbear.id.au> writes:
> On Fri, Sep 22, 2017 at 11:34:29AM +0200, Greg Kurz wrote:
>> Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
>> some of which are valid (i.e., SLB_ESID_V is set) and the rest are
>> likely all-zeroes (with QEMU at least).
>>
>> Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
>> expects to find the SLB index in the low bits of its rb argument.
>> When passed zeroed arguments, it happily overwrites the 0th SLB entry
>> with zeroes. This is exactly what happens during live migration with
>> QEMU, when the destination pushes the incoming SLB descriptors to
>> KVM PR. When reloading the SLBs at the next synchronization, QEMU first
>> clears its SLB array and only restores the valid ones, but the 0th one
>> is now gone and we can no longer access the corresponding memory:
>>
>> (qemu) x/x $pc
>> c0000000000b742c: Cannot access memory
>>
>> To avoid this, let's filter out invalid SLB entries, as we
>> already do for Book3S HV.
>>
>> Signed-off-by: Greg Kurz <groug@kaod.org>
>
> This seems like a good idea, but to make it fully correct, don't we
> also need to fully flush the SLB before inserting the new entries?
We would need to do that, yeah.
But I don't think I like this patch: it would mean userspace has no way
of programming an invalid SLB entry. It's true that in general that
isn't something we care about doing, but the API should allow it.
For example, the kernel could leave invalid entries in place and flip the
valid bit when it wanted to make them valid, and this patch would
prevent that state from being migrated successfully, IIUIC.
cheers
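
For concreteness, a hypothetical userspace sketch of the pattern described
above; the vcpu file descriptor and the slbv value are made up, while the
ioctls and the sregs layout are the ones the patch operates on:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Pre-load SLB slot 3 with the valid bit deliberately clear, planning
 * to flip it on later.  With the patch applied, KVM_SET_SREGS would
 * silently skip this descriptor instead of installing it. */
static int push_latent_slb_entry(int vcpu_fd)
{
	struct kvm_sregs sregs;

	if (ioctl(vcpu_fd, KVM_GET_SREGS, &sregs) < 0)
		return -1;

	sregs.u.s.ppc64.slb[3].slbe = 3;	/* index only, SLB_ESID_V clear */
	sregs.u.s.ppc64.slb[3].slbv = 0x4000000000000400ULL;	/* made-up VSID/flags */

	return ioctl(vcpu_fd, KVM_SET_SREGS, &sregs);
}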
* Re: [PATCH] KVM: PPC: Book3S PR: only call slbmte for valid SLB entries
From: Paul Mackerras @ 2017-09-26 11:34 UTC
To: Michael Ellerman
Cc: David Gibson, Greg Kurz, Alexey Kardashevskiy, linuxppc-dev,
qemu-ppc, kvm-ppc
On Tue, Sep 26, 2017 at 03:24:05PM +1000, Michael Ellerman wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
>
> > On Fri, Sep 22, 2017 at 11:34:29AM +0200, Greg Kurz wrote:
> >> Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
> >> some of which are valid (i.e., SLB_ESID_V is set) and the rest are
> >> likely all-zeroes (with QEMU at least).
> >>
> >> Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
> >> expects to find the SLB index in the low bits of its rb argument.
> >> When passed zeroed arguments, it happily overwrites the 0th SLB entry
> >> with zeroes. This is exactly what happens during live migration with
> >> QEMU, when the destination pushes the incoming SLB descriptors to
> >> KVM PR. When reloading the SLBs at the next synchronization, QEMU first
> >> clears its SLB array and only restores the valid ones, but the 0th one
> >> is now gone and we can no longer access the corresponding memory:
> >>
> >> (qemu) x/x $pc
> >> c0000000000b742c: Cannot access memory
> >>
> >> To avoid this, let's filter out invalid SLB entries, as we
> >> already do for Book3S HV.
> >>
> >> Signed-off-by: Greg Kurz <groug@kaod.org>
> >
> > This seems like a good idea, but to make it fully correct, don't we
> > also need to fully flush the SLB before inserting the new entries?
>
> We would need to do that, yeah.
>
> But I don't think I like this patch: it would mean userspace has no way
> of programming an invalid SLB entry. It's true that in general that
> isn't something we care about doing, but the API should allow it.
>
> For example, the kernel could leave invalid entries in place and flip the
> valid bit when it wanted to make them valid, and this patch would
> prevent that state from being migrated successfully, IIUIC.
If I remember correctly, the architecture says that slbmfee/slbmfev
return all zeroes for an invalid entry, so there would be no way for
the guest kernel to do what you suggest.
Paul.
* Re: [PATCH] KVM: PPC: Book3S PR: only call slbmte for valid SLB entries
From: Michael Ellerman @ 2017-09-27 3:25 UTC
To: Paul Mackerras
Cc: David Gibson, Greg Kurz, Alexey Kardashevskiy, linuxppc-dev,
qemu-ppc, kvm-ppc
Paul Mackerras <paulus@ozlabs.org> writes:
> On Tue, Sep 26, 2017 at 03:24:05PM +1000, Michael Ellerman wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>>
>> > On Fri, Sep 22, 2017 at 11:34:29AM +0200, Greg Kurz wrote:
>> >> Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
>> >> some of which are valid (i.e., SLB_ESID_V is set) and the rest are
>> >> likely all-zeroes (with QEMU at least).
>> >>
>> >> Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
>> >> expects to find the SLB index in the low bits of its rb argument.
>> >> When passed zeroed arguments, it happily overwrites the 0th SLB entry
>> >> with zeroes. This is exactly what happens during live migration with
>> >> QEMU, when the destination pushes the incoming SLB descriptors to
>> >> KVM PR. When reloading the SLBs at the next synchronization, QEMU first
>> >> clears its SLB array and only restores the valid ones, but the 0th one
>> >> is now gone and we can no longer access the corresponding memory:
>> >>
>> >> (qemu) x/x $pc
>> >> c0000000000b742c: Cannot access memory
>> >>
>> >> To avoid this, let's filter out invalid SLB entries, as we
>> >> already do for Book3S HV.
>> >>
>> >> Signed-off-by: Greg Kurz <groug@kaod.org>
>> >
>> > This seems like a good idea, but to make it fully correct, don't we
>> > also need to fully flush the SLB before inserting the new entries?
>>
>> We would need to do that, yeah.
>>
>> But I don't think I like this patch: it would mean userspace has no way
>> of programming an invalid SLB entry. It's true that in general that
>> isn't something we care about doing, but the API should allow it.
>>
>> For example, the kernel could leave invalid entries in place and flip the
>> valid bit when it wanted to make them valid, and this patch would
>> prevent that state from being migrated successfully, IIUIC.
>
> If I remember correctly, the architecture says that slbmfee/slbmfev
> return all zeroes for an invalid entry, so there would be no way for
> the guest kernel to do what you suggest.
You're right, it does.
We have code in xmon that reads entries and then checks for SLB_ESID_V,
but I guess that's just overly pessimistic.
cheers
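
A sketch of such an xmon-style scan, reading each entry back and testing
SLB_ESID_V; loosely modelled on xmon's SLB dump rather than copied from it.
Given the architectural behaviour Paul describes, the check can only ever
skip slots that read back as all zeroes:

/* inside xmon, where printf is xmon's own implementation */
for (i = 0; i < mmu_slb_size; i++) {
	unsigned long esid, vsid;

	asm volatile("slbmfee %0,%1" : "=r" (esid) : "r" (i));
	asm volatile("slbmfev %0,%1" : "=r" (vsid) : "r" (i));

	/* invalid entries are architecturally all-zero, so this test
	 * never hides a "half-programmed" entry */
	if (!(esid & SLB_ESID_V))
		continue;

	printf("SLB[%02d]: esid=%016lx vsid=%016lx\n", i, esid, vsid);
}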