* [PATCH 01/27] Move dirty logging code to sub-arch
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 03/27] Add Book3s definitions Alexander Graf
` (15 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
PowerPC code handles dirty logging in the generic parts atm. While this
is great for "return -ENOTSUPP", we need to be rather target specific
when actually implementing it.
So let's split it to implementation specific code, so we can implement
it for book3s.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
arch/powerpc/kvm/booke.c | 5 +++++
arch/powerpc/kvm/powerpc.c | 5 -----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index e7bf4d0..06f5a9e 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -520,6 +520,11 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
return kvmppc_core_vcpu_translate(vcpu, tr);
}
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
+{
+ return -ENOTSUPP;
+}
+
int __init kvmppc_booke_init(void)
{
unsigned long ivor[16];
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 5902bbc..4ae3490 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -410,11 +410,6 @@ out:
return r;
}
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
-{
- return -ENOTSUPP;
-}
-
long kvm_arch_vm_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 03/27] Add Book3s definitions
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
2009-10-30 15:47 ` [PATCH 01/27] Move dirty logging code to sub-arch Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 07/27] Add book3s_64 highmem asm code Alexander Graf
` (14 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
We need quite a bunch of new constants for KVM on Book3s,
so let's define them now.
These constants will be used in later patches.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v3 -> v4
- remove old kernel compat code
---
arch/powerpc/include/asm/kvm_asm.h | 39 ++++++++++++++++++++++++++++++++++++
1 files changed, 39 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index 56bfae5..19ddb35 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -49,6 +49,45 @@
#define BOOKE_INTERRUPT_SPE_FP_ROUND 34
#define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35
+/* book3s */
+
+#define BOOK3S_INTERRUPT_SYSTEM_RESET 0x100
+#define BOOK3S_INTERRUPT_MACHINE_CHECK 0x200
+#define BOOK3S_INTERRUPT_DATA_STORAGE 0x300
+#define BOOK3S_INTERRUPT_DATA_SEGMENT 0x380
+#define BOOK3S_INTERRUPT_INST_STORAGE 0x400
+#define BOOK3S_INTERRUPT_INST_SEGMENT 0x480
+#define BOOK3S_INTERRUPT_EXTERNAL 0x500
+#define BOOK3S_INTERRUPT_ALIGNMENT 0x600
+#define BOOK3S_INTERRUPT_PROGRAM 0x700
+#define BOOK3S_INTERRUPT_FP_UNAVAIL 0x800
+#define BOOK3S_INTERRUPT_DECREMENTER 0x900
+#define BOOK3S_INTERRUPT_SYSCALL 0xc00
+#define BOOK3S_INTERRUPT_TRACE 0xd00
+#define BOOK3S_INTERRUPT_PERFMON 0xf00
+#define BOOK3S_INTERRUPT_ALTIVEC 0xf20
+#define BOOK3S_INTERRUPT_VSX 0xf40
+
+#define BOOK3S_IRQPRIO_SYSTEM_RESET 0
+#define BOOK3S_IRQPRIO_DATA_SEGMENT 1
+#define BOOK3S_IRQPRIO_INST_SEGMENT 2
+#define BOOK3S_IRQPRIO_DATA_STORAGE 3
+#define BOOK3S_IRQPRIO_INST_STORAGE 4
+#define BOOK3S_IRQPRIO_ALIGNMENT 5
+#define BOOK3S_IRQPRIO_PROGRAM 6
+#define BOOK3S_IRQPRIO_FP_UNAVAIL 7
+#define BOOK3S_IRQPRIO_ALTIVEC 8
+#define BOOK3S_IRQPRIO_VSX 9
+#define BOOK3S_IRQPRIO_SYSCALL 10
+#define BOOK3S_IRQPRIO_MACHINE_CHECK 11
+#define BOOK3S_IRQPRIO_DEBUG 12
+#define BOOK3S_IRQPRIO_EXTERNAL 13
+#define BOOK3S_IRQPRIO_DECREMENTER 14
+#define BOOK3S_IRQPRIO_PERFORMANCE_MONITOR 15
+#define BOOK3S_IRQPRIO_MAX 16
+
+#define BOOK3S_HFLAG_DCBZ32 0x1
+
#define RESUME_FLAG_NV (1<<0) /* Reload guest nonvolatile state? */
#define RESUME_FLAG_HOST (1<<1) /* Resume host? */
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 07/27] Add book3s_64 highmem asm code
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
2009-10-30 15:47 ` [PATCH 01/27] Move dirty logging code to sub-arch Alexander Graf
2009-10-30 15:47 ` [PATCH 03/27] Add Book3s definitions Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 08/27] Add SLB switching code for entry/exit Alexander Graf
` (13 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
This is the of entry / exit code. In order to switch between host and guest
context, we need to switch register state and call the exit code handler on
exit.
This assembly file does exactly that. To finally enter the guest it calls
into book3s_64_slb.S. On exit it gets jumped at from book3s_64_slb.S too.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v3 -> v4:
- header rename fix
---
arch/powerpc/include/asm/kvm_ppc.h | 1 +
arch/powerpc/kvm/book3s_64_interrupts.S | 392 +++++++++++++++++++++++++++++++
2 files changed, 393 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_64_interrupts.S
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 2c6ee34..269ee46 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -39,6 +39,7 @@ enum emulation_result {
extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
extern char kvmppc_handlers_start[];
extern unsigned long kvmppc_handler_len;
+extern void kvmppc_handler_highmem(void);
extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/book3s_64_interrupts.S b/arch/powerpc/kvm/book3s_64_interrupts.S
new file mode 100644
index 0000000..7b55d80
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_interrupts.S
@@ -0,0 +1,392 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+#define KVMPPC_HANDLE_EXIT .kvmppc_handle_exit
+#define ULONG_SIZE 8
+#define VCPU_GPR(n) (VCPU_GPRS + (n * ULONG_SIZE))
+
+.macro mfpaca tmp_reg, src_reg, offset, vcpu_reg
+ ld \tmp_reg, (PACA_EXMC+\offset)(r13)
+ std \tmp_reg, VCPU_GPR(\src_reg)(\vcpu_reg)
+.endm
+
+.macro DISABLE_INTERRUPTS
+ mfmsr r0
+ rldicl r0,r0,48,1
+ rotldi r0,r0,16
+ mtmsrd r0,1
+.endm
+
+/*****************************************************************************
+ * *
+ * Guest entry / exit code that is in kernel module memory (highmem) *
+ * *
+ ****************************************************************************/
+
+/* Registers:
+ * r3: kvm_run pointer
+ * r4: vcpu pointer
+ */
+_GLOBAL(__kvmppc_vcpu_entry)
+
+kvm_start_entry:
+ /* Write correct stack frame */
+ mflr r0
+ std r0,16(r1)
+
+ /* Save host state to the stack */
+ stdu r1, -SWITCH_FRAME_SIZE(r1)
+
+ /* Save r3 (kvm_run) and r4 (vcpu) */
+ SAVE_2GPRS(3, r1)
+
+ /* Save non-volatile registers (r14 - r31) */
+ SAVE_NVGPRS(r1)
+
+ /* Save LR */
+ mflr r14
+ std r14, _LINK(r1)
+
+/* XXX optimize non-volatile loading away */
+kvm_start_lightweight:
+
+ DISABLE_INTERRUPTS
+
+ /* Save R1/R2 in the PACA */
+ std r1, PACAR1(r13)
+ std r2, (PACA_EXMC+EX_SRR0)(r13)
+ ld r3, VCPU_HIGHMEM_HANDLER(r4)
+ std r3, PACASAVEDMSR(r13)
+
+ /* Load non-volatile guest state from the vcpu */
+ ld r14, VCPU_GPR(r14)(r4)
+ ld r15, VCPU_GPR(r15)(r4)
+ ld r16, VCPU_GPR(r16)(r4)
+ ld r17, VCPU_GPR(r17)(r4)
+ ld r18, VCPU_GPR(r18)(r4)
+ ld r19, VCPU_GPR(r19)(r4)
+ ld r20, VCPU_GPR(r20)(r4)
+ ld r21, VCPU_GPR(r21)(r4)
+ ld r22, VCPU_GPR(r22)(r4)
+ ld r23, VCPU_GPR(r23)(r4)
+ ld r24, VCPU_GPR(r24)(r4)
+ ld r25, VCPU_GPR(r25)(r4)
+ ld r26, VCPU_GPR(r26)(r4)
+ ld r27, VCPU_GPR(r27)(r4)
+ ld r28, VCPU_GPR(r28)(r4)
+ ld r29, VCPU_GPR(r29)(r4)
+ ld r30, VCPU_GPR(r30)(r4)
+ ld r31, VCPU_GPR(r31)(r4)
+
+ ld r9, VCPU_PC(r4) /* r9 = vcpu->arch.pc */
+ ld r10, VCPU_SHADOW_MSR(r4) /* r10 = vcpu->arch.shadow_msr */
+
+ ld r3, VCPU_TRAMPOLINE_ENTER(r4)
+ mtsrr0 r3
+
+ LOAD_REG_IMMEDIATE(r3, MSR_KERNEL & ~(MSR_IR | MSR_DR))
+ mtsrr1 r3
+
+ /* Load guest state in the respective registers */
+ lwz r3, VCPU_CR(r4) /* r3 = vcpu->arch.cr */
+ stw r3, (PACA_EXMC + EX_CCR)(r13)
+
+ ld r3, VCPU_CTR(r4) /* r3 = vcpu->arch.ctr */
+ mtctr r3 /* CTR = r3 */
+
+ ld r3, VCPU_LR(r4) /* r3 = vcpu->arch.lr */
+ mtlr r3 /* LR = r3 */
+
+ ld r3, VCPU_XER(r4) /* r3 = vcpu->arch.xer */
+ std r3, (PACA_EXMC + EX_R3)(r13)
+
+ /* Some guests may need to have dcbz set to 32 byte length.
+ *
+ * Usually we ensure that by patching the guest's instructions
+ * to trap on dcbz and emulate it in the hypervisor.
+ *
+ * If we can, we should tell the CPU to use 32 byte dcbz though,
+ * because that's a lot faster.
+ */
+
+ ld r3, VCPU_HFLAGS(r4)
+ rldicl. r3, r3, 0, 63 /* CR = ((r3 & 1) == 0) */
+ beq no_dcbz32_on
+
+ mfspr r3,SPRN_HID5
+ ori r3, r3, 0x80 /* XXX HID5_dcbz32 = 0x80 */
+ mtspr SPRN_HID5,r3
+
+no_dcbz32_on:
+ /* Load guest GPRs */
+
+ ld r3, VCPU_GPR(r9)(r4)
+ std r3, (PACA_EXMC + EX_R9)(r13)
+ ld r3, VCPU_GPR(r10)(r4)
+ std r3, (PACA_EXMC + EX_R10)(r13)
+ ld r3, VCPU_GPR(r11)(r4)
+ std r3, (PACA_EXMC + EX_R11)(r13)
+ ld r3, VCPU_GPR(r12)(r4)
+ std r3, (PACA_EXMC + EX_R12)(r13)
+ ld r3, VCPU_GPR(r13)(r4)
+ std r3, (PACA_EXMC + EX_R13)(r13)
+
+ ld r0, VCPU_GPR(r0)(r4)
+ ld r1, VCPU_GPR(r1)(r4)
+ ld r2, VCPU_GPR(r2)(r4)
+ ld r3, VCPU_GPR(r3)(r4)
+ ld r5, VCPU_GPR(r5)(r4)
+ ld r6, VCPU_GPR(r6)(r4)
+ ld r7, VCPU_GPR(r7)(r4)
+ ld r8, VCPU_GPR(r8)(r4)
+ ld r4, VCPU_GPR(r4)(r4)
+
+ /* This sets the Magic value for the trampoline */
+
+ li r11, 1
+ stb r11, PACA_KVM_IN_GUEST(r13)
+
+ /* Jump to SLB patching handlder and into our guest */
+ RFI
+
+/*
+ * This is the handler in module memory. It gets jumped at from the
+ * lowmem trampoline code, so it's basically the guest exit code.
+ *
+ */
+
+.global kvmppc_handler_highmem
+kvmppc_handler_highmem:
+
+ /*
+ * Register usage at this point:
+ *
+ * R00 = guest R13
+ * R01 = host R1
+ * R02 = host R2
+ * R10 = guest PC
+ * R11 = guest MSR
+ * R12 = exit handler id
+ * R13 = PACA
+ * PACA.exmc.R9 = guest R1
+ * PACA.exmc.R10 = guest R10
+ * PACA.exmc.R11 = guest R11
+ * PACA.exmc.R12 = guest R12
+ * PACA.exmc.R13 = guest R2
+ * PACA.exmc.DAR = guest DAR
+ * PACA.exmc.DSISR = guest DSISR
+ * PACA.exmc.LR = guest instruction
+ * PACA.exmc.CCR = guest CR
+ * PACA.exmc.SRR0 = guest R0
+ *
+ */
+
+ std r3, (PACA_EXMC+EX_R3)(r13)
+
+ /* save the exit id in R3 */
+ mr r3, r12
+
+ /* R12 = vcpu */
+ ld r12, GPR4(r1)
+
+ /* Now save the guest state */
+
+ std r0, VCPU_GPR(r13)(r12)
+ std r4, VCPU_GPR(r4)(r12)
+ std r5, VCPU_GPR(r5)(r12)
+ std r6, VCPU_GPR(r6)(r12)
+ std r7, VCPU_GPR(r7)(r12)
+ std r8, VCPU_GPR(r8)(r12)
+ std r9, VCPU_GPR(r9)(r12)
+
+ /* get registers from PACA */
+ mfpaca r5, r0, EX_SRR0, r12
+ mfpaca r5, r3, EX_R3, r12
+ mfpaca r5, r1, EX_R9, r12
+ mfpaca r5, r10, EX_R10, r12
+ mfpaca r5, r11, EX_R11, r12
+ mfpaca r5, r12, EX_R12, r12
+ mfpaca r5, r2, EX_R13, r12
+
+ lwz r5, (PACA_EXMC+EX_LR)(r13)
+ stw r5, VCPU_LAST_INST(r12)
+
+ lwz r5, (PACA_EXMC+EX_CCR)(r13)
+ stw r5, VCPU_CR(r12)
+
+ ld r5, VCPU_HFLAGS(r12)
+ rldicl. r5, r5, 0, 63 /* CR = ((r5 & 1) == 0) */
+ beq no_dcbz32_off
+
+ mfspr r5,SPRN_HID5
+ rldimi r5,r5,6,56
+ mtspr SPRN_HID5,r5
+
+no_dcbz32_off:
+
+ /* XXX maybe skip on lightweight? */
+ std r14, VCPU_GPR(r14)(r12)
+ std r15, VCPU_GPR(r15)(r12)
+ std r16, VCPU_GPR(r16)(r12)
+ std r17, VCPU_GPR(r17)(r12)
+ std r18, VCPU_GPR(r18)(r12)
+ std r19, VCPU_GPR(r19)(r12)
+ std r20, VCPU_GPR(r20)(r12)
+ std r21, VCPU_GPR(r21)(r12)
+ std r22, VCPU_GPR(r22)(r12)
+ std r23, VCPU_GPR(r23)(r12)
+ std r24, VCPU_GPR(r24)(r12)
+ std r25, VCPU_GPR(r25)(r12)
+ std r26, VCPU_GPR(r26)(r12)
+ std r27, VCPU_GPR(r27)(r12)
+ std r28, VCPU_GPR(r28)(r12)
+ std r29, VCPU_GPR(r29)(r12)
+ std r30, VCPU_GPR(r30)(r12)
+ std r31, VCPU_GPR(r31)(r12)
+
+ /* Restore non-volatile host registers (r14 - r31) */
+ REST_NVGPRS(r1)
+
+ /* Save guest PC (R10) */
+ std r10, VCPU_PC(r12)
+
+ /* Save guest msr (R11) */
+ std r11, VCPU_SHADOW_MSR(r12)
+
+ /* Save guest CTR (in R12) */
+ mfctr r5
+ std r5, VCPU_CTR(r12)
+
+ /* Save guest LR */
+ mflr r5
+ std r5, VCPU_LR(r12)
+
+ /* Save guest XER */
+ mfxer r5
+ std r5, VCPU_XER(r12)
+
+ /* Save guest DAR */
+ ld r5, (PACA_EXMC+EX_DAR)(r13)
+ std r5, VCPU_FAULT_DEAR(r12)
+
+ /* Save guest DSISR */
+ lwz r5, (PACA_EXMC+EX_DSISR)(r13)
+ std r5, VCPU_FAULT_DSISR(r12)
+
+ /* Restore host msr -> SRR1 */
+ ld r7, VCPU_HOST_MSR(r12)
+ mtsrr1 r7
+
+ /* Restore host IP -> SRR0 */
+ ld r6, VCPU_HOST_RETIP(r12)
+ mtsrr0 r6
+
+ /*
+ * For some interrupts, we need to call the real Linux
+ * handler, so it can do work for us. This has to happen
+ * as if the interrupt arrived from the kernel though,
+ * so let's fake it here where most state is restored.
+ *
+ * Call Linux for hardware interrupts/decrementer
+ * r3 = address of interrupt handler (exit reason)
+ */
+
+ cmpwi r3, BOOK3S_INTERRUPT_EXTERNAL
+ beq call_linux_handler
+ cmpwi r3, BOOK3S_INTERRUPT_DECREMENTER
+ beq call_linux_handler
+
+ /* Back to Interruptable Mode! (goto kvm_return_point) */
+ RFI
+
+call_linux_handler:
+
+ /*
+ * If we land here we need to jump back to the handler we
+ * came from.
+ *
+ * We have a page that we can access from real mode, so let's
+ * jump back to that and use it as a trampoline to get back into the
+ * interrupt handler!
+ *
+ * R3 still contains the exit code,
+ * R6 VCPU_HOST_RETIP and
+ * R7 VCPU_HOST_MSR
+ */
+
+ mtlr r3
+
+ ld r5, VCPU_TRAMPOLINE_LOWMEM(r12)
+ mtsrr0 r5
+ LOAD_REG_IMMEDIATE(r5, MSR_KERNEL & ~(MSR_IR | MSR_DR))
+ mtsrr1 r5
+
+ RFI
+
+.global kvm_return_point
+kvm_return_point:
+
+ /* Jump back to lightweight entry if we're supposed to */
+ /* go back into the guest */
+ mr r5, r3
+ /* Restore r3 (kvm_run) and r4 (vcpu) */
+ REST_2GPRS(3, r1)
+ bl KVMPPC_HANDLE_EXIT
+
+#if 0 /* XXX get lightweight exits back */
+ cmpwi r3, RESUME_GUEST
+ bne kvm_exit_heavyweight
+
+ /* put VCPU and KVM_RUN back into place and roll again! */
+ REST_2GPRS(3, r1)
+ b kvm_start_lightweight
+
+kvm_exit_heavyweight:
+ /* Restore non-volatile host registers */
+ ld r14, _LINK(r1)
+ mtlr r14
+ REST_NVGPRS(r1)
+
+ addi r1, r1, SWITCH_FRAME_SIZE
+#else
+ ld r4, _LINK(r1)
+ mtlr r4
+
+ cmpwi r3, RESUME_GUEST
+ bne kvm_exit_heavyweight
+
+ REST_2GPRS(3, r1)
+
+ addi r1, r1, SWITCH_FRAME_SIZE
+
+ b kvm_start_entry
+
+kvm_exit_heavyweight:
+
+ addi r1, r1, SWITCH_FRAME_SIZE
+#endif
+
+ blr
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 08/27] Add SLB switching code for entry/exit
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (2 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 07/27] Add book3s_64 highmem asm code Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-11-01 23:23 ` Michael Neuling
2009-10-30 15:47 ` [PATCH 09/27] Add interrupt handling code Alexander Graf
` (12 subsequent siblings)
16 siblings, 1 reply; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
This is the really low level of guest entry/exit code.
Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
currently aware of.
The segments in the guest differ from the ones on the host, so we need
to switch the SLB to tell the MMU that we're in a new context.
So we store a shadow of the guest's SLB in the PACA, switch to that on
entry and only restore bolted entries on exit, leaving the rest to the
Linux SLB fault handler.
That way we get a really clean way of switching the SLB.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
arch/powerpc/kvm/book3s_64_slb.S | 277 ++++++++++++++++++++++++++++++++++++++
1 files changed, 277 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_64_slb.S
diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_slb.S
new file mode 100644
index 0000000..00a8367
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_slb.S
@@ -0,0 +1,277 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ */
+
+/******************************************************************************
+ * *
+ * Entry code *
+ * *
+ *****************************************************************************/
+
+.global kvmppc_handler_trampoline_enter
+kvmppc_handler_trampoline_enter:
+
+ /* Required state:
+ *
+ * MSR = ~IR|DR
+ * R13 = PACA
+ * R9 = guest IP
+ * R10 = guest MSR
+ * R11 = free
+ * R12 = free
+ * PACA[PACA_EXMC + EX_R9] = guest R9
+ * PACA[PACA_EXMC + EX_R10] = guest R10
+ * PACA[PACA_EXMC + EX_R11] = guest R11
+ * PACA[PACA_EXMC + EX_R12] = guest R12
+ * PACA[PACA_EXMC + EX_R13] = guest R13
+ * PACA[PACA_EXMC + EX_CCR] = guest CR
+ * PACA[PACA_EXMC + EX_R3] = guest XER
+ */
+
+ mtsrr0 r9
+ mtsrr1 r10
+
+ mtspr SPRN_SPRG_SCRATCH0, r0
+
+ /* Remove LPAR shadow entries */
+
+#if SLB_NUM_BOLTED == 3
+
+ ld r12, PACA_SLBSHADOWPTR(r13)
+ ld r10, 0x10(r12)
+ ld r11, 0x18(r12)
+ /* Invalid? Skip. */
+ rldicl. r0, r10, 37, 63
+ beq slb_entry_skip_1
+ xoris r9, r10, SLB_ESID_V@h
+ std r9, 0x10(r12)
+slb_entry_skip_1:
+ ld r9, 0x20(r12)
+ /* Invalid? Skip. */
+ rldicl. r0, r9, 37, 63
+ beq slb_entry_skip_2
+ xoris r9, r9, SLB_ESID_V@h
+ std r9, 0x20(r12)
+slb_entry_skip_2:
+ ld r9, 0x30(r12)
+ /* Invalid? Skip. */
+ rldicl. r0, r9, 37, 63
+ beq slb_entry_skip_3
+ xoris r9, r9, SLB_ESID_V@h
+ std r9, 0x30(r12)
+slb_entry_skip_3:
+
+#else
+#error unknown number of bolted entries
+#endif
+
+ /* Flush SLB */
+
+ slbia
+
+ /* r0 = esid & ESID_MASK */
+ rldicr r10, r10, 0, 35
+ /* r0 |= CLASS_BIT(VSID) */
+ rldic r12, r11, 56 - 36, 36
+ or r10, r10, r12
+ slbie r10
+
+ isync
+
+ /* Fill SLB with our shadow */
+
+ lbz r12, PACA_KVM_SLB_MAX(r13)
+ mulli r12, r12, 16
+ addi r12, r12, PACA_KVM_SLB
+ add r12, r12, r13
+
+ /* for (r11 = kvm_slb; r11 < kvm_slb + kvm_slb_size; r11+=slb_entry) */
+ li r11, PACA_KVM_SLB
+ add r11, r11, r13
+
+slb_loop_enter:
+
+ ld r10, 0(r11)
+
+ rldicl. r0, r10, 37, 63
+ beq slb_loop_enter_skip
+
+ ld r9, 8(r11)
+ slbmte r9, r10
+
+slb_loop_enter_skip:
+ addi r11, r11, 16
+ cmpd cr0, r11, r12
+ blt slb_loop_enter
+
+slb_do_enter:
+
+ /* Enter guest */
+
+ mfspr r0, SPRN_SPRG_SCRATCH0
+
+ ld r9, (PACA_EXMC+EX_R9)(r13)
+ ld r10, (PACA_EXMC+EX_R10)(r13)
+ ld r12, (PACA_EXMC+EX_R12)(r13)
+
+ lwz r11, (PACA_EXMC+EX_CCR)(r13)
+ mtcr r11
+
+ ld r11, (PACA_EXMC+EX_R3)(r13)
+ mtxer r11
+
+ ld r11, (PACA_EXMC+EX_R11)(r13)
+ ld r13, (PACA_EXMC+EX_R13)(r13)
+
+ RFI
+kvmppc_handler_trampoline_enter_end:
+
+
+
+/******************************************************************************
+ * *
+ * Exit code *
+ * *
+ *****************************************************************************/
+
+.global kvmppc_handler_trampoline_exit
+kvmppc_handler_trampoline_exit:
+
+ /* Register usage at this point:
+ *
+ * SPRG_SCRATCH0 = guest R13
+ * R01 = host R1
+ * R02 = host R2
+ * R10 = guest PC
+ * R11 = guest MSR
+ * R12 = exit handler id
+ * R13 = PACA
+ * PACA.exmc.CCR = guest CR
+ * PACA.exmc.R9 = guest R1
+ * PACA.exmc.R10 = guest R10
+ * PACA.exmc.R11 = guest R11
+ * PACA.exmc.R12 = guest R12
+ * PACA.exmc.R13 = guest R2
+ *
+ */
+
+ /* Save registers */
+
+ std r0, (PACA_EXMC+EX_SRR0)(r13)
+ std r9, (PACA_EXMC+EX_R3)(r13)
+ std r10, (PACA_EXMC+EX_LR)(r13)
+ std r11, (PACA_EXMC+EX_DAR)(r13)
+
+ /*
+ * In order for us to easily get the last instruction,
+ * we got the #vmexit at, we exploit the fact that the
+ * virtual layout is still the same here, so we can just
+ * ld from the guest's PC address
+ */
+
+ /* We only load the last instruction when it's safe */
+ cmpwi r12, BOOK3S_INTERRUPT_DATA_STORAGE
+ beq ld_last_inst
+ cmpwi r12, BOOK3S_INTERRUPT_PROGRAM
+ beq ld_last_inst
+
+ b no_ld_last_inst
+
+ld_last_inst:
+ /* Save off the guest instruction we're at */
+ /* 1) enable paging for data */
+ mfmsr r9
+ ori r11, r9, MSR_DR /* Enable paging for data */
+ mtmsr r11
+ /* 2) fetch the instruction */
+ lwz r0, 0(r10)
+ /* 3) disable paging again */
+ mtmsr r9
+
+no_ld_last_inst:
+
+ /* Restore bolted entries from the shadow and fix it along the way */
+
+ /* We don't store anything in entry 0, so we don't need to take care of that */
+ slbia
+ isync
+
+#if SLB_NUM_BOLTED == 3
+
+ ld r11, PACA_SLBSHADOWPTR(r13)
+
+ ld r10, 0x10(r11)
+ cmpdi r10, 0
+ beq slb_exit_skip_1
+ oris r10, r10, SLB_ESID_V@h
+ ld r9, 0x18(r11)
+ slbmte r9, r10
+ std r10, 0x10(r11)
+slb_exit_skip_1:
+
+ ld r10, 0x20(r11)
+ cmpdi r10, 0
+ beq slb_exit_skip_2
+ oris r10, r10, SLB_ESID_V@h
+ ld r9, 0x28(r11)
+ slbmte r9, r10
+ std r10, 0x20(r11)
+slb_exit_skip_2:
+
+ ld r10, 0x30(r11)
+ cmpdi r10, 0
+ beq slb_exit_skip_3
+ oris r10, r10, SLB_ESID_V@h
+ ld r9, 0x38(r11)
+ slbmte r9, r10
+ std r10, 0x30(r11)
+slb_exit_skip_3:
+
+#else
+#error unknown number of bolted entries
+#endif
+
+slb_do_exit:
+
+ /* Restore registers */
+
+ ld r11, (PACA_EXMC+EX_DAR)(r13)
+ ld r10, (PACA_EXMC+EX_LR)(r13)
+ ld r9, (PACA_EXMC+EX_R3)(r13)
+
+ /* Save last inst */
+ stw r0, (PACA_EXMC+EX_LR)(r13)
+
+ /* Save DAR and DSISR before going to paged mode */
+ mfdar r0
+ std r0, (PACA_EXMC+EX_DAR)(r13)
+ mfdsisr r0
+ stw r0, (PACA_EXMC+EX_DSISR)(r13)
+
+ /* RFI into the highmem handler */
+ mfmsr r0
+ ori r0, r0, MSR_IR|MSR_DR|MSR_RI /* Enable paging */
+ mtsrr1 r0
+ ld r0, PACASAVEDMSR(r13) /* Highmem handler address */
+ mtsrr0 r0
+
+ mfspr r0, SPRN_SPRG_SCRATCH0
+
+ RFI
+kvmppc_handler_trampoline_exit_end:
+
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* Re: [PATCH 08/27] Add SLB switching code for entry/exit
2009-10-30 15:47 ` [PATCH 08/27] Add SLB switching code for entry/exit Alexander Graf
@ 2009-11-01 23:23 ` Michael Neuling
[not found] ` <6695.1257117827-/owAOxkjmzZAfugRpC6u6w@public.gmane.org>
0 siblings, 1 reply; 48+ messages in thread
From: Michael Neuling @ 2009-11-01 23:23 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm, Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
> This is the really low level of guest entry/exit code.
>
> Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
> currently aware of.
>
> The segments in the guest differ from the ones on the host, so we need
> to switch the SLB to tell the MMU that we're in a new context.
>
> So we store a shadow of the guest's SLB in the PACA, switch to that on
> entry and only restore bolted entries on exit, leaving the rest to the
> Linux SLB fault handler.
>
> That way we get a really clean way of switching the SLB.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
> arch/powerpc/kvm/book3s_64_slb.S | 277 ++++++++++++++++++++++++++++++++++++
++
> 1 files changed, 277 insertions(+), 0 deletions(-)
> create mode 100644 arch/powerpc/kvm/book3s_64_slb.S
>
> diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_sl
b.S
> new file mode 100644
> index 0000000..00a8367
> --- /dev/null
> +++ b/arch/powerpc/kvm/book3s_64_slb.S
> @@ -0,0 +1,277 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
> + *
> + * Copyright SUSE Linux Products GmbH 2009
> + *
> + * Authors: Alexander Graf <agraf@suse.de>
> + */
> +
> +/***************************************************************************
***
> + *
*
> + * Entry code
*
> + *
*
> + ***************************************************************************
**/
> +
> +.global kvmppc_handler_trampoline_enter
> +kvmppc_handler_trampoline_enter:
> +
> + /* Required state:
> + *
> + * MSR = ~IR|DR
> + * R13 = PACA
> + * R9 = guest IP
> + * R10 = guest MSR
> + * R11 = free
> + * R12 = free
> + * PACA[PACA_EXMC + EX_R9] = guest R9
> + * PACA[PACA_EXMC + EX_R10] = guest R10
> + * PACA[PACA_EXMC + EX_R11] = guest R11
> + * PACA[PACA_EXMC + EX_R12] = guest R12
> + * PACA[PACA_EXMC + EX_R13] = guest R13
> + * PACA[PACA_EXMC + EX_CCR] = guest CR
> + * PACA[PACA_EXMC + EX_R3] = guest XER
> + */
> +
> + mtsrr0 r9
> + mtsrr1 r10
> +
> + mtspr SPRN_SPRG_SCRATCH0, r0
> +
> + /* Remove LPAR shadow entries */
> +
> +#if SLB_NUM_BOLTED == 3
You could alternatively check the persistent entry in the slb_shawdow
buffer. This would give you a run time check. Not sure what's best
though.
> +
> + ld r12, PACA_SLBSHADOWPTR(r13)
> + ld r10, 0x10(r12)
> + ld r11, 0x18(r12)
Can you define something in asm-offsets.c for these magic constants 0x10
and 0x18. Similarly below.
> + /* Invalid? Skip. */
> + rldicl. r0, r10, 37, 63
> + beq slb_entry_skip_1
> + xoris r9, r10, SLB_ESID_V@h
> + std r9, 0x10(r12)
> +slb_entry_skip_1:
> + ld r9, 0x20(r12)
> + /* Invalid? Skip. */
> + rldicl. r0, r9, 37, 63
> + beq slb_entry_skip_2
> + xoris r9, r9, SLB_ESID_V@h
> + std r9, 0x20(r12)
> +slb_entry_skip_2:
> + ld r9, 0x30(r12)
> + /* Invalid? Skip. */
> + rldicl. r0, r9, 37, 63
> + beq slb_entry_skip_3
> + xoris r9, r9, SLB_ESID_V@h
> + std r9, 0x30(r12)
Can these 3 be made into a macro?
> +slb_entry_skip_3:
> +
> +#else
> +#error unknown number of bolted entries
> +#endif
> +
> + /* Flush SLB */
> +
> + slbia
> +
> + /* r0 = esid & ESID_MASK */
> + rldicr r10, r10, 0, 35
> + /* r0 |= CLASS_BIT(VSID) */
> + rldic r12, r11, 56 - 36, 36
> + or r10, r10, r12
> + slbie r10
> +
> + isync
> +
> + /* Fill SLB with our shadow */
> +
> + lbz r12, PACA_KVM_SLB_MAX(r13)
> + mulli r12, r12, 16
> + addi r12, r12, PACA_KVM_SLB
> + add r12, r12, r13
> +
> + /* for (r11 = kvm_slb; r11 < kvm_slb + kvm_slb_size; r11+=slb_entry) */
> + li r11, PACA_KVM_SLB
> + add r11, r11, r13
> +
> +slb_loop_enter:
> +
> + ld r10, 0(r11)
> +
> + rldicl. r0, r10, 37, 63
> + beq slb_loop_enter_skip
> +
> + ld r9, 8(r11)
> + slbmte r9, r10
If you're updating the first 3 slbs, you need to make sure the slb
shadow is updated at the same time (BTW dumb question: can we run this
under PHYP?)
> +
> +slb_loop_enter_skip:
> + addi r11, r11, 16
> + cmpd cr0, r11, r12
> + blt slb_loop_enter
> +
> +slb_do_enter:
> +
> + /* Enter guest */
> +
> + mfspr r0, SPRN_SPRG_SCRATCH0
> +
> + ld r9, (PACA_EXMC+EX_R9)(r13)
> + ld r10, (PACA_EXMC+EX_R10)(r13)
> + ld r12, (PACA_EXMC+EX_R12)(r13)
> +
> + lwz r11, (PACA_EXMC+EX_CCR)(r13)
> + mtcr r11
> +
> + ld r11, (PACA_EXMC+EX_R3)(r13)
> + mtxer r11
> +
> + ld r11, (PACA_EXMC+EX_R11)(r13)
> + ld r13, (PACA_EXMC+EX_R13)(r13)
> +
> + RFI
> +kvmppc_handler_trampoline_enter_end:
> +
> +
> +
> +/***************************************************************************
***
> + *
*
> + * Exit code
*
> + *
*
> + ***************************************************************************
**/
> +
> +.global kvmppc_handler_trampoline_exit
> +kvmppc_handler_trampoline_exit:
> +
> + /* Register usage at this point:
> + *
> + * SPRG_SCRATCH0 = guest R13
> + * R01 = host R1
> + * R02 = host R2
> + * R10 = guest PC
> + * R11 = guest MSR
> + * R12 = exit handler id
> + * R13 = PACA
> + * PACA.exmc.CCR = guest CR
> + * PACA.exmc.R9 = guest R1
> + * PACA.exmc.R10 = guest R10
> + * PACA.exmc.R11 = guest R11
> + * PACA.exmc.R12 = guest R12
> + * PACA.exmc.R13 = guest R2
> + *
> + */
> +
> + /* Save registers */
> +
> + std r0, (PACA_EXMC+EX_SRR0)(r13)
> + std r9, (PACA_EXMC+EX_R3)(r13)
> + std r10, (PACA_EXMC+EX_LR)(r13)
> + std r11, (PACA_EXMC+EX_DAR)(r13)
> +
> + /*
> + * In order for us to easily get the last instruction,
> + * we got the #vmexit at, we exploit the fact that the
> + * virtual layout is still the same here, so we can just
> + * ld from the guest's PC address
> + */
> +
> + /* We only load the last instruction when it's safe */
> + cmpwi r12, BOOK3S_INTERRUPT_DATA_STORAGE
> + beq ld_last_inst
> + cmpwi r12, BOOK3S_INTERRUPT_PROGRAM
> + beq ld_last_inst
> +
> + b no_ld_last_inst
> +
> +ld_last_inst:
> + /* Save off the guest instruction we're at */
> + /* 1) enable paging for data */
> + mfmsr r9
> + ori r11, r9, MSR_DR /* Enable paging for data */
> + mtmsr r11
> + /* 2) fetch the instruction */
> + lwz r0, 0(r10)
> + /* 3) disable paging again */
> + mtmsr r9
> +
> +no_ld_last_inst:
> +
> + /* Restore bolted entries from the shadow and fix it along the way */
> +
> + /* We don't store anything in entry 0, so we don't need to take care of
that */
> + slbia
> + isync
> +
> +#if SLB_NUM_BOLTED == 3
> +
> + ld r11, PACA_SLBSHADOWPTR(r13)
> +
> + ld r10, 0x10(r11)
> + cmpdi r10, 0
> + beq slb_exit_skip_1
> + oris r10, r10, SLB_ESID_V@h
> + ld r9, 0x18(r11)
> + slbmte r9, r10
> + std r10, 0x10(r11)
> +slb_exit_skip_1:
> +
> + ld r10, 0x20(r11)
> + cmpdi r10, 0
> + beq slb_exit_skip_2
> + oris r10, r10, SLB_ESID_V@h
> + ld r9, 0x28(r11)
> + slbmte r9, r10
> + std r10, 0x20(r11)
> +slb_exit_skip_2:
> +
> + ld r10, 0x30(r11)
> + cmpdi r10, 0
> + beq slb_exit_skip_3
> + oris r10, r10, SLB_ESID_V@h
> + ld r9, 0x38(r11)
> + slbmte r9, r10
> + std r10, 0x30(r11)
> +slb_exit_skip_3:
Again, a macro here?
> +
> +#else
> +#error unknown number of bolted entries
> +#endif
> +
> +slb_do_exit:
> +
> + /* Restore registers */
> +
> + ld r11, (PACA_EXMC+EX_DAR)(r13)
> + ld r10, (PACA_EXMC+EX_LR)(r13)
> + ld r9, (PACA_EXMC+EX_R3)(r13)
> +
> + /* Save last inst */
> + stw r0, (PACA_EXMC+EX_LR)(r13)
> +
> + /* Save DAR and DSISR before going to paged mode */
> + mfdar r0
> + std r0, (PACA_EXMC+EX_DAR)(r13)
> + mfdsisr r0
> + stw r0, (PACA_EXMC+EX_DSISR)(r13)
> +
> + /* RFI into the highmem handler */
> + mfmsr r0
> + ori r0, r0, MSR_IR|MSR_DR|MSR_RI /* Enable paging */
> + mtsrr1 r0
> + ld r0, PACASAVEDMSR(r13) /* Highmem handler address */
> + mtsrr0 r0
> +
> + mfspr r0, SPRN_SPRG_SCRATCH0
> +
> + RFI
> +kvmppc_handler_trampoline_exit_end:
> +
> --
> 1.6.0.2
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 09/27] Add interrupt handling code
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (3 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 08/27] Add SLB switching code for entry/exit Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 10/27] Add book3s.c Alexander Graf
` (11 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
Getting from host state to the guest is only half the story. We also need
to return to our host context and handle whatever happened to get us out of
the guest.
On PowerPC every guest exit is an interrupt. So all we need to do is trap
the host's interrupt handlers and get into our #VMEXIT code to handle it.
PowerPCs also have a register that can add an offset to the interrupt handlers'
adresses which is what the booke KVM code uses. Unfortunately that is a
hypervisor ressource and we also want to be able to run KVM when we're running
in an LPAR. So we have to hook into the Linux interrupt handlers.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v3 -> v4:
- header rename fix
---
arch/powerpc/kvm/book3s_64_rmhandlers.S | 131 +++++++++++++++++++++++++++++++
1 files changed, 131 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S
diff --git a/arch/powerpc/kvm/book3s_64_rmhandlers.S b/arch/powerpc/kvm/book3s_64_rmhandlers.S
new file mode 100644
index 0000000..fb7dd2e
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_rmhandlers.S
@@ -0,0 +1,131 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+/*****************************************************************************
+ * *
+ * Real Mode handlers that need to be in low physical memory *
+ * *
+ ****************************************************************************/
+
+
+.macro INTERRUPT_TRAMPOLINE intno
+
+.global kvmppc_trampoline_\intno
+kvmppc_trampoline_\intno:
+
+ mtspr SPRN_SPRG_SCRATCH0, r13 /* Save r13 */
+
+ /*
+ * First thing to do is to find out if we're coming
+ * from a KVM guest or a Linux process.
+ *
+ * To distinguish, we check a magic byte in the PACA
+ */
+ mfspr r13, SPRN_SPRG_PACA /* r13 = PACA */
+ std r12, (PACA_EXMC + EX_R12)(r13)
+ mfcr r12
+ stw r12, (PACA_EXMC + EX_CCR)(r13)
+ lbz r12, PACA_KVM_IN_GUEST(r13)
+ cmpwi r12, 0
+ bne ..kvmppc_handler_hasmagic_\intno
+ /* No KVM guest? Then jump back to the Linux handler! */
+ lwz r12, (PACA_EXMC + EX_CCR)(r13)
+ mtcr r12
+ ld r12, (PACA_EXMC + EX_R12)(r13)
+ mfspr r13, SPRN_SPRG_SCRATCH0 /* r13 = original r13 */
+ b kvmppc_resume_\intno /* Get back original handler */
+
+ /* Now we know we're handling a KVM guest */
+..kvmppc_handler_hasmagic_\intno:
+ /* Unset guest state */
+ li r12, 0
+ stb r12, PACA_KVM_IN_GUEST(r13)
+
+ std r1, (PACA_EXMC+EX_R9)(r13)
+ std r10, (PACA_EXMC+EX_R10)(r13)
+ std r11, (PACA_EXMC+EX_R11)(r13)
+ std r2, (PACA_EXMC+EX_R13)(r13)
+
+ mfsrr0 r10
+ mfsrr1 r11
+
+ /* Restore R1/R2 so we can handle faults */
+ ld r1, PACAR1(r13)
+ ld r2, (PACA_EXMC+EX_SRR0)(r13)
+
+ /* Let's store which interrupt we're handling */
+ li r12, \intno
+
+ /* Jump into the SLB exit code that goes to the highmem handler */
+ b kvmppc_handler_trampoline_exit
+
+.endm
+
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_SYSTEM_RESET
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_MACHINE_CHECK
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_DATA_STORAGE
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_DATA_SEGMENT
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_INST_STORAGE
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_INST_SEGMENT
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_EXTERNAL
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_ALIGNMENT
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_PROGRAM
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_FP_UNAVAIL
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_DECREMENTER
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_SYSCALL
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_TRACE
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_PERFMON
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_ALTIVEC
+INTERRUPT_TRAMPOLINE BOOK3S_INTERRUPT_VSX
+
+/*
+ * This trampoline brings us back to a real mode handler
+ *
+ * Input Registers:
+ *
+ * R6 = SRR0
+ * R7 = SRR1
+ * LR = real-mode IP
+ *
+ */
+.global kvmppc_handler_lowmem_trampoline
+kvmppc_handler_lowmem_trampoline:
+
+ mtsrr0 r6
+ mtsrr1 r7
+ blr
+kvmppc_handler_lowmem_trampoline_end:
+
+.global kvmppc_trampoline_lowmem
+kvmppc_trampoline_lowmem:
+ .long kvmppc_handler_lowmem_trampoline - _stext
+
+.global kvmppc_trampoline_enter
+kvmppc_trampoline_enter:
+ .long kvmppc_handler_trampoline_enter - _stext
+
+#include "book3s_64_slb.S"
+
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 10/27] Add book3s.c
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (4 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 09/27] Add interrupt handling code Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 12/27] Add book3s_64 guest MMU Alexander Graf
` (10 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
This adds the book3s core handling file. Here everything that is generic to
desktop PowerPC cores is handled, including interrupt injections, MSR settings,
etc.
It basically takes over the same role as booke.c for embedded PowerPCs.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v3 -> v4:
- use context_id instead of mm_alloc
v4 -> v5:
- make pvr 32 bits
v5 -> v6:
- use /* */ instead of //
- use kvm_for_each_vcpu
---
arch/powerpc/kvm/book3s.c | 925 +++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 925 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s.c
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
new file mode 100644
index 0000000..42037d4
--- /dev/null
+++ b/arch/powerpc/kvm/book3s.c
@@ -0,0 +1,925 @@
+/*
+ * Copyright (C) 2009. SUSE Linux Products GmbH. All rights reserved.
+ *
+ * Authors:
+ * Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ * Kevin Wolf <mail-vbj5DHeKsUHgbcAU4aOf7A@public.gmane.org>
+ *
+ * Description:
+ * This file is derived from arch/powerpc/kvm/44x.c,
+ * by Hollis Blanchard <hollisb-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/err.h>
+
+#include <asm/reg.h>
+#include <asm/cputable.h>
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+#include <asm/uaccess.h>
+#include <asm/io.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+#include <asm/mmu_context.h>
+#include <linux/sched.h>
+#include <linux/vmalloc.h>
+
+#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+
+/* #define EXIT_DEBUG */
+/* #define EXIT_DEBUG_SIMPLE */
+
+/* Without AGGRESSIVE_DEC we only fire off a DEC interrupt when DEC turns 0.
+ * When set, we retrigger a DEC interrupt after that if DEC <= 0.
+ * PPC32 Linux runs faster without AGGRESSIVE_DEC, PPC64 Linux requires it. */
+
+/* #define AGGRESSIVE_DEC */
+
+struct kvm_stats_debugfs_item debugfs_entries[] = {
+ { "exits", VCPU_STAT(sum_exits) },
+ { "mmio", VCPU_STAT(mmio_exits) },
+ { "sig", VCPU_STAT(signal_exits) },
+ { "sysc", VCPU_STAT(syscall_exits) },
+ { "inst_emu", VCPU_STAT(emulated_inst_exits) },
+ { "dec", VCPU_STAT(dec_exits) },
+ { "ext_intr", VCPU_STAT(ext_intr_exits) },
+ { "queue_intr", VCPU_STAT(queue_intr) },
+ { "halt_wakeup", VCPU_STAT(halt_wakeup) },
+ { "pf_storage", VCPU_STAT(pf_storage) },
+ { "sp_storage", VCPU_STAT(sp_storage) },
+ { "pf_instruc", VCPU_STAT(pf_instruc) },
+ { "sp_instruc", VCPU_STAT(sp_instruc) },
+ { "ld", VCPU_STAT(ld) },
+ { "ld_slow", VCPU_STAT(ld_slow) },
+ { "st", VCPU_STAT(st) },
+ { "st_slow", VCPU_STAT(st_slow) },
+ { NULL }
+};
+
+void kvmppc_core_load_host_debugstate(struct kvm_vcpu *vcpu)
+{
+}
+
+void kvmppc_core_load_guest_debugstate(struct kvm_vcpu *vcpu)
+{
+}
+
+void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+{
+ memcpy(get_paca()->kvm_slb, to_book3s(vcpu)->slb_shadow, sizeof(get_paca()->kvm_slb));
+ get_paca()->kvm_slb_max = to_book3s(vcpu)->slb_shadow_max;
+}
+
+void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
+{
+ memcpy(to_book3s(vcpu)->slb_shadow, get_paca()->kvm_slb, sizeof(get_paca()->kvm_slb));
+ to_book3s(vcpu)->slb_shadow_max = get_paca()->kvm_slb_max;
+}
+
+#if defined(AGGRESSIVE_DEC) || defined(EXIT_DEBUG)
+static u32 kvmppc_get_dec(struct kvm_vcpu *vcpu)
+{
+ u64 jd = mftb() - vcpu->arch.dec_jiffies;
+ return vcpu->arch.dec - jd;
+}
+#endif
+
+void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 msr)
+{
+ ulong old_msr = vcpu->arch.msr;
+
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "KVM: Set MSR to 0x%llx\n", msr);
+#endif
+ msr &= to_book3s(vcpu)->msr_mask;
+ vcpu->arch.msr = msr;
+ vcpu->arch.shadow_msr = msr | MSR_USER32;
+ vcpu->arch.shadow_msr &= ( MSR_VEC | MSR_VSX | MSR_FP | MSR_FE0 |
+ MSR_USER64 | MSR_SE | MSR_BE | MSR_DE |
+ MSR_FE1);
+
+ if (msr & (MSR_WE|MSR_POW)) {
+ if (!vcpu->arch.pending_exceptions) {
+ kvm_vcpu_block(vcpu);
+ vcpu->stat.halt_wakeup++;
+ }
+ }
+
+ if (((vcpu->arch.msr & (MSR_IR|MSR_DR)) != (old_msr & (MSR_IR|MSR_DR))) ||
+ (vcpu->arch.msr & MSR_PR) != (old_msr & MSR_PR)) {
+ kvmppc_mmu_flush_segments(vcpu);
+ kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc);
+ }
+}
+
+void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags)
+{
+ vcpu->arch.srr0 = vcpu->arch.pc;
+ vcpu->arch.srr1 = vcpu->arch.msr | flags;
+ vcpu->arch.pc = to_book3s(vcpu)->hior + vec;
+ vcpu->arch.mmu.reset_msr(vcpu);
+}
+
+void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec)
+{
+ unsigned int prio;
+
+ vcpu->stat.queue_intr++;
+ switch (vec) {
+ case 0x100: prio = BOOK3S_IRQPRIO_SYSTEM_RESET; break;
+ case 0x200: prio = BOOK3S_IRQPRIO_MACHINE_CHECK; break;
+ case 0x300: prio = BOOK3S_IRQPRIO_DATA_STORAGE; break;
+ case 0x380: prio = BOOK3S_IRQPRIO_DATA_SEGMENT; break;
+ case 0x400: prio = BOOK3S_IRQPRIO_INST_STORAGE; break;
+ case 0x480: prio = BOOK3S_IRQPRIO_INST_SEGMENT; break;
+ case 0x500: prio = BOOK3S_IRQPRIO_EXTERNAL; break;
+ case 0x600: prio = BOOK3S_IRQPRIO_ALIGNMENT; break;
+ case 0x700: prio = BOOK3S_IRQPRIO_PROGRAM; break;
+ case 0x800: prio = BOOK3S_IRQPRIO_FP_UNAVAIL; break;
+ case 0x900: prio = BOOK3S_IRQPRIO_DECREMENTER; break;
+ case 0xc00: prio = BOOK3S_IRQPRIO_SYSCALL; break;
+ case 0xd00: prio = BOOK3S_IRQPRIO_DEBUG; break;
+ case 0xf20: prio = BOOK3S_IRQPRIO_ALTIVEC; break;
+ case 0xf40: prio = BOOK3S_IRQPRIO_VSX; break;
+ default: prio = BOOK3S_IRQPRIO_MAX; break;
+ }
+
+ set_bit(prio, &vcpu->arch.pending_exceptions);
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "Queueing interrupt %x\n", vec);
+#endif
+}
+
+
+void kvmppc_core_queue_program(struct kvm_vcpu *vcpu)
+{
+ kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_PROGRAM);
+}
+
+void kvmppc_core_queue_dec(struct kvm_vcpu *vcpu)
+{
+ kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_DECREMENTER);
+}
+
+int kvmppc_core_pending_dec(struct kvm_vcpu *vcpu)
+{
+ return test_bit(BOOK3S_INTERRUPT_DECREMENTER >> 7, &vcpu->arch.pending_exceptions);
+}
+
+void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
+ struct kvm_interrupt *irq)
+{
+ kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_EXTERNAL);
+}
+
+int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
+{
+ int deliver = 1;
+ int vec = 0;
+
+ switch (priority) {
+ case BOOK3S_IRQPRIO_DECREMENTER:
+ deliver = vcpu->arch.msr & MSR_EE;
+ vec = BOOK3S_INTERRUPT_DECREMENTER;
+ break;
+ case BOOK3S_IRQPRIO_EXTERNAL:
+ deliver = vcpu->arch.msr & MSR_EE;
+ vec = BOOK3S_INTERRUPT_EXTERNAL;
+ break;
+ case BOOK3S_IRQPRIO_SYSTEM_RESET:
+ vec = BOOK3S_INTERRUPT_SYSTEM_RESET;
+ break;
+ case BOOK3S_IRQPRIO_MACHINE_CHECK:
+ vec = BOOK3S_INTERRUPT_MACHINE_CHECK;
+ break;
+ case BOOK3S_IRQPRIO_DATA_STORAGE:
+ vec = BOOK3S_INTERRUPT_DATA_STORAGE;
+ break;
+ case BOOK3S_IRQPRIO_INST_STORAGE:
+ vec = BOOK3S_INTERRUPT_INST_STORAGE;
+ break;
+ case BOOK3S_IRQPRIO_DATA_SEGMENT:
+ vec = BOOK3S_INTERRUPT_DATA_SEGMENT;
+ break;
+ case BOOK3S_IRQPRIO_INST_SEGMENT:
+ vec = BOOK3S_INTERRUPT_INST_SEGMENT;
+ break;
+ case BOOK3S_IRQPRIO_ALIGNMENT:
+ vec = BOOK3S_INTERRUPT_ALIGNMENT;
+ break;
+ case BOOK3S_IRQPRIO_PROGRAM:
+ vec = BOOK3S_INTERRUPT_PROGRAM;
+ break;
+ case BOOK3S_IRQPRIO_VSX:
+ vec = BOOK3S_INTERRUPT_VSX;
+ break;
+ case BOOK3S_IRQPRIO_ALTIVEC:
+ vec = BOOK3S_INTERRUPT_ALTIVEC;
+ break;
+ case BOOK3S_IRQPRIO_FP_UNAVAIL:
+ vec = BOOK3S_INTERRUPT_FP_UNAVAIL;
+ break;
+ case BOOK3S_IRQPRIO_SYSCALL:
+ vec = BOOK3S_INTERRUPT_SYSCALL;
+ break;
+ case BOOK3S_IRQPRIO_DEBUG:
+ vec = BOOK3S_INTERRUPT_TRACE;
+ break;
+ case BOOK3S_IRQPRIO_PERFORMANCE_MONITOR:
+ vec = BOOK3S_INTERRUPT_PERFMON;
+ break;
+ default:
+ deliver = 0;
+ printk(KERN_ERR "KVM: Unknown interrupt: 0x%x\n", priority);
+ break;
+ }
+
+#if 0
+ printk(KERN_INFO "Deliver interrupt 0x%x? %x\n", vec, deliver);
+#endif
+
+ if (deliver)
+ kvmppc_inject_interrupt(vcpu, vec, 0ULL);
+
+ return deliver;
+}
+
+void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
+{
+ unsigned long *pending = &vcpu->arch.pending_exceptions;
+ unsigned int priority;
+
+ /* XXX be more clever here - no need to mftb() on every entry */
+ /* Issue DEC again if it's still active */
+#ifdef AGGRESSIVE_DEC
+ if (vcpu->arch.msr & MSR_EE)
+ if (kvmppc_get_dec(vcpu) & 0x80000000)
+ kvmppc_core_queue_dec(vcpu);
+#endif
+
+#ifdef EXIT_DEBUG
+ if (vcpu->arch.pending_exceptions)
+ printk(KERN_EMERG "KVM: Check pending: %lx\n", vcpu->arch.pending_exceptions);
+#endif
+ priority = __ffs(*pending);
+ while (priority <= (sizeof(unsigned int) * 8)) {
+ if (kvmppc_book3s_irqprio_deliver(vcpu, priority)) {
+ clear_bit(priority, &vcpu->arch.pending_exceptions);
+ break;
+ }
+
+ priority = find_next_bit(pending,
+ BITS_PER_BYTE * sizeof(*pending),
+ priority + 1);
+ }
+}
+
+void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
+{
+ vcpu->arch.pvr = pvr;
+ if ((pvr >= 0x330000) && (pvr < 0x70330000)) {
+ kvmppc_mmu_book3s_64_init(vcpu);
+ to_book3s(vcpu)->hior = 0xfff00000;
+ to_book3s(vcpu)->msr_mask = 0xffffffffffffffffULL;
+ } else {
+ kvmppc_mmu_book3s_32_init(vcpu);
+ to_book3s(vcpu)->hior = 0;
+ to_book3s(vcpu)->msr_mask = 0xffffffffULL;
+ }
+
+ /* If we are in hypervisor level on 970, we can tell the CPU to
+ * treat DCBZ as 32 bytes store */
+ vcpu->arch.hflags &= ~BOOK3S_HFLAG_DCBZ32;
+ if (vcpu->arch.mmu.is_dcbz32(vcpu) && (mfmsr() & MSR_HV) &&
+ !strcmp(cur_cpu_spec->platform, "ppc970"))
+ vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
+
+}
+
+/* Book3s_32 CPUs always have 32 bytes cache line size, which Linux assumes. To
+ * make Book3s_32 Linux work on Book3s_64, we have to make sure we trap dcbz to
+ * emulate 32 bytes dcbz length.
+ *
+ * The Book3s_64 inventors also realized this case and implemented a special bit
+ * in the HID5 register, which is a hypervisor ressource. Thus we can't use it.
+ *
+ * My approach here is to patch the dcbz instruction on executing pages.
+ */
+static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
+{
+ bool touched = false;
+ hva_t hpage;
+ u32 *page;
+ int i;
+
+ hpage = gfn_to_hva(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
+ if (kvm_is_error_hva(hpage))
+ return;
+
+ hpage |= pte->raddr & ~PAGE_MASK;
+ hpage &= ~0xFFFULL;
+
+ page = vmalloc(HW_PAGE_SIZE);
+
+ if (copy_from_user(page, (void __user *)hpage, HW_PAGE_SIZE))
+ goto out;
+
+ for (i=0; i < HW_PAGE_SIZE / 4; i++)
+ if ((page[i] & 0xff0007ff) == INS_DCBZ) {
+ page[i] &= 0xfffffff7; // reserved instruction, so we trap
+ touched = true;
+ }
+
+ if (touched)
+ copy_to_user((void __user *)hpage, page, HW_PAGE_SIZE);
+
+out:
+ vfree(page);
+}
+
+static int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, bool data,
+ struct kvmppc_pte *pte)
+{
+ int relocated = (vcpu->arch.msr & (data ? MSR_DR : MSR_IR));
+ int r;
+
+ if (relocated) {
+ r = vcpu->arch.mmu.xlate(vcpu, eaddr, pte, data);
+ } else {
+ pte->eaddr = eaddr;
+ pte->raddr = eaddr & 0xffffffff;
+ pte->vpage = eaddr >> 12;
+ switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+ case 0:
+ pte->vpage |= VSID_REAL;
+ case MSR_DR:
+ pte->vpage |= VSID_REAL_DR;
+ case MSR_IR:
+ pte->vpage |= VSID_REAL_IR;
+ }
+ pte->may_read = true;
+ pte->may_write = true;
+ pte->may_execute = true;
+ r = 0;
+ }
+
+ return r;
+}
+
+static hva_t kvmppc_bad_hva(void)
+{
+ return PAGE_OFFSET;
+}
+
+static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte,
+ bool read)
+{
+ hva_t hpage;
+
+ if (read && !pte->may_read)
+ goto err;
+
+ if (!read && !pte->may_write)
+ goto err;
+
+ hpage = gfn_to_hva(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
+ if (kvm_is_error_hva(hpage))
+ goto err;
+
+ return hpage | (pte->raddr & ~PAGE_MASK);
+err:
+ return kvmppc_bad_hva();
+}
+
+int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr)
+{
+ struct kvmppc_pte pte;
+ hva_t hva = eaddr;
+
+ vcpu->stat.st++;
+
+ if (kvmppc_xlate(vcpu, eaddr, false, &pte))
+ goto err;
+
+ hva = kvmppc_pte_to_hva(vcpu, &pte, false);
+ if (kvm_is_error_hva(hva))
+ goto err;
+
+ if (copy_to_user((void __user *)hva, ptr, size)) {
+ printk(KERN_INFO "kvmppc_st at 0x%lx failed\n", hva);
+ goto err;
+ }
+
+ return 0;
+
+err:
+ return -ENOENT;
+}
+
+int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr,
+ bool data)
+{
+ struct kvmppc_pte pte;
+ hva_t hva = eaddr;
+
+ vcpu->stat.ld++;
+
+ if (kvmppc_xlate(vcpu, eaddr, data, &pte))
+ goto err;
+
+ hva = kvmppc_pte_to_hva(vcpu, &pte, true);
+ if (kvm_is_error_hva(hva))
+ goto err;
+
+ if (copy_from_user(ptr, (void __user *)hva, size)) {
+ printk(KERN_INFO "kvmppc_ld at 0x%lx failed\n", hva);
+ goto err;
+ }
+
+ return 0;
+
+err:
+ return -ENOENT;
+}
+
+static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+ return kvm_is_visible_gfn(vcpu->kvm, gfn);
+}
+
+int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu,
+ ulong eaddr, int vec)
+{
+ bool data = (vec == BOOK3S_INTERRUPT_DATA_STORAGE);
+ int r = RESUME_GUEST;
+ int relocated;
+ int page_found = 0;
+ struct kvmppc_pte pte;
+ bool is_mmio = false;
+
+ if ( vec == BOOK3S_INTERRUPT_DATA_STORAGE ) {
+ relocated = (vcpu->arch.msr & MSR_DR);
+ } else {
+ relocated = (vcpu->arch.msr & MSR_IR);
+ }
+
+ /* Resolve real address if translation turned on */
+ if (relocated) {
+ page_found = vcpu->arch.mmu.xlate(vcpu, eaddr, &pte, data);
+ } else {
+ pte.may_execute = true;
+ pte.may_read = true;
+ pte.may_write = true;
+ pte.raddr = eaddr & 0xffffffff;
+ pte.eaddr = eaddr;
+ pte.vpage = eaddr >> 12;
+ switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+ case 0:
+ pte.vpage |= VSID_REAL;
+ case MSR_DR:
+ pte.vpage |= VSID_REAL_DR;
+ case MSR_IR:
+ pte.vpage |= VSID_REAL_IR;
+ }
+ }
+
+ if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+ (!(vcpu->arch.hflags & BOOK3S_HFLAG_DCBZ32))) {
+ /*
+ * If we do the dcbz hack, we have to NX on every execution,
+ * so we can patch the executing code. This renders our guest
+ * NX-less.
+ */
+ pte.may_execute = !data;
+ }
+
+ if (page_found == -ENOENT) {
+ /* Page not found in guest PTE entries */
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ to_book3s(vcpu)->dsisr = vcpu->arch.fault_dsisr;
+ vcpu->arch.msr |= (vcpu->arch.shadow_msr & 0x00000000f8000000ULL);
+ kvmppc_book3s_queue_irqprio(vcpu, vec);
+ } else if (page_found == -EPERM) {
+ /* Storage protection */
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ to_book3s(vcpu)->dsisr = vcpu->arch.fault_dsisr & ~DSISR_NOHPTE;
+ to_book3s(vcpu)->dsisr |= DSISR_PROTFAULT;
+ vcpu->arch.msr |= (vcpu->arch.shadow_msr & 0x00000000f8000000ULL);
+ kvmppc_book3s_queue_irqprio(vcpu, vec);
+ } else if (page_found == -EINVAL) {
+ /* Page not found in guest SLB */
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ kvmppc_book3s_queue_irqprio(vcpu, vec + 0x80);
+ } else if (!is_mmio &&
+ kvmppc_visible_gfn(vcpu, pte.raddr >> PAGE_SHIFT)) {
+ /* The guest's PTE is not mapped yet. Map on the host */
+ kvmppc_mmu_map_page(vcpu, &pte);
+ if (data)
+ vcpu->stat.sp_storage++;
+ else if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+ (!(vcpu->arch.hflags & BOOK3S_HFLAG_DCBZ32)))
+ kvmppc_patch_dcbz(vcpu, &pte);
+ } else {
+ /* MMIO */
+ vcpu->stat.mmio_exits++;
+ vcpu->arch.paddr_accessed = pte.raddr;
+ r = kvmppc_emulate_mmio(run, vcpu);
+ if ( r == RESUME_HOST_NV )
+ r = RESUME_HOST;
+ if ( r == RESUME_GUEST_NV )
+ r = RESUME_GUEST;
+ }
+
+ return r;
+}
+
+int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
+ unsigned int exit_nr)
+{
+ int r = RESUME_HOST;
+
+ vcpu->stat.sum_exits++;
+
+ run->exit_reason = KVM_EXIT_UNKNOWN;
+ run->ready_for_interrupt_injection = 1;
+#ifdef EXIT_DEBUG
+ printk(KERN_EMERG "exit_nr=0x%x | pc=0x%lx | dar=0x%lx | dec=0x%x | msr=0x%lx\n",
+ exit_nr, vcpu->arch.pc, vcpu->arch.fault_dear,
+ kvmppc_get_dec(vcpu), vcpu->arch.msr);
+#elif defined (EXIT_DEBUG_SIMPLE)
+ if ((exit_nr != 0x900) && (exit_nr != 0x500))
+ printk(KERN_EMERG "exit_nr=0x%x | pc=0x%lx | dar=0x%lx | msr=0x%lx\n",
+ exit_nr, vcpu->arch.pc, vcpu->arch.fault_dear,
+ vcpu->arch.msr);
+#endif
+ kvm_resched(vcpu);
+ switch (exit_nr) {
+ case BOOK3S_INTERRUPT_INST_STORAGE:
+ vcpu->stat.pf_instruc++;
+ /* only care about PTEG not found errors, but leave NX alone */
+ if (vcpu->arch.shadow_msr & 0x40000000) {
+ r = kvmppc_handle_pagefault(run, vcpu, vcpu->arch.pc, exit_nr);
+ vcpu->stat.sp_instruc++;
+ } else if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+ (!(vcpu->arch.hflags & BOOK3S_HFLAG_DCBZ32))) {
+ /*
+ * XXX If we do the dcbz hack we use the NX bit to flush&patch the page,
+ * so we can't use the NX bit inside the guest. Let's cross our fingers,
+ * that no guest that needs the dcbz hack does NX.
+ */
+ kvmppc_mmu_pte_flush(vcpu, vcpu->arch.pc, ~0xFFFULL);
+ } else {
+ vcpu->arch.msr |= (vcpu->arch.shadow_msr & 0x58000000);
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ kvmppc_mmu_pte_flush(vcpu, vcpu->arch.pc, ~0xFFFULL);
+ r = RESUME_GUEST;
+ }
+ break;
+ case BOOK3S_INTERRUPT_DATA_STORAGE:
+ vcpu->stat.pf_storage++;
+ /* The only case we need to handle is missing shadow PTEs */
+ if (vcpu->arch.fault_dsisr & DSISR_NOHPTE) {
+ r = kvmppc_handle_pagefault(run, vcpu, vcpu->arch.fault_dear, exit_nr);
+ } else {
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ to_book3s(vcpu)->dsisr = vcpu->arch.fault_dsisr;
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ kvmppc_mmu_pte_flush(vcpu, vcpu->arch.dear, ~0xFFFULL);
+ r = RESUME_GUEST;
+ }
+ break;
+ case BOOK3S_INTERRUPT_DATA_SEGMENT:
+ if (kvmppc_mmu_map_segment(vcpu, vcpu->arch.fault_dear) < 0) {
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ kvmppc_book3s_queue_irqprio(vcpu,
+ BOOK3S_INTERRUPT_DATA_SEGMENT);
+ }
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_INST_SEGMENT:
+ if (kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc) < 0) {
+ kvmppc_book3s_queue_irqprio(vcpu,
+ BOOK3S_INTERRUPT_INST_SEGMENT);
+ }
+ r = RESUME_GUEST;
+ break;
+ /* We're good on these - the host merely wanted to get our attention */
+ case BOOK3S_INTERRUPT_DECREMENTER:
+ vcpu->stat.dec_exits++;
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_EXTERNAL:
+ vcpu->stat.ext_intr_exits++;
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_PROGRAM:
+ {
+ enum emulation_result er;
+
+ if (vcpu->arch.msr & MSR_PR) {
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", vcpu->arch.pc, vcpu->arch.last_inst);
+#endif
+ if ((vcpu->arch.last_inst & 0xff0007ff) !=
+ (INS_DCBZ & 0xfffffff7)) {
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ }
+ }
+
+ vcpu->stat.emulated_inst_exits++;
+ er = kvmppc_emulate_instruction(run, vcpu);
+ switch (er) {
+ case EMULATE_DONE:
+ r = RESUME_GUEST;
+ break;
+ case EMULATE_FAIL:
+ printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
+ __func__, vcpu->arch.pc, vcpu->arch.last_inst);
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ default:
+ BUG();
+ }
+ break;
+ }
+ case BOOK3S_INTERRUPT_SYSCALL:
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "Syscall Nr %d\n", (int)vcpu->arch.gpr[0]);
+#endif
+ vcpu->stat.syscall_exits++;
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_MACHINE_CHECK:
+ case BOOK3S_INTERRUPT_FP_UNAVAIL:
+ case BOOK3S_INTERRUPT_TRACE:
+ case BOOK3S_INTERRUPT_ALTIVEC:
+ case BOOK3S_INTERRUPT_VSX:
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ default:
+ /* Ugh - bork here! What did we get? */
+ printk(KERN_EMERG "exit_nr=0x%x | pc=0x%lx | msr=0x%lx\n", exit_nr, vcpu->arch.pc, vcpu->arch.shadow_msr);
+ r = RESUME_HOST;
+ BUG();
+ break;
+ }
+
+
+ if (!(r & RESUME_HOST)) {
+ /* To avoid clobbering exit_reason, only check for signals if
+ * we aren't already exiting to userspace for some other
+ * reason. */
+ if (signal_pending(current)) {
+#ifdef EXIT_DEBUG
+ printk(KERN_EMERG "KVM: Going back to host\n");
+#endif
+ vcpu->stat.signal_exits++;
+ run->exit_reason = KVM_EXIT_INTR;
+ r = -EINTR;
+ } else {
+ /* In case an interrupt came in that was triggered
+ * from userspace (like DEC), we need to check what
+ * to inject now! */
+ kvmppc_core_deliver_interrupts(vcpu);
+ }
+ }
+
+#ifdef EXIT_DEBUG
+ printk(KERN_EMERG "KVM exit: vcpu=0x%p pc=0x%lx r=0x%x\n", vcpu, vcpu->arch.pc, r);
+#endif
+
+ return r;
+}
+
+int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+{
+ int i;
+
+ regs->pc = vcpu->arch.pc;
+ regs->cr = vcpu->arch.cr;
+ regs->ctr = vcpu->arch.ctr;
+ regs->lr = vcpu->arch.lr;
+ regs->xer = vcpu->arch.xer;
+ regs->msr = vcpu->arch.msr;
+ regs->srr0 = vcpu->arch.srr0;
+ regs->srr1 = vcpu->arch.srr1;
+ regs->pid = vcpu->arch.pid;
+ regs->sprg0 = vcpu->arch.sprg0;
+ regs->sprg1 = vcpu->arch.sprg1;
+ regs->sprg2 = vcpu->arch.sprg2;
+ regs->sprg3 = vcpu->arch.sprg3;
+ regs->sprg5 = vcpu->arch.sprg4;
+ regs->sprg6 = vcpu->arch.sprg5;
+ regs->sprg7 = vcpu->arch.sprg6;
+
+ for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
+ regs->gpr[i] = vcpu->arch.gpr[i];
+
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+{
+ int i;
+
+ vcpu->arch.pc = regs->pc;
+ vcpu->arch.cr = regs->cr;
+ vcpu->arch.ctr = regs->ctr;
+ vcpu->arch.lr = regs->lr;
+ vcpu->arch.xer = regs->xer;
+ kvmppc_set_msr(vcpu, regs->msr);
+ vcpu->arch.srr0 = regs->srr0;
+ vcpu->arch.srr1 = regs->srr1;
+ vcpu->arch.sprg0 = regs->sprg0;
+ vcpu->arch.sprg1 = regs->sprg1;
+ vcpu->arch.sprg2 = regs->sprg2;
+ vcpu->arch.sprg3 = regs->sprg3;
+ vcpu->arch.sprg5 = regs->sprg4;
+ vcpu->arch.sprg6 = regs->sprg5;
+ vcpu->arch.sprg7 = regs->sprg6;
+
+ for (i = 0; i < ARRAY_SIZE(vcpu->arch.gpr); i++)
+ vcpu->arch.gpr[i] = regs->gpr[i];
+
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
+ struct kvm_sregs *sregs)
+{
+ sregs->pvr = vcpu->arch.pvr;
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
+ struct kvm_sregs *sregs)
+{
+ kvmppc_set_pvr(vcpu, sregs->pvr);
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+ return -ENOTSUPP;
+}
+
+int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+ return -ENOTSUPP;
+}
+
+int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
+ struct kvm_translation *tr)
+{
+ return 0;
+}
+
+/*
+ * Get (and clear) the dirty memory log for a memory slot.
+ */
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
+ struct kvm_dirty_log *log)
+{
+ struct kvm_memory_slot *memslot;
+ struct kvm_vcpu *vcpu;
+ ulong ga, ga_end;
+ int is_dirty = 0;
+ int r, n;
+
+ down_write(&kvm->slots_lock);
+
+ r = kvm_get_dirty_log(kvm, log, &is_dirty);
+ if (r)
+ goto out;
+
+ /* If nothing is dirty, don't bother messing with page tables. */
+ if (is_dirty) {
+ memslot = &kvm->memslots[log->slot];
+
+ ga = memslot->base_gfn << PAGE_SHIFT;
+ ga_end = ga + (memslot->npages << PAGE_SHIFT);
+
+ kvm_for_each_vcpu(n, vcpu, kvm)
+ kvmppc_mmu_pte_pflush(vcpu, ga, ga_end);
+
+ n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
+ memset(memslot->dirty_bitmap, 0, n);
+ }
+
+ r = 0;
+out:
+ up_write(&kvm->slots_lock);
+ return r;
+}
+
+int kvmppc_core_check_processor_compat(void)
+{
+ return 0;
+}
+
+struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s;
+ struct kvm_vcpu *vcpu;
+ int err;
+
+ vcpu_book3s = (struct kvmppc_vcpu_book3s *)__get_free_pages( GFP_KERNEL | __GFP_ZERO,
+ get_order(sizeof(struct kvmppc_vcpu_book3s)));
+ if (!vcpu_book3s) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ vcpu = &vcpu_book3s->vcpu;
+ err = kvm_vcpu_init(vcpu, kvm, id);
+ if (err)
+ goto free_vcpu;
+
+ vcpu->arch.host_retip = kvm_return_point;
+ vcpu->arch.host_msr = mfmsr();
+ /* default to book3s_64 (970fx) */
+ vcpu->arch.pvr = 0x3C0301;
+ kvmppc_set_pvr(vcpu, vcpu->arch.pvr);
+ vcpu_book3s->slb_nr = 64;
+
+ /* remember where some real-mode handlers are */
+ vcpu->arch.trampoline_lowmem = kvmppc_trampoline_lowmem;
+ vcpu->arch.trampoline_enter = kvmppc_trampoline_enter;
+ vcpu->arch.highmem_handler = (ulong)kvmppc_handler_highmem;
+
+ vcpu->arch.shadow_msr = MSR_USER64;
+
+ err = __init_new_context();
+ if (err < 0)
+ goto free_vcpu;
+ vcpu_book3s->context_id = err;
+
+ vcpu_book3s->vsid_max = ((vcpu_book3s->context_id + 1) << USER_ESID_BITS) - 1;
+ vcpu_book3s->vsid_first = vcpu_book3s->context_id << USER_ESID_BITS;
+ vcpu_book3s->vsid_next = vcpu_book3s->vsid_first;
+
+ return vcpu;
+
+free_vcpu:
+ free_pages((long)vcpu_book3s, get_order(sizeof(struct kvmppc_vcpu_book3s)));
+out:
+ return ERR_PTR(err);
+}
+
+void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+
+ __destroy_context(vcpu_book3s->context_id);
+ kvm_vcpu_uninit(vcpu);
+ free_pages((long)vcpu_book3s, get_order(sizeof(struct kvmppc_vcpu_book3s)));
+}
+
+extern int __kvmppc_vcpu_entry(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
+int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
+{
+ int ret;
+
+ /* No need to go into the guest when all we do is going out */
+ if (signal_pending(current)) {
+ kvm_run->exit_reason = KVM_EXIT_INTR;
+ return -EINTR;
+ }
+
+ /* XXX we get called with irq disabled - change that! */
+ local_irq_enable();
+
+ ret = __kvmppc_vcpu_entry(kvm_run, vcpu);
+
+ local_irq_disable();
+
+ return ret;
+}
+
+static int kvmppc_book3s_init(void)
+{
+ return kvm_init(NULL, sizeof(struct kvmppc_vcpu_book3s), THIS_MODULE);
+}
+
+static void kvmppc_book3s_exit(void)
+{
+ kvm_exit();
+}
+
+module_init(kvmppc_book3s_init);
+module_exit(kvmppc_book3s_exit);
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 12/27] Add book3s_64 guest MMU
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (5 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 10/27] Add book3s.c Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 13/27] Add book3s_32 " Alexander Graf
` (9 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
To be able to run a guest, we also need to implement a guest MMU.
This patch adds MMU handling for Book3s_64 guests.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v5 -> v6:
- dprintk instead of scattered #ifdef's
- 80 line limit
- // -> /* */
---
arch/powerpc/kvm/book3s_64_mmu.c | 476 ++++++++++++++++++++++++++++++++++++++
1 files changed, 476 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_64_mmu.c
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
new file mode 100644
index 0000000..a31f9c6
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -0,0 +1,476 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/highmem.h>
+
+#include <asm/tlbflush.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+
+/* #define DEBUG_MMU */
+
+#ifdef DEBUG_MMU
+#define dprintk(X...) printk(KERN_INFO X)
+#else
+#define dprintk(X...) do { } while(0)
+#endif
+
+static void kvmppc_mmu_book3s_64_reset_msr(struct kvm_vcpu *vcpu)
+{
+ kvmppc_set_msr(vcpu, MSR_SF);
+}
+
+static struct kvmppc_slb *kvmppc_mmu_book3s_64_find_slbe(
+ struct kvmppc_vcpu_book3s *vcpu_book3s,
+ gva_t eaddr)
+{
+ int i;
+ u64 esid = GET_ESID(eaddr);
+ u64 esid_1t = GET_ESID_1T(eaddr);
+
+ for (i = 0; i < vcpu_book3s->slb_nr; i++) {
+ u64 cmp_esid = esid;
+
+ if (!vcpu_book3s->slb[i].valid)
+ continue;
+
+ if (vcpu_book3s->slb[i].large)
+ cmp_esid = esid_1t;
+
+ if (vcpu_book3s->slb[i].esid == cmp_esid)
+ return &vcpu_book3s->slb[i];
+ }
+
+ dprintk("KVM: No SLB entry found for 0x%lx [%llx | %llx]\n",
+ eaddr, esid, esid_1t);
+ for (i = 0; i < vcpu_book3s->slb_nr; i++) {
+ if (vcpu_book3s->slb[i].vsid)
+ dprintk(" %d: %c%c %llx %llx\n", i,
+ vcpu_book3s->slb[i].valid ? 'v' : ' ',
+ vcpu_book3s->slb[i].large ? 'l' : ' ',
+ vcpu_book3s->slb[i].esid,
+ vcpu_book3s->slb[i].vsid);
+ }
+
+ return NULL;
+}
+
+static u64 kvmppc_mmu_book3s_64_ea_to_vp(struct kvm_vcpu *vcpu, gva_t eaddr,
+ bool data)
+{
+ struct kvmppc_slb *slb;
+
+ slb = kvmppc_mmu_book3s_64_find_slbe(to_book3s(vcpu), eaddr);
+ if (!slb)
+ return 0;
+
+ if (slb->large)
+ return (((u64)eaddr >> 12) & 0xfffffff) |
+ (((u64)slb->vsid) << 28);
+
+ return (((u64)eaddr >> 12) & 0xffff) | (((u64)slb->vsid) << 16);
+}
+
+static int kvmppc_mmu_book3s_64_get_pagesize(struct kvmppc_slb *slbe)
+{
+ return slbe->large ? 24 : 12;
+}
+
+static u32 kvmppc_mmu_book3s_64_get_page(struct kvmppc_slb *slbe, gva_t eaddr)
+{
+ int p = kvmppc_mmu_book3s_64_get_pagesize(slbe);
+ return ((eaddr & 0xfffffff) >> p);
+}
+
+static hva_t kvmppc_mmu_book3s_64_get_pteg(
+ struct kvmppc_vcpu_book3s *vcpu_book3s,
+ struct kvmppc_slb *slbe, gva_t eaddr,
+ bool second)
+{
+ u64 hash, pteg, htabsize;
+ u32 page;
+ hva_t r;
+
+ page = kvmppc_mmu_book3s_64_get_page(slbe, eaddr);
+ htabsize = ((1 << ((vcpu_book3s->sdr1 & 0x1f) + 11)) - 1);
+
+ hash = slbe->vsid ^ page;
+ if (second)
+ hash = ~hash;
+ hash &= ((1ULL << 39ULL) - 1ULL);
+ hash &= htabsize;
+ hash <<= 7ULL;
+
+ pteg = vcpu_book3s->sdr1 & 0xfffffffffffc0000ULL;
+ pteg |= hash;
+
+ dprintk("MMU: page=0x%x sdr1=0x%llx pteg=0x%llx vsid=0x%llx\n",
+ page, vcpu_book3s->sdr1, pteg, slbe->vsid);
+
+ r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
+ if (kvm_is_error_hva(r))
+ return r;
+ return r | (pteg & ~PAGE_MASK);
+}
+
+static u64 kvmppc_mmu_book3s_64_get_avpn(struct kvmppc_slb *slbe, gva_t eaddr)
+{
+ int p = kvmppc_mmu_book3s_64_get_pagesize(slbe);
+ u64 avpn;
+
+ avpn = kvmppc_mmu_book3s_64_get_page(slbe, eaddr);
+ avpn |= slbe->vsid << (28 - p);
+
+ if (p < 24)
+ avpn >>= ((80 - p) - 56) - 8;
+ else
+ avpn <<= 8;
+
+ return avpn;
+}
+
+static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *gpte, bool data)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_slb *slbe;
+ hva_t ptegp;
+ u64 pteg[16];
+ u64 avpn = 0;
+ int i;
+ u8 key = 0;
+ bool found = false;
+ bool perm_err = false;
+ int second = 0;
+
+ slbe = kvmppc_mmu_book3s_64_find_slbe(vcpu_book3s, eaddr);
+ if (!slbe)
+ goto no_seg_found;
+
+do_second:
+ ptegp = kvmppc_mmu_book3s_64_get_pteg(vcpu_book3s, slbe, eaddr, second);
+ if (kvm_is_error_hva(ptegp))
+ goto no_page_found;
+
+ avpn = kvmppc_mmu_book3s_64_get_avpn(slbe, eaddr);
+
+ if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
+ printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+ goto no_page_found;
+ }
+
+ if ((vcpu->arch.msr & MSR_PR) && slbe->Kp)
+ key = 4;
+ else if (!(vcpu->arch.msr & MSR_PR) && slbe->Ks)
+ key = 4;
+
+ for (i=0; i<16; i+=2) {
+ u64 v = pteg[i];
+ u64 r = pteg[i+1];
+
+ /* Valid check */
+ if (!(v & HPTE_V_VALID))
+ continue;
+ /* Hash check */
+ if ((v & HPTE_V_SECONDARY) != second)
+ continue;
+
+ /* AVPN compare */
+ if (HPTE_V_AVPN_VAL(avpn) == HPTE_V_AVPN_VAL(v)) {
+ u8 pp = (r & HPTE_R_PP) | key;
+ int eaddr_mask = 0xFFF;
+
+ gpte->eaddr = eaddr;
+ gpte->vpage = kvmppc_mmu_book3s_64_ea_to_vp(vcpu,
+ eaddr,
+ data);
+ if (slbe->large)
+ eaddr_mask = 0xFFFFFF;
+ gpte->raddr = (r & HPTE_R_RPN) | (eaddr & eaddr_mask);
+ gpte->may_execute = ((r & HPTE_R_N) ? false : true);
+ gpte->may_read = false;
+ gpte->may_write = false;
+
+ switch (pp) {
+ case 0:
+ case 1:
+ case 2:
+ case 6:
+ gpte->may_write = true;
+ /* fall through */
+ case 3:
+ case 5:
+ case 7:
+ gpte->may_read = true;
+ break;
+ }
+
+ if (!gpte->may_read) {
+ perm_err = true;
+ continue;
+ }
+
+ dprintk("KVM MMU: Translated 0x%lx [0x%llx] -> 0x%llx "
+ "-> 0x%llx\n",
+ eaddr, avpn, gpte->vpage, gpte->raddr);
+ found = true;
+ break;
+ }
+ }
+
+ /* Update PTE R and C bits, so the guest's swapper knows we used the
+ * page */
+ if (found) {
+ u32 oldr = pteg[i+1];
+
+ if (gpte->may_read) {
+ /* Set the accessed flag */
+ pteg[i+1] |= HPTE_R_R;
+ }
+ if (gpte->may_write) {
+ /* Set the dirty flag */
+ pteg[i+1] |= HPTE_R_C;
+ } else {
+ dprintk("KVM: Mapping read-only page!\n");
+ }
+
+ /* Write back into the PTEG */
+ if (pteg[i+1] != oldr)
+ copy_to_user((void __user *)ptegp, pteg, sizeof(pteg));
+
+ return 0;
+ } else {
+ dprintk("KVM MMU: No PTE found (ea=0x%lx sdr1=0x%llx "
+ "ptegp=0x%lx)\n",
+ eaddr, to_book3s(vcpu)->sdr1, ptegp);
+ for (i = 0; i < 16; i += 2)
+ dprintk(" %02d: 0x%llx - 0x%llx (0x%llx)\n",
+ i, pteg[i], pteg[i+1], avpn);
+
+ if (!second) {
+ second = HPTE_V_SECONDARY;
+ goto do_second;
+ }
+ }
+
+
+no_page_found:
+
+
+ if (perm_err)
+ return -EPERM;
+
+ return -ENOENT;
+
+no_seg_found:
+
+ dprintk("KVM MMU: Trigger segment fault\n");
+ return -EINVAL;
+}
+
+static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 rb)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s;
+ u64 esid, esid_1t;
+ int slb_nr;
+ struct kvmppc_slb *slbe;
+
+ dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb);
+
+ vcpu_book3s = to_book3s(vcpu);
+
+ esid = GET_ESID(rb);
+ esid_1t = GET_ESID_1T(rb);
+ slb_nr = rb & 0xfff;
+
+ if (slb_nr > vcpu_book3s->slb_nr)
+ return;
+
+ slbe = &vcpu_book3s->slb[slb_nr];
+
+ slbe->large = (rs & SLB_VSID_L) ? 1 : 0;
+ slbe->esid = slbe->large ? esid_1t : esid;
+ slbe->vsid = rs >> 12;
+ slbe->valid = (rb & SLB_ESID_V) ? 1 : 0;
+ slbe->Ks = (rs & SLB_VSID_KS) ? 1 : 0;
+ slbe->Kp = (rs & SLB_VSID_KP) ? 1 : 0;
+ slbe->nx = (rs & SLB_VSID_N) ? 1 : 0;
+ slbe->class = (rs & SLB_VSID_C) ? 1 : 0;
+
+ slbe->orige = rb & (ESID_MASK | SLB_ESID_V);
+ slbe->origv = rs;
+
+ /* Map the new segment */
+ kvmppc_mmu_map_segment(vcpu, esid << SID_SHIFT);
+}
+
+static u64 kvmppc_mmu_book3s_64_slbmfee(struct kvm_vcpu *vcpu, u64 slb_nr)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_slb *slbe;
+
+ if (slb_nr > vcpu_book3s->slb_nr)
+ return 0;
+
+ slbe = &vcpu_book3s->slb[slb_nr];
+
+ return slbe->orige;
+}
+
+static u64 kvmppc_mmu_book3s_64_slbmfev(struct kvm_vcpu *vcpu, u64 slb_nr)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_slb *slbe;
+
+ if (slb_nr > vcpu_book3s->slb_nr)
+ return 0;
+
+ slbe = &vcpu_book3s->slb[slb_nr];
+
+ return slbe->origv;
+}
+
+static void kvmppc_mmu_book3s_64_slbie(struct kvm_vcpu *vcpu, u64 ea)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_slb *slbe;
+
+ dprintk("KVM MMU: slbie(0x%llx)\n", ea);
+
+ slbe = kvmppc_mmu_book3s_64_find_slbe(vcpu_book3s, ea);
+
+ if (!slbe)
+ return;
+
+ dprintk("KVM MMU: slbie(0x%llx, 0x%llx)\n", ea, slbe->esid);
+
+ slbe->valid = false;
+
+ kvmppc_mmu_map_segment(vcpu, ea);
+}
+
+static void kvmppc_mmu_book3s_64_slbia(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ int i;
+
+ dprintk("KVM MMU: slbia()\n");
+
+ for (i = 1; i < vcpu_book3s->slb_nr; i++)
+ vcpu_book3s->slb[i].valid = false;
+
+ if (vcpu->arch.msr & MSR_IR) {
+ kvmppc_mmu_flush_segments(vcpu);
+ kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc);
+ }
+}
+
+static void kvmppc_mmu_book3s_64_mtsrin(struct kvm_vcpu *vcpu, u32 srnum,
+ ulong value)
+{
+ u64 rb = 0, rs = 0;
+
+ /* ESID = srnum */
+ rb |= (srnum & 0xf) << 28;
+ /* Set the valid bit */
+ rb |= 1 << 27;
+ /* Index = ESID */
+ rb |= srnum;
+
+ /* VSID = VSID */
+ rs |= (value & 0xfffffff) << 12;
+ /* flags = flags */
+ rs |= ((value >> 27) & 0xf) << 9;
+
+ kvmppc_mmu_book3s_64_slbmte(vcpu, rs, rb);
+}
+
+static void kvmppc_mmu_book3s_64_tlbie(struct kvm_vcpu *vcpu, ulong va,
+ bool large)
+{
+ u64 mask = 0xFFFFFFFFFULL;
+
+ dprintk("KVM MMU: tlbie(0x%lx)\n", va);
+
+ if (large)
+ mask = 0xFFFFFF000ULL;
+ kvmppc_mmu_pte_vflush(vcpu, va >> 12, mask);
+}
+
+static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, u64 esid,
+ u64 *vsid)
+{
+ switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+ case 0:
+ *vsid = (VSID_REAL >> 16) | esid;
+ break;
+ case MSR_IR:
+ *vsid = (VSID_REAL_IR >> 16) | esid;
+ break;
+ case MSR_DR:
+ *vsid = (VSID_REAL_DR >> 16) | esid;
+ break;
+ case MSR_DR|MSR_IR:
+ {
+ ulong ea;
+ struct kvmppc_slb *slb;
+ ea = esid << SID_SHIFT;
+ slb = kvmppc_mmu_book3s_64_find_slbe(to_book3s(vcpu), ea);
+ if (slb)
+ *vsid = slb->vsid;
+ else
+ return -ENOENT;
+
+ break;
+ }
+ default:
+ BUG();
+ break;
+ }
+
+ return 0;
+}
+
+static bool kvmppc_mmu_book3s_64_is_dcbz32(struct kvm_vcpu *vcpu)
+{
+ return (to_book3s(vcpu)->hid[5] & 0x80);
+}
+
+void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_mmu *mmu = &vcpu->arch.mmu;
+
+ mmu->mfsrin = NULL;
+ mmu->mtsrin = kvmppc_mmu_book3s_64_mtsrin;
+ mmu->slbmte = kvmppc_mmu_book3s_64_slbmte;
+ mmu->slbmfee = kvmppc_mmu_book3s_64_slbmfee;
+ mmu->slbmfev = kvmppc_mmu_book3s_64_slbmfev;
+ mmu->slbie = kvmppc_mmu_book3s_64_slbie;
+ mmu->slbia = kvmppc_mmu_book3s_64_slbia;
+ mmu->xlate = kvmppc_mmu_book3s_64_xlate;
+ mmu->reset_msr = kvmppc_mmu_book3s_64_reset_msr;
+ mmu->tlbie = kvmppc_mmu_book3s_64_tlbie;
+ mmu->esid_to_vsid = kvmppc_mmu_book3s_64_esid_to_vsid;
+ mmu->ea_to_vp = kvmppc_mmu_book3s_64_ea_to_vp;
+ mmu->is_dcbz32 = kvmppc_mmu_book3s_64_is_dcbz32;
+}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 13/27] Add book3s_32 guest MMU
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (6 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 12/27] Add book3s_64 guest MMU Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 14/27] Add book3s_64 specific opcode emulation Alexander Graf
` (8 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
This patch adds an implementation for a G3/G4 MMU, so we can run G3 and
G4 guests in KVM on Book3s_64.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v5 -> v6:
- dprintk instead of scattered #ifdef's
- // -> /* */
- 80 characters per line
---
arch/powerpc/kvm/book3s_32_mmu.c | 372 ++++++++++++++++++++++++++++++++++++++
1 files changed, 372 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_32_mmu.c
diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
new file mode 100644
index 0000000..faf99f2
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -0,0 +1,372 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/highmem.h>
+
+#include <asm/tlbflush.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+
+/* #define DEBUG_MMU */
+/* #define DEBUG_MMU_PTE */
+/* #define DEBUG_MMU_PTE_IP 0xfff14c40 */
+
+#ifdef DEBUG_MMU
+#define dprintk(X...) printk(KERN_INFO X)
+#else
+#define dprintk(X...) do { } while(0)
+#endif
+
+#ifdef DEBUG_PTE
+#define dprintk_pte(X...) printk(KERN_INFO X)
+#else
+#define dprintk_pte(X...) do { } while(0)
+#endif
+
+#define PTEG_FLAG_ACCESSED 0x00000100
+#define PTEG_FLAG_DIRTY 0x00000080
+
+static inline bool check_debug_ip(struct kvm_vcpu *vcpu)
+{
+#ifdef DEBUG_MMU_PTE_IP
+ return vcpu->arch.pc == DEBUG_MMU_PTE_IP;
+#else
+ return true;
+#endif
+}
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data);
+
+static struct kvmppc_sr *find_sr(struct kvmppc_vcpu_book3s *vcpu_book3s, gva_t eaddr)
+{
+ return &vcpu_book3s->sr[(eaddr >> 28) & 0xf];
+}
+
+static u64 kvmppc_mmu_book3s_32_ea_to_vp(struct kvm_vcpu *vcpu, gva_t eaddr,
+ bool data)
+{
+ struct kvmppc_sr *sre = find_sr(to_book3s(vcpu), eaddr);
+ struct kvmppc_pte pte;
+
+ if (!kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, &pte, data))
+ return pte.vpage;
+
+ return (((u64)eaddr >> 12) & 0xffff) | (((u64)sre->vsid) << 16);
+}
+
+static void kvmppc_mmu_book3s_32_reset_msr(struct kvm_vcpu *vcpu)
+{
+ kvmppc_set_msr(vcpu, 0);
+}
+
+static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3s,
+ struct kvmppc_sr *sre, gva_t eaddr,
+ bool primary)
+{
+ u32 page, hash, pteg, htabmask;
+ hva_t r;
+
+ page = (eaddr & 0x0FFFFFFF) >> 12;
+ htabmask = ((vcpu_book3s->sdr1 & 0x1FF) << 16) | 0xFFC0;
+
+ hash = ((sre->vsid ^ page) << 6);
+ if (!primary)
+ hash = ~hash;
+ hash &= htabmask;
+
+ pteg = (vcpu_book3s->sdr1 & 0xffff0000) | hash;
+
+ dprintk("MMU: pc=0x%lx eaddr=0x%lx sdr1=0x%llx pteg=0x%x vsid=0x%x\n",
+ vcpu_book3s->vcpu.arch.pc, eaddr, vcpu_book3s->sdr1, pteg,
+ sre->vsid);
+
+ r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
+ if (kvm_is_error_hva(r))
+ return r;
+ return r | (pteg & ~PAGE_MASK);
+}
+
+static u32 kvmppc_mmu_book3s_32_get_ptem(struct kvmppc_sr *sre, gva_t eaddr,
+ bool primary)
+{
+ return ((eaddr & 0x0fffffff) >> 22) | (sre->vsid << 7) |
+ (primary ? 0 : 0x40) | 0x80000000;
+}
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_bat *bat;
+ int i;
+
+ for (i = 0; i < 8; i++) {
+ if (data)
+ bat = &vcpu_book3s->dbat[i];
+ else
+ bat = &vcpu_book3s->ibat[i];
+
+ if (vcpu->arch.msr & MSR_PR) {
+ if (!bat->vp)
+ continue;
+ } else {
+ if (!bat->vs)
+ continue;
+ }
+
+ if (check_debug_ip(vcpu))
+ {
+ dprintk_pte("%cBAT %02d: 0x%lx - 0x%x (0x%x)\n",
+ data ? 'd' : 'i', i, eaddr, bat->bepi,
+ bat->bepi_mask);
+ }
+ if ((eaddr & bat->bepi_mask) == bat->bepi) {
+ pte->raddr = bat->brpn | (eaddr & ~bat->bepi_mask);
+ pte->vpage = (eaddr >> 12) | VSID_BAT;
+ pte->may_read = bat->pp;
+ pte->may_write = bat->pp > 1;
+ pte->may_execute = true;
+ if (!pte->may_read) {
+ printk(KERN_INFO "BAT is not readable!\n");
+ continue;
+ }
+ if (!pte->may_write) {
+ /* let's treat r/o BATs as not-readable for now */
+ dprintk_pte("BAT is read-only!\n");
+ continue;
+ }
+
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+
+static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data,
+ bool primary)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_sr *sre;
+ hva_t ptegp;
+ u32 pteg[16];
+ u64 ptem = 0;
+ int i;
+ int found = 0;
+
+ sre = find_sr(vcpu_book3s, eaddr);
+
+ dprintk_pte("SR 0x%lx: vsid=0x%x, raw=0x%x\n", eaddr >> 28,
+ sre->vsid, sre->raw);
+
+ pte->vpage = kvmppc_mmu_book3s_32_ea_to_vp(vcpu, eaddr, data);
+
+ ptegp = kvmppc_mmu_book3s_32_get_pteg(vcpu_book3s, sre, eaddr, primary);
+ if (kvm_is_error_hva(ptegp)) {
+ printk(KERN_INFO "KVM: Invalid PTEG!\n");
+ goto no_page_found;
+ }
+
+ ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);
+
+ if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
+ printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+ goto no_page_found;
+ }
+
+ for (i=0; i<16; i+=2) {
+ if (ptem == pteg[i]) {
+ u8 pp;
+
+ pte->raddr = (pteg[i+1] & ~(0xFFFULL)) | (eaddr & 0xFFF);
+ pp = pteg[i+1] & 3;
+
+ if ((sre->Kp && (vcpu->arch.msr & MSR_PR)) ||
+ (sre->Ks && !(vcpu->arch.msr & MSR_PR)))
+ pp |= 4;
+
+ pte->may_write = false;
+ pte->may_read = false;
+ pte->may_execute = true;
+ switch (pp) {
+ case 0:
+ case 1:
+ case 2:
+ case 6:
+ pte->may_write = true;
+ case 3:
+ case 5:
+ case 7:
+ pte->may_read = true;
+ break;
+ }
+
+ if ( !pte->may_read )
+ continue;
+
+ dprintk_pte("MMU: Found PTE -> %x %x - %x\n",
+ pteg[i], pteg[i+1], pp);
+ found = 1;
+ break;
+ }
+ }
+
+ /* Update PTE C and A bits, so the guest's swapper knows we used the
+ page */
+ if (found) {
+ u32 oldpte = pteg[i+1];
+
+ if (pte->may_read)
+ pteg[i+1] |= PTEG_FLAG_ACCESSED;
+ if (pte->may_write)
+ pteg[i+1] |= PTEG_FLAG_DIRTY;
+ else
+ dprintk_pte("KVM: Mapping read-only page!\n");
+
+ /* Write back into the PTEG */
+ if (pteg[i+1] != oldpte)
+ copy_to_user((void __user *)ptegp, pteg, sizeof(pteg));
+
+ return 0;
+ }
+
+no_page_found:
+
+ if (check_debug_ip(vcpu)) {
+ dprintk_pte("KVM MMU: No PTE found (sdr1=0x%llx ptegp=0x%lx)\n",
+ to_book3s(vcpu)->sdr1, ptegp);
+ for (i=0; i<16; i+=2) {
+ dprintk_pte(" %02d: 0x%x - 0x%x (0x%llx)\n",
+ i, pteg[i], pteg[i+1], ptem);
+ }
+ }
+
+ return -ENOENT;
+}
+
+static int kvmppc_mmu_book3s_32_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data)
+{
+ int r;
+
+ pte->eaddr = eaddr;
+ r = kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, pte, data);
+ if (r < 0)
+ r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte, data, true);
+ if (r < 0)
+ r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte, data, false);
+
+ return r;
+}
+
+
+static u32 kvmppc_mmu_book3s_32_mfsrin(struct kvm_vcpu *vcpu, u32 srnum)
+{
+ return to_book3s(vcpu)->sr[srnum].raw;
+}
+
+static void kvmppc_mmu_book3s_32_mtsrin(struct kvm_vcpu *vcpu, u32 srnum,
+ ulong value)
+{
+ struct kvmppc_sr *sre;
+
+ sre = &to_book3s(vcpu)->sr[srnum];
+
+ /* Flush any left-over shadows from the previous SR */
+
+ /* XXX Not necessary? */
+ /* kvmppc_mmu_pte_flush(vcpu, ((u64)sre->vsid) << 28, 0xf0000000ULL); */
+
+ /* And then put in the new SR */
+ sre->raw = value;
+ sre->vsid = (value & 0x0fffffff);
+ sre->Ks = (value & 0x40000000) ? true : false;
+ sre->Kp = (value & 0x20000000) ? true : false;
+ sre->nx = (value & 0x10000000) ? true : false;
+
+ /* Map the new segment */
+ kvmppc_mmu_map_segment(vcpu, srnum << SID_SHIFT);
+}
+
+static void kvmppc_mmu_book3s_32_tlbie(struct kvm_vcpu *vcpu, ulong ea, bool large)
+{
+ kvmppc_mmu_pte_flush(vcpu, ea, ~0xFFFULL);
+}
+
+static int kvmppc_mmu_book3s_32_esid_to_vsid(struct kvm_vcpu *vcpu, u64 esid,
+ u64 *vsid)
+{
+ /* In case we only have one of MSR_IR or MSR_DR set, let's put
+ that in the real-mode context (and hope RM doesn't access
+ high memory) */
+ switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+ case 0:
+ *vsid = (VSID_REAL >> 16) | esid;
+ break;
+ case MSR_IR:
+ *vsid = (VSID_REAL_IR >> 16) | esid;
+ break;
+ case MSR_DR:
+ *vsid = (VSID_REAL_DR >> 16) | esid;
+ break;
+ case MSR_DR|MSR_IR:
+ {
+ ulong ea;
+ ea = esid << SID_SHIFT;
+ *vsid = find_sr(to_book3s(vcpu), ea)->vsid;
+ break;
+ }
+ default:
+ BUG();
+ }
+
+ return 0;
+}
+
+static bool kvmppc_mmu_book3s_32_is_dcbz32(struct kvm_vcpu *vcpu)
+{
+ return true;
+}
+
+
+void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_mmu *mmu = &vcpu->arch.mmu;
+
+ mmu->mtsrin = kvmppc_mmu_book3s_32_mtsrin;
+ mmu->mfsrin = kvmppc_mmu_book3s_32_mfsrin;
+ mmu->xlate = kvmppc_mmu_book3s_32_xlate;
+ mmu->reset_msr = kvmppc_mmu_book3s_32_reset_msr;
+ mmu->tlbie = kvmppc_mmu_book3s_32_tlbie;
+ mmu->esid_to_vsid = kvmppc_mmu_book3s_32_esid_to_vsid;
+ mmu->ea_to_vp = kvmppc_mmu_book3s_32_ea_to_vp;
+ mmu->is_dcbz32 = kvmppc_mmu_book3s_32_is_dcbz32;
+
+ mmu->slbmte = NULL;
+ mmu->slbmfee = NULL;
+ mmu->slbmfev = NULL;
+ mmu->slbie = NULL;
+ mmu->slbia = NULL;
+}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 14/27] Add book3s_64 specific opcode emulation
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (7 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 13/27] Add book3s_32 " Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-11-03 8:47 ` Segher Boessenkool
2009-10-30 15:47 ` [PATCH 15/27] Add mfdec emulation Alexander Graf
` (7 subsequent siblings)
16 siblings, 1 reply; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
There are generic parts of PowerPC that can be shared across all
implementations and specific parts that only apply to BookE or desktop PPCs.
This patch adds emulation for desktop specific opcodes that don't apply
to BookE CPUs.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v5 -> v6:
- // -> /* */
---
arch/powerpc/kvm/book3s_64_emulate.c | 337 ++++++++++++++++++++++++++++++++++
1 files changed, 337 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_64_emulate.c
diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
new file mode 100644
index 0000000..c343e67
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -0,0 +1,337 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ */
+
+#include <asm/kvm_ppc.h>
+#include <asm/disassemble.h>
+#include <asm/kvm_book3s.h>
+#include <asm/reg.h>
+
+#define OP_19_XOP_RFID 18
+#define OP_19_XOP_RFI 50
+
+#define OP_31_XOP_MFMSR 83
+#define OP_31_XOP_MTMSR 146
+#define OP_31_XOP_MTMSRD 178
+#define OP_31_XOP_MTSRIN 242
+#define OP_31_XOP_TLBIEL 274
+#define OP_31_XOP_TLBIE 306
+#define OP_31_XOP_SLBMTE 402
+#define OP_31_XOP_SLBIE 434
+#define OP_31_XOP_SLBIA 498
+#define OP_31_XOP_MFSRIN 659
+#define OP_31_XOP_SLBMFEV 851
+#define OP_31_XOP_EIOIO 854
+#define OP_31_XOP_SLBMFEE 915
+
+/* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
+#define OP_31_XOP_DCBZ 1010
+
+int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
+ unsigned int inst, int *advance)
+{
+ int emulated = EMULATE_DONE;
+
+ switch (get_op(inst)) {
+ case 19:
+ switch (get_xop(inst)) {
+ case OP_19_XOP_RFID:
+ case OP_19_XOP_RFI:
+ vcpu->arch.pc = vcpu->arch.srr0;
+ kvmppc_set_msr(vcpu, vcpu->arch.srr1);
+ *advance = 0;
+ break;
+
+ default:
+ emulated = EMULATE_FAIL;
+ break;
+ }
+ break;
+ case 31:
+ switch (get_xop(inst)) {
+ case OP_31_XOP_MFMSR:
+ vcpu->arch.gpr[get_rt(inst)] = vcpu->arch.msr;
+ break;
+ case OP_31_XOP_MTMSRD:
+ {
+ ulong rs = vcpu->arch.gpr[get_rs(inst)];
+ if (inst & 0x10000) {
+ vcpu->arch.msr &= ~(MSR_RI | MSR_EE);
+ vcpu->arch.msr |= rs & (MSR_RI | MSR_EE);
+ } else
+ kvmppc_set_msr(vcpu, rs);
+ break;
+ }
+ case OP_31_XOP_MTMSR:
+ kvmppc_set_msr(vcpu, vcpu->arch.gpr[get_rs(inst)]);
+ break;
+ case OP_31_XOP_MFSRIN:
+ {
+ int srnum;
+
+ srnum = (vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf;
+ if (vcpu->arch.mmu.mfsrin) {
+ u32 sr;
+ sr = vcpu->arch.mmu.mfsrin(vcpu, srnum);
+ vcpu->arch.gpr[get_rt(inst)] = sr;
+ }
+ break;
+ }
+ case OP_31_XOP_MTSRIN:
+ vcpu->arch.mmu.mtsrin(vcpu,
+ (vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf,
+ vcpu->arch.gpr[get_rs(inst)]);
+ break;
+ case OP_31_XOP_TLBIE:
+ case OP_31_XOP_TLBIEL:
+ {
+ bool large = (inst & 0x00200000) ? true : false;
+ ulong addr = vcpu->arch.gpr[get_rb(inst)];
+ vcpu->arch.mmu.tlbie(vcpu, addr, large);
+ break;
+ }
+ case OP_31_XOP_EIOIO:
+ break;
+ case OP_31_XOP_SLBMTE:
+ if (!vcpu->arch.mmu.slbmte)
+ return EMULATE_FAIL;
+
+ vcpu->arch.mmu.slbmte(vcpu, vcpu->arch.gpr[get_rs(inst)],
+ vcpu->arch.gpr[get_rb(inst)]);
+ break;
+ case OP_31_XOP_SLBIE:
+ if (!vcpu->arch.mmu.slbie)
+ return EMULATE_FAIL;
+
+ vcpu->arch.mmu.slbie(vcpu, vcpu->arch.gpr[get_rb(inst)]);
+ break;
+ case OP_31_XOP_SLBIA:
+ if (!vcpu->arch.mmu.slbia)
+ return EMULATE_FAIL;
+
+ vcpu->arch.mmu.slbia(vcpu);
+ break;
+ case OP_31_XOP_SLBMFEE:
+ if (!vcpu->arch.mmu.slbmfee) {
+ emulated = EMULATE_FAIL;
+ } else {
+ ulong t, rb;
+
+ rb = vcpu->arch.gpr[get_rb(inst)];
+ t = vcpu->arch.mmu.slbmfee(vcpu, rb);
+ vcpu->arch.gpr[get_rt(inst)] = t;
+ }
+ break;
+ case OP_31_XOP_SLBMFEV:
+ if (!vcpu->arch.mmu.slbmfev) {
+ emulated = EMULATE_FAIL;
+ } else {
+ ulong t, rb;
+
+ rb = vcpu->arch.gpr[get_rb(inst)];
+ t = vcpu->arch.mmu.slbmfev(vcpu, rb);
+ vcpu->arch.gpr[get_rt(inst)] = t;
+ }
+ break;
+ case OP_31_XOP_DCBZ:
+ {
+ ulong rb = vcpu->arch.gpr[get_rb(inst)];
+ ulong ra = 0;
+ ulong addr;
+ u32 zeros[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
+
+ if (get_ra(inst))
+ ra = vcpu->arch.gpr[get_ra(inst)];
+
+ addr = (ra + rb) & ~31ULL;
+ if (!(vcpu->arch.msr & MSR_SF))
+ addr &= 0xffffffff;
+
+ if (kvmppc_st(vcpu, addr, 32, zeros)) {
+ vcpu->arch.dear = addr;
+ vcpu->arch.fault_dear = addr;
+ to_book3s(vcpu)->dsisr = DSISR_PROTFAULT |
+ DSISR_ISSTORE;
+ kvmppc_book3s_queue_irqprio(vcpu,
+ BOOK3S_INTERRUPT_DATA_STORAGE);
+ kvmppc_mmu_pte_flush(vcpu, addr, ~0xFFFULL);
+ }
+
+ break;
+ }
+ default:
+ emulated = EMULATE_FAIL;
+ }
+ break;
+ default:
+ emulated = EMULATE_FAIL;
+ }
+
+ return emulated;
+}
+
+static void kvmppc_write_bat(struct kvm_vcpu *vcpu, int sprn, u64 val)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_bat *bat;
+
+ switch (sprn) {
+ case SPRN_IBAT0U ... SPRN_IBAT3L:
+ bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT0U) / 2];
+ break;
+ case SPRN_IBAT4U ... SPRN_IBAT7L:
+ bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT4U) / 2];
+ break;
+ case SPRN_DBAT0U ... SPRN_DBAT3L:
+ bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT0U) / 2];
+ break;
+ case SPRN_DBAT4U ... SPRN_DBAT7L:
+ bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT4U) / 2];
+ break;
+ default:
+ BUG();
+ }
+
+ if (!(sprn % 2)) {
+ /* Upper BAT */
+ u32 bl = (val >> 2) & 0x7ff;
+ bat->bepi_mask = (~bl << 17);
+ bat->bepi = val & 0xfffe0000;
+ bat->vs = (val & 2) ? 1 : 0;
+ bat->vp = (val & 1) ? 1 : 0;
+ } else {
+ /* Lower BAT */
+ bat->brpn = val & 0xfffe0000;
+ bat->wimg = (val >> 3) & 0xf;
+ bat->pp = val & 3;
+ }
+}
+
+int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
+{
+ int emulated = EMULATE_DONE;
+
+ switch (sprn) {
+ case SPRN_SDR1:
+ to_book3s(vcpu)->sdr1 = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_DSISR:
+ to_book3s(vcpu)->dsisr = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_DAR:
+ vcpu->arch.dear = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_HIOR:
+ to_book3s(vcpu)->hior = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_IBAT0U ... SPRN_IBAT3L:
+ case SPRN_IBAT4U ... SPRN_IBAT7L:
+ case SPRN_DBAT0U ... SPRN_DBAT3L:
+ case SPRN_DBAT4U ... SPRN_DBAT7L:
+ kvmppc_write_bat(vcpu, sprn, vcpu->arch.gpr[rs]);
+ /* BAT writes happen so rarely that we're ok to flush
+ * everything here */
+ kvmppc_mmu_pte_flush(vcpu, 0, 0);
+ break;
+ case SPRN_HID0:
+ to_book3s(vcpu)->hid[0] = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_HID1:
+ to_book3s(vcpu)->hid[1] = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_HID2:
+ to_book3s(vcpu)->hid[2] = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_HID4:
+ to_book3s(vcpu)->hid[4] = vcpu->arch.gpr[rs];
+ break;
+ case SPRN_HID5:
+ to_book3s(vcpu)->hid[5] = vcpu->arch.gpr[rs];
+ /* guest HID5 set can change is_dcbz32 */
+ if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+ (mfmsr() & MSR_HV))
+ vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
+ break;
+ case SPRN_ICTC:
+ case SPRN_THRM1:
+ case SPRN_THRM2:
+ case SPRN_THRM3:
+ case SPRN_CTRLF:
+ case SPRN_CTRLT:
+ break;
+ default:
+ printk(KERN_INFO "KVM: invalid SPR write: %d\n", sprn);
+#ifndef DEBUG_SPR
+ emulated = EMULATE_FAIL;
+#endif
+ break;
+ }
+
+ return emulated;
+}
+
+int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt)
+{
+ int emulated = EMULATE_DONE;
+
+ switch (sprn) {
+ case SPRN_SDR1:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->sdr1;
+ break;
+ case SPRN_DSISR:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->dsisr;
+ break;
+ case SPRN_DAR:
+ vcpu->arch.gpr[rt] = vcpu->arch.dear;
+ break;
+ case SPRN_HIOR:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->hior;
+ break;
+ case SPRN_HID0:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[0];
+ break;
+ case SPRN_HID1:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[1];
+ break;
+ case SPRN_HID2:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[2];
+ break;
+ case SPRN_HID4:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[4];
+ break;
+ case SPRN_HID5:
+ vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[5];
+ break;
+ case SPRN_THRM1:
+ case SPRN_THRM2:
+ case SPRN_THRM3:
+ case SPRN_CTRLF:
+ case SPRN_CTRLT:
+ vcpu->arch.gpr[rt] = 0;
+ break;
+ default:
+ printk(KERN_INFO "KVM: invalid SPR read: %d\n", sprn);
+#ifndef DEBUG_SPR
+ emulated = EMULATE_FAIL;
+#endif
+ break;
+ }
+
+ return emulated;
+}
+
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* Re: [PATCH 14/27] Add book3s_64 specific opcode emulation
2009-10-30 15:47 ` [PATCH 14/27] Add book3s_64 specific opcode emulation Alexander Graf
@ 2009-11-03 8:47 ` Segher Boessenkool
[not found] ` <A1CBD511-FF08-48BB-A8D6-9F66E20F770B-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
0 siblings, 1 reply; 48+ messages in thread
From: Segher Boessenkool @ 2009-11-03 8:47 UTC (permalink / raw)
To: Alexander Graf
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, kvm, bphilips, Olof Johansson
Nice patchset. Some comments on the emulation part:
> +#define OP_31_XOP_EIOIO 854
You mean EIEIO.
> + case 19:
> + switch (get_xop(inst)) {
> + case OP_19_XOP_RFID:
> + case OP_19_XOP_RFI:
> + vcpu->arch.pc = vcpu->arch.srr0;
> + kvmppc_set_msr(vcpu, vcpu->arch.srr1);
> + *advance = 0;
> + break;
I think you should only emulate the insns that exist on whatever the
guest
pretends to be. RFID exist only on 64-bit implementations. Same
comment
everywhere else.
> + case OP_31_XOP_EIOIO:
> + break;
Have you always executed an eieio or sync when you get here, or
do you just not allow direct access to I/O devices? Other context
synchronising insns are not enough, they do not broadcast on the
bus.
> + case OP_31_XOP_DCBZ:
> + {
> + ulong rb = vcpu->arch.gpr[get_rb(inst)];
> + ulong ra = 0;
> + ulong addr;
> + u32 zeros[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
> +
> + if (get_ra(inst))
> + ra = vcpu->arch.gpr[get_ra(inst)];
> +
> + addr = (ra + rb) & ~31ULL;
> + if (!(vcpu->arch.msr & MSR_SF))
> + addr &= 0xffffffff;
> +
> + if (kvmppc_st(vcpu, addr, 32, zeros)) {
DCBZ zeroes out a cache line, not 32 bytes; except on 970, where there
are HID bits to make it work on 32 bytes only, and an extra DCBZL insn
that always clears a full cache line (128 bytes).
> + switch (sprn) {
> + case SPRN_IBAT0U ... SPRN_IBAT3L:
> + bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT0U) / 2];
> + break;
> + case SPRN_IBAT4U ... SPRN_IBAT7L:
> + bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT4U) / 2];
> + break;
> + case SPRN_DBAT0U ... SPRN_DBAT3L:
> + bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT0U) / 2];
> + break;
> + case SPRN_DBAT4U ... SPRN_DBAT7L:
> + bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT4U) / 2];
> + break;
Do xBAT4..7 have the same SPR numbers on all CPUs? They are CPU-
specific
SPRs, after all. Some CPUs have only six, some only four, some none,
btw.
> + case SPRN_HID0:
> + to_book3s(vcpu)->hid[0] = vcpu->arch.gpr[rs];
> + break;
> + case SPRN_HID1:
> + to_book3s(vcpu)->hid[1] = vcpu->arch.gpr[rs];
> + break;
> + case SPRN_HID2:
> + to_book3s(vcpu)->hid[2] = vcpu->arch.gpr[rs];
> + break;
> + case SPRN_HID4:
> + to_book3s(vcpu)->hid[4] = vcpu->arch.gpr[rs];
> + break;
> + case SPRN_HID5:
> + to_book3s(vcpu)->hid[5] = vcpu->arch.gpr[rs];
HIDs are different per CPU; and worse, different CPUs have different
registers (SPR #s) for the same register name!
> + /* guest HID5 set can change is_dcbz32 */
> + if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
> + (mfmsr() & MSR_HV))
> + vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
> + break;
Wait, does this mean you allow other HID writes when MSR[HV] isn't
set? All HIDs (and many other SPRs) cannot be read or written in
supervisor mode.
Segher
^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 15/27] Add mfdec emulation
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (8 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 14/27] Add book3s_64 specific opcode emulation Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 16/27] Add desktop PowerPC specific emulation Alexander Graf
` (6 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
We support setting the DEC to a certain value right now. Doing that basically
triggers the CPU local timer.
But there's also an mfdec command that enabled the OS to read the decrementor.
This is required at least by all desktop and server PowerPC Linux kernels. It
can't really hurt to allow embedded ones to do it as well though.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
arch/powerpc/kvm/emulate.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 7737146..50d411d 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -66,12 +66,14 @@
void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
{
+ unsigned long nr_jiffies;
+
if (vcpu->arch.tcr & TCR_DIE) {
/* The decrementer ticks at the same rate as the timebase, so
* that's how we convert the guest DEC value to the number of
* host ticks. */
- unsigned long nr_jiffies;
+ vcpu->arch.dec_jiffies = mftb();
nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
mod_timer(&vcpu->arch.dec_timer,
get_jiffies_64() + nr_jiffies);
@@ -211,6 +213,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
/* Note: SPRG4-7 are user-readable, so we don't get
* a trap. */
+ case SPRN_DEC:
+ {
+ u64 jd = mftb() - vcpu->arch.dec_jiffies;
+ vcpu->arch.gpr[rt] = vcpu->arch.dec - jd;
+#ifdef DEBUG_EMUL
+ printk(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
+#endif
+ break;
+ }
default:
emulated = kvmppc_core_emulate_mfspr(vcpu, sprn, rt);
if (emulated == EMULATE_FAIL) {
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 16/27] Add desktop PowerPC specific emulation
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (9 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 15/27] Add mfdec emulation Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 19/27] Export symbols for KVM module Alexander Graf
` (5 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
Little opcodes behave differently on desktop and embedded PowerPC cores.
In order to reflect those differences, let's add some #ifdef code to emulate.c.
We could probably also handle them in the core specific emulation files, but I
would prefer to reuse as much code as possible.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v4 -> v5:
- use get_tb instead of mftb
- make ppc32 and ppc64 emulation share more code
---
arch/powerpc/kvm/emulate.c | 49 +++++++++++++++++++++++++++++++++++---------
1 files changed, 39 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 50d411d..1ec5e07 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -32,6 +32,7 @@
#include "trace.h"
#define OP_TRAP 3
+#define OP_TRAP_64 2
#define OP_31_XOP_LWZX 23
#define OP_31_XOP_LBZX 87
@@ -64,16 +65,36 @@
#define OP_STH 44
#define OP_STHU 45
+#ifdef CONFIG_PPC64
+static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
+{
+ return 1;
+}
+#else
+static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.tcr & TCR_DIE;
+}
+#endif
+
void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
{
unsigned long nr_jiffies;
- if (vcpu->arch.tcr & TCR_DIE) {
+#ifdef CONFIG_PPC64
+ /* POWER4+ triggers a dec interrupt if the value is < 0 */
+ if (vcpu->arch.dec & 0x80000000) {
+ del_timer(&vcpu->arch.dec_timer);
+ kvmppc_core_queue_dec(vcpu);
+ return;
+ }
+#endif
+ if (kvmppc_dec_enabled(vcpu)) {
/* The decrementer ticks at the same rate as the timebase, so
* that's how we convert the guest DEC value to the number of
* host ticks. */
- vcpu->arch.dec_jiffies = mftb();
+ vcpu->arch.dec_jiffies = get_tb();
nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
mod_timer(&vcpu->arch.dec_timer,
get_jiffies_64() + nr_jiffies);
@@ -113,9 +134,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
/* this default type might be overwritten by subcategories */
kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
+ pr_debug(KERN_INFO "Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
+
switch (get_op(inst)) {
case OP_TRAP:
+#ifdef CONFIG_PPC64
+ case OP_TRAP_64:
+#else
vcpu->arch.esr |= ESR_PTR;
+#endif
kvmppc_core_queue_program(vcpu);
advance = 0;
break;
@@ -190,17 +217,19 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
case SPRN_SRR1:
vcpu->arch.gpr[rt] = vcpu->arch.srr1; break;
case SPRN_PVR:
- vcpu->arch.gpr[rt] = mfspr(SPRN_PVR); break;
+ vcpu->arch.gpr[rt] = vcpu->arch.pvr; break;
case SPRN_PIR:
- vcpu->arch.gpr[rt] = mfspr(SPRN_PIR); break;
+ vcpu->arch.gpr[rt] = vcpu->vcpu_id; break;
+ case SPRN_MSSSR0:
+ vcpu->arch.gpr[rt] = 0; break;
/* Note: mftb and TBRL/TBWL are user-accessible, so
* the guest can always access the real TB anyways.
* In fact, we probably will never see these traps. */
case SPRN_TBWL:
- vcpu->arch.gpr[rt] = mftbl(); break;
+ vcpu->arch.gpr[rt] = get_tb() >> 32; break;
case SPRN_TBWU:
- vcpu->arch.gpr[rt] = mftbu(); break;
+ vcpu->arch.gpr[rt] = get_tb(); break;
case SPRN_SPRG0:
vcpu->arch.gpr[rt] = vcpu->arch.sprg0; break;
@@ -215,11 +244,9 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
case SPRN_DEC:
{
- u64 jd = mftb() - vcpu->arch.dec_jiffies;
+ u64 jd = get_tb() - vcpu->arch.dec_jiffies;
vcpu->arch.gpr[rt] = vcpu->arch.dec - jd;
-#ifdef DEBUG_EMUL
- printk(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
-#endif
+ pr_debug(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
break;
}
default:
@@ -271,6 +298,8 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
case SPRN_TBWL: break;
case SPRN_TBWU: break;
+ case SPRN_MSSSR0: break;
+
case SPRN_DEC:
vcpu->arch.dec = vcpu->arch.gpr[rs];
kvmppc_emulate_dec(vcpu);
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 19/27] Export symbols for KVM module
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (10 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 16/27] Add desktop PowerPC specific emulation Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
[not found] ` <1256917647-6200-20-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
2009-10-30 15:47 ` [PATCH 20/27] Split init_new_context and destroy_context Alexander Graf
` (4 subsequent siblings)
16 siblings, 1 reply; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
We want to be able to build KVM as a module. To enable us doing so, we
need some more exports from core Linux parts.
This patch exports all functions and variables that are required for KVM.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
v3 -> v4:
- don't export switch_slb
- don't export init_context
- don't export mm_alloc
---
arch/powerpc/kernel/ppc_ksyms.c | 3 ++-
arch/powerpc/kernel/time.c | 1 +
arch/powerpc/mm/hash_utils_64.c | 2 ++
3 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index c8b27bb..baf778c 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -163,11 +163,12 @@ EXPORT_SYMBOL(screen_info);
#ifdef CONFIG_PPC32
EXPORT_SYMBOL(timer_interrupt);
EXPORT_SYMBOL(irq_desc);
-EXPORT_SYMBOL(tb_ticks_per_jiffy);
EXPORT_SYMBOL(cacheable_memcpy);
EXPORT_SYMBOL(cacheable_memzero);
#endif
+EXPORT_SYMBOL(tb_ticks_per_jiffy);
+
#ifdef CONFIG_PPC32
EXPORT_SYMBOL(switch_mmu_context);
#endif
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 92dc844..e05f6af 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -268,6 +268,7 @@ void account_system_vtime(struct task_struct *tsk)
per_cpu(cputime_scaled_last_delta, smp_processor_id()) = deltascaled;
local_irq_restore(flags);
}
+EXPORT_SYMBOL_GPL(account_system_vtime);
/*
* Transfer the user and system times accumulated in the paca
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1ade7eb..2b2a4aa 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -92,6 +92,7 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
struct hash_pte *htab_address;
unsigned long htab_size_bytes;
unsigned long htab_hash_mask;
+EXPORT_SYMBOL_GPL(htab_hash_mask);
int mmu_linear_psize = MMU_PAGE_4K;
int mmu_virtual_psize = MMU_PAGE_4K;
int mmu_vmalloc_psize = MMU_PAGE_4K;
@@ -102,6 +103,7 @@ int mmu_io_psize = MMU_PAGE_4K;
int mmu_kernel_ssize = MMU_SEGSIZE_256M;
int mmu_highuser_ssize = MMU_SEGSIZE_256M;
u16 mmu_slb_size = 64;
+EXPORT_SYMBOL_GPL(mmu_slb_size);
#ifdef CONFIG_HUGETLB_PAGE
unsigned int HPAGE_SHIFT;
#endif
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 20/27] Split init_new_context and destroy_context
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (11 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 19/27] Export symbols for KVM module Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-31 4:40 ` Stephen Rothwell
2009-10-30 15:47 ` [PATCH 21/27] Export KVM symbols for module Alexander Graf
` (3 subsequent siblings)
16 siblings, 1 reply; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
For KVM we need to allocate a new context id, but don't really care about
all the mm context around it.
So let's split the alloc and destroy functions for the context id, so we can
grab one without allocating an mm context.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
arch/powerpc/include/asm/mmu_context.h | 5 +++++
arch/powerpc/mm/mmu_context_hash64.c | 24 +++++++++++++++++++++---
2 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index b34e94d..66b35d0 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
extern void set_context(unsigned long id, pgd_t *pgd);
#ifdef CONFIG_PPC_BOOK3S_64
+extern int __init_new_context(void);
+extern void __destroy_context(int context_id);
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
static inline void mmu_context_init(void) { }
#else
extern void mmu_context_init(void);
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index dbeb86a..b9e4cc2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -18,6 +18,7 @@
#include <linux/mm.h>
#include <linux/spinlock.h>
#include <linux/idr.h>
+#include <linux/module.h>
#include <asm/mmu_context.h>
@@ -32,7 +33,7 @@ static DEFINE_IDR(mmu_context_idr);
#define NO_CONTEXT 0
#define MAX_CONTEXT ((1UL << 19) - 1)
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int __init_new_context(void)
{
int index;
int err;
@@ -57,6 +58,18 @@ again:
return -ENOMEM;
}
+ return index;
+}
+EXPORT_SYMBOL_GPL(__init_new_context);
+
+int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+ int index;
+
+ index = __init_new_context();
+ if (index < 0)
+ return index;
+
/* The old code would re-promote on fork, we don't do that
* when using slices as it could cause problem promoting slices
* that have been forced down to 4K
@@ -68,11 +81,16 @@ again:
return 0;
}
-void destroy_context(struct mm_struct *mm)
+void __destroy_context(int context_id)
{
spin_lock(&mmu_context_lock);
- idr_remove(&mmu_context_idr, mm->context.id);
+ idr_remove(&mmu_context_idr, context_id);
spin_unlock(&mmu_context_lock);
+}
+EXPORT_SYMBOL_GPL(__destroy_context);
+void destroy_context(struct mm_struct *mm)
+{
+ __destroy_context(mm->context.id);
mm->context.id = NO_CONTEXT;
}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* Re: [PATCH 20/27] Split init_new_context and destroy_context
2009-10-30 15:47 ` [PATCH 20/27] Split init_new_context and destroy_context Alexander Graf
@ 2009-10-31 4:40 ` Stephen Rothwell
2009-10-31 21:20 ` Alexander Graf
0 siblings, 1 reply; 48+ messages in thread
From: Stephen Rothwell @ 2009-10-31 4:40 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm, Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
[-- Attachment #1: Type: text/plain, Size: 668 bytes --]
Hi Alexander,
On Fri, 30 Oct 2009 16:47:20 +0100 Alexander Graf <agraf@suse.de> wrote:
>
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
> extern void set_context(unsigned long id, pgd_t *pgd);
>
> #ifdef CONFIG_PPC_BOOK3S_64
> +extern int __init_new_context(void);
> +extern void __destroy_context(int context_id);
> +#endif
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
don't add the #endif/#ifdef pair ...
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 20/27] Split init_new_context and destroy_context
2009-10-31 4:40 ` Stephen Rothwell
@ 2009-10-31 21:20 ` Alexander Graf
2009-10-31 21:37 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 48+ messages in thread
From: Alexander Graf @ 2009-10-31 21:20 UTC (permalink / raw)
To: Stephen Rothwell
Cc: kvm, Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
Stephen Rothwell wrote:
> Hi Alexander,
>
> On Fri, 30 Oct 2009 16:47:20 +0100 Alexander Graf <agraf@suse.de> wrote:
>
>> --- a/arch/powerpc/include/asm/mmu_context.h
>> +++ b/arch/powerpc/include/asm/mmu_context.h
>> @@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
>> extern void set_context(unsigned long id, pgd_t *pgd);
>>
>> #ifdef CONFIG_PPC_BOOK3S_64
>> +extern int __init_new_context(void);
>> +extern void __destroy_context(int context_id);
>> +#endif
>> +
>> +#ifdef CONFIG_PPC_BOOK3S_64
>>
>
> don't add the #endif/#ifdef pair ...
Any other comments? I'd like to wind up a final patch set so I can stop
spamming on all those MLs :-).
Alex
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 20/27] Split init_new_context and destroy_context
2009-10-31 21:20 ` Alexander Graf
@ 2009-10-31 21:37 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 48+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-31 21:37 UTC (permalink / raw)
To: Alexander Graf
Cc: Kevin Wolf, Stephen Rothwell, Arnd Bergmann, Hollis Blanchard,
Marcelo Tosatti, kvm-ppc, linuxppc-dev, Avi Kivity, kvm, bphilips,
Olof Johansson
On Sat, 2009-10-31 at 22:20 +0100, Alexander Graf wrote:
> Stephen Rothwell wrote:
> > Hi Alexander,
> >
> > On Fri, 30 Oct 2009 16:47:20 +0100 Alexander Graf <agraf@suse.de> wrote:
> >
> >> --- a/arch/powerpc/include/asm/mmu_context.h
> >> +++ b/arch/powerpc/include/asm/mmu_context.h
> >> @@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
> >> extern void set_context(unsigned long id, pgd_t *pgd);
> >>
> >> #ifdef CONFIG_PPC_BOOK3S_64
> >> +extern int __init_new_context(void);
> >> +extern void __destroy_context(int context_id);
> >> +#endif
> >> +
> >> +#ifdef CONFIG_PPC_BOOK3S_64
> >>
> >
> > don't add the #endif/#ifdef pair ...
>
> Any other comments? I'd like to wind up a final patch set so I can stop
> spamming on all those MLs :-).
Just send an update for -that- patch to the list.
Ben.
^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 21/27] Export KVM symbols for module
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (12 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 20/27] Split init_new_context and destroy_context Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 22/27] Add fields to PACA Alexander Graf
` (2 subsequent siblings)
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
To be able to keep KVM as module, we need to export the SLB trampoline
addresses to the module, so it knows where to jump to.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
arch/powerpc/kvm/book3s_64_exports.c | 24 ++++++++++++++++++++++++
1 files changed, 24 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_64_exports.c
diff --git a/arch/powerpc/kvm/book3s_64_exports.c b/arch/powerpc/kvm/book3s_64_exports.c
new file mode 100644
index 0000000..5b2db38
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_exports.c
@@ -0,0 +1,24 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
+ */
+
+#include <linux/module.h>
+#include <asm/kvm_book3s.h>
+
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_enter);
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_lowmem);
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 22/27] Add fields to PACA
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (13 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 21/27] Export KVM symbols for module Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-10-30 15:47 ` [PATCH 27/27] Use hrtimers for the decrementer Alexander Graf
2009-11-05 6:03 ` [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v6 Benjamin Herrenschmidt
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
For KVM we need to store some information in the PACA, so we
need to extend it.
This patch adds KVM SLB shadow related entries to the PACA and
a field that indicates if we're inside a guest.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
arch/powerpc/include/asm/paca.h | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 7d8514c..5e9b4ef 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -129,6 +129,15 @@ struct paca_struct {
u64 system_time; /* accumulated system TB ticks */
u64 startpurr; /* PURR/TB value snapshot */
u64 startspurr; /* SPURR value snapshot */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+ struct {
+ u64 esid;
+ u64 vsid;
+ } kvm_slb[64]; /* guest SLB */
+ u8 kvm_slb_max; /* highest used guest slb entry */
+ u8 kvm_in_guest; /* are we inside the guest? */
+#endif
};
extern struct paca_struct paca[];
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* [PATCH 27/27] Use hrtimers for the decrementer
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (14 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 22/27] Add fields to PACA Alexander Graf
@ 2009-10-30 15:47 ` Alexander Graf
2009-11-05 6:03 ` [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v6 Benjamin Herrenschmidt
16 siblings, 0 replies; 48+ messages in thread
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm-u79uwXL29TY76Z2rM5mHXA
Cc: Avi Kivity, kvm-ppc, Hollis Blanchard, Arnd Bergmann,
Benjamin Herrenschmidt, Kevin Wolf, bphilips-l3A5Bk7waGM,
Marcelo Tosatti, Olof Johansson,
linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
Following S390's good example we should use hrtimers for the decrementer too!
This patch converts the timer from the old mechanism to hrtimers.
Signed-off-by: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
---
arch/powerpc/include/asm/kvm_host.h | 6 ++++--
arch/powerpc/kvm/emulate.c | 18 +++++++++++-------
arch/powerpc/kvm/powerpc.c | 20 ++++++++++++++++++--
3 files changed, 33 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 2cff5fe..1201f62 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -21,7 +21,8 @@
#define __POWERPC_KVM_HOST_H__
#include <linux/mutex.h>
-#include <linux/timer.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
#include <linux/types.h>
#include <linux/kvm_types.h>
#include <asm/kvm_asm.h>
@@ -250,7 +251,8 @@ struct kvm_vcpu_arch {
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
- struct timer_list dec_timer;
+ struct hrtimer dec_timer;
+ struct tasklet_struct tasklet;
u64 dec_jiffies;
unsigned long pending_exceptions;
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 1ec5e07..4a9ac66 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -18,7 +18,7 @@
*/
#include <linux/jiffies.h>
-#include <linux/timer.h>
+#include <linux/hrtimer.h>
#include <linux/types.h>
#include <linux/string.h>
#include <linux/kvm_host.h>
@@ -79,12 +79,13 @@ static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
{
- unsigned long nr_jiffies;
+ unsigned long dec_nsec;
+ pr_debug("mtDEC: %x\n", vcpu->arch.dec);
#ifdef CONFIG_PPC64
/* POWER4+ triggers a dec interrupt if the value is < 0 */
if (vcpu->arch.dec & 0x80000000) {
- del_timer(&vcpu->arch.dec_timer);
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
kvmppc_core_queue_dec(vcpu);
return;
}
@@ -94,12 +95,15 @@ void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
* that's how we convert the guest DEC value to the number of
* host ticks. */
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
+ dec_nsec = vcpu->arch.dec;
+ dec_nsec *= 1000;
+ dec_nsec /= tb_ticks_per_usec;
+ hrtimer_start(&vcpu->arch.dec_timer, ktime_set(0, dec_nsec),
+ HRTIMER_MODE_REL);
vcpu->arch.dec_jiffies = get_tb();
- nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
- mod_timer(&vcpu->arch.dec_timer,
- get_jiffies_64() + nr_jiffies);
} else {
- del_timer(&vcpu->arch.dec_timer);
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
}
}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4ae3490..4c582ed 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -23,6 +23,7 @@
#include <linux/kvm_host.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
+#include <linux/hrtimer.h>
#include <linux/fs.h>
#include <asm/cputable.h>
#include <asm/uaccess.h>
@@ -209,10 +210,25 @@ static void kvmppc_decrementer_func(unsigned long data)
}
}
+/*
+ * low level hrtimer wake routine. Because this runs in hardirq context
+ * we schedule a tasklet to do the real work.
+ */
+enum hrtimer_restart kvmppc_decrementer_wakeup(struct hrtimer *timer)
+{
+ struct kvm_vcpu *vcpu;
+
+ vcpu = container_of(timer, struct kvm_vcpu, arch.dec_timer);
+ tasklet_schedule(&vcpu->arch.tasklet);
+
+ return HRTIMER_NORESTART;
+}
+
int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
{
- setup_timer(&vcpu->arch.dec_timer, kvmppc_decrementer_func,
- (unsigned long)vcpu);
+ hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+ tasklet_init(&vcpu->arch.tasklet, kvmppc_decrementer_func, (ulong)vcpu);
+ vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup;
return 0;
}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 48+ messages in thread* Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v6
[not found] ` <1256917647-6200-1-git-send-email-agraf-l3A5Bk7waGM@public.gmane.org>
` (15 preceding siblings ...)
2009-10-30 15:47 ` [PATCH 27/27] Use hrtimers for the decrementer Alexander Graf
@ 2009-11-05 6:03 ` Benjamin Herrenschmidt
16 siblings, 0 replies; 48+ messages in thread
From: Benjamin Herrenschmidt @ 2009-11-05 6:03 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm-u79uwXL29TY76Z2rM5mHXA, Avi Kivity, kvm-ppc, Hollis Blanchard,
Arnd Bergmann, Kevin Wolf, bphilips-l3A5Bk7waGM, Marcelo Tosatti,
Olof Johansson, linuxppc-dev-mnsaURCQ41sdnm+yROfE0A
On Fri, 2009-10-30 at 16:47 +0100, Alexander Graf wrote:
> KVM for PowerPC only supports embedded cores at the moment.
>
> While it makes sense to virtualize on small machines, it's even more fun
> to do so on big boxes. So I figured we need KVM for PowerPC64 as well.
I get that with exit timing enabled:
arch/powerpc/kvm/timing.c:205: error: ‘THIS_MODULE’ undeclared here (not in a function)
I'll stick a fixup patch in my tree (just adding #include <linux/module.h>)
Cheers,
Ben.
^ permalink raw reply [flat|nested] 48+ messages in thread