LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: mpc512x/clock: fix clk_get logic
From: Mark Brown @ 2009-10-30 12:02 UTC (permalink / raw)
  To: Wolfram Sang; +Cc: linuxppc-dev, Wolfgang Denk
In-Reply-To: <20091030113644.GE12068@pengutronix.de>

On Fri, Oct 30, 2009 at 12:36:44PM +0100, Wolfram Sang wrote:
> On Fri, Oct 30, 2009 at 10:54:14AM +0000, Mark Brown wrote:

> > > - require the id field (as _this_ is the unique identifier)

> > NULL id fields are supposed to be supported in the cannonical clkdev
> > API, unfortunately.

> Hmm, ok, thanks for the hint. Where is this documented, I can't find it?

I'm not aware of any particular documentation on it but searching for
mailing list posts from rmk ought to turn up some posts on the issue.

^ permalink raw reply

* RE: Accessing flash directly from User Space [SOLVED]
From: Jonathan Haws @ 2009-10-30 14:50 UTC (permalink / raw)
  To: Scott Wood, Joakim Tjernlund; +Cc: Bill Gatliff, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20091029233038.GA21587@b07421-ec1.am.freescale.net>

> On Thu, Oct 29, 2009 at 10:00:28AM +0100, Joakim Tjernlund wrote:
> > > I have found the problem.  It occurred to me in the shower (okay
> not really,
> > > but most good ideas happen there).
> > >
> > > What was happening is that I was in fact able to write to the
> correct
> > > registers.  However, I would try and write to them in a batch.
> But the way
> > > mmap works (at least according to the man page) with MAP_SHARED
> is that the
> > > file may not be updated until msync() is called.  Now, I thought
> that O_SYNC
> > > would take care of that when I open /dev/mem, but that was not
> the case.
> > >
> > > Anyway, to make a long story short, I inserted an msync() after
> each
> > > assignment to the flash.  This resolved my problem and I can now
> program my flash.
> >
> > Ouch, this was news to me too. Calling msync() after every write
> kills performance.
> > We use mmap(/dev/mem) to access HW and havn't seen any issues yet.
> Is this
> > perhaps a new behaviour for mmap(/dev/mem) and is there a way
> > to avoid calling msync()?
>=20
> I suspect that the msync() was merely serving as a very heavyweight
> memory barrier.

I did try hacking the mb() calls from the kernel source to use them in user=
 space, but they had no effect.  I still had to include the calls to msync(=
).

^ permalink raw reply

* Re: Accessing flash directly from User Space [SOLVED]
From: Michael Buesch @ 2009-10-30 14:56 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Scott Wood, Bill Gatliff, Jonathan Haws
In-Reply-To: <BB99A6BA28709744BF22A68E6D7EB51F0330E230FC@midas.usurf.usu.edu>

On Friday 30 October 2009 15:50:22 Jonathan Haws wrote:
> > I suspect that the msync() was merely serving as a very heavyweight
> > memory barrier.
> 
> I did try hacking the mb() calls from the kernel source to use them in user space, but they had no effect.  I still had to include the calls to msync().

What did the resulting mb() that you used look like?

-- 
Greetings, Michael.

^ permalink raw reply

* RE: Accessing flash directly from User Space [SOLVED]
From: Jonathan Haws @ 2009-10-30 14:57 UTC (permalink / raw)
  To: Michael Buesch, linuxppc-dev@lists.ozlabs.org; +Cc: Scott Wood, Bill Gatliff
In-Reply-To: <200910301556.11349.mb@bu3sch.de>

> On Friday 30 October 2009 15:50:22 Jonathan Haws wrote:
> > > I suspect that the msync() was merely serving as a very
> heavyweight
> > > memory barrier.
> >
> > I did try hacking the mb() calls from the kernel source to use
> them in user space, but they had no effect.  I still had to include
> the calls to msync().
>=20
> What did the resulting mb() that you used look like?

asm("eieio; sync");

^ permalink raw reply

* Re: Accessing flash directly from User Space [SOLVED]
From: Alessandro Rubini @ 2009-10-30 15:08 UTC (permalink / raw)
  To: Jonathan.Haws; +Cc: scottwood, bgat, linuxppc-dev
In-Reply-To: <BB99A6BA28709744BF22A68E6D7EB51F0330E23124@midas.usurf.usu.edu >

> asm("eieio; sync");

Hmm...
	: : : "memory"

And, doesn't ";" start a comment in assembly? (no, not on powerpc it seems)

Just to make sure, if you replace msync() with another system call,
like "kill(1, 0);" you can verify wether it's really useful or if it's
only acting as a barrier.

/alessandro

^ permalink raw reply

* Re: Accessing flash directly from User Space [SOLVED]
From: Michael Buesch @ 2009-10-30 15:24 UTC (permalink / raw)
  To: Alessandro Rubini; +Cc: scottwood, bgat, linuxppc-dev, Jonathan.Haws
In-Reply-To: <20091030150855.GA15046@mail.gnudd.com>

On Friday 30 October 2009 16:08:55 Alessandro Rubini wrote:
> > asm("eieio; sync");
> 
> Hmm...
> 	: : : "memory"
> 
> And, doesn't ";" start a comment in assembly? (no, not on powerpc it seems)

Yes, I think the barrier is wrong.
Please try with

#define mb()	__asm__ __volatile__("eieio\n sync\n" : : : "memory")

-- 
Greetings, Michael.

^ permalink raw reply

* RE: Accessing flash directly from User Space [SOLVED]
From: Jonathan Haws @ 2009-10-30 15:33 UTC (permalink / raw)
  To: Michael Buesch, Alessandro Rubini
  Cc: scottwood@freescale.com, bgat@billgatliff.com,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <200910301624.03989.mb@bu3sch.de>

> On Friday 30 October 2009 16:08:55 Alessandro Rubini wrote:
> > > asm("eieio; sync");
> >
> > Hmm...
> > 	: : : "memory"
> >
> > And, doesn't ";" start a comment in assembly? (no, not on powerpc
> it seems)
>=20
> Yes, I think the barrier is wrong.
> Please try with
>=20
> #define mb()	__asm__ __volatile__("eieio\n sync\n" : : :
> "memory")

That definition worked great.  I must have missed the : : : "memory" bit wh=
en I was digging through code.

Thanks, that gives me about a 2x speedup over the msync() calls.

Thanks for the help!

^ permalink raw reply

* [PATCH 26/27] Use Little Endian for Dirty Bitmap
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

We currently use host endian long types to store information
in the dirty bitmap.

This works reasonably well on Little Endian targets, because the
u32 after the first contains the next 32 bits. On Big Endian this
breaks completely though, forcing us to be inventive here.

So Ben suggested to always use Little Endian, which looks reasonable.

We only have dirty bitmap implemented in Little Endian targets so far
and since PowerPC would be the first Big Endian platform, we can just
as well switch to Little Endian always with little effort without
breaking existing targets.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 virt/kvm/kvm_main.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bd44fb4..972f9fb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -49,6 +49,7 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
+#include <asm-generic/bitops/le.h>
 
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 #include "coalesced_mmio.h"
@@ -1071,8 +1072,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 		unsigned long rel_gfn = gfn - memslot->base_gfn;
 
 		/* avoid RMW */
-		if (!test_bit(rel_gfn, memslot->dirty_bitmap))
-			set_bit(rel_gfn, memslot->dirty_bitmap);
+		if (!generic_test_le_bit(rel_gfn, memslot->dirty_bitmap))
+			generic___set_le_bit(rel_gfn, memslot->dirty_bitmap);
 	}
 }
 
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 06/27] Add Book3s_64 intercept helpers
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

We need to intercept interrupt vectors. To do that, let's add a file
we can always include which only activates the intercepts when we have
then configured.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_book3s_64_asm.h |   58 ++++++++++++++++++++++++++
 1 files changed, 58 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_64_asm.h

diff --git a/arch/powerpc/include/asm/kvm_book3s_64_asm.h b/arch/powerpc/include/asm/kvm_book3s_64_asm.h
new file mode 100644
index 0000000..2e06ee8
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s_64_asm.h
@@ -0,0 +1,58 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#ifndef __ASM_KVM_BOOK3S_ASM_H__
+#define __ASM_KVM_BOOK3S_ASM_H__
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+
+#include <asm/kvm_asm.h>
+
+.macro DO_KVM intno
+	.if (\intno == BOOK3S_INTERRUPT_SYSTEM_RESET) || \
+	    (\intno == BOOK3S_INTERRUPT_MACHINE_CHECK) || \
+	    (\intno == BOOK3S_INTERRUPT_DATA_STORAGE) || \
+	    (\intno == BOOK3S_INTERRUPT_INST_STORAGE) || \
+	    (\intno == BOOK3S_INTERRUPT_DATA_SEGMENT) || \
+	    (\intno == BOOK3S_INTERRUPT_INST_SEGMENT) || \
+	    (\intno == BOOK3S_INTERRUPT_EXTERNAL) || \
+	    (\intno == BOOK3S_INTERRUPT_ALIGNMENT) || \
+	    (\intno == BOOK3S_INTERRUPT_PROGRAM) || \
+	    (\intno == BOOK3S_INTERRUPT_FP_UNAVAIL) || \
+	    (\intno == BOOK3S_INTERRUPT_DECREMENTER) || \
+	    (\intno == BOOK3S_INTERRUPT_SYSCALL) || \
+	    (\intno == BOOK3S_INTERRUPT_TRACE) || \
+	    (\intno == BOOK3S_INTERRUPT_PERFMON) || \
+	    (\intno == BOOK3S_INTERRUPT_ALTIVEC) || \
+	    (\intno == BOOK3S_INTERRUPT_VSX)
+
+	b	kvmppc_trampoline_\intno
+kvmppc_resume_\intno:
+
+	.endif
+.endm
+
+#else
+
+.macro DO_KVM intno
+.endm
+
+#endif /* CONFIG_KVM_BOOK3S_64_HANDLER */
+
+#endif /* __ASM_KVM_BOOK3S_ASM_H__ */
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 09/27] Add interrupt handling code
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

Getting from host state to the guest is only half the story. We also need
to return to our host context and handle whatever happened to get us out of
the guest.

On PowerPC every guest exit is an interrupt. So all we need to do is trap
the host's interrupt handlers and get into our #VMEXIT code to handle it.

PowerPCs also have a register that can add an offset to the interrupt handlers'
adresses which is what the booke KVM code uses. Unfortunately that is a
hypervisor ressource and we also want to be able to run KVM when we're running
in an LPAR. So we have to hook into the Linux interrupt handlers.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v3 -> v4:

  - header rename fix
---
 arch/powerpc/kvm/book3s_64_rmhandlers.S |  131 +++++++++++++++++++++++++++++++
 1 files changed, 131 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S

diff --git a/arch/powerpc/kvm/book3s_64_rmhandlers.S b/arch/powerpc/kvm/book3s_64_rmhandlers.S
new file mode 100644
index 0000000..fb7dd2e
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_rmhandlers.S
@@ -0,0 +1,131 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+/*****************************************************************************
+ *                                                                           *
+ *        Real Mode handlers that need to be in low physical memory          *
+ *                                                                           *
+ ****************************************************************************/
+
+
+.macro INTERRUPT_TRAMPOLINE intno
+
+.global kvmppc_trampoline_\intno
+kvmppc_trampoline_\intno:
+
+	mtspr	SPRN_SPRG_SCRATCH0, r13		/* Save r13 */
+
+	/*
+	 * First thing to do is to find out if we're coming
+	 * from a KVM guest or a Linux process.
+	 *
+	 * To distinguish, we check a magic byte in the PACA
+	 */
+	mfspr	r13, SPRN_SPRG_PACA		/* r13 = PACA */
+	std	r12, (PACA_EXMC + EX_R12)(r13)
+	mfcr	r12
+	stw	r12, (PACA_EXMC + EX_CCR)(r13)
+	lbz	r12, PACA_KVM_IN_GUEST(r13)
+	cmpwi	r12, 0
+	bne	..kvmppc_handler_hasmagic_\intno
+	/* No KVM guest? Then jump back to the Linux handler! */
+	lwz	r12, (PACA_EXMC + EX_CCR)(r13)
+	mtcr	r12
+	ld	r12, (PACA_EXMC + EX_R12)(r13)
+	mfspr	r13, SPRN_SPRG_SCRATCH0		/* r13 = original r13 */
+	b	kvmppc_resume_\intno		/* Get back original handler */
+
+	/* Now we know we're handling a KVM guest */
+..kvmppc_handler_hasmagic_\intno:
+	/* Unset guest state */
+	li	r12, 0
+	stb	r12, PACA_KVM_IN_GUEST(r13)
+
+	std	r1, (PACA_EXMC+EX_R9)(r13)
+	std	r10, (PACA_EXMC+EX_R10)(r13)
+	std	r11, (PACA_EXMC+EX_R11)(r13)
+	std	r2, (PACA_EXMC+EX_R13)(r13)
+
+	mfsrr0	r10
+	mfsrr1	r11
+
+	/* Restore R1/R2 so we can handle faults */
+	ld	r1, PACAR1(r13)
+	ld	r2, (PACA_EXMC+EX_SRR0)(r13)
+
+	/* Let's store which interrupt we're handling */
+	li	r12, \intno
+
+	/* Jump into the SLB exit code that goes to the highmem handler */
+	b	kvmppc_handler_trampoline_exit
+
+.endm
+
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_SYSTEM_RESET
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_MACHINE_CHECK
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_DATA_STORAGE
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_DATA_SEGMENT
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_INST_STORAGE
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_INST_SEGMENT
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_EXTERNAL
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_ALIGNMENT
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_PROGRAM
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_FP_UNAVAIL
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_DECREMENTER
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_SYSCALL
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_TRACE
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_PERFMON
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_ALTIVEC
+INTERRUPT_TRAMPOLINE	BOOK3S_INTERRUPT_VSX
+
+/*
+ * This trampoline brings us back to a real mode handler
+ *
+ * Input Registers:
+ *
+ * R6 = SRR0
+ * R7 = SRR1
+ * LR = real-mode IP
+ *
+ */
+.global kvmppc_handler_lowmem_trampoline
+kvmppc_handler_lowmem_trampoline:
+
+	mtsrr0	r6
+	mtsrr1	r7
+	blr
+kvmppc_handler_lowmem_trampoline_end:
+
+.global kvmppc_trampoline_lowmem
+kvmppc_trampoline_lowmem:
+	.long kvmppc_handler_lowmem_trampoline - _stext
+
+.global kvmppc_trampoline_enter
+kvmppc_trampoline_enter:
+	.long kvmppc_handler_trampoline_enter - _stext
+
+#include "book3s_64_slb.S"
+
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 07/27] Add book3s_64 highmem asm code
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

This is the of entry / exit code. In order to switch between host and guest
context, we need to switch register state and call the exit code handler on
exit.

This assembly file does exactly that. To finally enter the guest it calls
into book3s_64_slb.S. On exit it gets jumped at from book3s_64_slb.S too.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v3 -> v4:

  - header rename fix
---
 arch/powerpc/include/asm/kvm_ppc.h      |    1 +
 arch/powerpc/kvm/book3s_64_interrupts.S |  392 +++++++++++++++++++++++++++++++
 2 files changed, 393 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_interrupts.S

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 2c6ee34..269ee46 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -39,6 +39,7 @@ enum emulation_result {
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern char kvmppc_handlers_start[];
 extern unsigned long kvmppc_handler_len;
+extern void kvmppc_handler_highmem(void);
 
 extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
 extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/book3s_64_interrupts.S b/arch/powerpc/kvm/book3s_64_interrupts.S
new file mode 100644
index 0000000..7b55d80
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_interrupts.S
@@ -0,0 +1,392 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+#define KVMPPC_HANDLE_EXIT .kvmppc_handle_exit
+#define ULONG_SIZE 8
+#define VCPU_GPR(n)     (VCPU_GPRS + (n * ULONG_SIZE))
+
+.macro mfpaca tmp_reg, src_reg, offset, vcpu_reg
+	ld	\tmp_reg, (PACA_EXMC+\offset)(r13)
+	std	\tmp_reg, VCPU_GPR(\src_reg)(\vcpu_reg)
+.endm
+
+.macro DISABLE_INTERRUPTS
+       mfmsr   r0
+       rldicl  r0,r0,48,1
+       rotldi  r0,r0,16
+       mtmsrd  r0,1
+.endm
+
+/*****************************************************************************
+ *                                                                           *
+ *     Guest entry / exit code that is in kernel module memory (highmem)     *
+ *                                                                           *
+ ****************************************************************************/
+
+/* Registers:
+ *  r3: kvm_run pointer
+ *  r4: vcpu pointer
+ */
+_GLOBAL(__kvmppc_vcpu_entry)
+
+kvm_start_entry:
+	/* Write correct stack frame */
+	mflr    r0
+	std     r0,16(r1)
+
+	/* Save host state to the stack */
+	stdu	r1, -SWITCH_FRAME_SIZE(r1)
+
+	/* Save r3 (kvm_run) and r4 (vcpu) */
+	SAVE_2GPRS(3, r1)
+
+	/* Save non-volatile registers (r14 - r31) */
+	SAVE_NVGPRS(r1)
+
+	/* Save LR */
+	mflr	r14
+	std	r14, _LINK(r1)
+
+/* XXX optimize non-volatile loading away */
+kvm_start_lightweight:
+
+	DISABLE_INTERRUPTS
+
+	/* Save R1/R2 in the PACA */
+	std	r1, PACAR1(r13)
+	std	r2, (PACA_EXMC+EX_SRR0)(r13)
+	ld	r3, VCPU_HIGHMEM_HANDLER(r4)
+	std	r3, PACASAVEDMSR(r13)
+
+	/* Load non-volatile guest state from the vcpu */
+	ld	r14, VCPU_GPR(r14)(r4)
+	ld	r15, VCPU_GPR(r15)(r4)
+	ld	r16, VCPU_GPR(r16)(r4)
+	ld	r17, VCPU_GPR(r17)(r4)
+	ld	r18, VCPU_GPR(r18)(r4)
+	ld	r19, VCPU_GPR(r19)(r4)
+	ld	r20, VCPU_GPR(r20)(r4)
+	ld	r21, VCPU_GPR(r21)(r4)
+	ld	r22, VCPU_GPR(r22)(r4)
+	ld	r23, VCPU_GPR(r23)(r4)
+	ld	r24, VCPU_GPR(r24)(r4)
+	ld	r25, VCPU_GPR(r25)(r4)
+	ld	r26, VCPU_GPR(r26)(r4)
+	ld	r27, VCPU_GPR(r27)(r4)
+	ld	r28, VCPU_GPR(r28)(r4)
+	ld	r29, VCPU_GPR(r29)(r4)
+	ld	r30, VCPU_GPR(r30)(r4)
+	ld	r31, VCPU_GPR(r31)(r4)
+
+	ld	r9, VCPU_PC(r4)			/* r9 = vcpu->arch.pc */
+	ld	r10, VCPU_SHADOW_MSR(r4)	/* r10 = vcpu->arch.shadow_msr */
+
+	ld	r3, VCPU_TRAMPOLINE_ENTER(r4)
+	mtsrr0	r3
+
+	LOAD_REG_IMMEDIATE(r3, MSR_KERNEL & ~(MSR_IR | MSR_DR))
+	mtsrr1	r3
+
+	/* Load guest state in the respective registers */
+	lwz	r3, VCPU_CR(r4)		/* r3 = vcpu->arch.cr */
+	stw	r3, (PACA_EXMC + EX_CCR)(r13)
+
+	ld	r3, VCPU_CTR(r4)	/* r3 = vcpu->arch.ctr */
+	mtctr	r3			/* CTR = r3 */
+
+	ld	r3, VCPU_LR(r4)		/* r3 = vcpu->arch.lr */
+	mtlr	r3			/* LR = r3 */
+
+	ld	r3, VCPU_XER(r4)	/* r3 = vcpu->arch.xer */
+	std	r3, (PACA_EXMC + EX_R3)(r13)
+
+	/* Some guests may need to have dcbz set to 32 byte length.
+	 *
+	 * Usually we ensure that by patching the guest's instructions
+	 * to trap on dcbz and emulate it in the hypervisor.
+	 *
+	 * If we can, we should tell the CPU to use 32 byte dcbz though,
+	 * because that's a lot faster.
+	 */
+
+	ld	r3, VCPU_HFLAGS(r4)
+	rldicl.	r3, r3, 0, 63		/* CR = ((r3 & 1) == 0) */
+	beq	no_dcbz32_on
+
+	mfspr   r3,SPRN_HID5
+	ori     r3, r3, 0x80		/* XXX HID5_dcbz32 = 0x80 */
+	mtspr   SPRN_HID5,r3
+
+no_dcbz32_on:
+	/*	Load guest GPRs */
+
+	ld	r3, VCPU_GPR(r9)(r4)
+	std	r3, (PACA_EXMC + EX_R9)(r13)
+	ld	r3, VCPU_GPR(r10)(r4)
+	std	r3, (PACA_EXMC + EX_R10)(r13)
+	ld	r3, VCPU_GPR(r11)(r4)
+	std	r3, (PACA_EXMC + EX_R11)(r13)
+	ld	r3, VCPU_GPR(r12)(r4)
+	std	r3, (PACA_EXMC + EX_R12)(r13)
+	ld	r3, VCPU_GPR(r13)(r4)
+	std	r3, (PACA_EXMC + EX_R13)(r13)
+
+	ld	r0, VCPU_GPR(r0)(r4)
+	ld	r1, VCPU_GPR(r1)(r4)
+	ld	r2, VCPU_GPR(r2)(r4)
+	ld	r3, VCPU_GPR(r3)(r4)
+	ld	r5, VCPU_GPR(r5)(r4)
+	ld	r6, VCPU_GPR(r6)(r4)
+	ld	r7, VCPU_GPR(r7)(r4)
+	ld	r8, VCPU_GPR(r8)(r4)
+	ld	r4, VCPU_GPR(r4)(r4)
+
+	/* This sets the Magic value for the trampoline */
+
+	li	r11, 1
+	stb	r11, PACA_KVM_IN_GUEST(r13)
+
+	/* Jump to SLB patching handlder and into our guest */
+	RFI
+
+/*
+ * This is the handler in module memory. It gets jumped at from the
+ * lowmem trampoline code, so it's basically the guest exit code.
+ *
+ */
+
+.global kvmppc_handler_highmem
+kvmppc_handler_highmem:
+
+	/*
+	 * Register usage at this point:
+	 *
+	 * R00   = guest R13
+	 * R01   = host R1
+	 * R02   = host R2
+	 * R10   = guest PC
+	 * R11   = guest MSR
+	 * R12   = exit handler id
+	 * R13   = PACA
+	 * PACA.exmc.R9    = guest R1
+	 * PACA.exmc.R10   = guest R10
+	 * PACA.exmc.R11   = guest R11
+	 * PACA.exmc.R12   = guest R12
+	 * PACA.exmc.R13   = guest R2
+	 * PACA.exmc.DAR   = guest DAR
+	 * PACA.exmc.DSISR = guest DSISR
+	 * PACA.exmc.LR    = guest instruction
+	 * PACA.exmc.CCR   = guest CR
+	 * PACA.exmc.SRR0  = guest R0
+	 *
+	 */
+
+	std	r3, (PACA_EXMC+EX_R3)(r13)
+
+	/* save the exit id in R3 */
+	mr	r3, r12
+
+	/* R12 = vcpu */
+	ld	r12, GPR4(r1)
+
+	/* Now save the guest state */
+
+	std	r0, VCPU_GPR(r13)(r12)
+	std	r4, VCPU_GPR(r4)(r12)
+	std	r5, VCPU_GPR(r5)(r12)
+	std	r6, VCPU_GPR(r6)(r12)
+	std	r7, VCPU_GPR(r7)(r12)
+	std	r8, VCPU_GPR(r8)(r12)
+	std	r9, VCPU_GPR(r9)(r12)
+
+	/* get registers from PACA */
+	mfpaca	r5, r0, EX_SRR0, r12
+	mfpaca	r5, r3, EX_R3, r12
+	mfpaca	r5, r1, EX_R9, r12
+	mfpaca	r5, r10, EX_R10, r12
+	mfpaca	r5, r11, EX_R11, r12
+	mfpaca	r5, r12, EX_R12, r12
+	mfpaca	r5, r2, EX_R13, r12
+
+	lwz	r5, (PACA_EXMC+EX_LR)(r13)
+	stw	r5, VCPU_LAST_INST(r12)
+
+	lwz	r5, (PACA_EXMC+EX_CCR)(r13)
+	stw	r5, VCPU_CR(r12)
+
+	ld	r5, VCPU_HFLAGS(r12)
+	rldicl.	r5, r5, 0, 63		/* CR = ((r5 & 1) == 0) */
+	beq	no_dcbz32_off
+
+	mfspr   r5,SPRN_HID5
+	rldimi  r5,r5,6,56
+	mtspr   SPRN_HID5,r5
+
+no_dcbz32_off:
+
+	/* XXX maybe skip on lightweight? */
+	std	r14, VCPU_GPR(r14)(r12)
+	std	r15, VCPU_GPR(r15)(r12)
+	std	r16, VCPU_GPR(r16)(r12)
+	std	r17, VCPU_GPR(r17)(r12)
+	std	r18, VCPU_GPR(r18)(r12)
+	std	r19, VCPU_GPR(r19)(r12)
+	std	r20, VCPU_GPR(r20)(r12)
+	std	r21, VCPU_GPR(r21)(r12)
+	std	r22, VCPU_GPR(r22)(r12)
+	std	r23, VCPU_GPR(r23)(r12)
+	std	r24, VCPU_GPR(r24)(r12)
+	std	r25, VCPU_GPR(r25)(r12)
+	std	r26, VCPU_GPR(r26)(r12)
+	std	r27, VCPU_GPR(r27)(r12)
+	std	r28, VCPU_GPR(r28)(r12)
+	std	r29, VCPU_GPR(r29)(r12)
+	std	r30, VCPU_GPR(r30)(r12)
+	std	r31, VCPU_GPR(r31)(r12)
+
+	/* Restore non-volatile host registers (r14 - r31) */
+	REST_NVGPRS(r1)
+
+	/* Save guest PC (R10) */
+	std	r10, VCPU_PC(r12)
+
+	/* Save guest msr (R11) */
+	std	r11, VCPU_SHADOW_MSR(r12)
+
+	/* Save guest CTR (in R12) */
+	mfctr	r5
+	std	r5, VCPU_CTR(r12)
+
+	/* Save guest LR */
+	mflr	r5
+	std	r5, VCPU_LR(r12)
+
+	/* Save guest XER */
+	mfxer	r5
+	std	r5, VCPU_XER(r12)
+
+	/* Save guest DAR */
+	ld	r5, (PACA_EXMC+EX_DAR)(r13)
+	std	r5, VCPU_FAULT_DEAR(r12)
+
+	/* Save guest DSISR */
+	lwz	r5, (PACA_EXMC+EX_DSISR)(r13)
+	std	r5, VCPU_FAULT_DSISR(r12)
+
+	/* Restore host msr -> SRR1 */
+	ld	r7, VCPU_HOST_MSR(r12)
+	mtsrr1	r7
+
+	/* Restore host IP -> SRR0 */
+	ld	r6, VCPU_HOST_RETIP(r12)
+	mtsrr0	r6
+
+	/*
+	 * For some interrupts, we need to call the real Linux
+	 * handler, so it can do work for us. This has to happen
+	 * as if the interrupt arrived from the kernel though,
+	 * so let's fake it here where most state is restored.
+	 *
+	 * Call Linux for hardware interrupts/decrementer
+	 * r3 = address of interrupt handler (exit reason)
+	 */
+
+	cmpwi	r3, BOOK3S_INTERRUPT_EXTERNAL
+	beq	call_linux_handler
+	cmpwi	r3, BOOK3S_INTERRUPT_DECREMENTER
+	beq	call_linux_handler
+
+	/* Back to Interruptable Mode! (goto kvm_return_point) */
+	RFI
+
+call_linux_handler:
+
+	/*
+	 * If we land here we need to jump back to the handler we
+	 * came from.
+	 *
+	 * We have a page that we can access from real mode, so let's
+	 * jump back to that and use it as a trampoline to get back into the
+	 * interrupt handler!
+	 *
+	 * R3 still contains the exit code,
+	 * R6 VCPU_HOST_RETIP and
+	 * R7 VCPU_HOST_MSR
+	 */
+
+	mtlr	r3
+
+	ld	r5, VCPU_TRAMPOLINE_LOWMEM(r12)
+	mtsrr0	r5
+	LOAD_REG_IMMEDIATE(r5, MSR_KERNEL & ~(MSR_IR | MSR_DR))
+	mtsrr1	r5
+
+	RFI
+
+.global kvm_return_point
+kvm_return_point:
+
+	/* Jump back to lightweight entry if we're supposed to */
+	/* go back into the guest */
+	mr	r5, r3
+	/* Restore r3 (kvm_run) and r4 (vcpu) */
+	REST_2GPRS(3, r1)
+	bl	KVMPPC_HANDLE_EXIT
+
+#if 0 /* XXX get lightweight exits back */
+	cmpwi	r3, RESUME_GUEST
+	bne	kvm_exit_heavyweight
+
+	/* put VCPU and KVM_RUN back into place and roll again! */
+	REST_2GPRS(3, r1)
+	b	kvm_start_lightweight
+
+kvm_exit_heavyweight:
+	/* Restore non-volatile host registers */
+	ld	r14, _LINK(r1)
+	mtlr	r14
+	REST_NVGPRS(r1)
+
+	addi    r1, r1, SWITCH_FRAME_SIZE
+#else
+	ld	r4, _LINK(r1)
+	mtlr	r4
+
+	cmpwi	r3, RESUME_GUEST
+	bne	kvm_exit_heavyweight
+
+	REST_2GPRS(3, r1)
+
+	addi    r1, r1, SWITCH_FRAME_SIZE
+
+	b	kvm_start_entry
+
+kvm_exit_heavyweight:
+
+	addi    r1, r1, SWITCH_FRAME_SIZE
+#endif
+
+	blr
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 11/27] Add book3s_64 Host MMU handling
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

We designed the Book3S port of KVM as modular as possible. Most
of the code could be easily used on a Book3S_32 host as well.

The main difference between 32 and 64 bit cores is the MMU. To keep
things well separated, we treat the book3s_64 MMU as one possible compile
option.

This patch adds all the MMU helpers the rest of the code needs in
order to modify the host's MMU, like setting PTEs and segments.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v5 -> v6

  - don't take mmap_sem
  - dprintk instead if scattered #ifdef's
  - minor cleanups
  - // -> /* */ (book3s_64_mmu_host.c)
---
 arch/powerpc/kvm/book3s_64_mmu_host.c |  408 +++++++++++++++++++++++++++++++++
 1 files changed, 408 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu_host.c

diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
new file mode 100644
index 0000000..f2899b2
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -0,0 +1,408 @@
+/*
+ * Copyright (C) 2009 SUSE Linux Products GmbH. All rights reserved.
+ *
+ * Authors:
+ *     Alexander Graf <agraf@suse.de>
+ *     Kevin Wolf <mail@kevin-wolf.de>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+
+#include <linux/kvm_host.h>
+
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+#include <asm/mmu-hash64.h>
+#include <asm/machdep.h>
+#include <asm/mmu_context.h>
+#include <asm/hw_irq.h>
+
+#define PTE_SIZE 12
+#define VSID_ALL 0
+
+/* #define DEBUG_MMU */
+/* #define DEBUG_SLB */
+
+#ifdef DEBUG_MMU
+#define dprintk_mmu(a, ...) printk(KERN_INFO a, __VA_ARGS__)
+#else
+#define dprintk_mmu(a, ...) do { } while(0)
+#endif
+
+#ifdef DEBUG_SLB
+#define dprintk_slb(a, ...) printk(KERN_INFO a, __VA_ARGS__)
+#else
+#define dprintk_slb(a, ...) do { } while(0)
+#endif
+
+static void invalidate_pte(struct hpte_cache *pte)
+{
+	dprintk_mmu("KVM: Flushing SPT %d: 0x%llx (0x%llx) -> 0x%llx\n",
+		    i, pte->pte.eaddr, pte->pte.vpage, pte->host_va);
+
+	ppc_md.hpte_invalidate(pte->slot, pte->host_va,
+			       MMU_PAGE_4K, MMU_SEGSIZE_256M,
+			       false);
+	pte->host_va = 0;
+	kvm_release_pfn_dirty(pte->pfn);
+}
+
+void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, u64 guest_ea, u64 ea_mask)
+{
+	int i;
+
+	dprintk_mmu("KVM: Flushing %d Shadow PTEs: 0x%llx & 0x%llx\n",
+		    vcpu->arch.hpte_cache_offset, guest_ea, ea_mask);
+	BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+
+	guest_ea &= ea_mask;
+	for (i = 0; i < vcpu->arch.hpte_cache_offset; i++) {
+		struct hpte_cache *pte;
+
+		pte = &vcpu->arch.hpte_cache[i];
+		if (!pte->host_va)
+			continue;
+
+		if ((pte->pte.eaddr & ea_mask) == guest_ea) {
+			invalidate_pte(pte);
+		}
+	}
+
+	/* Doing a complete flush -> start from scratch */
+	if (!ea_mask)
+		vcpu->arch.hpte_cache_offset = 0;
+}
+
+void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 guest_vp, u64 vp_mask)
+{
+	int i;
+
+	dprintk_mmu("KVM: Flushing %d Shadow vPTEs: 0x%llx & 0x%llx\n",
+		    vcpu->arch.hpte_cache_offset, guest_vp, vp_mask);
+	BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+
+	guest_vp &= vp_mask;
+	for (i = 0; i < vcpu->arch.hpte_cache_offset; i++) {
+		struct hpte_cache *pte;
+
+		pte = &vcpu->arch.hpte_cache[i];
+		if (!pte->host_va)
+			continue;
+
+		if ((pte->pte.vpage & vp_mask) == guest_vp) {
+			invalidate_pte(pte);
+		}
+	}
+}
+
+void kvmppc_mmu_pte_pflush(struct kvm_vcpu *vcpu, u64 pa_start, u64 pa_end)
+{
+	int i;
+
+	dprintk_mmu("KVM: Flushing %d Shadow pPTEs: 0x%llx & 0x%llx\n",
+		    vcpu->arch.hpte_cache_offset, guest_pa, pa_mask);
+	BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+
+	for (i = 0; i < vcpu->arch.hpte_cache_offset; i++) {
+		struct hpte_cache *pte;
+
+		pte = &vcpu->arch.hpte_cache[i];
+		if (!pte->host_va)
+			continue;
+
+		if ((pte->pte.raddr >= pa_start) &&
+		    (pte->pte.raddr < pa_end)) {
+			invalidate_pte(pte);
+		}
+	}
+}
+
+struct kvmppc_pte *kvmppc_mmu_find_pte(struct kvm_vcpu *vcpu, u64 ea, bool data)
+{
+	int i;
+	u64 guest_vp;
+
+	guest_vp = vcpu->arch.mmu.ea_to_vp(vcpu, ea, false);
+	for (i=0; i<vcpu->arch.hpte_cache_offset; i++) {
+		struct hpte_cache *pte;
+
+		pte = &vcpu->arch.hpte_cache[i];
+		if (!pte->host_va)
+			continue;
+
+		if (pte->pte.vpage == guest_vp)
+			return &pte->pte;
+	}
+
+	return NULL;
+}
+
+static int kvmppc_mmu_hpte_cache_next(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->arch.hpte_cache_offset == HPTEG_CACHE_NUM)
+		kvmppc_mmu_pte_flush(vcpu, 0, 0);
+
+	return vcpu->arch.hpte_cache_offset++;
+}
+
+/* We keep 512 gvsid->hvsid entries, mapping the guest ones to the array using
+ * a hash, so we don't waste cycles on looping */
+static u16 kvmppc_sid_hash(struct kvm_vcpu *vcpu, u64 gvsid)
+{
+	return (u16)(((gvsid >> (SID_MAP_BITS * 7)) & SID_MAP_MASK) ^
+		     ((gvsid >> (SID_MAP_BITS * 6)) & SID_MAP_MASK) ^
+		     ((gvsid >> (SID_MAP_BITS * 5)) & SID_MAP_MASK) ^
+		     ((gvsid >> (SID_MAP_BITS * 4)) & SID_MAP_MASK) ^
+		     ((gvsid >> (SID_MAP_BITS * 3)) & SID_MAP_MASK) ^
+		     ((gvsid >> (SID_MAP_BITS * 2)) & SID_MAP_MASK) ^
+		     ((gvsid >> (SID_MAP_BITS * 1)) & SID_MAP_MASK) ^
+		     ((gvsid >> (SID_MAP_BITS * 0)) & SID_MAP_MASK));
+}
+
+
+static struct kvmppc_sid_map *find_sid_vsid(struct kvm_vcpu *vcpu, u64 gvsid)
+{
+	struct kvmppc_sid_map *map;
+	u16 sid_map_mask;
+
+	if (vcpu->arch.msr & MSR_PR)
+		gvsid |= VSID_PR;
+
+	sid_map_mask = kvmppc_sid_hash(vcpu, gvsid);
+	map = &to_book3s(vcpu)->sid_map[sid_map_mask];
+	if (map->guest_vsid == gvsid) {
+		dprintk_slb("SLB: Searching 0x%llx -> 0x%llx\n",
+			    gvsid, map->host_vsid);
+		return map;
+	}
+
+	map = &to_book3s(vcpu)->sid_map[SID_MAP_MASK - sid_map_mask];
+	if (map->guest_vsid == gvsid) {
+		dprintk_slb("SLB: Searching 0x%llx -> 0x%llx\n",
+			    gvsid, map->host_vsid);
+		return map;
+	}
+
+	dprintk_slb("SLB: Searching 0x%llx -> not found\n", gvsid);
+	return NULL;
+}
+
+int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
+{
+	pfn_t hpaddr;
+	ulong hash, hpteg, va;
+	u64 vsid;
+	int ret;
+	int rflags = 0x192;
+	int vflags = 0;
+	int attempt = 0;
+	struct kvmppc_sid_map *map;
+
+	/* Get host physical address for gpa */
+	hpaddr = gfn_to_pfn(vcpu->kvm, orig_pte->raddr >> PAGE_SHIFT);
+	if (kvm_is_error_hva(hpaddr)) {
+		printk(KERN_INFO "Couldn't get guest page for gfn %llx!\n", orig_pte->eaddr);
+		return -EINVAL;
+	}
+	hpaddr <<= PAGE_SHIFT;
+#if PAGE_SHIFT == 12
+#elif PAGE_SHIFT == 16
+	hpaddr |= orig_pte->raddr & 0xf000;
+#else
+#error Unknown page size
+#endif
+
+	/* and write the mapping ea -> hpa into the pt */
+	vcpu->arch.mmu.esid_to_vsid(vcpu, orig_pte->eaddr >> SID_SHIFT, &vsid);
+	map = find_sid_vsid(vcpu, vsid);
+	if (!map) {
+		kvmppc_mmu_map_segment(vcpu, orig_pte->eaddr);
+		map = find_sid_vsid(vcpu, vsid);
+	}
+	BUG_ON(!map);
+
+	vsid = map->host_vsid;
+	va = hpt_va(orig_pte->eaddr, vsid, MMU_SEGSIZE_256M);
+
+	if (!orig_pte->may_write)
+		rflags |= HPTE_R_PP;
+	else
+		mark_page_dirty(vcpu->kvm, orig_pte->raddr >> PAGE_SHIFT);
+
+	if (!orig_pte->may_execute)
+		rflags |= HPTE_R_N;
+
+	hash = hpt_hash(va, PTE_SIZE, MMU_SEGSIZE_256M);
+
+map_again:
+	hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
+
+	/* In case we tried normal mapping already, let's nuke old entries */
+	if (attempt > 1)
+		if (ppc_md.hpte_remove(hpteg) < 0)
+			return -1;
+
+	ret = ppc_md.hpte_insert(hpteg, va, hpaddr, rflags, vflags, MMU_PAGE_4K, MMU_SEGSIZE_256M);
+
+	if (ret < 0) {
+		/* If we couldn't map a primary PTE, try a secondary */
+#ifdef USE_SECONDARY
+		hash = ~hash;
+		attempt++;
+		if (attempt % 2)
+			vflags = HPTE_V_SECONDARY;
+		else
+			vflags = 0;
+#else
+		attempt = 2;
+#endif
+		goto map_again;
+	} else {
+		int hpte_id = kvmppc_mmu_hpte_cache_next(vcpu);
+		struct hpte_cache *pte = &vcpu->arch.hpte_cache[hpte_id];
+
+		dprintk_mmu("KVM: %c%c Map 0x%llx: [%lx] 0x%lx (0x%llx) -> %lx\n",
+			    ((rflags & HPTE_R_PP) == 3) ? '-' : 'w',
+			    (rflags & HPTE_R_N) ? '-' : 'x',
+			    orig_pte->eaddr, hpteg, va, orig_pte->vpage, hpaddr);
+
+		pte->slot = hpteg + (ret & 7);
+		pte->host_va = va;
+		pte->pte = *orig_pte;
+		pte->pfn = hpaddr >> PAGE_SHIFT;
+	}
+
+	return 0;
+}
+
+static struct kvmppc_sid_map *create_sid_map(struct kvm_vcpu *vcpu, u64 gvsid)
+{
+	struct kvmppc_sid_map *map;
+	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+	u16 sid_map_mask;
+	static int backwards_map = 0;
+
+	if (vcpu->arch.msr & MSR_PR)
+		gvsid |= VSID_PR;
+
+	/* We might get collisions that trap in preceding order, so let's
+	   map them differently */
+
+	sid_map_mask = kvmppc_sid_hash(vcpu, gvsid);
+	if (backwards_map)
+		sid_map_mask = SID_MAP_MASK - sid_map_mask;
+
+	map = &to_book3s(vcpu)->sid_map[sid_map_mask];
+
+	/* Make sure we're taking the other map next time */
+	backwards_map = !backwards_map;
+
+	/* Uh-oh ... out of mappings. Let's flush! */
+	if (vcpu_book3s->vsid_next == vcpu_book3s->vsid_max) {
+		vcpu_book3s->vsid_next = vcpu_book3s->vsid_first;
+		memset(vcpu_book3s->sid_map, 0,
+		       sizeof(struct kvmppc_sid_map) * SID_MAP_NUM);
+		kvmppc_mmu_pte_flush(vcpu, 0, 0);
+		kvmppc_mmu_flush_segments(vcpu);
+	}
+	map->host_vsid = vcpu_book3s->vsid_next++;
+
+	map->guest_vsid = gvsid;
+	map->valid = true;
+
+	return map;
+}
+
+static int kvmppc_mmu_next_segment(struct kvm_vcpu *vcpu, ulong esid)
+{
+	int i;
+	int max_slb_size = 64;
+	int found_inval = -1;
+	int r;
+
+	if (!get_paca()->kvm_slb_max)
+		get_paca()->kvm_slb_max = 1;
+
+	/* Are we overwriting? */
+	for (i = 1; i < get_paca()->kvm_slb_max; i++) {
+		if (!(get_paca()->kvm_slb[i].esid & SLB_ESID_V))
+			found_inval = i;
+		else if ((get_paca()->kvm_slb[i].esid & ESID_MASK) == esid)
+			return i;
+	}
+
+	/* Found a spare entry that was invalidated before */
+	if (found_inval > 0)
+		return found_inval;
+
+	/* No spare invalid entry, so create one */
+
+	if (mmu_slb_size < 64)
+		max_slb_size = mmu_slb_size;
+
+	/* Overflowing -> purge */
+	if ((get_paca()->kvm_slb_max) == max_slb_size)
+		kvmppc_mmu_flush_segments(vcpu);
+
+	r = get_paca()->kvm_slb_max;
+	get_paca()->kvm_slb_max++;
+
+	return r;
+}
+
+int kvmppc_mmu_map_segment(struct kvm_vcpu *vcpu, ulong eaddr)
+{
+	u64 esid = eaddr >> SID_SHIFT;
+	u64 slb_esid = (eaddr & ESID_MASK) | SLB_ESID_V;
+	u64 slb_vsid = SLB_VSID_USER;
+	u64 gvsid;
+	int slb_index;
+	struct kvmppc_sid_map *map;
+
+	slb_index = kvmppc_mmu_next_segment(vcpu, eaddr & ESID_MASK);
+
+	if (vcpu->arch.mmu.esid_to_vsid(vcpu, esid, &gvsid)) {
+		/* Invalidate an entry */
+		get_paca()->kvm_slb[slb_index].esid = 0;
+		return -ENOENT;
+	}
+
+	map = find_sid_vsid(vcpu, gvsid);
+	if (!map)
+		map = create_sid_map(vcpu, gvsid);
+
+	map->guest_esid = esid;
+
+	slb_vsid |= (map->host_vsid << 12);
+	slb_vsid &= ~SLB_VSID_KP;
+	slb_esid |= slb_index;
+
+	get_paca()->kvm_slb[slb_index].esid = slb_esid;
+	get_paca()->kvm_slb[slb_index].vsid = slb_vsid;
+
+	dprintk_slb("slbmte %#llx, %#llx\n", slb_vsid, slb_esid);
+
+	return 0;
+}
+
+void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu)
+{
+	get_paca()->kvm_slb_max = 1;
+	get_paca()->kvm_slb[0].esid = 0;
+}
+
+void kvmppc_mmu_destroy(struct kvm_vcpu *vcpu)
+{
+	kvmppc_mmu_pte_flush(vcpu, 0, 0);
+}
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 02/27] Pass PVR in sregs
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

Right now sregs is unused on PPC, so we can use it for initialization
of the CPU.

KVM on BookE always virtualizes the host CPU. On Book3s we go a step further
and take the PVR from userspace that tells us what kind of CPU we are supposed
to virtualize, because we support Book3s_32 and Book3s_64 guests.

In order to get that information, we use the sregs ioctl, because we don't
want to reset the guest CPU on every normal register set.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v4 -> v5

  - make PVR 32 bits
---
 arch/powerpc/include/asm/kvm.h |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index bb2de6a..c9ca97f 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -46,6 +46,8 @@ struct kvm_regs {
 };
 
 struct kvm_sregs {
+	__u32 pvr;
+	char pad[1020];
 };
 
 struct kvm_fpu {
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 03/27] Add Book3s definitions
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

We need quite a bunch of new constants for KVM on Book3s,
so let's define them now.

These constants will be used in later patches.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v3 -> v4

  - remove old kernel compat code
---
 arch/powerpc/include/asm/kvm_asm.h |   39 ++++++++++++++++++++++++++++++++++++
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index 56bfae5..19ddb35 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -49,6 +49,45 @@
 #define BOOKE_INTERRUPT_SPE_FP_ROUND 34
 #define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35
 
+/* book3s */
+
+#define BOOK3S_INTERRUPT_SYSTEM_RESET	0x100
+#define BOOK3S_INTERRUPT_MACHINE_CHECK	0x200
+#define BOOK3S_INTERRUPT_DATA_STORAGE	0x300
+#define BOOK3S_INTERRUPT_DATA_SEGMENT	0x380
+#define BOOK3S_INTERRUPT_INST_STORAGE	0x400
+#define BOOK3S_INTERRUPT_INST_SEGMENT	0x480
+#define BOOK3S_INTERRUPT_EXTERNAL	0x500
+#define BOOK3S_INTERRUPT_ALIGNMENT	0x600
+#define BOOK3S_INTERRUPT_PROGRAM	0x700
+#define BOOK3S_INTERRUPT_FP_UNAVAIL	0x800
+#define BOOK3S_INTERRUPT_DECREMENTER	0x900
+#define BOOK3S_INTERRUPT_SYSCALL	0xc00
+#define BOOK3S_INTERRUPT_TRACE		0xd00
+#define BOOK3S_INTERRUPT_PERFMON	0xf00
+#define BOOK3S_INTERRUPT_ALTIVEC	0xf20
+#define BOOK3S_INTERRUPT_VSX		0xf40
+
+#define BOOK3S_IRQPRIO_SYSTEM_RESET		0
+#define BOOK3S_IRQPRIO_DATA_SEGMENT		1
+#define BOOK3S_IRQPRIO_INST_SEGMENT		2
+#define BOOK3S_IRQPRIO_DATA_STORAGE		3
+#define BOOK3S_IRQPRIO_INST_STORAGE		4
+#define BOOK3S_IRQPRIO_ALIGNMENT		5
+#define BOOK3S_IRQPRIO_PROGRAM			6
+#define BOOK3S_IRQPRIO_FP_UNAVAIL		7
+#define BOOK3S_IRQPRIO_ALTIVEC			8
+#define BOOK3S_IRQPRIO_VSX			9
+#define BOOK3S_IRQPRIO_SYSCALL			10
+#define BOOK3S_IRQPRIO_MACHINE_CHECK		11
+#define BOOK3S_IRQPRIO_DEBUG			12
+#define BOOK3S_IRQPRIO_EXTERNAL			13
+#define BOOK3S_IRQPRIO_DECREMENTER		14
+#define BOOK3S_IRQPRIO_PERFORMANCE_MONITOR	15
+#define BOOK3S_IRQPRIO_MAX			16
+
+#define BOOK3S_HFLAG_DCBZ32			0x1
+
 #define RESUME_FLAG_NV          (1<<0)  /* Reload guest nonvolatile state? */
 #define RESUME_FLAG_HOST        (1<<1)  /* Resume host? */
 
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 04/27] Add Book3s fields to vcpu structs
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

We need to store more information than we currently have for vcpus
when running on Book3s.

So let's extend the internal struct definitions.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v3 -> v4:

  - use context_id instead of mm_context

v4 -> v5:

  - always include pvr in vcpu struct
---
 arch/powerpc/include/asm/kvm_host.h |   73 ++++++++++++++++++++++++++++++++++-
 1 files changed, 72 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index c9c930e..2cff5fe 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -37,6 +37,8 @@
 #define KVM_NR_PAGE_SIZES	1
 #define KVM_PAGES_PER_HPAGE(x)	(1UL<<31)
 
+#define HPTEG_CACHE_NUM 1024
+
 struct kvm;
 struct kvm_run;
 struct kvm_vcpu;
@@ -63,6 +65,17 @@ struct kvm_vcpu_stat {
 	u32 dec_exits;
 	u32 ext_intr_exits;
 	u32 halt_wakeup;
+#ifdef CONFIG_PPC64
+	u32 pf_storage;
+	u32 pf_instruc;
+	u32 sp_storage;
+	u32 sp_instruc;
+	u32 queue_intr;
+	u32 ld;
+	u32 ld_slow;
+	u32 st;
+	u32 st_slow;
+#endif
 };
 
 enum kvm_exit_types {
@@ -109,9 +122,53 @@ struct kvmppc_exit_timing {
 struct kvm_arch {
 };
 
+struct kvmppc_pte {
+	u64 eaddr;
+	u64 vpage;
+	u64 raddr;
+	bool may_read;
+	bool may_write;
+	bool may_execute;
+};
+
+struct kvmppc_mmu {
+	/* book3s_64 only */
+	void (*slbmte)(struct kvm_vcpu *vcpu, u64 rb, u64 rs);
+	u64  (*slbmfee)(struct kvm_vcpu *vcpu, u64 slb_nr);
+	u64  (*slbmfev)(struct kvm_vcpu *vcpu, u64 slb_nr);
+	void (*slbie)(struct kvm_vcpu *vcpu, u64 slb_nr);
+	void (*slbia)(struct kvm_vcpu *vcpu);
+	/* book3s */
+	void (*mtsrin)(struct kvm_vcpu *vcpu, u32 srnum, ulong value);
+	u32  (*mfsrin)(struct kvm_vcpu *vcpu, u32 srnum);
+	int  (*xlate)(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data);
+	void (*reset_msr)(struct kvm_vcpu *vcpu);
+	void (*tlbie)(struct kvm_vcpu *vcpu, ulong addr, bool large);
+	int  (*esid_to_vsid)(struct kvm_vcpu *vcpu, u64 esid, u64 *vsid);
+	u64  (*ea_to_vp)(struct kvm_vcpu *vcpu, gva_t eaddr, bool data);
+	bool (*is_dcbz32)(struct kvm_vcpu *vcpu);
+};
+
+struct hpte_cache {
+	u64 host_va;
+	u64 pfn;
+	ulong slot;
+	struct kvmppc_pte pte;
+};
+
 struct kvm_vcpu_arch {
-	u32 host_stack;
+	ulong host_stack;
 	u32 host_pid;
+#ifdef CONFIG_PPC64
+	ulong host_msr;
+	ulong host_r2;
+	void *host_retip;
+	ulong trampoline_lowmem;
+	ulong trampoline_enter;
+	ulong highmem_handler;
+	ulong host_paca_phys;
+	struct kvmppc_mmu mmu;
+#endif
 
 	u64 fpr[32];
 	ulong gpr[32];
@@ -123,6 +180,10 @@ struct kvm_vcpu_arch {
 	ulong xer;
 
 	ulong msr;
+#ifdef CONFIG_PPC64
+	ulong shadow_msr;
+	ulong hflags;
+#endif
 	u32 mmucr;
 	ulong sprg0;
 	ulong sprg1;
@@ -149,6 +210,7 @@ struct kvm_vcpu_arch {
 	u32 ivor[64];
 	ulong ivpr;
 	u32 pir;
+	u32 pvr;
 
 	u32 shadow_pid;
 	u32 pid;
@@ -174,6 +236,9 @@ struct kvm_vcpu_arch {
 #endif
 
 	u32 last_inst;
+#ifdef CONFIG_PPC64
+	ulong fault_dsisr;
+#endif
 	ulong fault_dear;
 	ulong fault_esr;
 	gpa_t paddr_accessed;
@@ -186,7 +251,13 @@ struct kvm_vcpu_arch {
 	u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
 
 	struct timer_list dec_timer;
+	u64 dec_jiffies;
 	unsigned long pending_exceptions;
+
+#ifdef CONFIG_PPC64
+	struct hpte_cache hpte_cache[HPTEG_CACHE_NUM];
+	int hpte_cache_offset;
+#endif
 };
 
 #endif /* __POWERPC_KVM_HOST_H__ */
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 05/27] Add asm/kvm_book3s.h
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

This adds the book3s specific header file that contains structs that
are only valid on book3s specific code.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v3 -> v4:

  - use context_id instead of mm_alloc
---
 arch/powerpc/include/asm/kvm_book3s.h |  136 +++++++++++++++++++++++++++++++++
 1 files changed, 136 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s.h

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
new file mode 100644
index 0000000..c601133
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -0,0 +1,136 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#ifndef __ASM_KVM_BOOK3S_H__
+#define __ASM_KVM_BOOK3S_H__
+
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+#include <asm/kvm_ppc.h>
+
+struct kvmppc_slb {
+	u64 esid;
+	u64 vsid;
+	u64 orige;
+	u64 origv;
+	bool valid;
+	bool Ks;
+	bool Kp;
+	bool nx;
+	bool large;
+	bool class;
+};
+
+struct kvmppc_sr {
+	u32 raw;
+	u32 vsid;
+	bool Ks;
+	bool Kp;
+	bool nx;
+};
+
+struct kvmppc_bat {
+	u32 bepi;
+	u32 bepi_mask;
+	bool vs;
+	bool vp;
+	u32 brpn;
+	u8 wimg;
+	u8 pp;
+};
+
+struct kvmppc_sid_map {
+	u64 guest_vsid;
+	u64 guest_esid;
+	u64 host_vsid;
+	bool valid;
+};
+
+#define SID_MAP_BITS    9
+#define SID_MAP_NUM     (1 << SID_MAP_BITS)
+#define SID_MAP_MASK    (SID_MAP_NUM - 1)
+
+struct kvmppc_vcpu_book3s {
+	struct kvm_vcpu vcpu;
+	struct kvmppc_sid_map sid_map[SID_MAP_NUM];
+	struct kvmppc_slb slb[64];
+	struct {
+		u64 esid;
+		u64 vsid;
+	} slb_shadow[64];
+	u8 slb_shadow_max;
+	struct kvmppc_sr sr[16];
+	struct kvmppc_bat ibat[8];
+	struct kvmppc_bat dbat[8];
+	u64 hid[6];
+	int slb_nr;
+	u64 sdr1;
+	u64 dsisr;
+	u64 hior;
+	u64 msr_mask;
+	u64 vsid_first;
+	u64 vsid_next;
+	u64 vsid_max;
+	int context_id;
+};
+
+#define CONTEXT_HOST		0
+#define CONTEXT_GUEST		1
+#define CONTEXT_GUEST_END	2
+
+#define VSID_REAL	0xfffffffffff00000
+#define VSID_REAL_DR	0xffffffffffe00000
+#define VSID_REAL_IR	0xffffffffffd00000
+#define VSID_BAT	0xffffffffffc00000
+#define VSID_PR		0x8000000000000000
+
+extern void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, u64 ea, u64 ea_mask);
+extern void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 vp, u64 vp_mask);
+extern void kvmppc_mmu_pte_pflush(struct kvm_vcpu *vcpu, u64 pa_start, u64 pa_end);
+extern void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 new_msr);
+extern void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu);
+extern void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu);
+extern int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte);
+extern int kvmppc_mmu_map_segment(struct kvm_vcpu *vcpu, ulong eaddr);
+extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu);
+extern struct kvmppc_pte *kvmppc_mmu_find_pte(struct kvm_vcpu *vcpu, u64 ea, bool data);
+extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr, bool data);
+extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr);
+extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec);
+
+extern u32 kvmppc_trampoline_lowmem;
+extern u32 kvmppc_trampoline_enter;
+
+static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
+{
+	return container_of(vcpu, struct kvmppc_vcpu_book3s, vcpu);
+}
+
+static inline ulong dsisr(void)
+{
+	ulong r;
+	asm ( "mfdsisr %0 " : "=r" (r) );
+	return r;
+}
+
+extern void kvm_return_point(void);
+
+#define INS_DCBZ			0x7c0007ec
+
+#endif /* __ASM_KVM_BOOK3S_H__ */
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 08/27] Add SLB switching code for entry/exit
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

This is the really low level of guest entry/exit code.

Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
currently aware of.

The segments in the guest differ from the ones on the host, so we need
to switch the SLB to tell the MMU that we're in a new context.

So we store a shadow of the guest's SLB in the PACA, switch to that on
entry and only restore bolted entries on exit, leaving the rest to the
Linux SLB fault handler.

That way we get a really clean way of switching the SLB.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_64_slb.S |  277 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 277 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_slb.S

diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_slb.S
new file mode 100644
index 0000000..00a8367
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_slb.S
@@ -0,0 +1,277 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+/******************************************************************************
+ *                                                                            *
+ *                               Entry code                                   *
+ *                                                                            *
+ *****************************************************************************/
+
+.global kvmppc_handler_trampoline_enter
+kvmppc_handler_trampoline_enter:
+
+	/* Required state:
+	 *
+	 * MSR = ~IR|DR
+	 * R13 = PACA
+	 * R9 = guest IP
+	 * R10 = guest MSR
+	 * R11 = free
+	 * R12 = free
+	 * PACA[PACA_EXMC + EX_R9] = guest R9
+	 * PACA[PACA_EXMC + EX_R10] = guest R10
+	 * PACA[PACA_EXMC + EX_R11] = guest R11
+	 * PACA[PACA_EXMC + EX_R12] = guest R12
+	 * PACA[PACA_EXMC + EX_R13] = guest R13
+	 * PACA[PACA_EXMC + EX_CCR] = guest CR
+	 * PACA[PACA_EXMC + EX_R3] = guest XER
+	 */
+
+	mtsrr0	r9
+	mtsrr1	r10
+
+	mtspr	SPRN_SPRG_SCRATCH0, r0
+
+	/* Remove LPAR shadow entries */
+
+#if SLB_NUM_BOLTED == 3
+
+	ld	r12, PACA_SLBSHADOWPTR(r13)
+	ld	r10, 0x10(r12)
+	ld	r11, 0x18(r12)
+	/* Invalid? Skip. */
+	rldicl. r0, r10, 37, 63
+	beq	slb_entry_skip_1
+	xoris	r9, r10, SLB_ESID_V@h
+	std	r9, 0x10(r12)
+slb_entry_skip_1:
+	ld	r9, 0x20(r12)
+	/* Invalid? Skip. */
+	rldicl. r0, r9, 37, 63
+	beq	slb_entry_skip_2
+	xoris	r9, r9, SLB_ESID_V@h
+	std	r9, 0x20(r12)
+slb_entry_skip_2:
+	ld	r9, 0x30(r12)
+	/* Invalid? Skip. */
+	rldicl. r0, r9, 37, 63
+	beq	slb_entry_skip_3
+	xoris	r9, r9, SLB_ESID_V@h
+	std	r9, 0x30(r12)
+slb_entry_skip_3:
+	
+#else
+#error unknown number of bolted entries
+#endif
+
+	/* Flush SLB */
+
+	slbia
+
+	/* r0 = esid & ESID_MASK */
+	rldicr  r10, r10, 0, 35
+	/* r0 |= CLASS_BIT(VSID) */
+	rldic   r12, r11, 56 - 36, 36
+	or      r10, r10, r12
+	slbie	r10
+
+	isync
+
+	/* Fill SLB with our shadow */
+
+	lbz	r12, PACA_KVM_SLB_MAX(r13)
+	mulli	r12, r12, 16
+	addi	r12, r12, PACA_KVM_SLB
+	add	r12, r12, r13
+
+	/* for (r11 = kvm_slb; r11 < kvm_slb + kvm_slb_size; r11+=slb_entry) */
+	li	r11, PACA_KVM_SLB
+	add	r11, r11, r13
+
+slb_loop_enter:
+
+	ld	r10, 0(r11)
+
+	rldicl. r0, r10, 37, 63
+	beq	slb_loop_enter_skip
+
+	ld	r9, 8(r11)
+	slbmte	r9, r10
+
+slb_loop_enter_skip:
+	addi	r11, r11, 16
+	cmpd	cr0, r11, r12
+	blt	slb_loop_enter
+
+slb_do_enter:
+
+	/* Enter guest */
+
+	mfspr	r0, SPRN_SPRG_SCRATCH0
+
+	ld	r9, (PACA_EXMC+EX_R9)(r13)
+	ld	r10, (PACA_EXMC+EX_R10)(r13)
+	ld	r12, (PACA_EXMC+EX_R12)(r13)
+
+	lwz	r11, (PACA_EXMC+EX_CCR)(r13)
+	mtcr	r11
+
+	ld	r11, (PACA_EXMC+EX_R3)(r13)
+	mtxer	r11
+
+	ld	r11, (PACA_EXMC+EX_R11)(r13)
+	ld	r13, (PACA_EXMC+EX_R13)(r13)
+
+	RFI
+kvmppc_handler_trampoline_enter_end:
+
+
+
+/******************************************************************************
+ *                                                                            *
+ *                               Exit code                                    *
+ *                                                                            *
+ *****************************************************************************/
+
+.global kvmppc_handler_trampoline_exit
+kvmppc_handler_trampoline_exit:
+
+	/* Register usage at this point:
+	 *
+	 * SPRG_SCRATCH0 = guest R13
+	 * R01           = host R1
+	 * R02           = host R2
+	 * R10           = guest PC
+	 * R11           = guest MSR
+	 * R12           = exit handler id
+	 * R13           = PACA
+	 * PACA.exmc.CCR  = guest CR
+	 * PACA.exmc.R9  = guest R1
+	 * PACA.exmc.R10 = guest R10
+	 * PACA.exmc.R11 = guest R11
+	 * PACA.exmc.R12 = guest R12
+	 * PACA.exmc.R13 = guest R2
+	 *
+	 */
+
+	/* Save registers */
+
+	std	r0, (PACA_EXMC+EX_SRR0)(r13)
+	std	r9, (PACA_EXMC+EX_R3)(r13)
+	std	r10, (PACA_EXMC+EX_LR)(r13)
+	std	r11, (PACA_EXMC+EX_DAR)(r13)
+
+	/*
+	 * In order for us to easily get the last instruction,
+	 * we got the #vmexit at, we exploit the fact that the
+	 * virtual layout is still the same here, so we can just
+	 * ld from the guest's PC address
+	 */
+
+	/* We only load the last instruction when it's safe */
+	cmpwi	r12, BOOK3S_INTERRUPT_DATA_STORAGE
+	beq	ld_last_inst
+	cmpwi	r12, BOOK3S_INTERRUPT_PROGRAM
+	beq	ld_last_inst
+
+	b	no_ld_last_inst
+
+ld_last_inst:
+	/* Save off the guest instruction we're at */
+	/*    1) enable paging for data */
+	mfmsr	r9
+	ori	r11, r9, MSR_DR			/* Enable paging for data */
+	mtmsr	r11
+	/*    2) fetch the instruction */
+	lwz	r0, 0(r10)
+	/*    3) disable paging again */
+	mtmsr	r9
+
+no_ld_last_inst:
+
+	/* Restore bolted entries from the shadow and fix it along the way */
+
+	/* We don't store anything in entry 0, so we don't need to take care of that */
+	slbia
+	isync
+
+#if SLB_NUM_BOLTED == 3
+
+	ld	r11, PACA_SLBSHADOWPTR(r13)
+
+	ld	r10, 0x10(r11)
+	cmpdi	r10, 0
+	beq	slb_exit_skip_1
+	oris	r10, r10, SLB_ESID_V@h
+	ld	r9, 0x18(r11)
+	slbmte	r9, r10
+	std	r10, 0x10(r11)
+slb_exit_skip_1:
+	
+	ld	r10, 0x20(r11)
+	cmpdi	r10, 0
+	beq	slb_exit_skip_2
+	oris	r10, r10, SLB_ESID_V@h
+	ld	r9, 0x28(r11)
+	slbmte	r9, r10
+	std	r10, 0x20(r11)
+slb_exit_skip_2:
+	
+	ld	r10, 0x30(r11)
+	cmpdi	r10, 0
+	beq	slb_exit_skip_3
+	oris	r10, r10, SLB_ESID_V@h
+	ld	r9, 0x38(r11)
+	slbmte	r9, r10
+	std	r10, 0x30(r11)
+slb_exit_skip_3:
+	
+#else
+#error unknown number of bolted entries
+#endif
+
+slb_do_exit:
+
+	/* Restore registers */
+
+	ld	r11, (PACA_EXMC+EX_DAR)(r13)
+	ld	r10, (PACA_EXMC+EX_LR)(r13)
+	ld	r9, (PACA_EXMC+EX_R3)(r13)
+
+	/* Save last inst */
+	stw	r0, (PACA_EXMC+EX_LR)(r13)
+
+	/* Save DAR and DSISR before going to paged mode */
+	mfdar	r0
+	std	r0, (PACA_EXMC+EX_DAR)(r13)
+	mfdsisr	r0
+	stw	r0, (PACA_EXMC+EX_DSISR)(r13)
+
+	/* RFI into the highmem handler */
+	mfmsr	r0
+	ori	r0, r0, MSR_IR|MSR_DR|MSR_RI	/* Enable paging */
+	mtsrr1	r0
+	ld	r0, PACASAVEDMSR(r13)		/* Highmem handler address */
+	mtsrr0	r0
+
+	mfspr	r0, SPRN_SPRG_SCRATCH0
+
+	RFI
+kvmppc_handler_trampoline_exit_end:
+
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 01/27] Move dirty logging code to sub-arch
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

PowerPC code handles dirty logging in the generic parts atm. While this
is great for "return -ENOTSUPP", we need to be rather target specific
when actually implementing it.

So let's split it to implementation specific code, so we can implement
it for book3s.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c   |    5 +++++
 arch/powerpc/kvm/powerpc.c |    5 -----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index e7bf4d0..06f5a9e 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -520,6 +520,11 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
 	return kvmppc_core_vcpu_translate(vcpu, tr);
 }
 
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
+{
+	return -ENOTSUPP;
+}
+
 int __init kvmppc_booke_init(void)
 {
 	unsigned long ivor[16];
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 5902bbc..4ae3490 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -410,11 +410,6 @@ out:
 	return r;
 }
 
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
-{
-	return -ENOTSUPP;
-}
-
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 15/27] Add mfdec emulation
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

We support setting the DEC to a certain value right now. Doing that basically
triggers the CPU local timer.

But there's also an mfdec command that enabled the OS to read the decrementor.

This is required at least by all desktop and server PowerPC Linux kernels. It
can't really hurt to allow embedded ones to do it as well though.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/emulate.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 7737146..50d411d 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -66,12 +66,14 @@
 
 void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 {
+	unsigned long nr_jiffies;
+
 	if (vcpu->arch.tcr & TCR_DIE) {
 		/* The decrementer ticks at the same rate as the timebase, so
 		 * that's how we convert the guest DEC value to the number of
 		 * host ticks. */
-		unsigned long nr_jiffies;
 
+		vcpu->arch.dec_jiffies = mftb();
 		nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
 		mod_timer(&vcpu->arch.dec_timer,
 		          get_jiffies_64() + nr_jiffies);
@@ -211,6 +213,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 			/* Note: SPRG4-7 are user-readable, so we don't get
 			 * a trap. */
 
+			case SPRN_DEC:
+			{
+				u64 jd = mftb() - vcpu->arch.dec_jiffies;
+				vcpu->arch.gpr[rt] = vcpu->arch.dec - jd;
+#ifdef DEBUG_EMUL
+				printk(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
+#endif
+				break;
+			}
 			default:
 				emulated = kvmppc_core_emulate_mfspr(vcpu, sprn, rt);
 				if (emulated == EMULATE_FAIL) {
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 14/27] Add book3s_64 specific opcode emulation
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

There are generic parts of PowerPC that can be shared across all
implementations and specific parts that only apply to BookE or desktop PPCs.

This patch adds emulation for desktop specific opcodes that don't apply
to BookE CPUs.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v5 -> v6:

  - // -> /* */
---
 arch/powerpc/kvm/book3s_64_emulate.c |  337 ++++++++++++++++++++++++++++++++++
 1 files changed, 337 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_emulate.c

diff --git a/arch/powerpc/kvm/book3s_64_emulate.c b/arch/powerpc/kvm/book3s_64_emulate.c
new file mode 100644
index 0000000..c343e67
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_emulate.c
@@ -0,0 +1,337 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <asm/kvm_ppc.h>
+#include <asm/disassemble.h>
+#include <asm/kvm_book3s.h>
+#include <asm/reg.h>
+
+#define OP_19_XOP_RFID		18
+#define OP_19_XOP_RFI		50
+
+#define OP_31_XOP_MFMSR		83
+#define OP_31_XOP_MTMSR		146
+#define OP_31_XOP_MTMSRD	178
+#define OP_31_XOP_MTSRIN	242
+#define OP_31_XOP_TLBIEL	274
+#define OP_31_XOP_TLBIE		306
+#define OP_31_XOP_SLBMTE	402
+#define OP_31_XOP_SLBIE		434
+#define OP_31_XOP_SLBIA		498
+#define OP_31_XOP_MFSRIN	659
+#define OP_31_XOP_SLBMFEV	851
+#define OP_31_XOP_EIOIO		854
+#define OP_31_XOP_SLBMFEE	915
+
+/* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
+#define OP_31_XOP_DCBZ		1010
+
+int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
+                           unsigned int inst, int *advance)
+{
+	int emulated = EMULATE_DONE;
+
+	switch (get_op(inst)) {
+	case 19:
+		switch (get_xop(inst)) {
+		case OP_19_XOP_RFID:
+		case OP_19_XOP_RFI:
+			vcpu->arch.pc = vcpu->arch.srr0;
+			kvmppc_set_msr(vcpu, vcpu->arch.srr1);
+			*advance = 0;
+			break;
+
+		default:
+			emulated = EMULATE_FAIL;
+			break;
+		}
+		break;
+	case 31:
+		switch (get_xop(inst)) {
+		case OP_31_XOP_MFMSR:
+			vcpu->arch.gpr[get_rt(inst)] = vcpu->arch.msr;
+			break;
+		case OP_31_XOP_MTMSRD:
+		{
+			ulong rs = vcpu->arch.gpr[get_rs(inst)];
+			if (inst & 0x10000) {
+				vcpu->arch.msr &= ~(MSR_RI | MSR_EE);
+				vcpu->arch.msr |= rs & (MSR_RI | MSR_EE);
+			} else
+				kvmppc_set_msr(vcpu, rs);
+			break;
+		}
+		case OP_31_XOP_MTMSR:
+			kvmppc_set_msr(vcpu, vcpu->arch.gpr[get_rs(inst)]);
+			break;
+		case OP_31_XOP_MFSRIN:
+		{
+			int srnum;
+
+			srnum = (vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf;
+			if (vcpu->arch.mmu.mfsrin) {
+				u32 sr;
+				sr = vcpu->arch.mmu.mfsrin(vcpu, srnum);
+				vcpu->arch.gpr[get_rt(inst)] = sr;
+			}
+			break;
+		}
+		case OP_31_XOP_MTSRIN:
+			vcpu->arch.mmu.mtsrin(vcpu,
+				(vcpu->arch.gpr[get_rb(inst)] >> 28) & 0xf,
+				vcpu->arch.gpr[get_rs(inst)]);
+			break;
+		case OP_31_XOP_TLBIE:
+		case OP_31_XOP_TLBIEL:
+		{
+			bool large = (inst & 0x00200000) ? true : false;
+			ulong addr = vcpu->arch.gpr[get_rb(inst)];
+			vcpu->arch.mmu.tlbie(vcpu, addr, large);
+			break;
+		}
+		case OP_31_XOP_EIOIO:
+			break;
+		case OP_31_XOP_SLBMTE:
+			if (!vcpu->arch.mmu.slbmte)
+				return EMULATE_FAIL;
+
+			vcpu->arch.mmu.slbmte(vcpu, vcpu->arch.gpr[get_rs(inst)],
+						vcpu->arch.gpr[get_rb(inst)]);
+			break;
+		case OP_31_XOP_SLBIE:
+			if (!vcpu->arch.mmu.slbie)
+				return EMULATE_FAIL;
+
+			vcpu->arch.mmu.slbie(vcpu, vcpu->arch.gpr[get_rb(inst)]);
+			break;
+		case OP_31_XOP_SLBIA:
+			if (!vcpu->arch.mmu.slbia)
+				return EMULATE_FAIL;
+
+			vcpu->arch.mmu.slbia(vcpu);
+			break;
+		case OP_31_XOP_SLBMFEE:
+			if (!vcpu->arch.mmu.slbmfee) {
+				emulated = EMULATE_FAIL;
+			} else {
+				ulong t, rb;
+
+				rb = vcpu->arch.gpr[get_rb(inst)];
+				t = vcpu->arch.mmu.slbmfee(vcpu, rb);
+				vcpu->arch.gpr[get_rt(inst)] = t;
+			}
+			break;
+		case OP_31_XOP_SLBMFEV:
+			if (!vcpu->arch.mmu.slbmfev) {
+				emulated = EMULATE_FAIL;
+			} else {
+				ulong t, rb;
+
+				rb = vcpu->arch.gpr[get_rb(inst)];
+				t = vcpu->arch.mmu.slbmfev(vcpu, rb);
+				vcpu->arch.gpr[get_rt(inst)] = t;
+			}
+			break;
+		case OP_31_XOP_DCBZ:
+		{
+			ulong rb =  vcpu->arch.gpr[get_rb(inst)];
+			ulong ra = 0;
+			ulong addr;
+			u32 zeros[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
+
+			if (get_ra(inst))
+				ra = vcpu->arch.gpr[get_ra(inst)];
+
+			addr = (ra + rb) & ~31ULL;
+			if (!(vcpu->arch.msr & MSR_SF))
+				addr &= 0xffffffff;
+
+			if (kvmppc_st(vcpu, addr, 32, zeros)) {
+				vcpu->arch.dear = addr;
+				vcpu->arch.fault_dear = addr;
+				to_book3s(vcpu)->dsisr = DSISR_PROTFAULT |
+						      DSISR_ISSTORE;
+				kvmppc_book3s_queue_irqprio(vcpu,
+					BOOK3S_INTERRUPT_DATA_STORAGE);
+				kvmppc_mmu_pte_flush(vcpu, addr, ~0xFFFULL);
+			}
+
+			break;
+		}
+		default:
+			emulated = EMULATE_FAIL;
+		}
+		break;
+	default:
+		emulated = EMULATE_FAIL;
+	}
+
+	return emulated;
+}
+
+static void kvmppc_write_bat(struct kvm_vcpu *vcpu, int sprn, u64 val)
+{
+	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+	struct kvmppc_bat *bat;
+
+	switch (sprn) {
+	case SPRN_IBAT0U ... SPRN_IBAT3L:
+		bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT0U) / 2];
+		break;
+	case SPRN_IBAT4U ... SPRN_IBAT7L:
+		bat = &vcpu_book3s->ibat[(sprn - SPRN_IBAT4U) / 2];
+		break;
+	case SPRN_DBAT0U ... SPRN_DBAT3L:
+		bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT0U) / 2];
+		break;
+	case SPRN_DBAT4U ... SPRN_DBAT7L:
+		bat = &vcpu_book3s->dbat[(sprn - SPRN_DBAT4U) / 2];
+		break;
+	default:
+		BUG();
+	}
+
+	if (!(sprn % 2)) {
+		/* Upper BAT */
+		u32 bl = (val >> 2) & 0x7ff;
+		bat->bepi_mask = (~bl << 17);
+		bat->bepi = val & 0xfffe0000;
+		bat->vs = (val & 2) ? 1 : 0;
+		bat->vp = (val & 1) ? 1 : 0;
+	} else {
+		/* Lower BAT */
+		bat->brpn = val & 0xfffe0000;
+		bat->wimg = (val >> 3) & 0xf;
+		bat->pp = val & 3;
+	}
+}
+
+int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
+{
+	int emulated = EMULATE_DONE;
+
+	switch (sprn) {
+	case SPRN_SDR1:
+		to_book3s(vcpu)->sdr1 = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_DSISR:
+		to_book3s(vcpu)->dsisr = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_DAR:
+		vcpu->arch.dear = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_HIOR:
+		to_book3s(vcpu)->hior = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_IBAT0U ... SPRN_IBAT3L:
+	case SPRN_IBAT4U ... SPRN_IBAT7L:
+	case SPRN_DBAT0U ... SPRN_DBAT3L:
+	case SPRN_DBAT4U ... SPRN_DBAT7L:
+		kvmppc_write_bat(vcpu, sprn, vcpu->arch.gpr[rs]);
+		/* BAT writes happen so rarely that we're ok to flush
+		 * everything here */
+		kvmppc_mmu_pte_flush(vcpu, 0, 0);
+		break;
+	case SPRN_HID0:
+		to_book3s(vcpu)->hid[0] = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_HID1:
+		to_book3s(vcpu)->hid[1] = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_HID2:
+		to_book3s(vcpu)->hid[2] = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_HID4:
+		to_book3s(vcpu)->hid[4] = vcpu->arch.gpr[rs];
+		break;
+	case SPRN_HID5:
+		to_book3s(vcpu)->hid[5] = vcpu->arch.gpr[rs];
+		/* guest HID5 set can change is_dcbz32 */
+		if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+		    (mfmsr() & MSR_HV))
+			vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
+		break;
+	case SPRN_ICTC:
+	case SPRN_THRM1:
+	case SPRN_THRM2:
+	case SPRN_THRM3:
+	case SPRN_CTRLF:
+	case SPRN_CTRLT:
+		break;
+	default:
+		printk(KERN_INFO "KVM: invalid SPR write: %d\n", sprn);
+#ifndef DEBUG_SPR
+		emulated = EMULATE_FAIL;
+#endif
+		break;
+	}
+
+	return emulated;
+}
+
+int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt)
+{
+	int emulated = EMULATE_DONE;
+
+	switch (sprn) {
+	case SPRN_SDR1:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->sdr1;
+		break;
+	case SPRN_DSISR:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->dsisr;
+		break;
+	case SPRN_DAR:
+		vcpu->arch.gpr[rt] = vcpu->arch.dear;
+		break;
+	case SPRN_HIOR:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->hior;
+		break;
+	case SPRN_HID0:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[0];
+		break;
+	case SPRN_HID1:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[1];
+		break;
+	case SPRN_HID2:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[2];
+		break;
+	case SPRN_HID4:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[4];
+		break;
+	case SPRN_HID5:
+		vcpu->arch.gpr[rt] = to_book3s(vcpu)->hid[5];
+		break;
+	case SPRN_THRM1:
+	case SPRN_THRM2:
+	case SPRN_THRM3:
+	case SPRN_CTRLF:
+	case SPRN_CTRLT:
+		vcpu->arch.gpr[rt] = 0;
+		break;
+	default:
+		printk(KERN_INFO "KVM: invalid SPR read: %d\n", sprn);
+#ifndef DEBUG_SPR
+		emulated = EMULATE_FAIL;
+#endif
+		break;
+	}
+
+	return emulated;
+}
+
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v6
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson

KVM for PowerPC only supports embedded cores at the moment.

While it makes sense to virtualize on small machines, it's even more fun
to do so on big boxes. So I figured we need KVM for PowerPC64 as well.

This patchset implements KVM support for Book3s_64 hosts and guest support
for Book3s_64 and G3/G4.

To really make use of this, you also need a recent version of qemu.

This time around I have no git tree to pull from. Sorry guys :-).

V1 -> V2:

 - extend sregs with padding
 - new naming scheme (ppc64 -> book3s_64; 74xx -> book3s_32)
 - to_phys -> in-kernel tophys()
 - loadimm -> LOAD_REG_IMMEDIATE
 - call .ko kvm.ko
 - set magic paca bit later
 - run guest code with PACA->soft_enabled=true
 - pt_regs for host state saving (guest too?)
 - only do HV dcbz trick on 970
 - refuse to run on LPAR because of missing SLB pieces

V2 -> V3:

 - fix DAR/DSISR saving
 - allow running on LPAR by modifying the SLB shadow
 - change the SLB implementation to use a mem-backed cache and do
   full world switch on enter/exit. gets rid of "context" magic
 - be more aggressive about DEC injection
 - remove fast ld/st because we're always in host context
 - don't use SPRGs in real->paged transition
 - implement dirty log
 - remove MMIO speedup code
 - SPRG cleanup
   - rename SPRG3 -> SPRN_SPRG_PACA
   - rename SPRG1 -> SPRN_SPRG_SCRATCH0
   - don't use SPRG2

V3 -> V4:

 - use context_id instead of mm_alloc
 - export less

V4 -> V5:

 - use get_tb instead of mftb
 - make ppc32 and ppc64 emulation share more code
 - make pvr 32 bits
 - add patch to use hrtimer for decrememter

V5 -> V6:

 - // -> /* */ style comments
 - make code easier to read
 - not take mmap_sem when it's not needed

Alexander Graf (27):
  Move dirty logging code to sub-arch
  Pass PVR in sregs
  Add Book3s definitions
  Add Book3s fields to vcpu structs
  Add asm/kvm_book3s.h
  Add Book3s_64 intercept helpers
  Add book3s_64 highmem asm code
  Add SLB switching code for entry/exit
  Add interrupt handling code
  Add book3s.c
  Add book3s_64 Host MMU handling
  Add book3s_64 guest MMU
  Add book3s_32 guest MMU
  Add book3s_64 specific opcode emulation
  Add mfdec emulation
  Add desktop PowerPC specific emulation
  Make head_64.S aware of KVM real mode code
  Add Book3s_64 offsets to asm-offsets.c
  Export symbols for KVM module
  Split init_new_context and destroy_context
  Export KVM symbols for module
  Add fields to PACA
  Export new PACA constants in asm-offsets
  Include Book3s_64 target in buildsystem
  Fix trace.h
  Use Little Endian for Dirty Bitmap
  Use hrtimers for the decrementer

 arch/powerpc/include/asm/exception-64s.h     |    2 +
 arch/powerpc/include/asm/kvm.h               |    2 +
 arch/powerpc/include/asm/kvm_asm.h           |   39 ++
 arch/powerpc/include/asm/kvm_book3s.h        |  136 ++++
 arch/powerpc/include/asm/kvm_book3s_64_asm.h |   58 ++
 arch/powerpc/include/asm/kvm_host.h          |   79 +++-
 arch/powerpc/include/asm/kvm_ppc.h           |    1 +
 arch/powerpc/include/asm/mmu_context.h       |    5 +
 arch/powerpc/include/asm/paca.h              |    9 +
 arch/powerpc/kernel/asm-offsets.c            |   18 +
 arch/powerpc/kernel/exceptions-64s.S         |    8 +
 arch/powerpc/kernel/head_64.S                |    7 +
 arch/powerpc/kernel/ppc_ksyms.c              |    3 +-
 arch/powerpc/kernel/time.c                   |    1 +
 arch/powerpc/kvm/Kconfig                     |   17 +
 arch/powerpc/kvm/Makefile                    |   27 +-
 arch/powerpc/kvm/book3s.c                    |  925 ++++++++++++++++++++++++++
 arch/powerpc/kvm/book3s_32_mmu.c             |  372 +++++++++++
 arch/powerpc/kvm/book3s_64_emulate.c         |  337 ++++++++++
 arch/powerpc/kvm/book3s_64_exports.c         |   24 +
 arch/powerpc/kvm/book3s_64_interrupts.S      |  392 +++++++++++
 arch/powerpc/kvm/book3s_64_mmu.c             |  476 +++++++++++++
 arch/powerpc/kvm/book3s_64_mmu_host.c        |  408 ++++++++++++
 arch/powerpc/kvm/book3s_64_rmhandlers.S      |  131 ++++
 arch/powerpc/kvm/book3s_64_slb.S             |  277 ++++++++
 arch/powerpc/kvm/booke.c                     |    5 +
 arch/powerpc/kvm/emulate.c                   |   66 ++-
 arch/powerpc/kvm/powerpc.c                   |   25 +-
 arch/powerpc/kvm/trace.h                     |    6 +-
 arch/powerpc/mm/hash_utils_64.c              |    2 +
 arch/powerpc/mm/mmu_context_hash64.c         |   24 +-
 virt/kvm/kvm_main.c                          |    5 +-
 32 files changed, 3853 insertions(+), 34 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s.h
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_64_asm.h
 create mode 100644 arch/powerpc/kvm/book3s.c
 create mode 100644 arch/powerpc/kvm/book3s_32_mmu.c
 create mode 100644 arch/powerpc/kvm/book3s_64_emulate.c
 create mode 100644 arch/powerpc/kvm/book3s_64_exports.c
 create mode 100644 arch/powerpc/kvm/book3s_64_interrupts.S
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu.c
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu_host.c
 create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S
 create mode 100644 arch/powerpc/kvm/book3s_64_slb.S

^ permalink raw reply

* [PATCH 16/27] Add desktop PowerPC specific emulation
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

Little opcodes behave differently on desktop and embedded PowerPC cores.
In order to reflect those differences, let's add some #ifdef code to emulate.c.

We could probably also handle them in the core specific emulation files, but I
would prefer to reuse as much code as possible.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v4 -> v5:

  - use get_tb instead of mftb
  - make ppc32 and ppc64 emulation share more code
---
 arch/powerpc/kvm/emulate.c |   49 +++++++++++++++++++++++++++++++++++---------
 1 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 50d411d..1ec5e07 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -32,6 +32,7 @@
 #include "trace.h"
 
 #define OP_TRAP 3
+#define OP_TRAP_64 2
 
 #define OP_31_XOP_LWZX      23
 #define OP_31_XOP_LBZX      87
@@ -64,16 +65,36 @@
 #define OP_STH  44
 #define OP_STHU 45
 
+#ifdef CONFIG_PPC64
+static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
+{
+	return 1;
+}
+#else
+static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.tcr & TCR_DIE;
+}
+#endif
+
 void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr_jiffies;
 
-	if (vcpu->arch.tcr & TCR_DIE) {
+#ifdef CONFIG_PPC64
+	/* POWER4+ triggers a dec interrupt if the value is < 0 */
+	if (vcpu->arch.dec & 0x80000000) {
+		del_timer(&vcpu->arch.dec_timer);
+		kvmppc_core_queue_dec(vcpu);
+		return;
+	}
+#endif
+	if (kvmppc_dec_enabled(vcpu)) {
 		/* The decrementer ticks at the same rate as the timebase, so
 		 * that's how we convert the guest DEC value to the number of
 		 * host ticks. */
 
-		vcpu->arch.dec_jiffies = mftb();
+		vcpu->arch.dec_jiffies = get_tb();
 		nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
 		mod_timer(&vcpu->arch.dec_timer,
 		          get_jiffies_64() + nr_jiffies);
@@ -113,9 +134,15 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 	/* this default type might be overwritten by subcategories */
 	kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
 
+	pr_debug(KERN_INFO "Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
+
 	switch (get_op(inst)) {
 	case OP_TRAP:
+#ifdef CONFIG_PPC64
+	case OP_TRAP_64:
+#else
 		vcpu->arch.esr |= ESR_PTR;
+#endif
 		kvmppc_core_queue_program(vcpu);
 		advance = 0;
 		break;
@@ -190,17 +217,19 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 			case SPRN_SRR1:
 				vcpu->arch.gpr[rt] = vcpu->arch.srr1; break;
 			case SPRN_PVR:
-				vcpu->arch.gpr[rt] = mfspr(SPRN_PVR); break;
+				vcpu->arch.gpr[rt] = vcpu->arch.pvr; break;
 			case SPRN_PIR:
-				vcpu->arch.gpr[rt] = mfspr(SPRN_PIR); break;
+				vcpu->arch.gpr[rt] = vcpu->vcpu_id; break;
+			case SPRN_MSSSR0:
+				vcpu->arch.gpr[rt] = 0; break;
 
 			/* Note: mftb and TBRL/TBWL are user-accessible, so
 			 * the guest can always access the real TB anyways.
 			 * In fact, we probably will never see these traps. */
 			case SPRN_TBWL:
-				vcpu->arch.gpr[rt] = mftbl(); break;
+				vcpu->arch.gpr[rt] = get_tb() >> 32; break;
 			case SPRN_TBWU:
-				vcpu->arch.gpr[rt] = mftbu(); break;
+				vcpu->arch.gpr[rt] = get_tb(); break;
 
 			case SPRN_SPRG0:
 				vcpu->arch.gpr[rt] = vcpu->arch.sprg0; break;
@@ -215,11 +244,9 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 
 			case SPRN_DEC:
 			{
-				u64 jd = mftb() - vcpu->arch.dec_jiffies;
+				u64 jd = get_tb() - vcpu->arch.dec_jiffies;
 				vcpu->arch.gpr[rt] = vcpu->arch.dec - jd;
-#ifdef DEBUG_EMUL
-				printk(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
-#endif
+				pr_debug(KERN_INFO "mfDEC: %x - %llx = %lx\n", vcpu->arch.dec, jd, vcpu->arch.gpr[rt]);
 				break;
 			}
 			default:
@@ -271,6 +298,8 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 			case SPRN_TBWL: break;
 			case SPRN_TBWU: break;
 
+			case SPRN_MSSSR0: break;
+
 			case SPRN_DEC:
 				vcpu->arch.dec = vcpu->arch.gpr[rs];
 				kvmppc_emulate_dec(vcpu);
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 12/27] Add book3s_64 guest MMU
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

To be able to run a guest, we also need to implement a guest MMU.

This patch adds MMU handling for Book3s_64 guests.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v5 -> v6:

  - dprintk instead of scattered #ifdef's
  - 80 line limit
  - // -> /* */
---
 arch/powerpc/kvm/book3s_64_mmu.c |  476 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 476 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu.c

diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
new file mode 100644
index 0000000..a31f9c6
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -0,0 +1,476 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/highmem.h>
+
+#include <asm/tlbflush.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+
+/* #define DEBUG_MMU */
+
+#ifdef DEBUG_MMU
+#define dprintk(X...) printk(KERN_INFO X)
+#else
+#define dprintk(X...) do { } while(0)
+#endif
+
+static void kvmppc_mmu_book3s_64_reset_msr(struct kvm_vcpu *vcpu)
+{
+	kvmppc_set_msr(vcpu, MSR_SF);
+}
+
+static struct kvmppc_slb *kvmppc_mmu_book3s_64_find_slbe(
+				struct kvmppc_vcpu_book3s *vcpu_book3s,
+				gva_t eaddr)
+{
+	int i;
+	u64 esid = GET_ESID(eaddr);
+	u64 esid_1t = GET_ESID_1T(eaddr);
+
+	for (i = 0; i < vcpu_book3s->slb_nr; i++) {
+		u64 cmp_esid = esid;
+
+		if (!vcpu_book3s->slb[i].valid)
+			continue;
+
+		if (vcpu_book3s->slb[i].large)
+			cmp_esid = esid_1t;
+
+		if (vcpu_book3s->slb[i].esid == cmp_esid)
+			return &vcpu_book3s->slb[i];
+	}
+
+	dprintk("KVM: No SLB entry found for 0x%lx [%llx | %llx]\n",
+		eaddr, esid, esid_1t);
+	for (i = 0; i < vcpu_book3s->slb_nr; i++) {
+	    if (vcpu_book3s->slb[i].vsid)
+		dprintk("  %d: %c%c %llx %llx\n", i,
+			vcpu_book3s->slb[i].valid ? 'v' : ' ',
+			vcpu_book3s->slb[i].large ? 'l' : ' ',
+			vcpu_book3s->slb[i].esid,
+			vcpu_book3s->slb[i].vsid);
+	}
+
+	return NULL;
+}
+
+static u64 kvmppc_mmu_book3s_64_ea_to_vp(struct kvm_vcpu *vcpu, gva_t eaddr,
+					 bool data)
+{
+	struct kvmppc_slb *slb;
+
+	slb = kvmppc_mmu_book3s_64_find_slbe(to_book3s(vcpu), eaddr);
+	if (!slb)
+		return 0;
+
+	if (slb->large)
+		return (((u64)eaddr >> 12) & 0xfffffff) |
+		       (((u64)slb->vsid) << 28);
+
+	return (((u64)eaddr >> 12) & 0xffff) | (((u64)slb->vsid) << 16);
+}
+
+static int kvmppc_mmu_book3s_64_get_pagesize(struct kvmppc_slb *slbe)
+{
+	return slbe->large ? 24 : 12;
+}
+
+static u32 kvmppc_mmu_book3s_64_get_page(struct kvmppc_slb *slbe, gva_t eaddr)
+{
+	int p = kvmppc_mmu_book3s_64_get_pagesize(slbe);
+	return ((eaddr & 0xfffffff) >> p);
+}
+
+static hva_t kvmppc_mmu_book3s_64_get_pteg(
+				struct kvmppc_vcpu_book3s *vcpu_book3s,
+				struct kvmppc_slb *slbe, gva_t eaddr,
+				bool second)
+{
+	u64 hash, pteg, htabsize;
+	u32 page;
+	hva_t r;
+
+	page = kvmppc_mmu_book3s_64_get_page(slbe, eaddr);
+	htabsize = ((1 << ((vcpu_book3s->sdr1 & 0x1f) + 11)) - 1);
+
+	hash = slbe->vsid ^ page;
+	if (second)
+		hash = ~hash;
+	hash &= ((1ULL << 39ULL) - 1ULL);
+	hash &= htabsize;
+	hash <<= 7ULL;
+
+	pteg = vcpu_book3s->sdr1 & 0xfffffffffffc0000ULL;
+	pteg |= hash;
+
+	dprintk("MMU: page=0x%x sdr1=0x%llx pteg=0x%llx vsid=0x%llx\n",
+		page, vcpu_book3s->sdr1, pteg, slbe->vsid);
+
+	r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
+	if (kvm_is_error_hva(r))
+		return r;
+	return r | (pteg & ~PAGE_MASK);
+}
+
+static u64 kvmppc_mmu_book3s_64_get_avpn(struct kvmppc_slb *slbe, gva_t eaddr)
+{
+	int p = kvmppc_mmu_book3s_64_get_pagesize(slbe);
+	u64 avpn;
+
+	avpn = kvmppc_mmu_book3s_64_get_page(slbe, eaddr);
+	avpn |= slbe->vsid << (28 - p);
+
+	if (p < 24)
+		avpn >>= ((80 - p) - 56) - 8;
+	else
+		avpn <<= 8;
+
+	return avpn;
+}
+
+static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
+				struct kvmppc_pte *gpte, bool data)
+{
+	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+	struct kvmppc_slb *slbe;
+	hva_t ptegp;
+	u64 pteg[16];
+	u64 avpn = 0;
+	int i;
+	u8 key = 0;
+	bool found = false;
+	bool perm_err = false;
+	int second = 0;
+
+	slbe = kvmppc_mmu_book3s_64_find_slbe(vcpu_book3s, eaddr);
+	if (!slbe)
+		goto no_seg_found;
+
+do_second:
+	ptegp = kvmppc_mmu_book3s_64_get_pteg(vcpu_book3s, slbe, eaddr, second);
+	if (kvm_is_error_hva(ptegp))
+		goto no_page_found;
+
+	avpn = kvmppc_mmu_book3s_64_get_avpn(slbe, eaddr);
+
+	if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
+		printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+		goto no_page_found;
+	}
+
+	if ((vcpu->arch.msr & MSR_PR) && slbe->Kp)
+		key = 4;
+	else if (!(vcpu->arch.msr & MSR_PR) && slbe->Ks)
+		key = 4;
+
+	for (i=0; i<16; i+=2) {
+		u64 v = pteg[i];
+		u64 r = pteg[i+1];
+
+		/* Valid check */
+		if (!(v & HPTE_V_VALID))
+			continue;
+		/* Hash check */
+		if ((v & HPTE_V_SECONDARY) != second)
+			continue;
+
+		/* AVPN compare */
+		if (HPTE_V_AVPN_VAL(avpn) == HPTE_V_AVPN_VAL(v)) {
+			u8 pp = (r & HPTE_R_PP) | key;
+			int eaddr_mask = 0xFFF;
+
+			gpte->eaddr = eaddr;
+			gpte->vpage = kvmppc_mmu_book3s_64_ea_to_vp(vcpu,
+								    eaddr,
+								    data);
+			if (slbe->large)
+				eaddr_mask = 0xFFFFFF;
+			gpte->raddr = (r & HPTE_R_RPN) | (eaddr & eaddr_mask);
+			gpte->may_execute = ((r & HPTE_R_N) ? false : true);
+			gpte->may_read = false;
+			gpte->may_write = false;
+
+			switch (pp) {
+			case 0:
+			case 1:
+			case 2:
+			case 6:
+				gpte->may_write = true;
+				/* fall through */
+			case 3:
+			case 5:
+			case 7:
+				gpte->may_read = true;
+				break;
+			}
+
+			if (!gpte->may_read) {
+				perm_err = true;
+				continue;
+			}
+
+			dprintk("KVM MMU: Translated 0x%lx [0x%llx] -> 0x%llx "
+				"-> 0x%llx\n",
+				eaddr, avpn, gpte->vpage, gpte->raddr);
+			found = true;
+			break;
+		}
+	}
+
+	/* Update PTE R and C bits, so the guest's swapper knows we used the
+	 * page */
+	if (found) {
+		u32 oldr = pteg[i+1];
+
+		if (gpte->may_read) {
+			/* Set the accessed flag */
+			pteg[i+1] |= HPTE_R_R;
+		}
+		if (gpte->may_write) {
+			/* Set the dirty flag */
+			pteg[i+1] |= HPTE_R_C;
+		} else {
+			dprintk("KVM: Mapping read-only page!\n");
+		}
+
+		/* Write back into the PTEG */
+		if (pteg[i+1] != oldr)
+			copy_to_user((void __user *)ptegp, pteg, sizeof(pteg));
+
+		return 0;
+	} else {
+		dprintk("KVM MMU: No PTE found (ea=0x%lx sdr1=0x%llx "
+			"ptegp=0x%lx)\n",
+			eaddr, to_book3s(vcpu)->sdr1, ptegp);
+		for (i = 0; i < 16; i += 2)
+			dprintk("   %02d: 0x%llx - 0x%llx (0x%llx)\n",
+				i, pteg[i], pteg[i+1], avpn);
+
+		if (!second) {
+			second = HPTE_V_SECONDARY;
+			goto do_second;
+		}
+	}
+
+
+no_page_found:
+
+
+	if (perm_err)
+		return -EPERM;
+
+	return -ENOENT;
+
+no_seg_found:
+
+	dprintk("KVM MMU: Trigger segment fault\n");
+	return -EINVAL;
+}
+
+static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 rb)
+{
+	struct kvmppc_vcpu_book3s *vcpu_book3s;
+	u64 esid, esid_1t;
+	int slb_nr;
+	struct kvmppc_slb *slbe;
+
+	dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb);
+
+	vcpu_book3s = to_book3s(vcpu);
+
+	esid = GET_ESID(rb);
+	esid_1t = GET_ESID_1T(rb);
+	slb_nr = rb & 0xfff;
+
+	if (slb_nr > vcpu_book3s->slb_nr)
+		return;
+
+	slbe = &vcpu_book3s->slb[slb_nr];
+
+	slbe->large = (rs & SLB_VSID_L) ? 1 : 0;
+	slbe->esid  = slbe->large ? esid_1t : esid;
+	slbe->vsid  = rs >> 12;
+	slbe->valid = (rb & SLB_ESID_V) ? 1 : 0;
+	slbe->Ks    = (rs & SLB_VSID_KS) ? 1 : 0;
+	slbe->Kp    = (rs & SLB_VSID_KP) ? 1 : 0;
+	slbe->nx    = (rs & SLB_VSID_N) ? 1 : 0;
+	slbe->class = (rs & SLB_VSID_C) ? 1 : 0;
+
+	slbe->orige = rb & (ESID_MASK | SLB_ESID_V);
+	slbe->origv = rs;
+
+	/* Map the new segment */
+	kvmppc_mmu_map_segment(vcpu, esid << SID_SHIFT);
+}
+
+static u64 kvmppc_mmu_book3s_64_slbmfee(struct kvm_vcpu *vcpu, u64 slb_nr)
+{
+	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+	struct kvmppc_slb *slbe;
+
+	if (slb_nr > vcpu_book3s->slb_nr)
+		return 0;
+
+	slbe = &vcpu_book3s->slb[slb_nr];
+
+	return slbe->orige;
+}
+
+static u64 kvmppc_mmu_book3s_64_slbmfev(struct kvm_vcpu *vcpu, u64 slb_nr)
+{
+	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+	struct kvmppc_slb *slbe;
+
+	if (slb_nr > vcpu_book3s->slb_nr)
+		return 0;
+
+	slbe = &vcpu_book3s->slb[slb_nr];
+
+	return slbe->origv;
+}
+
+static void kvmppc_mmu_book3s_64_slbie(struct kvm_vcpu *vcpu, u64 ea)
+{
+	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+	struct kvmppc_slb *slbe;
+
+	dprintk("KVM MMU: slbie(0x%llx)\n", ea);
+
+	slbe = kvmppc_mmu_book3s_64_find_slbe(vcpu_book3s, ea);
+
+	if (!slbe)
+		return;
+
+	dprintk("KVM MMU: slbie(0x%llx, 0x%llx)\n", ea, slbe->esid);
+
+	slbe->valid = false;
+
+	kvmppc_mmu_map_segment(vcpu, ea);
+}
+
+static void kvmppc_mmu_book3s_64_slbia(struct kvm_vcpu *vcpu)
+{
+	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+	int i;
+
+	dprintk("KVM MMU: slbia()\n");
+
+	for (i = 1; i < vcpu_book3s->slb_nr; i++)
+		vcpu_book3s->slb[i].valid = false;
+
+	if (vcpu->arch.msr & MSR_IR) {
+		kvmppc_mmu_flush_segments(vcpu);
+		kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc);
+	}
+}
+
+static void kvmppc_mmu_book3s_64_mtsrin(struct kvm_vcpu *vcpu, u32 srnum,
+					ulong value)
+{
+	u64 rb = 0, rs = 0;
+
+	/* ESID = srnum */
+	rb |= (srnum & 0xf) << 28;
+	/* Set the valid bit */
+	rb |= 1 << 27;
+	/* Index = ESID */
+	rb |= srnum;
+
+	/* VSID = VSID */
+	rs |= (value & 0xfffffff) << 12;
+	/* flags = flags */
+	rs |= ((value >> 27) & 0xf) << 9;
+
+	kvmppc_mmu_book3s_64_slbmte(vcpu, rs, rb);
+}
+
+static void kvmppc_mmu_book3s_64_tlbie(struct kvm_vcpu *vcpu, ulong va,
+				       bool large)
+{
+	u64 mask = 0xFFFFFFFFFULL;
+
+	dprintk("KVM MMU: tlbie(0x%lx)\n", va);
+
+	if (large)
+		mask = 0xFFFFFF000ULL;
+	kvmppc_mmu_pte_vflush(vcpu, va >> 12, mask);
+}
+
+static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, u64 esid,
+					     u64 *vsid)
+{
+	switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+	case 0:
+		*vsid = (VSID_REAL >> 16) | esid;
+		break;
+	case MSR_IR:
+		*vsid = (VSID_REAL_IR >> 16) | esid;
+		break;
+	case MSR_DR:
+		*vsid = (VSID_REAL_DR >> 16) | esid;
+		break;
+	case MSR_DR|MSR_IR:
+	{
+		ulong ea;
+		struct kvmppc_slb *slb;
+		ea = esid << SID_SHIFT;
+		slb = kvmppc_mmu_book3s_64_find_slbe(to_book3s(vcpu), ea);
+		if (slb)
+			*vsid = slb->vsid;
+		else
+			return -ENOENT;
+
+		break;
+	}
+	default:
+		BUG();
+		break;
+	}
+
+	return 0;
+}
+
+static bool kvmppc_mmu_book3s_64_is_dcbz32(struct kvm_vcpu *vcpu)
+{
+	return (to_book3s(vcpu)->hid[5] & 0x80);
+}
+
+void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu)
+{
+	struct kvmppc_mmu *mmu = &vcpu->arch.mmu;
+
+	mmu->mfsrin = NULL;
+	mmu->mtsrin = kvmppc_mmu_book3s_64_mtsrin;
+	mmu->slbmte = kvmppc_mmu_book3s_64_slbmte;
+	mmu->slbmfee = kvmppc_mmu_book3s_64_slbmfee;
+	mmu->slbmfev = kvmppc_mmu_book3s_64_slbmfev;
+	mmu->slbie = kvmppc_mmu_book3s_64_slbie;
+	mmu->slbia = kvmppc_mmu_book3s_64_slbia;
+	mmu->xlate = kvmppc_mmu_book3s_64_xlate;
+	mmu->reset_msr = kvmppc_mmu_book3s_64_reset_msr;
+	mmu->tlbie = kvmppc_mmu_book3s_64_tlbie;
+	mmu->esid_to_vsid = kvmppc_mmu_book3s_64_esid_to_vsid;
+	mmu->ea_to_vp = kvmppc_mmu_book3s_64_ea_to_vp;
+	mmu->is_dcbz32 = kvmppc_mmu_book3s_64_is_dcbz32;
+}
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 22/27] Add fields to PACA
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

For KVM we need to store some information in the PACA, so we
need to extend it.

This patch adds KVM SLB shadow related entries to the PACA and
a field that indicates if we're inside a guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/paca.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 7d8514c..5e9b4ef 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -129,6 +129,15 @@ struct paca_struct {
 	u64 system_time;		/* accumulated system TB ticks */
 	u64 startpurr;			/* PURR/TB value snapshot */
 	u64 startspurr;			/* SPURR value snapshot */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+	struct  {
+		u64     esid;
+		u64     vsid;
+	} kvm_slb[64];			/* guest SLB */
+	u8 kvm_slb_max;			/* highest used guest slb entry */
+	u8 kvm_in_guest;		/* are we inside the guest? */
+#endif
 };
 
 extern struct paca_struct paca[];
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 20/27] Split init_new_context and destroy_context
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
  To: kvm
  Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
	kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>

For KVM we need to allocate a new context id, but don't really care about
all the mm context around it.

So let's split the alloc and destroy functions for the context id, so we can
grab one without allocating an mm context.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/mmu_context.h |    5 +++++
 arch/powerpc/mm/mmu_context_hash64.c   |   24 +++++++++++++++++++++---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index b34e94d..66b35d0 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
 extern void set_context(unsigned long id, pgd_t *pgd);
 
 #ifdef CONFIG_PPC_BOOK3S_64
+extern int __init_new_context(void);
+extern void __destroy_context(int context_id);
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
 static inline void mmu_context_init(void) { }
 #else
 extern void mmu_context_init(void);
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index dbeb86a..b9e4cc2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -18,6 +18,7 @@
 #include <linux/mm.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
+#include <linux/module.h>
 
 #include <asm/mmu_context.h>
 
@@ -32,7 +33,7 @@ static DEFINE_IDR(mmu_context_idr);
 #define NO_CONTEXT	0
 #define MAX_CONTEXT	((1UL << 19) - 1)
 
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int __init_new_context(void)
 {
 	int index;
 	int err;
@@ -57,6 +58,18 @@ again:
 		return -ENOMEM;
 	}
 
+	return index;
+}
+EXPORT_SYMBOL_GPL(__init_new_context);
+
+int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+	int index;
+
+	index = __init_new_context();
+	if (index < 0)
+		return index;
+
 	/* The old code would re-promote on fork, we don't do that
 	 * when using slices as it could cause problem promoting slices
 	 * that have been forced down to 4K
@@ -68,11 +81,16 @@ again:
 	return 0;
 }
 
-void destroy_context(struct mm_struct *mm)
+void __destroy_context(int context_id)
 {
 	spin_lock(&mmu_context_lock);
-	idr_remove(&mmu_context_idr, mm->context.id);
+	idr_remove(&mmu_context_idr, context_id);
 	spin_unlock(&mmu_context_lock);
+}
+EXPORT_SYMBOL_GPL(__destroy_context);
 
+void destroy_context(struct mm_struct *mm)
+{
+	__destroy_context(mm->context.id);
 	mm->context.id = NO_CONTEXT;
 }
-- 
1.6.0.2

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox