LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH] pcf857x: support working w/o platform data
From: Dmitry Eremin-Solenikov @ 2010-06-27  6:38 UTC (permalink / raw)
  To: David Brownell; +Cc: Jean Delvare, linux-kernel, linuxppc-dev
In-Reply-To: <352874.17151.qm@web180303.mail.gq1.yahoo.com>

Hello,

On 6/25/10, David Brownell <david-b@pacbell.net> wrote:
>
> --- On Thu, 6/17/10, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
>
>> Provide sane defaults for pcf857x, so
>> the driver can be used w/o
>> providing platform data (and thus can be simply bound via OF tree).
>
>
> Maybe we can get an ack from some OF folk
> who are using it in that way?

Please see arch/powerpc/boot//dts/mpc8349emitx.dts (the platform I'm using it).

>> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
>> ---
>>  drivers/gpio/pcf857x.c |    9 ++++-----
>>  1 files changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpio/pcf857x.c
>> b/drivers/gpio/pcf857x.c
>> index 29f19ce..879b473 100644
>> --- a/drivers/gpio/pcf857x.c
>> +++ b/drivers/gpio/pcf857x.c
>> @@ -190,7 +190,6 @@ static int pcf857x_probe(struct
>> i2c_client *client,
>>      pdata = client->dev.platform_data;
>>      if (!pdata) {
>>
>> dev_dbg(&client->dev, "no platform data\n");
>> -        return -EINVAL;
>>      }
>>
>>      /* Allocate, initialize, and register
>> this gpio_chip. */
>> @@ -200,7 +199,7 @@ static int pcf857x_probe(struct
>> i2c_client *client,
>>
>>      mutex_init(&gpio->lock);
>>
>> -    gpio->chip.base =
>> pdata->gpio_base;
>> +    gpio->chip.base = pdata ?
>> pdata->gpio_base : -1;
>>      gpio->chip.can_sleep = 1;
>>      gpio->chip.dev =
>> &client->dev;
>>      gpio->chip.owner = THIS_MODULE;
>> @@ -278,7 +277,7 @@ static int pcf857x_probe(struct
>> i2c_client *client,
>>       * to zero, our software copy
>> of the "latch" then matches the chip's
>>       * all-ones reset
>> state.  Otherwise it flags pins to be driven low.
>>       */
>> -    gpio->out = ~pdata->n_latch;
>> +    gpio->out = pdata ?
>> ~pdata->n_latch : ~0;
>>
>>      status =
>> gpiochip_add(&gpio->chip);
>>      if (status < 0)
>> @@ -299,7 +298,7 @@ static int pcf857x_probe(struct
>> i2c_client *client,
>>      /* Let platform code set up the GPIOs
>> and their users.
>>       * Now is the first time
>> anyone could use them.
>>       */
>> -    if (pdata->setup) {
>> +    if (pdata && pdata->setup)
>> {
>>          status =
>> pdata->setup(client,
>>
>>     gpio->chip.base, gpio->chip.ngpio,
>>
>>     pdata->context);
>> @@ -322,7 +321,7 @@ static int pcf857x_remove(struct
>> i2c_client *client)
>>      struct pcf857x
>>         *gpio =
>> i2c_get_clientdata(client);
>>      int
>>
>> status = 0;
>>
>> -    if (pdata->teardown) {
>> +    if (pdata &&
>> pdata->teardown) {
>>          status =
>> pdata->teardown(client,
>>
>>     gpio->chip.base, gpio->chip.ngpio,
>>
>>     pdata->context);
>> --
>> 1.7.1
>>
>>
>
>


-- 
With best wishes
Dmitry

^ permalink raw reply

* Re: [PATCH 1/2] KVM: PPC: Add generic hpte management functions
From: Benjamin Herrenschmidt @ 2010-06-26 22:58 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-ppc, linuxppc-dev, Alexander Graf, kvm
In-Reply-To: <4C20AA9B.8000807@redhat.com>

On Tue, 2010-06-22 at 15:20 +0300, Avi Kivity wrote:
> On 06/22/2010 03:14 PM, Alexander Graf wrote:
> > Avi Kivity wrote:
> >    
> >> On 06/22/2010 03:10 PM, Alexander Graf wrote:
> >>      
> >>> If you have more performance hints, I'll gladly take them :).
> >>>
> >>>        
> >> Using a cpu that virtualizes the mmu in hardware helps tremendously.
> >>
> >>      
> > PPC never does that. Even with the virtualization extensions the MMU is
> > still software managed.
> 
> Then mmu intensive loads can expect to be slow.

Well, depends. ppc64 indeed requires the hash to be managed by the
hypervisor, so inserting or invalidating translations will mean a
roundtrip to the hypervisor, though there are ways at least the
insertion could be alleviated (for example, the HV could service the
hash misses directly walking the guest page tables).

But that's due in part to a design choice (whether it's a good one or
not I'm not going to argue here) which favors huge reasonably static
workloads where the hash is expected to contain all translations for
everything.

However, note that BookE (the embedded variant of the architecture) uses
a different model for virtualization, including options in its latest
variant for a HW logical->real translation (via a small dedicated TLB)
and direct access to some TLB ops from the guest.

> > I was also more thinking of hints like
> > "kmem_cache_zalloc is slow" or so ;).
> >    
> 
> Stuff like that is usually worthless.  To give real feedback I need to 
> understand the hardware, so I'm reduced to coding style and indentation 
> review.

In that case, I'd say that BAT manipulation is rare enough (mostly only
at boot time) to warrant indeed speeding up the normal PTE operations &
invalidations at the expense of the BAT change case.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 24/26] KVM: PPC: PV mtmsrd L=0 and mtmsr
From: Segher Boessenkool @ 2010-06-26 17:03 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, KVM list, kvm-ppc
In-Reply-To: <1277508314-915-25-git-send-email-agraf@suse.de>

> There is also a form of mtmsr where all bits need to be addressed.  
> While the
> PPC64 Linux kernel behaves resonably well here, the PPC32 one never  
> uses the
> L=1 form but does mtmsr even for simple things like only changing EE.

You make it sound like the 32-bit kernel does something stupid, while
there is no other choice.  The "L=1" thing only exists for 64-bit.


Segher

^ permalink raw reply

* Re: [PATCH 11/26] KVM: PPC: Make RMO a define
From: Segher Boessenkool @ 2010-06-26 16:52 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, KVM list, kvm-ppc
In-Reply-To: <1277508314-915-12-git-send-email-agraf@suse.de>

> On PowerPC it's very normal to not support all of the physical RAM  
> in real mode.

Oh?  Are you referring to "real mode limit", or 32-bit  
implementations with
more than 32 address lines, or something else?

Either way, RMO is a really bad name for this, since that name is  
already
used for a similar but different concept.

Also, it seems you construct the physical address by masking out bits  
from
the effective address.  Most implementations will trap or machine  
check if
you address outside of physical address space, instead.

Segher

^ permalink raw reply

* RE: JFFS2 corruption when mounting filesystem with filenames oflength> 7
From: Joakim Tjernlund @ 2010-06-26 13:27 UTC (permalink / raw)
  To: Steve Deiters; +Cc: linuxppc-dev, linux-mtd
In-Reply-To: <181804936ABC2349BE503168465576460F20CB90@exchserver.basler.com>

>
> > -----Original Message-----
> > From: linux-mtd-bounces@lists.infradead.org
> > [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf Of
> > Steve Deiters
> > Sent: Thursday, June 24, 2010 3:02 PM
> > To: linux-mtd@lists.infradead.org
> > Subject: RE: JFFS2 corruption when mounting filesystem with
> > filenames oflength> 7
> >
> > > -----Original Message-----
> > > From: linux-mtd-bounces@lists.infradead.org
> > > [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf Of Steve
> > > Deiters
> > > Sent: Wednesday, June 23, 2010 5:42 PM
> > > To: linux-mtd@lists.infradead.org
> > > Subject: RE: JFFS2 corruption when mounting filesystem with
> > filenames
> > > oflength > 7
> > >
> > > > -----Original Message-----
> > > > From: linux-mtd-bounces@lists.infradead.org
> > > > [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf Of Steve
> > > > Deiters
> > > > Sent: Wednesday, June 23, 2010 5:21 PM
> > > > To: linux-mtd@lists.infradead.org
> > > > Subject: JFFS2 corruption when mounting filesystem with
> > > filenames of
> > > > length > 7
> > > >
> > > > I found an archived post which seems to be identical to my issue.
> > > > However, this is quite old and there never seemed to be any
> > > > resolution.
> > > >
> > > > http://www.infradead.org/pipermail/linux-mtd/2006-September/01
> > > > 6491.html
> > > >
> > > > If I mount a filesystem that has filenames greater than 7
> > > characters
> > > > in length, the files are corrupted when I mount.
> > > > In my case, I am making a
> > > > JFFS2 image with mkfs.jffs2 and flashing it in with u-boot.
> > > > However, I have attached a workflow where I erase the Flash
> > > and create
> > > > a new filesystem completely within Linux and it gives the same
> > > > behavior.  I can list the files with the 'ls'
> > > > command from within u-boot.  If I mount from within
> > Linux, and then
> > > > reboot into u-boot, it will not display any files that had
> > > a filename
> > > > greater than 7 characters.
> > > >
> > > > I enabled the MTD debug verbosity at level 2 for the
> > > attached example
> > > > session.
> > > >
> > > > I am running on a custom board with a MPC5121 and Linux 2.6.33.4.
> > > >
> > > > Thanks in advance for any help.
> > >
> > >
> > > Sorry for the jumbled mess.  Looks like the line endings are messed
> > > up.
> > > Trying again.  I also provided this as an attachment in
> > case it gets
> > > messed up again.
> >
> > Once again sorry for the mess.
> >
> > I tried this again with the DENX-v2.6.34 tag in the DENX git
> > repository (git://git.denx.de/linux-2.6-denx.git).  The only
> > modification I made was to add my dts file.  I still get the
> > same issue I had before.
> >
> > I've attached my kernel config if that gives any clues.
> >
> > Are there any thoughts on what may be causing this?
> >
> > Thanks.
>
>
> I think there may be something weird going on with the memcpy in my
> build.  If I use the following patch I no longer get errors when I mount
> the filesystem.  All I did was replace the memcpy with a loop.
>
> I'm not sure what's special about this particular use of memcpy.  I
> can't believe that things would be working as well as they do if memcpy
> was broken in general.
>
> This is on a PowerPC 32 bit build for a MPC5121.  I am using a GCC 4.1.2
> to compile.  Is anyone aware of any issues with memcpy in this
> configuration?
>
> Thanks.
>
> -------
>
> diff --git a/fs/jffs2/scan.c b/fs/jffs2/scan.c
> index 46f870d..673caa2 100644
> --- a/fs/jffs2/scan.c
> +++ b/fs/jffs2/scan.c
> @@ -1038,7 +1038,10 @@ static int jffs2_scan_dirent_node(struct
> jffs2_sb_info *c, struct jffs2_eraseblo
>     if (!fd) {
>        return -ENOMEM;
>     }
> -   memcpy(&fd->name, rd->name, checkedlen);

Are the pointers to memcpy overlapping? If so memcpy is undefined
and you have to use memmove().

> +   int i;
> +   for(i = 0; i < checkedlen; i++)
> +      ((unsigned char*)fd->name)[i] = ((const unsigned
> char*)rd->name)[i];
> +
>     fd->name[checkedlen] = 0;
>
>     crc = crc32(0, fd->name, rd->nsize);
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
>

^ permalink raw reply

* RE: JFFS2 corruption when mounting filesystem with filenames oflength> 7
From: Steve Deiters @ 2010-06-25 23:48 UTC (permalink / raw)
  To: linux-mtd; +Cc: linuxppc-dev
In-Reply-To: <181804936ABC2349BE503168465576460F1AB73E@exchserver.basler.com>

> -----Original Message-----
> From: linux-mtd-bounces@lists.infradead.org=20
> [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf Of=20
> Steve Deiters
> Sent: Thursday, June 24, 2010 3:02 PM
> To: linux-mtd@lists.infradead.org
> Subject: RE: JFFS2 corruption when mounting filesystem with=20
> filenames oflength> 7
>=20
> > -----Original Message-----
> > From: linux-mtd-bounces@lists.infradead.org
> > [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf Of Steve=20
> > Deiters
> > Sent: Wednesday, June 23, 2010 5:42 PM
> > To: linux-mtd@lists.infradead.org
> > Subject: RE: JFFS2 corruption when mounting filesystem with=20
> filenames=20
> > oflength > 7
> >=20
> > > -----Original Message-----
> > > From: linux-mtd-bounces@lists.infradead.org
> > > [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf Of Steve=20
> > > Deiters
> > > Sent: Wednesday, June 23, 2010 5:21 PM
> > > To: linux-mtd@lists.infradead.org
> > > Subject: JFFS2 corruption when mounting filesystem with
> > filenames of
> > > length > 7
> > >=20
> > > I found an archived post which seems to be identical to my issue.
> > > However, this is quite old and there never seemed to be any=20
> > > resolution.
> > >=20
> > > http://www.infradead.org/pipermail/linux-mtd/2006-September/01
> > > 6491.html
> > >=20
> > > If I mount a filesystem that has filenames greater than 7
> > characters
> > > in length, the files are corrupted when I mount.
> > > In my case, I am making a
> > > JFFS2 image with mkfs.jffs2 and flashing it in with u-boot. =20
> > > However, I have attached a workflow where I erase the Flash
> > and create
> > > a new filesystem completely within Linux and it gives the same=20
> > > behavior.  I can list the files with the 'ls'
> > > command from within u-boot.  If I mount from within=20
> Linux, and then=20
> > > reboot into u-boot, it will not display any files that had
> > a filename
> > > greater than 7 characters.
> > >=20
> > > I enabled the MTD debug verbosity at level 2 for the
> > attached example
> > > session.
> > >=20
> > > I am running on a custom board with a MPC5121 and Linux 2.6.33.4.
> > >=20
> > > Thanks in advance for any help.
> >=20
> >=20
> > Sorry for the jumbled mess.  Looks like the line endings are messed=20
> > up.
> > Trying again.  I also provided this as an attachment in=20
> case it gets=20
> > messed up again.
>=20
> Once again sorry for the mess.
>=20
> I tried this again with the DENX-v2.6.34 tag in the DENX git=20
> repository (git://git.denx.de/linux-2.6-denx.git).  The only=20
> modification I made was to add my dts file.  I still get the=20
> same issue I had before.
>=20
> I've attached my kernel config if that gives any clues.
>=20
> Are there any thoughts on what may be causing this?
>=20
> Thanks.


I think there may be something weird going on with the memcpy in my
build.  If I use the following patch I no longer get errors when I mount
the filesystem.  All I did was replace the memcpy with a loop.

I'm not sure what's special about this particular use of memcpy.  I
can't believe that things would be working as well as they do if memcpy
was broken in general.

This is on a PowerPC 32 bit build for a MPC5121.  I am using a GCC 4.1.2
to compile.  Is anyone aware of any issues with memcpy in this
configuration?

Thanks.

-------

diff --git a/fs/jffs2/scan.c b/fs/jffs2/scan.c
index 46f870d..673caa2 100644
--- a/fs/jffs2/scan.c
+++ b/fs/jffs2/scan.c
@@ -1038,7 +1038,10 @@ static int jffs2_scan_dirent_node(struct
jffs2_sb_info *c, struct jffs2_eraseblo
 	if (!fd) {
 		return -ENOMEM;
 	}
-	memcpy(&fd->name, rd->name, checkedlen);
+	int i;
+	for(i =3D 0; i < checkedlen; i++)
+		((unsigned char*)fd->name)[i] =3D ((const unsigned
char*)rd->name)[i];
+
 	fd->name[checkedlen] =3D 0;
=20
 	crc =3D crc32(0, fd->name, rd->nsize);

^ permalink raw reply related

* [PATCH 25/26] KVM: PPC: PV wrteei
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

On BookE the preferred way to write the EE bit is the wrteei instruction. It
already encodes the EE bit in the instruction.

So in order to get BookE some speedups as well, let's also PV'nize thati
instruction.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c      |   50 ++++++++++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/kvm_emul.S |   41 ++++++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 3557bc8..85e2163 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -66,6 +66,9 @@
 #define KVM_INST_MTMSRD_L1	0x7c010164
 #define KVM_INST_MTMSR		0x7c000124
 
+#define KVM_INST_WRTEEI_0	0x7c000146
+#define KVM_INST_WRTEEI_1	0x7c008146
+
 static bool kvm_patching_worked = true;
 static char kvm_tmp[1024 * 1024];
 static int kvm_tmp_index;
@@ -200,6 +203,47 @@ static void kvm_patch_ins_mtmsr(u32 *inst, u32 rt)
 	*inst = KVM_INST_B | (distance_start & KVM_INST_B_MASK);
 }
 
+#ifdef CONFIG_BOOKE
+
+extern u32 kvm_emulate_wrteei_branch_offs;
+extern u32 kvm_emulate_wrteei_ee_offs;
+extern u32 kvm_emulate_wrteei_len;
+extern u32 kvm_emulate_wrteei[];
+
+static void kvm_patch_ins_wrteei(u32 *inst)
+{
+	u32 *p;
+	int distance_start;
+	int distance_end;
+	ulong next_inst;
+
+	p = kvm_alloc(kvm_emulate_wrteei_len * 4);
+	if (!p)
+		return;
+
+	/* Find out where we are and put everything there */
+	distance_start = (ulong)p - (ulong)inst;
+	next_inst = ((ulong)inst + 4);
+	distance_end = next_inst - (ulong)&p[kvm_emulate_wrteei_branch_offs];
+
+	/* Make sure we only write valid b instructions */
+	if (distance_start > KVM_INST_B_MAX) {
+		kvm_patching_worked = false;
+		return;
+	}
+
+	/* Modify the chunk to fit the invocation */
+	memcpy(p, kvm_emulate_wrteei, kvm_emulate_wrteei_len * 4);
+	p[kvm_emulate_wrteei_branch_offs] |= distance_end & KVM_INST_B_MASK;
+	p[kvm_emulate_wrteei_ee_offs] |= (*inst & MSR_EE);
+	flush_icache_range((ulong)p, (ulong)p + kvm_emulate_wrteei_len * 4);
+
+	/* Patch the invocation */
+	*inst = KVM_INST_B | (distance_start & KVM_INST_B_MASK);
+}
+
+#endif
+
 static void kvm_map_magic_page(void *data)
 {
 	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -289,6 +333,12 @@ static void kvm_check_ins(u32 *inst)
 	}
 
 	switch (_inst) {
+#ifdef CONFIG_BOOKE
+	case KVM_INST_WRTEEI_0:
+	case KVM_INST_WRTEEI_1:
+		kvm_patch_ins_wrteei(inst);
+		break;
+#endif
 	}
 
 	flush_icache_range((ulong)inst, (ulong)inst + 4);
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index ccf5a42..b79b9de 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -194,3 +194,44 @@ kvm_emulate_mtmsr_orig_ins_offs:
 .global kvm_emulate_mtmsr_len
 kvm_emulate_mtmsr_len:
 	.long (kvm_emulate_mtmsr_end - kvm_emulate_mtmsr) / 4
+
+
+
+.global kvm_emulate_wrteei
+kvm_emulate_wrteei:
+
+	SCRATCH_SAVE
+
+	/* Fetch old MSR in r31 */
+	LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+	/* Remove MSR_EE from old MSR */
+	li	r30, 0
+	ori	r30, r30, MSR_EE
+	andc	r31, r31, r30
+
+	/* OR new MSR_EE onto the old MSR */
+kvm_emulate_wrteei_ee:
+	ori	r31, r31, 0
+
+	/* Write new MSR value back */
+	STL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+	SCRATCH_RESTORE
+
+	/* Go back to caller */
+kvm_emulate_wrteei_branch:
+	b	.
+kvm_emulate_wrteei_end:
+
+.global kvm_emulate_wrteei_branch_offs
+kvm_emulate_wrteei_branch_offs:
+	.long (kvm_emulate_wrteei_branch - kvm_emulate_wrteei) / 4
+
+.global kvm_emulate_wrteei_ee_offs
+kvm_emulate_wrteei_ee_offs:
+	.long (kvm_emulate_wrteei_ee - kvm_emulate_wrteei) / 4
+
+.global kvm_emulate_wrteei_len
+kvm_emulate_wrteei_len:
+	.long (kvm_emulate_wrteei_end - kvm_emulate_wrteei) / 4
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 24/26] KVM: PPC: PV mtmsrd L=0 and mtmsr
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

There is also a form of mtmsr where all bits need to be addressed. While the
PPC64 Linux kernel behaves resonably well here, the PPC32 one never uses the
L=1 form but does mtmsr even for simple things like only changing EE.

So we need to hook into that one as well and check for a mask of bits that we
deem safe to change from within guest context.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c      |   51 ++++++++++++++++++++++++
 arch/powerpc/kernel/kvm_emul.S |   84 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 135 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 71153d0..3557bc8 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -62,7 +62,9 @@
 #define KVM_INST_MTSPR_DSISR	0x7c1203a6
 
 #define KVM_INST_TLBSYNC	0x7c00046c
+#define KVM_INST_MTMSRD_L0	0x7c000164
 #define KVM_INST_MTMSRD_L1	0x7c010164
+#define KVM_INST_MTMSR		0x7c000124
 
 static bool kvm_patching_worked = true;
 static char kvm_tmp[1024 * 1024];
@@ -155,6 +157,49 @@ static void kvm_patch_ins_mtmsrd(u32 *inst, u32 rt)
 	*inst = KVM_INST_B | (distance_start & KVM_INST_B_MASK);
 }
 
+extern u32 kvm_emulate_mtmsr_branch_offs;
+extern u32 kvm_emulate_mtmsr_reg1_offs;
+extern u32 kvm_emulate_mtmsr_reg2_offs;
+extern u32 kvm_emulate_mtmsr_reg3_offs;
+extern u32 kvm_emulate_mtmsr_orig_ins_offs;
+extern u32 kvm_emulate_mtmsr_len;
+extern u32 kvm_emulate_mtmsr[];
+
+static void kvm_patch_ins_mtmsr(u32 *inst, u32 rt)
+{
+	u32 *p;
+	int distance_start;
+	int distance_end;
+	ulong next_inst;
+
+	p = kvm_alloc(kvm_emulate_mtmsr_len * 4);
+	if (!p)
+		return;
+
+	/* Find out where we are and put everything there */
+	distance_start = (ulong)p - (ulong)inst;
+	next_inst = ((ulong)inst + 4);
+	distance_end = next_inst - (ulong)&p[kvm_emulate_mtmsr_branch_offs];
+
+	/* Make sure we only write valid b instructions */
+	if (distance_start > KVM_INST_B_MAX) {
+		kvm_patching_worked = false;
+		return;
+	}
+
+	/* Modify the chunk to fit the invocation */
+	memcpy(p, kvm_emulate_mtmsr, kvm_emulate_mtmsr_len * 4);
+	p[kvm_emulate_mtmsr_branch_offs] |= distance_end & KVM_INST_B_MASK;
+	p[kvm_emulate_mtmsr_reg1_offs] |= rt;
+	p[kvm_emulate_mtmsr_reg2_offs] |= rt;
+	p[kvm_emulate_mtmsr_reg3_offs] |= rt;
+	p[kvm_emulate_mtmsr_orig_ins_offs] = *inst;
+	flush_icache_range((ulong)p, (ulong)p + kvm_emulate_mtmsr_len * 4);
+
+	/* Patch the invocation */
+	*inst = KVM_INST_B | (distance_start & KVM_INST_B_MASK);
+}
+
 static void kvm_map_magic_page(void *data)
 {
 	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -235,6 +280,12 @@ static void kvm_check_ins(u32 *inst)
 		if (get_rt(inst_rt) < 30)
 			kvm_patch_ins_mtmsrd(inst, inst_rt);
 		break;
+	case KVM_INST_MTMSR:
+	case KVM_INST_MTMSRD_L0:
+		/* We use r30 and r31 during the hook */
+		if (get_rt(inst_rt) < 30)
+			kvm_patch_ins_mtmsr(inst, inst_rt);
+		break;
 	}
 
 	switch (_inst) {
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index 25e6683..ccf5a42 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -110,3 +110,87 @@ kvm_emulate_mtmsrd_reg_offs:
 .global kvm_emulate_mtmsrd_len
 kvm_emulate_mtmsrd_len:
 	.long (kvm_emulate_mtmsrd_end - kvm_emulate_mtmsrd) / 4
+
+
+#define MSR_SAFE_BITS (MSR_EE | MSR_CE | MSR_ME | MSR_RI)
+#define MSR_CRITICAL_BITS ~MSR_SAFE_BITS
+
+.global kvm_emulate_mtmsr
+kvm_emulate_mtmsr:
+
+	SCRATCH_SAVE
+
+	/* Fetch old MSR in r31 */
+	LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+	/* Find the changed bits between old and new MSR */
+kvm_emulate_mtmsr_reg1:
+	xor	r31, r0, r31
+
+	/* Check if we need to really do mtmsr */
+	LOAD_REG_IMMEDIATE(r30, MSR_CRITICAL_BITS)
+	and.	r31, r31, r30
+
+	/* No critical bits changed? Maybe we can stay in the guest. */
+	beq	maybe_stay_in_guest
+
+do_mtmsr:
+
+	SCRATCH_RESTORE
+
+	/* Just fire off the mtmsr if it's critical */
+kvm_emulate_mtmsr_orig_ins:
+	mtmsr	r0
+
+	b	kvm_emulate_mtmsr_branch
+
+maybe_stay_in_guest:
+
+	/* Check if we have to fetch an interrupt */
+	lwz	r31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0)
+	cmpwi	r31, 0
+	beq+	no_mtmsr
+
+	/* Check if we may trigger an interrupt */
+kvm_emulate_mtmsr_reg2:
+	andi.	r31, r0, MSR_EE
+	beq	no_mtmsr
+
+	b	do_mtmsr
+
+no_mtmsr:
+
+	/* Put MSR into magic page because we don't call mtmsr */
+kvm_emulate_mtmsr_reg3:
+	STL64(r0, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+	SCRATCH_RESTORE
+
+	/* Go back to caller */
+kvm_emulate_mtmsr_branch:
+	b	.
+kvm_emulate_mtmsr_end:
+
+.global kvm_emulate_mtmsr_branch_offs
+kvm_emulate_mtmsr_branch_offs:
+	.long (kvm_emulate_mtmsr_branch - kvm_emulate_mtmsr) / 4
+
+.global kvm_emulate_mtmsr_reg1_offs
+kvm_emulate_mtmsr_reg1_offs:
+	.long (kvm_emulate_mtmsr_reg1 - kvm_emulate_mtmsr) / 4
+
+.global kvm_emulate_mtmsr_reg2_offs
+kvm_emulate_mtmsr_reg2_offs:
+	.long (kvm_emulate_mtmsr_reg2 - kvm_emulate_mtmsr) / 4
+
+.global kvm_emulate_mtmsr_reg3_offs
+kvm_emulate_mtmsr_reg3_offs:
+	.long (kvm_emulate_mtmsr_reg3 - kvm_emulate_mtmsr) / 4
+
+.global kvm_emulate_mtmsr_orig_ins_offs
+kvm_emulate_mtmsr_orig_ins_offs:
+	.long (kvm_emulate_mtmsr_orig_ins - kvm_emulate_mtmsr) / 4
+
+.global kvm_emulate_mtmsr_len
+kvm_emulate_mtmsr_len:
+	.long (kvm_emulate_mtmsr_end - kvm_emulate_mtmsr) / 4
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 12/26] KVM: PPC: First magic page steps
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

We will be introducing a method to project the shared page in guest context.
As soon as we're talking about this coupling, the shared page is colled magic
page.

This patch introduces simple defines, so the follow-up patches are easier to
read.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h |    2 ++
 include/linux/kvm_para.h            |    1 +
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e35c1ac..5f8c214 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -285,6 +285,8 @@ struct kvm_vcpu_arch {
 	u64 dec_jiffies;
 	unsigned long pending_exceptions;
 	struct kvm_vcpu_arch_shared *shared;
+	unsigned long magic_page_pa; /* phys addr to map the magic page to */
+	unsigned long magic_page_ea; /* effect. addr to map the magic page to */
 
 #ifdef CONFIG_PPC_BOOK3S
 	struct kmem_cache *hpte_cache;
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index 3b8080e..ac2015a 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -18,6 +18,7 @@
 #define KVM_HC_VAPIC_POLL_IRQ		1
 #define KVM_HC_MMU_OP			2
 #define KVM_HC_FEATURES			3
+#define KVM_HC_PPC_MAP_MAGIC_PAGE	4
 
 /*
  * hypercalls use architecture specific
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 18/26] KVM: PPC: KVM PV guest stubs
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

We will soon start and replace instructions from the text section with
other, paravirtualized versions. To ease the readability of those patches
I split out the generic looping and magic page mapping code out.

This patch still only contains stubs. But at least it loops through the
text section :).

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c |   59 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 59 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 2d8dd73..d873bc6 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -32,3 +32,62 @@
 #define KVM_MAGIC_PAGE		(-4096L)
 #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
 
+static bool kvm_patching_worked = true;
+
+static void kvm_map_magic_page(void *data)
+{
+	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
+		       KVM_MAGIC_PAGE,  /* Physical Address */
+		       KVM_MAGIC_PAGE); /* Effective Address */
+}
+
+static void kvm_check_ins(u32 *inst)
+{
+	u32 _inst = *inst;
+	u32 inst_no_rt = _inst & ~KVM_MASK_RT;
+	u32 inst_rt = _inst & KVM_MASK_RT;
+
+	switch (inst_no_rt) {
+	}
+
+	switch (_inst) {
+	}
+
+	flush_icache_range((ulong)inst, (ulong)inst + 4);
+}
+
+static void kvm_use_magic_page(void)
+{
+	u32 *p;
+	u32 *start, *end;
+
+	/* Tell the host to map the magic page to -4096 on all CPUs */
+
+	on_each_cpu(kvm_map_magic_page, NULL, 1);
+
+	/* Now loop through all code and find instructions */
+
+	start = (void*)_stext;
+	end = (void*)_etext;
+
+	for (p = start; p < end; p++)
+		kvm_check_ins(p);
+}
+
+static int __init kvm_guest_init(void)
+{
+	char *p;
+
+	if (!kvm_para_available())
+		return 0;
+
+	if (kvm_para_has_feature(KVM_FEATURE_MAGIC_PAGE))
+		kvm_use_magic_page();
+
+	printk(KERN_INFO "KVM: Live patching for a fast VM %s\n",
+			 kvm_patching_worked ? "worked" : "failed");
+
+	return 0;
+}
+
+postcore_initcall(kvm_guest_init);
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 20/26] KVM: PPC: PV tlbsync to nop
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

With our current MMU scheme we don't need to know about the tlbsync instruction.
So we can just nop it out.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index b165b20..b091f94 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -61,6 +61,8 @@
 #define KVM_INST_MTSPR_DAR	0x7c1303a6
 #define KVM_INST_MTSPR_DSISR	0x7c1203a6
 
+#define KVM_INST_TLBSYNC	0x7c00046c
+
 static bool kvm_patching_worked = true;
 
 static void kvm_patch_ins_ld(u32 *inst, long addr, u32 rt)
@@ -91,6 +93,11 @@ static void kvm_patch_ins_stw(u32 *inst, long addr, u32 rt)
 	*inst = KVM_INST_STW | rt | (addr & 0x0000fffc);
 }
 
+static void kvm_patch_ins_nop(u32 *inst)
+{
+	*inst = KVM_INST_NOP;
+}
+
 static void kvm_map_magic_page(void *data)
 {
 	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -159,6 +166,11 @@ static void kvm_check_ins(u32 *inst)
 	case KVM_INST_MTSPR_DSISR:
 		kvm_patch_ins_stw(inst, magic_var(dsisr), inst_rt);
 		break;
+
+	/* Nops */
+	case KVM_INST_TLBSYNC:
+		kvm_patch_ins_nop(inst);
+		break;
 	}
 
 	switch (_inst) {
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 16/26] KVM: Move kvm_guest_init out of generic code
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

Currently x86 is the only architecture that uses kvm_guest_init(). With
PowerPC we're getting a second user, but the signature is different there
and we don't need to export it, as it uses the normal kernel init framework.

So let's move the x86 specific definition of that function over to the x86
specfic header file.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/include/asm/kvm_para.h |    6 ++++++
 include/linux/kvm_para.h        |    5 -----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 05eba5e..7b562b6 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -158,6 +158,12 @@ static inline unsigned int kvm_arch_para_features(void)
 	return cpuid_eax(KVM_CPUID_FEATURES);
 }
 
+#ifdef CONFIG_KVM_GUEST
+void __init kvm_guest_init(void);
+#else
+#define kvm_guest_init() do { } while (0)
 #endif
 
+#endif /* __KERNEL__ */
+
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index ac2015a..47a070b 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -26,11 +26,6 @@
 #include <asm/kvm_para.h>
 
 #ifdef __KERNEL__
-#ifdef CONFIG_KVM_GUEST
-void __init kvm_guest_init(void);
-#else
-#define kvm_guest_init() do { } while (0)
-#endif
 
 static inline int kvm_para_has_feature(unsigned int feature)
 {
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 26/26] KVM: PPC: Add Documentation about PV interface
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

We just introduced a new PV interface that screams for documentation. So here
it is - a shiny new and awesome text file describing the internal works of
the PPC KVM paravirtual interface.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 Documentation/kvm/ppc-pv.txt |  164 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 164 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/kvm/ppc-pv.txt

diff --git a/Documentation/kvm/ppc-pv.txt b/Documentation/kvm/ppc-pv.txt
new file mode 100644
index 0000000..7cbcd51
--- /dev/null
+++ b/Documentation/kvm/ppc-pv.txt
@@ -0,0 +1,164 @@
+The PPC KVM paravirtual interface
+=================================
+
+The basic execution principle by which KVM on PowerPC works is to run all kernel
+space code in PR=1 which is user space. This way we trap all privileged
+instructions and can emulate them accordingly.
+
+Unfortunately that is also the downfall. There are quite some privileged
+instructions that needlessly return us to the hypervisor even though they
+could be handled differently.
+
+This is what the PPC PV interface helps with. It takes privileged instructions
+and transforms them into unprivileged ones with some help from the hypervisor.
+This cuts down virtualization costs by about 50% on some of my benchmarks.
+
+The code for that interface can be found in arch/powerpc/kernel/kvm*
+
+Querying for existence
+======================
+
+To find out if we're running on KVM or not, we overlay the PVR register. Usually
+the PVR register contains an id that identifies your CPU type. If, however, you
+pass KVM_PVR_PARA in the register that you want the PVR result in, the register
+still contains KVM_PVR_PARA after the mfpvr call.
+
+	LOAD_REG_IMM(r5, KVM_PVR_PARA)
+	mfpvr	r5
+	[r5 still contains KVM_PVR_PARA]
+
+Once determined to run under a PV capable KVM, you can now use hypercalls as
+described below.
+
+PPC hypercalls
+==============
+
+The only viable ways to reliably get from guest context to host context are:
+
+	1) Call an invalid instruction
+	2) Call the "sc" instruction with a parameter to "sc"
+	3) Call the "sc" instruction with parameters in GPRs
+
+Method 1 is always a bad idea. Invalid instructions can be replaced later on
+by valid instructions, rendering the interface broken.
+
+Method 2 also has downfalls. If the parameter to "sc" is != 0 the spec is
+rather unclear if the sc is targeted directly for the hypervisor or the
+supervisor. It would also require that we read the syscall issuing instruction
+every time a syscall is issued, slowing down guest syscalls.
+
+Method 3 is what KVM uses. We pass magic constants (KVM_SC_MAGIC_R3 and
+KVM_SC_MAGIC_R4) in r3 and r4 respectively. If a syscall instruction with these
+magic values arrives from the guest's kernel mode, we take the syscall as a
+hypercall.
+
+The parameters are as follows:
+
+	r3		KVM_SC_MAGIC_R3
+	r4		KVM_SC_MAGIC_R4
+	r5		Hypercall number
+	r6		First parameter
+	r7		Second parameter
+	r8		Third parameter
+	r9		Fourth parameter
+
+Hypercall definitions are shared in generic code, so the same hypercall numbers
+apply for x86 and powerpc alike.
+
+The magic page
+==============
+
+To enable communication between the hypervisor and guest there is a new shared
+page that contains parts of supervisor visible register state. The guest can
+map this shared page using the KVM hypercall KVM_HC_PPC_MAP_MAGIC_PAGE.
+
+With this hypercall issued the guest always gets the magic page mapped at the
+desired location in effective and physical address space. For now, we always
+map the page to -4096. This way we can access it using absolute load and store
+functions. The following instruction reads the first field of the magic page:
+
+	ld	rX, -4096(0)
+
+The interface is designed to be extensible should there be need later to add
+additional registers to the magic page. If you add fields to the magic page,
+also define a new hypercall feature to indicate that the host can give you more
+registers. Only if the host supports the additional features, make use of them.
+
+The magic page has the following layout as described in
+arch/powerpc/include/asm/kvm_para.h:
+
+struct kvm_vcpu_arch_shared {
+	__u64 scratch1;
+	__u64 scratch2;
+	__u64 scratch3;
+	__u64 critical;		/* Guest may not get interrupts if == r1 */
+	__u64 sprg0;
+	__u64 sprg1;
+	__u64 sprg2;
+	__u64 sprg3;
+	__u64 srr0;
+	__u64 srr1;
+	__u64 dar;
+	__u64 msr;
+	__u32 dsisr;
+	__u32 int_pending;	/* Tells the guest if we have an interrupt */
+};
+
+Additions to the page must only occur at the end. Struct fields are always 32
+bit aligned.
+
+Patched instructions
+====================
+
+The "ld" and "std" instructions are transormed to "lwz" and "stw" instructions
+respectively on 32 bit systems with an added offset of 4 to accomodate for big
+endianness.
+
+From			To
+====			==
+
+mfmsr	rX		ld	rX, magic_page->msr
+mfsprg	rX, 0		ld	rX, magic_page->sprg0
+mfsprg	rX, 1		ld	rX, magic_page->sprg1
+mfsprg	rX, 2		ld	rX, magic_page->sprg2
+mfsprg	rX, 3		ld	rX, magic_page->sprg3
+mfsrr0	rX		ld	rX, magic_page->srr0
+mfsrr1	rX		ld	rX, magic_page->srr1
+mfdar	rX		ld	rX, magic_page->dar
+mfdsisr	rX		ld	rX, magic_page->dsisr
+
+mtmsr	rX		std	rX, magic_page->msr
+mtsprg	0, rX		std	rX, magic_page->sprg0
+mtsprg	1, rX		std	rX, magic_page->sprg1
+mtsprg	2, rX		std	rX, magic_page->sprg2
+mtsprg	3, rX		std	rX, magic_page->sprg3
+mtsrr0	rX		std	rX, magic_page->srr0
+mtsrr1	rX		std	rX, magic_page->srr1
+mtdar	rX		std	rX, magic_page->dar
+mtdsisr	rX		std	rX, magic_page->dsisr
+
+tlbsync			nop
+
+mtmsrd	rX, 0		b	<special mtmsr section>
+mtmsr			b	<special mtmsr section>
+
+mtmsrd	rX, 1		b	<special mtmsrd section>
+
+[BookE only]
+wrteei	[0|1]		b	<special wrteei section>
+
+
+Some instructions require more logic to determine what's going on than a load
+or store instruction can deliver. To enable patching of those, we keep some
+RAM around where we can live translate instructions to. What happens is the
+following:
+
+	1) copy emulation code to memory
+	2) patch that code to fit the emulated instruction
+	3) patch that code to return to the original pc + 4
+	4) patch the original instruction to branch to the new code
+
+That way we can inject an arbitrary amount of code as replacement for a single
+instruction. This allows us to check for pending interrupts when setting EE=1
+for example.
+
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 23/26] KVM: PPC: PV mtmsrd L=1
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

The PowerPC ISA has a special instruction for mtmsr that only changes the EE
and RI bits, namely the L=1 form.

Since that one is reasonably often occuring and simple to implement, let's
go with this first. Writing EE=0 is always just a store. Doing EE=1 also
requires us to check for pending interrupts and if necessary exit back to the
hypervisor.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c      |   45 ++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/kvm_emul.S |   56 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 7e8fe6f..71153d0 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -62,6 +62,7 @@
 #define KVM_INST_MTSPR_DSISR	0x7c1203a6
 
 #define KVM_INST_TLBSYNC	0x7c00046c
+#define KVM_INST_MTMSRD_L1	0x7c010164
 
 static bool kvm_patching_worked = true;
 static char kvm_tmp[1024 * 1024];
@@ -117,6 +118,43 @@ static u32 *kvm_alloc(int len)
 	return p;
 }
 
+extern u32 kvm_emulate_mtmsrd_branch_offs;
+extern u32 kvm_emulate_mtmsrd_reg_offs;
+extern u32 kvm_emulate_mtmsrd_len;
+extern u32 kvm_emulate_mtmsrd[];
+
+static void kvm_patch_ins_mtmsrd(u32 *inst, u32 rt)
+{
+	u32 *p;
+	int distance_start;
+	int distance_end;
+	ulong next_inst;
+
+	p = kvm_alloc(kvm_emulate_mtmsrd_len * 4);
+	if (!p)
+		return;
+
+	/* Find out where we are and put everything there */
+	distance_start = (ulong)p - (ulong)inst;
+	next_inst = ((ulong)inst + 4);
+	distance_end = next_inst - (ulong)&p[kvm_emulate_mtmsrd_branch_offs];
+
+	/* Make sure we only write valid b instructions */
+	if (distance_start > KVM_INST_B_MAX) {
+		kvm_patching_worked = false;
+		return;
+	}
+
+	/* Modify the chunk to fit the invocation */
+	memcpy(p, kvm_emulate_mtmsrd, kvm_emulate_mtmsrd_len * 4);
+	p[kvm_emulate_mtmsrd_branch_offs] |= distance_end & KVM_INST_B_MASK;
+	p[kvm_emulate_mtmsrd_reg_offs] |= rt;
+	flush_icache_range((ulong)p, (ulong)p + kvm_emulate_mtmsrd_len * 4);
+
+	/* Patch the invocation */
+	*inst = KVM_INST_B | (distance_start & KVM_INST_B_MASK);
+}
+
 static void kvm_map_magic_page(void *data)
 {
 	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -190,6 +228,13 @@ static void kvm_check_ins(u32 *inst)
 	case KVM_INST_TLBSYNC:
 		kvm_patch_ins_nop(inst);
 		break;
+
+	/* Rewrites */
+	case KVM_INST_MTMSRD_L1:
+		/* We use r30 and r31 during the hook */
+		if (get_rt(inst_rt) < 30)
+			kvm_patch_ins_mtmsrd(inst, inst_rt);
+		break;
 	}
 
 	switch (_inst) {
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index 7da835a..25e6683 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -54,3 +54,59 @@
 	/* Disable critical section. We are critical if			\
 	   shared->critical == r1 and r2 is always != r1 */		\
 	STL64(r2, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);
+
+.global kvm_emulate_mtmsrd
+kvm_emulate_mtmsrd:
+
+	SCRATCH_SAVE
+
+	/* Put MSR & ~(MSR_EE|MSR_RI) in r31 */
+	LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+	lis	r30, (~(MSR_EE | MSR_RI))@h
+	ori	r30, r30, (~(MSR_EE | MSR_RI))@l
+	and	r31, r31, r30
+
+	/* OR the register's (MSR_EE|MSR_RI) on MSR */
+kvm_emulate_mtmsrd_reg:
+	andi.	r30, r0, (MSR_EE|MSR_RI)
+	or	r31, r31, r30
+
+	/* Put MSR back into magic page */
+	STL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+	/* Check if we have to fetch an interrupt */
+	lwz	r31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0)
+	cmpwi	r31, 0
+	beq+	no_check
+
+	/* Check if we may trigger an interrupt */
+	andi.	r30, r30, MSR_EE
+	beq	no_check
+
+	SCRATCH_RESTORE
+
+	/* Nag hypervisor */
+	tlbsync
+
+	b	kvm_emulate_mtmsrd_branch
+
+no_check:
+
+	SCRATCH_RESTORE
+
+	/* Go back to caller */
+kvm_emulate_mtmsrd_branch:
+	b	.
+kvm_emulate_mtmsrd_end:
+
+.global kvm_emulate_mtmsrd_branch_offs
+kvm_emulate_mtmsrd_branch_offs:
+	.long (kvm_emulate_mtmsrd_branch - kvm_emulate_mtmsrd) / 4
+
+.global kvm_emulate_mtmsrd_reg_offs
+kvm_emulate_mtmsrd_reg_offs:
+	.long (kvm_emulate_mtmsrd_reg - kvm_emulate_mtmsrd) / 4
+
+.global kvm_emulate_mtmsrd_len
+kvm_emulate_mtmsrd_len:
+	.long (kvm_emulate_mtmsrd_end - kvm_emulate_mtmsrd) / 4
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 22/26] KVM: PPC: PV assembler helpers
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

When we hook an instruction we need to make sure we don't clobber any of
the registers at that point. So we write them out to scratch space in the
magic page. To make sure we don't fall into a race with another piece of
hooked code, we need to disable interrupts.

To make the later patches and code in general easier readable, let's introduce
a set of defines that save and restore r30, r31 and cr. Let's also define some
helpers to read the lower 32 bits of a 64 bit field on 32 bit systems.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm_emul.S |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index c7b9fc9..7da835a 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -25,3 +25,32 @@
 
 #define KVM_MAGIC_PAGE		(-4096)
 
+#ifdef CONFIG_64BIT
+#define LL64(reg, offs, reg2)	ld	reg, (offs)(reg2)
+#define STL64(reg, offs, reg2)	std	reg, (offs)(reg2)
+#else
+#define LL64(reg, offs, reg2)	lwz	reg, (offs + 4)(reg2)
+#define STL64(reg, offs, reg2)	stw	reg, (offs + 4)(reg2)
+#endif
+
+#define SCRATCH_SAVE							\
+	/* Enable critical section. We are critical if			\
+	   shared->critical == r1 */					\
+	STL64(r1, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);		\
+									\
+	/* Save state */						\
+	PPC_STL	r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH1)(0);		\
+	PPC_STL	r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH2)(0);		\
+	mfcr	r31;							\
+	stw	r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH3)(0);
+
+#define SCRATCH_RESTORE							\
+	/* Restore state */						\
+	PPC_LL	r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH1)(0);		\
+	lwz	r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH3)(0);		\
+	mtcr	r30;							\
+	PPC_LL	r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH2)(0);		\
+									\
+	/* Disable critical section. We are critical if			\
+	   shared->critical == r1 and r2 is always != r1 */		\
+	STL64(r2, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 10/26] KVM: PPC: Tell guest about pending interrupts
From: Alexander Graf @ 2010-06-25 23:24 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

When the guest turns on interrupts again, it needs to know if we have an
interrupt pending for it. Because if so, it should rather get out of guest
context and get the interrupt.

So we introduce a new field in the shared page that we use to tell the guest
that there's a pending interrupt lying around.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_para.h |    1 +
 arch/powerpc/kvm/book3s.c           |    7 +++++++
 arch/powerpc/kvm/booke.c            |    7 +++++++
 3 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index edf8f83..c7305d7 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -36,6 +36,7 @@ struct kvm_vcpu_arch_shared {
 	__u64 dar;
 	__u64 msr;
 	__u32 dsisr;
+	__u32 int_pending;	/* Tells the guest if we have an interrupt */
 };
 
 #define KVM_PVR_PARA		0x4b564d3f /* "KVM?" */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index f0e8047..e76c950 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -334,6 +334,7 @@ int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
 void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 {
 	unsigned long *pending = &vcpu->arch.pending_exceptions;
+	unsigned long old_pending = vcpu->arch.pending_exceptions;
 	unsigned int priority;
 
 #ifdef EXIT_DEBUG
@@ -353,6 +354,12 @@ void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 					 BITS_PER_BYTE * sizeof(*pending),
 					 priority + 1);
 	}
+
+	/* Tell the guest about our interrupt status */
+	if (*pending)
+		vcpu->arch.shared->int_pending = 1;
+	else if (old_pending)
+		vcpu->arch.shared->int_pending = 0;
 }
 
 void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 485f8fa..2229df9 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -221,6 +221,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 {
 	unsigned long *pending = &vcpu->arch.pending_exceptions;
+	unsigned long old_pending = vcpu->arch.pending_exceptions;
 	unsigned int priority;
 
 	priority = __ffs(*pending);
@@ -232,6 +233,12 @@ void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 		                         BITS_PER_BYTE * sizeof(*pending),
 		                         priority + 1);
 	}
+
+	/* Tell the guest about our interrupt status */
+	if (*pending)
+		vcpu->arch.shared->int_pending = 1;
+	else if (old_pending)
+		vcpu->arch.shared->int_pending = 0;
 }
 
 /**
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 19/26] KVM: PPC: PV instructions to loads and stores
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

Some instructions can simply be replaced by load and store instructions to
or from the magic page.

This patch replaces often called instructions that fall into the above category.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c |  111 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 111 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index d873bc6..b165b20 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -32,8 +32,65 @@
 #define KVM_MAGIC_PAGE		(-4096L)
 #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
 
+#define KVM_INST_LWZ		0x80000000
+#define KVM_INST_STW		0x90000000
+#define KVM_INST_LD		0xe8000000
+#define KVM_INST_STD		0xf8000000
+#define KVM_INST_NOP		0x60000000
+#define KVM_INST_B		0x48000000
+#define KVM_INST_B_MASK		0x03ffffff
+#define KVM_INST_B_MAX		0x01ffffff
+
+#define KVM_MASK_RT		0x03e00000
+#define KVM_INST_MFMSR		0x7c0000a6
+#define KVM_INST_MFSPR_SPRG0	0x7c1042a6
+#define KVM_INST_MFSPR_SPRG1	0x7c1142a6
+#define KVM_INST_MFSPR_SPRG2	0x7c1242a6
+#define KVM_INST_MFSPR_SPRG3	0x7c1342a6
+#define KVM_INST_MFSPR_SRR0	0x7c1a02a6
+#define KVM_INST_MFSPR_SRR1	0x7c1b02a6
+#define KVM_INST_MFSPR_DAR	0x7c1302a6
+#define KVM_INST_MFSPR_DSISR	0x7c1202a6
+
+#define KVM_INST_MTSPR_SPRG0	0x7c1043a6
+#define KVM_INST_MTSPR_SPRG1	0x7c1143a6
+#define KVM_INST_MTSPR_SPRG2	0x7c1243a6
+#define KVM_INST_MTSPR_SPRG3	0x7c1343a6
+#define KVM_INST_MTSPR_SRR0	0x7c1a03a6
+#define KVM_INST_MTSPR_SRR1	0x7c1b03a6
+#define KVM_INST_MTSPR_DAR	0x7c1303a6
+#define KVM_INST_MTSPR_DSISR	0x7c1203a6
+
 static bool kvm_patching_worked = true;
 
+static void kvm_patch_ins_ld(u32 *inst, long addr, u32 rt)
+{
+#ifdef CONFIG_64BIT
+	*inst = KVM_INST_LD | rt | (addr & 0x0000fffc);
+#else
+	*inst = KVM_INST_LWZ | rt | ((addr + 4) & 0x0000fffc);
+#endif
+}
+
+static void kvm_patch_ins_lwz(u32 *inst, long addr, u32 rt)
+{
+	*inst = KVM_INST_LWZ | rt | (addr & 0x0000ffff);
+}
+
+static void kvm_patch_ins_std(u32 *inst, long addr, u32 rt)
+{
+#ifdef CONFIG_64BIT
+	*inst = KVM_INST_STD | rt | (addr & 0x0000fffc);
+#else
+	*inst = KVM_INST_STW | rt | ((addr + 4) & 0x0000fffc);
+#endif
+}
+
+static void kvm_patch_ins_stw(u32 *inst, long addr, u32 rt)
+{
+	*inst = KVM_INST_STW | rt | (addr & 0x0000fffc);
+}
+
 static void kvm_map_magic_page(void *data)
 {
 	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -48,6 +105,60 @@ static void kvm_check_ins(u32 *inst)
 	u32 inst_rt = _inst & KVM_MASK_RT;
 
 	switch (inst_no_rt) {
+	/* Loads */
+	case KVM_INST_MFMSR:
+		kvm_patch_ins_ld(inst, magic_var(msr), inst_rt);
+		break;
+	case KVM_INST_MFSPR_SPRG0:
+		kvm_patch_ins_ld(inst, magic_var(sprg0), inst_rt);
+		break;
+	case KVM_INST_MFSPR_SPRG1:
+		kvm_patch_ins_ld(inst, magic_var(sprg1), inst_rt);
+		break;
+	case KVM_INST_MFSPR_SPRG2:
+		kvm_patch_ins_ld(inst, magic_var(sprg2), inst_rt);
+		break;
+	case KVM_INST_MFSPR_SPRG3:
+		kvm_patch_ins_ld(inst, magic_var(sprg3), inst_rt);
+		break;
+	case KVM_INST_MFSPR_SRR0:
+		kvm_patch_ins_ld(inst, magic_var(srr0), inst_rt);
+		break;
+	case KVM_INST_MFSPR_SRR1:
+		kvm_patch_ins_ld(inst, magic_var(srr1), inst_rt);
+		break;
+	case KVM_INST_MFSPR_DAR:
+		kvm_patch_ins_ld(inst, magic_var(dar), inst_rt);
+		break;
+	case KVM_INST_MFSPR_DSISR:
+		kvm_patch_ins_lwz(inst, magic_var(dsisr), inst_rt);
+		break;
+
+	/* Stores */
+	case KVM_INST_MTSPR_SPRG0:
+		kvm_patch_ins_std(inst, magic_var(sprg0), inst_rt);
+		break;
+	case KVM_INST_MTSPR_SPRG1:
+		kvm_patch_ins_std(inst, magic_var(sprg1), inst_rt);
+		break;
+	case KVM_INST_MTSPR_SPRG2:
+		kvm_patch_ins_std(inst, magic_var(sprg2), inst_rt);
+		break;
+	case KVM_INST_MTSPR_SPRG3:
+		kvm_patch_ins_std(inst, magic_var(sprg3), inst_rt);
+		break;
+	case KVM_INST_MTSPR_SRR0:
+		kvm_patch_ins_std(inst, magic_var(srr0), inst_rt);
+		break;
+	case KVM_INST_MTSPR_SRR1:
+		kvm_patch_ins_std(inst, magic_var(srr1), inst_rt);
+		break;
+	case KVM_INST_MTSPR_DAR:
+		kvm_patch_ins_std(inst, magic_var(dar), inst_rt);
+		break;
+	case KVM_INST_MTSPR_DSISR:
+		kvm_patch_ins_stw(inst, magic_var(dsisr), inst_rt);
+		break;
 	}
 
 	switch (_inst) {
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 17/26] KVM: PPC: Generic KVM PV guest support
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

We have all the hypervisor pieces in place now, but the guest parts are still
missing.

This patch implements basic awareness of KVM when running Linux as guest. It
doesn't do anything with it yet though.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/Makefile      |    2 ++
 arch/powerpc/kernel/asm-offsets.c |   15 +++++++++++++++
 arch/powerpc/kernel/kvm.c         |   34 ++++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/kvm_emul.S    |   27 +++++++++++++++++++++++++++
 arch/powerpc/platforms/Kconfig    |   10 ++++++++++
 5 files changed, 88 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kernel/kvm.c
 create mode 100644 arch/powerpc/kernel/kvm_emul.S

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 58d0572..2d7eb9e 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -125,6 +125,8 @@ ifneq ($(CONFIG_XMON)$(CONFIG_KEXEC),)
 obj-y				+= ppc_save_regs.o
 endif
 
+obj-$(CONFIG_KVM_GUEST) += kvm.o kvm_emul.o
+
 # Disable GCOV in odd or sensitive code
 GCOV_PROFILE_prom_init.o := n
 GCOV_PROFILE_ftrace.o := n
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index a55d47e..e3e740b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -465,6 +465,21 @@ int main(void)
 	DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
 #endif /* CONFIG_PPC_BOOK3S */
 #endif
+
+#ifdef CONFIG_KVM_GUEST
+	DEFINE(KVM_MAGIC_SCRATCH1, offsetof(struct kvm_vcpu_arch_shared,
+					    scratch1));
+	DEFINE(KVM_MAGIC_SCRATCH2, offsetof(struct kvm_vcpu_arch_shared,
+					    scratch2));
+	DEFINE(KVM_MAGIC_SCRATCH3, offsetof(struct kvm_vcpu_arch_shared,
+					    scratch3));
+	DEFINE(KVM_MAGIC_INT, offsetof(struct kvm_vcpu_arch_shared,
+				       int_pending));
+	DEFINE(KVM_MAGIC_MSR, offsetof(struct kvm_vcpu_arch_shared, msr));
+	DEFINE(KVM_MAGIC_CRITICAL, offsetof(struct kvm_vcpu_arch_shared,
+					    critical));
+#endif
+
 #ifdef CONFIG_44x
 	DEFINE(PGD_T_LOG2, PGD_T_LOG2);
 	DEFINE(PTE_T_LOG2, PTE_T_LOG2);
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
new file mode 100644
index 0000000..2d8dd73
--- /dev/null
+++ b/arch/powerpc/kernel/kvm.c
@@ -0,0 +1,34 @@
+/*
+ * Copyright (C) 2010 SUSE Linux Products GmbH. All rights reserved.
+ *
+ * Authors:
+ *     Alexander Graf <agraf@suse.de>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/init.h>
+#include <linux/kvm_para.h>
+#include <linux/slab.h>
+
+#include <asm/reg.h>
+#include <asm/kvm_ppc.h>
+#include <asm/sections.h>
+#include <asm/cacheflush.h>
+#include <asm/disassemble.h>
+
+#define KVM_MAGIC_PAGE		(-4096L)
+#define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
+
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
new file mode 100644
index 0000000..c7b9fc9
--- /dev/null
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -0,0 +1,27 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2010
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+
+#define KVM_MAGIC_PAGE		(-4096)
+
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index d1663db..1744349 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -21,6 +21,16 @@ source "arch/powerpc/platforms/44x/Kconfig"
 source "arch/powerpc/platforms/40x/Kconfig"
 source "arch/powerpc/platforms/amigaone/Kconfig"
 
+config KVM_GUEST
+	bool "KVM Guest support"
+	default y
+	---help---
+	  This option enables various optimizations for running under the KVM
+	  hypervisor. Overhead for the kernel when not running inside KVM should
+	  be minimal.
+
+	  In case of doubt, say Y
+
 config PPC_NATIVE
 	bool
 	depends on 6xx || PPC64
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 14/26] KVM: PPC: Magic Page BookE support
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

As we now have Book3s support for the magic page, we also need BookE to
join in on the party.

This patch implements generic magic page logic for BookE and specific
TLB logic for e500. I didn't have any 440 around, so I didn't dare to
blindly try and write up broken code.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c    |   29 +++++++++++++++++++++++++++++
 arch/powerpc/kvm/e500_tlb.c |   19 +++++++++++++++++--
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 2229df9..7957aa4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -241,6 +241,31 @@ void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 		vcpu->arch.shared->int_pending = 0;
 }
 
+/* Check if a DTLB miss was on the magic page. Returns !0 if so. */
+int kvmppc_dtlb_magic_page(struct kvm_vcpu *vcpu, ulong eaddr)
+{
+	ulong mp_ea = vcpu->arch.magic_page_ea;
+	ulong gpaddr = vcpu->arch.magic_page_pa;
+	int gtlb_index = 11 | (1 << 16); /* Random number in TLB1 */
+
+	/* Check for existence of magic page */
+	if(likely(!mp_ea))
+		return 0;
+
+	/* Check if we're on the magic page */
+	if(likely((eaddr >> 12) != (mp_ea >> 12)))
+		return 0;
+
+	/* Don't map in user mode */
+	if(vcpu->arch.shared->msr & MSR_PR)
+		return 0;
+
+	kvmppc_mmu_map(vcpu, vcpu->arch.magic_page_ea, gpaddr, gtlb_index);
+	kvmppc_account_exit(vcpu, DTLB_VIRT_MISS_EXITS);
+
+	return 1;
+}
+
 /**
  * kvmppc_handle_exit
  *
@@ -308,6 +333,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			r = RESUME_HOST;
 			break;
 		case EMULATE_FAIL:
+		case EMULATE_DO_MMIO:
 			/* XXX Deliver Program interrupt to guest. */
 			printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
 			       __func__, vcpu->arch.pc, vcpu->arch.last_inst);
@@ -377,6 +403,9 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		gpa_t gpaddr;
 		gfn_t gfn;
 
+		if (kvmppc_dtlb_magic_page(vcpu, eaddr))
+			break;
+
 		/* Check the guest TLB. */
 		gtlb_index = kvmppc_mmu_dtlb_index(vcpu, eaddr);
 		if (gtlb_index < 0) {
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 66845a5..f5582ca 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -295,9 +295,22 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	struct page *new_page;
 	struct tlbe *stlbe;
 	hpa_t hpaddr;
+	u32 mas2 = gtlbe->mas2;
+	u32 mas3 = gtlbe->mas3;
 
 	stlbe = &vcpu_e500->shadow_tlb[tlbsel][esel];
 
+	if ((vcpu_e500->vcpu.arch.magic_page_ea) &&
+	    ((vcpu_e500->vcpu.arch.magic_page_pa >> PAGE_SHIFT) == gfn) &&
+	    !(vcpu_e500->vcpu.arch.shared->msr & MSR_PR)) {
+		mas2 = 0;
+		mas3 = E500_TLB_SUPER_PERM_MASK;
+		hpaddr = virt_to_phys(vcpu_e500->vcpu.arch.shared);
+		new_page = pfn_to_page(hpaddr >> PAGE_SHIFT);
+		get_page(new_page);
+		goto mapped;
+	}
+
 	/* Get reference to new page. */
 	new_page = gfn_to_page(vcpu_e500->vcpu.kvm, gfn);
 	if (is_error_page(new_page)) {
@@ -305,6 +318,8 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 		kvm_release_page_clean(new_page);
 		return;
 	}
+
+mapped:
 	hpaddr = page_to_phys(new_page);
 
 	/* Drop reference to old page. */
@@ -316,10 +331,10 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	stlbe->mas1 = MAS1_TSIZE(BOOK3E_PAGESZ_4K)
 		| MAS1_TID(get_tlb_tid(gtlbe)) | MAS1_TS | MAS1_VALID;
 	stlbe->mas2 = (gvaddr & MAS2_EPN)
-		| e500_shadow_mas2_attrib(gtlbe->mas2,
+		| e500_shadow_mas2_attrib(mas2,
 				vcpu_e500->vcpu.arch.shared->msr & MSR_PR);
 	stlbe->mas3 = (hpaddr & MAS3_RPN)
-		| e500_shadow_mas3_attrib(gtlbe->mas3,
+		| e500_shadow_mas3_attrib(mas3,
 				vcpu_e500->vcpu.arch.shared->msr & MSR_PR);
 	stlbe->mas7 = (hpaddr >> 32) & MAS7_RPN;
 
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 08/26] KVM: PPC: Add PV guest critical sections
From: Alexander Graf @ 2010-06-25 23:24 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

When running in hooked code we need a way to disable interrupts without
clobbering any interrupts or exiting out to the hypervisor.

To achieve this, we have an additional critical field in the shared page. If
that field is equal to the r1 register of the guest, it tells the hypervisor
that we're in such a critical section and thus may not receive any interrupts.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_para.h |    1 +
 arch/powerpc/kvm/book3s.c           |   15 +++++++++++++--
 arch/powerpc/kvm/booke.c            |   12 ++++++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index eaab306..d1fe9ae 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,7 @@
 #include <linux/types.h>
 
 struct kvm_vcpu_arch_shared {
+	__u64 critical;		/* Guest may not get interrupts if == r1 */
 	__u64 sprg0;
 	__u64 sprg1;
 	__u64 sprg2;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index e8001c5..f0e8047 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -251,14 +251,25 @@ int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
 	int deliver = 1;
 	int vec = 0;
 	ulong flags = 0ULL;
+	ulong crit_raw = vcpu->arch.shared->critical;
+	ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
+	bool crit;
+
+	/* Truncate crit indicators in 32 bit mode */
+	if (!(vcpu->arch.shared->msr & MSR_SF)) {
+		crit_raw &= 0xffffffff;
+		crit_r1 &= 0xffffffff;
+	}
+
+	crit = (crit_raw == crit_r1);
 
 	switch (priority) {
 	case BOOK3S_IRQPRIO_DECREMENTER:
-		deliver = vcpu->arch.shared->msr & MSR_EE;
+		deliver = (vcpu->arch.shared->msr & MSR_EE) && !crit;
 		vec = BOOK3S_INTERRUPT_DECREMENTER;
 		break;
 	case BOOK3S_IRQPRIO_EXTERNAL:
-		deliver = vcpu->arch.shared->msr & MSR_EE;
+		deliver = (vcpu->arch.shared->msr & MSR_EE) && !crit;
 		vec = BOOK3S_INTERRUPT_EXTERNAL;
 		break;
 	case BOOK3S_IRQPRIO_SYSTEM_RESET:
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index e7d1216..485f8fa 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -147,6 +147,17 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 	int allowed = 0;
 	ulong uninitialized_var(msr_mask);
 	bool update_esr = false, update_dear = false;
+	ulong crit_raw = vcpu->arch.shared->critical;
+	ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
+	bool crit;
+
+	/* Truncate crit indicators in 32 bit mode */
+	if (!(vcpu->arch.shared->msr & MSR_SF)) {
+		crit_raw &= 0xffffffff;
+		crit_r1 &= 0xffffffff;
+	}
+
+	crit = (crit_raw == crit_r1);
 
 	switch (priority) {
 	case BOOKE_IRQPRIO_DTLB_MISS:
@@ -181,6 +192,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 	case BOOKE_IRQPRIO_DECREMENTER:
 	case BOOKE_IRQPRIO_FIT:
 		allowed = vcpu->arch.shared->msr & MSR_EE;
+		allowed = allowed && !crit;
 		msr_mask = MSR_CE|MSR_ME|MSR_DE;
 		break;
 	case BOOKE_IRQPRIO_DEBUG:
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 13/26] KVM: PPC: Magic Page Book3s support
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

We need to override EA as well as PA lookups for the magic page. When the guest
tells us to project it, the magic page overrides any guest mappings.

In order to reflect that, we need to hook into all the MMU layers of KVM to
force map the magic page if necessary.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s.c             |    7 +++++++
 arch/powerpc/kvm/book3s_32_mmu.c      |   16 ++++++++++++++++
 arch/powerpc/kvm/book3s_32_mmu_host.c |   12 ++++++++++++
 arch/powerpc/kvm/book3s_64_mmu.c      |   30 +++++++++++++++++++++++++++++-
 arch/powerpc/kvm/book3s_64_mmu_host.c |   12 ++++++++++++
 5 files changed, 76 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 2f55aa5..6ce7fa1 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -551,6 +551,13 @@ mmio:
 
 static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
+	ulong mp_pa = vcpu->arch.magic_page_pa;
+
+	if (unlikely(mp_pa) &&
+	    unlikely((mp_pa & KVM_RMO) >> PAGE_SHIFT == gfn)) {
+		return 1;
+	}
+
 	return kvm_is_visible_gfn(vcpu->kvm, gfn);
 }
 
diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index 41130c8..d2bd1a6 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -281,8 +281,24 @@ static int kvmppc_mmu_book3s_32_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
 				      struct kvmppc_pte *pte, bool data)
 {
 	int r;
+	ulong mp_ea = vcpu->arch.magic_page_ea;
 
 	pte->eaddr = eaddr;
+
+	/* Magic page override */
+	if (unlikely(mp_ea) &&
+	    unlikely((eaddr & ~0xfffULL) == (mp_ea & ~0xfffULL)) &&
+	    !(vcpu->arch.shared->msr & MSR_PR)) {
+		pte->vpage = kvmppc_mmu_book3s_32_ea_to_vp(vcpu, eaddr, data);
+		pte->raddr = vcpu->arch.magic_page_pa | (pte->raddr & 0xfff);
+		pte->raddr &= KVM_RMO;
+		pte->may_execute = true;
+		pte->may_read = true;
+		pte->may_write = true;
+
+		return 0;
+	}
+
 	r = kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, pte, data);
 	if (r < 0)
 	       r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte, data, true);
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 67b8c38..658d3e0 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -145,6 +145,16 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
 	bool primary = false;
 	bool evict = false;
 	struct hpte_cache *pte;
+	ulong mp_pa = vcpu->arch.magic_page_pa;
+
+	/* Magic page override */
+	if (unlikely(mp_pa) &&
+	    unlikely((orig_pte->raddr & ~0xfffUL & KVM_RMO) ==
+		     (mp_pa & ~0xfffUL & KVM_RMO))) {
+		hpaddr = (pfn_t)virt_to_phys(vcpu->arch.shared);
+		get_page(pfn_to_page(hpaddr >> PAGE_SHIFT));
+		goto mapped;
+	}
 
 	/* Get host physical address for gpa */
 	hpaddr = gfn_to_pfn(vcpu->kvm, orig_pte->raddr >> PAGE_SHIFT);
@@ -155,6 +165,8 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
 	}
 	hpaddr <<= PAGE_SHIFT;
 
+mapped:
+
 	/* and write the mapping ea -> hpa into the pt */
 	vcpu->arch.mmu.esid_to_vsid(vcpu, orig_pte->eaddr >> SID_SHIFT, &vsid);
 	map = find_sid_vsid(vcpu, vsid);
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index 58aa840..4a2e5fc 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -163,6 +163,22 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
 	bool found = false;
 	bool perm_err = false;
 	int second = 0;
+	ulong mp_ea = vcpu->arch.magic_page_ea;
+
+	/* Magic page override */
+	if (unlikely(mp_ea) &&
+	    unlikely((eaddr & ~0xfffULL) == (mp_ea & ~0xfffULL)) &&
+	    !(vcpu->arch.shared->msr & MSR_PR)) {
+		gpte->eaddr = eaddr;
+		gpte->vpage = kvmppc_mmu_book3s_64_ea_to_vp(vcpu, eaddr, data);
+		gpte->raddr = vcpu->arch.magic_page_pa | (gpte->raddr & 0xfff);
+		gpte->raddr &= KVM_RMO;
+		gpte->may_execute = true;
+		gpte->may_read = true;
+		gpte->may_write = true;
+
+		return 0;
+	}
 
 	slbe = kvmppc_mmu_book3s_64_find_slbe(vcpu_book3s, eaddr);
 	if (!slbe)
@@ -445,6 +461,7 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid,
 	ulong ea = esid << SID_SHIFT;
 	struct kvmppc_slb *slb;
 	u64 gvsid = esid;
+	ulong mp_ea = vcpu->arch.magic_page_ea;
 
 	if (vcpu->arch.shared->msr & (MSR_DR|MSR_IR)) {
 		slb = kvmppc_mmu_book3s_64_find_slbe(to_book3s(vcpu), ea);
@@ -464,7 +481,7 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid,
 		break;
 	case MSR_DR|MSR_IR:
 		if (!slb)
-			return -ENOENT;
+			goto no_slb;
 
 		*vsid = gvsid;
 		break;
@@ -477,6 +494,17 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid,
 		*vsid |= VSID_PR;
 
 	return 0;
+
+no_slb:
+	/* Catch magic page case */
+	if (unlikely(mp_ea) &&
+	    unlikely(esid == (mp_ea >> SID_SHIFT)) &&
+	    !(vcpu->arch.shared->msr & MSR_PR)) {
+		*vsid = VSID_REAL | esid;
+		return 0;
+	}
+
+	return -EINVAL;
 }
 
 static bool kvmppc_mmu_book3s_64_is_dcbz32(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 71c1f90..d880c88 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -99,6 +99,16 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
 	int vflags = 0;
 	int attempt = 0;
 	struct kvmppc_sid_map *map;
+	ulong mp_pa = vcpu->arch.magic_page_pa;
+
+	/* Magic page override */
+	if (unlikely(mp_pa) &&
+	    unlikely((orig_pte->raddr & ~0xfffULL & KVM_RMO) ==
+		     (mp_pa & ~0xfffULL & KVM_RMO))) {
+		hpaddr = (pfn_t)virt_to_phys(vcpu->arch.shared);
+		get_page(pfn_to_page(hpaddr >> PAGE_SHIFT));
+		goto mapped;
+	}
 
 	/* Get host physical address for gpa */
 	hpaddr = gfn_to_pfn(vcpu->kvm, orig_pte->raddr >> PAGE_SHIFT);
@@ -114,6 +124,8 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
 #error Unknown page size
 #endif
 
+mapped:
+
 	/* and write the mapping ea -> hpa into the pt */
 	vcpu->arch.mmu.esid_to_vsid(vcpu, orig_pte->eaddr >> SID_SHIFT, &vsid);
 	map = find_sid_vsid(vcpu, vsid);
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 06/26] KVM: PPC: Convert SPRG[0-4] to shared page
From: Alexander Graf @ 2010-06-25 23:24 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

When in kernel mode there are 4 additional registers available that are
simple data storage. Instead of exiting to the hypervisor to read and
write those, we can just share them with the guest using the page.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h |    4 ----
 arch/powerpc/include/asm/kvm_para.h |    4 ++++
 arch/powerpc/kvm/book3s.c           |   16 ++++++++--------
 arch/powerpc/kvm/booke.c            |   16 ++++++++--------
 arch/powerpc/kvm/emulate.c          |   24 ++++++++++++++++--------
 5 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 6bcf62f..83c45ea 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -216,10 +216,6 @@ struct kvm_vcpu_arch {
 	ulong guest_owned_ext;
 #endif
 	u32 mmucr;
-	ulong sprg0;
-	ulong sprg1;
-	ulong sprg2;
-	ulong sprg3;
 	ulong sprg4;
 	ulong sprg5;
 	ulong sprg6;
diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index d7fc6c2..e402999 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,10 @@
 #include <linux/types.h>
 
 struct kvm_vcpu_arch_shared {
+	__u64 sprg0;
+	__u64 sprg1;
+	__u64 sprg2;
+	__u64 sprg3;
 	__u64 srr0;
 	__u64 srr1;
 	__u64 dar;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index b144697..5a6f055 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -1062,10 +1062,10 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 	regs->srr0 = vcpu->arch.shared->srr0;
 	regs->srr1 = vcpu->arch.shared->srr1;
 	regs->pid = vcpu->arch.pid;
-	regs->sprg0 = vcpu->arch.sprg0;
-	regs->sprg1 = vcpu->arch.sprg1;
-	regs->sprg2 = vcpu->arch.sprg2;
-	regs->sprg3 = vcpu->arch.sprg3;
+	regs->sprg0 = vcpu->arch.shared->sprg0;
+	regs->sprg1 = vcpu->arch.shared->sprg1;
+	regs->sprg2 = vcpu->arch.shared->sprg2;
+	regs->sprg3 = vcpu->arch.shared->sprg3;
 	regs->sprg5 = vcpu->arch.sprg4;
 	regs->sprg6 = vcpu->arch.sprg5;
 	regs->sprg7 = vcpu->arch.sprg6;
@@ -1088,10 +1088,10 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 	kvmppc_set_msr(vcpu, regs->msr);
 	vcpu->arch.shared->srr0 = regs->srr0;
 	vcpu->arch.shared->srr1 = regs->srr1;
-	vcpu->arch.sprg0 = regs->sprg0;
-	vcpu->arch.sprg1 = regs->sprg1;
-	vcpu->arch.sprg2 = regs->sprg2;
-	vcpu->arch.sprg3 = regs->sprg3;
+	vcpu->arch.shared->sprg0 = regs->sprg0;
+	vcpu->arch.shared->sprg1 = regs->sprg1;
+	vcpu->arch.shared->sprg2 = regs->sprg2;
+	vcpu->arch.shared->sprg3 = regs->sprg3;
 	vcpu->arch.sprg5 = regs->sprg4;
 	vcpu->arch.sprg6 = regs->sprg5;
 	vcpu->arch.sprg7 = regs->sprg6;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 8b546fe..984c461 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -495,10 +495,10 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 	regs->srr0 = vcpu->arch.shared->srr0;
 	regs->srr1 = vcpu->arch.shared->srr1;
 	regs->pid = vcpu->arch.pid;
-	regs->sprg0 = vcpu->arch.sprg0;
-	regs->sprg1 = vcpu->arch.sprg1;
-	regs->sprg2 = vcpu->arch.sprg2;
-	regs->sprg3 = vcpu->arch.sprg3;
+	regs->sprg0 = vcpu->arch.shared->sprg0;
+	regs->sprg1 = vcpu->arch.shared->sprg1;
+	regs->sprg2 = vcpu->arch.shared->sprg2;
+	regs->sprg3 = vcpu->arch.shared->sprg3;
 	regs->sprg5 = vcpu->arch.sprg4;
 	regs->sprg6 = vcpu->arch.sprg5;
 	regs->sprg7 = vcpu->arch.sprg6;
@@ -521,10 +521,10 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 	kvmppc_set_msr(vcpu, regs->msr);
 	vcpu->arch.shared->srr0 = regs->srr0;
 	vcpu->arch.shared->srr1 = regs->srr1;
-	vcpu->arch.sprg0 = regs->sprg0;
-	vcpu->arch.sprg1 = regs->sprg1;
-	vcpu->arch.sprg2 = regs->sprg2;
-	vcpu->arch.sprg3 = regs->sprg3;
+	vcpu->arch.shared->sprg0 = regs->sprg0;
+	vcpu->arch.shared->sprg1 = regs->sprg1;
+	vcpu->arch.shared->sprg2 = regs->sprg2;
+	vcpu->arch.shared->sprg3 = regs->sprg3;
 	vcpu->arch.sprg5 = regs->sprg4;
 	vcpu->arch.sprg6 = regs->sprg5;
 	vcpu->arch.sprg7 = regs->sprg6;
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index ad0fa4f..454869b 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -263,13 +263,17 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 				kvmppc_set_gpr(vcpu, rt, get_tb()); break;
 
 			case SPRN_SPRG0:
-				kvmppc_set_gpr(vcpu, rt, vcpu->arch.sprg0); break;
+				kvmppc_set_gpr(vcpu, rt, vcpu->arch.shared->sprg0);
+				break;
 			case SPRN_SPRG1:
-				kvmppc_set_gpr(vcpu, rt, vcpu->arch.sprg1); break;
+				kvmppc_set_gpr(vcpu, rt, vcpu->arch.shared->sprg1);
+				break;
 			case SPRN_SPRG2:
-				kvmppc_set_gpr(vcpu, rt, vcpu->arch.sprg2); break;
+				kvmppc_set_gpr(vcpu, rt, vcpu->arch.shared->sprg2);
+				break;
 			case SPRN_SPRG3:
-				kvmppc_set_gpr(vcpu, rt, vcpu->arch.sprg3); break;
+				kvmppc_set_gpr(vcpu, rt, vcpu->arch.shared->sprg3);
+				break;
 			/* Note: SPRG4-7 are user-readable, so we don't get
 			 * a trap. */
 
@@ -341,13 +345,17 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 				break;
 
 			case SPRN_SPRG0:
-				vcpu->arch.sprg0 = kvmppc_get_gpr(vcpu, rs); break;
+				vcpu->arch.shared->sprg0 = kvmppc_get_gpr(vcpu, rs);
+				break;
 			case SPRN_SPRG1:
-				vcpu->arch.sprg1 = kvmppc_get_gpr(vcpu, rs); break;
+				vcpu->arch.shared->sprg1 = kvmppc_get_gpr(vcpu, rs);
+				break;
 			case SPRN_SPRG2:
-				vcpu->arch.sprg2 = kvmppc_get_gpr(vcpu, rs); break;
+				vcpu->arch.shared->sprg2 = kvmppc_get_gpr(vcpu, rs);
+				break;
 			case SPRN_SPRG3:
-				vcpu->arch.sprg3 = kvmppc_get_gpr(vcpu, rs); break;
+				vcpu->arch.shared->sprg3 = kvmppc_get_gpr(vcpu, rs);
+				break;
 
 			default:
 				emulated = kvmppc_core_emulate_mtspr(vcpu, sprn, rs);
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 21/26] KVM: PPC: Introduce kvm_tmp framework
From: Alexander Graf @ 2010-06-25 23:25 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

We will soon require more sophisticated methods to replace single instructions
with multiple instructions. We do that by branching to a memory region where we
write replacement code for the instruction to.

This region needs to be within 32 MB of the patched instruction though, because
that's the furthest we can jump with immediate branches.

So we keep 1MB of free space around in bss. After we're done initing we can just
tell the mm system that the unused pages are free, but until then we have enough
space to fit all our code in.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c |   41 +++++++++++++++++++++++++++++++++++++++--
 1 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index b091f94..7e8fe6f 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -64,6 +64,8 @@
 #define KVM_INST_TLBSYNC	0x7c00046c
 
 static bool kvm_patching_worked = true;
+static char kvm_tmp[1024 * 1024];
+static int kvm_tmp_index;
 
 static void kvm_patch_ins_ld(u32 *inst, long addr, u32 rt)
 {
@@ -98,6 +100,23 @@ static void kvm_patch_ins_nop(u32 *inst)
 	*inst = KVM_INST_NOP;
 }
 
+static u32 *kvm_alloc(int len)
+{
+	u32 *p;
+
+	if ((kvm_tmp_index + len) > ARRAY_SIZE(kvm_tmp)) {
+		printk(KERN_ERR "KVM: No more space (%d + %d)\n",
+				kvm_tmp_index, len);
+		kvm_patching_worked = false;
+		return NULL;
+	}
+
+	p = (void*)&kvm_tmp[kvm_tmp_index];
+	kvm_tmp_index += len;
+
+	return p;
+}
+
 static void kvm_map_magic_page(void *data)
 {
 	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -197,12 +216,27 @@ static void kvm_use_magic_page(void)
 		kvm_check_ins(p);
 }
 
+static void kvm_free_tmp(void)
+{
+	unsigned long start, end;
+
+	start = (ulong)&kvm_tmp[kvm_tmp_index + (PAGE_SIZE - 1)] & PAGE_MASK;
+	end = (ulong)&kvm_tmp[ARRAY_SIZE(kvm_tmp)] & PAGE_MASK;
+
+	/* Free the tmp space we don't need */
+	for (; start < end; start += PAGE_SIZE) {
+		ClearPageReserved(virt_to_page(start));
+		init_page_count(virt_to_page(start));
+		free_page(start);
+		totalram_pages++;
+	}
+}
+
 static int __init kvm_guest_init(void)
 {
-	char *p;
 
 	if (!kvm_para_available())
-		return 0;
+		goto free_tmp;
 
 	if (kvm_para_has_feature(KVM_FEATURE_MAGIC_PAGE))
 		kvm_use_magic_page();
@@ -210,6 +244,9 @@ static int __init kvm_guest_init(void)
 	printk(KERN_INFO "KVM: Live patching for a fast VM %s\n",
 			 kvm_patching_worked ? "worked" : "failed");
 
+free_tmp:
+	kvm_free_tmp();
+
 	return 0;
 }
 
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 01/26] KVM: PPC: Introduce shared page
From: Alexander Graf @ 2010-06-25 23:24 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

For transparent variable sharing between the hypervisor and guest, I introduce
a shared page. This shared page will contain all the registers the guest can
read and write safely without exiting guest context.

This patch only implements the stubs required for the basic structure of the
shared page. The actual register moving follows.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h |    2 ++
 arch/powerpc/include/asm/kvm_para.h |    5 +++++
 arch/powerpc/kernel/asm-offsets.c   |    1 +
 arch/powerpc/kvm/44x.c              |    7 +++++++
 arch/powerpc/kvm/book3s.c           |    7 +++++++
 arch/powerpc/kvm/e500.c             |    7 +++++++
 6 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 895eb63..bca9391 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <linux/interrupt.h>
 #include <linux/types.h>
 #include <linux/kvm_types.h>
+#include <linux/kvm_para.h>
 #include <asm/kvm_asm.h>
 
 #define KVM_MAX_VCPUS 1
@@ -289,6 +290,7 @@ struct kvm_vcpu_arch {
 	struct tasklet_struct tasklet;
 	u64 dec_jiffies;
 	unsigned long pending_exceptions;
+	struct kvm_vcpu_arch_shared *shared;
 
 #ifdef CONFIG_PPC_BOOK3S
 	struct kmem_cache *hpte_cache;
diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index 2d48f6a..1485ba8 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -20,6 +20,11 @@
 #ifndef __POWERPC_KVM_PARA_H__
 #define __POWERPC_KVM_PARA_H__
 
+#include <linux/types.h>
+
+struct kvm_vcpu_arch_shared {
+};
+
 #ifdef __KERNEL__
 
 static inline int kvm_para_available(void)
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 496cc5b..944f593 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -400,6 +400,7 @@ int main(void)
 	DEFINE(VCPU_SPRG6, offsetof(struct kvm_vcpu, arch.sprg6));
 	DEFINE(VCPU_SPRG7, offsetof(struct kvm_vcpu, arch.sprg7));
 	DEFINE(VCPU_SHADOW_PID, offsetof(struct kvm_vcpu, arch.shadow_pid));
+	DEFINE(VCPU_SHARED, offsetof(struct kvm_vcpu, arch.shared));
 
 	/* book3s */
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c
index 73c0a3f..e7b1f3f 100644
--- a/arch/powerpc/kvm/44x.c
+++ b/arch/powerpc/kvm/44x.c
@@ -123,8 +123,14 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_vcpu;
 
+	vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
+	if (!vcpu->arch.shared)
+		goto uninit_vcpu;
+
 	return vcpu;
 
+uninit_vcpu:
+	kvm_vcpu_uninit(vcpu);
 free_vcpu:
 	kmem_cache_free(kvm_vcpu_cache, vcpu_44x);
 out:
@@ -135,6 +141,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu);
 
+	free_page((unsigned long)vcpu->arch.shared);
 	kvm_vcpu_uninit(vcpu);
 	kmem_cache_free(kvm_vcpu_cache, vcpu_44x);
 }
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 884d4a5..ba79b35 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -1247,6 +1247,10 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_shadow_vcpu;
 
+	vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
+	if (!vcpu->arch.shared)
+		goto uninit_vcpu;
+
 	vcpu->arch.host_retip = kvm_return_point;
 	vcpu->arch.host_msr = mfmsr();
 #ifdef CONFIG_PPC_BOOK3S_64
@@ -1277,6 +1281,8 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
 
 	return vcpu;
 
+uninit_vcpu:
+	kvm_vcpu_uninit(vcpu);
 free_shadow_vcpu:
 	kfree(vcpu_book3s->shadow_vcpu);
 free_vcpu:
@@ -1289,6 +1295,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
 
+	free_page((unsigned long)vcpu->arch.shared);
 	kvm_vcpu_uninit(vcpu);
 	kfree(vcpu_book3s->shadow_vcpu);
 	vfree(vcpu_book3s);
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index e8a00b0..71750f2 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -117,8 +117,14 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto uninit_vcpu;
 
+	vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
+	if (!vcpu->arch.shared)
+		goto uninit_tlb;
+
 	return vcpu;
 
+uninit_tlb:
+	kvmppc_e500_tlb_uninit(vcpu_e500);
 uninit_vcpu:
 	kvm_vcpu_uninit(vcpu);
 free_vcpu:
@@ -131,6 +137,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
 
+	free_page((unsigned long)vcpu->arch.shared);
 	kvmppc_e500_tlb_uninit(vcpu_e500);
 	kvm_vcpu_uninit(vcpu);
 	kmem_cache_free(kvm_vcpu_cache, vcpu_e500);
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 09/26] KVM: PPC: Add PV guest scratch registers
From: Alexander Graf @ 2010-06-25 23:24 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1277508314-915-1-git-send-email-agraf@suse.de>

While running in hooked code we need to store register contents out because
we must not clobber any registers.

So let's add some fields to the shared page we can just happily write to.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_para.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index d1fe9ae..edf8f83 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,9 @@
 #include <linux/types.h>
 
 struct kvm_vcpu_arch_shared {
+	__u64 scratch1;
+	__u64 scratch2;
+	__u64 scratch3;
 	__u64 critical;		/* Guest may not get interrupts if == r1 */
 	__u64 sprg0;
 	__u64 sprg1;
-- 
1.6.0.2

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox