* Re: [PATCH] mpc52xx-psc-spi: refactor probe and remove to make use of of_register_spi_devices()
From: Stephen Rothwell @ 2009-10-31 5:32 UTC (permalink / raw)
To: Wolfram Sang
Cc: Kári Davíðsson, spi-devel-general, linuxppc-dev
In-Reply-To: <1256931852-13255-1-git-send-email-w.sang@pengutronix.de>
[-- Attachment #1: Type: text/plain, Size: 901 bytes --]
Hi Wolfram,
On Fri, 30 Oct 2009 20:44:12 +0100 Wolfram Sang <w.sang@pengutronix.de> wrote:
>
> static struct of_platform_driver mpc52xx_psc_spi_of_driver = {
> .owner = THIS_MODULE,
> - .name = "mpc52xx-psc-spi",
> + .name = DRIVER_NAME,
You no longer need to set either owner or name in the of_platform driver,
just in the included struct driver (as is done below), so you could just
remove the above two lines.
> .match_table = mpc52xx_psc_spi_of_match,
> .probe = mpc52xx_psc_spi_of_probe,
> .remove = __exit_p(mpc52xx_psc_spi_of_remove),
> .driver = {
> - .name = "mpc52xx-psc-spi",
> + .name = DRIVER_NAME,
> .owner = THIS_MODULE,
> },
> };
I am hoping that we can remove the owner and name fields from struct
of_platform_driver sometime.
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply
* Re: [PATCH 20/27] Split init_new_context and destroy_context
From: Stephen Rothwell @ 2009-10-31 4:40 UTC (permalink / raw)
To: Alexander Graf
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, kvm, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-21-git-send-email-agraf@suse.de>
[-- Attachment #1: Type: text/plain, Size: 668 bytes --]
Hi Alexander,
On Fri, 30 Oct 2009 16:47:20 +0100 Alexander Graf <agraf@suse.de> wrote:
>
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
> extern void set_context(unsigned long id, pgd_t *pgd);
>
> #ifdef CONFIG_PPC_BOOK3S_64
> +extern int __init_new_context(void);
> +extern void __destroy_context(int context_id);
> +#endif
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
don't add the #endif/#ifdef pair ...
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply
* Re: [PATCH 19/27] Export symbols for KVM module
From: Stephen Rothwell @ 2009-10-31 4:37 UTC (permalink / raw)
To: Alexander Graf
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, kvm, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-20-git-send-email-agraf@suse.de>
[-- Attachment #1: Type: text/plain, Size: 838 bytes --]
Hi Alexander,
On Fri, 30 Oct 2009 16:47:19 +0100 Alexander Graf <agraf@suse.de> wrote:
>
> diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
> index c8b27bb..baf778c 100644
> --- a/arch/powerpc/kernel/ppc_ksyms.c
> +++ b/arch/powerpc/kernel/ppc_ksyms.c
> @@ -163,11 +163,12 @@ EXPORT_SYMBOL(screen_info);
> #ifdef CONFIG_PPC32
> EXPORT_SYMBOL(timer_interrupt);
> EXPORT_SYMBOL(irq_desc);
> -EXPORT_SYMBOL(tb_ticks_per_jiffy);
> EXPORT_SYMBOL(cacheable_memcpy);
> EXPORT_SYMBOL(cacheable_memzero);
> #endif
>
> +EXPORT_SYMBOL(tb_ticks_per_jiffy);
Since you are moving this anyway, how about moving it into
arch/powerpc/kernel/time.c where tb_ticks_per_jiffy is defined.
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply
* Filtering bits in set_pte_at()
From: Benjamin Herrenschmidt @ 2009-10-31 2:44 UTC (permalink / raw)
To: linux-mm@kvack.org
Cc: Hugh Dickins, linuxppc-dev, linux-kernel@vger.kernel.org
Hi folks !
So I have a little problem on powerpc ... :-)
Due to the way I'm attempting to do my I$/D$ coherency on embedded
processors, I basically need to "filter out" _PAGE_EXEC in set_pte_at()
if the page isn't clean (PG_arch_1) and the set_pte_at() isn't caused by
an exec fault. etc...
The problem with that approach (current upstream) is that the generic
code tends not to read back the PTE, and thus still carries around a PTE
value that doesn't match what was actually written.
For example, we end up with update_mmu_cache() called with an "entry"
argument that has _PAGE_EXEC set while we really didn't write it into
the page tables. This will be problematic when we finally add preloading
directly into the TLB on those processors. There's at least one other
fishy case where huetlbfs would carry the PTE value around and later do
the wrong thing because pte_same() with the loaded one failed.
What do you suggest we do here ? Among the options at hand:
- Ugly but would probably "just work" with the last amount of changes:
we could make set_pte_at() be a macro on powerpc that modifies it's PTE
value argument :-) (I -did- warn it was ugly !)
- Another one slightly less bad that would require more work but mostly
mechanical arch header updates would be to make set_pte_at() return the
new value of the PTE, and thus change the callsites to something like:
entry = set_pte_at(mm, addr, ptep, entry)
- Any other idea ? We could use another PTE bit (_PAGE_HWEXEC), in
fact, we used to, but we are really short on PTE bits nowadays and I
freed that one up to get _PAGE_SPECIAL... _PAGE_EXEC is trivial to
"recover" from ptep_set_access_flags() on an exec fault or from the VM
prot.
Cheers,
Ben.
^ permalink raw reply
* Re: mpc512x/clock: fix clk_get logic
From: Benjamin Herrenschmidt @ 2009-10-30 21:54 UTC (permalink / raw)
To: Wolfram Sang; +Cc: linuxppc-dev, Wolfgang Denk
In-Reply-To: <1256894234-11264-1-git-send-email-w.sang@pengutronix.de>
On Fri, 2009-10-30 at 10:17 +0100, Wolfram Sang wrote:
> The matching logic returns a clock even if only the dev-part matches. This is
> wrong as devices may utilize more than one clock, so the wrong clock may be
> returned due to dev being not unique (noticed while working on the CAN driver).
> The proposed new method will:
>
> - require the id field (as _this_ is the unique identifier)
> - dev need not be given; if NULL, it will match any device.
> if given, it has to match the dev of the clock
> - using the above rules, both fields need to match in order to claim the clock
Have you considered switching to my proposed device-tree based clock
reprentation ?
Cheers,
Ben.
> Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
> Cc: Wolfgang Denk <wd@denx.de>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> ---
> arch/powerpc/platforms/512x/clock.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> Index: .kernel/arch/powerpc/platforms/512x/clock.c
> ===================================================================
> --- .kernel.orig/arch/powerpc/platforms/512x/clock.c
> +++ .kernel/arch/powerpc/platforms/512x/clock.c
> @@ -53,19 +53,21 @@ static DEFINE_MUTEX(clocks_mutex);
> static struct clk *mpc5121_clk_get(struct device *dev, const char *id)
> {
> struct clk *p, *clk = ERR_PTR(-ENOENT);
> - int dev_match = 0;
> - int id_match = 0;
> + bool id_match = false;
> + /* Match any device if no dev given */
> + bool dev_match = !dev;
>
> - if (dev == NULL || id == NULL)
> + /* We need the unique identifier */
> + if (id == NULL)
> return NULL;
>
> mutex_lock(&clocks_mutex);
> list_for_each_entry(p, &clocks, node) {
> if (dev == p->dev)
> - dev_match++;
> + dev_match = true;
> if (strcmp(id, p->name) == 0)
> - id_match++;
> - if ((dev_match || id_match) && try_module_get(p->owner)) {
> + id_match = true;
> + if (dev_match && id_match && try_module_get(p->owner)) {
> clk = p;
> break;
> }
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply
* [PATCH] mpc52xx-psc-spi: refactor probe and remove to make use of of_register_spi_devices()
From: Wolfram Sang @ 2009-10-30 19:44 UTC (permalink / raw)
To: spi-devel-general; +Cc: Kári Davíðsson, linuxppc-dev
This driver has a generic part for the probe/remove routines which probably
used to be called from a platform driver and an of_platform driver. Meanwhile,
the driver is of_platform only, so there is no need for the generic part
anymore. Unifying probe/remove finally enables us to call
of_register_spi_devices(), so spi-device nodes from the device tree will be
parsed. This was also mentioned by:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-June/073273.html
On the way, some patterns have been changed to current best practices (like
using '!ptr' and 'struct resource'). Also, an older, yet uncommented, patch
from me has been incorporated to improve the checks if the selected PSC is
valid:
http://patchwork.ozlabs.org/patch/36780/
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Kári Davíðsson <kari.davidsson@marel.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
---
drivers/spi/mpc52xx_psc_spi.c | 110 +++++++++++++++++++---------------------
1 files changed, 52 insertions(+), 58 deletions(-)
diff --git a/drivers/spi/mpc52xx_psc_spi.c b/drivers/spi/mpc52xx_psc_spi.c
index 1b74d5c..3fa8c78 100644
--- a/drivers/spi/mpc52xx_psc_spi.c
+++ b/drivers/spi/mpc52xx_psc_spi.c
@@ -17,6 +17,7 @@
#include <linux/errno.h>
#include <linux/interrupt.h>
#include <linux/of_platform.h>
+#include <linux/of_spi.h>
#include <linux/workqueue.h>
#include <linux/completion.h>
#include <linux/io.h>
@@ -27,6 +28,7 @@
#include <asm/mpc52xx.h>
#include <asm/mpc52xx_psc.h>
+#define DRIVER_NAME "mpc52xx-psc-spi"
#define MCLK 20000000 /* PSC port MClk in hz */
struct mpc52xx_psc_spi {
@@ -358,32 +360,54 @@ static irqreturn_t mpc52xx_psc_spi_isr(int irq, void *dev_id)
return IRQ_NONE;
}
-/* bus_num is used only for the case dev->platform_data == NULL */
-static int __init mpc52xx_psc_spi_do_probe(struct device *dev, u32 regaddr,
- u32 size, unsigned int irq, s16 bus_num)
+static int __init mpc52xx_psc_spi_of_probe(struct of_device *op,
+ const struct of_device_id *match)
{
- struct fsl_spi_platform_data *pdata = dev->platform_data;
+ struct fsl_spi_platform_data *pdata = op->dev.platform_data;
struct mpc52xx_psc_spi *mps;
struct spi_master *master;
+ struct resource mem;
+ s16 id = -1;
int ret;
- master = spi_alloc_master(dev, sizeof *mps);
- if (master == NULL)
+ /* get PSC id (only 1,2,3,6 can do SPI) */
+ if (!pdata) {
+ const u32 *psc_nump;
+
+ psc_nump = of_get_property(op->node, "cell-index", NULL);
+ if (!psc_nump || (*psc_nump > 2 && *psc_nump != 5)) {
+ dev_err(&op->dev, "PSC can't do SPI\n");
+ return -EINVAL;
+ }
+ id = *psc_nump + 1;
+ }
+
+ ret = of_address_to_resource(op->node, 0, &mem);
+ if (ret)
+ return ret;
+
+ master = spi_alloc_master(&op->dev, sizeof *mps);
+ if (!master)
return -ENOMEM;
- dev_set_drvdata(dev, master);
+ if (!request_mem_region(mem.start, resource_size(&mem), DRIVER_NAME)) {
+ ret = -ENOMEM;
+ goto free_master;
+ }
+
+ dev_set_drvdata(&op->dev, master);
mps = spi_master_get_devdata(master);
/* the spi->mode bits understood by this driver: */
master->mode_bits = SPI_CPOL | SPI_CPHA | SPI_CS_HIGH | SPI_LSB_FIRST;
- mps->irq = irq;
- if (pdata == NULL) {
- dev_warn(dev, "probe called without platform data, no "
+ mps->irq = irq_of_parse_and_map(op->node, 0);
+ if (!pdata) {
+ dev_info(&op->dev, "probe called without platform data, no "
"cs_control function will be called\n");
mps->cs_control = NULL;
mps->sysclk = 0;
- master->bus_num = bus_num;
+ master->bus_num = id;
master->num_chipselect = 255;
} else {
mps->cs_control = pdata->cs_control;
@@ -395,19 +419,18 @@ static int __init mpc52xx_psc_spi_do_probe(struct device *dev, u32 regaddr,
master->transfer = mpc52xx_psc_spi_transfer;
master->cleanup = mpc52xx_psc_spi_cleanup;
- mps->psc = ioremap(regaddr, size);
+ mps->psc = ioremap(mem.start, resource_size(&mem));
if (!mps->psc) {
- dev_err(dev, "could not ioremap I/O port range\n");
+ dev_err(&op->dev, "could not ioremap I/O port range\n");
ret = -EFAULT;
- goto free_master;
+ goto free_unmap;
}
/* On the 5200, fifo regs are immediately ajacent to the psc regs */
mps->fifo = ((void __iomem *)mps->psc) + sizeof(struct mpc52xx_psc);
- ret = request_irq(mps->irq, mpc52xx_psc_spi_isr, 0, "mpc52xx-psc-spi",
- mps);
+ ret = request_irq(mps->irq, mpc52xx_psc_spi_isr, 0, DRIVER_NAME, mps);
if (ret)
- goto free_master;
+ goto free_unmap;
ret = mpc52xx_psc_spi_port_config(master->bus_num, mps);
if (ret < 0)
@@ -429,24 +452,29 @@ static int __init mpc52xx_psc_spi_do_probe(struct device *dev, u32 regaddr,
if (ret < 0)
goto unreg_master;
+ of_register_spi_devices(master, op->node);
+
return ret;
unreg_master:
destroy_workqueue(mps->workqueue);
free_irq:
free_irq(mps->irq, mps);
-free_master:
+free_unmap:
if (mps->psc)
iounmap(mps->psc);
+ release_mem_region(mem.start, resource_size(&mem));
+free_master:
spi_master_put(master);
return ret;
}
-static int __exit mpc52xx_psc_spi_do_remove(struct device *dev)
+static int __exit mpc52xx_psc_spi_of_remove(struct of_device *op)
{
- struct spi_master *master = dev_get_drvdata(dev);
+ struct spi_master *master = dev_get_drvdata(&op->dev);
struct mpc52xx_psc_spi *mps = spi_master_get_devdata(master);
+ struct resource mem;
flush_workqueue(mps->workqueue);
destroy_workqueue(mps->workqueue);
@@ -454,46 +482,12 @@ static int __exit mpc52xx_psc_spi_do_remove(struct device *dev)
free_irq(mps->irq, mps);
if (mps->psc)
iounmap(mps->psc);
+ if (of_address_to_resource(op->node, 0, &mem) == 0)
+ release_mem_region(mem.start, resource_size(&mem));
return 0;
}
-static int __init mpc52xx_psc_spi_of_probe(struct of_device *op,
- const struct of_device_id *match)
-{
- const u32 *regaddr_p;
- u64 regaddr64, size64;
- s16 id = -1;
-
- regaddr_p = of_get_address(op->node, 0, &size64, NULL);
- if (!regaddr_p) {
- printk(KERN_ERR "Invalid PSC address\n");
- return -EINVAL;
- }
- regaddr64 = of_translate_address(op->node, regaddr_p);
-
- /* get PSC id (1..6, used by port_config) */
- if (op->dev.platform_data == NULL) {
- const u32 *psc_nump;
-
- psc_nump = of_get_property(op->node, "cell-index", NULL);
- if (!psc_nump || *psc_nump > 5) {
- printk(KERN_ERR "mpc52xx_psc_spi: Device node %s has invalid "
- "cell-index property\n", op->node->full_name);
- return -EINVAL;
- }
- id = *psc_nump + 1;
- }
-
- return mpc52xx_psc_spi_do_probe(&op->dev, (u32)regaddr64, (u32)size64,
- irq_of_parse_and_map(op->node, 0), id);
-}
-
-static int __exit mpc52xx_psc_spi_of_remove(struct of_device *op)
-{
- return mpc52xx_psc_spi_do_remove(&op->dev);
-}
-
static struct of_device_id mpc52xx_psc_spi_of_match[] = {
{ .compatible = "fsl,mpc5200-psc-spi", },
{ .compatible = "mpc5200-psc-spi", }, /* old */
@@ -504,12 +498,12 @@ MODULE_DEVICE_TABLE(of, mpc52xx_psc_spi_of_match);
static struct of_platform_driver mpc52xx_psc_spi_of_driver = {
.owner = THIS_MODULE,
- .name = "mpc52xx-psc-spi",
+ .name = DRIVER_NAME,
.match_table = mpc52xx_psc_spi_of_match,
.probe = mpc52xx_psc_spi_of_probe,
.remove = __exit_p(mpc52xx_psc_spi_of_remove),
.driver = {
- .name = "mpc52xx-psc-spi",
+ .name = DRIVER_NAME,
.owner = THIS_MODULE,
},
};
--
1.6.3.3
^ permalink raw reply related
* [PATCH V2] mpc512x/clock: fix clk_get logic
From: Wolfram Sang @ 2009-10-30 17:53 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Roel Kluin, Wolfgang Denk, Mark Brown
In-Reply-To: <20091030105414.GD717@sirena.org.uk>
The current matching logic returns a clock even if only one out of two
arguments matches. This is wrong as devices may utilize more than one clock, so
only the first clock out of those is returned if the dev-match alone is
considered sufficent (noticed while working on the CAN driver). The proposed
new method will:
- return -EINVAL if both arguments are NULL
- skip the relevant check if one argument is NULL (first match wins)
- otherwise both arguments need to match
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Mark Brown <broonie@sirena.org.uk>
Cc: Roel Kluin <roel.kluin@gmail.com>
Cc: Wolfgang Denk <wd@denx.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
---
After Mark's valid comment, I'll try harder ;)
arch/powerpc/platforms/512x/clock.c | 18 ++++++++++--------
1 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/platforms/512x/clock.c b/arch/powerpc/platforms/512x/clock.c
index 84544d0..4168457 100644
--- a/arch/powerpc/platforms/512x/clock.c
+++ b/arch/powerpc/platforms/512x/clock.c
@@ -53,19 +53,21 @@ static DEFINE_MUTEX(clocks_mutex);
static struct clk *mpc5121_clk_get(struct device *dev, const char *id)
{
struct clk *p, *clk = ERR_PTR(-ENOENT);
- int dev_match = 0;
- int id_match = 0;
+ /* If one argument is not given, skip its match */
+ bool id_matched = !id;
+ bool dev_matched = !dev;
- if (dev == NULL || id == NULL)
- return NULL;
+ /* We need at least one argument */
+ if (!dev && !id)
+ return ERR_PTR(-EINVAL);
mutex_lock(&clocks_mutex);
list_for_each_entry(p, &clocks, node) {
if (dev == p->dev)
- dev_match++;
- if (strcmp(id, p->name) == 0)
- id_match++;
- if ((dev_match || id_match) && try_module_get(p->owner)) {
+ dev_matched = true;
+ if (id && strcmp(id, p->name) == 0)
+ id_matched = true;
+ if (dev_matched && id_matched && try_module_get(p->owner)) {
clk = p;
break;
}
--
1.6.3.3
^ permalink raw reply related
* Re: Accessing flash directly from User Space [SOLVED]
From: Gabriel Paubert @ 2009-10-30 17:49 UTC (permalink / raw)
To: Jonathan Haws
Cc: bgat@billgatliff.com, scottwood@freescale.com,
linuxppc-dev@lists.ozlabs.org, Alessandro Rubini
In-Reply-To: <BB99A6BA28709744BF22A68E6D7EB51F0330EE1316@midas.usurf.usu.edu>
On Fri, Oct 30, 2009 at 10:46:46AM -0600, Jonathan Haws wrote:
> > Michael Buesch wrote:
> > > Yes, I think the barrier is wrong.
> > > Please try with
> > >
> > > #define mb() __asm__ __volatile__("eieio\n sync\n" : : :
> > "memory")
> >
> >
> > For uncached memory, eieio should be enough.
>
> I tried eieio alone and it did not work. It needed the above command to take care of it all.
Surprising. But sync is a superset of eieio, no?
In this case a sync alone should be sufficient (although an eieio
followed by a sync should not be much slower than a sync alone).
Gabriel
^ permalink raw reply
* Re: [PATCH 0/8] Fix 8xx MMU/TLB
From: Scott Wood @ 2009-10-30 17:37 UTC (permalink / raw)
To: Joakim Tjernlund; +Cc: Rex Feany, linuxppc-dev@ozlabs.org
In-Reply-To: <20091030171607.GA781@loki.buserror.net>
On Fri, Oct 30, 2009 at 12:16:07PM -0500, Scott Wood wrote:
> On Sat, Oct 17, 2009 at 02:01:38PM +0200, Joakim Tjernlund wrote:
> > + mfspr r10, SPRN_SRR0
> > DO_8xx_CPU6(0x3780, r3)
> > mtspr SPRN_MD_EPN, r10
> > mfspr r11, SPRN_M_TWB /* Get level 1 table entry address */
> > - lwz r11, 0(r11) /* Get the level 1 entry */
> > + cmplwi cr0, r11, 0x0800
> > + blt- 3f /* Branch if user space */
> > + lis r11, swapper_pg_dir@h
> > + ori r11, r11, swapper_pg_dir@l
> > + rlwimi r11, r11, 0, 2, 19
>
> That rlwimi is a no-op -- I think you meant to use a different register
> here?
>
> > +3: lwz r11, 0(r11) /* Get the level 1 entry */
> > DO_8xx_CPU6(0x3b80, r3)
> > mtspr SPRN_MD_TWC, r11 /* Load pte table base address */
> > mfspr r11, SPRN_MD_TWC /* ....and get the pte address */
> > lwz r11, 0(r11) /* Get the pte */
> > /* concat physical page address(r11) and page offset(r10) */
> > rlwimi r11, r10, 0, 20, 31
>
> But r10 here contains SRR0 from above, and this is a data TLB error.
Never mind that last one, forgot that you'd be wanting to load the
instruction. :-P
But the rlwimi is what's causing the machine checks. I replaced it with:
rlwinm r11, r11, 0, 0x3ffff000
rlwimi r11, r10, 22, 0xffc
and things seem to work. You could probably replace the rlwinm by
subtracting PAGE_OFFSET from swapper_pg_dir instead.
-Scott
^ permalink raw reply
* Re: [PATCH 0/8] Fix 8xx MMU/TLB
From: Scott Wood @ 2009-10-30 17:16 UTC (permalink / raw)
To: Joakim Tjernlund; +Cc: Rex Feany, linuxppc-dev@ozlabs.org
In-Reply-To: <OFA43B9FDF.F43B1705-ONC1257652.0041B12F-C1257652.004211A3@transmode.se>
On Sat, Oct 17, 2009 at 02:01:38PM +0200, Joakim Tjernlund wrote:
> + mfspr r10, SPRN_SRR0
> DO_8xx_CPU6(0x3780, r3)
> mtspr SPRN_MD_EPN, r10
> mfspr r11, SPRN_M_TWB /* Get level 1 table entry address */
> - lwz r11, 0(r11) /* Get the level 1 entry */
> + cmplwi cr0, r11, 0x0800
> + blt- 3f /* Branch if user space */
> + lis r11, swapper_pg_dir@h
> + ori r11, r11, swapper_pg_dir@l
> + rlwimi r11, r11, 0, 2, 19
That rlwimi is a no-op -- I think you meant to use a different register
here?
> +3: lwz r11, 0(r11) /* Get the level 1 entry */
> DO_8xx_CPU6(0x3b80, r3)
> mtspr SPRN_MD_TWC, r11 /* Load pte table base address */
> mfspr r11, SPRN_MD_TWC /* ....and get the pte address */
> lwz r11, 0(r11) /* Get the pte */
> /* concat physical page address(r11) and page offset(r10) */
> rlwimi r11, r10, 0, 20, 31
But r10 here contains SRR0 from above, and this is a data TLB error.
-Scott
^ permalink raw reply
* RE: Accessing flash directly from User Space [SOLVED]
From: Jonathan Haws @ 2009-10-30 16:46 UTC (permalink / raw)
To: Micha Nelissen, Michael Buesch
Cc: scottwood@freescale.com, Alessandro Rubini, bgat@billgatliff.com,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4AEB0ABB.8070605@neli.hopto.org>
> Michael Buesch wrote:
> > Yes, I think the barrier is wrong.
> > Please try with
> >
> > #define mb() __asm__ __volatile__("eieio\n sync\n" : : :
> "memory")
>=20
>=20
> For uncached memory, eieio should be enough.
I tried eieio alone and it did not work. It needed the above command to ta=
ke care of it all.
^ permalink raw reply
* Re: Accessing flash directly from User Space [SOLVED]
From: Scott Wood @ 2009-10-30 15:57 UTC (permalink / raw)
To: Alessandro Rubini; +Cc: bgat, linuxppc-dev, Jonathan.Haws
In-Reply-To: <20091030150855.GA15046@mail.gnudd.com>
On Fri, Oct 30, 2009 at 04:08:55PM +0100, Alessandro Rubini wrote:
> > asm("eieio; sync");
>
> Hmm...
> : : : "memory"
>
> And, doesn't ";" start a comment in assembly? (no, not on powerpc it seems)
';' is an instruction separator on all GNU as targets that I'm familiar
with (though a quick manual search shows z80 as an exception).
-Scott
^ permalink raw reply
* Re: Accessing flash directly from User Space [SOLVED]
From: Micha Nelissen @ 2009-10-30 15:48 UTC (permalink / raw)
To: Michael Buesch
Cc: scottwood, Alessandro Rubini, bgat, linuxppc-dev, Jonathan.Haws
In-Reply-To: <200910301624.03989.mb@bu3sch.de>
Michael Buesch wrote:
> Yes, I think the barrier is wrong.
> Please try with
>
> #define mb() __asm__ __volatile__("eieio\n sync\n" : : : "memory")
For uncached memory, eieio should be enough.
Micha
^ permalink raw reply
* [PATCH 10/27] Add book3s.c
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
This adds the book3s core handling file. Here everything that is generic to
desktop PowerPC cores is handled, including interrupt injections, MSR settings,
etc.
It basically takes over the same role as booke.c for embedded PowerPCs.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v3 -> v4:
- use context_id instead of mm_alloc
v4 -> v5:
- make pvr 32 bits
v5 -> v6:
- use /* */ instead of //
- use kvm_for_each_vcpu
---
arch/powerpc/kvm/book3s.c | 925 +++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 925 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s.c
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
new file mode 100644
index 0000000..42037d4
--- /dev/null
+++ b/arch/powerpc/kvm/book3s.c
@@ -0,0 +1,925 @@
+/*
+ * Copyright (C) 2009. SUSE Linux Products GmbH. All rights reserved.
+ *
+ * Authors:
+ * Alexander Graf <agraf@suse.de>
+ * Kevin Wolf <mail@kevin-wolf.de>
+ *
+ * Description:
+ * This file is derived from arch/powerpc/kvm/44x.c,
+ * by Hollis Blanchard <hollisb@us.ibm.com>.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/err.h>
+
+#include <asm/reg.h>
+#include <asm/cputable.h>
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+#include <asm/uaccess.h>
+#include <asm/io.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+#include <asm/mmu_context.h>
+#include <linux/sched.h>
+#include <linux/vmalloc.h>
+
+#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+
+/* #define EXIT_DEBUG */
+/* #define EXIT_DEBUG_SIMPLE */
+
+/* Without AGGRESSIVE_DEC we only fire off a DEC interrupt when DEC turns 0.
+ * When set, we retrigger a DEC interrupt after that if DEC <= 0.
+ * PPC32 Linux runs faster without AGGRESSIVE_DEC, PPC64 Linux requires it. */
+
+/* #define AGGRESSIVE_DEC */
+
+struct kvm_stats_debugfs_item debugfs_entries[] = {
+ { "exits", VCPU_STAT(sum_exits) },
+ { "mmio", VCPU_STAT(mmio_exits) },
+ { "sig", VCPU_STAT(signal_exits) },
+ { "sysc", VCPU_STAT(syscall_exits) },
+ { "inst_emu", VCPU_STAT(emulated_inst_exits) },
+ { "dec", VCPU_STAT(dec_exits) },
+ { "ext_intr", VCPU_STAT(ext_intr_exits) },
+ { "queue_intr", VCPU_STAT(queue_intr) },
+ { "halt_wakeup", VCPU_STAT(halt_wakeup) },
+ { "pf_storage", VCPU_STAT(pf_storage) },
+ { "sp_storage", VCPU_STAT(sp_storage) },
+ { "pf_instruc", VCPU_STAT(pf_instruc) },
+ { "sp_instruc", VCPU_STAT(sp_instruc) },
+ { "ld", VCPU_STAT(ld) },
+ { "ld_slow", VCPU_STAT(ld_slow) },
+ { "st", VCPU_STAT(st) },
+ { "st_slow", VCPU_STAT(st_slow) },
+ { NULL }
+};
+
+void kvmppc_core_load_host_debugstate(struct kvm_vcpu *vcpu)
+{
+}
+
+void kvmppc_core_load_guest_debugstate(struct kvm_vcpu *vcpu)
+{
+}
+
+void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+{
+ memcpy(get_paca()->kvm_slb, to_book3s(vcpu)->slb_shadow, sizeof(get_paca()->kvm_slb));
+ get_paca()->kvm_slb_max = to_book3s(vcpu)->slb_shadow_max;
+}
+
+void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
+{
+ memcpy(to_book3s(vcpu)->slb_shadow, get_paca()->kvm_slb, sizeof(get_paca()->kvm_slb));
+ to_book3s(vcpu)->slb_shadow_max = get_paca()->kvm_slb_max;
+}
+
+#if defined(AGGRESSIVE_DEC) || defined(EXIT_DEBUG)
+static u32 kvmppc_get_dec(struct kvm_vcpu *vcpu)
+{
+ u64 jd = mftb() - vcpu->arch.dec_jiffies;
+ return vcpu->arch.dec - jd;
+}
+#endif
+
+void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 msr)
+{
+ ulong old_msr = vcpu->arch.msr;
+
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "KVM: Set MSR to 0x%llx\n", msr);
+#endif
+ msr &= to_book3s(vcpu)->msr_mask;
+ vcpu->arch.msr = msr;
+ vcpu->arch.shadow_msr = msr | MSR_USER32;
+ vcpu->arch.shadow_msr &= ( MSR_VEC | MSR_VSX | MSR_FP | MSR_FE0 |
+ MSR_USER64 | MSR_SE | MSR_BE | MSR_DE |
+ MSR_FE1);
+
+ if (msr & (MSR_WE|MSR_POW)) {
+ if (!vcpu->arch.pending_exceptions) {
+ kvm_vcpu_block(vcpu);
+ vcpu->stat.halt_wakeup++;
+ }
+ }
+
+ if (((vcpu->arch.msr & (MSR_IR|MSR_DR)) != (old_msr & (MSR_IR|MSR_DR))) ||
+ (vcpu->arch.msr & MSR_PR) != (old_msr & MSR_PR)) {
+ kvmppc_mmu_flush_segments(vcpu);
+ kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc);
+ }
+}
+
+void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags)
+{
+ vcpu->arch.srr0 = vcpu->arch.pc;
+ vcpu->arch.srr1 = vcpu->arch.msr | flags;
+ vcpu->arch.pc = to_book3s(vcpu)->hior + vec;
+ vcpu->arch.mmu.reset_msr(vcpu);
+}
+
+void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec)
+{
+ unsigned int prio;
+
+ vcpu->stat.queue_intr++;
+ switch (vec) {
+ case 0x100: prio = BOOK3S_IRQPRIO_SYSTEM_RESET; break;
+ case 0x200: prio = BOOK3S_IRQPRIO_MACHINE_CHECK; break;
+ case 0x300: prio = BOOK3S_IRQPRIO_DATA_STORAGE; break;
+ case 0x380: prio = BOOK3S_IRQPRIO_DATA_SEGMENT; break;
+ case 0x400: prio = BOOK3S_IRQPRIO_INST_STORAGE; break;
+ case 0x480: prio = BOOK3S_IRQPRIO_INST_SEGMENT; break;
+ case 0x500: prio = BOOK3S_IRQPRIO_EXTERNAL; break;
+ case 0x600: prio = BOOK3S_IRQPRIO_ALIGNMENT; break;
+ case 0x700: prio = BOOK3S_IRQPRIO_PROGRAM; break;
+ case 0x800: prio = BOOK3S_IRQPRIO_FP_UNAVAIL; break;
+ case 0x900: prio = BOOK3S_IRQPRIO_DECREMENTER; break;
+ case 0xc00: prio = BOOK3S_IRQPRIO_SYSCALL; break;
+ case 0xd00: prio = BOOK3S_IRQPRIO_DEBUG; break;
+ case 0xf20: prio = BOOK3S_IRQPRIO_ALTIVEC; break;
+ case 0xf40: prio = BOOK3S_IRQPRIO_VSX; break;
+ default: prio = BOOK3S_IRQPRIO_MAX; break;
+ }
+
+ set_bit(prio, &vcpu->arch.pending_exceptions);
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "Queueing interrupt %x\n", vec);
+#endif
+}
+
+
+void kvmppc_core_queue_program(struct kvm_vcpu *vcpu)
+{
+ kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_PROGRAM);
+}
+
+void kvmppc_core_queue_dec(struct kvm_vcpu *vcpu)
+{
+ kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_DECREMENTER);
+}
+
+int kvmppc_core_pending_dec(struct kvm_vcpu *vcpu)
+{
+ return test_bit(BOOK3S_INTERRUPT_DECREMENTER >> 7, &vcpu->arch.pending_exceptions);
+}
+
+void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
+ struct kvm_interrupt *irq)
+{
+ kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_EXTERNAL);
+}
+
+int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
+{
+ int deliver = 1;
+ int vec = 0;
+
+ switch (priority) {
+ case BOOK3S_IRQPRIO_DECREMENTER:
+ deliver = vcpu->arch.msr & MSR_EE;
+ vec = BOOK3S_INTERRUPT_DECREMENTER;
+ break;
+ case BOOK3S_IRQPRIO_EXTERNAL:
+ deliver = vcpu->arch.msr & MSR_EE;
+ vec = BOOK3S_INTERRUPT_EXTERNAL;
+ break;
+ case BOOK3S_IRQPRIO_SYSTEM_RESET:
+ vec = BOOK3S_INTERRUPT_SYSTEM_RESET;
+ break;
+ case BOOK3S_IRQPRIO_MACHINE_CHECK:
+ vec = BOOK3S_INTERRUPT_MACHINE_CHECK;
+ break;
+ case BOOK3S_IRQPRIO_DATA_STORAGE:
+ vec = BOOK3S_INTERRUPT_DATA_STORAGE;
+ break;
+ case BOOK3S_IRQPRIO_INST_STORAGE:
+ vec = BOOK3S_INTERRUPT_INST_STORAGE;
+ break;
+ case BOOK3S_IRQPRIO_DATA_SEGMENT:
+ vec = BOOK3S_INTERRUPT_DATA_SEGMENT;
+ break;
+ case BOOK3S_IRQPRIO_INST_SEGMENT:
+ vec = BOOK3S_INTERRUPT_INST_SEGMENT;
+ break;
+ case BOOK3S_IRQPRIO_ALIGNMENT:
+ vec = BOOK3S_INTERRUPT_ALIGNMENT;
+ break;
+ case BOOK3S_IRQPRIO_PROGRAM:
+ vec = BOOK3S_INTERRUPT_PROGRAM;
+ break;
+ case BOOK3S_IRQPRIO_VSX:
+ vec = BOOK3S_INTERRUPT_VSX;
+ break;
+ case BOOK3S_IRQPRIO_ALTIVEC:
+ vec = BOOK3S_INTERRUPT_ALTIVEC;
+ break;
+ case BOOK3S_IRQPRIO_FP_UNAVAIL:
+ vec = BOOK3S_INTERRUPT_FP_UNAVAIL;
+ break;
+ case BOOK3S_IRQPRIO_SYSCALL:
+ vec = BOOK3S_INTERRUPT_SYSCALL;
+ break;
+ case BOOK3S_IRQPRIO_DEBUG:
+ vec = BOOK3S_INTERRUPT_TRACE;
+ break;
+ case BOOK3S_IRQPRIO_PERFORMANCE_MONITOR:
+ vec = BOOK3S_INTERRUPT_PERFMON;
+ break;
+ default:
+ deliver = 0;
+ printk(KERN_ERR "KVM: Unknown interrupt: 0x%x\n", priority);
+ break;
+ }
+
+#if 0
+ printk(KERN_INFO "Deliver interrupt 0x%x? %x\n", vec, deliver);
+#endif
+
+ if (deliver)
+ kvmppc_inject_interrupt(vcpu, vec, 0ULL);
+
+ return deliver;
+}
+
+void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
+{
+ unsigned long *pending = &vcpu->arch.pending_exceptions;
+ unsigned int priority;
+
+ /* XXX be more clever here - no need to mftb() on every entry */
+ /* Issue DEC again if it's still active */
+#ifdef AGGRESSIVE_DEC
+ if (vcpu->arch.msr & MSR_EE)
+ if (kvmppc_get_dec(vcpu) & 0x80000000)
+ kvmppc_core_queue_dec(vcpu);
+#endif
+
+#ifdef EXIT_DEBUG
+ if (vcpu->arch.pending_exceptions)
+ printk(KERN_EMERG "KVM: Check pending: %lx\n", vcpu->arch.pending_exceptions);
+#endif
+ priority = __ffs(*pending);
+ while (priority <= (sizeof(unsigned int) * 8)) {
+ if (kvmppc_book3s_irqprio_deliver(vcpu, priority)) {
+ clear_bit(priority, &vcpu->arch.pending_exceptions);
+ break;
+ }
+
+ priority = find_next_bit(pending,
+ BITS_PER_BYTE * sizeof(*pending),
+ priority + 1);
+ }
+}
+
+void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
+{
+ vcpu->arch.pvr = pvr;
+ if ((pvr >= 0x330000) && (pvr < 0x70330000)) {
+ kvmppc_mmu_book3s_64_init(vcpu);
+ to_book3s(vcpu)->hior = 0xfff00000;
+ to_book3s(vcpu)->msr_mask = 0xffffffffffffffffULL;
+ } else {
+ kvmppc_mmu_book3s_32_init(vcpu);
+ to_book3s(vcpu)->hior = 0;
+ to_book3s(vcpu)->msr_mask = 0xffffffffULL;
+ }
+
+ /* If we are in hypervisor level on 970, we can tell the CPU to
+ * treat DCBZ as 32 bytes store */
+ vcpu->arch.hflags &= ~BOOK3S_HFLAG_DCBZ32;
+ if (vcpu->arch.mmu.is_dcbz32(vcpu) && (mfmsr() & MSR_HV) &&
+ !strcmp(cur_cpu_spec->platform, "ppc970"))
+ vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
+
+}
+
+/* Book3s_32 CPUs always have 32 bytes cache line size, which Linux assumes. To
+ * make Book3s_32 Linux work on Book3s_64, we have to make sure we trap dcbz to
+ * emulate 32 bytes dcbz length.
+ *
+ * The Book3s_64 inventors also realized this case and implemented a special bit
+ * in the HID5 register, which is a hypervisor ressource. Thus we can't use it.
+ *
+ * My approach here is to patch the dcbz instruction on executing pages.
+ */
+static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
+{
+ bool touched = false;
+ hva_t hpage;
+ u32 *page;
+ int i;
+
+ hpage = gfn_to_hva(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
+ if (kvm_is_error_hva(hpage))
+ return;
+
+ hpage |= pte->raddr & ~PAGE_MASK;
+ hpage &= ~0xFFFULL;
+
+ page = vmalloc(HW_PAGE_SIZE);
+
+ if (copy_from_user(page, (void __user *)hpage, HW_PAGE_SIZE))
+ goto out;
+
+ for (i=0; i < HW_PAGE_SIZE / 4; i++)
+ if ((page[i] & 0xff0007ff) == INS_DCBZ) {
+ page[i] &= 0xfffffff7; // reserved instruction, so we trap
+ touched = true;
+ }
+
+ if (touched)
+ copy_to_user((void __user *)hpage, page, HW_PAGE_SIZE);
+
+out:
+ vfree(page);
+}
+
+static int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, bool data,
+ struct kvmppc_pte *pte)
+{
+ int relocated = (vcpu->arch.msr & (data ? MSR_DR : MSR_IR));
+ int r;
+
+ if (relocated) {
+ r = vcpu->arch.mmu.xlate(vcpu, eaddr, pte, data);
+ } else {
+ pte->eaddr = eaddr;
+ pte->raddr = eaddr & 0xffffffff;
+ pte->vpage = eaddr >> 12;
+ switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+ case 0:
+ pte->vpage |= VSID_REAL;
+ case MSR_DR:
+ pte->vpage |= VSID_REAL_DR;
+ case MSR_IR:
+ pte->vpage |= VSID_REAL_IR;
+ }
+ pte->may_read = true;
+ pte->may_write = true;
+ pte->may_execute = true;
+ r = 0;
+ }
+
+ return r;
+}
+
+static hva_t kvmppc_bad_hva(void)
+{
+ return PAGE_OFFSET;
+}
+
+static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte,
+ bool read)
+{
+ hva_t hpage;
+
+ if (read && !pte->may_read)
+ goto err;
+
+ if (!read && !pte->may_write)
+ goto err;
+
+ hpage = gfn_to_hva(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
+ if (kvm_is_error_hva(hpage))
+ goto err;
+
+ return hpage | (pte->raddr & ~PAGE_MASK);
+err:
+ return kvmppc_bad_hva();
+}
+
+int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr)
+{
+ struct kvmppc_pte pte;
+ hva_t hva = eaddr;
+
+ vcpu->stat.st++;
+
+ if (kvmppc_xlate(vcpu, eaddr, false, &pte))
+ goto err;
+
+ hva = kvmppc_pte_to_hva(vcpu, &pte, false);
+ if (kvm_is_error_hva(hva))
+ goto err;
+
+ if (copy_to_user((void __user *)hva, ptr, size)) {
+ printk(KERN_INFO "kvmppc_st at 0x%lx failed\n", hva);
+ goto err;
+ }
+
+ return 0;
+
+err:
+ return -ENOENT;
+}
+
+int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr,
+ bool data)
+{
+ struct kvmppc_pte pte;
+ hva_t hva = eaddr;
+
+ vcpu->stat.ld++;
+
+ if (kvmppc_xlate(vcpu, eaddr, data, &pte))
+ goto err;
+
+ hva = kvmppc_pte_to_hva(vcpu, &pte, true);
+ if (kvm_is_error_hva(hva))
+ goto err;
+
+ if (copy_from_user(ptr, (void __user *)hva, size)) {
+ printk(KERN_INFO "kvmppc_ld at 0x%lx failed\n", hva);
+ goto err;
+ }
+
+ return 0;
+
+err:
+ return -ENOENT;
+}
+
+static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+ return kvm_is_visible_gfn(vcpu->kvm, gfn);
+}
+
+int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu,
+ ulong eaddr, int vec)
+{
+ bool data = (vec == BOOK3S_INTERRUPT_DATA_STORAGE);
+ int r = RESUME_GUEST;
+ int relocated;
+ int page_found = 0;
+ struct kvmppc_pte pte;
+ bool is_mmio = false;
+
+ if ( vec == BOOK3S_INTERRUPT_DATA_STORAGE ) {
+ relocated = (vcpu->arch.msr & MSR_DR);
+ } else {
+ relocated = (vcpu->arch.msr & MSR_IR);
+ }
+
+ /* Resolve real address if translation turned on */
+ if (relocated) {
+ page_found = vcpu->arch.mmu.xlate(vcpu, eaddr, &pte, data);
+ } else {
+ pte.may_execute = true;
+ pte.may_read = true;
+ pte.may_write = true;
+ pte.raddr = eaddr & 0xffffffff;
+ pte.eaddr = eaddr;
+ pte.vpage = eaddr >> 12;
+ switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+ case 0:
+ pte.vpage |= VSID_REAL;
+ case MSR_DR:
+ pte.vpage |= VSID_REAL_DR;
+ case MSR_IR:
+ pte.vpage |= VSID_REAL_IR;
+ }
+ }
+
+ if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+ (!(vcpu->arch.hflags & BOOK3S_HFLAG_DCBZ32))) {
+ /*
+ * If we do the dcbz hack, we have to NX on every execution,
+ * so we can patch the executing code. This renders our guest
+ * NX-less.
+ */
+ pte.may_execute = !data;
+ }
+
+ if (page_found == -ENOENT) {
+ /* Page not found in guest PTE entries */
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ to_book3s(vcpu)->dsisr = vcpu->arch.fault_dsisr;
+ vcpu->arch.msr |= (vcpu->arch.shadow_msr & 0x00000000f8000000ULL);
+ kvmppc_book3s_queue_irqprio(vcpu, vec);
+ } else if (page_found == -EPERM) {
+ /* Storage protection */
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ to_book3s(vcpu)->dsisr = vcpu->arch.fault_dsisr & ~DSISR_NOHPTE;
+ to_book3s(vcpu)->dsisr |= DSISR_PROTFAULT;
+ vcpu->arch.msr |= (vcpu->arch.shadow_msr & 0x00000000f8000000ULL);
+ kvmppc_book3s_queue_irqprio(vcpu, vec);
+ } else if (page_found == -EINVAL) {
+ /* Page not found in guest SLB */
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ kvmppc_book3s_queue_irqprio(vcpu, vec + 0x80);
+ } else if (!is_mmio &&
+ kvmppc_visible_gfn(vcpu, pte.raddr >> PAGE_SHIFT)) {
+ /* The guest's PTE is not mapped yet. Map on the host */
+ kvmppc_mmu_map_page(vcpu, &pte);
+ if (data)
+ vcpu->stat.sp_storage++;
+ else if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+ (!(vcpu->arch.hflags & BOOK3S_HFLAG_DCBZ32)))
+ kvmppc_patch_dcbz(vcpu, &pte);
+ } else {
+ /* MMIO */
+ vcpu->stat.mmio_exits++;
+ vcpu->arch.paddr_accessed = pte.raddr;
+ r = kvmppc_emulate_mmio(run, vcpu);
+ if ( r == RESUME_HOST_NV )
+ r = RESUME_HOST;
+ if ( r == RESUME_GUEST_NV )
+ r = RESUME_GUEST;
+ }
+
+ return r;
+}
+
+int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
+ unsigned int exit_nr)
+{
+ int r = RESUME_HOST;
+
+ vcpu->stat.sum_exits++;
+
+ run->exit_reason = KVM_EXIT_UNKNOWN;
+ run->ready_for_interrupt_injection = 1;
+#ifdef EXIT_DEBUG
+ printk(KERN_EMERG "exit_nr=0x%x | pc=0x%lx | dar=0x%lx | dec=0x%x | msr=0x%lx\n",
+ exit_nr, vcpu->arch.pc, vcpu->arch.fault_dear,
+ kvmppc_get_dec(vcpu), vcpu->arch.msr);
+#elif defined (EXIT_DEBUG_SIMPLE)
+ if ((exit_nr != 0x900) && (exit_nr != 0x500))
+ printk(KERN_EMERG "exit_nr=0x%x | pc=0x%lx | dar=0x%lx | msr=0x%lx\n",
+ exit_nr, vcpu->arch.pc, vcpu->arch.fault_dear,
+ vcpu->arch.msr);
+#endif
+ kvm_resched(vcpu);
+ switch (exit_nr) {
+ case BOOK3S_INTERRUPT_INST_STORAGE:
+ vcpu->stat.pf_instruc++;
+ /* only care about PTEG not found errors, but leave NX alone */
+ if (vcpu->arch.shadow_msr & 0x40000000) {
+ r = kvmppc_handle_pagefault(run, vcpu, vcpu->arch.pc, exit_nr);
+ vcpu->stat.sp_instruc++;
+ } else if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+ (!(vcpu->arch.hflags & BOOK3S_HFLAG_DCBZ32))) {
+ /*
+ * XXX If we do the dcbz hack we use the NX bit to flush&patch the page,
+ * so we can't use the NX bit inside the guest. Let's cross our fingers,
+ * that no guest that needs the dcbz hack does NX.
+ */
+ kvmppc_mmu_pte_flush(vcpu, vcpu->arch.pc, ~0xFFFULL);
+ } else {
+ vcpu->arch.msr |= (vcpu->arch.shadow_msr & 0x58000000);
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ kvmppc_mmu_pte_flush(vcpu, vcpu->arch.pc, ~0xFFFULL);
+ r = RESUME_GUEST;
+ }
+ break;
+ case BOOK3S_INTERRUPT_DATA_STORAGE:
+ vcpu->stat.pf_storage++;
+ /* The only case we need to handle is missing shadow PTEs */
+ if (vcpu->arch.fault_dsisr & DSISR_NOHPTE) {
+ r = kvmppc_handle_pagefault(run, vcpu, vcpu->arch.fault_dear, exit_nr);
+ } else {
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ to_book3s(vcpu)->dsisr = vcpu->arch.fault_dsisr;
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ kvmppc_mmu_pte_flush(vcpu, vcpu->arch.dear, ~0xFFFULL);
+ r = RESUME_GUEST;
+ }
+ break;
+ case BOOK3S_INTERRUPT_DATA_SEGMENT:
+ if (kvmppc_mmu_map_segment(vcpu, vcpu->arch.fault_dear) < 0) {
+ vcpu->arch.dear = vcpu->arch.fault_dear;
+ kvmppc_book3s_queue_irqprio(vcpu,
+ BOOK3S_INTERRUPT_DATA_SEGMENT);
+ }
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_INST_SEGMENT:
+ if (kvmppc_mmu_map_segment(vcpu, vcpu->arch.pc) < 0) {
+ kvmppc_book3s_queue_irqprio(vcpu,
+ BOOK3S_INTERRUPT_INST_SEGMENT);
+ }
+ r = RESUME_GUEST;
+ break;
+ /* We're good on these - the host merely wanted to get our attention */
+ case BOOK3S_INTERRUPT_DECREMENTER:
+ vcpu->stat.dec_exits++;
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_EXTERNAL:
+ vcpu->stat.ext_intr_exits++;
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_PROGRAM:
+ {
+ enum emulation_result er;
+
+ if (vcpu->arch.msr & MSR_PR) {
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", vcpu->arch.pc, vcpu->arch.last_inst);
+#endif
+ if ((vcpu->arch.last_inst & 0xff0007ff) !=
+ (INS_DCBZ & 0xfffffff7)) {
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ }
+ }
+
+ vcpu->stat.emulated_inst_exits++;
+ er = kvmppc_emulate_instruction(run, vcpu);
+ switch (er) {
+ case EMULATE_DONE:
+ r = RESUME_GUEST;
+ break;
+ case EMULATE_FAIL:
+ printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
+ __func__, vcpu->arch.pc, vcpu->arch.last_inst);
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ default:
+ BUG();
+ }
+ break;
+ }
+ case BOOK3S_INTERRUPT_SYSCALL:
+#ifdef EXIT_DEBUG
+ printk(KERN_INFO "Syscall Nr %d\n", (int)vcpu->arch.gpr[0]);
+#endif
+ vcpu->stat.syscall_exits++;
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ case BOOK3S_INTERRUPT_MACHINE_CHECK:
+ case BOOK3S_INTERRUPT_FP_UNAVAIL:
+ case BOOK3S_INTERRUPT_TRACE:
+ case BOOK3S_INTERRUPT_ALTIVEC:
+ case BOOK3S_INTERRUPT_VSX:
+ kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+ r = RESUME_GUEST;
+ break;
+ default:
+ /* Ugh - bork here! What did we get? */
+ printk(KERN_EMERG "exit_nr=0x%x | pc=0x%lx | msr=0x%lx\n", exit_nr, vcpu->arch.pc, vcpu->arch.shadow_msr);
+ r = RESUME_HOST;
+ BUG();
+ break;
+ }
+
+
+ if (!(r & RESUME_HOST)) {
+ /* To avoid clobbering exit_reason, only check for signals if
+ * we aren't already exiting to userspace for some other
+ * reason. */
+ if (signal_pending(current)) {
+#ifdef EXIT_DEBUG
+ printk(KERN_EMERG "KVM: Going back to host\n");
+#endif
+ vcpu->stat.signal_exits++;
+ run->exit_reason = KVM_EXIT_INTR;
+ r = -EINTR;
+ } else {
+ /* In case an interrupt came in that was triggered
+ * from userspace (like DEC), we need to check what
+ * to inject now! */
+ kvmppc_core_deliver_interrupts(vcpu);
+ }
+ }
+
+#ifdef EXIT_DEBUG
+ printk(KERN_EMERG "KVM exit: vcpu=0x%p pc=0x%lx r=0x%x\n", vcpu, vcpu->arch.pc, r);
+#endif
+
+ return r;
+}
+
+int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+{
+ int i;
+
+ regs->pc = vcpu->arch.pc;
+ regs->cr = vcpu->arch.cr;
+ regs->ctr = vcpu->arch.ctr;
+ regs->lr = vcpu->arch.lr;
+ regs->xer = vcpu->arch.xer;
+ regs->msr = vcpu->arch.msr;
+ regs->srr0 = vcpu->arch.srr0;
+ regs->srr1 = vcpu->arch.srr1;
+ regs->pid = vcpu->arch.pid;
+ regs->sprg0 = vcpu->arch.sprg0;
+ regs->sprg1 = vcpu->arch.sprg1;
+ regs->sprg2 = vcpu->arch.sprg2;
+ regs->sprg3 = vcpu->arch.sprg3;
+ regs->sprg5 = vcpu->arch.sprg4;
+ regs->sprg6 = vcpu->arch.sprg5;
+ regs->sprg7 = vcpu->arch.sprg6;
+
+ for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
+ regs->gpr[i] = vcpu->arch.gpr[i];
+
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+{
+ int i;
+
+ vcpu->arch.pc = regs->pc;
+ vcpu->arch.cr = regs->cr;
+ vcpu->arch.ctr = regs->ctr;
+ vcpu->arch.lr = regs->lr;
+ vcpu->arch.xer = regs->xer;
+ kvmppc_set_msr(vcpu, regs->msr);
+ vcpu->arch.srr0 = regs->srr0;
+ vcpu->arch.srr1 = regs->srr1;
+ vcpu->arch.sprg0 = regs->sprg0;
+ vcpu->arch.sprg1 = regs->sprg1;
+ vcpu->arch.sprg2 = regs->sprg2;
+ vcpu->arch.sprg3 = regs->sprg3;
+ vcpu->arch.sprg5 = regs->sprg4;
+ vcpu->arch.sprg6 = regs->sprg5;
+ vcpu->arch.sprg7 = regs->sprg6;
+
+ for (i = 0; i < ARRAY_SIZE(vcpu->arch.gpr); i++)
+ vcpu->arch.gpr[i] = regs->gpr[i];
+
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
+ struct kvm_sregs *sregs)
+{
+ sregs->pvr = vcpu->arch.pvr;
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
+ struct kvm_sregs *sregs)
+{
+ kvmppc_set_pvr(vcpu, sregs->pvr);
+ return 0;
+}
+
+int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+ return -ENOTSUPP;
+}
+
+int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+ return -ENOTSUPP;
+}
+
+int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
+ struct kvm_translation *tr)
+{
+ return 0;
+}
+
+/*
+ * Get (and clear) the dirty memory log for a memory slot.
+ */
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
+ struct kvm_dirty_log *log)
+{
+ struct kvm_memory_slot *memslot;
+ struct kvm_vcpu *vcpu;
+ ulong ga, ga_end;
+ int is_dirty = 0;
+ int r, n;
+
+ down_write(&kvm->slots_lock);
+
+ r = kvm_get_dirty_log(kvm, log, &is_dirty);
+ if (r)
+ goto out;
+
+ /* If nothing is dirty, don't bother messing with page tables. */
+ if (is_dirty) {
+ memslot = &kvm->memslots[log->slot];
+
+ ga = memslot->base_gfn << PAGE_SHIFT;
+ ga_end = ga + (memslot->npages << PAGE_SHIFT);
+
+ kvm_for_each_vcpu(n, vcpu, kvm)
+ kvmppc_mmu_pte_pflush(vcpu, ga, ga_end);
+
+ n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
+ memset(memslot->dirty_bitmap, 0, n);
+ }
+
+ r = 0;
+out:
+ up_write(&kvm->slots_lock);
+ return r;
+}
+
+int kvmppc_core_check_processor_compat(void)
+{
+ return 0;
+}
+
+struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s;
+ struct kvm_vcpu *vcpu;
+ int err;
+
+ vcpu_book3s = (struct kvmppc_vcpu_book3s *)__get_free_pages( GFP_KERNEL | __GFP_ZERO,
+ get_order(sizeof(struct kvmppc_vcpu_book3s)));
+ if (!vcpu_book3s) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ vcpu = &vcpu_book3s->vcpu;
+ err = kvm_vcpu_init(vcpu, kvm, id);
+ if (err)
+ goto free_vcpu;
+
+ vcpu->arch.host_retip = kvm_return_point;
+ vcpu->arch.host_msr = mfmsr();
+ /* default to book3s_64 (970fx) */
+ vcpu->arch.pvr = 0x3C0301;
+ kvmppc_set_pvr(vcpu, vcpu->arch.pvr);
+ vcpu_book3s->slb_nr = 64;
+
+ /* remember where some real-mode handlers are */
+ vcpu->arch.trampoline_lowmem = kvmppc_trampoline_lowmem;
+ vcpu->arch.trampoline_enter = kvmppc_trampoline_enter;
+ vcpu->arch.highmem_handler = (ulong)kvmppc_handler_highmem;
+
+ vcpu->arch.shadow_msr = MSR_USER64;
+
+ err = __init_new_context();
+ if (err < 0)
+ goto free_vcpu;
+ vcpu_book3s->context_id = err;
+
+ vcpu_book3s->vsid_max = ((vcpu_book3s->context_id + 1) << USER_ESID_BITS) - 1;
+ vcpu_book3s->vsid_first = vcpu_book3s->context_id << USER_ESID_BITS;
+ vcpu_book3s->vsid_next = vcpu_book3s->vsid_first;
+
+ return vcpu;
+
+free_vcpu:
+ free_pages((long)vcpu_book3s, get_order(sizeof(struct kvmppc_vcpu_book3s)));
+out:
+ return ERR_PTR(err);
+}
+
+void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+
+ __destroy_context(vcpu_book3s->context_id);
+ kvm_vcpu_uninit(vcpu);
+ free_pages((long)vcpu_book3s, get_order(sizeof(struct kvmppc_vcpu_book3s)));
+}
+
+extern int __kvmppc_vcpu_entry(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
+int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
+{
+ int ret;
+
+ /* No need to go into the guest when all we do is going out */
+ if (signal_pending(current)) {
+ kvm_run->exit_reason = KVM_EXIT_INTR;
+ return -EINTR;
+ }
+
+ /* XXX we get called with irq disabled - change that! */
+ local_irq_enable();
+
+ ret = __kvmppc_vcpu_entry(kvm_run, vcpu);
+
+ local_irq_disable();
+
+ return ret;
+}
+
+static int kvmppc_book3s_init(void)
+{
+ return kvm_init(NULL, sizeof(struct kvmppc_vcpu_book3s), THIS_MODULE);
+}
+
+static void kvmppc_book3s_exit(void)
+{
+ kvm_exit();
+}
+
+module_init(kvmppc_book3s_init);
+module_exit(kvmppc_book3s_exit);
--
1.6.0.2
^ permalink raw reply related
* [PATCH 13/27] Add book3s_32 guest MMU
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
This patch adds an implementation for a G3/G4 MMU, so we can run G3 and
G4 guests in KVM on Book3s_64.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v5 -> v6:
- dprintk instead of scattered #ifdef's
- // -> /* */
- 80 characters per line
---
arch/powerpc/kvm/book3s_32_mmu.c | 372 ++++++++++++++++++++++++++++++++++++++
1 files changed, 372 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_32_mmu.c
diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
new file mode 100644
index 0000000..faf99f2
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -0,0 +1,372 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/highmem.h>
+
+#include <asm/tlbflush.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+
+/* #define DEBUG_MMU */
+/* #define DEBUG_MMU_PTE */
+/* #define DEBUG_MMU_PTE_IP 0xfff14c40 */
+
+#ifdef DEBUG_MMU
+#define dprintk(X...) printk(KERN_INFO X)
+#else
+#define dprintk(X...) do { } while(0)
+#endif
+
+#ifdef DEBUG_PTE
+#define dprintk_pte(X...) printk(KERN_INFO X)
+#else
+#define dprintk_pte(X...) do { } while(0)
+#endif
+
+#define PTEG_FLAG_ACCESSED 0x00000100
+#define PTEG_FLAG_DIRTY 0x00000080
+
+static inline bool check_debug_ip(struct kvm_vcpu *vcpu)
+{
+#ifdef DEBUG_MMU_PTE_IP
+ return vcpu->arch.pc == DEBUG_MMU_PTE_IP;
+#else
+ return true;
+#endif
+}
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data);
+
+static struct kvmppc_sr *find_sr(struct kvmppc_vcpu_book3s *vcpu_book3s, gva_t eaddr)
+{
+ return &vcpu_book3s->sr[(eaddr >> 28) & 0xf];
+}
+
+static u64 kvmppc_mmu_book3s_32_ea_to_vp(struct kvm_vcpu *vcpu, gva_t eaddr,
+ bool data)
+{
+ struct kvmppc_sr *sre = find_sr(to_book3s(vcpu), eaddr);
+ struct kvmppc_pte pte;
+
+ if (!kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, &pte, data))
+ return pte.vpage;
+
+ return (((u64)eaddr >> 12) & 0xffff) | (((u64)sre->vsid) << 16);
+}
+
+static void kvmppc_mmu_book3s_32_reset_msr(struct kvm_vcpu *vcpu)
+{
+ kvmppc_set_msr(vcpu, 0);
+}
+
+static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3s,
+ struct kvmppc_sr *sre, gva_t eaddr,
+ bool primary)
+{
+ u32 page, hash, pteg, htabmask;
+ hva_t r;
+
+ page = (eaddr & 0x0FFFFFFF) >> 12;
+ htabmask = ((vcpu_book3s->sdr1 & 0x1FF) << 16) | 0xFFC0;
+
+ hash = ((sre->vsid ^ page) << 6);
+ if (!primary)
+ hash = ~hash;
+ hash &= htabmask;
+
+ pteg = (vcpu_book3s->sdr1 & 0xffff0000) | hash;
+
+ dprintk("MMU: pc=0x%lx eaddr=0x%lx sdr1=0x%llx pteg=0x%x vsid=0x%x\n",
+ vcpu_book3s->vcpu.arch.pc, eaddr, vcpu_book3s->sdr1, pteg,
+ sre->vsid);
+
+ r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
+ if (kvm_is_error_hva(r))
+ return r;
+ return r | (pteg & ~PAGE_MASK);
+}
+
+static u32 kvmppc_mmu_book3s_32_get_ptem(struct kvmppc_sr *sre, gva_t eaddr,
+ bool primary)
+{
+ return ((eaddr & 0x0fffffff) >> 22) | (sre->vsid << 7) |
+ (primary ? 0 : 0x40) | 0x80000000;
+}
+
+static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_bat *bat;
+ int i;
+
+ for (i = 0; i < 8; i++) {
+ if (data)
+ bat = &vcpu_book3s->dbat[i];
+ else
+ bat = &vcpu_book3s->ibat[i];
+
+ if (vcpu->arch.msr & MSR_PR) {
+ if (!bat->vp)
+ continue;
+ } else {
+ if (!bat->vs)
+ continue;
+ }
+
+ if (check_debug_ip(vcpu))
+ {
+ dprintk_pte("%cBAT %02d: 0x%lx - 0x%x (0x%x)\n",
+ data ? 'd' : 'i', i, eaddr, bat->bepi,
+ bat->bepi_mask);
+ }
+ if ((eaddr & bat->bepi_mask) == bat->bepi) {
+ pte->raddr = bat->brpn | (eaddr & ~bat->bepi_mask);
+ pte->vpage = (eaddr >> 12) | VSID_BAT;
+ pte->may_read = bat->pp;
+ pte->may_write = bat->pp > 1;
+ pte->may_execute = true;
+ if (!pte->may_read) {
+ printk(KERN_INFO "BAT is not readable!\n");
+ continue;
+ }
+ if (!pte->may_write) {
+ /* let's treat r/o BATs as not-readable for now */
+ dprintk_pte("BAT is read-only!\n");
+ continue;
+ }
+
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+
+static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data,
+ bool primary)
+{
+ struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
+ struct kvmppc_sr *sre;
+ hva_t ptegp;
+ u32 pteg[16];
+ u64 ptem = 0;
+ int i;
+ int found = 0;
+
+ sre = find_sr(vcpu_book3s, eaddr);
+
+ dprintk_pte("SR 0x%lx: vsid=0x%x, raw=0x%x\n", eaddr >> 28,
+ sre->vsid, sre->raw);
+
+ pte->vpage = kvmppc_mmu_book3s_32_ea_to_vp(vcpu, eaddr, data);
+
+ ptegp = kvmppc_mmu_book3s_32_get_pteg(vcpu_book3s, sre, eaddr, primary);
+ if (kvm_is_error_hva(ptegp)) {
+ printk(KERN_INFO "KVM: Invalid PTEG!\n");
+ goto no_page_found;
+ }
+
+ ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);
+
+ if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
+ printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+ goto no_page_found;
+ }
+
+ for (i=0; i<16; i+=2) {
+ if (ptem == pteg[i]) {
+ u8 pp;
+
+ pte->raddr = (pteg[i+1] & ~(0xFFFULL)) | (eaddr & 0xFFF);
+ pp = pteg[i+1] & 3;
+
+ if ((sre->Kp && (vcpu->arch.msr & MSR_PR)) ||
+ (sre->Ks && !(vcpu->arch.msr & MSR_PR)))
+ pp |= 4;
+
+ pte->may_write = false;
+ pte->may_read = false;
+ pte->may_execute = true;
+ switch (pp) {
+ case 0:
+ case 1:
+ case 2:
+ case 6:
+ pte->may_write = true;
+ case 3:
+ case 5:
+ case 7:
+ pte->may_read = true;
+ break;
+ }
+
+ if ( !pte->may_read )
+ continue;
+
+ dprintk_pte("MMU: Found PTE -> %x %x - %x\n",
+ pteg[i], pteg[i+1], pp);
+ found = 1;
+ break;
+ }
+ }
+
+ /* Update PTE C and A bits, so the guest's swapper knows we used the
+ page */
+ if (found) {
+ u32 oldpte = pteg[i+1];
+
+ if (pte->may_read)
+ pteg[i+1] |= PTEG_FLAG_ACCESSED;
+ if (pte->may_write)
+ pteg[i+1] |= PTEG_FLAG_DIRTY;
+ else
+ dprintk_pte("KVM: Mapping read-only page!\n");
+
+ /* Write back into the PTEG */
+ if (pteg[i+1] != oldpte)
+ copy_to_user((void __user *)ptegp, pteg, sizeof(pteg));
+
+ return 0;
+ }
+
+no_page_found:
+
+ if (check_debug_ip(vcpu)) {
+ dprintk_pte("KVM MMU: No PTE found (sdr1=0x%llx ptegp=0x%lx)\n",
+ to_book3s(vcpu)->sdr1, ptegp);
+ for (i=0; i<16; i+=2) {
+ dprintk_pte(" %02d: 0x%x - 0x%x (0x%llx)\n",
+ i, pteg[i], pteg[i+1], ptem);
+ }
+ }
+
+ return -ENOENT;
+}
+
+static int kvmppc_mmu_book3s_32_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
+ struct kvmppc_pte *pte, bool data)
+{
+ int r;
+
+ pte->eaddr = eaddr;
+ r = kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, pte, data);
+ if (r < 0)
+ r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte, data, true);
+ if (r < 0)
+ r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte, data, false);
+
+ return r;
+}
+
+
+static u32 kvmppc_mmu_book3s_32_mfsrin(struct kvm_vcpu *vcpu, u32 srnum)
+{
+ return to_book3s(vcpu)->sr[srnum].raw;
+}
+
+static void kvmppc_mmu_book3s_32_mtsrin(struct kvm_vcpu *vcpu, u32 srnum,
+ ulong value)
+{
+ struct kvmppc_sr *sre;
+
+ sre = &to_book3s(vcpu)->sr[srnum];
+
+ /* Flush any left-over shadows from the previous SR */
+
+ /* XXX Not necessary? */
+ /* kvmppc_mmu_pte_flush(vcpu, ((u64)sre->vsid) << 28, 0xf0000000ULL); */
+
+ /* And then put in the new SR */
+ sre->raw = value;
+ sre->vsid = (value & 0x0fffffff);
+ sre->Ks = (value & 0x40000000) ? true : false;
+ sre->Kp = (value & 0x20000000) ? true : false;
+ sre->nx = (value & 0x10000000) ? true : false;
+
+ /* Map the new segment */
+ kvmppc_mmu_map_segment(vcpu, srnum << SID_SHIFT);
+}
+
+static void kvmppc_mmu_book3s_32_tlbie(struct kvm_vcpu *vcpu, ulong ea, bool large)
+{
+ kvmppc_mmu_pte_flush(vcpu, ea, ~0xFFFULL);
+}
+
+static int kvmppc_mmu_book3s_32_esid_to_vsid(struct kvm_vcpu *vcpu, u64 esid,
+ u64 *vsid)
+{
+ /* In case we only have one of MSR_IR or MSR_DR set, let's put
+ that in the real-mode context (and hope RM doesn't access
+ high memory) */
+ switch (vcpu->arch.msr & (MSR_DR|MSR_IR)) {
+ case 0:
+ *vsid = (VSID_REAL >> 16) | esid;
+ break;
+ case MSR_IR:
+ *vsid = (VSID_REAL_IR >> 16) | esid;
+ break;
+ case MSR_DR:
+ *vsid = (VSID_REAL_DR >> 16) | esid;
+ break;
+ case MSR_DR|MSR_IR:
+ {
+ ulong ea;
+ ea = esid << SID_SHIFT;
+ *vsid = find_sr(to_book3s(vcpu), ea)->vsid;
+ break;
+ }
+ default:
+ BUG();
+ }
+
+ return 0;
+}
+
+static bool kvmppc_mmu_book3s_32_is_dcbz32(struct kvm_vcpu *vcpu)
+{
+ return true;
+}
+
+
+void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu)
+{
+ struct kvmppc_mmu *mmu = &vcpu->arch.mmu;
+
+ mmu->mtsrin = kvmppc_mmu_book3s_32_mtsrin;
+ mmu->mfsrin = kvmppc_mmu_book3s_32_mfsrin;
+ mmu->xlate = kvmppc_mmu_book3s_32_xlate;
+ mmu->reset_msr = kvmppc_mmu_book3s_32_reset_msr;
+ mmu->tlbie = kvmppc_mmu_book3s_32_tlbie;
+ mmu->esid_to_vsid = kvmppc_mmu_book3s_32_esid_to_vsid;
+ mmu->ea_to_vp = kvmppc_mmu_book3s_32_ea_to_vp;
+ mmu->is_dcbz32 = kvmppc_mmu_book3s_32_is_dcbz32;
+
+ mmu->slbmte = NULL;
+ mmu->slbmfee = NULL;
+ mmu->slbmfev = NULL;
+ mmu->slbie = NULL;
+ mmu->slbia = NULL;
+}
--
1.6.0.2
^ permalink raw reply related
* [PATCH 24/27] Include Book3s_64 target in buildsystem
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
Now we have everything in place to be able to build KVM, so let's add it
as config option and in the Makefile.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/Kconfig | 17 +++++++++++++++++
arch/powerpc/kvm/Makefile | 27 +++++++++++++++++++++++----
2 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index c299268..07703f7 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -21,6 +21,23 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
+config KVM_BOOK3S_64_HANDLER
+ bool
+
+config KVM_BOOK3S_64
+ tristate "KVM support for PowerPC book3s_64 processors"
+ depends on EXPERIMENTAL && PPC64
+ select KVM
+ select KVM_BOOK3S_64_HANDLER
+ ---help---
+ Support running unmodified book3s_64 and book3s_32 guest kernels
+ in virtual machines on book3s_64 host processors.
+
+ This module provides access to the hardware capabilities through
+ a character device node named /dev/kvm.
+
+ If unsure, say N.
+
config KVM_440
bool "KVM support for PowerPC 440 processors"
depends on EXPERIMENTAL && 44x
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 37655fe..56484d6 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -12,26 +12,45 @@ CFLAGS_44x_tlb.o := -I.
CFLAGS_e500_tlb.o := -I.
CFLAGS_emulate.o := -I.
-kvm-objs := $(common-objs-y) powerpc.o emulate.o
+common-objs-y += powerpc.o emulate.o
obj-$(CONFIG_KVM_EXIT_TIMING) += timing.o
-obj-$(CONFIG_KVM) += kvm.o
+obj-$(CONFIG_KVM_BOOK3S_64_HANDLER) += book3s_64_exports.o
AFLAGS_booke_interrupts.o := -I$(obj)
kvm-440-objs := \
+ $(common-objs-y) \
booke.o \
booke_emulate.o \
booke_interrupts.o \
44x.o \
44x_tlb.o \
44x_emulate.o
-obj-$(CONFIG_KVM_440) += kvm-440.o
+kvm-objs-$(CONFIG_KVM_440) := $(kvm-440-objs)
kvm-e500-objs := \
+ $(common-objs-y) \
booke.o \
booke_emulate.o \
booke_interrupts.o \
e500.o \
e500_tlb.o \
e500_emulate.o
-obj-$(CONFIG_KVM_E500) += kvm-e500.o
+kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
+
+kvm-book3s_64-objs := \
+ $(common-objs-y) \
+ book3s.o \
+ book3s_64_emulate.o \
+ book3s_64_interrupts.o \
+ book3s_64_mmu_host.o \
+ book3s_64_mmu.o \
+ book3s_32_mmu.o
+kvm-objs-$(CONFIG_KVM_BOOK3S_64) := $(kvm-book3s_64-objs)
+
+kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
+
+obj-$(CONFIG_KVM_440) += kvm.o
+obj-$(CONFIG_KVM_E500) += kvm.o
+obj-$(CONFIG_KVM_BOOK3S_64) += kvm.o
+
--
1.6.0.2
^ permalink raw reply related
* [PATCH 23/27] Export new PACA constants in asm-offsets
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
In order to access fields in the PACA from assembly code, we need
to generate offsets using asm-offsets.c.
So let's add the new PACA related bits, we just introduced!
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kernel/asm-offsets.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index aba3ea6..e2e2082 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -190,6 +190,11 @@ int main(void)
DEFINE(PACA_SYSTEM_TIME, offsetof(struct paca_struct, system_time));
DEFINE(PACA_DATA_OFFSET, offsetof(struct paca_struct, data_offset));
DEFINE(PACA_TRAP_SAVE, offsetof(struct paca_struct, trap_save));
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+ DEFINE(PACA_KVM_IN_GUEST, offsetof(struct paca_struct, kvm_in_guest));
+ DEFINE(PACA_KVM_SLB, offsetof(struct paca_struct, kvm_slb));
+ DEFINE(PACA_KVM_SLB_MAX, offsetof(struct paca_struct, kvm_slb_max));
+#endif
#endif /* CONFIG_PPC64 */
/* RTAS */
--
1.6.0.2
^ permalink raw reply related
* [PATCH 25/27] Fix trace.h
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
It looks like the variable "pc" is defined. At least the current code always
failed on me stating that "pc" is already defined somewhere else.
Let's use _pc instead, because that doesn't collide.
Is this the right approach? Does it break on 440 too? If not, why not?
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/trace.h | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
index 67f219d..a8e8400 100644
--- a/arch/powerpc/kvm/trace.h
+++ b/arch/powerpc/kvm/trace.h
@@ -12,8 +12,8 @@
* Tracepoint for guest mode entry.
*/
TRACE_EVENT(kvm_ppc_instr,
- TP_PROTO(unsigned int inst, unsigned long pc, unsigned int emulate),
- TP_ARGS(inst, pc, emulate),
+ TP_PROTO(unsigned int inst, unsigned long _pc, unsigned int emulate),
+ TP_ARGS(inst, _pc, emulate),
TP_STRUCT__entry(
__field( unsigned int, inst )
@@ -23,7 +23,7 @@ TRACE_EVENT(kvm_ppc_instr,
TP_fast_assign(
__entry->inst = inst;
- __entry->pc = pc;
+ __entry->pc = _pc;
__entry->emulate = emulate;
),
--
1.6.0.2
^ permalink raw reply related
* [PATCH 17/27] Make head_64.S aware of KVM real mode code
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
We need to run some KVM trampoline code in real mode. Unfortunately, real mode
only covers 8MB on Cell so we need to squeeze ourselves as low as possible.
Also, we need to trap interrupts to get us back from guest state to host state
without telling Linux about it.
This patch adds interrupt traps and includes the KVM code that requires real
mode in the real mode parts of Linux.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/exception-64s.h | 2 ++
arch/powerpc/kernel/exceptions-64s.S | 8 ++++++++
arch/powerpc/kernel/head_64.S | 7 +++++++
3 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index a98653b..57c4000 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -147,6 +147,7 @@
.globl label##_pSeries; \
label##_pSeries: \
HMT_MEDIUM; \
+ DO_KVM n; \
mtspr SPRN_SPRG_SCRATCH0,r13; /* save r13 */ \
EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common)
@@ -170,6 +171,7 @@ label##_pSeries: \
.globl label##_pSeries; \
label##_pSeries: \
HMT_MEDIUM; \
+ DO_KVM n; \
mtspr SPRN_SPRG_SCRATCH0,r13; /* save r13 */ \
mfspr r13,SPRN_SPRG_PACA; /* get paca address into r13 */ \
std r9,PACA_EXGEN+EX_R9(r13); /* save r9, r10 */ \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 1808876..fc3ead0 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -41,6 +41,7 @@ __start_interrupts:
. = 0x200
_machine_check_pSeries:
HMT_MEDIUM
+ DO_KVM 0x200
mtspr SPRN_SPRG_SCRATCH0,r13 /* save r13 */
EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
@@ -48,6 +49,7 @@ _machine_check_pSeries:
.globl data_access_pSeries
data_access_pSeries:
HMT_MEDIUM
+ DO_KVM 0x300
mtspr SPRN_SPRG_SCRATCH0,r13
BEGIN_FTR_SECTION
mfspr r13,SPRN_SPRG_PACA
@@ -77,6 +79,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_SLB)
.globl data_access_slb_pSeries
data_access_slb_pSeries:
HMT_MEDIUM
+ DO_KVM 0x380
mtspr SPRN_SPRG_SCRATCH0,r13
mfspr r13,SPRN_SPRG_PACA /* get paca address into r13 */
std r3,PACA_EXSLB+EX_R3(r13)
@@ -115,6 +118,7 @@ data_access_slb_pSeries:
.globl instruction_access_slb_pSeries
instruction_access_slb_pSeries:
HMT_MEDIUM
+ DO_KVM 0x480
mtspr SPRN_SPRG_SCRATCH0,r13
mfspr r13,SPRN_SPRG_PACA /* get paca address into r13 */
std r3,PACA_EXSLB+EX_R3(r13)
@@ -154,6 +158,7 @@ instruction_access_slb_pSeries:
.globl system_call_pSeries
system_call_pSeries:
HMT_MEDIUM
+ DO_KVM 0xc00
BEGIN_FTR_SECTION
cmpdi r0,0x1ebe
beq- 1f
@@ -186,12 +191,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
* trickery is thus necessary
*/
. = 0xf00
+ DO_KVM 0xf00
b performance_monitor_pSeries
. = 0xf20
+ DO_KVM 0xf20
b altivec_unavailable_pSeries
. = 0xf40
+ DO_KVM 0xf40
b vsx_unavailable_pSeries
#ifdef CONFIG_CBE_RAS
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index c38afdb..9258074 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -37,6 +37,7 @@
#include <asm/firmware.h>
#include <asm/page_64.h>
#include <asm/irqflags.h>
+#include <asm/kvm_book3s_64_asm.h>
/* The physical memory is layed out such that the secondary processor
* spin code sits at 0x0000...0x00ff. On server, the vectors follow
@@ -165,6 +166,12 @@ exception_marker:
#include "exceptions-64s.S"
#endif
+/* KVM trampoline code needs to be close to the interrupt handlers */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+#include "../kvm/book3s_64_rmhandlers.S"
+#endif
+
_GLOBAL(generic_secondary_thread_init)
mr r24,r3
--
1.6.0.2
^ permalink raw reply related
* [PATCH 21/27] Export KVM symbols for module
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
To be able to keep KVM as module, we need to export the SLB trampoline
addresses to the module, so it knows where to jump to.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_64_exports.c | 24 ++++++++++++++++++++++++
1 files changed, 24 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/kvm/book3s_64_exports.c
diff --git a/arch/powerpc/kvm/book3s_64_exports.c b/arch/powerpc/kvm/book3s_64_exports.c
new file mode 100644
index 0000000..5b2db38
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_exports.c
@@ -0,0 +1,24 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <linux/module.h>
+#include <asm/kvm_book3s.h>
+
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_enter);
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_lowmem);
--
1.6.0.2
^ permalink raw reply related
* [PATCH 27/27] Use hrtimers for the decrementer
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
Following S390's good example we should use hrtimers for the decrementer too!
This patch converts the timer from the old mechanism to hrtimers.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/kvm_host.h | 6 ++++--
arch/powerpc/kvm/emulate.c | 18 +++++++++++-------
arch/powerpc/kvm/powerpc.c | 20 ++++++++++++++++++--
3 files changed, 33 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 2cff5fe..1201f62 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -21,7 +21,8 @@
#define __POWERPC_KVM_HOST_H__
#include <linux/mutex.h>
-#include <linux/timer.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
#include <linux/types.h>
#include <linux/kvm_types.h>
#include <asm/kvm_asm.h>
@@ -250,7 +251,8 @@ struct kvm_vcpu_arch {
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
- struct timer_list dec_timer;
+ struct hrtimer dec_timer;
+ struct tasklet_struct tasklet;
u64 dec_jiffies;
unsigned long pending_exceptions;
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 1ec5e07..4a9ac66 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -18,7 +18,7 @@
*/
#include <linux/jiffies.h>
-#include <linux/timer.h>
+#include <linux/hrtimer.h>
#include <linux/types.h>
#include <linux/string.h>
#include <linux/kvm_host.h>
@@ -79,12 +79,13 @@ static int kvmppc_dec_enabled(struct kvm_vcpu *vcpu)
void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
{
- unsigned long nr_jiffies;
+ unsigned long dec_nsec;
+ pr_debug("mtDEC: %x\n", vcpu->arch.dec);
#ifdef CONFIG_PPC64
/* POWER4+ triggers a dec interrupt if the value is < 0 */
if (vcpu->arch.dec & 0x80000000) {
- del_timer(&vcpu->arch.dec_timer);
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
kvmppc_core_queue_dec(vcpu);
return;
}
@@ -94,12 +95,15 @@ void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
* that's how we convert the guest DEC value to the number of
* host ticks. */
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
+ dec_nsec = vcpu->arch.dec;
+ dec_nsec *= 1000;
+ dec_nsec /= tb_ticks_per_usec;
+ hrtimer_start(&vcpu->arch.dec_timer, ktime_set(0, dec_nsec),
+ HRTIMER_MODE_REL);
vcpu->arch.dec_jiffies = get_tb();
- nr_jiffies = vcpu->arch.dec / tb_ticks_per_jiffy;
- mod_timer(&vcpu->arch.dec_timer,
- get_jiffies_64() + nr_jiffies);
} else {
- del_timer(&vcpu->arch.dec_timer);
+ hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
}
}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4ae3490..4c582ed 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -23,6 +23,7 @@
#include <linux/kvm_host.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
+#include <linux/hrtimer.h>
#include <linux/fs.h>
#include <asm/cputable.h>
#include <asm/uaccess.h>
@@ -209,10 +210,25 @@ static void kvmppc_decrementer_func(unsigned long data)
}
}
+/*
+ * low level hrtimer wake routine. Because this runs in hardirq context
+ * we schedule a tasklet to do the real work.
+ */
+enum hrtimer_restart kvmppc_decrementer_wakeup(struct hrtimer *timer)
+{
+ struct kvm_vcpu *vcpu;
+
+ vcpu = container_of(timer, struct kvm_vcpu, arch.dec_timer);
+ tasklet_schedule(&vcpu->arch.tasklet);
+
+ return HRTIMER_NORESTART;
+}
+
int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
{
- setup_timer(&vcpu->arch.dec_timer, kvmppc_decrementer_func,
- (unsigned long)vcpu);
+ hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+ tasklet_init(&vcpu->arch.tasklet, kvmppc_decrementer_func, (ulong)vcpu);
+ vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup;
return 0;
}
--
1.6.0.2
^ permalink raw reply related
* [PATCH 19/27] Export symbols for KVM module
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
We want to be able to build KVM as a module. To enable us doing so, we
need some more exports from core Linux parts.
This patch exports all functions and variables that are required for KVM.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v3 -> v4:
- don't export switch_slb
- don't export init_context
- don't export mm_alloc
---
arch/powerpc/kernel/ppc_ksyms.c | 3 ++-
arch/powerpc/kernel/time.c | 1 +
arch/powerpc/mm/hash_utils_64.c | 2 ++
3 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index c8b27bb..baf778c 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -163,11 +163,12 @@ EXPORT_SYMBOL(screen_info);
#ifdef CONFIG_PPC32
EXPORT_SYMBOL(timer_interrupt);
EXPORT_SYMBOL(irq_desc);
-EXPORT_SYMBOL(tb_ticks_per_jiffy);
EXPORT_SYMBOL(cacheable_memcpy);
EXPORT_SYMBOL(cacheable_memzero);
#endif
+EXPORT_SYMBOL(tb_ticks_per_jiffy);
+
#ifdef CONFIG_PPC32
EXPORT_SYMBOL(switch_mmu_context);
#endif
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 92dc844..e05f6af 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -268,6 +268,7 @@ void account_system_vtime(struct task_struct *tsk)
per_cpu(cputime_scaled_last_delta, smp_processor_id()) = deltascaled;
local_irq_restore(flags);
}
+EXPORT_SYMBOL_GPL(account_system_vtime);
/*
* Transfer the user and system times accumulated in the paca
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1ade7eb..2b2a4aa 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -92,6 +92,7 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
struct hash_pte *htab_address;
unsigned long htab_size_bytes;
unsigned long htab_hash_mask;
+EXPORT_SYMBOL_GPL(htab_hash_mask);
int mmu_linear_psize = MMU_PAGE_4K;
int mmu_virtual_psize = MMU_PAGE_4K;
int mmu_vmalloc_psize = MMU_PAGE_4K;
@@ -102,6 +103,7 @@ int mmu_io_psize = MMU_PAGE_4K;
int mmu_kernel_ssize = MMU_SEGSIZE_256M;
int mmu_highuser_ssize = MMU_SEGSIZE_256M;
u16 mmu_slb_size = 64;
+EXPORT_SYMBOL_GPL(mmu_slb_size);
#ifdef CONFIG_HUGETLB_PAGE
unsigned int HPAGE_SHIFT;
#endif
--
1.6.0.2
^ permalink raw reply related
* [PATCH 18/27] Add Book3s_64 offsets to asm-offsets.c
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
We need to access some VCPU fields from assembly code. In order to get
the proper offsets, we have to define them in asm-offsets.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kernel/asm-offsets.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0812b0f..aba3ea6 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -398,6 +398,19 @@ int main(void)
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
+
+ /* book3s_64 */
+#ifdef CONFIG_PPC64
+ DEFINE(VCPU_FAULT_DSISR, offsetof(struct kvm_vcpu, arch.fault_dsisr));
+ DEFINE(VCPU_HOST_RETIP, offsetof(struct kvm_vcpu, arch.host_retip));
+ DEFINE(VCPU_HOST_R2, offsetof(struct kvm_vcpu, arch.host_r2));
+ DEFINE(VCPU_HOST_MSR, offsetof(struct kvm_vcpu, arch.host_msr));
+ DEFINE(VCPU_SHADOW_MSR, offsetof(struct kvm_vcpu, arch.shadow_msr));
+ DEFINE(VCPU_TRAMPOLINE_LOWMEM, offsetof(struct kvm_vcpu, arch.trampoline_lowmem));
+ DEFINE(VCPU_TRAMPOLINE_ENTER, offsetof(struct kvm_vcpu, arch.trampoline_enter));
+ DEFINE(VCPU_HIGHMEM_HANDLER, offsetof(struct kvm_vcpu, arch.highmem_handler));
+ DEFINE(VCPU_HFLAGS, offsetof(struct kvm_vcpu, arch.hflags));
+#endif
#endif
#ifdef CONFIG_44x
DEFINE(PGD_T_LOG2, PGD_T_LOG2);
--
1.6.0.2
^ permalink raw reply related
* [PATCH 20/27] Split init_new_context and destroy_context
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
For KVM we need to allocate a new context id, but don't really care about
all the mm context around it.
So let's split the alloc and destroy functions for the context id, so we can
grab one without allocating an mm context.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/mmu_context.h | 5 +++++
arch/powerpc/mm/mmu_context_hash64.c | 24 +++++++++++++++++++++---
2 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index b34e94d..66b35d0 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
extern void set_context(unsigned long id, pgd_t *pgd);
#ifdef CONFIG_PPC_BOOK3S_64
+extern int __init_new_context(void);
+extern void __destroy_context(int context_id);
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
static inline void mmu_context_init(void) { }
#else
extern void mmu_context_init(void);
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index dbeb86a..b9e4cc2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -18,6 +18,7 @@
#include <linux/mm.h>
#include <linux/spinlock.h>
#include <linux/idr.h>
+#include <linux/module.h>
#include <asm/mmu_context.h>
@@ -32,7 +33,7 @@ static DEFINE_IDR(mmu_context_idr);
#define NO_CONTEXT 0
#define MAX_CONTEXT ((1UL << 19) - 1)
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int __init_new_context(void)
{
int index;
int err;
@@ -57,6 +58,18 @@ again:
return -ENOMEM;
}
+ return index;
+}
+EXPORT_SYMBOL_GPL(__init_new_context);
+
+int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+ int index;
+
+ index = __init_new_context();
+ if (index < 0)
+ return index;
+
/* The old code would re-promote on fork, we don't do that
* when using slices as it could cause problem promoting slices
* that have been forced down to 4K
@@ -68,11 +81,16 @@ again:
return 0;
}
-void destroy_context(struct mm_struct *mm)
+void __destroy_context(int context_id)
{
spin_lock(&mmu_context_lock);
- idr_remove(&mmu_context_idr, mm->context.id);
+ idr_remove(&mmu_context_idr, context_id);
spin_unlock(&mmu_context_lock);
+}
+EXPORT_SYMBOL_GPL(__destroy_context);
+void destroy_context(struct mm_struct *mm)
+{
+ __destroy_context(mm->context.id);
mm->context.id = NO_CONTEXT;
}
--
1.6.0.2
^ permalink raw reply related
* [PATCH 22/27] Add fields to PACA
From: Alexander Graf @ 2009-10-30 15:47 UTC (permalink / raw)
To: kvm
Cc: Kevin Wolf, Arnd Bergmann, Hollis Blanchard, Marcelo Tosatti,
kvm-ppc, linuxppc-dev, Avi Kivity, bphilips, Olof Johansson
In-Reply-To: <1256917647-6200-1-git-send-email-agraf@suse.de>
For KVM we need to store some information in the PACA, so we
need to extend it.
This patch adds KVM SLB shadow related entries to the PACA and
a field that indicates if we're inside a guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/paca.h | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 7d8514c..5e9b4ef 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -129,6 +129,15 @@ struct paca_struct {
u64 system_time; /* accumulated system TB ticks */
u64 startpurr; /* PURR/TB value snapshot */
u64 startspurr; /* SPURR value snapshot */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+ struct {
+ u64 esid;
+ u64 vsid;
+ } kvm_slb[64]; /* guest SLB */
+ u8 kvm_slb_max; /* highest used guest slb entry */
+ u8 kvm_in_guest; /* are we inside the guest? */
+#endif
};
extern struct paca_struct paca[];
--
1.6.0.2
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox