* Re: [PATCH] powerpc/uprobes: teach uprobes to ignore gdb breakpoints
From: Oleg Nesterov @ 2013-03-20 12:43 UTC (permalink / raw)
To: Ananth N Mavinakayanahalli; +Cc: ppcdev, Srikar Dronamraju, stable
In-Reply-To: <20130320122639.GA29541@redhat.com>
On 03/20, Oleg Nesterov wrote:
>
> But we did not install UPROBE_SWBP_INSN. Is it fine? I hope yes, just to
> verify. If not, we need 2 definitions. is_uprobe_insn() should still check
> insns == UPROBE_SWBP_INSN, and is_swbp_insn() should check is_trap().
>
> And I am just curious, could you explain how X and UPROBE_SWBP_INSN
> differ?
IOW, if I wasn't clear... Lets forget about gdb/etc for the moment.
Suppose we apply the patch below. Will uprobes on powerpc work?
If yes, then your patch should be fine. If not, we probably need more
changes.
And, forgot to mention. If you change is_swbp_insn(), you can remove
is_trap() from arch_uprobe_analyze_insn().
Oleg.
--- x/arch/powerpc/include/asm/uprobes.h
+++ x/arch/powerpc/include/asm/uprobes.h
@@ -31,7 +31,7 @@ typedef ppc_opcode_t uprobe_opcode_t;
#define UPROBE_XOL_SLOT_BYTES (MAX_UINSN_BYTES)
/* The following alias is needed for reference from arch-agnostic code */
-#define UPROBE_SWBP_INSN BREAKPOINT_INSTRUCTION
+#define UPROBE_SWBP_INSN TRAP_INSN_USED_BY_GDB
#define UPROBE_SWBP_INSN_SIZE 4 /* swbp insn size in bytes */
struct arch_uprobe {
^ permalink raw reply
* Re: [PATCHv2 0/3] PCI Speed Cap Fixes for ppc64
From: Jerome Glisse @ 2013-03-20 15:34 UTC (permalink / raw)
To: Lucas Kannebley Tavares
Cc: linux-pci, dri-devel, Bjorn Helgaas, Thadeu Cascardo, Brian King,
Alex Deucher, linuxppc-dev
In-Reply-To: <1363757079-23550-1-git-send-email-lucaskt@linux.vnet.ibm.com>
On Wed, Mar 20, 2013 at 1:24 AM, Lucas Kannebley Tavares
<lucaskt@linux.vnet.ibm.com> wrote:
> This patch series first implements a function called pcie_get_speed_cap_mask
> in the PCI subsystem based off from drm_pcie_get_speed_cap_mask in drm. Then
> it removes the latter and fixes all references to it. And ultimately, it
> implements an architecture-specific version of the same function for ppc64.
>
> The refactor is done because the function that was used by drm is more
> architecture goo than module-specific. Whilst the function also needed a
> platform-specific implementation to get PCIE Gen2 speeds on ppc64.
Looks good to me but we probably want some one from the pci side to take a look
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
>
> Lucas Kannebley Tavares (3):
> pci: added pcie_get_speed_cap_mask function
> drm: removed drm_pcie_get_speed_cap_mask function
> ppc64: implemented pcibios_get_speed_cap_mask
>
> arch/powerpc/platforms/pseries/pci.c | 35 +++++++++++++++++++++++++++
> drivers/gpu/drm/drm_pci.c | 38 -----------------------------
> drivers/gpu/drm/radeon/evergreen.c | 5 ++-
> drivers/gpu/drm/radeon/r600.c | 5 ++-
> drivers/gpu/drm/radeon/rv770.c | 5 ++-
> drivers/pci/pci.c | 44 ++++++++++++++++++++++++++++++++++
> include/drm/drmP.h | 6 ----
> include/linux/pci.h | 6 ++++
> 8 files changed, 94 insertions(+), 50 deletions(-)
>
> --
> 1.7.4.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply
* Re: [PATCH] powerpc/uprobes: teach uprobes to ignore gdb breakpoints
From: Ananth N Mavinakayanahalli @ 2013-03-20 15:41 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: ppcdev, Srikar Dronamraju, stable
In-Reply-To: <20130320122639.GA29541@redhat.com>
On Wed, Mar 20, 2013 at 01:26:39PM +0100, Oleg Nesterov wrote:
> Hi Ananth,
>
> First of all, let me remind that I know nothing about powerpc ;)
>
> But iirc we already discussed this a bit, I forgot the details but
> still I have some concerns...
>
> On 03/20, Ananth N Mavinakayanahalli wrote:
> >
> > GDB uses a variant of the trap instruction that is different from the
> > one used by uprobes. Currently, running gdb on a program being traced
> > by uprobes causes an endless loop since uprobes doesn't understand
> > that the trap is inserted by some other entity and hence a SIGTRAP needs
> > to be delivered.
>
> Yes, and thus is_swbp_at_addr()->is_swbp_insn() called by handle_swbp()
> should be updated,
>
> > +bool is_swbp_insn(uprobe_opcode_t *insn)
> > +{
> > + return (is_trap(*insn));
> > +}
>
> And this patch should fix the problem. (and probably this is fine
> for prepare_uprobe()).
Correct
> But, at the same time, is the new definition fine for verify_opcode()?
>
> IOW, powerpc has another is_trap() insn(s) used by gdb, lets denote it X.
> X != UPROBE_SWBP_INSN.
>
> Suppose that gdb installs the trap X at some addr, and then uprobe_register()
> tries to install uprobe at the same address. Then set_swbp() will do nothing,
> assuming the uprobe was already installed.
>
> But we did not install UPROBE_SWBP_INSN. Is it fine? I hope yes, just to
> verify. If not, we need 2 definitions. is_uprobe_insn() should still check
> insns == UPROBE_SWBP_INSN, and is_swbp_insn() should check is_trap().
is_trap() checks for all trap variants on powerpc, including the one
uprobe uses. It returns true if the instruction is *any* trap variant.
So, install_breakpoint()->prepare_uprobe()->is_swbp_insn() will return
ENOTSUPP. In fact, arch_uprobe_analyze_insn() will also do the same.
This itself should take care of all the cases.
> And I am just curious, could you explain how X and UPROBE_SWBP_INSN
> differ?
Powerpc has numerous variants of the trap instruction based on
comparison of two registers or a regsiter and immediate value and a condition
(less than, greater than, [signed forms thereof], or equal to).
Uprobes uses 0x7fe0008 which is 'tw 31,0,0' which essentially is an
unconditional trap.
Gdb uses many traps, one of which is 0x7d821008 which is twge r2,r2,
which is basically trap if r2 greater than or equal to r2.
Ananth
^ permalink raw reply
* Re: [PATCH] powerpc/uprobes: teach uprobes to ignore gdb breakpoints
From: Ananth N Mavinakayanahalli @ 2013-03-20 15:42 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: ppcdev, Srikar Dronamraju, stable
In-Reply-To: <20130320124301.GA30887@redhat.com>
On Wed, Mar 20, 2013 at 01:43:01PM +0100, Oleg Nesterov wrote:
> On 03/20, Oleg Nesterov wrote:
> >
> > But we did not install UPROBE_SWBP_INSN. Is it fine? I hope yes, just to
> > verify. If not, we need 2 definitions. is_uprobe_insn() should still check
> > insns == UPROBE_SWBP_INSN, and is_swbp_insn() should check is_trap().
> >
> > And I am just curious, could you explain how X and UPROBE_SWBP_INSN
> > differ?
>
> IOW, if I wasn't clear... Lets forget about gdb/etc for the moment.
> Suppose we apply the patch below. Will uprobes on powerpc work?
>
> If yes, then your patch should be fine. If not, we probably need more
> changes.
Yes, it will work fine.
> And, forgot to mention. If you change is_swbp_insn(), you can remove
> is_trap() from arch_uprobe_analyze_insn().
Yeah, that check is no longer needed. I'll send a separate cleanup after
this patch gets applied.
Ananth
^ permalink raw reply
* Re: [PATCH] powerpc/uprobes: teach uprobes to ignore gdb breakpoints
From: Oleg Nesterov @ 2013-03-20 16:06 UTC (permalink / raw)
To: Ananth N Mavinakayanahalli; +Cc: ppcdev, Srikar Dronamraju, stable
In-Reply-To: <20130320154152.GA8246@in.ibm.com>
On 03/20, Ananth N Mavinakayanahalli wrote:
>
> On Wed, Mar 20, 2013 at 01:26:39PM +0100, Oleg Nesterov wrote:
> > But, at the same time, is the new definition fine for verify_opcode()?
> >
> > IOW, powerpc has another is_trap() insn(s) used by gdb, lets denote it X.
> > X != UPROBE_SWBP_INSN.
> >
> > Suppose that gdb installs the trap X at some addr, and then uprobe_register()
> > tries to install uprobe at the same address. Then set_swbp() will do nothing,
> > assuming the uprobe was already installed.
> >
> > But we did not install UPROBE_SWBP_INSN. Is it fine? I hope yes, just to
> > verify. If not, we need 2 definitions. is_uprobe_insn() should still check
> > insns == UPROBE_SWBP_INSN, and is_swbp_insn() should check is_trap().
>
> is_trap() checks for all trap variants on powerpc, including the one
> uprobe uses. It returns true if the instruction is *any* trap variant.
I understand,
> So, install_breakpoint()->prepare_uprobe()->is_swbp_insn() will return
> ENOTSUPP. In fact, arch_uprobe_analyze_insn() will also do the same.
Yes and the check in arch_uprobe_analyze_insn() should go away.
But you missed my point. Please forget about prepare_uprobe(), it is
wrong anyway. And, prepare_uprobe() inspects the _original_ insn from
the file, this has nothing install_breakpoint/etc.
I meant verify_opcode() called by install_breakpoint/etc.
> This itself should take care of all the cases.
>
> > And I am just curious, could you explain how X and UPROBE_SWBP_INSN
> > differ?
>
> Powerpc has numerous variants of the trap instruction based on
> comparison of two registers or a regsiter and immediate value and a condition
> (less than, greater than, [signed forms thereof], or equal to).
>
> Uprobes uses 0x7fe0008 which is 'tw 31,0,0' which essentially is an
> unconditional trap.
>
> Gdb uses many traps, one of which is 0x7d821008 which is twge r2,r2,
> which is basically trap if r2 greater than or equal to r2.
OK. So, if I understand correctly, gdb can use some conditional
breakpoint, and it is possible that this insn won't generate the
trap?
Then this patch is not right, or at least we need another change
on top?
Once again. Suppose that gdb installs the TRAP_IF_R1_GT_R2.
After that uprobe_register() is called, but it won't change this
insn because verify_opcode() returns 0.
Then the probed task hits this breakoint with "r1 < r2" and we do
not report this event.
So. I still think that we actually need something like below, and
powerpc should reimplement is_trap_insn() to use is_trap(insn).
No?
Oleg.
--- x/kernel/events/uprobes.c
+++ x/kernel/events/uprobes.c
@@ -532,6 +532,12 @@ static int copy_insn(struct uprobe *upro
return __copy_insn(mapping, filp, uprobe->arch.insn, bytes, uprobe->offset);
}
+bool __weak is_trap_insn(uprobe_opcode_t *insn)
+{
+ // powerpc: should use is_trap()
+ return is_swbp_insn(insn);
+}
+
static int prepare_uprobe(struct uprobe *uprobe, struct file *file,
struct mm_struct *mm, unsigned long vaddr)
{
@@ -550,7 +556,7 @@ static int prepare_uprobe(struct uprobe
goto out;
ret = -ENOTSUPP;
- if (is_swbp_insn((uprobe_opcode_t *)uprobe->arch.insn))
+ if (is_trap_insn((uprobe_opcode_t *)uprobe->arch.insn))
goto out;
ret = arch_uprobe_analyze_insn(&uprobe->arch, mm, vaddr);
@@ -1452,7 +1458,7 @@ static int is_swbp_at_addr(struct mm_str
copy_opcode(page, vaddr, &opcode);
put_page(page);
out:
- return is_swbp_insn(&opcode);
+ return is_trap_insn(&opcode);
}
static struct uprobe *find_active_uprobe(unsigned long bp_vaddr, int *is_swbp)
^ permalink raw reply
* Re: [PATCH] powerpc/uprobes: teach uprobes to ignore gdb breakpoints
From: Oleg Nesterov @ 2013-03-20 16:07 UTC (permalink / raw)
To: Ananth N Mavinakayanahalli; +Cc: ppcdev, Srikar Dronamraju, stable
In-Reply-To: <20130320154245.GB8246@in.ibm.com>
On 03/20, Ananth N Mavinakayanahalli wrote:
>
> On Wed, Mar 20, 2013 at 01:43:01PM +0100, Oleg Nesterov wrote:
> > On 03/20, Oleg Nesterov wrote:
> > >
> > > But we did not install UPROBE_SWBP_INSN. Is it fine? I hope yes, just to
> > > verify. If not, we need 2 definitions. is_uprobe_insn() should still check
> > > insns == UPROBE_SWBP_INSN, and is_swbp_insn() should check is_trap().
> > >
> > > And I am just curious, could you explain how X and UPROBE_SWBP_INSN
> > > differ?
> >
> > IOW, if I wasn't clear... Lets forget about gdb/etc for the moment.
> > Suppose we apply the patch below. Will uprobes on powerpc work?
> >
> > If yes, then your patch should be fine. If not, we probably need more
> > changes.
>
> Yes, it will work fine.
Even if this new insn is conditional?
Oleg.
^ permalink raw reply
* [PATCH 1/3] powerpc/512x: move mpc5121_generic platform to mpc512x_generic.
From: Matteo Facchinetti @ 2013-03-20 17:41 UTC (permalink / raw)
To: linuxppc-dev; +Cc: gregkh, Matteo Facchinetti, agust
In-Reply-To: <1363801314-16967-1-git-send-email-matteo.facchinetti@sirius-es.it>
This provides a base for using 512x_generic platform on mpc5125 boards.
By this way 512x_GENERIC it could be used for all generic mpc512x boards
and kernel could be compiled with mpc512x_defconfig.
Signed-off-by: Matteo Facchinetti <matteo.facchinetti@sirius-es.it>
---
arch/powerpc/configs/mpc512x_defconfig | 2 +-
arch/powerpc/platforms/512x/Kconfig | 8 ++++----
arch/powerpc/platforms/512x/Makefile | 2 +-
.../platforms/512x/{mpc5121_generic.c => mpc512x_generic.c} | 0
4 files changed, 6 insertions(+), 6 deletions(-)
rename arch/powerpc/platforms/512x/{mpc5121_generic.c => mpc512x_generic.c} (100%)
diff --git a/arch/powerpc/configs/mpc512x_defconfig b/arch/powerpc/configs/mpc512x_defconfig
index 211fcc9..0d0d981 100644
--- a/arch/powerpc/configs/mpc512x_defconfig
+++ b/arch/powerpc/configs/mpc512x_defconfig
@@ -13,7 +13,7 @@ CONFIG_MODULE_UNLOAD=y
# CONFIG_PPC_CHRP is not set
CONFIG_PPC_MPC512x=y
CONFIG_MPC5121_ADS=y
-CONFIG_MPC5121_GENERIC=y
+CONFIG_MPC512x_GENERIC=y
CONFIG_PDM360NG=y
# CONFIG_PPC_PMAC is not set
CONFIG_NO_HZ=y
diff --git a/arch/powerpc/platforms/512x/Kconfig b/arch/powerpc/platforms/512x/Kconfig
index c169998..eb46203 100644
--- a/arch/powerpc/platforms/512x/Kconfig
+++ b/arch/powerpc/platforms/512x/Kconfig
@@ -15,16 +15,16 @@ config MPC5121_ADS
help
This option enables support for the MPC5121E ADS board.
-config MPC5121_GENERIC
- bool "Generic support for simple MPC5121 based boards"
+config MPC512x_GENERIC
+ bool "Generic support for simple MPC512x based boards"
depends on PPC_MPC512x
select DEFAULT_UIMAGE
help
- This option enables support for simple MPC5121 based boards
+ This option enables support for simple MPC512x based boards
which do not need custom platform specific setup.
Compatible boards include: Protonic LVT base boards (ZANMCU
- and VICVT2).
+ and VICVT2), Freescale MPC5125 Tower system.
config PDM360NG
bool "ifm PDM360NG board"
diff --git a/arch/powerpc/platforms/512x/Makefile b/arch/powerpc/platforms/512x/Makefile
index 4efc1c4..72fb934 100644
--- a/arch/powerpc/platforms/512x/Makefile
+++ b/arch/powerpc/platforms/512x/Makefile
@@ -3,5 +3,5 @@
#
obj-y += clock.o mpc512x_shared.o
obj-$(CONFIG_MPC5121_ADS) += mpc5121_ads.o mpc5121_ads_cpld.o
-obj-$(CONFIG_MPC5121_GENERIC) += mpc5121_generic.o
+obj-$(CONFIG_MPC512x_GENERIC) += mpc512x_generic.o
obj-$(CONFIG_PDM360NG) += pdm360ng.o
diff --git a/arch/powerpc/platforms/512x/mpc5121_generic.c b/arch/powerpc/platforms/512x/mpc512x_generic.c
similarity index 100%
rename from arch/powerpc/platforms/512x/mpc5121_generic.c
rename to arch/powerpc/platforms/512x/mpc512x_generic.c
--
1.7.10.4
^ permalink raw reply related
* [PATCH 0/3] Add MPC5125 platform support
From: Matteo Facchinetti @ 2013-03-20 17:41 UTC (permalink / raw)
To: linuxppc-dev; +Cc: gregkh, Matteo Facchinetti, agust
Add MPC5125 SoC in mpc512x family.
Inconvenient of MPC5125 SoC is to have differences in certain register
access between the others in the same family.
This patches add MPC5125 SoC unless adding configuration macro but
using general MPC512x kernel configuration.
Tested on Freescale MPC5125 tower system evaluation board.
Matteo Facchinetti (3):
powerpc/512x: move mpc5121_generic platform to mpc512x_generic.
serial/mpc52xx_uart: add PSC UART support for MPC5125 platforms.
powerpc/mpc512x: add platform code for MPC5125.
arch/powerpc/boot/dts/mpc5125twr.dts | 238 ++++++++++++
arch/powerpc/configs/mpc512x_defconfig | 2 +-
arch/powerpc/include/asm/mpc52xx_psc.h | 49 +++
arch/powerpc/platforms/512x/Kconfig | 8 +-
arch/powerpc/platforms/512x/Makefile | 2 +-
arch/powerpc/platforms/512x/clock.c | 9 +-
arch/powerpc/platforms/512x/mpc512x.h | 1 +
.../512x/{mpc5121_generic.c => mpc512x_generic.c} | 1 +
arch/powerpc/platforms/512x/mpc512x_shared.c | 24 +-
drivers/tty/serial/mpc52xx_uart.c | 407 ++++++++++++++++++--
10 files changed, 691 insertions(+), 50 deletions(-)
create mode 100644 arch/powerpc/boot/dts/mpc5125twr.dts
rename arch/powerpc/platforms/512x/{mpc5121_generic.c => mpc512x_generic.c} (98%)
--
1.7.10.4
^ permalink raw reply
* [PATCH 3/3] powerpc/mpc512x: add platform code for MPC5125.
From: Matteo Facchinetti @ 2013-03-20 17:41 UTC (permalink / raw)
To: linuxppc-dev; +Cc: gregkh, Matteo Facchinetti, agust
In-Reply-To: <1363801314-16967-1-git-send-email-matteo.facchinetti@sirius-es.it>
Tested on MPC5125 Tower evaluation board with
mpc512x_defconfig compile configuration.
In detail, supports for:
- PSC / UART
- RTC
- ETH
- DIU
- I2C
Signed-off-by: Matteo Facchinetti <matteo.facchinetti@sirius-es.it>
---
arch/powerpc/boot/dts/mpc5125twr.dts | 238 +++++++++++++++++++++++++
arch/powerpc/platforms/512x/clock.c | 9 +-
arch/powerpc/platforms/512x/mpc512x.h | 1 +
arch/powerpc/platforms/512x/mpc512x_generic.c | 1 +
arch/powerpc/platforms/512x/mpc512x_shared.c | 24 ++-
5 files changed, 271 insertions(+), 2 deletions(-)
create mode 100644 arch/powerpc/boot/dts/mpc5125twr.dts
diff --git a/arch/powerpc/boot/dts/mpc5125twr.dts b/arch/powerpc/boot/dts/mpc5125twr.dts
new file mode 100644
index 0000000..afcad7a
--- /dev/null
+++ b/arch/powerpc/boot/dts/mpc5125twr.dts
@@ -0,0 +1,238 @@
+/*
+ * STx/Freescale ADS5125 MPC5125 silicon
+ *
+ * Copyright (C) 2009 Freescale Semiconductor Inc. All rights reserved.
+ *
+ * Reworked by Matteo Facchinetti (engineering@sirius-es.it)
+ * Copyright (C) 2013 Sirius Electronic Systems
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+/dts-v1/;
+
+/ {
+ model = "mpc5125twr"; // In BSP "mpc5125ads"
+ compatible = "fsl,mpc5125ads";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ interrupt-parent = <&ipic>;
+
+ aliases {
+ gpio0 = &gpio0;
+ gpio1 = &gpio1;
+ ethernet0 = ð0;
+ };
+
+ cpus {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ PowerPC,5125@0 {
+ device_type = "cpu";
+ reg = <0>;
+ d-cache-line-size = <0x20>; // 32 bytes
+ i-cache-line-size = <0x20>; // 32 bytes
+ d-cache-size = <0x8000>; // L1, 32K
+ i-cache-size = <0x8000>; // L1, 32K
+ timebase-frequency = <49500000>;// 49.5 MHz (csb/4)
+ bus-frequency = <198000000>; // 198 MHz csb bus
+ clock-frequency = <396000000>; // 396 MHz ppc core
+ };
+ };
+
+ memory {
+ device_type = "memory";
+ reg = <0x00000000 0x10000000>; // 256MB at 0
+ };
+
+ sram@30000000 {
+ compatible = "fsl,mpc5121-sram";
+ reg = <0x30000000 0x08000>; // 32K at 0x30000000
+ };
+
+ soc@80000000 {
+ compatible = "fsl,mpc5121-immr";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ #interrupt-cells = <2>;
+ ranges = <0x0 0x80000000 0x400000>;
+ reg = <0x80000000 0x400000>;
+ bus-frequency = <66000000>; // 66 MHz ips bus
+
+ // IPIC
+ // interrupts cell = <intr #, sense>
+ // sense values match linux IORESOURCE_IRQ_* defines:
+ // sense == 8: Level, low assertion
+ // sense == 2: Edge, high-to-low change
+ //
+ ipic: interrupt-controller@c00 {
+ compatible = "fsl,mpc5121-ipic", "fsl,ipic";
+ interrupt-controller;
+ #address-cells = <0>;
+ #interrupt-cells = <2>;
+ reg = <0xc00 0x100>;
+ };
+
+ rtc@a00 { // Real time clock
+ compatible = "fsl,mpc5121-rtc";
+ reg = <0xa00 0x100>;
+ interrupts = <79 0x8 80 0x8>;
+ };
+
+ reset@e00 { // Reset module
+ compatible = "fsl,mpc5121-reset";
+ reg = <0xe00 0x100>;
+ };
+
+ clock@f00 { // Clock control
+ compatible = "fsl,mpc5121-clock";
+ reg = <0xf00 0x100>;
+ };
+
+ pmc@1000{ // Power Management Controller
+ compatible = "fsl,mpc5121-pmc";
+ reg = <0x1000 0x100>;
+ interrupts = <83 0x2>;
+ };
+
+ gpio0: gpio@1100 {
+ compatible = "fsl,mpc5125-gpio";
+ reg = <0x1100 0x080>;
+ interrupts = <78 0x8>;
+ };
+
+ gpio1: gpio@1180 {
+ compatible = "fsl,mpc5125-gpio";
+ reg = <0x1180 0x080>;
+ interrupts = <86 0x8>;
+ };
+
+ can@1300 { // CAN rev.2
+ compatible = "fsl,mpc5121-mscan";
+ interrupts = <12 0x8>;
+ reg = <0x1300 0x80>;
+ };
+
+ can@1380 {
+ compatible = "fsl,mpc5121-mscan";
+ interrupts = <13 0x8>;
+ reg = <0x1380 0x80>;
+ };
+
+ sdhc@1500 {
+ compatible = "fsl,mpc5121-sdhc";
+ interrupts = <8 0x8>;
+ reg = <0x1500 0x100>;
+ };
+
+ i2c@1700 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ compatible = "fsl-i2c";
+ reg = <0x1700 0x20>;
+ interrupts = <0x9 0x8>;
+ };
+
+ i2c@1720 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ compatible = "fsl-i2c";
+ reg = <0x1720 0x20>;
+ interrupts = <0xa 0x8>;
+ };
+
+ i2c@1740 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ compatible = "fsl-i2c";
+ reg = <0x1740 0x20>;
+ interrupts = <0xb 0x8>;
+ };
+
+ i2ccontrol@1760 {
+ compatible = "fsl,mpc5121-i2c-ctrl";
+ reg = <0x1760 0x8>;
+ };
+
+ diu@2100 {
+ device_type = "display";
+ compatible = "fsl,mpc5121-diu";
+ reg = <0x2100 0x100>;
+ interrupts = <64 0x8>;
+ };
+
+ mdio@2800 {
+ compatible = "fsl,mpc5121-fec-mdio";
+ reg = <0x2800 0x800>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ phy0: ethernet-phy@0 {
+ reg = <1>;
+ };
+ };
+
+ eth0: ethernet@2800 {
+ compatible = "fsl,mpc5125-fec";
+ reg = <0x2800 0x800>;
+ local-mac-address = [ 00 00 00 00 00 00 ];
+ interrupts = <4 0x8>;
+ phy-handle = < &phy0 >;
+ phy-connection-type = "rmii";
+ };
+
+ // IO control
+ ioctl@a000 {
+ compatible = "fsl,mpc5125-ioctl";
+ reg = <0xA000 0x1000>;
+ };
+
+ usb@3000 {
+ compatible = "fsl,mpc5121-usb2-dr";
+ reg = <0x3000 0x400>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ interrupts = <43 0x8>;
+ dr_mode = "host";
+ phy_type = "ulpi";
+ };
+
+ // 5125 PSCs are not 52xx or 5121 PSC compatible
+ // PSC1 uart0 aka ttyPSC0
+ serial@11100 {
+ device_type = "serial";
+ compatible = "fsl,mpc5125-psc-uart", "fsl,mpc5125-psc";
+ port-number = <0>;
+ reg = <0x11100 0x100>;
+ interrupts = <40 0x8>;
+ fsl,rx-fifo-size = <16>;
+ fsl,tx-fifo-size = <16>;
+ };
+
+ // PSC9 uart1 aka ttyPSC1
+ serial@11900 {
+ device_type = "serial";
+ compatible = "fsl,mpc5125-psc-uart", "fsl,mpc5125-psc";
+ port-number = <1>;
+ reg = <0x11900 0x100>;
+ interrupts = <40 0x8>;
+ fsl,rx-fifo-size = <16>;
+ fsl,tx-fifo-size = <16>;
+ };
+
+ pscfifo@11f00 {
+ compatible = "fsl,mpc5121-psc-fifo";
+ reg = <0x11f00 0x100>;
+ interrupts = <40 0x8>;
+ };
+
+ dma@14000 {
+ compatible = "fsl,mpc5121-dma"; // BSP name: "mpc512x-dma2"
+ reg = <0x14000 0x1800>;
+ interrupts = <65 0x8>;
+ };
+ };
+};
diff --git a/arch/powerpc/platforms/512x/clock.c b/arch/powerpc/platforms/512x/clock.c
index 52d57d2..a8987dc2 100644
--- a/arch/powerpc/platforms/512x/clock.c
+++ b/arch/powerpc/platforms/512x/clock.c
@@ -29,6 +29,8 @@
#include <asm/mpc5121.h>
#include <asm/clk_interface.h>
+#include "mpc512x.h"
+
#undef CLK_DEBUG
static int clocks_initialized;
@@ -683,8 +685,13 @@ static void psc_clks_init(void)
struct device_node *np;
struct platform_device *ofdev;
u32 reg;
+ char *psc_compat;
+
+ psc_compat = mpc512x_select_psc_compat();
+ if (!psc_compat)
+ return;
- for_each_compatible_node(np, NULL, "fsl,mpc5121-psc") {
+ for_each_compatible_node(np, NULL, psc_compat) {
if (!of_property_read_u32(np, "reg", ®)) {
int pscnum = (reg & 0xf00) >> 8;
struct clk *clk = psc_dev_clk(pscnum);
diff --git a/arch/powerpc/platforms/512x/mpc512x.h b/arch/powerpc/platforms/512x/mpc512x.h
index c32b399..2b97622 100644
--- a/arch/powerpc/platforms/512x/mpc512x.h
+++ b/arch/powerpc/platforms/512x/mpc512x.h
@@ -15,6 +15,7 @@ extern void __init mpc512x_init_IRQ(void);
extern void __init mpc512x_init(void);
extern int __init mpc5121_clk_init(void);
void __init mpc512x_declare_of_platform_devices(void);
+extern char *mpc512x_select_psc_compat(void);
extern void mpc512x_restart(char *cmd);
#if defined(CONFIG_FB_FSL_DIU) || defined(CONFIG_FB_FSL_DIU_MODULE)
diff --git a/arch/powerpc/platforms/512x/mpc512x_generic.c b/arch/powerpc/platforms/512x/mpc512x_generic.c
index ca1ca66..1ca7b61 100644
--- a/arch/powerpc/platforms/512x/mpc512x_generic.c
+++ b/arch/powerpc/platforms/512x/mpc512x_generic.c
@@ -28,6 +28,7 @@
*/
static const char * const board[] __initconst = {
"prt,prtlvt",
+ "fsl,mpc5125ads",
NULL
};
diff --git a/arch/powerpc/platforms/512x/mpc512x_shared.c b/arch/powerpc/platforms/512x/mpc512x_shared.c
index d30235b..fbf67e9 100644
--- a/arch/powerpc/platforms/512x/mpc512x_shared.c
+++ b/arch/powerpc/platforms/512x/mpc512x_shared.c
@@ -350,6 +350,21 @@ void __init mpc512x_declare_of_platform_devices(void)
#define DEFAULT_FIFO_SIZE 16
+char *mpc512x_select_psc_compat(void)
+{
+ char *psc_compats[] = {
+ "fsl,mpc5121-psc",
+ "fsl,mpc5125-psc"
+ };
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(psc_compats); i++)
+ if (of_find_compatible_node(NULL, NULL, psc_compats[i]))
+ return psc_compats[i];
+
+ return NULL;
+}
+
static unsigned int __init get_fifo_size(struct device_node *np,
char *prop_name)
{
@@ -375,9 +390,16 @@ void __init mpc512x_psc_fifo_init(void)
void __iomem *psc;
unsigned int tx_fifo_size;
unsigned int rx_fifo_size;
+ char *psc_compat;
int fifobase = 0; /* current fifo address in 32 bit words */
- for_each_compatible_node(np, NULL, "fsl,mpc5121-psc") {
+ psc_compat = mpc512x_select_psc_compat();
+ if (!psc_compat) {
+ pr_err("%s: no compatible devices found\n", __func__);
+ return;
+ }
+
+ for_each_compatible_node(np, NULL, psc_compat) {
tx_fifo_size = get_fifo_size(np, "fsl,tx-fifo-size");
rx_fifo_size = get_fifo_size(np, "fsl,rx-fifo-size");
--
1.7.10.4
^ permalink raw reply related
* [PATCH 2/3] serial/mpc52xx_uart: add PSC UART support for MPC5125 platforms.
From: Matteo Facchinetti @ 2013-03-20 17:41 UTC (permalink / raw)
To: linuxppc-dev; +Cc: gregkh, Matteo Facchinetti, agust
In-Reply-To: <1363801314-16967-1-git-send-email-matteo.facchinetti@sirius-es.it>
MPC5125 PSC controller has different registers than MPC5121.
This patch was originally created by Vladimir Ermakov
https://lists.ozlabs.org/pipermail/linuxppc-dev/2011-March/088954.html
Signed-off-by: Vladimir Ermakov <vooon341@gmail.com>
Signed-off-by: Matteo Facchinetti <matteo.facchinetti@sirius-es.it>
---
arch/powerpc/include/asm/mpc52xx_psc.h | 49 ++++
drivers/tty/serial/mpc52xx_uart.c | 407 ++++++++++++++++++++++++++++----
2 files changed, 414 insertions(+), 42 deletions(-)
diff --git a/arch/powerpc/include/asm/mpc52xx_psc.h b/arch/powerpc/include/asm/mpc52xx_psc.h
index 2966df6..d0ece25 100644
--- a/arch/powerpc/include/asm/mpc52xx_psc.h
+++ b/arch/powerpc/include/asm/mpc52xx_psc.h
@@ -299,4 +299,53 @@ struct mpc512x_psc_fifo {
#define rxdata_32 rxdata.rxdata_32
};
+struct mpc5125_psc {
+ u8 mr1; /* PSC + 0x00 */
+ u8 reserved0[3];
+ u8 mr2; /* PSC + 0x04 */
+ u8 reserved1[3];
+ struct {
+ u16 status; /* PSC + 0x08 */
+ u8 reserved2[2];
+ u8 clock_select; /* PSC + 0x0c */
+ u8 reserved3[3];
+ } sr_csr;
+ u8 command; /* PSC + 0x10 */
+ u8 reserved4[3];
+ union { /* PSC + 0x14 */
+ u8 buffer_8;
+ u16 buffer_16;
+ u32 buffer_32;
+ } buffer;
+ struct {
+ u8 ipcr; /* PSC + 0x18 */
+ u8 reserved5[3];
+ u8 acr; /* PSC + 0x1c */
+ u8 reserved6[3];
+ } ipcr_acr;
+ struct {
+ u16 isr; /* PSC + 0x20 */
+ u8 reserved7[2];
+ u16 imr; /* PSC + 0x24 */
+ u8 reserved8[2];
+ } isr_imr;
+ u8 ctur; /* PSC + 0x28 */
+ u8 reserved9[3];
+ u8 ctlr; /* PSC + 0x2c */
+ u8 reserved10[3];
+ u32 ccr; /* PSC + 0x30 */
+ u32 ac97slots; /* PSC + 0x34 */
+ u32 ac97cmd; /* PSC + 0x38 */
+ u32 ac97data; /* PSC + 0x3c */
+ u8 reserved11[4];
+ u8 ip; /* PSC + 0x44 */
+ u8 reserved12[3];
+ u8 op1; /* PSC + 0x48 */
+ u8 reserved13[3];
+ u8 op0; /* PSC + 0x4c */
+ u8 reserved14[3];
+ u32 sicr; /* PSC + 0x50 */
+ u8 reserved15[4]; /* make eq. sizeof(mpc52xx_psc) */
+};
+
#endif /* __ASM_MPC52xx_PSC_H__ */
diff --git a/drivers/tty/serial/mpc52xx_uart.c b/drivers/tty/serial/mpc52xx_uart.c
index 018bad9..3f77b9d 100644
--- a/drivers/tty/serial/mpc52xx_uart.c
+++ b/drivers/tty/serial/mpc52xx_uart.c
@@ -122,6 +122,15 @@ struct psc_ops {
void (*fifoc_uninit)(void);
void (*get_irq)(struct uart_port *, struct device_node *);
irqreturn_t (*handle_irq)(struct uart_port *port);
+ u16 (*get_status)(struct uart_port *port);
+ u8 (*get_ipcr)(struct uart_port *port);
+ void (*command)(struct uart_port *port, u8 cmd);
+ void (*set_mode)(struct uart_port *port, u8 mr1, u8 mr2);
+ void (*set_rts)(struct uart_port *port, int state);
+ void (*enable_ms)(struct uart_port *port);
+ void (*set_sicr)(struct uart_port *port, u32 val);
+ void (*set_imr)(struct uart_port *port, u16 val);
+ u8 (*get_mr1)(struct uart_port *port);
};
/* setting the prescaler and divisor reg is common for all chips */
@@ -134,6 +143,74 @@ static inline void mpc52xx_set_divisor(struct mpc52xx_psc __iomem *psc,
out_8(&psc->ctlr, divisor & 0xff);
}
+static inline void mpc5125_set_divisor(struct mpc5125_psc __iomem *psc,
+ u8 prescaler, unsigned int divisor)
+{
+ /* select prescaler */
+ out_8(&psc->mpc52xx_psc_clock_select, prescaler);
+ out_8(&psc->ctur, divisor >> 8);
+ out_8(&psc->ctlr, divisor & 0xff);
+}
+
+static u16 mpc52xx_psc_get_status(struct uart_port *port)
+{
+ return in_be16(&PSC(port)->mpc52xx_psc_status);
+}
+
+static u8 mpc52xx_psc_get_ipcr(struct uart_port *port)
+{
+ return in_8(&PSC(port)->mpc52xx_psc_ipcr);
+}
+
+static void mpc52xx_psc_command(struct uart_port *port, u8 cmd)
+{
+ out_8(&PSC(port)->command, cmd);
+}
+
+static void mpc52xx_psc_set_mode(struct uart_port *port, u8 mr1, u8 mr2)
+{
+ out_8(&PSC(port)->command, MPC52xx_PSC_SEL_MODE_REG_1);
+ out_8(&PSC(port)->mode, mr1);
+ out_8(&PSC(port)->mode, mr2);
+}
+
+static void mpc52xx_psc_set_rts(struct uart_port *port, int state)
+{
+ if (state)
+ out_8(&PSC(port)->op1, MPC52xx_PSC_OP_RTS);
+ else
+ out_8(&PSC(port)->op0, MPC52xx_PSC_OP_RTS);
+}
+
+static void mpc52xx_psc_enable_ms(struct uart_port *port)
+{
+ struct mpc52xx_psc __iomem *psc = PSC(port);
+
+ /* clear D_*-bits by reading them */
+ in_8(&psc->mpc52xx_psc_ipcr);
+ /* enable CTS and DCD as IPC interrupts */
+ out_8(&psc->mpc52xx_psc_acr, MPC52xx_PSC_IEC_CTS | MPC52xx_PSC_IEC_DCD);
+
+ port->read_status_mask |= MPC52xx_PSC_IMR_IPC;
+ out_be16(&psc->mpc52xx_psc_imr, port->read_status_mask);
+}
+
+static void mpc52xx_psc_set_sicr(struct uart_port *port, u32 val)
+{
+ out_be32(&PSC(port)->sicr, val);
+}
+
+static void mpc52xx_psc_set_imr(struct uart_port *port, u16 val)
+{
+ out_be16(&PSC(port)->mpc52xx_psc_imr, val);
+}
+
+static u8 mpc52xx_psc_get_mr1(struct uart_port *port)
+{
+ out_8(&PSC(port)->command, MPC52xx_PSC_SEL_MODE_REG_1);
+ return in_8(&PSC(port)->mode);
+}
+
#ifdef CONFIG_PPC_MPC52xx
#define FIFO_52xx(port) ((struct mpc52xx_psc_fifo __iomem *)(PSC(port)+1))
static void mpc52xx_psc_fifo_init(struct uart_port *port)
@@ -304,6 +381,15 @@ static struct psc_ops mpc52xx_psc_ops = {
.set_baudrate = mpc5200_psc_set_baudrate,
.get_irq = mpc52xx_psc_get_irq,
.handle_irq = mpc52xx_psc_handle_irq,
+ .get_status = mpc52xx_psc_get_status,
+ .get_ipcr = mpc52xx_psc_get_ipcr,
+ .command = mpc52xx_psc_command,
+ .set_mode = mpc52xx_psc_set_mode,
+ .set_rts = mpc52xx_psc_set_rts,
+ .enable_ms = mpc52xx_psc_enable_ms,
+ .set_sicr = mpc52xx_psc_set_sicr,
+ .set_imr = mpc52xx_psc_set_imr,
+ .get_mr1 = mpc52xx_psc_get_mr1,
};
static struct psc_ops mpc5200b_psc_ops = {
@@ -325,6 +411,15 @@ static struct psc_ops mpc5200b_psc_ops = {
.set_baudrate = mpc5200b_psc_set_baudrate,
.get_irq = mpc52xx_psc_get_irq,
.handle_irq = mpc52xx_psc_handle_irq,
+ .get_status = mpc52xx_psc_get_status,
+ .get_ipcr = mpc52xx_psc_get_ipcr,
+ .command = mpc52xx_psc_command,
+ .set_mode = mpc52xx_psc_set_mode,
+ .set_rts = mpc52xx_psc_set_rts,
+ .enable_ms = mpc52xx_psc_enable_ms,
+ .set_sicr = mpc52xx_psc_set_sicr,
+ .set_imr = mpc52xx_psc_set_imr,
+ .get_mr1 = mpc52xx_psc_get_mr1,
};
#endif /* CONFIG_MPC52xx */
@@ -573,6 +668,242 @@ static void mpc512x_psc_get_irq(struct uart_port *port, struct device_node *np)
port->irq = psc_fifoc_irq;
}
+#endif
+
+#ifdef CONFIG_PPC_MPC512x
+
+#define PSC_5125(port) ((struct mpc5125_psc __iomem *)((port)->membase))
+#define FIFO_5125(port) ((struct mpc512x_psc_fifo __iomem *)(PSC_5125(port)+1))
+
+static void mpc5125_psc_fifo_init(struct uart_port *port)
+{
+ /* /32 prescaler */
+ out_8(&PSC_5125(port)->mpc52xx_psc_clock_select, 0xdd);
+
+ out_be32(&FIFO_5125(port)->txcmd, MPC512x_PSC_FIFO_RESET_SLICE);
+ out_be32(&FIFO_5125(port)->txcmd, MPC512x_PSC_FIFO_ENABLE_SLICE);
+ out_be32(&FIFO_5125(port)->txalarm, 1);
+ out_be32(&FIFO_5125(port)->tximr, 0);
+
+ out_be32(&FIFO_5125(port)->rxcmd, MPC512x_PSC_FIFO_RESET_SLICE);
+ out_be32(&FIFO_5125(port)->rxcmd, MPC512x_PSC_FIFO_ENABLE_SLICE);
+ out_be32(&FIFO_5125(port)->rxalarm, 1);
+ out_be32(&FIFO_5125(port)->rximr, 0);
+
+ out_be32(&FIFO_5125(port)->tximr, MPC512x_PSC_FIFO_ALARM);
+ out_be32(&FIFO_5125(port)->rximr, MPC512x_PSC_FIFO_ALARM);
+}
+
+static int mpc5125_psc_raw_rx_rdy(struct uart_port *port)
+{
+ return !(in_be32(&FIFO_5125(port)->rxsr) & MPC512x_PSC_FIFO_EMPTY);
+}
+
+static int mpc5125_psc_raw_tx_rdy(struct uart_port *port)
+{
+ return !(in_be32(&FIFO_5125(port)->txsr) & MPC512x_PSC_FIFO_FULL);
+}
+
+static int mpc5125_psc_rx_rdy(struct uart_port *port)
+{
+ return in_be32(&FIFO_5125(port)->rxsr)
+ & in_be32(&FIFO_5125(port)->rximr)
+ & MPC512x_PSC_FIFO_ALARM;
+}
+
+static int mpc5125_psc_tx_rdy(struct uart_port *port)
+{
+ return in_be32(&FIFO_5125(port)->txsr)
+ & in_be32(&FIFO_5125(port)->tximr)
+ & MPC512x_PSC_FIFO_ALARM;
+}
+
+static int mpc5125_psc_tx_empty(struct uart_port *port)
+{
+ return in_be32(&FIFO_5125(port)->txsr)
+ & MPC512x_PSC_FIFO_EMPTY;
+}
+
+static void mpc5125_psc_stop_rx(struct uart_port *port)
+{
+ unsigned long rx_fifo_imr;
+
+ rx_fifo_imr = in_be32(&FIFO_5125(port)->rximr);
+ rx_fifo_imr &= ~MPC512x_PSC_FIFO_ALARM;
+ out_be32(&FIFO_5125(port)->rximr, rx_fifo_imr);
+}
+
+static void mpc5125_psc_start_tx(struct uart_port *port)
+{
+ unsigned long tx_fifo_imr;
+
+ tx_fifo_imr = in_be32(&FIFO_5125(port)->tximr);
+ tx_fifo_imr |= MPC512x_PSC_FIFO_ALARM;
+ out_be32(&FIFO_5125(port)->tximr, tx_fifo_imr);
+}
+
+static void mpc5125_psc_stop_tx(struct uart_port *port)
+{
+ unsigned long tx_fifo_imr;
+
+ tx_fifo_imr = in_be32(&FIFO_5125(port)->tximr);
+ tx_fifo_imr &= ~MPC512x_PSC_FIFO_ALARM;
+ out_be32(&FIFO_5125(port)->tximr, tx_fifo_imr);
+}
+
+static void mpc5125_psc_rx_clr_irq(struct uart_port *port)
+{
+ out_be32(&FIFO_5125(port)->rxisr, in_be32(&FIFO_5125(port)->rxisr));
+}
+
+static void mpc5125_psc_tx_clr_irq(struct uart_port *port)
+{
+ out_be32(&FIFO_5125(port)->txisr, in_be32(&FIFO_5125(port)->txisr));
+}
+
+static void mpc5125_psc_write_char(struct uart_port *port, unsigned char c)
+{
+ out_8(&FIFO_5125(port)->txdata_8, c);
+}
+
+static unsigned char mpc5125_psc_read_char(struct uart_port *port)
+{
+ return in_8(&FIFO_5125(port)->rxdata_8);
+}
+
+static void mpc5125_psc_cw_disable_ints(struct uart_port *port)
+{
+ port->read_status_mask =
+ in_be32(&FIFO_5125(port)->tximr) << 16 |
+ in_be32(&FIFO_5125(port)->rximr);
+ out_be32(&FIFO_5125(port)->tximr, 0);
+ out_be32(&FIFO_5125(port)->rximr, 0);
+}
+
+static void mpc5125_psc_cw_restore_ints(struct uart_port *port)
+{
+ out_be32(&FIFO_5125(port)->tximr,
+ (port->read_status_mask >> 16) & 0x7f);
+ out_be32(&FIFO_5125(port)->rximr, port->read_status_mask & 0x7f);
+}
+
+static unsigned int mpc5125_psc_set_baudrate(struct uart_port *port,
+ struct ktermios *new,
+ struct ktermios *old)
+{
+ unsigned int baud;
+ unsigned int divisor;
+
+ /*
+ * Calculate with a /16 prescaler here.
+ */
+
+ /* uartclk contains the ips freq */
+ baud = uart_get_baud_rate(port, new, old,
+ port->uartclk / (16 * 0xffff) + 1,
+ port->uartclk / 16);
+ divisor = (port->uartclk + 8 * baud) / (16 * baud);
+
+ /* enable the /16 prescaler and set the divisor */
+ mpc5125_set_divisor(PSC_5125(port), 0xdd, divisor);
+ return baud;
+}
+
+/* MPC5125 have compatible PSC FIFO Controller.
+ * Special init not needed.
+ */
+
+static u16 mpc5125_psc_get_status(struct uart_port *port)
+{
+ return in_be16(&PSC_5125(port)->mpc52xx_psc_status);
+}
+
+static u8 mpc5125_psc_get_ipcr(struct uart_port *port)
+{
+ return in_8(&PSC_5125(port)->mpc52xx_psc_ipcr);
+}
+
+static void mpc5125_psc_command(struct uart_port *port, u8 cmd)
+{
+ out_8(&PSC_5125(port)->command, cmd);
+}
+
+static void mpc5125_psc_set_mode(struct uart_port *port, u8 mr1, u8 mr2)
+{
+ out_8(&PSC_5125(port)->mr1, mr1);
+ out_8(&PSC_5125(port)->mr2, mr2);
+}
+
+static void mpc5125_psc_set_rts(struct uart_port *port, int state)
+{
+ if (state & TIOCM_RTS)
+ out_8(&PSC_5125(port)->op1, MPC52xx_PSC_OP_RTS);
+ else
+ out_8(&PSC_5125(port)->op0, MPC52xx_PSC_OP_RTS);
+}
+
+static void
+mpc5125_psc_enable_ms(struct uart_port *port)
+{
+ struct mpc5125_psc __iomem *psc = PSC_5125(port);
+
+ /* clear D_*-bits by reading them */
+ in_8(&psc->mpc52xx_psc_ipcr);
+ /* enable CTS and DCD as IPC interrupts */
+ out_8(&psc->mpc52xx_psc_acr, MPC52xx_PSC_IEC_CTS | MPC52xx_PSC_IEC_DCD);
+
+ port->read_status_mask |= MPC52xx_PSC_IMR_IPC;
+ out_be16(&psc->mpc52xx_psc_imr, port->read_status_mask);
+}
+
+static void mpc5125_psc_set_sicr(struct uart_port *port, u32 val)
+{
+ out_be32(&PSC_5125(port)->sicr, val);
+}
+
+static void mpc5125_psc_set_imr(struct uart_port *port, u16 val)
+{
+ out_be16(&PSC_5125(port)->mpc52xx_psc_imr, val);
+}
+
+static u8 mpc5125_psc_get_mr1(struct uart_port *port)
+{
+ return in_8(&PSC_5125(port)->mr1);
+}
+
+static struct psc_ops mpc5125_psc_ops = {
+ .fifo_init = mpc5125_psc_fifo_init,
+ .raw_rx_rdy = mpc5125_psc_raw_rx_rdy,
+ .raw_tx_rdy = mpc5125_psc_raw_tx_rdy,
+ .rx_rdy = mpc5125_psc_rx_rdy,
+ .tx_rdy = mpc5125_psc_tx_rdy,
+ .tx_empty = mpc5125_psc_tx_empty,
+ .stop_rx = mpc5125_psc_stop_rx,
+ .start_tx = mpc5125_psc_start_tx,
+ .stop_tx = mpc5125_psc_stop_tx,
+ .rx_clr_irq = mpc5125_psc_rx_clr_irq,
+ .tx_clr_irq = mpc5125_psc_tx_clr_irq,
+ .write_char = mpc5125_psc_write_char,
+ .read_char = mpc5125_psc_read_char,
+ .cw_disable_ints = mpc5125_psc_cw_disable_ints,
+ .cw_restore_ints = mpc5125_psc_cw_restore_ints,
+ .set_baudrate = mpc5125_psc_set_baudrate,
+ .clock = mpc512x_psc_clock,
+ .fifoc_init = mpc512x_psc_fifoc_init,
+ .fifoc_uninit = mpc512x_psc_fifoc_uninit,
+ .get_irq = mpc512x_psc_get_irq,
+ .handle_irq = mpc512x_psc_handle_irq,
+ .get_status = mpc5125_psc_get_status,
+ .get_ipcr = mpc5125_psc_get_ipcr,
+ .command = mpc5125_psc_command,
+ .set_mode = mpc5125_psc_set_mode,
+ .set_rts = mpc5125_psc_set_rts,
+ .enable_ms = mpc5125_psc_enable_ms,
+ .set_sicr = mpc5125_psc_set_sicr,
+ .set_imr = mpc5125_psc_set_imr,
+ .get_mr1 = mpc5125_psc_get_mr1,
+};
+
static struct psc_ops mpc512x_psc_ops = {
.fifo_init = mpc512x_psc_fifo_init,
.raw_rx_rdy = mpc512x_psc_raw_rx_rdy,
@@ -595,8 +926,18 @@ static struct psc_ops mpc512x_psc_ops = {
.fifoc_uninit = mpc512x_psc_fifoc_uninit,
.get_irq = mpc512x_psc_get_irq,
.handle_irq = mpc512x_psc_handle_irq,
+ .get_status = mpc52xx_psc_get_status,
+ .get_ipcr = mpc52xx_psc_get_ipcr,
+ .command = mpc52xx_psc_command,
+ .set_mode = mpc52xx_psc_set_mode,
+ .set_rts = mpc52xx_psc_set_rts,
+ .enable_ms = mpc52xx_psc_enable_ms,
+ .set_sicr = mpc52xx_psc_set_sicr,
+ .set_imr = mpc52xx_psc_set_imr,
+ .get_mr1 = mpc52xx_psc_get_mr1,
};
-#endif
+#endif /* CONFIG_PPC_MPC512x */
+
static const struct psc_ops *psc_ops;
@@ -613,17 +954,14 @@ mpc52xx_uart_tx_empty(struct uart_port *port)
static void
mpc52xx_uart_set_mctrl(struct uart_port *port, unsigned int mctrl)
{
- if (mctrl & TIOCM_RTS)
- out_8(&PSC(port)->op1, MPC52xx_PSC_OP_RTS);
- else
- out_8(&PSC(port)->op0, MPC52xx_PSC_OP_RTS);
+ psc_ops->set_rts(port, mctrl & TIOCM_RTS);
}
static unsigned int
mpc52xx_uart_get_mctrl(struct uart_port *port)
{
unsigned int ret = TIOCM_DSR;
- u8 status = in_8(&PSC(port)->mpc52xx_psc_ipcr);
+ u8 status = psc_ops->get_ipcr(port);
if (!(status & MPC52xx_PSC_CTS))
ret |= TIOCM_CTS;
@@ -673,15 +1011,7 @@ mpc52xx_uart_stop_rx(struct uart_port *port)
static void
mpc52xx_uart_enable_ms(struct uart_port *port)
{
- struct mpc52xx_psc __iomem *psc = PSC(port);
-
- /* clear D_*-bits by reading them */
- in_8(&psc->mpc52xx_psc_ipcr);
- /* enable CTS and DCD as IPC interrupts */
- out_8(&psc->mpc52xx_psc_acr, MPC52xx_PSC_IEC_CTS | MPC52xx_PSC_IEC_DCD);
-
- port->read_status_mask |= MPC52xx_PSC_IMR_IPC;
- out_be16(&psc->mpc52xx_psc_imr, port->read_status_mask);
+ psc_ops->enable_ms(port);
}
static void
@@ -691,9 +1021,9 @@ mpc52xx_uart_break_ctl(struct uart_port *port, int ctl)
spin_lock_irqsave(&port->lock, flags);
if (ctl == -1)
- out_8(&PSC(port)->command, MPC52xx_PSC_START_BRK);
+ psc_ops->command(port, MPC52xx_PSC_START_BRK);
else
- out_8(&PSC(port)->command, MPC52xx_PSC_STOP_BRK);
+ psc_ops->command(port, MPC52xx_PSC_STOP_BRK);
spin_unlock_irqrestore(&port->lock, flags);
}
@@ -701,7 +1031,6 @@ mpc52xx_uart_break_ctl(struct uart_port *port, int ctl)
static int
mpc52xx_uart_startup(struct uart_port *port)
{
- struct mpc52xx_psc __iomem *psc = PSC(port);
int ret;
if (psc_ops->clock) {
@@ -717,15 +1046,15 @@ mpc52xx_uart_startup(struct uart_port *port)
return ret;
/* Reset/activate the port, clear and enable interrupts */
- out_8(&psc->command, MPC52xx_PSC_RST_RX);
- out_8(&psc->command, MPC52xx_PSC_RST_TX);
+ psc_ops->command(port, MPC52xx_PSC_RST_RX);
+ psc_ops->command(port, MPC52xx_PSC_RST_TX);
- out_be32(&psc->sicr, 0); /* UART mode DCD ignored */
+ psc_ops->set_sicr(port, 0); /* UART mode DCD ignored */
psc_ops->fifo_init(port);
- out_8(&psc->command, MPC52xx_PSC_TX_ENABLE);
- out_8(&psc->command, MPC52xx_PSC_RX_ENABLE);
+ psc_ops->command(port, MPC52xx_PSC_TX_ENABLE);
+ psc_ops->command(port, MPC52xx_PSC_RX_ENABLE);
return 0;
}
@@ -733,15 +1062,13 @@ mpc52xx_uart_startup(struct uart_port *port)
static void
mpc52xx_uart_shutdown(struct uart_port *port)
{
- struct mpc52xx_psc __iomem *psc = PSC(port);
-
/* Shut down the port. Leave TX active if on a console port */
- out_8(&psc->command, MPC52xx_PSC_RST_RX);
+ psc_ops->command(port, MPC52xx_PSC_RST_RX);
if (!uart_console(port))
- out_8(&psc->command, MPC52xx_PSC_RST_TX);
+ psc_ops->command(port, MPC52xx_PSC_RST_TX);
port->read_status_mask = 0;
- out_be16(&psc->mpc52xx_psc_imr, port->read_status_mask);
+ psc_ops->set_imr(port, port->read_status_mask);
if (psc_ops->clock)
psc_ops->clock(port, 0);
@@ -754,7 +1081,6 @@ static void
mpc52xx_uart_set_termios(struct uart_port *port, struct ktermios *new,
struct ktermios *old)
{
- struct mpc52xx_psc __iomem *psc = PSC(port);
unsigned long flags;
unsigned char mr1, mr2;
unsigned int j;
@@ -818,13 +1144,11 @@ mpc52xx_uart_set_termios(struct uart_port *port, struct ktermios *new,
"Some chars may have been lost.\n");
/* Reset the TX & RX */
- out_8(&psc->command, MPC52xx_PSC_RST_RX);
- out_8(&psc->command, MPC52xx_PSC_RST_TX);
+ psc_ops->command(port, MPC52xx_PSC_RST_RX);
+ psc_ops->command(port, MPC52xx_PSC_RST_TX);
/* Send new mode settings */
- out_8(&psc->command, MPC52xx_PSC_SEL_MODE_REG_1);
- out_8(&psc->mode, mr1);
- out_8(&psc->mode, mr2);
+ psc_ops->set_mode(port, mr1, mr2);
baud = psc_ops->set_baudrate(port, new, old);
/* Update the per-port timeout */
@@ -834,8 +1158,8 @@ mpc52xx_uart_set_termios(struct uart_port *port, struct ktermios *new,
mpc52xx_uart_enable_ms(port);
/* Reenable TX & RX */
- out_8(&psc->command, MPC52xx_PSC_TX_ENABLE);
- out_8(&psc->command, MPC52xx_PSC_RX_ENABLE);
+ psc_ops->command(port, MPC52xx_PSC_TX_ENABLE);
+ psc_ops->command(port, MPC52xx_PSC_RX_ENABLE);
/* We're all set, release the lock */
spin_unlock_irqrestore(&port->lock, flags);
@@ -963,7 +1287,7 @@ mpc52xx_uart_int_rx_chars(struct uart_port *port)
flag = TTY_NORMAL;
port->icount.rx++;
- status = in_be16(&PSC(port)->mpc52xx_psc_status);
+ status = psc_ops->get_status(port);
if (status & (MPC52xx_PSC_SR_PE |
MPC52xx_PSC_SR_FE |
@@ -983,7 +1307,7 @@ mpc52xx_uart_int_rx_chars(struct uart_port *port)
}
/* Clear error condition */
- out_8(&PSC(port)->command, MPC52xx_PSC_RST_ERR_STAT);
+ psc_ops->command(port, MPC52xx_PSC_RST_ERR_STAT);
}
tty_insert_flip_char(tport, ch, flag);
@@ -1066,7 +1390,7 @@ mpc5xxx_uart_process_int(struct uart_port *port)
if (psc_ops->tx_rdy(port))
keepgoing |= mpc52xx_uart_int_tx_chars(port);
- status = in_8(&PSC(port)->mpc52xx_psc_ipcr);
+ status = psc_ops->get_ipcr(port);
if (status & MPC52xx_PSC_D_DCD)
uart_handle_dcd_change(port, !(status & MPC52xx_PSC_DCD));
@@ -1107,14 +1431,12 @@ static void __init
mpc52xx_console_get_options(struct uart_port *port,
int *baud, int *parity, int *bits, int *flow)
{
- struct mpc52xx_psc __iomem *psc = PSC(port);
unsigned char mr1;
pr_debug("mpc52xx_console_get_options(port=%p)\n", port);
/* Read the mode registers */
- out_8(&psc->command, MPC52xx_PSC_SEL_MODE_REG_1);
- mr1 = in_8(&psc->mode);
+ mr1 = psc_ops->get_mr1(port);
/* CT{U,L}R are write-only ! */
*baud = CONFIG_SERIAL_MPC52xx_CONSOLE_BAUD;
@@ -1304,6 +1626,7 @@ static struct of_device_id mpc52xx_uart_of_match[] = {
#endif
#ifdef CONFIG_PPC_MPC512x
{ .compatible = "fsl,mpc5121-psc-uart", .data = &mpc512x_psc_ops, },
+ { .compatible = "fsl,mpc5125-psc-uart", .data = &mpc5125_psc_ops, },
#endif
{},
};
--
1.7.10.4
^ permalink raw reply related
* [PATCH] bookehv: Handle debug exception on guest exit
From: Bharat Bhushan @ 2013-03-20 17:45 UTC (permalink / raw)
To: linuxppc-dev, kvm, kvm-ppc, agraf, scottwood; +Cc: Bharat Bhushan
EPCR.DUVD controls whether the debug events can come in
hypervisor mode or not. When KVM guest is using the debug
resource then we do not want debug events to be captured
in guest entry/exit path. So we set EPCR.DUVD when entering
and clears EPCR.DUVD when exiting from guest.
Debug instruction complete is a post-completion debug
exception but debug event gets posted on the basis of MSR
before the instruction is executed. Now if the instruction
switches the context from guest mode (MSR.GS = 1) to hypervisor
mode (MSR.GS = 0) then the xSRR0 points to first instruction of
KVM handler and xSRR1 points that MSR.GS is clear
(hypervisor context). Now as xSRR1.GS is used to decide whether
KVM handler will be invoked to handle the exception or host
host kernel debug handler will be invoked to handle the exception.
This leads to host kernel debug handler handling the exception
which should either be handled by KVM.
This is tested on e500mc in 32 bit mode
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v0:
- Do not apply this change for debug_crit as we do not know those chips have issue or not.
- corrected 64bit case branching
arch/powerpc/kernel/exceptions-64e.S | 29 ++++++++++++++++++++++++++++-
arch/powerpc/kernel/head_booke.h | 26 ++++++++++++++++++++++++++
2 files changed, 54 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 4684e33..8b26294 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -516,6 +516,33 @@ kernel_dbg_exc:
andis. r15,r14,DBSR_IC@h
beq+ 1f
+#ifdef CONFIG_KVM_BOOKE_HV
+ /*
+ * EPCR.DUVD controls whether the debug events can come in
+ * hypervisor mode or not. When KVM guest is using the debug
+ * resource then we do not want debug events to be captured
+ * in guest entry/exit path. So we set EPCR.DUVD when entering
+ * and clears EPCR.DUVD when exiting from guest.
+ * Debug instruction complete is a post-completion debug
+ * exception but debug event gets posted on the basis of MSR
+ * before the instruction is executed. Now if the instruction
+ * switches the context from guest mode (MSR.GS = 1) to hypervisor
+ * mode (MSR.GS = 0) then the xSRR0 points to first instruction of
+ * KVM handler and xSRR1 points that MSR.GS is clear
+ * (hypervisor context). Now as xSRR1.GS is used to decide whether
+ * KVM handler will be invoked to handle the exception or host
+ * host kernel debug handler will be invoked to handle the exception.
+ * This leads to host kernel debug handler handling the exception
+ * which should either be handled by KVM.
+ */
+ mfspr r10, SPRN_EPCR
+ andis. r10,r10,SPRN_EPCR_DUVD@h
+ beq+ 2f
+
+ andis. r10,r9,MSR_GS@h
+ beq+ 3f
+2:
+#endif
LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
cmpld cr0,r10,r14
@@ -523,7 +550,7 @@ kernel_dbg_exc:
blt+ cr0,1f
bge+ cr1,1f
- /* here it looks like we got an inappropriate debug exception. */
+3: /* here it looks like we got an inappropriate debug exception. */
lis r14,DBSR_IC@h /* clear the IC event */
rlwinm r11,r11,0,~MSR_DE /* clear DE in the DSRR1 value */
mtspr SPRN_DBSR,r14
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 5f051ee..edc6a3b 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -285,7 +285,33 @@ label:
mfspr r10,SPRN_DBSR; /* check single-step/branch taken */ \
andis. r10,r10,(DBSR_IC|DBSR_BT)@h; \
beq+ 2f; \
+#ifdef CONFIG_KVM_BOOKE_HV \
+ /* \
+ * EPCR.DUVD controls whether the debug events can come in \
+ * hypervisor mode or not. When KVM guest is using the debug \
+ * resource then we do not want debug events to be captured \
+ * in guest entry/exit path. So we set EPCR.DUVD when entering \
+ * and clears EPCR.DUVD when exiting from guest. \
+ * Debug instruction complete is a post-completion debug \
+ * exception but debug event gets posted on the basis of MSR \
+ * before the instruction is executed. Now if the instruction \
+ * switches the context from guest mode (MSR.GS = 1) to hypervisor \
+ * mode (MSR.GS = 0) then the xSRR0 points to first instruction of \
+ * KVM handler and xSRR1 points that MSR.GS is clear \
+ * (hypervisor context). Now as xSRR1.GS is used to decide whether \
+ * KVM handler will be invoked to handle the exception or host \
+ * host kernel debug handler will be invoked to handle the exception. \
+ * This leads to host kernel debug handler handling the exception \
+ * which should either be handled by KVM. \
+ */ \
+ mfspr r10, SPRN_EPCR; \
+ andis. r10,r10,SPRN_EPCR_DUVD@h; \
+ beq+ 3f; \
\
+ andis. r10,r9,MSR_GS@h; \
+ beq+ 1f; \
+3: \
+#endif \
lis r10,KERNELBASE@h; /* check if exception in vectors */ \
ori r10,r10,KERNELBASE@l; \
cmplw r12,r10; \
--
1.7.0.4
^ permalink raw reply related
* Re: [PATCH 2/3] VFIO: VFIO_DEVICE_SET_ADDR_MAPPING command
From: Alex Williamson @ 2013-03-20 18:48 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: aik, linuxppc-dev, Gavin Shan, kvm
In-Reply-To: <1363668309.18880.10.camel@pasglop>
On Tue, 2013-03-19 at 05:45 +0100, Benjamin Herrenschmidt wrote:
> On Mon, 2013-03-18 at 22:18 -0600, Alex Williamson wrote:
> > > Yes, EEH firmware call needn't going through VFIO. However, EEH has
> > > very close relationship with PCI and so VFIO-PCI does. Eventually, EEH
> > > has close relationship with VFIO-PCI :-)
> >
> > Is there some plan to do more with EEH through VFIO in the future or are
> > we just talking about some kind of weird associative property to sell
> > this ioctl? Thanks,
>
> Well, I'm not sure how 'weird' that is but it makes sense to me... VFIO
> is the mechanism that virtualizes access to a PCI device and provides
> interfaces to qemu & kvm to access it &| map it.
>
> Or rather VFIO-PCI is.
>
> At a basic level it provides ... the basic PCI interfaces, ie, config
> space access (with or without filtering), interrupts, etc...
>
> In our environment, EEH is just another functionality of PCI really.
> The firmware calls used by the guest to do that fall into more or less
> the same category as the ones used for PCI config space accesses,
> manipulation of DMA windows, etc... Similar to host (though guest
> and host use a different FW interface for various reasons).
>
> So it's very natural to "transport" these via VFIO-PCI like everything
> else, I don't see a more natural place to put the ioctl's we need for
> qemu to be able to access the EEH state, trigger EEH (un)isolation,
> resets, etc...
>
> Fundamentally, the design should be for VFIO-PCI to provide some specific
> ioctls for EEH that userspace programs such as qemu can use, and then
> re-expose those APIs to the guest.
>
> In addition, to do some of it in the kernel for performance reason, we
> want to establish that mapping, but I see that as a VFIO "accelerator".
>
> IE. Whatever is going to respond to the EEH calls from the guest in-kernel
> will have to share state with the rest of the EEH stuff provided to qemu
> by vfio-pci.
Perhaps my problem is that I don't have a clear picture of where you're
going with this like I do for AER. For AER we're starting with
notification of an error, from that we build into how to retrieve the
error information, and finally how to perform corrective action. Each
of these will be done through vifo-pci.
Here we're starting by registering a mapping that's really only useful
to the vfio "accelerator" path, but we don't even have a hint of what
the non-accelerated path is and how vfio is involved with it. Thanks,
Alex
^ permalink raw reply
* Re: [PATCH 2/3] VFIO: VFIO_DEVICE_SET_ADDR_MAPPING command
From: Benjamin Herrenschmidt @ 2013-03-20 19:31 UTC (permalink / raw)
To: Alex Williamson; +Cc: aik, linuxppc-dev, Gavin Shan, kvm
In-Reply-To: <1363805280.24132.529.camel@bling.home>
On Wed, 2013-03-20 at 12:48 -0600, Alex Williamson wrote:
> Perhaps my problem is that I don't have a clear picture of where
> you're
> going with this like I do for AER. For AER we're starting with
> notification of an error, from that we build into how to retrieve the
> error information, and finally how to perform corrective action. Each
> of these will be done through vifo-pci.
>
> Here we're starting by registering a mapping that's really only useful
> to the vfio "accelerator" path, but we don't even have a hint of what
> the non-accelerated path is and how vfio is involved with it. Thanks,
I'm surprised that you are building so much policy around AER ... can't
you just pass the raw stuff down to the guest and let the guest do it's
own corrective actions ?
As for EEH, I will let Gavin describe in more details what he is doing,
though I wouldn't be surprised if so far he doesn't have a
non-accelerated path :-) Which indeed makes things oddball, granted ...
at least for now. I *think* what Gavin's doing right now is a
pass-through to the host EEH directly in the kernel, so without a slow
path...
Gavin, it really boils down to that. In-kernel EEH for guests is a
KVMism that ends up not involving VFIO in any other way than
establishing the mapping, then arguably it could be done via a VM ioctl.
If there's more going through VFIO and shared state, then it should
probably go through VFIO-PCI.
Note (FYI) that EEH somewhat encompass AER... the EEH logic triggers on
AER errors as well and the error reports generated by the firmware
contain the AER register dump in addition to the bridge internal stuff.
IE. EEH encompass pretty much all sort of errors, correctable or not,
that happens on PCI. It adds a mechanism of "isolation" of domains on
first error (involving blocking MMIOs, DMAs and MSIs) which helps with
preventing propagation of bad data, and various recovery schemes.
Cheers,
Ben.
^ permalink raw reply
* [PATCH -V4 12/25] powerpc: Return all the valid pte ecndoing in KVM_PPC_GET_SMMU_INFO ioctl
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/kvm/book3s_hv.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 48f6d99..f472414 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1508,14 +1508,24 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret)
static void kvmppc_add_seg_page_size(struct kvm_ppc_one_seg_page_size **sps,
int linux_psize)
{
+ int i, index = 0;
struct mmu_psize_def *def = &mmu_psize_defs[linux_psize];
if (!def->shift)
return;
(*sps)->page_shift = def->shift;
(*sps)->slb_enc = def->sllp;
- (*sps)->enc[0].page_shift = def->shift;
- (*sps)->enc[0].pte_enc = def->penc[linux_psize];
+ for (i = 0; i < MMU_PAGE_COUNT; i++) {
+ if (def->penc[i] != -1) {
+ if (index >= KVM_PPC_PAGE_SIZES_MAX_SZ) {
+ WARN_ON(1);
+ break;
+ }
+ (*sps)->enc[index].page_shift = mmu_psize_defs[i].shift;
+ (*sps)->enc[index].pte_enc = def->penc[i];
+ index++;
+ }
+ }
(*sps)++;
}
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 00/25] THP support for PPC64
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev
Hi,
This patchset adds transparent hugepage support for PPC64.
TODO:
* hash preload support in update_mmu_cache_pmd (we don't do that for hugetlb)
Some numbers:
The latency measurements code from Anton found at
http://ozlabs.org/~anton/junkcode/latency2001.c
THP disabled 64K page size
------------------------
[root@llmp24l02 ~]# ./latency2001 8G
8589934592 731.73 cycles 205.77 ns
[root@llmp24l02 ~]# ./latency2001 8G
8589934592 743.39 cycles 209.05 ns
[root@llmp24l02 ~]#
THP disabled large page via hugetlbfs
-------------------------------------
[root@llmp24l02 ~]# ./latency2001 -l 8G
8589934592 416.09 cycles 117.01 ns
[root@llmp24l02 ~]# ./latency2001 -l 8G
8589934592 415.74 cycles 116.91 ns
THP enabled 64K page size.
----------------
[root@llmp24l02 ~]# ./latency2001 8G
8589934592 405.07 cycles 113.91 ns
[root@llmp24l02 ~]# ./latency2001 8G
8589934592 411.82 cycles 115.81 ns
[root@llmp24l02 ~]#
We are close to hugetlbfs in latency and we can achieve this with zero
config/page reservation. Most of the allocations above are fault allocated.
I haven't really measured the collapse alloc impact.
Another test that does 50000000 random access over 1GB area goes from
2.65 seconds to 1.07 seconds with this patchset.
Changes from V3:
* PowerNV boot fixes
Change from V2:
* Change patch "powerpc: Reduce PTE table memory wastage" to use much simpler approach
for PTE page sharing.
* Changes to handle huge pages in KVM code.
* Address other review comments
Changes from V1
* Address review comments
* More patch split
* Add batch hpte invalidate for hugepages.
Changes from RFC V2:
* Address review comments
* More code cleanup and patch split
Changes from RFC V1:
* HugeTLB fs now works
* Compile issues fixed
* rebased to v3.8
* Patch series reorded so that ppc64 cleanups and MM THP changes are moved
early in the series. This should help in picking those patches early.
Thanks,
-aneesh
^ permalink raw reply
* [PATCH -V4 03/25] powerpc: Don't hard code the size of pte page
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
USE PTRS_PER_PTE to indicate the size of pte page. To support THP,
later patches will be changing PTRS_PER_PTE value.
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/pgtable.h | 6 ++++++
arch/powerpc/mm/hash_low_64.S | 4 ++--
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index a9cbd3b..4b52726 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -17,6 +17,12 @@ struct mm_struct;
# include <asm/pgtable-ppc32.h>
#endif
+/*
+ * We save the slot number & secondary bit in the second half of the
+ * PTE page. We use the 8 bytes per each pte entry.
+ */
+#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
+
#ifndef __ASSEMBLY__
#include <asm/tlbflush.h>
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 7443481..abdd5e2 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -490,7 +490,7 @@ END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
beq htab_inval_old_hpte
ld r6,STK_PARAM(R6)(r1)
- ori r26,r6,0x8000 /* Load the hidx mask */
+ ori r26,r6,PTE_PAGE_HIDX_OFFSET /* Load the hidx mask. */
ld r26,0(r26)
addi r5,r25,36 /* Check actual HPTE_SUB bit, this */
rldcr. r0,r31,r5,0 /* must match pgtable.h definition */
@@ -607,7 +607,7 @@ htab_pte_insert_ok:
sld r4,r4,r5
andc r26,r26,r4
or r26,r26,r3
- ori r5,r6,0x8000
+ ori r5,r6,PTE_PAGE_HIDX_OFFSET
std r26,0(r5)
lwsync
std r30,0(r6)
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 06/25] powerpc: Reduce PTE table memory wastage
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
We allocate one page for the last level of linux page table. With THP and
large page size of 16MB, that would mean we are wasting large part
of that page. To map 16MB area, we only need a PTE space of 2K with 64K
page size. This patch reduce the space wastage by sharing the page
allocated for the last level of linux page table with multiple pmd
entries. We call these smaller chunks PTE page fragments and allocated
page, PTE page.
In order to support systems which doesn't have 64K HPTE support, we also
add another 2K to PTE page fragment. The second half of the PTE fragments
is used for storing slot and secondary bit information of an HPTE. With this
we now have a 4K PTE fragment.
We use a simple approach to share the PTE page. On allocation, we bump the
PTE page refcount to 16 and share the PTE page with the next 16 pte alloc
request. This should help in the node locality of the PTE page fragment,
assuming that the immediate pte alloc request will mostly come from the
same NUMA node. We don't try to reuse the freed PTE page fragment. Hence
we could be waisting some space.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/mmu-book3e.h | 4 +
arch/powerpc/include/asm/mmu-hash64.h | 4 +
arch/powerpc/include/asm/page.h | 4 +
arch/powerpc/include/asm/pgalloc-64.h | 72 ++++-------------
arch/powerpc/kernel/setup_64.c | 4 +-
arch/powerpc/mm/mmu_context_hash64.c | 35 ++++++++
arch/powerpc/mm/pgtable_64.c | 144 +++++++++++++++++++++++++++++++++
7 files changed, 209 insertions(+), 58 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
index 99d43e0..affbd68 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -231,6 +231,10 @@ typedef struct {
u64 high_slices_psize; /* 4 bits per slice for now */
u16 user_psize; /* page size index */
#endif
+#ifdef CONFIG_PPC_64K_PAGES
+ /* for 4K PTE fragment support */
+ struct page *pgtable_page;
+#endif
} mm_context_t;
/* Page size definitions, common between 32 and 64-bit
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 35bb51e..300ac3c 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -498,6 +498,10 @@ typedef struct {
unsigned long acop; /* mask of enabled coprocessor types */
unsigned int cop_pid; /* pid value used with coprocessors */
#endif /* CONFIG_PPC_ICSWX */
+#ifdef CONFIG_PPC_64K_PAGES
+ /* for 4K PTE fragment support */
+ struct page *pgtable_page;
+#endif
} mm_context_t;
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index f072e97..38e7ff6 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -378,7 +378,11 @@ void arch_free_page(struct page *page, int order);
struct vm_area_struct;
+#ifdef CONFIG_PPC_64K_PAGES
+typedef pte_t *pgtable_t;
+#else
typedef struct page *pgtable_t;
+#endif
#include <asm-generic/memory_model.h>
#endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index cdbf555..3418989 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -150,6 +150,13 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
#else /* if CONFIG_PPC_64K_PAGES */
+extern pte_t *page_table_alloc(struct mm_struct *, unsigned long, int);
+extern void page_table_free(struct mm_struct *, unsigned long *, int);
+extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
+#ifdef CONFIG_SMP
+extern void __tlb_remove_table(void *_table);
+#endif
+
#define pud_populate(mm, pud, pmd) pud_set(pud, (unsigned long)pmd)
static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
@@ -161,90 +168,42 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
pgtable_t pte_page)
{
- pmd_populate_kernel(mm, pmd, page_address(pte_page));
+ pmd_set(pmd, (unsigned long)pte_page);
}
static inline pgtable_t pmd_pgtable(pmd_t pmd)
{
- return pmd_page(pmd);
+ return (pgtable_t)(pmd_val(pmd) & -sizeof(pte_t)*PTRS_PER_PTE);
}
static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
unsigned long address)
{
- return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
+ return (pte_t *)page_table_alloc(mm, address, 1);
}
static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
- unsigned long address)
+ unsigned long address)
{
- struct page *page;
- pte_t *pte;
-
- pte = pte_alloc_one_kernel(mm, address);
- if (!pte)
- return NULL;
- page = virt_to_page(pte);
- pgtable_page_ctor(page);
- return page;
+ return (pgtable_t)page_table_alloc(mm, address, 0);
}
static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
{
- free_page((unsigned long)pte);
+ page_table_free(mm, (unsigned long *)pte, 1);
}
static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
{
- pgtable_page_dtor(ptepage);
- __free_page(ptepage);
-}
-
-static inline void pgtable_free(void *table, unsigned index_size)
-{
- if (!index_size)
- free_page((unsigned long)table);
- else {
- BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE);
- kmem_cache_free(PGT_CACHE(index_size), table);
- }
+ page_table_free(mm, (unsigned long *)ptepage, 0);
}
-#ifdef CONFIG_SMP
-static inline void pgtable_free_tlb(struct mmu_gather *tlb,
- void *table, int shift)
-{
- unsigned long pgf = (unsigned long)table;
- BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
- pgf |= shift;
- tlb_remove_table(tlb, (void *)pgf);
-}
-
-static inline void __tlb_remove_table(void *_table)
-{
- void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
- unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
-
- pgtable_free(table, shift);
-}
-#else /* !CONFIG_SMP */
-static inline void pgtable_free_tlb(struct mmu_gather *tlb,
- void *table, int shift)
-{
- pgtable_free(table, shift);
-}
-#endif /* CONFIG_SMP */
-
static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
unsigned long address)
{
- struct page *page = page_address(table);
-
tlb_flush_pgtable(tlb, address);
- pgtable_page_dtor(page);
- pgtable_free_tlb(tlb, page, 0);
+ pgtable_free_tlb(tlb, table, 0);
}
-
#endif /* CONFIG_PPC_64K_PAGES */
static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
@@ -258,7 +217,6 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
kmem_cache_free(PGT_CACHE(PMD_INDEX_SIZE), pmd);
}
-
#define __pmd_free_tlb(tlb, pmd, addr) \
pgtable_free_tlb(tlb, pmd, PMD_INDEX_SIZE)
#ifndef CONFIG_PPC_64K_PAGES
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 6da881b..04d833c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -575,7 +575,9 @@ void __init setup_arch(char **cmdline_p)
init_mm.end_code = (unsigned long) _etext;
init_mm.end_data = (unsigned long) _edata;
init_mm.brk = klimit;
-
+#ifdef CONFIG_PPC_64K_PAGES
+ init_mm.context.pgtable_page = NULL;
+#endif
irqstack_early_init();
exc_lvl_early_init();
emergency_stack_init();
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index 59cd773..fbfdca2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -86,6 +86,9 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
spin_lock_init(mm->context.cop_lockp);
#endif /* CONFIG_PPC_ICSWX */
+#ifdef CONFIG_PPC_64K_PAGES
+ mm->context.pgtable_page = NULL;
+#endif
return 0;
}
@@ -97,13 +100,45 @@ void __destroy_context(int context_id)
}
EXPORT_SYMBOL_GPL(__destroy_context);
+#ifdef CONFIG_PPC_64K_PAGES
+static void destroy_pagetable_page(struct mm_struct *mm)
+{
+ int count;
+ struct page *page;
+
+ page = mm->context.pgtable_page;
+ if (!page)
+ return;
+
+ /* drop all the pending references */
+ count = atomic_read(&page->_mapcount) + 1;
+ /* We allow PTE_FRAG_NR(16) fragments from a PTE page */
+ count = atomic_sub_return(16 - count, &page->_count);
+ if (!count) {
+ pgtable_page_dtor(page);
+ reset_page_mapcount(page);
+ free_hot_cold_page(page, 0);
+ }
+}
+
+#else
+static inline void destroy_pagetable_page(struct mm_struct *mm)
+{
+ return;
+}
+#endif
+
+
void destroy_context(struct mm_struct *mm)
{
+
#ifdef CONFIG_PPC_ICSWX
drop_cop(mm->context.acop, mm);
kfree(mm->context.cop_lockp);
mm->context.cop_lockp = NULL;
#endif /* CONFIG_PPC_ICSWX */
+
+ destroy_pagetable_page(mm);
__destroy_context(mm->context.id);
subpage_prot_free(mm);
mm->context.id = MMU_NO_CONTEXT;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e212a27..66fad08 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -337,3 +337,147 @@ EXPORT_SYMBOL(__ioremap_at);
EXPORT_SYMBOL(iounmap);
EXPORT_SYMBOL(__iounmap);
EXPORT_SYMBOL(__iounmap_at);
+
+#ifdef CONFIG_PPC_64K_PAGES
+/*
+ * we support 16 fragments per PTE page. This is limited by how many
+ * bits we can pack in page->_mapcount. We use the first half for
+ * tracking the usage for rcu page table free.
+ */
+#define PTE_FRAG_NR 16
+/*
+ * We use a 2K PTE page fragment and another 2K for storing
+ * real_pte_t hash index
+ */
+#define PTE_FRAG_SIZE (2 * PTRS_PER_PTE * sizeof(pte_t))
+
+static pte_t *__get_from_cache(struct mm_struct *mm)
+{
+ int index;
+ pte_t *ret = NULL;
+ struct page *page;
+
+ page = mm->context.pgtable_page;
+ if (page) {
+ void *p = page_address(page);
+ index = atomic_add_return(1, &page->_mapcount);
+ ret = (pte_t *) (p + (index * PTE_FRAG_SIZE));
+ /*
+ * If we have taken up all the fragments mark PTE page NULL
+ */
+ if (index == PTE_FRAG_NR - 1)
+ mm->context.pgtable_page = NULL;
+ }
+ return ret;
+}
+
+static pte_t *get_from_cache(struct mm_struct *mm)
+{
+ pte_t *ret = NULL;
+
+ spin_lock(&mm->page_table_lock);
+ ret = __get_from_cache(mm);
+ spin_unlock(&mm->page_table_lock);
+
+ return ret;
+}
+
+static pte_t *__alloc_for_cache(struct mm_struct *mm, int kernel)
+{
+ pte_t *ret = NULL;
+ struct page *page = alloc_page(GFP_KERNEL | __GFP_NOTRACK |
+ __GFP_REPEAT | __GFP_ZERO);
+
+ if (page) {
+ spin_lock(&mm->page_table_lock);
+ if (likely(!mm->context.pgtable_page)) {
+ atomic_set(&page->_count, PTE_FRAG_NR);
+ atomic_set(&page->_mapcount, 0);
+ mm->context.pgtable_page = page;
+ ret = (unsigned long *)page_address(page);
+ if (!kernel)
+ pgtable_page_ctor(page);
+ page = NULL;
+ } else
+ ret = __get_from_cache(mm);
+ spin_unlock(&mm->page_table_lock);
+ }
+
+ if (page)
+ free_hot_cold_page(page, 0);
+ return ret;
+}
+
+pte_t *page_table_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
+{
+ pte_t *pte;
+
+ pte = get_from_cache(mm);
+ if (pte)
+ return pte;
+
+ return __alloc_for_cache(mm, kernel);
+}
+
+void page_table_free(struct mm_struct *mm, unsigned long *table, int kernel)
+{
+ struct page *page = virt_to_page(table);
+ if (put_page_testzero(page)) {
+ if (!kernel)
+ pgtable_page_dtor(page);
+ reset_page_mapcount(page);
+ free_hot_cold_page(page, 0);
+ }
+}
+
+#ifdef CONFIG_SMP
+static void page_table_free_rcu(void *table)
+{
+ struct page *page = virt_to_page(table);
+ if (put_page_testzero(page)) {
+ pgtable_page_dtor(page);
+ reset_page_mapcount(page);
+ free_hot_cold_page(page, 0);
+ }
+}
+
+void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
+{
+ unsigned long pgf = (unsigned long)table;
+
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ pgf |= shift;
+ tlb_remove_table(tlb, (void *)pgf);
+}
+
+void __tlb_remove_table(void *_table)
+{
+ void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+ unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+ if (!shift)
+ /* PTE page needs special handling */
+ page_table_free_rcu(table);
+ else {
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ kmem_cache_free(PGT_CACHE(shift), table);
+ }
+}
+#else
+void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
+{
+ if (!shift) {
+ /* PTE page needs special handling */
+ struct page *page = virt_to_page(table);
+ if (put_page_testzero(page)) {
+ pgtable_page_dtor(page);
+ reset_page_mapcount(page);
+ free_hot_cold_page(page, 0);
+ }
+ } else {
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ kmem_cache_free(PGT_CACHE(shift), table);
+ }
+}
+#endif
+#endif /* CONFIG_PPC_64K_PAGES */
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 05/25] powerpc: Move the pte free routines from common header
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This patch moves the common code to 32/64 bit headers and also duplicate
4K_PAGES and 64K_PAGES section. We will later change the 64 bit 64K_PAGES
version to support smaller PTE fragments. The patch doesn't introduce
any functional changes.
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/pgalloc-32.h | 45 ++++++++++
arch/powerpc/include/asm/pgalloc-64.h | 157 ++++++++++++++++++++++++++++++---
arch/powerpc/include/asm/pgalloc.h | 46 +---------
3 files changed, 189 insertions(+), 59 deletions(-)
diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
index 580cf73..27b2386 100644
--- a/arch/powerpc/include/asm/pgalloc-32.h
+++ b/arch/powerpc/include/asm/pgalloc-32.h
@@ -37,6 +37,17 @@ extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr);
extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr);
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+ pgtable_page_dtor(ptepage);
+ __free_page(ptepage);
+}
+
static inline void pgtable_free(void *table, unsigned index_size)
{
BUG_ON(index_size); /* 32-bit doesn't use this */
@@ -45,4 +56,38 @@ static inline void pgtable_free(void *table, unsigned index_size)
#define check_pgt_cache() do { } while (0)
+#ifdef CONFIG_SMP
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ unsigned long pgf = (unsigned long)table;
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ pgf |= shift;
+ tlb_remove_table(tlb, (void *)pgf);
+}
+
+static inline void __tlb_remove_table(void *_table)
+{
+ void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+ unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+ pgtable_free(table, shift);
+}
+#else
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ pgtable_free(table, shift);
+}
+#endif
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+ struct page *page = page_address(table);
+
+ tlb_flush_pgtable(tlb, address);
+ pgtable_page_dtor(page);
+ pgtable_free_tlb(tlb, page, 0);
+}
#endif /* _ASM_POWERPC_PGALLOC_32_H */
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index 292725c..cdbf555 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -72,8 +72,83 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
#define pmd_populate_kernel(mm, pmd, pte) pmd_set(pmd, (unsigned long)(pte))
#define pmd_pgtable(pmd) pmd_page(pmd)
+static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
+ unsigned long address)
+{
+ return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
+}
+
+static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
+ unsigned long address)
+{
+ struct page *page;
+ pte_t *pte;
+
+ pte = pte_alloc_one_kernel(mm, address);
+ if (!pte)
+ return NULL;
+ page = virt_to_page(pte);
+ pgtable_page_ctor(page);
+ return page;
+}
+
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+ pgtable_page_dtor(ptepage);
+ __free_page(ptepage);
+}
+
+static inline void pgtable_free(void *table, unsigned index_size)
+{
+ if (!index_size)
+ free_page((unsigned long)table);
+ else {
+ BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE);
+ kmem_cache_free(PGT_CACHE(index_size), table);
+ }
+}
+
+#ifdef CONFIG_SMP
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ unsigned long pgf = (unsigned long)table;
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ pgf |= shift;
+ tlb_remove_table(tlb, (void *)pgf);
+}
+
+static inline void __tlb_remove_table(void *_table)
+{
+ void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+ unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+ pgtable_free(table, shift);
+}
+#else /* !CONFIG_SMP */
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ pgtable_free(table, shift);
+}
+#endif /* CONFIG_SMP */
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+ struct page *page = page_address(table);
+
+ tlb_flush_pgtable(tlb, address);
+ pgtable_page_dtor(page);
+ pgtable_free_tlb(tlb, page, 0);
+}
-#else /* CONFIG_PPC_64K_PAGES */
+#else /* if CONFIG_PPC_64K_PAGES */
#define pud_populate(mm, pud, pmd) pud_set(pud, (unsigned long)pmd)
@@ -83,31 +158,25 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
pmd_set(pmd, (unsigned long)pte);
}
-#define pmd_populate(mm, pmd, pte_page) \
- pmd_populate_kernel(mm, pmd, page_address(pte_page))
-#define pmd_pgtable(pmd) pmd_page(pmd)
-
-#endif /* CONFIG_PPC_64K_PAGES */
-
-static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
+ pgtable_t pte_page)
{
- return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE),
- GFP_KERNEL|__GFP_REPEAT);
+ pmd_populate_kernel(mm, pmd, page_address(pte_page));
}
-static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+static inline pgtable_t pmd_pgtable(pmd_t pmd)
{
- kmem_cache_free(PGT_CACHE(PMD_INDEX_SIZE), pmd);
+ return pmd_page(pmd);
}
static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
unsigned long address)
{
- return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
+ return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
}
static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
- unsigned long address)
+ unsigned long address)
{
struct page *page;
pte_t *pte;
@@ -120,6 +189,17 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
return page;
}
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+ pgtable_page_dtor(ptepage);
+ __free_page(ptepage);
+}
+
static inline void pgtable_free(void *table, unsigned index_size)
{
if (!index_size)
@@ -130,6 +210,55 @@ static inline void pgtable_free(void *table, unsigned index_size)
}
}
+#ifdef CONFIG_SMP
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ unsigned long pgf = (unsigned long)table;
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ pgf |= shift;
+ tlb_remove_table(tlb, (void *)pgf);
+}
+
+static inline void __tlb_remove_table(void *_table)
+{
+ void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+ unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+ pgtable_free(table, shift);
+}
+#else /* !CONFIG_SMP */
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ pgtable_free(table, shift);
+}
+#endif /* CONFIG_SMP */
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+ struct page *page = page_address(table);
+
+ tlb_flush_pgtable(tlb, address);
+ pgtable_page_dtor(page);
+ pgtable_free_tlb(tlb, page, 0);
+}
+
+#endif /* CONFIG_PPC_64K_PAGES */
+
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+ return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE),
+ GFP_KERNEL|__GFP_REPEAT);
+}
+
+static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+{
+ kmem_cache_free(PGT_CACHE(PMD_INDEX_SIZE), pmd);
+}
+
+
#define __pmd_free_tlb(tlb, pmd, addr) \
pgtable_free_tlb(tlb, pmd, PMD_INDEX_SIZE)
#ifndef CONFIG_PPC_64K_PAGES
diff --git a/arch/powerpc/include/asm/pgalloc.h b/arch/powerpc/include/asm/pgalloc.h
index bf301ac..e9a9f60 100644
--- a/arch/powerpc/include/asm/pgalloc.h
+++ b/arch/powerpc/include/asm/pgalloc.h
@@ -3,6 +3,7 @@
#ifdef __KERNEL__
#include <linux/mm.h>
+#include <asm-generic/tlb.h>
#ifdef CONFIG_PPC_BOOK3E
extern void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address);
@@ -13,56 +14,11 @@ static inline void tlb_flush_pgtable(struct mmu_gather *tlb,
}
#endif /* !CONFIG_PPC_BOOK3E */
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
- free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
-{
- pgtable_page_dtor(ptepage);
- __free_page(ptepage);
-}
-
#ifdef CONFIG_PPC64
#include <asm/pgalloc-64.h>
#else
#include <asm/pgalloc-32.h>
#endif
-#ifdef CONFIG_SMP
-struct mmu_gather;
-extern void tlb_remove_table(struct mmu_gather *, void *);
-
-static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
-{
- unsigned long pgf = (unsigned long)table;
- BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
- pgf |= shift;
- tlb_remove_table(tlb, (void *)pgf);
-}
-
-static inline void __tlb_remove_table(void *_table)
-{
- void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
- unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
-
- pgtable_free(table, shift);
-}
-#else /* CONFIG_SMP */
-static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, unsigned shift)
-{
- pgtable_free(table, shift);
-}
-#endif /* !CONFIG_SMP */
-
-static inline void __pte_free_tlb(struct mmu_gather *tlb, struct page *ptepage,
- unsigned long address)
-{
- tlb_flush_pgtable(tlb, address);
- pgtable_page_dtor(ptepage);
- pgtable_free_tlb(tlb, page_address(ptepage), 0);
-}
-
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_PGALLOC_H */
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 04/25] powerpc: Reduce the PTE_INDEX_SIZE
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This make one PMD cover 16MB range. That helps in easier implementation of THP
on power. THP core code make use of one pmd entry to track the hugepage and
the range mapped by a single pmd entry should be equal to the hugepage size
supported by the hardware.
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/pgtable-ppc64-64k.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-64k.h b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
index be4e287..3c529b4 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-64k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
@@ -4,10 +4,10 @@
#include <asm-generic/pgtable-nopud.h>
-#define PTE_INDEX_SIZE 12
+#define PTE_INDEX_SIZE 8
#define PMD_INDEX_SIZE 12
#define PUD_INDEX_SIZE 0
-#define PGD_INDEX_SIZE 6
+#define PGD_INDEX_SIZE 10
#ifndef __ASSEMBLY__
#define PTE_TABLE_SIZE (sizeof(real_pte_t) << PTE_INDEX_SIZE)
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 07/25] powerpc: Use encode avpn where we need only avpn values
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
In all these cases we are doing something similar to
HPTE_V_COMPARE(hpte_v, want_v) which ignores the HPTE_V_LARGE bit
With MPSS support we would need actual page size to set HPTE_V_LARGE
bit and that won't be available in most of these cases. Since we are ignoring
HPTE_V_LARGE bit, use the avpn value instead. There should not be any change
in behaviour after this patch.
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/mm/hash_native_64.c | 8 ++++----
arch/powerpc/platforms/cell/beat_htab.c | 10 +++++-----
arch/powerpc/platforms/ps3/htab.c | 2 +-
3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index ffc1e00..9d8983a 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -252,7 +252,7 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
unsigned long hpte_v, want_v;
int ret = 0;
- want_v = hpte_encode_v(vpn, psize, ssize);
+ want_v = hpte_encode_avpn(vpn, psize, ssize);
DBG_LOW(" update(vpn=%016lx, avpnv=%016lx, group=%lx, newpp=%lx)",
vpn, want_v & HPTE_V_AVPN, slot, newpp);
@@ -288,7 +288,7 @@ static long native_hpte_find(unsigned long vpn, int psize, int ssize)
unsigned long want_v, hpte_v;
hash = hpt_hash(vpn, mmu_psize_defs[psize].shift, ssize);
- want_v = hpte_encode_v(vpn, psize, ssize);
+ want_v = hpte_encode_avpn(vpn, psize, ssize);
/* Bolted mappings are only ever in the primary group */
slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
@@ -348,7 +348,7 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
DBG_LOW(" invalidate(vpn=%016lx, hash: %lx)\n", vpn, slot);
- want_v = hpte_encode_v(vpn, psize, ssize);
+ want_v = hpte_encode_avpn(vpn, psize, ssize);
native_lock_hpte(hptep);
hpte_v = hptep->v;
@@ -520,7 +520,7 @@ static void native_flush_hash_range(unsigned long number, int local)
slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
slot += hidx & _PTEIDX_GROUP_IX;
hptep = htab_address + slot;
- want_v = hpte_encode_v(vpn, psize, ssize);
+ want_v = hpte_encode_avpn(vpn, psize, ssize);
native_lock_hpte(hptep);
hpte_v = hptep->v;
if (!HPTE_V_COMPARE(hpte_v, want_v) ||
diff --git a/arch/powerpc/platforms/cell/beat_htab.c b/arch/powerpc/platforms/cell/beat_htab.c
index 0f6f839..472f9a7 100644
--- a/arch/powerpc/platforms/cell/beat_htab.c
+++ b/arch/powerpc/platforms/cell/beat_htab.c
@@ -191,7 +191,7 @@ static long beat_lpar_hpte_updatepp(unsigned long slot,
u64 dummy0, dummy1;
unsigned long want_v;
- want_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M);
+ want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
DBG_LOW(" update: "
"avpnv=%016lx, slot=%016lx, psize: %d, newpp %016lx ... ",
@@ -228,7 +228,7 @@ static long beat_lpar_hpte_find(unsigned long vpn, int psize)
unsigned long want_v, hpte_v;
hash = hpt_hash(vpn, mmu_psize_defs[psize].shift, MMU_SEGSIZE_256M);
- want_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M);
+ want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
for (j = 0; j < 2; j++) {
slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
@@ -283,7 +283,7 @@ static void beat_lpar_hpte_invalidate(unsigned long slot, unsigned long vpn,
DBG_LOW(" inval : slot=%lx, va=%016lx, psize: %d, local: %d\n",
slot, va, psize, local);
- want_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M);
+ want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
raw_spin_lock_irqsave(&beat_htab_lock, flags);
dummy1 = beat_lpar_hpte_getword0(slot);
@@ -372,7 +372,7 @@ static long beat_lpar_hpte_updatepp_v3(unsigned long slot,
unsigned long want_v;
unsigned long pss;
- want_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M);
+ want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc;
DBG_LOW(" update: "
@@ -402,7 +402,7 @@ static void beat_lpar_hpte_invalidate_v3(unsigned long slot, unsigned long vpn,
DBG_LOW(" inval : slot=%lx, vpn=%016lx, psize: %d, local: %d\n",
slot, vpn, psize, local);
- want_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M);
+ want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc;
lpar_rc = beat_invalidate_htab_entry3(0, slot, want_v, pss);
diff --git a/arch/powerpc/platforms/ps3/htab.c b/arch/powerpc/platforms/ps3/htab.c
index d00d7b0..07a4bba 100644
--- a/arch/powerpc/platforms/ps3/htab.c
+++ b/arch/powerpc/platforms/ps3/htab.c
@@ -115,7 +115,7 @@ static long ps3_hpte_updatepp(unsigned long slot, unsigned long newpp,
unsigned long flags;
long ret;
- want_v = hpte_encode_v(vpn, psize, ssize);
+ want_v = hpte_encode_avpn(vpn, psize, ssize);
spin_lock_irqsave(&ps3_htab_lock, flags);
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 11/25] powerpc: Print page size info during boot
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This gives hint about different base and actual page size combination
supported by the platform.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/mm/hash_utils_64.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 56ff4bb..1f2ebbd 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -315,7 +315,7 @@ static int __init htab_dt_scan_page_sizes(unsigned long node,
prop = (u32 *)of_get_flat_dt_prop(node,
"ibm,segment-page-sizes", &size);
if (prop != NULL) {
- DBG("Page sizes from device-tree:\n");
+ pr_info("Page sizes from device-tree:\n");
size /= 4;
cur_cpu_spec->mmu_features &= ~(MMU_FTR_16M_PAGE);
while(size > 0) {
@@ -369,10 +369,10 @@ static int __init htab_dt_scan_page_sizes(unsigned long node,
"shift=%d\n", base_shift, shift);
def->penc[idx] = penc;
- DBG(" %d: shift=%02x, sllp=%04lx, "
- "avpnm=%08lx, tlbiel=%d, penc=%d\n",
- idx, shift, def->sllp, def->avpnm,
- def->tlbiel, def->penc[idx]);
+ pr_info("base_shift=%d: shift=%d, sllp=0x%04lx,"
+ " avpnm=0x%08lx, tlbiel=%d, penc=%d\n",
+ base_shift, shift, def->sllp,
+ def->avpnm, def->tlbiel, def->penc[idx]);
}
}
return 1;
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 10/25] powerpc: print both base and actual page size on hash failure
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/mmu-hash64.h | 3 ++-
arch/powerpc/mm/hash_utils_64.c | 12 +++++++-----
arch/powerpc/mm/hugetlbpage-hash64.c | 2 +-
3 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index e42f4a3..e187254 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -324,7 +324,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
unsigned int shift, unsigned int mmu_psize);
extern void hash_failure_debug(unsigned long ea, unsigned long access,
unsigned long vsid, unsigned long trap,
- int ssize, int psize, unsigned long pte);
+ int ssize, int psize, int lpsize,
+ unsigned long pte);
extern int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
unsigned long pstart, unsigned long prot,
int psize, int ssize);
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index a5a5067..56ff4bb 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -933,14 +933,14 @@ static inline int subpage_protection(struct mm_struct *mm, unsigned long ea)
void hash_failure_debug(unsigned long ea, unsigned long access,
unsigned long vsid, unsigned long trap,
- int ssize, int psize, unsigned long pte)
+ int ssize, int psize, int lpsize, unsigned long pte)
{
if (!printk_ratelimit())
return;
pr_info("mm: Hashing failure ! EA=0x%lx access=0x%lx current=%s\n",
ea, access, current->comm);
- pr_info(" trap=0x%lx vsid=0x%lx ssize=%d psize=%d pte=0x%lx\n",
- trap, vsid, ssize, psize, pte);
+ pr_info(" trap=0x%lx vsid=0x%lx ssize=%d base psize=%d psize %d pte=0x%lx\n",
+ trap, vsid, ssize, psize, lpsize, pte);
}
/* Result code is:
@@ -1113,7 +1113,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
*/
if (rc == -1)
hash_failure_debug(ea, access, vsid, trap, ssize, psize,
- pte_val(*ptep));
+ psize, pte_val(*ptep));
#ifndef CONFIG_PPC_64K_PAGES
DBG_LOW(" o-pte: %016lx\n", pte_val(*ptep));
#else
@@ -1191,7 +1191,9 @@ void hash_preload(struct mm_struct *mm, unsigned long ea,
*/
if (rc == -1)
hash_failure_debug(ea, access, vsid, trap, ssize,
- mm->context.user_psize, pte_val(*ptep));
+ mm->context.user_psize,
+ mm->context.user_psize,
+ pte_val(*ptep));
local_irq_restore(flags);
}
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index e0d52ee..06ecb55 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -129,7 +129,7 @@ repeat:
if (unlikely(slot == -2)) {
*ptep = __pte(old_pte);
hash_failure_debug(ea, access, vsid, trap, ssize,
- mmu_psize, old_pte);
+ mmu_psize, mmu_psize, old_pte);
return -1;
}
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 01/25] powerpc: Use signed formatting when printing error
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
PAPR defines these errors as negative values. So print them accordingly
for easy debugging.
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/lpar.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 0da39fe..a77c35b 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -155,7 +155,7 @@ static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
*/
if (unlikely(lpar_rc != H_SUCCESS)) {
if (!(vflags & HPTE_V_BOLTED))
- pr_devel(" lpar err %lu\n", lpar_rc);
+ pr_devel(" lpar err %ld\n", lpar_rc);
return -2;
}
if (!(vflags & HPTE_V_BOLTED))
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 13/25] powerpc: Update tlbie/tlbiel as per ISA doc
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This make sure we handle multiple page size segment correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/mm/hash_native_64.c | 30 ++++++++++++++++++++++++++++--
1 file changed, 28 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index b461b2d..ac84fa6 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -61,7 +61,10 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
switch (psize) {
case MMU_PAGE_4K:
+ /* clear out bits after (52) [0....52.....63] */
+ va &= ~((1ul << (64 - 52)) - 1);
va |= ssize << 8;
+ va |= mmu_psize_defs[apsize].sllp << 6;
asm volatile(ASM_FTR_IFCLR("tlbie %0,0", PPC_TLBIE(%1,%0), %2)
: : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
: "memory");
@@ -69,9 +72,19 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
default:
/* We need 14 to 14 + i bits of va */
penc = mmu_psize_defs[psize].penc[apsize];
- va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
+ va &= ~((1ul << mmu_psize_defs[apsize].shift) - 1);
va |= penc << 12;
va |= ssize << 8;
+ /* Add AVAL part */
+ if (psize != apsize) {
+ /*
+ * MPSS, 64K base page size and 16MB parge page size
+ * We don't need all the bits, but this seems to work.
+ * vpn cover upto 65 bits of va. (0...65) and we need
+ * 58..64 bits of va.
+ */
+ va |= (vpn & 0xfe);
+ }
va |= 1; /* L */
asm volatile(ASM_FTR_IFCLR("tlbie %0,1", PPC_TLBIE(%1,%0), %2)
: : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
@@ -96,16 +109,29 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
switch (psize) {
case MMU_PAGE_4K:
+ /* clear out bits after(52) [0....52.....63] */
+ va &= ~((1ul << (64 - 52)) - 1);
va |= ssize << 8;
+ va |= mmu_psize_defs[apsize].sllp << 6;
asm volatile(".long 0x7c000224 | (%0 << 11) | (0 << 21)"
: : "r"(va) : "memory");
break;
default:
/* We need 14 to 14 + i bits of va */
penc = mmu_psize_defs[psize].penc[apsize];
- va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
+ va &= ~((1ul << mmu_psize_defs[apsize].shift) - 1);
va |= penc << 12;
va |= ssize << 8;
+ /* Add AVAL part */
+ if (psize != apsize) {
+ /*
+ * MPSS, 64K base page size and 16MB parge page size
+ * We don't need all the bits, but this seems to work.
+ * vpn cover upto 65 bits of va. (0...65) and we need
+ * 58..64 bits of va.
+ */
+ va |= (vpn & 0xfe);
+ }
va |= 1; /* L */
asm volatile(".long 0x7c000224 | (%0 << 11) | (1 << 21)"
: : "r"(va) : "memory");
--
1.7.10
^ permalink raw reply related
* [PATCH -V4 08/25] powerpc: Decode the pte-lp-encoding bits correctly.
From: Aneesh Kumar K.V @ 2013-03-20 19:34 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1363808110-25748-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
We look at both the segment base page size and actual page size and store
the pte-lp-encodings in an array per base page size.
We also update all relevant functions to take actual page size argument
so that we can use the correct PTE LP encoding in HPTE. This should also
get the basic Multiple Page Size per Segment (MPSS) support. This is needed
to enable THP on ppc64.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/machdep.h | 3 +-
arch/powerpc/include/asm/mmu-hash64.h | 33 ++++----
arch/powerpc/kvm/book3s_hv.c | 2 +-
arch/powerpc/mm/hash_low_64.S | 18 ++--
arch/powerpc/mm/hash_native_64.c | 138 ++++++++++++++++++++++---------
arch/powerpc/mm/hash_utils_64.c | 121 +++++++++++++++++----------
arch/powerpc/mm/hugetlbpage-hash64.c | 4 +-
arch/powerpc/platforms/cell/beat_htab.c | 16 ++--
arch/powerpc/platforms/ps3/htab.c | 6 +-
arch/powerpc/platforms/pseries/lpar.c | 6 +-
10 files changed, 230 insertions(+), 117 deletions(-)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 19d9d96..6cee6e0 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -50,7 +50,8 @@ struct machdep_calls {
unsigned long prpn,
unsigned long rflags,
unsigned long vflags,
- int psize, int ssize);
+ int psize, int apsize,
+ int ssize);
long (*hpte_remove)(unsigned long hpte_group);
void (*hpte_removebolted)(unsigned long ea,
int psize, int ssize);
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 300ac3c..e42f4a3 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -154,7 +154,7 @@ extern unsigned long htab_hash_mask;
struct mmu_psize_def
{
unsigned int shift; /* number of bits */
- unsigned int penc; /* HPTE encoding */
+ int penc[MMU_PAGE_COUNT]; /* HPTE encoding */
unsigned int tlbiel; /* tlbiel supported for that page size */
unsigned long avpnm; /* bits to mask out in AVPN in the HPTE */
unsigned long sllp; /* SLB L||LP (exact mask to use in slbmte) */
@@ -181,6 +181,13 @@ struct mmu_psize_def
*/
#define VPN_SHIFT 12
+/*
+ * HPTE Large Page (LP) details
+ */
+#define LP_SHIFT 12
+#define LP_BITS 8
+#define LP_MASK(i) ((0xFF >> (i)) << LP_SHIFT)
+
#ifndef __ASSEMBLY__
static inline int segment_shift(int ssize)
@@ -237,14 +244,14 @@ static inline unsigned long hpte_encode_avpn(unsigned long vpn, int psize,
/*
* This function sets the AVPN and L fields of the HPTE appropriately
- * for the page size
+ * using the base page size and actual page size.
*/
-static inline unsigned long hpte_encode_v(unsigned long vpn,
- int psize, int ssize)
+static inline unsigned long hpte_encode_v(unsigned long vpn, int base_psize,
+ int actual_psize, int ssize)
{
unsigned long v;
- v = hpte_encode_avpn(vpn, psize, ssize);
- if (psize != MMU_PAGE_4K)
+ v = hpte_encode_avpn(vpn, base_psize, ssize);
+ if (actual_psize != MMU_PAGE_4K)
v |= HPTE_V_LARGE;
return v;
}
@@ -254,19 +261,17 @@ static inline unsigned long hpte_encode_v(unsigned long vpn,
* for the page size. We assume the pa is already "clean" that is properly
* aligned for the requested page size
*/
-static inline unsigned long hpte_encode_r(unsigned long pa, int psize)
+static inline unsigned long hpte_encode_r(unsigned long pa, int base_psize,
+ int actual_psize)
{
- unsigned long r;
-
/* A 4K page needs no special encoding */
- if (psize == MMU_PAGE_4K)
+ if (actual_psize == MMU_PAGE_4K)
return pa & HPTE_R_RPN;
else {
- unsigned int penc = mmu_psize_defs[psize].penc;
- unsigned int shift = mmu_psize_defs[psize].shift;
- return (pa & ~((1ul << shift) - 1)) | (penc << 12);
+ unsigned int penc = mmu_psize_defs[base_psize].penc[actual_psize];
+ unsigned int shift = mmu_psize_defs[actual_psize].shift;
+ return (pa & ~((1ul << shift) - 1)) | (penc << LP_SHIFT);
}
- return r;
}
/*
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 71d0c90..48f6d99 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1515,7 +1515,7 @@ static void kvmppc_add_seg_page_size(struct kvm_ppc_one_seg_page_size **sps,
(*sps)->page_shift = def->shift;
(*sps)->slb_enc = def->sllp;
(*sps)->enc[0].page_shift = def->shift;
- (*sps)->enc[0].pte_enc = def->penc;
+ (*sps)->enc[0].pte_enc = def->penc[linux_psize];
(*sps)++;
}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index abdd5e2..0e980ac 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -196,7 +196,8 @@ htab_insert_pte:
mr r4,r29 /* Retrieve vpn */
li r7,0 /* !bolted, !secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert1)
bl . /* Patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -219,7 +220,8 @@ _GLOBAL(htab_call_hpte_insert1)
mr r4,r29 /* Retrieve vpn */
li r7,HPTE_V_SECONDARY /* !bolted, secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert2)
bl . /* Patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -515,7 +517,8 @@ htab_special_pfn:
mr r4,r29 /* Retrieve vpn */
li r7,0 /* !bolted, !secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert1)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -542,7 +545,8 @@ _GLOBAL(htab_call_hpte_insert1)
mr r4,r29 /* Retrieve vpn */
li r7,HPTE_V_SECONDARY /* !bolted, secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert2)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -840,7 +844,8 @@ ht64_insert_pte:
mr r4,r29 /* Retrieve vpn */
li r7,0 /* !bolted, !secondary */
li r8,MMU_PAGE_64K
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_64K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(ht64_call_hpte_insert1)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -863,7 +868,8 @@ _GLOBAL(ht64_call_hpte_insert1)
mr r4,r29 /* Retrieve vpn */
li r7,HPTE_V_SECONDARY /* !bolted, secondary */
li r8,MMU_PAGE_64K
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_64K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(ht64_call_hpte_insert2)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index 9d8983a..aa0499b 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -39,7 +39,7 @@
DEFINE_RAW_SPINLOCK(native_tlbie_lock);
-static inline void __tlbie(unsigned long vpn, int psize, int ssize)
+static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
{
unsigned long va;
unsigned int penc;
@@ -68,7 +68,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int ssize)
break;
default:
/* We need 14 to 14 + i bits of va */
- penc = mmu_psize_defs[psize].penc;
+ penc = mmu_psize_defs[psize].penc[apsize];
va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
va |= penc << 12;
va |= ssize << 8;
@@ -80,7 +80,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int ssize)
}
}
-static inline void __tlbiel(unsigned long vpn, int psize, int ssize)
+static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
{
unsigned long va;
unsigned int penc;
@@ -102,7 +102,7 @@ static inline void __tlbiel(unsigned long vpn, int psize, int ssize)
break;
default:
/* We need 14 to 14 + i bits of va */
- penc = mmu_psize_defs[psize].penc;
+ penc = mmu_psize_defs[psize].penc[apsize];
va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
va |= penc << 12;
va |= ssize << 8;
@@ -114,7 +114,8 @@ static inline void __tlbiel(unsigned long vpn, int psize, int ssize)
}
-static inline void tlbie(unsigned long vpn, int psize, int ssize, int local)
+static inline void tlbie(unsigned long vpn, int psize, int apsize,
+ int ssize, int local)
{
unsigned int use_local = local && mmu_has_feature(MMU_FTR_TLBIEL);
int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
@@ -125,10 +126,10 @@ static inline void tlbie(unsigned long vpn, int psize, int ssize, int local)
raw_spin_lock(&native_tlbie_lock);
asm volatile("ptesync": : :"memory");
if (use_local) {
- __tlbiel(vpn, psize, ssize);
+ __tlbiel(vpn, psize, apsize, ssize);
asm volatile("ptesync": : :"memory");
} else {
- __tlbie(vpn, psize, ssize);
+ __tlbie(vpn, psize, apsize, ssize);
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
if (lock_tlbie && !use_local)
@@ -156,7 +157,7 @@ static inline void native_unlock_hpte(struct hash_pte *hptep)
static long native_hpte_insert(unsigned long hpte_group, unsigned long vpn,
unsigned long pa, unsigned long rflags,
- unsigned long vflags, int psize, int ssize)
+ unsigned long vflags, int psize, int apsize, int ssize)
{
struct hash_pte *hptep = htab_address + hpte_group;
unsigned long hpte_v, hpte_r;
@@ -183,8 +184,8 @@ static long native_hpte_insert(unsigned long hpte_group, unsigned long vpn,
if (i == HPTES_PER_GROUP)
return -1;
- hpte_v = hpte_encode_v(vpn, psize, ssize) | vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_v = hpte_encode_v(vpn, psize, apsize, ssize) | vflags | HPTE_V_VALID;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED)) {
DBG_LOW(" i=%x hpte_v=%016lx, hpte_r=%016lx\n",
@@ -244,6 +245,48 @@ static long native_hpte_remove(unsigned long hpte_group)
return i;
}
+static inline int hpte_actual_psize(struct hash_pte *hptep, int psize)
+{
+ int i, shift;
+ unsigned int mask;
+ /* Look at the 8 bit LP value */
+ unsigned int lp = (hptep->r >> LP_SHIFT) & ((1 << LP_BITS) - 1);
+
+ if (!(hptep->v & HPTE_V_VALID))
+ return -1;
+
+ /* First check if it is large page */
+ if (!(hptep->v & HPTE_V_LARGE))
+ return MMU_PAGE_4K;
+
+ /* start from 1 ignoring MMU_PAGE_4K */
+ for (i = 1; i < MMU_PAGE_COUNT; i++) {
+ /* valid entries have a shift value */
+ if (!mmu_psize_defs[i].shift)
+ continue;
+
+ /* invalid penc */
+ if (mmu_psize_defs[psize].penc[i] == -1)
+ continue;
+ /*
+ * encoding bits per actual page size
+ * PTE LP actual page size
+ * rrrr rrrz >=8KB
+ * rrrr rrzz >=16KB
+ * rrrr rzzz >=32KB
+ * rrrr zzzz >=64KB
+ * .......
+ */
+ shift = mmu_psize_defs[i].shift - LP_SHIFT;
+ if (shift > LP_BITS)
+ shift = LP_BITS;
+ mask = (1 << shift) - 1;
+ if ((lp & mask) == mmu_psize_defs[psize].penc[i])
+ return i;
+ }
+ return -1;
+}
+
static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
unsigned long vpn, int psize, int ssize,
int local)
@@ -251,6 +294,7 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
struct hash_pte *hptep = htab_address + slot;
unsigned long hpte_v, want_v;
int ret = 0;
+ int actual_psize;
want_v = hpte_encode_avpn(vpn, psize, ssize);
@@ -260,9 +304,13 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
native_lock_hpte(hptep);
hpte_v = hptep->v;
-
+ actual_psize = hpte_actual_psize(hptep, psize);
+ if (actual_psize < 0) {
+ native_unlock_hpte(hptep);
+ return -1;
+ }
/* Even if we miss, we need to invalidate the TLB */
- if (!HPTE_V_COMPARE(hpte_v, want_v) || !(hpte_v & HPTE_V_VALID)) {
+ if (!HPTE_V_COMPARE(hpte_v, want_v)) {
DBG_LOW(" -> miss\n");
ret = -1;
} else {
@@ -274,7 +322,7 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
native_unlock_hpte(hptep);
/* Ensure it is out of the tlb too. */
- tlbie(vpn, psize, ssize, local);
+ tlbie(vpn, psize, actual_psize, ssize, local);
return ret;
}
@@ -315,6 +363,7 @@ static long native_hpte_find(unsigned long vpn, int psize, int ssize)
static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea,
int psize, int ssize)
{
+ int actual_psize;
unsigned long vpn;
unsigned long vsid;
long slot;
@@ -327,13 +376,16 @@ static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea,
if (slot == -1)
panic("could not find page to bolt\n");
hptep = htab_address + slot;
+ actual_psize = hpte_actual_psize(hptep, psize);
+ if (actual_psize < 0)
+ return;
/* Update the HPTE */
hptep->r = (hptep->r & ~(HPTE_R_PP | HPTE_R_N)) |
(newpp & (HPTE_R_PP | HPTE_R_N));
/* Ensure it is out of the tlb too. */
- tlbie(vpn, psize, ssize, 0);
+ tlbie(vpn, psize, actual_psize, ssize, 0);
}
static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
@@ -343,6 +395,7 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
unsigned long hpte_v;
unsigned long want_v;
unsigned long flags;
+ int actual_psize;
local_irq_save(flags);
@@ -352,35 +405,38 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
native_lock_hpte(hptep);
hpte_v = hptep->v;
+ actual_psize = hpte_actual_psize(hptep, psize);
+ if (actual_psize < 0) {
+ native_unlock_hpte(hptep);
+ local_irq_restore(flags);
+ return;
+ }
/* Even if we miss, we need to invalidate the TLB */
- if (!HPTE_V_COMPARE(hpte_v, want_v) || !(hpte_v & HPTE_V_VALID))
+ if (!HPTE_V_COMPARE(hpte_v, want_v))
native_unlock_hpte(hptep);
else
/* Invalidate the hpte. NOTE: this also unlocks it */
hptep->v = 0;
/* Invalidate the TLB */
- tlbie(vpn, psize, ssize, local);
+ tlbie(vpn, psize, actual_psize, ssize, local);
local_irq_restore(flags);
}
-#define LP_SHIFT 12
-#define LP_BITS 8
-#define LP_MASK(i) ((0xFF >> (i)) << LP_SHIFT)
-
static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
- int *psize, int *ssize, unsigned long *vpn)
+ int *psize, int *apsize, int *ssize, unsigned long *vpn)
{
unsigned long avpn, pteg, vpi;
unsigned long hpte_r = hpte->r;
unsigned long hpte_v = hpte->v;
unsigned long vsid, seg_off;
- int i, size, shift, penc;
+ int i, size, a_size, shift, penc;
- if (!(hpte_v & HPTE_V_LARGE))
- size = MMU_PAGE_4K;
- else {
+ if (!(hpte_v & HPTE_V_LARGE)) {
+ size = MMU_PAGE_4K;
+ a_size = MMU_PAGE_4K;
+ } else {
for (i = 0; i < LP_BITS; i++) {
if ((hpte_r & LP_MASK(i+1)) == LP_MASK(i+1))
break;
@@ -388,19 +444,26 @@ static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
penc = LP_MASK(i+1) >> LP_SHIFT;
for (size = 0; size < MMU_PAGE_COUNT; size++) {
- /* 4K pages are not represented by LP */
- if (size == MMU_PAGE_4K)
- continue;
-
/* valid entries have a shift value */
if (!mmu_psize_defs[size].shift)
continue;
+ for (a_size = 0; a_size < MMU_PAGE_COUNT; a_size++) {
- if (penc == mmu_psize_defs[size].penc)
- break;
+ /* 4K pages are not represented by LP */
+ if (a_size == MMU_PAGE_4K)
+ continue;
+
+ /* valid entries have a shift value */
+ if (!mmu_psize_defs[a_size].shift)
+ continue;
+
+ if (penc == mmu_psize_defs[size].penc[a_size])
+ goto out;
+ }
}
}
+out:
/* This works for all page sizes, and for 256M and 1T segments */
*ssize = hpte_v >> HPTE_V_SSIZE_SHIFT;
shift = mmu_psize_defs[size].shift;
@@ -433,7 +496,8 @@ static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
default:
*vpn = size = 0;
}
- *psize = size;
+ *psize = size;
+ *apsize = a_size;
}
/*
@@ -451,7 +515,7 @@ static void native_hpte_clear(void)
struct hash_pte *hptep = htab_address;
unsigned long hpte_v;
unsigned long pteg_count;
- int psize, ssize;
+ int psize, apsize, ssize;
pteg_count = htab_hash_mask + 1;
@@ -477,9 +541,9 @@ static void native_hpte_clear(void)
* already hold the native_tlbie_lock.
*/
if (hpte_v & HPTE_V_VALID) {
- hpte_decode(hptep, slot, &psize, &ssize, &vpn);
+ hpte_decode(hptep, slot, &psize, &apsize, &ssize, &vpn);
hptep->v = 0;
- __tlbie(vpn, psize, ssize);
+ __tlbie(vpn, psize, apsize, ssize);
}
}
@@ -540,7 +604,7 @@ static void native_flush_hash_range(unsigned long number, int local)
pte_iterate_hashed_subpages(pte, psize,
vpn, index, shift) {
- __tlbiel(vpn, psize, ssize);
+ __tlbiel(vpn, psize, psize, ssize);
} pte_iterate_hashed_end();
}
asm volatile("ptesync":::"memory");
@@ -557,7 +621,7 @@ static void native_flush_hash_range(unsigned long number, int local)
pte_iterate_hashed_subpages(pte, psize,
vpn, index, shift) {
- __tlbie(vpn, psize, ssize);
+ __tlbie(vpn, psize, psize, ssize);
} pte_iterate_hashed_end();
}
asm volatile("eieio; tlbsync; ptesync":::"memory");
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index bfeab83..a5a5067 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -125,7 +125,7 @@ static struct mmu_psize_def mmu_psize_defaults_old[] = {
[MMU_PAGE_4K] = {
.shift = 12,
.sllp = 0,
- .penc = 0,
+ .penc = {[MMU_PAGE_4K] = 0, [1 ... MMU_PAGE_COUNT - 1] = -1},
.avpnm = 0,
.tlbiel = 0,
},
@@ -139,14 +139,15 @@ static struct mmu_psize_def mmu_psize_defaults_gp[] = {
[MMU_PAGE_4K] = {
.shift = 12,
.sllp = 0,
- .penc = 0,
+ .penc = {[MMU_PAGE_4K] = 0, [1 ... MMU_PAGE_COUNT - 1] = -1},
.avpnm = 0,
.tlbiel = 1,
},
[MMU_PAGE_16M] = {
.shift = 24,
.sllp = SLB_VSID_L,
- .penc = 0,
+ .penc = {[0 ... MMU_PAGE_16M - 1] = -1, [MMU_PAGE_16M] = 0,
+ [MMU_PAGE_16M + 1 ... MMU_PAGE_COUNT - 1] = -1 },
.avpnm = 0x1UL,
.tlbiel = 0,
},
@@ -208,7 +209,7 @@ int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
BUG_ON(!ppc_md.hpte_insert);
ret = ppc_md.hpte_insert(hpteg, vpn, paddr, tprot,
- HPTE_V_BOLTED, psize, ssize);
+ HPTE_V_BOLTED, psize, psize, ssize);
if (ret < 0)
break;
@@ -275,6 +276,30 @@ static void __init htab_init_seg_sizes(void)
of_scan_flat_dt(htab_dt_scan_seg_sizes, NULL);
}
+static int __init get_idx_from_shift(unsigned int shift)
+{
+ int idx = -1;
+
+ switch (shift) {
+ case 0xc:
+ idx = MMU_PAGE_4K;
+ break;
+ case 0x10:
+ idx = MMU_PAGE_64K;
+ break;
+ case 0x14:
+ idx = MMU_PAGE_1M;
+ break;
+ case 0x18:
+ idx = MMU_PAGE_16M;
+ break;
+ case 0x22:
+ idx = MMU_PAGE_16G;
+ break;
+ }
+ return idx;
+}
+
static int __init htab_dt_scan_page_sizes(unsigned long node,
const char *uname, int depth,
void *data)
@@ -294,60 +319,61 @@ static int __init htab_dt_scan_page_sizes(unsigned long node,
size /= 4;
cur_cpu_spec->mmu_features &= ~(MMU_FTR_16M_PAGE);
while(size > 0) {
- unsigned int shift = prop[0];
+ unsigned int base_shift = prop[0];
unsigned int slbenc = prop[1];
unsigned int lpnum = prop[2];
- unsigned int lpenc = 0;
struct mmu_psize_def *def;
- int idx = -1;
+ int idx, base_idx;
size -= 3; prop += 3;
- while(size > 0 && lpnum) {
- if (prop[0] == shift)
- lpenc = prop[1];
- prop += 2; size -= 2;
- lpnum--;
+ base_idx = get_idx_from_shift(base_shift);
+ if (base_idx < 0) {
+ /*
+ * skip the pte encoding also
+ */
+ prop += lpnum * 2; size -= lpnum * 2;
+ continue;
}
- switch(shift) {
- case 0xc:
- idx = MMU_PAGE_4K;
- break;
- case 0x10:
- idx = MMU_PAGE_64K;
- break;
- case 0x14:
- idx = MMU_PAGE_1M;
- break;
- case 0x18:
- idx = MMU_PAGE_16M;
+ def = &mmu_psize_defs[base_idx];
+ if (base_idx == MMU_PAGE_16M)
cur_cpu_spec->mmu_features |= MMU_FTR_16M_PAGE;
- break;
- case 0x22:
- idx = MMU_PAGE_16G;
- break;
- }
- if (idx < 0)
- continue;
- def = &mmu_psize_defs[idx];
- def->shift = shift;
- if (shift <= 23)
+
+ def->shift = base_shift;
+ if (base_shift <= 23)
def->avpnm = 0;
else
- def->avpnm = (1 << (shift - 23)) - 1;
+ def->avpnm = (1 << (base_shift - 23)) - 1;
def->sllp = slbenc;
- def->penc = lpenc;
- /* We don't know for sure what's up with tlbiel, so
+ /*
+ * We don't know for sure what's up with tlbiel, so
* for now we only set it for 4K and 64K pages
*/
- if (idx == MMU_PAGE_4K || idx == MMU_PAGE_64K)
+ if (base_idx == MMU_PAGE_4K || base_idx == MMU_PAGE_64K)
def->tlbiel = 1;
else
def->tlbiel = 0;
- DBG(" %d: shift=%02x, sllp=%04lx, avpnm=%08lx, "
- "tlbiel=%d, penc=%d\n",
- idx, shift, def->sllp, def->avpnm, def->tlbiel,
- def->penc);
+ while (size > 0 && lpnum) {
+ unsigned int shift = prop[0];
+ int penc = prop[1];
+
+ prop += 2; size -= 2;
+ lpnum--;
+
+ idx = get_idx_from_shift(shift);
+ if (idx < 0)
+ continue;
+
+ if (penc == -1)
+ pr_err("Invalid penc for base_shift=%d "
+ "shift=%d\n", base_shift, shift);
+
+ def->penc[idx] = penc;
+ DBG(" %d: shift=%02x, sllp=%04lx, "
+ "avpnm=%08lx, tlbiel=%d, penc=%d\n",
+ idx, shift, def->sllp, def->avpnm,
+ def->tlbiel, def->penc[idx]);
+ }
}
return 1;
}
@@ -396,10 +422,21 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
}
#endif /* CONFIG_HUGETLB_PAGE */
+static void mmu_psize_set_default_penc(void)
+{
+ int bpsize, apsize;
+ for (bpsize = 0; bpsize < MMU_PAGE_COUNT; bpsize++)
+ for (apsize = 0; apsize < MMU_PAGE_COUNT; apsize++)
+ mmu_psize_defs[bpsize].penc[apsize] = -1;
+}
+
static void __init htab_init_page_sizes(void)
{
int rc;
+ /* se the invalid penc to -1 */
+ mmu_psize_set_default_penc();
+
/* Default to 4K pages only */
memcpy(mmu_psize_defs, mmu_psize_defaults_old,
sizeof(mmu_psize_defaults_old));
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index cecad34..e0d52ee 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -103,7 +103,7 @@ repeat:
/* Insert into the hash table, primary slot */
slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
- mmu_psize, ssize);
+ mmu_psize, mmu_psize, ssize);
/* Primary is full, try the secondary */
if (unlikely(slot == -1)) {
@@ -111,7 +111,7 @@ repeat:
HPTES_PER_GROUP) & ~0x7UL;
slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags,
HPTE_V_SECONDARY,
- mmu_psize, ssize);
+ mmu_psize, mmu_psize, ssize);
if (slot == -1) {
if (mftb() & 0x1)
hpte_group = ((hash & htab_hash_mask) *
diff --git a/arch/powerpc/platforms/cell/beat_htab.c b/arch/powerpc/platforms/cell/beat_htab.c
index 472f9a7..246e1d8 100644
--- a/arch/powerpc/platforms/cell/beat_htab.c
+++ b/arch/powerpc/platforms/cell/beat_htab.c
@@ -90,7 +90,7 @@ static inline unsigned int beat_read_mask(unsigned hpte_group)
static long beat_lpar_hpte_insert(unsigned long hpte_group,
unsigned long vpn, unsigned long pa,
unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
unsigned long lpar_rc;
u64 hpte_v, hpte_r, slot;
@@ -103,9 +103,9 @@ static long beat_lpar_hpte_insert(unsigned long hpte_group,
"rflags=%lx, vflags=%lx, psize=%d)\n",
hpte_group, va, pa, rflags, vflags, psize);
- hpte_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M) |
+ hpte_v = hpte_encode_v(vpn, psize, apsize, MMU_SEGSIZE_256M) |
vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED))
DBG_LOW(" hpte_v=%016lx, hpte_r=%016lx\n", hpte_v, hpte_r);
@@ -314,7 +314,7 @@ void __init hpte_init_beat(void)
static long beat_lpar_hpte_insert_v3(unsigned long hpte_group,
unsigned long vpn, unsigned long pa,
unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
unsigned long lpar_rc;
u64 hpte_v, hpte_r, slot;
@@ -327,9 +327,9 @@ static long beat_lpar_hpte_insert_v3(unsigned long hpte_group,
"rflags=%lx, vflags=%lx, psize=%d)\n",
hpte_group, vpn, pa, rflags, vflags, psize);
- hpte_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M) |
+ hpte_v = hpte_encode_v(vpn, psize, apsize, MMU_SEGSIZE_256M) |
vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED))
DBG_LOW(" hpte_v=%016lx, hpte_r=%016lx\n", hpte_v, hpte_r);
@@ -373,7 +373,7 @@ static long beat_lpar_hpte_updatepp_v3(unsigned long slot,
unsigned long pss;
want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
- pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc;
+ pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc[psize];
DBG_LOW(" update: "
"avpnv=%016lx, slot=%016lx, psize: %d, newpp %016lx ... ",
@@ -403,7 +403,7 @@ static void beat_lpar_hpte_invalidate_v3(unsigned long slot, unsigned long vpn,
DBG_LOW(" inval : slot=%lx, vpn=%016lx, psize: %d, local: %d\n",
slot, vpn, psize, local);
want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
- pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc;
+ pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc[psize];
lpar_rc = beat_invalidate_htab_entry3(0, slot, want_v, pss);
diff --git a/arch/powerpc/platforms/ps3/htab.c b/arch/powerpc/platforms/ps3/htab.c
index 07a4bba..44f06d2 100644
--- a/arch/powerpc/platforms/ps3/htab.c
+++ b/arch/powerpc/platforms/ps3/htab.c
@@ -45,7 +45,7 @@ static DEFINE_SPINLOCK(ps3_htab_lock);
static long ps3_hpte_insert(unsigned long hpte_group, unsigned long vpn,
unsigned long pa, unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
int result;
u64 hpte_v, hpte_r;
@@ -61,8 +61,8 @@ static long ps3_hpte_insert(unsigned long hpte_group, unsigned long vpn,
*/
vflags &= ~HPTE_V_SECONDARY;
- hpte_v = hpte_encode_v(vpn, psize, ssize) | vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(ps3_mm_phys_to_lpar(pa), psize) | rflags;
+ hpte_v = hpte_encode_v(vpn, psize, apsize, ssize) | vflags | HPTE_V_VALID;
+ hpte_r = hpte_encode_r(ps3_mm_phys_to_lpar(pa), psize, apsize) | rflags;
spin_lock_irqsave(&ps3_htab_lock, flags);
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index a77c35b..3daced3 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -109,7 +109,7 @@ void vpa_init(int cpu)
static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
unsigned long vpn, unsigned long pa,
unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
unsigned long lpar_rc;
unsigned long flags;
@@ -121,8 +121,8 @@ static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
"pa=%016lx, rflags=%lx, vflags=%lx, psize=%d)\n",
hpte_group, vpn, pa, rflags, vflags, psize);
- hpte_v = hpte_encode_v(vpn, psize, ssize) | vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_v = hpte_encode_v(vpn, psize, apsize, ssize) | vflags | HPTE_V_VALID;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED))
pr_devel(" hpte_v=%016lx, hpte_r=%016lx\n", hpte_v, hpte_r);
--
1.7.10
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox