LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH][v3] sata_fsl: add workaround for data length mismatch on freescale V2 controller
From: Shaohui Xie @ 2012-09-07 10:01 UTC
  To: jgarzik, linux-ide; +Cc: linuxppc-dev, linux-kernel, Anju Bhartiya, Shaohui Xie

The Freescale V2 SATA controller checks whether the received data length
matches the programmed length 'ttl'; if it does not, the controller treats
this as an error. For ATAPI commands, 'ttl' is based on the maximum
allocation length rather than the actual data transfer length, so the
controller raises the 'DLM' (Data Length Mismatch) error bit in the Hstatus
register. Along with 'DLM', the DE (Device Error) and FE (Fatal Error) bits
are set in the Hstatus register, the 'E' (Internal Error) bit is set in the
Serror register, and the CE (Command Error) and DE (Device Error) registers
have the corresponding bits set. In this condition the errors need to be
cleared as follows: in the service routine, when the 'DLM' flag is set,
set and then clear HCONTROL[27], which clears the Hstatus, CE and DE
registers, and then clear the Serror register.

Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Anju Bhartiya <Anju.Bhartiya@freescale.com>
---
changes for v3:
1. do not use uppercase for variable names;
2. remove unnecessary parentheses;

changes for v2:
1. remove the use of the quirk;
2. wrap the errata code in a condition;

 drivers/ata/sata_fsl.c |   39 +++++++++++++++++++++++++++++++++++----
 1 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/ata/sata_fsl.c b/drivers/ata/sata_fsl.c
index d6577b9..9fbab68 100644
--- a/drivers/ata/sata_fsl.c
+++ b/drivers/ata/sata_fsl.c
@@ -143,6 +143,7 @@ enum {
 	    FATAL_ERR_CRC_ERR_RX |
 	    FATAL_ERR_FIFO_OVRFL_TX | FATAL_ERR_FIFO_OVRFL_RX,
 
+	INT_ON_DATA_LENGTH_MISMATCH = (1 << 12),
 	INT_ON_FATAL_ERR = (1 << 5),
 	INT_ON_PHYRDY_CHG = (1 << 4),
 
@@ -1181,25 +1182,55 @@ static void sata_fsl_host_intr(struct ata_port *ap)
 	u32 hstatus, done_mask = 0;
 	struct ata_queued_cmd *qc;
 	u32 SError;
+	u32 tag;
+	u32 status_mask = INT_ON_ERROR;
 
 	hstatus = ioread32(hcr_base + HSTATUS);
 
 	sata_fsl_scr_read(&ap->link, SCR_ERROR, &SError);
 
+	/* Read command completed register */
+	done_mask = ioread32(hcr_base + CC);
+
+	/* Workaround for data length mismatch errata */
+	if (unlikely(hstatus & INT_ON_DATA_LENGTH_MISMATCH)) {
+		for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
+			qc = ata_qc_from_tag(ap, tag);
+			if (qc && ata_is_atapi(qc->tf.protocol)) {
+				u32 hcontrol;
+#define HCONTROL_CLEAR_ERROR	(1 << 27)
+				/* Set HControl[27] to clear error registers */
+				hcontrol = ioread32(hcr_base + HCONTROL);
+				iowrite32(hcontrol | HCONTROL_CLEAR_ERROR,
+						hcr_base + HCONTROL);
+
+				/* Clear HControl[27] */
+				iowrite32(hcontrol & ~HCONTROL_CLEAR_ERROR,
+						hcr_base + HCONTROL);
+
+				/* Clear SError[E] bit */
+				sata_fsl_scr_write(&ap->link, SCR_ERROR,
+						SError);
+
+				/* Ignore fatal error and device error */
+				status_mask &= ~(INT_ON_SINGL_DEVICE_ERR
+						| INT_ON_FATAL_ERR);
+				break;
+			}
+		}
+	}
+
 	if (unlikely(SError & 0xFFFF0000)) {
 		DPRINTK("serror @host_intr : 0x%x\n", SError);
 		sata_fsl_error_intr(ap);
 	}
 
-	if (unlikely(hstatus & INT_ON_ERROR)) {
+	if (unlikely(hstatus & status_mask)) {
 		DPRINTK("error interrupt!!\n");
 		sata_fsl_error_intr(ap);
 		return;
 	}
 
-	/* Read command completed register */
-	done_mask = ioread32(hcr_base + CC);
-
 	VPRINTK("Status of all queues :\n");
 	VPRINTK("done_mask/CC = 0x%x, CA = 0x%x, CE=0x%x,CQ=0x%x,apqa=0x%x\n",
 		done_mask,
-- 
1.6.4
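
As a rough illustration of the masking logic in the hunk above, here is a
userspace sketch with the register I/O left out. The parameter atapi_active
is a hypothetical stand-in for the driver's loop over queued commands. The
bit position of INT_ON_SINGL_DEVICE_ERR and the makeup of INT_ON_ERROR are
assumptions, since the hunk does not show them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values taken from the patch hunk. */
#define INT_ON_DATA_LENGTH_MISMATCH (1u << 12)
#define INT_ON_FATAL_ERR            (1u << 5)
#define INT_ON_PHYRDY_CHG           (1u << 4)

/* Assumed for illustration: these are not shown in the hunk. */
#define INT_ON_SINGL_DEVICE_ERR     (1u << 1)
#define INT_ON_ERROR \
	(INT_ON_FATAL_ERR | INT_ON_PHYRDY_CHG | INT_ON_SINGL_DEVICE_ERR)

/*
 * Return the set of HSTATUS bits that should still be treated as errors.
 * atapi_active is a hypothetical stand-in for "some queued command uses
 * an ATAPI protocol".
 */
static uint32_t effective_error_mask(uint32_t hstatus, bool atapi_active)
{
	uint32_t status_mask = INT_ON_ERROR;

	/* On a DLM report for ATAPI, ignore fatal and device errors. */
	if ((hstatus & INT_ON_DATA_LENGTH_MISMATCH) && atapi_active)
		status_mask &= ~(INT_ON_SINGL_DEVICE_ERR | INT_ON_FATAL_ERR);

	return status_mask;
}

/* True when the interrupt should go down the error-handling path. */
static bool needs_error_handling(uint32_t hstatus, bool atapi_active)
{
	uint32_t mask = effective_error_mask(hstatus, atapi_active);

	assert((mask & ~INT_ON_ERROR) == 0); /* the mask only ever shrinks */
	return (hstatus & mask) != 0;
}
```

With these assumed values, a DLM report that arrives together with the fatal
and device error bits is not treated as an error while an ATAPI command is
active, but a PHY-ready change reported in the same interrupt still is.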


* Re: [PATCH v2 1/2] [powerpc] Change memory_limit from phys_addr_t to unsigned long long
From: Suzuki K. Poulose @ 2012-09-07 10:01 UTC
  To: Benjamin Herrenschmidt; +Cc: mahesh, linuxppc-dev, linux-kernel
In-Reply-To: <1346981712.2385.30.camel@pasglop>

On 09/07/2012 07:05 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2012-08-21 at 17:12 +0530, Suzuki K. Poulose wrote:
>> There are some device-tree nodes whose values are of type phys_addr_t.
>> The size of phys_addr_t varies depending on CONFIG_PHYS_ADDR_T_64BIT.
>>
>> Change these to a fixed unsigned long long for consistency.
>>
>> This patch does the change only for memory_limit.
>>
>> The following is a list of such variables which need the change:
>>
>>   1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c
>>
>>   2) (struct resource *)crashk_res.start - We could export a local static
>>      variable from machine_kexec.c.
>>
>> Changing the above values might break the kexec-tools. So, I will
>> fix kexec-tools first to handle the different sized values and then change
>>   the above.
>>
>> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
>> ---
>
> Breaks the build on some configs (with 32-bit phys_addr_t):

Sorry for that.
>
> /home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c: In function
> 'early_init_devtree':
> /home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c:664:25: error:
> comparison of distinct pointer types lacks a cast
>
> I'm fixing that myself this time but please be more careful.
Sure. Thanks Ben for fixing that.

Suzuki
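
For context, "comparison of distinct pointer types lacks a cast" is the kind
of diagnostic that type-checked comparison helpers produce when their two
operands have different types, which can happen once memory_limit is an
unsigned long long while the other operand is still a 32-bit phys_addr_t.
The thread shows neither the offending expression nor the fix, so the
following is only a sketch of the general remedy: widen explicitly, then
compare. The names phys_addr32_t and clamp_to_limit are hypothetical, and
treating a limit of 0 as "no limit" is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for phys_addr_t on a config where it is 32 bits. */
typedef uint32_t phys_addr32_t;

/* The whole point of the exercise: the limit type is the wider one. */
static_assert(sizeof(unsigned long long) > sizeof(phys_addr32_t),
	      "limit type must be wider than the 32-bit address type");

/*
 * Clamp a 32-bit physical size to a 64-bit limit. The size is converted
 * to unsigned long long first, so the comparison is done on one type.
 * A limit of 0 is treated as "no limit" (assumption).
 */
static unsigned long long clamp_to_limit(phys_addr32_t size,
					 unsigned long long limit)
{
	unsigned long long wide_size = size;

	if (limit == 0)
		return wide_size;

	return wide_size < limit ? wide_size : limit;
}
```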


* Re: [PATCH] powerpc/powernv: move the dereference below the NULL test
From: Benjamin Herrenschmidt @ 2012-09-07  7:59 UTC
  To: Wei Yongjun
  Cc: devicetree-discuss, linux-kernel, rob.herring, yongjun_wei,
	paulus, linuxppc-dev
In-Reply-To: <CAPgLHd_w9B8sHg9i9msFLx8FVBypqTtDEVOt1-VBNTW4zwHTMQ@mail.gmail.com>

On Fri, 2012-09-07 at 14:45 +0800, Wei Yongjun wrote:
> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> 
> The dereference should be moved below the NULL test.
> 
> spatch with a semantic match was used to find this.
> (http://coccinelle.lip6.fr/)

I haven't applied this patch yet (there was a similar one recently from
another semantic checker I believe) because that code is about to be
deeply reworked (waiting for some dependencies to get in), so this will
just make the patch harder to apply, and the stuff should never be NULL
in the first place anyway.

So let's leave that aside for a bit.

Cheers,
Ben.

> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> ---
>  arch/powerpc/platforms/powernv/pci.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index be3cfc5..4ba89c1 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -287,13 +287,15 @@ static int pnv_pci_read_config(struct pci_bus *bus,
>  			       int where, int size, u32 *val)
>  {
>  	struct pci_controller *hose = pci_bus_to_host(bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb;
>  	u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
>  	s64 rc;
>  
>  	if (hose == NULL)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> +	phb = hose->private_data;
> +
>  	switch (size) {
>  	case 1: {
>  		u8 v8;
> @@ -331,12 +333,14 @@ static int pnv_pci_write_config(struct pci_bus *bus,
>  				int where, int size, u32 val)
>  {
>  	struct pci_controller *hose = pci_bus_to_host(bus);
> -	struct pnv_phb *phb = hose->private_data;
> +	struct pnv_phb *phb;
>  	u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
>  
>  	if (hose == NULL)
>  		return PCIBIOS_DEVICE_NOT_FOUND;
>  
> +	phb = hose->private_data;
> +
>  	cfg_dbg("pnv_pci_write_config bus: %x devfn: %x +%x/%x -> %08x\n",
>  		bus->number, devfn, where, size, val);
>  	switch (size) {


* Re: [PATCH -V8 0/11] arch/powerpc: Add 64TB support to ppc64
From: Benjamin Herrenschmidt @ 2012-09-07  7:53 UTC
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, paulus
In-Reply-To: <871uiexuau.fsf@linux.vnet.ibm.com>

On Fri, 2012-09-07 at 11:12 +0530, Aneesh Kumar K.V wrote:

> 
> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> index 428f23e..057a12a 100644
> --- a/arch/powerpc/include/asm/mmu-hash64.h
> +++ b/arch/powerpc/include/asm/mmu-hash64.h
> @@ -14,6 +14,7 @@
>  
>  #include <asm/asm-compat.h>
>  #include <asm/page.h>
> +#include <asm/pgtable-ppc64-range.h>

Nah, that's all too gross... I think the right thing to do is to move
the slice stuff out of page_64.h

>  /*
>   * Segment table
> @@ -415,12 +416,7 @@ extern void slb_set_size(u16 size);
>  	add	rt,rt,rx
>  
>  /* 4 bits per slice and we have one slice per 1TB */
> -#if 0 /* We can't directly include pgtable.h hence this hack */
>  #define SLICE_ARRAY_SIZE  (PGTABLE_RANGE >> 41)
> -#else
> -/* Right now we only support 64TB */
> -#define SLICE_ARRAY_SIZE  32
> -#endif
>  
>  #ifndef __ASSEMBLY__
>  
> diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
> index b55beb4..01ab518 100644
> --- a/arch/powerpc/include/asm/page_64.h
> +++ b/arch/powerpc/include/asm/page_64.h
> @@ -78,16 +78,14 @@ extern u64 ppc64_pft_size;
>  #define GET_LOW_SLICE_INDEX(addr)	((addr) >> SLICE_LOW_SHIFT)
>  #define GET_HIGH_SLICE_INDEX(addr)	((addr) >> SLICE_HIGH_SHIFT)
>  
> -/* 1 bit per slice and we have one slice per 1TB */
> -#if 0 /* We can't directly include pgtable.h hence this hack */
> -#define SLICE_MASK_SIZE (PGTABLE_RANGE >> 43)
> -#else
> -/*
> +/* 1 bit per slice and we have one slice per 1TB
>   * Right now we support only 64TB.
>   * IF we change this we will have to change the type
>   * of high_slices
>   */
>  #define SLICE_MASK_SIZE 8
> +#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
> +#error PGTABLE_RANGE exceeds slice_mask high_slices size
>  #endif
>  
>  #ifndef __ASSEMBLY__
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64-range.h b/arch/powerpc/include/asm/pgtable-ppc64-range.h
> new file mode 100644
> index 0000000..04a825c
> --- /dev/null
> +++ b/arch/powerpc/include/asm/pgtable-ppc64-range.h
> @@ -0,0 +1,16 @@
> +#ifndef _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
> +#define _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
> +
> +#ifdef CONFIG_PPC_64K_PAGES
> +#include <asm/pgtable-ppc64-64k.h>
> +#else
> +#include <asm/pgtable-ppc64-4k.h>
> +#endif
> +
> +/*
> + * Size of EA range mapped by our pagetables.
> + */
> +#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
> +			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
> +#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
> +#endif
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index dea953f..ee783b4 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -13,13 +13,7 @@
>  
>  #define FIRST_USER_ADDRESS	0
>  
> -/*
> - * Size of EA range mapped by our pagetables.
> - */
> -#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
> -                	    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
> -#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
> -
> +#include <asm/pgtable-ppc64-range.h>
>  
>  /* Some sanity checking */
>  #if TASK_SIZE_USER64 > PGTABLE_RANGE
> @@ -32,14 +26,6 @@
>  #endif
>  #endif
>  
> -#if (PGTABLE_RANGE >> 41) > SLICE_ARRAY_SIZE
> -#error PGTABLE_RANGE exceeds SLICE_ARRAY_SIZE
> -#endif
> -
> -#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
> -#error PGTABLE_RANGE exceeds slice_mask high_slices size
> -#endif
> -
>  /*
>   * Define the address range of the kernel non-linear virtual area
>   */

Ben.
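
The two shifts in the quoted diff can be checked with plain arithmetic.
Assuming PGTABLE_RANGE is 64TB (2^46), as the "Right now we only support
64TB" comments suggest, one slice per 1TB gives 64 slices. At 4 bits per
slice that is 32 bytes (range >> 41), and at 1 bit per slice it is 8 bytes
(range >> 43), matching the hard-coded 32 and 8. The names below are
illustrative, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define TB_SHIFT   40                  /* 1TB = 2^40 bytes */
#define RANGE_64TB (UINT64_C(1) << 46) /* assumed value of PGTABLE_RANGE */

/* Number of 1TB slices covered by an address range. */
static uint64_t slice_count(uint64_t range)
{
	return range >> TB_SHIFT;
}

/* Bytes needed to store bits_per_slice bits for every slice in range. */
static uint64_t slice_bytes(uint64_t range, unsigned int bits_per_slice)
{
	return slice_count(range) * bits_per_slice / 8;
}

/* The shifts in the patch are the same computation folded into one step. */
static_assert((RANGE_64TB >> 41) == 32, "4 bits per slice: 32 bytes");
static_assert((RANGE_64TB >> 43) == 8, "1 bit per slice: 8 bytes");
```

The compile-time checks play the same role as the `#if ... #error` guards in
the diff: if the range grows, the build fails instead of silently
overflowing a fixed-size array or mask.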


* [PATCH 3/3] powerpc: cleanup old DABRX #defines
From: Michael Neuling @ 2012-09-07  7:24 UTC
  To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>

These are no longer used, so get rid of them.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/include/asm/hvcall.h |    5 -----
 1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 423cf9e..7a86706 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -152,11 +152,6 @@
 #define H_VASI_RESUMED          5
 #define H_VASI_COMPLETED        6
 
-/* DABRX flags */
-#define H_DABRX_HYPERVISOR	(1UL<<(63-61))
-#define H_DABRX_KERNEL		(1UL<<(63-62))
-#define H_DABRX_USER		(1UL<<(63-63))
-
 /* Each control block has to be on a 4K boundary */
 #define H_CB_ALIGNMENT          4096
 
-- 
1.7.9.5
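
The removed H_DABRX_* macros use IBM bit numbering, where bit 0 is the most
significant bit of a 64-bit word, so 1UL << (63 - n) selects bit n. Working
the arithmetic through shows that they name the same bits as DABRX_HYP,
DABRX_KERNEL and DABRX_USER as defined in patch 1/3, which is presumably why
the generic names can replace them. IBM_BIT64 and ibm_to_lsb0 are
hypothetical helper names used only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit n in IBM numbering, where bit 0 is the MSB. */
#define IBM_BIT64(n) (UINT64_C(1) << (63 - (n)))

/* The definitions removed by this patch, rewritten with the helper. */
#define H_DABRX_HYPERVISOR IBM_BIT64(61)
#define H_DABRX_KERNEL     IBM_BIT64(62)
#define H_DABRX_USER       IBM_BIT64(63)

/* The generic definitions, with the values quoted in patch 1/3. */
#define DABRX_USER   (UINT64_C(1) << 0)
#define DABRX_KERNEL (UINT64_C(1) << 1)
#define DABRX_HYP    (UINT64_C(1) << 2)

static_assert(H_DABRX_USER == DABRX_USER, "user bit matches");
static_assert(H_DABRX_KERNEL == DABRX_KERNEL, "kernel bit matches");
static_assert(H_DABRX_HYPERVISOR == DABRX_HYP, "hypervisor bit matches");

/* Convert an IBM bit number to the conventional LSB-0 bit position. */
static unsigned int ibm_to_lsb0(unsigned int ibm_bit)
{
	return 63 - ibm_bit;
}
```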


* [PATCH 2/3] powerpc: Dynamically calculate the dabrx based on kernel/user/hypervisor
From: Michael Neuling @ 2012-09-07  7:24 UTC
  To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>

Currently we mark the DABRX to interrupt on all matches
(hypervisor/kernel/user) and then filter in software.  We can be a lot
smarter now that we can set the DABRX dynamically.

This sets the DABRX based on the flags passed by the user.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/include/asm/hw_breakpoint.h |    1 +
 arch/powerpc/kernel/hw_breakpoint.c      |   15 +++++++++++----
 arch/powerpc/platforms/pseries/setup.c   |    2 +-
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
index c6f48eb..4234245 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -28,6 +28,7 @@
 
 struct arch_hw_breakpoint {
 	unsigned long	address;
+	unsigned long	dabrx;
 	int		type;
 	u8		len; /* length of the target data symbol */
 	bool		extraneous_interrupt;
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 6891d79..a89cae4 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -73,7 +73,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
 	 * If so, DABR will be populated in single_step_dabr_instruction().
 	 */
 	if (current->thread.last_hit_ubp != bp)
-		set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+		set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
 
 	return 0;
 }
@@ -170,6 +170,13 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
 
 	info->address = bp->attr.bp_addr;
 	info->len = bp->attr.bp_len;
+	info->dabrx = DABRX_ALL;
+	if (bp->attr.exclude_user)
+		info->dabrx &= ~DABRX_USER;
+	if (bp->attr.exclude_kernel)
+		info->dabrx &= ~DABRX_KERNEL;
+	if (bp->attr.exclude_hv)
+		info->dabrx &= ~DABRX_HYP;
 
 	/*
 	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
@@ -197,7 +204,7 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
 
 	info = counter_arch_bp(tsk->thread.last_hit_ubp);
 	regs->msr &= ~MSR_SE;
-	set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+	set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
 	tsk->thread.last_hit_ubp = NULL;
 }
 
@@ -281,7 +288,7 @@ int __kprobes hw_breakpoint_handler(struct die_args *args)
 	if (!info->extraneous_interrupt)
 		perf_bp_event(bp, regs);
 
-	set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+	set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
 out:
 	rcu_read_unlock();
 	return rc;
@@ -313,7 +320,7 @@ int __kprobes single_step_dabr_instruction(struct die_args *args)
 	if (!info->extraneous_interrupt)
 		perf_bp_event(bp, regs);
 
-	set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+	set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
 	current->thread.last_hit_ubp = NULL;
 
 	/*
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index b90deaf..40b30e4 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -427,7 +427,7 @@ static int pseries_set_xdabr(unsigned long dabr, unsigned long dabrx)
 	if (dabrx == 0 && dabr == 0)
 		dabrx = DABRX_USER;
 	/* PAPR says we can only set kernel and user bits */
-	dabrx &= H_DABRX_KERNEL | H_DABRX_USER;
+	dabrx &= DABRX_KERNEL | DABRX_USER;
 
 	return plpar_hcall_norets(H_SET_XDABR, dabr, dabrx);
 }
-- 
1.7.9.5
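
The flag handling added to arch_validate_hwbkpt_settings() can be sketched
in isolation as follows. The DABRX_* values are the ones defined in patch
1/3; the function name dabrx_from_flags and its three boolean parameters are
hypothetical stand-ins for the exclude_user, exclude_kernel and exclude_hv
attribute bits.

```c
#include <assert.h>
#include <stdbool.h>

#define DABRX_USER   (1UL << 0)
#define DABRX_KERNEL (1UL << 1)
#define DABRX_HYP    (1UL << 2)
#define DABRX_BTI    (1UL << 3)
#define DABRX_ALL    (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

/* Start from "match everything" and drop each excluded privilege level. */
static unsigned long dabrx_from_flags(bool exclude_user, bool exclude_kernel,
				      bool exclude_hv)
{
	unsigned long dabrx = DABRX_ALL;

	if (exclude_user)
		dabrx &= ~DABRX_USER;
	if (exclude_kernel)
		dabrx &= ~DABRX_KERNEL;
	if (exclude_hv)
		dabrx &= ~DABRX_HYP;

	/* Only the three privilege bits can have been cleared. */
	assert((dabrx & DABRX_BTI) != 0);
	return dabrx;
}
```

Note that in this scheme DABRX_BTI stays set whatever the flags are, since
the patch never clears it; excluding all three levels leaves a value of
DABRX_BTI rather than zero.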


* [PATCH 1/3] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Michael Neuling @ 2012-09-07  7:24 UTC
  To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>

Rework set_dabr to take a DABRX value as well.

Both the pseries and PS3 hypervisors do some checks on the DABRX
values that are passed in the hcall.  This patch stops bogus values
from being passed to hypervisor.  Also, in the case where we are
clearing the breakpoint, where DABR and DABRX are zero, we modify the
DABRX value to make it valid so that the hcall won't fail.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/include/asm/debug.h         |    2 +-
 arch/powerpc/include/asm/hw_breakpoint.h |    2 +-
 arch/powerpc/include/asm/machdep.h       |    3 ++-
 arch/powerpc/include/asm/processor.h     |    1 +
 arch/powerpc/include/asm/reg.h           |    3 +++
 arch/powerpc/kernel/hw_breakpoint.c      |   12 ++++++------
 arch/powerpc/kernel/process.c            |   14 +++++++-------
 arch/powerpc/kernel/ptrace.c             |    3 +++
 arch/powerpc/kernel/signal.c             |    2 +-
 arch/powerpc/platforms/cell/beat.c       |    4 ++--
 arch/powerpc/platforms/cell/beat.h       |    2 +-
 arch/powerpc/platforms/ps3/setup.c       |   10 +++++++---
 arch/powerpc/platforms/pseries/setup.c   |   14 +++++++++-----
 arch/powerpc/xmon/xmon.c                 |    4 ++--
 14 files changed, 46 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
index 716d2f0..32de257 100644
--- a/arch/powerpc/include/asm/debug.h
+++ b/arch/powerpc/include/asm/debug.h
@@ -44,7 +44,7 @@ static inline int debugger_dabr_match(struct pt_regs *regs) { return 0; }
 static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
 #endif
 
-extern int set_dabr(unsigned long dabr);
+extern int set_dabr(unsigned long dabr, unsigned long dabrx);
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
 extern void do_send_trap(struct pt_regs *regs, unsigned long address,
 			 unsigned long error_code, int signal_code, int brkpt);
diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
index 39b323e..c6f48eb 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -61,7 +61,7 @@ extern void ptrace_triggered(struct perf_event *bp,
 			struct perf_sample_data *data, struct pt_regs *regs);
 static inline void hw_breakpoint_disable(void)
 {
-	set_dabr(0);
+	set_dabr(0, 0);
 }
 extern void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs);
 
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 42ce570..236b477 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -180,7 +180,8 @@ struct machdep_calls {
 	void		(*enable_pmcs)(void);
 
 	/* Set DABR for this platform, leave empty for default implemenation */
-	int		(*set_dabr)(unsigned long dabr);
+	int		(*set_dabr)(unsigned long dabr,
+				    unsigned long dabrx);
 
 #ifdef CONFIG_PPC32	/* XXX for now */
 	/* A general init function, called by ppc_init in init/main.c.
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 54b73a2..17b58e5 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -219,6 +219,7 @@ struct thread_struct {
 #endif /* CONFIG_HAVE_HW_BREAKPOINT */
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
+	unsigned long	dabrx;		/*      ... extension  */
 #ifdef CONFIG_ALTIVEC
 	/* Complete AltiVec register set */
 	vector128	vr[32] __attribute__((aligned(16)));
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 6386086..334be34 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -208,6 +208,9 @@
 #define SPRN_DABRX	0x3F7	/* Data Address Breakpoint Register Extension */
 #define   DABRX_USER	(1UL << 0)
 #define   DABRX_KERNEL	(1UL << 1)
+#define   DABRX_HYP	(1UL << 2)
+#define   DABRX_BTI	(1UL << 3)
+#define   DABRX_ALL     (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
 #define SPRN_DAR	0x013	/* Data Address Register */
 #define SPRN_DBCR	0x136	/* e300 Data Breakpoint Control Reg */
 #define SPRN_DSISR	0x012	/* Data Storage Interrupt Status Register */
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 6767445..6891d79 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -73,7 +73,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
 	 * If so, DABR will be populated in single_step_dabr_instruction().
 	 */
 	if (current->thread.last_hit_ubp != bp)
-		set_dabr(info->address | info->type | DABR_TRANSLATION);
+		set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
 
 	return 0;
 }
@@ -97,7 +97,7 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp)
 	}
 
 	*slot = NULL;
-	set_dabr(0);
+	set_dabr(0, 0);
 }
 
 /*
@@ -197,7 +197,7 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
 
 	info = counter_arch_bp(tsk->thread.last_hit_ubp);
 	regs->msr &= ~MSR_SE;
-	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
 	tsk->thread.last_hit_ubp = NULL;
 }
 
@@ -215,7 +215,7 @@ int __kprobes hw_breakpoint_handler(struct die_args *args)
 	unsigned long dar = regs->dar;
 
 	/* Disable breakpoints during exception handling */
-	set_dabr(0);
+	set_dabr(0, 0);
 
 	/*
 	 * The counter may be concurrently released but that can only
@@ -281,7 +281,7 @@ int __kprobes hw_breakpoint_handler(struct die_args *args)
 	if (!info->extraneous_interrupt)
 		perf_bp_event(bp, regs);
 
-	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
 out:
 	rcu_read_unlock();
 	return rc;
@@ -313,7 +313,7 @@ int __kprobes single_step_dabr_instruction(struct die_args *args)
 	if (!info->extraneous_interrupt)
 		perf_bp_event(bp, regs);
 
-	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
 	current->thread.last_hit_ubp = NULL;
 
 	/*
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1a1f2dd..53c32a9 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -283,7 +283,7 @@ void do_dabr(struct pt_regs *regs, unsigned long address,
 		return;
 
 	/* Clear the DABR */
-	set_dabr(0);
+	set_dabr(0, 0);
 
 	/* Deliver the signal to userspace */
 	info.si_signo = SIGTRAP;
@@ -364,18 +364,19 @@ static void set_debug_reg_defaults(struct thread_struct *thread)
 {
 	if (thread->dabr) {
 		thread->dabr = 0;
-		set_dabr(0);
+		thread->dabrx = 0;
+		set_dabr(0, 0);
 	}
 }
 #endif /* !CONFIG_HAVE_HW_BREAKPOINT */
 #endif	/* CONFIG_PPC_ADV_DEBUG_REGS */
 
-int set_dabr(unsigned long dabr)
+int set_dabr(unsigned long dabr, unsigned long dabrx)
 {
 	__get_cpu_var(current_dabr) = dabr;
 
 	if (ppc_md.set_dabr)
-		return ppc_md.set_dabr(dabr);
+		return ppc_md.set_dabr(dabr, dabrx);
 
 	/* XXX should we have a CPU_FTR_HAS_DABR ? */
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
@@ -385,9 +386,8 @@ int set_dabr(unsigned long dabr)
 #endif
 #elif defined(CONFIG_PPC_BOOK3S)
 	mtspr(SPRN_DABR, dabr);
+	mtspr(SPRN_DABRX, dabrx);
 #endif
-
-
 	return 0;
 }
 
@@ -480,7 +480,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
  */
 #ifndef CONFIG_HAVE_HW_BREAKPOINT
 	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
-		set_dabr(new->thread.dabr);
+		set_dabr(new->thread.dabr, new->thread.dabrx);
 #endif /* CONFIG_HAVE_HW_BREAKPOINT */
 #endif
 
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index c10fc28..79d8e56 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -960,6 +960,7 @@ int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 		thread->ptrace_bps[0] = bp;
 		ptrace_put_breakpoints(task);
 		thread->dabr = data;
+		thread->dabrx = DABRX_ALL;
 		return 0;
 	}
 
@@ -983,6 +984,7 @@ int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 
 	/* Move contents to the DABR register */
 	task->thread.dabr = data;
+	task->thread.dabrx = DABRX_ALL;
 #else /* CONFIG_PPC_ADV_DEBUG_REGS */
 	/* As described above, it was assumed 3 bits were passed with the data
 	 *  address, but we will assume only the mode bits will be passed
@@ -1397,6 +1399,7 @@ static long ppc_set_hwdebug(struct task_struct *child,
 		dabr |= DABR_DATA_WRITE;
 
 	child->thread.dabr = dabr;
+	child->thread.dabrx = DABRX_ALL;
 
 	return 1;
 #endif /* !CONFIG_PPC_ADV_DEBUG_DVCS */
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 5c023c9..b936b45 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -130,7 +130,7 @@ static int do_signal(struct pt_regs *regs)
 	 * triggered inside the kernel.
 	 */
 	if (current->thread.dabr)
-		set_dabr(current->thread.dabr);
+		set_dabr(current->thread.dabr, current->thread.dabrx);
 #endif
 	/* Re-enable the breakpoints for the signal stack */
 	thread_change_pc(current, regs);
diff --git a/arch/powerpc/platforms/cell/beat.c b/arch/powerpc/platforms/cell/beat.c
index 852592b..affcf56 100644
--- a/arch/powerpc/platforms/cell/beat.c
+++ b/arch/powerpc/platforms/cell/beat.c
@@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
 	return BEAT_NVRAM_SIZE;
 }
 
-int beat_set_xdabr(unsigned long dabr)
+int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
 {
-	if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
+	if (beat_set_dabr(dabr, dabrx))
 		return -1;
 	return 0;
 }
diff --git a/arch/powerpc/platforms/cell/beat.h b/arch/powerpc/platforms/cell/beat.h
index 32c8efc..bfcb8e3 100644
--- a/arch/powerpc/platforms/cell/beat.h
+++ b/arch/powerpc/platforms/cell/beat.h
@@ -32,7 +32,7 @@ void beat_get_rtc_time(struct rtc_time *);
 ssize_t beat_nvram_get_size(void);
 ssize_t beat_nvram_read(char *, size_t, loff_t *);
 ssize_t beat_nvram_write(char *, size_t, loff_t *);
-int beat_set_xdabr(unsigned long);
+int beat_set_xdabr(unsigned long, unsigned long);
 void beat_power_save(void);
 void beat_kexec_cpu_down(int, int);
 
diff --git a/arch/powerpc/platforms/ps3/setup.c b/arch/powerpc/platforms/ps3/setup.c
index 2d664c5..3f509f8 100644
--- a/arch/powerpc/platforms/ps3/setup.c
+++ b/arch/powerpc/platforms/ps3/setup.c
@@ -184,11 +184,15 @@ early_param("ps3flash", early_parse_ps3flash);
 #define prealloc_ps3flash_bounce_buffer()	do { } while (0)
 #endif
 
-static int ps3_set_dabr(unsigned long dabr)
+static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
 {
-	enum {DABR_USER = 1, DABR_KERNEL = 2,};
+	/* Have to set at least one bit in the DABRX */
+	if (dabrx == 0 && dabr == 0)
+		dabrx = DABRX_USER;
+	/* hypervisor only allows us to set BTI, Kernel and user */
+	dabrx &= DABRX_BTI | DABRX_KERNEL | DABRX_USER;
 
-	return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
+	return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
 }
 
 static void __init ps3_setup_arch(void)
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 36b7744..b90deaf 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -416,16 +416,20 @@ static int __init pSeries_init_panel(void)
 }
 machine_arch_initcall(pseries, pSeries_init_panel);
 
-static int pseries_set_dabr(unsigned long dabr)
+static int pseries_set_dabr(unsigned long dabr, unsigned long dabrx)
 {
 	return plpar_hcall_norets(H_SET_DABR, dabr);
 }
 
-static int pseries_set_xdabr(unsigned long dabr)
+static int pseries_set_xdabr(unsigned long dabr, unsigned long dabrx)
 {
-	/* We want to catch accesses from kernel and userspace */
-	return plpar_hcall_norets(H_SET_XDABR, dabr,
-			H_DABRX_KERNEL | H_DABRX_USER);
+	/* Have to set at least one bit in the DABRX according to PAPR */
+	if (dabrx == 0 && dabr == 0)
+		dabrx = DABRX_USER;
+	/* PAPR says we can only set kernel and user bits */
+	dabrx &= H_DABRX_KERNEL | H_DABRX_USER;
+
+	return plpar_hcall_norets(H_SET_XDABR, dabr, dabrx);
 }
 
 #define CMO_CHARACTERISTICS_TOKEN 44
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 9b49c65..987f441 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -740,7 +740,7 @@ static void insert_bpts(void)
 static void insert_cpu_bpts(void)
 {
 	if (dabr.enabled)
-		set_dabr(dabr.address | (dabr.enabled & 7));
+		set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
 	if (iabr && cpu_has_feature(CPU_FTR_IABR))
 		mtspr(SPRN_IABR, iabr->address
 			 | (iabr->enabled & (BP_IABR|BP_IABR_TE)));
@@ -768,7 +768,7 @@ static void remove_bpts(void)
 
 static void remove_cpu_bpts(void)
 {
-	set_dabr(0);
+	set_dabr(0, 0);
 	if (cpu_has_feature(CPU_FTR_IABR))
 		mtspr(SPRN_IABR, 0);
 }
-- 
1.7.9.5
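
The value filtering that pseries_set_xdabr() and ps3_set_dabr() gain in this
patch follows one pattern: substitute a valid DABRX when the breakpoint is
being cleared, then mask off bits the hypervisor does not accept. A minimal
sketch of that pattern, with the hcall itself left out; sanitize_dabrx and
the two *_ALLOWED masks are hypothetical names, while the bit values are the
ones from asm/reg.h in the diff above.

```c
#include <assert.h>

#define DABRX_USER   (1UL << 0)
#define DABRX_KERNEL (1UL << 1)
#define DABRX_HYP    (1UL << 2)
#define DABRX_BTI    (1UL << 3)
#define DABRX_ALL    (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

/* Bits each platform keeps, per the comments in the patch. */
#define PSERIES_ALLOWED (DABRX_KERNEL | DABRX_USER)
#define PS3_ALLOWED     (DABRX_BTI | DABRX_KERNEL | DABRX_USER)

/* Return the DABRX value that would be handed to the hypervisor. */
static unsigned long sanitize_dabrx(unsigned long dabr, unsigned long dabrx,
				    unsigned long allowed)
{
	/* The fallback bit must itself be permitted by the platform. */
	assert((allowed & DABRX_USER) != 0);

	/* Clearing the breakpoint: at least one DABRX bit has to be set. */
	if (dabrx == 0 && dabr == 0)
		dabrx = DABRX_USER;

	return dabrx & allowed;
}
```

One consequence visible in the sketch: a request for hypervisor-only
matching (DABRX_HYP) with a non-zero DABR is masked down to 0 under the
pseries mask. Whether the hypervisor accepts that combination is not
something this thread addresses.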


* [PATCH v2 0/3] powerpc: HW filtering of breakpoint for user/kernel/hypervisor events
From: Michael Neuling @ 2012-09-07  7:24 UTC
  To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>

This is in response to Geert's concerns.

I am only reposting the last two patches, as the first three are already in
benh's next tree.

I have also added a patch to clean up some #defines that we can now remove.

Michael Neuling (3):
  powerpc: Rework set_dabr so it can take a DABRX value as well
  powerpc: Dynamically calculate the dabrx based on
    kernel/user/hypervisor
  powerpc: cleanup old DABRX #defines

 arch/powerpc/include/asm/debug.h         |    2 +-
 arch/powerpc/include/asm/hvcall.h        |    5 -----
 arch/powerpc/include/asm/hw_breakpoint.h |    3 ++-
 arch/powerpc/include/asm/machdep.h       |    3 ++-
 arch/powerpc/include/asm/processor.h     |    1 +
 arch/powerpc/include/asm/reg.h           |    3 +++
 arch/powerpc/kernel/hw_breakpoint.c      |   19 +++++++++++++------
 arch/powerpc/kernel/process.c            |   14 +++++++-------
 arch/powerpc/kernel/ptrace.c             |    3 +++
 arch/powerpc/kernel/signal.c             |    2 +-
 arch/powerpc/platforms/cell/beat.c       |    4 ++--
 arch/powerpc/platforms/cell/beat.h       |    2 +-
 arch/powerpc/platforms/ps3/setup.c       |   10 +++++++---
 arch/powerpc/platforms/pseries/setup.c   |   14 +++++++++-----
 arch/powerpc/xmon/xmon.c                 |    4 ++--
 15 files changed, 54 insertions(+), 35 deletions(-)

-- 
1.7.9.5


* [PATCH] powerpc/powernv: move the dereference below the NULL test
From: Wei Yongjun @ 2012-09-07  6:45 UTC
  To: benh, paulus, grant.likely, rob.herring
  Cc: yongjun_wei, linuxppc-dev, devicetree-discuss, linux-kernel

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

The dereference should be moved below the NULL test.

spatch with a semantic match was used to find this.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
 arch/powerpc/platforms/powernv/pci.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index be3cfc5..4ba89c1 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -287,13 +287,15 @@ static int pnv_pci_read_config(struct pci_bus *bus,
 			       int where, int size, u32 *val)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb;
 	u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
 	s64 rc;
 
 	if (hose == NULL)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
+	phb = hose->private_data;
+
 	switch (size) {
 	case 1: {
 		u8 v8;
@@ -331,12 +333,14 @@ static int pnv_pci_write_config(struct pci_bus *bus,
 				int where, int size, u32 val)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
-	struct pnv_phb *phb = hose->private_data;
+	struct pnv_phb *phb;
 	u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
 
 	if (hose == NULL)
 		return PCIBIOS_DEVICE_NOT_FOUND;
 
+	phb = hose->private_data;
+
 	cfg_dbg("pnv_pci_write_config bus: %x devfn: %x +%x/%x -> %08x\n",
 		bus->number, devfn, where, size, val);
 	switch (size) {
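
The pattern this patch applies can be shown in isolation. The following is a minimal sketch, not the powernv code: `demo_hose`, `demo_phb`, `demo_read_phb_id()` and `DEMO_DEVICE_NOT_FOUND` are hypothetical stand-ins, and the second NULL check on `phb` is an addition for the example only.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct pci_controller and struct pnv_phb. */
struct demo_phb {
	int id;
};

struct demo_hose {
	struct demo_phb *private_data;
};

#define DEMO_DEVICE_NOT_FOUND (-1)

/* Returns the PHB id, or DEMO_DEVICE_NOT_FOUND if there is nothing to read. */
static int demo_read_phb_id(const struct demo_hose *hose)
{
	const struct demo_phb *phb;	/* declared only; no dereference yet */

	if (hose == NULL)
		return DEMO_DEVICE_NOT_FOUND;

	phb = hose->private_data;	/* safe: hose was tested above */
	if (phb == NULL)		/* extra check, for this example only */
		return DEMO_DEVICE_NOT_FOUND;

	return phb->id;
}
```

Initialising `phb` in its declaration, as the old code did, would read `hose->private_data` before the `hose == NULL` test and make that test useless.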

* Re: [PATCH 4/5] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Michael Neuling @ 2012-09-07  5:43 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>

Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> wrote:

> 	Hi Mikey,
> 
> On Fri, 7 Sep 2012, Michael Neuling wrote:
> > Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Thu, Sep 6, 2012 at 7:17 AM, Michael Neuling <mikey@neuling.org> wrote:
> > > > Rework set_dabr to take a DABRX value as well. We are not actually
> > > > changing any functionality at this stage, just preparing for that.
> > > 
> > > You are changing functionality.
> > 
> > You are right.. I'll fix that up.. Sorry.
> > 
> > > >  #define   DABRX_USER   (1UL << 0)
> > > >  #define   DABRX_KERNEL (1UL << 1)
> > > > +#define   DABRX_HYP    (1UL << 2)
> > > > +#define   DABRX_BTI    (1UL << 3)
> > > > +#define   DABRX_ALL     (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
> > > 
> > > > --- a/arch/powerpc/platforms/cell/beat.c
> > > > +++ b/arch/powerpc/platforms/cell/beat.c
> > > > @@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
> > > >         return BEAT_NVRAM_SIZE;
> > > >  }
> > > >
> > > > -int beat_set_xdabr(unsigned long dabr)
> > > > +int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
> > > >  {
> > > > -       if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
> > > > +       if (beat_set_dabr(dabr, dabrx))
> > > >                 return -1;
> > > >         return 0;
> > > >  }
> > > 
> > > > --- a/arch/powerpc/platforms/ps3/setup.c
> > > > +++ b/arch/powerpc/platforms/ps3/setup.c
> > > > @@ -184,11 +184,9 @@ early_param("ps3flash", early_parse_ps3flash);
> > > >  #define prealloc_ps3flash_bounce_buffer()      do { } while (0)
> > > >  #endif
> > > >
> > > > -static int ps3_set_dabr(unsigned long dabr)
> > > > +static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
> > > >  {
> > > > -       enum {DABR_USER = 1, DABR_KERNEL = 2,};
> > > > -
> > > > -       return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
> > > > +       return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
> > > >  }
> > > 
> > > > -               set_dabr(dabr.address | (dabr.enabled & 7));
> > > > +               set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
> > > 
> > > Before, beat_set_dabr() and lv1_set_dabr() would have been called with dabrx = 3
> > > (DABRX_KERNEL | DABRX_USER). Now they're called with dabrx = 15
> > > (DABRX_ALL = DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER).
> > > 
> > > No idea what's the impact of this...
> > 
> > Do you know if the ps3 hypervisor will allow us to set DABRX_BTI or
> > DABRX_HYP?  phyp wont.  
> 
> According to the documentation, all bits but DABRX_USER, DABRX_KERNEL, and
> DABRX_BTI must be zero.  This implies DABRX_HYP cannot be set.
> 
> BTW, the requirement that DABRX_USER and DABRX_KERNEL cannot both be zero
> at the same time is also there, cfr. your comment and check in
> pseries_set_xdabr().
> 
> Unfortunately, I cannot test it.

OK thanks, I'll mask appropriately.

Any place we can get a copy of the PS3 HV doc you're quoting from?

Mikey 
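
One way the "mask appropriately" step could look is sketched below. It assumes only what is stated in this thread: the PS3 hypervisor accepts DABRX_USER, DABRX_KERNEL and DABRX_BTI, and DABRX_USER and DABRX_KERNEL must not both be zero. The names `DEMO_PS3_DABRX_ALLOWED` and `demo_ps3_mask_dabrx()` are hypothetical and do not come from the patch series.

```c
#define DABRX_USER	(1UL << 0)
#define DABRX_KERNEL	(1UL << 1)
#define DABRX_HYP	(1UL << 2)
#define DABRX_BTI	(1UL << 3)
#define DABRX_ALL	(DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

/* Assumption from the thread: every other bit must be zero on PS3. */
#define DEMO_PS3_DABRX_ALLOWED	(DABRX_USER | DABRX_KERNEL | DABRX_BTI)

/*
 * Strip the bits the hypervisor is said to reject.
 * Returns 0 and stores the masked value in *out, or -1 if neither
 * DABRX_USER nor DABRX_KERNEL is left after masking.
 */
static int demo_ps3_mask_dabrx(unsigned long dabrx, unsigned long *out)
{
	unsigned long masked = dabrx & DEMO_PS3_DABRX_ALLOWED;

	if (!(masked & (DABRX_USER | DABRX_KERNEL)))
		return -1;

	*out = masked;
	return 0;
}
```

With this sketch, DABRX_ALL (15) is reduced to 11, and a request carrying only DABRX_HYP is refused rather than passed on as zero.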

* Re: [PATCH -V8 0/11] arch/powerpc: Add 64TB support to ppc64
From: Aneesh Kumar K.V @ 2012-09-07  5:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, paulus
In-Reply-To: <1346982235.2385.33.camel@pasglop>

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Thu, 2012-09-06 at 20:59 +0530, Aneesh Kumar K.V wrote:
>> Hi,
>> 
>> This patchset include patches for supporting 64TB with ppc64. I haven't booted
>> this on hardware with 64TB memory yet. But they boot fine on real hardware with
>> less memory. Changes extend VSID bits to 38 bits for a 256MB segment
>> and 26 bits for 1TB segments.
>
> Your series breaks the embedded 64-bit build. You seem to be hard wiring
> dependencies on slice stuff all over 64-bit stuff regardless of the MMU
> type or the value of CONFIG_MM_SLICES.
>
> Also all these:
>
>> +/* 4 bits per slice and we have one slice per 1TB */
>> +#if 0 /* We can't directly include pgtable.h hence this hack */
>> +#define SLICE_ARRAY_SIZE  (PGTABLE_RANGE >> 41)
>> +#else
>> +/* Right now we only support 64TB */
>> +#define SLICE_ARRAY_SIZE  32
>> +#endif
>
> Things are just too horrible. Find a different way of doing it, if
> necessary create a new range define somewhere, whatever but don't leave
> that crap as-is, it's too wrong.
>
> Dropping the series for now.
>

How about the change below? If you are OK with moving the range details to
a new header, I can fold this into patch 7 and send a new series.

-aneesh

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 428f23e..057a12a 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -14,6 +14,7 @@
 
 #include <asm/asm-compat.h>
 #include <asm/page.h>
+#include <asm/pgtable-ppc64-range.h>
 
 /*
  * Segment table
@@ -415,12 +416,7 @@ extern void slb_set_size(u16 size);
 	add	rt,rt,rx
 
 /* 4 bits per slice and we have one slice per 1TB */
-#if 0 /* We can't directly include pgtable.h hence this hack */
 #define SLICE_ARRAY_SIZE  (PGTABLE_RANGE >> 41)
-#else
-/* Right now we only support 64TB */
-#define SLICE_ARRAY_SIZE  32
-#endif
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index b55beb4..01ab518 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -78,16 +78,14 @@ extern u64 ppc64_pft_size;
 #define GET_LOW_SLICE_INDEX(addr)	((addr) >> SLICE_LOW_SHIFT)
 #define GET_HIGH_SLICE_INDEX(addr)	((addr) >> SLICE_HIGH_SHIFT)
 
-/* 1 bit per slice and we have one slice per 1TB */
-#if 0 /* We can't directly include pgtable.h hence this hack */
-#define SLICE_MASK_SIZE (PGTABLE_RANGE >> 43)
-#else
-/*
+/* 1 bit per slice and we have one slice per 1TB
  * Right now we support only 64TB.
  * IF we change this we will have to change the type
  * of high_slices
  */
 #define SLICE_MASK_SIZE 8
+#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
+#error PGTABLE_RANGE exceeds slice_mask high_slices size
 #endif
 
 #ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-range.h b/arch/powerpc/include/asm/pgtable-ppc64-range.h
new file mode 100644
index 0000000..04a825c
--- /dev/null
+++ b/arch/powerpc/include/asm/pgtable-ppc64-range.h
@@ -0,0 +1,16 @@
+#ifndef _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
+#define _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
+
+#ifdef CONFIG_PPC_64K_PAGES
+#include <asm/pgtable-ppc64-64k.h>
+#else
+#include <asm/pgtable-ppc64-4k.h>
+#endif
+
+/*
+ * Size of EA range mapped by our pagetables.
+ */
+#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
+			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
+#endif
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index dea953f..ee783b4 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -13,13 +13,7 @@
 
 #define FIRST_USER_ADDRESS	0
 
-/*
- * Size of EA range mapped by our pagetables.
- */
-#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-                	    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
-#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
-
+#include <asm/pgtable-ppc64-range.h>
 
 /* Some sanity checking */
 #if TASK_SIZE_USER64 > PGTABLE_RANGE
@@ -32,14 +26,6 @@
 #endif
 #endif
 
-#if (PGTABLE_RANGE >> 41) > SLICE_ARRAY_SIZE
-#error PGTABLE_RANGE exceeds SLICE_ARRAY_SIZE
-#endif
-
-#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
-#error PGTABLE_RANGE exceeds slice_mask high_slices size
-#endif
-
 /*
  * Define the address range of the kernel non-linear virtual area
  */
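
The shift counts in the macros above follow from the slice size. A small worked check, assuming a 64TB range (46 address bits) as stated in the patch comments; all `DEMO_` names are hypothetical and exist only for this sketch:

```c
#include <stdint.h>

/* Assumption: 64TB of effective address space, i.e. 2^46 bytes. */
#define DEMO_PGTABLE_EADDR_SIZE	46
#define DEMO_PGTABLE_RANGE	(UINT64_C(1) << DEMO_PGTABLE_EADDR_SIZE)

/* One slice per 1TB (2^40 bytes), 4 bits per slice: bytes = range >> 41. */
#define DEMO_SLICE_ARRAY_SIZE	(DEMO_PGTABLE_RANGE >> 41)

/* One slice per 1TB, 1 bit per slice: bytes = range >> 43. */
#define DEMO_SLICE_MASK_SIZE	8

_Static_assert(DEMO_SLICE_ARRAY_SIZE == 32,
	       "64TB at 4 bits per 1TB slice needs 32 bytes");
_Static_assert((DEMO_PGTABLE_RANGE >> 43) <= DEMO_SLICE_MASK_SIZE,
	       "range exceeds the high_slices mask size");

/* Number of 1TB slices covered by the range. */
static uint64_t demo_high_slice_count(void)
{
	return DEMO_PGTABLE_RANGE >> 40;
}
```

The range holds 64 slices of 1TB each. At 4 bits per slice that is 256 bits, or 32 bytes, which matches the hard-coded value the patch removes. At 1 bit per slice it is 64 bits, or 8 bytes.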

* Re: [PATCH 4/5] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Geert Uytterhoeven @ 2012-09-07  5:26 UTC (permalink / raw)
  To: Michael Neuling; +Cc: linuxppc-dev
In-Reply-To: <15813.1346989072@neuling.org>

	Hi Mikey,

On Fri, 7 Sep 2012, Michael Neuling wrote:
> Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Thu, Sep 6, 2012 at 7:17 AM, Michael Neuling <mikey@neuling.org> wrote:
> > > Rework set_dabr to take a DABRX value as well. We are not actually
> > > changing any functionality at this stage, just preparing for that.
> >
> > You are changing functionality.
>
> You are right.. I'll fix that up.. Sorry.
>
> > >  #define   DABRX_USER   (1UL << 0)
> > >  #define   DABRX_KERNEL (1UL << 1)
> > > +#define   DABRX_HYP    (1UL << 2)
> > > +#define   DABRX_BTI    (1UL << 3)
> > > +#define   DABRX_ALL     (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
> >
> > > --- a/arch/powerpc/platforms/cell/beat.c
> > > +++ b/arch/powerpc/platforms/cell/beat.c
> > > @@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
> > >         return BEAT_NVRAM_SIZE;
> > >  }
> > >
> > > -int beat_set_xdabr(unsigned long dabr)
> > > +int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
> > >  {
> > > -       if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
> > > +       if (beat_set_dabr(dabr, dabrx))
> > >                 return -1;
> > >         return 0;
> > >  }
> >
> > > --- a/arch/powerpc/platforms/ps3/setup.c
> > > +++ b/arch/powerpc/platforms/ps3/setup.c
> > > @@ -184,11 +184,9 @@ early_param("ps3flash", early_parse_ps3flash);
> > >  #define prealloc_ps3flash_bounce_buffer()      do { } while (0)
> > >  #endif
> > >
> > > -static int ps3_set_dabr(unsigned long dabr)
> > > +static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
> > >  {
> > > -       enum {DABR_USER = 1, DABR_KERNEL = 2,};
> > > -
> > > -       return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
> > > +       return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
> > >  }
> >
> > > -               set_dabr(dabr.address | (dabr.enabled & 7));
> > > +               set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
> >
> > Before, beat_set_dabr() and lv1_set_dabr() would have been called with dabrx = 3
> > (DABRX_KERNEL | DABRX_USER). Now they're called with dabrx = 15
> > (DABRX_ALL = DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER).
> >
> > No idea what's the impact of this...
>
> Do you know if the ps3 hypervisor will allow us to set DABRX_BTI or
> DABRX_HYP?  phyp wont.

According to the documentation, all bits but DABRX_USER, DABRX_KERNEL, and
DABRX_BTI must be zero.  This implies DABRX_HYP cannot be set.

BTW, the requirement that DABRX_USER and DABRX_KERNEL cannot both be zero
at the same time is also there, cfr. your comment and check in
pseries_set_xdabr().

Unfortunately, I cannot test it.

With kind regards,

Geert Uytterhoeven
Software Architect
Technology and Software Centre Europe

Sony Belgium, bijkantoor van Sony Europe Limited.
Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium

Phone:  +32 (0)2 700 8453
Fax:    +32 (0)2 700 8622
E-mail: Geert.Uytterhoeven@sonycom.com

Sony Europe Limited. A company registered in England and Wales.
Registered office: The Heights, Brooklands, Weybridge, Surrey. KT13 0XW.
                   United Kingdom

* Re: [PATCH 2/2] powerpc/e6500: TLB miss handler with hardware tablewalk support
From: Benjamin Herrenschmidt @ 2012-09-07  4:41 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <20120614234101.GB17147@tyr.buserror.net>

On Thu, 2012-06-14 at 18:41 -0500, Scott Wood wrote:
> There are a few things that make the existing hw tablewalk handlers
> unsuitable for e6500:
> 
>  - Indirect entries go in TLB1 (though the resulting direct entries go in
>    TLB0).
> 
>  - It has threads, but no "tlbsrx." -- so we need a spinlock and
>    a normal "tlbsx".  Because we need this lock, hardware tablewalk
>    is mandatory on e6500 unless we want to add spinlock+tlbsx to
>    the normal bolted TLB miss handler.
> 
>  - TLB1 has no HES (nor next-victim hint) so we need software round robin
>    (TODO: integrate this round robin data with hugetlb/KVM)
> 
>  - The existing tablewalk handlers map half of a page table at a time,
>    because IBM hardware has a fixed 1MiB indirect page size.  e6500
>    has variable size indirect entries, with a minimum of 2MiB.
>    So we can't do the half-page indirect mapping, and even if we
>    could it would be less efficient than mapping the full page.
> 
>  - Like on e5500, the linear mapping is bolted, so we don't need the
>    overhead of supporting nested tlb misses.
> 
> Note that hardware tablewalk does not work in rev1 of e6500.
> We do not expect to support e6500 rev1 in mainline Linux.
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
>  arch/powerpc/include/asm/mmu-book3e.h |   13 +++
>  arch/powerpc/include/asm/mmu.h        |   21 ++--
>  arch/powerpc/include/asm/paca.h       |    6 +
>  arch/powerpc/kernel/asm-offsets.c     |   10 ++
>  arch/powerpc/kernel/paca.c            |    5 +
>  arch/powerpc/kernel/setup_64.c        |   33 +++++++
>  arch/powerpc/mm/fsl_booke_mmu.c       |    8 ++
>  arch/powerpc/mm/tlb_low_64e.S         |  167 +++++++++++++++++++++++++++++++++
>  arch/powerpc/mm/tlb_nohash.c          |  109 ++++++++++++++++------
>  9 files changed, 335 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
> index eeabcdb..3072aa0 100644
> --- a/arch/powerpc/include/asm/mmu-book3e.h
> +++ b/arch/powerpc/include/asm/mmu-book3e.h
> @@ -264,8 +264,21 @@ extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
>  extern int mmu_linear_psize;
>  extern int mmu_vmemmap_psize;
>  
> +struct book3e_tlb_per_core {
> +	/* For software way selection, as on Freescale TLB1 */
> +	u8 esel_next, esel_max, esel_first;
> +
> +	/* Per-core spinlock for e6500 TLB handlers (no tlbsrx.) */
> +	u8 lock;
> +};

I'm no fan of the name ... tlb_core_data ? Probably don't even need the
book3e prefix really.

>  #ifdef CONFIG_PPC64
>  extern unsigned long linear_map_top;
> +extern int book3e_htw_mode;
> +
> +#define PPC_HTW_NONE	0
> +#define PPC_HTW_IBM	1
> +#define PPC_HTW_E6500	2

Sad :-( Wonder why we bother with an architecture, really ...

>  /*
>   * 64-bit booke platforms don't load the tlb in the tlb miss handler code.
> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
> index a9e9ec6..63d97eb 100644
> --- a/arch/powerpc/include/asm/mmu.h
> +++ b/arch/powerpc/include/asm/mmu.h
> @@ -170,16 +170,17 @@ extern u64 ppc64_rma_size;
>  #define MMU_PAGE_64K_AP	3	/* "Admixed pages" (hash64 only) */
>  #define MMU_PAGE_256K	4
>  #define MMU_PAGE_1M	5
> -#define MMU_PAGE_4M	6
> -#define MMU_PAGE_8M	7
> -#define MMU_PAGE_16M	8
> -#define MMU_PAGE_64M	9
> -#define MMU_PAGE_256M	10
> -#define MMU_PAGE_1G	11
> -#define MMU_PAGE_16G	12
> -#define MMU_PAGE_64G	13
> -
> -#define MMU_PAGE_COUNT	14
> +#define MMU_PAGE_2M	6
> +#define MMU_PAGE_4M	7
> +#define MMU_PAGE_8M	8
> +#define MMU_PAGE_16M	9
> +#define MMU_PAGE_64M	10
> +#define MMU_PAGE_256M	11
> +#define MMU_PAGE_1G	12
> +#define MMU_PAGE_16G	13
> +#define MMU_PAGE_64G	14
> +
> +#define MMU_PAGE_COUNT	15

Let's pray that won't hit a funny bug on server :-)

>  #if defined(CONFIG_PPC_STD_MMU_64)
>  /* 64-bit classic hash table MMU */
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index daf813f..4e18bb5 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -108,6 +108,12 @@ struct paca_struct {
>  	/* Keep pgd in the same cacheline as the start of extlb */
>  	pgd_t *pgd __attribute__((aligned(0x80))); /* Current PGD */
>  	pgd_t *kernel_pgd;		/* Kernel PGD */
> +
> +	struct book3e_tlb_per_core tlb_per_core;
> +
> +	/* Points to the tlb_per_core of the first thread on this core. */
> +	struct book3e_tlb_per_core *tlb_per_core_ptr;
> +

That's gross. Can't you allocate them elsewhere and then populate the
PACA pointers ?

>  	/* We can have up to 3 levels of reentrancy in the TLB miss handler */
>  	u64 extlb[3][EX_TLB_SIZE / sizeof(u64)];
>  	u64 exmc[8];		/* used for machine checks */
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index 52c7ad7..61f4634 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -168,6 +168,16 @@ int main(void)
>  	DEFINE(PACA_MC_STACK, offsetof(struct paca_struct, mc_kstack));
>  	DEFINE(PACA_CRIT_STACK, offsetof(struct paca_struct, crit_kstack));
>  	DEFINE(PACA_DBG_STACK, offsetof(struct paca_struct, dbg_kstack));
> +	DEFINE(PACA_TLB_PER_CORE_PTR,
> +		offsetof(struct paca_struct, tlb_per_core_ptr));
> +
> +	DEFINE(PERCORE_TLB_ESEL_NEXT,
> +		offsetof(struct book3e_tlb_per_core, esel_next));
> +	DEFINE(PERCORE_TLB_ESEL_MAX,
> +		offsetof(struct book3e_tlb_per_core, esel_max));
> +	DEFINE(PERCORE_TLB_ESEL_FIRST,
> +		offsetof(struct book3e_tlb_per_core, esel_first));
> +	DEFINE(PERCORE_TLB_LOCK, offsetof(struct book3e_tlb_per_core, lock));
>  #endif /* CONFIG_PPC_BOOK3E */
>  
>  #ifdef CONFIG_PPC_STD_MMU_64
> diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
> index fbe1a12..65abfc0 100644
> --- a/arch/powerpc/kernel/paca.c
> +++ b/arch/powerpc/kernel/paca.c
> @@ -145,6 +145,11 @@ void __init initialise_paca(struct paca_struct *new_paca, int cpu)
>  #ifdef CONFIG_PPC_STD_MMU_64
>  	new_paca->slb_shadow_ptr = &slb_shadow[cpu];
>  #endif /* CONFIG_PPC_STD_MMU_64 */
> +
> +#ifdef CONFIG_PPC_BOOK3E
> +	/* For now -- if we have threads this will be adjusted later */
> +	new_paca->tlb_per_core_ptr = &new_paca->tlb_per_core;
> +#endif
>  }
>  
>  /* Put the paca pointer into r13 and SPRG_PACA */
> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 389bd4f..271b85d 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -102,6 +102,37 @@ int ucache_bsize;
>  
>  static char *smt_enabled_cmdline;
>  
> +#ifdef CONFIG_PPC_BOOK3E
> +static void setup_tlb_per_core(void)
> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		int first = cpu_first_thread_sibling(cpu);
> +
> +		paca[cpu].tlb_per_core_ptr = &paca[first].tlb_per_core;
> +
> +		/*
> +		 * If we have threads, we need either tlbsrx.
> +		 * or e6500 tablewalk mode, or else TLB handlers
> +		 * will be racy and could produce duplicate entries.
> +		 */
> +		if (smt_enabled_at_boot >= 2 &&
> +		    !mmu_has_feature(MMU_FTR_USE_TLBRSRV) &&
> +		    book3e_htw_mode != PPC_HTW_E6500) {
> +			/* Should we panic instead? */
> +			WARN_ONCE("%s: unsupported MMU configuration -- expect problems\n",
> +				  __func__);
> +		}
> +	}
> +}
> +#else
> +static void setup_tlb_per_core(void)
> +{
> +}
> +#endif
> +
> +
>  /* Look for ibm,smt-enabled OF option */
>  static void check_smt_enabled(void)
>  {
> @@ -142,6 +173,8 @@ static void check_smt_enabled(void)
>  			of_node_put(dn);
>  		}
>  	}
> +
> +	setup_tlb_per_core();
>  }

I'd rather you move that to the caller

>  /* Look for smt-enabled= cmdline option */
> diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
> index 07ba45b..bf06d36b 100644
> --- a/arch/powerpc/mm/fsl_booke_mmu.c
> +++ b/arch/powerpc/mm/fsl_booke_mmu.c
> @@ -52,6 +52,7 @@
>  #include <asm/smp.h>
>  #include <asm/machdep.h>
>  #include <asm/setup.h>
> +#include <asm/paca.h>
>  
>  #include "mmu_decl.h"
>  
> @@ -192,6 +193,13 @@ unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx)
>  	}
>  	tlbcam_index = i;
>  
> +#ifdef CONFIG_PPC64
> +	get_paca()->tlb_per_core.esel_next = i;
> +	get_paca()->tlb_per_core.esel_max =
> +		mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY;
> +	get_paca()->tlb_per_core.esel_first = i;
> +#endif
> +
>  	return amount_mapped;
>  }
>  
> diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
> index efe0f33..8e82772 100644
> --- a/arch/powerpc/mm/tlb_low_64e.S
> +++ b/arch/powerpc/mm/tlb_low_64e.S
> @@ -232,6 +232,173 @@ itlb_miss_fault_bolted:
>  	beq	tlb_miss_common_bolted
>  	b	itlb_miss_kernel_bolted
>  
> +/*
> + * TLB miss handling for e6500 and derivatives, using hardware tablewalk.
> + *
> + * Linear mapping is bolted: no virtual page table or nested TLB misses
> + * Indirect entries in TLB1, hardware loads resulting direct entries
> + *    into TLB0
> + * No HES or NV hint on TLB1, so we need to do software round-robin
> + * No tlbsrx. so we need a spinlock, and we have to deal
> + *    with MAS-damage caused by tlbsx

Ouch ... so for every indirect entry you have to take a lock, backup the
MAS, do a tlbsx, restore the MAS, insert the entry and drop the lock ?

After all that, do you have some bullets left for the HW designers ?

Remind me to also shoot myself for allowing tlbsrx. and HES to be
optional in MAV2 :-(

> + * 4K pages only
> + */
> +
> +	START_EXCEPTION(instruction_tlb_miss_e6500)
> +	tlb_prolog_bolted SPRN_SRR0
> +
> +	ld	r11,PACA_TLB_PER_CORE_PTR(r13)
> +	srdi.	r15,r16,60		/* get region */
> +	ori	r16,r16,1
> +
> +	TLB_MISS_STATS_SAVE_INFO_BOLTED
> +	bne	tlb_miss_kernel_e6500	/* user/kernel test */
> +
> +	b	tlb_miss_common_e6500
> +
> +	START_EXCEPTION(data_tlb_miss_e6500)
> +	tlb_prolog_bolted SPRN_DEAR
> +
> +	ld	r11,PACA_TLB_PER_CORE_PTR(r13)
> +	srdi.	r15,r16,60		/* get region */
> +	rldicr	r16,r16,0,62
> +
> +	TLB_MISS_STATS_SAVE_INFO_BOLTED
> +	bne	tlb_miss_kernel_e6500	/* user vs kernel check */
> +
> +/*
> + * This is the guts of the TLB miss handler for e6500 and derivatives.
> + * We are entered with:
> + *
> + * r16 = page of faulting address (low bit 0 if data, 1 if instruction)
> + * r15 = crap (free to use)
> + * r14 = page table base
> + * r13 = PACA
> + * r11 = tlb_per_core ptr
> + * r10 = crap (free to use)
> + */
> +tlb_miss_common_e6500:
> +	/*
> +	 * Search if we already have an indirect entry for that virtual
> +	 * address, and if we do, bail out.
> +	 *
> +	 * MAS6:IND should be already set based on MAS4
> +	 */
> +	addi	r10,r11,PERCORE_TLB_LOCK
> +1:	lbarx	r15,0,r10
> +	cmpdi	r15,0
> +	bne	2f
> +	li	r15,1
> +	stbcx.	r15,0,r10

No need for barriers here ?
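
For comparison, here is the same lock shape in portable C11 atomics, where the ordering is spelled out explicitly. This is only an illustration of acquire/release semantics; it does not say which barriers the e6500 assembly needs, and `demo_tlb_lock` and its helpers are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical one-byte lock, shaped like the per-core lock in the patch. */
struct demo_tlb_lock {
	atomic_uchar lock;
};

static void demo_tlb_lock_acquire(struct demo_tlb_lock *l)
{
	for (;;) {
		unsigned char expected = 0;

		/* Try 0 -> 1; acquire ordering on success (cf. lbarx/stbcx.). */
		if (atomic_compare_exchange_weak_explicit(&l->lock, &expected, 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return;

		/* Spin with plain loads until the lock looks free (cf. lbz loop). */
		while (atomic_load_explicit(&l->lock, memory_order_relaxed) != 0)
			;
	}
}

static void demo_tlb_lock_release(struct demo_tlb_lock *l)
{
	/* Release ordering: earlier accesses complete before the unlock store. */
	atomic_store_explicit(&l->lock, 0, memory_order_release);
}
```

In this sketch the acquire on a successful exchange keeps the critical section from moving above the lock, and the release store keeps it from moving below the unlock.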

> +	bne	1b
> +	.subsection 1
> +2:	lbz	r15,0(r10)
> +	cmpdi	r15,0
> +	bne	2b
> +	b	1b
> +	.previous
> +
> +	mfspr	r15,SPRN_MAS2
> +
> +	tlbsx	0,r16
> +	mfspr	r10,SPRN_MAS1
> +	andis.	r10,r10,MAS1_VALID@h
> +	bne	tlb_miss_done_e6500
> +
> +	/* Undo MAS-damage from the tlbsx */
> +	mfspr	r10,SPRN_MAS1
> +	oris	r10,r10,MAS1_VALID@h
> +	mtspr	SPRN_MAS1,r10
> +	mtspr	SPRN_MAS2,r15
> +
> +	/* Now, we need to walk the page tables. First check if we are in
> +	 * range.
> +	 */
> +	rldicl.	r10,r16,64-PGTABLE_EADDR_SIZE,PGTABLE_EADDR_SIZE+4
> +	bne-	tlb_miss_fault_e6500
> +
> +	rldicl	r15,r16,64-PGDIR_SHIFT+3,64-PGD_INDEX_SIZE-3
> +	cmpldi	cr0,r14,0
> +	clrrdi	r15,r15,3
> +	beq-	tlb_miss_fault_e6500 /* No PGDIR, bail */
> +	ldx	r14,r14,r15		/* grab pgd entry */
> +
> +	rldicl	r15,r16,64-PUD_SHIFT+3,64-PUD_INDEX_SIZE-3
> +	clrrdi	r15,r15,3
> +	cmpdi	cr0,r14,0
> +	bge	tlb_miss_fault_e6500	/* Bad pgd entry or hugepage; bail */
> +	ldx	r14,r14,r15		/* grab pud entry */
> +
> +	rldicl	r15,r16,64-PMD_SHIFT+3,64-PMD_INDEX_SIZE-3
> +	clrrdi	r15,r15,3
> +	cmpdi	cr0,r14,0
> +	bge	tlb_miss_fault_e6500
> +	ldx	r14,r14,r15		/* Grab pmd entry */
> +
> +	mfspr	r10,SPRN_MAS0
> +	cmpdi	cr0,r14,0
> +	bge	tlb_miss_fault_e6500
> +
> +	/* Now we build the MAS for a 2M indirect page:
> +	 *
> +	 * MAS 0   :	ESEL needs to be filled by software round-robin
> +	 * MAS 1   :	Almost fully setup
> +	 *               - PID already updated by caller if necessary
> +	 *               - TSIZE for now is base ind page size always
> +	 * MAS 2   :	Use defaults
> +	 * MAS 3+7 :	Needs to be done
> +	 */
> +
> +	ori	r14,r14,(BOOK3E_PAGESZ_4K << MAS3_SPSIZE_SHIFT)
> +	mtspr	SPRN_MAS7_MAS3,r14
> +
> +	lbz	r15,PERCORE_TLB_ESEL_NEXT(r11)
> +	lbz	r16,PERCORE_TLB_ESEL_MAX(r11)
> +	lbz	r14,PERCORE_TLB_ESEL_FIRST(r11)
> +	rlwimi	r10,r15,16,0x00ff0000	/* insert esel_next into MAS0 */
> +	addi	r15,r15,1		/* increment esel_next */
> +	mtspr	SPRN_MAS0,r10
> +	cmpw	r15,r16
> +	iseleq	r15,r14,r15		/* if next == last use first */
> +	stb	r15,PERCORE_TLB_ESEL_NEXT(r11)
> +
> +	tlbwe
> +
> +tlb_miss_done_e6500:
> +	.macro	tlb_unlock_e6500
> +	li	r15,0
> +	isync
> +	stb	r15,PERCORE_TLB_LOCK(r11)
> +	.endm
> +
> +	tlb_unlock_e6500
> +	TLB_MISS_STATS_X(MMSTAT_TLB_MISS_NORM_OK)
> +	tlb_epilog_bolted
> +	rfi
> +
> +tlb_miss_kernel_e6500:
> +	mfspr	r10,SPRN_MAS1
> +	ld	r14,PACA_KERNELPGD(r13)
> +	cmpldi	cr0,r15,8		/* Check for vmalloc region */
> +	rlwinm	r10,r10,0,16,1		/* Clear TID */
> +	mtspr	SPRN_MAS1,r10
> +	beq+	tlb_miss_common_e6500
> +
> +tlb_miss_fault_e6500:
> +	tlb_unlock_e6500
> +	/* We need to check if it was an instruction miss */
> +	andi.	r16,r16,1
> +	bne	itlb_miss_fault_e6500
> +dtlb_miss_fault_e6500:
> +	TLB_MISS_STATS_D(MMSTAT_TLB_MISS_NORM_FAULT)
> +	tlb_epilog_bolted
> +	b	exc_data_storage_book3e
> +itlb_miss_fault_e6500:
> +	TLB_MISS_STATS_I(MMSTAT_TLB_MISS_NORM_FAULT)
> +	tlb_epilog_bolted
> +	b	exc_instruction_storage_book3e
> +
> +
>  /**********************************************************************
>   *                                                                    *
>   * TLB miss handling for Book3E with TLB reservation and HES support  *
> diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
> index df32a83..2f09ddf 100644
> --- a/arch/powerpc/mm/tlb_nohash.c
> +++ b/arch/powerpc/mm/tlb_nohash.c
> @@ -43,6 +43,7 @@
>  #include <asm/tlb.h>
>  #include <asm/code-patching.h>
>  #include <asm/hugetlb.h>
> +#include <asm/paca.h>
>  
>  #include "mmu_decl.h"
>  
> @@ -58,6 +59,10 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
>  		.shift	= 12,
>  		.enc	= BOOK3E_PAGESZ_4K,
>  	},
> +	[MMU_PAGE_2M] = {
> +		.shift	= 21,
> +		.enc	= BOOK3E_PAGESZ_2M,
> +	},
>  	[MMU_PAGE_4M] = {
>  		.shift	= 22,
>  		.enc	= BOOK3E_PAGESZ_4M,
> @@ -136,7 +141,7 @@ static inline int mmu_get_tsize(int psize)
>  int mmu_linear_psize;		/* Page size used for the linear mapping */
>  int mmu_pte_psize;		/* Page size used for PTE pages */
>  int mmu_vmemmap_psize;		/* Page size used for the virtual mem map */
> -int book3e_htw_enabled;		/* Is HW tablewalk enabled ? */
> +int book3e_htw_mode;		/* HW tablewalk?  Value is PPC_HTW_* */
>  unsigned long linear_map_top;	/* Top of linear mapping */
>  
>  #endif /* CONFIG_PPC64 */
> @@ -377,7 +382,7 @@ void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address)
>  {
>  	int tsize = mmu_psize_defs[mmu_pte_psize].enc;
>  
> -	if (book3e_htw_enabled) {
> +	if (book3e_htw_mode) {

Make it if (book3e_htw_mode != PPC_HTW_NONE)

>  		unsigned long start = address & PMD_MASK;
>  		unsigned long end = address + PMD_SIZE;
>  		unsigned long size = 1UL << mmu_psize_defs[mmu_pte_psize].shift;
> @@ -413,10 +418,10 @@ static void setup_page_sizes(void)
>  	int i, psize;
>  
>  #ifdef CONFIG_PPC_FSL_BOOK3E
> +	int fsl_mmu = mmu_has_feature(MMU_FTR_TYPE_FSL_E);
>  	unsigned int mmucfg = mfspr(SPRN_MMUCFG);
>  
> -	if (((mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) &&
> -		(mmu_has_feature(MMU_FTR_TYPE_FSL_E))) {
> +	if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) {
>  		unsigned int tlb1cfg = mfspr(SPRN_TLB1CFG);
>  		unsigned int min_pg, max_pg;
>  
> @@ -430,7 +435,7 @@ static void setup_page_sizes(void)
>  			def = &mmu_psize_defs[psize];
>  			shift = def->shift;
>  
> -			if (shift == 0)
> +			if (shift == 0 || shift & 1)
>  				continue;
>  
>  			/* adjust to be in terms of 4^shift Kb */
> @@ -440,7 +445,40 @@ static void setup_page_sizes(void)
>  				def->flags |= MMU_PAGE_SIZE_DIRECT;
>  		}
>  
> -		goto no_indirect;
> +		goto out;
> +	}
> +
> +	if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V2) {
> +		u32 tlb1cfg, tlb1ps;
> +
> +		tlb0cfg = mfspr(SPRN_TLB0CFG);
> +		tlb1cfg = mfspr(SPRN_TLB1CFG);
> +		tlb1ps = mfspr(SPRN_TLB1PS);
> +		eptcfg = mfspr(SPRN_EPTCFG);
> +
> +		if ((tlb1cfg & TLBnCFG_IND) && (tlb0cfg & TLBnCFG_PT))
> +			book3e_htw_mode = PPC_HTW_E6500;
> +
> +		/*
> +		 * We expect 4K subpage size and unrestricted indirect size.
> +		 * The lack of a restriction on indirect size is a Freescale
> +		 * extension, indicated by PSn = 0 but SPSn != 0.
> +		 */
> +		if (eptcfg != 2)
> +			book3e_htw_mode = PPC_HTW_NONE;
> +
> +		for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> +			struct mmu_psize_def *def = &mmu_psize_defs[psize];
> +
> +			if (tlb1ps & (1U << (def->shift - 10))) {
> +				def->flags |= MMU_PAGE_SIZE_DIRECT;
> +
> +				if (book3e_htw_mode && psize == MMU_PAGE_2M)
> +					def->flags |= MMU_PAGE_SIZE_INDIRECT;
> +			}
> +		}
> +
> +		goto out;
>  	}
>  #endif
>  
> @@ -457,8 +495,11 @@ static void setup_page_sizes(void)
>  	}
>  
>  	/* Indirect page sizes supported ? */
> -	if ((tlb0cfg & TLBnCFG_IND) == 0)
> -		goto no_indirect;
> +	if ((tlb0cfg & TLBnCFG_IND) == 0 ||
> +	    (tlb0cfg & TLBnCFG_PT) == 0)
> +		goto out;
> +
> +	book3e_htw_mode = PPC_HTW_IBM;
>  
>  	/* Now, we only deal with one IND page size for each
>  	 * direct size. Hopefully all implementations today are
> @@ -483,8 +524,8 @@ static void setup_page_sizes(void)
>  				def->ind = ps + 10;
>  		}
>  	}
> - no_indirect:
>  
> +out:
>  	/* Cleanup array and print summary */
>  	pr_info("MMU: Supported page sizes\n");
>  	for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> @@ -525,23 +566,23 @@ static void __patch_exception(int exc, unsigned long addr)
>  
>  static void setup_mmu_htw(void)
>  {
> -	/* Check if HW tablewalk is present, and if yes, enable it by:
> -	 *
> -	 * - patching the TLB miss handlers to branch to the
> -	 *   one dedicates to it
> -	 *
> -	 * - setting the global book3e_htw_enabled
> -       	 */
> -	unsigned int tlb0cfg = mfspr(SPRN_TLB0CFG);
> +	/*
> +	 * If we want to use HW tablewalk, enable it by patching the TLB miss
> +	 * handlers to branch to the one dedicated to it.
> +	 */
>  
> -	if ((tlb0cfg & TLBnCFG_IND) &&
> -	    (tlb0cfg & TLBnCFG_PT)) {
> +	switch (book3e_htw_mode) {
> +	case PPC_HTW_IBM:
>  		patch_exception(0x1c0, exc_data_tlb_miss_htw_book3e);
>  		patch_exception(0x1e0, exc_instruction_tlb_miss_htw_book3e);
> -		book3e_htw_enabled = 1;
> +		break;
> +	case PPC_HTW_E6500:
> +		patch_exception(0x1c0, exc_data_tlb_miss_e6500_book3e);
> +		patch_exception(0x1e0, exc_instruction_tlb_miss_e6500_book3e);
> +		break;
>  	}
>  	pr_info("MMU: Book3E HW tablewalk %s\n",
> -		book3e_htw_enabled ? "enabled" : "not supported");
> +		book3e_htw_mode ? "enabled" : "not supported");
>  }
>  
>  /*
> @@ -581,8 +622,16 @@ static void __early_init_mmu(int boot_cpu)
>  	/* Set MAS4 based on page table setting */
>  
>  	mas4 = 0x4 << MAS4_WIMGED_SHIFT;
> -	if (book3e_htw_enabled) {
> -		mas4 |= mas4 | MAS4_INDD;
> +	switch (book3e_htw_mode) {
> +	case PPC_HTW_E6500:
> +		mas4 |= MAS4_INDD;
> +		mas4 |= BOOK3E_PAGESZ_2M << MAS4_TSIZED_SHIFT;
> +		mas4 |= MAS4_TLBSELD(1);
> +		mmu_pte_psize = MMU_PAGE_2M;
> +		break;
> +
> +	case PPC_HTW_IBM:
> +		mas4 |= MAS4_INDD;
>  #ifdef CONFIG_PPC_64K_PAGES
>  		mas4 |=	BOOK3E_PAGESZ_256M << MAS4_TSIZED_SHIFT;
>  		mmu_pte_psize = MMU_PAGE_256M;
> @@ -590,13 +639,16 @@ static void __early_init_mmu(int boot_cpu)
>  		mas4 |=	BOOK3E_PAGESZ_1M << MAS4_TSIZED_SHIFT;
>  		mmu_pte_psize = MMU_PAGE_1M;
>  #endif
> -	} else {
> +		break;
> +
> +	case PPC_HTW_NONE:
>  #ifdef CONFIG_PPC_64K_PAGES
>  		mas4 |=	BOOK3E_PAGESZ_64K << MAS4_TSIZED_SHIFT;
>  #else
>  		mas4 |=	BOOK3E_PAGESZ_4K << MAS4_TSIZED_SHIFT;
>  #endif
>  		mmu_pte_psize = mmu_virtual_psize;
> +		break;
>  	}
>  	mtspr(SPRN_MAS4, mas4);
>  
> @@ -616,8 +668,11 @@ static void __early_init_mmu(int boot_cpu)
>  		/* limit memory so we dont have linear faults */
>  		memblock_enforce_memory_limit(linear_map_top);
>  
> -		patch_exception(0x1c0, exc_data_tlb_miss_bolted_book3e);
> -		patch_exception(0x1e0, exc_instruction_tlb_miss_bolted_book3e);
> +		if (book3e_htw_mode == PPC_HTW_NONE) {
> +			patch_exception(0x1c0, exc_data_tlb_miss_bolted_book3e);
> +			patch_exception(0x1e0,
> +				exc_instruction_tlb_miss_bolted_book3e);
> +		}
>  	}
>  #endif
>  

Ben.

^ permalink raw reply

* Re: [PATCH] powerpc/booke-64: fix tlbsrx. path in bolted tlb handler
From: Benjamin Herrenschmidt @ 2012-09-07  4:23 UTC (permalink / raw)
  To: scott; +Cc: linuxppc-dev
In-Reply-To: <20120612220232.GA17228@tyr.buserror.net>

On Tue, 2012-06-12 at 17:02 -0500, Scott Wood wrote:
> It was branching to the cleanup part of the non-bolted handler,
> which would have been bad if there were any chips with tlbsrx.
> that use the bolted handler.

Still relevant ? It doesn't apply anymore :-)

Cheers,
Ben.

> Signed-off-by: Scott Wood <scott@tyr.buserror.net>
> ---
>  arch/powerpc/mm/tlb_low_64e.S |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
> index ff672bd..efe0f33 100644
> --- a/arch/powerpc/mm/tlb_low_64e.S
> +++ b/arch/powerpc/mm/tlb_low_64e.S
> @@ -128,7 +128,7 @@ BEGIN_MMU_FTR_SECTION
>  	 */
>  	PPC_TLBSRX_DOT(0,r16)
>  	ldx	r14,r14,r15		/* grab pgd entry */
> -	beq	normal_tlb_miss_done	/* tlb exists already, bail */
> +	beq	tlb_miss_done_bolted	/* tlb exists already, bail */
>  MMU_FTR_SECTION_ELSE
>  	ldx	r14,r14,r15		/* grab pgd entry */
>  ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
> @@ -184,6 +184,7 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
>  	mtspr	SPRN_MAS7_MAS3,r15
>  	tlbwe
>  
> +tlb_miss_done_bolted:
>  	TLB_MISS_STATS_X(MMSTAT_TLB_MISS_NORM_OK)
>  	tlb_epilog_bolted
>  	rfi

^ permalink raw reply

* Re: [PATCH 4/5] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Michael Neuling @ 2012-09-07  3:37 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: linuxppc-dev
In-Reply-To: <CAMuHMdU=UjkU3BRV7Aouu-m=vKaSPWH93Rkh6L3RWev4oM5jYA@mail.gmail.com>

Geert Uytterhoeven <geert@linux-m68k.org> wrote:

> On Thu, Sep 6, 2012 at 7:17 AM, Michael Neuling <mikey@neuling.org> wrote:
> > Rework set_dabr to take a DABRX value as well. We are not actually
> > changing any functionality at this stage, just preparing for that.
> 
> You are changing functionality.

You are right.. I'll fix that up.. Sorry.

> 
> >  #define   DABRX_USER   (1UL << 0)
> >  #define   DABRX_KERNEL (1UL << 1)
> > +#define   DABRX_HYP    (1UL << 2)
> > +#define   DABRX_BTI    (1UL << 3)
> > +#define   DABRX_ALL     (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
> 
> > --- a/arch/powerpc/platforms/cell/beat.c
> > +++ b/arch/powerpc/platforms/cell/beat.c
> > @@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
> >         return BEAT_NVRAM_SIZE;
> >  }
> >
> > -int beat_set_xdabr(unsigned long dabr)
> > +int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
> >  {
> > -       if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
> > +       if (beat_set_dabr(dabr, dabrx))
> >                 return -1;
> >         return 0;
> >  }
> 
> > --- a/arch/powerpc/platforms/ps3/setup.c
> > +++ b/arch/powerpc/platforms/ps3/setup.c
> > @@ -184,11 +184,9 @@ early_param("ps3flash", early_parse_ps3flash);
> >  #define prealloc_ps3flash_bounce_buffer()      do { } while (0)
> >  #endif
> >
> > -static int ps3_set_dabr(unsigned long dabr)
> > +static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
> >  {
> > -       enum {DABR_USER = 1, DABR_KERNEL = 2,};
> > -
> > -       return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
> > +       return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
> >  }
> 
> > -               set_dabr(dabr.address | (dabr.enabled & 7));
> > +               set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
> 
> Before, beat_set_dabr() and lv1_set_dabr() would have been called with dabrx = 3
> (DABRX_KERNEL | DABRX_USER). Now they're called with dabrx = 15
> (DABRX_ALL = DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER).
> 
> No idea what's the impact of this...

Do you know if the ps3 hypervisor will allow us to set DABRX_BTI or
DABRX_HYP?  phyp won't.

Mikey
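
For reference, the value change Geert points out can be checked with a
small standalone sketch. The DABRX_* values are copied from the quoted
patch; the two helper functions are hypothetical names used only for
illustration and are not part of the kernel.

```c
/* Flag values as they appear in the quoted patch. */
#define DABRX_USER   (1UL << 0)
#define DABRX_KERNEL (1UL << 1)
#define DABRX_HYP    (1UL << 2)
#define DABRX_BTI    (1UL << 3)
#define DABRX_ALL    (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

/* Hypothetical helper: the dabrx value the platform hooks used before the patch. */
static unsigned long dabrx_before_patch(void)
{
	return DABRX_KERNEL | DABRX_USER;
}

/* Hypothetical helper: bits that are newly passed down once DABRX_ALL is used. */
static unsigned long dabrx_added_bits(void)
{
	return DABRX_ALL & ~dabrx_before_patch();
}
```

The added bits are exactly DABRX_HYP and DABRX_BTI, which is why the
question of whether a given hypervisor accepts them matters.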

^ permalink raw reply

* Re: [PATCH -V8 0/11] arch/powerpc: Add 64TB support to ppc64
From: Benjamin Herrenschmidt @ 2012-09-07  1:43 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, paulus
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

On Thu, 2012-09-06 at 20:59 +0530, Aneesh Kumar K.V wrote:
> Hi,
> 
> This patchset includes patches for supporting 64TB with ppc64. I haven't booted
> this on hardware with 64TB memory yet. But they boot fine on real hardware with
> less memory. Changes extend VSID bits to 38 bits for a 256MB segment
> and 26 bits for 1TB segments.

Your series breaks the embedded 64-bit build. You seem to be hard wiring
dependencies on slice stuff all over 64-bit stuff regardless of the MMU
type or the value of CONFIG_MM_SLICES.

Also all these:

> +/* 4 bits per slice and we have one slice per 1TB */
> +#if 0 /* We can't directly include pgtable.h hence this hack */
> +#define SLICE_ARRAY_SIZE  (PGTABLE_RANGE >> 41)
> +#else
> +/* Right now we only support 64TB */
> +#define SLICE_ARRAY_SIZE  32
> +#endif

Things are just too horrible. Find a different way of doing it, if
necessary create a new range define somewhere, whatever but don't leave
that crap as-is, it's too wrong.

Dropping the series for now.

Cheers,
Ben. 
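
As a side note on the quoted hunk, the hard-coded 32 does agree with the
disabled expression when the range is 64TB. The sketch below only checks
that arithmetic. It assumes PGTABLE_RANGE is 2^46 bytes; the *_SKETCH
names and the two functions are made up for this example, and it does not
propose how the define should actually be shared between headers.

```c
#include <stdint.h>

/* Assumption: a 64TB range, i.e. 2^46 bytes. The real PGTABLE_RANGE lives in pgtable.h. */
#define PGTABLE_RANGE_SKETCH	(1ULL << 46)

/* One 1TB (2^40) slice needs 4 bits, so bytes = range >> 40 >> 1. */
#define SLICE_ARRAY_SIZE_SKETCH	(PGTABLE_RANGE_SKETCH >> 41)

/* One 1TB slice needs 1 bit in the mask, so bytes = range >> 40 >> 3. */
#define SLICE_MASK_SIZE_SKETCH	(PGTABLE_RANGE_SKETCH >> 43)

/* Compile-time checks (C11) that the derived sizes match the hard-coded ones. */
_Static_assert(SLICE_ARRAY_SIZE_SKETCH == 32, "64 slices * 4 bits = 32 bytes");
_Static_assert(SLICE_MASK_SIZE_SKETCH == 8, "64 slices * 1 bit = 8 bytes");

/* Hypothetical accessors so the values can also be checked at run time. */
static uint64_t slice_array_size(void)
{
	return SLICE_ARRAY_SIZE_SKETCH;
}

static uint64_t slice_mask_size(void)
{
	return SLICE_MASK_SIZE_SKETCH;
}
```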

> Changes from V7:
>  * Address review feedback
> 
> Changes from V6:
>  * rebase to latest upstream (5b716ac728bcc01b1f2a7ed6e437196602237c27)
> 
> Changes from v5:
>  * Address review feedback
> 
> Changes from v4:
>  * Drop patch "arch/powerpc: properly offset the context bits for 1T segemnts"
>    based on review feedback
>  * split CONTEXT_BITS related changes from patch 12
>  * Add a new doc update patch
> 
> Changes from v3:
>  * Address review comments.
>  * Added new patch to ensure proto-VSID isolation between kernel and user space
> 
> Changes from V2:
>  * Fix few FIXMEs in the patchset. I have added them as separate patch for
>    easier review. That should help us to drop those changes if we don't agree.
> 
> Changes from V1:
> * Drop the usage of structure (struct virt_addr) to carry virtual address.
>   We now represent virtual address via vpn which is virtual address shifted
>   right 12 bits.
> 
> Thanks,
> -aneesh
> 

^ permalink raw reply

* Re: [PATCH v2 1/2] [powerpc] Change memory_limit from phys_addr_t to unsigned long long
From: Benjamin Herrenschmidt @ 2012-09-07  1:35 UTC (permalink / raw)
  To: Suzuki K. Poulose; +Cc: mahesh, linuxppc-dev, linux-kernel
In-Reply-To: <20120821114225.29282.87841.stgit@suzukikp.in.ibm.com>

On Tue, 2012-08-21 at 17:12 +0530, Suzuki K. Poulose wrote:
> There are some device-tree nodes, whose values are of type phys_addr_t.
> The phys_addr_t is variable sized based on the CONFIG_PHSY_T_64BIT.
> 
> Change these to a fixed unsigned long long for consistency.
> 
> This patch does the change only for memory_limit.
> 
> The following is a list of such variables which need the change:
> 
>  1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c
> 
>  2) (struct resource *)crashk_res.start - We could export a local static
>     variable from machine_kexec.c.
> 
> Changing the above values might break the kexec-tools. So, I will
> fix kexec-tools first to handle the different sized values and then change
>  the above.
> 
> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
> ---

Breaks the build on some configs (with 32-bit phys_addr_t):

/home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c: In function
'early_init_devtree':
/home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c:664:25: error:
comparison of distinct pointer types lacks a cast

I'm fixing that myself this time but please be more careful.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 2/2][v2] powerpc/perf: Sample only if SIAR-Valid bit is set in P7+
From: Benjamin Herrenschmidt @ 2012-09-07  0:50 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: michaele, linuxppc-dev, Anton Blanchard, benh, cel, khandual
In-Reply-To: <20120716212241.GB14033@us.ibm.com>

On Mon, 2012-07-16 at 14:22 -0700, Sukadev Bhattiprolu wrote:
> From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Date: Mon, 2 Jul 2012 08:06:14 -0700
> Subject: [PATCH 2/2][v2] powerpc/perf: Sample only if SIAR-Valid bit is set in P7+
> 
> On POWER7+ two new bits (mmcra[35] and mmcra[36]) indicate whether the
> contents of SIAR and SDAR are valid.
> 
> For marked instructions on P7+, we must save the contents of SIAR and
> SDAR registers only if these new bits are set.
> 
> This code/check for the SIAR-Valid bit is specific to P7+, so rather than
> waste a CPU-feature bit use the PVR flag.

This appears to be based on an ancient code base. The code has changed
significantly in that area and this patch doesn't apply at all.

I have applied the first patch and renamed PV_ to PVR_ since we've
renamed them all since then. This will show up in powerpc-next later
today. Please rebase your perf patch on top of that.

Cheers,
Ben.

> Note that Carl Love proposed a similar change for oprofile:
> 
>         https://lkml.org/lkml/2012/6/22/309
> 
> Changelog[v2]:
> 	- [Gabriel Paubert] Rename PV_POWER7P to PV_POWER7p.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/perf_event_server.h |    1 +
>  arch/powerpc/include/asm/reg.h               |    4 +++
>  arch/powerpc/perf/core-book3s.c              |   38 ++++++++++++++++++++++---
>  arch/powerpc/perf/power7-pmu.c               |    3 ++
>  4 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
> index 078019b..9710be3 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -49,6 +49,7 @@ struct power_pmu {
>  #define PPMU_ALT_SIPR		2	/* uses alternate posn for SIPR/HV */
>  #define PPMU_NO_SIPR		4	/* no SIPR/HV in MMCRA at all */
>  #define PPMU_NO_CONT_SAMPLING	8	/* no continuous sampling */
> +#define PPMU_SIAR_VALID		16	/* Processor has SIAR Valid bit */
>  
>  /*
>   * Values for flags to get_alternatives()
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 65b6164..a7a9a8b 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -601,6 +601,10 @@
>  #define   POWER6_MMCRA_SIPR   0x0000020000000000ULL
>  #define   POWER6_MMCRA_THRM	0x00000020UL
>  #define   POWER6_MMCRA_OTHER	0x0000000EUL
> +
> +#define   POWER7P_MMCRA_SIAR_VALID 0x10000000	/* P7+ SIAR contents valid */
> +#define   POWER7P_MMCRA_SDAR_VALID 0x08000000	/* P7+ SDAR contents valid */
> +
>  #define SPRN_PMC1	787
>  #define SPRN_PMC2	788
>  #define SPRN_PMC3	789
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 8f84bcb..0a392d8 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -103,14 +103,20 @@ static inline unsigned long perf_ip_adjust(struct pt_regs *regs)
>   * If we're not doing instruction sampling, give them the SDAR
>   * (sampled data address).  If we are doing instruction sampling, then
>   * only give them the SDAR if it corresponds to the instruction
> - * pointed to by SIAR; this is indicated by the [POWER6_]MMCRA_SDSYNC
> - * bit in MMCRA.
> + * pointed to by SIAR; this is indicated by the [POWER6_]MMCRA_SDSYNC or
> + * the [POWER7P_]MMCRA_SDAR_VALID bit in MMCRA.
>   */
>  static inline void perf_get_data_addr(struct pt_regs *regs, u64 *addrp)
>  {
>  	unsigned long mmcra = regs->dsisr;
> -	unsigned long sdsync = (ppmu->flags & PPMU_ALT_SIPR) ?
> -		POWER6_MMCRA_SDSYNC : MMCRA_SDSYNC;
> +	unsigned long sdsync;
> +
> +	if (ppmu->flags & PPMU_SIAR_VALID)
> +		sdsync = POWER7P_MMCRA_SDAR_VALID;
> +	else if (ppmu->flags & PPMU_ALT_SIPR)
> +		sdsync = POWER6_MMCRA_SDSYNC;
> +	else
> +		sdsync = MMCRA_SDSYNC;
>  
>  	if (!(mmcra & MMCRA_SAMPLE_ENABLE) || (mmcra & sdsync))
>  		*addrp = mfspr(SPRN_SDAR);
> @@ -1248,6 +1254,25 @@ struct pmu power_pmu = {
>  	.event_idx	= power_pmu_event_idx,
>  };
>  
> +
> +/*
> + * On processors like P7+ that have the SIAR-Valid bit, marked instructions
> + * must be sampled only if the SIAR-valid bit is set.
> + *
> + * For unmarked instructions and for processors that don't have the SIAR-Valid
> + * bit, assume that SIAR is valid.
> + */
> +static inline int siar_valid(struct pt_regs *regs)
> +{
> +	unsigned long mmcra = regs->dsisr;
> +	int marked = mmcra & MMCRA_SAMPLE_ENABLE;
> +
> +	if ((ppmu->flags & PPMU_SIAR_VALID) && marked)
> +		return mmcra & POWER7P_MMCRA_SIAR_VALID;
> +
> +	return 1;
> +}
> +
>  /*
>   * A counter has overflowed; update its count and record
>   * things if requested.  Note that interrupts are hard-disabled
> @@ -1281,7 +1306,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
>  			left += period;
>  			if (left <= 0)
>  				left = period;
> -			record = 1;
> +			record = siar_valid(regs);
>  			event->hw.last_period = event->hw.sample_period;
>  		}
>  		if (left < 0x80000000LL)
> @@ -1340,6 +1365,9 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
>  	    !(mmcra & MMCRA_SAMPLE_ENABLE))
>  		return regs->nip;
>  
> +	if (!siar_valid(regs))
> +		return 0;	// no valid instruction pointer
> +
>  	return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
>  }
>  
> diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
> index 1251e4d..970a634 100644
> --- a/arch/powerpc/perf/power7-pmu.c
> +++ b/arch/powerpc/perf/power7-pmu.c
> @@ -373,6 +373,9 @@ static int __init init_power7_pmu(void)
>  	    strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power7"))
>  		return -ENODEV;
>  
> +	if (__is_processor(PV_POWER7p))
> +		power7_pmu.flags |= PPMU_SIAR_VALID;
> +
>  	return register_power_pmu(&power7_pmu);
>  }
>  
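
The two POWER7P_MMCRA_* masks in the quoted patch line up with the
"mmcra[35]" and "mmcra[36]" wording if the commit message uses IBM bit
numbering, where bit 0 is the most significant bit of the 64-bit
register. That reading is an assumption, and ibm_bit64() below is a
hypothetical helper written only to show the conversion.

```c
#include <stdint.h>

/*
 * Convert an IBM-style bit number (bit 0 = most significant bit of a
 * 64-bit register) into a conventional mask. Valid for n in 0..63.
 */
static uint64_t ibm_bit64(unsigned int n)
{
	return (uint64_t)1 << (63 - n);
}
```

Under that assumption, bit 35 gives 0x10000000 (the SIAR-valid mask) and
bit 36 gives 0x08000000 (the SDAR-valid mask).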

^ permalink raw reply

* Re: [PATCH -V8 04/11] arch/powerpc: Convert virtual address to vpn
From: Paul Mackerras @ 2012-09-06 22:32 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <1346945351-7672-5-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

On Thu, Sep 06, 2012 at 08:59:04PM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> 
> This patch converts different functions to take a virtual page number
> instead of a virtual address. The virtual page number is the virtual
> address shifted right by VPN_SHIFT (12) bits. This enables us to have an
> address range of up to 76 bits.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Reviewed-by: Paul Mackerras <paulus@samba.org>
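
A minimal sketch of the conversion described in the commit message, for
readers following along. VPN_SHIFT comes from the message itself; the
function names are hypothetical, and the sketch takes a 64-bit input, so
it only illustrates the shift and the bit count, not how the kernel
builds a VA wider than 64 bits.

```c
#include <stdint.h>

#define VPN_SHIFT 12	/* value given in the commit message */

/* Drop the low VPN_SHIFT offset bits to get the virtual page number. */
static uint64_t va_to_vpn(uint64_t va)
{
	return va >> VPN_SHIFT;
}

/*
 * A 64-bit variable holding a vpn can describe this many bits of
 * virtual address: 64 bits of page number plus the 12 offset bits.
 */
static unsigned int vpn_addressable_bits(void)
{
	return 64 + VPN_SHIFT;
}
```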

^ permalink raw reply

* Re: [PATCH] KVM: PPC: BookE: HV: Fix compile
From: Alexander Graf @ 2012-09-06 19:39 UTC (permalink / raw)
  To: Michael Neuling; +Cc: linuxppc-dev, Linus Torvalds, KVM list, kvm-ppc
In-Reply-To: <9849.1343262707@neuling.org>


On 25.07.2012, at 20:31, Michael Neuling wrote:

> Alexander Graf <agraf@suse.de> wrote:
> 
>> After merging the register type check patches from Ben's tree, the
>> hv enabled booke implementation ceased to compile.
>> 
>> This patch fixes things up so everyone's happy again.
> 
> Is there a defconfig which catches this?

Hrm. I don't think a defconfig gets you there, as KVM isn't enabled by
default. Just configure your kernel with support for e500mc and enable
KVM :).


Alex

^ permalink raw reply

* [PATCH] powerpc: Fix build dependencies for c files requiring libfdt.h
From: Matthew McClintock @ 2012-09-06 18:48 UTC (permalink / raw)
  To: linuxppc-dev

Several files in obj-plat depend on the libfdt header file. Without an
explicit dependency, a build can occasionally fail as shown below. This
patch adds libfdt as a dependency of those object files.

| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:854:1: error: unterminated comment
| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:1:0: error: unterminated #ifndef
|   BOOTCC  arch/powerpc/boot/inffast.o
| make[1]: *** [arch/powerpc/boot/treeboot-iss4xx.o] Error 1
| make[1]: *** Waiting for unfinished jobs....
|   BOOTCC  arch/powerpc/boot/inflate.o
| make: *** [uImage] Error 2
| ERROR: oe_runmake failed
| ERROR: Function failed: do_compile (see /srv/home/pokybuild/yocto-autobuilder/yocto-slave/p1022ds/build/build/tmp/work/p1022ds-poky-linux-gnuspe/linux-qoriq-sdk-3.0.34-r5/temp/log.do_compile.2167 for further information)
NOTE: recipe linux-qoriq-sdk-3.0.34-r5: task do_compile: Failed

Signed-off-by: Matthew McClintock <msm@freescale.com>
---
 arch/powerpc/boot/Makefile |    1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index b7d8333..6a15c96 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -107,6 +107,7 @@ src-boot := $(addprefix $(obj)/, $(src-boot))
 obj-boot := $(addsuffix .o, $(basename $(src-boot)))
 obj-wlib := $(addsuffix .o, $(basename $(addprefix $(obj)/, $(src-wlib))))
 obj-plat := $(addsuffix .o, $(basename $(addprefix $(obj)/, $(src-plat))))
+obj-plat: $(libfdt)
 
 quiet_cmd_copy_zlib = COPY    $@
       cmd_copy_zlib = sed "s@__used@@;s@<linux/\([^>]*\).*@\"\1\"@" $< > $@
-- 
1.7.9.7

^ permalink raw reply related

* [PATCH -V8 06/11] arch/powerpc: Increase the slice range to 64TB
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
  To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

This patch makes the high psizes mask an unsigned char array
so that we can have more than 16TB. Currently we support up to
64TB.

Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mmu-hash64.h |    6 +-
 arch/powerpc/include/asm/page_64.h    |    6 +-
 arch/powerpc/mm/hash_utils_64.c       |   15 +++--
 arch/powerpc/mm/slb_low.S             |   30 ++++++---
 arch/powerpc/mm/slice.c               |  107 +++++++++++++++++++++------------
 5 files changed, 109 insertions(+), 55 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 6aeb498..7cbd541 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -460,7 +460,11 @@ typedef struct {
 
 #ifdef CONFIG_PPC_MM_SLICES
 	u64 low_slices_psize;	/* SLB page size encodings */
-	u64 high_slices_psize;  /* 4 bits per slice for now */
+	/*
+	 * Right now we support 64TB and 4 bits for each
+	 * 1TB slice we need 32 bytes for 64TB.
+	 */
+	unsigned char high_slices_psize[32];  /* 4 bits per slice for now */
 #else
 	u16 sllp;		/* SLB page size encoding */
 #endif
diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index fed85e6..6c9bef4 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -82,7 +82,11 @@ extern u64 ppc64_pft_size;
 
 struct slice_mask {
 	u16 low_slices;
-	u16 high_slices;
+	/*
+	 * This should be derived out of PGTABLE_RANGE. For the current
+	 * max 64TB, u64 should be ok.
+	 */
+	u64 high_slices;
 };
 
 struct mm_struct;
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 74c5479..13e0ccf 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -804,16 +804,19 @@ unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
 #ifdef CONFIG_PPC_MM_SLICES
 unsigned int get_paca_psize(unsigned long addr)
 {
-	unsigned long index, slices;
+	u64 lpsizes;
+	unsigned char *hpsizes;
+	unsigned long index, mask_index;
 
 	if (addr < SLICE_LOW_TOP) {
-		slices = get_paca()->context.low_slices_psize;
+		lpsizes = get_paca()->context.low_slices_psize;
 		index = GET_LOW_SLICE_INDEX(addr);
-	} else {
-		slices = get_paca()->context.high_slices_psize;
-		index = GET_HIGH_SLICE_INDEX(addr);
+		return (lpsizes >> (index * 4)) & 0xF;
 	}
-	return (slices >> (index * 4)) & 0xF;
+	hpsizes = get_paca()->context.high_slices_psize;
+	index = GET_HIGH_SLICE_INDEX(addr);
+	mask_index = index & 0x1;
+	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
 }
 
 #else
diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
index b9ee79ce..e132dc6 100644
--- a/arch/powerpc/mm/slb_low.S
+++ b/arch/powerpc/mm/slb_low.S
@@ -108,17 +108,31 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
 	 * between 4k and 64k standard page size
 	 */
 #ifdef CONFIG_PPC_MM_SLICES
+	/* r10 have esid */
 	cmpldi	r10,16
-
-	/* Get the slice index * 4 in r11 and matching slice size mask in r9 */
-	ld	r9,PACALOWSLICESPSIZE(r13)
-	sldi	r11,r10,2
+	/* below SLICE_LOW_TOP */
 	blt	5f
-	ld	r9,PACAHIGHSLICEPSIZE(r13)
-	srdi	r11,r10,(SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT - 2)
-	andi.	r11,r11,0x3c
+	/*
+	 * Handle hpsizes,
+	 * r9 is get_paca()->context.high_slices_psize[index], r11 is mask_index
+	 */
+	srdi    r11,r10,(SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT + 1) /* index */
+	addi	r9,r11,PACAHIGHSLICEPSIZE
+	lbzx	r9,r13,r9		/* r9 is hpsizes[r11] */
+	/* r11 = (r10 >> (SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT)) & 0x1 */
+	rldicl	r11,r10,(64 - (SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT)),63
+	b	6f
 
-5:	/* Extract the psize and multiply to get an array offset */
+5:
+	/*
+	 * Handle lpsizes
+	 * r9 is get_paca()->context.low_slices_psize, r11 is index
+	 */
+	ld	r9,PACALOWSLICESPSIZE(r13)
+	mr	r11,r10
+6:
+	sldi	r11,r11,2  /* index * 4 */
+	/* Extract the psize and multiply to get an array offset */
 	srd	r9,r9,r11
 	andi.	r9,r9,0xf
 	mulli	r9,r9,MMUPSIZEDEFSIZE
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 73709f7..b4e996a 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -42,7 +42,7 @@ int _slice_debug = 1;
 
 static void slice_print_mask(const char *label, struct slice_mask mask)
 {
-	char	*p, buf[16 + 3 + 16 + 1];
+	char	*p, buf[16 + 3 + 64 + 1];
 	int	i;
 
 	if (!_slice_debug)
@@ -54,7 +54,7 @@ static void slice_print_mask(const char *label, struct slice_mask mask)
 	*(p++) = '-';
 	*(p++) = ' ';
 	for (i = 0; i < SLICE_NUM_HIGH; i++)
-		*(p++) = (mask.high_slices & (1 << i)) ? '1' : '0';
+		*(p++) = (mask.high_slices & (1ul << i)) ? '1' : '0';
 	*(p++) = 0;
 
 	printk(KERN_DEBUG "%s:%s\n", label, buf);
@@ -84,8 +84,8 @@ static struct slice_mask slice_range_to_mask(unsigned long start,
 	}
 
 	if ((start + len) > SLICE_LOW_TOP)
-		ret.high_slices = (1u << (GET_HIGH_SLICE_INDEX(end) + 1))
-			- (1u << GET_HIGH_SLICE_INDEX(start));
+		ret.high_slices = (1ul << (GET_HIGH_SLICE_INDEX(end) + 1))
+			- (1ul << GET_HIGH_SLICE_INDEX(start));
 
 	return ret;
 }
@@ -135,26 +135,31 @@ static struct slice_mask slice_mask_for_free(struct mm_struct *mm)
 
 	for (i = 0; i < SLICE_NUM_HIGH; i++)
 		if (!slice_high_has_vma(mm, i))
-			ret.high_slices |= 1u << i;
+			ret.high_slices |= 1ul << i;
 
 	return ret;
 }
 
 static struct slice_mask slice_mask_for_size(struct mm_struct *mm, int psize)
 {
+	unsigned char *hpsizes;
+	int index, mask_index;
 	struct slice_mask ret = { 0, 0 };
 	unsigned long i;
-	u64 psizes;
+	u64 lpsizes;
 
-	psizes = mm->context.low_slices_psize;
+	lpsizes = mm->context.low_slices_psize;
 	for (i = 0; i < SLICE_NUM_LOW; i++)
-		if (((psizes >> (i * 4)) & 0xf) == psize)
+		if (((lpsizes >> (i * 4)) & 0xf) == psize)
 			ret.low_slices |= 1u << i;
 
-	psizes = mm->context.high_slices_psize;
-	for (i = 0; i < SLICE_NUM_HIGH; i++)
-		if (((psizes >> (i * 4)) & 0xf) == psize)
-			ret.high_slices |= 1u << i;
+	hpsizes = mm->context.high_slices_psize;
+	for (i = 0; i < SLICE_NUM_HIGH; i++) {
+		mask_index = i & 0x1;
+		index = i >> 1;
+		if (((hpsizes[index] >> (mask_index * 4)) & 0xf) == psize)
+			ret.high_slices |= 1ul << i;
+	}
 
 	return ret;
 }
@@ -183,8 +188,10 @@ static void slice_flush_segments(void *parm)
 
 static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psize)
 {
+	int index, mask_index;
 	/* Write the new slice psize bits */
-	u64 lpsizes, hpsizes;
+	unsigned char *hpsizes;
+	u64 lpsizes;
 	unsigned long i, flags;
 
 	slice_dbg("slice_convert(mm=%p, psize=%d)\n", mm, psize);
@@ -201,14 +208,18 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 			lpsizes = (lpsizes & ~(0xful << (i * 4))) |
 				(((unsigned long)psize) << (i * 4));
 
-	hpsizes = mm->context.high_slices_psize;
-	for (i = 0; i < SLICE_NUM_HIGH; i++)
-		if (mask.high_slices & (1u << i))
-			hpsizes = (hpsizes & ~(0xful << (i * 4))) |
-				(((unsigned long)psize) << (i * 4));
-
+	/* Assign the value back */
 	mm->context.low_slices_psize = lpsizes;
-	mm->context.high_slices_psize = hpsizes;
+
+	hpsizes = mm->context.high_slices_psize;
+	for (i = 0; i < SLICE_NUM_HIGH; i++) {
+		mask_index = i & 0x1;
+		index = i >> 1;
+		if (mask.high_slices & (1ul << i))
+			hpsizes[index] = (hpsizes[index] &
+					  ~(0xf << (mask_index * 4))) |
+				(((unsigned long)psize) << (mask_index * 4));
+	}
 
 	slice_dbg(" lsps=%lx, hsps=%lx\n",
 		  mm->context.low_slices_psize,
@@ -587,18 +598,19 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp,
 
 unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
 {
-	u64 psizes;
-	int index;
+	unsigned char *hpsizes;
+	int index, mask_index;
 
 	if (addr < SLICE_LOW_TOP) {
-		psizes = mm->context.low_slices_psize;
+		u64 lpsizes;
+		lpsizes = mm->context.low_slices_psize;
 		index = GET_LOW_SLICE_INDEX(addr);
-	} else {
-		psizes = mm->context.high_slices_psize;
-		index = GET_HIGH_SLICE_INDEX(addr);
+		return (lpsizes >> (index * 4)) & 0xf;
 	}
-
-	return (psizes >> (index * 4)) & 0xf;
+	hpsizes = mm->context.high_slices_psize;
+	index = GET_HIGH_SLICE_INDEX(addr);
+	mask_index = index & 0x1;
+	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xf;
 }
 EXPORT_SYMBOL_GPL(get_slice_psize);
 
@@ -618,7 +630,9 @@ EXPORT_SYMBOL_GPL(get_slice_psize);
  */
 void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 {
-	unsigned long flags, lpsizes, hpsizes;
+	int index, mask_index;
+	unsigned char *hpsizes;
+	unsigned long flags, lpsizes;
 	unsigned int old_psize;
 	int i;
 
@@ -639,15 +653,21 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 		if (((lpsizes >> (i * 4)) & 0xf) == old_psize)
 			lpsizes = (lpsizes & ~(0xful << (i * 4))) |
 				(((unsigned long)psize) << (i * 4));
+	/* Assign the value back */
+	mm->context.low_slices_psize = lpsizes;
 
 	hpsizes = mm->context.high_slices_psize;
-	for (i = 0; i < SLICE_NUM_HIGH; i++)
-		if (((hpsizes >> (i * 4)) & 0xf) == old_psize)
-			hpsizes = (hpsizes & ~(0xful << (i * 4))) |
-				(((unsigned long)psize) << (i * 4));
+	for (i = 0; i < SLICE_NUM_HIGH; i++) {
+		mask_index = i & 0x1;
+		index = i >> 1;
+		if (((hpsizes[index] >> (mask_index * 4)) & 0xf) == old_psize)
+			hpsizes[index] = (hpsizes[index] &
+					  ~(0xf << (mask_index * 4))) |
+				(((unsigned long)psize) << (mask_index * 4));
+	}
+
+
 
-	mm->context.low_slices_psize = lpsizes;
-	mm->context.high_slices_psize = hpsizes;
 
 	slice_dbg(" lsps=%lx, hsps=%lx\n",
 		  mm->context.low_slices_psize,
@@ -660,18 +680,27 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 void slice_set_psize(struct mm_struct *mm, unsigned long address,
 		     unsigned int psize)
 {
+	unsigned char *hpsizes;
 	unsigned long i, flags;
-	u64 *p;
+	u64 *lpsizes;
 
 	spin_lock_irqsave(&slice_convert_lock, flags);
 	if (address < SLICE_LOW_TOP) {
 		i = GET_LOW_SLICE_INDEX(address);
-		p = &mm->context.low_slices_psize;
+		lpsizes = &mm->context.low_slices_psize;
+		*lpsizes = (*lpsizes & ~(0xful << (i * 4))) |
+			((unsigned long) psize << (i * 4));
 	} else {
+		int index, mask_index;
 		i = GET_HIGH_SLICE_INDEX(address);
-		p = &mm->context.high_slices_psize;
+		hpsizes = mm->context.high_slices_psize;
+		mask_index = i & 0x1;
+		index = i >> 1;
+		hpsizes[index] = (hpsizes[index] &
+				  ~(0xf << (mask_index * 4))) |
+			(((unsigned long)psize) << (mask_index * 4));
 	}
-	*p = (*p & ~(0xful << (i * 4))) | ((unsigned long) psize << (i * 4));
+
 	spin_unlock_irqrestore(&slice_convert_lock, flags);
 
 #ifdef CONFIG_SPU_BASE
-- 
1.7.10
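
The indexing scheme the patch uses for high_slices_psize (two 4-bit
page-size fields per byte, selected by index >> 1 and index & 0x1) can be
sketched outside the kernel as follows. psize_get() and psize_set() are
hypothetical helper names; the patch open-codes the same expressions at
each call site.

```c
/* Read the 4-bit page size for a slice from a nibble-packed byte array. */
static unsigned int psize_get(const unsigned char *hpsizes, unsigned long slice)
{
	unsigned long index = slice >> 1;		/* which byte */
	unsigned int shift = (slice & 0x1) * 4;		/* low or high nibble */

	return (hpsizes[index] >> shift) & 0xf;
}

/* Replace the 4-bit page size for a slice, leaving the neighbouring nibble alone. */
static void psize_set(unsigned char *hpsizes, unsigned long slice,
		      unsigned int psize)
{
	unsigned long index = slice >> 1;
	unsigned int shift = (slice & 0x1) * 4;

	hpsizes[index] = (unsigned char)((hpsizes[index] & ~(0xfu << shift)) |
					 ((psize & 0xfu) << shift));
}
```

With a 32-byte array this covers slices 0 through 63, i.e. one 1TB slice
per nibble for a 64TB range.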

^ permalink raw reply related

* [PATCH -V8 09/11] arch/powerpc: Use 32bit array for slb cache
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
  To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

With larger vsid we need to track more bits of ESID in slb cache
for slb invalidate.

Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/paca.h |    2 +-
 arch/powerpc/mm/slb_low.S       |    8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index daf813f..3e7abba 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -100,7 +100,7 @@ struct paca_struct {
 	/* SLB related definitions */
 	u16 vmalloc_sllp;
 	u16 slb_cache_ptr;
-	u16 slb_cache[SLB_CACHE_ENTRIES];
+	u32 slb_cache[SLB_CACHE_ENTRIES];
 #endif /* CONFIG_PPC_STD_MMU_64 */
 
 #ifdef CONFIG_PPC_BOOK3E
diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
index 3b75f19..f6a2625 100644
--- a/arch/powerpc/mm/slb_low.S
+++ b/arch/powerpc/mm/slb_low.S
@@ -270,10 +270,10 @@ _GLOBAL(slb_compare_rr_to_size)
 	bge	1f
 
 	/* still room in the slb cache */
-	sldi	r11,r3,1		/* r11 = offset * sizeof(u16) */
-	rldicl	r10,r10,36,28		/* get low 16 bits of the ESID */
-	add	r11,r11,r13		/* r11 = (u16 *)paca + offset */
-	sth	r10,PACASLBCACHE(r11)	/* paca->slb_cache[offset] = esid */
+	sldi	r11,r3,2		/* r11 = offset * sizeof(u32) */
+	srdi    r10,r10,28		/* get the 36 bits of the ESID */
+	add	r11,r11,r13		/* r11 = (u32 *)paca + offset */
+	stw	r10,PACASLBCACHE(r11)	/* paca->slb_cache[offset] = esid */
 	addi	r3,r3,1			/* offset++ */
 	b	2f
 1:					/* offset >= SLB_CACHE_ENTRIES */
-- 
1.7.10
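
A rough illustration of why a u16 entry stops being enough. It assumes
256MB segments, so the ESID is the effective address shifted right by 28
(matching the srdi in the hunk above), and a 64TB (2^46 byte) address
range as stated elsewhere in the series. All names below are
hypothetical.

```c
#include <stdint.h>

#define SID_SHIFT_SKETCH 28	/* assumption: 256MB segments */

/* ESID for an effective address under the 256MB-segment assumption. */
static uint64_t ea_to_esid(uint64_t ea)
{
	return ea >> SID_SHIFT_SKETCH;
}

/* Does the value survive being stored in a u16 slb_cache entry? */
static int fits_in_u16(uint64_t v)
{
	return v <= 0xffffULL;
}

/* Does the value survive being stored in a u32 slb_cache entry? */
static int fits_in_u32(uint64_t v)
{
	return v <= 0xffffffffULL;
}
```

Under these assumptions the last address below 16TB has ESID 0xffff and
still fits in 16 bits, while the last address below 64TB has ESID
0x3ffff and needs the wider entry.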

^ permalink raw reply related

* [PATCH -V8 07/11] arch/powerpc: Make some of the PGTABLE_RANGE dependency explicit
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
  To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

The slice array size and slice mask size depend on PGTABLE_RANGE. We
can't directly include pgtable.h in these headers because there is
a circular dependency. So add compile-time checks for these values.

Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mmu-hash64.h    |   13 ++++++++-----
 arch/powerpc/include/asm/page_64.h       |   16 ++++++++++++----
 arch/powerpc/include/asm/pgtable-ppc64.h |    8 ++++++++
 3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 7cbd541..cbd7edb 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -416,6 +416,13 @@ extern void slb_set_size(u16 size);
 	srdi	rx,rx,VSID_BITS_##size;	/* extract 2^VSID_BITS bit */	\
 	add	rt,rt,rx
 
+/* 4 bits per slice and we have one slice per 1TB */
+#if 0 /* We can't directly include pgtable.h hence this hack */
+#define SLICE_ARRAY_SIZE  (PGTABLE_RANGE >> 41)
+#else
+/* Right now we only support 64TB */
+#define SLICE_ARRAY_SIZE  32
+#endif
 
 #ifndef __ASSEMBLY__
 
@@ -460,11 +467,7 @@ typedef struct {
 
 #ifdef CONFIG_PPC_MM_SLICES
 	u64 low_slices_psize;	/* SLB page size encodings */
-	/*
-	 * Right now we support 64TB and 4 bits for each
-	 * 1TB slice we need 32 bytes for 64TB.
-	 */
-	unsigned char high_slices_psize[32];  /* 4 bits per slice for now */
+	unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
 #else
 	u16 sllp;		/* SLB page size encoding */
 #endif
diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index 6c9bef4..b55beb4 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -78,14 +78,22 @@ extern u64 ppc64_pft_size;
 #define GET_LOW_SLICE_INDEX(addr)	((addr) >> SLICE_LOW_SHIFT)
 #define GET_HIGH_SLICE_INDEX(addr)	((addr) >> SLICE_HIGH_SHIFT)
 
+/* 1 bit per slice and we have one slice per 1TB */
+#if 0 /* We can't directly include pgtable.h hence this hack */
+#define SLICE_MASK_SIZE (PGTABLE_RANGE >> 43)
+#else
+/*
+ * Right now we support only 64TB.
+ * If we change this we will have to change the type
+ * of high_slices
+ */
+#define SLICE_MASK_SIZE 8
+#endif
+
 #ifndef __ASSEMBLY__
 
 struct slice_mask {
 	u16 low_slices;
-	/*
-	 * This should be derived out of PGTABLE_RANGE. For the current
-	 * max 64TB, u64 should be ok.
-	 */
 	u64 high_slices;
 };
 
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 8af1cf2..dea953f 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -32,6 +32,14 @@
 #endif
 #endif
 
+#if (PGTABLE_RANGE >> 41) > SLICE_ARRAY_SIZE
+#error PGTABLE_RANGE exceeds SLICE_ARRAY_SIZE
+#endif
+
+#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
+#error PGTABLE_RANGE exceeds slice_mask high_slices size
+#endif
+
 /*
  * Define the address range of the kernel non-linear virtual area
  */
-- 
1.7.10

* [PATCH -V8 11/11] arch/powerpc: Update VSID allocation documentation
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
  To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

This updates the proto-VSID and VSID scramble related information
to be more generic by using names instead of current values.

Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
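For readers following the comment text, here is a minimal C sketch of the
user proto-VSID formula and of the divide-free scramble. Assumptions are
marked in the code: the function names are made up, CONTEXT_BITS = 19 and
USER_ESID_BITS = 18 are inferred from "19 == 37 + 28 - 46" rather than
quoted from the patch, and the multiplier and width in the example are the
36-bit values from the comment lines being removed, not necessarily the
ones used after this series.

```c
#include <stdint.h>

/* Values quoted in the comment text this patch removes (36-bit VSIDs). */
#define OLD_VSID_BITS        36
#define OLD_VSID_MULTIPLIER  268435399ULL	/* 0xFFFFFC7 */

/* Inferred from "19 == 37 + 28 - 46"; assumed, not quoted from the patch. */
#define CONTEXT_BITS    19
#define USER_ESID_BITS  18

/* (context << USER_ESID_BITS) | (esid & ((1 << USER_ESID_BITS) - 1)) */
static uint64_t user_proto_vsid(uint64_t context, uint64_t esid)
{
	return (context << USER_ESID_BITS) |
	       (esid & ((1ULL << USER_ESID_BITS) - 1));
}

/*
 * (proto * multiplier) % (2^bits - 1) without a divide.
 * The caller must ensure proto * multiplier fits in 64 bits and that
 * multiplier < 2^bits.
 */
static uint64_t vsid_scramble(uint64_t proto, uint64_t multiplier,
			      unsigned int bits)
{
	uint64_t modulus = (1ULL << bits) - 1;
	uint64_t x = proto * multiplier;

	/* 2^bits == 1 (mod modulus), so fold the high part onto the low */
	x = (x >> bits) + (x & modulus);
	/* one branch-free conditional subtract of modulus */
	return (x + ((x + 1) >> bits)) & modulus;
}
```

The fold works because the modulus is 2^n - 1: the bits above position n
contribute their own value once more when reduced, so adding them to the
low n bits preserves the residue.
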
 arch/powerpc/include/asm/mmu-hash64.h |   40 ++++++++++++++-------------------
 arch/powerpc/mm/mmu_context_hash64.c  |    8 ++++---
 2 files changed, 22 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index de9cfed..428f23e 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -324,51 +324,45 @@ extern void slb_set_size(u16 size);
 #endif /* __ASSEMBLY__ */
 
 /*
- * VSID allocation
+ * VSID allocation (256MB segment)
  *
- * We first generate a 36-bit "proto-VSID".  For kernel addresses this
- * is equal to the ESID, for user addresses it is:
- *	(context << 15) | (esid & 0x7fff)
+ * We first generate a 38-bit "proto-VSID".  For kernel addresses this
+ * is equal to the ESID | 1 << 37, for user addresses it is:
+ *	(context << USER_ESID_BITS) | (esid & ((1U << USER_ESID_BITS) - 1))
  *
- * The two forms are distinguishable because the top bit is 0 for user
- * addresses, whereas the top two bits are 1 for kernel addresses.
- * Proto-VSIDs with the top two bits equal to 0b10 are reserved for
- * now.
+ * This splits the proto-VSID into the below range
+ *  0 - (2^(CONTEXT_BITS + USER_ESID_BITS) - 1) : User proto-VSID range
+ *  2^(CONTEXT_BITS + USER_ESID_BITS) - 2^(VSID_BITS) : Kernel proto-VSID range
+ *
+ * We also have CONTEXT_BITS + USER_ESID_BITS = VSID_BITS - 1
+ * That is, we assign half of the space to user processes and half
+ * to the kernel.
  *
  * The proto-VSIDs are then scrambled into real VSIDs with the
  * multiplicative hash:
  *
  *	VSID = (proto-VSID * VSID_MULTIPLIER) % VSID_MODULUS
- *	where	VSID_MULTIPLIER = 268435399 = 0xFFFFFC7
- *		VSID_MODULUS = 2^36-1 = 0xFFFFFFFFF
  *
- * This scramble is only well defined for proto-VSIDs below
- * 0xFFFFFFFFF, so both proto-VSID and actual VSID 0xFFFFFFFFF are
- * reserved.  VSID_MULTIPLIER is prime, so in particular it is
+ * VSID_MULTIPLIER is prime, so in particular it is
  * co-prime to VSID_MODULUS, making this a 1:1 scrambling function.
  * Because the modulus is 2^n-1 we can compute it efficiently without
  * a divide or extra multiply (see below).
  *
  * This scheme has several advantages over older methods:
  *
- * 	- We have VSIDs allocated for every kernel address
+ *	- We have VSIDs allocated for every kernel address
  * (i.e. everything above 0xC000000000000000), except the very top
  * segment, which simplifies several things.
  *
- *	- We allow for 16 significant bits of ESID and 19 bits of
- * context for user addresses.  i.e. 16T (44 bits) of address space for
- * up to half a million contexts.
+ *	- We allow for USER_ESID_BITS significant bits of ESID and
+ * CONTEXT_BITS  bits of context for user addresses.
+ *  i.e. 64T (46 bits) of address space for up to half a million contexts.
  *
- * 	- The scramble function gives robust scattering in the hash
+ *	- The scramble function gives robust scattering in the hash
  * table (at least based on some initial results).  The previous
  * method was more susceptible to pathological cases giving excessive
  * hash collisions.
  */
-/*
- * WARNING - If you change these you must make sure the asm
- * implementations in slb_allocate (slb_low.S), do_stab_bolted
- * (head.S) and ASM_VSID_SCRAMBLE (below) are changed accordingly.
- */
 
 /*
  * This should be computed such that protovosid * vsid_mulitplier
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index daa076c..40bc5b0 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -30,9 +30,11 @@ static DEFINE_SPINLOCK(mmu_context_lock);
 static DEFINE_IDA(mmu_context_ida);
 
 /*
- * The proto-VSID space has 2^35 - 1 segments available for user mappings.
- * Each segment contains 2^28 bytes.  Each context maps 2^44 bytes,
- * so we can support 2^19-1 contexts (19 == 35 + 28 - 44).
+ * 256MB segment
+ * The proto-VSID space has 2^(CONTEXT_BITS + USER_ESID_BITS) - 1 segments
+ * available for user mappings. Each segment contains 2^28 bytes. Each
+ * context maps 2^46 bytes (64TB) so we can support 2^19-1 contexts
+ * (19 == 37 + 28 - 46).
  */
 #define MAX_CONTEXT	((1UL << CONTEXT_BITS) - 1)
 
-- 
1.7.10
