* [PATCH][v3] sata_fsl: add workaround for data length mismatch on freescale V2 controller
From: Shaohui Xie @ 2012-09-07 10:01 UTC (permalink / raw)
To: jgarzik, linux-ide; +Cc: linuxppc-dev, linux-kernel, Anju Bhartiya, Shaohui Xie
The Freescale V2 SATA controller checks whether the received data length
matches the programmed length 'ttl' and treats any mismatch as an error.
For ATAPI commands, 'ttl' is based on the maximum allocation length rather
than the actual data transfer length, so the controller raises the 'DLM'
(Data Length Mismatch) error bit in the Hstatus register. Along with 'DLM',
the DE (Device Error) and FE (Fatal Error) bits are set in Hstatus, the 'E'
(Internal Error) bit is set in the Serror register, and the CE (Command
Error) and DE (Device Error) registers have the corresponding bits set. To
recover, the service routine checks the 'DLM' flag, sets and then clears
HCONTROL[27] to clear the Hstatus, CE and DE registers, and then clears the
Serror register.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Anju Bhartiya <Anju.Bhartiya@freescale.com>
---
changes for v3:
1. not using uppercase for variable names;
2. remove unnecessary parens;
changes for v2:
1. remove the using of quirk;
2. wrap errata codes in condition;
drivers/ata/sata_fsl.c | 39 +++++++++++++++++++++++++++++++++++----
1 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/drivers/ata/sata_fsl.c b/drivers/ata/sata_fsl.c
index d6577b9..9fbab68 100644
--- a/drivers/ata/sata_fsl.c
+++ b/drivers/ata/sata_fsl.c
@@ -143,6 +143,7 @@ enum {
FATAL_ERR_CRC_ERR_RX |
FATAL_ERR_FIFO_OVRFL_TX | FATAL_ERR_FIFO_OVRFL_RX,
+ INT_ON_DATA_LENGTH_MISMATCH = (1 << 12),
INT_ON_FATAL_ERR = (1 << 5),
INT_ON_PHYRDY_CHG = (1 << 4),
@@ -1181,25 +1182,55 @@ static void sata_fsl_host_intr(struct ata_port *ap)
u32 hstatus, done_mask = 0;
struct ata_queued_cmd *qc;
u32 SError;
+ u32 tag;
+ u32 status_mask = INT_ON_ERROR;
hstatus = ioread32(hcr_base + HSTATUS);
sata_fsl_scr_read(&ap->link, SCR_ERROR, &SError);
+ /* Read command completed register */
+ done_mask = ioread32(hcr_base + CC);
+
+ /* Workaround for data length mismatch errata */
+ if (unlikely(hstatus & INT_ON_DATA_LENGTH_MISMATCH)) {
+ for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
+ qc = ata_qc_from_tag(ap, tag);
+ if (qc && ata_is_atapi(qc->tf.protocol)) {
+ u32 hcontrol;
+#define HCONTROL_CLEAR_ERROR (1 << 27)
+ /* Set HControl[27] to clear error registers */
+ hcontrol = ioread32(hcr_base + HCONTROL);
+ iowrite32(hcontrol | HCONTROL_CLEAR_ERROR,
+ hcr_base + HCONTROL);
+
+ /* Clear HControl[27] */
+ iowrite32(hcontrol & ~HCONTROL_CLEAR_ERROR,
+ hcr_base + HCONTROL);
+
+ /* Clear SError[E] bit */
+ sata_fsl_scr_write(&ap->link, SCR_ERROR,
+ SError);
+
+ /* Ignore fatal error and device error */
+ status_mask &= ~(INT_ON_SINGL_DEVICE_ERR
+ | INT_ON_FATAL_ERR);
+ break;
+ }
+ }
+ }
+
if (unlikely(SError & 0xFFFF0000)) {
DPRINTK("serror @host_intr : 0x%x\n", SError);
sata_fsl_error_intr(ap);
}
- if (unlikely(hstatus & INT_ON_ERROR)) {
+ if (unlikely(hstatus & status_mask)) {
DPRINTK("error interrupt!!\n");
sata_fsl_error_intr(ap);
return;
}
- /* Read command completed register */
- done_mask = ioread32(hcr_base + CC);
-
VPRINTK("Status of all queues :\n");
VPRINTK("done_mask/CC = 0x%x, CA = 0x%x, CE=0x%x,CQ=0x%x,apqa=0x%x\n",
done_mask,
--
1.6.4
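The masking logic in the sata_fsl workaround above can be sketched in isolation. This is only an illustration, not the driver code: the bit position of INT_ON_SINGL_DEVICE_ERR and the composition of INT_ON_ERROR are assumptions made for the sketch, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INT_ON_DATA_LENGTH_MISMATCH (1u << 12) /* from the patch */
#define INT_ON_FATAL_ERR            (1u << 5)  /* from the patch */
#define INT_ON_PHYRDY_CHG           (1u << 4)  /* from the patch */
#define INT_ON_SINGL_DEVICE_ERR     (1u << 1)  /* assumed bit position */

/* Assumed composition of the error mask, for illustration only. */
#define INT_ON_ERROR \
	(INT_ON_FATAL_ERR | INT_ON_PHYRDY_CHG | INT_ON_SINGL_DEVICE_ERR)

/* In this sketch DLM is a separate bit, so it must be tested on its own. */
static_assert((INT_ON_ERROR & INT_ON_DATA_LENGTH_MISMATCH) == 0,
	      "DLM is not part of the assumed error mask");

/* Hypothetical helper: which HSTATUS bits should still count as errors. */
static uint32_t error_status_mask(uint32_t hstatus, bool atapi_cmd_active)
{
	uint32_t mask = INT_ON_ERROR;

	/* DLM on an ATAPI command: ignore the fatal and device error bits. */
	if ((hstatus & INT_ON_DATA_LENGTH_MISMATCH) && atapi_cmd_active)
		mask &= ~(INT_ON_SINGL_DEVICE_ERR | INT_ON_FATAL_ERR);

	return mask;
}

/* Hypothetical helper: would the interrupt handler take the error path? */
static bool needs_error_handling(uint32_t hstatus, bool atapi_cmd_active)
{
	return (hstatus & error_status_mask(hstatus, atapi_cmd_active)) != 0;
}
```

With these assumptions, a DLM that arrives together with the fatal and device error bits is ignored only while an ATAPI command is outstanding; any other error bit, such as a PHY-ready change, still reaches the error path.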
* Re: [PATCH v2 1/2] [powerpc] Change memory_limit from phys_addr_t to unsigned long long
From: Suzuki K. Poulose @ 2012-09-07 10:01 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: mahesh, linuxppc-dev, linux-kernel
In-Reply-To: <1346981712.2385.30.camel@pasglop>
On 09/07/2012 07:05 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2012-08-21 at 17:12 +0530, Suzuki K. Poulose wrote:
>> There are some device-tree nodes, whose values are of type phys_addr_t.
>> The phys_addr_t is variable sized based on the CONFIG_PHYS_ADDR_T_64BIT.
>>
>> Change these to a fixed unsigned long long for consistency.
>>
>> This patch does the change only for memory_limit.
>>
>> The following is a list of such variables which need the change:
>>
>> 1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c
>>
>> 2) (struct resource *)crashk_res.start - We could export a local static
>> variable from machine_kexec.c.
>>
>> Changing the above values might break the kexec-tools. So, I will
>> fix kexec-tools first to handle the different sized values and then change
>> the above.
>>
>> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
>> ---
>
> Breaks the build on some configs (with 32-bit phys_addr_t):
Sorry for that.
>
> /home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c: In function
> 'early_init_devtree':
> /home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c:664:25: error:
> comparison of distinct pointer types lacks a cast
>
> I'm fixing that myself this time but please be more careful.
Sure. Thanks Ben for fixing that.
Suzuki
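For readers hitting the same build failure: "comparison of distinct pointer types lacks a cast" is the diagnostic the kernel's type-checked min()/max() macros produce when their two operands differ in type. Whether prom.c:664 is such a call is an assumption, since the thread does not show that line. The sketch below uses a hypothetical MIN_T macro and a hypothetical 32-bit stand-in for phys_addr_t to show the usual remedy, which is to compare both operands in one explicit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for phys_addr_t on a config where it is 32 bits. */
typedef uint32_t phys_addr32_t;

/*
 * Cast both operands to one type before comparing, in the spirit of the
 * kernel's min_t(). Unlike the kernel macro, this simple version evaluates
 * its arguments more than once, so avoid side effects in them.
 */
#define MIN_T(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Widening to unsigned long long never loses bits of the 32-bit type. */
static_assert(sizeof(unsigned long long) >= sizeof(phys_addr32_t),
	      "the comparison type must be at least as wide as the operand");

/* Hypothetical helper: clamp a memory block size to a fixed-width limit. */
static unsigned long long clamp_to_limit(phys_addr32_t size,
					 unsigned long long limit)
{
	return MIN_T(unsigned long long, size, limit);
}
```

The fixed-width limit can then hold values above 4GB even when the physical address type itself is only 32 bits wide.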
* Re: [PATCH] powerpc/powernv: move the dereference below the NULL test
From: Benjamin Herrenschmidt @ 2012-09-07 7:59 UTC (permalink / raw)
To: Wei Yongjun
Cc: devicetree-discuss, linux-kernel, rob.herring, yongjun_wei,
paulus, linuxppc-dev
In-Reply-To: <CAPgLHd_w9B8sHg9i9msFLx8FVBypqTtDEVOt1-VBNTW4zwHTMQ@mail.gmail.com>
On Fri, 2012-09-07 at 14:45 +0800, Wei Yongjun wrote:
> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
>
> The dereference should be moved below the NULL test.
>
> spatch with a semantic match was used to find this.
> (http://coccinelle.lip6.fr/)
I haven't applied this patch yet (there was a similar one recently from
another semantic checker I believe) because that code is about to be
deeply reworked (waiting for some dependencies to get in), so this will
just make the patch harder to apply, and the stuff should never be NULL
in the first place anyway.
So let's leave that aside for a bit.
Cheers,
Ben.
> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> ---
> arch/powerpc/platforms/powernv/pci.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index be3cfc5..4ba89c1 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -287,13 +287,15 @@ static int pnv_pci_read_config(struct pci_bus *bus,
> int where, int size, u32 *val)
> {
> struct pci_controller *hose = pci_bus_to_host(bus);
> - struct pnv_phb *phb = hose->private_data;
> + struct pnv_phb *phb;
> u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
> s64 rc;
>
> if (hose == NULL)
> return PCIBIOS_DEVICE_NOT_FOUND;
>
> + phb = hose->private_data;
> +
> switch (size) {
> case 1: {
> u8 v8;
> @@ -331,12 +333,14 @@ static int pnv_pci_write_config(struct pci_bus *bus,
> int where, int size, u32 val)
> {
> struct pci_controller *hose = pci_bus_to_host(bus);
> - struct pnv_phb *phb = hose->private_data;
> + struct pnv_phb *phb;
> u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
>
> if (hose == NULL)
> return PCIBIOS_DEVICE_NOT_FOUND;
>
> + phb = hose->private_data;
> +
> cfg_dbg("pnv_pci_write_config bus: %x devfn: %x +%x/%x -> %08x\n",
> bus->number, devfn, where, size, val);
> switch (size) {
* Re: [PATCH -V8 0/11] arch/powerpc: Add 64TB support to ppc64
From: Benjamin Herrenschmidt @ 2012-09-07 7:53 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linuxppc-dev, paulus
In-Reply-To: <871uiexuau.fsf@linux.vnet.ibm.com>
On Fri, 2012-09-07 at 11:12 +0530, Aneesh Kumar K.V wrote:
>
> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> index 428f23e..057a12a 100644
> --- a/arch/powerpc/include/asm/mmu-hash64.h
> +++ b/arch/powerpc/include/asm/mmu-hash64.h
> @@ -14,6 +14,7 @@
>
> #include <asm/asm-compat.h>
> #include <asm/page.h>
> +#include <asm/pgtable-ppc64-range.h>
Nah, that's all too gross... I think the right thing to do is to move
the slice stuff out of page_64.h
> /*
> * Segment table
> @@ -415,12 +416,7 @@ extern void slb_set_size(u16 size);
> add rt,rt,rx
>
> /* 4 bits per slice and we have one slice per 1TB */
> -#if 0 /* We can't directly include pgtable.h hence this hack */
> #define SLICE_ARRAY_SIZE (PGTABLE_RANGE >> 41)
> -#else
> -/* Right now we only support 64TB */
> -#define SLICE_ARRAY_SIZE 32
> -#endif
>
> #ifndef __ASSEMBLY__
>
> diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
> index b55beb4..01ab518 100644
> --- a/arch/powerpc/include/asm/page_64.h
> +++ b/arch/powerpc/include/asm/page_64.h
> @@ -78,16 +78,14 @@ extern u64 ppc64_pft_size;
> #define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT)
> #define GET_HIGH_SLICE_INDEX(addr) ((addr) >> SLICE_HIGH_SHIFT)
>
> -/* 1 bit per slice and we have one slice per 1TB */
> -#if 0 /* We can't directly include pgtable.h hence this hack */
> -#define SLICE_MASK_SIZE (PGTABLE_RANGE >> 43)
> -#else
> -/*
> +/* 1 bit per slice and we have one slice per 1TB
> * Right now we support only 64TB.
> * IF we change this we will have to change the type
> * of high_slices
> */
> #define SLICE_MASK_SIZE 8
> +#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
> +#error PGTABLE_RANGE exceeds slice_mask high_slices size
> #endif
>
> #ifndef __ASSEMBLY__
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64-range.h b/arch/powerpc/include/asm/pgtable-ppc64-range.h
> new file mode 100644
> index 0000000..04a825c
> --- /dev/null
> +++ b/arch/powerpc/include/asm/pgtable-ppc64-range.h
> @@ -0,0 +1,16 @@
> +#ifndef _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
> +#define _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
> +
> +#ifdef CONFIG_PPC_64K_PAGES
> +#include <asm/pgtable-ppc64-64k.h>
> +#else
> +#include <asm/pgtable-ppc64-4k.h>
> +#endif
> +
> +/*
> + * Size of EA range mapped by our pagetables.
> + */
> +#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
> + PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
> +#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
> +#endif
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index dea953f..ee783b4 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -13,13 +13,7 @@
>
> #define FIRST_USER_ADDRESS 0
>
> -/*
> - * Size of EA range mapped by our pagetables.
> - */
> -#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
> - PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
> -#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
> -
> +#include <asm/pgtable-ppc64-range.h>
>
> /* Some sanity checking */
> #if TASK_SIZE_USER64 > PGTABLE_RANGE
> @@ -32,14 +26,6 @@
> #endif
> #endif
>
> -#if (PGTABLE_RANGE >> 41) > SLICE_ARRAY_SIZE
> -#error PGTABLE_RANGE exceeds SLICE_ARRAY_SIZE
> -#endif
> -
> -#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
> -#error PGTABLE_RANGE exceeds slice_mask high_slices size
> -#endif
> -
> /*
> * Define the address range of the kernel non-linear virtual area
> */
Ben.
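The slice sizes discussed above follow from simple arithmetic, which can be checked at compile time. This is a sketch under stated assumptions: a 46-bit effective address range (64TB) and a high-slice shift of 40 (one slice per 1TB) are taken from the comments in the quoted diff, and the two helper functions are hypothetical.

```c
#include <assert.h>

#define PGTABLE_EADDR_SIZE 46	/* assumed: 64TB = 2^46 bytes */
#define PGTABLE_RANGE      (1ULL << PGTABLE_EADDR_SIZE)
#define SLICE_HIGH_SHIFT   40	/* assumed: one high slice per 1TB */

#define NUM_HIGH_SLICES  (PGTABLE_RANGE >> SLICE_HIGH_SHIFT)
#define SLICE_ARRAY_SIZE (PGTABLE_RANGE >> 41)	/* 4 bits per slice */
#define SLICE_MASK_SIZE  (PGTABLE_RANGE >> 43)	/* 1 bit per slice */

static_assert(NUM_HIGH_SLICES == 64, "64TB holds 64 slices of 1TB");
static_assert(SLICE_ARRAY_SIZE == 32, "64 slices * 4 bits = 32 bytes");
static_assert(SLICE_MASK_SIZE == 8, "64 slices * 1 bit = 8 bytes");

/* Hypothetical helper: bytes needed at 4 bits per 1TB slice. */
static unsigned long long slice_array_bytes(unsigned int ea_bits)
{
	assert(ea_bits < 64);
	return (1ULL << ea_bits) >> 41;
}

/* Hypothetical helper: bytes needed at 1 bit per 1TB slice. */
static unsigned long long slice_mask_bytes(unsigned int ea_bits)
{
	assert(ea_bits < 64);
	return (1ULL << ea_bits) >> 43;
}
```

Growing the range by one bit doubles both sizes, which is the situation the #error checks in the quoted patch are meant to catch.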
* [PATCH 3/3] powerpc: cleanup old DABRX #defines
From: Michael Neuling @ 2012-09-07 7:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>
These are no longer used, so get rid of them.
Signed-off-by: Michael Neuling <mikey@neuling.org>
---
arch/powerpc/include/asm/hvcall.h | 5 -----
1 file changed, 5 deletions(-)
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 423cf9e..7a86706 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -152,11 +152,6 @@
#define H_VASI_RESUMED 5
#define H_VASI_COMPLETED 6
-/* DABRX flags */
-#define H_DABRX_HYPERVISOR (1UL<<(63-61))
-#define H_DABRX_KERNEL (1UL<<(63-62))
-#define H_DABRX_USER (1UL<<(63-63))
-
/* Each control block has to be on a 4K boundary */
#define H_CB_ALIGNMENT 4096
--
1.7.9.5
* [PATCH 2/3] powerpc: Dynamically calculate the dabrx based on kernel/user/hypervisor
From: Michael Neuling @ 2012-09-07 7:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>
Currently we mark the DABRX to interrupt on all matches
(hypervisor/kernel/user) and then filter in software. We can be a lot
smarter now that we can set the DABRX dynamically.
This sets the DABRX based on the flags passed by the user.
Signed-off-by: Michael Neuling <mikey@neuling.org>
---
arch/powerpc/include/asm/hw_breakpoint.h | 1 +
arch/powerpc/kernel/hw_breakpoint.c | 15 +++++++++++----
arch/powerpc/platforms/pseries/setup.c | 2 +-
3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
index c6f48eb..4234245 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -28,6 +28,7 @@
struct arch_hw_breakpoint {
unsigned long address;
+ unsigned long dabrx;
int type;
u8 len; /* length of the target data symbol */
bool extraneous_interrupt;
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 6891d79..a89cae4 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -73,7 +73,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
* If so, DABR will be populated in single_step_dabr_instruction().
*/
if (current->thread.last_hit_ubp != bp)
- set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
return 0;
}
@@ -170,6 +170,13 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
info->address = bp->attr.bp_addr;
info->len = bp->attr.bp_len;
+ info->dabrx = DABRX_ALL;
+ if (bp->attr.exclude_user)
+ info->dabrx &= ~DABRX_USER;
+ if (bp->attr.exclude_kernel)
+ info->dabrx &= ~DABRX_KERNEL;
+ if (bp->attr.exclude_hv)
+ info->dabrx &= ~DABRX_HYP;
/*
* Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
@@ -197,7 +204,7 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
info = counter_arch_bp(tsk->thread.last_hit_ubp);
regs->msr &= ~MSR_SE;
- set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
tsk->thread.last_hit_ubp = NULL;
}
@@ -281,7 +288,7 @@ int __kprobes hw_breakpoint_handler(struct die_args *args)
if (!info->extraneous_interrupt)
perf_bp_event(bp, regs);
- set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
out:
rcu_read_unlock();
return rc;
@@ -313,7 +320,7 @@ int __kprobes single_step_dabr_instruction(struct die_args *args)
if (!info->extraneous_interrupt)
perf_bp_event(bp, regs);
- set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, info->dabrx);
current->thread.last_hit_ubp = NULL;
/*
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index b90deaf..40b30e4 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -427,7 +427,7 @@ static int pseries_set_xdabr(unsigned long dabr, unsigned long dabrx)
if (dabrx == 0 && dabr == 0)
dabrx = DABRX_USER;
/* PAPR says we can only set kernel and user bits */
- dabrx &= H_DABRX_KERNEL | H_DABRX_USER;
+ dabrx &= DABRX_KERNEL | DABRX_USER;
return plpar_hcall_norets(H_SET_XDABR, dabr, dabrx);
}
--
1.7.9.5
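The DABRX calculation in patch 2/3 can be tried outside the kernel. A minimal sketch follows; the bit values are the ones defined in reg.h by patch 1/3, while the function name and its three boolean parameters are hypothetical stand-ins for the exclude_user, exclude_kernel and exclude_hv attribute flags.

```c
#include <assert.h>
#include <stdbool.h>

#define DABRX_USER   (1UL << 0)
#define DABRX_KERNEL (1UL << 1)
#define DABRX_HYP    (1UL << 2)
#define DABRX_BTI    (1UL << 3)
#define DABRX_ALL    (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

static_assert(DABRX_ALL == 0xF, "all four match bits set");

/* Start from "match everything" and drop each excluded privilege level. */
static unsigned long dabrx_from_excludes(bool exclude_user,
					 bool exclude_kernel,
					 bool exclude_hv)
{
	unsigned long dabrx = DABRX_ALL;

	if (exclude_user)
		dabrx &= ~DABRX_USER;
	if (exclude_kernel)
		dabrx &= ~DABRX_KERNEL;
	if (exclude_hv)
		dabrx &= ~DABRX_HYP;

	return dabrx;
}
```

Note that DABRX_BTI is never cleared by these flags, so a user-only breakpoint ends up as DABRX_BTI | DABRX_USER.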
* [PATCH 1/3] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Michael Neuling @ 2012-09-07 7:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>
Rework set_dabr to take a DABRX value as well.
Both the pseries and PS3 hypervisors do some checks on the DABRX
values that are passed in the hcall. This patch stops bogus values
from being passed to the hypervisor. Also, in the case where we are
clearing the breakpoint, where DABR and DABRX are both zero, we modify
the DABRX value to make it valid so that the hcall won't fail.
Signed-off-by: Michael Neuling <mikey@neuling.org>
---
arch/powerpc/include/asm/debug.h | 2 +-
arch/powerpc/include/asm/hw_breakpoint.h | 2 +-
arch/powerpc/include/asm/machdep.h | 3 ++-
arch/powerpc/include/asm/processor.h | 1 +
arch/powerpc/include/asm/reg.h | 3 +++
arch/powerpc/kernel/hw_breakpoint.c | 12 ++++++------
arch/powerpc/kernel/process.c | 14 +++++++-------
arch/powerpc/kernel/ptrace.c | 3 +++
arch/powerpc/kernel/signal.c | 2 +-
arch/powerpc/platforms/cell/beat.c | 4 ++--
arch/powerpc/platforms/cell/beat.h | 2 +-
arch/powerpc/platforms/ps3/setup.c | 10 +++++++---
arch/powerpc/platforms/pseries/setup.c | 14 +++++++++-----
arch/powerpc/xmon/xmon.c | 4 ++--
14 files changed, 46 insertions(+), 30 deletions(-)
diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
index 716d2f0..32de257 100644
--- a/arch/powerpc/include/asm/debug.h
+++ b/arch/powerpc/include/asm/debug.h
@@ -44,7 +44,7 @@ static inline int debugger_dabr_match(struct pt_regs *regs) { return 0; }
static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
#endif
-extern int set_dabr(unsigned long dabr);
+extern int set_dabr(unsigned long dabr, unsigned long dabrx);
#ifdef CONFIG_PPC_ADV_DEBUG_REGS
extern void do_send_trap(struct pt_regs *regs, unsigned long address,
unsigned long error_code, int signal_code, int brkpt);
diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
index 39b323e..c6f48eb 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -61,7 +61,7 @@ extern void ptrace_triggered(struct perf_event *bp,
struct perf_sample_data *data, struct pt_regs *regs);
static inline void hw_breakpoint_disable(void)
{
- set_dabr(0);
+ set_dabr(0, 0);
}
extern void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs);
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 42ce570..236b477 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -180,7 +180,8 @@ struct machdep_calls {
void (*enable_pmcs)(void);
/* Set DABR for this platform, leave empty for default implemenation */
- int (*set_dabr)(unsigned long dabr);
+ int (*set_dabr)(unsigned long dabr,
+ unsigned long dabrx);
#ifdef CONFIG_PPC32 /* XXX for now */
/* A general init function, called by ppc_init in init/main.c.
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 54b73a2..17b58e5 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -219,6 +219,7 @@ struct thread_struct {
#endif /* CONFIG_HAVE_HW_BREAKPOINT */
#endif
unsigned long dabr; /* Data address breakpoint register */
+ unsigned long dabrx; /* ... extension */
#ifdef CONFIG_ALTIVEC
/* Complete AltiVec register set */
vector128 vr[32] __attribute__((aligned(16)));
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 6386086..334be34 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -208,6 +208,9 @@
#define SPRN_DABRX 0x3F7 /* Data Address Breakpoint Register Extension */
#define DABRX_USER (1UL << 0)
#define DABRX_KERNEL (1UL << 1)
+#define DABRX_HYP (1UL << 2)
+#define DABRX_BTI (1UL << 3)
+#define DABRX_ALL (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
#define SPRN_DAR 0x013 /* Data Address Register */
#define SPRN_DBCR 0x136 /* e300 Data Breakpoint Control Reg */
#define SPRN_DSISR 0x012 /* Data Storage Interrupt Status Register */
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 6767445..6891d79 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -73,7 +73,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
* If so, DABR will be populated in single_step_dabr_instruction().
*/
if (current->thread.last_hit_ubp != bp)
- set_dabr(info->address | info->type | DABR_TRANSLATION);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
return 0;
}
@@ -97,7 +97,7 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp)
}
*slot = NULL;
- set_dabr(0);
+ set_dabr(0, 0);
}
/*
@@ -197,7 +197,7 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
info = counter_arch_bp(tsk->thread.last_hit_ubp);
regs->msr &= ~MSR_SE;
- set_dabr(info->address | info->type | DABR_TRANSLATION);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
tsk->thread.last_hit_ubp = NULL;
}
@@ -215,7 +215,7 @@ int __kprobes hw_breakpoint_handler(struct die_args *args)
unsigned long dar = regs->dar;
/* Disable breakpoints during exception handling */
- set_dabr(0);
+ set_dabr(0, 0);
/*
* The counter may be concurrently released but that can only
@@ -281,7 +281,7 @@ int __kprobes hw_breakpoint_handler(struct die_args *args)
if (!info->extraneous_interrupt)
perf_bp_event(bp, regs);
- set_dabr(info->address | info->type | DABR_TRANSLATION);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
out:
rcu_read_unlock();
return rc;
@@ -313,7 +313,7 @@ int __kprobes single_step_dabr_instruction(struct die_args *args)
if (!info->extraneous_interrupt)
perf_bp_event(bp, regs);
- set_dabr(info->address | info->type | DABR_TRANSLATION);
+ set_dabr(info->address | info->type | DABR_TRANSLATION, DABRX_ALL);
current->thread.last_hit_ubp = NULL;
/*
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1a1f2dd..53c32a9 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -283,7 +283,7 @@ void do_dabr(struct pt_regs *regs, unsigned long address,
return;
/* Clear the DABR */
- set_dabr(0);
+ set_dabr(0, 0);
/* Deliver the signal to userspace */
info.si_signo = SIGTRAP;
@@ -364,18 +364,19 @@ static void set_debug_reg_defaults(struct thread_struct *thread)
{
if (thread->dabr) {
thread->dabr = 0;
- set_dabr(0);
+ thread->dabrx = 0;
+ set_dabr(0, 0);
}
}
#endif /* !CONFIG_HAVE_HW_BREAKPOINT */
#endif /* CONFIG_PPC_ADV_DEBUG_REGS */
-int set_dabr(unsigned long dabr)
+int set_dabr(unsigned long dabr, unsigned long dabrx)
{
__get_cpu_var(current_dabr) = dabr;
if (ppc_md.set_dabr)
- return ppc_md.set_dabr(dabr);
+ return ppc_md.set_dabr(dabr, dabrx);
/* XXX should we have a CPU_FTR_HAS_DABR ? */
#ifdef CONFIG_PPC_ADV_DEBUG_REGS
@@ -385,9 +386,8 @@ int set_dabr(unsigned long dabr)
#endif
#elif defined(CONFIG_PPC_BOOK3S)
mtspr(SPRN_DABR, dabr);
+ mtspr(SPRN_DABRX, dabrx);
#endif
-
-
return 0;
}
@@ -480,7 +480,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
*/
#ifndef CONFIG_HAVE_HW_BREAKPOINT
if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
- set_dabr(new->thread.dabr);
+ set_dabr(new->thread.dabr, new->thread.dabrx);
#endif /* CONFIG_HAVE_HW_BREAKPOINT */
#endif
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index c10fc28..79d8e56 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -960,6 +960,7 @@ int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
thread->ptrace_bps[0] = bp;
ptrace_put_breakpoints(task);
thread->dabr = data;
+ thread->dabrx = DABRX_ALL;
return 0;
}
@@ -983,6 +984,7 @@ int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
/* Move contents to the DABR register */
task->thread.dabr = data;
+ task->thread.dabrx = DABRX_ALL;
#else /* CONFIG_PPC_ADV_DEBUG_REGS */
/* As described above, it was assumed 3 bits were passed with the data
* address, but we will assume only the mode bits will be passed
@@ -1397,6 +1399,7 @@ static long ppc_set_hwdebug(struct task_struct *child,
dabr |= DABR_DATA_WRITE;
child->thread.dabr = dabr;
+ child->thread.dabrx = DABRX_ALL;
return 1;
#endif /* !CONFIG_PPC_ADV_DEBUG_DVCS */
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 5c023c9..b936b45 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -130,7 +130,7 @@ static int do_signal(struct pt_regs *regs)
* triggered inside the kernel.
*/
if (current->thread.dabr)
- set_dabr(current->thread.dabr);
+ set_dabr(current->thread.dabr, current->thread.dabrx);
#endif
/* Re-enable the breakpoints for the signal stack */
thread_change_pc(current, regs);
diff --git a/arch/powerpc/platforms/cell/beat.c b/arch/powerpc/platforms/cell/beat.c
index 852592b..affcf56 100644
--- a/arch/powerpc/platforms/cell/beat.c
+++ b/arch/powerpc/platforms/cell/beat.c
@@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
return BEAT_NVRAM_SIZE;
}
-int beat_set_xdabr(unsigned long dabr)
+int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
{
- if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
+ if (beat_set_dabr(dabr, dabrx))
return -1;
return 0;
}
diff --git a/arch/powerpc/platforms/cell/beat.h b/arch/powerpc/platforms/cell/beat.h
index 32c8efc..bfcb8e3 100644
--- a/arch/powerpc/platforms/cell/beat.h
+++ b/arch/powerpc/platforms/cell/beat.h
@@ -32,7 +32,7 @@ void beat_get_rtc_time(struct rtc_time *);
ssize_t beat_nvram_get_size(void);
ssize_t beat_nvram_read(char *, size_t, loff_t *);
ssize_t beat_nvram_write(char *, size_t, loff_t *);
-int beat_set_xdabr(unsigned long);
+int beat_set_xdabr(unsigned long, unsigned long);
void beat_power_save(void);
void beat_kexec_cpu_down(int, int);
diff --git a/arch/powerpc/platforms/ps3/setup.c b/arch/powerpc/platforms/ps3/setup.c
index 2d664c5..3f509f8 100644
--- a/arch/powerpc/platforms/ps3/setup.c
+++ b/arch/powerpc/platforms/ps3/setup.c
@@ -184,11 +184,15 @@ early_param("ps3flash", early_parse_ps3flash);
#define prealloc_ps3flash_bounce_buffer() do { } while (0)
#endif
-static int ps3_set_dabr(unsigned long dabr)
+static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
{
- enum {DABR_USER = 1, DABR_KERNEL = 2,};
+ /* Have to set at least one bit in the DABRX */
+ if (dabrx == 0 && dabr == 0)
+ dabrx = DABRX_USER;
+ /* hypervisor only allows us to set BTI, Kernel and user */
+ dabrx &= DABRX_BTI | DABRX_KERNEL | DABRX_USER;
- return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
+ return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
}
static void __init ps3_setup_arch(void)
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 36b7744..b90deaf 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -416,16 +416,20 @@ static int __init pSeries_init_panel(void)
}
machine_arch_initcall(pseries, pSeries_init_panel);
-static int pseries_set_dabr(unsigned long dabr)
+static int pseries_set_dabr(unsigned long dabr, unsigned long dabrx)
{
return plpar_hcall_norets(H_SET_DABR, dabr);
}
-static int pseries_set_xdabr(unsigned long dabr)
+static int pseries_set_xdabr(unsigned long dabr, unsigned long dabrx)
{
- /* We want to catch accesses from kernel and userspace */
- return plpar_hcall_norets(H_SET_XDABR, dabr,
- H_DABRX_KERNEL | H_DABRX_USER);
+ /* Have to set at least one bit in the DABRX according to PAPR */
+ if (dabrx == 0 && dabr == 0)
+ dabrx = DABRX_USER;
+ /* PAPR says we can only set kernel and user bits */
+ dabrx &= H_DABRX_KERNEL | H_DABRX_USER;
+
+ return plpar_hcall_norets(H_SET_XDABR, dabr, dabrx);
}
#define CMO_CHARACTERISTICS_TOKEN 44
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 9b49c65..987f441 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -740,7 +740,7 @@ static void insert_bpts(void)
static void insert_cpu_bpts(void)
{
if (dabr.enabled)
- set_dabr(dabr.address | (dabr.enabled & 7));
+ set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
if (iabr && cpu_has_feature(CPU_FTR_IABR))
mtspr(SPRN_IABR, iabr->address
| (iabr->enabled & (BP_IABR|BP_IABR_TE)));
@@ -768,7 +768,7 @@ static void remove_bpts(void)
static void remove_cpu_bpts(void)
{
- set_dabr(0);
+ set_dabr(0, 0);
if (cpu_has_feature(CPU_FTR_IABR))
mtspr(SPRN_IABR, 0);
}
--
1.7.9.5
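The value filtering that patch 1/3 adds to the pseries and PS3 hooks follows one pattern, sketched here as a standalone function. The bit values come from the reg.h hunk in the patch; the function name and the "allowed" parameter are hypothetical, introduced only to show both platforms with one helper.

```c
#include <assert.h>

#define DABRX_USER   (1UL << 0)
#define DABRX_KERNEL (1UL << 1)
#define DABRX_HYP    (1UL << 2)
#define DABRX_BTI    (1UL << 3)
#define DABRX_ALL    (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

static_assert((DABRX_KERNEL | DABRX_USER) == 3,
	      "kernel+user is the old hard-coded value");

/*
 * Make a DABRX value acceptable to a hypervisor that only permits the
 * bits in 'allowed'. Clearing a breakpoint (dabr == 0 and dabrx == 0)
 * still needs at least one bit set, so DABRX_USER is substituted.
 */
static unsigned long sanitize_dabrx(unsigned long dabr, unsigned long dabrx,
				    unsigned long allowed)
{
	if (dabrx == 0 && dabr == 0)
		dabrx = DABRX_USER;

	return dabrx & allowed;
}
```

Passing DABRX_KERNEL | DABRX_USER as the allowed set mirrors the pseries hook, and adding DABRX_BTI mirrors the PS3 hook. As in the patch, a non-zero DABR combined with a zero DABRX is passed through unchanged.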
* [PATCH v2 0/3] powerpc: HW filtering of breakpoint for user/kernel/hypervisor events
From: Michael Neuling @ 2012-09-07 7:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, Michael Neuling, linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>
This is in response to Geert's concerns.
Only posting the last two patches again, as the first 3 are already in
benh's next tree.
Added another patch to clean up some #defines that we can now remove.
Michael Neuling (3):
powerpc: Rework set_dabr so it can take a DABRX value as well
powerpc: Dynamically calculate the dabrx based on
kernel/user/hypervisor
powerpc: cleanup old DABRX #defines
arch/powerpc/include/asm/debug.h | 2 +-
arch/powerpc/include/asm/hvcall.h | 5 -----
arch/powerpc/include/asm/hw_breakpoint.h | 3 ++-
arch/powerpc/include/asm/machdep.h | 3 ++-
arch/powerpc/include/asm/processor.h | 1 +
arch/powerpc/include/asm/reg.h | 3 +++
arch/powerpc/kernel/hw_breakpoint.c | 19 +++++++++++++------
arch/powerpc/kernel/process.c | 14 +++++++-------
arch/powerpc/kernel/ptrace.c | 3 +++
arch/powerpc/kernel/signal.c | 2 +-
arch/powerpc/platforms/cell/beat.c | 4 ++--
arch/powerpc/platforms/cell/beat.h | 2 +-
arch/powerpc/platforms/ps3/setup.c | 10 +++++++---
arch/powerpc/platforms/pseries/setup.c | 14 +++++++++-----
arch/powerpc/xmon/xmon.c | 4 ++--
15 files changed, 54 insertions(+), 35 deletions(-)
--
1.7.9.5
* [PATCH] powerpc/powernv: move the dereference below the NULL test
From: Wei Yongjun @ 2012-09-07 6:45 UTC (permalink / raw)
To: benh, paulus, grant.likely, rob.herring
Cc: yongjun_wei, linuxppc-dev, devicetree-discuss, linux-kernel
From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
The dereference should be moved below the NULL test.
spatch with a semantic match was used to find this.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
arch/powerpc/platforms/powernv/pci.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index be3cfc5..4ba89c1 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -287,13 +287,15 @@ static int pnv_pci_read_config(struct pci_bus *bus,
int where, int size, u32 *val)
{
struct pci_controller *hose = pci_bus_to_host(bus);
- struct pnv_phb *phb = hose->private_data;
+ struct pnv_phb *phb;
u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
s64 rc;
if (hose == NULL)
return PCIBIOS_DEVICE_NOT_FOUND;
+ phb = hose->private_data;
+
switch (size) {
case 1: {
u8 v8;
@@ -331,12 +333,14 @@ static int pnv_pci_write_config(struct pci_bus *bus,
int where, int size, u32 val)
{
struct pci_controller *hose = pci_bus_to_host(bus);
- struct pnv_phb *phb = hose->private_data;
+ struct pnv_phb *phb;
u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
if (hose == NULL)
return PCIBIOS_DEVICE_NOT_FOUND;
+ phb = hose->private_data;
+
cfg_dbg("pnv_pci_write_config bus: %x devfn: %x +%x/%x -> %08x\n",
bus->number, devfn, where, size, val);
switch (size) {
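The pattern this patch applies can be shown with a small standalone sketch. The struct names, the return code and the function below are hypothetical stand-ins for struct pci_controller, struct pnv_phb, PCIBIOS_DEVICE_NOT_FOUND and the config accessors; only the ordering of the NULL test and the dereference is the point.

```c
#include <assert.h>
#include <stddef.h>

struct phb {			/* hypothetical stand-in for struct pnv_phb */
	int id;
};

struct controller {		/* hypothetical stand-in for struct pci_controller */
	struct phb *private_data;
};

#define DEVICE_NOT_FOUND (-1)	/* hypothetical error code */

static int read_phb_id(const struct controller *hose, int *id)
{
	const struct phb *phb;	/* declared, but not yet initialised */

	assert(id != NULL);

	if (hose == NULL)
		return DEVICE_NOT_FOUND;

	/* Safe: hose is known to be non-NULL from here on. */
	phb = hose->private_data;

	*id = phb->id;
	return 0;
}
```

Initialising phb in its declaration would dereference hose before the NULL test runs, which is what the semantic patch flags.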
* Re: [PATCH 4/5] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Michael Neuling @ 2012-09-07 5:43 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: linuxppc-dev
In-Reply-To: <alpine.DEB.2.00.1209070716110.22556@bushbaby.sonytel.be>
Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> wrote:
> Hi Mikey,
>
> On Fri, 7 Sep 2012, Michael Neuling wrote:
> > Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Thu, Sep 6, 2012 at 7:17 AM, Michael Neuling <mikey@neuling.org> wrote:
> > > > Rework set_dabr to take a DABRX value as well. We are not actually
> > > > changing any functionality at this stage, just preparing for that.
> > >
> > > You are changing functionality.
> >
> > You are right.. I'll fix that up.. Sorry.
> >
> > > > #define DABRX_USER (1UL << 0)
> > > > #define DABRX_KERNEL (1UL << 1)
> > > > +#define DABRX_HYP (1UL << 2)
> > > > +#define DABRX_BTI (1UL << 3)
> > > > +#define DABRX_ALL (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
> > >
> > > > --- a/arch/powerpc/platforms/cell/beat.c
> > > > +++ b/arch/powerpc/platforms/cell/beat.c
> > > > @@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
> > > > return BEAT_NVRAM_SIZE;
> > > > }
> > > >
> > > > -int beat_set_xdabr(unsigned long dabr)
> > > > +int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
> > > > {
> > > > - if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
> > > > + if (beat_set_dabr(dabr, dabrx))
> > > > return -1;
> > > > return 0;
> > > > }
> > >
> > > > --- a/arch/powerpc/platforms/ps3/setup.c
> > > > +++ b/arch/powerpc/platforms/ps3/setup.c
> > > > @@ -184,11 +184,9 @@ early_param("ps3flash", early_parse_ps3flash);
> > > > #define prealloc_ps3flash_bounce_buffer() do { } while (0)
> > > > #endif
> > > >
> > > > -static int ps3_set_dabr(unsigned long dabr)
> > > > +static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
> > > > {
> > > > - enum {DABR_USER = 1, DABR_KERNEL = 2,};
> > > > -
> > > > - return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
> > > > + return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
> > > > }
> > >
> > > > - set_dabr(dabr.address | (dabr.enabled & 7));
> > > > + set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
> > >
> > > Before, beat_set_dabr() and lv1_set_dabr() would have been called with dabrx = 3
> > > (DABRX_KERNEL | DABRX_USER). Now they're called with dabrx = 15
> > > (DABRX_ALL = DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER).
> > >
> > > No idea what's the impact of this...
> >
> > Do you know if the ps3 hypervisor will allow us to set DABRX_BTI or
> > DABRX_HYP? phyp won't.
>
> According to the documentation, all bits but DABRX_USER, DABRX_KERNEL, and
> DABRX_BTI must be zero. This implies DABRX_HYP cannot be set.
>
> BTW, the requirement that DABRX_USER and DABRX_KERNEL cannot both be zero
> at the same time is also there, cfr. your comment and check in
> pseries_set_xdabr().
>
> Unfortunately, I cannot test it.
OK thanks, I'll mask appropriately.
Any place we can get a copy of the PS3 HV doc you're quoting from?
Mikey
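
A rough sketch of what "mask appropriately" could look like for the PS3
case, assuming the constraints Geert describes (only USER, KERNEL and BTI
may be set, and USER and KERNEL must not both be zero). The names
PS3_DABRX_ALLOWED, ps3_mask_dabrx() and ps3_dabrx_valid() are hypothetical
and not taken from any posted patch.

```c
#include <stdbool.h>

#define DABRX_USER	(1UL << 0)
#define DABRX_KERNEL	(1UL << 1)
#define DABRX_HYP	(1UL << 2)
#define DABRX_BTI	(1UL << 3)
#define DABRX_ALL	(DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

/* Assumption: the bits the PS3 hypervisor accepts, per the thread. */
#define PS3_DABRX_ALLOWED	(DABRX_USER | DABRX_KERNEL | DABRX_BTI)

/* Hypothetical helper: drop the bits the hypervisor would reject. */
static unsigned long ps3_mask_dabrx(unsigned long dabrx)
{
	return dabrx & PS3_DABRX_ALLOWED;
}

/* Hypothetical helper: USER and KERNEL must not both be zero. */
static bool ps3_dabrx_valid(unsigned long dabrx)
{
	return (dabrx & (DABRX_USER | DABRX_KERNEL)) != 0;
}
```

Under these assumptions, masking DABRX_ALL (15) yields 11, that is
DABRX_BTI | DABRX_KERNEL | DABRX_USER, which passes the validity check.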
^ permalink raw reply
* Re: [PATCH -V8 0/11] arch/powerpc: Add 64TB support to ppc64
From: Aneesh Kumar K.V @ 2012-09-07 5:42 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, paulus
In-Reply-To: <1346982235.2385.33.camel@pasglop>
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Thu, 2012-09-06 at 20:59 +0530, Aneesh Kumar K.V wrote:
>> Hi,
>>
>> This patchset includes patches for supporting 64TB with ppc64. I haven't booted
>> this on hardware with 64TB memory yet. But they boot fine on real hardware with
>> less memory. Changes extend VSID bits to 38 bits for a 256MB segment
>> and 26 bits for 1TB segments.
>
> Your series breaks the embedded 64-bit build. You seem to be hard wiring
> dependencies on slice stuff all over 64-bit stuff regardless of the MMU
> type or the value of CONFIG_MM_SLICES.
>
> Also all these:
>
>> +/* 4 bits per slice and we have one slice per 1TB */
>> +#if 0 /* We can't directly include pgtable.h hence this hack */
>> +#define SLICE_ARRAY_SIZE (PGTABLE_RANGE >> 41)
>> +#else
>> +/* Right now we only support 64TB */
>> +#define SLICE_ARRAY_SIZE 32
>> +#endif
>
> Things are just too horrible. Find a different way of doing it, if
> necessary create a new range define somewhere, whatever but don't leave
> that crap as-is, it's too wrong.
>
> Dropping the series for now.
>
How about the change below? If you are OK with moving the range details to
a new header, I can fold this into patch 7 and send a new series.
-aneesh
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 428f23e..057a12a 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -14,6 +14,7 @@
#include <asm/asm-compat.h>
#include <asm/page.h>
+#include <asm/pgtable-ppc64-range.h>
/*
* Segment table
@@ -415,12 +416,7 @@ extern void slb_set_size(u16 size);
add rt,rt,rx
/* 4 bits per slice and we have one slice per 1TB */
-#if 0 /* We can't directly include pgtable.h hence this hack */
#define SLICE_ARRAY_SIZE (PGTABLE_RANGE >> 41)
-#else
-/* Right now we only support 64TB */
-#define SLICE_ARRAY_SIZE 32
-#endif
#ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index b55beb4..01ab518 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -78,16 +78,14 @@ extern u64 ppc64_pft_size;
#define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT)
#define GET_HIGH_SLICE_INDEX(addr) ((addr) >> SLICE_HIGH_SHIFT)
-/* 1 bit per slice and we have one slice per 1TB */
-#if 0 /* We can't directly include pgtable.h hence this hack */
-#define SLICE_MASK_SIZE (PGTABLE_RANGE >> 43)
-#else
-/*
+/* 1 bit per slice and we have one slice per 1TB
* Right now we support only 64TB.
* IF we change this we will have to change the type
* of high_slices
*/
#define SLICE_MASK_SIZE 8
+#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
+#error PGTABLE_RANGE exceeds slice_mask high_slices size
#endif
#ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-range.h b/arch/powerpc/include/asm/pgtable-ppc64-range.h
new file mode 100644
index 0000000..04a825c
--- /dev/null
+++ b/arch/powerpc/include/asm/pgtable-ppc64-range.h
@@ -0,0 +1,16 @@
+#ifndef _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
+#define _ASM_POWERPC_PGTABLE_PPC64_RANGE_H_
+
+#ifdef CONFIG_PPC_64K_PAGES
+#include <asm/pgtable-ppc64-64k.h>
+#else
+#include <asm/pgtable-ppc64-4k.h>
+#endif
+
+/*
+ * Size of EA range mapped by our pagetables.
+ */
+#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
+ PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
+#endif
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index dea953f..ee783b4 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -13,13 +13,7 @@
#define FIRST_USER_ADDRESS 0
-/*
- * Size of EA range mapped by our pagetables.
- */
-#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
- PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
-#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
-
+#include <asm/pgtable-ppc64-range.h>
/* Some sanity checking */
#if TASK_SIZE_USER64 > PGTABLE_RANGE
@@ -32,14 +26,6 @@
#endif
#endif
-#if (PGTABLE_RANGE >> 41) > SLICE_ARRAY_SIZE
-#error PGTABLE_RANGE exceeds SLICE_ARRAY_SIZE
-#endif
-
-#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
-#error PGTABLE_RANGE exceeds slice_mask high_slices size
-#endif
-
/*
* Define the address range of the kernel non-linear virtual area
*/
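
As a sanity check on the shifts above, a small sketch of the arithmetic,
assuming a 64TB range (PGTABLE_EADDR_SIZE == 46) and reading both sizes as
byte counts. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * One slice per 1TB (2^40 bytes), 4 bits per slice:
 * (range >> 40) slices * 4 bits / 8 bits per byte == range >> 41.
 */
static uint64_t example_slice_array_size(unsigned int eaddr_bits)
{
	return (UINT64_C(1) << eaddr_bits) >> 41;
}

/*
 * One slice per 1TB, 1 bit per slice:
 * (range >> 40) slices / 8 bits per byte == range >> 43.
 */
static uint64_t example_slice_mask_size(unsigned int eaddr_bits)
{
	return (UINT64_C(1) << eaddr_bits) >> 43;
}
```

For 46 address bits this gives 32 and 8, matching the constants that were
previously hard coded.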
^ permalink raw reply related
* Re: [PATCH 4/5] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Geert Uytterhoeven @ 2012-09-07 5:26 UTC (permalink / raw)
To: Michael Neuling; +Cc: linuxppc-dev
In-Reply-To: <15813.1346989072@neuling.org>
Hi Mikey,
On Fri, 7 Sep 2012, Michael Neuling wrote:
> Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Thu, Sep 6, 2012 at 7:17 AM, Michael Neuling <mikey@neuling.org> wrote:
> > > Rework set_dabr to take a DABRX value as well. We are not actually
> > > changing any functionality at this stage, just preparing for that.
> >
> > You are changing functionality.
>
> You are right.. I'll fix that up.. Sorry.
>
> > > #define DABRX_USER (1UL << 0)
> > > #define DABRX_KERNEL (1UL << 1)
> > > +#define DABRX_HYP (1UL << 2)
> > > +#define DABRX_BTI (1UL << 3)
> > > +#define DABRX_ALL (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
> >
> > > --- a/arch/powerpc/platforms/cell/beat.c
> > > +++ b/arch/powerpc/platforms/cell/beat.c
> > > @@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
> > > return BEAT_NVRAM_SIZE;
> > > }
> > >
> > > -int beat_set_xdabr(unsigned long dabr)
> > > +int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
> > > {
> > > - if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
> > > + if (beat_set_dabr(dabr, dabrx))
> > > return -1;
> > > return 0;
> > > }
> >
> > > --- a/arch/powerpc/platforms/ps3/setup.c
> > > +++ b/arch/powerpc/platforms/ps3/setup.c
> > > @@ -184,11 +184,9 @@ early_param("ps3flash", early_parse_ps3flash);
> > > #define prealloc_ps3flash_bounce_buffer() do { } while (0)
> > > #endif
> > >
> > > -static int ps3_set_dabr(unsigned long dabr)
> > > +static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
> > > {
> > > - enum {DABR_USER = 1, DABR_KERNEL = 2,};
> > > -
> > > - return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
> > > + return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
> > > }
> >
> > > - set_dabr(dabr.address | (dabr.enabled & 7));
> > > + set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
> >
> > Before, beat_set_dabr() and lv1_set_dabr() would have been called with dabrx = 3
> > (DABRX_KERNEL | DABRX_USER). Now they're called with dabrx = 15
> > (DABRX_ALL = DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER).
> >
> > No idea what's the impact of this...
>
> Do you know if the ps3 hypervisor will allow us to set DABRX_BTI or
> DABRX_HYP? phyp won't.
According to the documentation, all bits but DABRX_USER, DABRX_KERNEL, and
DABRX_BTI must be zero. This implies DABRX_HYP cannot be set.
BTW, the requirement that DABRX_USER and DABRX_KERNEL cannot both be zero
at the same time is also there, cfr. your comment and check in
pseries_set_xdabr().
Unfortunately, I cannot test it.
With kind regards,
Geert Uytterhoeven
Software Architect
Technology and Software Centre Europe
Sony Belgium, bijkantoor van Sony Europe Limited.
Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
Phone: +32 (0)2 700 8453
Fax: +32 (0)2 700 8622
E-mail: Geert.Uytterhoeven@sonycom.com
Sony Europe Limited. A company registered in England and Wales.
Registered office: The Heights, Brooklands, Weybridge, Surrey. KT13 0XW.
United Kingdom
^ permalink raw reply
* Re: [PATCH 2/2] powerpc/e6500: TLB miss handler with hardware tablewalk support
From: Benjamin Herrenschmidt @ 2012-09-07 4:41 UTC (permalink / raw)
To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <20120614234101.GB17147@tyr.buserror.net>
On Thu, 2012-06-14 at 18:41 -0500, Scott Wood wrote:
> There are a few things that make the existing hw tablewalk handlers
> unsuitable for e6500:
>
> - Indirect entries go in TLB1 (though the resulting direct entries go in
> TLB0).
>
> - It has threads, but no "tlbsrx." -- so we need a spinlock and
> a normal "tlbsx". Because we need this lock, hardware tablewalk
> is mandatory on e6500 unless we want to add spinlock+tlbsx to
> the normal bolted TLB miss handler.
>
> - TLB1 has no HES (nor next-victim hint) so we need software round robin
> (TODO: integrate this round robin data with hugetlb/KVM)
>
> - The existing tablewalk handlers map half of a page table at a time,
> because IBM hardware has a fixed 1MiB indirect page size. e6500
> has variable size indirect entries, with a minimum of 2MiB.
> So we can't do the half-page indirect mapping, and even if we
> could it would be less efficient than mapping the full page.
>
> - Like on e5500, the linear mapping is bolted, so we don't need the
> overhead of supporting nested tlb misses.
>
> Note that hardware tablewalk does not work in rev1 of e6500.
> We do not expect to support e6500 rev1 in mainline Linux.
>
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> arch/powerpc/include/asm/mmu-book3e.h | 13 +++
> arch/powerpc/include/asm/mmu.h | 21 ++--
> arch/powerpc/include/asm/paca.h | 6 +
> arch/powerpc/kernel/asm-offsets.c | 10 ++
> arch/powerpc/kernel/paca.c | 5 +
> arch/powerpc/kernel/setup_64.c | 33 +++++++
> arch/powerpc/mm/fsl_booke_mmu.c | 8 ++
> arch/powerpc/mm/tlb_low_64e.S | 167 +++++++++++++++++++++++++++++++++
> arch/powerpc/mm/tlb_nohash.c | 109 ++++++++++++++++------
> 9 files changed, 335 insertions(+), 37 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
> index eeabcdb..3072aa0 100644
> --- a/arch/powerpc/include/asm/mmu-book3e.h
> +++ b/arch/powerpc/include/asm/mmu-book3e.h
> @@ -264,8 +264,21 @@ extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
> extern int mmu_linear_psize;
> extern int mmu_vmemmap_psize;
>
> +struct book3e_tlb_per_core {
> + /* For software way selection, as on Freescale TLB1 */
> + u8 esel_next, esel_max, esel_first;
> +
> + /* Per-core spinlock for e6500 TLB handlers (no tlbsrx.) */
> + u8 lock;
> +};
I'm no fan of the name ... tlb_core_data ? Probably don't even need the
book3e prefix really.
> #ifdef CONFIG_PPC64
> extern unsigned long linear_map_top;
> +extern int book3e_htw_mode;
> +
> +#define PPC_HTW_NONE 0
> +#define PPC_HTW_IBM 1
> +#define PPC_HTW_E6500 2
Sad :-( Wonder why we bother with an architecture, really ...
> /*
> * 64-bit booke platforms don't load the tlb in the tlb miss handler code.
> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
> index a9e9ec6..63d97eb 100644
> --- a/arch/powerpc/include/asm/mmu.h
> +++ b/arch/powerpc/include/asm/mmu.h
> @@ -170,16 +170,17 @@ extern u64 ppc64_rma_size;
> #define MMU_PAGE_64K_AP 3 /* "Admixed pages" (hash64 only) */
> #define MMU_PAGE_256K 4
> #define MMU_PAGE_1M 5
> -#define MMU_PAGE_4M 6
> -#define MMU_PAGE_8M 7
> -#define MMU_PAGE_16M 8
> -#define MMU_PAGE_64M 9
> -#define MMU_PAGE_256M 10
> -#define MMU_PAGE_1G 11
> -#define MMU_PAGE_16G 12
> -#define MMU_PAGE_64G 13
> -
> -#define MMU_PAGE_COUNT 14
> +#define MMU_PAGE_2M 6
> +#define MMU_PAGE_4M 7
> +#define MMU_PAGE_8M 8
> +#define MMU_PAGE_16M 9
> +#define MMU_PAGE_64M 10
> +#define MMU_PAGE_256M 11
> +#define MMU_PAGE_1G 12
> +#define MMU_PAGE_16G 13
> +#define MMU_PAGE_64G 14
> +
> +#define MMU_PAGE_COUNT 15
Let's pray that won't hit a funny bug on server :-)
> #if defined(CONFIG_PPC_STD_MMU_64)
> /* 64-bit classic hash table MMU */
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index daf813f..4e18bb5 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -108,6 +108,12 @@ struct paca_struct {
> /* Keep pgd in the same cacheline as the start of extlb */
> pgd_t *pgd __attribute__((aligned(0x80))); /* Current PGD */
> pgd_t *kernel_pgd; /* Kernel PGD */
> +
> + struct book3e_tlb_per_core tlb_per_core;
> +
> + /* Points to the tlb_per_core of the first thread on this core. */
> + struct book3e_tlb_per_core *tlb_per_core_ptr;
> +
That's gross. Can't you allocate them elsewhere and then populate the
PACA pointers ?
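
Something along these lines, as a rough user-space sketch only: the
per-core data lives in its own array and each thread's PACA just carries a
pointer to its core's entry. All names and sizes here are hypothetical, a
static array stands in for whatever allocator would really be used, and
sibling threads are assumed to have consecutive CPU numbers.

```c
#include <stdint.h>

#define EXAMPLE_NR_CPUS			8	/* hypothetical */
#define EXAMPLE_THREADS_PER_CORE	2	/* hypothetical */

struct tlb_core_data {
	/* For software way selection, as on Freescale TLB1 */
	uint8_t esel_next, esel_max, esel_first;
	/* Per-core lock byte */
	uint8_t lock;
};

/* Stand-in for the PACA: only the pointer lives here. */
struct example_paca {
	struct tlb_core_data *tcd_ptr;
};

/* Per-core data allocated outside the PACA. */
static struct tlb_core_data
	example_core_data[EXAMPLE_NR_CPUS / EXAMPLE_THREADS_PER_CORE];
static struct example_paca example_pacas[EXAMPLE_NR_CPUS];

/* Point every thread at the entry belonging to its core. */
static void example_setup_tlb_core_data(void)
{
	for (int cpu = 0; cpu < EXAMPLE_NR_CPUS; cpu++) {
		int core = cpu / EXAMPLE_THREADS_PER_CORE;

		example_pacas[cpu].tcd_ptr = &example_core_data[core];
	}
}
```

That way no PACA carries an embedded copy that only the first thread of a
core ever uses.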
> /* We can have up to 3 levels of reentrancy in the TLB miss handler */
> u64 extlb[3][EX_TLB_SIZE / sizeof(u64)];
> u64 exmc[8]; /* used for machine checks */
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index 52c7ad7..61f4634 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -168,6 +168,16 @@ int main(void)
> DEFINE(PACA_MC_STACK, offsetof(struct paca_struct, mc_kstack));
> DEFINE(PACA_CRIT_STACK, offsetof(struct paca_struct, crit_kstack));
> DEFINE(PACA_DBG_STACK, offsetof(struct paca_struct, dbg_kstack));
> + DEFINE(PACA_TLB_PER_CORE_PTR,
> + offsetof(struct paca_struct, tlb_per_core_ptr));
> +
> + DEFINE(PERCORE_TLB_ESEL_NEXT,
> + offsetof(struct book3e_tlb_per_core, esel_next));
> + DEFINE(PERCORE_TLB_ESEL_MAX,
> + offsetof(struct book3e_tlb_per_core, esel_max));
> + DEFINE(PERCORE_TLB_ESEL_FIRST,
> + offsetof(struct book3e_tlb_per_core, esel_first));
> + DEFINE(PERCORE_TLB_LOCK, offsetof(struct book3e_tlb_per_core, lock));
> #endif /* CONFIG_PPC_BOOK3E */
>
> #ifdef CONFIG_PPC_STD_MMU_64
> diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
> index fbe1a12..65abfc0 100644
> --- a/arch/powerpc/kernel/paca.c
> +++ b/arch/powerpc/kernel/paca.c
> @@ -145,6 +145,11 @@ void __init initialise_paca(struct paca_struct *new_paca, int cpu)
> #ifdef CONFIG_PPC_STD_MMU_64
> new_paca->slb_shadow_ptr = &slb_shadow[cpu];
> #endif /* CONFIG_PPC_STD_MMU_64 */
> +
> +#ifdef CONFIG_PPC_BOOK3E
> + /* For now -- if we have threads this will be adjusted later */
> + new_paca->tlb_per_core_ptr = &new_paca->tlb_per_core;
> +#endif
> }
>
> /* Put the paca pointer into r13 and SPRG_PACA */
> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 389bd4f..271b85d 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -102,6 +102,37 @@ int ucache_bsize;
>
> static char *smt_enabled_cmdline;
>
> +#ifdef CONFIG_PPC_BOOK3E
> +static void setup_tlb_per_core(void)
> +{
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + int first = cpu_first_thread_sibling(cpu);
> +
> + paca[cpu].tlb_per_core_ptr = &paca[first].tlb_per_core;
> +
> + /*
> + * If we have threads, we need either tlbsrx.
> + * or e6500 tablewalk mode, or else TLB handlers
> + * will be racy and could produce duplicate entries.
> + */
> + if (smt_enabled_at_boot >= 2 &&
> + !mmu_has_feature(MMU_FTR_USE_TLBRSRV) &&
> + book3e_htw_mode != PPC_HTW_E6500) {
> + /* Should we panic instead? */
> + WARN_ONCE(1, "%s: unsupported MMU configuration -- expect problems\n",
> + __func__);
> + }
> + }
> +}
> +#else
> +static void setup_tlb_per_core(void)
> +{
> +}
> +#endif
> +
> +
> /* Look for ibm,smt-enabled OF option */
> static void check_smt_enabled(void)
> {
> @@ -142,6 +173,8 @@ static void check_smt_enabled(void)
> of_node_put(dn);
> }
> }
> +
> + setup_tlb_per_core();
> }
I'd rather you move that to the caller
> /* Look for smt-enabled= cmdline option */
> diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
> index 07ba45b..bf06d36b 100644
> --- a/arch/powerpc/mm/fsl_booke_mmu.c
> +++ b/arch/powerpc/mm/fsl_booke_mmu.c
> @@ -52,6 +52,7 @@
> #include <asm/smp.h>
> #include <asm/machdep.h>
> #include <asm/setup.h>
> +#include <asm/paca.h>
>
> #include "mmu_decl.h"
>
> @@ -192,6 +193,13 @@ unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx)
> }
> tlbcam_index = i;
>
> +#ifdef CONFIG_PPC64
> + get_paca()->tlb_per_core.esel_next = i;
> + get_paca()->tlb_per_core.esel_max =
> + mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY;
> + get_paca()->tlb_per_core.esel_first = i;
> +#endif
> +
> return amount_mapped;
> }
>
> diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
> index efe0f33..8e82772 100644
> --- a/arch/powerpc/mm/tlb_low_64e.S
> +++ b/arch/powerpc/mm/tlb_low_64e.S
> @@ -232,6 +232,173 @@ itlb_miss_fault_bolted:
> beq tlb_miss_common_bolted
> b itlb_miss_kernel_bolted
>
> +/*
> + * TLB miss handling for e6500 and derivatives, using hardware tablewalk.
> + *
> + * Linear mapping is bolted: no virtual page table or nested TLB misses
> + * Indirect entries in TLB1, hardware loads resulting direct entries
> + * into TLB0
> + * No HES or NV hint on TLB1, so we need to do software round-robin
> + * No tlbsrx. so we need a spinlock, and we have to deal
> + * with MAS-damage caused by tlbsx
Ouch ... so for every indirect entry you have to take a lock, backup the
MAS, do a tlbsx, restore the MAS, insert the entry and drop the lock ?
After all that, do you have some bullets left for the HW designers ?
Remind me to also shoot myself for allowing tlbsrx. and HES to be
optional in MAV2 :-(
> + * 4K pages only
> + */
> +
> + START_EXCEPTION(instruction_tlb_miss_e6500)
> + tlb_prolog_bolted SPRN_SRR0
> +
> + ld r11,PACA_TLB_PER_CORE_PTR(r13)
> + srdi. r15,r16,60 /* get region */
> + ori r16,r16,1
> +
> + TLB_MISS_STATS_SAVE_INFO_BOLTED
> + bne tlb_miss_kernel_e6500 /* user/kernel test */
> +
> + b tlb_miss_common_e6500
> +
> + START_EXCEPTION(data_tlb_miss_e6500)
> + tlb_prolog_bolted SPRN_DEAR
> +
> + ld r11,PACA_TLB_PER_CORE_PTR(r13)
> + srdi. r15,r16,60 /* get region */
> + rldicr r16,r16,0,62
> +
> + TLB_MISS_STATS_SAVE_INFO_BOLTED
> + bne tlb_miss_kernel_e6500 /* user vs kernel check */
> +
> +/*
> + * This is the guts of the TLB miss handler for e6500 and derivatives.
> + * We are entered with:
> + *
> + * r16 = page of faulting address (low bit 0 if data, 1 if instruction)
> + * r15 = crap (free to use)
> + * r14 = page table base
> + * r13 = PACA
> + * r11 = tlb_per_core ptr
> + * r10 = crap (free to use)
> + */
> +tlb_miss_common_e6500:
> + /*
> + * Search if we already have an indirect entry for that virtual
> + * address, and if we do, bail out.
> + *
> + * MAS6:IND should be already set based on MAS4
> + */
> + addi r10,r11,PERCORE_TLB_LOCK
> +1: lbarx r15,0,r10
> + cmpdi r15,0
> + bne 2f
> + li r15,1
> + stbcx. r15,0,r10
No need for barriers here ?
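
For comparison, this is how the ordering question looks in a portable
user-space byte lock written with C11 atomics. It is only a sketch with
hypothetical names, not the e6500 code, and it does not settle whether the
asm needs an explicit barrier after the stbcx.

```c
#include <stdatomic.h>

struct byte_lock {
	atomic_uchar locked;	/* 0 = free, 1 = held */
};

static void byte_lock_acquire(struct byte_lock *l)
{
	for (;;) {
		unsigned char expected = 0;

		/*
		 * Acquire ordering on success keeps accesses inside the
		 * critical section from moving above the lock.
		 */
		if (atomic_compare_exchange_weak_explicit(&l->locked,
				&expected, 1,
				memory_order_acquire,
				memory_order_relaxed))
			return;

		/* Spin with plain loads until the lock looks free. */
		while (atomic_load_explicit(&l->locked,
					    memory_order_relaxed) != 0)
			;
	}
}

static void byte_lock_release(struct byte_lock *l)
{
	/* Release ordering publishes the critical section's stores. */
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The acquire on the successful exchange is the piece that corresponds to
the barrier being asked about.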
> + bne 1b
> + .subsection 1
> +2: lbz r15,0(r10)
> + cmpdi r15,0
> + bne 2b
> + b 1b
> + .previous
> +
> + mfspr r15,SPRN_MAS2
> +
> + tlbsx 0,r16
> + mfspr r10,SPRN_MAS1
> + andis. r10,r10,MAS1_VALID@h
> + bne tlb_miss_done_e6500
> +
> + /* Undo MAS-damage from the tlbsx */
> + mfspr r10,SPRN_MAS1
> + oris r10,r10,MAS1_VALID@h
> + mtspr SPRN_MAS1,r10
> + mtspr SPRN_MAS2,r15
> +
> + /* Now, we need to walk the page tables. First check if we are in
> + * range.
> + */
> + rldicl. r10,r16,64-PGTABLE_EADDR_SIZE,PGTABLE_EADDR_SIZE+4
> + bne- tlb_miss_fault_e6500
> +
> + rldicl r15,r16,64-PGDIR_SHIFT+3,64-PGD_INDEX_SIZE-3
> + cmpldi cr0,r14,0
> + clrrdi r15,r15,3
> + beq- tlb_miss_fault_e6500 /* No PGDIR, bail */
> + ldx r14,r14,r15 /* grab pgd entry */
> +
> + rldicl r15,r16,64-PUD_SHIFT+3,64-PUD_INDEX_SIZE-3
> + clrrdi r15,r15,3
> + cmpdi cr0,r14,0
> + bge tlb_miss_fault_e6500 /* Bad pgd entry or hugepage; bail */
> + ldx r14,r14,r15 /* grab pud entry */
> +
> + rldicl r15,r16,64-PMD_SHIFT+3,64-PMD_INDEX_SIZE-3
> + clrrdi r15,r15,3
> + cmpdi cr0,r14,0
> + bge tlb_miss_fault_e6500
> + ldx r14,r14,r15 /* Grab pmd entry */
> +
> + mfspr r10,SPRN_MAS0
> + cmpdi cr0,r14,0
> + bge tlb_miss_fault_e6500
> +
> + /* Now we build the MAS for a 2M indirect page:
> + *
> + * MAS 0 : ESEL needs to be filled by software round-robin
> + * MAS 1 : Almost fully setup
> + * - PID already updated by caller if necessary
> + * - TSIZE for now is base ind page size always
> + * MAS 2 : Use defaults
> + * MAS 3+7 : Needs to be done
> + */
> +
> + ori r14,r14,(BOOK3E_PAGESZ_4K << MAS3_SPSIZE_SHIFT)
> + mtspr SPRN_MAS7_MAS3,r14
> +
> + lbz r15,PERCORE_TLB_ESEL_NEXT(r11)
> + lbz r16,PERCORE_TLB_ESEL_MAX(r11)
> + lbz r14,PERCORE_TLB_ESEL_FIRST(r11)
> + rlwimi r10,r15,16,0x00ff0000 /* insert esel_next into MAS0 */
> + addi r15,r15,1 /* increment esel_next */
> + mtspr SPRN_MAS0,r10
> + cmpw r15,r16
> + iseleq r15,r14,r15 /* if next == last use first */
> + stb r15,PERCORE_TLB_ESEL_NEXT(r11)
> +
> + tlbwe
> +
> +tlb_miss_done_e6500:
> + .macro tlb_unlock_e6500
> + li r15,0
> + isync
> + stb r15,PERCORE_TLB_LOCK(r11)
> + .endm
> +
> + tlb_unlock_e6500
> + TLB_MISS_STATS_X(MMSTAT_TLB_MISS_NORM_OK)
> + tlb_epilog_bolted
> + rfi
> +
> +tlb_miss_kernel_e6500:
> + mfspr r10,SPRN_MAS1
> + ld r14,PACA_KERNELPGD(r13)
> + cmpldi cr0,r15,8 /* Check for vmalloc region */
> + rlwinm r10,r10,0,16,1 /* Clear TID */
> + mtspr SPRN_MAS1,r10
> + beq+ tlb_miss_common_e6500
> +
> +tlb_miss_fault_e6500:
> + tlb_unlock_e6500
> + /* We need to check if it was an instruction miss */
> + andi. r16,r16,1
> + bne itlb_miss_fault_e6500
> +dtlb_miss_fault_e6500:
> + TLB_MISS_STATS_D(MMSTAT_TLB_MISS_NORM_FAULT)
> + tlb_epilog_bolted
> + b exc_data_storage_book3e
> +itlb_miss_fault_e6500:
> + TLB_MISS_STATS_I(MMSTAT_TLB_MISS_NORM_FAULT)
> + tlb_epilog_bolted
> + b exc_instruction_storage_book3e
> +
> +
> /**********************************************************************
> * *
> * TLB miss handling for Book3E with TLB reservation and HES support *
> diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
> index df32a83..2f09ddf 100644
> --- a/arch/powerpc/mm/tlb_nohash.c
> +++ b/arch/powerpc/mm/tlb_nohash.c
> @@ -43,6 +43,7 @@
> #include <asm/tlb.h>
> #include <asm/code-patching.h>
> #include <asm/hugetlb.h>
> +#include <asm/paca.h>
>
> #include "mmu_decl.h"
>
> @@ -58,6 +59,10 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
> .shift = 12,
> .enc = BOOK3E_PAGESZ_4K,
> },
> + [MMU_PAGE_2M] = {
> + .shift = 21,
> + .enc = BOOK3E_PAGESZ_2M,
> + },
> [MMU_PAGE_4M] = {
> .shift = 22,
> .enc = BOOK3E_PAGESZ_4M,
> @@ -136,7 +141,7 @@ static inline int mmu_get_tsize(int psize)
> int mmu_linear_psize; /* Page size used for the linear mapping */
> int mmu_pte_psize; /* Page size used for PTE pages */
> int mmu_vmemmap_psize; /* Page size used for the virtual mem map */
> -int book3e_htw_enabled; /* Is HW tablewalk enabled ? */
> +int book3e_htw_mode; /* HW tablewalk? Value is PPC_HTW_* */
> unsigned long linear_map_top; /* Top of linear mapping */
>
> #endif /* CONFIG_PPC64 */
> @@ -377,7 +382,7 @@ void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address)
> {
> int tsize = mmu_psize_defs[mmu_pte_psize].enc;
>
> - if (book3e_htw_enabled) {
> + if (book3e_htw_mode) {
Make it if (book3e_htw_mode != PPC_HTW_NONE)
> unsigned long start = address & PMD_MASK;
> unsigned long end = address + PMD_SIZE;
> unsigned long size = 1UL << mmu_psize_defs[mmu_pte_psize].shift;
> @@ -413,10 +418,10 @@ static void setup_page_sizes(void)
> int i, psize;
>
> #ifdef CONFIG_PPC_FSL_BOOK3E
> + int fsl_mmu = mmu_has_feature(MMU_FTR_TYPE_FSL_E);
> unsigned int mmucfg = mfspr(SPRN_MMUCFG);
>
> - if (((mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) &&
> - (mmu_has_feature(MMU_FTR_TYPE_FSL_E))) {
> + if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) {
> unsigned int tlb1cfg = mfspr(SPRN_TLB1CFG);
> unsigned int min_pg, max_pg;
>
> @@ -430,7 +435,7 @@ static void setup_page_sizes(void)
> def = &mmu_psize_defs[psize];
> shift = def->shift;
>
> - if (shift == 0)
> + if (shift == 0 || shift & 1)
> continue;
>
> /* adjust to be in terms of 4^shift Kb */
> @@ -440,7 +445,40 @@ static void setup_page_sizes(void)
> def->flags |= MMU_PAGE_SIZE_DIRECT;
> }
>
> - goto no_indirect;
> + goto out;
> + }
> +
> + if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V2) {
> + u32 tlb1cfg, tlb1ps;
> +
> + tlb0cfg = mfspr(SPRN_TLB0CFG);
> + tlb1cfg = mfspr(SPRN_TLB1CFG);
> + tlb1ps = mfspr(SPRN_TLB1PS);
> + eptcfg = mfspr(SPRN_EPTCFG);
> +
> + if ((tlb1cfg & TLBnCFG_IND) && (tlb0cfg & TLBnCFG_PT))
> + book3e_htw_mode = PPC_HTW_E6500;
> +
> + /*
> + * We expect 4K subpage size and unrestricted indirect size.
> + * The lack of a restriction on indirect size is a Freescale
> + * extension, indicated by PSn = 0 but SPSn != 0.
> + */
> + if (eptcfg != 2)
> + book3e_htw_mode = PPC_HTW_NONE;
> +
> + for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> + struct mmu_psize_def *def = &mmu_psize_defs[psize];
> +
> + if (tlb1ps & (1U << (def->shift - 10))) {
> + def->flags |= MMU_PAGE_SIZE_DIRECT;
> +
> + if (book3e_htw_mode && psize == MMU_PAGE_2M)
> + def->flags |= MMU_PAGE_SIZE_INDIRECT;
> + }
> + }
> +
> + goto out;
> }
> #endif
>
> @@ -457,8 +495,11 @@ static void setup_page_sizes(void)
> }
>
> /* Indirect page sizes supported ? */
> - if ((tlb0cfg & TLBnCFG_IND) == 0)
> - goto no_indirect;
> + if ((tlb0cfg & TLBnCFG_IND) == 0 ||
> + (tlb0cfg & TLBnCFG_PT) == 0)
> + goto out;
> +
> + book3e_htw_mode = PPC_HTW_IBM;
>
> /* Now, we only deal with one IND page size for each
> * direct size. Hopefully all implementations today are
> @@ -483,8 +524,8 @@ static void setup_page_sizes(void)
> def->ind = ps + 10;
> }
> }
> - no_indirect:
>
> +out:
> /* Cleanup array and print summary */
> pr_info("MMU: Supported page sizes\n");
> for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
> @@ -525,23 +566,23 @@ static void __patch_exception(int exc, unsigned long addr)
>
> static void setup_mmu_htw(void)
> {
> - /* Check if HW tablewalk is present, and if yes, enable it by:
> - *
> - * - patching the TLB miss handlers to branch to the
> - * one dedicates to it
> - *
> - * - setting the global book3e_htw_enabled
> - */
> - unsigned int tlb0cfg = mfspr(SPRN_TLB0CFG);
> + /*
> + * If we want to use HW tablewalk, enable it by patching the TLB miss
> + * handlers to branch to the one dedicated to it.
> + */
>
> - if ((tlb0cfg & TLBnCFG_IND) &&
> - (tlb0cfg & TLBnCFG_PT)) {
> + switch (book3e_htw_mode) {
> + case PPC_HTW_IBM:
> patch_exception(0x1c0, exc_data_tlb_miss_htw_book3e);
> patch_exception(0x1e0, exc_instruction_tlb_miss_htw_book3e);
> - book3e_htw_enabled = 1;
> + break;
> + case PPC_HTW_E6500:
> + patch_exception(0x1c0, exc_data_tlb_miss_e6500_book3e);
> + patch_exception(0x1e0, exc_instruction_tlb_miss_e6500_book3e);
> + break;
> }
> pr_info("MMU: Book3E HW tablewalk %s\n",
> - book3e_htw_enabled ? "enabled" : "not supported");
> + book3e_htw_mode ? "enabled" : "not supported");
> }
>
> /*
> @@ -581,8 +622,16 @@ static void __early_init_mmu(int boot_cpu)
> /* Set MAS4 based on page table setting */
>
> mas4 = 0x4 << MAS4_WIMGED_SHIFT;
> - if (book3e_htw_enabled) {
> - mas4 |= mas4 | MAS4_INDD;
> + switch (book3e_htw_mode) {
> + case PPC_HTW_E6500:
> + mas4 |= MAS4_INDD;
> + mas4 |= BOOK3E_PAGESZ_2M << MAS4_TSIZED_SHIFT;
> + mas4 |= MAS4_TLBSELD(1);
> + mmu_pte_psize = MMU_PAGE_2M;
> + break;
> +
> + case PPC_HTW_IBM:
> + mas4 |= MAS4_INDD;
> #ifdef CONFIG_PPC_64K_PAGES
> mas4 |= BOOK3E_PAGESZ_256M << MAS4_TSIZED_SHIFT;
> mmu_pte_psize = MMU_PAGE_256M;
> @@ -590,13 +639,16 @@ static void __early_init_mmu(int boot_cpu)
> mas4 |= BOOK3E_PAGESZ_1M << MAS4_TSIZED_SHIFT;
> mmu_pte_psize = MMU_PAGE_1M;
> #endif
> - } else {
> + break;
> +
> + case PPC_HTW_NONE:
> #ifdef CONFIG_PPC_64K_PAGES
> mas4 |= BOOK3E_PAGESZ_64K << MAS4_TSIZED_SHIFT;
> #else
> mas4 |= BOOK3E_PAGESZ_4K << MAS4_TSIZED_SHIFT;
> #endif
> mmu_pte_psize = mmu_virtual_psize;
> + break;
> }
> mtspr(SPRN_MAS4, mas4);
>
> @@ -616,8 +668,11 @@ static void __early_init_mmu(int boot_cpu)
> /* limit memory so we dont have linear faults */
> memblock_enforce_memory_limit(linear_map_top);
>
> - patch_exception(0x1c0, exc_data_tlb_miss_bolted_book3e);
> - patch_exception(0x1e0, exc_instruction_tlb_miss_bolted_book3e);
> + if (book3e_htw_mode == PPC_HTW_NONE) {
> + patch_exception(0x1c0, exc_data_tlb_miss_bolted_book3e);
> + patch_exception(0x1e0,
> + exc_instruction_tlb_miss_bolted_book3e);
> + }
> }
> #endif
>
Ben.
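
For reference, the software round-robin ESEL selection in the quoted
e6500 handler (esel_next, esel_max, esel_first) reads roughly like this in
C. This is a sketch of my reading of the asm; the struct and function
names are hypothetical.

```c
#include <stdint.h>

struct example_esel_state {
	uint8_t esel_next, esel_max, esel_first;
};

/*
 * Return the TLB1 entry to use for this miss and advance esel_next,
 * wrapping back to esel_first once the incremented value reaches
 * esel_max (the cmpw/iseleq pair in the handler).
 */
static uint8_t example_esel_pick(struct example_esel_state *s)
{
	uint8_t victim = s->esel_next;
	unsigned int next = victim + 1u;

	if (next == s->esel_max)
		next = s->esel_first;

	s->esel_next = (uint8_t)next;
	return victim;
}
```

With esel_first = 3 and esel_max = 5, successive calls hand out entries
3, 4, 3, 4 and so on, so entries below esel_first are never chosen as
victims.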
^ permalink raw reply
* Re: [PATCH] powerpc/booke-64: fix tlbsrx. path in bolted tlb handler
From: Benjamin Herrenschmidt @ 2012-09-07 4:23 UTC (permalink / raw)
To: scott; +Cc: linuxppc-dev
In-Reply-To: <20120612220232.GA17228@tyr.buserror.net>
On Tue, 2012-06-12 at 17:02 -0500, Scott Wood wrote:
> It was branching to the cleanup part of the non-bolted handler,
> which would have been bad if there were any chips with tlbsrx.
> that use the bolted handler.
Still relevant ? It doesn't apply anymore :-)
Cheers,
Ben.
> Signed-off-by: Scott Wood <scott@tyr.buserror.net>
> ---
> arch/powerpc/mm/tlb_low_64e.S | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
> index ff672bd..efe0f33 100644
> --- a/arch/powerpc/mm/tlb_low_64e.S
> +++ b/arch/powerpc/mm/tlb_low_64e.S
> @@ -128,7 +128,7 @@ BEGIN_MMU_FTR_SECTION
> */
> PPC_TLBSRX_DOT(0,r16)
> ldx r14,r14,r15 /* grab pgd entry */
> - beq normal_tlb_miss_done /* tlb exists already, bail */
> + beq tlb_miss_done_bolted /* tlb exists already, bail */
> MMU_FTR_SECTION_ELSE
> ldx r14,r14,r15 /* grab pgd entry */
> ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
> @@ -184,6 +184,7 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
> mtspr SPRN_MAS7_MAS3,r15
> tlbwe
>
> +tlb_miss_done_bolted:
> TLB_MISS_STATS_X(MMSTAT_TLB_MISS_NORM_OK)
> tlb_epilog_bolted
> rfi
^ permalink raw reply
* Re: [PATCH 4/5] powerpc: Rework set_dabr so it can take a DABRX value as well
From: Michael Neuling @ 2012-09-07 3:37 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: linuxppc-dev
In-Reply-To: <CAMuHMdU=UjkU3BRV7Aouu-m=vKaSPWH93Rkh6L3RWev4oM5jYA@mail.gmail.com>
Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Thu, Sep 6, 2012 at 7:17 AM, Michael Neuling <mikey@neuling.org> wrote:
> > Rework set_dabr to take a DABRX value as well. We are not actually
> > changing any functionality at this stage, just preparing for that.
>
> You are changing functionality.
You are right.. I'll fix that up.. Sorry.
>
> > #define DABRX_USER (1UL << 0)
> > #define DABRX_KERNEL (1UL << 1)
> > +#define DABRX_HYP (1UL << 2)
> > +#define DABRX_BTI (1UL << 3)
> > +#define DABRX_ALL (DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)
>
> > --- a/arch/powerpc/platforms/cell/beat.c
> > +++ b/arch/powerpc/platforms/cell/beat.c
> > @@ -136,9 +136,9 @@ ssize_t beat_nvram_get_size(void)
> > return BEAT_NVRAM_SIZE;
> > }
> >
> > -int beat_set_xdabr(unsigned long dabr)
> > +int beat_set_xdabr(unsigned long dabr, unsigned long dabrx)
> > {
> > - if (beat_set_dabr(dabr, DABRX_KERNEL | DABRX_USER))
> > + if (beat_set_dabr(dabr, dabrx))
> > return -1;
> > return 0;
> > }
>
> > --- a/arch/powerpc/platforms/ps3/setup.c
> > +++ b/arch/powerpc/platforms/ps3/setup.c
> > @@ -184,11 +184,9 @@ early_param("ps3flash", early_parse_ps3flash);
> > #define prealloc_ps3flash_bounce_buffer() do { } while (0)
> > #endif
> >
> > -static int ps3_set_dabr(unsigned long dabr)
> > +static int ps3_set_dabr(unsigned long dabr, unsigned long dabrx)
> > {
> > - enum {DABR_USER = 1, DABR_KERNEL = 2,};
> > -
> > - return lv1_set_dabr(dabr, DABR_KERNEL | DABR_USER) ? -1 : 0;
> > + return lv1_set_dabr(dabr, dabrx) ? -1 : 0;
> > }
>
> > - set_dabr(dabr.address | (dabr.enabled & 7));
> > + set_dabr(dabr.address | (dabr.enabled & 7), DABRX_ALL);
>
> Before, beat_set_dabr() and lv1_set_dabr() would have been called with dabrx = 3
> (DABRX_KERNEL | DABRX_USER). Now they're called with dabrx = 15
> (DABRX_ALL = DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER).
>
> No idea what's the impact of this...
Do you know if the ps3 hypervisor will allow us to set DABRX_BTI or
DABRX_HYP? phyp won't.
Mikey
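
The numbers Geert quotes follow directly from the flag definitions in the hunk. A minimal sketch of that arithmetic, assuming only the four DABRX_* values shown above; the three helper functions are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the quoted hunk. */
#define DABRX_USER	(1UL << 0)
#define DABRX_KERNEL	(1UL << 1)
#define DABRX_HYP	(1UL << 2)
#define DABRX_BTI	(1UL << 3)
#define DABRX_ALL	(DABRX_BTI | DABRX_HYP | DABRX_KERNEL | DABRX_USER)

/* Hypothetical helper: the mask beat_set_xdabr() and ps3_set_dabr()
 * hard-coded before the patch. */
static inline unsigned long dabrx_before_patch(void)
{
	return DABRX_KERNEL | DABRX_USER;
}

/* Hypothetical helper: the mask the reworked set_dabr() call passes down. */
static inline unsigned long dabrx_after_patch(void)
{
	return DABRX_ALL;
}

/* Hypothetical helper: bits the hypervisor call now sees that it did not
 * see before (DABRX_HYP and DABRX_BTI). */
static inline unsigned long dabrx_added_bits(void)
{
	return dabrx_after_patch() & ~dabrx_before_patch();
}
```

The difference is exactly DABRX_HYP | DABRX_BTI, which is why the question of whether each hypervisor accepts those two bits matters.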
* Re: [PATCH -V8 0/11] arch/powerpc: Add 64TB support to ppc64
From: Benjamin Herrenschmidt @ 2012-09-07 1:43 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linuxppc-dev, paulus
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
On Thu, 2012-09-06 at 20:59 +0530, Aneesh Kumar K.V wrote:
> Hi,
>
> This patchset includes patches for supporting 64TB with ppc64. I haven't booted
> this on hardware with 64TB memory yet. But they boot fine on real hardware with
> less memory. Changes extend VSID bits to 38 bits for a 256MB segment
> and 26 bits for 1TB segments.
Your series breaks the embedded 64-bit build. You seem to be hard wiring
dependencies on slice stuff all over 64-bit stuff regardless of the MMU
type or the value of CONFIG_MM_SLICES.
Also all these:
> +/* 4 bits per slice and we have one slice per 1TB */
> +#if 0 /* We can't directly include pgtable.h hence this hack */
> +#define SLICE_ARRAY_SIZE (PGTABLE_RANGE >> 41)
> +#else
> +/* Right now we only support 64TB */
> +#define SLICE_ARRAY_SIZE 32
> +#endif
Things are just too horrible. Find a different way of doing it, if
necessary create a new range define somewhere, whatever but don't leave
that crap as-is, it's too wrong.
Dropping the series for now.
Cheers,
Ben.
> Changes from V7:
> * Address review feedback
>
> Changes from V6:
> * rebase to latest upstream (5b716ac728bcc01b1f2a7ed6e437196602237c27)
>
> Changes from v5:
> * Address review feedback
>
> Changes from v4:
> * Drop patch "arch/powerpc: properly offset the context bits for 1T segemnts"
> based on review feedback
> * split CONTEXT_BITS related changes from patch 12
> * Add a new doc update patch
>
> Changes from v3:
> * Address review comments.
> * Added new patch to ensure proto-VSID isolation between kernel and user space
>
> Changes from V2:
> * Fix few FIXMEs in the patchset. I have added them as separate patch for
> easier review. That should help us to drop those changes if we don't agree.
>
> Changes from V1:
> * Drop the usage of structure (struct virt_addr) to carry virtual address.
> We now represent virtual address via vpn which is virtual address shifted
> right 12 bits.
>
> Thanks,
> -aneesh
>
* Re: [PATCH v2 1/2] [powerpc] Change memory_limit from phys_addr_t to unsigned long long
From: Benjamin Herrenschmidt @ 2012-09-07 1:35 UTC (permalink / raw)
To: Suzuki K. Poulose; +Cc: mahesh, linuxppc-dev, linux-kernel
In-Reply-To: <20120821114225.29282.87841.stgit@suzukikp.in.ibm.com>
On Tue, 2012-08-21 at 17:12 +0530, Suzuki K. Poulose wrote:
> There are some device-tree nodes whose values are of type phys_addr_t.
> The size of phys_addr_t varies with CONFIG_PHYS_ADDR_T_64BIT.
>
> Change these to a fixed unsigned long long for consistency.
>
> This patch does the change only for memory_limit.
>
> The following is a list of such variables which need the change:
>
> 1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c
>
> 2) (struct resource *)crashk_res.start - We could export a local static
> variable from machine_kexec.c.
>
> Changing the above values might break the kexec-tools. So, I will
> fix kexec-tools first to handle the different sized values and then change
> the above.
>
> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
> ---
Breaks the build on some configs (with 32-bit phys_addr_t):
/home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c: In function
'early_init_devtree':
/home/benh/linux-powerpc-test/arch/powerpc/kernel/prom.c:664:25: error:
comparison of distinct pointer types lacks a cast
I'm fixing that myself this time but please be more careful.
Cheers,
Ben.
* Re: [PATCH 2/2][v2] powerpc/perf: Sample only if SIAR-Valid bit is set in P7+
From: Benjamin Herrenschmidt @ 2012-09-07 0:50 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: michaele, linuxppc-dev, Anton Blanchard, benh, cel, khandual
In-Reply-To: <20120716212241.GB14033@us.ibm.com>
On Mon, 2012-07-16 at 14:22 -0700, Sukadev Bhattiprolu wrote:
> From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Date: Mon, 2 Jul 2012 08:06:14 -0700
> Subject: [PATCH 2/2][v2] powerpc/perf: Sample only if SIAR-Valid bit is set in P7+
>
> On POWER7+ two new bits (mmcra[35] and mmcra[36]) indicate whether the
> contents of SIAR and SDAR are valid.
>
> For marked instructions on P7+, we must save the contents of SIAR and
> SDAR registers only if these new bits are set.
>
> This code/check for the SIAR-Valid bit is specific to P7+, so rather than
> waste a CPU-feature bit use the PVR flag.
This appears to be based on an ancient code base. The code has changed
significantly in that area and this patch doesn't apply at all.
I have applied the first patch and renamed PV_ to PVR_ since we've
renamed them all since then. This will show up in powerpc-next later
today. Please rebase your perf patch on top of that.
Cheers,
Ben.
> Note that Carl Love proposed a similar change for oprofile:
>
> https://lkml.org/lkml/2012/6/22/309
>
> Changelog[v2]:
> - [Gabriel Paubert] Rename PV_POWER7P to PV_POWER7p.
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/perf_event_server.h | 1 +
> arch/powerpc/include/asm/reg.h | 4 +++
> arch/powerpc/perf/core-book3s.c | 38 ++++++++++++++++++++++---
> arch/powerpc/perf/power7-pmu.c | 3 ++
> 4 files changed, 41 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
> index 078019b..9710be3 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -49,6 +49,7 @@ struct power_pmu {
> #define PPMU_ALT_SIPR 2 /* uses alternate posn for SIPR/HV */
> #define PPMU_NO_SIPR 4 /* no SIPR/HV in MMCRA at all */
> #define PPMU_NO_CONT_SAMPLING 8 /* no continuous sampling */
> +#define PPMU_SIAR_VALID 16 /* Processor has SIAR Valid bit */
>
> /*
> * Values for flags to get_alternatives()
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 65b6164..a7a9a8b 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -601,6 +601,10 @@
> #define POWER6_MMCRA_SIPR 0x0000020000000000ULL
> #define POWER6_MMCRA_THRM 0x00000020UL
> #define POWER6_MMCRA_OTHER 0x0000000EUL
> +
> +#define POWER7P_MMCRA_SIAR_VALID 0x10000000 /* P7+ SIAR contents valid */
> +#define POWER7P_MMCRA_SDAR_VALID 0x08000000 /* P7+ SDAR contents valid */
> +
> #define SPRN_PMC1 787
> #define SPRN_PMC2 788
> #define SPRN_PMC3 789
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 8f84bcb..0a392d8 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -103,14 +103,20 @@ static inline unsigned long perf_ip_adjust(struct pt_regs *regs)
> * If we're not doing instruction sampling, give them the SDAR
> * (sampled data address). If we are doing instruction sampling, then
> * only give them the SDAR if it corresponds to the instruction
> - * pointed to by SIAR; this is indicated by the [POWER6_]MMCRA_SDSYNC
> - * bit in MMCRA.
> + * pointed to by SIAR; this is indicated by the [POWER6_]MMCRA_SDSYNC or
> + * the [POWER7P_]MMCRA_SDAR_VALID bit in MMCRA.
> */
> static inline void perf_get_data_addr(struct pt_regs *regs, u64 *addrp)
> {
> unsigned long mmcra = regs->dsisr;
> - unsigned long sdsync = (ppmu->flags & PPMU_ALT_SIPR) ?
> - POWER6_MMCRA_SDSYNC : MMCRA_SDSYNC;
> + unsigned long sdsync;
> +
> + if (ppmu->flags & PPMU_SIAR_VALID)
> + sdsync = POWER7P_MMCRA_SDAR_VALID;
> + else if (ppmu->flags & PPMU_ALT_SIPR)
> + sdsync = POWER6_MMCRA_SDSYNC;
> + else
> + sdsync = MMCRA_SDSYNC;
>
> if (!(mmcra & MMCRA_SAMPLE_ENABLE) || (mmcra & sdsync))
> *addrp = mfspr(SPRN_SDAR);
> @@ -1248,6 +1254,25 @@ struct pmu power_pmu = {
> .event_idx = power_pmu_event_idx,
> };
>
> +
> +/*
> + * On processors like P7+ that have the SIAR-Valid bit, marked instructions
> + * must be sampled only if the SIAR-valid bit is set.
> + *
> + * For unmarked instructions and for processors that don't have the SIAR-Valid
> + * bit, assume that SIAR is valid.
> + */
> +static inline int siar_valid(struct pt_regs *regs)
> +{
> + unsigned long mmcra = regs->dsisr;
> + int marked = mmcra & MMCRA_SAMPLE_ENABLE;
> +
> + if ((ppmu->flags & PPMU_SIAR_VALID) && marked)
> + return mmcra & POWER7P_MMCRA_SIAR_VALID;
> +
> + return 1;
> +}
> +
> /*
> * A counter has overflowed; update its count and record
> * things if requested. Note that interrupts are hard-disabled
> @@ -1281,7 +1306,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
> left += period;
> if (left <= 0)
> left = period;
> - record = 1;
> + record = siar_valid(regs);
> event->hw.last_period = event->hw.sample_period;
> }
> if (left < 0x80000000LL)
> @@ -1340,6 +1365,9 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
> !(mmcra & MMCRA_SAMPLE_ENABLE))
> return regs->nip;
>
> + if (!siar_valid(regs))
> + return 0; // no valid instruction pointer
> +
> return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
> }
>
> diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
> index 1251e4d..970a634 100644
> --- a/arch/powerpc/perf/power7-pmu.c
> +++ b/arch/powerpc/perf/power7-pmu.c
> @@ -373,6 +373,9 @@ static int __init init_power7_pmu(void)
> strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power7"))
> return -ENODEV;
>
> + if (__is_processor(PV_POWER7p))
> + power7_pmu.flags |= PPMU_SIAR_VALID;
> +
> return register_power_pmu(&power7_pmu);
> }
>
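
The commit message names the new bits as mmcra[35] and mmcra[36], while the patch defines them as 0x10000000 and 0x08000000. The two agree under IBM (MSB-0) bit numbering of a 64-bit register. A minimal sketch of that mapping and of the validity check, assuming MSB-0 numbering; `ibm_bit64()` and `siar_valid_sketch()` are hypothetical names, and the sketch normalises the result to 0 or 1 where the quoted `siar_valid()` returns the masked value:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask for bit n of a 64-bit register when bit 0 is
 * the most significant bit (IBM numbering). */
static inline uint64_t ibm_bit64(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << (63 - n);
}

/*
 * Sketch of the siar_valid() logic with plain arguments instead of
 * pt_regs and ppmu:
 *   has_valid_bit - the PMU advertises PPMU_SIAR_VALID
 *   marked        - sampling of marked instructions is enabled
 */
static inline int siar_valid_sketch(uint64_t mmcra, int has_valid_bit,
				    int marked)
{
	if (has_valid_bit && marked)
		return (mmcra & ibm_bit64(35)) != 0;

	/* Unmarked samples, or no SIAR-Valid bit: assume SIAR is valid. */
	return 1;
}
```

With this numbering, bit 35 is 1 << 28 and bit 36 is 1 << 27, matching POWER7P_MMCRA_SIAR_VALID and POWER7P_MMCRA_SDAR_VALID in the reg.h hunk.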
* Re: [PATCH -V8 04/11] arch/powerpc: Convert virtual address to vpn
From: Paul Mackerras @ 2012-09-06 22:32 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <1346945351-7672-5-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
On Thu, Sep 06, 2012 at 08:59:04PM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> This patch converts different functions to take a virtual page number
> instead of a virtual address. The virtual page number is the virtual
> address shifted right by VPN_SHIFT (12) bits. This enables us to have an
> address range of up to 76 bits.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Paul Mackerras <paulus@samba.org>
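
A minimal sketch of the conversion the commit message describes, assuming the VPN is held in a 64-bit integer (which is presumably where the 76-bit figure comes from: 64 VPN bits plus the 12 bits shifted away). The helper names are hypothetical; only VPN_SHIFT and its value come from the message:

```c
#include <assert.h>
#include <stdint.h>

#define VPN_SHIFT	12	/* value quoted in the commit message */

/* Hypothetical helper: drop the low VPN_SHIFT bits of a virtual address. */
static inline uint64_t va_to_vpn(uint64_t va)
{
	return va >> VPN_SHIFT;
}

/* Hypothetical helper: address bits representable by a VPN of the given
 * width, since every VPN value stands for 2^VPN_SHIFT bytes. */
static inline unsigned int va_bits_covered(unsigned int vpn_bits)
{
	return vpn_bits + VPN_SHIFT;
}
```

For example, address 0x12345678 lies in virtual page 0x12345, and a 64-bit VPN spans 76 address bits.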
* Re: [PATCH] KVM: PPC: BookE: HV: Fix compile
From: Alexander Graf @ 2012-09-06 19:39 UTC (permalink / raw)
To: Michael Neuling; +Cc: linuxppc-dev, Linus Torvalds, KVM list, kvm-ppc
In-Reply-To: <9849.1343262707@neuling.org>
On 25.07.2012, at 20:31, Michael Neuling wrote:
> Alexander Graf <agraf@suse.de> wrote:
> Alexander Graf <agraf@suse.de> wrote:
>
>> After merging the register type check patches from Ben's tree, the
>> hv enabled booke implementation ceased to compile.
>>
>> This patch fixes things up so everyone's happy again.
>
> Is there a defconfig which catches this?
Hrm. I don't think a defconfig gets you there, as KVM isn't enabled by
default. Just configure your kernel with support for e500mc and enable
KVM :).
Alex
* [PATCH] powerpc: Fix build dependencies for c files requiring libfdt.h
From: Matthew McClintock @ 2012-09-06 18:48 UTC (permalink / raw)
To: linuxppc-dev
Several files in obj-plat depend on the libfdt header file. Sometimes
when building, one can see the following failure. This patch adds
libfdt as a dependency of those object files.
| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:854:1: error: unterminated comment
| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:1:0: error: unterminated #ifndef
| BOOTCC arch/powerpc/boot/inffast.o
| make[1]: *** [arch/powerpc/boot/treeboot-iss4xx.o] Error 1
| make[1]: *** Waiting for unfinished jobs....
| BOOTCC arch/powerpc/boot/inflate.o
| make: *** [uImage] Error 2
| ERROR: oe_runmake failed
| ERROR: Function failed: do_compile (see /srv/home/pokybuild/yocto-autobuilder/yocto-slave/p1022ds/build/build/tmp/work/p1022ds-poky-linux-gnuspe/linux-qoriq-sdk-3.0.34-r5/temp/log.do_compile.2167 for further information)
NOTE: recipe linux-qoriq-sdk-3.0.34-r5: task do_compile: Failed
Signed-off-by: Matthew McClintock <msm@freescale.com>
---
arch/powerpc/boot/Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index b7d8333..6a15c96 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -107,6 +107,7 @@ src-boot := $(addprefix $(obj)/, $(src-boot))
obj-boot := $(addsuffix .o, $(basename $(src-boot)))
obj-wlib := $(addsuffix .o, $(basename $(addprefix $(obj)/, $(src-wlib))))
obj-plat := $(addsuffix .o, $(basename $(addprefix $(obj)/, $(src-plat))))
+$(obj-plat): $(libfdt)
quiet_cmd_copy_zlib = COPY $@
cmd_copy_zlib = sed "s@__used@@;s@<linux/\([^>]*\).*@\"\1\"@" $< > $@
--
1.7.9.7
* [PATCH -V8 06/11] arch/powerpc: Increase the slice range to 64TB
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This patch turns the high psizes mask into an unsigned char array
so that we can support more than 16TB. Currently we support up to
64TB.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/mmu-hash64.h | 6 +-
arch/powerpc/include/asm/page_64.h | 6 +-
arch/powerpc/mm/hash_utils_64.c | 15 +++--
arch/powerpc/mm/slb_low.S | 30 ++++++---
arch/powerpc/mm/slice.c | 107 +++++++++++++++++++++------------
5 files changed, 109 insertions(+), 55 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 6aeb498..7cbd541 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -460,7 +460,11 @@ typedef struct {
#ifdef CONFIG_PPC_MM_SLICES
u64 low_slices_psize; /* SLB page size encodings */
- u64 high_slices_psize; /* 4 bits per slice for now */
+ /*
+ * Right now we support 64TB and 4 bits for each
+ * 1TB slice we need 32 bytes for 64TB.
+ */
+ unsigned char high_slices_psize[32]; /* 4 bits per slice for now */
#else
u16 sllp; /* SLB page size encoding */
#endif
diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index fed85e6..6c9bef4 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -82,7 +82,11 @@ extern u64 ppc64_pft_size;
struct slice_mask {
u16 low_slices;
- u16 high_slices;
+ /*
+ * This should be derived out of PGTABLE_RANGE. For the current
+ * max 64TB, u64 should be ok.
+ */
+ u64 high_slices;
};
struct mm_struct;
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 74c5479..13e0ccf 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -804,16 +804,19 @@ unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
#ifdef CONFIG_PPC_MM_SLICES
unsigned int get_paca_psize(unsigned long addr)
{
- unsigned long index, slices;
+ u64 lpsizes;
+ unsigned char *hpsizes;
+ unsigned long index, mask_index;
if (addr < SLICE_LOW_TOP) {
- slices = get_paca()->context.low_slices_psize;
+ lpsizes = get_paca()->context.low_slices_psize;
index = GET_LOW_SLICE_INDEX(addr);
- } else {
- slices = get_paca()->context.high_slices_psize;
- index = GET_HIGH_SLICE_INDEX(addr);
+ return (lpsizes >> (index * 4)) & 0xF;
}
- return (slices >> (index * 4)) & 0xF;
+ hpsizes = get_paca()->context.high_slices_psize;
+ index = GET_HIGH_SLICE_INDEX(addr);
+ mask_index = index & 0x1;
+ return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
}
#else
diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
index b9ee79ce..e132dc6 100644
--- a/arch/powerpc/mm/slb_low.S
+++ b/arch/powerpc/mm/slb_low.S
@@ -108,17 +108,31 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
* between 4k and 64k standard page size
*/
#ifdef CONFIG_PPC_MM_SLICES
+ /* r10 have esid */
cmpldi r10,16
-
- /* Get the slice index * 4 in r11 and matching slice size mask in r9 */
- ld r9,PACALOWSLICESPSIZE(r13)
- sldi r11,r10,2
+ /* below SLICE_LOW_TOP */
blt 5f
- ld r9,PACAHIGHSLICEPSIZE(r13)
- srdi r11,r10,(SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT - 2)
- andi. r11,r11,0x3c
+ /*
+ * Handle hpsizes,
+ * r9 is get_paca()->context.high_slices_psize[index], r11 is mask_index
+ */
+ srdi r11,r10,(SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT + 1) /* index */
+ addi r9,r11,PACAHIGHSLICEPSIZE
+ lbzx r9,r13,r9 /* r9 is hpsizes[r11] */
+ /* r11 = (r10 >> (SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT)) & 0x1 */
+ rldicl r11,r10,(64 - (SLICE_HIGH_SHIFT - SLICE_LOW_SHIFT)),63
+ b 6f
-5: /* Extract the psize and multiply to get an array offset */
+5:
+ /*
+ * Handle lpsizes
+ * r9 is get_paca()->context.low_slices_psize, r11 is index
+ */
+ ld r9,PACALOWSLICESPSIZE(r13)
+ mr r11,r10
+6:
+ sldi r11,r11,2 /* index * 4 */
+ /* Extract the psize and multiply to get an array offset */
srd r9,r9,r11
andi. r9,r9,0xf
mulli r9,r9,MMUPSIZEDEFSIZE
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 73709f7..b4e996a 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -42,7 +42,7 @@ int _slice_debug = 1;
static void slice_print_mask(const char *label, struct slice_mask mask)
{
- char *p, buf[16 + 3 + 16 + 1];
+ char *p, buf[16 + 3 + 64 + 1];
int i;
if (!_slice_debug)
@@ -54,7 +54,7 @@ static void slice_print_mask(const char *label, struct slice_mask mask)
*(p++) = '-';
*(p++) = ' ';
for (i = 0; i < SLICE_NUM_HIGH; i++)
- *(p++) = (mask.high_slices & (1 << i)) ? '1' : '0';
+ *(p++) = (mask.high_slices & (1ul << i)) ? '1' : '0';
*(p++) = 0;
printk(KERN_DEBUG "%s:%s\n", label, buf);
@@ -84,8 +84,8 @@ static struct slice_mask slice_range_to_mask(unsigned long start,
}
if ((start + len) > SLICE_LOW_TOP)
- ret.high_slices = (1u << (GET_HIGH_SLICE_INDEX(end) + 1))
- - (1u << GET_HIGH_SLICE_INDEX(start));
+ ret.high_slices = (1ul << (GET_HIGH_SLICE_INDEX(end) + 1))
+ - (1ul << GET_HIGH_SLICE_INDEX(start));
return ret;
}
@@ -135,26 +135,31 @@ static struct slice_mask slice_mask_for_free(struct mm_struct *mm)
for (i = 0; i < SLICE_NUM_HIGH; i++)
if (!slice_high_has_vma(mm, i))
- ret.high_slices |= 1u << i;
+ ret.high_slices |= 1ul << i;
return ret;
}
static struct slice_mask slice_mask_for_size(struct mm_struct *mm, int psize)
{
+ unsigned char *hpsizes;
+ int index, mask_index;
struct slice_mask ret = { 0, 0 };
unsigned long i;
- u64 psizes;
+ u64 lpsizes;
- psizes = mm->context.low_slices_psize;
+ lpsizes = mm->context.low_slices_psize;
for (i = 0; i < SLICE_NUM_LOW; i++)
- if (((psizes >> (i * 4)) & 0xf) == psize)
+ if (((lpsizes >> (i * 4)) & 0xf) == psize)
ret.low_slices |= 1u << i;
- psizes = mm->context.high_slices_psize;
- for (i = 0; i < SLICE_NUM_HIGH; i++)
- if (((psizes >> (i * 4)) & 0xf) == psize)
- ret.high_slices |= 1u << i;
+ hpsizes = mm->context.high_slices_psize;
+ for (i = 0; i < SLICE_NUM_HIGH; i++) {
+ mask_index = i & 0x1;
+ index = i >> 1;
+ if (((hpsizes[index] >> (mask_index * 4)) & 0xf) == psize)
+ ret.high_slices |= 1ul << i;
+ }
return ret;
}
@@ -183,8 +188,10 @@ static void slice_flush_segments(void *parm)
static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psize)
{
+ int index, mask_index;
/* Write the new slice psize bits */
- u64 lpsizes, hpsizes;
+ unsigned char *hpsizes;
+ u64 lpsizes;
unsigned long i, flags;
slice_dbg("slice_convert(mm=%p, psize=%d)\n", mm, psize);
@@ -201,14 +208,18 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
lpsizes = (lpsizes & ~(0xful << (i * 4))) |
(((unsigned long)psize) << (i * 4));
- hpsizes = mm->context.high_slices_psize;
- for (i = 0; i < SLICE_NUM_HIGH; i++)
- if (mask.high_slices & (1u << i))
- hpsizes = (hpsizes & ~(0xful << (i * 4))) |
- (((unsigned long)psize) << (i * 4));
-
+ /* Assign the value back */
mm->context.low_slices_psize = lpsizes;
- mm->context.high_slices_psize = hpsizes;
+
+ hpsizes = mm->context.high_slices_psize;
+ for (i = 0; i < SLICE_NUM_HIGH; i++) {
+ mask_index = i & 0x1;
+ index = i >> 1;
+ if (mask.high_slices & (1ul << i))
+ hpsizes[index] = (hpsizes[index] &
+ ~(0xf << (mask_index * 4))) |
+ (((unsigned long)psize) << (mask_index * 4));
+ }
slice_dbg(" lsps=%lx, hsps=%lx\n",
mm->context.low_slices_psize,
@@ -587,18 +598,19 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp,
unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
{
- u64 psizes;
- int index;
+ unsigned char *hpsizes;
+ int index, mask_index;
if (addr < SLICE_LOW_TOP) {
- psizes = mm->context.low_slices_psize;
+ u64 lpsizes;
+ lpsizes = mm->context.low_slices_psize;
index = GET_LOW_SLICE_INDEX(addr);
- } else {
- psizes = mm->context.high_slices_psize;
- index = GET_HIGH_SLICE_INDEX(addr);
+ return (lpsizes >> (index * 4)) & 0xf;
}
-
- return (psizes >> (index * 4)) & 0xf;
+ hpsizes = mm->context.high_slices_psize;
+ index = GET_HIGH_SLICE_INDEX(addr);
+ mask_index = index & 0x1;
+ return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xf;
}
EXPORT_SYMBOL_GPL(get_slice_psize);
@@ -618,7 +630,9 @@ EXPORT_SYMBOL_GPL(get_slice_psize);
*/
void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
{
- unsigned long flags, lpsizes, hpsizes;
+ int index, mask_index;
+ unsigned char *hpsizes;
+ unsigned long flags, lpsizes;
unsigned int old_psize;
int i;
@@ -639,15 +653,21 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
if (((lpsizes >> (i * 4)) & 0xf) == old_psize)
lpsizes = (lpsizes & ~(0xful << (i * 4))) |
(((unsigned long)psize) << (i * 4));
+ /* Assign the value back */
+ mm->context.low_slices_psize = lpsizes;
hpsizes = mm->context.high_slices_psize;
- for (i = 0; i < SLICE_NUM_HIGH; i++)
- if (((hpsizes >> (i * 4)) & 0xf) == old_psize)
- hpsizes = (hpsizes & ~(0xful << (i * 4))) |
- (((unsigned long)psize) << (i * 4));
+ for (i = 0; i < SLICE_NUM_HIGH; i++) {
+ mask_index = i & 0x1;
+ index = i >> 1;
+ if (((hpsizes[index] >> (mask_index * 4)) & 0xf) == old_psize)
+ hpsizes[index] = (hpsizes[index] &
+ ~(0xf << (mask_index * 4))) |
+ (((unsigned long)psize) << (mask_index * 4));
+ }
+
+
- mm->context.low_slices_psize = lpsizes;
- mm->context.high_slices_psize = hpsizes;
slice_dbg(" lsps=%lx, hsps=%lx\n",
mm->context.low_slices_psize,
@@ -660,18 +680,27 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
void slice_set_psize(struct mm_struct *mm, unsigned long address,
unsigned int psize)
{
+ unsigned char *hpsizes;
unsigned long i, flags;
- u64 *p;
+ u64 *lpsizes;
spin_lock_irqsave(&slice_convert_lock, flags);
if (address < SLICE_LOW_TOP) {
i = GET_LOW_SLICE_INDEX(address);
- p = &mm->context.low_slices_psize;
+ lpsizes = &mm->context.low_slices_psize;
+ *lpsizes = (*lpsizes & ~(0xful << (i * 4))) |
+ ((unsigned long) psize << (i * 4));
} else {
+ int index, mask_index;
i = GET_HIGH_SLICE_INDEX(address);
- p = &mm->context.high_slices_psize;
+ hpsizes = mm->context.high_slices_psize;
+ mask_index = i & 0x1;
+ index = i >> 1;
+ hpsizes[index] = (hpsizes[index] &
+ ~(0xf << (mask_index * 4))) |
+ (((unsigned long)psize) << (mask_index * 4));
}
- *p = (*p & ~(0xful << (i * 4))) | ((unsigned long) psize << (i * 4));
+
spin_unlock_irqrestore(&slice_convert_lock, flags);
#ifdef CONFIG_SPU_BASE
--
1.7.10
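
The patch above packs two 4-bit page-size codes into each byte of high_slices_psize, using `index = i >> 1` to pick the byte and `mask_index = i & 0x1` to pick the nibble. A minimal standalone sketch of that packing, assuming nothing beyond what the hunks show; `hpsize_get()` and `hpsize_set()` are hypothetical names, not functions from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Array size used by the patch: 64 slices of 1TB, 4 bits each. */
#define HPSIZES_BYTES	32

/* Hypothetical helper: read the 4-bit psize of high slice i. */
static inline unsigned int hpsize_get(const unsigned char *hpsizes,
				      unsigned long i)
{
	unsigned long index = i >> 1;		/* byte holding the slice */
	unsigned long mask_index = i & 0x1;	/* 0 = low nibble, 1 = high */

	assert(index < HPSIZES_BYTES);
	return (hpsizes[index] >> (mask_index * 4)) & 0xf;
}

/* Hypothetical helper: overwrite the 4-bit psize of high slice i,
 * leaving the neighbouring nibble untouched. */
static inline void hpsize_set(unsigned char *hpsizes, unsigned long i,
			      unsigned int psize)
{
	unsigned long index = i >> 1;
	unsigned long mask_index = i & 0x1;

	assert(index < HPSIZES_BYTES);
	hpsizes[index] = (unsigned char)
		((hpsizes[index] & ~(0xfu << (mask_index * 4))) |
		 ((psize & 0xfu) << (mask_index * 4)));
}
```

Slices 2 and 3 share byte 1, so writing one must preserve the other; that is the reason each update in the patch masks out a single nibble before OR-ing in the new value.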
* [PATCH -V8 09/11] arch/powerpc: Use 32bit array for slb cache
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
With a larger VSID we need to track more bits of the ESID in the SLB
cache for SLB invalidation.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/paca.h | 2 +-
arch/powerpc/mm/slb_low.S | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index daf813f..3e7abba 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -100,7 +100,7 @@ struct paca_struct {
/* SLB related definitions */
u16 vmalloc_sllp;
u16 slb_cache_ptr;
- u16 slb_cache[SLB_CACHE_ENTRIES];
+ u32 slb_cache[SLB_CACHE_ENTRIES];
#endif /* CONFIG_PPC_STD_MMU_64 */
#ifdef CONFIG_PPC_BOOK3E
diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
index 3b75f19..f6a2625 100644
--- a/arch/powerpc/mm/slb_low.S
+++ b/arch/powerpc/mm/slb_low.S
@@ -270,10 +270,10 @@ _GLOBAL(slb_compare_rr_to_size)
bge 1f
/* still room in the slb cache */
- sldi r11,r3,1 /* r11 = offset * sizeof(u16) */
- rldicl r10,r10,36,28 /* get low 16 bits of the ESID */
- add r11,r11,r13 /* r11 = (u16 *)paca + offset */
- sth r10,PACASLBCACHE(r11) /* paca->slb_cache[offset] = esid */
+ sldi r11,r3,2 /* r11 = offset * sizeof(u32) */
+ srdi r10,r10,28 /* get the 36 bits of the ESID */
+ add r11,r11,r13 /* r11 = (u32 *)paca + offset */
+ stw r10,PACASLBCACHE(r11) /* paca->slb_cache[offset] = esid */
addi r3,r3,1 /* offset++ */
b 2f
1: /* offset >= SLB_CACHE_ENTRIES */
--
1.7.10
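
A rough sketch of why a u16 entry stops being enough, under two assumptions that are not stated in the patch itself: the cache only records ESIDs of 256MB user segments (shift of 28, as in the `srdi r10,r10,28` above), and the user address space grows from 16TB to 64TB as elsewhere in this series. All helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SID_SHIFT_256M	28	/* 256MB segments: ESID = EA >> 28 */

/* Hypothetical helper: effective address to 256MB-segment ESID. */
static inline uint64_t ea_to_esid(uint64_t ea)
{
	return ea >> SID_SHIFT_256M;
}

/* Hypothetical helpers: does an ESID survive storage in a u16 / u32 slot? */
static inline int esid_fits_u16(uint64_t esid)
{
	return esid <= UINT16_MAX;
}

static inline int esid_fits_u32(uint64_t esid)
{
	return esid <= UINT32_MAX;
}
```

The highest address below 16TB has ESID 0xFFFF, which just fits in 16 bits; the highest address below 64TB has ESID 0x3FFFF, which does not, so the old `sth` would have dropped the top two bits.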
* [PATCH -V8 07/11] arch/powerpc: Make some of the PGTABLE_RANGE dependency explicit
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
The slice array size and slice mask size depend on PGTABLE_RANGE. We
can't directly include pgtable.h in these headers because there is
a circular dependency, so add compile-time checks for these values.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/mmu-hash64.h | 13 ++++++++-----
arch/powerpc/include/asm/page_64.h | 16 ++++++++++++----
arch/powerpc/include/asm/pgtable-ppc64.h | 8 ++++++++
3 files changed, 28 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 7cbd541..cbd7edb 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -416,6 +416,13 @@ extern void slb_set_size(u16 size);
srdi rx,rx,VSID_BITS_##size; /* extract 2^VSID_BITS bit */ \
add rt,rt,rx
+/* 4 bits per slice and we have one slice per 1TB */
+#if 0 /* We can't directly include pgtable.h hence this hack */
+#define SLICE_ARRAY_SIZE (PGTABLE_RANGE >> 41)
+#else
+/* Right now we only support 64TB */
+#define SLICE_ARRAY_SIZE 32
+#endif
#ifndef __ASSEMBLY__
@@ -460,11 +467,7 @@ typedef struct {
#ifdef CONFIG_PPC_MM_SLICES
u64 low_slices_psize; /* SLB page size encodings */
- /*
- * Right now we support 64TB and 4 bits for each
- * 1TB slice we need 32 bytes for 64TB.
- */
- unsigned char high_slices_psize[32]; /* 4 bits per slice for now */
+ unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
#else
u16 sllp; /* SLB page size encoding */
#endif
diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index 6c9bef4..b55beb4 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -78,14 +78,22 @@ extern u64 ppc64_pft_size;
#define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT)
#define GET_HIGH_SLICE_INDEX(addr) ((addr) >> SLICE_HIGH_SHIFT)
+/* 1 bit per slice and we have one slice per 1TB */
+#if 0 /* We can't directly include pgtable.h hence this hack */
+#define SLICE_MASK_SIZE (PGTABLE_RANGE >> 43)
+#else
+/*
+ * Right now we support only 64TB.
+ * IF we change this we will have to change the type
+ * of high_slices
+ */
+#define SLICE_MASK_SIZE 8
+#endif
+
#ifndef __ASSEMBLY__
struct slice_mask {
u16 low_slices;
- /*
- * This should be derived out of PGTABLE_RANGE. For the current
- * max 64TB, u64 should be ok.
- */
u64 high_slices;
};
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 8af1cf2..dea953f 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -32,6 +32,14 @@
#endif
#endif
+#if (PGTABLE_RANGE >> 41) > SLICE_ARRAY_SIZE
+#error PGTABLE_RANGE exceeds SLICE_ARRAY_SIZE
+#endif
+
+#if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
+#error PGTABLE_RANGE exceeds slice_mask high_slices size
+#endif
+
/*
* Define the address range of the kernel non-linear virtual area
*/
--
1.7.10
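
The shifts by 41 and 43 in the patch above come from one slice per 1TB (2^40 bytes): at 4 bits per slice the array needs range / 2^40 / 2 bytes, and at 1 bit per slice the mask needs range / 2^40 / 8 bytes. A minimal sketch of the same check, assuming a 64TB range; `PGTABLE_RANGE_SKETCH` and the two helpers are hypothetical stand-ins, and C11 `_Static_assert` replaces the `#if`/`#error` pair used in the header:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: a 64TB page-table range. */
#define PGTABLE_RANGE_SKETCH	(UINT64_C(1) << 46)

/* Values hard-coded by the patch. */
#define SLICE_ARRAY_SIZE	32	/* bytes, 4 bits per 1TB slice */
#define SLICE_MASK_SIZE		8	/* bytes, 1 bit per 1TB slice */

/* Compile-time equivalents of the #error checks in pgtable-ppc64.h. */
_Static_assert((PGTABLE_RANGE_SKETCH >> 41) <= SLICE_ARRAY_SIZE,
	       "range exceeds SLICE_ARRAY_SIZE");
_Static_assert((PGTABLE_RANGE_SKETCH >> 43) <= SLICE_MASK_SIZE,
	       "range exceeds slice_mask high_slices size");

/* Hypothetical helper: bytes needed for the per-slice psize array. */
static inline uint64_t slice_array_bytes(uint64_t range)
{
	return range >> 41;
}

/* Hypothetical helper: bytes needed for the per-slice bitmask. */
static inline uint64_t slice_mask_bytes(uint64_t range)
{
	return range >> 43;
}
```

Doubling the range to 128TB would need 64 array bytes and 16 mask bytes, so both checks would fire, which is the situation the hard-coded constants are meant to catch.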
* [PATCH -V8 11/11] arch/powerpc: Update VSID allocation documentation
From: Aneesh Kumar K.V @ 2012-09-06 15:29 UTC (permalink / raw)
To: benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1346945351-7672-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This updates the proto-VSID and VSID scramble related information
to be more generic by using names instead of current values.
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/mmu-hash64.h | 40 ++++++++++++++-------------------
arch/powerpc/mm/mmu_context_hash64.c | 8 ++++---
2 files changed, 22 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index de9cfed..428f23e 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -324,51 +324,45 @@ extern void slb_set_size(u16 size);
#endif /* __ASSEMBLY__ */
/*
- * VSID allocation
+ * VSID allocation (256MB segment)
*
- * We first generate a 36-bit "proto-VSID". For kernel addresses this
- * is equal to the ESID, for user addresses it is:
- * (context << 15) | (esid & 0x7fff)
+ * We first generate a 38-bit "proto-VSID". For kernel addresses this
+ * is equal to the ESID | 1 << 37, for user addresses it is:
+ * (context << USER_ESID_BITS) | (esid & ((1U << USER_ESID_BITS) - 1))
*
- * The two forms are distinguishable because the top bit is 0 for user
- * addresses, whereas the top two bits are 1 for kernel addresses.
- * Proto-VSIDs with the top two bits equal to 0b10 are reserved for
- * now.
+ * This splits the proto-VSID space into the following ranges:
+ * 0 - (2^(CONTEXT_BITS + USER_ESID_BITS) - 1) : User proto-VSID range
+ * 2^(CONTEXT_BITS + USER_ESID_BITS) - 2^(VSID_BITS) : Kernel proto-VSID range
+ *
+ * We also have CONTEXT_BITS + USER_ESID_BITS = VSID_BITS - 1
+ * That is, we assign half of the space to user processes and half
+ * to the kernel.
*
* The proto-VSIDs are then scrambled into real VSIDs with the
* multiplicative hash:
*
* VSID = (proto-VSID * VSID_MULTIPLIER) % VSID_MODULUS
- * where VSID_MULTIPLIER = 268435399 = 0xFFFFFC7
- * VSID_MODULUS = 2^36-1 = 0xFFFFFFFFF
*
- * This scramble is only well defined for proto-VSIDs below
- * 0xFFFFFFFFF, so both proto-VSID and actual VSID 0xFFFFFFFFF are
- * reserved. VSID_MULTIPLIER is prime, so in particular it is
+ * VSID_MULTIPLIER is prime, so in particular it is
* co-prime to VSID_MODULUS, making this a 1:1 scrambling function.
* Because the modulus is 2^n-1 we can compute it efficiently without
* a divide or extra multiply (see below).
*
* This scheme has several advantages over older methods:
*
- * - We have VSIDs allocated for every kernel address
+ * - We have VSIDs allocated for every kernel address
* (i.e. everything above 0xC000000000000000), except the very top
* segment, which simplifies several things.
*
- * - We allow for 16 significant bits of ESID and 19 bits of
- * context for user addresses. i.e. 16T (44 bits) of address space for
- * up to half a million contexts.
+ * - We allow for USER_ESID_BITS significant bits of ESID and
+ * CONTEXT_BITS bits of context for user addresses.
+ * i.e. 64T (46 bits) of address space for up to half a million contexts.
*
- * - The scramble function gives robust scattering in the hash
+ * - The scramble function gives robust scattering in the hash
* table (at least based on some initial results). The previous
* method was more susceptible to pathological cases giving excessive
* hash collisions.
*/
-/*
- * WARNING - If you change these you must make sure the asm
- * implementations in slb_allocate (slb_low.S), do_stab_bolted
- * (head.S) and ASM_VSID_SCRAMBLE (below) are changed accordingly.
- */
/*
* This should be computed such that protovosid * vsid_mulitplier
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index daa076c..40bc5b0 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -30,9 +30,11 @@ static DEFINE_SPINLOCK(mmu_context_lock);
static DEFINE_IDA(mmu_context_ida);
/*
- * The proto-VSID space has 2^35 - 1 segments available for user mappings.
- * Each segment contains 2^28 bytes. Each context maps 2^44 bytes,
- * so we can support 2^19-1 contexts (19 == 35 + 28 - 44).
+ * 256MB segment
+ * The proto-VSID space has 2^(CONTEXT_BITS + USER_ESID_BITS) - 1 segments
+ * available for user mappings. Each segment contains 2^28 bytes. Each
+ * context maps 2^46 bytes (64TB) so we can support 2^19-1 contexts
+ * (19 == 37 + 28 - 46).
*/
#define MAX_CONTEXT ((1UL << CONTEXT_BITS) - 1)
--
1.7.10
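
The proto-VSID layout and the scramble described in the updated comment can be sketched in standalone C as below. This is an illustration under stated assumptions, not the kernel implementation. The bit widths (VSID_BITS 38, CONTEXT_BITS 19, USER_ESID_BITS 18) follow the numbers in the comment for 256MB segments, the multiplier is an assumed 24-bit value chosen so that the product fits in 64 bits, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths for 256MB segments, matching the numbers in the comment. */
#define VSID_BITS       38
#define CONTEXT_BITS    19
#define USER_ESID_BITS  18
#define VSID_MODULUS    ((UINT64_C(1) << VSID_BITS) - 1)
/* Assumed multiplier for illustration; 38 + 24 bits stays below 64 bits. */
#define VSID_MULTIPLIER UINT64_C(12538073)

/* The comment's invariant: user space gets half of the proto-VSID space. */
static_assert(CONTEXT_BITS + USER_ESID_BITS == VSID_BITS - 1,
              "user proto-VSIDs must fill exactly half the space");
/* MAX_CONTEXT derivation from mmu_context_hash64.c: 19 == 37 + 28 - 46. */
static_assert(CONTEXT_BITS == (VSID_BITS - 1) + 28 - 46,
              "context bits must match segment and address-space sizes");

/* User proto-VSID: context in the high bits, masked ESID in the low bits. */
static uint64_t user_proto_vsid(uint64_t context, uint64_t esid)
{
    return (context << USER_ESID_BITS) |
           (esid & ((UINT64_C(1) << USER_ESID_BITS) - 1));
}

/* Kernel proto-VSID: ESID with the top bit set (esid assumed < 2^37). */
static uint64_t kernel_proto_vsid(uint64_t esid)
{
    return esid | (UINT64_C(1) << (VSID_BITS - 1));
}

/* Reference scramble using an explicit modulo. */
static uint64_t vsid_scramble_slow(uint64_t proto)
{
    return (proto * VSID_MULTIPLIER) % VSID_MODULUS;
}

/* Same result without a divide, using the 2^n - 1 form of the modulus. */
static uint64_t vsid_scramble_fast(uint64_t proto)
{
    uint64_t x = proto * VSID_MULTIPLIER;

    /* Fold the high part onto the low part: 2^n == 1 (mod 2^n - 1). */
    x = (x >> VSID_BITS) + (x & VSID_MODULUS);
    /* Final correction for x >= 2^n - 1, then mask down to n bits. */
    return (x + ((x + 1) >> VSID_BITS)) & VSID_MODULUS;
}
```

The fold works because any product p can be written as hi * 2^n + lo, and 2^n is congruent to 1 modulo 2^n - 1, so p is congruent to hi + lo. One conditional increment then brings the sum back into range.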