LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
From: Scott Wood @ 2012-05-31 17:47 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linuxppc-dev, Dan Malek, Bob Cochran, Support
In-Reply-To: <OF1016B1AA.DF19B646-ONC1257A0F.003647EF-C1257A0F.003699E5@transmode.se>

On 05/31/2012 04:56 AM, Joakim Tjernlund wrote:
> Abatron Support <support@abatron.ch> wrote on 2012/05/31 11:30:57:
>>
>>
>>> Abatron Support <support@abatron.ch> wrote on 2012/05/30 14:08:26:
>>>>
>>>>>> I have tested this briefly with BDI2000 on P2010(e500) and
>>>>>> it works for me. I don't know if there are any bad side effects,
>>>>>> therfore
>>>>>> this RFC.
>>>>
>>>>> We used to have MSR_DE surrounded by CONFIG_something
>>>>> to ensure it wasn't set under normal operation.  IIRC, if MSR_DE
>>>>> is set, you will have problems with software debuggers that
>>>>> utilize the the debugging registers in the chip itself.  You only want
>>>>> to force this to be set when using the BDI, not at other times.
>>>>
>>>> This MSR_DE is also of interest and used for software debuggers that
>>>> make use of the debug registers. Only if MSR_DE is set then debug
>>>> interrupts are generated. If a debug event leads to a debug interrupt
>>>> handled by a software debugger or if it leads to a debug halt handled
>>>> by a JTAG tool is selected with DBCR0_EDM / DBCR0_IDM.
>>>>
>>>> The "e500 Core Family Reference Manual" chapter "Chapter 8
>>>> Debug Support" explains in detail the effect of MSR_DE.
>>
>>> So what is the verdict on this? I don't buy into Dan argument without some
>>> hard data.
>>
>> What I tried to mention is that handling the MSR_DE correct is not only
>> an emulator (JTAG debugger) requirement. Also a software debugger may
>> depend on a correct handled MSR_DE bit.
> 
> Yes, that made sense to me too. How would SW debuggers work if the kernel keeps
> turning off MSR_DE first chance it gets?

The kernel selectively enables MSR_DE when it wants to debug.  I'm not
sure if anything will be bothered by leaving it on all the time.  This
is something we need for virtualization as well, so a hypervisor can
debug the guest.

-Scott

^ permalink raw reply

* Re: kernel panic during kernel module load (powerpc specific part)
From: Gabriel Paubert @ 2012-05-31 11:04 UTC (permalink / raw)
  To: Wrobel Heinz-R39252
  Cc: Michael Ellerman, Steffen Rumler, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <192298D25D96A042975E372855100DB70FEA87@039-SN2MPN1-011.039d.mgd.msft.net>

On Thu, May 31, 2012 at 07:04:42AM +0000, Wrobel Heinz-R39252 wrote:
> Michael,
> 
> > On Wed, 2012-05-30 at 16:33 +0200, Steffen Rumler wrote:
> > > I've found the following root cause:
> > >
> > >      (6) Unfortunately, the trampoline code (do_plt_call()) is using
> > register r11 to setup the jump.
> > >            It looks like the prologue and epilogue are using also the
> > register r11, in order to point to the previous stack frame.
> > >            This is a conflict !!! The trampoline code is damaging the
> > content of r11.
> > 
> > Hi Steffen,
> > 
> > Great bug report!
> > 
> > I can't quite work out what the standards say, the versions I'm looking
> > at are probably old anyway.
> 
> The ABI supplement from https://www.power.org/resources/downloads/ is explicit about r11 being a requirement for the statically lined save/restore functions in section 3.3.4 on page 59.
> 
> This means that any trampoline code must not ever use r11 or it can't be used to get to such save/restore functions safely from far away.

I believe that the basic premise is that you should provide a directly reachable copy 
of the save/rstore functions, even if this means that you need several copies of the functions.

> 
> Unfortunately the same doc and predecessors show r11 in all basic examples for PLT/trampoline code AFAICS, which is likely why all trampoline code uses r11 in any known case.
> 
> I would guess that it was never envisioned that compiler generated code would be in a different section than save/restore functions, i.e., the Linux module "__init" assumptions for Power break the ABI. Or does the ABI break the __init concept?!
> 
> Using r12 in the trampoline seems to be the obvious solution for module loading.
> 
> But what about other code loading done? If, e.g., a user runs any app from bash it gets loaded and relocated and trampolines might get set up somehow.

I don't think so. The linker/whatever should generate a copy of the save/restore functions for every
executable code area (shared library), and probably more than one copy if the text becomes too large.

For 64 bit code, these functions are actually inserted by the linker.

[Ok, I just recompiled my 64 bit kernel with -Os and I see that vmlinux gets one copy
of the save/restore functions and every module also gets its copy.]

This makes sense, really these functions are there for a compromise between locality 
and performance, there should be one per code section, otherwise the cache line 
used by the trampoline negates a large part of their advantage.

> 
> Wouldn't we have to find fix ANY trampoline code generator remotely related to a basic Power Architecture Linux?
> Or is it a basic assumption for anything but modules that compiler generated code may never ever be outside the .text section? I am not sure that would be a safe assumption.
> 
> Isn't this problem going beyond just module loading for Power Architecture Linux?

I don't think so. It really seems to be a 32 bit kernel problem.

	Regards,
	Gabriel

^ permalink raw reply

* Re: Re[4]: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
From: Joakim Tjernlund @ 2012-05-31  9:56 UTC (permalink / raw)
  To: Support; +Cc: linuxppc-dev, Dan Malek, Bob Cochran
In-Reply-To: <13517672561.20120531113057@abatron.ch>

Abatron Support <support@abatron.ch> wrote on 2012/05/31 11:30:57:
>
>
> > Abatron Support <support@abatron.ch> wrote on 2012/05/30 14:08:26:
> >>
> >> >> I have tested this briefly with BDI2000 on P2010(e500) and
> >> >> it works for me. I don't know if there are any bad side effects,
> >> >> therfore
> >> >> this RFC.
> >>
> >> > We used to have MSR_DE surrounded by CONFIG_something
> >> > to ensure it wasn't set under normal operation.  IIRC, if MSR_DE
> >> > is set, you will have problems with software debuggers that
> >> > utilize the the debugging registers in the chip itself.  You only want
> >> > to force this to be set when using the BDI, not at other times.
> >>
> >> This MSR_DE is also of interest and used for software debuggers that
> >> make use of the debug registers. Only if MSR_DE is set then debug
> >> interrupts are generated. If a debug event leads to a debug interrupt
> >> handled by a software debugger or if it leads to a debug halt handled
> >> by a JTAG tool is selected with DBCR0_EDM / DBCR0_IDM.
> >>
> >> The "e500 Core Family Reference Manual" chapter "Chapter 8
> >> Debug Support" explains in detail the effect of MSR_DE.
>
> > So what is the verdict on this? I don't buy into Dan argument without some
> > hard data.
>
> What I tried to mention is that handling the MSR_DE correct is not only
> an emulator (JTAG debugger) requirement. Also a software debugger may
> depend on a correct handled MSR_DE bit.

Yes, that made sense to me too. How would SW debuggers work if the kernel keeps
turning off MSR_DE first chance it gets?

 Jocke

^ permalink raw reply

* Re[4]: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
From: Abatron Support @ 2012-05-31  9:30 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linuxppc-dev, Dan Malek, Bob Cochran
In-Reply-To: <OF53CDE18C.59DD377E-ONC1257A0F.0031C7A0-C1257A0F.0031EA81@transmode.se>


> Abatron Support <support@abatron.ch> wrote on 2012/05/30 14:08:26:
>>
>> >> I have tested this briefly with BDI2000 on P2010(e500) and
>> >> it works for me. I don't know if there are any bad side effects,
>> >> therfore
>> >> this RFC.
>>
>> > We used to have MSR_DE surrounded by CONFIG_something
>> > to ensure it wasn't set under normal operation.  IIRC, if MSR_DE
>> > is set, you will have problems with software debuggers that
>> > utilize the the debugging registers in the chip itself.  You only want
>> > to force this to be set when using the BDI, not at other times.
>>
>> This MSR_DE is also of interest and used for software debuggers that
>> make use of the debug registers. Only if MSR_DE is set then debug
>> interrupts are generated. If a debug event leads to a debug interrupt
>> handled by a software debugger or if it leads to a debug halt handled
>> by a JTAG tool is selected with DBCR0_EDM / DBCR0_IDM.
>>
>> The "e500 Core Family Reference Manual" chapter "Chapter 8
>> Debug Support" explains in detail the effect of MSR_DE.

> So what is the verdict on this? I don't buy into Dan argument without some
> hard data.

What I tried to mention is that handling the MSR_DE correct is not only
an emulator (JTAG debugger) requirement. Also a software debugger may
depend on a correct handled MSR_DE bit.

Ruedi

^ permalink raw reply

* Re: Re[2]: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
From: Joakim Tjernlund @ 2012-05-31  9:05 UTC (permalink / raw)
  To: Support; +Cc: linuxppc-dev, Dan Malek, Bob Cochran
In-Reply-To: <10126984030.20120530140826@abatron.ch>

Abatron Support <support@abatron.ch> wrote on 2012/05/30 14:08:26:
>
> >> I have tested this briefly with BDI2000 on P2010(e500) and
> >> it works for me. I don't know if there are any bad side effects,
> >> therfore
> >> this RFC.
>
> > We used to have MSR_DE surrounded by CONFIG_something
> > to ensure it wasn't set under normal operation.  IIRC, if MSR_DE
> > is set, you will have problems with software debuggers that
> > utilize the the debugging registers in the chip itself.  You only want
> > to force this to be set when using the BDI, not at other times.
>
> This MSR_DE is also of interest and used for software debuggers that
> make use of the debug registers. Only if MSR_DE is set then debug
> interrupts are generated. If a debug event leads to a debug interrupt
> handled by a software debugger or if it leads to a debug halt handled
> by a JTAG tool is selected with DBCR0_EDM / DBCR0_IDM.
>
> The "e500 Core Family Reference Manual" chapter "Chapter 8
> Debug Support" explains in detail the effect of MSR_DE.

So what is the verdict on this? I don't buy into Dan argument without some
hard data.

 Jocke

^ permalink raw reply

* [PATCH 2/2] edac/85xx: PCI/PCIe error interrupt edac support
From: Chunhe Lan @ 2012-06-01  8:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Doug Thompson, Chunhe Lan
In-Reply-To: <1338538618-26939-1-git-send-email-Chunhe.Lan@freescale.com>

Adding pcie error interrupt edac support for mpc85xx and p4080.
mpc85xx uses the legacy interrupt report mechanism - the error
interrupts are reported directly to mpic. While, p4080 attaches
the most of error interrupts to interrupt 0. And report error
interrupts to mpic via interrupt 0. This patch can handle both
of them.

Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Doug Thompson <dougthompson@xmission.com>
---
 drivers/edac/mpc85xx_edac.c |  236 +++++++++++++++++++++++++++++++++----------
 drivers/edac/mpc85xx_edac.h |    9 ++-
 2 files changed, 191 insertions(+), 54 deletions(-)

diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 73464a6..35eef79 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -1,5 +1,6 @@
 /*
  * Freescale MPC85xx Memory Controller kenel module
+ * Copyright (c) 2012 Freescale Semiconductor, Inc.
  *
  * Author: Dave Jiang <djiang@mvista.com>
  *
@@ -21,6 +22,7 @@
 
 #include <linux/of_platform.h>
 #include <linux/of_device.h>
+#include <sysdev/fsl_pci.h>
 #include "edac_module.h"
 #include "edac_core.h"
 #include "mpc85xx_edac.h"
@@ -37,11 +39,6 @@ static u32 orig_ddr_err_sbe;
 /*
  * PCI Err defines
  */
-#ifdef CONFIG_PCI
-static u32 orig_pci_err_cap_dr;
-static u32 orig_pci_err_en;
-#endif
-
 static u32 orig_l2_err_disable;
 #ifdef CONFIG_FSL_SOC_BOOKE
 static u32 orig_hid1[2];
@@ -151,37 +148,52 @@ static void mpc85xx_pci_check(struct edac_pci_ctl_info *pci)
 {
 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
 	u32 err_detect;
+	struct ccsr_pci *reg = pdata->pci_reg;
+
+	err_detect = in_be32(&pdata->pci_reg->pex_err_dr);
+
+	if (pdata->pcie_flag) {
+		printk(KERN_ERR "PCIE error(s) detected\n");
+		printk(KERN_ERR "PCIE ERR_DR register: 0x%08x\n", err_detect);
+		printk(KERN_ERR "PCIE ERR_CAP_STAT register: 0x%08x\n",
+				in_be32(&reg->pex_err_cap_stat));
+		printk(KERN_ERR "PCIE ERR_CAP_R0 register: 0x%08x\n",
+				in_be32(&reg->pex_err_cap_r0));
+		printk(KERN_ERR "PCIE ERR_CAP_R1 register: 0x%08x\n",
+				in_be32(&reg->pex_err_cap_r1));
+		printk(KERN_ERR "PCIE ERR_CAP_R2 register: 0x%08x\n",
+				in_be32(&reg->pex_err_cap_r2));
+		printk(KERN_ERR "PCIE ERR_CAP_R3 register: 0x%08x\n",
+				in_be32(&reg->pex_err_cap_r3));
+	} else {
+		/* master aborts can happen during PCI config cycles */
+		if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT))) {
+			out_be32(&reg->pex_err_dr, err_detect);
+			return;
+		}
 
-	err_detect = in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR);
-
-	/* master aborts can happen during PCI config cycles */
-	if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT))) {
-		out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, err_detect);
-		return;
+		printk(KERN_ERR "PCI error(s) detected\n");
+		printk(KERN_ERR "PCI/X ERR_DR register: 0x%08x\n", err_detect);
+		printk(KERN_ERR "PCI/X ERR_ATTRIB register: 0x%08x\n",
+				in_be32(&reg->pex_err_attrib));
+		printk(KERN_ERR "PCI/X ERR_ADDR register: 0x%08x\n",
+				in_be32(&reg->pex_err_disr));
+		printk(KERN_ERR "PCI/X ERR_EXT_ADDR register: 0x%08x\n",
+				in_be32(&reg->pex_err_ext_addr));
+		printk(KERN_ERR "PCI/X ERR_DL register: 0x%08x\n",
+				in_be32(&reg->pex_err_dl));
+		printk(KERN_ERR "PCI/X ERR_DH register: 0x%08x\n",
+				in_be32(&reg->pex_err_dh));
+
+		if (err_detect & PCI_EDE_PERR_MASK)
+			edac_pci_handle_pe(pci, pci->ctl_name);
+
+		if (err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_PERR_MASK))
+			edac_pci_handle_npe(pci, pci->ctl_name);
 	}
 
-	printk(KERN_ERR "PCI error(s) detected\n");
-	printk(KERN_ERR "PCI/X ERR_DR register: %#08x\n", err_detect);
-
-	printk(KERN_ERR "PCI/X ERR_ATTRIB register: %#08x\n",
-	       in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_ATTRIB));
-	printk(KERN_ERR "PCI/X ERR_ADDR register: %#08x\n",
-	       in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_ADDR));
-	printk(KERN_ERR "PCI/X ERR_EXT_ADDR register: %#08x\n",
-	       in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_EXT_ADDR));
-	printk(KERN_ERR "PCI/X ERR_DL register: %#08x\n",
-	       in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DL));
-	printk(KERN_ERR "PCI/X ERR_DH register: %#08x\n",
-	       in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DH));
-
 	/* clear error bits */
-	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, err_detect);
-
-	if (err_detect & PCI_EDE_PERR_MASK)
-		edac_pci_handle_pe(pci, pci->ctl_name);
-
-	if ((err_detect & ~PCI_EDE_MULTI_ERR) & ~PCI_EDE_PERR_MASK)
-		edac_pci_handle_npe(pci, pci->ctl_name);
+	out_be32(&reg->pex_err_dr, err_detect);
 }
 
 static irqreturn_t mpc85xx_pci_isr(int irq, void *dev_id)
@@ -190,7 +202,7 @@ static irqreturn_t mpc85xx_pci_isr(int irq, void *dev_id)
 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
 	u32 err_detect;
 
-	err_detect = in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR);
+	err_detect = in_be32(&pdata->pci_reg->pex_err_dr);
 
 	if (!err_detect)
 		return IRQ_NONE;
@@ -200,11 +212,104 @@ static irqreturn_t mpc85xx_pci_isr(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+/*
+ * This function is for error interrupt ORed mechanism.
+ * This mechanism attaches most functions' error interrupts to interrupt 0.
+ * And report error interrupt to mpic via interrupt 0.
+ * EIMR0 - Error Interrupt Mask Register 0.
+ *
+ * This function check whether the device support error interrupt ORed
+ * mechanism via device tree. If supported, umask pcie error interrupt
+ * bit in EIMR0.
+ */
+static int mpc85xx_err_int_en(struct platform_device *op)
+{
+	u32 *int_cell;
+	struct device_node *np;
+	void __iomem *mpic_base;
+	u32 reg_tmp;
+	u32 int_len;
+	struct resource r;
+	int res;
+
+	if (!op->dev.of_node)
+		return -EINVAL;
+
+	/*
+	 * Unmask pcie error interrupt bit in EIMR0.
+	 * Extend interrupt specifier has 4 cells.
+	 * For the 3rd cell:
+	 *	0 -- normal interrupt;
+	 *	1 -- error interrupt.
+	 */
+	int_cell = (u32 *)of_get_property(op->dev.of_node, "interrupts",
+								&int_len);
+	if ((int_len/sizeof(u32)) == 4) {
+		/* soc has error interrupt integration handling mechanism */
+		if (*(int_cell + 2) == 1) {
+			np = of_find_node_by_type(NULL, "open-pic");
+
+			if (of_address_to_resource(np, 0, &r)) {
+				printk(KERN_ERR "%s: Failed to map mpic regs\n",
+						__func__);
+				of_node_put(np);
+				res = -ENOMEM;
+				goto err;
+			}
+
+			if (!request_mem_region(r.start, r.end - r.start + 1,
+						"mpic")) {
+				printk(KERN_ERR
+				       "%s: Error when requesting mem region\n",
+				       __func__);
+				res = -EBUSY;
+				goto err;
+			}
+
+			mpic_base = ioremap(r.start, r.end - r.start + 1);
+			if (!mpic_base) {
+				printk(KERN_ERR "%s: Unable to map mpic regs\n",
+						__func__);
+				res = -ENOMEM;
+				goto err_ioremap;
+			}
+
+			reg_tmp = in_be32(mpic_base + MPC85XX_MPIC_EIMR0);
+			out_be32(mpic_base + MPC85XX_MPIC_EIMR0,
+				reg_tmp & ~(1 << (31 - *(int_cell + 3))));
+			iounmap(mpic_base);
+			release_mem_region(r.start, r.end - r.start + 1);
+			of_node_put(np);
+		}
+	}
+
+	return 0;
+
+err_ioremap:
+	release_mem_region(r.start, r.end - r.start + 1);
+err:
+	return res;
+}
+
+static int mpc85xx_pcie_find_capability(struct device_node *np)
+{
+	struct pci_controller *hose;
+
+	if (!np)
+		return -EINVAL;
+
+	hose = pci_find_hose_for_OF_device(np);
+
+	return early_find_capability(hose, hose->bus->number, 0,
+				     PCI_CAP_ID_EXP);
+}
+
 static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 {
 	struct edac_pci_ctl_info *pci;
 	struct mpc85xx_pci_pdata *pdata;
 	struct resource r;
+	struct ccsr_pci *reg;
 	int res = 0;
 
 	if (!devres_open_group(&op->dev, mpc85xx_pci_err_probe, GFP_KERNEL))
@@ -223,6 +328,9 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	pci->ctl_name = pdata->name;
 	pci->dev_name = dev_name(&op->dev);
 
+	if (mpc85xx_pcie_find_capability(op->dev.of_node) > 0)
+		pdata->pcie_flag = 1;
+
 	if (edac_op_state == EDAC_OPSTATE_POLL)
 		pci->edac_check = mpc85xx_pci_check;
 
@@ -234,10 +342,6 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 		       "PCI err regs\n", __func__);
 		goto err;
 	}
-
-	/* we only need the error registers */
-	r.start += 0xe00;
-
 	if (!devm_request_mem_region(&op->dev, r.start, resource_size(&r),
 					pdata->name)) {
 		printk(KERN_ERR "%s: Error while requesting mem region\n",
@@ -246,26 +350,32 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 		goto err;
 	}
 
-	pdata->pci_vbase = devm_ioremap(&op->dev, r.start, resource_size(&r));
-	if (!pdata->pci_vbase) {
+	pdata->pci_reg = devm_ioremap(&op->dev, r.start, resource_size(&r));
+	if (!pdata->pci_reg) {
 		printk(KERN_ERR "%s: Unable to setup PCI err regs\n", __func__);
 		res = -ENOMEM;
 		goto err;
 	}
 
-	orig_pci_err_cap_dr =
-	    in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR);
-
-	/* PCI master abort is expected during config cycles */
-	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR, 0x40);
+	if (mpc85xx_err_int_en(op) < 0)
+		goto err;
 
-	orig_pci_err_en = in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_EN);
+	reg = pdata->pci_reg;
+	/* disable pci/pcie error detect */
+	if (pdata->pcie_flag) {
+		pdata->orig_pci_err_dr =  in_be32(&reg->pex_err_disr);
+		out_be32(&reg->pex_err_disr, ~0);
+	} else {
+		pdata->orig_pci_err_dr =  in_be32(&reg->pex_err_cap_dr);
+		out_be32(&reg->pex_err_cap_dr, ~0);
+	}
 
-	/* disable master abort reporting */
-	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_EN, ~0x40);
+	/* disable all pcie error interrupt */
+	pdata->orig_pci_err_en = in_be32(&reg->pex_err_en);
+	out_be32(&reg->pex_err_en, 0);
 
-	/* clear error bits */
-	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, ~0);
+	/* clear all error bits */
+	out_be32(&reg->pex_err_dr, ~0);
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
 		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
@@ -275,7 +385,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		pdata->irq = irq_of_parse_and_map(op->dev.of_node, 0);
 		res = devm_request_irq(&op->dev, pdata->irq,
-				       mpc85xx_pci_isr, IRQF_DISABLED,
+				       mpc85xx_pci_isr, IRQF_SHARED,
 				       "[EDAC] PCI err", pci);
 		if (res < 0) {
 			printk(KERN_ERR
@@ -290,6 +400,17 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 		       pdata->irq);
 	}
 
+	if (pdata->pcie_flag) {
+		/* enable all pcie error interrupt & error detect */
+		out_be32(&reg->pex_err_en, ~0);
+		out_be32(&reg->pex_err_disr, 0);
+	} else {
+		/* PCI master abort is expected during config cycles */
+		out_be32(&reg->pex_err_cap_dr, PCI_ERR_CAP_DR_DIS_MST);
+		/* disable master abort reporting */
+		out_be32(&reg->pex_err_en, PCI_ERR_EN_DIS_MST);
+	}
+
 	devres_remove_group(&op->dev, mpc85xx_pci_err_probe);
 	debugf3("%s(): success\n", __func__);
 	printk(KERN_INFO EDAC_MOD_STR " PCI err registered\n");
@@ -311,10 +432,13 @@ static int mpc85xx_pci_err_remove(struct platform_device *op)
 
 	debugf0("%s()\n", __func__);
 
-	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR,
-		 orig_pci_err_cap_dr);
+	if (pdata->pcie_flag)
+		out_be32(&pdata->pci_reg->pex_err_disr, pdata->orig_pci_err_dr);
+	else
+		out_be32(&pdata->pci_reg->pex_err_cap_dr,
+						pdata->orig_pci_err_dr);
 
-	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_EN, orig_pci_err_en);
+	out_be32(&pdata->pci_reg->pex_err_en, pdata->orig_pci_err_en);
 
 	edac_pci_del_device(pci->dev);
 
@@ -333,6 +457,12 @@ static struct of_device_id mpc85xx_pci_err_of_match[] = {
 	{
 	 .compatible = "fsl,mpc8540-pci",
 	},
+	{
+	.compatible = "fsl,mpc8548-pcie",
+	},
+	{
+	 .compatible = "fsl,p4080-pcie",
+	},
 	{},
 };
 MODULE_DEVICE_TABLE(of, mpc85xx_pci_err_of_match);
diff --git a/drivers/edac/mpc85xx_edac.h b/drivers/edac/mpc85xx_edac.h
index 8ba4152..262004d 100644
--- a/drivers/edac/mpc85xx_edac.h
+++ b/drivers/edac/mpc85xx_edac.h
@@ -133,6 +133,10 @@
 #define PCI_EDE_PERR_MASK	(PCI_EDE_TGT_PERR | PCI_EDE_MST_PERR | \
 				PCI_EDE_ADDR_PERR)
 
+#define PCI_ERR_CAP_DR_DIS_MST		0x40
+#define PCI_ERR_EN_DIS_MST		(~PCI_ERR_CAP_DR_DIS_MST)
+#define MPC85XX_MPIC_EIMR0		0x3910
+
 struct mpc85xx_mc_pdata {
 	char *name;
 	int edac_idx;
@@ -149,8 +153,11 @@ struct mpc85xx_l2_pdata {
 
 struct mpc85xx_pci_pdata {
 	char *name;
+	u8 pcie_flag;
 	int edac_idx;
-	void __iomem *pci_vbase;
+	struct ccsr_pci *pci_reg;
+	u32 orig_pci_err_dr;
+	u32 orig_pci_err_en;
 	int irq;
 };
 
-- 
1.5.6.5

^ permalink raw reply related

* [PATCH 1/2] edac: Use ccsr_pci structure instead of hardcoded define
From: Chunhe Lan @ 2012-06-01  8:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Doug Thompson, Chunhe Lan

There are some differences of register offset and definition between
pci and pcie error management registers. While, some other pci/pcie
error management registers are nearly the same.

To merge pci and pcie edac code into one, it is easier to use ccsr_pci
structure than the hardcoded define. So remove the hardcoded define and
add pci/pcie error management register in ccsr_pci structure.

Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Doug Thompson <dougthompson@xmission.com>
---
 arch/powerpc/sysdev/fsl_pci.h |   46 +++++++++++++++++++++++++++++++++-------
 drivers/edac/mpc85xx_edac.h   |   13 +---------
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_pci.h b/arch/powerpc/sysdev/fsl_pci.h
index a39ed5c..5378a47 100644
--- a/arch/powerpc/sysdev/fsl_pci.h
+++ b/arch/powerpc/sysdev/fsl_pci.h
@@ -1,7 +1,7 @@
 /*
  * MPC85xx/86xx PCI Express structure define
  *
- * Copyright 2007,2011 Freescale Semiconductor, Inc
+ * Copyright 2007,2011,2012 Freescale Semiconductor, Inc
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the
@@ -14,6 +14,8 @@
 #ifndef __POWERPC_FSL_PCI_H
 #define __POWERPC_FSL_PCI_H
 
+#include <asm/pci-bridge.h>
+
 #define PCIE_LTSSM	0x0404		/* PCIE Link Training and Status */
 #define PCIE_LTSSM_L0	0x16		/* L0 state */
 #define PIWAR_EN		0x80000000	/* Enable */
@@ -74,13 +76,41 @@ struct ccsr_pci {
  */
 	struct pci_inbound_window_regs piw[4];
 
-	__be32	pex_err_dr;		/* 0x.e00 - PCI/PCIE error detect register */
-	u8	res21[4];
-	__be32	pex_err_en;		/* 0x.e08 - PCI/PCIE error interrupt enable register */
-	u8	res22[4];
-	__be32	pex_err_disr;		/* 0x.e10 - PCI/PCIE error disable register */
-	u8	res23[12];
-	__be32	pex_err_cap_stat;	/* 0x.e20 - PCI/PCIE error capture status register */
+/* Merge PCI Express/PCI error management registers */
+	__be32	pex_err_dr;	  /* 0x.e00
+				   * - PCI/PCIE error detect register
+				   */
+	__be32	pex_err_cap_dr;	  /* 0x.e04
+				   * - PCI error capture disabled register
+				   * - PCIE has no this register
+				   */
+	__be32	pex_err_en;	  /* 0x.e08
+				   * - PCI/PCIE error interrupt enable register
+				   */
+	__be32	pex_err_attrib;	  /* 0x.e0c
+				   * - PCI error attributes capture register
+				   * - PCIE has no this register
+				   */
+	__be32	pex_err_disr;	  /* 0x.e10
+				   * - PCI error address capture register
+				   * - PCIE error disable register
+				   */
+	__be32	pex_err_ext_addr; /* 0x.e14
+				   * - PCI error extended addr capture register
+				   * - PCIE has no this register
+				   */
+	__be32	pex_err_dl;	  /* 0x.e18
+				   * - PCI error data low capture register
+				   * - PCIE has no this register
+				   */
+	__be32	pex_err_dh;	  /* 0x.e1c
+				   * - PCI error data high capture register
+				   * - PCIE has no this register
+				   */
+	__be32	pex_err_cap_stat; /* 0x.e20
+				   * - PCI gasket timer register
+				   * - PCIE error capture status register
+				   */
 	u8	res24[4];
 	__be32	pex_err_cap_r0;		/* 0x.e28 - PCIE error capture register 0 */
 	__be32	pex_err_cap_r1;		/* 0x.e2c - PCIE error capture register 0 */
diff --git a/drivers/edac/mpc85xx_edac.h b/drivers/edac/mpc85xx_edac.h
index 932016f..8ba4152 100644
--- a/drivers/edac/mpc85xx_edac.h
+++ b/drivers/edac/mpc85xx_edac.h
@@ -1,5 +1,7 @@
 /*
  * Freescale MPC85xx Memory Controller kenel module
+ * Copyright (c) 2012 Freescale Semiconductor, Inc.
+ *
  * Author: Dave Jiang <djiang@mvista.com>
  *
  * 2006-2007 (c) MontaVista Software, Inc. This file is licensed under
@@ -131,17 +133,6 @@
 #define PCI_EDE_PERR_MASK	(PCI_EDE_TGT_PERR | PCI_EDE_MST_PERR | \
 				PCI_EDE_ADDR_PERR)
 
-#define MPC85XX_PCI_ERR_DR		0x0000
-#define MPC85XX_PCI_ERR_CAP_DR		0x0004
-#define MPC85XX_PCI_ERR_EN		0x0008
-#define MPC85XX_PCI_ERR_ATTRIB		0x000c
-#define MPC85XX_PCI_ERR_ADDR		0x0010
-#define MPC85XX_PCI_ERR_EXT_ADDR	0x0014
-#define MPC85XX_PCI_ERR_DL		0x0018
-#define MPC85XX_PCI_ERR_DH		0x001c
-#define MPC85XX_PCI_GAS_TIMR		0x0020
-#define MPC85XX_PCI_PCIX_TIMR		0x0024
-
 struct mpc85xx_mc_pdata {
 	char *name;
 	int edac_idx;
-- 
1.5.6.5

^ permalink raw reply related

* RE: kernel panic during kernel module load (powerpc specific part)
From: Wrobel Heinz-R39252 @ 2012-05-31  7:04 UTC (permalink / raw)
  To: Michael Ellerman, Steffen Rumler; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1338420242.27402.2.camel@concordia>

Michael,

> On Wed, 2012-05-30 at 16:33 +0200, Steffen Rumler wrote:
> > I've found the following root cause:
> >
> >      (6) Unfortunately, the trampoline code (do_plt_call()) is using
> register r11 to setup the jump.
> >            It looks like the prologue and epilogue are using also the
> register r11, in order to point to the previous stack frame.
> >            This is a conflict !!! The trampoline code is damaging the
> content of r11.
>=20
> Hi Steffen,
>=20
> Great bug report!
>=20
> I can't quite work out what the standards say, the versions I'm looking
> at are probably old anyway.

The ABI supplement from https://www.power.org/resources/downloads/ is expli=
cit about r11 being a requirement for the statically lined save/restore fun=
ctions in section 3.3.4 on page 59.

This means that any trampoline code must not ever use r11 or it can't be us=
ed to get to such save/restore functions safely from far away.

Unfortunately the same doc and predecessors show r11 in all basic examples =
for PLT/trampoline code AFAICS, which is likely why all trampoline code use=
s r11 in any known case.

I would guess that it was never envisioned that compiler generated code wou=
ld be in a different section than save/restore functions, i.e., the Linux m=
odule "__init" assumptions for Power break the ABI. Or does the ABI break t=
he __init concept?!

Using r12 in the trampoline seems to be the obvious solution for module loa=
ding.

But what about other code loading done? If, e.g., a user runs any app from =
bash it gets loaded and relocated and trampolines might get set up somehow.

Wouldn't we have to find fix ANY trampoline code generator remotely related=
 to a basic Power Architecture Linux?
Or is it a basic assumption for anything but modules that compiler generate=
d code may never ever be outside the .text section? I am not sure that woul=
d be a safe assumption.

Isn't this problem going beyond just module loading for Power Architecture =
Linux?

Regards,

Heinz

^ permalink raw reply

* [PATCH] powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch
From: Anton Blanchard @ 2012-05-31  6:22 UTC (permalink / raw)
  To: benh, paulus, michael; +Cc: linuxppc-dev


Implement a POWER7 optimised memcpy using VMX and enhanced prefetch
instructions.

This is a copy of the POWER7 optimised copy_to_user/copy_from_user
loop. Detailed implementation and performance details can be found in
commit a66086b8197d (powerpc: POWER7 optimised
copy_to_user/copy_from_user using VMX).

I noticed memcpy issues when profiling a RAID6 workload:

	.memcpy
	.async_memcpy
	.async_copy_data
	.__raid_run_ops
	.handle_stripe
	.raid5d
	.md_thread

I created a simplified testcase by building a RAID6 array with 4 1GB
ramdisks (booting with brd.rd_size=1048576):

# mdadm -CR -e 1.2 /dev/md0 --level=6 -n4 /dev/ram[0-3]

I then timed how long it took to write to the entire array:

# dd if=/dev/zero of=/dev/md0 bs=1M

Before: 892 MB/s
After:  999 MB/s

A 12% improvement.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: linux-build/arch/powerpc/lib/Makefile
===================================================================
--- linux-build.orig/arch/powerpc/lib/Makefile	2012-05-30 15:27:30.000000000 +1000
+++ linux-build/arch/powerpc/lib/Makefile	2012-05-31 09:12:27.574372864 +1000
@@ -17,7 +17,8 @@ obj-$(CONFIG_HAS_IOMEM)	+= devres.o
 obj-$(CONFIG_PPC64)	+= copypage_64.o copyuser_64.o \
 			   memcpy_64.o usercopy_64.o mem_64.o string.o \
 			   checksum_wrappers_64.o hweight_64.o \
-			   copyuser_power7.o string_64.o copypage_power7.o
+			   copyuser_power7.o string_64.o copypage_power7.o \
+			   memcpy_power7.o
 obj-$(CONFIG_XMON)	+= sstep.o ldstfp.o
 obj-$(CONFIG_KPROBES)	+= sstep.o ldstfp.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= sstep.o ldstfp.o
Index: linux-build/arch/powerpc/lib/memcpy_64.S
===================================================================
--- linux-build.orig/arch/powerpc/lib/memcpy_64.S	2012-05-30 09:39:59.000000000 +1000
+++ linux-build/arch/powerpc/lib/memcpy_64.S	2012-05-31 09:12:00.093876936 +1000
@@ -11,7 +11,11 @@
 
 	.align	7
 _GLOBAL(memcpy)
+BEGIN_FTR_SECTION
 	std	r3,48(r1)	/* save destination pointer for return value */
+FTR_SECTION_ELSE
+	b	memcpy_power7
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_VMX_COPY)
 	PPC_MTOCRF(0x01,r5)
 	cmpldi	cr1,r5,16
 	neg	r6,r3		# LS 3 bits = # bytes to 8-byte dest bdry
Index: linux-build/arch/powerpc/lib/memcpy_power7.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-build/arch/powerpc/lib/memcpy_power7.S	2012-05-31 15:28:03.495781127 +1000
@@ -0,0 +1,650 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2012
+ *
+ * Author: Anton Blanchard <anton@au.ibm.com>
+ */
+#include <asm/ppc_asm.h>
+
+#define STACKFRAMESIZE	256
+#define STK_REG(i)	(112 + ((i)-14)*8)
+
+_GLOBAL(memcpy_power7)
+#ifdef CONFIG_ALTIVEC
+	cmpldi	r5,16
+	cmpldi	cr1,r5,4096
+
+	std	r3,48(r1)
+
+	blt	.Lshort_copy
+	bgt	cr1,.Lvmx_copy
+#else
+	cmpldi	r5,16
+
+	std	r3,48(r1)
+
+	blt	.Lshort_copy
+#endif
+
+.Lnonvmx_copy:
+	/* Get the source 8B aligned */
+	neg	r6,r4
+	mtocrf	0x01,r6
+	clrldi	r6,r6,(64-3)
+
+	bf	cr7*4+3,1f
+	lbz	r0,0(r4)
+	addi	r4,r4,1
+	stb	r0,0(r3)
+	addi	r3,r3,1
+
+1:	bf	cr7*4+2,2f
+	lhz	r0,0(r4)
+	addi	r4,r4,2
+	sth	r0,0(r3)
+	addi	r3,r3,2
+
+2:	bf	cr7*4+1,3f
+	lwz	r0,0(r4)
+	addi	r4,r4,4
+	stw	r0,0(r3)
+	addi	r3,r3,4
+
+3:	sub	r5,r5,r6
+	cmpldi	r5,128
+	blt	5f
+
+	mflr	r0
+	stdu	r1,-STACKFRAMESIZE(r1)
+	std	r14,STK_REG(r14)(r1)
+	std	r15,STK_REG(r15)(r1)
+	std	r16,STK_REG(r16)(r1)
+	std	r17,STK_REG(r17)(r1)
+	std	r18,STK_REG(r18)(r1)
+	std	r19,STK_REG(r19)(r1)
+	std	r20,STK_REG(r20)(r1)
+	std	r21,STK_REG(r21)(r1)
+	std	r22,STK_REG(r22)(r1)
+	std	r0,STACKFRAMESIZE+16(r1)
+
+	srdi	r6,r5,7
+	mtctr	r6
+
+	/* Now do cacheline (128B) sized loads and stores. */
+	.align	5
+4:
+	ld	r0,0(r4)
+	ld	r6,8(r4)
+	ld	r7,16(r4)
+	ld	r8,24(r4)
+	ld	r9,32(r4)
+	ld	r10,40(r4)
+	ld	r11,48(r4)
+	ld	r12,56(r4)
+	ld	r14,64(r4)
+	ld	r15,72(r4)
+	ld	r16,80(r4)
+	ld	r17,88(r4)
+	ld	r18,96(r4)
+	ld	r19,104(r4)
+	ld	r20,112(r4)
+	ld	r21,120(r4)
+	addi	r4,r4,128
+	std	r0,0(r3)
+	std	r6,8(r3)
+	std	r7,16(r3)
+	std	r8,24(r3)
+	std	r9,32(r3)
+	std	r10,40(r3)
+	std	r11,48(r3)
+	std	r12,56(r3)
+	std	r14,64(r3)
+	std	r15,72(r3)
+	std	r16,80(r3)
+	std	r17,88(r3)
+	std	r18,96(r3)
+	std	r19,104(r3)
+	std	r20,112(r3)
+	std	r21,120(r3)
+	addi	r3,r3,128
+	bdnz	4b
+
+	clrldi	r5,r5,(64-7)
+
+	ld	r14,STK_REG(r14)(r1)
+	ld	r15,STK_REG(r15)(r1)
+	ld	r16,STK_REG(r16)(r1)
+	ld	r17,STK_REG(r17)(r1)
+	ld	r18,STK_REG(r18)(r1)
+	ld	r19,STK_REG(r19)(r1)
+	ld	r20,STK_REG(r20)(r1)
+	ld	r21,STK_REG(r21)(r1)
+	ld	r22,STK_REG(r22)(r1)
+	addi	r1,r1,STACKFRAMESIZE
+
+	/* Up to 127B to go */
+5:	srdi	r6,r5,4
+	mtocrf	0x01,r6
+
+6:	bf	cr7*4+1,7f
+	ld	r0,0(r4)
+	ld	r6,8(r4)
+	ld	r7,16(r4)
+	ld	r8,24(r4)
+	ld	r9,32(r4)
+	ld	r10,40(r4)
+	ld	r11,48(r4)
+	ld	r12,56(r4)
+	addi	r4,r4,64
+	std	r0,0(r3)
+	std	r6,8(r3)
+	std	r7,16(r3)
+	std	r8,24(r3)
+	std	r9,32(r3)
+	std	r10,40(r3)
+	std	r11,48(r3)
+	std	r12,56(r3)
+	addi	r3,r3,64
+
+	/* Up to 63B to go */
+7:	bf	cr7*4+2,8f
+	ld	r0,0(r4)
+	ld	r6,8(r4)
+	ld	r7,16(r4)
+	ld	r8,24(r4)
+	addi	r4,r4,32
+	std	r0,0(r3)
+	std	r6,8(r3)
+	std	r7,16(r3)
+	std	r8,24(r3)
+	addi	r3,r3,32
+
+	/* Up to 31B to go */
+8:	bf	cr7*4+3,9f
+	ld	r0,0(r4)
+	ld	r6,8(r4)
+	addi	r4,r4,16
+	std	r0,0(r3)
+	std	r6,8(r3)
+	addi	r3,r3,16
+
+9:	clrldi	r5,r5,(64-4)
+
+	/* Up to 15B to go */
+.Lshort_copy:
+	mtocrf	0x01,r5
+	bf	cr7*4+0,12f
+	lwz	r0,0(r4)	/* Less chance of a reject with word ops */
+	lwz	r6,4(r4)
+	addi	r4,r4,8
+	stw	r0,0(r3)
+	stw	r6,4(r3)
+	addi	r3,r3,8
+
+12:	bf	cr7*4+1,13f
+	lwz	r0,0(r4)
+	addi	r4,r4,4
+	stw	r0,0(r3)
+	addi	r3,r3,4
+
+13:	bf	cr7*4+2,14f
+	lhz	r0,0(r4)
+	addi	r4,r4,2
+	sth	r0,0(r3)
+	addi	r3,r3,2
+
+14:	bf	cr7*4+3,15f
+	lbz	r0,0(r4)
+	stb	r0,0(r3)
+
+15:	ld	r3,48(r1)
+	blr
+
+.Lunwind_stack_nonvmx_copy:
+	addi	r1,r1,STACKFRAMESIZE
+	b	.Lnonvmx_copy
+
+#ifdef CONFIG_ALTIVEC
+.Lvmx_copy:
+	mflr	r0
+	std	r4,56(r1)
+	std	r5,64(r1)
+	std	r0,16(r1)
+	stdu	r1,-STACKFRAMESIZE(r1)
+	bl	.enter_vmx_copy
+	cmpwi	r3,0
+	ld	r0,STACKFRAMESIZE+16(r1)
+	ld	r3,STACKFRAMESIZE+48(r1)
+	ld	r4,STACKFRAMESIZE+56(r1)
+	ld	r5,STACKFRAMESIZE+64(r1)
+	mtlr	r0
+
+	/*
+	 * We prefetch both the source and destination using enhanced touch
+	 * instructions. We use a stream ID of 0 for the load side and
+	 * 1 for the store side.
+	 */
+	clrrdi	r6,r4,7
+	clrrdi	r9,r3,7
+	ori	r9,r9,1		/* stream=1 */
+
+	srdi	r7,r5,7		/* length in cachelines, capped at 0x3FF */
+	cmpldi	cr1,r7,0x3FF
+	ble	cr1,1f
+	li	r7,0x3FF
+1:	lis	r0,0x0E00	/* depth=7 */
+	sldi	r7,r7,7
+	or	r7,r7,r0
+	ori	r10,r7,1	/* stream=1 */
+
+	lis	r8,0x8000	/* GO=1 */
+	clrldi	r8,r8,32
+
+.machine push
+.machine "power4"
+	dcbt	r0,r6,0b01000
+	dcbt	r0,r7,0b01010
+	dcbtst	r0,r9,0b01000
+	dcbtst	r0,r10,0b01010
+	eieio
+	dcbt	r0,r8,0b01010	/* GO */
+.machine pop
+
+	beq	.Lunwind_stack_nonvmx_copy
+
+	/*
+	 * If source and destination are not relatively aligned we use a
+	 * slower permute loop.
+	 */
+	xor	r6,r4,r3
+	rldicl.	r6,r6,0,(64-4)
+	bne	.Lvmx_unaligned_copy
+
+	/* Get the destination 16B aligned */
+	neg	r6,r3
+	mtocrf	0x01,r6
+	clrldi	r6,r6,(64-4)
+
+	bf	cr7*4+3,1f
+	lbz	r0,0(r4)
+	addi	r4,r4,1
+	stb	r0,0(r3)
+	addi	r3,r3,1
+
+1:	bf	cr7*4+2,2f
+	lhz	r0,0(r4)
+	addi	r4,r4,2
+	sth	r0,0(r3)
+	addi	r3,r3,2
+
+2:	bf	cr7*4+1,3f
+	lwz	r0,0(r4)
+	addi	r4,r4,4
+	stw	r0,0(r3)
+	addi	r3,r3,4
+
+3:	bf	cr7*4+0,4f
+	ld	r0,0(r4)
+	addi	r4,r4,8
+	std	r0,0(r3)
+	addi	r3,r3,8
+
+4:	sub	r5,r5,r6
+
+	/* Get the desination 128B aligned */
+	neg	r6,r3
+	srdi	r7,r6,4
+	mtocrf	0x01,r7
+	clrldi	r6,r6,(64-7)
+
+	li	r9,16
+	li	r10,32
+	li	r11,48
+
+	bf	cr7*4+3,5f
+	lvx	vr1,r0,r4
+	addi	r4,r4,16
+	stvx	vr1,r0,r3
+	addi	r3,r3,16
+
+5:	bf	cr7*4+2,6f
+	lvx	vr1,r0,r4
+	lvx	vr0,r4,r9
+	addi	r4,r4,32
+	stvx	vr1,r0,r3
+	stvx	vr0,r3,r9
+	addi	r3,r3,32
+
+6:	bf	cr7*4+1,7f
+	lvx	vr3,r0,r4
+	lvx	vr2,r4,r9
+	lvx	vr1,r4,r10
+	lvx	vr0,r4,r11
+	addi	r4,r4,64
+	stvx	vr3,r0,r3
+	stvx	vr2,r3,r9
+	stvx	vr1,r3,r10
+	stvx	vr0,r3,r11
+	addi	r3,r3,64
+
+7:	sub	r5,r5,r6
+	srdi	r6,r5,7
+
+	std	r14,STK_REG(r14)(r1)
+	std	r15,STK_REG(r15)(r1)
+	std	r16,STK_REG(r16)(r1)
+
+	li	r12,64
+	li	r14,80
+	li	r15,96
+	li	r16,112
+
+	mtctr	r6
+
+	/*
+	 * Now do cacheline sized loads and stores. By this stage the
+	 * cacheline stores are also cacheline aligned.
+	 */
+	.align	5
+8:
+	lvx	vr7,r0,r4
+	lvx	vr6,r4,r9
+	lvx	vr5,r4,r10
+	lvx	vr4,r4,r11
+	lvx	vr3,r4,r12
+	lvx	vr2,r4,r14
+	lvx	vr1,r4,r15
+	lvx	vr0,r4,r16
+	addi	r4,r4,128
+	stvx	vr7,r0,r3
+	stvx	vr6,r3,r9
+	stvx	vr5,r3,r10
+	stvx	vr4,r3,r11
+	stvx	vr3,r3,r12
+	stvx	vr2,r3,r14
+	stvx	vr1,r3,r15
+	stvx	vr0,r3,r16
+	addi	r3,r3,128
+	bdnz	8b
+
+	ld	r14,STK_REG(r14)(r1)
+	ld	r15,STK_REG(r15)(r1)
+	ld	r16,STK_REG(r16)(r1)
+
+	/* Up to 127B to go */
+	clrldi	r5,r5,(64-7)
+	srdi	r6,r5,4
+	mtocrf	0x01,r6
+
+	bf	cr7*4+1,9f
+	lvx	vr3,r0,r4
+	lvx	vr2,r4,r9
+	lvx	vr1,r4,r10
+	lvx	vr0,r4,r11
+	addi	r4,r4,64
+	stvx	vr3,r0,r3
+	stvx	vr2,r3,r9
+	stvx	vr1,r3,r10
+	stvx	vr0,r3,r11
+	addi	r3,r3,64
+
+9:	bf	cr7*4+2,10f
+	lvx	vr1,r0,r4
+	lvx	vr0,r4,r9
+	addi	r4,r4,32
+	stvx	vr1,r0,r3
+	stvx	vr0,r3,r9
+	addi	r3,r3,32
+
+10:	bf	cr7*4+3,11f
+	lvx	vr1,r0,r4
+	addi	r4,r4,16
+	stvx	vr1,r0,r3
+	addi	r3,r3,16
+
+	/* Up to 15B to go */
+11:	clrldi	r5,r5,(64-4)
+	mtocrf	0x01,r5
+	bf	cr7*4+0,12f
+	ld	r0,0(r4)
+	addi	r4,r4,8
+	std	r0,0(r3)
+	addi	r3,r3,8
+
+12:	bf	cr7*4+1,13f
+	lwz	r0,0(r4)
+	addi	r4,r4,4
+	stw	r0,0(r3)
+	addi	r3,r3,4
+
+13:	bf	cr7*4+2,14f
+	lhz	r0,0(r4)
+	addi	r4,r4,2
+	sth	r0,0(r3)
+	addi	r3,r3,2
+
+14:	bf	cr7*4+3,15f
+	lbz	r0,0(r4)
+	stb	r0,0(r3)
+
+15:	addi	r1,r1,STACKFRAMESIZE
+	ld	r3,48(r1)
+	b	.exit_vmx_copy		/* tail call optimise */
+
+.Lvmx_unaligned_copy:
+	/* Get the destination 16B aligned */
+	neg	r6,r3
+	mtocrf	0x01,r6
+	clrldi	r6,r6,(64-4)
+
+	bf	cr7*4+3,1f
+	lbz	r0,0(r4)
+	addi	r4,r4,1
+	stb	r0,0(r3)
+	addi	r3,r3,1
+
+1:	bf	cr7*4+2,2f
+	lhz	r0,0(r4)
+	addi	r4,r4,2
+	sth	r0,0(r3)
+	addi	r3,r3,2
+
+2:	bf	cr7*4+1,3f
+	lwz	r0,0(r4)
+	addi	r4,r4,4
+	stw	r0,0(r3)
+	addi	r3,r3,4
+
+3:	bf	cr7*4+0,4f
+	lwz	r0,0(r4)	/* Less chance of a reject with word ops */
+	lwz	r7,4(r4)
+	addi	r4,r4,8
+	stw	r0,0(r3)
+	stw	r7,4(r3)
+	addi	r3,r3,8
+
+4:	sub	r5,r5,r6
+
+	/* Get the desination 128B aligned */
+	neg	r6,r3
+	srdi	r7,r6,4
+	mtocrf	0x01,r7
+	clrldi	r6,r6,(64-7)
+
+	li	r9,16
+	li	r10,32
+	li	r11,48
+
+	lvsl	vr16,0,r4	/* Setup permute control vector */
+	lvx	vr0,0,r4
+	addi	r4,r4,16
+
+	bf	cr7*4+3,5f
+	lvx	vr1,r0,r4
+	vperm	vr8,vr0,vr1,vr16
+	addi	r4,r4,16
+	stvx	vr8,r0,r3
+	addi	r3,r3,16
+	vor	vr0,vr1,vr1
+
+5:	bf	cr7*4+2,6f
+	lvx	vr1,r0,r4
+	vperm	vr8,vr0,vr1,vr16
+	lvx	vr0,r4,r9
+	vperm	vr9,vr1,vr0,vr16
+	addi	r4,r4,32
+	stvx	vr8,r0,r3
+	stvx	vr9,r3,r9
+	addi	r3,r3,32
+
+6:	bf	cr7*4+1,7f
+	lvx	vr3,r0,r4
+	vperm	vr8,vr0,vr3,vr16
+	lvx	vr2,r4,r9
+	vperm	vr9,vr3,vr2,vr16
+	lvx	vr1,r4,r10
+	vperm	vr10,vr2,vr1,vr16
+	lvx	vr0,r4,r11
+	vperm	vr11,vr1,vr0,vr16
+	addi	r4,r4,64
+	stvx	vr8,r0,r3
+	stvx	vr9,r3,r9
+	stvx	vr10,r3,r10
+	stvx	vr11,r3,r11
+	addi	r3,r3,64
+
+7:	sub	r5,r5,r6
+	srdi	r6,r5,7
+
+	std	r14,STK_REG(r14)(r1)
+	std	r15,STK_REG(r15)(r1)
+	std	r16,STK_REG(r16)(r1)
+
+	li	r12,64
+	li	r14,80
+	li	r15,96
+	li	r16,112
+
+	mtctr	r6
+
+	/*
+	 * Now do cacheline sized loads and stores. By this stage the
+	 * cacheline stores are also cacheline aligned.
+	 */
+	.align	5
+8:
+	lvx	vr7,r0,r4
+	vperm	vr8,vr0,vr7,vr16
+	lvx	vr6,r4,r9
+	vperm	vr9,vr7,vr6,vr16
+	lvx	vr5,r4,r10
+	vperm	vr10,vr6,vr5,vr16
+	lvx	vr4,r4,r11
+	vperm	vr11,vr5,vr4,vr16
+	lvx	vr3,r4,r12
+	vperm	vr12,vr4,vr3,vr16
+	lvx	vr2,r4,r14
+	vperm	vr13,vr3,vr2,vr16
+	lvx	vr1,r4,r15
+	vperm	vr14,vr2,vr1,vr16
+	lvx	vr0,r4,r16
+	vperm	vr15,vr1,vr0,vr16
+	addi	r4,r4,128
+	stvx	vr8,r0,r3
+	stvx	vr9,r3,r9
+	stvx	vr10,r3,r10
+	stvx	vr11,r3,r11
+	stvx	vr12,r3,r12
+	stvx	vr13,r3,r14
+	stvx	vr14,r3,r15
+	stvx	vr15,r3,r16
+	addi	r3,r3,128
+	bdnz	8b
+
+	ld	r14,STK_REG(r14)(r1)
+	ld	r15,STK_REG(r15)(r1)
+	ld	r16,STK_REG(r16)(r1)
+
+	/* Up to 127B to go */
+	clrldi	r5,r5,(64-7)
+	srdi	r6,r5,4
+	mtocrf	0x01,r6
+
+	bf	cr7*4+1,9f
+	lvx	vr3,r0,r4
+	vperm	vr8,vr0,vr3,vr16
+	lvx	vr2,r4,r9
+	vperm	vr9,vr3,vr2,vr16
+	lvx	vr1,r4,r10
+	vperm	vr10,vr2,vr1,vr16
+	lvx	vr0,r4,r11
+	vperm	vr11,vr1,vr0,vr16
+	addi	r4,r4,64
+	stvx	vr8,r0,r3
+	stvx	vr9,r3,r9
+	stvx	vr10,r3,r10
+	stvx	vr11,r3,r11
+	addi	r3,r3,64
+
+9:	bf	cr7*4+2,10f
+	lvx	vr1,r0,r4
+	vperm	vr8,vr0,vr1,vr16
+	lvx	vr0,r4,r9
+	vperm	vr9,vr1,vr0,vr16
+	addi	r4,r4,32
+	stvx	vr8,r0,r3
+	stvx	vr9,r3,r9
+	addi	r3,r3,32
+
+10:	bf	cr7*4+3,11f
+	lvx	vr1,r0,r4
+	vperm	vr8,vr0,vr1,vr16
+	addi	r4,r4,16
+	stvx	vr8,r0,r3
+	addi	r3,r3,16
+
+	/* Up to 15B to go */
+11:	clrldi	r5,r5,(64-4)
+	addi	r4,r4,-16	/* Unwind the +16 load offset */
+	mtocrf	0x01,r5
+	bf	cr7*4+0,12f
+	lwz	r0,0(r4)	/* Less chance of a reject with word ops */
+	lwz	r6,4(r4)
+	addi	r4,r4,8
+	stw	r0,0(r3)
+	stw	r6,4(r3)
+	addi	r3,r3,8
+
+12:	bf	cr7*4+1,13f
+	lwz	r0,0(r4)
+	addi	r4,r4,4
+	stw	r0,0(r3)
+	addi	r3,r3,4
+
+13:	bf	cr7*4+2,14f
+	lhz	r0,0(r4)
+	addi	r4,r4,2
+	sth	r0,0(r3)
+	addi	r3,r3,2
+
+14:	bf	cr7*4+3,15f
+	lbz	r0,0(r4)
+	stb	r0,0(r3)
+
+15:	addi	r1,r1,STACKFRAMESIZE
+	ld	r3,48(r1)
+	b	.exit_vmx_copy		/* tail call optimise */
+#endif /* CONFiG_ALTIVEC */

^ permalink raw reply

* [PATCH] powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_user
From: Anton Blanchard @ 2012-05-31  6:19 UTC (permalink / raw)
  To: benh, paulus, michael, amodra; +Cc: linuxppc-dev
In-Reply-To: <20120529181432.2038c1a3@kryten>


Version 2.06 of the POWER ISA introduced enhanced touch instructions,
allowing us to specify a number of attributes including the length of
a stream.

This patch adds a software stream for both loads and stores in the
POWER7 copy_tofrom_user loop. Since the setup is quite complicated
and we have to use an eieio to ensure correct ordering of the "GO"
command we only do this for copies above 4kB.

To quantify any performance improvements we need a working set
bigger than the caches so we operate on a 1GB file:

# dd if=/dev/zero of=/tmp/foo bs=1M count=1024

And we compare how fast we can read the file:

# dd if=/tmp/foo of=/dev/null bs=1M

before: 7.7 GB/s
after:  9.6 GB/s

A 25% improvement.

The worst case for this patch will be a completely L1 cache contained
copy of just over 4kB. We can test this with the copy_to_user
testcase we used to tune copy_tofrom_user originally:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

# time ./copy_to_user2 -l 4224 -i 10000000

before: 6.807 s
after:  6.946 s

A 2% slowdown, which seems reasonable considering our data is unlikely
to be completely L1 contained.

Signed-off-by: Anton Blanchard <anton@samba.org>  
---

v2: Use cr1 in the comparison so we don't corrupt the compare/branch
to select vmx vs non vmx loops.

Index: linux-build/arch/powerpc/lib/copyuser_power7.S
===================================================================
--- linux-build.orig/arch/powerpc/lib/copyuser_power7.S	2012-05-29 21:22:40.445551834 +1000
+++ linux-build/arch/powerpc/lib/copyuser_power7.S	2012-05-31 15:28:35.336354208 +1000
@@ -298,6 +298,37 @@ err1;	stb	r0,0(r3)
 	ld	r5,STACKFRAMESIZE+64(r1)
 	mtlr	r0
 
+	/*
+	 * We prefetch both the source and destination using enhanced touch
+	 * instructions. We use a stream ID of 0 for the load side and
+	 * 1 for the store side.
+	 */
+	clrrdi	r6,r4,7
+	clrrdi	r9,r3,7
+	ori	r9,r9,1		/* stream=1 */
+
+	srdi	r7,r5,7		/* length in cachelines, capped at 0x3FF */
+	cmpldi	cr1,r7,0x3FF
+	ble	cr1,1f
+	li	r7,0x3FF
+1:	lis	r0,0x0E00	/* depth=7 */
+	sldi	r7,r7,7
+	or	r7,r7,r0
+	ori	r10,r7,1	/* stream=1 */
+
+	lis	r8,0x8000	/* GO=1 */
+	clrldi	r8,r8,32
+
+.machine push
+.machine "power4"
+	dcbt	r0,r6,0b01000
+	dcbt	r0,r7,0b01010
+	dcbtst	r0,r9,0b01000
+	dcbtst	r0,r10,0b01010
+	eieio
+	dcbt	r0,r8,0b01010	/* GO */
+.machine pop
+
 	beq	.Lunwind_stack_nonvmx_copy
 
 	/*

^ permalink raw reply

* [PATCH] powerpc/pci: cleanup on duplicate assignment
From: Gavin Shan @ 2012-05-31  6:17 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: michael, Gavin Shan

While creating the PCI root bus through function pci_create_root_bus()
of PCI core, it should have assigned the secondary bus number for the
newly created PCI root bus. Thus we needn't do the explicit assignment
for the secondary bus number again in pcibios_scan_phb().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/pci-common.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 8e78e93..0f75bd5 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1646,7 +1646,6 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
 		pci_free_resource_list(&resources);
 		return;
 	}
-	bus->secondary = hose->first_busno;
 	hose->bus = bus;
 
 	/* Get probe mode and perform scan */
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH] powerpc/mm: dereference OF node "/chosen"
From: Gavin Shan @ 2012-05-31  3:07 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: michael, Gavin Shan

The form affinity for NUMA is set to 1 if the firmware supports
OPAL. Otherwise, we have to retrieve that from OF node "/chosen".
For the latter case, OF node "/chosen" was referred without
dereferencing.

The patch dereference OF node "/chosen" if necessary.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/mm/numa.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b6edbb3..5ca3a15 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -340,6 +340,8 @@ static int __init find_min_common_depth(void)
 				dbg("Using form 1 affinity\n");
 				form1_affinity = 1;
 			}
+
+			of_node_put(chosen);
 		}
 	}
 
-- 
1.7.9.5

^ permalink raw reply related

* Re: kernel panic during kernel module load (powerpc specific part)
From: Michael Ellerman @ 2012-05-30 23:24 UTC (permalink / raw)
  To: Steffen Rumler; +Cc: linuxppc-dev
In-Reply-To: <4FC62FB9.8010701@nsn.com>

On Wed, 2012-05-30 at 16:33 +0200, Steffen Rumler wrote:
> Hi,
> 
> The system crashes inside the return of the init entry point of the kernel module.
> 
> I've found the following root cause:
> 
>      (6) Unfortunately, the trampoline code (do_plt_call()) is using register r11 to setup the jump.
>            It looks like the prologue and epilogue are using also the register r11, in order to point to the previous stack frame.
>            This is a conflict !!! The trampoline code is damaging the content of r11.

Hi Steffen,

Great bug report!

I can't quite work out what the standards say, the versions I'm looking
at are probably old anyway.

Have you tried the obvious fix?

cheers


diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index 0b6d796..989d79a 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -205,9 +205,9 @@ static uint32_t do_plt_call(void *location,
        }
 
        /* Stolen from Paul Mackerras as well... */
-       entry->jump[0] = 0x3d600000+((val+0x8000)>>16); /* lis r11,sym@ha */
-       entry->jump[1] = 0x396b0000 + (val&0xffff);     /* addi r11,r11,sym@l*/
-       entry->jump[2] = 0x7d6903a6;                    /* mtctr r11 */
+       entry->jump[0] = 0x3d800000+((val+0x8000)>>16); /* lis r12,sym@ha */
+       entry->jump[1] = 0x398c0000 + (val&0xffff);     /* addi r12,r12,sym@l*/
+       entry->jump[2] = 0x7d8903a6;                    /* mtctr r12 */
        entry->jump[3] = 0x4e800420;                    /* bctr */
 
        DEBUGP("Initialized plt for 0x%x at %p\n", val, entry);

^ permalink raw reply related

* Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
From: Anthony Foiani @ 2012-05-30 20:52 UTC (permalink / raw)
  To: Scott Wood
  Cc: Robert P.J.Day, linuxppc-dev@lists.ozlabs.org, Li Yang-R58472,
	Jeff Garzik, Adrian Bunk
In-Reply-To: <4FC68109.3030607@freescale.com>

Scott Wood <scottwood@freescale.com> writes:

> We currently support building one kernel that supports a bunch of
> different boards.  The hardcoding of this workaround was harmless so
> far because it was conditional on a symbol that was never defined,
> but now you'll be enabling this workaround on any kernel that simply
> has support for mpc8315erdb.  That is not acceptable unless you show
> it's harmless on all those other boards.

Ok, I see your point now.  Sorry for being dense.

At the moment, I'm building a kernel that is only going to run on this
particular board, so the kconfig solution works *for me*.

Unfortunately, I'm not sure I can help develop a more generic
solution.  I can't reliably reproduce the problem, so I can't even
offer to help test for it.

Even more unfortunately, I don't currently have the bandwidth to do
any more investigation or experimenting with the devtree option (as
much as I would like to!).  At this point in my project, I probably
can't even justify trying to switch to a more current kernel, so I
couldn't try out a new release regardless.

Sorry I can't be more help.

Thanks again,
Tony

^ permalink raw reply

* Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
From: Scott Wood @ 2012-05-30 20:20 UTC (permalink / raw)
  To: Anthony Foiani
  Cc: Robert P.J.Day, linuxppc-dev@lists.ozlabs.org, Li Yang-R58472,
	Jeff Garzik, Adrian Bunk
In-Reply-To: <gd35lbf55.fsf@dworkin.scrye.com>

On 05/30/2012 03:14 PM, Anthony Foiani wrote:
> Scott Wood <scottwood@freescale.com> writes:
> 
>> Board information is available from the device tree, and from
>> platform code that was selected based on the device tree.
> 
> You're right, of course; I was focusing on discovery/probing, and
> completely forgot about "provided information".
> 
> However, as I just mentioned in my reply to Yang, I'm pretty happy
> with the kconfig solution (Adrian's patch, basically).
> 
> If we find that this is a more widespread problem, we can revisit this
> discussion; but if only a handful of us have encountered this in a
> 5-year-old design, then I don't think it's worth the extra effort of
> making it dynamic.

We currently support building one kernel that supports a bunch of
different boards.  The hardcoding of this workaround was harmless so far
because it was conditional on a symbol that was never defined, but now
you'll be enabling this workaround on any kernel that simply has support
for mpc8315erdb.  That is not acceptable unless you show it's harmless
on all those other boards.

-Scott

^ permalink raw reply

* Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
From: Anthony Foiani @ 2012-05-30 20:14 UTC (permalink / raw)
  To: Scott Wood
  Cc: Robert P.J.Day, linuxppc-dev@lists.ozlabs.org, Li Yang-R58472,
	Jeff Garzik, Adrian Bunk
In-Reply-To: <4FC5546A.4090506@freescale.com>

Scott Wood <scottwood@freescale.com> writes:

> Board information is available from the device tree, and from
> platform code that was selected based on the device tree.

You're right, of course; I was focusing on discovery/probing, and
completely forgot about "provided information".

However, as I just mentioned in my reply to Yang, I'm pretty happy
with the kconfig solution (Adrian's patch, basically).

If we find that this is a more widespread problem, we can revisit this
discussion; but if only a handful of us have encountered this in a
5-year-old design, then I don't think it's worth the extra effort of
making it dynamic.

Maybe someone who knows devtree really well could crank that out in a
few minutes... but I'm not that person.  :)

Regardless, thanks very much for helping out on this.

I do advocate that Adrian's patch get put into place, so that we don't
have undocumented / unconnected kconfig symbols in the tree.  If we
ever do find out more details about the workaround, we can at least
add some comments at the code site.

Thanks again!

Best regards,
Anthony Foiani

^ permalink raw reply

* Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
From: Anthony Foiani @ 2012-05-30 20:07 UTC (permalink / raw)
  To: Li Yang
  Cc: Li Yang-R58472, Jeff Garzik, Adrian Bunk, Scott Wood,
	Robert P.J.Day, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CADRPPNQ9Dk2OJuQQZm0WdSskW3D1mAqPq21d=h=JEf1nHszDDQ@mail.gmail.com>

Li Yang <leoli@freescale.com> writes:

> The original code was there before I touched the driver.  So
> unfortunately I also don't know the history of the problem.

Alas.

> Judging from the comment in code and current test result I guess it
> is a board related issue.

I wonder if anyone on the 8315_DS project knows where the limitation
came from, since that's the origin of the workaround...

Regardless, it's recommended by at least one vendor who based their
design on the 8315 RDB.  If it's board-related, then that seems a
reasonable conditional.

> I agree with Anthony that the best action for now is to remove the
> workaround completely.

Eeek.

I'm pretty sure that it needs to stay.  (I can't guarantee that it has
fixed my problem, but it's been a week or two without the hang, so I'm
becoming more confident).

I think the question is how to best conditionalize it.  The options
seem to be:

  1. at compile time, via kconfig bits;
  2. at runtime via probing / discovery; or
  3. at runtime via device tree.

Given that this is a relatively old platform, and only 2-3 of us have
run into this issue in 5 years, I'm inclined to just go with option 1.

That's exactly what Adrian's patch (from 2008!) does:

  http://old.nabble.com/-2.6-patch--sata_fsl.c%3A-fix-8315DS-workaround-td18807647.html

Using CONFIG_831x_RDB seems like a reasonable choice.

Anyway.  To be clear, my project is currently in good shape (by
adopting Adrian's patch) so I don't have any actual urgency for fixing
this.

I was hoping that someone might know the "correct" answer offhand, but
I honestly think that this isn't worth spending too much time on.
(But I do think that Adrian's patch is an improvement over the current
state of affairs.)

Thanks again to everyone that's chimed in.

Best regards,
Anthony Foiani

^ permalink raw reply

* MPC8315 PCI express lockup
From: David Laight @ 2012-05-30 15:10 UTC (permalink / raw)
  To: linuxppc-dev

(I apologise for this not having much to do with linux...)

We have a system with an MPC8315 ppc running linux 2.6.32
that uses the PCI express interface in RC mode to interface
to an Altera FPGA.
This uses both PIO and the PEX DMA interfaces (locally
written dma driver).
Under normal circumstances this all works fine.

However under some circumstances (eg DMA reads from
addresses that don't have actual slaves on the fpga [1])
the dma transfer requests don't complete.
There are no obvious error bits set in the hisr or csmisr
registers and the csb_status shows 'dma in progress'.
The dma transfer itself can be cancelled (by setting the
SUS bit in the dma_ctrl register), and the relevant
status bit are set to show the transfer has been aborted.

Once in this state all further PCI express transfers fail.
DMA requests timeout (driver gives up waiting for completion)
and PIO requests fault (Oops: Machine check, sig 7 [#1])
locking the kernel solid.

This looks very much like the MPC8315's errata PEX7
except that I don't see the CSMISR[RST] bit set.
I'm not at all sure the recovery for that errata is
actually writable! I'm certainly not going to write it
just to find out if it would help.
In any case it is quite likely that the driver's ISR will
try to do a PIO read while a dma transfer is timing out.

We can look at the fpga side and possibly find out
what it is doing, but it would be useful to know more
about the status on the ppc side.

I presume there is a way of doing a 'probe' type memory
cycle that won't panic on a fault?
Although that may not help me keep linux running as the
ISR needs to to a PCIe write to remove the level-sensitive
interrupt.

Any thought on ways to progress?

	David

[1] PIO reads are ok and just return 0xffffffff.

^ permalink raw reply

* kernel panic during kernel module load (powerpc specific part)
From: Steffen Rumler @ 2012-05-30 14:33 UTC (permalink / raw)
  To: linuxppc-dev

Hi,

We have seen the following kernel panic, happened during loading a kernel module:

     [  536.107430] Unable to handle kernel paging request for data at address 0xd76a907c
     [  536.114922] Faulting instruction address: 0xc0000770
     [  536.119891] Oops: Kernel access of bad area, sig: 11 [#1]
     [  536.125291] CCEP MPC8541E
     [  536.127908] Modules linked in: pppoe(+) nf_conntrack_ipv6 ...
     [  536.155705] NIP: c0000770 LR: c0000770 CTR: d76ab0d4
     [  536.160674] REGS: d76a8f24 TRAP: 0300   Not tainted  (2.6.33-ccep)
     [  536.166857] MSR: 00021000 <ME,CE>  CR: 24000482  XER: 20000000
     [  536.172718] DEAR: d76a907c, ESR: 00800000
     [  536.176728] TASK = cbd7f9f0[972] 'insmod' THREAD: cbeb2000
     [  536.182041] GPR00: 83cbfff8 d76a8fd4 cbd7f9f0 00000000 83cbfff8 00000000 cbeb3e1e 00000000
     [  536.190438] GPR08: 0000327b d76aa000 24000482 d76a8fd4 cbd7fc08 10019b04 100c2e5c 00000000
     [  536.198836] GPR16: 00000000 1009bafc 100c44a4 100c2ed4 100c0000 100d1e60 100d1ca8 100017ab
     [  536.207235] GPR24: 100017ae 10001936 c0343ae8 10012018 00000000 834bffe8 836bffec 838bfff0
     [  536.215819] NIP [c0000770] InstructionStorage+0xb0/0xc0
     [  536.221048] LR [c0000770] InstructionStorage+0xb0/0xc0
     [  536.226188] Call Trace:
     [  536.228630] Instruction dump:
     [  536.231600] 90eb002c 910b0030 7cbe0aa6 90ab00b8 7d846378 38a00000 39400401 914b00b0
     [  536.239386] 3d400002 614a1002 512a0420 4800d6f5 <c000e43c> c000e65c 60000000 60000000
     [  536.247348] Kernel panic - not syncing: Fatal exception in interrupt
     [  536.253704] Call Trace:
     [  536.256149] Rebooting in 10 seconds..

The system crashes inside the return of the init entry point of the kernel module.

I've found the following root cause:

     (1) The system has a high number of NAT rules configured, which created a bigger vmalloc area.
           I've checked this by looking at /proc/vmallocinfo.

     (2) The kernel module ELF file contains the separate section .init.text for the init entry point,
           which is marked with __init, as usual.

     (3) The kernel module ELF file contains the function prologue and epilogue in the .text section.

     (4) The epilogue is also called from the init entry point, in order to return to the caller.
           It is intended to restore the non-volatile registers from the stack and to jump to the caller.

     (5) Because of (1), it is not more possible to jump by a relative branch instruction. The distance is too big.
           Instead, the trampoline method is applied, which allows longer jumps via register.
           (please see see do_plt_call() in arch/powerpc/kernel/module_32.c)

     (6) Unfortunately, the trampoline code (do_plt_call()) is using register r11 to setup the jump.
           It looks like the prologue and epilogue are using also the register r11, in order to point to the previous stack frame.
           This is a conflict !!! The trampoline code is damaging the content of r11.

According to the current EABI definitions, the register r11 has got a dedicated function (pointer to previous stack frame).
In the following, there are parts of the prologue/epilogue shown, which are generated by the compiler:

    ...
    00000084 <_rest32gpr_28>:
          84:       83 8b ff f0     lwz     r28,-16(r11)
    00000088 <_rest32gpr_29>:
          88:       83 ab ff f4     lwz     r29,-12(r11)
    0000008c <_rest32gpr_30>:
          8c:       83 cb ff f8     lwz     r30,-8(r11)
    00000090 <_rest32gpr_31>:
          90:       83 eb ff fc     lwz     r31,-4(r11)
          94:       4e 80 00 20     blr
    00000098 <_rest32gpr_14_x>:
          98:       81 cb ff b8     lwz     r14,-72(r11)
    ...


I'd suggest to use register r12 instead of r11 in the trampoline generation code, in do_plt_call() (arch/powerpc/kernel/module_32.c).
I'm using kernel 2.6.33, but I think it is also relevant for the current kernel release.

Below, there is the complete debug sessions, showing more the details.

Thanks
Steffen Rumler

--

     0xd54990c0:     addi    r11,r1,48
     0xd54990c4:     mr      r3,r29
     0xd54990c8:     b       0xd5499100 <-- going to return from the init entry point

     (gdb) bt
     #0  0xd5499100 in ?? ()
     #1  0xd54990ac in ?? ()
     #2  0xc0001db0 in do_one_initcall (fn=0, wait=1131130) at init/main.c:719
     #3  0xc0059e50 in sys_init_module (umod=<value optimized out>, ...
     #4  0xc000e038 in syscall_dotrace_cont () at arch/powerpc/kernel/entry_32.S:331
     Backtrace stopped: frame did not save the PC
     (gdb) info reg r1
     r1             0xcbbdbed0    3418210000
     (gdb) x/2x 0xcbbdbed0
     0xcbbdbed0:    0xcbbdbf00    0xd54990ac
     (gdb) x/2x 0xcbbdbf00
     0xcbbdbf00:    0xcbbdbf20    0xc0001db0
     (gdb) info reg r11
     r11            0xcbbdbf00    3418210048

         --> the stack is OK here
         --> r11 is OK, pointing to the previous stack frame

     0xd5499100:     lis        r11,-10381 <-- this is the trampoline code using/changing r11 (do_plt_call()).
     0xd5499104:     addi    r11,r11,-3888
     0xd5499108:     mtctr  r11

     (gdb) info reg r11
     r11            0xd772f0d0    3614634192
<-- r11 is now damaged !!!

     0xd549910c:     bctr

     0xd772f0d0:     lwz     r28,-16(r11) <-- the epilogue, using damaged r11 as stack pointer
     0xd772f0d4:     lwz     r29,-12(r11)
     0xd772f0d8:     lwz     r30,-8(r11)
     0xd772f0dc:     lwz     r0,4(r11)
     0xd772f0e0:     lwz     r31,-4(r11)
     0xd772f0e4:     mtlr    r0
     0xd772f0e8:     mr      r1,r11

     (gdb) info reg lr
     lr             0x83abfff4    0x83abfff4
         0xd772f0ec:     blr
     (gdb) stepi
     Program received signal SIGSTOP, Stopped (signal).

        ---> the kernel panic !!!

^ permalink raw reply

* Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
From: Bob Cochran @ 2012-05-30 13:26 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linuxppc-dev, support
In-Reply-To: <1338363814-19565-1-git-send-email-Joakim.Tjernlund@transmode.se>

On 05/30/2012 03:43 AM, Joakim Tjernlund wrote:
> Emulators such as BDI2000 and CodeWarrior needs to have MSR_DE set
> in order to support break points.
> This adds MSR_DE for kernel space only.
> ---
>
> I have tested this briefly with BDI2000 on P2010(e500) and
> it works for me. I don't know if there are any bad side effects, therfore
> this RFC.
>
>   arch/powerpc/include/asm/reg.h       |    2 +-
>   arch/powerpc/include/asm/reg_booke.h |    2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
>
snip


I believe that additional patches are required for CodeWarrior to work 
properly (e.g., assembly start up).  I think the patches should come 
from Freescale.  For whatever reason, they include them in their SDK, 
but haven't submitted them for inclusion in the mainline.

As a developer on Freescale Power products, I would like to see 
Freescale offer up a CodeWarrior patch set, so I don't have to manage 
the patches myself when working outside the SDK (i.e., on a more recent 
kernel).

^ permalink raw reply

* Gianfar TX problems
From: Rodolfo Giometti @ 2012-05-30 12:02 UTC (permalink / raw)
  To: linuxppc-dev

Hello,

I'm working on a P1021MDS based board and I have a strange behaviour
regarding the TX packets from the gianfar driver (linux 3.0.4-rc5).

Rx is working correctly but Tx is not... in particular I noticed that
the buffer descriptors are correctly filled but the packets simply do
NOT exit form the TSEC (I looked at TSEC's tx counters)!

What is puzzling me is that sometimes both TX and RX work correctly
but most of the time only Rx is working! Again, testing the same
uImage on a Freescale P1021MDS devboard everything works well!

Maybe something related to DMA/RAM coherence?

Thanks in advance,

Rodolfo

-- 

GNU/Linux Solutions                  e-mail: giometti@enneenne.com
Linux Device Driver                          giometti@linux.it
Embedded Systems                     phone:  +39 349 2432127
UNIX programming                     skype:  rodolfo.giometti
Freelance ICT Italia - Consulente ICT Italia - www.consulenti-ict.it

^ permalink raw reply

* Re[2]: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
From: Abatron Support @ 2012-05-30 12:08 UTC (permalink / raw)
  To: Dan Malek; +Cc: linuxppc-dev, Bob Cochran
In-Reply-To: <6F7E3816-E71B-466A-9C6F-9928E1CFD7B1@digitaldans.com>

>> I have tested this briefly with BDI2000 on P2010(e500) and
>> it works for me. I don't know if there are any bad side effects,  
>> therfore
>> this RFC.

> We used to have MSR_DE surrounded by CONFIG_something
> to ensure it wasn't set under normal operation.  IIRC, if MSR_DE
> is set, you will have problems with software debuggers that
> utilize the the debugging registers in the chip itself.  You only want
> to force this to be set when using the BDI, not at other times.

This MSR_DE is also of interest and used for software debuggers that
make use of the debug registers. Only if MSR_DE is set then debug
interrupts are generated. If a debug event leads to a debug interrupt
handled by a software debugger or if it leads to a debug halt handled
by a JTAG tool is selected with DBCR0_EDM / DBCR0_IDM.

The "e500 Core Family Reference Manual" chapter "Chapter 8
Debug Support" explains in detail the effect of MSR_DE.

Ruedi

^ permalink raw reply

* Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
From: Li Yang @ 2012-05-30 10:59 UTC (permalink / raw)
  To: Scott Wood
  Cc: Li Yang-R58472, Jeff Garzik, Adrian Bunk, Anthony Foiani,
	Robert P.J.Day, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4FC5546A.4090506@freescale.com>

On Wed, May 30, 2012 at 6:57 AM, Scott Wood <scottwood@freescale.com> wrote=
:
> On 05/29/2012 05:07 PM, Anthony Foiani wrote:
>> Scott Wood <scottwood@freescale.com> writes:
>>
>>> CONFIG_MPC831x_RDB doesn't mean that you're running on such a board,
>>> only that the kernel supports those boards. =C2=A0It should be a runtim=
e
>>> test.
>>
>> Point taken.
>>
>> If that SATA check is CPU/SOC-based, then it should be easy enough to
>> test. =C2=A0The cpuinfo for my board is:
>>
>> =C2=A0 # cat /proc/cpuinfo
>> =C2=A0 processor =C2=A0 =C2=A0 =C2=A0 : 0
>> =C2=A0 cpu =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 : e300c3
>> =C2=A0 clock =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 : 266.666664MHz
>> =C2=A0 revision =C2=A0 =C2=A0 =C2=A0 =C2=A0: 2.0 (pvr 8085 0020)
>> =C2=A0 bogomips =C2=A0 =C2=A0 =C2=A0 =C2=A0: 66.66
>> =C2=A0 timebase =C2=A0 =C2=A0 =C2=A0 =C2=A0: 33333333
>>
>> On the other hand, if the problem is actually caused by board trace
>> routing (or other hardware that's outside the control of the CPU/SOC),
>> then I don't know how possible a runtime check will be.
>
> Board information is available from the device tree, and from platform
> code that was selected based on the device tree.
>
>> Do you know if there is a specific errata that the MPC8315_DS ran
>> across that required this fix, or was it a band-aid in the first
>> place?
>
> I don't know the history of this, sorry. =C2=A0It looks like Yang Li adde=
d
> this code -- Yang, can you answer this?

The original code was there before I touched the driver.  So
unfortunately I also don't know the history of the problem.  Judging
from the comment in code and current test result I guess it is a board
related issue.  I agree with Anthony that the best action for now is
to remove the workaround completely.

Leo

^ permalink raw reply

* RE: pread() and pwrite() system calls
From: David Laight @ 2012-05-30 10:56 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: linuxppc-dev
In-Reply-To: <20120525164550.GA32406@visitor2.iram.es>

> > We have a system with linux 2.6.32 and the somewhat archaic
> > uClibc 0.9.27 (but I'm not sure the current version is
> > any better, and I think there are binary compatibility
> > if we update).
> >=20
> > I've just discovered that pread() is 'implemented'
> > by using 3 lseek() system calls and read().
> > (the same is true for the 64bit versions).
...
> I think that it is an uClibc problem.

It seems that uClibc hadn't been changed when the
names of the constants used for the system calls
were changed from __NR_pread to __NR_pread64.

	David

^ permalink raw reply

* [RFC PATCH powerpc] make CONFIG_NUMA depends on CONFIG_SMP
From: Li Zhong @ 2012-05-30  9:31 UTC (permalink / raw)
  Cc: Paul Mackerras, PowerPC email list

I'm not sure whether it makes sense to add this dependency to avoid
CONFI_NUMA && !CONFIG_SMP. 

I want to do this because I saw some build errors on next-tree when
compiling with CONFIG_SMP disabled, and it seems they are caused by some
codes under the CONFIG_NUMA #ifdefs.  

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 050cb37..b2aa74b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -394,7 +394,7 @@ config IRQ_ALL_CPUS
 
 config NUMA
 	bool "NUMA support"
-	depends on PPC64
+	depends on PPC64 && SMP
 	default y if SMP && PPC_PSERIES
 
 config NODES_SHIFT
-- 
1.7.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox