LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [stable 3.1] [PATCH] powerpc/kvm: Fix build failure with HV KVM and CBE
From: Michael Neuling @ 2011-11-10  0:04 UTC (permalink / raw)
  To: stable; +Cc: Alexander Graf, linuxppc-dev

From: Alexander Graf <agraf@suse.de>

powerpc/kvm: Fix build failure with HV KVM and CBE

When running with HV KVM and CBE config options enabled, I get
build failures like the following:

  arch/powerpc/kernel/head_64.o: In function `cbe_system_error_hv':
  (.text+0x1228): undefined reference to `do_kvm_0x1202'
  arch/powerpc/kernel/head_64.o: In function `cbe_maintenance_hv':
  (.text+0x1628): undefined reference to `do_kvm_0x1602'
  arch/powerpc/kernel/head_64.o: In function `cbe_thermal_hv':
  (.text+0x1828): undefined reference to `do_kvm_0x1802'

This is because we jump to a KVM handler when HV is enabled, but we
only generate the handler with PR KVM mode.

Upstream commit 5ccf55dd8177295813b68780f0a3c85e47306be1

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: stable@kernel.org (3.1 only)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index a54d92f..cf9c69b 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -267,7 +267,7 @@ vsx_unavailable_pSeries_1:
 
 #ifdef CONFIG_CBE_RAS
 	STD_EXCEPTION_HV(0x1200, 0x1202, cbe_system_error)
-	KVM_HANDLER_PR_SKIP(PACA_EXGEN, EXC_HV, 0x1202)
+	KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0x1202)
 #endif /* CONFIG_CBE_RAS */
 
 	STD_EXCEPTION_PSERIES(0x1300, 0x1300, instruction_breakpoint)
@@ -275,7 +275,7 @@ vsx_unavailable_pSeries_1:
 
 #ifdef CONFIG_CBE_RAS
 	STD_EXCEPTION_HV(0x1600, 0x1602, cbe_maintenance)
-	KVM_HANDLER_PR_SKIP(PACA_EXGEN, EXC_HV, 0x1602)
+	KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0x1602)
 #endif /* CONFIG_CBE_RAS */
 
 	STD_EXCEPTION_PSERIES(0x1700, 0x1700, altivec_assist)
@@ -283,7 +283,7 @@ vsx_unavailable_pSeries_1:
 
 #ifdef CONFIG_CBE_RAS
 	STD_EXCEPTION_HV(0x1800, 0x1802, cbe_thermal)
-	KVM_HANDLER_PR_SKIP(PACA_EXGEN, EXC_HV, 0x1802)
+	KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0x1802)
 #endif /* CONFIG_CBE_RAS */
 
 	. = 0x3000

^ permalink raw reply related

* [stable 3.1] [PATCH] KVM: PPC: Assemble book3s{, _hv}_rmhandlers.S separately
From: Michael Neuling @ 2011-11-10  0:04 UTC (permalink / raw)
  To: stable; +Cc: linuxppc-dev, Paul Mackerras, Alexander Graf

From: Paul Mackerras <paulus@samba.org>

KVM: PPC: Assemble book3s{,_hv}_rmhandlers.S separately

This makes arch/powerpc/kvm/book3s_rmhandlers.S and
arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as
separate compilation units rather than having them #included in
arch/powerpc/kernel/exceptions-64s.S.  We no longer have any
conditional branches between the exception prologs in
exceptions-64s.S and the KVM handlers, so there is no need to
keep their contents close together in the vmlinux image.

In their current location, they are using up part of the limited
space between the first-level interrupt handlers and the firmware
NMI data area at offset 0x7000, and with some kernel configurations
this area will overflow (e.g. allyesconfig), leading to an
"attempt to .org backwards" error when compiling exceptions-64s.S.

Moving them out requires that we add some #includes that the
book3s_{,hv_}rmhandlers.S code was previously getting implicitly
via exceptions-64s.S.

Upstream commit 177339d7f7c99a25ecfdb6baeea6a2508fb2349f

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: stable@kernel.org (3.1 only)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 41b02c7..29ddd8b 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -427,16 +427,6 @@ slb_miss_user_pseries:
 	b	.				/* prevent spec. execution */
 #endif /* __DISABLED__ */
 
-/* KVM's trampoline code needs to be close to the interrupt handlers */
-
-#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
-#ifdef CONFIG_KVM_BOOK3S_PR
-#include "../kvm/book3s_rmhandlers.S"
-#else
-#include "../kvm/book3s_hv_rmhandlers.S"
-#endif
-#endif
-
 	.align	7
 	.globl	__end_interrupts
 __end_interrupts:
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 4c66d51..3688aee 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -50,12 +50,15 @@ kvm-book3s_64-objs-$(CONFIG_KVM_BOOK3S_64_PR) := \
 	book3s_64_mmu_host.o \
 	book3s_64_mmu.o \
 	book3s_32_mmu.o
+kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_PR) := \
+	book3s_rmhandlers.o
 
 kvm-book3s_64-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
 	book3s_hv.o \
 	book3s_hv_interrupts.o \
 	book3s_64_mmu_hv.o
 kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
+	book3s_hv_rmhandlers.o \
 	book3s_hv_rm_mmu.o \
 	book3s_64_vio_hv.o \
 	book3s_hv_builtin.o
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index de29501..bc6ade9 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -20,7 +20,10 @@
 #include <asm/ppc_asm.h>
 #include <asm/kvm_asm.h>
 #include <asm/reg.h>
+#include <asm/mmu.h>
 #include <asm/page.h>
+#include <asm/ptrace.h>
+#include <asm/hvcall.h>
 #include <asm/asm-offsets.h>
 #include <asm/exception-64s.h>
 
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
index c1f877c..5ee66ed 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -20,6 +20,7 @@
 #include <asm/ppc_asm.h>
 #include <asm/kvm_asm.h>
 #include <asm/reg.h>
+#include <asm/mmu.h>
 #include <asm/page.h>
 #include <asm/asm-offsets.h>
 
@@ -39,6 +40,7 @@
 #define MSR_NOIRQ		MSR_KERNEL & ~(MSR_IR | MSR_DR)
 #define FUNC(name) 		GLUE(.,name)
 
+	.globl	kvmppc_skip_interrupt
 kvmppc_skip_interrupt:
 	/*
 	 * Here all GPRs are unchanged from when the interrupt happened
@@ -51,6 +53,7 @@ kvmppc_skip_interrupt:
 	rfid
 	b	.
 
+	.globl	kvmppc_skip_Hinterrupt
 kvmppc_skip_Hinterrupt:
 	/*
 	 * Here all GPRs are unchanged from when the interrupt happened

^ permalink raw reply related

* [RFC PATCH 00/17] powerpc/e500: separate e500 from e500mc
From: Kyle Moffett @ 2011-11-10  0:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Scott Wood, Paul Gortmaker, Timur Tabi, linux-kernel
In-Reply-To: <4E42AB6F.1050900@freescale.com>

Hello,

I saw Baruch Siach's patch:
  powerpc: 85xx: separate e500 from e500mc

Unfortunately, that patch breaks the dependencies for the P5020DS
platform and does not fix the underlying code which does not
understand what the ambiguous "CONFIG_E500" means.

In order to fix the issue at the fundamental level, I created the
following 17-patch series loosely based on Baruch's patch.

=== High-Level Summary ===

The e500v1/v2 and e500mc/e5500 CPU families are not compatible with
each other, yet they share the same "CONFIG_E500" Kconfig option.

The following patch series splits the 32-bit CPU support into two
separate options: "CONFIG_FSL_E500_V1_V2" and "CONFIG_FSL_E500MC".
Additionally, the 64-bit e5500 support is separated to its own config
option ("CONFIG_FSL_E5500") which is automatically combined with
either 32-bit e500MC or 64-bit Book-3E when the P5020DS board support
is enabled.

I based the patches on v3.2-rc1, please let me know if I should
update the patches against a different tree.

The first 4 patches stand on their own merits; they are generic code
cleanups necessary to support the later patches.

I'd like to know what you all think.

Cheers,
Kyle Moffett

--
Curious about my work on the Debian powerpcspe port?
I'm keeping a blog here: http://pureperl.blogspot.com/

^ permalink raw reply

* Re: [PATCH] net: fsl_pq_mdio: fix oops when using uninitialized mutex
From: Andy Fleming @ 2011-11-09 20:10 UTC (permalink / raw)
  To: Baruch Siach; +Cc: netdev, linuxppc-dev, Andy Fleming
In-Reply-To: <59b050a97a9b5382918b66f2850a80c86e52f409.1320736936.git.baruch@tkos.co.il>

> Fix this by moving the of_mdiobus_register() call earlier.
>
> Cc: Andy Fleming <afleming@freescale.com>
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>
> ---
> =A0drivers/net/ethernet/freescale/fsl_pq_mdio.c | =A0 14 +++++++-------
> =A01 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/fsl_pq_mdio.c b/drivers/net/e=
thernet/freescale/fsl_pq_mdio.c
> index 52f4e8a..e17fd2f 100644
> --- a/drivers/net/ethernet/freescale/fsl_pq_mdio.c
> +++ b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
> @@ -385,6 +385,13 @@ static int fsl_pq_mdio_probe(struct platform_device =
*ofdev)
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tbiaddr =3D *prop;
> =A0 =A0 =A0 =A0}
>
> + =A0 =A0 =A0 err =3D of_mdiobus_register(new_bus, np);
> + =A0 =A0 =A0 if (err) {
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 printk (KERN_ERR "%s: Cannot register as MD=
IO bus\n",
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 new_bus->na=
me);
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto err_free_irqs;
> + =A0 =A0 =A0 }
> +


This fix totally breaks the point of setting tbipa beforehand.
mdiobus_register will cause the bus to be scanned, and if any of the
PHYs are at the default address for tbipa, they won't be found. I have
a different fix which I will (re)submit today.


> =A0 =A0 =A0 =A0if (tbiaddr =3D=3D -1) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0out_be32(tbipa, 0);


Andy

^ permalink raw reply

* RE: [PATCH v14 03/10] USB/ppc4xx: Add Synopsys DWC OTG Core Interface Layer (CIL)
From: Tirumala Marri @ 2011-11-09 18:02 UTC (permalink / raw)
  To: Pratyush Anand; +Cc: greg, linux-usb, Mark Miesfeld, linuxppc-dev, Fushen Chen
In-Reply-To: <CAHM4w1=OiZA4S4TBE_Hgx3yBptcWdtLu2iQiy17wb2=NaSjwZw@mail.gmail.com>

>-----Original Message-----
>From: linuxppc-dev-bounces+tmarri=amcc.com@lists.ozlabs.org
>[mailto:linuxppc-dev-bounces+tmarri=amcc.com@lists.ozlabs.org] On Behalf
>Of Pratyush Anand
>Sent: Thursday, October 20, 2011 2:12 AM
>To: tmarri@apm.com
>Cc: Mark Miesfeld; greg@kroah.com; linux-usb@vger.kernel.org; linuxppc-
>dev@lists.ozlabs.org; Fushen Chen
>Subject: Re: [PATCH v14 03/10] USB/ppc4xx: Add Synopsys DWC OTG Core
>Interface Layer (CIL)
>
>On Fri, Oct 7, 2011 at 8:00 AM,  <tmarri@apm.com> wrote:
>> From: Tirumala Marri <tmarri@apm.com>
>>
>
>[...]
>
>> + * Do core a soft reset of the core.  Be careful with this because it
>> + * resets all the internal state machines of the core.
>> + */
>> +static void dwc_otg_core_reset(struct core_if *core_if)
>> +{
>> +       ulong global_regs = core_if->core_global_regs;
>> +       u32 greset = 0;
>> +       int count = 0;
>> +
>> +       /* Wait for AHB master IDLE state. */
>> +       do {
>> +               udelay(10);
>> +               greset = dwc_reg_read(global_regs, DWC_GRSTCTL);
>> +               if (++count > 100000) {
>> +                       pr_warning("%s() HANG! AHB Idle
>GRSTCTL=%0x\n",
>> +                                  __func__, greset);
>> +                       return;
>> +               }
>> +       } while (greset & DWC_RSTCTL_AHB_IDLE);
>
>As per sepcs:
>Bit 31:  AHB Master Idle (AHBIdle): Indicates that the AHB Master State
>Machine is in the IDLE condition.
>So when this bit is 1 , AHB would be idle. So, what do you want here?
>If AHB is idle, control wil return from above loop rather going ahead to
>execute
>this function.


What we are trying to do is waiting for the idle before we reset the core.
We want to make sure state machine is settle down and all the transactions
Are finished before reset.
>
>> +               nptxsize =
>> +                   DWC_RX_FIFO_START_ADDR_WR(nptxsize,
>> +                                             params-
>>dev_rx_fifo_size);
>> +               dwc_reg_write(regs, DWC_GNPTXFSIZ, nptxsize);
>> +
>> +               ptxsize = DWC_RX_FIFO_START_ADDR_WR(ptxsize,
>> +
>(DWC_RX_FIFO_START_ADDR_RD
>> +                                                    (nptxsize) +
>> +
>DWC_RX_FIFO_DEPTH_RD
>> +                                                    (nptxsize)));
>> +               for (i = 0;
>> +                    i < DWC_HWCFG4_NUM_DEV_PERIO_IN_EP_RD(core_if-
>>hwcfg4);
>> +                    i++) {
>
>
>Reading hwcfg4 will give you maximum nuber of fifos provided by
>hardware.
>But, a particular application (platform) may not need all those fifo.
>Since, you
>have a vriable fifo_num in your param list, so why not use that?
>
I am not sure about that. I think it should come from device tree
Based on how board is going to be configured.

^ permalink raw reply

* Re: [PATCH 5/7] fsl_pmc: update device bindings
From: Scott Wood @ 2011-11-09 17:15 UTC (permalink / raw)
  To: Li Yang-R58472
  Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org,
	Zhao Chenhui-B35336
In-Reply-To: <3F607A5180246847A760FD34122A1E052BD6F1@039-SN1MPN1-004.039d.mgd.msft.net>

On Wed, Nov 09, 2011 at 04:59:16AM -0600, Li Yang-R58472 wrote:
> >>> Do we have any chips that have ethernet controller support for wake
> >>> on user-defined packet, but a sleep mode that doesn't let it be used?
> >>
> >> I think it is more a PMC feature cause Ethernet controller on many
> >> SoCs have the filer feature, but only very limited SoCs can support
> >> using it as a wake-up source.  The differences should be mostly in the
> >> PM controller block.
> >
> >Have you tried waking from sleep with it on those other SoCs?
> 
> We have tried and it's not working.

OK.

> AFAIKT, the purpose of defining the clock binding is used to describe
> the topology of clocks in a system.  And then used for runtime control
> of enabling/disabling clocks or getting/changing clock frequencies.
>
> But in this PM case, we just set the wakeup capability 

Where "wakeup capability" means deciding whether to turn off the clock
during a low-power state.  The basic information you need from the device
tree is the same -- a connection from here to there.

> and provide little runtime control of the clocks for on-chip devices.

My original intent for the binding you replaced was that it could be used
for other clock management as well -- you could use it to describe
DEVDISR or 83xx SCCR mappings as well.  It was when I sent that binding
out that I recall being asked to use the clock binding.

That said, Grant has recently said he's unhappy with the current state of
the clock binding, so I'm not sure what the right thing to do here is.

>  And it doesn't get any benefit of using this binding.  If your concern
> is the confusion with the already existing clock binding, we can remove
> the clk in the name of the PM binding.

My concern, besides unnecessary duplication of ways to express something,
is a loss of generality.

> If we are to use the clock binding, I think adding the CSB clock, DDR
> clock, core clock, and etc with this binding is more appropriate.

That would be another use of it, but one doesn't need to imply the other.

-Scott

^ permalink raw reply

* Re: [PATCH 4/7] powerpc/85xx: add support to JOG feature using cpufreq interface
From: Scott Wood @ 2011-11-09 16:13 UTC (permalink / raw)
  To: Zhao Chenhui; +Cc: Jerry Huang, linuxppc-dev
In-Reply-To: <20111109113813.GB3303@localhost.localdomain>

On Wed, Nov 09, 2011 at 07:38:13PM +0800, Zhao Chenhui wrote:
> On Mon, Nov 07, 2011 at 12:50:24PM -0600, Scott Wood wrote:
> > On 11/07/2011 04:27 AM, Zhao Chenhui wrote:
> > > There is only one ppc_proc_freq. no lock.
> > 
> > I realize there's only one.
> > 
> > I'm asking whether CPUs can have their frequencies set indpendently --
> > if the answer is no, and this function is not specific to a CPU, my only
> > concern is the lock.  Either this function can be called multiple times
> > in parallel, in which case the ppc_proc_freq update should be inside the
> > lock, or it can't, in which case why do we need the lock at all?
> > 
> > -Scott
> 
> Yes. They can be changed independently.
> I will set ppc_proc_freq inside the lock.

If they can be changed independently, what does the global ppc_proc_freq
mean?

-Scott

^ permalink raw reply

* Re: [PATCH 2/7] powerpc/85xx: add HOTPLUG_CPU support
From: Scott Wood @ 2011-11-09 16:12 UTC (permalink / raw)
  To: Li Yang-R58472
  Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org,
	Zhao Chenhui-B35336
In-Reply-To: <3F607A5180246847A760FD34122A1E052BD4A4@039-SN1MPN1-004.039d.mgd.msft.net>

On Wed, Nov 09, 2011 at 02:31:23AM -0600, Li Yang-R58472 wrote:
> >-----Original Message-----
> >From: Wood Scott-B07421
> >Sent: Wednesday, November 09, 2011 4:58 AM
> >To: Li Yang-R58472
> >Cc: Wood Scott-B07421; Zhao Chenhui-B35336; linuxppc-dev@lists.ozlabs.org
> >Subject: Re: [PATCH 2/7] powerpc/85xx: add HOTPLUG_CPU support
> >
> >On 11/08/2011 04:05 AM, Li Yang-R58472 wrote:
> >>> Subject: Re: [PATCH 2/7] powerpc/85xx: add HOTPLUG_CPU support
> >>>
> >>>> +	flush_disable_L1();
> >>>
> >>> You'll also need to take down L2 on e500mc.
> >>
> >> Right.  E500mc support is beyond this patch series.  We will work on it
> >after the e500v2 support is finalized.
> >
> >I saw some support with "is_corenet"...  If we don't support e500mc, make
> >sure the code doesn't try to run on e500mc.
> 
> The is_corenet code is just a start of the e500mc support and is far from complete.
> 
> >
> >This isn't an e500v2-only BSP you're putting the code into. :-)
> 
> Yes, I know.  But it will take quite some time to perfect the support
> for different type of cores. 

That's fine, I'm just asking for it to be clearer that e500mc/corenet is TODO,
and ideally for the code to reject hotplug if is_corenet is set, until it's
supported.

> >>>> +static int __cpuinit smp_85xx_map_bootpg(u32 page) {
> >>>> +	__iomem u32 *bootpg_ptr;
> >>>> +	u32 bptr;
> >>>> +
> >>>> +	/* Get the BPTR */
> >>>> +	bootpg_ptr = ioremap(get_immrbase() + MPC85xx_BPTR_OFF, 4);
> >>>> +
> >>>> +	/* Set the BPTR to the secondary boot page */
> >>>> +	bptr = MPC85xx_BPTR_EN | (page & MPC85xx_BPTR_BOOT_PAGE_MASK);
> >>>> +	out_be32(bootpg_ptr, bptr);
> >>>> +
> >>>> +	iounmap(bootpg_ptr);
> >>>> +	return 0;
> >>>> +}
> >>>
> >>> Shouldn't the boot page already be set by U-Boot?
> >>
> >> The register should be lost after a deep sleep.   Better to do it again.
> >
> >How can it be lost after a deep sleep if we're relying on it to hold our
> >wakeup code?
> 
> In order to wake up from deep sleep, the boot page has been set to the
> deep sleep restoration function.  We need to set it back to the
> bootpage from u-boot.

Ah.  That deserves a code comment.

> >It's not "better to do it again" if we're making a bad assumption about
> >the code and the table being in the same page.
> 
> Currently we are reusing the whole boot page including the spin-table
> from u-boot.  Are you suggesting to move just the boot page to kernel?

Just that it might be better to make sure that we put the same value in
BPTR that was in it when Linux booted, rather than assume that the page
that holds the spin table itself is the right one.

-Scott

^ permalink raw reply

* Re: [PATCH] powerpc/p1023: set IRQ[4:6, 11] to high level sensitive for PCIe
From: Kumar Gala @ 2011-11-09 15:49 UTC (permalink / raw)
  To: Roy Zang; +Cc: linuxppc-dev
In-Reply-To: <1320654778-3294-1-git-send-email-tie-fei.zang@freescale.com>

How did you come by this information?

- k

On Nov 7, 2011, at 2:32 AM, Roy Zang wrote:

> P1023 external IRQ[4:6, 11] do not pin out, but the interrupts are
> shared with PCIe controller.
> The silicon internally ties the interrupts to L, so change the
> IRQ[4:6,11] to high level sensitive for PCIe.
>=20
> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> ---
> arch/powerpc/boot/dts/p1023rds.dts |    8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>=20
> diff --git a/arch/powerpc/boot/dts/p1023rds.dts =
b/arch/powerpc/boot/dts/p1023rds.dts
> index d9b7767..66bf804 100644
> --- a/arch/powerpc/boot/dts/p1023rds.dts
> +++ b/arch/powerpc/boot/dts/p1023rds.dts
> @@ -490,9 +490,9 @@
> 			interrupt-map-mask =3D <0xf800 0 0 7>;
> 			interrupt-map =3D <
> 				/* IDSEL 0x0 */
> -				0000 0 0 1 &mpic 4 1
> -				0000 0 0 2 &mpic 5 1
> -				0000 0 0 3 &mpic 6 1
> +				0000 0 0 1 &mpic 4 2
> +				0000 0 0 2 &mpic 5 2
> +				0000 0 0 3 &mpic 6 2
> 				0000 0 0 4 &mpic 7 1
> 				>;
> 			ranges =3D <0x2000000 0x0 0xa0000000
> @@ -532,7 +532,7 @@
> 				0000 0 0 1 &mpic 8 1
> 				0000 0 0 2 &mpic 9 1
> 				0000 0 0 3 &mpic 10 1
> -				0000 0 0 4 &mpic 11 1
> +				0000 0 0 4 &mpic 11 2
> 				>;
> 			ranges =3D <0x2000000 0x0 0x80000000
> 				  0x2000000 0x0 0x80000000
> --=20
> 1.6.0.6
>=20

^ permalink raw reply

* Re: [PATCH] powerpc: Export PIR data through sysfs
From: Scott Wood @ 2011-11-09 15:48 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli; +Cc: linuxppc-dev, Anton Blanchard, mahesh
In-Reply-To: <20111109044124.GA10961@in.ibm.com>

On Wed, Nov 09, 2011 at 10:11:24AM +0530, Ananth N Mavinakayanahalli wrote:
> On Tue, Nov 08, 2011 at 10:59:46AM -0600, Scott Wood wrote:
> > On 11/08/2011 12:58 AM, Ananth N Mavinakayanahalli wrote:
> > > On Mon, Nov 07, 2011 at 11:18:32AM -0600, Scott Wood wrote:
> > >> What use does userspace have for this?  If you want to return the
> > >> currently executing CPU (which unless you're pinned could change as soon
> > >> as the value is read...), why not just return smp_processor_id() or
> > >> hard_smp_processor_id()?
> > > 
> > > Its not just the current cpu. Decoding PIR can tell you the core id,
> > > thread id in case of SMT, and this information can be used by userspace
> > > apps to set affinities, etc.
> > 
> > Wouldn't it make more sense to expose the thread to core mappings in a
> > general way, not tied to hardware or what thread we're currently running on?
> 
> AFAIK, the information encoding in PIR is platform dependent. There is
> no general way to expose this information unless you want have a
> per-platform ifdef. Even then, I am not sure if that information will
> generally be available or provided.
> 
> > What's the use case for knowing this information only about the current
> > thread (or rather the state the current thread was in a few moments ago)?
> 
> Its not information about the thread but about the cpu. Unless you have
> a shared LPAR environment, the data will be consistent and can be used
> by applications with knowledge of the platform.

I'm not sure what a "shared LPAR environment" is, but unless you're
pinned there's no guarantee the CPU you're running on once the read()
syscall returns is the same as the one that PIR was read on.  Do you mean
you're expecting this to be run from inside a partition that runs only on
one CPU, and thus whichever thread you'll be migrated to will have the
other data remain the same?

> > > +#if defined(CONFIG_SMP) && defined(SPRN_PIR)
> > > +SYSFS_PMCSETUP(pir, SPRN_PIR);
> > > +static SYSDEV_ATTR(pir, 0400, show_pir, NULL);
> > > +#endif
> > 
> > This only helps on architectures such as 8xx where you can't build as
> > SMP -- and I don't think #ifdef SPRN_PIR excludes any builds.
> > 
> > It doesn't help on chips like 750 or e300 where you can run a normal 6xx
> > SMP build, you just won't have multiple CPUs, and thus won't run things
> > like the secondary entry code.
> 
> Ugh! Booke builds seem to be fun :-)

Those aren't booke.

> I think this calls for a CPU_FTR_PIR. What do you suggest?

Unless someone wants to test what actually happens when you read PIR on
all these CPUs...

What platform is this meant to be useful for?  Perhaps it could just be a
platform-specific sysfs entry?

-Scott

^ permalink raw reply

* Re: [PATCH] powerpc/p1023: set IRQ[4:6, 11] to high level sensitive for PCIe
From: Scott Wood @ 2011-11-09 15:38 UTC (permalink / raw)
  To: Zang Roy-R61911; +Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <2239AC579C7D3646A720227A37E0268120EC83@039-SN1MPN1-004.039d.mgd.msft.net>

On Wed, Nov 09, 2011 at 09:27:02AM -0600, Zang Roy-R61911 wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Wednesday, November 09, 2011 0:54 AM
> > To: Zang Roy-R61911
> > Cc: Wood Scott-B07421; linuxppc-dev@lists.ozlabs.org
> > Subject: Re: [PATCH] powerpc/p1023: set IRQ[4:6, 11] to high level sensitive for
> > PCIe
> > 
> > Just a note that there's magic to allow the PCIe block to output these
> > interrupts as either active-high or active-low, depending on how they're
> > configured at the mpic.
> I do not think there is any magic.
> On the contrary, it is the mpic/device tree needs to comply with the hardware setting for the interrupt polarity.

The magic is that the PCIe block can generate the interrupt in either
polarity depending on the MPIC setting (or perhaps it bases it on
sampling the pin status during/after reset, but that seems fragile).

> > > IRQ 4,5,6, 11 are internally tie to low by silicon. To use these interrupts
> > for PCIe, they need to set high level sensitive.
> > > It is clear enough for this patch.
> > 
> > It's odd enough that I felt the need to go reading through the docs to
> > see why such a thing would work at all.
> If you consider the normal case, the shared irq pulls up, the PCIe interrupt set to low level sensitive. Anything odd?

The oddity is that active-high works at all for a PCI interrupt, and that
not all the PCIe interrupts have the same polarity.

-Scott

^ permalink raw reply

* RE: [PATCH] powerpc/p1023: set IRQ[4:6, 11] to high level sensitive for PCIe
From: Zang Roy-R61911 @ 2011-11-09 15:27 UTC (permalink / raw)
  To: Wood Scott-B07421; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4EB95EBC.8010808@freescale.com>

DQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogV29vZCBTY290dC1CMDc0
MjENCj4gU2VudDogV2VkbmVzZGF5LCBOb3ZlbWJlciAwOSwgMjAxMSAwOjU0IEFNDQo+IFRvOiBa
YW5nIFJveS1SNjE5MTENCj4gQ2M6IFdvb2QgU2NvdHQtQjA3NDIxOyBsaW51eHBwYy1kZXZAbGlz
dHMub3psYWJzLm9yZw0KPiBTdWJqZWN0OiBSZTogW1BBVENIXSBwb3dlcnBjL3AxMDIzOiBzZXQg
SVJRWzQ6NiwgMTFdIHRvIGhpZ2ggbGV2ZWwgc2Vuc2l0aXZlIGZvcg0KPiBQQ0llDQo+IA0KPiBP
biAxMS8wNy8yMDExIDExOjUxIFBNLCBaYW5nIFJveS1SNjE5MTEgd3JvdGU6DQo+ID4NCj4gPg0K
PiA+PiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiA+PiBGcm9tOiBXb29kIFNjb3R0LUIw
NzQyMQ0KPiA+PiBTZW50OiBUdWVzZGF5LCBOb3ZlbWJlciAwOCwgMjAxMSAyOjQ0IEFNDQo+ID4+
IFRvOiBaYW5nIFJveS1SNjE5MTENCj4gPj4gQ2M6IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMu
b3JnDQo+ID4+IFN1YmplY3Q6IFJlOiBbUEFUQ0hdIHBvd2VycGMvcDEwMjM6IHNldCBJUlFbNDo2
LCAxMV0gdG8gaGlnaCBsZXZlbCBzZW5zaXRpdmUNCj4gZm9yDQo+ID4+IFBDSWUNCj4gPj4NCj4g
Pj4gT24gMTEvMDcvMjAxMSAwMjozMiBBTSwgUm95IFphbmcgd3JvdGU6DQo+ID4+PiBQMTAyMyBl
eHRlcm5hbCBJUlFbNDo2LCAxMV0gZG8gbm90IHBpbiBvdXQsIGJ1dCB0aGUgaW50ZXJydXB0cyBh
cmUNCj4gPj4+IHNoYXJlZCB3aXRoIFBDSWUgY29udHJvbGxlci4NCj4gPj4+IFRoZSBzaWxpY29u
IGludGVybmFsbHkgdGllcyB0aGUgaW50ZXJydXB0cyB0byBMLCBzbyBjaGFuZ2UgdGhlDQo+ID4+
PiBJUlFbNDo2LDExXSB0byBoaWdoIGxldmVsIHNlbnNpdGl2ZSBmb3IgUENJZS4NCj4gPj4NCj4g
Pj4gU29tZSBleHRyYSBjb21tZW50YXJ5IG9uIHdoeSB0aGlzIHdvcmtzIHdvdWxkIGJlIG5pY2Uu
DQo+ID4gSSBkbyBub3Qga25vdyB3aGF0IGtpbmQgb2YgZXh0cmEgY29tbWVudGFyeSB5b3UgcmVx
dWVzdC4NCj4gDQo+IEp1c3QgYSBub3RlIHRoYXQgdGhlcmUncyBtYWdpYyB0byBhbGxvdyB0aGUg
UENJZSBibG9jayB0byBvdXRwdXQgdGhlc2UNCj4gaW50ZXJydXB0cyBhcyBlaXRoZXIgYWN0aXZl
LWhpZ2ggb3IgYWN0aXZlLWxvdywgZGVwZW5kaW5nIG9uIGhvdyB0aGV5J3JlDQo+IGNvbmZpZ3Vy
ZWQgYXQgdGhlIG1waWMuDQpJIGRvIG5vdCB0aGluayB0aGVyZSBpcyBhbnkgbWFnaWMuDQpPbiB0
aGUgY29udHJhcnksIGl0IGlzIHRoZSBtcGljL2RldmljZSB0cmVlIG5lZWRzIHRvIGNvbXBseSB3
aXRoIHRoZSBoYXJkd2FyZSBzZXR0aW5nIGZvciB0aGUgaW50ZXJydXB0IHBvbGFyaXR5Lg0KDQo+
IA0KPiA+IElSUSA0LDUsNiwgMTEgYXJlIGludGVybmFsbHkgdGllIHRvIGxvdyBieSBzaWxpY29u
LiBUbyB1c2UgdGhlc2UgaW50ZXJydXB0cw0KPiBmb3IgUENJZSwgdGhleSBuZWVkIHRvIHNldCBo
aWdoIGxldmVsIHNlbnNpdGl2ZS4NCj4gPiBJdCBpcyBjbGVhciBlbm91Z2ggZm9yIHRoaXMgcGF0
Y2guDQo+IA0KPiBJdCdzIG9kZCBlbm91Z2ggdGhhdCBJIGZlbHQgdGhlIG5lZWQgdG8gZ28gcmVh
ZGluZyB0aHJvdWdoIHRoZSBkb2NzIHRvDQo+IHNlZSB3aHkgc3VjaCBhIHRoaW5nIHdvdWxkIHdv
cmsgYXQgYWxsLg0KSWYgeW91IGNvbnNpZGVyIHRoZSBub3JtYWwgY2FzZSwgdGhlIHNoYXJlZCBp
cnEgcHVsbHMgdXAsIHRoZSBQQ0llIGludGVycnVwdCBzZXQgdG8gbG93IGxldmVsIHNlbnNpdGl2
ZS4gQW55dGhpbmcgb2RkPw0KDQo+IA0KPiA+PiBUaGUgbWFudWFsIHNheXM6DQo+ID4+DQo+ID4+
PiBJZiBhIFBDSSBFeHByZXNzIElOVHggaW50ZXJydXB0IGlzIGJlaW5nIHVzZWQsIHRoZW4gdGhl
IFBJQyBtdXN0IGJlDQo+IGNvbmZpZ3VyZWQNCj4gPj4gc28gdGhhdCBleHRlcm5hbCBpbnRlcnJ1
cHRzDQo+ID4+PiBhcmUgbGV2ZWwtc2Vuc2l0aXZlIChFSVZQUm5bU10gPSAxKS4NCj4gPiBUaGF0
IGlzIHRydWUgZm9yIGFsbCBGU0wgcG93ZXJwYyBzaWxpY29uIHdpdGggUENJZSBjb250cm9sbGVy
IGJlc2lkZSBQMTAyMy4NCj4gDQo+IFN1cmUsIG15IHBvaW50IHdhcyBtb3JlIHRoYXQgaXQgZGlk
bid0IHNheSBhbnl0aGluZyB0aGVyZSBhYm91dCBob3cgdG8NCj4gY29uZmlndXJlIEVJVlBSbltQ
XS4NCkl0IGRlcGVuZHMgb24gdGhlIGlycSBwdWxscyB1cCBvciBkb3duIGJ5IGhhcmR3YXJlLg0K
Um95DQo=

^ permalink raw reply

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
From: Josh Poimboeuf @ 2011-11-09 14:53 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev
In-Reply-To: <20111109120303.51ac3b1b@suzukikp.in.ibm.com>

On Wed, 2011-11-09 at 12:03 +0530, Suzuki Poulose wrote:
> On Tue, 08 Nov 2011 10:19:05 -0600
> Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com> wrote:
> 
> > On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
> > > What I was suggesting is, instead of flushing the cache in
> > > relocate(), lets do it like:
> > > 
> > > for e.g, on 440x, (in head_44x.S :)
> > > 
> > > #ifdef CONFIG_RELOCATABLE
> > > 	...
> > > 	bl relocate
> > > 
> > > 	#Flush the d-cache and invalidate the i-cache here
> > > #endif
> > > 
> > > 
> > > This would let the different platforms do the the cache
> > > invalidation in their own way.
> > > 
> > > Btw, I didn't find an instruction to flush the entire d-cache in
> > > PPC440 manual. We have instructions to flush only a block
> > > corresponding to an address.
> > > 
> > > However, we have 'iccci' which would invalidate the entire i-cache
> > > which, which I think is better than 80,000 i-cache invalidates.
> > 
> > In misc_32.S there are already some platform-independent cache
> > management functions.  If we use those, then relocate() could simply
> > call them.  Then the different platforms calling relocate() wouldn't
> > have to worry about flushing/invalidating caches.
> > 
> > For example, there's a clean_dcache_range() function.  Given any range
> > twice the size of the d-cache, it should flush the entire d-cache.
> > But the only drawback is that it would require the caller to know the
> > size of the d-cache.
> > 
> > Instead, I think it would be preferable to create a new clean_dcache()
> > (or clean_dcache_all()?) function in misc_32.S, which could call
> > clean_dcache_range() with the appropriate args for flushing the entire
> > d-cache.  relocate() could then call the platform-independent
> > clean_dcache().
> > 
> 
> 
> How about using clean_dcache_range() to flush the range runtime
> address range [ _stext, _end ] ? That would flush the entire
> instructions.

Wouldn't that result in more cache flushing than the original solution? 

For example, my kernel is 3.5MB.  Assuming a 32 byte cache line size,
clean_dcache_range(_stext, _end) would result in about 115,000 dcbst's
(3.5MB / 32).


> 
> 
> > For i-cache invalidation there's already the (incorrectly named?)
> > flush_instruction_cache().  It uses the appropriate platform-specific
> > methods (e.g. iccci for 44x) to invalidate the entire i-cache.
> 
> Agreed. The only thing that worries me is the use of KERNELBASE in the 
> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> implementations ignore the arguments passed to iccci ? 

Good question.  I don't know the answer. :-)

That also may suggest a bigger can of worms.  A grep of the powerpc code
shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
be relying on KERNELBASE except for the early relocation code.  Are we
sure that all the other usages of KERNELBASE are "safe"?

For example -- the DEBUG_CRIT_EXCEPTION macro, which head_44x.S uses,
relies on KERNELBASE.  That doesn't seem right to me.


Josh

^ permalink raw reply

* Re: [PATCH 4/7] powerpc/85xx: add support to JOG feature using cpufreq interface
From: Zhao Chenhui @ 2011-11-09 11:38 UTC (permalink / raw)
  To: Scott Wood; +Cc: Jerry Huang, linuxppc-dev
In-Reply-To: <4EB82870.1090907@freescale.com>

On Mon, Nov 07, 2011 at 12:50:24PM -0600, Scott Wood wrote:
> On 11/07/2011 04:27 AM, Zhao Chenhui wrote:
> > On Fri, Nov 04, 2011 at 02:42:54PM -0500, Scott Wood wrote:
> >> On 11/04/2011 07:36 AM, Zhao Chenhui wrote:
> >>> +	cpufreq_frequency_table_target(policy,
> >>> +				       mpc85xx_freqs,
> >>> +				       target_freq,
> >>> +				       relation,
> >>> +				       &new);
> >>> +
> >>> +	freqs.old = policy->cur;
> >>> +	freqs.new = mpc85xx_freqs[new].frequency;
> >>> +	freqs.cpu = policy->cpu;
> >>> +
> >>> +	mutex_lock(&mpc85xx_switch_mutex);
> >>> +	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
> >>> +
> >>> +	pr_info("Setting frequency for core %d to %d kHz, " \
> >>> +		 "PLL ratio is %d/2\n",
> >>> +		 policy->cpu,
> >>> +		 mpc85xx_freqs[new].frequency,
> >>> +		 mpc85xx_freqs[new].index);
> >>> +
> >>> +	set_pll(mpc85xx_freqs[new].index, policy->cpu);
> >>> +
> >>> +	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
> >>> +	mutex_unlock(&mpc85xx_switch_mutex);
> >>> +
> >>> +	ppc_proc_freq = freqs.new * 1000ul;
> >>
> >> ppc_proc_freq is global -- can CPUs not have their frequencies adjusted
> >> separately?
> >>
> >> It should be under the lock, if the lock is needed at all.
> >>
> > 
> > There is only one ppc_proc_freq. no lock.
> 
> I realize there's only one.
> 
> I'm asking whether CPUs can have their frequencies set indpendently --
> if the answer is no, and this function is not specific to a CPU, my only
> concern is the lock.  Either this function can be called multiple times
> in parallel, in which case the ppc_proc_freq update should be inside the
> lock, or it can't, in which case why do we need the lock at all?
> 
> -Scott

Yes. They can be changed independently.
I will set ppc_proc_freq inside the lock.

-chenhui

^ permalink raw reply

* RE: [PATCH 5/7] fsl_pmc: update device bindings
From: Li Yang-R58472 @ 2011-11-09 10:59 UTC (permalink / raw)
  To: Wood Scott-B07421; +Cc: linuxppc-dev@lists.ozlabs.org, Zhao Chenhui-B35336
In-Reply-To: <4EB9939E.8070800@freescale.com>

>-----Original Message-----
>From: Wood Scott-B07421
>Sent: Wednesday, November 09, 2011 4:40 AM
>To: Li Yang-R58472
>Cc: Wood Scott-B07421; Zhao Chenhui-B35336; linuxppc-dev@lists.ozlabs.org
>Subject: Re: [PATCH 5/7] fsl_pmc: update device bindings
>
>On 11/08/2011 04:56 AM, Li Yang-R58472 wrote:
>>>> diff --git a/Documentation/devicetree/bindings/powerpc/fsl/pmc.txt
>>> b/Documentation/devicetree/bindings/powerpc/fsl/pmc.txt
>>>> index 07256b7..d84b4f8 100644
>>>> --- a/Documentation/devicetree/bindings/powerpc/fsl/pmc.txt
>>>> +++ b/Documentation/devicetree/bindings/powerpc/fsl/pmc.txt
>>>> @@ -9,22 +9,27 @@ Properties:
>>>>
>>>>    "fsl,mpc8548-pmc" should be listed for any chip whose PMC is
>>>>    compatible.  "fsl,mpc8536-pmc" should also be listed for any chip
>>>> -  whose PMC is compatible, and implies deep-sleep capability.
>>>> +  whose PMC is compatible, and implies deep-sleep capability and
>>>> + wake on user defined packet(wakeup on ARP).
>>>
>>> Why does the PMC care?  This is an ethernet controller feature, the
>>> PMC is just keeping the wakeup-relevant parts of the ethernet
>>> controller alive (whatever they happen to be).
>>>
>>> Do we have any chips that have ethernet controller support for wake
>>> on user-defined packet, but a sleep mode that doesn't let it be used?
>>
>> I think it is more a PMC feature cause Ethernet controller on many
>> SoCs have the filer feature, but only very limited SoCs can support
>> using it as a wake-up source.  The differences should be mostly in the
>> PM controller block.
>
>Have you tried waking from sleep with it on those other SoCs?

We have tried and it's not working.  Although the filer is an eTSEC feature=
, waking up on it is really a complex system wide change including eTSEC, D=
DR, ECM, PIC, and etc.  And currently it's tied up to deep sleep feature.  =
So I would like it to be part of the SoC integration domain i.e. PMC.

>
>> Also the wake on user-defined packet feature was never mentioned in the
>Ethernet controller part of UM.
>
>AFAICT all this "feature" is, is programming the Ethernet controller to
>filter out some packets that we don't want to wake us up, and then
>refraining from entering magic packet mode.  What PMC registers are
>programmed any differently for this?
>
>It's in the PM part of the manual because that's where they generally
>describe issues related to PM.  They describe magic packet there too.
>
>The set of devices which are still active during sleep is a different
>issue from the conditions on which a device will request a wake.
>
>I also don't trust our PM documentation very much.  It's ambiguous just
>what gets shut down in ordinary sleep mode.  E.g. the mpc8544 manual talks
>about "external interrupts", but in one place it looks like it means
>external to the SoC:
>
>> In sleep mode, all clocks internal to the e500 core are turned off,
>> including the timer facilities clock. All I/O interfaces in the device
>> logic are also shut down. Only the clocks to the MPC8544E PIC are still
>running so that an external interrupt can wake up the device.
>
>But the note immediately below that seems to imply they're talking about
>external to the core:
>
>> Only external interrupts can wake the MPC8544E from sleep mode.
>> Internal interrupt sources like the core interval timer or watchdog
>> timer depend on an active clock for their operation and these are
>disabled in sleep mode.
>
>
>
>>> Normally that shouldn't matter, but we already an unused binding for
>>> this. :-)
>>>
>>> Please provide rationale for doing it this way.  Ideally it should
>>> probably use whatever http://devicetree.org/ClockBindings ends up being=
.
>>
>> I have looked into that binding.  The binding was primarily defined for
>the Linux clock API which is not same as what we are doing here(set wakeup
>source).
>> If in the future the clock API also covers wakeup related features, we
>can change to use it.
>
>While Linux APIs can serve as an inspiration, they're not the basis of
>device tree bindings.  The device tree is not Linux-specific, and should
>not change just because Linux changes, but rather should describe the
>hardware.  We want to get this right the first time.
>
>What we are describing here is how a device's clock is connected to the
>PMC.

AFAIKT, the purpose of defining the clock binding is used to describe the t=
opology of clocks in a system.  And then used for runtime control of enabli=
ng/disabling clocks or getting/changing clock frequencies.

But in this PM case, we just set the wakeup capability and provide little r=
untime control of the clocks for on-chip devices.  And it doesn't get any b=
enefit of using this binding.  If your concern is the confusion with the al=
ready existing clock binding, we can remove the clk in the name of the PM b=
inding.

If we are to use the clock binding, I think adding the CSB clock, DDR clock=
, core clock, and etc with this binding is more appropriate.

- Leo

^ permalink raw reply

* Re: [PATCH 1/2] powerpc/ps3: Fix lost SMP IPIs
From: Peter Zijlstra @ 2011-11-09  9:53 UTC (permalink / raw)
  To: Geoff Levand; +Cc: cbe-oss-dev, Andre Heider, linuxppc-dev
In-Reply-To: <E1RNxlN-0000wq-1P@casper.infradead.org>

On Tue, 2011-11-08 at 14:37 -0800, Geoff Levand wrote:
> Fixes the PS3 bootup hang introduced in 3.0-rc1 by:
>=20
>   commit 317f394160e9beb97d19a84c39b7e5eb3d7815a
>   sched: Move the second half of ttwu() to the remote cpu
>=20
> Move the PS3's LV1 EOI call lv1_end_of_interrupt_ext() from ps3_chip_eoi(=
)
> to ps3_get_irq() for IPI messages.
>=20
> If lv1_send_event_locally() is called between a previous call to
> lv1_send_event_locally() and the coresponding call to
> lv1_end_of_interrupt_ext() the second event will not be delivered to the
> target cpu.
>=20
> The PS3's SMP IPIs are implemented using lv1_send_event_locally(), so if =
two
> IPI messages of the same type are sent to the same target in a relatively
> short period of time the second IPI event can become lost when
> lv1_end_of_interrupt_ext() is called from ps3_chip_eoi().=20

Ah glad you found it! I'm not going to ack it since PPC guts are way
beyond me.

^ permalink raw reply

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
From: Suzuki Poulose @ 2011-11-09  8:42 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev
In-Reply-To: <20111109120303.51ac3b1b@suzukikp.in.ibm.com>

On 11/09/11 12:03, Suzuki Poulose wrote:
> On Tue, 08 Nov 2011 10:19:05 -0600
> Josh Poimboeuf<jpoimboe@linux.vnet.ibm.com>  wrote:
>
>> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
>>> What I was suggesting is, instead of flushing the cache in
>>> relocate(), lets do it like:
>>>
>>> for e.g, on 440x, (in head_44x.S :)
>>>
>>> #ifdef CONFIG_RELOCATABLE
>>> 	...
>>> 	bl relocate
>>>
>>> 	#Flush the d-cache and invalidate the i-cache here
>>> #endif
>>>
>>>
>>> This would let the different platforms do the the cache
>>> invalidation in their own way.
>>>
>>> Btw, I didn't find an instruction to flush the entire d-cache in
>>> PPC440 manual. We have instructions to flush only a block
>>> corresponding to an address.
>>>
>>> However, we have 'iccci' which would invalidate the entire i-cache
>>> which, which I think is better than 80,000 i-cache invalidates.
>>
>> In misc_32.S there are already some platform-independent cache
>> management functions.  If we use those, then relocate() could simply
>> call them.  Then the different platforms calling relocate() wouldn't
>> have to worry about flushing/invalidating caches.
>>
>> For example, there's a clean_dcache_range() function.  Given any range
>> twice the size of the d-cache, it should flush the entire d-cache.
>> But the only drawback is that it would require the caller to know the
>> size of the d-cache.
>>
>> Instead, I think it would be preferable to create a new clean_dcache()
>> (or clean_dcache_all()?) function in misc_32.S, which could call
>> clean_dcache_range() with the appropriate args for flushing the entire
>> d-cache.  relocate() could then call the platform-independent
>> clean_dcache().
>>
>
>
> How about using clean_dcache_range() to flush the range runtime
> address range [ _stext, _end ] ? That would flush the entire
> instructions.
>
>
>> For i-cache invalidation there's already the (incorrectly named?)
>> flush_instruction_cache().  It uses the appropriate platform-specific
>> methods (e.g. iccci for 44x) to invalidate the entire i-cache.
>
> Agreed. The only thing that worries me is the use of KERNELBASE in the
> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> implementations ignore the arguments passed to iccci ?
>>
>> Suzuki, if you agree with this direction, I could work up a new patch
>> if needed.
>>
>
> I have the following (untested) patch which uses clean_dcache_range()
> and flush_instruction_cache(): I will send the next version soon
> with those changes and the DYNAMIC_MEMSTART config for oldstyle
> relocatoin, if every one agrees to this.
>
>
> diff --git a/arch/powerpc/kernel/reloc_32.S
> b/arch/powerpc/kernel/reloc_32.S index 045d61e..cce0510 100644
> --- a/arch/powerpc/kernel/reloc_32.S
> +++ b/arch/powerpc/kernel/reloc_32.S
> @@ -33,10 +33,9 @@ R_PPC_RELATIVE = 22
>
>   _GLOBAL(relocate)
>
> -	mflr	r0
> +	mflr	r14		/* Save our LR */
>   	bl	0f		/* Find our current runtime
> address */ 0:	mflr	r12		/* Make it
> accessible */
> -	mtlr	r0
>
>   	lwz	r11, (p_dyn - 0b)(r12)
>   	add	r11, r11, r12	/* runtime address of .dynamic
> section */ @@ -87,18 +86,19 @@ eodyn:				/*
> End of Dyn Table scan */
>   	 * Work out the current offset from the link time address
> of .rela
>   	 * section.
>   	 *  cur_offset[r7] = rela.run[r9] - rela.link [r7]
> -	 *  _stext.link[r10] = _stext.run[r10] - cur_offset[r7]
> -	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r10]
> +	 *  _stext.link[r12] = _stext.run[r10] - cur_offset[r7]
> +	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r12]
>   	 */
>   	subf	r7, r7, r9	/* cur_offset */
> -	subf	r10, r7, r10
> -	subf	r3, r10, r3	/* final_offset */
> +	subf	r12, r7, r10
> +	subf	r3, r12, r3	/* final_offset */
>
>   	subf	r8, r6, r8	/* relaz -= relaent */
>   	/*
>   	 * Scan through the .rela table and process each entry
>   	 * r9	- points to the current .rela table entry
>   	 * r13	- points to the symbol table
> +	 * r10  - holds the runtime address of _stext
>   	 */
>
>   	/*
> @@ -180,12 +180,23 @@ store_half:
>
>   nxtrela:
>   	cmpwi	r8, 0		/* relasz = 0 ? */
> -	ble	done
> +	ble	flush
>   	add	r9, r9, r6	/* move to next entry in
> the .rela table */ subf	r8, r6, r8	/* relasz -= relaent */
>   	b	applyrela
>
> -done:	blr
> +	/* Flush the d-cache'd instructions */
> +flush:
> +	mr	r3, r10
> +	lis	r4, (_end - _stext)@h
> +	ori	r4, r4, (_end - _stext)@l

Err ! This doesn't compile :

arch/powerpc/kernel/reloc_32.S: Assembler messages:
arch/powerpc/kernel/reloc_32.S:191: Error: can't resolve `_end' {*UND* section} - `_stext' {*UND* section}


I will fix it, but the idea remains the same.

Thanks

Suzuki

^ permalink raw reply

* RE: [PATCH 2/7] powerpc/85xx: add HOTPLUG_CPU support
From: Li Yang-R58472 @ 2011-11-09  8:31 UTC (permalink / raw)
  To: Wood Scott-B07421; +Cc: linuxppc-dev@lists.ozlabs.org, Zhao Chenhui-B35336
In-Reply-To: <4EB997C5.60105@freescale.com>



>-----Original Message-----
>From: Wood Scott-B07421
>Sent: Wednesday, November 09, 2011 4:58 AM
>To: Li Yang-R58472
>Cc: Wood Scott-B07421; Zhao Chenhui-B35336; linuxppc-dev@lists.ozlabs.org
>Subject: Re: [PATCH 2/7] powerpc/85xx: add HOTPLUG_CPU support
>
>On 11/08/2011 04:05 AM, Li Yang-R58472 wrote:
>>> Subject: Re: [PATCH 2/7] powerpc/85xx: add HOTPLUG_CPU support
>>>
>>> On 11/04/2011 07:31 AM, Zhao Chenhui wrote:
>>>> +static int is_corenet;
>>>> +static void __cpuinit smp_85xx_setup_cpu(int cpu_nr);
>>>> +
>>>> +#if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PPC32)
>>>
>>> Why PPC32?
>>
>> Not sure if it is the same for e5500.  E5500 support will be verified
>later.
>
>It's better to have 64-bit silently do nothing here?
>
>>>> +	flush_disable_L1();
>>>
>>> You'll also need to take down L2 on e500mc.
>>
>> Right.  E500mc support is beyond this patch series.  We will work on it
>after the e500v2 support is finalized.
>
>I saw some support with "is_corenet"...  If we don't support e500mc, make
>sure the code doesn't try to run on e500mc.

The is_corenet code is just a start of the e500mc support and is far from c=
omplete.

>
>This isn't an e500v2-only BSP you're putting the code into. :-)

Yes, I know.  But it will take quite some time to perfect the support for d=
ifferent type of cores.  I'd like to make the effort into stages.  We want =
to push the e500v2 support in first and add support to newer cores later so=
 that we don't need to re-spin the patches again and again.  And the upstre=
am kernel can get the PM functionality at least for e500v2 earlier.  Anyway=
, we need to update the title of the patch to be more specific on e500v2.

>
>>>> +	tmp =3D 0;
>>>> +	if (cpu_has_feature(CPU_FTR_CAN_NAP))
>>>> +		tmp =3D HID0_NAP;
>>>> +	else if (cpu_has_feature(CPU_FTR_CAN_DOZE))
>>>> +		tmp =3D HID0_DOZE;
>>>
>>> Those FTR bits are for what we can do in idle, and can be cleared if
>>> the user sets CONFIG_BDI_SWITCH.
>>
>> It is powersave_nap variable shows what we can do in idle.
>
>No, that shows what the user wants to do in idle.
>
>> I think it's correct to use the CPU_FTR_CAN_* macro as CPU capability.
>
>This is 85xx-specific code.  We always can, and want to, nap here (though
>the way we enter nap will be different on e500mc and up) -- even if it's
>broken to use it for idle (such as on p4080, which would miss doorbells as
>wake events).

Ok.  Will stick to Nap.

>
>>>> +static void __cpuinit smp_85xx_reset_core(int nr) {
>>>> +	__iomem u32 *vaddr, *pir_vaddr;
>>>> +	u32 val, cpu_mask;
>>>> +
>>>> +	/* If CoreNet platform, use BRR as release register. */
>>>> +	if (is_corenet) {
>>>> +		cpu_mask =3D 1 << nr;
>>>> +		vaddr =3D ioremap(get_immrbase() + MPC85xx_BRR_OFF, 4);
>>>> +	} else {
>>>> +		cpu_mask =3D 1 << (24 + nr);
>>>> +		vaddr =3D ioremap(get_immrbase() + MPC85xx_ECM_EEBPCR_OFF, 4);
>>>> +	}
>>>
>>> Please use the device tree node, not get_immrbase().
>>
>> The get_immrbase() implementation uses the device tree node.
>
>I mean the guts node.  get_immrbase() should be avoided where possible.

Ok.  I got your point to use offset from guts.

>
>Also, some people might care about latency to enter/exit deep sleep.
>Searching through the device tree at this point (rather than on init)
>slows that down.

Actually the get_immrbase() is smarter than you thought. :) It only search =
the device tree on first call and returns the saved value on follow up call=
s.

>
>>>> +static int __cpuinit smp_85xx_map_bootpg(u32 page) {
>>>> +	__iomem u32 *bootpg_ptr;
>>>> +	u32 bptr;
>>>> +
>>>> +	/* Get the BPTR */
>>>> +	bootpg_ptr =3D ioremap(get_immrbase() + MPC85xx_BPTR_OFF, 4);
>>>> +
>>>> +	/* Set the BPTR to the secondary boot page */
>>>> +	bptr =3D MPC85xx_BPTR_EN | (page & MPC85xx_BPTR_BOOT_PAGE_MASK);
>>>> +	out_be32(bootpg_ptr, bptr);
>>>> +
>>>> +	iounmap(bootpg_ptr);
>>>> +	return 0;
>>>> +}
>>>
>>> Shouldn't the boot page already be set by U-Boot?
>>
>> The register should be lost after a deep sleep.   Better to do it again.
>
>How can it be lost after a deep sleep if we're relying on it to hold our
>wakeup code?

In order to wake up from deep sleep, the boot page has been set to the deep=
 sleep restoration function.  We need to set it back to the bootpage from u=
-boot.

>
>It's not "better to do it again" if we're making a bad assumption about
>the code and the table being in the same page.

Currently we are reusing the whole boot page including the spin-table from =
u-boot.  Are you suggesting to move just the boot page to kernel?

>
>>>>  	local_irq_save(flags);
>>>>
>>>> -	out_be32(bptr_vaddr + BOOT_ENTRY_PIR, nr);
>>>> +	out_be32(&epapr->pir, hw_cpu);
>>>>  #ifdef CONFIG_PPC32
>>>> -	out_be32(bptr_vaddr + BOOT_ENTRY_ADDR_LOWER, __pa(__early_start));
>>>> +#ifdef CONFIG_HOTPLUG_CPU
>>>> +	if (system_state =3D=3D SYSTEM_RUNNING) {
>>>> +		out_be32(&epapr->addr_l, 0);
>>>> +		smp_85xx_map_bootpg((u32)(*cpu_rel_addr >> PAGE_SHIFT));
>>>
>>> Why is this inside PPC32?
>>
>> Not tested on 64-bit yet.  Might require a different implementation on
>PPC64.
>
>Please make a reasonable effort to do things in a way that works on both.
>It shouldn't be a big deal to test that it doesn't break booting on p5020.

Will do.  But in stages.

>
>>>>  	if (!ioremappable)
>>>> -		flush_dcache_range((ulong)bptr_vaddr,
>>>> -				(ulong)(bptr_vaddr + SIZE_BOOT_ENTRY));
>>>> +		flush_dcache_range((ulong)epapr,
>>>> +				(ulong)epapr + sizeof(struct epapr_entry));
>>>
>>> We don't wait for the core to come up on 64-bit?
>>
>> Not sure about it.  But at least the normal boot up should be tested on
>P5020, right?
>
>Well, that's a special case in that it only has one secondary. :-)
>
>Or we could be getting lucky with timing.  It's not a new issue with this
>patch, it just looks odd.

We will look into it more when doing the e5500 support.

- Leo

^ permalink raw reply

* Re: [PATCH v3 3/3] Use separate struct console structure for each hvc_console.
From: Michael Ellerman @ 2011-11-09  8:05 UTC (permalink / raw)
  To: Miche Baker-Harvey
  Cc: Stephen Rothwell, Greg Kroah-Hartman, Konrad Rzeszutek Wilk,
	Rusty Russell, linux-kernel, virtualization, xen-devel,
	Anton Blanchard, Amit Shah, Mike Waychison, ppc-dev,
	Eric Northrup
In-Reply-To: <20111108214509.28884.98169.stgit@miche.sea.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 2758 bytes --]

On Tue, 2011-11-08 at 13:45 -0800, Miche Baker-Harvey wrote:
> It is possible to make any virtio_console port be a console
> by sending VIRITO_CONSOLE_CONSOLE_PORT.  But hvc_alloc was
> using a single struct console hvc_console, which contains
> both an index and flags which are per-port.
> 
> This adds a separate struct console for each virtio_console
> that is CONSOLE_PORT.

Hi Miche,

I'm testing this on powerpc and unfortunately it's working a little _too
well_. I end up with two struct consoles registered and so I get every
line of output twice :)

The problem is that we're registering two struct consoles. The first
obviously is hvc_console, either in hvc_console_init(), or in my case
from hvc_instantiate().

Then we register the allocated one in hvc_alloc(). But because they both
point back to the same hardware you get duplicate output.

We _do_ want to register a console early, in either/both
hvc_console_init() and hvc_instantiate(), because we want to have
console during boot prior to when hvc_alloc() gets called.

I think maybe we should be checking in hvc_alloc() whether we already
have hvc_console associated with the vtermno and if so we use
hvc_console instead of allocating a new one.

Patch below to do that, and works for me, but it's a bit of a hack,
there must be a better solution.

Finally I'm not sure how your patch affects the code in hvc_poll() which
checks hvc_console.index to do the SYSRQ hack.

cheers

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index fff35da..b249195 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -815,13 +815,15 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
        kref_init(&hp->kref);

        INIT_WORK(&hp->tty_resize, hvc_set_winsz);
-       /*
-        * Make each console its own struct console.
-        */
-       cp = kmemdup(&hvc_console, sizeof(*cp), GFP_KERNEL);
-       if (!cp) {
-               kfree(hp);
-               return ERR_PTR(-ENOMEM);
+
+       if (hvc_console.index >= 0 && vtermnos[hvc_console.index] == hp->vtermno)
+               cp = &hvc_console;
+       else {
+               cp = kmemdup(&hvc_console, sizeof(*cp), GFP_KERNEL);
+               if (!cp) {
+                       kfree(hp);
+                       return ERR_PTR(-ENOMEM);
+               }
        }

        hp->hvc_console = cp;
@@ -850,7 +852,9 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,

        list_add_tail(&(hp->next), &hvc_structs);
        spin_unlock(&hvc_structs_lock);
-       register_console(cp);
+
+       if (cp != &hvc_console)
+               register_console(cp);

        return hp;
 }

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply related

* Re: [PATCH] memory hotplug: Refuse to add unaligned memory regions
From: Chen Gong @ 2011-11-09  7:28 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: gregkh, linuxppc-dev, linux-kernel
In-Reply-To: <20110915062615.782bc4df@kryten>

于 2011/9/15 4:26, Anton Blanchard 写道:
>
> The sysfs memory probe interface allows unaligned regions
> to be added:
>
> # echo 0xffffff>  /sys/devices/system/memory/probe
>
> # cat /proc/iomem
> 00ffffff-01fffffe : System RAM
> 01ffffff-02fffffe : System RAM
> 02ffffff-03fffffe : System RAM
> 03ffffff-04fffffe : System RAM
> 04ffffff-05fffffe : System RAM
>
> Return -EINVAL instead of creating these bad regions.
>
> Signed-off-by: Anton Blanchard<anton@samba.org>
> ---
>
> Index: linux-build/drivers/base/memory.c
> ===================================================================
> --- linux-build.orig/drivers/base/memory.c	2011-08-11 08:25:55.005941391 +1000
> +++ linux-build/drivers/base/memory.c	2011-08-11 08:28:27.938580440 +1000
> @@ -380,9 +380,13 @@ memory_probe_store(struct class *class,
>   	u64 phys_addr;
>   	int nid;
>   	int i, ret;
> +	unsigned long pages_per_block = PAGES_PER_SECTION * sections_per_block;
>
>   	phys_addr = simple_strtoull(buf, NULL, 0);
>
> +	if (phys_addr&  ((pages_per_block<<  PAGE_SHIFT) - 1))
> +		return -EINVAL;
> +
>   	for (i = 0; i<  sections_per_block; i++) {
>   		nid = memory_add_physaddr_to_nid(phys_addr);
>   		ret = add_memory(nid, phys_addr,
> --

what platform doese it affect? PowerPC or else?

As I know, on x86 platform it should not use this interface: *probe*,
instead of acpi_hotplug_xxx. But PowerPC is RISC so how can you add
such weird address for it? Maybe it is because PowerPC uses 16M as
one section size and you assign a wrong address to it intentionally.
The final result is as you show, isn't it?

^ permalink raw reply

* Re: [PATCH v3 2/3] hvc_init():  Enforce one-time initialization.
From: Michael Ellerman @ 2011-11-09  7:24 UTC (permalink / raw)
  To: Miche Baker-Harvey
  Cc: Stephen Rothwell, Greg Kroah-Hartman, Konrad Rzeszutek Wilk,
	Rusty Russell, linux-kernel, virtualization, xen-devel,
	Anton Blanchard, Amit Shah, Mike Waychison, ppc-dev,
	Eric Northrup
In-Reply-To: <20111108214504.28884.61814.stgit@miche.sea.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 1534 bytes --]

On Tue, 2011-11-08 at 13:45 -0800, Miche Baker-Harvey wrote:
> hvc_init() must only be called once, and no thread should continue with hvc_alloc()
> until after initialization is complete.  The original code does not enforce either
> of these requirements.  A new mutex limits entry to hvc_init() to a single thread,
> and blocks all later comers until it has completed.
> 
> This patch fixes multiple crash symptoms.

Hi Miche,

A few nit-picky comments below ..

> @@ -84,6 +85,10 @@ static LIST_HEAD(hvc_structs);
>   * list traversal.
>   */
>  static DEFINE_SPINLOCK(hvc_structs_lock);
> +/*
> + * only one task does allocation at a time.
> + */
> +static DEFINE_MUTEX(hvc_ports_mutex);

The comment is wrong, isn't it? Only one task does _init_ at a time.
Once the driver is initialised allocs can run concurrently.

So shouldn't it be called hvc_init_mutex ?

> @@ -825,11 +830,15 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
>  	int i;
>  
>  	/* We wait until a driver actually comes along */
> +	mutex_lock(&hvc_ports_mutex);
>  	if (!hvc_driver) {
>  		int err = hvc_init();
> -		if (err)
> +		if (err) {
> +			mutex_unlock(&hvc_ports_mutex);
>  			return ERR_PTR(err);
> +		}
>  	}
> +	mutex_unlock(&hvc_ports_mutex);
>  
>  	hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
>  			GFP_KERNEL);

It'd be cleaner I think to do all the locking in hvc_init(). That's safe
as long as you recheck !hvc_driver in hvc_init() with the lock held.

cheers


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply

* Re: [PATCH] powerpc: Export PIR data through sysfs
From: Michael Ellerman @ 2011-11-09  6:51 UTC (permalink / raw)
  To: Scott Wood; +Cc: mahesh, linuxppc-dev, Anton Blanchard
In-Reply-To: <4EB96002.5030605@freescale.com>

[-- Attachment #1: Type: text/plain, Size: 873 bytes --]

On Tue, 2011-11-08 at 10:59 -0600, Scott Wood wrote:
> On 11/08/2011 12:58 AM, Ananth N Mavinakayanahalli wrote:
> > On Mon, Nov 07, 2011 at 11:18:32AM -0600, Scott Wood wrote:
> >> What use does userspace have for this?  If you want to return the
> >> currently executing CPU (which unless you're pinned could change as soon
> >> as the value is read...), why not just return smp_processor_id() or
> >> hard_smp_processor_id()?
> > 
> > Its not just the current cpu. Decoding PIR can tell you the core id,
> > thread id in case of SMT, and this information can be used by userspace
> > apps to set affinities, etc.
> 
> Wouldn't it make more sense to expose the thread to core mappings in a
> general way, not tied to hardware or what thread we're currently running on?

AFAIK that is already available in /sys/devices/system/cpu/cpuX/topology

cheers


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
From: Suzuki Poulose @ 2011-11-09  6:33 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev
In-Reply-To: <1320769145.5273.26.camel@treble>

On Tue, 08 Nov 2011 10:19:05 -0600
Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com> wrote:

> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
> > What I was suggesting is, instead of flushing the cache in
> > relocate(), lets do it like:
> > 
> > for e.g, on 440x, (in head_44x.S :)
> > 
> > #ifdef CONFIG_RELOCATABLE
> > 	...
> > 	bl relocate
> > 
> > 	#Flush the d-cache and invalidate the i-cache here
> > #endif
> > 
> > 
> > This would let the different platforms do the the cache
> > invalidation in their own way.
> > 
> > Btw, I didn't find an instruction to flush the entire d-cache in
> > PPC440 manual. We have instructions to flush only a block
> > corresponding to an address.
> > 
> > However, we have 'iccci' which would invalidate the entire i-cache
> > which, which I think is better than 80,000 i-cache invalidates.
> 
> In misc_32.S there are already some platform-independent cache
> management functions.  If we use those, then relocate() could simply
> call them.  Then the different platforms calling relocate() wouldn't
> have to worry about flushing/invalidating caches.
> 
> For example, there's a clean_dcache_range() function.  Given any range
> twice the size of the d-cache, it should flush the entire d-cache.
> But the only drawback is that it would require the caller to know the
> size of the d-cache.
> 
> Instead, I think it would be preferable to create a new clean_dcache()
> (or clean_dcache_all()?) function in misc_32.S, which could call
> clean_dcache_range() with the appropriate args for flushing the entire
> d-cache.  relocate() could then call the platform-independent
> clean_dcache().
> 


How about using clean_dcache_range() to flush the range runtime
address range [ _stext, _end ] ? That would flush the entire
instructions.


> For i-cache invalidation there's already the (incorrectly named?)
> flush_instruction_cache().  It uses the appropriate platform-specific
> methods (e.g. iccci for 44x) to invalidate the entire i-cache.

Agreed. The only thing that worries me is the use of KERNELBASE in the 
flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
implementations ignore the arguments passed to iccci ? 
> 
> Suzuki, if you agree with this direction, I could work up a new patch
> if needed.
> 

I have the following (untested) patch which uses clean_dcache_range()
and flush_instruction_cache(): I will send the next version soon
with those changes and the DYNAMIC_MEMSTART config for oldstyle
relocatoin, if every one agrees to this.


diff --git a/arch/powerpc/kernel/reloc_32.S
b/arch/powerpc/kernel/reloc_32.S index 045d61e..cce0510 100644
--- a/arch/powerpc/kernel/reloc_32.S
+++ b/arch/powerpc/kernel/reloc_32.S
@@ -33,10 +33,9 @@ R_PPC_RELATIVE = 22
 
 _GLOBAL(relocate)
 
-	mflr	r0
+	mflr	r14		/* Save our LR */
 	bl	0f		/* Find our current runtime
address */ 0:	mflr	r12		/* Make it
accessible */
-	mtlr	r0
 
 	lwz	r11, (p_dyn - 0b)(r12)
 	add	r11, r11, r12	/* runtime address of .dynamic
section */ @@ -87,18 +86,19 @@ eodyn:				/*
End of Dyn Table scan */
 	 * Work out the current offset from the link time address
of .rela
 	 * section.
 	 *  cur_offset[r7] = rela.run[r9] - rela.link [r7]
-	 *  _stext.link[r10] = _stext.run[r10] - cur_offset[r7]
-	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r10]
+	 *  _stext.link[r12] = _stext.run[r10] - cur_offset[r7]
+	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r12]
 	 */
 	subf	r7, r7, r9	/* cur_offset */
-	subf	r10, r7, r10
-	subf	r3, r10, r3	/* final_offset */
+	subf	r12, r7, r10
+	subf	r3, r12, r3	/* final_offset */
 
 	subf	r8, r6, r8	/* relaz -= relaent */
 	/*
 	 * Scan through the .rela table and process each entry
 	 * r9	- points to the current .rela table entry
 	 * r13	- points to the symbol table
+	 * r10  - holds the runtime address of _stext
 	 */
 
 	/*
@@ -180,12 +180,23 @@ store_half:
 
 nxtrela:
 	cmpwi	r8, 0		/* relasz = 0 ? */
-	ble	done
+	ble	flush
 	add	r9, r9, r6	/* move to next entry in
the .rela table */ subf	r8, r6, r8	/* relasz -= relaent */
 	b	applyrela
 
-done:	blr
+	/* Flush the d-cache'd instructions */
+flush:
+	mr	r3, r10
+	lis	r4, (_end - _stext)@h
+	ori	r4, r4, (_end - _stext)@l
+	add	r4, r3, r4
+	bl	clean_dcache_range
+	/* Invalidate the i-cache */
+	bl	flush_instruction_cache
+done:
+	mtlr	r14
+	blr
 
 
 p_dyn:		.long	__dynamic_start - 0b


Thanks

Suzuki

^ permalink raw reply

* Re: [PATCH] powerpc: Export PIR data through sysfs
From: Ananth N Mavinakayanahalli @ 2011-11-09  4:41 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Anton Blanchard, mahesh
In-Reply-To: <4EB96002.5030605@freescale.com>

On Tue, Nov 08, 2011 at 10:59:46AM -0600, Scott Wood wrote:
> On 11/08/2011 12:58 AM, Ananth N Mavinakayanahalli wrote:
> > On Mon, Nov 07, 2011 at 11:18:32AM -0600, Scott Wood wrote:
> >> What use does userspace have for this?  If you want to return the
> >> currently executing CPU (which unless you're pinned could change as soon
> >> as the value is read...), why not just return smp_processor_id() or
> >> hard_smp_processor_id()?
> > 
> > Its not just the current cpu. Decoding PIR can tell you the core id,
> > thread id in case of SMT, and this information can be used by userspace
> > apps to set affinities, etc.
> 
> Wouldn't it make more sense to expose the thread to core mappings in a
> general way, not tied to hardware or what thread we're currently running on?

AFAIK, the information encoding in PIR is platform dependent. There is
no general way to expose this information unless you want have a
per-platform ifdef. Even then, I am not sure if that information will
generally be available or provided.

> What's the use case for knowing this information only about the current
> thread (or rather the state the current thread was in a few moments ago)?

Its not information about the thread but about the cpu. Unless you have
a shared LPAR environment, the data will be consistent and can be used
by applications with knowledge of the platform.

> > +#if defined(CONFIG_SMP) && defined(SPRN_PIR)
> > +SYSFS_PMCSETUP(pir, SPRN_PIR);
> > +static SYSDEV_ATTR(pir, 0400, show_pir, NULL);
> > +#endif
> 
> This only helps on architectures such as 8xx where you can't build as
> SMP -- and I don't think #ifdef SPRN_PIR excludes any builds.
> 
> It doesn't help on chips like 750 or e300 where you can run a normal 6xx
> SMP build, you just won't have multiple CPUs, and thus won't run things
> like the secondary entry code.

Ugh! Booke builds seem to be fun :-)

I think this calls for a CPU_FTR_PIR. What do you suggest?

Ananth

^ permalink raw reply

* Re: [PATCH 0/2] PS3 fixes for 3.2
From: Benjamin Herrenschmidt @ 2011-11-09  2:33 UTC (permalink / raw)
  To: Geoff Levand; +Cc: cbe-oss-dev, linuxppc-dev, Andre Heider
In-Reply-To: <E1RNxlF-0000wf-Rh@casper.infradead.org>

On Tue, 2011-11-08 at 17:51 -0800, Geoff Levand wrote:
> git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git

Thanks. Will do after he pulls my current batch.

Cheers,
Ben.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox