LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: Kernel support for the Freescale P2020-MSC8156 AdvancedMC Reference Design
From: Kumar Gala @ 2011-11-30 13:24 UTC (permalink / raw)
  To: Daniel Ng2; +Cc: linuxppc-dev
In-Reply-To: <32883132.post@talk.nabble.com>


On Nov 29, 2011, at 11:34 PM, Daniel Ng2 wrote:

>=20
> Hi,
>=20
> Does anyone know of any kernel support for the Freescale P2020-MSC8156 =
AMC
> board?-
>=20
> =
http://freescale.com.hk/webapp/sps/site/prod_summary.jsp?code=3DP2020-MSC8=
156AMCRD
>=20
> I am looking for platform-specific files ie. the ones that go in
> arch/powerpc/platforms/85xx and DTS files if possible...
>=20
> Cheers,
> Daniel

There isn't any support for this board in the open source kernel.

- k=

^ permalink raw reply

* [RFC PATCH v3 4/4] cpuidle: (POWER) Handle power_save=off
From: Deepthi Dharwar @ 2011-11-30 12:47 UTC (permalink / raw)
  To: benh, len.brown; +Cc: linuxppc-dev, linux-pm, linux-kernel, linux-pm
In-Reply-To: <20111130124608.972.87712.stgit@deepthi-ThinkPad-T420>

This patch makes pseries_idle_driver not to be registered when
power_save=off kernel boot option is specified. The
cpuidle_disable variable used here is similar to
its usage on x86. If cpuidle_disable is set then
sysfs entries for cpuidle framework are not created
and the required drivers are not loaded.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
---
 arch/powerpc/include/asm/processor.h            |    1 +
 arch/powerpc/platforms/pseries/processor_idle.c |    3 +++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 811b7e7..b585bff 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -382,6 +382,7 @@ static inline unsigned long get_clean_sp(struct pt_regs *regs, int is_32)
 }
 #endif
 
+extern unsigned long cpuidle_disable;
 enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c
index 352b78a..4f59af0 100644
--- a/arch/powerpc/platforms/pseries/processor_idle.c
+++ b/arch/powerpc/platforms/pseries/processor_idle.c
@@ -269,6 +269,9 @@ static int pseries_idle_probe(void)
 	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
 		return -ENODEV;
 
+	if (cpuidle_disable != IDLE_NO_OVERRIDE)
+		return -ENODEV;
+
 	if (max_idle_state == 0) {
 		printk(KERN_DEBUG "pseries processor idle disabled.\n");
 		return -EPERM;

^ permalink raw reply related

* Re: [PATCH v3 2/8] [booke] Rename mapping based RELOCATABLE to DYNAMIC_MEMSTART for BookE
From: Josh Boyer @ 2011-11-30 14:41 UTC (permalink / raw)
  To: Scott Wood
  Cc: Josh Poimboeuf, David Laight, Alan Modra, Suzuki K. Poulose,
	linuxppc-dev
In-Reply-To: <4ED4126D.7000201@freescale.com>

On Mon, Nov 28, 2011 at 5:59 PM, Scott Wood <scottwood@freescale.com> wrote=
:
> On 11/23/2011 10:47 AM, Josh Boyer wrote:
>> On Mon, Nov 14, 2011 at 12:41 AM, Suzuki K. Poulose <suzuki@in.ibm.com> =
wrote:
>>> The current implementation of CONFIG_RELOCATABLE in BookE is based
>>> on mapping the page aligned kernel load address to KERNELBASE. This
>>> approach however is not enough for platforms, where the TLB page size
>>> is large (e.g, 256M on 44x). So we are renaming the RELOCATABLE used
>>> currently in BookE to DYNAMIC_MEMSTART to reflect the actual method.
>
> Should reword the config help to make it clear what the alignment
> restriction is, or where to find the information for a particular
> platform. =A0Someone reading "page aligned" without any context that we'r=
e
> talking about special large pages is going to think 4K -- and on e500,
> many large page sizes are supported, so the required alignment is found
> in Linux init code rather than a CPU manual.
>
>>>
>>> The CONFIG_RELOCATABLE for PPC32(BookE) based on processing of the
>>> dynamic relocations will be introduced in the later in the patch series=
.
>>>
>>> This change would allow the use of the old method of RELOCATABLE for
>>> platforms which can afford to enforce the page alignment (platforms wit=
h
>>> smaller TLB size).
>>
>> I'm OK with the general direction, but this touches a lot of non-4xx
>> code. =A0I'd prefer it if Ben took this directly on whatever final
>> solution is done.
>>
>>> I haven tested this change only on 440x. I don't have an FSL BookE to v=
erify
>>> the changes there.
>>>
>>> Scott,
>>> Could you please test this patch on FSL and let me know the results ?
>>
>> Scott, did you ever get around to testing this? =A0In my opinion, this
>> shouldn't go in without a Tested-by: from someone that tried it on an
>> FSL platform.
>
> Booted OK for me on e500v2 with RAM starting at 256M.
>
> Tested-by: Scott Wood <scottwood@freescale.com>
>
>> We add DYNAMIC_MEMSTART for 32-bit, and we have RELOCATABLE for
>> 64-bit. =A0Then throughout almost the rest of the patch, all we're doing
>> is duplicating what RELOCATABLE already did (e.g. if ! either thing).
>> It works, but it is kind of ugly.
>>
>> Instead, could we define a helper config variable that can be used in
>> place of that construct? =A0Something like:
>>
>> config NONSTATIC_KERNEL (or whatever)
>> =A0 =A0 bool
>> =A0 =A0 default n
>>
>> ...
>>
>> config DYNAMIC_MEMSTART
>> =A0 =A0 <blah>
>> =A0 =A0 select NONSTATIC_KERNEL
>>
>> ...
>>
>> config RELOCATABLE
>> =A0 =A0 <blah>
>> =A0 =A0 select NONSTATIC_KERNEL
>
> I agree.

Suzie do you think you could respin this patch with the suggested
changes and include Scott's Tested-by?  The rest of the series looks
fine and I'd like to get it included in my next branch.

josh

^ permalink raw reply

* Please pull 'next' branch of new 4xx tree
From: Josh Boyer @ 2011-11-30 15:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Stephen Rothwell; +Cc: linuxppc-dev

Hi Ben,

I have a new 4xx tree setup now.  Two small commits for the next
branch are queued up.  I'd like to get the patch series from Suzie and
Tony included soon as well, but thought I'd start with these to get
things rolling.

Stephen, can you switch the linux-next tree to use this instead?

josh

The following changes since commit fa8cbaaf5a68f62db3f9a8444ecbb940b47984cb:

  powerpc+sparc64/mm: Remove hack in mmap randomize layout (2011-11-28
11:42:09 +1100)

are available in the git repository at:
  git://git.infradead.org/users/jwboyer/powerpc-4xx.git next

Josh Boyer (1):
      MAINTAINERS: Update PowerPC 4xx git tree

Tanmay Inamdar (1):
      powerpc/40x: Add APM8018X SOC support

 MAINTAINERS                                 |    2 +-
 arch/powerpc/boot/dts/klondike.dts          |  227 +++++++++++++++++++++++++++
 arch/powerpc/configs/40x/klondike_defconfig |   55 +++++++
 arch/powerpc/kernel/cputable.c              |   13 ++
 arch/powerpc/platforms/40x/Kconfig          |   11 ++
 arch/powerpc/platforms/40x/ppc40x_simple.c  |    1 +
 6 files changed, 308 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/klondike.dts
 create mode 100644 arch/powerpc/configs/40x/klondike_defconfig

^ permalink raw reply

* [PATCH] powerpc/85xx: don't call of_platform_bus_probe() twice
From: Timur Tabi @ 2011-11-30 16:19 UTC (permalink / raw)
  To: kumar.gala, benh, scottwood, dbaryshkov, linuxppc-dev

Commit 46d026ac ("powerpc/85xx: consolidate of_platform_bus_probe calls")
replaced platform-specific of_device_id tables with a single function
that probes the most of the busses in 85xx device trees.  If a specific
platform needed additional busses probed, then it could call
of_platform_bus_probe() again.  Typically, the additional platform-specific
busses are children of existing busses that have already been probed.
of_platform_bus_probe() does not handle those child busses automatically.

Unfortunately, this doesn't actually work.  The second (platform-specific)
call to of_platform_bus_probe() never finds any of the busses it's asked
to find.

To remedy this, the platform-specific of_device_id tables are eliminated,
and their entries are merged into mpc85xx_common_ids[], so that all busses
are probed at once.

Signed-off-by: Timur Tabi <timur@freescale.com>
---
 arch/powerpc/platforms/85xx/common.c      |    6 ++++++
 arch/powerpc/platforms/85xx/mpc85xx_mds.c |   11 +----------
 arch/powerpc/platforms/85xx/p1022_ds.c    |   13 +------------
 3 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/common.c b/arch/powerpc/platforms/85xx/common.c
index 9fef530..67dac22 100644
--- a/arch/powerpc/platforms/85xx/common.c
+++ b/arch/powerpc/platforms/85xx/common.c
@@ -21,6 +21,12 @@ static struct of_device_id __initdata mpc85xx_common_ids[] = {
 	{ .compatible = "fsl,qe", },
 	{ .compatible = "fsl,cpm2", },
 	{ .compatible = "fsl,srio", },
+	/* So that the DMA channel nodes can be probed individually: */
+	{ .compatible = "fsl,eloplus-dma", },
+	/* For the PMC driver */
+	{ .compatible = "fsl,mpc8548-guts", },
+	/* Probably unnecessary? */
+	{ .compatible = "gpio-leds", },
 	{},
 };
 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index 495cfd9..4a555ee 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -410,12 +410,6 @@ static int __init board_fixups(void)
 machine_arch_initcall(mpc8568_mds, board_fixups);
 machine_arch_initcall(mpc8569_mds, board_fixups);
 
-static struct of_device_id mpc85xx_ids[] = {
-	{ .compatible = "fsl,mpc8548-guts", },
-	{ .compatible = "gpio-leds", },
-	{},
-};
-
 static int __init mpc85xx_publish_devices(void)
 {
 	if (machine_is(mpc8568_mds))
@@ -423,10 +417,7 @@ static int __init mpc85xx_publish_devices(void)
 	if (machine_is(mpc8569_mds))
 		simple_gpiochip_init("fsl,mpc8569mds-bcsr-gpio");
 
-	mpc85xx_common_publish_devices();
-	of_platform_bus_probe(NULL, mpc85xx_ids, NULL);
-
-	return 0;
+	return mpc85xx_common_publish_devices();
 }
 
 machine_device_initcall(mpc8568_mds, mpc85xx_publish_devices);
diff --git a/arch/powerpc/platforms/85xx/p1022_ds.c b/arch/powerpc/platforms/85xx/p1022_ds.c
index 2bf4342..ea6190b 100644
--- a/arch/powerpc/platforms/85xx/p1022_ds.c
+++ b/arch/powerpc/platforms/85xx/p1022_ds.c
@@ -326,18 +326,7 @@ static void __init p1022_ds_setup_arch(void)
 	pr_info("Freescale P1022 DS reference board\n");
 }
 
-static struct of_device_id __initdata p1022_ds_ids[] = {
-	/* So that the DMA channel nodes can be probed individually: */
-	{ .compatible = "fsl,eloplus-dma", },
-	{},
-};
-
-static int __init p1022_ds_publish_devices(void)
-{
-	mpc85xx_common_publish_devices();
-	return of_platform_bus_probe(NULL, p1022_ds_ids, NULL);
-}
-machine_device_initcall(p1022_ds, p1022_ds_publish_devices);
+machine_device_initcall(p1022_ds, mpc85xx_common_publish_devices);
 
 machine_arch_initcall(p1022_ds, swiotlb_setup_bus_notifier);
 
-- 
1.7.3.4

^ permalink raw reply related

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
From: Ira W. Snyder @ 2011-11-30 17:07 UTC (permalink / raw)
  To: Shi Xuelin-B29237
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Li Yang-R58472
In-Reply-To: <DBB740589CE8814680DECFE34BE197AB14A1E0@039-SN1MPN1-006.039d.mgd.msft.net>

On Wed, Nov 30, 2011 at 09:57:47AM +0000, Shi Xuelin-B29237 wrote:
> Hello Ira,
> 
> In drivers/dma/dmaengine.c, we have below tight loop to check DMA completion in mainline Linux:
>        do {
>                 status = dma_async_is_tx_complete(chan, cookie, NULL, NULL);
>                 if (time_after_eq(jiffies, dma_sync_wait_timeout)) {
>                         printk(KERN_ERR "dma_sync_wait_timeout!\n");
>                         return DMA_ERROR;
>                 }
>         } while (status == DMA_IN_PROGRESS);
> 

That is the body of dma_sync_wait(). It is mostly used in the raid code.
I understand that you don't want to change the raid code to use
callbacks.

In any case, I think we've strayed from the topic under consideration,
which is: can we remove this spinlock without introducing a bug.

I'm convinced that "smp_rmb()" is needed when removing the spinlock. As
noted, Documentation/memory-barriers.txt says that stores on one CPU can
be observed by another CPU in a different order.

Previously, there was an UNLOCK (in fsl_dma_tx_submit) followed by a
LOCK (in fsl_tx_status). This provided a "full barrier", forcing the
operations to complete correctly when viewed by the second CPU. From the
text:

> Therefore, from (1), (2) and (4) an UNLOCK followed by an unconditional LOCK is
> equivalent to a full barrier, but a LOCK followed by an UNLOCK is not.

Also, please read "EXAMPLES OF MEMORY BARRIER SEQUENCES" and "INTER-CPU
LOCKING BARRIER EFFECTS". Particularly, in "EXAMPLES OF MEMORY BARRIER
SEQUENCES", the text notes:

> Without intervention, CPU 2 may perceive the events on CPU 1 in some
> effectively random order, despite the write barrier issued by CPU 1:
>
> [snip diagram]
>
> And thirdly, a read barrier acts as a partial order on loads. Consider the
> following sequence of events:
>
> [snip diagram]
>
> Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in
> some effectively random order, despite the write barrier issued by CPU 1:
>
> [snip diagram]
>

And so on. Please read this entire section in the document.

I can't give you an ACK on the proposed patch. To the best of my
understanding, I believe it introduces a bug. I've tried to provide as
much evidence for this belief as I can, in the form of documentation in
the kernel source tree. If you can cite some documentation that shows I
am wrong, I will happily change my mind!

Ira

> -----Original Message-----
> From: Ira W. Snyder [mailto:iws@ovro.caltech.edu] 
> Sent: 2011年11月30日 1:26
> To: Li Yang-R58472
> Cc: Shi Xuelin-B29237; vinod.koul@intel.com; dan.j.williams@intel.com; linuxppc-dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
> 
> On Tue, Nov 29, 2011 at 03:19:05AM +0000, Li Yang-R58472 wrote:
> > > Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by 
> > > optimizing spinlock use.
> > > 
> > > On Thu, Nov 24, 2011 at 08:12:25AM +0000, Shi Xuelin-B29237 wrote:
> > > > Hi Ira,
> > > >
> > > > Thanks for your review.
> > > >
> > > > After second thought, I think your scenario may not occur.
> > > > Because the cookie 20 we query must be returned by 
> > > > fsl_dma_tx_submit(...) in
> > > practice.
> > > > We never query a cookie not returned by fsl_dma_tx_submit(...).
> > > >
> > > 
> > > I agree about this part.
> > > 
> > > > When we call fsl_tx_status(20), the chan->common.cookie is 
> > > > definitely wrote as
> > > 20 and cpu2 could not read as 19.
> > > >
> > > 
> > > This is what I don't agree about. However, I'm not an expert on CPU cache vs.
> > > memory accesses in an multi-processor system. The section titled 
> > > "CACHE COHERENCY" in Documentation/memory-barriers.txt leads me to 
> > > believe that the scenario I described is possible.
> > 
> > For Freescale PowerPC, the chip automatically takes care of cache coherency.  Even if this is a concern, spinlock can't address it.
> > 
> > > 
> > > What happens if CPU1's write of chan->common.cookie only goes into 
> > > CPU1's cache. It never makes it to main memory before CPU2 fetches the old value of 19.
> > > 
> > > I don't think you should see any performance impact from the 
> > > smp_mb() operation.
> > 
> > Smp_mb() do have impact on performance if it's in the hot path.  While it might be safer having it, I doubt it is really necessary.  If the CPU1 doesn't have the updated last_used, it's shouldn't have known there is a cookie 20 existed either.
> > 
> 
> I believe that you are correct, for powerpc. However, anything outside of arch/powerpc shouldn't assume it only runs on powerpc. I wouldn't be surprised to see fsldma running on an iMX someday (ARM processor).
> 
> My interpretation says that the change introduces the possibility that
> fsl_tx_status() returns the wrong answer for an extremely small time window, on SMP only, based on Documentation/memory-barriers.txt. But I can't seem convince you.
> 
> My real question is what code path is hitting this spinlock? Is it in mainline Linux? Why is it polling rather than using callbacks to determine DMA completion?
> 
> Thanks,
> Ira
> 
> > > > -----Original Message-----
> > > > From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> > > > Sent: 2011年11月23日 2:59
> > > > To: Shi Xuelin-B29237
> > > > Cc: dan.j.williams@intel.com; Li Yang-R58472; zw@zh-kernel.org; 
> > > > vinod.koul@intel.com; linuxppc-dev@lists.ozlabs.org; 
> > > > linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by 
> > > > optimizing
> > > spinlock use.
> > > >
> > > > On Tue, Nov 22, 2011 at 12:55:05PM +0800, b29237@freescale.com wrote:
> > > > > From: Forrest Shi <b29237@freescale.com>
> > > > >
> > > > >     dma status check function fsl_tx_status is heavily called in
> > > > >     a tight loop and the desc lock in fsl_tx_status contended by
> > > > >     the dma status update function. this caused the dma performance
> > > > >     degrades much.
> > > > >
> > > > >     this patch releases the lock in the fsl_tx_status function.
> > > > >     I believe it has no neglect impact on the following call of
> > > > >     dma_async_is_complete(...).
> > > > >
> > > > >     we can see below three conditions will be identified as success
> > > > >     a)  x < complete < use
> > > > >     b)  x < complete+N < use+N
> > > > >     c)  x < complete < use+N
> > > > >     here complete is the completed_cookie, use is the last_used
> > > > >     cookie, x is the querying cookie, N is MAX cookie
> > > > >
> > > > >     when chan->completed_cookie is being read, the last_used may
> > > > >     be incresed. Anyway it has no neglect impact on the dma status
> > > > >     decision.
> > > > >
> > > > >     Signed-off-by: Forrest Shi <xuelin.shi@freescale.com>
> > > > > ---
> > > > >  drivers/dma/fsldma.c |    5 -----
> > > > >  1 files changed, 0 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c index 
> > > > > 8a78154..1dca56f 100644
> > > > > --- a/drivers/dma/fsldma.c
> > > > > +++ b/drivers/dma/fsldma.c
> > > > > @@ -986,15 +986,10 @@ static enum dma_status 
> > > > > fsl_tx_status(struct
> > > dma_chan *dchan,
> > > > >  	struct fsldma_chan *chan = to_fsl_chan(dchan);
> > > > >  	dma_cookie_t last_complete;
> > > > >  	dma_cookie_t last_used;
> > > > > -	unsigned long flags;
> > > > > -
> > > > > -	spin_lock_irqsave(&chan->desc_lock, flags);
> > > > >
> > > >
> > > > This will cause a bug. See below for a detailed explanation. You need this instead:
> > > >
> > > > 	/*
> > > > 	 * On an SMP system, we must ensure that this CPU has seen the
> > > > 	 * memory accesses performed by another CPU under the
> > > > 	 * chan->desc_lock spinlock.
> > > > 	 */
> > > > 	smp_mb();
> > > > >  	last_complete = chan->completed_cookie;
> > > > >  	last_used = dchan->cookie;
> > > > >
> > > > > -	spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > > > -
> > > > >  	dma_set_tx_state(txstate, last_complete, last_used, 0);
> > > > >  	return dma_async_is_complete(cookie, last_complete, 
> > > > > last_used);  }
> > > >
> > > > Facts:
> > > > - dchan->cookie is the same member as chan->common.cookie (same 
> > > > memory
> > > > location)
> > > > - chan->common.cookie is the "last allocated cookie for a pending transaction"
> > > > - chan->completed_cookie is the "last completed transaction"
> > > >
> > > > I have replaced "dchan->cookie" with "chan->common.cookie" in the 
> > > > below
> > > explanation, to keep everything referenced from the same structure.
> > > >
> > > > Variable usage before your change. Everything is used locked.
> > > > - RW chan->common.cookie		(fsl_dma_tx_submit)
> > > > - R  chan->common.cookie		(fsl_tx_status)
> > > > - R  chan->completed_cookie		(fsl_tx_status)
> > > > - W  chan->completed_cookie		(dma_do_tasklet)
> > > >
> > > > Variable usage after your change:
> > > > - RW chan->common.cookie		LOCKED
> > > > - R  chan->common.cookie		NO LOCK
> > > > - R  chan->completed_cookie		NO LOCK
> > > > - W  chan->completed_cookie             LOCKED
> > > >
> > > > What if we assume that you have a 2 CPU system (such as a P2020). 
> > > > After your
> > > changes, one possible sequence is:
> > > >
> > > > === CPU1 - allocate + submit descriptor: fsl_dma_tx_submit() === 
> > > > spin_lock_irqsave
> > > > descriptor->cookie = 20		(x in your example)
> > > > chan->common.cookie = 20	(used in your example)
> > > > spin_unlock_irqrestore
> > > >
> > > > === CPU2 - immediately calls fsl_tx_status() ===
> > > > chan->common.cookie == 19
> > > > chan->completed_cookie == 19
> > > > descriptor->cookie == 20
> > > >
> > > > Since we don't have locks anymore, CPU2 may not have seen the 
> > > > write to
> > > > chan->common.cookie yet.
> > > >
> > > > Also assume that the DMA hardware has not started processing the 
> > > > transaction yet. Therefore dma_do_tasklet() has not been called, 
> > > > and
> > > > chan->completed_cookie has not been updated.
> > > >
> > > > In this case, dma_async_is_complete() (on CPU2) returns 
> > > > DMA_SUCCESS, even
> > > though the DMA operation has not succeeded. The DMA operation has 
> > > not even started yet!
> > > >
> > > > The smp_mb() fixes this, since it forces CPU2 to have seen all 
> > > > memory operations
> > > that happened before CPU1 released the spinlock. Spinlocks are 
> > > implicit SMP memory barriers.
> > > >
> > > > Therefore, the above example becomes:
> > > > smp_mb();
> > > > chan->common.cookie == 20
> > > > chan->completed_cookie == 19
> > > > descriptor->cookie == 20
> > > >
> > > > Then dma_async_is_complete() returns DMA_IN_PROGRESS, which is correct.
> > > >
> > > > Thanks,
> > > > Ira
> > > >
> > > >
> > > > _______________________________________________
> > > > Linuxppc-dev mailing list
> > > > Linuxppc-dev@lists.ozlabs.org
> > > > https://lists.ozlabs.org/listinfo/linuxppc-dev
> > 
> 

^ permalink raw reply

* RE: [PATCH 5/6] powerpc/boot: Add mfdcrx
From: David Laight @ 2011-11-30 18:10 UTC (permalink / raw)
  To: Segher Boessenkool, Tony Breeds; +Cc: LinuxPPC-dev
In-Reply-To: <F91E9007-5DA5-4E3A-993E-EA680DE08D79@kernel.crashing.org>

=20
> > +#define mfdcrx(rn) \
> > +	({	\
> > +		unsigned long rval; \
> > +		asm volatile("mfdcrx %0,%1" : "=3Dr"(rval) : "g"(rn)); \
> > +		rval; \
> > +	})
>=20
> "g" is never correct on PowerPC, you want "r" here.  You can write
> this as a static inline btw, you only need the #define stuff when
> there is an "i" constraint involved.

I think you can still use static inlines even when a
constraint is one that requires a compile time constant.
eg (not ppc, but the "n" become part of the instruction
word):

static __inline__ uint32_t
custom_inic(const uint32_t op, uint32_t a, const uint32_t b)
{
    uint32_t result;

    __asm__ ( "custom\t%1, %0, %2, c%3"
        : "=3Dr" (result) : "n" (op), "r" (a), "n" (b));
    return result;
}

	David

^ permalink raw reply

* Re: scsi/ipr failed to initialize >=linux-3.0.9, >=linux-3.1.1
From: acrux @ 2011-11-30 18:27 UTC (permalink / raw)
  To: acrux; +Cc: linuxppc-dev
In-Reply-To: <20111127143752.269cc898.acrux_it@libero.it>

On Sun, 27 Nov 2011 14:37:52 +0100
acrux <acrux_it@libero.it> wrote:

> 
> scsi subsystem with ipr driver fails to initialize with every kernel >=3.0.9 and >=3.1.1
> Checked on YDL_Powerstation, IBM 9114-275, IBM 9123-710,
> 

well, it seems this was already fixed by one of these two commits (only for linux-3.2):
[SCSI] ipr: Stop reading adapter dump prematurely
commit	41e9a69641fb3fa86fa9277a179f3ad261d072f7
[SCSI] ipr: Fix BUG on adapter dump timeout
commit	4c647e909fceb9df8ec8f06016dd56244045a929

indeed YDL_Powerstation scsi ctrl has no issue with linux-3.2-rc3 .
Thus i guess also other platforms can work fine.

cheers,
--nico
-- 
acrux <acrux_it@libero.it>

^ permalink raw reply

* Re: [PATCH 3/6] 44x: Removing dead CONFIG_PPC47x
From: Benjamin Herrenschmidt @ 2011-11-30 20:20 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Christoph Egger, LinuxPPC-dev
In-Reply-To: <CA+5PVA637ckfo60Foc_G_yhuJg6=fbuyQcjFDwYV_XFLkd6whA@mail.gmail.com>

On Wed, 2011-11-30 at 06:43 -0500, Josh Boyer wrote:
> 
> That doesn't look right.  The code is there doing something, why is it
> just being removed?  I would think the change would be to use
> CONFIG_PPC_47x?
> 
> Or if the code there isn't needed any longer, the changelog should say
> why.

Ah right, I tripped on this one too when reviewing then figured it out
but I agree, the changelog should be clearer.

If you notice, the original ifdef was in a function that is only ever
used on 44x. There's a separate function that handles 47x. I suppose
this is a leftover of the initial port which somebody forgot to remove.

So the patch is fine, but yes, the changelog could be made clearer.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 4/6] powerpc/boot: Add extended precision shifts to the boot wrapper.
From: Benjamin Herrenschmidt @ 2011-11-30 20:21 UTC (permalink / raw)
  To: Josh Boyer; +Cc: LinuxPPC-dev
In-Reply-To: <CA+5PVA6kBquh2=sPy25PKGSMBWvoZFFvZjrL4bgdmddFgjyaQA@mail.gmail.com>

On Wed, 2011-11-30 at 06:45 -0500, Josh Boyer wrote:
> On Wed, Nov 30, 2011 at 12:48 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Wed, 2011-11-30 at 16:23 +1100, Tony Breeds wrote:
> >> Code copied from arch/powerpc/kernel/misc_32.S
> >>
> >> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> >> ---
> >>  arch/powerpc/boot/div64.S |   52 +++++++++++++++++++++++++++++++++++++++++++++
> >>  1 files changed, 52 insertions(+), 0 deletions(-)
> >
> > Should we just link with libgcc ? :-)
> 
> Please tell me you're joking.

Only half... I wonder what it would look like. Wouldn't ld only pickup
what we use anyway ?

> However, adding this code and wonderful and all but why do we need to
> add it?  Changelog should say why.

Agreed.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 4/6] powerpc/boot: Add extended precision shifts to the boot wrapper.
From: Scott Wood @ 2011-11-30 20:28 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: LinuxPPC-dev
In-Reply-To: <1322684475.29041.2.camel@pasglop>

On 11/30/2011 02:21 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2011-11-30 at 06:45 -0500, Josh Boyer wrote:
>> On Wed, Nov 30, 2011 at 12:48 AM, Benjamin Herrenschmidt
>> <benh@kernel.crashing.org> wrote:
>>> On Wed, 2011-11-30 at 16:23 +1100, Tony Breeds wrote:
>>>> Code copied from arch/powerpc/kernel/misc_32.S
>>>>
>>>> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
>>>> ---
>>>>  arch/powerpc/boot/div64.S |   52 +++++++++++++++++++++++++++++++++++++++++++++
>>>>  1 files changed, 52 insertions(+), 0 deletions(-)
>>>
>>> Should we just link with libgcc ? :-)
>>
>> Please tell me you're joking.
> 
> Only half... I wonder what it would look like. Wouldn't ld only pickup
> what we use anyway ?

We do it in U-Boot...

Only problem I see is that it would need to be built as soft-float, and
apparently that isn't supported by gcc on 64-bit ppc.  Unfortunately
there's isn't a "no float" mode.

-Scott

^ permalink raw reply

* Re: [PATCH 6/8] powerpc/ps3: Fix pr_debug build warnings
From: Geert Uytterhoeven @ 2011-11-30 20:57 UTC (permalink / raw)
  To: Geoff Levand; +Cc: cbe-oss-dev, linuxppc-dev
In-Reply-To: <0b0c0efa6cca6e5e169b4b3e81373a3bab75716a.1322615824.git.geoff@infradead.org>

Hi Geoff,

On Wed, Nov 30, 2011 at 02:38, Geoff Levand <geoff@infradead.org> wrote:
> Fix some PS3 build warnings when DEBUG is defined.
>
> Fixes warnings like these:
>
> =C2=A0format '%lx' expects type 'long unsigned int', but argument 7 has t=
ype 'u64'
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
> =C2=A0arch/powerpc/platforms/ps3/repository.c | =C2=A0 14 ++++++++------
> =C2=A01 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/platforms/ps3/repository.c b/arch/powerpc/platf=
orms/ps3/repository.c
> index cb68729..2ce2782 100644
> --- a/arch/powerpc/platforms/ps3/repository.c
> +++ b/arch/powerpc/platforms/ps3/repository.c
> @@ -1050,7 +1050,7 @@ int ps3_repository_dump_resource_info(const struct =
ps3_repository_device *repo)
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0pr_debug("%s:%d (%=
u:%u) reg_type %u, bus_addr %lxh, len %lxh\n",
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0__func__, __LINE__, repo->bus_index, repo->dev_index,
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 reg_type, bus_addr, len);
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 reg_type, (unsigned long)bus_addr, (unsigned long)len);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0}
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0pr_debug(" <- %s:%d\n", __func__, __LINE__);

The correct way to format u64 is using the "ll" length modifier. That way y=
ou
don't need casts.

The code above was originally written before the u64 uniformization, when p=
pc64
was still using "unsigned long" for u64.

Gr{oetje,eeting}s,

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k=
.org

In personal conversations with technical people, I call myself a hacker. Bu=
t
when I'm talking to journalists I just say "programmer" or something like t=
hat.
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0 =C2=A0=C2=A0 -- Linus Torvalds

^ permalink raw reply

* Re: [BUG?]3.0-rc4+ftrace+kprobe: set kprobe at instruction 'stwu' lead to system crash/freeze
From: Benjamin Herrenschmidt @ 2011-11-30 21:00 UTC (permalink / raw)
  To: tiejun.chen
  Cc: Jim Keniston, Anton Blanchard, linux-kernel, Steven Rostedt,
	Yong Zhang, paulus, yrl.pp-manager.tt, Masami Hiramatsu,
	linuxppc-dev
In-Reply-To: <4ED60E2B.8030603@windriver.com>

On Wed, 2011-11-30 at 19:06 +0800, tiejun.chen wrote:

> >  - Copy the exception frame we're about to unwind to just -below- the
> > new r1 value we want to write to. Then perform the write, and change
> > r1 to point to that copy of the frame.
> > 
> >  - Branch to restore: which will unwind everything from that copy of
> > the frame, and eventually set r1 to GPR(1) in the copy which contains
> > the new value of r1.
> 
> We still can't restore this there.

Yes, we can since we have copied the pt_regs down to -below- where we
are going to write to. That's the whole trick. We copy the pt_regs
somewhere "safe" and restore from there.

> I mean we have to do that real restore here. So I'm really not sure if its a
> good way to add such a codes including check TIF/copy-get new r1/restore
> operation here since this is so deep for the exception return code.

No. Re-read my explanation.

In the do_work case (so when things are still easy), we copy the pt_regs
of the return frame to a safe place (right below where we want to write
to typically), and change r1 to point to this "new" frame which we use
to restore from. Then we do the store which is now safe and go to an
unmodified "restore" exit path.

> > This is the less intrusive approach and should work just fine, it's also
> > more robust than anything I've been able to think of and the approach
> > would work for 32 and 64-bit similarily.
> > 
> > (***) Above comment about a bug: If you look at entry_64.S version of
> > ret_from_except_lite you'll notice that in the !preempt case, after
> > we've checked MSR_PR we test for any TIF flag in _TIF_USER_WORK_MASK to
> > decide whether to go to do_work or not. However, in the preempt case, we
> > do a convoluted trick to test SIGPENDING only if PR was set and always
> > test NEED_RESCHED ... but we forget to test any other bit of
> > _TIF_USER_WORK_MASK !!! So that means that with preempt, we completely
> > fail to test for things like single step, syscall tracing, etc...
> > 
> 
> This is another problem we should address.
> 
> > I think this should be fixed at the same time, by simplifying the code
> > by doing:
> > 
> >  - Test PR. If set, go to test_work_user, else continue (or the other
> > way around and call it test_work_kernel)
> > 
> >  - In test_work_user, always test for _TIF_USER_WORK_MASK to decide to
> > go to do_work, maybe call it do_user_work
> > 
> >  - In test_work_kernel, test for _TIF_KERNEL_WORK_MASK which is set to
> > our new flag along with NEED_RESCHED if preempt is enabled and branch to
> > do_kernel_work.
> > 
> > do_user_work is basically the same as today's user_work
> > 
> > do_kernel_work is basically the same as today preempt block with added
> > code to handle the new flag as described above.
> > 
> > Is anybody volunteering for fixing that ? I don't have the bandwidth
> 
> I always use one specific kprobe stack to fix this for BOOKE and work well in my
> local tree :) Do you remember my v3 patch? I think its possible to extend this
> for all PPC variants.
>
> Anyway, I'd like to be this volunteer with our last solution.

So the second problem I exposed, for which you just volunteered
(hint :-) is an orthogonal issue not related to kprobe or stacks which
happen to be something I discovered yesterday while looking at the code.

As for the solution to the emulation problem, re-read my explanation.
The whole trick is that we can avoid a separate stack (which I really
want to avoid) and we can avoid touching the low level restore code
path.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH v2] powerpc/40x: Add APM8018X SOC support
From: Benjamin Herrenschmidt @ 2011-11-30 21:03 UTC (permalink / raw)
  To: Tanmay Inamdar; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1322550101-28520-1-git-send-email-tinamdar@apm.com>

On Tue, 2011-11-29 at 12:31 +0530, Tanmay Inamdar wrote:
> The AppliedMicro APM8018X embedded processor targets embedded applications that
> require low power and a small footprint. It features a PowerPC 405 processor
> core built in a 65nm low-power CMOS process with a five-stage pipeline executing
> up to one instruction per cycle. The family has 128-kbytes of on-chip memory,
> a 128-bit local bus and on-chip DDR2 SDRAM controller with 16-bit interface.

Thanks, looks much better.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH v2] powerpc/40x: Add APM8018X SOC support
From: Josh Boyer @ 2011-11-30 21:07 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Tanmay Inamdar, linux-kernel
In-Reply-To: <1322686997.29041.11.camel@pasglop>

On Wed, Nov 30, 2011 at 4:03 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2011-11-29 at 12:31 +0530, Tanmay Inamdar wrote:
>> The AppliedMicro APM8018X embedded processor targets embedded applications that
>> require low power and a small footprint. It features a PowerPC 405 processor
>> core built in a 65nm low-power CMOS process with a five-stage pipeline executing
>> up to one instruction per cycle. The family has 128-kbytes of on-chip memory,
>> a 128-bit local bus and on-chip DDR2 SDRAM controller with 16-bit interface.
>
> Thanks, looks much better.

Yes, agreed.  So much better I already sent it to Ben in a pull request.

josh

^ permalink raw reply

* Re: [PATCH 4/6] powerpc/boot: Add extended precision shifts to the boot wrapper.
From: Tony Breeds @ 2011-11-30 23:25 UTC (permalink / raw)
  To: Josh Boyer; +Cc: LinuxPPC-dev
In-Reply-To: <CA+5PVA6kBquh2=sPy25PKGSMBWvoZFFvZjrL4bgdmddFgjyaQA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 856 bytes --]

On Wed, Nov 30, 2011 at 06:45:37AM -0500, Josh Boyer wrote:

> However, adding this code and wonderful and all but why do we need to
> add it?  Changelog should say why.

Okay changelog updated to:

    powerpc/boot: Add extended precision shifts to the boot wrapper.
    
    The upcoming currituck patches will need to do 64-bit shifts which will
    fail with undefined symbol without this patch.
    
    I looked at linking against libgcc but we can't guarantee that libgcc
    was compiled with soft-float.  Also Using ../lib/div64.S or
    ../kernel/misc_32.S, this will break the build as the .o's need to be
    built with different flags for the bootwrapper vs the kernel.  So for
    now the easiest option is to just copy code from
    arch/powerpc/kernel/misc_32.S  I don't think this code changes too often ;P

Yours Tony

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH 5/6] powerpc/boot: Add mfdcrx
From: Tony Breeds @ 2011-11-30 23:30 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: LinuxPPC-dev
In-Reply-To: <F91E9007-5DA5-4E3A-993E-EA680DE08D79@kernel.crashing.org>

[-- Attachment #1: Type: text/plain, Size: 593 bytes --]

On Wed, Nov 30, 2011 at 02:09:20PM +0100, Segher Boessenkool wrote:
> >+#define mfdcrx(rn) \
> >+	({	\
> >+		unsigned long rval; \
> >+		asm volatile("mfdcrx %0,%1" : "=r"(rval) : "g"(rn)); \
> >+		rval; \
> >+	})
> 
> "g" is never correct on PowerPC, you want "r" here.  You can write
> this as a static inline btw, you only need the #define stuff when
> there is an "i" constraint involved.

Okay I'll change it to "i", mostly I used a #define to match the style
of m[tf]dcr.  To be honnest I didn't know about the issue with "i"
constraints and static inlines.

Yours Tony

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH 5/6] powerpc/boot: Add mfdcrx
From: Tony Breeds @ 2011-11-30 23:35 UTC (permalink / raw)
  To: Segher Boessenkool, Benjamin Herrenschmidt, Josh Boyer,
	LinuxPPC-dev
In-Reply-To: <20111130233027.GC15560@thor.bakeyournoodle.com>

[-- Attachment #1: Type: text/plain, Size: 176 bytes --]

On Thu, Dec 01, 2011 at 10:30:27AM +1100, Tony Breeds wrote:

> Okay I'll change it to "i", mostly I used a #define to match the style

Of course I menat "r" here.

Yours Tony

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: Please pull 'next' branch of new 4xx tree
From: Stephen Rothwell @ 2011-11-30 23:41 UTC (permalink / raw)
  To: Josh Boyer; +Cc: linuxppc-dev
In-Reply-To: <CA+5PVA6ahzxWHqBiESw8x7RVC1n8hjqBPvMnQ8qfb=bpwMeuzQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 409 bytes --]

Hi Josh,

On Wed, 30 Nov 2011 10:14:52 -0500 Josh Boyer <jwboyer@gmail.com> wrote:
>
> Stephen, can you switch the linux-next tree to use this instead?
> 
> are available in the git repository at:
>   git://git.infradead.org/users/jwboyer/powerpc-4xx.git next

I have switched to that now.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH 02/10] powerpc: Consolidate mpic_alloc() OF address translation
From: Paul Mackerras @ 2011-12-01  1:04 UTC (permalink / raw)
  To: Kyle Moffett; +Cc: linuxppc-dev
In-Reply-To: <1322593117-29938-3-git-send-email-Kyle.D.Moffett@boeing.com>

On Tue, Nov 29, 2011 at 01:58:29PM -0500, Kyle Moffett wrote:
> diff --git a/arch/powerpc/platforms/powermac/pic.c b/arch/powerpc/platforms/powermac/pic.c
> index 901bfbd..44f9774 100644
> --- a/arch/powerpc/platforms/powermac/pic.c
> +++ b/arch/powerpc/platforms/powermac/pic.c
> @@ -498,15 +498,10 @@ static struct mpic * __init pmac_setup_one_mpic(struct device_node *np,
>  						int master)
>  {
>  	const char *name = master ? " MPIC 1   " : " MPIC 2   ";
> -	struct resource r;
>  	struct mpic *mpic;
>  	unsigned int flags = master ? MPIC_PRIMARY : 0;
>  	int rc;
>  
> -	rc = of_address_to_resource(np, 0, &r);
> -	if (rc)
> -		return NULL;
> -

This gets me an unused variable warning, which because we compile
arch/powerpc with -Werror is fatal:

  CC      arch/powerpc/platforms/powermac/pic.o
/home/paulus/kernel/kvm-merge/arch/powerpc/platforms/powermac/pic.c: In function ‘pmac_setup_one_mpic’:
/home/paulus/kernel/kvm-merge/arch/powerpc/platforms/powermac/pic.c:491:6: error: unused variable ‘rc’ [-Werror=unused-variable]
cc1: all warnings being treated as errors

Need to remove the declaration of rc as well.

Paul.

^ permalink raw reply

* Re: [PATCH 1/6] 44x/pci: Continue pci setup even if there is no sdr-base in the device-tree
From: Tony Breeds @ 2011-12-01  1:11 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: LinuxPPC-dev
In-Reply-To: <1322631973.21641.42.camel@pasglop>

[-- Attachment #1: Type: text/plain, Size: 455 bytes --]

On Wed, Nov 30, 2011 at 04:46:13PM +1100, Benjamin Herrenschmidt wrote:

> If you don't expect an sdr-base as part of the normal operations of that
> bridge, don't bring a message that makes me think something is wrong :-)
> 
> Just changing the severity isn't enough. you should just remove the
> message and later on, print/warn/error out if you decide you actually
> need an sdr-base (such as in the backend).

Okay no problem.

Yours Tony

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* [PATCH] powerpc/nvram: Add spinlock to oops_to_nvram to prevent oops in compression code.
From: Anton Blanchard @ 2011-12-01  1:46 UTC (permalink / raw)
  To: benh, paulus, hbabu, jkenisto; +Cc: linuxppc-dev


When issuing a system reset we almost always oops in the oops_to_nvram
code because multiple CPUs are using the deflate work area. Add a
spinlock to protect it.

To play it safe I'm using trylock to avoid locking up if the NVRAM
code oopses. This means we might miss multiple CPUs oopsing at exactly
the same time but I think it's best to play it safe for now. Once we
are happy with the reliability we can change it to a full spinlock.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: linux-build/arch/powerpc/platforms/pseries/nvram.c
===================================================================
--- linux-build.orig/arch/powerpc/platforms/pseries/nvram.c	2011-12-01 09:44:27.205568463 +1100
+++ linux-build/arch/powerpc/platforms/pseries/nvram.c	2011-12-01 12:36:49.334478156 +1100
@@ -634,6 +634,8 @@ static void oops_to_nvram(struct kmsg_du
 {
 	static unsigned int oops_count = 0;
 	static bool panicking = false;
+	static DEFINE_SPINLOCK(lock);
+	unsigned long flags;
 	size_t text_len;
 	unsigned int err_type = ERR_TYPE_KERNEL_PANIC_GZ;
 	int rc = -1;
@@ -664,6 +666,9 @@ static void oops_to_nvram(struct kmsg_du
 	if (clobbering_unread_rtas_event())
 		return;
 
+	if (!spin_trylock_irqsave(&lock, flags))
+		return;
+
 	if (big_oops_buf) {
 		text_len = capture_last_msgs(old_msgs, old_len,
 			new_msgs, new_len, big_oops_buf, big_oops_buf_sz);
@@ -679,4 +684,6 @@ static void oops_to_nvram(struct kmsg_du
 
 	(void) nvram_write_os_partition(&oops_log_partition, oops_buf,
 		(int) (sizeof(*oops_len) + *oops_len), err_type, ++oops_count);
+
+	spin_unlock_irqrestore(&lock, flags);
 }

^ permalink raw reply

* Re: [PATCH] powerpc/nvram: Add spinlock to oops_to_nvram to prevent oops in compression code.
From: Benjamin Herrenschmidt @ 2011-12-01  2:11 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: jkenisto, paulus, linuxppc-dev
In-Reply-To: <20111201124645.24c6e54f@kryten>

On Thu, 2011-12-01 at 12:46 +1100, Anton Blanchard wrote:
> When issuing a system reset we almost always oops in the oops_to_nvram
> code because multiple CPUs are using the deflate work area. Add a
> spinlock to protect it.
> 
> To play it safe I'm using trylock to avoid locking up if the NVRAM
> code oopses. This means we might miss multiple CPUs oopsing at exactly
> the same time but I think it's best to play it safe for now. Once we
> are happy with the reliability we can change it to a full spinlock.

How would we miss ?

trylock does loop on stwcx. failure, it doesn't loop if the lock is
-taken-, so if the lock is only used for actually dealing with the oops
the only "miss" is because somebody already got it... or am I missing
something ?

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] powerpc/nvram: Add spinlock to oops_to_nvram to prevent oops in compression code.
From: Anton Blanchard @ 2011-12-01  2:26 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: jkenisto, paulus, linuxppc-dev
In-Reply-To: <1322705472.3729.11.camel@pasglop>

Hi Ben,

> How would we miss ?
> 
> trylock does loop on stwcx. failure, it doesn't loop if the lock is
> -taken-, so if the lock is only used for actually dealing with the
> oops the only "miss" is because somebody already got it... or am I
> missing something ?

I'm thinking of two CPUs that enter at exactly the same time either
through a system reset or an ugly bug (writing junk at 0x900 so the
decrementer exception gets an oops). Probably unlikely enough that we
don't care.

Anton

^ permalink raw reply

* Re: lmb_alloc() and page memory overlap
From: Benjamin Herrenschmidt @ 2011-12-01  4:00 UTC (permalink / raw)
  To: Prashant Bhole; +Cc: linuxppc-dev
In-Reply-To: <CAD6p20evHeESH1UwpB4rg8NTxndAtQW9C1S1bpJWFLmS-rOu5w@mail.gmail.com>

On Tue, 2011-11-29 at 18:51 +0530, Prashant Bhole wrote:
> Hi,
> I am using custom 460ex board with kernel version 2.6.30.
> I noticed that page_alloc() is returning a page whose memory
> is already allocated by lmb_alloc() while unflattening the device
> tree. As per my knowledge the memory allocated by lmb_alloc()
> should be reserved till the end, right?

This should have been fixed in memblock in recent kernel, at least I
believe it is. It looks like this is caused by overlapping lmb_reserve()
at boot (or lmb_reserve() overlapping an lmb_alloc'ated region which
boils down to the same thing).

Old lmb didn't deal with that well at all and that lead to corruption of
the lmb list. We fixed that in

8f7a66051b7523108c5aefb08c6a637e54aedc47

    mm/memblock: properly handle overlaps and fix error path

Which got merged in 2.6.39.

If you absolutely need to stick to 2.6.30, you can try backporting the
fix to lmb.

Cheers,
Ben.

> Some more explanation of what I observed:
> 
> unflatten_device_tree() allocates memory, which will be used
> for "struct node" objects in the device tree. I obtained base
> address of allocated memory in "unsigned long base_mem"
> 
> Now I executed the following code after the kernel booted properly.
> 
> ---------------------------------------------------------------
> extern unsigned long mem; // lmb_alloc() memory
> struct page *test_page = virt_to_page(mem);
> struct page *new_page = NULL;
> 
> while(1)
> {
>     new_page = NULL;
>     new_page = alloc_page(GFP_KERNEL);
>     if(!new_page)
>     {
>         printk("Allocation failed\n");
>         while(1);
>     }
>     if(test_page == new_page)
>     {
>          printk("Memory already allocated by lmb_alloc\n");
>          while(1);
>     }
> }
> ---------------------------------------------------------------
> 
> After many page allocations, I always hit the condition (test_page == new_page).
> Am I doing anything wrong here?
> Has anybody faced this kind of problem before?
> 
> 
> I also noticed that lmb_dump_all() shows 2 regions overlapping (last two):
> 
> LMB configuration:
>  rmo_size    = 0x30000000
>  memory.size = 0x30000000
>  memory.cnt  = 0x1
>  memory[0x0]    0x0000000000000000 - 0x000000002fffffff, 0x30000000 bytes
>  reserved.cnt  = 0x6
>  reserved[0x0]  0x0000000000000000 - 0x00000000006bffff, 0x6c0000 bytes
>  reserved[0x1]  0x0000000000ffa000 - 0x0000000000ffcfff, 0x3000 bytes
>  reserved[0x2]  0x000000002fdd0000 - 0x000000002fddffff, 0x10000 bytes
>  reserved[0x3]  0x000000002fde4000 - 0x000000002fde9fff, 0x6000 bytes
>  reserved[0x4]  0x000000002fdeb060 - 0x000000002ffff768, 0x214709 bytes
>  reserved[0x5]  0x000000002fdee000 - 0x000000002ffff769, 0x21176a bytes
> 
> 
> Thanks,
> Prashant
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox