LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] powerpc: mtmsrd not defined
From: Kumar Gala @ 2010-09-01 13:01 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, Sean MacLennan
In-Reply-To: <20100901080214.GA4722@brick.ozlabs.ibm.com>


On Sep 1, 2010, at 3:02 AM, Paul Mackerras wrote:

> On Tue, Aug 31, 2010 at 10:55:25PM -0400, Sean MacLennan wrote:
>=20
>> I had to give up. Without the CONFIG_PPC_FPU it compiles fine for an
>> FPU less 44x. But with the CONFIG_PPC_FPU, I get the following =
errors:
>=20
> Ah, those references would be from arch/powerpc/lib/sstep.c.  =
Evidently
> we need #ifdef CONFIG_PPC_FPU around the emulation of the =
floating-point
> loads and stores.
>=20
> Do we do any in-kernel emulation of floating-point operations with
> CONFIG_PPC_FPU turned off?

I'm not aware of any.  I haven't looked at the Program Check exception =
path to know if we'd handle emulation gracefully or not for =
!CONFIG_PPC_FPU.

- k=

^ permalink raw reply

* Re: [linuxppc-release] [PATCH 1/2] powerpc/mm: Assume first cpu is boot_cpuid not 0
From: Timur Tabi @ 2010-09-01 13:05 UTC (permalink / raw)
  To: Matthew McClintock; +Cc: linuxppc-dev
In-Reply-To: <1283297085-3455-1-git-send-email-msm@freescale.com>

Matthew McClintock wrote:
> +#ifndef CONFIG_SMP
>  	stale_map[0] = alloc_bootmem(CTX_MAP_SIZE);
> +#else
> +	stale_map[boot_cpuid] = alloc_bootmem(CTX_MAP_SIZE);

So you're saying that even on a non-SMP kernel, boot_cpuid might not be zero?

^ permalink raw reply

* Re: [linuxppc-release] [PATCH 1/2] powerpc/mm: Assume first cpu is boot_cpuid not 0
From: Kumar Gala @ 2010-09-01 13:12 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Matthew McClintock, linuxppc-dev
In-Reply-To: <4C7E4FA4.60909@freescale.com>


On Sep 1, 2010, at 8:05 AM, Timur Tabi wrote:

> Matthew McClintock wrote:
>> +#ifndef CONFIG_SMP
>> 	stale_map[0] =3D alloc_bootmem(CTX_MAP_SIZE);
>> +#else
>> +	stale_map[boot_cpuid] =3D alloc_bootmem(CTX_MAP_SIZE);
>=20
> So you're saying that even on a non-SMP kernel, boot_cpuid might not =
be zero?

That is correct, we have that situation on AMP kernel boots on 8572, =
p2020, etc.

- k=

^ permalink raw reply

* Re: [v1 PATCH] ucc_geth: fix ethtool set ring param bug
From: Ben Hutchings @ 2010-09-01 13:42 UTC (permalink / raw)
  To: Liang Li; +Cc: netdev, avorontsov, davem, linuxppc-dev
In-Reply-To: <1283305429-8426-1-git-send-email-liang.li@windriver.com>

On Wed, 2010-09-01 at 09:43 +0800, Liang Li wrote:
> It's common sense that when we should do change to driver ring
> desc/buffer etc only after 'stop/shutdown' the device. When we
> do change while devices/driver is running, kernel oops occur:
[...]
> -	ug_info->bdRingLenRx[queue] = ring->rx_pending;
> -	ug_info->bdRingLenTx[queue] = ring->tx_pending;
> -
>  	if (netif_running(netdev)) {
> -		/* FIXME: restart automatically */
> -		printk(KERN_INFO
> -			"Please re-open the interface.\n");
> +		u16 rx_t;
> +		u16 tx_t;
> +		printk(KERN_INFO "Stopping interface %s.\n", netdev->name);
> +		ucc_geth_close(netdev);
> +
> +		rx_t = ug_info->bdRingLenRx[queue];
> +		tx_t = ug_info->bdRingLenTx[queue];
> +
> +		ug_info->bdRingLenRx[queue] = ring->rx_pending;
> +		ug_info->bdRingLenTx[queue] = ring->tx_pending;
> +
> +		printk(KERN_INFO "Reactivating interface %s.\n", netdev->name);
> +		ret = ucc_geth_open(netdev);
> +		if (ret) {
> +			printk(KERN_WARNING "uec_set_ringparam: set ring param for running"
> +					" interface %s failed. Please try to make the interface "
> +					" down, then try again.\n", netdev->name);
> +			ug_info->bdRingLenRx[queue] = rx_t;
> +			ug_info->bdRingLenTx[queue] = tx_t;
> +		}
[...]

Bringing the interface down will call ucc_geth_close(), which will try
to free resources that have not been allocated!

If you cannot roll back a failed change then at least use dev_close()
and dev_open() rather than calling ucc_geth_{close,open}() directly, so
that the interface state is updated correctly.  Or just refuse to make
the change if the interface is up.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* ERR_PTR pattern in phylib
From: Grant Likely @ 2010-09-01 14:42 UTC (permalink / raw)
  To: Andy Fleming, netdev, linuxppc-dev

Hi Andy,

I've been looking at the phylib code, and specifically at the use of
the ERR_PTR() pattern.  I'm personally not a big fan of ERR_PTR()
because the compiler cannot type check the result and it runs
completely counter to the pattern "if (!ptr)" than is common for
testing a pointer return value.

(That being said, I do understand the need for it in certain parts of
the kernel (like the filesystem code) where it is important to be both
efficient because it is a hot path and still able to accurately return
an error code that is used by userspace.)

It seems to me that phylib is one of the cases where the users (the
network drivers) don't actually care about the specific error code
when calling phylib functions.  The drivers only seem to care whether
or not the function failed, and if it did then bail out.  I've also
noticed that using the "if (!ptr)" test on phylib return values is a
common error for driver writers.

In the interest of making driver code easier to write and review,
would you be opposed to a set of patches to remove the ERR_PTR()
pattern from phylib and its users?

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.

^ permalink raw reply

* Re: [PATCH 1/7] drivers/macintosh/via-pmu-led.c: Add of_node_put to avoid memory leak
From: walter harms @ 2010-09-01 15:03 UTC (permalink / raw)
  To: Grant Likely
  Cc: Vasiliy Kulikov, devicetree-discuss, kernel-janitors,
	linux-kernel, Julia Lawall, linuxppc-dev
In-Reply-To: <AANLkTi=3b0NY5pFepv-TmjFDmPUBK9Ew16HyTn-+YguX@mail.gmail.com>



Grant Likely schrieb:
> On Tue, Aug 31, 2010 at 10:16 AM, Vasiliy Kulikov <segooon@gmail.com> wrote:
>> On Tue, Aug 31, 2010 at 18:08 +0200, Julia Lawall wrote:
>>> On Tue, 31 Aug 2010, walter harms wrote:
>>>>>   if (strncmp(model, "PowerBook", strlen("PowerBook")) != 0 &&
>>>>>       strncmp(model, "iBook", strlen("iBook")) != 0 &&
>>>>>       strcmp(model, "PowerMac7,2") != 0 &&
>>>>>
>>>> is there any rule that says when to use strncmp ? it seems perfecly valid to use strcpy here
>>>> (what is done in the last cmp).
>>> Perhaps there are some characters after eg PowerBook that one doesn't want
>>> to compare with?
>> It seems to me that model has no '\0' in the end. If model is got from
>> the hardware then we should double check it - maybe harware is buggy.
>> Otherwise we'll overflow model.
> 
> Model does have \0 at the end.  This code is using strncmp to
> purposefully ignore the model suffix.
> 
>> But why strcmp(model, "PowerMac7,2")? IMO it should be replaced
>> with strncmp().
> 
> We use strcmp when parsing the device tree because the the length of
> the model property string is unknown and in most cases we *must* match
> the exact entire string, such as with this PowerMac7,2 example.  Using
> strncmp would also happen to match with something like
> "PowerMac7,2345" which is not the desired behaviour.
> 

hi Grant,
whould you mind to use you explanation as comment in the code ?
Tthat the strncpy/strcpy difference is important should be noted. that would be clearly a
bonos with further audits.

re,
 wh

^ permalink raw reply

* PCI: device not available (can't reserve [mem 0x00000000-0x0003ffff])
From: Ravi Gupta @ 2010-09-01 15:07 UTC (permalink / raw)
  To: linuxppc-dev, MJ embd, Ira W. Snyder, Anton Vorontsov

[-- Attachment #1: Type: text/plain, Size: 1159 bytes --]

Hi,

I am facing a strange problem with a PCI express device. I have written a
simple PCI driver(pci_skel sample driver for LDD book) just to detect a PCI
device and read its vender and device id from its configuration space. When
I connect the device on a standard i386 based PC, it works fine, as
expected. But when I run the same driver on the MPC837xERDB board(powerpc
arch based processor from freescale), it gives me error. Below is the O/P on
both the architectures. Could anyone suggest what can be the cause of this
problem? I am new to linux device driver development.

i386 output.

[  921.097050] PCI driver: Init function
[  921.097415] PCI driver: Probe function
[  921.097442] pci_skel 0000:04:00.0: PCI INT A -> GSI 18 (level, low) ->
IRQ 18
[  921.097453] PCI driver Vendor ID = 1204
[  921.097462] PCI driver Device ID = e250
[  921.097471] PCI driver Revision = 0


Powerpc output

PCI driver: Init function
PCI driver: Probe function
pci_skel 0001:02:00.0: device not available (can't reserve [mem
0x00000000-0x0003ffff])
Unable to Enable PCI device:-22
pci_skel: probe of 0001:02:00.0 failed with error -22


Thanks in advance
Ravi Gupta

[-- Attachment #2: Type: text/html, Size: 1276 bytes --]

^ permalink raw reply

* Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
From: Darren Hart @ 2010-09-01 15:10 UTC (permalink / raw)
  To: michael
  Cc: Stephen Rothwell, Gautham R Shenoy, Josh Triplett, Steven Rostedt,
	linuxppc-dev, Will Schmidt, Paul Mackerras, Ankita Garg,
	Thomas Gleixner
In-Reply-To: <1283320481.32679.32.camel@concordia>

On 08/31/2010 10:54 PM, Michael Ellerman wrote:
> On Tue, 2010-08-31 at 00:12 -0700, Darren Hart wrote:
> ..
>>
>> When running with the function plugin I had to stop the trace
>> immediately before entering start_secondary after an online or my traces
>> would not include the pseries_mach_cpu_die function, nor the tracing I
>> added there (possibly buffer size, I am using 2048). The following trace
>> was collected using "trace-cmd record -p function -e irq -e sched" and
>> has been filtered to only show CPU [001] (the CPU undergoing the
>> offline/online test, and the one seeing preempt_count (pcnt) go to
>> ffffffff after cede. The function tracer does not indicate anything
>> running on the CPU other than the HCALL - unless the __trace_hcall*
>> commands might be to blame. 
> 
> It's not impossible. Though normally they're disabled right, so the only
> reason they're running is because you're tracing. So if they are causing
> the bug then that doesn't explain why you see it normally.
> 
> Still, might be worth disabling just the hcall tracepoints just to be
> 100% sure.

A couple of updates. I was working from tip/rt/head, which has been
stale for some months now. I switched to tip/rt/2.6.33 and the
preempt_count() change over cede went away. I now see the live hang that
Will has reported.

Before I dive into the live hang, I want to understand what fixed the
preempt_count() change. That might start pointing us in the right
direction for the live hang.

I did an inverted git bisect between tip/rt/head and tip/rt/2.6.33 to
try and locate the fix. I've narrowed it down to the 2.6.33.6 merge:

# git show 7e1af1172bbd4109d09ac515c5d376f633da7cff
commit 7e1af1172bbd4109d09ac515c5d376f633da7cff
Merge: d8e94db 9666790
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Jul 13 16:01:16 2010 +0200

    Merge
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.33.y

    Conflicts:
        Makefile

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


Visual inspection yields two patches of interest:

f8b67691828321f5c85bb853283aa101ae673130
powerpc/pseries: Make query-cpu-stopped callable outside hotplug cpu

aef40e87d866355ffd279ab21021de733242d0d5
powerpc/pseries: Only call start-cpu when a CPU is stopped

I'm going to try reverting these today and see if they addressed the
issue indirectly.


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

^ permalink raw reply

* Re: [PATCH] gpiolib: Add 'struct gpio_chip' forward declaration for !GPIOLIB case
From: Grant Likely @ 2010-09-01 15:26 UTC (permalink / raw)
  To: Anton Vorontsov
  Cc: David Brownell, Andrew Morton, Benjamin Herrenschmidt,
	linuxppc-dev, linux-kernel
In-Reply-To: <20100831170756.GA24887@oksana.dev.rtsoft.ru>

On Tue, Aug 31, 2010 at 09:07:56PM +0400, Anton Vorontsov wrote:
> On Tue, Aug 31, 2010 at 10:44:14AM -0600, Grant Likely wrote:
> > On Tue, Aug 31, 2010 at 2:37 AM, Anton Vorontsov <cbouatmailru@gmail.com> wrote:
> > > On Tue, Aug 31, 2010 at 02:03:44AM -0600, Grant Likely wrote:
> > >> On Tue, Aug 24, 2010 at 01:26:23PM +0400, Anton Vorontsov wrote:
> > >> > With CONFIG_GPIOLIB=n, the 'struct gpio_chip' is not declared,
> > >> > so the following pops up on PowerPC:
> > >> >
> > >> >   cc1: warnings being treated as errors
> > >> >   In file included from arch/powerpc/platforms/52xx/mpc52xx_common.c:19:
> > >> >   include/linux/of_gpio.h:74: warning: 'struct gpio_chip' declared
> > >> >                               inside parameter list
> > >> >   include/linux/of_gpio.h:74: warning: its scope is only this definition
> > >> >                               or declaration, which is probably not what
> > >> >                           you want
> > >> >   include/linux/of_gpio.h:75: warning: 'struct gpio_chip' declared
> > >> >                               inside parameter list
> > >> >   make[2]: *** [arch/powerpc/platforms/52xx/mpc52xx_common.o] Error 1
> > >> >
> > >> > This patch fixes the issue by providing the proper forward declaration.
> > >> >
> > >> > Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
> > >>
> > >> This doesn't actually solve the problem, and gpiochip should
> > >> remain undefined when CONFIG_GPIOLIB=n to catch exactly these
> > >> build failures.  The real problem is that I merged a change
> > >> into the mpc5200 code that required CONFIG_GPIOLIB be enabled
> > >> without reflecting that requirement in Kconfig.
> > >
> > > No, look closer. The error is in of_gpio.h, and it's perfectly
> > > fine to include it w/o GPIOLIB=y.
> > 
> > Looking even closer, we're both wrong.  You're right I didn't look
> > carefully enough, and the error is in of_gpio.h, not the .c file.
> > 
> > However, it is not okay to get the definitions from of_gpio.h when
> > CONFIG_GPIOLIB=n.  If GPIOLIB, or more specifically OF_GPIO isn't set,
> > then the of_gpio.h definitions should either not be defined, or should
> > be defined as empty stubs (where appropriate).
> 
> Grrr. Grant, look again, even closer than you did.

Gah.  Very bad day yesterday.  Applied and sorry for the noise.

g.

^ permalink raw reply

* Re: ERR_PTR pattern in phylib
From: David Miller @ 2010-09-01 15:27 UTC (permalink / raw)
  To: grant.likely; +Cc: netdev, afleming, linuxppc-dev
In-Reply-To: <AANLkTimjVZTNeLdm4qptU1MqREODLASdGGanUyxqxfub@mail.gmail.com>

From: Grant Likely <grant.likely@secretlab.ca>
Date: Wed, 1 Sep 2010 08:42:49 -0600

> It seems to me that phylib is one of the cases where the users (the
> network drivers) don't actually care about the specific error code
> when calling phylib functions.  The drivers only seem to care whether
> or not the function failed, and if it did then bail out.  I've also
> noticed that using the "if (!ptr)" test on phylib return values is a
> common error for driver writers.
> 
> In the interest of making driver code easier to write and review,
> would you be opposed to a set of patches to remove the ERR_PTR()
> pattern from phylib and its users?

I'm opposed to it because it means that if code actually does
care about the error code it will no longer be able to obtain
it.

Please don't make this change, thanks.

^ permalink raw reply

* Re: [PATCH] powerpc/512x: fix clk_get() return value
From: Grant Likely @ 2010-09-01 15:32 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: linuxppc-dev
In-Reply-To: <1283244231-23659-1-git-send-email-akinobu.mita@gmail.com>

On Tue, Aug 31, 2010 at 05:43:51PM +0900, Akinobu Mita wrote:
> clk_get() should return an ERR_PTR value on error, not NULL.
> 
> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: linuxppc-dev@lists.ozlabs.org

applied, thanks.

g.

> ---
>  arch/powerpc/platforms/512x/clock.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/512x/clock.c b/arch/powerpc/platforms/512x/clock.c
> index 5b243bd..3dc2a8d 100644
> --- a/arch/powerpc/platforms/512x/clock.c
> +++ b/arch/powerpc/platforms/512x/clock.c
> @@ -57,7 +57,7 @@ static struct clk *mpc5121_clk_get(struct device *dev, const char *id)
>  	int id_match = 0;
>  
>  	if (dev == NULL || id == NULL)
> -		return NULL;
> +		return clk;
>  
>  	mutex_lock(&clocks_mutex);
>  	list_for_each_entry(p, &clocks, node) {
> -- 
> 1.7.1.231.gd0b16
> 

^ permalink raw reply

* Re: ERR_PTR pattern in phylib
From: Fleming Andy-AFLEMING @ 2010-09-01 15:50 UTC (permalink / raw)
  To: Grant Likely; +Cc: netdev, linuxppc-dev
In-Reply-To: <AANLkTimjVZTNeLdm4qptU1MqREODLASdGGanUyxqxfub@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 681 bytes --]



On Sep 1, 2010, at 9:43, Grant Likely <grant.likely@secretlab.ca> wrote:

> 
> In the interest of making driver code easier to write and review,
> would you be opposed to a set of patches to remove the ERR_PTR()
> pattern from phylib and its users?


Not at all.  I was trying to make sure all the information was available, and I'm not really a big fan of passing in a **ptr to get set.  But I don't really have a strong opinion.  This is just the design i settled on.  Heck, 99.9% of the errors are unlikely to ever happen, a they're caused by hardware malfunction!

Andy


> 
> g.
> 
> -- 
> Grant Likely, B.Sc., P.Eng.
> Secret Lab Technologies Ltd.
> 

[-- Attachment #2: Type: text/html, Size: 1236 bytes --]

^ permalink raw reply

* question about arch/powerpc/platforms/cell/celleb_scc_uhc.c
From: Julia Lawall @ 2010-09-01 15:51 UTC (permalink / raw)
  To: Arnd Bergmann, Benjamin Herrenschmidt
  Cc: cbe-oss-dev, linuxppc-dev, Paul Mackerras, joe

The file arch/powerpc/platforms/cell/celleb_scc_uhc.c contains the 
following function:

static inline int uhc_clkctrl_ready(u32 val)
{
        const u32 mask = SCC_UHC_USBCEN | SCC_UHC_USBCEN;
        return((val & mask) == mask);
}

The variable mask is a bit or of two identical constants.  Later in the 
same file in the function enable_scc_uhc, I see the code:

val |= (SCC_UHC_USBCEN | SCC_UHC_USBEN);

Should the code in uhc_clkctrl_ready also be SCC_UHC_USBCEN | SCC_UHC_USBEN?
Or something else?

thanks,
julia

^ permalink raw reply

* Re: question about arch/powerpc/platforms/cell/celleb_scc_uhc.c
From: Joe Perches @ 2010-09-01 16:02 UTC (permalink / raw)
  To: Julia Lawall; +Cc: cbe-oss-dev, Paul Mackerras, linuxppc-dev, Arnd Bergmann
In-Reply-To: <Pine.LNX.4.64.1009011748050.32566@ask.diku.dk>

On Wed, 2010-09-01 at 17:51 +0200, Julia Lawall wrote:
> The file arch/powerpc/platforms/cell/celleb_scc_uhc.c contains the 
> following function:
> static inline int uhc_clkctrl_ready(u32 val)
> {
>         const u32 mask = SCC_UHC_USBCEN | SCC_UHC_USBCEN;
>         return((val & mask) == mask);
> }
> The variable mask is a bit or of two identical constants.  Later in the 
> same file in the function enable_scc_uhc, I see the code:
> val |= (SCC_UHC_USBCEN | SCC_UHC_USBEN);
> Should the code in uhc_clkctrl_ready also be SCC_UHC_USBCEN | SCC_UHC_USBEN?
> Or something else?

Thanks Julia.

This was also noticed a couple of years ago and not applied
http://lkml.indiana.edu/hypermail/linux/kernel/0808.2/1428.html

One day...

^ permalink raw reply

* Re: question about arch/powerpc/platforms/cell/celleb_scc_uhc.c
From: Julia Lawall @ 2010-09-01 16:05 UTC (permalink / raw)
  To: Joe Perches; +Cc: cbe-oss-dev, Paul Mackerras, linuxppc-dev, Arnd Bergmann
In-Reply-To: <1283356950.1797.159.camel@Joe-Laptop>

On Wed, 1 Sep 2010, Joe Perches wrote:

> On Wed, 2010-09-01 at 17:51 +0200, Julia Lawall wrote:
> > The file arch/powerpc/platforms/cell/celleb_scc_uhc.c contains the 
> > following function:
> > static inline int uhc_clkctrl_ready(u32 val)
> > {
> >         const u32 mask = SCC_UHC_USBCEN | SCC_UHC_USBCEN;
> >         return((val & mask) == mask);
> > }
> > The variable mask is a bit or of two identical constants.  Later in the 
> > same file in the function enable_scc_uhc, I see the code:
> > val |= (SCC_UHC_USBCEN | SCC_UHC_USBEN);
> > Should the code in uhc_clkctrl_ready also be SCC_UHC_USBCEN | SCC_UHC_USBEN?
> > Or something else?
> 
> Thanks Julia.
> 
> This was also noticed a couple of years ago and not applied
> http://lkml.indiana.edu/hypermail/linux/kernel/0808.2/1428.html
> 
> One day...

OK, thanks, I will send a patch shortly.

julia

^ permalink raw reply

* Re: Request review of device tree documentation
From: Grant Likely @ 2010-09-01 16:19 UTC (permalink / raw)
  To: devicetree-discuss, Jeremy Kerr, John Rigby, linuxppc-dev,
	microblaze-uclinux, Olof Johansson
In-Reply-To: <20100805044325.GC25458@yookeroo>

On Thu, Aug 05, 2010 at 02:43:25PM +1000, David Gibson wrote:
> On Fri, Jun 11, 2010 at 04:59:46PM -0600, Grant Likely wrote:
> > I've been doing a bit of work on some introductory level documentation
> > of the flattened device tree.  I've got a rough copy up on the
> > devicetree.org wiki, and I could use some feedback.  If anyone has
> > some time to look at it, you can find it here:
> > 
> > http://devicetree.org/Device_Tree_Usage
> 
> Sorry I haven't replied sooner, I've been away, then sick and
> generally preoccupied.  Still here are some comments now.

Thanks David.  Reworked as per comments.  You can see the diff here:

http://www.devicetree.org/mediawiki/index.php?title=Device_Tree_Usage&diff=228&oldid=227

g.

> 
> How Addressing Works:
> 
>  * Small inconsistency you use "address1", "address2" then "unit-address3".
> 
>  * Perhaps re-emphasise that a parent's #*-cells properties govern the
>    children's reg properties, not its own, since this is a common
>    misunderstanding..
> 
> Non Memory Mapped Devices:
> 
>  * Your phrasing here suggests that non-memory-maped == zero
>    size-cells, which is not always true.
> 
> Ranges (Address Translation):
> 
>  * Third paragraph, first sentence is a grammatical dogs' breakfast,
> 
> How Interrupts Work:
> 
>  * Bogus paragraph break partway through first sentence.
> 
>  * At the end you say the second cell indicates the interrupt's
>    polarity, but you don't specify how this is encoded.  It might be
>    worth emphasising that while most interrupt specifiers do include
>    trigger and polarity type information, the encoding of it can and
>    does vary between interrupt controllers.
> 
> Advanced Sample Machine:
> 
>  * The unit address in the name shouldn't have a "0x" prefix
> 
> Advanced Interrupt Mapping:
> 
>  * Perhaps worth noting that while a PCI *card* will use INTA..INTD,
>    on-board PCI devices can, and frequently do, have interrupts wired
>    side-band to the PCI bus, directly to the main interrupt
>    controller.
> 
>  * In your example, you're muddying the waters of your previous usage
>    of interrupt-parent.  The PCI child nodes have the PCI top-level
>    node as their implicit interrupt parent, because its their first
>    ancestor with an interrupt-map, and we hit that before the
>    interrupt-parent property specified at the very top level.  This
>    means amongst other things that if there are PCI devices with
>    seperately wired interrupts, they must explicitly set
>    interrupt-parent to bypass the normal PCI interrupt mapping.
> 
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson

^ permalink raw reply

* Re: [PATCH] powerpc: mtmsrd not defined
From: Sean MacLennan @ 2010-09-01 16:40 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev
In-Reply-To: <20100901080214.GA4722@brick.ozlabs.ibm.com>

On Wed, 1 Sep 2010 18:02:14 +1000
Paul Mackerras <paulus@samba.org> wrote:

> Ah, those references would be from arch/powerpc/lib/sstep.c.
> Evidently we need #ifdef CONFIG_PPC_FPU around the emulation of the
> floating-point loads and stores.

I tried that yesterday and it didn't help, although maybe I forgot to
mask out a few. I will check again.

Cheers,
   Sean

^ permalink raw reply

* Re: [PATCH] powerpc: mtmsrd not defined
From: Sean MacLennan @ 2010-09-01 17:21 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, Paul Mackerras
In-Reply-To: <04187CA6-D0DE-45B3-819F-DE26BFD9FF16@kernel.crashing.org>

Ok, I didn't catch all the cases the first time :( Since the errors
seemed to come from copy_32.S, I didn't spend much time on sstep.c.

Here is an updated patch the fixes the mtmsrd problem and maps out the
floating point.

Cheers,
   Sean

Replace the BOOK3S_64 specific mtmsrd with the generic MTMSRD macro.
Only enable ldstfp when CONFIG_PPC_FPU is set.

Signed-off-by: Sean MacLennan <smaclennan@pikatech.com>
---
diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
index 74a7f41..55f19f9 100644
--- a/arch/powerpc/lib/copy_32.S
+++ b/arch/powerpc/lib/copy_32.S
@@ -62,7 +62,7 @@
 
 	.text
 	.stabs	"arch/powerpc/lib/",N_SO,0,0,0f
-	.stabs	"copy32.S",N_SO,0,0,0f
+	.stabs	"copy_32.S",N_SO,0,0,0f
 0:
 
 CACHELINE_BYTES = L1_CACHE_BYTES
diff --git a/arch/powerpc/lib/ldstfp.S b/arch/powerpc/lib/ldstfp.S
index f644863..6a85380 100644
--- a/arch/powerpc/lib/ldstfp.S
+++ b/arch/powerpc/lib/ldstfp.S
@@ -17,6 +17,8 @@
 #include <asm/asm-offsets.h>
 #include <linux/errno.h>
 
+#ifdef CONFIG_PPC_FPU
+
 #define STKFRM	(PPC_MIN_STKFRM + 16)
 
 	.macro	extab	instr,handler
@@ -81,7 +83,7 @@ _GLOBAL(do_lfs)
 	mfmsr	r6
 	ori	r7,r6,MSR_FP
 	cmpwi	cr7,r3,0
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	stfd	fr0,STKFRM-16(r1)
@@ -93,7 +95,7 @@ _GLOBAL(do_lfs)
 	lfd	fr0,STKFRM-16(r1)
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -108,7 +110,7 @@ _GLOBAL(do_lfd)
 	mfmsr	r6
 	ori	r7,r6,MSR_FP
 	cmpwi	cr7,r3,0
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	stfd	fr0,STKFRM-16(r1)
@@ -120,7 +122,7 @@ _GLOBAL(do_lfd)
 	lfd	fr0,STKFRM-16(r1)
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -135,7 +137,7 @@ _GLOBAL(do_stfs)
 	mfmsr	r6
 	ori	r7,r6,MSR_FP
 	cmpwi	cr7,r3,0
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	stfd	fr0,STKFRM-16(r1)
@@ -147,7 +149,7 @@ _GLOBAL(do_stfs)
 	lfd	fr0,STKFRM-16(r1)
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -162,7 +164,7 @@ _GLOBAL(do_stfd)
 	mfmsr	r6
 	ori	r7,r6,MSR_FP
 	cmpwi	cr7,r3,0
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	stfd	fr0,STKFRM-16(r1)
@@ -174,7 +176,7 @@ _GLOBAL(do_stfd)
 	lfd	fr0,STKFRM-16(r1)
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -229,7 +231,7 @@ _GLOBAL(do_lvx)
 	oris	r7,r6,MSR_VEC@h
 	cmpwi	cr7,r3,0
 	li	r8,STKFRM-16
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	stvx	vr0,r1,r8
@@ -241,7 +243,7 @@ _GLOBAL(do_lvx)
 	lvx	vr0,r1,r8
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -257,7 +259,7 @@ _GLOBAL(do_stvx)
 	oris	r7,r6,MSR_VEC@h
 	cmpwi	cr7,r3,0
 	li	r8,STKFRM-16
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	stvx	vr0,r1,r8
@@ -269,7 +271,7 @@ _GLOBAL(do_stvx)
 	lvx	vr0,r1,r8
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -325,7 +327,7 @@ _GLOBAL(do_lxvd2x)
 	oris	r7,r6,MSR_VSX@h
 	cmpwi	cr7,r3,0
 	li	r8,STKFRM-16
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	STXVD2X(0,r1,r8)
@@ -337,7 +339,7 @@ _GLOBAL(do_lxvd2x)
 	LXVD2X(0,r1,r8)
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -353,7 +355,7 @@ _GLOBAL(do_stxvd2x)
 	oris	r7,r6,MSR_VSX@h
 	cmpwi	cr7,r3,0
 	li	r8,STKFRM-16
-	mtmsrd	r7
+	MTMSRD(r7)
 	isync
 	beq	cr7,1f
 	STXVD2X(0,r1,r8)
@@ -365,7 +367,7 @@ _GLOBAL(do_stxvd2x)
 	LXVD2X(0,r1,r8)
 4:	PPC_LL	r0,STKFRM+PPC_LR_STKOFF(r1)
 	mtlr	r0
-	mtmsrd	r6
+	MTMSRD(r6)
 	isync
 	mr	r3,r9
 	addi	r1,r1,STKFRM
@@ -373,3 +375,5 @@ _GLOBAL(do_stxvd2x)
 	extab	2b,3b
 
 #endif /* CONFIG_VSX */
+
+#endif	/* CONFIG_PPC_FPU */
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e0a9858..ae5189a 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -30,6 +30,7 @@ extern char system_call_common[];
 #define XER_OV		0x40000000U
 #define XER_CA		0x20000000U
 
+#ifdef CONFIG_PPC_FPU
 /*
  * Functions in ldstfp.S
  */
@@ -41,6 +42,7 @@ extern int do_lvx(int rn, unsigned long ea);
 extern int do_stvx(int rn, unsigned long ea);
 extern int do_lxvd2x(int rn, unsigned long ea);
 extern int do_stxvd2x(int rn, unsigned long ea);
+#endif
 
 /*
  * Determine whether a conditional branch instruction would branch.
@@ -290,6 +292,7 @@ static int __kprobes write_mem(unsigned long val, unsigned long ea, int nb,
 	return write_mem_unaligned(val, ea, nb, regs);
 }
 
+#ifdef CONFIG_PPC_FPU
 /*
  * Check the address and alignment, and call func to do the actual
  * load or store.
@@ -351,6 +354,7 @@ static int __kprobes do_fp_store(int rn, int (*func)(int, unsigned long),
 	}
 	return err;
 }
+#endif
 
 #ifdef CONFIG_ALTIVEC
 /* For Altivec/VMX, no need to worry about alignment */
@@ -1393,6 +1397,7 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 				regs->gpr[rd] = byterev_4(val);
 			goto ldst_done;
 
+#ifdef CONFIG_PPC_CPU
 		case 535:	/* lfsx */
 		case 567:	/* lfsux */
 			if (!(regs->msr & MSR_FP))
@@ -1424,6 +1429,7 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 			ea = xform_ea(instr, regs, u);
 			err = do_fp_store(rd, do_stfd, ea, 8, regs);
 			goto ldst_done;
+#endif
 
 #ifdef __powerpc64__
 		case 660:	/* stdbrx */
@@ -1534,6 +1540,7 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		} while (++rd < 32);
 		goto instr_done;
 
+#ifdef CONFIG_PPC_FPU
 	case 48:	/* lfs */
 	case 49:	/* lfsu */
 		if (!(regs->msr & MSR_FP))
@@ -1565,6 +1572,7 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		ea = dform_ea(instr, regs);
 		err = do_fp_store(rd, do_stfd, ea, 8, regs);
 		goto ldst_done;
+#endif
 
 #ifdef __powerpc64__
 	case 58:	/* ld[u], lwa */

^ permalink raw reply related

* Re: [PATCH v2] powerpc/fsl_soc: Search all global-utilities nodes for rstccr
From: Timur Tabi @ 2010-09-01 18:15 UTC (permalink / raw)
  To: Matthew McClintock; +Cc: kumar.gala, linuxppc-dev
In-Reply-To: <1283294691-18765-1-git-send-email-msm@freescale.com>

Matthew McClintock wrote:
> The first global-utilities node might not contain the rstcr
> property, so we should search all the nodes
> 
> Signed-off-by: Matthew McClintock <msm@freescale.com>

Acked-by: Timur Tabi <timur@freescale.com>

^ permalink raw reply

* Re: [PATCH 1/4] drivers/serial/ucc_uart.c: Add of_node_put to avoid memory leak
From: Timur Tabi @ 2010-09-01 18:40 UTC (permalink / raw)
  To: Julia Lawall
  Cc: kernel-janitors, devicetree-discuss, linuxppc-dev, linux-kernel
In-Reply-To: <1283269738-14612-2-git-send-email-julia@diku.dk>

On Tue, Aug 31, 2010 at 10:48 AM, Julia Lawall <julia@diku.dk> wrote:
> Add a call to of_node_put in the error handling code following a call to
> of_find_compatible_node or of_find_node_by_type.
>
> This patch also substantially reorganizes the error handling code in the
> function, to that it is possible first to jump to code that frees qe_port
> and then to jump to code that also puts np.

Acked-by: Timur Tabi <timur@freescale.com>

Thanks, Julia.

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
From: Darren Hart @ 2010-09-01 18:47 UTC (permalink / raw)
  To: michael
  Cc: Stephen Rothwell, Michael Neuling, Gautham R Shenoy,
	Josh Triplett, Steven Rostedt, linuxppc-dev, Will Schmidt,
	Paul Mackerras, Ankita Garg, Thomas Gleixner
In-Reply-To: <4C7E6CCC.8090700@us.ibm.com>

On 09/01/2010 08:10 AM, Darren Hart wrote:
> On 08/31/2010 10:54 PM, Michael Ellerman wrote:
>> On Tue, 2010-08-31 at 00:12 -0700, Darren Hart wrote:
>> ..
>>>
>>> When running with the function plugin I had to stop the trace
>>> immediately before entering start_secondary after an online or my traces
>>> would not include the pseries_mach_cpu_die function, nor the tracing I
>>> added there (possibly buffer size, I am using 2048). The following trace
>>> was collected using "trace-cmd record -p function -e irq -e sched" and
>>> has been filtered to only show CPU [001] (the CPU undergoing the
>>> offline/online test, and the one seeing preempt_count (pcnt) go to
>>> ffffffff after cede. The function tracer does not indicate anything
>>> running on the CPU other than the HCALL - unless the __trace_hcall*
>>> commands might be to blame. 
>>
>> It's not impossible. Though normally they're disabled right, so the only
>> reason they're running is because you're tracing. So if they are causing
>> the bug then that doesn't explain why you see it normally.
>>
>> Still, might be worth disabling just the hcall tracepoints just to be
>> 100% sure.
> 
> A couple of updates. I was working from tip/rt/head, which has been
> stale for some months now. I switched to tip/rt/2.6.33 and the
> preempt_count() change over cede went away. I now see the live hang that
> Will has reported.
> 
> Before I dive into the live hang, I want to understand what fixed the
> preempt_count() change. That might start pointing us in the right
> direction for the live hang.
> 
> I did an inverted git bisect between tip/rt/head and tip/rt/2.6.33 to
> try and locate the fix. I've narrowed it down to the 2.6.33.6 merge:
> 
> # git show 7e1af1172bbd4109d09ac515c5d376f633da7cff
> commit 7e1af1172bbd4109d09ac515c5d376f633da7cff
> Merge: d8e94db 9666790
> Author: Thomas Gleixner <tglx@linutronix.de>
> Date:   Tue Jul 13 16:01:16 2010 +0200
> 
>     Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.33.y
> 
>     Conflicts:
>         Makefile
> 
>     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> 
> 
> Visual inspection yields two patches of interest:
> 
> f8b67691828321f5c85bb853283aa101ae673130
> powerpc/pseries: Make query-cpu-stopped callable outside hotplug cpu
> 
> aef40e87d866355ffd279ab21021de733242d0d5
> powerpc/pseries: Only call start-cpu when a CPU is stopped
> 
> I'm going to try reverting these today and see if they addressed the
> issue indirectly.

Removing:

aef40e87d866355ffd279ab21021de733242d0d5
powerpc/pseries: Only call start-cpu when a CPU is stopped

--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -82,6 +82,12 @@ static inline int __devinit smp_startup_cpu(unsigned int lcpu)
 
        pcpu = get_hard_smp_processor_id(lcpu);
 
+       /* Check to see if the CPU out of FW already for kexec */

This comment is really confusing to me. I _think_ it is saying that this test
determines if the CPU is done executing firmware code and has begun executing
OS code.... Is that right?

+       if (smp_query_cpu_stopped(pcpu) == QCSS_NOT_STOPPED){
+               cpu_set(lcpu, of_spin_map);
+               return 1;
+       }
+
        /* Fixup atomic count: it exited inside IRQ handler. */
        task_thread_info(paca[lcpu].__current)->preempt_count   = 0;

from tip/rt/2.6.33 causes the preempt_count() to change across the cede
call.  This patch appears to prevents the proxy preempt_count assignment
from happening. This non-local-cpu assignment to 0 would cause an
underrun of preempt_count() if the local-cpu had disabled preemption
prior to the assignment and then later tried to enable it. This appears
to be the case with the stack of __trace_hcall* calls preceeding the
return from extended_cede_processor() in the latency format trace-cmd
report:

  <idle>-0       1d....   201.252737: function:             .cpu_die
  <idle>-0       1d....   201.252738: function:                .pseries_mach_cpu_die
  <idle>-0       1d....   201.252740: function:                   .idle_task_exit
  <idle>-0       1d....   201.252741: function:                      .switch_slb
  <idle>-0       1d....   201.252742: function:                   .xics_teardown_cpu
  <idle>-0       1d....   201.252743: function:                      .xics_set_cpu_priority
  <idle>-0       1d....   201.252744: function:             .__trace_hcall_entry
  <idle>-0       1d..1.   201.252745: function:                .probe_hcall_entry
  <idle>-0       1d..1.   201.252746: function:             .__trace_hcall_exit
  <idle>-0       1d..2.   201.252747: function:                .probe_hcall_exit
  <idle>-0       1d....   201.252748: function:             .__trace_hcall_entry
  <idle>-0       1d..1.   201.252748: function:                .probe_hcall_entry
  <idle>-0       1d..1.   201.252750: function:             .__trace_hcall_exit
  <idle>-0       1d..2.   201.252751: function:                .probe_hcall_exit
  <idle>-0       1d....   201.252752: function:             .__trace_hcall_entry
  <idle>-0       1d..1.   201.252753: function:                .probe_hcall_entry
  offon.sh-3684  6.....   201.466488: bprint:               .smp_pSeries_kick_cpu : resetting pcnt to 0 for cpu 1

preempt_count() is reset from 1 to 0 by smp_startup_cpu() without the
QCSS_NOT_STOPPED check from the patch above.

  <idle>-0       1d....   201.466503: function:             .__trace_hcall_exit
  <idle>-0       1d..1.   201.466505: function:                .probe_hcall_exit
  <idle>-0       1d.Hff.   201.466507: bprint:               .pseries_mach_cpu_die : after cede: ffffffff

With the preempt_count() being one less than it should be, the final
preempt_enable() in the trace_hcall path drops preempt_count to
0xffffffff, which of course is an illegal value and leads to a crash.

I don't know if the patch above is the "right fix" for this or not as
don't yet understand why this occurs only with PREEMPT_RT and not in
mainline. Once we sort that out, we may need a more specific fix.

Many thanks to Steven Rostedt for pouring over the trace and the trace
code with me.

Thanks,

-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

^ permalink raw reply

* Re: ERR_PTR pattern in phylib
From: Grant Likely @ 2010-09-01 18:56 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, afleming, linuxppc-dev
In-Reply-To: <20100901.082729.135524471.davem@davemloft.net>

On Wed, Sep 1, 2010 at 9:27 AM, David Miller <davem@davemloft.net> wrote:
> From: Grant Likely <grant.likely@secretlab.ca>
> Date: Wed, 1 Sep 2010 08:42:49 -0600
>
>> It seems to me that phylib is one of the cases where the users (the
>> network drivers) don't actually care about the specific error code
>> when calling phylib functions. =A0The drivers only seem to care whether
>> or not the function failed, and if it did then bail out. =A0I've also
>> noticed that using the "if (!ptr)" test on phylib return values is a
>> common error for driver writers.
>>
>> In the interest of making driver code easier to write and review,
>> would you be opposed to a set of patches to remove the ERR_PTR()
>> pattern from phylib and its users?
>
> I'm opposed to it because it means that if code actually does
> care about the error code it will no longer be able to obtain
> it.

The error codes in phylib are almost arbitrary and don't really give
enough information about where the a failure lies.  dev_err() is more
useful for debugging.

My experience has been that the encoding of error numbers into a
pointer return is a source of bugs for driver writers and should be
strongly avoided unless the return codes are actually important (ie.
userspace depends on them).  Especially so when the use-case for
keeping them is merely theoretical.  I've looked through the phylib
usage for ERR_PTR(), and it all is related to whether or not a
phy_device pointer can be located.  The error code is pretty much
irrelevant if a phy cannot be obtained.

How about this as a compromise:  I'll investigate all the users of
phylib and if I find even one situation where the specific return code
is actually important to a driver, then I'll back off.  phylib has
been around for 5 years now which should be enough time for that use
case to bubble to the surface.

g.

^ permalink raw reply

* Re: ERR_PTR pattern in phylib
From: David Miller @ 2010-09-01 19:03 UTC (permalink / raw)
  To: grant.likely; +Cc: netdev, afleming, linuxppc-dev
In-Reply-To: <AANLkTimkh6ytDmiCQY-gZLHpLzeux1RwMFyr=HyfcDaJ@mail.gmail.com>

From: Grant Likely <grant.likely@secretlab.ca>
Date: Wed, 1 Sep 2010 12:56:29 -0600

> The error codes in phylib are almost arbitrary and don't really give
> enough information about where the a failure lies.  dev_err() is more
> useful for debugging.

If it's using bad error codes, that's only indicative of a bug in
phylib.

> How about this as a compromise:  I'll investigate all the users of
> phylib and if I find even one situation where the specific return code
> is actually important to a driver, then I'll back off.  phylib has
> been around for 5 years now which should be enough time for that use
> case to bubble to the surface.

Sorry, I'm not agreeable to this.

If there are bugs where drivers are not checking the return
pointers correctly, please just fix those bugs.  Otherwise
I see things fine as they are.

^ permalink raw reply

* Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
From: Steven Rostedt @ 2010-09-01 19:59 UTC (permalink / raw)
  To: Darren Hart
  Cc: Stephen Rothwell, Michael Neuling, Gautham R Shenoy,
	Josh Triplett, linuxppc-dev, Will Schmidt, Paul Mackerras,
	Ankita Garg, Thomas Gleixner
In-Reply-To: <4C7E9FC1.60004@us.ibm.com>

On Wed, 2010-09-01 at 11:47 -0700, Darren Hart wrote:

> from tip/rt/2.6.33 causes the preempt_count() to change across the cede
> call.  This patch appears to prevents the proxy preempt_count assignment
> from happening. This non-local-cpu assignment to 0 would cause an
> underrun of preempt_count() if the local-cpu had disabled preemption
> prior to the assignment and then later tried to enable it. This appears
> to be the case with the stack of __trace_hcall* calls preceeding the
> return from extended_cede_processor() in the latency format trace-cmd
> report:
> 
>   <idle>-0       1d....   201.252737: function:             .cpu_die

Note, the above 1d.... is a series of values. The first being the CPU,
the next if interrupts are disabled, the next if the NEED_RESCHED flag
is set, the next is softirqs enabled or disabled, next the
preempt_count, and finally the lockdepth count.

Here we only care about the preempt_count, which is zero when '.' and a
number if it is something else. It is the second to last field in that
list.


>   <idle>-0       1d....   201.252738: function:                .pseries_mach_cpu_die
>   <idle>-0       1d....   201.252740: function:                   .idle_task_exit
>   <idle>-0       1d....   201.252741: function:                      .switch_slb
>   <idle>-0       1d....   201.252742: function:                   .xics_teardown_cpu
>   <idle>-0       1d....   201.252743: function:                      .xics_set_cpu_priority
>   <idle>-0       1d....   201.252744: function:             .__trace_hcall_entry
>   <idle>-0       1d..1.   201.252745: function:                .probe_hcall_entry

                       ^
                preempt_count set to 1

>   <idle>-0       1d..1.   201.252746: function:             .__trace_hcall_exit
>   <idle>-0       1d..2.   201.252747: function:                .probe_hcall_exit
>   <idle>-0       1d....   201.252748: function:             .__trace_hcall_entry
>   <idle>-0       1d..1.   201.252748: function:                .probe_hcall_entry
>   <idle>-0       1d..1.   201.252750: function:             .__trace_hcall_exit
>   <idle>-0       1d..2.   201.252751: function:                .probe_hcall_exit
>   <idle>-0       1d....   201.252752: function:             .__trace_hcall_entry
>   <idle>-0       1d..1.   201.252753: function:                .probe_hcall_entry
                   ^   ^
                  CPU  preempt_count

Entering the function probe_hcall_entry() the preempt_count is 1 (see
below). But probe_hcall_entry does:

	h = &get_cpu_var(hcall_stats)[opcode / 4];

Without doing the put (which it does in probe_hcall_exit())

So exiting the probe_hcall_entry() the prempt_count is 2.
The trace_hcall_entry() will do a preempt_enable() making it leave as 1.


>   offon.sh-3684  6.....   201.466488: bprint:               .smp_pSeries_kick_cpu : resetting pcnt to 0 for cpu 1

This is CPU 6, changing the preempt count from 1 to 0.

> 
> preempt_count() is reset from 1 to 0 by smp_startup_cpu() without the
> QCSS_NOT_STOPPED check from the patch above.
> 
>   <idle>-0       1d....   201.466503: function:             .__trace_hcall_exit

Note: __trace_hcall_exit() and __trace_hcall_entry() basically do:

 preempt_disable();
 call probe();
 preempt_enable();


>   <idle>-0       1d..1.   201.466505: function:                .probe_hcall_exit

The preempt_count of 1 entering the probe_hcall_exit() is because of the
preempt_disable() shown above. It should have been entered as a 2.

But then it does:


	put_cpu_var(hcall_stats);

making preempt_count 0.

But the preempt_enable() in the trace_hcall_exit() causes this to be -1.


>   <idle>-0       1d.Hff.   201.466507: bprint:               .pseries_mach_cpu_die : after cede: ffffffff
> 
> With the preempt_count() being one less than it should be, the final
> preempt_enable() in the trace_hcall path drops preempt_count to
> 0xffffffff, which of course is an illegal value and leads to a crash.

I'm confused to how this works in mainline?

-- Steve

> 
> I don't know if the patch above is the "right fix" for this or not as
> don't yet understand why this occurs only with PREEMPT_RT and not in
> mainline. Once we sort that out, we may need a more specific fix.

^ permalink raw reply

* Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
From: Darren Hart @ 2010-09-01 20:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Rothwell, Michael Neuling, Gautham R Shenoy,
	Josh Triplett, linuxppc-dev, Will Schmidt, Paul Mackerras,
	Ankita Garg, Thomas Gleixner
In-Reply-To: <1283371161.2356.53.camel@gandalf.stny.rr.com>

On 09/01/2010 12:59 PM, Steven Rostedt wrote:
> On Wed, 2010-09-01 at 11:47 -0700, Darren Hart wrote:
> 
>> from tip/rt/2.6.33 causes the preempt_count() to change across the cede
>> call.  This patch appears to prevents the proxy preempt_count assignment
>> from happening. This non-local-cpu assignment to 0 would cause an
>> underrun of preempt_count() if the local-cpu had disabled preemption
>> prior to the assignment and then later tried to enable it. This appears
>> to be the case with the stack of __trace_hcall* calls preceeding the
>> return from extended_cede_processor() in the latency format trace-cmd
>> report:
>>
>>   <idle>-0       1d....   201.252737: function:             .cpu_die
> 
> Note, the above 1d.... is a series of values. The first being the CPU,
> the next if interrupts are disabled, the next if the NEED_RESCHED flag
> is set, the next is softirqs enabled or disabled, next the
> preempt_count, and finally the lockdepth count.
> 
> Here we only care about the preempt_count, which is zero when '.' and a
> number if it is something else. It is the second to last field in that
> list.
> 
> 
>>   <idle>-0       1d....   201.252738: function:                .pseries_mach_cpu_die
>>   <idle>-0       1d....   201.252740: function:                   .idle_task_exit
>>   <idle>-0       1d....   201.252741: function:                      .switch_slb
>>   <idle>-0       1d....   201.252742: function:                   .xics_teardown_cpu
>>   <idle>-0       1d....   201.252743: function:                      .xics_set_cpu_priority
>>   <idle>-0       1d....   201.252744: function:             .__trace_hcall_entry
>>   <idle>-0       1d..1.   201.252745: function:                .probe_hcall_entry
> 
>                        ^
>                 preempt_count set to 1
> 
>>   <idle>-0       1d..1.   201.252746: function:             .__trace_hcall_exit
>>   <idle>-0       1d..2.   201.252747: function:                .probe_hcall_exit
>>   <idle>-0       1d....   201.252748: function:             .__trace_hcall_entry
>>   <idle>-0       1d..1.   201.252748: function:                .probe_hcall_entry
>>   <idle>-0       1d..1.   201.252750: function:             .__trace_hcall_exit
>>   <idle>-0       1d..2.   201.252751: function:                .probe_hcall_exit
>>   <idle>-0       1d....   201.252752: function:             .__trace_hcall_entry
>>   <idle>-0       1d..1.   201.252753: function:                .probe_hcall_entry
>                    ^   ^
>                   CPU  preempt_count
> 
> Entering the function probe_hcall_entry() the preempt_count is 1 (see
> below). But probe_hcall_entry does:
> 
> 	h = &get_cpu_var(hcall_stats)[opcode / 4];
> 
> Without doing the put (which it does in probe_hcall_exit())
> 
> So exiting the probe_hcall_entry() the prempt_count is 2.
> The trace_hcall_entry() will do a preempt_enable() making it leave as 1.
> 
> 
>>   offon.sh-3684  6.....   201.466488: bprint:               .smp_pSeries_kick_cpu : resetting pcnt to 0 for cpu 1
> 
> This is CPU 6, changing the preempt count from 1 to 0.
> 
>>
>> preempt_count() is reset from 1 to 0 by smp_startup_cpu() without the
>> QCSS_NOT_STOPPED check from the patch above.
>>
>>   <idle>-0       1d....   201.466503: function:             .__trace_hcall_exit
> 
> Note: __trace_hcall_exit() and __trace_hcall_entry() basically do:
> 
>  preempt_disable();
>  call probe();
>  preempt_enable();
> 
> 
>>   <idle>-0       1d..1.   201.466505: function:                .probe_hcall_exit
> 
> The preempt_count of 1 entering the probe_hcall_exit() is because of the
> preempt_disable() shown above. It should have been entered as a 2.
> 
> But then it does:
> 
> 
> 	put_cpu_var(hcall_stats);
> 
> making preempt_count 0.
> 
> But the preempt_enable() in the trace_hcall_exit() causes this to be -1.
> 
> 
>>   <idle>-0       1d.Hff.   201.466507: bprint:               .pseries_mach_cpu_die : after cede: ffffffff
>>
>> With the preempt_count() being one less than it should be, the final
>> preempt_enable() in the trace_hcall path drops preempt_count to
>> 0xffffffff, which of course is an illegal value and leads to a crash.
> 
> I'm confused to how this works in mainline?

Turns out it didn't. 2.6.33.5 with CONFIG_PREEMPT=y sees this exact same
behavior. The following, part of the 2.6.33.6 stable release, prevents
this from happening:

aef40e87d866355ffd279ab21021de733242d0d5
powerpc/pseries: Only call start-cpu when a CPU is stopped

--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -82,6 +82,12 @@ static inline int __devinit smp_startup_cpu(unsigned
int lcpu)

        pcpu = get_hard_smp_processor_id(lcpu);

+       /* Check to see if the CPU out of FW already for kexec */
+       if (smp_query_cpu_stopped(pcpu) == QCSS_NOT_STOPPED){
+               cpu_set(lcpu, of_spin_map);
+               return 1;
+       }
+
        /* Fixup atomic count: it exited inside IRQ handler. */
        task_thread_info(paca[lcpu].__current)->preempt_count   = 0;

The question is now, Is this the right fix? If so, perhaps we can update
the comment to be a bit more clear and not refer solely to kexec.

Michael Neuling, can you offer any thoughts here? We hit this EVERY
TIME, which makes me wonder if the offline/online path could do this
without calling smp_startup_cpu at all.

-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox