All of lore.kernel.org
 help / color / mirror / Atom feed
* STOP THE MADNESS
@ 2013-11-26 16:30 Russell King - ARM Linux
  2013-11-26 17:29 ` Rob Herring
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Russell King - ARM Linux @ 2013-11-26 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

Versatile PB926 is currently broken in mainline because of this, and it's
damn obvious that people are just churning out changes to "update" drivers
without testing them at all.

CPU: 0 PID: 1 Comm: swapper Not tainted 3.13.0-rc1+ #652
Backtrace: 
[<c0016f0c>] (dump_backtrace) from [<c00172ac>] (show_stack+0x18/0x1c)
 r6:c798ace0 r5:00000000 r4:c78257e0 r3:00200140
[<c0017294>] (show_stack) from [<c0329ea0>] (dump_stack+0x20/0x28)
[<c0329e80>] (dump_stack) from [<c004fa80>] (__lock_acquire+0x1c0/0x1b80)
[<c004f8c0>] (__lock_acquire) from [<c0051970>] (lock_acquire+0x6c/0x80)
 r10:00000000 r9:c0455234 r8:00000060 r7:c047d798 r6:600000d3 r5:00000000
 r4:c782c000
[<c0051904>] (lock_acquire) from [<c032e484>] (_raw_spin_lock_irqsave+0x60/0x74)
 r6:c01a1100 r5:800000d3 r4:c798acd0
[<c032e424>] (_raw_spin_lock_irqsave) from [<c01a1100>] (pl061_irq_type+0x28/0x)
 r6:00000000 r5:00000000 r4:c798acd0
[<c01a10d8>] (pl061_irq_type) from [<c0059ef4>] (__irq_set_trigger+0x70/0x104)
 r6:00000000 r5:c01a10d8 r4:c046da1c r3:c01a10d8
[<c0059e84>] (__irq_set_trigger) from [<c005b348>] (irq_set_irq_type+0x40/0x60)
 r10:c043240c r8:00000060 r7:00000000 r6:c046da1c r5:00000060 r4:00000000
[<c005b308>] (irq_set_irq_type) from [<c01a1208>] (pl061_irq_map+0x40/0x54)
 r6:c79693c0 r5:c798acd0 r4:00000060
[<c01a11c8>] (pl061_irq_map) from [<c005d27c>] (irq_domain_associate+0xc0/0x190)
 r5:00000060 r4:c046da1c
[<c005d1bc>] (irq_domain_associate) from [<c005d604>] (irq_domain_associate_man)
 r8:00000008 r7:00000000 r6:c79693c0 r5:00000060 r4:00000000
[<c005d5d0>] (irq_domain_associate_many) from [<c005d864>] (irq_domain_add_simp)
 r8:c046578c r7:c035b72c r6:c79693c0 r5:00000060 r4:00000008 r3:00000008
[<c005d814>] (irq_domain_add_simple) from [<c01a1380>] (pl061_probe+0xc4/0x22c)
 r6:00000060 r5:c0464380 r4:c798acd0
[<c01a12bc>] (pl061_probe) from [<c01c0450>] (amba_probe+0x74/0xe0)
 r10:c043240c r9:c0455234 r8:00000000 r7:c047d7f8 r6:c047d744 r5:00000000
 r4:c0464380

This is caused by the recently introduced irq_domain_add_simple() call
being placed before the spinlock has been initialised, and guess what
the side effect of that call is.

Far from impressed.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-26 16:30 STOP THE MADNESS Russell King - ARM Linux
@ 2013-11-26 17:29 ` Rob Herring
  2013-11-26 22:09   ` Linus Walleij
  2013-11-26 21:43 ` Linus Walleij
  2013-11-29  2:01 ` Russell King - ARM Linux
  2 siblings, 1 reply; 12+ messages in thread
From: Rob Herring @ 2013-11-26 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 26, 2013 at 10:30 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Versatile PB926 is currently broken in mainline because of this, and it's
> damn obvious that people are just churning out changes to "update" drivers
> without testing them at all.
>
> CPU: 0 PID: 1 Comm: swapper Not tainted 3.13.0-rc1+ #652
> Backtrace:
> [<c0016f0c>] (dump_backtrace) from [<c00172ac>] (show_stack+0x18/0x1c)
>  r6:c798ace0 r5:00000000 r4:c78257e0 r3:00200140
> [<c0017294>] (show_stack) from [<c0329ea0>] (dump_stack+0x20/0x28)
> [<c0329e80>] (dump_stack) from [<c004fa80>] (__lock_acquire+0x1c0/0x1b80)
> [<c004f8c0>] (__lock_acquire) from [<c0051970>] (lock_acquire+0x6c/0x80)
>  r10:00000000 r9:c0455234 r8:00000060 r7:c047d798 r6:600000d3 r5:00000000
>  r4:c782c000
> [<c0051904>] (lock_acquire) from [<c032e484>] (_raw_spin_lock_irqsave+0x60/0x74)
>  r6:c01a1100 r5:800000d3 r4:c798acd0
> [<c032e424>] (_raw_spin_lock_irqsave) from [<c01a1100>] (pl061_irq_type+0x28/0x)
>  r6:00000000 r5:00000000 r4:c798acd0
> [<c01a10d8>] (pl061_irq_type) from [<c0059ef4>] (__irq_set_trigger+0x70/0x104)
>  r6:00000000 r5:c01a10d8 r4:c046da1c r3:c01a10d8
> [<c0059e84>] (__irq_set_trigger) from [<c005b348>] (irq_set_irq_type+0x40/0x60)
>  r10:c043240c r8:00000060 r7:00000000 r6:c046da1c r5:00000060 r4:00000000
> [<c005b308>] (irq_set_irq_type) from [<c01a1208>] (pl061_irq_map+0x40/0x54)
>  r6:c79693c0 r5:c798acd0 r4:00000060
> [<c01a11c8>] (pl061_irq_map) from [<c005d27c>] (irq_domain_associate+0xc0/0x190)
>  r5:00000060 r4:c046da1c
> [<c005d1bc>] (irq_domain_associate) from [<c005d604>] (irq_domain_associate_man)
>  r8:00000008 r7:00000000 r6:c79693c0 r5:00000060 r4:00000000
> [<c005d5d0>] (irq_domain_associate_many) from [<c005d864>] (irq_domain_add_simp)
>  r8:c046578c r7:c035b72c r6:c79693c0 r5:00000060 r4:00000008 r3:00000008
> [<c005d814>] (irq_domain_add_simple) from [<c01a1380>] (pl061_probe+0xc4/0x22c)
>  r6:00000060 r5:c0464380 r4:c798acd0
> [<c01a12bc>] (pl061_probe) from [<c01c0450>] (amba_probe+0x74/0xe0)
>  r10:c043240c r9:c0455234 r8:00000000 r7:c047d7f8 r6:c047d744 r5:00000000
>  r4:c0464380
>
> This is caused by the recently introduced irq_domain_add_simple() call
> being placed before the spinlock has been initialised, and guess what
> the side effect of that call is.

IMO, the whole commit causing this problem should be reverted for
additional reasons. Irq generic chp and irqdomains are orthogonal
features/infrastructure and should not be mutually exclusive. We need
to fix irq generic chip.

commit f1f70479e999217ecbf619d71837fc5d77c680fb
Author: Haojian Zhuang <haojian.zhuang@linaro.org>
Date:   Sun Feb 17 19:42:49 2013 +0800

    gpio: pl061: support irqdomain

    Drop the support of irq generic chip. Now support irqdomain instead.

    Although set_wake() is defined in irq generic chip & it is not really
    used in pl061 gpio driver. Drop it at the same time.

    Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>


Rob

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-26 16:30 STOP THE MADNESS Russell King - ARM Linux
  2013-11-26 17:29 ` Rob Herring
@ 2013-11-26 21:43 ` Linus Walleij
  2013-11-29  2:01 ` Russell King - ARM Linux
  2 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2013-11-26 21:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 26, 2013 at 5:30 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:

> Versatile PB926 is currently broken in mainline because of this, and it's
> damn obvious that people are just churning out changes to "update" drivers
> without testing them at all.

Incidentally I was actually testing this driver today for the other pl061
patches I sent - on the Integrator/AP with the IM-PD1 board.

Strangely I don't get this crash.

But I was wondering why my interrupts didn't fire properly and
now I'm starting to suspect it has something to do with this.

I'll revert and see if this fixes it...

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-26 17:29 ` Rob Herring
@ 2013-11-26 22:09   ` Linus Walleij
  2013-11-27  1:22     ` Haojian Zhuang
  2013-11-27  3:54     ` Rob Herring
  0 siblings, 2 replies; 12+ messages in thread
From: Linus Walleij @ 2013-11-26 22:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 26, 2013 at 6:29 PM, Rob Herring <robherring2@gmail.com> wrote:

> IMO, the whole commit causing this problem should be reverted for
> additional reasons. Irq generic chp and irqdomains are orthogonal
> features/infrastructure and should not be mutually exclusive. We need
> to fix irq generic chip.

This cannot be easily reverted due to a lot of code and fixes building
on top of it, and even if it could I guess it will also break device tree
use on these: git grep pl061 arch/arm/boot/dts

As DT mandates that the irqdomain be used, or it cannot translate
the IRQs as before this patch it uses an hard-coded base from
platform data. This is one of those odd drivers that don't need
one line of DT-specific code in it to work with device tree, yet
it works.

I guess what I have to do is take out the very board Russell has
and that I happen to have also and just get to the bottom of this
bug and fix it and test it.

A pointer to an "ideal" GPIO chip + irq_chip driver would be highly
appreciated ...

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-26 22:09   ` Linus Walleij
@ 2013-11-27  1:22     ` Haojian Zhuang
  2013-11-27  7:44       ` Linus Walleij
  2013-11-27  3:54     ` Rob Herring
  1 sibling, 1 reply; 12+ messages in thread
From: Haojian Zhuang @ 2013-11-27  1:22 UTC (permalink / raw)
  To: linux-arm-kernel


On 11/27/2013 06:09 AM, Linus Walleij wrote:
> On Tue, Nov 26, 2013 at 6:29 PM, Rob Herring <robherring2@gmail.com> wrote:
>
>> IMO, the whole commit causing this problem should be reverted for
>> additional reasons. Irq generic chp and irqdomains are orthogonal
>> features/infrastructure and should not be mutually exclusive. We need
>> to fix irq generic chip.
> This cannot be easily reverted due to a lot of code and fixes building
> on top of it, and even if it could I guess it will also break device tree
> use on these: git grep pl061 arch/arm/boot/dts
>
> As DT mandates that the irqdomain be used, or it cannot translate
> the IRQs as before this patch it uses an hard-coded base from
> platform data. This is one of those odd drivers that don't need
> one line of DT-specific code in it to work with device tree, yet
> it works.
>
> I guess what I have to do is take out the very board Russell has
> and that I happen to have also and just get to the bottom of this
> bug and fix it and test it.
>
> A pointer to an "ideal" GPIO chip + irq_chip driver would be highly
> appreciated ...
>
> Yours,
> Linus Walleij

Does qemu support the board that Russell is using? If so, we can fix & 
test it.

Regards
Haojian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-26 22:09   ` Linus Walleij
  2013-11-27  1:22     ` Haojian Zhuang
@ 2013-11-27  3:54     ` Rob Herring
  2013-11-27  7:41       ` Linus Walleij
  1 sibling, 1 reply; 12+ messages in thread
From: Rob Herring @ 2013-11-27  3:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 26, 2013 at 4:09 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> On Tue, Nov 26, 2013 at 6:29 PM, Rob Herring <robherring2@gmail.com> wrote:
>
>> IMO, the whole commit causing this problem should be reverted for
>> additional reasons. Irq generic chp and irqdomains are orthogonal
>> features/infrastructure and should not be mutually exclusive. We need
>> to fix irq generic chip.
>
> This cannot be easily reverted due to a lot of code and fixes building
> on top of it, and even if it could I guess it will also break device tree
> use on these: git grep pl061 arch/arm/boot/dts
>
> As DT mandates that the irqdomain be used, or it cannot translate
> the IRQs as before this patch it uses an hard-coded base from
> platform data. This is one of those odd drivers that don't need
> one line of DT-specific code in it to work with device tree, yet
> it works.

Correct, interrupts did not work with DT before this. It is needed
functionality, but it is how it was implemented that I have issue
with.

> I guess what I have to do is take out the very board Russell has
> and that I happen to have also and just get to the bottom of this
> bug and fix it and test it.
>
> A pointer to an "ideal" GPIO chip + irq_chip driver would be highly
> appreciated ...

I'm not sure if one exists. Shouldn't be too hard to find if anything
uses generic chip and irq domains together. I had done some work to
address generic chip issues with irq domains and pl061 support[1], but
Grant was never really happy with it and I never got back to looking
at it. I think the conclusion was we don't want to add irqdomains into
generic chip as my series did, but we need to at least make them
coexist.

Rob

[1]  git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git
irqdomain-for-grant

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-27  3:54     ` Rob Herring
@ 2013-11-27  7:41       ` Linus Walleij
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2013-11-27  7:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 27, 2013 at 4:54 AM, Rob Herring <robherring2@gmail.com> wrote:
> On Tue, Nov 26, 2013 at 4:09 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>
>> A pointer to an "ideal" GPIO chip + irq_chip driver would be highly
>> appreciated ...
>
> I'm not sure if one exists. Shouldn't be too hard to find if anything
> uses generic chip and irq domains together. I had done some work to
> address generic chip issues with irq domains and pl061 support[1], but
> Grant was never really happy with it and I never got back to looking
> at it. I think the conclusion was we don't want to add irqdomains into
> generic chip as my series did, but we need to at least make them
> coexist.

Hm I also had the idea of embedding a struct irq_chip into the
struct gpio_chip (w/some config symbol like
GPIOLIB_PROVIDE_IRQCHIP) and use that for all GPIO drivers
doing IRQchips. Embedding a generic chip could be the way
forward.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-27  1:22     ` Haojian Zhuang
@ 2013-11-27  7:44       ` Linus Walleij
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2013-11-27  7:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 27, 2013 at 2:22 AM, Haojian Zhuang
<haojian.zhuang@linaro.org> wrote:

> Does qemu support the board that Russell is using? If so, we can fix & test
> it.

Yes and no. It is versatilepb, but the GPIOs are not emulated. However
the device is there I think, and as this is a pure logical problem it
*should* have the same problem.

Which makes the crash even more strange, as people would have
noticed a probe failure like this in QEMU (the versatilepb QEMU model
is used a *lot*) but they didn't. Neither did it turn up on my real HW
IM-PD1.

I'll propse a patch, I'm still struggling with getting the PB I have up and
running ...

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-26 16:30 STOP THE MADNESS Russell King - ARM Linux
  2013-11-26 17:29 ` Rob Herring
  2013-11-26 21:43 ` Linus Walleij
@ 2013-11-29  2:01 ` Russell King - ARM Linux
  2013-11-29 10:11   ` Linus Walleij
  2 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2013-11-29  2:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 26, 2013 at 04:30:32PM +0000, Russell King - ARM Linux wrote:
> Versatile PB926 is currently broken in mainline because of this, and it's
> damn obvious that people are just churning out changes to "update" drivers
> without testing them at all.

So... EBSA285... where would people like me to start with that.
VGA... broken.  LEDs... broken.  Several other issues as well (eg,
my tightening of the vectors page seems to have broken all low-vectors
platforms.)

IDE... both IT821x and CY82C693 broken - CY82C693 is beyond recovery
given the state of core IDE code - it requires a rewrite to get it
working again due to broken assumptions concerning this host controller
by core code.  IT821x gets quickly kicked out of BMDMA mode and then
sees the occasional PATA _protocol_ error.  Frankly I wouldn't trust
drivers/ide at all given observed behaviour; I'm not going to even
bother trying to debug it.

libata... better for IT821x.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-29  2:01 ` Russell King - ARM Linux
@ 2013-11-29 10:11   ` Linus Walleij
  2013-11-29 10:46     ` Russell King - ARM Linux
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Walleij @ 2013-11-29 10:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 29, 2013 at 3:01 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:

> So... EBSA285... where would people like me to start with that.
> VGA... broken.  LEDs... broken.  Several other issues as well (eg,
> my tightening of the vectors page seems to have broken all low-vectors
> platforms.)

It seems that the majority of the world's Versatile "users" are
running the QEMU model of that machine and not really
testing on anything else. Unsurprisingly the QEMU Versatile
is overly forgiving when mistreating virtual hardware. Some
stuff like LEDs I guess are just dummies.

I'm having a look at some stuff on this Versatile AB I have here
(turned out it was not a PB, but hopefully this will suffice).
I'll try to attend to the stuff that I understand atleast.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-29 10:11   ` Linus Walleij
@ 2013-11-29 10:46     ` Russell King - ARM Linux
  2013-11-29 12:11       ` Linus Walleij
  0 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2013-11-29 10:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 29, 2013 at 11:11:08AM +0100, Linus Walleij wrote:
> On Fri, Nov 29, 2013 at 3:01 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> 
> > So... EBSA285... where would people like me to start with that.
> > VGA... broken.  LEDs... broken.  Several other issues as well (eg,
> > my tightening of the vectors page seems to have broken all low-vectors
> > platforms.)
> 
> It seems that the majority of the world's Versatile "users" are
> running the QEMU model of that machine and not really
> testing on anything else. Unsurprisingly the QEMU Versatile
> is overly forgiving when mistreating virtual hardware. Some
> stuff like LEDs I guess are just dummies.

Note that the GPIO bug should show on anything due to the nature of it -
it's probably the debug options which mostly show it.  Making sure you
have spinlock debugging and lockdep enabled.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* STOP THE MADNESS
  2013-11-29 10:46     ` Russell King - ARM Linux
@ 2013-11-29 12:11       ` Linus Walleij
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2013-11-29 12:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 29, 2013 at 11:46 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:

> Note that the GPIO bug should show on anything due to the nature of it -
> it's probably the debug options which mostly show it.  Making sure you
> have spinlock debugging and lockdep enabled.

Yep I noticed ... when developing the fix yesterday I poisoned the
spinlock explicitly to provoke the error and after doing that and
sending the fix I realized that it was probably provoked by spinlock
debug (which I usually run, don't know what was wrong with me).

Anyway the fix should be somewhere in your inbox.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-11-29 12:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-26 16:30 STOP THE MADNESS Russell King - ARM Linux
2013-11-26 17:29 ` Rob Herring
2013-11-26 22:09   ` Linus Walleij
2013-11-27  1:22     ` Haojian Zhuang
2013-11-27  7:44       ` Linus Walleij
2013-11-27  3:54     ` Rob Herring
2013-11-27  7:41       ` Linus Walleij
2013-11-26 21:43 ` Linus Walleij
2013-11-29  2:01 ` Russell King - ARM Linux
2013-11-29 10:11   ` Linus Walleij
2013-11-29 10:46     ` Russell King - ARM Linux
2013-11-29 12:11       ` Linus Walleij

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.