From: Tony Lindgren <tony@atomide.com>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org, linux-omap@vger.kernel.org,
Yingjoe Chen <yingjoe.chen@mediatek.com>,
Marc Zyngier <marc.zyngier@arm.com>,
Jason Cooper <jason@lakedaemon.net>, Felipe Balbi <balbi@ti.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Regression with legacy IRQ numbers caused by 9a1091ef0017
Date: Fri, 16 Jan 2015 14:52:44 -0800 [thread overview]
Message-ID: <20150116225243.GN18552@atomide.com> (raw)
In-Reply-To: <20150116172905.GM18552@atomide.com>
* Tony Lindgren <tony@atomide.com> [150116 09:36]:
> * Russell King - ARM Linux <linux@arm.linux.org.uk> [150116 09:25]:
> > On Fri, Jan 16, 2015 at 08:41:06AM -0800, Tony Lindgren wrote:
> > > * Russell King - ARM Linux <linux@arm.linux.org.uk> [150116 08:33]:
> > > > I would still like to understand /why/ enabling preempt causes the error.
> > > > Changing the preempt configuration really should not change what happens
> > > > on the bus. (Think about it.) It's an indication that there is some
> > > > other error present.
> > >
> > > We have a wrong irq number caused by $subject. And the wrong irq
> > > gets triggered before the dma hardware is configured during dma
> > > init. And then we get the invalid access error from omap_l3_noc.
> >
> > ... which should happen whether or not preempt is enabled, which is
> > really my point.
> >
> > We know tha the wrong IRQ gets requested by the driver - and that wrong
> > IRQ is requested whether or not we have preempt enabled. Yet we get
> > the warning whether or not preempt is enabled.
> >
> > The DMA handler is not registered as a threaded handler, so it's not
> > depending on a context switch to execute omap2_dma_irq_handler().
> >
> > Another reason why I don't agree with your explanation is that by the
> > time setup_irq() is called, we have already poked at the DMA hardware
> > several times - omap_clear_dma() and omap2_disable_irq_lch() will have
> > been called for each DMA channel - and both will write to the hardware.
> >
> > What's more is that the only things left after setup_irq() has been
> > called is to possibly reserve the first two DMA channels and print
> > the DMA message (via show_dma_caps). So I see nothing after setup_irq()
> > which would "finish" any unfinished hardware initialisation.
> >
> > The final reason I don't agree is that I've put a printk() in
> > omap2_dma_irq_handler(), and this does not trigger.
>
> Oh, yes that blows my theory completely then.
>
> > So, I think this has nothing to do with the DMA hardware /at all/,
> > but more to do with the GPIO code, and it suggests that the GPIO code
> > publishes IRQs before it is safe for those IRQs to be used.
> >
> > Maybe it has to do with omap_gpio_irq_handler() being called... added
> > printk(), nope, that's not called either. So it's not an IRQ which
> > gets triggered at all.
> >
> > What is called are (in order):
> >
> > omap_gpio_unmask_irq()
> > omap_set_gpio_irqenable()
> > omap_enable_gpio_irqbank()
> >
> > and this reveals where the problem is, especially when you then add
> > instrumentation into the runtime PM functions - and this reveals that
> > when a GPIO IRQ is requested, these functions are called while the
> > GPIO is runtime suspended.
> >
> > _That_ is where the *real* problem lies - requesting a GPIO interrupt
> > results in the kernel touching possibly runtime-suspended hardware.
> >
> > The reason it happens with preempt is that preempt introduces scheduling
> > points during the kernel boot which would not otherwise be there (with
> > preempt disabled, you have to hit an explicit context switch due to
> > contention on some lock or a wait in order for some other thread to run.)
>
> OK makes sense.
>
> > So, the GPIO driver really needs fixing - and I'd suggest fixing it
> > first, before fixing the DMA problem, because the DMA problem allows
> > us to see the GPIO problem.
>
> Yes we need to fix that.
Posted a minimal fix for that one as a separate thread:
[PATCH 1/1] gpio: omap: Fix bad device access with setup_irq()
Regards,
Tony
WARNING: multiple messages have this Message-ID (diff)
From: tony@atomide.com (Tony Lindgren)
To: linux-arm-kernel@lists.infradead.org
Subject: Regression with legacy IRQ numbers caused by 9a1091ef0017
Date: Fri, 16 Jan 2015 14:52:44 -0800 [thread overview]
Message-ID: <20150116225243.GN18552@atomide.com> (raw)
In-Reply-To: <20150116172905.GM18552@atomide.com>
* Tony Lindgren <tony@atomide.com> [150116 09:36]:
> * Russell King - ARM Linux <linux@arm.linux.org.uk> [150116 09:25]:
> > On Fri, Jan 16, 2015 at 08:41:06AM -0800, Tony Lindgren wrote:
> > > * Russell King - ARM Linux <linux@arm.linux.org.uk> [150116 08:33]:
> > > > I would still like to understand /why/ enabling preempt causes the error.
> > > > Changing the preempt configuration really should not change what happens
> > > > on the bus. (Think about it.) It's an indication that there is some
> > > > other error present.
> > >
> > > We have a wrong irq number caused by $subject. And the wrong irq
> > > gets triggered before the dma hardware is configured during dma
> > > init. And then we get the invalid access error from omap_l3_noc.
> >
> > ... which should happen whether or not preempt is enabled, which is
> > really my point.
> >
> > We know tha the wrong IRQ gets requested by the driver - and that wrong
> > IRQ is requested whether or not we have preempt enabled. Yet we get
> > the warning whether or not preempt is enabled.
> >
> > The DMA handler is not registered as a threaded handler, so it's not
> > depending on a context switch to execute omap2_dma_irq_handler().
> >
> > Another reason why I don't agree with your explanation is that by the
> > time setup_irq() is called, we have already poked at the DMA hardware
> > several times - omap_clear_dma() and omap2_disable_irq_lch() will have
> > been called for each DMA channel - and both will write to the hardware.
> >
> > What's more is that the only things left after setup_irq() has been
> > called is to possibly reserve the first two DMA channels and print
> > the DMA message (via show_dma_caps). So I see nothing after setup_irq()
> > which would "finish" any unfinished hardware initialisation.
> >
> > The final reason I don't agree is that I've put a printk() in
> > omap2_dma_irq_handler(), and this does not trigger.
>
> Oh, yes that blows my theory completely then.
>
> > So, I think this has nothing to do with the DMA hardware /at all/,
> > but more to do with the GPIO code, and it suggests that the GPIO code
> > publishes IRQs before it is safe for those IRQs to be used.
> >
> > Maybe it has to do with omap_gpio_irq_handler() being called... added
> > printk(), nope, that's not called either. So it's not an IRQ which
> > gets triggered at all.
> >
> > What is called are (in order):
> >
> > omap_gpio_unmask_irq()
> > omap_set_gpio_irqenable()
> > omap_enable_gpio_irqbank()
> >
> > and this reveals where the problem is, especially when you then add
> > instrumentation into the runtime PM functions - and this reveals that
> > when a GPIO IRQ is requested, these functions are called while the
> > GPIO is runtime suspended.
> >
> > _That_ is where the *real* problem lies - requesting a GPIO interrupt
> > results in the kernel touching possibly runtime-suspended hardware.
> >
> > The reason it happens with preempt is that preempt introduces scheduling
> > points during the kernel boot which would not otherwise be there (with
> > preempt disabled, you have to hit an explicit context switch due to
> > contention on some lock or a wait in order for some other thread to run.)
>
> OK makes sense.
>
> > So, the GPIO driver really needs fixing - and I'd suggest fixing it
> > first, before fixing the DMA problem, because the DMA problem allows
> > us to see the GPIO problem.
>
> Yes we need to fix that.
Posted a minimal fix for that one as a separate thread:
[PATCH 1/1] gpio: omap: Fix bad device access with setup_irq()
Regards,
Tony
next prev parent reply other threads:[~2015-01-16 22:56 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-14 22:14 Regression with legacy IRQ numbers caused by 9a1091ef0017 Tony Lindgren
2015-01-14 22:14 ` Tony Lindgren
2015-01-15 10:50 ` Russell King - ARM Linux
2015-01-15 10:50 ` Russell King - ARM Linux
2015-01-15 15:28 ` Tony Lindgren
2015-01-15 15:28 ` Tony Lindgren
2015-01-15 17:19 ` Russell King - ARM Linux
2015-01-15 17:19 ` Russell King - ARM Linux
2015-01-16 16:21 ` Tony Lindgren
2015-01-16 16:21 ` Tony Lindgren
2015-01-16 16:30 ` Russell King - ARM Linux
2015-01-16 16:30 ` Russell King - ARM Linux
2015-01-16 16:41 ` Tony Lindgren
2015-01-16 16:41 ` Tony Lindgren
2015-01-16 16:46 ` Felipe Balbi
2015-01-16 16:46 ` Felipe Balbi
2015-01-16 17:22 ` Russell King - ARM Linux
2015-01-16 17:22 ` Russell King - ARM Linux
2015-01-16 17:29 ` Tony Lindgren
2015-01-16 17:29 ` Tony Lindgren
2015-01-16 22:52 ` Tony Lindgren [this message]
2015-01-16 22:52 ` Tony Lindgren
2015-01-16 22:57 ` Russell King - ARM Linux
2015-01-16 22:57 ` Russell King - ARM Linux
2015-01-16 22:57 ` Tony Lindgren
2015-01-16 22:57 ` Tony Lindgren
2015-01-15 13:42 ` Marc Zyngier
2015-01-15 13:42 ` Marc Zyngier
2015-01-15 14:27 ` Arnd Bergmann
2015-01-15 14:27 ` Arnd Bergmann
2015-01-15 14:43 ` Marc Zyngier
2015-01-15 14:43 ` Marc Zyngier
2015-01-15 15:37 ` Tony Lindgren
2015-01-15 15:37 ` Tony Lindgren
2015-01-16 16:56 ` Arnd Bergmann
2015-01-16 16:56 ` Arnd Bergmann
2015-01-16 17:23 ` Marc Zyngier
2015-01-16 17:23 ` Marc Zyngier
2015-01-17 0:48 ` Simon Horman
2015-01-17 0:48 ` Simon Horman
2015-01-15 16:37 ` Arnd Bergmann
2015-01-15 16:37 ` Arnd Bergmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150116225243.GN18552@atomide.com \
--to=tony@atomide.com \
--cc=balbi@ti.com \
--cc=jason@lakedaemon.net \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-omap@vger.kernel.org \
--cc=linux@arm.linux.org.uk \
--cc=marc.zyngier@arm.com \
--cc=tglx@linutronix.de \
--cc=yingjoe.chen@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.