linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sekhar Nori <nsekhar@ti.com>
To: Tony Lindgren <tony@atomide.com>,
	John Ogness <john.ogness@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Jason Cooper <jason@lakedaemon.net>,
	Marc Zyngier <marc.zyngier@arm.com>, Felipe Balbi <balbi@ti.com>,
	Linux OMAP Mailing List <linux-omap@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] irqchip: omap-intc: fix spurious irq handling
Date: Thu, 3 Dec 2015 16:58:41 +0530	[thread overview]
Message-ID: <56602769.9050708@ti.com> (raw)
In-Reply-To: <20151020145255.GB3078@atomide.com>

Hi Tony,

On Tuesday 20 October 2015 08:22 PM, Tony Lindgren wrote:
> * John Ogness <john.ogness@linutronix.de> [151020 00:33]:
>> On 2015-10-20, Sekhar Nori <nsekhar@ti.com> wrote:
>>>> Do you know what really is causing the spurious interrupts in your
>>>> case?
>>>
>>> No, not yet.
>>
>> According to the TRM this is normal behavior if conditions that might
>> affect priority are changed during priority sorting.
>>
>>     6.2.5 ARM A8 INTC Spurious Interrupt Handling
>>
>>     The spurious flag indicates whether the result of the sorting (a
>>     window of 10 INTC functional clock cycles after the interrupt
>>     assertion) is invalid. The sorting is invalid if:
>>
>>     - The interrupt that triggered the sorting is no longer active
>>       during the sorting.
>>
>>     - A change in the mask has affected the result during the sorting
>>       time.
>>
>>>> In all the cases I've seen, the spurious interrupts were caused by a
>>>> missing flush of posted write acking the IRQ at the device driver.
>>>> for the _previously triggered_ INTC interrupt.
>>>>
>>>> If you have a reproducable case, I suggest you test that by printing
>>>> out the previous interrupt to check if that makes sense. And then see
>>>> if adding the missing read back to that interrupt handler fixes the
>>>> issue.
>>>
>>> Okay, thats good to know. Thanks for the hints and history of your debug
>>> on OMAP3. The issue is not easily reproducible in my case. But if I try
>>> hard enough, I can get hit it though. So I can surely try your hints.
>>
>> I can reproduce the situation very easily. After running a test for a
>> few minutes and printing out the previous interrupt, I have the
>> following list. These are the irq numbers seen by the handler before the
>> spurious interrupt triggered.
>>
>>     INT12 - EDMACOMPINT - TPCC (EDMA)
>>     INT41 - 3PGSWRXINT0 - CPSW (Ethernet)
>>     INT42 - 3PGSWTXINT0 - CPSW (Ethernet)
>>     INT68 - TINT2       - DMTIMER2
>>     INT72 - UART0INT    - UART0
>>
>> From this I do not think we can put the blame on any single driver. I
>> trigger this situation very easily by putting a load of 7,000+
>> interrupts per second on the system. This means we have 70,000 INTC
>> clock cycles per second where a change in the interrupt priority
>> conditions would cause the priority sorting to become invalid and thus
>> cause the spurious interrupt.
>>
>> I'm not sure if we can/should do anything more than Sekhar's patch of
>> acknowledging the spurious interrupt so the priority sorting algorithm
>> can run again.
> 
> OK thanks for testing. My guess from the above list would be EDMA
> or CPSW missing a flush of posted write. Maybe try adding a readback
> of the related device revision register after acking the interrupt into
> TPCC interrupt handler and CPSW interrupt handler(s)?

I could get back to debugging this only now. I have converted
__raw_writel to writel() and also added readback from the same register
in both EDMA and CPSW drivers. But I am still able to reproduce the
spurious irq reports.

> The timer2 and uart0 seem to be false positives here naturally.

I also added readback in 8250 driver. I haven't touched the timer
driver, but I guess if that driver had an issue, it should have come out
much earlier.

I also saw that sometimes previous irq was the TI LCDC interrupt. Added
readback there too. Did not help.

> I would not yet rule out the "previous interrupt" theory until you have
> tried that. We really want to know the root cause of the issue, just
> printing out spurious interrupt does not fix the problem :)

While we cannot rule out a software issue completely, the description in
TRM around spurious interrupts suggests it can happen even with no role
of software.

May I suggest we go ahead and add this patch to the kernel after
addressing Thomas's comment? At least it will prevent kernel from
locking up with flood of prints when a spurious irq happens and allows
easier debug by others too.

Thanks,
Sekhar


  reply	other threads:[~2015-12-03 11:29 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-19  9:46 [PATCH] irqchip: omap-intc: fix spurious irq handling Sekhar Nori
2015-10-19 10:13 ` Thomas Gleixner
2015-10-19 14:50 ` Tony Lindgren
2015-10-20  6:22   ` Sekhar Nori
2015-10-20  7:32     ` John Ogness
2015-10-20 14:52       ` Tony Lindgren
2015-12-03 11:28         ` Sekhar Nori [this message]
2015-12-03 15:02           ` Tony Lindgren
2015-12-03 15:24             ` Sekhar Nori
2015-12-03 15:39               ` Tony Lindgren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56602769.9050708@ti.com \
    --to=nsekhar@ti.com \
    --cc=balbi@ti.com \
    --cc=jason@lakedaemon.net \
    --cc=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-omap@vger.kernel.org \
    --cc=marc.zyngier@arm.com \
    --cc=tglx@linutronix.de \
    --cc=tony@atomide.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).