From: Oliver Hartkopp <socketcan@hartkopp.net>
To: Austin Schuh <austin@peloton-tech.com>
Cc: Wolfgang Grandegger <wg@grandegger.com>, linux-can@vger.kernel.org
Subject: Re: sja1000 interrupt problem
Date: Wed, 13 Nov 2013 07:58:59 +0100
Message-ID: <52832333.9080908@hartkopp.net>
In-Reply-To: <CANGgnMaCvb=B2r997e+H9UjVquX66HJ+OtftL0EyGP7MKcy0tQ@mail.gmail.com>
Hi Austin,
sorry for checking my mails in sequential order :-)
I would have been able to shorten the last mail.
Thanks for your interesting investigation.
I wonder why this problem did not show up before then. Having shared
interrupts should be a usual thing.
This kind of race condition should not be there at all. Do you have a second
peak_pci hardware? It could be an idea to try to split the IRQs in a way that
you have two IRQs for two cards - and then connect can0 to can2.
You would have a pretty fast following RX/TX interrupt but without interrupt
sharing ...
Best regards,
Oliver
On 13.11.2013 04:41, Austin Schuh wrote:
> On Tue, Nov 12, 2013 at 3:22 PM, Austin Schuh <austin@peloton-tech.com> wrote:
>> On Tue, Nov 12, 2013 at 1:26 PM, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
>>> On 12.11.2013 03:59, Austin Schuh wrote:
>>>
>>>>> From the trace it is pretty hard to know which CAN interface is in charge.
>>>>> (2) Can you please add the output of dev->ifindex in the pr_info() calls?
>>>>
>>>> Gladly. See the updated logs.
>>>>
>>>> [ 556.019246] peak_pci 0000:05:00.0 can1: Got an sja1000 interrupt.
>>>> [ 556.019268] Unhandled IRQ 18... stop tracing...
>>>> [ 556.019280] peak_pci 0000:05:00.0 can0: Got an sja1000 interrupt.
>>>> [ 556.019289] peak_pci 0000:05:00.0 can1: Received packet.
>>>> [ 556.019299] peak_pci 0000:05:00.0 can1: sja1000_rx
>>>> [ 556.019307] peak_pci 0000:05:00.0 can0: TX complete.
>>>> [ 556.019318] peak_pci 0000:05:00.0 can0: Returning IRQ_HANDLED
>>>> [ 556.019362] peak_pci 0000:05:00.0 can1: Returning IRQ_HANDLED
>>>>
>>>
>>> This looks pretty broken regarding the IRQ handling.
>>> Maybe the IRQ thread handling has a real problem in the -rt kernel ?!?
>>
>> Sounds pretty plausible right now.
>
> Ok, I spent a good chunk of today reading the IRQ handling code in the
> kernel, and I think I get what is happening and have a plausible
> explanation for why the interrupt is getting disabled. Not sure how
> to test it.
>
> Here is what it looks like is happening. The hardware triggers an
> interrupt. The handler is called, and then the registered action for
> each of the devices is to notify their threads that an IRQ occurred,
> and to have them handle it. Each of the handling threads then calls
> the sja1000_interrupt function, or the equivalent ata_generic
> interrupt function. 2 of the 3 interrupt functions then return
> IRQ_NONE, and one of them returns IRQ_HANDLED. note_interrupt is then
> called in each of the threads (instead of being called once in the
> non-rt case), resulting in 2 unhandled calls, and 1 handled call. So
> far, so good. The kernel operates as expected, since less than 99.9 %
> of the interrupts are unhandled. (There is a note_interrupt call in the
> handler, but since the threaded handlers are notified, this doesn't
> get counted.)
>
> Since the IRQ handlers are now all in threads, if the thread that
> actually receives data doesn't process the interrupts either because
> something goes wrong, or because it doesn't get scheduled, there will
> be a bunch of unhandled interrupts noted, and no handled interrupts
> noted. This will cause the IRQ to be disabled.
>
> I guess the next interesting thing to do is to trigger when it
> disables the IRQ and take a look at what is happening. I have a test
> running on one machine with tracing enabled which will disable tracing
> when the IRQ is disabled. That should provide some interesting
> results. I think I also know how to bypass it for now by setting
> "noirqdebug", but I'd like to fix it for real as well.
>
> Austin
>
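
The accounting Austin describes above can be illustrated with a small
user-space sketch. This is not the kernel's code: note_interrupt_model(),
simulate() and struct irq_stats are hypothetical names, and the window of
100000 noted calls with a limit of 99900 unhandled ones is an assumption
derived from the 99.9 % figure in the quoted mail. The model assumes three
threaded handlers share one line and each notes its own result, as in the
-rt case described above.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel's handler return values. */
enum irqreturn_model { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical per-line counters, loosely modelled on the description above. */
struct irq_stats {
    unsigned int irq_count;       /* calls noted in the current window */
    unsigned int irqs_unhandled;  /* of those, how many returned NONE  */
    bool disabled;                /* set once the line is shut off     */
};

#define WINDOW          100000u   /* assumed window size               */
#define UNHANDLED_LIMIT  99900u   /* assumed 99.9 % threshold          */

/* Note one handler result; disable the line if a whole window was
 * almost entirely unhandled. */
void note_interrupt_model(struct irq_stats *s, enum irqreturn_model ret)
{
    if (s->disabled)
        return;
    if (ret == MODEL_IRQ_NONE)
        s->irqs_unhandled++;
    if (++s->irq_count < WINDOW)
        return;
    /* End of window: evaluate, then start a new one. */
    s->irq_count = 0;
    if (s->irqs_unhandled > UNHANDLED_LIMIT)
        s->disabled = true;
    s->irqs_unhandled = 0;
}

/* Three threads share the line. Two never own the interrupt and note
 * NONE. The third notes HANDLED only if it actually gets to run. */
bool simulate(unsigned int hw_irqs, bool owner_thread_runs)
{
    struct irq_stats s = { 0, 0, false };

    for (unsigned int i = 0; i < hw_irqs; i++) {
        note_interrupt_model(&s, MODEL_IRQ_NONE);   /* e.g. ata_generic */
        note_interrupt_model(&s, MODEL_IRQ_NONE);   /* idle CAN channel */
        if (owner_thread_runs)
            note_interrupt_model(&s, MODEL_IRQ_HANDLED);
    }
    return s.disabled;
}
```

Under these assumptions, simulate(200000, true) leaves the line enabled,
because only about two thirds of the noted calls are unhandled. With
simulate(200000, false) the first full window contains only unhandled
calls, so the model disables the line, which matches the failure mode
suspected in the mail.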