linux-can.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Wolfgang Grandegger <wg@grandegger.com>
To: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Austin Schuh <austin@peloton-tech.com>,
	Pavel Pisa <pisa@cmp.felk.cvut.cz>,
	linux-can@vger.kernel.org
Subject: Re: sja1000 interrupt problem
Date: Thu, 12 Dec 2013 23:56:22 +0100	[thread overview]
Message-ID: <52AA3F16.3070309@grandegger.com> (raw)
In-Reply-To: <52A9F491.3060406@hartkopp.net>

Hi Oliver,

I started to analyse Austin's function trace...

On 12/12/2013 06:38 PM, Oliver Hartkopp wrote:
> On 12.12.2013 07:13, Oliver Hartkopp wrote:
>> On 11.12.2013 20:27, Wolfgang Grandegger wrote:
> 
>>> BTW, does the problem show up if only one can is active? Austin tests
>>> did just RX on can1, but can0 was obviously up (but no traffic).
>>
>> There was a case that the IRQ line from can9, can10, can11, can12 failed with
>> an error messaged pointing to can10. All four interfaces where up but only
>> can9 had traffic at that time.
> 
> Hi Wolfgang,
> 
> here's my latest investigation result.
> 
> The setup still has (only) can9 with traffic and after a modification of
> linux/kernel/irq/spurious.c (patch below) I got this:
> 
> [ 1117.957651] irq 17: nobody cared (try booting with the "irqpoll" option)
> [ 1117.959910] CPU: 0 PID: 3498 Comm: irq/17-can9 Not tainted 3.10.11-rt7-can #6
> [ 1117.959913] Hardware name: xxxxxx
> [ 1117.959916]  00000000 c1089114 f4e44840 00000001 00000011 c1089490 ee84e780 f4e44840
> [ 1117.959924]  ee84e780 ed876fa0 c1087cf3 c10884a9 ee84e7a0 ed876fa0 16edf7d9 00000000
> [ 1117.959932]  00000000 00000000 00000000 c108857f eea53de4 ee84e780 c1088416 ef0a1f90
> [ 1117.959939] Call Trace:
> [ 1117.959949]  [<c1089114>] ? __report_bad_irq+0x18/0xbe
> [ 1117.959953]  [<c1089490>] ? note_interrupt+0x118/0x194
> [ 1117.959957]  [<c1087cf3>] ? irq_thread_fn+0x21/0x21
> [ 1117.959960]  [<c10884a9>] ? irq_thread+0x93/0x169
> [ 1117.959964]  [<c108857f>] ? irq_thread+0x169/0x169
> [ 1117.959968]  [<c1088416>] ? wake_threads_waitq+0x31/0x31
> [ 1117.959973]  [<c104a79e>] ? kthread+0x68/0x6d
> [ 1117.959979]  [<c13143b7>] ? ret_from_kernel_thread+0x1b/0x28
> [ 1117.959982]  [<c104a736>] ? __kthread_parkme+0x50/0x50
> [ 1117.959986] handlers:
> [ 1117.962184] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can8 PITA 0x0001
> [ 1117.962190] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can10 PITA 0x0001
> [ 1117.962196] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can11 PITA 0x0001
> [ 1117.962201] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can9 PITA 0x0001
> [ 1117.962202] Disabling IRQ #17

...ending here as well.

> The value in the PITA register is 0x0001 which is from can9 according to
> peak_pci_icr_masks[] (2nd element).
> 
> So obviously the flag was set and not consumed correctly ...

Well, it could also be set after the bug above triggered. If I look to
Austins trace, I see at the end (with "grep note_interrupt"):

 irq/18-ata_gene-219   [003] ....... 360026.315878: note_interrupt <-irq_thread
     irq/18-can1-1890  [001] ....... 360026.315984: note_interrupt <-irq_thread
     irq/18-can0-1863  [002] ....... 360026.316122: note_interrupt <-irq_thread
 irq/18-ata_gene-219   [003] ....... 360026.316140: note_interrupt <-irq_thread
     irq/18-can1-1890  [001] ....... 360026.316184: note_interrupt <-irq_thread
     irq/18-can0-1863  [002] ....... 360026.316358: note_interrupt <-irq_thread
 irq/18-ata_gene-219   [003] ....... 360026.316361: note_interrupt <-irq_thread
 irq/18-ata_gene-219   [003] ....... 360026.316361: __report_bad_irq <-note_interrupt
     irq/18-can1-1890  [001] ....... 360026.316437: note_interrupt <-irq_thread
     irq/18-can0-1863  [002] ....... 360026.316608: note_interrupt <-irq_thread
     irq/18-can1-1890  [001] ....... 360026.316714: note_interrupt <-irq_thread
 irq/18-ata_gene-219   [003] d...1.. 360026.317050: _raw_spin_unlock_irqrestore <-note_interrupt
 irq/18-ata_gene-219   [003] ....... 360026.317051: printk <-note_interrupt
 irq/18-ata_gene-219   [003] ....... 360026.317541: irq_disable <-note_interrupt
 irq/18-ata_gene-219   [003] ....... 360026.317541: mod_timer <-note_interrupt

In Austin's case can1 is doing RX and can0 is up but inactive. The last activity
in "sja1000_interrupt" is:

     irq/18-can1-1890  [001] .....11 360026.316619: sja1000_interrupt <-irq_forced_thread_fn
     irq/18-can1-1890  [001] .....11 360026.316619: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316621: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316623: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316626: alloc_can_skb <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316631: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316633: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316637: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316640: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316643: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316645: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316648: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316651: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316654: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316656: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316659: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316662: sja1000_write_cmdreg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....12 360026.316666: migrate_enable <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316667: netif_rx <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316671: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316673: peak_pci_read_reg <-sja1000_interrupt
     irq/18-can1-1890  [001] .....11 360026.316675: peak_pci_post_irq <-sja1000_interrupt

Which ends with "peak_pci_post_irq" before the interrupt gets disabled.
Pretty normal! I'm puzzled why the irqs_unhandled counter of the
interrupt reaches 99900 here:

  http://lxr.free-electrons.com/source/kernel/irq/spurious.c#L30

My impression is that the problem is with counting "irqs_unhandled" and "irqs_count",
which might not be done atomically. Actually three threads call "note_interrupt".
Does that make sense? Hope to find some time tomorrow to use atomic_set and friends
to handle these counters.

Wolfgang.


  reply	other threads:[~2013-12-12 22:56 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-08  0:47 sja1000 interrupt problem Austin Schuh
2013-10-08  6:32 ` Wolfgang Grandegger
2013-10-08  6:58   ` Oliver Hartkopp
2013-10-08 18:48     ` Austin Schuh
2013-10-08 19:44       ` Wolfgang Grandegger
2013-10-08 20:47         ` Austin Schuh
2013-10-09  6:21           ` Wolfgang Grandegger
2013-10-09  6:31           ` Wolfgang Grandegger
2013-10-09  6:47           ` Wolfgang Grandegger
     [not found]             ` <CANGgnMZpPGctUWGcg7Lp-QFPc7d6A5GeL9KQYnpeYMR8WukgdA@mail.gmail.com>
2013-11-07  8:15               ` Wolfgang Grandegger
2013-11-07 23:43                 ` Austin Schuh
2013-11-09 14:21                   ` Oliver Hartkopp
2013-11-12  2:59                     ` Austin Schuh
2013-11-12 21:26                       ` Oliver Hartkopp
2013-11-12 23:22                         ` Austin Schuh
2013-11-13  3:41                           ` Austin Schuh
2013-11-13  6:58                             ` Oliver Hartkopp
2013-11-13  9:48                               ` Kurt Van Dijck
2013-11-13  6:44                           ` Oliver Hartkopp
2013-11-13  8:11                             ` Wolfgang Grandegger
2013-11-13  9:08                               ` Pavel Pisa
2013-11-13  9:52                                 ` Wolfgang Grandegger
2013-11-13 18:41                                   ` Oliver Hartkopp
2013-11-13 19:29                                     ` Wolfgang Grandegger
2013-11-13 22:00                                       ` Oliver Hartkopp
2013-11-13 11:02                                 ` Kurt Van Dijck
2013-11-16 21:42                                 ` Oliver Hartkopp
2013-11-17  8:18                                   ` Wolfgang Grandegger
2013-11-17 14:27                                     ` Oliver Hartkopp
2013-11-17 17:23                                       ` Wolfgang Grandegger
2013-11-17 20:46                                         ` Wolfgang Grandegger
2013-11-18 17:08                                           ` Austin Schuh
2013-12-09 21:54                                             ` Austin Schuh
2013-12-09 21:54                                               ` Austin Schuh
2013-12-10  7:49                                               ` Wolfgang Grandegger
2013-12-10  8:05                                                 ` Austin Schuh
2013-12-10  9:32                                                   ` Wolfgang Grandegger
2013-12-10 13:47                                                     ` Oliver Hartkopp
2013-12-10 14:23                                                       ` Oliver Hartkopp
2013-12-10 14:41                                                       ` Wolfgang Grandegger
2013-12-10 16:05                                                         ` Oliver Hartkopp
2013-12-10 21:12                                                           ` Wolfgang Grandegger
2013-12-11 16:59                                                             ` Oliver Hartkopp
2013-12-11 19:27                                                               ` Wolfgang Grandegger
2013-12-12  6:13                                                                 ` Oliver Hartkopp
2013-12-12 17:38                                                                   ` Oliver Hartkopp
2013-12-12 22:56                                                                     ` Wolfgang Grandegger [this message]
2013-12-13  0:07                                                                       ` Austin Schuh
2013-12-13 16:16                                                                         ` Oliver Hartkopp
2013-12-13  9:38                                                                       ` Oliver Hartkopp
2013-12-13 10:04                                                                         ` Wolfgang Grandegger
2013-12-13 10:09                                                                           ` Wolfgang Grandegger
2013-12-13 16:25                                                                             ` Oliver Hartkopp
2013-12-13 17:33                                                                               ` Wolfgang Grandegger
2013-12-13 10:07                                                                         ` Marc Kleine-Budde
2013-12-13 16:22                                                                           ` Oliver Hartkopp
2013-12-13 17:14                                                                             ` Oliver Hartkopp
2013-12-13 21:14                                                                               ` Oliver Hartkopp
2013-12-14  9:51                                                                                 ` Oliver Hartkopp
2013-12-20 23:13                                                                                   ` Austin Schuh
2013-12-21  8:29                                                                                     ` Wolfgang Grandegger
2013-12-21 13:12                                                                                       ` Oliver Hartkopp
2013-12-21 12:55                                                                                     ` Oliver Hartkopp
2013-12-23 15:58                                                                                       ` Oliver Hartkopp
2013-11-09 19:42                   ` Wolfgang Grandegger
     [not found]                     ` <CANGgnMbb+VResUC6h+cK6Hfe5PLJx9R9ao6bMdJM2e5BPaDamw@mail.gmail.com>
2013-11-12 22:15                       ` Wolfgang Grandegger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52AA3F16.3070309@grandegger.com \
    --to=wg@grandegger.com \
    --cc=austin@peloton-tech.com \
    --cc=linux-can@vger.kernel.org \
    --cc=pisa@cmp.felk.cvut.cz \
    --cc=socketcan@hartkopp.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).