From: Oliver Hartkopp <socketcan@hartkopp.net>
To: Wolfgang Grandegger <wg@grandegger.com>
Cc: Austin Schuh <austin@peloton-tech.com>,
Pavel Pisa <pisa@cmp.felk.cvut.cz>,
linux-can@vger.kernel.org
Subject: Re: sja1000 interrupt problem
Date: Fri, 13 Dec 2013 10:38:49 +0100
Message-ID: <52AAD5A9.5030607@hartkopp.net>
In-Reply-To: <52AA3F16.3070309@grandegger.com>
On 12.12.2013 23:56, Wolfgang Grandegger wrote:
> My impression is that the problem is with counting "irqs_unhandled" and "irqs_count",
> which might not be done atomically. Actually three threads call "note_interrupt".
> Does that make sense? Hope to find some time tomorrow to use atomic_set and friends
> to handle these counters.
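
To illustrate the hypothesis quoted above, here is a minimal user-space sketch of how two non-atomic increments of a shared counter can lose an update, and how an atomic read-modify-write avoids that. It is not kernel code: the struct and function names are hypothetical, the bad interleaving is replayed by hand in a single thread so that it is deterministic, and C11 `<stdatomic.h>` is used as a stand-in for the kernel's atomic helpers.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for a per-IRQ counter; not the kernel's irq_desc. */
struct plain_counter {
	unsigned int value;
};

struct atomic_counter {
	atomic_uint value;
};

/*
 * Replay one bad interleaving of two non-atomic increments by hand:
 * both "threads" load the old value before either stores its result.
 */
static void racy_double_increment(struct plain_counter *c)
{
	unsigned int seen_by_a = c->value;	/* thread A loads */
	unsigned int seen_by_b = c->value;	/* thread B loads the same value */

	c->value = seen_by_a + 1;		/* thread A stores */
	c->value = seen_by_b + 1;		/* thread B overwrites A's update */
}

/* Each increment is one indivisible read-modify-write, so none is lost. */
static void atomic_double_increment(struct atomic_counter *c)
{
	atomic_fetch_add(&c->value, 1);
	atomic_fetch_add(&c->value, 1);
}

static unsigned int atomic_counter_read(struct atomic_counter *c)
{
	return atomic_load(&c->value);
}
```

Starting from 5, the replayed race ends at 6 instead of 7, while the atomic variant ends at 7. Whether such a race actually explains the disabled IRQs here is exactly the open question in the quoted mail.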
To hopefully complete the picture, here are some more traces from yesterday evening:
[ 1117.959986] handlers:
[ 1117.962184] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can8 PITA 0x0001
[ 1117.962190] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can10 PITA 0x0001
[ 1117.962196] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can11 PITA 0x0001
[ 1117.962201] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can9 PITA 0x0001
[ 1117.962202] Disabling IRQ #17
(..)
[ 5995.979307] handlers:
[ 5995.979337] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can0 PITA 0x0042
[ 5995.979342] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can1 PITA 0x0042
[ 5995.979346] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can2 PITA 0x0042
[ 5995.979350] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can3 PITA 0x0042
[ 5995.979354] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can16 PITA 0x0000
[ 5995.979358] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can17 PITA 0x0000
[ 5995.979362] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can18 PITA 0x0000
[ 5995.979365] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can19 PITA 0x0000
[ 5995.979366] Disabling IRQ #19
(..)
[ 7527.712564] handlers:
[ 7527.712606] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can12 PITA 0x0000
[ 7527.712612] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can13 PITA 0x0000
[ 7527.712617] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can14 PITA 0x0000
[ 7527.712623] [<c1087bdb>] irq_default_primary_handler threaded [<f86b169b>] sja1000_interrupt [sja1000]device can15 PITA 0x0000
[ 7527.712624] Disabling IRQ #18
/proc/interrupts:
 16:          8          9          8          8   IO-APIC-fasteoi   ehci_hcd:usb1, ahci, can4, can5, can6, can7
 17:    1838572    1843233    1845868    1838175   IO-APIC-fasteoi   can8, can10, can11, can9
 18:   12665112   12624875   12641515   12637319   IO-APIC-fasteoi   can12, can13, can14, can15
 19:   10787522   10822954   10803457   10815440   IO-APIC-fasteoi   can0, can1, can2, can3, can16, can17, can18, can19
So after some time, all CAN-related interrupts have been disabled ...
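
For context on the "Disabling IRQ #N" messages: to my understanding, the spurious-interrupt detector in kernel/irq/spurious.c evaluates its counters in windows of 100000 interrupts and disables the line when more than 99900 of them were reported as unhandled. The following is a simplified user-space model of that bookkeeping only. The struct, function names and constants are hypothetical, and the time-based reset of the unhandled counter and the threaded-handler details are deliberately left out.

```c
#include <stdbool.h>

#define WINDOW_SIZE	100000u	/* interrupts per evaluation window (assumed) */
#define UNHANDLED_LIMIT	99900u	/* more unhandled than this disables the IRQ (assumed) */

/* Hypothetical model of the two counters named in the quoted mail. */
struct irq_model {
	unsigned int irq_count;
	unsigned int irqs_unhandled;
	bool disabled;
};

/* Account for one interrupt; `handled` says whether any handler claimed it. */
static void model_note_interrupt(struct irq_model *m, bool handled)
{
	if (m->disabled)
		return;

	if (!handled)
		m->irqs_unhandled++;

	if (++m->irq_count < WINDOW_SIZE)
		return;

	/* Window complete: evaluate it and start a new one. */
	m->irq_count = 0;
	if (m->irqs_unhandled > UNHANDLED_LIMIT)
		m->disabled = true;
	m->irqs_unhandled = 0;
}

/*
 * Feed `total` interrupts into the model. Every `handled_every`-th
 * interrupt counts as handled; 0 means none of them is handled.
 * Returns true if the model has disabled the IRQ.
 */
static bool model_feed(struct irq_model *m, unsigned int total,
		       unsigned int handled_every)
{
	for (unsigned int i = 1; i <= total; i++)
		model_note_interrupt(m, handled_every && i % handled_every == 0);

	return m->disabled;
}
```

In this model a line is only disabled when nearly every interrupt in a full window goes unclaimed, so a miscounted `irqs_unhandled` or `irq_count` could plausibly trip the limit even though the handlers are doing their job.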
I wondered whether the PITA access that consumes the interrupt bit really works.
So I turned the if statement into a while statement here:
--- linux-source-3.10/drivers/net/can/sja1000/peak_pci.c	2013-09-08 07:10:14.000000000 +0200
+++ linux-source-3.10-rt/drivers/net/can/sja1000/peak_pci.c	2013-12-13 08:42:15.850192329 +0100
@@ -539,12 +539,17 @@
 static void peak_pci_post_irq(const struct sja1000_priv *priv)
 {
 	struct peak_pci_chan *chan = priv->priv;
+#if 0
 	u16 icr;
 
 	/* Select and clear in PITA stored interrupt */
 	icr = readw(chan->cfg_base + PITA_ICR);
 	if (icr & chan->icr_mask)
 		writew(chan->icr_mask, chan->cfg_base + PITA_ICR);
+#else
+	while (readw(chan->cfg_base + PITA_ICR) & chan->icr_mask)
+		writew(chan->icr_mask, chan->cfg_base + PITA_ICR);
+#endif
 }
 
 static int peak_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
This should usually not have any effect, right?
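
A small user-space model may help to reason about that expectation. It assumes (this is not confirmed here) that PITA_ICR behaves as a write-1-to-clear register: if a single write clears the bit, the while loop does exactly one write, just like the if version. If the bit is latched again after every clear, an unbounded loop never terminates. The fake register, its `stuck` field and all function names below are hypothetical, and the loop is bounded only so that the sketch itself always returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear interrupt status register. */
struct fake_icr {
	uint16_t pending;	/* latched interrupt bits */
	uint16_t stuck;		/* bits that get latched again after every clear */
	unsigned int writes;	/* number of clear attempts seen */
};

static uint16_t fake_readw(const struct fake_icr *r)
{
	return r->pending;
}

static void fake_writew(struct fake_icr *r, uint16_t val)
{
	r->writes++;
	r->pending &= (uint16_t)~val;	/* write-1-to-clear (assumed semantics) */
	r->pending |= r->stuck;		/* still-asserted sources latch again */
}

/*
 * Bounded form of the while loop from the patch: retry the clear at most
 * `max_tries` times. Returns true if the masked bit ended up cleared.
 */
static bool clear_irq_bounded(struct fake_icr *r, uint16_t mask,
			      unsigned int max_tries)
{
	assert(max_tries > 0);

	while (fake_readw(r) & mask) {
		if (max_tries-- == 0)
			return false;	/* bit keeps coming back, give up */
		fake_writew(r, mask);
	}

	return true;
}
```

With `stuck` left at zero, one write is enough. With the bit marked as stuck, the loop uses up all its tries and reports failure, which is the situation an unbounded loop would spin in forever.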
But what happened was a big crash after a pretty short time:
[ 760.718091] INFO: rcu_preempt self-detected stall on CPU { 1} (t=84015 jiffies g=482 c=481 q=2688)
[ 760.718092] sending NMI to all CPUs:
[ 760.718094] NMI backtrace for cpu 1
[ 760.718098] CPU: 1 PID: 3629 Comm: irq/17-can9 Not tainted 3.10.11-rt7-can #6
[ 760.718099] Hardware name: xxxxxx
[ 760.718100] task: edcdb4e0 ti: ee942000 task.ti: ee942000
[ 760.718101] EIP: 0060:[<c118b3a3>] EFLAGS: 00000006 CPU: 1
[ 760.718106] EIP is at __const_udelay+0x7/0x17
[ 760.718107] EAX: 00418958 EBX: 00002710 ECX: c13ca099 EDX: 009aa184
[ 760.718108] ESI: f51bb954 EDI: 00000a80 EBP: ee943ec0 ESP: ee943dbc
[ 760.718109] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 760.718111] CR0: 8005003b CR2: 0812a748 CR3: 0156b000 CR4: 000007f0
[ 760.718112] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 760.718112] DR6: ffff0ff0 DR7: 00000400
[ 760.718113] Stack:
[ 760.718117] c10235ec c146a580 c108dd53 c13d58b6 0001482f 000001e2 000001e1 00000a80
[ 760.718120] 00000007 c1037f67 c10382b5 00000001 c146a580 00000001 edcdb4e0 00000000
[ 760.718123] 00000001 ee943ec0 c103d5a1 f51bb7f4 ee943ec0 000000b1 c106c73b f51bb7f4
[ 760.718124] Call Trace:
[ 760.718129] [<c10235ec>] ? arch_trigger_all_cpu_backtrace+0x57/0x5f
[ 760.718132] [<c108dd53>] ? rcu_check_callbacks+0x17e/0x470
[ 760.718135] [<c1037f67>] ? raise_softirq_irqoff+0x5/0x2a
[ 760.718137] [<c10382b5>] ? raise_softirq+0x17/0x20
[ 760.718140] [<c103d5a1>] ? update_process_times+0x2f/0x39
[ 760.718142] [<c106c73b>] ? tick_sched_handle+0x37/0x43
[ 760.718144] [<c106c91e>] ? tick_sched_timer+0x28/0x4b
[ 760.718145] [<c106c8f6>] ? tick_sched_do_timer+0x2f/0x2f
[ 760.718149] [<c104d3a5>] ? __run_hrtimer+0x8e/0x12e
[ 760.718151] [<c104dc59>] ? hrtimer_interrupt+0x1a8/0x305
[ 760.718164] [<c1022b3a>] ? smp_apic_timer_interrupt+0x55/0x64
[ 760.718167] [<c1310b7c>] ? apic_timer_interrupt+0x34/0x3c
[ 760.718171] [<f8370001>] ? usb_otg_state_string+0x1/0x13 [usb_common]
[ 760.718177] [<c126007b>] ? skb_copy_datagram_const_iovec+0xf/0x196
[ 760.718180] [<f8709078>] ? peak_pci_post_irq+0x12/0x1b [peak_pci]
[ 760.718183] [<f88e7adf>] ? sja1000_interrupt+0x444/0x456 [sja1000]
[ 760.718187] [<c121f27b>] ? add_interrupt_randomness+0x34/0x131
[ 760.718191] [<c1087cf3>] ? irq_thread_fn+0x21/0x21
[ 760.718193] [<c1087d08>] ? irq_forced_thread_fn+0x15/0x38
[ 760.718195] [<c1088494>] ? irq_thread+0x7e/0x169
[ 760.718197] [<c108857f>] ? irq_thread+0x169/0x169
[ 760.718198] [<c1088416>] ? wake_threads_waitq+0x31/0x31
[ 760.718200] [<c104a79e>] ? kthread+0x68/0x6d
[ 760.718203] [<c13143b7>] ? ret_from_kernel_thread+0x1b/0x28
[ 760.718205] [<c104a736>] ? __kthread_parkme+0x50/0x50
[ 760.718221] Code: 00 8d bc 27 00 00 00 00 eb 0e 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 48 75 fd 48 c3 ff 15 94 17 48 c1 c3 64 8b 15 dc 50 56 c1 <6b> d2 3e c1 e0 02 f7 e2 8d 42 01 e9 e2 ff ff ff 69 c0 c7 10 00
[ 760.718223] NMI backtrace for cpu 0
[ 760.718225] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.10.11-rt7-can #6
[ 760.718226] Hardware name: xxxxxx
[ 760.718227] task: f4c70bc0 ti: f4c7e000 task.ti: f4c7e000
[ 760.718228] EIP: 0060:[<c131063c>] EFLAGS: 00000002 CPU: 0
[ 760.718231] EIP is at _raw_spin_unlock_irq+0x3/0x43
[ 760.718232] EAX: f51b2640 EBX: ee34de00 ECX: f515a000 EDX: f4c70bc0
[ 760.718233] ESI: f51b2640 EDI: 00000000 EBP: 00000000 ESP: f4c7fec0
[ 760.718234] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 760.718235] CR0: 8005003b CR2: b7721484 CR3: 2ea77000 CR4: 000007f0
[ 760.718237] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 760.718237] DR6: ffff0ff0 DR7: 00000400
[ 760.718238] Stack:
[ 760.718241] c1052acd 6b2cb6d6 ef2a3c00 f51b2640 f4c70bc0 c130fa2f ee34de00 000000b1
[ 760.718244] c1565640 1e17e089 000000b1 c1565640 c103461e f4c70bc0 c10546d3 f515d488
[ 760.718248] f4c70bc0 c151043c f4c7ff3c c1037c06 00000000 00000004 f4c70bc0 f4c70bc0
[ 760.718248] Call Trace:
[ 760.718252] [<c1052acd>] ? finish_task_switch+0x38/0x9d
[ 760.718254] [<c130fa2f>] ? __schedule+0x385/0x41e
[ 760.718257] [<c103461e>] ? unpin_current_cpu+0xb/0x45
[ 760.718259] [<c10546d3>] ? migrate_enable+0x18f/0x19c
[ 760.718261] [<c1037c06>] ? do_current_softirqs+0x209/0x26e
[ 760.718263] [<c108e2f0>] ? rcu_note_context_switch+0x13b/0x14c
[ 760.718265] [<c130fb84>] ? schedule+0x5e/0x6e
[ 760.718267] [<c10505a6>] ? smpboot_thread_fn+0x233/0x2a5
[ 760.718269] [<c130fb84>] ? schedule+0x5e/0x6e
[ 760.718271] [<c1050373>] ? test_ti_thread_flag+0x7/0x7
[ 760.718272] [<c104a79e>] ? kthread+0x68/0x6d
[ 760.718275] [<c13143b7>] ? ret_from_kernel_thread+0x1b/0x28
[ 760.718277] [<c104a736>] ? __kthread_parkme+0x50/0x50
[ 760.718293] Code: 85 c0 74 07 e8 ae f4 ff ff eb 15 89 e0 ba 09 00 00 00 25 00 e0 ff ff e8 6e 0c d6 ff 85 c0 75 e4 8b 04 24 e9 c1 76 d2 ff 80 00 01 <fb> 66 66 90 66 90 b8 01 00 00 00 e8 fe 27 00 00 89 e0 ba 03 00
[ 760.718294] NMI backtrace for cpu 2
[ 760.718296] CPU: 2 PID: 3195 Comm: irq/18-can12 Not tainted 3.10.11-rt7-can #6
[ 760.718297] Hardware name: xxxxxx
[ 760.718298] task: ee2bb4e0 ti: edd34000 task.ti: edd34000
[ 760.718299] EIP: 0060:[<f8709078>] EFLAGS: 00000202 CPU: 2
[ 760.718305] EIP is at peak_pci_post_irq+0x12/0x1b [peak_pci]
[ 760.718306] EAX: ef66ee1c EBX: ef66e800 ECX: f8590003 EDX: f859a000
[ 760.718307] ESI: 00000385 EDI: f864a01c EBP: ef66ed40 ESP: edd35efc
[ 760.718308] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 760.718309] CR0: 8005003b CR2: b7749000 CR3: 2eae3000 CR4: 000007f0
[ 760.718310] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 760.718311] DR6: ffff0ff0 DR7: 00000400
[ 760.718312] Stack:
[ 760.718315] f88e7adf c121f27b ee2bb413 000344d2 edd83748 00000001 ef08d200 edd83748
[ 760.718319] eea248c0 f4e44900 ee2bb4e0 c1087cf3 c1087d08 f4e44900 eea248c0 ee2bb4e0
[ 760.718322] c1088494 eea248e0 ee2bb4e0 2f86ad9c 00000000 00000000 00000000 00000000
[ 760.718323] Call Trace:
[ 760.718326] [<f88e7adf>] ? sja1000_interrupt+0x444/0x456 [sja1000]
[ 760.718329] [<c121f27b>] ? add_interrupt_randomness+0x34/0x131
[ 760.718332] [<c1087cf3>] ? irq_thread_fn+0x21/0x21
[ 760.718334] [<c1087d08>] ? irq_forced_thread_fn+0x15/0x38
[ 760.718336] [<c1088494>] ? irq_thread+0x7e/0x169
[ 760.718338] [<c108857f>] ? irq_thread+0x169/0x169
[ 760.718340] [<c1088416>] ? wake_threads_waitq+0x31/0x31
[ 760.718342] [<c104a79e>] ? kthread+0x68/0x6d
[ 760.718344] [<c13143b7>] ? ret_from_kernel_thread+0x1b/0x28
[ 760.718346] [<c104a736>] ? __kthread_parkme+0x50/0x50
[ 760.718364] Code: c3 8b 80 b8 00 00 00 8d 04 90 8a 00 c3 8b 80 b8 00 00 00 8d 04 90 88 08 c3 8b 80 b0 00 00 00 eb 05 8b 08 66 89 11 8b 10 66 8b 0a <8b> 50 08 66 85 d1 75 ee c3 57 56 89 ce 53 89 c3 83 ec 14 31 c0
[ 760.718365] NMI backtrace for cpu 3
[ 760.718367] CPU: 3 PID: 2867 Comm: irq/19-can2 Not tainted 3.10.11-rt7-can #6
[ 760.718368] Hardware name: xxxxxx
[ 760.718369] task: edcdc0a0 ti: edd14000 task.ti: edd14000
[ 760.718370] EIP: 0060:[<f8709078>] EFLAGS: 00000202 CPU: 3
[ 760.718372] EIP is at peak_pci_post_irq+0x12/0x1b [peak_pci]
[ 760.718373] EAX: ef66c61c EBX: ef66c000 ECX: f82700c3 EDX: f8270000
[ 760.718375] ESI: 00000127 EDI: f827881c EBP: ef66c540 ESP: edd15efc
[ 760.718376] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 760.718377] CR0: 8005003b CR2: b76fc000 CR3: 2eae3000 CR4: 000007f0
[ 760.718378] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 760.718379] DR6: ffff0ff0 DR7: 00000400
[ 760.718379] Stack:
[ 760.718382] f88e7adf c121f27b edcdc013 000344d2 edcecdc8 00000001 ef127200 edcecdc8
[ 760.718386] edc229c0 f4e449c0 edcdc0a0 c1087cf3 c1087d08 f4e449c0 edc229c0 edcdc0a0
[ 760.718389] c1088494 edc229e0 edcdc0a0 15b2b9b8 00000000 00000000 00000000 00000000
[ 760.718389] Call Trace:
[ 760.718392] [<f88e7adf>] ? sja1000_interrupt+0x444/0x456 [sja1000]
[ 760.718394] [<c121f27b>] ? add_interrupt_randomness+0x34/0x131
[ 760.718397] [<c1087cf3>] ? irq_thread_fn+0x21/0x21
[ 760.718398] [<c1087d08>] ? irq_forced_thread_fn+0x15/0x38
[ 760.718400] [<c1088494>] ? irq_thread+0x7e/0x169
[ 760.718402] [<c108857f>] ? irq_thread+0x169/0x169
[ 760.718404] [<c1088416>] ? wake_threads_waitq+0x31/0x31
[ 760.718405] [<c104a79e>] ? kthread+0x68/0x6d
[ 760.718408] [<c13143b7>] ? ret_from_kernel_thread+0x1b/0x28
[ 760.718409] [<c104a736>] ? __kthread_parkme+0x50/0x50
[ 760.718426] Code: c3 8b 80 b8 00 00 00 8d 04 90 8a 00 c3 8b 80 b8 00 00 00 8d 04 90 88 08 c3 8b 80 b0 00 00 00 eb 05 8b 08 66 89 11 8b 10 66 8b 0a <8b> 50 08 66 85 d1 75 ee c3 57 56 89 ce 53 89 c3 83 ec 14 31 c0
No idea ...
Regards,
Oliver