From: Oliver Hartkopp <socketcan@hartkopp.net>
To: Wolfgang Grandegger <wg@grandegger.com>,
	Austin Schuh <austin@peloton-tech.com>,
	Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: linux-can@vger.kernel.org
Subject: Re: sja1000 interrupt problem
Date: Tue, 10 Dec 2013 15:23:46 +0100
Message-ID: <52A723F2.7040908@hartkopp.net>
In-Reply-To: <52A71B6C.3050600@hartkopp.net>

In addition to the setup of the mail below:

Now the can9 (with the 1Mbit/s) crashed with this message:

[ 5542.981022] irq 17: nobody cared (try booting with the "irqpoll" option)
[ 5542.983013] CPU: 3 PID: 5407 Comm: irq/17-can10 Not tainted 3.10.11-rt7-can #1
[ 5542.983016] Hardware name: xxxxxx
[ 5542.983019]  00000000 c108910d f4e44840 00000000 00000011 c1089466 ee219f00 f4e44840
[ 5542.983027]  ee219f00 ef2d7580 c1087cf3 c10884a9 ee219f20 ef2d7580 1647bf59 00000000
[ 5542.983035]  00000000 00000000 00000000 c108857f ef169a68 ee219f00 c1088416 ee87bf90
[ 5542.983042] Call Trace:
[ 5542.983052]  [<c108910d>] ? __report_bad_irq+0x11/0x94
[ 5542.983057]  [<c1089466>] ? note_interrupt+0x118/0x192
[ 5542.983061]  [<c1087cf3>] ? irq_thread_fn+0x21/0x21
[ 5542.983064]  [<c10884a9>] ? irq_thread+0x93/0x169
[ 5542.983069]  [<c108857f>] ? irq_thread+0x169/0x169
[ 5542.983072]  [<c1088416>] ? wake_threads_waitq+0x31/0x31
[ 5542.983080]  [<c104a79e>] ? kthread+0x68/0x6d
[ 5542.983090]  [<c13143b7>] ? ret_from_kernel_thread+0x1b/0x28
[ 5542.983096]  [<c104a736>] ? __kthread_parkme+0x50/0x50
[ 5542.983102] handlers:
[ 5542.985069] [<c1087bdb>] irq_default_primary_handler threaded [<f886769b>] sja1000_interrupt [sja1000]
[ 5542.985073] [<c1087bdb>] irq_default_primary_handler threaded [<f886769b>] sja1000_interrupt [sja1000]
[ 5542.985080] [<c1087bdb>] irq_default_primary_handler threaded [<f886769b>] sja1000_interrupt [sja1000]
[ 5542.985082] [<c1087bdb>] irq_default_primary_handler threaded [<f886769b>] sja1000_interrupt [sja1000]
[ 5542.985083] Disabling IRQ #17

The problem with can9 shows up with irq/17-can10.
This might be related to the PITA hack.

Looks like this machine turned into a zombie:

I still get about 60 CAN frames per second from can9 even without the interrupt #17
counters in /proc/interrupts being increased ...

Oliver
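
For readers following the "nobody cared" report above: a minimal user-space sketch of the accounting that, as far as I recall, note_interrupt() in kernel/irq/spurious.c performs before it prints "Disabling IRQ". The struct and function names (irq_model, irq_model_note) are hypothetical, the two thresholds are quoted from memory, and the kernel's extra rule that restarts the unhandled count after a quiet period is left out, so treat this as an illustration only.

```c
#include <stdbool.h>

/* Thresholds as remembered from kernel/irq/spurious.c; treat as assumptions. */
#define IRQ_SAMPLE_WINDOW   100000
#define IRQ_UNHANDLED_LIMIT  99900

/* Hypothetical stand-in for the relevant fields of struct irq_desc. */
struct irq_model {
	unsigned int irq_count;      /* interrupts seen in the current window */
	unsigned int irqs_unhandled; /* of those, how many no handler claimed */
	bool disabled;               /* set once the line is shut off */
};

/* Account one interrupt; returns true once the line has been disabled. */
static bool irq_model_note(struct irq_model *m, bool handled)
{
	if (m->disabled)
		return true;

	if (!handled)
		m->irqs_unhandled++;

	if (++m->irq_count < IRQ_SAMPLE_WINDOW)
		return false;

	/* Window complete: decide, then start a fresh window. */
	if (m->irqs_unhandled > IRQ_UNHANDLED_LIMIT)
		m->disabled = true;

	m->irq_count = 0;
	m->irqs_unhandled = 0;
	return m->disabled;
}
```

In this model a shared line is only shut off when nearly every interrupt in a window goes unclaimed by all registered handlers, which would fit a scenario where the line stays asserted while every sja1000_interrupt instance on it reports nothing to do.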

On 10.12.2013 14:47, Oliver Hartkopp wrote:
> Hey all,
> 
> as I have a similar setup here (Core i7, 5x PEAK cPCI = 20 CAN interfaces) I
> downloaded the linux-image-3.10-0.bpo.3-rt-686-pae kernel including the
> sources from
> 
> 	http://packages.debian.org/de/wheezy-backports/kernel/
> 
> and was able to see Austin's problem with the -rt kernel.
> 
> My interrupt lines are mostly dedicated to the CAN interfaces, so I was able
> to select interrupts (17 & 19) that _only_ deal with sja1000 irq handlers:
> 
>  16:          7          7         10          9   IO-APIC-fasteoi   ehci_hcd:usb1, ahci, can4, can5, can6, can7
>  17:    6328236    6330659    6328557    6330266   IO-APIC-fasteoi   can8, can10, can9
>  18:          0          0          0          0   IO-APIC-fasteoi   can12, can13, can14, can15
>  19:    1446093    1443817    1445833    1444230   IO-APIC-fasteoi   can2, can16, can17, can18, can19, can3, can1, can0
> 
> can0/can2 are linked together (500 kbit/s)
> can1/can3 are linked together (500 kbit/s)
> can9 is linked to a 1Mbit/s CAN traffic source
> 
> All interfaces get a full bus load from the outside.
> Additionally can0 and can1 get a 'cangen -g0 -i <if>' from the local host.
> 
> The funny thing was that one time IRQ #19 got disabled twice(?!?) :
> 
> Message from syslogd@xxxxx at Dec 10 11:25:37 ...
>  kernel:[  967.213174] Disabling IRQ #19
> 
> Message from syslogd@xxxxx at Dec 10 12:06:13 ...
>  kernel:[ 3401.523019] Disabling IRQ #17
> 
> Message from syslogd@xxxxx at Dec 10 12:49:08 ...
>  kernel:[ 5975.113373] Disabling IRQ #19
> 
> Don't know where the last message could come from as the 8 CAN interfaces at
> this interrupt line were already dead for more than an hour.
> 
> The disabling of the interrupt seems to be reproducible - as Austin already
> mentioned after different times.
> 
> My assumption was that we run into a problem with the PITA chip, when
> consuming the interface specific interrupt line in peak_pci_post_irq(), see:
> 
> static void peak_pci_post_irq(const struct sja1000_priv *priv)
> {
>         struct peak_pci_chan *chan = priv->priv;
>         u16 icr;
> 
>         /* Select and clear in PITA stored interrupt */
>         icr = readw(chan->cfg_base + PITA_ICR);
>         if (icr & chan->icr_mask)
>                 writew(chan->icr_mask, chan->cfg_base + PITA_ICR);
> }
> 
> With the writew() only the corresponding SJA1000 line is consumed.
> 
> My quick hack was to clear all bits in the PITA each time:
> 
> --- peak_pci.c~ 2013-09-08 07:10:14.000000000 +0200
> +++ peak_pci.c  2013-12-10 13:26:48.315166478 +0100
> @@ -542,9 +542,13 @@
>         u16 icr;
>  
>         /* Select and clear in PITA stored interrupt */
> +#if 0
>         icr = readw(chan->cfg_base + PITA_ICR);
>         if (icr & chan->icr_mask)
>                 writew(chan->icr_mask, chan->cfg_base + PITA_ICR);
> +#else
> +       writew(0x00C3, chan->cfg_base + PITA_ICR);
> +#endif
>  }
>  
>  static int peak_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> 
> The 0x00C3 comes from OR'ing the values from 
> static const u16 peak_pci_icr_masks[PEAK_PCI_CHAN_MAX]
> 
> I'm currently running the setup for more than one hour without any problems.
> 
> But I assume that this is a really bad hack - and I did not check if any CAN
> frames got lost. Btw. the performance increased from 90% busload to 95%
> busload with that patch when creating only local traffic on the host.
> 
> Any idea how to proceed?
> 
> Regards,
> Oliver
> --
> To unsubscribe from this list: send the line "unsubscribe linux-can" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
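
The 0x00C3 constant in the quoted hack can be checked with a small stand-alone sketch. The four mask values below are what I believe the mainline peak_pci.c contains and should be read as an assumption; the helper names (pita_icr_all_channels, pita_icr_after_write) are hypothetical, and the write-1-to-clear behaviour of PITA_ICR is inferred from the original peak_pci_post_irq() code rather than from a datasheet.

```c
#include <stdint.h>

#define PEAK_PCI_CHAN_MAX 4

/* Per-channel interrupt bits; values assumed from the mainline driver. */
static const uint16_t peak_pci_icr_masks[PEAK_PCI_CHAN_MAX] = {
	0x02, 0x01, 0x40, 0x80
};

/* OR all channel bits together, as described in the quoted mail. */
static uint16_t pita_icr_all_channels(void)
{
	uint16_t all = 0;

	for (int i = 0; i < PEAK_PCI_CHAN_MAX; i++)
		all |= peak_pci_icr_masks[i];
	return all;
}

/*
 * Model of a write-1-to-clear register (an assumption about PITA_ICR):
 * every bit set in 'written' is cleared in the pending set.
 */
static uint16_t pita_icr_after_write(uint16_t pending, uint16_t written)
{
	return pending & (uint16_t)~written;
}
```

Under these assumptions the original code clears only the calling channel's bit and leaves other pending channels flagged, while writing the combined mask clears every channel at once. That may explain why the hack hides the problem, and it is also why lost frames on the other channels cannot be ruled out.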





