From: Stephane Grosjean <s.grosjean@peak-system.com>
To: Andri Yngvason <andri.yngvason@marel.com>, linux-can@vger.kernel.org
Cc: wg@grandegger.com, mkl@pengutronix.de
Subject: Re: peak_pci: TX Frame Loss
Date: Thu, 19 Nov 2015 09:38:17 +0100
Message-ID: <564D8A79.7090605@peak-system.com>
In-Reply-To: <20151118145121.32487.38169@maxwell.marel.net>

Hi Andri,

Could you first send me the output of "sudo lspci -d 1c: -vvv", please?

Regards,
Stéphane

On 18/11/2015 15:51, Andri Yngvason wrote:
> Hi all,
>
> We've been experiencing frame loss on transmission in the peak_pci netdev
> driver.
>
> The frames are not reported as "dropped" by the netlink interface.
>
> We are running CANopen and this manifests sporadically as nodes dropping off the
> network due to failure to answer node guarding RTR and as SDO request timeouts.
>
> Example with can0 and can1 on the same bus where the CANopen master is on can0
> and can1 is set up to listen only:
> (1446688151.783844) can0 701 [1] remote request <- node guarding request on can0
> (1446688151.784296) can0 70A [1] remote request <- another node guarding request
> (1446688151.784304) can1 70A [1] remote request <- only the latter is seen by can1
> (1446688151.784751) can0 720 [1] remote request
> (1446688151.784763) can1 720 [1] remote request
> (1446688151.785793) can1 283 [8] 00 00 00 00 00 00 00 00
> (1446688151.785792) can0 283 [8] 00 00 00 00 00 00 00 00
> (1446688151.786164) can0 70A [1] 85 <-- node guarding response
> (1446688151.786163) can1 70A [1] 85
> (1446688151.786641) can1 720 [1] 85
> (1446688151.786641) can0 720 [1] 85 <-- node guarding response
> (1446688151.787057) can0 721 [1] remote request
> (1446688151.787063) can1 721 [1] remote request
> (1446688151.787728) can0 721 [1] 05
> (1446688151.787733) can1 721 [1] 05
>
> Node 1 never responded because it never received the request.
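
To locate such losses in a longer capture, the comparison done by eye above can be automated. The following is only a minimal sketch, assuming the log is in the candump format quoted above and that every frame on the first interface should also show up on the second one. The names parse_line(), find_missing() and MAX_KEYS are hypothetical helpers made up for this illustration; they are not part of can-utils or the kernel.

```c
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 64 /* assumed upper bound on distinct (ID, type) pairs */

/* Per-(ID, frame type) counters for the two interfaces being compared. */
struct frame_count {
    unsigned int id; /* CAN identifier */
    int rtr;         /* 1 = remote request, 0 = data frame */
    int seen_a;      /* occurrences on the first interface */
    int seen_b;      /* occurrences on the second interface */
};

/* Parse one candump-style line such as
 *   (1446688151.783844) can0 701 [1] remote request
 * Returns 1 on success, 0 if the line does not match. */
static int parse_line(const char *line, char *iface, unsigned int *id, int *rtr)
{
    double ts;
    int len, pos = 0;

    if (sscanf(line, " (%lf) %15s %x [%d] %n", &ts, iface, id, &len, &pos) != 4)
        return 0;
    /* The payload column reads "remote request" for RTR frames. */
    *rtr = strncmp(line + pos, "remote request", 14) == 0;
    return 1;
}

/* Collect the IDs that appear more often on if_a than on if_b.
 * Returns how many such IDs were stored in missing[]. */
static int find_missing(const char *const *lines, int n_lines,
                        const char *if_a, const char *if_b,
                        unsigned int *missing, int max_missing)
{
    struct frame_count tab[MAX_KEYS];
    int n_keys = 0, n_missing = 0;

    for (int i = 0; i < n_lines; i++) {
        char iface[16];
        unsigned int id;
        int rtr, k;

        if (!parse_line(lines[i], iface, &id, &rtr))
            continue;
        /* Look up the (ID, type) pair, adding it if it is new. */
        for (k = 0; k < n_keys; k++)
            if (tab[k].id == id && tab[k].rtr == rtr)
                break;
        if (k == n_keys) {
            if (n_keys == MAX_KEYS)
                continue; /* table full: ignore further new keys */
            tab[k] = (struct frame_count){ .id = id, .rtr = rtr };
            n_keys++;
        }
        if (strcmp(iface, if_a) == 0)
            tab[k].seen_a++;
        else if (strcmp(iface, if_b) == 0)
            tab[k].seen_b++;
    }
    for (int k = 0; k < n_keys && n_missing < max_missing; k++)
        if (tab[k].seen_a > tab[k].seen_b)
            missing[n_missing++] = tab[k].id;
    return n_missing;
}
```

Called with "can0" and "can1" on the lines of the log above, this should flag only ID 0x701. It counts frames per ID and ignores timestamps, so it cannot tell a lost frame from one that is merely delayed past the end of the capture.
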
>
> The node guarding requests are sent in bursts where lower ids appear before
> higher ids. A curious observation is that it's always the lowest id that drops
> out first. I.e. the first frame in a burst of frames is the one that's lost.
>
> Another interesting thing that we've found out is that if we turn off SMP on the
> system, the problem disappears. But obviously we don't want to disable SMP in a
> production system. ;)
> It helps to set the CPU affinity of all threads and processes that touch the CAN
> bus to a single core, but sadly it doesn't eliminate the problem.
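
For reference, pinning a thread to one core from user space could look like the sketch below. It only illustrates the affinity workaround described above and is not the setup actually used on the affected systems; pin_to_cpu() is a hypothetical helper name, while sched_setaffinity() and the CPU_* macros are the standard glibc interfaces.

```c
#define _GNU_SOURCE /* needed for cpu_set_t and the CPU_* macros */
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure with errno set
 * by sched_setaffinity(). */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Each thread that touches the CAN sockets would have to call this with the same CPU number. For a whole process, starting it under "taskset -c 0" should have the same effect without code changes.
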
>
> Our systems are running on kernel version 3.14.3 with the rt patch. I tried
> running 4.1.12-rt13 but that did not eliminate the problem. We also tried
> running with the pcan netdev driver from PEAK, which does in fact run without
> frame loss. Thus, this is probably an issue with either peak_pci or sja1000.
>
> I tried poking around in sja1000.c. I noticed that sja1000_start_xmit() is not
> guarded against trying to transmit when the tx buffer is occupied, so I added a
> check and a print-out:
> diff --git a/drivers/net/can/sja1000/sja1000.c b/drivers/net/can/sja1000/sja1000.c
> index 32bd7f4..adc49db 100644
> --- a/drivers/net/can/sja1000/sja1000.c
> +++ b/drivers/net/can/sja1000/sja1000.c
> @@ -292,6 +292,11 @@ static netdev_tx_t sja1000_start_xmit(struct sk_buff *skb,
>
>  	netif_stop_queue(dev);
>
> +	if (!(priv->read_reg(priv, SJA1000_SR) & SR_TBS)) {
> +		netdev_err(dev, "BUG!, TX FIFO full when queue awake!\n");
> +		return NETDEV_TX_BUSY;
> +	}
> +
>  	fi = dlc = cf->can_dlc;
>  	id = cf->can_id;
>
> There was no error message in dmesg after frame loss, so that's not the problem.
>
> The CPU is an Intel i7-4700EQ and the CAN interface is a Peak PCIe dual channel.
>
> Does anyone have an idea what might be wrong? :)
>
> Best regards,
> Andri
--
PEAK-System Technik GmbH
Sitz der Gesellschaft Darmstadt
Handelsregister Darmstadt HRB 9183
Geschaeftsfuehrung: Alexander Gach, Uwe Wilhelm
--