From: Wolfgang Grandegger <wg@grandegger.com>
To: Amr Bekhit <amrbekhit@gmail.com>
Cc: Alexander Stein <alexander.stein@systec-electronic.com>,
mkl@pengutronix.de, linux-can@vger.kernel.org
Subject: Re: Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages
Date: Fri, 3 Jun 2016 09:22:32 +0200 [thread overview]
Message-ID: <57513038.2080107@grandegger.com> (raw)
In-Reply-To: <CAOLz05qKWrDK2f6tg9wyY6HsQp1m=jvN+KF4v5z2u0xUakSQCw@mail.gmail.com>
Hello Amr,
I'm resending this message because it did not show up on the linux-can
mailing list archive...
Am 01.06.2016 um 15:21 schrieb Amr Bekhit:
> Hi Wolfgang and Alexander,
>
> @Wolfgang: using the patch you sent to me, I ran the test twice until
> the unit stopped responding to messages. After taking the can
> interface down, here is the output from the console for both tests:
>
> # ifconfig can0 down
> at91_can f8004000.can can0: reg_sr=1
> at91_can f8004000.can can0: tx_next=0
> at91_can f8004000.can can0: tx_echo=0
> at91_can f8004000.can can0: rx_next=6
>
> # ifconfig can0 down
> at91_can f8004000.can can0: reg_sr=1
> at91_can f8004000.can can0: tx_next=8042
> at91_can f8004000.can can0: tx_echo=8042
> at91_can f8004000.can can0: rx_next=6
Trying to understand why RX stopped: at91_poll() entered with all RX
message boxes filled (reg_sr=1, rx_next=6). Because "quota" is exceeded,
the following if block is not executed:
http://lxr.free-electrons.com/source/drivers/net/can/at91_can.c#L713
At the next entrance of at91_poll(), at91_poll_rx() is *not* called,
because reg_sr is 0 and the RX MB interrupts are not re-enabled, because
rx_next is still 6. The RX interrupts stay *disabled*.
If I'm not wrong, the following patch should fix that problem:
diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c
index 945c095..c9f36a4 100644
--- a/drivers/net/can/at91_can.c
+++ b/drivers/net/can/at91_can.c
@@ -733,9 +733,10 @@ static int at91_poll_rx(struct net_device *dev, int
quota)
/* upper group completed, look again in lower */
if (priv->rx_next > get_mb_rx_low_last(priv) &&
- quota > 0 && mb > get_mb_rx_last(priv)) {
+ mb > get_mb_rx_last(priv)) {
priv->rx_next = get_mb_rx_first(priv);
- goto again;
+ if (quota > 0)
+ goto again;
}
return received;
Could you give this patch a try, please.
> I've also tried out the patch suggested by Alexander and that seems to
> work fine - I was unable to get the CAN device to lock up after
> running it for over a day continuously (test repeated twice). As I
> understood it, the aim of the patch was to get the messages out of the
> CAN peripheral immediately during the interrupt and store them in a
> kfifo for later processing. From my testing, this does appear to have
> solved the problem (or severely reduced the probability of it
> happening).
The existing driver may loose messages due to latency, but it should not
stop working.
Wolfgang.
> On 3 May 2016 at 09:27, Amr Bekhit <amrbekhit@gmail.com> wrote:
>> Hi Wolfgang and Alexander,
>>
>> Thanks for both of your responses.
>>
>>
>> @Alexander: Thanks for pointing out the patch.
>>
>> @Wolfgang: In response to your earlier request, I've uploaded my dts
>> file to pastebin, which can find at http://pastebin.com/tNp2PnW4. I'll
>> give the patch mentioned by Alexander and your one a try and let you
>> know how it goes.
>>
>> Amr
>>
>> On 2 May 2016 at 14:53, Wolfgang Grandegger <wg@grandegger.com> wrote:
>>> Hello Alexander,
>>>
>>> Am 02.05.2016 um 08:23 schrieb Alexander Stein:
>>>>
>>>> On Tuesday 05 April 2016 14:10:48, Amr Bekhit wrote:
>>>>>
>>>>> I working on a board based on the AT91SAM9X25 SoC and I'm using
>>>>> integrated CAN peripheral. I seem to have run into an issue whereby
>>>>> sending lots of messages very rapidly in quick succession causes the
>>>>> CAN peripheral to then stop receiving any messages at all. The only
>>>>> way to bring it back to a functional state is to bring the network
>>>>> interface down and then back up again.
>>>>> [...]
>>>>> I then start sending CAN messages to the unit using a PCAN-USB adapter
>>>>> that is plugged into a test Linux PC. After bringing up the CAN
>>>>> interface on the test PC, messages can be continuously sent using the
>>>>> following bash script:
>>>>> [...]
>>>>> I then leave the system running for some time (1.5 hours typically,
>>>>> may vary), periodically running ifconfig can0 to check to see if new
>>>>> packets are being received. After a while, the can interface will stop
>>>>> receiving new packets, even though the test PC is still transmitting
>>>>> them. Stopping and restarting the CAN transmissions on the test PC
>>>>> does not solve the problem. The interface does not appear to be in the
>>>>> bus off state, as shown by running the following:
>>>>
>>>>
>>>> That sounds a bit like my getting stuck problem in
>>>> http://linux-can.vger.kernel.narkive.com/bBQqK84G/resend-patch-net-can-at91-can-c-decrease-likelyhood-of-rx-overruns#post2
>>>>
>>>> The patch post1 at least keeps the driver working. Although I don't know
>>>> what
>>>> has changed in at91_can meanwhile.
>>>
>>>
>>> Thanks for pointing me to that patch. It still applies to Linux 4.1 with
>>> some minor fixes. Amr, could you please give it a try. Please let me know if
>>> you need help.
>>>
>>> Anyway, I think the driver should not hang even in case of overflows. I will
>>> have a closer look later this week.
>>>
>>> Wolfgang.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-can" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
next prev parent reply other threads:[~2016-06-03 7:22 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-05 13:10 Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages Amr Bekhit
2016-04-08 7:39 ` Wolfgang Grandegger
2016-04-29 8:04 ` Amr Bekhit
[not found] ` <CAOLz05oo=EGqvmCaXXBhXs5McMmJDPKCzuiij7Pv22fj5hPB_g@mail.gmail.com>
2016-04-29 8:15 ` Amr Bekhit
2016-04-29 11:18 ` Wolfgang Grandegger
2016-04-29 11:29 ` Amr Bekhit
2016-04-29 14:27 ` Wolfgang Grandegger
2016-04-30 13:34 ` Wolfgang Grandegger
2016-05-02 6:23 ` Alexander Stein
2016-05-02 13:53 ` Wolfgang Grandegger
2016-05-03 8:27 ` Amr Bekhit
2016-06-01 13:21 ` Amr Bekhit
2016-06-03 7:22 ` Wolfgang Grandegger [this message]
2016-06-08 8:17 ` Amr Bekhit
2016-06-08 8:37 ` Wolfgang Grandegger
2016-06-10 13:34 ` Amr Bekhit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57513038.2080107@grandegger.com \
--to=wg@grandegger.com \
--cc=alexander.stein@systec-electronic.com \
--cc=amrbekhit@gmail.com \
--cc=linux-can@vger.kernel.org \
--cc=mkl@pengutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox