All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wolfgang Grandegger <wg@grandegger.com>
To: Amr Bekhit <amrbekhit@gmail.com>
Cc: Alexander Stein <alexander.stein@systec-electronic.com>,
	mkl@pengutronix.de, linux-can@vger.kernel.org
Subject: Re: Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages
Date: Fri, 3 Jun 2016 09:22:32 +0200	[thread overview]
Message-ID: <57513038.2080107@grandegger.com> (raw)
In-Reply-To: <CAOLz05qKWrDK2f6tg9wyY6HsQp1m=jvN+KF4v5z2u0xUakSQCw@mail.gmail.com>

Hello Amr,

I'm resending this message because it did not show up on the linux-can 
mailing list archive...

Am 01.06.2016 um 15:21 schrieb Amr Bekhit:
> Hi Wolfgang and Alexander,
>
> @Wolfgang: using the patch you sent to me, I ran the test twice until
> the unit stopped responding to messages. After taking the can
> interface down, here is the output from the console for both tests:
>
> # ifconfig can0 down
> at91_can f8004000.can can0: reg_sr=1
> at91_can f8004000.can can0: tx_next=0
> at91_can f8004000.can can0: tx_echo=0
> at91_can f8004000.can can0: rx_next=6
>
> # ifconfig can0 down
> at91_can f8004000.can can0: reg_sr=1
> at91_can f8004000.can can0: tx_next=8042
> at91_can f8004000.can can0: tx_echo=8042
> at91_can f8004000.can can0: rx_next=6

Trying to understand why RX stopped: at91_poll() entered with all RX 
message boxes filled (reg_sr=1, rx_next=6). Because "quota" is exceeded, 
the following if block is not executed:

http://lxr.free-electrons.com/source/drivers/net/can/at91_can.c#L713

At the next entrance of at91_poll(), at91_poll_rx() is *not* called, 
because reg_sr is 0 and the RX MB interrupts are not re-enabled, because 
rx_next is still 6. The RX interrupts stay *disabled*.

If I'm not wrong, the following patch should fix that problem:

diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c
index 945c095..c9f36a4 100644
--- a/drivers/net/can/at91_can.c
+++ b/drivers/net/can/at91_can.c
@@ -733,9 +733,10 @@ static int at91_poll_rx(struct net_device *dev, int 
quota)

         /* upper group completed, look again in lower */
         if (priv->rx_next > get_mb_rx_low_last(priv) &&
-           quota > 0 && mb > get_mb_rx_last(priv)) {
+           mb > get_mb_rx_last(priv)) {
                 priv->rx_next = get_mb_rx_first(priv);
-               goto again;
+               if (quota > 0)
+                       goto again;
         }

         return received;

Could you give this patch a try, please.

> I've also tried out the patch suggested by Alexander and that seems to
> work fine - I was unable to get the CAN device to lock up after
> running it for over a day continuously (test repeated twice). As I
> understood it, the aim of the patch was to get the messages out of the
> CAN peripheral immediately during the interrupt and store them in a
> kfifo for later processing. From my testing, this does appear to have
> solved the problem (or severely reduced the probability of it
> happening).

The existing driver may loose messages due to latency, but it should not 
stop working.

Wolfgang.

> On 3 May 2016 at 09:27, Amr Bekhit <amrbekhit@gmail.com> wrote:
>> Hi Wolfgang and Alexander,
>>
>> Thanks for both of your responses.
>>
>>
>> @Alexander: Thanks for pointing out the patch.
>>
>> @Wolfgang: In response to your earlier request, I've uploaded my dts
>> file to pastebin, which can find at http://pastebin.com/tNp2PnW4. I'll
>> give the patch mentioned by Alexander and your one a try and let you
>> know how it goes.
>>
>> Amr
>>
>> On 2 May 2016 at 14:53, Wolfgang Grandegger <wg@grandegger.com> wrote:
>>> Hello Alexander,
>>>
>>> Am 02.05.2016 um 08:23 schrieb Alexander Stein:
>>>>
>>>> On Tuesday 05 April 2016 14:10:48, Amr Bekhit wrote:
>>>>>
>>>>> I working on a board based on the AT91SAM9X25 SoC and I'm using
>>>>> integrated CAN peripheral. I seem to have run into an issue whereby
>>>>> sending lots of messages very rapidly in quick succession causes the
>>>>> CAN peripheral to then stop receiving any messages at all. The only
>>>>> way to bring it back to a functional state is to bring the network
>>>>> interface down and then back up again.
>>>>> [...]
>>>>> I then start sending CAN messages to the unit using a PCAN-USB adapter
>>>>> that is plugged into a test Linux PC. After bringing up the CAN
>>>>> interface on the test PC, messages can be continuously sent using the
>>>>> following bash script:
>>>>> [...]
>>>>> I then leave the system running for some time (1.5 hours typically,
>>>>> may vary), periodically running ifconfig can0 to check to see if new
>>>>> packets are being received. After a while, the can interface will stop
>>>>> receiving new packets, even though the test PC is still transmitting
>>>>> them. Stopping and restarting the CAN transmissions on the test PC
>>>>> does not solve the problem. The interface does not appear to be in the
>>>>> bus off state, as shown by running the following:
>>>>
>>>>
>>>> That sounds a bit like my getting stuck problem in
>>>> http://linux-can.vger.kernel.narkive.com/bBQqK84G/resend-patch-net-can-at91-can-c-decrease-likelyhood-of-rx-overruns#post2
>>>>
>>>> The patch post1 at least keeps the driver working. Although I don't know
>>>> what
>>>> has changed in at91_can meanwhile.
>>>
>>>
>>> Thanks for pointing me to that patch. It still applies to Linux 4.1 with
>>> some minor fixes. Amr, could you please give it a try. Please let me know if
>>> you need help.
>>>
>>> Anyway, I think the driver should not hang even in case of overflows. I will
>>> have a closer look later this week.
>>>
>>> Wolfgang.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-can" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>

  reply	other threads:[~2016-06-03  7:22 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-05 13:10 Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages Amr Bekhit
2016-04-08  7:39 ` Wolfgang Grandegger
2016-04-29  8:04   ` Amr Bekhit
     [not found]     ` <CAOLz05oo=EGqvmCaXXBhXs5McMmJDPKCzuiij7Pv22fj5hPB_g@mail.gmail.com>
2016-04-29  8:15       ` Amr Bekhit
2016-04-29 11:18     ` Wolfgang Grandegger
2016-04-29 11:29       ` Amr Bekhit
2016-04-29 14:27         ` Wolfgang Grandegger
2016-04-30 13:34         ` Wolfgang Grandegger
2016-05-02  6:23 ` Alexander Stein
2016-05-02 13:53   ` Wolfgang Grandegger
2016-05-03  8:27     ` Amr Bekhit
2016-06-01 13:21       ` Amr Bekhit
2016-06-03  7:22         ` Wolfgang Grandegger [this message]
2016-06-08  8:17           ` Amr Bekhit
2016-06-08  8:37             ` Wolfgang Grandegger
2016-06-10 13:34               ` Amr Bekhit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57513038.2080107@grandegger.com \
    --to=wg@grandegger.com \
    --cc=alexander.stein@systec-electronic.com \
    --cc=amrbekhit@gmail.com \
    --cc=linux-can@vger.kernel.org \
    --cc=mkl@pengutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.