Linux CAN drivers development
 help / color / mirror / Atom feed
From: Wolfgang Grandegger <wg@grandegger.com>
To: Amr Bekhit <amrbekhit@gmail.com>
Cc: Alexander Stein <alexander.stein@systec-electronic.com>,
	mkl@pengutronix.de, linux-can@vger.kernel.org
Subject: Re: Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages
Date: Wed, 8 Jun 2016 10:37:30 +0200	[thread overview]
Message-ID: <5757D94A.4090802@grandegger.com> (raw)
In-Reply-To: <CAOLz05oGGuXA+R=rFkk1uP0fpqPdq4JKx3L_-cowpj1h2gAC6A@mail.gmail.com>

Hello Amr,

Am 08.06.2016 um 10:17 schrieb Amr Bekhit:
> Hi Wolfgang,
>
> I've implemented the patch that you suggested and have been testing
> the unit for almost 48 hours, sending CAN messages continuously and
> the unit has continued to operate fine with no problems whatsoever.
> So, it appears that your patch has fixed the problem. Thanks!

Good news. I'm going to prepare a patch for mainline inclusion. Can I 
add your "Tested-by: Amr Bekhit <amrbekhit@gmail.com>" ?

BTW, did you realize message overflows (with "ip -d -s link show") or 
related out-of-order messages in the kernel log ("dmesg").

Thanks,

Wolfgang.

> Amr
>
> On 3 June 2016 at 08:22, Wolfgang Grandegger <wg@grandegger.com> wrote:
>> Hello Amr,
>>
>> I'm resending this message because it did not show up on the linux-can
>> mailing list archive...
>>
>> Am 01.06.2016 um 15:21 schrieb Amr Bekhit:
>>>
>>> Hi Wolfgang and Alexander,
>>>
>>> @Wolfgang: using the patch you sent to me, I ran the test twice until
>>> the unit stopped responding to messages. After taking the can
>>> interface down, here is the output from the console for both tests:
>>>
>>> # ifconfig can0 down
>>> at91_can f8004000.can can0: reg_sr=1
>>> at91_can f8004000.can can0: tx_next=0
>>> at91_can f8004000.can can0: tx_echo=0
>>> at91_can f8004000.can can0: rx_next=6
>>>
>>> # ifconfig can0 down
>>> at91_can f8004000.can can0: reg_sr=1
>>> at91_can f8004000.can can0: tx_next=8042
>>> at91_can f8004000.can can0: tx_echo=8042
>>> at91_can f8004000.can can0: rx_next=6
>>
>>
>> Trying to understand why RX stopped: at91_poll() entered with all RX message
>> boxes filled (reg_sr=1, rx_next=6). Because "quota" is exceeded, the
>> following if block is not executed:
>>
>> http://lxr.free-electrons.com/source/drivers/net/can/at91_can.c#L713
>>
>> At the next entrance of at91_poll(), at91_poll_rx() is *not* called, because
>> reg_sr is 0 and the RX MB interrupts are not re-enabled, because rx_next is
>> still 6. The RX interrupts stay *disabled*.
>>
>> If I'm not wrong, the following patch should fix that problem:
>>
>> diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c
>> index 945c095..c9f36a4 100644
>> --- a/drivers/net/can/at91_can.c
>> +++ b/drivers/net/can/at91_can.c
>> @@ -733,9 +733,10 @@ static int at91_poll_rx(struct net_device *dev, int
>> quota)
>>
>>          /* upper group completed, look again in lower */
>>          if (priv->rx_next > get_mb_rx_low_last(priv) &&
>> -           quota > 0 && mb > get_mb_rx_last(priv)) {
>> +           mb > get_mb_rx_last(priv)) {
>>                  priv->rx_next = get_mb_rx_first(priv);
>> -               goto again;
>> +               if (quota > 0)
>> +                       goto again;
>>          }
>>
>>          return received;
>>
>> Could you give this patch a try, please.
>>
>>> I've also tried out the patch suggested by Alexander and that seems to
>>> work fine - I was unable to get the CAN device to lock up after
>>> running it for over a day continuously (test repeated twice). As I
>>> understood it, the aim of the patch was to get the messages out of the
>>> CAN peripheral immediately during the interrupt and store them in a
>>> kfifo for later processing. From my testing, this does appear to have
>>> solved the problem (or severely reduced the probability of it
>>> happening).
>>
>>
>> The existing driver may loose messages due to latency, but it should not
>> stop working.
>>
>> Wolfgang.
>>
>>> On 3 May 2016 at 09:27, Amr Bekhit <amrbekhit@gmail.com> wrote:
>>>>
>>>> Hi Wolfgang and Alexander,
>>>>
>>>> Thanks for both of your responses.
>>>>
>>>>
>>>> @Alexander: Thanks for pointing out the patch.
>>>>
>>>> @Wolfgang: In response to your earlier request, I've uploaded my dts
>>>> file to pastebin, which can find at http://pastebin.com/tNp2PnW4. I'll
>>>> give the patch mentioned by Alexander and your one a try and let you
>>>> know how it goes.
>>>>
>>>> Amr
>>>>
>>>> On 2 May 2016 at 14:53, Wolfgang Grandegger <wg@grandegger.com> wrote:
>>>>>
>>>>> Hello Alexander,
>>>>>
>>>>> Am 02.05.2016 um 08:23 schrieb Alexander Stein:
>>>>>>
>>>>>>
>>>>>> On Tuesday 05 April 2016 14:10:48, Amr Bekhit wrote:
>>>>>>>
>>>>>>>
>>>>>>> I working on a board based on the AT91SAM9X25 SoC and I'm using
>>>>>>> integrated CAN peripheral. I seem to have run into an issue whereby
>>>>>>> sending lots of messages very rapidly in quick succession causes the
>>>>>>> CAN peripheral to then stop receiving any messages at all. The only
>>>>>>> way to bring it back to a functional state is to bring the network
>>>>>>> interface down and then back up again.
>>>>>>> [...]
>>>>>>> I then start sending CAN messages to the unit using a PCAN-USB adapter
>>>>>>> that is plugged into a test Linux PC. After bringing up the CAN
>>>>>>> interface on the test PC, messages can be continuously sent using the
>>>>>>> following bash script:
>>>>>>> [...]
>>>>>>> I then leave the system running for some time (1.5 hours typically,
>>>>>>> may vary), periodically running ifconfig can0 to check to see if new
>>>>>>> packets are being received. After a while, the can interface will stop
>>>>>>> receiving new packets, even though the test PC is still transmitting
>>>>>>> them. Stopping and restarting the CAN transmissions on the test PC
>>>>>>> does not solve the problem. The interface does not appear to be in the
>>>>>>> bus off state, as shown by running the following:
>>>>>>
>>>>>>
>>>>>>
>>>>>> That sounds a bit like my getting stuck problem in
>>>>>>
>>>>>> http://linux-can.vger.kernel.narkive.com/bBQqK84G/resend-patch-net-can-at91-can-c-decrease-likelyhood-of-rx-overruns#post2
>>>>>>
>>>>>> The patch post1 at least keeps the driver working. Although I don't
>>>>>> know
>>>>>> what
>>>>>> has changed in at91_can meanwhile.
>>>>>
>>>>>
>>>>>
>>>>> Thanks for pointing me to that patch. It still applies to Linux 4.1 with
>>>>> some minor fixes. Amr, could you please give it a try. Please let me
>>>>> know if
>>>>> you need help.
>>>>>
>>>>> Anyway, I think the driver should not hang even in case of overflows. I
>>>>> will
>>>>> have a closer look later this week.
>>>>>
>>>>> Wolfgang.
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-can" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>
>
>

  reply	other threads:[~2016-06-08  8:37 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-05 13:10 Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages Amr Bekhit
2016-04-08  7:39 ` Wolfgang Grandegger
2016-04-29  8:04   ` Amr Bekhit
     [not found]     ` <CAOLz05oo=EGqvmCaXXBhXs5McMmJDPKCzuiij7Pv22fj5hPB_g@mail.gmail.com>
2016-04-29  8:15       ` Amr Bekhit
2016-04-29 11:18     ` Wolfgang Grandegger
2016-04-29 11:29       ` Amr Bekhit
2016-04-29 14:27         ` Wolfgang Grandegger
2016-04-30 13:34         ` Wolfgang Grandegger
2016-05-02  6:23 ` Alexander Stein
2016-05-02 13:53   ` Wolfgang Grandegger
2016-05-03  8:27     ` Amr Bekhit
2016-06-01 13:21       ` Amr Bekhit
2016-06-03  7:22         ` Wolfgang Grandegger
2016-06-08  8:17           ` Amr Bekhit
2016-06-08  8:37             ` Wolfgang Grandegger [this message]
2016-06-10 13:34               ` Amr Bekhit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5757D94A.4090802@grandegger.com \
    --to=wg@grandegger.com \
    --cc=alexander.stein@systec-electronic.com \
    --cc=amrbekhit@gmail.com \
    --cc=linux-can@vger.kernel.org \
    --cc=mkl@pengutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox