From: Wolfgang Grandegger <wg@grandegger.com>
To: Amr Bekhit <amrbekhit@gmail.com>
Cc: Alexander Stein <alexander.stein@systec-electronic.com>,
mkl@pengutronix.de, linux-can@vger.kernel.org
Subject: Re: Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages
Date: Wed, 8 Jun 2016 10:37:30 +0200 [thread overview]
Message-ID: <5757D94A.4090802@grandegger.com> (raw)
In-Reply-To: <CAOLz05oGGuXA+R=rFkk1uP0fpqPdq4JKx3L_-cowpj1h2gAC6A@mail.gmail.com>
Hello Amr,
Am 08.06.2016 um 10:17 schrieb Amr Bekhit:
> Hi Wolfgang,
>
> I've implemented the patch that you suggested and have been testing
> the unit for almost 48 hours, sending CAN messages continuously and
> the unit has continued to operate fine with no problems whatsoever.
> So, it appears that your patch has fixed the problem. Thanks!
Good news. I'm going to prepare a patch for mainline inclusion. Can I
add your "Tested-by: Amr Bekhit <amrbekhit@gmail.com>" ?
BTW, did you realize message overflows (with "ip -d -s link show") or
related out-of-order messages in the kernel log ("dmesg").
Thanks,
Wolfgang.
> Amr
>
> On 3 June 2016 at 08:22, Wolfgang Grandegger <wg@grandegger.com> wrote:
>> Hello Amr,
>>
>> I'm resending this message because it did not show up on the linux-can
>> mailing list archive...
>>
>> Am 01.06.2016 um 15:21 schrieb Amr Bekhit:
>>>
>>> Hi Wolfgang and Alexander,
>>>
>>> @Wolfgang: using the patch you sent to me, I ran the test twice until
>>> the unit stopped responding to messages. After taking the can
>>> interface down, here is the output from the console for both tests:
>>>
>>> # ifconfig can0 down
>>> at91_can f8004000.can can0: reg_sr=1
>>> at91_can f8004000.can can0: tx_next=0
>>> at91_can f8004000.can can0: tx_echo=0
>>> at91_can f8004000.can can0: rx_next=6
>>>
>>> # ifconfig can0 down
>>> at91_can f8004000.can can0: reg_sr=1
>>> at91_can f8004000.can can0: tx_next=8042
>>> at91_can f8004000.can can0: tx_echo=8042
>>> at91_can f8004000.can can0: rx_next=6
>>
>>
>> Trying to understand why RX stopped: at91_poll() entered with all RX message
>> boxes filled (reg_sr=1, rx_next=6). Because "quota" is exceeded, the
>> following if block is not executed:
>>
>> http://lxr.free-electrons.com/source/drivers/net/can/at91_can.c#L713
>>
>> At the next entrance of at91_poll(), at91_poll_rx() is *not* called, because
>> reg_sr is 0 and the RX MB interrupts are not re-enabled, because rx_next is
>> still 6. The RX interrupts stay *disabled*.
>>
>> If I'm not wrong, the following patch should fix that problem:
>>
>> diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c
>> index 945c095..c9f36a4 100644
>> --- a/drivers/net/can/at91_can.c
>> +++ b/drivers/net/can/at91_can.c
>> @@ -733,9 +733,10 @@ static int at91_poll_rx(struct net_device *dev, int
>> quota)
>>
>> /* upper group completed, look again in lower */
>> if (priv->rx_next > get_mb_rx_low_last(priv) &&
>> - quota > 0 && mb > get_mb_rx_last(priv)) {
>> + mb > get_mb_rx_last(priv)) {
>> priv->rx_next = get_mb_rx_first(priv);
>> - goto again;
>> + if (quota > 0)
>> + goto again;
>> }
>>
>> return received;
>>
>> Could you give this patch a try, please.
>>
>>> I've also tried out the patch suggested by Alexander and that seems to
>>> work fine - I was unable to get the CAN device to lock up after
>>> running it for over a day continuously (test repeated twice). As I
>>> understood it, the aim of the patch was to get the messages out of the
>>> CAN peripheral immediately during the interrupt and store them in a
>>> kfifo for later processing. From my testing, this does appear to have
>>> solved the problem (or severely reduced the probability of it
>>> happening).
>>
>>
>> The existing driver may loose messages due to latency, but it should not
>> stop working.
>>
>> Wolfgang.
>>
>>> On 3 May 2016 at 09:27, Amr Bekhit <amrbekhit@gmail.com> wrote:
>>>>
>>>> Hi Wolfgang and Alexander,
>>>>
>>>> Thanks for both of your responses.
>>>>
>>>>
>>>> @Alexander: Thanks for pointing out the patch.
>>>>
>>>> @Wolfgang: In response to your earlier request, I've uploaded my dts
>>>> file to pastebin, which can find at http://pastebin.com/tNp2PnW4. I'll
>>>> give the patch mentioned by Alexander and your one a try and let you
>>>> know how it goes.
>>>>
>>>> Amr
>>>>
>>>> On 2 May 2016 at 14:53, Wolfgang Grandegger <wg@grandegger.com> wrote:
>>>>>
>>>>> Hello Alexander,
>>>>>
>>>>> Am 02.05.2016 um 08:23 schrieb Alexander Stein:
>>>>>>
>>>>>>
>>>>>> On Tuesday 05 April 2016 14:10:48, Amr Bekhit wrote:
>>>>>>>
>>>>>>>
>>>>>>> I working on a board based on the AT91SAM9X25 SoC and I'm using
>>>>>>> integrated CAN peripheral. I seem to have run into an issue whereby
>>>>>>> sending lots of messages very rapidly in quick succession causes the
>>>>>>> CAN peripheral to then stop receiving any messages at all. The only
>>>>>>> way to bring it back to a functional state is to bring the network
>>>>>>> interface down and then back up again.
>>>>>>> [...]
>>>>>>> I then start sending CAN messages to the unit using a PCAN-USB adapter
>>>>>>> that is plugged into a test Linux PC. After bringing up the CAN
>>>>>>> interface on the test PC, messages can be continuously sent using the
>>>>>>> following bash script:
>>>>>>> [...]
>>>>>>> I then leave the system running for some time (1.5 hours typically,
>>>>>>> may vary), periodically running ifconfig can0 to check to see if new
>>>>>>> packets are being received. After a while, the can interface will stop
>>>>>>> receiving new packets, even though the test PC is still transmitting
>>>>>>> them. Stopping and restarting the CAN transmissions on the test PC
>>>>>>> does not solve the problem. The interface does not appear to be in the
>>>>>>> bus off state, as shown by running the following:
>>>>>>
>>>>>>
>>>>>>
>>>>>> That sounds a bit like my getting stuck problem in
>>>>>>
>>>>>> http://linux-can.vger.kernel.narkive.com/bBQqK84G/resend-patch-net-can-at91-can-c-decrease-likelyhood-of-rx-overruns#post2
>>>>>>
>>>>>> The patch post1 at least keeps the driver working. Although I don't
>>>>>> know
>>>>>> what
>>>>>> has changed in at91_can meanwhile.
>>>>>
>>>>>
>>>>>
>>>>> Thanks for pointing me to that patch. It still applies to Linux 4.1 with
>>>>> some minor fixes. Amr, could you please give it a try. Please let me
>>>>> know if
>>>>> you need help.
>>>>>
>>>>> Anyway, I think the driver should not hang even in case of overflows. I
>>>>> will
>>>>> have a closer look later this week.
>>>>>
>>>>> Wolfgang.
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-can" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>
>
>
next prev parent reply other threads:[~2016-06-08 8:37 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-05 13:10 Flooding AT91_CAN peripheral with messages causes it to stop receiving any more messages Amr Bekhit
2016-04-08 7:39 ` Wolfgang Grandegger
2016-04-29 8:04 ` Amr Bekhit
[not found] ` <CAOLz05oo=EGqvmCaXXBhXs5McMmJDPKCzuiij7Pv22fj5hPB_g@mail.gmail.com>
2016-04-29 8:15 ` Amr Bekhit
2016-04-29 11:18 ` Wolfgang Grandegger
2016-04-29 11:29 ` Amr Bekhit
2016-04-29 14:27 ` Wolfgang Grandegger
2016-04-30 13:34 ` Wolfgang Grandegger
2016-05-02 6:23 ` Alexander Stein
2016-05-02 13:53 ` Wolfgang Grandegger
2016-05-03 8:27 ` Amr Bekhit
2016-06-01 13:21 ` Amr Bekhit
2016-06-03 7:22 ` Wolfgang Grandegger
2016-06-08 8:17 ` Amr Bekhit
2016-06-08 8:37 ` Wolfgang Grandegger [this message]
2016-06-10 13:34 ` Amr Bekhit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5757D94A.4090802@grandegger.com \
--to=wg@grandegger.com \
--cc=alexander.stein@systec-electronic.com \
--cc=amrbekhit@gmail.com \
--cc=linux-can@vger.kernel.org \
--cc=mkl@pengutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox