From: Marc Kleine-Budde <mkl@pengutronix.de>
To: AnilKumar Ch <anilkumar@ti.com>
Cc: wg@grandegger.com, swarren@wwwdotorg.org,
linux-can@vger.kernel.org, anantgole@ti.com, nsekhar@ti.com
Subject: Re: [PATCH] can: ti_hecc: fix rx wrong sequence issue
Date: Mon, 05 Nov 2012 12:34:55 +0100 [thread overview]
Message-ID: <5097A45F.3000606@pengutronix.de> (raw)
In-Reply-To: <1349678436-747-1-git-send-email-anilkumar@ti.com>
[-- Attachment #1: Type: text/plain, Size: 5521 bytes --]
Hello,
On 10/08/2012 08:40 AM, AnilKumar Ch wrote:
> Fix "received wrong sequence count. expected: x, got: y" errors
> reported by cansequence in a stress test (--verbose).
>
> While processing the receive packets in mailboxes, upper mailboxes
> need to enable while processing 12th (RX_BUFFER mailbox) mailbox.
> This requires a status check to identify whether the receiving of
> packets in progress or not. If the mailboxes are enabled before the
> receive packet status is cleared then there is a possibility of
> receiving out of order packets.
>
> With this patch mailboxes are enabled once the receive status is
> cleared.
>
> Signed-off-by: AnilKumar Ch <anilkumar@ti.com>
AnilKumar Ch, can you add some TI people with access to the vhdl CAN
core design to Cc?
I'm debugging this issue for a week, fixed several other problems with
the driver and now and it seems there is a race condition in the hardware.
The algorithm in the driver relies on the ability to re-enable a bunch
of mailboxes with a single write command. In the ti_hecc hardware the
driver uses the CANME (mailbox enable) register to do so. Under certain
circumstances I've noticed that the CAN core receives the next message
into an undefined mailbox. According to the data sheet CAN messages are
always received into the highest free mailbox (with a matching filter).
But after enabling the mailboxes the next CAN frame doesn't go into the
highest mailbox.
This is a log from my test application. The ti_hecc receives 1 byte long
frames at 500 kbit/s. In the first data byte a number is increased with
every CAN frame (the "d=" column). The enabled mailboxes are in the "me"
column and the pending messages in "rmp".
> [ 57.390747] ti_hecc ti_hecc: can0: i=39502 m=14 n=14 rmp=00004000 me=00007ff0 d= 78
> [ 57.390777] ti_hecc ti_hecc: can0: i=39503 m=13 n=13 rmp=00002000 me=00003ff0 d= 79
> [ 57.390777] ti_hecc ti_hecc: can0: i=39504 m=12 n=12 rmp=00001000 me=00001ff0 d= 80
> [ 57.390808] ti_hecc ti_hecc: can0: i=39505 m=11 n=11 rmp=00000800 me=00000ff0 d= 81
> [ 57.390838] ti_hecc ti_hecc: can0: i=39506 m=10 n=10 rmp=00000400 me=000007f0 d= 82
> [ 57.390838] ti_hecc ti_hecc: can0: i=39507 m= 9 n= 9 rmp=00000200 me=000003f0 d= 83
> [ 57.390869] ti_hecc ti_hecc: can0: i=39508 m= 8 n= 8 rmp=00000100 me=000001f0 d= 84
Here the driver enables the high priority mailboxes (0xfffff000) with a
single write. Actually the driver writes 0xfffff0f0, as there a still
four free low priority mailboxes.
> [ 57.390899] ti_hecc ti_hecc: can0: i=39509 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
^
The next CAN frame should be received, either into 0x00000800 (if the
reception happens before the driver enables the high prio mailboxes) or
into 0x80000000 (if the reception takes place after the driver has
written the canme register).
> [ 57.390930] ti_hecc ti_hecc: can0: i=39510 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
> [ 57.390930] ti_hecc ti_hecc: can0: i=39511 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
> [ 57.390960] ti_hecc ti_hecc: can0: i=39512 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
> [ 57.390991] ti_hecc ti_hecc: can0: i=39513 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
> [ 57.391021] ti_hecc ti_hecc: can0: i=39514 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
> [ 57.391021] ti_hecc ti_hecc: can0: i=39515 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
> [ 57.391052] ti_hecc ti_hecc: can0: i=39516 m= 0 n= 7 rmp=00010000 me=fffff0f0 d= 0
> [ 57.391082] ti_hecc ti_hecc: can0: i=39517 m=31 n=31 rmp=80010000 me=fffff0f0 d= 86
^
We see the next frame goes into 0x80000000 as expected. (There are so
many d=0 lines, because the driver is confused, as the next frame is not
expected in the 0x00010000 mailbox.)
There isn't any tx going on at the moment, the rx-path is the only
modifier of the canme register.
I thought maybe there is a problem in the hardware with the canme
register, so I've reworked the algorithm not to change the canme
register at all. canme stays 0xfffffff0 all the time (all rx mailboxes
active), but received frames are not acknowledged (I don't write 1s into
canrmp after message reception). The pending rx interrupts are masked
with canmim. To "enable" the mailboxes again, the driver clears bits in
the canrmp register with a single write and then unmasks the interrupt
again. This however results in _the_same_ buggy behaviour.
Then I had a closer look at AnilKumar Ch's patch, the canme register is
not changed if a the CAN core currently receives a CAN frame. I added
AnilKumar Ch's busy loop before modifying the canme register, and it
works now. So I conclude that the "pick next free mailbox" algorithm in
the CAN core is racy. You are not allowed to do something that changes a
mailbox's status from disabled/full to enabled/free when a CAN frame is
received. As you cannot control CAN frame reception, this is a nice
hardware race condition. Please add this to the manual and errata sheets.
I'll post a series which fixes the problems I've found soon.
regards,
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]
next prev parent reply other threads:[~2012-11-05 11:35 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-10-08 6:40 [PATCH] can: ti_hecc: fix rx wrong sequence issue AnilKumar Ch
2012-10-08 9:12 ` Jan Lübbe
2012-10-11 12:01 ` Jan Lübbe
2012-10-08 9:28 ` Wolfgang Grandegger
2012-11-05 11:34 ` Marc Kleine-Budde [this message]
2012-11-05 13:47 ` AnilKumar, Chimata
2012-11-05 13:56 ` Marc Kleine-Budde
2012-11-05 18:15 ` Marc Kleine-Budde
2012-11-05 18:40 ` Marc Kleine-Budde
2014-09-08 10:15 ` Jan Sondhauss
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5097A45F.3000606@pengutronix.de \
--to=mkl@pengutronix.de \
--cc=anantgole@ti.com \
--cc=anilkumar@ti.com \
--cc=linux-can@vger.kernel.org \
--cc=nsekhar@ti.com \
--cc=swarren@wwwdotorg.org \
--cc=wg@grandegger.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).