From: Wolfgang Grandegger <wg@grandegger.com>
To: Linux-can Mailing List <linux-can@vger.kernel.org>,
SocketCAN Core Mailing List <socketcan-core@lists.berlios.de>
Subject: RFC: improve and consolidate state change and bus-off handling
Date: Fri, 02 Dec 2011 10:04:14 +0100 [thread overview]
Message-ID: <4ED8948E.5090806@grandegger.com> (raw)
Hello,
as you might know, our handling of CAN state changes and bus-off is
not consistent, weak or even incorrect. Therefore I'm making an effort
to improve, consolidate and *unify* it. Most things are straight-
forward, but others need more attention and especially for the bus-off
recovery I would appreciate some CAN expert advice (more below). I have
already some patches implementing:
- Add missing do_get_berr_counter() callbacks (for ti_hecc, etc.).
- Add error counters to the data fields 6..7 of *any* CAN error message
automatically in alloc_can_err_skb():
if (priv->do_get_berr_counter) {
struct can_berr_counter bec;
priv->do_get_berr_counter(dev, &bec);
(*cf)->data[6] = bec.txerr;
(*cf)->data[7] = bec.rxerr;
}
- Allow state changes going down including "back to error active":
Therefore I added:
$ cat include/linux/can/error.h
...
#define CAN_ERR_STATE_CHANGE 0x00000200U /* CAN error state change / data[1] */
...
#define CAN_ERR_CRTL_ACTIVE 0x40 /* recovered to error active state */
For any state change the CAN_ERR_STATE_CHANGE will be set in the
can_id. If the state gets worse, CAN_ERR_CRTL is set as usual
also for backward compatibility. The state change management will
be done by a common "can_change_state()" function doing all the bit
settings and counter increments. For the SJA1000 "candump -e" will
then report for recovery from the error passive state (no cable):
can0 20000204 [8] 00 08 00 00 00 00 60 00 ERRORFRAME
controller-problem{tx-error-warning}
state-change{tx-error-warning}
error-counter-tx-rx{{96}{0}}
can0 20000204 [8] 00 30 00 00 00 00 80 00 ERRORFRAME
controller-problem{tx-error-passive}
state-change{tx-error-passive}
error-counter-tx-rx{{128}{0}}
can0 124 [3] 12 34 56
...
can0 124 [3] 12 34 56
can0 20000200 [8] 00 08 00 00 00 00 7F 00 ERRORFRAME
state-change{tx-error-warning}
error-counter-tx-rx{{127}{0}}
can0 124 [3] 12 34 56
...
can0 124 [3] 12 34 56
can0 20000200 [8] 00 40 00 00 00 00 5F 00 ERRORFRAME
state-change{back-to-error-active}
error-counter-tx-rx{{95}{0}}
Updating all drivers correctly is a challenge, especially because I
do not have all hardware. Help and comments are appreciated.
- Bus-off recovery:
Currently, I think, we do not handle bus-off recovery correctly for
most controllers. We brute-force stop and restart the controller.
The controller will do the recovery cycle anyway and we may send
messages to early. Instead the software should handle the bus-off
recovery cycle as shown below:
* bus-off happens
- call netif_stop_queue() and maybe disable interrupts
* automatic or manual restart is done
- trigger bus-off recovery sequence by resetting the init bit
(on SJA1000) and maybe re-enable the interrupts
- await the controller going back to error-active state
(signaled via interrupt).
- call netif_wake_queue()
Here is a "candump -e" output for the SJA1000 (with delta times)
(009.832477) can0 20000204 [8] 00 30 00 00 00 00 88 00 ERRORFRAME
controller-problem{tx-error-passive}
state-change{tx-error-passive}
error-counter-tx-rx{{136}{0}}
(000.000804) can0 20000240 [8] 00 00 00 00 00 00 7F 00 ERRORFRAME
bus-off
state-change{}
error-counter-tx-rx{{127}{0}}
(000.099795) can0 20000100 [8] 00 00 00 00 00 00 7F 00 ERRORFRAME
restarted-after-bus-off
error-counter-tx-rx{{127}{0}}
(000.003061) can0 20000200 [8] 00 40 00 00 00 00 00 00 ERRORFRAME
state-change{back-to-error-active}
Before doing all the necessary code changes, which are not always
trivial I ask: Would that be the correct bus-off handling???
Thanks for feedback.
Wolfgang.
next reply other threads:[~2011-12-02 9:04 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-02 9:04 Wolfgang Grandegger [this message]
2011-12-02 9:41 ` RFC: improve and consolidate state change and bus-off handling Marc Kleine-Budde
2011-12-02 10:04 ` Wolfgang Grandegger
2011-12-03 9:28 ` Sebastian Haas
[not found] ` <4ED9EBAB.4060701-xpvPi5bcW5W9w4jpWW8B1qHonnlKzd3f@public.gmane.org>
2011-12-03 10:18 ` Oliver Hartkopp
2011-12-03 11:53 ` Sebastian Haas
2011-12-03 16:53 ` Oliver Hartkopp
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ED8948E.5090806@grandegger.com \
--to=wg@grandegger.com \
--cc=linux-can@vger.kernel.org \
--cc=socketcan-core@lists.berlios.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).