From: Wolfgang Grandegger <wg@grandegger.com>
To: Linux-can Mailing List <linux-can@vger.kernel.org>,
SocketCAN Core Mailing List <socketcan-core@lists.berlios.de>
Subject: RFC: improve and consolidate state change and bus-off handling
Date: Fri, 02 Dec 2011 10:04:14 +0100 [thread overview]
Message-ID: <4ED8948E.5090806@grandegger.com> (raw)
Hello,
as you might know, our handling of CAN state changes and bus-off is
not consistent, weak or even incorrect. Therefore I'm making an effort
to improve, consolidate and *unify* it. Most things are straight-
forward, but others need more attention and especially for the bus-off
recovery I would appreciate some CAN expert advice (more below). I have
already some patches implementing:
- Add missing do_get_berr_counter() callbacks (for ti_hecc, etc.).
- Add error counters to the data fields 6..7 of *any* CAN error message
automatically in alloc_can_err_skb():
if (priv->do_get_berr_counter) {
struct can_berr_counter bec;
priv->do_get_berr_counter(dev, &bec);
(*cf)->data[6] = bec.txerr;
(*cf)->data[7] = bec.rxerr;
}
- Allow state changes going down including "back to error active":
Therefore I added:
$ cat include/linux/can/error.h
...
#define CAN_ERR_STATE_CHANGE 0x00000200U /* CAN error state change / data[1] */
...
#define CAN_ERR_CRTL_ACTIVE 0x40 /* recovered to error active state */
For any state change the CAN_ERR_STATE_CHANGE will be set in the
can_id. If the state gets worse, CAN_ERR_CRTL is set as usual
also for backward compatibility. The state change management will
be done by a common "can_change_state()" function doing all the bit
settings and counter increments. For the SJA1000 "candump -e" will
then report for recovery from the error passive state (no cable):
can0 20000204 [8] 00 08 00 00 00 00 60 00 ERRORFRAME
controller-problem{tx-error-warning}
state-change{tx-error-warning}
error-counter-tx-rx{{96}{0}}
can0 20000204 [8] 00 30 00 00 00 00 80 00 ERRORFRAME
controller-problem{tx-error-passive}
state-change{tx-error-passive}
error-counter-tx-rx{{128}{0}}
can0 124 [3] 12 34 56
...
can0 124 [3] 12 34 56
can0 20000200 [8] 00 08 00 00 00 00 7F 00 ERRORFRAME
state-change{tx-error-warning}
error-counter-tx-rx{{127}{0}}
can0 124 [3] 12 34 56
...
can0 124 [3] 12 34 56
can0 20000200 [8] 00 40 00 00 00 00 5F 00 ERRORFRAME
state-change{back-to-error-active}
error-counter-tx-rx{{95}{0}}
Updating all drivers correctly is a challenge, especially because I
do not have all hardware. Help and comments are appreciated.
- Bus-off recovery:
Currently, I think, we do not handle bus-off recovery correctly for
most controllers. We brute-force stop and restart the controller.
The controller will do the recovery cycle anyway and we may send
messages to early. Instead the software should handle the bus-off
recovery cycle as shown below:
* bus-off happens
- call netif_stop_queue() and maybe disable interrupts
* automatic or manual restart is done
- trigger bus-off recovery sequence by resetting the init bit
(on SJA1000) and maybe re-enable the interrupts
- await the controller going back to error-active state
(signaled via interrupt).
- call netif_wake_queue()
Here is a "candump -e" output for the SJA1000 (with delta times)
(009.832477) can0 20000204 [8] 00 30 00 00 00 00 88 00 ERRORFRAME
controller-problem{tx-error-passive}
state-change{tx-error-passive}
error-counter-tx-rx{{136}{0}}
(000.000804) can0 20000240 [8] 00 00 00 00 00 00 7F 00 ERRORFRAME
bus-off
state-change{}
error-counter-tx-rx{{127}{0}}
(000.099795) can0 20000100 [8] 00 00 00 00 00 00 7F 00 ERRORFRAME
restarted-after-bus-off
error-counter-tx-rx{{127}{0}}
(000.003061) can0 20000200 [8] 00 40 00 00 00 00 00 00 ERRORFRAME
state-change{back-to-error-active}
Before doing all the necessary code changes, which are not always
trivial I ask: Would that be the correct bus-off handling???
Thanks for feedback.
Wolfgang.
next reply other threads:[~2011-12-02 9:04 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-02 9:04 Wolfgang Grandegger [this message]
2011-12-02 9:41 ` RFC: improve and consolidate state change and bus-off handling Marc Kleine-Budde
2011-12-02 10:04 ` Wolfgang Grandegger
2011-12-03 9:28 ` Sebastian Haas
[not found] ` <4ED9EBAB.4060701-xpvPi5bcW5W9w4jpWW8B1qHonnlKzd3f@public.gmane.org>
2011-12-03 10:18 ` Oliver Hartkopp
2011-12-03 11:53 ` Sebastian Haas
2011-12-03 16:53 ` Oliver Hartkopp
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ED8948E.5090806@grandegger.com \
--to=wg@grandegger.com \
--cc=linux-can@vger.kernel.org \
--cc=socketcan-core@lists.berlios.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.