From: Wolfgang Grandegger <wg@grandegger.com>
To: Andri Yngvason <andri.yngvason@marel.com>,
Marc Kleine-Budde <mkl@pengutronix.de>,
linux-can@vger.kernel.org
Subject: Re: [PATCH 2/4] Consolidate and unify state change handling
Date: Sun, 21 Sep 2014 17:30:48 +0200 [thread overview]
Message-ID: <541EEF28.5000103@grandegger.com> (raw)
In-Reply-To: <541EE4E9.9010506@marel.com>
On 09/21/2014 04:47 PM, Andri Yngvason wrote:
>
> On fös 19.sep 2014 21:10, Wolfgang Grandegger wrote:
>> On 09/18/2014 06:25 PM, Andri Yngvason wrote:
>>> Signed-off-by: Andri Yngvason <andri.yngvason@marel.com>
>>> ---
>>> ...
>>> - cf->can_id |= CAN_ERR_CRTL;
>>> - cf->data[1] = (bec.txerr > bec.rxerr) ?
>>> - CAN_ERR_CRTL_TX_WARNING :
>>> - CAN_ERR_CRTL_RX_WARNING;
>> Hm, can_change_state() handles the equal case differently. In the
>> SJA1000 manual I found:
>>
>> "Errors detected during reception or transmission will affect the error
>> counters according to the CAN 2.0B protocol
>> specification. The error status bit is set when at least one of the
>> error counters has reached or exceeded the CPU
>> warning limit of 96. An error interrupt is generated, if enabled."
>>
>> If both are equal we do not known if rx or tx has caused the state
>> change and therefore setting "CAN_ERR_CRTL_TX_WARNING |
>> CAN_ERR_CRTL_RX_WARNING" seems more logical, indeed. But maybe it simply
>> does not happen. Any other opinions?
> I think that not specifically handling the equal case would be wrong. Let's
> consider the following sequence of events:
> * txerr reaches warning level
> * rxerr reaches warning level
> If they are both equal at this point, you will only get a second
> CAN_ERR_CRTL_TX_WARNING in the current implementation, whereas in the
> proposed
> implementation, the user would get
> CAN_ERR_CRTL_TX_WARNING | CAN_ERR_CRTL_RX_WARNING and because the user
> can know
> the prior error state message, he can find out which state actually
> changed.
The question is what error (rx or tx) error did triger the error state
change interrupt. I doubt that such an interrupt is triggered when one
error counter catches up, .e.g. txer was > 128 and rxerr exceeded 128.
It's even not sure that all the controllers act the same way. Therefore
also keeping the current behaviour would be fine for me.
> But this is all based on the premise that txerr hasn't progressed since.
> In fact,
> because we cannot assume that txerr stays in place until rxerr catches
> up, this
> is what we should be doing:
> enum can_state errcount_to_state(unsigned int count)
> {
> if (unlikely(count > 127))
> return CAN_STATE_ERROR_PASSIVE;
>
> if (unlikely(count > 96))
> return CAN_STATE_ERROR_WARNING;
>
> return CAN_STATE_ERROR_ACTIVE;
> }
>
> enum can_err_dir can_get_err_dir(unsigned int txerr, unsigned int rxerr)
> {
> enum can_err_dir dir;
>
> enum can_state tx_state = errcount_to_state(txerr);
> enum can_state rx_state = errcount_to_state(rxerr);
>
> if (tx_state > rx_state)
> return CAN_ERR_DIR_TX;
>
> if (tx_state < rx_state)
> return CAN_ERR_DIR_RX;
>
> return CAN_ERR_DIR_TX | CAN_ERR_DIR_RX;
> }
>
> However, now that we've introduced errcount_to_state(), it seems to me
> that it would
> be simpler to dump the proposed CAN_ERR_DIR enum in favour of passing
> the two states
> directly to can_change_state().
D'accord.
>>> - }
>>> - case CAN_STATE_ERROR_WARNING: /* fallthrough */
>>> - /*
>>> - * from: ERROR_ACTIVE, ERROR_WARNING
>>> - * to : ERROR_PASSIVE, BUS_OFF
>>> - * => : error passive int
>>> - */
>>> - if (new_state >= CAN_STATE_ERROR_PASSIVE &&
>>> - new_state <= CAN_STATE_BUS_OFF) {
>>> - netdev_dbg(dev, "Error Passive IRQ\n");
>>> - priv->can.can_stats.error_passive++;
>>> -
>>> - cf->can_id |= CAN_ERR_CRTL;
>>> - cf->data[1] = (bec.txerr > bec.rxerr) ?
>>> - CAN_ERR_CRTL_TX_PASSIVE :
>>> - CAN_ERR_CRTL_RX_PASSIVE;
>>> - }
>>> - break;
>>> - case CAN_STATE_BUS_OFF:
>>> - netdev_err(dev, "BUG! "
>>> - "hardware recovered automatically from BUS_OFF\n");
>>> - break;
>>> - default:
>>> - break;
>>> - }
>>> + can_change_state(dev, cf, new_state,
>>> + can_get_err_dir(bec.rxerr, bec.txerr));
>> Saves a lot of lines :).
> Indeed ;)
>>
>>> - /* process state changes depending on the new state */
>>> - switch (new_state) {
>>> - case CAN_STATE_ERROR_WARNING:
>>> - netdev_dbg(dev, "Error Warning\n");
>>> - cf->can_id |= CAN_ERR_CRTL;
>>> - cf->data[1] = (bec.txerr > bec.rxerr) ?
>>> - CAN_ERR_CRTL_TX_WARNING :
>> To validate the correct behaviour could you please send messages while
>> the cable is disconnected. Then reconnect the cable and see how the
>> error state decreases. You can monitor the behaviour with ""candump -td
>> -e any,0:0,#FFFFFFFF" in another shell.
>>
> I'm using PCAN-USB Pro to generate errors on the bus. It works quite well.
> I can generate tx errors by sending from the device and then have the pcan
> ruin a few frames. rx errors can be generated by having an other device on
> the bus outputting random data and then let the pcan corrupt the frames.
Short-circuiting the CAN low and high lines is a simple method to
> Sadly the error generation mechanism only works on windows. :(
>
> I've tried the "disconnected cable" method too in the past. It usually
> puts mscan into bus-off quite fast.
Sending a message whithout cable should never trigger an bus-off. The tx
error counter never exceeds 128.
Here is an example output of "candump -candump -td -e any,0:0,#FFFFFFFF"
for a recovery from error passive state due to no ack/cable (reconnect
after 5s) for a SJA1000 on an on EMS PCI card:
(000.201913) can0 1C [0]
(000.212241) can0 20000204 [8] 00 08 00 00 00 00 60 00 ERRORFRAME
controller-problem{tx-error-warning}
state-change{tx-error-warning}
error-counter-tx-rx{{96}{0}}
(000.003544) can0 20000204 [8] 00 20 00 00 00 00 80 00 ERRORFRAME
controller-problem{tx-error-passive}
state-change{tx-error-passive}
error-counter-tx-rx{{128}{0}}
(004.901842) can0 1D [7] 1D F6 33 52 31 4B DE
(000.000116) can0 20000200 [8] 00 08 00 00 00 00 7F 00 ERRORFRAME
state-change{tx-error-warning}
error-counter-tx-rx{{127}{0}}
(000.000678) can0 1E [6] 42 05 14 82 23 B6
...
(000.201927) can0 49 [4] 2F 1A 97 25
(000.000096) can0 20000200 [8] 00 40 00 00 00 00 5F 00 ERRORFRAME
state-change{back-to-error-active}
error-counter-tx-rx{{95}{0}}
(000.202184) can0 4A [8] 7F 87 0E FE 03 BA 78 91
This is from my related patch-set.
Wolfgang.
next prev parent reply other threads:[~2014-09-21 15:30 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-18 16:25 [PATCH 2/4] Consolidate and unify state change handling Andri Yngvason
2014-09-19 21:10 ` Wolfgang Grandegger
2014-09-21 14:47 ` Andri Yngvason
2014-09-21 15:30 ` Wolfgang Grandegger [this message]
2014-09-21 17:27 ` Andri Yngvason
2014-09-23 20:33 ` Wolfgang Grandegger
2014-09-23 22:31 ` Oliver Hartkopp
2014-09-24 6:28 ` Wolfgang Grandegger
-- strict thread matches above, loose matches on Subject: below --
2014-09-18 16:38 Andri Yngvason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=541EEF28.5000103@grandegger.com \
--to=wg@grandegger.com \
--cc=andri.yngvason@marel.com \
--cc=linux-can@vger.kernel.org \
--cc=mkl@pengutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).