From: Wolfgang Grandegger <wg@grandegger.com>
To: Andri Yngvason <andri.yngvason@marel.com>
Cc: linux-can@vger.kernel.org
Subject: Re: [PATCH v2 1/4] can: dev: Consolidate and unify state change handling.
Date: Wed, 22 Oct 2014 20:42:05 +0200 [thread overview]
Message-ID: <5447FA7D.7060806@grandegger.com> (raw)
In-Reply-To: <5447DC34.7070602@marel.com>
On 10/22/2014 06:32 PM, Andri Yngvason wrote:
>
> On mið 22.okt 2014 12:56, Wolfgang Grandegger wrote:
>>
>>>>> restarted-after-bus-off
>>>> And the restart does come when the short-circuit is gone?
>>>>
>>> No, in fact the bus keeps restarting until the short-circuit is gone.
>>>
>>> Note that I'm using peak_pci. It's sja1000 but maybe there is something
>>> different?
>> No, because it's a memory mapped SJA1000 chip. What do you see with
>> "dmesg"
>> assuming that CONFIG_CAN_DEV_DEBUG is enabled.
>>
> dmesg wasn't very informative so I added some debug output of my own. This is
> the output after I've fixed everything:
> [ 3903.594640] peak_pci 0000:02:00.0: can1 at reg_base=0xf892a400
> cfg_base=0xf8928000 irq=18
> [ 3903.684937] peak_pci 0000:02:00.0 can1: setting BTR0=0x07 BTR1=0x05
> [ 3922.009853] peak_pci 0000:02:00.0 can1: error warning interrupt; bs=false,
> es=true
According to the manual: error active -> warning
> [ 3922.017428] peak_pci 0000:02:00.0 can1: state=1, txerr=96, rxerr=1
> [ 3922.023615] peak_pci 0000:02:00.0 can1: error passive interrupt; bs=false,
> es=true
warning -> error passive
> [ 3922.031197] peak_pci 0000:02:00.0 can1: state=2, txerr=128, rxerr=1
> [ 3928.401643] peak_pci 0000:02:00.0 can1: error passive interrupt; bs=false,
> es=true
error passive -> warning
> [ 3928.409243] peak_pci 0000:02:00.0 can1: state=1, txerr=127, rxerr=0
> [ 3934.804395] peak_pci 0000:02:00.0 can1: error warning interrupt; bs=false,
> es=false
warning -> error active
> [ 3934.812075] peak_pci 0000:02:00.0 can1: state=0, txerr=95, rxerr=0
>
> Note that, between the two passive interrupts, es does not change. This is
> correct according to
> http://www.nxp.com/documents/data_sheet/SJA1000.pdf because ES only signifies
> warning state.
> Thus, "if (status & SR_ES)" is redundant.
>
> @@ -440,11 +446,14 @@ static int sja1000_err(struct net_device *dev, uint8_t
> isrc, uint8_t status)
> }
> if (isrc & IRQ_EPI) {
> /* error passive interrupt */
> - netdev_dbg(dev, "error passive interrupt\n");
> - if (status & SR_ES)
> - state = CAN_STATE_ERROR_PASSIVE;
> + netdev_dbg(dev, "error passive interrupt; bs=%s, es=%s\n",
> + status & SR_BS ? "true" : "false",
> + status & SR_ES ? "true" : "false");
> +
> + if (state == CAN_STATE_ERROR_PASSIVE)
> + state = CAN_STATE_ERROR_WARNING;
> else
> - state = CAN_STATE_ERROR_ACTIVE;
> + state = CAN_STATE_ERROR_PASSIVE;
> }
> if (isrc & IRQ_ALI) {
> /* arbitration lost interrupt */
>
> The data sheet says this about EPI:
> set; this bit is set whenever the CAN controller has
> reached the error passive status (at least one
> error counter exceeds the protocol-defined level of
> 127) or if the CAN controller is in the error passive
> status and enters the error active status again and
> the EPIE bit is set within the interrupt enable
> register
>
> I'm a little bit concerned that it actually says that EPI is set on "error passive"
> and "error passive to error active". However, the log says that txerr=127 on
> that second interrupt. It makes more sense for it to be "error warning". Might
> this be a error in the datasheet?
IIRC, the CAN standard only knows error active and error passive.
Various vendors added the warning level where the level is sometimes
even programmable.
> Another thing that worries me is that when I enabled debug, txerr would advance
> while the interrupt was being handled. This yielded mismatches between
> the new state and the actual current state of the error counters. I moved the
> reading of the RXERR/TXERR registers to the top of the error handler function
> and that fixed the problem for me. Might this still be an issue on slower
> platforms?
Yes, strictly speaking we cannot be sure that we read the correct error
counter when the state change is triggered via interrupt.
> candump looks like this:
> ...
> (000.053403) can0 5B1 [5] D1 D1 BB 7C DF
> (000.146664) can0 506 [8] 2C 6E 9C 77 5E F1 AE 2D
> (000.053628) can0 488 [8] FF C8 AC 26 43 C5 AF 11
> (000.146428) can0 5AD [8] C1 31 BD 1B 6B 8D 34 13
> (000.053112) can0 658 [0]
> (000.171131) can0 20000004 [8] 00 08 00 00 00 00 60 00 ERRORFRAME
> controller-problem{tx-error-warning}
> error-counter-tx-rx{{96}{0}}
> (000.013841) can0 20000004 [8] 00 20 00 00 00 00 80 00 ERRORFRAME
> controller-problem{tx-error-passive}
> error-counter-tx-rx{{128}{0}}
> (004.433642) can0 20000002 [8] 04 00 00 00 00 00 00 00 ERRORFRAME
> lost-arbitration{at bit 4}
> (000.014785) can0 20000002 [8] 02 00 00 00 00 00 00 00 ERRORFRAME
> lost-arbitration{at bit 2}
> (000.012565) can0 20000002 [8] 02 00 00 00 00 00 00 00 ERRORFRAME
> lost-arbitration{at bit 2}
> (000.000006) can0 152 [8] 48 94 14 45 C3 60 8A 58
> (000.000015) can0 6D7 [4] 3A B8 84 26
> (000.012537) can0 20000002 [8] 02 00 00 00 00 00 00 00 ERRORFRAME
> lost-arbitration{at bit 2}
Why do you see these errors? Are there electrical problems on the CAN
bus? And if no cable is connected just txerr should increase (and
decrease if it's reconnected!
> (000.014083) can0 2E7 [7] A2 6F F8 5F DF DD 3B
> (000.000904) can0 757 [8] 4F 0E AB 57 C3 58 DB 75
> (000.000680) can0 459 [5] 52 52 58 29 8C
> ...
> (000.000546) can0 3C7 [2] 18 AC
> (000.000875) can0 130 [8] B3 57 0A 52 CC CA D3 08
> (000.015962) can0 5DF [8] 64 14 6C 61 F9 FD E9 55
> (000.000023) can0 5D2 [8] C5 B2 39 6B 65 45 4C 05
> (000.014554) can0 20000002 [8] 03 00 00 00 00 00 00 00 ERRORFRAME
> lost-arbitration{at bit 3}
> (000.000008) can0 2FC [8] 19 73 77 6C 86 91 D5 58
> ...
> (000.053183) can0 7A1 [6] F8 EE 7B 12 A3 43
> (000.146734) can0 089 [5] 36 B8 A0 4F DD
> (000.014032) can0 20000004 [8] 00 0C 00 00 00 00 7F 74 ERRORFRAME
> controller-problem{rx-error-warning,tx-error-warning}
> error-counter-tx-rx{{127}{116}}
> (000.039316) can0 684 [6] ED D1 1E 32 00 1D
> (000.146892) can0 0D1 [8] CD 99 DE 18 62 B0 45 27
> (000.053317) can0 687 [8] 43 37 CF 5E 6B A4 CA 3D
> ...
> (000.053561) can0 42A [4] 65 20 3B 5C
> (000.146834) can0 3C2 [8] 32 50 A4 56 31 EC DD 55
> (000.013948) can0 20000004 [8] 00 40 00 00 00 00 5F 54 ERRORFRAME
> controller-problem{back-to-error-active}
> error-counter-tx-rx{{95}{84}}
> (000.039372) can0 455 [5] 58 38 31 62 B6
> ...
Wolfgang.
next prev parent reply other threads:[~2014-10-22 18:42 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-26 18:25 [PATCH v2 1/4] can: dev: Consolidate and unify state change handling Andri Yngvason
2014-10-20 19:45 ` Wolfgang Grandegger
2014-10-21 10:42 ` Andri Yngvason
2014-10-21 10:52 ` Wolfgang Grandegger
2014-10-21 14:56 ` Andri Yngvason
2014-10-21 15:21 ` Wolfgang Grandegger
2014-10-22 11:48 ` Andri Yngvason
2014-10-22 12:30 ` Wolfgang Grandegger
2014-10-22 12:48 ` Andri Yngvason
2014-10-22 12:56 ` Wolfgang Grandegger
2014-10-22 16:32 ` Andri Yngvason
2014-10-22 18:42 ` Wolfgang Grandegger [this message]
2014-10-23 12:50 ` Andri Yngvason
2014-10-23 13:10 ` Wolfgang Grandegger
2014-10-23 15:55 ` Andri Yngvason
2014-11-22 17:10 ` Marc Kleine-Budde
2014-11-22 18:15 ` Andri Yngvason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5447FA7D.7060806@grandegger.com \
--to=wg@grandegger.com \
--cc=andri.yngvason@marel.com \
--cc=linux-can@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).