From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wolfgang Grandegger Subject: Re: [PATCH v3] can: grcan: Add device driver for GRCAN and GRHCAN cores Date: Wed, 07 Nov 2012 12:15:36 +0100 Message-ID: <509A42D8.6030009@grandegger.com> References: <5087EDBF.2040108@gaisler.com> <1351588017-14357-1-git-send-email-andreas@gaisler.com> <508FA6D3.9000809@grandegger.com> <508FFF50.8050900@gaisler.com> <50911EB6.1070105@grandegger.com> <509152D5.2050701@gaisler.com> <50918830.9040601@grandegger.com> <50929E7F.1090205@gaisler.com> <509786C0.9070703@grandegger.com> <509A0E94.1040103@gaisler.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Return-path: Received: from ngcobalt02.manitu.net ([217.11.48.102]:41146 "EHLO ngcobalt02.manitu.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754180Ab2KGLPr (ORCPT ); Wed, 7 Nov 2012 06:15:47 -0500 In-Reply-To: <509A0E94.1040103@gaisler.com> Sender: linux-can-owner@vger.kernel.org List-ID: To: Andreas Larsson Cc: linux-can@vger.kernel.org, Marc Kleine-Budde , devicetree-discuss@lists.ozlabs.org, software@gaisler.com On 11/07/2012 08:32 AM, Andreas Larsson wrote: > On 11/05/2012 10:28 AM, Wolfgang Grandegger wrote: >> On 11/01/2012 05:08 PM, Andreas Larsson wrote: >>> On 2012-10-31 21:21, Wolfgang Grandegger wrote: > ... >>> Yes, the hardware becomes error active after having monitored 11 >>> consecutive recessive bits on the bus 128 times (as allowed by the 2.0 >>> CAN spec). There is no way of turning this off, so to conform to the >>> normal linux procedure of not doing this, I shut down the device on bus >>> off interrupt. >> >> This should be handled in the following way: >> >> 1. If priv->can.restart_ms == 0: do *not* allow automatic restart >> That's what you alredy have implemented. >> >> 2. If priv->can.restart_ms > 0 : do allow automatic restart. >> This requires to send CAN_ERR_RESTARTED when the system goes >> bus-on. See at91_can and mcp251x as example. >> >>> In addition I have thrown out the arbitration lost error frame >>> generation as arbitration errors can not be singled out. The TXLOSS >>> interrupt might be due to arbitration error, but is also triggered in >>> great numbers when there is no one else on the can bus or when there is >>> a problem with the hardware interface or the bus itself. >>> >>> This is how things look currently with no one else on the bus: >>> ~ # cansend can0 123#45 >>> can0 20000004 [8] 00 08 00 00 00 00 60 00 ERRORFRAME >>> controller-problem{tx-error-warning} >>> error-counter-tx-rx{{96}{0}} >>> can0 20000004 [8] 00 20 00 00 00 00 80 00 ERRORFRAME >>> controller-problem{tx-error-passive} >>> error-counter-tx-rx{{128}{0}} >>> ~ # >>> >>> And this is how it looks with a short-circuited bus: >>> ~ # cansend can0 123#45 >>> can0 20000004 [8] 00 08 00 00 00 00 90 00 ERRORFRAME >>> controller-problem{tx-error-warning} >>> error-counter-tx-rx{{144}{0}} >>> can0 20000004 [8] 00 20 00 00 00 00 98 00 ERRORFRAME >>> controller-problem{tx-error-passive} >>> error-counter-tx-rx{{152}{0}} >>> can0 20000040 [8] 00 00 00 00 00 00 00 00 ERRORFRAME >>> bus-off >>> ~ # >> >> This looks good now. Just the automatic restart is missing as described >> above. > > When doing the bus_off handling as in at91_can, on a short-circuited bus > with restart-ms != 0, the result of a cansend is an endless and frequent > stream of > > can0 20000004 [8] 00 20 00 00 00 00 88 00 ERRORFRAME > controller-problem{tx-error-passive} > error-counter-tx-rx{{136}{0}} > can0 20000040 [8] 00 00 00 00 00 00 80 00 ERRORFRAME > bus-off > error-counter-tx-rx{{128}{0}} > can0 20000104 [8] 00 00 00 00 00 00 10 00 ERRORFRAME > controller-problem{} > restarted-after-bus-off > error-counter-tx-rx{{16}{0}} > can0 20000004 [8] 00 10 00 00 00 00 57 80 ERRORFRAME > controller-problem{rx-error-passive} > error-counter-tx-rx{{87}{128}} > can0 20000040 [8] 00 00 00 00 00 00 80 00 ERRORFRAME > bus-off > error-counter-tx-rx{{128}{0}} > can0 20000104 [8] 00 00 00 00 00 00 08 00 ERRORFRAME > controller-problem{} > restarted-after-bus-off > error-counter-tx-rx{{8}{0}} > can0 20000004 [8] 00 10 00 00 00 00 57 80 ERRORFRAME > controller-problem{rx-error-passive} > error-counter-tx-rx{{87}{128}} > can0 20000040 [8] 00 00 00 00 00 00 80 00 ERRORFRAME > bus-off > error-counter-tx-rx{{128}{0}} > can0 20000104 [8] 00 00 00 00 00 00 08 00 ERRORFRAME > controller-problem{} > restarted-after-bus-off > error-counter-tx-rx{{8}{0}} > [...] > > as the grcan core continues to try to resend the frame when it comes > back again. To mimic the sja1000 behavior as closely as possible, I > guess that the driver also would need to make sure that the tx and rx > buffers are cleaned out so that this resending does not happen, right? No, what you see is the normal behavior for automatic restart by the hardware. A bus-off recovery is *not* the same than a controller restart. > To do this, the hardware needs to be stopped anyway. Therefore, in my > opinion it is much simpler to handle it as it is in v5: always shut down > the hardware on bus off and, in the case of a non-zero restart_ms, let > restart timer trigger can_restart that will call grcan_set_mode which > will restart the hardware with empty buffers. Do you see any problems > with this approach? The application will start to send frames anyway and will again trigger a bus-off as long as the electronic problem persists. Flushing the buffers will not cure the problem. > The added benefit of this approach is that then the actual millisecond > value of the non-zero restart_ms is used instead of having the hardware > quickly restart regardless of the value. See above. > In any case I have some other fixes for v6. OK. Wolfgang.