From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <45FE4E96.4010103@domain.hid>
Date: Mon, 19 Mar 2007 08:49:26 +0000
From: =?ISO-8859-1?Q?St=E9phane_ANCELOT?=
MIME-Version: 1.0
Subject: Re: [Xenomai-help] RT-Socket-CAN bus error handling (was CAN errors and real-time behaviour (IRQ raise forever and may lock system))
References: <45F7DEA8.2050309@domain.hid> <45FBD768.9050407@domain.hid>
In-Reply-To: <45FBD768.9050407@domain.hid>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: Help regarding installation and common use of Xenomai
To: Wolfgang Grandegger
Cc: xenomai@xenomai.org

Hi,

Only one comment: it is not absolutely necessary to disable the bus
error interrupt. "A new bus error interrupt is not possible until the
ECC register is read out once."

Simply not reading the ECC register in the ISR will prevent new BEI
generation.

Best regards,
Steph

Wolfgang Grandegger wrote:
> Hi Sebastian,
>
> Sebastian Smolorz wrote:
>> Wolfgang Grandegger wrote:
>>> Sebastian Smolorz wrote:
>>>> Last summer we had a discussion about the BEI issue on the
>>>> socketcan-ML.
>>>> Two additional handling policies popped up:
>>>> 1. The interface could restart itself after an amount of BEIs, thus
>>>> taking responsibility from the user application.
>>>> 2. The BEI could be completely disabled if no one is interested in this
>>>> type of error frame.
>>> I tried to implement 2. for SJA1000, but re-enabling the BIE on the fly
>>> does not work. :-(. The controller requires a re-start of the device to
>>> get the bus error reporting back to work.
>>
>> Oh, really? I wasn't aware of this.
>
> Well, I got it working.
> Reading the ECC register after re-enabling the
> bus error interrupts fixed the problem:
>
>     if (CAN_STATE_OPERATING(dev->state)) {
>         chip->write_reg(dev, SJA_IER, chip->ier); /* update on the fly */
>         chip->read_reg(dev, SJA_ECC);
>     }
>
>>>> Maybe it is time to think about the implementation of these policies as
>>>> more and more users seem to run into the BEI issue with a disconnected
>>>> bus. Wolfgang, Jan, what is your opinion?
>>> Well, solution 2. with the limitations mentioned above is therefore less
>>> attractive because it interrupts the CAN traffic.
>>
>> True.
>
> Back to our preferred solution 1. Attached is a patch for review
> including some other fixes and suggestions accumulated over time:
>
> * ksrc/drivers/can/*: To avoid unnecessary bus error interrupt
>   flooding, the option CONFIG_XENO_DRIVERS_CAN_BUS_ERR now allows to
>   enable bus error interrupts "on demand" only if an application is
>   interested in such errors. It is automatically selected for CAN
>   controllers supporting bus error interrupts like the SJA1000.
>
> * include/rtdm/rtcan.h: Add some doc on bus-off and bus-error error
>   conditions and the restart policy.
>
> * src/utils/can/rtcanconfig.c: Controller mode settings and doc
>   has been corrected.
>
>>> The Socket-CAN implementation actually restarts the CAN controller
>>> after a certain amount of bus error interrupts (200 by default) which
>>> matches your first policy above. But in RT-Socket-CAN, we do not
>>> automatically re-start the device by purpose. Therefore I tend to just
>>> stop the device. It's then up to the application to restart it. What
>>> do you think?
>>
>> No fundamental objections but it would be best if an application would
>> be informed of this special situation e.g. through an error frame with
>> the meaning "controller was stopped because of a disconnected bus
>> after trying to send 200 times the same message".
>>
>> A question pops up in this context: Why do we define CAN_ERR_RESTARTED
>> if we never do this? Only to be compatible with Socket-CAN? Then I
>> would propose to extend the documentation by pointing out that this
>> will not appear under RT-Socket-CAN.
>
> Let's wait if solution 1. is sufficient. maybe we need 2. later as well.
>
> Wolfgang.
>
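
For illustration only, the two behaviours discussed in this thread can be
sketched as a small user-space model in C: the ECC latch (no new bus error
interrupt until the ECC register has been read) and the "stop the device
after N bus errors" policy. This is not the RT-Socket-CAN driver code.
The names struct sja_model, model_init(), model_bus_error() and
model_read_ecc() are hypothetical, and the latch is modelled only as far
as the quoted sentence describes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the controller state; NOT the real driver. */
struct sja_model {
    bool ecc_latched;       /* ECC holds an error code not yet read   */
    bool stopped;           /* device was stopped by the policy below */
    unsigned int bei_count; /* bus error interrupts raised so far     */
    unsigned int bei_limit; /* assumed threshold (200 in the thread)  */
};

void model_init(struct sja_model *m, unsigned int limit)
{
    assert(m != NULL);
    m->ecc_latched = false;
    m->stopped = false;
    m->bei_count = 0;
    m->bei_limit = limit;
}

/*
 * A bus error happens on the wire.
 * Returns true if a bus error interrupt (BEI) would be raised.
 */
bool model_bus_error(struct sja_model *m)
{
    assert(m != NULL);

    /* No new BEI while the device is stopped or ECC is still unread. */
    if (m->stopped || m->ecc_latched)
        return false;

    m->ecc_latched = true;
    m->bei_count++;

    /* Policy from the thread: stop the device, leave restart to the app. */
    if (m->bei_count >= m->bei_limit)
        m->stopped = true;

    return true;
}

/* Reading ECC releases the latch, so the next bus error can raise a BEI. */
void model_read_ecc(struct sja_model *m)
{
    assert(m != NULL);
    m->ecc_latched = false;
}
```

In this model, an ISR that never calls model_read_ecc() sees exactly one
BEI, which is the effect described at the top of this mail. An ISR that
reads ECC every time keeps receiving BEIs until the assumed limit stops
the device.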