From: Wolfgang Grandegger <wg@domain.hid>
To: Wolfgang Grandegger <wg@domain.hid>
Cc: xenomai@xenomai.org, Jan Kiszka <jan.kiszka@domain.hid>
Subject: Re: [Xenomai-help] RT-Socket-CAN bus error handling (was CAN errors and real-time behaviour (IRQ raise forever and may lock system))
Date: Mon, 19 Mar 2007 22:19:32 +0100
Message-ID: <45FEFE64.5010206@domain.hid>
In-Reply-To: <45FEF649.9060205@domain.hid>

Wolfgang Grandegger wrote:
> Jan Kiszka wrote:
>> Wolfgang Grandegger wrote:
>>> Sebastian Smolorz wrote:
>>>> Sebastian Smolorz wrote:
>>>>> Hi Jan,
>>>>>
>>>>> Jan Kiszka wrote:
>>>>>> Wolfgang Grandegger wrote:
>>>>>>> you know, on the SJA1000 the bus error interrupt can result in high
>>>>>>> error interrupt rates and even hang the system on slow processors.
>>>>>>> Just unplugging the CAN cable can cause such interrupt flooding.
>>>>>>> This problem popped up again recently and Sebastian proposed:
>>>>>>>> Last summer we had a discussion about the BEI issue on the
>>>>>>>> socketcan-ML. Two additional handling policies popped up:
>>>>>>>> 1. The interface could restart itself after an amount of BEIs, thus
>>>>>>>>    taking responsibility from the user application.
>>>>>>>> 2. The BEI could be completely disabled if no one is interested in
>>>>>>>>    this type of error frame.
>>>>>>> As 2. is also my preferred solution, I have implemented it. The only
>>>>>>> downside is that you do not see the error counter increasing when
>>>>>>> /proc/rtcan/devices is inspected. We also discussed 1., but
>>>>>>> RT-Socket-CAN deliberately does not restart the CAN controller, and
>>>>>>> just stopping it requires user intervention.
>>>>>> And if there is someone listening, how is the flooding issue on cable
>>>>>> unplug etc. solved by option 2?
>>>>> Hm, maybe we could implement 1 additionally (but without automatic
>>>>> restart)?
>>>> A more precise suggestion: What about letting BEIs appear until
>>>> passive mode is reached and if the TX error counter doesn't count up
>>>> any more (indication of start-up situation discovered by the SJA1000)
>>>> the driver ceases to read out ECC any further (thanks Stephane for the
>>>> hint). The controller would be still operating but not reporting BEIs
>>>> any more. There has to be some mechanism to let BEIs through after the
>>>> situation has normalized. Maybe the driver could check inside the
>>>> interrupt handler if active mode was reached again after the above
>>>> situation occurred.
>>> Well, this is rather sophisticated and needs some more careful
>>> evaluation. We might also reach the passive level slowly without
>>> flooding. Furthermore, the method should also be applicable for other
>>> controllers.
>>
>> What is the current behaviour of other controllers?
> 
> Most do not have such detailed error reporting via bus error interrupts. 
> I know just the i82527 reporting bus errors as well.
> 
>>> Let's implement 1. and a downscaled printk and wait for the users'
>>> reaction; see also my other mail. Then we should bring up this discussion
>>> again on the Socket-CAN-ML to negotiate a common solution.
>>
>> Instead of waiting on some user triggering a (potential) latency mine, I
>> would prefer that we experimentally evaluate the effect. E.g. via an
>> I-pipe tracer dump on a faster and a slower box. I would offer to run
>> some demo code here on our PC104 Phytec boards as well.
> 
> I think we should first run the latency test concurrently, and if we
> discover high latencies, an I-pipe trace will help to locate the peaks.
> 
>> The problem is to define what degree of error-related IRQ load is
>> generally acceptable. We surely can't do this, so we have to document
>> the effect /at least/ and help the users to check it on their own - or
>> we have to avoid it / make it insignificant compared to normal CAN
>> operation (I'm still in favour of this path).
> 
> We speak about a pathological situation and therefore I do not share 
> your concerns. When there are electrical problems or even the cable is 
> not connected, we do have an abnormal mode of operation and CAN related 
> real-time is broken anyhow. The bus error messages are then useful for 
> analyzing the problem. The effect of the bus error interrupts on non-CAN 
> related latencies is another issue but I think it's not that critical 
> either (handling a bus error just requires the reading of 2 SJA1000 
> registers). But I agree, a more detailed analysis of "bus error 
> flooding" would help to understand the impact on the real-time behavior.

Also be aware that heavy CAN traffic can cause similar latencies, and
when there is more than one CAN controller, they can accumulate (as I
have observed in my PCAN dongle tests). Here an IRQ service task or
threaded IRQs would help. Maybe this is the right way to go.
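
For reference, Sebastian's suggestion quoted above (let BEIs through
until error passive is reached and the TX error counter stops counting
up, then stop reporting until error active is reached again) could be
sketched roughly as below. This is only an illustration under
assumptions, not RT-Socket-CAN code: struct bei_filter and
bei_filter_update() are hypothetical names, the counters are assumed to
have been read from the controller by the caller, and the threshold of
128 is the error passive limit from the CAN specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error passive limit of the CAN fault confinement rules. */
#define ERR_PASSIVE_LEVEL 128

/* Hypothetical per-device state; not an existing driver structure. */
struct bei_filter {
    bool suppressed;     /* true while bus errors are not reported */
    uint8_t last_txerr;  /* TX error counter seen on the previous call */
};

/*
 * To be called from the interrupt handler with the current TX and RX
 * error counters. Returns true if the bus error should be reported
 * (ECC read out, error frame delivered), false if it should be ignored.
 */
static bool bei_filter_update(struct bei_filter *f,
                              uint8_t txerr, uint8_t rxerr)
{
    bool passive;

    assert(f != NULL);

    passive = txerr >= ERR_PASSIVE_LEVEL || rxerr >= ERR_PASSIVE_LEVEL;

    if (!passive) {
        /* Back in error active mode: let bus errors through again. */
        f->suppressed = false;
    } else if (txerr <= f->last_txerr) {
        /* Error passive and the TX counter no longer counts up. */
        f->suppressed = true;
    }

    f->last_txerr = txerr;
    return !f->suppressed;
}
```

With a zero-initialized struct bei_filter, counters climbing to 128 are
still reported, a repeated 128 is not, and reporting resumes once both
counters are below 128 again. As noted in the quoted discussion, the
slow, flood-free approach to the passive level is not distinguished
here, so this would need more careful evaluation.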

Wolfgang.


