From: Wolfgang Grandegger
Subject: Re: CAN messages being lost on i.MX25 with flexcan - continued
 (was CAN messages being lost on i.MX25 with flexcan - 2012-04-19)
Date: Wed, 30 Oct 2013 10:27:20 +0100
Message-ID: <5270D0F8.10106@grandegger.com>
References: <526A6B28.4040800@kkmicro.cz> <526AB12C.7090900@grandegger.com>
 <526C0768.8040903@kkmicro.cz> <526C1A90.4050005@grandegger.com>
 <526F9216.6010506@kkmicro.cz> <526FA40D.8000202@grandegger.com>
 <526FACBD.4030605@kkmicro.cz> <526FC670.4000209@grandegger.com>
 <5270C6B5.8050408@kkmicro.cz> <5270CB8D.5020206@grandegger.com>
 <5270CDDD.9080405@kkmicro.cz>
In-Reply-To: <5270CDDD.9080405@kkmicro.cz>
To: "Martin Kožuský [KK micro s.r.o.]", linux-can@vger.kernel.org

On 10/30/2013 10:14 AM, "Martin Kožuský [KK micro s.r.o.]" wrote:
> -------- Original Message --------
> Subject: Re: CAN messages being lost on i.MX25 with flexcan - continued
> (was CAN messages being lost on i.MX25 with flexcan - 2012-04-19)
> From: Wolfgang Grandegger
> To: Martin Kožuský [KK micro s.r.o.], linux-can@vger.kernel.org
> Date: 30. October 2013 10:04:13
>
>> On 10/30/2013 09:43 AM, "Martin Kožuský [KK micro s.r.o.]" wrote:
>>> -------- Original Message --------
>>> Subject: Re: CAN messages being lost on i.MX25 with flexcan - continued
>>> (was CAN messages being lost on i.MX25 with flexcan - 2012-04-19)
>>> From: Wolfgang Grandegger
>>> To: Martin Kozusky, linux-can@vger.kernel.org
>>> Date: 29. October 2013 15:30:08
>>>
>>>> On 10/29/2013 01:40 PM, Martin Kozusky wrote:
>>>>> On 29.10.2013 13:03, Wolfgang Grandegger wrote:
>>>>>> On 10/29/2013 11:46 AM, Martin Kozusky wrote:
>>>> ...
>>>>>>> Hello Wolfgang,
>>>>>>> it seems that my architecture (arm/mx25 on 2.6.35 kernel) is
>>>>>>> missing the HAVE_FUNCTION_GRAPH_TRACER and HAVE_DYNAMIC_FTRACE
>>>>>>> options, so it won't be that easy, will it?
>>>>>>> The timestamps that ftrace is showing me have 10 millisecond
>>>>>>> resolution, that won't help me much :(
>>>>
>>>> Are high resolution timers enabled in the kernel? Still, event
>>>> tracing would already be useful.
>>>>
>>>>>> Probably that version is too old for proper ftrace support. The
>>>>>> 100us you measured for alloc_can_skb() is worst case, right? What
>>>>>> is the mean value?
>>>>>
>>>>> Now I checked again and logged every call (don't know if really
>>>>> everything was logged but something was :) and I see that it is not
>>>>> 100usec, but only around 20usec (mean value - checked by eye). There
>>>>> were some very long calls (around 2ms!) that were putting errors
>>>>> into my sum/count formula (maybe I should filter out calls longer
>>>>> than 200usec), with this error it was not 100usec, but almost 1ms
>>>>> (my value was only 32bit and was overflowing when I wrote it the
>>>>> first time). So normally around 20usec, but with very long calls
>>>>> around 2-3 ms (looks like the long ones are periodic - each 6th -
>>>>> 8th call is much longer, but not all the time)
>>>>
>>>> Are these long latencies related to the SDcard accesses? I think the
>>>> problem is the rather long latencies caused by other kernel
>>>> activity. In your case caused by the MMC (SDcard) driver, I assume.
>>>> The Flexcan controller does buffer up to 5 messages before losing
>>>> packets.
>>> I think it is not primarily related to the SD card, the tests I was
>>> doing lately were done when the system was idle, no special processes
>>> were running. But when I do access the SD card, the problems get
>>> bigger.
>>>
>>>>> But still, with those 20usec calls, there are many RX overflows. If
>>>>> I disable the call to alloc_can_skb, there are no overflows and all
>>>>> is received (but still not processed further, because I don't have
>>>>> skb ... )
>>>>
>>>> Could you run this test for a longer time? The probability is just
>>>> lower that RX gets interrupted for a longer time, I think.
>>> I have run it for a few minutes, problem is still the same :(

BTW: with long I was thinking about hours rather than minutes.

>>>
>>>> I see a few approaches to overcome this problem:
>>>>
>>>> - fix the MMC driver to cause less latency. If you are lucky this is
>>>>   already the case in more recent versions of the kernel.
>>> Hard to say if this helps when it also happens when idle.
>>
>> OK, then I misinterpreted your error description.
>>
>>>> - Use the "-rt" extension (CONFIG_PREEMPT_RT). It will then allow to
>>>>   adjust priorities of soft and hard irq threads.
>>> I don't know if there is a patch for this :(
>>>
>>>> - Do the RX processing in the interrupt context, which would require
>>>>   some custom modifications.
>>> I was thinking about writing data directly to a fifo file, my program
>>> would read from it
>>
>> Well, before hacking something you should try to find out what is
>> provoking the long latencies (> 1ms). FTrace is your friend. Therefore
>> I would try to get a more recent version of the kernel running
>> somehow. 2.6.39 should already be much better.
> I will try to get a new kernel running, but when I checked the patch
> for 2.6.35, I see some incompatibilities in directory structure and
> files, so I will have to go line-by-line and make my patch so that it
> fits the new kernel. And when I do that, I will try it on the latest
> 3.x kernel.
> I hope I will make it work.

Porting board support to a recent kernel version would be ideal, of
course. But it might be much less straightforward than porting to a
close version, e.g. 2.6.39. Note that version 2.6.35 is more than 3
years old.

Wolfgang.
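
A minimal sketch of the latency bookkeeping discussed in the quoted
text, assuming per-call durations are already available in
microseconds. The sum is kept in a 64-bit accumulator so it cannot wrap
the way the 32-bit value did, and calls above a threshold (200 usec, as
suggested in the thread) are counted separately instead of skewing the
mean. All names here (lat_stats, lat_stats_add, lat_stats_mean,
LAT_OUTLIER_US) are hypothetical and not part of the flexcan driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: longer calls are counted as outliers. */
#define LAT_OUTLIER_US 200u

struct lat_stats {
	uint64_t sum_us;   /* 64-bit sum of non-outlier durations */
	uint32_t count;    /* number of non-outlier samples */
	uint32_t outliers; /* number of samples above LAT_OUTLIER_US */
	uint32_t max_us;   /* longest duration seen, outliers included */
};

/* Record one measured call duration (in microseconds). */
static void lat_stats_add(struct lat_stats *s, uint32_t dur_us)
{
	assert(s != NULL);

	if (dur_us > s->max_us)
		s->max_us = dur_us;

	if (dur_us > LAT_OUTLIER_US) {
		/* Keep long calls out of the mean, but do not lose them. */
		s->outliers++;
		return;
	}

	s->sum_us += dur_us;
	s->count++;
}

/* Mean of the non-outlier samples, or 0 if none were recorded. */
static uint32_t lat_stats_mean(const struct lat_stats *s)
{
	assert(s != NULL);

	return s->count ? (uint32_t)(s->sum_us / s->count) : 0;
}
```

Reporting the outlier count and the maximum next to the mean keeps the
2-3 ms calls visible, which is the part that matters for the RX
overflows.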