From: Martin Kozusky
Subject: Re: CAN messages being lost on i.MX25 with flexcan
Date: Fri, 20 Apr 2012 08:17:26 +0200
To: linux-can@vger.kernel.org
In-Reply-To: <4F90FC53.9030709@grandegger.com>
References: <4F8FEF02.7040503@grandegger.com> <4F8FFAB3.1040906@grandegger.com> <4F901500.9070009@grandegger.com> <4F90FC53.9030709@grandegger.com>

On 20.4.2012 8:04, Wolfgang Grandegger wrote:
> On 04/20/2012 07:51 AM, Martin Kozusky wrote:
>> On 19.4.2012 15:37, Wolfgang Grandegger wrote:
>>> On 04/19/2012 01:58 PM, Martin Kozusky wrote:
>>>> On 19.4.2012 13:44, Wolfgang Grandegger wrote:
>>>>> On 04/19/2012 01:21 PM, Martin Kozusky wrote:
>>>>>> On 19.4.2012 12:54, Wolfgang Grandegger wrote:
>>>>>>> On 04/19/2012 12:04 PM, Martin Kozusky wrote:
>>>>>>>> Hello,
>>>>>>>> I'm using a Voipac i.MX25 module with flexcan, kernel 2.6.38.9.
>>>>>>>> I'm sending the data at 250 kbps, around 1100 msgs/sec. When I enable the
>>>>>>>> canbus interface (canconfig can0 start), CPU load is higher, that is
>>>>>>>> understandable, there are many interrupts.
>>>>>>>> I'm not doing anything else
>>>>>>>> than using recvmsg (or recvmmsg, which is a little better), but some
>>>>>>>> messages are still lost (around 1500 messages lost out of 467 000 being
>>>>>>>> sent from another source). When I start doing something (like "cat
>>>>>>>> /proc/interrupts", or write to a file), many more messages are lost.
>>>>>>>>
>>>>>>>> Do you have any idea how to fix this? I need to make a CAN message
>>>>>>>> logger and I cannot lose any message (ideally :) So I made a big buffer in
>>>>>>>> my program so that I don't need to write the messages into the file
>>>>>>>> while "recording" is enabled; after "recording" is switched off, I write
>>>>>>>> the buffer into the file, but that is still not good enough.
>>>>>>>> Is there any way to write to some buffer directly in the flexcan driver
>>>>>>>> (the best would be in the IRQ routine) and then read messages from this
>>>>>>>> buffer in my program?
>>>>>>>> Or are interrupts just lost when doing something else in the system and
>>>>>>>> I cannot fix this? Or can I somehow specify that "can rx interrupts" have
>>>>>>>> the highest priority?
>>>>>>>
>>>>>>> Do you already know where the packets are dropped (get lost)? Maybe your
>>>>>>> user space app is simply not fast enough to process them in time. You
>>>>>>> can check that with the candump option "-d".
>>>>>>
>>>>>> Hello Wolfgang,
>>>>>> I don't know where it gets lost. But I'm just making select() and
>>>>>> recvmmsg in my app now.
>>>>>> Now when I'm trying candump -d, it shows messages and between them:
>>>>>>
>>>>>> can0 18FFC33D [8] 4A E6 00 00 03 00 00 F6
>>>>>> can0 1CEFB8F3 [8] 34 42 23 3D 8C 01 00 00
>>>>>> DROPCOUNT: dropped 245 CAN frames on 'can0' socket (total drops 30886)
>>>>>> can0 18FF0AEF [8] 70 76 46 7C A7 9E 2A 6A
>>>>>> can0 18FF0AF0 [8] B0 7E FB 7D A7 DC 27 69
>>>>>> can0 18FEBF0B [8] F5 03 79 81 69 74 6E 79
>>>>>>
>>>>>> I've sent around 40 000 messages.
>>>>>> Does this help somehow to identify where the problem is?
>>>>>
>>>>> Yes, you are losing messages because your app is not fast enough.
>>>>>
>>>>>> I found out that if I (in my program) don't printf every frame, then
>>>>>> dropcount is better, so I think if candump only showed drops and
>>>>>> not every frame, then it would also get better.
>>>>>
>>>>> printf takes time, of course. Therefore everything which makes your app
>>>>> read the messages faster will help, including increasing the
>>>>> scheduling priority of the process/thread. Also increasing the size of
>>>>> the receive buffer may help. Try candump with the "-r" option.
>>>>
>>>> I tried -r 10000000
>>>>
>>>> and the result was that dropping started some time later than it
>>>> normally would. From 467 000 packets, total drops was 103064. That is a
>>>> better ratio than before. But a few times, my system was stuck for a while,
>>>> then it recovered and continued to work. So I will try to use this
>>>> buffer size in my program, without printf, and I will see how many
>>>> packets I lose.
>>>
>>> If other activities delay your app too much, you are in trouble. Proper
>>> thread priorities may help.
>>
>> So I tried my app with a bigger receive buffer and saving data into memory
>> and it looks like no messages are lost now :) (but when I write them
>> into a file, they are lost, but I can use the temp memory storage, so I
>> can live with that :)
>
> You are obviously hitting the resource limits. Do you need -rt?
> It makes your system slower.

I don't know what steps are necessary to make the kernel (and CAN driver)
-rt :( I will first try to finish my app and see how it behaves.

>
>>>> Is there a way to get the actually used size of the buffer?
>>>
>>> getsockopt?
>>>
>>>>>> This is my /proc/interrupts
>>>>>>
>>>>>> root@vmx25 /opt$ cat /proc/interrupts
>>>>>>            CPU0
>>>>>>   0:          0  -  spi_imx
>>>>>>   5:      32827  -  IMX-uart
>>>>>>   9:        129  -  sdhci
>>>>>>  13:          0  -  spi_imx
>>>>>>  25:          5  -  imxdi_rtc
>>>>>>  33:      12168  -  mxc_nand
>>>>>>  34:          1  -  mxc-sdma
>>>>>>  37:     109275  -  ehci_hcd:usb1
>>>>>>  43:    1624971  -  can0
>>>>>>  46:          0  -  mxcadc
>>>>>>  54:     195357  -  i.MX Timer Tick
>>>>>> 164:          0  -  ESDHCI card 0 detect
>>>>>> Err:          0
>>>>>
>>>>> In the first place, I don't think you are losing messages because they
>>>>> are not read out fast enough from the CAN controller.
>>>>>
>>>>>> I have another problem - when I send messages and nobody is acking them
>>>>>> (no other device on CANbus), system stops responding until I put another
>>>>>> device onto CANbus, but I think I should start a new thread with this.
>>>>>
>>>>> What kernel version do you use? Do you use the flexcan driver from that
>>>>> kernel? The problem is bus errors coming at a high rate.
>>>>
>>>> I'm using 2.6.35.9 Preempt, the flexcan.c driver is original.
>>>
>>> IIRC, that kernel version did not include the flexcan driver. It entered
>>> with 2.6.36. Anyway, maybe the following patch helps:
>>>
>>> http://patchwork.ozlabs.org/patch/130761/
>>
>> It seems that it helped :) when nobody is ACKing, it doesn't hang.
>
> Good. What do the CAN statistics show ($ ip -d -s link show can0)?

Now it shows (after some time of experimenting, it was disconnected from
canbus for a while)

2: can0: mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT qlen 10
    link/can
    can state STOPPED (berr-counter tx 0 rx 0) restart-ms 0
    bitrate 250000 sample-point 0.857
    tq 285 prop-seg 5 phase-seg1 6 phase-seg2 2 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 66500000
    re-started bus-errors arbit-lost error-warn error-pass bus-off
    0          0          0          1          0          0
    RX: bytes  packets  errors  dropped overrun mcast
    4758311    602831   212     0       212     0
    TX: bytes  packets  errors  dropped carrier collsns
    73         73       0       0       0       0

Martin

> Wolfgang.
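
For reference, the approach that ended up working in this thread (a bigger socket receive buffer, frames kept in memory, no printf or file writes while recording) could be sketched roughly as below. This is only a minimal Python sketch under assumptions, not the logger discussed here: the names enlarge_rcvbuf, record and decode are made up for illustration, and the frame layout assumes the classic 16-byte struct can_frame. Note that getsockopt(SO_RCVBUF) reports the configured buffer size (Linux doubles the requested value and caps it at net.core.rmem_max), not how full the buffer currently is.

```python
import socket
import struct

# Assumed layout of the classic struct can_frame:
# 32-bit can_id, 1-byte DLC, 3 padding bytes, 8 data bytes (16 bytes total).
CAN_FRAME = struct.Struct("=IB3x8s")
CAN_EFF_MASK = 0x1FFFFFFF  # strips the EFF/RTR/ERR flag bits from can_id


def enlarge_rcvbuf(sock, size):
    """Request a larger receive buffer and return the size the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def record(sock, max_frames):
    """Read up to max_frames raw frames into one preallocated buffer.

    Nothing is printed or written to disk inside the loop; the loop ends
    when the buffer is full, on a socket timeout, or on a short read.
    """
    store = bytearray(max_frames * CAN_FRAME.size)
    view = memoryview(store)
    count = 0
    while count < max_frames:
        offset = count * CAN_FRAME.size
        try:
            n = sock.recv_into(view[offset:offset + CAN_FRAME.size])
        except socket.timeout:
            break
        if n != CAN_FRAME.size:
            break
        count += 1
    return store, count


def decode(store, count):
    """Decode the recorded frames afterwards, once recording has stopped."""
    for i in range(count):
        can_id, dlc, data = CAN_FRAME.unpack_from(store, i * CAN_FRAME.size)
        yield can_id & CAN_EFF_MASK, data[:dlc]
```

On a real interface the socket would be opened with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) and bound with bind(("can0",)); the file is written from decode() only after record() has returned, so disk I/O never competes with the receive loop.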