From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tom Evans
Subject: Re: [Rfi] Cyclone V CAN errors when application pinned to CPU1
Date: Mon, 8 Feb 2016 09:19:37 +1100
Message-ID: <56B7C2F9.2090901@optusnet.com.au>
References: <562155B7.7020504@vsis.cz>
 <20151020071807.GH20879@pengutronix.de>
 <5625EF45.2000807@pengutronix.de>
 <56B63491.9020500@vstk.cz>
 <56B6750D.4040602@optusnet.com.au>
 <56B6881F.3010606@vstk.cz>
Reply-To: tom_usenet@optusnet.com.au
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail109.syd.optusnet.com.au ([211.29.132.80]:37566
 "EHLO mail109.syd.optusnet.com.au" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1754041AbcBGWTn (ORCPT );
 Sun, 7 Feb 2016 17:19:43 -0500
In-Reply-To: <56B6881F.3010606@vstk.cz>
Sender: linux-can-owner@vger.kernel.org
List-ID:
To: Vlastimil Setka, Marc Kleine-Budde, Robert Schwebel
Cc: rfi@lists.rocketboards.org, linux-can

On 07/02/16 10:56, Vlastimil Setka wrote:
> 6.2.2016 23:34 Tom Evans:
>> On 7/02/2016 4:59 AM, Vlastimil Setka wrote:
>>>>> We have a linux application which sends data
>>>>> periodically (1 to 20 ms period) out over the
>>>>> can0 socketcan interface. Sometimes the first
>>>>> data byte in the CAN frame is zero on the wire,
>>>>> but non-zero in the data sent!
>>>> The TX functions is usually pretty straight forward. Copy all
>>>> data bytes into the hardware, write ID and DLC, then hit the
>>>> send bit (or whatever triggers the hardware to send the frame).
>>>> Maybe there's some barrier missing in this sequence?
>> I'd suggest you "objdump -S" the CAN driver object file and check
>> to see the optimizer hasn't re-ordered the above sequence too much.

...

>>> It can be reproducibly triggered by a high network load on
>>> ethernet generated by iperf for example.
>>
>> Which generates a lot of interrupts. Which are probably
>> interrupting the above transmit sequence and delaying its
>> completion. During which time something else can get in.

...

>> I'd suggest you add a "reentry counter" to the driver

An easier way to investigate this (initially, you'll need things like
reentry counters later) is to add "spin_lock_irqsave()" and
"spin_unlock_irqrestore()" pairs around sections of the CAN driver
code you want to "protect".

Have a look at the other CAN drivers that do this and some of the
large number of Ethernet drivers for examples of how to declare the
variables and call the functions.

I'm not recommending the above as a fix, just as a tool for narrowing
down the cause of your problem.

If this is bad advice (the wrong functions etc) I ask others on the
list that know better to correct me.

Tom
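
P.S. For an idea of what the "reentry counter" mentioned above might look like, here is a rough user-space sketch in C. It is not driver code. All the names (reentry_probe, probe_enter, probe_exit, fake_tx) are made up for illustration, and fake_tx() only simulates an interrupt re-entering the transmit path by calling itself. In a real driver you'd presumably use the kernel's atomic_t helpers rather than <stdatomic.h>, and where to place the calls depends on the driver, so treat this only as a hint at what to count.

```c
#include <stdatomic.h>

/* Hypothetical names throughout; nothing here comes from a real CAN driver. */
struct reentry_probe {
    atomic_int depth;    /* callers currently inside the watched section */
    atomic_int overlaps; /* entries that found someone already inside */
};

/* Call on entry to the section under suspicion.
 * Returns 1 if another caller was already inside, 0 otherwise. */
static int probe_enter(struct reentry_probe *p)
{
    /* atomic_fetch_add() returns the value held before the increment. */
    int before = atomic_fetch_add(&p->depth, 1);

    if (before != 0) {
        atomic_fetch_add(&p->overlaps, 1);
        return 1;
    }
    return 0;
}

/* Call on every exit path from the watched section. */
static void probe_exit(struct reentry_probe *p)
{
    atomic_fetch_sub(&p->depth, 1);
}

/* Stand-in for the transmit path. 'nested' simulates that many
 * interrupts re-entering the same code partway through. */
static void fake_tx(struct reentry_probe *p, int nested)
{
    probe_enter(p);
    if (nested > 0)
        fake_tx(p, nested - 1); /* the simulated interrupt arrives here */
    probe_exit(p);
}
```

If "overlaps" ever goes non-zero under the iperf load, the sequence is being re-entered, and that section is the one worth wrapping in the irqsave lock to see whether the bad byte goes away.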