From mboxrd@z Thu Jan 1 00:00:00 1970 From: Oliver Hartkopp Subject: Re: peak_pci: TX Frame Loss Date: Wed, 02 Dec 2015 20:19:21 +0100 Message-ID: <565F4439.3050309@hartkopp.net> References: <20151118145121.32487.38169@maxwell.marel.net> <20151202180937.19023.96078@maxwell.marel.net> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Return-path: Received: from mo4-p00-ob.smtp.rzone.de ([81.169.146.161]:64504 "EHLO mo4-p00-ob.smtp.rzone.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756416AbbLBTTZ (ORCPT ); Wed, 2 Dec 2015 14:19:25 -0500 In-Reply-To: <20151202180937.19023.96078@maxwell.marel.net> Sender: linux-can-owner@vger.kernel.org List-ID: To: Andri Yngvason , linux-can@vger.kernel.org Cc: wg@grandegger.com, mkl@pengutronix.de, s.grosjean@peak-system.com, hrafnkell.eiriksson@marel.com, haukur.hafsteinsson@marel.com On 12/02/2015 07:09 PM, Andri Yngvason wrote: > Quoting Andri Yngvason (2015-11-18 14:51:21) > [...] >> We've been experiencing frame loss on transmission in the peak_pci netdev >> driver. > [...] > > Hi all, > > I think I've figured it out. > > I wrote two programs: One that sends out bursts of can frames with the same > can_id but increasing data[0] and one that listens to can frames and stops > ftrace when a frame is missing. These programs also place ftrace markers. > See link below [1]. > > In cases where frames were lost sja1000_write_cmdreg() is called from two cores > at the same time. > > 7) | sja1000_interrupt [sja1001]() { > [...] > 3) | raw_sendmsg [can_raw]() { > [...] > 3) | dev_hard_start_xmit() { > 3) | sja1000_start_xmit [sja1000]() { > 7) 2.071 us | peak_pci_read_reg [peak_pci](); > 3) 3.901 us | peak_pci_read_reg [peak_pci](); > [...] > 7) 2.066 us | peak_pci_read_reg [peak_pci](); > 3) 0.139 us | can_put_echo_skb [can_dev](); > 3) | sja1000_write_cmdreg [sja1000]() { > start_xmit acquires spin lock: > 3) 0.050 us | _raw_spin_lock_irqsave(); > start_xmit writes to command register: > 3) 0.043 us | peak_pci_write_reg [peak_pci](); > 3) 2.516 us | peak_pci_read_reg [peak_pci](); > 7) | sja1000_write_cmdreg [sja1000]() { > sja1000_interrupt from other thread stalls on spin lock: > 7) 2.035 us | _raw_spin_lock_irqsave(); > start_xmit releases spin lock: > 3) 0.083 us | _raw_spin_unlock_irqrestore(); > sja1000_interrupt continues and writes to command register: > 7) 0.057 us | peak_pci_write_reg [peak_pci](); > 3) 4.456 us | } > 3) + 21.813 us | } > 7) 1.657 us | peak_pci_read_reg [peak_pci](); > 3) + 22.632 us | } > 3) 0.044 us | _raw_spin_lock(); > 3) + 26.822 us | } > 3) 0.105 us | __local_bh_enable_ip(); > 3) + 30.474 us | } > 3) + 30.985 us | } > sja1000_interrupt releases spin lock: > 7) 0.072 us | _raw_spin_unlock_irqrestore(); > > As far as locking goes, this looks OK, but maybe the sja1000 (or peak's FPGA > implementation thereof) hasn't settled after writing to the command register? > > I tried adding a delay to test my theory: > diff --git a/drivers/net/can/sja1000/sja1000.c b/drivers/net/can/sja1000/sja1000.c > index f968d1e..b5115c2 100644 > --- a/drivers/net/can/sja1000/sja1000.c > +++ b/drivers/net/can/sja1000/sja1000.c > @@ -93,6 +93,7 @@ static void sja1000_write_cmdreg(struct sja1000_priv *priv, u8 val) > spin_lock_irqsave(&priv->cmdreg_lock, flags); > priv->write_reg(priv, SJA1000_CMR, val); > priv->read_reg(priv, SJA1000_SR); > + udelay(10); > spin_unlock_irqrestore(&priv->cmdreg_lock, flags); > } Hi Andri, looking at the code above I wonder whether the priv->read_reg(priv, SJA1000_SR); is a bad hack anyway. The read_reg() becomes readb(priv->reg_base + (port << 2)) in peak_pci.c and the result is never used. What if the entire operation gets 'optimized' at some point? Can you check what happens if you just replace the priv->read_reg() with your udelay(10) call? Best regards, Oliver > > I've been running the test programs for a while now and I haven't lost a frame > yet, so this delay seems to solve the problem. > > Do you guys think that this is an acceptable solution? > Should it maybe reside within peak_pci.c instead? > > Regards, > Andri > > [1] https://gist.github.com/any1/7062000f8c02b9b27bf6 > -- > To unsubscribe from this list: send the line "unsubscribe linux-can" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >