Linux CAN drivers development
From: dariobin@libero.it
To: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Jacob Kroon <jacob.kroon@gmail.com>,
	Oliver Hartkopp <socketcan@hartkopp.net>,
	linux-can@vger.kernel.org, wg@grandegger.com
Subject: Re: CM-ITC, pch_can/c_can_pci, sendto() returning ENOBUFS
Date: Thu, 22 Sep 2022 09:20:52 +0200 (CEST)
Message-ID: <11136408.601940.1663831252572@mail1.libero.it>
In-Reply-To: <20220921074741.admuodnlv4yexfwr@pengutronix.de>

Hi Marc,

> On 21/09/2022 09:47 Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> 
>  
> On 21.09.2022 09:25:41, dariobin@libero.it wrote:
> > > On 9/16/22 06:14, Jacob Kroon wrote:
> > > ...
> > > > What I do know is that if I revert commit:
> > > > 
> > > > "can: c_can: cache frames to operate as a true FIFO"
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=387da6bc7a826cc6d532b1c0002b7c7513238d5f
> > > > 
> > > > then everything looks good. I don't get any BUG messages, and the host 
> > > > has been running overnight without problems, so it seems to have fixed 
> > > > the network interface lockup as well.
> > 
> > Here's what I think:
> > If one or more messages are cached, the controller has to transmit more frames
> > per unit of time once they can be transmitted (IF_COMM_TXRQST) than when each
> > transmission is triggered directly by a request from user space. In the case
> > of cached data transmission I therefore think that the controller is more heavily
> > loaded. Could this shift the balance?
> > 
> > > 
> > > I ran the kernel *with* the commit above, and also with the following patch:
> > > 
> > > > diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
> > > > index 52671d1ea17d..4375dc70e21f 100644
> > > > --- a/drivers/net/can/c_can/c_can_main.c
> > > > +++ b/drivers/net/can/c_can/c_can_main.c
> > > > @@ -1,3 +1,4 @@
> > > > +#define DEBUG
> > > >  /*
> > > >   * CAN bus driver for Bosch C_CAN controller
> > > >   *
> > > > @@ -469,8 +470,15 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
> > > >  	if (c_can_get_tx_free(tx_ring) == 0)
> > > >  		netif_stop_queue(dev);
> > > >  
> > > > -	if (idx < c_can_get_tx_tail(tx_ring))
> > > > +	netdev_dbg(dev, "JAKR:%d:%d:%d:%d\n", idx,
> > > > +	                                      c_can_get_tx_head(tx_ring),
> > > > +	                                      c_can_get_tx_tail(tx_ring),
> > > > +	                                      c_can_get_tx_free(tx_ring));
> > > > +
> > > > +	if (idx < c_can_get_tx_tail(tx_ring)) {
> > > >  		cmd &= ~IF_COMM_TXRQST; /* Cache the message */
> > > > +		netdev_dbg(dev, "JAKR:Caching messages\n");
> > > > +	}
> > > >  
> > > >  	/* Store the message in the interface so we can call
> > > >  	 * can_put_echo_skb(). We must do this before we enable
> > > 
> > > and I've uploaded the entire log I could capture from /dev/kmsg, right 
> > > up to the hang, here:
> > > 
> > > https://pastebin.com/6hvAcPc9
> > > 
> > > What looks odd to me right from the start is that sometimes when idx 
> > > rolls over to 0, and *only* when it rolls over to 0, the CAN frame gets 
> > > cached because "idx < c_can_get_tx_tail(tx_ring)".
> > 
> > If the message were not stored but transmitted, the order of transmission 
> > would not be respected.
> > 
> > > 
> > > Is it possible there is some difference between c_can and d_can in how 
> > > the HW buffers are working, which breaks the driver on my particular HW 
> > > setup ?
> > > 
> > 
> > I tested the patch on a BeagleBone board without encountering any problems.
> > There is also a version of the driver I submitted to Xenomai running on a custom
> > board without problems. But surely the setup and context are different from yours.
> > 
> > What compatible are you using in your device tree?
> > I used "ti,am3352-d_can".
> 
> I think Jacob's board has a c_can core, while the beagle bone uses a
> d_can. Maybe there's a subtle difference between these cores?
> 
> Dario, do you have access to a real c_can core to test?
No, I only have access to a d_can core.

> 
> As reverting 387da6bc7a82 ("can: c_can: cache frames to operate as a
> true FIFO") helps to fix Jacob's problem, a temporary solution might be
> to only cache frames on d_can cores.
OK
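
For reference, here is a minimal user-space sketch of the caching condition
discussed above, extended with the d_can-only gate Marc suggests. It is not
the driver code: struct tx_ring_model, enum model_core and the model_*
helpers are hypothetical names made up for illustration. It also assumes
that head and tail are free-running counters masked by a power-of-two
obj_num.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the TX ring; not the driver's own structure. */
struct tx_ring_model {
	uint32_t head;    /* free-running count of frames queued (assumption) */
	uint32_t tail;    /* free-running count of frames completed (assumption) */
	uint32_t obj_num; /* number of TX message objects, power of two */
};

/* Hypothetical core type, standing in for whatever the driver would check. */
enum model_core {
	MODEL_C_CAN,
	MODEL_D_CAN,
};

/* Slot that the next frame would be written to. */
static uint32_t model_tx_head(const struct tx_ring_model *r)
{
	return r->head & (r->obj_num - 1);
}

/* Slot of the oldest frame still pending. */
static uint32_t model_tx_tail(const struct tx_ring_model *r)
{
	return r->tail & (r->obj_num - 1);
}

/*
 * The controller sends the lowest-numbered pending object first. If the
 * head slot has wrapped around and is now below the tail slot, requesting
 * transmission right away would let the new frame overtake older ones,
 * so it is held back (cached). In this sketch the caching is applied on
 * d_can cores only, as in the proposed temporary solution.
 */
static bool model_must_cache(const struct tx_ring_model *r,
			     enum model_core core)
{
	if (core != MODEL_D_CAN)
		return false;

	return model_tx_head(r) < model_tx_tail(r);
}
```

With 8 TX objects (an arbitrary choice for this example), head = 8 and
tail = 5 give head slot 0 and tail slot 5, so the frame would be cached on
a d_can core and sent immediately on a c_can core.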

Thanks and regards,
Dario

> 
> regards,
> Marc
> 
> -- 
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
