Linux CAN drivers development
From: Marc Kleine-Budde <mkl@pengutronix.de>
To: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: "Wolfgang Grandegger" <wg@grandegger.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Eric Dumazet" <edumazet@google.com>,
	netdev@vger.kernel.org, linux-can@vger.kernel.org,
	"Jérémie Dautheribes" <jeremie.dautheribes@bootlin.com>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	sylvain.girard@se.com, pascal.eberhard@se.com,
	stable@vger.kernel.org
Subject: Re: [PATCH v3] can: sja1000: Always restart the Tx queue after an overrun
Date: Wed, 4 Oct 2023 11:41:08 +0200
Message-ID: <20231004-uneasy-backed-e01d77be9f51-mkl@pengutronix.de>
In-Reply-To: <20231002160206.190953-1-miquel.raynal@bootlin.com>

On 02.10.2023 18:02:06, Miquel Raynal wrote:
> Upstream commit 717c6ec241b5 ("can: sja1000: Prevent overrun stalls with
> a soft reset on Renesas SoCs") fixes an issue with Renesas' own SJA1000
> CAN controller reception: the Rx buffer is only 5 messages long, so when
> the bus is loaded (e.g. a message every 50us), an overrun may easily
> happen. Upon an overrun situation, due to a possible internal crosstalk
> situation, the controller enters a frozen state which can only be
> unlocked with a soft reset (experimentally). The solution was to offload
> a call to sja1000_start() in a threaded handler. This needs to happen in
> process context as this operation needs to sleep. sja1000_start()
> basically enters "reset mode", performs a proper software reset and
> returns to "normal mode".
> 
> Since this fix was introduced, we no longer observe any stalls in
> reception. However it was sporadically observed that the transmit path
> would now freeze. Further investigation blamed the fix mentioned above,
> and especially the reset operation. Reproducing the reset in a loop
> helped identify what could possibly go wrong. The sja1000 is a single
> Tx queue device, which leverages the netdev helpers to process one Tx
> message at a time. The logic is: the queue is stopped, the message sent
> to the transceiver, once properly transmitted the controller sets a
> status bit which triggers an interrupt, in the interrupt handler the
> transmission status is checked and the queue woken up. Unfortunately, if
> an overrun happens, we might perform the soft reset precisely between
> the transmission of the buffer to the transceiver and the advent of the
> transmission status bit. We would then stop the transmission operation
> without re-enabling the queue, leading to all further transmissions
> being ignored.
> 
> The reset interrupt can only happen while the device is "open", and
> after a reset we anyway want to resume normal operations, no matter if a
> packet to transmit got dropped in the process, so we shall wake up the
> queue. Restarting the device and waking-up the queue is exactly what
> sja1000_set_mode(CAN_MODE_START) does. In order to be consistent about
> the queue state, we must acquire a lock both in the reset handler and in
> the transmit path to ensure serialization of both operations. It turns
> out, a lock is already held when entering the transmit path, so we can
> just acquire/release it as well with the regular net helpers inside the
> threaded interrupt handler and this way we should be safe. As the
> reset handler might still be called after the transmission of a frame to
> the transceiver but before it actually gets transmitted, we must ensure
> we don't leak the skb, so we free it (the behavior is consistent, no
> matter if there was an skb on the stack or not).
> 
> Fixes: 717c6ec241b5 ("can: sja1000: Prevent overrun stalls with a soft reset on Renesas SoCs")
> Cc: stable@vger.kernel.org
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>

Have you compile tested this against current net/main?

|   CC [M]  drivers/net/can/sja1000/sja1000.o
| drivers/net/can/sja1000/sja1000.c: In function ‘sja1000_reset_interrupt’:
| drivers/net/can/sja1000/sja1000.c:398:9: error: too few arguments to function ‘can_free_echo_skb’
|   398 |         can_free_echo_skb(dev, 0);
|       |         ^~~~~~~~~~~~~~~~~
| In file included from include/linux/can/dev.h:22,
|                  from drivers/net/can/sja1000/sja1000.c:62:
| include/linux/can/skb.h:28:6: note: declared here
|    28 | void can_free_echo_skb(struct net_device *dev, unsigned int idx,
|       |      ^~~~~~~~~~~~~~~~~
|

This change has been in mainline since v5.13-rc1~94^2~297^2~34. I've
fixed the problem while applying the patch.
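
For reference, a minimal userspace sketch of the call sequence the commit
message describes, using the three-argument can_free_echo_skb(). It is
not the patch as applied. Assumptions: the third parameter is cut off in
the compiler note above and is taken here to be an
"unsigned int *frame_len_ptr" for which NULL is acceptable; the struct
fields and all function bodies are hypothetical stand-ins for the kernel
ones, kept only so the ordering can be compiled and checked outside the
kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct net_device. */
struct net_device {
	int tx_locked;        /* 1 while the Tx lock is held */
	int echo_skb_present; /* 1 while an echo skb is pending in slot 0 */
	int queue_stopped;    /* 1 while the single Tx queue is stopped */
};

enum can_mode { CAN_MODE_STOP, CAN_MODE_START, CAN_MODE_SLEEP };

/* Mock of the regular net helpers that serialize against the Tx path. */
static void netif_tx_lock(struct net_device *dev)   { dev->tx_locked = 1; }
static void netif_tx_unlock(struct net_device *dev) { dev->tx_locked = 0; }

/* Mock with the assumed three-argument form. Passing NULL means the
 * caller does not want a frame length back. */
static void can_free_echo_skb(struct net_device *dev, unsigned int idx,
			      unsigned int *frame_len_ptr)
{
	(void)idx;
	assert(dev->tx_locked);   /* must not race with the Tx path */
	if (frame_len_ptr)
		*frame_len_ptr = 0; /* the mock does not track lengths */
	dev->echo_skb_present = 0;  /* harmless if no skb was pending */
}

/* Mock: restarting the device also wakes the Tx queue. */
static int sja1000_set_mode(struct net_device *dev, enum can_mode mode)
{
	assert(dev->tx_locked);
	if (mode != CAN_MODE_START)
		return -1;
	dev->queue_stopped = 0;
	return 0;
}

/* Sketch of the threaded reset handler: take the Tx lock, drop any
 * pending echo skb, restart the controller, release the lock. */
static int sja1000_reset_interrupt(struct net_device *dev)
{
	netif_tx_lock(dev);
	can_free_echo_skb(dev, 0, NULL);
	sja1000_set_mode(dev, CAN_MODE_START);
	netif_tx_unlock(dev);
	return 1; /* stands in for IRQ_HANDLED */
}
```

With a device that has a pending echo skb and a stopped queue, the
handler leaves the skb freed, the queue awake and the lock released,
which is the state the commit message asks for after an overrun.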

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |

Thread overview: 3+ messages
2023-10-02 16:02 [PATCH v3] can: sja1000: Always restart the Tx queue after an overrun Miquel Raynal
2023-10-04  9:41 ` Marc Kleine-Budde [this message]
2023-10-04  9:55   ` Miquel Raynal