From: Andrea della Porta <andrea.porta@suse.com>
To: Nicolai Buchwitz <nb@tipi-net.de>
Cc: Andrea della Porta <andrea.porta@suse.com>,
netdev@vger.kernel.org, Theo Lebrun <theo.lebrun@bootlin.com>,
Nicolas Ferre <nicolas.ferre@microchip.com>,
Claudiu Beznea <claudiu.beznea@tuxon.dev>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-rpi-kernel@lists.infradead.org,
Lukasz Raczylo <lukasz@raczylo.com>,
Steffen Jaeckel <sjaeckel@suse.de>
Subject: Re: [PATCH] net: macb: add TX stall timeout callback to recover from lost TSTART write
Date: Fri, 12 Jun 2026 14:51:33 +0200
Message-ID: <aiwA1dD-qXcT3hds@apocalypse>
In-Reply-To: <dbc25b24865a1ad92555bc826bef4267@tipi-net.de>
Hi Nicolai,
On 14:23 Fri 12 Jun , Nicolai Buchwitz wrote:
> Hi Andrea
>
> On 12.6.2026 11:01, Andrea della Porta wrote:
> > From: Lukasz Raczylo <lukasz@raczylo.com>
> >
> > The MACB found in the Raspberry Pi RP1 suffers from sporadic stalls on
> > the TX queue.
> > While the exact root cause is not yet fully understood, it is likely
> > related to a hardware issue where a TSTART write to the NCR register
> > is missed, preventing the transmission from being kicked off.
> >
> > Implement a timeout callback to handle TX queue stalls, triggering the
> > existing restart mechanism to recover.
> >
> > Link:
> > https://lore.kernel.org/all/20260514215459.36109-1-lukasz@raczylo.com/
> > Fixes: dc110d1b23564 ("net: cadence: macb: Add support for Raspberry Pi
> > RP1 ethernet controller")
> > Signed-off-by: Lukasz Raczylo <lukasz@raczylo.com>
> > Co-developed-by: Steffen Jaeckel <sjaeckel@suse.de>
> > Signed-off-by: Steffen Jaeckel <sjaeckel@suse.de>
> > Co-developed-by: Andrea della Porta <andrea.porta@suse.com>
> > Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
> > ---
> > drivers/net/ethernet/cadence/macb_main.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/cadence/macb_main.c
> > b/drivers/net/ethernet/cadence/macb_main.c
> > index a12aa21244e83..615da65d5d68d 100644
> > --- a/drivers/net/ethernet/cadence/macb_main.c
> > +++ b/drivers/net/ethernet/cadence/macb_main.c
> > @@ -4522,6 +4522,16 @@ static int macb_setup_tc(struct net_device *dev,
> > enum tc_setup_type type,
> > }
> > }
> >
> > +static void macb_tx_timeout(struct net_device *dev, unsigned int q)
> > +{
> > + struct macb *bp = netdev_priv(dev);
> > +
> > + if (net_ratelimit())
>
> Do we need the net_ratelimit() check (and message) here? AFAIU the watchdog
> core already prints a message for every timeout.
Correct. I'll drop those two lines.
>
> > + netdev_err(dev, "TX stall detected, re-kicking TSTART\n");
> > + dev->stats.tx_errors++;
> > + macb_tx_restart(&bp->queues[q]);
> > +}
> > +
> > static const struct net_device_ops macb_netdev_ops = {
> > .ndo_open = macb_open,
> > .ndo_stop = macb_close,
> > @@ -4540,6 +4550,7 @@ static const struct net_device_ops macb_netdev_ops
> > = {
> > .ndo_hwtstamp_set = macb_hwtstamp_set,
> > .ndo_hwtstamp_get = macb_hwtstamp_get,
> > .ndo_setup_tc = macb_setup_tc,
> > + .ndo_tx_timeout = macb_tx_timeout,
>
> The commit message describes it as RP1 specific, but it gets applied to all
> other variants?
I've seen this issue only on the Raspberry Pi 5, but AFAIK it could
also affect other MACB blocks connected through PCIe, so it may be
more widespread (although in that case it would probably have been
noticed already). The original driver defines no timeout callback,
which amounts to pretending that whatever caused the timeout will go
away on its own, regardless of the cause or the specific hw. So in my
opinion we can just extend the callback to all MACB variants.
Or maybe we should execute the restart conditionally on
.compatible = "raspberrypi,rp1-gem"?
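To make that second option concrete, here is a rough userspace model of
the conditional behaviour. It is not driver code: every fake_* name is a
hypothetical stand-in, and in the real driver the check would more
likely be keyed off per-variant data than a string compare (that part
is an assumption).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the driver's private state; not struct macb. */
struct fake_macb {
	const char *compatible;  /* compatible string the device was probed with */
	unsigned int tx_errors;  /* models dev->stats.tx_errors */
	unsigned int restarts;   /* counts calls to the modelled restart */
};

/* Models macb_tx_restart(): only records that a re-kick was requested. */
static void fake_tx_restart(struct fake_macb *bp)
{
	bp->restarts++;
}

/* True only for the RP1 GEM compatible mentioned above. */
static bool fake_needs_tx_rekick(const struct fake_macb *bp)
{
	return bp->compatible &&
	       strcmp(bp->compatible, "raspberrypi,rp1-gem") == 0;
}

/* Models the timeout callback: always count the error, re-kick only on RP1. */
static void fake_tx_timeout(struct fake_macb *bp)
{
	bp->tx_errors++;
	if (fake_needs_tx_rekick(bp))
		fake_tx_restart(bp);
}
```

With this shape, other variants would still account the timeout in
tx_errors but would keep today's behaviour of not touching the queue.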
Thanks,
Andrea
>
> > };
> >
> > /* Configure peripheral capabilities according to device tree
>
> Thanks
> Nicolai