public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Maxime Chevallier <maxime.chevallier@bootlin.com>
Cc: Tao Wang <tao03.wang@horizon.auto>,
	alexandre.torgue@foss.st.com, andrew+netdev@lunn.ch,
	davem@davemloft.net, edumazet@google.com, horms@kernel.org,
	kuba@kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, mcoquelin.stm32@gmail.com,
	netdev@vger.kernel.org, pabeni@redhat.com
Subject: Re: [PATCH net v2] net: stmmac: fix transmit queue timed out after resume
Date: Fri, 16 Jan 2026 19:22:38 +0000	[thread overview]
Message-ID: <aWqP_hhX73x_8Qs1@shell.armlinux.org.uk> (raw)
In-Reply-To: <3a93c79e-f755-4642-a3b0-1cce7d0ea0ef@bootlin.com>

On Fri, Jan 16, 2026 at 07:27:16PM +0100, Maxime Chevallier wrote:
> Hi,
> 
> On 16/01/2026 19:08, Russell King (Oracle) wrote:
> > On Fri, Jan 16, 2026 at 01:37:48PM +0000, Russell King (Oracle) wrote:
> >> On Fri, Jan 16, 2026 at 12:50:35AM +0000, Russell King (Oracle) wrote:
> >>> However, while this may explain the transmit slowdown because it's
> >>> on the transmit side, it doesn't explain the receive problem.
> >>
> >> I'm bisecting to find the cause of the receive issue, but it's going to
> >> take a long time (in the mean time, I can't do any mainline work.)
> >>
> >> So far, the range of good/bad has been narrowed down to 6.14 is good,
> >> 1b98f357dadd ("Merge tag 'net-next-6.16' of
> >> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next") is bad.
> >>
> >> 14 more iterations to go. Might be complete by Sunday. (Slowness in
> >> building the more fully featured net-next I use primarily for build
> >> testing, the slowness of the platform to reboot, and the need to
> >> manually test each build.)
> > 
> > Well, that's been a waste of time today. While the next iteration was
> > building, because it's been suspicious that each and every bisect
> > point has failed so far, I decided to re-check 6.14, and that fails.
> > So, it looks like this problem has existed for some considerable
> > time. I don't have the compute power locally to bisect over a massive
> > range of kernels, so I'm afraid stmmac receive is going to have to
> > stay broken unless someone else can bisect (and find a "good" point
> > in the git history.)
> > 
> 
> To me RX looks OK, at least on the various devices I have that use
> stmmac. It's fine on Cyclone V socfpga, and imx8mp. Maybe that's Jetson
> specific ?

Maybe - it could be something to do with MMUs slowing down the packet
rate, or it could be uncovering a bug in stmmac's handling of dwmac4
when it runs out of descriptors in the ring.

The problem I'm seeing is that RBU ends up being set in the channel 0
control register (there's only a single channel) which means that the
hardware moved on to the next receive descriptor, and found that it
didn't own it.

It _should_ be counted by this statistic:

     rx_buf_unav_irq: 0

but clearly, this doesn't work, because here is the channel 0 status
register:

Value at address 0x02491160: 0x00000484

which has:

#define DMA_CHAN_STATUS_RBU             BIT(7)

set. The documentation I have (sadly not for Xavier but for stm32mp151)
states that when this occurs, a "Receive Poll Demand" command needs to
be issued, but fails to explain how to do that. Older cores (such as
dwmac1000) had a "received poll demand" register to write to for this.

> I've got pretty-much line rate with a basic 'iperf3 -c XX" and same with
> 'iperf3 -c XX -R". What commands are you running to check the issue ?

Merely iperf3 -R -c XX, it's enough to make it fall over normally
within the first second.

> Are you still seeing the pause frames flood ?

Yes, because the receive DMA has stopped, which makes the FIFO between
the MAC and MTL fill above the threshold for sending pause frames.

In order to stop the disruption to my network (because it basically
causes *everything* to clog up) I've had to turn off pause autoneg,
but that doesn't affect whether or not this happens.

It _may_ be worth testing whether adding a ndelay(500) into the
receive processing path, thereby making it intentionally slow,
allows you to reproduce the problem. If it does, then that confirms
that we're missing something in the dwmac4 handling for RBU.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

  reply	other threads:[~2026-01-16 19:22 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-09  7:02 [PATCH net Resend] net: stmmac: fix transmit queue timed out after resume Tao Wang
2026-01-13  4:05 ` Jakub Kicinski
2026-01-14 11:00   ` [PATCH net v2] " Tao Wang
2026-01-14 11:26     ` Russell King (Oracle)
2026-01-15  7:08       ` Tao Wang
2026-01-15 12:09         ` Russell King (Oracle)
2026-01-15 19:40           ` Russell King (Oracle)
2026-01-15 21:04             ` Maxime Chevallier
2026-01-15 21:35               ` Maxime Chevallier
2026-01-16  0:50                 ` Russell King (Oracle)
2026-01-16 13:37                   ` Russell King (Oracle)
2026-01-16 18:08                     ` Russell King (Oracle)
2026-01-16 18:27                       ` Maxime Chevallier
2026-01-16 19:22                         ` Russell King (Oracle) [this message]
2026-01-16 20:57                           ` Russell King (Oracle)
2026-01-17 17:06                             ` Jakub Kicinski
2026-01-17 20:50                               ` Russell King (Oracle)
2026-01-16  7:29           ` Tao Wang
2026-01-15  3:16     ` Jakub Kicinski
2026-01-15  7:19       ` Tao Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aWqP_hhX73x_8Qs1@shell.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=alexandre.torgue@foss.st.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.chevallier@bootlin.com \
    --cc=mcoquelin.stm32@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=tao03.wang@horizon.auto \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox