Date: Tue, 14 Apr 2026 15:13:41 +0100
From: "Russell King (Oracle)"
To: Sam Edwards
Cc: Jakub Kicinski, Andrew Lunn, Alexandre Torgue, "David S. Miller",
	Eric Dumazet,
	"moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE",
	linux-stm32@st-md-mailman.stormreply.com,
	Linux Network Development Mailing List, Paolo Abeni
Subject: Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
References: <20260413110222.49fc3759@kernel.org>
X-Mailing-List: netdev@vger.kernel.org

Hi Sam,

Most of this email was written this morning, but I didn't have a chance
to finish nor send it due to how busy I am. I had also written a
separate reply last night with detailed results of what I was seeing
but didn't/haven't got around to sending it.
Not currently sure whether I saved it as draft or got rid of it yet.

On Mon, Apr 13, 2026 at 02:54:30PM -0700, Sam Edwards wrote:
> On Mon, Apr 13, 2026, 11:49 Russell King (Oracle) wrote:
> >
> > On Mon, Apr 13, 2026 at 11:02:22AM -0700, Jakub Kicinski wrote:
> > > On Fri, 10 Apr 2026 14:07:51 +0100 Russell King (Oracle) wrote:
> > > > Since we are seeing receive buffer exhaustion on several platforms,
> > > > let's enable the interrupts so the statistics we publish via ethtool -S
> > > > actually work to aid diagnosis. I've been in two minds about whether
> > > > to send this patch, but given the problems with stmmac at the moment,
> > > > I think it should be merged.
> > >
> > > Sorry for an under-researched response but wasn't there a person trying
> > > to fix the OOM starvation issue? Who was supposed to add a timer?
> > > Is your problem also OOM related or do you suspect something else?
> >
> > It is not OOM related. I have this patch applied:
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index 131ea887bedc..614d0e10e3e6 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -5095,14 +5095,18 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv, u32 queue)
> >
> >  		if (!buf->page) {
> >  			buf->page = page_pool_alloc_pages(rx_q->page_pool, gfp);
> > -			if (!buf->page)
> > +			if (!buf->page) {
> > +				netdev_err(priv->dev, "q%u: no buffer 1\n", queue);
> >  				break;
> > +			}
> >  		}
> >
> >  		if (priv->sph_active && !buf->sec_page) {
> >  			buf->sec_page = page_pool_alloc_pages(rx_q->page_pool, gfp);
> > -			if (!buf->sec_page)
> > +			if (!buf->sec_page) {
> > +				netdev_err(priv->dev, "q%u: no buffer 2\n", queue);
> >  				break;
> > +			}
> >
> >  			buf->sec_addr = page_pool_get_dma_addr(buf->sec_page);
> >  		}
> >
> > and it is silent, so we are not suffering starvation of buffers.
> >
> > However, the hardware hangs during iperf3, and because it triggers the
> > MAC to stream PAUSE frames, and my network uses Netgear GS108 and GS116
> > unmanaged switches that always use flow-control between them (there's no
> > way not to) it takes down the entire network - as we've discussed
> > before. So, this problem is pretty fatal to the *entire* network.
> >
> > With this patch, the existing statistical counters for this condition
> > are incremented, and thus users can use ethtool -S to see what happened
> > and report whether they are seeing the same issue.
> >
> > Without this patch applied, there are no diagnostics from stmmac that
> > report what the state is. ethtool -d doesn't list the appropriate
> > registers (as I suspect part of the problem is the number of queues
> > is somewhat dynamic - userspace can change that configuration through
> > ethtool).
> >
> > Thus, one has to resort to using devmem2 to find out what's happened.
> > That's not user friendly.
> >
> > For me, devmem2 shows:
> >
> > Channel 0 status register:
> > Value at address 0x02491160: 0x00000484
> >   bit 10: ETI early transmit interrupt - set
> >   bit 9 : RWT receive watchdog - clear
> >   bit 8 : RPS receive process stopped - clear
> >   bit 7 : RBU receive buffer unavailable - set
> >   bit 6 : RI receive interrupt - clear
> >   bit 2 : TBU transmit buffer unavailable - set
> >   bit 1 : TPS transmit process stopped - clear
> >   bit 0 : TI transmit interrupt - clear
>
> Should that reset trigger be RPS, not RBU? My understanding of these
> status bits is RBU is just "RxDMA has failed to take a frame from the
> RxFIFO" while RPS is "the RxFIFO is full." That would make RBU our
> critical threshold to start proactively refilling, and RPS the "too
> late, we lose" threshold.

That's a fine theory, but look at the channel 0 status register above,
noting that any interrupts that are raised but not enabled remain set.
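
For anyone wanting to check the reading of 0x00000484 without going
through the bit list by hand, here is a minimal userspace sketch. The
bit positions are the ones listed in the quoted devmem2 dump; the
CH_STATUS_* names and both helper functions are made up for this
illustration and are not the driver's own macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Bit positions copied from the annotated devmem2 dump quoted above.
 * The CH_STATUS_* names are local to this sketch (hypothetical).
 */
#define CH_STATUS_TI	(1u << 0)	/* transmit interrupt */
#define CH_STATUS_TPS	(1u << 1)	/* transmit process stopped */
#define CH_STATUS_TBU	(1u << 2)	/* transmit buffer unavailable */
#define CH_STATUS_RI	(1u << 6)	/* receive interrupt */
#define CH_STATUS_RBU	(1u << 7)	/* receive buffer unavailable */
#define CH_STATUS_RPS	(1u << 8)	/* receive process stopped */
#define CH_STATUS_RWT	(1u << 9)	/* receive watchdog */
#define CH_STATUS_ETI	(1u << 10)	/* early transmit interrupt */

struct ch_status_bit {
	uint32_t mask;
	const char *name;
};

static const struct ch_status_bit ch_status_bits[] = {
	{ CH_STATUS_ETI, "ETI early transmit interrupt" },
	{ CH_STATUS_RWT, "RWT receive watchdog" },
	{ CH_STATUS_RPS, "RPS receive process stopped" },
	{ CH_STATUS_RBU, "RBU receive buffer unavailable" },
	{ CH_STATUS_RI,  "RI receive interrupt" },
	{ CH_STATUS_TBU, "TBU transmit buffer unavailable" },
	{ CH_STATUS_TPS, "TPS transmit process stopped" },
	{ CH_STATUS_TI,  "TI transmit interrupt" },
};

/* Return true if the given status bit is set in the register value. */
static bool ch_status_test(uint32_t status, uint32_t mask)
{
	assert(mask != 0);	/* a zero mask would always read as clear */
	return (status & mask) != 0;
}

/* Print each known bit as "set" or "clear", in the order of the dump. */
static void ch_status_dump(uint32_t status)
{
	size_t i;

	for (i = 0; i < sizeof(ch_status_bits) / sizeof(ch_status_bits[0]); i++)
		printf("%s - %s\n", ch_status_bits[i].name,
		       ch_status_test(status, ch_status_bits[i].mask) ?
		       "set" : "clear");
}
```

Calling ch_status_dump(0x00000484) reports ETI, RBU and TBU as set and
everything else, RPS included, as clear.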
RPS is not set, so RPS is not being raised, only RBU when this condition
occurs.

> Thinking aloud: Do you suppose the RxDMA waits for a wakeup signal
> sent whenever a frame is added to RxFIFO? That might explain why the
> former never recovers once the latter is full: a manual wakeup needs
> to be sent whenever we resolve RBU. Does the .enable_dma_reception()
> op need to be implemented for dwmac5, or have you tried that already?

I've not found anything in the closest documentation I have. The Xavier
is Synopsys IP v5.0, whereas i.MX8M is v5.1 - and v5.1 compared to
previous versions reads the same for statements concerning recovering
from an RBU condition:

  "In ring mode, the application should advance the Receive Descriptor
  Tail Pointer register of a channel. This bit is set only when the DMA
  owns the previous Rx descriptor."

I've tried expanding what happens when RBU fires, dumping some of the
receive state and the receive ring:

[   55.766199] dwc-eth-dwmac 2490000.ethernet eth0: q0: receive buffer unavailable: cur_rx=309 dirty_rx=309 last_cur_rx=245 last_cur_rx_post=309 last_dirty_rx=245 count=64 budget=64

cur_rx == dirty_rx _should_ mean that we fully refilled the ring. These
are their values at the point the RBU interrupt fires.

last_cur_rx and last_dirty_rx are the values of cur_rx/dirty_rx when
stmmac_rx() was last entered. last_cur_rx_post is the value of cur_rx
when stmmac_rx() finished looping but before we have refilled the ring.
count is the value of count just before stmmac_rx() returns, and budget
is the limit at that point.

The patch that prints errors should we fail to allocate a buffer is in
place and none of those errors fire, so we are fully repopulating the
ring each time stmmac_rx() runs.
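
To spell out the index arithmetic behind that log line, here is a small
standalone sketch. It assumes a 512-entry ring and the usual
"cur minus dirty, modulo ring size" convention; rx_dirty() and
rx_advance() are names invented for this example, written from the
description above rather than copied from the driver.

```c
#include <assert.h>

/*
 * Number of descriptors that software has consumed but not yet handed
 * back to the hardware. Zero means the ring is fully refilled.
 */
static unsigned int rx_dirty(unsigned int cur_rx, unsigned int dirty_rx,
			     unsigned int ring_size)
{
	assert(ring_size > 0 && cur_rx < ring_size && dirty_rx < ring_size);

	if (dirty_rx <= cur_rx)
		return cur_rx - dirty_rx;

	/* cur_rx has wrapped past the end of the ring, dirty_rx has not */
	return ring_size - dirty_rx + cur_rx;
}

/* Number of entries an index moved forward by, allowing for wrap-around. */
static unsigned int rx_advance(unsigned int from, unsigned int to,
			       unsigned int ring_size)
{
	assert(ring_size > 0 && from < ring_size && to < ring_size);

	return (to + ring_size - from) % ring_size;
}
```

With the logged values, rx_dirty(309, 309, 512) is 0, and
rx_advance(245, 309, 512) is 64, which matches count=64 and budget=64:
the previous poll consumed its whole budget and then refilled every
entry it had consumed.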
[   55.766785] RX descriptor ring:
[   55.766802] 000 [0x0000007fffffe000]: 0x0 0x12 0x0 0x340105ee
[   55.766826] 001 [0x0000007fffffe010]: 0x0 0x12 0x0 0x340105ee
[   55.766843] 002 [0x0000007fffffe020]: 0x0 0x12 0x0 0x340105ee
[   55.766860] 003 [0x0000007fffffe030]: 0x0 0x12 0x0 0x340105ee
...
[   55.772205] 308 [0x0000007ffffff340]: 0x0 0x12 0x0 0x340105ee
[   55.772221] 309 [0x0000007ffffff350]: 0x0 0x12 0x0 0x340105ee
[   55.772237] 310 [0x0000007ffffff360]: 0x0 0x12 0x0 0x340105ee
[   55.772253] 311 [0x0000007ffffff370]: 0x0 0x12 0x0 0x340105ee
[   55.772268] 312 [0x0000007ffffff380]: 0x0 0x12 0x0 0x340105ee
[   55.772284] 313 [0x0000007ffffff390]: 0x0 0x12 0x0 0x340105ee
[   55.772300] 314 [0x0000007ffffff3a0]: 0x0 0x12 0x0 0x340105ee
[   55.772315] 315 [0x0000007ffffff3b0]: 0x0 0x12 0x0 0x340105ee
...
[   55.775539] 511 [0x0000007ffffffff0]: 0x0 0x12 0x0 0x340105ee

Every ring entry contains the same RDES3 value, so it really is
completely full at the point RBU fires (bit 31 clear means software
owns the descriptor, and it's basically saying first/last segment,
RDES1 valid, buffer 1 length of 1518).

The Rx tail pointer register contains 0xfffff3a0 which is entry 314.
The current receive descriptor address is also 0xfffff3a0. Note that
these values were obtained some time after the RBU interrupt fired (due
to the time taken for devmem2 to access every stmmac register - I have
a script that dumps the entire stmmac register state via devmem2.)

The other thing to note is that when looking at debugfs
stmmaceth/eth0/descriptor* (or whatever it's called, I don't have the
NX powered to look at the moment, and I didn't take a copy of it last
night) all the descriptor entries are fully repopulated with buffers
and owned by the hardware.

I've tried using devmem2 to write to the rx tail pointer to kick it
back into action, but that changes nothing.
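
Going back to the ring dump above, the RDES3 reading and the "entry 314"
figure can be reproduced with the sketch below. The bit positions follow
the description given in this email (bit 31 ownership, first/last
segment, RDES1 valid); the 15-bit length mask, the 16-byte descriptor
stride (taken from the address spacing in the dump) and all the names
are assumptions made for this example, not driver definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed write-back RDES3 layout; names are local to this sketch. */
#define RDES3_OWN	(1u << 31)	/* set: hardware owns the descriptor */
#define RDES3_FD	(1u << 29)	/* first segment */
#define RDES3_LD	(1u << 28)	/* last segment */
#define RDES3_RS1V	(1u << 26)	/* RDES1 status valid */
#define RDES3_LEN_MASK	0x7fffu		/* assumed 15-bit length field */

#define DESC_SIZE	16u		/* four 32-bit words per descriptor */

struct rdes3_info {
	bool hw_owned;
	bool first;
	bool last;
	bool rdes1_valid;
	unsigned int length;
};

/* Split an RDES3 word into the fields discussed above. */
static struct rdes3_info rdes3_decode(uint32_t rdes3)
{
	struct rdes3_info info = {
		.hw_owned = (rdes3 & RDES3_OWN) != 0,
		.first = (rdes3 & RDES3_FD) != 0,
		.last = (rdes3 & RDES3_LD) != 0,
		.rdes1_valid = (rdes3 & RDES3_RS1V) != 0,
		.length = rdes3 & RDES3_LEN_MASK,
	};

	return info;
}

/* Convert the low 32 bits of a descriptor address into a ring index. */
static unsigned int desc_index(uint32_t ring_base_lo, uint32_t desc_addr_lo)
{
	assert(desc_addr_lo >= ring_base_lo);
	assert((desc_addr_lo - ring_base_lo) % DESC_SIZE == 0);

	return (desc_addr_lo - ring_base_lo) / DESC_SIZE;
}
```

rdes3_decode(0x340105ee) gives software-owned, first and last segment,
RDES1 valid and a length of 1518, and desc_index(0xffffe000, 0xfffff3a0)
gives 314, the entry that both the tail pointer and the current
descriptor register point at.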
I've tried writing the next descriptor value and previous descriptor
value, but that appears to have no effect, it steadfastly remains
stuck - and as that is the documented recovery from RBU and there's no
"receive demand" register listed in dwmac v4 or v5 documentation, there
seems to be no other documented way.

The debug registers that I provided in my previous email suggest that
the MAC is waiting for a packet, and MTL's descriptor reader is idle
(I'm guessing it would only briefly change when the tail pointer is
updated.)

Note that I have augmented the driver with more dma_rmb() + dma_wmb()
in stmmac_rx(), dwmac4_wrback_get_rx_status(), and stmmac_rx_refill()
to ensure that reads and writes to the descriptor ring are correctly
ordered. While this generally allows iperf3 to run for a few more
seconds, it doesn't solve the problem - it is very rare for iperf3 to
actually complete before stmmac has taken down my entire network.

I have noticed that on some occasions I see a small number of RBU
interrupts before it falls over.

I'm not going to have much time to look at this today due to further
appointments (I also didn't yesterday - only an hour in the morning and
a bit more time late in the evening/night.) I should have more time
during the rest of the week... but that may change.

From the above, it looks like NAPI/stmmac driver isn't keeping up with
the packet flow coming from an i.MX6 platform (which is limited to
around 470Mbps due to internal SoC bus limitations.)

I'll also mention that stmmac falls apart even more if I run
iperf3 -c -R against an x86 machine that is capable of saturating the
network, so much so that the arm-smmu IOMMU throws errors even after
the stmmac hardware has been soft-reset for addresses that were in the
ring *prior* to the soft-reset occurring (stmmac is soft-reset each time
the netdev is brought up.) The only recovery from that is to reboot -
down/up the interface just spews more IOMMU errors.
I don't have the details of that to hand and I don't have enough time
to re-run that test this morning. From what I remember, the transmit
side also stops processing descriptors (one can see them accumulate in
the debugfs file,) which eventually leads to the netdev watchdog
firing.

It currently looks like the stmmac v5 EQoS IP works fine only under
light packet loads. If one puts any stress on it, then the hardware
totally falls apart. This may point to an issue with the AXI bus
configuration that is specific to this platform, but that requires
further investigation.

I'll mention again, in case anyone's forgotten, that these problems
pre-date any of the cleanups I've made to stmmac. From what I remember
they are reproducible with the kernels that are supplied as part of the
nVidia BSP. Again, as I don't have access to the nVidia platform at the
moment, I can't include the details in this email.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!