* [PATCH net v2 1/2] net: stmmac: Prevent NULL deref when RX memory exhausted
[not found] <20260319184031.8596-1-CFSworks@gmail.com>
@ 2026-03-19 18:40 ` Sam Edwards
2026-03-19 21:59 ` Russell King (Oracle)
2026-03-19 18:40 ` [PATCH net v2 2/2] net: stmmac: Prevent indefinite RX stall on buffer exhaustion Sam Edwards
1 sibling, 1 reply; 4+ messages in thread
From: Sam Edwards @ 2026-03-19 18:40 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Russell King, Maxime Chevallier, Ovidiu Panait, Vladimir Oltean,
Baruch Siach, Serge Semin, netdev, linux-arm-kernel, linux-kernel,
Sam Edwards, stable
The CPU receives frames from the MAC through conventional DMA: the CPU
allocates buffers for the MAC, then the MAC fills them and returns
ownership to the CPU. For each hardware RX queue, the CPU and MAC
coordinate through a shared ring array of DMA descriptors: one
descriptor per DMA buffer. Each descriptor includes the buffer's
physical address and a status flag ("OWN") indicating which side owns
the buffer: OWN=0 for CPU, OWN=1 for MAC. The CPU is only allowed to set
the flag and the MAC is only allowed to clear it, and both must move
through the ring in sequence: thus the ring is used for both
"submissions" and "completions."
In the stmmac driver, stmmac_rx() bookmarks its position in the ring
with the `cur_rx` index. The main receive loop in that function checks
for rx_descs[cur_rx].own=0, gives the corresponding buffer to the
network stack (NULLing the pointer), and increments `cur_rx` modulo the
ring size. After the loop exits, stmmac_rx_refill(), which bookmarks its
position with `dirty_rx`, allocates fresh buffers and rearms the
descriptors (setting OWN=1). If it fails any allocation, it simply stops
early (leaving OWN=0) and will retry where it left off when next called.
This means descriptors have a three-stage lifecycle (terms my own):
- `empty` (OWN=1, buffer valid)
- `full` (OWN=0, buffer valid and populated)
- `dirty` (OWN=0, buffer NULL)
But because stmmac_rx() only checks OWN, it confuses `full`/`dirty`. In
the past (see 'Fixes:'), there was a bug where the loop could cycle
`cur_rx` all the way back to the first descriptor it dirtied, resulting
in a NULL dereference when mistaken for `full`. The aforementioned
commit resolved that *specific* failure by capping the loop's iteration
limit at `dma_rx_size - 1`, but this is only a partial fix: if the
previous stmmac_rx_refill() didn't complete, then there are leftover
`dirty` descriptors that the loop might encounter without needing to
cycle fully around. The current code therefore panics (see 'Closes:')
when stmmac_rx_refill() is memory-starved long enough for `cur_rx` to
catch up to `dirty_rx`.
Fix this by further tightening the clamp from `dma_rx_size - 1` to
`dma_rx_size - stmmac_rx_dirty() - 1`, subtracting any remnant dirty
entries and limiting the loop so that `cur_rx` cannot catch back up to
`dirty_rx`. This carries no risk of arithmetic underflow: since the
maximum possible return value of stmmac_rx_dirty() is `dma_rx_size - 1`,
the worst the clamp can do is prevent the loop from running at all.
Fixes: b6cb4541853c7 ("net: stmmac: avoid rx queue overrun")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221010
Cc: stable@vger.kernel.org
Signed-off-by: Sam Edwards <CFSworks@gmail.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 6827c99bde8c..f98b070073c0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -5609,7 +5609,8 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
dma_dir = page_pool_get_dma_dir(rx_q->page_pool);
bufsz = DIV_ROUND_UP(priv->dma_conf.dma_buf_sz, PAGE_SIZE) * PAGE_SIZE;
- limit = min(priv->dma_conf.dma_rx_size - 1, (unsigned int)limit);
+ limit = min(priv->dma_conf.dma_rx_size - stmmac_rx_dirty(priv, queue) - 1,
+ (unsigned int)limit);
if (netif_msg_rx_status(priv)) {
void *rx_head;
--
2.52.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH net v2 2/2] net: stmmac: Prevent indefinite RX stall on buffer exhaustion
[not found] <20260319184031.8596-1-CFSworks@gmail.com>
2026-03-19 18:40 ` [PATCH net v2 1/2] net: stmmac: Prevent NULL deref when RX memory exhausted Sam Edwards
@ 2026-03-19 18:40 ` Sam Edwards
1 sibling, 0 replies; 4+ messages in thread
From: Sam Edwards @ 2026-03-19 18:40 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Russell King, Maxime Chevallier, Ovidiu Panait, Vladimir Oltean,
Baruch Siach, Serge Semin, netdev, linux-arm-kernel, linux-kernel,
Sam Edwards, stable
The stmmac driver handles interrupts in the usual NAPI way: an interrupt
arrives, the NAPI instance is scheduled and interrupts are masked, and
the actual work occurs in the NAPI polling function. Once no further
work remains, interrupts are unmasked and the NAPI instance is put to
sleep to await a future interrupt. In the receive case, the MAC only
sends the interrupt when a DMA operation completes; thus the driver must
make sure a usable RX DMA descriptor exists before expecting a future
interrupt.
The main receive loop in stmmac_rx() exits under one of 3 conditions:
1) It encounters a DMA descriptor with OWN=1, indicating that no further
pending data exists. The MAC will use this descriptor for the next
RX DMA operation, so the driver can expect a future interrupt.
2) It exhausts the NAPI budget. In this case, the driver doesn't know
whether the MAC has any usable DMA descriptors. But when the driver
consumes its full budget, that signals NAPI to keep polling, so the
question is moot.
3) It runs out of (non-dirty) descriptors in the RX ring. In this case,
the MAC will only have a usable descriptor if stmmac_rx_refill()
succeeds (at least partially).
Currently, stmmac_rx() lacks any check against scenario #3 and
stmmac_rx_refill() failing: it will stop NAPI polling and unmask
interrupts to await an interrupt that will never arrive, stalling the
receive pipeline indefinitely.
Fix this by checking if stmmac_rx_dirty() returns its maximal value,
returning the full budget (which tells NAPI to keep polling) if so.
Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers.")
Cc: stable@vger.kernel.org
Signed-off-by: Sam Edwards <CFSworks@gmail.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index f98b070073c0..05d3c548ce28 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -5604,6 +5604,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
unsigned int desc_size;
struct sk_buff *skb = NULL;
struct stmmac_xdp_buff ctx;
+ int budget = limit;
int xdp_status = 0;
int bufsz;
@@ -5870,6 +5871,12 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
priv->xstats.rx_dropped += rx_dropped;
priv->xstats.rx_errors += rx_errors;
+ /* If the RX queue is completely dirty, we can't expect a future
+ * interrupt; tell NAPI to keep polling.
+ */
+ if (unlikely(stmmac_rx_dirty(priv, queue) == priv->dma_conf.dma_rx_size - 1))
+ return budget;
+
return count;
}
--
2.52.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net v2 1/2] net: stmmac: Prevent NULL deref when RX memory exhausted
2026-03-19 18:40 ` [PATCH net v2 1/2] net: stmmac: Prevent NULL deref when RX memory exhausted Sam Edwards
@ 2026-03-19 21:59 ` Russell King (Oracle)
2026-03-20 1:11 ` Sam Edwards
0 siblings, 1 reply; 4+ messages in thread
From: Russell King (Oracle) @ 2026-03-19 21:59 UTC (permalink / raw)
To: Sam Edwards
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Maxime Chevallier, Ovidiu Panait, Vladimir Oltean,
Baruch Siach, Serge Semin, netdev, linux-arm-kernel, linux-kernel,
stable
On Thu, Mar 19, 2026 at 11:40:30AM -0700, Sam Edwards wrote:
> The CPU receives frames from the MAC through conventional DMA: the CPU
> allocates buffers for the MAC, then the MAC fills them and returns
> ownership to the CPU. For each hardware RX queue, the CPU and MAC
> coordinate through a shared ring array of DMA descriptors: one
> descriptor per DMA buffer. Each descriptor includes the buffer's
> physical address and a status flag ("OWN") indicating which side owns
> the buffer: OWN=0 for CPU, OWN=1 for MAC. The CPU is only allowed to set
> the flag and the MAC is only allowed to clear it, and both must move
> through the ring in sequence: thus the ring is used for both
> "submissions" and "completions."
>
> In the stmmac driver, stmmac_rx() bookmarks its position in the ring
> with the `cur_rx` index. The main receive loop in that function checks
> for rx_descs[cur_rx].own=0, gives the corresponding buffer to the
> network stack (NULLing the pointer), and increments `cur_rx` modulo the
> ring size. After the loop exits, stmmac_rx_refill(), which bookmarks its
> position with `dirty_rx`, allocates fresh buffers and rearms the
> descriptors (setting OWN=1). If it fails any allocation, it simply stops
> early (leaving OWN=0) and will retry where it left off when next called.
>
> This means descriptors have a three-stage lifecycle (terms my own):
> - `empty` (OWN=1, buffer valid)
> - `full` (OWN=0, buffer valid and populated)
> - `dirty` (OWN=0, buffer NULL)
>
> But because stmmac_rx() only checks OWN, it confuses `full`/`dirty`. In
> the past (see 'Fixes:'), there was a bug where the loop could cycle
> `cur_rx` all the way back to the first descriptor it dirtied, resulting
> in a NULL dereference when mistaken for `full`. The aforementioned
> commit resolved that *specific* failure by capping the loop's iteration
> limit at `dma_rx_size - 1`, but this is only a partial fix: if the
> previous stmmac_rx_refill() didn't complete, then there are leftover
> `dirty` descriptors that the loop might encounter without needing to
> cycle fully around. The current code therefore panics (see 'Closes:')
> when stmmac_rx_refill() is memory-starved long enough for `cur_rx` to
> catch up to `dirty_rx`.
>
> Fix this by further tightening the clamp from `dma_rx_size - 1` to
> `dma_rx_size - stmmac_rx_dirty() - 1`, subtracting any remnant dirty
> entries and limiting the loop so that `cur_rx` cannot catch back up to
> `dirty_rx`. This carries no risk of arithmetic underflow: since the
> maximum possible return value of stmmac_rx_dirty() is `dma_rx_size - 1`,
> the worst the clamp can do is prevent the loop from running at all.
Isn't the limit equivalent to:
space = CIRC_SPACE(rx_q->cur_rx, rx_q->dirty_rx,
priv->dma_conf.dma_rx_size);
limit = min_t(unsigned int, space, limit);
(Think of the "full" and "empty" cases as "space" which can be
consumed to provide entries for the refiller to action.)
I think the same applies for patch 2 - when the above returns zero
it means we have no entries in the ring that aren't due for refill.
Have you checked whether stmmac_rx_zc() also buggy in this respect?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net v2 1/2] net: stmmac: Prevent NULL deref when RX memory exhausted
2026-03-19 21:59 ` Russell King (Oracle)
@ 2026-03-20 1:11 ` Sam Edwards
0 siblings, 0 replies; 4+ messages in thread
From: Sam Edwards @ 2026-03-20 1:11 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Maxime Chevallier, Ovidiu Panait, Vladimir Oltean,
Baruch Siach, Serge Semin, netdev, linux-arm-kernel, linux-kernel,
stable
On Thu, Mar 19, 2026 at 2:59 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
Hi Russell,
> Isn't the limit equivalent to:
>
> space = CIRC_SPACE(rx_q->cur_rx, rx_q->dirty_rx,
> priv->dma_conf.dma_rx_size);
> limit = min_t(unsigned int, space, limit);
>
I had to think about the edge cases for a few minutes -- yes, you're
right. However...
> (Think of the "full" and "empty" cases as "space" which can be
> consumed to provide entries for the refiller to action.)
...this seems like a double negative: We're looking at the ring from
the refiller's perspective (i.e. considering "dirty" the meaningful
state), and *then* looking at the "unused space" (i.e. "not dirty").
If I used it, I'd at least add a comment explaining that CIRC_SPACE
is, counterintuitively, not measuring "space."
After the bugs are fixed, I'll follow up with a net-next patch [1]
that removes the `min` altogether and makes the loop check for 'dirty'
directly, so my focus here is "minimal, obviously correct" and not
necessarily "tidiest." Would it be acceptable if I simplify
stmmac_rx_dirty() to use CIRC_CNT() then, instead of using
CIRC_SPACE() now?
> I think the same applies for patch 2 - when the above returns zero
> it means we have no entries in the ring that aren't due for refill.
I'm not completely satisfied with the `stmmac_rx_dirty() ==
dma_rx_size - 1` expression either. One might argue we should keep
trying until dirty==0, because tolerating *any* dirtiness risks not
having enough buffers for the next traffic burst, causing avoidable rx
drops. But that's a discussion for patch 2. :)
> Have you checked whether stmmac_rx_zc() also buggy in this respect?
I skimmed it. I saw that it checks the return status of
stmmac_rx_refill_zc() and on failure, returns the NAPI budget to keep
polling -- which told me that the ZC author(s) at least considered
these problems.
Looking at it more thoroughly now, there are a few code smells around
the unconditional `dirty = 0;`, and `dirty` being used as the "budget"
for stmmac_rx_refill_zc() (why have a budget at all then?) so it looks
like something's there, but it's a separate bug if so.
Best,
Sam
[1] https://lore.kernel.org/netdev/20260316021009.262358-4-CFSworks@gmail.com/
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-03-20 1:12 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260319184031.8596-1-CFSworks@gmail.com>
2026-03-19 18:40 ` [PATCH net v2 1/2] net: stmmac: Prevent NULL deref when RX memory exhausted Sam Edwards
2026-03-19 21:59 ` Russell King (Oracle)
2026-03-20 1:11 ` Sam Edwards
2026-03-19 18:40 ` [PATCH net v2 2/2] net: stmmac: Prevent indefinite RX stall on buffer exhaustion Sam Edwards
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox