public inbox for linux-arm-kernel@lists.infradead.org
* [BUG] net: ethernet: cortina: gemini: skb leak in gmac_rx() causes kernel lockup under sustained RX load
From: Andreas Haarmann-Thiemann @ 2026-03-29 10:11 UTC
  To: linusw; +Cc: ulli.kroll, netdev, linux-arm-kernel

Hello,

I am writing to report an SKB memory leak in the Cortina Gemini Ethernet
driver (drivers/net/ethernet/cortina/gemini.c) that causes the device to
lock up under sustained receive load.

Hardware affected: Raidsonic IB-NAS4220-B (Storlink/Cortina Gemini SL3516,
ARM FA526), running OpenWrt 6.12.67.

--- Observed Behaviour ---

Under sustained RX load (e.g. large file transfers over the network), the
device freezes completely and requires a hard power cycle. No kernel panic
or oops is produced; the system simply stops responding.

--- Root Cause Analysis ---

In gmac_rx() (drivers/net/ethernet/cortina/gemini.c), when
gmac_get_queue_page() returns NULL for the second page of a multi-page
fragment, the driver logs an error and continues - but does not free the
in-progress skb that was already being assembled via napi_build_skb() /
napi_get_frags():

    gpage = gmac_get_queue_page(geth, port, mapping + PAGE_SIZE);
    if (!gpage) {
        dev_err(geth->dev, "could not find mapping\n");
        /* BUG: skb leaked here */
        port->stats.rx_dropped++;
        continue;
    }

This path differs from the similar block in gmac_cleanup_rxq(), which only
logs "could not find page" and has no skb in flight, so nothing needs to be
freed there.

Each occurrence of this error path leaks one skb. Under sustained traffic
the leak exhausts kernel memory, causing the observed lockup.
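
The effect of such an error path can be modelled outside the kernel. The
following is a minimal userspace sketch in C, not driver code: the fixed-size
pool, buf_get(), buf_put(), second_page_found() and rx_loop() are hypothetical
names invented for illustration, and the failure rate (every fourth frame) is
an arbitrary assumption. It only shows how a `continue` that skips the release
step drains a finite pool.

```c
/* Hypothetical model: a tiny pool stands in for kernel memory. */
#define POOL_SIZE 8

static int pool_used[POOL_SIZE];

/* Take a free slot; returns its index, or -1 when the pool is exhausted. */
static int buf_get(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!pool_used[i]) {
			pool_used[i] = 1;
			return i;
		}
	}
	return -1;
}

/* Return a slot to the pool. */
static void buf_put(int idx)
{
	if (idx >= 0)
		pool_used[idx] = 0;
}

/* Assumption for the model: every 4th frame fails the second-page lookup. */
static int second_page_found(int frame)
{
	return frame % 4 != 3;
}

/*
 * Process up to nframes frames. Returns the number of frames handled
 * before the pool ran dry, or nframes if it never did.
 */
static int rx_loop(int nframes, int free_on_error)
{
	for (int i = 0; i < POOL_SIZE; i++)
		pool_used[i] = 0;

	for (int frame = 0; frame < nframes; frame++) {
		int cur = buf_get();	/* start assembling a frame */

		if (cur < 0)
			return frame;	/* out of buffers: receive path stalls */

		if (!second_page_found(frame)) {
			if (free_on_error)
				buf_put(cur);	/* release before bailing out */
			continue;		/* without the release, the slot is lost */
		}

		buf_put(cur);		/* frame complete: buffer released */
	}
	return nframes;
}
```

In this model, rx_loop(100, 0) stalls after 32 frames, once all 8 pool slots
have been lost, while rx_loop(100, 1) processes all 100 frames.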

Note: this analysis is based on code review only. The fix below has not
yet been verified on hardware, because the driver is built into the kernel
(CONFIG_GEMINI_ETHERNET=y) on the affected device, so a patched module
cannot be loaded at runtime.

--- Proposed Fix ---

Free the in-progress skb via napi_free_frags() before continuing, matching
the pattern already used elsewhere in the driver:

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -1491,6 +1491,10 @@ static int gmac_rx(struct napi_struct *napi, int budget)
 		gpage = gmac_get_queue_page(geth, port, mapping + PAGE_SIZE);
 		if (!gpage) {
 			dev_err(geth->dev, "could not find mapping\n");
+			if (skb) {
+				napi_free_frags(&port->napi);
+				skb = NULL;
+			}
 			port->stats.rx_dropped++;
 			continue;
 		}

--- Additional Notes ---

A similar "could not find page" error path exists in gmac_cleanup_rxq().
That path does not have an skb in flight at that point and does not require
the same fix.

I would be happy to submit this as a formal patch if the analysis looks
correct to you.

Thank you for your time.

Best regards,
Andreas Haarmann-Thiemann
eitschman@nebelreich.de




* Re: [BUG] net: ethernet: cortina: gemini: skb leak in gmac_rx() causes kernel lockup under sustained RX load
From: Linus Walleij @ 2026-03-29 18:54 UTC
  To: Andreas Haarmann-Thiemann; +Cc: ulli.kroll, netdev, linux-arm-kernel

Hi Andreas,

Thanks for digging into this. I have long wondered why this happens, but
I'm not the best net developer myself.

On Sun, Mar 29, 2026 at 12:05 PM Andreas Haarmann-Thiemann
<eitschman@nebelreich.de> wrote:

> diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
> --- a/drivers/net/ethernet/cortina/gemini.c
> +++ b/drivers/net/ethernet/cortina/gemini.c
> @@ -1491,6 +1491,10 @@ static int gmac_rx(struct napi_struct *napi, int budget)
>  		gpage = gmac_get_queue_page(geth, port, mapping + PAGE_SIZE);
>  		if (!gpage) {
>  			dev_err(geth->dev, "could not find mapping\n");
> +			if (skb) {
> +				napi_free_frags(&port->napi);
> +				skb = NULL;
> +			}
>  			port->stats.rx_dropped++;
>  			continue;
>  		}

This looks right to me, can you send a proper patch, or provide your
Signed-off-by in this thread so I can create a patch from this inline code?

The kernel process requires a "certificate of origin", i.e. a Signed-off-by
line, described further down in this document:
https://docs.kernel.org/process/submitting-patches.html

Yours,
Linus Walleij

