BPF List
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Daniel Machon" <daniel.machon@microchip.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH net-next v4 04/13] net: lan966x: add FDMA LLP register write helper
Date: Sat, 09 May 2026 08:27:18 +0000	[thread overview]
Message-ID: <20260509082718.D70F6C2BCB2@smtp.kernel.org> (raw)
In-Reply-To: <20260508-lan966x-pci-fdma-v4-4-14e0c89d8d63@microchip.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Moving LLP register initialization out of `lan966x_fdma_rx_start()` breaks the MTU change error recovery path.
- [High] The return value of `fdma_dcbs_init()` is ignored in `lan966x_fdma_rx_alloc()`, causing page leaks and the activation of uninitialized DMA descriptors on failure.
- [High] Destroying the page pool on the error path in `lan966x_fdma_rx_alloc()` without unregistering it from the XDP RX queues leaves dangling pointers.
- [High] The `restore:` error path in `lan966x_fdma_reload()` contains a severe hardware/software ring desync bug.
--

commit 2fddff57a8bf4d213ae07c8fe84e8040a60982cb
Author: Daniel Machon <daniel.machon@microchip.com>

net: lan966x: add FDMA LLP register write helper

The patch introduces a helper lan966x_fdma_llp_configure() and moves the
FDMA Link List Pointer (LLP) register initialization from the start/activate
functions into the allocation functions.

> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> index f8ce735a7fc09..6c5761e886d47 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c

[ ... ]

> @@ -127,6 +134,9 @@ static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
>  	fdma_dcbs_init(fdma, FDMA_DCB_INFO_DATAL(fdma->db_size),
>  		       FDMA_DCB_STATUS_INTR);
>  
> +	lan966x_fdma_llp_configure(lan966x, (u64)fdma->dma,
> +				   fdma->channel_id);
> +
>  	return 0;
>  }

This isn't a bug introduced by this patch, but does lan966x_fdma_rx_alloc()
leak pages and activate uninitialized DMA descriptors on failure?

During lan966x_fdma_rx_alloc(), fdma_dcbs_init() invokes the callback
lan966x_fdma_rx_dataptr_cb() to populate the DMA buffer rings. This callback
dynamically allocates pages using page_pool_dev_alloc_pages() and returns
-ENOMEM on failure, which causes fdma_dcbs_init() to return -ENOMEM.

However, lan966x_fdma_rx_alloc() completely ignores this return value and
returns 0.

Will this cause all successfully allocated pages up to the failure point to
be leaked, and trick the caller into activating the hardware DMA engine with
partially uninitialized DCBs, leading to IOMMU faults or memory corruption?

This isn't a bug introduced by this patch either, but does the error path in
lan966x_fdma_rx_alloc() leave dangling pointers in the XDP RX queues?

In lan966x_fdma_rx_alloc(), lan966x_fdma_rx_alloc_page_pool() registers the
newly created page pool with the XDP memory model across all physical ports
via xdp_rxq_info_reg_mem_model(), overwriting the old registration.

If the subsequent fdma_alloc_coherent() fails, the error path invokes
page_pool_destroy(rx->page_pool) but fails to restore the old page pool
registration:

	err = fdma_alloc_coherent(lan966x->dev, fdma);
	if (err) {
		page_pool_destroy(rx->page_pool);
		return err;
	}

When lan966x_fdma_reload() falls back to the restore: path, it resumes
network operations.

Could this leave a dangling pointer to the destroyed page pool in the
xdp_rxq structures, leading to a use-after-free when XDP processes the next
packet?

> @@ -136,14 +146,6 @@ static void lan966x_fdma_rx_start(struct lan966x_rx *rx)
>  	struct fdma *fdma = &rx->fdma;
>  	u32 mask;
>  
> -	/* When activating a channel, first is required to write the first DCB
> -	 * address and then to activate it
> -	 */
> -	lan_wr(lower_32_bits((u64)fdma->dma), lan966x,
> -	       FDMA_DCB_LLP(fdma->channel_id));
> -	lan_wr(upper_32_bits((u64)fdma->dma), lan966x,
> -	       FDMA_DCB_LLP1(fdma->channel_id));
> -

Will moving the LLP register configuration into lan966x_fdma_rx_alloc() break
the MTU change error recovery path?

In lan966x_fdma_reload(), if the new lan966x_fdma_rx_alloc() fails, the code
jumps to the restore: label, copies back the old software state, and calls
lan966x_fdma_rx_start() to reactivate the channel:

drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c:lan966x_fdma_reload() {
    ...
	err = lan966x_fdma_rx_alloc(&lan966x->rx);
	if (err)
		goto restore;
    ...
restore:
	lan966x->rx.page_pool = page_pool;
	memcpy(&lan966x->rx.fdma, &fdma_rx_old, sizeof(struct fdma));
	lan966x_fdma_rx_start(&lan966x->rx);
    ...
}

Because lan966x_fdma_rx_alloc() is bypassed for the restored state, the
hardware channel is activated without configuring the LLP.

Since the LLP must be written before the channel is activated, won't
activating the channel after a disable and discard cycle without
reprogramming the LLP result in a DMA fault or system hang?

This isn't a bug introduced by this patch, but does the restore: error path
in lan966x_fdma_reload() contain a hardware/software ring desync issue?

When MTU allocation fails, the driver restores the fdma_rx_old software
state, which contains the old non-zero dcb_index and db_index from when the
ring was active.

However, before this patch, lan966x_fdma_rx_start() unconditionally set the
hardware LLP register to fdma->dma (index 0).

Does this cause the hardware DMA engine to start writing at DCB 0, while the
software's napi_poll routine resumes polling from the restored non-zero
dcb_index? Could this complete desync between hardware and software ring
indices lead to a dead RX path or the processing of stale memory?

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260508-lan966x-pci-fdma-v4-0-14e0c89d8d63@microchip.com?part=4

  reply	other threads:[~2026-05-09  8:27 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-08  7:35 [PATCH net-next v4 00/13] net: lan966x: add support for PCIe FDMA Daniel Machon
2026-05-08  7:35 ` [PATCH net-next v4 01/13] MAINTAINERS: add FDMA library to Sparx5 SoC entry Daniel Machon
2026-05-08  7:35 ` [PATCH net-next v4 02/13] net: microchip: fdma: rename contiguous dataptr helpers Daniel Machon
2026-05-08  7:35 ` [PATCH net-next v4 03/13] net: microchip: fdma: add PCIe ATU support Daniel Machon
2026-05-09  8:27   ` sashiko-bot
2026-05-08  7:35 ` [PATCH net-next v4 04/13] net: lan966x: add FDMA LLP register write helper Daniel Machon
2026-05-09  8:27   ` sashiko-bot [this message]
2026-05-08  7:35 ` [PATCH net-next v4 05/13] net: lan966x: export FDMA helpers for reuse Daniel Machon
2026-05-08  7:35 ` [PATCH net-next v4 06/13] net: lan966x: add FDMA ops dispatch for PCIe support Daniel Machon
2026-05-08  7:35 ` [PATCH net-next v4 07/13] net: lan966x: clear FDMA interrupt stickies after switch reset Daniel Machon
2026-05-08  7:35 ` [PATCH net-next v4 08/13] net: lan966x: add shutdown callback to stop FDMA on reboot Daniel Machon
2026-05-09  8:27   ` sashiko-bot
2026-05-08  7:35 ` [PATCH net-next v4 09/13] net: lan966x: add PCIe FDMA support Daniel Machon
2026-05-09  8:27   ` sashiko-bot
2026-05-08  7:35 ` [PATCH net-next v4 10/13] net: lan966x: add PCIe FDMA MTU change support Daniel Machon
2026-05-09  8:27   ` sashiko-bot
2026-05-08  7:35 ` [PATCH net-next v4 11/13] net: lan966x: add PCIe FDMA XDP support Daniel Machon
2026-05-09  8:27   ` sashiko-bot
2026-05-08  7:35 ` [PATCH net-next v4 12/13] misc: lan966x-pci: dts: extend cpu reg to cover PCIE DBI space Daniel Machon
2026-05-08  7:35 ` [PATCH net-next v4 13/13] misc: lan966x-pci: dts: add fdma interrupt to overlay Daniel Machon
2026-05-12  2:20 ` [PATCH net-next v4 00/13] net: lan966x: add support for PCIe FDMA Jakub Kicinski
2026-05-12  9:56   ` Daniel Machon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260509082718.D70F6C2BCB2@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel.machon@microchip.com \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox