From: sashiko-bot@kernel.org
To: "Dipayaan Roy" <dipayanroy@linux.microsoft.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v8 2/2] net: mana: force full-page RX buffers via ethtool private flag
Date: Fri, 08 May 2026 21:16:31 +0000
Message-ID: <20260508211632.052D5C2BCB0@smtp.kernel.org>
In-Reply-To: <20260508115100.488506-3-dipayanroy@linux.microsoft.com>
Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:
- [Critical] Faking `apc->port_is_up = true` on `mana_attach()` failure causes a guaranteed NULL pointer dereference in the asynchronous reset worker.
- [High] Toggling the 'full-page-rx' flag OFF leaks memory when initial packets are dropped because pre-allocated buffers are saved for reuse but never consumed by the fragment refill path.
- [High] `mana_pre_alloc_rxbufs` allocates large arrays with `kmalloc_array()` and thousands of pages with `GFP_ATOMIC` in process context, which risks allocation failures under memory pressure.
--
commit 57f161d214770cf8af214936230e6e5eeac378bc
Author: Dipayaan Roy <dipayanroy@linux.microsoft.com>
net: mana: force full-page RX buffers via ethtool private flag
On some ARM64 platforms with 4K PAGE_SIZE, page_pool fragment
allocation in the RX refill path can cause 15-20% throughput
regression under high connection counts. This adds an ethtool private
flag "full-page-rx" to force one RX buffer per page.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> index a28ca461c135f..dfd222de11d09 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
[ ... ]
> @@ -590,6 +609,88 @@ static int mana_get_link_ksettings(struct net_device *ndev,
> return 0;
> }
>
> +static u32 mana_get_priv_flags(struct net_device *ndev)
> +{
[ ... ]
> +static int mana_set_priv_flags(struct net_device *ndev, u32 priv_flags)
> +{
> +	struct mana_port_context *apc = netdev_priv(ndev);
> +	u32 changed = apc->priv_flags ^ priv_flags;
> +	u32 old_priv_flags = apc->priv_flags;
> +	bool schedule_port_reset = false;
> +	int err = 0;
> +
> +	if (!changed)
> +		return 0;
> +
> +	/* Reject unknown bits */
> +	if (priv_flags & ~GENMASK(MANA_PRIV_FLAG_MAX - 1, 0))
> +		return -EINVAL;
> +
> +	if (changed & BIT(MANA_PRIV_FLAG_USE_FULL_PAGE_RXBUF)) {
> +		apc->priv_flags = priv_flags;
> +
> +		if (!apc->port_is_up) {
> +			/* Port is down, flag updated to apply on next up
> +			 * so just return.
> +			 */
> +			return 0;
> +		}
> +
> +		/* Pre-allocate buffers to prevent failure in mana_attach
> +		 * later
> +		 */
> +		err = mana_pre_alloc_rxbufs(apc, ndev->mtu, apc->num_queues);
mana_set_priv_flags() runs in process context under rtnl_lock, where sleeping
allocations are allowed. Does calling mana_pre_alloc_rxbufs() here risk
allocation failures from depleted reserves or memory fragmentation?
mana_pre_alloc_rxbufs() loops num_rxb times using dev_alloc_pages(), which
allocates using GFP_ATOMIC. Could mass-allocating thousands of atomic pages
exhaust atomic memory reserves?
It also uses kmalloc_array() for rxbufs_pre and das_pre, which might require
megabytes of physically contiguous memory. Would it be safer to use
kvmalloc_array() and alloc_pages(GFP_KERNEL) here instead?
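To give a rough sense of scale for the concern above, here is a small userspace sketch of the arithmetic. It assumes (this is not stated in the patch excerpt) that the pre-allocation keeps one pointer and one 64-bit DMA address per RX buffer and one full page per buffer; the struct, the function name, and all inputs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical footprint of one pre-allocation pass (userspace model). */
struct prealloc_footprint {
	size_t num_rxb;        /* total RX buffers across all queues */
	size_t va_array_bytes; /* array of buffer virtual addresses */
	size_t da_array_bytes; /* array of 64-bit DMA addresses */
	size_t page_bytes;     /* one full page per buffer */
};

static struct prealloc_footprint
prealloc_footprint(size_t num_queues, size_t bufs_per_queue, size_t page_size)
{
	struct prealloc_footprint f;

	f.num_rxb = num_queues * bufs_per_queue;
	/* Guard the model itself against multiplication overflow. */
	assert(bufs_per_queue == 0 || f.num_rxb / bufs_per_queue == num_queues);

	f.va_array_bytes = f.num_rxb * sizeof(void *);
	f.da_array_bytes = f.num_rxb * sizeof(uint64_t);
	f.page_bytes = f.num_rxb * page_size;
	return f;
}
```

With illustrative inputs of 64 queues, 8192 buffers per queue, and 4K pages (none of which come from the patch), this gives 524288 buffers, 4 MiB for each array on a 64-bit build, and 2 GiB of pages. Arrays of that size are the case where a physically contiguous allocation is most likely to fail.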
Additionally, if the full-page-rx flag is toggled OFF, does this leak memory
when initial packets are dropped?
Whenever the flag changes, mana_pre_alloc_rxbufs() pre-allocates buffers for the
RX ring, and those buffers are injected with from_pool = false. If the flag was
toggled OFF, the standard MTU configuration then sets rxq->frag_count > 1.
If any initial packets are dropped (e.g. truncated packets), mana_rx_skb()
executes the drop path and saves the buffer for reuse via
rxq->xdp_save_va = buf_va.
However, mana_get_rxfrag() only reuses rxq->xdp_save_va if
rxq->frag_count == 1. For frag_count > 1, it allocates new fragments from the
page pool and ignores the saved buffer. If multiple pre-allocated packets are
dropped, could mana_rx_skb() repeatedly overwrite rxq->xdp_save_va, triggering
a WARN_ON_ONCE and leaking the pages?
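The sequence described above can be modelled in userspace. This is only a sketch of the behaviour as this review describes it, not driver code: rxq_model, model_drop(), model_refill(), and model_run() are hypothetical names, and the buffers are plain bytes standing in for pre-allocated pages.

```c
#include <stddef.h>

struct rxq_model {
	void *save_va;           /* stands in for rxq->xdp_save_va */
	unsigned int frag_count; /* stands in for rxq->frag_count */
	unsigned int leaked;     /* buffers lost when the slot is overwritten */
};

/* Drop path for a buffer with from_pool == false: park it in the one slot. */
static void model_drop(struct rxq_model *q, void *buf_va)
{
	if (q->save_va)
		q->leaked++; /* the driver would WARN_ON_ONCE here */
	q->save_va = buf_va; /* the previous pointer is no longer reachable */
}

/* Refill path: only the one-buffer-per-page mode consumes the saved buffer. */
static void *model_refill(struct rxq_model *q, void *fresh)
{
	if (q->frag_count == 1 && q->save_va) {
		void *va = q->save_va;

		q->save_va = NULL;
		return va;
	}
	return fresh; /* fragment mode ignores the save slot */
}

/* Drop `drops` pre-allocated buffers in a row and count the lost ones. */
static unsigned int model_run(unsigned int frag_count, unsigned int drops)
{
	static char bufs[64]; /* stand-ins for pre-allocated pages */
	static char fresh;    /* stand-in for a page_pool fragment */
	struct rxq_model q = { NULL, frag_count, 0 };
	unsigned int i;

	for (i = 0; i < drops && i < 64; i++) {
		model_drop(&q, &bufs[i]);
		(void)model_refill(&q, &fresh);
	}
	return q.leaked;
}
```

In this model, model_run(1, 4) loses nothing because every saved buffer is consumed by the next refill, while model_run(2, 4) loses three buffers and leaves a fourth parked in the slot.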
> +		if (err) {
> +			netdev_err(ndev,
> +				   "Insufficient memory for new allocations\n");
> +			apc->priv_flags = old_priv_flags;
> +			return err;
> +		}
> +
> +		err = mana_detach(ndev, false);
[ ... ]
> +		err = mana_attach(ndev);
> +		if (err) {
> +			netdev_err(ndev, "mana_attach failed: %d\n", err);
> +			apc->priv_flags = old_priv_flags;
> +
> +			/* Restore 'port_is_up' so the reset work handler
> +			 * can properly detach/attach. Without this,
> +			 * the handler sees port_is_up=false and skips
> +			 * queue allocation, leaving the port dead.
> +			 */
> +			apc->port_is_up = true;
> +			schedule_port_reset = true;
Does setting apc->port_is_up = true on mana_attach() failure cause a NULL
pointer dereference in the async reset worker?
If mana_attach() fails, its error path calls mana_cleanup_port_context(),
which ultimately frees apc->tx_qp and sets it to NULL.
When the scheduled mana_per_port_queue_reset_work_handler executes, it calls
mana_detach(). Because we faked apc->port_is_up = true, mana_detach() invokes
mana_dealloc_queues(), which blindly iterates over apc->num_queues and
dereferences apc->tx_qp[i].txq.
Since apc->tx_qp is NULL, will this result in a kernel panic?
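As an illustration of the state the reset worker would see, here is a minimal userspace model. It is a sketch based only on the description above: port_model, model_detach(), and model_dealloc_queues() are hypothetical stand-ins, and the NULL check is added so that the model reports the hazard instead of crashing. It is not a proposed fix for the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct txq_model { int torn_down; };
struct tx_qp_model { struct txq_model txq; };

struct port_model {
	bool port_is_up;
	unsigned int num_queues;
	struct tx_qp_model *tx_qp; /* NULL after a failed attach cleans up */
};

/* Returns the number of queues torn down, or -1 if tx_qp is already gone. */
static int model_dealloc_queues(struct port_model *apc)
{
	unsigned int i;

	if (!apc->tx_qp) /* hypothetical guard; without it the loop faults */
		return -1;

	for (i = 0; i < apc->num_queues; i++)
		apc->tx_qp[i].txq.torn_down = 1;
	return (int)apc->num_queues;
}

/* Tears down queues only when the port claims to be up. */
static int model_detach(struct port_model *apc)
{
	assert(apc != NULL);

	if (!apc->port_is_up)
		return 0;

	apc->port_is_up = false;
	return model_dealloc_queues(apc);
}
```

A port with port_is_up forced to true and tx_qp == NULL makes model_detach() return -1, which is the point where the unguarded loop would dereference NULL.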
> +		}
> +	}
> +
> +out:
> +	mana_pre_dealloc_rxbufs(apc);
> +
> +	if (schedule_port_reset)
> +		queue_work(apc->ac->per_port_queue_reset_wq,
> +			   &apc->queue_reset_work);
> +
> +	return err;
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260508115100.488506-1-dipayanroy@linux.microsoft.com?part=2
Thread overview:
2026-05-08 11:46 [PATCH v8 0/2] net: mana: add ethtool private flag for full-page RX buffers Dipayaan Roy
2026-05-08 11:46 ` [PATCH v8 1/2] net: mana: refactor mana_get_strings() and mana_get_sset_count() to use switch Dipayaan Roy
2026-05-08 11:46 ` [PATCH v8 2/2] net: mana: force full-page RX buffers via ethtool private flag Dipayaan Roy
2026-05-08 21:16 ` sashiko-bot [this message]
2026-05-12 2:21 ` Jakub Kicinski