From: Jakub Kicinski <kuba@kernel.org>
To: nb@tipi-net.de
Cc: Jakub Kicinski <kuba@kernel.org>, netdev@vger.kernel.org,
	justin.chen@broadcom.com, horms@kernel.org, mohsin.bashr@gmail.com,
	opendmb@gmail.com, florian.fainelli@broadcom.com,
	bcm-kernel-feedback-list@broadcom.com, andrew+netdev@lunn.ch,
	edumazet@google.com, pabeni@redhat.com, davem@davemloft.net,
	rajashekar.hudumula@broadcom.com, vikas.gupta@broadcom.com,
	bhargava.marreddy@broadcom.com, sashal@kernel.org,
	ebiggers@google.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v8 1/7] net: bcmgenet: convert RX path to page_pool
Date: Thu, 30 Apr 2026 18:37:25 -0700
Message-ID: <20260501013724.3164844-2-kuba@kernel.org>
In-Reply-To: <20260428205846.2625550-2-nb@tipi-net.de>
References: <20260428205846.2625550-2-nb@tipi-net.de>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

---

net: bcmgenet: convert RX path to page_pool

This patch replaces the per-packet __netdev_alloc_skb() +
dma_map_single() in the bcmgenet RX path with page_pool, adding a
page_pool per RX ring and building SKBs from pages via napi_build_skb()
with skb_mark_for_recycle().
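For reviewers following along without the full diff: the conversion
follows the usual page_pool recipe, one pool per RX ring that owns the
DMA mapping and the device-direction sync. A minimal sketch of that
shape, with an example_ prefix on every name I made up and parameter
values that are illustrative assumptions, not this patch's exact
choices:

	#include <net/page_pool/helpers.h>

	struct example_rx_ring {	/* hypothetical stand-in */
		struct page_pool *page_pool;
		unsigned int size;	/* number of RX descriptors */
	};

	static int example_rx_ring_create_pool(struct example_rx_ring *ring,
					       struct device *dev)
	{
		struct page_pool_params pp = {
			/* pool owns dma_map_page() and sync-for-device */
			.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
			.order		= 0,
			.pool_size	= ring->size,
			.nid		= NUMA_NO_NODE,
			.dev		= dev,
			.dma_dir	= DMA_FROM_DEVICE,
			/* HW starts writing after the XDP headroom... */
			.offset		= XDP_PACKET_HEADROOM,
			/* ...and sync-for-device covers at most this much */
			.max_len	= RX_BUF_LENGTH,
		};

		ring->page_pool = page_pool_create(&pp);
		if (IS_ERR(ring->page_pool)) {
			int err = PTR_ERR(ring->page_pool);

			ring->page_pool = NULL;
			return err;
		}
		return 0;
	}

RX buffers then come from page_pool_dev_alloc_pages(ring->page_pool),
and a received page is either recycled via skb_mark_for_recycle() on the
built SKB or returned with page_pool_put_full_page().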
A few questions on the changes below.

> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index 54f71b1e85fc..d013a3df9048 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> @@ -52,6 +52,13 @@
>  #define RX_BUF_LENGTH		2048
>  #define SKB_ALIGNMENT		32
>
> +/* Page pool RX buffer layout:
> + * XDP_PACKET_HEADROOM | RSB(64) + pad(2) | frame data | skb_shared_info
> + * The HW writes the 64B RSB + 2B alignment padding before the frame.
> + */
> +#define GENET_RSB_PAD		(sizeof(struct status_64) + 2)
> +#define GENET_RX_HEADROOM	(XDP_PACKET_HEADROOM + GENET_RSB_PAD)

Is GENET_RX_HEADROOM used anywhere? A grep of the tree after this patch
shows no references to it; all call sites use XDP_PACKET_HEADROOM and
GENET_RSB_PAD separately. Can this macro be dropped?

Jakub: false positive

Also, XDP_PACKET_HEADROOM is referenced here and in three other places
in bcmgenet.c (bcmgenet_rx_refill, bcmgenet_desc_rx, and
bcmgenet_rx_ring_create_pool), but the only new include added by the
patch is in bcmgenet.h. XDP_PACKET_HEADROOM is defined in
include/uapi/linux/bpf.h, and the helpers.h/types.h chain does not
include it. Would it be worth adding an explicit #include to bcmgenet.c
so this does not rely on transitive inclusion? Other drivers that use
this macro, e.g. bnxt, include it explicitly.

> @@ -2250,46 +2249,30 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev)
> -static struct sk_buff *bcmgenet_rx_refill(struct bcmgenet_priv *priv,
> -					  struct enet_cb *cb)
> +static int bcmgenet_rx_refill(struct bcmgenet_rx_ring *ring,
> +			      struct enet_cb *cb)
>  {

[ ... ]

> -	/* Allocate a new Rx skb */
> -	skb = __netdev_alloc_skb(priv->dev, priv->rx_buf_len + SKB_ALIGNMENT,
> -				 GFP_ATOMIC | __GFP_NOWARN);

After this conversion, priv->rx_buf_len no longer has any readers; the
old dma_map_single()/dma_unmap_single() paths that consumed it are gone,
and the new code uses RX_BUF_LENGTH directly in pp_params.max_len and
page_pool_dma_sync_for_cpu(). bcmgenet_init() still writes the field.
Should priv->rx_buf_len be removed as part of this cleanup?

[ ... ]

> @@ -2341,25 +2324,28 @@ static unsigned int bcmgenet_desc_rx(struct bcmgenet_rx_ring *ring,
>  	while ((rxpktprocessed < rxpkttoprocess) &&
>  	       (rxpktprocessed < budget)) {
>  		struct status_64 *status;
> +		struct page *rx_page;
> +		unsigned int rx_off;
>  		__be16 rx_csum;
> +		void *hard_start;
>
>  		cb = &priv->rx_cbs[ring->read_ptr];

[ ... ]

> +		page_pool_dma_sync_for_cpu(ring->page_pool, rx_page, 0,
> +					   RX_BUF_LENGTH);

page_pool_dma_sync_for_cpu() is called unconditionally with a fixed size
of RX_BUF_LENGTH (2048), even when the actual received frame is much
smaller. The received length is available in the descriptor status word
once it is read. On the non-cache-coherent SoCs this driver targets (for
example BCM2711), is there a reason not to sync only the number of bytes
that were actually received?
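If the sync can be narrowed, one possible shape is to sync only the
status block first, read the hardware length out of it, and then sync
just the bytes the hardware actually wrote. A rough sketch reusing the
names from the quoted hunk (untested; it assumes the in-buffer RSB must
itself be synced before its length field can be read, and that two
smaller syncs beat one fixed 2048B sync on these SoCs):

	/* Sync only the 64B RSB + 2B pad so the length field is visible */
	page_pool_dma_sync_for_cpu(ring->page_pool, rx_page, 0,
				   GENET_RSB_PAD);
	status = (struct status_64 *)hard_start;
	len = status->length_status >> DMA_BUFLENGTH_SHIFT;

	/* Then sync only the frame bytes that follow the RSB */
	if (len > GENET_RSB_PAD)
		page_pool_dma_sync_for_cpu(ring->page_pool, rx_page,
					   GENET_RSB_PAD,
					   len - GENET_RSB_PAD);

Whether the split sync actually wins would need measuring on the
affected platforms.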
[ ... ]

> @@ -2375,7 +2361,8 @@ static unsigned int bcmgenet_desc_rx(struct bcmgenet_rx_ring *ring,
>  		if (unlikely(len > RX_BUF_LENGTH)) {
>  			netif_err(priv, rx_status, dev, "oversized packet\n");
>  			BCMGENET_STATS64_INC(stats, length_errors);

[ ... ]

> @@ -2411,24 +2399,47 @@ static unsigned int bcmgenet_desc_rx(struct bcmgenet_rx_ring *ring,

[ ... ]

> -		skb_put(skb, len);
> +		/* Build SKB from the page - data starts at hard_start,
> +		 * frame begins after RSB(64) + pad(2) = 66 bytes.
> +		 */
> +		skb = napi_build_skb(hard_start, PAGE_SIZE - XDP_PACKET_HEADROOM);
> +		if (unlikely(!skb)) {
> +			BCMGENET_STATS64_INC(stats, dropped);
> +			page_pool_put_full_page(ring->page_pool, rx_page,
> +						true);
> +			goto next;
> +		}
>
> -		/* remove RSB and hardware 2bytes added for IP alignment */
> -		skb_pull(skb, 66);
> -		len -= 66;
> +		skb_mark_for_recycle(skb);
> +
> +		/* Reserve the RSB + pad, then set the data length */
> +		skb_reserve(skb, GENET_RSB_PAD);
> +		__skb_put(skb, len - GENET_RSB_PAD);

Can len here be smaller than GENET_RSB_PAD (66)? len is taken from
status->length_status >> DMA_BUFLENGTH_SHIFT, and the only bound checked
above is the upper bound against RX_BUF_LENGTH:

	if (unlikely(len > RX_BUF_LENGTH)) { ... }

If the HW ever reports a runt length below 66 (for example after a reset
or due to an errata path), len - GENET_RSB_PAD underflows as an unsigned
value. __skb_put() is the unchecked variant:

	static inline void *__skb_put(struct sk_buff *skb, unsigned int len)
	{
		void *tmp = skb_tail_pointer(skb);

		SKB_LINEAR_ASSERT(skb);
		skb->tail += len;
		skb->len  += len;
		return tmp;
	}

so it will not trip skb_over_panic(); instead skb->tail/skb->len advance
past skb->end into the page. Since napi_build_skb() was given frag_size =
PAGE_SIZE - XDP_PACKET_HEADROOM, skb->end sits inside the same
page_pool-backed page, and the SKB would then expose recycled page_pool
memory up the stack via napi_gro_receive(). The previous code used
skb_put(skb, len) followed by skb_pull(skb, 66), both of which are
bounds-checked.

Would it make sense to add an explicit guard such as:

	if (unlikely(len < GENET_RSB_PAD)) {
		BCMGENET_STATS64_INC(stats, length_errors);
		page_pool_put_full_page(ring->page_pool, rx_page, true);
		goto next;
	}

before the __skb_put() call?

Jakub: I suppose.. IDK if protecting against bad HW is required, up to you

>  		if (priv->crc_fwd_en) {
> -			skb_trim(skb, len - ETH_FCS_LEN);
> -			len -= ETH_FCS_LEN;
> +			skb_trim(skb, skb->len - ETH_FCS_LEN);
>  		}

The same underflow concern applies here: if skb->len is less than
ETH_FCS_LEN, skb->len - ETH_FCS_LEN wraps to a very large unsigned
value, and skb_trim() does:

	void skb_trim(struct sk_buff *skb, unsigned int len)
	{
		if (skb->len > len)
			__skb_trim(skb, len);
	}

so the trim is silently skipped and the FCS stays attached. Should this
path also validate skb->len before subtracting ETH_FCS_LEN?

[ ... ]

--
pw-bot: cr