Date: Thu, 30 Apr 2026 00:50:59 +0200
From: Nicolai Buchwitz <nb@tipi-net.de>
To: sashiko@lists.linux.dev
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH net-next v8 5/7] net: bcmgenet: add XDP_REDIRECT and ndo_xdp_xmit support
In-Reply-To: <20260429211703.19EC9C2BCB4@smtp.kernel.org>
References: <20260428205846.2625550-6-nb@tipi-net.de> <20260429211703.19EC9C2BCB4@smtp.kernel.org>
Message-ID: <6f05e119f0c419047313a20707e7fba5@tipi-net.de>

On 29.4.2026 23:17, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 6 potential
> issue(s) to consider:
> - [Critical] Prepended TSB in ndo_xdp_xmit completely overwrites
>   struct xdp_frame.
> - [Critical] Headroom validation in XDP_TX ignores metadata size,
>   allowing out-of-bounds writes.
> - [High] DMA mapping address in XDP_TX is incorrect when XDP metadata
>   is present.
> - [High] Redirected XDP frames will be leaked indefinitely if the
>   target interface is idle.
> - [Medium] bcmgenet_xdp_setup() clears NETDEV_XDP_ACT_NDO_XMIT when
>   detaching an XDP program, hiding device capabilities.
> - [Medium] Runt XDP frames (less than 60 bytes) will not be padded by
>   the hardware and will be transmitted as undersized packets.
> --
>
> commit 893d59a522f087d2ccf486f06a06bb9e815cf330
> Author: Nicolai Buchwitz
>
>     net: bcmgenet: add XDP_REDIRECT and ndo_xdp_xmit support
>
>     Add XDP_REDIRECT support and implement ndo_xdp_xmit for receiving
>     redirected frames from other devices.
>
>> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
>> b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
>> index 3c3b0c44ea8a..9dd258567824 100644
>> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
>> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
>
> [ ... ]
>
>> @@ -2451,11 +2452,25 @@ static unsigned int bcmgenet_run_xdp(struct
>> bcmgenet_rx_ring *ring,
>>                                       true);
>>              return XDP_DROP;
>>      }
>> -    if (unlikely(!bcmgenet_xdp_xmit_frame(priv, xdpf, false))) {
>> +
>> +    tx_ring = &priv->xdp_tx_ring;
>> +    spin_lock(&tx_ring->lock);
>> +    if (unlikely(!bcmgenet_xdp_xmit_frame(priv, tx_ring,
>> +                                          xdpf, false))) {
>
> Does the DMA mapping calculation in bcmgenet_xdp_xmit_frame() for this
> local XDP_TX path (dma_map == false) account for XDP metadata?

For local XDP_TX, the caller adjusts xdp->data backwards to include the
prepended TSB before calling xdp_convert_buff_to_frame(). xdpf->data
then points to the start of the TSB, and the mapping calculation
resolves to exactly that location.

> If an XDP program populates metadata, the calculation sizeof(*xdpf) +
> xdpf->headroom in the else block of bcmgenet_xdp_xmit_frame() will
> point to the start of the metadata rather than where the TSB was
> actually prepended at xdpf->data. The hardware expects the TSB at the
> DMA address, so it might misinterpret the metadata as the status
> block.
>
>> +            spin_unlock(&tx_ring->lock);
>>              xdp_return_frame_rx_napi(xdpf);
>>              return XDP_DROP;
>>      }
>> +    bcmgenet_xdp_ring_doorbell(priv, tx_ring);
>> +    spin_unlock(&tx_ring->lock);
>>      return XDP_TX;
>
> Does the headroom check earlier in this XDP_TX block adequately
> validate xdp->data_meta?
xdp_convert_buff_to_frame() validates the remaining headroom against
sizeof(struct xdp_frame) and returns NULL when there is not enough
space; we handle that case by dropping the frame.

> Before backing up the pointers:
>     if (xdp->data - xdp->data_hard_start <
>         sizeof(struct status_64) + sizeof(struct xdp_frame)) {
>
> This check validates the space before xdp->data, but the code then
> subtracts 64 bytes from xdp->data_meta:
>     xdp->data -= sizeof(struct status_64);
>     xdp->data_meta -= sizeof(struct status_64);
>
> If an XDP program populates metadata, xdp->data_meta will be lower
> than xdp->data. The subtraction might push xdp->data_meta below
> xdp->data_hard_start, causing out-of-bounds writes when
> xdp_convert_buff_to_frame() constructs the struct xdp_frame.
>
> [ ... ]
>
>> @@ -3996,10 +4017,16 @@ static int bcmgenet_xdp_setup(struct
>> net_device *dev,
>>              return -EOPNOTSUPP;
>>      }
>>
>> +    if (!prog)
>> +            xdp_features_clear_redirect_target(dev);
>> +
>>      old_prog = xchg(&priv->xdp_prog, prog);
>>      if (old_prog)
>>              bpf_prog_put(old_prog);
>>
>> +    if (prog)
>> +            xdp_features_set_redirect_target(dev, false);
>> +
>>      return 0;
>> }
>
> Can this unintentionally clear NETDEV_XDP_ACT_NDO_XMIT when detaching
> an XDP program?

xdp_features_clear_redirect_target() and
xdp_features_set_redirect_target() only manipulate the redirect-target
advertisement, not NETDEV_XDP_ACT_NDO_XMIT. The same attach/detach
pattern is used by ice, bnxt, and others.

> The driver statically implements ndo_xdp_xmit and advertises it in
> dev->xdp_features during probe, indicating the interface can receive
> XDP redirected frames from other interfaces regardless of its own XDP
> program state. Calling xdp_features_clear_redirect_target(dev) when
> prog is NULL will clear the NETDEV_XDP_ACT_NDO_XMIT capability, hiding
> it from discovery.
>
> [ ... ]
>
>> @@ -4013,6 +4040,36 @@ static int bcmgenet_xdp(struct net_device *dev,
>> struct netdev_bpf *xdp)
>>      }
>> }
>>
>> +static int bcmgenet_xdp_xmit(struct net_device *dev, int num_frames,
>> +                            struct xdp_frame **frames, u32 flags)
>> +{
>> +    struct bcmgenet_priv *priv = netdev_priv(dev);
>> +    struct bcmgenet_tx_ring *ring = &priv->xdp_tx_ring;
>> +    int sent = 0;
>> +    int i;
>> +
>> +    if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
>> +            return -EINVAL;
>> +
>> +    if (unlikely(!netif_running(dev)))
>> +            return -ENETDOWN;
>> +
>> +    spin_lock(&ring->lock);
>> +
>> +    for (i = 0; i < num_frames; i++) {
>> +            if (!bcmgenet_xdp_xmit_frame(priv, ring, frames[i], true))
>> +                    break;
>
> Does bcmgenet_xdp_xmit_frame() overwrite the struct xdp_frame metadata
> when handling these redirected frames?

No. The struct xdp_frame ends at data_hard_start + sizeof(struct
xdp_frame), while the TSB is written at xdpf->data - 64, which falls
inside the headroom, i.e. after the xdp_frame. The check that rejects
frames with xdpf->headroom < sizeof(struct status_64) guarantees the
two regions cannot overlap.

> When dma_map == true for redirected frames, the driver prepends a
> 64-byte hardware TSB by calling memset(xdpf->data - sizeof(struct
> status_64), 0, 64). Since the struct xdp_frame is located right before
> xdpf->data (or any prepended metadata), writing backwards into the
> headroom could completely zero out the struct xdp_frame fields,
> leading to crashes when returning the frame.
>
> Also, does this cause runt frames to be transmitted as undersized
> packets?

Padding short frames is the responsibility of the XDP program or the
source device, the same as in other XDP drivers.

> If the XDP frame is less than 60 bytes, enabling DMA_TX_APPEND_CRC
> inside bcmgenet_xdp_xmit_frame() without first padding the frame might
> trigger the GENET hardware quirk where it fails to pad packets to the
> minimum Ethernet size, causing them to be dropped by receiving
> switches.
> Is there a mechanism to ensure these redirected frames won't be
> leaked if the target interface is idle?

This is a known limitation of piggybacked TX reclaim and is shared by
other drivers. In practice, redirect targets see bidirectional traffic,
and once the ring fills, further frames are dropped rather than leaked.

> Without setting DMA_TX_REQ_INTR in len_stat during transmission in
> bcmgenet_xdp_xmit_frame(), TX completions for the XDP TX ring rely
> entirely on being polled during the RX NAPI cycle. If the interface is
> receiving no normal traffic, the NAPI poll might never run, causing
> the redirected frames to sit in the TX ring indefinitely.
>
>> +            sent++;
>> +    }

Thanks,
Nicolai