public inbox for netdev@vger.kernel.org
* [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration
@ 2025-12-20 23:51 Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open Paolo Valerio
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

This was initially planned to be sent as non-RFC, but since net-next
will remain closed for a while longer, it makes sense to send it as
RFC v2 in the meantime.

Tested on Raspberry Pi 5.
All the changes are intended for gem only.

The series consists of two main changes:

- Migration from netdev_alloc_skb() to page pool allocation model,
  enabling skb recycling.
  This also adds support for multi-descriptor frame reception,
  removing the previous single-descriptor approach and avoiding
  potentially large contiguous allocations for e.g. jumbo frames
  with CONFIG_PAGE_SIZE_4KB.

- XDP support: Base XDP implementation supporting all major
  verdicts (XDP_PASS, XDP_DROP, XDP_REDIRECT, XDP_TX) along with
  the ndo_xdp_xmit function for packet redirection.

The driver now advertises NETDEV_XDP_ACT_BASIC, NETDEV_XDP_ACT_REDIRECT,
NETDEV_XDP_ACT_NDO_XMIT capabilities.

v1: https://lore.kernel.org/netdev/20251119135330.551835-1-pvalerio@redhat.com/

Changes since RFC v1
====================
  - Squashed 1/6 and 2/6
  - Reworked rx_buffer_size computation. It no longer takes extra
    room into account.
  - A bunch of renaming (rx_offset => rx_headroom, removed MACB_MAX_PAD,
    MACB_PP_HEADROOM => XDP_PACKET_HEADROOM, data => ptr, xdp_q => xdp_rxq,
    macb_xdp() => gem_xdp(), macb_xdp_xmit() => gem_xdp_xmit())
  - Deduplicated buffer size computation in gem_xdp_valid_mtu()
    and gem_xdp_setup()
  - gem_xdp_setup() no longer does close()/open()
  - The rx_skbuff to rx_buff renaming is now split into a separate commit
  - Open-coded gem_page_pool_get_buff()
  - Added missing rx_buff re-initialization in the error path during rx
  - Page pool creation failure now fails the device open
  - Moved xdp buff preparation inside gem_xdp_run()
  - Added missing rcu_access_pointer()
  - Changed the return value to -EOPNOTSUPP for macb_xdp() on failure
  - Moved the tx_skb to tx_buff renaming to a separate commit
  - Removed some unneeded code and set MACB_TYPE_SKB for lp->rm9200_txq[desc].type as well
  - Replaced !!addr with a dedicated bool in macb_xdp_submit_frame()

Additional notes
================
  - Clamping is there and needed, as rx_buffer_size plus overhead cannot
    exceed PAGE_SIZE. This is fine, as frames larger than PAGE_SIZE are
    still handled in a scattered way, as intended.
  - Code in macb_xdp_submit_frame() and macb_start_xmit()/macb_tx_map()
    was not factored out because the skb path, besides being scattered
    across skb-specific code, also contains frags handling.
    A unification attempt is probably better made once multi-buffer
    support for XDP gets added.


Paolo Valerio (7):
  net: macb: rename rx_skbuff into rx_buff
  cadence: macb: Add page pool support and handle multi-descriptor frame rx
  cadence: macb: use the current queue number for stats
  cadence: macb: add XDP support for gem
  cadence: macb: make macb_tx_skb generic
  cadence: macb: make tx path skb agnostic
  cadence: macb: introduce xmit support

Théo Lebrun (1):
  net: macb: move Rx buffers alloc from link up to open

 drivers/net/ethernet/cadence/Kconfig     |   1 +
 drivers/net/ethernet/cadence/macb.h      |  41 +-
 drivers/net/ethernet/cadence/macb_main.c | 811 +++++++++++++++++------
 3 files changed, 654 insertions(+), 199 deletions(-)

-- 
2.52.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2026-01-08 15:24   ` Théo Lebrun
  2025-12-20 23:51 ` [PATCH RFC net-next v2 2/8] net: macb: rename rx_skbuff into rx_buff Paolo Valerio
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

From: Théo Lebrun <theo.lebrun@bootlin.com>

mog_alloc_rx_buffers(), called at open, does not allocate rx buffers
on GEM. The bulk of the work is done by gem_rx_refill(), which fills
all slots with valid buffers.

gem_rx_refill() is called at link up by
gem_init_rings() == bp->macbgem_ops.mog_init_rings().

Move this operation to macb_open(), mostly to allow it to fail early
and loudly rather than initialize the device with Rx mostly broken.

About `bool fail_early`:
 - When called from macb_open(), ring init fails as soon as a queue
   cannot be refilled.
 - When called from macb_hresp_error_task(), we do our best to reinit
   the device: we still iterate over all queues and try refilling all
   even if a previous queue failed.

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
Picked up as agreed in the discussion on v1.
This is slightly different from the original patch:
mog_init_rings() was moved after the rx_ring_tieoff
allocation to avoid a NULL pointer dereference of rx_ring_tieoff.
---
 drivers/net/ethernet/cadence/macb.h      |  2 +-
 drivers/net/ethernet/cadence/macb_main.c | 39 ++++++++++++++++++------
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 87414a2ddf6e..2cb65ec37d44 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -1180,7 +1180,7 @@ struct macb_queue;
 struct macb_or_gem_ops {
 	int	(*mog_alloc_rx_buffers)(struct macb *bp);
 	void	(*mog_free_rx_buffers)(struct macb *bp);
-	void	(*mog_init_rings)(struct macb *bp);
+	int	(*mog_init_rings)(struct macb *bp, bool fail_early);
 	int	(*mog_rx)(struct macb_queue *queue, struct napi_struct *napi,
 			  int budget);
 };
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index e461f5072884..cd869db27920 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -705,10 +705,9 @@ static void macb_mac_link_up(struct phylink_config *config,
 		if (rx_pause)
 			ctrl |= MACB_BIT(PAE);
 
-		/* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
-		 * cleared the pipeline and control registers.
+		/* Initialize buffer registers as clearing MACB_BIT(TE) in link
+		 * down cleared the pipeline and control registers.
 		 */
-		bp->macbgem_ops.mog_init_rings(bp);
 		macb_init_buffers(bp);
 
 		for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
@@ -1250,13 +1249,14 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 	return packets;
 }
 
-static void gem_rx_refill(struct macb_queue *queue)
+static int gem_rx_refill(struct macb_queue *queue)
 {
 	unsigned int		entry;
 	struct sk_buff		*skb;
 	dma_addr_t		paddr;
 	struct macb *bp = queue->bp;
 	struct macb_dma_desc *desc;
+	int err = 0;
 
 	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
 			bp->rx_ring_size) > 0) {
@@ -1273,6 +1273,7 @@ static void gem_rx_refill(struct macb_queue *queue)
 			if (unlikely(!skb)) {
 				netdev_err(bp->dev,
 					   "Unable to allocate sk_buff\n");
+				err = -ENOMEM;
 				break;
 			}
 
@@ -1322,6 +1323,7 @@ static void gem_rx_refill(struct macb_queue *queue)
 
 	netdev_vdbg(bp->dev, "rx ring: queue: %p, prepared head %d, tail %d\n",
 			queue, queue->rx_prepared_head, queue->rx_tail);
+	return err;
 }
 
 /* Mark DMA descriptors from begin up to and not including end as unused */
@@ -1774,7 +1776,7 @@ static void macb_hresp_error_task(struct work_struct *work)
 	netif_tx_stop_all_queues(dev);
 	netif_carrier_off(dev);
 
-	bp->macbgem_ops.mog_init_rings(bp);
+	bp->macbgem_ops.mog_init_rings(bp, false);
 
 	/* Initialize TX and RX buffers */
 	macb_init_buffers(bp);
@@ -2547,8 +2549,6 @@ static int macb_alloc_consistent(struct macb *bp)
 		if (!queue->tx_skb)
 			goto out_err;
 	}
-	if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
-		goto out_err;
 
 	/* Required for tie off descriptor for PM cases */
 	if (!(bp->caps & MACB_CAPS_QUEUE_DISABLE)) {
@@ -2560,6 +2560,11 @@ static int macb_alloc_consistent(struct macb *bp)
 			goto out_err;
 	}
 
+	if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
+		goto out_err;
+	if (bp->macbgem_ops.mog_init_rings(bp, true))
+		goto out_err;
+
 	return 0;
 
 out_err:
@@ -2580,11 +2585,13 @@ static void macb_init_tieoff(struct macb *bp)
 	desc->ctrl = 0;
 }
 
-static void gem_init_rings(struct macb *bp)
+static int gem_init_rings(struct macb *bp, bool fail_early)
 {
 	struct macb_queue *queue;
 	struct macb_dma_desc *desc = NULL;
+	int last_err = 0;
 	unsigned int q;
+	int err;
 	int i;
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
@@ -2600,13 +2607,24 @@ static void gem_init_rings(struct macb *bp)
 		queue->rx_tail = 0;
 		queue->rx_prepared_head = 0;
 
-		gem_rx_refill(queue);
+		/* We get called in two cases:
+		 *  - open: we can propagate alloc errors (so fail early),
+		 *  - HRESP error: cannot propagate, we attempt to reinit
+		 *    all queues in case of failure.
+		 */
+		err = gem_rx_refill(queue);
+		if (err) {
+			last_err = err;
+			if (fail_early)
+				break;
+		}
 	}
 
 	macb_init_tieoff(bp);
+	return last_err;
 }
 
-static void macb_init_rings(struct macb *bp)
+static int macb_init_rings(struct macb *bp, bool fail_early)
 {
 	int i;
 	struct macb_dma_desc *desc = NULL;
@@ -2623,6 +2641,7 @@ static void macb_init_rings(struct macb *bp)
 	desc->ctrl |= MACB_BIT(TX_WRAP);
 
 	macb_init_tieoff(bp);
+	return 0;
 }
 
 static void macb_reset_hw(struct macb *bp)
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 2/8] net: macb: rename rx_skbuff into rx_buff
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support and handle multi-descriptor frame rx Paolo Valerio
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

This is a preparatory commit: in later patches the field will no longer
accommodate skbuffs, but more generic frames.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
 drivers/net/ethernet/cadence/macb.h      |  2 +-
 drivers/net/ethernet/cadence/macb_main.c | 24 ++++++++++++------------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 2cb65ec37d44..3b184e9ac771 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -1262,7 +1262,7 @@ struct macb_queue {
 	unsigned int		rx_tail;
 	unsigned int		rx_prepared_head;
 	struct macb_dma_desc	*rx_ring;
-	struct sk_buff		**rx_skbuff;
+	void			**rx_buff;
 	void			*rx_buffers;
 	struct napi_struct	napi_rx;
 	struct queue_stats stats;
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index cd869db27920..b4e2444b2e95 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1267,7 +1267,7 @@ static int gem_rx_refill(struct macb_queue *queue)
 
 		desc = macb_rx_desc(queue, entry);
 
-		if (!queue->rx_skbuff[entry]) {
+		if (!queue->rx_buff[entry]) {
 			/* allocate sk_buff for this free entry in ring */
 			skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
 			if (unlikely(!skb)) {
@@ -1286,7 +1286,7 @@ static int gem_rx_refill(struct macb_queue *queue)
 				break;
 			}
 
-			queue->rx_skbuff[entry] = skb;
+			queue->rx_buff[entry] = skb;
 
 			if (entry == bp->rx_ring_size - 1)
 				paddr |= MACB_BIT(RX_WRAP);
@@ -1389,7 +1389,7 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 			queue->stats.rx_dropped++;
 			break;
 		}
-		skb = queue->rx_skbuff[entry];
+		skb = queue->rx_buff[entry];
 		if (unlikely(!skb)) {
 			netdev_err(bp->dev,
 				   "inconsistent Rx descriptor chain\n");
@@ -1398,7 +1398,7 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 			break;
 		}
 		/* now everything is ready for receiving packet */
-		queue->rx_skbuff[entry] = NULL;
+		queue->rx_buff[entry] = NULL;
 		len = ctrl & bp->rx_frm_len_mask;
 
 		netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
@@ -2397,11 +2397,11 @@ static void gem_free_rx_buffers(struct macb *bp)
 	int i;
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
-		if (!queue->rx_skbuff)
+		if (!queue->rx_buff)
 			continue;
 
 		for (i = 0; i < bp->rx_ring_size; i++) {
-			skb = queue->rx_skbuff[i];
+			skb = queue->rx_buff[i];
 
 			if (!skb)
 				continue;
@@ -2415,8 +2415,8 @@ static void gem_free_rx_buffers(struct macb *bp)
 			skb = NULL;
 		}
 
-		kfree(queue->rx_skbuff);
-		queue->rx_skbuff = NULL;
+		kfree(queue->rx_buff);
+		queue->rx_buff = NULL;
 	}
 }
 
@@ -2479,13 +2479,13 @@ static int gem_alloc_rx_buffers(struct macb *bp)
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
 		size = bp->rx_ring_size * sizeof(struct sk_buff *);
-		queue->rx_skbuff = kzalloc(size, GFP_KERNEL);
-		if (!queue->rx_skbuff)
+		queue->rx_buff = kzalloc(size, GFP_KERNEL);
+		if (!queue->rx_buff)
 			return -ENOMEM;
 		else
 			netdev_dbg(bp->dev,
-				   "Allocated %d RX struct sk_buff entries at %p\n",
-				   bp->rx_ring_size, queue->rx_skbuff);
+				   "Allocated %d RX buff entries at %p\n",
+				   bp->rx_ring_size, queue->rx_buff);
 	}
 	return 0;
 }
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support and handle multi-descriptor frame rx
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 2/8] net: macb: rename rx_skbuff into rx_buff Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2026-01-08 15:43   ` Théo Lebrun
  2025-12-20 23:51 ` [PATCH RFC net-next v2 4/8] cadence: macb: use the current queue number for stats Paolo Valerio
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

Use the page pool allocator for the data buffers and enable skb recycling
support, instead of relying on netdev_alloc_skb() to allocate the entire
skb during refill.

The patch also adds support for receiving network frames that span
multiple DMA descriptors in the Cadence MACB/GEM Ethernet driver.

It removes the requirement that limited frame reception to a single
descriptor (RX_SOF && RX_EOF), also avoiding potentially large
contiguous multi-page allocations for large frames.
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
 drivers/net/ethernet/cadence/Kconfig     |   1 +
 drivers/net/ethernet/cadence/macb.h      |   5 +
 drivers/net/ethernet/cadence/macb_main.c | 345 +++++++++++++++--------
 3 files changed, 235 insertions(+), 116 deletions(-)

diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig
index 5b2a461dfd28..ae500f717433 100644
--- a/drivers/net/ethernet/cadence/Kconfig
+++ b/drivers/net/ethernet/cadence/Kconfig
@@ -25,6 +25,7 @@ config MACB
 	depends on PTP_1588_CLOCK_OPTIONAL
 	select PHYLINK
 	select CRC32
+	select PAGE_POOL
 	help
 	  The Cadence MACB ethernet interface is found on many Atmel AT32 and
 	  AT91 parts.  This driver also supports the Cadence GEM (Gigabit
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 3b184e9ac771..45c04157f153 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -14,6 +14,8 @@
 #include <linux/interrupt.h>
 #include <linux/phy/phy.h>
 #include <linux/workqueue.h>
+#include <net/page_pool/helpers.h>
+#include <net/xdp.h>
 
 #define MACB_GREGS_NBR 16
 #define MACB_GREGS_VERSION 2
@@ -1266,6 +1268,8 @@ struct macb_queue {
 	void			*rx_buffers;
 	struct napi_struct	napi_rx;
 	struct queue_stats stats;
+	struct page_pool	*page_pool;
+	struct sk_buff		*skb;
 };
 
 struct ethtool_rx_fs_item {
@@ -1289,6 +1293,7 @@ struct macb {
 	struct macb_dma_desc	*rx_ring_tieoff;
 	dma_addr_t		rx_ring_tieoff_dma;
 	size_t			rx_buffer_size;
+	size_t			rx_headroom;
 
 	unsigned int		rx_ring_size;
 	unsigned int		tx_ring_size;
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index b4e2444b2e95..9e1efc1f56d8 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1249,14 +1249,22 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 	return packets;
 }
 
-static int gem_rx_refill(struct macb_queue *queue)
+static int gem_total_rx_buffer_size(struct macb *bp)
+{
+	return SKB_HEAD_ALIGN(bp->rx_buffer_size + bp->rx_headroom);
+}
+
+static int gem_rx_refill(struct macb_queue *queue, bool napi)
 {
 	unsigned int		entry;
-	struct sk_buff		*skb;
 	dma_addr_t		paddr;
+	void			*data;
 	struct macb *bp = queue->bp;
 	struct macb_dma_desc *desc;
+	struct page *page;
+	gfp_t gfp_alloc;
 	int err = 0;
+	int offset;
 
 	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
 			bp->rx_ring_size) > 0) {
@@ -1268,25 +1276,20 @@ static int gem_rx_refill(struct macb_queue *queue)
 		desc = macb_rx_desc(queue, entry);
 
 		if (!queue->rx_buff[entry]) {
-			/* allocate sk_buff for this free entry in ring */
-			skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
-			if (unlikely(!skb)) {
+			gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
+			page = page_pool_alloc_frag(queue->page_pool, &offset,
+						    gem_total_rx_buffer_size(bp),
+						    gfp_alloc | __GFP_NOWARN);
+			if (!page) {
 				netdev_err(bp->dev,
-					   "Unable to allocate sk_buff\n");
+					   "Unable to allocate page\n");
 				err = -ENOMEM;
 				break;
 			}
 
-			/* now fill corresponding descriptor entry */
-			paddr = dma_map_single(&bp->pdev->dev, skb->data,
-					       bp->rx_buffer_size,
-					       DMA_FROM_DEVICE);
-			if (dma_mapping_error(&bp->pdev->dev, paddr)) {
-				dev_kfree_skb(skb);
-				break;
-			}
-
-			queue->rx_buff[entry] = skb;
+			paddr = page_pool_get_dma_addr(page) + XDP_PACKET_HEADROOM + offset;
+			data = page_address(page) + offset;
+			queue->rx_buff[entry] = data;
 
 			if (entry == bp->rx_ring_size - 1)
 				paddr |= MACB_BIT(RX_WRAP);
@@ -1296,20 +1299,6 @@ static int gem_rx_refill(struct macb_queue *queue)
 			 */
 			dma_wmb();
 			macb_set_addr(bp, desc, paddr);
-
-			/* Properly align Ethernet header.
-			 *
-			 * Hardware can add dummy bytes if asked using the RBOF
-			 * field inside the NCFGR register. That feature isn't
-			 * available if hardware is RSC capable.
-			 *
-			 * We cannot fallback to doing the 2-byte shift before
-			 * DMA mapping because the address field does not allow
-			 * setting the low 2/3 bits.
-			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
-			 */
-			if (!(bp->caps & MACB_CAPS_RSC))
-				skb_reserve(skb, NET_IP_ALIGN);
 		} else {
 			desc->ctrl = 0;
 			dma_wmb();
@@ -1353,14 +1342,19 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 	struct macb *bp = queue->bp;
 	unsigned int		len;
 	unsigned int		entry;
-	struct sk_buff		*skb;
 	struct macb_dma_desc	*desc;
+	int			data_len;
 	int			count = 0;
+	void			*buff_head;
+	struct skb_shared_info	*shinfo;
+	struct page		*page;
+	int			nr_frags;
+
 
 	while (count < budget) {
 		u32 ctrl;
 		dma_addr_t addr;
-		bool rxused;
+		bool rxused, first_frame;
 
 		entry = macb_rx_ring_wrap(bp, queue->rx_tail);
 		desc = macb_rx_desc(queue, entry);
@@ -1374,6 +1368,12 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		if (!rxused)
 			break;
 
+		if (!(bp->caps & MACB_CAPS_RSC))
+			addr += NET_IP_ALIGN;
+
+		dma_sync_single_for_cpu(&bp->pdev->dev,
+					addr, bp->rx_buffer_size,
+					page_pool_get_dma_dir(queue->page_pool));
 		/* Ensure ctrl is at least as up-to-date as rxused */
 		dma_rmb();
 
@@ -1382,58 +1382,118 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		queue->rx_tail++;
 		count++;
 
-		if (!(ctrl & MACB_BIT(RX_SOF) && ctrl & MACB_BIT(RX_EOF))) {
-			netdev_err(bp->dev,
-				   "not whole frame pointed by descriptor\n");
-			bp->dev->stats.rx_dropped++;
-			queue->stats.rx_dropped++;
-			break;
-		}
-		skb = queue->rx_buff[entry];
-		if (unlikely(!skb)) {
+		buff_head = queue->rx_buff[entry];
+		if (unlikely(!buff_head)) {
 			netdev_err(bp->dev,
 				   "inconsistent Rx descriptor chain\n");
 			bp->dev->stats.rx_dropped++;
 			queue->stats.rx_dropped++;
 			break;
 		}
-		/* now everything is ready for receiving packet */
-		queue->rx_buff[entry] = NULL;
+
+		first_frame = ctrl & MACB_BIT(RX_SOF);
 		len = ctrl & bp->rx_frm_len_mask;
 
-		netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
+		if (len) {
+			data_len = len;
+			if (!first_frame)
+				data_len -= queue->skb->len;
+		} else {
+			data_len = bp->rx_buffer_size;
+		}
+
+		if (first_frame) {
+			queue->skb = napi_build_skb(buff_head, gem_total_rx_buffer_size(bp));
+			if (unlikely(!queue->skb)) {
+				netdev_err(bp->dev,
+					   "Unable to allocate sk_buff\n");
+				goto free_frags;
+			}
+
+			/* Properly align Ethernet header.
+			 *
+			 * Hardware can add dummy bytes if asked using the RBOF
+			 * field inside the NCFGR register. That feature isn't
+			 * available if hardware is RSC capable.
+			 *
+			 * We cannot fallback to doing the 2-byte shift before
+			 * DMA mapping because the address field does not allow
+			 * setting the low 2/3 bits.
+			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
+			 */
+			skb_reserve(queue->skb, bp->rx_headroom);
+			skb_mark_for_recycle(queue->skb);
+			skb_put(queue->skb, data_len);
+			queue->skb->protocol = eth_type_trans(queue->skb, bp->dev);
+
+			skb_checksum_none_assert(queue->skb);
+			if (bp->dev->features & NETIF_F_RXCSUM &&
+			    !(bp->dev->flags & IFF_PROMISC) &&
+			    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
+				queue->skb->ip_summed = CHECKSUM_UNNECESSARY;
+		} else {
+			if (!queue->skb) {
+				netdev_err(bp->dev,
+					   "Received non-starting frame while expecting it\n");
+				goto free_frags;
+			}
+
+			shinfo = skb_shinfo(queue->skb);
+			page = virt_to_head_page(buff_head);
+			nr_frags = shinfo->nr_frags;
+
+			if (nr_frags >= ARRAY_SIZE(shinfo->frags))
+				goto free_frags;
 
-		skb_put(skb, len);
-		dma_unmap_single(&bp->pdev->dev, addr,
-				 bp->rx_buffer_size, DMA_FROM_DEVICE);
+			skb_add_rx_frag(queue->skb, nr_frags, page,
+					buff_head - page_address(page) + bp->rx_headroom,
+					data_len, gem_total_rx_buffer_size(bp));
+		}
+
+		/* now everything is ready for receiving packet */
+		queue->rx_buff[entry] = NULL;
 
-		skb->protocol = eth_type_trans(skb, bp->dev);
-		skb_checksum_none_assert(skb);
-		if (bp->dev->features & NETIF_F_RXCSUM &&
-		    !(bp->dev->flags & IFF_PROMISC) &&
-		    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
+		netdev_vdbg(bp->dev, "%s %u (len %u)\n", __func__, entry, data_len);
 
-		bp->dev->stats.rx_packets++;
-		queue->stats.rx_packets++;
-		bp->dev->stats.rx_bytes += skb->len;
-		queue->stats.rx_bytes += skb->len;
+		if (ctrl & MACB_BIT(RX_EOF)) {
+			bp->dev->stats.rx_packets++;
+			queue->stats.rx_packets++;
+			bp->dev->stats.rx_bytes += queue->skb->len;
+			queue->stats.rx_bytes += queue->skb->len;
 
-		gem_ptp_do_rxstamp(bp, skb, desc);
+			gem_ptp_do_rxstamp(bp, queue->skb, desc);
 
 #if defined(DEBUG) && defined(VERBOSE_DEBUG)
-		netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
-			    skb->len, skb->csum);
-		print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
-			       skb_mac_header(skb), 16, true);
-		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_ADDRESS, 16, 1,
-			       skb->data, 32, true);
+			netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
+				    queue->skb->len, queue->skb->csum);
+			print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
+				       skb_mac_header(queue->skb), 16, true);
+			print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_ADDRESS, 16, 1,
+				       queue->skb->data, 32, true);
 #endif
 
-		napi_gro_receive(napi, skb);
+			napi_gro_receive(napi, queue->skb);
+			queue->skb = NULL;
+		}
+
+		continue;
+
+free_frags:
+		if (queue->skb) {
+			dev_kfree_skb(queue->skb);
+			queue->skb = NULL;
+		} else {
+			page_pool_put_full_page(queue->page_pool,
+						virt_to_head_page(buff_head),
+						false);
+		}
+
+		bp->dev->stats.rx_dropped++;
+		queue->stats.rx_dropped++;
+		queue->rx_buff[entry] = NULL;
 	}
 
-	gem_rx_refill(queue);
+	gem_rx_refill(queue, true);
 
 	return count;
 }
@@ -2367,12 +2427,25 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return ret;
 }
 
-static void macb_init_rx_buffer_size(struct macb *bp, size_t size)
+static void macb_init_rx_buffer_size(struct macb *bp, unsigned int mtu)
 {
+	int overhead;
+	size_t size;
+
 	if (!macb_is_gem(bp)) {
 		bp->rx_buffer_size = MACB_RX_BUFFER_SIZE;
 	} else {
-		bp->rx_buffer_size = size;
+		size = mtu + ETH_HLEN + ETH_FCS_LEN;
+		if (!(bp->caps & MACB_CAPS_RSC))
+			size += NET_IP_ALIGN;
+
+		bp->rx_buffer_size = SKB_DATA_ALIGN(size);
+		if (gem_total_rx_buffer_size(bp) > PAGE_SIZE) {
+			overhead = bp->rx_headroom +
+				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+			bp->rx_buffer_size = rounddown(PAGE_SIZE - overhead,
+						       RX_BUFFER_MULTIPLE);
+		}
 
 		if (bp->rx_buffer_size % RX_BUFFER_MULTIPLE) {
 			netdev_dbg(bp->dev,
@@ -2389,11 +2462,9 @@ static void macb_init_rx_buffer_size(struct macb *bp, size_t size)
 
 static void gem_free_rx_buffers(struct macb *bp)
 {
-	struct sk_buff		*skb;
-	struct macb_dma_desc	*desc;
 	struct macb_queue *queue;
-	dma_addr_t		addr;
 	unsigned int q;
+	void *data;
 	int i;
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
@@ -2401,22 +2472,20 @@ static void gem_free_rx_buffers(struct macb *bp)
 			continue;
 
 		for (i = 0; i < bp->rx_ring_size; i++) {
-			skb = queue->rx_buff[i];
-
-			if (!skb)
+			data = queue->rx_buff[i];
+			if (!data)
 				continue;
 
-			desc = macb_rx_desc(queue, i);
-			addr = macb_get_addr(bp, desc);
-
-			dma_unmap_single(&bp->pdev->dev, addr, bp->rx_buffer_size,
-					DMA_FROM_DEVICE);
-			dev_kfree_skb_any(skb);
-			skb = NULL;
+			page_pool_put_full_page(queue->page_pool,
+						virt_to_head_page(data),
+						false);
+			queue->rx_buff[i] = NULL;
 		}
 
 		kfree(queue->rx_buff);
 		queue->rx_buff = NULL;
+		page_pool_destroy(queue->page_pool);
+		queue->page_pool = NULL;
 	}
 }
 
@@ -2478,13 +2547,12 @@ static int gem_alloc_rx_buffers(struct macb *bp)
 	int size;
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
-		size = bp->rx_ring_size * sizeof(struct sk_buff *);
+		size = bp->rx_ring_size * sizeof(*queue->rx_buff);
 		queue->rx_buff = kzalloc(size, GFP_KERNEL);
 		if (!queue->rx_buff)
 			return -ENOMEM;
 		else
-			netdev_dbg(bp->dev,
-				   "Allocated %d RX buff entries at %p\n",
+			netdev_dbg(bp->dev, "Allocated %d RX buff entries at %p\n",
 				   bp->rx_ring_size, queue->rx_buff);
 	}
 	return 0;
@@ -2572,6 +2640,33 @@ static int macb_alloc_consistent(struct macb *bp)
 	return -ENOMEM;
 }
 
+static int gem_create_page_pool(struct macb_queue *queue)
+{
+	struct page_pool_params pp_params = {
+		.order = 0,
+		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
+		.pool_size = queue->bp->rx_ring_size,
+		.nid = NUMA_NO_NODE,
+		.dma_dir = DMA_FROM_DEVICE,
+		.dev = &queue->bp->pdev->dev,
+		.napi = &queue->napi_rx,
+		.max_len = PAGE_SIZE,
+	};
+	struct page_pool *pool;
+	int err = 0;
+
+	pool = page_pool_create(&pp_params);
+	if (IS_ERR(pool)) {
+		netdev_err(queue->bp->dev, "cannot create rx page pool\n");
+		err = PTR_ERR(pool);
+		pool = NULL;
+	}
+
+	queue->page_pool = pool;
+
+	return err;
+}
+
 static void macb_init_tieoff(struct macb *bp)
 {
 	struct macb_dma_desc *desc = bp->rx_ring_tieoff;
@@ -2607,12 +2702,24 @@ static int gem_init_rings(struct macb *bp, bool fail_early)
 		queue->rx_tail = 0;
 		queue->rx_prepared_head = 0;
 
+		/* This is a hard failure, so the best we can do is try the
+		 * next queue in case of HRESP error.
+		 */
+		err = gem_create_page_pool(queue);
+		if (err) {
+			last_err = err;
+			if (fail_early)
+				break;
+
+			continue;
+		}
+
 		/* We get called in two cases:
 		 *  - open: we can propagate alloc errors (so fail early),
 		 *  - HRESP error: cannot propagate, we attempt to reinit
 		 *    all queues in case of failure.
 		 */
-		err = gem_rx_refill(queue);
+		err = gem_rx_refill(queue, false);
 		if (err) {
 			last_err = err;
 			if (fail_early)
@@ -2756,39 +2863,40 @@ static void macb_configure_dma(struct macb *bp)
 	unsigned int q;
 	u32 dmacfg;
 
-	buffer_size = bp->rx_buffer_size / RX_BUFFER_MULTIPLE;
-	if (macb_is_gem(bp)) {
-		dmacfg = gem_readl(bp, DMACFG) & ~GEM_BF(RXBS, -1L);
-		for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
-			if (q)
-				queue_writel(queue, RBQS, buffer_size);
-			else
-				dmacfg |= GEM_BF(RXBS, buffer_size);
-		}
-		if (bp->dma_burst_length)
-			dmacfg = GEM_BFINS(FBLDO, bp->dma_burst_length, dmacfg);
-		dmacfg |= GEM_BIT(TXPBMS) | GEM_BF(RXBMS, -1L);
-		dmacfg &= ~GEM_BIT(ENDIA_PKT);
+	if (!macb_is_gem(bp))
+		return;
 
-		if (bp->native_io)
-			dmacfg &= ~GEM_BIT(ENDIA_DESC);
+	buffer_size = bp->rx_buffer_size / RX_BUFFER_MULTIPLE;
+	dmacfg = gem_readl(bp, DMACFG) & ~GEM_BF(RXBS, -1L);
+	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+		if (q)
+			queue_writel(queue, RBQS, buffer_size);
 		else
-			dmacfg |= GEM_BIT(ENDIA_DESC); /* CPU in big endian */
+			dmacfg |= GEM_BF(RXBS, buffer_size);
+	}
+	if (bp->dma_burst_length)
+		dmacfg = GEM_BFINS(FBLDO, bp->dma_burst_length, dmacfg);
+	dmacfg |= GEM_BIT(TXPBMS) | GEM_BF(RXBMS, -1L);
+	dmacfg &= ~GEM_BIT(ENDIA_PKT);
 
-		if (bp->dev->features & NETIF_F_HW_CSUM)
-			dmacfg |= GEM_BIT(TXCOEN);
-		else
-			dmacfg &= ~GEM_BIT(TXCOEN);
+	if (bp->native_io)
+		dmacfg &= ~GEM_BIT(ENDIA_DESC);
+	else
+		dmacfg |= GEM_BIT(ENDIA_DESC); /* CPU in big endian */
 
-		dmacfg &= ~GEM_BIT(ADDR64);
-		if (macb_dma64(bp))
-			dmacfg |= GEM_BIT(ADDR64);
-		if (macb_dma_ptp(bp))
-			dmacfg |= GEM_BIT(RXEXT) | GEM_BIT(TXEXT);
-		netdev_dbg(bp->dev, "Cadence configure DMA with 0x%08x\n",
-			   dmacfg);
-		gem_writel(bp, DMACFG, dmacfg);
-	}
+	if (bp->dev->features & NETIF_F_HW_CSUM)
+		dmacfg |= GEM_BIT(TXCOEN);
+	else
+		dmacfg &= ~GEM_BIT(TXCOEN);
+
+	dmacfg &= ~GEM_BIT(ADDR64);
+	if (macb_dma64(bp))
+		dmacfg |= GEM_BIT(ADDR64);
+	if (macb_dma_ptp(bp))
+		dmacfg |= GEM_BIT(RXEXT) | GEM_BIT(TXEXT);
+	netdev_dbg(bp->dev, "Cadence configure DMA with 0x%08x\n",
+		   dmacfg);
+	gem_writel(bp, DMACFG, dmacfg);
 }
 
 static void macb_init_hw(struct macb *bp)
@@ -2951,7 +3059,6 @@ static void macb_set_rx_mode(struct net_device *dev)
 
 static int macb_open(struct net_device *dev)
 {
-	size_t bufsz = dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN;
 	struct macb *bp = netdev_priv(dev);
 	struct macb_queue *queue;
 	unsigned int q;
@@ -2964,7 +3071,7 @@ static int macb_open(struct net_device *dev)
 		return err;
 
 	/* RX buffers initialization */
-	macb_init_rx_buffer_size(bp, bufsz);
+	macb_init_rx_buffer_size(bp, dev->mtu);
 
 	err = macb_alloc_consistent(bp);
 	if (err) {
@@ -5623,6 +5730,12 @@ static int macb_probe(struct platform_device *pdev)
 	if (err)
 		goto err_out_phy_exit;
 
+	if (macb_is_gem(bp)) {
+		bp->rx_headroom = XDP_PACKET_HEADROOM;
+		if (!(bp->caps & MACB_CAPS_RSC))
+			bp->rx_headroom += NET_IP_ALIGN;
+	}
+
 	netif_carrier_off(dev);
 
 	err = register_netdev(dev);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 4/8] cadence: macb: use the current queue number for stats
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
                   ` (2 preceding siblings ...)
  2025-12-20 23:51 ` [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 5/8] cadence: macb: add XDP support for gem Paolo Valerio
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

gem_get_ethtool_stats() always computes the size of the statistics
data to copy based on the maximum number of queues.

Make sure the statistics are copied only for the active queues, as
reported by the string set count op.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
This is not related to XDP, but the issue was spotted while
introducing ethtool stats support, where it resulted in page pool
stats pollution. The page pool stats support patch was later dropped
from the series once its deprecation was realized.
---
 drivers/net/ethernet/cadence/macb_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 9e1efc1f56d8..582ceb728124 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -3253,7 +3253,7 @@ static void gem_get_ethtool_stats(struct net_device *dev,
 	spin_lock_irq(&bp->stats_lock);
 	gem_update_stats(bp);
 	memcpy(data, &bp->ethtool_stats, sizeof(u64)
-			* (GEM_STATS_LEN + QUEUE_STATS_LEN * MACB_MAX_QUEUES));
+			* (GEM_STATS_LEN + QUEUE_STATS_LEN * bp->num_queues));
 	spin_unlock_irq(&bp->stats_lock);
 }
 
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 5/8] cadence: macb: add XDP support for gem
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
                   ` (3 preceding siblings ...)
  2025-12-20 23:51 ` [PATCH RFC net-next v2 4/8] cadence: macb: use the current queue number for stats Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2026-01-08 15:49   ` Théo Lebrun
  2025-12-20 23:51 ` [PATCH RFC net-next v2 6/8] cadence: macb: make macb_tx_skb generic Paolo Valerio
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

Introduce basic XDP support for macb/gem, handling the XDP_PASS,
XDP_DROP, and XDP_REDIRECT verdicts.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
 drivers/net/ethernet/cadence/macb.h      |   3 +
 drivers/net/ethernet/cadence/macb_main.c | 184 ++++++++++++++++++++---
 2 files changed, 169 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 45c04157f153..815d50574267 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -16,6 +16,7 @@
 #include <linux/workqueue.h>
 #include <net/page_pool/helpers.h>
 #include <net/xdp.h>
+#include <linux/bpf_trace.h>
 
 #define MACB_GREGS_NBR 16
 #define MACB_GREGS_VERSION 2
@@ -1270,6 +1271,7 @@ struct macb_queue {
 	struct queue_stats stats;
 	struct page_pool	*page_pool;
 	struct sk_buff		*skb;
+	struct xdp_rxq_info	xdp_rxq;
 };
 
 struct ethtool_rx_fs_item {
@@ -1369,6 +1371,7 @@ struct macb {
 
 	struct macb_pm_data pm_data;
 	const struct macb_usrio_config *usrio;
+	struct bpf_prog	__rcu *prog;
 };
 
 #ifdef CONFIG_MACB_USE_HWSTAMP
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 582ceb728124..f767eb2e272e 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2004-2006 Atmel Corporation
  */
 
+#include <asm-generic/errno.h>
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 #include <linux/circ_buf.h>
 #include <linux/clk-provider.h>
@@ -1249,9 +1250,19 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 	return packets;
 }
 
+static int gem_max_rx_data_size(int base_sz)
+{
+	return SKB_DATA_ALIGN(base_sz + ETH_HLEN + ETH_FCS_LEN);
+}
+
+static int gem_max_rx_buffer_size(int data_sz, struct macb *bp)
+{
+	return SKB_HEAD_ALIGN(data_sz + bp->rx_headroom);
+}
+
 static int gem_total_rx_buffer_size(struct macb *bp)
 {
-	return SKB_HEAD_ALIGN(bp->rx_buffer_size + bp->rx_headroom);
+	return gem_max_rx_buffer_size(bp->rx_buffer_size, bp);
 }
 
 static int gem_rx_refill(struct macb_queue *queue, bool napi)
@@ -1336,10 +1347,59 @@ static void discard_partial_frame(struct macb_queue *queue, unsigned int begin,
 	 */
 }
 
+static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
+		       unsigned int len)
+{
+	struct net_device *dev;
+	struct bpf_prog *prog;
+	struct xdp_buff xdp;
+
+	u32 act = XDP_PASS;
+
+	rcu_read_lock();
+
+	prog = rcu_dereference(queue->bp->prog);
+	if (!prog)
+		goto out;
+
+	xdp_init_buff(&xdp, gem_total_rx_buffer_size(queue->bp), &queue->xdp_rxq);
+	xdp_prepare_buff(&xdp, buff_head, queue->bp->rx_headroom, len, false);
+	xdp_buff_clear_frags_flag(&xdp);
+	dev = queue->bp->dev;
+
+	act = bpf_prog_run_xdp(prog, &xdp);
+	switch (act) {
+	case XDP_PASS:
+		goto out;
+	case XDP_REDIRECT:
+		if (unlikely(xdp_do_redirect(dev, &xdp, prog))) {
+			act = XDP_DROP;
+			break;
+		}
+		goto out;
+	default:
+		bpf_warn_invalid_xdp_action(dev, prog, act);
+		fallthrough;
+	case XDP_ABORTED:
+		trace_xdp_exception(dev, prog, act);
+		fallthrough;
+	case XDP_DROP:
+		break;
+	}
+
+	page_pool_put_full_page(queue->page_pool,
+				virt_to_head_page(xdp.data), true);
+out:
+	rcu_read_unlock();
+
+	return act;
+}
+
 static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		  int budget)
 {
 	struct macb *bp = queue->bp;
+	bool			xdp_flush = false;
 	unsigned int		len;
 	unsigned int		entry;
 	struct macb_dma_desc	*desc;
@@ -1352,9 +1412,10 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 
 
 	while (count < budget) {
-		u32 ctrl;
-		dma_addr_t addr;
 		bool rxused, first_frame;
+		dma_addr_t addr;
+		u32 ctrl;
+		u32 ret;
 
 		entry = macb_rx_ring_wrap(bp, queue->rx_tail);
 		desc = macb_rx_desc(queue, entry);
@@ -1402,6 +1463,17 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 			data_len = bp->rx_buffer_size;
 		}
 
+		if (!(ctrl & MACB_BIT(RX_SOF) && ctrl & MACB_BIT(RX_EOF)))
+			goto skip_xdp;
+
+		ret = gem_xdp_run(queue, buff_head, len);
+		if (ret == XDP_REDIRECT)
+			xdp_flush = true;
+
+		if (ret != XDP_PASS)
+			goto next_frame;
+
+skip_xdp:
 		if (first_frame) {
 			queue->skb = napi_build_skb(buff_head, gem_total_rx_buffer_size(bp));
 			if (unlikely(!queue->skb)) {
@@ -1451,10 +1523,6 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		}
 
 		/* now everything is ready for receiving packet */
-		queue->rx_buff[entry] = NULL;
-
-		netdev_vdbg(bp->dev, "%s %u (len %u)\n", __func__, entry, data_len);
-
 		if (ctrl & MACB_BIT(RX_EOF)) {
 			bp->dev->stats.rx_packets++;
 			queue->stats.rx_packets++;
@@ -1476,6 +1544,8 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 			queue->skb = NULL;
 		}
 
+next_frame:
+		queue->rx_buff[entry] = NULL;
 		continue;
 
 free_frags:
@@ -1493,6 +1563,9 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		queue->rx_buff[entry] = NULL;
 	}
 
+	if (xdp_flush)
+		xdp_do_flush();
+
 	gem_rx_refill(queue, true);
 
 	return count;
@@ -2430,16 +2503,11 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 static void macb_init_rx_buffer_size(struct macb *bp, unsigned int mtu)
 {
 	int overhead;
-	size_t size;
 
 	if (!macb_is_gem(bp)) {
 		bp->rx_buffer_size = MACB_RX_BUFFER_SIZE;
 	} else {
-		size = mtu + ETH_HLEN + ETH_FCS_LEN;
-		if (!(bp->caps & MACB_CAPS_RSC))
-			size += NET_IP_ALIGN;
-
-		bp->rx_buffer_size = SKB_DATA_ALIGN(size);
+		bp->rx_buffer_size = gem_max_rx_data_size(mtu);
 		if (gem_total_rx_buffer_size(bp) > PAGE_SIZE) {
 			overhead = bp->rx_headroom +
 				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
@@ -2484,6 +2552,8 @@ static void gem_free_rx_buffers(struct macb *bp)
 
 		kfree(queue->rx_buff);
 		queue->rx_buff = NULL;
+		if (xdp_rxq_info_is_reg(&queue->xdp_rxq))
+			xdp_rxq_info_unreg(&queue->xdp_rxq);
 		page_pool_destroy(queue->page_pool);
 		queue->page_pool = NULL;
 	}
@@ -2640,30 +2710,55 @@ static int macb_alloc_consistent(struct macb *bp)
 	return -ENOMEM;
 }
 
-static int gem_create_page_pool(struct macb_queue *queue)
+static int gem_create_page_pool(struct macb_queue *queue, int qid)
 {
 	struct page_pool_params pp_params = {
 		.order = 0,
 		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
 		.pool_size = queue->bp->rx_ring_size,
 		.nid = NUMA_NO_NODE,
-		.dma_dir = DMA_FROM_DEVICE,
+		.dma_dir = rcu_access_pointer(queue->bp->prog)
+				? DMA_BIDIRECTIONAL
+				: DMA_FROM_DEVICE,
 		.dev = &queue->bp->pdev->dev,
 		.napi = &queue->napi_rx,
 		.max_len = PAGE_SIZE,
 	};
 	struct page_pool *pool;
-	int err = 0;
+	int err;
 
 	pool = page_pool_create(&pp_params);
 	if (IS_ERR(pool)) {
 		netdev_err(queue->bp->dev, "cannot create rx page pool\n");
 		err = PTR_ERR(pool);
-		pool = NULL;
+		goto clear_pool;
 	}
 
 	queue->page_pool = pool;
 
+	err = xdp_rxq_info_reg(&queue->xdp_rxq, queue->bp->dev, qid,
+			       queue->napi_rx.napi_id);
+	if (err < 0) {
+		netdev_err(queue->bp->dev, "xdp: failed to register rxq info\n");
+		goto destroy_pool;
+	}
+
+	err = xdp_rxq_info_reg_mem_model(&queue->xdp_rxq, MEM_TYPE_PAGE_POOL,
+					 queue->page_pool);
+	if (err) {
+		netdev_err(queue->bp->dev, "xdp: failed to register rxq memory model\n");
+		goto unreg_info;
+	}
+
+	return 0;
+
+unreg_info:
+	xdp_rxq_info_unreg(&queue->xdp_rxq);
+destroy_pool:
+	page_pool_destroy(pool);
+clear_pool:
+	queue->page_pool = NULL;
+
 	return err;
 }
 
@@ -2705,7 +2800,7 @@ static int gem_init_rings(struct macb *bp, bool fail_early)
 		/* This is a hard failure, so the best we can do is try the
 		 * next queue in case of HRESP error.
 		 */
-		err = gem_create_page_pool(queue);
+		err = gem_create_page_pool(queue, q);
 		if (err) {
 			last_err = err;
 			if (fail_early)
@@ -3156,11 +3251,27 @@ static int macb_close(struct net_device *dev)
 	return 0;
 }
 
+static bool gem_xdp_valid_mtu(struct macb *bp, int mtu)
+{
+	int max_frame_size;
+
+	max_frame_size = gem_max_rx_buffer_size(gem_max_rx_data_size(mtu), bp);
+
+	return max_frame_size <= PAGE_SIZE;
+}
+
 static int macb_change_mtu(struct net_device *dev, int new_mtu)
 {
+	struct macb *bp = netdev_priv(dev);
+
 	if (netif_running(dev))
 		return -EBUSY;
 
+	if (rcu_access_pointer(bp->prog) && !gem_xdp_valid_mtu(bp, new_mtu)) {
+		netdev_err(dev, "MTU %d too large for XDP", new_mtu);
+		return -EINVAL;
+	}
+
 	WRITE_ONCE(dev->mtu, new_mtu);
 
 	return 0;
@@ -3178,6 +3289,39 @@ static int macb_set_mac_addr(struct net_device *dev, void *addr)
 	return 0;
 }
 
+static int gem_xdp_setup(struct net_device *dev, struct bpf_prog *prog,
+			 struct netlink_ext_ack *extack)
+{
+	struct macb *bp = netdev_priv(dev);
+	struct bpf_prog *old_prog;
+
+	if (prog && !gem_xdp_valid_mtu(bp, dev->mtu)) {
+		NL_SET_ERR_MSG_MOD(extack, "MTU too large for XDP");
+		return -EOPNOTSUPP;
+	}
+
+	old_prog = rcu_replace_pointer(bp->prog, prog, lockdep_rtnl_is_held());
+	if (old_prog)
+		bpf_prog_put(old_prog);
+
+	return 0;
+}
+
+static int gem_xdp(struct net_device *dev, struct netdev_bpf *xdp)
+{
+	struct macb *bp = netdev_priv(dev);
+
+	if (!macb_is_gem(bp))
+		return -EOPNOTSUPP;
+
+	switch (xdp->command) {
+	case XDP_SETUP_PROG:
+		return gem_xdp_setup(dev, xdp->prog, xdp->extack);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static void gem_update_stats(struct macb *bp)
 {
 	struct macb_queue *queue;
@@ -4431,6 +4575,7 @@ static const struct net_device_ops macb_netdev_ops = {
 	.ndo_hwtstamp_set	= macb_hwtstamp_set,
 	.ndo_hwtstamp_get	= macb_hwtstamp_get,
 	.ndo_setup_tc		= macb_setup_tc,
+	.ndo_bpf		= gem_xdp,
 };
 
 /* Configure peripheral capabilities according to device tree
@@ -5734,6 +5879,9 @@ static int macb_probe(struct platform_device *pdev)
 		bp->rx_headroom = XDP_PACKET_HEADROOM;
 		if (!(bp->caps & MACB_CAPS_RSC))
 			bp->rx_headroom += NET_IP_ALIGN;
+
+		dev->xdp_features = NETDEV_XDP_ACT_BASIC |
+				    NETDEV_XDP_ACT_REDIRECT;
 	}
 
 	netif_carrier_off(dev);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 6/8] cadence: macb: make macb_tx_skb generic
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
                   ` (4 preceding siblings ...)
  2025-12-20 23:51 ` [PATCH RFC net-next v2 5/8] cadence: macb: add XDP support for gem Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 7/8] cadence: macb: make tx path skb agnostic Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 8/8] cadence: macb: introduce xmit support Paolo Valerio
  7 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

The macb_tx_skb structure is renamed to macb_tx_buff with
no functional changes.

This is a preparatory step for adding xdp xmit support.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
 drivers/net/ethernet/cadence/macb.h      | 10 +--
 drivers/net/ethernet/cadence/macb_main.c | 96 ++++++++++++------------
 2 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 815d50574267..47c25993ad40 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -960,7 +960,7 @@ struct macb_dma_desc_ptp {
 /* Scaled PPM fraction */
 #define PPM_FRACTION	16
 
-/* struct macb_tx_skb - data about an skb which is being transmitted
+/* struct macb_tx_buff - data about an skb which is being transmitted
  * @skb: skb currently being transmitted, only set for the last buffer
  *       of the frame
  * @mapping: DMA address of the skb's fragment buffer
@@ -968,8 +968,8 @@ struct macb_dma_desc_ptp {
  * @mapped_as_page: true when buffer was mapped with skb_frag_dma_map(),
  *                  false when buffer was mapped with dma_map_single()
  */
-struct macb_tx_skb {
-	struct sk_buff		*skb;
+struct macb_tx_buff {
+	void			*skb;
 	dma_addr_t		mapping;
 	size_t			size;
 	bool			mapped_as_page;
@@ -1254,7 +1254,7 @@ struct macb_queue {
 	spinlock_t		tx_ptr_lock;
 	unsigned int		tx_head, tx_tail;
 	struct macb_dma_desc	*tx_ring;
-	struct macb_tx_skb	*tx_skb;
+	struct macb_tx_buff	*tx_buff;
 	dma_addr_t		tx_ring_dma;
 	struct work_struct	tx_error_task;
 	bool			txubr_pending;
@@ -1332,7 +1332,7 @@ struct macb {
 	phy_interface_t		phy_interface;
 
 	/* AT91RM9200 transmit queue (1 on wire + 1 queued) */
-	struct macb_tx_skb	rm9200_txq[2];
+	struct macb_tx_buff	rm9200_txq[2];
 	unsigned int		max_tx_length;
 
 	u64			ethtool_stats[GEM_STATS_LEN + QUEUE_STATS_LEN * MACB_MAX_QUEUES];
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index f767eb2e272e..3ffad2ddc349 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -157,10 +157,10 @@ static struct macb_dma_desc *macb_tx_desc(struct macb_queue *queue,
 	return &queue->tx_ring[index];
 }
 
-static struct macb_tx_skb *macb_tx_skb(struct macb_queue *queue,
-				       unsigned int index)
+static struct macb_tx_buff *macb_tx_buff(struct macb_queue *queue,
+					 unsigned int index)
 {
-	return &queue->tx_skb[macb_tx_ring_wrap(queue->bp, index)];
+	return &queue->tx_buff[macb_tx_ring_wrap(queue->bp, index)];
 }
 
 static dma_addr_t macb_tx_dma(struct macb_queue *queue, unsigned int index)
@@ -969,21 +969,21 @@ static int macb_halt_tx(struct macb *bp)
 					bp, TSR);
 }
 
-static void macb_tx_unmap(struct macb *bp, struct macb_tx_skb *tx_skb, int budget)
+static void macb_tx_unmap(struct macb *bp, struct macb_tx_buff *tx_buff, int budget)
 {
-	if (tx_skb->mapping) {
-		if (tx_skb->mapped_as_page)
-			dma_unmap_page(&bp->pdev->dev, tx_skb->mapping,
-				       tx_skb->size, DMA_TO_DEVICE);
+	if (tx_buff->mapping) {
+		if (tx_buff->mapped_as_page)
+			dma_unmap_page(&bp->pdev->dev, tx_buff->mapping,
+				       tx_buff->size, DMA_TO_DEVICE);
 		else
-			dma_unmap_single(&bp->pdev->dev, tx_skb->mapping,
-					 tx_skb->size, DMA_TO_DEVICE);
-		tx_skb->mapping = 0;
+			dma_unmap_single(&bp->pdev->dev, tx_buff->mapping,
+					 tx_buff->size, DMA_TO_DEVICE);
+		tx_buff->mapping = 0;
 	}
 
-	if (tx_skb->skb) {
-		napi_consume_skb(tx_skb->skb, budget);
-		tx_skb->skb = NULL;
+	if (tx_buff->skb) {
+		napi_consume_skb(tx_buff->skb, budget);
+		tx_buff->skb = NULL;
 	}
 }
 
@@ -1029,7 +1029,7 @@ static void macb_tx_error_task(struct work_struct *work)
 	u32			queue_index;
 	u32			packets = 0;
 	u32			bytes = 0;
-	struct macb_tx_skb	*tx_skb;
+	struct macb_tx_buff	*tx_buff;
 	struct macb_dma_desc	*desc;
 	struct sk_buff		*skb;
 	unsigned int		tail;
@@ -1069,16 +1069,16 @@ static void macb_tx_error_task(struct work_struct *work)
 
 		desc = macb_tx_desc(queue, tail);
 		ctrl = desc->ctrl;
-		tx_skb = macb_tx_skb(queue, tail);
-		skb = tx_skb->skb;
+		tx_buff = macb_tx_buff(queue, tail);
+		skb = tx_buff->skb;
 
 		if (ctrl & MACB_BIT(TX_USED)) {
 			/* skb is set for the last buffer of the frame */
 			while (!skb) {
-				macb_tx_unmap(bp, tx_skb, 0);
+				macb_tx_unmap(bp, tx_buff, 0);
 				tail++;
-				tx_skb = macb_tx_skb(queue, tail);
-				skb = tx_skb->skb;
+				tx_buff = macb_tx_buff(queue, tail);
+				skb = tx_buff->skb;
 			}
 
 			/* ctrl still refers to the first buffer descriptor
@@ -1107,7 +1107,7 @@ static void macb_tx_error_task(struct work_struct *work)
 			desc->ctrl = ctrl | MACB_BIT(TX_USED);
 		}
 
-		macb_tx_unmap(bp, tx_skb, 0);
+		macb_tx_unmap(bp, tx_buff, 0);
 	}
 
 	netdev_tx_completed_queue(netdev_get_tx_queue(bp->dev, queue_index),
@@ -1185,7 +1185,7 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 	spin_lock_irqsave(&queue->tx_ptr_lock, flags);
 	head = queue->tx_head;
 	for (tail = queue->tx_tail; tail != head && packets < budget; tail++) {
-		struct macb_tx_skb	*tx_skb;
+		struct macb_tx_buff	*tx_buff;
 		struct sk_buff		*skb;
 		struct macb_dma_desc	*desc;
 		u32			ctrl;
@@ -1205,8 +1205,8 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 
 		/* Process all buffers of the current transmitted frame */
 		for (;; tail++) {
-			tx_skb = macb_tx_skb(queue, tail);
-			skb = tx_skb->skb;
+			tx_buff = macb_tx_buff(queue, tail);
+			skb = tx_buff->skb;
 
 			/* First, update TX stats if needed */
 			if (skb) {
@@ -1226,7 +1226,7 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 			}
 
 			/* Now we can safely release resources */
-			macb_tx_unmap(bp, tx_skb, budget);
+			macb_tx_unmap(bp, tx_buff, budget);
 
 			/* skb is set only for the last buffer of the frame.
 			 * WARNING: at this point skb has been freed by
@@ -2133,8 +2133,8 @@ static unsigned int macb_tx_map(struct macb *bp,
 	unsigned int f, nr_frags = skb_shinfo(skb)->nr_frags;
 	unsigned int len, i, tx_head = queue->tx_head;
 	u32 ctrl, lso_ctrl = 0, seq_ctrl = 0;
+	struct macb_tx_buff *tx_buff = NULL;
 	unsigned int eof = 1, mss_mfs = 0;
-	struct macb_tx_skb *tx_skb = NULL;
 	struct macb_dma_desc *desc;
 	unsigned int offset, size;
 	dma_addr_t mapping;
@@ -2157,7 +2157,7 @@ static unsigned int macb_tx_map(struct macb *bp,
 
 	offset = 0;
 	while (len) {
-		tx_skb = macb_tx_skb(queue, tx_head);
+		tx_buff = macb_tx_buff(queue, tx_head);
 
 		mapping = dma_map_single(&bp->pdev->dev,
 					 skb->data + offset,
@@ -2166,10 +2166,10 @@ static unsigned int macb_tx_map(struct macb *bp,
 			goto dma_error;
 
 		/* Save info to properly release resources */
-		tx_skb->skb = NULL;
-		tx_skb->mapping = mapping;
-		tx_skb->size = size;
-		tx_skb->mapped_as_page = false;
+		tx_buff->skb = NULL;
+		tx_buff->mapping = mapping;
+		tx_buff->size = size;
+		tx_buff->mapped_as_page = false;
 
 		len -= size;
 		offset += size;
@@ -2186,7 +2186,7 @@ static unsigned int macb_tx_map(struct macb *bp,
 		offset = 0;
 		while (len) {
 			size = umin(len, bp->max_tx_length);
-			tx_skb = macb_tx_skb(queue, tx_head);
+			tx_buff = macb_tx_buff(queue, tx_head);
 
 			mapping = skb_frag_dma_map(&bp->pdev->dev, frag,
 						   offset, size, DMA_TO_DEVICE);
@@ -2194,10 +2194,10 @@ static unsigned int macb_tx_map(struct macb *bp,
 				goto dma_error;
 
 			/* Save info to properly release resources */
-			tx_skb->skb = NULL;
-			tx_skb->mapping = mapping;
-			tx_skb->size = size;
-			tx_skb->mapped_as_page = true;
+			tx_buff->skb = NULL;
+			tx_buff->mapping = mapping;
+			tx_buff->size = size;
+			tx_buff->mapped_as_page = true;
 
 			len -= size;
 			offset += size;
@@ -2206,13 +2206,13 @@ static unsigned int macb_tx_map(struct macb *bp,
 	}
 
 	/* Should never happen */
-	if (unlikely(!tx_skb)) {
+	if (unlikely(!tx_buff)) {
 		netdev_err(bp->dev, "BUG! empty skb!\n");
 		return 0;
 	}
 
 	/* This is the last buffer of the frame: save socket buffer */
-	tx_skb->skb = skb;
+	tx_buff->skb = skb;
 
 	/* Update TX ring: update buffer descriptors in reverse order
 	 * to avoid race condition
@@ -2243,10 +2243,10 @@ static unsigned int macb_tx_map(struct macb *bp,
 
 	do {
 		i--;
-		tx_skb = macb_tx_skb(queue, i);
+		tx_buff = macb_tx_buff(queue, i);
 		desc = macb_tx_desc(queue, i);
 
-		ctrl = (u32)tx_skb->size;
+		ctrl = (u32)tx_buff->size;
 		if (eof) {
 			ctrl |= MACB_BIT(TX_LAST);
 			eof = 0;
@@ -2269,7 +2269,7 @@ static unsigned int macb_tx_map(struct macb *bp,
 			ctrl |= MACB_BF(MSS_MFS, mss_mfs);
 
 		/* Set TX buffer descriptor */
-		macb_set_addr(bp, desc, tx_skb->mapping);
+		macb_set_addr(bp, desc, tx_buff->mapping);
 		/* desc->addr must be visible to hardware before clearing
 		 * 'TX_USED' bit in desc->ctrl.
 		 */
@@ -2285,9 +2285,9 @@ static unsigned int macb_tx_map(struct macb *bp,
 	netdev_err(bp->dev, "TX DMA map failed\n");
 
 	for (i = queue->tx_head; i != tx_head; i++) {
-		tx_skb = macb_tx_skb(queue, i);
+		tx_buff = macb_tx_buff(queue, i);
 
-		macb_tx_unmap(bp, tx_skb, 0);
+		macb_tx_unmap(bp, tx_buff, 0);
 	}
 
 	return -ENOMEM;
@@ -2603,8 +2603,8 @@ static void macb_free_consistent(struct macb *bp)
 	dma_free_coherent(dev, size, bp->queues[0].rx_ring, bp->queues[0].rx_ring_dma);
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
-		kfree(queue->tx_skb);
-		queue->tx_skb = NULL;
+		kfree(queue->tx_buff);
+		queue->tx_buff = NULL;
 		queue->tx_ring = NULL;
 		queue->rx_ring = NULL;
 	}
@@ -2682,9 +2682,9 @@ static int macb_alloc_consistent(struct macb *bp)
 		queue->rx_ring = rx + macb_rx_ring_size_per_queue(bp) * q;
 		queue->rx_ring_dma = rx_dma + macb_rx_ring_size_per_queue(bp) * q;
 
-		size = bp->tx_ring_size * sizeof(struct macb_tx_skb);
-		queue->tx_skb = kmalloc(size, GFP_KERNEL);
-		if (!queue->tx_skb)
+		size = bp->tx_ring_size * sizeof(struct macb_tx_buff);
+		queue->tx_buff = kmalloc(size, GFP_KERNEL);
+		if (!queue->tx_buff)
 			goto out_err;
 	}
 
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 7/8] cadence: macb: make tx path skb agnostic
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
                   ` (5 preceding siblings ...)
  2025-12-20 23:51 ` [PATCH RFC net-next v2 6/8] cadence: macb: make macb_tx_skb generic Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2025-12-20 23:51 ` [PATCH RFC net-next v2 8/8] cadence: macb: introduce xmit support Paolo Valerio
  7 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

Rename the macb_tx_buff member skb to ptr and introduce
macb_tx_buff_type to identify the type of buffer a macb_tx_buff
represents.

This is the last preparatory step for xdp xmit support.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
 drivers/net/ethernet/cadence/macb.h      | 23 +++++++++++-----
 drivers/net/ethernet/cadence/macb_main.c | 35 ++++++++++++++----------
 2 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 47c25993ad40..09aec2c4f7d4 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -960,19 +960,28 @@ struct macb_dma_desc_ptp {
 /* Scaled PPM fraction */
 #define PPM_FRACTION	16
 
-/* struct macb_tx_buff - data about an skb which is being transmitted
- * @skb: skb currently being transmitted, only set for the last buffer
- *       of the frame
+enum macb_tx_buff_type {
+	MACB_TYPE_SKB,
+	MACB_TYPE_XDP_TX,
+	MACB_TYPE_XDP_NDO,
+};
+
+/* struct macb_tx_buff - data about an skb or xdp frame which is being
+ * transmitted.
+ * @ptr: pointer to the skb or xdp frame being transmitted; for an
+ *       skb, only set for the last buffer of the frame
  * @mapping: DMA address of the skb's fragment buffer
  * @size: size of the DMA mapped buffer
  * @mapped_as_page: true when buffer was mapped with skb_frag_dma_map(),
  *                  false when buffer was mapped with dma_map_single()
+ * @type: type of buffer (MACB_TYPE_SKB, MACB_TYPE_XDP_TX, MACB_TYPE_XDP_NDO)
  */
 struct macb_tx_buff {
-	void			*skb;
-	dma_addr_t		mapping;
-	size_t			size;
-	bool			mapped_as_page;
+	void				*ptr;
+	dma_addr_t			mapping;
+	size_t				size;
+	bool				mapped_as_page;
+	enum macb_tx_buff_type		type;
 };
 
 /* Hardware-collected statistics. Used when updating the network
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 3ffad2ddc349..cd29a80d1dbb 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -969,7 +969,8 @@ static int macb_halt_tx(struct macb *bp)
 					bp, TSR);
 }
 
-static void macb_tx_unmap(struct macb *bp, struct macb_tx_buff *tx_buff, int budget)
+static void macb_tx_unmap(struct macb *bp, struct macb_tx_buff *tx_buff,
+			  int budget)
 {
 	if (tx_buff->mapping) {
 		if (tx_buff->mapped_as_page)
@@ -981,9 +982,9 @@ static void macb_tx_unmap(struct macb *bp, struct macb_tx_buff *tx_buff, int bud
 		tx_buff->mapping = 0;
 	}
 
-	if (tx_buff->skb) {
-		napi_consume_skb(tx_buff->skb, budget);
-		tx_buff->skb = NULL;
+	if (tx_buff->ptr) {
+		napi_consume_skb(tx_buff->ptr, budget);
+		tx_buff->ptr = NULL;
 	}
 }
 
@@ -1070,7 +1071,7 @@ static void macb_tx_error_task(struct work_struct *work)
 		desc = macb_tx_desc(queue, tail);
 		ctrl = desc->ctrl;
 		tx_buff = macb_tx_buff(queue, tail);
-		skb = tx_buff->skb;
+		skb = tx_buff->ptr;
 
 		if (ctrl & MACB_BIT(TX_USED)) {
 			/* skb is set for the last buffer of the frame */
@@ -1078,7 +1079,7 @@ static void macb_tx_error_task(struct work_struct *work)
 				macb_tx_unmap(bp, tx_buff, 0);
 				tail++;
 				tx_buff = macb_tx_buff(queue, tail);
-				skb = tx_buff->skb;
+				skb = tx_buff->ptr;
 			}
 
 			/* ctrl still refers to the first buffer descriptor
@@ -1206,7 +1207,9 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 		/* Process all buffers of the current transmitted frame */
 		for (;; tail++) {
 			tx_buff = macb_tx_buff(queue, tail);
-			skb = tx_buff->skb;
+
+			if (tx_buff->type == MACB_TYPE_SKB)
+				skb = tx_buff->ptr;
 
 			/* First, update TX stats if needed */
 			if (skb) {
@@ -2166,7 +2169,8 @@ static unsigned int macb_tx_map(struct macb *bp,
 			goto dma_error;
 
 		/* Save info to properly release resources */
-		tx_buff->skb = NULL;
+		tx_buff->ptr = NULL;
+		tx_buff->type = MACB_TYPE_SKB;
 		tx_buff->mapping = mapping;
 		tx_buff->size = size;
 		tx_buff->mapped_as_page = false;
@@ -2194,7 +2198,8 @@ static unsigned int macb_tx_map(struct macb *bp,
 				goto dma_error;
 
 			/* Save info to properly release resources */
-			tx_buff->skb = NULL;
+			tx_buff->ptr = NULL;
+			tx_buff->type = MACB_TYPE_SKB;
 			tx_buff->mapping = mapping;
 			tx_buff->size = size;
 			tx_buff->mapped_as_page = true;
@@ -2212,7 +2217,8 @@ static unsigned int macb_tx_map(struct macb *bp,
 	}
 
 	/* This is the last buffer of the frame: save socket buffer */
-	tx_buff->skb = skb;
+	tx_buff->ptr = skb;
+	tx_buff->type = MACB_TYPE_SKB;
 
 	/* Update TX ring: update buffer descriptors in reverse order
 	 * to avoid race condition
@@ -5096,8 +5102,9 @@ static netdev_tx_t at91ether_start_xmit(struct sk_buff *skb,
 		netif_stop_queue(dev);
 
 		/* Store packet information (to free when Tx completed) */
-		lp->rm9200_txq[desc].skb = skb;
+		lp->rm9200_txq[desc].ptr = skb;
 		lp->rm9200_txq[desc].size = skb->len;
+		lp->rm9200_txq[desc].type = MACB_TYPE_SKB;
 		lp->rm9200_txq[desc].mapping = dma_map_single(&lp->pdev->dev, skb->data,
 							      skb->len, DMA_TO_DEVICE);
 		if (dma_mapping_error(&lp->pdev->dev, lp->rm9200_txq[desc].mapping)) {
@@ -5189,9 +5196,9 @@ static irqreturn_t at91ether_interrupt(int irq, void *dev_id)
 			dev->stats.tx_errors++;
 
 		desc = 0;
-		if (lp->rm9200_txq[desc].skb) {
-			dev_consume_skb_irq(lp->rm9200_txq[desc].skb);
-			lp->rm9200_txq[desc].skb = NULL;
+		if (lp->rm9200_txq[desc].ptr) {
+			dev_consume_skb_irq(lp->rm9200_txq[desc].ptr);
+			lp->rm9200_txq[desc].ptr = NULL;
 			dma_unmap_single(&lp->pdev->dev, lp->rm9200_txq[desc].mapping,
 					 lp->rm9200_txq[desc].size, DMA_TO_DEVICE);
 			dev->stats.tx_packets++;
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC net-next v2 8/8] cadence: macb: introduce xmit support
  2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
                   ` (6 preceding siblings ...)
  2025-12-20 23:51 ` [PATCH RFC net-next v2 7/8] cadence: macb: make tx path skb agnostic Paolo Valerio
@ 2025-12-20 23:51 ` Paolo Valerio
  2026-01-08 15:54   ` Théo Lebrun
  7 siblings, 1 reply; 21+ messages in thread
From: Paolo Valerio @ 2025-12-20 23:51 UTC (permalink / raw)
  To: netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

Add XDP_TX verdict support, introduce an ndo_xdp_xmit function for
packet redirection, and update macb_tx_unmap() to handle both skbs and
xdp frames. Advertise the NETDEV_XDP_ACT_NDO_XMIT capability and the
ability to process XDP_TX verdicts.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 166 +++++++++++++++++++++--
 1 file changed, 158 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index cd29a80d1dbb..d8abfa45e22d 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -969,6 +969,17 @@ static int macb_halt_tx(struct macb *bp)
 					bp, TSR);
 }
 
+static void release_buff(void *buff, enum macb_tx_buff_type type, int budget)
+{
+	if (type == MACB_TYPE_SKB) {
+		napi_consume_skb(buff, budget);
+	} else if (type == MACB_TYPE_XDP_TX) {
+		xdp_return_frame_rx_napi(buff);
+	} else {
+		xdp_return_frame(buff);
+	}
+}
+
 static void macb_tx_unmap(struct macb *bp, struct macb_tx_buff *tx_buff,
 			  int budget)
 {
@@ -983,7 +994,7 @@ static void macb_tx_unmap(struct macb *bp, struct macb_tx_buff *tx_buff,
 	}
 
 	if (tx_buff->ptr) {
-		napi_consume_skb(tx_buff->ptr, budget);
+		release_buff(tx_buff->ptr, tx_buff->type, budget);
 		tx_buff->ptr = NULL;
 	}
 }
@@ -1071,6 +1082,10 @@ static void macb_tx_error_task(struct work_struct *work)
 		desc = macb_tx_desc(queue, tail);
 		ctrl = desc->ctrl;
 		tx_buff = macb_tx_buff(queue, tail);
+
+		if (tx_buff->type != MACB_TYPE_SKB)
+			goto unmap;
+
 		skb = tx_buff->ptr;
 
 		if (ctrl & MACB_BIT(TX_USED)) {
@@ -1108,6 +1123,7 @@ static void macb_tx_error_task(struct work_struct *work)
 			desc->ctrl = ctrl | MACB_BIT(TX_USED);
 		}
 
+unmap:
 		macb_tx_unmap(bp, tx_buff, 0);
 	}
 
@@ -1186,6 +1202,7 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 	spin_lock_irqsave(&queue->tx_ptr_lock, flags);
 	head = queue->tx_head;
 	for (tail = queue->tx_tail; tail != head && packets < budget; tail++) {
+		void			*data = NULL;
 		struct macb_tx_buff	*tx_buff;
 		struct sk_buff		*skb;
 		struct macb_dma_desc	*desc;
@@ -1208,11 +1225,16 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 		for (;; tail++) {
 			tx_buff = macb_tx_buff(queue, tail);
 
-			if (tx_buff->type == MACB_TYPE_SKB)
-				skb = tx_buff->ptr;
+			if (tx_buff->type != MACB_TYPE_SKB) {
+				data = tx_buff->ptr;
+				goto unmap;
+			}
 
 			/* First, update TX stats if needed */
-			if (skb) {
+			if (tx_buff->type == MACB_TYPE_SKB && tx_buff->ptr) {
+				data = tx_buff->ptr;
+				skb = tx_buff->ptr;
+
 				if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
 				    !ptp_one_step_sync(skb))
 					gem_ptp_do_txstamp(bp, skb, desc);
@@ -1228,6 +1250,7 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 				bytes += skb->len;
 			}
 
+unmap:
 			/* Now we can safely release resources */
 			macb_tx_unmap(bp, tx_buff, budget);
 
@@ -1235,7 +1258,7 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 			 * WARNING: at this point skb has been freed by
 			 * macb_tx_unmap().
 			 */
-			if (skb)
+			if (data)
 				break;
 		}
 	}
@@ -1350,10 +1373,128 @@ static void discard_partial_frame(struct macb_queue *queue, unsigned int begin,
 	 */
 }
 
+static int macb_xdp_submit_frame(struct macb *bp, struct xdp_frame *xdpf,
+				 struct net_device *dev, bool dma_map,
+				 dma_addr_t addr)
+{
+	enum macb_tx_buff_type buff_type;
+	struct macb_tx_buff *tx_buff;
+	int cpu = smp_processor_id();
+	struct macb_dma_desc *desc;
+	struct macb_queue *queue;
+	unsigned int next_head;
+	unsigned long flags;
+	dma_addr_t mapping;
+	u16 queue_index;
+	int err = 0;
+	u32 ctrl;
+
+	queue_index = cpu % bp->num_queues;
+	queue = &bp->queues[queue_index];
+	buff_type = dma_map ? MACB_TYPE_XDP_NDO : MACB_TYPE_XDP_TX;
+
+	spin_lock_irqsave(&queue->tx_ptr_lock, flags);
+
+	/* This is a hard error, log it. */
+	if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1) {
+		netif_stop_subqueue(dev, queue_index);
+		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
+			   queue->tx_head, queue->tx_tail);
+		err = -ENOMEM;
+		goto unlock;
+	}
+
+	if (dma_map) {
+		mapping = dma_map_single(&bp->pdev->dev,
+					 xdpf->data,
+					 xdpf->len, DMA_TO_DEVICE);
+		if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
+			err = -ENOMEM;
+			goto unlock;
+		}
+	} else {
+		mapping = addr;
+		dma_sync_single_for_device(&bp->pdev->dev, mapping,
+					   xdpf->len, DMA_BIDIRECTIONAL);
+	}
+
+	next_head = queue->tx_head + 1;
+
+	ctrl = MACB_BIT(TX_USED);
+	desc = macb_tx_desc(queue, next_head);
+	desc->ctrl = ctrl;
+
+	desc = macb_tx_desc(queue, queue->tx_head);
+	tx_buff = macb_tx_buff(queue, queue->tx_head);
+	tx_buff->ptr = xdpf;
+	tx_buff->type = buff_type;
+	tx_buff->mapping = mapping;
+	tx_buff->size = xdpf->len;
+	tx_buff->mapped_as_page = false;
+
+	ctrl = (u32)tx_buff->size;
+	ctrl |= MACB_BIT(TX_LAST);
+
+	if (unlikely(macb_tx_ring_wrap(bp, queue->tx_head) == (bp->tx_ring_size - 1)))
+		ctrl |= MACB_BIT(TX_WRAP);
+
+	/* Set TX buffer descriptor */
+	macb_set_addr(bp, desc, tx_buff->mapping);
+	/* desc->addr must be visible to hardware before clearing
+	 * 'TX_USED' bit in desc->ctrl.
+	 */
+	wmb();
+	desc->ctrl = ctrl;
+	queue->tx_head = next_head;
+
+	/* Make newly initialized descriptor visible to hardware */
+	wmb();
+
+	spin_lock(&bp->lock);
+	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
+	spin_unlock(&bp->lock);
+
+	if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1)
+		netif_stop_subqueue(dev, queue_index);
+
+unlock:
+	spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
+
+	if (err)
+		release_buff(xdpf, buff_type, 0);
+
+	return err;
+}
+
+static int
+gem_xdp_xmit(struct net_device *dev, int num_frame,
+	     struct xdp_frame **frames, u32 flags)
+{
+	struct macb *bp = netdev_priv(dev);
+	u32 xmitted = 0;
+	int i;
+
+	if (!macb_is_gem(bp))
+		return -EOPNOTSUPP;
+
+	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
+		return -EINVAL;
+
+	for (i = 0; i < num_frame; i++) {
+		if (macb_xdp_submit_frame(bp, frames[i], dev, true, 0))
+			break;
+
+		xmitted++;
+	}
+
+	return xmitted;
+}
+
 static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
-		       unsigned int len)
+		       unsigned int len, dma_addr_t addr)
 {
 	struct net_device *dev;
+	struct xdp_frame *xdpf;
 	struct bpf_prog *prog;
 	struct xdp_buff xdp;
 
@@ -1380,6 +1521,13 @@ static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
 			break;
 		}
 		goto out;
+	case XDP_TX:
+		xdpf = xdp_convert_buff_to_frame(&xdp);
+
+		if (!xdpf || macb_xdp_submit_frame(queue->bp, xdpf, dev, false,
+						   addr))
+			act = XDP_DROP;
+		goto out;
 	default:
 		bpf_warn_invalid_xdp_action(dev, prog, act);
 		fallthrough;
@@ -1469,7 +1617,7 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		if (!(ctrl & MACB_BIT(RX_SOF) && ctrl & MACB_BIT(RX_EOF)))
 			goto skip_xdp;
 
-		ret = gem_xdp_run(queue, buff_head, len);
+		ret = gem_xdp_run(queue, buff_head, len, addr);
 		if (ret == XDP_REDIRECT)
 			xdp_flush = true;
 
@@ -4582,6 +4730,7 @@ static const struct net_device_ops macb_netdev_ops = {
 	.ndo_hwtstamp_get	= macb_hwtstamp_get,
 	.ndo_setup_tc		= macb_setup_tc,
 	.ndo_bpf		= gem_xdp,
+	.ndo_xdp_xmit		= gem_xdp_xmit,
 };
 
 /* Configure peripheral capabilities according to device tree
@@ -5888,7 +6037,8 @@ static int macb_probe(struct platform_device *pdev)
 			bp->rx_headroom += NET_IP_ALIGN;
 
 		dev->xdp_features = NETDEV_XDP_ACT_BASIC |
-				    NETDEV_XDP_ACT_REDIRECT;
+				    NETDEV_XDP_ACT_REDIRECT |
+				    NETDEV_XDP_ACT_NDO_XMIT;
 	}
 
 	netif_carrier_off(dev);
-- 
2.52.0



* Re: [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open
  2025-12-20 23:51 ` [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open Paolo Valerio
@ 2026-01-08 15:24   ` Théo Lebrun
  2026-01-12 14:14     ` Paolo Valerio
  0 siblings, 1 reply; 21+ messages in thread
From: Théo Lebrun @ 2026-01-08 15:24 UTC (permalink / raw)
  To: Paolo Valerio, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

Hello Paolo,

Nothing major in this review. Mostly nits.

On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
> From: Théo Lebrun <theo.lebrun@bootlin.com>
>
> mog_alloc_rx_buffers(), getting called at open, does not do rx buffer
> alloc on GEM. The bulk of the work is done by gem_rx_refill() filling
> up all slots with valid buffers.
>
> gem_rx_refill() is called at link up by
> gem_init_rings() == bp->macbgem_ops.mog_init_rings().
>
> Move operation to macb_open(), mostly to allow it to fail early and
> loudly rather than init the device with Rx mostly broken.
>
> About `bool fail_early`:
>  - When called from macb_open(), ring init fails as soon as a queue
>    cannot be refilled.
>  - When called from macb_hresp_error_task(), we do our best to reinit
>    the device: we still iterate over all queues and try refilling all
>    even if a previous queue failed.

About [PATCH 1/8], it conflicts with a patch that landed in v6.19-rc4:
99537d5c476c ("net: macb: Relocate mog_init_rings() callback from
macb_mac_link_up() to macb_open()").

I don't get a merge conflict but the
   bp->macbgem_ops.mog_init_rings(bp);
call must be dropped from macb_open() in [1/8]. It doesn't build anyway
because that call passes a single argument.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



* Re: [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx
  2025-12-20 23:51 ` [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx Paolo Valerio
@ 2026-01-08 15:43   ` Théo Lebrun
  2026-01-12 14:16     ` Paolo Valerio
  0 siblings, 1 reply; 21+ messages in thread
From: Théo Lebrun @ 2026-01-08 15:43 UTC (permalink / raw)
  To: Paolo Valerio, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
> Use the page pool allocator for the data buffers and enable skb recycling
> support, instead of relying on netdev_alloc_skb allocating the entire skb
> during the refill.
>
> The patch also add support for receiving network frames that span multiple
> DMA descriptors in the Cadence MACB/GEM Ethernet driver.
>
> The patch removes the requirement that limited frame reception to
> a single descriptor (RX_SOF && RX_EOF), also avoiding potential
> contiguous multi-page allocation for large frames.
>
> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
> ---
>  drivers/net/ethernet/cadence/Kconfig     |   1 +
>  drivers/net/ethernet/cadence/macb.h      |   5 +
>  drivers/net/ethernet/cadence/macb_main.c | 345 +++++++++++++++--------
>  3 files changed, 235 insertions(+), 116 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig
> index 5b2a461dfd28..ae500f717433 100644
> --- a/drivers/net/ethernet/cadence/Kconfig
> +++ b/drivers/net/ethernet/cadence/Kconfig
> @@ -25,6 +25,7 @@ config MACB
>  	depends on PTP_1588_CLOCK_OPTIONAL
>  	select PHYLINK
>  	select CRC32
> +	select PAGE_POOL
>  	help
>  	  The Cadence MACB ethernet interface is found on many Atmel AT32 and
>  	  AT91 parts.  This driver also supports the Cadence GEM (Gigabit
> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
> index 3b184e9ac771..45c04157f153 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -14,6 +14,8 @@
>  #include <linux/interrupt.h>
>  #include <linux/phy/phy.h>
>  #include <linux/workqueue.h>
> +#include <net/page_pool/helpers.h>
> +#include <net/xdp.h>

nit: `#include <net/xdp.h>` is not needed yet.

>  
>  #define MACB_GREGS_NBR 16
>  #define MACB_GREGS_VERSION 2
> @@ -1266,6 +1268,8 @@ struct macb_queue {
>  	void			*rx_buffers;
>  	struct napi_struct	napi_rx;
>  	struct queue_stats stats;
> +	struct page_pool	*page_pool;
> +	struct sk_buff		*skb;
>  };
>  
>  struct ethtool_rx_fs_item {
> @@ -1289,6 +1293,7 @@ struct macb {
>  	struct macb_dma_desc	*rx_ring_tieoff;
>  	dma_addr_t		rx_ring_tieoff_dma;
>  	size_t			rx_buffer_size;
> +	size_t			rx_headroom;
>  
>  	unsigned int		rx_ring_size;
>  	unsigned int		tx_ring_size;
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index b4e2444b2e95..9e1efc1f56d8 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -1249,14 +1249,22 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
>  	return packets;
>  }
>  
> -static int gem_rx_refill(struct macb_queue *queue)
> +static int gem_total_rx_buffer_size(struct macb *bp)
> +{
> +	return SKB_HEAD_ALIGN(bp->rx_buffer_size + bp->rx_headroom);
> +}

nit: something closer to a buffer size, either `unsigned int` or
`size_t`, sounds better than an int return type.

> +
> +static int gem_rx_refill(struct macb_queue *queue, bool napi)
>  {
>  	unsigned int		entry;
> -	struct sk_buff		*skb;
>  	dma_addr_t		paddr;
> +	void			*data;
>  	struct macb *bp = queue->bp;
>  	struct macb_dma_desc *desc;
> +	struct page *page;
> +	gfp_t gfp_alloc;
>  	int err = 0;
> +	int offset;
>  
>  	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
>  			bp->rx_ring_size) > 0) {
> @@ -1268,25 +1276,20 @@ static int gem_rx_refill(struct macb_queue *queue)
>  		desc = macb_rx_desc(queue, entry);
>  
>  		if (!queue->rx_buff[entry]) {
> -			/* allocate sk_buff for this free entry in ring */
> -			skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
> -			if (unlikely(!skb)) {
> +			gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
> +			page = page_pool_alloc_frag(queue->page_pool, &offset,
> +						    gem_total_rx_buffer_size(bp),
> +						    gfp_alloc | __GFP_NOWARN);
> +			if (!page) {
>  				netdev_err(bp->dev,
> -					   "Unable to allocate sk_buff\n");
> +					   "Unable to allocate page\n");
>  				err = -ENOMEM;
>  				break;
>  			}
>  
> -			/* now fill corresponding descriptor entry */
> -			paddr = dma_map_single(&bp->pdev->dev, skb->data,
> -					       bp->rx_buffer_size,
> -					       DMA_FROM_DEVICE);
> -			if (dma_mapping_error(&bp->pdev->dev, paddr)) {
> -				dev_kfree_skb(skb);
> -				break;
> -			}
> -
> -			queue->rx_buff[entry] = skb;
> +			paddr = page_pool_get_dma_addr(page) + XDP_PACKET_HEADROOM + offset;
> +			data = page_address(page) + offset;
> +			queue->rx_buff[entry] = data;
>  
>  			if (entry == bp->rx_ring_size - 1)
>  				paddr |= MACB_BIT(RX_WRAP);
> @@ -1296,20 +1299,6 @@ static int gem_rx_refill(struct macb_queue *queue)
>  			 */
>  			dma_wmb();
>  			macb_set_addr(bp, desc, paddr);
> -
> -			/* Properly align Ethernet header.
> -			 *
> -			 * Hardware can add dummy bytes if asked using the RBOF
> -			 * field inside the NCFGR register. That feature isn't
> -			 * available if hardware is RSC capable.
> -			 *
> -			 * We cannot fallback to doing the 2-byte shift before
> -			 * DMA mapping because the address field does not allow
> -			 * setting the low 2/3 bits.
> -			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
> -			 */
> -			if (!(bp->caps & MACB_CAPS_RSC))
> -				skb_reserve(skb, NET_IP_ALIGN);
>  		} else {
>  			desc->ctrl = 0;
>  			dma_wmb();
> @@ -1353,14 +1342,19 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>  	struct macb *bp = queue->bp;
>  	unsigned int		len;
>  	unsigned int		entry;
> -	struct sk_buff		*skb;
>  	struct macb_dma_desc	*desc;
> +	int			data_len;
>  	int			count = 0;
> +	void			*buff_head;
> +	struct skb_shared_info	*shinfo;
> +	struct page		*page;
> +	int			nr_frags;

nit: you add 5 new stack variables, maybe you could apply reverse xmas
tree while at it. You do it for the loop body in [5/8].

> +
>  
>  	while (count < budget) {
>  		u32 ctrl;
>  		dma_addr_t addr;
> -		bool rxused;
> +		bool rxused, first_frame;
>  
>  		entry = macb_rx_ring_wrap(bp, queue->rx_tail);
>  		desc = macb_rx_desc(queue, entry);
> @@ -1374,6 +1368,12 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>  		if (!rxused)
>  			break;
>  
> +		if (!(bp->caps & MACB_CAPS_RSC))
> +			addr += NET_IP_ALIGN;
> +
> +		dma_sync_single_for_cpu(&bp->pdev->dev,
> +					addr, bp->rx_buffer_size,
> +					page_pool_get_dma_dir(queue->page_pool));
>  		/* Ensure ctrl is at least as up-to-date as rxused */
>  		dma_rmb();
>  
> @@ -1382,58 +1382,118 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>  		queue->rx_tail++;
>  		count++;
>  
> -		if (!(ctrl & MACB_BIT(RX_SOF) && ctrl & MACB_BIT(RX_EOF))) {
> -			netdev_err(bp->dev,
> -				   "not whole frame pointed by descriptor\n");
> -			bp->dev->stats.rx_dropped++;
> -			queue->stats.rx_dropped++;
> -			break;
> -		}
> -		skb = queue->rx_buff[entry];
> -		if (unlikely(!skb)) {
> +		buff_head = queue->rx_buff[entry];
> +		if (unlikely(!buff_head)) {
>  			netdev_err(bp->dev,
>  				   "inconsistent Rx descriptor chain\n");
>  			bp->dev->stats.rx_dropped++;
>  			queue->stats.rx_dropped++;
>  			break;
>  		}
> -		/* now everything is ready for receiving packet */
> -		queue->rx_buff[entry] = NULL;
> +
> +		first_frame = ctrl & MACB_BIT(RX_SOF);
>  		len = ctrl & bp->rx_frm_len_mask;
>  
> -		netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
> +		if (len) {
> +			data_len = len;
> +			if (!first_frame)
> +				data_len -= queue->skb->len;
> +		} else {
> +			data_len = bp->rx_buffer_size;
> +		}

Why deal with the `!len` case? How can it occur? The user guide doesn't
hint at it. It would mean grabbing uninitialised bytes, since we assume
len is the max buffer size.

> +
> +		if (first_frame) {
> +			queue->skb = napi_build_skb(buff_head, gem_total_rx_buffer_size(bp));
> +			if (unlikely(!queue->skb)) {
> +				netdev_err(bp->dev,
> +					   "Unable to allocate sk_buff\n");
> +				goto free_frags;
> +			}
> +
> +			/* Properly align Ethernet header.
> +			 *
> +			 * Hardware can add dummy bytes if asked using the RBOF
> +			 * field inside the NCFGR register. That feature isn't
> +			 * available if hardware is RSC capable.
> +			 *
> +			 * We cannot fallback to doing the 2-byte shift before
> +			 * DMA mapping because the address field does not allow
> +			 * setting the low 2/3 bits.
> +			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
> +			 */
> +			skb_reserve(queue->skb, bp->rx_headroom);
> +			skb_mark_for_recycle(queue->skb);
> +			skb_put(queue->skb, data_len);
> +			queue->skb->protocol = eth_type_trans(queue->skb, bp->dev);
> +
> +			skb_checksum_none_assert(queue->skb);
> +			if (bp->dev->features & NETIF_F_RXCSUM &&
> +			    !(bp->dev->flags & IFF_PROMISC) &&
> +			    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
> +				queue->skb->ip_summed = CHECKSUM_UNNECESSARY;
> +		} else {
> +			if (!queue->skb) {
> +				netdev_err(bp->dev,
> +					   "Received non-starting frame while expecting it\n");
> +				goto free_frags;
> +			}
> +
> +			shinfo = skb_shinfo(queue->skb);
> +			page = virt_to_head_page(buff_head);
> +			nr_frags = shinfo->nr_frags;
> +
> +			if (nr_frags >= ARRAY_SIZE(shinfo->frags))
> +				goto free_frags;
>  
> -		skb_put(skb, len);
> -		dma_unmap_single(&bp->pdev->dev, addr,
> -				 bp->rx_buffer_size, DMA_FROM_DEVICE);
> +			skb_add_rx_frag(queue->skb, nr_frags, page,
> +					buff_head - page_address(page) + bp->rx_headroom,
> +					data_len, gem_total_rx_buffer_size(bp));
> +		}
> +
> +		/* now everything is ready for receiving packet */
> +		queue->rx_buff[entry] = NULL;
>  
> -		skb->protocol = eth_type_trans(skb, bp->dev);
> -		skb_checksum_none_assert(skb);
> -		if (bp->dev->features & NETIF_F_RXCSUM &&
> -		    !(bp->dev->flags & IFF_PROMISC) &&
> -		    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
> -			skb->ip_summed = CHECKSUM_UNNECESSARY;
> +		netdev_vdbg(bp->dev, "%s %u (len %u)\n", __func__, entry, data_len);
>  
> -		bp->dev->stats.rx_packets++;
> -		queue->stats.rx_packets++;
> -		bp->dev->stats.rx_bytes += skb->len;
> -		queue->stats.rx_bytes += skb->len;
> +		if (ctrl & MACB_BIT(RX_EOF)) {
> +			bp->dev->stats.rx_packets++;
> +			queue->stats.rx_packets++;
> +			bp->dev->stats.rx_bytes += queue->skb->len;
> +			queue->stats.rx_bytes += queue->skb->len;
>  
> -		gem_ptp_do_rxstamp(bp, skb, desc);
> +			gem_ptp_do_rxstamp(bp, queue->skb, desc);
>  
>  #if defined(DEBUG) && defined(VERBOSE_DEBUG)
> -		netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
> -			    skb->len, skb->csum);
> -		print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
> -			       skb_mac_header(skb), 16, true);
> -		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_ADDRESS, 16, 1,
> -			       skb->data, 32, true);
> +			netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
> +				    queue->skb->len, queue->skb->csum);
> +			print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
> +				       skb_mac_header(queue->skb), 16, true);
> +			print_hex_dump(KERN_DEBUG, "buff_head: ", DUMP_PREFIX_ADDRESS, 16, 1,
> +				       queue->skb->buff_head, 32, true);
>  #endif

nit: while you are at it, maybe replace with print_hex_dump_debug()?

>  
> -		napi_gro_receive(napi, skb);
> +			napi_gro_receive(napi, queue->skb);
> +			queue->skb = NULL;
> +		}
> +
> +		continue;
> +
> +free_frags:
> +		if (queue->skb) {
> +			dev_kfree_skb(queue->skb);
> +			queue->skb = NULL;
> +		} else {
> +			page_pool_put_full_page(queue->page_pool,
> +						virt_to_head_page(buff_head),
> +						false);
> +		}
> +
> +		bp->dev->stats.rx_dropped++;
> +		queue->stats.rx_dropped++;
> +		queue->rx_buff[entry] = NULL;
>  	}
>  
> -	gem_rx_refill(queue);
> +	gem_rx_refill(queue, true);
>  
>  	return count;
>  }
> @@ -2367,12 +2427,25 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	return ret;
>  }
>  
> -static void macb_init_rx_buffer_size(struct macb *bp, size_t size)
> +static void macb_init_rx_buffer_size(struct macb *bp, unsigned int mtu)
>  {
> +	int overhead;

nit: Maybe `unsigned int` or `size_t` rather than `int`?

> +	size_t size;
> +
>  	if (!macb_is_gem(bp)) {
>  		bp->rx_buffer_size = MACB_RX_BUFFER_SIZE;
>  	} else {
> -		bp->rx_buffer_size = size;
> +		size = mtu + ETH_HLEN + ETH_FCS_LEN;
> +		if (!(bp->caps & MACB_CAPS_RSC))
> +			size += NET_IP_ALIGN;

NET_IP_ALIGN looks like it is accounted for twice, once in
bp->rx_headroom and once in bp->rx_buffer_size. This gets fixed in
[5/8] where gem_max_rx_data_size() gets introduced.

> +
> +		bp->rx_buffer_size = SKB_DATA_ALIGN(size);
> +		if (gem_total_rx_buffer_size(bp) > PAGE_SIZE) {
> +			overhead = bp->rx_headroom +
> +				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> +			bp->rx_buffer_size = rounddown(PAGE_SIZE - overhead,
> +						       RX_BUFFER_MULTIPLE);
> +		}

I've seen your comment in [0/8]. Do you have any advice on how to test
this clamping? All I can think of is to either configure a massive MTU
or, more easily, cheat with the headroom.

Also, should we warn? It means MTU-sized packets will be received in
fragments. That works, but it is probably unexpected and a source of
slowdown that users might want to know about.

--

nit: while in macb_init_rx_buffer_size(), can you tweak the debug line
that prints mtu & rx_buffer_size to also include rx_headroom and the
total? That way we have everything needed to understand what is going
on buffer-size-wise. Something like:

-       netdev_dbg(bp->dev, "mtu [%u] rx_buffer_size [%zu]\n",
-                  bp->dev->mtu, bp->rx_buffer_size);
+       netdev_info(bp->dev, "mtu [%u] rx_buffer_size [%zu] rx_headroom [%zu] total [%u]\n",
+                   bp->dev->mtu, bp->rx_buffer_size, bp->rx_headroom,
+                   gem_total_rx_buffer_size(bp));

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



* Re: [PATCH RFC net-next v2 5/8] cadence: macb: add XDP support for gem
  2025-12-20 23:51 ` [PATCH RFC net-next v2 5/8] cadence: macb: add XDP support for gem Paolo Valerio
@ 2026-01-08 15:49   ` Théo Lebrun
  2026-01-12 14:17     ` Paolo Valerio
  0 siblings, 1 reply; 21+ messages in thread
From: Théo Lebrun @ 2026-01-08 15:49 UTC (permalink / raw)
  To: Paolo Valerio, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

Hello Paolo, netdev,

On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
> Introduce basic XDP support for macb/gem with the XDP_PASS,
> XDP_DROP, XDP_REDIRECT verdict support.
>
> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
> ---
>  drivers/net/ethernet/cadence/macb.h      |   3 +
>  drivers/net/ethernet/cadence/macb_main.c | 184 ++++++++++++++++++++---
>  2 files changed, 169 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
> index 45c04157f153..815d50574267 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -16,6 +16,7 @@
>  #include <linux/workqueue.h>
>  #include <net/page_pool/helpers.h>
>  #include <net/xdp.h>
> +#include <linux/bpf_trace.h>

Shouldn't that land in macb_main.c? Required by trace_xdp_exception().

> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index 582ceb728124..f767eb2e272e 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -5,6 +5,7 @@
>   * Copyright (C) 2004-2006 Atmel Corporation
>   */
>  
> +#include <asm-generic/errno.h>

This is a mistake. For example, compiling for a MIPS target I get all
errno constants redefined. Judging by where it sits, it was probably
added by auto-import tooling.

If needed, to be replaced by
   #include <linux/errno.h>

⟩ git grep -h 'include.*errno' drivers/ | sort | uniq -c | sort -nr | head -n3
   1645 #include <linux/errno.h>
     19 #include <asm/errno.h>
      5 #include <linux/errno.h> /* For the -ENODEV/... values */

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



* Re: [PATCH RFC net-next v2 8/8] cadence: macb: introduce xmit support
  2025-12-20 23:51 ` [PATCH RFC net-next v2 8/8] cadence: macb: introduce xmit support Paolo Valerio
@ 2026-01-08 15:54   ` Théo Lebrun
  2026-01-12 14:17     ` Paolo Valerio
  0 siblings, 1 reply; 21+ messages in thread
From: Théo Lebrun @ 2026-01-08 15:54 UTC (permalink / raw)
  To: Paolo Valerio, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
> Add XDP_TX verdict support, introduce an ndo_xdp_xmit function for
> packet redirection, and update macb_tx_unmap() to handle both skbs and
> xdp frames. Advertise the NETDEV_XDP_ACT_NDO_XMIT capability and the
> ability to process XDP_TX verdicts.
>
> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
> ---
>  drivers/net/ethernet/cadence/macb_main.c | 166 +++++++++++++++++++++--
>  1 file changed, 158 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index cd29a80d1dbb..d8abfa45e22d 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> +static int
> +gem_xdp_xmit(struct net_device *dev, int num_frame,
> +	     struct xdp_frame **frames, u32 flags)
> +{

nit: a bit surprised by the first line break in the function header,
especially as it doesn't prevent splitting arguments across two lines.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



* Re: [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open
  2026-01-08 15:24   ` Théo Lebrun
@ 2026-01-12 14:14     ` Paolo Valerio
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2026-01-12 14:14 UTC (permalink / raw)
  To: Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

On 08 Jan 2026 at 04:24:32 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:

> Hello Paolo,
>
> Nothing major in this review. Mostly nits.
>

Thanks for the feedback!

> On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
>> From: Théo Lebrun <theo.lebrun@bootlin.com>
>>
>> mog_alloc_rx_buffers(), getting called at open, does not do rx buffer
>> alloc on GEM. The bulk of the work is done by gem_rx_refill() filling
>> up all slots with valid buffers.
>>
>> gem_rx_refill() is called at link up by
>> gem_init_rings() == bp->macbgem_ops.mog_init_rings().
>>
>> Move operation to macb_open(), mostly to allow it to fail early and
>> loudly rather than init the device with Rx mostly broken.
>>
>> About `bool fail_early`:
>>  - When called from macb_open(), ring init fails as soon as a queue
>>    cannot be refilled.
>>  - When called from macb_hresp_error_task(), we do our best to reinit
>>    the device: we still iterate over all queues and try refilling all
>>    even if a previous queue failed.
>
> About [PATCH 1/8], it conflicts with a patch that landed on v6.19-rc4:
> 99537d5c476c ("net: macb: Relocate mog_init_rings() callback from
> macb_mac_link_up() to macb_open()").
>
> I don't get a merge conflict but the
>    bp->macbgem_ops.mog_init_rings(bp);
> call must be dropped from macb_open() in [1/8]. It doesn't build anyway
> because that call passes a single argument.
>

sure, will do.

> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com



* Re: [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx
  2026-01-08 15:43   ` Théo Lebrun
@ 2026-01-12 14:16     ` Paolo Valerio
  2026-01-12 18:43       ` Paolo Valerio
  2026-01-13 10:43       ` Théo Lebrun
  0 siblings, 2 replies; 21+ messages in thread
From: Paolo Valerio @ 2026-01-12 14:16 UTC (permalink / raw)
  To: Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

On 08 Jan 2026 at 04:43:43 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:

> On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
>> Use the page pool allocator for the data buffers and enable skb recycling
>> support, instead of relying on netdev_alloc_skb allocating the entire skb
>> during the refill.
>>
>> The patch also add support for receiving network frames that span multiple
>> DMA descriptors in the Cadence MACB/GEM Ethernet driver.
>>
>> The patch removes the requirement that limited frame reception to
>> a single descriptor (RX_SOF && RX_EOF), also avoiding potential
>> contiguous multi-page allocation for large frames.
>>
>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>> ---
>>  drivers/net/ethernet/cadence/Kconfig     |   1 +
>>  drivers/net/ethernet/cadence/macb.h      |   5 +
>>  drivers/net/ethernet/cadence/macb_main.c | 345 +++++++++++++++--------
>>  3 files changed, 235 insertions(+), 116 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig
>> index 5b2a461dfd28..ae500f717433 100644
>> --- a/drivers/net/ethernet/cadence/Kconfig
>> +++ b/drivers/net/ethernet/cadence/Kconfig
>> @@ -25,6 +25,7 @@ config MACB
>>  	depends on PTP_1588_CLOCK_OPTIONAL
>>  	select PHYLINK
>>  	select CRC32
>> +	select PAGE_POOL
>>  	help
>>  	  The Cadence MACB ethernet interface is found on many Atmel AT32 and
>>  	  AT91 parts.  This driver also supports the Cadence GEM (Gigabit
>> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
>> index 3b184e9ac771..45c04157f153 100644
>> --- a/drivers/net/ethernet/cadence/macb.h
>> +++ b/drivers/net/ethernet/cadence/macb.h
>> @@ -14,6 +14,8 @@
>>  #include <linux/interrupt.h>
>>  #include <linux/phy/phy.h>
>>  #include <linux/workqueue.h>
>> +#include <net/page_pool/helpers.h>
>> +#include <net/xdp.h>
>
> nit: `#include <net/xdp.h>` is not needed yet.
>

ack

>>  
>>  #define MACB_GREGS_NBR 16
>>  #define MACB_GREGS_VERSION 2
>> @@ -1266,6 +1268,8 @@ struct macb_queue {
>>  	void			*rx_buffers;
>>  	struct napi_struct	napi_rx;
>>  	struct queue_stats stats;
>> +	struct page_pool	*page_pool;
>> +	struct sk_buff		*skb;
>>  };
>>  
>>  struct ethtool_rx_fs_item {
>> @@ -1289,6 +1293,7 @@ struct macb {
>>  	struct macb_dma_desc	*rx_ring_tieoff;
>>  	dma_addr_t		rx_ring_tieoff_dma;
>>  	size_t			rx_buffer_size;
>> +	size_t			rx_headroom;
>>  
>>  	unsigned int		rx_ring_size;
>>  	unsigned int		tx_ring_size;
>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
>> index b4e2444b2e95..9e1efc1f56d8 100644
>> --- a/drivers/net/ethernet/cadence/macb_main.c
>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>> @@ -1249,14 +1249,22 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
>>  	return packets;
>>  }
>>  
>> -static int gem_rx_refill(struct macb_queue *queue)
>> +static int gem_total_rx_buffer_size(struct macb *bp)
>> +{
>> +	return SKB_HEAD_ALIGN(bp->rx_buffer_size + bp->rx_headroom);
>> +}
>
> nit: something closer to a buffer size, either `unsigned int` or
> `size_t`, sounds better than an int return type.
>

will do

>> +
>> +static int gem_rx_refill(struct macb_queue *queue, bool napi)
>>  {
>>  	unsigned int		entry;
>> -	struct sk_buff		*skb;
>>  	dma_addr_t		paddr;
>> +	void			*data;
>>  	struct macb *bp = queue->bp;
>>  	struct macb_dma_desc *desc;
>> +	struct page *page;
>> +	gfp_t gfp_alloc;
>>  	int err = 0;
>> +	int offset;
>>  
>>  	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
>>  			bp->rx_ring_size) > 0) {
>> @@ -1268,25 +1276,20 @@ static int gem_rx_refill(struct macb_queue *queue)
>>  		desc = macb_rx_desc(queue, entry);
>>  
>>  		if (!queue->rx_buff[entry]) {
>> -			/* allocate sk_buff for this free entry in ring */
>> -			skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
>> -			if (unlikely(!skb)) {
>> +			gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
>> +			page = page_pool_alloc_frag(queue->page_pool, &offset,
>> +						    gem_total_rx_buffer_size(bp),
>> +						    gfp_alloc | __GFP_NOWARN);
>> +			if (!page) {
>>  				netdev_err(bp->dev,
>> -					   "Unable to allocate sk_buff\n");
>> +					   "Unable to allocate page\n");
>>  				err = -ENOMEM;
>>  				break;
>>  			}
>>  
>> -			/* now fill corresponding descriptor entry */
>> -			paddr = dma_map_single(&bp->pdev->dev, skb->data,
>> -					       bp->rx_buffer_size,
>> -					       DMA_FROM_DEVICE);
>> -			if (dma_mapping_error(&bp->pdev->dev, paddr)) {
>> -				dev_kfree_skb(skb);
>> -				break;
>> -			}
>> -
>> -			queue->rx_buff[entry] = skb;
>> +			paddr = page_pool_get_dma_addr(page) + XDP_PACKET_HEADROOM + offset;
>> +			data = page_address(page) + offset;
>> +			queue->rx_buff[entry] = data;
>>  
>>  			if (entry == bp->rx_ring_size - 1)
>>  				paddr |= MACB_BIT(RX_WRAP);
>> @@ -1296,20 +1299,6 @@ static int gem_rx_refill(struct macb_queue *queue)
>>  			 */
>>  			dma_wmb();
>>  			macb_set_addr(bp, desc, paddr);
>> -
>> -			/* Properly align Ethernet header.
>> -			 *
>> -			 * Hardware can add dummy bytes if asked using the RBOF
>> -			 * field inside the NCFGR register. That feature isn't
>> -			 * available if hardware is RSC capable.
>> -			 *
>> -			 * We cannot fallback to doing the 2-byte shift before
>> -			 * DMA mapping because the address field does not allow
>> -			 * setting the low 2/3 bits.
>> -			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
>> -			 */
>> -			if (!(bp->caps & MACB_CAPS_RSC))
>> -				skb_reserve(skb, NET_IP_ALIGN);
>>  		} else {
>>  			desc->ctrl = 0;
>>  			dma_wmb();
>> @@ -1353,14 +1342,19 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>>  	struct macb *bp = queue->bp;
>>  	unsigned int		len;
>>  	unsigned int		entry;
>> -	struct sk_buff		*skb;
>>  	struct macb_dma_desc	*desc;
>> +	int			data_len;
>>  	int			count = 0;
>> +	void			*buff_head;
>> +	struct skb_shared_info	*shinfo;
>> +	struct page		*page;
>> +	int			nr_frags;
>
> nit: you add 5 new stack variables, maybe you could apply reverse xmas
> tree while at it. You do it for the loop body in [5/8].
>

sure

>> +
>>  
>>  	while (count < budget) {
>>  		u32 ctrl;
>>  		dma_addr_t addr;
>> -		bool rxused;
>> +		bool rxused, first_frame;
>>  
>>  		entry = macb_rx_ring_wrap(bp, queue->rx_tail);
>>  		desc = macb_rx_desc(queue, entry);
>> @@ -1374,6 +1368,12 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>>  		if (!rxused)
>>  			break;
>>  
>> +		if (!(bp->caps & MACB_CAPS_RSC))
>> +			addr += NET_IP_ALIGN;
>> +
>> +		dma_sync_single_for_cpu(&bp->pdev->dev,
>> +					addr, bp->rx_buffer_size,
>> +					page_pool_get_dma_dir(queue->page_pool));
>>  		/* Ensure ctrl is at least as up-to-date as rxused */
>>  		dma_rmb();
>>  
>> @@ -1382,58 +1382,118 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>>  		queue->rx_tail++;
>>  		count++;
>>  
>> -		if (!(ctrl & MACB_BIT(RX_SOF) && ctrl & MACB_BIT(RX_EOF))) {
>> -			netdev_err(bp->dev,
>> -				   "not whole frame pointed by descriptor\n");
>> -			bp->dev->stats.rx_dropped++;
>> -			queue->stats.rx_dropped++;
>> -			break;
>> -		}
>> -		skb = queue->rx_buff[entry];
>> -		if (unlikely(!skb)) {
>> +		buff_head = queue->rx_buff[entry];
>> +		if (unlikely(!buff_head)) {
>>  			netdev_err(bp->dev,
>>  				   "inconsistent Rx descriptor chain\n");
>>  			bp->dev->stats.rx_dropped++;
>>  			queue->stats.rx_dropped++;
>>  			break;
>>  		}
>> -		/* now everything is ready for receiving packet */
>> -		queue->rx_buff[entry] = NULL;
>> +
>> +		first_frame = ctrl & MACB_BIT(RX_SOF);
>>  		len = ctrl & bp->rx_frm_len_mask;
>>  
>> -		netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
>> +		if (len) {
>> +			data_len = len;
>> +			if (!first_frame)
>> +				data_len -= queue->skb->len;
>> +		} else {
>> +			data_len = bp->rx_buffer_size;
>> +		}
>
> Why deal with the `!len` case? How can it occur? User guide doesn't hint
> that. It would mean we would grab uninitialised bytes as we assume len
> is the max buffer size.
>

Good point. After taking a second look, !len may not be the most reliable
way to check this.
From the datasheet, status signals are only valid (with some exceptions)
when MACB_BIT(RX_EOF) is set. As a side effect, len is always zero on my
hw for frames without the EOF bit, but it's probably better to just rely
on MACB_BIT(RX_EOF) instead of reading something that may end up being
unreliable.

>> +
>> +		if (first_frame) {
>> +			queue->skb = napi_build_skb(buff_head, gem_total_rx_buffer_size(bp));
>> +			if (unlikely(!queue->skb)) {
>> +				netdev_err(bp->dev,
>> +					   "Unable to allocate sk_buff\n");
>> +				goto free_frags;
>> +			}
>> +
>> +			/* Properly align Ethernet header.
>> +			 *
>> +			 * Hardware can add dummy bytes if asked using the RBOF
>> +			 * field inside the NCFGR register. That feature isn't
>> +			 * available if hardware is RSC capable.
>> +			 *
>> +			 * We cannot fallback to doing the 2-byte shift before
>> +			 * DMA mapping because the address field does not allow
>> +			 * setting the low 2/3 bits.
>> +			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
>> +			 */
>> +			skb_reserve(queue->skb, bp->rx_headroom);
>> +			skb_mark_for_recycle(queue->skb);
>> +			skb_put(queue->skb, data_len);
>> +			queue->skb->protocol = eth_type_trans(queue->skb, bp->dev);
>> +
>> +			skb_checksum_none_assert(queue->skb);
>> +			if (bp->dev->features & NETIF_F_RXCSUM &&
>> +			    !(bp->dev->flags & IFF_PROMISC) &&
>> +			    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
>> +				queue->skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +		} else {
>> +			if (!queue->skb) {
>> +				netdev_err(bp->dev,
>> +					   "Received non-starting frame while expecting it\n");
>> +				goto free_frags;
>> +			}
>> +
>> +			shinfo = skb_shinfo(queue->skb);
>> +			page = virt_to_head_page(buff_head);
>> +			nr_frags = shinfo->nr_frags;
>> +
>> +			if (nr_frags >= ARRAY_SIZE(shinfo->frags))
>> +				goto free_frags;
>>  
>> -		skb_put(skb, len);
>> -		dma_unmap_single(&bp->pdev->dev, addr,
>> -				 bp->rx_buffer_size, DMA_FROM_DEVICE);
>> +			skb_add_rx_frag(queue->skb, nr_frags, page,
>> +					buff_head - page_address(page) + bp->rx_headroom,
>> +					data_len, gem_total_rx_buffer_size(bp));
>> +		}
>> +
>> +		/* now everything is ready for receiving packet */
>> +		queue->rx_buff[entry] = NULL;
>>  
>> -		skb->protocol = eth_type_trans(skb, bp->dev);
>> -		skb_checksum_none_assert(skb);
>> -		if (bp->dev->features & NETIF_F_RXCSUM &&
>> -		    !(bp->dev->flags & IFF_PROMISC) &&
>> -		    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
>> -			skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +		netdev_vdbg(bp->dev, "%s %u (len %u)\n", __func__, entry, data_len);
>>  
>> -		bp->dev->stats.rx_packets++;
>> -		queue->stats.rx_packets++;
>> -		bp->dev->stats.rx_bytes += skb->len;
>> -		queue->stats.rx_bytes += skb->len;
>> +		if (ctrl & MACB_BIT(RX_EOF)) {
>> +			bp->dev->stats.rx_packets++;
>> +			queue->stats.rx_packets++;
>> +			bp->dev->stats.rx_bytes += queue->skb->len;
>> +			queue->stats.rx_bytes += queue->skb->len;
>>  
>> -		gem_ptp_do_rxstamp(bp, skb, desc);
>> +			gem_ptp_do_rxstamp(bp, queue->skb, desc);
>>  
>>  #if defined(DEBUG) && defined(VERBOSE_DEBUG)
>> -		netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
>> -			    skb->len, skb->csum);
>> -		print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
>> -			       skb_mac_header(skb), 16, true);
>> -		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_ADDRESS, 16, 1,
>> -			       skb->data, 32, true);
>> +			netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
>> +				    queue->skb->len, queue->skb->csum);
>> +			print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
>> +				       skb_mac_header(queue->skb), 16, true);
>> +			print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_ADDRESS, 16, 1,
>> +				       queue->skb->data, 32, true);
>>  #endif
>
> nit: while you are at it, maybe replace with print_hex_dump_debug()?
>

sure

>>  
>> -		napi_gro_receive(napi, skb);
>> +			napi_gro_receive(napi, queue->skb);
>> +			queue->skb = NULL;
>> +		}
>> +
>> +		continue;
>> +
>> +free_frags:
>> +		if (queue->skb) {
>> +			dev_kfree_skb(queue->skb);
>> +			queue->skb = NULL;
>> +		} else {
>> +			page_pool_put_full_page(queue->page_pool,
>> +						virt_to_head_page(buff_head),
>> +						false);
>> +		}
>> +
>> +		bp->dev->stats.rx_dropped++;
>> +		queue->stats.rx_dropped++;
>> +		queue->rx_buff[entry] = NULL;
>>  	}
>>  
>> -	gem_rx_refill(queue);
>> +	gem_rx_refill(queue, true);
>>  
>>  	return count;
>>  }
>> @@ -2367,12 +2427,25 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
>>  	return ret;
>>  }
>>  
>> -static void macb_init_rx_buffer_size(struct macb *bp, size_t size)
>> +static void macb_init_rx_buffer_size(struct macb *bp, unsigned int mtu)
>>  {
>> +	int overhead;
>
> nit: Maybe `unsigned int` or `size_t` rather than `int`?
>

ack

>> +	size_t size;
>> +
>>  	if (!macb_is_gem(bp)) {
>>  		bp->rx_buffer_size = MACB_RX_BUFFER_SIZE;
>>  	} else {
>> -		bp->rx_buffer_size = size;
>> +		size = mtu + ETH_HLEN + ETH_FCS_LEN;
>> +		if (!(bp->caps & MACB_CAPS_RSC))
>> +			size += NET_IP_ALIGN;
>
> NET_IP_ALIGN looks like it is accounted for twice, once in
> bp->rx_headroom and once in bp->rx_buffer_size. This gets fixed in
> [5/8] where gem_max_rx_data_size() gets introduced.
>

ah, right

>> +
>> +		bp->rx_buffer_size = SKB_DATA_ALIGN(size);
>> +		if (gem_total_rx_buffer_size(bp) > PAGE_SIZE) {
>> +			overhead = bp->rx_headroom +
>> +				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>> +			bp->rx_buffer_size = rounddown(PAGE_SIZE - overhead,
>> +						       RX_BUFFER_MULTIPLE);
>> +		}
>
> I've seen your comment in [0/8]. Do you have any advice on how to test
> this clamping? All I can think of is to either configure a massive MTU
> or, more easily, cheat with the headroom.
>

I normally test the set with a 4k PAGE_SIZE and, as you said, by setting
the MTU to something bigger than that. This is still possible with 8k pages
(given .jumbo_max_len = 10240).


> Also, should we warn? It means MTU-sized packets will be received in
> fragments. It will work but is probably unexpected by users and a
> slowdown reason that users might want to know about.
>

I'm not sure about the warning as I don't see this as a user-level detail.
For debugging purposes, I think the last printout should be enough (even
better once extended with your suggestion). Of course, feel free to disagree.

> --
>
> nit: while in macb_init_rx_buffer_size(), can you tweak the debug line
> from mtu & rx_buffer_size to also have rx_headroom and total? So that
> we have everything available to understand what is going on buffer size
> wise. Something like:
>
> -       netdev_dbg(bp->dev, "mtu [%u] rx_buffer_size [%zu]\n",
> -                  bp->dev->mtu, bp->rx_buffer_size);
> +       netdev_info(bp->dev, "mtu [%u] rx_buffer_size [%zu] rx_headroom [%zu] total [%u]\n",
> +                   bp->dev->mtu, bp->rx_buffer_size, bp->rx_headroom,
> +                   gem_total_rx_buffer_size(bp));
>
> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC net-next v2 5/8] cadence: macb: add XDP support for gem
  2026-01-08 15:49   ` Théo Lebrun
@ 2026-01-12 14:17     ` Paolo Valerio
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2026-01-12 14:17 UTC (permalink / raw)
  To: Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

On 08 Jan 2026 at 04:49:45 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:

> Hello Paolo, netdev,
>
> On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
>> Introduce basic XDP support for macb/gem with the XDP_PASS,
>> XDP_DROP, XDP_REDIRECT verdict support.
>>
>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>> ---
>>  drivers/net/ethernet/cadence/macb.h      |   3 +
>>  drivers/net/ethernet/cadence/macb_main.c | 184 ++++++++++++++++++++---
>>  2 files changed, 169 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
>> index 45c04157f153..815d50574267 100644
>> --- a/drivers/net/ethernet/cadence/macb.h
>> +++ b/drivers/net/ethernet/cadence/macb.h
>> @@ -16,6 +16,7 @@
>>  #include <linux/workqueue.h>
>>  #include <net/page_pool/helpers.h>
>>  #include <net/xdp.h>
>> +#include <linux/bpf_trace.h>
>
> Shouldn't that land in macb_main.c? Required by trace_xdp_exception().
>

that's right, no need to stay here

>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
>> index 582ceb728124..f767eb2e272e 100644
>> --- a/drivers/net/ethernet/cadence/macb_main.c
>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>> @@ -5,6 +5,7 @@
>>   * Copyright (C) 2004-2006 Atmel Corporation
>>   */
>>  
>> +#include <asm-generic/errno.h>
>
> This is a mistake. For example compiling for a MIPS target I get all
> errno constants redefined. Seeing where it was added it might have been
> added by auto-import tooling.
>

yeah, it doesn't look right. No extra include is needed here.
You're right about the auto-import tooling. I just noticed I wrote
this with an lsp-mode/clangd setup that still had the automatic
header-insertion option enabled by default.

> If needed, to be replaced by
>    #include <linux/errno.h>
>
> ⟩ git grep -h 'include.*errno' drivers/ | sort | uniq -c | sort -nr | head -n3
>    1645 #include <linux/errno.h>
>      19 #include <asm/errno.h>
>       5 #include <linux/errno.h> /* For the -ENODEV/... values */
>
> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC net-next v2 8/8] cadence: macb: introduce xmit support
  2026-01-08 15:54   ` Théo Lebrun
@ 2026-01-12 14:17     ` Paolo Valerio
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2026-01-12 14:17 UTC (permalink / raw)
  To: Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

On 08 Jan 2026 at 04:54:14 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:

> On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
>> Add XDP_TX verdict support, also introduce ndo_xdp_xmit function for
>> redirection, and update macb_tx_unmap() to handle both skbs and xdp
>> frames advertising NETDEV_XDP_ACT_NDO_XMIT capability and the ability
>> to process XDP_TX verdicts.
>>
>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>> ---
>>  drivers/net/ethernet/cadence/macb_main.c | 166 +++++++++++++++++++++--
>>  1 file changed, 158 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
>> index cd29a80d1dbb..d8abfa45e22d 100644
>> --- a/drivers/net/ethernet/cadence/macb_main.c
>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>> +static int
>> +gem_xdp_xmit(struct net_device *dev, int num_frame,
>> +	     struct xdp_frame **frames, u32 flags)
>> +{
>
> nit: a bit surprised by the first line break in the function header.
> Especially as it doesn't prevent splitting arguments across two lines.
>

Will fix it

> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx
  2026-01-12 14:16     ` Paolo Valerio
@ 2026-01-12 18:43       ` Paolo Valerio
  2026-01-13 10:35         ` Théo Lebrun
  2026-01-13 10:43       ` Théo Lebrun
  1 sibling, 1 reply; 21+ messages in thread
From: Paolo Valerio @ 2026-01-12 18:43 UTC (permalink / raw)
  To: Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Théo Lebrun

On 12 Jan 2026 at 03:16:24 PM, Paolo Valerio <pvalerio@redhat.com> wrote:

> On 08 Jan 2026 at 04:43:43 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:
>
>> On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
>>> Use the page pool allocator for the data buffers and enable skb recycling
>>> support, instead of relying on netdev_alloc_skb allocating the entire skb
>>> during the refill.
>>>
>>> The patch also adds support for receiving network frames that span multiple
>>> DMA descriptors in the Cadence MACB/GEM Ethernet driver.
>>>
>>> The patch removes the requirement that limited frame reception to
>>> a single descriptor (RX_SOF && RX_EOF), also avoiding potential
>>> contiguous multi-page allocation for large frames.
>>>
>>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>>> ---
>>>  drivers/net/ethernet/cadence/Kconfig     |   1 +
>>>  drivers/net/ethernet/cadence/macb.h      |   5 +
>>>  drivers/net/ethernet/cadence/macb_main.c | 345 +++++++++++++++--------
>>>  3 files changed, 235 insertions(+), 116 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig
>>> index 5b2a461dfd28..ae500f717433 100644
>>> --- a/drivers/net/ethernet/cadence/Kconfig
>>> +++ b/drivers/net/ethernet/cadence/Kconfig
>>> @@ -25,6 +25,7 @@ config MACB
>>>  	depends on PTP_1588_CLOCK_OPTIONAL
>>>  	select PHYLINK
>>>  	select CRC32
>>> +	select PAGE_POOL
>>>  	help
>>>  	  The Cadence MACB ethernet interface is found on many Atmel AT32 and
>>>  	  AT91 parts.  This driver also supports the Cadence GEM (Gigabit
>>> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
>>> index 3b184e9ac771..45c04157f153 100644
>>> --- a/drivers/net/ethernet/cadence/macb.h
>>> +++ b/drivers/net/ethernet/cadence/macb.h
>>> @@ -14,6 +14,8 @@
>>>  #include <linux/interrupt.h>
>>>  #include <linux/phy/phy.h>
>>>  #include <linux/workqueue.h>
>>> +#include <net/page_pool/helpers.h>
>>> +#include <net/xdp.h>
>>
>> nit: `#include <net/xdp.h>` is not needed yet.
>>
>
> ack
>
>>>  
>>>  #define MACB_GREGS_NBR 16
>>>  #define MACB_GREGS_VERSION 2
>>> @@ -1266,6 +1268,8 @@ struct macb_queue {
>>>  	void			*rx_buffers;
>>>  	struct napi_struct	napi_rx;
>>>  	struct queue_stats stats;
>>> +	struct page_pool	*page_pool;
>>> +	struct sk_buff		*skb;
>>>  };
>>>  
>>>  struct ethtool_rx_fs_item {
>>> @@ -1289,6 +1293,7 @@ struct macb {
>>>  	struct macb_dma_desc	*rx_ring_tieoff;
>>>  	dma_addr_t		rx_ring_tieoff_dma;
>>>  	size_t			rx_buffer_size;
>>> +	size_t			rx_headroom;
>>>  
>>>  	unsigned int		rx_ring_size;
>>>  	unsigned int		tx_ring_size;
>>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
>>> index b4e2444b2e95..9e1efc1f56d8 100644
>>> --- a/drivers/net/ethernet/cadence/macb_main.c
>>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>>> @@ -1249,14 +1249,22 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
>>>  	return packets;
>>>  }
>>>  
>>> -static int gem_rx_refill(struct macb_queue *queue)
>>> +static int gem_total_rx_buffer_size(struct macb *bp)
>>> +{
>>> +	return SKB_HEAD_ALIGN(bp->rx_buffer_size + bp->rx_headroom);
>>> +}
>>
>> nit: something closer to a buffer size, either `unsigned int` or
>> `size_t`, sounds better than an int return type.
>>
>
> will do
>
>>> +
>>> +static int gem_rx_refill(struct macb_queue *queue, bool napi)
>>>  {
>>>  	unsigned int		entry;
>>> -	struct sk_buff		*skb;
>>>  	dma_addr_t		paddr;
>>> +	void			*data;
>>>  	struct macb *bp = queue->bp;
>>>  	struct macb_dma_desc *desc;
>>> +	struct page *page;
>>> +	gfp_t gfp_alloc;
>>>  	int err = 0;
>>> +	int offset;
>>>  
>>>  	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
>>>  			bp->rx_ring_size) > 0) {
>>> @@ -1268,25 +1276,20 @@ static int gem_rx_refill(struct macb_queue *queue)
>>>  		desc = macb_rx_desc(queue, entry);
>>>  
>>>  		if (!queue->rx_buff[entry]) {
>>> -			/* allocate sk_buff for this free entry in ring */
>>> -			skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
>>> -			if (unlikely(!skb)) {
>>> +			gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
>>> +			page = page_pool_alloc_frag(queue->page_pool, &offset,
>>> +						    gem_total_rx_buffer_size(bp),
>>> +						    gfp_alloc | __GFP_NOWARN);
>>> +			if (!page) {
>>>  				netdev_err(bp->dev,
>>> -					   "Unable to allocate sk_buff\n");
>>> +					   "Unable to allocate page\n");
>>>  				err = -ENOMEM;
>>>  				break;
>>>  			}
>>>  
>>> -			/* now fill corresponding descriptor entry */
>>> -			paddr = dma_map_single(&bp->pdev->dev, skb->data,
>>> -					       bp->rx_buffer_size,
>>> -					       DMA_FROM_DEVICE);
>>> -			if (dma_mapping_error(&bp->pdev->dev, paddr)) {
>>> -				dev_kfree_skb(skb);
>>> -				break;
>>> -			}
>>> -
>>> -			queue->rx_buff[entry] = skb;
>>> +			paddr = page_pool_get_dma_addr(page) + XDP_PACKET_HEADROOM + offset;
>>> +			data = page_address(page) + offset;
>>> +			queue->rx_buff[entry] = data;
>>>  
>>>  			if (entry == bp->rx_ring_size - 1)
>>>  				paddr |= MACB_BIT(RX_WRAP);
>>> @@ -1296,20 +1299,6 @@ static int gem_rx_refill(struct macb_queue *queue)
>>>  			 */
>>>  			dma_wmb();
>>>  			macb_set_addr(bp, desc, paddr);
>>> -
>>> -			/* Properly align Ethernet header.
>>> -			 *
>>> -			 * Hardware can add dummy bytes if asked using the RBOF
>>> -			 * field inside the NCFGR register. That feature isn't
>>> -			 * available if hardware is RSC capable.
>>> -			 *
>>> -			 * We cannot fallback to doing the 2-byte shift before
>>> -			 * DMA mapping because the address field does not allow
>>> -			 * setting the low 2/3 bits.
>>> -			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
>>> -			 */
>>> -			if (!(bp->caps & MACB_CAPS_RSC))
>>> -				skb_reserve(skb, NET_IP_ALIGN);
>>>  		} else {
>>>  			desc->ctrl = 0;
>>>  			dma_wmb();
>>> @@ -1353,14 +1342,19 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>>>  	struct macb *bp = queue->bp;
>>>  	unsigned int		len;
>>>  	unsigned int		entry;
>>> -	struct sk_buff		*skb;
>>>  	struct macb_dma_desc	*desc;
>>> +	int			data_len;
>>>  	int			count = 0;
>>> +	void			*buff_head;
>>> +	struct skb_shared_info	*shinfo;
>>> +	struct page		*page;
>>> +	int			nr_frags;
>>
>> nit: you add 5 new stack variables, maybe you could apply reverse xmas
>> tree while at it. You do it for the loop body in [5/8].
>>
>
> sure
>
>>> +
>>>  
>>>  	while (count < budget) {
>>>  		u32 ctrl;
>>>  		dma_addr_t addr;
>>> -		bool rxused;
>>> +		bool rxused, first_frame;
>>>  
>>>  		entry = macb_rx_ring_wrap(bp, queue->rx_tail);
>>>  		desc = macb_rx_desc(queue, entry);
>>> @@ -1374,6 +1368,12 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>>>  		if (!rxused)
>>>  			break;
>>>  
>>> +		if (!(bp->caps & MACB_CAPS_RSC))
>>> +			addr += NET_IP_ALIGN;
>>> +
>>> +		dma_sync_single_for_cpu(&bp->pdev->dev,
>>> +					addr, bp->rx_buffer_size,
>>> +					page_pool_get_dma_dir(queue->page_pool));
>>>  		/* Ensure ctrl is at least as up-to-date as rxused */
>>>  		dma_rmb();
>>>  
>>> @@ -1382,58 +1382,118 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>>>  		queue->rx_tail++;
>>>  		count++;
>>>  
>>> -		if (!(ctrl & MACB_BIT(RX_SOF) && ctrl & MACB_BIT(RX_EOF))) {
>>> -			netdev_err(bp->dev,
>>> -				   "not whole frame pointed by descriptor\n");
>>> -			bp->dev->stats.rx_dropped++;
>>> -			queue->stats.rx_dropped++;
>>> -			break;
>>> -		}
>>> -		skb = queue->rx_buff[entry];
>>> -		if (unlikely(!skb)) {
>>> +		buff_head = queue->rx_buff[entry];
>>> +		if (unlikely(!buff_head)) {
>>>  			netdev_err(bp->dev,
>>>  				   "inconsistent Rx descriptor chain\n");
>>>  			bp->dev->stats.rx_dropped++;
>>>  			queue->stats.rx_dropped++;
>>>  			break;
>>>  		}
>>> -		/* now everything is ready for receiving packet */
>>> -		queue->rx_buff[entry] = NULL;
>>> +
>>> +		first_frame = ctrl & MACB_BIT(RX_SOF);
>>>  		len = ctrl & bp->rx_frm_len_mask;
>>>  
>>> -		netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
>>> +		if (len) {
>>> +			data_len = len;
>>> +			if (!first_frame)
>>> +				data_len -= queue->skb->len;
>>> +		} else {
>>> +			data_len = bp->rx_buffer_size;
>>> +		}
>>
>> Why deal with the `!len` case? How can it occur? User guide doesn't hint
>> that. It would mean we would grab uninitialised bytes as we assume len
>> is the max buffer size.
>>
>
> Good point. After taking a second look, !len may not be the most reliable
> way to check this.
> From the datasheet, status signals are only valid (with some exceptions)
> when MACB_BIT(RX_EOF) is set. As a side effect, len is always zero on my
> hw for frames without the EOF bit, but it's probably better to just rely
> on MACB_BIT(RX_EOF) instead of reading something that may end up being
> unreliable.
>
>>> +
>>> +		if (first_frame) {
>>> +			queue->skb = napi_build_skb(buff_head, gem_total_rx_buffer_size(bp));
>>> +			if (unlikely(!queue->skb)) {
>>> +				netdev_err(bp->dev,
>>> +					   "Unable to allocate sk_buff\n");
>>> +				goto free_frags;
>>> +			}
>>> +
>>> +			/* Properly align Ethernet header.
>>> +			 *
>>> +			 * Hardware can add dummy bytes if asked using the RBOF
>>> +			 * field inside the NCFGR register. That feature isn't
>>> +			 * available if hardware is RSC capable.
>>> +			 *
>>> +			 * We cannot fallback to doing the 2-byte shift before
>>> +			 * DMA mapping because the address field does not allow
>>> +			 * setting the low 2/3 bits.
>>> +			 * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
>>> +			 */
>>> +			skb_reserve(queue->skb, bp->rx_headroom);
>>> +			skb_mark_for_recycle(queue->skb);
>>> +			skb_put(queue->skb, data_len);
>>> +			queue->skb->protocol = eth_type_trans(queue->skb, bp->dev);
>>> +
>>> +			skb_checksum_none_assert(queue->skb);
>>> +			if (bp->dev->features & NETIF_F_RXCSUM &&
>>> +			    !(bp->dev->flags & IFF_PROMISC) &&
>>> +			    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
>>> +				queue->skb->ip_summed = CHECKSUM_UNNECESSARY;
>>> +		} else {
>>> +			if (!queue->skb) {
>>> +				netdev_err(bp->dev,
>>> +					   "Received non-starting frame while expecting it\n");
>>> +				goto free_frags;
>>> +			}
>>> +
>>> +			shinfo = skb_shinfo(queue->skb);
>>> +			page = virt_to_head_page(buff_head);
>>> +			nr_frags = shinfo->nr_frags;
>>> +
>>> +			if (nr_frags >= ARRAY_SIZE(shinfo->frags))
>>> +				goto free_frags;
>>>  
>>> -		skb_put(skb, len);
>>> -		dma_unmap_single(&bp->pdev->dev, addr,
>>> -				 bp->rx_buffer_size, DMA_FROM_DEVICE);
>>> +			skb_add_rx_frag(queue->skb, nr_frags, page,
>>> +					buff_head - page_address(page) + bp->rx_headroom,
>>> +					data_len, gem_total_rx_buffer_size(bp));
>>> +		}
>>> +
>>> +		/* now everything is ready for receiving packet */
>>> +		queue->rx_buff[entry] = NULL;
>>>  
>>> -		skb->protocol = eth_type_trans(skb, bp->dev);
>>> -		skb_checksum_none_assert(skb);
>>> -		if (bp->dev->features & NETIF_F_RXCSUM &&
>>> -		    !(bp->dev->flags & IFF_PROMISC) &&
>>> -		    GEM_BFEXT(RX_CSUM, ctrl) & GEM_RX_CSUM_CHECKED_MASK)
>>> -			skb->ip_summed = CHECKSUM_UNNECESSARY;
>>> +		netdev_vdbg(bp->dev, "%s %u (len %u)\n", __func__, entry, data_len);
>>>  
>>> -		bp->dev->stats.rx_packets++;
>>> -		queue->stats.rx_packets++;
>>> -		bp->dev->stats.rx_bytes += skb->len;
>>> -		queue->stats.rx_bytes += skb->len;
>>> +		if (ctrl & MACB_BIT(RX_EOF)) {
>>> +			bp->dev->stats.rx_packets++;
>>> +			queue->stats.rx_packets++;
>>> +			bp->dev->stats.rx_bytes += queue->skb->len;
>>> +			queue->stats.rx_bytes += queue->skb->len;
>>>  
>>> -		gem_ptp_do_rxstamp(bp, skb, desc);
>>> +			gem_ptp_do_rxstamp(bp, queue->skb, desc);
>>>  
>>>  #if defined(DEBUG) && defined(VERBOSE_DEBUG)
>>> -		netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
>>> -			    skb->len, skb->csum);
>>> -		print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
>>> -			       skb_mac_header(skb), 16, true);
>>> -		print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_ADDRESS, 16, 1,
>>> -			       skb->data, 32, true);
>>> +			netdev_vdbg(bp->dev, "received skb of length %u, csum: %08x\n",
>>> +				    queue->skb->len, queue->skb->csum);
>>> +			print_hex_dump(KERN_DEBUG, " mac: ", DUMP_PREFIX_ADDRESS, 16, 1,
>>> +				       skb_mac_header(queue->skb), 16, true);
>>> +			print_hex_dump(KERN_DEBUG, "data: ", DUMP_PREFIX_ADDRESS, 16, 1,
>>> +				       queue->skb->data, 32, true);
>>>  #endif
>>
>> nit: while you are at it, maybe replace with print_hex_dump_debug()?
>>
>
> sure
>
>>>  
>>> -		napi_gro_receive(napi, skb);
>>> +			napi_gro_receive(napi, queue->skb);
>>> +			queue->skb = NULL;
>>> +		}
>>> +
>>> +		continue;
>>> +
>>> +free_frags:
>>> +		if (queue->skb) {
>>> +			dev_kfree_skb(queue->skb);
>>> +			queue->skb = NULL;
>>> +		} else {
>>> +			page_pool_put_full_page(queue->page_pool,
>>> +						virt_to_head_page(buff_head),
>>> +						false);
>>> +		}
>>> +
>>> +		bp->dev->stats.rx_dropped++;
>>> +		queue->stats.rx_dropped++;
>>> +		queue->rx_buff[entry] = NULL;
>>>  	}
>>>  
>>> -	gem_rx_refill(queue);
>>> +	gem_rx_refill(queue, true);
>>>  
>>>  	return count;
>>>  }
>>> @@ -2367,12 +2427,25 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
>>>  	return ret;
>>>  }
>>>  
>>> -static void macb_init_rx_buffer_size(struct macb *bp, size_t size)
>>> +static void macb_init_rx_buffer_size(struct macb *bp, unsigned int mtu)
>>>  {
>>> +	int overhead;
>>
>> nit: Maybe `unsigned int` or `size_t` rather than `int`?
>>
>
> ack
>
>>> +	size_t size;
>>> +
>>>  	if (!macb_is_gem(bp)) {
>>>  		bp->rx_buffer_size = MACB_RX_BUFFER_SIZE;
>>>  	} else {
>>> -		bp->rx_buffer_size = size;
>>> +		size = mtu + ETH_HLEN + ETH_FCS_LEN;
>>> +		if (!(bp->caps & MACB_CAPS_RSC))
>>> +			size += NET_IP_ALIGN;
>>
>> NET_IP_ALIGN looks like it is accounted for twice, once in
>> bp->rx_headroom and once in bp->rx_buffer_size. This gets fixed in
>> [5/8] where gem_max_rx_data_size() gets introduced.
>>
>
> ah, right
>
>>> +
>>> +		bp->rx_buffer_size = SKB_DATA_ALIGN(size);
>>> +		if (gem_total_rx_buffer_size(bp) > PAGE_SIZE) {
>>> +			overhead = bp->rx_headroom +
>>> +				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>>> +			bp->rx_buffer_size = rounddown(PAGE_SIZE - overhead,
>>> +						       RX_BUFFER_MULTIPLE);
>>> +		}
>>
>> I've seen your comment in [0/8]. Do you have any advice on how to test
>> this clamping? All I can think of is to either configure a massive MTU
>> or, more easily, cheat with the headroom.
>>
>
> I normally test the set with 4k PAGE_SIZE and, as you said, setting the
> mtu to something bigger than that. This is still possible with 8k pages
> (given .jumbo_max_len = 10240).
>
>
>> Also, should we warn? It means MTU-sized packets will be received in
>> fragments. It will work but is probably unexpected by users and a
>> slowdown reason that users might want to know about.
>>
>
> I'm not sure about the warning as I don't see this as a user level detail.
> For debugging purposes, I guess the last printout should be fine (even
> better once extended with your suggestion). Of course, feel free to disagree.
>
>> --
>>
>> nit: while in macb_init_rx_buffer_size(), can you tweak the debug line
>> from mtu & rx_buffer_size to also have rx_headroom and total? So that
>> we have everything available to understand what is going on buffer size
>> wise. Something like:
>>
>> -       netdev_dbg(bp->dev, "mtu [%u] rx_buffer_size [%zu]\n",
>> -                  bp->dev->mtu, bp->rx_buffer_size);
>> +       netdev_info(bp->dev, "mtu [%u] rx_buffer_size [%zu] rx_headroom [%zu] total [%u]\n",
>> +                   bp->dev->mtu, bp->rx_buffer_size, bp->rx_headroom,
>> +                   gem_total_rx_buffer_size(bp));
>>

I missed this before:
I assume so, but just checking, is the promotion from dbg to info also
wanted?

>> Thanks,
>>
>> --
>> Théo Lebrun, Bootlin
>> Embedded Linux and Kernel engineering
>> https://bootlin.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx
  2026-01-12 18:43       ` Paolo Valerio
@ 2026-01-13 10:35         ` Théo Lebrun
  2026-01-13 19:30           ` Paolo Valerio
  0 siblings, 1 reply; 21+ messages in thread
From: Théo Lebrun @ 2026-01-13 10:35 UTC (permalink / raw)
  To: Paolo Valerio, Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Gregory Clement, Thomas Petazzoni

On Mon Jan 12, 2026 at 7:43 PM CET, Paolo Valerio wrote:
> On 12 Jan 2026 at 03:16:24 PM, Paolo Valerio <pvalerio@redhat.com> wrote:
>> On 08 Jan 2026 at 04:43:43 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:
>>> nit: while in macb_init_rx_buffer_size(), can you tweak the debug line
>>> from mtu & rx_buffer_size to also have rx_headroom and total? So that
>>> we have everything available to understand what is going on buffer size
>>> wise. Something like:
>>>
>>> -       netdev_dbg(bp->dev, "mtu [%u] rx_buffer_size [%zu]\n",
>>> -                  bp->dev->mtu, bp->rx_buffer_size);
>>> +       netdev_info(bp->dev, "mtu [%u] rx_buffer_size [%zu] rx_headroom [%zu] total [%u]\n",
>>> +                   bp->dev->mtu, bp->rx_buffer_size, bp->rx_headroom,
>>> +                   gem_total_rx_buffer_size(bp));
>
> I missed this before:
> I assume so, but just checking, is the promotion from dbg to info also
> wanted?

Ah no it was a mistake. I was lazy during my testing: rather than
`#define DEBUG` I changed netdev_dbg() to netdev_info().

I wouldn't mind, but that isn't the usual kernel policy wrt logs.
A working driver should be silent.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx
  2026-01-12 14:16     ` Paolo Valerio
  2026-01-12 18:43       ` Paolo Valerio
@ 2026-01-13 10:43       ` Théo Lebrun
  1 sibling, 0 replies; 21+ messages in thread
From: Théo Lebrun @ 2026-01-13 10:43 UTC (permalink / raw)
  To: Paolo Valerio, Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Thomas Petazzoni, Gregory Clement

On Mon Jan 12, 2026 at 3:16 PM CET, Paolo Valerio wrote:
> On 08 Jan 2026 at 04:43:43 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:
>> On Sun Dec 21, 2025 at 12:51 AM CET, Paolo Valerio wrote:
>>> @@ -1382,58 +1382,118 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>>> +		first_frame = ctrl & MACB_BIT(RX_SOF);
>>>  		len = ctrl & bp->rx_frm_len_mask;
>>>  
>>> -		netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
>>> +		if (len) {
>>> +			data_len = len;
>>> +			if (!first_frame)
>>> +				data_len -= queue->skb->len;
>>> +		} else {
>>> +			data_len = bp->rx_buffer_size;
>>> +		}
>>
>> Why deal with the `!len` case? How can it occur? The user guide doesn't
>> hint at that. It would mean we would grab uninitialised bytes, as we
>> assume len is the max buffer size.
>
> Good point. After taking a second look, !len may not be the most reliable
> way to check this.
> From the datasheet, status signals are only valid (with some exceptions)
> when MACB_BIT(RX_EOF) is set. As a side effect, len is always zero on my
> hw for frames without the EOF bit, but it's probably better to just rely
> on MACB_BIT(RX_EOF) instead of reading something that may end up being
> unreliable.

100%, I do agree!

>>> +		bp->rx_buffer_size = SKB_DATA_ALIGN(size);
>>> +		if (gem_total_rx_buffer_size(bp) > PAGE_SIZE) {
>>> +			overhead = bp->rx_headroom +
>>> +				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>>> +			bp->rx_buffer_size = rounddown(PAGE_SIZE - overhead,
>>> +						       RX_BUFFER_MULTIPLE);
>>> +		}
>>
>> I've seen your comment in [0/8]. Do you have any advice on how to test
>> this clamping? All I can think of is to either configure a massive MTU
>> or, more easily, cheat with the headroom.
>
> I normally test the set with 4k PAGE_SIZE and, as you said, setting the
> mtu to something bigger than that. This is still possible with 8k pages
> (given .jumbo_max_len = 10240).

Ah yes there is .jumbo_max_len, but our PAGE_SIZE==16K > .jumbo_max_len
so we cannot land in that codepath.

>> Also, should we warn? It means MTU-sized packets will be received in
>> fragments. It will work but is probably unexpected by users and a
>> slowdown reason that users might want to know about.
>
> I'm not sure about the warning as I don't see this as a user level detail.
> For debugging purposes, I guess the last printout should be fine (even
> better once extended with your suggestion). Of course, feel free to disagree.

I'm fine with no warnings. We'll check our performance anyways. :-)
If it changes we'll notice.

Regards,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx
  2026-01-13 10:35         ` Théo Lebrun
@ 2026-01-13 19:30           ` Paolo Valerio
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Valerio @ 2026-01-13 19:30 UTC (permalink / raw)
  To: Théo Lebrun, Théo Lebrun, netdev
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Lorenzo Bianconi,
	Gregory Clement, Thomas Petazzoni

On 13 Jan 2026 at 11:35:09 AM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:

> On Mon Jan 12, 2026 at 7:43 PM CET, Paolo Valerio wrote:
>> On 12 Jan 2026 at 03:16:24 PM, Paolo Valerio <pvalerio@redhat.com> wrote:
>>> On 08 Jan 2026 at 04:43:43 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:
>>>> nit: while in macb_init_rx_buffer_size(), can you tweak the debug line
>>>> from mtu & rx_buffer_size to also have rx_headroom and total? So that
>>>> we have everything available to understand what is going on buffer size
>>>> wise. Something like:
>>>>
>>>> -       netdev_dbg(bp->dev, "mtu [%u] rx_buffer_size [%zu]\n",
>>>> -                  bp->dev->mtu, bp->rx_buffer_size);
>>>> +       netdev_info(bp->dev, "mtu [%u] rx_buffer_size [%zu] rx_headroom [%zu] total [%u]\n",
>>>> +                   bp->dev->mtu, bp->rx_buffer_size, bp->rx_headroom,
>>>> +                   gem_total_rx_buffer_size(bp));
>>
>> I missed this before:
>> I assume so, but just checking, is the promotion from dbg to info also
>> wanted?
>
> Ah no it was a mistake. I was lazy during my testing: rather than
> `#define DEBUG` I changed netdev_dbg() to netdev_info().
>
> I wouldn't mind, but that isn't the usual kernel policy wrt logs.
> A working driver should be silent.
>

np, makes perfect sense

> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2026-01-13 19:30 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-20 23:51 [PATCH RFC net-next v2 0/8] net: macb: Add XDP support and page pool integration Paolo Valerio
2025-12-20 23:51 ` [PATCH RFC net-next v2 1/8] net: macb: move Rx buffers alloc from link up to open Paolo Valerio
2026-01-08 15:24   ` Théo Lebrun
2026-01-12 14:14     ` Paolo Valerio
2025-12-20 23:51 ` [PATCH RFC net-next v2 2/8] net: macb: rename rx_skbuff into rx_buff Paolo Valerio
2025-12-20 23:51 ` [PATCH RFC net-next v2 3/8] cadence: macb: Add page pool support handle multi-descriptor frame rx Paolo Valerio
2026-01-08 15:43   ` Théo Lebrun
2026-01-12 14:16     ` Paolo Valerio
2026-01-12 18:43       ` Paolo Valerio
2026-01-13 10:35         ` Théo Lebrun
2026-01-13 19:30           ` Paolo Valerio
2026-01-13 10:43       ` Théo Lebrun
2025-12-20 23:51 ` [PATCH RFC net-next v2 4/8] cadence: macb: use the current queue number for stats Paolo Valerio
2025-12-20 23:51 ` [PATCH RFC net-next v2 5/8] cadence: macb: add XDP support for gem Paolo Valerio
2026-01-08 15:49   ` Théo Lebrun
2026-01-12 14:17     ` Paolo Valerio
2025-12-20 23:51 ` [PATCH RFC net-next v2 6/8] cadence: macb: make macb_tx_skb generic Paolo Valerio
2025-12-20 23:51 ` [PATCH RFC net-next v2 7/8] cadence: macb: make tx path skb agnostic Paolo Valerio
2025-12-20 23:51 ` [PATCH RFC net-next v2 8/8] cadence: macb: introduce xmit support Paolo Valerio
2026-01-08 15:54   ` Théo Lebrun
2026-01-12 14:17     ` Paolo Valerio

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox