public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/8] net: macb: add XSK support
@ 2026-03-04 18:23 Théo Lebrun
  0 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:23 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

Add XSK support to the MACB/GEM driver.
Tested on Mobileye EyeQ5 (MIPS) evaluation board.
Applies on top of net-next (4ad96a7c9e2c) and Paolo's XDP work [0].

I don't have good Rx benchmark numbers yet, sorry, mostly because of
userspace tooling issues around eBPF/XDP on MIPS. In copy mode this
only means slowdowns, but in zero-copy mode, since we work with a
fixed number of buffers, it causes allocation errors.

--

The bulk of the work is dealing with a second allocator. Throughout, we
now use either queue->page_pool or queue->xsk_pool. The former gives us
raw buffers that we must wrap inside an xdp_buff; the latter allocates
xdp_buff structures directly, meaning less work.

To simplify the implementation, attaching an XSK pool implies closing
and reopening the interface. This could be improved over time: right
now, attaching an AF_XDP socket in zero-copy mode means we close/reopen
twice, once for the XDP program and once for the XSK pool.

First three patches are cleanup.

   [PATCH net-next 1/8] net: macb: make rx error messages rate-limited
   [PATCH net-next 2/8] net: macb: account for stats in Rx XDP codepaths
   [PATCH net-next 3/8] net: macb: account for stats in Tx XDP codepaths

Then comes preparation work.

   [PATCH net-next 4/8] net: macb: drop handling of recycled buffers in gem_rx_refill()
   [PATCH net-next 5/8] net: macb: move macb_xdp_submit_frame() body to helper function

And finally the XSK codepaths.

   [PATCH net-next 6/8] net: macb: add infrastructure for XSK buffer pool
   [PATCH net-next 7/8] net: macb: add Rx zero-copy AF_XDP support
   [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support

Thanks,
Have a nice day,
Théo

[0]: https://lore.kernel.org/netdev/20260302115232.1430640-1-pvalerio@redhat.com/

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
Théo Lebrun (8):
      net: macb: make rx error messages rate-limited
      net: macb: account for stats in Rx XDP codepaths
      net: macb: account for stats in Tx XDP codepaths
      net: macb: drop handling of recycled buffers in gem_rx_refill()
      net: macb: move macb_xdp_submit_frame() body to helper function
      net: macb: add infrastructure for XSK buffer pool
      net: macb: add Rx zero-copy AF_XDP support
      net: macb: add Tx zero-copy AF_XDP support

 drivers/net/ethernet/cadence/macb.h      |   2 +
 drivers/net/ethernet/cadence/macb_main.c | 668 +++++++++++++++++++++----------
 2 files changed, 468 insertions(+), 202 deletions(-)
---
base-commit: 06d25a140f34f5879d0731117d4d62a7dd3824a9
change-id: 20260225-macb-xsk-452c0c802436

Best regards,
-- 
Théo Lebrun <theo.lebrun@bootlin.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH net-next 1/8] net: macb: make rx error messages rate-limited
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 2/8] net: macb: account for stats in Rx XDP codepaths Théo Lebrun
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

When Rx codepath error messages trigger, reception is not interrupted:
the kernel log gets spammed, we lose useful history and everything
crawls to a halt. Make these messages rate-limited instead, to keep
older useful information in the log and keep the system responsive.

No netdev_*_ratelimited() variants exist, so we switch to dev_*().
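For context, dev_err_ratelimited() suppresses messages once a burst
budget is exhausted within a time window (by default 10 messages per 5
seconds in the kernel). A minimal userspace sketch of that policy, with
illustrative names rather than the kernel's struct ratelimit_state:

```c
#include <assert.h>
#include <stdbool.h>

struct ratelimit {
	long interval;     /* window length, arbitrary time units */
	int burst;         /* messages allowed per window */
	long window_start;
	int printed;
};

/* Return true if a message may be emitted at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, long now)
{
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;  /* window rolled over: reset budget */
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;             /* message reaches the log */
	}
	return false;                    /* suppressed */
}
```

The point of the policy is that a flood of identical errors costs at
most `burst` log lines per window, so older history survives.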

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index a79daad275ba..ab73d1a522c2 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1303,11 +1303,12 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 static inline int gem_rx_data_len(struct macb *bp, struct macb_queue *queue,
 				  u32 desc_ctrl, bool rx_sof, bool rx_eof)
 {
+	struct device *dev = &bp->pdev->dev;
 	int len;
 
 	if (unlikely(!rx_sof && !queue->skb)) {
-		netdev_err(bp->dev,
-			   "Received non-starting frame while expecting a starting one\n");
+		dev_err_ratelimited(dev,
+				    "Received non-starting frame while expecting a starting one\n");
 		return -1;
 	}
 
@@ -1322,7 +1323,7 @@ static inline int gem_rx_data_len(struct macb *bp, struct macb_queue *queue,
 
 	if (rx_eof && !rx_sof) {
 		if (unlikely(queue->skb->len > len)) {
-			netdev_err(bp->dev, "Unexpected frame len: %d\n", len);
+			dev_err_ratelimited(dev, "Unexpected frame len: %d\n", len);
 			return -1;
 		}
 
@@ -1382,8 +1383,8 @@ static int gem_rx_refill(struct macb_queue *queue, bool napi)
 						    gem_total_rx_buffer_size(bp),
 						    gfp_alloc | __GFP_NOWARN);
 			if (!page) {
-				netdev_err(bp->dev,
-					   "Unable to allocate rx buffer\n");
+				dev_err_ratelimited(&bp->pdev->dev,
+						    "Unable to allocate rx buffer\n");
 				err = -ENOMEM;
 				break;
 			}
@@ -1666,8 +1667,8 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 
 		buff_head = queue->rx_buff[entry];
 		if (unlikely(!buff_head)) {
-			netdev_err(bp->dev,
-				   "inconsistent Rx descriptor chain\n");
+			dev_err_ratelimited(&bp->pdev->dev,
+					    "inconsistent Rx descriptor chain\n");
 			bp->dev->stats.rx_dropped++;
 			queue->stats.rx_dropped++;
 			break;

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 2/8] net: macb: account for stats in Rx XDP codepaths
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 1/8] net: macb: make rx error messages rate-limited Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 3/8] net: macb: account for stats in Tx " Théo Lebrun
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

gem_xdp_run() returns an action. Regarding stats, we land in three
different cases:
 - The packet is handed to the stack (XDP_PASS), turns into an SKB and
   gets accounted for later in gem_rx(). Nothing to fix here.
 - The packet is dropped (XDP_DROP|ABORTED): we must increment the
   dropped counter. This is missing; add it.
 - The packet is passed along (XDP_TX|REDIRECT): we must increment the
   bytes and packets counters. This is missing; add it.

Along the way, use local variables to accumulate rx_bytes, rx_packets
and rx_dropped, then update the stats only once at the end of gem_rx().
This is simpler because all three stats must be modified on both a
per-interface and per-queue basis.
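The three cases above can be condensed into a compact accounting model
(our own enum and struct names, not the driver's):

```c
#include <assert.h>

enum act { ACT_PASS, ACT_ABORTED, ACT_DROP, ACT_TX, ACT_REDIRECT };

struct rx_stats {
	unsigned int packets, bytes, dropped;
};

/* Account one completed frame of `len` bytes for a given XDP verdict.
 * PASS is deliberately a no-op: the SKB path accounts for it later. */
static void account_xdp(struct rx_stats *s, enum act a, unsigned int len)
{
	switch (a) {
	case ACT_PASS:
		break;               /* handled by the SKB codepath */
	case ACT_ABORTED:
	case ACT_DROP:
		s->dropped++;
		break;
	case ACT_TX:
	case ACT_REDIRECT:
		s->packets++;        /* no SKB is ever built, so account now */
		s->bytes += len;
		break;
	}
}
```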

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 47 +++++++++++++++++++++++---------
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index ab73d1a522c2..1aa90499343a 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1627,6 +1627,7 @@ static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
 static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		  int budget)
 {
+	unsigned int packets = 0, dropped = 0, bytes = 0;
 	struct skb_shared_info *shinfo;
 	struct macb *bp = queue->bp;
 	struct macb_dma_desc *desc;
@@ -1669,8 +1670,7 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		if (unlikely(!buff_head)) {
 			dev_err_ratelimited(&bp->pdev->dev,
 					    "inconsistent Rx descriptor chain\n");
-			bp->dev->stats.rx_dropped++;
-			queue->stats.rx_dropped++;
+			dropped++;
 			break;
 		}
 
@@ -1700,11 +1700,29 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 			if (last_frame) {
 				ret = gem_xdp_run(queue, buff_head, &data_len,
 						  &headroom, addr - gem_rx_pad(bp));
-				if (ret == XDP_REDIRECT)
-					xdp_flush = true;
 
-				if (ret != XDP_PASS)
-					goto next_frame;
+				switch (ret) {
+				/* continue to SKB handling codepath */
+				case XDP_PASS:
+					break;
+
+				/* dropped packet cases */
+				case XDP_ABORTED:
+				case XDP_DROP:
+					dropped++;
+					queue->rx_buff[entry] = NULL;
+					continue;
+
+				/* redirect/tx cases */
+				case XDP_REDIRECT:
+					xdp_flush = true;
+					fallthrough;
+				case XDP_TX:
+					packets++;
+					bytes += data_len;
+					queue->rx_buff[entry] = NULL;
+					continue;
+				}
 			}
 
 			queue->skb = napi_build_skb(buff_head, gem_total_rx_buffer_size(bp));
@@ -1743,10 +1761,8 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 
 		/* now everything is ready for receiving packet */
 		if (last_frame) {
-			bp->dev->stats.rx_packets++;
-			queue->stats.rx_packets++;
-			bp->dev->stats.rx_bytes += queue->skb->len;
-			queue->stats.rx_bytes += queue->skb->len;
+			packets++;
+			bytes += queue->skb->len;
 
 			queue->skb->protocol = eth_type_trans(queue->skb, bp->dev);
 			skb_checksum_none_assert(queue->skb);
@@ -1769,7 +1785,6 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 			queue->skb = NULL;
 		}
 
-next_frame:
 		queue->rx_buff[entry] = NULL;
 		continue;
 
@@ -1784,11 +1799,17 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 						virt_to_head_page(buff_head),
 						false);
 
-		bp->dev->stats.rx_dropped++;
-		queue->stats.rx_dropped++;
+		dropped++;
 		queue->rx_buff[entry] = NULL;
 	}
 
+	bp->dev->stats.rx_packets += packets;
+	queue->stats.rx_packets += packets;
+	bp->dev->stats.rx_dropped += dropped;
+	queue->stats.rx_dropped += dropped;
+	bp->dev->stats.rx_bytes += bytes;
+	queue->stats.rx_bytes += bytes;
+
 	if (xdp_flush)
 		xdp_do_flush();
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 3/8] net: macb: account for stats in Tx XDP codepaths
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 1/8] net: macb: make rx error messages rate-limited Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 2/8] net: macb: account for stats in Rx XDP codepaths Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 4/8] net: macb: drop handling of recycled buffers in gem_rx_refill() Théo Lebrun
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

The macb_tx_complete() processing loop assumes a packet is composed of
multiple frames and is structured around this idea. However, this is
only true in the SKB case, i.e. `tx_buff->type == MACB_TYPE_SKB`.

Rework macb_tx_complete() to bring the tx_buff->type switch statement
outside; the frame iteration loop now lives only inside the SKB case.

Fix Tx XDP stats that were not accounted for in the XDP_TX|NDO cases.
Only increment statistics once per macb_tx_complete() call rather than
once per frame.

The `bytes` and `packets` stack variables now get incremented for
completed XDP XMIT/TX packets. This implies the DQL subsystem, through
netdev_tx_completed_queue(), now gets notified of those packets
completing. We must therefore also report those bytes as sent, using
netdev_tx_sent_queue(), in macb_xdp_submit_frame(), which is called by:
 - Rx XDP programs returning action XDP_TX and,
 - the .ndo_xdp_xmit() callback.

Incrementing `packets` also means XDP packets are accounted for in our
NAPI budget calculation.
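The BQL pairing requirement this restores can be stated as an
invariant: bytes passed to netdev_tx_completed_queue() must previously
have been reported through netdev_tx_sent_queue(). A toy model of that
invariant (plain counters, not the kernel's struct dql):

```c
#include <assert.h>

struct bql_model {
	unsigned long sent;       /* bytes reported at submit time */
	unsigned long completed;  /* bytes reported at completion */
};

static void model_tx_sent(struct bql_model *q, unsigned long bytes)
{
	q->sent += bytes;
}

/* Returns 0 on success, -1 when completing bytes that were never
 * reported as sent -- the imbalance that occurs if XDP frames are
 * counted at completion but not at submission. */
static int model_tx_completed(struct bql_model *q, unsigned long bytes)
{
	if (q->completed + bytes > q->sent)
		return -1;
	q->completed += bytes;
	return 0;
}
```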

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 71 +++++++++++++++-----------------
 1 file changed, 33 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 1aa90499343a..c1677f1d8f23 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1212,7 +1212,7 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 {
 	struct macb *bp = queue->bp;
 	unsigned long flags;
-	int skb_packets = 0;
+	int xsk_frames = 0;
 	unsigned int tail;
 	unsigned int head;
 	u16 queue_index;
@@ -1227,7 +1227,6 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 		struct macb_tx_buff *tx_buff;
 		struct macb_dma_desc *desc;
 		struct sk_buff *skb;
-		void *data = NULL;
 		u32 ctrl;
 
 		desc = macb_tx_desc(queue, tail);
@@ -1243,52 +1242,46 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 		if (!(ctrl & MACB_BIT(TX_USED)))
 			break;
 
-		/* Process all buffers of the current transmitted frame */
-		for (;; tail++) {
-			tx_buff = macb_tx_buff(queue, tail);
+		tx_buff = macb_tx_buff(queue, tail);
 
-			if (tx_buff->type != MACB_TYPE_SKB) {
-				data = tx_buff->ptr;
-				packets++;
-				goto unmap;
+		switch (tx_buff->type) {
+		case MACB_TYPE_SKB:
+			/* Process all buffers of the current transmitted frame */
+			while (!tx_buff->ptr) {
+				macb_tx_unmap(bp, tx_buff, budget);
+				tail++;
+				tx_buff = macb_tx_buff(queue, tail);
 			}
 
-			/* First, update TX stats if needed */
-			if (tx_buff->ptr) {
-				data = tx_buff->ptr;
-				skb = tx_buff->ptr;
+			skb = tx_buff->ptr;
 
-				if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
-				    !ptp_one_step_sync(skb))
-					gem_ptp_do_txstamp(bp, skb, desc);
+			if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+			    !ptp_one_step_sync(skb))
+				gem_ptp_do_txstamp(bp, skb, desc);
 
-				netdev_vdbg(bp->dev, "skb %u (data %p) TX complete\n",
-					    macb_tx_ring_wrap(bp, tail),
-					    skb->data);
-				bp->dev->stats.tx_packets++;
-				queue->stats.tx_packets++;
-				bp->dev->stats.tx_bytes += skb->len;
-				queue->stats.tx_bytes += skb->len;
-				skb_packets++;
-				packets++;
-				bytes += skb->len;
-			}
+			netdev_vdbg(bp->dev, "skb %u (data %p) TX complete\n",
+				    macb_tx_ring_wrap(bp, tail),
+				    skb->data);
+			bytes += skb->len;
+			break;
 
-unmap:
-			/* Now we can safely release resources */
-			macb_tx_unmap(bp, tx_buff, budget);
-
-			/* data is set only for the last buffer of the frame.
-			 * WARNING: at this point the buffer has been freed by
-			 * macb_tx_unmap().
-			 */
-			if (data)
-				break;
+		case MACB_TYPE_XDP_TX:
+		case MACB_TYPE_XDP_NDO:
+			bytes += tx_buff->size;
+			break;
 		}
+
+		packets++;
+		macb_tx_unmap(bp, tx_buff, budget);
 	}
 
+	bp->dev->stats.tx_packets += packets;
+	queue->stats.tx_packets += packets;
+	bp->dev->stats.tx_bytes += bytes;
+	queue->stats.tx_bytes += bytes;
+
 	netdev_tx_completed_queue(netdev_get_tx_queue(bp->dev, queue_index),
-				  skb_packets, bytes);
+				  packets, bytes);
 
 	queue->tx_tail = tail;
 	if (__netif_subqueue_stopped(bp->dev, queue_index) &&
@@ -1529,6 +1522,8 @@ static int macb_xdp_submit_frame(struct macb *bp, struct xdp_frame *xdpf,
 	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
 	spin_unlock(&bp->lock);
 
+	netdev_tx_sent_queue(netdev_get_tx_queue(bp->dev, queue_index), xdpf->len);
+
 	if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1)
 		netif_stop_subqueue(dev, queue_index);
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 4/8] net: macb: drop handling of recycled buffers in gem_rx_refill()
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
                   ` (2 preceding siblings ...)
  2026-03-04 18:24 ` [PATCH net-next 3/8] net: macb: account for stats in Tx " Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 5/8] net: macb: move macb_xdp_submit_frame() body to helper function Théo Lebrun
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

The refill operation supports detecting whether a buffer is already
present in a slot; if it is, it updates the slot's DMA descriptor,
reusing the same buffer.

This behavior can be dropped: all codepaths of gem_rx() letting a
buffer lie around to be reused by refill have disappeared. Said another
way: every time queue->rx_tail is incremented, queue->rx_buff[entry] is
set to NULL.

On the same occasion, move the `gfp_alloc` assignment out of the loop
and into the variable declarations; its value is constant across the
function's lifetime. Also fix a tiny alignment issue with the while
statement.
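The refill loop runs while CIRC_SPACE() reports room in the ring. For
a power-of-two ring size, the kernel helper from <linux/circ_buf.h>
reduces to the following, shown with a few worked values:

```c
#include <assert.h>

/* Equivalent of the kernel's CIRC_SPACE() for power-of-two sizes:
 * the number of slots that can still be filled before head catches
 * tail. One slot is always kept empty so that a full ring can be
 * distinguished from an empty one. */
#define CIRC_SPACE(head, tail, size) \
	(((tail) - ((head) + 1)) & ((size) - 1))
```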

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 64 ++++++++++++++------------------
 1 file changed, 28 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index c1677f1d8f23..ed94f9f0894b 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1351,18 +1351,18 @@ static unsigned int gem_total_rx_buffer_size(struct macb *bp)
 
 static int gem_rx_refill(struct macb_queue *queue, bool napi)
 {
+	gfp_t gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
 	struct macb *bp = queue->bp;
 	struct macb_dma_desc *desc;
 	unsigned int entry;
 	struct page *page;
 	dma_addr_t paddr;
-	gfp_t gfp_alloc;
 	int err = 0;
 	void *data;
 	int offset;
 
 	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
-			bp->rx_ring_size) > 0) {
+			  bp->rx_ring_size) > 0) {
 		entry = macb_rx_ring_wrap(bp, queue->rx_prepared_head);
 
 		/* Make hw descriptor updates visible to CPU */
@@ -1370,41 +1370,33 @@ static int gem_rx_refill(struct macb_queue *queue, bool napi)
 
 		desc = macb_rx_desc(queue, entry);
 
-		if (!queue->rx_buff[entry]) {
-			gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
-			page = page_pool_alloc_frag(queue->page_pool, &offset,
-						    gem_total_rx_buffer_size(bp),
-						    gfp_alloc | __GFP_NOWARN);
-			if (!page) {
-				dev_err_ratelimited(&bp->pdev->dev,
-						    "Unable to allocate rx buffer\n");
-				err = -ENOMEM;
-				break;
-			}
-
-			paddr = page_pool_get_dma_addr(page) +
-				gem_rx_pad(bp) + offset;
-
-			dma_sync_single_for_device(&bp->pdev->dev,
-						   paddr, bp->rx_buffer_size,
-						   page_pool_get_dma_dir(queue->page_pool));
-
-			data = page_address(page) + offset;
-			queue->rx_buff[entry] = data;
-
-			if (entry == bp->rx_ring_size - 1)
-				paddr |= MACB_BIT(RX_WRAP);
-			desc->ctrl = 0;
-			/* Setting addr clears RX_USED and allows reception,
-			 * make sure ctrl is cleared first to avoid a race.
-			 */
-			dma_wmb();
-			macb_set_addr(bp, desc, paddr);
-		} else {
-			desc->ctrl = 0;
-			dma_wmb();
-			desc->addr &= ~MACB_BIT(RX_USED);
+		page = page_pool_alloc_frag(queue->page_pool, &offset,
+					    gem_total_rx_buffer_size(bp),
+					    gfp_alloc | __GFP_NOWARN);
+		if (!page) {
+			dev_err_ratelimited(&bp->pdev->dev,
+					    "Unable to allocate rx buffer\n");
+			err = -ENOMEM;
+			break;
 		}
+
+		paddr = page_pool_get_dma_addr(page) + gem_rx_pad(bp) + offset;
+
+		dma_sync_single_for_device(&bp->pdev->dev,
+					   paddr, bp->rx_buffer_size,
+					   page_pool_get_dma_dir(queue->page_pool));
+
+		data = page_address(page) + offset;
+		queue->rx_buff[entry] = data;
+
+		if (entry == bp->rx_ring_size - 1)
+			paddr |= MACB_BIT(RX_WRAP);
+		desc->ctrl = 0;
+		/* Setting addr clears RX_USED and allows reception,
+		 * make sure ctrl is cleared first to avoid a race.
+		 */
+		dma_wmb();
+		macb_set_addr(bp, desc, paddr);
 		queue->rx_prepared_head++;
 	}
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 5/8] net: macb: move macb_xdp_submit_frame() body to helper function
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
                   ` (3 preceding siblings ...)
  2026-03-04 18:24 ` [PATCH net-next 4/8] net: macb: drop handling of recycled buffers in gem_rx_refill() Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 6/8] net: macb: add infrastructure for XSK buffer pool Théo Lebrun
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

Part of macb_xdp_submit_frame() is specific to the handling of an XDP
buffer (pick a queue for emission, DMA map or sync, report emitted
bytes); the rest is the conversation with hardware to update the DMA
descriptor and start transmission.

Move the hardware-specific code out of macb_xdp_submit_frame() into a
macb_xdp_submit_buff() helper function. The goal is to make the code
reusable to support XSK buffers.

The macb_xdp_submit_frame() body is modified slightly: we bring the
dma_map_single() call outside of the queue->tx_ptr_lock critical
section, to minimise its span.
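The lock-span change follows a common pattern: do the work that does
not need the lock (here the DMA mapping) first, then take the lock only
around the ring update. Sketched with a pthread mutex and illustrative
names (not the driver's):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;
static int ring_head;

/* Expensive preparation, safe to run without the ring lock held;
 * stands in for dma_map_single(). */
static int prepare_mapping(int len)
{
	return len * 2;  /* arbitrary stand-in for a DMA address */
}

static int submit(int len)
{
	int mapping = prepare_mapping(len);  /* outside the lock */

	pthread_mutex_lock(&ring_lock);
	ring_head++;                         /* only ring updates locked */
	pthread_mutex_unlock(&ring_lock);
	return mapping;
}
```

Keeping the mapping outside the critical section shortens the time
other submitters spend waiting on the queue lock.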

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 143 +++++++++++++++++--------------
 1 file changed, 78 insertions(+), 65 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index ed94f9f0894b..65c2ec2a843c 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1208,6 +1208,52 @@ static bool ptp_one_step_sync(struct sk_buff *skb)
 	return false;
 }
 
+static void macb_xdp_submit_buff(struct macb *bp, unsigned int queue_index,
+				 struct macb_tx_buff buff)
+{
+	struct macb_queue *queue = &bp->queues[queue_index];
+	struct net_device *netdev = bp->dev;
+	struct macb_tx_buff *tx_buff;
+	struct macb_dma_desc *desc;
+	unsigned int next_head;
+	u32 ctrl;
+
+	next_head = queue->tx_head + 1;
+
+	ctrl = MACB_BIT(TX_USED);
+	desc = macb_tx_desc(queue, next_head);
+	desc->ctrl = ctrl;
+
+	desc = macb_tx_desc(queue, queue->tx_head);
+	tx_buff = macb_tx_buff(queue, queue->tx_head);
+	*tx_buff = buff;
+
+	ctrl = (u32)buff.size;
+	ctrl |= MACB_BIT(TX_LAST);
+
+	if (unlikely(macb_tx_ring_wrap(bp, queue->tx_head) == (bp->tx_ring_size - 1)))
+		ctrl |= MACB_BIT(TX_WRAP);
+
+	/* Set TX buffer descriptor */
+	macb_set_addr(bp, desc, buff.mapping);
+	/* desc->addr must be visible to hardware before clearing
+	 * 'TX_USED' bit in desc->ctrl.
+	 */
+	wmb();
+	desc->ctrl = ctrl;
+	queue->tx_head = next_head;
+
+	/* Make newly initialized descriptor visible to hardware */
+	wmb();
+
+	spin_lock(&bp->lock);
+	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
+	spin_unlock(&bp->lock);
+
+	if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1)
+		netif_stop_subqueue(netdev, queue_index);
+}
+
 static int macb_tx_complete(struct macb_queue *queue, int budget)
 {
 	struct macb *bp = queue->bp;
@@ -1430,44 +1476,25 @@ static void discard_partial_frame(struct macb_queue *queue, unsigned int begin,
 }
 
 static int macb_xdp_submit_frame(struct macb *bp, struct xdp_frame *xdpf,
-				 struct net_device *dev, bool dma_map,
+				 struct net_device *netdev, bool dma_map,
 				 dma_addr_t addr)
 {
+	struct device *dev = &bp->pdev->dev;
 	enum macb_tx_buff_type buff_type;
-	struct macb_tx_buff *tx_buff;
 	int cpu = smp_processor_id();
-	struct macb_dma_desc *desc;
 	struct macb_queue *queue;
-	unsigned int next_head;
 	unsigned long flags;
 	dma_addr_t mapping;
 	u16 queue_index;
 	int err = 0;
-	u32 ctrl;
-
-	queue_index = cpu % bp->num_queues;
-	queue = &bp->queues[queue_index];
-	buff_type = dma_map ? MACB_TYPE_XDP_NDO : MACB_TYPE_XDP_TX;
-
-	spin_lock_irqsave(&queue->tx_ptr_lock, flags);
-
-	/* This is a hard error, log it. */
-	if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1) {
-		netif_stop_subqueue(dev, queue_index);
-		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
-			   queue->tx_head, queue->tx_tail);
-		err = -ENOMEM;
-		goto unlock;
-	}
 
 	if (dma_map) {
-		mapping = dma_map_single(&bp->pdev->dev,
-					 xdpf->data,
-					 xdpf->len, DMA_TO_DEVICE);
-		if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
-			err = -ENOMEM;
-			goto unlock;
-		}
+		mapping = dma_map_single(dev, xdpf->data, xdpf->len, DMA_TO_DEVICE);
+		err = dma_mapping_error(&bp->pdev->dev, mapping);
+		if (unlikely(err))
+			return err;
+
+		buff_type = MACB_TYPE_XDP_NDO;
 	} else {
 		/* progs can adjust the head. Sync and set the adjusted one.
 		 * This also implicitly takes into account ip alignment,
@@ -1476,52 +1503,38 @@ static int macb_xdp_submit_frame(struct macb *bp, struct xdp_frame *xdpf,
 		mapping = addr + xdpf->headroom + sizeof(*xdpf);
 		dma_sync_single_for_device(&bp->pdev->dev, mapping,
 					   xdpf->len, DMA_BIDIRECTIONAL);
+
+		buff_type = MACB_TYPE_XDP_TX;
 	}
 
-	next_head = queue->tx_head + 1;
+	queue_index = cpu % bp->num_queues;
+	queue = &bp->queues[queue_index];
 
-	ctrl = MACB_BIT(TX_USED);
-	desc = macb_tx_desc(queue, next_head);
-	desc->ctrl = ctrl;
+	spin_lock_irqsave(&queue->tx_ptr_lock, flags);
 
-	desc = macb_tx_desc(queue, queue->tx_head);
-	tx_buff = macb_tx_buff(queue, queue->tx_head);
-	tx_buff->ptr = xdpf;
-	tx_buff->type = buff_type;
-	tx_buff->mapping = dma_map ? mapping : 0;
-	tx_buff->size = xdpf->len;
-	tx_buff->mapped_as_page = false;
+	if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1) {
+		/* This is a hard error, log it. */
+		netif_stop_subqueue(netdev, queue_index);
+		netdev_dbg(netdev, "tx_head = %u, tx_tail = %u\n",
+			   queue->tx_head, queue->tx_tail);
+		err = -ENOMEM;
+	} else {
+		macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){
+			.ptr = xdpf,
+			.mapping = dma_map ? mapping : 0,
+			.size = xdpf->len,
+			.mapped_as_page = false,
+			.type = buff_type,
+		});
 
-	ctrl = (u32)tx_buff->size;
-	ctrl |= MACB_BIT(TX_LAST);
+		netdev_tx_sent_queue(netdev_get_tx_queue(bp->dev, queue_index), xdpf->len);
+	}
 
-	if (unlikely(macb_tx_ring_wrap(bp, queue->tx_head) == (bp->tx_ring_size - 1)))
-		ctrl |= MACB_BIT(TX_WRAP);
-
-	/* Set TX buffer descriptor */
-	macb_set_addr(bp, desc, mapping);
-	/* desc->addr must be visible to hardware before clearing
-	 * 'TX_USED' bit in desc->ctrl.
-	 */
-	wmb();
-	desc->ctrl = ctrl;
-	queue->tx_head = next_head;
-
-	/* Make newly initialized descriptor visible to hardware */
-	wmb();
-
-	spin_lock(&bp->lock);
-	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
-	spin_unlock(&bp->lock);
-
-	netdev_tx_sent_queue(netdev_get_tx_queue(bp->dev, queue_index), xdpf->len);
-
-	if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1)
-		netif_stop_subqueue(dev, queue_index);
-
-unlock:
 	spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
 
+	if (err && dma_map)
+		dma_unmap_single(dev, mapping, xdpf->len, DMA_TO_DEVICE);
+
 	return err;
 }
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 6/8] net: macb: add infrastructure for XSK buffer pool
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
                   ` (4 preceding siblings ...)
  2026-03-04 18:24 ` [PATCH net-next 5/8] net: macb: move macb_xdp_submit_frame() body to helper function Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 7/8] net: macb: add Rx zero-copy AF_XDP support Théo Lebrun
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

Store an XSK buffer pool per queue, assigned through .ndo_bpf() with
command == XDP_SETUP_XSK_POOL.

Upstream has no sequence to disable a single queue, free its buffers,
refill it and re-enable it without affecting the other queues.
Therefore we bracket the operation with an interface-wide close and
open.

Also, prepare the ground with a .ndo_xsk_wakeup() operation that
performs the pre-flight checks but is otherwise a no-op.

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb.h      |  1 +
 drivers/net/ethernet/cadence/macb_main.c | 66 +++++++++++++++++++++++++++++++-
 2 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 009a44e94726..a9e6f0289ecb 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -1278,6 +1278,7 @@ struct macb_queue {
 	struct napi_struct	napi_rx;
 	struct queue_stats stats;
 	struct page_pool	*page_pool;
+	struct xsk_buff_pool	*xsk_pool;
 	struct sk_buff		*skb;
 	struct xdp_rxq_info	xdp_rxq;
 };
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 65c2ec2a843c..a72d59ffd1cf 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -38,6 +38,7 @@
 #include <linux/types.h>
 #include <linux/udp.h>
 #include <net/pkt_sched.h>
+#include <net/xdp_sock_drv.h>
 #include "macb.h"
 
 /* This structure is only used for MACB on SiFive FU540 devices */
@@ -1564,6 +1565,24 @@ static int gem_xdp_xmit(struct net_device *dev, int num_frame,
 	return xmitted;
 }
 
+static int gem_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
+{
+	struct macb *bp = netdev_priv(dev);
+	struct macb_queue *queue = &bp->queues[qid];
+
+	if (unlikely(!netif_carrier_ok(dev)))
+		return -ENETDOWN;
+
+	if (unlikely(qid >= bp->num_queues ||
+		     !rcu_access_pointer(bp->prog) ||
+		     !queue->xsk_pool))
+		return -ENXIO;
+
+	/* no-op, until rx/tx implement XSK support */
+
+	return 0;
+}
+
 static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
 		       unsigned int *len, unsigned int *headroom,
 		       dma_addr_t addr)
@@ -3580,6 +3599,46 @@ static int gem_xdp_setup(struct net_device *dev, struct bpf_prog *prog,
 	return err;
 }
 
+static int gem_xdp_setup_xsk_pool(struct net_device *netdev,
+				  struct xsk_buff_pool *pool, u16 qid)
+{
+	struct macb *bp = netdev_priv(netdev);
+	unsigned long attrs = DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING;
+	struct macb_queue *queue = &bp->queues[qid];
+	bool running = netif_running(netdev);
+	struct device *dev = &bp->pdev->dev;
+	int err = 0;
+
+	if (qid >= bp->num_queues)
+		return -EINVAL;
+
+	if (pool && queue->xsk_pool)
+		return -EBUSY;
+
+	if (running)
+		macb_close(netdev);
+
+	if (pool) {
+		err = xsk_pool_dma_map(pool, dev, attrs);
+		if (err)
+			netdev_err(netdev, "xdp: failed to DMA map XSK pool\n");
+		else
+			queue->xsk_pool = pool;
+	} else {
+		if (queue->xsk_pool)
+			xsk_pool_dma_unmap(queue->xsk_pool, attrs);
+		queue->xsk_pool = NULL;
+	}
+
+	if (running) {
+		int err_open = macb_open(netdev);
+
+		err = err ?: err_open;
+	}
+
+	return err;
+}
+
 static int gem_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 {
 	struct macb *bp = netdev_priv(dev);
@@ -3590,6 +3649,9 @@ static int gem_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 	switch (xdp->command) {
 	case XDP_SETUP_PROG:
 		return gem_xdp_setup(dev, xdp->prog, xdp->extack);
+	case XDP_SETUP_XSK_POOL:
+		return gem_xdp_setup_xsk_pool(dev, xdp->xsk.pool,
+					      xdp->xsk.queue_id);
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -4852,6 +4914,7 @@ static const struct net_device_ops macb_netdev_ops = {
 	.ndo_setup_tc		= macb_setup_tc,
 	.ndo_bpf		= gem_xdp,
 	.ndo_xdp_xmit		= gem_xdp_xmit,
+	.ndo_xsk_wakeup		= gem_xsk_wakeup,
 };
 
 /* Configure peripheral capabilities according to device tree
@@ -6156,7 +6219,8 @@ static int macb_probe(struct platform_device *pdev)
 
 		dev->xdp_features = NETDEV_XDP_ACT_BASIC |
 				    NETDEV_XDP_ACT_REDIRECT |
-				    NETDEV_XDP_ACT_NDO_XMIT;
+				    NETDEV_XDP_ACT_NDO_XMIT |
+				    NETDEV_XDP_ACT_XSK_ZEROCOPY;
 	}
 
 	netif_carrier_off(dev);

-- 
2.53.0



* [PATCH net-next 7/8] net: macb: add Rx zero-copy AF_XDP support
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
                   ` (5 preceding siblings ...)
  2026-03-04 18:24 ` [PATCH net-next 6/8] net: macb: add infrastructure for XSK buffer pool Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-04 18:24 ` [PATCH net-next 8/8] net: macb: add Tx " Théo Lebrun
  2026-03-06  3:11 ` [PATCH net-next 0/8] net: macb: add XSK support Jakub Kicinski
  8 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

The Rx direction uses a page_pool instance, created at open time, as
its allocator. If present, use the new xsk_buff_pool located at
queue->xsk_pool instead. Store `struct xdp_buff` pointers in each
queue->rx_buff[] slot instead of raw pointers to the buffer start.
Therefore, inside gem_rx() and gem_xdp_run(), we are handed XDP buffers
directly and need not allocate one on the stack to pass to the XDP
program.

As this is a fresh implementation, jump straight to the batch
allocation API rather than xsk_buff_alloc(). Two batch alloc calls are
needed at wrap-around.

--

At open, in gem_create_page_pool() renamed to gem_init_pool():
 - Stop creating a page_pool if we have an XSK one.
 - Report proper values to xdp_rxq.

While running, in gem_rx(), gem_rx_refill() and gem_xdp_run():
 - Refill buffer slots using one/two calls to xsk_buff_alloc_batch().
 - Support running XDP program on a pre-allocated `struct xdp_buff`.
 - Adjust buffer free operations to support XSK. xsk_buff_free()
   replaces page_pool_put_full_page() if XSK is active.
 - End gem_rx() by marking the XSK need_wakeup flag.
 - When needed, wakeup is triggered by activating an IRQ from software,
   allowed by the hardware in the per-queue IMR register.

At close, in gem_free_rx_buffers():
 - Adjust the buffer free operation.
 - Don't destroy the page pool if we were in XSK mode.

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 223 ++++++++++++++++++++++---------
 1 file changed, 161 insertions(+), 62 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index a72d59ffd1cf..ea1b0b8c4fab 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1398,18 +1398,39 @@ static unsigned int gem_total_rx_buffer_size(struct macb *bp)
 
 static int gem_rx_refill(struct macb_queue *queue, bool napi)
 {
-	gfp_t gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
 	struct macb *bp = queue->bp;
+	struct xdp_buff **xdp_buffs = (struct xdp_buff **)queue->rx_buff;
+	gfp_t gfp_alloc = napi ? GFP_ATOMIC : GFP_KERNEL;
+	struct xsk_buff_pool *xsk = queue->xsk_pool;
+	unsigned int size = bp->rx_ring_size;
 	struct macb_dma_desc *desc;
+	unsigned int offset;
 	unsigned int entry;
 	struct page *page;
 	dma_addr_t paddr;
 	int err = 0;
-	void *data;
-	int offset;
 
-	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
-			  bp->rx_ring_size) > 0) {
+	if (xsk) {
+		u32 head, tail, space_to_end, space_from_start, first_alloc;
+
+		/* CIRC_SPACE_TO_END() requires wrapping head & tail. */
+		head = macb_rx_ring_wrap(bp, queue->rx_prepared_head);
+		tail = macb_rx_ring_wrap(bp, queue->rx_tail);
+		space_to_end = CIRC_SPACE_TO_END(head, tail, size);
+		space_from_start = CIRC_SPACE(head, tail, size) - space_to_end;
+
+		first_alloc = xsk_buff_alloc_batch(xsk, xdp_buffs + head,
+						   space_to_end);
+
+		/*
+		 * Refill in two batch operations if we are wrapping around and
+		 * the first alloc batch was fully satisfied.
+		 */
+		if (head + first_alloc == size && space_from_start)
+			xsk_buff_alloc_batch(xsk, xdp_buffs, space_from_start);
+	}
+
+	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail, size) > 0) {
 		entry = macb_rx_ring_wrap(bp, queue->rx_prepared_head);
 
 		/* Make hw descriptor updates visible to CPU */
@@ -1417,26 +1438,38 @@ static int gem_rx_refill(struct macb_queue *queue, bool napi)
 
 		desc = macb_rx_desc(queue, entry);
 
-		page = page_pool_alloc_frag(queue->page_pool, &offset,
-					    gem_total_rx_buffer_size(bp),
-					    gfp_alloc | __GFP_NOWARN);
-		if (!page) {
+		if (xsk) {
+			/* Remember xdp_buffs is an alias to queue->rx_buff. */
+			if (xdp_buffs[entry])
+				paddr = xsk_buff_xdp_get_dma(xdp_buffs[entry]);
+		} else {
+			page = page_pool_alloc_frag(queue->page_pool, &offset,
+						    gem_total_rx_buffer_size(bp),
+						    gfp_alloc | __GFP_NOWARN);
+			if (page) {
+				queue->rx_buff[entry] = page_address(page) +
+							offset;
+				paddr = page_pool_get_dma_addr(page) +
+					gem_rx_pad(bp) + offset;
+				dma_sync_single_for_device(&bp->pdev->dev,
+							   paddr,
+							   bp->rx_buffer_size,
+							   page_pool_get_dma_dir(queue->page_pool));
+			}
+		}
+
+		/*
+		 * In case xsk_buff_alloc_batch() returned less than requested
+		 * or page_pool_alloc_frag() failed.
+		 */
+		if (!queue->rx_buff[entry]) {
 			dev_err_ratelimited(&bp->pdev->dev,
 					    "Unable to allocate rx buffer\n");
 			err = -ENOMEM;
 			break;
 		}
 
-		paddr = page_pool_get_dma_addr(page) + gem_rx_pad(bp) + offset;
-
-		dma_sync_single_for_device(&bp->pdev->dev,
-					   paddr, bp->rx_buffer_size,
-					   page_pool_get_dma_dir(queue->page_pool));
-
-		data = page_address(page) + offset;
-		queue->rx_buff[entry] = data;
-
-		if (entry == bp->rx_ring_size - 1)
+		if (entry == size - 1)
 			paddr |= MACB_BIT(RX_WRAP);
 		desc->ctrl = 0;
 		/* Setting addr clears RX_USED and allows reception,
@@ -1569,6 +1602,7 @@ static int gem_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
 {
 	struct macb *bp = netdev_priv(dev);
 	struct macb_queue *queue = &bp->queues[qid];
+	u32 irqs = 0;
 
 	if (unlikely(!netif_carrier_ok(dev)))
 		return -ENETDOWN;
@@ -1578,7 +1612,12 @@ static int gem_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
 		     !queue->xsk_pool))
 		return -ENXIO;
 
-	/* no-op, until rx/tx implement XSK support */
+	if ((flags & XDP_WAKEUP_RX) &&
+	    !napi_if_scheduled_mark_missed(&queue->napi_rx))
+		irqs |= MACB_BIT(RCOMP);
+
+	if (irqs)
+		queue_writel(queue, IMR, irqs);
 
 	return 0;
 }
@@ -1587,10 +1626,11 @@ static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
 		       unsigned int *len, unsigned int *headroom,
 		       dma_addr_t addr)
 {
-	struct net_device *dev;
+	struct xsk_buff_pool *xsk = queue->xsk_pool;
+	struct net_device *dev = queue->bp->dev;
+	struct xdp_buff xdp, *xdp_ptr;
 	struct xdp_frame *xdpf;
 	struct bpf_prog *prog;
-	struct xdp_buff xdp;
 
 	u32 act = XDP_PASS;
 
@@ -1600,25 +1640,35 @@ static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
 	if (!prog)
 		goto out;
 
-	xdp_init_buff(&xdp, gem_total_rx_buffer_size(queue->bp), &queue->xdp_rxq);
-	xdp_prepare_buff(&xdp, buff_head, *headroom, *len, false);
-	xdp_buff_clear_frags_flag(&xdp);
-	dev = queue->bp->dev;
+	if (xsk) {
+		/*
+		 * It was a lie all along: buff_head is not a buffer but a
+		 * struct xdp_buff that points to the actual buffer.
+		 */
+		xdp_ptr = buff_head;
+		xdp_ptr->data_end = xdp_ptr->data + *len;
+	} else {
+		/* Use a stack-allocated struct xdp_buff. */
+		xdp_init_buff(&xdp, gem_total_rx_buffer_size(queue->bp), &queue->xdp_rxq);
+		xdp_prepare_buff(&xdp, buff_head, *headroom, *len, false);
+		xdp_buff_clear_frags_flag(&xdp);
+		xdp_ptr = &xdp;
+	}
 
-	act = bpf_prog_run_xdp(prog, &xdp);
+	act = bpf_prog_run_xdp(prog, xdp_ptr);
 	switch (act) {
 	case XDP_PASS:
 		*len = xdp.data_end - xdp.data;
 		*headroom = xdp.data - xdp.data_hard_start;
 		goto out;
 	case XDP_REDIRECT:
-		if (unlikely(xdp_do_redirect(dev, &xdp, prog))) {
+		if (unlikely(xdp_do_redirect(dev, xdp_ptr, prog))) {
 			act = XDP_DROP;
 			break;
 		}
 		goto out;
 	case XDP_TX:
-		xdpf = xdp_convert_buff_to_frame(&xdp);
+		xdpf = xdp_convert_buff_to_frame(xdp_ptr);
 		if (unlikely(!xdpf) || macb_xdp_submit_frame(queue->bp, xdpf,
 							     dev, false, addr)) {
 			act = XDP_DROP;
@@ -1635,8 +1685,12 @@ static u32 gem_xdp_run(struct macb_queue *queue, void *buff_head,
 		break;
 	}
 
-	page_pool_put_full_page(queue->page_pool,
-				virt_to_head_page(xdp.data), true);
+	if (xsk)
+		xsk_buff_free(xdp_ptr);
+	else
+		page_pool_put_full_page(queue->page_pool,
+					virt_to_head_page(xdp.data), true);
+
 out:
 	rcu_read_unlock();
 
@@ -1647,14 +1701,17 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		  int budget)
 {
 	unsigned int packets = 0, dropped = 0, bytes = 0;
+	struct xsk_buff_pool *xsk = queue->xsk_pool;
 	struct skb_shared_info *shinfo;
 	struct macb *bp = queue->bp;
 	struct macb_dma_desc *desc;
+	struct xdp_buff *xsk_xdp;
 	bool xdp_flush = false;
 	unsigned int headroom;
 	unsigned int entry;
 	struct page *page;
 	void *buff_head;
+	int refill_err;
 	int count = 0;
 	int data_len;
 	int nr_frags;
@@ -1686,6 +1743,7 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		count++;
 
 		buff_head = queue->rx_buff[entry];
+		xsk_xdp = buff_head;
 		if (unlikely(!buff_head)) {
 			dev_err_ratelimited(&bp->pdev->dev,
 					    "inconsistent Rx descriptor chain\n");
@@ -1701,10 +1759,14 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 		if (data_len < 0)
 			goto free_frags;
 
-		dma_sync_single_for_cpu(&bp->pdev->dev,
-					addr + (first_frame ? bp->rx_ip_align : 0),
-					data_len,
-					page_pool_get_dma_dir(queue->page_pool));
+		if (xsk) {
+			xsk_buff_dma_sync_for_cpu(xsk_xdp);
+		} else {
+			dma_sync_single_for_cpu(&bp->pdev->dev,
+						addr + (first_frame ? bp->rx_ip_align : 0),
+						data_len,
+						page_pool_get_dma_dir(queue->page_pool));
+		}
 
 		if (first_frame) {
 			if (unlikely(queue->skb)) {
@@ -1813,10 +1875,13 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 			queue->skb = NULL;
 		}
 
-		if (buff_head)
+		if (buff_head && xsk) {
+			xsk_buff_free(xsk_xdp);
+		} else if (buff_head) {
 			page_pool_put_full_page(queue->page_pool,
 						virt_to_head_page(buff_head),
 						false);
+		}
 
 		dropped++;
 		queue->rx_buff[entry] = NULL;
@@ -1829,10 +1894,26 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
 	bp->dev->stats.rx_bytes += bytes;
 	queue->stats.rx_bytes += bytes;
 
+	if (!count) /* short-circuit */
+		return 0;
+
 	if (xdp_flush)
 		xdp_do_flush();
 
-	gem_rx_refill(queue, true);
+	refill_err = gem_rx_refill(queue, true);
+	if (refill_err)
+		count = budget;
+
+	if (xsk && xsk_uses_need_wakeup(xsk)) {
+		unsigned int desc_available = CIRC_SPACE(queue->rx_prepared_head,
+							 queue->rx_tail,
+							 bp->rx_ring_size);
+
+		if (refill_err || !desc_available)
+			xsk_set_rx_need_wakeup(xsk);
+		else
+			xsk_clear_rx_need_wakeup(xsk);
+	}
 
 	return count;
 }
@@ -2816,9 +2897,16 @@ static void gem_free_rx_buffers(struct macb *bp)
 			if (!data)
 				continue;
 
-			page_pool_put_full_page(queue->page_pool,
-						virt_to_head_page(data),
-						false);
+			if (queue->xsk_pool) {
+				struct xdp_buff *xdp = data;
+
+				xsk_buff_free(xdp);
+			} else {
+				page_pool_put_full_page(queue->page_pool,
+							virt_to_head_page(data),
+							false);
+			}
+
 			queue->rx_buff[i] = NULL;
 		}
 
@@ -2831,8 +2919,10 @@ static void gem_free_rx_buffers(struct macb *bp)
 		queue->rx_buff = NULL;
 		if (xdp_rxq_info_is_reg(&queue->xdp_rxq))
 			xdp_rxq_info_unreg(&queue->xdp_rxq);
-		page_pool_destroy(queue->page_pool);
-		queue->page_pool = NULL;
+		if (!queue->xsk_pool) {
+			page_pool_destroy(queue->page_pool);
+			queue->page_pool = NULL;
+		}
 	}
 }
 
@@ -2987,7 +3077,7 @@ static int macb_alloc_consistent(struct macb *bp)
 	return -ENOMEM;
 }
 
-static int gem_create_page_pool(struct macb_queue *queue, int qid)
+static int gem_init_pool(struct macb_queue *queue, int qid)
 {
 	struct page_pool_params pp_params = {
 		.order = 0,
@@ -3002,24 +3092,32 @@ static int gem_create_page_pool(struct macb_queue *queue, int qid)
 		.napi = &queue->napi_rx,
 		.max_len = PAGE_SIZE,
 	};
-	struct page_pool *pool;
-	int err;
+	struct xsk_buff_pool *xsk = queue->xsk_pool;
+	enum xdp_mem_type mem_type;
+	void *allocator;
+	int err = 0;
 
-	/* This can happen in the case of HRESP error.
-	 * Do nothing as page pool is already existing.
-	 */
-	if (queue->page_pool)
-		return 0;
+	if (xsk) {
+		mem_type = MEM_TYPE_XSK_BUFF_POOL;
+		allocator = xsk;
+	} else {
+		/* This can happen in the case of HRESP error.
+		 * Do nothing as page pool is already existing.
+		 */
+		if (queue->page_pool)
+			return 0;
 
-	pool = page_pool_create(&pp_params);
-	if (IS_ERR(pool)) {
-		netdev_err(queue->bp->dev, "cannot create rx page pool\n");
-		err = PTR_ERR(pool);
-		goto clear_pool;
+		queue->page_pool = page_pool_create(&pp_params);
+		if (IS_ERR(queue->page_pool)) {
+			netdev_err(queue->bp->dev, "cannot create rx page pool\n");
+			err = PTR_ERR(queue->page_pool);
+			goto clear_pool;
+		}
+
+		mem_type = MEM_TYPE_PAGE_POOL;
+		allocator = queue->page_pool;
 	}
 
-	queue->page_pool = pool;
-
 	err = xdp_rxq_info_reg(&queue->xdp_rxq, queue->bp->dev, qid,
 			       queue->napi_rx.napi_id);
 	if (err < 0) {
@@ -3027,8 +3125,7 @@ static int gem_create_page_pool(struct macb_queue *queue, int qid)
 		goto destroy_pool;
 	}
 
-	err = xdp_rxq_info_reg_mem_model(&queue->xdp_rxq, MEM_TYPE_PAGE_POOL,
-					 queue->page_pool);
+	err = xdp_rxq_info_reg_mem_model(&queue->xdp_rxq, mem_type, allocator);
 	if (err) {
 		netdev_err(queue->bp->dev, "xdp: failed to register rxq memory model\n");
 		goto unreg_info;
@@ -3039,9 +3136,11 @@ static int gem_create_page_pool(struct macb_queue *queue, int qid)
 unreg_info:
 	xdp_rxq_info_unreg(&queue->xdp_rxq);
 destroy_pool:
-	page_pool_destroy(pool);
+	if (!xsk)
+		page_pool_destroy(queue->page_pool);
 clear_pool:
-	queue->page_pool = NULL;
+	if (!xsk)
+		queue->page_pool = NULL;
 
 	return err;
 }
@@ -3084,7 +3183,7 @@ static int gem_init_rings(struct macb *bp, bool fail_early)
 		/* This is a hard failure. In case of HRESP error
 		 * recovery we always reuse the existing page pool.
 		 */
-		last_err = gem_create_page_pool(queue, q);
+		last_err = gem_init_pool(queue, q);
 		if (last_err)
 			break;
 

-- 
2.53.0



* [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
                   ` (6 preceding siblings ...)
  2026-03-04 18:24 ` [PATCH net-next 7/8] net: macb: add Rx zero-copy AF_XDP support Théo Lebrun
@ 2026-03-04 18:24 ` Théo Lebrun
  2026-03-06 12:48   ` Maxime Chevallier
  2026-03-06  3:11 ` [PATCH net-next 0/8] net: macb: add XSK support Jakub Kicinski
  8 siblings, 1 reply; 15+ messages in thread
From: Théo Lebrun @ 2026-03-04 18:24 UTC (permalink / raw)
  To: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, Théo Lebrun

Add a new buffer type (to `enum macb_tx_buff_type`). Near the end of
macb_tx_complete(), we go and read the XSK buffers using
xsk_tx_peek_release_desc_batch() and append those buffers to our Tx
ring.

Additionally, in macb_tx_complete(), we signal to the XSK subsystem the
number of bytes completed and conditionally mark the need_wakeup
flag.

Lastly, we update XSK wakeup by writing the TCOMP bit in the per-queue
IMR register, to ensure NAPI scheduling will take place.

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb.h      |  1 +
 drivers/net/ethernet/cadence/macb_main.c | 91 +++++++++++++++++++++++++++++---
 2 files changed, 86 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index a9e6f0289ecb..5700a285c08a 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -963,6 +963,7 @@ enum macb_tx_buff_type {
 	MACB_TYPE_SKB,
 	MACB_TYPE_XDP_TX,
 	MACB_TYPE_XDP_NDO,
+	MACB_TYPE_XSK,
 };
 
 /* struct macb_tx_buff - data about an skb or xdp frame which is being
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index ea1b0b8c4fab..fee1ebadcf20 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -986,21 +986,30 @@ static int macb_halt_tx(struct macb *bp)
 
 static void macb_tx_release_buff(void *buff, enum macb_tx_buff_type type, int budget)
 {
-	if (type == MACB_TYPE_SKB) {
+	switch (type) {
+	case MACB_TYPE_SKB:
 		napi_consume_skb(buff, budget);
-	} else if (type == MACB_TYPE_XDP_TX) {
-		if (!budget)
-			xdp_return_frame(buff);
-		else
+		break;
+	case MACB_TYPE_XDP_TX:
+		if (budget)
 			xdp_return_frame_rx_napi(buff);
-	} else {
+		else
+			xdp_return_frame(buff);
+		break;
+	case MACB_TYPE_XDP_NDO:
 		xdp_return_frame(buff);
+		break;
+	case MACB_TYPE_XSK:
+		break;
 	}
 }
 
 static void macb_tx_unmap(struct macb *bp, struct macb_tx_buff *tx_buff,
 			  int budget)
 {
+	if (tx_buff->type == MACB_TYPE_XSK)
+		return;
+
 	if (tx_buff->mapping) {
 		if (tx_buff->mapped_as_page)
 			dma_unmap_page(&bp->pdev->dev, tx_buff->mapping,
@@ -1255,6 +1264,57 @@ static void macb_xdp_submit_buff(struct macb *bp, unsigned int queue_index,
 		netif_stop_subqueue(netdev, queue_index);
 }
 
+static void macb_xdp_xmit_zc(struct macb *bp, unsigned int queue_index, int budget)
+{
+	struct macb_queue *queue = &bp->queues[queue_index];
+	struct xsk_buff_pool *xsk = queue->xsk_pool;
+	dma_addr_t mapping;
+	u32 slot_available;
+	size_t bytes = 0;
+	u32 batch;
+
+	guard(spinlock_irqsave)(&queue->tx_ptr_lock);
+
+	/* This is a hard error, log it. */
+	slot_available = CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size);
+	if (slot_available < 1) {
+		netif_stop_subqueue(bp->dev, queue_index);
+		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
+			   queue->tx_head, queue->tx_tail);
+		return;
+	}
+
+	batch = min_t(u32, slot_available, budget);
+	batch = xsk_tx_peek_release_desc_batch(xsk, batch);
+	if (!batch)
+		return;
+
+	for (u32 i = 0; i < batch; i++) {
+		struct xdp_desc *desc = &xsk->tx_descs[i];
+
+		mapping = xsk_buff_raw_get_dma(xsk, desc->addr);
+		xsk_buff_raw_dma_sync_for_device(xsk, mapping, desc->len);
+
+		macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){
+			.ptr = NULL,
+			.mapping = mapping,
+			.size = desc->len,
+			.mapped_as_page = false,
+			.type = MACB_TYPE_XSK,
+		});
+
+		bytes += desc->len;
+	}
+
+	/* Make newly initialized descriptor visible to hardware */
+	wmb();
+	spin_lock(&bp->lock);
+	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
+	spin_unlock(&bp->lock);
+
+	netdev_tx_sent_queue(netdev_get_tx_queue(bp->dev, queue_index), bytes);
+}
+
 static int macb_tx_complete(struct macb_queue *queue, int budget)
 {
 	struct macb *bp = queue->bp;
@@ -1316,6 +1376,11 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 		case MACB_TYPE_XDP_NDO:
 			bytes += tx_buff->size;
 			break;
+
+		case MACB_TYPE_XSK:
+			bytes += tx_buff->size;
+			xsk_frames++;
+			break;
 		}
 
 		packets++;
@@ -1337,6 +1402,16 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
 		netif_wake_subqueue(bp->dev, queue_index);
 	spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
 
+	if (queue->xsk_pool) {
+		if (xsk_frames)
+			xsk_tx_completed(queue->xsk_pool, xsk_frames);
+
+		if (xsk_uses_need_wakeup(queue->xsk_pool))
+			xsk_set_tx_need_wakeup(queue->xsk_pool);
+
+		macb_xdp_xmit_zc(bp, queue_index, budget);
+	}
+
 	return packets;
 }
 
@@ -1616,6 +1691,10 @@ static int gem_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
 	    !napi_if_scheduled_mark_missed(&queue->napi_rx))
 		irqs |= MACB_BIT(RCOMP);
 
+	if ((flags & XDP_WAKEUP_TX) &&
+	    !napi_if_scheduled_mark_missed(&queue->napi_tx))
+		irqs |= MACB_BIT(TCOMP);
+
 	if (irqs)
 		queue_writel(queue, IMR, irqs);
 

-- 
2.53.0



* Re: [PATCH net-next 0/8] net: macb: add XSK support
  2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
                   ` (7 preceding siblings ...)
  2026-03-04 18:24 ` [PATCH net-next 8/8] net: macb: add Tx " Théo Lebrun
@ 2026-03-06  3:11 ` Jakub Kicinski
  8 siblings, 0 replies; 15+ messages in thread
From: Jakub Kicinski @ 2026-03-06  3:11 UTC (permalink / raw)
  To: Théo Lebrun
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Richard Cochran, netdev, linux-kernel, bpf, Vladimir Kondratiev,
	Gregory CLEMENT, Benoît Monin, Tawfik Bayouk,
	Thomas Petazzoni, Maxime Chevallier

On Wed, 04 Mar 2026 19:24:23 +0100 Théo Lebrun wrote:
> Applies on top of net-next (4ad96a7c9e2c) and Paolo's XDP work [0].

You have to wait until dependencies are merged.
If you want to post in the meantime it has to be as an RFC
-- 
pw-bot: cr


* Re: [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support
  2026-03-04 18:24 ` [PATCH net-next 8/8] net: macb: add Tx " Théo Lebrun
@ 2026-03-06 12:48   ` Maxime Chevallier
  2026-03-06 17:18     ` Théo Lebrun
  0 siblings, 1 reply; 15+ messages in thread
From: Maxime Chevallier @ 2026-03-06 12:48 UTC (permalink / raw)
  To: Théo Lebrun, Nicolas Ferre, Claudiu Beznea, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni

Hi Théo,

On 04/03/2026 19:24, Théo Lebrun wrote:
> Add a new buffer type (to `enum macb_tx_buff_type`). Near the end of
> macb_tx_complete(), we go and read the XSK buffers using
> xsk_tx_peek_release_desc_batch() and append those buffers to our Tx
> ring.
> 
> Additionally, in macb_tx_complete(), we signal to the XSK subsystem the
> number of bytes completed and conditionally mark the need_wakeup
> flag.
> 
> Lastly, we update XSK wakeup by writing the TCOMP bit in the per-queue
> IMR register, to ensure NAPI scheduling will take place.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> ---

[...]

> +static void macb_xdp_xmit_zc(struct macb *bp, unsigned int queue_index, int budget)
> +{
> +	struct macb_queue *queue = &bp->queues[queue_index];
> +	struct xsk_buff_pool *xsk = queue->xsk_pool;
> +	dma_addr_t mapping;
> +	u32 slot_available;
> +	size_t bytes = 0;
> +	u32 batch;
> +
> +	guard(spinlock_irqsave)(&queue->tx_ptr_lock);
> +
> +	/* This is a hard error, log it. */
> +	slot_available = CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size);
> +	if (slot_available < 1) {
> +		netif_stop_subqueue(bp->dev, queue_index);
> +		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
> +			   queue->tx_head, queue->tx_tail);
> +		return;
> +	}
> +
> +	batch = min_t(u32, slot_available, budget);
> +	batch = xsk_tx_peek_release_desc_batch(xsk, batch);
> +	if (!batch)
> +		return;
> +
> +	for (u32 i = 0; i < batch; i++) {
> +		struct xdp_desc *desc = &xsk->tx_descs[i];
> +
> +		mapping = xsk_buff_raw_get_dma(xsk, desc->addr);
> +		xsk_buff_raw_dma_sync_for_device(xsk, mapping, desc->len);
> +
> +		macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){
> +			.ptr = NULL,
> +			.mapping = mapping,
> +			.size = desc->len,
> +			.mapped_as_page = false,
> +			.type = MACB_TYPE_XSK,
> +		});
> +
> +		bytes += desc->len;
> +	}
> +
> +	/* Make newly initialized descriptor visible to hardware */
> +	wmb();
> +	spin_lock(&bp->lock);
> +	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
> +	spin_unlock(&bp->lock);

this lock is also taken in interrupt context, so it should probably use an
irqsave/restore variant. Now, there are a few other parts of this driver
that use a plain spin_lock() call and, except for the paths that actually
run in interrupt context, they don't seem correct to me :(

Maxime




* Re: [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support
  2026-03-06 12:48   ` Maxime Chevallier
@ 2026-03-06 17:18     ` Théo Lebrun
  2026-03-06 17:53       ` Maxime Chevallier
  0 siblings, 1 reply; 15+ messages in thread
From: Théo Lebrun @ 2026-03-06 17:18 UTC (permalink / raw)
  To: Maxime Chevallier, Théo Lebrun, Nicolas Ferre,
	Claudiu Beznea, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni

Hello!

On Fri Mar 6, 2026 at 1:48 PM CET, Maxime Chevallier wrote:
> On 04/03/2026 19:24, Théo Lebrun wrote:
>> Add a new buffer type (to `enum macb_tx_buff_type`). Near the end of
>> macb_tx_complete(), we go and read the XSK buffers using
>> xsk_tx_peek_release_desc_batch() and append those buffers to our Tx
>> ring.
>> 
>> Additionally, in macb_tx_complete(), we signal to the XSK subsystem the
>> number of bytes completed and conditionally mark the need_wakeup
>> flag.
>> 
>> Lastly, we update XSK wakeup by writing the TCOMP bit in the per-queue
>> IMR register, to ensure NAPI scheduling will take place.
>> 
>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
>> ---
>
> [...]
>
>> +static void macb_xdp_xmit_zc(struct macb *bp, unsigned int queue_index, int budget)
>> +{
>> +	struct macb_queue *queue = &bp->queues[queue_index];
>> +	struct xsk_buff_pool *xsk = queue->xsk_pool;
>> +	dma_addr_t mapping;
>> +	u32 slot_available;
>> +	size_t bytes = 0;
>> +	u32 batch;
>> +
>> +	guard(spinlock_irqsave)(&queue->tx_ptr_lock);
>> +
>> +	/* This is a hard error, log it. */
>> +	slot_available = CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size);
>> +	if (slot_available < 1) {
>> +		netif_stop_subqueue(bp->dev, queue_index);
>> +		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
>> +			   queue->tx_head, queue->tx_tail);
>> +		return;
>> +	}
>> +
>> +	batch = min_t(u32, slot_available, budget);
>> +	batch = xsk_tx_peek_release_desc_batch(xsk, batch);
>> +	if (!batch)
>> +		return;
>> +
>> +	for (u32 i = 0; i < batch; i++) {
>> +		struct xdp_desc *desc = &xsk->tx_descs[i];
>> +
>> +		mapping = xsk_buff_raw_get_dma(xsk, desc->addr);
>> +		xsk_buff_raw_dma_sync_for_device(xsk, mapping, desc->len);
>> +
>> +		macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){
>> +			.ptr = NULL,
>> +			.mapping = mapping,
>> +			.size = desc->len,
>> +			.mapped_as_page = false,
>> +			.type = MACB_TYPE_XSK,
>> +		});
>> +
>> +		bytes += desc->len;
>> +	}
>> +
>> +	/* Make newly initialized descriptor visible to hardware */
>> +	wmb();
>> +	spin_lock(&bp->lock);
>> +	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
>> +	spin_unlock(&bp->lock);
>
> this lock is also taken in interrupt context, this should probably use a
> irqsave/restore variant. Now, there are a few other parts of this driver
> that use a plain spin_lock() call and except for the paths that actually
> run in interrupt context, they don't seem correct to me :(

I almost sent a reply agreeing with you, but here is actually the
exhaustive list of `spin_lock(&bp->lock)` call sites:

   #   Function                Context
   ------------------------------------------
   1   gem_wol_interrupt()     irq
   2   macb_interrupt()        irq
   3   macb_wol_interrupt()    irq
   4   macb_tx_error_task()    workqueue/user
   5   macb_tx_restart()       napi/softirq
   6   macb_xdp_xmit_zc()      napi/softirq
   7   macb_start_xmit()       user
   8   macb_xdp_submit_frame() user

And all contexts are safe, because the non-IRQ contexts (#4-8) always
use this sequence:

   spin_lock_irqsave(&queue->tx_ptr_lock, flags);
   spin_lock(&bp->lock);
   spin_unlock(&bp->lock);
   spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);

So queue->tx_ptr_lock always wraps bp->lock and does the local CPU IRQ
disabling.

(I also checked we don't risk an ABBA deadlock, and we don't: all code
acquires queue->tx_ptr_lock THEN bp->lock.)

However, there is still a bug in the code you quoted: setting
MACB_BIT(TSTART) is done twice by macb_xdp_xmit_zc():
 - once in the helper function macb_xdp_submit_buff(), and
 - once in its own body (the code you quoted).
This is fixed for V2!

Thanks Maxime,
Have a nice weekend,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support
  2026-03-06 17:18     ` Théo Lebrun
@ 2026-03-06 17:53       ` Maxime Chevallier
  2026-03-09 10:56         ` Théo Lebrun
  0 siblings, 1 reply; 15+ messages in thread
From: Maxime Chevallier @ 2026-03-06 17:53 UTC (permalink / raw)
  To: Théo Lebrun, Nicolas Ferre, Claudiu Beznea, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni

Hi,

On 06/03/2026 18:18, Théo Lebrun wrote:
> Hello!
> 
> On Fri Mar 6, 2026 at 1:48 PM CET, Maxime Chevallier wrote:
>> On 04/03/2026 19:24, Théo Lebrun wrote:
>>> Add a new buffer type (to `enum macb_tx_buff_type`). Near the end of
>>> macb_tx_complete(), we go and read the XSK buffers using
>>> xsk_tx_peek_release_desc_batch() and append those buffers to our Tx
>>> ring.
>>>
>>> Additionally, in macb_tx_complete(), we signal to the XSK subsystem
>>> number of bytes completed and conditionally mark the need_wakeup
>>> flag.
>>>
>>> Lastly, we update XSK wakeup by writing the TCOMP bit in the per-queue
>>> IMR register, to ensure NAPI scheduling will take place.
>>>
>>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
>>> ---
>>
>> [...]
>>
>>> +static void macb_xdp_xmit_zc(struct macb *bp, unsigned int queue_index, int budget)
>>> +{
>>> +	struct macb_queue *queue = &bp->queues[queue_index];
>>> +	struct xsk_buff_pool *xsk = queue->xsk_pool;
>>> +	dma_addr_t mapping;
>>> +	u32 slot_available;
>>> +	size_t bytes = 0;
>>> +	u32 batch;
>>> +
>>> +	guard(spinlock_irqsave)(&queue->tx_ptr_lock);
>>> +
>>> +	/* This is a hard error, log it. */
>>> +	slot_available = CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size);
>>> +	if (slot_available < 1) {
>>> +		netif_stop_subqueue(bp->dev, queue_index);
>>> +		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
>>> +			   queue->tx_head, queue->tx_tail);
>>> +		return;
>>> +	}
>>> +
>>> +	batch = min_t(u32, slot_available, budget);
>>> +	batch = xsk_tx_peek_release_desc_batch(xsk, batch);
>>> +	if (!batch)
>>> +		return;
>>> +
>>> +	for (u32 i = 0; i < batch; i++) {
>>> +		struct xdp_desc *desc = &xsk->tx_descs[i];
>>> +
>>> +		mapping = xsk_buff_raw_get_dma(xsk, desc->addr);
>>> +		xsk_buff_raw_dma_sync_for_device(xsk, mapping, desc->len);
>>> +
>>> +		macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){
>>> +			.ptr = NULL,
>>> +			.mapping = mapping,
>>> +			.size = desc->len,
>>> +			.mapped_as_page = false,
>>> +			.type = MACB_TYPE_XSK,
>>> +		});
>>> +
>>> +		bytes += desc->len;
>>> +	}
>>> +
>>> +	/* Make newly initialized descriptor visible to hardware */
>>> +	wmb();
>>> +	spin_lock(&bp->lock);
>>> +	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
>>> +	spin_unlock(&bp->lock);
>>
>> this lock is also taken in interrupt context, this should probably use a
>> irqsave/restore variant. Now, there are a few other parts of this driver
>> that use a plain spin_lock() call and except for the paths that actually
>> run in interrupt context, they don't seem correct to me :(
> 
> I almost sent a reply agreeing with you, but actually here is the
> exhaustive `spin_lock(&bp->lock)` list:
> 
>    #   Function                Context
>    ------------------------------------------
>    1   gem_wol_interrupt()     irq
>    2   macb_interrupt()        irq
>    3   macb_wol_interrupt()    irq
>    4   macb_tx_error_task()    workqueue/user
>    5   macb_tx_restart()       napi/softirq
>    6   macb_xdp_xmit_zc()      napi/softirq
>    7   macb_start_xmit()       user
>    8   macb_xdp_submit_frame() user
> 
> And all contexts are safe because it always is this sequence in non-IRQ
> contexts (#4-8):
> 
>    spin_lock_irqsave(&queue->tx_ptr_lock, flags);
>    spin_lock(&bp->lock);
>    spin_unlock(&bp->lock);
>    spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);

Is it because of the guard statement?

  guard(spinlock_irqsave)(&queue->tx_ptr_lock);

It really doesn't make it obvious that this is how it plays out :(

> 
> So bp->tx_ptr_lock always wraps bp->lock and does the local CPU IRQ
> disabling.
> 
> (I also checked we don't risk ABBA deadlock, and we don't: all code
> acquires bp->tx_ptr_lock THEN bp->lock.)
> 
> However, there is still a bug in the code you quoted: setting
> BIT(TSTART) is done twice by macb_xdp_xmit_zc():
>  - once in the helper function macb_xdp_submit_buff() and,
>  - once in its own body (code you quoted)
> This is fixed for V2!

great :)

Maxime

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 8/8] net: macb: add Tx zero-copy AF_XDP support
  2026-03-06 17:53       ` Maxime Chevallier
@ 2026-03-09 10:56         ` Théo Lebrun
  0 siblings, 0 replies; 15+ messages in thread
From: Théo Lebrun @ 2026-03-09 10:56 UTC (permalink / raw)
  To: Maxime Chevallier, Théo Lebrun, Nicolas Ferre,
	Claudiu Beznea, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Richard Cochran
  Cc: netdev, linux-kernel, bpf, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni

Hello Maxime,

On Fri Mar 6, 2026 at 6:53 PM CET, Maxime Chevallier wrote:
> On 06/03/2026 18:18, Théo Lebrun wrote:
>> Hello!
>> 
>> On Fri Mar 6, 2026 at 1:48 PM CET, Maxime Chevallier wrote:
>>> On 04/03/2026 19:24, Théo Lebrun wrote:
>>>> Add a new buffer type (to `enum macb_tx_buff_type`). Near the end of
>>>> macb_tx_complete(), we go and read the XSK buffers using
>>>> xsk_tx_peek_release_desc_batch() and append those buffers to our Tx
>>>> ring.
>>>>
>>>> Additionally, in macb_tx_complete(), we signal to the XSK subsystem
>>>> number of bytes completed and conditionally mark the need_wakeup
>>>> flag.
>>>>
>>>> Lastly, we update XSK wakeup by writing the TCOMP bit in the per-queue
>>>> IMR register, to ensure NAPI scheduling will take place.
>>>>
>>>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
>>>> ---
>>>
>>> [...]
>>>
>>>> +static void macb_xdp_xmit_zc(struct macb *bp, unsigned int queue_index, int budget)
>>>> +{
>>>> +	struct macb_queue *queue = &bp->queues[queue_index];
>>>> +	struct xsk_buff_pool *xsk = queue->xsk_pool;
>>>> +	dma_addr_t mapping;
>>>> +	u32 slot_available;
>>>> +	size_t bytes = 0;
>>>> +	u32 batch;
>>>> +
>>>> +	guard(spinlock_irqsave)(&queue->tx_ptr_lock);
>>>> +
>>>> +	/* This is a hard error, log it. */
>>>> +	slot_available = CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size);
>>>> +	if (slot_available < 1) {
>>>> +		netif_stop_subqueue(bp->dev, queue_index);
>>>> +		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
>>>> +			   queue->tx_head, queue->tx_tail);
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	batch = min_t(u32, slot_available, budget);
>>>> +	batch = xsk_tx_peek_release_desc_batch(xsk, batch);
>>>> +	if (!batch)
>>>> +		return;
>>>> +
>>>> +	for (u32 i = 0; i < batch; i++) {
>>>> +		struct xdp_desc *desc = &xsk->tx_descs[i];
>>>> +
>>>> +		mapping = xsk_buff_raw_get_dma(xsk, desc->addr);
>>>> +		xsk_buff_raw_dma_sync_for_device(xsk, mapping, desc->len);
>>>> +
>>>> +		macb_xdp_submit_buff(bp, queue_index, (struct macb_tx_buff){
>>>> +			.ptr = NULL,
>>>> +			.mapping = mapping,
>>>> +			.size = desc->len,
>>>> +			.mapped_as_page = false,
>>>> +			.type = MACB_TYPE_XSK,
>>>> +		});
>>>> +
>>>> +		bytes += desc->len;
>>>> +	}
>>>> +
>>>> +	/* Make newly initialized descriptor visible to hardware */
>>>> +	wmb();
>>>> +	spin_lock(&bp->lock);
>>>> +	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
>>>> +	spin_unlock(&bp->lock);
>>>
>>> this lock is also taken in interrupt context, this should probably use a
>>> irqsave/restore variant. Now, there are a few other parts of this driver
>>> that use a plain spin_lock() call and except for the paths that actually
>>> run in interrupt context, they don't seem correct to me :(
>> 
>> I almost sent a reply agreeing with you, but actually here is the
>> exhaustive `spin_lock(&bp->lock)` list:
>> 
>>    #   Function                Context
>>    ------------------------------------------
>>    1   gem_wol_interrupt()     irq
>>    2   macb_interrupt()        irq
>>    3   macb_wol_interrupt()    irq
>>    4   macb_tx_error_task()    workqueue/user
>>    5   macb_tx_restart()       napi/softirq
>>    6   macb_xdp_xmit_zc()      napi/softirq
>>    7   macb_start_xmit()       user
>>    8   macb_xdp_submit_frame() user
>> 
>> And all contexts are safe because it always is this sequence in non-IRQ
>> contexts (#4-8):
>> 
>>    spin_lock_irqsave(&queue->tx_ptr_lock, flags);
>>    spin_lock(&bp->lock);
>>    spin_unlock(&bp->lock);
>>    spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
>
> Is it because of the guard statement ?
>
>   guard(spinlock_irqsave)(&queue->tx_ptr_lock);
>
> It really doesn't make it obvious that this is how it plays out :(

Yes! A guard performs one operation when it is created and another at
the end of its scope (in our case, at the end of macb_xdp_xmit_zc()).
That way we can't forget the cleanup, and we can do early returns
without a ladder of labels and gotos (and without messing up along the
way).

It uses the __attribute__((cleanup(cleanup_function))) compiler feature,
that is aliased to `__cleanup()` in the kernel.
https://gcc.gnu.org/onlinedocs/gcc/Common-Attributes.html#index-cleanup
https://elixir.bootlin.com/linux/v6.19.6/source/include/linux/compiler_attributes.h#L76

Guard definition for `spinlock_irqsave`:
https://elixir.bootlin.com/linux/v6.19.6/source/include/linux/spinlock.h#L585-L588
(delving into those macros is not recommended)

Code documentation is good:
https://elixir.bootlin.com/linux/v6.19.6/source/include/linux/cleanup.h#L10

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2026-03-09 10:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-04 18:24 [PATCH net-next 0/8] net: macb: add XSK support Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 1/8] net: macb: make rx error messages rate-limited Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 2/8] net: macb: account for stats in Rx XDP codepaths Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 3/8] net: macb: account for stats in Tx " Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 4/8] net: macb: drop handling of recycled buffers in gem_rx_refill() Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 5/8] net: macb: move macb_xdp_submit_frame() body to helper function Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 6/8] net: macb: add infrastructure for XSK buffer pool Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 7/8] net: macb: add Rx zero-copy AF_XDP support Théo Lebrun
2026-03-04 18:24 ` [PATCH net-next 8/8] net: macb: add Tx " Théo Lebrun
2026-03-06 12:48   ` Maxime Chevallier
2026-03-06 17:18     ` Théo Lebrun
2026-03-06 17:53       ` Maxime Chevallier
2026-03-09 10:56         ` Théo Lebrun
2026-03-06  3:11 ` [PATCH net-next 0/8] net: macb: add XSK support Jakub Kicinski
  -- strict thread matches above, loose matches on Subject: below --
2026-03-04 18:23 Théo Lebrun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox