* [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support
@ 2026-02-03 5:23 Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 01/15] net: fec: add fec_txq_trigger_xmit() helper Wei Fang
` (14 more replies)
0 siblings, 15 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
This patch set optimizes the XDP copy mode logic as follows.
1. Separate the processing of RX XDP frames from fec_enet_rx_queue() and
add a dedicated function, fec_enet_rx_queue_xdp(), for handling XDP
frames.
2. For TX XDP packets, send frames in batches to avoid frequent MMIO
writes.
3. Use a switch statement instead of if...else... to check the tx_buf
type, making the TX BD ring cleanup logic clearer and more efficient.
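The batching idea in item 2 can be sketched as counting doorbell MMIO writes: triggering per frame costs one register write per packet, while queuing a burst of descriptors and ringing the doorbell once costs a single write. The sketch below is purely illustrative; the function names and structure are hypothetical stand-ins, not the actual driver code.

```c
#include <assert.h>

/* Per-frame trigger: one writel() to the TX "descriptor active"
 * register for every queued frame. */
static int doorbell_writes_per_frame(int nframes)
{
	int writes = 0;

	for (int i = 0; i < nframes; i++) {
		/* ...queue one TX descriptor here... */
		writes++;	/* writel(0, reg_desc_active) per frame */
	}
	return writes;
}

/* Batched trigger: queue every descriptor first, then one doorbell. */
static int doorbell_writes_bulk(int nframes)
{
	for (int i = 0; i < nframes; i++) {
		/* ...queue one TX descriptor, no MMIO yet... */
	}
	return nframes ? 1 : 0;	/* single writel() after the loop */
}
```

For a 16-frame burst this is 16 MMIO writes versus 1, which is where the xdp-bench tx/redirect gains below come from.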
We compared the performance of XDP copy mode before and after applying
this patch set, and the results show that the performance has improved.
Before applying this patch set:
root@imx93evk:~# ./xdp-bench tx eth0
Summary 396,868 rx/s 0 err,drop/s
Summary 396,024 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench drop eth0
Summary 684,781 rx/s 0 err/s
Summary 675,746 rx/s 0 err/s
root@imx93evk:~# ./xdp-bench pass eth0
Summary 208,552 rx/s 0 err,drop/s
Summary 208,654 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench redirect eth0 eth0
eth0->eth0 311,210 rx/s 0 err,drop/s 311,208 xmit/s
eth0->eth0 310,808 rx/s 0 err,drop/s 310,809 xmit/s
After applying this patch set:
root@imx93evk:~# ./xdp-bench tx eth0
Summary 425,778 rx/s 0 err,drop/s
Summary 426,042 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench drop eth0
Summary 698,351 rx/s 0 err/s
Summary 701,882 rx/s 0 err/s
root@imx93evk:~# ./xdp-bench pass eth0
Summary 210,348 rx/s 0 err,drop/s
Summary 210,016 rx/s 0 err,drop/s
root@imx93evk:~# ./xdp-bench redirect eth0 eth0
eth0->eth0 354,407 rx/s 0 err,drop/s 354,401 xmit/s
eth0->eth0 350,381 rx/s 0 err,drop/s 350,389 xmit/s
This patch set also adds AF_XDP zero-copy support, and we tested the
performance on the i.MX93 platform with the xdpsock tool. The following
is the performance comparison of copy mode and zero-copy mode; it shows
that zero-copy mode performs better than copy mode.
1. MAC swap L2 forwarding
1.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -z
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 414715 415455
tx 414715 415455
1.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -c
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 356396 356609
tx 356396 356609
2. TX only
2.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -z
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 1119573 1126720
2.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -c
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 406864 407616
---
v6:
1. Remove fec_xdp_drop() from XDP_PASS branch of fec_enet_rx_queue_xdp()
v5 link: https://lore.kernel.org/imx/20260129103447.1893129-1-wei.fang@nxp.com/
v5:
1. Update the commit message of patch 6 and 13.
2. Fix a potential memory leak in fec_enet_alloc_rxq_buffers() by
freeing the buffers on failure.
3. Only initialize bdp->cbd_bufaddr in fec_alloc_rxq_buffers_pp()
because the other fields of bdp are initialized in fec_enet_bd_init().
Also move the initialization of ebdp to fec_enet_bd_init().
4. Refactor fec_alloc_rxq_buffers_zc().
5. Add a NULL check for rxq->rx_buf[index].xdp in
fec_enet_rx_queue_xsk(); for example, in the xdpsock TX-only use case,
the RX buffers are all NULL.
6. Refactor fec_setup_xsk_pool(), adding new helpers such as
fec_enable_xsk_pool(), fec_disable_xsk_pool(), fec_alloc_new_rxq_xsk(),
fec_alloc_new_rxq_pp() and fec_xsk_restart_napi(). This refactoring
ensures that when fec_setup_xsk_pool() fails, it can revert to the
original rxq configuration.
v4 link: https://lore.kernel.org/netdev/20260123022143.4121797-1-wei.fang@nxp.com/
v4:
1. Remove the unneeded semicolon in fec_enet_bd_init()
v3 link: https://lore.kernel.org/imx/20260121033357.3261464-1-wei.fang@nxp.com/
v3:
1. Fix build warning due to missing break for the FEC_TXBUF_T_XSK_TX
branch in fec_enet_bd_init().
2. Add new patch 7, which is separated from patch 6.
3. Collect Reviewed-by tags.
v2 link: https://lore.kernel.org/imx/20260116074027.1603841-1-wei.fang@nxp.com/
v2:
1. Improve the commit message
2. Remove the unused variable dma from fec_enet_rx_queue_xsk() to fix
the build warning
3. Remove fec_txq_trigger_xmit() from fec_enet_xsk_tx_xmit()
4. Separate some new patches, patch 4, 8, 13.
5. Collect Reviewed-by tags.
v1 link: https://lore.kernel.org/imx/20260113032939.3705137-1-wei.fang@nxp.com/
---
Wei Fang (15):
net: fec: add fec_txq_trigger_xmit() helper
net: fec: add fec_rx_error_check() to check RX errors
net: fec: add rx_shift to indicate the extra bytes padded in front of
RX frame
net: fec: add fec_build_skb() to build a skb
net: fec: improve fec_enet_rx_queue()
net: fec: add fec_enet_rx_queue_xdp() for XDP path
net: fec: add tx_qid parameter to fec_enet_xdp_tx_xmit()
net: fec: transmit XDP frames in bulk
net: fec: remove unnecessary NULL pointer check when clearing TX BD
ring
net: fec: use switch statement to check the type of tx_buf
net: fec: remove the size parameter from fec_enet_create_page_pool()
net: fec: move xdp_rxq_info* APIs out of fec_enet_create_page_pool()
net: fec: add fec_alloc_rxq_buffers_pp() to allocate buffers from page
pool
net: fec: improve fec_enet_tx_queue()
net: fec: add AF_XDP zero-copy support
drivers/net/ethernet/freescale/fec.h | 14 +-
drivers/net/ethernet/freescale/fec_main.c | 1640 ++++++++++++++++-----
2 files changed, 1251 insertions(+), 403 deletions(-)
--
2.34.1
* [PATCH v6 net-next 01/15] net: fec: add fec_txq_trigger_xmit() helper
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 02/15] net: fec: add fec_rx_error_check() to check RX errors Wei Fang
` (13 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Currently, the workaround for FEC_QUIRK_ERR007885 is duplicated at three
call sites, so add the fec_txq_trigger_xmit() helper to make the code
more concise and reusable.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 32 ++++++++++-------------
1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index a1405c928525..ce433af02ad2 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -508,6 +508,17 @@ fec_enet_create_page_pool(struct fec_enet_private *fep,
return err;
}
+static void fec_txq_trigger_xmit(struct fec_enet_private *fep,
+ struct fec_enet_priv_tx_q *txq)
+{
+ if (!(fep->quirks & FEC_QUIRK_ERR007885) ||
+ !readl(txq->bd.reg_desc_active) ||
+ !readl(txq->bd.reg_desc_active) ||
+ !readl(txq->bd.reg_desc_active) ||
+ !readl(txq->bd.reg_desc_active))
+ writel(0, txq->bd.reg_desc_active);
+}
+
static struct bufdesc *
fec_enet_txq_submit_frag_skb(struct fec_enet_priv_tx_q *txq,
struct sk_buff *skb,
@@ -717,12 +728,7 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
txq->bd.cur = bdp;
/* Trigger transmission start */
- if (!(fep->quirks & FEC_QUIRK_ERR007885) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active))
- writel(0, txq->bd.reg_desc_active);
+ fec_txq_trigger_xmit(fep, txq);
return 0;
}
@@ -913,12 +919,7 @@ static int fec_enet_txq_submit_tso(struct fec_enet_priv_tx_q *txq,
txq->bd.cur = bdp;
/* Trigger transmission start */
- if (!(fep->quirks & FEC_QUIRK_ERR007885) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active))
- writel(0, txq->bd.reg_desc_active);
+ fec_txq_trigger_xmit(fep, txq);
return 0;
@@ -3936,12 +3937,7 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
txq->bd.cur = bdp;
/* Trigger transmission start */
- if (!(fep->quirks & FEC_QUIRK_ERR007885) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active) ||
- !readl(txq->bd.reg_desc_active))
- writel(0, txq->bd.reg_desc_active);
+ fec_txq_trigger_xmit(fep, txq);
return 0;
}
--
2.34.1
* [PATCH v6 net-next 02/15] net: fec: add fec_rx_error_check() to check RX errors
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 01/15] net: fec: add fec_txq_trigger_xmit() helper Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 03/15] net: fec: add rx_shift to indicate the extra bytes padded in front of RX frame Wei Fang
` (12 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Extract fec_rx_error_check() from fec_enet_rx_queue() to check RX
errors. This helper will also be used in the XDP and XDP zero-copy paths
in subsequent patches.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 58 ++++++++++++++---------
1 file changed, 36 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index ce433af02ad2..ee4a20d8c438 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1747,6 +1747,41 @@ static void fec_enet_rx_vlan(const struct net_device *ndev, struct sk_buff *skb)
}
}
+static int fec_rx_error_check(struct net_device *ndev, u16 status)
+{
+ if (status & (BD_ENET_RX_LG | BD_ENET_RX_SH | BD_ENET_RX_NO |
+ BD_ENET_RX_CR | BD_ENET_RX_OV | BD_ENET_RX_LAST |
+ BD_ENET_RX_CL)) {
+ ndev->stats.rx_errors++;
+
+ if (status & BD_ENET_RX_OV) {
+ /* FIFO overrun */
+ ndev->stats.rx_fifo_errors++;
+ return -EIO;
+ }
+
+ if (status & (BD_ENET_RX_LG | BD_ENET_RX_SH |
+ BD_ENET_RX_LAST)) {
+ /* Frame too long or too short. */
+ ndev->stats.rx_length_errors++;
+ if ((status & BD_ENET_RX_LAST) && net_ratelimit())
+ netdev_err(ndev, "rcv is not +last\n");
+ }
+
+ /* CRC Error */
+ if (status & BD_ENET_RX_CR)
+ ndev->stats.rx_crc_errors++;
+
+ /* Report late collisions as a frame error. */
+ if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
+ ndev->stats.rx_frame_errors++;
+
+ return -EIO;
+ }
+
+ return 0;
+}
+
/* During a receive, the bd_rx.cur points to the current incoming buffer.
* When we update through the ring, if the next incoming buffer has
* not been given to the system, we just set the empty indicator,
@@ -1807,29 +1842,8 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
/* Check for errors. */
status ^= BD_ENET_RX_LAST;
- if (status & (BD_ENET_RX_LG | BD_ENET_RX_SH | BD_ENET_RX_NO |
- BD_ENET_RX_CR | BD_ENET_RX_OV | BD_ENET_RX_LAST |
- BD_ENET_RX_CL)) {
- ndev->stats.rx_errors++;
- if (status & BD_ENET_RX_OV) {
- /* FIFO overrun */
- ndev->stats.rx_fifo_errors++;
- goto rx_processing_done;
- }
- if (status & (BD_ENET_RX_LG | BD_ENET_RX_SH
- | BD_ENET_RX_LAST)) {
- /* Frame too long or too short. */
- ndev->stats.rx_length_errors++;
- if (status & BD_ENET_RX_LAST)
- netdev_err(ndev, "rcv is not +last\n");
- }
- if (status & BD_ENET_RX_CR) /* CRC Error */
- ndev->stats.rx_crc_errors++;
- /* Report late collisions as a frame error. */
- if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
- ndev->stats.rx_frame_errors++;
+ if (unlikely(fec_rx_error_check(ndev, status)))
goto rx_processing_done;
- }
/* Process the incoming frame. */
ndev->stats.rx_packets++;
--
2.34.1
* [PATCH v6 net-next 03/15] net: fec: add rx_shift to indicate the extra bytes padded in front of RX frame
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 01/15] net: fec: add fec_txq_trigger_xmit() helper Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 02/15] net: fec: add fec_rx_error_check() to check RX errors Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 04/15] net: fec: add fec_build_skb() to build a skb Wei Fang
` (11 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
The FEC on some platforms supports RX FIFO shift-16, which means the
actual frame data starts at bit 16 of the first word read from the RX
FIFO, aligning the Ethernet payload on a 32-bit boundary. The MAC writes
two additional bytes in front of each frame received into the RX FIFO.
Currently, fec_enet_rx_queue() updates data_start, sub_len and the
rx_bytes statistics by checking whether FEC_QUIRK_HAS_RACC is set. This
makes the code less concise, so rx_shift is added to represent the
number of extra bytes padded in front of the RX frame. Furthermore, when
separate RX handling functions are added for XDP copy mode and zero-copy
mode in subsequent patches, it will no longer be necessary to check
FEC_QUIRK_HAS_RACC to update the corresponding variables.
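The offset arithmetic that rx_shift replaces can be sketched as below, mirroring the hunks in this patch. The concrete headroom and FCS values here are assumptions for the sketch (stand-ins for FEC_ENET_XDP_HEADROOM and the 4-byte frame check sequence), not taken from this series.

```c
#include <assert.h>

#define XDP_HEADROOM	256	/* stand-in for FEC_ENET_XDP_HEADROOM */
#define FCS_LEN		4	/* Ethernet frame check sequence */

/* Offset of the Ethernet header within the RX buffer: the XDP
 * headroom plus the pad bytes the MAC wrote (rx_shift). */
static unsigned int rx_data_start(unsigned int rx_shift)
{
	return XDP_HEADROOM + rx_shift;
}

/* Bytes to trim from the descriptor length (cbd_datlen): the FCS
 * plus the same pad bytes. */
static unsigned int rx_sub_len(unsigned int rx_shift)
{
	return FCS_LEN + rx_shift;
}
```

With FEC_QUIRK_HAS_RACC, rx_shift = 2 and both offsets grow by two bytes; without it, rx_shift = 0 and the original values are unchanged, so the quirk check disappears from the hot path.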
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec.h | 1 +
drivers/net/ethernet/freescale/fec_main.c | 21 ++++++++-------------
2 files changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index fd9a93d02f8e..ad7aba1a8536 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -643,6 +643,7 @@ struct fec_enet_private {
struct pm_qos_request pm_qos_req;
unsigned int tx_align;
+ unsigned int rx_shift;
/* hw interrupt coalesce */
unsigned int rx_pkts_itr;
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index ee4a20d8c438..c5d303f2aaed 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1800,22 +1800,14 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
struct bufdesc_ex *ebdp = NULL;
int index = 0;
bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
+ u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
struct bpf_prog *xdp_prog = READ_ONCE(fep->xdp_prog);
u32 ret, xdp_result = FEC_ENET_XDP_PASS;
- u32 data_start = FEC_ENET_XDP_HEADROOM;
+ u32 sub_len = 4 + fep->rx_shift;
int cpu = smp_processor_id();
struct xdp_buff xdp;
struct page *page;
__fec32 cbd_bufaddr;
- u32 sub_len = 4;
-
- /*If it has the FEC_QUIRK_HAS_RACC quirk property, the bit of
- * FEC_RACC_SHIFT16 is set by default in the probe function.
- */
- if (fep->quirks & FEC_QUIRK_HAS_RACC) {
- data_start += 2;
- sub_len += 2;
- }
#if defined(CONFIG_COLDFIRE) && !defined(CONFIG_COLDFIRE_COHERENT_DMA)
/*
@@ -1848,9 +1840,7 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
/* Process the incoming frame. */
ndev->stats.rx_packets++;
pkt_len = fec16_to_cpu(bdp->cbd_datlen);
- ndev->stats.rx_bytes += pkt_len;
- if (fep->quirks & FEC_QUIRK_HAS_RACC)
- ndev->stats.rx_bytes -= 2;
+ ndev->stats.rx_bytes += pkt_len - fep->rx_shift;
index = fec_enet_get_bd_index(bdp, &rxq->bd);
page = rxq->rx_buf[index];
@@ -4603,6 +4593,11 @@ fec_probe(struct platform_device *pdev)
ndev->max_mtu = fep->max_buf_size - VLAN_ETH_HLEN - ETH_FCS_LEN;
+ if (fep->quirks & FEC_QUIRK_HAS_RACC)
+ fep->rx_shift = 2;
+ else
+ fep->rx_shift = 0;
+
ret = register_netdev(ndev);
if (ret)
goto failed_register;
--
2.34.1
* [PATCH v6 net-next 04/15] net: fec: add fec_build_skb() to build a skb
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (2 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 03/15] net: fec: add rx_shift to indicate the extra bytes padded in front of RX frame Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 05/15] net: fec: improve fec_enet_rx_queue() Wei Fang
` (10 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Extract the helper fec_build_skb() from fec_enet_rx_queue() so that the
code for building a skb is centralized in one place, making
fec_enet_rx_queue() more concise and readable.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 106 ++++++++++++----------
1 file changed, 60 insertions(+), 46 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index c5d303f2aaed..851608d04084 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1782,6 +1782,59 @@ static int fec_rx_error_check(struct net_device *ndev, u16 status)
return 0;
}
+static struct sk_buff *fec_build_skb(struct fec_enet_private *fep,
+ struct fec_enet_priv_rx_q *rxq,
+ struct bufdesc *bdp,
+ struct page *page, u32 len)
+{
+ struct net_device *ndev = fep->netdev;
+ struct bufdesc_ex *ebdp;
+ struct sk_buff *skb;
+
+ skb = build_skb(page_address(page),
+ PAGE_SIZE << fep->pagepool_order);
+ if (unlikely(!skb)) {
+ page_pool_recycle_direct(rxq->page_pool, page);
+ ndev->stats.rx_dropped++;
+ if (net_ratelimit())
+ netdev_err(ndev, "build_skb failed\n");
+
+ return NULL;
+ }
+
+ skb_reserve(skb, FEC_ENET_XDP_HEADROOM + fep->rx_shift);
+ skb_put(skb, len);
+ skb_mark_for_recycle(skb);
+
+ /* Get offloads from the enhanced buffer descriptor */
+ if (fep->bufdesc_ex) {
+ ebdp = (struct bufdesc_ex *)bdp;
+
+ /* If this is a VLAN packet remove the VLAN Tag */
+ if (ebdp->cbd_esc & cpu_to_fec32(BD_ENET_RX_VLAN))
+ fec_enet_rx_vlan(ndev, skb);
+
+ /* Get receive timestamp from the skb */
+ if (fep->hwts_rx_en)
+ fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts),
+ skb_hwtstamps(skb));
+
+ if (fep->csum_flags & FLAG_RX_CSUM_ENABLED) {
+ if (!(ebdp->cbd_esc &
+ cpu_to_fec32(FLAG_RX_CSUM_ERROR)))
+ /* don't check it */
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ else
+ skb_checksum_none_assert(skb);
+ }
+ }
+
+ skb->protocol = eth_type_trans(skb, ndev);
+ skb_record_rx_queue(skb, rxq->bd.qid);
+
+ return skb;
+}
+
/* During a receive, the bd_rx.cur points to the current incoming buffer.
* When we update through the ring, if the next incoming buffer has
* not been given to the system, we just set the empty indicator,
@@ -1797,7 +1850,6 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
struct sk_buff *skb;
ushort pkt_len;
int pkt_received = 0;
- struct bufdesc_ex *ebdp = NULL;
int index = 0;
bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
@@ -1867,24 +1919,6 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
goto rx_processing_done;
}
- /* The packet length includes FCS, but we don't want to
- * include that when passing upstream as it messes up
- * bridging applications.
- */
- skb = build_skb(page_address(page),
- PAGE_SIZE << fep->pagepool_order);
- if (unlikely(!skb)) {
- page_pool_recycle_direct(rxq->page_pool, page);
- ndev->stats.rx_dropped++;
-
- netdev_err_once(ndev, "build_skb failed!\n");
- goto rx_processing_done;
- }
-
- skb_reserve(skb, data_start);
- skb_put(skb, pkt_len - sub_len);
- skb_mark_for_recycle(skb);
-
if (unlikely(need_swap)) {
u8 *data;
@@ -1892,34 +1926,14 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
swap_buffer(data, pkt_len);
}
- /* Extract the enhanced buffer descriptor */
- ebdp = NULL;
- if (fep->bufdesc_ex)
- ebdp = (struct bufdesc_ex *)bdp;
-
- /* If this is a VLAN packet remove the VLAN Tag */
- if (fep->bufdesc_ex &&
- (ebdp->cbd_esc & cpu_to_fec32(BD_ENET_RX_VLAN)))
- fec_enet_rx_vlan(ndev, skb);
-
- skb->protocol = eth_type_trans(skb, ndev);
-
- /* Get receive timestamp from the skb */
- if (fep->hwts_rx_en && fep->bufdesc_ex)
- fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts),
- skb_hwtstamps(skb));
-
- if (fep->bufdesc_ex &&
- (fep->csum_flags & FLAG_RX_CSUM_ENABLED)) {
- if (!(ebdp->cbd_esc & cpu_to_fec32(FLAG_RX_CSUM_ERROR))) {
- /* don't check it */
- skb->ip_summed = CHECKSUM_UNNECESSARY;
- } else {
- skb_checksum_none_assert(skb);
- }
- }
+ /* The packet length includes FCS, but we don't want to
+ * include that when passing upstream as it messes up
+ * bridging applications.
+ */
+ skb = fec_build_skb(fep, rxq, bdp, page, pkt_len - sub_len);
+ if (!skb)
+ goto rx_processing_done;
- skb_record_rx_queue(skb, queue_id);
napi_gro_receive(&fep->napi, skb);
rx_processing_done:
--
2.34.1
* [PATCH v6 net-next 05/15] net: fec: improve fec_enet_rx_queue()
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (3 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 04/15] net: fec: add fec_build_skb() to build a skb Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 06/15] net: fec: add fec_enet_rx_queue_xdp() for XDP path Wei Fang
` (9 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
This patch makes the following adjustments to fec_enet_rx_queue().
1. The function parameters are modified to maintain the same style as
subsequently added XDP-related interfaces.
2. Some variables are initialized at the time of declaration, and the
order of local variables is updated to follow the reverse xmas tree
style.
3. Replace variable cbd_bufaddr with dma.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 35 ++++++++++-------------
1 file changed, 15 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 851608d04084..7bc2eac01e6e 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1840,26 +1840,25 @@ static struct sk_buff *fec_build_skb(struct fec_enet_private *fep,
* not been given to the system, we just set the empty indicator,
* effectively tossing the packet.
*/
-static int
-fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
+static int fec_enet_rx_queue(struct fec_enet_private *fep,
+ u16 queue, int budget)
{
- struct fec_enet_private *fep = netdev_priv(ndev);
- struct fec_enet_priv_rx_q *rxq;
- struct bufdesc *bdp;
- unsigned short status;
- struct sk_buff *skb;
- ushort pkt_len;
- int pkt_received = 0;
- int index = 0;
- bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
+ struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
struct bpf_prog *xdp_prog = READ_ONCE(fep->xdp_prog);
+ bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
u32 ret, xdp_result = FEC_ENET_XDP_PASS;
+ struct net_device *ndev = fep->netdev;
+ struct bufdesc *bdp = rxq->bd.cur;
u32 sub_len = 4 + fep->rx_shift;
int cpu = smp_processor_id();
+ int pkt_received = 0;
+ u16 status, pkt_len;
+ struct sk_buff *skb;
struct xdp_buff xdp;
struct page *page;
- __fec32 cbd_bufaddr;
+ dma_addr_t dma;
+ int index;
#if defined(CONFIG_COLDFIRE) && !defined(CONFIG_COLDFIRE_COHERENT_DMA)
/*
@@ -1868,12 +1867,10 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
*/
flush_cache_all();
#endif
- rxq = fep->rx_queue[queue_id];
/* First, grab all of the stats for the incoming packet.
* These get messed up if we get called due to a busy condition.
*/
- bdp = rxq->bd.cur;
xdp_init_buff(&xdp, PAGE_SIZE << fep->pagepool_order, &rxq->xdp_rxq);
while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) {
@@ -1882,7 +1879,7 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
break;
pkt_received++;
- writel(FEC_ENET_RXF_GET(queue_id), fep->hwp + FEC_IEVENT);
+ writel(FEC_ENET_RXF_GET(queue), fep->hwp + FEC_IEVENT);
/* Check for errors. */
status ^= BD_ENET_RX_LAST;
@@ -1896,15 +1893,13 @@ fec_enet_rx_queue(struct net_device *ndev, u16 queue_id, int budget)
index = fec_enet_get_bd_index(bdp, &rxq->bd);
page = rxq->rx_buf[index];
- cbd_bufaddr = bdp->cbd_bufaddr;
+ dma = fec32_to_cpu(bdp->cbd_bufaddr);
if (fec_enet_update_cbd(rxq, bdp, index)) {
ndev->stats.rx_dropped++;
goto rx_processing_done;
}
- dma_sync_single_for_cpu(&fep->pdev->dev,
- fec32_to_cpu(cbd_bufaddr),
- pkt_len,
+ dma_sync_single_for_cpu(&fep->pdev->dev, dma, pkt_len,
DMA_FROM_DEVICE);
prefetch(page_address(page));
@@ -1980,7 +1975,7 @@ static int fec_enet_rx(struct net_device *ndev, int budget)
/* Make sure that AVB queues are processed first. */
for (i = fep->num_rx_queues - 1; i >= 0; i--)
- done += fec_enet_rx_queue(ndev, i, budget - done);
+ done += fec_enet_rx_queue(fep, i, budget - done);
return done;
}
--
2.34.1
* [PATCH v6 net-next 06/15] net: fec: add fec_enet_rx_queue_xdp() for XDP path
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (4 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 05/15] net: fec: improve fec_enet_rx_queue() Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-05 5:28 ` [v6,net-next,06/15] " Jakub Kicinski
2026-02-03 5:23 ` [PATCH v6 net-next 07/15] net: fec: add tx_qid parameter to fec_enet_xdp_tx_xmit() Wei Fang
` (8 subsequent siblings)
14 siblings, 1 reply; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Currently, the processing of XDP path packets and protocol stack packets
is mixed in fec_enet_rx_queue(), which makes the logic somewhat
confusing and debugging more difficult. Furthermore, each path carries
logic the other does not need: the kernel path does not need to call
xdp_init_buff(), and the XDP path does not support swap_buffer(),
because fec_enet_bpf() returns -EOPNOTSUPP on platforms that need
swap_buffer(). This prevents XDP from achieving its maximum performance.
Therefore, XDP packet processing has been separated from
fec_enet_rx_queue() by adding the fec_enet_rx_queue_xdp() function to
optimize the XDP path logic and improve XDP performance.
The XDP performance on the i.MX93 platform was compared before and after
applying this patch. The detailed results below show that the
performance has improved.
Env: i.MX93, packet size 64 bytes including FCS, only single core and RX
BD ring are used to receive packets, flow-control is off.
Before the patch is applied:
xdp-bench tx eth0
Summary 396,868 rx/s 0 err,drop/s
Summary 396,024 rx/s 0 err,drop/s
xdp-bench drop eth0
Summary 684,781 rx/s 0 err/s
Summary 675,746 rx/s 0 err/s
xdp-bench pass eth0
Summary 208,552 rx/s 0 err,drop/s
Summary 208,654 rx/s 0 err,drop/s
xdp-bench redirect eth0 eth0
eth0->eth0 311,210 rx/s 0 err,drop/s 311,208 xmit/s
eth0->eth0 310,808 rx/s 0 err,drop/s 310,809 xmit/s
After the patch is applied:
xdp-bench tx eth0
Summary 409,975 rx/s 0 err,drop/s
Summary 411,073 rx/s 0 err,drop/s
xdp-bench drop eth0
Summary 700,681 rx/s 0 err/s
Summary 698,102 rx/s 0 err/s
xdp-bench pass eth0
Summary 211,356 rx/s 0 err,drop/s
Summary 210,629 rx/s 0 err,drop/s
xdp-bench redirect eth0 eth0
eth0->eth0 320,351 rx/s 0 err,drop/s 320,348 xmit/s
eth0->eth0 318,988 rx/s 0 err,drop/s 318,988 xmit/s
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 255 ++++++++++++++--------
1 file changed, 169 insertions(+), 86 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 7bc2eac01e6e..aafde4378f65 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1666,71 +1666,6 @@ static int fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
return 0;
}
-static u32
-fec_enet_run_xdp(struct fec_enet_private *fep, struct bpf_prog *prog,
- struct xdp_buff *xdp, struct fec_enet_priv_rx_q *rxq, int cpu)
-{
- unsigned int sync, len = xdp->data_end - xdp->data;
- u32 ret = FEC_ENET_XDP_PASS;
- struct page *page;
- int err;
- u32 act;
-
- act = bpf_prog_run_xdp(prog, xdp);
-
- /* Due xdp_adjust_tail and xdp_adjust_head: DMA sync for_device cover
- * max len CPU touch
- */
- sync = xdp->data_end - xdp->data;
- sync = max(sync, len);
-
- switch (act) {
- case XDP_PASS:
- rxq->stats[RX_XDP_PASS]++;
- ret = FEC_ENET_XDP_PASS;
- break;
-
- case XDP_REDIRECT:
- rxq->stats[RX_XDP_REDIRECT]++;
- err = xdp_do_redirect(fep->netdev, xdp, prog);
- if (unlikely(err))
- goto xdp_err;
-
- ret = FEC_ENET_XDP_REDIR;
- break;
-
- case XDP_TX:
- rxq->stats[RX_XDP_TX]++;
- err = fec_enet_xdp_tx_xmit(fep, cpu, xdp, sync);
- if (unlikely(err)) {
- rxq->stats[RX_XDP_TX_ERRORS]++;
- goto xdp_err;
- }
-
- ret = FEC_ENET_XDP_TX;
- break;
-
- default:
- bpf_warn_invalid_xdp_action(fep->netdev, prog, act);
- fallthrough;
-
- case XDP_ABORTED:
- fallthrough; /* handle aborts by dropping packet */
-
- case XDP_DROP:
- rxq->stats[RX_XDP_DROP]++;
-xdp_err:
- ret = FEC_ENET_XDP_CONSUMED;
- page = virt_to_head_page(xdp->data);
- page_pool_put_page(rxq->page_pool, page, sync, true);
- if (act != XDP_DROP)
- trace_xdp_exception(fep->netdev, prog, act);
- break;
- }
-
- return ret;
-}
-
static void fec_enet_rx_vlan(const struct net_device *ndev, struct sk_buff *skb)
{
if (ndev->features & NETIF_F_HW_VLAN_CTAG_RX) {
@@ -1843,19 +1778,14 @@ static struct sk_buff *fec_build_skb(struct fec_enet_private *fep,
static int fec_enet_rx_queue(struct fec_enet_private *fep,
u16 queue, int budget)
{
- u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
- struct bpf_prog *xdp_prog = READ_ONCE(fep->xdp_prog);
bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
- u32 ret, xdp_result = FEC_ENET_XDP_PASS;
struct net_device *ndev = fep->netdev;
struct bufdesc *bdp = rxq->bd.cur;
u32 sub_len = 4 + fep->rx_shift;
- int cpu = smp_processor_id();
int pkt_received = 0;
u16 status, pkt_len;
struct sk_buff *skb;
- struct xdp_buff xdp;
struct page *page;
dma_addr_t dma;
int index;
@@ -1871,8 +1801,6 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
/* First, grab all of the stats for the incoming packet.
* These get messed up if we get called due to a busy condition.
*/
- xdp_init_buff(&xdp, PAGE_SIZE << fep->pagepool_order, &rxq->xdp_rxq);
-
while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) {
if (pkt_received >= budget)
@@ -1903,17 +1831,6 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
DMA_FROM_DEVICE);
prefetch(page_address(page));
- if (xdp_prog) {
- xdp_buff_clear_frags_flag(&xdp);
- /* subtract 16bit shift and FCS */
- xdp_prepare_buff(&xdp, page_address(page),
- data_start, pkt_len - sub_len, false);
- ret = fec_enet_run_xdp(fep, xdp_prog, &xdp, rxq, cpu);
- xdp_result |= ret;
- if (ret != FEC_ENET_XDP_PASS)
- goto rx_processing_done;
- }
-
if (unlikely(need_swap)) {
u8 *data;
@@ -1962,7 +1879,167 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
}
rxq->bd.cur = bdp;
- if (xdp_result & FEC_ENET_XDP_REDIR)
+ return pkt_received;
+}
+
+static void fec_xdp_drop(struct fec_enet_priv_rx_q *rxq,
+ struct xdp_buff *xdp, u32 sync)
+{
+ struct page *page = virt_to_head_page(xdp->data);
+
+ page_pool_put_page(rxq->page_pool, page, sync, true);
+}
+
+static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
+ int budget, struct bpf_prog *prog)
+{
+ u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
+ struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
+ struct net_device *ndev = fep->netdev;
+ struct bufdesc *bdp = rxq->bd.cur;
+ u32 sub_len = 4 + fep->rx_shift;
+ int cpu = smp_processor_id();
+ int pkt_received = 0;
+ struct sk_buff *skb;
+ u16 status, pkt_len;
+ struct xdp_buff xdp;
+ struct page *page;
+ u32 xdp_res = 0;
+ dma_addr_t dma;
+ int index, err;
+ u32 act, sync;
+
+#if defined(CONFIG_COLDFIRE) && !defined(CONFIG_COLDFIRE_COHERENT_DMA)
+ /*
+ * Hacky flush of all caches instead of using the DMA API for the TSO
+ * headers.
+ */
+ flush_cache_all();
+#endif
+
+ xdp_init_buff(&xdp, PAGE_SIZE << fep->pagepool_order, &rxq->xdp_rxq);
+
+ while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) {
+ if (pkt_received >= budget)
+ break;
+ pkt_received++;
+
+ writel(FEC_ENET_RXF_GET(queue), fep->hwp + FEC_IEVENT);
+
+ /* Check for errors. */
+ status ^= BD_ENET_RX_LAST;
+ if (unlikely(fec_rx_error_check(ndev, status)))
+ goto rx_processing_done;
+
+ /* Process the incoming frame. */
+ ndev->stats.rx_packets++;
+ pkt_len = fec16_to_cpu(bdp->cbd_datlen);
+ ndev->stats.rx_bytes += pkt_len - fep->rx_shift;
+
+ index = fec_enet_get_bd_index(bdp, &rxq->bd);
+ page = rxq->rx_buf[index];
+ dma = fec32_to_cpu(bdp->cbd_bufaddr);
+
+ if (fec_enet_update_cbd(rxq, bdp, index)) {
+ ndev->stats.rx_dropped++;
+ goto rx_processing_done;
+ }
+
+ dma_sync_single_for_cpu(&fep->pdev->dev, dma, pkt_len,
+ DMA_FROM_DEVICE);
+ prefetch(page_address(page));
+
+ xdp_buff_clear_frags_flag(&xdp);
+ /* subtract 16bit shift and FCS */
+ pkt_len -= sub_len;
+ xdp_prepare_buff(&xdp, page_address(page), data_start,
+ pkt_len, false);
+
+ act = bpf_prog_run_xdp(prog, &xdp);
+ /* Due to xdp_adjust_tail and xdp_adjust_head, the DMA sync
+ * for_device must cover the max length the CPU touched.
+ */
+ sync = xdp.data_end - xdp.data;
+ sync = max(sync, pkt_len);
+
+ switch (act) {
+ case XDP_PASS:
+ rxq->stats[RX_XDP_PASS]++;
+ /* The packet length includes FCS, but we don't want to
+ * include that when passing upstream as it messes up
+ * bridging applications.
+ */
+ skb = fec_build_skb(fep, rxq, bdp, page, pkt_len);
+ if (!skb)
+ trace_xdp_exception(ndev, prog, XDP_PASS);
+ else
+ napi_gro_receive(&fep->napi, skb);
+
+ break;
+ case XDP_REDIRECT:
+ rxq->stats[RX_XDP_REDIRECT]++;
+ err = xdp_do_redirect(ndev, &xdp, prog);
+ if (unlikely(err)) {
+ fec_xdp_drop(rxq, &xdp, sync);
+ trace_xdp_exception(ndev, prog, XDP_REDIRECT);
+ } else {
+ xdp_res |= FEC_ENET_XDP_REDIR;
+ }
+ break;
+ case XDP_TX:
+ rxq->stats[RX_XDP_TX]++;
+ err = fec_enet_xdp_tx_xmit(fep, cpu, &xdp, sync);
+ if (unlikely(err)) {
+ rxq->stats[RX_XDP_TX_ERRORS]++;
+ fec_xdp_drop(rxq, &xdp, sync);
+ trace_xdp_exception(ndev, prog, XDP_TX);
+ }
+ break;
+ default:
+ bpf_warn_invalid_xdp_action(ndev, prog, act);
+ fallthrough;
+ case XDP_ABORTED:
+ /* handle aborts by dropping packet */
+ fallthrough;
+ case XDP_DROP:
+ rxq->stats[RX_XDP_DROP]++;
+ fec_xdp_drop(rxq, &xdp, sync);
+ break;
+ }
+
+rx_processing_done:
+ /* Clear the status flags for this buffer */
+ status &= ~BD_ENET_RX_STATS;
+ /* Mark the buffer empty */
+ status |= BD_ENET_RX_EMPTY;
+
+ if (fep->bufdesc_ex) {
+ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+
+ ebdp->cbd_esc = cpu_to_fec32(BD_ENET_RX_INT);
+ ebdp->cbd_prot = 0;
+ ebdp->cbd_bdu = 0;
+ }
+
+ /* Make sure the updates to rest of the descriptor are
+ * performed before transferring ownership.
+ */
+ dma_wmb();
+ bdp->cbd_sc = cpu_to_fec16(status);
+
+ /* Update BD pointer to next entry */
+ bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
+
+ /* Doing this here will keep the FEC running while we process
+ * incoming frames. On a heavily loaded network, we should be
+ * able to keep up at the expense of system resources.
+ */
+ writel(0, rxq->bd.reg_desc_active);
+ }
+
+ rxq->bd.cur = bdp;
+
+ if (xdp_res & FEC_ENET_XDP_REDIR)
xdp_do_flush();
return pkt_received;
@@ -1971,11 +2048,17 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
static int fec_enet_rx(struct net_device *ndev, int budget)
{
struct fec_enet_private *fep = netdev_priv(ndev);
+ struct bpf_prog *prog = READ_ONCE(fep->xdp_prog);
int i, done = 0;
/* Make sure that AVB queues are processed first. */
- for (i = fep->num_rx_queues - 1; i >= 0; i--)
- done += fec_enet_rx_queue(fep, i, budget - done);
+ for (i = fep->num_rx_queues - 1; i >= 0; i--) {
+ if (prog)
+ done += fec_enet_rx_queue_xdp(fep, i, budget - done,
+ prog);
+ else
+ done += fec_enet_rx_queue(fep, i, budget - done);
+ }
return done;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v6 net-next 07/15] net: fec: add tx_qid parameter to fec_enet_xdp_tx_xmit()
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (5 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 06/15] net: fec: add fec_enet_rx_queue_xdp() for XDP path Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 08/15] net: fec: transmit XDP frames in bulk Wei Fang
` (7 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Remove fec_enet_xdp_get_tx_queue() from fec_enet_xdp_tx_xmit() and add
a tx_qid parameter instead, so that the TX queue ID for XDP_TX frames
is calculated in fec_enet_rx_queue_xdp(). This way, the TX queue ID
only needs to be calculated once per NAPI poll. And since the number
of RX queues and TX queues in the FEC is generally equal, the RX queue
ID can be used directly as the TX queue ID. In exceptional cases,
fec_enet_xdp_get_tx_queue() is used to derive the TX queue ID from the
CPU ID.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 38 +++++++++++------------
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index aafde4378f65..0c1806a46fcd 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -79,7 +79,7 @@ static void set_multicast_list(struct net_device *ndev);
static void fec_enet_itr_coal_set(struct net_device *ndev);
static int fec_enet_xdp_tx_xmit(struct fec_enet_private *fep,
int cpu, struct xdp_buff *xdp,
- u32 dma_sync_len);
+ u32 dma_sync_len, int queue);
#define DRIVER_NAME "fec"
@@ -1890,6 +1890,15 @@ static void fec_xdp_drop(struct fec_enet_priv_rx_q *rxq,
page_pool_put_page(rxq->page_pool, page, sync, true);
}
+static int
+fec_enet_xdp_get_tx_queue(struct fec_enet_private *fep, int index)
+{
+ if (unlikely(index < 0))
+ return 0;
+
+ return (index % fep->num_tx_queues);
+}
+
static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
int budget, struct bpf_prog *prog)
{
@@ -1903,6 +1912,7 @@ static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
struct sk_buff *skb;
u16 status, pkt_len;
struct xdp_buff xdp;
+ int tx_qid = queue;
struct page *page;
u32 xdp_res = 0;
dma_addr_t dma;
@@ -1917,6 +1927,9 @@ static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
flush_cache_all();
#endif
+ if (unlikely(tx_qid >= fep->num_tx_queues))
+ tx_qid = fec_enet_xdp_get_tx_queue(fep, cpu);
+
xdp_init_buff(&xdp, PAGE_SIZE << fep->pagepool_order, &rxq->xdp_rxq);
while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) {
@@ -1988,7 +2001,7 @@ static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
break;
case XDP_TX:
rxq->stats[RX_XDP_TX]++;
- err = fec_enet_xdp_tx_xmit(fep, cpu, &xdp, sync);
+ err = fec_enet_xdp_tx_xmit(fep, cpu, &xdp, sync, tx_qid);
if (unlikely(err)) {
rxq->stats[RX_XDP_TX_ERRORS]++;
fec_xdp_drop(rxq, &xdp, sync);
@@ -3938,15 +3951,6 @@ static int fec_enet_bpf(struct net_device *dev, struct netdev_bpf *bpf)
}
}
-static int
-fec_enet_xdp_get_tx_queue(struct fec_enet_private *fep, int index)
-{
- if (unlikely(index < 0))
- return 0;
-
- return (index % fep->num_tx_queues);
-}
-
static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
struct fec_enet_priv_tx_q *txq,
void *frame, u32 dma_sync_len,
@@ -4040,15 +4044,11 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
static int fec_enet_xdp_tx_xmit(struct fec_enet_private *fep,
int cpu, struct xdp_buff *xdp,
- u32 dma_sync_len)
+ u32 dma_sync_len, int queue)
{
- struct fec_enet_priv_tx_q *txq;
- struct netdev_queue *nq;
- int queue, ret;
-
- queue = fec_enet_xdp_get_tx_queue(fep, cpu);
- txq = fep->tx_queue[queue];
- nq = netdev_get_tx_queue(fep->netdev, queue);
+ struct netdev_queue *nq = netdev_get_tx_queue(fep->netdev, queue);
+ struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
+ int ret;
__netif_tx_lock(nq, cpu);
--
2.34.1
* [PATCH v6 net-next 08/15] net: fec: transmit XDP frames in bulk
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (6 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 07/15] net: fec: add tx_qid parameter to fec_enet_xdp_tx_xmit() Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 09/15] net: fec: remove unnecessary NULL pointer check when clearing TX BD ring Wei Fang
` (6 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Currently, the driver writes the ENET_TDAR register for every XDP frame
to trigger transmit start. Frequent MMIO writes consume more CPU cycles
and may reduce XDP TX performance, so transmit XDP frames in bulk.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 0c1806a46fcd..e0cea2efb6da 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2006,6 +2006,8 @@ static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
rxq->stats[RX_XDP_TX_ERRORS]++;
fec_xdp_drop(rxq, &xdp, sync);
trace_xdp_exception(ndev, prog, XDP_TX);
+ } else {
+ xdp_res |= FEC_ENET_XDP_TX;
}
break;
default:
@@ -2055,6 +2057,10 @@ static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
if (xdp_res & FEC_ENET_XDP_REDIR)
xdp_do_flush();
+ if (xdp_res & FEC_ENET_XDP_TX)
+ /* Trigger transmission start */
+ fec_txq_trigger_xmit(fep, fep->tx_queue[tx_qid]);
+
return pkt_received;
}
@@ -4036,9 +4042,6 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
txq->bd.cur = bdp;
- /* Trigger transmission start */
- fec_txq_trigger_xmit(fep, txq);
-
return 0;
}
@@ -4088,6 +4091,9 @@ static int fec_enet_xdp_xmit(struct net_device *dev,
sent_frames++;
}
+ if (sent_frames)
+ fec_txq_trigger_xmit(fep, txq);
+
__netif_tx_unlock(nq);
return sent_frames;
--
2.34.1
* [PATCH v6 net-next 09/15] net: fec: remove unnecessary NULL pointer check when clearing TX BD ring
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (7 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 08/15] net: fec: transmit XDP frames in bulk Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 10/15] net: fec: use switch statement to check the type of tx_buf Wei Fang
` (5 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
The tx_buf pointer cannot be NULL when its type is FEC_TXBUF_T_XDP_NDO
or FEC_TXBUF_T_XDP_TX, and if the type is FEC_TXBUF_T_SKB, then
dev_kfree_skb_any() already handles a NULL pointer itself. So the NULL
pointer checks in fec_enet_bd_init() and fec_enet_tx_queue() are
unnecessary.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 35 ++++++++---------------
1 file changed, 12 insertions(+), 23 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index e0cea2efb6da..d71615352a62 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1032,24 +1032,19 @@ static void fec_enet_bd_init(struct net_device *dev)
fec32_to_cpu(bdp->cbd_bufaddr),
fec16_to_cpu(bdp->cbd_datlen),
DMA_TO_DEVICE);
- if (txq->tx_buf[i].buf_p)
- dev_kfree_skb_any(txq->tx_buf[i].buf_p);
+ dev_kfree_skb_any(txq->tx_buf[i].buf_p);
} else if (txq->tx_buf[i].type == FEC_TXBUF_T_XDP_NDO) {
- if (bdp->cbd_bufaddr)
- dma_unmap_single(&fep->pdev->dev,
- fec32_to_cpu(bdp->cbd_bufaddr),
- fec16_to_cpu(bdp->cbd_datlen),
- DMA_TO_DEVICE);
+ dma_unmap_single(&fep->pdev->dev,
+ fec32_to_cpu(bdp->cbd_bufaddr),
+ fec16_to_cpu(bdp->cbd_datlen),
+ DMA_TO_DEVICE);
- if (txq->tx_buf[i].buf_p)
- xdp_return_frame(txq->tx_buf[i].buf_p);
+ xdp_return_frame(txq->tx_buf[i].buf_p);
} else {
struct page *page = txq->tx_buf[i].buf_p;
- if (page)
- page_pool_put_page(pp_page_to_nmdesc(page)->pp,
- page, 0,
- false);
+ page_pool_put_page(pp_page_to_nmdesc(page)->pp,
+ page, 0, false);
}
txq->tx_buf[i].buf_p = NULL;
@@ -1538,21 +1533,15 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
if (txq->tx_buf[index].type == FEC_TXBUF_T_XDP_NDO) {
xdpf = txq->tx_buf[index].buf_p;
- if (bdp->cbd_bufaddr)
- dma_unmap_single(&fep->pdev->dev,
- fec32_to_cpu(bdp->cbd_bufaddr),
- fec16_to_cpu(bdp->cbd_datlen),
- DMA_TO_DEVICE);
+ dma_unmap_single(&fep->pdev->dev,
+ fec32_to_cpu(bdp->cbd_bufaddr),
+ fec16_to_cpu(bdp->cbd_datlen),
+ DMA_TO_DEVICE);
} else {
page = txq->tx_buf[index].buf_p;
}
bdp->cbd_bufaddr = cpu_to_fec32(0);
- if (unlikely(!txq->tx_buf[index].buf_p)) {
- txq->tx_buf[index].type = FEC_TXBUF_T_SKB;
- goto tx_buf_done;
- }
-
frame_len = fec16_to_cpu(bdp->cbd_datlen);
}
--
2.34.1
* [PATCH v6 net-next 10/15] net: fec: use switch statement to check the type of tx_buf
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (8 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 09/15] net: fec: remove unnecessary NULL pointer check when clearing TX BD ring Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 11/15] net: fec: remove the size parameter from fec_enet_create_page_pool() Wei Fang
` (4 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
The tx_buf has three types: FEC_TXBUF_T_SKB, FEC_TXBUF_T_XDP_NDO and
FEC_TXBUF_T_XDP_TX. Currently, the driver uses 'if...else...' statements
to check the type and perform the corresponding processing, which is
hard to extend. To support AF_XDP zero-copy mode, two new types will be
added later, and chaining yet more 'if...else...' branches would hurt
readability. So replace the 'if...else...' statements with switch
statements.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 134 ++++++++++++----------
1 file changed, 73 insertions(+), 61 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index d71615352a62..61bc267e0a5c 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1023,9 +1023,13 @@ static void fec_enet_bd_init(struct net_device *dev)
txq->bd.cur = bdp;
for (i = 0; i < txq->bd.ring_size; i++) {
+ struct page *page;
+
/* Initialize the BD for every fragment in the page. */
bdp->cbd_sc = cpu_to_fec16(0);
- if (txq->tx_buf[i].type == FEC_TXBUF_T_SKB) {
+
+ switch (txq->tx_buf[i].type) {
+ case FEC_TXBUF_T_SKB:
if (bdp->cbd_bufaddr &&
!IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
dma_unmap_single(&fep->pdev->dev,
@@ -1033,18 +1037,21 @@ static void fec_enet_bd_init(struct net_device *dev)
fec16_to_cpu(bdp->cbd_datlen),
DMA_TO_DEVICE);
dev_kfree_skb_any(txq->tx_buf[i].buf_p);
- } else if (txq->tx_buf[i].type == FEC_TXBUF_T_XDP_NDO) {
+ break;
+ case FEC_TXBUF_T_XDP_NDO:
dma_unmap_single(&fep->pdev->dev,
fec32_to_cpu(bdp->cbd_bufaddr),
fec16_to_cpu(bdp->cbd_datlen),
DMA_TO_DEVICE);
-
xdp_return_frame(txq->tx_buf[i].buf_p);
- } else {
- struct page *page = txq->tx_buf[i].buf_p;
-
+ break;
+ case FEC_TXBUF_T_XDP_TX:
+ page = txq->tx_buf[i].buf_p;
page_pool_put_page(pp_page_to_nmdesc(page)->pp,
page, 0, false);
+ break;
+ default:
+ break;
}
txq->tx_buf[i].buf_p = NULL;
@@ -1510,39 +1517,69 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
break;
index = fec_enet_get_bd_index(bdp, &txq->bd);
+ frame_len = fec16_to_cpu(bdp->cbd_datlen);
- if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB) {
- skb = txq->tx_buf[index].buf_p;
+ switch (txq->tx_buf[index].type) {
+ case FEC_TXBUF_T_SKB:
if (bdp->cbd_bufaddr &&
!IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
dma_unmap_single(&fep->pdev->dev,
fec32_to_cpu(bdp->cbd_bufaddr),
- fec16_to_cpu(bdp->cbd_datlen),
- DMA_TO_DEVICE);
+ frame_len, DMA_TO_DEVICE);
+
bdp->cbd_bufaddr = cpu_to_fec32(0);
+ skb = txq->tx_buf[index].buf_p;
if (!skb)
goto tx_buf_done;
- } else {
+
+ frame_len = skb->len;
+
+ /* NOTE: SKBTX_IN_PROGRESS being set does not imply it's we who
+ * are to time stamp the packet, so we still need to check time
+ * stamping enabled flag.
+ */
+ if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS &&
+ fep->hwts_tx_en) && fep->bufdesc_ex) {
+ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+ struct skb_shared_hwtstamps shhwtstamps;
+
+ fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts), &shhwtstamps);
+ skb_tstamp_tx(skb, &shhwtstamps);
+ }
+
+ /* Free the sk buffer associated with this last transmit */
+ napi_consume_skb(skb, budget);
+ break;
+ case FEC_TXBUF_T_XDP_NDO:
/* Tx processing cannot call any XDP (or page pool) APIs if
* the "budget" is 0. Because NAPI is called with budget of
* 0 (such as netpoll) indicates we may be in an IRQ context,
* however, we can't use the page pool from IRQ context.
*/
if (unlikely(!budget))
- break;
+ goto out;
- if (txq->tx_buf[index].type == FEC_TXBUF_T_XDP_NDO) {
- xdpf = txq->tx_buf[index].buf_p;
- dma_unmap_single(&fep->pdev->dev,
- fec32_to_cpu(bdp->cbd_bufaddr),
- fec16_to_cpu(bdp->cbd_datlen),
- DMA_TO_DEVICE);
- } else {
- page = txq->tx_buf[index].buf_p;
- }
+ xdpf = txq->tx_buf[index].buf_p;
+ dma_unmap_single(&fep->pdev->dev,
+ fec32_to_cpu(bdp->cbd_bufaddr),
+ frame_len, DMA_TO_DEVICE);
+ bdp->cbd_bufaddr = cpu_to_fec32(0);
+ xdp_return_frame_rx_napi(xdpf);
+ break;
+ case FEC_TXBUF_T_XDP_TX:
+ if (unlikely(!budget))
+ goto out;
bdp->cbd_bufaddr = cpu_to_fec32(0);
- frame_len = fec16_to_cpu(bdp->cbd_datlen);
+ page = txq->tx_buf[index].buf_p;
+ /* The dma_sync_size = 0 as XDP_TX has already synced
+ * DMA for_device
+ */
+ page_pool_put_page(pp_page_to_nmdesc(page)->pp, page,
+ 0, true);
+ break;
+ default:
+ break;
}
/* Check for errors. */
@@ -1562,11 +1599,7 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
ndev->stats.tx_carrier_errors++;
} else {
ndev->stats.tx_packets++;
-
- if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB)
- ndev->stats.tx_bytes += skb->len;
- else
- ndev->stats.tx_bytes += frame_len;
+ ndev->stats.tx_bytes += frame_len;
}
/* Deferred means some collisions occurred during transmit,
@@ -1575,30 +1608,6 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
if (status & BD_ENET_TX_DEF)
ndev->stats.collisions++;
- if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB) {
- /* NOTE: SKBTX_IN_PROGRESS being set does not imply it's we who
- * are to time stamp the packet, so we still need to check time
- * stamping enabled flag.
- */
- if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS &&
- fep->hwts_tx_en) && fep->bufdesc_ex) {
- struct skb_shared_hwtstamps shhwtstamps;
- struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
-
- fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts), &shhwtstamps);
- skb_tstamp_tx(skb, &shhwtstamps);
- }
-
- /* Free the sk buffer associated with this last transmit */
- napi_consume_skb(skb, budget);
- } else if (txq->tx_buf[index].type == FEC_TXBUF_T_XDP_NDO) {
- xdp_return_frame_rx_napi(xdpf);
- } else { /* recycle pages of XDP_TX frames */
- /* The dma_sync_size = 0 as XDP_TX has already synced DMA for_device */
- page_pool_put_page(pp_page_to_nmdesc(page)->pp, page,
- 0, true);
- }
-
txq->tx_buf[index].buf_p = NULL;
/* restore default tx buffer type: FEC_TXBUF_T_SKB */
txq->tx_buf[index].type = FEC_TXBUF_T_SKB;
@@ -1622,6 +1631,8 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
}
}
+out:
+
/* ERR006358: Keep the transmitter going */
if (bdp != txq->bd.cur &&
readl(txq->bd.reg_desc_active) == 0)
@@ -3414,6 +3425,7 @@ static void fec_enet_free_buffers(struct net_device *ndev)
unsigned int i;
struct fec_enet_priv_tx_q *txq;
struct fec_enet_priv_rx_q *rxq;
+ struct page *page;
unsigned int q;
for (q = 0; q < fep->num_rx_queues; q++) {
@@ -3437,20 +3449,20 @@ static void fec_enet_free_buffers(struct net_device *ndev)
kfree(txq->tx_bounce[i]);
txq->tx_bounce[i] = NULL;
- if (!txq->tx_buf[i].buf_p) {
- txq->tx_buf[i].type = FEC_TXBUF_T_SKB;
- continue;
- }
-
- if (txq->tx_buf[i].type == FEC_TXBUF_T_SKB) {
+ switch (txq->tx_buf[i].type) {
+ case FEC_TXBUF_T_SKB:
dev_kfree_skb(txq->tx_buf[i].buf_p);
- } else if (txq->tx_buf[i].type == FEC_TXBUF_T_XDP_NDO) {
+ break;
+ case FEC_TXBUF_T_XDP_NDO:
xdp_return_frame(txq->tx_buf[i].buf_p);
- } else {
- struct page *page = txq->tx_buf[i].buf_p;
-
+ break;
+ case FEC_TXBUF_T_XDP_TX:
+ page = txq->tx_buf[i].buf_p;
page_pool_put_page(pp_page_to_nmdesc(page)->pp,
page, 0, false);
+ break;
+ default:
+ break;
}
txq->tx_buf[i].buf_p = NULL;
--
2.34.1
* [PATCH v6 net-next 11/15] net: fec: remove the size parameter from fec_enet_create_page_pool()
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (9 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 10/15] net: fec: use switch statement to check the type of tx_buf Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 12/15] net: fec: move xdp_rxq_info* APIs out of fec_enet_create_page_pool() Wei Fang
` (3 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Remove the size parameter from fec_enet_create_page_pool(), since
rxq->bd.ring_size already contains this information.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 61bc267e0a5c..e7113ea971ca 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -467,13 +467,13 @@ fec_enet_clear_csum(struct sk_buff *skb, struct net_device *ndev)
static int
fec_enet_create_page_pool(struct fec_enet_private *fep,
- struct fec_enet_priv_rx_q *rxq, int size)
+ struct fec_enet_priv_rx_q *rxq)
{
struct bpf_prog *xdp_prog = READ_ONCE(fep->xdp_prog);
struct page_pool_params pp_params = {
.order = fep->pagepool_order,
.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
- .pool_size = size,
+ .pool_size = rxq->bd.ring_size,
.nid = dev_to_node(&fep->pdev->dev),
.dev = &fep->pdev->dev,
.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
@@ -3552,7 +3552,7 @@ fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue)
rxq = fep->rx_queue[queue];
bdp = rxq->bd.base;
- err = fec_enet_create_page_pool(fep, rxq, rxq->bd.ring_size);
+ err = fec_enet_create_page_pool(fep, rxq);
if (err < 0) {
netdev_err(ndev, "%s failed queue %d (%d)\n", __func__, queue, err);
return err;
--
2.34.1
* [PATCH v6 net-next 12/15] net: fec: move xdp_rxq_info* APIs out of fec_enet_create_page_pool()
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (10 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 11/15] net: fec: remove the size parameter from fec_enet_create_page_pool() Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 13/15] net: fec: add fec_alloc_rxq_buffers_pp() to allocate buffers from page pool Wei Fang
` (2 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Extract the xdp_rxq_info registration from fec_enet_create_page_pool()
into a new helper, fec_xdp_rxq_info_reg(), so that it can be reused in
subsequent patches to support XDP zero-copy mode.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 58 ++++++++++++++++-------
1 file changed, 40 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index e7113ea971ca..31a27f1cbf39 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -489,23 +489,7 @@ fec_enet_create_page_pool(struct fec_enet_private *fep,
return err;
}
- err = xdp_rxq_info_reg(&rxq->xdp_rxq, fep->netdev, rxq->id, 0);
- if (err < 0)
- goto err_free_pp;
-
- err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq, MEM_TYPE_PAGE_POOL,
- rxq->page_pool);
- if (err)
- goto err_unregister_rxq;
-
return 0;
-
-err_unregister_rxq:
- xdp_rxq_info_unreg(&rxq->xdp_rxq);
-err_free_pp:
- page_pool_destroy(rxq->page_pool);
- rxq->page_pool = NULL;
- return err;
}
static void fec_txq_trigger_xmit(struct fec_enet_private *fep,
@@ -3419,6 +3403,38 @@ static const struct ethtool_ops fec_enet_ethtool_ops = {
.self_test = net_selftest,
};
+static int fec_xdp_rxq_info_reg(struct fec_enet_private *fep,
+ struct fec_enet_priv_rx_q *rxq)
+{
+ struct net_device *ndev = fep->netdev;
+ int err;
+
+ err = xdp_rxq_info_reg(&rxq->xdp_rxq, ndev, rxq->id, 0);
+ if (err) {
+ netdev_err(ndev, "Failed to register xdp rxq info\n");
+ return err;
+ }
+
+ err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq, MEM_TYPE_PAGE_POOL,
+ rxq->page_pool);
+ if (err) {
+ netdev_err(ndev, "Failed to register XDP mem model\n");
+ xdp_rxq_info_unreg(&rxq->xdp_rxq);
+
+ return err;
+ }
+
+ return 0;
+}
+
+static void fec_xdp_rxq_info_unreg(struct fec_enet_priv_rx_q *rxq)
+{
+ if (xdp_rxq_info_is_reg(&rxq->xdp_rxq)) {
+ xdp_rxq_info_unreg_mem_model(&rxq->xdp_rxq);
+ xdp_rxq_info_unreg(&rxq->xdp_rxq);
+ }
+}
+
static void fec_enet_free_buffers(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
@@ -3430,6 +3446,9 @@ static void fec_enet_free_buffers(struct net_device *ndev)
for (q = 0; q < fep->num_rx_queues; q++) {
rxq = fep->rx_queue[q];
+
+ fec_xdp_rxq_info_unreg(rxq);
+
for (i = 0; i < rxq->bd.ring_size; i++)
page_pool_put_full_page(rxq->page_pool, rxq->rx_buf[i],
false);
@@ -3437,8 +3456,6 @@ static void fec_enet_free_buffers(struct net_device *ndev)
for (i = 0; i < XDP_STATS_TOTAL; i++)
rxq->stats[i] = 0;
- if (xdp_rxq_info_is_reg(&rxq->xdp_rxq))
- xdp_rxq_info_unreg(&rxq->xdp_rxq);
page_pool_destroy(rxq->page_pool);
rxq->page_pool = NULL;
}
@@ -3593,6 +3610,11 @@ fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue)
/* Set the last buffer to wrap. */
bdp = fec_enet_get_prevdesc(bdp, &rxq->bd);
bdp->cbd_sc |= cpu_to_fec16(BD_ENET_RX_WRAP);
+
+ err = fec_xdp_rxq_info_reg(fep, rxq);
+ if (err)
+ goto err_alloc;
+
return 0;
err_alloc:
--
2.34.1
* [PATCH v6 net-next 13/15] net: fec: add fec_alloc_rxq_buffers_pp() to allocate buffers from page pool
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (11 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 12/15] net: fec: move xdp_rxq_info* APIs out of fec_enet_create_page_pool() Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 14/15] net: fec: improve fec_enet_tx_queue() Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 15/15] net: fec: add AF_XDP zero-copy support Wei Fang
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Currently, the RX queue buffers are allocated from the page pool. In
the subsequent patches that add XDP zero-copy support, the RX buffers
will instead be allocated from the UMEM. Therefore, extract
fec_alloc_rxq_buffers_pp() from fec_enet_alloc_rxq_buffers(); a second
helper that allocates RX buffers from the UMEM will be added for the
XDP zero-copy mode. In addition, fec_alloc_rxq_buffers_pp() only
initializes bdp->bufaddr and does not touch the other fields of the
bdp, because those are initialized in fec_enet_bd_init().
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 90 +++++++++++++++--------
1 file changed, 58 insertions(+), 32 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 31a27f1cbf39..cb03b45b7951 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -990,6 +990,13 @@ static void fec_enet_bd_init(struct net_device *dev)
bdp->cbd_sc = cpu_to_fec16(BD_ENET_RX_EMPTY);
else
bdp->cbd_sc = cpu_to_fec16(0);
+
+ if (fep->bufdesc_ex) {
+ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+
+ ebdp->cbd_esc = cpu_to_fec32(BD_ENET_RX_INT);
+ }
+
bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
}
@@ -3435,6 +3442,24 @@ static void fec_xdp_rxq_info_unreg(struct fec_enet_priv_rx_q *rxq)
}
}
+static void fec_free_rxq_buffers(struct fec_enet_priv_rx_q *rxq)
+{
+ int i;
+
+ for (i = 0; i < rxq->bd.ring_size; i++) {
+ struct page *page = rxq->rx_buf[i];
+
+ if (!page)
+ continue;
+
+ page_pool_put_full_page(rxq->page_pool, page, false);
+ rxq->rx_buf[i] = NULL;
+ }
+
+ page_pool_destroy(rxq->page_pool);
+ rxq->page_pool = NULL;
+}
+
static void fec_enet_free_buffers(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
@@ -3448,16 +3473,10 @@ static void fec_enet_free_buffers(struct net_device *ndev)
rxq = fep->rx_queue[q];
fec_xdp_rxq_info_unreg(rxq);
-
- for (i = 0; i < rxq->bd.ring_size; i++)
- page_pool_put_full_page(rxq->page_pool, rxq->rx_buf[i],
- false);
+ fec_free_rxq_buffers(rxq);
for (i = 0; i < XDP_STATS_TOTAL; i++)
rxq->stats[i] = 0;
-
- page_pool_destroy(rxq->page_pool);
- rxq->page_pool = NULL;
}
for (q = 0; q < fep->num_tx_queues; q++) {
@@ -3556,22 +3575,18 @@ static int fec_enet_alloc_queue(struct net_device *ndev)
return ret;
}
-static int
-fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue)
+static int fec_alloc_rxq_buffers_pp(struct fec_enet_private *fep,
+ struct fec_enet_priv_rx_q *rxq)
{
- struct fec_enet_private *fep = netdev_priv(ndev);
- struct fec_enet_priv_rx_q *rxq;
+ struct bufdesc *bdp = rxq->bd.base;
dma_addr_t phys_addr;
- struct bufdesc *bdp;
struct page *page;
int i, err;
- rxq = fep->rx_queue[queue];
- bdp = rxq->bd.base;
-
err = fec_enet_create_page_pool(fep, rxq);
if (err < 0) {
- netdev_err(ndev, "%s failed queue %d (%d)\n", __func__, queue, err);
+ netdev_err(fep->netdev, "%s failed queue %d (%d)\n",
+ __func__, rxq->bd.qid, err);
return err;
}
@@ -3590,36 +3605,47 @@ fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue)
for (i = 0; i < rxq->bd.ring_size; i++) {
page = page_pool_dev_alloc_pages(rxq->page_pool);
- if (!page)
- goto err_alloc;
+ if (!page) {
+ err = -ENOMEM;
+ goto free_rx_buffers;
+ }
phys_addr = page_pool_get_dma_addr(page) + FEC_ENET_XDP_HEADROOM;
bdp->cbd_bufaddr = cpu_to_fec32(phys_addr);
-
rxq->rx_buf[i] = page;
- bdp->cbd_sc = cpu_to_fec16(BD_ENET_RX_EMPTY);
-
- if (fep->bufdesc_ex) {
- struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
- ebdp->cbd_esc = cpu_to_fec32(BD_ENET_RX_INT);
- }
-
bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
}
- /* Set the last buffer to wrap. */
- bdp = fec_enet_get_prevdesc(bdp, &rxq->bd);
- bdp->cbd_sc |= cpu_to_fec16(BD_ENET_RX_WRAP);
+ return 0;
+
+free_rx_buffers:
+ fec_free_rxq_buffers(rxq);
+
+ return err;
+}
+
+static int
+fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue)
+{
+ struct fec_enet_private *fep = netdev_priv(ndev);
+ struct fec_enet_priv_rx_q *rxq;
+ int err;
+
+ rxq = fep->rx_queue[queue];
+ err = fec_alloc_rxq_buffers_pp(fep, rxq);
+ if (err)
+ goto free_buffers;
err = fec_xdp_rxq_info_reg(fep, rxq);
if (err)
- goto err_alloc;
+ goto free_buffers;
return 0;
- err_alloc:
+free_buffers:
fec_enet_free_buffers(ndev);
- return -ENOMEM;
+
+ return err;
}
static int
--
2.34.1
* [PATCH v6 net-next 14/15] net: fec: improve fec_enet_tx_queue()
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (12 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 13/15] net: fec: add fec_alloc_rxq_buffers_pp() to allocate buffers from page pool Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 15/15] net: fec: add AF_XDP zero-copy support Wei Fang
14 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
To support AF_XDP zero-copy mode in the subsequent patch, the following
adjustments have been made to fec_enet_tx_queue().
1. Change the parameters of fec_enet_tx_queue().
2. Some variables are initialized at the time of declaration, and the
order of local variables is updated to follow the reverse xmas tree
style.
3. Remove the variable xdpf and add the variable tx_buf.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
drivers/net/ethernet/freescale/fec_main.c | 43 +++++++++--------------
1 file changed, 17 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index cb03b45b7951..7aeaa055c2cd 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1475,27 +1475,18 @@ fec_enet_hwtstamp(struct fec_enet_private *fep, unsigned ts,
hwtstamps->hwtstamp = ns_to_ktime(ns);
}
-static void
-fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
+static void fec_enet_tx_queue(struct fec_enet_private *fep,
+ u16 queue, int budget)
{
- struct fec_enet_private *fep;
- struct xdp_frame *xdpf;
- struct bufdesc *bdp;
+ struct netdev_queue *nq = netdev_get_tx_queue(fep->netdev, queue);
+ struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
+ struct net_device *ndev = fep->netdev;
+ struct bufdesc *bdp = txq->dirty_tx;
+ int index, frame_len, entries_free;
+ struct fec_tx_buffer *tx_buf;
unsigned short status;
- struct sk_buff *skb;
- struct fec_enet_priv_tx_q *txq;
- struct netdev_queue *nq;
- int index = 0;
- int entries_free;
+ struct sk_buff *skb;
struct page *page;
- int frame_len;
-
- fep = netdev_priv(ndev);
-
- txq = fep->tx_queue[queue_id];
- /* get next bdp of dirty_tx */
- nq = netdev_get_tx_queue(ndev, queue_id);
- bdp = txq->dirty_tx;
/* get next bdp of dirty_tx */
bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
@@ -1508,9 +1499,10 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
break;
index = fec_enet_get_bd_index(bdp, &txq->bd);
+ tx_buf = &txq->tx_buf[index];
frame_len = fec16_to_cpu(bdp->cbd_datlen);
- switch (txq->tx_buf[index].type) {
+ switch (tx_buf->type) {
case FEC_TXBUF_T_SKB:
if (bdp->cbd_bufaddr &&
!IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
@@ -1519,7 +1511,7 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
frame_len, DMA_TO_DEVICE);
bdp->cbd_bufaddr = cpu_to_fec32(0);
- skb = txq->tx_buf[index].buf_p;
+ skb = tx_buf->buf_p;
if (!skb)
goto tx_buf_done;
@@ -1550,19 +1542,18 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
if (unlikely(!budget))
goto out;
- xdpf = txq->tx_buf[index].buf_p;
dma_unmap_single(&fep->pdev->dev,
fec32_to_cpu(bdp->cbd_bufaddr),
frame_len, DMA_TO_DEVICE);
bdp->cbd_bufaddr = cpu_to_fec32(0);
- xdp_return_frame_rx_napi(xdpf);
+ xdp_return_frame_rx_napi(tx_buf->buf_p);
break;
case FEC_TXBUF_T_XDP_TX:
if (unlikely(!budget))
goto out;
bdp->cbd_bufaddr = cpu_to_fec32(0);
- page = txq->tx_buf[index].buf_p;
+ page = tx_buf->buf_p;
/* The dma_sync_size = 0 as XDP_TX has already synced
* DMA for_device
*/
@@ -1599,9 +1590,9 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
if (status & BD_ENET_TX_DEF)
ndev->stats.collisions++;
- txq->tx_buf[index].buf_p = NULL;
+ tx_buf->buf_p = NULL;
/* restore default tx buffer type: FEC_TXBUF_T_SKB */
- txq->tx_buf[index].type = FEC_TXBUF_T_SKB;
+ tx_buf->type = FEC_TXBUF_T_SKB;
tx_buf_done:
/* Make sure the update to bdp and tx_buf are performed
@@ -1637,7 +1628,7 @@ static void fec_enet_tx(struct net_device *ndev, int budget)
/* Make sure that AVB queues are processed first. */
for (i = fep->num_tx_queues - 1; i >= 0; i--)
- fec_enet_tx_queue(ndev, i, budget);
+ fec_enet_tx_queue(fep, i, budget);
}
static int fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
--
2.34.1
* [PATCH v6 net-next 15/15] net: fec: add AF_XDP zero-copy support
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
` (13 preceding siblings ...)
2026-02-03 5:23 ` [PATCH v6 net-next 14/15] net: fec: improve fec_enet_tx_queue() Wei Fang
@ 2026-02-03 5:23 ` Wei Fang
2026-02-05 5:28 ` [v6,net-next,15/15] " Jakub Kicinski
14 siblings, 1 reply; 20+ messages in thread
From: Wei Fang @ 2026-02-03 5:23 UTC (permalink / raw)
To: shenwei.wang, xiaoning.wang, frank.li, andrew+netdev, davem,
edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend, sdf,
horms
Cc: netdev, linux-kernel, imx, bpf
Add AF_XDP zero-copy support for both TX and RX.
For RX, the buffers are allocated from the XSK pool instead of the page
pool, so fec_alloc_rxq_buffers_zc() is added to allocate RX buffers
from the XSK pool, and fec_enet_rx_queue_xsk() is added to process the
frames from an RX queue that is bound to an AF_XDP socket. Like the XDP
copy mode, the zero-copy mode supports the XDP_TX, XDP_PASS, XDP_DROP
and XDP_REDIRECT actions. In addition, fec_enet_xsk_tx_xmit(), which is
similar to fec_enet_xdp_tx_xmit(), is used to handle the XDP_TX action
in zero-copy mode.
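The per-queue RX dispatch this introduces can be summarized in a small
sketch (illustrative stand-ins only; pick_rx_path and the enum are not
driver symbols, and the two pointer fields merely model txq/rxq state):

```c
#include <stddef.h>

enum rx_path { RX_PATH_XSK, RX_PATH_XDP, RX_PATH_SKB };

/* Stand-in for the fields fec_enet_rx() consults per queue. */
struct rxq_state {
	void *xsk_pool;	/* set when the queue is bound to an AF_XDP socket */
	void *bpf_prog;	/* installed XDP program, may be NULL */
};

/* A queue bound to an XSK pool always takes the zero-copy path, even
 * when no program is installed (the driver then assumes XDP_PASS);
 * otherwise an installed program selects the XDP copy path, and the
 * plain skb path is the fallback. */
static enum rx_path pick_rx_path(const struct rxq_state *q)
{
	if (q->xsk_pool)
		return RX_PATH_XSK;	/* fec_enet_rx_queue_xsk() */
	if (q->bpf_prog)
		return RX_PATH_XDP;	/* fec_enet_rx_queue_xdp() */
	return RX_PATH_SKB;		/* fec_enet_rx_queue() */
}
```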
For TX, there are two cases. One is frames coming from the AF_XDP
socket: fec_enet_xsk_xmit() is added to transmit them directly, and the
buffer type is marked as FEC_TXBUF_T_XSK_XMIT. The other is frames
coming from the RX queue (the XDP_TX action), whose buffer type is
marked as FEC_TXBUF_T_XSK_TX. This lets fec_enet_tx_queue() clean the
TX queue correctly based on the buffer type.
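How the cleanup loop dispatches on the buffer type can be sketched as
below. The mapping follows this series' switch statement; the
string-returning helper is purely illustrative, and the SKB case is
simplified to the usual completion call rather than the full
timestamp/statistics handling:

```c
#include <string.h>

/* TX buffer types from the patch; the completion action differs. */
enum fec_txbuf_type {
	FEC_TXBUF_T_SKB,
	FEC_TXBUF_T_XDP_NDO,
	FEC_TXBUF_T_XDP_TX,
	FEC_TXBUF_T_XSK_XMIT,
	FEC_TXBUF_T_XSK_TX,
};

/* Illustrative stand-in for the switch in fec_enet_tx_queue():
 * names the completion call each type roughly maps to. */
static const char *tx_cleanup_action(enum fec_txbuf_type t)
{
	switch (t) {
	case FEC_TXBUF_T_SKB:
		return "napi_consume_skb";	 /* free the socket buffer */
	case FEC_TXBUF_T_XDP_NDO:
		return "xdp_return_frame_rx_napi"; /* redirected frame */
	case FEC_TXBUF_T_XDP_TX:
		return "page_pool_put_page";	 /* recycle the XDP_TX page */
	case FEC_TXBUF_T_XSK_XMIT:
		return "xsk_tx_completed";	 /* counted, reported in bulk */
	case FEC_TXBUF_T_XSK_TX:
		return "xsk_buff_free";		 /* release the XSK buffer */
	}
	return "none";
}
```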
Some tests were also run on the i.MX93-EVK board with the xdpsock tool;
the results are as follows.
Env: i.MX93 connects to a packet generator, the link speed is 1Gbps, and
flow-control is off. The RX packet size is 64 bytes including FCS. Only
one RX queue (CPU) is used to receive frames.
1. MAC swap L2 forwarding
1.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -z
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 414715 415455
tx 414715 415455
1.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -l -c
sock0@eth0:0 l2fwd xdp-drv
pps pkts 1.00
rx 356396 356609
tx 356396 356609
2. TX only
2.1 Zero-copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -z
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 1119573 1126720
2.2 Copy mode
root@imx93evk:~# ./xdpsock -i eth0 -t -s 64 -c
sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 406864 407616
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/fec.h | 13 +-
drivers/net/ethernet/freescale/fec_main.c | 779 ++++++++++++++++++++--
2 files changed, 748 insertions(+), 44 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index ad7aba1a8536..7176803146f3 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -340,6 +340,7 @@ struct bufdesc_ex {
#define FEC_ENET_TX_FRPPG (PAGE_SIZE / FEC_ENET_TX_FRSIZE)
#define TX_RING_SIZE 1024 /* Must be power of two */
#define TX_RING_MOD_MASK 511 /* for this to work */
+#define FEC_XSK_TX_BUDGET_MAX 256
#define BD_ENET_RX_INT 0x00800000
#define BD_ENET_RX_PTP ((ushort)0x0400)
@@ -528,6 +529,8 @@ enum fec_txbuf_type {
FEC_TXBUF_T_SKB,
FEC_TXBUF_T_XDP_NDO,
FEC_TXBUF_T_XDP_TX,
+ FEC_TXBUF_T_XSK_XMIT,
+ FEC_TXBUF_T_XSK_TX,
};
struct fec_tx_buffer {
@@ -539,6 +542,7 @@ struct fec_enet_priv_tx_q {
struct bufdesc_prop bd;
unsigned char *tx_bounce[TX_RING_SIZE];
struct fec_tx_buffer tx_buf[TX_RING_SIZE];
+ struct xsk_buff_pool *xsk_pool;
unsigned short tx_stop_threshold;
unsigned short tx_wake_threshold;
@@ -548,9 +552,16 @@ struct fec_enet_priv_tx_q {
dma_addr_t tso_hdrs_dma;
};
+union fec_rx_buffer {
+ void *buf_p;
+ struct page *page;
+ struct xdp_buff *xdp;
+};
+
struct fec_enet_priv_rx_q {
struct bufdesc_prop bd;
- struct page *rx_buf[RX_RING_SIZE];
+ union fec_rx_buffer rx_buf[RX_RING_SIZE];
+ struct xsk_buff_pool *xsk_pool;
/* page_pool */
struct page_pool *page_pool;
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 7aeaa055c2cd..2e049b7dea55 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -71,6 +71,7 @@
#include <net/page_pool/helpers.h>
#include <net/selftests.h>
#include <net/tso.h>
+#include <net/xdp_sock_drv.h>
#include <soc/imx/cpuidle.h>
#include "fec.h"
@@ -1041,6 +1042,9 @@ static void fec_enet_bd_init(struct net_device *dev)
page_pool_put_page(pp_page_to_nmdesc(page)->pp,
page, 0, false);
break;
+ case FEC_TXBUF_T_XSK_TX:
+ xsk_buff_free(txq->tx_buf[i].buf_p);
+ break;
default:
break;
}
@@ -1475,8 +1479,91 @@ fec_enet_hwtstamp(struct fec_enet_private *fep, unsigned ts,
hwtstamps->hwtstamp = ns_to_ktime(ns);
}
-static void fec_enet_tx_queue(struct fec_enet_private *fep,
- u16 queue, int budget)
+static bool fec_enet_xsk_xmit(struct fec_enet_private *fep,
+ struct xsk_buff_pool *pool,
+ u32 queue)
+{
+ struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
+ struct xdp_desc *xsk_desc = pool->tx_descs;
+ int cpu = smp_processor_id();
+ int free_bds, budget, batch;
+ struct netdev_queue *nq;
+ struct bufdesc *bdp;
+ dma_addr_t dma;
+ u32 estatus;
+ u16 status;
+ int i, j;
+
+ nq = netdev_get_tx_queue(fep->netdev, queue);
+ __netif_tx_lock(nq, cpu);
+
+ txq_trans_cond_update(nq);
+ free_bds = fec_enet_get_free_txdesc_num(txq);
+ if (!free_bds)
+ goto tx_unlock;
+
+ budget = min(free_bds, FEC_XSK_TX_BUDGET_MAX);
+ batch = xsk_tx_peek_release_desc_batch(pool, budget);
+ if (!batch)
+ goto tx_unlock;
+
+ bdp = txq->bd.cur;
+ for (i = 0; i < batch; i++) {
+ dma = xsk_buff_raw_get_dma(pool, xsk_desc[i].addr);
+ xsk_buff_raw_dma_sync_for_device(pool, dma, xsk_desc[i].len);
+
+ j = fec_enet_get_bd_index(bdp, &txq->bd);
+ txq->tx_buf[j].type = FEC_TXBUF_T_XSK_XMIT;
+ txq->tx_buf[j].buf_p = NULL;
+
+ status = fec16_to_cpu(bdp->cbd_sc);
+ status &= ~BD_ENET_TX_STATS;
+ status |= BD_ENET_TX_INTR | BD_ENET_TX_LAST;
+ bdp->cbd_datlen = cpu_to_fec16(xsk_desc[i].len);
+ bdp->cbd_bufaddr = cpu_to_fec32(dma);
+
+ if (fep->bufdesc_ex) {
+ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+
+ estatus = BD_ENET_TX_INT;
+ if (fep->quirks & FEC_QUIRK_HAS_AVB)
+ estatus |= FEC_TX_BD_FTYPE(txq->bd.qid);
+
+ ebdp->cbd_bdu = 0;
+ ebdp->cbd_esc = cpu_to_fec32(estatus);
+ }
+
+ /* Make sure the updates to rest of the descriptor are performed
+ * before transferring ownership.
+ */
+ dma_wmb();
+
+ /* Send it on its way. Tell FEC it's ready, interrupt when done,
+ * it's the last BD of the frame, and to put the CRC on the end.
+ */
+ status |= BD_ENET_TX_READY | BD_ENET_TX_TC;
+ bdp->cbd_sc = cpu_to_fec16(status);
+ dma_wmb();
+
+ bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
+ txq->bd.cur = bdp;
+ }
+
+ /* Trigger transmission start */
+ fec_txq_trigger_xmit(fep, txq);
+
+ __netif_tx_unlock(nq);
+
+ return batch < budget;
+
+tx_unlock:
+ __netif_tx_unlock(nq);
+
+ return true;
+}
+
+static int fec_enet_tx_queue(struct fec_enet_private *fep,
+ u16 queue, int budget)
{
struct netdev_queue *nq = netdev_get_tx_queue(fep->netdev, queue);
struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
@@ -1487,6 +1574,7 @@ static void fec_enet_tx_queue(struct fec_enet_private *fep,
unsigned short status;
struct sk_buff *skb;
struct page *page;
+ int xsk_cnt = 0;
/* get next bdp of dirty_tx */
bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
@@ -1560,6 +1648,14 @@ static void fec_enet_tx_queue(struct fec_enet_private *fep,
page_pool_put_page(pp_page_to_nmdesc(page)->pp, page,
0, true);
break;
+ case FEC_TXBUF_T_XSK_XMIT:
+ bdp->cbd_bufaddr = cpu_to_fec32(0);
+ xsk_cnt++;
+ break;
+ case FEC_TXBUF_T_XSK_TX:
+ bdp->cbd_bufaddr = cpu_to_fec32(0);
+ xsk_buff_free(tx_buf->buf_p);
+ break;
default:
break;
}
@@ -1619,16 +1715,37 @@ static void fec_enet_tx_queue(struct fec_enet_private *fep,
if (bdp != txq->bd.cur &&
readl(txq->bd.reg_desc_active) == 0)
writel(0, txq->bd.reg_desc_active);
+
+ if (txq->xsk_pool) {
+ struct xsk_buff_pool *pool = txq->xsk_pool;
+
+ if (xsk_cnt)
+ xsk_tx_completed(pool, xsk_cnt);
+
+ if (xsk_uses_need_wakeup(pool))
+ xsk_set_tx_need_wakeup(pool);
+
+ /* If the condition is true, it indicates that there are still
+ * packets to be transmitted, so return "budget" to make the
+ * NAPI continue polling.
+ */
+ if (!fec_enet_xsk_xmit(fep, pool, queue))
+ return budget;
+ }
+
+ return 0;
}
-static void fec_enet_tx(struct net_device *ndev, int budget)
+static int fec_enet_tx(struct net_device *ndev, int budget)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- int i;
+ int i, count = 0;
/* Make sure that AVB queues are processed first. */
for (i = fep->num_tx_queues - 1; i >= 0; i--)
- fec_enet_tx_queue(fep, i, budget);
+ count += fec_enet_tx_queue(fep, i, budget);
+
+ return count;
}
static int fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
@@ -1641,13 +1758,30 @@ static int fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
if (unlikely(!new_page))
return -ENOMEM;
- rxq->rx_buf[index] = new_page;
+ rxq->rx_buf[index].page = new_page;
phys_addr = page_pool_get_dma_addr(new_page) + FEC_ENET_XDP_HEADROOM;
bdp->cbd_bufaddr = cpu_to_fec32(phys_addr);
return 0;
}
+static int fec_enet_update_cbd_zc(struct fec_enet_priv_rx_q *rxq,
+ struct bufdesc *bdp, int index)
+{
+ struct xdp_buff *new_xdp;
+ dma_addr_t phys_addr;
+
+ new_xdp = xsk_buff_alloc(rxq->xsk_pool);
+ if (unlikely(!new_xdp))
+ return -ENOMEM;
+
+ rxq->rx_buf[index].xdp = new_xdp;
+ phys_addr = xsk_buff_xdp_get_dma(new_xdp);
+ bdp->cbd_bufaddr = cpu_to_fec32(phys_addr);
+
+ return 0;
+}
+
static void fec_enet_rx_vlan(const struct net_device *ndev, struct sk_buff *skb)
{
if (ndev->features & NETIF_F_HW_VLAN_CTAG_RX) {
@@ -1802,7 +1936,7 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
ndev->stats.rx_bytes += pkt_len - fep->rx_shift;
index = fec_enet_get_bd_index(bdp, &rxq->bd);
- page = rxq->rx_buf[index];
+ page = rxq->rx_buf[index].page;
dma = fec32_to_cpu(bdp->cbd_bufaddr);
if (fec_enet_update_cbd(rxq, bdp, index)) {
ndev->stats.rx_dropped++;
@@ -1932,7 +2066,7 @@ static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
ndev->stats.rx_bytes += pkt_len - fep->rx_shift;
index = fec_enet_get_bd_index(bdp, &rxq->bd);
- page = rxq->rx_buf[index];
+ page = rxq->rx_buf[index].page;
dma = fec32_to_cpu(bdp->cbd_bufaddr);
if (fec_enet_update_cbd(rxq, bdp, index)) {
@@ -2046,6 +2180,268 @@ static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
return pkt_received;
}
+static struct sk_buff *fec_build_skb_zc(struct xdp_buff *xsk,
+ struct napi_struct *napi)
+{
+ size_t len = xdp_get_buff_len(xsk);
+ struct sk_buff *skb;
+
+ skb = napi_alloc_skb(napi, len);
+ if (unlikely(!skb)) {
+ xsk_buff_free(xsk);
+ return NULL;
+ }
+
+ skb_put_data(skb, xsk->data, len);
+ xsk_buff_free(xsk);
+
+ return skb;
+}
+
+static int fec_enet_xsk_tx_xmit(struct fec_enet_private *fep,
+ struct xdp_buff *xsk, int cpu,
+ int queue)
+{
+ struct netdev_queue *nq = netdev_get_tx_queue(fep->netdev, queue);
+ struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
+ u32 offset = xsk->data - xsk->data_hard_start;
+ u32 headroom = txq->xsk_pool->headroom;
+ u32 len = xsk->data_end - xsk->data;
+ u32 index, status, estatus;
+ struct bufdesc *bdp;
+ dma_addr_t dma;
+
+ __netif_tx_lock(nq, cpu);
+
+ /* Avoid tx timeout as XDP shares the queue with kernel stack */
+ txq_trans_cond_update(nq);
+
+ if (!fec_enet_get_free_txdesc_num(txq)) {
+ __netif_tx_unlock(nq);
+
+ return -EBUSY;
+ }
+
+ /* Fill in a Tx ring entry */
+ bdp = txq->bd.cur;
+ status = fec16_to_cpu(bdp->cbd_sc);
+ status &= ~BD_ENET_TX_STATS;
+
+ index = fec_enet_get_bd_index(bdp, &txq->bd);
+ dma = xsk_buff_xdp_get_frame_dma(xsk) + headroom + offset;
+
+ xsk_buff_raw_dma_sync_for_device(txq->xsk_pool, dma, len);
+
+ txq->tx_buf[index].buf_p = xsk;
+ txq->tx_buf[index].type = FEC_TXBUF_T_XSK_TX;
+
+ status |= (BD_ENET_TX_INTR | BD_ENET_TX_LAST);
+ if (fep->bufdesc_ex)
+ estatus = BD_ENET_TX_INT;
+
+ bdp->cbd_bufaddr = cpu_to_fec32(dma);
+ bdp->cbd_datlen = cpu_to_fec16(len);
+
+ if (fep->bufdesc_ex) {
+ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+
+ if (fep->quirks & FEC_QUIRK_HAS_AVB)
+ estatus |= FEC_TX_BD_FTYPE(txq->bd.qid);
+
+ ebdp->cbd_bdu = 0;
+ ebdp->cbd_esc = cpu_to_fec32(estatus);
+ }
+
+ status |= (BD_ENET_TX_READY | BD_ENET_TX_TC);
+ bdp->cbd_sc = cpu_to_fec16(status);
+ dma_wmb();
+
+ bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
+ txq->bd.cur = bdp;
+
+ __netif_tx_unlock(nq);
+
+ return 0;
+}
+
+static int fec_enet_rx_queue_xsk(struct fec_enet_private *fep, int queue,
+ int budget, struct bpf_prog *prog)
+{
+ u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
+ struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
+ struct net_device *ndev = fep->netdev;
+ struct bufdesc *bdp = rxq->bd.cur;
+ u32 sub_len = 4 + fep->rx_shift;
+ int cpu = smp_processor_id();
+ bool wakeup_xsk = false;
+ struct xdp_buff *xsk;
+ int pkt_received = 0;
+ struct sk_buff *skb;
+ u16 status, pkt_len;
+ u32 xdp_res = 0;
+ int index, err;
+ u32 act;
+
+#if defined(CONFIG_COLDFIRE) && !defined(CONFIG_COLDFIRE_COHERENT_DMA)
+ /*
+ * Hacky flush of all caches instead of using the DMA API for the TSO
+ * headers.
+ */
+ flush_cache_all();
+#endif
+
+ while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) {
+ if (unlikely(pkt_received >= budget))
+ break;
+
+ writel(FEC_ENET_RXF_GET(queue), fep->hwp + FEC_IEVENT);
+
+ index = fec_enet_get_bd_index(bdp, &rxq->bd);
+ xsk = rxq->rx_buf[index].xdp;
+ if (unlikely(!xsk)) {
+ if (fec_enet_update_cbd_zc(rxq, bdp, index))
+ break;
+
+ if (fep->bufdesc_ex) {
+ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+
+ ebdp->cbd_esc = cpu_to_fec32(BD_ENET_RX_INT);
+ ebdp->cbd_prot = 0;
+ ebdp->cbd_bdu = 0;
+ }
+
+ dma_wmb();
+ status &= ~BD_ENET_RX_STATS;
+ status |= BD_ENET_RX_EMPTY;
+ bdp->cbd_sc = cpu_to_fec16(status);
+ break;
+ }
+
+ pkt_received++;
+ /* Check for errors. */
+ status ^= BD_ENET_RX_LAST;
+ if (unlikely(fec_rx_error_check(ndev, status)))
+ goto rx_processing_done;
+
+ /* Process the incoming frame. */
+ ndev->stats.rx_packets++;
+ pkt_len = fec16_to_cpu(bdp->cbd_datlen);
+ ndev->stats.rx_bytes += pkt_len - fep->rx_shift;
+
+ if (fec_enet_update_cbd_zc(rxq, bdp, index)) {
+ ndev->stats.rx_dropped++;
+ goto rx_processing_done;
+ }
+
+ pkt_len -= sub_len;
+ xsk->data = xsk->data_hard_start + data_start;
+ /* Subtract FCS and 16bit shift */
+ xsk->data_end = xsk->data + pkt_len;
+ xsk->data_meta = xsk->data;
+ xsk_buff_dma_sync_for_cpu(xsk);
+
+ /* If the XSK pool is enabled before the bpf program is
+ * installed, or the bpf program is uninstalled before
+ * the XSK pool is disabled. prog will be NULL and we
+ * need to set a default XDP_PASS action.
+ */
+ if (unlikely(!prog))
+ act = XDP_PASS;
+ else
+ act = bpf_prog_run_xdp(prog, xsk);
+
+ switch (act) {
+ case XDP_PASS:
+ rxq->stats[RX_XDP_PASS]++;
+ skb = fec_build_skb_zc(xsk, &fep->napi);
+ if (unlikely(!skb))
+ ndev->stats.rx_dropped++;
+ else
+ napi_gro_receive(&fep->napi, skb);
+ break;
+ case XDP_TX:
+ rxq->stats[RX_XDP_TX]++;
+ err = fec_enet_xsk_tx_xmit(fep, xsk, cpu, queue);
+ if (unlikely(err)) {
+ rxq->stats[RX_XDP_TX_ERRORS]++;
+ xsk_buff_free(xsk);
+ } else {
+ xdp_res |= FEC_ENET_XDP_TX;
+ }
+ break;
+ case XDP_REDIRECT:
+ rxq->stats[RX_XDP_REDIRECT]++;
+ err = xdp_do_redirect(ndev, xsk, prog);
+ if (unlikely(err)) {
+ if (err == -ENOBUFS)
+ wakeup_xsk = true;
+
+ rxq->stats[RX_XDP_DROP]++;
+ xsk_buff_free(xsk);
+ } else {
+ xdp_res |= FEC_ENET_XDP_REDIR;
+ }
+ break;
+ default:
+ bpf_warn_invalid_xdp_action(ndev, prog, act);
+ fallthrough;
+ case XDP_ABORTED:
+ trace_xdp_exception(ndev, prog, act);
+ fallthrough;
+ case XDP_DROP:
+ rxq->stats[RX_XDP_DROP]++;
+ xsk_buff_free(xsk);
+ break;
+ }
+
+rx_processing_done:
+ /* Clear the status flags for this buffer */
+ status &= ~BD_ENET_RX_STATS;
+ /* Mark the buffer empty */
+ status |= BD_ENET_RX_EMPTY;
+
+ if (fep->bufdesc_ex) {
+ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+
+ ebdp->cbd_esc = cpu_to_fec32(BD_ENET_RX_INT);
+ ebdp->cbd_prot = 0;
+ ebdp->cbd_bdu = 0;
+ }
+
+ /* Make sure the updates to rest of the descriptor are
+ * performed before transferring ownership.
+ */
+ dma_wmb();
+ bdp->cbd_sc = cpu_to_fec16(status);
+
+ /* Update BD pointer to next entry */
+ bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
+
+ /* Doing this here will keep the FEC running while we process
+ * incoming frames. On a heavily loaded network, we should be
+ * able to keep up at the expense of system resources.
+ */
+ writel(0, rxq->bd.reg_desc_active);
+ }
+
+ rxq->bd.cur = bdp;
+
+ if (xdp_res & FEC_ENET_XDP_REDIR)
+ xdp_do_flush();
+
+ if (xdp_res & FEC_ENET_XDP_TX)
+ fec_txq_trigger_xmit(fep, fep->tx_queue[queue]);
+
+ if (rxq->xsk_pool && xsk_uses_need_wakeup(rxq->xsk_pool)) {
+ if (wakeup_xsk)
+ xsk_set_rx_need_wakeup(rxq->xsk_pool);
+ else
+ xsk_clear_rx_need_wakeup(rxq->xsk_pool);
+ }
+
+ return pkt_received;
+}
+
static int fec_enet_rx(struct net_device *ndev, int budget)
{
struct fec_enet_private *fep = netdev_priv(ndev);
@@ -2054,11 +2450,15 @@ static int fec_enet_rx(struct net_device *ndev, int budget)
/* Make sure that AVB queues are processed first. */
for (i = fep->num_rx_queues - 1; i >= 0; i--) {
- if (prog)
- done += fec_enet_rx_queue_xdp(fep, i, budget - done,
- prog);
+ struct fec_enet_priv_rx_q *rxq = fep->rx_queue[i];
+ int batch = budget - done;
+
+ if (rxq->xsk_pool)
+ done += fec_enet_rx_queue_xsk(fep, i, batch, prog);
+ else if (prog)
+ done += fec_enet_rx_queue_xdp(fep, i, batch, prog);
else
- done += fec_enet_rx_queue(fep, i, budget - done);
+ done += fec_enet_rx_queue(fep, i, batch);
}
return done;
@@ -2102,19 +2502,22 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
{
struct net_device *ndev = napi->dev;
struct fec_enet_private *fep = netdev_priv(ndev);
- int done = 0;
+ int rx_done = 0, tx_done = 0;
+ int max_done;
do {
- done += fec_enet_rx(ndev, budget - done);
- fec_enet_tx(ndev, budget);
- } while ((done < budget) && fec_enet_collect_events(fep));
+ rx_done += fec_enet_rx(ndev, budget - rx_done);
+ tx_done += fec_enet_tx(ndev, budget);
+ max_done = max(rx_done, tx_done);
+ } while ((max_done < budget) && fec_enet_collect_events(fep));
- if (done < budget) {
- napi_complete_done(napi, done);
+ if (max_done < budget) {
+ napi_complete_done(napi, max_done);
writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
+ return max_done;
}
- return done;
+ return budget;
}
/* ------------------------------------------------------------------------- */
@@ -3405,7 +3808,8 @@ static int fec_xdp_rxq_info_reg(struct fec_enet_private *fep,
struct fec_enet_priv_rx_q *rxq)
{
struct net_device *ndev = fep->netdev;
- int err;
+ void *allocator;
+ int type, err;
err = xdp_rxq_info_reg(&rxq->xdp_rxq, ndev, rxq->id, 0);
if (err) {
@@ -3413,8 +3817,9 @@ static int fec_xdp_rxq_info_reg(struct fec_enet_private *fep,
return err;
}
- err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq, MEM_TYPE_PAGE_POOL,
- rxq->page_pool);
+ allocator = rxq->xsk_pool ? NULL : rxq->page_pool;
+ type = rxq->xsk_pool ? MEM_TYPE_XSK_BUFF_POOL : MEM_TYPE_PAGE_POOL;
+ err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq, type, allocator);
if (err) {
netdev_err(ndev, "Failed to register XDP mem model\n");
xdp_rxq_info_unreg(&rxq->xdp_rxq);
@@ -3422,6 +3827,9 @@ static int fec_xdp_rxq_info_reg(struct fec_enet_private *fep,
return err;
}
+ if (rxq->xsk_pool)
+ xsk_pool_set_rxq_info(rxq->xsk_pool, &rxq->xdp_rxq);
+
return 0;
}
@@ -3435,20 +3843,28 @@ static void fec_xdp_rxq_info_unreg(struct fec_enet_priv_rx_q *rxq)
static void fec_free_rxq_buffers(struct fec_enet_priv_rx_q *rxq)
{
+ bool xsk = !!rxq->xsk_pool;
int i;
for (i = 0; i < rxq->bd.ring_size; i++) {
- struct page *page = rxq->rx_buf[i];
+ union fec_rx_buffer *buf = &rxq->rx_buf[i];
- if (!page)
+ if (!buf->buf_p)
continue;
- page_pool_put_full_page(rxq->page_pool, page, false);
- rxq->rx_buf[i] = NULL;
+ if (xsk)
+ xsk_buff_free(buf->xdp);
+ else
+ page_pool_put_full_page(rxq->page_pool,
+ buf->page, false);
+
+ rxq->rx_buf[i].buf_p = NULL;
}
- page_pool_destroy(rxq->page_pool);
- rxq->page_pool = NULL;
+ if (!xsk) {
+ page_pool_destroy(rxq->page_pool);
+ rxq->page_pool = NULL;
+ }
}
static void fec_enet_free_buffers(struct net_device *ndev)
@@ -3488,6 +3904,9 @@ static void fec_enet_free_buffers(struct net_device *ndev)
page_pool_put_page(pp_page_to_nmdesc(page)->pp,
page, 0, false);
break;
+ case FEC_TXBUF_T_XSK_TX:
+ xsk_buff_free(txq->tx_buf[i].buf_p);
+ break;
default:
break;
}
@@ -3603,7 +4022,7 @@ static int fec_alloc_rxq_buffers_pp(struct fec_enet_private *fep,
phys_addr = page_pool_get_dma_addr(page) + FEC_ENET_XDP_HEADROOM;
bdp->cbd_bufaddr = cpu_to_fec32(phys_addr);
- rxq->rx_buf[i] = page;
+ rxq->rx_buf[i].page = page;
bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
}
@@ -3615,6 +4034,33 @@ static int fec_alloc_rxq_buffers_pp(struct fec_enet_private *fep,
return err;
}
+static int fec_alloc_rxq_buffers_zc(struct fec_enet_private *fep,
+ struct fec_enet_priv_rx_q *rxq)
+{
+ union fec_rx_buffer *buf = &rxq->rx_buf[0];
+ struct bufdesc *bdp = rxq->bd.base;
+ dma_addr_t phys_addr;
+ int i;
+
+ for (i = 0; i < rxq->bd.ring_size; i++) {
+ buf[i].xdp = xsk_buff_alloc(rxq->xsk_pool);
+ if (!buf[i].xdp)
+ break;
+
+ phys_addr = xsk_buff_xdp_get_dma(buf[i].xdp);
+ bdp->cbd_bufaddr = cpu_to_fec32(phys_addr);
+ bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
+ }
+
+ for (; i < rxq->bd.ring_size; i++) {
+ buf[i].xdp = NULL;
+ bdp->cbd_bufaddr = cpu_to_fec32(0);
+ bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
+ }
+
+ return 0;
+}
+
static int
fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue)
{
@@ -3623,9 +4069,16 @@ fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue)
int err;
rxq = fep->rx_queue[queue];
- err = fec_alloc_rxq_buffers_pp(fep, rxq);
- if (err)
- goto free_buffers;
+ if (rxq->xsk_pool) {
+ /* RX XDP ZC buffer pool may not be populated, e.g.
+ * xdpsock TX-only.
+ */
+ fec_alloc_rxq_buffers_zc(fep, rxq);
+ } else {
+ err = fec_alloc_rxq_buffers_pp(fep, rxq);
+ if (err)
+ goto free_buffers;
+ }
err = fec_xdp_rxq_info_reg(fep, rxq);
if (err)
@@ -3950,21 +4403,237 @@ static u16 fec_enet_select_queue(struct net_device *ndev, struct sk_buff *skb,
 	return fec_enet_vlan_pri_to_queue[vlan_tag >> 13];
 }
+static void fec_free_rxq(struct fec_enet_priv_rx_q *rxq)
+{
+	fec_xdp_rxq_info_unreg(rxq);
+	fec_free_rxq_buffers(rxq);
+	kfree(rxq);
+}
+
+static struct fec_enet_priv_rx_q *
+fec_alloc_new_rxq_xsk(struct fec_enet_private *fep, int queue,
+		      struct xsk_buff_pool *pool)
+{
+	struct fec_enet_priv_rx_q *old_rxq = fep->rx_queue[queue];
+	struct fec_enet_priv_rx_q *rxq;
+	union fec_rx_buffer *buf;
+	int i;
+
+	rxq = kzalloc(sizeof(*rxq), GFP_KERNEL);
+	if (!rxq)
+		return NULL;
+
+	/* Copy the BD ring to the new rxq */
+	rxq->bd = old_rxq->bd;
+	rxq->id = queue;
+	rxq->xsk_pool = pool;
+	buf = &rxq->rx_buf[0];
+
+	for (i = 0; i < rxq->bd.ring_size; i++) {
+		buf[i].xdp = xsk_buff_alloc(pool);
+		/* RX XDP ZC buffer pool may not be populated, e.g.
+		 * xdpsock TX-only.
+		 */
+		if (!buf[i].xdp)
+			break;
+	}
+
+	if (fec_xdp_rxq_info_reg(fep, rxq))
+		goto free_buffers;
+
+	return rxq;
+
+free_buffers:
+	while (--i >= 0)
+		xsk_buff_free(buf[i].xdp);
+
+	kfree(rxq);
+
+	return NULL;
+}
+
+static struct fec_enet_priv_rx_q *
+fec_alloc_new_rxq_pp(struct fec_enet_private *fep, int queue)
+{
+	struct fec_enet_priv_rx_q *old_rxq = fep->rx_queue[queue];
+	struct fec_enet_priv_rx_q *rxq;
+	union fec_rx_buffer *buf;
+	int i = 0;
+
+	rxq = kzalloc(sizeof(*rxq), GFP_KERNEL);
+	if (!rxq)
+		return NULL;
+
+	rxq->bd = old_rxq->bd;
+	rxq->id = queue;
+
+	if (fec_enet_create_page_pool(fep, rxq))
+		goto free_rxq;
+
+	buf = &rxq->rx_buf[0];
+	for (; i < rxq->bd.ring_size; i++) {
+		buf[i].page = page_pool_dev_alloc_pages(rxq->page_pool);
+		if (!buf[i].page)
+			goto free_buffers;
+	}
+
+	if (fec_xdp_rxq_info_reg(fep, rxq))
+		goto free_buffers;
+
+	return rxq;
+
+free_buffers:
+	while (--i >= 0)
+		page_pool_put_full_page(rxq->page_pool,
+					buf[i].page, false);
+
+	page_pool_destroy(rxq->page_pool);
+free_rxq:
+	kfree(rxq);
+
+	return NULL;
+}
+
+static void fec_init_rxq_bd_buffers(struct fec_enet_priv_rx_q *rxq, bool xsk)
+{
+	union fec_rx_buffer *buf = &rxq->rx_buf[0];
+	struct bufdesc *bdp = rxq->bd.base;
+	dma_addr_t dma;
+
+	for (int i = 0; i < rxq->bd.ring_size; i++) {
+		if (xsk)
+			dma = buf[i].xdp ?
+			      xsk_buff_xdp_get_dma(buf[i].xdp) : 0;
+		else
+			dma = page_pool_get_dma_addr(buf[i].page) +
+			      FEC_ENET_XDP_HEADROOM;
+
+		bdp->cbd_bufaddr = cpu_to_fec32(dma);
+		bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
+	}
+}
+
+static int fec_xsk_restart_napi(struct fec_enet_private *fep,
+				struct xsk_buff_pool *pool,
+				u16 queue)
+{
+	struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
+	struct net_device *ndev = fep->netdev;
+	struct fec_enet_priv_rx_q *rxq;
+	int err;
+
+	napi_disable(&fep->napi);
+	netif_tx_disable(ndev);
+	synchronize_rcu();
+
+	rxq = pool ? fec_alloc_new_rxq_xsk(fep, queue, pool) :
+		     fec_alloc_new_rxq_pp(fep, queue);
+	if (!rxq) {
+		err = -ENOMEM;
+		goto err_alloc_new_rxq;
+	}
+
+	/* Replace the old rxq with the new rxq */
+	fec_free_rxq(fep->rx_queue[queue]);
+	fep->rx_queue[queue] = rxq;
+	fec_init_rxq_bd_buffers(rxq, !!pool);
+	txq->xsk_pool = pool;
+
+	fec_restart(ndev);
+	napi_enable(&fep->napi);
+	netif_tx_start_all_queues(ndev);
+
+	return 0;
+
+err_alloc_new_rxq:
+	napi_enable(&fep->napi);
+	netif_tx_start_all_queues(ndev);
+
+	return err;
+}
+
+static int fec_enable_xsk_pool(struct fec_enet_private *fep,
+			       struct xsk_buff_pool *pool,
+			       u16 queue)
+{
+	int err;
+
+	err = xsk_pool_dma_map(pool, &fep->pdev->dev, 0);
+	if (err) {
+		netdev_err(fep->netdev, "Failed to map xsk pool\n");
+		return err;
+	}
+
+	if (!netif_running(fep->netdev)) {
+		struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
+		struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
+
+		rxq->xsk_pool = pool;
+		txq->xsk_pool = pool;
+
+		return 0;
+	}
+
+	err = fec_xsk_restart_napi(fep, pool, queue);
+	if (err) {
+		xsk_pool_dma_unmap(pool, 0);
+		return err;
+	}
+
+	return 0;
+}
+
+static int fec_disable_xsk_pool(struct fec_enet_private *fep,
+				u16 queue)
+{
+	struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
+	struct xsk_buff_pool *old_pool = txq->xsk_pool;
+	int err;
+
+	if (!netif_running(fep->netdev)) {
+		struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
+
+		xsk_pool_dma_unmap(old_pool, 0);
+		rxq->xsk_pool = NULL;
+		txq->xsk_pool = NULL;
+
+		return 0;
+	}
+
+	err = fec_xsk_restart_napi(fep, NULL, queue);
+	if (err)
+		return err;
+
+	xsk_pool_dma_unmap(old_pool, 0);
+
+	return 0;
+}
+
+static int fec_setup_xsk_pool(struct fec_enet_private *fep,
+			      struct xsk_buff_pool *pool,
+			      u16 queue)
+{
+	if (queue >= fep->num_rx_queues || queue >= fep->num_tx_queues)
+		return -ERANGE;
+
+	return pool ? fec_enable_xsk_pool(fep, pool, queue) :
+		      fec_disable_xsk_pool(fep, queue);
+}
+
 static int fec_enet_bpf(struct net_device *dev, struct netdev_bpf *bpf)
 {
 	struct fec_enet_private *fep = netdev_priv(dev);
 	bool is_run = netif_running(dev);
 	struct bpf_prog *old_prog;
+	/* No need to support the SoCs that require to do the frame swap
+	 * because the performance wouldn't be better than the skb mode.
+	 */
+	if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
+		return -EOPNOTSUPP;
+
 	switch (bpf->command) {
 	case XDP_SETUP_PROG:
-		/* No need to support the SoCs that require to
-		 * do the frame swap because the performance wouldn't be
-		 * better than the skb mode.
-		 */
-		if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
-			return -EOPNOTSUPP;
-
 		if (!bpf->prog)
 			xdp_features_clear_redirect_target(dev);
@@ -3988,10 +4657,9 @@ static int fec_enet_bpf(struct net_device *dev, struct netdev_bpf *bpf)
 		xdp_features_set_redirect_target(dev, false);
 		return 0;
-
 	case XDP_SETUP_XSK_POOL:
-		return -EOPNOTSUPP;
-
+		return fec_setup_xsk_pool(fep, bpf->xsk.pool,
+					  bpf->xsk.queue_id);
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -4139,6 +4807,29 @@ static int fec_enet_xdp_xmit(struct net_device *dev,
 	return sent_frames;
 }
+static int fec_enet_xsk_wakeup(struct net_device *ndev, u32 queue, u32 flags)
+{
+	struct fec_enet_private *fep = netdev_priv(ndev);
+	struct fec_enet_priv_rx_q *rxq;
+
+	if (!netif_running(ndev) || !netif_carrier_ok(ndev))
+		return -ENETDOWN;
+
+	if (queue >= fep->num_rx_queues || queue >= fep->num_tx_queues)
+		return -ERANGE;
+
+	rxq = fep->rx_queue[queue];
+	if (!rxq->xsk_pool)
+		return -EINVAL;
+
+	if (!napi_if_scheduled_mark_missed(&fep->napi)) {
+		if (likely(napi_schedule_prep(&fep->napi)))
+			__napi_schedule(&fep->napi);
+	}
+
+	return 0;
+}
+
 static int fec_hwtstamp_get(struct net_device *ndev,
 			    struct kernel_hwtstamp_config *config)
 {
@@ -4201,6 +4892,7 @@ static const struct net_device_ops fec_netdev_ops = {
 	.ndo_set_features = fec_set_features,
 	.ndo_bpf = fec_enet_bpf,
 	.ndo_xdp_xmit = fec_enet_xdp_xmit,
+	.ndo_xsk_wakeup = fec_enet_xsk_wakeup,
 	.ndo_hwtstamp_get = fec_hwtstamp_get,
 	.ndo_hwtstamp_set = fec_hwtstamp_set,
 };
@@ -4328,7 +5020,8 @@ static int fec_enet_init(struct net_device *ndev)
 	if (!(fep->quirks & FEC_QUIRK_SWAP_FRAME))
 		ndev->xdp_features = NETDEV_XDP_ACT_BASIC |
-				     NETDEV_XDP_ACT_REDIRECT;
+				     NETDEV_XDP_ACT_REDIRECT |
+				     NETDEV_XDP_ACT_XSK_ZEROCOPY;
 	fec_restart(ndev);
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [v6,net-next,06/15] net: fec: add fec_enet_rx_queue_xdp() for XDP path
2026-02-03 5:23 ` [PATCH v6 net-next 06/15] net: fec: add fec_enet_rx_queue_xdp() for XDP path Wei Fang
@ 2026-02-05 5:28 ` Jakub Kicinski
2026-02-05 6:16 ` Wei Fang
0 siblings, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2026-02-05 5:28 UTC (permalink / raw)
To: wei.fang
Cc: Jakub Kicinski, ast, andrew+netdev, davem, pabeni, xiaoning.wang,
john.fastabend, daniel, sdf, frank.li, linux-kernel, hawk, horms,
bpf, edumazet, netdev, imx, shenwei.wang
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: fec: add fec_enet_rx_queue_xdp() for XDP path
This commit separates XDP path packet processing from fec_enet_rx_queue()
into a new function fec_enet_rx_queue_xdp() to optimize XDP performance.
Performance testing on iMX93 shows improvements across all xdp-bench tests.
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -1666,71 +1666,6 @@ static int fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
> return 0;
> }
>
> -static u32
> -fec_enet_run_xdp(struct fec_enet_private *fep, struct bpf_prog *prog,
> -		 struct xdp_buff *xdp, struct fec_enet_priv_rx_q *rxq, int cpu)
> -{
[ ... ]
> -	default:
> -		bpf_warn_invalid_xdp_action(fep->netdev, prog, act);
> -		fallthrough;
> -
> -	case XDP_ABORTED:
> -		fallthrough; /* handle aborts by dropping packet */
> -
> -	case XDP_DROP:
> -		rxq->stats[RX_XDP_DROP]++;
> -xdp_err:
> -		ret = FEC_ENET_XDP_CONSUMED;
> -		page = virt_to_head_page(xdp->data);
> -		page_pool_put_page(rxq->page_pool, page, sync, true);
> -		if (act != XDP_DROP)
> -			trace_xdp_exception(fep->netdev, prog, act);
> -		break;
> -	}
> -
> -	return ret;
> -}
> -
[ ... ]
> @@ -1962,7 +1879,167 @@
[ ... ]
> +static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
> +				 int budget, struct bpf_prog *prog)
> +{
[ ... ]
> +		switch (act) {
> +		case XDP_PASS:
> +			rxq->stats[RX_XDP_PASS]++;
> +			/* The packet length includes FCS, but we don't want to
> +			 * include that when passing upstream as it messes up
> +			 * bridging applications.
> +			 */
> +			skb = fec_build_skb(fep, rxq, bdp, page, pkt_len);
> +			if (!skb)
> +				trace_xdp_exception(ndev, prog, XDP_PASS);
> +			else
> +				napi_gro_receive(&fep->napi, skb);
> +
> +			break;
> +		case XDP_REDIRECT:
> +			rxq->stats[RX_XDP_REDIRECT]++;
> +			err = xdp_do_redirect(ndev, &xdp, prog);
> +			if (unlikely(err)) {
> +				fec_xdp_drop(rxq, &xdp, sync);
> +				trace_xdp_exception(ndev, prog, XDP_REDIRECT);
> +			} else {
> +				xdp_res |= FEC_ENET_XDP_REDIR;
> +			}
> +			break;
> +		case XDP_TX:
> +			rxq->stats[RX_XDP_TX]++;
> +			err = fec_enet_xdp_tx_xmit(fep, cpu, &xdp, sync);
> +			if (unlikely(err)) {
> +				rxq->stats[RX_XDP_TX_ERRORS]++;
> +				fec_xdp_drop(rxq, &xdp, sync);
> +				trace_xdp_exception(ndev, prog, XDP_TX);
> +			}
> +			break;
> +		default:
> +			bpf_warn_invalid_xdp_action(ndev, prog, act);
> +			fallthrough;
> +		case XDP_ABORTED:
> +			/* handle aborts by dropping packet */
> +			fallthrough;
> +		case XDP_DROP:
> +			rxq->stats[RX_XDP_DROP]++;
> +			fec_xdp_drop(rxq, &xdp, sync);
> +			break;
Are XDP_ABORTED and invalid XDP actions being traced properly? In the
deleted fec_enet_run_xdp() function, the error path included this check:
	if (act != XDP_DROP)
		trace_xdp_exception(fep->netdev, prog, act);
This would trace both XDP_ABORTED and invalid actions (after calling
bpf_warn_invalid_xdp_action). The new code in fec_enet_rx_queue_xdp()
calls trace_xdp_exception() for XDP_PASS failures, XDP_REDIRECT failures,
and XDP_TX failures, but not for XDP_ABORTED or invalid actions.
This trace information is useful for debugging XDP program issues and
monitoring abnormal XDP behavior.
> +		}
> +
> +rx_processing_done:
[ ... ]
> +	return pkt_received;
> +}
--
pw-bot: cr
* Re: [v6,net-next,15/15] net: fec: add AF_XDP zero-copy support
2026-02-03 5:23 ` [PATCH v6 net-next 15/15] net: fec: add AF_XDP zero-copy support Wei Fang
@ 2026-02-05 5:28 ` Jakub Kicinski
2026-02-05 6:04 ` Wei Fang
0 siblings, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2026-02-05 5:28 UTC (permalink / raw)
To: wei.fang
Cc: Jakub Kicinski, ast, andrew+netdev, davem, pabeni, xiaoning.wang,
john.fastabend, daniel, sdf, frank.li, linux-kernel, hawk, horms,
bpf, edumazet, netdev, imx, shenwei.wang
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: fec: add AF_XDP zero-copy support
This commit adds AF_XDP zero-copy support for both TX and RX on the FEC
Ethernet driver. It introduces new functions for XSK buffer allocation,
RX/TX queue processing in zero-copy mode, and XSK pool setup/teardown.
> For TX, there are two cases, one is the frames from the AF_XDP socket,
> so fec_enet_xsk_xmit() is added to directly transmit the frames from
> the socket and the buffer type is marked as FEC_TXBUF_T_XSK_XMIT. The
> other one is the frams from the RX queue (XDP_TX action), the buffer
This isn't a bug, but "frams" should be "frames".
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
[ ... ]
> +static int fec_enet_xsk_tx_xmit(struct fec_enet_private *fep,
> + struct xdp_buff *xsk, int cpu,
> + int queue)
> +{
[ ... ]
> +	bdp->cbd_bufaddr = cpu_to_fec32(dma);
> +	bdp->cbd_datlen = cpu_to_fec16(len);
> +
> +	if (fep->bufdesc_ex) {
> +		struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
> +
> +		if (fep->quirks & FEC_QUIRK_HAS_AVB)
> +			estatus |= FEC_TX_BD_FTYPE(txq->bd.qid);
> +
> +		ebdp->cbd_bdu = 0;
> +		ebdp->cbd_esc = cpu_to_fec32(estatus);
> +	}
> +
> +	status |= (BD_ENET_TX_READY | BD_ENET_TX_TC);
> +	bdp->cbd_sc = cpu_to_fec16(status);
> +	dma_wmb();
Is there a missing dma_wmb() before setting BD_ENET_TX_READY? The
descriptor fields (cbd_bufaddr, cbd_datlen, cbd_esc) are written before
the status, but without a memory barrier, the hardware could see the
READY bit before those fields are visible on weakly-ordered architectures.
Compare to fec_enet_xsk_xmit() in the same file which does:
	dma_wmb();
	status |= BD_ENET_TX_READY | BD_ENET_TX_TC;
	bdp->cbd_sc = cpu_to_fec16(status);
dma_wmb();
The dma_wmb() after setting status only ensures ordering with subsequent
writes (bd.cur update), not with the preceding descriptor field writes.
* RE: [v6,net-next,15/15] net: fec: add AF_XDP zero-copy support
2026-02-05 5:28 ` [v6,net-next,15/15] " Jakub Kicinski
@ 2026-02-05 6:04 ` Wei Fang
0 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-05 6:04 UTC (permalink / raw)
To: Jakub Kicinski
Cc: ast@kernel.org, andrew+netdev@lunn.ch, davem@davemloft.net,
pabeni@redhat.com, Clark Wang, john.fastabend@gmail.com,
daniel@iogearbox.net, sdf@fomichev.me, Frank Li,
linux-kernel@vger.kernel.org, hawk@kernel.org, horms@kernel.org,
bpf@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org,
imx@lists.linux.dev, Shenwei Wang
> > +	bdp->cbd_bufaddr = cpu_to_fec32(dma);
> > +	bdp->cbd_datlen = cpu_to_fec16(len);
> > +
> > +	if (fep->bufdesc_ex) {
> > +		struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
> > +
> > +		if (fep->quirks & FEC_QUIRK_HAS_AVB)
> > +			estatus |= FEC_TX_BD_FTYPE(txq->bd.qid);
> > +
> > +		ebdp->cbd_bdu = 0;
> > +		ebdp->cbd_esc = cpu_to_fec32(estatus);
> > +	}
> > +
> > +	status |= (BD_ENET_TX_READY | BD_ENET_TX_TC);
> > +	bdp->cbd_sc = cpu_to_fec16(status);
> > +	dma_wmb();
>
> Is there a missing dma_wmb() before setting BD_ENET_TX_READY?
Yes, my bad, I will fix it.
* RE: [v6,net-next,06/15] net: fec: add fec_enet_rx_queue_xdp() for XDP path
2026-02-05 5:28 ` [v6,net-next,06/15] " Jakub Kicinski
@ 2026-02-05 6:16 ` Wei Fang
0 siblings, 0 replies; 20+ messages in thread
From: Wei Fang @ 2026-02-05 6:16 UTC (permalink / raw)
To: Jakub Kicinski
Cc: ast@kernel.org, andrew+netdev@lunn.ch, davem@davemloft.net,
pabeni@redhat.com, Clark Wang, john.fastabend@gmail.com,
daniel@iogearbox.net, sdf@fomichev.me, Frank Li,
linux-kernel@vger.kernel.org, hawk@kernel.org, horms@kernel.org,
bpf@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org,
imx@lists.linux.dev, Shenwei Wang
> > +static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
> > +				 int budget, struct bpf_prog *prog)
> > +{
>
> [ ... ]
>
> > +		switch (act) {
> > +		case XDP_PASS:
> > +			rxq->stats[RX_XDP_PASS]++;
> > +			/* The packet length includes FCS, but we don't want to
> > +			 * include that when passing upstream as it messes up
> > +			 * bridging applications.
> > +			 */
> > +			skb = fec_build_skb(fep, rxq, bdp, page, pkt_len);
> > +			if (!skb)
> > +				trace_xdp_exception(ndev, prog, XDP_PASS);
> > +			else
> > +				napi_gro_receive(&fep->napi, skb);
> > +
> > +			break;
> > +		case XDP_REDIRECT:
> > +			rxq->stats[RX_XDP_REDIRECT]++;
> > +			err = xdp_do_redirect(ndev, &xdp, prog);
> > +			if (unlikely(err)) {
> > +				fec_xdp_drop(rxq, &xdp, sync);
> > +				trace_xdp_exception(ndev, prog, XDP_REDIRECT);
> > +			} else {
> > +				xdp_res |= FEC_ENET_XDP_REDIR;
> > +			}
> > +			break;
> > +		case XDP_TX:
> > +			rxq->stats[RX_XDP_TX]++;
> > +			err = fec_enet_xdp_tx_xmit(fep, cpu, &xdp, sync);
> > +			if (unlikely(err)) {
> > +				rxq->stats[RX_XDP_TX_ERRORS]++;
> > +				fec_xdp_drop(rxq, &xdp, sync);
> > +				trace_xdp_exception(ndev, prog, XDP_TX);
> > +			}
> > +			break;
> > +		default:
> > +			bpf_warn_invalid_xdp_action(ndev, prog, act);
> > +			fallthrough;
> > +		case XDP_ABORTED:
> > +			/* handle aborts by dropping packet */
> > +			fallthrough;
> > +		case XDP_DROP:
> > +			rxq->stats[RX_XDP_DROP]++;
> > +			fec_xdp_drop(rxq, &xdp, sync);
> > +			break;
>
> Are XDP_ABORTED and invalid XDP actions being traced properly? In the
I will fix it, thanks.
end of thread, other threads:[~2026-02-05 6:16 UTC | newest]
Thread overview: 20+ messages
2026-02-03 5:23 [PATCH v6 net-next 00/15] net: fec: improve XDP copy mode and add AF_XDP zero-copy support Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 01/15] net: fec: add fec_txq_trigger_xmit() helper Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 02/15] net: fec: add fec_rx_error_check() to check RX errors Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 03/15] net: fec: add rx_shift to indicate the extra bytes padded in front of RX frame Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 04/15] net: fec: add fec_build_skb() to build a skb Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 05/15] net: fec: improve fec_enet_rx_queue() Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 06/15] net: fec: add fec_enet_rx_queue_xdp() for XDP path Wei Fang
2026-02-05 5:28 ` [v6,net-next,06/15] " Jakub Kicinski
2026-02-05 6:16 ` Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 07/15] net: fec: add tx_qid parameter to fec_enet_xdp_tx_xmit() Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 08/15] net: fec: transmit XDP frames in bulk Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 09/15] net: fec: remove unnecessary NULL pointer check when clearing TX BD ring Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 10/15] net: fec: use switch statement to check the type of tx_buf Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 11/15] net: fec: remove the size parameter from fec_enet_create_page_pool() Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 12/15] net: fec: move xdp_rxq_info* APIs out of fec_enet_create_page_pool() Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 13/15] net: fec: add fec_alloc_rxq_buffers_pp() to allocate buffers from page pool Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 14/15] net: fec: improve fec_enet_tx_queue() Wei Fang
2026-02-03 5:23 ` [PATCH v6 net-next 15/15] net: fec: add AF_XDP zero-copy support Wei Fang
2026-02-05 5:28 ` [v6,net-next,15/15] " Jakub Kicinski
2026-02-05 6:04 ` Wei Fang