From: Frank Li <Frank.li@nxp.com>
To: Wei Fang <wei.fang@nxp.com>
Cc: shenwei.wang@nxp.com, xiaoning.wang@nxp.com,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, ast@kernel.org,
	daniel@iogearbox.net, hawk@kernel.org, john.fastabend@gmail.com,
	sdf@fomichev.me, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, imx@lists.linux.dev,
	bpf@vger.kernel.org
Subject: Re: [PATCH v2 net-next 06/14] net: fec: add fec_enet_rx_queue_xdp() for XDP path
Date: Fri, 16 Jan 2026 09:19:45 -0500	[thread overview]
Message-ID: <aWpJAfYTvO4D/COp@lizhi-Precision-Tower-5810> (raw)
In-Reply-To: <20260116074027.1603841-7-wei.fang@nxp.com>

On Fri, Jan 16, 2026 at 03:40:19PM +0800, Wei Fang wrote:
> Currently, XDP path packets and protocol stack packets are both processed
> in fec_enet_rx_queue(), which makes the logic somewhat confusing and
> debugging more difficult. Furthermore, the two paths do not need all of
> each other's logic. For example, the kernel path does not need to call
> xdp_init_buff(), and the XDP path does not support swap_buffer(), etc.
> This prevents XDP from achieving its maximum performance. Therefore, XDP
> packet processing has been separated from fec_enet_rx_queue() into a new
> fec_enet_rx_queue_xdp() function to streamline the XDP path logic and
> improve XDP performance.
>
> The XDP performance on the iMX93 platform was compared before and after
> applying this patch. Detailed results are as follows and we can see the
> performance has been improved.
>
> Env: i.MX93, packet size 64 bytes including FCS, only single core and RX
> BD ring are used to receive packets, flow-control is off.
>
> Before the patch is applied:
> xdp-bench tx eth0
> Summary                   396,868 rx/s                  0 err,drop/s
> Summary                   396,024 rx/s                  0 err,drop/s
>
> xdp-bench drop eth0
> Summary                   684,781 rx/s                  0 err/s
> Summary                   675,746 rx/s                  0 err/s
>
> xdp-bench pass eth0
> Summary                   208,552 rx/s                  0 err,drop/s
> Summary                   208,654 rx/s                  0 err,drop/s
>
> xdp-bench redirect eth0 eth0
> eth0->eth0                311,210 rx/s                  0 err,drop/s      311,208 xmit/s
> eth0->eth0                310,808 rx/s                  0 err,drop/s      310,809 xmit/s
>
> After the patch is applied:
> xdp-bench tx eth0
> Summary                   409,975 rx/s                  0 err,drop/s
> Summary                   411,073 rx/s                  0 err,drop/s
>
> xdp-bench drop eth0
> Summary                   700,681 rx/s                  0 err/s
> Summary                   698,102 rx/s                  0 err/s
>
> xdp-bench pass eth0
> Summary                   211,356 rx/s                  0 err,drop/s
> Summary                   210,629 rx/s                  0 err,drop/s
>
> xdp-bench redirect eth0 eth0
> eth0->eth0                320,351 rx/s                  0 err,drop/s      320,348 xmit/s
> eth0->eth0                318,988 rx/s                  0 err,drop/s      318,988 xmit/s
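
The gain is real but modest. As a quick back-of-the-envelope sketch (my own calculation, not part of the patch), averaging the two quoted samples for each xdp-bench mode gives roughly a 1-3.5% improvement:

```python
# Average the two xdp-bench rx/s samples quoted above and compute the
# relative improvement per benchmark mode.
before = {
    "tx": (396868, 396024),
    "drop": (684781, 675746),
    "pass": (208552, 208654),
    "redirect": (311210, 310808),
}
after = {
    "tx": (409975, 411073),
    "drop": (700681, 698102),
    "pass": (211356, 210629),
    "redirect": (320351, 318988),
}

def improvement_pct(mode):
    b = sum(before[mode]) / 2
    a = sum(after[mode]) / 2
    return 100.0 * (a - b) / b

for mode in before:
    print(f"{mode}: {improvement_pct(mode):+.2f}%")
# tx: +3.55%, drop: +2.81%, pass: +1.15%, redirect: +2.78%
```

So the tx/drop/redirect paths gain close to 3%, while pass gains only about 1%.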
>
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> ---
>  drivers/net/ethernet/freescale/fec_main.c | 292 ++++++++++++++--------
>  1 file changed, 188 insertions(+), 104 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> index 0529dc91c981..251191ab99b3 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -79,7 +79,7 @@ static void set_multicast_list(struct net_device *ndev);
>  static void fec_enet_itr_coal_set(struct net_device *ndev);
>  static int fec_enet_xdp_tx_xmit(struct fec_enet_private *fep,
>  				int cpu, struct xdp_buff *xdp,
> -				u32 dma_sync_len);
> +				u32 dma_sync_len, int queue);
>
>  #define DRIVER_NAME	"fec"
>
> @@ -1665,71 +1665,6 @@ static int fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
>  	return 0;
>  }
>
> -static u32
> -fec_enet_run_xdp(struct fec_enet_private *fep, struct bpf_prog *prog,
> -		 struct xdp_buff *xdp, struct fec_enet_priv_rx_q *rxq, int cpu)
> -{
> -	unsigned int sync, len = xdp->data_end - xdp->data;
> -	u32 ret = FEC_ENET_XDP_PASS;
> -	struct page *page;
> -	int err;
> -	u32 act;
> -
> -	act = bpf_prog_run_xdp(prog, xdp);
> -
> -	/* Due xdp_adjust_tail and xdp_adjust_head: DMA sync for_device cover
> -	 * max len CPU touch
> -	 */
> -	sync = xdp->data_end - xdp->data;
> -	sync = max(sync, len);
> -
> -	switch (act) {
> -	case XDP_PASS:
> -		rxq->stats[RX_XDP_PASS]++;
> -		ret = FEC_ENET_XDP_PASS;
> -		break;
> -
> -	case XDP_REDIRECT:
> -		rxq->stats[RX_XDP_REDIRECT]++;
> -		err = xdp_do_redirect(fep->netdev, xdp, prog);
> -		if (unlikely(err))
> -			goto xdp_err;
> -
> -		ret = FEC_ENET_XDP_REDIR;
> -		break;
> -
> -	case XDP_TX:
> -		rxq->stats[RX_XDP_TX]++;
> -		err = fec_enet_xdp_tx_xmit(fep, cpu, xdp, sync);
> -		if (unlikely(err)) {
> -			rxq->stats[RX_XDP_TX_ERRORS]++;
> -			goto xdp_err;
> -		}
> -
> -		ret = FEC_ENET_XDP_TX;
> -		break;
> -
> -	default:
> -		bpf_warn_invalid_xdp_action(fep->netdev, prog, act);
> -		fallthrough;
> -
> -	case XDP_ABORTED:
> -		fallthrough;    /* handle aborts by dropping packet */
> -
> -	case XDP_DROP:
> -		rxq->stats[RX_XDP_DROP]++;
> -xdp_err:
> -		ret = FEC_ENET_XDP_CONSUMED;
> -		page = virt_to_head_page(xdp->data);
> -		page_pool_put_page(rxq->page_pool, page, sync, true);
> -		if (act != XDP_DROP)
> -			trace_xdp_exception(fep->netdev, prog, act);
> -		break;
> -	}
> -
> -	return ret;
> -}
> -
>  static void fec_enet_rx_vlan(const struct net_device *ndev, struct sk_buff *skb)
>  {
>  	if (ndev->features & NETIF_F_HW_VLAN_CTAG_RX) {
> @@ -1842,19 +1777,14 @@ static struct sk_buff *fec_build_skb(struct fec_enet_private *fep,
>  static int fec_enet_rx_queue(struct fec_enet_private *fep,
>  			     u16 queue, int budget)
>  {
> -	u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
>  	struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
> -	struct bpf_prog *xdp_prog = READ_ONCE(fep->xdp_prog);
>  	bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
> -	u32 ret, xdp_result = FEC_ENET_XDP_PASS;
>  	struct net_device *ndev = fep->netdev;
>  	struct bufdesc *bdp = rxq->bd.cur;
>  	u32 sub_len = 4 + fep->rx_shift;
> -	int cpu = smp_processor_id();
>  	int pkt_received = 0;
>  	u16 status, pkt_len;
>  	struct sk_buff *skb;
> -	struct xdp_buff xdp;
>  	struct page *page;
>  	dma_addr_t dma;
>  	int index;
> @@ -1870,8 +1800,6 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
>  	/* First, grab all of the stats for the incoming packet.
>  	 * These get messed up if we get called due to a busy condition.
>  	 */
> -	xdp_init_buff(&xdp, PAGE_SIZE << fep->pagepool_order, &rxq->xdp_rxq);
> -
>  	while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) {
>
>  		if (pkt_received >= budget)
> @@ -1902,17 +1830,6 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
>  					DMA_FROM_DEVICE);
>  		prefetch(page_address(page));
>
> -		if (xdp_prog) {
> -			xdp_buff_clear_frags_flag(&xdp);
> -			/* subtract 16bit shift and FCS */
> -			xdp_prepare_buff(&xdp, page_address(page),
> -					 data_start, pkt_len - sub_len, false);
> -			ret = fec_enet_run_xdp(fep, xdp_prog, &xdp, rxq, cpu);
> -			xdp_result |= ret;
> -			if (ret != FEC_ENET_XDP_PASS)
> -				goto rx_processing_done;
> -		}
> -
>  		if (unlikely(need_swap)) {
>  			u8 *data;
>
> @@ -1961,7 +1878,181 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
>  	}
>  	rxq->bd.cur = bdp;
>
> -	if (xdp_result & FEC_ENET_XDP_REDIR)
> +	return pkt_received;
> +}
> +
> +static void fec_xdp_drop(struct fec_enet_priv_rx_q *rxq,
> +			 struct xdp_buff *xdp, u32 sync)
> +{
> +	struct page *page = virt_to_head_page(xdp->data);
> +
> +	page_pool_put_page(rxq->page_pool, page, sync, true);
> +}
> +
> +static int
> +fec_enet_xdp_get_tx_queue(struct fec_enet_private *fep, int index)
> +{
> +	if (unlikely(index < 0))
> +		return 0;
> +
> +	return (index % fep->num_tx_queues);
> +}
> +
> +static int fec_enet_rx_queue_xdp(struct fec_enet_private *fep, int queue,
> +				 int budget, struct bpf_prog *prog)
> +{
> +	u32 data_start = FEC_ENET_XDP_HEADROOM + fep->rx_shift;
> +	struct fec_enet_priv_rx_q *rxq = fep->rx_queue[queue];
> +	struct net_device *ndev = fep->netdev;
> +	struct bufdesc *bdp = rxq->bd.cur;
> +	u32 sub_len = 4 + fep->rx_shift;
> +	int cpu = smp_processor_id();
> +	int pkt_received = 0;
> +	struct sk_buff *skb;
> +	u16 status, pkt_len;
> +	struct xdp_buff xdp;
> +	int tx_qid = queue;
> +	struct page *page;
> +	u32 xdp_res = 0;
> +	dma_addr_t dma;
> +	int index, err;
> +	u32 act, sync;
> +
> +#if defined(CONFIG_COLDFIRE) && !defined(CONFIG_COLDFIRE_COHERENT_DMA)
> +	/*
> +	 * Hacky flush of all caches instead of using the DMA API for the TSO
> +	 * headers.
> +	 */
> +	flush_cache_all();
> +#endif
> +
> +	if (unlikely(queue >= fep->num_tx_queues))
> +		tx_qid = fec_enet_xdp_get_tx_queue(fep, cpu);
> +
> +	xdp_init_buff(&xdp, PAGE_SIZE << fep->pagepool_order, &rxq->xdp_rxq);
> +
> +	while (!((status = fec16_to_cpu(bdp->cbd_sc)) & BD_ENET_RX_EMPTY)) {
> +		if (pkt_received >= budget)
> +			break;
> +		pkt_received++;
> +
> +		writel(FEC_ENET_RXF_GET(queue), fep->hwp + FEC_IEVENT);
> +
> +		/* Check for errors. */
> +		status ^= BD_ENET_RX_LAST;
> +		if (unlikely(fec_rx_error_check(ndev, status)))
> +			goto rx_processing_done;
> +
> +		/* Process the incoming frame. */
> +		ndev->stats.rx_packets++;
> +		pkt_len = fec16_to_cpu(bdp->cbd_datlen);
> +		ndev->stats.rx_bytes += pkt_len - fep->rx_shift;
> +
> +		index = fec_enet_get_bd_index(bdp, &rxq->bd);
> +		page = rxq->rx_buf[index];
> +		dma = fec32_to_cpu(bdp->cbd_bufaddr);
> +
> +		if (fec_enet_update_cbd(rxq, bdp, index)) {
> +			ndev->stats.rx_dropped++;
> +			goto rx_processing_done;
> +		}
> +
> +		dma_sync_single_for_cpu(&fep->pdev->dev, dma, pkt_len,
> +					DMA_FROM_DEVICE);
> +		prefetch(page_address(page));
> +
> +		xdp_buff_clear_frags_flag(&xdp);
> +		/* subtract 16bit shift and FCS */
> +		pkt_len -= sub_len;
> +		xdp_prepare_buff(&xdp, page_address(page), data_start,
> +				 pkt_len, false);
> +
> +		act = bpf_prog_run_xdp(prog, &xdp);
> +		/* Due xdp_adjust_tail and xdp_adjust_head: DMA sync
> +		 * for_device cover max len CPU touch.
> +		 */
> +		sync = xdp.data_end - xdp.data;
> +		sync = max(sync, pkt_len);
> +
> +		switch (act) {
> +		case XDP_PASS:
> +			rxq->stats[RX_XDP_PASS]++;
> +			/* The packet length includes FCS, but we don't want to
> +			 * include that when passing upstream as it messes up
> +			 * bridging applications.
> +			 */
> +			skb = fec_build_skb(fep, rxq, bdp, page, pkt_len);
> +			if (!skb) {
> +				fec_xdp_drop(rxq, &xdp, sync);
> +				trace_xdp_exception(ndev, prog, XDP_PASS);
> +			} else {
> +				napi_gro_receive(&fep->napi, skb);
> +			}
> +			break;
> +		case XDP_REDIRECT:
> +			rxq->stats[RX_XDP_REDIRECT]++;
> +			err = xdp_do_redirect(ndev, &xdp, prog);
> +			if (unlikely(err)) {
> +				fec_xdp_drop(rxq, &xdp, sync);
> +				trace_xdp_exception(ndev, prog, XDP_REDIRECT);
> +			} else {
> +				xdp_res |= FEC_ENET_XDP_REDIR;
> +			}
> +			break;
> +		case XDP_TX:
> +			rxq->stats[RX_XDP_TX]++;
> +			err = fec_enet_xdp_tx_xmit(fep, cpu, &xdp, sync, tx_qid);
> +			if (unlikely(err)) {
> +				rxq->stats[RX_XDP_TX_ERRORS]++;
> +				fec_xdp_drop(rxq, &xdp, sync);
> +				trace_xdp_exception(ndev, prog, XDP_TX);
> +			}
> +			break;
> +		default:
> +			bpf_warn_invalid_xdp_action(ndev, prog, act);
> +			fallthrough;
> +		case XDP_ABORTED:
> +			/* handle aborts by dropping packet */
> +			fallthrough;
> +		case XDP_DROP:
> +			rxq->stats[RX_XDP_DROP]++;
> +			fec_xdp_drop(rxq, &xdp, sync);
> +			break;
> +		}
> +
> +rx_processing_done:
> +		/* Clear the status flags for this buffer */
> +		status &= ~BD_ENET_RX_STATS;
> +		/* Mark the buffer empty */
> +		status |= BD_ENET_RX_EMPTY;
> +
> +		if (fep->bufdesc_ex) {
> +			struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
> +
> +			ebdp->cbd_esc = cpu_to_fec32(BD_ENET_RX_INT);
> +			ebdp->cbd_prot = 0;
> +			ebdp->cbd_bdu = 0;
> +		}
> +
> +		/* Make sure the updates to rest of the descriptor are
> +		 * performed before transferring ownership.
> +		 */
> +		dma_wmb();
> +		bdp->cbd_sc = cpu_to_fec16(status);
> +
> +		/* Update BD pointer to next entry */
> +		bdp = fec_enet_get_nextdesc(bdp, &rxq->bd);
> +
> +		/* Doing this here will keep the FEC running while we process
> +		 * incoming frames. On a heavily loaded network, we should be
> +		 * able to keep up at the expense of system resources.
> +		 */
> +		writel(0, rxq->bd.reg_desc_active);
> +	}
> +
> +	rxq->bd.cur = bdp;
> +
> +	if (xdp_res & FEC_ENET_XDP_REDIR)
>  		xdp_do_flush();
>
>  	return pkt_received;
> @@ -1970,11 +2061,17 @@ static int fec_enet_rx_queue(struct fec_enet_private *fep,
>  static int fec_enet_rx(struct net_device *ndev, int budget)
>  {
>  	struct fec_enet_private *fep = netdev_priv(ndev);
> +	struct bpf_prog *prog = READ_ONCE(fep->xdp_prog);
>  	int i, done = 0;
>
>  	/* Make sure that AVB queues are processed first. */
> -	for (i = fep->num_rx_queues - 1; i >= 0; i--)
> -		done += fec_enet_rx_queue(fep, i, budget - done);
> +	for (i = fep->num_rx_queues - 1; i >= 0; i--) {
> +		if (prog)
> +			done += fec_enet_rx_queue_xdp(fep, i, budget - done,
> +						      prog);

The patch is still hard to review. It may be simpler if:
1. A new patch first copies fec_enet_rx_queue() verbatim to fec_enet_rx_queue_xdp().
2. The actual changes then become small when based on 1.
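
To illustrate the copy-first split (a purely hypothetical throwaway-repo sketch, not the actual fec tree or the exact commands Wei would use):

```shell
# Demonstrate the two-step split in a scratch repo: patch 1 is a pure
# copy (trivial to review), patch 2 carries only the real changes.
set -e
repo=$(mktemp -d)
cd "$repo" && git init -q .
git config user.email reviewer@example.com
git config user.name reviewer
printf 'static int fec_enet_rx_queue(void) { return 0; }\n' > fec_main.c
git add fec_main.c && git commit -qm "base"
# Patch 1: duplicate the function under the new name, no functional change.
printf 'static int fec_enet_rx_queue_xdp(void) { return 0; }\n' >> fec_main.c
git add -u && git commit -qm "net: fec: copy fec_enet_rx_queue() to fec_enet_rx_queue_xdp()"
# Patch 2: the real (small) change on top of the copy; the diff now shows
# only the lines that actually differ between the two paths.
sed -i 's/xdp(void) { return 0/xdp(void) { return 1/' fec_main.c
git add -u && git commit -qm "net: fec: specialize fec_enet_rx_queue_xdp() for XDP"
git log --oneline | wc -l   # -> 3 (base + the two-step split)
```

Reviewed this way, the second diff is small enough to check line by line against the original function.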

> +		else
> +			done += fec_enet_rx_queue(fep, i, budget - done);
> +	}
>
>  	return done;
>  }
> @@ -3854,15 +3951,6 @@ static int fec_enet_bpf(struct net_device *dev, struct netdev_bpf *bpf)
>  	}
>  }
>
> -static int
> -fec_enet_xdp_get_tx_queue(struct fec_enet_private *fep, int index)
> -{
> -	if (unlikely(index < 0))
> -		return 0;
> -
> -	return (index % fep->num_tx_queues);
> -}
> -
>  static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
>  				   struct fec_enet_priv_tx_q *txq,
>  				   void *frame, u32 dma_sync_len,
> @@ -3956,15 +4044,11 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
>
>  static int fec_enet_xdp_tx_xmit(struct fec_enet_private *fep,
>  				int cpu, struct xdp_buff *xdp,
> -				u32 dma_sync_len)
> +				u32 dma_sync_len, int queue)

You can split this out as a separate patch that just adds the queue id parameter to fec_enet_xdp_tx_xmit().

Frank
>  {
> -	struct fec_enet_priv_tx_q *txq;
> -	struct netdev_queue *nq;
> -	int queue, ret;
> -
> -	queue = fec_enet_xdp_get_tx_queue(fep, cpu);
> -	txq = fep->tx_queue[queue];
> -	nq = netdev_get_tx_queue(fep->netdev, queue);
> +	struct netdev_queue *nq = netdev_get_tx_queue(fep->netdev, queue);
> +	struct fec_enet_priv_tx_q *txq = fep->tx_queue[queue];
> +	int ret;
>
>  	__netif_tx_lock(nq, cpu);
>
> --
> 2.34.1
>
