public inbox for dev@dpdk.org
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Bruce Richardson <bruce.richardson@intel.com>, <dev@dpdk.org>
Cc: <stable@dpdk.org>, Jingjing Wu <jingjing.wu@intel.com>,
	Praveen Shetty <praveen.shetty@intel.com>,
	Xiaoyun Li <xiaoyun.li@intel.com>,
	Beilei Xing <beilei.xing@intel.com>,
	Junfeng Guo <junfengg@nvidia.com>
Subject: Re: [PATCH] net/idpf: handle Tx of mbuf segments larger than 16k
Date: Fri, 6 Mar 2026 14:45:00 +0100
Message-ID: <f46aa723-75a0-41dc-a063-d79420365b3a@intel.com>
In-Reply-To: <20260303150026.1601461-1-bruce.richardson@intel.com>

On 3/3/2026 4:00 PM, Bruce Richardson wrote:
> Recent rework of the Tx single-queue path in idpf aligned that path with
> that of other drivers, meaning it now supports segments of size greater
> than 16k. Rework the split-queue path to similarly support those large
> segments.
> 
> Fixes: 770f4dfe0f79 ("net/idpf: support basic Tx data path")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---

<snip>

>   		uint64_t cd_qw0 = 0, cd_qw1 = 0;
>   		nb_ctx = idpf_set_tso_ctx(ol_flags, tx_pkt, &tx_offload, txq,
>   					  &cd_qw0, &cd_qw1);
>   
> -		/* Calculate the number of TX descriptors needed for
> -		 * each packet. For TSO packets, use ci_calc_pkt_desc as
> -		 * the mbuf data size might exceed max data size that hw allows
> -		 * per tx desc.
> +		/* Calculate the number of TX descriptors needed for each packet.
> +		 * For TSO packets, use ci_calc_pkt_desc as the mbuf data size
> +		 * might exceed the max data size that hw allows per tx desc.
>   		 */
> -		if (ol_flags & RTE_MBUF_F_TX_TCP_SEG)
> +		if (ol_flags & (RTE_MBUF_F_TX_TCP_SEG | RTE_MBUF_F_TX_UDP_SEG))

This looks like a drive-by fix for an unrelated issue. That particular 
code was introduced here:

2904020f8313 ("net/intel: add common function to calculate needed descs")

Other drivers also check TSO flags, but they look only at TCP_SEG and
not at UDP_SEG. Should they all check for both? Perhaps this should be
looked at and fixed across all our PMDs that support TSO.

(To be clear, this is a general question; I'm not implying these changes
must be part of this patchset.)
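
To make the question concrete, here is a minimal sketch of the check I have in mind. It is not driver code: the SKETCH_* flag bits are stand-ins for RTE_MBUF_F_TX_TCP_SEG and RTE_MBUF_F_TX_UDP_SEG, the helper names are hypothetical, and split_descs stands for whatever a ci_calc_pkt_desc()-style helper would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits for this sketch only. In DPDK the real
 * RTE_MBUF_F_TX_TCP_SEG and RTE_MBUF_F_TX_UDP_SEG are defined by the
 * mbuf library; the values here are NOT the real ones. */
#define SKETCH_TX_TCP_SEG (UINT64_C(1) << 0)
#define SKETCH_TX_UDP_SEG (UINT64_C(1) << 1)
#define SKETCH_TX_ANY_SEG (SKETCH_TX_TCP_SEG | SKETCH_TX_UDP_SEG)

/* True if the packet requests either TCP or UDP segmentation offload. */
static inline bool
sketch_is_seg_offload(uint64_t ol_flags)
{
	return (ol_flags & SKETCH_TX_ANY_SEG) != 0;
}

/* Choose the descriptor count the way the quoted hunk does:
 * segmentation-offload packets use the split-aware count (split_descs),
 * everything else uses one descriptor per mbuf segment (nb_segs).
 * nb_ctx is the number of context descriptors, added in both cases. */
static inline uint16_t
sketch_nb_used(uint64_t ol_flags, uint16_t nb_segs, uint16_t split_descs,
	       uint16_t nb_ctx)
{
	assert(nb_segs > 0);
	if (sketch_is_seg_offload(ol_flags))
		return (uint16_t)(split_descs + nb_ctx);
	return (uint16_t)(nb_segs + nb_ctx);
}
```

With a helper like this, a UDP_SEG-only packet takes the same split-aware path as a TCP_SEG one, which is what the hunk above does for idpf.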

>   			nb_used = ci_calc_pkt_desc(tx_pkt) + nb_ctx;
>   		else
>   			nb_used = tx_pkt->nb_segs + nb_ctx;
>   
> +		if (txq->nb_tx_free <= txq->tx_free_thresh) {
> +			/* TODO: Need to refine
> +			 * 1. free and clean: Better to decide a clean destination instead of
> +			 * loop times. And don't free mbuf when RS got immediately, free when
> +			 * transmit or according to the clean destination.
> +			 * Now, just ignore the RE write back, free mbuf when get RS
> +			 * 2. out-of-order rewrite back haven't be supported, SW head and HW head
> +			 * need to be separated.
> +			 **/
> +			nb_to_clean = 2 * txq->tx_rs_thresh;
> +			while (nb_to_clean--)
> +				idpf_split_tx_free(txq->complq);
> +		}
> +
> +		if (txq->nb_tx_free < nb_used)
> +			break;
> +
>   		if (ol_flags & CI_TX_CKSUM_OFFLOAD_MASK)
>   			cmd_dtype = IDPF_TXD_FLEX_FLOW_CMD_CS_EN;
>   
> @@ -959,30 +959,52 @@ idpf_dp_splitq_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>   			ctx_desc[0] = cd_qw0;
>   			ctx_desc[1] = cd_qw1;
>   
> -			tx_id++;
> -			if (tx_id == txq->nb_tx_desc)
> +			if (++tx_id == txq->nb_tx_desc)
>   				tx_id = 0;
>   		}
>   
> +		cmd_dtype |= IDPF_TX_DESC_DTYPE_FLEX_FLOW_SCHE;
> +		struct rte_mbuf *m_seg = tx_pkt;
>   		do {
> -			txd = &txr[tx_id];
> -			txn = &sw_ring[txe->next_id];
> -			txe->mbuf = tx_pkt;
> +			uint64_t buf_dma_addr = rte_mbuf_data_iova(m_seg);
> +			uint16_t slen = m_seg->data_len;
> +
> +			txe->mbuf = m_seg;

CodeRabbit picked up on something here, and I think it's worth highlighting.

When we're splitting a segment across several descriptors, we store the
mbuf pointer in txe->mbuf for the segment's first descriptor...

<snip>

> +				txe = &sw_ring[sw_id];
> +				/* sub-descriptor slots do not own the mbuf */
> +				txe->mbuf = NULL;

...then set txe->mbuf to NULL for the subsequent sub-descriptor slots...

> +			}
>   
> -			/* Setup TX descriptor */
> -			txd->buf_addr =
> -				rte_cpu_to_le_64(rte_mbuf_data_iova(tx_pkt));
> -			cmd_dtype |= IDPF_TX_DESC_DTYPE_FLEX_FLOW_SCHE;
> +			/* Write the final (or only) descriptor for this segment */
> +			txd = &txr[tx_id];
> +			txd->buf_addr = rte_cpu_to_le_64(buf_dma_addr);
>   			txd->qw1.cmd_dtype = cmd_dtype;
> -			txd->qw1.rxr_bufsize = tx_pkt->data_len;
> +			txd->qw1.rxr_bufsize = slen;
>   			txd->qw1.compl_tag = sw_id;

...and we're supposed to write the final descriptor here. But we've
stored the mbuf pointer in the entry for the *first* descriptor, not the
*last* one, so when completions are processed for this final descriptor,
its txe->mbuf will be NULL. Is that intended?
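
A toy model of what I mean, in case it helps. This is a sketch under assumptions, not the driver: the ring size, the 16383-byte per-descriptor limit and all sketch_* names are made up for illustration, and it only mirrors the ownership pattern visible in the quoted hunks (first slot holds the mbuf, later slots hold NULL).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8
/* Assumed per-descriptor data limit, for illustration only. */
#define SKETCH_MAX_DATA_PER_DESC 16383u

struct sketch_mbuf {
	uint32_t data_len;
};

struct sketch_sw_entry {
	struct sketch_mbuf *mbuf;
};

/* Descriptors needed for one segment: ceil(len / max), at least one. */
static uint16_t
sketch_descs_for_seg(uint32_t len)
{
	if (len == 0)
		return 1;
	return (uint16_t)((len + SKETCH_MAX_DATA_PER_DESC - 1) /
			  SKETCH_MAX_DATA_PER_DESC);
}

/* Place one segment starting at slot 'start': the first slot owns the
 * mbuf and every later sub-descriptor slot is set to NULL.
 * Returns the index of the last slot used for the segment. */
static uint16_t
sketch_place_seg(struct sketch_sw_entry *ring, uint16_t start,
		 struct sketch_mbuf *m)
{
	uint16_t n = sketch_descs_for_seg(m->data_len);
	uint16_t id = start;

	assert(start < SKETCH_RING_SIZE);
	assert(n <= SKETCH_RING_SIZE);

	ring[id].mbuf = m;
	for (uint16_t i = 1; i < n; i++) {
		id = (uint16_t)((id + 1) % SKETCH_RING_SIZE);
		ring[id].mbuf = NULL;
	}
	return id;
}
```

For a 40000-byte segment placed at slot 6, this uses slots 6, 7 and 0. Slot 6 holds the mbuf, and slot 0 (the one a tag written into the final descriptor would point at) holds NULL. Whether the completion path ends up looking at the first or the last slot is exactly the part I can't tell from the patch.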


> -			tx_id++;
> -			if (tx_id == txq->nb_tx_desc)
> +			if (++tx_id == txq->nb_tx_desc)
>   				tx_id = 0;
>   			sw_id = txe->next_id;
> -			txe = txn;
> -			tx_pkt = tx_pkt->next;
> -		} while (tx_pkt);
> +			txe = &sw_ring[sw_id];
> +			m_seg = m_seg->next;
> +		} while (m_seg);
>   
>   		/* fill the last descriptor with End of Packet (EOP) bit */
>   		txd->qw1.cmd_dtype |= IDPF_TXD_FLEX_FLOW_CMD_EOP;

-- 
Thanks,
Anatoly

Thread overview: 5+ messages
2026-03-03 15:00 [PATCH] net/idpf: handle Tx of mbuf segments larger than 16k Bruce Richardson
2026-03-04  9:53 ` Bruce Richardson
2026-03-06 13:45 ` Burakov, Anatoly [this message]
2026-03-06 14:03   ` Burakov, Anatoly
2026-03-06 14:16     ` Burakov, Anatoly