From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Bruce Richardson <bruce.richardson@intel.com>, <dev@dpdk.org>
Cc: <stable@dpdk.org>, Jingjing Wu <jingjing.wu@intel.com>,
Praveen Shetty <praveen.shetty@intel.com>,
Xiaoyun Li <xiaoyun.li@intel.com>,
Beilei Xing <beilei.xing@intel.com>,
Junfeng Guo <junfengg@nvidia.com>
Subject: Re: [PATCH] net/idpf: handle Tx of mbuf segments larger than 16k
Date: Fri, 6 Mar 2026 14:45:00 +0100
Message-ID: <f46aa723-75a0-41dc-a063-d79420365b3a@intel.com>
In-Reply-To: <20260303150026.1601461-1-bruce.richardson@intel.com>
On 3/3/2026 4:00 PM, Bruce Richardson wrote:
> Recent rework of the Tx single-queue path in idpf aligned that path with
> that of other drivers, meaning it now supports segments of size greater
> than 16k. Rework the split-queue path to similarly support those large
> segments.
>
> Fixes: 770f4dfe0f79 ("net/idpf: support basic Tx data path")
> Cc: stable@dpdk.org
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
<snip>
> uint64_t cd_qw0 = 0, cd_qw1 = 0;
> nb_ctx = idpf_set_tso_ctx(ol_flags, tx_pkt, &tx_offload, txq,
> &cd_qw0, &cd_qw1);
>
> - /* Calculate the number of TX descriptors needed for
> - * each packet. For TSO packets, use ci_calc_pkt_desc as
> - * the mbuf data size might exceed max data size that hw allows
> - * per tx desc.
> + /* Calculate the number of TX descriptors needed for each packet.
> + * For TSO packets, use ci_calc_pkt_desc as the mbuf data size
> + * might exceed the max data size that hw allows per tx desc.
> */
> - if (ol_flags & RTE_MBUF_F_TX_TCP_SEG)
> + if (ol_flags & (RTE_MBUF_F_TX_TCP_SEG | RTE_MBUF_F_TX_UDP_SEG))
This looks like a drive-by fix for an unrelated issue. That particular
code was introduced here:
2904020f8313 ("net/intel: add common function to calculate needed descs")
There are other drivers that check TSO flags but look only at TCP_SEG
and not UDP_SEG - should they all check both? Perhaps this should be
reviewed and fixed across all our PMDs that support TSO.
(To be clear, this is a general question; I'm not implying these changes
must be part of this patchset.)
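To make the question concrete, here is a minimal standalone sketch of the descriptor-count decision with both segmentation flags checked. It is not DPDK code: the `SKETCH_*` flag values, the `sketch_seg` type, and the 16383-byte per-descriptor limit are stand-ins assumed for illustration, and `sketch_calc_pkt_desc()` only approximates what ci_calc_pkt_desc is described as doing in the quoted comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in flag bits for illustration only; real code would use
 * RTE_MBUF_F_TX_TCP_SEG and RTE_MBUF_F_TX_UDP_SEG. */
#define SKETCH_TX_TCP_SEG  (UINT64_C(1) << 0)
#define SKETCH_TX_UDP_SEG  (UINT64_C(1) << 1)
#define SKETCH_TX_SEG_MASK (SKETCH_TX_TCP_SEG | SKETCH_TX_UDP_SEG)

/* Assumed per-descriptor data limit (16k - 1); hypothetical name. */
#define SKETCH_MAX_DATA_PER_TXD 16383u

/* Hypothetical, stripped-down stand-in for a chained mbuf segment. */
struct sketch_seg {
	uint16_t data_len;
	const struct sketch_seg *next;
};

/* Count descriptors when one segment may need several of them. */
static uint16_t
sketch_calc_pkt_desc(const struct sketch_seg *seg)
{
	uint16_t count = 0;

	for (; seg != NULL; seg = seg->next)
		count += (seg->data_len + SKETCH_MAX_DATA_PER_TXD - 1) /
				SKETCH_MAX_DATA_PER_TXD;
	return count;
}

/* Take the per-segment calculation for either TCP or UDP segmentation,
 * otherwise assume one descriptor per segment. */
static uint16_t
sketch_nb_used(uint64_t ol_flags, const struct sketch_seg *pkt,
		uint16_t nb_segs, uint16_t nb_ctx)
{
	assert(pkt != NULL);
	if (ol_flags & SKETCH_TX_SEG_MASK)
		return sketch_calc_pkt_desc(pkt) + nb_ctx;
	return nb_segs + nb_ctx;
}
```

With a mask that only contained the TCP bit, a UDP-segmented packet carrying a 20000-byte segment would fall through to the `nb_segs` branch and be undercounted by one descriptor.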
> nb_used = ci_calc_pkt_desc(tx_pkt) + nb_ctx;
> else
> nb_used = tx_pkt->nb_segs + nb_ctx;
>
> + if (txq->nb_tx_free <= txq->tx_free_thresh) {
> + /* TODO: Need to refine
> + * 1. free and clean: Better to decide a clean destination instead of
> + * loop times. And don't free mbuf when RS got immediately, free when
> + * transmit or according to the clean destination.
> + * Now, just ignore the RE write back, free mbuf when get RS
> + * 2. out-of-order rewrite back haven't be supported, SW head and HW head
> + * need to be separated.
> + **/
> + nb_to_clean = 2 * txq->tx_rs_thresh;
> + while (nb_to_clean--)
> + idpf_split_tx_free(txq->complq);
> + }
> +
> + if (txq->nb_tx_free < nb_used)
> + break;
> +
> if (ol_flags & CI_TX_CKSUM_OFFLOAD_MASK)
> cmd_dtype = IDPF_TXD_FLEX_FLOW_CMD_CS_EN;
>
> @@ -959,30 +959,52 @@ idpf_dp_splitq_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
> ctx_desc[0] = cd_qw0;
> ctx_desc[1] = cd_qw1;
>
> - tx_id++;
> - if (tx_id == txq->nb_tx_desc)
> + if (++tx_id == txq->nb_tx_desc)
> tx_id = 0;
> }
>
> + cmd_dtype |= IDPF_TX_DESC_DTYPE_FLEX_FLOW_SCHE;
> + struct rte_mbuf *m_seg = tx_pkt;
> do {
> - txd = &txr[tx_id];
> - txn = &sw_ring[txe->next_id];
> - txe->mbuf = tx_pkt;
> + uint64_t buf_dma_addr = rte_mbuf_data_iova(m_seg);
> + uint16_t slen = m_seg->data_len;
> +
> + txe->mbuf = m_seg;
CodeRabbit picked up on something here, and I think it's worth highlighting.
When we split a segment, we store the mbuf pointer in the first entry...
<snip>
> + txe = &sw_ring[sw_id];
> + /* sub-descriptor slots do not own the mbuf */
> + txe->mbuf = NULL;
...then set the mbuf pointer of the subsequent entries to NULL...
> + }
>
> - /* Setup TX descriptor */
> - txd->buf_addr =
> - rte_cpu_to_le_64(rte_mbuf_data_iova(tx_pkt));
> - cmd_dtype |= IDPF_TX_DESC_DTYPE_FLEX_FLOW_SCHE;
> + /* Write the final (or only) descriptor for this segment */
> + txd = &txr[tx_id];
> + txd->buf_addr = rte_cpu_to_le_64(buf_dma_addr);
> txd->qw1.cmd_dtype = cmd_dtype;
> - txd->qw1.rxr_bufsize = tx_pkt->data_len;
> + txd->qw1.rxr_bufsize = slen;
> txd->qw1.compl_tag = sw_id;
...and we're supposed to write the final descriptor here. But we've
stored the mbuf pointer in the *first* entry, not in the *last* one,
which means that when the completion for this descriptor is processed,
its mbuf pointer will be NULL? Is that intended?
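A small standalone model of the concern, in case it helps the discussion. It is a sketch under assumptions, not the driver code: it assumes the snipped loop advances sw_id to the next slot before clearing it, and that the value written to compl_tag is the index of the last slot used by the segment. The `sketch_*` names, the ring size, and the 16383-byte limit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE        8u
#define SKETCH_MAX_DATA_PER_TXD 16383u /* assumed limit, 16k - 1 */

/* Hypothetical stand-in for a software ring entry. */
struct sketch_sw_entry {
	const void *mbuf;
};

/*
 * Model one segment being written to the ring the way the quoted hunk
 * appears to: the first slot records the mbuf, every extra slot needed
 * for an oversized segment is set to NULL. Returns the index of the
 * final slot, i.e. the value assumed to end up in compl_tag.
 */
static uint16_t
sketch_fill_segment(struct sketch_sw_entry *ring, uint16_t sw_id,
		const void *mbuf, uint32_t data_len)
{
	assert(data_len > 0);
	ring[sw_id].mbuf = mbuf;
	while (data_len > SKETCH_MAX_DATA_PER_TXD) {
		data_len -= SKETCH_MAX_DATA_PER_TXD;
		sw_id = (uint16_t)((sw_id + 1) % SKETCH_RING_SIZE);
		/* sub-descriptor slots do not own the mbuf */
		ring[sw_id].mbuf = NULL;
	}
	return sw_id;
}
```

In this model, a 20000-byte segment starting at slot 0 yields tag 1, and `ring[1].mbuf` is NULL while `ring[0].mbuf` holds the pointer. Whether that matters depends on how completion handling uses the tag, which the sketch deliberately does not model.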
> - tx_id++;
> - if (tx_id == txq->nb_tx_desc)
> + if (++tx_id == txq->nb_tx_desc)
> tx_id = 0;
> sw_id = txe->next_id;
> - txe = txn;
> - tx_pkt = tx_pkt->next;
> - } while (tx_pkt);
> + txe = &sw_ring[sw_id];
> + m_seg = m_seg->next;
> + } while (m_seg);
>
> /* fill the last descriptor with End of Packet (EOP) bit */
> txd->qw1.cmd_dtype |= IDPF_TXD_FLEX_FLOW_CMD_EOP;
--
Thanks,
Anatoly