From: Stephen Hemminger <stephen@networkplumber.org>
To: Junlong Wang <wang.junlong1@zte.com.cn>
Cc: dev@dpdk.org
Subject: Re: [PATCH v6 0/4] net/zxdh: optimize Rx/Tx path performance
Date: Wed, 17 Jun 2026 08:21:48 -0700 [thread overview]
Message-ID: <20260617082148.0030de4c@phoenix.local> (raw)
In-Reply-To: <20260617082828.1058127-1-wang.junlong1@zte.com.cn>
On Wed, 17 Jun 2026 16:28:23 +0800
Junlong Wang <wang.junlong1@zte.com.cn> wrote:
> v6:
> - Remove unnecessary error checking code in submit_to_backend_simple() and
> pkt_padding(). Since as the max dl_net_hdr_len is always less than
> RTE_PKTMBUF_HEADROOM, rte_pktmbuf_prepend() cannot fail in the
> simple path (single-segment mbufs).
>
> v5:
> - Reorganize patch series, placing interrupt fix as the first patch
> and fix condition check to properly enable interrupts.
> - Fix zxdh_recv_single_pkts() not compacting rcv_pkts[] on failure,
> which could cause use-after-free and mbuf leak.
> - Fix tx_bunch() and tx1() missing store barrier before setting AVAIL flag,
> preventing data race on weakly-ordered architectures.
> - Fix submit_to_backend_simple() writing descriptors for packets that
> failed pkt_padding(), causing mbuf leak.
>
> v4:
> - fix some AI review issues.
> - fix queue enable intr bug.
>
> v3:
> - remove unnecessary NULL check in zxdh_init_queue.
> - Split Ring: Bit[31] is unused and reserved, zxdh_queue_notify(): removing the
> zxdh_pci_with_feature(hw, ZXDH_F_RING_PACKED) check;
> - remove unnecessary double-free in in zxdh_recv_single_pkts();
> - used rte_pktmbuf_mtod();
> - remove rxq_get_vq(q) macro, use q->vq and apply it consistently;
> - Refactoring scatter and mtu check logic in zxdh_dev_mtu_set();
> - set txdp->id = avail_idx + i in tx_bunch/tx1.
> - add comment documenting zxdh_xmit_enqueue_append() now sets dxp->cookie = NULL for
> the head slot and stores cookies per descriptor via dep[idx].cookie.
> - add one-line comment noting tx_bunch() is the simple path handles single-segment.
> - remove unnecessary Extra initialization and the uint32_t cast.
>
> v2:
> - zxdh_rxtx.c, pkt_padding(): modifyed the return value of pkt_padding();
> - zxdh_rxtx.c, zxdh_recv_single_pkts(): modifyed When zxdh_init_mbuf() fails
> the loop does "continue" and free mbufs;
> - zxdh_rxtx.c, refill_desc_unwrap(): Add rte_io_wmb() before writing flags
> in the refill_que_descs();
> - zxdh_queue.h, zxdh_queue_enable_intr(): Remove unnecessary function of zxdh_queue_enable_intr;
> - zxdh_ethdev.c, zxdh_init_queue(): changed the hdr_mz NULL check logic;
>
> - zxdh_rxtx.c, zxdh_xmit_pkts_simple()、zxdh_recv_single_pkts(): add stats.bytes count;
> - zxdh_rxtx.c, zxdh_init_mbuf():remove rte_pktmbuf_dump(stdout, rxm, 40);
> - zxdh_ethdev.c, zxdh_dev_free_mbufs(): using rte_pktmbuf_free() to free mbufs;
> - Splitting into separate patches, structure reorganization and sw_ring removal、
> RX recv optimize、Tx xmit optimize、Tx;
>
> v1:
> This patch optimizes the ZXDH PMD's receive and transmit path for better
> performance through several improvements:
>
> - Add simple TX/RX burst functions (zxdh_xmit_pkts_simple and
> zxdh_recv_single_pkts) for single-segment packet scenarios.
> - Remove RX software ring (sw_ring) to reduce memory allocation and
> copy.
> - Optimize descriptor management with prefetching and simplified
> cleanup.
> - Reorganize structure fields for better cache locality.
>
> These changes reduce CPU cycles and memory bandwidth consumption,
> resulting in improved packet processing throughput.
>
> Junlong Wang (4):
> net/zxdh: fix queue enable intr issues
> net/zxdh: optimize queue structure to improve performance
> net/zxdh: optimize Rx recv pkts performance
> net/zxdh: optimize Tx xmit pkts performance
>
> drivers/net/zxdh/zxdh_ethdev.c | 81 ++---
> drivers/net/zxdh/zxdh_ethdev_ops.c | 23 +-
> drivers/net/zxdh/zxdh_ethdev_ops.h | 4 +
> drivers/net/zxdh/zxdh_pci.c | 2 +-
> drivers/net/zxdh/zxdh_queue.c | 11 +-
> drivers/net/zxdh/zxdh_queue.h | 122 ++++---
> drivers/net/zxdh/zxdh_rxtx.c | 529 ++++++++++++++++++++++-------
> drivers/net/zxdh/zxdh_rxtx.h | 27 +-
> 8 files changed, 539 insertions(+), 260 deletions(-)
>
I thought this was better, but AI found one new bug.
Also, if the driver wants headroom in mbuf and headroom typically
is set from rte_pktmbuf_alloc, a good defensive in depth.
Something like:
static_assert(RTE_PKTMBUF_HEADROOM >= ZXDH_DL_NET_HDR_SIZE,
"RTE_PKTMBUF_HEADROOM too small for zxdh Tx downlink header");
You still need the check in transmit, but this would protect
against user misconfiguration.
Not sure why you needed to open code rte_pktmbuf_prepend() which
has the check that AI wants you to have
Series review: net/zxdh Rx/Tx optimization (v6)
Patches 1-3 are unchanged from v5 and remain good. The v5 issue in the
simple Tx path (leaked mbuf and unpublished descriptor on the
pkt_padding failure branch) is resolved: pkt_padding() no longer
returns a failure, so there is no skip/continue and no ring gap. The
way it was made infallible introduces a new problem.
[PATCH v6 4/4] net/zxdh: optimize Tx xmit pkts performance
Error: pkt_padding() writes the downlink header in place without
checking that the mbuf has hdr_len bytes of headroom, so a packet with
short headroom causes an out-of-bounds write and a data_off underflow.
hdr = rte_pktmbuf_mtod_offset(cookie, struct zxdh_net_hdr_dl *, -hdr_len);
memcpy(hdr, net_hdr_dl, hdr_len);
/* Update mbuf to reflect the prepended header */
cookie->data_off -= hdr_len;
cookie->data_len += hdr_len;
cookie->pkt_len += hdr_len;
This open-codes rte_pktmbuf_prepend() but drops its headroom check. If
cookie->data_off < hdr_len, the mtod_offset pointer lands before
buf_addr and the rte_memcpy scribbles over the mbuf's private area or
the preceding memory; cookie->data_off (uint16_t) then underflows to a
large value. v4/v5 used rte_pktmbuf_prepend() here, which returns NULL
when headroom is short, so this is a regression: a checked-but-mishandled
case became an unchecked memory corruption.
Nothing upstream of the simple path guarantees the headroom. The simple
Tx burst is selected solely on RTE_ETH_TX_OFFLOAD_MULTI_SEGS being
disabled, which is independent of headroom, and tx_prepare()'s
dl_net_hdr_check() only validates offload capability, not headroom. The
driver itself shows the headroom cannot be assumed: the packed path
gates the in-place prepend on
txm->data_off >= ZXDH_DL_NET_HDR_SIZE
(zxdh_xmit_pkts_packed, the can_push test) and falls back to
zxdh_xmit_enqueue_append(), which copies the header into the reserved
txr region instead of prepending. The simple path has no equivalent
guard.
Restore a headroom check before the in-place prepend. For a
short-headroom packet, fall back to the append path that copies the
header into the reserved region (as the packed path does) rather than
writing before the buffer.
prev parent reply other threads:[~2026-06-17 15:21 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-26 2:28 [PATCH v1] net/zxdh: optimize Rx/Tx path performance Junlong Wang
2026-03-26 3:27 ` Stephen Hemminger
2026-04-06 4:26 ` Stephen Hemminger
2026-04-23 1:18 ` [PATCH v2 0/3] " Junlong Wang
2026-04-23 1:18 ` [PATCH v2 1/3] net/zxdh: optimize queue structure to improve performance Junlong Wang
2026-04-23 18:57 ` Stephen Hemminger
2026-04-23 1:18 ` [PATCH v2 2/3] net/zxdh: optimize Rx recv pkts performance Junlong Wang
2026-04-23 18:54 ` Stephen Hemminger
2026-04-23 23:39 ` Stephen Hemminger
2026-04-23 1:18 ` [PATCH v2 3/3] net/zxdh: optimize Tx xmit " Junlong Wang
2026-04-23 19:23 ` [PATCH v2 0/3] net/zxdh: optimize Rx/Tx path performance Stephen Hemminger
2026-05-09 6:29 ` [PATCH v3 " Junlong Wang
2026-05-09 6:29 ` [PATCH v3 1/3] net/zxdh: optimize queue structure to improve performance Junlong Wang
2026-05-18 2:20 ` Stephen Hemminger
2026-05-09 6:29 ` [PATCH v3 2/3] net/zxdh: optimize Rx recv pkts performance Junlong Wang
2026-05-09 6:29 ` [PATCH v3 3/3] net/zxdh: optimize Tx xmit " Junlong Wang
2026-05-18 2:22 ` Stephen Hemminger
2026-06-06 6:32 ` [PATCH v4 0/4] net/zxdh: optimize Rx/Tx path performance Junlong Wang
2026-06-06 6:32 ` [PATCH v4 1/4] net/zxdh: optimize queue structure to improve performance Junlong Wang
2026-06-06 6:32 ` [PATCH v4 2/4] net/zxdh: optimize Rx recv pkts performance Junlong Wang
2026-06-06 6:32 ` [PATCH v4 3/4] net/zxdh: optimize Tx xmit " Junlong Wang
2026-06-06 6:32 ` [PATCH v4 4/4] net/zxdh: fix queue enable intr issues Junlong Wang
2026-06-07 18:00 ` [PATCH v4 0/4] net/zxdh: optimize Rx/Tx path performance Stephen Hemminger
2026-06-15 1:19 ` [PATCH v5 " Junlong Wang
2026-06-15 1:19 ` [PATCH v5 1/4] net/zxdh: fix queue enable intr issues Junlong Wang
2026-06-15 1:19 ` [PATCH v5 2/4] net/zxdh: optimize queue structure to improve performance Junlong Wang
2026-06-15 1:19 ` [PATCH v5 3/4] net/zxdh: optimize Rx recv pkts performance Junlong Wang
2026-06-15 1:19 ` [PATCH v5 4/4] net/zxdh: optimize Tx xmit " Junlong Wang
2026-06-15 18:38 ` Stephen Hemminger
2026-06-17 8:28 ` [PATCH v6 0/4] net/zxdh: optimize Rx/Tx path performance Junlong Wang
2026-06-17 8:28 ` [PATCH v6 1/4] net/zxdh: fix queue enable intr issues Junlong Wang
2026-06-17 8:28 ` [PATCH v6 2/4] net/zxdh: optimize queue structure to improve performance Junlong Wang
2026-06-17 8:28 ` [PATCH v6 3/4] net/zxdh: optimize Rx recv pkts performance Junlong Wang
2026-06-17 8:28 ` [PATCH v6 4/4] net/zxdh: optimize Tx xmit " Junlong Wang
2026-06-17 15:21 ` Stephen Hemminger [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260617082148.0030de4c@phoenix.local \
--to=stephen@networkplumber.org \
--cc=dev@dpdk.org \
--cc=wang.junlong1@zte.com.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.