DPDK-dev Archive on lore.kernel.org
* [PATCH v5 00/24] deprecate rte_atomic functions
From: Stephen Hemminger @ 2026-06-20  2:28 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
In-Reply-To: <https://inbox.dpdk.org/dev/20260521042043.1590536-1-stephen@networkplumber.org>

The rte_atomicNN_* family was flagged for deprecation in 2021 by
commit 3ec965b6de12 ("doc: update atomic operation deprecation")
but enforcement never landed and in-tree usage continued to grow.

This series finishes converting every remaining in-tree caller to
the C11-style rte_atomic_*_explicit() / RTE_ATOMIC() API, then
marks the legacy functions __rte_deprecated so future in-tree and
out-of-tree uses are caught at compile time.

The goal of this series is to get driver writers to review and
test each change.
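
For reviewers who have not done one of these conversions, here is a
minimal sketch of the pattern. It is written against plain C11
<stdatomic.h> so it builds without DPDK headers. The struct and the
demo_* names are hypothetical; the comments give the legacy call and
the DPDK wrapper spelling. The memory orderings are illustrative
assumptions, and each driver has to choose what its own code needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical driver object, for illustration only.
 * Legacy declaration:  rte_atomic32_t refcnt;
 * DPDK wrapper form:   RTE_ATOMIC(uint32_t) refcnt;
 */
struct demo_port {
	_Atomic uint32_t refcnt;
};

/* Legacy: rte_atomic32_inc(&p->refcnt);
 * DPDK:   rte_atomic_fetch_add_explicit(&p->refcnt, 1,
 *                                       rte_memory_order_relaxed);
 */
static inline void
demo_port_get(struct demo_port *p)
{
	atomic_fetch_add_explicit(&p->refcnt, 1, memory_order_relaxed);
}

/* Legacy: rte_atomic32_dec_and_test(&p->refcnt)
 * Returns nonzero when the last reference was dropped.
 */
static inline int
demo_port_put(struct demo_port *p)
{
	uint32_t old = atomic_fetch_sub_explicit(&p->refcnt, 1,
						 memory_order_acq_rel);

	assert(old > 0); /* dropping a reference that was never taken */
	return old == 1;
}

/* Legacy: rte_atomic32_read(&p->refcnt) */
static inline uint32_t
demo_port_refs(struct demo_port *p)
{
	return atomic_load_explicit(&p->refcnt, memory_order_acquire);
}
```

The main review question in each patch is whether the explicit
ordering matches what the surrounding code relied on before.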

v5 - rebase now that ring changes are merged.
   - drop the barrier (rte_smp_mb) patch; it is not required.


Stephen Hemminger (24):
  bpf: use C11 atomics in BPF_ST_ATOMIC_REG
  net/bonding: use stdatomic
  net/nbl: remove unused rte_atomic16 field
  net/ena: replace use of rte_atomicNN
  net/failsafe: convert to stdatomic
  net/enic: do not use deprecated rte_atomic64
  net/pfe: use ethdev linkstatus helpers
  net/sfc: replace rte_atomic with stdatomic
  crypto/ccp: replace use of rte_atomic64 with stdatomic
  bus/dpaa: replace rte_atomic16 with stdatomic
  drivers: replace rte_atomic16 with stdatomic
  net/netvsc: replace rte_atomic32 with stdatomic
  event/sw: convert from rte_atomic32 to stdatomic
  bus/vmbus: convert from rte_atomic to stdatomic
  common/dpaax: use stdatomic instead of rte_atomic
  net/bnx2x: convert from rte_atomic32 to stdatomic
  bus/fslmc: replace rte_atomic32 with stdatomic
  drivers/event: replace rte_atomic32 in selftests
  net/hinic: replace rte_atomic32 with stdatomic
  net/txgbe: replace rte_atomic32 with stdatomic
  net/vhost: use stdatomic instead of rte_atomic32
  vdpa/ifc: replace rte_atomic32 with stdatomic
  test/atomic: suppress deprecation warnings for legacy APIs
  eal: deprecate rte_atomicNN functions

 app/test/test_atomic.c                        |  12 +
 devtools/checkpatches.sh                      |   8 -
 doc/guides/rel_notes/deprecation.rst          |   4 +-
 doc/guides/rel_notes/release_26_07.rst        |   5 +
 drivers/bus/dpaa/base/qbman/qman.c            |   9 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c      |  10 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpci.c      |  10 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c      |  12 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h       |   8 +-
 drivers/bus/fslmc/qbman/include/compat.h      |  21 +-
 drivers/bus/vmbus/private.h                   |   2 +-
 drivers/bus/vmbus/vmbus_bufring.c             |  39 +--
 drivers/common/dpaax/compat.h                 |  21 +-
 drivers/crypto/ccp/ccp_crypto.c               |  11 +-
 drivers/crypto/ccp/ccp_crypto.h               |   2 +-
 drivers/crypto/ccp/ccp_dev.c                  |  10 +-
 drivers/crypto/ccp/ccp_dev.h                  |   4 +-
 drivers/event/dpaa2/dpaa2_eventdev_selftest.c |  26 +-
 drivers/event/dpaa2/dpaa2_hw_dpcon.c          |  11 +-
 drivers/event/octeontx/ssovf_evdev_selftest.c |  61 +++--
 drivers/event/sw/sw_evdev.c                   |   8 +-
 drivers/event/sw/sw_evdev.h                   |   4 +-
 drivers/event/sw/sw_evdev_worker.c            |  16 +-
 drivers/net/bnx2x/bnx2x.c                     |   6 +-
 drivers/net/bnx2x/bnx2x.h                     |   2 +-
 drivers/net/bnx2x/ecore_sp.c                  |   6 +-
 drivers/net/bonding/eth_bond_8023ad_private.h |   6 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c     |  35 +--
 drivers/net/ena/base/ena_plat_dpdk.h          |  14 +-
 drivers/net/ena/ena_ethdev.c                  |  21 +-
 drivers/net/ena/ena_ethdev.h                  |   7 +-
 drivers/net/enic/enic.h                       |   6 +-
 drivers/net/enic/enic_compat.h                |   1 -
 drivers/net/enic/enic_main.c                  |  17 +-
 drivers/net/enic/enic_rxtx.c                  |  14 +-
 drivers/net/enic/enic_rxtx_vec_avx2.c         |   4 +-
 drivers/net/failsafe/failsafe_ops.c           |  12 +-
 drivers/net/failsafe/failsafe_private.h       |  29 ++-
 drivers/net/failsafe/failsafe_rxtx.c          |   2 +-
 drivers/net/hinic/base/hinic_compat.h         |   2 +-
 drivers/net/hinic/base/hinic_pmd_hwdev.c      |  24 +-
 drivers/net/hinic/base/hinic_pmd_hwdev.h      |   4 +-
 drivers/net/nbl/nbl_hw/nbl_resource.h         |   1 -
 drivers/net/netvsc/hn_rndis.c                 |  28 +-
 drivers/net/netvsc/hn_rxtx.c                  |  12 +-
 drivers/net/netvsc/hn_var.h                   |   6 +-
 drivers/net/pfe/pfe_ethdev.c                  |  32 +--
 drivers/net/sfc/sfc.c                         |   9 +-
 drivers/net/sfc/sfc.h                         |   4 +-
 drivers/net/sfc/sfc_port.c                    |   7 +-
 drivers/net/sfc/sfc_stats.h                   |   2 +-
 drivers/net/txgbe/base/txgbe_mng.c            |   4 +-
 drivers/net/txgbe/base/txgbe_type.h           |   2 +-
 drivers/net/vhost/rte_eth_vhost.c             | 103 +++++---
 drivers/vdpa/ifc/ifcvf_vdpa.c                 |  37 +--
 lib/bpf/bpf_exec.c                            |  13 +-
 lib/eal/arm/include/rte_atomic_32.h           |   4 -
 lib/eal/arm/include/rte_atomic_64.h           |   4 -
 lib/eal/include/generic/rte_atomic.h          | 243 +++++-------------
 lib/eal/include/rte_common.h                  |   2 +
 lib/eal/loongarch/include/rte_atomic.h        |   4 -
 lib/eal/ppc/include/rte_atomic.h              | 173 -------------
 lib/eal/riscv/include/rte_atomic.h            |   4 -
 lib/eal/x86/include/rte_atomic.h              | 172 -------------
 lib/eal/x86/include/rte_atomic_32.h           | 188 --------------
 lib/eal/x86/include/rte_atomic_64.h           | 157 -----------
 66 files changed, 472 insertions(+), 1265 deletions(-)

-- 
2.53.0



* RE: [EXTERNAL] [PATCH v2 06/10] bus/vmbus: allocate interrupt during probing
From: Long Li @ 2026-06-19 22:05 UTC (permalink / raw)
  To: David Marchand, dev@dpdk.org
  Cc: thomas@monjalon.net, stephen@networkplumber.org,
	bruce.richardson@intel.com, fengchengwen@huawei.com,
	hemant.agrawal@nxp.com, Wei Hu
In-Reply-To: <20260618152826.490569-7-david.marchand@redhat.com>

> Allocating the interrupt handle is a waste of memory if no device is probed
> later (for example, if an allowlist is passed).
> Instead, allocate this handle at the time probe_device is called.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>

Reviewed-by: Long Li <longli@microsoft.com>


> ---
> Changes since v1:
> - fixed/reordered interrupt handle allocation,
> 
> ---
>  drivers/bus/vmbus/linux/vmbus_bus.c |  6 ------
>  drivers/bus/vmbus/vmbus_common.c    | 18 ++++++++++++++++--
>  2 files changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/bus/vmbus/linux/vmbus_bus.c b/drivers/bus/vmbus/linux/vmbus_bus.c
> index 0af10f6a69..77d904ad6d 100644
> --- a/drivers/bus/vmbus/linux/vmbus_bus.c
> +++ b/drivers/bus/vmbus/linux/vmbus_bus.c
> @@ -345,12 +345,6 @@ vmbus_scan_one(const char *name)
>  		}
>  	}
> 
> -	/* Allocate interrupt handle instance */
> -	dev->intr_handle =
> -		rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
> -	if (dev->intr_handle == NULL)
> -		goto error;
> -
>  	/* device is valid, add in list (sorted) */
>  	VMBUS_LOG(DEBUG, "Adding vmbus device %s", name);
> 
> diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
> index 74c1ddff69..bfb45e963c 100644
> --- a/drivers/bus/vmbus/vmbus_common.c
> +++ b/drivers/bus/vmbus/vmbus_common.c
> @@ -100,10 +100,16 @@ vmbus_probe_device(struct rte_driver *drv, struct rte_device *dev)
>  		return 1;
>  	}
> 
> +	/* allocate interrupt handle instance */
> +	vmbus_dev->intr_handle =
> +		rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
> +	if (vmbus_dev->intr_handle == NULL)
> +		return -ENOMEM;
> +
>  	/* map resources for device */
>  	ret = rte_vmbus_map_device(vmbus_dev);
>  	if (ret != 0)
> -		return ret;
> +		goto free_intr;
> 
>  	if (vmbus_dev->device.numa_node < 0 && rte_socket_count() > 1)
>  		VMBUS_LOG(INFO, "Device %s is not NUMA-aware", guid);
> @@ -112,7 +118,15 @@ vmbus_probe_device(struct rte_driver *drv, struct rte_device *dev)
>  	VMBUS_LOG(INFO, "  probe driver: %s", vmbus_drv->driver.name);
>  	ret = vmbus_drv->probe(vmbus_drv, vmbus_dev);
>  	if (ret != 0)
> -		rte_vmbus_unmap_device(vmbus_dev);
> +		goto unmap;
> +
> +	return 0;
> +
> +unmap:
> +	rte_vmbus_unmap_device(vmbus_dev);
> +free_intr:
> +	rte_intr_instance_free(vmbus_dev->intr_handle);
> +	vmbus_dev->intr_handle = NULL;
> 
>  	return ret;
>  }
> --
> 2.53.0



* RE: [EXTERNAL] [PATCH v2 05/10] bus/vmbus: fix interrupt leak in cleanup
From: Long Li @ 2026-06-19 22:04 UTC (permalink / raw)
  To: David Marchand, dev@dpdk.org
  Cc: thomas@monjalon.net, stephen@networkplumber.org,
	bruce.richardson@intel.com, fengchengwen@huawei.com,
	hemant.agrawal@nxp.com, stable@dpdk.org, Wei Hu
In-Reply-To: <20260618152826.490569-6-david.marchand@redhat.com>

> When calling the bus cleanup, the interrupt handle was not released.
> 
> Fixes: 65780eada9d9 ("bus/vmbus: support cleanup")
> Cc: stable@dpdk.org
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/bus/vmbus/vmbus_common.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
> index 01573927ce..74c1ddff69 100644
> --- a/drivers/bus/vmbus/vmbus_common.c
> +++ b/drivers/bus/vmbus/vmbus_common.c
> @@ -150,6 +150,7 @@ rte_vmbus_cleanup(void)
>  			error = -1;
> 
>  		rte_vmbus_unmap_device(dev);
> +		rte_intr_instance_free(dev->intr_handle);
> 
>  		dev->device.driver = NULL;
>  		rte_bus_remove_device(&rte_vmbus_bus, &dev->device);
> --
> 2.53.0



* Re: [PATCH 00/10] NXP ENETC driver related changes
From: Stephen Hemminger @ 2026-06-19 21:43 UTC (permalink / raw)
  To: Gagandeep Singh; +Cc: dev, hemant.agrawal
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

On Sat, 20 Jun 2026 00:14:17 +0530
Gagandeep Singh <g.singh@nxp.com> wrote:

> ENETC driver related changes series
> 
> Gagandeep Singh (8):
>   net/enetc: fix TX BD structure
>   net/enetc: fix TX BDs flag overwrite issue
>   net/enetc: fix queue initialization
>   net/enetc: support ESP packet type in packet parsing
>   net/enetc: update random MAC generation code
>   net/enetc: add option to disable VSI messaging
>   net/enetc: add devargs to control VSI-PSI timeout and delay
>   net/enetc4: add cacheable BD ring support with SW cache maintenance
> 
> Vanshika Shukla (2):
>   net/enetc: support scatter-gather
>   net/enetc: set user configurable priority to TX rings
> 
>  drivers/net/enetc/base/enetc_hw.h |  13 +-
>  drivers/net/enetc/enetc.h         |  28 +-
>  drivers/net/enetc/enetc4_ethdev.c | 123 +++++++--
>  drivers/net/enetc/enetc4_vf.c     | 159 ++++++++++--
>  drivers/net/enetc/enetc_ethdev.c  |  26 +-
>  drivers/net/enetc/enetc_rxtx.c    | 411 ++++++++++++++++++++++++++----
>  6 files changed, 649 insertions(+), 111 deletions(-)
> 

The AI review shows some things that need to be addressed before merging.

[PATCH 04/10] net/enetc: support ESP packet type

Info: enetc_supported_ptypes_get() adds RTE_PTYPE_TUNNEL_ESP and a
trailing RTE_PTYPE_UNKNOWN. *no_of_elements is RTE_DIM(ptypes), so the
0 entry is counted (not used as a sentinel). It is filtered out by the
mask test in rte_eth_dev_get_supported_ptypes(), so harmless, but the
RTE_PTYPE_UNKNOWN line is unnecessary and should be dropped.


[PATCH 06/10] net/enetc: support scatter-gather

Warning: scatter Rx reassembly state (first_seg/cur_seg) is held in
local variables and reset on every call. rx_frm_cnt only advances on
the F bit, so work_limit won't cut a frame, but the
"!(bd_status & LSTATUS_R)" break can exit mid-frame if HW has written
the leading segments of a multi-segment frame but not yet the segment
carrying F. On the next call first_seg is NULL again, next_to_clean has
already advanced past the consumed leading segments, and those mbufs
are leaked while the tail segments are mis-assembled as a new frame.
Persist the partial chain across bursts in the ring (e.g.
rx_ring->pkt_first_seg / pkt_last_seg) instead of locals. (Same pattern
is reproduced in enetc_clean_rx_ring_cacheable in patch 10.)
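
A minimal sketch of keeping the partial chain in the ring, using a toy
ring model rather than the driver's real structures. All demo_* types
and fields are hypothetical; only the pkt_first_seg / pkt_last_seg
names follow the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_RING_SIZE 8

struct demo_seg {
	struct demo_seg *next;
};

struct demo_bd {
	bool ready;	/* HW has written this descriptor */
	bool final;	/* last segment of a frame (the F bit) */
};

struct demo_ring {
	struct demo_bd bd[DEMO_RING_SIZE];
	struct demo_seg seg[DEMO_RING_SIZE];
	unsigned int next_to_clean;
	/* Partial frame carried across burst calls. */
	struct demo_seg *pkt_first_seg;
	struct demo_seg *pkt_last_seg;
};

static unsigned int
demo_rx_burst(struct demo_ring *r, struct demo_seg **pkts, unsigned int max)
{
	unsigned int n = 0;

	while (n < max) {
		unsigned int i = r->next_to_clean;
		struct demo_seg *s = &r->seg[i];

		/* May stop mid-frame; the chain stays in the ring. */
		if (!r->bd[i].ready)
			break;

		s->next = NULL;
		if (r->pkt_first_seg == NULL)
			r->pkt_first_seg = s;
		else
			r->pkt_last_seg->next = s;
		r->pkt_last_seg = s;

		r->bd[i].ready = false;
		r->next_to_clean = (i + 1) % DEMO_RING_SIZE;

		if (r->bd[i].final) {
			pkts[n++] = r->pkt_first_seg;
			r->pkt_first_seg = NULL;
			r->pkt_last_seg = NULL;
		}
	}
	return n;
}
```

With this shape, a call that sees only the leading segment returns 0
packets and the next call completes the same chain instead of starting
a new one.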

Warning: enetc4 now advertises RTE_ETH_RX_OFFLOAD_SCATTER and
RTE_ETH_TX_OFFLOAD_MULTI_SEGS (VF) but doc/guides/nics/features/
enetc4.ini is not updated (Scattered Rx / Multi segment rows).

Info: the VF dev_info now advertises L3/L4 RX checksum offload, but
enetc_dev_rx_parse() unconditionally sets
RTE_MBUF_F_RX_IP_CKSUM_GOOD | RTE_MBUF_F_RX_L4_CKSUM_GOOD and never
reports *_BAD. With the offload now advertised, an application relying
on it will never see a bad-checksum indication.

Info: dccivac(data + (data_len - 1)) / dcbf(data + (seg_len - 1))
underflow to data-1 when the segment length is 0 (uint16_t promotes to
int). The preceding loop already covers the final cache line, so the
extra op is redundant as well as unsafe for len==0.
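
To illustrate the promotion issue, a small standalone sketch. The
64-byte line size and the demo_* helpers are assumptions for
illustration, not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64u	/* assumed cache line size */

/* uint16_t promotes to int, so a zero length yields -1, not 65535. */
static int
demo_last_byte_offset(uint16_t len)
{
	return len - 1;
}

/* Number of cache lines covering [data, data + len).
 * A zero-length segment touches no lines, so nothing is maintained.
 */
static size_t
demo_lines_to_maintain(uintptr_t data, uint16_t len)
{
	const uintptr_t mask = ~(uintptr_t)(DEMO_CACHE_LINE - 1);
	uintptr_t first, last;

	if (len == 0)
		return 0;

	first = data & mask;
	last = (data + len - 1) & mask;
	return (size_t)((last - first) / DEMO_CACHE_LINE) + 1;
}
```

Checking for len == 0 before computing the last byte address avoids
ever forming data - 1.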


[PATCH 07/10] net/enetc: add option to disable VSI messaging

Warning: new devarg "enetc4_vsi_disable" is registered but not
documented in doc/guides/nics/enetc.rst.


[PATCH 08/10] net/enetc: add devargs to control VSI-PSI timeout/delay

Warning: new devargs "enetc4_vsi_timeout" / "enetc4_vsi_delay" are not
documented in doc/guides/nics/enetc.rst.


[PATCH 09/10] net/enetc: set user configurable priority to TX rings

Error: hw->txq_prior is allocated in parse_txq_prior() with
rte_zmalloc() but never freed. It leaks on dev_close / re-probe. Free
it in the close/uninit path (and note it is re-allocated every time the
handler runs, so a repeated key would leak the prior allocation too).

Warning: txq_prior is control-path, CPU-only data; rte_zmalloc()
consumes hugepage memory unnecessarily. Use calloc()/malloc().

Warning: the parsed value is OR'd straight into TBMR:
	tx_en |= priv->hw.txq_prior[tx_ring->index];
with no mask. ENETC_TBMR_EN is BIT(31) and there is no TBMR priority
mask defined. A user value with high bits set can corrupt unrelated
TBMR control bits. Mask the input to the valid TBMR priority field.
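
A sketch of the masking. DEMO_TBMR_PRIO_MASK is a hypothetical 3-bit
field at bit 0 chosen only for illustration; the real field position
and width must come from the ENETC reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* BIT(31), as stated above for ENETC_TBMR_EN. */
#define DEMO_TBMR_EN		(UINT32_C(1) << 31)
/* Hypothetical priority field; check the hardware manual. */
#define DEMO_TBMR_PRIO_MASK	UINT32_C(0x7)

/* Reject out-of-range values when the devarg is parsed. */
static int
demo_prio_check(unsigned long v)
{
	return v <= DEMO_TBMR_PRIO_MASK ? 0 : -1;
}

/* Only the priority bits of the user value can reach the register. */
static uint32_t
demo_tbmr_value(uint32_t user_prio)
{
	return DEMO_TBMR_EN | (user_prio & DEMO_TBMR_PRIO_MASK);
}
```

Rejecting bad values at parse time gives the user an error, and the
mask at the register write keeps a missed check from touching other
control bits.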

Info: strdup(value) return is not checked; on failure
strtok(input_str, "|") is called with a NULL first argument, which
resumes from strtok's stale internal state rather than erroring.
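
One possible shape for the parser, as a sketch only. The function name
and the caller-provided output array are hypothetical. It checks the
strdup() result and uses strtok_r() so no hidden state is involved.

```c
#define _POSIX_C_SOURCE 200809L	/* strdup(), strtok_r() */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse "a|b|c" into out[]; returns the count or a negative errno. */
static int
demo_parse_prior(const char *value, uint32_t *out, size_t max)
{
	char *copy, *tok, *save = NULL;
	size_t n = 0;
	int ret;

	copy = strdup(value);
	if (copy == NULL)
		return -ENOMEM;

	for (tok = strtok_r(copy, "|", &save); tok != NULL;
	     tok = strtok_r(NULL, "|", &save)) {
		unsigned long v;
		char *end;

		if (n == max) {
			ret = -E2BIG;
			goto out;
		}
		errno = 0;
		v = strtoul(tok, &end, 0);
		if (errno != 0 || end == tok || *end != '\0' ||
		    v > UINT32_MAX) {
			ret = -EINVAL;
			goto out;
		}
		out[n++] = (uint32_t)v;
	}
	ret = (int)n;
out:
	free(copy);	/* the temporary copy is released on every path */
	return ret;
}
```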

Warning: new devarg "enetc4_txq_prior" not documented in
doc/guides/nics/enetc.rst.


[PATCH 10/10] net/enetc4: add cacheable BD ring support with SW cache

Warning: enetc4_dev_hw_init() switches rx_pkt_burst/tx_pkt_burst to the
cache-maintenance variants unconditionally for every enetc4 device
(PF and VF). The commit message scopes this to non-cache-coherent
parts (i.MX95), but the code applies it everywhere, adding dcbf/dccivac
cost on cache-coherent platforms that previously used the _nc fast
path. Gate it on a devarg or coherency/platform check.

Warning: the RX payload invalidation uses dccivac (dc civac =
clean+invalidate). The comment justifies clean-then-invalidate for the
BD ring (refill dcbf leaves BD lines clean), but payload buffers are
not cleaned before being handed to HW. If a payload cache line is dirty
(stale CPU data from a prior use of the mbuf), the clean phase writes
it back over the HW-DMA'd data in DDR before invalidating -> silent RX
corruption on a non-coherent part. Please confirm payload lines can
never be dirty here, or use invalidate-only.

Info: struct enetc_bdr gains "uint64_t bd_base_p" but it is never
referenced anywhere. Remove the dead field.

Info: the 64-bit BD fast copy
	__uint128_t *dst128 = (__uint128_t *)&rxbd_temp;
	*dst128 = *(const __uint128_t *)rxbd;
takes the address of an 8-byte-aligned stack union (rxbd_temp) as
__uint128_t*. That is an under-aligned 128-bit access (UB); aarch64
tolerates it via ldp/stp but it is fragile. Force 16-byte alignment on
rxbd_temp or copy as two u64.
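
Both options sketched with a hypothetical 16-byte descriptor layout
(union demo_rx_bd is not the driver's real type):

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical 16-byte buffer descriptor. */
union demo_rx_bd {
	struct {
		uint64_t addr;
		uint64_t status;
	} w;
	uint8_t raw[16];
};

static_assert(sizeof(union demo_rx_bd) == 16, "descriptor must be 16 bytes");

/* Option 1: copy as two 64-bit words; only 8-byte alignment is needed. */
static inline void
demo_bd_copy(union demo_rx_bd *dst, const union demo_rx_bd *src)
{
	dst->w.addr = src->w.addr;
	dst->w.status = src->w.status;
}

/* Option 2: force 16-byte alignment on the stack temporary, so that a
 * 128-bit access to it would at least be correctly aligned.
 */
static inline uint64_t
demo_bd_status(const union demo_rx_bd *ring_bd)
{
	alignas(16) union demo_rx_bd tmp;

	demo_bd_copy(&tmp, ring_bd);
	return tmp.w.status;
}
```

Neither form promises a single 128-bit load. If the descriptor has to
be read in one access, that is a separate requirement to state and
enforce.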


General (series-wide)

Warning: no release notes. The series adds user-visible features
(scatter-gather, cacheable BD ring support, four new devargs) with no
entry in doc/guides/rel_notes/. New driver capabilities and devargs
need release-note coverage.


* Re: [PATCH v4 00/23] et/sxe2: added Linkdata sxe2 ethernet driver
From: Stephen Hemminger @ 2026-06-19 20:58 UTC (permalink / raw)
  To: liujie5; +Cc: dev
In-Reply-To: <20260619080156.1539964-1-liujie5@linkdatatechnology.com>

On Fri, 19 Jun 2026 16:01:56 +0800
liujie5@linkdatatechnology.com wrote:

> From: Jie Liu <liujie5@linkdatatechnology.com>
> 
> This patch set implements core functionality for the SXE2 PMD,
> including basic driver framework, data path setup, and advanced
> offload features (VLAN, RSS,TM, PTP etc.).


Looking over the driver overall, I noticed you are adding -g to cflags.
This is not necessary; meson supports this via -Dbuildtype=debug or
-Dbuildtype=debugoptimized.

Remove this from meson.build:
cflags += ['-g']


* [PATCH v3] graph: add optional profiling stats
From: Morten Brørup @ 2026-06-19 20:56 UTC (permalink / raw)
  To: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
  Cc: Morten Brørup
In-Reply-To: <20260619202047.2809165-1-mb@smartsharesystems.com>

Add graph node profiling stats, configurable at build time by enabling
RTE_GRAPH_PROFILE in rte_config.h.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
v3:
* Debug shows cycles/obj instead of cycles/call.
* Fixed missing --in-reply-to.
v2:
* Fix indentation.
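
For checking the debug output by hand, here is a standalone model of
the derived values. The demo_* struct and function names are
hypothetical; the arithmetic follows the v3 dump code, and
demo_elapsed() mirrors the "cycles = -rte_rdtsc()" idiom with plain
integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the counters used by the dump code. */
struct demo_node_stats {
	uint64_t total_calls;
	uint64_t total_objs;
	uint64_t total_cycles;
	struct {
		uint64_t calls;
		uint64_t cycles;
	} usage_stats[2];	/* calls that processed 0 or 1 objects */
};

struct demo_profile {
	uint64_t calls_2_or_more;
	double avg_objs_2_or_more;
	double cycles_per_obj_2_or_more;
};

static struct demo_profile
demo_profile_compute(const struct demo_node_stats *n)
{
	struct demo_profile p;
	uint64_t cycles_2 = n->total_cycles -
		(n->usage_stats[0].cycles + n->usage_stats[1].cycles);

	p.calls_2_or_more = n->total_calls -
		(n->usage_stats[0].calls + n->usage_stats[1].calls);
	/* Each 1-object call contributed exactly one object. */
	p.avg_objs_2_or_more = p.calls_2_or_more == 0 ? 2.0 :
		(double)(n->total_objs - n->usage_stats[1].calls) /
		(double)p.calls_2_or_more;
	p.cycles_per_obj_2_or_more = p.calls_2_or_more == 0 ? 0.0 :
		(double)cycles_2 / (double)p.calls_2_or_more /
		p.avg_objs_2_or_more;
	return p;
}

/* Unsigned negate-then-add equals end - start modulo 2^64. */
static uint64_t
demo_elapsed(uint64_t start, uint64_t end)
{
	uint64_t cycles = -start;

	cycles += end;
	return cycles;
}
```

Example: 10 calls, of which 3 handled 0 objects (30 cycles) and 2
handled 1 object (100 cycles), with 42 objects and 2130 cycles in
total, gives 5 calls of 2 or more objects, 8.0 objects per such call
on average, and 50.0 cycles per object.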
---
 config/rte_config.h                 |  1 +
 lib/graph/graph_debug.c             | 29 ++++++++++++++++++++++++++++-
 lib/graph/node.c                    |  2 ++
 lib/graph/rte_graph_worker_common.h | 23 ++++++++++++++++++++---
 4 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/config/rte_config.h b/config/rte_config.h
index 0447cdf2ad..1942c1b1ec 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -106,6 +106,7 @@
 /* rte_graph defines */
 #define RTE_GRAPH_BURST_SIZE 256
 #define RTE_LIBRTE_GRAPH_STATS 1
+/* RTE_GRAPH_PROFILE is not set */
 
 /****** driver defines ********/
 
diff --git a/lib/graph/graph_debug.c b/lib/graph/graph_debug.c
index e3b8cccdc1..1110b43c6a 100644
--- a/lib/graph/graph_debug.c
+++ b/lib/graph/graph_debug.c
@@ -92,7 +92,34 @@ rte_graph_obj_dump(FILE *f, struct rte_graph *g, bool all)
 			fprintf(f, "       total_sched_fail=%" PRId64 "\n",
 				n->dispatch.total_sched_fail);
 		}
-		fprintf(f, "       total_calls=%" PRId64 "\n", n->total_calls);
+		fprintf(f, "       total_calls=%" PRIu64 "\n", n->total_calls);
+		fprintf(f, "       total_cycles=%" PRIu64 "\n", n->total_cycles);
+#ifdef RTE_GRAPH_PROFILE
+		uint64_t calls_2_or_more = n->total_calls -
+				(n->usage_stats[0].calls + n->usage_stats[1].calls);
+		double avg_objs_2_or_more = calls_2_or_more == 0 ? (double)2 :
+				(double)(n->total_objs - n->usage_stats[1].calls) /
+				(double)calls_2_or_more;
+		fprintf(f, "       calls_0=%" PRIu64 ", _1=%" PRIu64 ", _%.1f=%" PRIu64 "\n",
+				n->usage_stats[0].calls,
+				n->usage_stats[1].calls,
+				avg_objs_2_or_more,
+				calls_2_or_more);
+		fprintf(f, "       cycles_0=%" PRIu64 ", _1=%" PRIu64 ", _%.1f=%" PRIu64 "\n",
+				n->usage_stats[0].cycles,
+				n->usage_stats[1].cycles,
+				avg_objs_2_or_more,
+				n->total_cycles -
+				(n->usage_stats[0].cycles + n->usage_stats[1].cycles));
+		fprintf(f, "       cycles_per_obj_1=%.1f, _%.1f=%.1f\n",
+				n->usage_stats[1].calls == 0 ? (double)0 :
+				(double)n->usage_stats[1].cycles / (double)n->usage_stats[1].calls,
+				avg_objs_2_or_more,
+				calls_2_or_more == 0 ? (double)0 :
+				(double)(n->total_cycles -
+				(n->usage_stats[0].cycles + n->usage_stats[1].cycles)) /
+				(double)calls_2_or_more / avg_objs_2_or_more);
+#endif
 		for (i = 0; i < n->nb_edges; i++)
 			fprintf(f, "          edge[%d] <%s>\n", i,
 				n->nodes[i]->name);
diff --git a/lib/graph/node.c b/lib/graph/node.c
index 1fce3e6632..19b38881ae 100644
--- a/lib/graph/node.c
+++ b/lib/graph/node.c
@@ -110,10 +110,12 @@ __rte_node_register(const struct rte_node_register *reg)
 	rte_edge_t i;
 	size_t sz;
 
+#ifndef RTE_GRAPH_PROFILE
 	/* Limit Node specific metadata to one cacheline on 64B CL machine */
 	RTE_BUILD_BUG_ON((offsetof(struct rte_node, nodes) -
 			  offsetof(struct rte_node, ctx)) !=
 			 RTE_CACHE_LINE_MIN_SIZE);
+#endif
 
 	graph_spinlock_lock();
 
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4ab53a533e..43ce23765b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -144,12 +144,22 @@ struct __rte_cache_aligned rte_node {
 			rte_node_process_t process; /**< Process function. */
 			uint64_t process_u64;
 		};
+		/** Fast path area cache line 3. */
+#ifdef RTE_GRAPH_PROFILE
+		struct {
+			uint64_t calls;
+			uint64_t cycles;
+		} usage_stats[2];	/**< Usage when this node processed 0 or 1 objects. */
+		/** Fast path area cache line 4. */
+#endif
 		alignas(RTE_CACHE_LINE_MIN_SIZE) struct rte_node *nodes[]; /**< Next nodes. */
 	};
 };
 
+#ifndef RTE_GRAPH_PROFILE
 static_assert(offsetof(struct rte_node, nodes) - offsetof(struct rte_node, ctx)
 	== RTE_CACHE_LINE_MIN_SIZE, "rte_node fast path area must fit in 64 bytes");
+#endif
 
 /**
  * @internal
@@ -197,7 +207,7 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph,
 static __rte_always_inline void
 __rte_node_process(struct rte_graph *graph, struct rte_node *node)
 {
-	uint64_t start;
+	uint64_t cycles;
 	uint16_t rc;
 	void **objs;
 
@@ -206,11 +216,18 @@ __rte_node_process(struct rte_graph *graph, struct rte_node *node)
 	rte_prefetch0(objs);
 
 	if (rte_graph_has_stats_feature()) {
-		start = rte_rdtsc();
+		cycles = -rte_rdtsc();
 		rc = node->process(graph, node, objs, node->idx);
-		node->total_cycles += rte_rdtsc() - start;
+		cycles += rte_rdtsc();
+		node->total_cycles += cycles;
 		node->total_calls++;
 		node->total_objs += rc;
+#ifdef RTE_GRAPH_PROFILE
+		if (rc <= 1) {
+			node->usage_stats[rc].calls++;
+			node->usage_stats[rc].cycles += cycles;
+		}
+#endif
 	} else {
 		node->process(graph, node, objs, node->idx);
 	}
-- 
2.43.0



* Re: [PATCH v4 09/23] net/sxe2: support IPsec inline protocol offload
From: Stephen Hemminger @ 2026-06-19 20:54 UTC (permalink / raw)
  To: liujie5; +Cc: dev
In-Reply-To: <20260619080812.1543972-1-liujie5@linkdatatechnology.com>

On Fri, 19 Jun 2026 16:08:12 +0800
liujie5@linkdatatechnology.com wrote:

> diff --git a/drivers/net/sxe2/sxe2_cmd_chnl.c b/drivers/net/sxe2/sxe2_cmd_chnl.c
> index 19323ffcc4..7711e8e57d 100644
> --- a/drivers/net/sxe2/sxe2_cmd_chnl.c
> +++ b/drivers/net/sxe2/sxe2_cmd_chnl.c
> @@ -877,3 +877,200 @@ int32_t sxe2_drv_tm_commit(struct sxe2_adapter *adapter)
>  l_end:
...

> +int32_t sxe2_drv_ipsec_txsa_delete(struct sxe2_adapter *adapter,
> +					   uint16_t sa_id)
> +{
> +	struct sxe2_drv_ipsec_txsa_del_req req = { 0 };
> +	struct sxe2_drv_cmd_params cmd             = { 0 };
> +	struct sxe2_common_device *cdev = adapter->cdev;
> +	int32_t ret                                 = -1;
> +
> +	req.sa_idx = rte_cpu_to_le_16(sa_id);
> +	sxe2_drv_cmd_params_fill(adapter, &cmd, SXE2_DRV_CMD_IPSEC_TXSA_DEL,
> +				 &req, sizeof(req),
> +				 NULL, 0);
> +	ret = sxe2_drv_cmd_exec(cdev, &cmd);
> +	if (ret)
> +		PMD_DEV_LOG_ERR(adapter, DRV,
> +				"Failed to delete tx sa, sa id: %u, ret: %d.",
> +				sa_id, ret);
> +
> +	return ret;
> +}
> +

git am doesn't like extra blank lines at end of file.
Applying: net/sxe2: support IPsec inline protocol offload
/home/shemminger/DPDK/main/.git/worktrees/sxe2/rebase-apply/patch:236: new blank line at EOF.
+
warning: 1 line adds whitespace errors.



* [PATCH v2] graph: add optional profiling stats
From: Morten Brørup @ 2026-06-19 20:25 UTC (permalink / raw)
  To: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
  Cc: Morten Brørup

Add graph node profiling stats, configurable at build time by enabling
RTE_GRAPH_PROFILE in rte_config.h.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
v2:
* Fix indentation.
---
 config/rte_config.h                 |  1 +
 lib/graph/graph_debug.c             | 29 ++++++++++++++++++++++++++++-
 lib/graph/node.c                    |  2 ++
 lib/graph/rte_graph_worker_common.h | 23 ++++++++++++++++++++---
 4 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/config/rte_config.h b/config/rte_config.h
index 0447cdf2ad..1942c1b1ec 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -106,6 +106,7 @@
 /* rte_graph defines */
 #define RTE_GRAPH_BURST_SIZE 256
 #define RTE_LIBRTE_GRAPH_STATS 1
+/* RTE_GRAPH_PROFILE is not set */
 
 /****** driver defines ********/
 
diff --git a/lib/graph/graph_debug.c b/lib/graph/graph_debug.c
index e3b8cccdc1..b1028f88ed 100644
--- a/lib/graph/graph_debug.c
+++ b/lib/graph/graph_debug.c
@@ -92,7 +92,34 @@ rte_graph_obj_dump(FILE *f, struct rte_graph *g, bool all)
 			fprintf(f, "       total_sched_fail=%" PRId64 "\n",
 				n->dispatch.total_sched_fail);
 		}
-		fprintf(f, "       total_calls=%" PRId64 "\n", n->total_calls);
+		fprintf(f, "       total_calls=%" PRIu64 "\n", n->total_calls);
+		fprintf(f, "       total_cycles=%" PRIu64 "\n", n->total_cycles);
+#ifdef RTE_GRAPH_PROFILE
+		uint64_t calls_2_or_more = n->total_calls -
+				(n->usage_stats[0].calls + n->usage_stats[1].calls);
+		double avg_objs_2_or_more = calls_2_or_more == 0 ? (double)2 :
+				(double)(n->total_objs - n->usage_stats[1].calls) /
+				(double)calls_2_or_more;
+		fprintf(f, "       calls_0=%" PRIu64 ", _1=%" PRIu64 ", _%.1f=%" PRIu64 "\n",
+				n->usage_stats[0].calls,
+				n->usage_stats[1].calls,
+				avg_objs_2_or_more,
+				calls_2_or_more);
+		fprintf(f, "       cycles_0=%" PRIu64 ", _1=%" PRIu64 ", _%.1f=%" PRIu64 "\n",
+				n->usage_stats[0].cycles,
+				n->usage_stats[1].cycles,
+				avg_objs_2_or_more,
+				n->total_cycles -
+				(n->usage_stats[0].cycles + n->usage_stats[1].cycles));
+		fprintf(f, "       cycles_per_call_1=%.1f, _%.1f=%.1f\n",
+				n->usage_stats[1].calls == 0 ? (double)0 :
+				(double)n->usage_stats[1].cycles / (double)n->usage_stats[1].calls,
+				avg_objs_2_or_more,
+				calls_2_or_more == 0 ? (double)0 :
+				(double)(n->total_cycles -
+				(n->usage_stats[0].cycles + n->usage_stats[1].cycles)) /
+				(double)calls_2_or_more);
+#endif
 		for (i = 0; i < n->nb_edges; i++)
 			fprintf(f, "          edge[%d] <%s>\n", i,
 				n->nodes[i]->name);
diff --git a/lib/graph/node.c b/lib/graph/node.c
index 1fce3e6632..19b38881ae 100644
--- a/lib/graph/node.c
+++ b/lib/graph/node.c
@@ -110,10 +110,12 @@ __rte_node_register(const struct rte_node_register *reg)
 	rte_edge_t i;
 	size_t sz;
 
+#ifndef RTE_GRAPH_PROFILE
 	/* Limit Node specific metadata to one cacheline on 64B CL machine */
 	RTE_BUILD_BUG_ON((offsetof(struct rte_node, nodes) -
 			  offsetof(struct rte_node, ctx)) !=
 			 RTE_CACHE_LINE_MIN_SIZE);
+#endif
 
 	graph_spinlock_lock();
 
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4ab53a533e..43ce23765b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -144,12 +144,22 @@ struct __rte_cache_aligned rte_node {
 			rte_node_process_t process; /**< Process function. */
 			uint64_t process_u64;
 		};
+		/** Fast path area cache line 3. */
+#ifdef RTE_GRAPH_PROFILE
+		struct {
+			uint64_t calls;
+			uint64_t cycles;
+		} usage_stats[2];	/**< Usage when this node processed 0 or 1 objects. */
+		/** Fast path area cache line 4. */
+#endif
 		alignas(RTE_CACHE_LINE_MIN_SIZE) struct rte_node *nodes[]; /**< Next nodes. */
 	};
 };
 
+#ifndef RTE_GRAPH_PROFILE
 static_assert(offsetof(struct rte_node, nodes) - offsetof(struct rte_node, ctx)
 	== RTE_CACHE_LINE_MIN_SIZE, "rte_node fast path area must fit in 64 bytes");
+#endif
 
 /**
  * @internal
@@ -197,7 +207,7 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph,
 static __rte_always_inline void
 __rte_node_process(struct rte_graph *graph, struct rte_node *node)
 {
-	uint64_t start;
+	uint64_t cycles;
 	uint16_t rc;
 	void **objs;
 
@@ -206,11 +216,18 @@ __rte_node_process(struct rte_graph *graph, struct rte_node *node)
 	rte_prefetch0(objs);
 
 	if (rte_graph_has_stats_feature()) {
-		start = rte_rdtsc();
+		cycles = -rte_rdtsc();
 		rc = node->process(graph, node, objs, node->idx);
-		node->total_cycles += rte_rdtsc() - start;
+		cycles += rte_rdtsc();
+		node->total_cycles += cycles;
 		node->total_calls++;
 		node->total_objs += rc;
+#ifdef RTE_GRAPH_PROFILE
+		if (rc <= 1) {
+			node->usage_stats[rc].calls++;
+			node->usage_stats[rc].cycles += cycles;
+		}
+#endif
 	} else {
 		node->process(graph, node, objs, node->idx);
 	}
-- 
2.43.0



* [PATCH] graph: add optional profiling stats
From: Morten Brørup @ 2026-06-19 20:20 UTC (permalink / raw)
  To: dev, Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
  Cc: Morten Brørup

Add graph node profiling stats, configurable at build time by enabling
RTE_GRAPH_PROFILE in rte_config.h.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 config/rte_config.h                 |  1 +
 lib/graph/graph_debug.c             | 29 ++++++++++++++++++++++++++++-
 lib/graph/node.c                    |  2 ++
 lib/graph/rte_graph_worker_common.h | 23 ++++++++++++++++++++---
 4 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/config/rte_config.h b/config/rte_config.h
index 0447cdf2ad..1942c1b1ec 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -106,6 +106,7 @@
 /* rte_graph defines */
 #define RTE_GRAPH_BURST_SIZE 256
 #define RTE_LIBRTE_GRAPH_STATS 1
+/* RTE_GRAPH_PROFILE is not set */
 
 /****** driver defines ********/
 
diff --git a/lib/graph/graph_debug.c b/lib/graph/graph_debug.c
index e3b8cccdc1..883e37707c 100644
--- a/lib/graph/graph_debug.c
+++ b/lib/graph/graph_debug.c
@@ -92,7 +92,34 @@ rte_graph_obj_dump(FILE *f, struct rte_graph *g, bool all)
 			fprintf(f, "       total_sched_fail=%" PRId64 "\n",
 				n->dispatch.total_sched_fail);
 		}
-		fprintf(f, "       total_calls=%" PRId64 "\n", n->total_calls);
+		fprintf(f, "       total_calls=%" PRIu64 "\n", n->total_calls);
+		fprintf(f, "       total_cycles=%" PRIu64 "\n", n->total_cycles);
+#ifdef RTE_GRAPH_PROFILE
+		uint64_t calls_2_or_more = n->total_calls -
+				(n->usage_stats[0].calls + n->usage_stats[1].calls);
+		double avg_objs_2_or_more = calls_2_or_more == 0 ? (double)2 :
+				(double)(n->total_objs - n->usage_stats[1].calls) /
+				(double)calls_2_or_more;
+		fprintf(f, "       calls_0=%" PRIu64 ", _1=%" PRIu64 ", _%.1f=%" PRIu64 "\n",
+				n->usage_stats[0].calls,
+				n->usage_stats[1].calls,
+				avg_objs_2_or_more,
+				calls_2_or_more);
+		fprintf(f, "       cycles_0=%" PRIu64 ", _1=%" PRIu64 ", _%.1f=%" PRIu64 "\n",
+				n->usage_stats[0].cycles,
+				n->usage_stats[1].cycles,
+				avg_objs_2_or_more,
+				n->total_cycles -
+				(n->usage_stats[0].cycles + n->usage_stats[1].cycles));
+		fprintf(f, "       cycles_per_call_1=%.1f, _%.1f=%.1f\n",
+				n->usage_stats[1].calls == 0 ? (double)0 :
+				(double)n->usage_stats[1].cycles / (double)n->usage_stats[1].calls,
+				avg_objs_2_or_more,
+				calls_2_or_more == 0 ? (double)0 :
+				(double)(n->total_cycles -
+				(n->usage_stats[0].cycles + n->usage_stats[1].cycles)) /
+                (double)calls_2_or_more);
+#endif
 		for (i = 0; i < n->nb_edges; i++)
 			fprintf(f, "          edge[%d] <%s>\n", i,
 				n->nodes[i]->name);
diff --git a/lib/graph/node.c b/lib/graph/node.c
index 1fce3e6632..19b38881ae 100644
--- a/lib/graph/node.c
+++ b/lib/graph/node.c
@@ -110,10 +110,12 @@ __rte_node_register(const struct rte_node_register *reg)
 	rte_edge_t i;
 	size_t sz;
 
+#ifndef RTE_GRAPH_PROFILE
 	/* Limit Node specific metadata to one cacheline on 64B CL machine */
 	RTE_BUILD_BUG_ON((offsetof(struct rte_node, nodes) -
 			  offsetof(struct rte_node, ctx)) !=
 			 RTE_CACHE_LINE_MIN_SIZE);
+#endif
 
 	graph_spinlock_lock();
 
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4ab53a533e..43ce23765b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -144,12 +144,22 @@ struct __rte_cache_aligned rte_node {
 			rte_node_process_t process; /**< Process function. */
 			uint64_t process_u64;
 		};
+		/** Fast path area cache line 3. */
+#ifdef RTE_GRAPH_PROFILE
+		struct {
+			uint64_t calls;
+			uint64_t cycles;
+		} usage_stats[2];	/**< Usage when this node processed 0 or 1 objects. */
+		/** Fast path area cache line 4. */
+#endif
 		alignas(RTE_CACHE_LINE_MIN_SIZE) struct rte_node *nodes[]; /**< Next nodes. */
 	};
 };
 
+#ifndef RTE_GRAPH_PROFILE
 static_assert(offsetof(struct rte_node, nodes) - offsetof(struct rte_node, ctx)
 	== RTE_CACHE_LINE_MIN_SIZE, "rte_node fast path area must fit in 64 bytes");
+#endif
 
 /**
  * @internal
@@ -197,7 +207,7 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph,
 static __rte_always_inline void
 __rte_node_process(struct rte_graph *graph, struct rte_node *node)
 {
-	uint64_t start;
+	uint64_t cycles;
 	uint16_t rc;
 	void **objs;
 
@@ -206,11 +216,18 @@ __rte_node_process(struct rte_graph *graph, struct rte_node *node)
 	rte_prefetch0(objs);
 
 	if (rte_graph_has_stats_feature()) {
-		start = rte_rdtsc();
+		cycles = -rte_rdtsc();
 		rc = node->process(graph, node, objs, node->idx);
-		node->total_cycles += rte_rdtsc() - start;
+		cycles += rte_rdtsc();
+		node->total_cycles += cycles;
 		node->total_calls++;
 		node->total_objs += rc;
+#ifdef RTE_GRAPH_PROFILE
+		if (rc <= 1) {
+			node->usage_stats[rc].calls++;
+			node->usage_stats[rc].cycles += cycles;
+		}
+#endif
 	} else {
 		node->process(graph, node, objs, node->idx);
 	}
-- 
2.43.0


* RE: [PATCH v1 0/5] prefix lcore role enum values
From: Morten Brørup @ 2026-06-19 20:11 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Thomas Monjalon, Huisong Li, andrew.rybchenko, dev, zhanjie9
In-Reply-To: <20260619083934.510bd2d4@phoenix.local>

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Friday, 19 June 2026 17.40
> 
> On Fri, 19 Jun 2026 09:54:51 +0200
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > > > The problem with this patch it causes build failures now with
> > > > abi diff.
> > >
> > > It is probably a bug of an old version of abidiff.
> > > I recommend updating.
> >
> > With the #define's the ABI has not changed. It's probably too
> > indirect for abidiff to understand.
> > If we absolutely want to please abidiff, we could keep the existing
> > enums and #define RTE_LCORE_ROLE_RTE ROLE_RTE for now.
> > But I'm in favor of what was done already.
> 
> The build failures on github, not in my local builds.
> https://github.com/ovsrobot/dpdk/actions/runs/27789889172/job/82235965090
> 
> It makes looking at patchwork dashboard difficult, all patches show up
> with red mark

So maybe we can choose the path of pleasing abidiff...
Keep the existing enums, and #define the new RTE_LCORE_ prefixed variants, and use those in the code.

Later, with an ABI breaking release, we can swap.
Or maybe we just wait until an ABI breaking release to fix this.
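
A minimal sketch of the "please abidiff" option, for illustration only.
Only ROLE_RTE and RTE_LCORE_ROLE_RTE are named in this thread; the other
enumerator names and the helper lcore_is_eal_worker() are assumptions.
The enum itself is left untouched and the prefixed names are plain
aliases on top of it:

```c
/* Existing enumerators stay as they are, so the enum type is unchanged. */
enum rte_lcore_role_t {
	ROLE_RTE,
	ROLE_OFF,
	ROLE_SERVICE,	/* assumed name */
	ROLE_NON_EAL,	/* assumed name */
};

/* New prefixed names are aliases; swap the two at an ABI-breaking release. */
#define RTE_LCORE_ROLE_RTE     ROLE_RTE
#define RTE_LCORE_ROLE_OFF     ROLE_OFF
#define RTE_LCORE_ROLE_SERVICE ROLE_SERVICE
#define RTE_LCORE_ROLE_NON_EAL ROLE_NON_EAL

/* Hypothetical caller, written against the new names only. */
static inline int
lcore_is_eal_worker(enum rte_lcore_role_t role)
{
	return role == RTE_LCORE_ROLE_RTE;
}
```

With this layout in-tree code can move to the prefixed names right away,
while the values and the enum definition stay exactly as before.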


* [PATCH 10/10] net/enetc4: add cacheable BD ring support with SW cache maintenance
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

On non-cache-coherent platforms such as i.MX95, the BD ring memory
may be mapped as cacheable (normal memory) while the ENETC hardware
DMA engine writes and reads descriptors without CPU cache snooping.
SW must therefore perform explicit cache maintenance to keep CPU
caches and DDR coherent.

TX path (enetc_xmit_pkts_cacheable):
  - Flush each segment's payload cache lines to PoC (dcbf) before
    the BD is handed to HW, so HW DMA reads the correct data.
  - After all BDs for a burst are written, flush the BD cache lines
    (dcbf, one per 64-byte group of 4 BDs) so HW can read the
    updated descriptors.

RX refill (enetc_refill_rx_ring):
  - After writing each full 4-BD cache-line group, dcbf that group
    so HW sees the buffer addresses and cleared lstatus fields.
  - Flush any partial trailing group before updating the ring tail.

RX receive (enetc_recv_pkts_cacheable via enetc_clean_rx_ring_cacheable):
  - Before reading BD status, dccivac the current BD cache line so
    stale CPU-cached BD data is discarded and fresh HW-written
    content is fetched from DDR.
  - After a BD is consumed, dccivac each payload cache line so the
    CPU reads the DMA'd packet data, not stale cached bytes.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
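Illustration of the TX BD flush arithmetic described above, as a
standalone sketch rather than the driver code. It assumes 16-byte BDs,
64-byte cache lines, a ring size that is a non-zero multiple of 4 BDs and
at least one BD written. bd_flush_plan() is a hypothetical helper; it
only computes which BD indices a caller would pass to dcbf():

```c
#define BD_SIZE   16u	/* bytes per BD, as stated in the commit message */
#define CL_SIZE   64u	/* assumed cache line size */
#define BD_PER_CL (CL_SIZE / BD_SIZE)

/*
 * Fill 'lines' with the index of the first BD of every cache line that
 * must be flushed after writing 'written' BDs starting at 'first' on a
 * ring of 'bd_count' BDs. Returns the number of entries stored.
 * Assumes 0 < written <= bd_count and that 'lines' can hold
 * bd_count / BD_PER_CL entries.
 */
static unsigned int
bd_flush_plan(unsigned int first, unsigned int written,
	      unsigned int bd_count, unsigned int *lines)
{
	unsigned int start = first - (first % BD_PER_CL);
	/* BDs spanned, counted from the aligned start of the first line. */
	unsigned int span = (first % BD_PER_CL) + written;
	unsigned int nb_lines = (span + BD_PER_CL - 1) / BD_PER_CL;
	unsigned int max_lines = bd_count / BD_PER_CL;
	unsigned int k;

	/* A burst that wraps into its own first line touches each line once. */
	if (nb_lines > max_lines)
		nb_lines = max_lines;

	for (k = 0; k < nb_lines; k++)
		lines[k] = (start + k * BD_PER_CL) % bd_count;

	return nb_lines;
}
```

The sketch counts from the unaligned first index, so a burst that wraps
around into the cache line it started in still covers every line it
touched. On a 16-entry ring, writing BDs 14, 15, 0 and 1 yields the two
lines starting at BD 12 and BD 0.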
 drivers/net/enetc/enetc.h         |  21 +++
 drivers/net/enetc/enetc4_ethdev.c |  40 +++--
 drivers/net/enetc/enetc_rxtx.c    | 274 ++++++++++++++++++++++++++++++
 3 files changed, 320 insertions(+), 15 deletions(-)

diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index 99b1e91..9f98480 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -96,6 +96,7 @@ struct enetc_bdr {
 	uint64_t ierrors;
 	uint8_t rx_deferred_start;
 	uint8_t tx_deferred_start;
+	uint64_t bd_base_p;
 };
 
 struct enetc_eth_hw {
@@ -312,8 +313,28 @@ uint16_t enetc_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
 uint16_t enetc_recv_pkts_nc(void *rxq, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
+uint16_t enetc_xmit_pkts_cacheable(void *txq, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts);
+uint16_t enetc_recv_pkts_cacheable(void *rxq, struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts);
 
 int enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt);
+
+/*
+ * Cache-maintenance constants for cacheable BD ring mode.
+ *
+ * BD = 16 bytes, cache line = 64 bytes => 4 BDs per cache line.
+ * Every dcbf in enetc_refill_rx_ring() flushes a full 64-byte cache line.
+ * To ensure each dcbf covers only fully-written BDs the caller
+ * must pass a count rounded DOWN to a multiple of ENETC_BD_PER_CL so that
+ * the last partial group is left in cache to be completed and flushed in
+ * the next call.
+ */
+#define ENETC_BD_PER_CL		(RTE_CACHE_LINE_SIZE / sizeof(union enetc_rx_bd))
+#define ENETC_BD_PER_CL_MASK	(ENETC_BD_PER_CL - 1)
+/* Round n DOWN to the nearest multiple of ENETC_BD_PER_CL. */
+#define ENETC_BD_ALIGN_DOWN(n)	((n) & ~(unsigned int)ENETC_BD_PER_CL_MASK)
+
 void enetc4_dev_hw_init(struct rte_eth_dev *eth_dev);
 void enetc_print_ethaddr(const char *name, const struct rte_ether_addr *eth_addr);
 
diff --git a/drivers/net/enetc/enetc4_ethdev.c b/drivers/net/enetc/enetc4_ethdev.c
index d54051f..04dc306 100644
--- a/drivers/net/enetc/enetc4_ethdev.c
+++ b/drivers/net/enetc/enetc4_ethdev.c
@@ -281,12 +281,14 @@ enetc4_alloc_txbdr(struct enetc_bdr *txr, uint16_t nb_desc)
 	int size;
 
 	size = nb_desc * sizeof(struct enetc_swbd);
-	txr->q_swbd = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Zero q_swbd so buffer_addr is NULL for all uninitialized slots. */
+	txr->q_swbd = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (txr->q_swbd == NULL)
 		return -ENOMEM;
 
-	size = nb_desc * sizeof(struct enetc_bdr);
-	txr->bd_base = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Allocate the TX BD ring: each BD is struct enetc_tx_bd (16 bytes). */
+	size = nb_desc * sizeof(struct enetc_tx_bd);
+	txr->bd_base = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (txr->bd_base == NULL) {
 		rte_free(txr->q_swbd);
 		txr->q_swbd = NULL;
@@ -441,12 +443,14 @@ enetc4_alloc_rxbdr(struct enetc_bdr *rxr, uint16_t nb_desc)
 	int size;
 
 	size = nb_desc * sizeof(struct enetc_swbd);
-	rxr->q_swbd = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Zero q_swbd so buffer_addr is NULL for all uninitialized slots. */
+	rxr->q_swbd = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (rxr->q_swbd == NULL)
 		return -ENOMEM;
 
-	size = nb_desc * sizeof(struct enetc_bdr);
-	rxr->bd_base = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Allocate the RX BD ring: each BD is union enetc_rx_bd (16 bytes). */
+	size = nb_desc * sizeof(union enetc_rx_bd);
+	rxr->bd_base = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (rxr->bd_base == NULL) {
 		rte_free(rxr->q_swbd);
 		rxr->q_swbd = NULL;
@@ -481,7 +485,7 @@ enetc4_setup_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring,
 	rx_ring->mb_pool = mb_pool;
 	rx_ring->rcir = (void *)((size_t)hw->reg +
 			ENETC_BDR(RX, idx, ENETC_RBCIR));
-	enetc_refill_rx_ring(rx_ring, (enetc_bd_unused(rx_ring)));
+	enetc_refill_rx_ring(rx_ring, ENETC_BD_ALIGN_DOWN(enetc_bd_unused(rx_ring)));
 	buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rx_ring->mb_pool) -
 		   RTE_PKTMBUF_HEADROOM);
 	enetc4_rxbdr_wr(hw, idx, ENETC_RBBSR, buf_size);
@@ -743,12 +747,17 @@ enetc4_dev_configure(struct rte_eth_dev *dev)
 
 	PMD_INIT_FUNC_TRACE();
 
-	max_len = dev->data->dev_conf.rxmode.mtu + RTE_ETHER_HDR_LEN +
-		  RTE_ETHER_CRC_LEN;
-	enetc4_port_wr(enetc_hw, ENETC4_PM_MAXFRM(0), ENETC_SET_MAXFRM(max_len));
+	/* Port-level register writes are PF-only; skip for VF devices */
+	if (hw->device_id != ENETC4_DEV_ID_VF) {
+		max_len = dev->data->dev_conf.rxmode.mtu + RTE_ETHER_HDR_LEN +
+			  RTE_ETHER_CRC_LEN;
+		enetc4_port_wr(enetc_hw, ENETC4_PM_MAXFRM(0),
+			       ENETC_SET_MAXFRM(max_len));
 
-	val = ENETC4_MAC_MAXFRM_SIZE | SDU_TYPE_MPDU;
-	enetc4_port_wr(enetc_hw, ENETC4_PTCTMSDUR(0), val | SDU_TYPE_MPDU);
+		val = ENETC4_MAC_MAXFRM_SIZE | SDU_TYPE_MPDU;
+		enetc4_port_wr(enetc_hw, ENETC4_PTCTMSDUR(0),
+			       val | SDU_TYPE_MPDU);
+	}
 
 	/* Rx offloads which are enabled by default */
 	if (dev_rx_offloads_sup & ~rx_offloads) {
@@ -770,7 +779,8 @@ enetc4_dev_configure(struct rte_eth_dev *dev)
 	if (rx_offloads & (RTE_ETH_RX_OFFLOAD_UDP_CKSUM | RTE_ETH_RX_OFFLOAD_TCP_CKSUM))
 		checksum &= ~L4_CKSUM;
 
-	enetc4_port_wr(enetc_hw, ENETC4_PARCSCR, checksum);
+	if (hw->device_id != ENETC4_DEV_ID_VF)
+		enetc4_port_wr(enetc_hw, ENETC4_PARCSCR, checksum);
 
 	/* Enable interrupts */
 	if (hw->device_id == ENETC4_DEV_ID_VF) {
@@ -1033,8 +1043,8 @@ enetc4_dev_hw_init(struct rte_eth_dev *eth_dev)
 		ENETC_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
 	struct rte_pci_device *pci_dev = RTE_CLASS_TO_BUS_DEVICE(eth_dev, *pci_dev);
 
-	eth_dev->rx_pkt_burst = &enetc_recv_pkts_nc;
-	eth_dev->tx_pkt_burst = &enetc_xmit_pkts_nc;
+	eth_dev->rx_pkt_burst = &enetc_recv_pkts_cacheable;
+	eth_dev->tx_pkt_burst = &enetc_xmit_pkts_cacheable;
 
 	/* Retrieving and storing the HW base address of device */
 	hw->hw.reg = (void *)pci_dev->mem_resource[0].addr;
diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index a37c835..c737b22 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -26,6 +26,7 @@ enetc_clean_tx_ring(struct enetc_bdr *tx_ring)
 	struct enetc_swbd *tx_swbd, *tx_swbd_base;
 	int i, hwci, bd_count;
 	struct rte_mbuf *m[ENETC_RXBD_BUNDLE];
+	struct enetc_tx_bd *txbd;
 
 	/* we don't need barriers here, we just want a relatively current value
 	 * from HW.
@@ -51,6 +52,13 @@ enetc_clean_tx_ring(struct enetc_bdr *tx_ring)
 		/* It seems calling rte_pktmbuf_free is wasting a lot of cycles,
 		 * make a list and call _free when it's done.
 		 */
+		/* Clear flags on the reclaimed BD so that dcbf in the
+		 * cacheable TX path never flushes a stale flags_F to memory
+		 * before the new BD fields are fully written.
+		 */
+		txbd = ENETC_TXBD(*tx_ring, i);
+		txbd->flags = 0;
+
 		if (tx_frm_cnt == ENETC_RXBD_BUNDLE) {
 			rte_pktmbuf_free_bulk(m, tx_frm_cnt);
 			tx_frm_cnt = 0;
@@ -217,6 +225,7 @@ enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt)
 {
 	struct enetc_swbd *rx_swbd;
 	union enetc_rx_bd *rxbd;
+	union enetc_rx_bd *grp_start_rxbd;
 	int i, j, k = ENETC_RXBD_BUNDLE;
 	struct rte_mbuf *m[ENETC_RXBD_BUNDLE];
 	struct rte_mempool *mb_pool;
@@ -225,6 +234,7 @@ enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt)
 	mb_pool = rx_ring->mb_pool;
 	rx_swbd = &rx_ring->q_swbd[i];
 	rxbd = ENETC_RXBD(*rx_ring, i);
+	grp_start_rxbd = rxbd;
 	for (j = 0; j < buff_cnt; j++) {
 		/* bulk alloc for the next up to 8 BDs */
 		if (k == ENETC_RXBD_BUNDLE) {
@@ -246,12 +256,29 @@ enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt)
 		i++;
 		k++;
 		if (unlikely(i == rx_ring->bd_count)) {
+			/*
+			 * Ring wrap: flush the current partial or full group
+			 * before resetting the pointer to index 0.
+			 */
+			dcbf((void *)grp_start_rxbd);
 			i = 0;
 			rxbd = ENETC_RXBD(*rx_ring, i);
 			rx_swbd = &rx_ring->q_swbd[i];
+			grp_start_rxbd = rxbd;
+		} else if ((i & ENETC_BD_PER_CL_MASK) == 0) {
+			/*
+			 * Completed a full 4-BD group (one cache line).
+			 * Flush it to PoC so HW sees the updated descriptors.
+			 */
+			dcbf((void *)grp_start_rxbd);
+			grp_start_rxbd = rxbd;
 		}
 	}
 
+	/* Flush any remaining partial group at the end of the fill. */
+	if (j && (i & ENETC_BD_PER_CL_MASK) != 0)
+		dcbf((void *)grp_start_rxbd);
+
 	if (likely(j)) {
 		rx_ring->next_to_alloc = i;
 		rx_ring->next_to_use = i;
@@ -597,3 +624,250 @@ enetc_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts,
 
 	return enetc_clean_rx_ring(rx_ring, rx_pkts, nb_pkts);
 }
+
+/* --- Cacheable BD ring TX path with SW cache maintenance (dcbf) --- */
+
+uint16_t
+enetc_xmit_pkts_cacheable(void *tx_queue,
+		struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i, start, bds_to_use;
+	struct enetc_tx_bd *txbd;
+	struct enetc_bdr *tx_ring = (struct enetc_bdr *)tx_queue;
+	unsigned int j;
+	uint8_t *data;
+	struct rte_mbuf *seg;
+	uint16_t seg_len, segs_per_pkt;
+	bool is_first_seg;
+	int first_bd_idx, bd_count;
+
+	i = tx_ring->next_to_use;
+	bds_to_use = enetc_bd_unused(tx_ring);
+	bd_count = tx_ring->bd_count;
+	start = 0;
+
+	/*
+	 * Remember the first BD index of this batch so we can flush the
+	 * BD cache lines to PoC after all descriptors are written.
+	 */
+	first_bd_idx = i;
+
+	while (start < nb_pkts) {
+		seg = tx_pkts[start];
+		segs_per_pkt = seg->nb_segs;
+
+		if (bds_to_use < segs_per_pkt)
+			break;
+
+		is_first_seg = true;
+		while (seg) {
+			tx_ring->q_swbd[i].buffer_addr = NULL;
+			seg_len = rte_pktmbuf_data_len(seg);
+			data = rte_pktmbuf_mtod(seg, void *);
+
+			/*
+			 * Flush packet data cache lines to PoC so HW DMA
+			 * reads the correct payload from memory.
+			 */
+			for (j = 0; j < seg_len; j += RTE_CACHE_LINE_SIZE)
+				dcbf(data + j);
+
+			/*
+			 * Cover the last byte of an unaligned buffer to
+			 * ensure the full payload is clean to the Point of
+			 * Coherency.
+			 */
+			dcbf(data + (seg_len - 1));
+			txbd = ENETC_TXBD(*tx_ring, i);
+			txbd->flags = 0;
+			if (is_first_seg) {
+				tx_ring->q_swbd[i].buffer_addr = seg;
+				txbd->frm_len = rte_pktmbuf_pkt_len(seg);
+				if (seg->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
+					enetc4_tx_offload_checksum(seg, txbd);
+				is_first_seg = false;
+			}
+
+			txbd->buf_len = rte_cpu_to_le_16(seg_len);
+			txbd->addr = rte_cpu_to_le_64(rte_mbuf_data_iova(seg));
+			seg = seg->next;
+			i++;
+			bds_to_use--;
+
+			if (unlikely(i == bd_count))
+				i = 0;
+		}
+
+		/*
+		 * Set the frame-last flag on the final BD of this packet.
+		 * This is the last write to the BD group; the cache flush
+		 * below will push all BDs to memory afterwards.
+		 */
+		txbd->flags |= rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_F);
+		start++;
+	}
+
+	/*
+	 * Flush TX BDs to PoC so HW (non-cache-coherent i.MX95) can read
+	 * the descriptors from memory.  TX BDs are 16 B each; 4 BDs share
+	 * one 64-byte cache line.  Walk from the cache-line-aligned start
+	 * of first_bd_idx to just past the last written BD, one dcbf per
+	 * cache line.
+	 *
+	 * The flush must happen AFTER all BD fields (including flags_F) are
+	 * written, so HW never sees a partial descriptor.
+	 */
+	if (likely(start > 0)) {
+		int n = first_bd_idx & ~ENETC_BD_PER_CL_MASK;
+		int written = (i - n + bd_count) % bd_count;
+
+		if (written == 0)
+			written = bd_count;
+		written = (written + ENETC_BD_PER_CL_MASK) & ~ENETC_BD_PER_CL_MASK;
+
+		while (written > 0) {
+			dcbf((void *)ENETC_TXBD(*tx_ring, n));
+			n = (n + ENETC_BD_PER_CL) % bd_count;
+			written -= ENETC_BD_PER_CL;
+		}
+	}
+
+	enetc_clean_tx_ring(tx_ring);
+	tx_ring->next_to_use = i;
+	enetc_wr_reg(tx_ring->tcir, i);
+
+	return start;
+}
+
+/* --- Cacheable BD ring RX path with SW cache maintenance (dccivac) --- */
+
+static int
+enetc_clean_rx_ring_cacheable(struct enetc_bdr *rx_ring,
+		struct rte_mbuf **rx_pkts,
+		int work_limit)
+{
+	int rx_frm_cnt = 0;
+	int cleaned_cnt, i;
+	struct enetc_swbd *rx_swbd;
+	union enetc_rx_bd *rxbd, rxbd_temp;
+	struct rte_mbuf *first_seg = NULL, *cur_seg = NULL;
+	uint32_t bd_status;
+	uint8_t *data;
+	uint32_t j;
+	struct rte_mbuf *seg;
+	uint16_t data_len;
+
+	i = rx_ring->next_to_clean;
+	rxbd = ENETC_RXBD(*rx_ring, i);
+	cleaned_cnt = enetc_bd_unused(rx_ring);
+	rx_swbd = &rx_ring->q_swbd[i];
+
+	/*
+	 * On i.MX95 the BD ring is in cacheable hugepage memory but the
+	 * platform is non-cache-coherent.  HW writes RX BDs to DDR
+	 * without snooping the CPU cache, so stale cached copies of BD
+	 * status fields must be discarded before the CPU reads them.
+	 *
+	 * Ideal instruction: DC IVAC (invalidate only, no writeback).
+	 * ARM64 constraint: DC IVAC requires EL1 privilege; executing it
+	 * from EL0 (DPDK userspace) raises a fault.  The only EL0-safe
+	 * cache maintenance instruction that invalidates is DC CIVAC
+	 * (clean + invalidate, dccivac).
+	 *
+	 * Safety of using dccivac here:
+	 * enetc_refill_rx_ring() issues dcbf() on every BD group before
+	 * returning ownership to HW.  After dcbf the CPU cache lines are
+	 * marked clean (no dirty data).  When dccivac runs, the "clean"
+	 * phase finds nothing dirty to write back, so it behaves as a
+	 * pure invalidate - exactly what we need.
+	 *
+	 * Granularity: BD = 16 B, cache line = 64 B, so one dccivac
+	 * covers exactly 4 BDs.  Invalidate at each 4-BD boundary.
+	 */
+	dccivac((void *)ENETC_RXBD(*rx_ring,
+			(i & ~(int)ENETC_BD_PER_CL_MASK)));
+
+	while (likely(rx_frm_cnt < work_limit)) {
+#ifdef RTE_ARCH_32
+		rte_memcpy(&rxbd_temp, rxbd, 16);
+#else
+		__uint128_t *dst128 = (__uint128_t *)&rxbd_temp;
+		const __uint128_t *src128 = (const __uint128_t *)rxbd;
+		*dst128 = *src128;
+#endif
+		bd_status = rte_le_to_cpu_32(rxbd_temp.r.lstatus);
+
+		if (!(bd_status & ENETC_RXBD_LSTATUS_R))
+			break;
+		if (rxbd_temp.r.error)
+			rx_ring->ierrors++;
+
+		seg = rx_swbd->buffer_addr;
+		data_len = rte_le_to_cpu_16(rxbd_temp.r.buf_len);
+		seg->data_len = data_len;
+		if (!first_seg) {
+			first_seg = seg;
+			cur_seg = seg;
+			first_seg->pkt_len = data_len;
+			enetc_dev_rx_parse(first_seg,
+					   rxbd_temp.r.parse_summary);
+			first_seg->hash.rss = rxbd_temp.r.rss_hash;
+		} else {
+			first_seg->pkt_len += data_len;
+			first_seg->nb_segs++;
+			cur_seg->next = seg;
+			cur_seg = seg;
+		}
+
+		/*
+		 * Invalidate packet data cache lines so the CPU reads the
+		 * payload that HW DMA'd into memory, not stale cached bytes.
+		 */
+		data = rte_pktmbuf_mtod(seg, void *);
+		for (j = 0; j < data_len; j += RTE_CACHE_LINE_SIZE)
+			dccivac(data + j);
+		/* Cover the last byte of an unaligned buffer. */
+		dccivac(data + (data_len - 1));
+
+		if (bd_status & ENETC_RXBD_LSTATUS_F) {
+			seg->next = NULL;
+			first_seg->pkt_len -= rx_ring->crc_len;
+			rx_pkts[rx_frm_cnt] = first_seg;
+			rx_frm_cnt++;
+			first_seg = NULL;
+		}
+
+		cleaned_cnt++;
+		rx_swbd++;
+		i++;
+		if (unlikely(i == rx_ring->bd_count)) {
+			i = 0;
+			rx_swbd = &rx_ring->q_swbd[i];
+		}
+		rxbd = ENETC_RXBD(*rx_ring, i);
+
+		/*
+		 * Crossed a 4-BD (cache-line) boundary: invalidate the new
+		 * group so the next four status reads fetch fresh DDR data
+		 * written by HW.
+		 */
+		if ((i & ENETC_BD_PER_CL_MASK) == 0 &&
+		    likely(rx_frm_cnt < work_limit))
+			dccivac((void *)rxbd);
+	}
+
+	rx_ring->next_to_clean = i;
+	enetc_refill_rx_ring(rx_ring, ENETC_BD_ALIGN_DOWN(cleaned_cnt));
+
+	return rx_frm_cnt;
+}
+
+uint16_t
+enetc_recv_pkts_cacheable(void *rxq, struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	struct enetc_bdr *rx_ring = (struct enetc_bdr *)rxq;
+
+	return enetc_clean_rx_ring_cacheable(rx_ring, rx_pkts, nb_pkts);
+}
-- 
2.25.1


* [PATCH 09/10] net/enetc: set user configurable priority to TX rings
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Vanshika Shukla
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

From: Vanshika Shukla <vanshika.shukla@nxp.com>

Add devarg 'enetc4_txq_prior' to allow per-queue TX ring priority
configuration. The value is a '|'-separated list of TBMR priority
bits, one per TX queue (e.g. 'enetc4_txq_prior=1|2|3').

Store the parsed priorities in hw->txq_prior and apply them in
enetc4_tx_queue_setup() when enabling the ring.

Signed-off-by: Vanshika Shukla <vanshika.shukla@nxp.com>
---
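Illustration of the '|'-separated devarg format described above. This is
a standalone sketch, not the code in the patch: parse_prior_list() is a
hypothetical name, and it swaps in strtoul() for the strtok()/atoi() pair
so that a malformed token can be reported instead of silently read as 0:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a list such as "1|2|3" into prior[0..max_queues-1].
 * Returns the number of values stored, or -1 on a malformed token.
 * Values past max_queues are ignored; untouched entries keep their
 * previous contents.
 */
static int
parse_prior_list(const char *value, uint32_t *prior, uint32_t max_queues)
{
	const char *p = value;
	uint32_t n = 0;

	while (*p != '\0' && n < max_queues) {
		char *end;
		unsigned long v = strtoul(p, &end, 0);

		/* No digits consumed, or junk before the next separator. */
		if (end == p || (*end != '\0' && *end != '|'))
			return -1;

		prior[n++] = (uint32_t)v;
		p = (*end == '|') ? end + 1 : end;
	}

	return (int)n;
}
```

Parsing "1|2|3" for a port with four TX queues stores 1, 2 and 3 for
queues 0 to 2 and leaves queue 3 at whatever the caller initialised it to.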
 drivers/net/enetc/enetc.h         |  1 +
 drivers/net/enetc/enetc4_ethdev.c | 71 ++++++++++++++++++++++++++++++-
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index 2cdb3c7..99b1e91 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -111,6 +111,7 @@ struct enetc_eth_hw {
 	uint32_t max_tx_queues;
 	uint32_t vsi_timeout; /* VSI-PSI message wait timeout (iterations) */
 	uint32_t vsi_delay;   /* VSI-PSI message wait delay (us) */
+	uint32_t *txq_prior;  /* per-queue TX priority (TBMR priority bits) */
 };
 
 /*
diff --git a/drivers/net/enetc/enetc4_ethdev.c b/drivers/net/enetc/enetc4_ethdev.c
index 154fc09..d54051f 100644
--- a/drivers/net/enetc/enetc4_ethdev.c
+++ b/drivers/net/enetc/enetc4_ethdev.c
@@ -3,6 +3,7 @@
  */
 
 #include <stdbool.h>
+#include <rte_kvargs.h>
 #include <rte_random.h>
 #include <dpaax_iova_table.h>
 
@@ -10,6 +11,65 @@
 #include "enetc_logs.h"
 #include "enetc.h"
 
+#define ENETC4_TXQ_PRIORITIES	"enetc4_txq_prior"
+
+static int
+parse_txq_prior(const char *key __rte_unused, const char *value, void *opaque)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)opaque;
+	struct enetc_eth_hw *hw =
+		ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	char *input_str = strdup(value);
+	char *str;
+	uint32_t i = 0;
+
+	hw->txq_prior = rte_zmalloc(NULL,
+				    hw->max_tx_queues * sizeof(uint32_t), 0);
+	if (!hw->txq_prior) {
+		free(input_str);
+		return -1;
+	}
+
+	str = strtok(input_str, "|");
+	while (str != NULL && i < hw->max_tx_queues) {
+		hw->txq_prior[i++] = (uint32_t)atoi(str);
+		str = strtok(NULL, "|");
+	}
+
+	free(input_str);
+	return 0;
+}
+
+static int
+enetc4_get_devargs(struct rte_eth_dev *dev, const char *key)
+{
+	struct rte_devargs *devargs = dev->device->devargs;
+	struct rte_kvargs *kvlist;
+
+	if (!devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(devargs->args, NULL);
+	if (!kvlist)
+		return 0;
+
+	if (!rte_kvargs_count(kvlist, key)) {
+		rte_kvargs_free(kvlist);
+		return 0;
+	}
+
+	if (!strcmp(key, ENETC4_TXQ_PRIORITIES)) {
+		if (rte_kvargs_process(kvlist, key,
+				       parse_txq_prior, (void *)dev) < 0) {
+			rte_kvargs_free(kvlist);
+			return 0;
+		}
+	}
+
+	rte_kvargs_free(kvlist);
+	return 0;
+}
+
 /* Supported Rx offloads */
 static uint64_t dev_rx_offloads_sup =
 	RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
@@ -310,9 +370,14 @@ enetc4_tx_queue_setup(struct rte_eth_dev *dev,
 	data->tx_queues[queue_idx] = tx_ring;
 	tx_ring->tx_deferred_start = tx_conf->tx_deferred_start;
 	if (!tx_conf->tx_deferred_start) {
+		uint32_t tx_en = ENETC_TBMR_EN;
+
+		/* apply TX queue priority if configured */
+		if (priv->hw.txq_prior)
+			tx_en |= priv->hw.txq_prior[tx_ring->index];
 		/* enable ring */
 		enetc4_txbdr_wr(&priv->hw.hw, tx_ring->index,
-			       ENETC_TBMR, ENETC_TBMR_EN);
+			       ENETC_TBMR, tx_en);
 		dev->data->tx_queue_state[tx_ring->index] =
 			       RTE_ETH_QUEUE_STATE_STARTED;
 	} else {
@@ -1009,6 +1074,8 @@ enetc4_dev_init(struct rte_eth_dev *eth_dev)
 	hw->max_tx_queues = si_cap & ENETC_SICAPR0_BDR_MASK;
 	hw->max_rx_queues = (si_cap >> 16) & ENETC_SICAPR0_BDR_MASK;
 
+	enetc4_get_devargs(eth_dev, ENETC4_TXQ_PRIORITIES);
+
 	ENETC_PMD_DEBUG("Max RX queues = %d Max TX queues = %d",
 			hw->max_rx_queues, hw->max_tx_queues);
 	error = enetc4_mac_init(hw, eth_dev);
@@ -1065,4 +1132,6 @@ static struct rte_pci_driver rte_enetc4_pmd = {
 RTE_PMD_REGISTER_PCI(net_enetc4, rte_enetc4_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_enetc4, pci_id_enetc4_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enetc4, "* vfio-pci");
+RTE_PMD_REGISTER_PARAM_STRING(net_enetc4,
+			      ENETC4_TXQ_PRIORITIES "=<string>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_logtype_pmd, NOTICE);
-- 
2.25.1


* [PATCH 08/10] net/enetc: add devargs to control VSI-PSI timeout and delay
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

Add two new devargs for ENETC4 VF:
- enetc4_vsi_timeout: VSI-PSI message wait timeout (iteration count)
- enetc4_vsi_delay: VSI-PSI message wait delay in microseconds

Store the values in struct enetc_eth_hw and use them in
enetc4_msg_vsi_send() instead of the hardcoded defaults.
Fall back to ENETC4_DEF_VSI_WAIT_TIMEOUT_UPDATE /
ENETC4_DEF_VSI_WAIT_DELAY_UPDATE when not set.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
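Illustration of the fallback rule described above, as a standalone
sketch. The two default values are placeholders, since the real
ENETC4_DEF_VSI_WAIT_* values are not shown in this patch, and both helper
names are hypothetical. The wait budget assumes the send loop sleeps for
the delay once per iteration, which this patch does not show either:

```c
#include <stdint.h>

/* Placeholder defaults, not the driver's actual values. */
#define DEF_WAIT_TIMEOUT  1000
#define DEF_WAIT_DELAY_US 10

/* A devarg left at 0 means "not set", so the built-in default applies. */
static inline int
vsi_param_or_default(uint32_t devarg, int def)
{
	return devarg ? (int)devarg : def;
}

/* Upper bound on the time spent waiting, in microseconds. */
static inline uint64_t
vsi_wait_budget_us(uint32_t vsi_timeout, uint32_t vsi_delay)
{
	return (uint64_t)vsi_param_or_default(vsi_timeout, DEF_WAIT_TIMEOUT) *
	       (uint64_t)vsi_param_or_default(vsi_delay, DEF_WAIT_DELAY_US);
}
```

One consequence of using 0 as the "not set" marker is that an explicit
enetc4_vsi_delay=0 cannot be told apart from an absent devarg and falls
back to the default.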
 drivers/net/enetc/enetc.h     |  2 ++
 drivers/net/enetc/enetc4_vf.c | 54 ++++++++++++++++++++++++-----------
 2 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index 439d2d6..2cdb3c7 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -109,6 +109,8 @@ struct enetc_eth_hw {
 	uint32_t num_rss;
 	uint32_t max_rx_queues;
 	uint32_t max_tx_queues;
+	uint32_t vsi_timeout; /* VSI-PSI message wait timeout (iterations) */
+	uint32_t vsi_delay;   /* VSI-PSI message wait delay (us) */
 };
 
 /*
diff --git a/drivers/net/enetc/enetc4_vf.c b/drivers/net/enetc/enetc4_vf.c
index 44c0dc0..79a08b3 100644
--- a/drivers/net/enetc/enetc4_vf.c
+++ b/drivers/net/enetc/enetc4_vf.c
@@ -10,6 +10,8 @@
 #include "enetc.h"
 
 #define ENETC4_VSI_DISABLE		"enetc4_vsi_disable"
+#define ENETC4_VSI_TIMEOUT		"enetc4_vsi_timeout"
+#define ENETC4_VSI_DELAY		"enetc4_vsi_delay"
 
 #define ENETC_CRC_TABLE_SIZE		256
 #define ENETC_POLY			0x1021
@@ -262,10 +264,13 @@ enetc4_process_psi_msg(struct rte_eth_dev *eth_dev, struct enetc_hw *enetc_hw)
 }
 
 static int
-enetc4_msg_vsi_send(struct enetc_hw *enetc_hw, struct enetc_msg_swbd *msg)
+enetc4_msg_vsi_send(struct enetc_eth_hw *hw, struct enetc_msg_swbd *msg)
 {
-	int timeout = ENETC4_DEF_VSI_WAIT_TIMEOUT_UPDATE;
-	int delay_us = ENETC4_DEF_VSI_WAIT_DELAY_UPDATE;
+	struct enetc_hw *enetc_hw = &hw->hw;
+	int timeout = hw->vsi_timeout ? (int)hw->vsi_timeout :
+					ENETC4_DEF_VSI_WAIT_TIMEOUT_UPDATE;
+	int delay_us = hw->vsi_delay ? (int)hw->vsi_delay :
+				       ENETC4_DEF_VSI_WAIT_DELAY_UPDATE;
 	uint8_t class_id = 0;
 	int err = 0;
 	int vsimsgsr;
@@ -382,7 +387,7 @@ enetc4_vf_set_mac_addr(struct rte_eth_dev *dev, struct rte_ether_addr *addr)
 					ENETC_CMD_ID_SET_PRIMARY_MAC, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -426,7 +431,6 @@ static int
 enetc4_vf_promisc_send_message(struct rte_eth_dev *dev, bool promisc_en)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_cmd_set_promisc *cmd;
 	struct enetc_msg_swbd *msg;
 	uint32_t msg_size;
@@ -466,7 +470,7 @@ enetc4_vf_promisc_send_message(struct rte_eth_dev *dev, bool promisc_en)
 				ENETC_CMD_ID_SET_MAC_PROMISCUOUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -483,7 +487,6 @@ static int
 enetc4_vf_allmulti_send_message(struct rte_eth_dev *dev, bool mc_promisc)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_cmd_set_promisc *cmd;
 	struct enetc_msg_swbd *msg;
 	uint32_t msg_size;
@@ -524,7 +527,7 @@ enetc4_vf_allmulti_send_message(struct rte_eth_dev *dev, bool mc_promisc)
 				ENETC_CMD_ID_SET_MAC_PROMISCUOUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -630,7 +633,7 @@ enetc4_vf_get_link_status(struct rte_eth_dev *dev, struct enetc_psi_reply_msg *r
 			ENETC_CMD_ID_GET_LINK_STATUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -676,7 +679,7 @@ enetc4_vf_get_link_speed(struct rte_eth_dev *dev, struct enetc_psi_reply_msg *re
 			ENETC_CMD_ID_GET_LINK_SPEED, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -819,7 +822,6 @@ static int
 enetc4_vf_vlan_promisc(struct rte_eth_dev *dev, bool promisc_en)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_cmd_set_vlan_promisc *cmd;
 	struct enetc_msg_swbd *msg;
 	uint32_t msg_size;
@@ -858,7 +860,7 @@ enetc4_vf_vlan_promisc(struct rte_eth_dev *dev, bool promisc_en)
 				ENETC_CMD_ID_SET_VLAN_PROMISCUOUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -921,7 +923,7 @@ enetc4_vf_mac_addr_add(struct rte_eth_dev *dev, struct rte_ether_addr *addr,
 			ENETC_MSG_ADD_EXACT_MAC_ENTRIES, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -1021,7 +1023,7 @@ static int enetc4_vf_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id,
 	}
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -1104,7 +1106,6 @@ static int
 enetc4_vf_link_register_notif(struct rte_eth_dev *dev, bool enable)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_swbd *msg;
 	struct rte_eth_link link;
 	uint32_t msg_size;
@@ -1138,7 +1139,7 @@ enetc4_vf_link_register_notif(struct rte_eth_dev *dev, bool enable)
 			cmd, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err)
 		ENETC_PMD_ERR("VSI msg error for link status notification");
 
@@ -1322,12 +1323,29 @@ enetc4_vf_dev_init(struct rte_eth_dev *eth_dev)
 		kvlist = rte_kvargs_parse(eth_dev->device->devargs->args,
 					  NULL);
 		if (kvlist) {
+			const char *val;
+
 			if (rte_kvargs_count(kvlist, ENETC4_VSI_DISABLE) != 0) {
 				ENETC_PMD_NOTICE("VSI messaging disabled by devarg");
 				eth_dev->dev_ops = &enetc4_vf_ops_no_vsi_m;
 			} else {
 				eth_dev->dev_ops = &enetc4_vf_ops;
 			}
+
+			/* parse optional VSI-PSI timeout devarg */
+			val = rte_kvargs_get(kvlist, ENETC4_VSI_TIMEOUT);
+			if (val) {
+				hw->vsi_timeout = (uint32_t)strtoul(val, NULL, 0);
+				ENETC_PMD_NOTICE("VSI timeout set to %u", hw->vsi_timeout);
+			}
+
+			/* parse optional VSI-PSI delay devarg */
+			val = rte_kvargs_get(kvlist, ENETC4_VSI_DELAY);
+			if (val) {
+				hw->vsi_delay = (uint32_t)strtoul(val, NULL, 0);
+				ENETC_PMD_NOTICE("VSI delay set to %u us", hw->vsi_delay);
+			}
+
 			rte_kvargs_free(kvlist);
 		} else {
 			eth_dev->dev_ops = &enetc4_vf_ops;
@@ -1443,5 +1461,7 @@ RTE_PMD_REGISTER_PCI(net_enetc4_vf, rte_enetc4_vf_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_enetc4_vf, pci_vf_id_enetc4_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enetc4_vf, "* igb_uio | uio_pci_generic");
 RTE_PMD_REGISTER_PARAM_STRING(net_enetc4_vf,
-			      ENETC4_VSI_DISABLE "=<any>");
+			      ENETC4_VSI_DISABLE "=<any> "
+			      ENETC4_VSI_TIMEOUT "=<uint> "
+			      ENETC4_VSI_DELAY "=<uint>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_vf_logtype_pmd, NOTICE);
-- 
2.25.1


^ permalink raw reply related

* [PATCH 06/10] net/enetc: support scatter-gather
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Vanshika Shukla
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

From: Vanshika Shukla <vanshika.shukla@nxp.com>

Add scatter-gather support for ENETC4 PMD:
- Add ENETC_RXBD_LSTATUS_R/F bits for RX BD status
- Add ENETC4_MAX_SEGS (63) for max segments per TX packet
- Update enetc4_vf_dev_infos_get to fill nb_seg_max, offloads,
  max queues and packet length
- Extend enetc_xmit_pkts_nc to handle multi-segment mbufs
- Extend enetc_clean_rx_ring_nc to chain scatter-gather segments
  using LSTATUS_R/F bits

Signed-off-by: Vanshika Shukla <vanshika.shukla@nxp.com>
---
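Note: the RX chaining described in the commit message above can be sketched
outside the driver roughly as follows. This is only an illustration under
stated assumptions: struct seg and struct rx_bd are hypothetical stand-ins
for rte_mbuf and the hardware RX BD, and chain_rx_segments() is not a driver
function. Only the R/F bit positions are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define RXBD_LSTATUS_R (1u << 30) /* BD written back by HW (bit from the patch) */
#define RXBD_LSTATUS_F (1u << 31) /* final BD of a frame (bit from the patch) */

/* Hypothetical stand-in for rte_mbuf; only the fields the sketch needs. */
struct seg {
	struct seg *next;
	uint32_t pkt_len;  /* meaningful on the first segment only */
	uint16_t data_len;
	uint16_t nb_segs;  /* meaningful on the first segment only */
};

/* Hypothetical stand-in for the RX buffer descriptor. */
struct rx_bd {
	uint32_t lstatus;
	uint16_t buf_len;
};

/*
 * Walk up to n descriptors, chain their buffers into frames and store the
 * head of every completed frame in out[]. Stops at the first BD whose R bit
 * is clear. Returns the number of completed frames.
 */
static int
chain_rx_segments(const struct rx_bd *bd, struct seg *bufs, int n,
		  struct seg **out, uint16_t crc_len)
{
	struct seg *first = NULL, *cur = NULL;
	int frames = 0;

	for (int i = 0; i < n; i++) {
		struct seg *s = &bufs[i];

		if (!(bd[i].lstatus & RXBD_LSTATUS_R))
			break;

		s->data_len = bd[i].buf_len;
		s->next = NULL;
		if (first == NULL) {
			/* first segment carries the per-frame totals */
			first = s;
			first->pkt_len = s->data_len;
			first->nb_segs = 1;
		} else {
			first->pkt_len += s->data_len;
			first->nb_segs++;
			cur->next = s;
		}
		cur = s;

		if (bd[i].lstatus & RXBD_LSTATUS_F) {
			/* frame complete: strip CRC from the total length */
			first->pkt_len -= crc_len;
			out[frames++] = first;
			first = NULL;
		}
	}
	return frames;
}
```

The sketch sets nb_segs to 1 explicitly on the first segment; whether the
driver can rely on the mbuf already being reset is not shown here. A frame
whose F bit has not been seen when the loop ends is simply not returned by
this sketch.
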
 drivers/net/enetc/base/enetc_hw.h |   2 +
 drivers/net/enetc/enetc.h         |   4 +-
 drivers/net/enetc/enetc4_vf.c     |  46 ++++++++---
 drivers/net/enetc/enetc_rxtx.c    | 124 +++++++++++++++++++-----------
 4 files changed, 119 insertions(+), 57 deletions(-)

diff --git a/drivers/net/enetc/base/enetc_hw.h b/drivers/net/enetc/base/enetc_hw.h
index f79c950..6e96562 100644
--- a/drivers/net/enetc/base/enetc_hw.h
+++ b/drivers/net/enetc/base/enetc_hw.h
@@ -230,6 +230,8 @@ enum enetc_bdr_type {TX, RX};
 			(0x0005 | ENETC_PKT_TYPE_IPV4)
 #define ENETC_PKT_TYPE_IPV6_ESP \
 			(0x0005 | ENETC_PKT_TYPE_IPV6)
+#define ENETC_RXBD_LSTATUS_R	BIT(30)
+#define ENETC_RXBD_LSTATUS_F	BIT(31)
 
 /* PCI device info */
 struct enetc_hw {
diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index 4d99b5b..439d2d6 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2019,2024 NXP
+ * Copyright 2018-2019,2024-2026 NXP
  */
 
 #ifndef _ENETC_H_
@@ -28,6 +28,8 @@
 #define MIN_BD_COUNT   32
 /* BD ALIGN */
 #define BD_ALIGN       8
+/* Max segments per ENETC4 TX packet (scatter-gather) */
+#define ENETC4_MAX_SEGS	63
 
 /* minimum frame size supported */
 #define ENETC_MAC_MINFRM_SIZE	68
diff --git a/drivers/net/enetc/enetc4_vf.c b/drivers/net/enetc/enetc4_vf.c
index bec7128..9dc4e1d 100644
--- a/drivers/net/enetc/enetc4_vf.c
+++ b/drivers/net/enetc/enetc4_vf.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2024 NXP
+ * Copyright 2024-2026 NXP
  */
 
 #include <stdbool.h>
@@ -18,8 +18,19 @@ uint16_t enetc_crc_table[ENETC_CRC_TABLE_SIZE];
 bool enetc_crc_gen;
 
 /* Supported Rx offloads */
-static uint64_t dev_vf_rx_offloads_sup =
-	RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
+static uint64_t dev_rx_offloads_sup =
+	RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
+	RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
+	RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
+	RTE_ETH_RX_OFFLOAD_VLAN_FILTER |
+	RTE_ETH_RX_OFFLOAD_SCATTER;
+
+/* Supported Tx offloads */
+static uint64_t dev_tx_offloads_sup =
+	RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
+	RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
+	RTE_ETH_TX_OFFLOAD_TCP_CKSUM |
+	RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
 
 static void
 enetc_gen_crc_table(void)
@@ -61,21 +72,38 @@ static int
 enetc4_vf_dev_infos_get(struct rte_eth_dev *dev,
 			struct rte_eth_dev_info *dev_info)
 {
-	int ret = 0;
+	struct enetc_eth_hw *hw =
+		ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
 	PMD_INIT_FUNC_TRACE();
 
-	ret = enetc4_dev_infos_get(dev, dev_info);
-	if (ret)
-		return ret;
-
+	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = MAX_BD_COUNT,
+		.nb_min = MIN_BD_COUNT,
+		.nb_align = BD_ALIGN,
+		.nb_seg_max = ENETC4_MAX_SEGS,
+		.nb_mtu_seg_max = ENETC4_MAX_SEGS,
+	};
+	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = MAX_BD_COUNT,
+		.nb_min = MIN_BD_COUNT,
+		.nb_align = BD_ALIGN,
+		.nb_seg_max = ENETC4_MAX_SEGS,
+		.nb_mtu_seg_max = ENETC4_MAX_SEGS,
+	};
+	dev_info->max_rx_queues = hw->max_rx_queues;
+	dev_info->max_tx_queues = hw->max_tx_queues;
+	dev_info->max_rx_pktlen = ENETC4_MAC_MAXFRM_SIZE;
 	dev_info->max_mtu = dev_info->max_rx_pktlen - (RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN);
 	dev_info->max_mac_addrs = ENETC4_MAC_ENTRIES;
-	dev_info->rx_offload_capa |= dev_vf_rx_offloads_sup;
+	dev_info->rx_offload_capa = dev_rx_offloads_sup;
+	dev_info->tx_offload_capa = dev_tx_offloads_sup;
+	dev_info->flow_type_rss_offloads = ENETC_RSS_OFFLOAD_ALL;
 
 	return 0;
 }
 
+
 int
 enetc4_vf_dev_stop(struct rte_eth_dev *dev __rte_unused)
 {
diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index c87349f..a37c835 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -149,54 +149,64 @@ enetc_xmit_pkts_nc(void *tx_queue,
 		struct rte_mbuf **tx_pkts,
 		uint16_t nb_pkts)
 {
-	struct enetc_swbd *tx_swbd;
-	int i, start, bds_to_use;
-	struct enetc_tx_bd *txbd;
 	struct enetc_bdr *tx_ring = (struct enetc_bdr *)tx_queue;
-	unsigned int buflen, j;
+	int i, start, bds_to_use, bd_count;
+	struct enetc_tx_bd *txbd;
+	struct rte_mbuf *seg;
+	uint16_t seg_len, segs_per_pkt;
+	bool is_first_seg;
+	unsigned int j;
 	uint8_t *data;
 
 	i = tx_ring->next_to_use;
-
 	bds_to_use = enetc_bd_unused(tx_ring);
-	if (bds_to_use < nb_pkts)
-		nb_pkts = bds_to_use;
-
+	bd_count = tx_ring->bd_count;
 	start = 0;
-	while (nb_pkts--) {
-		tx_ring->q_swbd[i].buffer_addr = tx_pkts[start];
 
-		buflen = rte_pktmbuf_pkt_len(tx_ring->q_swbd[i].buffer_addr);
-		data = rte_pktmbuf_mtod(tx_ring->q_swbd[i].buffer_addr, void *);
-		for (j = 0; j <= buflen; j += RTE_CACHE_LINE_SIZE)
-			dcbf(data + j);
+	while (start < nb_pkts) {
+		seg = tx_pkts[start];
+		segs_per_pkt = seg->nb_segs;
 
-		txbd = ENETC_TXBD(*tx_ring, i);
-		txbd->flags = 0;
-		if (tx_ring->q_swbd[i].buffer_addr->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
-			enetc4_tx_offload_checksum(tx_ring->q_swbd[i].buffer_addr, txbd);
+		if (bds_to_use < segs_per_pkt)
+			break;
 
-		tx_swbd = &tx_ring->q_swbd[i];
-		txbd->frm_len = buflen;
-		txbd->buf_len = txbd->frm_len;
-		txbd->addr = (uint64_t)(uintptr_t)
-		rte_cpu_to_le_64((size_t)tx_swbd->buffer_addr->buf_iova +
-				 tx_swbd->buffer_addr->data_off);
+		is_first_seg = true;
+		while (seg) {
+			tx_ring->q_swbd[i].buffer_addr = NULL;
+			seg_len = rte_pktmbuf_data_len(seg);
+			data = rte_pktmbuf_mtod(seg, void *);
+
+			/* Flush payload to PoC so HW DMA reads the correct data. */
+			for (j = 0; j < seg_len; j += RTE_CACHE_LINE_SIZE)
+				dcbf(data + j);
+			/* Cover the last byte of an unaligned buffer. */
+			dcbf(data + (seg_len - 1));
+
+			txbd = ENETC_TXBD(*tx_ring, i);
+			txbd->flags = 0;
+			if (is_first_seg) {
+				tx_ring->q_swbd[i].buffer_addr = tx_pkts[start];
+				txbd->frm_len = rte_pktmbuf_pkt_len(seg);
+				if (seg->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
+					enetc4_tx_offload_checksum(seg, txbd);
+				is_first_seg = false;
+			}
+
+			txbd->buf_len = rte_cpu_to_le_16(seg_len);
+			txbd->addr = rte_cpu_to_le_64(rte_mbuf_data_iova(seg));
+			seg = seg->next;
+			i++;
+			bds_to_use--;
+			if (unlikely(i == bd_count))
+				i = 0;
+		}
+
+		/* Set the frame-last flag on the final BD of this packet. */
 		txbd->flags |= rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_F);
-		i++;
 		start++;
-		if (unlikely(i == tx_ring->bd_count))
-			i = 0;
 	}
 
-	/* we're only cleaning up the Tx ring here, on the assumption that
-	 * software is slower than hardware and hardware completed sending
-	 * older frames out by now.
-	 * We're also cleaning up the ring before kicking off Tx for the new
-	 * batch to minimize chances of contention on the Tx ring
-	 */
 	enetc_clean_tx_ring(tx_ring);
-
 	tx_ring->next_to_use = i;
 	enetc_wr_reg(tx_ring->tcir, i);
 	return start;
@@ -501,38 +511,59 @@ enetc_clean_rx_ring_nc(struct enetc_bdr *rx_ring,
 	int cleaned_cnt, i;
 	struct enetc_swbd *rx_swbd;
 	union enetc_rx_bd *rxbd, rxbd_temp;
+	struct rte_mbuf *first_seg = NULL, *cur_seg = NULL;
 	uint32_t bd_status;
 	uint8_t *data;
 	uint32_t j;
+	struct rte_mbuf *seg;
+	uint16_t data_len;
 
 	/* next descriptor to process */
 	i = rx_ring->next_to_clean;
-	/* next descriptor to process */
 	rxbd = ENETC_RXBD(*rx_ring, i);
-
 	cleaned_cnt = enetc_bd_unused(rx_ring);
 	rx_swbd = &rx_ring->q_swbd[i];
 
 	while (likely(rx_frm_cnt < work_limit)) {
 		rxbd_temp = *rxbd;
 		bd_status = rte_le_to_cpu_32(rxbd_temp.r.lstatus);
-		if (!bd_status)
+		/* LSTATUS_R indicates this BD has been written by HW */
+		if (!(bd_status & ENETC_RXBD_LSTATUS_R))
 			break;
 		if (rxbd_temp.r.error)
 			rx_ring->ierrors++;
 
-		rx_swbd->buffer_addr->pkt_len = rxbd_temp.r.buf_len -
-						rx_ring->crc_len;
-		rx_swbd->buffer_addr->data_len = rx_swbd->buffer_addr->pkt_len;
-		rx_swbd->buffer_addr->hash.rss = rxbd_temp.r.rss_hash;
-		enetc_dev_rx_parse(rx_swbd->buffer_addr,
-				   rxbd_temp.r.parse_summary);
+		seg = rx_swbd->buffer_addr;
+		data_len = rte_le_to_cpu_16(rxbd_temp.r.buf_len);
+		seg->data_len = data_len;
+
+		if (!first_seg) {
+			first_seg = seg;
+			cur_seg = seg;
+			first_seg->pkt_len = data_len;
+			enetc_dev_rx_parse(first_seg, rxbd_temp.r.parse_summary);
+			first_seg->hash.rss = rxbd_temp.r.rss_hash;
+		} else {
+			first_seg->pkt_len += data_len;
+			first_seg->nb_segs++;
+			cur_seg->next = seg;
+			cur_seg = seg;
+		}
 
-		data = rte_pktmbuf_mtod(rx_swbd->buffer_addr, void *);
-		for (j = 0; j <= rx_swbd->buffer_addr->pkt_len; j += RTE_CACHE_LINE_SIZE)
+		/* Invalidate packet data cache lines so CPU reads HW-written data. */
+		data = rte_pktmbuf_mtod(seg, void *);
+		for (j = 0; j < data_len; j += RTE_CACHE_LINE_SIZE)
 			dccivac(data + j);
+		dccivac(data + (data_len - 1));
+
+		if (bd_status & ENETC_RXBD_LSTATUS_F) {
+			seg->next = NULL;
+			first_seg->pkt_len -= rx_ring->crc_len;
+			rx_pkts[rx_frm_cnt] = first_seg;
+			rx_frm_cnt++;
+			first_seg = NULL;
+		}
 
-		rx_pkts[rx_frm_cnt] = rx_swbd->buffer_addr;
 		cleaned_cnt++;
 		rx_swbd++;
 		i++;
@@ -541,7 +572,6 @@ enetc_clean_rx_ring_nc(struct enetc_bdr *rx_ring,
 			rx_swbd = &rx_ring->q_swbd[i];
 		}
 		rxbd = ENETC_RXBD(*rx_ring, i);
-		rx_frm_cnt++;
 	}
 
 	rx_ring->next_to_clean = i;
-- 
2.25.1


^ permalink raw reply related

* [PATCH 07/10] net/enetc: add option to disable VSI messaging
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

Add devarg 'enetc4_vsi_disable' to allow disabling features
dependent on VSI-PSI messaging. This is useful for testing DPDK
with a PF driver that does not support VSI-PSI messages.

When the devarg is present, a reduced ops table
(enetc4_vf_ops_no_vsi_m) is used that replaces link_update with
a no-op stub and omits MAC/VLAN filter ops that require VSI messages.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
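Note: the ops-table selection described above can be illustrated without
DPDK as below. This is a sketch under assumptions, not the driver code:
struct ops is a trimmed-down, hypothetical replacement for struct eth_dev_ops,
and devargs_has_key() is a hand-rolled stand-in for the rte_kvargs lookup the
patch uses. Only the devarg name is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define VSI_DISABLE_KEY "enetc4_vsi_disable" /* devarg name from the patch */

/* Hypothetical, trimmed-down ops table. */
struct ops {
	int (*link_update)(void);
	int (*mac_addr_add)(void); /* NULL means "not supported" */
};

static int link_update_vsi(void)   { return 1; } /* would message the PF */
static int link_update_dummy(void) { return 0; } /* no-op stub */
static int mac_addr_add_vsi(void)  { return 1; } /* would message the PF */

static const struct ops ops_full   = { link_update_vsi, mac_addr_add_vsi };
static const struct ops ops_no_vsi = { link_update_dummy, NULL };

/* True if key is present as a whole key in a "k1=v1,k2=v2" string. */
static bool
devargs_has_key(const char *args, const char *key)
{
	size_t klen = strlen(key);

	while (args != NULL && *args != '\0') {
		const char *end = strchr(args, ',');
		size_t len = end ? (size_t)(end - args) : strlen(args);

		if (len >= klen && strncmp(args, key, klen) == 0 &&
		    (len == klen || args[klen] == '='))
			return true;
		args = end ? end + 1 : NULL;
	}
	return false;
}

/* Presence of the key, regardless of its value, selects the reduced table. */
static const struct ops *
select_ops(const char *args)
{
	return devargs_has_key(args, VSI_DISABLE_KEY) ? &ops_no_vsi : &ops_full;
}
```

With the usual EAL devargs syntax this would correspond to something like
"-a <BDF>,enetc4_vsi_disable=1"; the value itself is ignored, only the
presence of the key matters.
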
 drivers/net/enetc/enetc4_vf.c | 61 +++++++++++++++++++++++++++++++++--
 1 file changed, 58 insertions(+), 3 deletions(-)

diff --git a/drivers/net/enetc/enetc4_vf.c b/drivers/net/enetc/enetc4_vf.c
index 9dc4e1d..44c0dc0 100644
--- a/drivers/net/enetc/enetc4_vf.c
+++ b/drivers/net/enetc/enetc4_vf.c
@@ -3,11 +3,14 @@
  */
 
 #include <stdbool.h>
+#include <rte_kvargs.h>
 #include <rte_random.h>
 #include <dpaax_iova_table.h>
 #include "enetc_logs.h"
 #include "enetc.h"
 
+#define ENETC4_VSI_DISABLE		"enetc4_vsi_disable"
+
 #define ENETC_CRC_TABLE_SIZE		256
 #define ENETC_POLY			0x1021
 #define ENETC_CRC_INIT			0xffff
@@ -687,6 +690,13 @@ enetc4_vf_get_link_speed(struct rte_eth_dev *dev, struct enetc_psi_reply_msg *re
 	return err;
 }
 
+static int
+enetc4_vf_link_update_dummy(struct rte_eth_dev *dev __rte_unused,
+			    int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
 static int
 enetc4_vf_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
 {
@@ -1148,6 +1158,27 @@ static const struct rte_pci_id pci_vf_id_enetc4_map[] = {
 };
 
 /* Features supported by this driver */
+/* ops table used when VSI messaging is disabled */
+static const struct eth_dev_ops enetc4_vf_ops_no_vsi_m = {
+	.dev_configure        = enetc4_dev_configure,
+	.dev_start            = enetc4_vf_dev_start,
+	.dev_stop             = enetc4_vf_dev_stop,
+	.dev_close            = enetc4_dev_close,
+	.stats_get            = enetc4_vf_stats_get,
+	.dev_infos_get        = enetc4_vf_dev_infos_get,
+	.mtu_set              = enetc4_vf_mtu_set,
+	.link_update	      = enetc4_vf_link_update_dummy,
+	.rx_queue_setup       = enetc4_rx_queue_setup,
+	.rx_queue_start       = enetc4_rx_queue_start,
+	.rx_queue_stop        = enetc4_rx_queue_stop,
+	.rx_queue_release     = enetc4_rx_queue_release,
+	.tx_queue_setup       = enetc4_tx_queue_setup,
+	.tx_queue_start       = enetc4_tx_queue_start,
+	.tx_queue_stop        = enetc4_tx_queue_stop,
+	.tx_queue_release     = enetc4_tx_queue_release,
+	.dev_supported_ptypes_get = enetc4_supported_ptypes_get,
+};
+
 static const struct eth_dev_ops enetc4_vf_ops = {
 	.dev_configure        = enetc4_dev_configure,
 	.dev_start            = enetc4_vf_dev_start,
@@ -1283,7 +1314,28 @@ enetc4_vf_dev_init(struct rte_eth_dev *eth_dev)
 	struct enetc_hw *enetc_hw = &hw->hw;
 
 	PMD_INIT_FUNC_TRACE();
-	eth_dev->dev_ops = &enetc4_vf_ops;
+
+	/* check if VSI messaging should be disabled via devarg */
+	if (eth_dev->device->devargs) {
+		struct rte_kvargs *kvlist;
+
+		kvlist = rte_kvargs_parse(eth_dev->device->devargs->args,
+					  NULL);
+		if (kvlist) {
+			if (rte_kvargs_count(kvlist, ENETC4_VSI_DISABLE) != 0) {
+				ENETC_PMD_NOTICE("VSI messaging disabled by devarg");
+				eth_dev->dev_ops = &enetc4_vf_ops_no_vsi_m;
+			} else {
+				eth_dev->dev_ops = &enetc4_vf_ops;
+			}
+			rte_kvargs_free(kvlist);
+		} else {
+			eth_dev->dev_ops = &enetc4_vf_ops;
+		}
+	} else {
+		eth_dev->dev_ops = &enetc4_vf_ops;
+	}
+
 	enetc4_dev_hw_init(eth_dev);
 
 	si_cap = enetc_rd(enetc_hw, ENETC_SICAPR0);
@@ -1304,8 +1356,9 @@ enetc4_vf_dev_init(struct rte_eth_dev *eth_dev)
 	ENETC_PMD_DEBUG("port_id %d vendorID=0x%x deviceID=0x%x",
 			eth_dev->data->port_id, pci_dev->id.vendor_id,
 			pci_dev->id.device_id);
-	/* update link */
-	enetc4_vf_link_update(eth_dev, 0);
+	/* update link if VSI messaging is enabled */
+	if (eth_dev->dev_ops == &enetc4_vf_ops)
+		enetc4_vf_link_update(eth_dev, 0);
 
 	return 0;
 }
@@ -1389,4 +1442,6 @@ static struct rte_pci_driver rte_enetc4_vf_pmd = {
 RTE_PMD_REGISTER_PCI(net_enetc4_vf, rte_enetc4_vf_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_enetc4_vf, pci_vf_id_enetc4_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enetc4_vf, "* igb_uio | uio_pci_generic");
+RTE_PMD_REGISTER_PARAM_STRING(net_enetc4_vf,
+			      ENETC4_VSI_DISABLE "=<any>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_vf_logtype_pmd, NOTICE);
-- 
2.25.1


^ permalink raw reply related

* [PATCH 05/10] net/enetc: update random MAC generation code
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

Use rte_eth_random_addr() instead of manual rte_rand() based MAC
generation. Also handle the VF path by writing to ENETC_SIPMAR0/1 instead
of ENETC_PSIPMAR0/1 when running as a VF.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
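Note: for readers unfamiliar with what a random MAC has to satisfy, here is
a small standalone sketch. It assumes plain rand() as the entropy source and
uses hypothetical helper names (random_unicast_laa, mac_to_regs); it is not
the rte_eth_random_addr() implementation. The register split uses memcpy
rather than pointer casts, which keeps host byte order just like the casts in
the patch but avoids alignment and aliasing questions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETHER_ADDR_LEN  6
#define ETHER_GROUP_BIT 0x01 /* multicast/group bit, first octet */
#define ETHER_LOCAL_BIT 0x02 /* locally administered bit, first octet */

/* Fill addr with a random unicast, locally administered MAC address. */
static void
random_unicast_laa(uint8_t addr[ETHER_ADDR_LEN])
{
	for (int i = 0; i < ETHER_ADDR_LEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);

	addr[0] &= (uint8_t)~ETHER_GROUP_BIT; /* clear multicast bit */
	addr[0] |= ETHER_LOCAL_BIT;           /* set local assignment bit */
}

/* Split the 6 address bytes into a 32-bit and a 16-bit register value. */
static void
mac_to_regs(const uint8_t addr[ETHER_ADDR_LEN],
	    uint32_t *high_mac, uint16_t *low_mac)
{
	memcpy(high_mac, addr, sizeof(*high_mac));
	memcpy(low_mac, addr + 4, sizeof(*low_mac));
}
```
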
 drivers/net/enetc/enetc_ethdev.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/net/enetc/enetc_ethdev.c b/drivers/net/enetc/enetc_ethdev.c
index 407179f..427da87 100644
--- a/drivers/net/enetc/enetc_ethdev.c
+++ b/drivers/net/enetc/enetc_ethdev.c
@@ -196,20 +196,18 @@ enetc_hardware_init(struct enetc_eth_hw *hw)
 	}
 
 	if ((high_mac | low_mac) == 0) {
-		char *first_byte;
-
 		ENETC_PMD_NOTICE("MAC is not available for this SI, "
 				"set random MAC");
-		mac = (uint32_t *)hw->mac.addr;
-		*mac = (uint32_t)rte_rand();
-		first_byte = (char *)mac;
-		*first_byte &= 0xfe;	/* clear multicast bit */
-		*first_byte |= 0x02;	/* set local assignment bit (IEEE802) */
-
-		enetc_port_wr(enetc_hw, ENETC_PSIPMAR0(0), *mac);
-		mac++;
-		*mac = (uint16_t)rte_rand();
-		enetc_port_wr(enetc_hw, ENETC_PSIPMAR1(0), *mac);
+		rte_eth_random_addr(hw->mac.addr);
+		high_mac = *(uint32_t *)hw->mac.addr;
+		low_mac = *(uint16_t *)(hw->mac.addr + 4);
+		if (hw->device_id == ENETC_DEV_ID_VF) {
+			enetc_wr(enetc_hw, ENETC_SIPMAR0, high_mac);
+			enetc_wr(enetc_hw, ENETC_SIPMAR1, low_mac);
+		} else {
+			enetc_port_wr(enetc_hw, ENETC_PSIPMAR0(0), high_mac);
+			enetc_port_wr(enetc_hw, ENETC_PSIPMAR1(0), low_mac);
+		}
 		enetc_print_ethaddr("New address: ",
 			      (const struct rte_ether_addr *)hw->mac.addr);
 	}
-- 
2.25.1


^ permalink raw reply related

* [PATCH 04/10] net/enetc: support ESP packet type in packet parsing
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

Add ESP (Encapsulating Security Payload) packet type definitions and
handling to the RX packet parsing path. Also update the supported
ptypes array to advertise ESP tunnel type support.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
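Note: the parse-result to packet-type mapping added here follows the same
switch pattern as the existing cases. A standalone sketch, with loud
assumptions: the PKT_TYPE_IPV4/IPV6 base values and the PT_* flags below are
placeholders invented for illustration (the real ones are ENETC_PKT_TYPE_* in
enetc_hw.h and RTE_PTYPE_* in DPDK); only the 0x0005 ESP code is taken from
the patch.

```c
#include <stdint.h>

/* Hypothetical base encodings; see enetc_hw.h for the real values. */
#define PKT_TYPE_IPV4     0x0060u
#define PKT_TYPE_IPV6     0x0080u
#define PKT_TYPE_IPV4_ESP (0x0005u | PKT_TYPE_IPV4) /* 0x0005 from the patch */
#define PKT_TYPE_IPV6_ESP (0x0005u | PKT_TYPE_IPV6)

/* Placeholder flags standing in for RTE_PTYPE_*. */
#define PT_UNKNOWN    0x0u
#define PT_L2_ETHER   0x1u
#define PT_L3_IPV4    0x2u
#define PT_L3_IPV6    0x4u
#define PT_TUNNEL_ESP 0x8u

/* Translate a hardware parse result into a packet-type bitmask. */
static uint32_t
parse_to_ptype(uint16_t parse_results)
{
	switch (parse_results) {
	case PKT_TYPE_IPV4_ESP:
		return PT_L2_ETHER | PT_L3_IPV4 | PT_TUNNEL_ESP;
	case PKT_TYPE_IPV6_ESP:
		return PT_L2_ETHER | PT_L3_IPV6 | PT_TUNNEL_ESP;
	default:
		/* anything not recognised is reported as unknown */
		return PT_UNKNOWN;
	}
}
```
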
 drivers/net/enetc/base/enetc_hw.h |  4 ++++
 drivers/net/enetc/enetc_ethdev.c  |  4 +++-
 drivers/net/enetc/enetc_rxtx.c    | 10 ++++++++++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/enetc/base/enetc_hw.h b/drivers/net/enetc/base/enetc_hw.h
index 19efadd..f79c950 100644
--- a/drivers/net/enetc/base/enetc_hw.h
+++ b/drivers/net/enetc/base/enetc_hw.h
@@ -226,6 +226,10 @@ enum enetc_bdr_type {TX, RX};
 			(0x0003 | ENETC_PKT_TYPE_IPV4)
 #define ENETC_PKT_TYPE_IPV6_ICMP \
 			(0x0003 | ENETC_PKT_TYPE_IPV6)
+#define ENETC_PKT_TYPE_IPV4_ESP \
+			(0x0005 | ENETC_PKT_TYPE_IPV4)
+#define ENETC_PKT_TYPE_IPV6_ESP \
+			(0x0005 | ENETC_PKT_TYPE_IPV6)
 
 /* PCI device info */
 struct enetc_hw {
diff --git a/drivers/net/enetc/enetc_ethdev.c b/drivers/net/enetc/enetc_ethdev.c
index f41f3c1..407179f 100644
--- a/drivers/net/enetc/enetc_ethdev.c
+++ b/drivers/net/enetc/enetc_ethdev.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2024 NXP
+ * Copyright 2018-2026 NXP
  */
 
 #include <stdbool.h>
@@ -95,6 +95,8 @@ enetc_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused,
 		RTE_PTYPE_L4_UDP,
 		RTE_PTYPE_L4_SCTP,
 		RTE_PTYPE_L4_ICMP,
+		RTE_PTYPE_TUNNEL_ESP,
+		RTE_PTYPE_UNKNOWN,
 	};
 
 	*no_of_elements = RTE_DIM(ptypes);
diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index b44e6f3..c87349f 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -370,6 +370,16 @@ enetc_dev_rx_parse(struct rte_mbuf *m, uint16_t parse_results)
 				 RTE_PTYPE_L3_IPV6 |
 				 RTE_PTYPE_L4_UDP;
 		return;
+	case ENETC_PKT_TYPE_IPV4_ESP:
+		m->packet_type = RTE_PTYPE_L2_ETHER |
+				 RTE_PTYPE_L3_IPV4 |
+				 RTE_PTYPE_TUNNEL_ESP;
+		return;
+	case ENETC_PKT_TYPE_IPV6_ESP:
+		m->packet_type = RTE_PTYPE_L2_ETHER |
+				 RTE_PTYPE_L3_IPV6 |
+				 RTE_PTYPE_TUNNEL_ESP;
+		return;
 	case ENETC_PKT_TYPE_IPV4_SCTP:
 		m->packet_type = RTE_PTYPE_L2_ETHER |
 				 RTE_PTYPE_L3_IPV4 |
-- 
2.25.1


^ permalink raw reply related

* [PATCH 03/10] net/enetc: fix queue initialization
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, stable
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

Hardware can misbehave if the user tries to reset the consumer and
producer indexes without resetting the ring.

This patch adds the ring reset step before resetting the indexes.

Fixes: 6c9c5aadc0e0 ("net/enetc: support ENETC4 queue API")
Cc: stable@dpdk.org

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
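Note: the ordering this fix enforces (disable the ring, then reset the
indexes) can be sketched as below. Everything here is hypothetical: struct
ring_regs models one ring's registers in plain memory, BDR_MODE_EN is an
assumed bit position, and real MMIO would go through the driver's register
accessors instead of direct stores.

```c
#include <stdint.h>

#define BDR_MODE_EN (1u << 31) /* assumed position of the ring enable bit */

/* Hypothetical register file for one BD ring. */
struct ring_regs {
	uint32_t mode; /* stands in for TBMR / RBMR */
	uint32_t pidx; /* producer index */
	uint32_t cidx; /* consumer index */
};

/* Read-modify-write: clear only the enable bit, keep the other mode bits. */
static void
ring_disable(struct ring_regs *r)
{
	uint32_t v = r->mode;

	v &= ~BDR_MODE_EN;
	r->mode = v;
}

/* Disable the ring first, and only then reset the indexes. */
static void
ring_setup(struct ring_regs *r)
{
	ring_disable(r);
	r->pidx = 0;
	r->cidx = 0;
}
```
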
 drivers/net/enetc/enetc4_ethdev.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/enetc/enetc4_ethdev.c b/drivers/net/enetc/enetc4_ethdev.c
index 78eba70..154fc09 100644
--- a/drivers/net/enetc/enetc4_ethdev.c
+++ b/drivers/net/enetc/enetc4_ethdev.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2024 NXP
+ * Copyright 2024-2026 NXP
  */
 
 #include <stdbool.h>
@@ -279,6 +279,7 @@ enetc4_tx_queue_setup(struct rte_eth_dev *dev,
 		     const struct rte_eth_txconf *tx_conf)
 {
 	int err;
+	uint32_t tx_data;
 	struct enetc_bdr *tx_ring;
 	struct rte_eth_dev_data *data = dev->data;
 	struct enetc_eth_adapter *priv =
@@ -301,6 +302,10 @@ enetc4_tx_queue_setup(struct rte_eth_dev *dev,
 		goto fail;
 
 	tx_ring->ndev = dev;
+	/* reset queue */
+	tx_data = enetc4_txbdr_rd(&priv->hw.hw, tx_ring->index, ENETC_TBMR);
+	tx_data &= ~ENETC_TBMR_EN;
+	enetc4_txbdr_wr(&priv->hw.hw, tx_ring->index, ENETC_TBMR, tx_data);
 	enetc4_setup_txbdr(&priv->hw.hw, tx_ring);
 	data->tx_queues[queue_idx] = tx_ring;
 	tx_ring->tx_deferred_start = tx_conf->tx_deferred_start;
@@ -427,6 +432,7 @@ enetc4_rx_queue_setup(struct rte_eth_dev *dev,
 		     struct rte_mempool *mb_pool)
 {
 	int err = 0;
+	uint32_t rx_enable;
 	struct enetc_bdr *rx_ring;
 	struct rte_eth_dev_data *data =  dev->data;
 	struct enetc_eth_adapter *adapter =
@@ -450,6 +456,10 @@ enetc4_rx_queue_setup(struct rte_eth_dev *dev,
 		goto fail;
 
 	rx_ring->ndev = dev;
+	/* reset queue */
+	rx_enable = enetc4_rxbdr_rd(&adapter->hw.hw, rx_ring->index, ENETC_RBMR);
+	rx_enable &= ~ENETC_RBMR_EN;
+	enetc4_rxbdr_wr(&adapter->hw.hw, rx_ring->index, ENETC_RBMR, rx_enable);
 	enetc4_setup_rxbdr(&adapter->hw.hw, rx_ring, mb_pool);
 	data->rx_queues[rx_queue_id] = rx_ring;
 	rx_ring->rx_deferred_start = rx_conf->rx_deferred_start;
-- 
2.25.1


^ permalink raw reply related

* [PATCH 02/10] net/enetc: fix TX BDs flag overwrite issue
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, stable
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

Zero the flags field before setting offload bits and set the
frame-last flag (F) after all descriptor fields are written.
This prevents stale flag bits from a previous packet corrupting
the current descriptor.

Fixes: 72f491f1e53c ("net/enetc: optimize ENETC4 data path")
Cc: stable@dpdk.org

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
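Note: a minimal sketch of the "clear first, set F last" ordering, assuming a
simplified descriptor. struct tx_bd, TXBD_FLAGS_CSUM and fill_tx_bd() are
hypothetical and exist only for illustration; the F bit value follows patch
01 of this series.

```c
#include <stdint.h>

#define TXBD_FLAGS_F    0x80u /* frame-last, BIT(7) per patch 01 */
#define TXBD_FLAGS_CSUM 0x01u /* hypothetical offload bit */

/* Hypothetical, simplified TX buffer descriptor. */
struct tx_bd {
	uint64_t addr;
	uint16_t buf_len;
	uint16_t frm_len;
	uint8_t flags;
};

static void
fill_tx_bd(struct tx_bd *bd, uint64_t iova, uint16_t len, int want_csum)
{
	/* 1. Drop whatever the previous user of this slot left behind. */
	bd->flags = 0;

	/* 2. OR in per-packet offload bits. */
	if (want_csum)
		bd->flags |= (uint8_t)TXBD_FLAGS_CSUM;

	/* 3. Fill the remaining descriptor fields. */
	bd->frm_len = len;
	bd->buf_len = len;
	bd->addr = iova;

	/* 4. Mark the frame-last bit once everything else is in place. */
	bd->flags |= (uint8_t)TXBD_FLAGS_F;
}
```

Because step 4 uses OR rather than assignment, offload bits set in step 2
survive, and because step 1 assigns zero, nothing from an earlier packet does.
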
 drivers/net/enetc/enetc_rxtx.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index a2b8153..b44e6f3 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2024 NXP
+ * Copyright 2018-2026 NXP
  */
 
 #include <stdbool.h>
@@ -172,7 +172,7 @@ enetc_xmit_pkts_nc(void *tx_queue,
 			dcbf(data + j);
 
 		txbd = ENETC_TXBD(*tx_ring, i);
-		txbd->flags = rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_F);
+		txbd->flags = 0;
 		if (tx_ring->q_swbd[i].buffer_addr->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
 			enetc4_tx_offload_checksum(tx_ring->q_swbd[i].buffer_addr, txbd);
 
@@ -182,6 +182,7 @@ enetc_xmit_pkts_nc(void *tx_queue,
 		txbd->addr = (uint64_t)(uintptr_t)
 		rte_cpu_to_le_64((size_t)tx_swbd->buffer_addr->buf_iova +
 				 tx_swbd->buffer_addr->data_off);
+		txbd->flags |= rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_F);
 		i++;
 		start++;
 		if (unlikely(i == tx_ring->bd_count))
-- 
2.25.1


^ permalink raw reply related

* [PATCH 01/10] net/enetc: fix TX BD structure
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, stable
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

The flags field in struct enetc_tx_bd was declared as uint16_t but
ENETC4 TX BDs only use an 8-bit flags byte. Fix the type to uint8_t
to match the hardware descriptor layout.

Fixes: 696fa399d797 ("net/enetc: add PMD with basic operations")
Cc: stable@dpdk.org

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
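Note: a small standalone illustration of why the flag macro has to move
together with the field width. The names below are hypothetical and only
mirror the old and new bit positions from this patch.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

#define TXBD_FLAGS_F_OLD BIT(15) /* position used with the 16-bit field */
#define TXBD_FLAGS_F_NEW BIT(7)  /* position used with the 8-bit field */

/* Store a flag value into an 8-bit field, as the fixed descriptor would. */
static uint8_t
store_flags8(uint32_t v)
{
	return (uint8_t)v; /* bits above 7 are silently discarded */
}
```

Storing the old BIT(15) value into a uint8_t field yields 0, so the
frame-last flag would be lost; BIT(7) survives as 0x80.
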
 drivers/net/enetc/base/enetc_hw.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/enetc/base/enetc_hw.h b/drivers/net/enetc/base/enetc_hw.h
index 173d677..19efadd 100644
--- a/drivers/net/enetc/base/enetc_hw.h
+++ b/drivers/net/enetc/base/enetc_hw.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2024 NXP
+ * Copyright 2018-2026 NXP
  */
 
 #ifndef _ENETC_HW_H_
@@ -198,8 +198,7 @@ enum enetc_bdr_type {TX, RX};
 
 #define ENETC_TX_ADDR(txq, addr) ((void *)((txq)->enetc_txbdr + (addr)))
 
-#define ENETC_TXBD_FLAGS_IE		BIT(13)
-#define ENETC_TXBD_FLAGS_F		BIT(15)
+#define ENETC_TXBD_FLAGS_F		BIT(7)
 
 /* ENETC Parsed values (Little Endian) */
 #define ENETC_PARSE_ERROR		0x8000
@@ -262,7 +261,7 @@ struct enetc_tx_bd {
 			uint8_t l3t:1;
 			uint8_t resv:5;
 			uint8_t l4t:3;
-			uint16_t flags;
+			uint8_t flags;
 		};/* default layout */
 		uint32_t txstart;
 		uint32_t lstatus;
-- 
2.25.1


^ permalink raw reply related

* [PATCH 00/10] NXP ENETC driver related changes
From: Gagandeep Singh @ 2026-06-19 18:44 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal

This series collects fixes and new features for the NXP ENETC driver.

Gagandeep Singh (8):
  net/enetc: fix TX BD structure
  net/enetc: fix TX BDs flag overwrite issue
  net/enetc: fix queue initialization
  net/enetc: support ESP packet type in packet parsing
  net/enetc: update random MAC generation code
  net/enetc: add option to disable VSI messaging
  net/enetc: add devargs to control VSI-PSI timeout and delay
  net/enetc4: add cacheable BD ring support with SW cache maintenance

Vanshika Shukla (2):
  net/enetc: support scatter-gather
  net/enetc: set user configurable priority to TX rings

 drivers/net/enetc/base/enetc_hw.h |  13 +-
 drivers/net/enetc/enetc.h         |  28 +-
 drivers/net/enetc/enetc4_ethdev.c | 123 +++++++--
 drivers/net/enetc/enetc4_vf.c     | 159 ++++++++++--
 drivers/net/enetc/enetc_ethdev.c  |  26 +-
 drivers/net/enetc/enetc_rxtx.c    | 411 ++++++++++++++++++++++++++----
 6 files changed, 649 insertions(+), 111 deletions(-)

-- 
2.25.1


^ permalink raw reply

* Re: [PATCH v4 00/23] net/sxe2: added Linkdata sxe2 ethernet driver
From: Stephen Hemminger @ 2026-06-19 17:31 UTC (permalink / raw)
  To: liujie5; +Cc: dev
In-Reply-To: <20260619080156.1539964-1-liujie5@linkdatatechnology.com>

On Fri, 19 Jun 2026 16:01:56 +0800
liujie5@linkdatatechnology.com wrote:

> From: Jie Liu <liujie5@linkdatatechnology.com>
> 
> This patch set implements core functionality for the SXE2 PMD,
> including basic driver framework, data path setup, and advanced
> offload features (VLAN, RSS, TM, PTP, etc.).
> 
> V19:
>  - remove software statistics devargs
> 
> Jie Liu (23):
>   net/sxe2: remove software statistics devargs
>   net/sxe2: support AVX512 vectorized path for Rx and Tx
>   net/sxe2: add AVX2 vector data path for Rx and Tx
>   net/sxe2: add supported packet types get callback
>   net/sxe2: add link update callback
>   net/sxe2: support L2 filtering and MAC config
>   drivers: support RSS feature
>   net/sxe2: support TM hierarchy and shaping
>   net/sxe2: support IPsec inline protocol offload
>   net/sxe2: support statistics and multi-process
>   drivers: interrupt handling
>   net/sxe2: add NEON vec Rx/Tx burst functions
>   drivers: add support for VF representors
>   net/sxe2: add support for custom UDP tunnel ports
>   net/sxe2: support firmware version reading
>   net/sxe2: implement get monitor address
>   common/sxe2: add shared SFP module definitions
>   net/sxe2: support SFP module info and EEPROM access
>   net/sxe2: implement private dump info
>   net/sxe2: add mbuf validation in Tx debug mode
>   common/sxe2: add callback for memory event handling
>   net/sxe2: add private devargs parsing
>   net/sxe2: update sxe2 feature matrix docs
> 
>  doc/guides/nics/features/sxe2.ini          |   56 +
>  doc/guides/nics/sxe2.rst                   |  164 ++
>  drivers/common/sxe2/sxe2_common.c          |  156 ++
>  drivers/common/sxe2/sxe2_common.h          |    4 +
>  drivers/common/sxe2/sxe2_flow_public.h     |  633 +++++++
>  drivers/common/sxe2/sxe2_ioctl_chnl.c      |  178 +-
>  drivers/common/sxe2/sxe2_ioctl_chnl_func.h |   18 +
>  drivers/common/sxe2/sxe2_msg.h             |  118 ++
>  drivers/net/sxe2/meson.build               |   52 +
>  drivers/net/sxe2/sxe2_cmd_chnl.c           | 1587 +++++++++++++++-
>  drivers/net/sxe2/sxe2_cmd_chnl.h           |  139 ++
>  drivers/net/sxe2/sxe2_drv_cmd.h            |  523 +++++-
>  drivers/net/sxe2/sxe2_dump.c               |  302 +++
>  drivers/net/sxe2/sxe2_dump.h               |   12 +
>  drivers/net/sxe2/sxe2_ethdev.c             | 1513 ++++++++++++++-
>  drivers/net/sxe2/sxe2_ethdev.h             |  112 +-
>  drivers/net/sxe2/sxe2_ethdev_repr.c        |  609 ++++++
>  drivers/net/sxe2/sxe2_ethdev_repr.h        |   32 +
>  drivers/net/sxe2/sxe2_filter.c             |  895 +++++++++
>  drivers/net/sxe2/sxe2_filter.h             |  100 +
>  drivers/net/sxe2/sxe2_flow.c               | 1394 ++++++++++++++
>  drivers/net/sxe2/sxe2_flow.h               |   30 +
>  drivers/net/sxe2/sxe2_flow_define.h        |  144 ++
>  drivers/net/sxe2/sxe2_flow_parse_action.c  | 1182 ++++++++++++
>  drivers/net/sxe2/sxe2_flow_parse_action.h  |   23 +
>  drivers/net/sxe2/sxe2_flow_parse_engine.c  |  106 ++
>  drivers/net/sxe2/sxe2_flow_parse_engine.h  |   13 +
>  drivers/net/sxe2/sxe2_flow_parse_pattern.c | 1935 +++++++++++++++++++
>  drivers/net/sxe2/sxe2_flow_parse_pattern.h |   46 +
>  drivers/net/sxe2/sxe2_ipsec.c              | 1565 ++++++++++++++++
>  drivers/net/sxe2/sxe2_ipsec.h              |  254 +++
>  drivers/net/sxe2/sxe2_irq.c                | 1026 ++++++++++
>  drivers/net/sxe2/sxe2_irq.h                |   25 +
>  drivers/net/sxe2/sxe2_mac.c                |  530 ++++++
>  drivers/net/sxe2/sxe2_mac.h                |   84 +
>  drivers/net/sxe2/sxe2_mp.c                 |  414 ++++
>  drivers/net/sxe2/sxe2_mp.h                 |   67 +
>  drivers/net/sxe2/sxe2_queue.c              |   17 +-
>  drivers/net/sxe2/sxe2_queue.h              |   15 +-
>  drivers/net/sxe2/sxe2_rss.c                |  584 ++++++
>  drivers/net/sxe2/sxe2_rss.h                |   81 +
>  drivers/net/sxe2/sxe2_rx.c                 |   93 +-
>  drivers/net/sxe2/sxe2_rx.h                 |    2 +
>  drivers/net/sxe2/sxe2_security.c           |  335 ++++
>  drivers/net/sxe2/sxe2_security.h           |   77 +
>  drivers/net/sxe2/sxe2_stats.c              |  586 ++++++
>  drivers/net/sxe2/sxe2_stats.h              |   39 +
>  drivers/net/sxe2/sxe2_switchdev.c          |  332 ++++
>  drivers/net/sxe2/sxe2_switchdev.h          |   33 +
>  drivers/net/sxe2/sxe2_tm.c                 | 1151 ++++++++++++
>  drivers/net/sxe2/sxe2_tm.h                 |   76 +
>  drivers/net/sxe2/sxe2_tx.c                 |    7 +
>  drivers/net/sxe2/sxe2_txrx.c               | 1968 +++++++++++++++++++-
>  drivers/net/sxe2/sxe2_txrx.h               |    8 +
>  drivers/net/sxe2/sxe2_txrx_check_mbuf.c    |  595 ++++++
>  drivers/net/sxe2/sxe2_txrx_check_mbuf.h    |   38 +
>  drivers/net/sxe2/sxe2_txrx_poll.c          |  281 ++-
>  drivers/net/sxe2/sxe2_txrx_vec.c           |   46 +-
>  drivers/net/sxe2/sxe2_txrx_vec.h           |   38 +-
>  drivers/net/sxe2/sxe2_txrx_vec_avx2.c      |  748 ++++++++
>  drivers/net/sxe2/sxe2_txrx_vec_avx512.c    |  868 +++++++++
>  drivers/net/sxe2/sxe2_txrx_vec_common.h    |   53 +-
>  drivers/net/sxe2/sxe2_txrx_vec_neon.c      |  691 +++++++
>  drivers/net/sxe2/sxe2_txrx_vec_sse.c       |   29 +-
>  drivers/net/sxe2/sxe2_vsi.c                |  146 ++
>  drivers/net/sxe2/sxe2_vsi.h                |   12 +-
>  drivers/net/sxe2/sxe2vf_regs.h             |   85 +
>  67 files changed, 24809 insertions(+), 266 deletions(-)
>  create mode 100644 drivers/common/sxe2/sxe2_flow_public.h
>  create mode 100644 drivers/common/sxe2/sxe2_msg.h
>  create mode 100644 drivers/net/sxe2/sxe2_dump.c
>  create mode 100644 drivers/net/sxe2/sxe2_dump.h
>  create mode 100644 drivers/net/sxe2/sxe2_ethdev_repr.c
>  create mode 100644 drivers/net/sxe2/sxe2_ethdev_repr.h
>  create mode 100644 drivers/net/sxe2/sxe2_filter.c
>  create mode 100644 drivers/net/sxe2/sxe2_filter.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_define.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_action.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_action.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_engine.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_engine.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_pattern.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_pattern.h
>  create mode 100644 drivers/net/sxe2/sxe2_ipsec.c
>  create mode 100644 drivers/net/sxe2/sxe2_ipsec.h
>  create mode 100644 drivers/net/sxe2/sxe2_irq.c
>  create mode 100644 drivers/net/sxe2/sxe2_mac.c
>  create mode 100644 drivers/net/sxe2/sxe2_mac.h
>  create mode 100644 drivers/net/sxe2/sxe2_mp.c
>  create mode 100644 drivers/net/sxe2/sxe2_mp.h
>  create mode 100644 drivers/net/sxe2/sxe2_rss.c
>  create mode 100644 drivers/net/sxe2/sxe2_rss.h
>  create mode 100644 drivers/net/sxe2/sxe2_security.c
>  create mode 100644 drivers/net/sxe2/sxe2_security.h
>  create mode 100644 drivers/net/sxe2/sxe2_stats.c
>  create mode 100644 drivers/net/sxe2/sxe2_stats.h
>  create mode 100644 drivers/net/sxe2/sxe2_switchdev.c
>  create mode 100644 drivers/net/sxe2/sxe2_switchdev.h
>  create mode 100644 drivers/net/sxe2/sxe2_tm.c
>  create mode 100644 drivers/net/sxe2/sxe2_tm.h
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_check_mbuf.c
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_check_mbuf.h
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_vec_avx2.c
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_vec_avx512.c
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_vec_neon.c
>  create mode 100644 drivers/net/sxe2/sxe2vf_regs.h
> 

This is looking much better; there are a few minor things that you probably
want to address before I merge it.

The (overly verbose) AI feedback is...

[PATCH v4 00/23] sxe2 driver feature additions

This is in good shape. Substantive structural progress on essentially
everything I raised against v3.

Verified across the assembled tree:

- All 23 commits build cleanly end-to-end. git bisect now works. This is
  the first revision of the series where that's been true.
- No LLM citation placeholders remain in commit messages. The v3 19/20
  message with "[citation:1][citation:3][citation:5]" markers and the
  "approximately X%" placeholder are both gone.
- The atomic-sw-stats fix is properly placed. 01/23 is a clean standalone
  cleanup commit that removes RTE_ATOMIC qualifiers from
  sxe2_rxq_sw_stats, replaces the atomic load/store/fetch_add calls with
  plain operations, removes the if(sw_stats_en) gating, removes the now-
  unused #include <rte_stdatomic.h>, and renames high_performance_mode to
  no_sched_mode to match the devargs string. Verified zero atomic
  operations on sw_stats remain in the assembled tree.
- drv-sw-stats devarg removed entirely (defined, parsed, but unused in
  v3 - now gone).
- All surviving devargs are documented in doc/guides/nics/sxe2.rst with
  substantive explanations covering what each parameter does, valid
  values, defaults, and trade-offs.
- The v3 19/20 patch is split into 21/23 (memseg-walk callback
  infrastructure, common/sxe2 only) and 22/23 (devargs parsing,
  net/sxe2). Both commit messages now describe one thing each.
- The 469-entry runtime ptype-table initialiser is now a file-scope
  `static const alignas(RTE_CACHE_LINE_SIZE) uint32_t
  sxe2_ptype_tbl[]` with C99 designated initialisers.
- Patch 02/23 (AVX512) scope is tightened - dropped from 13 files to 6,
  and the files it touches are all AVX512-related now.
- The v3 03/20 patch is split into 04/23 ("supported packet types") and
  05/23 ("link update callback"), addressing the scope-drift complaint.

Three remaining items, none blocking:

[PATCH v4 04/23] subject still does not match content

The commit message says the patch adds `dev_supported_ptypes_get`, and
the patch adds that callback - but it also creates the entire 1793-line
drivers/net/sxe2/sxe2_txrx.c with the Tx/Rx framework, packet-type
constant table, classification helpers, etc. The ptype callback is a
small piece of what this patch does. Either rename the subject to
something like "net/sxe2: add Rx/Tx framework and packet types callback"
(more honest) or split the txrx framework into a separate prior commit
with ptype-callback registration as a small follow-up.

[PATCH v4 04/23] ptype table refactor is incomplete

The static const table is correct, but adapter->ptype_tbl is still
declared in struct sxe2_adapter and sxe2_init_ptype_tbl() now just
memcpy's the const table into the per-adapter copy at init. The vec
paths in sxe2_txrx_vec_avx2.c, _avx512.c, _sse.c and the poll path all
read through rxq->vsi->adapter->ptype_tbl[] rather than the file-scope
const. To finish: remove the adapter field, remove sxe2_init_ptype_tbl,
and have all readers reference sxe2_ptype_tbl directly. That saves one
indirection per packet in the inner loop, and per-port memory drops by
SXE2_MAX_PTYPE_NUM * 4 bytes.
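
A minimal sketch of the end state described above, assuming nothing
about the real driver beyond what the review says: every identifier,
constant and table entry below is a hypothetical stand-in (the real
table has 469 entries and uses RTE_CACHE_LINE_SIZE and the RTE_PTYPE_*
values). Readers index one file-scope const table directly, with no
per-adapter copy and no init-time memcpy.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-ins; the driver would use its own definitions. */
#define CACHE_LINE_SIZE 64      /* stand-in for RTE_CACHE_LINE_SIZE */
#define MAX_PTYPE_NUM   1024    /* stand-in for SXE2_MAX_PTYPE_NUM, power of two assumed */
#define PTYPE_L2_ETHER  0x00000001u
#define PTYPE_L3_IPV4   0x00000010u

/* Single shared, read-only table; unlisted entries are zero (unknown). */
static alignas(CACHE_LINE_SIZE) const uint32_t ptype_tbl[MAX_PTYPE_NUM] = {
	[1]  = PTYPE_L2_ETHER,
	[22] = PTYPE_L2_ETHER | PTYPE_L3_IPV4,
};

/*
 * Rx-path lookup: one load from the const table, instead of chasing
 * rxq->vsi->adapter->ptype_tbl[]. The mask keeps the index in range.
 */
static inline uint32_t
ptype_lookup(uint16_t hw_ptype)
{
	return ptype_tbl[hw_ptype & (MAX_PTYPE_NUM - 1)];
}
```

Because the table is const and file-scope, it is shared by every port
rather than duplicated in each adapter structure.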

[PATCH v4 22/23] flow-duplicate-pattern still defaults to 1

This devarg now has good documentation, but the documentation
clarifies the design objection rather than resolving it: a boolean
that toggles "duplicate rte_flow rules are rejected with EEXIST" vs
"duplicate rte_flow rules are accepted" is a per-boot toggle for
standard-API contract semantics. Standard APIs shouldn't behave
differently based on a vendor devarg. Pick one policy (rejecting
duplicates with EEXIST is what every other PMD does), apply it
unconditionally, and remove the devarg. The
switch_pattern_dup_allow rule metadata can stay if hardware needs it
internally - just don't expose the policy as a boot-time knob.

The other surviving devargs are acceptable as posted:
- no-sched-mode: kernel-coexistence rationale documented, defensible.
- rx-low-latency: ITR throttling threshold, well-documented trade-off,
  precedent in other PMDs.
- function-flow-direct: DPDK/kernel flow-table coexistence policy with
  no rte_flow analogue. The documentation explains this clearly.
- fnav-stat-type: hardware counter-mode selection. The cleaner long-
  term shape would be separate xstats names, but the current form is
  documented and reasonable for now.
- sched-layer-mode: hardware-imposed TM hierarchy cap. Should ideally
  be exposed via rte_tm_capabilities_get and selected at hierarchy
  build time rather than via devarg; worth raising as a future rte_tm
  enhancement.

Minor cosmetic:

In sxe2_parse_no_sched_mode() (22/23) the local variable is still
named `high_performance_mode`. The struct field rename in 01/23
didn't propagate to this parser local. Cosmetic.

Once 22/23 drops flow-duplicate-pattern and 04/23's subject is
either renamed or split, I'd consider this ready.


^ permalink raw reply

* Re: [PATCH v3 00/18] net/dpaa: bug fixes for bus, net and fmlib drivers
From: Stephen Hemminger @ 2026-06-19 17:28 UTC (permalink / raw)
  To: Hemant Agrawal; +Cc: david.marchand, dev
In-Reply-To: <20260619103901.2274740-1-hemant.agrawal@nxp.com>

On Fri, 19 Jun 2026 16:08:43 +0530
Hemant Agrawal <hemant.agrawal@nxp.com> wrote:

> This series contains bug fixes for the DPAA PMD (bus/dpaa, net/dpaa,
> net/dpaa/fmlib and dma/dpaa).
> 
> v3 changes (AI code review feedback):
> - P05: Clarify commit message: p_dev == NULL is equivalent to h_scheme == NULL
>   since p_dev = (t_device *)h_scheme; consistent with all sibling functions
> - P16: Add comment explaining the intentional loop continuation; clarify
>   commit message about the loop design
> - P17: Add DPAA_DP_LOG(WARNING) before silent return on l3_len == 0 to
>   aid debugging of corrupt/uninitialized mbufs
> 
> v2 changes:
> - P05: Fix commit message API name
> - P08: Guard DPAA_PUSH_QUEUES_NUMBER env-var for LS1043A (errata)
> - P09: Document dpaa_finish() removal
> - P10: Fix wrong Fixes: tag
> - P11: Split into two patches with correct Fixes: tags
> - P13: Also fix rx_buf_diallocate -> rx_buf_deallocate
> 
> All patches are bug fixes tagged with Fixes: and Cc: stable@dpdk.org.
> 
> Gagandeep Singh (3):
>   bus/dpaa: fix device probe issue
>   net/dpaa: fix device remove
>   net/dpaa: fix invalid check on interrupt unregister
> 
> Hemant Agrawal (11):
>   bus/dpaa: fix error handling of qman_create_fq
>   bus/dpaa: fix fqid endianness
>   bus/dpaa: fix error handling in qman_query
>   net/dpaa: fix modify cgr to use index
>   bus/dpaa: fix fd leak for ccsr mmap
>   net/dpaa: fix xstat name for tx undersized counter
>   net/dpaa: fix xstat string typos in BMI stats table
>   net/dpaa: remove duplicate ptype entries
>   net/dpaa: fix wrong buffer in xstats get by id
>   net/dpaa: fix null l3_len check in checksum offload
>   net/dpaa: fix mbuf leak in SG fd creation
> 
> Jun Yang (1):
>   bus/dpaa: fix BMI RX stats register offset
> 
> Prashant Gupta (1):
>   net/dpaa/fmlib: add null check in scheme delete
> 
> Vanshika Shukla (2):
>   net/dpaa: fix port_handle leak in fm_prev_cleanup
>   dma/dpaa: fix out-of-bounds access in SG descriptor enqueue
> 
>  drivers/bus/dpaa/base/qbman/bman_driver.c |  3 ++-
>  drivers/bus/dpaa/base/qbman/qman.c        | 11 ++++++---
>  drivers/bus/dpaa/base/qbman/qman_driver.c |  6 ++---
>  drivers/bus/dpaa/dpaa_bus.c               | 17 ++++++-------
>  drivers/bus/dpaa/include/fman.h           |  6 ++---
>  drivers/dma/dpaa/dpaa_qdma.c              |  7 +++++-
>  drivers/net/dpaa/dpaa_ethdev.c            | 30 +++++++++++------------
>  drivers/net/dpaa/dpaa_flow.c              |  4 +++
>  drivers/net/dpaa/dpaa_rxtx.c              |  5 ++++
>  drivers/net/dpaa/fmlib/fm_lib.c           |  3 +++
>  10 files changed, 56 insertions(+), 36 deletions(-)
> 

Applied to next-net with some minor changes to commit message to fix capitalization complaints from check-git-log

^ permalink raw reply

* Re: [PATCH v2 0/4] net/bond: fixes and cleanup
From: Stephen Hemminger @ 2026-06-19 17:22 UTC (permalink / raw)
  To: dev
In-Reply-To: <20260529000157.235931-1-stephen@networkplumber.org>

On Thu, 28 May 2026 16:59:12 -0700
Stephen Hemminger <stephen@networkplumber.org> wrote:

> Automated analysis of the bonding found a few minor things.
> The bug fix is in patch 3 for secondary process crash does rx/tx. 
> 
> The cleanups are in handling of 8023ad mode setting
> and the logging macros.
> 
> v2 - feedback about the mode setting and log messages
> 
> Stephen Hemminger (4):
>   net/bonding: make 8023ad enable function void
>   net/bonding: check mode before setting dedicated queues
>   net/bonding: prevent crash on Rx/Tx from secondary process
>   net/bonding: remove redundant function names from log
> 
>  drivers/net/bonding/eth_bond_8023ad_private.h | 17 +----
>  drivers/net/bonding/rte_eth_bond_8023ad.c     | 16 ++--
>  drivers/net/bonding/rte_eth_bond_api.c        |  4 +-
>  drivers/net/bonding/rte_eth_bond_pmd.c        | 73 ++++++++++++++-----
>  4 files changed, 67 insertions(+), 43 deletions(-)
> 

Applied to next-net; took Bruce's suggestion to split the first patch.

^ permalink raw reply

* Re: [PATCH 2/6] ip_frag: discard datagrams with overlapping fragments
From: Stephen Hemminger @ 2026-06-19 17:01 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, stable, Konstantin Ananyev
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35F6592A@smartserver.smartshare.dk>

On Fri, 19 Jun 2026 15:12:21 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:

> > +		/*
> > +		 * Overlap with an existing fragment. Per RFC 8200 section
> > 4.5
> > +		 * (and RFC 5722) the datagram must be discarded; the same
> > is
> > +		 * applied to IPv4. Free all collected fragments, drop this
> > one,
> > +		 * and invalidate the entry.
> > +		 */
> > +		if (ofs < fp->frags[i].ofs + fp->frags[i].len &&
> > +				fp->frags[i].ofs < ofs + len) {  
> 
> This only catches fragments that are smaller than existing fragments, i.e. fit within one of the existing fragments.
> It should be:
> if ((ofs >= fp->frags[i].ofs &&
> 		ofs < fp->frags[i].ofs + fp->frags[i].len) ||
> 		(ofs + len >= fp->frags[i].ofs &&
> 		ofs + len < fp->frags[i].ofs + fp->frags[i].len)) {
> 
> > +			ip_frag_free(fp, dr);

The code here is comparing an incoming fragment N against existing fragment E,
using half-open ranges [start, end).

The test in the patch is symmetric in N and E.
       ofs < e.ofs + e.len && e.ofs < ofs + len

The one you propose tests that either endpoint of N lands inside E.

Take a fixed stored fragment E = [200, 400) and run several incoming fragments through both.
Write N = [N0, N1) with N0 = ofs, N1 = ofs+len.

N inside E: N = [250, 300)

E:        |=========|        (200..400)
N:           |===|           (250..300)

Patch: 250 < 400 && 200 < 300 → T && T → overlap. 
Proposed: (250≥200 && 250<400) → T → overlap. 
Both agree.

N encloses E: N = [100, 500)

E:        |=========|        (200..400)
N:      |=============|      (100..500)

Patch: 100 < 400 && 200 < 500 → T && T → overlap.
Proposed: (100≥200 && …) → F, (500≥200 && 500<400) → T && F → F, so F || F → no overlap, MISSED.

This is the case the proposed version drops. Neither endpoint of N (100 or 500) sits inside [200, 400),
because N straddles E completely, so the proposed endpoint-in-E check fails even though the ranges clearly overlap.
The patch version catches it because the interval test doesn't care which range is larger.

N partial on the left: N = [100, 300)

E:        |=========|        (200..400)
N:      |======|             (100..300)

Patch: 100 < 400 && 200 < 300 → T → overlap.
Proposed: (300≥200 && 300<400) → T → overlap. 
Agree.

N partial on the right: N = [300, 500) — symmetric to the above, both catch it.

So on the four genuine-overlap geometries, the patch catches all four and your suggestion misses the enclosing one.
That miss matters, since the enclosing overlap is a legitimate attack shape (a big fragment overwriting a smaller stored one).

There is another issue.
The >= on the exclusive end produces a false positive on fragments that merely abut, which is the normal case.
Take E already stored as [1400, 2800) and an in-order-but-late fragment N = [0, 1400) arriving after it (ordinary out-of-order delivery):

N:      |======|             (0..1400)
E:             |======|      (1400..2800)

These share no bytes; byte 1400 belongs only to E. 
Patch: 0 < 2800 && 1400 < 1400 → T && F → no overlap, correct. 
Proposed: (1400≥1400 && 1400<2800) → T && T → overlap, wrong. 
This test would discard a perfectly valid datagram whenever a left-abutting fragment arrives after its neighbor.
Adjacent fragments abutting is what fragmentation produces by design, so this would fire constantly under reordering.

Bottom line: the patch was correct as far as I can tell.
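
For anyone who wants to check the cases above mechanically, here is a
small standalone sketch of the two predicates. The function names and
the uint32_t parameter types are my own choices for illustration and are
not taken from the ip_frag sources; ranges are half-open [ofs, ofs+len).

```c
#include <stdbool.h>
#include <stdint.h>

/* Symmetric interval test, as written in the patch. */
static bool
overlap_patch(uint32_t ofs, uint32_t len, uint32_t e_ofs, uint32_t e_len)
{
	return ofs < e_ofs + e_len && e_ofs < ofs + len;
}

/* Endpoint-of-N-inside-E test, as proposed in the review comment. */
static bool
overlap_proposed(uint32_t ofs, uint32_t len, uint32_t e_ofs, uint32_t e_len)
{
	return (ofs >= e_ofs && ofs < e_ofs + e_len) ||
	       (ofs + len >= e_ofs && ofs + len < e_ofs + e_len);
}
```

With E = [200, 400), calling both with N = [100, 500) (ofs 100, len 400)
shows the enclosing case: overlap_patch() returns true and
overlap_proposed() returns false. With E = [1400, 2800) and
N = [0, 1400), overlap_patch() returns false and overlap_proposed()
returns true, which is the abutting false positive.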

^ permalink raw reply

