DPDK-dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v2 1/3] dma/ae4dma: introduce AMD AE4DMA DMA PMD
From: David Marchand @ 2026-06-22 12:26 UTC (permalink / raw)
  To: Raghavendra Ningoji
  Cc: dev, Thomas Monjalon, Bhagyada Modali, Robin Jarry,
	Selwin.Sebastian
In-Reply-To: <20260525184244.1758825-2-raghavendra.ningoji@amd.com>

On Mon, 25 May 2026 at 20:43, Raghavendra Ningoji
<raghavendra.ningoji@amd.com> wrote:
> diff --git a/.mailmap b/.mailmap
> index 89ba6ffccc..60180818f9 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -203,6 +203,7 @@ Benoît Ganne <bganne@cisco.com>
>  Bernard Iremonger <bernard.iremonger@intel.com>
>  Bert van Leeuwen <bert.vanleeuwen@netronome.com>
>  Bhagyada Modali <bhagyada.modali@amd.com>
> +Raghavendra Ningoji <raghavendra.ningoji@amd.com>
>  Bharat Mota <bharat.mota@broadcom.com> <bmota@vmware.com>
>  Bhuvan Mital <bhuvan.mital@amd.com>
>  Bibo Mao <maobibo@loongson.cn>

Almost missed this.
Alphabetical order please.


-- 
David Marchand


^ permalink raw reply

* Re: [PATCH v2 0/3] dma/ae4dma: add AMD AE4DMA DMA PMD
From: David Marchand @ 2026-06-22 12:25 UTC (permalink / raw)
  To: Raghavendra Ningoji
  Cc: dev, Thomas Monjalon, Bhagyada Modali, Robin Jarry,
	Selwin.Sebastian, Chengwen Feng, Bruce Richardson
In-Reply-To: <20260525184244.1758825-1-raghavendra.ningoji@amd.com>

Hello,

On Mon, 25 May 2026 at 20:43, Raghavendra Ningoji
<raghavendra.ningoji@amd.com> wrote:
>
> This series adds a new dmadev poll-mode driver for the AMD AE4DMA
> hardware DMA engine. An AE4DMA engine exposes 16 hardware command
> queues, each with a 32-entry descriptor ring; the PMD maps each
> hardware channel to its own dmadev with a single virtual channel,
> so a PCI function appears as 16 dmadevs named "<pci-bdf>-ch0" ..
> "<pci-bdf>-ch15".
>
> Driver characteristics:
>
>  - Memory-to-memory copy operations only (RTE_DMA_CAPA_MEM_TO_MEM).
>  - Completion is detected via the hardware's per-queue read_idx
>    register, which the engine advances as it processes descriptors.
>    The descriptor status / err_code bytes are read only to classify
>    each drained slot as success or failure.
>  - vchan_status reports IDLE/ACTIVE based on HW read_idx vs write_idx
>    and HALTED_ERROR when the queue is not enabled.
>  - depends on bus_pci and dmadev.
>
> The v1 was submitted as a single patch.  Per review feedback the
> driver is now introduced in three logical patches, following the
> pattern of the recent hisi_acc dmadev driver:
>
>   1/3 - introduce driver (probe, remove, per-queue HW init)
>   2/3 - add control path operations (dev_ops)
>   3/3 - add data path operations (copy, submit, completion)
> ---
> Changes in v2:
>  - Split the monolithic v1 patch into three logical patches
>    (introduce / control path / data path), mirroring the
>    structure used by drivers/dma/hisi_acc.
>  - Fix checkpatches.sh warnings in drivers/dma/ae4dma/ae4dma_internal.h:
>      * Use RTE_LOG_LINE_PREFIX (with RTE_LOGTYPE_AE4DMA_PMD) instead
>        of the deprecated rte_log() call form.
>      * Replace the GCC variadic argument-pack extension ("args...")
>        with C99 __VA_ARGS__ in the AE4DMA_PMD_{LOG,DEBUG,INFO,ERR,
>        WARN} macros.
>  - Move __rte_cache_aligned to the "struct" keyword position on
>    struct ae4dma_cmd_queue, as required by checkpatches.sh.
>
> v1:https://patches.dpdk.org/project/dpdk/patch/20260518181856.1228373-1-raghavendra.ningoji@amd.com/
>
> Raghavendra Ningoji (3):
>   dma/ae4dma: introduce AMD AE4DMA DMA PMD
>   dma/ae4dma: add control path operations
>   dma/ae4dma: add data path operations
>
>  .mailmap                               |   1 +
>  MAINTAINERS                            |   5 +
>  doc/guides/dmadevs/ae4dma.rst          |  75 +++
>  doc/guides/dmadevs/index.rst           |   1 +
>  doc/guides/rel_notes/release_26_07.rst |   7 +
>  drivers/dma/ae4dma/ae4dma_dmadev.c     | 738 +++++++++++++++++++++++++
>  drivers/dma/ae4dma/ae4dma_hw_defs.h    | 160 ++++++
>  drivers/dma/ae4dma/ae4dma_internal.h   | 118 ++++
>  drivers/dma/ae4dma/meson.build         |   7 +
>  drivers/dma/meson.build                |   1 +
>  usertools/dpdk-devbind.py              |   5 +-
>  11 files changed, 1117 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/dmadevs/ae4dma.rst
>  create mode 100644 drivers/dma/ae4dma/ae4dma_dmadev.c
>  create mode 100644 drivers/dma/ae4dma/ae4dma_hw_defs.h
>  create mode 100644 drivers/dma/ae4dma/ae4dma_internal.h
>  create mode 100644 drivers/dma/ae4dma/meson.build
>
>
> base-commit: f724d1c0d1c1636b9c171c34db3f17c3defaa2f3

I did a pass on this series and sent comments, nothing blocking but please fix.


-- 
David Marchand


^ permalink raw reply

* Re: [PATCH v2 1/3] dma/ae4dma: introduce AMD AE4DMA DMA PMD
From: Bruce Richardson @ 2026-06-22 12:16 UTC (permalink / raw)
  To: David Marchand
  Cc: Raghavendra Ningoji, dev, Thomas Monjalon, Bhagyada Modali,
	Robin Jarry, Selwin.Sebastian, Chengwen Feng
In-Reply-To: <CAJFAV8w_67sp9iGW9+Gpwxx0ZkDYc4Zc2JKDtsPFFccU0UHePg@mail.gmail.com>

On Mon, Jun 22, 2026 at 02:06:55PM +0200, David Marchand wrote:
> On Mon, 25 May 2026 at 20:43, Raghavendra Ningoji
> <raghavendra.ningoji@amd.com> wrote:
> >
> > Add the skeleton of a new dmadev poll-mode driver for the AMD AE4DMA
> > hardware DMA engine, providing only PCI probe/remove and per-queue
> > hardware initialisation. An AE4DMA engine exposes 16 hardware command
> > queues, each with a 32-entry descriptor ring; the PMD maps each
> > hardware channel to its own dmadev with a single virtual channel,
> > so a PCI function appears as 16 dmadevs named "<pci-bdf>-ch0" ..
> > "<pci-bdf>-ch15".
> 
> I am not familiar with DMA drivers, I am not sure it is something acceptable.
> @Chengwen for info.
> 
This is similar with what is done by idxd driver when used as a PCI device
bound to vfio. We make the number of channels to configure a devarg, and
each channel becomes its own dmadev instance, since each channel is
independent from a user viewpoint. Only difference is that we use "q"
rather than "ch" in the naming. See [1] for what idxd does.

/Bruce

[1] https://github.com/DPDK/dpdk/blob/main/drivers/dma/idxd/idxd_pci.c#L326

^ permalink raw reply

* Re: [PATCH v2 2/3] dma/ae4dma: add control path operations
From: David Marchand @ 2026-06-22 12:15 UTC (permalink / raw)
  To: Raghavendra Ningoji
  Cc: dev, Thomas Monjalon, Bhagyada Modali, Robin Jarry,
	Selwin.Sebastian
In-Reply-To: <20260525184244.1758825-3-raghavendra.ningoji@amd.com>

On Mon, 25 May 2026 at 20:43, Raghavendra Ningoji
<raghavendra.ningoji@amd.com> wrote:
>
> Implement the dmadev control path for the AMD AE4DMA PMD.
>
> This commit adds:
>  - dev_configure / vchan_setup: accept a single virtual channel per
>    dmadev and clamp the requested ring size to the hardware maximum
>    of 32 descriptors (rounded up to a power of two).
>  - dev_start / dev_stop / dev_close: program the per-queue control
>    register to enable/disable the hardware queue and release the
>    descriptor ring memzone on close.
>  - dev_info_get: advertise RTE_DMA_CAPA_MEM_TO_MEM and the fixed
>    ring depth.
>  - dev_dump: print the queue identifiers, ring layout and software
>    completion counters.
>  - stats_get / stats_reset: expose submitted / completed / errors
>    counters maintained by the driver.
>  - vchan_status: report IDLE / ACTIVE based on hardware read_idx vs
>    write_idx, and HALTED_ERROR when the queue is not enabled.
>
> The dmadev framework is wired through dev_ops in ae4dma_dmadev_create().
>
> Signed-off-by: Raghavendra Ningoji <raghavendra.ningoji@amd.com>
> ---
>  drivers/dma/ae4dma/ae4dma_dmadev.c | 223 +++++++++++++++++++++++++++++
>  1 file changed, 223 insertions(+)
>
> diff --git a/drivers/dma/ae4dma/ae4dma_dmadev.c b/drivers/dma/ae4dma/ae4dma_dmadev.c
> index 76de2cde45..dfda723c13 100644
> --- a/drivers/dma/ae4dma/ae4dma_dmadev.c
> +++ b/drivers/dma/ae4dma/ae4dma_dmadev.c
> @@ -53,6 +53,215 @@ ae4dma_queue_dma_zone_reserve(const char *queue_name,
>                         socket_id, RTE_MEMZONE_IOVA_CONTIG, queue_size);
>  }
>
> +/* Configure a device. */
> +static int
> +ae4dma_dev_configure(struct rte_dma_dev *dev __rte_unused,
> +               const struct rte_dma_conf *dev_conf,
> +               uint32_t conf_sz)
> +{
> +       if (sizeof(struct rte_dma_conf) != conf_sz)
> +               return -EINVAL;
> +
> +       if (dev_conf->nb_vchans != 1)
> +               return -EINVAL;
> +
> +       return 0;
> +}
> +
> +/* Setup a virtual channel for AE4DMA, only 1 vchan is supported per dmadev. */
> +static int
> +ae4dma_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused,
> +               const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz)
> +{
> +       struct ae4dma_dmadev *ae4dma = dev->fp_obj->dev_private;
> +       struct ae4dma_cmd_queue *cmd_q = &ae4dma->cmd_q;
> +       uint16_t max_desc = qconf->nb_desc;
> +
> +       if (sizeof(struct rte_dma_vchan_conf) != qconf_sz)
> +               return -EINVAL;
> +
> +       if (max_desc < 2)
> +               return -EINVAL;
> +
> +       if (!rte_is_power_of_2(max_desc))
> +               max_desc = rte_align32pow2(max_desc);
> +
> +       if (max_desc > AE4DMA_DESCRIPTORS_PER_CMDQ) {
> +               AE4DMA_PMD_DEBUG("DMA dev %u nb_desc clamped to %u",
> +                               dev->data->dev_id, AE4DMA_DESCRIPTORS_PER_CMDQ);
> +               max_desc = AE4DMA_DESCRIPTORS_PER_CMDQ;
> +       }
> +
> +       cmd_q->qcfg = *qconf;
> +       cmd_q->qcfg.nb_desc = max_desc;
> +
> +       /* Ensure all counters are reset, if reconfiguring/restarting device. */
> +       memset(&cmd_q->stats, 0, sizeof(cmd_q->stats));
> +       return 0;
> +}
> +
> +/* Start a configured device. */
> +static int
> +ae4dma_dev_start(struct rte_dma_dev *dev)
> +{
> +       struct ae4dma_dmadev *ae4dma = dev->fp_obj->dev_private;
> +       struct ae4dma_cmd_queue *cmd_q = &ae4dma->cmd_q;
> +       uint16_t nb = cmd_q->qcfg.nb_desc;
> +
> +       if (nb == 0)
> +               return -EBUSY;
> +
> +       /* Program ring depth expected by hardware. */
> +       AE4DMA_WRITE_REG(&cmd_q->hwq_regs->max_idx, nb);
> +       return 0;
> +}
> +
> +/* Stop a configured device. */
> +static int
> +ae4dma_dev_stop(struct rte_dma_dev *dev)
> +{
> +       struct ae4dma_dmadev *ae4dma = dev->fp_obj->dev_private;
> +       struct ae4dma_cmd_queue *cmd_q = &ae4dma->cmd_q;
> +
> +       if (cmd_q->hwq_regs != NULL)
> +               AE4DMA_WRITE_REG(&cmd_q->hwq_regs->control_reg.control_raw,
> +                               AE4DMA_CMD_QUEUE_DISABLE);
> +       return 0;
> +}
> +
> +/* Get device information of a device. */
> +static int
> +ae4dma_dev_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info,
> +               uint32_t size)
> +{
> +       if (size < sizeof(*info))
> +               return -EINVAL;
> +       info->dev_name = dev->device->name;

The dmadev library sets this field in rte_dma_info_get().
Please remove.


> +       info->dev_capa = RTE_DMA_CAPA_MEM_TO_MEM;
> +       info->max_vchans = 1;
> +       info->min_desc = 2;
> +       info->max_desc = AE4DMA_DESCRIPTORS_PER_CMDQ;
> +       info->nb_vchans = 1;
> +       return 0;
> +}
> +
> +/* Close a configured device. */
> +static int
> +ae4dma_dev_close(struct rte_dma_dev *dev)
> +{
> +       struct ae4dma_dmadev *ae4dma = dev->fp_obj->dev_private;
> +       struct ae4dma_cmd_queue *cmd_q = &ae4dma->cmd_q;
> +
> +       if (cmd_q->hwq_regs != NULL)
> +               AE4DMA_WRITE_REG(&cmd_q->hwq_regs->control_reg.control_raw,
> +                               AE4DMA_CMD_QUEUE_DISABLE);
> +
> +       if (cmd_q->memz_name[0] != '\0') {
> +               const struct rte_memzone *mz = rte_memzone_lookup(cmd_q->memz_name);

Rather than resolve again, can't you store the reference to the
memzone in the priv pointer at probe time?


> +
> +               if (mz != NULL)
> +                       rte_memzone_free(mz);

No need to test for NULL.


> +       }
> +       cmd_q->qbase_desc = NULL;
> +       cmd_q->qbase_addr = NULL;
> +       cmd_q->qbase_phys_addr = 0;
> +       return 0;
> +}

[snip]


-- 
David Marchand


^ permalink raw reply

* Re: [PATCH v3 01/20] net/cnxk: update mbuf next field for multi segment
From: Jerin Jacob @ 2026-06-22 12:13 UTC (permalink / raw)
  To: Rahul Bhansali
  Cc: dev, Nithin Dabilpuram, Kiran Kumar K, Sunil Kumar Kori,
	Satha Rao, Harman Kalra, jerinj
In-Reply-To: <20260615162446.578336-1-rbhansali@marvell.com>

On Mon, Jun 15, 2026 at 9:56 PM Rahul Bhansali <rbhansali@marvell.com> wrote:
>
> As per the requirement of rte_mbuf_raw_reset_bulk(), the mbuf's
> 'next' and 'nb_segs' fields are required to be reset.
> This reset these field for multi-segment mbufs on cn9k platform.
>
> Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>

Series applied to dpdk-next-net-mrvl/for-main. Thanks

^ permalink raw reply

* Re: [PATCH v2 1/3] dma/ae4dma: introduce AMD AE4DMA DMA PMD
From: David Marchand @ 2026-06-22 12:06 UTC (permalink / raw)
  To: Raghavendra Ningoji
  Cc: dev, Thomas Monjalon, Bhagyada Modali, Robin Jarry,
	Selwin.Sebastian, Chengwen Feng
In-Reply-To: <20260525184244.1758825-2-raghavendra.ningoji@amd.com>

On Mon, 25 May 2026 at 20:43, Raghavendra Ningoji
<raghavendra.ningoji@amd.com> wrote:
>
> Add the skeleton of a new dmadev poll-mode driver for the AMD AE4DMA
> hardware DMA engine, providing only PCI probe/remove and per-queue
> hardware initialisation. An AE4DMA engine exposes 16 hardware command
> queues, each with a 32-entry descriptor ring; the PMD maps each
> hardware channel to its own dmadev with a single virtual channel,
> so a PCI function appears as 16 dmadevs named "<pci-bdf>-ch0" ..
> "<pci-bdf>-ch15".

I am not familiar with DMA drivers, I am not sure it is something acceptable.
@Chengwen for info.


>
> This patch only registers the PCI driver, allocates the dmadev
> objects, reserves the per-queue descriptor rings and programs the
> hardware queue base addresses. Control and data path operations are
> added in subsequent patches.
>
> Signed-off-by: Raghavendra Ningoji <raghavendra.ningoji@amd.com>

Here is a superficial review.

Many places are fishy when it comes to integer/pointer casts: I only
raised a few comments on this topic.


[snip]

> diff --git a/drivers/dma/ae4dma/ae4dma_dmadev.c b/drivers/dma/ae4dma/ae4dma_dmadev.c
> new file mode 100644
> index 0000000000..76de2cde45
> --- /dev/null
> +++ b/drivers/dma/ae4dma/ae4dma_dmadev.c
> @@ -0,0 +1,227 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Advanced Micro Devices, Inc. All rights reserved.
> + */
> +
> +#include <errno.h>
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <string.h>
> +
> +#include <rte_bus_pci.h>
> +#include <bus_pci_driver.h>
> +#include <rte_dmadev_pmd.h>
> +#include <rte_malloc.h>
> +
> +#include "ae4dma_internal.h"
> +
> +/*
> + * One dmadev per AE4DMA hardware channel; each dmadev has exactly one
> + * virtual channel. The HW's per-queue register block must be densely
> + * packed right after the engine-common config register at BAR0+0; the
> + * build-time check below catches an accidental layout change.
> + */
> +static_assert(sizeof(struct ae4dma_hwq_regs) == 32,
> +               "ae4dma_hwq_regs stride changed; per-queue offset math will break");
> +
> +RTE_LOG_REGISTER_DEFAULT(ae4dma_pmd_logtype, INFO);
> +
> +#define AE4DMA_PMD_NAME dmadev_ae4dma
> +
> +static const struct rte_memzone *
> +ae4dma_queue_dma_zone_reserve(const char *queue_name,
> +               uint32_t queue_size, int socket_id)
> +{
> +       const struct rte_memzone *mz;
> +
> +       mz = rte_memzone_lookup(queue_name);
> +       if (mz != NULL) {
> +               if (((size_t)queue_size <= mz->len) &&
> +                               ((socket_id == SOCKET_ID_ANY) ||
> +                                (socket_id == mz->socket_id))) {
> +                       AE4DMA_PMD_INFO("reuse memzone already "
> +                                       "allocated for %s", queue_name);
> +                       return mz;
> +               }
> +               AE4DMA_PMD_ERR("Incompatible memzone already "
> +                               "allocated %s, size %u, socket %d. "
> +                               "Requested size %u, socket %u",
> +                               queue_name, (uint32_t)mz->len,
> +                               mz->socket_id, queue_size, socket_id);
> +               return NULL;
> +       }
> +       return rte_memzone_reserve_aligned(queue_name, queue_size,
> +                       socket_id, RTE_MEMZONE_IOVA_CONTIG, queue_size);
> +}
> +
> +static int
> +ae4dma_add_queue(struct ae4dma_dmadev *dev, uint8_t qn, const char *pci_name)
> +{
> +       uint32_t dma_addr_lo, dma_addr_hi;
> +       struct ae4dma_cmd_queue *cmd_q;
> +       const struct rte_memzone *q_mz;
> +
> +       dev->io_regs = dev->pci->mem_resource[AE4DMA_PCIE_BAR].addr;
> +
> +       cmd_q = &dev->cmd_q;
> +       cmd_q->id = qn;
> +       cmd_q->qidx = 0;
> +       cmd_q->qsize = AE4DMA_QUEUE_SIZE(AE4DMA_QUEUE_DESC_SIZE);
> +       cmd_q->hwq_regs = (volatile struct ae4dma_hwq_regs *)dev->io_regs + (qn + 1);
> +
> +       /*
> +        * Memzone name must be globally unique. Embed PCI BDF so multiple
> +        * PCI functions probed concurrently don't collide.
> +        */
> +       snprintf(cmd_q->memz_name, sizeof(cmd_q->memz_name),
> +                       "ae4dma_%s_q%u", pci_name, (unsigned int)qn);
> +
> +       q_mz = ae4dma_queue_dma_zone_reserve(cmd_q->memz_name,
> +                       cmd_q->qsize, rte_socket_id());
> +       if (q_mz == NULL) {
> +               AE4DMA_PMD_ERR("memzone reserve failed for %s", cmd_q->memz_name);
> +               return -ENOMEM;
> +       }

I see no tracking of q_mz, so I suspect this memzone is leaked on
device probing failure, and/or unplugging.


> +
> +       cmd_q->qbase_addr = (void *)q_mz->addr;
> +       cmd_q->qbase_desc = (struct ae4dma_desc *)q_mz->addr;
> +       cmd_q->qbase_phys_addr = q_mz->iova;
> +
> +       AE4DMA_WRITE_REG(&cmd_q->hwq_regs->max_idx, AE4DMA_DESCRIPTORS_PER_CMDQ);
> +       AE4DMA_WRITE_REG(&cmd_q->hwq_regs->control_reg.control_raw,
> +                       AE4DMA_CMD_QUEUE_ENABLE);
> +       AE4DMA_WRITE_REG(&cmd_q->hwq_regs->intr_status_reg.intr_status_raw,
> +                       AE4DMA_DISABLE_INTR);
> +       cmd_q->next_write = (uint16_t)AE4DMA_READ_REG(&cmd_q->hwq_regs->write_idx);
> +       cmd_q->next_read = (uint16_t)AE4DMA_READ_REG(&cmd_q->hwq_regs->read_idx);

Strange that you need to cast.


> +       cmd_q->ring_buff_count = 0;
> +
> +       dma_addr_lo = low32_value(cmd_q->qbase_phys_addr);
> +       AE4DMA_WRITE_REG(&cmd_q->hwq_regs->qbase_lo, dma_addr_lo);
> +       dma_addr_hi = high32_value(cmd_q->qbase_phys_addr);
> +       AE4DMA_WRITE_REG(&cmd_q->hwq_regs->qbase_hi, dma_addr_hi);
> +
> +       return 0;
> +}
> +
> +static void
> +ae4dma_channel_dev_name(char *out, size_t outlen, const char *pci_name,
> +               unsigned int ch)
> +{
> +       snprintf(out, outlen, "%s-ch%u", pci_name, ch);
> +}
> +
> +/* Create a dmadev(dpdk DMA device) */

This is a general comment for the patch: let's avoid Lapalissade /
trivial comments that adds nothing.
The function name is self explanatory.


> +static int
> +ae4dma_dmadev_create(const char *name, struct rte_pci_device *dev, uint8_t qn)
> +{
> +       struct rte_dma_dev *dmadev = NULL;
> +       struct ae4dma_dmadev *ae4dma = NULL;

Those variables do not need any explicit setting to NULL, since there
are set at their first use.


> +       char hwq_dev_name[RTE_DEV_NAME_MAX_LEN];
> +
> +       if (!name) {

Such check will only confuse AI tools or other static code analysers,
as those tools will assume the function *may* be called with a NULL
pointer.
This is a static helper called internally from a single location,
remove the check.


> +               AE4DMA_PMD_ERR("Invalid name of the device!");
> +               return -EINVAL;
> +       }
> +       memset(hwq_dev_name, 0, sizeof(hwq_dev_name));
> +       ae4dma_channel_dev_name(hwq_dev_name, sizeof(hwq_dev_name), name, qn);
> +
> +       dmadev = rte_dma_pmd_allocate(hwq_dev_name, dev->device.numa_node,
> +                       sizeof(struct ae4dma_dmadev));
> +       if (dmadev == NULL) {
> +               AE4DMA_PMD_ERR("Unable to allocate dma device");
> +               return -ENOMEM;
> +       }
> +       dmadev->device = &dev->device;
> +       dmadev->fp_obj->dev_private = dmadev->data->dev_private;
> +
> +       ae4dma = dmadev->data->dev_private;
> +       ae4dma->dmadev = dmadev;

Such a back reference looks odd to me (how could you end with only a
reference to the priv pointer, which is in general deduced from the
dmadev pointer?).
And, in the end, this field is never used in the series.

Please remove.


> +       ae4dma->pci = dev;

dev is already a rte_pci_device pointer, and you only need to pass it
to ae4dma_add_queue as an argument.
By doing this change, there is no user of this field in the series,
please remove.


One note on this topic, you have a reference to the rte_device in the
dmadev object.
On the principle, the pci device can be resolved via
RTE_BUS_DEVICE(dmadev->device, struct rte_pci_device), or
RTE_BUS_DEVICE(dmadev->device, *pci_dev).
See other drivers for examples.


> +
> +       if (ae4dma_add_queue(ae4dma, qn, name) != 0)
> +               goto init_error;
> +       return 0;
> +
> +init_error:
> +       AE4DMA_PMD_ERR("driver %s(): failed", __func__);

__func__ is already part of AE4DMA_PMD_LOG.


> +       rte_dma_pmd_release(hwq_dev_name);
> +       return -ENOMEM;
> +}
> +
> +/* Probe DMA device. */
> +static int
> +ae4dma_dmadev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
> +{
> +       char name[32];
> +       char chname[RTE_DEV_NAME_MAX_LEN];
> +       void *mmio_base;
> +       uint32_t q_per_eng;
> +       int ret = 0;
> +       uint8_t i;
> +
> +       rte_pci_device_name(&dev->addr, name, sizeof(name));
> +       AE4DMA_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
> +       dev->device.driver = &drv->driver;

Setting the driver pointer in the device object is not the driver
responsibility anymore with commit f282771a04ef ("bus: factorize
driver reference").
EAL will set this field on probe() success.


> +
> +       mmio_base = dev->mem_resource[AE4DMA_PCIE_BAR].addr;
> +       if (mmio_base == NULL) {
> +               AE4DMA_PMD_ERR("%s: BAR%d not mapped", name, AE4DMA_PCIE_BAR);
> +               return -ENODEV;
> +       }
> +
> +       /* Program the per-engine HW queue count once. */
> +       AE4DMA_WRITE_REG_OFFSET(mmio_base, AE4DMA_COMMON_CONFIG_OFFSET,
> +                       AE4DMA_MAX_HW_QUEUES);
> +       q_per_eng = AE4DMA_READ_REG_OFFSET(mmio_base, AE4DMA_COMMON_CONFIG_OFFSET);
> +       AE4DMA_PMD_INFO("%s: AE4DMA queues per engine = %u", name, q_per_eng);
> +
> +       for (i = 0; i < AE4DMA_MAX_HW_QUEUES; i++) {
> +               ret = ae4dma_dmadev_create(name, dev, i);
> +               if (ret != 0) {
> +                       AE4DMA_PMD_ERR("%s create dmadev %u failed!", name, i);
> +                       while (i > 0) {
> +                               i--;
> +                               ae4dma_channel_dev_name(chname, sizeof(chname), name, i);
> +                               rte_dma_pmd_release(chname);
> +                       }
> +                       break;
> +               }
> +       }
> +       return ret;
> +}
> +
> +/* Remove DMA device. */
> +static int
> +ae4dma_dmadev_remove(struct rte_pci_device *dev)
> +{
> +       char name[32];
> +       char chname[RTE_DEV_NAME_MAX_LEN];
> +       unsigned int i;
> +
> +       rte_pci_device_name(&dev->addr, name, sizeof(name));
> +
> +       AE4DMA_PMD_INFO("Closing %s on NUMA node %d",
> +                       name, dev->device.numa_node);
> +
> +       for (i = 0; i < AE4DMA_MAX_HW_QUEUES; i++) {
> +               ae4dma_channel_dev_name(chname, sizeof(chname), name, i);
> +               rte_dma_pmd_release(chname);
> +       }
> +       return 0;
> +}
> +
> +static const struct rte_pci_id pci_id_ae4dma_map[] = {
> +       { RTE_PCI_DEVICE(AMD_VENDOR_ID, AE4DMA_DEVICE_ID) },
> +       { .vendor_id = 0, /* sentinel */ },
> +};
> +
> +static struct rte_pci_driver ae4dma_pmd_drv = {
> +       .id_table = pci_id_ae4dma_map,
> +       .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
> +       .probe = ae4dma_dmadev_probe,
> +       .remove = ae4dma_dmadev_remove,
> +};
> +
> +RTE_PMD_REGISTER_PCI(AE4DMA_PMD_NAME, ae4dma_pmd_drv);
> +RTE_PMD_REGISTER_PCI_TABLE(AE4DMA_PMD_NAME, pci_id_ae4dma_map);
> +RTE_PMD_REGISTER_KMOD_DEP(AE4DMA_PMD_NAME, "* igb_uio | uio_pci_generic | vfio-pci");
> diff --git a/drivers/dma/ae4dma/ae4dma_hw_defs.h b/drivers/dma/ae4dma/ae4dma_hw_defs.h
> new file mode 100644
> index 0000000000..62b6a1b30b
> --- /dev/null
> +++ b/drivers/dma/ae4dma/ae4dma_hw_defs.h
> @@ -0,0 +1,160 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Advanced Micro Devices, Inc. All rights reserved.
> + */
> +
> +#ifndef __AE4DMA_HW_DEFS_H__
> +#define __AE4DMA_HW_DEFS_H__
> +

Is this header autosufficient ?
I see references to uint32_t below, so this header probably depends on stdint.h.


> +#include <rte_bus_pci.h>
> +#include <rte_byteorder.h>
> +#include <rte_io.h>
> +#include <rte_pci.h>
> +#include <rte_memzone.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif

Do we really need C++ guards?

> +
> +#define AE4DMA_BIT(nr)                 (1UL << (nr))
> +
> +/* ae4dma device details */
> +#define AMD_VENDOR_ID  0x1022
> +#define AE4DMA_DEVICE_ID       0x149b
> +#define AE4DMA_PCIE_BAR 0
> +
> +/*
> + * An AE4DMA engine has 16 DMA queues. Each queue supports 32 descriptors.
> + */
> +#define AE4DMA_MAX_HW_QUEUES        16
> +#define AE4DMA_QUEUE_START_INDEX    0
> +#define AE4DMA_CMD_QUEUE_ENABLE                0x1
> +#define AE4DMA_CMD_QUEUE_DISABLE       0x0
> +
> +/* Common to all queues */
> +#define AE4DMA_COMMON_CONFIG_OFFSET 0x00
> +
> +#define AE4DMA_DISABLE_INTR 0x01
> +
> +/* Descriptor status */
> +enum ae4dma_dma_status {
> +       AE4DMA_DMA_DESC_SUBMITTED = 0,
> +       AE4DMA_DMA_DESC_VALIDATED = 1,
> +       AE4DMA_DMA_DESC_PROCESSED = 2,
> +       AE4DMA_DMA_DESC_COMPLETED = 3,
> +       AE4DMA_DMA_DESC_ERROR = 4,
> +};
> +
> +/* Descriptor error-code */
> +enum ae4dma_dma_err {
> +       AE4DMA_DMA_ERR_NO_ERR = 0,
> +       AE4DMA_DMA_ERR_INV_HEADER = 1,
> +       AE4DMA_DMA_ERR_INV_STATUS = 2,
> +       AE4DMA_DMA_ERR_INV_LEN = 3,
> +       AE4DMA_DMA_ERR_INV_SRC = 4,
> +       AE4DMA_DMA_ERR_INV_DST = 5,
> +       AE4DMA_DMA_ERR_INV_ALIGN = 6,
> +       AE4DMA_DMA_ERR_UNKNOWN = 7,
> +};
> +
> +/* HW Queue status */
> +enum ae4dma_hwqueue_status {
> +       AE4DMA_HWQUEUE_EMPTY = 0,
> +       AE4DMA_HWQUEUE_FULL = 1,
> +       AE4DMA_HWQUEUE_NOT_EMPTY = 4

For consistency with other enums, add a comma.


> +};
> +/*
> + * descriptor for AE4DMA commands
> + * 8 32-bit words:
> + * word 0: source memory type; destination memory type ; control bits
> + * word 1: desc_id; error code; status
> + * word 2: length
> + * word 3: reserved
> + * word 4: upper 32 bits of source pointer
> + * word 5: low 32 bits of source pointer
> + * word 6: upper 32 bits of destination pointer
> + * word 7: low 32 bits of destination pointer
> + */
> +
> +/* AE4DMA Descriptor - DWORD0 - Controls bits: Reserved for future use */
> +#define AE4DMA_DWORD0_STOP_ON_COMPLETION       AE4DMA_BIT(0)
> +#define AE4DMA_DWORD0_INTERRUPT_ON_COMPLETION  AE4DMA_BIT(1)
> +#define AE4DMA_DWORD0_START_OF_MESSAGE         AE4DMA_BIT(3)
> +#define AE4DMA_DWORD0_END_OF_MESSAGE           AE4DMA_BIT(4)
> +#define AE4DMA_DWORD0_DESTINATION_MEMORY_TYPE  RTE_GENMASK64(5, 4)
> +#define AE4DMA_DWORD0_SOURCE_MEMEORY_TYPE      RTE_GENMASK64(7, 6)
> +
> +#define AE4DMA_DWORD0_DESTINATION_MEMORY_TYPE_MEMORY    (0x0)
> +#define AE4DMA_DWORD0_DESTINATION_MEMORY_TYPE_IOMEMORY  (1<<4)
> +#define AE4DMA_DWORD0_SOURCE_MEMEORY_TYPE_MEMORY    (0x0)
> +#define AE4DMA_DWORD0_SOURCE_MEMEORY_TYPE_IOMEMORY  (1<<6)
> +
> +struct ae4dma_desc_dword0 {
> +       uint8_t byte0;
> +       uint8_t byte1;
> +       uint16_t timestamp;
> +};
> +
> +struct ae4dma_desc_dword1 {
> +       uint8_t status;
> +       uint8_t err_code;
> +       uint16_t desc_id;
> +};
> +
> +struct ae4dma_desc {
> +       struct ae4dma_desc_dword0 dw0;
> +       struct ae4dma_desc_dword1 dw1;
> +       uint32_t length;
> +       uint32_t reserved;
> +       uint32_t src_lo;
> +       uint32_t src_hi;
> +       uint32_t dst_lo;
> +       uint32_t dst_hi;
> +};
> +
> +/*
> + * Registers for each queue :4 bytes length
> + * Effective address : offset + reg
> + */
> +struct ae4dma_hwq_regs {
> +       union {
> +               uint32_t control_raw;
> +               struct {
> +                       uint32_t queue_enable: 1;
> +                       uint32_t reserved_internal: 31;
> +               } control;
> +       } control_reg;
> +
> +       union {
> +               uint32_t status_raw;
> +               struct {
> +                       uint32_t reserved0: 1;
> +                       /* 0–empty, 1–full, 2–stopped, 3–error , 4–Not Empty */
> +                       uint32_t queue_status: 2;
> +                       uint32_t reserved1: 21;
> +                       uint32_t interrupt_type: 4;
> +                       uint32_t reserved2: 4;
> +               } status;
> +       } status_reg;
> +
> +       uint32_t max_idx;
> +       uint32_t read_idx;
> +       uint32_t write_idx;
> +
> +       union {
> +               uint32_t intr_status_raw;
> +               struct {
> +                       uint32_t intr_status: 1;
> +                       uint32_t reserved: 31;
> +               } intr_status;
> +       } intr_status_reg;
> +
> +       uint32_t qbase_lo;
> +       uint32_t qbase_hi;
> +
> +};
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* AE4DMA_HW_DEFS_H */
> diff --git a/drivers/dma/ae4dma/ae4dma_internal.h b/drivers/dma/ae4dma/ae4dma_internal.h
> new file mode 100644
> index 0000000000..9892d6697f
> --- /dev/null
> +++ b/drivers/dma/ae4dma/ae4dma_internal.h
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Advanced Micro Devices, Inc. All rights reserved.
> + */
> +
> +#ifndef _AE4DMA_INTERNAL_H_
> +#define _AE4DMA_INTERNAL_H_
> +
> +#include <stdint.h>
> +
> +#include "ae4dma_hw_defs.h"
> +
> +/**

This is an internal header, we don't need doxygen style comments,
simple comments are enough.


> + * upper_32_bits - return bits 32-63 of a number
> + * @n: the number we're accessing
> + */
> +#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))
> +
> +/**
> + * lower_32_bits - return bits 0-31 of a number
> + * @n: the number we're accessing
> + */
> +#define lower_32_bits(n) ((uint32_t)((n) & 0xffffffff))
> +
> +/** Hardware ring depth (slots per queue); must be power of two. */
> +#define AE4DMA_DESCRIPTORS_PER_CMDQ    32
> +#define AE4DMA_QUEUE_DESC_SIZE         sizeof(struct ae4dma_desc)
> +#define AE4DMA_QUEUE_SIZE(n)           (AE4DMA_DESCRIPTORS_PER_CMDQ * (n))
> +
> +
> +/** AE4DMA registers Write/Read */
> +static inline void ae4dma_pci_reg_write(void *base, int offset,
> +               uint32_t value)
> +{
> +       volatile void *reg_addr = ((uint8_t *)base + offset);
> +
> +       rte_write32((rte_cpu_to_le_32(value)), reg_addr);
> +}
> +
> +static inline uint32_t ae4dma_pci_reg_read(void *base, int offset)
> +{
> +       volatile void *reg_addr = ((uint8_t *)base + offset);
> +
> +       return rte_le_to_cpu_32(rte_read32(reg_addr));
> +}
> +
> +#define AE4DMA_READ_REG_OFFSET(hw_addr, reg_offset) \
> +       ae4dma_pci_reg_read(hw_addr, reg_offset)
> +
> +#define AE4DMA_WRITE_REG_OFFSET(hw_addr, reg_offset, value) \
> +       ae4dma_pci_reg_write(hw_addr, reg_offset, value)
> +
> +
> +#define AE4DMA_READ_REG(hw_addr) \
> +       ae4dma_pci_reg_read((void *)(uintptr_t)(hw_addr), 0)
> +
> +#define AE4DMA_WRITE_REG(hw_addr, value) \
> +       ae4dma_pci_reg_write((void *)(uintptr_t)(hw_addr), 0, value)
> +
> +static inline uint32_t
> +low32_value(unsigned long addr)
> +{
> +       return ((uint64_t)addr) & 0xffffffffUL;
> +}
> +
> +static inline uint32_t
> +high32_value(unsigned long addr)
> +{
> +       return (uint32_t)(((uint64_t)addr) >> 32);
> +}
> +
> +/**
> + * A structure describing a AE4DMA command queue.
> + */
> +struct __rte_cache_aligned ae4dma_cmd_queue {
> +       char memz_name[RTE_MEMZONE_NAMESIZE];
> +       volatile struct ae4dma_hwq_regs *hwq_regs;
> +
> +       struct rte_dma_vchan_conf qcfg;
> +       struct rte_dma_stats stats;
> +       /* Queue address */
> +       struct ae4dma_desc *qbase_desc;
> +       void *qbase_addr;
> +       rte_iova_t qbase_phys_addr;
> +       enum ae4dma_dma_err status[AE4DMA_DESCRIPTORS_PER_CMDQ];
> +       /* Queue identifier */
> +       uint64_t id;    /**< queue id */
> +       uint64_t qidx;  /**< queue index */
> +       uint64_t qsize; /**< queue size */
> +       uint32_t ring_buff_count;
> +       unsigned short next_read;
> +       unsigned short next_write;
> +       unsigned short last_write; /* Used to compute submitted count. */
> +};
> +
> +/*
> + * One dmadev per AE4DMA hardware channel: probe creates AE4DMA_MAX_HW_QUEUES
> + * dmadevs per PCI function, each owning a single HW command queue.
> + */
> +struct ae4dma_dmadev {
> +       struct rte_dma_dev *dmadev;
> +       void *io_regs;
> +       struct ae4dma_cmd_queue cmd_q; /**< single HW queue owned by this dmadev */
> +       struct rte_pci_device *pci;    /**< owning PCI device (not owned) */
> +};
> +
> +
> +extern int ae4dma_pmd_logtype;
> +#define RTE_LOGTYPE_AE4DMA_PMD ae4dma_pmd_logtype
> +
> +#define AE4DMA_PMD_LOG(level, ...) \
> +       RTE_LOG_LINE_PREFIX(level, AE4DMA_PMD, "%s(): ", __func__, __VA_ARGS__)
> +
> +#define AE4DMA_PMD_DEBUG(...)  AE4DMA_PMD_LOG(DEBUG, __VA_ARGS__)
> +#define AE4DMA_PMD_INFO(...)   AE4DMA_PMD_LOG(INFO, __VA_ARGS__)
> +#define AE4DMA_PMD_ERR(...)    AE4DMA_PMD_LOG(ERR, __VA_ARGS__)
> +#define AE4DMA_PMD_WARN(...)   AE4DMA_PMD_LOG(WARNING, __VA_ARGS__)
> +
> +#endif /* _AE4DMA_INTERNAL_H_ */


-- 
David Marchand


^ permalink raw reply

* RE: [PATCH 00/10] NXP ENETC driver related changes
From: Gagandeep Singh @ 2026-06-22 11:36 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev@dpdk.org, Hemant Agrawal
In-Reply-To: <20260619144336.1f9c46f2@phoenix.local>

Hi

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Saturday, June 20, 2026 3:14 AM
> To: Gagandeep Singh <G.Singh@nxp.com>
> Cc: dev@dpdk.org; Hemant Agrawal <hemant.agrawal@nxp.com>
> Subject: Re: [PATCH 00/10] NXP ENETC driver related changes
> 
> On Sat, 20 Jun 2026 00:14:17 +0530
> Gagandeep Singh <g.singh@nxp.com> wrote:
> 
> > ENETC driver related changes series
> >
> > Gagandeep Singh (8):
> >   net/enetc: fix TX BD structure
> >   net/enetc: fix TX BDs flag overwrite issue
> >   net/enetc: fix queue initialization
> >   net/enetc: support ESP packet type in packet parsing
> >   net/enetc: update random MAC generation code
> >   net/enetc: add option to disable VSI messaging
> >   net/enetc: add devargs to control VSI-PSI timeout and delay
> >   net/enetc4: add cacheable BD ring support with SW cache maintenance
> >
> > Vanshika Shukla (2):
> >   net/enetc: support scatter-gather
> >   net/enetc: set user configurable priority to TX rings
> >
> >  drivers/net/enetc/base/enetc_hw.h |  13 +-
> >  drivers/net/enetc/enetc.h         |  28 +-
> >  drivers/net/enetc/enetc4_ethdev.c | 123 +++++++--
> >  drivers/net/enetc/enetc4_vf.c     | 159 ++++++++++--
> >  drivers/net/enetc/enetc_ethdev.c  |  26 +-
> >  drivers/net/enetc/enetc_rxtx.c    | 411 ++++++++++++++++++++++++++----
> >  6 files changed, 649 insertions(+), 111 deletions(-)
> >
> 
> The AI review shows some thing that need to be addressed before merging.
> 
> [PATCH 04/10] net/enetc: support ESP packet type
> 
> Info: enetc_supported_ptypes_get() adds RTE_PTYPE_TUNNEL_ESP and a trailing
> RTE_PTYPE_UNKNOWN. *no_of_elements is RTE_DIM(ptypes), so the
> 0 entry is counted (not used as a sentinel). It is filtered out by the mask test in
> rte_eth_dev_get_supported_ptypes(), so harmless, but the
> RTE_PTYPE_UNKNOWN line is unnecessary and should be dropped.
> 
> 
> [PATCH 06/10] net/enetc: support scatter-gather
> 
> Warning: scatter Rx reassembly state (first_seg/cur_seg) is held in local variables
> and reset on every call. rx_frm_cnt only advances on the F bit, so work_limit
> won't cut a frame, but the "!(bd_status & LSTATUS_R)" break can exit mid-frame
> if HW has written the leading segments of a multi-segment frame but not yet the
> segment carrying F. On the next call first_seg is NULL again, next_to_clean has
> already advanced past the consumed leading segments, and those mbufs are
> leaked while the tail segments are mis-assembled as a new frame.
> Persist the partial chain across bursts in the ring (e.g.
> rx_ring->pkt_first_seg / pkt_last_seg) instead of locals. (Same pattern is
> reproduced in enetc_clean_rx_ring_cacheable in patch 10.)
> 
> Warning: enetc4 now advertises RTE_ETH_RX_OFFLOAD_SCATTER and
> RTE_ETH_TX_OFFLOAD_MULTI_SEGS (VF) but doc/guides/nics/features/
> enetc4.ini is not updated (Scattered Rx / Multi segment rows).
> 
> Info: the VF dev_info now advertises L3/L4 RX checksum offload, but
> enetc_dev_rx_parse() unconditionally sets RTE_MBUF_F_RX_IP_CKSUM_GOOD |
> RTE_MBUF_F_RX_L4_CKSUM_GOOD and never reports *_BAD. With the offload
> now advertised, an application relying on it will never see a bad-checksum
> indication.
> 
> Info: dccivac(data + (data_len - 1)) / dcbf(data + (seg_len - 1)) underflow to data-1
> when the segment length is 0 (uint16_t promotes to int). The preceding loop
> already covers the final cache line, so the extra op is redundant as well as unsafe
> for len==0.
> 
> 
> [PATCH 07/10] net/enetc: add option to disable VSI messaging
> 
> Warning: new devarg "enetc4_vsi_disable" is registered but not documented in
> doc/guides/nics/enetc.rst.
> 
> 
> [PATCH 08/10] net/enetc: add devargs to control VSI-PSI timeout/delay
> 
> Warning: new devargs "enetc4_vsi_timeout" / "enetc4_vsi_delay" are not
> documented in doc/guides/nics/enetc.rst.
> 
> 
> [PATCH 09/10] net/enetc: set user configurable priority to TX rings
> 
> Error: hw->txq_prior is allocated in parse_txq_prior() with
> rte_zmalloc() but never freed. It leaks on dev_close / re-probe. Free it in the
> close/uninit path (and note it is re-allocated every time the handler runs, so a
> repeated key would leak the prior allocation too).
> 
> Warning: txq_prior is control-path, CPU-only data; rte_zmalloc() consumes
> hugepage memory unnecessarily. Use calloc()/malloc().
> 
> Warning: the parsed value is OR'd straight into TBMR:
> 	tx_en |= priv->hw.txq_prior[tx_ring->index];
> with no mask. ENETC_TBMR_EN is BIT(31) and there is no TBMR priority mask
> defined. A user value with high bits set can corrupt unrelated TBMR control bits.
> Mask the input to the valid TBMR priority field.
> 
> Info: strdup(value) return is not checked; on failure strtok(input_str, "|") is called
> with a NULL first argument, which resumes from strtok's stale internal state
> rather than erroring.
> 
> Warning: new devarg "enetc4_txq_prior" not documented in
> doc/guides/nics/enetc.rst.
> 
> 
> [PATCH 10/10] net/enetc4: add cacheable BD ring support with SW cache
> 
> Warning: enetc4_dev_hw_init() switches rx_pkt_burst/tx_pkt_burst to the cache-
> maintenance variants unconditionally for every enetc4 device (PF and VF). The
> commit message scopes this to non-cache-coherent parts (i.MX95), but the code
> applies it everywhere, adding dcbf/dccivac cost on cache-coherent platforms that
> previously used the _nc fast path. Gate it on a devarg or coherency/platform
> check.
> 
> Warning: the RX payload invalidation uses dccivac (dc civac =
> clean+invalidate). The comment justifies clean-then-invalidate for the
> BD ring (refill dcbf leaves BD lines clean), but payload buffers are not cleaned
> before being handed to HW. If a payload cache line is dirty (stale CPU data from a
> prior use of the mbuf), the clean phase writes it back over the HW-DMA'd data in
> DDR before invalidating -> silent RX corruption on a non-coherent part. Please
> confirm payload lines can never be dirty here, or use invalidate-only.
> 
> Info: struct enetc_bdr gains "uint64_t bd_base_p" but it is never referenced
> anywhere. Remove the dead field.
> 
> Info: the 64-bit BD fast copy
> 	__uint128_t *dst128 = (__uint128_t *)&rxbd_temp;
> 	*dst128 = *(const __uint128_t *)rxbd;
> takes the address of an 8-byte-aligned stack union (rxbd_temp) as __uint128_t*.
> That is an under-aligned 128-bit access (UB); aarch64 tolerates it via ldp/stp but
> it is fragile. Force 16-byte alignment on rxbd_temp or copy as two u64.
> 
> 
> General (series-wide)
> 
> Warning: no release notes. The series adds user-visible features (scatter-gather,
> cacheable BD ring support, four new devargs) with no entry in
> doc/guides/rel_notes/. New driver capabilities and devargs need release-note
> coverage.

I have sent the V2 series which includes most of the fixes. I have mentioned all the fixes in the cover-letter.

Thanks,
Gagan

^ permalink raw reply

* [PATCH v2 9/9] net/enetc4: add cacheable BD ring support with SW cache maintenance
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Gagandeep Singh
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

On non-cache-coherent platforms such as i.MX95, the BD ring memory
may be mapped as cacheable (normal memory) while the ENETC hardware
DMA engine writes and reads descriptors without CPU cache snooping.
SW must therefore perform explicit cache maintenance to keep CPU
caches and DDR coherent.

TX path (enetc_xmit_pkts_cacheable):
  - Flush each segment's payload cache lines to PoC (dcbf) before
    the BD is handed to HW, so HW DMA reads the correct data.
  - After all BDs for a burst are written, flush the BD cache lines
    (dcbf, one per 64-byte group of 4 BDs) so HW can read the
    updated descriptors.

RX refill (enetc_refill_rx_ring):
  - After writing each full 4-BD cache-line group, dcbf that group
    so HW sees the buffer addresses and cleared lstatus fields.
  - Flush any partial trailing group before updating the ring tail.

RX receive (enetc_recv_pkts_cacheable via enetc_clean_rx_ring_cacheable):
  - Before reading BD status, dccivac the current BD cache line so
    stale CPU-cached BD data is discarded and fresh HW-written
    content is fetched from DDR.
  - After a BD is consumed, dccivac each payload cache line so the
    CPU reads the DMA'd packet data, not stale cached bytes.

Add a new devarg 'nc=1' that allows selecting the non-cacheable
RX/TX ops.
When nc=1 is specified:
  eth_dev->rx_pkt_burst = &enetc_recv_pkts_nc;
  eth_dev->tx_pkt_burst = &enetc_xmit_pkts_nc;

The default (nc not set or nc=0) keeps the cacheable path with
SW cache maintenance (dccivac/dcbf):
  eth_dev->rx_pkt_burst = &enetc_recv_pkts_cacheable;
  eth_dev->tx_pkt_burst = &enetc_xmit_pkts_cacheable;

Usage:
  --allow 0000:00:01.0,nc=1

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
 doc/guides/rel_notes/release_26_07.rst |   1 +
 drivers/net/enetc/enetc.h              |  21 ++
 drivers/net/enetc/enetc4_ethdev.c      |  71 +++++--
 drivers/net/enetc/enetc4_vf.c          |  35 +++-
 drivers/net/enetc/enetc_rxtx.c         | 278 ++++++++++++++++++++++++-
 5 files changed, 388 insertions(+), 18 deletions(-)

diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 495eba0..b208229 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -196,6 +196,7 @@ New Features
   * Added devargs options ``enetc4_vsi_timeout`` and ``enetc4_vsi_delay``
     for VSI-PSI messaging timeout and delay.
   * Added devargs option ``enetc4_txq_prior`` to set TX queues priorities.
+  * Added devargs option ``nc`` to select non-cacheable RX/TX ops.
 
 Removed Items
 -------------
diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index c12597b..d3a8b8e 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -115,6 +115,7 @@ struct enetc_eth_hw {
 	uint32_t vsi_timeout; /* VSI-PSI message wait timeout (iterations) */
 	uint32_t vsi_delay;   /* VSI-PSI message wait delay (us) */
 	uint32_t *txq_prior;  /* per-queue TX priority (TBMR priority bits) */
+	uint8_t nc_mode;      /* 1 = non-cacheable BD memory, use _nc ops */
 };
 
 /*
@@ -315,8 +316,28 @@ uint16_t enetc_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
 uint16_t enetc_recv_pkts_nc(void *rxq, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
+uint16_t enetc_xmit_pkts_cacheable(void *txq, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts);
+uint16_t enetc_recv_pkts_cacheable(void *rxq, struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts);
 
 int enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt);
+
+/*
+ * Cache-maintenance constants for cacheable BD ring mode.
+ *
+ * BD = 16 bytes, cache line = 64 bytes => 4 BDs per cache line.
+ * Every dcbf in enetc_refill_rx_ring() flushes a full 64-byte cache line.
+ * To ensure each dcbf covers only fully-written BDs the caller
+ * must pass a count rounded DOWN to a multiple of ENETC_BD_PER_CL so that
+ * the last partial group is left in cache to be completed and flushed in
+ * the next call.
+ */
+#define ENETC_BD_PER_CL		(RTE_CACHE_LINE_SIZE / sizeof(union enetc_rx_bd))
+#define ENETC_BD_PER_CL_MASK	(ENETC_BD_PER_CL - 1)
+/* Round n DOWN to the nearest multiple of ENETC_BD_PER_CL. */
+#define ENETC_BD_ALIGN_DOWN(n)	((n) & ~(unsigned int)ENETC_BD_PER_CL_MASK)
+
 void enetc4_dev_hw_init(struct rte_eth_dev *eth_dev);
 void enetc_print_ethaddr(const char *name, const struct rte_ether_addr *eth_addr);
 
diff --git a/drivers/net/enetc/enetc4_ethdev.c b/drivers/net/enetc/enetc4_ethdev.c
index 7e2d665..2ddd63d 100644
--- a/drivers/net/enetc/enetc4_ethdev.c
+++ b/drivers/net/enetc/enetc4_ethdev.c
@@ -12,6 +12,7 @@
 #include "enetc.h"
 
 #define ENETC4_TXQ_PRIORITIES	"enetc4_txq_prior"
+#define ENETC4_NC_MEMORY	"nc"
 
 static int
 parse_txq_prior(const char *key __rte_unused, const char *value, void *opaque)
@@ -42,6 +43,19 @@ parse_txq_prior(const char *key __rte_unused, const char *value, void *opaque)
 	return 0;
 }
 
+static int
+parse_nc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct rte_eth_dev *dev = extra_args;
+	struct enetc_eth_hw *hw =
+		ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (value && atoi(value) == 1)
+		hw->nc_mode = 1;
+
+	return 0;
+}
+
 static int
 enetc4_get_devargs(struct rte_eth_dev *dev, const char *key)
 {
@@ -67,6 +81,13 @@ enetc4_get_devargs(struct rte_eth_dev *dev, const char *key)
 			return 0;
 		}
 	}
+	if (!strcmp(key, ENETC4_NC_MEMORY)) {
+		if (rte_kvargs_process(kvlist, key,
+				       parse_nc, (void *)dev) < 0) {
+			rte_kvargs_free(kvlist);
+			return 0;
+		}
+	}
 
 	rte_kvargs_free(kvlist);
 	return 0;
@@ -289,12 +310,14 @@ enetc4_alloc_txbdr(struct enetc_bdr *txr, uint16_t nb_desc)
 	int size;
 
 	size = nb_desc * sizeof(struct enetc_swbd);
-	txr->q_swbd = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Zero q_swbd so buffer_addr is NULL for all uninitialized slots. */
+	txr->q_swbd = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (txr->q_swbd == NULL)
 		return -ENOMEM;
 
-	size = nb_desc * sizeof(struct enetc_bdr);
-	txr->bd_base = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Allocate the TX BD ring: each BD is struct enetc_tx_bd (16 bytes). */
+	size = nb_desc * sizeof(struct enetc_tx_bd);
+	txr->bd_base = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (txr->bd_base == NULL) {
 		rte_free(txr->q_swbd);
 		txr->q_swbd = NULL;
@@ -449,12 +472,14 @@ enetc4_alloc_rxbdr(struct enetc_bdr *rxr, uint16_t nb_desc)
 	int size;
 
 	size = nb_desc * sizeof(struct enetc_swbd);
-	rxr->q_swbd = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Zero q_swbd so buffer_addr is NULL for all uninitialized slots. */
+	rxr->q_swbd = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (rxr->q_swbd == NULL)
 		return -ENOMEM;
 
-	size = nb_desc * sizeof(struct enetc_bdr);
-	rxr->bd_base = rte_malloc(NULL, size, ENETC_BD_RING_ALIGN);
+	/* Allocate the RX BD ring: each BD is union enetc_rx_bd (16 bytes). */
+	size = nb_desc * sizeof(union enetc_rx_bd);
+	rxr->bd_base = rte_zmalloc(NULL, size, ENETC_BD_RING_ALIGN);
 	if (rxr->bd_base == NULL) {
 		rte_free(rxr->q_swbd);
 		rxr->q_swbd = NULL;
@@ -489,7 +514,7 @@ enetc4_setup_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring,
 	rx_ring->mb_pool = mb_pool;
 	rx_ring->rcir = (void *)((size_t)hw->reg +
 			ENETC_BDR(RX, idx, ENETC_RBCIR));
-	enetc_refill_rx_ring(rx_ring, (enetc_bd_unused(rx_ring)));
+	enetc_refill_rx_ring(rx_ring, ENETC_BD_ALIGN_DOWN(enetc_bd_unused(rx_ring)));
 	buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rx_ring->mb_pool) -
 		   RTE_PKTMBUF_HEADROOM);
 	enetc4_rxbdr_wr(hw, idx, ENETC_RBBSR, buf_size);
@@ -751,12 +776,17 @@ enetc4_dev_configure(struct rte_eth_dev *dev)
 
 	PMD_INIT_FUNC_TRACE();
 
-	max_len = dev->data->dev_conf.rxmode.mtu + RTE_ETHER_HDR_LEN +
-		  RTE_ETHER_CRC_LEN;
-	enetc4_port_wr(enetc_hw, ENETC4_PM_MAXFRM(0), ENETC_SET_MAXFRM(max_len));
+	/* Port-level register writes are PF-only; skip for VF devices */
+	if (hw->device_id != ENETC4_DEV_ID_VF) {
+		max_len = dev->data->dev_conf.rxmode.mtu + RTE_ETHER_HDR_LEN +
+			  RTE_ETHER_CRC_LEN;
+		enetc4_port_wr(enetc_hw, ENETC4_PM_MAXFRM(0),
+			       ENETC_SET_MAXFRM(max_len));
 
-	val = ENETC4_MAC_MAXFRM_SIZE | SDU_TYPE_MPDU;
-	enetc4_port_wr(enetc_hw, ENETC4_PTCTMSDUR(0), val | SDU_TYPE_MPDU);
+		val = ENETC4_MAC_MAXFRM_SIZE | SDU_TYPE_MPDU;
+		enetc4_port_wr(enetc_hw, ENETC4_PTCTMSDUR(0),
+			       val | SDU_TYPE_MPDU);
+	}
 
 	/* Rx offloads which are enabled by default */
 	if (dev_rx_offloads_sup & ~rx_offloads) {
@@ -778,7 +808,8 @@ enetc4_dev_configure(struct rte_eth_dev *dev)
 	if (rx_offloads & (RTE_ETH_RX_OFFLOAD_UDP_CKSUM | RTE_ETH_RX_OFFLOAD_TCP_CKSUM))
 		checksum &= ~L4_CKSUM;
 
-	enetc4_port_wr(enetc_hw, ENETC4_PARCSCR, checksum);
+	if (hw->device_id != ENETC4_DEV_ID_VF)
+		enetc4_port_wr(enetc_hw, ENETC4_PARCSCR, checksum);
 
 	/* Enable interrupts */
 	if (hw->device_id == ENETC4_DEV_ID_VF) {
@@ -1041,8 +1072,8 @@ enetc4_dev_hw_init(struct rte_eth_dev *eth_dev)
 		ENETC_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
 	struct rte_pci_device *pci_dev = RTE_CLASS_TO_BUS_DEVICE(eth_dev, *pci_dev);
 
-	eth_dev->rx_pkt_burst = &enetc_recv_pkts_nc;
-	eth_dev->tx_pkt_burst = &enetc_xmit_pkts_nc;
+	eth_dev->rx_pkt_burst = &enetc_recv_pkts_cacheable;
+	eth_dev->tx_pkt_burst = &enetc_xmit_pkts_cacheable;
 
 	/* Retrieving and storing the HW base address of device */
 	hw->hw.reg = (void *)pci_dev->mem_resource[0].addr;
@@ -1082,7 +1113,14 @@ enetc4_dev_init(struct rte_eth_dev *eth_dev)
 	hw->max_tx_queues = si_cap & ENETC_SICAPR0_BDR_MASK;
 	hw->max_rx_queues = (si_cap >> 16) & ENETC_SICAPR0_BDR_MASK;
 
+	hw->nc_mode = 0;
 	enetc4_get_devargs(eth_dev, ENETC4_TXQ_PRIORITIES);
+	enetc4_get_devargs(eth_dev, ENETC4_NC_MEMORY);
+	if (hw->nc_mode) {
+		eth_dev->rx_pkt_burst = &enetc_recv_pkts_nc;
+		eth_dev->tx_pkt_burst = &enetc_xmit_pkts_nc;
+		ENETC_PMD_LOG(INFO, "nc=1: using non-cacheable BD ops (_nc)");
+	}
 
 	ENETC_PMD_DEBUG("Max RX queues = %d Max TX queues = %d",
 			hw->max_rx_queues, hw->max_tx_queues);
@@ -1149,5 +1187,6 @@ RTE_PMD_REGISTER_PCI(net_enetc4, rte_enetc4_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_enetc4, pci_id_enetc4_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enetc4, "* vfio-pci");
 RTE_PMD_REGISTER_PARAM_STRING(net_enetc4,
-			      ENETC4_TXQ_PRIORITIES "=<string>");
+			      ENETC4_TXQ_PRIORITIES "=<string> "
+			      ENETC4_NC_MEMORY "=<int>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_logtype_pmd, NOTICE);
diff --git a/drivers/net/enetc/enetc4_vf.c b/drivers/net/enetc/enetc4_vf.c
index d78e08e..98435ef 100644
--- a/drivers/net/enetc/enetc4_vf.c
+++ b/drivers/net/enetc/enetc4_vf.c
@@ -12,6 +12,30 @@
 #define ENETC4_VSI_DISABLE		"enetc4_vsi_disable"
 #define ENETC4_VSI_TIMEOUT		"enetc4_vsi_timeout"
 #define ENETC4_VSI_DELAY		"enetc4_vsi_delay"
+#define ENETC4_NC_MEMORY		"nc"
+
+static void
+enetc4_vf_get_devarg_nc(struct rte_eth_dev *dev)
+{
+	struct enetc_eth_hw *hw =
+		ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_devargs *devargs = dev->device->devargs;
+	struct rte_kvargs *kvlist;
+	const char *val;
+
+	if (!devargs)
+		return;
+
+	kvlist = rte_kvargs_parse(devargs->args, NULL);
+	if (!kvlist)
+		return;
+
+	val = rte_kvargs_get(kvlist, ENETC4_NC_MEMORY);
+	if (val && atoi(val) == 1)
+		hw->nc_mode = 1;
+
+	rte_kvargs_free(kvlist);
+}
 
 #define ENETC_CRC_TABLE_SIZE		256
 #define ENETC_POLY			0x1021
@@ -1368,6 +1392,14 @@ enetc4_vf_dev_init(struct rte_eth_dev *eth_dev)
 
 	enetc4_dev_hw_init(eth_dev);
 
+	hw->nc_mode = 0;
+	enetc4_vf_get_devarg_nc(eth_dev);
+	if (hw->nc_mode) {
+		eth_dev->rx_pkt_burst = &enetc_recv_pkts_nc;
+		eth_dev->tx_pkt_burst = &enetc_xmit_pkts_nc;
+		ENETC_PMD_LOG(INFO, "nc=1: using non-cacheable BD ops (_nc)");
+	}
+
 	si_cap = enetc_rd(enetc_hw, ENETC_SICAPR0);
 	hw->max_tx_queues = si_cap & ENETC_SICAPR0_BDR_MASK;
 	hw->max_rx_queues = (si_cap >> 16) & ENETC_SICAPR0_BDR_MASK;
@@ -1475,5 +1507,6 @@ RTE_PMD_REGISTER_KMOD_DEP(net_enetc4_vf, "* igb_uio | uio_pci_generic");
 RTE_PMD_REGISTER_PARAM_STRING(net_enetc4_vf,
 			      ENETC4_VSI_DISABLE "=<any> "
 			      ENETC4_VSI_TIMEOUT "=<uint> "
-			      ENETC4_VSI_DELAY "=<uint>");
+			      ENETC4_VSI_DELAY "=<uint> "
+			      ENETC4_NC_MEMORY "=<int>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_vf_logtype_pmd, NOTICE);
diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index e4f5608..2cd74f5 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -26,6 +26,7 @@ enetc_clean_tx_ring(struct enetc_bdr *tx_ring)
 	struct enetc_swbd *tx_swbd, *tx_swbd_base;
 	int i, hwci, bd_count;
 	struct rte_mbuf *m[ENETC_RXBD_BUNDLE];
+	struct enetc_tx_bd *txbd;
 
 	/* we don't need barriers here, we just want a relatively current value
 	 * from HW.
@@ -51,6 +52,13 @@ enetc_clean_tx_ring(struct enetc_bdr *tx_ring)
 		/* It seems calling rte_pktmbuf_free is wasting a lot of cycles,
 		 * make a list and call _free when it's done.
 		 */
+		/* Clear flags on the reclaimed BD so that dcbf in the
+		 * cacheable TX path never flushes a stale flags_F to memory
+		 * before the new BD fields are fully written.
+		 */
+		txbd = ENETC_TXBD(*tx_ring, i);
+		txbd->flags = 0;
+
 		if (tx_frm_cnt == ENETC_RXBD_BUNDLE) {
 			rte_pktmbuf_free_bulk(m, tx_frm_cnt);
 			tx_frm_cnt = 0;
@@ -202,7 +210,8 @@ enetc_xmit_pkts_nc(void *tx_queue,
 		}
 
 		/* Set the frame-last flag on the final BD of this packet. */
-		txbd->flags |= ENETC4_TXBD_FLAGS_F;
+		if (likely(txbd))
+			txbd->flags |= ENETC4_TXBD_FLAGS_F;
 		start++;
 	}
 
@@ -217,6 +226,7 @@ enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt)
 {
 	struct enetc_swbd *rx_swbd;
 	union enetc_rx_bd *rxbd;
+	union enetc_rx_bd *grp_start_rxbd;
 	int i, j, k = ENETC_RXBD_BUNDLE;
 	struct rte_mbuf *m[ENETC_RXBD_BUNDLE];
 	struct rte_mempool *mb_pool;
@@ -225,6 +235,7 @@ enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt)
 	mb_pool = rx_ring->mb_pool;
 	rx_swbd = &rx_ring->q_swbd[i];
 	rxbd = ENETC_RXBD(*rx_ring, i);
+	grp_start_rxbd = rxbd;
 	for (j = 0; j < buff_cnt; j++) {
 		/* bulk alloc for the next up to 8 BDs */
 		if (k == ENETC_RXBD_BUNDLE) {
@@ -246,12 +257,29 @@ enetc_refill_rx_ring(struct enetc_bdr *rx_ring, const int buff_cnt)
 		i++;
 		k++;
 		if (unlikely(i == rx_ring->bd_count)) {
+			/*
+			 * Ring wrap: flush the current partial or full group
+			 * before resetting the pointer to index 0.
+			 */
+			dcbf((void *)grp_start_rxbd);
 			i = 0;
 			rxbd = ENETC_RXBD(*rx_ring, i);
 			rx_swbd = &rx_ring->q_swbd[i];
+			grp_start_rxbd = rxbd;
+		} else if ((i & ENETC_BD_PER_CL_MASK) == 0) {
+			/*
+			 * Completed a full 4-BD group (one cache line).
+			 * Flush it to PoC so HW sees the updated descriptors.
+			 */
+			dcbf((void *)grp_start_rxbd);
+			grp_start_rxbd = rxbd;
 		}
 	}
 
+	/* Flush any remaining partial group at the end of the fill. */
+	if (j && (i & ENETC_BD_PER_CL_MASK) != 0)
+		dcbf((void *)grp_start_rxbd);
+
 	if (likely(j)) {
 		rx_ring->next_to_alloc = i;
 		rx_ring->next_to_use = i;
@@ -604,3 +632,251 @@ enetc_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts,
 
 	return enetc_clean_rx_ring(rx_ring, rx_pkts, nb_pkts);
 }
+
+/* --- Cacheable BD ring TX path with SW cache maintenance (dcbf) --- */
+
+uint16_t
+enetc_xmit_pkts_cacheable(void *tx_queue,
+		struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i, start, bds_to_use;
+	struct enetc_tx_bd *txbd;
+	struct enetc_bdr *tx_ring = (struct enetc_bdr *)tx_queue;
+	unsigned int j;
+	uint8_t *data;
+	struct rte_mbuf *seg;
+	uint16_t seg_len, segs_per_pkt;
+	bool is_first_seg;
+	int first_bd_idx, bd_count;
+
+	i = tx_ring->next_to_use;
+	bds_to_use = enetc_bd_unused(tx_ring);
+	bd_count = tx_ring->bd_count;
+	start = 0;
+
+	/*
+	 * Remember the first BD index of this batch so we can flush the
+	 * BD cache lines to PoC after all descriptors are written.
+	 */
+	first_bd_idx = i;
+
+	while (start < nb_pkts) {
+		seg = tx_pkts[start];
+		segs_per_pkt = seg->nb_segs;
+
+		if (bds_to_use < segs_per_pkt)
+			break;
+
+		is_first_seg = true;
+		while (seg) {
+			tx_ring->q_swbd[i].buffer_addr = NULL;
+			seg_len = rte_pktmbuf_data_len(seg);
+			data = rte_pktmbuf_mtod(seg, void *);
+
+			/*
+			 * Flush packet data cache lines to PoC so HW DMA
+			 * reads the correct payload from memory.
+			 */
+			for (j = 0; j < seg_len; j += RTE_CACHE_LINE_SIZE)
+				dcbf(data + j);
+
+			/*
+			 * Cover the last byte of an unaligned buffer to
+			 * ensure the full payload is clean to the Point of
+			 * Coherency.
+			 */
+			dcbf(data + (seg_len - 1));
+			txbd = ENETC_TXBD(*tx_ring, i);
+			txbd->flags = 0;
+			if (is_first_seg) {
+				tx_ring->q_swbd[i].buffer_addr = seg;
+				txbd->frm_len = rte_pktmbuf_pkt_len(seg);
+				if (seg->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
+					enetc4_tx_offload_checksum(seg, txbd);
+				is_first_seg = false;
+			}
+
+			txbd->buf_len = rte_cpu_to_le_16(seg_len);
+			txbd->addr = rte_cpu_to_le_64(rte_mbuf_data_iova(seg));
+			seg = seg->next;
+			i++;
+			bds_to_use--;
+
+			if (unlikely(i == bd_count))
+				i = 0;
+		}
+
+		/*
+		 * Set the frame-last flag on the final BD of this packet.
+		 * This is the last write to the BD group; the cache flush
+		 * below will push all BDs to memory afterwards.
+		 */
+		if (likely(txbd))
+			txbd->flags |= ENETC4_TXBD_FLAGS_F;
+		start++;
+	}
+
+	/*
+	 * Flush TX BDs to PoC so HW (non-cache-coherent i.MX95) can read
+	 * the descriptors from memory.  TX BDs are 16 B each; 4 BDs share
+	 * one 64-byte cache line.  Walk from the cache-line-aligned start
+	 * of first_bd_idx to just past the last written BD, one dcbf per
+	 * cache line.
+	 *
+	 * The flush must happen AFTER all BD fields (including flags_F) are
+	 * written, so HW never sees a partial descriptor.
+	 */
+	if (likely(start > 0)) {
+		int n = first_bd_idx & ~ENETC_BD_PER_CL_MASK;
+		int written = (i - n + bd_count) % bd_count;
+
+		if (written == 0)
+			written = bd_count;
+		written = (written + ENETC_BD_PER_CL_MASK) & ~ENETC_BD_PER_CL_MASK;
+
+		while (written > 0) {
+			dcbf((void *)ENETC_TXBD(*tx_ring, n));
+			n = (n + ENETC_BD_PER_CL) % bd_count;
+			written -= ENETC_BD_PER_CL;
+		}
+	}
+
+	enetc_clean_tx_ring(tx_ring);
+	tx_ring->next_to_use = i;
+	enetc_wr_reg(tx_ring->tcir, i);
+
+	return start;
+}
+
+/* --- Cacheable BD ring RX path with SW cache maintenance (dccivac) --- */
+
+static int
+enetc_clean_rx_ring_cacheable(struct enetc_bdr *rx_ring,
+		struct rte_mbuf **rx_pkts,
+		int work_limit)
+{
+	int rx_frm_cnt = 0;
+	int cleaned_cnt, i;
+	struct enetc_swbd *rx_swbd;
+	union enetc_rx_bd *rxbd;
+	struct rte_mbuf *first_seg, *cur_seg;
+	uint32_t bd_status;
+	uint8_t *data;
+	uint32_t j;
+	struct rte_mbuf *seg;
+	uint16_t data_len;
+
+	i = rx_ring->next_to_clean;
+	rxbd = ENETC_RXBD(*rx_ring, i);
+	cleaned_cnt = enetc_bd_unused(rx_ring);
+	rx_swbd = &rx_ring->q_swbd[i];
+
+	/* Restore partial multi-segment chain from a previous burst. */
+	first_seg = rx_ring->pkt_first_seg;
+	cur_seg = rx_ring->pkt_last_seg;
+
+	/*
+	 * On i.MX95 the BD ring is in cacheable hugepage memory but the
+	 * platform is non-cache-coherent.  HW writes RX BDs to DDR
+	 * without snooping the CPU cache, so stale cached copies of BD
+	 * status fields must be discarded before the CPU reads them.
+	 *
+	 * Ideal instruction: DC IVAC (invalidate only, no writeback).
+	 * ARM64 constraint: DC IVAC requires EL1 privilege; executing it
+	 * from EL0 (DPDK userspace) raises a fault.  The only EL0-safe
+	 * cache maintenance instruction that invalidates is DC CIVAC
+	 * (clean + invalidate, dccivac).
+	 *
+	 * Safety of using dccivac here:
+	 * enetc_refill_rx_ring() issues dcbf() on every BD group before
+	 * returning ownership to HW.  After dcbf the CPU cache lines are
+	 * marked clean (no dirty data).  When dccivac runs, the "clean"
+	 * phase finds nothing dirty to write back, so it behaves as a
+	 * pure invalidate - exactly what we need.
+	 *
+	 * Granularity: BD = 16 B, cache line = 64 B, so one dccivac
+	 * covers exactly 4 BDs.  Invalidate at each 4-BD boundary.
+	 */
+	dccivac((void *)ENETC_RXBD(*rx_ring,
+			(i & ~(int)ENETC_BD_PER_CL_MASK)));
+
+	while (likely(rx_frm_cnt < work_limit)) {
+		bd_status = rte_le_to_cpu_32(rxbd->r.lstatus);
+
+		if (!(bd_status & ENETC_RXBD_LSTATUS_R))
+			break;
+		if (rxbd->r.error)
+			rx_ring->ierrors++;
+
+		seg = rx_swbd->buffer_addr;
+		data_len = rte_le_to_cpu_16(rxbd->r.buf_len);
+		seg->data_len = data_len;
+		if (!first_seg) {
+			first_seg = seg;
+			cur_seg = seg;
+			first_seg->pkt_len = data_len;
+			enetc_dev_rx_parse(first_seg,
+					   rxbd->r.parse_summary);
+			first_seg->hash.rss = rxbd->r.rss_hash;
+		} else {
+			first_seg->pkt_len += data_len;
+			first_seg->nb_segs++;
+			cur_seg->next = seg;
+			cur_seg = seg;
+		}
+
+		/*
+		 * Invalidate packet data cache lines so the CPU reads the
+		 * payload that HW DMA'd into memory, not stale cached bytes.
+		 */
+		data = rte_pktmbuf_mtod(seg, void *);
+		for (j = 0; j < data_len; j += RTE_CACHE_LINE_SIZE)
+			dccivac(data + j);
+		/* Cover the last byte of an unaligned buffer. */
+		dccivac(data + (data_len - 1));
+
+		if (bd_status & ENETC_RXBD_LSTATUS_F) {
+			seg->next = NULL;
+			first_seg->pkt_len -= rx_ring->crc_len;
+			rx_pkts[rx_frm_cnt] = first_seg;
+			rx_frm_cnt++;
+			first_seg = NULL;
+		}
+
+		cleaned_cnt++;
+		rx_swbd++;
+		i++;
+		if (unlikely(i == rx_ring->bd_count)) {
+			i = 0;
+			rx_swbd = &rx_ring->q_swbd[i];
+		}
+		rxbd = ENETC_RXBD(*rx_ring, i);
+
+		/*
+		 * Crossed a 4-BD (cache-line) boundary: invalidate the new
+		 * group so the next four status reads fetch fresh DDR data
+		 * written by HW.
+		 */
+		if ((i & ENETC_BD_PER_CL_MASK) == 0 &&
+		    likely(rx_frm_cnt < work_limit))
+			dccivac((void *)rxbd);
+	}
+
+	/* Save partial chain for the next burst if frame is incomplete. */
+	rx_ring->pkt_first_seg = first_seg;
+	rx_ring->pkt_last_seg = cur_seg;
+	rx_ring->next_to_clean = i;
+	enetc_refill_rx_ring(rx_ring, ENETC_BD_ALIGN_DOWN(cleaned_cnt));
+
+	return rx_frm_cnt;
+}
+
+uint16_t
+enetc_recv_pkts_cacheable(void *rxq, struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	struct enetc_bdr *rx_ring = (struct enetc_bdr *)rxq;
+
+	return enetc_clean_rx_ring_cacheable(rx_ring, rx_pkts, nb_pkts);
+}
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 8/9] net/enetc: set user configurable priority to TX rings
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Vanshika Shukla
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

From: Vanshika Shukla <vanshika.shukla@nxp.com>

Add devarg 'enetc4_txq_prior' to allow per-queue TX ring priority
configuration. The value is a '|'-separated list of TBMR priority
bits, one per TX queue (e.g. 'enetc4_txq_prior=1|2|3').
The configuration accepts values only up to the maximum supported
TX queues. Any additional values beyond this supported range
are discarded.

Store the parsed priorities in hw->txq_prior and apply them in
enetc4_tx_queue_setup() when enabling the ring.

Signed-off-by: Vanshika Shukla <vanshika.shukla@nxp.com>
---
 doc/guides/rel_notes/release_26_07.rst |  1 +
 drivers/net/enetc/enetc.h              |  1 +
 drivers/net/enetc/enetc4_ethdev.c      | 81 +++++++++++++++++++++++++-
 3 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 192623d..495eba0 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -195,6 +195,7 @@ New Features
     messaging.
   * Added devargs options ``enetc4_vsi_timeout`` and ``enetc4_vsi_delay``
     for VSI-PSI messaging timeout and delay.
+  * Added devargs option ``enetc4_txq_prior`` to set TX queues priorities.
 
 Removed Items
 -------------
diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index 80844e9..c12597b 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -114,6 +114,7 @@ struct enetc_eth_hw {
 	uint32_t max_tx_queues;
 	uint32_t vsi_timeout; /* VSI-PSI message wait timeout (iterations) */
 	uint32_t vsi_delay;   /* VSI-PSI message wait delay (us) */
+	uint32_t *txq_prior;  /* per-queue TX priority (TBMR priority bits) */
 };
 
 /*
diff --git a/drivers/net/enetc/enetc4_ethdev.c b/drivers/net/enetc/enetc4_ethdev.c
index ad1ef4d..7e2d665 100644
--- a/drivers/net/enetc/enetc4_ethdev.c
+++ b/drivers/net/enetc/enetc4_ethdev.c
@@ -3,6 +3,7 @@
  */
 
 #include <stdbool.h>
+#include <rte_kvargs.h>
 #include <rte_random.h>
 #include <dpaax_iova_table.h>
 
@@ -10,6 +11,67 @@
 #include "enetc_logs.h"
 #include "enetc.h"
 
+#define ENETC4_TXQ_PRIORITIES	"enetc4_txq_prior"
+
+static int
+parse_txq_prior(const char *key __rte_unused, const char *value, void *opaque)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)opaque;
+	struct enetc_eth_hw *hw =
+		ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	char *input_str = strdup(value);
+	char *str;
+	uint32_t i = 0;
+
+	if (!input_str)
+		return -ENOMEM;
+
+	hw->txq_prior = calloc(hw->max_tx_queues, sizeof(uint32_t));
+	if (!hw->txq_prior) {
+		free(input_str);
+		return -ENOMEM;
+	}
+
+	str = strtok(input_str, "|");
+	while (str != NULL && i < hw->max_tx_queues) {
+		hw->txq_prior[i++] = (uint32_t)atoi(str);
+		str = strtok(NULL, "|");
+	}
+
+	free(input_str);
+	return 0;
+}
+
+static int
+enetc4_get_devargs(struct rte_eth_dev *dev, const char *key)
+{
+	struct rte_devargs *devargs = dev->device->devargs;
+	struct rte_kvargs *kvlist;
+
+	if (!devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(devargs->args, NULL);
+	if (!kvlist)
+		return 0;
+
+	if (!rte_kvargs_count(kvlist, key)) {
+		rte_kvargs_free(kvlist);
+		return 0;
+	}
+
+	if (!strcmp(key, ENETC4_TXQ_PRIORITIES)) {
+		if (rte_kvargs_process(kvlist, key,
+				       parse_txq_prior, (void *)dev) < 0) {
+			rte_kvargs_free(kvlist);
+			return 0;
+		}
+	}
+
+	rte_kvargs_free(kvlist);
+	return 0;
+}
+
 /* Supported Rx offloads */
 static uint64_t dev_rx_offloads_sup =
 	RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
@@ -316,9 +378,14 @@ enetc4_tx_queue_setup(struct rte_eth_dev *dev,
 	data->tx_queues[queue_idx] = tx_ring;
 	tx_ring->tx_deferred_start = tx_conf->tx_deferred_start;
 	if (!tx_conf->tx_deferred_start) {
+		uint32_t tx_en = ENETC_TBMR_EN;
+
+		/* apply TX queue priority if configured */
+		if (priv->hw.txq_prior)
+			tx_en |= priv->hw.txq_prior[tx_ring->index];
 		/* enable ring */
 		enetc4_txbdr_wr(&priv->hw.hw, tx_ring->index,
-			       ENETC_TBMR, ENETC_TBMR_EN);
+			       ENETC_TBMR, tx_en);
 		dev->data->tx_queue_state[tx_ring->index] =
 			       RTE_ETH_QUEUE_STATE_STARTED;
 	} else {
@@ -1015,6 +1082,8 @@ enetc4_dev_init(struct rte_eth_dev *eth_dev)
 	hw->max_tx_queues = si_cap & ENETC_SICAPR0_BDR_MASK;
 	hw->max_rx_queues = (si_cap >> 16) & ENETC_SICAPR0_BDR_MASK;
 
+	enetc4_get_devargs(eth_dev, ENETC4_TXQ_PRIORITIES);
+
 	ENETC_PMD_DEBUG("Max RX queues = %d Max TX queues = %d",
 			hw->max_rx_queues, hw->max_tx_queues);
 	error = enetc4_mac_init(hw, eth_dev);
@@ -1041,8 +1110,16 @@ enetc4_dev_init(struct rte_eth_dev *eth_dev)
 static int
 enetc4_dev_uninit(struct rte_eth_dev *eth_dev)
 {
+	struct enetc_eth_hw *hw =
+		ENETC_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
+
 	PMD_INIT_FUNC_TRACE();
 
+	if (hw->txq_prior) {
+		free(hw->txq_prior);
+		hw->txq_prior = NULL;
+	}
+
 	return enetc4_dev_close(eth_dev);
 }
 
@@ -1071,4 +1148,6 @@ static struct rte_pci_driver rte_enetc4_pmd = {
 RTE_PMD_REGISTER_PCI(net_enetc4, rte_enetc4_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_enetc4, pci_id_enetc4_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enetc4, "* vfio-pci");
+RTE_PMD_REGISTER_PARAM_STRING(net_enetc4,
+			      ENETC4_TXQ_PRIORITIES "=<string>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_logtype_pmd, NOTICE);
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 6/9] net/enetc: add option to disable VSI messaging
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Gagandeep Singh
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

Add devarg 'enetc4_vsi_disable' to allow disabling features
dependent on VSI-PSI messaging. This is useful for testing DPDK
with a PF driver that does not support VSI-PSI messages.

When the devarg is present, a reduced ops table
(enetc4_vf_ops_no_vsi_m) is used that replaces link_update with
a no-op stub and omits MAC/VLAN filter ops that require VSI msgs.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
 doc/guides/rel_notes/release_26_07.rst |  2 +
 drivers/net/enetc/enetc4_vf.c          | 61 ++++++++++++++++++++++++--
 2 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index f900145..783ad16 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -191,6 +191,8 @@ New Features
 
   * Added support for ESP packet type in packet parsing.
   * Added scatter-gather support for ENETC4 PFs and VFs.
+  * Added devargs option ``enetc4_vsi_disable`` to disable VSI-PSI
+    messaging.
 
 Removed Items
 -------------
diff --git a/drivers/net/enetc/enetc4_vf.c b/drivers/net/enetc/enetc4_vf.c
index 9dc4e1d..44c0dc0 100644
--- a/drivers/net/enetc/enetc4_vf.c
+++ b/drivers/net/enetc/enetc4_vf.c
@@ -3,11 +3,14 @@
  */
 
 #include <stdbool.h>
+#include <rte_kvargs.h>
 #include <rte_random.h>
 #include <dpaax_iova_table.h>
 #include "enetc_logs.h"
 #include "enetc.h"
 
+#define ENETC4_VSI_DISABLE		"enetc4_vsi_disable"
+
 #define ENETC_CRC_TABLE_SIZE		256
 #define ENETC_POLY			0x1021
 #define ENETC_CRC_INIT			0xffff
@@ -687,6 +690,13 @@ enetc4_vf_get_link_speed(struct rte_eth_dev *dev, struct enetc_psi_reply_msg *re
 	return err;
 }
 
+static int
+enetc4_vf_link_update_dummy(struct rte_eth_dev *dev __rte_unused,
+			    int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
 static int
 enetc4_vf_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
 {
@@ -1148,6 +1158,27 @@ static const struct rte_pci_id pci_vf_id_enetc4_map[] = {
 };
 
 /* Features supported by this driver */
+/* ops table used when VSI messaging is disabled */
+static const struct eth_dev_ops enetc4_vf_ops_no_vsi_m = {
+	.dev_configure        = enetc4_dev_configure,
+	.dev_start            = enetc4_vf_dev_start,
+	.dev_stop             = enetc4_vf_dev_stop,
+	.dev_close            = enetc4_dev_close,
+	.stats_get            = enetc4_vf_stats_get,
+	.dev_infos_get        = enetc4_vf_dev_infos_get,
+	.mtu_set              = enetc4_vf_mtu_set,
+	.link_update	      = enetc4_vf_link_update_dummy,
+	.rx_queue_setup       = enetc4_rx_queue_setup,
+	.rx_queue_start       = enetc4_rx_queue_start,
+	.rx_queue_stop        = enetc4_rx_queue_stop,
+	.rx_queue_release     = enetc4_rx_queue_release,
+	.tx_queue_setup       = enetc4_tx_queue_setup,
+	.tx_queue_start       = enetc4_tx_queue_start,
+	.tx_queue_stop        = enetc4_tx_queue_stop,
+	.tx_queue_release     = enetc4_tx_queue_release,
+	.dev_supported_ptypes_get = enetc4_supported_ptypes_get,
+};
+
 static const struct eth_dev_ops enetc4_vf_ops = {
 	.dev_configure        = enetc4_dev_configure,
 	.dev_start            = enetc4_vf_dev_start,
@@ -1283,7 +1314,28 @@ enetc4_vf_dev_init(struct rte_eth_dev *eth_dev)
 	struct enetc_hw *enetc_hw = &hw->hw;
 
 	PMD_INIT_FUNC_TRACE();
-	eth_dev->dev_ops = &enetc4_vf_ops;
+
+	/* check if VSI messaging should be disabled via devarg */
+	if (eth_dev->device->devargs) {
+		struct rte_kvargs *kvlist;
+
+		kvlist = rte_kvargs_parse(eth_dev->device->devargs->args,
+					  NULL);
+		if (kvlist) {
+			if (rte_kvargs_count(kvlist, ENETC4_VSI_DISABLE) != 0) {
+				ENETC_PMD_NOTICE("VSI messaging disabled by devarg");
+				eth_dev->dev_ops = &enetc4_vf_ops_no_vsi_m;
+			} else {
+				eth_dev->dev_ops = &enetc4_vf_ops;
+			}
+			rte_kvargs_free(kvlist);
+		} else {
+			eth_dev->dev_ops = &enetc4_vf_ops;
+		}
+	} else {
+		eth_dev->dev_ops = &enetc4_vf_ops;
+	}
+
 	enetc4_dev_hw_init(eth_dev);
 
 	si_cap = enetc_rd(enetc_hw, ENETC_SICAPR0);
@@ -1304,8 +1356,9 @@ enetc4_vf_dev_init(struct rte_eth_dev *eth_dev)
 	ENETC_PMD_DEBUG("port_id %d vendorID=0x%x deviceID=0x%x",
 			eth_dev->data->port_id, pci_dev->id.vendor_id,
 			pci_dev->id.device_id);
-	/* update link */
-	enetc4_vf_link_update(eth_dev, 0);
+	/* update link if VSI messaging is enabled */
+	if (eth_dev->dev_ops == &enetc4_vf_ops)
+		enetc4_vf_link_update(eth_dev, 0);
 
 	return 0;
 }
@@ -1389,4 +1442,6 @@ static struct rte_pci_driver rte_enetc4_vf_pmd = {
 RTE_PMD_REGISTER_PCI(net_enetc4_vf, rte_enetc4_vf_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_enetc4_vf, pci_vf_id_enetc4_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enetc4_vf, "* igb_uio | uio_pci_generic");
+RTE_PMD_REGISTER_PARAM_STRING(net_enetc4_vf,
+			      ENETC4_VSI_DISABLE "=<any>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_vf_logtype_pmd, NOTICE);
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 7/9] net/enetc: add devargs to control VSI-PSI timeout and delay
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Gagandeep Singh
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

Add two new devargs for ENETC4 VF:
- enetc4_vsi_timeout: VSI-PSI message wait timeout (iteration count)
- enetc4_vsi_delay: VSI-PSI message wait delay in microseconds

Store the values in struct enetc_eth_hw and use them in
enetc4_msg_vsi_send() instead of the hardcoded defaults.
Fall back to ENETC4_DEF_VSI_WAIT_TIMEOUT_UPDATE /
ENETC4_DEF_VSI_WAIT_DELAY_UPDATE when not set.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
 doc/guides/rel_notes/release_26_07.rst |  2 +
 drivers/net/enetc/enetc.h              |  2 +
 drivers/net/enetc/enetc4_vf.c          | 66 +++++++++++++++++++-------
 3 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 783ad16..192623d 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -193,6 +193,8 @@ New Features
   * Added scatter-gather support for ENETC4 PFs and VFs.
   * Added devargs option ``enetc4_vsi_disable`` to disable VSI-PSI
     messaging.
+  * Added devargs options ``enetc4_vsi_timeout`` and ``enetc4_vsi_delay``
+    for VSI-PSI messaging timeout and delay.
 
 Removed Items
 -------------
diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index 01da898..80844e9 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -112,6 +112,8 @@ struct enetc_eth_hw {
 	uint32_t num_rss;
 	uint32_t max_rx_queues;
 	uint32_t max_tx_queues;
+	uint32_t vsi_timeout; /* VSI-PSI message wait timeout (iterations) */
+	uint32_t vsi_delay;   /* VSI-PSI message wait delay (us) */
 };
 
 /*
diff --git a/drivers/net/enetc/enetc4_vf.c b/drivers/net/enetc/enetc4_vf.c
index 44c0dc0..d78e08e 100644
--- a/drivers/net/enetc/enetc4_vf.c
+++ b/drivers/net/enetc/enetc4_vf.c
@@ -10,6 +10,8 @@
 #include "enetc.h"
 
 #define ENETC4_VSI_DISABLE		"enetc4_vsi_disable"
+#define ENETC4_VSI_TIMEOUT		"enetc4_vsi_timeout"
+#define ENETC4_VSI_DELAY		"enetc4_vsi_delay"
 
 #define ENETC_CRC_TABLE_SIZE		256
 #define ENETC_POLY			0x1021
@@ -262,10 +264,13 @@ enetc4_process_psi_msg(struct rte_eth_dev *eth_dev, struct enetc_hw *enetc_hw)
 }
 
 static int
-enetc4_msg_vsi_send(struct enetc_hw *enetc_hw, struct enetc_msg_swbd *msg)
+enetc4_msg_vsi_send(struct enetc_eth_hw *hw, struct enetc_msg_swbd *msg)
 {
-	int timeout = ENETC4_DEF_VSI_WAIT_TIMEOUT_UPDATE;
-	int delay_us = ENETC4_DEF_VSI_WAIT_DELAY_UPDATE;
+	struct enetc_hw *enetc_hw = &hw->hw;
+	int timeout = hw->vsi_timeout ? (int)hw->vsi_timeout :
+					ENETC4_DEF_VSI_WAIT_TIMEOUT_UPDATE;
+	int delay_us = hw->vsi_delay ? (int)hw->vsi_delay :
+				       ENETC4_DEF_VSI_WAIT_DELAY_UPDATE;
 	uint8_t class_id = 0;
 	int err = 0;
 	int vsimsgsr;
@@ -382,7 +387,7 @@ enetc4_vf_set_mac_addr(struct rte_eth_dev *dev, struct rte_ether_addr *addr)
 					ENETC_CMD_ID_SET_PRIMARY_MAC, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -426,7 +431,6 @@ static int
 enetc4_vf_promisc_send_message(struct rte_eth_dev *dev, bool promisc_en)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_cmd_set_promisc *cmd;
 	struct enetc_msg_swbd *msg;
 	uint32_t msg_size;
@@ -466,7 +470,7 @@ enetc4_vf_promisc_send_message(struct rte_eth_dev *dev, bool promisc_en)
 				ENETC_CMD_ID_SET_MAC_PROMISCUOUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -483,7 +487,6 @@ static int
 enetc4_vf_allmulti_send_message(struct rte_eth_dev *dev, bool mc_promisc)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_cmd_set_promisc *cmd;
 	struct enetc_msg_swbd *msg;
 	uint32_t msg_size;
@@ -524,7 +527,7 @@ enetc4_vf_allmulti_send_message(struct rte_eth_dev *dev, bool mc_promisc)
 				ENETC_CMD_ID_SET_MAC_PROMISCUOUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -630,7 +633,7 @@ enetc4_vf_get_link_status(struct rte_eth_dev *dev, struct enetc_psi_reply_msg *r
 			ENETC_CMD_ID_GET_LINK_STATUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -676,7 +679,7 @@ enetc4_vf_get_link_speed(struct rte_eth_dev *dev, struct enetc_psi_reply_msg *re
 			ENETC_CMD_ID_GET_LINK_SPEED, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -819,7 +822,6 @@ static int
 enetc4_vf_vlan_promisc(struct rte_eth_dev *dev, bool promisc_en)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_cmd_set_vlan_promisc *cmd;
 	struct enetc_msg_swbd *msg;
 	uint32_t msg_size;
@@ -858,7 +860,7 @@ enetc4_vf_vlan_promisc(struct rte_eth_dev *dev, bool promisc_en)
 				ENETC_CMD_ID_SET_VLAN_PROMISCUOUS, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -921,7 +923,7 @@ enetc4_vf_mac_addr_add(struct rte_eth_dev *dev, struct rte_ether_addr *addr,
 			ENETC_MSG_ADD_EXACT_MAC_ENTRIES, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -1021,7 +1023,7 @@ static int enetc4_vf_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id,
 	}
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err) {
 		ENETC_PMD_ERR("VSI message send error");
 		goto end;
@@ -1104,7 +1106,6 @@ static int
 enetc4_vf_link_register_notif(struct rte_eth_dev *dev, bool enable)
 {
 	struct enetc_eth_hw *hw = ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	struct enetc_hw *enetc_hw = &hw->hw;
 	struct enetc_msg_swbd *msg;
 	struct rte_eth_link link;
 	uint32_t msg_size;
@@ -1138,7 +1139,7 @@ enetc4_vf_link_register_notif(struct rte_eth_dev *dev, bool enable)
 			cmd, 0, 0, 0);
 
 	/* send the command and wait */
-	err = enetc4_msg_vsi_send(enetc_hw, msg);
+	err = enetc4_msg_vsi_send(hw, msg);
 	if (err)
 		ENETC_PMD_ERR("VSI msg error for link status notification");
 
@@ -1322,12 +1323,41 @@ enetc4_vf_dev_init(struct rte_eth_dev *eth_dev)
 		kvlist = rte_kvargs_parse(eth_dev->device->devargs->args,
 					  NULL);
 		if (kvlist) {
+			const char *val;
+
 			if (rte_kvargs_count(kvlist, ENETC4_VSI_DISABLE) != 0) {
 				ENETC_PMD_NOTICE("VSI messaging disabled by devarg");
 				eth_dev->dev_ops = &enetc4_vf_ops_no_vsi_m;
 			} else {
 				eth_dev->dev_ops = &enetc4_vf_ops;
 			}
+
+			/* parse optional VSI-PSI timeout devarg */
+			val = rte_kvargs_get(kvlist, ENETC4_VSI_TIMEOUT);
+			if (val) {
+				errno = 0;
+				hw->vsi_timeout = (uint32_t)strtoul(val, NULL, 0);
+				if (errno != 0 || hw->vsi_timeout == 0) {
+					ENETC_PMD_ERR("Invalid VSI Timeout value = %u",
+							hw->vsi_timeout);
+					return -1;
+				}
+				ENETC_PMD_NOTICE("VSI timeout set to %u", hw->vsi_timeout);
+			}
+
+			/* parse optional VSI-PSI delay devarg */
+			val = rte_kvargs_get(kvlist, ENETC4_VSI_DELAY);
+			if (val) {
+				errno = 0;
+				hw->vsi_delay = (uint32_t)strtoul(val, NULL, 0);
+				if (errno != 0 || hw->vsi_delay == 0) {
+					ENETC_PMD_ERR("Invalid VSI Delay value = %u",
+							hw->vsi_delay);
+					return -1;
+				}
+				ENETC_PMD_NOTICE("VSI delay set to %u us", hw->vsi_delay);
+			}
+
 			rte_kvargs_free(kvlist);
 		} else {
 			eth_dev->dev_ops = &enetc4_vf_ops;
@@ -1443,5 +1473,7 @@ RTE_PMD_REGISTER_PCI(net_enetc4_vf, rte_enetc4_vf_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_enetc4_vf, pci_vf_id_enetc4_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_enetc4_vf, "* igb_uio | uio_pci_generic");
 RTE_PMD_REGISTER_PARAM_STRING(net_enetc4_vf,
-			      ENETC4_VSI_DISABLE "=<any>");
+			      ENETC4_VSI_DISABLE "=<any> "
+			      ENETC4_VSI_TIMEOUT "=<uint> "
+			      ENETC4_VSI_DELAY "=<uint>");
 RTE_LOG_REGISTER_DEFAULT(enetc4_vf_logtype_pmd, NOTICE);
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 5/9] net/enetc: support scatter-gather
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Vanshika Shukla
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

From: Vanshika Shukla <vanshika.shukla@nxp.com>

Add scatter-gather support for ENETC4 PMD:
- Add ENETC_RXBD_LSTATUS_R/F bits for RX BD status
- Add ENETC4_MAX_SEGS (63) for max segments per TX packet
- Update enetc4_vf_dev_infos_get to fill nb_seg_max, offloads,
  max queues and packet length
- Extend enetc_xmit_pkts_nc to handle multi-segment mbufs
- Extend enetc_clean_rx_ring_nc to chain scatter-gather segments
  using LSTATUS_R/F bits

Signed-off-by: Vanshika Shukla <vanshika.shukla@nxp.com>
---
 doc/guides/nics/features/enetc4.ini    |   1 +
 doc/guides/rel_notes/release_26_07.rst |   3 +-
 drivers/net/enetc/base/enetc_hw.h      |   2 +
 drivers/net/enetc/enetc.h              |   7 +-
 drivers/net/enetc/enetc4_ethdev.c      |  10 +-
 drivers/net/enetc/enetc4_vf.c          |  46 +++++++--
 drivers/net/enetc/enetc_rxtx.c         | 129 ++++++++++++++++---------
 7 files changed, 139 insertions(+), 59 deletions(-)

diff --git a/doc/guides/nics/features/enetc4.ini b/doc/guides/nics/features/enetc4.ini
index 87425f4..698140e 100644
--- a/doc/guides/nics/features/enetc4.ini
+++ b/doc/guides/nics/features/enetc4.ini
@@ -17,6 +17,7 @@ Basic stats          = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Queue start/stop     = Y
+Scattered Rx         = Y
 Linux                = Y
 ARMv8                = Y
 Usage doc            = Y
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 35476c2..f900145 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -189,7 +189,8 @@ New Features
 
 * **Updated NXP ENETC ethernet driver.**
 
-  * Added support for ESP packet type in packet parsing
+  * Added support for ESP packet type in packet parsing.
+  * Added scatter-gather support for ENETC4 PFs and VFs.
 
 Removed Items
 -------------
diff --git a/drivers/net/enetc/base/enetc_hw.h b/drivers/net/enetc/base/enetc_hw.h
index f79c950..6e96562 100644
--- a/drivers/net/enetc/base/enetc_hw.h
+++ b/drivers/net/enetc/base/enetc_hw.h
@@ -230,6 +230,8 @@ enum enetc_bdr_type {TX, RX};
 			(0x0005 | ENETC_PKT_TYPE_IPV4)
 #define ENETC_PKT_TYPE_IPV6_ESP \
 			(0x0005 | ENETC_PKT_TYPE_IPV6)
+#define ENETC_RXBD_LSTATUS_R	BIT(30)
+#define ENETC_RXBD_LSTATUS_F	BIT(31)
 
 /* PCI device info */
 struct enetc_hw {
diff --git a/drivers/net/enetc/enetc.h b/drivers/net/enetc/enetc.h
index 4d99b5b..01da898 100644
--- a/drivers/net/enetc/enetc.h
+++ b/drivers/net/enetc/enetc.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2019,2024 NXP
+ * Copyright 2018-2019,2024-2026 NXP
  */
 
 #ifndef _ENETC_H_
@@ -28,6 +28,8 @@
 #define MIN_BD_COUNT   32
 /* BD ALIGN */
 #define BD_ALIGN       8
+/* Max segments per ENETC4 TX packet (scatter-gather) */
+#define ENETC4_MAX_SEGS	63
 
 /* minimum frame size supported */
 #define ENETC_MAC_MINFRM_SIZE	68
@@ -90,6 +92,9 @@ struct enetc_bdr {
 		int next_to_alloc; /* Rx */
 	};
 	struct rte_mempool *mb_pool;   /* mbuf pool to populate RX ring. */
+	/* Partial scatter-gather chain persisted across burst calls. */
+	struct rte_mbuf *pkt_first_seg; /* first segment of in-progress frame */
+	struct rte_mbuf *pkt_last_seg;  /* last segment linked so far */
 	struct rte_eth_dev *ndev;
 	uint64_t ierrors;
 	uint8_t rx_deferred_start;
diff --git a/drivers/net/enetc/enetc4_ethdev.c b/drivers/net/enetc/enetc4_ethdev.c
index 154fc09..ad1ef4d 100644
--- a/drivers/net/enetc/enetc4_ethdev.c
+++ b/drivers/net/enetc/enetc4_ethdev.c
@@ -14,13 +14,15 @@
 static uint64_t dev_rx_offloads_sup =
 	RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
 	RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
-	RTE_ETH_RX_OFFLOAD_TCP_CKSUM;
+	RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
+	RTE_ETH_RX_OFFLOAD_SCATTER;
 
 /* Supported Tx offloads */
 static uint64_t dev_tx_offloads_sup =
 	RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
 	RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
-	RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
+	RTE_ETH_TX_OFFLOAD_TCP_CKSUM |
+	RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
 
 static int
 enetc4_dev_start(struct rte_eth_dev *dev)
@@ -199,11 +201,15 @@ enetc4_dev_infos_get(struct rte_eth_dev *dev,
 		.nb_max = MAX_BD_COUNT,
 		.nb_min = MIN_BD_COUNT,
 		.nb_align = BD_ALIGN,
+		.nb_seg_max = ENETC4_MAX_SEGS,
+		.nb_mtu_seg_max = ENETC4_MAX_SEGS,
 	};
 	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
 		.nb_max = MAX_BD_COUNT,
 		.nb_min = MIN_BD_COUNT,
 		.nb_align = BD_ALIGN,
+		.nb_seg_max = ENETC4_MAX_SEGS,
+		.nb_mtu_seg_max = ENETC4_MAX_SEGS,
 	};
 	dev_info->max_rx_queues = hw->max_rx_queues;
 	dev_info->max_tx_queues = hw->max_tx_queues;
diff --git a/drivers/net/enetc/enetc4_vf.c b/drivers/net/enetc/enetc4_vf.c
index bec7128..9dc4e1d 100644
--- a/drivers/net/enetc/enetc4_vf.c
+++ b/drivers/net/enetc/enetc4_vf.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2024 NXP
+ * Copyright 2024-2026 NXP
  */
 
 #include <stdbool.h>
@@ -18,8 +18,19 @@ uint16_t enetc_crc_table[ENETC_CRC_TABLE_SIZE];
 bool enetc_crc_gen;
 
 /* Supported Rx offloads */
-static uint64_t dev_vf_rx_offloads_sup =
-	RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
+static uint64_t dev_rx_offloads_sup =
+	RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
+	RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
+	RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
+	RTE_ETH_RX_OFFLOAD_VLAN_FILTER |
+	RTE_ETH_RX_OFFLOAD_SCATTER;
+
+/* Supported Tx offloads */
+static uint64_t dev_tx_offloads_sup =
+	RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
+	RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
+	RTE_ETH_TX_OFFLOAD_TCP_CKSUM |
+	RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
 
 static void
 enetc_gen_crc_table(void)
@@ -61,21 +72,38 @@ static int
 enetc4_vf_dev_infos_get(struct rte_eth_dev *dev,
 			struct rte_eth_dev_info *dev_info)
 {
-	int ret = 0;
+	struct enetc_eth_hw *hw =
+		ENETC_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
 	PMD_INIT_FUNC_TRACE();
 
-	ret = enetc4_dev_infos_get(dev, dev_info);
-	if (ret)
-		return ret;
-
+	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = MAX_BD_COUNT,
+		.nb_min = MIN_BD_COUNT,
+		.nb_align = BD_ALIGN,
+		.nb_seg_max = ENETC4_MAX_SEGS,
+		.nb_mtu_seg_max = ENETC4_MAX_SEGS,
+	};
+	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = MAX_BD_COUNT,
+		.nb_min = MIN_BD_COUNT,
+		.nb_align = BD_ALIGN,
+		.nb_seg_max = ENETC4_MAX_SEGS,
+		.nb_mtu_seg_max = ENETC4_MAX_SEGS,
+	};
+	dev_info->max_rx_queues = hw->max_rx_queues;
+	dev_info->max_tx_queues = hw->max_tx_queues;
+	dev_info->max_rx_pktlen = ENETC4_MAC_MAXFRM_SIZE;
 	dev_info->max_mtu = dev_info->max_rx_pktlen - (RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN);
 	dev_info->max_mac_addrs = ENETC4_MAC_ENTRIES;
-	dev_info->rx_offload_capa |= dev_vf_rx_offloads_sup;
+	dev_info->rx_offload_capa = dev_rx_offloads_sup;
+	dev_info->tx_offload_capa = dev_tx_offloads_sup;
+	dev_info->flow_type_rss_offloads = ENETC_RSS_OFFLOAD_ALL;
 
 	return 0;
 }
 
+
 int
 enetc4_vf_dev_stop(struct rte_eth_dev *dev __rte_unused)
 {
diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index 94177bb..e4f5608 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -149,54 +149,64 @@ enetc_xmit_pkts_nc(void *tx_queue,
 		struct rte_mbuf **tx_pkts,
 		uint16_t nb_pkts)
 {
-	struct enetc_swbd *tx_swbd;
-	int i, start, bds_to_use;
-	struct enetc_tx_bd *txbd;
 	struct enetc_bdr *tx_ring = (struct enetc_bdr *)tx_queue;
-	unsigned int buflen, j;
+	int i, start, bds_to_use, bd_count;
+	struct enetc_tx_bd *txbd;
+	struct rte_mbuf *seg;
+	uint16_t seg_len, segs_per_pkt;
+	bool is_first_seg;
+	unsigned int j;
 	uint8_t *data;
 
 	i = tx_ring->next_to_use;
-
 	bds_to_use = enetc_bd_unused(tx_ring);
-	if (bds_to_use < nb_pkts)
-		nb_pkts = bds_to_use;
+	bd_count = tx_ring->bd_count;
 
 	start = 0;
-	while (nb_pkts--) {
-		tx_ring->q_swbd[i].buffer_addr = tx_pkts[start];
+	while (start < nb_pkts) {
+		seg = tx_pkts[start];
+		segs_per_pkt = seg->nb_segs;
 
-		buflen = rte_pktmbuf_pkt_len(tx_ring->q_swbd[i].buffer_addr);
-		data = rte_pktmbuf_mtod(tx_ring->q_swbd[i].buffer_addr, void *);
-		for (j = 0; j <= buflen; j += RTE_CACHE_LINE_SIZE)
-			dcbf(data + j);
+		if (bds_to_use < segs_per_pkt)
+			break;
 
-		txbd = ENETC_TXBD(*tx_ring, i);
-		txbd->flags = 0;
-		if (tx_ring->q_swbd[i].buffer_addr->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
-			enetc4_tx_offload_checksum(tx_ring->q_swbd[i].buffer_addr, txbd);
+		is_first_seg = true;
+		while (seg) {
+			tx_ring->q_swbd[i].buffer_addr = NULL;
+			seg_len = rte_pktmbuf_data_len(seg);
+			data = rte_pktmbuf_mtod(seg, void *);
+
+			/* Flush payload to PoC so HW DMA reads the correct data. */
+			for (j = 0; j < seg_len; j += RTE_CACHE_LINE_SIZE)
+				dcbf(data + j);
+			/* Cover the last byte of an unaligned buffer. */
+			dcbf(data + (seg_len - 1));
+
+			txbd = ENETC_TXBD(*tx_ring, i);
+			txbd->flags = 0;
+			if (is_first_seg) {
+				tx_ring->q_swbd[i].buffer_addr = tx_pkts[start];
+				txbd->frm_len = rte_pktmbuf_pkt_len(seg);
+				if (seg->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
+					enetc4_tx_offload_checksum(seg, txbd);
+				is_first_seg = false;
+			}
+
+			txbd->buf_len = rte_cpu_to_le_16(seg_len);
+			txbd->addr = rte_cpu_to_le_64(rte_mbuf_data_iova(seg));
+			seg = seg->next;
+			i++;
+			bds_to_use--;
+			if (unlikely(i == bd_count))
+				i = 0;
+		}
 
-		tx_swbd = &tx_ring->q_swbd[i];
-		txbd->frm_len = buflen;
-		txbd->buf_len = txbd->frm_len;
-		txbd->addr = (uint64_t)(uintptr_t)
-		rte_cpu_to_le_64((size_t)tx_swbd->buffer_addr->buf_iova +
-				 tx_swbd->buffer_addr->data_off);
+		/* Set the frame-last flag on the final BD of this packet. */
 		txbd->flags |= ENETC4_TXBD_FLAGS_F;
-		i++;
 		start++;
-		if (unlikely(i == tx_ring->bd_count))
-			i = 0;
 	}
 
-	/* we're only cleaning up the Tx ring here, on the assumption that
-	 * software is slower than hardware and hardware completed sending
-	 * older frames out by now.
-	 * We're also cleaning up the ring before kicking off Tx for the new
-	 * batch to minimize chances of contention on the Tx ring
-	 */
 	enetc_clean_tx_ring(tx_ring);
-
 	tx_ring->next_to_use = i;
 	enetc_wr_reg(tx_ring->tcir, i);
 	return start;
@@ -501,38 +511,63 @@ enetc_clean_rx_ring_nc(struct enetc_bdr *rx_ring,
 	int cleaned_cnt, i;
 	struct enetc_swbd *rx_swbd;
 	union enetc_rx_bd *rxbd, rxbd_temp;
+	struct rte_mbuf *first_seg, *cur_seg;
 	uint32_t bd_status;
 	uint8_t *data;
 	uint32_t j;
+	struct rte_mbuf *seg;
+	uint16_t data_len;
 
 	/* next descriptor to process */
 	i = rx_ring->next_to_clean;
-	/* next descriptor to process */
 	rxbd = ENETC_RXBD(*rx_ring, i);
-
 	cleaned_cnt = enetc_bd_unused(rx_ring);
 	rx_swbd = &rx_ring->q_swbd[i];
 
+	/* Restore partial multi-segment chain from a previous burst. */
+	first_seg = rx_ring->pkt_first_seg;
+	cur_seg = rx_ring->pkt_last_seg;
+
 	while (likely(rx_frm_cnt < work_limit)) {
 		rxbd_temp = *rxbd;
 		bd_status = rte_le_to_cpu_32(rxbd_temp.r.lstatus);
-		if (!bd_status)
+		/* LSTATUS_R indicates this BD has been written by HW */
+		if (!(bd_status & ENETC_RXBD_LSTATUS_R))
 			break;
 		if (rxbd_temp.r.error)
 			rx_ring->ierrors++;
 
-		rx_swbd->buffer_addr->pkt_len = rxbd_temp.r.buf_len -
-						rx_ring->crc_len;
-		rx_swbd->buffer_addr->data_len = rx_swbd->buffer_addr->pkt_len;
-		rx_swbd->buffer_addr->hash.rss = rxbd_temp.r.rss_hash;
-		enetc_dev_rx_parse(rx_swbd->buffer_addr,
-				   rxbd_temp.r.parse_summary);
+		seg = rx_swbd->buffer_addr;
+		data_len = rte_le_to_cpu_16(rxbd_temp.r.buf_len);
+		seg->data_len = data_len;
+
+		if (!first_seg) {
+			first_seg = seg;
+			cur_seg = seg;
+			first_seg->pkt_len = data_len;
+			enetc_dev_rx_parse(first_seg, rxbd_temp.r.parse_summary);
+			first_seg->hash.rss = rxbd_temp.r.rss_hash;
+		} else {
+			first_seg->pkt_len += data_len;
+			first_seg->nb_segs++;
+			cur_seg->next = seg;
+			cur_seg = seg;
+		}
 
-		data = rte_pktmbuf_mtod(rx_swbd->buffer_addr, void *);
-		for (j = 0; j <= rx_swbd->buffer_addr->pkt_len; j += RTE_CACHE_LINE_SIZE)
+		/* Invalidate packet data cache lines so CPU reads HW-written data. */
+		data = rte_pktmbuf_mtod(seg, void *);
+		for (j = 0; j < data_len; j += RTE_CACHE_LINE_SIZE)
 			dccivac(data + j);
+		dccivac(data + (data_len - 1));
+
+		if (bd_status & ENETC_RXBD_LSTATUS_F) {
+			seg->next = NULL;
+			first_seg->pkt_len -= rx_ring->crc_len;
+			rx_pkts[rx_frm_cnt] = first_seg;
+			rx_frm_cnt++;
+			first_seg = NULL;
+		}
 
-		rx_pkts[rx_frm_cnt] = rx_swbd->buffer_addr;
 		cleaned_cnt++;
 		rx_swbd++;
 		i++;
@@ -541,9 +576,11 @@ enetc_clean_rx_ring_nc(struct enetc_bdr *rx_ring,
 			rx_swbd = &rx_ring->q_swbd[i];
 		}
 		rxbd = ENETC_RXBD(*rx_ring, i);
-		rx_frm_cnt++;
 	}
 
+	/* Save partial chain for the next burst if frame is incomplete. */
+	rx_ring->pkt_first_seg = first_seg;
+	rx_ring->pkt_last_seg = cur_seg;
 	rx_ring->next_to_clean = i;
 	enetc_refill_rx_ring(rx_ring, cleaned_cnt);
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 4/9] net/enetc: update random MAC generation code
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Gagandeep Singh
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

Use rte_eth_random_addr() instead of manual rte_rand() based MAC
generation. Also handle VF path by writing to ENETC_SIPMAR0/1 instead
of ENETC_PSIPMAR0/1 when running as a VF.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
 drivers/net/enetc/enetc_ethdev.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/net/enetc/enetc_ethdev.c b/drivers/net/enetc/enetc_ethdev.c
index 8196377..55b0e0b 100644
--- a/drivers/net/enetc/enetc_ethdev.c
+++ b/drivers/net/enetc/enetc_ethdev.c
@@ -195,20 +195,18 @@ enetc_hardware_init(struct enetc_eth_hw *hw)
 	}
 
 	if ((high_mac | low_mac) == 0) {
-		char *first_byte;
-
 		ENETC_PMD_NOTICE("MAC is not available for this SI, "
 				"set random MAC");
-		mac = (uint32_t *)hw->mac.addr;
-		*mac = (uint32_t)rte_rand();
-		first_byte = (char *)mac;
-		*first_byte &= 0xfe;	/* clear multicast bit */
-		*first_byte |= 0x02;	/* set local assignment bit (IEEE802) */
-
-		enetc_port_wr(enetc_hw, ENETC_PSIPMAR0(0), *mac);
-		mac++;
-		*mac = (uint16_t)rte_rand();
-		enetc_port_wr(enetc_hw, ENETC_PSIPMAR1(0), *mac);
+		rte_eth_random_addr(hw->mac.addr);
+		high_mac = *(uint32_t *)hw->mac.addr;
+		low_mac = *(uint16_t *)(hw->mac.addr + 4);
+		if (hw->device_id == ENETC_DEV_ID_VF) {
+			enetc_wr(enetc_hw, ENETC_SIPMAR0, high_mac);
+			enetc_wr(enetc_hw, ENETC_SIPMAR1, low_mac);
+		} else {
+			enetc_port_wr(enetc_hw, ENETC_PSIPMAR0(0), high_mac);
+			enetc_port_wr(enetc_hw, ENETC_PSIPMAR1(0), low_mac);
+		}
 		enetc_print_ethaddr("New address: ",
 			      (const struct rte_ether_addr *)hw->mac.addr);
 	}
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 3/9] net/enetc: support ESP packet type in packet parsing
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, Gagandeep Singh
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

Add ESP (Encapsulating Security Payload) packet type definitions and
handling to the RX packet parsing path. Also update the supported
ptypes array to advertise ESP tunnel type support.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
 doc/guides/rel_notes/release_26_07.rst |  3 +++
 drivers/net/enetc/base/enetc_hw.h      |  4 ++++
 drivers/net/enetc/enetc_ethdev.c       |  3 ++-
 drivers/net/enetc/enetc_rxtx.c         | 10 ++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index fc6e144..35476c2 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -187,6 +187,9 @@ New Features
   ``RTE_ETH_EVENT_RECOVERY_FAILED``) to notify upper layers of the
   reset lifecycle.
 
+* **Updated NXP ENETC ethernet driver.**
+
+  * Added support for ESP packet type in packet parsing
 
 Removed Items
 -------------
diff --git a/drivers/net/enetc/base/enetc_hw.h b/drivers/net/enetc/base/enetc_hw.h
index 19efadd..f79c950 100644
--- a/drivers/net/enetc/base/enetc_hw.h
+++ b/drivers/net/enetc/base/enetc_hw.h
@@ -226,6 +226,10 @@ enum enetc_bdr_type {TX, RX};
 			(0x0003 | ENETC_PKT_TYPE_IPV4)
 #define ENETC_PKT_TYPE_IPV6_ICMP \
 			(0x0003 | ENETC_PKT_TYPE_IPV6)
+#define ENETC_PKT_TYPE_IPV4_ESP \
+			(0x0005 | ENETC_PKT_TYPE_IPV4)
+#define ENETC_PKT_TYPE_IPV6_ESP \
+			(0x0005 | ENETC_PKT_TYPE_IPV6)
 
 /* PCI device info */
 struct enetc_hw {
diff --git a/drivers/net/enetc/enetc_ethdev.c b/drivers/net/enetc/enetc_ethdev.c
index f41f3c1..8196377 100644
--- a/drivers/net/enetc/enetc_ethdev.c
+++ b/drivers/net/enetc/enetc_ethdev.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2024 NXP
+ * Copyright 2018-2026 NXP
  */
 
 #include <stdbool.h>
@@ -95,6 +95,7 @@ enetc_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused,
 		RTE_PTYPE_L4_UDP,
 		RTE_PTYPE_L4_SCTP,
 		RTE_PTYPE_L4_ICMP,
+		RTE_PTYPE_TUNNEL_ESP
 	};
 
 	*no_of_elements = RTE_DIM(ptypes);
diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index d3b98b3..94177bb 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -370,6 +370,16 @@ enetc_dev_rx_parse(struct rte_mbuf *m, uint16_t parse_results)
 				 RTE_PTYPE_L3_IPV6 |
 				 RTE_PTYPE_L4_UDP;
 		return;
+	case ENETC_PKT_TYPE_IPV4_ESP:
+		m->packet_type = RTE_PTYPE_L2_ETHER |
+				 RTE_PTYPE_L3_IPV4 |
+				 RTE_PTYPE_TUNNEL_ESP;
+		return;
+	case ENETC_PKT_TYPE_IPV6_ESP:
+		m->packet_type = RTE_PTYPE_L2_ETHER |
+				 RTE_PTYPE_L3_IPV6 |
+				 RTE_PTYPE_TUNNEL_ESP;
+		return;
 	case ENETC_PKT_TYPE_IPV4_SCTP:
 		m->packet_type = RTE_PTYPE_L2_ETHER |
 				 RTE_PTYPE_L3_IPV4 |
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 2/9] net/enetc: fix queue initialization
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, stable, Gagandeep Singh
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

Hardware can misbehave if the user tries to reset the consumer and
producer indexes without resetting the ring.

This patch adds the ring reset step before resetting the indexes.

Fixes: 6c9c5aadc0e0 ("net/enetc: support ENETC4 queue API")
Cc: stable@dpdk.org

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
 drivers/net/enetc/enetc4_ethdev.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/enetc/enetc4_ethdev.c b/drivers/net/enetc/enetc4_ethdev.c
index 78eba70..154fc09 100644
--- a/drivers/net/enetc/enetc4_ethdev.c
+++ b/drivers/net/enetc/enetc4_ethdev.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2024 NXP
+ * Copyright 2024-2026 NXP
  */
 
 #include <stdbool.h>
@@ -279,6 +279,7 @@ enetc4_tx_queue_setup(struct rte_eth_dev *dev,
 		     const struct rte_eth_txconf *tx_conf)
 {
 	int err;
+	uint32_t tx_data;
 	struct enetc_bdr *tx_ring;
 	struct rte_eth_dev_data *data = dev->data;
 	struct enetc_eth_adapter *priv =
@@ -301,6 +302,10 @@ enetc4_tx_queue_setup(struct rte_eth_dev *dev,
 		goto fail;
 
 	tx_ring->ndev = dev;
+	/* reset queue */
+	tx_data = enetc4_txbdr_rd(&priv->hw.hw, tx_ring->index, ENETC_TBMR);
+	tx_data &= ~ENETC_TBMR_EN;
+	enetc4_txbdr_wr(&priv->hw.hw, tx_ring->index, ENETC_TBMR, tx_data);
 	enetc4_setup_txbdr(&priv->hw.hw, tx_ring);
 	data->tx_queues[queue_idx] = tx_ring;
 	tx_ring->tx_deferred_start = tx_conf->tx_deferred_start;
@@ -427,6 +432,7 @@ enetc4_rx_queue_setup(struct rte_eth_dev *dev,
 		     struct rte_mempool *mb_pool)
 {
 	int err = 0;
+	uint32_t rx_enable;
 	struct enetc_bdr *rx_ring;
 	struct rte_eth_dev_data *data =  dev->data;
 	struct enetc_eth_adapter *adapter =
@@ -450,6 +456,10 @@ enetc4_rx_queue_setup(struct rte_eth_dev *dev,
 		goto fail;
 
 	rx_ring->ndev = dev;
+	/* reset queue */
+	rx_enable = enetc4_rxbdr_rd(&adapter->hw.hw, rx_ring->index, ENETC_RBMR);
+	rx_enable &= ~ENETC_RBMR_EN;
+	enetc4_rxbdr_wr(&adapter->hw.hw, rx_ring->index, ENETC_RBMR, rx_enable);
 	enetc4_setup_rxbdr(&adapter->hw.hw, rx_ring, mb_pool);
 	data->rx_queues[rx_queue_id] = rx_ring;
 	rx_ring->rx_deferred_start = rx_conf->rx_deferred_start;
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 1/9] net/enetc: fix TX BD structure
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal, stable, Gagandeep Singh
In-Reply-To: <20260622113517.1616028-1-g.singh@nxp.com>

The flags field in struct enetc_tx_bd was declared as uint16_t but
ENETC4 TX BDs only use an 8-bit flags byte. Fix the type to uint8_t
to match the hardware descriptor layout.

Fixes: 696fa399d797 ("net/enetc: add PMD with basic operations")
Cc: stable@dpdk.org

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
---
 drivers/net/enetc/base/enetc_hw.h |  7 +++----
 drivers/net/enetc/enetc_rxtx.c    | 17 +++++++++--------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/enetc/base/enetc_hw.h b/drivers/net/enetc/base/enetc_hw.h
index 173d677..19efadd 100644
--- a/drivers/net/enetc/base/enetc_hw.h
+++ b/drivers/net/enetc/base/enetc_hw.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2024 NXP
+ * Copyright 2018-2026 NXP
  */
 
 #ifndef _ENETC_HW_H_
@@ -198,8 +198,7 @@ enum enetc_bdr_type {TX, RX};
 
 #define ENETC_TX_ADDR(txq, addr) ((void *)((txq)->enetc_txbdr + (addr)))
 
-#define ENETC_TXBD_FLAGS_IE		BIT(13)
-#define ENETC_TXBD_FLAGS_F		BIT(15)
+#define ENETC_TXBD_FLAGS_F		BIT(7)
 
 /* ENETC Parsed values (Little Endian) */
 #define ENETC_PARSE_ERROR		0x8000
@@ -262,7 +261,7 @@ struct enetc_tx_bd {
 			uint8_t l3t:1;
 			uint8_t resv:5;
 			uint8_t l4t:3;
-			uint16_t flags;
+			uint8_t flags;
 		};/* default layout */
 		uint32_t txstart;
 		uint32_t lstatus;
diff --git a/drivers/net/enetc/enetc_rxtx.c b/drivers/net/enetc/enetc_rxtx.c
index a2b8153..d3b98b3 100644
--- a/drivers/net/enetc/enetc_rxtx.c
+++ b/drivers/net/enetc/enetc_rxtx.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2024 NXP
+ * Copyright 2018-2026 NXP
  */
 
 #include <stdbool.h>
@@ -101,7 +101,7 @@ enetc_xmit_pkts(void *tx_queue,
 		tx_swbd = &tx_ring->q_swbd[i];
 		txbd->frm_len = tx_pkts[start]->pkt_len;
 		txbd->buf_len = txbd->frm_len;
-		txbd->flags = rte_cpu_to_le_16(ENETC_TXBD_FLAGS_F);
+		txbd->flags = ENETC_TXBD_FLAGS_F;
 		txbd->addr = (uint64_t)(uintptr_t)
 		rte_cpu_to_le_64((size_t)tx_swbd->buffer_addr->buf_iova +
 				 tx_swbd->buffer_addr->data_off);
@@ -133,13 +133,13 @@ enetc4_tx_offload_checksum(struct rte_mbuf *mbuf, struct enetc_tx_bd *txbd)
 		txbd->ipcs = ENETC4_TXBD_IPCS;
 		txbd->l3_start = mbuf->l2_len;
 		txbd->l3_hdr_size = mbuf->l3_len / 4;
-		txbd->flags |= rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_L_TX_CKSUM);
+		txbd->flags |= ENETC4_TXBD_FLAGS_L_TX_CKSUM;
 		if ((mbuf->ol_flags & RTE_MBUF_F_TX_UDP_CKSUM) == RTE_MBUF_F_TX_UDP_CKSUM) {
-			txbd->l4t = rte_cpu_to_le_16(ENETC4_TXBD_L4T_UDP);
-			txbd->flags |= rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_L4CS);
+			txbd->l4t = ENETC4_TXBD_L4T_UDP;
+			txbd->flags |= ENETC4_TXBD_FLAGS_L4CS;
 		} else if ((mbuf->ol_flags & RTE_MBUF_F_TX_TCP_CKSUM) == RTE_MBUF_F_TX_TCP_CKSUM) {
-			txbd->l4t = rte_cpu_to_le_16(ENETC4_TXBD_L4T_TCP);
-			txbd->flags |= rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_L4CS);
+			txbd->l4t = ENETC4_TXBD_L4T_TCP;
+			txbd->flags |= ENETC4_TXBD_FLAGS_L4CS;
 		}
 	}
 }
@@ -172,7 +172,7 @@ enetc_xmit_pkts_nc(void *tx_queue,
 			dcbf(data + j);
 
 		txbd = ENETC_TXBD(*tx_ring, i);
-		txbd->flags = rte_cpu_to_le_16(ENETC4_TXBD_FLAGS_F);
+		txbd->flags = 0;
 		if (tx_ring->q_swbd[i].buffer_addr->ol_flags & ENETC4_TX_CKSUM_OFFLOAD_MASK)
 			enetc4_tx_offload_checksum(tx_ring->q_swbd[i].buffer_addr, txbd);
 
@@ -182,6 +182,7 @@ enetc_xmit_pkts_nc(void *tx_queue,
 		txbd->addr = (uint64_t)(uintptr_t)
 		rte_cpu_to_le_64((size_t)tx_swbd->buffer_addr->buf_iova +
 				 tx_swbd->buffer_addr->data_off);
+		txbd->flags |= ENETC4_TXBD_FLAGS_F;
 		i++;
 		start++;
 		if (unlikely(i == tx_ring->bd_count))
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 0/9] ENETC driver related changes series
From: Gagandeep Singh @ 2026-06-22 11:35 UTC (permalink / raw)
  To: dev; +Cc: hemant.agrawal
In-Reply-To: <20260619184427.522518-1-g.singh@nxp.com>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 2320 bytes --]

V2 changes:
  - Fixed an un-used variable compilation issue reported on fedora:43-gcc-minsize
  - Fixed various AI reported issues:
	- Release notes updated for all new devargs
	- enect4.ini features doc updated for scattered RX.
	- removed Not required RTE_PTYPE_UNKNOWN.
	- Fixed mid-frame mbuf leak in SG case.
	- Enabled SG for enetc4 PF also.
	- move to calloc from rte_zmalloc in parse_txq_prior().
	- added vaidation checks on strdup, strtoul.
	- added NC devargs to use cacheable ops conditionally.
	- removed dead code like bd_base_p etc.
	- Fixed rte_cpu_to_le_16() conversion on flags and combined
	  all flags related patches in one patch.
	- Fixed memory leak issue due to TXQ priority patch.
   - There were some false positives, I have ignored them:
	Race condition on flags field:
		clean_tx_ring only touches HW-completed BDs (next_to_clean→hwci),
		never newly-submitted BDs; doorbell hasn't fired yet.
	Missing dcbf in clean_tx_ring:
		DPDK is single-threaded per queue; TX path always overwrites
		flags completely before dcbf.
	TX dcbf granularity with wrap:
		Safe (AI admits it).
	RX refill flush at wrap:
		In-loop dcbf at i & mask == 0 already flushes aligned groups;
		trailing flush only needed for partial groups.
	RX reading before invalidate:
		dccivac precedes the read for every group in the loop

Gagandeep Singh (7):
  net/enetc: fix TX BD structure
  net/enetc: fix queue initialization
  net/enetc: support ESP packet type in packet parsing
  net/enetc: update random MAC generation code
  net/enetc: add option to disable VSI messaging
  net/enetc: add devargs to control VSI-PSI timeout and delay
  net/enetc4: add cacheable BD ring support with SW cache maintenance

Vanshika Shukla (2):
  net/enetc: support scatter-gather
  net/enetc: set user configurable priority to TX rings

 doc/guides/nics/features/enetc4.ini    |   1 +
 doc/guides/rel_notes/release_26_07.rst |  10 +
 drivers/net/enetc/base/enetc_hw.h      |  13 +-
 drivers/net/enetc/enetc.h              |  31 +-
 drivers/net/enetc/enetc4_ethdev.c      | 172 ++++++++--
 drivers/net/enetc/enetc4_vf.c          | 204 ++++++++++--
 drivers/net/enetc/enetc_ethdev.c       |  25 +-
 drivers/net/enetc/enetc_rxtx.c         | 430 ++++++++++++++++++++++---
 8 files changed, 768 insertions(+), 118 deletions(-)

-- 
2.25.1


^ permalink raw reply

* [PATCH 3/3] test/security_inline_proto: check for capabilities
From: Bruce Richardson @ 2026-06-22 11:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, stable, Akhil Goyal, Anoob Joseph,
	Nithin Dabilpuram, Fan Zhang
In-Reply-To: <20260622111835.233554-1-bruce.richardson@intel.com>

Skip the test cases when the HW doesn't support the necessary offloads,
skipping the whole suite if necessary, e.g. if no IPsec capabilities are
present. For event-based tests, skip those if no eventdev is present.

Fixes: 86e2487c5f2c ("test/security: add cases for inline IPsec offload")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_security_inline_proto.c | 57 ++++++++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/app/test/test_security_inline_proto.c b/app/test/test_security_inline_proto.c
index 81fce7364c..2e1ee17078 100644
--- a/app/test/test_security_inline_proto.c
+++ b/app/test/test_security_inline_proto.c
@@ -2043,10 +2043,25 @@ inline_ipsec_testsuite_setup(void)
 	}
 
 	memcpy(&local_port_conf, &port_conf, sizeof(port_conf));
+	local_port_conf.rxmode.offloads &= dev_info.rx_offload_capa;
+	local_port_conf.txmode.offloads &= dev_info.tx_offload_capa;
+
+	if (!(local_port_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_SECURITY) ||
+	    !(local_port_conf.txmode.offloads & RTE_ETH_TX_OFFLOAD_SECURITY)) {
+		printf("Inline IPsec unsupported: required security offloads are missing\n");
+		return TEST_SKIPPED;
+	}
+
 	/* Add Multi seg flags */
 	if (sg_mode) {
 		uint32_t max_data_room;
 
+		if (!(dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_SCATTER) ||
+		    !(dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MULTI_SEGS)) {
+			printf("SG mode unsupported: required scatter or multi-seg offloads are missing\n");
+			return TEST_SKIPPED;
+		}
+
 		if (dev_info.rx_desc_lim.nb_seg_max == 0) {
 			printf("SG mode unsupported: invalid max Rx segments (0)\n");
 			return TEST_SKIPPED;
@@ -2112,6 +2127,25 @@ inline_ipsec_testsuite_setup(void)
 		plaintext_len = 0;
 	}
 
+	/* Check that at least one inline IPsec capability is registered */
+	void *sec_ctx = rte_eth_dev_get_sec_ctx(port_id);
+	const struct rte_security_capability *cap;
+
+	if (sec_ctx == NULL) {
+		printf("No security context on port %d\n", port_id);
+		return TEST_SKIPPED;
+	}
+	for (cap = rte_security_capabilities_get(sec_ctx);
+			cap != NULL && cap->action != RTE_SECURITY_ACTION_TYPE_NONE; cap++) {
+		if (cap->action == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL &&
+				cap->protocol == RTE_SECURITY_PROTOCOL_IPSEC)
+			break;
+	}
+	if (cap == NULL || cap->action == RTE_SECURITY_ACTION_TYPE_NONE) {
+		printf("No inline IPsec capabilities registered\n");
+		return TEST_SKIPPED;
+	}
+
 	return 0;
 }
 
@@ -2140,6 +2174,8 @@ event_inline_ipsec_testsuite_setup(void)
 	struct rte_event_dev_config eventdev_conf = {0};
 	struct rte_event_queue_conf eventq_conf = {0};
 	struct rte_event_port_conf ev_port_conf = {0};
+	struct rte_eth_conf local_port_conf;
+	struct rte_eth_dev_info dev_info;
 	const uint16_t nb_txd = 1024, nb_rxd = 1024;
 	uint16_t nb_rx_queue = 1, nb_tx_queue = 1;
 	uint8_t ev_queue_id = 0, tx_queue_id = 0;
@@ -2151,6 +2187,11 @@ event_inline_ipsec_testsuite_setup(void)
 
 	printf("Start event inline IPsec test.\n");
 
+	if (rte_event_dev_count() == 0) {
+		printf("Event inline IPsec unsupported: no event devices available\n");
+		return TEST_SKIPPED;
+	}
+
 	nb_ports = rte_eth_dev_count_avail();
 	if (nb_ports == 0) {
 		printf("Test require: 1 port, available: 0\n");
@@ -2182,9 +2223,23 @@ event_inline_ipsec_testsuite_setup(void)
 
 	/* configuring port 0 for the test is enough */
 	port_id = 0;
+	if (rte_eth_dev_info_get(port_id, &dev_info)) {
+		printf("Failed to get devinfo");
+		return -1;
+	}
+
+	memcpy(&local_port_conf, &port_conf, sizeof(port_conf));
+	local_port_conf.rxmode.offloads &= dev_info.rx_offload_capa;
+	local_port_conf.txmode.offloads &= dev_info.tx_offload_capa;
+	if ((local_port_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_SECURITY) == 0 ||
+	    (local_port_conf.txmode.offloads & RTE_ETH_TX_OFFLOAD_SECURITY) == 0) {
+		printf("Event inline IPsec unsupported: required security offloads are missing\n");
+		return TEST_SKIPPED;
+	}
+
 	/* port configure */
 	ret = rte_eth_dev_configure(port_id, nb_rx_queue,
-				    nb_tx_queue, &port_conf);
+				    nb_tx_queue, &local_port_conf);
 	if (ret < 0) {
 		printf("Cannot configure device: err=%d, port=%d\n",
 			 ret, port_id);
-- 
2.53.0


^ permalink raw reply related

* [PATCH 2/3] test/security_inline_proto: fix MTU calculation underflow
From: Bruce Richardson @ 2026-06-22 11:18 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, stable, Akhil Goyal, Anoob Joseph,
	Nithin Dabilpuram
In-Reply-To: <20260622111835.233554-1-bruce.richardson@intel.com>

When testing with some NICs it was observed that the max MTU calculation
could underflow. Despite the RTE_MIN, due to integer promotion, this
lead to the configuring of the port with an excessively large, invalid
value. For example, if lim_nb_seg_max == 0, that meant that
max_data_room - 256 is -256 for comparison purposes using standard
integers (promoted from uint16_t). That is lower than the max mtu so the
-256 value is chosen, which becomes >65000 when converted back to
uint16_t.

Fix this by a) checking for seg_max == 0 b) doing calculations using
uint32_t and c) checking explicitly for underflow before subtracting.

Fixes: 3edd1197a605 ("test/security: add multi-segment inline IPsec cases")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_security_inline_proto.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/app/test/test_security_inline_proto.c b/app/test/test_security_inline_proto.c
index bc3ef54f71..81fce7364c 100644
--- a/app/test/test_security_inline_proto.c
+++ b/app/test/test_security_inline_proto.c
@@ -2045,8 +2045,19 @@ inline_ipsec_testsuite_setup(void)
 	memcpy(&local_port_conf, &port_conf, sizeof(port_conf));
 	/* Add Multi seg flags */
 	if (sg_mode) {
-		uint16_t max_data_room = RTE_MBUF_DEFAULT_DATAROOM *
-			dev_info.rx_desc_lim.nb_seg_max;
+		uint32_t max_data_room;
+
+		if (dev_info.rx_desc_lim.nb_seg_max == 0) {
+			printf("SG mode unsupported: invalid max Rx segments (0)\n");
+			return TEST_SKIPPED;
+		}
+
+		max_data_room = RTE_MBUF_DEFAULT_DATAROOM * dev_info.rx_desc_lim.nb_seg_max;
+		if (max_data_room <= 256) {
+			printf("SG mode unsupported: max data room (%u) too small\n",
+				max_data_room);
+			return TEST_SKIPPED;
+		}
 
 		local_port_conf.rxmode.offloads |= RTE_ETH_RX_OFFLOAD_SCATTER;
 		local_port_conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
-- 
2.53.0


^ permalink raw reply related

* [PATCH 1/3] test/security_inline_proto: remove fast-free Tx flag
From: Bruce Richardson @ 2026-06-22 11:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, stable, Akhil Goyal, Anoob Joseph,
	Nithin Dabilpuram, Fan Zhang
In-Reply-To: <20260622111835.233554-1-bruce.richardson@intel.com>

The FAST_FREE Tx offload flag is an optimization that may not be
supported by all drivers, but the test unconditionally sets this as part
of the Tx config. Since it's an optimization that should not affect test
correctness, and since this is a unit test, not perf test, just remove
the flag.

Fixes: 86e2487c5f2c ("test/security: add cases for inline IPsec offload")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_security_inline_proto.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/app/test/test_security_inline_proto.c b/app/test/test_security_inline_proto.c
index b0cce5ebd9..bc3ef54f71 100644
--- a/app/test/test_security_inline_proto.c
+++ b/app/test/test_security_inline_proto.c
@@ -94,8 +94,7 @@ static struct rte_eth_conf port_conf = {
 	},
 	.txmode = {
 		.mq_mode = RTE_ETH_MQ_TX_NONE,
-		.offloads = RTE_ETH_TX_OFFLOAD_SECURITY |
-			    RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE,
+		.offloads = RTE_ETH_TX_OFFLOAD_SECURITY,
 	},
 	.lpbk_mode = 1,  /* enable loopback */
 };
-- 
2.53.0


^ permalink raw reply related

* [PATCH 0/3] Fixes for inline ipsec test cases
From: Bruce Richardson @ 2026-06-22 11:18 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson

A series of small fixes for the inline ipsec test cases, with the third
patch being the most significant - skipping tests where we don't have the
necessary HW support to run the tests. Previously this was leading to test
failures when the tests should never have been run due to missing HW
capabilities.

Bruce Richardson (3):
  test/security_inline_proto: remove fast-free Tx flag
  test/security_inline_proto: fix MTU calculation underflow
  test/security_inline_proto: check for capabilities

 app/test/test_security_inline_proto.c | 75 +++++++++++++++++++++++++--
 1 file changed, 70 insertions(+), 5 deletions(-)

--
2.53.0


^ permalink raw reply

* [PATCH v9 21/21] net/txgbe: fix temperature track for AML NIC
From: Zaiyu Wang @ 2026-06-22 11:11 UTC (permalink / raw)
  To: dev; +Cc: Zaiyu Wang, stable, Jiawen Wu
In-Reply-To: <20260622111111.21024-1-zaiyuwang@trustnetic.com>

Previously, temperature tracking for the amlite NIC was handled by
firmware together with the hardware setup. However, the firmware-based
PHY configuration has proven to be unstable.

Re-add the temperature tracking function directly in the driver and
invoke it periodically to ensure the PHY remains calibrated. According
to the hardware recommendation, the tracking sequence should be run at
least every 100 ms to keep temperature drift within 5 °C. Considering
the software and hardware overhead, a 2-second interval is used as a
practical trade-off that still meets stability requirements while
minimizing performance impact.

The periodic tracking is implemented using a timer in the driver, and
the sequence itself is the same as the one originally performed during
link setup.

Fixes: fb6eb170dfa2 ("net/txgbe: add basic link configuration for Amber-Lite")
Cc: stable@dpdk.org

Signed-off-by: Zaiyu Wang <zaiyuwang@trustnetic.com>
---
 drivers/net/txgbe/txgbe_ethdev.c | 44 +++++++++++++++++++++++++++++++-
 drivers/net/txgbe/txgbe_ethdev.h |  1 +
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/net/txgbe/txgbe_ethdev.c b/drivers/net/txgbe/txgbe_ethdev.c
index 4691a562d8..308963b616 100644
--- a/drivers/net/txgbe/txgbe_ethdev.c
+++ b/drivers/net/txgbe/txgbe_ethdev.c
@@ -2011,8 +2011,10 @@ txgbe_dev_start(struct rte_eth_dev *dev)
 	txgbe_filter_restore(dev);
 
 	hw->bp_event_interval = 100 * 1000;
-	if (hw->mac.type == txgbe_mac_aml || hw->mac.type == txgbe_mac_aml40)
+	if (hw->mac.type == txgbe_mac_aml || hw->mac.type == txgbe_mac_aml40) {
 		rte_eal_alarm_set(hw->bp_event_interval, txgbe_dev_e56_check_bp_event, dev);
+		rte_eal_alarm_set(1000 * 1000 * 2, txgbe_dev_check_aml_temp_event, dev);
+	}
 
 	if (tm_conf->root && !tm_conf->committed)
 		PMD_DRV_LOG(WARNING,
@@ -2060,6 +2062,7 @@ txgbe_dev_stop(struct rte_eth_dev *dev)
 
 	if (hw->mac.type == txgbe_mac_aml || hw->mac.type == txgbe_mac_aml40) {
 		rte_eal_alarm_cancel(txgbe_dev_e56_check_bp_event, dev);
+		rte_eal_alarm_cancel(txgbe_dev_check_aml_temp_event, dev);
 		rte_eal_alarm_cancel(txgbe_dev_setup_link_alarm_handler_aml, hw);
 	}
 
@@ -2932,6 +2935,45 @@ txgbe_dev_supported_ptypes_get(struct rte_eth_dev *dev, size_t *no_of_elements)
 	return NULL;
 }
 
+void txgbe_dev_check_aml_temp_event(void *param)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+	struct txgbe_hw *hw = TXGBE_DEV_HW(dev);
+	uint32_t link_speed = 0, val = 0;
+	s32 status = 0;
+	int temp;
+
+	if (hw == NULL)
+		return;
+
+	status = txgbe_e56_get_temp(hw, &temp);
+	if (status)
+		temp = DEFAULT_TEMP;
+
+	if (!(temp - hw->temperature > 4 ||
+		hw->temperature - temp > 4))
+		goto out;
+
+	hw->temperature = temp;
+	val = rd32(hw, TXGBE_PORT);
+	if (val & TXGBE_AMLITE_LED_LINK_40G)
+		link_speed = TXGBE_LINK_SPEED_40GB_FULL;
+	else if (val & TXGBE_AMLITE_LED_LINK_25G)
+		link_speed = TXGBE_LINK_SPEED_25GB_FULL;
+	else
+		link_speed = TXGBE_LINK_SPEED_10GB_FULL;
+
+	rte_spinlock_lock(&hw->phy_lock);
+	if (hw->mac.type == txgbe_mac_aml)
+		txgbe_temp_track_seq(hw, link_speed);
+	else if (hw->mac.type == txgbe_mac_aml40)
+		txgbe_temp_track_seq_40g(hw, link_speed);
+	rte_spinlock_unlock(&hw->phy_lock);
+
+out:
+	rte_eal_alarm_set(1000 * 1000 * 2, txgbe_dev_check_aml_temp_event, dev);
+}
+
 void txgbe_dev_e56_check_bp_event(void *param)
 {
 	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
diff --git a/drivers/net/txgbe/txgbe_ethdev.h b/drivers/net/txgbe/txgbe_ethdev.h
index 309db3bfe9..c32c61d8bf 100644
--- a/drivers/net/txgbe/txgbe_ethdev.h
+++ b/drivers/net/txgbe/txgbe_ethdev.h
@@ -747,5 +747,6 @@ void txgbe_vlan_hw_strip_bitmap_set(struct rte_eth_dev *dev,
 		uint16_t queue, bool on);
 void txgbe_config_vlan_strip_on_all_queues(struct rte_eth_dev *dev,
 						  int mask);
+void txgbe_dev_check_aml_temp_event(void *param);
 void txgbe_dev_e56_check_bp_event(void *param);
 #endif /* _TXGBE_ETHDEV_H_ */
-- 
2.21.0.windows.1


^ permalink raw reply related

* [PATCH v9 20/21] net/txgbe: fix to enable Tx desc check
From: Zaiyu Wang @ 2026-06-22 11:11 UTC (permalink / raw)
  To: dev; +Cc: Zaiyu Wang, stable, Jiawen Wu
In-Reply-To: <20260622111111.21024-1-zaiyuwang@trustnetic.com>

Now lib security is enabled by default, and cannot be disabled if the
driver is intended to be used. So Tdm_desc_chk is always unable to enable.
Remove this restriction, and just enable the corresponding queue check.

Fixes: 0eabdfcd4af4 ("net/txgbe: enable Tx descriptor error interrupt")
Cc: stable@dpdk.org

Signed-off-by: Zaiyu Wang <zaiyuwang@trustnetic.com>
---
 drivers/net/txgbe/txgbe_rxtx.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/txgbe/txgbe_rxtx.c b/drivers/net/txgbe/txgbe_rxtx.c
index 0a9fc3ddfd..f51c6193a9 100644
--- a/drivers/net/txgbe/txgbe_rxtx.c
+++ b/drivers/net/txgbe/txgbe_rxtx.c
@@ -4761,6 +4761,12 @@ txgbe_dev_tx_init(struct rte_eth_dev *dev)
 		wr32(hw, TXGBE_TXRP(txq->reg_idx), 0);
 		wr32(hw, TXGBE_TXWP(txq->reg_idx), 0);
 
+#ifdef RTE_LIB_SECURITY
+		if (!txq->using_ipsec)
+#endif
+			wr32m(hw, TXGBE_TDM_DESC_CHK(txq->reg_idx / 32),
+			      RTE_BIT32(txq->reg_idx % 32), RTE_BIT32(txq->reg_idx % 32));
+
 		if (txq->headwb_mem) {
 			uint32_t txdctl;
 
@@ -4778,11 +4784,6 @@ txgbe_dev_tx_init(struct rte_eth_dev *dev)
 		}
 	}
 
-#ifndef RTE_LIB_SECURITY
-	for (i = 0; i < 4; i++)
-		wr32(hw, TXGBE_TDM_DESC_CHK(i), 0xFFFFFFFF);
-#endif
-
 	/* Device configured with multiple TX queues. */
 	txgbe_dev_mq_tx_configure(dev);
 }
-- 
2.21.0.windows.1


^ permalink raw reply related

* [PATCH v9 19/21] net/txgbe: fix to reset Tx write-back pointer
From: Zaiyu Wang @ 2026-06-22 11:11 UTC (permalink / raw)
  To: dev; +Cc: Zaiyu Wang, stable, Jiawen Wu
In-Reply-To: <20260622111111.21024-1-zaiyuwang@trustnetic.com>

The write-back pointer was not reset when the Tx queue was reset. This
leads to the wrong Tx desc free logic. Move the resetting of pointer into
txq->ops->reset(txq).

Fixes: 8ada71d0bb7f ("net/txgbe: add Tx head write-back mode for Amber-Lite")
Cc: stable@dpdk.org

Signed-off-by: Zaiyu Wang <zaiyuwang@trustnetic.com>
---
 drivers/net/txgbe/txgbe_rxtx.c            | 45 +++++++++++++----------
 drivers/net/txgbe/txgbe_rxtx.h            |  1 +
 drivers/net/txgbe/txgbe_rxtx_vec_common.h |  7 ++++
 3 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/drivers/net/txgbe/txgbe_rxtx.c b/drivers/net/txgbe/txgbe_rxtx.c
index 3525b708f8..0a9fc3ddfd 100644
--- a/drivers/net/txgbe/txgbe_rxtx.c
+++ b/drivers/net/txgbe/txgbe_rxtx.c
@@ -2313,6 +2313,12 @@ txgbe_reset_tx_queue(struct txgbe_tx_queue *txq)
 	txq->tx_next_dd = (uint16_t)(txq->tx_free_thresh - 1);
 	txq->tx_tail = 0;
 
+	/* Zero out headwb_mem memory */
+	if (txq->headwb_mem) {
+		for (i = 0; i < txq->headwb_size; i++)
+			txq->headwb_mem[i] = 0;
+	}
+
 	/*
 	 * Always allow 1 descriptor to be un-allocated to avoid
 	 * a H/W race condition
@@ -2412,7 +2418,7 @@ txgbe_get_tx_port_offloads(struct rte_eth_dev *dev)
 	return tx_offload_capa;
 }
 
-static int
+static void
 txgbe_setup_headwb_resources(struct rte_eth_dev *dev,
 					void *tx_queue,
 					unsigned int socket_id)
@@ -2420,33 +2426,33 @@ txgbe_setup_headwb_resources(struct rte_eth_dev *dev,
 	struct txgbe_hw *hw = TXGBE_DEV_HW(dev);
 	const struct rte_memzone *headwb;
 	struct txgbe_tx_queue *txq = tx_queue;
-	u8 i, headwb_size = 0;
+	u8 headwb_size = 0;
 
-	if (hw->mac.type != txgbe_mac_aml && hw->mac.type != txgbe_mac_aml40) {
-		txq->headwb_mem = NULL;
-		return 0;
-	}
+	if (hw->mac.type != txgbe_mac_aml && hw->mac.type != txgbe_mac_aml40)
+		goto out;
+
+	if (!hw->devarg.tx_headwb)
+		goto out;
 
-	headwb_size = hw->devarg.tx_headwb_size;
+	headwb_size = txq->headwb_size;
 	headwb = rte_eth_dma_zone_reserve(dev, "tx_headwb_mem", txq->queue_id,
 			sizeof(u32) * headwb_size,
 			TXGBE_ALIGN, socket_id);
 
 	if (headwb == NULL) {
-		DEBUGOUT("Fail to setup headwb resources: no mem");
-		txgbe_tx_queue_release(txq);
-		return -ENOMEM;
+		PMD_DRV_LOG(INFO,
+			    "Failed to allocate headwb memory for Tx queue %u, change to SP mode",
+			    txq->queue_id);
+		goto out;
 	}
 
 	txq->headwb = headwb;
 	txq->headwb_dma = TMZ_PADDR(headwb);
 	txq->headwb_mem = (uint32_t *)TMZ_VADDR(headwb);
+	return;
 
-	/* Zero out headwb_mem memory */
-	for (i = 0; i < headwb_size; i++)
-		txq->headwb_mem[i] = 0;
-
-	return 0;
+out:
+	txq->headwb_mem = NULL;
 }
 
 int __rte_cold
@@ -2542,6 +2548,7 @@ txgbe_dev_tx_queue_setup(struct rte_eth_dev *dev,
 	txq->offloads = offloads;
 	txq->ops = &def_txq_ops;
 	txq->tx_deferred_start = tx_conf->tx_deferred_start;
+	txq->headwb_size = hw->devarg.tx_headwb_size;
 #ifdef RTE_LIB_SECURITY
 	txq->using_ipsec = !!(dev->data->dev_conf.txmode.offloads &
 			RTE_ETH_TX_OFFLOAD_SECURITY);
@@ -2577,8 +2584,7 @@ txgbe_dev_tx_queue_setup(struct rte_eth_dev *dev,
 	/* set up scalar TX function as appropriate */
 	txgbe_set_tx_function(dev, txq);
 
-	if (hw->devarg.tx_headwb)
-		err = txgbe_setup_headwb_resources(dev, txq, socket_id);
+	txgbe_setup_headwb_resources(dev, txq, socket_id);
 
 	txq->ops->reset(txq);
 	txq->desc_error = 0;
@@ -4755,15 +4761,14 @@ txgbe_dev_tx_init(struct rte_eth_dev *dev)
 		wr32(hw, TXGBE_TXRP(txq->reg_idx), 0);
 		wr32(hw, TXGBE_TXWP(txq->reg_idx), 0);
 
-		if ((hw->mac.type == txgbe_mac_aml || hw->mac.type == txgbe_mac_aml40) &&
-		     hw->devarg.tx_headwb) {
+		if (txq->headwb_mem) {
 			uint32_t txdctl;
 
 			wr32(hw, TXGBE_PX_TR_HEAD_ADDRL(txq->reg_idx),
 				(uint32_t)(txq->headwb_dma & BIT_MASK32));
 			wr32(hw, TXGBE_PX_TR_HEAD_ADDRH(txq->reg_idx),
 				(uint32_t)(txq->headwb_dma >> 32));
-			if (hw->devarg.tx_headwb_size == 16)
+			if (txq->headwb_size == 16)
 				txdctl = TXGBE_PX_TR_CFG_HEAD_WB |
 					 TXGBE_PX_TR_CFG_HEAD_WB_64BYTE;
 			else
diff --git a/drivers/net/txgbe/txgbe_rxtx.h b/drivers/net/txgbe/txgbe_rxtx.h
index 43c818cfbf..5d2e33a8d4 100644
--- a/drivers/net/txgbe/txgbe_rxtx.h
+++ b/drivers/net/txgbe/txgbe_rxtx.h
@@ -416,6 +416,7 @@ struct txgbe_tx_queue {
 	uint64_t	    desc_error;
 	bool		    resetting;
 	const struct rte_memzone *headwb;
+	uint16_t             headwb_size;
 	uint64_t             headwb_dma;
 	volatile uint32_t    *headwb_mem;
 };
diff --git a/drivers/net/txgbe/txgbe_rxtx_vec_common.h b/drivers/net/txgbe/txgbe_rxtx_vec_common.h
index 77d7ff785b..6e561aff30 100644
--- a/drivers/net/txgbe/txgbe_rxtx_vec_common.h
+++ b/drivers/net/txgbe/txgbe_rxtx_vec_common.h
@@ -252,6 +252,13 @@ _txgbe_reset_tx_queue_vec(struct txgbe_tx_queue *txq)
 	txq->tx_next_dd = (uint16_t)(txq->tx_free_thresh - 1);
 
 	txq->tx_tail = 0;
+
+	/* Zero out headwb_mem memory */
+	if (txq->headwb_mem) {
+		for (i = 0; i < txq->headwb_size; i++)
+			txq->headwb_mem[i] = 0;
+	}
+
 	/*
 	 * Always allow 1 descriptor to be un-allocated to avoid
 	 * a H/W race condition
-- 
2.21.0.windows.1


^ permalink raw reply related

* [PATCH v9 18/21] net/txgbe: fix get EEPROM operation
From: Zaiyu Wang @ 2026-06-22 11:11 UTC (permalink / raw)
  To: dev; +Cc: Zaiyu Wang, stable, Jiawen Wu
In-Reply-To: <20260622111111.21024-1-zaiyuwang@trustnetic.com>

The original I2C access flow in the module information retrieval
process was flawed. Correct the implementation to properly fetch
module info.

Fixes: abf042d32b39 ("net/txgbe: add Amber-Lite 25G/40G NICs")
Cc: stable@dpdk.org

Signed-off-by: Zaiyu Wang <zaiyuwang@trustnetic.com>
---
 drivers/net/txgbe/base/txgbe_phy.h |  1 +
 drivers/net/txgbe/txgbe_ethdev.c   | 81 +++++++++++++++++++++++++++---
 2 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/drivers/net/txgbe/base/txgbe_phy.h b/drivers/net/txgbe/base/txgbe_phy.h
index 31bdceb35b..a5df015a4d 100644
--- a/drivers/net/txgbe/base/txgbe_phy.h
+++ b/drivers/net/txgbe/base/txgbe_phy.h
@@ -245,6 +245,7 @@
 /* EEPROM (dev_addr = 0xA0) */
 #define TXGBE_I2C_EEPROM_DEV_ADDR	0xA0
 #define TXGBE_SFF_IDENTIFIER		0x00
+#define TXGBE_SFF_8636_STATUS_OFFSET	0x02
 #define TXGBE_SFF_IDENTIFIER_SFP	0x03
 #define TXGBE_SFF_VENDOR_OUI_BYTE0	0x25
 #define TXGBE_SFF_VENDOR_OUI_BYTE1	0x26
diff --git a/drivers/net/txgbe/txgbe_ethdev.c b/drivers/net/txgbe/txgbe_ethdev.c
index c02ba6c517..4691a562d8 100644
--- a/drivers/net/txgbe/txgbe_ethdev.c
+++ b/drivers/net/txgbe/txgbe_ethdev.c
@@ -5462,23 +5462,92 @@ txgbe_get_module_eeprom(struct rte_eth_dev *dev,
 	uint8_t databyte = 0xFF;
 	uint8_t *data = info->data;
 	uint32_t i = 0;
+	bool is_sfp = false;
+	uint32_t value;
+	uint8_t identifier = 0;
+	uint16_t offset;
+	uint8_t page = 0;
+	bool is_flat_mem = false;
+
+	if (hw->mac.type == txgbe_mac_aml40) {
+		value = rd32(hw, TXGBE_GPIOEXT);
+		if (value & TXGBE_SFP1_MOD_PRST_LS)
+			return -EIO;
+	}
+
+	if (hw->mac.type == txgbe_mac_aml) {
+		value = rd32(hw, TXGBE_GPIOEXT);
+		if (value & TXGBE_SFP1_MOD_ABS_LS)
+			return -EIO;
+	}
 
 	if (info->length == 0)
 		return -EINVAL;
 
-	for (i = info->offset; i < info->offset + info->length; i++) {
-		if (i < RTE_ETH_MODULE_SFF_8079_LEN)
-			status = hw->phy.read_i2c_eeprom(hw, i, &databyte);
-		else
-			status = hw->phy.read_i2c_sff8472(hw, i, &databyte);
+	status = hw->mac.acquire_swfw_sync(hw, TXGBE_MNGSEM_SWPHY);
+	if (status)
+		return -EBUSY;
+
+	status = hw->phy.read_i2c_eeprom(hw,
+					     TXGBE_SFF_IDENTIFIER,
+					     &identifier);
+	if (status != 0)
+		goto ERROR_IO;
 
+	if (identifier == TXGBE_SFF_IDENTIFIER_SFP) {
+		is_sfp = true;
+	} else {
+		uint8_t rdata = 0;
+
+		status = hw->phy.read_i2c_sff8636(hw, 0,
+						  TXGBE_SFF_8636_STATUS_OFFSET,
+						  &rdata);
 		if (status != 0)
-			return -EIO;
+			goto ERROR_IO;
 
+		if (rdata & 0x4)
+			is_flat_mem = true;
+	}
+
+	memset(data, 0, info->length);
+
+	for (i = info->offset; i < info->offset + info->length; i++) {
+		databyte = 0;
+
+		if (is_sfp) {
+			if (i < RTE_ETH_MODULE_SFF_8079_LEN)
+				status = hw->phy.read_i2c_eeprom(hw, i,
+					       &databyte);
+			else
+				status = hw->phy.read_i2c_sff8472(hw, i,
+					       &databyte);
+
+			if (status != 0)
+				goto ERROR_IO;
+		} else {
+			offset = i;
+			page = 0;
+			while (offset >= RTE_ETH_MODULE_SFF_8436_LEN) {
+				offset -= RTE_ETH_MODULE_SFF_8436_LEN / 2;
+				page++;
+			}
+			if (page == 0 || !is_flat_mem) {
+				status = hw->phy.read_i2c_sff8636(hw, page, offset,
+					       &databyte);
+				if (status != 0)
+					goto ERROR_IO;
+			}
+		}
 		data[i - info->offset] = databyte;
 	}
 
+	hw->mac.release_swfw_sync(hw, TXGBE_MNGSEM_SWPHY);
 	return 0;
+
+ERROR_IO:
+	PMD_DRV_LOG(ERR, "I2C IO ERROR.");
+	hw->mac.release_swfw_sync(hw, TXGBE_MNGSEM_SWPHY);
+	return -EIO;
 }
 
 bool
-- 
2.21.0.windows.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox