* Re: [PATCH v2 0/7] Intel network drivers enhancements
From: Stephen Hemminger @ 2026-06-18 15:45 UTC
To: Dawid Wesierski; +Cc: dev, thomas, david.marchand, Marek Kasiewicz
In-Reply-To: <20260618144442.312844-1-dawid.wesierski@intel.com>
On Thu, 18 Jun 2026 10:44:35 -0400
Dawid Wesierski <dawid.wesierski@intel.com> wrote:
> From: Marek Kasiewicz <marek.kasiewicz@intel.com>
>
> This series introduces several improvements to Intel iavf and ice
> drivers, including a new ethdev API for header split mbuf callbacks,
> increased ring descriptors, and improved PTP timestamping.
>
> Marek Kasiewicz (7):
> ethdev: add header split mbuf callback API
> net/iavf: increase max ring descriptors to hardware limit
> net/iavf: allow runtime queue rate limit configuration
> net/ice/base: reduce default scheduler burst size
> net/ice: timestamp all received packets when PTP is enabled
> net/iavf: disable runtime queue setup capability
> net/intel: support header split mbuf callback
>
> drivers/net/intel/common/rx.h | 2 +
> drivers/net/intel/iavf/iavf_ethdev.c | 3 --
> drivers/net/intel/iavf/iavf_rxtx.h | 2 +-
> drivers/net/intel/iavf/iavf_tm.c | 11 ++--
> drivers/net/intel/ice/base/ice_type.h | 2 +-
> drivers/net/intel/ice/ice_ethdev.c | 1 +
> drivers/net/intel/ice/ice_rxtx.c | 72 ++++++++++++++++++++++++---
> drivers/net/intel/ice/ice_rxtx.h | 2 +
> lib/ethdev/ethdev_driver.h | 15 +++++++
> lib/ethdev/rte_ethdev.c | 51 ++++++++++++++++++++++
> lib/ethdev/rte_ethdev.h | 7 +++
> 11 files changed, 153 insertions(+), 15 deletions(-)
>
This looks interesting but I don't understand the motivation
or use case for the new code. There is no documentation, examples
or test integration. At this point the patch is in pure RFC state.
More wordy answer from AI...
Documentation and motivation
This adds a new public ethdev RX mechanism but ships no prose
rationale, no documentation, and no worked example. For a new API
the series needs to make the case before the code can be evaluated.
Missing documentation:
- No prog_guide section. A new RX model -- header split where the
application supplies the payload buffer -- needs a write-up (e.g.
in doc/guides/prog_guide) covering the buffer-lifetime contract,
the IOVA/headroom semantics, which offloads must be enabled
(RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT), and what happens on a PMD that
does not implement the hook.
- No example or testpmd integration. There is no end-to-end flow a
reviewer can read or run: register the callback, supply buffers,
receive, consume, recycle. Without one there is no demonstrated
use case and no way to validate the design (the mempool-corruption
and headroom issues in 7/7 would likely have surfaced in a working
example).
Missing motivation:
- The actual use case is not named. "Zero-copy RX into mapped frame
buffers with known IOVA" describes a mechanism, not a user. Who
needs this -- an AF_XDP-style external umem, a GPU/DMA target
(gpudev), something else? State it.
- The benefit is unquantified. The claim is avoiding "an extra
memcpy from the mempool mbuf," but the patch replaces that with a
per-allocation indirect callback on the Rx fast path
(ice_recv_pkts, ice_rx_alloc_bufs). The rationale should show the
memcpy actually exists in the path being optimized and that the
indirect call per buffer is a net win.
- It does not explain why existing mechanisms are insufficient. DPDK
already lets an application own payload memory via
rte_pktmbuf_attach_extbuf() and via external-buffer mempools
(pinned external memory passed to the Rx mempool). The series must
argue why a new callback API is needed instead of those -- this is
the same alternative raised against the 7/7 implementation, so the
motivation and the correctness fix point at the same question.
- Constraints are undocumented: PMD support scope, fast-path
threading and reentrancy of the callback, and interaction with the
buffer-split configuration.
Until there is a documented model and at least one example showing
the buffer lifecycle, it is hard to tell whether the callback API is
the right shape -- or whether rte_pktmbuf_attach_extbuf() or an
external-buffer mempool already covers the use case.
* [DPDK/other Bug 1655] Kernel crash when hot-unplug igb_uio device while DPDK application is running
From: bugzilla @ 2026-06-18 15:32 UTC
To: dev
In-Reply-To: <bug-1655-3@http.bugs.dpdk.org/>
http://bugs.dpdk.org/show_bug.cgi?id=1655
Thomas Monjalon (thomas@monjalon.net) changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |WONTFIX
CC| |thomas@monjalon.net
Status|UNCONFIRMED |RESOLVED
--- Comment #1 from Thomas Monjalon (thomas@monjalon.net) ---
UIO is deprecated.
--
You are receiving this mail because:
You are the assignee for the bug.
* [DPDK/other Bug 1650] igb_uio can not be used when running l3fwd-power
From: bugzilla @ 2026-06-18 15:31 UTC
To: dev
In-Reply-To: <bug-1650-3@http.bugs.dpdk.org/>
http://bugs.dpdk.org/show_bug.cgi?id=1650
Thomas Monjalon (thomas@monjalon.net) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |thomas@monjalon.net
Status|UNCONFIRMED |RESOLVED
Resolution|--- |WONTFIX
--- Comment #1 from Thomas Monjalon (thomas@monjalon.net) ---
UIO is deprecated.
* [PATCH v2 10/10] bus/vmbus: support unplug
From: David Marchand @ 2026-06-18 15:28 UTC
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, Wei Hu
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
Add .unplug callback to handle driver removal, device unmapping, and
interrupt cleanup. This enables use of the generic bus cleanup helper.
The cleanup function was already performing these operations, so it
seems safe to expose them through the unplug operation.
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
doc/guides/rel_notes/release_26_07.rst | 4 +++
drivers/bus/vmbus/vmbus_common.c | 41 ++++++++++++--------------
2 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 5d7aa8d1bf..55d3b44527 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -114,6 +114,10 @@ New Features
Added no-IOMMU mode for devices without or not enabling IOMMU/SVA.
+* **Added unplug operation support to VMBUS bus.**
+
+ Implemented device unplug operation to allow runtime removal of VMBUS devices.
+
* **Added selective Rx in ethdev API.**
Some parts of packets may be discarded in Rx
diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
index a6e3a24a7c..cd6e851e4c 100644
--- a/drivers/bus/vmbus/vmbus_common.c
+++ b/drivers/bus/vmbus/vmbus_common.c
@@ -144,34 +144,29 @@ rte_vmbus_probe(void)
}
static int
-rte_vmbus_cleanup(struct rte_bus *bus)
+vmbus_unplug_device(struct rte_device *rte_dev)
{
- struct rte_vmbus_device *dev;
- int error = 0;
-
- RTE_BUS_FOREACH_DEV(dev, bus) {
- const struct rte_vmbus_driver *drv;
- int ret;
-
- if (!rte_dev_is_probed(&dev->device))
- continue;
- drv = RTE_BUS_DRIVER(dev->device.driver, *drv);
- if (drv->remove == NULL)
- continue;
+ const struct rte_vmbus_driver *drv = RTE_BUS_DRIVER(rte_dev->driver, *drv);
+ struct rte_vmbus_device *dev = RTE_BUS_DEVICE(rte_dev, *dev);
+ int ret = 0;
+ if (drv->remove != NULL) {
ret = drv->remove(dev);
if (ret < 0)
- error = -1;
+ return ret;
+ }
- rte_vmbus_unmap_device(dev);
- rte_intr_instance_free(dev->intr_handle);
+ rte_vmbus_unmap_device(dev);
+ rte_intr_instance_free(dev->intr_handle);
+ dev->intr_handle = NULL;
- dev->device.driver = NULL;
- rte_bus_remove_device(bus, &dev->device);
- free(dev);
- }
+ return 0;
+}
- return error;
+static void
+vmbus_free_device(struct rte_device *dev)
+{
+ free(RTE_BUS_DEVICE(dev, struct rte_vmbus_device));
}
static int
@@ -222,10 +217,12 @@ rte_vmbus_unregister(struct rte_vmbus_driver *driver)
struct rte_bus rte_vmbus_bus = {
.scan = rte_vmbus_scan,
.probe = rte_bus_generic_probe,
- .cleanup = rte_vmbus_cleanup,
+ .free_device = vmbus_free_device,
+ .cleanup = rte_bus_generic_cleanup,
.find_device = rte_bus_generic_find_device,
.match = vmbus_bus_match,
.probe_device = vmbus_probe_device,
+ .unplug_device = vmbus_unplug_device,
.parse = vmbus_parse,
.dev_compare = vmbus_dev_compare,
};
--
2.53.0
* [PATCH v2 09/10] bus/vmbus: store name in bus specific device
From: David Marchand @ 2026-06-18 15:28 UTC
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, Wei Hu
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
The device name is allocated with strdup() during scan and freed in
several places. However, when this bus cleanup is converted to use the
EAL generic helper, freeing the device object will require a custom
helper to also free the device name (and for this, a cast will be
needed).
Instead, add an embedded name array to rte_vmbus_device structure
(char name[RTE_DEV_NAME_MAX_LEN]) which is sufficient for all VMBUS
device names (UUID format: 36 characters, or shorter legacy format).
This simplifies the device freeing to a simple free() call.
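
The ownership change described above can be sketched outside of DPDK as
follows. This is only an illustration under stated assumptions: the
structure names, demo_scan_one(), copy_name() and DEV_NAME_MAX_LEN are
hypothetical stand-ins, not the vmbus definitions, and copy_name() merely
imitates a bounded copy that reports truncation.

```c
#include <stdlib.h>
#include <string.h>

#define DEV_NAME_MAX_LEN 64 /* hypothetical stand-in for RTE_DEV_NAME_MAX_LEN */

/* Hypothetical, simplified generic device (not the DPDK structure). */
struct generic_device {
	const char *name; /* points into the bus-specific structure */
};

/* Hypothetical bus-specific device with the name stored inline. */
struct demo_vmbus_device {
	struct generic_device device;
	char name[DEV_NAME_MAX_LEN]; /* storage owned by the device itself */
};

/* Bounded copy that fails instead of truncating. */
static int
copy_name(char *dst, const char *src, size_t dst_size)
{
	size_t len = strlen(src);

	if (len >= dst_size)
		return -1;
	memcpy(dst, src, len + 1);
	return (int)len;
}

/* One calloc() covers both the structure and its name. */
static struct demo_vmbus_device *
demo_scan_one(const char *name)
{
	struct demo_vmbus_device *dev = calloc(1, sizeof(*dev));

	if (dev == NULL)
		return NULL;
	if (copy_name(dev->name, name, sizeof(dev->name)) < 0) {
		free(dev); /* no separate name buffer to release */
		return NULL;
	}
	dev->device.name = dev->name;
	return dev;
}
```

With the name embedded, every error path and the final release reduce to
a single free() on the device, which is the property the generic cleanup
helper relies on.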
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
drivers/bus/vmbus/bus_vmbus_driver.h | 1 +
drivers/bus/vmbus/linux/vmbus_bus.c | 10 +++-------
2 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/bus/vmbus/bus_vmbus_driver.h b/drivers/bus/vmbus/bus_vmbus_driver.h
index 888d856141..706ff1fcf5 100644
--- a/drivers/bus/vmbus/bus_vmbus_driver.h
+++ b/drivers/bus/vmbus/bus_vmbus_driver.h
@@ -38,6 +38,7 @@ enum hv_uio_map {
*/
struct rte_vmbus_device {
struct rte_device device; /**< Inherit core device */
+ char name[RTE_DEV_NAME_MAX_LEN]; /**< VMBUS device name */
rte_uuid_t device_id; /**< VMBUS device id */
rte_uuid_t class_id; /**< VMBUS device type */
uint32_t relid; /**< id for primary */
diff --git a/drivers/bus/vmbus/linux/vmbus_bus.c b/drivers/bus/vmbus/linux/vmbus_bus.c
index 77d904ad6d..779ea50b92 100644
--- a/drivers/bus/vmbus/linux/vmbus_bus.c
+++ b/drivers/bus/vmbus/linux/vmbus_bus.c
@@ -280,15 +280,14 @@ vmbus_scan_one(const char *name)
char filename[PATH_MAX];
char dirname[PATH_MAX];
unsigned long tmp;
- char *dev_name;
dev = calloc(1, sizeof(*dev));
if (dev == NULL)
return -1;
- dev->device.name = dev_name = strdup(name);
- if (!dev->device.name)
+ if (rte_strscpy(dev->name, name, sizeof(dev->name)) < 0)
goto error;
+ dev->device.name = dev->name;
/* sysfs base directory
* /sys/bus/vmbus/devices/7a08391f-f5a0-4ac0-9802-d13fd964f8df
@@ -305,7 +304,6 @@ vmbus_scan_one(const char *name)
/* skip non-network devices */
if (rte_uuid_compare(dev->class_id, vmbus_nic_uuid) != 0) {
- free(dev_name);
free(dev);
return 0;
}
@@ -330,7 +328,7 @@ vmbus_scan_one(const char *name)
dev->monitor_id = UINT8_MAX;
}
- dev->device.devargs = rte_bus_find_devargs(&rte_vmbus_bus, dev_name);
+ dev->device.devargs = rte_bus_find_devargs(&rte_vmbus_bus, dev->name);
dev->device.numa_node = SOCKET_ID_ANY;
if (vmbus_use_numa(dev)) {
@@ -360,7 +358,6 @@ vmbus_scan_one(const char *name)
} else { /* already registered */
VMBUS_LOG(NOTICE,
"%s already registered", name);
- free(dev_name);
free(dev);
}
return 0;
@@ -371,7 +368,6 @@ vmbus_scan_one(const char *name)
error:
VMBUS_LOG(DEBUG, "failed");
- free(dev_name);
free(dev);
return -1;
}
--
2.53.0
* [PATCH v2 08/10] bus: implement cleanup in EAL
From: David Marchand @ 2026-06-18 15:28 UTC
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, Parav Pandit, Xueming Li, Sachin Saxena, Rosen Xu,
Chenbo Xia, Nipun Gupta, Tomasz Duszynski, Wei Hu
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
Introduce a generic cleanup helper rte_bus_generic_cleanup() that
eliminates code duplication across bus cleanup implementations:
unplug probed devices, remove devargs, remove from bus list,
and free device structures.
Add .free_device operation to struct rte_bus to allow buses to specify
how to free their device structures.
Update all buses for the new .cleanup and RTE_REGISTER_BUS prototypes.
Convert to rte_bus_generic_cleanup() the buses that have both a .cleanup
and .unplug_device: this requires implementing .free_device for them.
Untouched buses are:
- dma/idxd which has no unplug support,
- bus/cdx which has unplug support, but no cleanup was implemented so
far,
- NXP buses:
- bus/dpaa and bus/fslmc have many issues with interrupt
allocation/setup/freeing or VFIO setup/release,
- bus/fslmc cleanup callback is actually implemented in its internal
VFIO layer and requires too much refactoring.
---
Changes since v1:
- dropped hack on using free() and the check in RTE_REGISTER_BUS.
---
drivers/bus/auxiliary/auxiliary_common.c | 28 ++++---------------
drivers/bus/dpaa/dpaa_bus.c | 4 +--
drivers/bus/fslmc/fslmc_bus.c | 2 +-
drivers/bus/ifpga/ifpga_bus.c | 32 ++++------------------
drivers/bus/pci/pci_common.c | 29 +++++---------------
drivers/bus/platform/platform.c | 20 ++++----------
drivers/bus/uacce/uacce.c | 28 ++++---------------
drivers/bus/vdev/vdev.c | 26 +++++++-----------
drivers/bus/vmbus/vmbus_common.c | 6 ++---
lib/eal/common/eal_common_bus.c | 33 ++++++++++++++++++++++-
lib/eal/include/bus_driver.h | 34 +++++++++++++++++++++++-
11 files changed, 107 insertions(+), 135 deletions(-)
diff --git a/drivers/bus/auxiliary/auxiliary_common.c b/drivers/bus/auxiliary/auxiliary_common.c
index 10f466e57a..80b90a4961 100644
--- a/drivers/bus/auxiliary/auxiliary_common.c
+++ b/drivers/bus/auxiliary/auxiliary_common.c
@@ -179,29 +179,10 @@ rte_auxiliary_unregister(struct rte_auxiliary_driver *driver)
rte_bus_remove_driver(&auxiliary_bus, &driver->driver);
}
-static int
-auxiliary_cleanup(void)
+static void
+auxiliary_free_device(struct rte_device *dev)
{
- struct rte_auxiliary_device *dev;
- int error = 0;
-
- RTE_BUS_FOREACH_DEV(dev, &auxiliary_bus) {
- int ret;
-
- if (rte_dev_is_probed(&dev->device)) {
- ret = auxiliary_unplug_device(&dev->device);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
- }
- }
-
- rte_devargs_remove(dev->device.devargs);
- rte_bus_remove_device(&auxiliary_bus, &dev->device);
- free(dev);
- }
-
- return error;
+ free(RTE_BUS_DEVICE(dev, struct rte_auxiliary_device));
}
static int
@@ -247,7 +228,8 @@ auxiliary_get_iommu_class(void)
struct rte_bus auxiliary_bus = {
.scan = auxiliary_scan,
.probe = rte_bus_generic_probe,
- .cleanup = auxiliary_cleanup,
+ .free_device = auxiliary_free_device,
+ .cleanup = rte_bus_generic_cleanup,
.find_device = rte_bus_generic_find_device,
.match = auxiliary_bus_match,
.probe_device = auxiliary_probe_device,
diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
index ee467b94d5..54779f82f7 100644
--- a/drivers/bus/dpaa/dpaa_bus.c
+++ b/drivers/bus/dpaa/dpaa_bus.c
@@ -807,12 +807,12 @@ dpaa_bus_probe_device(struct rte_driver *drv, struct rte_device *dev)
}
static int
-dpaa_bus_cleanup(void)
+dpaa_bus_cleanup(struct rte_bus *bus)
{
struct rte_dpaa_device *dev;
BUS_INIT_FUNC_TRACE();
- RTE_BUS_FOREACH_DEV(dev, &rte_dpaa_bus) {
+ RTE_BUS_FOREACH_DEV(dev, bus) {
const struct rte_dpaa_driver *drv;
int ret = 0;
diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index dca4c5b182..1a0eca30b4 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -436,7 +436,7 @@ fslmc_bus_match(const struct rte_driver *drv, const struct rte_device *dev)
}
static int
-rte_fslmc_close(void)
+rte_fslmc_close(struct rte_bus *bus __rte_unused)
{
int ret = 0;
diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
index 394b777916..f8e0e7770d 100644
--- a/drivers/bus/ifpga/ifpga_bus.c
+++ b/drivers/bus/ifpga/ifpga_bus.c
@@ -295,33 +295,10 @@ ifpga_unplug_device(struct rte_device *dev)
return 0;
}
-/*
- * Cleanup the content of the Intel FPGA bus, and call the remove() function
- * for all registered devices.
- */
-static int
-ifpga_cleanup(void)
+static void
+ifpga_free_device(struct rte_device *dev)
{
- struct rte_afu_device *afu_dev;
- int error = 0;
-
- RTE_BUS_FOREACH_DEV(afu_dev, &rte_ifpga_bus) {
- int ret = 0;
-
- if (rte_dev_is_probed(&afu_dev->device)) {
- ret = ifpga_unplug_device(&afu_dev->device);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
- }
- }
-
- rte_devargs_remove(afu_dev->device.devargs);
- rte_bus_remove_device(&rte_ifpga_bus, &afu_dev->device);
- free(afu_dev);
- }
-
- return error;
+ free(RTE_BUS_DEVICE(dev, struct rte_afu_device));
}
static int
@@ -371,7 +348,8 @@ ifpga_parse(const char *name, void *addr)
static struct rte_bus rte_ifpga_bus = {
.scan = ifpga_scan,
.probe = rte_bus_generic_probe,
- .cleanup = ifpga_cleanup,
+ .free_device = ifpga_free_device,
+ .cleanup = rte_bus_generic_cleanup,
.find_device = rte_bus_generic_find_device,
.match = ifpga_bus_match,
.probe_device = ifpga_probe_device,
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index bf4822f7ec..0f635e1537 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -317,29 +317,11 @@ pci_unplug_device(struct rte_device *rte_dev)
return 0;
}
-static int
-pci_cleanup(void)
+static void
+pci_free_device(struct rte_device *dev)
{
- struct rte_pci_device *dev;
- int error = 0;
-
- RTE_BUS_FOREACH_DEV(dev, &rte_pci_bus) {
- int ret = 0;
-
- if (rte_dev_is_probed(&dev->device)) {
- ret = pci_unplug_device(&dev->device);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
- }
- }
-
- rte_devargs_remove(dev->device.devargs);
- rte_bus_remove_device(&rte_pci_bus, &dev->device);
- pci_free(RTE_PCI_DEVICE_INTERNAL(dev));
- }
-
- return error;
+ struct rte_pci_device *pdev = RTE_BUS_DEVICE(dev, *pdev);
+ pci_free(RTE_PCI_DEVICE_INTERNAL(pdev));
}
/* dump one device */
@@ -743,7 +725,8 @@ struct rte_bus rte_pci_bus = {
.allow_multi_probe = true,
.scan = rte_pci_scan,
.probe = rte_bus_generic_probe,
- .cleanup = pci_cleanup,
+ .free_device = pci_free_device,
+ .cleanup = rte_bus_generic_cleanup,
.find_device = rte_bus_generic_find_device,
.match = pci_bus_match,
.probe_device = pci_probe_device,
diff --git a/drivers/bus/platform/platform.c b/drivers/bus/platform/platform.c
index 5b3c78a505..90d865a8df 100644
--- a/drivers/bus/platform/platform.c
+++ b/drivers/bus/platform/platform.c
@@ -491,26 +491,17 @@ platform_bus_get_iommu_class(void)
return RTE_IOVA_DC;
}
-static int
-platform_bus_cleanup(void)
+static void
+platform_free_device(struct rte_device *dev)
{
- struct rte_platform_device *pdev;
-
- RTE_BUS_FOREACH_DEV(pdev, &platform_bus) {
- if (rte_dev_is_probed(&pdev->device))
- platform_bus_unplug_device(&pdev->device);
-
- rte_devargs_remove(pdev->device.devargs);
- rte_bus_remove_device(&platform_bus, &pdev->device);
- free(pdev);
- }
-
- return 0;
+ free(RTE_BUS_DEVICE(dev, struct rte_platform_device));
}
static struct rte_bus platform_bus = {
.scan = platform_bus_scan,
.probe = rte_bus_generic_probe,
+ .free_device = platform_free_device,
+ .cleanup = rte_bus_generic_cleanup,
.find_device = rte_bus_generic_find_device,
.match = platform_bus_match,
.probe_device = platform_bus_probe_device,
@@ -520,7 +511,6 @@ static struct rte_bus platform_bus = {
.dma_unmap = platform_bus_dma_unmap,
.get_iommu_class = platform_bus_get_iommu_class,
.dev_iterate = rte_bus_generic_dev_iterate,
- .cleanup = platform_bus_cleanup,
};
RTE_REGISTER_BUS(platform, platform_bus);
diff --git a/drivers/bus/uacce/uacce.c b/drivers/bus/uacce/uacce.c
index bfe1f26557..99a6fb314d 100644
--- a/drivers/bus/uacce/uacce.c
+++ b/drivers/bus/uacce/uacce.c
@@ -402,29 +402,10 @@ uacce_unplug_device(struct rte_device *rte_dev)
return 0;
}
-static int
-uacce_cleanup(void)
+static void
+uacce_free_device(struct rte_device *dev)
{
- struct rte_uacce_device *dev;
- int error = 0;
-
- RTE_BUS_FOREACH_DEV(dev, &uacce_bus) {
- int ret = 0;
-
- if (rte_dev_is_probed(&dev->device)) {
- ret = uacce_unplug_device(&dev->device);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
- }
- }
-
- rte_devargs_remove(dev->device.devargs);
- rte_bus_remove_device(&uacce_bus, &dev->device);
- free(dev);
- }
-
- return error;
+ free(RTE_BUS_DEVICE(dev, struct rte_uacce_device));
}
static int
@@ -551,7 +532,8 @@ rte_uacce_unregister(struct rte_uacce_driver *driver)
static struct rte_bus uacce_bus = {
.scan = uacce_scan,
.probe = rte_bus_generic_probe,
- .cleanup = uacce_cleanup,
+ .free_device = uacce_free_device,
+ .cleanup = rte_bus_generic_cleanup,
.match = uacce_bus_match,
.probe_device = uacce_probe_device,
.unplug_device = uacce_unplug_device,
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 7e94f86e28..02d719a44d 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -548,26 +548,19 @@ vdev_scan(void)
return 0;
}
+static void
+vdev_free_device(struct rte_device *dev)
+{
+ free(RTE_BUS_DEVICE(dev, struct rte_vdev_device));
+}
+
static int
-vdev_cleanup(void)
+vdev_cleanup(struct rte_bus *bus)
{
- struct rte_vdev_device *dev;
- int error = 0;
+ int error;
rte_spinlock_recursive_lock(&vdev_device_list_lock);
- RTE_BUS_FOREACH_DEV(dev, &rte_vdev_bus) {
- int ret;
-
- if (rte_dev_is_probed(&dev->device)) {
- ret = vdev_unplug_device(&dev->device);
- if (ret < 0)
- error = -1;
- }
-
- rte_devargs_remove(dev->device.devargs);
- rte_bus_remove_device(&rte_vdev_bus, &dev->device);
- free(dev);
- }
+ error = rte_bus_generic_cleanup(bus);
rte_spinlock_recursive_unlock(&vdev_device_list_lock);
return error;
@@ -608,6 +601,7 @@ vdev_get_iommu_class(void)
static struct rte_bus rte_vdev_bus = {
.scan = vdev_scan,
.probe = rte_bus_generic_probe,
+ .free_device = vdev_free_device,
.cleanup = vdev_cleanup,
.find_device = vdev_find_device,
.match = vdev_bus_match,
diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
index bfb45e963c..a6e3a24a7c 100644
--- a/drivers/bus/vmbus/vmbus_common.c
+++ b/drivers/bus/vmbus/vmbus_common.c
@@ -144,12 +144,12 @@ rte_vmbus_probe(void)
}
static int
-rte_vmbus_cleanup(void)
+rte_vmbus_cleanup(struct rte_bus *bus)
{
struct rte_vmbus_device *dev;
int error = 0;
- RTE_BUS_FOREACH_DEV(dev, &rte_vmbus_bus) {
+ RTE_BUS_FOREACH_DEV(dev, bus) {
const struct rte_vmbus_driver *drv;
int ret;
@@ -167,7 +167,7 @@ rte_vmbus_cleanup(void)
rte_intr_instance_free(dev->intr_handle);
dev->device.driver = NULL;
- rte_bus_remove_device(&rte_vmbus_bus, &dev->device);
+ rte_bus_remove_device(bus, &dev->device);
free(dev);
}
diff --git a/lib/eal/common/eal_common_bus.c b/lib/eal/common/eal_common_bus.c
index ca13ccce5b..9ba23516ee 100644
--- a/lib/eal/common/eal_common_bus.c
+++ b/lib/eal/common/eal_common_bus.c
@@ -124,6 +124,37 @@ rte_bus_generic_probe(struct rte_bus *bus)
return (probed && probed == failed) ? -1 : 0;
}
+/*
+ * Generic cleanup function for buses.
+ * Iterates through all devices on the bus, unplugs probed devices,
+ * removes devargs, removes devices from the bus list, and frees device structures.
+ */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_bus_generic_cleanup)
+int
+rte_bus_generic_cleanup(struct rte_bus *bus)
+{
+ struct rte_device *dev;
+ int error = 0;
+
+ RTE_VERIFY(bus->free_device);
+ RTE_VERIFY(bus->unplug_device);
+
+ while ((dev = TAILQ_FIRST(&bus->device_list)) != NULL) {
+ if (rte_dev_is_probed(dev)) {
+ if (bus->unplug_device && bus->unplug_device(dev) < 0) {
+ rte_errno = errno;
+ error = -1;
+ }
+ }
+
+ rte_devargs_remove(dev->devargs);
+ rte_bus_remove_device(bus, dev);
+ bus->free_device(dev);
+ }
+
+ return error;
+}
+
/* Probe all devices of all buses */
RTE_EXPORT_SYMBOL(rte_bus_probe)
int
@@ -164,7 +195,7 @@ eal_bus_cleanup(void)
TAILQ_FOREACH(bus, &rte_bus_list, next) {
if (bus->cleanup == NULL)
continue;
- if (bus->cleanup() != 0)
+ if (bus->cleanup(bus) != 0)
ret = -1;
}
diff --git a/lib/eal/include/bus_driver.h b/lib/eal/include/bus_driver.h
index fde55ff06d..4f6521c87f 100644
--- a/lib/eal/include/bus_driver.h
+++ b/lib/eal/include/bus_driver.h
@@ -226,17 +226,31 @@ typedef int (*rte_bus_hot_unplug_handler_t)(struct rte_device *dev);
*/
typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
+/**
+ * Free a bus-specific device structure.
+ *
+ * @param dev
+ * Device pointer.
+ */
+typedef void (*rte_bus_free_device_t)(struct rte_device *dev);
+
/**
* Implementation specific cleanup function which is responsible for cleaning up
* devices on that bus with applicable drivers.
*
+ * The cleanup operation is the counterpart to scan, removing all devices added
+ * during scan.
+ *
* This is called while iterating over each registered bus.
*
+ * @param bus
+ * Pointer to the bus to cleanup.
+ *
* @return
* 0 for successful cleanup
* !0 for any error during cleanup
*/
-typedef int (*rte_bus_cleanup_t)(void);
+typedef int (*rte_bus_cleanup_t)(struct rte_bus *bus);
/**
* Check if a driver matches a device.
@@ -336,6 +350,7 @@ struct rte_bus {
/**< handle hot-unplug failure on the bus */
rte_bus_sigbus_handler_t sigbus_handler;
/**< handle sigbus error on the bus */
+ rte_bus_free_device_t free_device; /**< Free bus-specific device */
rte_bus_cleanup_t cleanup; /**< Cleanup devices on bus */
RTE_TAILQ_HEAD(, rte_device) device_list; /**< List of devices on the bus */
RTE_TAILQ_HEAD(, rte_driver) driver_list; /**< List of drivers on the bus */
@@ -624,6 +639,23 @@ struct rte_driver *rte_bus_find_driver(const struct rte_bus *bus, const struct r
__rte_internal
int rte_bus_generic_probe(struct rte_bus *bus);
+/**
+ * Generic cleanup function for buses.
+ *
+ * Iterates through all devices on the bus, unplugs probed devices,
+ * removes devargs, removes devices from the bus list, and frees device structures.
+ *
+ * This function can be used by buses that don't require special cleanup
+ * logic and just need the standard device cleanup sequence.
+ *
+ * @param bus
+ * Pointer to the bus to cleanup.
+ * @return
+ * 0 on success, -1 if any errors occurred during cleanup.
+ */
+__rte_internal
+int rte_bus_generic_cleanup(struct rte_bus *bus);
+
#ifdef __cplusplus
}
#endif
--
2.53.0
* [PATCH v2 07/10] bus: align unplug with device probe
From: David Marchand @ 2026-06-18 15:28 UTC
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, Parav Pandit, Xueming Li, Nipun Gupta,
Nikhil Agarwal, Sachin Saxena, Rosen Xu, Chenbo Xia,
Tomasz Duszynski
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
Refactor bus unplug operations to be the counterpart of probe_device.
The (renamed) unplug operation now only handles:
- Driver removal (calling the driver's remove callback)
- Freeing probe-allocated resources (interrupts, mappings)
Device deletion (devargs removal, bus removal, freeing device
structure) is now handled only during bus cleanup, not in unplug.
Additionally, move driver pointer clearing from individual bus unplug
operations to EAL's local_dev_remove() where the unplug operation is
invoked. This centralizes driver lifecycle management and eliminates
code duplication across bus drivers.
For vdev, add a check in rte_vdev_uninit() since this public API can
be called on devices without a driver attached.
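
The division of labour described above can be sketched in standalone C.
The demo_* names are hypothetical, the structures are heavily simplified,
and it is an assumption of this sketch (not something shown in this
excerpt) that the caller clears the driver pointer only after the unplug
operation succeeds.

```c
#include <stddef.h>

struct demo_device;

/* Hypothetical driver with a remove callback. */
struct demo_driver {
	int (*remove)(struct demo_device *dev);
};

/* Hypothetical device (not the DPDK structure). */
struct demo_device {
	const struct demo_driver *driver; /* non-NULL while probed */
	void *intr_handle;                /* stand-in for a probe-time resource */
	int on_bus_list;                  /* stand-in for bus list membership */
};

static int
demo_remove_ok(struct demo_device *dev)
{
	(void)dev;
	return 0;
}

static int
demo_remove_fail(struct demo_device *dev)
{
	(void)dev;
	return -1;
}

/*
 * Bus-level unplug: counterpart of probe.
 * It calls the driver's remove callback and releases what probe allocated.
 * It does not touch the bus list and does not free the device.
 */
static int
demo_bus_unplug_device(struct demo_device *dev)
{
	if (dev->driver->remove != NULL) {
		int ret = dev->driver->remove(dev);

		if (ret < 0)
			return ret;
	}
	dev->intr_handle = NULL; /* models releasing the interrupt handle */
	return 0;
}

/*
 * Common caller: the driver pointer is cleared here, once,
 * instead of in every bus implementation.
 */
static int
demo_local_dev_remove(struct demo_device *dev)
{
	int ret;

	if (dev->driver == NULL)
		return -1; /* nothing to unplug */
	ret = demo_bus_unplug_device(dev);
	if (ret < 0)
		return ret;
	dev->driver = NULL;
	return 0;
}
```

After a successful call the device is still on the bus list and could be
probed again; in this model, deleting it is left to the bus cleanup path.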
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
doc/guides/prog_guide/device_hotplug.rst | 18 ++++---
drivers/bus/auxiliary/auxiliary_common.c | 46 ++++++----------
drivers/bus/cdx/cdx.c | 29 ++--------
drivers/bus/fslmc/fslmc_bus.c | 7 +--
drivers/bus/ifpga/ifpga_bus.c | 63 ++++++++++------------
drivers/bus/pci/pci_common.c | 57 ++++----------------
drivers/bus/platform/platform.c | 16 +++---
drivers/bus/uacce/uacce.c | 67 ++++++++----------------
drivers/bus/vdev/vdev.c | 53 ++++++++-----------
lib/eal/common/eal_common_dev.c | 8 +--
lib/eal/include/bus_driver.h | 4 +-
11 files changed, 129 insertions(+), 239 deletions(-)
diff --git a/doc/guides/prog_guide/device_hotplug.rst b/doc/guides/prog_guide/device_hotplug.rst
index 7eb7fbcc2b..d21ba0c244 100644
--- a/doc/guides/prog_guide/device_hotplug.rst
+++ b/doc/guides/prog_guide/device_hotplug.rst
@@ -165,7 +165,7 @@ using ``rte_dev_event_callback_register()`` function.
on the device in question.
When ``RTE_DEV_EVENT_REMOVE`` event is delivered,
it indicates that the kernel has removed the device;
- the application should call ``rte_dev_remove()`` to clean up EAL resources.
+ the application should call ``rte_dev_remove()`` to unplug the device driver.
Event Notification Usage
@@ -256,13 +256,17 @@ When ``rte_dev_remove()`` is called, the following sequence occurs:
See `Multi-process Synchronization`_ for details.
#. **Device Unplug**:
- The bus's ``unplug()`` method is called (``dev->bus->unplug()``),
- which triggers the driver's remove function.
- This typically stops device operations, releases device resources,
- unmaps memory regions, and unregisters from subsystems.
+ The bus's ``unplug_device()`` method is called (``dev->bus->unplug_device()``),
+ which triggers the driver's remove function
+ and releases resources allocated during probe
+ (such as interrupt handles and device memory mappings).
-#. **Devargs Cleanup**:
- The devargs associated with the device are removed from the global list.
+.. note::
+
+ The device structure, its devargs, and its entry in the bus device list
+ are NOT freed during ``rte_dev_remove()``.
+ They remain in memory until ``rte_eal_cleanup()`` is called,
+ at which point the bus's ``cleanup()`` method handles complete device deletion.
Multi-process Synchronization
diff --git a/drivers/bus/auxiliary/auxiliary_common.c b/drivers/bus/auxiliary/auxiliary_common.c
index 048aacf254..10f466e57a 100644
--- a/drivers/bus/auxiliary/auxiliary_common.c
+++ b/drivers/bus/auxiliary/auxiliary_common.c
@@ -122,13 +122,11 @@ auxiliary_probe_device(struct rte_driver *drv, struct rte_device *dev)
return ret;
}
-/*
- * Call the remove() function of the driver.
- */
static int
-rte_auxiliary_driver_remove_dev(struct rte_auxiliary_device *dev)
+auxiliary_unplug_device(struct rte_device *rte_dev)
{
- const struct rte_auxiliary_driver *drv = RTE_BUS_DRIVER(dev->device.driver, *drv);
+ const struct rte_auxiliary_driver *drv = RTE_BUS_DRIVER(rte_dev->driver, *drv);
+ struct rte_auxiliary_device *dev = RTE_BUS_DEVICE(rte_dev, *dev);
int ret = 0;
AUXILIARY_LOG(DEBUG, "Driver %s remove auxiliary device %s on NUMA node %i",
@@ -140,8 +138,8 @@ rte_auxiliary_driver_remove_dev(struct rte_auxiliary_device *dev)
return ret;
}
- /* clear driver structure */
- dev->device.driver = NULL;
+ rte_intr_instance_free(dev->intr_handle);
+ dev->intr_handle = NULL;
return 0;
}
@@ -181,22 +179,6 @@ rte_auxiliary_unregister(struct rte_auxiliary_driver *driver)
rte_bus_remove_driver(&auxiliary_bus, &driver->driver);
}
-static int
-auxiliary_unplug(struct rte_device *dev)
-{
- struct rte_auxiliary_device *adev = RTE_BUS_DEVICE(dev, *adev);
- int ret;
-
- ret = rte_auxiliary_driver_remove_dev(adev);
- if (ret == 0) {
- rte_bus_remove_device(&auxiliary_bus, &adev->device);
- rte_devargs_remove(dev->devargs);
- rte_intr_instance_free(adev->intr_handle);
- free(adev);
- }
- return ret;
-}
-
static int
auxiliary_cleanup(void)
{
@@ -206,13 +188,17 @@ auxiliary_cleanup(void)
RTE_BUS_FOREACH_DEV(dev, &auxiliary_bus) {
int ret;
- if (!rte_dev_is_probed(&dev->device))
- continue;
- ret = auxiliary_unplug(&dev->device);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
+ if (rte_dev_is_probed(&dev->device)) {
+ ret = auxiliary_unplug_device(&dev->device);
+ if (ret < 0) {
+ rte_errno = errno;
+ error = -1;
+ }
}
+
+ rte_devargs_remove(dev->device.devargs);
+ rte_bus_remove_device(&auxiliary_bus, &dev->device);
+ free(dev);
}
return error;
@@ -265,7 +251,7 @@ struct rte_bus auxiliary_bus = {
.find_device = rte_bus_generic_find_device,
.match = auxiliary_bus_match,
.probe_device = auxiliary_probe_device,
- .unplug = auxiliary_unplug,
+ .unplug_device = auxiliary_unplug_device,
.parse = auxiliary_parse,
.dma_map = auxiliary_dma_map,
.dma_unmap = auxiliary_dma_unmap,
diff --git a/drivers/bus/cdx/cdx.c b/drivers/bus/cdx/cdx.c
index 2443161e1a..c0b46a41ad 100644
--- a/drivers/bus/cdx/cdx.c
+++ b/drivers/bus/cdx/cdx.c
@@ -374,14 +374,11 @@ rte_cdx_unregister(struct rte_cdx_driver *driver)
rte_bus_remove_driver(&rte_cdx_bus, &driver->driver);
}
-/*
- * If vendor/device ID match, call the remove() function of the
- * driver.
- */
static int
-cdx_detach_dev(struct rte_cdx_device *dev)
+cdx_unplug_device(struct rte_device *rte_dev)
{
- const struct rte_cdx_driver *dr = RTE_BUS_DRIVER(dev->device.driver, *dr);
+ const struct rte_cdx_driver *dr = RTE_BUS_DRIVER(rte_dev->driver, *dr);
+ struct rte_cdx_device *dev = RTE_BUS_DEVICE(rte_dev, *dev);
int ret = 0;
CDX_BUS_DEBUG("detach device %s using driver: %s",
@@ -393,9 +390,6 @@ cdx_detach_dev(struct rte_cdx_device *dev)
return ret;
}
- /* clear driver structure */
- dev->device.driver = NULL;
-
rte_cdx_unmap_device(dev);
rte_intr_instance_free(dev->intr_handle);
@@ -404,21 +398,6 @@ cdx_detach_dev(struct rte_cdx_device *dev)
return 0;
}
-static int
-cdx_unplug(struct rte_device *dev)
-{
- struct rte_cdx_device *cdx_dev = RTE_BUS_DEVICE(dev, *cdx_dev);
- int ret;
-
- ret = cdx_detach_dev(cdx_dev);
- if (ret == 0) {
- rte_bus_remove_device(&rte_cdx_bus, &cdx_dev->device);
- rte_devargs_remove(dev->devargs);
- free(cdx_dev);
- }
- return ret;
-}
-
static int
cdx_dma_map(struct rte_device *dev, void *addr, uint64_t iova, size_t len)
{
@@ -452,7 +431,7 @@ static struct rte_bus rte_cdx_bus = {
.find_device = rte_bus_generic_find_device,
.match = cdx_bus_match,
.probe_device = cdx_probe_device,
- .unplug = cdx_unplug,
+ .unplug_device = cdx_unplug_device,
.parse = cdx_parse,
.dma_map = cdx_dma_map,
.dma_unmap = cdx_dma_unmap,
diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index c7549a361a..dca4c5b182 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -520,6 +520,7 @@ fslmc_bus_probe_device(struct rte_driver *driver, struct rte_device *rte_dev)
return 0;
}
+ /* FIXME: probe_device should allocate intr_handle */
ret = drv->probe(drv, dev);
if (ret != 0) {
DPAA2_BUS_ERR("Unable to probe");
@@ -531,7 +532,7 @@ fslmc_bus_probe_device(struct rte_driver *driver, struct rte_device *rte_dev)
}
static int
-fslmc_bus_unplug(struct rte_device *rte_dev)
+fslmc_bus_unplug_device(struct rte_device *rte_dev)
{
struct rte_dpaa2_device *dev = RTE_BUS_DEVICE(rte_dev, *dev);
const struct rte_dpaa2_driver *drv = RTE_BUS_DRIVER(rte_dev->driver, *drv);
@@ -540,7 +541,7 @@ fslmc_bus_unplug(struct rte_device *rte_dev)
int ret = drv->remove(dev);
if (ret != 0)
return ret;
- dev->device.driver = NULL;
+ /* FIXME: unplug_device should free intr_handle */
DPAA2_BUS_INFO("%s Un-Plugged", dev->device.name);
return 0;
}
@@ -558,7 +559,7 @@ struct rte_bus rte_fslmc_bus = {
.get_iommu_class = rte_dpaa2_get_iommu_class,
.match = fslmc_bus_match,
.probe_device = fslmc_bus_probe_device,
- .unplug = fslmc_bus_unplug,
+ .unplug_device = fslmc_bus_unplug_device,
.dev_iterate = rte_bus_generic_dev_iterate,
};
diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
index 2c22329f65..394b777916 100644
--- a/drivers/bus/ifpga/ifpga_bus.c
+++ b/drivers/bus/ifpga/ifpga_bus.c
@@ -276,6 +276,25 @@ ifpga_probe_device(struct rte_driver *drv, struct rte_device *dev)
return afu_drv->probe(afu_dev);
}
+static int
+ifpga_unplug_device(struct rte_device *dev)
+{
+ const struct rte_afu_driver *afu_drv = RTE_BUS_DRIVER(dev->driver, *afu_drv);
+ struct rte_afu_device *afu_dev = RTE_BUS_DEVICE(dev, *afu_dev);
+ int ret = 0;
+
+ if (afu_drv->remove) {
+ ret = afu_drv->remove(afu_dev);
+ if (ret)
+ return ret;
+ }
+
+ rte_intr_instance_free(afu_dev->intr_handle);
+ afu_dev->intr_handle = NULL;
+
+ return 0;
+}
+
/*
* Cleanup the content of the Intel FPGA bus, and call the remove() function
* for all registered devices.
@@ -287,52 +306,24 @@ ifpga_cleanup(void)
int error = 0;
RTE_BUS_FOREACH_DEV(afu_dev, &rte_ifpga_bus) {
- const struct rte_afu_driver *drv;
int ret = 0;
- if (!rte_dev_is_probed(&afu_dev->device))
- goto free;
- drv = RTE_BUS_DRIVER(afu_dev->device.driver, *drv);
- if (drv->remove == NULL)
- goto free;
-
- ret = drv->remove(afu_dev);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
+ if (rte_dev_is_probed(&afu_dev->device)) {
+ ret = ifpga_unplug_device(&afu_dev->device);
+ if (ret < 0) {
+ rte_errno = errno;
+ error = -1;
+ }
}
- afu_dev->device.driver = NULL;
-free:
- rte_bus_remove_device(&rte_ifpga_bus, &afu_dev->device);
rte_devargs_remove(afu_dev->device.devargs);
- rte_intr_instance_free(afu_dev->intr_handle);
+ rte_bus_remove_device(&rte_ifpga_bus, &afu_dev->device);
free(afu_dev);
}
return error;
}
-static int
-ifpga_unplug(struct rte_device *dev)
-{
- struct rte_afu_device *afu_dev = RTE_BUS_DEVICE(dev, *afu_dev);
- const struct rte_afu_driver *afu_drv = RTE_BUS_DRIVER(dev->driver, *afu_drv);
- int ret;
-
- ret = afu_drv->remove(afu_dev);
- if (ret)
- return ret;
-
- rte_bus_remove_device(&rte_ifpga_bus, &afu_dev->device);
-
- rte_devargs_remove(dev->devargs);
- rte_intr_instance_free(afu_dev->intr_handle);
- free(afu_dev);
- return 0;
-
-}
-
static int
ifpga_parse(const char *name, void *addr)
{
@@ -384,7 +375,7 @@ static struct rte_bus rte_ifpga_bus = {
.find_device = rte_bus_generic_find_device,
.match = ifpga_bus_match,
.probe_device = ifpga_probe_device,
- .unplug = ifpga_unplug,
+ .unplug_device = ifpga_unplug_device,
.parse = ifpga_parse,
};
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 791e9a7b49..bf4822f7ec 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -282,13 +282,10 @@ pci_probe_device(struct rte_driver *drv, struct rte_device *dev)
return ret;
}
-/*
- * If vendor/device ID match, call the remove() function of the
- * driver.
- */
static int
-rte_pci_detach_dev(struct rte_pci_device *dev)
+pci_unplug_device(struct rte_device *rte_dev)
{
+ struct rte_pci_device *dev = RTE_BUS_DEVICE(rte_dev, *dev);
struct rte_pci_addr *loc;
const struct rte_pci_driver *dr = RTE_BUS_DRIVER(dev->device.driver, *dr);
int ret = 0;
@@ -308,9 +305,6 @@ rte_pci_detach_dev(struct rte_pci_device *dev)
return ret;
}
- /* clear driver structure */
- dev->device.driver = NULL;
-
if (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING)
/* unmap resources for devices that use igb_uio */
rte_pci_unmap_device(dev);
@@ -330,33 +324,17 @@ pci_cleanup(void)
int error = 0;
RTE_BUS_FOREACH_DEV(dev, &rte_pci_bus) {
- const struct rte_pci_driver *drv;
int ret = 0;
- if (!rte_dev_is_probed(&dev->device))
- goto free;
- drv = RTE_BUS_DRIVER(dev->device.driver, *drv);
- if (drv->remove == NULL)
- goto free;
-
- ret = drv->remove(dev);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
+ if (rte_dev_is_probed(&dev->device)) {
+ ret = pci_unplug_device(&dev->device);
+ if (ret < 0) {
+ rte_errno = errno;
+ error = -1;
+ }
}
- if (drv->drv_flags & RTE_PCI_DRV_NEED_MAPPING)
- rte_pci_unmap_device(dev);
-
- dev->device.driver = NULL;
-
-free:
- /* free interrupt handles */
- rte_intr_instance_free(dev->intr_handle);
- dev->intr_handle = NULL;
- rte_intr_instance_free(dev->vfio_req_intr_handle);
- dev->vfio_req_intr_handle = NULL;
-
+ rte_devargs_remove(dev->device.devargs);
rte_bus_remove_device(&rte_pci_bus, &dev->device);
pci_free(RTE_PCI_DEVICE_INTERNAL(dev));
}
@@ -521,21 +499,6 @@ pci_sigbus_handler(const void *failure_addr)
return ret;
}
-static int
-pci_unplug(struct rte_device *dev)
-{
- struct rte_pci_device *pdev = RTE_BUS_DEVICE(dev, *pdev);
- int ret;
-
- ret = rte_pci_detach_dev(pdev);
- if (ret == 0) {
- rte_bus_remove_device(&rte_pci_bus, &pdev->device);
- rte_devargs_remove(dev->devargs);
- pci_free(RTE_PCI_DEVICE_INTERNAL(pdev));
- }
- return ret;
-}
-
static int
pci_dma_map(struct rte_device *dev, void *addr, uint64_t iova, size_t len)
{
@@ -784,7 +747,7 @@ struct rte_bus rte_pci_bus = {
.find_device = rte_bus_generic_find_device,
.match = pci_bus_match,
.probe_device = pci_probe_device,
- .unplug = pci_unplug,
+ .unplug_device = pci_unplug_device,
.parse = pci_parse,
.dev_compare = pci_dev_compare,
.devargs_parse = rte_pci_devargs_parse,
diff --git a/drivers/bus/platform/platform.c b/drivers/bus/platform/platform.c
index 170a2e03d0..5b3c78a505 100644
--- a/drivers/bus/platform/platform.c
+++ b/drivers/bus/platform/platform.c
@@ -416,19 +416,15 @@ device_release_driver(struct rte_platform_device *pdev)
if (ret)
PLATFORM_LOG_LINE(WARNING, "failed to remove %s", pdev->name);
}
-
- pdev->device.driver = NULL;
}
static int
-platform_bus_unplug(struct rte_device *dev)
+platform_bus_unplug_device(struct rte_device *dev)
{
struct rte_platform_device *pdev = RTE_BUS_DEVICE(dev, *pdev);
device_release_driver(pdev);
device_cleanup(pdev);
- rte_devargs_remove(pdev->device.devargs);
- free(pdev);
return 0;
}
@@ -501,10 +497,12 @@ platform_bus_cleanup(void)
struct rte_platform_device *pdev;
RTE_BUS_FOREACH_DEV(pdev, &platform_bus) {
+ if (rte_dev_is_probed(&pdev->device))
+ platform_bus_unplug_device(&pdev->device);
+
+ rte_devargs_remove(pdev->device.devargs);
rte_bus_remove_device(&platform_bus, &pdev->device);
- if (!rte_dev_is_probed(&pdev->device))
- continue;
- platform_bus_unplug(&pdev->device);
+ free(pdev);
}
return 0;
@@ -516,7 +514,7 @@ static struct rte_bus platform_bus = {
.find_device = rte_bus_generic_find_device,
.match = platform_bus_match,
.probe_device = platform_bus_probe_device,
- .unplug = platform_bus_unplug,
+ .unplug_device = platform_bus_unplug_device,
.parse = platform_bus_parse,
.dma_map = platform_bus_dma_map,
.dma_unmap = platform_bus_dma_unmap,
diff --git a/drivers/bus/uacce/uacce.c b/drivers/bus/uacce/uacce.c
index 8a3c55b248..bfe1f26557 100644
--- a/drivers/bus/uacce/uacce.c
+++ b/drivers/bus/uacce/uacce.c
@@ -385,40 +385,10 @@ uacce_probe_device(struct rte_driver *drv, struct rte_device *dev)
}
static int
-uacce_cleanup(void)
+uacce_unplug_device(struct rte_device *rte_dev)
{
- struct rte_uacce_device *dev;
- int error = 0;
-
- RTE_BUS_FOREACH_DEV(dev, &uacce_bus) {
- const struct rte_uacce_driver *dr;
- int ret = 0;
-
- if (!rte_dev_is_probed(&dev->device))
- goto free;
- dr = RTE_BUS_DRIVER(dev->device.driver, *dr);
- if (dr->remove == NULL)
- goto free;
-
- ret = dr->remove(dev);
- if (ret < 0) {
- rte_errno = errno;
- error = -1;
- }
- dev->device.driver = NULL;
-
-free:
- rte_bus_remove_device(&uacce_bus, &dev->device);
- free(dev);
- }
-
- return error;
-}
-
-static int
-uacce_detach_dev(struct rte_uacce_device *dev)
-{
- const struct rte_uacce_driver *dr = RTE_BUS_DRIVER(dev->device.driver, *dr);
+ const struct rte_uacce_driver *dr = RTE_BUS_DRIVER(rte_dev->driver, *dr);
+ struct rte_uacce_device *dev = RTE_BUS_DEVICE(rte_dev, *dev);
int ret = 0;
UACCE_BUS_DEBUG("detach device %s using driver: %s", dev->device.name, dr->driver.name);
@@ -429,25 +399,32 @@ uacce_detach_dev(struct rte_uacce_device *dev)
return ret;
}
- dev->device.driver = NULL;
-
return 0;
}
static int
-uacce_unplug(struct rte_device *dev)
+uacce_cleanup(void)
{
- struct rte_uacce_device *uacce_dev = RTE_BUS_DEVICE(dev, *uacce_dev);
- int ret;
+ struct rte_uacce_device *dev;
+ int error = 0;
- ret = uacce_detach_dev(uacce_dev);
- if (ret == 0) {
- rte_bus_remove_device(&uacce_bus, &uacce_dev->device);
- rte_devargs_remove(dev->devargs);
- free(uacce_dev);
+ RTE_BUS_FOREACH_DEV(dev, &uacce_bus) {
+ int ret = 0;
+
+ if (rte_dev_is_probed(&dev->device)) {
+ ret = uacce_unplug_device(&dev->device);
+ if (ret < 0) {
+ rte_errno = errno;
+ error = -1;
+ }
+ }
+
+ rte_devargs_remove(dev->device.devargs);
+ rte_bus_remove_device(&uacce_bus, &dev->device);
+ free(dev);
}
- return ret;
+ return error;
}
static int
@@ -577,7 +554,7 @@ static struct rte_bus uacce_bus = {
.cleanup = uacce_cleanup,
.match = uacce_bus_match,
.probe_device = uacce_probe_device,
- .unplug = uacce_unplug,
+ .unplug_device = uacce_unplug_device,
.find_device = rte_bus_generic_find_device,
.parse = uacce_parse,
.dev_iterate = rte_bus_generic_dev_iterate,
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 09221ccdea..7e94f86e28 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -343,19 +343,15 @@ rte_vdev_init(const char *name, const char *args)
}
static int
-vdev_remove_driver(struct rte_vdev_device *dev)
+vdev_unplug_device(struct rte_device *rte_dev)
{
- const char *name = rte_vdev_device_name(dev);
- const struct rte_vdev_driver *driver;
+ const struct rte_vdev_driver *driver = RTE_BUS_DRIVER(rte_dev->driver, *driver);
+ struct rte_vdev_device *dev = RTE_BUS_DEVICE(rte_dev, *dev);
- if (!dev->device.driver) {
- VDEV_LOG(DEBUG, "no driver attach to device %s", name);
- return 1;
- }
+ if (driver->remove)
+ return driver->remove(dev);
- driver = RTE_BUS_DRIVER(dev->device.driver, *driver);
-
- return driver->remove(dev);
+ return 0;
}
RTE_EXPORT_SYMBOL(rte_vdev_uninit)
@@ -376,7 +372,12 @@ rte_vdev_uninit(const char *name)
goto unlock;
}
- ret = vdev_remove_driver(dev);
+ if (rte_dev_is_probed(&dev->device)) {
+ ret = vdev_unplug_device(&dev->device);
+ } else {
+ VDEV_LOG(DEBUG, "no driver attach to device %s", name);
+ ret = 1;
+ }
if (ret)
goto unlock;
@@ -553,27 +554,21 @@ vdev_cleanup(void)
struct rte_vdev_device *dev;
int error = 0;
+ rte_spinlock_recursive_lock(&vdev_device_list_lock);
RTE_BUS_FOREACH_DEV(dev, &rte_vdev_bus) {
- const struct rte_vdev_driver *drv;
int ret;
- if (!rte_dev_is_probed(&dev->device))
- goto free;
-
- drv = RTE_BUS_DRIVER(dev->device.driver, *drv);
-
- if (drv->remove == NULL)
- goto free;
-
- ret = drv->remove(dev);
- if (ret < 0)
- error = -1;
+ if (rte_dev_is_probed(&dev->device)) {
+ ret = vdev_unplug_device(&dev->device);
+ if (ret < 0)
+ error = -1;
+ }
- dev->device.driver = NULL;
-free:
+ rte_devargs_remove(dev->device.devargs);
rte_bus_remove_device(&rte_vdev_bus, &dev->device);
free(dev);
}
+ rte_spinlock_recursive_unlock(&vdev_device_list_lock);
return error;
}
@@ -591,12 +586,6 @@ vdev_find_device(const struct rte_bus *bus, const struct rte_device *start,
return dev;
}
-static int
-vdev_unplug(struct rte_device *dev)
-{
- return rte_vdev_uninit(dev->name);
-}
-
static enum rte_iova_mode
vdev_get_iommu_class(void)
{
@@ -623,7 +612,7 @@ static struct rte_bus rte_vdev_bus = {
.find_device = vdev_find_device,
.match = vdev_bus_match,
.probe_device = vdev_probe_device,
- .unplug = vdev_unplug,
+ .unplug_device = vdev_unplug_device,
.parse = vdev_parse,
.dma_map = vdev_dma_map,
.dma_unmap = vdev_dma_unmap,
diff --git a/lib/eal/common/eal_common_dev.c b/lib/eal/common/eal_common_dev.c
index 2a2103ec57..762ed09e21 100644
--- a/lib/eal/common/eal_common_dev.c
+++ b/lib/eal/common/eal_common_dev.c
@@ -385,19 +385,21 @@ local_dev_remove(struct rte_device *dev)
{
int ret;
- if (dev->bus->unplug == NULL) {
- EAL_LOG(ERR, "Function unplug not supported by bus (%s)",
+ if (dev->bus->unplug_device == NULL) {
+ EAL_LOG(ERR, "Function unplug_device not supported by bus (%s)",
dev->bus->name);
return -ENOTSUP;
}
- ret = dev->bus->unplug(dev);
+ ret = dev->bus->unplug_device(dev);
if (ret) {
EAL_LOG(ERR, "Driver cannot detach the device (%s)",
dev->name);
return (ret < 0) ? ret : -ENOENT;
}
+ dev->driver = NULL;
+
return 0;
}
diff --git a/lib/eal/include/bus_driver.h b/lib/eal/include/bus_driver.h
index 9711e6712b..fde55ff06d 100644
--- a/lib/eal/include/bus_driver.h
+++ b/lib/eal/include/bus_driver.h
@@ -101,7 +101,7 @@ typedef int (*rte_bus_probe_device_t)(struct rte_driver *drv, struct rte_device
* 0 on success.
* !0 on error.
*/
-typedef int (*rte_bus_unplug_t)(struct rte_device *dev);
+typedef int (*rte_bus_unplug_device_t)(struct rte_device *dev);
/**
* Bus specific parsing function.
@@ -323,7 +323,7 @@ struct rte_bus {
rte_bus_find_device_t find_device; /**< Find a device on the bus */
rte_bus_match_t match; /**< Check if driver matches device */
rte_bus_probe_device_t probe_device; /**< Probe single device with driver */
- rte_bus_unplug_t unplug; /**< Remove single device from driver */
+ rte_bus_unplug_device_t unplug_device; /**< Remove single device from driver */
rte_bus_parse_t parse; /**< Parse a device name */
rte_bus_dev_compare_t dev_compare; /**< Compare two device names */
rte_bus_devargs_parse_t devargs_parse; /**< Parse bus devargs */
--
2.53.0
^ permalink raw reply related
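The eal_common_dev.c hunk above moves the clearing of the driver reference out of the bus callbacks and into the generic removal path. A minimal standalone sketch of that flow follows. It assumes simplified, hypothetical types (model_bus, model_device, model_driver) rather than the real EAL structures, and only mirrors the return-code handling visible in the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the EAL types. */
struct model_device;

struct model_bus {
	const char *name;
	/* Counterpart of probe_device: release what probing acquired. */
	int (*unplug_device)(struct model_device *dev);
};

struct model_driver {
	const char *name;
};

struct model_device {
	const char *name;
	const struct model_bus *bus;
	const struct model_driver *driver; /* non-NULL while probed */
};

/* Mirrors the flow of local_dev_remove() after the patch. */
static int
model_dev_remove(struct model_device *dev)
{
	int ret;

	assert(dev != NULL && dev->bus != NULL);

	if (dev->bus->unplug_device == NULL)
		return -ENOTSUP;

	ret = dev->bus->unplug_device(dev);
	if (ret != 0)
		/* Positive values from the bus are folded into -ENOENT. */
		return (ret < 0) ? ret : -ENOENT;

	/* Only the generic layer clears the driver reference. */
	dev->driver = NULL;
	return 0;
}

/* Example bus callbacks used to exercise the possible outcomes. */
static int unplug_ok(struct model_device *dev) { (void)dev; return 0; }
static int unplug_busy(struct model_device *dev) { (void)dev; return -EBUSY; }
static int unplug_positive(struct model_device *dev) { (void)dev; return 1; }
```

With this split, a bus callback that fails leaves the device marked as probed, so a later removal attempt can still reach the driver.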
* [PATCH v2 06/10] bus/vmbus: allocate interrupt during probing
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, Wei Hu
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
Allocating the interrupt handle at scan time is a waste of memory if the
device is never probed (for example, if an allowlist is passed).
Instead, allocate this handle at the time probe_device is called.
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since v1:
- fixed/reordered interrupt handle allocation,
---
drivers/bus/vmbus/linux/vmbus_bus.c | 6 ------
drivers/bus/vmbus/vmbus_common.c | 18 ++++++++++++++++--
2 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/bus/vmbus/linux/vmbus_bus.c b/drivers/bus/vmbus/linux/vmbus_bus.c
index 0af10f6a69..77d904ad6d 100644
--- a/drivers/bus/vmbus/linux/vmbus_bus.c
+++ b/drivers/bus/vmbus/linux/vmbus_bus.c
@@ -345,12 +345,6 @@ vmbus_scan_one(const char *name)
}
}
- /* Allocate interrupt handle instance */
- dev->intr_handle =
- rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
- if (dev->intr_handle == NULL)
- goto error;
-
/* device is valid, add in list (sorted) */
VMBUS_LOG(DEBUG, "Adding vmbus device %s", name);
diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
index 74c1ddff69..bfb45e963c 100644
--- a/drivers/bus/vmbus/vmbus_common.c
+++ b/drivers/bus/vmbus/vmbus_common.c
@@ -100,10 +100,16 @@ vmbus_probe_device(struct rte_driver *drv, struct rte_device *dev)
return 1;
}
+ /* allocate interrupt handle instance */
+ vmbus_dev->intr_handle =
+ rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+ if (vmbus_dev->intr_handle == NULL)
+ return -ENOMEM;
+
/* map resources for device */
ret = rte_vmbus_map_device(vmbus_dev);
if (ret != 0)
- return ret;
+ goto free_intr;
if (vmbus_dev->device.numa_node < 0 && rte_socket_count() > 1)
VMBUS_LOG(INFO, "Device %s is not NUMA-aware", guid);
@@ -112,7 +118,15 @@ vmbus_probe_device(struct rte_driver *drv, struct rte_device *dev)
VMBUS_LOG(INFO, " probe driver: %s", vmbus_drv->driver.name);
ret = vmbus_drv->probe(vmbus_drv, vmbus_dev);
if (ret != 0)
- rte_vmbus_unmap_device(vmbus_dev);
+ goto unmap;
+
+ return 0;
+
+unmap:
+ rte_vmbus_unmap_device(vmbus_dev);
+free_intr:
+ rte_intr_instance_free(vmbus_dev->intr_handle);
+ vmbus_dev->intr_handle = NULL;
return ret;
}
--
2.53.0
^ permalink raw reply related
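The error handling added to vmbus_probe_device() in the patch above follows the usual goto-unwind pattern: each step that fails jumps to a label that undoes only the steps already completed. A minimal standalone sketch is given below. The model_* names are hypothetical, malloc() stands in for rte_intr_instance_alloc(), and the mapping and driver probe results are injected so the failure paths can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the bus specific device. */
struct model_dev {
	void *intr_handle; /* allocated at probe time, not at scan time */
	int mapped;        /* 1 while resources are mapped */
	int map_result;    /* injected result for the mapping step */
	int probe_result;  /* injected result for the driver probe */
};

static int
model_map(struct model_dev *dev)
{
	if (dev->map_result != 0)
		return dev->map_result;
	dev->mapped = 1;
	return 0;
}

static void
model_unmap(struct model_dev *dev)
{
	dev->mapped = 0;
}

static int
model_probe_device(struct model_dev *dev)
{
	int ret;

	assert(dev != NULL);

	/* Stands in for rte_intr_instance_alloc(). */
	dev->intr_handle = malloc(16);
	if (dev->intr_handle == NULL)
		return -ENOMEM;

	ret = model_map(dev);
	if (ret != 0)
		goto free_intr;

	ret = dev->probe_result; /* stands in for the driver probe call */
	if (ret != 0)
		goto unmap;

	return 0;

unmap:
	model_unmap(dev);
free_intr:
	free(dev->intr_handle);
	dev->intr_handle = NULL;
	return ret;
}
```

Resetting the pointer to NULL after freeing keeps a later cleanup pass from releasing the same handle twice.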
* [PATCH v2 05/10] bus/vmbus: fix interrupt leak in cleanup
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, stable, Wei Hu
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
When calling this bus cleanup, the interrupt handle was not released.
Fixes: 65780eada9d9 ("bus/vmbus: support cleanup")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
drivers/bus/vmbus/vmbus_common.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/bus/vmbus/vmbus_common.c b/drivers/bus/vmbus/vmbus_common.c
index 01573927ce..74c1ddff69 100644
--- a/drivers/bus/vmbus/vmbus_common.c
+++ b/drivers/bus/vmbus/vmbus_common.c
@@ -150,6 +150,7 @@ rte_vmbus_cleanup(void)
error = -1;
rte_vmbus_unmap_device(dev);
+ rte_intr_instance_free(dev->intr_handle);
dev->device.driver = NULL;
rte_bus_remove_device(&rte_vmbus_bus, &dev->device);
--
2.53.0
^ permalink raw reply related
* [PATCH v2 04/10] bus/pci: fix mapping leak in bus cleanup
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, stable, Chenbo Xia, Nipun Gupta,
Morten Brørup, Kevin Laatz
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
When calling this bus cleanup, PCI resources were not unmapped.
Fixes: 1cab1a40ea9b ("bus: cleanup devices on shutdown")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
drivers/bus/pci/pci_common.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index fd18b8772b..791e9a7b49 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -344,6 +344,10 @@ pci_cleanup(void)
rte_errno = errno;
error = -1;
}
+
+ if (drv->drv_flags & RTE_PCI_DRV_NEED_MAPPING)
+ rte_pci_unmap_device(dev);
+
dev->device.driver = NULL;
free:
--
2.53.0
^ permalink raw reply related
* [PATCH v2 02/10] dma/idxd: remove next pointer in bus specific device
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: dev
Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal, Kevin Laatz
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
The dma/idxd devices are now stored in a list of generic rte_device
objects.
Fixes: b4f0974a995b ("bus: factorize device list")
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
drivers/dma/idxd/idxd_bus.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c
index 4810d52f2a..2ec526ec09 100644
--- a/drivers/dma/idxd/idxd_bus.c
+++ b/drivers/dma/idxd/idxd_bus.c
@@ -34,7 +34,6 @@ struct dsa_wq_addr {
/** a DSA device instance */
struct rte_dsa_device {
struct rte_device device; /**< Inherit core device */
- TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */
char wq_name[32]; /**< the workqueue name/number e.g. wq0.1 */
struct dsa_wq_addr addr; /**< Identifies the specific WQ */
--
2.53.0
^ permalink raw reply related
* [PATCH v2 03/10] bus/vdev: remove driver setting in probe
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: dev; +Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
Setting the device driver field is not the responsibility of the
probe_device callback anymore, but that of EAL (see local_dev_probe).
Yet, because of the VDEV API, rte_vdev_init() must be updated to mark
the device as probed.
Fixes: f282771a04ef ("bus: factorize driver reference")
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since v1:
- implement the same way as EAL,
---
drivers/bus/vdev/vdev.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 3bddf8938c..09221ccdea 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -188,7 +188,6 @@ vdev_probe_device(struct rte_driver *drv, struct rte_device *dev)
struct rte_vdev_driver *vdev_drv = RTE_BUS_DRIVER(drv, *vdev_drv);
const char *name;
enum rte_iova_mode iova_mode;
- int ret;
name = rte_vdev_device_name(vdev_dev);
VDEV_LOG(DEBUG, "Search driver to probe device %s", name);
@@ -200,10 +199,7 @@ vdev_probe_device(struct rte_driver *drv, struct rte_device *dev)
return -1;
}
- ret = vdev_drv->probe(vdev_dev);
- if (ret == 0)
- vdev_dev->device.driver = &vdev_drv->driver;
- return ret;
+ return vdev_drv->probe(vdev_dev);
}
/* The caller shall be responsible for thread-safe */
@@ -328,7 +324,10 @@ rte_vdev_init(const char *name, const char *args)
} else if (rte_dev_is_probed(&dev->device)) {
ret = -EEXIST;
} else {
+ dev->device.driver = drv;
ret = rte_vdev_bus.probe_device(drv, &dev->device);
+ if (ret != 0)
+ dev->device.driver = NULL;
}
if (ret < 0) {
/* If fails, remove it from vdev list */
--
2.53.0
^ permalink raw reply related
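The rte_vdev_init() change in the patch above can be read as: the caller marks the device as probed before invoking the bus probe callback, and reverts the mark if probing fails. A minimal standalone sketch is shown below, using hypothetical model_* types rather than the real vdev structures. The probe callback here deliberately never touches the driver field.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_device;

/* Hypothetical stand-in for a vdev driver. */
struct model_driver {
	int (*probe)(struct model_device *dev);
};

struct model_device {
	const struct model_driver *driver; /* non-NULL means "probed" */
};

static int
model_is_probed(const struct model_device *dev)
{
	return dev->driver != NULL;
}

/* Bus callback: runs the driver probe, leaves dev->driver alone. */
static int
model_bus_probe_device(const struct model_driver *drv, struct model_device *dev)
{
	return drv->probe(dev);
}

/* Caller side, mirroring the branch added to rte_vdev_init(). */
static int
model_vdev_init(const struct model_driver *drv, struct model_device *dev)
{
	int ret;

	assert(drv != NULL && dev != NULL);

	if (model_is_probed(dev))
		return -EEXIST;

	dev->driver = drv;
	ret = model_bus_probe_device(drv, dev);
	if (ret != 0)
		dev->driver = NULL;
	return ret;
}

/* Example driver callbacks. */
static int probe_ok(struct model_device *dev) { (void)dev; return 0; }
static int probe_fail(struct model_device *dev) { (void)dev; return -EIO; }
```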
* [PATCH v2 01/10] bus: fix reference to plug callback
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: dev; +Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal
In-Reply-To: <20260618152826.490569-1-david.marchand@redhat.com>
Remove the now unused typedef, and update the documentation
and a log message following the callback rename.
Fixes: 76622feba9e6 ("bus: refactor device probe")
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since v1:
- remove missed rte_bus_plug_t typedef,
---
doc/guides/prog_guide/device_hotplug.rst | 2 +-
lib/eal/common/eal_common_dev.c | 2 +-
lib/eal/include/bus_driver.h | 13 -------------
3 files changed, 2 insertions(+), 15 deletions(-)
diff --git a/doc/guides/prog_guide/device_hotplug.rst b/doc/guides/prog_guide/device_hotplug.rst
index 9896a097f3..7eb7fbcc2b 100644
--- a/doc/guides/prog_guide/device_hotplug.rst
+++ b/doc/guides/prog_guide/device_hotplug.rst
@@ -234,7 +234,7 @@ When ``rte_dev_probe()`` is called, the following sequence occurs:
and the attach operation fails if the device is not found.
#. **Device Probe**:
- The bus's ``plug()`` method is called, which triggers the device driver's probe function.
+ The bus's ``probe_device()`` method is called, which triggers the device driver's probe function.
The probe function typically allocates device-specific resources,
maps device memory regions, initializes device hardware,
and registers the device with the appropriate subsystem (e.g., ethdev for network devices).
diff --git a/lib/eal/common/eal_common_dev.c b/lib/eal/common/eal_common_dev.c
index 48b631532a..2a2103ec57 100644
--- a/lib/eal/common/eal_common_dev.c
+++ b/lib/eal/common/eal_common_dev.c
@@ -193,7 +193,7 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
goto err_devarg;
if (da->bus->probe_device == NULL) {
- EAL_LOG(ERR, "Function plug not supported by bus (%s)",
+ EAL_LOG(ERR, "Function probe_device not supported by bus (%s)",
da->bus->name);
ret = -ENOTSUP;
goto err_devarg;
diff --git a/lib/eal/include/bus_driver.h b/lib/eal/include/bus_driver.h
index 0a7e23d98d..9711e6712b 100644
--- a/lib/eal/include/bus_driver.h
+++ b/lib/eal/include/bus_driver.h
@@ -75,19 +75,6 @@ typedef struct rte_device *
(*rte_bus_find_device_t)(const struct rte_bus *bus, const struct rte_device *start,
rte_dev_cmp_t cmp, const void *data);
-/**
- * Implementation specific probe function which is responsible for linking
- * devices on that bus with applicable drivers.
- *
- * @param dev
- * Device pointer that was returned by a previous call to find_device.
- *
- * @return
- * 0 on success.
- * !0 on error.
- */
-typedef int (*rte_bus_plug_t)(struct rte_device *dev);
-
/**
* Implementation specific probe function which is responsible for linking
* devices on that bus with applicable drivers.
--
2.53.0
^ permalink raw reply related
* Re: [EXTERNAL] [PATCH 00/13] Bus cleanup infrastructure and fixes
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: Hemant Agrawal
Cc: dev@dpdk.org, thomas@monjalon.net, stephen@networkplumber.org,
bruce.richardson@intel.com, fengchengwen@huawei.com, Long Li
In-Reply-To: <CAJFAV8yAPFieFh0-kFSQAVOYUcn+qmvN3KfFskh5pZbsBSAb7w@mail.gmail.com>
On Thu, 18 Jun 2026 at 10:39, David Marchand <david.marchand@redhat.com> wrote:
>
> Hello Hemant,
>
> On Wed, 17 Jun 2026 at 11:16, Hemant Agrawal <hemant.agrawal@nxp.com> wrote:
> > > > > > There is a hang on vmbus during device shutdown after applying the
> > > > > > series, I'm looking into it.
> > > > >
> > > > > Turned out to be a test issue. Please see my comments on patch 08, the
> > > patch set tested well after that fix.
> > > >
> > > > Thanks a lot for testing!
> > > >
> > > > I'll fix this regression in the next revision.
> > >
> > > Fyi Hemant, this series has a similar regression for dpaa/fslmc bus (interrupt
> > > handle allocated too late in the device probing flow).
> > > The implications seem greater than fixing vmbus though, as I am now finding
> > > bugs on the cleanup side (interrupt eventfd are never closed, for example).
> > >
> > > I'll think about how to fix it in the next revision, one option may be to leave
> > > dpaa/fslmc alone.. ?
> > > But in the long run, all bus drivers should behave consistently.
> > >
> > > I'll get back in this thread once I have a better view of the situation.
> > >
> >
> > HI David,
> > Give me some time to get this tested on the hardware.
>
> Thanks!
>
> If you did not start testing yet, please wait a bit more, I have a v2
> that should address my concerns.
> I hope I can send it in the next hours.
There are more complications and I am running out of time for this release.
I dropped all changes on dpaa and fslmc bus drivers for now.
v2, incoming.
--
David Marchand
^ permalink raw reply
* [PATCH v2 00/10] Bus cleanup infrastructure and fixes
From: David Marchand @ 2026-06-18 15:28 UTC (permalink / raw)
To: dev; +Cc: thomas, stephen, bruce.richardson, fengchengwen, longli,
hemant.agrawal
In-Reply-To: <20260611094551.1514962-1-david.marchand@redhat.com>
This is a followup of the previous bus refactoring.
See https://inbox.dpdk.org/dev/CAJFAV8zvFpLwz8SY8DUUezyJyM43eRZ17Yj30ex808eHC4ZE=g@mail.gmail.com/.
This series refactors the bus cleanup infrastructure to reduce code
duplication and fix resource leaks in several bus drivers.
It should address the leak Thomas pointed out.
The first part of the series (patches 1-6) addresses several bugs and
inconsistencies:
- Documentation and log message inconsistencies from earlier bus
refactoring
- Device list management issues in dma/idxd and bus/vdev
- Resource leaks in PCI and VMBUS bus cleanup (mappings and interrupts)
- Deferred interrupt allocation to probe time (VMBUS)
The core infrastructure changes (patches 7-8) introduce the generic
cleanup framework:
- Refactors unplug operations to be the counterpart of probe_device
- Implements rte_bus_generic_cleanup() to centralize cleanup logic
- Adds .free_device operation to struct rte_bus
The final patches (9-10) convert the VMBUS bus to use the generic
cleanup helper.
After this series, most buses use the generic cleanup helper, eliminating
duplicated code and ensuring consistent cleanup behavior across the
codebase.
NXP bus drivers require more (leak) fixes and refactoring and
are left untouched.
--
David Marchand
Changes since v1:
- dropped all changes on DPAA and FSLMC bus,
- added one more cleanup on the first patch,
- changed coding style in rte_vdev_init,
- implemented explicit .free_device instead of hack for calling free(),
- reordered interrupt handle allocation in VMBUS bus,
David Marchand (10):
bus: fix reference to plug callback
dma/idxd: remove next pointer in bus specific device
bus/vdev: remove driver setting in probe
bus/pci: fix mapping leak in bus cleanup
bus/vmbus: fix interrupt leak in cleanup
bus/vmbus: allocate interrupt during probing
bus: align unplug with device probe
bus: implement cleanup in EAL
bus/vmbus: store name in bus specific device
bus/vmbus: support unplug
doc/guides/prog_guide/device_hotplug.rst | 20 ++++---
doc/guides/rel_notes/release_26_07.rst | 4 ++
drivers/bus/auxiliary/auxiliary_common.c | 54 ++++-------------
drivers/bus/cdx/cdx.c | 29 ++-------
drivers/bus/dpaa/dpaa_bus.c | 4 +-
drivers/bus/fslmc/fslmc_bus.c | 9 +--
drivers/bus/ifpga/ifpga_bus.c | 67 ++++++---------------
drivers/bus/pci/pci_common.c | 68 +++------------------
drivers/bus/platform/platform.c | 26 +++-----
drivers/bus/uacce/uacce.c | 59 +++---------------
drivers/bus/vdev/vdev.c | 76 +++++++++---------------
drivers/bus/vmbus/bus_vmbus_driver.h | 1 +
drivers/bus/vmbus/linux/vmbus_bus.c | 16 +----
drivers/bus/vmbus/vmbus_common.c | 58 +++++++++++-------
drivers/dma/idxd/idxd_bus.c | 1 -
lib/eal/common/eal_common_bus.c | 33 +++++++++-
lib/eal/common/eal_common_dev.c | 10 ++--
lib/eal/include/bus_driver.h | 51 +++++++++++-----
18 files changed, 222 insertions(+), 364 deletions(-)
--
2.53.0
^ permalink raw reply
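The cleanup loop that the cover letter describes as being centralized can be sketched as follows. This is a standalone model only: the implementation of rte_bus_generic_cleanup() is not shown in this part of the thread, so the model_* names, the singly linked list, and the callback signatures are assumptions. The shape of the loop (unplug when probed, record failures, then always release the device) follows the per-bus cleanup functions visible in the diffs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a bus and its devices. */
struct model_device {
	struct model_device *next;
	int probed;        /* non-zero when a driver is attached */
	int unplug_result; /* injected result for the unplug step */
};

struct model_bus {
	struct model_device *head;
	int (*unplug_device)(struct model_device *dev);
	void (*free_device)(struct model_device *dev);
};

static int model_freed; /* counts free_device calls, for the usage check */

static int
model_unplug(struct model_device *dev)
{
	return dev->unplug_result;
}

static void
model_free(struct model_device *dev)
{
	free(dev);
	model_freed++;
}

/* Generic cleanup: every device is released, even if its unplug failed. */
static int
model_bus_cleanup(struct model_bus *bus)
{
	struct model_device *dev, *next;
	int error = 0;

	assert(bus != NULL);

	for (dev = bus->head; dev != NULL; dev = next) {
		next = dev->next; /* saved before dev is freed */

		if (dev->probed && bus->unplug_device != NULL) {
			if (bus->unplug_device(dev) < 0)
				error = -1;
		}
		if (bus->free_device != NULL)
			bus->free_device(dev);
	}
	bus->head = NULL;
	return error;
}

/* Helper to populate the list in the usage check. */
static struct model_device *
model_add(struct model_bus *bus, int probed, int unplug_result)
{
	struct model_device *dev = calloc(1, sizeof(*dev));

	if (dev == NULL)
		return NULL;
	dev->probed = probed;
	dev->unplug_result = unplug_result;
	dev->next = bus->head;
	bus->head = dev;
	return dev;
}
```

Keeping the release step behind a per-bus callback lets a bus whose device is embedded in a larger allocation free the right pointer.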
* Re: [PATCH v3 1/1] pcapng: add user-supplied timestamp support
From: Stephen Hemminger @ 2026-06-18 15:22 UTC (permalink / raw)
To: Dawid Wesierski; +Cc: dev, thomas, Marek Kasiewicz
In-Reply-To: <20260618143819.310046-1-dawid.wesierski@intel.com>
On Thu, 18 Jun 2026 10:38:15 -0400
Dawid Wesierski <dawid.wesierski@intel.com> wrote:
> + * @param ts
> + * Packet timestamp in nanoseconds since the Unix epoch. If zero, the
> + * current TSC is captured and converted to epoch ns by
> + * rte_pcapng_write_packets() when the packet is written.
> *
It might help users if a helper rte_tsc_to_epoch() was exposed.
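A minimal sketch of what such a helper could compute, assuming the caller has captured one reference pair (a TSC reading and the matching epoch time) and knows the TSC frequency. The names `tsc_clock` and `tsc_to_epoch_ns` are hypothetical and not part of DPDK; in a DPDK application the inputs would presumably come from `rte_get_tsc_cycles()` and `rte_get_tsc_hz()`.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/* Hypothetical reference point: a TSC reading and the wall-clock time
 * (ns since the Unix epoch) observed at the same moment. */
struct tsc_clock {
	uint64_t tsc_base; /* TSC cycles at calibration */
	uint64_t ns_base;  /* epoch nanoseconds at calibration */
	uint64_t tsc_hz;   /* TSC frequency in cycles per second */
};

/* Convert a TSC reading taken at or after calibration to epoch ns.
 * Whole seconds and the sub-second remainder are scaled separately so
 * the intermediate products stay within 64 bits for realistic TSC
 * frequencies and uptimes. */
static uint64_t
tsc_to_epoch_ns(const struct tsc_clock *clk, uint64_t tsc)
{
	assert(clk->tsc_hz != 0 && tsc >= clk->tsc_base);

	uint64_t delta = tsc - clk->tsc_base;
	uint64_t secs = delta / clk->tsc_hz;
	uint64_t rem = delta % clk->tsc_hz;

	return clk->ns_base + secs * NS_PER_S + rem * NS_PER_S / clk->tsc_hz;
}
```

With a 2 GHz TSC, a reading 3,000,000,000 cycles after calibration maps to 1.5 seconds after the reference time.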
^ permalink raw reply
* Re: [PATCH v3 1/1] pcapng: add user-supplied timestamp support
From: Stephen Hemminger @ 2026-06-18 15:20 UTC (permalink / raw)
To: Dawid Wesierski; +Cc: dev, thomas, Marek Kasiewicz
In-Reply-To: <20260618144432.312767-1-dawid.wesierski@intel.com>
On Thu, 18 Jun 2026 10:44:29 -0400
Dawid Wesierski <dawid.wesierski@intel.com> wrote:
> +static inline struct rte_mbuf *
> rte_pcapng_copy(uint16_t port_id, uint32_t queue,
> const struct rte_mbuf *m, struct rte_mempool *mp,
> uint32_t length,
> - enum rte_pcapng_direction direction, const char *comment);
> + enum rte_pcapng_direction direction, const char *comment)
> +{
> + return rte_pcapng_copy_ts(port_id, queue, m, mp, length, direction,
> + comment, 0);
> +}
>
Switching from an exported function to an inline in the header would cause
ABI breakage: a new build of the library would no longer provide the old
function for runtime linking of existing binaries.
In this case, please just keep the old function name but add
a parameter using function versioning.
^ permalink raw reply
* Re: net/ice: VLAN mode changes after port reset or SIGKILL, VLAN RX offload broken
From: Taras Bilous @ 2026-06-18 14:43 UTC (permalink / raw)
To: dev
In-Reply-To: <CAL5H3VnP8CJbEaid9H5R+vKYd+-gooRDcixYy9xUR7myzsCa8w@mail.gmail.com>
Hi,
A quick update on this issue.
I can no longer reproduce the problem with DPDK 25.11.2 on the following
NIC:
- Intel(R) Ethernet Network Adapter E810-CQDA2
- Ethernet Controller E810-C for QSFP (rev 02)
- Subsystem: Intel Ethernet Network Adapter E810-C-Q2
However, the issue is still reproducible with DPDK 25.11.2 on:
- E810-C 100GbE Controller
- Ethernet Controller E810-C for QSFP (rev 02)
- Subsystem: Intel Ethernet 100G 2P E810-C Adapter
The same test scenario was used in both cases.
Thanks,
Taras
On Tue, Feb 10, 2026 at 11:05 AM Taras Bilous <tarasb@interfacemasters.com>
wrote:
> Hi All,
>
> I am seeing multiple issues with net_ice where VLAN mode changes after a
> device reset or an ungraceful application exit, which then breaks RX VLAN
> offloading.
>
> The same NIC and the same DDP package behave correctly after a cold start,
> but switch to a different VLAN mode after either port reset (or
> rte_eth_dev_reset()) in testpmd or after killing the application with
> SIGKILL. After that, VLAN tags appear to be stripped in hardware, but VLAN
> metadata is no longer delivered to vlan_tci in mbuf structure.
>
> *My setup is the following:*
> Debian Bookworm
> Ethernet controller: Intel(R) Ethernet Controller E810-C for QSFP
> Firmware (NVM): 4.91 and 4.51 were tested
> DDP: ICE COMMS Package 1.3.55.0 and 1.3.50.0 were tested
> DPDK: 25.11
>
> *First scenario: port reset in testpmd*
>
>> :~$ sudo ./DPDK/dpdk-25.11.0/dpdk-25.11/build/app/dpdk-testpmd -l 0-15 -a
>> 0000:04:00.0 -a 0000:04:00.1 -- -i
>> EAL: Detected CPU lcores: 90
>> EAL: Detected NUMA nodes: 1
>> EAL: Detected static linkage of DPDK
>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> EAL: Selected IOVA mode 'PA'
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (double VLAN mode)
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (double VLAN mode)
>> Interactive-mode selected
>> Warning: NUMA should be configured manually by using --port-numa-config
>> and --ring-numa-config parameters along with --numa.
>> testpmd: create a new mbuf pool <mb_pool_0>: n=267456, size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> Configuring Port 0 (socket 0)
>> ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 0).
>> Configuring Port 1 (socket 0)
>> ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 1).
>> Port 0: link state change event
>> Port 0: link state change event
>> Checking link statuses...
>> Done
>> testpmd> port stop all
>> Stopping ports...
>> Checking link statuses...
>> Done
>> testpmd> port reset all
>> Resetting ports...
>> ETHDEV: Device with port_id=0 already stopped
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (single VLAN mode)
>> ETHDEV: Device with port_id=1 already stopped
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (single VLAN mode)
>
>
> After this reset, RX VLAN offload no longer behaves correctly. VLAN tags
> are stripped from packets, but VLAN TCI information is missing in the
> received mbufs vlan_tci fields.
>
> *Second scenario: killing the application with SIGKILL*
>
>> :~$ sudo ./DPDK/dpdk-25.11.0/dpdk-25.11/build/app/dpdk-testpmd -l 0-15 -a
>> 0000:04:00.0 -a 0000:04:00.1 -- -i
>> EAL: Detected CPU lcores: 90
>> EAL: Detected NUMA nodes: 1
>> EAL: Detected static linkage of DPDK
>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> EAL: Selected IOVA mode 'PA'
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (double VLAN mode)
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (double VLAN mode)
>> Interactive-mode selected
>> Warning: NUMA should be configured manually by using --port-numa-config
>> and --ring-numa-config parameters along with --numa.
>> testpmd: create a new mbuf pool <mb_pool_0>: n=267456, size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> Configuring Port 0 (socket 0)
>> ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 0).
>> Configuring Port 1 (socket 0)
>> ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 1).
>> Port 0: link state change event
>> Port 0: link state change event
>> Checking link statuses...
>> Done
>> testpmd> Killed
>>
>
>
>> :~$ sudo ./DPDK/dpdk-25.11.0/dpdk-25.11/build/app/dpdk-testpmd -l 0-15 -a
>> 0000:04:00.0 -a 0000:04:00.1 -- -i
>> EAL: Detected CPU lcores: 90
>> EAL: Detected NUMA nodes: 1
>> EAL: Detected static linkage of DPDK
>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> EAL: Selected IOVA mode 'PA'
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (single VLAN mode)
>> ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.55.0, ICE COMMS
>> Package (single VLAN mode)
>> Interactive-mode selected
>> Warning: NUMA should be configured manually by using --port-numa-config
>> and --ring-numa-config parameters along with --numa.
>> testpmd: create a new mbuf pool <mb_pool_0>: n=267456, size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> Configuring Port 0 (socket 0)
>> ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 0).
>> Configuring Port 1 (socket 0)
>> ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 1).
>> Port 0: link state change event
>> Port 0: link state change event
>> Checking link statuses...
>> Done
>> testpmd>
>
>
> At this point the behavior is the same as after port reset: RX VLAN
> offloading is broken.
>
> The observed RX VLAN offloading behavior then is very similar to the issue
> described in the following bug report, where VLAN stripping works but
> metadata is missing:
> https://bugs.dpdk.org/show_bug.cgi?id=1677
> The trigger is different, but the resulting VLAN RX offload behavior
> appears to be the same.
> I can reproduce this reliably and can provide more logs or information if
> needed.
>
> Best regards,
> Taras
>
^ permalink raw reply
* [PATCH] examples/ptp_tap_relay_sw: forbid shadowed variables
From: Thomas Monjalon @ 2026-06-18 14:25 UTC (permalink / raw)
To: dev; +Cc: Rajesh Kumar
Removing the compilation flag no_shadow_cflag
forbids shadowing a variable in this example.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
examples/ptp_tap_relay_sw/meson.build | 1 -
1 file changed, 1 deletion(-)
diff --git a/examples/ptp_tap_relay_sw/meson.build b/examples/ptp_tap_relay_sw/meson.build
index e78b284ad8..f9fb6780f7 100644
--- a/examples/ptp_tap_relay_sw/meson.build
+++ b/examples/ptp_tap_relay_sw/meson.build
@@ -10,4 +10,3 @@ sources = files(
'ptp_tap_relay_sw.c',
)
deps += ['net']
-cflags += no_shadow_cflag
--
2.54.0
^ permalink raw reply related
* [PATCH v2 7/7] net/intel: support header split mbuf callback
From: Dawid Wesierski @ 2026-06-18 14:44 UTC (permalink / raw)
To: dev; +Cc: thomas, david.marchand, Marek Kasiewicz, Dawid Wesierski
In-Reply-To: <20260618144442.312844-1-dawid.wesierski@intel.com>
From: Marek Kasiewicz <marek.kasiewicz@intel.com>
Wire the new ethdev header split mbuf callback API into the ICE PMD.
A new dev_ops hook, hdrs_mbuf_set_cb, lets applications register a
callback (and private context) on a receive queue; the callback returns
a payload buffer (virtual address and IOVA) that overrides the default
mempool-backed payload mbuf for header split RX.
The callback is invoked at three allocation points in the ICE driver:
- initial queue setup (ice_alloc_rx_queue_mbufs),
- bulk buffer allocation (ice_rx_alloc_bufs),
- single-packet receive path (ice_recv_pkts).
This enables zero-copy RX for header split: the NIC DMAs the payload
directly into application-managed buffers (e.g., mapped frame buffers
with known IOVA), bypassing an extra memcpy from the mempool mbuf.
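As an illustration of the application side, here is a minimal sketch of a callback that hands out payload slots from one mapped frame buffer. The `hdrs_mbuf` struct is a local stand-in that models only the two fields the driver reads in this patch (`buf_addr`, `buf_iova`); the callback shape is inferred from the call sites in the diff, and `frame_pool` and `frame_pool_next_payload` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct rte_eth_hdrs_mbuf; only the two fields the
 * driver reads in this patch are modelled. */
struct hdrs_mbuf {
	void *buf_addr;    /* virtual address of the payload buffer */
	uint64_t buf_iova; /* IO address the NIC will DMA into */
};

/* Hypothetical application state: one frame buffer cut into slots. */
struct frame_pool {
	uint8_t *base_va;   /* start of the mapped frame buffer */
	uint64_t base_iova; /* IOVA corresponding to base_va */
	size_t slot_size;   /* bytes reserved per payload */
	size_t nb_slots;    /* total slots in the buffer */
	size_t next;        /* next free slot */
};

/* Callback in the shape the driver invokes: fill *out and return >= 0 to
 * override the payload buffer, or return < 0 to keep the mempool buffer. */
static int
frame_pool_next_payload(void *priv, struct hdrs_mbuf *out)
{
	struct frame_pool *pool = priv;

	assert(pool != NULL && out != NULL);
	if (pool->next >= pool->nb_slots)
		return -1; /* exhausted: driver keeps its own payload mbuf */

	size_t off = pool->next++ * pool->slot_size;
	out->buf_addr = pool->base_va + off;
	out->buf_iova = pool->base_iova + off;
	return 0;
}
```

In the diff, `data_off` is still set to `RTE_PKTMBUF_HEADROOM` after the override, so the DMA address appears to be `buf_iova` plus the headroom; an application would presumably need to size its slots accordingly.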
Depends on: "ethdev: add header split mbuf callback API"
Signed-off-by: Marek Kasiewicz <marek.kasiewicz@intel.com>
Signed-off-by: Dawid Wesierski <dawid.wesierski@intel.com>
---
drivers/net/intel/common/rx.h | 2 +
drivers/net/intel/ice/ice_ethdev.c | 1 +
drivers/net/intel/ice/ice_rxtx.c | 63 ++++++++++++++++++++++++++++++
drivers/net/intel/ice/ice_rxtx.h | 2 +
4 files changed, 68 insertions(+)
diff --git a/drivers/net/intel/common/rx.h b/drivers/net/intel/common/rx.h
index e0bf520ebd..8abb2a3ce9 100644
--- a/drivers/net/intel/common/rx.h
+++ b/drivers/net/intel/common/rx.h
@@ -113,6 +113,8 @@ struct ci_rx_queue {
uint32_t hw_time_low; /* low 32 bits of timestamp */
int ts_offset; /* dynamic mbuf timestamp field offset */
uint64_t ts_flag; /* dynamic mbuf timestamp flag */
+ rte_eth_hdrs_mbuf_callback_fn hdrs_mbuf_cb; /* hdr split mbuf cb */
+ void *hdrs_mbuf_cb_priv; /* hdr split mbuf cb priv */
};
struct { /* iavf specific values */
const struct iavf_rxq_ops *ops; /**< queue ops */
diff --git a/drivers/net/intel/ice/ice_ethdev.c b/drivers/net/intel/ice/ice_ethdev.c
index ad9c49b339..353da8f2bd 100644
--- a/drivers/net/intel/ice/ice_ethdev.c
+++ b/drivers/net/intel/ice/ice_ethdev.c
@@ -282,6 +282,7 @@ static const struct eth_dev_ops ice_eth_dev_ops = {
.dev_set_link_down = ice_dev_set_link_down,
.dev_led_on = ice_dev_led_on,
.dev_led_off = ice_dev_led_off,
+ .hdrs_mbuf_set_cb = ice_hdrs_mbuf_set_cb,
.rx_queue_start = ice_rx_queue_start,
.rx_queue_stop = ice_rx_queue_stop,
.tx_queue_start = ice_tx_queue_start,
diff --git a/drivers/net/intel/ice/ice_rxtx.c b/drivers/net/intel/ice/ice_rxtx.c
index 8d709125f7..867f595291 100644
--- a/drivers/net/intel/ice/ice_rxtx.c
+++ b/drivers/net/intel/ice/ice_rxtx.c
@@ -487,6 +487,17 @@ ice_alloc_rx_queue_mbufs(struct ci_rx_queue *rxq)
return -ENOMEM;
}
+ if (rxq->hdrs_mbuf_cb) {
+ struct rte_eth_hdrs_mbuf hdrs_mbuf = {0};
+ int ret = rxq->hdrs_mbuf_cb(rxq->hdrs_mbuf_cb_priv,
+ &hdrs_mbuf);
+
+ if (ret >= 0) {
+ mbuf_pay->buf_addr = hdrs_mbuf.buf_addr;
+ mbuf_pay->buf_iova = hdrs_mbuf.buf_iova;
+ }
+ }
+
mbuf_pay->next = NULL;
mbuf_pay->data_off = RTE_PKTMBUF_HEADROOM;
mbuf_pay->nb_segs = 1;
@@ -2126,6 +2137,16 @@ ice_rx_alloc_bufs(struct ci_rx_queue *rxq)
rxdp[i].read.pkt_addr = dma_addr;
} else {
mb->next = rxq->sw_split_buf[i].mbuf;
+ if (rxq->hdrs_mbuf_cb && mb->next) {
+ struct rte_eth_hdrs_mbuf hdrs_mbuf = {0};
+ int ret = rxq->hdrs_mbuf_cb(rxq->hdrs_mbuf_cb_priv,
+ &hdrs_mbuf);
+
+ if (ret >= 0) {
+ mb->next->buf_addr = hdrs_mbuf.buf_addr;
+ mb->next->buf_iova = hdrs_mbuf.buf_iova;
+ }
+ }
pay_addr = rte_cpu_to_le_64(rte_mbuf_data_iova_default(mb->next));
rxdp[i].read.hdr_addr = dma_addr;
rxdp[i].read.pkt_addr = pay_addr;
@@ -2810,6 +2831,17 @@ ice_recv_pkts(void *rx_queue,
break;
}
+ if (rxq->hdrs_mbuf_cb) {
+ struct rte_eth_hdrs_mbuf hdrs_mbuf = {0};
+ int ret = rxq->hdrs_mbuf_cb(rxq->hdrs_mbuf_cb_priv,
+ &hdrs_mbuf);
+
+ if (ret >= 0) {
+ nmb_pay->buf_addr = hdrs_mbuf.buf_addr;
+ nmb_pay->buf_iova = hdrs_mbuf.buf_iova;
+ }
+ }
+
nmb->next = nmb_pay;
nmb_pay->next = NULL;
@@ -4533,3 +4565,34 @@ ice_fdir_programming(struct ice_pf *pf, struct ice_fltr_desc *fdir_desc)
}
+
+int
+ice_hdrs_mbuf_set_cb(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+ void *priv, rte_eth_hdrs_mbuf_callback_fn cb)
+{
+ struct ci_rx_queue *rxq;
+
+ if (rx_queue_id >= dev->data->nb_rx_queues) {
+ PMD_DRV_LOG(ERR, "RX queue %u out of range", rx_queue_id);
+ return -EINVAL;
+ }
+
+ rxq = dev->data->rx_queues[rx_queue_id];
+ if (rxq == NULL) {
+ PMD_DRV_LOG(ERR, "RX queue %u not available or setup", rx_queue_id);
+ return -EINVAL;
+ }
+
+ if (rxq->hdrs_mbuf_cb) {
+ PMD_DRV_LOG(ERR, "RX queue %u has hdrs mbuf cb already",
+ rx_queue_id);
+ return -EEXIST;
+ }
+
+ rxq->hdrs_mbuf_cb_priv = priv;
+ rxq->hdrs_mbuf_cb = cb;
+ PMD_DRV_LOG(NOTICE, "RX queue %u register hdrs mbuf cb at %p",
+ rx_queue_id, cb);
+
+ return 0;
+}
diff --git a/drivers/net/intel/ice/ice_rxtx.h b/drivers/net/intel/ice/ice_rxtx.h
index 999b6b30d6..7ed114ee94 100644
--- a/drivers/net/intel/ice/ice_rxtx.h
+++ b/drivers/net/intel/ice/ice_rxtx.h
@@ -303,6 +303,8 @@ uint16_t ice_xmit_pkts_vec_avx512_offload(void *tx_queue,
int ice_fdir_programming(struct ice_pf *pf, struct ice_fltr_desc *fdir_desc);
int ice_tx_done_cleanup(void *txq, uint32_t free_cnt);
int ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc);
+int ice_hdrs_mbuf_set_cb(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+ void *priv, rte_eth_hdrs_mbuf_callback_fn cb);
enum rte_vect_max_simd ice_get_max_simd_bitwidth(void);
#define FDIR_PARSING_ENABLE_PER_QUEUE(ad, on) do { \
--
2.47.3
---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.
Spolka oswiadcza, ze posiada status duzego przedsiebiorcy w rozumieniu ustawy z dnia 8 marca 2013 r. o przeciwdzialaniu nadmiernym opoznieniom w transakcjach handlowych.
Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by others is strictly prohibited.
^ permalink raw reply related
* [PATCH v2 6/7] net/iavf: disable runtime queue setup capability
From: Dawid Wesierski @ 2026-06-18 14:44 UTC (permalink / raw)
To: dev; +Cc: thomas, david.marchand, Marek Kasiewicz, Dawid Wesierski
In-Reply-To: <20260618144442.312844-1-dawid.wesierski@intel.com>
From: Marek Kasiewicz <marek.kasiewicz@intel.com>
Remove the advertisement of RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP
and RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP capabilities from the
iavf VF driver.
Runtime queue setup on E810 VFs causes queue state corruption when
queues are dynamically reconfigured while the hardware rate limiter
is actively pacing TX queues. Queue configuration messages to the PF
via virtchnl can race with ongoing TX operations, leading to undefined
behavior.
By not advertising these capabilities, all queues are configured at
port start and remain stable throughout the port lifecycle.
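From the application's point of view, the effect can be sketched as a capability check before any queue setup on a started port. The bit values below are placeholders; a real application would test the `RTE_ETH_DEV_CAPA_RUNTIME_*_QUEUE_SETUP` macros against `dev_info.dev_capa`. The helper name `queue_setup_allowed` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values standing in for the rte_ethdev.h macros. */
#define CAPA_RUNTIME_RX_QUEUE_SETUP (UINT64_C(1) << 0)
#define CAPA_RUNTIME_TX_QUEUE_SETUP (UINT64_C(1) << 1)

/* Decide whether a queue may be set up given the advertised capabilities
 * and the port state. Setup is permitted on a stopped port regardless. */
static bool
queue_setup_allowed(uint64_t dev_capa, bool port_started, bool is_rx)
{
	uint64_t need = is_rx ? CAPA_RUNTIME_RX_QUEUE_SETUP
			      : CAPA_RUNTIME_TX_QUEUE_SETUP;

	if (!port_started)
		return true;
	return (dev_capa & need) != 0;
}

/* Example: the same request before and after the bits are dropped. */
static void
queue_setup_example(void)
{
	uint64_t before = CAPA_RUNTIME_RX_QUEUE_SETUP |
			  CAPA_RUNTIME_TX_QUEUE_SETUP;
	uint64_t after = 0; /* runtime bits no longer advertised */

	assert(queue_setup_allowed(before, true, true));
	assert(!queue_setup_allowed(after, true, true));
	assert(queue_setup_allowed(after, false, false));
}
```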
Signed-off-by: Marek Kasiewicz <marek.kasiewicz@intel.com>
Signed-off-by: Dawid Wesierski <dawid.wesierski@intel.com>
---
drivers/net/intel/iavf/iavf_ethdev.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/net/intel/iavf/iavf_ethdev.c b/drivers/net/intel/iavf/iavf_ethdev.c
index ec1ad02826..ab223e6afd 100644
--- a/drivers/net/intel/iavf/iavf_ethdev.c
+++ b/drivers/net/intel/iavf/iavf_ethdev.c
@@ -1160,9 +1160,6 @@ iavf_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
dev_info->reta_size = vf->vf_res->rss_lut_size;
dev_info->flow_type_rss_offloads = IAVF_RSS_OFFLOAD_ALL;
dev_info->max_mac_addrs = IAVF_NUM_MACADDR_MAX;
- dev_info->dev_capa =
- RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP |
- RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP;
dev_info->rx_offload_capa =
RTE_ETH_RX_OFFLOAD_VLAN_STRIP |
RTE_ETH_RX_OFFLOAD_QINQ_STRIP |
--
2.47.3
^ permalink raw reply related
* [PATCH v2 5/7] net/ice: timestamp all received packets when PTP is enabled
From: Dawid Wesierski @ 2026-06-18 14:44 UTC (permalink / raw)
To: dev; +Cc: thomas, david.marchand, Marek Kasiewicz, Dawid Wesierski
In-Reply-To: <20260618144442.312844-1-dawid.wesierski@intel.com>
From: Marek Kasiewicz <marek.kasiewicz@intel.com>
When PTP is enabled on the ICE PMD, hardware RX timestamps are only
applied to packets classified as IEEE 1588 (Ethertype 0x88F7). This
prevents applications from obtaining hardware timestamps on regular
UDP/IP traffic.
Remove the TIMESYNC packet type filter so that all received packets
get hardware timestamps when PTP is enabled. This is required for
time-sensitive networking applications that need per-packet arrival
timing on media traffic, such as ST 2110-21 receiver compliance
monitoring.
The change affects all three RX paths: scan, scattered, and single
packet receive functions.
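The change in the timestamping condition can be sketched as two predicates. The constants are local copies whose values are assumed to match `RTE_PTYPE_L2_MASK`, `RTE_PTYPE_L2_ETHER` and `RTE_PTYPE_L2_ETHER_TIMESYNC` in `rte_mbuf_ptype.h`; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the packet-type constants used by the removed check;
 * values assumed to match rte_mbuf_ptype.h. */
#define PTYPE_L2_MASK           0x0000000fu
#define PTYPE_L2_ETHER          0x00000001u
#define PTYPE_L2_ETHER_TIMESYNC 0x00000002u

/* Condition before the patch: only IEEE 1588 frames are timestamped. */
static bool
ts_wanted_old(bool ptp_ena, uint32_t packet_type)
{
	return ptp_ena &&
	       (packet_type & PTYPE_L2_MASK) == PTYPE_L2_ETHER_TIMESYNC;
}

/* Condition after the patch: every packet is timestamped once PTP is on. */
static bool
ts_wanted_new(bool ptp_ena, uint32_t packet_type)
{
	(void)packet_type; /* packet type no longer matters */
	return ptp_ena;
}

/* Example: a plain Ethernet frame is skipped before, stamped after. */
static void
ts_filter_example(void)
{
	assert(!ts_wanted_old(true, PTYPE_L2_ETHER));
	assert(ts_wanted_new(true, PTYPE_L2_ETHER));
}
```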
Signed-off-by: Marek Kasiewicz <marek.kasiewicz@intel.com>
Signed-off-by: Dawid Wesierski <dawid.wesierski@intel.com>
---
drivers/net/intel/ice/ice_rxtx.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/intel/ice/ice_rxtx.c b/drivers/net/intel/ice/ice_rxtx.c
index c4b5454c53..8d709125f7 100644
--- a/drivers/net/intel/ice/ice_rxtx.c
+++ b/drivers/net/intel/ice/ice_rxtx.c
@@ -2023,8 +2023,7 @@ ice_rx_scan_hw_ring(struct ci_rx_queue *rxq)
pkt_flags |= rxq->ts_flag;
}
- if (ad->ptp_ena && ((mb->packet_type &
- RTE_PTYPE_L2_MASK) == RTE_PTYPE_L2_ETHER_TIMESYNC)) {
+ if (ad->ptp_ena) {
rxq->time_high =
rte_le_to_cpu_32(rxdp[j].wb.flex_ts.ts_high);
mb->timesync = rxq->queue_id;
@@ -2390,8 +2389,7 @@ ice_recv_scattered_pkts(void *rx_queue,
pkt_flags |= rxq->ts_flag;
}
- if (ad->ptp_ena && ((first_seg->packet_type & RTE_PTYPE_L2_MASK)
- == RTE_PTYPE_L2_ETHER_TIMESYNC)) {
+ if (ad->ptp_ena) {
rxq->time_high =
rte_le_to_cpu_32(rxd.wb.flex_ts.ts_high);
first_seg->timesync = rxq->queue_id;
@@ -2881,8 +2879,7 @@ ice_recv_pkts(void *rx_queue,
pkt_flags |= rxq->ts_flag;
}
- if (ad->ptp_ena && ((rxm->packet_type & RTE_PTYPE_L2_MASK) ==
- RTE_PTYPE_L2_ETHER_TIMESYNC)) {
+ if (ad->ptp_ena) {
rxq->time_high =
rte_le_to_cpu_32(rxd.wb.flex_ts.ts_high);
rxm->timesync = rxq->queue_id;
--
2.47.3
^ permalink raw reply related
* [PATCH v2 4/7] net/ice/base: reduce default scheduler burst size
From: Dawid Wesierski @ 2026-06-18 14:44 UTC (permalink / raw)
To: dev; +Cc: thomas, david.marchand, Marek Kasiewicz, Dawid Wesierski
In-Reply-To: <20260618144442.312844-1-dawid.wesierski@intel.com>
From: Marek Kasiewicz <marek.kasiewicz@intel.com>
Reduce ICE_SCHED_DFLT_BURST_SIZE from 15 KB to 2 KB to improve
TX rate limiter granularity. The E810 TX scheduler uses a token
bucket algorithm where the burst size controls the maximum bytes
sent in a single burst before the rate limiter throttles.
A 15 KB burst allows micro-bursts of ~10 max-size frames, which
violates tight inter-packet spacing requirements in time-sensitive
networking applications such as SMPTE ST 2110-21 narrow-sender
compliance. Reducing to 2 KB forces near-constant-rate output
matching the configured shaper profile.
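The arithmetic behind the "~10 max-size frames" figure can be sketched as follows. The 1518-byte frame size (1500-byte payload, 14-byte header, 4-byte FCS) is an assumption made for this illustration, and `frames_per_burst` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed maximum standard Ethernet frame: 1500-byte payload plus
 * 14-byte header and 4-byte FCS. */
#define MAX_FRAME_BYTES 1518u

/* Whole frames of the given size that fit in one token-bucket burst. */
static uint32_t
frames_per_burst(uint32_t burst_bytes, uint32_t frame_bytes)
{
	assert(frame_bytes != 0);
	return burst_bytes / frame_bytes;
}
```

Under that assumption, a 15 KB burst (15360 bytes) admits 10 back-to-back frames, while a 2 KB burst (2048 bytes) admits only one.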
Signed-off-by: Marek Kasiewicz <marek.kasiewicz@intel.com>
Signed-off-by: Dawid Wesierski <dawid.wesierski@intel.com>
---
drivers/net/intel/ice/base/ice_type.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/intel/ice/base/ice_type.h b/drivers/net/intel/ice/base/ice_type.h
index 6d8c187689..39569ff3e3 100644
--- a/drivers/net/intel/ice/base/ice_type.h
+++ b/drivers/net/intel/ice/base/ice_type.h
@@ -1100,7 +1100,7 @@ enum ice_rl_type {
#define ICE_SCHED_NO_SHARED_RL_PROF_ID 0xFFFF
#define ICE_SCHED_DFLT_BW_WT 4
#define ICE_SCHED_INVAL_PROF_ID 0xFFFF
-#define ICE_SCHED_DFLT_BURST_SIZE (15 * 1024) /* in bytes (15k) */
+#define ICE_SCHED_DFLT_BURST_SIZE (2 * 1024) /* in bytes (2k) */
/* Access Macros for Tx Sched RL Profile data */
#define ICE_TXSCHED_GET_RL_PROF_ID(p) LE16_TO_CPU((p)->info.profile_id)
--
2.47.3
^ permalink raw reply related
* [PATCH v2 3/7] net/iavf: allow runtime queue rate limit configuration
From: Dawid Wesierski @ 2026-06-18 14:44 UTC (permalink / raw)
To: dev; +Cc: thomas, david.marchand, Marek Kasiewicz, Dawid Wesierski
In-Reply-To: <20260618144442.312844-1-dawid.wesierski@intel.com>
From: Marek Kasiewicz <marek.kasiewicz@intel.com>
Allow per-queue bandwidth rate limiting to be configured without
stopping the port when only a single TC node and single QoS element
are involved. This enables dynamic session management where individual
queue pacing rates can be changed while other queues continue
transmitting.
Also fix the queue ID assignment in the bandwidth configuration to
use the actual TM node ID rather than a sequential counter index, and
only mark the TM hierarchy as committed when the port is stopped to
permit subsequent reconfiguration.
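The gating condition in the diff can be read as a small predicate. The sketch below mirrors the three tests as they are written in the patch; `commit_rejected` is a hypothetical name and the parameters stand in for `vf->tm_conf.nb_tc_node`, `vf->qos_cap->num_elem` and `adapter->stopped`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the patched check: the commit is refused only when all three
 * tests hold at once. */
static bool
commit_rejected(uint32_t nb_tc_node, uint32_t num_elem, bool stopped)
{
	return nb_tc_node != 1 && num_elem != 1 && !stopped;
}

/* Example inputs and the outcome of the condition as written. */
static void
commit_gate_example(void)
{
	assert(!commit_rejected(1, 1, false)); /* single TC, single element */
	assert(commit_rejected(2, 2, false));  /* running port, multi TC */
	assert(!commit_rejected(2, 2, true));  /* stopped port always passes */
	assert(!commit_rejected(1, 4, false)); /* also passes as written */
}
```

Because the tests are joined with `&&`, a running port passes when either count equals 1; whether that matches the stated intent of "a single TC node and single QoS element" may be worth confirming.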
Signed-off-by: Marek Kasiewicz <marek.kasiewicz@intel.com>
Signed-off-by: Dawid Wesierski <dawid.wesierski@intel.com>
---
drivers/net/intel/iavf/iavf_tm.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/intel/iavf/iavf_tm.c b/drivers/net/intel/iavf/iavf_tm.c
index 1cf7bfb106..43d7a44337 100644
--- a/drivers/net/intel/iavf/iavf_tm.c
+++ b/drivers/net/intel/iavf/iavf_tm.c
@@ -804,8 +804,10 @@ static int iavf_hierarchy_commit(struct rte_eth_dev *dev,
int index = 0, node_committed = 0;
int i, ret_val = IAVF_SUCCESS;
- /* check if port is stopped */
- if (adapter->stopped != 1) {
+ /* check if port is stopped, except for setting queue bandwidth */
+ if (vf->tm_conf.nb_tc_node != 1 &&
+ vf->qos_cap->num_elem != 1 &&
+ adapter->stopped != 1) {
PMD_DRV_LOG(ERR, "Please stop port first");
ret_val = IAVF_ERR_NOT_READY;
goto err;
@@ -856,7 +858,7 @@ static int iavf_hierarchy_commit(struct rte_eth_dev *dev,
q_tc_mapping->tc[tm_node->tc].req.queue_count++;
if (tm_node->shaper_profile) {
- q_bw->cfg[node_committed].queue_id = node_committed;
+ q_bw->cfg[node_committed].queue_id = tm_node->id;
q_bw->cfg[node_committed].shaper.peak =
tm_node->shaper_profile->profile.peak.rate /
1000 * IAVF_BITS_PER_BYTE;
@@ -900,7 +902,8 @@ static int iavf_hierarchy_commit(struct rte_eth_dev *dev,
goto fail_clear;
vf->qtc_map = qtc_map;
- vf->tm_conf.committed = true;
+ if (adapter->stopped == 1)
+ vf->tm_conf.committed = true;
return ret_val;
fail_clear:
--
2.47.3
---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.
Spolka oswiadcza, ze posiada status duzego przedsiebiorcy w rozumieniu ustawy z dnia 8 marca 2013 r. o przeciwdzialaniu nadmiernym opoznieniom w transakcjach handlowych.
Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by others is strictly prohibited.
^ permalink raw reply related
* [PATCH v2 2/7] net/iavf: increase max ring descriptors to hardware limit
From: Dawid Wesierski @ 2026-06-18 14:44 UTC (permalink / raw)
To: dev; +Cc: thomas, david.marchand, Marek Kasiewicz, Dawid Wesierski
In-Reply-To: <20260618144442.312844-1-dawid.wesierski@intel.com>
From: Marek Kasiewicz <marek.kasiewicz@intel.com>
The Intel E810 hardware supports up to 8160 (8K - 32) descriptors per
TX/RX ring, but IAVF_MAX_RING_DESC caps it at 4096. Applications that
need deep descriptor rings for hardware rate-limited pacing (e.g.,
ST2110 video with thousands of packets per frame) cannot queue enough
packets before the pacing epoch begins.
Increase IAVF_MAX_RING_DESC to the hardware maximum of 8160 to allow
full utilization of the ring depth on E810 VFs.
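A small sketch of the kind of ring-size validation these limits imply, assuming a request must lie between the minimum and maximum and be a multiple of the 32-descriptor alignment. The macros are local copies of the values shown in this patch, and `ring_size_valid` is a hypothetical helper rather than the driver's own function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ALIGN_RING_DESC 32u
#define MIN_RING_DESC   64u
#define MAX_RING_DESC   (8192u - 32u) /* 8160, the value from this patch */

/* Accept only sizes within [min, max] that are a multiple of 32. */
static bool
ring_size_valid(uint32_t nb_desc)
{
	return nb_desc >= MIN_RING_DESC &&
	       nb_desc <= MAX_RING_DESC &&
	       nb_desc % ALIGN_RING_DESC == 0;
}

/* Example: the new maximum is itself aligned (8160 = 255 * 32). */
static void
ring_size_example(void)
{
	assert(ring_size_valid(4096)); /* old maximum still accepted */
	assert(ring_size_valid(8160)); /* new maximum */
	assert(!ring_size_valid(8192)); /* above the hardware limit */
}
```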
Signed-off-by: Marek Kasiewicz <marek.kasiewicz@intel.com>
Signed-off-by: Dawid Wesierski <dawid.wesierski@intel.com>
---
drivers/net/intel/iavf/iavf_rxtx.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/intel/iavf/iavf_rxtx.h b/drivers/net/intel/iavf/iavf_rxtx.h
index 8449236d4d..22ea415f44 100644
--- a/drivers/net/intel/iavf/iavf_rxtx.h
+++ b/drivers/net/intel/iavf/iavf_rxtx.h
@@ -16,7 +16,7 @@
/* In QLEN must be whole number of 32 descriptors. */
#define IAVF_ALIGN_RING_DESC 32
#define IAVF_MIN_RING_DESC 64
-#define IAVF_MAX_RING_DESC 4096
+#define IAVF_MAX_RING_DESC (8192 - 32)
#define IAVF_DMA_MEM_ALIGN 4096
/* Base address of the HW descriptor ring should be 128B aligned. */
#define IAVF_RING_BASE_ALIGN 128
--
2.47.3
^ permalink raw reply related