DPDK-dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5] eal: take experimental flag off of rte_memeq_timingsafe
From: Stephen Hemminger @ 2026-06-25 15:56 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger
In-Reply-To: <20260625160200.24170-1-stephen@networkplumber.org>

This function is needed in other places, and don't want to
have to propagate allow_experimental_api into those drivers.
It is stable enough and inline so no ABI exposure.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/rel_notes/release_26_07.rst | 4 ++++
 lib/eal/include/rte_memory.h           | 4 ----
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 0b1cac3e0d..a9ca81905c 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -218,6 +218,10 @@ API Changes
   - ``rte_pmd_mlx5_enable_steering``
   - ``rte_pmd_mlx5_disable_steering``
 
+* **eal: promoted timing-safe memory comparison from experimental to stable.**
+
+  The inline function ``rte_memeq_timingsafe()`` is no longer marked experimental.
+
 
 ABI Changes
 -----------
diff --git a/lib/eal/include/rte_memory.h b/lib/eal/include/rte_memory.h
index b6e97ad695..940770f1eb 100644
--- a/lib/eal/include/rte_memory.h
+++ b/lib/eal/include/rte_memory.h
@@ -747,9 +747,6 @@ void
 rte_memzero_explicit(void *dst, size_t sz);
 
 /**
- * @warning
- * @b EXPERIMENTAL: this API may change without prior notice.
- *
  * Timing-safe memory equality comparison.
  *
  * This function compares two memory regions in constant time,
@@ -770,7 +767,6 @@ rte_memzero_explicit(void *dst, size_t sz);
  * @return
  *   true if the memory regions are identical, false if they differ.
  */
-__rte_experimental
 static inline bool
 rte_memeq_timingsafe(const void *a, const void *b, size_t n)
 {
-- 
2.53.0


^ permalink raw reply related

* [PATCH 0/5] crypto: use timing-safe digest comparison
From: Stephen Hemminger @ 2026-06-25 15:56 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Timing attacks in DPDK crypto were fixed earlier but
several drivers did not use the new timing safe comparison
operation.

First patch drops the experimental flag off rte_memeq_timingsafe().
The function is a static inline with no exported symbol, no ABI change.
This avoids having to turn on experimental flag in other drivers.

The rest convert the digest verify comparisons in the uadk, ccp,
armv8 and cnxk PMDs.

This problem was reported for several drivers and for those
the Reported-by was added.

Stephen Hemminger (5):
  eal: take experimental flag off of rte_memeq_timingsafe
  crypto/uadk: use timing-safe digest comparison
  crypto/ccp: use timing-safe digest comparison
  crypto/armv8: use timing-safe digest comparison
  crypto/cnxk: use timing-safe digest comparison

 doc/guides/rel_notes/release_26_07.rst | 4 ++++
 drivers/crypto/armv8/rte_armv8_pmd.c   | 4 ++--
 drivers/crypto/ccp/ccp_crypto.c        | 8 ++++----
 drivers/crypto/cnxk/cnxk_se.h          | 2 +-
 drivers/crypto/uadk/uadk_crypto_pmd.c  | 4 ++--
 lib/eal/include/rte_memory.h           | 4 ----
 6 files changed, 13 insertions(+), 13 deletions(-)

-- 
2.53.0


^ permalink raw reply

* Re: [PATCH 3/3] net/iavf: fix Rx packets statistics underflow
From: Bruce Richardson @ 2026-06-25 15:58 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, stable
In-Reply-To: <20260625093619.726471-4-ciara.loftus@intel.com>

On Thu, Jun 25, 2026 at 09:36:19AM +0000, Ciara Loftus wrote:
> The number of Rx packets is computed as the sum of the unicast,
> multicast and broadcast packet counters, minus the discarded packet
> count:
> 
>     ipackets = rx_unicast + rx_multicast + rx_broadcast - rx_discards
> 
> The unicast, multicast and broadcast counters already include the
> packets that were subsequently dropped, so subtracting rx_discards
> yields only the packets delivered to the application. These values are
> provided by the PF in a virtchnl_eth_stats message; the PF samples them
> from separate sources and the VF cannot guarantee the order in which
> they are read. Under load, rx_discards can therefore momentarily exceed
> the sum of the unicast, multicast and broadcast counters. As ipackets is
> unsigned, the subtraction then wraps to a huge bogus value, reported to
> the application as an enormous Rx packet count and packet rate.
> 
> The read order cannot be guaranteed from the VF, so use a saturating
> subtraction: when rx_discards exceeds the sum of the unicast, multicast
> and broadcast counters essentially nothing was delivered, so report zero
> instead of underflowing.
> 
> Fixes: e71ffcc1008e ("net/iavf: fix Rx total stats")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply

* Re: [PATCH 2/3] net/ice: fix DCF Rx packets statistics underflow
From: Bruce Richardson @ 2026-06-25 15:57 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, stable
In-Reply-To: <20260625093619.726471-3-ciara.loftus@intel.com>

On Thu, Jun 25, 2026 at 09:36:18AM +0000, Ciara Loftus wrote:
> The number of Rx packets is computed as the sum of the unicast,
> multicast and broadcast packet counters, minus the discarded packet
> count:
> 
>     ipackets = rx_unicast + rx_multicast + rx_broadcast - rx_discards
> 
> The unicast, multicast and broadcast counters already include the
> packets that were subsequently dropped, so subtracting rx_discards
> yields only the packets delivered to the application. These values are
> provided by the PF in a virtchnl_eth_stats message; the PF samples them
> from separate sources and the order in which they are read cannot be
> guaranteed. Under load, rx_discards can therefore momentarily exceed the
> sum of the unicast, multicast and broadcast counters. As ipackets is
> unsigned, the subtraction then wraps to a huge bogus value, reported to
> the application as an enormous Rx packet count and packet rate.
> 
> The read order cannot be guaranteed, so use a saturating subtraction:
> when rx_discards exceeds the sum of the unicast, multicast and broadcast
> counters essentially nothing was delivered, so report zero instead of
> underflowing.
> 
> Fixes: c9f889e99616 ("net/ice: enable stats for DCF")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply

* Re: [PATCH 1/3] net/ice: fix Rx packets statistics underflow
From: Bruce Richardson @ 2026-06-25 15:57 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, stable
In-Reply-To: <20260625093619.726471-2-ciara.loftus@intel.com>

On Thu, Jun 25, 2026 at 09:36:17AM +0000, Ciara Loftus wrote:
> The number of Rx packets is computed as the sum of the unicast,
> multicast and broadcast packet counters, minus the discarded packet
> count:
> 
>     ipackets = rx_unicast + rx_multicast + rx_broadcast - rx_discards
> 
> The unicast, multicast and broadcast counters already include the
> packets that were subsequently dropped, so subtracting rx_discards
> yields only the packets delivered to the application. However, each of
> these counters is read from a separate hardware register, and the reads
> happen at slightly different instants. Under load, rx_discards (read
> last) can momentarily exceed the sum of the unicast, multicast and
> broadcast counters (read earlier). As ipackets is unsigned, the
> subtraction then wraps to a huge bogus value, reported to the
> application as an enormous Rx packet count and packet rate.
> 
> Read the rx_discards register before the unicast, multicast and
> broadcast registers. As all of these counters only ever increase, and
> the unicast, multicast and broadcast counters always include the
> discarded packets, sampling rx_discards first guarantees it can never
> exceed the later sum of the other three, so the subtraction can never
> underflow.
> 
> Fixes: a37bde56314d ("net/ice: support statistics")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply

* [PATCH v1 1/1] net/ice: fix NULL pointer dereference in DCF
From: Anatoly Burakov @ 2026-06-25 15:49 UTC (permalink / raw)
  To: dev, Bruce Richardson, Vladimir Medvedkin

When calling into common parameter checks and subsequent action check
callbacks, the DCF path never supplied the `driver_ctx` struct, yet the
callback itself dereferenced the struct.

Fix it by supplying the DCF adapter struct as context.

Fixes: db24a8092854 ("net/ice: use common action checks for switch")
Cc: vladimir.medvedkin@intel.com

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/intel/ice/ice_switch_filter.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/intel/ice/ice_switch_filter.c b/drivers/net/intel/ice/ice_switch_filter.c
index 5d9e0062af..6f5af03f6a 100644
--- a/drivers/net/intel/ice/ice_switch_filter.c
+++ b/drivers/net/intel/ice/ice_switch_filter.c
@@ -1773,6 +1773,7 @@ ice_switch_parse_pattern_action(struct ice_adapter *ad,
 		},
 		.max_actions = 1,
 		.check = ice_switch_dcf_action_check,
+		.driver_ctx = container_of(ad, struct ice_dcf_adapter, parent),
 	};
 	struct ci_flow_actions_check_param param = {
 		.allowed_types = (enum rte_flow_action_type[]){
-- 
2.47.3


^ permalink raw reply related

* [PATCH v1 4/4] net/iavf: fix potential NULL dereference
From: Anatoly Burakov @ 2026-06-25 15:48 UTC (permalink / raw)
  To: dev, Vladimir Medvedkin, Bruce Richardson
In-Reply-To: <c42d6f4efb83f4962ea096a702ed320a3d0a5eed.1782402484.git.anatoly.burakov@intel.com>

Static analysis reports that a rule might have a valid engine but invalid
rule pointer while having a valid rule size. This should not happen in
normal operation, but it's a good defensive check, so fix the potential
issue by checking for the problematic condition before dumping flow memory.

Coverity issue: 503764

Fixes: 32251f047e92 ("net/iavf: support flow dump")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/intel/iavf/iavf_generic_flow.c | 23 ++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/net/intel/iavf/iavf_generic_flow.c b/drivers/net/intel/iavf/iavf_generic_flow.c
index 8bd874c32a..84bb161bd1 100644
--- a/drivers/net/intel/iavf/iavf_generic_flow.c
+++ b/drivers/net/intel/iavf/iavf_generic_flow.c
@@ -2414,18 +2414,21 @@ iavf_flow_dev_dump(struct rte_eth_dev *dev,
 		if (flow != NULL && f != flow)
 			continue;
 
+		/* this should not happen */
+		if (f->engine == NULL || f->rule == NULL) {
+			PMD_DRV_LOG(DEBUG, "Invalid flow");
+			continue;
+		}
+
 		found = true;
-		if (f->engine != NULL) {
-			rule_size = f->engine->rule_size;
-			if (f->rule != NULL)
-				rule_data = f->rule;
-		}
 
-		if (f->engine != NULL)
-			iavf_flow_dump_blob(file,
-				f->engine->name != NULL ?
-					f->engine->name : "unknown",
-				rule_data, rule_size);
+		rule_size = f->engine->rule_size;
+		rule_data = f->rule;
+
+		iavf_flow_dump_blob(file,
+			f->engine->name != NULL ?
+				f->engine->name : "unknown",
+			rule_data, rule_size);
 	}
 	rte_spinlock_unlock(&vf->flow_ops_lock);
 
-- 
2.47.3


^ permalink raw reply related

* [PATCH v1 3/4] net/i40e: fix potential NULL dereference
From: Anatoly Burakov @ 2026-06-25 15:48 UTC (permalink / raw)
  To: dev, Bruce Richardson
In-Reply-To: <c42d6f4efb83f4962ea096a702ed320a3d0a5eed.1782402484.git.anatoly.burakov@intel.com>

Static analysis reports that a rule dump may trigger NULL dereference when
rule pointer is NULL. This should not happen under normal circumstances as
0 sized rule would not dereference the rule data pointer due to chunking,
but it's a good defensive check, so add it.

Coverity issue: 503771

Fixes: ffaddd0fa935 ("net/i40e: support flow dump")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/intel/i40e/i40e_flow.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/intel/i40e/i40e_flow.c b/drivers/net/intel/i40e/i40e_flow.c
index 1051c99fba..142cfb5150 100644
--- a/drivers/net/intel/i40e/i40e_flow.c
+++ b/drivers/net/intel/i40e/i40e_flow.c
@@ -1276,11 +1276,19 @@ i40e_flow_dev_dump(struct rte_eth_dev *dev,
 		if (flow != NULL && p_flow != flow)
 			continue;
 
+		/* should not happen */
+		if (p_flow->rule == NULL) {
+			PMD_DRV_LOG(DEBUG, "Invalid flow rule");
+			continue;
+		}
+
+		rule_size = i40e_flow_rule_size(p_flow->filter_type);
+		/* should not happen either */
+		if (rule_size == 0)
+			continue;
+
 		found = true;
-		if (p_flow->rule != NULL) {
-			rule_size = i40e_flow_rule_size(p_flow->filter_type);
-			rule_data = p_flow->rule;
-		}
+		rule_data = p_flow->rule;
 		i40e_flow_dump_blob(file,
 			i40e_flow_rule_name(p_flow->filter_type),
 			rule_data, rule_size);
-- 
2.47.3


^ permalink raw reply related

* [PATCH v1 2/4] net/ixgbe: fix potential NULL dereference
From: Anatoly Burakov @ 2026-06-25 15:48 UTC (permalink / raw)
  To: dev, Vladimir Medvedkin, Bruce Richardson
In-Reply-To: <c42d6f4efb83f4962ea096a702ed320a3d0a5eed.1782402484.git.anatoly.burakov@intel.com>

Static analysis reports that a rule dump may trigger NULL dereference. This
cannot actually happen because dump_blob will not dereference the rule_data
when rule_size is 0, but it's a good defensive check anyway, so add it.

Coverity issue: 503773

Fixes: 51f505a4090f ("net/ixgbe: support flow dump")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/intel/ixgbe/ixgbe_flow.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/intel/ixgbe/ixgbe_flow.c b/drivers/net/intel/ixgbe/ixgbe_flow.c
index 19d7f93d93..6868893d46 100644
--- a/drivers/net/intel/ixgbe/ixgbe_flow.c
+++ b/drivers/net/intel/ixgbe/ixgbe_flow.c
@@ -3345,11 +3345,18 @@ ixgbe_flow_dev_dump(struct rte_eth_dev *dev,
 		if (flow != NULL && p_flow != flow)
 			continue;
 
+		/* this should not happen */
+		if (p_flow->rule == NULL) {
+			PMD_DRV_LOG(DEBUG, "Invalid flow");
+			continue;
+		}
+
+		rule_size = ixgbe_flow_rule_size(p_flow);
+		if (rule_size == 0)
+			continue;
+
 		found = true;
-		if (p_flow->rule != NULL) {
-			rule_data = ixgbe_flow_rule_data(p_flow);
-			rule_size = ixgbe_flow_rule_size(p_flow);
-		}
+		rule_data = ixgbe_flow_rule_data(p_flow);
 		engine_name = ixgbe_flow_rule_engine_name(p_flow);
 		ixgbe_flow_dump_blob(file, engine_name,
 			rule_data, rule_size);
-- 
2.47.3


^ permalink raw reply related

* [PATCH v1 1/4] net/ice: fix potential NULL dereference
From: Anatoly Burakov @ 2026-06-25 15:48 UTC (permalink / raw)
  To: dev, Bruce Richardson

Static analysis reports that a rule might have a valid engine but invalid
rule pointer while having a valid rule size. This should not happen in
normal operation, but it's a good defensive check, so fix the potential
issue by checking for the problematic condition before dumping flow memory.

Coverity issue: 503774

Fixes: 215a768c180a ("net/ice: support flow dump")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/intel/ice/ice_generic_flow.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/intel/ice/ice_generic_flow.c b/drivers/net/intel/ice/ice_generic_flow.c
index cbc3d78079..04097cca76 100644
--- a/drivers/net/intel/ice/ice_generic_flow.c
+++ b/drivers/net/intel/ice/ice_generic_flow.c
@@ -2660,15 +2660,18 @@ ice_flow_dev_dump(struct rte_eth_dev *dev,
 		if (flow != NULL && p_flow != flow)
 			continue;
 
+		/* this should not happen */
+		if (p_flow->engine == NULL || p_flow->rule == NULL) {
+			PMD_DRV_LOG(DEBUG, "Invalid flow");
+			continue;
+		}
+
 		found = true;
-		if (p_flow->engine != NULL) {
-			rule_size = p_flow->engine->rule_size;
-			if (p_flow->rule != NULL)
-				rule_data = p_flow->rule;
-		}
 
-		if (p_flow->engine != NULL)
-			ice_flow_dump_blob(file,
+		rule_size = p_flow->engine->rule_size;
+		rule_data = p_flow->rule;
+
+		ice_flow_dump_blob(file,
 				p_flow->engine->name != NULL ?
 				p_flow->engine->name : "unknown",
 				rule_data, rule_size);
-- 
2.47.3


^ permalink raw reply related

* [PATCH v5] dts: report dut/NIC info during DTS run
From: Koushik Bhargav Nimoji @ 2026-06-25 15:47 UTC (permalink / raw)
  To: luca.vizzarro, patrickrobb1997
  Cc: dev, abailey, ahassick, lylavoie, Koushik Bhargav Nimoji
In-Reply-To: <20260602163647.101815-1-knimoji@iol.unh.edu>

This patch gathers NIC info during a DTS run and writes it to an output
json file. This allows the json file to be used when reporting results
on the DTS results dashboard.

Signed-off-by: Koushik Bhargav Nimoji <knimoji@iol.unh.edu>
---
v2:
    *Resolved merge conflicts
v3:
    *Fixed an issue with retrieving
     the NIC's hardware version   
v4:
    *Moved nic info gathering step before the nics get
     binded to their respective drivers
    *Condensed some areas of code in order to make them
     more readable
    *Removed redundant None checks and added some where
     required
    *Fixed LshwOutput class to better reflect the lshw
     command output
v5:
    *Changed variable names for code readability     
---
 dts/framework/test_run.py                    | 10 +++
 dts/framework/testbed_model/linux_session.py | 68 ++++++++++++++++++++
 dts/framework/testbed_model/os_session.py    | 11 ++++
 3 files changed, 89 insertions(+)

diff --git a/dts/framework/test_run.py b/dts/framework/test_run.py
index 94dc6023a7..fea1b52e44 100644
--- a/dts/framework/test_run.py
+++ b/dts/framework/test_run.py
@@ -98,6 +98,7 @@
         "InternalError" -> "exit":ew
 """
 
+import json
 import random
 from collections import deque
 from collections.abc import Iterable
@@ -347,6 +348,14 @@ def next(self) -> State | None:
         test_run.ctx.dpdk.setup()
         test_run.ctx.topology.setup()
 
+        testrun_nic_info: list[dict[str, str]] = (
+            self.test_run.ctx.sut_node.main_session.get_nic_info()
+        )
+        with open(f"{SETTINGS.output_dir}/dut_info.json", "w") as file:
+            json.dump(testrun_nic_info, file, indent=3)
+
+        self.logger.info(f"DUT NIC info written to: {SETTINGS.output_dir}/dut_info.json")
+
         if test_run.config.use_virtual_functions:
             test_run.ctx.topology.instantiate_vf_ports()
         if test_run.ctx.sut_node.cryptodevs and test_run.config.crypto:
@@ -370,6 +379,7 @@ def next(self) -> State | None:
         test_run.supported_capabilities = get_supported_capabilities(
             test_run.ctx.sut_node, test_run.ctx.topology, test_run.required_capabilities
         )
+
         return TestRunExecution(test_run, self.result)
 
     def on_error(self, ex: BaseException) -> State | None:
diff --git a/dts/framework/testbed_model/linux_session.py b/dts/framework/testbed_model/linux_session.py
index 3a6e97974b..b8836effbe 100644
--- a/dts/framework/testbed_model/linux_session.py
+++ b/dts/framework/testbed_model/linux_session.py
@@ -38,6 +38,8 @@ class LshwConfigurationOutput(TypedDict):
     driver: str
     #:
     link: str
+    #:
+    firmware: str
 
 
 class LshwOutput(TypedDict):
@@ -61,6 +63,12 @@ class LshwOutput(TypedDict):
             ...
     """
 
+    #:
+    vendor: NotRequired[str]
+    #:
+    product: NotRequired[str]
+    #:
+    version: NotRequired[str]
     #:
     businfo: str
     #:
@@ -197,6 +205,66 @@ def unbind_ports(self, ports: list[Port]):
         if self._lshw_net_info:
             del self._lshw_net_info
 
+    def get_nic_info(self) -> list[dict[str, str]]:
+        """Overrides :meth`~.os_session.OSSession.get_nic_info`.
+
+        Raises:
+            ConfigurationError: If the NIC info could not be found.
+        """
+        port_data = {
+            port.get("businfo"): port for port in self._lshw_net_info if port.get("businfo")
+        }
+
+        all_nic_info: list[dict[str, str]] = []
+        for port in self._config.ports:
+            pci_addr = port.pci
+
+            lshw_result = self.send_command(
+                f"sudo lshw -c network -businfo | grep '{pci_addr}' | cut -d'@' -f1"
+            )
+            if lshw_result.return_code != 0 and lshw_result.stdout == "":
+                raise ConfigurationError(f"Unable to get bus type for port {pci_addr}.")
+            bus_type = lshw_result.stdout
+
+            bus_info = f"{bus_type}@{pci_addr}"
+            nic_port: LshwOutput | None = port_data[bus_info]
+            if nic_port is None:
+                raise ConfigurationError(f"Port {pci_addr} could not be found on the node.")
+
+            config: LshwConfigurationOutput | None = nic_port["configuration"]
+            if config is None:
+                raise ConfigurationError(
+                    f"Configuration info for port {pci_addr} could not be found on the node."
+                )
+
+            if "logicalname" not in nic_port:
+                raise ConfigurationError(
+                    f"Logical name for port {pci_addr} could not be found on the node."
+                )
+
+            ethtool_result = self.send_command(
+                f"ethtool {nic_port['logicalname']} | grep 'Speed:' | awk '{{print $2}}'"
+            )
+            if ethtool_result.return_code == 0 and ethtool_result.stdout:
+                nic_speed = ethtool_result.stdout
+            else:
+                self._logger.error(f"Unable to get speed for NIC: {pci_addr}")
+                nic_speed = None
+
+            dut_json = {
+                "make": nic_port["vendor"] if "vendor" in nic_port else "Unknown",
+                "model": nic_port["product"] if "product" in nic_port else "Unknown",
+                "hardware version": nic_port["version"] if "version" in nic_port else "Unknown",
+                "firmware version": config["firmware"] if "firmware" in config else "Unknown",
+                "deviceBusType": bus_type,
+                "deviceId": nic_port["serial"] if "serial" in nic_port else "Unknown",
+                "pmd": config["driver"] if "driver" in config else "Unknown",
+                "speed": nic_speed or "Unknown",
+            }
+            all_nic_info.append(dut_json)
+
+        return all_nic_info
+
     def bind_ports_to_driver(self, ports: list[Port], driver_name: str) -> None:
         """Overrides :meth:`~.os_session.OSSession.bind_ports_to_driver`.
 
diff --git a/dts/framework/testbed_model/os_session.py b/dts/framework/testbed_model/os_session.py
index f2dc9b20a9..f88427a53d 100644
--- a/dts/framework/testbed_model/os_session.py
+++ b/dts/framework/testbed_model/os_session.py
@@ -581,6 +581,17 @@ def unbind_ports(self, ports: list[Port]) -> None:
             ports: The list of ports to unbind.
         """
 
+    @abstractmethod
+    def get_nic_info(self) -> list[dict[str, str]]:
+        """Get NIC information.
+
+        Returns:
+            NIC info as a list of dictionaries.
+
+        Raises:
+            ConfigurationError: If the NIC info could not be found.
+        """
+
     @abstractmethod
     def bind_ports_to_driver(self, ports: list[Port], driver_name: str) -> None:
         """Bind `ports` to the given `driver_name`.
-- 
2.54.0


^ permalink raw reply related

* RE: [PATCH v5 4/9] bpf/arm64: mask shift count per RFC 9669
From: Marat Khalili @ 2026-06-25 15:40 UTC (permalink / raw)
  To: Stephen Hemminger, dev@dpdk.org
  Cc: stable@dpdk.org, Wathsala Vithanage, Konstantin Ananyev,
	Jerin Jacob
In-Reply-To: <20260624175815.673064-5-stephen@networkplumber.org>

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday 24 June 2026 18:55
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Wathsala Vithanage
> <wathsala.vithanage@arm.com>; Konstantin Ananyev <konstantin.ananyev@huawei.com>; Marat Khalili
> <marat.khalili@huawei.com>; Jerin Jacob <jerinj@marvell.com>
> Subject: [PATCH v5 4/9] bpf/arm64: mask shift count per RFC 9669
> 
> The ARM JIT was not masking the shift count as required by RFC 9669
> (0x3f for 64-bit, 0x1f for 32-bit), so large immediate shift counts
> overflowed the UBFM/SBFM encoding and failed the JIT. Mask the
> immediate in emit_lsl/emit_lsr/emit_asr.
> 
> Fixes: 9f4469d9e83a ("bpf/arm: add logical operations")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Acked-by: Marat Khalili <marat.khalili@huawei.com>

> ---
>  lib/bpf/bpf_jit_arm64.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/bpf/bpf_jit_arm64.c b/lib/bpf/bpf_jit_arm64.c
> index ba7ae4d680..7582370062 100644
> --- a/lib/bpf/bpf_jit_arm64.c
> +++ b/lib/bpf/bpf_jit_arm64.c
> @@ -545,12 +545,14 @@ emit_bitfield(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rn,
>  	emit_insn(ctx, insn, check_reg(rd) || check_reg(rn) ||
>  		  check_immr_imms(is64, immr, imms));
>  }
> +
>  static void
>  emit_lsl(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t imm)
>  {
>  	const unsigned int width = is64 ? 64 : 32;
>  	uint8_t imms, immr;
> 
> +	imm &= width - 1;
>  	immr = (width - imm) & (width - 1);
>  	imms = width - 1 - imm;
> 
> @@ -560,13 +562,19 @@ emit_lsl(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t imm)
>  static void
>  emit_lsr(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t imm)
>  {
> -	emit_bitfield(ctx, is64, rd, rd, imm, is64 ? 63 : 31, A64_UBFM);
> +	const unsigned int width = is64 ? 64 : 32;
> +
> +	imm &= width - 1;
> +	emit_bitfield(ctx, is64, rd, rd, imm, width - 1, A64_UBFM);
>  }
> 
>  static void
>  emit_asr(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t imm)
>  {
> -	emit_bitfield(ctx, is64, rd, rd, imm, is64 ? 63 : 31, A64_SBFM);
> +	const unsigned int width = is64 ? 64 : 32;
> +
> +	imm &= width - 1;
> +	emit_bitfield(ctx, is64, rd, rd, imm, width - 1, A64_SBFM);
>  }
> 
>  #define A64_AND 0
> --
> 2.53.0


^ permalink raw reply

* RE: [PATCH v5 5/9] test/bpf: add test for large shift
From: Marat Khalili @ 2026-06-25 15:38 UTC (permalink / raw)
  To: Stephen Hemminger, dev@dpdk.org; +Cc: Konstantin Ananyev
In-Reply-To: <20260624175815.673064-6-stephen@networkplumber.org>

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday 24 June 2026 18:55
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; Konstantin Ananyev <konstantin.ananyev@huawei.com>;
> Marat Khalili <marat.khalili@huawei.com>
> Subject: [PATCH v5 5/9] test/bpf: add test for large shift
> 
> There were multiple bugs with immediate values in shift instructions.
> The code was not masking as required by RFC.
> 
> Add new tests that cover these instructions.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Acked-by: Marat Khalili <marat.khalili@huawei.com>

> ---
>  app/test/test_bpf.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 59 insertions(+)
> 
> diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
> index 232e9e2a98..0e5894a532 100644
> --- a/app/test/test_bpf.c
> +++ b/app/test/test_bpf.c
> @@ -2005,6 +2005,51 @@ test_div1_check(uint64_t rc, const void *arg)
>  	return cmp_res(__func__, 0, rc, dve.out, dvt->out, sizeof(dve.out));
>  }
> 
> +/*
> + * Shift counts are masked to the operand width (RFC 9669: 0x3f for 64-bit,
> + * 0x1f for 32-bit). Counts >= 128 also exercise the x86 imm_size() path that
> + * used to desync the stream, and the arm64 UBFM/SBFM immediate encoding.
> + */
> +static const struct ebpf_insn test_shift_big_imm_prog[] = {
> +	{
> +		.code = (EBPF_ALU64 | EBPF_MOV | BPF_K),
> +		.dst_reg = EBPF_REG_0,
> +		.imm = 1
> +	},
> +	{
> +		.code = (EBPF_ALU64 | BPF_LSH | BPF_K),
> +		.dst_reg = EBPF_REG_0,
> +		.imm = 191
> +	},
> +	{
> +		.code = (EBPF_ALU64 | EBPF_ARSH | BPF_K),
> +		.dst_reg = EBPF_REG_0,
> +		.imm = 200
> +	},
> +	{
> +		.code = (EBPF_ALU64 | BPF_RSH | BPF_K),
> +		.dst_reg = EBPF_REG_0,
> +		.imm = 130
> +	},
> +	{
> +		.code = (BPF_JMP | EBPF_EXIT)
> +	},
> +};
> +
> +static void
> +test_shift_big_imm_prepare(void *arg)
> +{
> +	memset(arg, 0, sizeof(struct dummy_offset));
> +}
> +
> +static int
> +test_shift_big_imm_check(uint64_t rc, const void *arg)
> +{
> +	uint64_t expect = 0x3FE0000000000000ULL;
> +
> +	return cmp_res(__func__, expect, rc, arg, arg, 0);
> +}
> +
>  /* call test-cases */
>  static const struct ebpf_insn test_call1_prog[] = {
> 
> @@ -3409,6 +3454,20 @@ static const struct bpf_test tests[] = {
>  		.prepare = test_mul1_prepare,
>  		.check_result = test_div1_check,
>  	},
> +	{
> +		.name = "test_shift_big_imm",
> +		.arg_sz = sizeof(struct dummy_offset),
> +		.prm = {
> +			.ins = test_shift_big_imm_prog,
> +			.nb_ins = RTE_DIM(test_shift_big_imm_prog),
> +			.prog_arg = {
> +				.type = RTE_BPF_ARG_PTR,
> +				.size = sizeof(struct dummy_offset),
> +			},
> +		},
> +		.prepare = test_shift_big_imm_prepare,
> +		.check_result = test_shift_big_imm_check,
> +	},
>  	{
>  		.name = "test_call1",
>  		.arg_sz = sizeof(struct dummy_offset),
> --
> 2.53.0


^ permalink raw reply

* RE: [PATCH v5 3/9] bpf: mask shift count in interpreter per RFC 9669
From: Marat Khalili @ 2026-06-25 15:35 UTC (permalink / raw)
  To: Stephen Hemminger, dev@dpdk.org
  Cc: stable@dpdk.org, Konstantin Ananyev, Ferruh Yigit
In-Reply-To: <20260624175815.673064-4-stephen@networkplumber.org>

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday 24 June 2026 18:55
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Konstantin Ananyev
> <konstantin.ananyev@huawei.com>; Marat Khalili <marat.khalili@huawei.com>; Ferruh Yigit
> <ferruh.yigit@amd.com>
> Subject: [PATCH v5 3/9] bpf: mask shift count in interpreter per RFC 9669
> 
> The interpreter shifted by the raw immediate or register value, which
> is undefined behavior in C when the count is >= the operand width and
> trips UBSan. RFC 9669 masks shift counts (0x3f for 64-bit, 0x1f for
> 32-bit); mask the count in the LSH/RSH/ARSH cases.
> 
> Fixes: 94972f35a02e ("bpf: add BPF loading and execution framework")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Acked-by: Marat Khalili <marat.khalili@huawei.com>

> ---
>  lib/bpf/bpf_exec.c | 31 +++++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/bpf/bpf_exec.c b/lib/bpf/bpf_exec.c
> index d423ef28f5..bb03c9cc2c 100644
> --- a/lib/bpf/bpf_exec.c
> +++ b/lib/bpf/bpf_exec.c
> @@ -4,6 +4,7 @@
> 
>  #include <stdio.h>
>  #include <stdint.h>
> +#include <limits.h>
> 
>  #include <eal_export.h>
>  #include <rte_common.h>
> @@ -43,6 +44,16 @@
>  	((reg)[(ins)->dst_reg] = \
>  		(type)(reg)[(ins)->dst_reg] op (type)(ins)->imm)
> 
> +#define BPF_OP_SHIFT_IMM(reg, ins, op, type)	\
> +	((reg)[(ins)->dst_reg] =		\
> +		(type)(reg)[(ins)->dst_reg] op	\
> +		((ins)->imm & (sizeof(type) * CHAR_BIT - 1)))
> +
> +#define BPF_OP_SHIFT_REG(reg, ins, op, type)	\
> +	((reg)[(ins)->dst_reg] =		\
> +		(type)(reg)[(ins)->dst_reg] op	\
> +		((reg)[(ins)->src_reg] & (sizeof(type) * CHAR_BIT - 1)))
> +
>  #define BPF_DIV_ZERO_CHECK(bpf, reg, ins, type) do { \
>  	if ((type)(reg)[(ins)->src_reg] == 0) { \
>  		RTE_BPF_LOG_LINE(ERR, \
> @@ -183,10 +194,10 @@ bpf_exec(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM])
>  			BPF_OP_ALU_IMM(reg, ins, |, uint32_t);
>  			break;
>  		case (BPF_ALU | BPF_LSH | BPF_K):
> -			BPF_OP_ALU_IMM(reg, ins, <<, uint32_t);
> +			BPF_OP_SHIFT_IMM(reg, ins, <<, uint32_t);
>  			break;
>  		case (BPF_ALU | BPF_RSH | BPF_K):
> -			BPF_OP_ALU_IMM(reg, ins, >>, uint32_t);
> +			BPF_OP_SHIFT_IMM(reg, ins, >>, uint32_t);
>  			break;
>  		case (BPF_ALU | BPF_XOR | BPF_K):
>  			BPF_OP_ALU_IMM(reg, ins, ^, uint32_t);
> @@ -217,10 +228,10 @@ bpf_exec(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM])
>  			BPF_OP_ALU_REG(reg, ins, |, uint32_t);
>  			break;
>  		case (BPF_ALU | BPF_LSH | BPF_X):
> -			BPF_OP_ALU_REG(reg, ins, <<, uint32_t);
> +			BPF_OP_SHIFT_REG(reg, ins, <<, uint32_t);
>  			break;
>  		case (BPF_ALU | BPF_RSH | BPF_X):
> -			BPF_OP_ALU_REG(reg, ins, >>, uint32_t);
> +			BPF_OP_SHIFT_REG(reg, ins, >>, uint32_t);
>  			break;
>  		case (BPF_ALU | BPF_XOR | BPF_X):
>  			BPF_OP_ALU_REG(reg, ins, ^, uint32_t);
> @@ -262,13 +273,13 @@ bpf_exec(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM])
>  			BPF_OP_ALU_IMM(reg, ins, |, uint64_t);
>  			break;
>  		case (EBPF_ALU64 | BPF_LSH | BPF_K):
> -			BPF_OP_ALU_IMM(reg, ins, <<, uint64_t);
> +			BPF_OP_SHIFT_IMM(reg, ins, <<, uint64_t);
>  			break;
>  		case (EBPF_ALU64 | BPF_RSH | BPF_K):
> -			BPF_OP_ALU_IMM(reg, ins, >>, uint64_t);
> +			BPF_OP_SHIFT_IMM(reg, ins, >>, uint64_t);
>  			break;
>  		case (EBPF_ALU64 | EBPF_ARSH | BPF_K):
> -			BPF_OP_ALU_IMM(reg, ins, >>, int64_t);
> +			BPF_OP_SHIFT_IMM(reg, ins, >>, int64_t);
>  			break;
>  		case (EBPF_ALU64 | BPF_XOR | BPF_K):
>  			BPF_OP_ALU_IMM(reg, ins, ^, uint64_t);
> @@ -299,13 +310,13 @@ bpf_exec(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM])
>  			BPF_OP_ALU_REG(reg, ins, |, uint64_t);
>  			break;
>  		case (EBPF_ALU64 | BPF_LSH | BPF_X):
> -			BPF_OP_ALU_REG(reg, ins, <<, uint64_t);
> +			BPF_OP_SHIFT_REG(reg, ins, <<, uint64_t);
>  			break;
>  		case (EBPF_ALU64 | BPF_RSH | BPF_X):
> -			BPF_OP_ALU_REG(reg, ins, >>, uint64_t);
> +			BPF_OP_SHIFT_REG(reg, ins, >>, uint64_t);
>  			break;
>  		case (EBPF_ALU64 | EBPF_ARSH | BPF_X):
> -			BPF_OP_ALU_REG(reg, ins, >>, int64_t);
> +			BPF_OP_SHIFT_REG(reg, ins, >>, int64_t);
>  			break;
>  		case (EBPF_ALU64 | BPF_XOR | BPF_X):
>  			BPF_OP_ALU_REG(reg, ins, ^, uint64_t);
> --
> 2.53.0


^ permalink raw reply

* Re: [PATCH v4] dts: report dut/NIC info during DTS run
From: Koushik Bhargav Nimoji @ 2026-06-25 15:30 UTC (permalink / raw)
  To: Patrick Robb; +Cc: luca.vizzarro, dev, abailey, ahassick, lylavoie
In-Reply-To: <CAK6DuxucO6-D0qTeTb=mdpWWP48j9rDyvbwfHGgjfwQXTM93Gg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7237 bytes --]

On Wed, Jun 24, 2026 at 10:06 PM Patrick Robb <patrickrobb1997@gmail.com>
wrote:

>
>
> On Wed, Jun 24, 2026 at 5:33 PM Koushik Bhargav Nimoji <
> knimoji@iol.unh.edu> wrote:
>
>> This patch gathers NIC info during a DTS run and writes it to an output
>> json file. This allows the json file to be used when reporting results
>> on the DTS results dashboard.
>>
>> Signed-off-by: Koushik Bhargav Nimoji <knimoji@iol.unh.edu>
>> ---
>> v2:
>>     *Resolved merge conflicts
>> v3:
>>     *Fixed an issue with retrieving
>>      the NIC's hardware version
>> v4:
>>     *Moved nic info gathering step before the nics get
>>      binded to their respective drivers
>>     *Condensed some areas of code in order to make them
>>      more readable
>>     *Removed redundant None checks and added some where
>>      required
>>     *Fixed LshwOutput class to better reflect the lshw
>>      command output
>> ---
>>  dts/framework/test_run.py                    |  8 +++
>>  dts/framework/testbed_model/linux_session.py | 68 ++++++++++++++++++++
>>  dts/framework/testbed_model/os_session.py    | 11 ++++
>>  3 files changed, 87 insertions(+)
>>
>> diff --git a/dts/framework/test_run.py b/dts/framework/test_run.py
>> index 94dc6023a7..c92fe90f2e 100644
>> --- a/dts/framework/test_run.py
>> +++ b/dts/framework/test_run.py
>> @@ -98,6 +98,7 @@
>>          "InternalError" -> "exit":ew
>>  """
>>
>> +import json
>>  import random
>>  from collections import deque
>>  from collections.abc import Iterable
>> @@ -347,6 +348,12 @@ def next(self) -> State | None:
>>          test_run.ctx.dpdk.setup()
>>          test_run.ctx.topology.setup()
>>
>> +        used_nic_info: list[dict[str, str]] =
>> self.test_run.ctx.sut_node.main_session.get_nic_info()
>>
>
> drop "used" for nic_info or change to testrun_nic_info?
>
>
>> +        with open(f"{SETTINGS.output_dir}/dut_info.json", "w") as file:
>> +            json.dump(used_nic_info, file, indent=3)
>> +
>> +        self.logger.info(f"DUT NIC info written to:
>> {SETTINGS.output_dir}/dut_info.json")
>> +
>>          if test_run.config.use_virtual_functions:
>>              test_run.ctx.topology.instantiate_vf_ports()
>>          if test_run.ctx.sut_node.cryptodevs and test_run.config.crypto:
>> @@ -370,6 +377,7 @@ def next(self) -> State | None:
>>          test_run.supported_capabilities = get_supported_capabilities(
>>              test_run.ctx.sut_node, test_run.ctx.topology,
>> test_run.required_capabilities
>>          )
>> +
>>          return TestRunExecution(test_run, self.result)
>>
>>      def on_error(self, ex: BaseException) -> State | None:
>> diff --git a/dts/framework/testbed_model/linux_session.py
>> b/dts/framework/testbed_model/linux_session.py
>> index 3a6e97974b..9e9146c372 100644
>> --- a/dts/framework/testbed_model/linux_session.py
>> +++ b/dts/framework/testbed_model/linux_session.py
>> @@ -38,6 +38,8 @@ class LshwConfigurationOutput(TypedDict):
>>      driver: str
>>      #:
>>      link: str
>> +    #:
>> +    firmware: str
>>
>>
>>  class LshwOutput(TypedDict):
>> @@ -61,6 +63,12 @@ class LshwOutput(TypedDict):
>>              ...
>>      """
>>
>> +    #:
>> +    vendor: NotRequired[str]
>> +    #:
>> +    product: NotRequired[str]
>> +    #:
>> +    version: NotRequired[str]
>>      #:
>>      businfo: str
>>      #:
>> @@ -197,6 +205,66 @@ def unbind_ports(self, ports: list[Port]):
>>          if self._lshw_net_info:
>>              del self._lshw_net_info
>>
>> +    def get_nic_info(self) -> list[dict[str, str]]:
>> +        """Overrides :meth`~.os_session.OSSession.get_nic_info`.
>> +
>> +        Raises:
>> +            ConfigurationError: If the NIC info could not be found.
>> +        """
>> +        port_data = {
>> +            port.get("businfo"): port for port in self._lshw_net_info if
>> port.get("businfo")
>> +        }
>> +
>> +        all_nic_info: list[dict[str, str]] = []
>> +        for port in self._config.ports:
>> +            pci_addr = port.pci
>> +
>> +            command_result = self.send_command(
>>
>
> rename to lshw_result please.
>
>
>> +                f"sudo lshw -c network -businfo | grep '{pci_addr}' |
>> cut -d'@' -f1"
>> +            )
>> +            if command_result.return_code != 0 and command_result.stdout
>> == "":
>> +                raise ConfigurationError(f"Unable to get bus type for
>> port {pci_addr}.")
>> +            bus_type = command_result.stdout
>> +
>> +            bus_info = f"{bus_type}@{pci_addr}"
>> +            nic_port: LshwOutput | None = port_data[bus_info]
>> +            if nic_port is None:
>> +                raise ConfigurationError(f"Port {pci_addr} could not be
>> found on the node.")
>> +
>> +            config: LshwConfigurationOutput | None =
>> nic_port["configuration"]
>> +            if config is None:
>> +                raise ConfigurationError(
>> +                    f"Configuration info for port {pci_addr} could not
>> be found on the node."
>> +                )
>> +
>> +            if "logicalname" not in nic_port:
>> +                raise ConfigurationError(
>> +                    f"Logical name for port {pci_addr} could not be
>> found on the node."
>> +                )
>> +
>> +            command_result = self.send_command(
>>
>
> ethtool_result
>
>
>> +                f"ethtool {nic_port['logicalname']} | grep 'Speed:' |
>> awk '{{print $2}}'"
>> +            )
>
> +            if command_result.return_code == 0 and command_result.stdout:
>> +                nic_speed = command_result.stdout
>> +            else:
>> +                self._logger.error(f"Unable to get speed for NIC:
>> {pci_addr}")
>> +                nic_speed = None
>> +
>> +            dut_json = {
>> +                "make": nic_port["vendor"] if "vendor" in nic_port else
>> "Unknown",
>> +                "model": nic_port["product"] if "product" in nic_port
>> else "Unknown",
>> +                "hardware version": nic_port["version"] if "version" in
>> nic_port else "Unknown",
>> +                "firmware version": config["firmware"] if "firmware" in
>> config else "Unknown",
>> +                "deviceBusType": bus_type,
>> +                "deviceId": nic_port["serial"] if "serial" in nic_port
>> else "Unknown",
>> +                "pmd": config["driver"] if "driver" in config else
>> "Unknown",
>> +                "speed": nic_speed or "Unknown",
>> +            }
>> +            all_nic_info.append(dut_json)
>> +
>> +        return all_nic_info
>> +
>>
>
> What is the intended behavior for cryptodev tests? I realize the ports
> list will be empty and we will not enter the initial loop, but is this
> intended? Do we want to gether cryptodev info too?
>
>
The intended behavior here is to skip cryptodev devices. Not entering the
initial loop, and therefore returning an empty list is the expected
behavior when running cryptodev tests.

>

>      def bind_ports_to_driver(self, ports: list[Port], driver_name: str)
>> -> None:
>>
>>
> Reviewed-by: Patrick Robb <patrickrobb1997@gmail.com>
>

[-- Attachment #2: Type: text/html, Size: 10501 bytes --]

^ permalink raw reply

* Re: [PATCH v3 2/4] build: support function versioning for drivers
From: David Marchand @ 2026-06-25 14:45 UTC (permalink / raw)
  To: Dariusz Sosnowski; +Cc: Bruce Richardson, dev, Yu Jiang
In-Reply-To: <20260625133311.1299705-3-dsosnowski@nvidia.com>

On Thu, 25 Jun 2026 at 15:34, Dariusz Sosnowski <dsosnowski@nvidia.com> wrote:
>
> Add support for enabling function versioning
> (through use_function_versioning meson variable) for drivers,
> similar to libraries.
>
> Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
> ---
>  drivers/meson.build | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/meson.build b/drivers/meson.build
> index 4d95604ecd..8f3ab490ee 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -171,6 +171,7 @@ foreach subpath:subdirs
>          pkgconfig_extra_libs = []
>          testpmd_sources = []
>          require_iova_in_mbuf = true
> +        use_function_versioning = false
>          # for handling base code files which may need extra cflags
>          base_sources = []
>          base_cflags = []
> @@ -273,6 +274,13 @@ foreach subpath:subdirs
>          endif
>          dpdk_conf.set(lib_name.to_upper(), 1)
>
> +        if developer_mode and is_windows and use_function_versioning
> +            message('@0@: Function versioning is not supported by Windows.'.format(name))
> +        endif
> +        if use_function_versioning
> +            cflags += '-DRTE_USE_FUNCTION_VERSIONING'
> +        endif
> +
>          dpdk_extra_ldflags += pkgconfig_extra_libs
>
>          dpdk_headers += headers
> @@ -363,7 +371,18 @@ foreach subpath:subdirs
>                      depends: [version_map])
>          endif
>
> -        shared_lib = shared_library(lib_name, sources_pmd_info,
> +        if not use_function_versioning or is_windows
> +            # Use pre-built objects and pmdinfo sources to build shared library.
> +            shared_sources = sources_pmd_info
> +        else
> +            # For compat we need to rebuild with RTE_BUILD_SHARED_LIB defined.
> +            # Use original sources and pmdinfo sources.
> +            cflags += '-DRTE_BUILD_SHARED_LIB'
> +            shared_sources = sources + sources_pmd_info
> +            objs = []
> +        endif
> +
> +        shared_lib = shared_library(lib_name, shared_sources,
>                  objects: objs,
>                  include_directories: includes,
>                  dependencies: shared_deps,

Older meson version don't like this form:

drivers/meson.build:381:12: ERROR: Invalid use of addition: can only
concatenate list (not "CustomTargetHolder") to list

It seems to work with something like:

diff --git a/drivers/meson.build b/drivers/meson.build
index 8f3ab490ee..79c215a7c8 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -373,12 +373,12 @@ foreach subpath:subdirs

         if not use_function_versioning or is_windows
             # Use pre-built objects and pmdinfo sources to build
shared library.
-            shared_sources = sources_pmd_info
+            shared_sources = [sources_pmd_info]
         else
             # For compat we need to rebuild with RTE_BUILD_SHARED_LIB defined.
             # Use original sources and pmdinfo sources.
             cflags += '-DRTE_BUILD_SHARED_LIB'
-            shared_sources = sources + sources_pmd_info
+            shared_sources = sources + [sources_pmd_info]
             objs = []
         endif



-- 
David Marchand


^ permalink raw reply related

* RE: [PATCH v5 02/24] bpf: add format instruction function
From: Marat Khalili @ 2026-06-25 14:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Konstantin Ananyev, dev@dpdk.org
In-Reply-To: <20260624100934.00e99af1@phoenix.local>

> On Wed, 24 Jun 2026 13:17:35 +0100
> Marat Khalili <marat.khalili@huawei.com> wrote:
> 
> > BPF library already contains BPF instruction formatting functions, but
> > they could only be used via `rte_bpf_dump` to dump result into file. Add
> > new function `rte_bpf_format` to format instruction in various way
> > (hexadecimal, disassembly) into a user-provided buffer, as well as a
> > service function `rte_bpf_insn_is_wide` to detect wide instructions.
> >
> > Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
> > Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> > ---
> 
> Is this format similar to what tcpdump -d and objdump produce?

Closest I could find is bpf_dbg, the rest are slightly different.

Git log says it was originally added by you :)

^ permalink raw reply

* Re: [PATCH v1 2/2] dts: add latency coverage for cryptodev testing
From: Patrick Robb @ 2026-06-25 14:17 UTC (permalink / raw)
  To: Andrew Bailey; +Cc: luca.vizzarro, dev, lylavoie, ahassick, knimoji
In-Reply-To: <20260513152715.133381-2-abailey@iol.unh.edu>

[-- Attachment #1: Type: text/plain, Size: 11431 bytes --]

On Wed, May 13, 2026 at 11:27 AM Andrew Bailey <abailey@iol.unh.edu> wrote:

> Currently, next DTS only has cryptodev testing coverage for throughput
> metrics. This patch adds a test suite to include latency testing for
> crypto devices.
>
> Signed-off-by: Andrew Bailey <abailey@iol.unh.edu>
> ---
>  .../dts/tests.TestSuite_cryptodev_latency.rst |   8 +
>  dts/tests/TestSuite_cryptodev_latency.py      | 695 ++++++++++++++++++
>  2 files changed, 703 insertions(+)
>  create mode 100644 doc/api/dts/tests.TestSuite_cryptodev_latency.rst
>  create mode 100644 dts/tests/TestSuite_cryptodev_latency.py
>
>
> +    @crypto_test
> +    def aesni_gcm_vdev(self) -> None:
> +        """aesni_gcm virtual device latency test.
> +
> +        Steps:
> +            * Create a cryptodev instance with provided device type and
> buffer sizes.
> +        Verify:
> +            * The latency is below or within delta of provided baseline.
> +
> +        Raises:
> +            SkippedTestException: When configuration is not provided.
> +        """
> +        if "aesni_gcm_vdev" not in self.latency_test_parameters:
> +            skip("test not configured")
> +        app = Cryptodev(
> +            ptest=TestType.latency,
> +            vdevs=[VirtualDevice("crypto_aesni_gcm0")],
> +            devtype=DeviceType.crypto_aesni_gcm,
> +            optype=OperationType.aead,
> +            aead_op=EncryptDecryptSwitch.encrypt,
> +            aead_key_sz=16,
> +            aead_iv_sz=12,
> +            aead_aad_sz=16,
> +            digest_sz=16,
> +            burst_sz=32,
> +            total_ops=TOTAL_OPS,
> +            buffer_sz=self.buffer_sizes["aesni_gcm_vdev"],
> +        )
> +        results = self._verify_latency(app.run_app(), "aesni_gcm_vdev")
> +        self._print_stats(results)
> +        for result in results:
> +            verify(result["passed"] == "PASS", "latency fell more than
> the delta tolerance")
>

Why this one does not have "below baseline" in the string like other verify
assertions in this suite?


> +
> +    @crypto_test
> +    def aesni_mb_cipher_then_auth_vdev(self) -> None:
> +        """aesni_mb vdev cipher and auth latency test.
> +
> +        Steps:
> +            * Create a cryptodev instance with provided device type and
> buffer sizes.
> +        Verify:
> +            * The latency is below or within delta of provided baseline.
> +
> +        Raises:
> +            SkippedTestException: When configuration is not provided.
> +        """
> +        if "aesni_mb_cipher_then_auth_vdev" not in
> self.latency_test_parameters:
> +            skip("test not configured")
> +        app = Cryptodev(
> +            ptest=TestType.latency,
> +            vdevs=[VirtualDevice("crypto_aesni_mb0")],
> +            devtype=DeviceType.crypto_aesni_mb,
> +            optype=OperationType.cipher_then_auth,
> +            cipher_algo=CipherAlgorithm.aes_cbc,
> +            cipher_op=EncryptDecryptSwitch.encrypt,
> +            cipher_key_sz=16,
> +            auth_algo=AuthenticationAlgorithm.sha1_hmac,
> +            auth_op=AuthenticationOpMode.generate,
> +            auth_key_sz=64,
> +            digest_sz=12,
> +            burst_sz=32,
> +            total_ops=TOTAL_OPS,
> +            buffer_sz=self.buffer_sizes["aesni_mb_cipher_then_auth_vdev"],
> +        )
> +        results = self._verify_latency(app.run_app(),
> "aesni_mb_cipher_then_auth_vdev")
> +        self._print_stats(results)
> +        for result in results:
> +            verify(
> +                result["passed"] == "PASS",
> +                "latency fell more than the delta tolerance below
> baseline",
> +            )
> +
> +    @crypto_test
> +    def aesni_mb_vdev(self) -> None:
> +        """aesni_mb vdev latency test.
> +
> +        Steps:
> +            * Create a cryptodev instance with provided device type and
> buffer sizes.
> +        Verify:
> +            * The latency is below or within delta of provided baseline.
> +
> +        Raises:
> +            SkippedTestException: When configuration is not provided.
> +        """
> +        if "aesni_mb_vdev" not in self.latency_test_parameters:
> +            skip("test not configured")
> +        app = Cryptodev(
> +            ptest=TestType.latency,
> +            vdevs=[VirtualDevice("crypto_aesni_mb0")],
> +            devtype=DeviceType.crypto_aesni_mb,
> +            optype=OperationType.cipher_only,
> +            cipher_algo=CipherAlgorithm.aes_cbc,
> +            cipher_op=EncryptDecryptSwitch.encrypt,
> +            cipher_key_sz=16,
> +            cipher_iv_sz=16,
> +            burst_sz=32,
> +            total_ops=TOTAL_OPS,
> +            buffer_sz=self.buffer_sizes["aesni_mb_vdev"],
> +        )
> +        results = self._verify_latency(app.run_app(), "aesni_mb_vdev")
> +        self._print_stats(results)
> +        for result in results:
> +            verify(result["passed"] == "PASS", "Gbps fell below delta
> tolerance")
> +
> +    @crypto_test
> +    def kasumi_vdev(self) -> None:
> +        """Kasumi vdev latency test.
> +
> +        Steps:
> +            * Create a cryptodev instance with provided device type and
> buffer sizes.
> +        Verify:
> +            * The latency is below or within delta of provided baseline.
> +
> +        Raises:
> +            SkippedTestException: When configuration is not provided.
> +        """
> +        if "kasumi_vdev" not in self.latency_test_parameters:
> +            skip("test not configured")
> +        app = Cryptodev(
> +            ptest=TestType.latency,
> +            vdevs=[VirtualDevice("crypto_kasumi0")],
> +            devtype=DeviceType.crypto_kasumi,
> +            optype=OperationType.cipher_then_auth,
> +            cipher_algo=CipherAlgorithm.kasumi_f8,
> +            cipher_op=EncryptDecryptSwitch.encrypt,
> +            cipher_key_sz=16,
> +            cipher_iv_sz=8,
> +            auth_algo=AuthenticationAlgorithm.kasumi_f9,
> +            auth_op=AuthenticationOpMode.generate,
> +            auth_key_sz=16,
> +            digest_sz=4,
> +            burst_sz=32,
> +            total_ops=TOTAL_OPS,
> +            buffer_sz=self.buffer_sizes["kasumi_vdev"],
> +        )
> +        results = self._verify_latency(app.run_app(), "kasmui_vdev")
> +        self._print_stats(results)
> +        for result in results:
> +            verify(result["passed"] == "PASS", "Gbps fell below delta
> tolerance")
>

Should this be latency instead of Gbps?


> +
> +    @crypto_test
> +    def open_ssl_vdev(self) -> None:
> +        """open_ssl vdev latency test.
> +
> +        Steps:
> +            * Create a cryptodev instance with provided device type and
> buffer sizes.
> +        Verify:
> +            * The latency is below or within delta of provided baseline.
> +
> +        Raises:
> +            SkippedTestException: When configuration is not provided.
> +        """
> +        if "open_ssl_vdev" not in self.latency_test_parameters:
> +            skip("test not configured")
> +        app = Cryptodev(
> +            ptest=TestType.latency,
> +            vdevs=[VirtualDevice("crypto_openssl0")],
> +            devtype=DeviceType.crypto_openssl,
> +            optype=OperationType.aead,
> +            aead_algo=AeadAlgName.aes_gcm,
> +            aead_op=EncryptDecryptSwitch.encrypt,
> +            aead_key_sz=16,
> +            aead_iv_sz=16,
> +            aead_aad_sz=16,
> +            digest_sz=16,
> +            total_ops=TOTAL_OPS,
> +            buffer_sz=self.buffer_sizes["open_ssl_vdev"],
> +        )
> +        results = self._verify_latency(app.run_app(), "open_ssl_vdev")
> +        self._print_stats(results)
> +        for result in results:
> +            verify(result["passed"] == "PASS", "Gbps fell below delta
> tolerance")
>
Same


> +
> +    @crypto_test
> +    def snow3g_vdev(self) -> None:
> +        """snow3g vdev latency test.
> +
> +        Steps:
> +            * Create a cryptodev instance with provided device type and
> buffer sizes.
> +        Verify:
> +            * The latency is below or within delta of provided baseline.
> +
> +        Raises:
> +            SkippedTestException: When configuration is not provided.
> +        """
> +        if "snow3g_vdev" not in self.latency_test_parameters:
> +            skip("test not configured")
> +        app = Cryptodev(
> +            ptest=TestType.latency,
> +            vdevs=[VirtualDevice("crypto_snow3g0")],
> +            devtype=DeviceType.crypto_snow3g,
> +            optype=OperationType.cipher_then_auth,
> +            cipher_algo=CipherAlgorithm.snow3g_uea2,
> +            cipher_op=EncryptDecryptSwitch.encrypt,
> +            cipher_key_sz=16,
> +            cipher_iv_sz=16,
> +            auth_algo=AuthenticationAlgorithm.snow3g_uia2,
> +            auth_op=AuthenticationOpMode.generate,
> +            auth_key_sz=16,
> +            auth_iv_sz=16,
> +            digest_sz=16,
> +            burst_sz=32,
> +            total_ops=TOTAL_OPS,
> +            buffer_sz=self.buffer_sizes["open_ssl_vdev"],
> +        )
> +        results = self._verify_latency(app.run_app(), "open_ssl_vdev")
> +        self._print_stats(results)
> +        for result in results:
> +            verify(result["passed"] == "PASS", "Gbps fell below delta
> tolerance")
>
Same


> +
> +    @crypto_test
> +    def zuc_vdev(self) -> None:
> +        """Zuc vdev latency test.
> +
> +        Steps:
> +            * Create a cryptodev instance with provided device type and
> buffer sizes.
> +        Verify:
> +            * The latency is below or within delta of provided baseline.
> +
> +        Raises:
> +            SkippedTestException: When configuration is not provided.
> +        """
> +        if "zuc_vdev" not in self.latency_test_parameters:
> +            skip("test not configured")
> +        app = Cryptodev(
> +            ptest=TestType.latency,
> +            vdevs=[VirtualDevice("crypto_zuc0")],
> +            devtype=DeviceType.crypto_zuc,
> +            optype=OperationType.cipher_then_auth,
> +            cipher_algo=CipherAlgorithm.zuc_eea3,
> +            cipher_op=EncryptDecryptSwitch.encrypt,
> +            cipher_key_sz=16,
> +            cipher_iv_sz=16,
> +            auth_algo=AuthenticationAlgorithm.zuc_eia3,
> +            auth_op=AuthenticationOpMode.generate,
> +            auth_key_sz=16,
> +            auth_iv_sz=16,
> +            digest_sz=4,
> +            burst_sz=32,
> +            total_ops=TOTAL_OPS,
> +            buffer_sz=self.buffer_sizes["zuc_vdev"],
> +        )
> +        results = self._verify_latency(app.run_app(), "zuc_vdev")
> +        self._print_stats(results)
> +        for result in results:
> +            verify(result["passed"] == "PASS", "Gbps fell below delta
> tolerance")
>

Same

> --
> 2.50.1
>
>
Make sure you also see the ai code review mention of kasumi and snow3g typo
or misassignment
https://mails.dpdk.org/archives/test-report/2026-May/990932.html

The suggestions about safety when reading index 0 of a list are worth
implementing too.

Reviewed-by: Patrick Robb <patrickrobb1997@gmail.com>

[-- Attachment #2: Type: text/html, Size: 15261 bytes --]

^ permalink raw reply

* [PATCH v6 6/6] eal: fix async IPC callback not fired when no peers
From: Anatoly Burakov @ 2026-06-25 14:01 UTC (permalink / raw)
  To: dev, Jianfeng Tan
In-Reply-To: <cover.1782395581.git.anatoly.burakov@intel.com>

Currently, when rte_mp_request_async() is called and no peer processes
are connected (nb_sent == 0), the user callback is never invoked.

The original implementation used a dedicated background thread and
pthread_cond_signal() to wake it after queuing the dummy request. When
that thread was replaced with per-message alarms, no alarm was set for
the dummy request, silently breaking the nb_sent == 0 path.

This was not noticed because async requests are usually used while handling
secondary process requests, where peers are typically already present.

Fix it by setting a 1us alarm on the dummy request, so the callback path
immediately triggers and processes it.

Fixes: daf9bfca717e ("ipc: remove thread for async requests")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/common/eal_common_proc.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 235687ab84..2b8874e416 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -1197,11 +1197,22 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
 		ret = mp_request_async(eal_mp_socket_path(), copy, param, ts);
 
-		/* if we didn't send anything, put dummy request on the queue */
+		/* if we didn't send anything, put dummy request on the queue
+		 * and set a minimum-delay alarm so the callback fires immediately.
+		 */
 		if (ret == 0 && reply->nb_sent == 0) {
 			TAILQ_INSERT_TAIL(&pending_requests.requests, dummy,
 					next);
 			dummy_used = true;
+			if (rte_eal_alarm_set(1, async_reply_handle,
+					(void *)(uintptr_t)dummy->id) < 0) {
+				EAL_LOG(ERR, "Fail to set alarm for dummy request");
+				/* roll back the changes */
+				TAILQ_REMOVE(&pending_requests.requests, dummy, next);
+				dummy_used = false;
+				ret = -1;
+				goto unlock_fail;
+			}
 		}
 
 		pthread_mutex_unlock(&pending_requests.lock);
@@ -1275,6 +1286,16 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 	if (ret == 0 && reply->nb_sent == 0) {
 		TAILQ_INSERT_HEAD(&pending_requests.requests, dummy, next);
 		dummy_used = true;
+
+		if (rte_eal_alarm_set(1, async_reply_handle,
+				(void *)(uintptr_t)dummy->id) < 0) {
+			EAL_LOG(ERR, "Fail to set alarm for dummy request");
+			/* roll back the changes */
+			TAILQ_REMOVE(&pending_requests.requests, dummy, next);
+			dummy_used = false;
+			ret = -1;
+			goto closedir_fail;
+		}
 	}
 
 	/* finally, unlock the queue */
-- 
2.47.3


^ permalink raw reply related

* [PATCH v6 5/6] eal: fix memory leak in async IPC secondary path
From: Anatoly Burakov @ 2026-06-25 14:01 UTC (permalink / raw)
  To: dev, Jianfeng Tan
In-Reply-To: <cover.1782395581.git.anatoly.burakov@intel.com>

When rte_mp_request_async() succeeds on the secondary process path, the
dummy request is freed only if it was inserted into the queue. However,
when the actual request was sent successfully (nb_sent > 0), the dummy is
not used and the function returns without freeing it.

Free dummy before returning on the success path when it was not inserted
into the queue.

Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/common/eal_common_proc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 5cd1bb8d13..235687ab84 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -1209,6 +1209,8 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 		/* if we couldn't send anything, clean up */
 		if (ret != 0)
 			goto fail;
+		if (!dummy_used)
+			free(dummy);
 		return 0;
 	}
 
-- 
2.47.3


^ permalink raw reply related

* [PATCH v6 4/6] eal: fix async IPC memory leaks on partial failure
From: Anatoly Burakov @ 2026-06-25 14:01 UTC (permalink / raw)
  To: dev, Jianfeng Tan
In-Reply-To: <cover.1782395581.git.anatoly.burakov@intel.com>

When rte_mp_request_async() fails to send requests to all peers,
copy and param can lose ownership and leak.

However, we cannot simply free them unconditionally, as "partial failure"
means some requests were already queued and thus still reference `copy` and
`param`, so freeing them directly on the error path can cause
use-after-free when those requests are later handled by the async timeout.

Fix this by rolling back queued requests from the current batch, and reset
nb_sent to 0. Freeing the requests is now safe even if some requests were
sent, as any responses or timeouts will not find the request ID in the
queue and will safely exit without doing anything.

Coverity issue: 501503
Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/common/eal_common_proc.c | 34 +++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 869ce99bf9..5cd1bb8d13 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -1242,7 +1242,34 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 		} else if (mp_request_async(path, copy, param, ts))
 			ret = -1;
 	}
-	/* if we didn't send anything, put dummy request on the queue */
+
+	/*
+	 * On partial failure, roll back all queued requests. We hold the lock
+	 * so no one else touches the queue. All requests in this batch share
+	 * the same param pointer. Stale alarms will fire and harmlessly find
+	 * nothing via ID-based lookup.
+	 */
+	if (ret != 0 && reply->nb_sent > 0) {
+		struct pending_request *r, *next;
+
+		for (r = TAILQ_FIRST(&pending_requests.requests);
+				r != NULL; r = next) {
+			next = TAILQ_NEXT(r, next);
+			if (r->type == REQUEST_TYPE_ASYNC &&
+					r->async.param == param) {
+				TAILQ_REMOVE(&pending_requests.requests,
+						r, next);
+				free(r->reply);
+				/* r->request == copy, freed below after the loop */
+				free(r);
+			}
+		}
+		reply->nb_sent = 0;
+	}
+
+	/* if we didn't send anything, put dummy request on the queue
+	 * and set a minimum-delay alarm so the callback fires immediately.
+	 */
 	if (ret == 0 && reply->nb_sent == 0) {
 		TAILQ_INSERT_HEAD(&pending_requests.requests, dummy, next);
 		dummy_used = true;
@@ -1260,6 +1287,11 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 	/* if dummy was unused, free it */
 	if (!dummy_used)
 		free(dummy);
+	/* if nothing was sent, nobody owns copy/param */
+	if (ret != 0) {
+		free(param);
+		free(copy);
+	}
 
 	return ret;
 closedir_fail:
-- 
2.47.3


^ permalink raw reply related

* [PATCH v6 3/6] eal: avoid deadlock in async IPC alarm callback
From: Anatoly Burakov @ 2026-06-25 14:01 UTC (permalink / raw)
  To: dev, Jianfeng Tan
In-Reply-To: <cover.1782395581.git.anatoly.burakov@intel.com>

async_reply_handle_thread_unsafe() can run while holding
pending_requests.lock and currently calls rte_eal_alarm_cancel().

rte_eal_alarm_cancel() may spin-wait for an executing callback, which can
deadlock if that callback is blocked on the same lock.

Remove callback-side alarm cancellation. It is safe to do so, because any
callback triggered without a pending request becomes a noop due to the
async request lookup now using numerical ID.

Fixes: daf9bfca717e ("ipc: remove thread for async requests")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/common/eal_common_proc.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 3e32ee5027..869ce99bf9 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -549,19 +549,6 @@ async_reply_handle_thread_unsafe(struct pending_request *req)
 
 	TAILQ_REMOVE(&pending_requests.requests, req, next);
 
-	if (rte_eal_alarm_cancel(async_reply_handle,
-			(void *)(uintptr_t)req->id) < 0) {
-		/* if we failed to cancel the alarm because it's already in
-		 * progress, don't proceed because otherwise we will end up
-		 * handling the same message twice.
-		 */
-		if (rte_errno == EINPROGRESS) {
-			EAL_LOG(DEBUG, "Request handling is already in progress");
-			goto no_trigger;
-		}
-		EAL_LOG(ERR, "Failed to cancel alarm");
-	}
-
 	if (action == ACTION_TRIGGER)
 		return req;
 no_trigger:
-- 
2.47.3


^ permalink raw reply related

* [PATCH v6 2/6] eal: use request ID instead of pointers
From: Anatoly Burakov @ 2026-06-25 14:01 UTC (permalink / raw)
  To: dev, Jianfeng Tan
In-Reply-To: <cover.1782395581.git.anatoly.burakov@intel.com>

Initial implementation of async IPC request handling was using request
pointers directly. Because of the nature of how IPC is meant to work and
that requests ownership is disconnected from their creation (as in, freeing
a request may happen due to timeout, or due to received response, or due
to rollback because of a later failure), using pointers as identity is not
safe.

Use numeric request ID for async request lookup instead. This way, we can
safely free requests even if we are already waiting on responses/timeouts
for them, as the pointers themselves will not be referenced directly by
the response/timeout.

Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/common/eal_common_proc.c | 63 ++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 20 deletions(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 799c6e81b0..3e32ee5027 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -74,6 +74,7 @@ struct async_request_param {
 
 struct pending_request {
 	TAILQ_ENTRY(pending_request) next;
+	unsigned long id;
 	enum {
 		REQUEST_TYPE_SYNC,
 		REQUEST_TYPE_ASYNC
@@ -92,6 +93,8 @@ struct pending_request {
 	};
 };
 
+static unsigned long next_request_id;
+
 TAILQ_HEAD(pending_request_list, pending_request);
 
 static struct {
@@ -111,15 +114,15 @@ mp_send(struct rte_mp_msg *msg, const char *peer, int type);
 static void
 async_reply_handle(void *arg);
 
-/* for use with process_msg */
+/* for use with alarm callback and process_msg */
 static struct pending_request *
-async_reply_handle_thread_unsafe(void *arg);
+async_reply_handle_thread_unsafe(struct pending_request *req);
 
 static void
 trigger_async_action(struct pending_request *req);
 
 static struct pending_request *
-find_pending_request(const char *dst, const char *act_name)
+find_request_by_name(const char *dst, const char *act_name)
 {
 	struct pending_request *r;
 
@@ -132,6 +135,19 @@ find_pending_request(const char *dst, const char *act_name)
 	return r;
 }
 
+static struct pending_request *
+find_async_request_by_id(unsigned long id)
+{
+	struct pending_request *r;
+
+	TAILQ_FOREACH(r, &pending_requests.requests, next) {
+		if (r->id == id && r->type == REQUEST_TYPE_ASYNC)
+			return r;
+	}
+
+	return NULL;
+}
+
 /*
  * Combine prefix and name(optional) to return unix domain socket path
  * return the number of characters that would have been put into buffer.
@@ -354,7 +370,7 @@ process_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 		struct pending_request *req = NULL;
 
 		pthread_mutex_lock(&pending_requests.lock);
-		pending_req = find_pending_request(s->sun_path, msg->name);
+		pending_req = find_request_by_name(s->sun_path, msg->name);
 		if (pending_req) {
 			memcpy(pending_req->reply, msg, sizeof(*msg));
 			/* -1 indicates that we've been asked to ignore */
@@ -519,9 +535,8 @@ trigger_async_action(struct pending_request *sr)
 }
 
 static struct pending_request *
-async_reply_handle_thread_unsafe(void *arg)
+async_reply_handle_thread_unsafe(struct pending_request *req)
 {
-	struct pending_request *req = (struct pending_request *)arg;
 	enum async_action action;
 	struct timespec ts_now;
 
@@ -534,7 +549,8 @@ async_reply_handle_thread_unsafe(void *arg)
 
 	TAILQ_REMOVE(&pending_requests.requests, req, next);
 
-	if (rte_eal_alarm_cancel(async_reply_handle, req) < 0) {
+	if (rte_eal_alarm_cancel(async_reply_handle,
+			(void *)(uintptr_t)req->id) < 0) {
 		/* if we failed to cancel the alarm because it's already in
 		 * progress, don't proceed because otherwise we will end up
 		 * handling the same message twice.
@@ -557,9 +573,13 @@ static void
 async_reply_handle(void *arg)
 {
 	struct pending_request *req;
+	/* alarm arg carries the request ID packed into a void * via uintptr_t */
+	unsigned long id = (uintptr_t)arg;
 
 	pthread_mutex_lock(&pending_requests.lock);
-	req = async_reply_handle_thread_unsafe(arg);
+	req = find_async_request_by_id(id);
+	if (req != NULL)
+		req = async_reply_handle_thread_unsafe(req);
 	pthread_mutex_unlock(&pending_requests.lock);
 
 	if (req != NULL)
@@ -878,8 +898,19 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
 {
 	struct rte_mp_msg *reply_msg;
 	struct pending_request *pending_req, *exist;
+	unsigned long id;
 	int ret = -1;
 
+	/* queue already locked by caller */
+
+	exist = find_request_by_name(dst, req->name);
+	if (exist) {
+		EAL_LOG(ERR, "A pending request %s:%s", dst, req->name);
+		rte_errno = EEXIST;
+		return -1;
+	}
+
+	id = ++next_request_id;
 	pending_req = calloc(1, sizeof(*pending_req));
 	reply_msg = calloc(1, sizeof(*reply_msg));
 	if (pending_req == NULL || reply_msg == NULL) {
@@ -890,21 +921,12 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
 	}
 
 	pending_req->type = REQUEST_TYPE_ASYNC;
+	pending_req->id = id;
 	strlcpy(pending_req->dst, dst, sizeof(pending_req->dst));
 	pending_req->request = req;
 	pending_req->reply = reply_msg;
 	pending_req->async.param = param;
 
-	/* queue already locked by caller */
-
-	exist = find_pending_request(dst, req->name);
-	if (exist) {
-		EAL_LOG(ERR, "A pending request %s:%s", dst, req->name);
-		rte_errno = EEXIST;
-		ret = -1;
-		goto fail;
-	}
-
 	ret = send_msg(dst, req, MP_REQ);
 	if (ret < 0) {
 		EAL_LOG(ERR, "Fail to send request %s:%s",
@@ -919,7 +941,7 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
 
 	/* if alarm set fails, we simply ignore the reply */
 	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
-			      async_reply_handle, pending_req) < 0) {
+			async_reply_handle, (void *)(uintptr_t)id) < 0) {
 		EAL_LOG(ERR, "Fail to set alarm for request %s:%s",
 			dst, req->name);
 		ret = -1;
@@ -952,7 +974,7 @@ mp_request_sync(const char *dst, struct rte_mp_msg *req,
 	pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
 	pthread_cond_init(&pending_req.sync.cond, &attr);
 
-	exist = find_pending_request(dst, req->name);
+	exist = find_request_by_name(dst, req->name);
 	if (exist) {
 		EAL_LOG(ERR, "A pending request %s:%s", dst, req->name);
 		rte_errno = EEXIST;
@@ -1178,6 +1200,7 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 	 * it, and put it on the queue if we don't send any requests.
 	 */
 	dummy->type = REQUEST_TYPE_ASYNC;
+	dummy->id = ++next_request_id;
 	dummy->request = copy;
 	dummy->reply = NULL;
 	dummy->async.param = param;
-- 
2.47.3


^ permalink raw reply related

* [PATCH v6 1/6] eal: fix wrong log message in async IPC request
From: Anatoly Burakov @ 2026-06-25 14:01 UTC (permalink / raw)
  To: dev, Jianfeng Tan
In-Reply-To: <cover.1782395581.git.anatoly.burakov@intel.com>

The allocation failure log message in mp_request_async() says "sync
request" but the function handles asynchronous requests.

Fix the log to say "async request".

Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/common/eal_common_proc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 06f151818c..799c6e81b0 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -883,7 +883,7 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
 	pending_req = calloc(1, sizeof(*pending_req));
 	reply_msg = calloc(1, sizeof(*reply_msg));
 	if (pending_req == NULL || reply_msg == NULL) {
-		EAL_LOG(ERR, "Could not allocate space for sync request");
+		EAL_LOG(ERR, "Could not allocate space for async request");
 		rte_errno = ENOMEM;
 		ret = -1;
 		goto fail;
-- 
2.47.3


^ permalink raw reply related

* [PATCH v6 0/6] IPC fixes
From: Anatoly Burakov @ 2026-06-25 14:01 UTC (permalink / raw)
  To: dev
In-Reply-To: <740b39c5098b4d40cafb9881ad70865a3c889012.1773936429.git.anatoly.burakov@intel.com>

Coverity has reported (issue ID 501503) a memory leak, but there
actually were a few more problems with IPC than that. This patchset
addresses said problems.

1. Using pointer as async request identity is unsafe

Because asynchronous requests can fail at arbitrary points while
having arbitrary number of requests or alarms already in flight,
using pointer as request identity can create use-after-free risks.
Patchset replaces this with using numeric request ID instead.

2. Alarm cancel can deadlock

Async request handler may attempt to cancel the alarm, but an alarm
might have already been in progress blocking on the same lock that
is held by async request, leading to a deadlock. Patchset removes
the alarm cancel call, and allows the alarm to fire. This is fine,
because due to fix #1 the worst that can happen from calling stale
alarm is a noop, as request ID would not be found.

3. Memory leaks

There are a couple of memory leaks in failure paths. Patchset fixes
those.

4. Zero-peer async request does not trigger alarm

When async requests are performed but no peers exist, we created
a dummy request and put it on the queue, but we never set the
dummy alarm that is supposed to handle that request. Patchset adds
the alarm set in dummy paths where none was present before.

v6:

Moved pieces around, namely:

1) apply request ID refactor first as a standalone patch
2) fix the deadlock immediately after
3) fix memory leaks next
4) add missing callback as a final step

Contents of the patchset remain the same.

Anatoly Burakov (6):
  eal: fix wrong log message in async IPC request
  eal: use request ID instead of pointers
  eal: avoid deadlock in async IPC alarm callback
  eal: fix async IPC memory leaks on partial failure
  eal: fix memory leak in async IPC secondary path
  eal: fix async IPC callback not fired when no peers

 lib/eal/common/eal_common_proc.c | 133 +++++++++++++++++++++++--------
 1 file changed, 99 insertions(+), 34 deletions(-)

-- 
2.47.3


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox