public inbox for dev@dpdk.org
* [PATCH v2] net/netvsc: switch data path to synthetic on device stop
@ 2026-03-23 23:49 Long Li
  2026-03-24 16:00 ` Stephen Hemminger
  0 siblings, 1 reply; 4+ messages in thread
From: Long Li @ 2026-03-23 23:49 UTC
  To: dev, Wei Hu, Stephen Hemminger, stable; +Cc: Long Li

When DPDK stops a netvsc device (e.g. on testpmd quit), the data path
was left pointing to the VF/MANA device. If the kernel netvsc driver
subsequently reloads the MANA device and opens it, incoming traffic
arrives on the MANA device immediately, before the queues are fully
initialized. This causes bogus RX completion events to appear on the
TX completion queue, triggering a kernel WARNING in mana_poll_tx_cq().

Fix this by switching the data path back to synthetic (via
NVS_DATAPATH_SYNTHETIC) in hn_vf_stop() before stopping the VF device.
This tells the host to route traffic through the synthetic path, so
that when the MANA driver recreates its queues, no unexpected traffic
arrives until netvsc explicitly switches back to VF.

Also update hn_vf_start() to switch the data path back to VF after the
VF device is started, enabling correct stop/start cycling.

Both functions now use write locks instead of read locks since they
modify vf_vsc_switched state.

Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org

Signed-off-by: Long Li <longli@microsoft.com>
---
v2:
- hn_vf_stop(): only clear vf_vsc_switched on successful datapath switch
- hn_vf_start(): stop VF device if datapath switch to VF fails

 drivers/net/netvsc/hn_vf.c | 59 ++++++++++++++++++++++++++++++++------
 1 file changed, 51 insertions(+), 8 deletions(-)

diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
index 99e8086afa..7840c77c5c 100644
--- a/drivers/net/netvsc/hn_vf.c
+++ b/drivers/net/netvsc/hn_vf.c
@@ -314,9 +314,17 @@ int hn_vf_add_unlocked(struct rte_eth_dev *dev, struct hn_data *hv)
 	}
 
 switch_data_path:
-	ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_VF);
-	if (ret == 0)
-		hv->vf_ctx.vf_vsc_switched = true;
+	/* Only switch data path to VF if the device is started.
+	 * Otherwise defer to hn_vf_start() to avoid routing traffic
+	 * to the VF before queues are set up.
+	 */
+	if (dev->data->dev_started) {
+		ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_VF);
+		if (ret)
+			PMD_DRV_LOG(ERR, "Failed to switch to VF: %d", ret);
+		else
+			hv->vf_ctx.vf_vsc_switched = true;
+	}
 
 exit:
 	return ret;
@@ -521,11 +529,32 @@ int hn_vf_start(struct rte_eth_dev *dev)
 	struct rte_eth_dev *vf_dev;
 	int ret = 0;
 
-	rte_rwlock_read_lock(&hv->vf_lock);
+	rte_rwlock_write_lock(&hv->vf_lock);
 	vf_dev = hn_get_vf_dev(hv);
-	if (vf_dev)
+	if (vf_dev) {
 		ret = rte_eth_dev_start(vf_dev->data->port_id);
-	rte_rwlock_read_unlock(&hv->vf_lock);
+		if (ret == 0) {
+			/* Re-switch data path to VF if VSP has reported
+			 * VF is present and we haven't switched yet
+			 * (e.g. after a stop/start cycle).
+			 */
+			if (hv->vf_ctx.vf_vsp_reported &&
+			    !hv->vf_ctx.vf_vsc_switched) {
+				ret = hn_nvs_set_datapath(hv,
+							  NVS_DATAPATH_VF);
+				if (ret) {
+					PMD_DRV_LOG(ERR,
+						    "Failed to switch to VF: %d",
+						    ret);
+					rte_eth_dev_stop(
+						vf_dev->data->port_id);
+				} else {
+					hv->vf_ctx.vf_vsc_switched = true;
+				}
+			}
+		}
+	}
+	rte_rwlock_write_unlock(&hv->vf_lock);
 	return ret;
 }
 
@@ -535,15 +564,29 @@ int hn_vf_stop(struct rte_eth_dev *dev)
 	struct rte_eth_dev *vf_dev;
 	int ret = 0;
 
-	rte_rwlock_read_lock(&hv->vf_lock);
+	rte_rwlock_write_lock(&hv->vf_lock);
 	vf_dev = hn_get_vf_dev(hv);
 	if (vf_dev) {
+		/* Switch data path back to synthetic before stopping VF,
+		 * so the host stops routing traffic to the VF device.
+		 */
+		if (hv->vf_ctx.vf_vsc_switched) {
+			ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC);
+			if (ret) {
+				PMD_DRV_LOG(ERR,
+					    "Failed to switch to synthetic: %d",
+					    ret);
+			} else {
+				hv->vf_ctx.vf_vsc_switched = false;
+			}
+		}
+
 		ret = rte_eth_dev_stop(vf_dev->data->port_id);
 		if (ret != 0)
 			PMD_DRV_LOG(ERR, "Failed to stop device on port %u",
 				    vf_dev->data->port_id);
 	}
-	rte_rwlock_read_unlock(&hv->vf_lock);
+	rte_rwlock_write_unlock(&hv->vf_lock);
 
 	return ret;
 }
-- 
2.43.0



* [PATCH v2] net/netvsc: switch data path to synthetic on device stop
  2026-03-21  0:43 [PATCH] " Long Li
@ 2026-03-23 23:58 ` Long Li
  0 siblings, 0 replies; 4+ messages in thread
From: Long Li @ 2026-03-23 23:58 UTC
  To: dev, Wei Hu, Stephen Hemminger, stable; +Cc: Long Li

When DPDK stops a netvsc device (e.g. on testpmd quit), the data path
was left pointing to the VF/MANA device. If the kernel netvsc driver
subsequently reloads the MANA device and opens it, incoming traffic
arrives on the MANA device immediately, before the queues are fully
initialized. This causes bogus RX completion events to appear on the
TX completion queue, triggering a kernel WARNING in mana_poll_tx_cq().

Fix this by switching the data path back to synthetic (via
NVS_DATAPATH_SYNTHETIC) in hn_vf_stop() before stopping the VF device.
This tells the host to route traffic through the synthetic path, so
that when the MANA driver recreates its queues, no unexpected traffic
arrives until netvsc explicitly switches back to VF.

Also update hn_vf_start() to switch the data path back to VF after the
VF device is started, enabling correct stop/start cycling.

Both functions now use write locks instead of read locks since they
modify vf_vsc_switched state.

Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org

Signed-off-by: Long Li <longli@microsoft.com>
---

v2:
- hn_vf_stop(): only clear vf_vsc_switched on successful datapath switch
- hn_vf_start(): stop VF device if datapath switch to VF fails

 drivers/net/netvsc/hn_vf.c | 58 ++++++++++++++++++++++++++++++++------
 1 file changed, 50 insertions(+), 8 deletions(-)

diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
index 99e8086afa..c3f024b378 100644
--- a/drivers/net/netvsc/hn_vf.c
+++ b/drivers/net/netvsc/hn_vf.c
@@ -314,9 +314,17 @@ int hn_vf_add_unlocked(struct rte_eth_dev *dev, struct hn_data *hv)
 	}
 
 switch_data_path:
-	ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_VF);
-	if (ret == 0)
-		hv->vf_ctx.vf_vsc_switched = true;
+	/* Only switch data path to VF if the device is started.
+	 * Otherwise defer to hn_vf_start() to avoid routing traffic
+	 * to the VF before queues are set up.
+	 */
+	if (dev->data->dev_started) {
+		ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_VF);
+		if (ret)
+			PMD_DRV_LOG(ERR, "Failed to switch to VF: %d", ret);
+		else
+			hv->vf_ctx.vf_vsc_switched = true;
+	}
 
 exit:
 	return ret;
@@ -521,11 +529,31 @@ int hn_vf_start(struct rte_eth_dev *dev)
 	struct rte_eth_dev *vf_dev;
 	int ret = 0;
 
-	rte_rwlock_read_lock(&hv->vf_lock);
+	rte_rwlock_write_lock(&hv->vf_lock);
 	vf_dev = hn_get_vf_dev(hv);
-	if (vf_dev)
+	if (vf_dev) {
 		ret = rte_eth_dev_start(vf_dev->data->port_id);
-	rte_rwlock_read_unlock(&hv->vf_lock);
+		if (ret == 0) {
+			/* Re-switch data path to VF if VSP has reported
+			 * VF is present and we haven't switched yet
+			 * (e.g. after a stop/start cycle).
+			 */
+			if (hv->vf_ctx.vf_vsp_reported &&
+			    !hv->vf_ctx.vf_vsc_switched) {
+				ret = hn_nvs_set_datapath(hv,
+							  NVS_DATAPATH_VF);
+				if (ret) {
+					PMD_DRV_LOG(ERR,
+						    "Failed to switch to VF: %d",
+						    ret);
+					rte_eth_dev_stop(vf_dev->data->port_id);
+				} else {
+					hv->vf_ctx.vf_vsc_switched = true;
+				}
+			}
+		}
+	}
+	rte_rwlock_write_unlock(&hv->vf_lock);
 	return ret;
 }
 
@@ -535,15 +563,29 @@ int hn_vf_stop(struct rte_eth_dev *dev)
 	struct rte_eth_dev *vf_dev;
 	int ret = 0;
 
-	rte_rwlock_read_lock(&hv->vf_lock);
+	rte_rwlock_write_lock(&hv->vf_lock);
 	vf_dev = hn_get_vf_dev(hv);
 	if (vf_dev) {
+		/* Switch data path back to synthetic before stopping VF,
+		 * so the host stops routing traffic to the VF device.
+		 */
+		if (hv->vf_ctx.vf_vsc_switched) {
+			ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC);
+			if (ret) {
+				PMD_DRV_LOG(ERR,
+					    "Failed to switch to synthetic: %d",
+					    ret);
+			} else {
+				hv->vf_ctx.vf_vsc_switched = false;
+			}
+		}
+
 		ret = rte_eth_dev_stop(vf_dev->data->port_id);
 		if (ret != 0)
 			PMD_DRV_LOG(ERR, "Failed to stop device on port %u",
 				    vf_dev->data->port_id);
 	}
-	rte_rwlock_read_unlock(&hv->vf_lock);
+	rte_rwlock_write_unlock(&hv->vf_lock);
 
 	return ret;
 }
-- 
2.43.0



* Re: [PATCH v2] net/netvsc: switch data path to synthetic on device stop
  2026-03-23 23:49 [PATCH v2] net/netvsc: switch data path to synthetic on device stop Long Li
@ 2026-03-24 16:00 ` Stephen Hemminger
  2026-03-25 19:00   ` [EXTERNAL] " Long Li
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen Hemminger @ 2026-03-24 16:00 UTC
  To: Long Li; +Cc: dev, Wei Hu, stable

On Mon, 23 Mar 2026 16:49:57 -0700
Long Li <longli@microsoft.com> wrote:

> When DPDK stops a netvsc device (e.g. on testpmd quit), the data path
> was left pointing to the VF/MANA device. If the kernel netvsc driver
> subsequently reloads the MANA device and opens it, incoming traffic
> arrives on the MANA device immediately, before the queues are fully
> initialized. This causes bogus RX completion events to appear on the
> TX completion queue, triggering a kernel WARNING in mana_poll_tx_cq().
> 
> Fix this by switching the data path back to synthetic (via
> NVS_DATAPATH_SYNTHETIC) in hn_vf_stop() before stopping the VF device.
> This tells the host to route traffic through the synthetic path, so
> that when the MANA driver recreates its queues, no unexpected traffic
> arrives until netvsc explicitly switches back to VF.
> 
> Also update hn_vf_start() to switch the data path back to VF after the
> VF device is started, enabling correct stop/start cycling.
> 
> Both functions now use write locks instead of read locks since they
> modify vf_vsc_switched state.
> 
> Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Long Li <longli@microsoft.com>
> ---

AI spotted that ret is overwritten on error path


**Patch: [PATCH v2] net/netvsc: switch data path to synthetic on device stop**

The v2 addresses the two issues from v1 review: `hn_vf_stop()` now only clears `vf_vsc_switched` on success, and `hn_vf_start()` now stops the VF if the datapath switch fails. Both fixes are correct.

**Warning: `hn_vf_stop()` — `ret` from failed datapath switch is overwritten by `rte_eth_dev_stop()`**

When `hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC)` fails, `ret` holds that error. But execution falls through to `rte_eth_dev_stop()`, which overwrites `ret`. The caller loses the datapath switch failure — if `rte_eth_dev_stop()` succeeds, `hn_vf_stop()` returns 0 despite the datapath switch having failed. If the intent is to always stop the VF regardless of the switch result (reasonable), the datapath error should be preserved separately, or the function should return the first error. Something like:

```c
		if (hv->vf_ctx.vf_vsc_switched) {
			ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC);
			if (ret) {
				PMD_DRV_LOG(ERR,
					    "Failed to switch to synthetic: %d",
					    ret);
			} else {
				hv->vf_ctx.vf_vsc_switched = false;
			}
		}

		err = rte_eth_dev_stop(vf_dev->data->port_id);
		if (err != 0)
			PMD_DRV_LOG(ERR, "Failed to stop device on port %u",
				    vf_dev->data->port_id);
		if (ret == 0)
			ret = err;
```

This preserves the first error while still attempting to stop the VF. Note that `err` is a new local `int` that would need to be declared next to `ret`.
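
To see the effect of the first-error pattern outside the driver, here is a minimal standalone sketch. `fake_switch_to_synthetic()`, `fake_dev_stop()` and `stop_first_error()` are hypothetical names; the two stubs stand in for `hn_nvs_set_datapath()` and `rte_eth_dev_stop()` and only return the result injected by the caller. None of this is the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for hn_nvs_set_datapath() and
 * rte_eth_dev_stop(): each returns the result it is handed.
 */
static int fake_switch_to_synthetic(int result) { return result; }
static int fake_dev_stop(int result) { return result; }

/* Models the suggested hn_vf_stop() flow: the stop is always
 * attempted, and the first failure is what the caller sees.
 * *switched plays the role of vf_vsc_switched.
 */
static int stop_first_error(int switch_result, int stop_result,
			    int *switched)
{
	int ret = 0;
	int err;

	assert(switched != NULL);

	if (*switched) {
		ret = fake_switch_to_synthetic(switch_result);
		if (ret == 0)
			*switched = 0;	/* clear only on success */
	}

	err = fake_dev_stop(stop_result);
	if (ret == 0)
		ret = err;	/* keep the earlier error if there was one */

	return ret;
}
```

With a failing switch (-5) and a succeeding stop (0), this returns -5 and leaves the flag set. The v2 code as posted would return 0 in the same case, because `ret` is reassigned by the stop call.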

Everything else looks correct. The lock upgrades from read to write are appropriate, the conditional logic in `hn_vf_add_unlocked()` correctly defers the datapath switch when the device isn't started, and `hn_vf_start()` properly rolls back the VF start on datapath switch failure.
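
The deferral and rollback behaviour described above can be sketched as a small state model. This is an illustration only: `struct vf_model` and the `model_*` functions are hypothetical, the field names merely mirror those in the patch, locking is omitted, and `switch_result` is an injected value standing in for whatever `hn_nvs_set_datapath()` would return. The stop path here returns the switch error, following the suggestion above rather than the v2 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the patch touches; not struct hn_data. */
struct vf_model {
	bool dev_started;	/* netvsc port has been started */
	bool vf_started;	/* VF port has been started */
	bool vf_vsp_reported;	/* host has reported a VF */
	bool vf_vsc_switched;	/* data path currently points at the VF */
	int switch_result;	/* injected result of a datapath switch */
};

static int model_set_datapath(const struct vf_model *m)
{
	return m->switch_result;
}

/* VF add: switch only if the netvsc device is already started,
 * otherwise leave the switch to model_vf_start().
 */
static int model_vf_add(struct vf_model *m)
{
	int ret = 0;

	assert(m != NULL);
	if (m->dev_started) {
		ret = model_set_datapath(m);
		if (ret == 0)
			m->vf_vsc_switched = true;
	}
	return ret;
}

/* Start: bring the VF up, then switch; undo the start if the switch fails. */
static int model_vf_start(struct vf_model *m)
{
	int ret = 0;

	assert(m != NULL);
	m->vf_started = true;	/* assumes the VF start itself succeeds */
	if (m->vf_vsp_reported && !m->vf_vsc_switched) {
		ret = model_set_datapath(m);
		if (ret)
			m->vf_started = false;	/* roll back */
		else
			m->vf_vsc_switched = true;
	}
	return ret;
}

/* Stop: switch back to synthetic first, then stop the VF regardless. */
static int model_vf_stop(struct vf_model *m)
{
	int ret = 0;

	assert(m != NULL);
	if (m->vf_vsc_switched) {
		ret = model_set_datapath(m);
		if (ret == 0)
			m->vf_vsc_switched = false;
	}
	m->vf_started = false;
	return ret;
}
```

Walking a model through add (device not started), start, stop, and a start with a failing switch shows the flag staying clear until start, being cleared again on stop, and the VF ending up stopped when the switch fails.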



* RE: [EXTERNAL] Re: [PATCH v2] net/netvsc: switch data path to synthetic on device stop
  2026-03-24 16:00 ` Stephen Hemminger
@ 2026-03-25 19:00   ` Long Li
  0 siblings, 0 replies; 4+ messages in thread
From: Long Li @ 2026-03-25 19:00 UTC
  To: Stephen Hemminger; +Cc: dev@dpdk.org, Wei Hu, stable@dpdk.org

> On Mon, 23 Mar 2026 16:49:57 -0700
> Long Li <longli@microsoft.com> wrote:
> 
> > When DPDK stops a netvsc device (e.g. on testpmd quit), the data path
> > was left pointing to the VF/MANA device. If the kernel netvsc driver
> > subsequently reloads the MANA device and opens it, incoming traffic
> > arrives on the MANA device immediately, before the queues are fully
> > initialized. This causes bogus RX completion events to appear on the
> > TX completion queue, triggering a kernel WARNING in mana_poll_tx_cq().
> >
> > Fix this by switching the data path back to synthetic (via
> > NVS_DATAPATH_SYNTHETIC) in hn_vf_stop() before stopping the VF device.
> > This tells the host to route traffic through the synthetic path, so
> > that when the MANA driver recreates its queues, no unexpected traffic
> > arrives until netvsc explicitly switches back to VF.
> >
> > Also update hn_vf_start() to switch the data path back to VF after the
> > VF device is started, enabling correct stop/start cycling.
> >
> > Both functions now use write locks instead of read locks since they
> > modify vf_vsc_switched state.
> >
> > Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Long Li <longli@microsoft.com>
> > ---
> 
> AI spotted that ret is overwritten on error path

I have sent v3.

Thanks,
Long

> 
> 
> **Patch: [PATCH v2] net/netvsc: switch data path to synthetic on device
> stop**
> 
> The v2 addresses the two issues from v1 review: `hn_vf_stop()` now only
> clears `vf_vsc_switched` on success, and `hn_vf_start()` now stops the VF if
> the datapath switch fails. Both fixes are correct.
> 
> **Warning: `hn_vf_stop()` — `ret` from failed datapath switch is overwritten
> by `rte_eth_dev_stop()`**
> 
> When `hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC)` fails, `ret`
> holds that error. But execution falls through to `rte_eth_dev_stop()`, which
> overwrites `ret`. The caller loses the datapath switch failure — if
> `rte_eth_dev_stop()` succeeds, `hn_vf_stop()` returns 0 despite the datapath
> switch having failed. If the intent is to always stop the VF regardless of the
> switch result (reasonable), the datapath error should be preserved separately,
> or the function should return the first error. Something like:
> 
> ```c
> 		if (hv->vf_ctx.vf_vsc_switched) {
> 			ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC);
> 			if (ret) {
> 				PMD_DRV_LOG(ERR,
> 					    "Failed to switch to synthetic: %d",
> 					    ret);
> 			} else {
> 				hv->vf_ctx.vf_vsc_switched = false;
> 			}
> 		}
> 
> 		err = rte_eth_dev_stop(vf_dev->data->port_id);
> 		if (err != 0)
> 			PMD_DRV_LOG(ERR, "Failed to stop device on port %u",
> 				    vf_dev->data->port_id);
> 		if (ret == 0)
> 			ret = err;
> ```
> 
> This preserves the first error while still attempting to stop the VF.
> 
> Everything else looks correct. The lock upgrades from read to write are
> appropriate, the conditional logic in `hn_vf_add_unlocked()` correctly defers
> the datapath switch when the device isn't started, and `hn_vf_start()` properly
> rolls back the VF start on datapath switch failure.



end of thread, other threads:[~2026-03-25 19:00 UTC | newest]

Thread overview: 4+ messages
2026-03-23 23:49 [PATCH v2] net/netvsc: switch data path to synthetic on device stop Long Li
2026-03-24 16:00 ` Stephen Hemminger
2026-03-25 19:00   ` [EXTERNAL] " Long Li
  -- strict thread matches above, loose matches on Subject: below --
2026-03-21  0:43 [PATCH] " Long Li
2026-03-23 23:58 ` [PATCH v2] " Long Li
