* [PATCH v3 0/6] net/netvsc: bug fixes and runtime queue reconfiguration
@ 2026-03-03 2:53 Long Li
2026-03-03 2:53 ` [PATCH v3 1/6] net/netvsc: fix subchannel leak on device removal Long Li
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Long Li @ 2026-03-03 2:53 UTC
To: dev; +Cc: stephen, longli, weh, stable
This series fixes several resource management bugs in the netvsc PMD and
adds support for runtime queue count reconfiguration.
Patches 1-5 are bug fixes:
1. Subchannel leak on device removal: subchannels allocated during
configure are never closed in uninit.
2. Double-free of primary Rx queue: hn_dev_free_queues() already
frees hv->primary, then uninit frees it again.
3. Init error path leaks: hv->primary is not freed and hv->channels[0]
is not closed on failure after allocation.
4. Event callback leak: device event callback not unregistered when
rxfilter or vf_start fails in hn_dev_start().
5. MTU change path leaks: missing chimney bitmap teardown/rebuild,
rxbuf_info leak, and missing VMBus channel close before free.
Patch 6 adds runtime queue count reconfiguration via port
stop/configure/start, with full NVS/RNDIS session teardown and reinit
when the queue count changes.
v3:
- Split into 6 patches: 5 bug fixes (patches 1-5) + 1 feature (patch 6)
- New patch 2: fix double-free of primary Rx queue on uninit
- New patch 3: fix resource leak in init error path
- New patch 4: fix event callback leak on rxfilter/vf_start failure
- New patch 5: fix resource leaks in MTU change path (chimney bitmap,
rxbuf_info leak, missing VMBus chan close before free)
- Patch 6 (reconfig): fix subchan_cleanup to call hn_rndis_conf_offload
so device retains offload config on partial failure
- Patch 6 (reconfig): fix recovery path to set hv->primary->chan after
re-opening channel, preventing use-after-free of stale pointer
- Patch 6 (reconfig): add hn_chim_init in recovery path to rebuild
chimney bitmap after successful recovery
- Fix Fixes tag on patch 5 to point to correct MTU support commit
v2:
- Split subchannel leak fix into separate patch with Fixes tag (patch 1)
- Fix reinit_failed recovery: re-map device before chan_open when device
is unmapped to prevent undefined behavior on unmapped ring buffers
- Move hn_rndis_conf_offload() to after reinit block so offload config
targets the final RNDIS session instead of being lost on teardown
- Use write lock in hn_vf_tx/rx_queue_release() to prevent race with
concurrent fast-path readers holding read lock
- Reset RSS indirection table to queue 0 in subchan_cleanup error path
- Fix multi-line comment style to follow DPDK convention
Long Li (6):
net/netvsc: fix subchannel leak on device removal
net/netvsc: fix double-free of primary Rx queue on uninit
net/netvsc: fix resource leak in init error path
net/netvsc: fix event callback leak on rxfilter failure
net/netvsc: fix resource leaks in MTU change path
net/netvsc: support runtime queue count reconfiguration
drivers/net/netvsc/hn_ethdev.c | 243 ++++++++++++++++++++++++++++++---
drivers/net/netvsc/hn_vf.c | 16 ++-
2 files changed, 234 insertions(+), 25 deletions(-)
--
2.43.0
* [PATCH v3 1/6] net/netvsc: fix subchannel leak on device removal
From: Long Li @ 2026-03-03 2:53 UTC
To: dev; +Cc: stephen, longli, weh, stable
eth_hn_dev_uninit() only closes the primary channel but never
closes subchannels allocated during hn_dev_configure(). This
leaks VMBus subchannel resources on device removal.
Close all subchannels before closing the primary channel to
prevent resource leaks.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
---
drivers/net/netvsc/hn_ethdev.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index 6584819f4f..798b4c9023 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -1433,6 +1433,7 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
{
struct hn_data *hv = eth_dev->data->dev_private;
int ret, ret_stop;
+ int i;
PMD_INIT_FUNC_TRACE();
@@ -1444,6 +1445,15 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
hn_detach(hv);
hn_chim_uninit(eth_dev);
+
+ /* Close any subchannels before closing the primary channel */
+ for (i = 1; i < HN_MAX_CHANNELS; i++) {
+ if (hv->channels[i] != NULL) {
+ rte_vmbus_chan_close(hv->channels[i]);
+ hv->channels[i] = NULL;
+ }
+ }
+
rte_vmbus_chan_close(hv->channels[0]);
rte_free(hv->primary);
ret = rte_eth_dev_owner_delete(hv->owner.id);
--
2.43.0
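The teardown order in this patch can be sketched in isolation. This is a minimal illustration, not driver code: `demo_chan`, `demo_chan_close()` and `DEMO_MAX_CHANNELS` are hypothetical stand-ins for the VMBus channel type, rte_vmbus_chan_close() and HN_MAX_CHANNELS. It shows that clearing each slot after closing it keeps the loop safe to run a second time.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CHANNELS 4	/* hypothetical; stands in for HN_MAX_CHANNELS */

/* Hypothetical stand-in for a VMBus channel. */
struct demo_chan {
	int is_open;
};

/*
 * Hypothetical stand-in for rte_vmbus_chan_close().
 * Returns 1 if the channel was open before the call, 0 otherwise.
 */
static int demo_chan_close(struct demo_chan *chan)
{
	int was_open;

	assert(chan != NULL);
	was_open = chan->is_open;
	chan->is_open = 0;
	return was_open;
}

/*
 * Close every subchannel (index 1 and up) and clear its slot.
 * Slot 0 is the primary channel and is left for the caller.
 * Returns the number of channels that were actually closed.
 */
static int demo_close_subchannels(struct demo_chan *channels[DEMO_MAX_CHANNELS])
{
	int i, closed = 0;

	for (i = 1; i < DEMO_MAX_CHANNELS; i++) {
		if (channels[i] != NULL) {
			closed += demo_chan_close(channels[i]);
			channels[i] = NULL;
		}
	}
	return closed;
}
```

Calling `demo_close_subchannels()` twice on the same array closes nothing the second time, because every subchannel slot is already NULL.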
* [PATCH v3 2/6] net/netvsc: fix double-free of primary Rx queue on uninit
From: Long Li @ 2026-03-03 2:53 UTC
To: dev; +Cc: stephen, longli, weh, stable
eth_hn_dev_uninit() calls hn_dev_close() which calls hn_dev_free_queues()
that frees all rx queues including hv->primary via hn_rx_queue_free(rxq,
false). After that, eth_hn_dev_uninit() calls rte_free(hv->primary) again,
resulting in a double-free.
Remove the redundant rte_free(hv->primary) since hn_dev_free_queues()
already handles freeing it.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
---
v3: New patch.
---
drivers/net/netvsc/hn_ethdev.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index 798b4c9023..9f61f3a1a5 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -1455,7 +1455,6 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
}
rte_vmbus_chan_close(hv->channels[0]);
- rte_free(hv->primary);
ret = rte_eth_dev_owner_delete(hv->owner.id);
if (ret != 0)
return ret;
--
2.43.0
* [PATCH v3 3/6] net/netvsc: fix resource leak in init error path
From: Long Li @ 2026-03-03 2:53 UTC
To: dev; +Cc: stephen, longli, weh, stable
The failed label in eth_hn_dev_init() does not free hv->primary or close
hv->channels[0], leaking both resources on any init failure after they
are allocated.
Additionally, the early return when hv->primary allocation fails leaks
hv->channels[0]. Change it to goto failed.
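Add rte_free(hv->primary) and rte_vmbus_chan_close(hv->channels[0]) to
the failed label to properly clean up on all error paths. Also set err
explicitly (to max_chan, or -ENODEV when it is 0) before jumping to
failed when rte_vmbus_max_channels() returns a non-positive value.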
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
---
v3: New patch.
---
drivers/net/netvsc/hn_ethdev.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index 9f61f3a1a5..19721b4829 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -1376,8 +1376,10 @@ eth_hn_dev_init(struct rte_eth_dev *eth_dev)
hv->primary = hn_rx_queue_alloc(hv, 0,
eth_dev->device->numa_node);
- if (!hv->primary)
- return -ENOMEM;
+ if (!hv->primary) {
+ err = -ENOMEM;
+ goto failed;
+ }
err = hn_attach(hv, RTE_ETHER_MTU);
if (err)
@@ -1403,8 +1405,10 @@ eth_hn_dev_init(struct rte_eth_dev *eth_dev)
max_chan = rte_vmbus_max_channels(vmbus);
PMD_INIT_LOG(DEBUG, "VMBus max channels %d", max_chan);
- if (max_chan <= 0)
+ if (max_chan <= 0) {
+ err = max_chan ? max_chan : -ENODEV;
goto failed;
+ }
if (hn_rndis_query_rsscaps(hv, &rxr_cnt) != 0)
rxr_cnt = 1;
@@ -1425,6 +1429,8 @@ eth_hn_dev_init(struct rte_eth_dev *eth_dev)
hn_chim_uninit(eth_dev);
hn_detach(hv);
+ rte_free(hv->primary);
+ rte_vmbus_chan_close(hv->channels[0]);
return err;
}
--
2.43.0
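The single-exit cleanup pattern this patch relies on can be sketched as follows. This is an illustration under stated assumptions, not the driver's code: `demo_dev`, `demo_init()` and the `fail_step` parameter are hypothetical, and plain malloc()/free() stand in for the channel open/close and queue alloc/free calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's resources. */
struct demo_dev {
	void *chan;	/* plays the role of hv->channels[0] */
	void *primary;	/* plays the role of hv->primary */
};

/*
 * fail_step simulates a failure: 0 = none, 1 = queue allocation,
 * 2 = a later step such as the attach call.
 */
static int demo_init(struct demo_dev *d, int fail_step)
{
	int err;

	assert(d != NULL);
	d->primary = NULL;
	d->chan = malloc(16);
	if (d->chan == NULL)
		return -ENOMEM;	/* nothing acquired yet, plain return is fine */

	d->primary = (fail_step == 1) ? NULL : malloc(16);
	if (d->primary == NULL) {
		err = -ENOMEM;
		goto failed;	/* an early return here would leak d->chan */
	}

	if (fail_step == 2) {
		err = -EIO;
		goto failed;
	}
	return 0;

failed:
	/* One label releases everything acquired so far. */
	free(d->primary);	/* free(NULL) is a no-op */
	d->primary = NULL;
	free(d->chan);
	d->chan = NULL;
	return err;
}
```

Every failure after the first acquisition sets `err` and jumps to the same label, so adding a new resource only requires one extra line in the cleanup block.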
* [PATCH v3 4/6] net/netvsc: fix event callback leak on rxfilter failure
From: Long Li @ 2026-03-03 2:53 UTC
To: dev; +Cc: stephen, longli, weh, stable
In hn_dev_start(), if hn_rndis_set_rxfilter() or hn_vf_start() fails
after the device event callback has been registered, the callback is
never unregistered. Unregister it and return the error on both paths.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
---
v3: New patch.
---
drivers/net/netvsc/hn_ethdev.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index 19721b4829..5e954b8812 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -1043,16 +1043,22 @@ hn_dev_start(struct rte_eth_dev *dev)
NDIS_PACKET_TYPE_BROADCAST |
NDIS_PACKET_TYPE_ALL_MULTICAST |
NDIS_PACKET_TYPE_DIRECTED);
- if (error)
+ if (error) {
+ rte_dev_event_callback_unregister(NULL,
+ netvsc_hotadd_callback, hv);
return error;
+ }
error = hn_vf_start(dev);
- if (error)
+ if (error) {
hn_rndis_set_rxfilter(hv, 0);
+ rte_dev_event_callback_unregister(NULL,
+ netvsc_hotadd_callback, hv);
+ return error;
+ }
/* Initialize Link state */
- if (error == 0)
- hn_dev_link_update(dev, 0);
+ hn_dev_link_update(dev, 0);
for (i = 0; i < hv->num_queues; i++) {
dev->data->tx_queue_state[i] = RTE_ETH_QUEUE_STATE_STARTED;
--
2.43.0
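The register/unregister pairing fixed here can be sketched with a counter in place of the real event-callback registry. All names below (`demo_start()`, `demo_callback_register()`, `demo_callbacks_registered`) are hypothetical, and the two integer parameters merely simulate the results of the rxfilter and VF start steps.

```c
#include <assert.h>

/* Hypothetical registry: number of callbacks currently registered. */
static int demo_callbacks_registered;

static void demo_callback_register(void)
{
	demo_callbacks_registered++;
}

static void demo_callback_unregister(void)
{
	assert(demo_callbacks_registered > 0);
	demo_callbacks_registered--;
}

/*
 * filter_err and vf_err simulate the return values of the two steps
 * that follow registration; 0 means success.
 */
static int demo_start(int filter_err, int vf_err)
{
	demo_callback_register();

	if (filter_err != 0) {
		demo_callback_unregister();	/* undo before returning */
		return filter_err;
	}

	if (vf_err != 0) {
		demo_callback_unregister();	/* same undo on the second path */
		return vf_err;
	}

	return 0;	/* success: callback stays registered */
}
```

After a failed `demo_start()` the counter is back at its previous value, which is the property the patch restores for hn_dev_start().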
* [PATCH v3 5/6] net/netvsc: fix resource leaks in MTU change path
From: Long Li @ 2026-03-03 2:54 UTC
To: dev; +Cc: stephen, longli, weh, stable
hn_dev_mtu_set() has several resource management bugs:
1. Calls rte_free(hv->channels[0]) without rte_vmbus_chan_close()
first, skipping the VMBus close protocol.
2. Does not free hv->primary->rxbuf_info before hn_detach(), causing
hn_nvs_conn_rxbuf() in hn_reinit() to leak the old allocation.
3. Does not call hn_chim_uninit()/hn_chim_init() around the
detach/reinit sequence, leaving a stale chimney bitmap that may
not match the new chim_cnt.
Fix all three issues.
Fixes: 45c83603087e ("net/netvsc: support MTU set")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
---
v3: New patch (split from reconfig patch).
---
drivers/net/netvsc/hn_ethdev.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index 5e954b8812..45d69272aa 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -1214,6 +1214,11 @@ hn_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
if (ret)
return ret;
+ /* Free chimney bitmap and rxbuf_info before NVS detach */
+ hn_chim_uninit(dev);
+ rte_free(hv->primary->rxbuf_info);
+ hv->primary->rxbuf_info = NULL;
+
/* Release channel resources */
hn_detach(hv);
@@ -1222,6 +1227,7 @@ hn_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
rte_vmbus_chan_close(hv->channels[i]);
/* Close primary vmbus channel */
+ rte_vmbus_chan_close(hv->channels[0]);
rte_free(hv->channels[0]);
/* Unmap and re-map vmbus device */
@@ -1248,13 +1254,17 @@ hn_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
rte_vmbus_set_latency(hv->vmbus, hv->channels[0], hv->latency);
ret = hn_reinit(dev, mtu);
- if (!ret)
+ if (!ret) {
+ hn_chim_init(dev);
goto out;
+ }
/* In case of error, attempt to restore original MTU */
ret = hn_reinit(dev, orig_mtu);
if (ret)
PMD_DRV_LOG(ERR, "Restoring original MTU failed for netvsc");
+ else
+ hn_chim_init(dev);
ret = hn_vf_mtu_set(dev, orig_mtu);
if (ret)
--
2.43.0
* [PATCH v3 6/6] net/netvsc: support runtime queue count reconfiguration
From: Long Li @ 2026-03-03 2:54 UTC
To: dev; +Cc: stephen, longli, weh, stable
Add support for changing the number of RX/TX queues at runtime
via port stop/configure/start. When the queue count changes,
perform a full NVS/RNDIS teardown and reinit to allocate fresh
VMBus subchannels matching the new queue count, then reconfigure
the RSS indirection table accordingly.
Key changes:
- hn_dev_configure: detect queue count changes and perform full
NVS session reinit with subchannel teardown/recreation
- hn_dev_stop: drain pending TX completions (up to 1s) to prevent
stale completions from corrupting queue state after reconfig
- hn_vf_tx/rx_queue_release: use write lock when nulling VF queue
pointers to prevent use-after-free with concurrent fast-path
readers
Signed-off-by: Long Li <longli@microsoft.com>
---
v3:
- Fix subchan_cleanup to call hn_rndis_conf_offload so device retains
offload config on partial subchannel failure
- Fix recovery path to set hv->primary->chan = hv->channels[0] after
re-opening channel, preventing use-after-free of stale pointer
- Add hn_chim_init in recovery path to rebuild chimney bitmap
v2:
- Fix reinit_failed recovery: re-map device before chan_open when device
is unmapped to prevent undefined behavior on unmapped ring buffers
- Move hn_rndis_conf_offload() to after reinit block so offload config
targets the final RNDIS session instead of being lost on teardown
- Use write lock in hn_vf_tx/rx_queue_release() to prevent race with
concurrent fast-path readers holding read lock
- Reset RSS indirection table to queue 0 in subchan_cleanup error path
- Fix multi-line comment style to follow DPDK convention
---
drivers/net/netvsc/hn_ethdev.c | 194 +++++++++++++++++++++++++++++++--
drivers/net/netvsc/hn_vf.c | 16 ++-
2 files changed, 194 insertions(+), 16 deletions(-)
diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index 45d69272aa..78ad566309 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -745,6 +745,9 @@ netvsc_hotadd_callback(const char *device_name, enum rte_dev_event_type type,
}
}
+static void hn_detach(struct hn_data *hv);
+static int hn_attach(struct hn_data *hv, unsigned int mtu);
+
static int hn_dev_configure(struct rte_eth_dev *dev)
{
struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
@@ -754,6 +757,8 @@ static int hn_dev_configure(struct rte_eth_dev *dev)
struct hn_data *hv = dev->data->dev_private;
uint64_t unsupported;
int i, err, subchan;
+ uint32_t old_subchans = 0;
+ bool device_unmapped = false;
PMD_INIT_FUNC_TRACE();
@@ -778,36 +783,111 @@ static int hn_dev_configure(struct rte_eth_dev *dev)
hv->vlan_strip = !!(rxmode->offloads & RTE_ETH_RX_OFFLOAD_VLAN_STRIP);
- err = hn_rndis_conf_offload(hv, txmode->offloads,
- rxmode->offloads);
- if (err) {
- PMD_DRV_LOG(NOTICE,
- "offload configure failed");
- return err;
- }
+ /* If queue count unchanged, skip subchannel teardown/reinit */
+ if (RTE_MAX(dev->data->nb_rx_queues,
+ dev->data->nb_tx_queues) == hv->num_queues)
+ goto skip_reinit;
hv->num_queues = RTE_MAX(dev->data->nb_rx_queues,
dev->data->nb_tx_queues);
+ /* Close all existing subchannels */
+ for (i = 1; i < HN_MAX_CHANNELS; i++) {
+ if (hv->channels[i] != NULL) {
+ rte_vmbus_chan_close(hv->channels[i]);
+ hv->channels[i] = NULL;
+ old_subchans++;
+ }
+ }
+
+ /*
+ * If subchannels existed, do a full NVS/RNDIS teardown
+ * and vmbus re-init to ensure a clean NVS session.
+ * Cannot re-send NVS subchannel request on the same
+ * session without invalidating the data path.
+ */
+ if (old_subchans > 0) {
+ PMD_DRV_LOG(NOTICE,
+ "reinit NVS (had %u subchannels)",
+ old_subchans);
+
+ hn_chim_uninit(dev);
+ rte_free(hv->primary->rxbuf_info);
+ hv->primary->rxbuf_info = NULL;
+ hn_detach(hv);
+
+ rte_vmbus_chan_close(hv->channels[0]);
+ rte_free(hv->channels[0]);
+ hv->channels[0] = NULL;
+
+ rte_vmbus_unmap_device(hv->vmbus);
+ device_unmapped = true;
+ err = rte_vmbus_map_device(hv->vmbus);
+ if (err) {
+ PMD_DRV_LOG(ERR,
+ "Could not re-map vmbus device!");
+ goto reinit_failed;
+ }
+ device_unmapped = false;
+
+ hv->rxbuf_res = hv->vmbus->resource[HV_RECV_BUF_MAP];
+ hv->chim_res = hv->vmbus->resource[HV_SEND_BUF_MAP];
+
+ err = rte_vmbus_chan_open(hv->vmbus, &hv->channels[0]);
+ if (err) {
+ PMD_DRV_LOG(ERR,
+ "Could not re-open vmbus channel!");
+ goto reinit_failed;
+ }
+
+ hv->primary->chan = hv->channels[0];
+
+ rte_vmbus_set_latency(hv->vmbus, hv->channels[0],
+ hv->latency);
+
+ err = hn_attach(hv, dev->data->mtu);
+ if (err) {
+ rte_vmbus_chan_close(hv->channels[0]);
+ rte_free(hv->channels[0]);
+ hv->channels[0] = NULL;
+ PMD_DRV_LOG(ERR,
+ "NVS reinit failed: %d", err);
+ goto reinit_failed;
+ }
+
+ err = hn_chim_init(dev);
+ if (err) {
+ hn_detach(hv);
+ rte_vmbus_chan_close(hv->channels[0]);
+ rte_free(hv->channels[0]);
+ hv->channels[0] = NULL;
+ PMD_DRV_LOG(ERR,
+ "chim reinit failed: %d", err);
+ goto reinit_failed;
+ }
+ }
+
for (i = 0; i < NDIS_HASH_INDCNT; i++)
hv->rss_ind[i] = i % dev->data->nb_rx_queues;
hn_rss_hash_init(hv, rss_conf);
subchan = hv->num_queues - 1;
+
+ /* Allocate fresh subchannels and configure RSS */
if (subchan > 0) {
err = hn_subchan_configure(hv, subchan);
if (err) {
PMD_DRV_LOG(NOTICE,
"subchannel configuration failed");
- return err;
+ goto subchan_cleanup;
}
err = hn_rndis_conf_rss(hv, NDIS_RSS_FLAG_DISABLE);
if (err) {
PMD_DRV_LOG(NOTICE,
"rss disable failed");
- return err;
+ goto subchan_cleanup;
}
if (rss_conf->rss_hf != 0) {
@@ -815,12 +895,82 @@ static int hn_dev_configure(struct rte_eth_dev *dev)
if (err) {
PMD_DRV_LOG(NOTICE,
"initial RSS config failed");
- return err;
+ goto subchan_cleanup;
}
}
}
+skip_reinit:
+ /* Apply offload config after reinit so it targets the final RNDIS session */
+ err = hn_rndis_conf_offload(hv, txmode->offloads,
+ rxmode->offloads);
+ if (err) {
+ PMD_DRV_LOG(NOTICE,
+ "offload configure failed");
+ return err;
+ }
+
return hn_vf_configure_locked(dev, dev_conf);
+
+subchan_cleanup:
+ for (i = 1; i < HN_MAX_CHANNELS; i++) {
+ if (hv->channels[i] != NULL) {
+ rte_vmbus_chan_close(hv->channels[i]);
+ hv->channels[i] = NULL;
+ }
+ }
+ hv->num_queues = 1;
+ for (i = 0; i < NDIS_HASH_INDCNT; i++)
+ hv->rss_ind[i] = 0;
+
+ /* Apply offload config so device is usable on primary queue */
+ hn_rndis_conf_offload(hv, txmode->offloads, rxmode->offloads);
+ return err;
+
+reinit_failed:
+ /*
+ * Device is in a broken state after failed reinit.
+ * Try to re-establish minimal connectivity.
+ */
+ PMD_DRV_LOG(ERR,
+ "reinit failed (err %d), attempting recovery", err);
+ if (hv->channels[0] == NULL) {
+ if (device_unmapped) {
+ if (rte_vmbus_map_device(hv->vmbus)) {
+ hv->num_queues = 0;
+ PMD_DRV_LOG(ERR,
+ "recovery failed, could not re-map device");
+ return err;
+ }
+ hv->rxbuf_res = hv->vmbus->resource[HV_RECV_BUF_MAP];
+ hv->chim_res = hv->vmbus->resource[HV_SEND_BUF_MAP];
+ }
+ if (rte_vmbus_chan_open(hv->vmbus, &hv->channels[0]) == 0) {
+ if (hn_attach(hv, dev->data->mtu) == 0) {
+ hv->primary->chan = hv->channels[0];
+ if (hn_chim_init(dev) != 0)
+ PMD_DRV_LOG(WARNING,
+ "chim reinit failed during recovery");
+ hv->num_queues = 1;
+ PMD_DRV_LOG(NOTICE,
+ "recovery successful on primary channel");
+ } else {
+ rte_vmbus_chan_close(hv->channels[0]);
+ rte_free(hv->channels[0]);
+ hv->channels[0] = NULL;
+ hv->num_queues = 0;
+ PMD_DRV_LOG(ERR,
+ "recovery failed, device unusable");
+ }
+ } else {
+ hv->num_queues = 0;
+ PMD_DRV_LOG(ERR,
+ "recovery failed, device unusable");
+ }
+ } else {
+ hv->num_queues = 1;
+ }
+ return err;
}
static int hn_dev_stats_get(struct rte_eth_dev *dev,
@@ -1073,6 +1223,7 @@ hn_dev_stop(struct rte_eth_dev *dev)
{
struct hn_data *hv = dev->data->dev_private;
int i, ret;
+ unsigned int retry;
PMD_INIT_FUNC_TRACE();
dev->data->dev_started = 0;
@@ -1081,6 +1232,29 @@ hn_dev_stop(struct rte_eth_dev *dev)
hn_rndis_set_rxfilter(hv, 0);
ret = hn_vf_stop(dev);
+ /*
+ * Drain pending TX completions to prevent stale completions
+ * from corrupting queue state after port reconfiguration.
+ */
+ for (retry = 0; retry < 100; retry++) {
+ uint32_t pending = 0;
+
+ for (i = 0; i < hv->num_queues; i++) {
+ struct hn_tx_queue *txq = dev->data->tx_queues[i];
+
+ if (txq == NULL)
+ continue;
+ hn_process_events(hv, i, 0);
+ pending += rte_mempool_in_use_count(txq->txdesc_pool);
+ }
+ if (pending == 0)
+ break;
+ rte_delay_ms(10);
+ }
+ if (retry >= 100)
+ PMD_DRV_LOG(WARNING,
+ "Failed to drain all TX completions");
+
for (i = 0; i < hv->num_queues; i++) {
dev->data->tx_queue_state[i] = RTE_ETH_QUEUE_STATE_STOPPED;
dev->data->rx_queue_state[i] = RTE_ETH_QUEUE_STATE_STOPPED;
diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
index 0ecfaf54ea..e77232bfb3 100644
--- a/drivers/net/netvsc/hn_vf.c
+++ b/drivers/net/netvsc/hn_vf.c
@@ -637,12 +637,14 @@ void hn_vf_tx_queue_release(struct hn_data *hv, uint16_t queue_id)
{
struct rte_eth_dev *vf_dev;
- rte_rwlock_read_lock(&hv->vf_lock);
+ rte_rwlock_write_lock(&hv->vf_lock);
vf_dev = hn_get_vf_dev(hv);
- if (vf_dev && vf_dev->dev_ops->tx_queue_release)
+ if (vf_dev && vf_dev->dev_ops->tx_queue_release) {
(*vf_dev->dev_ops->tx_queue_release)(vf_dev, queue_id);
+ vf_dev->data->tx_queues[queue_id] = NULL;
+ }
- rte_rwlock_read_unlock(&hv->vf_lock);
+ rte_rwlock_write_unlock(&hv->vf_lock);
}
int hn_vf_rx_queue_setup(struct rte_eth_dev *dev,
@@ -669,11 +671,13 @@ void hn_vf_rx_queue_release(struct hn_data *hv, uint16_t queue_id)
{
struct rte_eth_dev *vf_dev;
- rte_rwlock_read_lock(&hv->vf_lock);
+ rte_rwlock_write_lock(&hv->vf_lock);
vf_dev = hn_get_vf_dev(hv);
- if (vf_dev && vf_dev->dev_ops->rx_queue_release)
+ if (vf_dev && vf_dev->dev_ops->rx_queue_release) {
(*vf_dev->dev_ops->rx_queue_release)(vf_dev, queue_id);
- rte_rwlock_read_unlock(&hv->vf_lock);
+ vf_dev->data->rx_queues[queue_id] = NULL;
+ }
+ rte_rwlock_write_unlock(&hv->vf_lock);
}
int hn_vf_stats_get(struct rte_eth_dev *dev,
--
2.43.0
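The bounded drain added to hn_dev_stop() (100 retries with a 10 ms delay, about 1 s in total) follows a common poll-with-budget shape. The sketch below isolates that shape; `demo_poll()` and `demo_drain()` are hypothetical, the pending counter stands in for the in-use count of the TX descriptor pool, and the delay is omitted so the example has no dependencies.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one pass over the completion path:
 * retires up to per_poll outstanding items.
 */
static void demo_poll(unsigned int *pending, unsigned int per_poll)
{
	unsigned int done = (*pending < per_poll) ? *pending : per_poll;

	*pending -= done;
}

/*
 * Poll until nothing is pending or the retry budget is used up.
 * Returns the loop counter at exit; a value equal to max_retries
 * means the budget ran out with work still pending.
 */
static unsigned int demo_drain(unsigned int *pending, unsigned int per_poll,
			       unsigned int max_retries)
{
	unsigned int retry;

	assert(pending != NULL);
	for (retry = 0; retry < max_retries; retry++) {
		demo_poll(pending, per_poll);
		if (*pending == 0)
			break;
		/* The patch calls rte_delay_ms(10) here; omitted in this sketch. */
	}
	return retry;
}
```

Checking the counter against the budget after the loop, as the patch does with `retry >= 100`, distinguishes a clean drain from a timeout without needing a separate flag.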
* RE: [PATCH v3 0/6] net/netvsc: bug fixes and runtime queue reconfiguration
From: Long Li @ 2026-03-03 3:56 UTC
To: Long Li, dev@dpdk.org; +Cc: stephen@networkplumber.org, Wei Hu, stable@dpdk.org
> -----Original Message-----
> From: Long Li <longli@microsoft.com>
> Sent: Monday, March 2, 2026 6:54 PM
> To: dev@dpdk.org
> Cc: stephen@networkplumber.org; Long Li <longli@microsoft.com>; Wei Hu
> <weh@microsoft.com>; stable@dpdk.org
> Subject: [PATCH v3 0/6] net/netvsc: bug fixes and runtime queue reconfiguration
>
> This series fixes several resource management bugs in the netvsc PMD and adds
> support for runtime queue count reconfiguration.
Please discard this series. Didn't pass tests.
I will send v4 after clearing all tests.
Long