* FAILED: patch "[PATCH] RDMA/mana_ib: Disable RX steering on RSS QP destroy" failed to apply to 6.6-stable tree
From: gregkh @ 2026-05-01 11:03 UTC
To: longli, leon; +Cc: stable
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x dbeb256e8dd87233d891b170c0b32a6466467036
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2026050132-kinetic-idealize-dece@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbeb256e8dd87233d891b170c0b32a6466467036 Mon Sep 17 00:00:00 2001
From: Long Li <longli@microsoft.com>
Date: Wed, 25 Mar 2026 12:40:57 -0700
Subject: [PATCH] RDMA/mana_ib: Disable RX steering on RSS QP destroy
When an RSS QP is destroyed (e.g. DPDK exit), mana_ib_destroy_qp_rss()
destroys the RX WQ objects but does not disable vPort RX steering in
firmware. This leaves stale steering configuration that still points to
the destroyed RX objects.
If traffic continues to arrive (e.g. peer VM is still transmitting) and
the VF interface is subsequently brought up (mana_open), the firmware
may deliver completions using stale CQ IDs from the old RX objects.
These CQ IDs can be reused by the ethernet driver for new TX CQs,
causing RX completions to land on TX CQs:
WARNING: mana_poll_tx_cq+0x1b8/0x220 [mana] (is_sq == false)
WARNING: mana_gd_process_eq_events+0x209/0x290 (cq_table lookup fails)
Fix this by disabling vPort RX steering before destroying RX WQ objects.
Note that mana_fence_rqs() cannot be used here because the fence
completion is delivered on the CQ, which is polled by user-mode (e.g.
DPDK) and not visible to the kernel driver.
Refactor the disable logic into a shared mana_disable_vport_rx() in
mana_en, exported for use by mana_ib, replacing the duplicate code.
The ethernet driver's mana_dealloc_queues() is also updated to call
this common function.
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/20260325194100.1929056-1-longli@microsoft.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index f3bb1edc7f79..e6fc3cc10795 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -799,6 +799,21 @@ static int mana_ib_destroy_qp_rss(struct mana_ib_qp *qp,
ndev = mana_ib_get_netdev(qp->ibqp.device, qp->port);
mpc = netdev_priv(ndev);
+ /* Disable vPort RX steering before destroying RX WQ objects.
+ * Otherwise firmware still routes traffic to the destroyed queues,
+ * which can cause bogus completions on reused CQ IDs when the
+ * ethernet driver later creates new queues on mana_open().
+ *
+ * Unlike the ethernet teardown path, mana_fence_rqs() cannot be
+ * used here because the fence completion CQE is delivered on the
+ * CQ which is polled by userspace (e.g. DPDK), so there is no way
+ * for the kernel to wait for fence completion.
+ *
+ * This is best effort — if it fails there is not much we can do,
+ * and mana_cfg_vport_steering() already logs the error.
+ */
+ mana_disable_vport_rx(mpc);
+
for (i = 0; i < (1 << ind_tbl->log_ind_tbl_size); i++) {
ibwq = ind_tbl->ind_tbl[i];
wq = container_of(ibwq, struct mana_ib_wq, ibwq);
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index dca62fb9a3a9..af2a35c09773 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2882,6 +2882,13 @@ static void mana_rss_table_init(struct mana_port_context *apc)
ethtool_rxfh_indir_default(i, apc->num_queues);
}
+int mana_disable_vport_rx(struct mana_port_context *apc)
+{
+ return mana_cfg_vport_steering(apc, TRI_STATE_FALSE, false, false,
+ false);
+}
+EXPORT_SYMBOL_NS(mana_disable_vport_rx, "NET_MANA");
+
int mana_config_rss(struct mana_port_context *apc, enum TRI_STATE rx,
bool update_hash, bool update_tab)
{
@@ -3266,10 +3273,12 @@ static int mana_dealloc_queues(struct net_device *ndev)
*/
apc->rss_state = TRI_STATE_FALSE;
- err = mana_config_rss(apc, TRI_STATE_FALSE, false, false);
+ err = mana_disable_vport_rx(apc);
if (err && mana_en_need_log(apc, err))
netdev_err(ndev, "Failed to disable vPort: %d\n", err);
+ mana_fence_rqs(apc);
+
/* Even in err case, still need to cleanup the vPort */
mana_destroy_vport(apc);
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index a078af283bdd..743bfa8ad8e3 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -568,6 +568,7 @@ struct mana_port_context {
netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev);
int mana_config_rss(struct mana_port_context *ac, enum TRI_STATE rx,
bool update_hash, bool update_tab);
+int mana_disable_vport_rx(struct mana_port_context *apc);
int mana_alloc_queues(struct net_device *ndev);
int mana_attach(struct net_device *ndev);
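The ordering problem described in the commit message can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: every name here (`toy_vport`, `toy_destroy_rss_qp`, `toy_open_alloc_tx_cq`, and so on) is hypothetical and none of it is driver code. It assumes a single steering entry, a lowest-free-ID CQ allocator, and it stops before steering would be reprogrammed for the new queues.

```c
#include <stdbool.h>

/* Toy model only; all names are hypothetical, not MANA driver symbols. */
enum cq_owner { CQ_FREE, CQ_RX, CQ_TX };

#define NUM_CQ 4

struct toy_vport {
	bool rx_steering_enabled;   /* firmware-side steering state */
	int steer_cq_id;            /* CQ ID the steering entry points at */
	enum cq_owner cq[NUM_CQ];   /* current owner of each CQ ID */
};

/* An RX packet arrives. Returns true when its completion would land
 * on a CQ that is no longer an RX CQ (the misrouting in the report). */
static bool toy_rx_arrives_misrouted(const struct toy_vport *vp)
{
	if (!vp->rx_steering_enabled)
		return false;       /* steering is off, nothing is delivered */
	return vp->cq[vp->steer_cq_id] != CQ_RX;
}

/* Destroy the RSS QP; disable_first selects the ordering from the fix. */
static void toy_destroy_rss_qp(struct toy_vport *vp, bool disable_first)
{
	if (disable_first)
		vp->rx_steering_enabled = false;
	vp->cq[vp->steer_cq_id] = CQ_FREE;  /* RX object gone, ID released */
}

/* The ethernet side comes up and takes the lowest free CQ ID for TX. */
static int toy_open_alloc_tx_cq(struct toy_vport *vp)
{
	for (int i = 0; i < NUM_CQ; i++) {
		if (vp->cq[i] == CQ_FREE) {
			vp->cq[i] = CQ_TX;
			return i;
		}
	}
	return -1;
}

/* Run the sequence once: destroy the QP, reopen, then receive a packet. */
bool toy_scenario(bool disable_first)
{
	struct toy_vport vp = {
		.rx_steering_enabled = true,
		.steer_cq_id = 0,
		.cq = { CQ_RX, CQ_FREE, CQ_FREE, CQ_FREE },
	};

	toy_destroy_rss_qp(&vp, disable_first);
	toy_open_alloc_tx_cq(&vp);
	return toy_rx_arrives_misrouted(&vp);
}
```

In this model `toy_scenario(false)` reports a misrouted completion because the stale steering entry still points at CQ ID 0 after it has been reused for TX, while `toy_scenario(true)` does not, which mirrors the reasoning for calling mana_disable_vport_rx() before the RX WQ objects are destroyed.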
* [PATCH 6.6.y] RDMA/mana_ib: Disable RX steering on RSS QP destroy
From: Sasha Levin @ 2026-05-02 2:59 UTC
To: stable; +Cc: Long Li, Leon Romanovsky, Sasha Levin
From: Long Li <longli@microsoft.com>
[ Upstream commit dbeb256e8dd87233d891b170c0b32a6466467036 ]
When an RSS QP is destroyed (e.g. DPDK exit), mana_ib_destroy_qp_rss()
destroys the RX WQ objects but does not disable vPort RX steering in
firmware. This leaves stale steering configuration that still points to
the destroyed RX objects.
If traffic continues to arrive (e.g. peer VM is still transmitting) and
the VF interface is subsequently brought up (mana_open), the firmware
may deliver completions using stale CQ IDs from the old RX objects.
These CQ IDs can be reused by the ethernet driver for new TX CQs,
causing RX completions to land on TX CQs:
WARNING: mana_poll_tx_cq+0x1b8/0x220 [mana] (is_sq == false)
WARNING: mana_gd_process_eq_events+0x209/0x290 (cq_table lookup fails)
Fix this by disabling vPort RX steering before destroying RX WQ objects.
Note that mana_fence_rqs() cannot be used here because the fence
completion is delivered on the CQ, which is polled by user-mode (e.g.
DPDK) and not visible to the kernel driver.
Refactor the disable logic into a shared mana_disable_vport_rx() in
mana_en, exported for use by mana_ib, replacing the duplicate code.
The ethernet driver's mana_dealloc_queues() is also updated to call
this common function.
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/20260325194100.1929056-1-longli@microsoft.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
[ kept early-return error handling and used unquoted NET_MANA namespace in EXPORT_SYMBOL_NS ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/infiniband/hw/mana/qp.c | 15 +++++++++++++++
drivers/net/ethernet/microsoft/mana/mana_en.c | 11 ++++++++++-
include/net/mana/mana.h | 1 +
3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 4b3b5b274e849..8009a339bf9ca 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -449,6 +449,21 @@ static int mana_ib_destroy_qp_rss(struct mana_ib_qp *qp,
ndev = mc->ports[qp->port - 1];
mpc = netdev_priv(ndev);
+ /* Disable vPort RX steering before destroying RX WQ objects.
+ * Otherwise firmware still routes traffic to the destroyed queues,
+ * which can cause bogus completions on reused CQ IDs when the
+ * ethernet driver later creates new queues on mana_open().
+ *
+ * Unlike the ethernet teardown path, mana_fence_rqs() cannot be
+ * used here because the fence completion CQE is delivered on the
+ * CQ which is polled by userspace (e.g. DPDK), so there is no way
+ * for the kernel to wait for fence completion.
+ *
+ * This is best effort — if it fails there is not much we can do,
+ * and mana_cfg_vport_steering() already logs the error.
+ */
+ mana_disable_vport_rx(mpc);
+
for (i = 0; i < (1 << ind_tbl->log_ind_tbl_size); i++) {
ibwq = ind_tbl->ind_tbl[i];
wq = container_of(ibwq, struct mana_ib_wq, ibwq);
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index b56a337b1e212..343f6e879af39 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2380,6 +2380,13 @@ static void mana_rss_table_init(struct mana_port_context *apc)
ethtool_rxfh_indir_default(i, apc->num_queues);
}
+int mana_disable_vport_rx(struct mana_port_context *apc)
+{
+ return mana_cfg_vport_steering(apc, TRI_STATE_FALSE, false, false,
+ false);
+}
+EXPORT_SYMBOL_NS(mana_disable_vport_rx, NET_MANA);
+
int mana_config_rss(struct mana_port_context *apc, enum TRI_STATE rx,
bool update_hash, bool update_tab)
{
@@ -2620,12 +2627,14 @@ static int mana_dealloc_queues(struct net_device *ndev)
*/
apc->rss_state = TRI_STATE_FALSE;
- err = mana_config_rss(apc, TRI_STATE_FALSE, false, false);
+ err = mana_disable_vport_rx(apc);
if (err) {
netdev_err(ndev, "Failed to disable vPort: %d\n", err);
return err;
}
+ mana_fence_rqs(apc);
+
mana_destroy_vport(apc);
return 0;
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index 7892b79854f62..d716771a7262c 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -437,6 +437,7 @@ struct mana_port_context {
netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev);
int mana_config_rss(struct mana_port_context *ac, enum TRI_STATE rx,
bool update_hash, bool update_tab);
+int mana_disable_vport_rx(struct mana_port_context *apc);
int mana_alloc_queues(struct net_device *ndev);
int mana_attach(struct net_device *ndev);
--
2.53.0
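The backport note about keeping the early-return error handling changes what happens when disabling the vPort fails. The following user-space sketch contrasts the two control-flow shapes visible in the mana_dealloc_queues() hunks. It is an illustration only: `toy_state`, `toy_dealloc_backport` and `toy_dealloc_upstream` are hypothetical names, and the upstream variant returns void here because the upstream hunk does not show what the function returns.

```c
#include <stdbool.h>

/* Toy bookkeeping for which teardown steps ran; hypothetical names. */
struct toy_state {
	bool rqs_fenced;
	bool vport_destroyed;
};

/* Shape of the 6.6.y hunk: return as soon as the disable call fails. */
int toy_dealloc_backport(int disable_err, struct toy_state *st)
{
	if (disable_err)
		return disable_err;   /* fence and vPort destroy are skipped */

	st->rqs_fenced = true;
	st->vport_destroyed = true;
	return 0;
}

/* Shape of the upstream hunk: the error is only logged, cleanup continues. */
void toy_dealloc_upstream(int disable_err, struct toy_state *st)
{
	(void)disable_err;            /* upstream logs here, then carries on */
	st->rqs_fenced = true;
	st->vport_destroyed = true;
}
```

With a nonzero `disable_err`, the backport shape leaves both flags unset, whereas the upstream shape still fences the RQs and destroys the vPort. Whether that difference matters on 6.6.y depends on surrounding code that is not part of this patch.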