linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 6.18-6.12] net: mana: Drop TX skb on post_work_request failure and unmap resources
       [not found] <20251209001610.611575-1-sashal@kernel.org>
@ 2025-12-09  0:15 ` Sasha Levin
  0 siblings, 0 replies; only message in thread
From: Sasha Levin @ 2025-12-09  0:15 UTC (permalink / raw)
  To: patches, stable
  Cc: Aditya Garg, Haiyang Zhang, Jakub Kicinski, Sasha Levin, kys,
	wei.liu, decui, longli, kotaranov, leon, shradhagupta, mlevitsk,
	ernis, yury.norov, ssengar, dipayanroy, shirazsaleem,
	linux-hyperv, linux-rdma

From: Aditya Garg <gargaditya@linux.microsoft.com>

[ Upstream commit 45120304e84171fd215c1b57b15b285446d15106 ]

Drop TX packets when posting the work request fails and ensure DMA
mappings are always cleaned up.

Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763464269-10431-3-git-send-email-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Looking at this commit, I need to analyze it for stable backporting
suitability.

## Commit Analysis

### What the commit does:

1. **In `mana_en.c`**: When `mana_gd_post_work_request()` fails during
   TX:
   - Previously: returned `NETDEV_TX_BUSY` without cleaning up DMA
     mappings
   - Now: calls `mana_unmap_skb()` to clean up DMA mappings, then drops
     the packet properly

2. **In `gdma_main.c`**: Removes a noisy `dev_err` message for flow
   control failure

3. **In `mana.h`**: Exports `mana_unmap_skb()` function for use across
   files

### Bug Being Fixed

Looking at the original error path:
```c
if (err) {
    (void)skb_dequeue_tail(&txq->pending_skbs);
    netdev_warn(ndev, "Failed to post TX OOB: %d\n", err);
    err = NETDEV_TX_BUSY;
    goto tx_busy;  // DMA mappings NOT unmapped!
}
```

The original code has **two bugs**:
1. **DMA resource leak**: When `mana_gd_post_work_request()` fails, the
   DMA mappings set up earlier (via `mana_map_skb()`) are never freed
2. **Improper error handling**: Returns `NETDEV_TX_BUSY` which tells the
   network stack to retry, but the skb was already dequeued from
   `pending_skbs`, leading to inconsistent state

### Code Change Assessment

The fix is straightforward and surgical:
```c
if (err) {
    (void)skb_dequeue_tail(&txq->pending_skbs);
    mana_unmap_skb(skb, apc);  // NEW: Clean up DMA mappings
    netdev_warn(ndev, "Failed to post TX OOB: %d\n", err);
    goto free_sgl_ptr;  // Drop packet properly
}
```

### Stable Criteria Evaluation

| Criterion | Assessment |
|-----------|------------|
| Fixes real bug | ✅ Yes - DMA resource leak and improper error handling
|
| Obviously correct | ✅ Yes - unmapping DMA on error is clearly correct
|
| Small and contained | ✅ Yes - only affects error path in MANA driver |
| No new features | ✅ Yes - pure bug fix |
| User impact | ✅ Medium-high - affects Azure VM users; leaks can cause
long-term instability |

### Risk Assessment

- **Low risk**: Changes only affect the TX error path
- **Self-contained**: The `mana_unmap_skb()` function already exists,
  just made non-static
- **No dependencies**: This commit is standalone
- **Well-tested**: Has "Reviewed-by" tag from Microsoft maintainer

### Missing Signals

- No `Cc: stable@vger.kernel.org` tag
- No `Fixes:` tag

However, this is clearly fixing a resource leak bug that has existed
since the error path was written. Resource leaks in network drivers are
exactly the kind of bugs stable trees should fix.

### User Impact

The MANA driver is used in Microsoft Azure VMs. DMA mapping leaks:
- Cause memory pressure over time
- Can lead to allocation failures under sustained TX errors
- May cause VM instability in production cloud environments

This is a production-relevant bug affecting real cloud users.

### Conclusion

This commit fixes a genuine resource leak bug in the MANA network
driver. The fix is:
- Small (adds one function call + declaration)
- Surgical (only affects error path)
- Obviously correct (unmapping DMA on error is mandatory)
- Low risk (doesn't touch normal TX path)
- Important for production Azure VM users

The lack of stable/Fixes tags is unfortunate but doesn't diminish the
clear bug-fix nature of this change.

**YES**

 drivers/net/ethernet/microsoft/mana/gdma_main.c | 6 +-----
 drivers/net/ethernet/microsoft/mana/mana_en.c   | 7 +++----
 include/net/mana/mana.h                         | 1 +
 3 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 43f034e180c41..7d232ac250958 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1299,7 +1299,6 @@ int mana_gd_post_work_request(struct gdma_queue *wq,
 			      struct gdma_posted_wqe_info *wqe_info)
 {
 	u32 client_oob_size = wqe_req->inline_oob_size;
-	struct gdma_context *gc;
 	u32 sgl_data_size;
 	u32 max_wqe_size;
 	u32 wqe_size;
@@ -1329,11 +1328,8 @@ int mana_gd_post_work_request(struct gdma_queue *wq,
 	if (wqe_size > max_wqe_size)
 		return -EINVAL;
 
-	if (wq->monitor_avl_buf && wqe_size > mana_gd_wq_avail_space(wq)) {
-		gc = wq->gdma_dev->gdma_context;
-		dev_err(gc->dev, "unsuccessful flow control!\n");
+	if (wq->monitor_avl_buf && wqe_size > mana_gd_wq_avail_space(wq))
 		return -ENOSPC;
-	}
 
 	if (wqe_info)
 		wqe_info->wqe_size_in_bu = wqe_size / GDMA_WQE_BU_SIZE;
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 0142fd98392c2..6d37f39930453 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -438,9 +438,9 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 
 	if (err) {
 		(void)skb_dequeue_tail(&txq->pending_skbs);
+		mana_unmap_skb(skb, apc);
 		netdev_warn(ndev, "Failed to post TX OOB: %d\n", err);
-		err = NETDEV_TX_BUSY;
-		goto tx_busy;
+		goto free_sgl_ptr;
 	}
 
 	err = NETDEV_TX_OK;
@@ -460,7 +460,6 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	tx_stats->bytes += len + ((num_gso_seg - 1) * gso_hs);
 	u64_stats_update_end(&tx_stats->syncp);
 
-tx_busy:
 	if (netif_tx_queue_stopped(net_txq) && mana_can_tx(gdma_sq)) {
 		netif_tx_wake_queue(net_txq);
 		apc->eth_stats.wake_queue++;
@@ -1606,7 +1605,7 @@ static int mana_move_wq_tail(struct gdma_queue *wq, u32 num_units)
 	return 0;
 }
 
-static void mana_unmap_skb(struct sk_buff *skb, struct mana_port_context *apc)
+void mana_unmap_skb(struct sk_buff *skb, struct mana_port_context *apc)
 {
 	struct mana_skb_head *ash = (struct mana_skb_head *)skb->head;
 	struct gdma_context *gc = apc->ac->gdma_dev->gdma_context;
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index 0921485565c05..330e1bb088bb9 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -580,6 +580,7 @@ int mana_set_bw_clamp(struct mana_port_context *apc, u32 speed,
 void mana_query_phy_stats(struct mana_port_context *apc);
 int mana_pre_alloc_rxbufs(struct mana_port_context *apc, int mtu, int num_queues);
 void mana_pre_dealloc_rxbufs(struct mana_port_context *apc);
+void mana_unmap_skb(struct sk_buff *skb, struct mana_port_context *apc);
 
 extern const struct ethtool_ops mana_ethtool_ops;
 extern struct dentry *mana_debugfs_root;
-- 
2.51.0


^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2025-12-09  0:17 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20251209001610.611575-1-sashal@kernel.org>
2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.12] net: mana: Drop TX skb on post_work_request failure and unmap resources Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).