public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Chaitanya Kulkarni <kch@nvidia.com>,
	Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	Sasha Levin <sashal@kernel.org>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19] block: fix partial IOVA mapping cleanup in blk_rq_dma_map_iova
Date: Wed, 18 Feb 2026 21:03:50 -0500	[thread overview]
Message-ID: <20260219020422.1539798-14-sashal@kernel.org> (raw)
In-Reply-To: <20260219020422.1539798-1-sashal@kernel.org>

From: Chaitanya Kulkarni <kch@nvidia.com>

[ Upstream commit 81e7223b1a2d63b655ee72577c8579f968d037e3 ]

When dma_iova_link() fails partway through mapping a request's bvec
list, the function breaks out of the loop without cleaning up
already mapped segments. Similarly, if dma_iova_sync() fails after
linking all segments, no cleanup is performed.

This leaves partial IOVA mappings in place. The completion path
attempts to unmap the full expected size via dma_iova_destroy() or
nvme_unmap_data(), but only a partial size was actually mapped,
leading to incorrect unmap operations.

Add an out_unlink error path that calls dma_iova_destroy() to clean
up partial mappings before returning failure. The dma_iova_destroy()
function handles both partial unlink and IOVA space freeing. It
correctly handles the mapped_len == 0 case (first dma_iova_link()
failure) by only freeing the IOVA allocation without attempting to
unmap.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

The function is called from `blk_dma_map_iter_start()` which is the main
DMA mapping entry point for block requests using IOMMU-based (IOVA)
mapping. This is used by NVMe drivers and potentially other high-
performance storage drivers.

## 3. Summary of Analysis

### What the bug is:
In `blk_rq_dma_map_iova()`, when `dma_iova_link()` fails partway through
mapping multiple segments, or when `dma_iova_sync()` fails after all
segments are linked:

1. **dma_iova_link() failure**: The code breaks out of the loop but
   doesn't clean up already-linked segments. The IOVA allocation and
   partial mappings are leaked. Additionally, when the `dma_iova_link()`
   fails, the code falls through to `dma_iova_sync()` which then
   operates on partially mapped data — this is also incorrect behavior.

2. **dma_iova_sync() failure**: The code returns `false` with
   `iter->status` set, but doesn't call `dma_iova_destroy()` to clean up
   the linked IOVA mappings.

In both cases, the completion path will attempt to unmap using the full
expected size via `dma_iova_destroy()` or `nvme_unmap_data()`, but only
a partial size was actually mapped, leading to **incorrect unmap
operations** — which could corrupt IOMMU mappings, cause IOMMU faults,
or lead to data corruption.

### Why it matters:
- This is a bug in the **block I/O DMA path** — the very core of how
  storage I/O works with IOMMU
- It can trigger on any system using IOMMU with NVMe storage when memory
  pressure or IOMMU resource exhaustion causes `dma_iova_link()` to fail
- Consequences of incorrect IOMMU unmap: potential data corruption,
  IOMMU page faults, kernel crashes
- NVMe is extremely widely deployed; any system with IOMMU enabled could
  be affected

### Stable criteria assessment:
- **Obviously correct**: Yes — adds proper error cleanup with
  `dma_iova_destroy()` which is designed for exactly this purpose
- **Fixes a real bug**: Yes — partial IOVA mapping cleanup is missing,
  leading to incorrect unmap operations
- **Small and contained**: Yes — the diff is minimal (~15 lines changed
  in a single function in a single file)
- **No new features**: Correct — purely error path fix
- **Reviewed**: Yes — reviewed by Christoph Hellwig (original author of
  the code), committed by Jens Axboe (block layer maintainer)
- **Tested in mainline**: Yes — it's in mainline already

### Risk assessment:
- **Very low risk**: The change only affects error paths, adding proper
  cleanup where none existed
- **Well-understood cleanup function**: `dma_iova_destroy()` is
  specifically designed for this cleanup and handles both partial and
  zero-length cases
- **Backport note**: The patch won't apply cleanly to 6.17.y and 6.18.y
  because the `attrs` variable was added in v6.19. A minor adaptation
  would be needed (remove `attrs` and pass `0` to `dma_iova_destroy()`).
  For 6.19.y, it should apply cleanly.

## Verification

- **git log** showed `blk-mq-dma.c` was created by commit
  `b0a4158554b90` and the buggy function `blk_rq_dma_map_iova` was
  introduced by `858299dc61603` ("block: add scatterlist-less DMA
  mapping helpers"), first in v6.17
- **git tag --contains** confirmed the buggy code entered mainline in
  v6.17
- **git show stable/linux-6.18.y:block/blk-mq-dma.c** and **git show
  stable/linux-6.17.y:block/blk-mq-dma.c** both confirmed the buggy code
  is present in stable trees (with `break` instead of `goto out_unlink`,
  and no cleanup on `dma_iova_sync` failure)
- **git show stable/linux-6.12.y:block/blk-mq-dma.c** confirmed the file
  does NOT exist in 6.12.y (only affects 6.17+)
- **Read of drivers/iommu/dma-iommu.c** confirmed `dma_iova_destroy()`
  properly handles mapped_len == 0 by calling `dma_iova_free()` only,
  and mapped_len > 0 by unlinking and freeing
- **git log stable/linux-6.18.y** and **stable/linux-6.19.y** confirmed
  the fix has not yet been applied to stable trees
- **git log 37f0c7a8df7ad** confirmed the `attrs` variable was
  introduced in v6.19, meaning 6.17.y and 6.18.y will need a trivial
  adaptation for clean backport
- The reviewer (Christoph Hellwig) is verified as the original author of
  the buggy code via the Signed-off-by on `858299dc61603`

**YES**

 block/blk-mq-dma.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index fb018fffffdcc..feead1934301a 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -126,17 +126,20 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
 		error = dma_iova_link(dma_dev, state, vec->paddr, mapped,
 				vec->len, dir, attrs);
 		if (error)
-			break;
+			goto out_unlink;
 		mapped += vec->len;
 	} while (blk_map_iter_next(req, &iter->iter, vec));
 
 	error = dma_iova_sync(dma_dev, state, 0, mapped);
-	if (error) {
-		iter->status = errno_to_blk_status(error);
-		return false;
-	}
+	if (error)
+		goto out_unlink;
 
 	return true;
+
+out_unlink:
+	dma_iova_destroy(dma_dev, state, mapped, dir, attrs);
+	iter->status = errno_to_blk_status(error);
+	return false;
 }
 
 static inline void blk_rq_map_iter_init(struct request *rq,
-- 
2.51.0


  parent reply	other threads:[~2026-02-19  2:04 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-19  2:03 [PATCH AUTOSEL 6.19] rust_binder: Fix build failure if !CONFIG_COMPAT Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.12] usb: chipidea: udc: fix DMA and SG cleanup in _ep_nuke() Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-5.15] staging: rtl8723bs: fix memory leak on failure path Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19] tty: vt/keyboard: Split apart vt_do_diacrit() Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-5.10] fix it87_wdt early reboot by reporting running timer Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-5.15] misc: eeprom: Fix EWEN/EWDS/ERAL commands for 93xx56 and 93xx66 Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-5.15] mmc: rtsx_pci: add quirk to disable MMC_CAP_AGGRESSIVE_PM for RTS525A Sasha Levin
2026-02-19 10:29   ` Ulf Hansson
2026-02-26 13:23     ` Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.1] fpga: of-fpga-region: Fail if any bridge is missing Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.12] soundwire: intel_auxdevice: add cs42l45 codec to wake_capable_list Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-5.10] iio: magnetometer: Remove IRQF_ONESHOT Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.1] watchdog: imx7ulp_wdt: handle the nowayout option Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-5.10] serial: 8250_dw: handle clock enable errors in runtime_resume Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.12] most: core: fix resource leak in most_register_interface error paths Sasha Levin
2026-02-19  2:03 ` Sasha Levin [this message]
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.1] misc: bcm_vk: Fix possible null-pointer dereferences in bcm_vk_read() Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.1] dmaengine: sun6i: Choose appropriate burst length under maxburst Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.1] mmc: rtsx: reset power state on suspend Sasha Levin
2026-02-19 10:27   ` Ulf Hansson
2026-02-26 13:24     ` Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19] serial: rsci: Add set_rtrg() callback Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-5.10] Revert "mfd: da9052-spi: Change read-mask to write-mask" Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.18] pinctrl: mediatek: make devm allocations safer and clearer in mtk_eint_do_init() Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.12] serial: 8250: 8250_omap.c: Add support for handling UART error conditions Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.12] usb: gadget: f_fs: Fix ioctl error handling Sasha Levin
2026-02-19  2:03 ` [PATCH AUTOSEL 6.19-6.12] phy: cadence-torrent: restore parent clock for refclk during resume Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-5.10] binder: don't use %pK through printk Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.18] iio: bmi270_i2c: Add MODULE_DEVICE_TABLE for BMI260/270 Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-5.15] iio: Use IRQF_NO_THREAD Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.12] mfd: intel-lpss: Add Intel Nova Lake-S PCI IDs Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.12] phy: ti: phy-j721e-wiz: restore mux selection during resume Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-5.10] MIPS: Loongson: Make cpumask_of_node() robust against NUMA_NO_NODE Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.12] usb: gadget: f_fs: fix DMA-BUF OUT queues Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-5.10] phy: fsl-imx8mq-usb: disable bind/unbind platform driver feature Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.18] watchdog: rzv2h_wdt: Discard pm_runtime_put() return value Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.1] soundwire: dmi-quirks: add mapping for Avell B.ON (OEM rebranded of NUC15) Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.18] pinctrl: renesas: rzt2h: Allow .get_direction() for IRQ function GPIOs Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.12] dmaengine: stm32-dma3: use module_platform_driver Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-5.15] staging: rtl8723bs: fix missing status update on sdio_alloc_irq() failure Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-5.15] phy: mvebu-cp110-utmi: fix dr_mode property read from dts Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.1] usb: typec: ucsi: psy: Fix voltage and current max for non-Fixed PDOs Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-5.10] serial: 8250: 8250_omap.c: Clear DMA RX running status only after DMA termination is done Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.1] dmaengine: stm32-mdma: initialize m2m_hw_period and ccr to fix warnings Sasha Levin
2026-02-19  2:04 ` [PATCH AUTOSEL 6.19-6.18] misc: ti_fpc202: fix a potential memory leak in probe function Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260219020422.1539798-14-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=kch@nvidia.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox