From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Chaitanya Kulkarni <kch@nvidia.com>,
Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
Sasha Levin <sashal@kernel.org>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19] block: fix partial IOVA mapping cleanup in blk_rq_dma_map_iova
Date: Wed, 18 Feb 2026 21:03:50 -0500 [thread overview]
Message-ID: <20260219020422.1539798-14-sashal@kernel.org> (raw)
In-Reply-To: <20260219020422.1539798-1-sashal@kernel.org>
From: Chaitanya Kulkarni <kch@nvidia.com>
[ Upstream commit 81e7223b1a2d63b655ee72577c8579f968d037e3 ]
When dma_iova_link() fails partway through mapping a request's bvec
list, the function breaks out of the loop without cleaning up
already mapped segments. Similarly, if dma_iova_sync() fails after
linking all segments, no cleanup is performed.
This leaves partial IOVA mappings in place. The completion path
attempts to unmap the full expected size via dma_iova_destroy() or
nvme_unmap_data(), but only a partial size was actually mapped,
leading to incorrect unmap operations.
Add an out_unlink error path that calls dma_iova_destroy() to clean
up partial mappings before returning failure. The dma_iova_destroy()
function handles both partial unlink and IOVA space freeing. It
correctly handles the mapped_len == 0 case (first dma_iova_link()
failure) by only freeing the IOVA allocation without attempting to
unmap.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
The function is called from `blk_dma_map_iter_start()` which is the main
DMA mapping entry point for block requests using IOMMU-based (IOVA)
mapping. This is used by NVMe drivers and potentially other high-
performance storage drivers.
## 3. Summary of Analysis
### What the bug is:
In `blk_rq_dma_map_iova()`, when `dma_iova_link()` fails partway through
mapping multiple segments, or when `dma_iova_sync()` fails after all
segments are linked:
1. **dma_iova_link() failure**: The code breaks out of the loop but
doesn't clean up already-linked segments. The IOVA allocation and
partial mappings are leaked. Additionally, when the `dma_iova_link()`
fails, the code falls through to `dma_iova_sync()` which then
operates on partially mapped data — this is also incorrect behavior.
2. **dma_iova_sync() failure**: The code returns `false` with
`iter->status` set, but doesn't call `dma_iova_destroy()` to clean up
the linked IOVA mappings.
In both cases, the completion path will attempt to unmap using the full
expected size via `dma_iova_destroy()` or `nvme_unmap_data()`, but only
a partial size was actually mapped, leading to **incorrect unmap
operations** — which could corrupt IOMMU mappings, cause IOMMU faults,
or lead to data corruption.
### Why it matters:
- This is a bug in the **block I/O DMA path** — the very core of how
storage I/O works with IOMMU
- It can trigger on any system using IOMMU with NVMe storage when memory
pressure or IOMMU resource exhaustion causes `dma_iova_link()` to fail
- Consequences of incorrect IOMMU unmap: potential data corruption,
IOMMU page faults, kernel crashes
- NVMe is extremely widely deployed; any system with IOMMU enabled could
be affected
### Stable criteria assessment:
- **Obviously correct**: Yes — adds proper error cleanup with
`dma_iova_destroy()` which is designed for exactly this purpose
- **Fixes a real bug**: Yes — partial IOVA mapping cleanup is missing,
leading to incorrect unmap operations
- **Small and contained**: Yes — the diff is minimal (~15 lines changed
in a single function in a single file)
- **No new features**: Correct — purely error path fix
- **Reviewed**: Yes — reviewed by Christoph Hellwig (original author of
the code), committed by Jens Axboe (block layer maintainer)
- **Tested in mainline**: Yes — it's in mainline already
### Risk assessment:
- **Very low risk**: The change only affects error paths, adding proper
cleanup where none existed
- **Well-understood cleanup function**: `dma_iova_destroy()` is
specifically designed for this cleanup and handles both partial and
zero-length cases
- **Backport note**: The patch won't apply cleanly to 6.17.y and 6.18.y
because the `attrs` variable was added in v6.19. A minor adaptation
would be needed (remove `attrs` and pass `0` to `dma_iova_destroy()`).
For 6.19.y, it should apply cleanly.
## Verification
- **git log** showed `blk-mq-dma.c` was created by commit
`b0a4158554b90` and the buggy function `blk_rq_dma_map_iova` was
introduced by `858299dc61603` ("block: add scatterlist-less DMA
mapping helpers"), first in v6.17
- **git tag --contains** confirmed the buggy code entered mainline in
v6.17
- **git show stable/linux-6.18.y:block/blk-mq-dma.c** and **git show
stable/linux-6.17.y:block/blk-mq-dma.c** both confirmed the buggy code
is present in stable trees (with `break` instead of `goto out_unlink`,
and no cleanup on `dma_iova_sync` failure)
- **git show stable/linux-6.12.y:block/blk-mq-dma.c** confirmed the file
does NOT exist in 6.12.y (only affects 6.17+)
- **Read of drivers/iommu/dma-iommu.c** confirmed `dma_iova_destroy()`
properly handles mapped_len == 0 by calling `dma_iova_free()` only,
and mapped_len > 0 by unlinking and freeing
- **git log stable/linux-6.18.y** and **stable/linux-6.19.y** confirmed
the fix has not yet been applied to stable trees
- **git log 37f0c7a8df7ad** confirmed the `attrs` variable was
introduced in v6.19, meaning 6.17.y and 6.18.y will need a trivial
adaptation for clean backport
- The reviewer (Christoph Hellwig) is verified as the original author of
the buggy code via the Signed-off-by on `858299dc61603`
**YES**
block/blk-mq-dma.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index fb018fffffdcc..feead1934301a 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -126,17 +126,20 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
error = dma_iova_link(dma_dev, state, vec->paddr, mapped,
vec->len, dir, attrs);
if (error)
- break;
+ goto out_unlink;
mapped += vec->len;
} while (blk_map_iter_next(req, &iter->iter, vec));
error = dma_iova_sync(dma_dev, state, 0, mapped);
- if (error) {
- iter->status = errno_to_blk_status(error);
- return false;
- }
+ if (error)
+ goto out_unlink;
return true;
+
+out_unlink:
+ dma_iova_destroy(dma_dev, state, mapped, dir, attrs);
+ iter->status = errno_to_blk_status(error);
+ return false;
}
static inline void blk_rq_map_iter_init(struct request *rq,
--
2.51.0
next prev parent reply other threads:[~2026-02-19 2:04 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-19 2:03 [PATCH AUTOSEL 6.19] rust_binder: Fix build failure if !CONFIG_COMPAT Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.12] usb: chipidea: udc: fix DMA and SG cleanup in _ep_nuke() Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-5.15] staging: rtl8723bs: fix memory leak on failure path Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19] tty: vt/keyboard: Split apart vt_do_diacrit() Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-5.10] fix it87_wdt early reboot by reporting running timer Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-5.15] misc: eeprom: Fix EWEN/EWDS/ERAL commands for 93xx56 and 93xx66 Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-5.15] mmc: rtsx_pci: add quirk to disable MMC_CAP_AGGRESSIVE_PM for RTS525A Sasha Levin
2026-02-19 10:29 ` Ulf Hansson
2026-02-26 13:23 ` Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.1] fpga: of-fpga-region: Fail if any bridge is missing Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.12] soundwire: intel_auxdevice: add cs42l45 codec to wake_capable_list Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-5.10] iio: magnetometer: Remove IRQF_ONESHOT Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.1] watchdog: imx7ulp_wdt: handle the nowayout option Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-5.10] serial: 8250_dw: handle clock enable errors in runtime_resume Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.12] most: core: fix resource leak in most_register_interface error paths Sasha Levin
2026-02-19 2:03 ` Sasha Levin [this message]
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.1] misc: bcm_vk: Fix possible null-pointer dereferences in bcm_vk_read() Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.1] dmaengine: sun6i: Choose appropriate burst length under maxburst Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.1] mmc: rtsx: reset power state on suspend Sasha Levin
2026-02-19 10:27 ` Ulf Hansson
2026-02-26 13:24 ` Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19] serial: rsci: Add set_rtrg() callback Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-5.10] Revert "mfd: da9052-spi: Change read-mask to write-mask" Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.18] pinctrl: mediatek: make devm allocations safer and clearer in mtk_eint_do_init() Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.12] serial: 8250: 8250_omap.c: Add support for handling UART error conditions Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.12] usb: gadget: f_fs: Fix ioctl error handling Sasha Levin
2026-02-19 2:03 ` [PATCH AUTOSEL 6.19-6.12] phy: cadence-torrent: restore parent clock for refclk during resume Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-5.10] binder: don't use %pK through printk Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.18] iio: bmi270_i2c: Add MODULE_DEVICE_TABLE for BMI260/270 Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-5.15] iio: Use IRQF_NO_THREAD Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.12] mfd: intel-lpss: Add Intel Nova Lake-S PCI IDs Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.12] phy: ti: phy-j721e-wiz: restore mux selection during resume Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-5.10] MIPS: Loongson: Make cpumask_of_node() robust against NUMA_NO_NODE Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.12] usb: gadget: f_fs: fix DMA-BUF OUT queues Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-5.10] phy: fsl-imx8mq-usb: disable bind/unbind platform driver feature Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.18] watchdog: rzv2h_wdt: Discard pm_runtime_put() return value Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.1] soundwire: dmi-quirks: add mapping for Avell B.ON (OEM rebranded of NUC15) Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.18] pinctrl: renesas: rzt2h: Allow .get_direction() for IRQ function GPIOs Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.12] dmaengine: stm32-dma3: use module_platform_driver Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-5.15] staging: rtl8723bs: fix missing status update on sdio_alloc_irq() failure Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-5.15] phy: mvebu-cp110-utmi: fix dr_mode property read from dts Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.1] usb: typec: ucsi: psy: Fix voltage and current max for non-Fixed PDOs Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-5.10] serial: 8250: 8250_omap.c: Clear DMA RX running status only after DMA termination is done Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.1] dmaengine: stm32-mdma: initialize m2m_hw_period and ccr to fix warnings Sasha Levin
2026-02-19 2:04 ` [PATCH AUTOSEL 6.19-6.18] misc: ti_fpc202: fix a potential memory leak in probe function Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260219020422.1539798-14-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=axboe@kernel.dk \
--cc=hch@lst.de \
--cc=kch@nvidia.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox