public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Keith Busch <kbusch@kernel.org>, Christoph Hellwig <hch@lst.de>,
	Kanchan Joshi <joshi.k@samsung.com>,
	Sasha Levin <sashal@kernel.org>,
	sagi@grimberg.me, linux-nvme@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19-5.10] nvme-pci: ensure we're polling a polled queue
Date: Tue, 10 Mar 2026 05:01:03 -0400	[thread overview]
Message-ID: <20260310090145.2709021-3-sashal@kernel.org> (raw)
In-Reply-To: <20260310090145.2709021-1-sashal@kernel.org>

From: Keith Busch <kbusch@kernel.org>

[ Upstream commit 166e31d7dbf6aa44829b98aa446bda5c9580f12a ]

A user can change the polled queue count at run time. There's a brief
window during a reset where a hipri task may try to poll that queue
before the block layer has updated the queue maps, which would race with
the now interrupt driven queue and may cause double completions.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed for my analysis.

## Analysis

### 1. Commit Message Analysis

The commit message clearly describes a **race condition** that can cause
**double completions**:
- A user can change the polled queue count at runtime
- During a controller reset, there's a window where a hipri task may
  poll a queue that has transitioned from polled to interrupt-driven
- Both the polling path and the interrupt handler could complete the
  same I/O, causing double completions

Double completions in the block layer are a serious bug — they can lead
to use-after-free (the request is freed after the first completion, then
accessed again during the second completion), data corruption, or kernel
crashes.

### 2. Code Change Analysis

The fix is minimal — a single additional check in `nvme_poll()`:

```c
- if (!nvme_cqe_pending(nvmeq))
+       if (!test_bit(NVMEQ_POLLED, &nvmeq->flags) ||
+           !nvme_cqe_pending(nvmeq))
                return 0;
```

Before polling the completion queue, verify the queue is actually still
a polled queue via `test_bit(NVMEQ_POLLED, ...)`. If the queue has
transitioned to interrupt-driven mode (the NVMEQ_POLLED bit was cleared
in `nvme_suspend_queue()`), skip the poll and return 0.

This is a 2-line change in a single file. The risk of regression is
extremely low.

### 3. Classification

- **Race condition fix** — prevents concurrent polling and interrupt-
  driven completion of the same queue
- **Prevents double completions** — which are a serious kernel bug
  (potential UAF, crashes)
- Authored by Keith Busch (NVMe maintainer), reviewed by Christoph
  Hellwig and Kanchan Joshi

### 4. Scope and Risk

- **1 file changed, 2 insertions, 1 deletion** — extremely small and
  contained
- The added check is a fast bit test that simply returns early if the
  queue isn't polled
- No functional changes to the normal code path — polled queues still
  poll as before
- Only affects the edge case of runtime queue configuration changes

### 5. Dependencies

The `NVMEQ_POLLED` flag has existed since kernel v5.2 (commit
7c349dde26b7, March 2019). The `test_bit(NVMEQ_POLLED, ...)` pattern is
already used elsewhere in the same file (e.g., `nvme_poll_irqdisable()`,
`nvme_suspend_queue()`). This fix is fully self-contained and applies
cleanly without any prerequisites.

The commit appears alongside two related commits (`44b798a6fd680` "cap
queue creation" and `02bfc966463a7` "do not try to add queue maps at
runtime"), but this fix is independent — it guards against an already-
existing race in `nvme_poll()` regardless of the other changes.

### 6. User Impact

NVMe is one of the most widely used storage interfaces in modern
systems. Any system using polled I/O queues (common for high-performance
storage workloads) and changing queue configuration at runtime could hit
this race. Double completions can cause kernel panics, data corruption,
or use-after-free.

### Verification

- Verified upstream commit is `166e31d7dbf6` which appeared in
  `v7.0-rc3`
- Confirmed `NVMEQ_POLLED` flag was introduced in commit `7c349dde26b7`
  (v5.2), so it exists in all current stable trees
- Confirmed the fix is a 2-line change modifying only
  `drivers/nvme/host/pci.c`
- Confirmed the code pattern (`test_bit(NVMEQ_POLLED, ...)`) is already
  used elsewhere in the same file, particularly in
  `nvme_poll_irqdisable()` at line 1631
- Verified the commit is reviewed by Christoph Hellwig and Kanchan Joshi
- Confirmed the fix is self-contained — no dependency on the two
  companion commits

### Conclusion

This is a textbook stable backport candidate: a tiny, surgical fix for a
race condition that can cause double completions (potential UAF/crash),
in a critical subsystem (NVMe storage), authored and reviewed by
subsystem maintainers, with zero risk of regression.

**YES**

 drivers/nvme/host/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 74aca8846fffe..3f4af341474e1 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1499,7 +1499,8 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	struct nvme_queue *nvmeq = hctx->driver_data;
 	bool found;
 
-	if (!nvme_cqe_pending(nvmeq))
+	if (!test_bit(NVMEQ_POLLED, &nvmeq->flags) ||
+	    !nvme_cqe_pending(nvmeq))
 		return 0;
 
 	spin_lock(&nvmeq->cq_poll_lock);
-- 
2.51.0


  parent reply	other threads:[~2026-03-10  9:01 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-10  9:01 [PATCH AUTOSEL 6.19-6.18] ALSA: hda/hdmi: Add Tegra238 HDA codec device ID Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] ASoC: amd: acp: Add ACP6.3 match entries for Cirrus Logic parts Sasha Levin
2026-03-10  9:01 ` Sasha Levin [this message]
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] ASoC: cs35l56: Only patch ASP registers if the DAI is part of a DAIlink Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.12] ALSA: hda/senary: Ensure EAPD is enabled during init Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.10] ASoC: fsl_easrc: Fix event generation in fsl_easrc_iec958_set_reg() Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.12] kbuild: install-extmod-build: Package resolve_btfids if necessary Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.12] scsi: devinfo: Add BLIST_SKIP_IO_HINTS for Iomega ZIP Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] block: break pcpu_alloc_mutex dependency on freeze_lock Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] platform/x86: oxpec: Add support for OneXPlayer X1z Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19] spi: spi-dw-dma: fix print error log when wait finish transaction Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.6] HID: asus: add xg mobile 2023 external hardware support Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] ASoC: rt1321: fix DMIC ch2/3 mask issue Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.12] drm/ttm/tests: Fix build failure on PREEMPT_RT Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.12] bpf: Fix u32/s32 bounds when ranges cross min/max boundary Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.10] HID: mcp2221: cancel last I2C command on read error Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.15] platform/x86: intel-hid: Add Dell 14 Plus 2-in-1 to dmi_vgbs_allow_list Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.10] HID: asus: avoid memory leak in asus_report_fixup() Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.12] scsi: mpi3mr: Clear reset history on ready and recheck state after timeout Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] platform/x86: oxpec: Add support for Aokzoe A2 Pro Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19] platform/x86: hp-wmi: Add Victus 16-d0xxx support Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.10] platform/x86: touchscreen_dmi: Add quirk for y-inverted Goodix touchscreen on SUPI S10 Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.1] HID: apple: avoid memory leak in apple_report_fixup() Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.10] platform/x86: intel-hid: Enable 5-button array on ThinkPad X1 Fold 16 Gen 1 Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] ASoC: Intel: sof_sdw: Add quirk for Alienware Area 51 (2025) 0CCD SKU Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] platform/x86: hp-wmi: Add Omen 16-xd0xxx fan and thermal support Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.12] HID: apple: Add EPOMAKER TH87 to the non-apple keyboards list Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] platform/x86: hp-wmi: Add Omen 16-wf0xxx fan and thermal support Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.15] nvme-pci: cap queue creation to used queues Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.10] dma-buf: Include ioctl.h in UAPI header Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] platform/x86: oxpec: Add support for OneXPlayer X1 Air Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19] platform/x86: hp-wmi: add Omen 14-fb1xxx (board 8E41) support Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.15] net: usb: r8152: add TRENDnet TUC-ET2G Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] platform/x86: oxpec: Add support for OneXPlayer APEX Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.10] ASoC: fsl_easrc: Fix event generation in fsl_easrc_iec958_put_bits() Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.15] HID: magicmouse: fix battery reporting for Apple Magic Trackpad 2 Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.18] HID: intel-ish-hid: ipc: Add Nova Lake-H/S PCI device IDs Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-6.1] nvme-fabrics: use kfree_sensitive() for DHCHAP secrets Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.15] module: Fix kernel panic when a symbol st_shndx is out of bounds Sasha Levin
2026-03-10  9:01 ` [PATCH AUTOSEL 6.19-5.15] HID: magicmouse: avoid memory leak in magicmouse_report_fixup() Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260310090145.2709021-3-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=hch@lst.de \
    --cc=joshi.k@samsung.com \
    --cc=kbusch@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=patches@lists.linux.dev \
    --cc=sagi@grimberg.me \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox