From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Keith Busch <kbusch@kernel.org>,
Kanchan Joshi <joshi.k@samsung.com>,
Christoph Hellwig <hch@lst.de>, Sasha Levin <sashal@kernel.org>,
sagi@grimberg.me, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19-5.15] nvme-pci: cap queue creation to used queues
Date: Tue, 10 Mar 2026 05:01:29 -0400
Message-ID: <20260310090145.2709021-29-sashal@kernel.org>
In-Reply-To: <20260310090145.2709021-1-sashal@kernel.org>
From: Keith Busch <kbusch@kernel.org>
[ Upstream commit 4735b510a00fb2d4ac9e8d21a8c9552cb281f585 ]
If the user reduces the special queue count at runtime and resets the
controller, we need to reduce the number of queues and interrupts
requested accordingly rather than start with the pre-allocated queue
count.
Tested-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I have a complete understanding. Let me summarize the analysis.
## Analysis
### What the commit does
This is a one-line functional change (replacing
`dev->nr_allocated_queues - 1` with `min(nvme_max_io_queues(dev),
dev->nr_allocated_queues - 1)`) in the NVMe PCI driver's
`nvme_setup_io_queues()` function.
### The bug
The `write_queues` and `poll_queues` module parameters are writable at
runtime (permission 0644). At probe time, `nr_allocated_queues` is set
to `nvme_max_io_queues(dev) + 1`, based on the CPU count plus write/poll
queue counts at that time. If the user later **reduces** these
parameters and then triggers a controller reset,
`nvme_setup_io_queues()` would still use the old, larger
`nr_allocated_queues - 1` as the starting queue count, requesting more
interrupts and queues than actually needed.
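The scenario above can be sketched as a small user-space model. This is an illustration only, not the driver code: `struct model_dev`, `model_max_io_queues()`, `model_probe()` and `model_reset()` are hypothetical names, `nr_cpus` stands in for the kernel's CPU count, and the formula in `model_max_io_queues()` is an assumption based on the description above (CPU count plus write/poll queue counts).

```c
#include <assert.h>

/*
 * Simplified user-space model of the queue accounting described above.
 * Field names mirror the driver, but everything here is hypothetical.
 */
struct model_dev {
	unsigned int nr_cpus;             /* stand-in for the CPU count */
	unsigned int nr_write_queues;     /* sampled module parameter */
	unsigned int nr_poll_queues;      /* sampled module parameter */
	unsigned int nr_allocated_queues; /* fixed once at probe time */
};

/* Assumed shape: one queue per CPU plus the special queues. */
static unsigned int model_max_io_queues(const struct model_dev *dev)
{
	return dev->nr_cpus + dev->nr_write_queues + dev->nr_poll_queues;
}

/* Probe: sample the parameters and size the allocation as max + 1. */
static void model_probe(struct model_dev *dev, unsigned int cpus,
			unsigned int write_queues, unsigned int poll_queues)
{
	assert(cpus > 0);
	dev->nr_cpus = cpus;
	dev->nr_write_queues = write_queues;
	dev->nr_poll_queues = poll_queues;
	dev->nr_allocated_queues = model_max_io_queues(dev) + 1;
}

/*
 * Reset: re-sample the (possibly reduced) parameters and return the
 * starting nr_io_queues, with or without the min() cap.
 */
static unsigned int model_reset(struct model_dev *dev,
				unsigned int write_queues,
				unsigned int poll_queues, int with_fix)
{
	unsigned int uncapped = dev->nr_allocated_queues - 1;
	unsigned int needed;

	dev->nr_write_queues = write_queues;
	dev->nr_poll_queues = poll_queues;
	needed = model_max_io_queues(dev);

	if (!with_fix)
		return uncapped;
	return needed < uncapped ? needed : uncapped;
}
```

In this model, probing with 8 CPUs and `write_queues = poll_queues = 4` sets `nr_allocated_queues` to 17. If both parameters are then reduced to 0, a reset without the cap starts from 16 queues, while a reset with the cap starts from 8.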
### Historical context
Interestingly, there **was** a `min()` cap here originally. Commit
`ff4e5fbad06f7` (v5.11) removed it with the reasoning that
`nr_allocated_queues` equals `nvme_max_io_queues() + 1` so the min is
redundant. That reasoning is correct **at probe time** but breaks when
module parameters change at runtime. This commit effectively restores
the `min()` to handle the runtime modification case.
### Consequences of the bug
Without this fix, after reducing queue parameters and resetting:
1. Too many interrupt vectors are requested via
`pci_alloc_irq_vectors_affinity()`
2. The retry loop at lines 3022-3030 may be triggered unnecessarily when
the controller can't create all requested queues
3. Unnecessary resource waste (interrupt vectors, queue slots)
4. Potential allocation failures on systems with limited interrupt
resources
### Stable criteria assessment
- **Fixes a real bug**: Yes - incorrect behavior when runtime-modifiable
parameters change
- **Obviously correct**: Yes - it's restoring a `min()` that was
previously there, with well-understood semantics
- **Small and contained**: Yes - 1 functional line change (7 lines with
comment)
- **No new features**: Correct - purely a bug fix
- **Tested and reviewed**: Tested-by and Reviewed-by from Kanchan Joshi,
plus Reviewed-by from Christoph Hellwig (NVMe subsystem maintainer)
- **Author**: Keith Busch, himself an NVMe maintainer
- **Risk**: Very low - `min()` can only reduce the initial queue count,
never increase it, so the driver never requests more queues or
interrupts than it did before this change
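The risk argument rests on a simple property of the capped expression. A minimal check on plain integers might look like the following, where `capped_start()` and `cap_never_increases()` are hypothetical helpers written for this illustration and do not exist in the driver.

```c
#include <assert.h>

/* Same expression shape as the patched line, on plain integers. */
static unsigned int capped_start(unsigned int max_io_queues,
				 unsigned int nr_allocated_queues)
{
	unsigned int prealloc;

	assert(nr_allocated_queues >= 1);
	prealloc = nr_allocated_queues - 1;
	return max_io_queues < prealloc ? max_io_queues : prealloc;
}

/*
 * Returns 1 if, for every pair of inputs up to `limit`, the capped
 * value never exceeds the old starting value (nr_allocated_queues - 1).
 */
static int cap_never_increases(unsigned int limit)
{
	for (unsigned int max_io = 0; max_io <= limit; max_io++)
		for (unsigned int alloc = 1; alloc <= limit; alloc++)
			if (capped_start(max_io, alloc) > alloc - 1)
				return 0;
	return 1;
}
```

When the parameters are unchanged since probe, `max_io_queues` equals `nr_allocated_queues - 1` and the cap is a no-op; it only takes effect when the parameters were reduced.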
### Dependencies
This commit is standalone. The `nvme_max_io_queues()` function has
existed since v5.10 (commit `e3aef0950a30e`). The related commit
`02bfc966463a7` ("do not try to add queue maps at runtime") was applied
to the same stable tree but is independent - it handles a different
aspect (preventing addition of queue map types at runtime), while this
commit handles queue **count** capping.
The code at lines 2923-2933 (tagset nr_maps check from the "do not try
to add queue maps" commit) executes before this line and can further
reduce `dev->nr_write_queues`/`dev->nr_poll_queues`. While this commit
works fine without that prerequisite (both independently reduce the
queue count for different reasons), having the prerequisite makes the
`min()` even more effective since `nvme_max_io_queues()` returns the
properly capped value.
### Verification
- **git show ff4e5fbad06f7**: Confirmed that a `min()` with
`nvme_max_io_queues()` existed before v5.11 and was removed because it
was considered redundant at the time
- **git tag --contains ff4e5fbad06f7**: Confirmed the `min()` removal is
in every release from v5.11 onward, meaning stable trees from 5.15.y
onward (5.15.y, 6.1.y, 6.6.y, etc.) have the bug
- **Module parameter permissions**: Confirmed `write_queues` and
`poll_queues` are 0644 (runtime writable) at lines 265 and 271 of pci.c
- **nvme_max_io_queues()**: Confirmed it exists since v5.10 (commit
e3aef0950a30e), so the function is available in all stable trees
- **git show 4735b510a00fb**: Confirmed upstream commit exists and the
diff matches exactly
- **git show 02bfc966463a7**: Verified the related "do not try to add
queue maps" commit is independent; it adds the tagset check (lines
2923-2933) which is additive and doesn't conflict
- **git log --author="Keith Busch"**: Confirmed Keith Busch is the NVMe
maintainer and author of related fixes
- Could NOT independently verify whether the retry loop causes user-
visible failures vs just being wasteful (unverified, but requesting
excess interrupts is clearly wrong behavior)
### Conclusion
This is a small, surgical, well-reviewed bug fix by the NVMe subsystem
maintainer that restores a necessary safety cap for a realistic user
scenario (modifying module parameters at runtime). It has minimal risk -
`min()` can only reduce the queue count. The affected code exists in
every stable tree from 5.15.y onward. It meets all stable kernel criteria.
**YES**
drivers/nvme/host/pci.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d86f2565a92ca..74aca8846fffe 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2776,7 +2776,13 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	dev->nr_write_queues = write_queues;
 	dev->nr_poll_queues = poll_queues;
 
-	nr_io_queues = dev->nr_allocated_queues - 1;
+	/*
+	 * The initial number of allocated queue slots may be too large if the
+	 * user reduced the special queue parameters. Cap the value to the
+	 * number we need for this round.
+	 */
+	nr_io_queues = min(nvme_max_io_queues(dev),
+			   dev->nr_allocated_queues - 1);
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
 	if (result < 0)
 		return result;
--
2.51.0