From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Yu Kuai <yukuai@fnnas.com>, Nilay Shroff <nilay@linux.ibm.com>,
Ming Lei <ming.lei@redhat.com>, Hannes Reinecke <hare@suse.de>,
Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>,
linux-block@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19-5.10] blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs()
Date: Wed, 11 Feb 2026 07:30:33 -0500
Message-ID: <20260211123112.1330287-23-sashal@kernel.org>
In-Reply-To: <20260211123112.1330287-1-sashal@kernel.org>
From: Yu Kuai <yukuai@fnnas.com>
[ Upstream commit 9d20fd6ce1ba9733cd5ac96fcab32faa9fc404dd ]
In blk_mq_update_nr_hw_queues(), debugfs_mutex is not held while
creating debugfs entries for hctxs. Hence add debugfs_mutex there,
it's safe because queue is not frozen.
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I have a comprehensive picture. Let me compile the full analysis.
---
## Complete Analysis
### 1. COMMIT MESSAGE ANALYSIS
The commit title is clear: "add missing debugfs_mutex in
blk_mq_debugfs_register_hctxs()". The body explicitly describes the
problem: `debugfs_mutex` is not held while creating debugfs entries for
hctxs during `blk_mq_update_nr_hw_queues()`. The author (Yu Kuai) notes
it's safe to add the mutex there because "queue is not frozen" at that
point (meaning there's no deadlock risk with freeze-related locks).
Three reviewers gave Reviewed-by tags: Nilay Shroff (IBM), Ming Lei
(Red Hat), and Hannes Reinecke (SUSE) -- all experienced block layer
developers. The patch was applied by Jens Axboe (block layer maintainer).
### 2. CODE CHANGE ANALYSIS
The diff is minimal -- **2 lines added** around the existing loop:
```c
	mutex_lock(&q->debugfs_mutex);
	queue_for_each_hw_ctx(q, hctx, i)
		blk_mq_debugfs_register_hctx(q, hctx);
	mutex_unlock(&q->debugfs_mutex);
```
#### The Bug Mechanism
The `debugfs_mutex` was introduced in commit `5cf9c91ba927` ("block:
serialize all debugfs operations using q->debugfs_mutex", v5.19) by
Christoph Hellwig. That commit explicitly stated the goal: *"Use the
existing debugfs_mutex to serialize all debugfs operations that rely on
q->debugfs_dir or the directories hanging off it."*
That commit added `lockdep_assert_held(&q->debugfs_mutex)` in many
functions:
- `blk_mq_debugfs_register_sched()` (line 706)
- `blk_mq_debugfs_unregister_sched()` (line 725)
- `blk_mq_debugfs_unregister_rqos()` (line 746)
- `blk_mq_debugfs_register_rqos()` (line 759)
- `blk_mq_debugfs_register_sched_hctx()` (line 777)
- `blk_mq_debugfs_unregister_sched_hctx()` (line 798)
But it missed `blk_mq_debugfs_register_hctxs()`, which is called from
`__blk_mq_update_nr_hw_queues()` at line 5165 of `block/blk-mq.c`
**without** holding `debugfs_mutex`.
All other callers properly hold the mutex:
- `blk_register_queue()` in `blk-sysfs.c` holds `debugfs_mutex` when
calling `blk_mq_debugfs_register()` (which internally calls
`blk_mq_debugfs_register_hctx()`)
- `blk_mq_sched_reg_debugfs()` in `blk-mq-sched.c` holds `debugfs_mutex`
when calling sched debugfs registration
- `rq_qos_add()` in `blk-rq-qos.c` holds `debugfs_mutex` when
registering rqos debugfs
#### The Race
The race is between two concurrent paths:
**Thread A** -- `__blk_mq_update_nr_hw_queues()` →
`blk_mq_debugfs_register_hctxs()` → `blk_mq_debugfs_register_hctx()`:
```656:673:block/blk-mq-debugfs.c
void blk_mq_debugfs_register_hctx(struct request_queue *q,
				  struct blk_mq_hw_ctx *hctx)
{
	struct blk_mq_ctx *ctx;
	char name[20];
	int i;

	if (!q->debugfs_dir)
		return;

	snprintf(name, sizeof(name), "hctx%u", hctx->queue_num);
	hctx->debugfs_dir = debugfs_create_dir(name, q->debugfs_dir);
	// ... creates more debugfs files ...
}
```
**Thread B** -- `blk_unregister_queue()` → `blk_debugfs_remove()`:
```884:895:block/blk-sysfs.c
static void blk_debugfs_remove(struct gendisk *disk)
{
	struct request_queue *q = disk->queue;

	mutex_lock(&q->debugfs_mutex);
	blk_trace_shutdown(q);
	debugfs_remove_recursive(q->debugfs_dir);
	q->debugfs_dir = NULL;
	// ...
	mutex_unlock(&q->debugfs_mutex);
}
```
Without the mutex in Thread A:
1. Thread A checks `q->debugfs_dir` (line 663) -- not NULL, proceeds
2. Thread B acquires `debugfs_mutex`, removes `q->debugfs_dir`, sets it
to NULL
3. Thread A uses the now-stale/freed `q->debugfs_dir` to create child
entries (line 667)
This can result in orphaned debugfs entries, inconsistent debugfs state,
and potentially use of a freed dentry.
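The check-then-use pattern and its fix can be sketched in user space. This is only an illustrative analog, not kernel code: `struct fake_queue` and every `fake_*` function are hypothetical names, a pthread mutex stands in for `q->debugfs_mutex`, and a heap-allocated counter stands in for the parent dentry. The sketch calls the two paths sequentially, so it shows the locking discipline rather than reproducing the race itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct request_queue. */
struct fake_queue {
	pthread_mutex_t debugfs_mutex;	/* plays the role of q->debugfs_mutex */
	int *debugfs_dir;		/* non-NULL while the "directory" exists */
	int nr_hctx_dirs;		/* child entries created so far */
};

void fake_queue_init(struct fake_queue *q)
{
	pthread_mutex_init(&q->debugfs_mutex, NULL);
	q->debugfs_dir = calloc(1, sizeof(*q->debugfs_dir));
	assert(q->debugfs_dir != NULL);
	q->nr_hctx_dirs = 0;
}

/* Caller must hold q->debugfs_mutex, mirroring the patched path. */
static void fake_register_hctx(struct fake_queue *q)
{
	if (!q->debugfs_dir)
		return;
	/* The parent is dereferenced here, so it must not be freed meanwhile. */
	(*q->debugfs_dir)++;
	q->nr_hctx_dirs++;
}

/* Analog of the fixed function: check and use happen under one lock. */
void fake_register_hctxs(struct fake_queue *q, int nr_hw_queues)
{
	pthread_mutex_lock(&q->debugfs_mutex);
	for (int i = 0; i < nr_hw_queues; i++)
		fake_register_hctx(q);
	pthread_mutex_unlock(&q->debugfs_mutex);
}

/* Analog of the teardown path: free and clear under the same lock. */
void fake_debugfs_remove(struct fake_queue *q)
{
	pthread_mutex_lock(&q->debugfs_mutex);
	free(q->debugfs_dir);
	q->debugfs_dir = NULL;
	q->nr_hctx_dirs = 0;
	pthread_mutex_unlock(&q->debugfs_mutex);
}
```

Because both paths take the same mutex, the registration loop sees the parent either fully present or already cleared to NULL, never freed in between the check and the use.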
### 3. CLASSIFICATION
This is a **synchronization bug fix** -- adding a missing lock
acquisition in a path that was accidentally omitted when the locking
scheme was introduced. It completes the locking protocol established in
commit `5cf9c91ba927`.
### 4. SCOPE AND RISK
- **Size**: 2 lines added (one `mutex_lock()` and one `mutex_unlock()`,
  consistent with the existing pattern)
- **Files touched**: 1 file (`block/blk-mq-debugfs.c`)
- **Subsystem**: Block layer (blk-mq) -- core infrastructure used by all
block device drivers
- **Risk**: Extremely low. The mutex is already held by all other paths.
The commit message explicitly addresses deadlock safety ("queue is not
frozen")
- **Pattern match**: 100% consistent with how every other debugfs
operation in the block layer acquires this lock
### 5. USER IMPACT
The `__blk_mq_update_nr_hw_queues()` path is triggered during:
- CPU hotplug events (adding/removing CPUs changes hw queue count)
- Driver-initiated hw queue resizing (e.g., `null_blk` via configfs
`submit_queues`)
- SCSI/NVMe driver queue changes
This means any system doing CPU hotplug or block device reconfiguration
could hit this race, but only when the queue update runs concurrently
with disk registration or unregistration. Those happen during device
discovery and removal -- both common operations.
### 6. STABILITY INDICATORS
- **3 Reviewed-by tags** from major Linux storage developers
- **Merged by Jens Axboe** (block layer maintainer)
- Part of a series of debugfs locking improvements to the block layer
(see commit `0e94ed336814` protecting `hctx_busy_show`)
### 7. DEPENDENCY CHECK
- **Dependencies**: Requires `debugfs_mutex` in `struct request_queue`,
which has been present since v5.19 (commit `5cf9c91ba927`)
- **Self-contained**: The fix is entirely self-contained -- no other
patches required
- **Applies to**: All stable trees from 5.19 onwards (6.1.y, 6.6.y,
6.12.y, etc.)
### Summary
This commit fixes a missing lock acquisition in
`blk_mq_debugfs_register_hctxs()` that was overlooked when the
`debugfs_mutex` serialization scheme was introduced in v5.19. Every
other debugfs operation in the block layer properly holds this lock. The
fix is:
- **2 lines** of trivially correct code
- **Very low risk** of regression (follows established pattern exactly)
- **Well-reviewed** by 3 experienced block layer developers
- **Fixes a real race** between hw queue updates and debugfs teardown
- **Self-contained** with no dependencies beyond what's in all affected
stable trees
The fix is small, surgical, obviously correct, and meets all stable
kernel criteria.
**YES**
block/blk-mq-debugfs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 4896525b1c054..553d93b88e194 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -686,8 +686,10 @@ void blk_mq_debugfs_register_hctxs(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;
 
+	mutex_lock(&q->debugfs_mutex);
 	queue_for_each_hw_ctx(q, hctx, i)
 		blk_mq_debugfs_register_hctx(q, hctx);
+	mutex_unlock(&q->debugfs_mutex);
 }
 
 void blk_mq_debugfs_unregister_hctxs(struct request_queue *q)
--
2.51.0