From: Ming Lei <tom.leiming@gmail.com>
To: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
Nilay Shroff <nilay@linux.ibm.com>
Subject: Re: [PATCH v2] block: serialize elevator changes for the same queue using a writer lock
Date: Wed, 24 Jun 2026 04:44:14 -0500
Message-ID: <ajum7l5BPDRoeJgi@fedora>
In-Reply-To: <20260623013238.642052-1-shinichiro.kawasaki@wdc.com>

On Tue, Jun 23, 2026 at 10:32:38AM +0900, Shin'ichiro Kawasaki wrote:
> When elevator_change() is called concurrently for the same queue, the
> elevator_change_done() function runs concurrently as well. This function
> adds or deletes kobjects for the debugfs entry of the queue. Then the
> concurrent calls cause memory corruption of the kobjects and result in a
> process hang. The core part of the elevator switch is protected by queue
> freeze and q->elevator_lock. However, since the commit 559dc11143eb
> ("block: move elv_register[unregister]_queue out of elevator_lock"), the
> elevator_change_done() is not serialized. Hence the memory corruption
> and the hang.
>
> The failures are observed when udev-worker writes to a sysfs
> queue/scheduler attribute file while the blktests test case block/005
> writes to the same attribute file. The failure also can be recreated by
> running two processes that write to the same queue/scheduler file
> concurrently. The failure is observed since another commit 370ac285f23a
> ("block: avoid cpu_hotplug_lock depedency on freeze_lock"). This commit
> changed the behavior of queue freeze and it unveiled the failure.
>
> Fix the failure by changing elv_iosched_store() to acquire
> update_nr_hwq_lock as the writer lock instead of the reader lock. This
> serializes the whole elevator switch steps, including the
> elevator_change_done() call.
>
> Fixes: 559dc11143eb ("block: move elv_register[unregister]_queue out of elevator_lock")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
> I observed that the blktests test case block/005 hung on a specific
> server hardware using a specific HDD as a block device. During the test
> case run, the kernel reported KASAN null-ptr-deref and slab-use-after-
> free errors. The failure happened when a sysfs queue/scheduler attribute
> file is written concurrently. I reported the failure and shared a
> candidate fix patch as RFC [1]. Based on the comments and discussion on
> the RFC patch, I propose this v2 patch that avoids introducing a new
> lock. My thanks go to Ming and Nilay for the discussion.
>
> Please refer to [1] for details of the failure. Also, I created a
> blktests test case that recreates the hang [2], which I used to test the
> fix.
>
> * Changes from RFC v1
> - Instead of adding a new mutex to struct request_queue, replace the
> reader lock on update_nr_hwq_lock with the writer lock in
> elv_iosched_store().
>
> [1] https://lore.kernel.org/linux-block/20260611074200.474676-1-shinichiro.kawasaki@wdc.com/
> [2] https://github.com/kawasaki/blktests/commit/8e80b3ccc0bbbe3f209d00eacd138d020de97fc6
>
> block/elevator.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/elevator.c b/block/elevator.c
> index 3bcd37c2aa34..b03185a217ff 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -813,7 +813,7 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf,
>  	 * update_nr_hwq_lock -> kn->active (via del_gendisk -> kobject_del)
>  	 * kn->active -> update_nr_hwq_lock (via this sysfs write path)
>  	 */
> -	if (!down_read_trylock(&set->update_nr_hwq_lock)) {
> +	if (!down_write_trylock(&set->update_nr_hwq_lock)) {
>  		ret = -EBUSY;
>  		goto out;
>  	}
> @@ -824,7 +824,7 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf,
>  	} else {
>  		ret = -ENOENT;
>  	}
> -	up_read(&set->update_nr_hwq_lock);
> +	up_write(&set->update_nr_hwq_lock);
I feel this is still an abuse of the above lock, which exists to
serialize writers against readers with respect to updating the number
of hw queues.

How about the following fix?

diff --git a/block/elevator.c b/block/elevator.c
index 3bcd37c2aa34..0375eb77646e 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -627,8 +627,10 @@ static void elv_exit_and_release(struct elv_change_ctx *ctx,
 static int elevator_change_done(struct request_queue *q,
 				struct elv_change_ctx *ctx)
 {
+	bool exit_and_release = false;
 	int ret = 0;
+	mutex_lock(&q->elevator_lock);
 	if (ctx->old) {
 		struct elevator_resources res = {
 			.et = ctx->old->et,
@@ -640,10 +642,24 @@ static int elevator_change_done(struct request_queue *q,
 		kobject_put(&ctx->old->kobj);
 	}
 	if (ctx->new) {
+		if (ctx->new != q->elevator) {
+			ret = -EBUSY;
+			goto unlock;
+		}
 		ret = elv_register_queue(q, ctx->new, !ctx->no_uevent);
+		/*
+		 * Tear down the failed elevator after dropping elevator_lock:
+		 * elv_exit_and_release() freezes the queue and re-acquires
+		 * elevator_lock itself, so it must not be called nested here.
+		 */
 		if (ret)
-			elv_exit_and_release(ctx, q);
+			exit_and_release = true;
 	}
+unlock:
+	mutex_unlock(&q->elevator_lock);
+
+	if (exit_and_release)
+		elv_exit_and_release(ctx, q);
 	return ret;
 }
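
To make the locking shape in the diff above easier to follow, here is a
minimal userspace sketch of the same pattern: check under the lock that
the scheduler being registered is still the installed one, record a
failure in a flag, and run the teardown only after the lock is dropped.
It is an analogue using pthreads, not kernel code. struct demo_queue,
demo_teardown(), demo_change_done() and the register_fails parameter are
all hypothetical names made up for this illustration, and the -ENOMEM
return is an arbitrary stand-in for a registration error.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue; not a kernel structure. */
struct demo_queue {
	pthread_mutex_t lock;	/* plays the role of q->elevator_lock */
	void *active;		/* plays the role of q->elevator */
	int teardowns;		/* number of demo_teardown() calls so far */
};

/*
 * Takes the lock itself, which is why it must not be called while the
 * caller still holds it (a default pthread mutex is not recursive).
 */
static void demo_teardown(struct demo_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->active = NULL;
	q->teardowns++;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Returns 0 on success, -EBUSY if a concurrent switch already replaced
 * new_sched, or -ENOMEM when the simulated registration fails.
 */
static int demo_change_done(struct demo_queue *q, void *new_sched,
			    bool register_fails)
{
	bool need_teardown = false;
	int ret = 0;

	pthread_mutex_lock(&q->lock);
	if (new_sched != q->active) {
		/* Someone else switched again; leave their state alone. */
		ret = -EBUSY;
		goto unlock;
	}
	if (register_fails) {
		/* Only remember the failure; teardown needs the lock. */
		ret = -ENOMEM;
		need_teardown = true;
	}
unlock:
	pthread_mutex_unlock(&q->lock);

	if (need_teardown)
		demo_teardown(q);
	return ret;
}
```

The flag costs one extra local variable, but it keeps the lock/unlock
pair in a single function and avoids taking the lock recursively.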
Thanks,
Ming