From: Christoph Hellwig <hch@lst.de>
To: Ming Lei <ming.lei@redhat.com>
Cc: "Jens Axboe" <axboe@kernel.dk>,
linux-block@vger.kernel.org, "Nilay Shroff" <nilay@linux.ibm.com>,
"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Christoph Hellwig" <hch@lst.de>
Subject: Re: [PATCH 04/15] block: prevent elevator switch during updating nr_hw_queues
Date: Thu, 10 Apr 2025 16:36:22 +0200
Message-ID: <20250410143622.GC10701@lst.de>
In-Reply-To: <20250410133029.2487054-5-ming.lei@redhat.com>
On Thu, Apr 10, 2025 at 09:30:16PM +0800, Ming Lei wrote:
> updating nr_hw_queues is usually used for error handling code, when it
Capitalize the first word of each sentence, please.
> doesn't make sense to allow blk-mq elevator switching, since nr_hw_queues
> may change, and elevator tags depends on nr_hw_queues.
I don't think it's really updated mostly from error handling code:
- nbd does it when starting a device
- nullb can do it through debugfs
- xen-blkfront does it when resuming from a suspend
- nvme does it when resetting a controller. While error handling
can escalate to it, it's basically probing and re-probing code
> Prevent elevator switch during updating nr_hw_queues by setting flag of
> BLK_MQ_F_UPDATE_HW_QUEUES, and use srcu to fail elevator switch during
> the period. Here elevator switch code is srcu reader of nr_hw_queues,
> and blk_mq_update_nr_hw_queues() is the writer.
That being said, as we generally are in a setup path, I think the general
idea is fine. No devices should be live yet at this point, and thus
no udev rules changing the scheduler should run yet.
> This way avoids lot of trouble.
Can you spell that out a bit?
> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Closes: https://lore.kernel.org/linux-block/mz4t4tlwiqjijw3zvqnjb7ovvvaegkqganegmmlc567tt5xj67@xal5ro544cnc/
Are we using Closes for bug reports now? I haven't really seen that
anywhere.
> out_cleanup_srcu:
> if (set->flags & BLK_MQ_F_BLOCKING)
> cleanup_srcu_struct(set->srcu);
> @@ -5081,7 +5087,18 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
> {
> mutex_lock(&set->tag_list_lock);
> + /*
> + * Mark us in updating nr_hw_queues for preventing switching
> + * elevator
>
> + *
> + * Elevator switch code can _not_ acquire ->tag_list_lock
Please add a . at the end of sentences. Also this should probably
be something like "Mark us as in.." but I'll leave more nitpicking
to the native speakers.
> struct request_queue *q = disk->queue;
> + struct blk_mq_tag_set *set = q->tag_set;
>
> /*
> * If the attribute needs to load a module, do it before freezing the
> @@ -732,6 +733,13 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf,
>
> elv_iosched_load_module(name);
>
> + idx = srcu_read_lock(&set->update_nr_hwq_srcu);
> +
> + if (set->flags & BLK_MQ_F_UPDATE_HW_QUEUES) {
What provides atomicity for modifications vs reads of set->flags?
I.e., does this need to switch to using test_bit/set_bit?
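To illustrate the concern: a plain read-modify-write of a shared flags word can lose a concurrent update, while atomic bit operations cannot. The following is only a userspace sketch using C11 atomics as a stand-in for the kernel bitops; the struct, the bit number and the sketch_* helpers are made up for this example and are not part of the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number, chosen only for this illustration. */
#define SKETCH_UPDATING_NR_HWQ	0

struct tag_set_sketch {
	atomic_ulong state;	/* bits are only touched with atomic ops */
};

/* Atomically set bit nr; returns true if the bit was already set. */
static bool sketch_test_and_set_bit(struct tag_set_sketch *set, unsigned int nr)
{
	unsigned long mask = 1UL << nr;

	return atomic_fetch_or(&set->state, mask) & mask;
}

/* Atomically clear bit nr. */
static void sketch_clear_bit(struct tag_set_sketch *set, unsigned int nr)
{
	atomic_fetch_and(&set->state, ~(1UL << nr));
}

/* Atomically read the word and report whether bit nr is set. */
static bool sketch_test_bit(struct tag_set_sketch *set, unsigned int nr)
{
	return atomic_load(&set->state) & (1UL << nr);
}
```

With helpers like these, the updater would set the bit before changing nr_hw_queues and clear it afterwards, and the elevator switch path would only test it.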
> + struct srcu_struct update_nr_hwq_srcu;
> };
>
> /**
> @@ -681,7 +682,14 @@ enum {
> */
> BLK_MQ_F_NO_SCHED_BY_DEFAULT = 1 << 6,
>
> - BLK_MQ_F_MAX = 1 << 7,
> + /*
> + * True when updating nr_hw_queues is in-progress
> + *
> + * tag_set only flag, not usable for hctx
> + */
> + BLK_MQ_F_UPDATE_HW_QUEUES = 1 << 7,
> +
> + BLK_MQ_F_MAX = 1 << 8,
Also mixing internal state with driver provided flags is always
a bad idea. So this should probably be a new state field in the
tag_set and not reuse flags.
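A rough sketch of that split, again in plain userspace C and with invented names and values (SKETCH_F_*, struct tag_set_split_sketch and the helpers are assumptions for illustration, not the kernel's definitions): the driver-provided flags stay in one word that can be validated against the MAX value, and the "update in progress" state lives in its own field. Synchronization of the state field is deliberately left out here.

```c
#include <stdbool.h>

/* Driver-provided flags; the values are illustrative only. */
enum {
	SKETCH_F_BLOCKING		= 1 << 0,
	SKETCH_F_NO_SCHED_BY_DEFAULT	= 1 << 1,
	SKETCH_F_MAX			= 1 << 2,
};

struct tag_set_split_sketch {
	unsigned int flags;	/* set by the driver, not modified by the core */
	bool updating_nr_hwq;	/* internal state, owned by the core */
};

/* Reject any bit at or above SKETCH_F_MAX in the driver-provided flags. */
static bool sketch_flags_valid(unsigned int flags)
{
	return !(flags & ~(unsigned int)(SKETCH_F_MAX - 1));
}

/* Mark the start and end of an update without touching ->flags. */
static void sketch_begin_update(struct tag_set_split_sketch *set)
{
	set->updating_nr_hwq = true;
}

static void sketch_end_update(struct tag_set_split_sketch *set)
{
	set->updating_nr_hwq = false;
}
```

Kept this way, the flag validation never has to know about internal state bits, and the MAX value does not need to move when new state is added.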