public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Nilay Shroff <nilay@linux.ibm.com>
Cc: "Jens Axboe" <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Christoph Hellwig" <hch@lst.de>
Subject: Re: [PATCH V2 20/20] block: move wbt_enable_default() out of queue freezing from sched ->exit()
Date: Tue, 22 Apr 2025 17:33:03 +0800	[thread overview]
Message-ID: <aAdiT3j5ptl_NdZm@fedora> (raw)
In-Reply-To: <8c07d251-bafc-4601-b14d-5771c4615703@linux.ibm.com>

On Tue, Apr 22, 2025 at 02:57:01PM +0530, Nilay Shroff wrote:
> 
> 
> On 4/22/25 1:43 PM, Ming Lei wrote:
> > On Tue, Apr 22, 2025 at 11:44:59AM +0530, Nilay Shroff wrote:
> >>
> >>
> >> On 4/21/25 12:57 PM, Ming Lei wrote:
> >>> On Sat, Apr 19, 2025 at 08:09:04PM +0530, Nilay Shroff wrote:
> >>>>
> >>>>
> >>>> On 4/18/25 10:07 PM, Ming Lei wrote:
> >>>>> scheduler's ->exit() is called with queue frozen and elevator lock is held, and
> >>>>> wbt_enable_default() can't be called with queue frozen, otherwise the
> >>>>> following lockdep warning is triggered:
> >>>>>
> >>>>> 	#6 (&q->rq_qos_mutex){+.+.}-{4:4}:
> >>>>> 	#5 (&eq->sysfs_lock){+.+.}-{4:4}:
> >>>>> 	#4 (&q->elevator_lock){+.+.}-{4:4}:
> >>>>> 	#3 (&q->q_usage_counter(io)#3){++++}-{0:0}:
> >>>>> 	#2 (fs_reclaim){+.+.}-{0:0}:
> >>>>> 	#1 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}:
> >>>>> 	#0 (&q->debugfs_mutex){+.+.}-{4:4}:
> >>>>>
> >>>>> Fix the issue by moving wbt_enable_default() out of bfq's exit(), and
> >>>>> call it from elevator_change_done().
> >>>>>
> >>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> >>>>> ---
> >>>>>  block/bfq-iosched.c | 2 +-
> >>>>>  block/elevator.c    | 5 +++++
> >>>>>  block/elevator.h    | 1 +
> >>>>>  3 files changed, 7 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> >>>>> index 40e4106a71e7..310ce1d8c41e 100644
> >>>>> --- a/block/bfq-iosched.c
> >>>>> +++ b/block/bfq-iosched.c
> >>>>> @@ -7211,7 +7211,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
> >>>>>  
> >>>>>  	blk_stat_disable_accounting(bfqd->queue);
> >>>>>  	blk_queue_flag_clear(QUEUE_FLAG_DISABLE_WBT, bfqd->queue);
> >>>>> -	wbt_enable_default(bfqd->queue->disk);
> >>>>> +	set_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &e->flags);
> >>>>>  
> >>>>>  	kfree(bfqd);
> >>>>>  }
> >>>>> diff --git a/block/elevator.c b/block/elevator.c
> >>>>> index 8652fe45a2db..378553fce5d8 100644
> >>>>> --- a/block/elevator.c
> >>>>> +++ b/block/elevator.c
> >>>>> @@ -687,8 +687,13 @@ int elevator_change_done(struct request_queue *q, struct elv_change_ctx *ctx)
> >>>>>  	int ret = 0;
> >>>>>  
> >>>>>  	if (ctx->old) {
> >>>>> +		bool enable_wbt = test_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT,
> >>>>> +				&ctx->old->flags);
> >>>>> +
> >>>>>  		elv_unregister_queue(q, ctx->old);
> >>>>>  		kobject_put(&ctx->old->kobj);
> >>>>> +		if (enable_wbt)
> >>>>> +			wbt_enable_default(q->disk);
> >>>>>  	}
> >>>>>  	if (ctx->new) {
> >>>>>  		ret = elv_register_queue(q, ctx->new, ctx->uevent);
> >>>>> diff --git a/block/elevator.h b/block/elevator.h
> >>>>> index 486be0690499..b14c611c74b6 100644
> >>>>> --- a/block/elevator.h
> >>>>> +++ b/block/elevator.h
> >>>>> @@ -122,6 +122,7 @@ struct elevator_queue
> >>>>>  
> >>>>>  #define ELEVATOR_FLAG_REGISTERED	0
> >>>>>  #define ELEVATOR_FLAG_DYING		1
> >>>>> +#define ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT	2
> >>>>>  
> >>>>>  /* Holding context data for changing elevator */
> >>>>>  struct elv_change_ctx {
> >>>>
> >>>> It seems invoking wbt_enable_default from elevator_change_done could probably
> >>>> still race with ioc_qos_write or queue_wb_lat_store. Both ioc_qos_write and 
> >>>> queue_wb_lat_store run with ->freeze_lock and ->elevator_lock protection.
> >>>
> >>> Actually wbt_enable_default() and wbt_init() needn't the above protection,
> >>> especially since the patch 2/20 removes q->elevator use in
> >>> wbt_enable_default().
> >>>
> >> Yes agreed, and as I understand XXX_FLAG_DISABLE_WBT was earlier elevator_queue->flags 
> >> but now (with patch 2/20) it has been moved to request_queue->flags. As elevator_change_done 
> >> first puts elevator_queue object which would potentially releases/frees the  elevator_queue 
> >> object. Next while we enable wbt (in elevator_change_done)  we may not have access to the 
> >> elevator_queue object and so now we reference  QUEUE_FLAG_DISABLE_WBT using request_queue->flags. 
> >> That's, I believe, the purpose of patch 2/20.
> >>
> >> However even with patch 2/20 change, both elevator_change_done and ioc_qos_write or
> >> queue_wb_lat_store may run in parallel, isn't it?
> >>
> >> therad1:
> >> blk_mq_update_nr_hw_queues
> >>   -> __blk_mq_update_nr_hw_queues
> >>     -> elevator_change_done
> >>       -> wbt_enable_default
> >>         -> wbt_init
> >>          -> wbt_update_limits
> > 
> > Here wbt_update_limits() is called on un-attached `struct rq_wb` instance,
> > which is just allocated from heap.
> > 
> >>
> >> therad2:
> >> queue_wb_lat_store
> >>   -> wbt_set_min_lat
> >>    -> wbt_update_limits
> > 
> > The above one is run on attached `struct rq_wb` instance.
> > 
> > And there can only be one attached `struct rq_wb` instance, so the above
> > race doesn't exist because attaching wbt to queue/disk is covered by `q->rq_qos_mutex`.
> > 
> Yes you were correct, however, what if throttling is already enabled/attached to 
> the queue? In that case we'd race updating rq_wb->enable_state, no? For instance,
> 
> thread1:
> blk_mq_update_nr_hw_queues
>   -> elevator_change_done
>     -> wbt_enable_default ==> (updates ->enable_state)
> 
> thread2: 
> queue_wb_lat_store
>   -> wbt_set_min_lat ==> (updates ->enable_state)
> 
> thread3:
> ioc_qos_write
>   -> wbt_disable_default ==> (updates ->enable_state)

OK, that is one race, but should have been handled by one rqos dedicated
lock instead of ->elevator_lock.


Thanks,
Ming


      reply	other threads:[~2025-04-22  9:33 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-18 16:36 [PATCH V2 00/20] block: unify elevator changing and fix lockdep warning Ming Lei
2025-04-18 16:36 ` [PATCH V2 01/20] block: move blk_mq_add_queue_tag_set() after blk_mq_map_swqueue() Ming Lei
2025-04-19  8:57   ` Yu Kuai
2025-04-19 10:25   ` Nilay Shroff
2025-04-21 12:31   ` Christoph Hellwig
2025-04-22  6:00   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 02/20] block: move ELEVATOR_FLAG_DISABLE_WBT as request queue flag Ming Lei
2025-04-19  8:59   ` Yu Kuai
2025-04-19 14:52   ` Nilay Shroff
2025-04-21 12:33   ` Christoph Hellwig
2025-04-22  6:01   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 03/20] block: don't call freeze queue in elevator_switch() and elevator_disable() Ming Lei
2025-04-19  9:01   ` Yu Kuai
2025-04-21 12:34   ` Christoph Hellwig
2025-04-22  4:00     ` Ming Lei
2025-04-22  7:00       ` Christoph Hellwig
2025-04-22  6:02   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 04/20] block: add two helpers for registering/un-registering sched debugfs Ming Lei
2025-04-19  9:18   ` Yu Kuai
2025-04-19 10:27   ` Nilay Shroff
2025-04-21 12:36   ` Christoph Hellwig
2025-04-22  6:04   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 05/20] block: move sched debugfs register into elvevator_register_queue Ming Lei
2025-04-19  9:28   ` Yu Kuai
2025-04-19 10:28   ` Nilay Shroff
2025-04-21 12:37   ` Christoph Hellwig
2025-04-22  6:06   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 06/20] block: add & pass 'struct gendisk_data' for retrying add/del disk in updating nr_hw_queues Ming Lei
2025-04-18 16:36 ` [PATCH V2 07/20] block: prevent adding/deleting disk during " Ming Lei
2025-04-19 11:20   ` Nilay Shroff
2025-04-21  2:39     ` Ming Lei
2025-04-22  7:04       ` Christoph Hellwig
2025-04-22  9:29         ` Ming Lei
2025-04-22 10:26           ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 08/20] block: don't allow to switch elevator if updating nr_hw_queues is in-progress Ming Lei
2025-04-19 11:22   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 09/20] block: simplify elevator rebuild for updating nr_hw_queues Ming Lei
2025-04-19 11:25   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 10/20] block: add helper of elevator_change() Ming Lei
2025-04-18 16:36 ` [PATCH V2 11/20] block: move blk_unregister_queue() & device_del() after freeze wait Ming Lei
2025-04-19 12:38   ` Nilay Shroff
2025-04-22 10:50     ` Ming Lei
2025-04-18 16:36 ` [PATCH V2 12/20] block: add `struct elv_change_ctx` for unifying elevator_change Ming Lei
2025-04-19 12:40   ` Nilay Shroff
2025-04-22  7:07   ` Christoph Hellwig
2025-04-22  8:36     ` Ming Lei
2025-04-23  7:10       ` Christoph Hellwig
2025-04-18 16:36 ` [PATCH V2 13/20] block: unifying elevator change Ming Lei
2025-04-19 13:26   ` Nilay Shroff
2025-04-24  9:02     ` Ming Lei
2025-04-18 16:36 ` [PATCH V2 14/20] block: pass elevator_queue to elv_register_queue & unregister_queue Ming Lei
2025-04-19 13:28   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 15/20] block: fail to show/store elevator sysfs attribute if elevator is dying Ming Lei
2025-04-19 13:45   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 16/20] block: move elv_register[unregister]_queue out of elevator_lock Ming Lei
2025-04-19 13:55   ` Nilay Shroff
2025-04-22 10:53     ` Ming Lei
2025-04-22 11:06       ` Nilay Shroff
2025-04-22 11:14         ` Ming Lei
2025-04-18 16:36 ` [PATCH V2 17/20] block: move debugfs/sysfs register out of freezing queue Ming Lei
2025-04-19 13:57   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 18/20] block: remove several ->elevator_lock Ming Lei
2025-04-19 14:17   ` Nilay Shroff
2025-04-18 16:37 ` [PATCH V2 19/20] block: move hctx cpuhp add/del out of queue freezing Ming Lei
2025-04-19 14:20   ` Nilay Shroff
2025-04-18 16:37 ` [PATCH V2 20/20] block: move wbt_enable_default() out of queue freezing from sched ->exit() Ming Lei
2025-04-19 14:39   ` Nilay Shroff
2025-04-21  7:27     ` Ming Lei
2025-04-22  6:14       ` Nilay Shroff
2025-04-22  8:13         ` Ming Lei
2025-04-22  9:27           ` Nilay Shroff
2025-04-22  9:33             ` Ming Lei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aAdiT3j5ptl_NdZm@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=nilay@linux.ibm.com \
    --cc=shinichiro.kawasaki@wdc.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox