All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Nilay Shroff <nilay@linux.ibm.com>
Cc: "Jens Axboe" <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Christoph Hellwig" <hch@lst.de>
Subject: Re: [PATCH V2 20/20] block: move wbt_enable_default() out of queue freezing from sched ->exit()
Date: Tue, 22 Apr 2025 17:33:03 +0800	[thread overview]
Message-ID: <aAdiT3j5ptl_NdZm@fedora> (raw)
In-Reply-To: <8c07d251-bafc-4601-b14d-5771c4615703@linux.ibm.com>

On Tue, Apr 22, 2025 at 02:57:01PM +0530, Nilay Shroff wrote:
> 
> 
> On 4/22/25 1:43 PM, Ming Lei wrote:
> > On Tue, Apr 22, 2025 at 11:44:59AM +0530, Nilay Shroff wrote:
> >>
> >>
> >> On 4/21/25 12:57 PM, Ming Lei wrote:
> >>> On Sat, Apr 19, 2025 at 08:09:04PM +0530, Nilay Shroff wrote:
> >>>>
> >>>>
> >>>> On 4/18/25 10:07 PM, Ming Lei wrote:
> >>>>> scheduler's ->exit() is called with queue frozen and elevator lock is held, and
> >>>>> wbt_enable_default() can't be called with queue frozen, otherwise the
> >>>>> following lockdep warning is triggered:
> >>>>>
> >>>>> 	#6 (&q->rq_qos_mutex){+.+.}-{4:4}:
> >>>>> 	#5 (&eq->sysfs_lock){+.+.}-{4:4}:
> >>>>> 	#4 (&q->elevator_lock){+.+.}-{4:4}:
> >>>>> 	#3 (&q->q_usage_counter(io)#3){++++}-{0:0}:
> >>>>> 	#2 (fs_reclaim){+.+.}-{0:0}:
> >>>>> 	#1 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}:
> >>>>> 	#0 (&q->debugfs_mutex){+.+.}-{4:4}:
> >>>>>
> >>>>> Fix the issue by moving wbt_enable_default() out of bfq's exit(), and
> >>>>> call it from elevator_change_done().
> >>>>>
> >>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> >>>>> ---
> >>>>>  block/bfq-iosched.c | 2 +-
> >>>>>  block/elevator.c    | 5 +++++
> >>>>>  block/elevator.h    | 1 +
> >>>>>  3 files changed, 7 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> >>>>> index 40e4106a71e7..310ce1d8c41e 100644
> >>>>> --- a/block/bfq-iosched.c
> >>>>> +++ b/block/bfq-iosched.c
> >>>>> @@ -7211,7 +7211,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
> >>>>>  
> >>>>>  	blk_stat_disable_accounting(bfqd->queue);
> >>>>>  	blk_queue_flag_clear(QUEUE_FLAG_DISABLE_WBT, bfqd->queue);
> >>>>> -	wbt_enable_default(bfqd->queue->disk);
> >>>>> +	set_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &e->flags);
> >>>>>  
> >>>>>  	kfree(bfqd);
> >>>>>  }
> >>>>> diff --git a/block/elevator.c b/block/elevator.c
> >>>>> index 8652fe45a2db..378553fce5d8 100644
> >>>>> --- a/block/elevator.c
> >>>>> +++ b/block/elevator.c
> >>>>> @@ -687,8 +687,13 @@ int elevator_change_done(struct request_queue *q, struct elv_change_ctx *ctx)
> >>>>>  	int ret = 0;
> >>>>>  
> >>>>>  	if (ctx->old) {
> >>>>> +		bool enable_wbt = test_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT,
> >>>>> +				&ctx->old->flags);
> >>>>> +
> >>>>>  		elv_unregister_queue(q, ctx->old);
> >>>>>  		kobject_put(&ctx->old->kobj);
> >>>>> +		if (enable_wbt)
> >>>>> +			wbt_enable_default(q->disk);
> >>>>>  	}
> >>>>>  	if (ctx->new) {
> >>>>>  		ret = elv_register_queue(q, ctx->new, ctx->uevent);
> >>>>> diff --git a/block/elevator.h b/block/elevator.h
> >>>>> index 486be0690499..b14c611c74b6 100644
> >>>>> --- a/block/elevator.h
> >>>>> +++ b/block/elevator.h
> >>>>> @@ -122,6 +122,7 @@ struct elevator_queue
> >>>>>  
> >>>>>  #define ELEVATOR_FLAG_REGISTERED	0
> >>>>>  #define ELEVATOR_FLAG_DYING		1
> >>>>> +#define ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT	2
> >>>>>  
> >>>>>  /* Holding context data for changing elevator */
> >>>>>  struct elv_change_ctx {
> >>>>
> >>>> It seems invoking wbt_enable_default from elevator_change_done could probably
> >>>> still race with ioc_qos_write or queue_wb_lat_store. Both ioc_qos_write and 
> >>>> queue_wb_lat_store run with ->freeze_lock and ->elevator_lock protection.
> >>>
> >>> Actually wbt_enable_default() and wbt_init() needn't the above protection,
> >>> especially since the patch 2/20 removes q->elevator use in
> >>> wbt_enable_default().
> >>>
> >> Yes agreed, and as I understand XXX_FLAG_DISABLE_WBT was earlier elevator_queue->flags 
> >> but now (with patch 2/20) it has been moved to request_queue->flags. As elevator_change_done 
> >> first puts elevator_queue object which would potentially releases/frees the  elevator_queue 
> >> object. Next while we enable wbt (in elevator_change_done)  we may not have access to the 
> >> elevator_queue object and so now we reference  QUEUE_FLAG_DISABLE_WBT using request_queue->flags. 
> >> That's, I believe, the purpose of patch 2/20.
> >>
> >> However even with patch 2/20 change, both elevator_change_done and ioc_qos_write or
> >> queue_wb_lat_store may run in parallel, isn't it?
> >>
> >> therad1:
> >> blk_mq_update_nr_hw_queues
> >>   -> __blk_mq_update_nr_hw_queues
> >>     -> elevator_change_done
> >>       -> wbt_enable_default
> >>         -> wbt_init
> >>          -> wbt_update_limits
> > 
> > Here wbt_update_limits() is called on un-attached `struct rq_wb` instance,
> > which is just allocated from heap.
> > 
> >>
> >> therad2:
> >> queue_wb_lat_store
> >>   -> wbt_set_min_lat
> >>    -> wbt_update_limits
> > 
> > The above one is run on attached `struct rq_wb` instance.
> > 
> > And there can only be one attached `struct rq_wb` instance, so the above
> > race doesn't exist because attaching wbt to queue/disk is covered by `q->rq_qos_mutex`.
> > 
> Yes you were correct, however, what if throttling is already enabled/attached to 
> the queue? In that case we'd race updating rq_wb->enable_state, no? For instance,
> 
> thread1:
> blk_mq_update_nr_hw_queues
>   -> elevator_change_done
>     -> wbt_enable_default ==> (updates ->enable_state)
> 
> thread2: 
> queue_wb_lat_store
>   -> wbt_set_min_lat ==> (updates ->enable_state)
> 
> thread3:
> ioc_qos_write
>   -> wbt_disable_default ==> (updates ->enable_state)

OK, that is one race, but should have been handled by one rqos dedicated
lock instead of ->elevator_lock.


Thanks,
Ming


      reply	other threads:[~2025-04-22  9:33 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-18 16:36 [PATCH V2 00/20] block: unify elevator changing and fix lockdep warning Ming Lei
2025-04-18 16:36 ` [PATCH V2 01/20] block: move blk_mq_add_queue_tag_set() after blk_mq_map_swqueue() Ming Lei
2025-04-19  8:57   ` Yu Kuai
2025-04-19 10:25   ` Nilay Shroff
2025-04-21 12:31   ` Christoph Hellwig
2025-04-22  6:00   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 02/20] block: move ELEVATOR_FLAG_DISABLE_WBT as request queue flag Ming Lei
2025-04-19  8:59   ` Yu Kuai
2025-04-19 14:52   ` Nilay Shroff
2025-04-21 12:33   ` Christoph Hellwig
2025-04-22  6:01   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 03/20] block: don't call freeze queue in elevator_switch() and elevator_disable() Ming Lei
2025-04-19  9:01   ` Yu Kuai
2025-04-21 12:34   ` Christoph Hellwig
2025-04-22  4:00     ` Ming Lei
2025-04-22  7:00       ` Christoph Hellwig
2025-04-22  6:02   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 04/20] block: add two helpers for registering/un-registering sched debugfs Ming Lei
2025-04-19  9:18   ` Yu Kuai
2025-04-19 10:27   ` Nilay Shroff
2025-04-21 12:36   ` Christoph Hellwig
2025-04-22  6:04   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 05/20] block: move sched debugfs register into elvevator_register_queue Ming Lei
2025-04-19  9:28   ` Yu Kuai
2025-04-19 10:28   ` Nilay Shroff
2025-04-21 12:37   ` Christoph Hellwig
2025-04-22  6:06   ` Hannes Reinecke
2025-04-18 16:36 ` [PATCH V2 06/20] block: add & pass 'struct gendisk_data' for retrying add/del disk in updating nr_hw_queues Ming Lei
2025-04-18 16:36 ` [PATCH V2 07/20] block: prevent adding/deleting disk during " Ming Lei
2025-04-19 11:20   ` Nilay Shroff
2025-04-21  2:39     ` Ming Lei
2025-04-22  7:04       ` Christoph Hellwig
2025-04-22  9:29         ` Ming Lei
2025-04-22 10:26           ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 08/20] block: don't allow to switch elevator if updating nr_hw_queues is in-progress Ming Lei
2025-04-19 11:22   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 09/20] block: simplify elevator rebuild for updating nr_hw_queues Ming Lei
2025-04-19 11:25   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 10/20] block: add helper of elevator_change() Ming Lei
2025-04-18 16:36 ` [PATCH V2 11/20] block: move blk_unregister_queue() & device_del() after freeze wait Ming Lei
2025-04-19 12:38   ` Nilay Shroff
2025-04-22 10:50     ` Ming Lei
2025-04-18 16:36 ` [PATCH V2 12/20] block: add `struct elv_change_ctx` for unifying elevator_change Ming Lei
2025-04-19 12:40   ` Nilay Shroff
2025-04-22  7:07   ` Christoph Hellwig
2025-04-22  8:36     ` Ming Lei
2025-04-23  7:10       ` Christoph Hellwig
2025-04-18 16:36 ` [PATCH V2 13/20] block: unifying elevator change Ming Lei
2025-04-19 13:26   ` Nilay Shroff
2025-04-24  9:02     ` Ming Lei
2025-04-18 16:36 ` [PATCH V2 14/20] block: pass elevator_queue to elv_register_queue & unregister_queue Ming Lei
2025-04-19 13:28   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 15/20] block: fail to show/store elevator sysfs attribute if elevator is dying Ming Lei
2025-04-19 13:45   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 16/20] block: move elv_register[unregister]_queue out of elevator_lock Ming Lei
2025-04-19 13:55   ` Nilay Shroff
2025-04-22 10:53     ` Ming Lei
2025-04-22 11:06       ` Nilay Shroff
2025-04-22 11:14         ` Ming Lei
2025-04-18 16:36 ` [PATCH V2 17/20] block: move debugfs/sysfs register out of freezing queue Ming Lei
2025-04-19 13:57   ` Nilay Shroff
2025-04-18 16:36 ` [PATCH V2 18/20] block: remove several ->elevator_lock Ming Lei
2025-04-19 14:17   ` Nilay Shroff
2025-04-18 16:37 ` [PATCH V2 19/20] block: move hctx cpuhp add/del out of queue freezing Ming Lei
2025-04-19 14:20   ` Nilay Shroff
2025-04-18 16:37 ` [PATCH V2 20/20] block: move wbt_enable_default() out of queue freezing from sched ->exit() Ming Lei
2025-04-19 14:39   ` Nilay Shroff
2025-04-21  7:27     ` Ming Lei
2025-04-22  6:14       ` Nilay Shroff
2025-04-22  8:13         ` Ming Lei
2025-04-22  9:27           ` Nilay Shroff
2025-04-22  9:33             ` Ming Lei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aAdiT3j5ptl_NdZm@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=nilay@linux.ibm.com \
    --cc=shinichiro.kawasaki@wdc.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.