public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Yu Kuai <yukuai@fnnas.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Nilay Shroff <nilay@linux.ibm.com>,
	Yu Kuai <yukuai3@huawei.com>, Guangwu Zhang <guazhang@redhat.com>
Subject: Re: [PATCH] block: fix race between wbt_enable_default and IO submission
Date: Fri, 12 Dec 2025 15:47:02 +0800	[thread overview]
Message-ID: <aTvIdnk3Pl-L-4s_@fedora> (raw)
In-Reply-To: <52b5f2a4-e5e1-4917-8620-490fde89cfe7@fnnas.com>

On Fri, Dec 12, 2025 at 01:56:20PM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2025/12/10 17:10, Ming Lei 写道:
> > When wbt_enable_default() is moved out of queue freezing in elevator_change(),
> > it can cause the wbt inflight counter to become negative (-1), leading to hung
> > tasks in the writeback path. Tasks get stuck in wbt_wait() because the counter
> > is in an inconsistent state.
> >
> > The issue occurs because wbt_enable_default() could race with IO submission,
> > allowing the counter to be decremented before proper initialization. This manifests
> > as:
> >
> >    rq_wait[0]:
> >      inflight:             -1
> >      has_waiters:        True
> >
> > And results in hung task warnings like:
> >    task:kworker/u24:39 state:D stack:0 pid:14767
> >    Call Trace:
> >      rq_qos_wait+0xb4/0x150
> >      wbt_wait+0xa9/0x100
> >      __rq_qos_throttle+0x24/0x40
> >      blk_mq_submit_bio+0x672/0x7b0
> >      ...
> >
> > Fix this by:
> >
> > 1. Splitting wbt_enable_default() into:
> >     - __wbt_enable_default(): Returns true if wbt_init() should be called
> >     - wbt_enable_default(): Wrapper for existing callers (no init)
> >     - wbt_init_enable_default(): New function that checks and inits WBT
> >
> > 2. Using wbt_init_enable_default() in blk_register_queue() to ensure
> >     proper initialization during queue registration
> >
> > 3. Move wbt_init() out of wbt_enable_default() which is only for enabling
> >     disabled wbt from bfq and iocost, and wbt_init() isn't needed. Then the
> >     original lock warning can be avoided.
> >
> > 4. Removing the ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT flag and its handling
> >     code since it's no longer needed
> >
> > This ensures WBT is properly initialized before any IO can be submitted,
> > preventing the counter from going negative.
> >
> > Cc: Nilay Shroff <nilay@linux.ibm.com>
> > Cc: Yu Kuai <yukuai3@huawei.com>
> > Cc: Guangwu Zhang <guazhang@redhat.com>
> > Fixes: 78c271344b6f ("block: move wbt_enable_default() out of queue freezing from sched ->exit()")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   block/bfq-iosched.c |  2 +-
> >   block/blk-sysfs.c   |  2 +-
> >   block/blk-wbt.c     | 20 ++++++++++++++++----
> >   block/blk-wbt.h     |  5 +++++
> >   block/elevator.c    |  4 ----
> >   block/elevator.h    |  1 -
> >   6 files changed, 23 insertions(+), 11 deletions(-)
> >
> > diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> > index 4a8d3d96bfe4..6e54b1d3d8bc 100644
> > --- a/block/bfq-iosched.c
> > +++ b/block/bfq-iosched.c
> > @@ -7181,7 +7181,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
> >   
> >   	blk_stat_disable_accounting(bfqd->queue);
> >   	blk_queue_flag_clear(QUEUE_FLAG_DISABLE_WBT_DEF, bfqd->queue);
> > -	set_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &e->flags);
> > +	wbt_enable_default(bfqd->queue->disk);
> >   
> >   	kfree(bfqd);
> >   }
> > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> > index 8684c57498cc..e0a70d26972b 100644
> > --- a/block/blk-sysfs.c
> > +++ b/block/blk-sysfs.c
> > @@ -932,7 +932,7 @@ int blk_register_queue(struct gendisk *disk)
> >   		elevator_set_default(q);
> >   
> >   	blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);
> > -	wbt_enable_default(disk);
> > +	wbt_init_enable_default(disk);
> >   
> >   	/* Now everything is ready and send out KOBJ_ADD uevent */
> >   	kobject_uevent(&disk->queue_kobj, KOBJ_ADD);
> > diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> > index eb8037bae0bd..0974875f77bd 100644
> > --- a/block/blk-wbt.c
> > +++ b/block/blk-wbt.c
> > @@ -699,7 +699,7 @@ static void wbt_requeue(struct rq_qos *rqos, struct request *rq)
> >   /*
> >    * Enable wbt if defaults are configured that way
> >    */
> > -void wbt_enable_default(struct gendisk *disk)
> > +static bool __wbt_enable_default(struct gendisk *disk)
> >   {
> >   	struct request_queue *q = disk->queue;
> >   	struct rq_qos *rqos;
> > @@ -716,19 +716,31 @@ void wbt_enable_default(struct gendisk *disk)
> >   		if (enable && RQWB(rqos)->enable_state == WBT_STATE_OFF_DEFAULT)
> >   			RQWB(rqos)->enable_state = WBT_STATE_ON_DEFAULT;
> 
> Is this problem due to above state? Changing the state is not under queue frozen.
> The commit message is not quite clear to me. If so, the changes looks good to me,
> by setting the state with queue frozen.

Yes, rwb_enabled() checks the state, which can be update exactly between wbt_wait()
(rq_qos_throttle()) and wbt_track()(rq_qos_track), then the inflight counter will
become negative.


Thanks,
Ming


  reply	other threads:[~2025-12-12  7:47 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-10  9:10 [PATCH] block: fix race between wbt_enable_default and IO submission Ming Lei
2025-12-12  5:56 ` Yu Kuai
2025-12-12  7:47   ` Ming Lei [this message]
2025-12-12 13:40     ` Nilay Shroff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aTvIdnk3Pl-L-4s_@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=guazhang@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=nilay@linux.ibm.com \
    --cc=yukuai3@huawei.com \
    --cc=yukuai@fnnas.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox