Re: [QUESTION] blk_mq_freeze_queue in elevator_init_mq

From: Ming Lei <ming.lei@redhat.com>
To: yangerkun <yangerkun@huawei.com>
Cc: damien.lemoal@wdc.com, axboe@kernel.dk,
	miquel.raynal@bootlin.com, richard@nod.at, vigneshr@ti.com,
	linux-block@vger.kernel.org, linux-mtd@lists.infradead.org,
	yi.zhang@huawei.com, yebin10@huawei.com, houtao1@huawei.com
Subject: Re: [QUESTION] blk_mq_freeze_queue in elevator_init_mq
Date: Wed, 17 Nov 2021 16:06:45 +0800
Message-ID: <YZS4FYxtxYAXjtFJ@T590>
In-Reply-To: <d9113bf8-4654-cb04-f79c-38e11493cb2c@huawei.com>

On Wed, Nov 17, 2021 at 11:37:13AM +0800, yangerkun wrote:
> Nowadays we see a boot regression when enabling lots of mtdblock devices

What is your boot regression? Any dmesg log?

> compared with 4.4. The main reason is that blk_mq_freeze_queue in
> elevator_init_mq waits for an RCU grace period, which is meant to make
> sure no IO happens during blk_mq_init_sched.

There is no RCU grace period implied in the blk_mq_freeze_queue() called
from elevator_init_mq(), because .q_usage_counter works in atomic mode
at that time.
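
To illustrate the point about atomic mode, here is a minimal userspace
sketch. It is a toy model, not kernel code: `struct toy_ref` and the
`toy_ref_*` helpers are hypothetical names invented for this example, and
the "grace period" is only a counter increment. It assumes the usual
percpu_ref behaviour, where only a counter running in per-CPU mode has to
fall back to atomic mode (and wait for an RCU grace period) before it can
be drained.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy userspace model of a percpu_ref-style counter; not kernel code. */
struct toy_ref {
	bool percpu_mode;     /* true once the fast per-CPU path is enabled */
	atomic_long count;    /* shared counter, authoritative in atomic mode */
	int grace_periods;    /* simulated RCU grace periods waited for so far */
};

/* Start in atomic mode holding one initial reference. */
static void toy_ref_init(struct toy_ref *ref)
{
	ref->percpu_mode = false;
	atomic_init(&ref->count, 1);
	ref->grace_periods = 0;
}

/* Models a later switch to per-CPU mode. */
static void toy_ref_switch_to_percpu(struct toy_ref *ref)
{
	ref->percpu_mode = true;
}

/*
 * Models the start of a freeze: drop the initial reference. Only a
 * counter in per-CPU mode must first fall back to atomic mode, and that
 * fallback is the step that waits for a grace period.
 */
static void toy_ref_kill(struct toy_ref *ref)
{
	if (ref->percpu_mode) {
		ref->grace_periods++;
		ref->percpu_mode = false;
	}
	atomic_fetch_sub(&ref->count, 1);
}

/* The freeze completes once no references remain. */
static bool toy_ref_is_zero(struct toy_ref *ref)
{
	return atomic_load(&ref->count) == 0;
}
```

In this model, calling toy_ref_kill() right after toy_ref_init() leaves
grace_periods at 0, which corresponds to the elevator_init_mq() case
described above. Calling toy_ref_switch_to_percpu() first makes the same
kill cost one simulated grace period.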

> 
> Other modules like loop hit this problem too, and it has been fixed with

Again, what is the problem?

> the following patches:
> 
>  2112f5c1330a loop: Select I/O scheduler 'none' from inside add_disk()
>  90b7198001f2 blk-mq: Introduce the BLK_MQ_F_NO_SCHED_BY_DEFAULT flag
> 
> They change the default IO scheduler for loop to 'none', so there is no
> need to call blk_mq_freeze_queue and blk_mq_init_sched. But that does not
> seem appropriate for mtdblock. Mtdblock can use 'mq-deadline' to help
> optimize random writes with the help of mtdblock's cache. If we change
> to 'none', we may see a regression for random writes.
> 
> commit 737eb78e82d52d35df166d29af32bf61992de71d
> Author: Damien Le Moal <damien.lemoal@wdc.com>
> Date:   Thu Sep 5 18:51:33 2019 +0900
> 
>     block: Delay default elevator initialization
> 
>     ...
> 
>     Additionally, to make sure that the elevator initialization is never
>     done while requests are in-flight (there should be none when the device
>     driver calls device_add_disk()), freeze and quiesce the device request
>     queue before calling blk_mq_init_sched() in elevator_init_mq().
>     ...
> 
> This commit adds blk_mq_freeze_queue to elevator_init_mq, which tries to
> make sure there are no in-flight requests while we go through
> blk_mq_init_sched. But can any driver leave IO alive while we go through
> elevator_init_mq? If not, maybe we can just remove this logic to
> fix the regression...

SCSI should have passthrough requests at that moment.

Thanks,
Ming