From: Ming Lei <ming.lei@redhat.com>
To: Li Nan <linan666@huaweicloud.com>
Cc: Yu Kuai <yukuai1@huaweicloud.com>,
axboe@kernel.dk, jianchao.w.wang@oracle.com,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
yangerkun@huawei.com, yi.zhang@huawei.com,
"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH] blk-mq: check kobject state_in_sysfs before deleting in blk_mq_unregister_hctx
Date: Thu, 28 Aug 2025 20:08:57 +0800
Message-ID: <aLBG2VCNZEnSYxx9@fedora>
In-Reply-To: <fc587a1a-97fb-584c-c17c-13bb5e3d7a92@huaweicloud.com>
On Thu, Aug 28, 2025 at 05:28:26PM +0800, Li Nan wrote:
>
>
> On 2025/8/27 16:10, Ming Lei wrote:
> > On Wed, Aug 27, 2025 at 11:22:06AM +0800, Li Nan wrote:
> > >
> > >
> > > On 2025/8/27 9:35, Ming Lei wrote:
> > > > On Wed, Aug 27, 2025 at 09:04:45AM +0800, Yu Kuai wrote:
> > > > > Hi,
> > > > >
> > > > > On 2025/08/27 8:58, Ming Lei wrote:
> > > > > > On Tue, Aug 26, 2025 at 04:48:54PM +0800, linan666@huaweicloud.com wrote:
> > > > > > > From: Li Nan <linan122@huawei.com>
> > > > > > >
> > > > > > > In __blk_mq_update_nr_hw_queues() the return value of
> > > > > > > blk_mq_sysfs_register_hctxs() is not checked. If sysfs creation for hctx
> > > > > >
> > > > > > Looks we should check its return value and handle the failure in both
> > > > > > the call site and blk_mq_sysfs_register_hctxs().
> > > > >
> > > > > From __blk_mq_update_nr_hw_queues(), the old hctxs is already
> > > > > unregistered, and this function is void, we failed to register new hctxs
> > > > > because of memory allocation failure. I really don't know how to handle
> > > > > the failure here, do you have any suggestions?
> > > >
> > > > It is out of memory, I think it is fine to do whatever to leave queue state
> > > > intact instead of making it `partial workable`, such as:
> > > >
> > > > - try update nr_hw_queues to 1
> > > >
> > > > - if it still fails, delete disk & mark queue as dead if disk is attached
> > > >
> > >
> > > If we ignore these non-critical sysfs creation failures, the disk remains
> > > usable with no loss of functionality. Deleting the disk seems to escalate
> > > the error?
> >
> > It is more like a workaround by ignoring the sysfs register failure. And if
> > the issue need to be fixed in this way, you have to document it.
> >
> > In case of OOM, it usually means that the system isn't usable any more.
> > But it is NOIO allocation and the typical use case is for error recovery in
> > nvme pci, so there may not be enough pages for noio allocation only. That is
> > the reason for ignoring sysfs register in blk_mq_update_nr_hw_queues()?
> >
> > But NVMe has been pretty fragile in this area by using non-owner queue
> > freeze, and call blk_mq_update_nr_hw_queues() on frozen queue, so it is
> > really necessary to take it into account?
>
> I agree with your points about NOIO and NVMe.
>
> I hit this issue in null_blk during fuzz testing with memory-fault
> injection. Changing the number of hardware queues under OOM is extremely
> rare in real-world usage. So I think adding a workaround and documenting it
> is sufficient. What do you think?

Looks fine to me.

Thanks,
Ming