public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Yu Kuai <yukuai1@huaweicloud.com>,
	Li Nan <linan666@huaweicloud.com>, Ming Lei <ming.lei@redhat.com>
Cc: jianchao.w.wang@oracle.com, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yangerkun@huawei.com,
	yi.zhang@huawei.com, "yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH] blk-mq: check kobject state_in_sysfs before deleting in blk_mq_unregister_hctx
Date: Thu, 28 Aug 2025 19:20:30 -0600	[thread overview]
Message-ID: <6424f720-9eaa-4642-9186-c0a148995e02@kernel.dk> (raw)
In-Reply-To: <5adb469d-9e4b-e2d9-a77c-a1a4d11a49d5@huaweicloud.com>

On 8/28/25 7:09 PM, Yu Kuai wrote:
> Hi,
> 
> ? 2025/08/29 1:23, Jens Axboe ??:
>> On 8/28/25 3:28 AM, Li Nan wrote:
>>>
>>>
>>> ? 2025/8/27 16:10, Ming Lei ??:
>>>> On Wed, Aug 27, 2025 at 11:22:06AM +0800, Li Nan wrote:
>>>>>
>>>>>
>>>>> ? 2025/8/27 9:35, Ming Lei ??:
>>>>>> On Wed, Aug 27, 2025 at 09:04:45AM +0800, Yu Kuai wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> ? 2025/08/27 8:58, Ming Lei ??:
>>>>>>>> On Tue, Aug 26, 2025 at 04:48:54PM +0800, linan666@huaweicloud.com wrote:
>>>>>>>>> From: Li Nan <linan122@huawei.com>
>>>>>>>>>
>>>>>>>>> In __blk_mq_update_nr_hw_queues() the return value of
>>>>>>>>> blk_mq_sysfs_register_hctxs() is not checked. If sysfs creation for hctx
>>>>>>>>
>>>>>>>> Looks we should check its return value and handle the failure in both
>>>>>>>> the call site and blk_mq_sysfs_register_hctxs().
>>>>>>>
>>>>>>>    From __blk_mq_update_nr_hw_queues(), the old hctxs is already
>>>>>>> unregistered, and this function is void, we failed to register new hctxs
>>>>>>> because of memory allocation failure. I really don't know how to handle
>>>>>>> the failure here, do you have any suggestions?
>>>>>>
>>>>>> It is out of memory, I think it is fine to do whatever to leave queue state
>>>>>> intact instead of making it `partial workable`, such as:
>>>>>>
>>>>>> - try update nr_hw_queues to 1
>>>>>>
>>>>>> - if it still fails, delete disk & mark queue as dead if disk is attached
>>>>>>
>>>>>
>>>>> If we ignore these non-critical sysfs creation failures, the disk remains
>>>>> usable with no loss of functionality. Deleting the disk seems to escalate
>>>>> the error?
>>>>
>>>> It is more like a workaround by ignoring the sysfs register failure. And if
>>>> the issue need to be fixed in this way, you have to document it. >
>>>> In case of OOM, it usually means that the system isn't usable any more.
>>>> But it is NOIO allocation and the typical use case is for error recovery in
>>>> nvme pci, so there may not be enough pages for noio allocation only. That is
>>>> the reason for ignoring sysfs register in blk_mq_update_nr_hw_queues()?
>>>>
>>>> But NVMe has been pretty fragile in this area by using non-owner queue
>>>> freeze, and call blk_mq_update_nr_hw_queues() on frozen queue, so it is
>>>> really necessary to take it into account?
>>>
>>> I agree with your points about NOIO and NVMe.
>>>
>>> I hit this issue in null_blk during fuzz testing with memory-fault
>>> injection. Changing the number of hardware queues under OOM is
>>> extremely rare in real-world usage. So I think adding a workaround and
>>> documenting it is sufficient. What do you think?
>>
>> Working around it is fine, as it isn't a situation we really need to
>> worry about. But let's please not do it by poking at kobject internals.
>>
> 
> There is already used in someplaces like sysfs_slab_unlink().
> 
> Do we prefre add a new hctx->state like BLK_MQ_S_REGISTERED?

If it's already used in a few spots, then I guess we should just be
using it as well rather than have a state around it. So I guess it's
fine. I'll just grab the patch.

-- 
Jens Axboe

  reply	other threads:[~2025-08-29  1:20 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-26  8:48 [PATCH] blk-mq: check kobject state_in_sysfs before deleting in blk_mq_unregister_hctx linan666
2025-08-27  0:53 ` Yu Kuai
2025-08-27  0:58 ` Ming Lei
2025-08-27  1:04   ` Yu Kuai
2025-08-27  1:35     ` Ming Lei
2025-08-27  3:22       ` Li Nan
2025-08-27  8:10         ` Ming Lei
2025-08-28  9:28           ` Li Nan
2025-08-28 12:08             ` Ming Lei
2025-08-28 17:23             ` Jens Axboe
2025-08-29  1:09               ` Yu Kuai
2025-08-29  1:20                 ` Jens Axboe [this message]
2025-08-29  1:21 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6424f720-9eaa-4642-9186-c0a148995e02@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=jianchao.w.wang@oracle.com \
    --cc=linan666@huaweicloud.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai1@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox