From: Nilay Shroff <nilay@linux.ibm.com>
To: Yu Kuai <yukuai1@huaweicloud.com>, linux-block@vger.kernel.org
Cc: yi.zhang@redhat.com, ming.lei@redhat.com, hch@lst.de,
axboe@kernel.dk, shinichiro.kawasaki@wdc.com, gjoyce@ibm.com,
"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCHv2] block: restore two stage elevator switch while running nr_hw_queue update
Date: Wed, 23 Jul 2025 16:35:33 +0530
Message-ID: <19972ca9-804e-407b-a784-ba2566bc907a@linux.ibm.com>
In-Reply-To: <a707901a-1d21-313f-0456-01f419181f2c@huaweicloud.com>
On 7/23/25 12:28 PM, Yu Kuai wrote:
>>>>> BTW, this is not related to this patch. Should we handle fall_back
>>>>> failure like blk_mq_sysfs_register_hctxs()?
>>>>>
>>>> OKay I guess you meant here handle failure case by unwinding the
>>>> queue instead of looping through it from start to end. If yes, then
>>>> it could be done but again we may not want to do it the bug fix patch.
>>>>
>>>
>>> Not like that, actually I don't have any ideas for now, the hctxs is
>>> unregistered first, and if register failed, for example, due to -ENOMEM,
>>> I can't find a way to fallback :(
>>>
>> If registering new hctxs fails, we fall back to the previous value of
>> nr_hw_queues (prev_nr_hw_queues). When prev_nr_hw_queues is less than
>> the new nr_hw_queues, we do not reallocate memory for the existing hctxs—
>> instead, we reuse the memory that was already allocated.
>>
>> Memory allocation is only attempted for the additional hctxs beyond
>> prev_nr_hw_queues. Therefore, if memory allocation for these new hctxs
>> fails, we can safely fall back to prev_nr_hw_queues because the memory
>> of the previously allocated hctxs remains intact.
>
> No, like I said before, blk_mq_sysfs_unregister_hctxs() will free memory
> by kobject_del() for hctx->kobj and ctx->kobj, and
> __blk_mq_update_nr_hw_queues() call that helper in the beginning.
> And later in the fall back code, blk_mq_sysfs_register_hctxs() can fail
> by memory allocation in kobject_add(), however, the return value is not
> checked.
>
This can be handled by checking the kobject's sysfs state, kobj->state_in_sysfs.
If kobj->state_in_sysfs is 1, then kobject_add() succeeded for this kobj and we
can safely call kobject_del() on it; otherwise we skip it. A few places in the
kernel already use this trick; see sysfs_slab_unlink(), for instance. So, IMO,
a similar technique could be used for hctx->kobj and ctx->kobj as well, when we
delete these kobjects while unregistering the queue and during an nr_hw_queues
update.
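
To make the idea concrete, here is a rough userspace sketch of the guard. It is
not kernel code and not a proposed patch: struct kobject is reduced to the one
bit that matters, mock_kobject_add()/mock_kobject_del() are stand-ins for the
real kobject_add()/kobject_del(), and hctx_kobj_del_if_added() and
demo_unregister_after_partial_add() are hypothetical names used only for
illustration. The assert in the mock delete is a modelling assumption that
deleting a never-added kobject is the case to avoid.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct kobject: only the bit we need. */
struct kobject {
	const char *name;
	unsigned int state_in_sysfs:1;
};

static int nr_deleted;	/* number of mock deletions performed */

/* Mock of kobject_add(): the bit is set only when the add succeeds. */
static int mock_kobject_add(struct kobject *kobj, bool simulate_enomem)
{
	if (simulate_enomem)
		return -ENOMEM;
	kobj->state_in_sysfs = 1;
	return 0;
}

/* Mock of kobject_del(): in this model, deleting a never-added kobject is a bug. */
static void mock_kobject_del(struct kobject *kobj)
{
	assert(kobj->state_in_sysfs);
	kobj->state_in_sysfs = 0;
	nr_deleted++;
}

/* Hypothetical helper: delete the kobject only if it was really added. */
static void hctx_kobj_del_if_added(struct kobject *kobj)
{
	if (kobj->state_in_sysfs)
		mock_kobject_del(kobj);
}

/*
 * Add four kobjects, letting the one at fail_idx fail (-1 means no failure),
 * and ignore the return value just like the fallback path discussed above.
 * Then unregister all of them through the guard and return how many were
 * actually deleted.
 */
static int demo_unregister_after_partial_add(int fail_idx)
{
	struct kobject kobjs[4] = {
		{ "hctx0" }, { "hctx1" }, { "hctx2" }, { "hctx3" },
	};
	int i;

	nr_deleted = 0;
	for (i = 0; i < 4; i++)
		mock_kobject_add(&kobjs[i], i == fail_idx);
	for (i = 0; i < 4; i++)
		hctx_kobj_del_if_added(&kobjs[i]);
	return nr_deleted;
}
```

With the guard in place, the unregister path in this model stays safe even when
an earlier add failed and nobody looked at its return value: the kobject that
never made it into sysfs is simply skipped.
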
Thanks,
--Nilay
Thread overview:
2025-07-18 13:32 [PATCHv2] block: restore two stage elevator switch while running nr_hw_queue update Nilay Shroff
2025-07-20 12:19 ` Ming Lei
2025-07-20 14:25 ` Nilay Shroff
2025-07-22 2:21 ` Yu Kuai
2025-07-22 11:27 ` Nilay Shroff
2025-07-23 0:37 ` Yu Kuai
2025-07-23 6:24 ` Nilay Shroff
2025-07-23 6:58 ` Yu Kuai
2025-07-23 11:05 ` Nilay Shroff [this message]
2025-07-25 1:15 ` Yu Kuai