From: Ming Lei <ming.lei@redhat.com>
To: Yu Kuai <yukuai@kernel.org>
Cc: Nilay Shroff <nilay@linux.ibm.com>,
Yu Kuai <yukuai1@huaweicloud.com>,
axboe@kernel.dk, hare@suse.de, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, yukuai3@huawei.com,
yi.zhang@huawei.com, yangerkun@huawei.com,
johnny.chenyi@huawei.com
Subject: Re: [PATCH 08/10] blk-mq: fix blk_mq_tags double free while nr_requests grown
Date: Sat, 16 Aug 2025 12:05:15 +0800
Message-ID: <aKADe9hNz99dQTfy@fedora>
In-Reply-To: <af40ef99-9b61-4725-ba77-c5d3741add99@kernel.org>
On Sat, Aug 16, 2025 at 10:57:23AM +0800, Yu Kuai wrote:
> Hi,
>
> On 2025/8/16 3:30, Nilay Shroff wrote:
> >
> > On 8/15/25 1:32 PM, Yu Kuai wrote:
> > > From: Yu Kuai <yukuai3@huawei.com>
> > >
> > > When a user grows the tag depth through the queue sysfs attribute
> > > nr_requests, hctx->sched_tags is freed directly and replaced with newly
> > > allocated tags, see blk_mq_tag_update_depth().
> > >
> > > The problem is that hctx->sched_tags comes from elevator->et->tags, and
> > > et->tags still points to the freed tags, hence a later elevator exit
> > > will try to free the tags again, causing a kernel panic.
> > >
> > > Fix this problem by using the newly allocated elevator_tags, and also
> > > convert blk_mq_update_nr_requests() to void since this helper can never
> > > fail now.
> > >
> > > Meanwhile, a long-standing problem can be fixed as well:
> > >
> > > If blk_mq_tag_update_depth() succeeds for a previous hctx, the bitmap
> > > depth is updated; however, if a following hctx fails, q->nr_requests is
> > > not updated and the previous hctx->sched_tags ends up bigger than
> > > q->nr_requests.
> > >
> > > Fixes: f5a6604f7a44 ("block: fix lockdep warning caused by lock dependency in elv_iosched_store")
> > > Fixes: e3a2b3f931f5 ("blk-mq: allow changing of queue depth through sysfs")
> > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > > ---
> > > block/blk-mq.c | 19 ++++++-------------
> > > block/blk-mq.h | 4 +++-
> > > block/blk-sysfs.c | 21 ++++++++++++++-------
> > > 3 files changed, 23 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index 11c8baebb9a0..e9f037a25fe3 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -4917,12 +4917,12 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set)
> > >  }
> > >  EXPORT_SYMBOL(blk_mq_free_tag_set);
> > >
> > > -int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
> > > +void blk_mq_update_nr_requests(struct request_queue *q,
> > > +			       struct elevator_tags *et, unsigned int nr)
> > >  {
> > >  	struct blk_mq_tag_set *set = q->tag_set;
> > >  	struct blk_mq_hw_ctx *hctx;
> > >  	unsigned long i;
> > > -	int ret = 0;
> > >
> > >  	blk_mq_quiesce_queue(q);
> > >
> > > @@ -4946,24 +4946,17 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
> > >  				nr - hctx->sched_tags->nr_reserved_tags);
> > >  		}
> > >  	} else {
> > > -		queue_for_each_hw_ctx(q, hctx, i) {
> > > -			if (!hctx->tags)
> > > -				continue;
> > > -			ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
> > > -						      nr);
> > > -			if (ret)
> > > -				goto out;
> > > -		}
> > > +		blk_mq_free_sched_tags(q->elevator->et, set);
> > I think you also need to ensure that the elevator tags are freed after we
> > unfreeze the queue and release ->elevator_lock; otherwise we may hit a
> > lockdep splat for the pcpu_lock dependency on ->freeze_lock and/or
> > ->elevator_lock. Please note that blk_mq_free_sched_tags() internally
> > invokes sbitmap_free(), which invokes free_percpu(), which acquires
> > pcpu_lock.
>
> OK, thanks for the notice. However, as Ming suggested, we might fix this
> problem in the next merge window.
There are two issues involved:

- blk_mq_tags double free, introduced recently
- long-standing lock issue in queue_requests_store()

IMO, the former is fairly serious, because it can trigger a kernel panic,
so I suggest getting the fix into v6.17. The latter looks less serious and
has existed for a long time, but it may need some code refactoring to get
a clean fix.
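
To make the first issue concrete, here is a minimal userspace sketch of the
aliasing problem. It is only a model: tags_model, et_model, hctx_model and
the grow/exit helpers are hypothetical stand-ins, not the kernel structures
or functions, and frees are counted instead of performed so the double free
can be observed safely.

```c
#include <assert.h>

#define POOL_SIZE 8

/* Hypothetical stand-in for struct blk_mq_tags. */
struct tags_model {
	unsigned int depth;
	int free_count;		/* how many times model_free() saw this object */
};

/* Hypothetical owner (like et->tags) and alias (like hctx->sched_tags). */
struct et_model { struct tags_model *tags; };
struct hctx_model { struct tags_model *sched_tags; };

static struct tags_model pool[POOL_SIZE];
static int pool_next;

static struct tags_model *model_alloc(unsigned int depth)
{
	struct tags_model *t;

	assert(pool_next < POOL_SIZE);
	t = &pool[pool_next++];
	t->depth = depth;
	t->free_count = 0;
	return t;
}

/* Count the free instead of releasing memory. */
static void model_free(struct tags_model *t)
{
	t->free_count++;
}

/* Buggy grow: frees and replaces the alias only, the owner stays stale. */
static void grow_buggy(struct et_model *et, struct hctx_model *hctx,
		       unsigned int nr)
{
	(void)et;
	model_free(hctx->sched_tags);
	hctx->sched_tags = model_alloc(nr);
}

/* Fixed grow: the owner is replaced and the alias follows it. */
static void grow_fixed(struct et_model *et, struct hctx_model *hctx,
		       unsigned int nr)
{
	model_free(et->tags);
	et->tags = model_alloc(nr);
	hctx->sched_tags = et->tags;
}

/* Elevator exit frees whatever the owner points to. */
static void elevator_exit_model(struct et_model *et)
{
	model_free(et->tags);
}

/* Returns the highest number of frees seen by any single object. */
static int run_scenario(int fixed)
{
	struct et_model et;
	struct hctx_model hctx;
	int i, worst = 0;

	pool_next = 0;
	et.tags = model_alloc(64);
	hctx.sched_tags = et.tags;

	if (fixed)
		grow_fixed(&et, &hctx, 128);
	else
		grow_buggy(&et, &hctx, 128);
	elevator_exit_model(&et);

	for (i = 0; i < pool_next; i++)
		if (pool[i].free_count > worst)
			worst = pool[i].free_count;
	return worst;
}
```

In this model run_scenario(0) reports an object freed twice, while
run_scenario(1) reports at most one free per object.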
> I'll send one patch to fix this regression by replacing st->tags with the
> newly reallocated sched_tags as well.
Patch 7 in this patchset and patch 8 in your 1st post look sufficient to
fix this double free issue.
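
As a rough illustration of the ordering discussed in this thread (allocate
first, swap under the lock, free after the lock is dropped), here is a
userspace sketch. It is not the code from patch 7 or patch 8: queue_model,
its "locked" flag (standing in for the frozen queue plus ->elevator_lock)
and every helper name below are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct blk_mq_tags. */
struct tags_model {
	unsigned int depth;
};

/* Hypothetical stand-in for the request queue state involved. */
struct queue_model {
	bool locked;			/* models freeze + ->elevator_lock */
	unsigned int nr_requests;
	struct tags_model *sched_tags;
	int frees_under_lock;		/* counts frees done while "locked" */
};

static struct tags_model *model_alloc_tags(unsigned int depth)
{
	struct tags_model *t = malloc(sizeof(*t));

	if (t)
		t->depth = depth;
	return t;
}

static void model_free_tags(struct queue_model *q, struct tags_model *t)
{
	if (q->locked)
		q->frees_under_lock++;	/* the ordering we want to avoid */
	free(t);
}

/*
 * Returns 0 on success, -1 if the allocation failed. On failure nothing
 * in @q has been modified, so depth and nr_requests cannot diverge.
 */
static int model_update_nr_requests(struct queue_model *q, unsigned int nr,
				    bool inject_alloc_failure)
{
	struct tags_model *new_tags, *old_tags;

	assert(q && q->sched_tags);

	/* Step 1: the only step that can fail runs before any change. */
	new_tags = inject_alloc_failure ? NULL : model_alloc_tags(nr);
	if (!new_tags)
		return -1;

	/* Step 2: swap under the lock; this part cannot fail. */
	q->locked = true;
	old_tags = q->sched_tags;
	q->sched_tags = new_tags;
	q->nr_requests = nr;
	q->locked = false;

	/* Step 3: release the old tags only after the lock is dropped. */
	model_free_tags(q, old_tags);
	return 0;
}
```

With this shape, a failed allocation leaves both the depth and nr_requests
untouched, and frees_under_lock stays at zero after a successful update.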
Thanks,
Ming
Thread overview: 23+ messages
2025-08-15 8:02 [PATCH 00/10] blk-mq: fix blk_mq_tags double free while nr_requests grown Yu Kuai
2025-08-15 8:02 ` [PATCH 01/10] blk-mq: remove useless checking from queue_requests_store() Yu Kuai
2025-08-15 8:02 ` [PATCH 02/10] blk-mq: remove useless checkings from blk_mq_update_nr_requests() Yu Kuai
2025-08-15 8:02 ` [PATCH 03/10] blk-mq: check invalid nr_requests in queue_requests_store() Yu Kuai
2025-08-15 8:02 ` [PATCH 04/10] blk-mq: serialize updating nr_requests with update_nr_hwq_lock Yu Kuai
2025-08-15 14:47 ` Ming Lei
2025-08-16 0:49 ` Yu Kuai
2025-08-16 2:23 ` Ming Lei
2025-08-15 8:02 ` [PATCH 05/10] blk-mq: cleanup shared tags case in blk_mq_update_nr_requests() Yu Kuai
2025-08-15 8:02 ` [PATCH 06/10] blk-mq: split bitmap grow and resize " Yu Kuai
2025-08-15 8:02 ` [PATCH 07/10] blk-mq-sched: add new parameter nr_requests in blk_mq_alloc_sched_tags() Yu Kuai
2025-08-15 8:02 ` [PATCH 08/10] blk-mq: fix blk_mq_tags double free while nr_requests grown Yu Kuai
2025-08-15 19:30 ` Nilay Shroff
2025-08-16 2:57 ` Yu Kuai
2025-08-16 4:05 ` Ming Lei [this message]
2025-08-16 8:05 ` 余快
2025-08-18 2:11 ` Ming Lei
2025-08-18 3:12 ` Ming Lei
2025-08-15 8:02 ` [PATCH 09/10] blk-mq: remove blk_mq_tag_update_depth() Yu Kuai
2025-08-15 8:02 ` [PATCH 10/10] blk-mq: fix stale nr_requests documentation Yu Kuai
2025-08-15 8:30 ` [PATCH 00/10] blk-mq: fix blk_mq_tags double free while nr_requests grown Ming Lei
2025-08-15 9:05 ` Yu Kuai
2025-08-15 14:20 ` Ming Lei