From: Nilay Shroff <nilay@linux.ibm.com>
To: yangerkun <yangerkun@huawei.com>,
Zheng Qixing <zhengqixing@huaweicloud.com>,
Greg KH <gregkh@linuxfoundation.org>
Cc: cve@kernel.org, linux-kernel@vger.kernel.org, yukuai@fnnas.com,
ming.lei@redhat.com, "zhangyi (F)" <yi.zhang@huawei.com>,
Hou Tao <houtao1@huawei.com>
Subject: Re: CVE-2025-40146: blk-mq: fix potential deadlock while nr_requests grown
Date: Fri, 28 Nov 2025 12:45:19 +0530 [thread overview]
Message-ID: <8ad23abf-b5d2-4a16-a8bd-28df4ab6823f@linux.ibm.com> (raw)
In-Reply-To: <9ef3f966-c9bc-2da0-3fbf-a926eee740c9@huawei.com>
>
> commit b86433721f46d934940528f28d49c1dedb690df1 (HEAD -> master)
> Author: Yu Kuai <yukuai3@huawei.com>
> Date: Wed Sep 10 16:04:43 2025 +0800
>
> blk-mq: fix potential deadlock while nr_requests grown
>
> Allocate and free sched_tags while queue is freezed can deadlock[1],
> this is a long term problem, hence allocate memory before freezing
> queue and free memory after queue is unfreezed.
>
> [1] https://lore.kernel.org/all/0659ea8d-a463-47c8-9180-43c719e106eb@linux.ibm.com/
> Fixes: e3a2b3f931f5 ("blk-mq: allow changing of queue depth through sysfs")
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> Our understanding was that the problem Yu describes is this: when we
> update nr_requests, we may need to allocate memory (if nr_requests
> grows), and that allocation may trigger memory reclaim and hence
> further I/O; since the request_queue is already frozen at that point,
> we would deadlock.
>
> But after checking the source code, there is the call chain
> queue_requests_store() -> blk_mq_freeze_queue() -> memalloc_noio_save(),
> so the whole path that may allocate memory runs under GFP_NOIO and
> cannot trigger further I/O. So the deadlock cannot happen... and if
> that is true, this patch does not fix any real problem.
>
Yes, memalloc_noio_save() is invoked before we freeze the queue (e.g., in
elv_iosched_store()), but that does not prevent the deadlock scenario described
in the lockdep splat.
If you look closely at the splat, the problematic lock is not fs_reclaim (which
may be the first impression), but rather ->pcpu_alloc_mutex. From the splat, the
chain of dependencies looks like this:
thread #0: blocked on q->elevator_lock
thread #1: blocked on ->pcpu_alloc_mutex
thread #2: blocked on fs_reclaim
Here is the key detail:
Thread #0 is running under GFP_NOIO scope (due to memalloc_noio_save()).
However, it is not blocked on fs_reclaim. Instead, it is blocked
on ->elevator_lock.
Thread #1 is also running with GFP_NOIO and holds ->elevator_lock
while the queue is frozen. It is blocked on ->pcpu_alloc_mutex,
which is already held by Thread #2 (the thread that is stuck in
fs_reclaim). Thread #2 is running without GFP_NOIO scope.
In other words:
- GFP_NOIO prevents a thread from entering fs_reclaim, but it does
not prevent triggering per-CPU memory allocations, which require
taking ->pcpu_alloc_mutex.
- This ->pcpu_alloc_mutex is the actual source of contention in the
splat, and it sits outside the protections offered by GFP_NOIO.
That means:
- Even though memalloc_noio_save() avoids fs reclaim recursion,
it does not prevent per-CPU allocations from blocking, and thus
it cannot prevent the deadlock involving ->pcpu_alloc_mutex.
So the reasoning that “memalloc_noio_save() prevents deadlock” is
incomplete — GFP_NOIO only handles reclaim, not per-CPU allocations.
This is why the patch is still needed: even under GFP_NOIO scope,
allocating scheduler tags while the queue is frozen can still create
the circular dependency involving ->pcpu_alloc_mutex, which the splat
clearly shows.
If you look at the reasoning described in commit f5a6604f7a44 (“block:
fix lockdep warning caused by lock dependency in elv_iosched_store”) and
commit 04225d13aef1 (“block: fix potential deadlock while running
nr_hw_queue update”), both explicitly explain that the goal is to break
the dependency chain between the percpu allocator lock and the elevator lock.
The third commit, b86433721f46 (“blk-mq: fix potential deadlock while
nr_requests grown”), is less explicit in its explanation, but it addresses
the same underlying issue.
Taken together, all these commits resolve the deadlock described in the
lockdep splat.
Thanks,
--Nilay
Thread overview: 9+ messages
[not found] <2025111256-CVE-2025-40146-b919@gregkh>
2025-11-27 13:22 ` CVE-2025-40146: blk-mq: fix potential deadlock while nr_requests grown Zheng Qixing
2025-11-27 13:39 ` Greg KH
2025-11-28 1:15 ` Zheng Qixing
2025-11-28 5:20 ` Nilay Shroff
2025-11-28 6:33 ` yangerkun
2025-11-28 7:15 ` Nilay Shroff [this message]
2025-11-28 9:44 ` Zheng Qixing
2025-11-29 3:52 ` Zheng Qixing
2025-11-29 4:01 ` Zheng Qixing