From: Kaitao Cheng <kaitao.cheng@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
Christoph Lameter <cl@gentwo.org>,
Uladzislau Rezki <urezki@gmail.com>,
Pedro Falcato <pfalcato@suse.de>,
Vlastimil Babka <vbabka@kernel.org>,
Michal Hocko <mhocko@suse.com>,
muchun.song@linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
Kaitao Cheng <chengkaitao@kylinos.cn>
Subject: Re: [PATCH v2 3/3] mm/percpu: Avoid IO/FS reclaim in backing allocations
Date: Fri, 5 Jun 2026 16:48:35 +0800
Message-ID: <3de3a89b-92f0-4cd2-9f41-8e853eae4e78@linux.dev>
In-Reply-To: <20260604120709.445c027637b3ad72ad13279a@linux-foundation.org>
On 2026/6/5 03:07, Andrew Morton wrote:
> On Thu, 4 Jun 2026 19:31:01 +0800 Kaitao Cheng <kaitao.cheng@linux.dev> wrote:
>
>> From: Kaitao Cheng <chengkaitao@kylinos.cn>
>>
>> Commit 9a5b183941b5 ("mm, percpu: do not consider sleepable
>> allocations atomic") allows sleepable GFP_NOIO and GFP_NOFS percpu
>> allocations to take pcpu_alloc_mutex. This avoids premature allocation
>> failures, but it also makes the mutex visible to callers from constrained
>> IO/FS contexts.
>>
>> Thread A calls pcpu_alloc_noprof() with GFP_KERNEL and takes
>> pcpu_alloc_mutex. Since the internal allocation is not constrained by
>> NOFS, it may enter FS reclaim while still holding pcpu_alloc_mutex,
>> creating a dependency like: pcpu_alloc_mutex -> fs_reclaim -> FS lock
>>
>> At the same time, Thread B may already hold an FS lock and then call
>> pcpu_alloc_noprof() with GFP_NOFS. It will try to acquire
>> pcpu_alloc_mutex and block, creating the reverse dependency:
>> FS lock -> pcpu_alloc_mutex
>>
>> This can still form a potential deadlock cycle.
>>
>> Avoid the dependency by restricting percpu backing allocations to GFP_NOIO.
>> The public allocation still uses the caller's GFP context to decide whether
>> it may block, but the internal memory allocations performed while
>> pcpu_alloc_mutex is held cannot recurse into IO or FS reclaim.
>>
>> ...
>>
>> --- a/mm/percpu.c
>> +++ b/mm/percpu.c
>> @@ -1726,9 +1726,8 @@ static void pcpu_alloc_tag_free_hook(struct pcpu_chunk *chunk, int off, size_t s
>> * @gfp: allocation flags
>> *
>> * Allocate percpu area of @size bytes aligned at @align. If @gfp doesn't
>> - * contain %GFP_KERNEL, the allocation is atomic. If @gfp has __GFP_NOWARN
>> - * then no warning will be triggered on invalid or failed allocation
>> - * requests.
>> + * allow blocking, the allocation is atomic. If @gfp has __GFP_NOWARN then no
>> + * warning will be triggered on invalid or failed allocation requests.
>> *
>> * RETURNS:
>> * Percpu pointer to the allocated area on success, NULL on failure.
>> @@ -1749,8 +1748,14 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
>> size_t bits, bit_align;
>>
>> gfp = current_gfp_context(gfp);
>> - /* whitelisted flags that can be passed to the backing allocators */
>> - pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
>> + /*
>> + * Whitelisted flags that can be passed to the backing allocators.
>
> We're supposed to say "allowlist".
>
>> + * Backing allocations under pcpu_alloc_mutex must not recurse into
>> + * IO/FS reclaim. Otherwise a GFP_KERNEL caller holding the mutex can
>> + * block on reclaim while a GFP_NOIO/NOFS caller holding an IO/FS lock
>> + * waits for the same mutex.
>> + */
>> + pcpu_gfp = gfp & (GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
>
> AI review
> (https://sashiko.dev/#/patchset/20260604113101.89510-1-kaitao.cheng@linux.dev)
> asked why we're currently removing __GFP_NOFAIL here. There are
> probably good reasons for this, but it would be good to describe them
> in that comment.
>

This behavior has been present since commit 554fef1c39ee
("percpu: allow select gfp to be passed to underlying allocators"),
which introduced the whitelist for GFP flags passed down to the backing
allocators.

I did a quick AI-assisted scan of the current tree and did not find any
in-tree caller passing __GFP_NOFAIL to pcpu_alloc_noprof() or its
wrappers. So the issue Sashiko described does not appear to be reachable
with current callers.

That said, I agree the semantics are somewhat incomplete: __GFP_NOFAIL
is handled when taking pcpu_alloc_mutex, but it is not propagated through
pcpu_gfp to the backing allocations. If we want to address this
defensively, I think it would be better as a separate patch. Even though it
touches the same line, it fixes a different issue from this change.
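
For illustration only, here is a minimal userspace sketch of what the
mask does to the flags. It is not kernel code: the DEMO_ bit values and
both helper names are made up for this example. Only the way GFP_NOIO,
GFP_NOFS and GFP_KERNEL are composed from the reclaim, IO and FS bits is
meant to mirror the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative bit positions only; the kernel's real values differ. */
#define DEMO_GFP_IO             0x01u
#define DEMO_GFP_FS             0x02u
#define DEMO_GFP_DIRECT_RECLAIM 0x04u
#define DEMO_GFP_KSWAPD_RECLAIM 0x08u
#define DEMO_GFP_NOWARN         0x10u
#define DEMO_GFP_NORETRY        0x20u
#define DEMO_GFP_NOFAIL         0x40u

/* Composite masks, composed the same way as the kernel's GFP_* macros. */
#define DEMO_GFP_RECLAIM (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)
#define DEMO_GFP_NOIO    (DEMO_GFP_RECLAIM)
#define DEMO_GFP_NOFS    (DEMO_GFP_RECLAIM | DEMO_GFP_IO)
#define DEMO_GFP_KERNEL  (DEMO_GFP_RECLAIM | DEMO_GFP_IO | DEMO_GFP_FS)

/*
 * Hypothetical helper: the same mask expression as the patched
 * pcpu_gfp assignment, applied to the demo flags.
 */
static inline unsigned int demo_backing_gfp(unsigned int gfp)
{
	return gfp & (DEMO_GFP_NOIO | DEMO_GFP_NORETRY | DEMO_GFP_NOWARN);
}

/*
 * Hypothetical helper: a request may block only if direct reclaim
 * is permitted.
 */
static inline bool demo_allow_blocking(unsigned int gfp)
{
	return (gfp & DEMO_GFP_DIRECT_RECLAIM) != 0;
}
```

With these demo values, a DEMO_GFP_KERNEL | DEMO_GFP_NOFAIL |
DEMO_GFP_NOWARN request comes out of demo_backing_gfp() with the IO, FS
and NOFAIL bits cleared, while the reclaim bits and NOWARN survive, so
the backing allocation may still block but cannot recurse into IO/FS
reclaim.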
--
Thanks
Kaitao Cheng