From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from out-172.mta0.migadu.com (out-172.mta0.migadu.com [91.218.175.172])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 48F353F4DCF
	for ; Thu, 28 May 2026 13:30:25 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.172
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779975026; cv=none;
	b=bef+hsXCmfT2AqJNaQ2xH8cEKLgulxJf6PyKH7BA8x/ASrXoYOKw03aWKsvxcHLG6k/kG/MQEnA72Om+0AhiY/hq/q2GudhobFXs1fgXMRypy9JMUN6elzJPsIPYSftVpCOGQWgEEQ0FmvG+7IplaJ6cUIVHZdfawERsREBAfIs=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779975026; c=relaxed/simple;
	bh=fBV7kgjVOW7YD+b0n4xG3n5GcYxPNPKjJh2X2fJk3gM=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	MIME-Version;
	b=Suy98qJS3XrxIpWOmnoPaCQMrP3pmvhklk8D39tuAToGlrWsveF5e6ygQ4iuPLJIlpa46gpf6DcEa1WAe8uhpMTBTpzrjX8S2UzJGbNwrIeDJOrhxIMMVa7QZ24EsekDVEPOMcLmgzYFSQXVjiMal+UvQjmgp8XKkhKpfIyEwTk=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
	dmarc=pass (p=none dis=none) header.from=linux.dev;
	spf=pass smtp.mailfrom=linux.dev;
	dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=cyDYzH4/;
	arc=none smtp.client-ip=91.218.175.172
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="cyDYzH4/"
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1779975023;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=ymCDErMyx33t1ESuPWjnecWS9by/6Siz6vuFexERi1I=;
	b=cyDYzH4/+ZT8vA7JAhmm8qbgu/lvOEBFg9dhu1qduEUxh1Hrjxpsxk8yoPy22H3wD+Y6Ib
	sqbb3EhkhGuIyldt0iTGWDTkvmKk8MMTvjyDKA3LtzB2hJbLASAdZO3KtAB3tpEqr25KfK
	jyQkDK4jNvpteFI7zeUAyu9Vktw0iyk=
From: Kaitao Cheng
To: dennis@kernel.org, tj@kernel.org, cl@gentwo.org, akpm@linux-foundation.org
Cc: mhocko@suse.com, vbabka@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, muchun.song@linux.dev, Kaitao Cheng
Subject: [PATCH 1/2] mm/percpu: Preserve NOFS/NOIO scope during chunk create and populate
Date: Thu, 28 May 2026 21:29:16 +0800
Message-ID: <20260528132917.81123-2-kaitao.cheng@linux.dev>
In-Reply-To: <20260528132917.81123-1-kaitao.cheng@linux.dev>
References: <20260528132917.81123-1-kaitao.cheng@linux.dev>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Migadu-Flow: FLOW_OUT

From: Kaitao Cheng

pcpu_alloc_noprof() derives pcpu_gfp from the caller supplied GFP mask
and passes it to the backing percpu allocators. This preserves GFP_NOFS
and GFP_NOIO for pcpu_alloc_pages() and for the initial pcpu_chunk
allocation.

However, the chunk creation and population slow paths also call helpers
which do not take a GFP mask and perform internal allocations with
GFP_KERNEL. For example, pcpu_create_chunk() calls pcpu_get_vm_areas(),
and population can allocate temporary metadata or page tables while
mapping backing pages. As a result, a caller which explicitly uses
GFP_NOFS or GFP_NOIO can still enter FS or IO reclaim while creating or
populating a percpu chunk.

This is problematic for callers which use GFP_NOFS or GFP_NOIO because
they are already holding filesystem or IO-path locks. If free chunks are
exhausted, the percpu allocation can take pcpu_alloc_mutex and then enter
unconstrained reclaim from these internal allocations, defeating the
caller's allocation context and potentially recreating reclaim lock
dependencies.

Wrap chunk creation and population in a scoped NOIO or NOFS context when
pcpu_gfp has the corresponding constraints. Leave ordinary GFP_KERNEL
allocations unchanged so they retain full reclaim capability.

Fixes: 9a5b183941b5 ("mm, percpu: do not consider sleepable allocations atomic")
Signed-off-by: Kaitao Cheng
---
 mm/percpu.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/mm/percpu.c b/mm/percpu.c
index 71a85d7245c7..1bb38467390b 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1778,6 +1778,23 @@ static void pcpu_alloc_tag_free_hook(struct pcpu_chunk *chunk, int off, size_t s
 }
 #endif
 
+static unsigned int pcpu_memalloc_scope_save(gfp_t gfp)
+{
+	if (!(gfp & __GFP_IO))
+		return memalloc_noio_save();
+	if (!(gfp & __GFP_FS))
+		return memalloc_nofs_save();
+	return 0;
+}
+
+static void pcpu_memalloc_scope_restore(gfp_t gfp, unsigned int flags)
+{
+	if (!(gfp & __GFP_IO))
+		memalloc_noio_restore(flags);
+	else if (!(gfp & __GFP_FS))
+		memalloc_nofs_restore(flags);
+}
+
 /**
  * pcpu_alloc - the percpu allocator
  * @size: size of area to allocate in bytes
@@ -1901,7 +1918,12 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
 
 	/* No space left. Create a new chunk. */
 	if (list_empty(&pcpu_chunk_lists[pcpu_free_slot])) {
+		unsigned int pcpu_scope;
+
+		pcpu_scope = pcpu_memalloc_scope_save(pcpu_gfp);
 		chunk = pcpu_create_chunk(pcpu_gfp);
+		pcpu_memalloc_scope_restore(pcpu_gfp, pcpu_scope);
+
 		if (!chunk) {
 			err = "failed to allocate new chunk";
 			goto fail;
@@ -1931,9 +1953,13 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
 		page_end = PFN_UP(off + size);
 
 		for_each_clear_bitrange_from(rs, re, chunk->populated, page_end) {
+			unsigned int pcpu_scope;
+
 			WARN_ON(chunk->immutable);
 
+			pcpu_scope = pcpu_memalloc_scope_save(pcpu_gfp);
 			ret = pcpu_populate_chunk(chunk, rs, re, pcpu_gfp);
+			pcpu_memalloc_scope_restore(pcpu_gfp, pcpu_scope);
 
 			spin_lock_irqsave(&pcpu_lock, flags);
 			if (ret) {
-- 
2.50.1 (Apple Git-155)
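For readers following the scope logic, here is a minimal user-space sketch of the save/restore pairing the patch relies on. It is not kernel code. Every `model_*` function and `MODEL_*` constant is a hypothetical stand-in with illustrative bit values, the reclaim bit of the real GFP masks is left out, and the masking rule in `model_effective_gfp()` is an assumption about how a scoped task flag narrows a hard-coded GFP_KERNEL request inside the helpers.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel flags; the bit values are illustrative. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS    (MODEL_GFP_IO)	/* IO allowed, FS not */
#define MODEL_GFP_NOIO    (0x0u)		/* neither IO nor FS */

#define MODEL_PF_NOIO     0x1u
#define MODEL_PF_NOFS     0x2u

/* Stands in for the per-task flags word of the running task. */
static unsigned int model_task_flags;

/* Set the NOIO scope bit and return its previous state. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_NOIO;

	model_task_flags |= MODEL_PF_NOIO;
	return old;
}

/* Put the NOIO scope bit back to the state returned by the save. */
static void model_noio_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_NOIO) | old;
}

static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_NOFS;

	model_task_flags |= MODEL_PF_NOFS;
	return old;
}

static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_NOFS) | old;
}

/* Assumed rule: an active scope strips bits from whatever mask is requested. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	else if (model_task_flags & MODEL_PF_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* Same branch structure as pcpu_memalloc_scope_save() in the patch. */
static unsigned int model_scope_save(unsigned int gfp)
{
	if (!(gfp & MODEL_GFP_IO))
		return model_noio_save();
	if (!(gfp & MODEL_GFP_FS))
		return model_nofs_save();
	return 0;
}

/* Same branch structure as pcpu_memalloc_scope_restore() in the patch. */
static void model_scope_restore(unsigned int gfp, unsigned int old)
{
	if (!(gfp & MODEL_GFP_IO))
		model_noio_restore(old);
	else if (!(gfp & MODEL_GFP_FS))
		model_nofs_restore(old);
}

/*
 * Mask seen by a helper that hard-codes GFP_KERNEL while it runs inside
 * the scope opened for caller_gfp.
 */
static unsigned int model_inner_mask(unsigned int caller_gfp)
{
	unsigned int entry_flags = model_task_flags;
	unsigned int scope = model_scope_save(caller_gfp);
	unsigned int seen = model_effective_gfp(MODEL_GFP_KERNEL);

	model_scope_restore(caller_gfp, scope);
	assert(model_task_flags == entry_flags);	/* scope fully undone */
	return seen;
}
```

In this model, `model_inner_mask(MODEL_GFP_NOFS)` yields `MODEL_GFP_IO` and `model_inner_mask(MODEL_GFP_NOIO)` yields 0, while `model_inner_mask(MODEL_GFP_KERNEL)` is left at `MODEL_GFP_KERNEL`. Returning the previous bit from the save, rather than clearing it unconditionally on restore, is what lets the pairing nest inside an outer scope the caller may already hold.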