From: Kaitao Cheng
Date: Fri, 19 Jun 2026 08:21:51 +0800
Subject: Re: [PATCH v4 4/4] mm/percpu: Avoid IO/FS reclaim in backing allocations
To: Michal Hocko
Cc: Andrew Morton, Uladzislau Rezki, Dennis Zhou, Tejun Heo, Christoph Lameter, Vlastimil Babka, Pedro Falcato, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kaitao Cheng
References: <20260618130414.96383-1-kaitao.cheng@linux.dev> <20260618130414.96383-5-kaitao.cheng@linux.dev>

On 2026/6/19 02:03, Michal Hocko wrote:
> On Thu 18-06-26 21:04:14, Kaitao Cheng wrote:
>> From: Kaitao Cheng
>>
>> Commit 9a5b183941b5 ("mm, percpu: do not consider sleepable
>> allocations atomic") allows sleepable GFP_NOIO and GFP_NOFS percpu
>> allocations to take pcpu_alloc_mutex. This avoids premature allocation
>> failures, but it also makes the mutex visible to callers from constrained
>> IO/FS contexts.
>>
>> Thread A calls pcpu_alloc_noprof() with GFP_KERNEL and takes
>> pcpu_alloc_mutex. Since the internal allocation is not constrained by
>> NOFS, it may enter FS reclaim while still holding pcpu_alloc_mutex,
>> creating a dependency like: pcpu_alloc_mutex -> fs_reclaim -> FS lock
>>
>> At the same time, Thread B may already hold an FS lock and then call
>> pcpu_alloc_noprof() with GFP_NOFS.
>> It will try to acquire
>> pcpu_alloc_mutex and block, creating the reverse dependency:
>> FS lock -> pcpu_alloc_mutex
>>
>> This can still form a potential deadlock cycle.
>>
>> Avoid the dependency by restricting percpu backing allocations to GFP_NOIO.
>> The public allocation still uses the caller's GFP context to decide whether
>> it may block, but the internal memory allocations performed while
>> pcpu_alloc_mutex is held cannot recurse into IO or FS reclaim.
>>
>> Fixes: 9a5b183941b5 ("mm, percpu: do not consider sleepable allocations atomic")
>> Signed-off-by: Kaitao Cheng
>
> This seems like the only viable short term fix but long term it would be
> really better to make allocations outside of the lock.
> Acked-by: Michal Hocko
>
> Minor nit
>> @@ -1749,8 +1748,17 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
>>  	size_t bits, bit_align;
>>
>>  	gfp = current_gfp_context(gfp);
>> -	/* whitelisted flags that can be passed to the backing allocators */
>> -	pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
>> +	/*
>> +	 * Allowlisted flags that can be passed to the backing allocators.
>> +	 * Backing allocations under pcpu_alloc_mutex must not recurse into
>> +	 * IO/FS reclaim. Otherwise a GFP_KERNEL caller holding the mutex can
>> +	 * block on reclaim while a GFP_NOIO/NOFS caller holding an IO/FS lock
>> +	 * waits for the same mutex.
>> +	 *
>> +	 * Do not pass __GFP_NOFAIL. A small percpu allocation may need many
>> +	 * backing pages, making nofail reclaim too costly under NOIO/NOFS.
>> +	 */
>> +	pcpu_gfp = gfp & (GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
>
> GFP_NOIO, NOFS are negative masks in the sense that they are lacking
> flags so the overall intention would be more readable IMHO in the
> following form
> 	pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN)
> 	pcpu_gfp &= ~(__GFP_IO | __GFP_FS)

This looks a bit redundant.
The newly added comment already makes the intent clear, and the extra
code seems to serve only as another hint to readers, which is essentially
the same role as the comment. GFP_NOIO already excludes __GFP_IO and
__GFP_FS, so its semantics are clear enough. It should not be misleading,
and it is also more concise.

>>  	is_atomic = !gfpflags_allow_blocking(gfp);
>>  	do_warn = !(gfp & __GFP_NOWARN);
>>
>> --
>> 2.50.1 (Apple Git-155)
>

-- 
Thanks
Kaitao Cheng
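
For reference, the two spellings discussed above (the single GFP_NOIO mask in the patch and the two-step GFP_KERNEL form suggested in review) should select the same backing flags. The userspace sketch below checks that under stated assumptions: the bit values are stand-ins chosen for illustration and are not the kernel's real values, only the composition of GFP_NOIO, GFP_NOFS and GFP_KERNEL mirrors the kernel definitions, and the helper names backing_gfp_patch(), backing_gfp_suggested() and masks_agree() are hypothetical.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Stand-in bit values for illustration only; the kernel's real values differ. */
#define __GFP_IO              0x01u
#define __GFP_FS              0x02u
#define __GFP_DIRECT_RECLAIM  0x04u
#define __GFP_KSWAPD_RECLAIM  0x08u
#define __GFP_NORETRY         0x10u
#define __GFP_NOWARN          0x20u
#define __GFP_NOFAIL          0x40u

/* Composition mirrors the kernel: NOIO and NOFS are KERNEL minus IO/FS bits. */
#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_NOIO      (__GFP_RECLAIM)
#define GFP_NOFS      (__GFP_RECLAIM | __GFP_IO)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* Form used in the patch: one allowlist mask built on GFP_NOIO. */
static gfp_t backing_gfp_patch(gfp_t gfp)
{
	return gfp & (GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
}

/* Form suggested in review: allowlist on GFP_KERNEL, then clear IO/FS. */
static gfp_t backing_gfp_suggested(gfp_t gfp)
{
	gfp_t pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);

	pcpu_gfp &= ~(__GFP_IO | __GFP_FS);
	return pcpu_gfp;
}

/* Returns 1 if both forms agree for every combination of the stand-in bits. */
static int masks_agree(void)
{
	gfp_t gfp;

	for (gfp = 0; gfp <= 0x7fu; gfp++)
		if (backing_gfp_patch(gfp) != backing_gfp_suggested(gfp))
			return 0;
	return 1;
}
```

With these stand-in definitions, both helpers strip __GFP_IO, __GFP_FS and __GFP_NOFAIL from the caller's flags, so the choice between the two forms is about readability rather than behavior.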