Date: Thu, 18 Jun 2026 20:03:32 +0200
From: Michal Hocko
To: Kaitao Cheng
Cc: Andrew Morton, Uladzislau Rezki, Dennis Zhou, Tejun Heo, Christoph Lameter, Vlastimil Babka, Pedro Falcato, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kaitao Cheng
Subject: Re: [PATCH v4 4/4] mm/percpu: Avoid IO/FS reclaim in backing allocations
References: <20260618130414.96383-1-kaitao.cheng@linux.dev> <20260618130414.96383-5-kaitao.cheng@linux.dev>
In-Reply-To: <20260618130414.96383-5-kaitao.cheng@linux.dev>

On Thu 18-06-26 21:04:14, Kaitao Cheng wrote:
> From: Kaitao Cheng
>
> Commit 9a5b183941b5 ("mm, percpu: do not consider sleepable
> allocations atomic") allows sleepable GFP_NOIO and GFP_NOFS percpu
> allocations to take pcpu_alloc_mutex. This avoids premature allocation
> failures, but it also makes the mutex visible to callers from constrained
> IO/FS contexts.
>
> Thread A calls pcpu_alloc_noprof() with GFP_KERNEL and takes
> pcpu_alloc_mutex. Since the internal allocation is not constrained by
> NOFS, it may enter FS reclaim while still holding pcpu_alloc_mutex,
> creating a dependency like: pcpu_alloc_mutex -> fs_reclaim -> FS lock
>
> At the same time, Thread B may already hold an FS lock and then call
> pcpu_alloc_noprof() with GFP_NOFS. It will try to acquire
> pcpu_alloc_mutex and block, creating the reverse dependency:
> FS lock -> pcpu_alloc_mutex
>
> This can still form a potential deadlock cycle.
>
> Avoid the dependency by restricting percpu backing allocations to GFP_NOIO.
> The public allocation still uses the caller's GFP context to decide whether
> it may block, but the internal memory allocations performed while
> pcpu_alloc_mutex is held cannot recurse into IO or FS reclaim.
>
> Fixes: 9a5b183941b5 ("mm, percpu: do not consider sleepable allocations atomic")
> Signed-off-by: Kaitao Cheng

This seems like the only viable short-term fix, but long term it would be
really better to make the allocations outside of the lock.

Acked-by: Michal Hocko

Minor nit

> @@ -1749,8 +1748,17 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
>  	size_t bits, bit_align;
>
>  	gfp = current_gfp_context(gfp);
> -	/* whitelisted flags that can be passed to the backing allocators */
> -	pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
> +	/*
> +	 * Allowlisted flags that can be passed to the backing allocators.
> +	 * Backing allocations under pcpu_alloc_mutex must not recurse into
> +	 * IO/FS reclaim. Otherwise a GFP_KERNEL caller holding the mutex can
> +	 * block on reclaim while a GFP_NOIO/NOFS caller holding an IO/FS lock
> +	 * waits for the same mutex.
> +	 *
> +	 * Do not pass __GFP_NOFAIL. A small percpu allocation may need many
> +	 * backing pages, making nofail reclaim too costly under NOIO/NOFS.
> +	 */
> +	pcpu_gfp = gfp & (GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);

GFP_NOIO and GFP_NOFS are negative masks in the sense that they are lacking
flags, so the overall intention would be more readable IMHO in the
following form:

	pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
	pcpu_gfp &= ~(__GFP_IO | __GFP_FS);

>  	is_atomic = !gfpflags_allow_blocking(gfp);
>  	do_warn = !(gfp & __GFP_NOWARN);
>
> --
> 2.50.1 (Apple Git-155)

-- 
Michal Hocko
SUSE Labs
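
For reference, the two spellings of the mask discussed above can be compared in a small userspace sketch. This is not kernel code: the bit positions below are placeholders chosen for illustration, and the helper names backing_gfp_patch(), backing_gfp_suggested() and forms_agree() are hypothetical. Only the composition of GFP_KERNEL, GFP_NOFS and GFP_NOIO out of the reclaim, IO and FS bits is meant to mirror the kernel definitions.

```c
#include <stdint.h>

typedef uint32_t gfp_t;

/* Placeholder bit positions; the real values live in the kernel headers. */
#define __GFP_IO             ((gfp_t)1u << 0)
#define __GFP_FS             ((gfp_t)1u << 1)
#define __GFP_DIRECT_RECLAIM ((gfp_t)1u << 2)
#define __GFP_KSWAPD_RECLAIM ((gfp_t)1u << 3)
#define __GFP_NORETRY        ((gfp_t)1u << 4)
#define __GFP_NOWARN         ((gfp_t)1u << 5)
#define __GFP_NOFAIL         ((gfp_t)1u << 6)
#define SKETCH_GFP_BITS      7

/* NOIO and NOFS are GFP_KERNEL with the IO and/or FS bits left out. */
#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_NOIO      (__GFP_RECLAIM)
#define GFP_NOFS      (__GFP_RECLAIM | __GFP_IO)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* Form used in the patch: allowlist built on GFP_NOIO. */
static gfp_t backing_gfp_patch(gfp_t gfp)
{
	return gfp & (GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
}

/* Form suggested in the review: allowlist, then drop IO/FS explicitly. */
static gfp_t backing_gfp_suggested(gfp_t gfp)
{
	gfp_t pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);

	pcpu_gfp &= ~(__GFP_IO | __GFP_FS);
	return pcpu_gfp;
}

/* Returns 1 if both forms agree for every combination of the sketch bits. */
static int forms_agree(void)
{
	for (gfp_t gfp = 0; gfp < ((gfp_t)1u << SKETCH_GFP_BITS); gfp++) {
		if (backing_gfp_patch(gfp) != backing_gfp_suggested(gfp))
			return 0;
	}
	return 1;
}
```

Under these assumptions both forms yield the same mask, so the choice between them is about readability only: the second form states the intent (strip IO and FS) directly instead of relying on the reader knowing what GFP_NOIO lacks. In both forms __GFP_NOFAIL is filtered out because it is not part of the allowlist.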