From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 978C8C52D7C for ; Mon, 19 Aug 2024 19:23:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2CDD76B007B; Mon, 19 Aug 2024 15:23:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 27E4B6B0082; Mon, 19 Aug 2024 15:23:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1469A6B0083; Mon, 19 Aug 2024 15:23:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id EE1E16B007B for ; Mon, 19 Aug 2024 15:23:47 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 91CD7A93F8 for ; Mon, 19 Aug 2024 19:23:47 +0000 (UTC) X-FDA: 82469969694.22.6B914FF Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) by imf08.hostedemail.com (Postfix) with ESMTP id 9928516000E for ; Mon, 19 Aug 2024 19:23:45 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=DTOwJhH0; spf=pass (imf08.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.219.51 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724095322; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=jzFJ7K9wdRT+QQN8nk00JRyGEe7qfSEYg3oRtqc9s/Y=; b=l/ZcXhNMXmNTjCf0wV2NR7FqumRJZczzhm0hdc1ALjZDz//iM0lBKmpSP4TgEelYOEL3HW BsbFHlHqnvEr3FjL75XVr2uWOgxNVYDmDuqYIq2nBqv0sbrwhrtLANLWFO0TNIqncc2i5V FMcNeDsvav34EWMvXhbltROZ/b9cqAY= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=DTOwJhH0; spf=pass (imf08.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.219.51 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724095322; a=rsa-sha256; cv=none; b=cVn8WgMuW0iH0ysb+++7OJFuQpNpCjNxEbpuk0p/s9nFAveiREBGGnwCccFtBlZ6tizpz+ 1Y5kL91/suDQ1mKHbjudI2HkX1z1gv14jVRLKhIcJ7XiE+m6se030qEozGvfFHQcMcUKtF 90imXqwjmdbcwncD4VAlz/+livFyp0Q= Received: by mail-qv1-f51.google.com with SMTP id 6a1803df08f44-6bf784346b9so21842546d6.2 for ; Mon, 19 Aug 2024 12:23:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724095425; x=1724700225; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=jzFJ7K9wdRT+QQN8nk00JRyGEe7qfSEYg3oRtqc9s/Y=; b=DTOwJhH0LgT3IDUT5m/b5kZRNBlJQcVojhxWGsXKQe6e4p4UbFu8I7IM0qY8BSwVVB LZ6mqie7RArYYOdGxDg5aTf0ksBuM0muV+kVC+L3fJGvQudrt/2+e2G7Z5jMWdrY2AJo 68peY1odZU+eilBZMXCqAmF82Kz4/PG6+njf6JrDz3A3rftcLi4jpZ4WDFiSxXdm/VLL rEPeHiy2Hidgrz5IVSyJimxR8uQJk2Jr18+tBaGm7onO9RUCRj7H3+bbAunvV8yl5199 B9cn3H+13/7WhnJTPPclNVYscBMa8Y1KgtAtZ31gw1OhFglXEy8CM7Kww00rDRqeFAcB UW/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724095425; x=1724700225; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jzFJ7K9wdRT+QQN8nk00JRyGEe7qfSEYg3oRtqc9s/Y=; b=VsNviZ1aGGcK/QAgCV7zN3WAiMnN1iam6iBDniDgV2INRtSuLSwngrSLoF+rI4+yH8 30WC8dgDuLRtR83kndh0GsHY4WiMTIMR5BdiuIcrw/ADVdFNyC6QGGXX0fhHOettk54H jOt3EgjBCZOdiQ2nvxcuGFaQBSh4VTPQzy+FUlInSnmC9ypkupgM07tNvNBcDPwMjSfr qPAlz0+DWm7If6Xr4Zy/NdwkCxHj40Jdddzqd/m+OdY3llBOkAfmgu74WZvg/L/m5VRC JQG+s1jQ2MN31r6Gd29CfJktNgZAES056im4T8LNFz+XR1sW3qTouO9cXIaRLJQGi7w1 OLtQ== X-Forwarded-Encrypted: i=1; AJvYcCVLqpGMXwOStfwZx1eHjbLdXTzXdD4xa/gvBY8sPRjtNZNjIU1PDhprWr82QFvG5BpR3oaRygQeirm1RkT6Wu4AomY= X-Gm-Message-State: AOJu0YzecYsV14TjkHPnYj31LkG4lo6Coryt5KQ0AFxpotigDbF3MrPe yVsEMW9rfSwp0LenRwUm8uEMoOOKrbHrnKP9kH1B+aN7XsXZfebxdEb/nRurvJPErEx3O7YrDka RfFpaZ8W1ZDixDpN6GLffNhxXUJg= X-Google-Smtp-Source: AGHT+IETlY5cNq9g0dHqEUB4cSHIAgnEby1WMSf7rLlyw3/o+UIOTsTYGREIlaqYZFL3bZuBjpvCJZSGSZ64HzX4VC8= X-Received: by 2002:a05:6214:5490:b0:6bf:8711:ab43 with SMTP id 6a1803df08f44-6bf8711aee5mr82046406d6.31.1724095424517; Mon, 19 Aug 2024 12:23:44 -0700 (PDT) MIME-Version: 1.0 References: <20240817062449.21164-1-21cnbao@gmail.com> <7050deab-e99c-4c83-b7b9-b5dad42f4e95@redhat.com> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Tue, 20 Aug 2024 07:23:31 +1200 Message-ID: Subject: Re: [PATCH v3 0/4] mm: clarify nofail memory allocation To: Linus Torvalds Cc: David Hildenbrand , akpm@linux-foundation.org, linux-mm@kvack.org, 42.hyeyoo@gmail.com, cl@linux.com, hailong.liu@oppo.com, hch@infradead.org, iamjoonsoo.kim@lge.com, mhocko@suse.com, penberg@kernel.org, rientjes@google.com, roman.gushchin@linux.dev, urezki@gmail.com, v-songbaohua@oppo.com, vbabka@suse.cz, virtualization@lists.linux.dev Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 9928516000E X-Stat-Signature: o7y5meuwy8maijiyfeyg67m3jyedsbcp X-Rspam-User: X-HE-Tag: 1724095425-925759 X-HE-Meta: U2FsdGVkX196KFuLZEx8oqo2d3/eNj1PC1cCyuIVpLVPWyxmm4ywvrj3Y/v8sRsMRo5znryXTRhZ0vYTVaXkctkFbJxQqD/WIMNJAzJQdIzOf5bB3tBoPaGshb4nmzvQYjTkMZNVL/a5aD7/OILwS7glwa14XrvSbq6auH59N5i4Eyzjc6kWNtBfcrxYsgSfF4Y6xn3HQ91WZEv7+NKEUJ2AD4U6DYwy+VHqESWXX/QY2dNvBpZInGNm+JAQgfiMp5bT5qJ0x65oIg987qagy4/qWKKX3E7OPicgoFtU5PTnuSuZJBzPohYXUhqJIR5qjOAZ67LiMZM0s3nJhs+71aY49M7RwB4GEnbSsX1vb5YhzRXSfXVl4F+uHirgpBRtSZOB8OiAM7OOBDPWFC13yYnapagXHSWy7cV85hz5iriY6HqIgNLDXL2DFW3TXVYWiMB9XIbHUshdYn5X0AqyPMpROEP/mI3J5Ylp4bXhFK9Qv6GyAQBTVpJCz2gJhPrTH6X3Ko80MER3sh+sbVYDAaFFBLxzzUneAwOuOZuVMqwe0HLvj3koG3nqmwQRfu42ZRf19ZxCz0oLHbNT6Ky7rt46dyd/V8CVZUcRaY3WYW8Zvr7hXlYspegfIh0Ea58U0pr+xHeME7ji4HTVfcp9MJFg9nPoZDwPjuxFL+Durb9NTjBstBKHdPQCB1ArWuEwNCgWAa+011uX0tSrzZCQPjfaC7dy0RU7vXa/mH8cA0v7RVLlzPmr/g/BM+8q2ZZK7pNSs23Zg1l02rgwZqND/q2U2RS1sWZwX1Acv4fnK7DwY8RNMAMk/gumwfoxBbMh2VaMVZYKehmtY1AbnbBU9neXvSIvyBV41OWSH/DTTc84jIAgK7u88FSXHFbujeOT39bwR+l6y8S5IUsdHB0irwdcVL6wxkGJG8nF2HBAIcpTXq2ZUMYnKwKLzgdosf+YBtF9I+dH6ddiZs+jF3Q fyRFf/z+ bINSmkYP6Lh4tFGs4ifstgbQuFVhbG8+SrfO/dysqnth4U3POYSWTCIRuJrXz6OP+4x08ke/IxGITR5wBnccxop9kdQP0uGevoEFG+QPSOZhalMHzAk6bCOBgrNCeyy7A0ApBjF1PyvGgC5HgEXz0Vek44f9eokWEkBp+EKI4ELjsbT9ZNMliNmIjEy28PoYyNeyAeG/7WYeL4yORDyTQbXHPqHGvLEY3sror9BAgBSryfo3yfV8002YL6W1vYR+8bD4goR23JiMVobfD6bxuO8XjQnR9hQo1Iq9JFic5MffNH9IpIdySIGkIDIMSYhYhNvYQyXPKaQGxeg3eLOoMjDMj+IrVB52oYcXfeal3BT8ZdDQ884I9OK6eoVyxTfa8jpY4ymJ+BVbJv5sTEIy/d13DYwGYMqhy/3ChG0hF2tlgSKrS+eA9qd73d4ych62f8J/gguP+SEO6ms2Au/C9Nj8wx/0RUEU/ShYMoVGV15tDXqQbGKN7CrQ1dTphPmR0N+ll X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Aug 20, 2024 at 4:05=E2=80=AFAM Linus Torvalds wrote: > > On Mon, 19 Aug 2024 at 06:02, David Hildenbrand wrote: > > > > > > If we must still fail a nofail allocation, we should trigger a BUG ra= ther > > > than exposing NULL dereferences to callers who do not check the retur= n > > > value. > > > > I am not convinced that BUG_ON is the right tool here to save the world= , > > but I see how we arrived here. > > I think the thing to do is to just add a > > WARN_ON_ONCE((flags & __GFP_NOFAIL) && bad_nofail_alloc(oder, flags)= ); > > or similar, where that bad_nofail_alloc() checks that the allocation > order is small and that the flags are sane for a NOFAIL allocation. > > Because no, BUG_ON() is *never* the answer. The answer is to make sure > nobody ever sets NOFAIL in situations where the allocation can fail > and there is no way forward. > > A BUG_ON() will quite likely just make things worse. You're better off > with a WARN_ON() and letting the caller just oops. That could be an exploit taking advantage of those improper callers, thus it wouldn=E2=80=99t necessarily result in an immediate oops in callers= but result in an exploit such as p =3D malloc(gfp_nofail); using p->field or using p + offset where p->filed or p + offset might be attacking code though p is NULL. Could we consider infinitely blocking those bad callers: while (illegal_gfp_nofail) do_sth_relax_cpu(); > > Honestly, I'm perfectly fine with just removing that stupid useless > flag entirely. The flag goes back to 2003 and was introduced in > 2.5.69, and was meant to be for very particular uses that otherwise > just looped waiting for memory. > > Back in 2.5.69, there was exactly one user: the jbd journal code, that > did a buffer head allocation with GFP_NOFAIL. By 2.6.0 that had > expanded by another user in XFS, and even that one had a comment > saying that it needed to be narrowed down. And in fact, by the 2.6.12 > release, that XFS use had been removed, but the jbd journal had grown > another jbd_kmalloc case for transaction data. So at the beginning of > the git archives, we had exactly *one* user (with two places). > > *THAT* is the kind of use that the flag was meant for: small > allocations required to make forward progress in writeout during > memory pressure. > > It has then expanded and is now a problem. The cases using GFP_NOFAIL > for things like vmalloc() - which is by definition not a small > allocation - should be just removed as outright bugs. > > Note that we had this comment back in 2010: > > * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the calle= r > * cannot handle allocation failures. This modifier is deprecated and no= new > * users should be added. > > and then it was softened in 2015 to the current > > * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the calle= r > * cannot handle allocation failures. New users should be evaluated caref= ully > ... > > and clearly that "evaluated carefully" actually never happened, so the > new comment is just garbage. > > I wonder how many modern users of GFP_NOFAIL are simply due to > over-eager allocation failure injection testing, and then people added > GFP_NOFAIL just because it shut up the mindless random allocation > failures. > > I mean, we have a __GFP_NOFAIL in rhashtable_init() - which can > actually return an error just fine, but there was this crazy worry > about the IPC layer initialization failing: > > https://lore.kernel.org/all/20180523172500.anfvmjtumww65ief@linux-n805= / > > Things like that, where people just added mindless "theoretical > concerns" issues, or possibly had some error injection module that > inserted impossible failures. > > I do NOT want those things to become BUG_ON()'s. It's better to just > return NULL with a "bogus GFP_NOFAIL" warning, and have the oops > happen in the actual bad place that did an invalid allocation. > > Because the blame should go *there*, and it should not even remotely > look like "oh, the MM code failed". No. The caller was garbage. > > So no. No MM BUG_ON code. > > Linus