From: Yafang Shao <laoar.shao@gmail.com>
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
42.hyeyoo@gmail.com, cl@linux.com, hailong.liu@oppo.com,
hch@infradead.org, iamjoonsoo.kim@lge.com, mhocko@suse.com,
penberg@kernel.org, rientjes@google.com,
roman.gushchin@linux.dev, torvalds@linux-foundation.org,
urezki@gmail.com, v-songbaohua@oppo.com, vbabka@suse.cz,
virtualization@lists.linux.dev,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Kees Cook" <kees@kernel.org>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Maxime Coquelin" <maxime.coquelin@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>
Subject: Re: [PATCH v3 4/4] mm: prohibit NULL deference exposed for unsupported non-blockable __GFP_NOFAIL
Date: Sun, 18 Aug 2024 13:51:04 +0800 [thread overview]
Message-ID: <CALOAHbBLF-Rt-RML=oOAkwW6PJDoMRTdV_6itw69xJLBrERGmQ@mail.gmail.com> (raw)
In-Reply-To: <CAGsJ_4yWPw5ihcnxxGDBVR0s69WdqZf_GQdx49u6NXEefd1pQA@mail.gmail.com>
On Sun, Aug 18, 2024 at 11:48 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Sun, Aug 18, 2024 at 2:55 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Sat, Aug 17, 2024 at 2:25 PM Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > When users allocate memory with the __GFP_NOFAIL flag, they might
> > > incorrectly use it alongside GFP_ATOMIC, GFP_NOWAIT, etc. This kind of
> > > non-blockable __GFP_NOFAIL is not supported and is pointless. If we
> > > attempt and still fail to allocate memory for these users, we have two
> > > choices:
> > >
> > > 1. We could busy-loop and hope that some other direct reclamation or
> > > kswapd rescues the current process. However, this is unreliable
> > > and could ultimately lead to hard or soft lockups,
> >
> > That can occur even if we set both __GFP_NOFAIL and
> > __GFP_DIRECT_RECLAIM, right? So, I don't believe the issue is related
> > to setting __GFP_DIRECT_RECLAIM; rather, it stems from the flawed
> > design of __GFP_NOFAIL itself.
>
> the point of GFP_NOFAIL is that it won't fail and its user won't check
> the return value. without direct_reclamation, it is sometimes impossible.
> but with direct reclamation, users constantly wait and finally they can
So, what exactly is the difference between 'constantly waiting' and
'busy looping'? Could you please clarify? Also, why can't we
'constantly wait' when __GFP_DIRECT_RECLAIM is not set?
> get memory. if you read the doc of __GFP_NOFAIL you will find it.
> it is absolutely clearly documented.
>
> >
> > > which might not
> > > be well supported by some architectures.
> > >
> > > 2. We could use BUG_ON to trigger a reliable system crash, avoiding
> > > exposing NULL dereference.
> > >
> > > Neither option is ideal, but both are improvements over the existing code.
> > > This patch selects the second option because, with the introduction of
> > > scoped API and GFP_NOFAIL—capable of enforcing direct reclamation for
> > > nofail users(which is in my plan), non-blockable nofail allocations will
> > > no longer be possible.
> > >
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > > Cc: Michal Hocko <mhocko@suse.com>
> > > Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > Cc: Christoph Hellwig <hch@infradead.org>
> > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > Cc: Christoph Lameter <cl@linux.com>
> > > Cc: Pekka Enberg <penberg@kernel.org>
> > > Cc: David Rientjes <rientjes@google.com>
> > > Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > > Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > Cc: Kees Cook <kees@kernel.org>
> > > Cc: "Eugenio Pérez" <eperezma@redhat.com>
> > > Cc: Hailong.Liu <hailong.liu@oppo.com>
> > > Cc: Jason Wang <jasowang@redhat.com>
> > > Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> > > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > > mm/page_alloc.c | 10 +++++-----
> > > 1 file changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index d2c37f8f8d09..fb5850ecd3ae 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -4399,11 +4399,11 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > > */
> > > if (gfp_mask & __GFP_NOFAIL) {
> > > /*
> > > - * All existing users of the __GFP_NOFAIL are blockable, so warn
> > > - * of any new users that actually require GFP_NOWAIT
> > > + * All existing users of the __GFP_NOFAIL are blockable
> > > + * otherwise we introduce a busy loop with inside the page
> > > + * allocator from non-sleepable contexts
> > > */
> > > - if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> > > - goto fail;
> > > + BUG_ON(!can_direct_reclaim);
> >
> > I'm not in favor of using BUG_ON() here, as many call sites already
> > handle the return value of __GFP_NOFAIL.
> >
>
> it is not correct to handle the return value of __GFP_NOFAIL.
> if you check the ret, don't use __GFP_NOFAIL.
If so, you have many code changes to make in the linux kernel ;)
>
> > If we believe BUG_ON() is necessary, why not place it at the beginning
> > of __alloc_pages_slowpath() instead of after numerous operations,
> > which could potentially obscure the issue?
>
> to some extent I agree with you. but the point here is that we might
> want to avoid this check in the hot path. so basically, we check when
> we have to check. in 99%+ case, this check can be avoided.
It's on the slow path, but that's not the main point here.
--
Regards
Yafang
next prev parent reply other threads:[~2024-08-18 5:51 UTC|newest]
Thread overview: 101+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-17 6:24 [PATCH v3 0/4] mm: clarify nofail memory allocation Barry Song
2024-08-17 6:24 ` [PATCH v3 1/4] vduse: avoid using __GFP_NOFAIL Barry Song
2024-08-17 6:24 ` [PATCH v3 2/4] mm: document __GFP_NOFAIL must be blockable Barry Song
2024-08-17 6:24 ` [PATCH v3 3/4] mm: BUG_ON to avoid NULL deference while __GFP_NOFAIL fails Barry Song
2024-08-19 9:43 ` David Hildenbrand
2024-08-19 9:47 ` Barry Song
2024-08-19 9:55 ` David Hildenbrand
2024-08-19 10:02 ` Barry Song
2024-08-19 12:33 ` David Hildenbrand
2024-08-19 12:48 ` Barry Song
2024-08-19 12:49 ` David Hildenbrand
2024-08-19 17:12 ` Michal Hocko
2024-08-19 17:17 ` Linus Torvalds
2024-08-19 20:24 ` David Hildenbrand
2024-08-19 20:35 ` Linus Torvalds
2024-08-19 21:57 ` David Hildenbrand
2024-08-19 22:13 ` Linus Torvalds
2024-08-20 6:17 ` Michal Hocko
2024-08-19 12:49 ` Christoph Hellwig
2024-08-19 12:51 ` David Hildenbrand
2024-08-19 12:53 ` Christoph Hellwig
2024-08-19 13:14 ` David Hildenbrand
2024-08-19 13:05 ` Barry Song
2024-08-19 13:10 ` David Hildenbrand
2024-08-19 13:19 ` Barry Song
2024-08-19 13:22 ` David Hildenbrand
2024-08-17 6:24 ` [PATCH v3 4/4] mm: prohibit NULL deference exposed for unsupported non-blockable __GFP_NOFAIL Barry Song
2024-08-18 2:55 ` Yafang Shao
2024-08-18 3:48 ` Barry Song
2024-08-18 5:51 ` Yafang Shao [this message]
2024-08-18 6:27 ` Barry Song
2024-08-18 6:45 ` Barry Song
2024-08-18 7:07 ` Yafang Shao
2024-08-18 7:25 ` Barry Song
2024-08-19 7:51 ` Michal Hocko
2024-08-19 7:50 ` Michal Hocko
2024-08-19 9:25 ` Yafang Shao
2024-08-19 9:39 ` Barry Song
2024-08-19 9:45 ` Yafang Shao
2024-08-19 10:10 ` Barry Song
2024-08-19 11:56 ` Yafang Shao
2024-08-19 12:09 ` Michal Hocko
2024-08-19 12:17 ` Yafang Shao
2024-08-19 14:01 ` Michal Hocko
2024-08-19 10:17 ` Michal Hocko
2024-08-19 11:56 ` Yafang Shao
2024-08-19 12:04 ` Michal Hocko
2024-08-19 9:44 ` David Hildenbrand
2024-08-19 10:19 ` Michal Hocko
2024-08-19 12:48 ` David Hildenbrand
2024-08-19 13:02 ` [PATCH v3 0/4] mm: clarify nofail memory allocation David Hildenbrand
2024-08-19 16:05 ` Linus Torvalds
2024-08-19 19:23 ` Barry Song
2024-08-19 19:33 ` Linus Torvalds
2024-08-19 21:48 ` Barry Song
2024-08-20 6:24 ` Michal Hocko
2024-08-21 12:40 ` Yafang Shao
2024-08-21 22:59 ` Linus Torvalds
2024-08-22 6:21 ` Michal Hocko
2024-08-22 6:40 ` Linus Torvalds
2024-08-22 6:56 ` Linus Torvalds
2024-08-22 7:47 ` Michal Hocko
2024-08-22 7:57 ` Barry Song
2024-08-22 8:24 ` Michal Hocko
2024-08-22 8:39 ` David Hildenbrand
2024-08-22 9:08 ` Linus Torvalds
2024-08-22 9:16 ` Michal Hocko
2024-08-22 9:24 ` Linus Torvalds
2024-08-22 9:11 ` Michal Hocko
2024-08-22 9:18 ` Linus Torvalds
2024-08-22 9:33 ` Michal Hocko
2024-08-22 9:44 ` Linus Torvalds
2024-08-22 9:59 ` Michal Hocko
2024-08-22 10:30 ` Linus Torvalds
2024-08-22 10:46 ` Michal Hocko
2024-08-22 9:27 ` David Hildenbrand
2024-08-22 9:34 ` Linus Torvalds
2024-08-22 9:43 ` David Hildenbrand
2024-08-22 9:53 ` Linus Torvalds
2024-08-22 11:58 ` Johannes Weiner
2024-08-26 12:10 ` Vlastimil Babka
2024-08-27 6:57 ` Linus Torvalds
2024-08-27 7:15 ` Barry Song
2024-08-27 7:38 ` Vlastimil Babka
2024-08-27 7:50 ` Barry Song
2024-08-29 10:24 ` Vlastimil Babka
2024-08-29 11:53 ` Barry Song
2024-08-29 13:20 ` Michal Hocko
2024-08-29 21:27 ` Barry Song
2024-08-29 22:31 ` Barry Song
2024-08-30 7:24 ` Michal Hocko
2024-08-30 7:37 ` Vlastimil Babka
2024-08-22 9:41 ` Michal Hocko
2024-08-22 9:42 ` David Hildenbrand
2024-08-22 7:01 ` Gao Xiang
2024-08-22 7:54 ` Michal Hocko
2024-08-22 8:04 ` Gao Xiang
2024-08-22 14:35 ` Yafang Shao
2024-08-22 15:02 ` Gao Xiang
2024-08-22 6:37 ` Barry Song
2024-08-22 14:22 ` Yafang Shao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CALOAHbBLF-Rt-RML=oOAkwW6PJDoMRTdV_6itw69xJLBrERGmQ@mail.gmail.com' \
--to=laoar.shao@gmail.com \
--cc=21cnbao@gmail.com \
--cc=42.hyeyoo@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=eperezma@redhat.com \
--cc=hailong.liu@oppo.com \
--cc=hch@infradead.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=jasowang@redhat.com \
--cc=kees@kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=maxime.coquelin@redhat.com \
--cc=mhocko@suse.com \
--cc=mst@redhat.com \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=torvalds@linux-foundation.org \
--cc=urezki@gmail.com \
--cc=v-songbaohua@oppo.com \
--cc=vbabka@suse.cz \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).