linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: "Barry Song" <21cnbao@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	42.hyeyoo@gmail.com, cl@linux.com, hailong.liu@oppo.com,
	hch@infradead.org, iamjoonsoo.kim@lge.com, penberg@kernel.org,
	rientjes@google.com, roman.gushchin@linux.dev,
	torvalds@linux-foundation.org, urezki@gmail.com,
	v-songbaohua@oppo.com, vbabka@suse.cz,
	virtualization@lists.linux.dev,
	"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
	"Kees Cook" <kees@kernel.org>,
	"Eugenio Pérez" <eperezma@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>
Subject: Re: [PATCH v3 4/4] mm: prohibit NULL deference exposed for unsupported non-blockable __GFP_NOFAIL
Date: Mon, 19 Aug 2024 14:09:49 +0200
Message-ID: <ZsM2De5v06eJsiG3@tiehlicka>
In-Reply-To: <CALOAHbCG=_W9rrf+hgkFYF6+1LZuEeEjpyiexu9iseGOaDJ+QA@mail.gmail.com>

On Mon 19-08-24 19:56:53, Yafang Shao wrote:
> On Mon, Aug 19, 2024 at 6:10 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Mon, Aug 19, 2024 at 9:46 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > On Mon, Aug 19, 2024 at 5:39 PM Barry Song <21cnbao@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 19, 2024 at 9:25 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > > >
> > > > > On Mon, Aug 19, 2024 at 3:50 PM Michal Hocko <mhocko@suse.com> wrote:
> > > > > >
> > > > > > On Sun 18-08-24 10:55:09, Yafang Shao wrote:
> > > > > > > On Sat, Aug 17, 2024 at 2:25 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > > > > >
> > > > > > > > From: Barry Song <v-songbaohua@oppo.com>
> > > > > > > >
> > > > > > > > When users allocate memory with the __GFP_NOFAIL flag, they might
> > > > > > > > incorrectly use it alongside GFP_ATOMIC, GFP_NOWAIT, etc.  This kind of
> > > > > > > > non-blockable __GFP_NOFAIL is not supported and is pointless.  If we
> > > > > > > > attempt and still fail to allocate memory for these users, we have two
> > > > > > > > choices:
> > > > > > > >
> > > > > > > >     1. We could busy-loop and hope that some other direct reclamation or
> > > > > > > >     kswapd rescues the current process. However, this is unreliable
> > > > > > > >     and could ultimately lead to hard or soft lockups,
> > > > > > >
> > > > > > > That can occur even if we set both __GFP_NOFAIL and
> > > > > > > __GFP_DIRECT_RECLAIM, right?
> > > > > >
> > > > > > No, it cannot! With __GFP_DIRECT_RECLAIM the allocator might take a long
> > > > > > time to satisfy the allocation but it will reclaim to get the memory, it
> > > > > > > will sleep if necessary and it will trigger the OOM killer if there is
> > > > > > no other option. __GFP_DIRECT_RECLAIM is a completely different story
> > > > > > than without it which means _no_sleeping_ is allowed and therefore only
> > > > > > a busy loop waiting for the allocation to proceed is allowed.
> > > > >
> > > > > That could be a livelock.
> > > > > From the user's perspective, there's no noticeable difference between
> > > > > a livelock, soft lockup, or hard lockup.
> > > >
> > > > This is certainly different. A lockup occurs when tasks can't be scheduled,
> > > > causing the entire system to stop functioning.
> > >
> > > When a livelock occurs, your only options are to migrate your
> > > applications to other servers or reboot the system—there’s no other
> > > resolution (except for using oomd, which is difficult for users
> > > without cgroup2 or swap).
> > >
> > > So, there's effectively no difference.
> >
> > Could you express your options more clearly? I am guessing two
> > possibilities?
> > 1. entirely drop __GFP_NOFAIL and require all users who are
> > using __GFP_NOFAIL to add error handlers instead?
> 
> When the system is unstable—such as after reaching the maximum retries
> without successfully allocating pages—simply failing the operation
> might be the better option.

It seems you are failing to understand the __GFP_NOFAIL semantics and you
are circling around that. So let me repeat them for you here. Make sure
you understand before going forward with the discussion. Feel free to ask
if something is not clear but please do not continue with what-if kind of
questions.

__GFP_NOFAIL means that the caller has no way to deal with an allocation
failure. The allocator simply cannot fail the request even if it takes
ages to succeed! To put it simply, if you have code like

	while (!(ptr = alloc()));
or
	BUG_ON(!(ptr = alloc()));

then you had better use __GFP_NOFAIL rather than open-code the endless
loop or the BUG_ON for the failure.
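
As a rough userspace illustration of that point (not kernel code): the
sketch below models an allocator that keeps the retry loop internal when
a nofail flag is passed, so the caller does not open-code it. The names
model_alloc(), model_try_alloc(), model_failures_left and MODEL_NOFAIL
are all hypothetical and exist only for this example, and the flag value
is not the kernel's bit value.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag for this sketch only; not the kernel's bit value. */
#define MODEL_NOFAIL 0x2u

/* Number of upcoming attempts that will fail; simulates memory pressure. */
static int model_failures_left;

/* One allocation attempt: fails while simulated pressure remains. */
static void *model_try_alloc(size_t size)
{
	if (model_failures_left > 0) {
		model_failures_left--;
		return NULL;
	}
	return malloc(size);
}

/*
 * With MODEL_NOFAIL the retry loop lives inside the allocator, so the
 * caller never sees NULL. Without it, a single failed attempt is
 * reported to the caller, who must handle it.
 */
static void *model_alloc(size_t size, unsigned int flags)
{
	void *ptr;

	do {
		ptr = model_try_alloc(size);
	} while (!ptr && (flags & MODEL_NOFAIL));

	return ptr;
}
```

A caller that would otherwise write the while loop above would call
model_alloc(size, MODEL_NOFAIL) and use the result without a NULL check.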

Our (page, vmalloc, kmalloc) allocators do support that mode for
allocations that are allowed to sleep. But those allocators have never
supported and are unlikely to ever support atomic non-failing allocations.
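
The supported and unsupported combinations can be sketched as a small
predicate. This is only a userspace model under stated assumptions: the
MODEL_* bit values and model_nofail_supported() are hypothetical names
for this example, standing in for __GFP_DIRECT_RECLAIM and __GFP_NOFAIL,
and this is not how the kernel itself checks the flags.

```c
#include <stdbool.h>

/* Hypothetical bit values for this sketch; the kernel defines its own. */
#define MODEL_DIRECT_RECLAIM 0x1u	/* caller may sleep and reclaim */
#define MODEL_NOFAIL         0x2u	/* caller cannot handle failure */

/* A nofail request is only supportable when the caller may sleep. */
static bool model_nofail_supported(unsigned int flags)
{
	if (!(flags & MODEL_NOFAIL))
		return true;	/* ordinary request: failure is an option */

	return (flags & MODEL_DIRECT_RECLAIM) != 0;
}
```

In this model a nofail request without the reclaim bit, which is what an
atomic nofail request amounts to, is the only combination rejected.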

More clear?
-- 
Michal Hocko
SUSE Labs

