From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Dave Chinner <david@fromorbit.com>, Matthew Wilcox <willy@infradead.org>
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-block@vger.kernel.org,
linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org,
linux-nvme@lists.infradead.org,
Kent Overstreet <kent.overstreet@gmail.com>,
Michal Hocko <mhocko@kernel.org>
Subject: Re: [LSF/MM/BPF TOPIC] Removing GFP_NOFS
Date: Thu, 8 Feb 2024 17:02:07 +0100
Message-ID: <3ba0dffa-beea-478f-bb6e-777b6304fb69@kernel.org>
In-Reply-To: <ZZzP6731XwZQnz0o@dread.disaster.area>

On 1/9/24 05:47, Dave Chinner wrote:
> On Thu, Jan 04, 2024 at 09:17:16PM +0000, Matthew Wilcox wrote:
>> This is primarily a _FILESYSTEM_ track topic. All the work has already
>> been done on the MM side; the FS people need to do their part. It could
>> be a joint session, but I'm not sure there's much for the MM people
>> to say.
>>
>> There are situations where we need to allocate memory, but cannot call
>> into the filesystem to free memory. Generally this is because we're
>> holding a lock or we've started a transaction, and attempting to write
>> out dirty folios to reclaim memory would result in a deadlock.
>>
>> The old way to solve this problem is to specify GFP_NOFS when allocating
>> memory. This conveys little information about what is being protected
>> against, and so it is hard to know when it might be safe to remove.
>> It's also a reflex -- many filesystem authors use GFP_NOFS by default
>> even when they could use GFP_KERNEL because there's no risk of deadlock.
>>
>> The new way is to use the scoped APIs -- memalloc_nofs_save() and
>> memalloc_nofs_restore(). These should be called when we start a
>> transaction or take a lock that would cause a GFP_KERNEL allocation to
>> deadlock. Then just use GFP_KERNEL as normal. The memory allocators
>> can see the nofs situation is in effect and will not call back into
>> the filesystem.
>
> So in rebasing the XFS kmem.[ch] removal patchset I've been working
> on, there is a clear memory allocator function that we need to be
> scoped: __GFP_NOFAIL.
>
> All of the allocations done through the existing XFS kmem.[ch]
> interfaces (i.e. just about everything) have __GFP_NOFAIL semantics
> added except in the explicit cases where we add KM_MAYFAIL to
> indicate that the allocation can fail.
>
> The result of this conversion to remove GFP_NOFS is that I'm also
> adding *dozens* of __GFP_NOFAIL annotations because we effectively
> scope that behaviour.
>
> Hence I think this discussion needs to consider that __GFP_NOFAIL is
> also widely used within critical filesystem code that cannot
> gracefully recover from memory allocation failures, and that this
> would also be useful to scope....
>
> Yeah, I know, mm developers hate __GFP_NOFAIL. We've been using
> these NOFAIL semantics in XFS for over 2 decades and the sky hasn't
> fallen. So can we get memalloc_nofail_{save,restore}() so that we
> can change the default allocation behaviour in certain contexts
> (e.g. the same contexts we need NOFS allocations) to be NOFAIL
> unless __GFP_RETRY_MAYFAIL or __GFP_NORETRY are set?

Your points and Kent's proposal of scoped GFP_NOWAIT [1] suggest to me that
this is no longer an FS-only topic, as it isn't just about converting to the
scoped APIs, but also about how they should be improved.

[1] http://lkml.kernel.org/r/Zbu_yyChbCO6b2Lj@tiehlicka

> We already have memalloc_noreclaim_{save/restore}() for turning off
> direct memory reclaim for a given context (i.e. equivalent of
> clearing __GFP_DIRECT_RECLAIM), so if we are going to embrace scoped
> allocation contexts, then we should be going all in and providing
> all the contexts that filesystems actually need....
>
> -Dave.
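
For illustration, here is a minimal userspace sketch of the scoping idea
discussed above. It is a model under stated assumptions, not kernel code:
every model_/MODEL_ name is made up for this example, the flag values are
arbitrary, and the thread-local variable only stands in for per-task state.
The NOFS part follows the behaviour Matthew describes (inside the scope, a
GFP_KERNEL allocation loses the FS bit). The NOFAIL scope is hypothetical;
it only models the semantics Dave asks for, where NOFAIL becomes the default
unless the caller passes a flag that permits failure.

```c
#include <assert.h>

/* Illustrative flag bits only; the kernel's real values differ. */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_IO             0x01u
#define MODEL_GFP_FS             0x02u
#define MODEL_GFP_DIRECT_RECLAIM 0x04u
#define MODEL_GFP_NOFAIL         0x08u
#define MODEL_GFP_RETRY_MAYFAIL  0x10u
#define MODEL_GFP_NORETRY        0x20u
#define MODEL_GFP_KERNEL \
	(MODEL_GFP_IO | MODEL_GFP_FS | MODEL_GFP_DIRECT_RECLAIM)

/* Scope bits kept in per-task state. */
#define MODEL_SCOPE_NOFS   0x1u
#define MODEL_SCOPE_NOFAIL 0x2u	/* hypothetical: the proposed scope */

/* Stand-in for the per-task flags word. */
static _Thread_local unsigned int model_task_scope;

/* Enter a scope; return the previous state of that bit for restore. */
static unsigned int model_scope_save(unsigned int bit)
{
	unsigned int old = model_task_scope & bit;

	model_task_scope |= bit;
	return old;
}

/* Leave a scope by putting the saved bit state back. */
static void model_scope_restore(unsigned int bit, unsigned int old)
{
	model_task_scope = (model_task_scope & ~bit) | old;
}

/* What the allocator would see after applying the active scopes. */
static model_gfp_t model_effective_gfp(model_gfp_t gfp)
{
	/* NOFS scope: never recurse into the filesystem. */
	if (model_task_scope & MODEL_SCOPE_NOFS)
		gfp &= ~MODEL_GFP_FS;

	/* NOFAIL scope: default to NOFAIL unless the caller opted out. */
	if ((model_task_scope & MODEL_SCOPE_NOFAIL) &&
	    !(gfp & (MODEL_GFP_RETRY_MAYFAIL | MODEL_GFP_NORETRY)))
		gfp |= MODEL_GFP_NOFAIL;

	return gfp;
}

/* Example caller: a "transaction" that allocates with plain GFP_KERNEL. */
static model_gfp_t model_transaction_alloc_flags(void)
{
	unsigned int nofs = model_scope_save(MODEL_SCOPE_NOFS);
	model_gfp_t seen = model_effective_gfp(MODEL_GFP_KERNEL);

	model_scope_restore(MODEL_SCOPE_NOFS, nofs);
	return seen;
}
```

Because save returns the previous state of the scope bit, scopes nest:
restoring an inner scope leaves the outer one in effect, and the call sites
in between keep passing plain GFP_KERNEL.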