From: Kees Cook <kees@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>,
	Dave Chinner <david@fromorbit.com>,
	Yafang Shao <laoar.shao@gmail.com>,
	Harry Yoo <harry.yoo@oracle.com>,
	joel.granados@kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Josef Bacik <josef@toxicpanda.com>,
	linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] mm: kvmalloc: make kmalloc fast path real fast path
Date: Thu, 3 Apr 2025 09:21:50 -0700
Message-ID: <202504030920.EB65CCA2@keescook>
In-Reply-To: <Z-48K0OdNxZXcnkB@tiehlicka>

On Thu, Apr 03, 2025 at 09:43:39AM +0200, Michal Hocko wrote:
> There are users like xfs which need larger allocations with NOFAIL
> semantic. They are not using kvmalloc currently because the current
> implementation tries too hard to allocate through the kmalloc path
> which causes a lot of direct reclaim and compaction and that hurts
> performance a lot (see 8dc9384b7d75 ("xfs: reduce kvmalloc overhead for
> CIL shadow buffers") for more details).
>
> kvmalloc does support __GFP_RETRY_MAYFAIL semantic to express that
> kmalloc (physically contiguous) allocation is preferred and we should go
> more aggressive to make it happen. There is currently no way to express
> that kmalloc should be very lightweight and as it has been argued [1]
> this mode should be default to support kvmalloc(NOFAIL) with a
> lightweight kmalloc path which is currently impossible to express as
> __GFP_NOFAIL cannot be combined by any other reclaim modifiers.
>
> This patch makes all kmalloc allocations GFP_NOWAIT unless
> __GFP_RETRY_MAYFAIL is provided to kvmalloc. This allows to support both
> fail fast and retry hard on physically contiguous memory with vmalloc
> fallback.
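
A rough userspace model of the behaviour described above, not the actual
mm/slub.c change. The flag names, bit values and MODEL_PAGE_SIZE are all
hypothetical stand-ins, and the size threshold is an assumption taken from
the "larger requests" wording in the code comment further down:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this model only; NOT the kernel's gfp_t values. */
typedef unsigned int model_gfp_t;

#define MODEL_DIRECT_RECLAIM	(1u << 0)	/* caller may reclaim/compact synchronously */
#define MODEL_KSWAPD_RECLAIM	(1u << 1)	/* may wake background reclaim/compaction */
#define MODEL_RETRY_MAYFAIL	(1u << 2)	/* contiguous memory preferred, try hard */
#define MODEL_PAGE_SIZE		4096u		/* assumed cut-off for "larger requests" */

/*
 * Model of the adjustment applied to the kmalloc attempt: larger requests
 * lose direct reclaim unless the caller asked for the retry-hard mode.
 * Background reclaim wakeups are left untouched.
 */
static model_gfp_t model_kmalloc_gfp_adjust(model_gfp_t flags, size_t size)
{
	model_gfp_t adjusted = flags;

	if (size > MODEL_PAGE_SIZE && !(flags & MODEL_RETRY_MAYFAIL))
		adjusted &= ~MODEL_DIRECT_RECLAIM;

	/* Sanity check: the background-reclaim bit is never changed here. */
	assert((adjusted & MODEL_KSWAPD_RECLAIM) == (flags & MODEL_KSWAPD_RECLAIM));
	return adjusted;
}
```

In this model a 64 KiB request with both reclaim bits set comes back with
only the background bit, while the same request with MODEL_RETRY_MAYFAIL
keeps direct reclaim. The fallback to vmalloc on failure is not modelled.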
>
> There is a potential downside that relatively small allocations (smaller
> than PAGE_ALLOC_COSTLY_ORDER) could fallback to vmalloc too easily and
> cause page block fragmentation. We cannot really rule that out but it
> seems that xlog_cil_kvmalloc use doesn't indicate this to be happening.
>
> [1] https://lore.kernel.org/all/Z-3i1wATGh6vI8x8@dread.disaster.area/T/#u
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Thanks for finding a solution for this! It makes way more sense to me to
kick over to vmap by default for kvmalloc users.

> ---
> mm/slub.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index b46f87662e71..2da40c2f6478 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4972,14 +4972,16 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
> * We want to attempt a large physically contiguous block first because
> * it is less likely to fragment multiple larger blocks and therefore
> * contribute to a long term fragmentation less than vmalloc fallback.
> - * However make sure that larger requests are not too disruptive - no
> - * OOM killer and no allocation failure warnings as we have a fallback.
> + * However make sure that larger requests are not too disruptive - i.e.
> + * do not direct reclaim unless physically continuous memory is preferred
> + * (__GFP_RETRY_MAYFAIL mode). We still kick in kswapd/kcompactd to start
> + * working in the background but the allocation itself.

I think a word is missing here? "...but do the allocation..." or
"...allocation itself happens"?

--
Kees Cook