From: Slava Imameev <slava.imameev@crowdstrike.com>
To: <alexei.starovoitov@gmail.com>
Cc: <ameryhung@gmail.com>, <andrii@kernel.org>, <ast@kernel.org>,
<bot+bpf-ci@kernel.org>, <bpf@vger.kernel.org>, <clm@meta.com>,
<daniel@iogearbox.net>, <eddyz87@gmail.com>,
<ihor.solodrai@linux.dev>, <kernel-team@meta.com>,
<martin.lau@kernel.org>, <memxor@gmail.com>,
<netdev@vger.kernel.org>, <yonghong.song@linux.dev>
Subject: Re: [PATCH bpf-next v2 2/3] bpf: Use kmalloc_nolock() universally in local storage
Date: Mon, 13 Apr 2026 05:40:44 +1000
Message-ID: <20260412194044.13195-1-slava.imameev@crowdstrike.com>
In-Reply-To: <CAADnVQKeFF--bgnZZSU12UY0muuwYA=7EdzLyOi837oZs+bXTA@mail.gmail.com>
On Fri, 10 Apr 2026 21:39:00 -0700 Alexei Starovoitov wrote:
> >
> >
> > This allows value sizes up to ~65KB. Before this patch, socket and
> > inode storage used bpf_map_kzalloc() (backed by regular kmalloc)
> > which could handle those large sizes. After this patch, any
> > elem_size above KMALLOC_MAX_CACHE_SIZE will silently fail: the map
> > creation succeeds via bpf_local_storage_map_alloc_check() but every
> > element allocation returns NULL.
> >
> > Should BPF_LOCAL_STORAGE_MAX_VALUE_SIZE be updated to use
> > KMALLOC_MAX_CACHE_SIZE instead of KMALLOC_MAX_SIZE now that all
> > storage types go through kmalloc_nolock()?
> >
> > Slava Imameev raised the same concern for task storage in
> > https://lore.kernel.org/bpf/20260410014341.47043-1-slava.imameev@crowdstrike.com/
>
> Right. Let's update it, but I don't think it's a regression.
> On a loaded system kmalloc_large() rarely succeeds for order 2+.
> That's why kmalloc_nolock() doesn't attempt to bridge that gap.
> One or two contiguous physical pages is the best one can expect.
> In early bpf days we picked KMALLOC_MAX_SIZE assuming that
> it's a realistic max for kmalloc().
> It turned out to be wishful thinking.
> kmalloc_large concept should really be removed.
> It deceives users into thinking that it's usable.

Do you think it would be viable to extend task storage to
support larger allocations, restoring support for 64KB values or
perhaps a smaller limit such as 32KB, using vmalloc or
bpf_mem_cache_alloc, with the obvious restrictions that vmalloc
imposes? Perhaps we could use bpf_mem_cache_alloc as the primary
mechanism, with vmalloc as a fallback when the caller context
permits?

We've found task storage allocations larger than 8KB quite
valuable for scenarios that involve processing multiple file paths.
Currently, without large task storage support, we're forced to
preallocate maps with 12KB+ values and significantly
over-provision the number of entries to reduce the probability
of running out of free entries. This approach places an
unnecessary burden on the memory subsystem, since much of this
preallocated memory remains unused.

Even if a task storage allocation fails for lack of contiguous
physical memory and vmalloc is not possible, there is still the
option of maintaining an emergency preallocated map that is much
smaller than the one needed when the map serves as the primary
mechanism.

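The fallback order described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not code from the
patch series: struct emergency_pool, scratch_pick(), pool_put(), and the
slot count and size are all hypothetical. In a real BPF program the
"primary" pointer would be the result of a task storage lookup such as
bpf_task_storage_get(), and the pool would be a small preallocated map.

```c
#include <stddef.h>

#define POOL_SLOTS 2            /* hypothetical: small emergency pool */
#define SLOT_SIZE  (12 * 1024)  /* hypothetical: 12KB per slot */

/* Stand-in for a small preallocated map used only as a fallback. */
struct emergency_pool {
	int in_use[POOL_SLOTS];
	unsigned char slot[POOL_SLOTS][SLOT_SIZE];
};

/*
 * Choose a scratch buffer. "primary" is the result of the preferred
 * per-task allocation and is NULL when that allocation failed.
 * Returns the primary buffer if present, otherwise a free pool slot,
 * otherwise NULL.
 */
static void *scratch_pick(void *primary, struct emergency_pool *pool,
			  size_t size)
{
	if (primary)
		return primary;
	if (size > SLOT_SIZE)
		return NULL;	/* request does not fit a pool slot */
	for (int i = 0; i < POOL_SLOTS; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = 1;
			return pool->slot[i];
		}
	}
	return NULL;		/* pool exhausted */
}

/* Return a pool slot; pointers that are not pool slots are ignored. */
static void pool_put(struct emergency_pool *pool, void *buf)
{
	for (int i = 0; i < POOL_SLOTS; i++) {
		if (buf == (void *)pool->slot[i])
			pool->in_use[i] = 0;
	}
}
```

Because the pool is consulted only after the primary path fails, it can be
sized for the expected failure rate rather than for the full task count.
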
With larger task storage allocations, we've implemented a simple
memory allocator that operates over task storage. For example, a
16KB task storage can accommodate multiple allocations, one large
and a couple of small ones, which has substantially reduced our
memory footprint compared to the current map-based approach. We've
also experimented successfully with 32KB arenas for workloads
requiring even larger working sets.
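
An allocator of the kind described above could look roughly like the
following bump allocator over a fixed 16KB buffer. The actual
implementation is not shown in this thread, so this is a userspace sketch
under assumptions: struct arena, arena_alloc(), arena_reset(), and the
8-byte alignment are hypothetical choices made for illustration.

```c
#include <stddef.h>

#define ARENA_SIZE  (16 * 1024)  /* matches the 16KB example above */
#define ARENA_ALIGN 8            /* hypothetical alignment choice */

/* Stand-in for one task storage value used as an arena. */
struct arena {
	size_t used;                     /* bytes handed out so far */
	unsigned char buf[ARENA_SIZE];
};

/* Discard all allocations at once, e.g. when an operation completes. */
static void arena_reset(struct arena *a)
{
	a->used = 0;
}

/*
 * Carve "size" bytes out of the arena at the next aligned offset.
 * Returns NULL when the request is empty or does not fit.
 */
static void *arena_alloc(struct arena *a, size_t size)
{
	size_t start = (a->used + ARENA_ALIGN - 1) &
		       ~(size_t)(ARENA_ALIGN - 1);

	if (size == 0 || start > ARENA_SIZE || size > ARENA_SIZE - start)
		return NULL;
	a->used = start + size;
	return a->buf + start;
}
```

With this layout one 12KB path buffer and a couple of small allocations
fit in the same 16KB value, and the bounds check is a single comparison
against a compile-time constant.
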