* [PATCH] mm: percpu: Increase PERCPU_MODULE_RESERVE
@ 2025-06-11 16:18 Sebastian Andrzej Siewior
2025-06-11 17:45 ` Tejun Heo
0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 16:18 UTC (permalink / raw)
To: linux-mm
Cc: Dennis Zhou, Tejun Heo, Christoph Lameter, Gal Pressman,
Peter Zijlstra, Thomas Gleixner
PERCPU_MODULE_RESERVE defines the maximum size that can be used for the
per-CPU data of modules. It is currently 8KiB.
Commit 035fcdc4d240c ("openvswitch: Merge three per-CPU structures into
one") restructured the per-CPU memory allocation for openvswitch and
moved the separate alloc_percpu() invocations at module init time to a
static per-CPU variable which is allocated by the module loader.
The size of the per-CPU data section for openvswitch is 6488 bytes which
is ~80% of the available per-CPU memory. Together with a few other
modules it is easy to exhaust the available 8KiB of memory.
The memory range for the per-CPU memory is allocated early and pages for
its backing are only allocated once the per-CPU memory is allocated.
Increasing the map from 8 to 16 KiB adds 256 bytes each to the alloc_map
and bound_map and 64 bytes to md_blocks (576 bytes in total).
Increase the available memory for the modules' per-CPU data section to
16KiB.
Reported-by: Gal Pressman <gal@nvidia.com>
Closes: https://lore.kernel.org/all/c401e017-f8db-4f57-a1cd-89beb979a277@nvidia.com
Fixes: 035fcdc4d240c ("openvswitch: Merge three per-CPU structures into one")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
include/linux/percpu.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 85bf8dd9f0874..57fcca09de18e 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -15,7 +15,7 @@
/* enough to cover all DEFINE_PER_CPUs in modules */
#ifdef CONFIG_MODULES
-#define PERCPU_MODULE_RESERVE (8 << 10)
+#define PERCPU_MODULE_RESERVE (8 << 11)
#else
#define PERCPU_MODULE_RESERVE 0
#endif
--
2.49.0
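As a check on the 576-byte figure in the commit message, the arithmetic can be sketched in userspace C. The constants below are assumptions, not taken from the thread: a 4-byte minimum allocation unit (one bitmap bit per unit), 4 KiB pages with one metadata block per page, and 32 bytes per metadata block. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; none of these are stated in the thread. */
#define MIN_ALLOC_SIZE 4u     /* bytes tracked by one bitmap bit */
#define ASSUMED_PAGE   4096u  /* one metadata block per page */
#define MD_BLOCK_SIZE  32u    /* bytes per block metadata entry */

#define OLD_RESERVE (8u << 10) /* 8 KiB, before the patch */
#define NEW_RESERVE (8u << 11) /* 16 KiB, after the patch */

/* Bytes one allocation bitmap needs to cover `bytes` of per-CPU space. */
static size_t bitmap_bytes(size_t bytes)
{
	assert(bytes % (MIN_ALLOC_SIZE * 8) == 0);
	return bytes / MIN_ALLOC_SIZE / 8;
}

/* Bytes of block metadata needed to cover `bytes` of per-CPU space. */
static size_t md_bytes(size_t bytes)
{
	assert(bytes % ASSUMED_PAGE == 0);
	return bytes / ASSUMED_PAGE * MD_BLOCK_SIZE;
}

/* Extra metadata when the reserve grows by `extra` bytes. */
static size_t extra_metadata(size_t extra)
{
	/* alloc_map and bound_map are assumed to share the same granularity */
	return 2 * bitmap_bytes(extra) + md_bytes(extra);
}
```

Under these assumptions, extra_metadata(NEW_RESERVE - OLD_RESERVE) is 2 * 256 + 64 = 576 bytes, which matches the figure quoted above.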
* Re: [PATCH] mm: percpu: Increase PERCPU_MODULE_RESERVE
2025-06-11 16:18 [PATCH] mm: percpu: Increase PERCPU_MODULE_RESERVE Sebastian Andrzej Siewior
@ 2025-06-11 17:45 ` Tejun Heo
2025-06-11 18:32 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2025-06-11 17:45 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: linux-mm, Dennis Zhou, Christoph Lameter, Gal Pressman,
Peter Zijlstra, Thomas Gleixner
On Wed, Jun 11, 2025 at 06:18:49PM +0200, Sebastian Andrzej Siewior wrote:
> PERCPU_MODULE_RESERVE defines the maximum size that can be used for the
> per-CPU data of modules. It is currently 8KiB.
>
> Commit 035fcdc4d240c ("openvswitch: Merge three per-CPU structures into
> one") restructured the per-CPU memory allocation for openvswitch and
> moved the separate alloc_percpu() invocations at module init time to a
> static per-CPU variable which is allocated by the module loader.
>
> The size of the per-CPU data section for openvswitch is 6488 bytes which
> is ~80% of the available per-CPU memory. Together with a few other
> modules it is easy to exhaust the available 8KiB of memory.
>
> The memory range for the per-CPU memory is allocated early and pages for
> its backing are only allocated once the per-CPU memory is allocated.
> Increasing the map from 8 to 16 KiB adds 256 bytes each to the alloc_map
> and bound_map and 64 bytes to md_blocks (576 bytes in total).
>
> Increase the available memory for the modules' per-CPU data section to
> 16KiB.
I think a better direction would be to keep using alloc_percpu(). There
aren't a lot of benefits to using static definitions compared to dynamic
ones, and making the reserve larger increases overhead for everyone.
Thanks.
--
tejun
* Re: [PATCH] mm: percpu: Increase PERCPU_MODULE_RESERVE
2025-06-11 17:45 ` Tejun Heo
@ 2025-06-11 18:32 ` Sebastian Andrzej Siewior
2025-06-11 18:37 ` Tejun Heo
0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 18:32 UTC (permalink / raw)
To: Tejun Heo
Cc: linux-mm, Dennis Zhou, Christoph Lameter, Gal Pressman,
Peter Zijlstra, Thomas Gleixner
On 2025-06-11 07:45:46 [-1000], Tejun Heo wrote:
> On Wed, Jun 11, 2025 at 06:18:49PM +0200, Sebastian Andrzej Siewior wrote:
> > PERCPU_MODULE_RESERVE defines the maximum size that can be used for the
> > per-CPU data of modules. It is currently 8KiB.
> >
> > Commit 035fcdc4d240c ("openvswitch: Merge three per-CPU structures into
> > one") restructured the per-CPU memory allocation for openvswitch and
> > moved the separate alloc_percpu() invocations at module init time to a
> > static per-CPU variable which is allocated by the module loader.
> >
> > The size of the per-CPU data section for openvswitch is 6488 bytes which
> > is ~80% of the available per-CPU memory. Together with a few other
> > modules it is easy to exhaust the available 8KiB of memory.
> >
> > The memory range for the per-CPU memory is allocated early and pages for
> > its backing are only allocated once the per-CPU memory is allocated.
> > Increasing the map from 8 to 16 KiB adds 256 bytes each to the alloc_map
> > and bound_map and 64 bytes to md_blocks (576 bytes in total).
> >
> > Increase the available memory for the modules' per-CPU data section to
> > 16KiB.
>
> I think a better direction would be to keep using alloc_percpu(). There
> aren't a lot of benefits to using static definitions compared to dynamic
> ones, and making the reserve larger increases overhead for everyone.
With the static definition I can avoid initialising the per-CPU locks at
runtime, because they are initialised at build time, and avoid the
alloc_percpu() at module init. I can also access members of the struct
directly and avoid one pointer dereference. All this for 576 bytes.
> Thanks.
Sebastian
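The trade-off described here can be illustrated with a userspace analogy. This is not kernel code: plain arrays indexed by a CPU number stand in for per-CPU storage, and `struct pcpu_state`, `NR_CPUS_DEMO`, and the function names are hypothetical. The static variant is initialised at build time and addressed directly; the dynamic variant needs an allocation, a runtime init loop, and one extra pointer load on each access.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_DEMO 4 /* hypothetical CPU count for the demo */

struct pcpu_state {
	int lock;              /* stand-in for a per-CPU lock; 1 = initialised */
	unsigned long counter;
};

/* Static variant: storage and initial values are fixed at build time. */
static struct pcpu_state static_state[NR_CPUS_DEMO] = {
	{ .lock = 1 }, { .lock = 1 }, { .lock = 1 }, { .lock = 1 },
};

/* Dynamic variant: only a pointer exists until init time. */
static struct pcpu_state *dynamic_state;

/* Analogue of allocating at module init and initialising each lock. */
static int dynamic_init(void)
{
	dynamic_state = calloc(NR_CPUS_DEMO, sizeof(*dynamic_state));
	if (!dynamic_state)
		return -1;
	for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		dynamic_state[cpu].lock = 1;
	return 0;
}

/* Analogue of freeing the allocation at module exit. */
static void dynamic_exit(void)
{
	free(dynamic_state);
	dynamic_state = NULL;
}

/* Static access: the array address is known at link time. */
static unsigned long static_inc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	return ++static_state[cpu].counter;
}

/* Dynamic access: load the base pointer first, then index. */
static unsigned long dynamic_inc(int cpu)
{
	assert(dynamic_state && cpu >= 0 && cpu < NR_CPUS_DEMO);
	return ++dynamic_state[cpu].counter;
}
```

The static variant needs no init or exit function at all, which is the convenience being argued for; the cost, as the rest of the thread discusses, is that its storage comes out of the fixed module reserve.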
* Re: [PATCH] mm: percpu: Increase PERCPU_MODULE_RESERVE
2025-06-11 18:32 ` Sebastian Andrzej Siewior
@ 2025-06-11 18:37 ` Tejun Heo
2025-06-11 19:06 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2025-06-11 18:37 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: linux-mm, Dennis Zhou, Christoph Lameter, Gal Pressman,
Peter Zijlstra, Thomas Gleixner
On Wed, Jun 11, 2025 at 08:32:57PM +0200, Sebastian Andrzej Siewior wrote:
> On 2025-06-11 07:45:46 [-1000], Tejun Heo wrote:
> > On Wed, Jun 11, 2025 at 06:18:49PM +0200, Sebastian Andrzej Siewior wrote:
> > > PERCPU_MODULE_RESERVE defines the maximum size that can be used for the
> > > per-CPU data of modules. It is currently 8KiB.
> > >
> > > Commit 035fcdc4d240c ("openvswitch: Merge three per-CPU structures into
> > > one") restructured the per-CPU memory allocation for openvswitch and
> > > moved the separate alloc_percpu() invocations at module init time to a
> > > static per-CPU variable which is allocated by the module loader.
> > >
> > > The size of the per-CPU data section for openvswitch is 6488 bytes which
> > > is ~80% of the available per-CPU memory. Together with a few other
> > > modules it is easy to exhaust the available 8KiB of memory.
> > >
> > > The memory range for the per-CPU memory is allocated early and pages for
> > > its backing are only allocated once the per-CPU memory is allocated.
> > > Increasing the map from 8 to 16 KiB adds 256 bytes each to the alloc_map
> > > and bound_map and 64 bytes to md_blocks (576 bytes in total).
> > >
> > > Increase the available memory for the modules' per-CPU data section to
> > > 16KiB.
> >
> > I think a better direction would be to keep using alloc_percpu(). There
> > aren't a lot of benefits to using static definitions compared to dynamic
> > ones, and making the reserve larger increases overhead for everyone.
>
> With the static definition I can avoid initialising the per-CPU locks at
> runtime, because they are initialised at build time, and avoid the
> alloc_percpu() at module init. I can also access members of the struct
> directly and avoid one pointer dereference. All this for 576 bytes.
Yeah, but for that you're making all machines that run the kernel waste
two more pages per CPU. Modern machines are big and the overhead quickly
gets into megs. Sure, it's not a huge amount of memory but it's going to be
memory that almost nobody uses, relatively speaking, which just sits there
and gets wasted.
Thanks.
--
tejun
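The scaling described here is simple multiplication. A minimal sketch, assuming 4 KiB pages (so the extra 8 KiB is two pages) and that the extra reserve is backed on every CPU; the CPU counts mentioned afterwards are illustrative and not from the thread.

```c
#include <assert.h>
#include <stddef.h>

#define EXTRA_RESERVE     (8u << 10) /* 16 KiB - 8 KiB added by the patch */
#define ASSUMED_PAGE_SIZE 4096u      /* assumption: 4 KiB pages */

/* Additional pages each CPU has to back for the larger reserve. */
static size_t extra_pages_per_cpu(void)
{
	return EXTRA_RESERVE / ASSUMED_PAGE_SIZE;
}

/* Additional bytes across the whole machine for `nr_cpus` CPUs. */
static size_t extra_bytes_total(size_t nr_cpus)
{
	return nr_cpus * EXTRA_RESERVE;
}
```

With these assumptions a 128-CPU machine gives up 1 MiB and a 512-CPU machine 4 MiB, whether or not any module ever uses the space.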
* Re: [PATCH] mm: percpu: Increase PERCPU_MODULE_RESERVE
2025-06-11 18:37 ` Tejun Heo
@ 2025-06-11 19:06 ` Sebastian Andrzej Siewior
2025-06-11 20:35 ` Tejun Heo
0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 19:06 UTC (permalink / raw)
To: Tejun Heo
Cc: linux-mm, Dennis Zhou, Christoph Lameter, Gal Pressman,
Peter Zijlstra, Thomas Gleixner
On 2025-06-11 08:37:13 [-1000], Tejun Heo wrote:
> Yeah, but for that you're making all machines that run the kernel waste
> two more pages per CPU. Modern machines are big and the overhead quickly
> gets into megs. Sure, it's not a huge amount of memory but it's going to be
> memory that almost nobody uses, relatively speaking, which just sits there
> and gets wasted.
I am not sure I waste two pages, because the memory is only allocated
once it is used.
Anyway, let me redo it to the dynamic allocation then. The memory of the
single module is quite huge. I looked at the per-CPU allocation of all
modules built with a Debian config on x86-64 and (ignoring alignment and
the openvswitch module) it was below 4KiB…
> Thanks.
>
Sebastian
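The numbers in this exchange can be put side by side. A minimal sketch, assuming the loaded modules' per-CPU section sizes simply have to sum to no more than the reserve (alignment and fragmentation ignored, as in the message above). The 6488-byte figure is from the patch description; `fits_in_reserve` and `headroom_after_ovs` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RESERVE_8K       (8u << 10) /* current PERCPU_MODULE_RESERVE */
#define OVS_PERCPU_BYTES 6488u      /* openvswitch, per the patch description */

/* True if all section sizes fit into `reserve` when packed back to back. */
static bool fits_in_reserve(const size_t *sizes, size_t n, size_t reserve)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		if (sizes[i] > reserve - used) /* would exceed the reserve */
			return false;
		used += sizes[i];
	}
	return true;
}

/* Bytes left for every other module once openvswitch is loaded. */
static size_t headroom_after_ovs(void)
{
	return RESERVE_8K - OVS_PERCPU_BYTES;
}
```

With openvswitch loaded, 1704 bytes of the 8 KiB reserve remain for all other modules combined, so a set of modules that needs more than that alongside it no longer fits.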
* Re: [PATCH] mm: percpu: Increase PERCPU_MODULE_RESERVE
2025-06-11 19:06 ` Sebastian Andrzej Siewior
@ 2025-06-11 20:35 ` Tejun Heo
0 siblings, 0 replies; 6+ messages in thread
From: Tejun Heo @ 2025-06-11 20:35 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: linux-mm, Dennis Zhou, Christoph Lameter, Gal Pressman,
Peter Zijlstra, Thomas Gleixner
On Wed, Jun 11, 2025 at 09:06:42PM +0200, Sebastian Andrzej Siewior wrote:
> On 2025-06-11 08:37:13 [-1000], Tejun Heo wrote:
> > Yeah, but for that you're making all machines that run the kernel waste
> > two more pages per CPU. Modern machines are big and the overhead quickly
> > gets into megs. Sure, it's not a huge amount of memory but it's going to be
> > memory that almost nobody uses, relatively speaking, which just sits there
> > and gets wasted.
>
> I am not sure I waste two pages, because the memory is only allocated
> once it is used.
The reserve percpu pages need to be allocated on every machine and most
machines wouldn't be using openvswitch, right?
> Anyway, let me redo it to the dynamic allocation then. The memory of the
> single module is quite huge. I looked at the per-CPU allocation of all
> modules built with a Debian config on x86-64 and (ignoring alignment and
> the openvswitch module) it was below 4KiB…
Yeah, the percpu module static reserve is intended for things like
individual pointers or counters, not big structs. The limitation comes from
how the reserve area is implemented. Because of the addressing limitations,
it has to be close to the built-in percpu static area and it's beneficial to
put the built-in area in an address area that's covered by huge pages, so we
can't really sparse map them. So, the end result is not the greatest -
modules can only use small static percpu memory areas.
Maybe it'd be better to outright require all percpu areas to be dynamically
allocated for modules but that seems unnecessarily draconian if you need
like a couple counters, so this is where we ended up.
Thanks.
--
tejun