* [PATCH] memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
@ 2025-11-04 0:39 Akinobu Mita
2025-11-04 17:18 ` Mike Rapoport
0 siblings, 1 reply; 5+ messages in thread
From: Akinobu Mita @ 2025-11-04 0:39 UTC
To: akinobu.mita; +Cc: linux-cxl, linux-kernel, linux-mm, akpm, rppt
memblock_estimated_nr_free_pages() returns the difference between the total
size of the "memory" memblock type and the "reserved" memblock type.
The "soft-reserved" memory regions are added to the "reserved" memblock
type, but not to the "memory" memblock type. Therefore,
memblock_estimated_nr_free_pages() may return a smaller value than
expected, or if it underflows, an extremely large value.
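The wraparound described above can be reproduced with plain unsigned arithmetic. The following is a minimal user-space sketch, not kernel code: toy_estimated_free_pages(), TOY_PAGE_SHIFT, TOY_GIB and the sizes mentioned below are illustrative assumptions, with uint64_t standing in for a 64-bit phys_addr_t and 4 KiB pages.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT	12		/* assumption: 4 KiB pages */
#define TOY_GIB		(1ULL << 30)

/*
 * Hypothetical stand-in for the old calculation: the total size of the
 * "memory" type minus the total size of the "reserved" type, converted
 * to pages. Both operands are unsigned, so the subtraction wraps around
 * when reserved_bytes is larger than memory_bytes.
 */
static uint64_t toy_estimated_free_pages(uint64_t memory_bytes,
					 uint64_t reserved_bytes)
{
	return (memory_bytes - reserved_bytes) >> TOY_PAGE_SHIFT;
}
```

With 16 GiB of DRAM and 1 GiB of kernel reservations this yields 3932160 pages. Counting an extra 8 GiB of soft-reserved memory in the reserved total drops the estimate to 1835008 pages, and counting 64 GiB makes the subtraction wrap around to a value on the order of 2^52 pages.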
/proc/sys/kernel/threads-max is determined by the value of
memblock_estimated_nr_free_pages(). This issue was discovered on machines
with CXL memory because kernel.threads-max was either smaller than expected
or extremely large for the installed DRAM size.
This fixes the issue by improving the accuracy of
memblock_estimated_nr_free_pages() by subtracting only the overlapping size
of regions with "memory" and "reserved" memblock types.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
---
mm/memblock.c | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/mm/memblock.c b/mm/memblock.c
index e23e16618e9b..af014fa10a44 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1812,6 +1812,22 @@ phys_addr_t __init_memblock memblock_reserved_kern_size(phys_addr_t limit, int n
 	return total;
 }
 
+static phys_addr_t __init memblock_addrs_overlap_size(phys_addr_t base1, phys_addr_t size1,
+						       phys_addr_t base2, phys_addr_t size2)
+{
+	phys_addr_t start, end;
+
+	if (!memblock_addrs_overlap(base1, size1, base2, size2))
+		return 0;
+
+	memblock_cap_size(base1, &size1);
+	memblock_cap_size(base2, &size2);
+	start = max(base1, base2);
+	end = min(base1 + size1, base2 + size2);
+
+	return end - start;
+}
+
 /**
  * memblock_estimated_nr_free_pages - return estimated number of free pages
  * from memblock point of view
@@ -1826,7 +1842,22 @@ phys_addr_t __init_memblock memblock_reserved_kern_size(phys_addr_t limit, int n
  */
 unsigned long __init memblock_estimated_nr_free_pages(void)
 {
-	return PHYS_PFN(memblock_phys_mem_size() - memblock_reserved_size());
+	int memory_idx, reserved_idx;
+	struct memblock_type *memory_type = &memblock.memory;
+	struct memblock_type *reserved_type = &memblock.reserved;
+	struct memblock_region *memory_region, *reserved_region;
+	phys_addr_t phys_mem_size = 0;
+
+	for_each_memblock_type(memory_idx, memory_type, memory_region) {
+		phys_mem_size += memory_region->size;
+		for_each_memblock_type(reserved_idx, reserved_type, reserved_region) {
+			phys_mem_size -= memblock_addrs_overlap_size(memory_region->base,
+					memory_region->size, reserved_region->base,
+					reserved_region->size);
+		}
+	}
+
+	return PHYS_PFN(phys_mem_size);
 }
 
 /* lowest address */
--
2.43.0
* Re: [PATCH] memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
2025-11-04 0:39 [PATCH] memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory Akinobu Mita
@ 2025-11-04 17:18 ` Mike Rapoport
2025-11-05 13:23 ` Akinobu Mita
2025-11-11 1:00 ` [PATCH v2] " Akinobu Mita
0 siblings, 2 replies; 5+ messages in thread
From: Mike Rapoport @ 2025-11-04 17:18 UTC
To: Akinobu Mita, Dan Williams; +Cc: linux-cxl, linux-kernel, linux-mm, akpm
(added Dan Williams)
Hi,
On Tue, Nov 04, 2025 at 09:39:21AM +0900, Akinobu Mita wrote:
> memblock_estimated_nr_free_pages() returns the difference between the total
> size of the "memory" memblock type and the "reserved" memblock type.
>
> The "soft-reserved" memory regions are added to the "reserved" memblock
> type, but not to the "memory" memblock type. Therefore,
@Dan, do we really need to memblock_reserve() the E820_TYPE_SOFT_RESERVED
ranges?
Quick scan didn't show anything that requires this, but I could easily miss
something.
> memblock_estimated_nr_free_pages() may return a smaller value than
> expected, or if it underflows, an extremely large value.
>
> /proc/sys/kernel/threads-max is determined by the value of
> memblock_estimated_nr_free_pages(). This issue was discovered on machines
> with CXL memory because kernel.threads-max was either smaller than expected
> or extremely large for the installed DRAM size.
>
> This fixes the issue by improving the accuracy of
> memblock_estimated_nr_free_pages() by subtracting only the overlapping size
> of regions with "memory" and "reserved" memblock types.
>
> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
> ---
> mm/memblock.c | 33 ++++++++++++++++++++++++++++++++-
> 1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index e23e16618e9b..af014fa10a44 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
...
> @@ -1826,7 +1842,22 @@ phys_addr_t __init_memblock memblock_reserved_kern_size(phys_addr_t limit, int n
> */
> unsigned long __init memblock_estimated_nr_free_pages(void)
> {
> - return PHYS_PFN(memblock_phys_mem_size() - memblock_reserved_size());
We have memblock_reserved_kern_size() that tells how much memory was
reserved from the actual RAM. Replacing memblock_reserved_size() with
memblock_reserved_kern_size() will omit "soft-reserved" ranges.
> + int memory_idx, reserved_idx;
> + struct memblock_type *memory_type = &memblock.memory;
> + struct memblock_type *reserved_type = &memblock.reserved;
> + struct memblock_region *memory_region, *reserved_region;
> + phys_addr_t phys_mem_size = 0;
> +
> + for_each_memblock_type(memory_idx, memory_type, memory_region) {
> + phys_mem_size += memory_region->size;
> + for_each_memblock_type(reserved_idx, reserved_type, reserved_region) {
> + phys_mem_size -= memblock_addrs_overlap_size(memory_region->base,
> + memory_region->size, reserved_region->base,
> + reserved_region->size);
> + }
> + }
> +
> + return PHYS_PFN(phys_mem_size);
> }
>
> /* lowest address */
> --
> 2.43.0
>
--
Sincerely yours,
Mike.
* Re: [PATCH] memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
2025-11-04 17:18 ` Mike Rapoport
@ 2025-11-05 13:23 ` Akinobu Mita
2025-11-11 1:00 ` [PATCH v2] " Akinobu Mita
1 sibling, 0 replies; 5+ messages in thread
From: Akinobu Mita @ 2025-11-05 13:23 UTC
To: Mike Rapoport; +Cc: Dan Williams, linux-cxl, linux-kernel, linux-mm, akpm
On Wed, Nov 5, 2025 at 2:18 Mike Rapoport <rppt@kernel.org> wrote:
>
> (added Dan Williams)
>
> Hi,
>
> On Tue, Nov 04, 2025 at 09:39:21AM +0900, Akinobu Mita wrote:
> > memblock_estimated_nr_free_pages() returns the difference between the total
> > size of the "memory" memblock type and the "reserved" memblock type.
> >
> > The "soft-reserved" memory regions are added to the "reserved" memblock
> > type, but not to the "memory" memblock type. Therefore,
>
> @Dan, do we really need to memblock_reserve() the E820_TYPE_SOFT_RESERVED
> ranges?
> Quick scan didn't show anything that requires this, but I could easily miss
> something.
>
> > memblock_estimated_nr_free_pages() may return a smaller value than
> > expected, or if it underflows, an extremely large value.
> >
> > /proc/sys/kernel/threads-max is determined by the value of
> > memblock_estimated_nr_free_pages(). This issue was discovered on machines
> > with CXL memory because kernel.threads-max was either smaller than expected
> > or extremely large for the installed DRAM size.
> >
> > This fixes the issue by improving the accuracy of
> > memblock_estimated_nr_free_pages() by subtracting only the overlapping size
> > of regions with "memory" and "reserved" memblock types.
> >
> > Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
> > ---
> > mm/memblock.c | 33 ++++++++++++++++++++++++++++++++-
> > 1 file changed, 32 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index e23e16618e9b..af014fa10a44 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
>
> ...
>
> > @@ -1826,7 +1842,22 @@ phys_addr_t __init_memblock memblock_reserved_kern_size(phys_addr_t limit, int n
> > */
> > unsigned long __init memblock_estimated_nr_free_pages(void)
> > {
> > - return PHYS_PFN(memblock_phys_mem_size() - memblock_reserved_size());
>
> We have memblock_reserved_kern_size() that tells how much memory was
> reserved from the actual RAM. Replacing memblock_reserved_size() with
> memblock_reserved_kern_size() will omit "soft-reserved" ranges.
Replacing memblock_reserved_size() with memblock_reserved_kern_size(
MEMBLOCK_ALLOC_ANYWHERE, NUMA_NO_NODE) also fixed the problem. Thank you.
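The difference between the two totals can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct toy_region, its from_ram marker and both helper functions are hypothetical and are not the kernel's data structures or API; the real criteria used by memblock_reserved_kern_size() are in mm/memblock.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_GIB	(1ULL << 30)

/* Hypothetical model of one reserved range; not the kernel's struct. */
struct toy_region {
	uint64_t base;
	uint64_t size;
	bool from_ram;	/* assumption: set when the range was taken from real RAM */
};

/* Sum every reserved range, soft-reserved ones included. */
static uint64_t toy_reserved_size(const struct toy_region *r, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += r[i].size;
	return total;
}

/* Sum only the reserved ranges that were taken from real RAM. */
static uint64_t toy_reserved_ram_size(const struct toy_region *r, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		if (r[i].from_ram)
			total += r[i].size;
	return total;
}
```

For a 1 GiB reservation taken from RAM plus a 64 GiB soft-reserved range, the first helper reports 65 GiB and the second 1 GiB, so subtracting the second from 16 GiB of DRAM leaves 15 GiB (3932160 pages of 4 KiB) instead of wrapping around.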
* [PATCH v2] memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
2025-11-04 17:18 ` Mike Rapoport
2025-11-05 13:23 ` Akinobu Mita
@ 2025-11-11 1:00 ` Akinobu Mita
2025-11-11 16:19 ` Mike Rapoport
1 sibling, 1 reply; 5+ messages in thread
From: Akinobu Mita @ 2025-11-11 1:00 UTC
To: rppt; +Cc: akinobu.mita, linux-cxl, linux-kernel, linux-mm, akpm,
dan.j.williams
memblock_estimated_nr_free_pages() returns the difference between the total
size of the "memory" memblock type and the "reserved" memblock type.
The "soft-reserved" memory regions are added to the "reserved" memblock
type, but not to the "memory" memblock type. Therefore,
memblock_estimated_nr_free_pages() may return a smaller value than
expected, or if it underflows, an extremely large value.
/proc/sys/kernel/threads-max is determined by the value of
memblock_estimated_nr_free_pages(). This issue was discovered on machines
with CXL memory because kernel.threads-max was either smaller than expected
or extremely large for the installed DRAM size.
This fixes the issue by replacing memblock_reserved_size() with
memblock_reserved_kern_size() that tells how much memory was
reserved from the actual RAM.
Suggested-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
---
v2: instead of subtracting only the overlapping size,
replace memblock_reserved_size() with memblock_reserved_kern_size()
mm/memblock.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/memblock.c b/mm/memblock.c
index c7869860e659..905d06b16348 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1826,7 +1826,8 @@ phys_addr_t __init_memblock memblock_reserved_kern_size(phys_addr_t limit, int n
  */
 unsigned long __init memblock_estimated_nr_free_pages(void)
 {
-	return PHYS_PFN(memblock_phys_mem_size() - memblock_reserved_size());
+	return PHYS_PFN(memblock_phys_mem_size() -
+		memblock_reserved_kern_size(MEMBLOCK_ALLOC_ANYWHERE, NUMA_NO_NODE));
 }
 
 /* lowest address */
--
2.43.0
* Re: [PATCH v2] memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
2025-11-11 1:00 ` [PATCH v2] " Akinobu Mita
@ 2025-11-11 16:19 ` Mike Rapoport
0 siblings, 0 replies; 5+ messages in thread
From: Mike Rapoport @ 2025-11-11 16:19 UTC
To: Akinobu Mita
Cc: Mike Rapoport, linux-cxl, linux-kernel, linux-mm, akpm,
dan.j.williams
From: Mike Rapoport (Microsoft) <rppt@kernel.org>
On Tue, 11 Nov 2025 10:00:10 +0900, Akinobu Mita wrote:
> memblock_estimated_nr_free_pages() returns the difference between the total
> size of the "memory" memblock type and the "reserved" memblock type.
>
> The "soft-reserved" memory regions are added to the "reserved" memblock
> type, but not to the "memory" memblock type. Therefore,
> memblock_estimated_nr_free_pages() may return a smaller value than
> expected, or if it underflows, an extremely large value.
>
> [...]
Applied to fixes branch of memblock.git tree, thanks!
[1/1] memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
commit: c42af83c59b65d01c0f7a074e450bbbb43b22f0d
tree: https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
branch: fixes
In the future please start a new thread when sending the next version of a
patch.
--
Sincerely yours,
Mike.