linux-mm.kvack.org archive mirror
* [PATCH v2] mm: fix accounting of memmap pages for early sections
@ 2025-08-04 15:13 Sumanth Korikkar
  2025-08-04 15:24 ` David Hildenbrand
  2025-08-06  9:03 ` Wei Yang
  0 siblings, 2 replies; 5+ messages in thread
From: Sumanth Korikkar @ 2025-08-04 15:13 UTC
  To: Andrew Morton, linux-mm, LKML, David Hildenbrand
  Cc: Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	linux-s390, sumanthk

memmap pages can be allocated either from the memblock (boot) allocator
during early boot or from the buddy allocator.

When these memmap pages are removed via arch_remove_memory(), the
deallocation path depends on their source:

* For pages from the buddy allocator, depopulate_section_memmap() is
  called, which should decrement the count of nr_memmap_pages.

* For pages from the boot allocator, free_map_bootmem() is called, which
  should decrement the count of the nr_memmap_boot_pages.

Ensure correct tracking of memmap pages for both early sections and non
early sections by adjusting the accounting in section_deactivate().

Cc: stable@vger.kernel.org
Fixes: 15995a352474 ("mm: report per-page metadata information")
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
---
v2: consider accounting for !CONFIG_SPARSEMEM_VMEMMAP.

 mm/sparse.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 3c012cf83cc2..b9cc9e548f80 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -680,7 +680,6 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
 
-	memmap_pages_add(-1L * (DIV_ROUND_UP(end - start, PAGE_SIZE)));
 	vmemmap_free(start, end, altmap);
 }
 static void free_map_bootmem(struct page *memmap)
@@ -856,10 +855,14 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
 	 * The memmap of early sections is always fully populated. See
 	 * section_activate() and pfn_valid() .
 	 */
-	if (!section_is_early)
+	if (!section_is_early) {
+		memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
 		depopulate_section_memmap(pfn, nr_pages, altmap);
-	else if (memmap)
+	} else if (memmap) {
+		memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page),
+				      PAGE_SIZE)));
 		free_map_bootmem(memmap);
+	}
 
 	if (empty)
 		ms->section_mem_map = (unsigned long)NULL;
-- 
2.48.1
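
As an illustration of the accounting the patch moves into section_deactivate(), here is a minimal user-space sketch. It is not kernel code: every sketch_* name is hypothetical, the 4 KiB page size and 64-byte struct page are assumed values, and the two globals merely stand in for nr_memmap_pages and nr_memmap_boot_pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only; real kernels differ per architecture. */
#define SKETCH_PAGE_SIZE        4096UL
#define SKETCH_STRUCT_PAGE_SIZE 64UL
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-ins for nr_memmap_pages and nr_memmap_boot_pages. */
static long sketch_nr_memmap_pages;
static long sketch_nr_memmap_boot_pages;

/* Number of pages needed to hold the memmap that describes nr_pages pages. */
static long sketch_memmap_size_in_pages(unsigned long nr_pages)
{
	return (long)SKETCH_DIV_ROUND_UP(nr_pages * SKETCH_STRUCT_PAGE_SIZE,
					 SKETCH_PAGE_SIZE);
}

/*
 * Models only the counter updates from the patch:
 *  - non-early section: decrement the buddy-allocated memmap counter
 *  - early section with a memmap to free: decrement the boot counter
 * has_memmap models the "else if (memmap)" check in the patch.
 */
static void sketch_section_deactivate(unsigned long nr_pages,
				      bool section_is_early, bool has_memmap)
{
	long delta = sketch_memmap_size_in_pages(nr_pages);

	if (!section_is_early)
		sketch_nr_memmap_pages -= delta;
	else if (has_memmap)
		sketch_nr_memmap_boot_pages -= delta;
}
```

Under these assumed sizes, removing 32768 pages (128 MiB of 4 KiB pages) lowers exactly one of the two counters by 512, depending on whether the section is early.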




* Re: [PATCH v2] mm: fix accounting of memmap pages for early sections
  2025-08-04 15:13 [PATCH v2] mm: fix accounting of memmap pages for early sections Sumanth Korikkar
@ 2025-08-04 15:24 ` David Hildenbrand
  2025-08-06  9:03 ` Wei Yang
  1 sibling, 0 replies; 5+ messages in thread
From: David Hildenbrand @ 2025-08-04 15:24 UTC
  To: Sumanth Korikkar, Andrew Morton, linux-mm, LKML
  Cc: Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	linux-s390

On 04.08.25 17:13, Sumanth Korikkar wrote:
> memmap pages can be allocated either from the memblock (boot) allocator
> during early boot or from the buddy allocator.
> 
> When these memmap pages are removed via arch_remove_memory(), the
> deallocation path depends on their source:
> 
> * For pages from the buddy allocator, depopulate_section_memmap() is
>    called, which should decrement the count of nr_memmap_pages.
> 
> * For pages from the boot allocator, free_map_bootmem() is called, which
>    should decrement the count of the nr_memmap_boot_pages.
> 
> Ensure correct tracking of memmap pages for both early sections and non
> early sections by adjusting the accounting in section_deactivate().
> 
> Cc: stable@vger.kernel.org
> Fixes: 15995a352474 ("mm: report per-page metadata information")
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> ---
> v2: consider accounting for !CONFIG_SPARSEMEM_VMEMMAP.
> 
>   mm/sparse.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 3c012cf83cc2..b9cc9e548f80 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -680,7 +680,6 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
>   	unsigned long start = (unsigned long) pfn_to_page(pfn);
>   	unsigned long end = start + nr_pages * sizeof(struct page);
>   
> -	memmap_pages_add(-1L * (DIV_ROUND_UP(end - start, PAGE_SIZE)));
>   	vmemmap_free(start, end, altmap);
>   }
>   static void free_map_bootmem(struct page *memmap)
> @@ -856,10 +855,14 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
>   	 * The memmap of early sections is always fully populated. See
>   	 * section_activate() and pfn_valid() .
>   	 */
> -	if (!section_is_early)
> +	if (!section_is_early) {
> +		memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
>   		depopulate_section_memmap(pfn, nr_pages, altmap);
> -	else if (memmap)
> +	} else if (memmap) {
> +		memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page),
> +				      PAGE_SIZE)));
>   		free_map_bootmem(memmap);
> +	}
>   
>   	if (empty)
>   		ms->section_mem_map = (unsigned long)NULL;

Acked-by: David Hildenbrand <david@redhat.com>

Hopefully we're not missing anything important.

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2] mm: fix accounting of memmap pages for early sections
  2025-08-04 15:13 [PATCH v2] mm: fix accounting of memmap pages for early sections Sumanth Korikkar
  2025-08-04 15:24 ` David Hildenbrand
@ 2025-08-06  9:03 ` Wei Yang
  2025-08-06 12:46   ` Sumanth Korikkar
  1 sibling, 1 reply; 5+ messages in thread
From: Wei Yang @ 2025-08-06  9:03 UTC
  To: Sumanth Korikkar
  Cc: Andrew Morton, linux-mm, LKML, David Hildenbrand, Gerald Schaefer,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev, linux-s390

On Mon, Aug 04, 2025 at 05:13:27PM +0200, Sumanth Korikkar wrote:
>memmap pages can be allocated either from the memblock (boot) allocator
>during early boot or from the buddy allocator.
>
>When these memmap pages are removed via arch_remove_memory(), the
>deallocation path depends on their source:
>
>* For pages from the buddy allocator, depopulate_section_memmap() is
>  called, which should decrement the count of nr_memmap_pages.
>
>* For pages from the boot allocator, free_map_bootmem() is called, which
>  should decrement the count of the nr_memmap_boot_pages.
>
>Ensure correct tracking of memmap pages for both early sections and non
>early sections by adjusting the accounting in section_deactivate().
>
>Cc: stable@vger.kernel.org
>Fixes: 15995a352474 ("mm: report per-page metadata information")
>Suggested-by: David Hildenbrand <david@redhat.com>
>Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
>---
>v2: consider accounting for !CONFIG_SPARSEMEM_VMEMMAP.
>
> mm/sparse.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
>diff --git a/mm/sparse.c b/mm/sparse.c
>index 3c012cf83cc2..b9cc9e548f80 100644
>--- a/mm/sparse.c
>+++ b/mm/sparse.c
>@@ -680,7 +680,6 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
> 	unsigned long start = (unsigned long) pfn_to_page(pfn);
> 	unsigned long end = start + nr_pages * sizeof(struct page);
> 
>-	memmap_pages_add(-1L * (DIV_ROUND_UP(end - start, PAGE_SIZE)));
> 	vmemmap_free(start, end, altmap);
> }
> static void free_map_bootmem(struct page *memmap)
>@@ -856,10 +855,14 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
> 	 * The memmap of early sections is always fully populated. See
> 	 * section_activate() and pfn_valid() .
> 	 */
>-	if (!section_is_early)
>+	if (!section_is_early) {
>+		memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
> 		depopulate_section_memmap(pfn, nr_pages, altmap);
>-	else if (memmap)
>+	} else if (memmap) {
>+		memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page),
>+				      PAGE_SIZE)));
> 		free_map_bootmem(memmap);
>+	}

The change here is reasonable, but we may still be missing the accounting at
some other points.

For example:

a. 

  sparse_init_nid()
    __populate_section_memmap()

If !CONFIG_SPARSEMEM_VMEMMAP and sparse_buffer_alloc() returns NULL, it
allocates extra memory from bootmem, which does not appear to be counted.

b. 

  section_activate()
    populate_section_memmap()

If !CONFIG_SPARSEMEM_VMEMMAP, it just calls kvmalloc_node(), which does not
appear to be counted.

Did I miss something?

> 
> 	if (empty)
> 		ms->section_mem_map = (unsigned long)NULL;
>-- 
>2.48.1
>

-- 
Wei Yang
Help you, Help me



* Re: [PATCH v2] mm: fix accounting of memmap pages for early sections
  2025-08-06  9:03 ` Wei Yang
@ 2025-08-06 12:46   ` Sumanth Korikkar
  2025-08-06 14:31     ` Wei Yang
  0 siblings, 1 reply; 5+ messages in thread
From: Sumanth Korikkar @ 2025-08-06 12:46 UTC
  To: Wei Yang
  Cc: Andrew Morton, linux-mm, LKML, David Hildenbrand, Gerald Schaefer,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev, linux-s390

> The change here is reasonable, but we may still be missing the accounting at
> some other points.
> 
> For example:
> 
> a. 
> 
>   sparse_init_nid()
>     __populate_section_memmap()
> 
> If !CONFIG_SPARSEMEM_VMEMMAP and sparse_buffer_alloc() returns NULL, it
> allocates extra memory from bootmem, which does not appear to be counted.

Currently, the accounting is done upfront in sparse_buffer_init(), where
memmap_boot_pages_add() is called for !CONFIG_SPARSEMEM_VMEMMAP.

The function sparse_buffer_alloc() can return NULL in two scenarios:

* During sparse_buffer_init(), if memmap_alloc() fails, sparsemap_buf will be NULL.
* Inside sparse_buffer_alloc(), if ptr + size exceeds sparsemap_buf_end,
  then ptr is set to NULL.

Considering this, perhaps memmap_boot_pages_add() could be moved into
__populate_section_memmap(), with the accounting done only if the
operation is successful. What do you think?
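
To make the idea concrete, here is a minimal user-space sketch of "account only on success". It is not the kernel implementation: all sketch_* names are hypothetical, malloc() stands in for the boot allocator, the 4 KiB page size is assumed, and the alignment handling of the real sparse_buffer_alloc() is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static char *sketch_buf;		/* models sparsemap_buf */
static char *sketch_buf_end;		/* models sparsemap_buf_end */
static long sketch_nr_memmap_boot_pages;	/* hypothetical counter */

/* Scenario 1: if this allocation fails, the buffer stays NULL. */
static void sketch_buffer_init(size_t size)
{
	sketch_buf = malloc(size);
	sketch_buf_end = sketch_buf ? sketch_buf + size : NULL;
}

/* Scenario 2: a request that does not fit in the remainder returns NULL. */
static void *sketch_buffer_alloc(size_t size)
{
	char *ptr = NULL;

	if (sketch_buf && size <= (size_t)(sketch_buf_end - sketch_buf)) {
		ptr = sketch_buf;
		sketch_buf += size;
	}
	return ptr;
}

/*
 * Try the preallocated buffer first, fall back to a fresh allocation,
 * and bump the counter only when a memmap was actually obtained.
 */
static void *sketch_populate_section_memmap(size_t size)
{
	void *map = sketch_buffer_alloc(size);

	if (!map)
		map = malloc(size);	/* stand-in for the bootmem fallback */
	if (map)
		sketch_nr_memmap_boot_pages +=
			(long)SKETCH_DIV_ROUND_UP(size, SKETCH_PAGE_SIZE);
	return map;
}
```

With the accounting at this level, the buffer path and the fallback path are counted the same way, and a failed allocation is never counted.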

>   section_activate()
>     populate_section_memmap()
> 
> If !CONFIG_SPARSEMEM_VMEMMAP, it just calls kvmalloc_node(), which does not
> appear to be counted.

Sounds right. This means nr_memmap_pages adjustment is needed for
!CONFIG_SPARSEMEM_VMEMMAP here. I will recheck this.
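
A small user-space sketch of why the add and remove paths have to be symmetric. It is not kernel code: the sketch_* names are hypothetical, the 4 KiB page size and 64-byte struct page are assumed values, and the "account" flag only models whether the hot-add path updates the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only. */
#define SKETCH_PAGE_SIZE        4096UL
#define SKETCH_STRUCT_PAGE_SIZE 64UL
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-in for nr_memmap_pages. */
static long sketch_hotplug_memmap_pages;

static long sketch_hotplug_delta(unsigned long nr_pages)
{
	return (long)SKETCH_DIV_ROUND_UP(nr_pages * SKETCH_STRUCT_PAGE_SIZE,
					 SKETCH_PAGE_SIZE);
}

/* Hot-add: "account" models whether the populate path counts its memmap. */
static void sketch_hot_add(unsigned long nr_pages, bool account)
{
	if (account)
		sketch_hotplug_memmap_pages += sketch_hotplug_delta(nr_pages);
}

/* Hot-remove: always decrements, as the non-early branch of the patch does. */
static void sketch_hot_remove(unsigned long nr_pages)
{
	sketch_hotplug_memmap_pages -= sketch_hotplug_delta(nr_pages);
}
```

If the add path skips the accounting, one add/remove cycle of 32768 pages leaves the counter at -512 under these assumed sizes; with the add path counted, it returns to 0.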

Thank you



* Re: [PATCH v2] mm: fix accounting of memmap pages for early sections
  2025-08-06 12:46   ` Sumanth Korikkar
@ 2025-08-06 14:31     ` Wei Yang
  0 siblings, 0 replies; 5+ messages in thread
From: Wei Yang @ 2025-08-06 14:31 UTC
  To: Sumanth Korikkar
  Cc: Wei Yang, Andrew Morton, linux-mm, LKML, David Hildenbrand,
	Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	linux-s390

On Wed, Aug 06, 2025 at 02:46:43PM +0200, Sumanth Korikkar wrote:
>> The change here is reasonable, but we may still be missing the accounting at
>> some other points.
>> 
>> For example:
>> 
>> a. 
>> 
>>   sparse_init_nid()
>>     __populate_section_memmap()
>> 
>> If !CONFIG_SPARSEMEM_VMEMMAP and sparse_buffer_alloc() returns NULL, it
>> allocates extra memory from bootmem, which does not appear to be counted.
>
>Currently, the accounting is done upfront in sparse_buffer_init(), where
>memmap_boot_pages_add() is called for !CONFIG_SPARSEMEM_VMEMMAP.
>
>The function sparse_buffer_alloc() can return NULL in two scenarios:
>
>* During sparse_buffer_init(), if memmap_alloc() fails, sparsemap_buf will be NULL.
>* Inside sparse_buffer_alloc(), if ptr + size exceeds sparsemap_buf_end,
>  then ptr is set to NULL.
>
>Considering this, perhaps memmap_boot_pages_add() could be moved into
>__populate_section_memmap(), with the accounting done only if the
>operation is successful. What do you think?
>

Looks reasonable to me.

>>   section_activate()
>>     populate_section_memmap()
>> 
>> If !CONFIG_SPARSEMEM_VMEMMAP, it just calls kvmalloc_node(), which does not
>> appear to be counted.
>
>Sounds right. This means nr_memmap_pages adjustment is needed for
>!CONFIG_SPARSEMEM_VMEMMAP here. I will recheck this.
>
>Thank you

-- 
Wei Yang
Help you, Help me


