From: Baoquan He <bhe@redhat.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, x86@kernel.org,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
rppt@kernel.org
Subject: Re: [PATCH v2 4/6] mm/mm_init.c: remove meaningless calculation of zone->managed_pages in free_area_init_core()
Date: Thu, 28 Mar 2024 16:32:38 +0800
Message-ID: <ZgUrJuatQqAT0QA1@MiWiFi-R3L-srv>
In-Reply-To: <20240325145646.1044760-5-bhe@redhat.com>
On 03/25/24 at 10:56pm, Baoquan He wrote:
> Currently, in free_area_init_core(), when initializing a zone's fields, a
> rough value is assigned to zone->managed_pages. That value is calculated as
> (zone->present_pages - memmap_pages).
>
> At the same time, the value is added to nr_all_pages and nr_kernel_pages,
> which represent all free pages of the system (including HIGHMEM, and low
> memory only, respectively). Both of them are used in
> alloc_large_system_hash().
>
> However, the rough calculation and setting of zone->managed_pages is
> meaningless because
> a) memmap pages are allocated per node in sparse_init() or
> alloc_node_mem_map(pgdat); the simple (zone->present_pages -
> memmap_pages) is too rough to make sense for a zone;
> b) the zone->managed_pages value set here will be zeroed out and reset
> to the actual value in mem_init() via memblock_free_all(). Before the
> reset, no buddy allocation request is issued.
>
> Here, remove the meaningless and complicated calculation of
> (zone->present_pages - memmap_pages) and initialize zone->managed_pages to 0,
> which reflects its actual value because no page has been added to the buddy
> system yet. It will be reset in mem_init().
>
> Also remove the assignment of nr_all_pages and nr_kernel_pages in
> free_area_init_core(). Instead, call the newly added calc_nr_kernel_pages()
> to count all free but not reserved memory in memblock and assign the result
> to nr_all_pages and nr_kernel_pages. The count excludes memmap pages and
> other data used by the kernel, which is simpler and more accurate than the
> old way, and also covers the arch_reserved_kernel_pages() case required
> by ppc.
>
> Also clean up the outdated code comment above free_area_init_core().
> free_area_init_core() is now easy to understand, so no explanatory
> comment is needed.
>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
> mm/mm_init.c | 46 +++++-----------------------------------------
> 1 file changed, 5 insertions(+), 41 deletions(-)
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index c57a7fc97a16..7f71e56e83f3 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1565,15 +1565,6 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
> }
> #endif
>
> -/*
> - * Set up the zone data structures:
> - * - mark all pages reserved
> - * - mark all memory queues empty
> - * - clear the memory bitmaps
> - *
> - * NOTE: pgdat should get zeroed by caller.
> - * NOTE: this function is only called during early init.
> - */
> static void __init free_area_init_core(struct pglist_data *pgdat)
> {
> enum zone_type j;
> @@ -1584,41 +1575,13 @@ static void __init free_area_init_core(struct pglist_data *pgdat)
>
> for (j = 0; j < MAX_NR_ZONES; j++) {
> struct zone *zone = pgdat->node_zones + j;
> - unsigned long size, freesize, memmap_pages;
> -
> - size = zone->spanned_pages;
> - freesize = zone->present_pages;
> -
> - /*
> - * Adjust freesize so that it accounts for how much memory
> - * is used by this zone for memmap. This affects the watermark
> - * and per-cpu initialisations
> - */
> - memmap_pages = calc_memmap_size(size, freesize);
> - if (!is_highmem_idx(j)) {
> - if (freesize >= memmap_pages) {
> - freesize -= memmap_pages;
> - if (memmap_pages)
> - pr_debug(" %s zone: %lu pages used for memmap\n",
> - zone_names[j], memmap_pages);
> - } else
> - pr_warn(" %s zone: %lu memmap pages exceeds freesize %lu\n",
> - zone_names[j], memmap_pages, freesize);
> - }
> -
> - if (!is_highmem_idx(j))
> - nr_kernel_pages += freesize;
> - /* Charge for highmem memmap if there are enough kernel pages */
> - else if (nr_kernel_pages > memmap_pages * 2)
> - nr_kernel_pages -= memmap_pages;
> - nr_all_pages += freesize;
> + unsigned long size = zone->spanned_pages;
>
> /*
> - * Set an approximate value for lowmem here, it will be adjusted
> - * when the bootmem allocator frees pages into the buddy system.
> - * And all highmem pages will be managed by the buddy system.
> + * Initialize zone->managed_pages as 0 , it will be reset
> + * when memblock allocator frees pages into buddy system.
> */
> - zone_init_internals(zone, j, nid, freesize);
> + zone_init_internals(zone, j, nid, 0);
Here, we should initialize zone->managed_pages to zone->present_pages,
because page_group_by_mobility_disabled is later set according to
zone->managed_pages. Otherwise, page_group_by_mobility_disabled will
always be set to 1. I will send out v3.
From a17b0921b4bd00596330f61ee9ea4b82386a9fed Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Thu, 28 Mar 2024 16:20:15 +0800
Subject: [PATCH] mm/mm_init.c: set zone's ->managed_pages as ->present_pages
for now
Content-type: text/plain
This is needed because page_group_by_mobility_disabled is set later
according to the zone's managed_pages.
Signed-off-by: Baoquan He <bhe@redhat.com>
---
mm/mm_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index cc24e7958c0c..dd875f943cbb 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1561,7 +1561,7 @@ static void __init free_area_init_core(struct pglist_data *pgdat)
* Initialize zone->managed_pages as 0 , it will be reset
* when memblock allocator frees pages into buddy system.
*/
- zone_init_internals(zone, j, nid, 0);
+ zone_init_internals(zone, j, nid, zone->present_pages);
if (!size)
continue;
--
2.41.0
>
> if (!size)
> continue;
> @@ -1915,6 +1878,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> check_for_memory(pgdat);
> }
>
> + calc_nr_kernel_pages();
> memmap_init();
>
> /* disable hash distribution for systems with a single node */
> --
> 2.41.0
>
Thread overview:
2024-03-25 14:56 [PATCH v2 0/6] mm/mm_init.c: refactor free_area_init_core() Baoquan He
2024-03-25 14:56 ` [PATCH v2 1/6] x86: remove unneeded memblock_find_dma_reserve() Baoquan He
2024-03-26 6:44 ` Mike Rapoport
2024-03-25 14:56 ` [PATCH v2 2/6] mm/mm_init.c: remove the useless dma_reserve Baoquan He
2024-03-26 6:44 ` Mike Rapoport
2024-03-25 14:56 ` [PATCH v2 3/6] mm/mm_init.c: add new function calc_nr_all_pages() Baoquan He
2024-03-26 6:57 ` Mike Rapoport
2024-03-26 13:49 ` Baoquan He
2024-03-27 15:40 ` Mike Rapoport
2024-03-25 14:56 ` [PATCH v2 4/6] mm/mm_init.c: remove meaningless calculation of zone->managed_pages in free_area_init_core() Baoquan He
2024-03-27 15:40 ` Mike Rapoport
2024-03-28 8:32 ` Baoquan He [this message]
2024-03-28 9:53 ` Mike Rapoport
2024-03-28 14:46 ` Baoquan He
2024-03-28 9:12 ` [PATCH v3 " Baoquan He
2024-03-25 14:56 ` [PATCH v2 5/6] mm/mm_init.c: remove unneeded calc_memmap_size() Baoquan He
2024-03-27 16:21 ` Mike Rapoport
2024-03-28 1:24 ` Baoquan He
2024-03-25 14:56 ` [PATCH v2 6/6] mm/mm_init.c: remove arch_reserved_kernel_pages() Baoquan He
2024-03-26 6:57 ` Mike Rapoport
2024-03-27 15:41 ` Mike Rapoport