All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, x86@kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 4/6] mm/mm_init.c: remove meaningless calculation of zone->managed_pages in free_area_init_core()
Date: Thu, 28 Mar 2024 22:46:39 +0800	[thread overview]
Message-ID: <ZgWCzx7+OxPgFIaU@MiWiFi-R3L-srv> (raw)
In-Reply-To: <ZgU9_zpuIOu2b_gT@kernel.org>

On 03/28/24 at 11:53am, Mike Rapoport wrote:
> On Thu, Mar 28, 2024 at 04:32:38PM +0800, Baoquan He wrote:
> > On 03/25/24 at 10:56pm, Baoquan He wrote:
> > >  
> > >  		/*
> > > -		 * Set an approximate value for lowmem here, it will be adjusted
> > > -		 * when the bootmem allocator frees pages into the buddy system.
> > > -		 * And all highmem pages will be managed by the buddy system.
> > > +		 * Initialize zone->managed_pages as 0 , it will be reset
> > > +		 * when memblock allocator frees pages into buddy system.
> > >  		 */
> > > -		zone_init_internals(zone, j, nid, freesize);
> > > +		zone_init_internals(zone, j, nid, 0);
> > 
> > Here, we should initialize zone->managed_pages as zone->present_pages
> > because later page_group_by_mobility_disabled need be set according to
> > zone->managed_pages. Otherwise page_group_by_mobility_disabled will be
> > set to 1 always. I will sent out v3.
> 
> With zone->managed_pages set to zone->present_pages we won't account for
> the reserved memory for initialization of page_group_by_mobility_disabled.

The old zone->managed_pages didn't account for the reserved pages
either. It's calculated by (zone->present_pages - memmap_pages). memmap
pages only is only a very small portion, e.g on x86_64, 4K page size,
assuming size of struct page is 64, then it's 1/64 of system memory.
On arm64, 64K page size, it's 1/1024 of system memory.

And about the setting of page_group_by_mobility_disabled, the compared
value pageblock_nr_pages * MIGRATE_TYPES which is very small. On x86_64,
it's 4M*6=24M; on arm64 with 64K size and 128M*6=768M which should be
the biggest among ARCH-es. 

	if (vm_total_pages < (pageblock_nr_pages * MIGRATE_TYPES)) 
                page_group_by_mobility_disabled = 1;
        else    
                page_group_by_mobility_disabled = 0;

So page_group_by_mobility_disabled could be set to 1 only on system with
very little memory which is very rarely seen. And setting
zone->managed_pages as zone->present_pages is very close to its old
value: (zone->present_pages - memmap_pages). Here we don't need be very
accurate, just a rough value.

> 
> As watermarks are still not initialized at the time build_all_zonelists()
> is called, we may use nr_all_pages - nr_kernel_pages instead of
> nr_free_zone_pages(), IMO.

nr_all_pages should be fine if we take this way. nr_kernel_pages is a
misleading name, it's all low memory pages excluding kernel reserved
apges. nr_all_pages is all memory pages including highmema and exluding
kernel reserved pages.

Both is fine to me. The first one is easier, simply setting
zone->managed_pages as zone->present_pages. The 2nd way is a little more
accurate.

>  
> > From a17b0921b4bd00596330f61ee9ea4b82386a9fed Mon Sep 17 00:00:00 2001
> > From: Baoquan He <bhe@redhat.com>
> > Date: Thu, 28 Mar 2024 16:20:15 +0800
> > Subject: [PATCH] mm/mm_init.c: set zone's ->managed_pages as ->present_pages
> >  for now
> > Content-type: text/plain
> > 
> > Because page_group_by_mobility_disabled need be set according to zone's
> > managed_pages later.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/mm_init.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index cc24e7958c0c..dd875f943cbb 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1561,7 +1561,7 @@ static void __init free_area_init_core(struct pglist_data *pgdat)
> >  		 * Initialize zone->managed_pages as 0 , it will be reset
> >  		 * when memblock allocator frees pages into buddy system.
> >  		 */
> > -		zone_init_internals(zone, j, nid, 0);
> > +		zone_init_internals(zone, j, nid, zone->present_pages);
> >  
> >  		if (!size)
> >  			continue;
> > -- 
> > 2.41.0
> > 
> > 
> > >  
> > >  		if (!size)
> > >  			continue;
> > > @@ -1915,6 +1878,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> > >  		check_for_memory(pgdat);
> > >  	}
> > >  
> > > +	calc_nr_kernel_pages();
> > >  	memmap_init();
> > >  
> > >  	/* disable hash distribution for systems with a single node */
> > > -- 
> > > 2.41.0
> > > 
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 


WARNING: multiple messages have this Message-ID (diff)
From: Baoquan He <bhe@redhat.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
	linuxppc-dev@lists.ozlabs.org, akpm@linux-foundation.org
Subject: Re: [PATCH v2 4/6] mm/mm_init.c: remove meaningless calculation of zone->managed_pages in free_area_init_core()
Date: Thu, 28 Mar 2024 22:46:39 +0800	[thread overview]
Message-ID: <ZgWCzx7+OxPgFIaU@MiWiFi-R3L-srv> (raw)
In-Reply-To: <ZgU9_zpuIOu2b_gT@kernel.org>

On 03/28/24 at 11:53am, Mike Rapoport wrote:
> On Thu, Mar 28, 2024 at 04:32:38PM +0800, Baoquan He wrote:
> > On 03/25/24 at 10:56pm, Baoquan He wrote:
> > >  
> > >  		/*
> > > -		 * Set an approximate value for lowmem here, it will be adjusted
> > > -		 * when the bootmem allocator frees pages into the buddy system.
> > > -		 * And all highmem pages will be managed by the buddy system.
> > > +		 * Initialize zone->managed_pages as 0 , it will be reset
> > > +		 * when memblock allocator frees pages into buddy system.
> > >  		 */
> > > -		zone_init_internals(zone, j, nid, freesize);
> > > +		zone_init_internals(zone, j, nid, 0);
> > 
> > Here, we should initialize zone->managed_pages as zone->present_pages
> > because later page_group_by_mobility_disabled need be set according to
> > zone->managed_pages. Otherwise page_group_by_mobility_disabled will be
> > set to 1 always. I will sent out v3.
> 
> With zone->managed_pages set to zone->present_pages we won't account for
> the reserved memory for initialization of page_group_by_mobility_disabled.

The old zone->managed_pages didn't account for the reserved pages
either. It's calculated by (zone->present_pages - memmap_pages). memmap
pages only is only a very small portion, e.g on x86_64, 4K page size,
assuming size of struct page is 64, then it's 1/64 of system memory.
On arm64, 64K page size, it's 1/1024 of system memory.

And about the setting of page_group_by_mobility_disabled, the compared
value pageblock_nr_pages * MIGRATE_TYPES which is very small. On x86_64,
it's 4M*6=24M; on arm64 with 64K size and 128M*6=768M which should be
the biggest among ARCH-es. 

	if (vm_total_pages < (pageblock_nr_pages * MIGRATE_TYPES)) 
                page_group_by_mobility_disabled = 1;
        else    
                page_group_by_mobility_disabled = 0;

So page_group_by_mobility_disabled could be set to 1 only on system with
very little memory which is very rarely seen. And setting
zone->managed_pages as zone->present_pages is very close to its old
value: (zone->present_pages - memmap_pages). Here we don't need be very
accurate, just a rough value.

> 
> As watermarks are still not initialized at the time build_all_zonelists()
> is called, we may use nr_all_pages - nr_kernel_pages instead of
> nr_free_zone_pages(), IMO.

nr_all_pages should be fine if we take this way. nr_kernel_pages is a
misleading name, it's all low memory pages excluding kernel reserved
apges. nr_all_pages is all memory pages including highmema and exluding
kernel reserved pages.

Both is fine to me. The first one is easier, simply setting
zone->managed_pages as zone->present_pages. The 2nd way is a little more
accurate.

>  
> > From a17b0921b4bd00596330f61ee9ea4b82386a9fed Mon Sep 17 00:00:00 2001
> > From: Baoquan He <bhe@redhat.com>
> > Date: Thu, 28 Mar 2024 16:20:15 +0800
> > Subject: [PATCH] mm/mm_init.c: set zone's ->managed_pages as ->present_pages
> >  for now
> > Content-type: text/plain
> > 
> > Because page_group_by_mobility_disabled need be set according to zone's
> > managed_pages later.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/mm_init.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index cc24e7958c0c..dd875f943cbb 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1561,7 +1561,7 @@ static void __init free_area_init_core(struct pglist_data *pgdat)
> >  		 * Initialize zone->managed_pages as 0 , it will be reset
> >  		 * when memblock allocator frees pages into buddy system.
> >  		 */
> > -		zone_init_internals(zone, j, nid, 0);
> > +		zone_init_internals(zone, j, nid, zone->present_pages);
> >  
> >  		if (!size)
> >  			continue;
> > -- 
> > 2.41.0
> > 
> > 
> > >  
> > >  		if (!size)
> > >  			continue;
> > > @@ -1915,6 +1878,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> > >  		check_for_memory(pgdat);
> > >  	}
> > >  
> > > +	calc_nr_kernel_pages();
> > >  	memmap_init();
> > >  
> > >  	/* disable hash distribution for systems with a single node */
> > > -- 
> > > 2.41.0
> > > 
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 



  reply	other threads:[~2024-03-28 14:47 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-25 14:56 [PATCH v2 0/6] mm/mm_init.c: refactor free_area_init_core() Baoquan He
2024-03-25 14:56 ` Baoquan He
2024-03-25 14:56 ` [PATCH v2 1/6] x86: remove unneeded memblock_find_dma_reserve() Baoquan He
2024-03-25 14:56   ` Baoquan He
2024-03-26  6:44   ` Mike Rapoport
2024-03-26  6:44     ` Mike Rapoport
2024-03-25 14:56 ` [PATCH v2 2/6] mm/mm_init.c: remove the useless dma_reserve Baoquan He
2024-03-25 14:56   ` Baoquan He
2024-03-26  6:44   ` Mike Rapoport
2024-03-26  6:44     ` Mike Rapoport
2024-03-25 14:56 ` [PATCH v2 3/6] mm/mm_init.c: add new function calc_nr_all_pages() Baoquan He
2024-03-25 14:56   ` Baoquan He
2024-03-26  6:57   ` Mike Rapoport
2024-03-26  6:57     ` Mike Rapoport
2024-03-26 13:49     ` Baoquan He
2024-03-26 13:49       ` Baoquan He
2024-03-27 15:40   ` Mike Rapoport
2024-03-27 15:40     ` Mike Rapoport
2024-03-25 14:56 ` [PATCH v2 4/6] mm/mm_init.c: remove meaningless calculation of zone->managed_pages in free_area_init_core() Baoquan He
2024-03-25 14:56   ` Baoquan He
2024-03-27 15:40   ` Mike Rapoport
2024-03-27 15:40     ` Mike Rapoport
2024-03-28  8:32   ` Baoquan He
2024-03-28  8:32     ` Baoquan He
2024-03-28  9:53     ` Mike Rapoport
2024-03-28  9:53       ` Mike Rapoport
2024-03-28 14:46       ` Baoquan He [this message]
2024-03-28 14:46         ` Baoquan He
2024-03-28  9:12   ` [PATCH v3 " Baoquan He
2024-03-28  9:12     ` Baoquan He
2024-03-25 14:56 ` [PATCH v2 5/6] mm/mm_init.c: remove unneeded calc_memmap_size() Baoquan He
2024-03-25 14:56   ` Baoquan He
2024-03-27 16:21   ` Mike Rapoport
2024-03-27 16:21     ` Mike Rapoport
2024-03-28  1:24     ` Baoquan He
2024-03-28  1:24       ` Baoquan He
2024-03-25 14:56 ` [PATCH v2 6/6] mm/mm_init.c: remove arch_reserved_kernel_pages() Baoquan He
2024-03-25 14:56   ` Baoquan He
2024-03-26  6:57   ` Mike Rapoport
2024-03-26  6:57     ` Mike Rapoport
2024-03-27 15:41   ` Mike Rapoport
2024-03-27 15:41     ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZgWCzx7+OxPgFIaU@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=rppt@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.