* [PATCH] arm64: init: override deferred_page_init_max_threads
@ 2024-05-20 23:15 Eric Chanudet
2024-05-21 14:47 ` Baoquan He
2024-05-21 16:10 ` Mike Rapoport
0 siblings, 2 replies; 6+ messages in thread
From: Eric Chanudet @ 2024-05-20 23:15 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Baoquan He, Andrew Morton, Zhen Lei,
Yajun Deng, Mike Rapoport (IBM), Zhang Jianhua, linux-arm-kernel,
linux-kernel
Cc: Eric Chanudet
This was the behavior prior to making the function arch-specific with
commit ecd096506922 ("mm: make deferred init's max threads
arch-specific")
Architectures can override the generic implementation that uses only one
CPU. Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64
platforms shows faster deferred_init_memmap completions:
| | x13s | SA8775p-ride | Ampere R137-P31 | Ampere HR330 |
| | Metal, 32GB | VM, 36GB | VM, 58GB | Metal, 128GB |
| | 8cpus | 8cpus | 8cpus | 32cpus |
|---------|-------------|--------------|-----------------|--------------|
| threads | ms (%) | ms (%) | ms (%) | ms (%) |
|---------|-------------|--------------|-----------------|--------------|
| 1 | 108 (0%) | 72 (0%) | 224 (0%) | 324 (0%) |
| cpus | 24 (-77%) | 36 (-50%) | 40 (-82%) | 56 (-82%) |
Signed-off-by: Eric Chanudet <echanude@redhat.com>
---
arch/arm64/mm/init.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 9b5ab6818f7f..71f5188fe63d 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -158,6 +158,13 @@ static void __init zone_sizes_init(void)
free_area_init(max_zone_pfns);
}
+#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
+int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask)
+{
+ return max_t(int, cpumask_weight(node_cpumask), 1);
+}
+#endif
+
int pfn_is_map_memory(unsigned long pfn)
{
phys_addr_t addr = PFN_PHYS(pfn);
--
2.44.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH] arm64: init: override deferred_page_init_max_threads 2024-05-20 23:15 [PATCH] arm64: init: override deferred_page_init_max_threads Eric Chanudet @ 2024-05-21 14:47 ` Baoquan He 2024-05-21 16:10 ` Mike Rapoport 1 sibling, 0 replies; 6+ messages in thread From: Baoquan He @ 2024-05-21 14:47 UTC (permalink / raw) To: Eric Chanudet Cc: Catalin Marinas, Will Deacon, Andrew Morton, Zhen Lei, Yajun Deng, Mike Rapoport (IBM), Zhang Jianhua, linux-arm-kernel, linux-kernel On 05/20/24 at 07:15pm, Eric Chanudet wrote: > This was the behavior prior to making the function arch-specific with > commit ecd096506922 ("mm: make deferred init's max threads > arch-specific") > > Architectures can override the generic implementation that uses only one > CPU. Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64 > platforms shows faster deferred_init_memmap completions: > > | | x13s | SA8775p-ride | Ampere R137-P31 | Ampere HR330 | > | | Metal, 32GB | VM, 36GB | VM, 58GB | Metal, 128GB | > | | 8cpus | 8cpus | 8cpus | 32cpus | > |---------|-------------|--------------|-----------------|--------------| > | threads | ms (%) | ms (%) | ms (%) | ms (%) | > |---------|-------------|--------------|-----------------|--------------| > | 1 | 108 (0%) | 72 (0%) | 224 (0%) | 324 (0%) | > | cpus | 24 (-77%) | 36 (-50%) | 40 (-82%) | 56 (-82%) | > > Signed-off-by: Eric Chanudet <echanude@redhat.com> > --- > arch/arm64/mm/init.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 9b5ab6818f7f..71f5188fe63d 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -158,6 +158,13 @@ static void __init zone_sizes_init(void) > free_area_init(max_zone_pfns); > } > > +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT > +int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask) > +{ > + return max_t(int, cpumask_weight(node_cpumask), 1); > +} > +#endif LGTM, Reviewed-by: Baoquan He <bhe@redhat.com> > + > int pfn_is_map_memory(unsigned long pfn) > { > phys_addr_t addr = PFN_PHYS(pfn); > -- > 2.44.0 > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arm64: init: override deferred_page_init_max_threads 2024-05-20 23:15 [PATCH] arm64: init: override deferred_page_init_max_threads Eric Chanudet 2024-05-21 14:47 ` Baoquan He @ 2024-05-21 16:10 ` Mike Rapoport 2024-05-21 22:21 ` Eric Chanudet 1 sibling, 1 reply; 6+ messages in thread From: Mike Rapoport @ 2024-05-21 16:10 UTC (permalink / raw) To: Eric Chanudet Cc: Catalin Marinas, Will Deacon, Baoquan He, Andrew Morton, Zhen Lei, Yajun Deng, Zhang Jianhua, linux-arm-kernel, linux-kernel, Nick Piggin, Michael Ellerman (added powerpc folks) On Mon, May 20, 2024 at 07:15:59PM -0400, Eric Chanudet wrote: > This was the behavior prior to making the function arch-specific with > commit ecd096506922 ("mm: make deferred init's max threads > arch-specific") > > Architectures can override the generic implementation that uses only one > CPU. Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64 > platforms shows faster deferred_init_memmap completions: > > | | x13s | SA8775p-ride | Ampere R137-P31 | Ampere HR330 | > | | Metal, 32GB | VM, 36GB | VM, 58GB | Metal, 128GB | > | | 8cpus | 8cpus | 8cpus | 32cpus | > |---------|-------------|--------------|-----------------|--------------| > | threads | ms (%) | ms (%) | ms (%) | ms (%) | > |---------|-------------|--------------|-----------------|--------------| > | 1 | 108 (0%) | 72 (0%) | 224 (0%) | 324 (0%) | > | cpus | 24 (-77%) | 36 (-50%) | 40 (-82%) | 56 (-82%) | > > Signed-off-by: Eric Chanudet <echanude@redhat.com> > --- > arch/arm64/mm/init.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 9b5ab6818f7f..71f5188fe63d 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -158,6 +158,13 @@ static void __init zone_sizes_init(void) > free_area_init(max_zone_pfns); > } > > +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT > +int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask) > +{ > + return max_t(int, cpumask_weight(node_cpumask), 1); > +} > +#endif > + Maybe we should make this default and let architectures that want a single thread override deferred_page_init_max_threads() to return 1? > int pfn_is_map_memory(unsigned long pfn) > { > phys_addr_t addr = PFN_PHYS(pfn); > -- > 2.44.0 > -- Sincerely yours, Mike. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arm64: init: override deferred_page_init_max_threads 2024-05-21 16:10 ` Mike Rapoport @ 2024-05-21 22:21 ` Eric Chanudet 2024-05-22 13:41 ` Michael Ellerman 0 siblings, 1 reply; 6+ messages in thread From: Eric Chanudet @ 2024-05-21 22:21 UTC (permalink / raw) To: Mike Rapoport Cc: Catalin Marinas, Will Deacon, Baoquan He, Andrew Morton, Zhen Lei, Yajun Deng, Zhang Jianhua, linux-arm-kernel, linux-kernel, Nick Piggin, Michael Ellerman On Tue, May 21, 2024 at 07:10:07PM +0300, Mike Rapoport wrote: > (added powerpc folks) > > On Mon, May 20, 2024 at 07:15:59PM -0400, Eric Chanudet wrote: > > This was the behavior prior to making the function arch-specific with > > commit ecd096506922 ("mm: make deferred init's max threads > > arch-specific") > > > > Architectures can override the generic implementation that uses only one > > CPU. Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64 > > platforms shows faster deferred_init_memmap completions: > > > > | | x13s | SA8775p-ride | Ampere R137-P31 | Ampere HR330 | > > | | Metal, 32GB | VM, 36GB | VM, 58GB | Metal, 128GB | > > | | 8cpus | 8cpus | 8cpus | 32cpus | > > |---------|-------------|--------------|-----------------|--------------| > > | threads | ms (%) | ms (%) | ms (%) | ms (%) | > > |---------|-------------|--------------|-----------------|--------------| > > | 1 | 108 (0%) | 72 (0%) | 224 (0%) | 324 (0%) | > > | cpus | 24 (-77%) | 36 (-50%) | 40 (-82%) | 56 (-82%) | > > > > Signed-off-by: Eric Chanudet <echanude@redhat.com> > > --- > > arch/arm64/mm/init.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > > index 9b5ab6818f7f..71f5188fe63d 100644 > > --- a/arch/arm64/mm/init.c > > +++ b/arch/arm64/mm/init.c > > @@ -158,6 +158,13 @@ static void __init zone_sizes_init(void) > > free_area_init(max_zone_pfns); > > } > > > > +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT > > +int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask) > > +{ > > + return max_t(int, cpumask_weight(node_cpumask), 1); > > +} > > +#endif > > + > > Maybe we should make this default and let architectures that want a single > thread override deferred_page_init_max_threads() to return 1? It would affect more archs than I can try this on. Currently, only x86 (with this change, arm64) return more than one thread. I'm happy to send a v2 inverting the logic if you find it preferable. Best, > > int pfn_is_map_memory(unsigned long pfn) > > { > > phys_addr_t addr = PFN_PHYS(pfn); > > -- > > 2.44.0 > > > > -- > Sincerely yours, > Mike. > -- Eric Chanudet _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arm64: init: override deferred_page_init_max_threads 2024-05-21 22:21 ` Eric Chanudet @ 2024-05-22 13:41 ` Michael Ellerman 2024-05-22 13:54 ` Eric Chanudet 0 siblings, 1 reply; 6+ messages in thread From: Michael Ellerman @ 2024-05-22 13:41 UTC (permalink / raw) To: Eric Chanudet, Mike Rapoport Cc: Catalin Marinas, Will Deacon, Baoquan He, Andrew Morton, Zhen Lei, Yajun Deng, Zhang Jianhua, linux-arm-kernel, linux-kernel, Nick Piggin Eric Chanudet <echanude@redhat.com> writes: > On Tue, May 21, 2024 at 07:10:07PM +0300, Mike Rapoport wrote: >> (added powerpc folks) Thanks Mike. >> On Mon, May 20, 2024 at 07:15:59PM -0400, Eric Chanudet wrote: >> > This was the behavior prior to making the function arch-specific with >> > commit ecd096506922 ("mm: make deferred init's max threads >> > arch-specific") >> > >> > Architectures can override the generic implementation that uses only one >> > CPU. Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64 >> > platforms shows faster deferred_init_memmap completions: >> > >> > | | x13s | SA8775p-ride | Ampere R137-P31 | Ampere HR330 | >> > | | Metal, 32GB | VM, 36GB | VM, 58GB | Metal, 128GB | >> > | | 8cpus | 8cpus | 8cpus | 32cpus | >> > |---------|-------------|--------------|-----------------|--------------| >> > | threads | ms (%) | ms (%) | ms (%) | ms (%) | >> > |---------|-------------|--------------|-----------------|--------------| >> > | 1 | 108 (0%) | 72 (0%) | 224 (0%) | 324 (0%) | >> > | cpus | 24 (-77%) | 36 (-50%) | 40 (-82%) | 56 (-82%) | How did you measure this, just some printks in page_alloc_init_late() or something more sophisticated? Just so I can do some comparable measurements. >> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c >> > index 9b5ab6818f7f..71f5188fe63d 100644 >> > --- a/arch/arm64/mm/init.c >> > +++ b/arch/arm64/mm/init.c >> > @@ -158,6 +158,13 @@ static void __init zone_sizes_init(void) >> > free_area_init(max_zone_pfns); >> > } >> > >> > +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT >> > +int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask) >> > +{ >> > + return max_t(int, cpumask_weight(node_cpumask), 1); >> > +} >> > +#endif >> > + >> >> Maybe we should make this default and let architectures that want a single >> thread override deferred_page_init_max_threads() to return 1? > > It would affect more archs than I can try this on. Currently, only x86 > (with this change, arm64) return more than one thread. I can test powerpc and we can find someone to test s390. No other arches have it enabled in their defconfig. > I'm happy to send a v2 inverting the logic if you find it preferable. That seems preferable. It's a scalability feature, it makes no sense for the default to be a single thread AFAICS. cheers _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arm64: init: override deferred_page_init_max_threads 2024-05-22 13:41 ` Michael Ellerman @ 2024-05-22 13:54 ` Eric Chanudet 0 siblings, 0 replies; 6+ messages in thread From: Eric Chanudet @ 2024-05-22 13:54 UTC (permalink / raw) To: Michael Ellerman Cc: Mike Rapoport, Catalin Marinas, Will Deacon, Baoquan He, Andrew Morton, Zhen Lei, Yajun Deng, Zhang Jianhua, linux-arm-kernel, linux-kernel, Nick Piggin On Wed, May 22, 2024 at 11:41:07PM +1000, Michael Ellerman wrote: > Eric Chanudet <echanude@redhat.com> writes: > > On Tue, May 21, 2024 at 07:10:07PM +0300, Mike Rapoport wrote: > >> (added powerpc folks) > > Thanks Mike. > > >> On Mon, May 20, 2024 at 07:15:59PM -0400, Eric Chanudet wrote: > >> > This was the behavior prior to making the function arch-specific with > >> > commit ecd096506922 ("mm: make deferred init's max threads > >> > arch-specific") > >> > > >> > Architectures can override the generic implementation that uses only one > >> > CPU. Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64 > >> > platforms shows faster deferred_init_memmap completions: > >> > > >> > | | x13s | SA8775p-ride | Ampere R137-P31 | Ampere HR330 | > >> > | | Metal, 32GB | VM, 36GB | VM, 58GB | Metal, 128GB | > >> > | | 8cpus | 8cpus | 8cpus | 32cpus | > >> > |---------|-------------|--------------|-----------------|--------------| > >> > | threads | ms (%) | ms (%) | ms (%) | ms (%) | > >> > |---------|-------------|--------------|-----------------|--------------| > >> > | 1 | 108 (0%) | 72 (0%) | 224 (0%) | 324 (0%) | > >> > | cpus | 24 (-77%) | 36 (-50%) | 40 (-82%) | 56 (-82%) | > > How did you measure this, just some printks in page_alloc_init_late() or > something more sophisticated? Just so I can do some comparable measurements. I used the existing pr_info in deferred_init_memmap(). > >> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > >> > index 9b5ab6818f7f..71f5188fe63d 100644 > >> > --- a/arch/arm64/mm/init.c > >> > +++ b/arch/arm64/mm/init.c > >> > @@ -158,6 +158,13 @@ static void __init zone_sizes_init(void) > >> > free_area_init(max_zone_pfns); > >> > } > >> > > >> > +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT > >> > +int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask) > >> > +{ > >> > + return max_t(int, cpumask_weight(node_cpumask), 1); > >> > +} > >> > +#endif > >> > + > >> > >> Maybe we should make this default and let architectures that want a single > >> thread override deferred_page_init_max_threads() to return 1? > > > > It would affect more archs than I can try this on. Currently, only x86 > > (with this change, arm64) return more than one thread. > > I can test powerpc and we can find someone to test s390. No other > arches have it enabled in their defconfig. Many thanks! > > I'm happy to send a v2 inverting the logic if you find it preferable. > > That seems preferable. It's a scalability feature, it makes no sense for > the default to be a single thread AFAICS. Understood, I'll respin. -- Eric Chanudet _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-05-22 13:54 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-05-20 23:15 [PATCH] arm64: init: override deferred_page_init_max_threads Eric Chanudet 2024-05-21 14:47 ` Baoquan He 2024-05-21 16:10 ` Mike Rapoport 2024-05-21 22:21 ` Eric Chanudet 2024-05-22 13:41 ` Michael Ellerman 2024-05-22 13:54 ` Eric Chanudet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).