From: Michael Ellerman <mpe@ellerman.id.au>
To: Eric Chanudet <echanude@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>, Mike Rapoport <rppt@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Baoquan He <bhe@redhat.com>, Nick Piggin <npiggin@gmail.com>
Cc: x86@kernel.org, linux-arm-kernel@lists.infradead.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-s390@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
Eric Chanudet <echanude@redhat.com>
Subject: Re: [PATCH v2] mm/mm_init: use node's number of cpus in deferred_page_init_max_threads
Date: Thu, 23 May 2024 21:13:55 +1000
Message-ID: <87jzjk3hn0.fsf@mail.lhotse>
In-Reply-To: <20240522203758.626932-4-echanude@redhat.com>
Eric Chanudet <echanude@redhat.com> writes:
> x86_64 already uses the node's number of cpus as the maximum number of
> threads. Make that the default for all archs setting
> DEFERRED_STRUCT_PAGE_INIT.
>
> This returns to the behavior prior to making the function arch-specific
> with commit ecd096506922 ("mm: make deferred init's max threads
> arch-specific").
>
> Signed-off-by: Eric Chanudet <echanude@redhat.com>
>
> ---
> Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64 platforms
> shows faster deferred_init_memmap completions:
>
> |         | x13s        | SA8775p-ride | Ampere R137-P31 | Ampere HR330 |
> |         | Metal, 32GB | VM, 36GB     | VM, 58GB        | Metal, 128GB |
> |         | 8cpus       | 8cpus        | 8cpus           | 32cpus       |
> |---------|-------------|--------------|-----------------|--------------|
> | threads | ms (%)      | ms (%)       | ms (%)          | ms (%)       |
> |---------|-------------|--------------|-----------------|--------------|
> | 1       | 108 (0%)    | 72 (0%)      | 224 (0%)        | 324 (0%)     |
> | cpus    | 24 (-77%)   | 36 (-50%)    | 40 (-82%)       | 56 (-82%)    |
>
> - v1: https://lore.kernel.org/linux-arm-kernel/20240520231555.395979-5-echanude@redhat.com
> - Changes since v1:
>   - Make the generic function return the number of cpus of the node as
>     max threads limit instead of overriding it for arm64.
>   - Drop Baoquan He's R-b on v1 since the logic changed.
>   - Add CCs according to patch changes (ppc and s390 set
>     DEFERRED_STRUCT_PAGE_INIT by default).
>
> arch/x86/mm/init_64.c | 12 ------------
> mm/mm_init.c | 2 +-
> 2 files changed, 1 insertion(+), 13 deletions(-)
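
For readers without the diff at hand, the behaviour described above can be sketched in userspace C. This is only an illustration under stated assumptions: `node_cpumask_t`, `node_cpu_count()` and `max_threads_for_node()` are hypothetical stand-ins and not the kernel's code, and the clamp to a minimum of one thread is assumed so that a node without CPUs would still get a worker.

```c
#include <stdint.h>

/* Hypothetical stand-in for a per-node CPU mask: one bit per CPU. */
typedef uint64_t node_cpumask_t;

/* Count the CPUs set in the mask. */
static unsigned int node_cpu_count(node_cpumask_t mask)
{
	unsigned int count = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		count++;
	}
	return count;
}

/*
 * Thread limit for a node's deferred page init: the node's CPU count,
 * but never less than 1 (assumed here so a CPU-less node still has one
 * thread doing the work).
 */
static unsigned int max_threads_for_node(node_cpumask_t mask)
{
	unsigned int cpus = node_cpu_count(mask);

	return cpus > 0 ? cpus : 1;
}
```

With this sketch an 8-cpu node (mask 0xFF) gets a limit of 8 threads, while an empty mask still gets 1.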
On a machine here (1TB, 40 cores, 4KB pages) the existing code gives:

[ 0.500124] node 2 deferred pages initialised in 210ms
[ 0.515790] node 3 deferred pages initialised in 230ms
[ 0.516061] node 0 deferred pages initialised in 230ms
[ 0.516522] node 7 deferred pages initialised in 230ms
[ 0.516672] node 4 deferred pages initialised in 230ms
[ 0.516798] node 6 deferred pages initialised in 230ms
[ 0.517051] node 5 deferred pages initialised in 230ms
[ 0.523887] node 1 deferred pages initialised in 240ms

vs with the patch:

[ 0.379613] node 0 deferred pages initialised in 90ms
[ 0.380388] node 1 deferred pages initialised in 90ms
[ 0.380540] node 4 deferred pages initialised in 100ms
[ 0.390239] node 6 deferred pages initialised in 100ms
[ 0.390249] node 2 deferred pages initialised in 100ms
[ 0.390786] node 3 deferred pages initialised in 110ms
[ 0.396721] node 5 deferred pages initialised in 110ms
[ 0.397095] node 7 deferred pages initialised in 110ms
Which is a nice speedup.
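
To put a rough number on it, the slowest node in each run can be compared (240ms with the existing code, 110ms with the patch). A minimal sketch of that arithmetic follows; the helper names are hypothetical and only the two timings come from the logs above.

```c
/* Percentage change from before_ms to after_ms; negative means faster. */
static double percent_change(double before_ms, double after_ms)
{
	return (after_ms - before_ms) / before_ms * 100.0;
}

/* How many times faster the "after" run is than the "before" run. */
static double speedup_factor(double before_ms, double after_ms)
{
	return before_ms / after_ms;
}
```

For the slowest node, percent_change(240, 110) comes to about -54%, and speedup_factor(240, 110) to roughly 2.2x.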
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
cheers
Thread overview:
2024-05-22 20:38 [PATCH v2] mm/mm_init: use node's number of cpus in deferred_page_init_max_threads Eric Chanudet
2024-05-22 22:46 ` Andrew Morton
2024-05-23 11:13 ` Michael Ellerman [this message]
2024-05-23 14:59 ` Mike Rapoport