All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Shijie Huang <shijie@amperemail.onmicrosoft.com>
Cc: Yury Norov <yury.norov@gmail.com>,
	Huang Shijie <shijie@os.amperecomputing.com>,
	gregkh@linuxfoundation.org, patches@amperecomputing.com,
	rafael@kernel.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, kuba@kernel.org, vschneid@redhat.com,
	mingo@kernel.org, akpm@linux-foundation.org, vbabka@suse.cz,
	tglx@linutronix.de, jpoimboe@kernel.org, ndesaulniers@google.com,
	mikelley@microsoft.com, mhiramat@kernel.org, arnd@arndb.de,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com,
	will@kernel.org, mark.rutland@arm.com, mpe@ellerman.id.au,
	linuxppc-dev@lists.ozlabs.org, chenhuacai@kernel.org,
	jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org,
	cl@os.amperecomputing.com
Subject: Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id
Date: Fri, 19 Jan 2024 10:42:04 +0200	[thread overview]
Message-ID: <Zao13I4Bb0tur0fZ@kernel.org> (raw)
In-Reply-To: <1cd078fd-c345-4d85-a92f-04c806c20efa@amperemail.onmicrosoft.com>

On Fri, Jan 19, 2024 at 02:46:16PM +0800, Shijie Huang wrote:
> 
> 在 2024/1/19 12:42, Yury Norov 写道:
> > This adds another level of indirection, I think. Currently cpu_to_node
> > is a simple inliner. After the patch it would be a real function with
> > all the associate overhead. Can you share a bloat-o-meter output here?
> #./scripts/bloat-o-meter vmlinux vmlinux.new
> add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
> Function                                     old     new   delta
> numa_update_cpu                              148     244     +96
> 
>  ...................................................................................................................................(to many to skip)
> 
> Total: Before=32990130, After=32990710, chg +0.00%
 
It's not only about text size, the indirect call also hurts performance
 
> > 
> > Regardless, I don't think that the approach is correct. As per your
> > description, some initialization functions erroneously call
> > cpu_to_node() instead of early_cpu_to_node() which exists specifically
> > for that case.
> > 
> > If the above correct, it's clearly a caller problem, and the fix is to
> > simply switch all those callers to use early version.
> 
> It is easy to change to early_cpu_to_node() for sched_init(),
> init_sched_fair_class()
> 
> and workqueue_init_early(). These three places call the cpu_to_node() in the
> __init function.
> 
> 
> But it is a little hard to change the early_trace_init(), since it calls
> cpu_to_node in the deep
> 
> function stack:
> 
>   early_trace_init() --> ring_buffer_alloc() -->rb_allocate_cpu_buffer()
> 
> 
> For early_trace_init(), we need to change more code.
> 
> 
> Anyway, If we think it is not a good idea to change the common code, I am
> oaky too.
 
Is there a fundamental reason to have early_cpu_to_node() at all?
It seems that all the mappings are known by the end of setup_arch() and the
initialization of numa_node can be moved earlier. 
 
> > I would also initialize the numa_node with NUMA_NO_NODE at declaration,
> > so that if someone calls cpu_to_node() before the variable is properly
> > initialized at runtime, he'll get NO_NODE, which is obviously an error.
> 
> Even we set the numa_node with NUMA_NO_NODE, it does not always produce
> error.
> 
> Please see the alloc_pages_node().
> 
> 
> Thanks
> 
> Huang Shijie
> 

-- 
Sincerely yours,
Mike.

WARNING: multiple messages have this Message-ID (diff)
From: Mike Rapoport <rppt@kernel.org>
To: Shijie Huang <shijie@amperemail.onmicrosoft.com>
Cc: Yury Norov <yury.norov@gmail.com>,
	Huang Shijie <shijie@os.amperecomputing.com>,
	gregkh@linuxfoundation.org, patches@amperecomputing.com,
	rafael@kernel.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, kuba@kernel.org, vschneid@redhat.com,
	mingo@kernel.org, akpm@linux-foundation.org, vbabka@suse.cz,
	tglx@linutronix.de, jpoimboe@kernel.org, ndesaulniers@google.com,
	mikelley@microsoft.com, mhiramat@kernel.org, arnd@arndb.de,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com,
	will@kernel.org, mark.rutland@arm.com, mpe@ellerman.id.au,
	linuxppc-dev@lists.ozlabs.org, chenhuacai@kernel.org,
	jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org,
	cl@os.amperecomputing.com
Subject: Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id
Date: Fri, 19 Jan 2024 10:42:04 +0200	[thread overview]
Message-ID: <Zao13I4Bb0tur0fZ@kernel.org> (raw)
In-Reply-To: <1cd078fd-c345-4d85-a92f-04c806c20efa@amperemail.onmicrosoft.com>

On Fri, Jan 19, 2024 at 02:46:16PM +0800, Shijie Huang wrote:
> 
> 在 2024/1/19 12:42, Yury Norov 写道:
> > This adds another level of indirection, I think. Currently cpu_to_node
> > is a simple inliner. After the patch it would be a real function with
> > all the associate overhead. Can you share a bloat-o-meter output here?
> #./scripts/bloat-o-meter vmlinux vmlinux.new
> add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
> Function                                     old     new   delta
> numa_update_cpu                              148     244     +96
> 
>  ...................................................................................................................................(to many to skip)
> 
> Total: Before=32990130, After=32990710, chg +0.00%
 
It's not only about text size, the indirect call also hurts performance
 
> > 
> > Regardless, I don't think that the approach is correct. As per your
> > description, some initialization functions erroneously call
> > cpu_to_node() instead of early_cpu_to_node() which exists specifically
> > for that case.
> > 
> > If the above correct, it's clearly a caller problem, and the fix is to
> > simply switch all those callers to use early version.
> 
> It is easy to change to early_cpu_to_node() for sched_init(),
> init_sched_fair_class()
> 
> and workqueue_init_early(). These three places call the cpu_to_node() in the
> __init function.
> 
> 
> But it is a little hard to change the early_trace_init(), since it calls
> cpu_to_node in the deep
> 
> function stack:
> 
>   early_trace_init() --> ring_buffer_alloc() -->rb_allocate_cpu_buffer()
> 
> 
> For early_trace_init(), we need to change more code.
> 
> 
> Anyway, If we think it is not a good idea to change the common code, I am
> oaky too.
 
Is there a fundamental reason to have early_cpu_to_node() at all?
It seems that all the mappings are known by the end of setup_arch() and the
initialization of numa_node can be moved earlier. 
 
> > I would also initialize the numa_node with NUMA_NO_NODE at declaration,
> > so that if someone calls cpu_to_node() before the variable is properly
> > initialized at runtime, he'll get NO_NODE, which is obviously an error.
> 
> Even we set the numa_node with NUMA_NO_NODE, it does not always produce
> error.
> 
> Please see the alloc_pages_node().
> 
> 
> Thanks
> 
> Huang Shijie
> 

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Mike Rapoport <rppt@kernel.org>
To: Shijie Huang <shijie@amperemail.onmicrosoft.com>
Cc: mark.rutland@arm.com, rafael@kernel.org, catalin.marinas@arm.com,
	jiaxun.yang@flygoat.com, mikelley@microsoft.com,
	linux-riscv@lists.infradead.org, will@kernel.org,
	mingo@kernel.org, vschneid@redhat.com, arnd@arndb.de,
	chenhuacai@kernel.org, cl@os.amperecomputing.com,
	linux-arm-kernel@lists.infradead.org, kuba@kernel.org,
	patches@amperecomputing.com, linux-mips@vger.kernel.org,
	aou@eecs.berkeley.edu, Yury Norov <yury.norov@gmail.com>,
	paul.walmsley@sifive.com, tglx@linutronix.de,
	jpoimboe@kernel.org, vbabka@suse.cz,
	Huang Shijie <shijie@os.amperecomputing.com>,
	gregkh@linuxfoundation.org, ndesaulniers@google.com,
	linux-kernel@vger.kernel.org, palmer@dabbelt.com,
	mhiramat@kernel.org, akpm@linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id
Date: Fri, 19 Jan 2024 10:42:04 +0200	[thread overview]
Message-ID: <Zao13I4Bb0tur0fZ@kernel.org> (raw)
In-Reply-To: <1cd078fd-c345-4d85-a92f-04c806c20efa@amperemail.onmicrosoft.com>

On Fri, Jan 19, 2024 at 02:46:16PM +0800, Shijie Huang wrote:
> 
> 在 2024/1/19 12:42, Yury Norov 写道:
> > This adds another level of indirection, I think. Currently cpu_to_node
> > is a simple inliner. After the patch it would be a real function with
> > all the associate overhead. Can you share a bloat-o-meter output here?
> #./scripts/bloat-o-meter vmlinux vmlinux.new
> add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
> Function                                     old     new   delta
> numa_update_cpu                              148     244     +96
> 
>  ...................................................................................................................................(to many to skip)
> 
> Total: Before=32990130, After=32990710, chg +0.00%
 
It's not only about text size, the indirect call also hurts performance
 
> > 
> > Regardless, I don't think that the approach is correct. As per your
> > description, some initialization functions erroneously call
> > cpu_to_node() instead of early_cpu_to_node() which exists specifically
> > for that case.
> > 
> > If the above correct, it's clearly a caller problem, and the fix is to
> > simply switch all those callers to use early version.
> 
> It is easy to change to early_cpu_to_node() for sched_init(),
> init_sched_fair_class()
> 
> and workqueue_init_early(). These three places call the cpu_to_node() in the
> __init function.
> 
> 
> But it is a little hard to change the early_trace_init(), since it calls
> cpu_to_node in the deep
> 
> function stack:
> 
>   early_trace_init() --> ring_buffer_alloc() -->rb_allocate_cpu_buffer()
> 
> 
> For early_trace_init(), we need to change more code.
> 
> 
> Anyway, If we think it is not a good idea to change the common code, I am
> oaky too.
 
Is there a fundamental reason to have early_cpu_to_node() at all?
It seems that all the mappings are known by the end of setup_arch() and the
initialization of numa_node can be moved earlier. 
 
> > I would also initialize the numa_node with NUMA_NO_NODE at declaration,
> > so that if someone calls cpu_to_node() before the variable is properly
> > initialized at runtime, he'll get NO_NODE, which is obviously an error.
> 
> Even we set the numa_node with NUMA_NO_NODE, it does not always produce
> error.
> 
> Please see the alloc_pages_node().
> 
> 
> Thanks
> 
> Huang Shijie
> 

-- 
Sincerely yours,
Mike.

WARNING: multiple messages have this Message-ID (diff)
From: Mike Rapoport <rppt@kernel.org>
To: Shijie Huang <shijie@amperemail.onmicrosoft.com>
Cc: Yury Norov <yury.norov@gmail.com>,
	Huang Shijie <shijie@os.amperecomputing.com>,
	gregkh@linuxfoundation.org, patches@amperecomputing.com,
	rafael@kernel.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, kuba@kernel.org, vschneid@redhat.com,
	mingo@kernel.org, akpm@linux-foundation.org, vbabka@suse.cz,
	tglx@linutronix.de, jpoimboe@kernel.org, ndesaulniers@google.com,
	mikelley@microsoft.com, mhiramat@kernel.org, arnd@arndb.de,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com,
	will@kernel.org, mark.rutland@arm.com, mpe@ellerman.id.au,
	linuxppc-dev@lists.ozlabs.org, chenhuacai@kernel.org,
	jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org,
	cl@os.amperecomputing.com
Subject: Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id
Date: Fri, 19 Jan 2024 10:42:04 +0200	[thread overview]
Message-ID: <Zao13I4Bb0tur0fZ@kernel.org> (raw)
In-Reply-To: <1cd078fd-c345-4d85-a92f-04c806c20efa@amperemail.onmicrosoft.com>

On Fri, Jan 19, 2024 at 02:46:16PM +0800, Shijie Huang wrote:
> 
> 在 2024/1/19 12:42, Yury Norov 写道:
> > This adds another level of indirection, I think. Currently cpu_to_node
> > is a simple inliner. After the patch it would be a real function with
> > all the associate overhead. Can you share a bloat-o-meter output here?
> #./scripts/bloat-o-meter vmlinux vmlinux.new
> add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
> Function                                     old     new   delta
> numa_update_cpu                              148     244     +96
> 
>  ...................................................................................................................................(to many to skip)
> 
> Total: Before=32990130, After=32990710, chg +0.00%
 
It's not only about text size, the indirect call also hurts performance
 
> > 
> > Regardless, I don't think that the approach is correct. As per your
> > description, some initialization functions erroneously call
> > cpu_to_node() instead of early_cpu_to_node() which exists specifically
> > for that case.
> > 
> > If the above correct, it's clearly a caller problem, and the fix is to
> > simply switch all those callers to use early version.
> 
> It is easy to change to early_cpu_to_node() for sched_init(),
> init_sched_fair_class()
> 
> and workqueue_init_early(). These three places call the cpu_to_node() in the
> __init function.
> 
> 
> But it is a little hard to change the early_trace_init(), since it calls
> cpu_to_node in the deep
> 
> function stack:
> 
>   early_trace_init() --> ring_buffer_alloc() -->rb_allocate_cpu_buffer()
> 
> 
> For early_trace_init(), we need to change more code.
> 
> 
> Anyway, If we think it is not a good idea to change the common code, I am
> oaky too.
 
Is there a fundamental reason to have early_cpu_to_node() at all?
It seems that all the mappings are known by the end of setup_arch() and the
initialization of numa_node can be moved earlier. 
 
> > I would also initialize the numa_node with NUMA_NO_NODE at declaration,
> > so that if someone calls cpu_to_node() before the variable is properly
> > initialized at runtime, he'll get NO_NODE, which is obviously an error.
> 
> Even we set the numa_node with NUMA_NO_NODE, it does not always produce
> error.
> 
> Please see the alloc_pages_node().
> 
> 
> Thanks
> 
> Huang Shijie
> 

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2024-01-19  8:42 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-19  3:32 [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id Huang Shijie
2024-01-19  3:32 ` Huang Shijie
2024-01-19  3:32 ` Huang Shijie
2024-01-19  3:32 ` Huang Shijie
2024-01-19  4:42 ` Yury Norov
2024-01-19  4:42   ` Yury Norov
2024-01-19  4:42   ` Yury Norov
2024-01-19  4:42   ` Yury Norov
2024-01-19  6:46   ` Shijie Huang
2024-01-19  6:46     ` Shijie Huang
2024-01-19  6:46     ` Shijie Huang
2024-01-19  6:46     ` Shijie Huang
2024-01-19  7:02     ` Shijie Huang
2024-01-19  7:02       ` Shijie Huang
2024-01-19  7:02       ` Shijie Huang
2024-01-19  7:02       ` Shijie Huang
2024-01-19  8:42     ` Mike Rapoport [this message]
2024-01-19  8:42       ` Mike Rapoport
2024-01-19  8:42       ` Mike Rapoport
2024-01-19  8:42       ` Mike Rapoport
2024-01-19  8:50       ` Shijie Huang
2024-01-19  8:50         ` Shijie Huang
2024-01-19  8:50         ` Shijie Huang
2024-01-19  8:50         ` Shijie Huang
2024-01-19 18:02         ` Yury Norov
2024-01-19 18:02           ` Yury Norov
2024-01-19 18:02           ` Yury Norov
2024-01-19 18:02           ` Yury Norov
2024-01-22  7:32           ` Shijie Huang
2024-01-22  7:32             ` Shijie Huang
2024-01-22  7:32             ` Shijie Huang
2024-01-22  7:32             ` Shijie Huang
2024-01-22  7:41         ` Mike Rapoport
2024-01-22  7:41           ` Mike Rapoport
2024-01-22  7:41           ` Mike Rapoport
2024-01-22  7:41           ` Mike Rapoport
2024-01-22  8:27           ` Shijie Huang
2024-01-22  8:27             ` Shijie Huang
2024-01-22  8:27             ` Shijie Huang
2024-01-22  8:27             ` Shijie Huang
2024-01-19  7:42   ` Shijie Huang
2024-01-19  7:42     ` Shijie Huang
2024-01-19  7:42     ` Shijie Huang
2024-01-19  7:42     ` Shijie Huang
2024-01-19  5:35 ` Greg KH
2024-01-19  5:35   ` Greg KH
2024-01-19  5:35   ` Greg KH
2024-01-19  5:35   ` Greg KH
2024-01-19 16:32 ` kernel test robot
2024-01-19 16:32   ` kernel test robot
2024-01-19 16:32   ` kernel test robot
2024-01-19 16:32   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zao13I4Bb0tur0fZ@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=chenhuacai@kernel.org \
    --cc=cl@os.amperecomputing.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jiaxun.yang@flygoat.com \
    --cc=jpoimboe@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mark.rutland@arm.com \
    --cc=mhiramat@kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=mingo@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=ndesaulniers@google.com \
    --cc=palmer@dabbelt.com \
    --cc=patches@amperecomputing.com \
    --cc=paul.walmsley@sifive.com \
    --cc=rafael@kernel.org \
    --cc=shijie@amperemail.onmicrosoft.com \
    --cc=shijie@os.amperecomputing.com \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    --cc=vschneid@redhat.com \
    --cc=will@kernel.org \
    --cc=yury.norov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.