linux-kernel.vger.kernel.org archive mirror
From: Jiang Liu <jiang.liu@linux.intel.com>
To: Denys Vlasenko <dvlasenk@redhat.com>, Ingo Molnar <mingo@kernel.org>
Cc: Daniel J Blueman <daniel@numascale.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Len Brown <len.brown@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] x86/apic: Use smaller array for __apicid_to_node[] mapping
Date: Fri, 9 Oct 2015 23:35:32 +0800
Message-ID: <5617DEC4.5060705@linux.intel.com>
In-Reply-To: <1443813145-29102-3-git-send-email-dvlasenk@redhat.com>

On 2015/10/3 3:12, Denys Vlasenko wrote:
> From: Daniel J Blueman <daniel@numascale.com>
> 
> The Intel x2APIC spec states the upper 16-bits of APIC ID is the
> cluster ID [1, p2-12], intended for future distributed systems. Beyond
> the legacy 8-bit APIC ID, Numascale NumaConnect uses 4-bits for the
> position of a server on each axis of a multi-dimension torus; SGI
> NUMAlink also structures the APIC ID space.
> 
> Instead, define an array based on NR_CPUs to achieve a 1:1 mapping and
> perform linear search; this addresses the binary bloat and the present
> artificial APIC ID limits. With CONFIG_NR_CPUS=256:
> 
> $ size vmlinux vmlinux-patched
>   text      data     bss      dec     hex filename
> 18232877 1849656 2281472 22364005 1553f65 vmlinux
> 18233034 1786168 2281472 22300674 1544802 vmlinux-patched
> 
> That is, ~64 kbytes less data.
> 
> Works peachy on a 256-core system with a 20-bit APIC ID space, and on a
> 48-core legacy 8-bit APIC ID system. If we care, I can make
> numa_cpu_node O(1) lookup for typical cases.
> 
> Signed-off-by: Daniel J Blueman <daniel@numascale.com>
> CC: Ingo Molnar <mingo@kernel.org>
> CC: Daniel J Blueman <daniel@numascale.com>
> CC: Jiang Liu <jiang.liu@linux.intel.com>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Len Brown <len.brown@intel.com>
> CC: x86@kernel.org
> CC: linux-kernel@vger.kernel.org
> 
> [1]
> http://www.intel.com/content/dam/doc/specification-update/64-architecture-x2apic-specification.pdf
> ---
> 
> I added forgotten change in arch/x86/mm/numa_emulation.c (Denys)
> 
>  arch/x86/include/asm/numa.h  | 13 +++++++------
>  arch/x86/kernel/cpu/amd.c    |  8 ++++----
>  arch/x86/mm/numa.c           | 31 +++++++++++++++++++++++--------
>  arch/x86/mm/numa_emulation.c |  6 +++---
>  4 files changed, 37 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
> index c2ecfd0..33becb8 100644
> --- a/arch/x86/include/asm/numa.h
> +++ b/arch/x86/include/asm/numa.h
> @@ -17,6 +17,11 @@
>   */
>  #define NODE_MIN_SIZE (4*1024*1024)
>  
> +struct apicid_to_node {
> +	int apicid;
> +	s16 node;
> +};
> +
>  extern int numa_off;
>  
>  /*
> @@ -27,17 +32,13 @@ extern int numa_off;
>   * should be accessed by the accessors - set_apicid_to_node() and
>   * numa_cpu_node().
>   */
> -extern s16 __apicid_to_node[MAX_LOCAL_APICID];
> +extern struct apicid_to_node __apicid_to_node[NR_CPUS];
Hi Denys and Daniel,
	I still have some concerns about limiting the array to NR_CPUS
entries. __apicid_to_node[] is populated in the order in which CPUs are
listed in the ACPI SRAT table, whereas CPU IDs are allocated in the
order in which CPUs are listed in the ACPI MADT (APIC) table. So it may
cause trouble if:
1) the system has more than NR_CPUS CPUs
2) the CPUs are listed in a different order in the SRAT and MADT tables.
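
To make the concern concrete, here is a minimal userspace sketch, not
the patch's code. It assumes the table is filled first-come-first-served
in SRAT order and searched linearly. NR_CPUS is deliberately tiny, and
the helper names and APIC ID values are made up for illustration.

```c
#include <stdio.h>

#define NR_CPUS      4     /* deliberately tiny, for illustration only */
#define NUMA_NO_NODE (-1)

struct apicid_to_node {
	int apicid;
	short node;
};

static struct apicid_to_node apicid_map[NR_CPUS];
static int apicid_map_used;

/* Hypothetical stand-in for set_apicid_to_node(): append in call order. */
static int map_apicid_to_node(int apicid, short node)
{
	int i;

	for (i = 0; i < apicid_map_used; i++) {
		if (apicid_map[i].apicid == apicid) {
			apicid_map[i].node = node;
			return 0;
		}
	}
	if (apicid_map_used >= NR_CPUS)
		return -1;	/* table full: this mapping is dropped */

	apicid_map[apicid_map_used].apicid = apicid;
	apicid_map[apicid_map_used].node = node;
	apicid_map_used++;
	return 0;
}

/* Linear search, as the patch description proposes. */
static short lookup_node(int apicid)
{
	int i;

	for (i = 0; i < apicid_map_used; i++)
		if (apicid_map[i].apicid == apicid)
			return apicid_map[i].node;
	return NUMA_NO_NODE;
}

/* Returns how many of the first NR_CPUS MADT CPUs end up without a node. */
static int demo_lost_mappings(void)
{
	/* Made-up firmware tables: six CPUs, listed in different orders. */
	static const int srat_apicid[] = { 0x10, 0x11, 0x12, 0x13, 0x00, 0x01 };
	static const short srat_node[] = { 1, 1, 1, 1, 0, 0 };
	static const int madt_apicid[] = { 0x00, 0x01, 0x10, 0x11, 0x12, 0x13 };
	int i, lost = 0;

	apicid_map_used = 0;
	for (i = 0; i < 6; i++)
		map_apicid_to_node(srat_apicid[i], srat_node[i]);

	/* Only the first NR_CPUS MADT entries become logical CPUs. */
	for (i = 0; i < NR_CPUS; i++)
		if (lookup_node(madt_apicid[i]) == NUMA_NO_NODE)
			lost++;
	return lost;
}
```

In this made-up layout the table fills up with APIC IDs 0x10-0x13
before the SRAT entries for 0x00 and 0x01 are seen, so the first two
CPUs brought up from the MADT have no node mapping.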

<snip>
> @@ -607,9 +625,6 @@ static int __init numa_init(int (*init_func)(void))
>  	int i;
>  	int ret;
>  
> -	for (i = 0; i < MAX_LOCAL_APICID; i++)
> -		set_apicid_to_node(i, NUMA_NO_NODE);
> -
	Why remove the above code? numa_init() may be called multiple
times, so it needs to reset the __apicid_to_node[] array on the second
and subsequent calls. We therefore need another way to reset the
__apicid_to_node[] array instead of simply deleting the above code.
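
One possible shape for such a reset, as a userspace sketch only: it
assumes the new struct layout from this patch, and EMPTY_APICID and
reset_apicid_map() are hypothetical names, not existing kernel symbols.

```c
#include <stddef.h>

#define NR_CPUS      8
#define NUMA_NO_NODE (-1)
#define EMPTY_APICID (-1)	/* hypothetical "slot unused" marker */

struct apicid_to_node {
	int apicid;
	short node;
};

static struct apicid_to_node apicid_map[NR_CPUS];

/* Mark every slot unused so a later init attempt starts from a clean table. */
static void reset_apicid_map(void)
{
	size_t i;

	for (i = 0; i < sizeof(apicid_map) / sizeof(apicid_map[0]); i++) {
		apicid_map[i].apicid = EMPTY_APICID;
		apicid_map[i].node = NUMA_NO_NODE;
	}
}

/* Simulate a failed first attempt that left entries behind, then reset. */
static int count_stale_after_reset(void)
{
	size_t i;
	int stale = 0;

	apicid_map[0].apicid = 0x10;
	apicid_map[0].node = 1;
	apicid_map[1].apicid = 0x11;
	apicid_map[1].node = 1;

	reset_apicid_map();

	for (i = 0; i < sizeof(apicid_map) / sizeof(apicid_map[0]); i++)
		if (apicid_map[i].apicid != EMPTY_APICID ||
		    apicid_map[i].node != NUMA_NO_NODE)
			stale++;
	return stale;
}
```

Calling something like this at the top of numa_init() would replace the
removed loop without iterating over MAX_LOCAL_APICID entries.
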
Thanks,
Gerry

>  	nodes_clear(numa_nodes_parsed);
>  	nodes_clear(node_possible_map);
>  	nodes_clear(node_online_map);
> diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
> index a8f90ce..1a0e112 100644
> --- a/arch/x86/mm/numa_emulation.c
> +++ b/arch/x86/mm/numa_emulation.c
> @@ -399,12 +399,12 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
>  	 * back to zero just in case.
>  	 */
>  	for (i = 0; i < ARRAY_SIZE(__apicid_to_node); i++) {
> -		if (__apicid_to_node[i] == NUMA_NO_NODE)
> +		if (__apicid_to_node[i].node == NUMA_NO_NODE)
>  			continue;
>  		for (j = 0; j < ARRAY_SIZE(emu_nid_to_phys); j++)
> -			if (__apicid_to_node[i] == emu_nid_to_phys[j])
> +			if (__apicid_to_node[i].node == emu_nid_to_phys[j])
>  				break;
> -		__apicid_to_node[i] = j < ARRAY_SIZE(emu_nid_to_phys) ? j : 0;
> +		__apicid_to_node[i].node = j < ARRAY_SIZE(emu_nid_to_phys) ? j : 0;
>  	}
>  
>  	/* make sure all emulated nodes are mapped to a physical node */
> 

Thread overview: 14+ messages
2015-10-02 19:12 [PATCH 1/3] x86/apic: Rename MAX_LOCAL_APIC to MAX_LOCAL_APICID Denys Vlasenko
2015-10-02 19:12 ` [PATCH 2/3] x86/apic: Make apic_version[] smaller Denys Vlasenko
2015-10-02 19:12 ` [PATCH 3/3] x86/apic: Use smaller array for __apicid_to_node[] mapping Denys Vlasenko
2015-10-03  7:44   ` Ingo Molnar
2015-10-03 20:26     ` Denys Vlasenko
2015-10-05  4:32     ` [PATCH v2] " Daniel J Blueman
2015-10-09 14:15       ` Thomas Gleixner
2015-10-09 15:16         ` Jiang Liu
2015-10-09 20:40           ` Thomas Gleixner
2015-10-09 15:35   ` Jiang Liu [this message]
2015-10-12 10:21     ` [PATCH 3/3] " Daniel J Blueman
2015-10-12 10:25       ` Thomas Gleixner
2015-10-13  9:32         ` Jiang Liu
2015-10-13 12:55           ` Thomas Gleixner
