Re: [Discontig-devel] RE: Cleanup of NUMA support in ACPI

From: Erich Focht <efocht-+HQ0pkNQ8fyELgA04lAiVw@public.gmane.org>
To: "KOCHI, Takayoshi" <t-kouchi-dPjYVeZdYcz+G+EEi5ephHgSJqDPrsil@public.gmane.org>
Cc: andrew.grover-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
	discontig-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
	acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
	David Mosberger <davidm-sDzT885Ts8HQT0dZR+AlfA@public.gmane.org>
Subject: Re: [Discontig-devel] RE: Cleanup of NUMA support in ACPI
Date: Thu, 15 Aug 2002 17:58:05 +0200
Message-ID: <200208151758.05568.efocht@ess.nec.de>
In-Reply-To: <20020815092003.JBWFC0A82650.6C9EC293-dPjYVeZdYcz+G+EEi5ephHgSJqDPrsil@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 1709 bytes --]

Hi Tak,

looks good! Now that the parts are finally split, there doesn't seem
to be any reason not to include the arch-independent part in the
mainline.

Here comes the rest, the ia64 arch-dependent part. It applies over the
latest Linus tree and should apply to David's 2.5.30 + Kimi's startup
fix, too.

Regards,
Erich


On Thursday 15 August 2002 02:21, KOCHI, Takayoshi wrote:
> Hi Andy,
>
> > > There are a bunch of other ACPI config options that should be split out
> > > like this as well, so if you don't want to redo the patch, I'll just
> > > take it and it can get fixed in a sweep with the others, later.
> >
> > Erich has worked to clean up some portions and sent them to LKML.
> > I'll split the ACPI part from the patch and send it to you this afternoon.
>
> Here's the patch.
> This is generated against the 2.5.30 kernel's driver.
> For 2.4.18 + 20020725, I saw failures against Config.in and Makefile...
> perhaps it's easy to merge.
>
>
> For discontig/numa developers:
>
> If this is applied, both i386 and ia64 have to provide their own
> functions in architecture-dependent numa support (only when CONFIG_NUMA and
> CONFIG_ACPI_NUMA are both Y):
>
> void __init acpi_numa_slit_init (struct acpi_table_slit *slit);
> void __init acpi_numa_processor_affinity_init (struct acpi_table_processor_affinity *pa);
> void __init acpi_numa_memory_affinity_init (struct acpi_table_memory_affinity *ma);
> void __init acpi_numa_arch_fixup(void);
>
> Also, acpi_numa_init() has to be called at a very early stage of the
> architecture-dependent setup routine, so that the SRAT/SLIT information
> can be used for constructing NUMA-related structures.
>
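
The contract quoted above (per-entry hooks supplied by the architecture,
followed by a single fixup call) can be illustrated with a minimal
user-space sketch. Everything prefixed sk_ is hypothetical and only stands
in for the real kernel types and functions; the call order shown (all SRAT
entries first, fixup last) is an assumption about the generic parser, not
something this mail confirms, and SLIT handling is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-ins for the SRAT entry structures. */
struct sk_cpu_affinity { unsigned char proximity_domain, apic_id; };
struct sk_mem_affinity { unsigned char proximity_domain; unsigned long base, length; };

static int sk_cpus_seen, sk_mems_seen;	/* counted by the per-entry hooks */
static int sk_entries_at_fixup = -1;	/* snapshot taken by the fixup hook */

/* Arch hook: called once per processor affinity entry. */
static void sk_processor_affinity_init(const struct sk_cpu_affinity *pa)
{
	assert(pa != NULL);
	sk_cpus_seen++;
}

/* Arch hook: called once per memory affinity entry. */
static void sk_memory_affinity_init(const struct sk_mem_affinity *ma)
{
	assert(ma != NULL);
	sk_mems_seen++;
}

/* Arch hook: called once, after every entry has been handed over. */
static void sk_arch_fixup(void)
{
	sk_entries_at_fixup = sk_cpus_seen + sk_mems_seen;
}

/* Stand-in for the generic parser (acpi_numa_init() in the real code). */
static void sk_numa_init(const struct sk_cpu_affinity *cpus, size_t ncpus,
			 const struct sk_mem_affinity *mems, size_t nmems)
{
	size_t i;

	for (i = 0; i < ncpus; i++)
		sk_processor_affinity_init(&cpus[i]);
	for (i = 0; i < nmems; i++)
		sk_memory_affinity_init(&mems[i]);
	sk_arch_fixup();
}
```

Calling sk_numa_init() with arrays of fake entries leaves the per-entry
counters filled in before sk_arch_fixup() runs, which is the property the
arch code relies on when it converts proximity domains to logical node IDs.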

[-- Attachment #2: acpi-numa-2.5.31-ia64.patch --]
[-- Type: text/x-diff, Size: 12425 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.520   -> 1.521  
#	arch/ia64/kernel/setup.c	1.22    -> 1.23   
#	arch/ia64/mm/Makefile	1.1     -> 1.2    
#	include/asm-ia64/acpi.h	1.3     -> 1.4    
#	arch/ia64/kernel/acpi.c	1.16    -> 1.17   
#	               (new)	        -> 1.1     arch/ia64/mm/numa.c
#	               (new)	        -> 1.1     include/asm-ia64/numa.h
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/08/15	focht-3YFNcdZb8i1z/Fpu909jNA@public.gmane.org	1.521
#   Architecture dependent part (IA64) of ACPI SRAT (Static Resource
#   Affinity Table) and SLIT (System Locality Information Table)
#   parsing code. Tables are parsed if CONFIG_ACPI_NUMA=y.
#   
#   The ACPI SRAT and SLIT parsing routines are actually quite
#   arch-independent, but currently this setup is used only on IA64.
#   The NUMA related variables touched are:
#    node_memblk : physical memory blocks, node to which they belong
#    node_cpuid : hardware cpu IDs and nodes to which they belong
#    numa_slit : locality matrix with "distances" between nodes.
#   
#   This setup is needed for CONFIG_DISCONTIGMEM on IA64 but can be
#   used for other NUMA purposes, too.
# --------------------------------------------
#
diff -Nru a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
--- a/arch/ia64/kernel/acpi.c	Thu Aug 15 17:48:17 2002
+++ b/arch/ia64/kernel/acpi.c	Thu Aug 15 17:48:17 2002
@@ -8,6 +8,9 @@
  *  Copyright (C) 2000 Intel Corp.
  *  Copyright (C) 2000,2001 J.I. Lee <jung-ik.lee-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  *  Copyright (C) 2001 Paul Diefenbaugh <paul.s.diefenbaugh-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
+ *  Copyright (C) 2001 Jenna Hall <jenna.s.hall-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
+ *  Copyright (C) 2001 Takayoshi Kochi <t-kouchi-f7IHDacdhdx8UrSeD/g0lQ@public.gmane.org>
+ *  Copyright (C) 2002 Erich Focht <efocht-+HQ0pkNQ8fyELgA04lAiVw@public.gmane.org>
  *
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
@@ -43,6 +46,7 @@
 #include <asm/machvec.h>
 #include <asm/page.h>
 #include <asm/system.h>
+#include <asm/numa.h>
 
 
 #define PREFIX			"ACPI: "
@@ -440,6 +444,168 @@
 	return 0;
 }
 
+#ifdef CONFIG_ACPI_NUMA
+
+#define SRAT_DEBUG
+#define SLIT_DEBUG
+
+#define PXM_FLAG_LEN ((MAX_PXM_DOMAINS + 1)/32)
+
+static int __initdata srat_num_cpus = 0;		/* number of cpus */
+static u32 __initdata pxm_flag[PXM_FLAG_LEN] = { [0 ... PXM_FLAG_LEN-1] = 0};
+#define PXM_BIT_SET(bit)	(set_bit(bit,(void *)pxm_flag))
+#define PXM_BIT_CLEAR(bit)	(clear_bit(bit,(void *)pxm_flag))
+#define PXM_BIT_TEST(bit)	(test_bit(bit,(void *)pxm_flag))
+/* maps to convert between proximity domain and logical node ID */
+int pxm_to_nid_map[MAX_PXM_DOMAINS] = { [0 ... MAX_PXM_DOMAINS-1] = -1};
+int nid_to_pxm_map[NR_NODES] = { [0 ... NR_NODES-1] = -1};
+
+/*
+ * ACPI 2.0 SLIT (System Locality Information Table)
+ * http://devresource.hp.com/devresource/Docs/TechPapers/IA64/
+ */
+void __init
+acpi_numa_slit_init (struct acpi_table_slit *slit)
+{
+	int i, j, node_from, node_to;
+	u32 len;
+
+	len = sizeof(struct acpi_table_header) + 8 
+		+ slit->localities * slit->localities;
+	if (slit->header.length != len) {
+		printk("ACPI 2.0 SLIT: size mismatch: %d expected, %d actual\n",
+		      len, slit->header.length);
+		memset(numa_slit, 10, sizeof(numa_slit));
+		return;
+	}
+
+	memset(numa_slit, -1, sizeof(numa_slit));
+	for (i=0; i<slit->localities; i++) {
+		if (!PXM_BIT_TEST(i))
+			continue;
+		node_from = pxm_to_nid_map[i];
+		for (j=0; j<slit->localities; j++) {
+			if (!PXM_BIT_TEST(j))
+				continue;
+			node_to = pxm_to_nid_map[j];
+			node_distance(node_from, node_to) = 
+				slit->entry[i*slit->localities + j];
+		}
+	}
+
+#ifdef SLIT_DEBUG
+	printk("ACPI 2.0 SLIT locality table:\n");
+	for (i = 0; i < numnodes; i++) {
+		for (j = 0; j < numnodes; j++)
+			printk("%03d ", node_distance(i,j));
+		printk("\n");
+	}
+#endif
+}
+
+void __init
+acpi_numa_processor_affinity_init (struct acpi_table_processor_affinity *pa)
+{
+	/* record this node in proximity bitmap */
+	PXM_BIT_SET(pa->proximity_domain);
+
+	node_cpuid[srat_num_cpus].phys_id = (pa->apic_id << 8) | (pa->lsapic_eid);
+	/* nid should be overridden as logical node id later */
+	node_cpuid[srat_num_cpus].nid = pa->proximity_domain;
+	srat_num_cpus++;
+
+#ifdef SRAT_DEBUG
+	printk("CPU %x in proximity domain %x %s\n",
+	       pa->apic_id, pa->proximity_domain,
+	       pa->flags.enabled ? "enabled" : "disabled");
+#endif
+}
+
+void __init
+acpi_numa_memory_affinity_init (struct acpi_table_memory_affinity *ma)
+{
+	unsigned long paddr, size;
+	u8 pxm;
+	struct node_memblk_s *p, *q, *pend;
+
+	pxm = ma->proximity_domain;
+
+	/* record this node in proximity bitmap */
+	PXM_BIT_SET(pxm);
+
+	/* fill node memory chunk structure */
+	paddr = ma->base_addr_hi;
+	paddr = (paddr << 32) | ma->base_addr_lo;
+	size = ma->length_hi;
+	size = (size << 32) | ma->length_lo;
+
+	if (num_memblks >= NR_MEMBLKS) {
+		printk("Too many mem chunks in SRAT. Ignoring %ld MBytes at %lx\n",
+			size/(1024*1024), paddr);
+		return;
+	}
+
+	/* Insertion sort based on base address */
+	pend = &node_memblk[num_memblks];
+	for (p = &node_memblk[0]; p < pend; p++) {
+		if (paddr < p->start_paddr)
+			break;
+	}
+	if (p < pend) {
+		for (q = pend - 1; q >= p; q--)
+			*(q + 1) = *q;
+	}
+	p->start_paddr = paddr;
+	p->size = size;
+	p->nid = pxm;
+	num_memblks++;
+
+#ifdef SRAT_DEBUG
+	printk("Memory range 0x%lx to 0x%lx (type %x) in proximity domain %x %s\n",
+	       paddr, paddr + size - 1,
+	       ma->memory_type, ma->proximity_domain,
+	       ma->flags.enabled ? (ma->flags.hot_pluggable ? 
+				    "enabled and removable" : "enabled" )
+	       : "disabled");
+#endif
+}
+
+void __init
+acpi_numa_arch_fixup(void)
+{
+	int i, j;
+
+	/* calculate total number of nodes in system from PXM bitmap */
+	numnodes = 0;		/* init total nodes in system */
+	for (i = 0; i < MAX_PXM_DOMAINS; i++) {
+		if (PXM_BIT_TEST(i)) {
+			pxm_to_nid_map[i] = numnodes;
+			nid_to_pxm_map[numnodes++] = i;
+		}
+	}
+
+	/* set logical node id in memory chunk structure */
+	for (i = 0; i < num_memblks; i++)
+		node_memblk[i].nid = pxm_to_nid_map[node_memblk[i].nid];
+
+	/* assign memory bank numbers for each chunk on each node */
+	for (i = 0; i < numnodes; i++) {
+		int bank;
+
+		bank = 0;
+		for (j = 0; j < num_memblks; j++)
+			if (node_memblk[j].nid == i)
+				node_memblk[j].bank = bank++;
+	}
+
+	/* set logical node id in cpu structure */
+	for (i = 0; i < srat_num_cpus; i++)
+		node_cpuid[i].nid = pxm_to_nid_map[node_cpuid[i].nid];
+
+	printk("Number of logical nodes in system = %d\n", numnodes);
+	printk("Number of memory chunks in system = %d\n", num_memblks);
+}
+#endif /* CONFIG_ACPI_NUMA */
 static int __init
 acpi_parse_fadt (unsigned long phys_addr, unsigned long size)
 {
@@ -535,12 +701,6 @@
 int __init
 acpi_boot_init (char *cmdline)
 {
-	int result;
-
-	/* Initialize the ACPI boot-time table parser */
-	result = acpi_table_init(cmdline);
-	if (result)
-		return result;
 
 	/*
 	 * MADT
diff -Nru a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
--- a/arch/ia64/kernel/setup.c	Thu Aug 15 17:48:17 2002
+++ b/arch/ia64/kernel/setup.c	Thu Aug 15 17:48:17 2002
@@ -297,6 +297,16 @@
 
 	efi_init();
 
+#ifdef CONFIG_ACPI_BOOT
+	/* Initialize the ACPI boot-time table parser */
+	acpi_table_init(*cmdline_p);
+
+#ifdef CONFIG_ACPI_NUMA
+	acpi_numa_init();
+#endif
+
+#endif /* CONFIG_ACPI_BOOT */
+
 	find_memory();
 
 #if 0
diff -Nru a/arch/ia64/mm/Makefile b/arch/ia64/mm/Makefile
--- a/arch/ia64/mm/Makefile	Thu Aug 15 17:48:17 2002
+++ b/arch/ia64/mm/Makefile	Thu Aug 15 17:48:17 2002
@@ -10,5 +10,6 @@
 O_TARGET := mm.o
 
 obj-y	 := init.o fault.o tlb.o extable.o
+obj-$(CONFIG_NUMA) += numa.o
 
 include $(TOPDIR)/Rules.make
diff -Nru a/arch/ia64/mm/numa.c b/arch/ia64/mm/numa.c
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/arch/ia64/mm/numa.c	Thu Aug 15 17:48:17 2002
@@ -0,0 +1,46 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file contains NUMA specific variables and functions which can
+ * be split away from DISCONTIGMEM and are used on NUMA machines with
+ * contiguous memory.
+ * 
+ *                         2002/08/07 Erich Focht <efocht-+HQ0pkNQ8fyELgA04lAiVw@public.gmane.org>
+ */
+
+#include <linux/config.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/init.h>
+#include <linux/bootmem.h>
+#include <linux/mmzone.h>
+#include <asm/numa.h>
+
+/*
+ * The following structures are usually initialized by ACPI or
+ * similar mechanisms and describe the NUMA characteristics of the machine.
+ */
+int num_memblks = 0;
+struct node_memblk_s node_memblk[NR_MEMBLKS];
+struct node_cpuid_s node_cpuid[NR_CPUS];
+/*
+ * This is a matrix with "distances" between nodes, they should be
+ * proportional to the memory access latency ratios.
+ */
+u8 numa_slit[NR_NODES * NR_NODES];
+
+/* Identify which cnode a physical address resides on */
+int
+paddr_to_nid(unsigned long paddr)
+{
+	int	i;
+
+	for (i = 0; i < num_memblks; i++)
+		if (paddr >= node_memblk[i].start_paddr &&
+		    paddr < node_memblk[i].start_paddr + node_memblk[i].size)
+			break;
+
+	return (i < num_memblks) ? node_memblk[i].nid : -1;
+}
diff -Nru a/include/asm-ia64/acpi.h b/include/asm-ia64/acpi.h
--- a/include/asm-ia64/acpi.h	Thu Aug 15 17:48:17 2002
+++ b/include/asm-ia64/acpi.h	Thu Aug 15 17:48:17 2002
@@ -97,16 +97,16 @@
 	} while (0)
 
 const char *acpi_get_sysname (void);
-int acpi_boot_init (char *cdline);
 int acpi_request_vector (u32 int_type);
 int acpi_get_prt (struct pci_vector_struct **vectors, int *count);
 int acpi_get_interrupt_model(int *type);
 
-#ifdef CONFIG_DISCONTIGMEM
-#define NODE_ARRAY_INDEX(x)	((x) / 8)	/* 8 bits/char */
-#define NODE_ARRAY_OFFSET(x)	((x) % 8)	/* 8 bits/char */
-#define MAX_PXM_DOMAINS		(256)
-#endif /* CONFIG_DISCONTIGMEM */
+#ifdef CONFIG_ACPI_NUMA
+/* Proximity bitmap length; _PXM is at most 255 (8 bit)*/
+#define MAX_PXM_DOMAINS (256)
+extern int pxm_to_nid_map[MAX_PXM_DOMAINS];
+extern int nid_to_pxm_map[NR_NODES];
+#endif
 
 #endif /*__KERNEL__*/
 
diff -Nru a/include/asm-ia64/numa.h b/include/asm-ia64/numa.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-ia64/numa.h	Thu Aug 15 17:48:17 2002
@@ -0,0 +1,64 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file contains NUMA specific prototypes and definitions.
+ * 
+ * 2002/08/05 Erich Focht <efocht-+HQ0pkNQ8fyELgA04lAiVw@public.gmane.org>
+ *
+ */
+#ifndef _ASM_IA64_NUMA_H
+#define _ASM_IA64_NUMA_H
+
+#ifdef CONFIG_NUMA
+
+#ifdef CONFIG_DISCONTIGMEM
+# include <asm/mmzone.h>
+# define NR_NODES     (PLAT_MAX_COMPACT_NODES)
+# define NR_MEMBLKS   (PLAT_MAXCLUMPS)
+#else
+# define NR_NODES     (8)
+# define NR_MEMBLKS   (NR_NODES * 8)
+#endif
+
+/* Stuff below this line could be architecture independent */
+
+extern int num_memblks;		/* total number of memory chunks */
+
+/*
+ * List of node memory chunks. Filled when parsing SRAT table to
+ * obtain information about memory nodes.
+*/
+
+struct node_memblk_s {
+	unsigned long start_paddr;
+	unsigned long size;
+	int nid;		/* which logical node contains this chunk? */
+	int bank;		/* which mem bank on this node */
+};
+
+struct node_cpuid_s {
+	u16	phys_id;	/* id << 8 | eid */
+	int	nid;		/* logical node containing this CPU */
+};
+
+extern struct node_memblk_s node_memblk[NR_MEMBLKS];
+extern struct node_cpuid_s node_cpuid[NR_CPUS];
+
+/*
+ * ACPI 2.0 SLIT (System Locality Information Table)
+ * http://devresource.hp.com/devresource/Docs/TechPapers/IA64/
+ *
+ * This is a matrix with "distances" between nodes, they should be
+ * proportional to the memory access latency ratios.
+ */
+
+extern u8 numa_slit[NR_NODES * NR_NODES];
+#define node_distance(from,to) (numa_slit[(from) * numnodes + (to)])
+
+extern int paddr_to_nid(unsigned long paddr);
+
+#endif /* CONFIG_NUMA */
+
+#endif /* _ASM_IA64_NUMA_H */
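
For readers who want to try the proximity-domain compaction done in
acpi_numa_arch_fixup() outside the kernel, here is a minimal user-space
sketch. All sk_ names are hypothetical. It uses one byte per domain instead
of the kernel bitmap helpers, and it adds an upper-bound check on the node
count that the patch itself does not have.

```c
#include <assert.h>

#define SK_MAX_PXM	256	/* mirrors MAX_PXM_DOMAINS in the patch */
#define SK_MAX_NODES	8	/* mirrors NR_NODES without DISCONTIGMEM */

static unsigned char sk_pxm_seen[SK_MAX_PXM];	/* 1 if the domain appeared in SRAT */
static int sk_pxm_to_nid[SK_MAX_PXM];
static int sk_nid_to_pxm[SK_MAX_NODES];

/* Record that a proximity domain was named by some SRAT entry. */
static void sk_mark_pxm(unsigned int pxm)
{
	assert(pxm < SK_MAX_PXM);
	sk_pxm_seen[pxm] = 1;
}

/*
 * Assign logical node IDs 0..n-1 in ascending proximity-domain order.
 * Returns the number of nodes, or -1 if more than SK_MAX_NODES domains
 * were seen (this check is an addition of the sketch).
 */
static int sk_compact_nodes(void)
{
	int pxm, nid, nodes = 0;

	for (pxm = 0; pxm < SK_MAX_PXM; pxm++)
		sk_pxm_to_nid[pxm] = -1;
	for (nid = 0; nid < SK_MAX_NODES; nid++)
		sk_nid_to_pxm[nid] = -1;

	for (pxm = 0; pxm < SK_MAX_PXM; pxm++) {
		if (!sk_pxm_seen[pxm])
			continue;
		if (nodes >= SK_MAX_NODES)
			return -1;
		sk_pxm_to_nid[pxm] = nodes;
		sk_nid_to_pxm[nodes++] = pxm;
	}
	return nodes;
}
```

With domains 0, 4 and 17 marked, the sketch hands out node IDs 0, 1 and 2
in that order, and unmarked domains keep the -1 sentinel in both maps.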

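The sorted insertion in acpi_numa_memory_affinity_init() and the lookup in
paddr_to_nid() can likewise be exercised in user space. The following is
only a sketch under simplifying assumptions: the sk_ names and the table
size are hypothetical, the bank field is omitted, and the tail shift starts
at the last valid element so that nothing past the filled part of the array
is copied.

```c
#include <assert.h>

#define SK_NR_MEMBLKS 8		/* hypothetical table size for the sketch */

struct sk_memblk {
	unsigned long start;	/* physical start address of the chunk */
	unsigned long size;	/* chunk length in bytes */
	int nid;		/* node that owns the chunk */
};

static struct sk_memblk sk_blk[SK_NR_MEMBLKS];
static int sk_nblk;

/* Insert a chunk, keeping the table sorted by start address.
 * Returns 0 on success, -1 if the table is full. */
static int sk_insert_memblk(unsigned long start, unsigned long size, int nid)
{
	int pos, i;

	assert(size > 0);
	if (sk_nblk >= SK_NR_MEMBLKS)
		return -1;

	/* find the first chunk that starts above the new one */
	for (pos = 0; pos < sk_nblk; pos++)
		if (start < sk_blk[pos].start)
			break;

	/* shift the tail up by one slot, last valid element first */
	for (i = sk_nblk - 1; i >= pos; i--)
		sk_blk[i + 1] = sk_blk[i];

	sk_blk[pos].start = start;
	sk_blk[pos].size = size;
	sk_blk[pos].nid = nid;
	sk_nblk++;
	return 0;
}

/* Return the node owning paddr, or -1 if no chunk covers it. */
static int sk_paddr_to_nid(unsigned long paddr)
{
	int i;

	for (i = 0; i < sk_nblk; i++)
		if (paddr >= sk_blk[i].start &&
		    paddr - sk_blk[i].start < sk_blk[i].size)
			return sk_blk[i].nid;
	return -1;
}
```

The lookup compares the offset into the chunk against the size rather than
computing start + size, which avoids wrap-around for a chunk that ends at
the top of the address space; that is a choice of the sketch, not of the
patch.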