From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: linux-arch@vger.kernel.org, linux-mm@kvack.org,
linux-numa@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>, Mel Gorman <mel@csn.ul.ie>,
Andi Kleen <andi@firstfloor.org>,
Christoph Lameter <cl@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>,
David Rientjes <rientjes@google.com>,
akpm@linux-foundation.org, eric.whitney@hp.com
Subject: [PATCH/RFC 5/8] numa: Introduce numa_mem_id()- effective local memory node id
Date: Thu, 04 Mar 2010 12:08:17 -0500
Message-ID: <20100304170817.10606.29049.sendpatchset@localhost.localdomain>
In-Reply-To: <20100304170654.10606.32225.sendpatchset@localhost.localdomain>
Against: 2.6.33-mmotm-100302-1838
Introduce numa_mem_id(), based on the generic percpu variable infrastructure,
to track the "effective local memory node" for archs that support memoryless
nodes.

Define the API in <linux/topology.h> when CONFIG_HAVE_MEMORYLESS_NODES is
defined; otherwise provide stubs. Architectures will define
HAVE_MEMORYLESS_NODES if/when they support memoryless nodes.
Archs that want to support memoryless nodes, but don't want to use the
generic version, can override the definitions of:

	numa_mem_id()  - returns node number of "local memory" node
	set_numa_mem() - initialize [this cpu's] per cpu variable 'numa_mem'
	cpu_to_mem()   - return numa_mem for specified cpu; may be used as lvalue
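
As a rough illustration of the override pattern described above, here is a
userspace model, not kernel code. The array numa_mem[], the variable
current_cpu, NR_CPUS and the helper model_assign() are all stand-ins invented
for this sketch; the real implementation uses a per cpu variable and the
percpu accessors shown in the patch below.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical cpu count for this model */

/* Stand-in for the per cpu variable 'numa_mem': one slot per cpu. */
static int numa_mem[NR_CPUS];
static int current_cpu;	/* hypothetical: models "this cpu" */

/* An arch header included earlier could define any of these first. */
#ifndef cpu_to_mem
#define cpu_to_mem(cpu)		(numa_mem[(cpu)])	/* array element: usable as lvalue */
#endif

#ifndef set_numa_mem
#define set_numa_mem(node)	(cpu_to_mem(current_cpu) = (node))
#endif

#ifndef numa_mem_id
#define numa_mem_id()		(cpu_to_mem(current_cpu))
#endif

/* Record that 'cpu' allocates "locally" from 'node', then read it back. */
static int model_assign(int cpu, int node)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_to_mem(cpu) = node;		/* lvalue use of cpu_to_mem() */
	current_cpu = cpu;
	return numa_mem_id();
}
```

Because each definition is wrapped in #ifndef, an arch only needs to define
the macro of the same name beforehand to replace the generic version.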
Generic initialization of 'numa_mem' occurs in __build_all_zonelists().
This will initialize the boot cpu at boot time, and all cpus on change of
numa_zonelist_order, or when node or memory hot-plug requires zonelist rebuild.
Archs that use this implementation will need to initialize 'numa_mem' for
secondary cpus as they're brought on-line.
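
The initialization flow above can be sketched as a userspace model, under
stated assumptions. The topology tables (node_has_memory, fallback, cpu_node)
and the helpers rebuild_numa_mem() and bring_cpu_online() are hypothetical
names made up for illustration. The toy fallback table stands in for a node's
zonelist, and the real local_memory_node() in the patch below simply takes the
node of the first zone in that zonelist.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS  4	/* hypothetical sizes for this model */
#define NR_NODES 3

/* Hypothetical topology: node 1 has cpus but no memory. */
static const bool node_has_memory[NR_NODES] = { true, false, true };
/* Toy "zonelist": per node fallback order, nearest node first. */
static const int fallback[NR_NODES][NR_NODES] = {
	{ 0, 1, 2 }, { 1, 2, 0 }, { 2, 1, 0 },
};
static const int cpu_node[NR_CPUS] = { 0, 1, 1, 2 };

static bool cpu_online[NR_CPUS];
static int numa_mem[NR_CPUS] = { -1, -1, -1, -1 };	/* -1: not yet set */

/* First node in 'node's fallback order that actually has memory. */
static int local_memory_node(int node)
{
	assert(node >= 0 && node < NR_NODES);
	for (int i = 0; i < NR_NODES; i++) {
		int candidate = fallback[node][i];

		if (node_has_memory[candidate])
			return candidate;
	}
	return node;	/* no node has memory; fall back to the node itself */
}

/* Models the zonelist rebuild: fix up on-line cpus only. */
static void rebuild_numa_mem(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			numa_mem[cpu] = local_memory_node(cpu_node[cpu]);
}

/* Models the arch's job: init numa_mem when a secondary cpu comes up. */
static void bring_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_online[cpu] = true;
	numa_mem[cpu] = local_memory_node(cpu_node[cpu]);
}
```

In this model a cpu that is off-line during a rebuild keeps its old value,
which is why the bring-up path has to set numa_mem itself.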
Question: Is it worth adding a generic initialization of the per cpu numa_mem,
e.g., one built only when CONFIG_HAVE_MEMORYLESS_NODES is defined? Or should
that be left to the archs?
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
V2: + split this out of Christoph's incomplete "starter patch"
+ flesh out the definition
include/asm-generic/topology.h | 3 +++
include/linux/mmzone.h | 6 ++++++
include/linux/topology.h | 24 ++++++++++++++++++++++++
mm/page_alloc.c | 39 ++++++++++++++++++++++++++++++++++++++-
4 files changed, 71 insertions(+), 1 deletion(-)
Index: linux-2.6.33-mmotm-100302-1838/include/linux/topology.h
===================================================================
--- linux-2.6.33-mmotm-100302-1838.orig/include/linux/topology.h 2010-03-03 16:28:53.000000000 -0500
+++ linux-2.6.33-mmotm-100302-1838/include/linux/topology.h 2010-03-03 16:28:55.000000000 -0500
@@ -233,6 +233,30 @@ DECLARE_PER_CPU(int, numa_node);
#endif /* [!]CONFIG_USE_PERCPU_NUMA_NODE_ID */
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+
+DECLARE_PER_CPU(int, numa_mem);
+
+#ifndef set_numa_mem
+#define set_numa_mem(__node) percpu_write(numa_mem, __node)
+#endif
+
+#else /* !CONFIG_HAVE_MEMORYLESS_NODES */
+
+#define numa_mem numa_node
+static inline void set_numa_mem(int node) {}
+
+#endif /* [!]CONFIG_HAVE_MEMORYLESS_NODES */
+
+#ifndef numa_mem_id
+/* Returns the number of the nearest Node with memory */
+#define numa_mem_id() __this_cpu_read(numa_mem)
+#endif
+
+#ifndef cpu_to_mem
+#define cpu_to_mem(__cpu) per_cpu(numa_mem, (__cpu))
+#endif
+
#ifndef topology_physical_package_id
#define topology_physical_package_id(cpu) ((void)(cpu), -1)
#endif
Index: linux-2.6.33-mmotm-100302-1838/mm/page_alloc.c
===================================================================
--- linux-2.6.33-mmotm-100302-1838.orig/mm/page_alloc.c 2010-03-03 16:28:53.000000000 -0500
+++ linux-2.6.33-mmotm-100302-1838/mm/page_alloc.c 2010-03-03 16:28:55.000000000 -0500
@@ -61,6 +61,11 @@ DEFINE_PER_CPU(int, numa_node);
EXPORT_PER_CPU_SYMBOL(numa_node);
#endif
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+DEFINE_PER_CPU(int, numa_mem); /* Kernel "local memory" node */
+EXPORT_PER_CPU_SYMBOL(numa_mem);
+#endif
+
/*
* Array of node states.
*/
@@ -2733,6 +2738,24 @@ static void build_zonelist_cache(pg_data
 		zlc->z_to_n[z - zonelist->_zonerefs] = zonelist_node_idx(z);
 }
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+/*
+ * Return node id of node used for "local" allocations.
+ * I.e., first node id of first zone in arg node's generic zonelist.
+ * Used for initializing percpu 'numa_mem', which is used primarily
+ * for kernel allocations, so use GFP_KERNEL flags to locate zonelist.
+ */
+int local_memory_node(int node)
+{
+	struct zone *zone;
+
+	(void)first_zones_zonelist(node_zonelist(node, GFP_KERNEL),
+				   gfp_zone(GFP_KERNEL),
+				   NULL,
+				   &zone);
+	return zone->node;
+}
+#endif
 #else	/* CONFIG_NUMA */
@@ -2832,9 +2855,23 @@ static int __build_all_zonelists(void *d
 	 * needs the percpu allocator in order to allocate its pagesets
 	 * (a chicken-egg dilemma).
 	 */
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
 		setup_pageset(&per_cpu(boot_pageset, cpu), 0);
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+		/*
+		 * We now know the "local memory node" for each node--
+		 * i.e., the node of the first zone in the generic zonelist.
+		 * Set up numa_mem percpu variable for on-line cpus.  During
+		 * boot, only the boot cpu should be on-line;  we'll init the
+		 * secondary cpus' numa_mem as they come on-line.  During
+		 * node/memory hotplug, we'll fixup all on-line cpus.
+		 */
+		if (cpu_online(cpu))
+			cpu_to_mem(cpu) = local_memory_node(cpu_to_node(cpu));
+#endif
+	}
+
 	return 0;
 }
Index: linux-2.6.33-mmotm-100302-1838/include/linux/mmzone.h
===================================================================
--- linux-2.6.33-mmotm-100302-1838.orig/include/linux/mmzone.h 2010-03-03 16:28:53.000000000 -0500
+++ linux-2.6.33-mmotm-100302-1838/include/linux/mmzone.h 2010-03-03 16:28:55.000000000 -0500
@@ -661,6 +661,12 @@ void memory_present(int nid, unsigned lo
static inline void memory_present(int nid, unsigned long start, unsigned long end) {}
#endif
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+int local_memory_node(int node_id);
+#else
+static inline int local_memory_node(int node_id) { return node_id; }
+#endif
+
#ifdef CONFIG_NEED_NODE_MEMMAP_SIZE
unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
#endif
Index: linux-2.6.33-mmotm-100302-1838/include/asm-generic/topology.h
===================================================================
--- linux-2.6.33-mmotm-100302-1838.orig/include/asm-generic/topology.h 2010-03-03 16:28:53.000000000 -0500
+++ linux-2.6.33-mmotm-100302-1838/include/asm-generic/topology.h 2010-03-03 16:28:55.000000000 -0500
@@ -34,6 +34,9 @@
#ifndef cpu_to_node
#define cpu_to_node(cpu) ((void)(cpu),0)
#endif
+#ifndef cpu_to_mem
+#define cpu_to_mem(cpu)	((void)(cpu),0)
+#endif
#ifndef parent_node
#define parent_node(node) ((void)(node),0)
#endif