* [PATCH] 2.6.23-rc3-mm1 - update N_HIGH_MEMORY node state for memory hotadd [not found] <200708242228.l7OMS5fU017948@imap1.linux-foundation.org> @ 2007-08-27 15:58 ` Lee Schermerhorn 2007-08-27 17:48 ` [PATCH/RFC] Add node 'states' sysfs class attribute Lee Schermerhorn 2007-08-27 21:02 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Lee Schermerhorn 2 siblings, 0 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-27 15:58 UTC (permalink / raw) To: akpm Cc: clameter, jeremy, mel, y-goto, Kamezawa Hiroyuki, linux-mm, Eric Whitney I believe [something like] the following is required for memory hot add, after moving the setting of N_HIGH_MEMORY node state to free_area_init_nodes(). However, we could also move that BACK to __build_all_zonelists() BEFORE calling build_zonelists() and dispense with this patch. Thoughts? [besides the obvious churn, I mean. :-\] Lee ================= PATCH update N_HIGH_MEMORY node state for memory hotadd Against: 2.6.23-rc3-mm1 Setting N_HIGH_MEMORY node state in free_area_init_nodes() works for memory present at boot time, but not for hot-added memory. Update the N_HIGH_MEMORY node state in online_pages(), if we've added pages to this node, before rebuilding zonelists. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> mm/memory_hotplug.c | 2 ++ 1 file changed, 2 insertions(+) Index: Linux/mm/memory_hotplug.c =================================================================== --- Linux.orig/mm/memory_hotplug.c 2007-08-22 09:20:26.000000000 -0400 +++ Linux/mm/memory_hotplug.c 2007-08-27 10:40:57.000000000 -0400 @@ -211,6 +211,8 @@ int online_pages(unsigned long pfn, unsi online_pages_range); zone->present_pages += onlined_pages; zone->zone_pgdat->node_present_pages += onlined_pages; + if (onlined_pages) + node_set_state(zone->node, N_HIGH_MEMORY); setup_per_zone_pages_min(); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH/RFC] Add node 'states' sysfs class attribute [not found] <200708242228.l7OMS5fU017948@imap1.linux-foundation.org> 2007-08-27 15:58 ` [PATCH] 2.6.23-rc3-mm1 - update N_HIGH_MEMORY node state for memory hotadd Lee Schermerhorn @ 2007-08-27 17:48 ` Lee Schermerhorn 2007-08-27 19:11 ` Christoph Lameter 2007-08-27 21:02 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Lee Schermerhorn 2 siblings, 1 reply; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-27 17:48 UTC (permalink / raw) To: linux-mm Cc: clameter, mel, y-goto, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney Christoph suggested something like this a while back, as a new /proc file. Believing, perhaps incorrectly, that new /proc files are discouraged, I have implemented this as a sysfs class attribute on the 'node' system device class. Works on my numa platform: 4 nodes with cpus, one memory only node. Questions: 1) if this is useful, do we need/want the possible mask? 2) how about teaching nodemask_scnprintf() to suppress leading words of all zeros? Lee =========================== PATCH Add node 'states' sysfs class attribute Against: 2.6.23-rc3-mm1 Add a sysfs class attribute file /sys/devices/system/node/states to display node state masks: Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> drivers/base/node.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 72 insertions(+), 1 deletion(-) Index: Linux/drivers/base/node.c =================================================================== --- Linux.orig/drivers/base/node.c 2007-08-27 12:31:32.000000000 -0400 +++ Linux/drivers/base/node.c 2007-08-27 13:25:19.000000000 -0400 @@ -12,6 +12,7 @@ #include <linux/topology.h> #include <linux/nodemask.h> #include <linux/cpu.h> +#include <linux/device.h> static struct sysdev_class node_class = { set_kset_name("node"), @@ -232,8 +233,78 @@ void unregister_one_node(int nid) unregister_node(&node_devices[nid]); } +/* + * [node] states attribute + */ +static char * node_state_names[] = { + "possible:", + "on-line:", + "normal memory:", +#ifdef CONFIG_HIGHMEM + "high memory:", +#endif + "cpu:", +}; + +static ssize_t +print_node_states(struct class *class, char *buf) +{ + int i; + int n; + size_t size = PAGE_SIZE; + ssize_t len = 0; + + for (i=0; i < NR_NODE_STATES; ++i) { + n = snprintf(buf, size, "%16s ", node_state_names[i]); + if (n < 0) + break; + buf += n; + len += n; + size -= n; + if (size < 0) + break; + + n = nodemask_scnprintf(buf, size, node_states[i]); + if (n < 0) + break; + buf += n; + len += n; + size -=n; + if (size < 0) + break; + + n = snprintf(buf, size, "\n"); + if (n < 0) + break; + buf += n; + len += n; + size -= n; + if (size < 0) + break; + } + + return n < 0 ? n : len; +} + +static CLASS_ATTR(states, 0444, print_node_states, NULL); + +static int node_states_init(void) +{ + return sysfs_create_file(&node_class.kset.kobj, + &class_attr_states.attr); +} + static int __init register_node_type(void) { - return sysdev_class_register(&node_class); + int ret; + + ret = sysdev_class_register(&node_class); + if (ret) + goto out; + + ret = node_states_init(); + +out: + return ret; } postcore_initcall(register_node_type); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute 2007-08-27 17:48 ` [PATCH/RFC] Add node 'states' sysfs class attribute Lee Schermerhorn @ 2007-08-27 19:11 ` Christoph Lameter 2007-08-27 20:08 ` Lee Schermerhorn 0 siblings, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-27 19:11 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, mel, y-goto, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney On Mon, 27 Aug 2007, Lee Schermerhorn wrote: > Works on my numa platform: 4 nodes with cpus, one memory only node. > > Questions: > > 1) if this is useful, do we need/want the possible mask? Yes that is important for software that wants to allocate per node structures. The possible mask shows which nodes could be activated later. > 2) how about teaching nodemask_scnprintf() to suppress leading > words of all zeros? Leading words of all zeroes? nodemask_scnprintf calls bitmap_scnprintf(). Maybe it should call bitmap_scnlistprintf() instead? > +static ssize_t > +print_node_states(struct class *class, char *buf) > +{ > + int i; > + int n; > + size_t size = PAGE_SIZE; > + ssize_t len = 0; The size varies? Isnt the len enough. Maybe just using one variable would simplify the code? > + > + for (i=0; i < NR_NODE_STATES; ++i) { Missing blanks around assignment. Please use i++. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute 2007-08-27 19:11 ` Christoph Lameter @ 2007-08-27 20:08 ` Lee Schermerhorn 2007-08-27 20:15 ` Christoph Lameter 0 siblings, 1 reply; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-27 20:08 UTC (permalink / raw) To: Christoph Lameter Cc: linux-mm, mel, y-goto, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney On Mon, 2007-08-27 at 12:11 -0700, Christoph Lameter wrote: > On Mon, 27 Aug 2007, Lee Schermerhorn wrote: > > > Works on my numa platform: 4 nodes with cpus, one memory only node. > > > > Questions: > > > > 1) if this is useful, do we need/want the possible mask? > > Yes that is important for software that wants to allocate per node > structures. The possible mask shows which nodes could be activated later. Good point. Given that, I'm thinking we might want to limit the displayed masks--even the internal value of the mask--to something closer to what a particular platform architecture can support, even tho' the kernel might be configured for a much larger number. I'll have to look into how to do this. > > > 2) how about teaching nodemask_scnprintf() to suppress leading > > words of all zeros? > > Leading words of all zeroes? nodemask_scnprintf calls bitmap_scnprintf(). > Maybe it should call bitmap_scnlistprintf() instead? For platforms with small numbers of possible nodes, that might look nicer. > > > > +static ssize_t > > +print_node_states(struct class *class, char *buf) > > +{ > > + int i; > > + int n; > > + size_t size = PAGE_SIZE; > > + ssize_t len = 0; > > The size varies? Isnt the len enough. Maybe just using one variable would > simplify the code? 'size' is used as the remaining amount of space in the buffer for each subsequent snprintf()-like call. But, yeah, I can just decrement size after each call and at the end, subtract it from the original buffer size--i.e., PAGE_SIZE--to get the length. Next respin. > > > + > > + for (i=0; i < NR_NODE_STATES; ++i) { > > Missing blanks around assignment. OK. > Please use i++. Sure. Old habits die hard. Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute 2007-08-27 20:08 ` Lee Schermerhorn @ 2007-08-27 20:15 ` Christoph Lameter 0 siblings, 0 replies; 43+ messages in thread From: Christoph Lameter @ 2007-08-27 20:15 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, mel, y-goto, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney On Mon, 27 Aug 2007, Lee Schermerhorn wrote: > > Yes that is important for software that wants to allocate per node > > structures. The possible mask shows which nodes could be activated later. > > Good point. Given that, I'm thinking we might want to limit the > displayed masks--even the internal value of the mask--to something > closer to what a particular platform architecture can support, even tho' > the kernel might be configured for a much larger number. I'll have to > look into how to do this. It is the responsibility of the arch code to set this up the right way AFAIK. > > Leading words of all zeroes? nodemask_scnprintf calls bitmap_scnprintf(). > > Maybe it should call bitmap_scnlistprintf() instead? > > For platforms with small numbers of possible nodes, that might look > nicer. For large platforms this will avoid long node lists that warp around. So lets do it. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH/RFC] Add node 'states' sysfs class attribute - V2 [not found] <200708242228.l7OMS5fU017948@imap1.linux-foundation.org> 2007-08-27 15:58 ` [PATCH] 2.6.23-rc3-mm1 - update N_HIGH_MEMORY node state for memory hotadd Lee Schermerhorn 2007-08-27 17:48 ` [PATCH/RFC] Add node 'states' sysfs class attribute Lee Schermerhorn @ 2007-08-27 21:02 ` Lee Schermerhorn 2007-08-27 21:04 ` Christoph Lameter ` (2 more replies) 2 siblings, 3 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-27 21:02 UTC (permalink / raw) To: linux-mm Cc: clameter, mel, y-goto, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney Here's a cleaned up version that addresses Christoph's comments. Lee =============== PATCH Add node 'states' sysfs class attribute v2 Against: 2.6.23-rc3-mm1 V1 -> V2: + style cleanup + drop 'len' variable in print_node_states(); compute from final size. + use nodelist_scnprintf() for state masks. Add a sysfs class attribute file to /sys/devices/system/node to display node state masks. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> drivers/base/node.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 70 insertions(+), 1 deletion(-) Index: Linux/drivers/base/node.c =================================================================== --- Linux.orig/drivers/base/node.c 2007-08-27 12:31:32.000000000 -0400 +++ Linux/drivers/base/node.c 2007-08-27 16:30:18.000000000 -0400 @@ -12,6 +12,7 @@ #include <linux/topology.h> #include <linux/nodemask.h> #include <linux/cpu.h> +#include <linux/device.h> static struct sysdev_class node_class = { set_kset_name("node"), @@ -232,8 +233,76 @@ void unregister_one_node(int nid) unregister_node(&node_devices[nid]); } +/* + * [node] states attribute + */ +static char * node_state_names[] = { + "possible:", + "on-line:", + "normal memory:", +#ifdef CONFIG_HIGHMEM + "high memory:", +#endif + "cpu:", +}; + +static ssize_t +print_node_states(struct class *class, char *buf) +{ + int i; + int n; + ssize_t size = PAGE_SIZE; + + for (i = 0; i < NR_NODE_STATES; i++) { + n = snprintf(buf, size, "%14s ", node_state_names[i]); + if (n <= 0) + break; + buf += n; + size -= n; + if (size <= 0) + break; + + n = nodelist_scnprintf(buf, size, node_states[i]); + if (n <= 0) + break; + buf += n; + size -=n; + if (size <= 0) + break; + + n = snprintf(buf, size, "\n"); + if (n <= 0) + break; + buf += n; + size -= n; + if (size <= 0) + break; + } + + if (n > 0) { + n = PAGE_SIZE; + if (size > 0) + n -= size; + } + return n; +} + +static CLASS_ATTR(states, 0444, print_node_states, NULL); + +static int node_states_init(void) +{ + return sysfs_create_file(&node_class.kset.kobj, + &class_attr_states.attr); +} + static int __init register_node_type(void) { - return sysdev_class_register(&node_class); + int ret; + + ret = sysdev_class_register(&node_class); + if (!ret) + ret = node_states_init(); + + return ret; } postcore_initcall(register_node_type); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-27 21:02 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Lee Schermerhorn @ 2007-08-27 21:04 ` Christoph Lameter 2007-08-28 0:01 ` Andrew Morton 2007-08-28 1:16 ` Yasunori Goto 2 siblings, 0 replies; 43+ messages in thread From: Christoph Lameter @ 2007-08-27 21:04 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, mel, y-goto, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney On Mon, 27 Aug 2007, Lee Schermerhorn wrote: > Add a sysfs class attribute file to /sys/devices/system/node to display > node state masks. Acked-by: Christoph Lameter <clameter@sgi.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-27 21:02 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Lee Schermerhorn 2007-08-27 21:04 ` Christoph Lameter @ 2007-08-28 0:01 ` Andrew Morton 2007-08-28 0:08 ` Christoph Lameter 2007-08-28 1:16 ` Yasunori Goto 2 siblings, 1 reply; 43+ messages in thread From: Andrew Morton @ 2007-08-28 0:01 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, clameter, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007 17:02:08 -0400 Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > Here's a cleaned up version that addresses Christoph's comments. > > Lee > =============== > PATCH Add node 'states' sysfs class attribute v2 > > Against: 2.6.23-rc3-mm1 > > V1 -> V2: > + style cleanup > + drop 'len' variable in print_node_states(); compute from > final size. > + use nodelist_scnprintf() for state masks. > > Add a sysfs class attribute file to /sys/devices/system/node > to display node state masks. > > Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> So I spent half a minute or so working out "wtf does this do" then decided that it isn't efficient for everyone who reads this patch to have to do the same thing. Perhaps including sample output would help to explain wtf this does. afaict it will spit out a bitmap like: possible: 11110000 on-line: 11010000 normal memory: 01110000 etc or something like that, dunno. Please document this interface for us? > Index: Linux/drivers/base/node.c > =================================================================== > --- Linux.orig/drivers/base/node.c 2007-08-27 12:31:32.000000000 -0400 > +++ Linux/drivers/base/node.c 2007-08-27 16:30:18.000000000 -0400 > @@ -12,6 +12,7 @@ > #include <linux/topology.h> > #include <linux/nodemask.h> > #include <linux/cpu.h> > +#include <linux/device.h> > > static struct sysdev_class node_class = { > set_kset_name("node"), > @@ -232,8 +233,76 @@ void unregister_one_node(int nid) > unregister_node(&node_devices[nid]); > } > > +/* > + * [node] states attribute > + */ > +static char * node_state_names[] = { s/* /*/ > + "possible:", > + "on-line:", It would be more typical to use "online" here. > + "normal memory:", > +#ifdef CONFIG_HIGHMEM > + "high memory:", Do we really want a space in here? It makes parsing somewhat harder. Do the other files in /sys/devices/system/node take care to avoid doing this? And what happened to the one-value-per-sysfs file rule? Did we already break it so much in /sys/devices/system/node that we've just given up? > +#endif > + "cpu:", > +}; > + > +static ssize_t > +print_node_states(struct class *class, char *buf) static ssize_t print_node_states(struct class *class, char *buf) fits in 80-cols, hece is preferred here. > +{ > + int i; > + int n; > + ssize_t size = PAGE_SIZE; > + > + for (i = 0; i < NR_NODE_STATES; i++) { > + n = snprintf(buf, size, "%14s ", node_state_names[i]); > + if (n <= 0) > + break; > + buf += n; > + size -= n; > + if (size <= 0) > + break; > + > + n = nodelist_scnprintf(buf, size, node_states[i]); > + if (n <= 0) > + break; > + buf += n; > + size -=n; > + if (size <= 0) > + break; > + > + n = snprintf(buf, size, "\n"); > + if (n <= 0) > + break; > + buf += n; > + size -= n; > + if (size <= 0) > + break; > + } > + > + if (n > 0) { > + n = PAGE_SIZE; > + if (size > 0) > + n -= size; > + } > + return n; > +} Can't use seq_file interface here? The fiddling with PAGE_SIZE is unfortunate. > +static CLASS_ATTR(states, 0444, print_node_states, NULL); > + > +static int node_states_init(void) > +{ > + return sysfs_create_file(&node_class.kset.kobj, > + &class_attr_states.attr); > +} > + > static int __init register_node_type(void) > { > - return sysdev_class_register(&node_class); > + int ret; > + > + ret = sysdev_class_register(&node_class); > + if (!ret) > + ret = node_states_init(); > + > + return ret; > } > postcore_initcall(register_node_type); > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 0:01 ` Andrew Morton @ 2007-08-28 0:08 ` Christoph Lameter 2007-08-28 1:14 ` Andrew Morton 0 siblings, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-28 0:08 UTC (permalink / raw) To: Andrew Morton Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007, Andrew Morton wrote: > Perhaps including sample output would help to explain wtf this does. > afaict it will spit out a bitmap like: > > possible: 11110000 > on-line: 11010000 > normal memory: 01110000 > etc > > or something like that, dunno. Please document this interface for us? We also talked about having nodelist_scnprintf call bitmap_scnlistprintf. I'd expect that to be a separate patch. The output should then be more like possible: 0-4 online: 0-1, 3 normal memory: 1-3 > > + "normal memory:", > > +#ifdef CONFIG_HIGHMEM > > + "high memory:", > > Do we really want a space in here? It makes parsing somewhat > harder. Do the other files in /sys/devices/system/node take care to avoid > doing this? This is the first file in that directory. The files in /sys/devices/system/node/nodeX use _ there. > And what happened to the one-value-per-sysfs file rule? Did we already > break it so much in /sys/devices/system/node that we've just given up? /sys/devices/system/node/nodeX/meminfo is like /proc/meminfo containing multiple settings. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 0:08 ` Christoph Lameter @ 2007-08-28 1:14 ` Andrew Morton 2007-08-28 1:29 ` Christoph Lameter 0 siblings, 1 reply; 43+ messages in thread From: Andrew Morton @ 2007-08-28 1:14 UTC (permalink / raw) To: Christoph Lameter Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007 17:08:02 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > On Mon, 27 Aug 2007, Andrew Morton wrote: > > > Perhaps including sample output would help to explain wtf this does. > > afaict it will spit out a bitmap like: > > > > possible: 11110000 > > on-line: 11010000 > > normal memory: 01110000 > > etc > > > > or something like that, dunno. Please document this interface for us? > > We also talked about having nodelist_scnprintf call bitmap_scnlistprintf. > I'd expect that to be a separate patch. The output should then be more > like > > possible: 0-4 > online: 0-1, 3 really? with commas and spaces and minus signs and colons? ug, what next? animated ascii art? This is sysfs, not procfs ;) > normal memory: 1-3 > > > > + "normal memory:", > > > +#ifdef CONFIG_HIGHMEM > > > + "high memory:", > > > > Do we really want a space in here? It makes parsing somewhat > > harder. Do the other files in /sys/devices/system/node take care to avoid > > doing this? > > This is the first file in that directory. The files in > /sys/devices/system/node/nodeX use _ there. > > > And what happened to the one-value-per-sysfs file rule? Did we already > > break it so much in /sys/devices/system/node that we've just given up? > > /sys/devices/system/node/nodeX/meminfo is like /proc/meminfo containing > multiple settings. OK, well if the meminfo file is the only one in there which broke the golden rule, I don't think we have sufficient excuse to break it again. $ cat /sys/devices/system/node/possible 0-4 $ I think a bitmap would be better, personally. That in fact makes "possible" unneeded, doesn't it? It would always be all-ones? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 1:14 ` Andrew Morton @ 2007-08-28 1:29 ` Christoph Lameter 2007-08-28 3:18 ` Andrew Morton 0 siblings, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-28 1:29 UTC (permalink / raw) To: Andrew Morton Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007, Andrew Morton wrote: > > online: 0-1, 3 > > really? with commas and spaces and minus signs and colons? ug, what next? > animated ascii art? This is sysfs, not procfs ;) The masks can get quite ugly to read if you have lots of nodes. F.e. with 1024 nodes you get a line that wraps around more than 10 times. > OK, well if the meminfo file is the only one in there which broke the > golden rule, I don't think we have sufficient excuse to break it again. > > $ cat /sys/devices/system/node/possible > 0-4 > $ > > I think a bitmap would be better, personally. > > That in fact makes "possible" unneeded, doesn't it? It would always be > all-ones? There could be the case that nodes 1-9 and 20-29 are possible but the ones in between are not available. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 1:29 ` Christoph Lameter @ 2007-08-28 3:18 ` Andrew Morton 2007-08-28 5:15 ` Christoph Lameter 0 siblings, 1 reply; 43+ messages in thread From: Andrew Morton @ 2007-08-28 3:18 UTC (permalink / raw) To: Christoph Lameter Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007 18:29:26 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > On Mon, 27 Aug 2007, Andrew Morton wrote: > > > > online: 0-1, 3 > > > > really? with commas and spaces and minus signs and colons? ug, what next? > > animated ascii art? This is sysfs, not procfs ;) > > The masks can get quite ugly to read if you have lots of nodes. F.e. with > 1024 nodes you get a line that wraps around more than 10 times. So don't read them - use a program to turn them into your preferred human-readable representation. One could argue that with sufficient effort, all formats are machine-digestible, but a simple, robust and maintainable format like a bitmap makes sense for a kernel->userspace interface, IMO. We see this often, btw. People want nice and easy-to-read kernel->human interfaces because the tool of choice for displaying these things to humans is "cat". How lame is that? I do think that a sysfs interface like this should be optimised for kernel->program communication, not for kernel->human. > > OK, well if the meminfo file is the only one in there which broke the > > golden rule, I don't think we have sufficient excuse to break it again. > > > > $ cat /sys/devices/system/node/possible > > 0-4 > > $ > > > > I think a bitmap would be better, personally. > > > > That in fact makes "possible" unneeded, doesn't it? It would always be > > all-ones? > > There could be the case that nodes 1-9 and 20-29 are possible but > the ones in between are not available. OK. How do we communicate to userspace what is the maximum number of nodes which this kernel supports? That would be (1<<CONFIG_NODES_SHIFT), I guess. Or maybe we don't care? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 3:18 ` Andrew Morton @ 2007-08-28 5:15 ` Christoph Lameter 2007-08-28 5:29 ` Andrew Morton 0 siblings, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-28 5:15 UTC (permalink / raw) To: Andrew Morton Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007, Andrew Morton wrote: > > The masks can get quite ugly to read if you have lots of nodes. F.e. with > > 1024 nodes you get a line that wraps around more than 10 times. > > So don't read them - use a program to turn them into your preferred > human-readable representation. You are volunterring to write all the programs we need for this and that debugging situation? Even if you do this: It will significantly slow us down always having to come to you and ask you for a tool. > We see this often, btw. People want nice and easy-to-read kernel->human > interfaces because the tool of choice for displaying these things to humans > is "cat". How lame is that? > > I do think that a sysfs interface like this should be optimised for > kernel->program communication, not for kernel->human. Well I keep ending up cat this and that proc entry for debugging and its difficult to do if one sysfs file spews huge amounts of illegible binary data to you. I do not think that tools would have trouble deciphering that format. It is going to be more compact and easier to handle to have the node ranges rather than converting a list of binaries. Lee already noted the huge amounts of zeros that he got. > > > OK, well if the meminfo file is the only one in there which broke the > > > golden rule, I don't think we have sufficient excuse to break it again. > > > > > > $ cat /sys/devices/system/node/possible > > > 0-4 > > > $ > > > > > > I think a bitmap would be better, personally. > > > > > > That in fact makes "possible" unneeded, doesn't it? It would always be > > > all-ones? > > > > There could be the case that nodes 1-9 and 20-29 are possible but > > the ones in between are not available. > > OK. > > How do we communicate to userspace what is the maximum number of nodes > which this kernel supports? That would be (1<<CONFIG_NODES_SHIFT), I > guess. Or maybe we don't care? The last node mentioned in possible is the highest. That will be difficult to see if you want the kernel developers to read binary numbers but I guess we have to be tough. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 5:15 ` Christoph Lameter @ 2007-08-28 5:29 ` Andrew Morton 2007-08-28 5:34 ` Andrew Morton 2007-08-28 5:53 ` Christoph Lameter 0 siblings, 2 replies; 43+ messages in thread From: Andrew Morton @ 2007-08-28 5:29 UTC (permalink / raw) To: Christoph Lameter Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007 22:15:23 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > On Mon, 27 Aug 2007, Andrew Morton wrote: > > > > The masks can get quite ugly to read if you have lots of nodes. F.e. with > > > 1024 nodes you get a line that wraps around more than 10 times. > > > > So don't read them - use a program to turn them into your preferred > > human-readable representation. > > You are volunterring to write all the programs we need for this and that > debugging situation? Sure! $500/hour! > Even if you do this: It will significantly slow us > down always having to come to you and ask you for a tool. Your claim here is, I believe, that a human user interface should be implemented in the kernel because the cost (to you) (short-term) of doing that is lower that the cost of implementing a simpler kernel interface and a bit of userspace human presentation code. Even though the long-term cost to the kernel maintainers is higher, and the resulting output is harder for programs to parse. Interesting. Please type "cat /proc/stat". The world hasn't ended. > > We see this often, btw. People want nice and easy-to-read kernel->human > > interfaces because the tool of choice for displaying these things to humans > > is "cat". How lame is that? > > > > I do think that a sysfs interface like this should be optimised for > > kernel->program communication, not for kernel->human. > > Well I keep ending up cat this and that proc entry for debugging and its > difficult to do if one sysfs file spews huge amounts of illegible binary > data to you. Nobody ever said "binary". Please try to keep up. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 5:29 ` Andrew Morton @ 2007-08-28 5:34 ` Andrew Morton 2007-08-28 5:53 ` Christoph Lameter 1 sibling, 0 replies; 43+ messages in thread From: Andrew Morton @ 2007-08-28 5:34 UTC (permalink / raw) To: Christoph Lameter, Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007 22:29:12 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > > Even if you do this: It will significantly slow us > > down always having to come to you and ask you for a tool. > > Your claim here is, I believe, that a human user interface should be > implemented in the kernel because the cost (to you) (short-term) of doing > that is lower that the cost of implementing a simpler kernel interface and > a bit of userspace human presentation code. Even though the long-term > cost to the kernel maintainers is higher, and the resulting output is > harder for programs to parse. oh, hang on. We've already implemented that bitmap_scnlistprintf() monstrosity and we're presumably using it in sysfs, so tools already need to know how to parse it all. Sigh, so it's too late to fix it. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 5:29 ` Andrew Morton 2007-08-28 5:34 ` Andrew Morton @ 2007-08-28 5:53 ` Christoph Lameter 2007-08-28 6:12 ` Andrew Morton 1 sibling, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-28 5:53 UTC (permalink / raw) To: Andrew Morton Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007, Andrew Morton wrote: > Your claim here is, I believe, that a human user interface should be > implemented in the kernel because the cost (to you) (short-term) of doing > that is lower that the cost of implementing a simpler kernel interface and > a bit of userspace human presentation code. Even though the long-term > cost to the kernel maintainers is higher, and the resulting output is > harder for programs to parse. The long term cost is zero since there is already a kernel function to process these lists. See bitmap_parselist(). The kernel already allows output and input of these lists. > Please type "cat /proc/stat". The world hasn't ended. Yea that the prime example of a bad use of the proc filesystem. All these numbers better be split up into individual files. The cpu affinity is a horror to see on 4096 cpu systems. If you want to figure out to which cpu the process has restricted itself then you need to do some quick hex conversions in your mind. margin:/proc/1 # cat /proc/1/stat 1 (init) S 0 1 1 0 -1 4194560 77 340052 10 1575 0 463 1256 1246 20 0 1 0 67 1802240 47 18446744073709551615 4611686018427387904 4611686018428481976 6953557824659209808 16140902370223191856 11529215046068536865 0 0 1467013372 680207875 11529215050365063520 0 0 0 2 0 0 0 > > Well I keep ending up cat this and that proc entry for debugging and its > > difficult to do if one sysfs file spews huge amounts of illegible binary > > data to you. > > Nobody ever said "binary". Please try to keep up. What you get right now from this patch is a series of hex digits and you have the task of converting that to a series of 0 and 1's in your mind and then figure out which node it was that had a 1 there. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 5:53 ` Christoph Lameter @ 2007-08-28 6:12 ` Andrew Morton 2007-08-28 14:05 ` Lee Schermerhorn 2007-08-28 19:34 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Christoph Lameter 0 siblings, 2 replies; 43+ messages in thread From: Andrew Morton @ 2007-08-28 6:12 UTC (permalink / raw) To: Christoph Lameter Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007 22:53:15 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: > On Mon, 27 Aug 2007, Andrew Morton wrote: > > > Your claim here is, I believe, that a human user interface should be > > implemented in the kernel because the cost (to you) (short-term) of doing > > that is lower that the cost of implementing a simpler kernel interface and > > a bit of userspace human presentation code. Even though the long-term > > cost to the kernel maintainers is higher, and the resulting output is > > harder for programs to parse. > > The long term cost is zero since there is already a kernel function > to process these lists. See bitmap_parselist(). The kernel already allows > output and input of these lists. yeah, I noticed. Just step back from this for a minute, and think how utterly lame that is. User interface code in the kernel because we (actually you guys) have not expended the tiny amount of effort and initiative which would be required to develop a little utility to do it. > > Please type "cat /proc/stat". The world hasn't ended. > > Yea that the prime example of a bad use of the proc filesystem. All these > numbers better be split up into individual files. Wrong! My point is that this incomprehensible format is not a problem to anyone because others have put the effort and initiative into preparation of tools which present that information to users. > The cpu affinity is a horror to see on 4096 cpu systems. If you > want to figure out to which cpu the process has restricted itself then you > need to do some quick hex conversions in your mind. wtf? You meen nobody has written the teeny bit of code which is needed to convert that info into your desired format? Well that's your problem. It certainly is not an argument that this user interface code should be placed in the kernel. > > > Well I keep ending up cat this and that proc entry for debugging and its > > > difficult to do if one sysfs file spews huge amounts of illegible binary > > > data to you. > > > > Nobody ever said "binary". Please try to keep up. > > What you get right now from this patch is a series of hex digits and you > have the task of converting that to a series of 0 and 1's in your mind and > then figure out which node it was that had a 1 there. Dude, that problem sounds like a google job interview question. For hardware engineers ;) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 6:12 ` Andrew Morton @ 2007-08-28 14:05 ` Lee Schermerhorn 2007-08-28 22:02 ` Christoph Lameter 2007-08-28 19:34 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Christoph Lameter 1 sibling, 1 reply; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-28 14:05 UTC (permalink / raw) To: Andrew Morton Cc: Christoph Lameter, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 2007-08-27 at 23:12 -0700, Andrew Morton wrote: > On Mon, 27 Aug 2007 22:53:15 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote: <something> > > > On Mon, 27 Aug 2007, Andrew Morton wrote: <something else> <and so on, ...> Wow! Who'd have thought such a simple PATCH/RFC would generate such an "animated" discussion! Stepping back, as Andrew suggests, this all started when I added a couple of temporary debug printk's to display the node state masks being generated. I noted that I wasn't seeing the N_CPU mask getting populated. [Turns out this was because I was printing too early--before the other cpus came up and called process_zones().] Christoph suggested that a /proc variable to display the maps would be useful to some applications/shell scripts... I thought I'd give it a try, but thinking that /proc variables were discouraged, where else but sysfs to put them. A class attribute to /sys/devices/system/node seemed like the appropriate place. I do recall seeing the discussion/"golden rule" about a sysfs having a single value, but: 1) I forgot. 2) I was making a change in drivers/base/node.c where I had the meminfo "monstrosity" as an example 3) While it makes sense for settable attributes to be separate files, for displaying a related set of info, like meminfo and node states, it just seems silly to duplicate code and allocate multiple sysfs objects for what can be done with so simply with a single file. I'm not wedded to this interface. However, I realy don't think it's worth doing as multiple files. Regarding Andrew's comment about "user interface code in the kernel": This was my reaction when I first encountered Linux's procfs with ascii formatted files. I was coming from a Unix background with a binary procfs interface. Again, it seemed silly to have "all that" formatting and parsing code sitting around in the kernel, given how infrequently its executed, in the grand scheme of things. However, I must admit that I've become addicted to the ease with which one can write one-off scripts to query configuration/statistics, tune/modify behavior or trigger actions via just cat'ing from and/or echo'ing to a /proc or /sys file. So, where to go with this patch? Drop it? Leave it as is? Move it /proc so that it can be a single file? Make it multiple files in sysfs? Putting it as politely as possible, the last is not my favorite option, but if folks think this info is useful and that's the way to go, so be it. And what about mask vs list? It's a 4 character change in the code to go either way. Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 14:05 ` Lee Schermerhorn @ 2007-08-28 22:02 ` Christoph Lameter 2007-08-28 22:13 ` Nish Aravamudan 0 siblings, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-28 22:02 UTC (permalink / raw) To: Lee Schermerhorn Cc: Andrew Morton, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Tue, 28 Aug 2007, Lee Schermerhorn wrote: > I thought I'd give it a try, but thinking that /proc variables were > discouraged, where else but sysfs to put them. A class attribute > to /sys/devices/system/node seemed like the appropriate place. Right. That is the right place. > I'm not wedded to this interface. However, I realy don't think it's > worth doing as multiple files. I think one single file per nodemask makes sense. Otherwise files become difficult to parse. I just forgot.... > its executed, in the grand scheme of things. However, I must admit that > I've become addicted to the ease with which one can write one-off > scripts to query configuration/statistics, tune/modify behavior or > trigger actions via just cat'ing from and/or echo'ing to a /proc or /sys > file. > > So, where to go with this patch? Drop it? Leave it as is? Move > it /proc so that it can be a single file? Make it multiple files in > sysfs? Putting it as politely as possible, the last is not my favorite > option, but if folks think this info is useful and that's the way to go, > so be it. And what about mask vs list? It's a 4 character change in > the code to go either way. I would suggest to do the one file thing in sysfs and use the function that already exists in the kernel to print the nice nodelists. Using the nice function is just calling another function since the code is already there. At some point we may even allow changing the nodemasks. One could imagine that we would add nodemasks that allow use of hugepages on certain nodes or the slab allocator to allocate on certain nodes. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 22:02 ` Christoph Lameter @ 2007-08-28 22:13 ` Nish Aravamudan 2007-08-29 14:43 ` Lee Schermerhorn 0 siblings, 1 reply; 43+ messages in thread From: Nish Aravamudan @ 2007-08-28 22:13 UTC (permalink / raw) To: Christoph Lameter Cc: Lee Schermerhorn, Andrew Morton, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On 8/28/07, Christoph Lameter <clameter@sgi.com> wrote: > On Tue, 28 Aug 2007, Lee Schermerhorn wrote: > > > I thought I'd give it a try, but thinking that /proc variables were > > discouraged, where else but sysfs to put them. A class attribute > > to /sys/devices/system/node seemed like the appropriate place. > > Right. That is the right place. > > > I'm not wedded to this interface. However, I realy don't think it's > > worth doing as multiple files. > > I think one single file per nodemask makes sense. Otherwise files become > difficult to parse. I just forgot.... > > > its executed, in the grand scheme of things. However, I must admit that > > I've become addicted to the ease with which one can write one-off > > scripts to query configuration/statistics, tune/modify behavior or > > trigger actions via just cat'ing from and/or echo'ing to a /proc or /sys > > file. > > > > So, where to go with this patch? Drop it? Leave it as is? Move > > it /proc so that it can be a single file? Make it multiple files in > > sysfs? Putting it as politely as possible, the last is not my favorite > > option, but if folks think this info is useful and that's the way to go, > > so be it. And what about mask vs list? It's a 4 character change in > > the code to go either way. > > I would suggest to do the one file thing in sysfs and use the function > that already exists in the kernel to print the nice nodelists. Using the > nice function is just calling another function since the code is already > there. > > At some point we may even allow changing the nodemasks. One could imagine > that we would add nodemasks that allow use of hugepages on certain nodes > or the slab allocator to allocate on certain nodes. Just to chime in here -- I've been on vacation for a bit recently -- I fully support the one-value per file rule for sysfs. I think it makes things a bit clearer. I like this attribute as well, and the idea of expanding it down the road is easiest if we use one file per-nodemask. Thanks, Nish -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 22:13 ` Nish Aravamudan @ 2007-08-29 14:43 ` Lee Schermerhorn 2007-08-29 17:39 ` Christoph Lameter 0 siblings, 1 reply; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-29 14:43 UTC (permalink / raw) To: Nish Aravamudan Cc: Christoph Lameter, Andrew Morton, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Tue, 2007-08-28 at 15:13 -0700, Nish Aravamudan wrote: > On 8/28/07, Christoph Lameter <clameter@sgi.com> wrote: > > On Tue, 28 Aug 2007, Lee Schermerhorn wrote: > > > > > I thought I'd give it a try, but thinking that /proc variables were > > > discouraged, where else but sysfs to put them. A class attribute > > > to /sys/devices/system/node seemed like the appropriate place. > > > > Right. That is the right place. > > > > > I'm not wedded to this interface. However, I realy don't think it's > > > worth doing as multiple files. > > > > I think one single file per nodemask makes sense. Otherwise files become > > difficult to parse. I just forgot.... > > > > > its executed, in the grand scheme of things. However, I must admit that > > > I've become addicted to the ease with which one can write one-off > > > scripts to query configuration/statistics, tune/modify behavior or > > > trigger actions via just cat'ing from and/or echo'ing to a /proc or /sys > > > file. > > > > > > So, where to go with this patch? Drop it? Leave it as is? Move > > > it /proc so that it can be a single file? Make it multiple files in > > > sysfs? Putting it as politely as possible, the last is not my favorite > > > option, but if folks think this info is useful and that's the way to go, > > > so be it. And what about mask vs list? It's a 4 character change in > > > the code to go either way. > > > > I would suggest to do the one file thing in sysfs and use the function > > that already exists in the kernel to print the nice nodelists. Using the > > nice function is just calling another function since the code is already > > there. > > > > At some point we may even allow changing the nodemasks. One could imagine > > that we would add nodemasks that allow use of hugepages on certain nodes > > or the slab allocator to allocate on certain nodes. > > Just to chime in here -- I've been on vacation for a bit recently -- I > fully support the one-value per file rule for sysfs. I think it makes > things a bit clearer. I like this attribute as well, and the idea of > expanding it down the road is easiest if we use one file per-nodemask. Welcome back, Nish. OK, I relent. I'll respin with one file per state. I'll go with a slight modification to the names suggested by Yasunori-san: possible, online, has_memory, has_cpu Some come, mon... Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-29 14:43 ` Lee Schermerhorn @ 2007-08-29 17:39 ` Christoph Lameter 2007-08-29 21:31 ` [PATCH/RFC] Add node states sysfs class attributeS - V3 Lee Schermerhorn ` (2 more replies) 0 siblings, 3 replies; 43+ messages in thread From: Christoph Lameter @ 2007-08-29 17:39 UTC (permalink / raw) To: Lee Schermerhorn Cc: Nish Aravamudan, Andrew Morton, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Wed, 29 Aug 2007, Lee Schermerhorn wrote: > Some come, mon... Go Lee! Go! -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH/RFC] Add node states sysfs class attributeS - V3 2007-08-29 17:39 ` Christoph Lameter @ 2007-08-29 21:31 ` Lee Schermerhorn 2007-08-29 22:14 ` Christoph Lameter 2007-08-29 22:36 ` Nish Aravamudan 2007-08-30 15:19 ` [PATCH/RFC] Add node states sysfs class attributeS - V4 Lee Schermerhorn 2007-09-11 13:56 ` [PATCH/RFC] Add node states sysfs class attributeS - V5 Lee Schermerhorn 2 siblings, 2 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-29 21:31 UTC (permalink / raw) To: linux-mm Cc: Christoph Lameter, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney Here's the reworked version: + multiple sysfs node class attribute files + one value per file + printed as 'node list' rather than mask + sample output in patch description - still no ascii art nor animation Lee ===================================== PATCH Add node states sysfs class attributeS v3 Against: 2.6.23-rc3-mm1 V2 -> V3: + changed to per state sysfs file -- "one value per file" V1 -> V2: + style cleanup + drop 'len' variable in print_node_states(); compute from final size. Add a per node state sysfs class attribute file to /sys/devices/system/node to display node state masks. E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: 4 representing the actual hardware cells and one memory-only pseudo-node representing a small amount [512MB] of "hardware interleaved" memory. With this patch, in /sys/devices/system/node we see: root@gwydyr(root):ls -1 /sys/devices/system/node cpu node0/ node1/ node2/ node3/ node4/ normal_memory online possible root@gwydyr(root):cat /sys/devices/system/node/possible possible: 0-255 root@gwydyr(root):cat /sys/devices/system/node/online on-line: 0-4 root@gwydyr(root):cat /sys/devices/system/node/normal_memory memory: 0-4 root@gwydyr(root):cat /sys/devices/system/node/cpu cpu: 0-3 N.B., NOT TESTED with CONFIG_HIGHMEM=y. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> drivers/base/node.c | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 129 insertions(+), 1 deletion(-) Index: Linux/drivers/base/node.c =================================================================== --- Linux.orig/drivers/base/node.c 2007-08-29 15:18:53.000000000 -0400 +++ Linux/drivers/base/node.c 2007-08-29 17:00:32.000000000 -0400 @@ -12,6 +12,7 @@ #include <linux/topology.h> #include <linux/nodemask.h> #include <linux/cpu.h> +#include <linux/device.h> static struct sysdev_class node_class = { set_kset_name("node"), @@ -232,8 +233,135 @@ void unregister_one_node(int nid) unregister_node(&node_devices[nid]); } +/* + * [node] states attribute + */ +static char *node_state_names[] = { + "possible:", + "on-line:", + "memory:", +#ifdef CONFIG_HIGHMEM + "high memory:", /* max length = 12 */ +#endif + "cpu:", +}; + +static ssize_t +print_nodes_state(enum node_states state, char *buf) +{ + int n; + ssize_t size = PAGE_SIZE; + + do { + n = snprintf(buf, size, "%-14s ", node_state_names[state]); + if (n <= 0) + break; + buf += n; + size -= n; + if (size <= 0) + break; + + n = nodelist_scnprintf(buf, size, node_states[state]); + if (n <= 0) + break; + buf += n; + size -= n; + if (size <= 0) + break; + + n = snprintf(buf, size, "\n"); + if (n <= 0) + break; + buf += n; + size -= n; + if (size <= 0) + break; + } while (0); + + if (n > 0) { + n = PAGE_SIZE; + if (size > 0) + n -= size; + } + return n; +} + +static ssize_t +print_nodes_possible(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_POSSIBLE, buf); +} + +static ssize_t +print_nodes_online(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_ONLINE, buf); +} + +static ssize_t +print_nodes_has_normal_memory(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_NORMAL_MEMORY, buf); +} + +static ssize_t +print_nodes_has_cpu(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_CPU, buf); +} + +static SYSDEV_CLASS_ATTR(possible, 0444, print_nodes_possible, NULL); +static SYSDEV_CLASS_ATTR(online, 0444, print_nodes_online, NULL); +static SYSDEV_CLASS_ATTR(normal_memory, 0444, print_nodes_has_normal_memory, + NULL); +static SYSDEV_CLASS_ATTR(cpu, 0444, print_nodes_has_cpu, NULL); + +#ifdef CONFIG_HIGHMEM +static ssize_t +print_nodes_has_high_memory(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_HIGH_MEMORY, buf); +} + +static SYSDEV_CLASS_ATTR(high_memory, 0444, print_nodes_has_high_memory, NULL); +#endif + +struct sysdev_class_attribute *node_state_attr[] = { + &attr_possible, + &attr_online, + &attr_normal_memory, +#ifdef CONFIG_HIGHMEM + &attr_high_memory, +#endif + &attr_cpu, +}; + +static int node_states_init(void) +{ + int i; + int err = 0; + + for (i = 0; i < NR_NODE_STATES; i++) { + int ret; + ret = sysdev_class_create_file(&node_class, node_state_attr[i]); + if (!err) + err = ret; + } + return err; +} + static int __init register_node_type(void) { - return sysdev_class_register(&node_class); + int ret; + + ret = sysdev_class_register(&node_class); + if (!ret) + ret = node_states_init(); + + /* + * Note: we're not going to unregister the node class if we fail + * to register the node state class attribute files. + */ + return ret; } postcore_initcall(register_node_type); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V3 2007-08-29 21:31 ` [PATCH/RFC] Add node states sysfs class attributeS - V3 Lee Schermerhorn @ 2007-08-29 22:14 ` Christoph Lameter 2007-08-30 13:34 ` Lee Schermerhorn 2007-08-29 22:36 ` Nish Aravamudan 1 sibling, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-29 22:14 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Wed, 29 Aug 2007, Lee Schermerhorn wrote: > root@gwydyr(root):cat /sys/devices/system/node/possible > possible: 0-255 The file is already called "possible". Repeating it in the output will make it difficult to parse. > +static ssize_t > +print_nodes_possible(struct sysdev_class *class, char *buf) > +{ > + return print_nodes_state(N_POSSIBLE, buf); > +} > + > +static ssize_t > +print_nodes_online(struct sysdev_class *class, char *buf) > +{ > + return print_nodes_state(N_ONLINE, buf); > +} > + > +static ssize_t > +print_nodes_has_normal_memory(struct sysdev_class *class, char *buf) > +{ > + return print_nodes_state(N_NORMAL_MEMORY, buf); > +} > + > +static ssize_t > +print_nodes_has_cpu(struct sysdev_class *class, char *buf) > +{ > + return print_nodes_state(N_CPU, buf); > +} Is there a way to avoid having to add another one of these if we add a new node state? Also there is a CR after the type. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V3 2007-08-29 22:14 ` Christoph Lameter @ 2007-08-30 13:34 ` Lee Schermerhorn 0 siblings, 0 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-30 13:34 UTC (permalink / raw) To: Christoph Lameter Cc: linux-mm, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Wed, 2007-08-29 at 15:14 -0700, Christoph Lameter wrote: > On Wed, 29 Aug 2007, Lee Schermerhorn wrote: > > > root@gwydyr(root):cat /sys/devices/system/node/possible > > possible: 0-255 > > The file is already called "possible". Repeating it in the output will > make it difficult to parse. Yeah. I noticed, after I posted, how stupid that looked. Clear a case of "premature patch-ulation". I'm fixing it now. > > > +static ssize_t > > +print_nodes_possible(struct sysdev_class *class, char *buf) > > +{ > > + return print_nodes_state(N_POSSIBLE, buf); > > +} > > + > > +static ssize_t > > +print_nodes_online(struct sysdev_class *class, char *buf) > > +{ > > + return print_nodes_state(N_ONLINE, buf); > > +} > > + > > +static ssize_t > > +print_nodes_has_normal_memory(struct sysdev_class *class, char *buf) > > +{ > > + return print_nodes_state(N_NORMAL_MEMORY, buf); > > +} > > + > > +static ssize_t > > +print_nodes_has_cpu(struct sysdev_class *class, char *buf) > > +{ > > + return print_nodes_state(N_CPU, buf); > > +} > > Is there a way to avoid having to add another one of these if we add > a new node state? I haven't figure out a way from the info I'm given in the show/print routine [just the node class and the buffer address] to figure out which attribute file was read, or I'd have avoided the function per attribute nonsense. > > Also there is a CR after the type. Took me a minute to figure out what you meant. Again, old habits... I've always put my function names against the left margin for easy searching. But I have read where this is discouraged. I will say that the patch passed checkpatch just fine. I'll fix it in the respin. Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V3 2007-08-29 21:31 ` [PATCH/RFC] Add node states sysfs class attributeS - V3 Lee Schermerhorn 2007-08-29 22:14 ` Christoph Lameter @ 2007-08-29 22:36 ` Nish Aravamudan 1 sibling, 0 replies; 43+ messages in thread From: Nish Aravamudan @ 2007-08-29 22:36 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, Christoph Lameter, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On 8/29/07, Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > Here's the reworked version: > + multiple sysfs node class attribute files > + one value per file > + printed as 'node list' rather than mask > + sample output in patch description > - still no ascii art nor animation > > Lee > ===================================== > > PATCH Add node states sysfs class attributeS v3 > > Against: 2.6.23-rc3-mm1 > > V2 -> V3: > + changed to per state sysfs file -- "one value per file" > > V1 -> V2: > + style cleanup > + drop 'len' variable in print_node_states(); compute from > final size. > > Add a per node state sysfs class attribute file to > /sys/devices/system/node to display node state masks. > > E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: > 4 representing the actual hardware cells and one memory-only > pseudo-node representing a small amount [512MB] of "hardware > interleaved" memory. With this patch, in /sys/devices/system/node > we see: > > root@gwydyr(root):ls -1 /sys/devices/system/node > cpu > node0/ > node1/ > node2/ > node3/ > node4/ > normal_memory > online > possible > root@gwydyr(root):cat /sys/devices/system/node/possible > possible: 0-255 > root@gwydyr(root):cat /sys/devices/system/node/online > on-line: 0-4 > root@gwydyr(root):cat /sys/devices/system/node/normal_memory > memory: 0-4 > root@gwydyr(root):cat /sys/devices/system/node/cpu > cpu: 0-3 Sorry if this has been mentioned before, but now that there is one file per mask, why do we need to prefix the output with anything? That is, I would prefer to see: $ cat /sys/devices/system/node/cpu 0-3 I don't think outputting "cpu:" provides any extra information. Thanks, Nish -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH/RFC] Add node states sysfs class attributeS - V4 2007-08-29 17:39 ` Christoph Lameter 2007-08-29 21:31 ` [PATCH/RFC] Add node states sysfs class attributeS - V3 Lee Schermerhorn @ 2007-08-30 15:19 ` Lee Schermerhorn 2007-08-30 16:44 ` Nish Aravamudan 2007-08-30 18:19 ` Christoph Lameter 2007-09-11 13:56 ` [PATCH/RFC] Add node states sysfs class attributeS - V5 Lee Schermerhorn 2 siblings, 2 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-30 15:19 UTC (permalink / raw) To: linux-mm Cc: Christoph Lameter, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney Try, try again. Maybe closer this time. Question: do we need/want to display the normal and high memory masks separately for systems with HIGHMEM? If not, I'd suggest changing the print_nodes_state() function to take a nodemask_t* instead of a state enum and expose a single 'has_memory' attribute that we print using something like: static ssize_t print_nodes_has_memory(struct sysdev_class *class, char *buf) { nodemask_t has_memory = node_states[N_NORMAL_MEMORY]; if (N_HIGH_MEMORY - N_NORMAL_MEMORY) nodes_or(has_memory, has_memory, node_states[N_HIGH_MEMORY]); return print_nodes_state(&has_memory, buf); } The compiler should eliminate the 'or' when HIGHMEM is not configured. But, we'd probably be doing the node_states[] index computation in each print function. Thoughts? Lee ==================== PATCH Add node 'states' sysfs class attributes v4 Against: 2.6.23-rc3-mm1 V3 -> V4: + drop the annotations -- not needed with one value per file. + this simplifies print_nodes_state() + fix "function return type on separate line" style glitch V2 -> V3: + changed to per state sysfs file -- "one value per file" V1 -> V2: + style cleanup + drop 'len' variable in print_node_states(); compute from final size. Add a per node state sysfs class attribute file to /sys/devices/system/node to display node state masks. E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: 4 representing the actual hardware cells and one memory-only pseudo-node representing a small amount [512MB] of "hardware interleaved" memory. With this patch, in /sys/devices/system/node we see: #ls -1F /sys/devices/system/node has_cpu has_normal_memory node0/ node1/ node2/ node3/ node4/ online possible #cat /sys/devices/system/node/possible 0-255 #cat /sys/devices/system/node/online 0-4 #cat /sys/devices/system/node/has_normal_memory 0-4 #cat /sys/devices/system/node/has_cpu 0-3 N.B., NOT TESTED with CONFIG_HIGHMEM=y. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> drivers/base/node.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 93 insertions(+), 1 deletion(-) Index: Linux/drivers/base/node.c =================================================================== --- Linux.orig/drivers/base/node.c 2007-08-29 15:18:53.000000000 -0400 +++ Linux/drivers/base/node.c 2007-08-30 10:38:44.000000000 -0400 @@ -12,6 +12,7 @@ #include <linux/topology.h> #include <linux/nodemask.h> #include <linux/cpu.h> +#include <linux/device.h> static struct sysdev_class node_class = { set_kset_name("node"), @@ -232,8 +233,99 @@ void unregister_one_node(int nid) unregister_node(&node_devices[nid]); } +/* + * node states attributes + */ + +static ssize_t print_nodes_state(enum node_states state, char *buf) +{ + int n; + + n = nodelist_scnprintf(buf, PAGE_SIZE, node_states[state]); + if (n <= 0) + goto done; + if (PAGE_SIZE - n > 1) { + *(buf + n++) = '\n'; + *(buf + n++) = '\0'; + } +done: + return n; +} + +static ssize_t print_nodes_possible(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_POSSIBLE, buf); +} + +static ssize_t print_nodes_online(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_ONLINE, buf); +} + +static ssize_t print_nodes_has_normal_memory(struct sysdev_class *class, + char *buf) +{ + return print_nodes_state(N_NORMAL_MEMORY, buf); +} + +static ssize_t print_nodes_has_cpu(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_CPU, buf); +} + +static SYSDEV_CLASS_ATTR(possible, 0444, print_nodes_possible, NULL); +static SYSDEV_CLASS_ATTR(online, 0444, print_nodes_online, NULL); +static SYSDEV_CLASS_ATTR(has_normal_memory, 0444, print_nodes_has_normal_memory, + NULL); +static SYSDEV_CLASS_ATTR(has_cpu, 0444, print_nodes_has_cpu, NULL); + +#ifdef CONFIG_HIGHMEM +static ssize_t print_nodes_has_high_memory(struct sysdev_class *class, + char *buf) +{ + return print_nodes_state(N_HIGH_MEMORY, buf); +} + +static SYSDEV_CLASS_ATTR(has_high_memory, 0444, print_nodes_has_high_memory, + NULL); +#endif + +struct sysdev_class_attribute *node_state_attr[] = { + &attr_possible, + &attr_online, + &attr_has_normal_memory, +#ifdef CONFIG_HIGHMEM + &attr_has_high_memory, +#endif + &attr_has_cpu, +}; + +static int node_states_init(void) +{ + int i; + int err = 0; + + for (i = 0; i < NR_NODE_STATES; i++) { + int ret; + ret = sysdev_class_create_file(&node_class, node_state_attr[i]); + if (!err) + err = ret; + } + return err; +} + static int __init register_node_type(void) { - return sysdev_class_register(&node_class); + int ret; + + ret = sysdev_class_register(&node_class); + if (!ret) + ret = node_states_init(); + + /* + * Note: we're not going to unregister the node class if we fail + * to register the node state class attribute files. + */ + return ret; } postcore_initcall(register_node_type); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V4 2007-08-30 15:19 ` [PATCH/RFC] Add node states sysfs class attributeS - V4 Lee Schermerhorn @ 2007-08-30 16:44 ` Nish Aravamudan 2007-08-30 18:20 ` Christoph Lameter 2007-08-30 18:19 ` Christoph Lameter 1 sibling, 1 reply; 43+ messages in thread From: Nish Aravamudan @ 2007-08-30 16:44 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, Christoph Lameter, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On 8/30/07, Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > Try, try again. Maybe closer this time. > > Question: do we need/want to display the normal and high memory masks > separately for systems with HIGHMEM? If not, I'd suggest changing the > print_nodes_state() function to take a nodemask_t* instead of a state > enum and expose a single 'has_memory' attribute that we print using > something like: I feel like we should keep them separate. They are distinct in the kernel for a reason, right? Do we perhaps actually want has_normal_memory has_highmem_memory has_memory That might be overkill, though -- and perhaps folks will argue you can figure out the third by or'ing the results of the first two in a script? Thanks, Nish -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V4 2007-08-30 16:44 ` Nish Aravamudan @ 2007-08-30 18:20 ` Christoph Lameter 0 siblings, 0 replies; 43+ messages in thread From: Christoph Lameter @ 2007-08-30 18:20 UTC (permalink / raw) To: Nish Aravamudan Cc: Lee Schermerhorn, linux-mm, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Thu, 30 Aug 2007, Nish Aravamudan wrote: > I feel like we should keep them separate. They are distinct in the > kernel for a reason, right? Do we perhaps actually want > > has_normal_memory > has_highmem_memory > has_memory No. We just want the norma and the highmem mask. Lets keep them 1-1 to the actual nodemask array and keep the same names too to reduce confusion. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V4 2007-08-30 15:19 ` [PATCH/RFC] Add node states sysfs class attributeS - V4 Lee Schermerhorn 2007-08-30 16:44 ` Nish Aravamudan @ 2007-08-30 18:19 ` Christoph Lameter 2007-08-30 18:41 ` Lee Schermerhorn 1 sibling, 1 reply; 43+ messages in thread From: Christoph Lameter @ 2007-08-30 18:19 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Thu, 30 Aug 2007, Lee Schermerhorn wrote: > Try, try again. Maybe closer this time. Yes. Thanks for all your work on this. > Question: do we need/want to display the normal and high memory masks > separately for systems with HIGHMEM? If not, I'd suggest changing the > print_nodes_state() function to take a nodemask_t* instead of a state > enum and expose a single 'has_memory' attribute that we print using > something like: No leave it separate. > > static ssize_t print_nodes_has_memory(struct sysdev_class *class, > char *buf) > { > nodemask_t has_memory = node_states[N_NORMAL_MEMORY]; > > if (N_HIGH_MEMORY - N_NORMAL_MEMORY) Uggg. Better do #ifdef CONFIG_HIGHMEM > nodes_or(has_memory, has_memory, node_states[N_HIGH_MEMORY]); > > return print_nodes_state(&has_memory, buf); > + * node states attributes > + */ > + > +static ssize_t print_nodes_state(enum node_states state, char *buf) > +{ > + int n; > + > + n = nodelist_scnprintf(buf, PAGE_SIZE, node_states[state]); > + if (n <= 0) > + goto done; > + if (PAGE_SIZE - n > 1) { if (n > 0 && PAGE_SIZE > n + 1) ? > + *(buf + n++) = '\n'; > + *(buf + n++) = '\0'; > + } > +static ssize_t print_nodes_possible(struct sysdev_class *class, char *buf) > +{ > + return print_nodes_state(N_POSSIBLE, buf); > +} > + > +static ssize_t print_nodes_online(struct sysdev_class *class, char *buf) > +{ > + return print_nodes_state(N_ONLINE, buf); > +} > + > +static ssize_t print_nodes_has_normal_memory(struct sysdev_class *class, > + char *buf) > +{ > + return print_nodes_state(N_NORMAL_MEMORY, buf); > +} > + > +static ssize_t print_nodes_has_cpu(struct sysdev_class *class, char *buf) > +{ > + return print_nodes_state(N_CPU, buf); > +} > + > +static SYSDEV_CLASS_ATTR(possible, 0444, print_nodes_possible, NULL); > +static SYSDEV_CLASS_ATTR(online, 0444, print_nodes_online, NULL); > +static SYSDEV_CLASS_ATTR(has_normal_memory, 0444, print_nodes_has_normal_memory, I am sure that there is some way to dynamicall allocate these. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V4 2007-08-30 18:19 ` Christoph Lameter @ 2007-08-30 18:41 ` Lee Schermerhorn 0 siblings, 0 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-08-30 18:41 UTC (permalink / raw) To: Christoph Lameter Cc: linux-mm, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Thu, 2007-08-30 at 11:19 -0700, Christoph Lameter wrote: > On Thu, 30 Aug 2007, Lee Schermerhorn wrote: > > > Try, try again. Maybe closer this time. > > Yes. Thanks for all your work on this. > > > Question: do we need/want to display the normal and high memory masks > > separately for systems with HIGHMEM? If not, I'd suggest changing the > > print_nodes_state() function to take a nodemask_t* instead of a state > > enum and expose a single 'has_memory' attribute that we print using > > something like: > > No leave it separate. OK. > > > > > static ssize_t print_nodes_has_memory(struct sysdev_class *class, > > char *buf) > > { > > nodemask_t has_memory = node_states[N_NORMAL_MEMORY]; > > > > if (N_HIGH_MEMORY - N_NORMAL_MEMORY) > > Uggg. Better do #ifdef CONFIG_HIGHMEM I thought Andrew preferred to allow the compiler to do the dead code elimination after syntax checking, ... Sounds like it's moot in this case, tho', as we're leaving the two masks separate. > > > nodes_or(has_memory, has_memory, node_states[N_HIGH_MEMORY]); > > > > return print_nodes_state(&has_memory, buf); > > > + * node states attributes > > + */ > > + > > +static ssize_t print_nodes_state(enum node_states state, char *buf) > > +{ > > + int n; > > + > > + n = nodelist_scnprintf(buf, PAGE_SIZE, node_states[state]); > > + if (n <= 0) > > + goto done; > > + if (PAGE_SIZE - n > 1) { > > if (n > 0 && PAGE_SIZE > n + 1) OK. I'll fix that after KS. > > ? > > + *(buf + n++) = '\n'; > > + *(buf + n++) = '\0'; > > + } > > > +static ssize_t print_nodes_possible(struct sysdev_class *class, char *buf) > > +{ > > + return print_nodes_state(N_POSSIBLE, buf); > > +} > > + > > +static ssize_t print_nodes_online(struct sysdev_class *class, char *buf) > > +{ > > + return print_nodes_state(N_ONLINE, buf); > > +} > > + > > +static ssize_t print_nodes_has_normal_memory(struct sysdev_class *class, > > + char *buf) > > +{ > > + return print_nodes_state(N_NORMAL_MEMORY, buf); > > +} > > + > > +static ssize_t print_nodes_has_cpu(struct sysdev_class *class, char *buf) > > +{ > > + return print_nodes_state(N_CPU, buf); > > +} > > + > > +static SYSDEV_CLASS_ATTR(possible, 0444, print_nodes_possible, NULL); > > +static SYSDEV_CLASS_ATTR(online, 0444, print_nodes_online, NULL); > > +static SYSDEV_CLASS_ATTR(has_normal_memory, 0444, print_nodes_has_normal_memory, > > I am sure that there is some way to dynamicall allocate these. There may be some way to allocate the files in a group. Not sure for class attribute files. In any case, AFAICS the attributes/files each require a separately defined attribute structure. I'll let this set for a week or so and see if I can find a preferred way. Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-08-29 17:39 ` Christoph Lameter 2007-08-29 21:31 ` [PATCH/RFC] Add node states sysfs class attributeS - V3 Lee Schermerhorn 2007-08-30 15:19 ` [PATCH/RFC] Add node states sysfs class attributeS - V4 Lee Schermerhorn @ 2007-09-11 13:56 ` Lee Schermerhorn 2007-09-11 20:25 ` Christoph Lameter 2007-09-14 10:50 ` Andrew Morton 2 siblings, 2 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-09-11 13:56 UTC (permalink / raw) To: linux-mm Cc: Christoph Lameter, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney Should be about ready to go... Lee PATCH Add node 'states' sysfs class attributes v5 Against: 2.6.23-rc4-mm1 V4 -> V5: + further cleanup of print_nodes_state() suggested by Chirstoph. V3 -> V4: + drop the annotations -- not needed with one value per file. + this simplifies print_nodes_state() + fix "function return type on separate line" style glitch V2 -> V3: + changed to per state sysfs file -- "one value per file" V1 -> V2: + style cleanup + drop 'len' variable in print_node_states(); compute from final size. Add a per node state sysfs class attribute file to /sys/devices/system/node to display node state masks. E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: 4 representing the actual hardware cells and one memory-only pseudo-node representing a small amount [512MB] of "hardware interleaved" memory. With this patch, in /sys/devices/system/node we see: #ls -1F /sys/devices/system/node has_cpu has_normal_memory node0/ node1/ node2/ node3/ node4/ online possible #cat /sys/devices/system/node/possible 0-255 #cat /sys/devices/system/node/online 0-4 #cat /sys/devices/system/node/has_normal_memory 0-4 #cat /sys/devices/system/node/has_cpu 0-3 N.B., NOT TESTED with CONFIG_HIGHMEM=y. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> drivers/base/node.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 90 insertions(+), 1 deletion(-) Index: Linux/drivers/base/node.c =================================================================== --- Linux.orig/drivers/base/node.c 2007-07-08 19:32:17.000000000 -0400 +++ Linux/drivers/base/node.c 2007-09-10 13:56:36.000000000 -0400 @@ -12,6 +12,7 @@ #include <linux/topology.h> #include <linux/nodemask.h> #include <linux/cpu.h> +#include <linux/device.h> static struct sysdev_class node_class = { set_kset_name("node"), @@ -232,8 +233,96 @@ void unregister_one_node(int nid) unregister_node(&node_devices[nid]); } +/* + * node states attributes + */ + +static ssize_t print_nodes_state(enum node_states state, char *buf) +{ + int n; + + n = nodelist_scnprintf(buf, PAGE_SIZE, node_states[state]); + if (n > 0 && PAGE_SIZE > n + 1) { + *(buf + n++) = '\n'; + *(buf + n++) = '\0'; + } + return n; +} + +static ssize_t print_nodes_possible(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_POSSIBLE, buf); +} + +static ssize_t print_nodes_online(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_ONLINE, buf); +} + +static ssize_t print_nodes_has_normal_memory(struct sysdev_class *class, + char *buf) +{ + return print_nodes_state(N_NORMAL_MEMORY, buf); +} + +static ssize_t print_nodes_has_cpu(struct sysdev_class *class, char *buf) +{ + return print_nodes_state(N_CPU, buf); +} + +static SYSDEV_CLASS_ATTR(possible, 0444, print_nodes_possible, NULL); +static SYSDEV_CLASS_ATTR(online, 0444, print_nodes_online, NULL); +static SYSDEV_CLASS_ATTR(has_normal_memory, 0444, print_nodes_has_normal_memory, + NULL); +static SYSDEV_CLASS_ATTR(has_cpu, 0444, print_nodes_has_cpu, NULL); + +#ifdef CONFIG_HIGHMEM +static ssize_t print_nodes_has_high_memory(struct sysdev_class *class, + char *buf) +{ + return print_nodes_state(N_HIGH_MEMORY, buf); +} + +static SYSDEV_CLASS_ATTR(has_high_memory, 0444, print_nodes_has_high_memory, + NULL); +#endif + +struct sysdev_class_attribute *node_state_attr[] = { + &attr_possible, + &attr_online, + &attr_has_normal_memory, +#ifdef CONFIG_HIGHMEM + &attr_has_high_memory, +#endif + &attr_has_cpu, +}; + +static int node_states_init(void) +{ + int i; + int err = 0; + + for (i = 0; i < NR_NODE_STATES; i++) { + int ret; + ret = sysdev_class_create_file(&node_class, node_state_attr[i]); + if (!err) + err = ret; + } + return err; +} + static int __init register_node_type(void) { - return sysdev_class_register(&node_class); + int ret; + + ret = sysdev_class_register(&node_class); + if (!ret) + ret = node_states_init(); + + /* + * Note: we're not going to unregister the node class if we fail + * to register the node state class attribute files. + */ + return ret; } postcore_initcall(register_node_type); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-11 13:56 ` [PATCH/RFC] Add node states sysfs class attributeS - V5 Lee Schermerhorn @ 2007-09-11 20:25 ` Christoph Lameter 2007-09-14 10:50 ` Andrew Morton 1 sibling, 0 replies; 43+ messages in thread From: Christoph Lameter @ 2007-09-11 20:25 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, Nish Aravamudan, Andrew Morton, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-11 13:56 ` [PATCH/RFC] Add node states sysfs class attributeS - V5 Lee Schermerhorn 2007-09-11 20:25 ` Christoph Lameter @ 2007-09-14 10:50 ` Andrew Morton 2007-09-14 11:35 ` Andy Whitcroft 2007-09-14 14:43 ` Mel Gorman 1 sibling, 2 replies; 43+ messages in thread From: Andrew Morton @ 2007-09-14 10:50 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, Christoph Lameter, Nish Aravamudan, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney, Andy Whitcroft, Martin Bligh On Tue, 11 Sep 2007 09:56:15 -0400 Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > Should be about ready to go... > > Lee > > > PATCH Add node 'states' sysfs class attributes v5 > > Against: 2.6.23-rc4-mm1 > > V4 -> V5: > + further cleanup of print_nodes_state() suggested by Chirstoph. > > V3 -> V4: > + drop the annotations -- not needed with one value per file. > + this simplifies print_nodes_state() > + fix "function return type on separate line" style glitch > > V2 -> V3: > + changed to per state sysfs file -- "one value per file" > > V1 -> V2: > + style cleanup > + drop 'len' variable in print_node_states(); compute from > final size. > > Add a per node state sysfs class attribute file to > /sys/devices/system/node to display node state masks. > > E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: > 4 representing the actual hardware cells and one memory-only > pseudo-node representing a small amount [512MB] of "hardware > interleaved" memory. With this patch, in /sys/devices/system/node > we see: > > #ls -1F /sys/devices/system/node > has_cpu > has_normal_memory > node0/ > node1/ > node2/ > node3/ > node4/ > online > possible > #cat /sys/devices/system/node/possible > 0-255 > #cat /sys/devices/system/node/online > 0-4 > #cat /sys/devices/system/node/has_normal_memory > 0-4 > #cat /sys/devices/system/node/has_cpu > 0-3 > > N.B., NOT TESTED with CONFIG_HIGHMEM=y. > So how do we get it tested with CONFIG_HIGHMEM=y? Needs an i386 numa machine, yes? Perhaps Andy or Martin can remember to do this sometime, but they'll need a test plan ;) Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-14 10:50 ` Andrew Morton @ 2007-09-14 11:35 ` Andy Whitcroft 2007-09-14 14:34 ` Lee Schermerhorn 2007-09-14 14:43 ` Mel Gorman 1 sibling, 1 reply; 43+ messages in thread From: Andy Whitcroft @ 2007-09-14 11:35 UTC (permalink / raw) To: Andrew Morton Cc: Lee Schermerhorn, linux-mm, Christoph Lameter, Nish Aravamudan, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney, Martin Bligh On Fri, Sep 14, 2007 at 03:50:58AM -0700, Andrew Morton wrote: > On Tue, 11 Sep 2007 09:56:15 -0400 Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > > > Should be about ready to go... > > > > Lee > > > > > > PATCH Add node 'states' sysfs class attributes v5 > > > > Against: 2.6.23-rc4-mm1 > > > > V4 -> V5: > > + further cleanup of print_nodes_state() suggested by Chirstoph. > > > > V3 -> V4: > > + drop the annotations -- not needed with one value per file. > > + this simplifies print_nodes_state() > > + fix "function return type on separate line" style glitch > > > > V2 -> V3: > > + changed to per state sysfs file -- "one value per file" > > > > V1 -> V2: > > + style cleanup > > + drop 'len' variable in print_node_states(); compute from > > final size. > > > > Add a per node state sysfs class attribute file to > > /sys/devices/system/node to display node state masks. > > > > E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: > > 4 representing the actual hardware cells and one memory-only > > pseudo-node representing a small amount [512MB] of "hardware > > interleaved" memory. With this patch, in /sys/devices/system/node > > we see: > > > > #ls -1F /sys/devices/system/node > > has_cpu > > has_normal_memory > > node0/ > > node1/ > > node2/ > > node3/ > > node4/ > > online > > possible > > #cat /sys/devices/system/node/possible > > 0-255 > > #cat /sys/devices/system/node/online > > 0-4 > > #cat /sys/devices/system/node/has_normal_memory > > 0-4 > > #cat /sys/devices/system/node/has_cpu > > 0-3 > > > > N.B., NOT TESTED with CONFIG_HIGHMEM=y. > > > > So how do we get it tested with CONFIG_HIGHMEM=y? Needs an i386 > numa machine, yes? Perhaps Andy or Martin can remember to do this > sometime, but they'll need a test plan ;) Yep, let me know what needs testing and I am sure I can grab one of the dinosaurs to get it tested. Base, patches, what to test ... -apw -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-14 11:35 ` Andy Whitcroft @ 2007-09-14 14:34 ` Lee Schermerhorn 0 siblings, 0 replies; 43+ messages in thread From: Lee Schermerhorn @ 2007-09-14 14:34 UTC (permalink / raw) To: Andy Whitcroft Cc: Andrew Morton, linux-mm, Christoph Lameter, Nish Aravamudan, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney, Martin Bligh On Fri, 2007-09-14 at 12:35 +0100, Andy Whitcroft wrote: > On Fri, Sep 14, 2007 at 03:50:58AM -0700, Andrew Morton wrote: > > On Tue, 11 Sep 2007 09:56:15 -0400 Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > > > > > Should be about ready to go... > > > > > > Lee > > > > > > > > > PATCH Add node 'states' sysfs class attributes v5 > > > > > > Against: 2.6.23-rc4-mm1 > > > > > > V4 -> V5: > > > + further cleanup of print_nodes_state() suggested by Chirstoph. > > > > > > V3 -> V4: > > > + drop the annotations -- not needed with one value per file. > > > + this simplifies print_nodes_state() > > > + fix "function return type on separate line" style glitch > > > > > > V2 -> V3: > > > + changed to per state sysfs file -- "one value per file" > > > > > > V1 -> V2: > > > + style cleanup > > > + drop 'len' variable in print_node_states(); compute from > > > final size. > > > > > > Add a per node state sysfs class attribute file to > > > /sys/devices/system/node to display node state masks. > > > > > > E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: > > > 4 representing the actual hardware cells and one memory-only > > > pseudo-node representing a small amount [512MB] of "hardware > > > interleaved" memory. With this patch, in /sys/devices/system/node > > > we see: > > > > > > #ls -1F /sys/devices/system/node > > > has_cpu > > > has_normal_memory > > > node0/ > > > node1/ > > > node2/ > > > node3/ > > > node4/ > > > online > > > possible > > > #cat /sys/devices/system/node/possible > > > 0-255 > > > #cat /sys/devices/system/node/online > > > 0-4 > > > #cat /sys/devices/system/node/has_normal_memory > > > 0-4 > > > #cat /sys/devices/system/node/has_cpu > > > 0-3 > > > > > > N.B., NOT TESTED with CONFIG_HIGHMEM=y. > > > > > > > So how do we get it tested with CONFIG_HIGHMEM=y? Needs an i386 > > numa machine, yes? Perhaps Andy or Martin can remember to do this > > sometime, but they'll need a test plan ;) > > Yep, let me know what needs testing and I am sure I can grab one of the > dinosaurs to get it tested. Base, patches, what to test ... Hi, Andy: patch is against 23-rc4-mm1. Needed testing: 1) build/boot 2) ls /sys/devices/system/node You should see 'has_high_memory' in addition to the attributes shown in the patch description. 3) cat out the node state attributes to see that they make sense for the platform [a NumaQ?]. Note that 'possible' just lists the nodes configured in via the NODES_SHIFT config option. At some point, I'd like to display the actual number of nodes possible for the platform. This will require arch and platform hooks. Thanks, Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-14 10:50 ` Andrew Morton 2007-09-14 11:35 ` Andy Whitcroft @ 2007-09-14 14:43 ` Mel Gorman 2007-09-14 15:00 ` Paul Mundt 2007-09-14 16:00 ` Martin J. Bligh 1 sibling, 2 replies; 43+ messages in thread From: Mel Gorman @ 2007-09-14 14:43 UTC (permalink / raw) To: Andrew Morton Cc: Lee Schermerhorn, linux-mm, Christoph Lameter, Nish Aravamudan, y-goto, Kamezawa Hiroyuki, Eric Whitney, Andy Whitcroft, Martin Bligh, lethal On (14/09/07 03:50), Andrew Morton didst pronounce: > On Tue, 11 Sep 2007 09:56:15 -0400 Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > > > Should be about ready to go... > > > > Lee > > > > > > PATCH Add node 'states' sysfs class attributes v5 > > > > Against: 2.6.23-rc4-mm1 > > > > V4 -> V5: > > + further cleanup of print_nodes_state() suggested by Chirstoph. > > > > V3 -> V4: > > + drop the annotations -- not needed with one value per file. > > + this simplifies print_nodes_state() > > + fix "function return type on separate line" style glitch > > > > V2 -> V3: > > + changed to per state sysfs file -- "one value per file" > > > > V1 -> V2: > > + style cleanup > > + drop 'len' variable in print_node_states(); compute from > > final size. > > > > Add a per node state sysfs class attribute file to > > /sys/devices/system/node to display node state masks. > > > > E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: > > 4 representing the actual hardware cells and one memory-only > > pseudo-node representing a small amount [512MB] of "hardware > > interleaved" memory. With this patch, in /sys/devices/system/node > > we see: > > > > #ls -1F /sys/devices/system/node > > has_cpu > > has_normal_memory > > node0/ > > node1/ > > node2/ > > node3/ > > node4/ > > online > > possible > > #cat /sys/devices/system/node/possible > > 0-255 > > #cat /sys/devices/system/node/online > > 0-4 > > #cat /sys/devices/system/node/has_normal_memory > > 0-4 > > #cat /sys/devices/system/node/has_cpu > > 0-3 > > > > N.B., NOT TESTED with CONFIG_HIGHMEM=y. > > > > So how do we get it tested with CONFIG_HIGHMEM=y? Needs an i386 > numa machine, yes? Perhaps Andy or Martin can remember to do this > sometime, but they'll need a test plan ;) > As an aside, 32 Bit NUMA usually means we turn the NUMAQ into a whipping boy and give the problem lip service. However, I'd be interested in hearing if superh has dependencies on 32 bit NUMA working properly, including HIGHMEM issues. I've cc'd Paul Mundt. Paul, does superh use 32 bit NUMA? Is it used with with HIGHMEM? -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-14 14:43 ` Mel Gorman @ 2007-09-14 15:00 ` Paul Mundt 2007-09-16 12:10 ` Mel Gorman 2007-09-14 16:00 ` Martin J. Bligh 1 sibling, 1 reply; 43+ messages in thread From: Paul Mundt @ 2007-09-14 15:00 UTC (permalink / raw) To: Mel Gorman Cc: Andrew Morton, Lee Schermerhorn, linux-mm, Christoph Lameter, Nish Aravamudan, y-goto, Kamezawa Hiroyuki, Eric Whitney, Andy Whitcroft, Martin Bligh On Fri, Sep 14, 2007 at 03:43:00PM +0100, Mel Gorman wrote: > On (14/09/07 03:50), Andrew Morton didst pronounce: > > So how do we get it tested with CONFIG_HIGHMEM=y? Needs an i386 > > numa machine, yes? Perhaps Andy or Martin can remember to do this > > sometime, but they'll need a test plan ;) > > As an aside, 32 Bit NUMA usually means we turn the NUMAQ into a whipping boy > and give the problem lip service. However, I'd be interested in hearing if > superh has dependencies on 32 bit NUMA working properly, including HIGHMEM > issues. > > I've cc'd Paul Mundt. Paul, does superh use 32 bit NUMA? Is it used with > with HIGHMEM? > We do use 32-bit NUMA, yes. Not with highmem at the moment, though. highmem support is something that will be coming soon, so the 32-bit NUMA + highmem assertion will be true on SH in the not-so-distant future. This is something that quite a few CPUs and boards are depending on, and these are all using static sparsemem exclusively at present. It's also something that's tested with current git on a close to daily basis, albeit with page migration generally disabled, due to the small size of the non-system memory nodes. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-14 15:00 ` Paul Mundt @ 2007-09-16 12:10 ` Mel Gorman 0 siblings, 0 replies; 43+ messages in thread From: Mel Gorman @ 2007-09-16 12:10 UTC (permalink / raw) To: Paul Mundt, Andrew Morton, Lee Schermerhorn, linux-mm, Christoph Lameter, Nish Aravamudan, y-goto, Kamezawa Hiroyuki, Eric Whitney, Andy Whitcroft, Martin Bligh On (15/09/07 00:00), Paul Mundt didst pronounce: > On Fri, Sep 14, 2007 at 03:43:00PM +0100, Mel Gorman wrote: > > On (14/09/07 03:50), Andrew Morton didst pronounce: > > > So how do we get it tested with CONFIG_HIGHMEM=y? Needs an i386 > > > numa machine, yes? Perhaps Andy or Martin can remember to do this > > > sometime, but they'll need a test plan ;) > > > > As an aside, 32 Bit NUMA usually means we turn the NUMAQ into a whipping boy > > and give the problem lip service. However, I'd be interested in hearing if > > superh has dependencies on 32 bit NUMA working properly, including HIGHMEM > > issues. > > > > I've cc'd Paul Mundt. Paul, does superh use 32 bit NUMA? Is it used with > > with HIGHMEM? > > > We do use 32-bit NUMA, yes. Not with highmem at the moment, though. > highmem support is something that will be coming soon, so the 32-bit NUMA > + highmem assertion will be true on SH in the not-so-distant future. > > This is something that quite a few CPUs and boards are depending on, and > these are all using static sparsemem exclusively at present. It's also > something that's tested with current git on a close to daily basis, > albeit with page migration generally disabled, due to the small size of > the non-system memory nodes. > Ok, thanks Paul. It confirms that 32 bit NUMA is not something that will disappear when the NUMAQs finally kick the bucket. -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node states sysfs class attributeS - V5 2007-09-14 14:43 ` Mel Gorman 2007-09-14 15:00 ` Paul Mundt @ 2007-09-14 16:00 ` Martin J. Bligh 1 sibling, 0 replies; 43+ messages in thread From: Martin J. Bligh @ 2007-09-14 16:00 UTC (permalink / raw) To: Mel Gorman Cc: Andrew Morton, Lee Schermerhorn, linux-mm, Christoph Lameter, Nish Aravamudan, y-goto, Kamezawa Hiroyuki, Eric Whitney, Andy Whitcroft, lethal >> So how do we get it tested with CONFIG_HIGHMEM=y? Needs an i386 >> numa machine, yes? Perhaps Andy or Martin can remember to do this >> sometime, but they'll need a test plan ;) >> > > As an aside, 32 Bit NUMA usually means we turn the NUMAQ into a whipping boy > and give the problem lip service. The x440 should be 32-bit NUMA too if it hasn't gone up in flames yet. > However, I'd be interested in hearing if > superh has dependencies on 32 bit NUMA working properly, including HIGHMEM > issues. > > I've cc'd Paul Mundt. Paul, does superh use 32 bit NUMA? Is it used with > with HIGHMEM? > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 6:12 ` Andrew Morton 2007-08-28 14:05 ` Lee Schermerhorn @ 2007-08-28 19:34 ` Christoph Lameter 1 sibling, 0 replies; 43+ messages in thread From: Christoph Lameter @ 2007-08-28 19:34 UTC (permalink / raw) To: Andrew Morton Cc: Lee Schermerhorn, linux-mm, mel, y-goto, Kamezawa Hiroyuki, Eric Whitney On Mon, 27 Aug 2007, Andrew Morton wrote: > Just step back from this for a minute, and think how utterly lame that is. > User interface code in the kernel because we (actually you guys) have not > expended the tiny amount of effort and initiative which would be required > to develop a little utility to do it. A little utility that would cause a lot of work to keep up to date when the kernel can already give you the bare numbers you need? We have tools for sysadmins that collect these numbers and present a higher level overview but that does not help us. If they report a problem then you have to dig down into where this information come from to figure out what is wrong. > > The cpu affinity is a horror to see on 4096 cpu systems. If you > > want to figure out to which cpu the process has restricted itself then you > > need to do some quick hex conversions in your mind. > > wtf? You meen nobody has written the teeny bit of code which is needed to > convert that info into your desired format? Of course there is somewhere. But it summarizes various things and so it is mostly useless baggage if you are debugging a kernel problem. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-27 21:02 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Lee Schermerhorn 2007-08-27 21:04 ` Christoph Lameter 2007-08-28 0:01 ` Andrew Morton @ 2007-08-28 1:16 ` Yasunori Goto 2007-08-28 1:21 ` Yasunori Goto 2 siblings, 1 reply; 43+ messages in thread From: Yasunori Goto @ 2007-08-28 1:16 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, clameter, mel, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney Hi. > Add a sysfs class attribute file to /sys/devices/system/node > to display node state masks. IIRC, sysfs has the policy that each file shows only one value, and all files keep it. But, this states file shows 4 values. I think you should make 4 files which show just one value like followings. /sys/devices/system/node/possible /online /has_normal_memory /has_cpu Thanks. > > Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> > > drivers/base/node.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 70 insertions(+), 1 deletion(-) > > Index: Linux/drivers/base/node.c > =================================================================== > --- Linux.orig/drivers/base/node.c 2007-08-27 12:31:32.000000000 -0400 > +++ Linux/drivers/base/node.c 2007-08-27 16:30:18.000000000 -0400 > @@ -12,6 +12,7 @@ > #include <linux/topology.h> > #include <linux/nodemask.h> > #include <linux/cpu.h> > +#include <linux/device.h> > > static struct sysdev_class node_class = { > set_kset_name("node"), > @@ -232,8 +233,76 @@ void unregister_one_node(int nid) > unregister_node(&node_devices[nid]); > } > > +/* > + * [node] states attribute > + */ > +static char * node_state_names[] = { > + "possible:", > + "on-line:", > + "normal memory:", > +#ifdef CONFIG_HIGHMEM > + "high memory:", > +#endif > + "cpu:", > +}; > + > +static ssize_t > +print_node_states(struct class *class, char *buf) > +{ > + int i; > + int n; > + ssize_t size = PAGE_SIZE; > + > + for (i = 0; i < NR_NODE_STATES; i++) { > + n = snprintf(buf, size, "%14s ", node_state_names[i]); > + if (n <= 0) > + break; > + buf += n; > + size -= n; > + if (size <= 0) > + break; > + > + n = nodelist_scnprintf(buf, size, node_states[i]); > + if (n <= 0) > + break; > + buf += n; > + size -=n; > + if (size <= 0) > + break; > + > + n = snprintf(buf, size, "\n"); > + if (n <= 0) > + break; > + buf += n; > + size -= n; > + if (size <= 0) > + break; > + } > + > + if (n > 0) { > + n = PAGE_SIZE; > + if (size > 0) > + n -= size; > + } > + return n; > +} > + > +static CLASS_ATTR(states, 0444, print_node_states, NULL); > + > +static int node_states_init(void) > +{ > + return sysfs_create_file(&node_class.kset.kobj, > + &class_attr_states.attr); > +} > + > static int __init register_node_type(void) > { > - return sysdev_class_register(&node_class); > + int ret; > + > + ret = sysdev_class_register(&node_class); > + if (!ret) > + ret = node_states_init(); > + > + return ret; > } > postcore_initcall(register_node_type); > -- Yasunori Goto -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH/RFC] Add node 'states' sysfs class attribute - V2 2007-08-28 1:16 ` Yasunori Goto @ 2007-08-28 1:21 ` Yasunori Goto 0 siblings, 0 replies; 43+ messages in thread From: Yasunori Goto @ 2007-08-28 1:21 UTC (permalink / raw) To: Lee Schermerhorn Cc: linux-mm, clameter, mel, Kamezawa Hiroyuki, Andrew Morton, Eric Whitney > > Add a sysfs class attribute file to /sys/devices/system/node > > to display node state masks. > > IIRC, sysfs has the policy that each file shows only one value, > and all files keep it. > > But, this states file shows 4 values. > I think you should make 4 files which show just one value like > followings. > /sys/devices/system/node/possible > /online > /has_normal_memory > /has_cpu Oh, I found it discussed in other thread.... Sorry for noise.. -- Yasunori Goto -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 43+ messages in thread
end of thread, other threads:[~2007-09-16 12:10 UTC | newest]
Thread overview: 43+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200708242228.l7OMS5fU017948@imap1.linux-foundation.org>
2007-08-27 15:58 ` [PATCH] 2.6.23-rc3-mm1 - update N_HIGH_MEMORY node state for memory hotadd Lee Schermerhorn
2007-08-27 17:48 ` [PATCH/RFC] Add node 'states' sysfs class attribute Lee Schermerhorn
2007-08-27 19:11 ` Christoph Lameter
2007-08-27 20:08 ` Lee Schermerhorn
2007-08-27 20:15 ` Christoph Lameter
2007-08-27 21:02 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Lee Schermerhorn
2007-08-27 21:04 ` Christoph Lameter
2007-08-28 0:01 ` Andrew Morton
2007-08-28 0:08 ` Christoph Lameter
2007-08-28 1:14 ` Andrew Morton
2007-08-28 1:29 ` Christoph Lameter
2007-08-28 3:18 ` Andrew Morton
2007-08-28 5:15 ` Christoph Lameter
2007-08-28 5:29 ` Andrew Morton
2007-08-28 5:34 ` Andrew Morton
2007-08-28 5:53 ` Christoph Lameter
2007-08-28 6:12 ` Andrew Morton
2007-08-28 14:05 ` Lee Schermerhorn
2007-08-28 22:02 ` Christoph Lameter
2007-08-28 22:13 ` Nish Aravamudan
2007-08-29 14:43 ` Lee Schermerhorn
2007-08-29 17:39 ` Christoph Lameter
2007-08-29 21:31 ` [PATCH/RFC] Add node states sysfs class attributeS - V3 Lee Schermerhorn
2007-08-29 22:14 ` Christoph Lameter
2007-08-30 13:34 ` Lee Schermerhorn
2007-08-29 22:36 ` Nish Aravamudan
2007-08-30 15:19 ` [PATCH/RFC] Add node states sysfs class attributeS - V4 Lee Schermerhorn
2007-08-30 16:44 ` Nish Aravamudan
2007-08-30 18:20 ` Christoph Lameter
2007-08-30 18:19 ` Christoph Lameter
2007-08-30 18:41 ` Lee Schermerhorn
2007-09-11 13:56 ` [PATCH/RFC] Add node states sysfs class attributeS - V5 Lee Schermerhorn
2007-09-11 20:25 ` Christoph Lameter
2007-09-14 10:50 ` Andrew Morton
2007-09-14 11:35 ` Andy Whitcroft
2007-09-14 14:34 ` Lee Schermerhorn
2007-09-14 14:43 ` Mel Gorman
2007-09-14 15:00 ` Paul Mundt
2007-09-16 12:10 ` Mel Gorman
2007-09-14 16:00 ` Martin J. Bligh
2007-08-28 19:34 ` [PATCH/RFC] Add node 'states' sysfs class attribute - V2 Christoph Lameter
2007-08-28 1:16 ` Yasunori Goto
2007-08-28 1:21 ` Yasunori Goto
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).