linux-kernel.vger.kernel.org archive mirror
* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
       [not found] <bug-13690-10286@http.bugzilla.kernel.org/>
@ 2009-07-02  1:34 ` Andrew Morton
  2009-07-02  2:14   ` Yinghai Lu
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2009-07-02  1:34 UTC (permalink / raw)
  To: alex.shi
  Cc: bugzilla-daemon, bugme-daemon, Yinghai Lu, Christoph Lameter,
	Mel Gorman, Ingo Molnar, linux-kernel


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 2 Jul 2009 01:22:24 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=13690
> 
>            Summary: nodes_clear cause hugepage unusable on non-NUMA
>                     machine
>            Product: Platform Specific/Hardware
>            Version: 2.5
>     Kernel Version: 2.6.31-rc1
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: i386
>         AssignedTo: platform_i386@kernel-bugs.osdl.org
>         ReportedBy: alex.shi@intel.com
>                 CC: yinghai@kernel.org
>         Regression: Yes
> 
> 
> Commit 73d60b7f747176dbdff826c4127d22e1fd3f9f74 introduced a nodes_clear()
> call for NUMA machines, but the commit seems to overlook non-NUMA machines.
> If find_zone_movable_pfns_for_nodes/early_calculate_totalpages has no
> chance to run, nodes_clear prevents hugepages from being used in my
> specjbb2005 testing on my Stoakley (i386/x86_64), Waybridge (i386), and
> IBM T61 (i386).
> 
> +       /*
> +        * find_zone_movable_pfns_for_nodes/early_calculate_totalpages init
> +        * that node_mask, clear it at first
> +        */
> +       nodes_clear(node_states[N_HIGH_MEMORY]);

Thanks.

fyi, with recently-occurring bugs and regressions of this nature, it is (I
think) best to deal with them via email rather than bugzilla.  Bugzilla is
better-suited to longer-lived bugs where we have a need to track them,
generate statistics, etc.



* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-02  1:34 ` [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine Andrew Morton
@ 2009-07-02  2:14   ` Yinghai Lu
  2009-07-02  6:45     ` Alex Shi
  0 siblings, 1 reply; 13+ messages in thread
From: Yinghai Lu @ 2009-07-02  2:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: alex.shi, bugzilla-daemon, bugme-daemon, Christoph Lameter,
	Mel Gorman, Ingo Molnar, linux-kernel

That looks strange...

The config is 32-bit.

The second patch only does a save and restore, and should be right.

Please check the following patch on today's Linus tree, and send the output of /proc/iomem.

Thanks

Yinghai

[PATCH] x86: add boundary check for 32bit res before expand e820 resource to alignment

Fix a hang with HIGHMEM_64G and 32-bit resources.
According to hpa and Linus, use (resource_size_t)-1 to fend off big ranges.

Analyzed by hpa.

Reported-and-tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/include/asm/proto.h |    3 ---
 arch/x86/kernel/e820.c       |   20 ++++++++++++--------
 include/linux/kernel.h       |    5 +++++
 3 files changed, 17 insertions(+), 11 deletions(-)

Index: linux-2.6/arch/x86/kernel/e820.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/e820.c
+++ linux-2.6/arch/x86/kernel/e820.c
@@ -1367,9 +1367,9 @@ void __init e820_reserve_resources(void)
 }
 
 /* How much should we pad RAM ending depending on where it is? */
-static unsigned long ram_alignment(resource_size_t pos)
+static u64 ram_alignment(u64 pos)
 {
-	unsigned long mb = pos >> 20;
+	u64 mb = pos >> 20;
 
 	/* To 64kB in the first megabyte */
 	if (!mb)
@@ -1383,6 +1383,8 @@ static unsigned long ram_alignment(resou
 	return 32*1024*1024;
 }
 
+#define MAX_RESOURCE_SIZE ((resource_size_t)-1)
+
 void __init e820_reserve_resources_late(void)
 {
 	int i;
@@ -1400,17 +1402,19 @@ void __init e820_reserve_resources_late(
 	 * avoid stolen RAM:
 	 */
 	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *entry = &e820_saved.map[i];
-		resource_size_t start, end;
+		struct e820entry *entry = &e820.map[i];
+		u64 start, end;
 
 		if (entry->type != E820_RAM)
 			continue;
 		start = entry->addr + entry->size;
-		end = round_up(start, ram_alignment(start));
-		if (start == end)
+		end = round_up(start, ram_alignment(start)) - 1;
+		if (end > MAX_RESOURCE_SIZE)
+			end = MAX_RESOURCE_SIZE;
+		if (start > end)
 			continue;
-		reserve_region_with_split(&iomem_resource, start,
-						  end - 1, "RAM buffer");
+		reserve_region_with_split(&iomem_resource, start, end,
+					  "RAM buffer");
 	}
 }
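
To make the clamping concrete, here is a minimal userspace sketch of the end-of-RAM padding computation, assuming a 32-bit resource_size_t (the "HIGHMEM_64G and 32bit resource" case from the description). It is not the kernel code: the model_ names are hypothetical, and the 1MB tier in model_ram_alignment() is not visible in the hunk above and is assumed from the mainline function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: resources are 32 bits wide while e820 addresses are 64-bit. */
typedef uint32_t model_resource_size_t;
#define MODEL_MAX_RESOURCE_SIZE ((model_resource_size_t)-1)

/* How much to pad the end of a RAM region, depending on where it is. */
static uint64_t model_ram_alignment(uint64_t pos)
{
	uint64_t mb = pos >> 20;

	if (!mb)			/* to 64kB in the first megabyte */
		return 64 * 1024;
	if (mb < 16)			/* assumed tier: to 1MB below 16MB */
		return 1024 * 1024;
	return 32 * 1024 * 1024;	/* to 32MB above that */
}

/*
 * Compute the inclusive "RAM buffer" range that follows a RAM region whose
 * first byte past the end is ram_end. Returns false when the clamped range
 * is empty, i.e. when the entry would be skipped.
 */
static bool model_ram_buffer(uint64_t ram_end, uint64_t *start, uint64_t *end)
{
	uint64_t align = model_ram_alignment(ram_end);

	*start = ram_end;
	/* Round up in 64 bits, then convert to an inclusive end address. */
	*end = ((ram_end + align - 1) & ~(align - 1)) - 1;
	/* Fend off ranges that a 32-bit resource cannot describe. */
	if (*end > MODEL_MAX_RESOURCE_SIZE)
		*end = MODEL_MAX_RESOURCE_SIZE;
	return *start <= *end;
}
```

In this model, RAM whose last byte is 0xcfefffff yields the range cff00000-cfffffff, while RAM whose last byte is 0x12fffffff lies above what a 32-bit resource can hold, so the clamped range is empty and the entry is skipped rather than truncated.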
 


* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-02  2:14   ` Yinghai Lu
@ 2009-07-02  6:45     ` Alex Shi
  2009-07-02  8:50       ` Yinghai Lu
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Shi @ 2009-07-02  6:45 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Christoph Lameter, Mel Gorman,
	Ingo Molnar, linux-kernel@vger.kernel.org, yanmin.zhang,
	tim.c.chen

The new patch works on my Stoakley i386 machine. But on the x86_64 machine,
specjbb2005 still cannot run with hugepages. specjbb2005 uses the same Java
settings as on the i386 system. After applying your patch, the iomem of
x86_64 is:

00000000-0000ffff : reserved
00010000-0009cbff : System RAM
0009cc00-0009ffff : reserved
000cc000-000cffff : reserved
000e0000-000fffff : reserved
00100000-cfefffff : System RAM
  01000000-014eb53e : Kernel code
  014eb53f-0177390f : Kernel data
  01830000-018f583f : Kernel bss
cff00000-cff0afff : ACPI Tables
cff0b000-cff0bfff : ACPI Non-volatile Storage
cff0c000-cfffffff : reserved
d0000000-d7ffffff : PCI Bus 0000:08
  d0000000-d7ffffff : 0000:08:01.0
d8000000-d81fffff : PCI Bus 0000:03
  d8000000-d81fffff : PCI Bus 0000:06
    d8000000-d80fffff : 0000:06:02.0
    d8100000-d810ffff : 0000:06:01.0
d8200000-d84fffff : PCI Bus 0000:03
  d8200000-d83fffff : PCI Bus 0000:06
    d8200000-d82fffff : 0000:06:02.0
      d8200000-d82fffff : e100
    d8300000-d831ffff : 0000:06:01.0
      d8300000-d831ffff : e1000
    d8320000-d832ffff : 0000:06:01.0
      d8320000-d832ffff : e1000
    d8330000-d8330fff : 0000:06:02.0
      d8330000-d8330fff : e100
d8500000-d87fffff : PCI Bus 0000:07
  d8500000-d8503fff : 0000:07:00.0
  d8504000-d8507fff : 0000:07:00.1
  d8520000-d853ffff : 0000:07:00.0
  d8540000-d855ffff : 0000:07:00.1
  d8600000-d86fffff : 0000:07:00.0
  d8700000-d87fffff : 0000:07:00.1
d8800000-d88fffff : PCI Bus 0000:08
  d8800000-d880ffff : 0000:08:01.0
  d8810000-d8813fff : 0000:08:08.0
  d8814000-d88147ff : 0000:08:08.0
  d8820000-d883ffff : 0000:08:01.0
d8904000-d8907fff : 0000:00:1b.0
d8908000-d89083ff : 0000:00:1d.7
  d8908000-d89083ff : ehci_hcd
d8908400-d89087ff : 0000:00:1f.2
  d8908400-d89087ff : ahci
d8a00000-d8bfffff : PCI Bus 0000:07
  d8a00000-d8afffff : 0000:07:00.0
  d8b00000-d8bfffff : 0000:07:00.1
e0000000-efffffff : reserved
  e0000000-efffffff : pnp 00:01
    e0000000-e07fffff : PCI MMCONFIG 0 [00-07]
fe000000-fe01ffff : pnp 00:01
  fe000000-fe01ffff : i5k_amb
fe600000-fe6fffff : pnp 00:01
fe700000-fe703fff : 0000:00:0f.0
fec00000-fec0ffff : reserved
  fec00000-fec00fff : IOAPIC 0
fec88000-fec88fff : IOAPIC 1
  fec88000-fec88fff : pnp 00:01
fec89000-fec89fff : IOAPIC 2
  fec89000-fec89fff : pnp 00:01
fed00000-fed003ff : HPET 0
fed1c000-fed1ffff : pnp 00:01
fed20000-fed44fff : pnp 00:01
fed45000-fed8ffff : pnp 00:01
fee00000-fee00fff : Local APIC
  fee00000-fee00fff : reserved
ff000000-ffffffff : reserved
100000000-12fffffff : System RAM

====================
The iomem of the i386 Stoakley, shown as a diff against the x86_64 one, is:
--- stoakley.iomem.x86_64  2009-07-02 13:53:35.000000000 +0800
+++ stoakley.iomem.i386 2009-07-02 14:19:59.000000000 +0800
@@ -1,12 +1,15 @@
 00000000-0000ffff : reserved
 00010000-0009cbff : System RAM
 0009cc00-0009ffff : reserved
+000a0000-000bffff : Video RAM area
+000c0000-000cafff : Video ROM
 000cc000-000cffff : reserved
 000e0000-000fffff : reserved
+  000f0000-000fffff : System ROM
 00100000-cfefffff : System RAM
-  01000000-014eb53e : Kernel code
-  014eb53f-0177390f : Kernel data
-  01830000-018f583f : Kernel bss
+  00100000-00602876 : Kernel code
+  00602877-008e49db : Kernel data
+  00954000-009fe433 : Kernel bss
 cff00000-cff0afff : ACPI Tables
 cff0b000-cff0bfff : ACPI Non-volatile Storage
 cff0c000-cfffffff : reserved
@@ -50,7 +53,6 @@
   e0000000-efffffff : pnp 00:01
     e0000000-e07fffff : PCI MMCONFIG 0 [00-07]
 fe000000-fe01ffff : pnp 00:01
-  fe000000-fe01ffff : i5k_amb
 fe600000-fe6fffff : pnp 00:01
 fe700000-fe703fff : 0000:00:0f.0
 fec00000-fec0ffff : reserved
@@ -66,4 +68,3 @@
 fee00000-fee00fff : Local APIC
   fee00000-fee00fff : reserved
 ff000000-ffffffff : reserved
-100000000-12fffffff : System RAM


Alex 


* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-02  6:45     ` Alex Shi
@ 2009-07-02  8:50       ` Yinghai Lu
  2009-07-02 14:11         ` Christoph Lameter
                           ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Yinghai Lu @ 2009-07-02  8:50 UTC (permalink / raw)
  To: alex.shi, Andrew Morton, Ingo Molnar
  Cc: bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Christoph Lameter, Mel Gorman,
	linux-kernel@vger.kernel.org, yanmin.zhang, tim.c.chen

Alex Shi wrote:
> The new patch works on my Stoakley i386 machine. But on the x86_64 machine,
> specjbb2005 still cannot run with hugepages. specjbb2005 uses the same Java
> settings as on the i386 system. After applying your patch, the iomem of
> x86_64 is:

Please check:

[PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in

Alex found that on an x86_64 machine specjbb2005 still cannot run with
hugepages.

This only happens when NUMA is not compiled in.

The root cause: node_set_state() will not set the bit back for us in that
case.

So don't clear the mask when NUMA is not selected in the config.

Reported-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/mm/init_64.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_64.c
+++ linux-2.6/arch/x86/mm/init_64.c
@@ -598,8 +598,14 @@ void __init paging_init(void)
 
 	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
-	/* clear the default setting with node 0 */
+#if MAX_NUMNODES > 1
+	/*
+	 * clear the default setting with node 0
+	 * note: don't clear it, node_set_state will do nothing
+	 *	 (aka set it back) when numa support is not compiled in
+	 */
 	nodes_clear(node_states[N_NORMAL_MEMORY]);
+#endif
 	free_area_init_nodes(max_zone_pfns);
 }
 


* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-02  8:50       ` Yinghai Lu
@ 2009-07-02 14:11         ` Christoph Lameter
  2009-07-02 18:04           ` [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2 Yinghai Lu
                             ` (2 more replies)
  2009-07-02 14:12         ` [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine Shi, Alex
  2009-07-06  3:13         ` Alex Shi
  2 siblings, 3 replies; 13+ messages in thread
From: Christoph Lameter @ 2009-07-02 14:11 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: alex.shi, Andrew Morton, Ingo Molnar,
	bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Mel Gorman,
	linux-kernel@vger.kernel.org, yanmin.zhang, tim.c.chen

On Thu, 2 Jul 2009, Yinghai Lu wrote:

> Index: linux-2.6/arch/x86/mm/init_64.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mm/init_64.c
> +++ linux-2.6/arch/x86/mm/init_64.c
> @@ -598,8 +598,14 @@ void __init paging_init(void)
>
>  	sparse_memory_present_with_active_regions(MAX_NUMNODES);
>  	sparse_init();
> -	/* clear the default setting with node 0 */
> +#if MAX_NUMNODES > 1
> +	/*
> +	 * clear the default setting with node 0
> +	 * note: don't clear it, node_set_state will do nothing
> +	 *	 (aka set it back) when numa support is not compiled in
> +	 */
>  	nodes_clear(node_states[N_NORMAL_MEMORY]);

The problem was that nodes_clear() does not fall back to a no-op on !NUMA.
The node_set_state()/node_clear_state() operations do become no-ops.

Could we make it more consistent by using only operations of the same
type? For example, add a node_clearall_states() in include/linux/nodemask.h
that falls back to a no-op on !NUMA like the node_*_state operations?

Another option is to restore node_states[N_NORMAL_MEMORY] to its
initial condition. See the definition of node_states in page_alloc.c.
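
The first suggestion could look roughly like the sketch below. It is a standalone userspace model, not kernel code: node_clearall_states() is only the helper name proposed above, and the model_ prefix, the one-word nodemask, and the MODEL_MAX_NUMNODES switch are assumptions made for illustration. With MODEL_MAX_NUMNODES left at 1 (the !NUMA case), the helper compiles to a no-op just like the set stub, so the default node 0 bit survives.

```c
#include <stdbool.h>

/* Assumption: 1 models a kernel built without NUMA support. */
#ifndef MODEL_MAX_NUMNODES
#define MODEL_MAX_NUMNODES 1
#endif

enum model_node_states { MODEL_N_NORMAL_MEMORY, MODEL_NR_NODE_STATES };

/* One word is enough for this model; the kernel uses a bitmap. */
typedef struct { unsigned long bits; } model_nodemask_t;

/* Default setting: only node 0 is set in the mask. */
static model_nodemask_t model_node_states[MODEL_NR_NODE_STATES] = {
	[MODEL_N_NORMAL_MEMORY] = { 1UL << 0 },
};

/* Raw mask operation: always clears, regardless of configuration. */
static inline void model_nodes_clear(model_nodemask_t *mask)
{
	mask->bits = 0;
}

#if MODEL_MAX_NUMNODES > 1
static inline void model_node_set_state(int node, enum model_node_states state)
{
	model_node_states[state].bits |= 1UL << node;
}

/* Hypothetical helper: clear every node in one state mask. */
static inline void model_node_clearall_states(enum model_node_states state)
{
	model_nodes_clear(&model_node_states[state]);
}
#else
/* !NUMA: state updates are no-ops, so the default mask is left alone. */
static inline void model_node_set_state(int node, enum model_node_states state)
{
	(void)node;
	(void)state;
}

static inline void model_node_clearall_states(enum model_node_states state)
{
	(void)state;
}
#endif

/* Reads the mask directly, like a consumer that inspects the raw bitmap. */
static inline bool model_mask_has_node(enum model_node_states state, int node)
{
	return (model_node_states[state].bits >> node) & 1UL;
}
```

In the !NUMA build of this model, calling model_node_clearall_states() followed by model_node_set_state() leaves node 0 set, whereas model_nodes_clear() followed by model_node_set_state() leaves the mask empty, which is the failure described in this thread.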


* RE: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-02  8:50       ` Yinghai Lu
  2009-07-02 14:11         ` Christoph Lameter
@ 2009-07-02 14:12         ` Shi, Alex
  2009-07-06  3:13         ` Alex Shi
  2 siblings, 0 replies; 13+ messages in thread
From: Shi, Alex @ 2009-07-02 14:12 UTC (permalink / raw)
  To: Yinghai Lu, Andrew Morton, Ingo Molnar
  Cc: bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Christoph Lameter, Mel Gorman,
	linux-kernel@vger.kernel.org, Zhang, Yanmin, Chen, Tim C


Yes, the patch fixes this bug! 

Alex 


* [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2
  2009-07-02 14:11         ` Christoph Lameter
@ 2009-07-02 18:04           ` Yinghai Lu
  2009-07-03 15:39           ` [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine Yinghai Lu
  2009-07-08 16:50           ` [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2 Yinghai Lu
  2 siblings, 0 replies; 13+ messages in thread
From: Yinghai Lu @ 2009-07-02 18:04 UTC (permalink / raw)
  To: Christoph Lameter, Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: alex.shi, Andrew Morton, Mel Gorman, linux-kernel@vger.kernel.org,
	yanmin.zhang, tim.c.chen



Alex found that on an x86_64 machine specjbb2005 still cannot run with
hugepages.

This only happens when NUMA is not compiled in.

The root cause: node_set_state() will not set the bit back for us in that
case.

So don't clear the mask when NUMA is not selected in the config.

v2: use node_clear_state() instead

Reported-and-Tested-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/mm/init_64.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_64.c
+++ linux-2.6/arch/x86/mm/init_64.c
@@ -598,8 +598,15 @@ void __init paging_init(void)
 
 	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
-	/* clear the default setting with node 0 */
-	nodes_clear(node_states[N_NORMAL_MEMORY]);
+
+	/*
+	 * clear the default setting with node 0
+	 * note: don't use nodes_clear here, that is really clearing when
+	 *	 numa support is not compiled in, and later node_set_state
+	 *	 will not set it back.
+	 */
+	node_clear_state(0, N_NORMAL_MEMORY);
+
 	free_area_init_nodes(max_zone_pfns);
 }
 


* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-02 14:11         ` Christoph Lameter
  2009-07-02 18:04           ` [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2 Yinghai Lu
@ 2009-07-03 15:39           ` Yinghai Lu
  2009-07-08 16:50           ` [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2 Yinghai Lu
  2 siblings, 0 replies; 13+ messages in thread
From: Yinghai Lu @ 2009-07-03 15:39 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: alex.shi, Andrew Morton, Ingo Molnar,
	bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Mel Gorman,
	linux-kernel@vger.kernel.org, yanmin.zhang, tim.c.chen

Christoph Lameter wrote:
> On Thu, 2 Jul 2009, Yinghai Lu wrote:
> 
>> Index: linux-2.6/arch/x86/mm/init_64.c
>> ===================================================================
>> --- linux-2.6.orig/arch/x86/mm/init_64.c
>> +++ linux-2.6/arch/x86/mm/init_64.c
>> @@ -598,8 +598,14 @@ void __init paging_init(void)
>>
>>  	sparse_memory_present_with_active_regions(MAX_NUMNODES);
>>  	sparse_init();
>> -	/* clear the default setting with node 0 */
>> +#if MAX_NUMNODES > 1
>> +	/*
>> +	 * clear the default setting with node 0
>> +	 * note: don't clear it, node_set_state will do nothing
>> +	 *	 (aka set it back) when numa support is not compiled in
>> +	 */
>>  	nodes_clear(node_states[N_NORMAL_MEMORY]);
> 
> The problem was that nodes_clear() does not fall back to a no-op on !NUMA.
> The node_set_state()/node_clear_state() operations do become no-ops.
> 
> Could we make it more consistent by using only operations of the same
> type? For example, add a node_clearall_states() in include/linux/nodemask.h
> that falls back to a no-op on !NUMA like the node_*_state operations?
> 
> Another option is to restore node_states[N_NORMAL_MEMORY] to its
> initial condition. See the definition of node_states in page_alloc.c.

We could use node_clear_state(0, N_NORMAL_MEMORY) instead, because the default mask only has node 0 set.

YH
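
A small userspace model of that reasoning, under stated assumptions: the model_ names and the runtime numa flag are illustrative stand-ins for the compile-time configuration, and the no-op behaviour of the per-node helpers on !NUMA is taken from the discussion above. With NUMA, clearing node 0 empties the default mask exactly as a full clear would; without NUMA, the call does nothing and node 0 stays set.

```c
#include <stdbool.h>

enum model_node_states { MODEL_N_NORMAL_MEMORY, MODEL_NR_NODE_STATES };

struct model_kernel {
	bool numa;	/* stands in for a NUMA-enabled build */
	unsigned long node_states[MODEL_NR_NODE_STATES];
};

/* Boot-time default: only node 0 is present in the mask. */
static void model_init(struct model_kernel *k, bool numa)
{
	k->numa = numa;
	k->node_states[MODEL_N_NORMAL_MEMORY] = 1UL << 0;
}

/* Per-node helpers: real work with NUMA, no-ops without it. */
static void model_node_set_state(struct model_kernel *k, int node,
				 enum model_node_states state)
{
	if (k->numa)
		k->node_states[state] |= 1UL << node;
}

static void model_node_clear_state(struct model_kernel *k, int node,
				   enum model_node_states state)
{
	if (k->numa)
		k->node_states[state] &= ~(1UL << node);
}

/* Mask right after the clear step done during paging setup. */
static unsigned long model_mask_after_clear(bool numa)
{
	struct model_kernel k;

	model_init(&k, numa);
	model_node_clear_state(&k, 0, MODEL_N_NORMAL_MEMORY);
	return k.node_states[MODEL_N_NORMAL_MEMORY];
}

/* Mask after later setup marks node 0 as having memory again. */
static unsigned long model_mask_after_boot(bool numa)
{
	struct model_kernel k;

	model_init(&k, numa);
	model_node_clear_state(&k, 0, MODEL_N_NORMAL_MEMORY);
	model_node_set_state(&k, 0, MODEL_N_NORMAL_MEMORY);
	return k.node_states[MODEL_N_NORMAL_MEMORY];
}
```

In both configurations of this model the mask ends up with node 0 set after boot; the two only differ in the intermediate state, which is empty with NUMA and unchanged without it.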


* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-02  8:50       ` Yinghai Lu
  2009-07-02 14:11         ` Christoph Lameter
  2009-07-02 14:12         ` [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine Shi, Alex
@ 2009-07-06  3:13         ` Alex Shi
  2009-07-07  0:07           ` Yinghai Lu
  2 siblings, 1 reply; 13+ messages in thread
From: Alex Shi @ 2009-07-06  3:13 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Ingo Molnar, bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Christoph Lameter, Mel Gorman,
	linux-kernel@vger.kernel.org, Zhang, Yanmin, Chen, Tim C

Yinghai:

The 31-rc2 kernel still cannot use hugepages on a non-NUMA machine, and
this patch did not appear in the rc2 kernel. Are there any concerns about
it?

BRG
Alex 



* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-06  3:13         ` Alex Shi
@ 2009-07-07  0:07           ` Yinghai Lu
  2009-07-07  7:23             ` Alex Shi
  0 siblings, 1 reply; 13+ messages in thread
From: Yinghai Lu @ 2009-07-07  0:07 UTC (permalink / raw)
  To: alex.shi
  Cc: Andrew Morton, Ingo Molnar, bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Christoph Lameter, Mel Gorman,
	linux-kernel@vger.kernel.org, Zhang, Yanmin, Chen, Tim C

Alex Shi wrote:
> Yinghai:
> 
> The 31-rc2 kernel still cannot use hugepages on a non-NUMA machine, and
> this patch did not appear in the rc2 kernel. Are there any concerns about
> it?
> 

can you check
http://lkml.org/lkml/2009/7/2/326

YH


* Re: [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine
  2009-07-07  0:07           ` Yinghai Lu
@ 2009-07-07  7:23             ` Alex Shi
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Shi @ 2009-07-07  7:23 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Ingo Molnar, bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Christoph Lameter, Mel Gorman,
	linux-kernel@vger.kernel.org, Zhang, Yanmin, Chen, Tim C

On Tue, 2009-07-07 at 08:07 +0800, Yinghai Lu wrote:
> Alex Shi wrote:
> > Yinghai:
> > 
> > The 31-rc2 kernel still cannot use hugepages on a non-NUMA machine, and
> > this patch did not appear in the rc2 kernel. Are there any concerns about
> > it?
> > 
> 
> can you check
> http://lkml.org/lkml/2009/7/2/326
> 
> YH

It works on my Stoakley i386 and x86_64 with the latest Linus kernel tree.



Alex 



* [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2
  2009-07-02 14:11         ` Christoph Lameter
  2009-07-02 18:04           ` [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2 Yinghai Lu
  2009-07-03 15:39           ` [Bugme-new] [Bug 13690] New: nodes_clear cause hugepage unusable on non-NUMA machine Yinghai Lu
@ 2009-07-08 16:50           ` Yinghai Lu
  2009-07-08 17:16             ` Christoph Lameter
  2 siblings, 1 reply; 13+ messages in thread
From: Yinghai Lu @ 2009-07-08 16:50 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	Linus Torvalds
  Cc: Christoph Lameter, alex.shi, Mel Gorman,
	linux-kernel@vger.kernel.org, yanmin.zhang, tim.c.chen



Alex found that on an x86_64 machine specjbb2005 still cannot run with
hugepages.

This only happens when NUMA is not compiled in.

The root cause: node_set_state() will not set the bit back for us in that
case.

So don't clear the mask when NUMA is not selected in the config.

v2: use node_clear_state() instead

Reported-and-Tested-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/mm/init_64.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_64.c
+++ linux-2.6/arch/x86/mm/init_64.c
@@ -598,8 +598,15 @@ void __init paging_init(void)
 
 	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
-	/* clear the default setting with node 0 */
-	nodes_clear(node_states[N_NORMAL_MEMORY]);
+
+	/*
+	 * clear the default setting with node 0
+	 * note: don't use nodes_clear here, that is really clearing when
+	 *	 numa support is not compiled in, and later node_set_state
+	 *	 will not set it back.
+	 */
+	node_clear_state(0, N_NORMAL_MEMORY);
+
 	free_area_init_nodes(max_zone_pfns);
 }
 



* Re: [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2
  2009-07-08 16:50           ` [PATCH] x86: don't clear nodes_states[N_NORMAL_MEMORY] when numa is not compiled in -v2 Yinghai Lu
@ 2009-07-08 17:16             ` Christoph Lameter
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Lameter @ 2009-07-08 17:16 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	Linus Torvalds, alex.shi, Mel Gorman,
	linux-kernel@vger.kernel.org, yanmin.zhang, tim.c.chen


Reviewed-by: Christoph Lameter <cl@linux-foundation.org>

