public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
@ 2010-01-15  7:42 Haicheng Li
  2010-01-17  2:22 ` Haicheng Li
  2010-01-17 21:53 ` David Rientjes
  0 siblings, 2 replies; 26+ messages in thread
From: Haicheng Li @ 2010-01-15  7:42 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: x86, Andi Kleen, linux-kernel

This is to fix the bug discussed in email thread: http://patchwork.kernel.org/patch/69499/.

Currently node_possible_map won't include the offlined node that has neither CPU onlined nor MEM 
onlined at booting time. As a result, nr_node_ids won't be equal to possible nodes.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
---
  arch/x86/mm/srat_64.c |   10 ++--------
  1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index a271241..a5bc297 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -238,7 +238,7 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
  void __init
  acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
  {
-	struct bootnode *nd, oldnode;
+	struct bootnode *nd;
  	unsigned long start, end;
  	int node, pxm;
  	int i;
@@ -277,7 +277,6 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
  		return;
  	}
  	nd = &nodes[node];
-	oldnode = *nd;
  	if (!node_test_and_set(node, nodes_parsed)) {
  		nd->start = start;
  		nd->end = end;
@@ -291,13 +290,8 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
  	printk(KERN_INFO "SRAT: Node %u PXM %u %lx-%lx\n", node, pxm,
  	       start, end);

-	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
+	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
  		update_nodes_add(node, start, end);
-		/* restore nodes[node] */
-		*nd = oldnode;
-		if ((nd->start | nd->end) == 0)
-			node_clear(node, nodes_parsed);
-	}

  	node_memblk_range[num_node_memblks].start = start;
  	node_memblk_range[num_node_memblks].end = end;

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
  2010-01-15  7:42 [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI Haicheng Li
@ 2010-01-17  2:22 ` Haicheng Li
  2010-01-17 21:53 ` David Rientjes
  1 sibling, 0 replies; 26+ messages in thread
From: Haicheng Li @ 2010-01-17  2:22 UTC (permalink / raw)
  To: Haicheng Li
  Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86, Andi Kleen,
	linux-kernel

Hello,

Would there be any potential issue with this patch? Without this fix, hotadding a CPU with MEM 
attached will cause kernel BUG like below. So this could even be a fix for stable, any comments?

the BUG is:
[  141.667487] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[  141.667782] IP: [<ffffffff810b8a64>] cache_reap+0x71/0x236
[  141.667969] PGD 0
[  141.668129] Oops: 0000 [#1] SMP
[  141.668357] last sysfs file: /sys/class/scsi_host/host4/proc_name
[  141.668469] CPU
[  141.668630] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc 
dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc 
lp parport joydev usbhid sr_mod cdrom thermal processor thermal_sys container button rtc_cmos 
rtc_core rtc_lib i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
[  141.671659] Pid: 126, comm: events/27 Not tainted 2.6.32 #9  Server
[  141.671771] RIP: 0010:[<ffffffff810b8a64>]  [<ffffffff810b8a64>] cache_reap+0x71/0x236
[  141.671981] RSP: 0018:ffff88027e81bdf0  EFLAGS: 00010206
[  141.672089] RAX: 0000000000000002 RBX: 0000000000000078 RCX: ffff88047d86e580
[  141.672204] RDX: ffff88047dfcbc00 RSI: ffff88047f13f6c0 RDI: ffff88047d9136c0
[  141.672319] RBP: ffff88027e81be30 R08: 0000000000000001 R09: 0000000000000001
[  141.672433] R10: 0000000000000000 R11: 0000000000000086 R12: ffff88047d87c200
[  141.672548] R13: ffff88047d87d680 R14: ffffffff810b89f3 R15: 0000000000000002
[  141.672663] FS:  0000000000000000(0000) GS:ffff88028b5a0000(0000) knlGS:0000000000000000
[  141.672807] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  141.672917] CR2: 0000000000000078 CR3: 0000000001001000 CR4: 00000000000006e0
[  141.673032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  141.673147] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  141.673262] Process events/27 (pid: 126, threadinfo ffff88027e81a000, task ffff88027f3ea040)
[  141.673406] Stack:
[  141.673503]  ffff88027e81be30 ffff88028b5b05a0 0000000100000000 ffff88027e81be80
[  141.673808] <0> ffff88028b5b5b40 ffff88028b5b05a0 ffffffff810b89f3 fffffffff00000c6
[  141.674265] <0> ffff88027e81bec0 ffffffff81057394 ffffffff8105733e ffffffff81369f3a
[  141.674813] Call Trace:
[  141.674915]  [<ffffffff810b89f3>] ? cache_reap+0x0/0x236
[  141.675028]  [<ffffffff81057394>] worker_thread+0x17a/0x27b
[  141.675138]  [<ffffffff8105733e>] ? worker_thread+0x124/0x27b
[  141.675256]  [<ffffffff81369f3a>] ? thread_return+0x3e/0xee
[  141.675369]  [<ffffffff8105a244>] ? autoremove_wake_function+0x0/0x38
[  141.675482]  [<ffffffff8105721a>] ? worker_thread+0x0/0x27b
[  141.675593]  [<ffffffff8105a146>] kthread+0x7d/0x87
[  141.675707]  [<ffffffff81012daa>] child_rip+0xa/0x20
[  141.675817]  [<ffffffff81012710>] ? restore_args+0x0/0x30
[  141.675927]  [<ffffffff8105a0c9>] ? kthread+0x0/0x87
[  141.676035]  [<ffffffff81012da0>] ? child_rip+0x0/0x20
[  141.676142] Code: a4 c5 68 08 00 00 65 48 8b 04 25 00 e4 00 00 48 8b 04 18 49 8b 4c 24 78 48 85 
c9 74 5b 41 89 c7 48 98 48 8b 1c c1 48 85 db 74 4d <83> 3b 00 74 48 48 83 3d ff d4 65 00 00 75 04 0f 
0b eb fe fa 66
[  141.680610] RIP  [<ffffffff810b8a64>] cache_reap+0x71/0x236
[  141.680785]  RSP <ffff88027e81bdf0>
[  141.680886] CR2: 0000000000000078
[  141.681016] ---[ end trace b1e17069ef81fe83 ]--

Thanks.

-haicheng

Haicheng Li wrote:
> This is to fix the bug discussed in email thread: 
> http://patchwork.kernel.org/patch/69499/.
> 
> Currently node_possible_map won't include the offlined node that has 
> neither CPU onlined nor MEM onlined at booting time. As a result, 
> nr_node_ids won't be equal to possible nodes.
> 
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: H. Peter Anvin <hpa@zytor.com>
> CC: Andi Kleen <andi@firstfloor.org>
> Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
> ---
>  arch/x86/mm/srat_64.c |   10 ++--------
>  1 files changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> index a271241..a5bc297 100644
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -238,7 +238,7 @@ update_nodes_add(int node, unsigned long start, 
> unsigned long end)
>  void __init
>  acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
>  {
> -    struct bootnode *nd, oldnode;
> +    struct bootnode *nd;
>      unsigned long start, end;
>      int node, pxm;
>      int i;
> @@ -277,7 +277,6 @@ acpi_numa_memory_affinity_init(struct 
> acpi_srat_mem_affinity *ma)
>          return;
>      }
>      nd = &nodes[node];
> -    oldnode = *nd;
>      if (!node_test_and_set(node, nodes_parsed)) {
>          nd->start = start;
>          nd->end = end;
> @@ -291,13 +290,8 @@ acpi_numa_memory_affinity_init(struct 
> acpi_srat_mem_affinity *ma)
>      printk(KERN_INFO "SRAT: Node %u PXM %u %lx-%lx\n", node, pxm,
>             start, end);
> 
> -    if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> +    if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
>          update_nodes_add(node, start, end);
> -        /* restore nodes[node] */
> -        *nd = oldnode;
> -        if ((nd->start | nd->end) == 0)
> -            node_clear(node, nodes_parsed);
> -    }
> 
>      node_memblk_range[num_node_memblks].start = start;
>      node_memblk_range[num_node_memblks].end = end;
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
  2010-01-15  7:42 [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI Haicheng Li
  2010-01-17  2:22 ` Haicheng Li
@ 2010-01-17 21:53 ` David Rientjes
  2010-01-18  6:30   ` Yinghai Lu
  1 sibling, 1 reply; 26+ messages in thread
From: David Rientjes @ 2010-01-17 21:53 UTC (permalink / raw)
  To: Haicheng Li
  Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86, Andi Kleen,
	linux-kernel

On Fri, 15 Jan 2010, Haicheng Li wrote:

> This is to fix the bug discussed in email thread:
> http://patchwork.kernel.org/patch/69499/.
> 

It would be better to include a brief synopsis of the problem in the 
changelog instead of a citation, otherwise it obfuscates the issue it 
solves.

> Currently node_possible_map won't include the offlined node that has neither
> CPU onlined nor MEM onlined at booting time. As a result, nr_node_ids won't be
> equal to possible nodes.
> 
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: H. Peter Anvin <hpa@zytor.com>
> CC: Andi Kleen <andi@firstfloor.org>
> Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
> ---
>  arch/x86/mm/srat_64.c |   10 ++--------
>  1 files changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> index a271241..a5bc297 100644
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -238,7 +238,7 @@ update_nodes_add(int node, unsigned long start, unsigned
> long end)
>  void __init
>  acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
>  {
> -	struct bootnode *nd, oldnode;
> +	struct bootnode *nd;
>  	unsigned long start, end;
>  	int node, pxm;
>  	int i;
> @@ -277,7 +277,6 @@ acpi_numa_memory_affinity_init(struct
> acpi_srat_mem_affinity *ma)
>  		return;
>  	}
>  	nd = &nodes[node];
> -	oldnode = *nd;
>  	if (!node_test_and_set(node, nodes_parsed)) {
>  		nd->start = start;
>  		nd->end = end;
> @@ -291,13 +290,8 @@ acpi_numa_memory_affinity_init(struct
> acpi_srat_mem_affinity *ma)
>  	printk(KERN_INFO "SRAT: Node %u PXM %u %lx-%lx\n", node, pxm,
>  	       start, end);
> 
> -	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> +	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
>  		update_nodes_add(node, start, end);
> -		/* restore nodes[node] */
> -		*nd = oldnode;
> -		if ((nd->start | nd->end) == 0)
> -			node_clear(node, nodes_parsed);
> -	}
> 
>  	node_memblk_range[num_node_memblks].start = start;
>  	node_memblk_range[num_node_memblks].end = end;

There're a couple of issues with this patch:

 - it breaks CONFIG_MEMORY_HOTPLUG_SPARSE kernels when the SRAT reports 
   more than one entry for a single node id and the later entry does not
   specify an address range, and

 - carrying the bit for the node over in nodes_parsed is inappropriate
   since other x86 code depends on that map including only nodes with
   memory such as e820_register_active_regions() and nodes_cover_memory().

    [ The existing naming is admittedly convoluted since nodes_parsed
      represents nodes with memory and cpu_nodes_parsed represents nodes
      with only cpus; in actuality, they should probably be
      mem_nodes_parsed and no_mem_nodes_parsed, respectively, since the
      latter will now include hotpluggable nodes. ]

To fix both of these issues simultaneously, I think it would be better to 
set the bit in no_mem_nodes_parsed so that it gets unioned in 
acpi_scan_nodes() when nodes_possible_map is actually formed.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes  detected by ACPI.
  2010-01-17 21:53 ` David Rientjes
@ 2010-01-18  6:30   ` Yinghai Lu
  2010-01-18 10:43     ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Yinghai Lu @ 2010-01-18  6:30 UTC (permalink / raw)
  To: David Rientjes
  Cc: Haicheng Li, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On Sun, Jan 17, 2010 at 1:53 PM, David Rientjes <rientjes@google.com> wrote:
> On Fri, 15 Jan 2010, Haicheng Li wrote:
>
>> This is to fix the bug discussed in email thread:
>> http://patchwork.kernel.org/patch/69499/.
>>
>
> It would be better to include a brief synopsis of the problem in the
> changelog instead of a citation, otherwise it obfuscates the issue it
> solves.
>
>> Currently node_possible_map won't include the offlined node that has neither
>> CPU onlined nor MEM onlined at booting time. As a result, nr_node_ids won't be
>> equal to possible nodes.
>>
>> CC: Thomas Gleixner <tglx@linutronix.de>
>> CC: Ingo Molnar <mingo@redhat.com>
>> CC: H. Peter Anvin <hpa@zytor.com>
>> CC: Andi Kleen <andi@firstfloor.org>
>> Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
>> ---
>>  arch/x86/mm/srat_64.c |   10 ++--------
>>  1 files changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
>> index a271241..a5bc297 100644
>> --- a/arch/x86/mm/srat_64.c
>> +++ b/arch/x86/mm/srat_64.c
>> @@ -238,7 +238,7 @@ update_nodes_add(int node, unsigned long start, unsigned
>> long end)
>>  void __init
>>  acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
>>  {
>> -     struct bootnode *nd, oldnode;
>> +     struct bootnode *nd;
>>       unsigned long start, end;
>>       int node, pxm;
>>       int i;
>> @@ -277,7 +277,6 @@ acpi_numa_memory_affinity_init(struct
>> acpi_srat_mem_affinity *ma)
>>               return;
>>       }
>>       nd = &nodes[node];
>> -     oldnode = *nd;
>>       if (!node_test_and_set(node, nodes_parsed)) {
>>               nd->start = start;
>>               nd->end = end;
>> @@ -291,13 +290,8 @@ acpi_numa_memory_affinity_init(struct
>> acpi_srat_mem_affinity *ma)
>>       printk(KERN_INFO "SRAT: Node %u PXM %u %lx-%lx\n", node, pxm,
>>              start, end);
>>
>> -     if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>> +     if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
>>               update_nodes_add(node, start, end);
>> -             /* restore nodes[node] */
>> -             *nd = oldnode;
>> -             if ((nd->start | nd->end) == 0)
>> -                     node_clear(node, nodes_parsed);
>> -     }
>>
>>       node_memblk_range[num_node_memblks].start = start;
>>       node_memblk_range[num_node_memblks].end = end;
>
> There're a couple of issues with this patch:
>
>  - it breaks CONFIG_MEMORY_HOTPLUG_SPARSE kernels when the SRAT reports
>   more than one entry for a single node id and the later entry does not
>   specify an address range, and

not sure this one.

>
>  - carrying the bit for the node over in nodes_parsed is inappropriate
>   since other x86 code depends on that map including only nodes with
>   memory such as e820_register_active_regions() and nodes_cover_memory().

assume if e820 is right, even srat table have a range for hot plug
aka the e820 have hole already. then e820_register_active_regions()
and nodes_cover_memory
should be ok even bootnode have some pre reserved range.

YH

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
  2010-01-18  6:30   ` Yinghai Lu
@ 2010-01-18 10:43     ` David Rientjes
  2010-01-19 11:08       ` Haicheng Li
  0 siblings, 1 reply; 26+ messages in thread
From: David Rientjes @ 2010-01-18 10:43 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haicheng Li, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On Sun, 17 Jan 2010, Yinghai Lu wrote:

> > There're a couple of issues with this patch:
> >
> > - it breaks CONFIG_MEMORY_HOTPLUG_SPARSE kernels when the SRAT reports
> >  more than one entry for a single node id and the later entry does not
> >  specify an address range, and
> 
> not sure this one.
> 

The old code would preserve the address range from the preceding SRAT 
entry for that node to pass to e820_register_active_regions() when 
CONFIG_MEMORY_HOTPLUG_SPARSE is enabled, this patch incorrectly clears 
that data.

> > - carrying the bit for the node over in nodes_parsed is inappropriate
> >  since other x86 code depends on that map including only nodes with
> >  memory such as e820_register_active_regions() and nodes_cover_memory().
> 
> assume if e820 is right, even srat table have a range for hot plug
> aka the e820 have hole already. then e820_register_active_regions()
> and nodes_cover_memory
> should be ok even bootnode have some pre reserved range.
> 

cpu_nodes_parsed handles nodes without memory, there's no reason why a bit 
should be set in nodes_parsed if its corresponding node does not have a 
valid address range.  We have a reasonable expectation that nodes_parsed 
represents memory nodes given its use for e820_register_active_regions() 
and nodes_cover_memory() as well as acpi_get_nodes() for NUMA emulation, 
for example, which would be broken with this patch.  See dc0985519.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
  2010-01-18 10:43     ` David Rientjes
@ 2010-01-19 11:08       ` Haicheng Li
  2010-01-19 11:29         ` Haicheng Li
  2010-01-19 23:30         ` David Rientjes
  0 siblings, 2 replies; 26+ messages in thread
From: Haicheng Li @ 2010-01-19 11:08 UTC (permalink / raw)
  To: David Rientjes
  Cc: Yinghai Lu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

Dave & Yinghai,

Thanks for the review, and see my reply below.

David Rientjes wrote:
> On Sun, 17 Jan 2010, Yinghai Lu wrote:
> 
>>> There're a couple of issues with this patch:
>>>
>>> - it breaks CONFIG_MEMORY_HOTPLUG_SPARSE kernels when the SRAT reports
>>>  more than one entry for a single node id and the later entry does not
>>>  specify an address range, and
>> not sure this one.
>>
> The old code would preserve the address range from the preceding SRAT 
> entry for that node to pass to e820_register_active_regions() when 
> CONFIG_MEMORY_HOTPLUG_SPARSE is enabled, this patch incorrectly clears 
> that data.

To fully understand what could happen with old code, I write a debug patch like following:
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index dbb5381..f7118d1 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -290,6 +290,8 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)

  	printk(KERN_INFO "SRAT: Node %u PXM %u %lx-%lx\n", node, pxm,
  	       start, end);
+	printk(KERN_INFO "SRAT: Node %u oldnode: %lx-%lx\n", node,
+	       oldnode.start, oldnode.end);
  	e820_register_active_regions(node, start >> PAGE_SHIFT,
  				     end >> PAGE_SHIFT);

@@ -297,8 +299,10 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
  		update_nodes_add(node, start, end);
  		/* restore nodes[node] */
  		*nd = oldnode;
-		if ((nd->start | nd->end) == 0)
+		if ((nd->start | nd->end) == 0) {
  			node_clear(node, nodes_parsed);
+			printk("node-clear: %u\n", node);
+		}
  	}

  	node_memblk_range[num_node_memblks].start = start;


Then on a system with Node0 on, but Node1 off (both cpu and memory are off), the booting log is like:
[    0.000000] SRAT: Node 0 PXM 0 0-80000000
[    0.000000] SRAT: Node 0 oldnode: 0-0
[    0.000000] SRAT: Node 0 PXM 0 100000000-280000000
[    0.000000] SRAT: Node 0 oldnode: 0-80000000
[    0.000000] SRAT: Node 0 PXM 0 300000000-2300000000
[    0.000000] SRAT: Node 0 oldnode: 0-280000000
[    0.000000] SRAT: hot plug zone found 300000000 - 2300000000
[    0.000000] SRAT: Node 0 PXM 0 2300000000-4300000000
[    0.000000] SRAT: Node 0 oldnode: 0-280000000
[    0.000000] SRAT: hot plug zone found 300000000 - 4300000000

[    0.000000] SRAT: Node 1 PXM 1 4300000000-6300000000
[    0.000000] SRAT: Node 1 oldnode: 0-0
[    0.000000] SRAT: hot plug zone found 4300000000 - 6300000000
[    0.000000] node-clear: 1
[    0.000000] SRAT: Node 1 PXM 1 6300000000-8300000000
[    0.000000] SRAT: Node 1 oldnode: 0-0
[    0.000000] SRAT: hot plug zone found 4300000000 - 8300000000
[    0.000000] node-clear: 1

David, per my understanding, your concern should be like, with this fix, if 3rd or 4th entry of 
Node0 has no address range, then Node0 won't be recoverd with oldnode and won't be cleared in 
nodes_parsed. But how is it handled by old code?

- it recovers node with oldnode as long as current entry is HOT_PLUGGABLE. so it handles the recover 
issue. but I think following patch can simply fix it as well.

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index dbb5381..fdf067f 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -281,7 +281,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
  	if (!node_test_and_set(node, nodes_parsed)) {
  		nd->start = start;
  		nd->end = end;
-	} else {
+	} else if ((nd->start | nd->end) != 0) {
  		if (start < nd->start)
  			nd->start = start;
  		if (nd->end < end)

- as Node1 log shows, node_clear(node, nodes_parsed) can be executed only when either the 1st entry 
of the node is HOT_PLUGGABLE or all preceding entries have no address range plus current entry 
itself is HOT_PLUGGABLE. as a result, even Node1 has a valid address range, since its 1st entry is 
HOT_PLUGGABLE, the old code still recovers it with initial node value (i.e. zeroed) as well as 
clears Node1 from nodes_parsed. Isn't it a buggy logic?

>>> - carrying the bit for the node over in nodes_parsed is inappropriate
>>>  since other x86 code depends on that map including only nodes with
>>>  memory such as e820_register_active_regions() and nodes_cover_memory().
>> assume if e820 is right, even srat table have a range for hot plug
>> aka the e820 have hole already. then e820_register_active_regions()
>> and nodes_cover_memory
>> should be ok even bootnode have some pre reserved range.
>>
> 
> cpu_nodes_parsed handles nodes without memory, there's no reason why a bit 
> should be set in nodes_parsed if its corresponding node does not have a 
> valid address range.  

For a node has _NOT_ either CPU or Memory like Node1, cpu_nodes_parsed cannot handle it.

> We have a reasonable expectation that nodes_parsed 
> represents memory nodes given its use for e820_register_active_regions() 
> and nodes_cover_memory() as well as acpi_get_nodes() for NUMA emulation, 
> for example, which would be broken with this patch.  See dc0985519.
> 

either nodes_cover_memory() or e820_register_active_regions() or acpi_get_nodes(), they all have 
node-addr-range check code, if the node-addr-range is invalid, they won't be harmed.

What should nodes_parsed be? I think it's reasonable to include all the nodes reported by SRAT 
Memory Affinity Structure. Besides, also needs above small fix for recovery issue.

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
  2010-01-19 11:08       ` Haicheng Li
@ 2010-01-19 11:29         ` Haicheng Li
  2010-01-19 23:30         ` David Rientjes
  1 sibling, 0 replies; 26+ messages in thread
From: Haicheng Li @ 2010-01-19 11:29 UTC (permalink / raw)
  To: David Rientjes
  Cc: Yinghai Lu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

Haicheng Li wrote:
> - it recovers node with oldnode as long as current entry is 
> HOT_PLUGGABLE. so it handles the recover issue. but I think following 
> patch can simply fix it as well.
> 
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> index dbb5381..fdf067f 100644
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -281,7 +281,7 @@ acpi_numa_memory_affinity_init(struct 
> acpi_srat_mem_affinity *ma)
>      if (!node_test_and_set(node, nodes_parsed)) {
>          nd->start = start;
>          nd->end = end;
> -    } else {
> +    } else if ((nd->start | nd->end) != 0) {
oops, typo here, should be:
+    } else if ((start | end) != 0) {

>          if (start < nd->start)
>              nd->start = start;
>          if (nd->end < end)
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
  2010-01-19 11:08       ` Haicheng Li
  2010-01-19 11:29         ` Haicheng Li
@ 2010-01-19 23:30         ` David Rientjes
  2010-01-20 16:40           ` Haicheng Li
  1 sibling, 1 reply; 26+ messages in thread
From: David Rientjes @ 2010-01-19 23:30 UTC (permalink / raw)
  To: Haicheng Li
  Cc: Yinghai Lu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On Tue, 19 Jan 2010, Haicheng Li wrote:

> > The old code would preserve the address range from the preceding SRAT entry
> > for that node to pass to e820_register_active_regions() when
> > CONFIG_MEMORY_HOTPLUG_SPARSE is enabled, this patch incorrectly clears that
> > data.
> 
> To fully understand what could happen with old code, I write a debug patch
> like following:
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> index dbb5381..f7118d1 100644
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -290,6 +290,8 @@ acpi_numa_memory_affinity_init(struct
> acpi_srat_mem_affinity *ma)
> 
>  	printk(KERN_INFO "SRAT: Node %u PXM %u %lx-%lx\n", node, pxm,
>  	       start, end);
> +	printk(KERN_INFO "SRAT: Node %u oldnode: %lx-%lx\n", node,
> +	       oldnode.start, oldnode.end);
>  	e820_register_active_regions(node, start >> PAGE_SHIFT,
>  				     end >> PAGE_SHIFT);
> 
> @@ -297,8 +299,10 @@ acpi_numa_memory_affinity_init(struct
> acpi_srat_mem_affinity *ma)
>  		update_nodes_add(node, start, end);
>  		/* restore nodes[node] */
>  		*nd = oldnode;
> -		if ((nd->start | nd->end) == 0)
> +		if ((nd->start | nd->end) == 0) {
>  			node_clear(node, nodes_parsed);
> +			printk("node-clear: %u\n", node);
> +		}
>  	}
> 
>  	node_memblk_range[num_node_memblks].start = start;
> 
> 
> Then on a system with Node0 on, but Node1 off (both cpu and memory are off),
> the booting log is like:
> [    0.000000] SRAT: Node 0 PXM 0 0-80000000
> [    0.000000] SRAT: Node 0 oldnode: 0-0
> [    0.000000] SRAT: Node 0 PXM 0 100000000-280000000
> [    0.000000] SRAT: Node 0 oldnode: 0-80000000
> [    0.000000] SRAT: Node 0 PXM 0 300000000-2300000000
> [    0.000000] SRAT: Node 0 oldnode: 0-280000000
> [    0.000000] SRAT: hot plug zone found 300000000 - 2300000000
> [    0.000000] SRAT: Node 0 PXM 0 2300000000-4300000000
> [    0.000000] SRAT: Node 0 oldnode: 0-280000000
> [    0.000000] SRAT: hot plug zone found 300000000 - 4300000000
> 
> [    0.000000] SRAT: Node 1 PXM 1 4300000000-6300000000
> [    0.000000] SRAT: Node 1 oldnode: 0-0
> [    0.000000] SRAT: hot plug zone found 4300000000 - 6300000000
> [    0.000000] node-clear: 1
> [    0.000000] SRAT: Node 1 PXM 1 6300000000-8300000000
> [    0.000000] SRAT: Node 1 oldnode: 0-0
> [    0.000000] SRAT: hot plug zone found 4300000000 - 8300000000
> [    0.000000] node-clear: 1
> 
> David, per my understanding, your concern should be like, with this fix, if
> 3rd or 4th entry of Node0 has no address range, then Node0 won't be recoverd
> with oldnode and won't be cleared in nodes_parsed. But how is it handled by
> old code?
> 

It's not evident with your machine because you do not have two SRAT 
entries for the same node id, one without ACPI_SRAT_MEM_HOT_PLUGGABLE and 
other with ACPI_SRAT_MEM_HOT_PLUGGABLE.

The old code would preserve the address range for the former in oldnode 
and then reset its data in the struct bootnode since nodes_parsed has a 
bit set for that node.  That's needed by later code that I've mentioned: 
acpi_get_nodes(), specifically, which breaks with your patch in addition 
to nodes_cover_memory() and e820_register_active_regions().

Only when the previous oldnode entry does not have a valid address range, 
meaning it is [0, 0), does the bit get cleared in nodes_parsed.

> - it recovers node with oldnode as long as current entry is HOT_PLUGGABLE. so
> it handles the recover issue. but I think following patch can simply fix it as
> well.
> 

If it's not ACPI_SRAT_MEM_HOT_PLUGGABLE, we know the address range is 
already valid given the sanity checks that it has successfully passed 
through in acpi_numa_memory_affinity_init(), so we require no further 
checking.  However, your patch will not reset the previous address range 
when a ACPI_SRAT_MEM_HOT_PLUGGABLE entry is found for the same address 
range and you're leaving the bit set in nodes_parsed.

> > cpu_nodes_parsed handles nodes without memory, there's no reason why a bit
> > should be set in nodes_parsed if its corresponding node does not have a
> > valid address range.  
> 
> For a node has _NOT_ either CPU or Memory like Node1, cpu_nodes_parsed cannot
> handle it.
> 

It most certainly can since its sole purpose is to include memoryless 
nodes in node_possible_map.  It has no other use case that would break as 
the result of adding hotpluggable nodes, hence the reason I suggested 
renaming it no_mem_nodes_parsed.

> > We have a reasonable expectation that nodes_parsed represents memory nodes
> > given its use for e820_register_active_regions() and nodes_cover_memory() as
> > well as acpi_get_nodes() for NUMA emulation, for example, which would be
> > broken with this patch.  See dc0985519.
> > 
> 
> either nodes_cover_memory() or e820_register_active_regions() or
> acpi_get_nodes(), they all have node-addr-range check code, if the
> node-addr-range is invalid, they won't be harmed.
> 

Wrong, acpi_get_nodes() does not have such a check it only iterates over 
nodes_parsed.  In other words, you'd be starting a new requirement for 
nodes_parsed with your patch: it would now be necessary to check for a 
valid (non-zero) address range for each set bit.  Instead, I'm suggesting 
the nodes_parsed represents only nodes with valid memory, which is a 
reasonable expectation given the semantics of both it and cpu_nodes_parsed 
to handle their memoryless counterparts.

In other words, the following should easily fix the issue without breaking 
the existing logic that preserves the old address range for node ids that 
have SRAT entries both with and without ACPI_SRAT_MEM_HOT_PLUGGABLE.  
Could you give it a try?
---
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -229,9 +229,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
 			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
 	}
 
-	if (changed)
+	if (changed) {
 		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
 				 nd->start, nd->end);
+		node_set(node, cpu_nodes_parsed);
+	}
 }
 
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI.
  2010-01-19 23:30         ` David Rientjes
@ 2010-01-20 16:40           ` Haicheng Li
  2010-01-20 20:10             ` [patch] x86: set hotpluggable nodes in nodes_possible_map David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Haicheng Li @ 2010-01-20 16:40 UTC (permalink / raw)
  To: David Rientjes
  Cc: Yinghai Lu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

David Rientjes wrote:
> On Tue, 19 Jan 2010, Haicheng Li wrote:
>> David, per my understanding, your concern should be like, with this fix, if
>> 3rd or 4th entry of Node0 has no address range, then Node0 won't be recoverd
>> with oldnode and won't be cleared in nodes_parsed. But how is it handled by
>> old code?
>>
>
> It's not evident with your machine because you do not have two SRAT
> entries for the same node id, one without ACPI_SRAT_MEM_HOT_PLUGGABLE and
> other with ACPI_SRAT_MEM_HOT_PLUGGABLE.
>
> The old code would preserve the address range for the former in oldnode
> and then reset its data in the struct bootnode since nodes_parsed has a
> bit set for that node.  That's needed by later code that I've mentioned:
> acpi_get_nodes(), specifically, which breaks with your patch in addition
> to nodes_cover_memory() and e820_register_active_regions().
>
> Only when the previous oldnode entry does not have a valid address range,
> meaning it is [0, 0), does the bit get cleared in nodes_parsed.

Understood, the old code is meant to make nodes_parsed _NEVER_ include the node whose memory regions
are all hotpluggable.

>> - it recovers node with oldnode as long as current entry is HOT_PLUGGABLE. so
>> it handles the recover issue. but I think following patch can simply fix it as
>> well.
>>
>
> If it's not ACPI_SRAT_MEM_HOT_PLUGGABLE, we know the address range is
> already valid given the sanity checks that it has successfully passed
> through in acpi_numa_memory_affinity_init(), so we require no further
> checking.  However, your patch will not reset the previous address range
> when a ACPI_SRAT_MEM_HOT_PLUGGABLE entry is found for the same address
> range and you're leaving the bit set in nodes_parsed.

I see. the precondition is that nodes_parsed should not include such hotpluggable node, then such
data of hotpluggable mem should be kept in nodes_add[] other than in nodes[].

>>> cpu_nodes_parsed handles nodes without memory, there's no reason why a bit
>>> should be set in nodes_parsed if its corresponding node does not have a
>>> valid address range.
>> For a node has _NOT_ either CPU or Memory like Node1, cpu_nodes_parsed cannot
>> handle it.
>>
>
> It most certainly can since its sole purpose is to include memoryless
> nodes in node_possible_map.  It has no other use case that would break as
> the result of adding hotpluggable nodes, hence the reason I suggested
> renaming it no_mem_nodes_parsed.

Yeah, so the key point is who should keep hotpluggable nodes, cpu_nodes_parsed or nodes_parsed?
Actually now I agree with you on this, let cpu_nodes_parsed keep hotpluggable nodes since it won't
break any old code. Originally my patch wanna let nodes_parsed keep hotpluggable nodes, which would
make things complex.

but name "no_mem_nodes_parsed" seems convoluted too because (from code logic) this nodemask is
usually based on CPU/APIC Affinity Structure.
How about rename cpu_nodes_parsed as "rest_nodes_parsed" (comparing with "mem_nodes_parsed), since
it handles
   - nodes with CPU on
   - nodes with hotpluggable memory region
?

>>> We have a reasonable expectation that nodes_parsed represents memory nodes
>>> given its use for e820_register_active_regions() and nodes_cover_memory() as
>>> well as acpi_get_nodes() for NUMA emulation, for example, which would be
>>> broken with this patch.  See dc0985519.
>>>
>> either nodes_cover_memory() or e820_register_active_regions() or
>> acpi_get_nodes(), they all have node-addr-range check code, if the
>> node-addr-range is invalid, they won't be harmed.
>>
>
> Wrong, acpi_get_nodes() does not have such a check it only iterates over
> nodes_parsed.  In other words, you'd be starting a new requirement for
> nodes_parsed with your patch: it would now be necessary to check for a
> valid (non-zero) address range for each set bit.  Instead, I'm suggesting
> the nodes_parsed represents only nodes with valid memory, which is a
> reasonable expectation given the semantics of both it and cpu_nodes_parsed
> to handle their memoryless counterparts.

agreed. In term of this, using nodes_parsed to represent only nodes with valid memory can make
things simple.

> In other words, the following should easily fix the issue without breaking
> the existing logic that preserves the old address range for node ids that
> have SRAT entries both with and without ACPI_SRAT_MEM_HOT_PLUGGABLE.
> Could you give it a try?

of course, it fixes the issue because node_possible_map now includes hotpluggable node, and then
nr_node_ids becomes equal to maximum of possible nodes on the motherboard;).

let's add more changes to fix naming issue as well since it's too confusing for people to understand
the code logic. how about below patch?
---
  arch/x86/mm/srat_64.c |   39 +++++++++++++++++++++++++--------------
  1 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index a271241..aebbdd4 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -27,8 +27,17 @@ int acpi_numa __initdata;

  static struct acpi_table_slit *acpi_slit;

-static nodemask_t nodes_parsed __initdata;
-static nodemask_t cpu_nodes_parsed __initdata;
+/* mem_nodes_parsed:
+ *  - nodes with memory on
+ *
+ * rest_nodes_parsed:
+ *  - nodes with CPU on
+ *  - nodes with hotpluggable memory region
+ *
+ * We union these two nodemasks to get node_possible_map.
+ */
+static nodemask_t mem_nodes_parsed __initdata;
+static nodemask_t rest_nodes_parsed __initdata;
  static struct bootnode nodes[MAX_NUMNODES] __initdata;
  static struct bootnode nodes_add[MAX_NUMNODES];

@@ -134,7 +143,7 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)

  	apic_id = pa->apic_id;
  	apicid_to_node[apic_id] = node;
-	node_set(node, cpu_nodes_parsed);
+	node_set(node, rest_nodes_parsed);
  	acpi_numa = 1;
  	printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%04x -> Node %u\n",
  	       pxm, apic_id, node);
@@ -168,7 +177,7 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
  	else
  		apic_id = pa->apic_id;
  	apicid_to_node[apic_id] = node;
-	node_set(node, cpu_nodes_parsed);
+	node_set(node, rest_nodes_parsed);
  	acpi_numa = 1;
  	printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%02x -> Node %u\n",
  	       pxm, apic_id, node);
@@ -229,9 +238,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
  			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
  	}

-	if (changed)
+	if (changed) {
+		node_set(node, rest_nodes_parsed);
  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
  				 nd->start, nd->end);
+	}
  }

  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
@@ -278,7 +289,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
  	}
  	nd = &nodes[node];
  	oldnode = *nd;
-	if (!node_test_and_set(node, nodes_parsed)) {
+	if (!node_test_and_set(node, mem_nodes_parsed)) {
  		nd->start = start;
  		nd->end = end;
  	} else {
@@ -296,7 +307,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
  		/* restore nodes[node] */
  		*nd = oldnode;
  		if ((nd->start | nd->end) == 0)
-			node_clear(node, nodes_parsed);
+			node_clear(node, mem_nodes_parsed);
  	}

  	node_memblk_range[num_node_memblks].start = start;
@@ -313,7 +324,7 @@ static int __init nodes_cover_memory(const struct bootnode *nodes)
  	unsigned long pxmram, e820ram;

  	pxmram = 0;
-	for_each_node_mask(i, nodes_parsed) {
+	for_each_node_mask(i, mem_nodes_parsed) {
  		unsigned long s = nodes[i].start >> PAGE_SHIFT;
  		unsigned long e = nodes[i].end >> PAGE_SHIFT;
  		pxmram += e - s;
@@ -341,7 +352,7 @@ int __init acpi_get_nodes(struct bootnode *physnodes)
  	int i;
  	int ret = 0;

-	for_each_node_mask(i, nodes_parsed) {
+	for_each_node_mask(i, mem_nodes_parsed) {
  		physnodes[ret].start = nodes[i].start;
  		physnodes[ret].end = nodes[i].end;
  		ret++;
@@ -370,7 +381,7 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end)
  		return -1;
  	}

-	for_each_node_mask(i, nodes_parsed)
+	for_each_node_mask(i, mem_nodes_parsed)
  		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
  						nodes[i].end >> PAGE_SHIFT);
  	/* for out of order entries in SRAT */
@@ -381,7 +392,7 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end)
  	}

  	/* Account for nodes with cpus and no memory */
-	nodes_or(node_possible_map, nodes_parsed, cpu_nodes_parsed);
+	nodes_or(node_possible_map, mem_nodes_parsed, rest_nodes_parsed);

  	/* Finally register nodes */
  	for_each_node_mask(i, node_possible_map)
@@ -416,7 +427,7 @@ static int __init find_node_by_addr(unsigned long addr)
  	int ret = NUMA_NO_NODE;
  	int i;

-	for_each_node_mask(i, nodes_parsed) {
+	for_each_node_mask(i, mem_nodes_parsed) {
  		/*
  		 * Find the real node that this emulated node appears on.  For
  		 * the sake of simplicity, we only use a real node's starting
@@ -466,10 +477,10 @@ void __init acpi_fake_nodes(const struct bootnode *fake_nodes, int num_nodes)
  		__acpi_map_pxm_to_node(fake_node_to_pxm_map[i], i);
  	memcpy(apicid_to_node, fake_apicid_to_node, sizeof(apicid_to_node));

-	nodes_clear(nodes_parsed);
+	nodes_clear(mem_nodes_parsed);
  	for (i = 0; i < num_nodes; i++)
  		if (fake_nodes[i].start != fake_nodes[i].end)
-			node_set(i, nodes_parsed);
+			node_set(i, mem_nodes_parsed);
  }

  static int null_slit_node_compare(int a, int b)
-- 
1.5.4.4



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-20 16:40           ` Haicheng Li
@ 2010-01-20 20:10             ` David Rientjes
  2010-01-20 22:45               ` Yinghai Lu
                                 ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: David Rientjes @ 2010-01-20 20:10 UTC (permalink / raw)
  To: Haicheng Li, Ingo Molnar
  Cc: Yinghai Lu, H. Peter Anvin, Thomas Gleixner, x86, Andi Kleen,
	linux-kernel

On Thu, 21 Jan 2010, Haicheng Li wrote:

> let's add more changes to fix naming issue as well since it's too confusing
> for people to understand
> the code logic. how about below patch?

That should be a seperate change; there's a bugfix here (my patch) and 
then a cleanup patch that you could make incrementally on mine.  I don't 
personally like the name "rest_nodes_parsed" since it's poor English, I 
suggest renaming nodes_parsed to mem_nodes_parsed as I originally asked 
and then cpu_nodes_parsed to acpi_nodes_parsed or something similiar.

Ingo, the following is my bugfix patch that addresses the issue at 
http://patchwork.kernel.org/patch/69499.  Haicheng, can we add your 
tested-by line?



x86: set hotpluggable nodes in nodes_possible_map

nodes_possible_map does not currently include nodes that have SRAT
entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is cleared
in nodes_parsed if it does not have an online address range.

Unequivocally setting the bit in nodes_parsed is insufficient since
existing code, such as acpi_get_nodes(), assumes all nodes in the map
have online address ranges.  In fact, all code using nodes_parsed assumes
such nodes represent an address range of online memory.

nodes_possible_map is created by unioning nodes_parsed and
cpu_nodes_parsed; the former represents nodes with online memory and the
latter represents memoryless nodes.  We now set the bit for hotpluggable
nodes in cpu_nodes_parsed so that it also gets set in nodes_possible_map.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/x86/mm/srat_64.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -229,9 +229,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
 			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
 	}
 
-	if (changed)
+	if (changed) {
+		node_set(node, cpu_nodes_parsed);
 		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
 				 nd->start, nd->end);
+	}
 }
 
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-20 20:10             ` [patch] x86: set hotpluggable nodes in nodes_possible_map David Rientjes
@ 2010-01-20 22:45               ` Yinghai Lu
  2010-01-20 23:32                 ` David Rientjes
  2010-01-21  3:00                 ` Haicheng Li
  2010-01-21  2:58               ` Haicheng Li
                                 ` (2 subsequent siblings)
  3 siblings, 2 replies; 26+ messages in thread
From: Yinghai Lu @ 2010-01-20 22:45 UTC (permalink / raw)
  To: David Rientjes
  Cc: Haicheng Li, Ingo Molnar, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On 01/20/2010 12:10 PM, David Rientjes wrote:
> On Thu, 21 Jan 2010, Haicheng Li wrote:
> 
>> let's add more changes to fix naming issue as well since it's too confusing
>> for people to understand
>> the code logic. how about below patch?
> 
> That should be a seperate change; there's a bugfix here (my patch) and 
> then a cleanup patch that you could make incrementally on mine.  I don't 
> personally like the name "rest_nodes_parsed" since it's poor English, I 
> suggest renaming nodes_parsed to mem_nodes_parsed as I originally asked 
> and then cpu_nodes_parsed to acpi_nodes_parsed or something similiar.
> 
> Ingo, the following is my bugfix patch that addresses the issue at 
> http://patchwork.kernel.org/patch/69499.  Haicheng, can we add your 
> tested-by line?
> 
> 
> 
> x86: set hotpluggable nodes in nodes_possible_map
> 
> nodes_possible_map does not currently include nodes that have SRAT
> entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is cleared
> in nodes_parsed if it does not have an online address range.
> 
> Unequivocally setting the bit in nodes_parsed is insufficient since
> existing code, such as acpi_get_nodes(), assumes all nodes in the map
> have online address ranges.  In fact, all code using nodes_parsed assumes
> such nodes represent an address range of online memory.
> 
> nodes_possible_map is created by unioning nodes_parsed and
> cpu_nodes_parsed; the former represents nodes with online memory and the
> latter represents memoryless nodes.  We now set the bit for hotpluggable
> nodes in cpu_nodes_parsed so that it also gets set in nodes_possible_map.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  arch/x86/mm/srat_64.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -229,9 +229,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
>  			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
>  	}
>  
> -	if (changed)
> +	if (changed) {
> +		node_set(node, cpu_nodes_parsed);
>  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
>  				 nd->start, nd->end);
> +	}
>  }
>  
>  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */

        if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
                update_nodes_add(node, start, end);
                /* restore nodes[node] */
                *nd = oldnode;
                if ((nd->start | nd->end) == 0)
                        node_clear(node, nodes_parsed);
        }

removing clearing with nodes_parsed is not working?

YH


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-20 22:45               ` Yinghai Lu
@ 2010-01-20 23:32                 ` David Rientjes
  2010-01-21  3:00                 ` Haicheng Li
  1 sibling, 0 replies; 26+ messages in thread
From: David Rientjes @ 2010-01-20 23:32 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Haicheng Li, Ingo Molnar, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On Wed, 20 Jan 2010, Yinghai Lu wrote:

> > nodes_possible_map does not currently include nodes that have SRAT
> > entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is cleared
> > in nodes_parsed if it does not have an online address range.
> > 
> > Unequivocally setting the bit in nodes_parsed is insufficient since
> > existing code, such as acpi_get_nodes(), assumes all nodes in the map
> > have online address ranges.  In fact, all code using nodes_parsed assumes
> > such nodes represent an address range of online memory.
> > 
> > nodes_possible_map is created by unioning nodes_parsed and
> > cpu_nodes_parsed; the former represents nodes with online memory and the
> > latter represents memoryless nodes.  We now set the bit for hotpluggable
> > nodes in cpu_nodes_parsed so that it also gets set in nodes_possible_map.
> > 
> > Signed-off-by: David Rientjes <rientjes@google.com>
> > ---
> >  arch/x86/mm/srat_64.c |    4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> > --- a/arch/x86/mm/srat_64.c
> > +++ b/arch/x86/mm/srat_64.c
> > @@ -229,9 +229,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
> >  			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
> >  	}
> >  
> > -	if (changed)
> > +	if (changed) {
> > +		node_set(node, cpu_nodes_parsed);
> >  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
> >  				 nd->start, nd->end);
> > +	}
> >  }
> >  
> >  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> 
>         if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>                 update_nodes_add(node, start, end);
>                 /* restore nodes[node] */
>                 *nd = oldnode;
>                 if ((nd->start | nd->end) == 0)
>                         node_clear(node, nodes_parsed);
>         }
> 
> removing clearing with nodes_parsed is not working?
> 

As previously discussed in this thread, the semantics of nodes_parsed is 
to define nodes that have online address ranges; all code written to use 
nodes_parsed operates on those address ranges.  Requiring it to also 
support memoryless (or hotpluggable) nodes would require a change in 
semantics in other areas such as acpi_get_nodes() and would be contrary to 
the effort in dc098551 to fix this inconsistency.

I believe Haicheng will be posting an incremental patch that changes the 
name of cpu_nodes_parsed to something such as acpi_nodes_parsed to 
indicate it does not necessarily represent memory but rather another ACPI 
definition: either a node including only cpus or hotpluggable ranges.  

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-20 20:10             ` [patch] x86: set hotpluggable nodes in nodes_possible_map David Rientjes
  2010-01-20 22:45               ` Yinghai Lu
@ 2010-01-21  2:58               ` Haicheng Li
  2010-01-21  6:58                 ` David Rientjes
  2010-01-22 11:15               ` [tip:x86/urgent] x86: Set hotpluggable nodes in nodes_possible_map tip-bot for David Rientjes
  2010-01-23  6:51               ` tip-bot for David Rientjes
  3 siblings, 1 reply; 26+ messages in thread
From: Haicheng Li @ 2010-01-21  2:58 UTC (permalink / raw)
  To: David Rientjes
  Cc: Ingo Molnar, Yinghai Lu, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

David Rientjes wrote:
> On Thu, 21 Jan 2010, Haicheng Li wrote:
> 
>> let's add more changes to fix naming issue as well since it's too confusing
>> for people to understand
>> the code logic. how about below patch?
> 
> That should be a seperate change; there's a bugfix here (my patch) and 
> then a cleanup patch that you could make incrementally on mine.  I don't 
> personally like the name "rest_nodes_parsed" since it's poor English, I 
> suggest renaming nodes_parsed to mem_nodes_parsed as I originally asked 
> and then cpu_nodes_parsed to acpi_nodes_parsed or something similiar.

IMHO, name acpi_nodes_parsed is not appropriate for this intention as nodes_parsed comes from acpi too.

Think about it again, it's better _NOT_ to mix up cpu_nodes_parsed and hotplugpable_nodes together.
We cannot assume cpu_nodes_parsed has no other use in future, like possibly cpu hotplug emulation. I 
think that a clean and straightforward way is to keep nodes with hotpluggable range in a separated 
nodemask, like "hp_nodes_parsed", which could also be used for future mem hotplug emulation. Then 
node data corresponding to nodes_parsed is kept in nodes[], node data for hp_nodes_parsed is kept in 
nodes_add[].

> Ingo, the following is my bugfix patch that addresses the issue at 
> http://patchwork.kernel.org/patch/69499.  

Personally I don't like such patch with confusable info.
I think following patch would be more straightfoward and clean:

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index a271241..c5552ae 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -27,8 +27,18 @@ int acpi_numa __initdata;

  static struct acpi_table_slit *acpi_slit;

+/* nodes_parsed:
+ *  - nodes with memory on
+ * cpu_nodes_parsed:
+ *  - nodes with cpu on
+ * hp_nodes_parsed:
+ *  - nodes with hotpluggable memory region
+ *
+ * We union these tree nodemasks to get node_possible_map.
+ */
  static nodemask_t nodes_parsed __initdata;
  static nodemask_t cpu_nodes_parsed __initdata;
+static nodemask_t hp_nodes_parsed __initdata;
  static struct bootnode nodes[MAX_NUMNODES] __initdata;
  static struct bootnode nodes_add[MAX_NUMNODES];

@@ -229,9 +239,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
  			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
  	}

-	if (changed)
+	if (changed) {
+		node_set(node, hp_nodes_parsed);
  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
  				 nd->start, nd->end);
+	}
  }

  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
@@ -380,8 +392,10 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end)
  		return -1;
  	}

-	/* Account for nodes with cpus and no memory */
+	/* Account for nodes with memory and cpus */
  	nodes_or(node_possible_map, nodes_parsed, cpu_nodes_parsed);
+	/* Account for nodes with hotpluggable memory regions */
+	nodes_or(node_possible_map, node_possible_map, hp_nodes_parsed);

  	/* Finally register nodes */
  	for_each_node_mask(i, node_possible_map)


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-20 22:45               ` Yinghai Lu
  2010-01-20 23:32                 ` David Rientjes
@ 2010-01-21  3:00                 ` Haicheng Li
  1 sibling, 0 replies; 26+ messages in thread
From: Haicheng Li @ 2010-01-21  3:00 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: David Rientjes, Ingo Molnar, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

Yinghai Lu wrote:
> 
>         if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>                 update_nodes_add(node, start, end);
>                 /* restore nodes[node] */
>                 *nd = oldnode;
>                 if ((nd->start | nd->end) == 0)
>                         node_clear(node, nodes_parsed);
>         }
> 
> removing clearing with nodes_parsed is not working?
> 
> YH

Yinghai,

Theoretically removing clearing with nodes_parsed can work fine except that it requires more 
consideration, since some functions already based on nodes_parsed, like acpi_get_nodes(), is 
supposing nodes_parsed just represents for nodes with memory on.

See my another email to David, I think we'd better keep hotpluggable node info separately since it 
is straightforward as well as would be useful for future hotplug related usage. How do you think 
about it? thanks.

-haicheng

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-21  2:58               ` Haicheng Li
@ 2010-01-21  6:58                 ` David Rientjes
  2010-01-21  7:31                   ` Haicheng Li
  0 siblings, 1 reply; 26+ messages in thread
From: David Rientjes @ 2010-01-21  6:58 UTC (permalink / raw)
  To: Haicheng Li
  Cc: Ingo Molnar, Yinghai Lu, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On Thu, 21 Jan 2010, Haicheng Li wrote:

> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> index a271241..c5552ae 100644
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -27,8 +27,18 @@ int acpi_numa __initdata;
> 
>  static struct acpi_table_slit *acpi_slit;
> 
> +/* nodes_parsed:
> + *  - nodes with memory on
> + * cpu_nodes_parsed:
> + *  - nodes with cpu on
> + * hp_nodes_parsed:
> + *  - nodes with hotpluggable memory region
> + *
> + * We union these tree nodemasks to get node_possible_map.
> + */
>  static nodemask_t nodes_parsed __initdata;
>  static nodemask_t cpu_nodes_parsed __initdata;
> +static nodemask_t hp_nodes_parsed __initdata;
>  static struct bootnode nodes[MAX_NUMNODES] __initdata;
>  static struct bootnode nodes_add[MAX_NUMNODES];
> 
> @@ -229,9 +239,11 @@ update_nodes_add(int node, unsigned long start, unsigned
> long end)
>  			printk(KERN_ERR "SRAT: Hotplug zone not continuous.
> Partly ignored\n");
>  	}
> 
> -	if (changed)
> +	if (changed) {
> +		node_set(node, hp_nodes_parsed);
>  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
>  				 nd->start, nd->end);
> +	}
>  }
> 
>  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> @@ -380,8 +392,10 @@ int __init acpi_scan_nodes(unsigned long start, unsigned
> long end)
>  		return -1;
>  	}
> 
> -	/* Account for nodes with cpus and no memory */
> +	/* Account for nodes with memory and cpus */
>  	nodes_or(node_possible_map, nodes_parsed, cpu_nodes_parsed);
> +	/* Account for nodes with hotpluggable memory regions */
> +	nodes_or(node_possible_map, node_possible_map, hp_nodes_parsed);
> 
>  	/* Finally register nodes */
>  	for_each_node_mask(i, node_possible_map)
> 
> 

Nack, we don't need to add yet another nodemask because you're having 
trouble finding a new name for a cpu_nodes_parsed.  It would be perfectly 
acceptable to rename nodes_parsed to mem_nodes and cpu_nodes_parsed to 
no_mem_nodes, which are even shorter.  But we definitely don't need 
another nodemask and I think we're convoluting the bug fix here way too 
much.  Regardless of the future nomenclature, let's please merge my patch 
to fix the outstanding kernel issue and then propose an alternative naming 
scheme later.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-21  6:58                 ` David Rientjes
@ 2010-01-21  7:31                   ` Haicheng Li
  2010-01-21  7:50                     ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Haicheng Li @ 2010-01-21  7:31 UTC (permalink / raw)
  To: David Rientjes
  Cc: Ingo Molnar, Yinghai Lu, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

David Rientjes wrote:
> On Thu, 21 Jan 2010, Haicheng Li wrote:
> 
>> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
>> index a271241..c5552ae 100644
>> --- a/arch/x86/mm/srat_64.c
>> +++ b/arch/x86/mm/srat_64.c
>> @@ -27,8 +27,18 @@ int acpi_numa __initdata;
>>
>>  static struct acpi_table_slit *acpi_slit;
>>
>> +/* nodes_parsed:
>> + *  - nodes with memory on
>> + * cpu_nodes_parsed:
>> + *  - nodes with cpu on
>> + * hp_nodes_parsed:
>> + *  - nodes with hotpluggable memory region
>> + *
>> + * We union these tree nodemasks to get node_possible_map.
>> + */
>>  static nodemask_t nodes_parsed __initdata;
>>  static nodemask_t cpu_nodes_parsed __initdata;
>> +static nodemask_t hp_nodes_parsed __initdata;
>>  static struct bootnode nodes[MAX_NUMNODES] __initdata;
>>  static struct bootnode nodes_add[MAX_NUMNODES];
>>
>> @@ -229,9 +239,11 @@ update_nodes_add(int node, unsigned long start, unsigned
>> long end)
>>  			printk(KERN_ERR "SRAT: Hotplug zone not continuous.
>> Partly ignored\n");
>>  	}
>>
>> -	if (changed)
>> +	if (changed) {
>> +		node_set(node, hp_nodes_parsed);
>>  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
>>  				 nd->start, nd->end);
>> +	}
>>  }
>>
>>  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
>> @@ -380,8 +392,10 @@ int __init acpi_scan_nodes(unsigned long start, unsigned
>> long end)
>>  		return -1;
>>  	}
>>
>> -	/* Account for nodes with cpus and no memory */
>> +	/* Account for nodes with memory and cpus */
>>  	nodes_or(node_possible_map, nodes_parsed, cpu_nodes_parsed);
>> +	/* Account for nodes with hotpluggable memory regions */
>> +	nodes_or(node_possible_map, node_possible_map, hp_nodes_parsed);
>>
>>  	/* Finally register nodes */
>>  	for_each_node_mask(i, node_possible_map)
>>
>>
> 
> Nack, we don't need to add yet another nodemask because you're having 
> trouble finding a new name for a cpu_nodes_parsed.  It would be perfectly 
Hey Dave, why do you think it's just a naming issue?

What I'm concerning is that your assumption of cpu_nodes_parsed use is wrong,
cpu_nodes_parsed is needed anyway since its semantics represent the node with cpu affinity rather 
than memless node, that's also why I originally doubted cpu_node_parsed cannot handle hotplug node.
we also need hp_nodes_parsed to represent the node with hotpluggable memory region, just like why we 
need nodes_parsed to repsent node with mem on.

The code could be straightforward and easy to undertand in this way.

> acceptable to rename nodes_parsed to mem_nodes and cpu_nodes_parsed to 
> no_mem_nodes, which are even shorter.  But we definitely don't need 
> another nodemask and I think we're convoluting the bug fix here way too 
> much.  Regardless of the future nomenclature, let's please merge my patch 
> to fix the outstanding kernel issue and then propose an alternative naming 
It's important to fix the outstanding kernel issue, that's also why I keep shooting down this issue. 
I wanna fix it thru a right way rather than just get it fixed.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-21  7:31                   ` Haicheng Li
@ 2010-01-21  7:50                     ` David Rientjes
  2010-01-21  8:33                       ` Haicheng Li
  0 siblings, 1 reply; 26+ messages in thread
From: David Rientjes @ 2010-01-21  7:50 UTC (permalink / raw)
  To: Haicheng Li
  Cc: Ingo Molnar, Yinghai Lu, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On Thu, 21 Jan 2010, Haicheng Li wrote:

> > Nack, we don't need to add yet another nodemask because you're having
> > trouble finding a new name for a cpu_nodes_parsed.  It would be perfectly 
> Hey Dave, why do you think it's just a naming issue?
> 
> What I'm concerning is that your assumption of cpu_nodes_parsed use is wrong,
> cpu_nodes_parsed is needed anyway since its semantics represent the node with
> cpu affinity rather than memless node, that's also why I originally doubted
> cpu_node_parsed cannot handle hotplug node.

Wrong, cpu_nodes_parsed (despite its name) solely represents nodes that do 
not have online address memory ranges.  That's it.  Nothing more, nothing 
less.  That's why I suggest you rename it to no_mem_nodes or something 
similar.  Look at the single time that the nodemask is used: to set 
cleared bits in node_possible_map that were not set in nodes_parsed 
because THEY DO NOT HAVE ASSOCIATED ONLINE MEMORY RANGES, the _only_ time 
a node gets set in nodes_parsed.

Once you rename nodes_parsed to mem_nodes and cpu_nodes_parsed to 
no_mem_nodes, it may become clearer.

> we also need hp_nodes_parsed to represent the node with hotpluggable memory
> region, just like why we need nodes_parsed to repsent node with mem on.
> 

It's pointless to add another nodemask, the semantics of cpu_nodes_parsed 
is perfectly legitimate for hotpluggable nodes as well.  Instead of 
fixating on the name, look at the code that uses it.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-21  7:50                     ` David Rientjes
@ 2010-01-21  8:33                       ` Haicheng Li
  2010-01-21 23:12                         ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Haicheng Li @ 2010-01-21  8:33 UTC (permalink / raw)
  To: David Rientjes
  Cc: Ingo Molnar, Yinghai Lu, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

David Rientjes wrote:
> On Thu, 21 Jan 2010, Haicheng Li wrote:
> 
>>> Nack, we don't need to add yet another nodemask because you're having
>>> trouble finding a new name for a cpu_nodes_parsed.  It would be perfectly 
>> Hey Dave, why do you think it's just a naming issue?
>>
>> What I'm concerning is that your assumption of cpu_nodes_parsed use is wrong,
>> cpu_nodes_parsed is needed anyway since its semantics represent the node with
>> cpu affinity rather than memless node, that's also why I originally doubted
>> cpu_node_parsed cannot handle hotplug node.
> 
> Wrong, cpu_nodes_parsed (despite its name) solely represents nodes that do 
> not have online address memory ranges.  That's it.  Nothing more, nothing 
> less.  That's why I suggest you rename it to no_mem_nodes or something 
> similar.  Look at the single time that the nodemask is used: to set 
> cleared bits in node_possible_map that were not set in nodes_parsed 
> because THEY DO NOT HAVE ASSOCIATED ONLINE MEMORY RANGES, the _only_ time 
> a node gets set in nodes_parsed.
> 
> Once you rename nodes_parsed to mem_nodes and cpu_nodes_parsed to 
> no_mem_nodes, it may become clearer.
> 
>> we also need hp_nodes_parsed to represent the node with hotpluggable memory
>> region, just like why we need nodes_parsed to repsent node with mem on.
>>
> 
> It's pointless to add another nodemask, the semantics of cpu_nodes_parsed 
> is perfectly legitimate for hotpluggable nodes as well.  Instead of 
> fixating on the name, look at the code that uses it.

I'm not meant to be rude, but it's illogical excuse. that it is now used by a single function 
doesn't mean that it will never be used by others in future or it is only useful for that single 
function.

see the code, we can find such relationships between related data-structures.
- struct bootnode nodes[] -> nodes_parsed
all the node in nodes_parsed should have a relative mem range in nodes[].

- apicid_to_node[] -> cpu_nodes_parsed
all the value of apicid_to_node[] should be valid in cpu_nodes_parsed. If we put hotpluggable node 
into cpu_nodes_parsed, such relationship is broken, right?

- nodes_add[] -> ?? (this is for hotpluggable node)
all the hotplugged mem can find a corresponding node thru nodes_add[].

-haicheng

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [patch] x86: set hotpluggable nodes in nodes_possible_map
  2010-01-21  8:33                       ` Haicheng Li
@ 2010-01-21 23:12                         ` David Rientjes
  2010-01-22  4:06                           ` [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node Haicheng Li
  0 siblings, 1 reply; 26+ messages in thread
From: David Rientjes @ 2010-01-21 23:12 UTC (permalink / raw)
  To: Haicheng Li
  Cc: Ingo Molnar, Yinghai Lu, H. Peter Anvin, Thomas Gleixner, x86,
	Andi Kleen, linux-kernel

On Thu, 21 Jan 2010, Haicheng Li wrote:

> > It's pointless to add another nodemask, the semantics of cpu_nodes_parsed is
> > perfectly legitimate for hotpluggable nodes as well.  Instead of fixating on
> > the name, look at the code that uses it.
> 
> I'm not meant to be rude, but it's illogical excuse. that it is now used by a
> single function doesn't mean that it will never be used by others in future or
> it is only useful for that single function.
> 

It will not be used by a single function in this context for any valid 
purpose, hp_nodes_parsed is completely unnecessary and overkill to address 
a problem that is a relative one-liner.  You're completely missing the 
points that (i) no other SRAT parsing code cares about distinguishing only 
nodes with hotpluggable memory ranges, and (ii) no other SRAT parsing code 
even cares about distinguishing memoryless nodes with only cpus attached.  
Thus, we can use cpu_nodes_allowed to represent _all_ nodes without online 
memory ranges (and you can even rename it to no_mem_nodes later, if you 
want).

This discussion is becoming very annoying because you have constantly 
proposed new patches (I think 4 now?) that are much more complex and 
without any real consistent idea behind them.  Your latest proposal is to 
add a nodemask because you speculate that someday it will become useful; 
the truth of the matter is that we don't need to do anything with them 
beyond detection so the bit gets set in nodes_possible_map.

Ingo should not need to look through what is becoming an extremely long 
discussion for a bugfix that should be applied for rc5 because we're 
getting _very_ late in the 2.6.33 release cycle.  Do you expect Ingo to 
push your fix to Linus with the rationale that "maybe someday we'll use 
this new nodemask even though it may be rc5 and nobody knows what we'd 
ever use it for"?  Is that appropriate for -stable candidates as well?

You've already tested my patch that this thread was restarted with and it 
works, so let's fix the bug.  Then, later, you can rename cpu_nodes_parsed 
to no_mems_nodes, which I'd agree with.  You may even try to seperate the 
hotpluggable nodes out into their own nodemask, but I trust that the x86 
maintainers will be looking for some rationale behind that other than "it 
may one day be useful."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node
  2010-01-21 23:12                         ` David Rientjes
@ 2010-01-22  4:06                           ` Haicheng Li
  2010-01-22  7:33                             ` H. Peter Anvin
  0 siblings, 1 reply; 26+ messages in thread
From: Haicheng Li @ 2010-01-22  4:06 UTC (permalink / raw)
  To: David Rientjes, Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Yinghai Lu, x86, Andi Kleen, linux-kernel

David Rientjes wrote:
 > You've already tested my patch that this thread was restarted with and it
 > works, so let's fix the bug.  Then, later, you can rename cpu_nodes_parsed
 > to no_mems_nodes, which I'd agree with.  You may even try to seperate the
 > hotpluggable nodes out into their own nodemask, but I trust that the x86
 > maintainers will be looking for some rationale behind that other than "it
 > may one day be useful."

David, you are misleading people to fix the BUG with a logically problematic patch. I don't want 
such fixing to possibly bother other people someday, please let's avoid it in review stage.

 > getting _very_ late in the 2.6.33 release cycle.  Do you expect Ingo to
 > push your fix to Linus with the rationale that "maybe someday we'll use
 > this new nodemask even though it may be rc5 and nobody knows what we'd
 > ever use it for"?  Is that appropriate for -stable candidates as well?

Don't speak for any other people. Let maintainers themselves decide if my patch is ugly or 
acceptable. I don't want to argue with you anymore if you cannot find any true problem from my 
recent patch.

Below is my updated patch (in fact, it's v2 for the patch I sent out for review in 
http://lkml.org/lkml/2010/1/15/9).

***

This is to fix the outstanding BUG I ever reported in email thread: 
http://patchwork.kernel.org/patch/69499/.
[  141.667487] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[  141.667782] IP: [<ffffffff810b8a64>] cache_reap+0x71/0x236
[  141.667969] PGD 0
[  141.668129] Oops: 0000 [#1] SMP
[  141.668357] last sysfs file: /sys/class/scsi_host/host4/proc_name
[  141.668469] CPU
[  141.668630] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc 
dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc 
lp parport joydev usbhid sr_mod cdrom thermal processor thermal_sys container button rtc_cmos 
rtc_core rtc_lib i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
[  141.671659] Pid: 126, comm: events/27 Not tainted 2.6.32 #9  Server
[  141.671771] RIP: 0010:[<ffffffff810b8a64>]  [<ffffffff810b8a64>] cache_reap+0x71/0x236
[  141.671981] RSP: 0018:ffff88027e81bdf0  EFLAGS: 00010206
[  141.672089] RAX: 0000000000000002 RBX: 0000000000000078 RCX: ffff88047d86e580
[  141.672204] RDX: ffff88047dfcbc00 RSI: ffff88047f13f6c0 RDI: ffff88047d9136c0
[  141.672319] RBP: ffff88027e81be30 R08: 0000000000000001 R09: 0000000000000001
[  141.672433] R10: 0000000000000000 R11: 0000000000000086 R12: ffff88047d87c200
[  141.672548] R13: ffff88047d87d680 R14: ffffffff810b89f3 R15: 0000000000000002
[  141.672663] FS:  0000000000000000(0000) GS:ffff88028b5a0000(0000) knlGS:0000000000000000
[  141.672807] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  141.672917] CR2: 0000000000000078 CR3: 0000000001001000 CR4: 00000000000006e0
[  141.673032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  141.673147] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  141.673262] Process events/27 (pid: 126, threadinfo ffff88027e81a000, task ffff88027f3ea040)
[  141.673406] Stack:
[  141.673503]  ffff88027e81be30 ffff88028b5b05a0 0000000100000000 ffff88027e81be80
[  141.673808] <0> ffff88028b5b5b40 ffff88028b5b05a0 ffffffff810b89f3 fffffffff00000c6
[  141.674265] <0> ffff88027e81bec0 ffffffff81057394 ffffffff8105733e ffffffff81369f3a
[  141.674813] Call Trace:
[  141.674915]  [<ffffffff810b89f3>] ? cache_reap+0x0/0x236
[  141.675028]  [<ffffffff81057394>] worker_thread+0x17a/0x27b
[  141.675138]  [<ffffffff8105733e>] ? worker_thread+0x124/0x27b
[  141.675256]  [<ffffffff81369f3a>] ? thread_return+0x3e/0xee
[  141.675369]  [<ffffffff8105a244>] ? autoremove_wake_function+0x0/0x38
[  141.675482]  [<ffffffff8105721a>] ? worker_thread+0x0/0x27b
[  141.675593]  [<ffffffff8105a146>] kthread+0x7d/0x87
[  141.675707]  [<ffffffff81012daa>] child_rip+0xa/0x20
[  141.675817]  [<ffffffff81012710>] ? restore_args+0x0/0x30
[  141.675927]  [<ffffffff8105a0c9>] ? kthread+0x0/0x87
[  141.676035]  [<ffffffff81012da0>] ? child_rip+0x0/0x20
[  141.676142] Code: a4 c5 68 08 00 00 65 48 8b 04 25 00 e4 00 00 48 8b 04 18 49 8b 4c 24 78 48 85 
c9 74 5b 41 89 c7 48 98 48 8b 1c c1 48 85 db 74 4d <83> 3b 00 74 48 48 83 3d ff d4 65 00 00 75 04 0f 
0b eb fe fa 66
[  141.680610] RIP  [<ffffffff810b8a64>] cache_reap+0x71/0x236
[  141.680785]  RSP <ffff88027e81bdf0>
[  141.680886] CR2: 0000000000000078
[  141.681016] ---[ end trace b1e17069ef81fe83 ]--

Existing code doesn't record hotpluggable nodes parsed from SRAT because such nodes have neither CPU 
online (in cpu_nodes_parsed) nor MEM online (in nodes_parsed) at booting time. As a result, 
node_possible_map won't include hotpluggable nodes and then nr_node_ids won't be equal to maximum of 
the possible nodes on the system.

To fix it, naturally we add nodemask_t hp_nodes_parsed to record nodes with hotpluggable memory 
region, corresponding region data is kept in existing struct bootnode nodes_add[]; finally we union 
hp_nodes_parsed together with cpu_nodes_parsed and nodes_parsed to get correct node_possible_map, 
which then includes all possible nodes:
  - nodes with memory on
  - nodes with cpu on
  - nodes with hotpluggable memory region

Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
---
  arch/x86/mm/srat_64.c |    9 +++++++--
  1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index dbb5381..595b14d 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -29,6 +29,7 @@ static struct acpi_table_slit *acpi_slit;

  static nodemask_t nodes_parsed __initdata;
  static nodemask_t cpu_nodes_parsed __initdata;
+static nodemask_t hp_nodes_parsed __initdata;
  static struct bootnode nodes[MAX_NUMNODES] __initdata;
  static struct bootnode nodes_add[MAX_NUMNODES];

@@ -229,9 +230,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
  			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
  	}

-	if (changed)
+	if (changed) {
+		node_set(node, hp_nodes_parsed);
  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
  				 nd->start, nd->end);
+	}
  }

  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
@@ -364,8 +367,10 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end)
  		return -1;
  	}

-	/* Account for nodes with cpus and no memory */
+	/* Account for nodes with either memory or cpus online */
  	nodes_or(node_possible_map, nodes_parsed, cpu_nodes_parsed);
+	/* Account for nodes with hotpluggable memory region */
+	nodes_or(node_possible_map, node_possible_map, hp_nodes_parsed);

  	/* Finally register nodes */
  	for_each_node_mask(i, node_possible_map)
-- 
1.5.3.8


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node
  2010-01-22  4:06                           ` [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node Haicheng Li
@ 2010-01-22  7:33                             ` H. Peter Anvin
  2010-01-22  8:43                               ` Haicheng Li
  0 siblings, 1 reply; 26+ messages in thread
From: H. Peter Anvin @ 2010-01-22  7:33 UTC (permalink / raw)
  To: Haicheng Li
  Cc: David Rientjes, Ingo Molnar, Thomas Gleixner, Yinghai Lu, x86,
	Andi Kleen, linux-kernel

On 01/21/2010 08:06 PM, Haicheng Li wrote:
> David Rientjes wrote:
>> You've already tested my patch that this thread was restarted with and it
>> works, so let's fix the bug.  Then, later, you can rename
> cpu_nodes_parsed
>> to no_mems_nodes, which I'd agree with.  You may even try to seperate the
>> hotpluggable nodes out into their own nodemask, but I trust that the x86
>> maintainers will be looking for some rationale behind that other than "it
>> may one day be useful."
> 
> David, you are misleading people to fix the BUG with a logically
> problematic patch. I don't want such fixing to possibly bother other
> people someday, please let's avoid it in review stage.
> 
>> getting _very_ late in the 2.6.33 release cycle.  Do you expect Ingo to
>> push your fix to Linus with the rationale that "maybe someday we'll use
>> this new nodemask even though it may be rc5 and nobody knows what we'd
>> ever use it for"?  Is that appropriate for -stable candidates as well?
> 
> Don't speak for any other people. Let maintainers themselves decide if
> my patch is ugly or acceptable. I don't want to argue with you anymore
> if you cannot find any true problem from my recent patch.
> 
> Below is my updated patch (in fact, it's v2 for the patch I sent out for
> review in http://lkml.org/lkml/2010/1/15/9).
> 

Okay... please calm down.  I just read through this thread from the top,
and had missed the fact that it had gotten so tense.

I have to say I agree with David Rientjes that we need the minimal patch
for upstream and stable.  If you need the additional bitmask in the
future it should be added later.

Haicheng, would you be willing to prepare a minimal patch so we can
close the issue in the release trees as quickly as possible?

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node
  2010-01-22  7:33                             ` H. Peter Anvin
@ 2010-01-22  8:43                               ` Haicheng Li
  2010-01-22 10:14                                 ` H. Peter Anvin
  0 siblings, 1 reply; 26+ messages in thread
From: Haicheng Li @ 2010-01-22  8:43 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: David Rientjes, Ingo Molnar, Thomas Gleixner, Yinghai Lu, x86,
	Andi Kleen, linux-kernel

H. Peter Anvin wrote:
 > I have to say I agree with David Rientjes that we need the minimal patch
 > for upstream and stable.  If you need the additional bitmask in the
 > future it should be added later.
 >
 > Haicheng, would you be willing to prepare a minimal patch so we can
 > close the issue in the release trees as quickly as possible?

Peter,

Okay, let's close it. then please take the patch pasted below, which is the one without additional 
bitmask added.

---
x86: set hotpluggable nodes in nodes_possible_map

nodes_possible_map does not currently include nodes that have SRAT
entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is cleared
in nodes_parsed if it does not have an online address range.

Unequivocally setting the bit in nodes_parsed is insufficient since
existing code, such as acpi_get_nodes(), assumes all nodes in the map
have online address ranges.  In fact, all code using nodes_parsed assumes
such nodes represent an address range of online memory.

nodes_possible_map is created by unioning nodes_parsed and
cpu_nodes_parsed; the former represents nodes with online memory and the
latter represents memoryless nodes.  We now set the bit for hotpluggable
nodes in cpu_nodes_parsed so that it also gets set in nodes_possible_map.

Signed-off-by: David Rientjes <rientjes@google.com>
---
  arch/x86/mm/srat_64.c |    4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -229,9 +229,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
  			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
  	}

-	if (changed)
+	if (changed) {
+		node_set(node, cpu_nodes_parsed);
  		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
  				 nd->start, nd->end);
+	}
  }

  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node
  2010-01-22  8:43                               ` Haicheng Li
@ 2010-01-22 10:14                                 ` H. Peter Anvin
  2010-01-22 10:35                                   ` Haicheng Li
  0 siblings, 1 reply; 26+ messages in thread
From: H. Peter Anvin @ 2010-01-22 10:14 UTC (permalink / raw)
  To: Haicheng Li
  Cc: David Rientjes, Ingo Molnar, Thomas Gleixner, Yinghai Lu, x86,
	Andi Kleen, linux-kernel

On 01/22/2010 12:43 AM, Haicheng Li wrote:
> H. Peter Anvin wrote:
>> I have to say I agree with David Rientjes that we need the minimal patch
>> for upstream and stable.  If you need the additional bitmask in the
>> future it should be added later.
>>
>> Haicheng, would you be willing to prepare a minimal patch so we can
>> close the issue in the release trees as quickly as possible?
> 
> Peter,
> 
> Okay, let's close it. then please take the patch pasted below, which is
> the one without additional bitmask added.
> 

As David asked, OK to add your Tested-by: line?

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node
  2010-01-22 10:14                                 ` H. Peter Anvin
@ 2010-01-22 10:35                                   ` Haicheng Li
  0 siblings, 0 replies; 26+ messages in thread
From: Haicheng Li @ 2010-01-22 10:35 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: David Rientjes, Ingo Molnar, Thomas Gleixner, Yinghai Lu, x86,
	Andi Kleen, linux-kernel

H. Peter Anvin wrote:
> On 01/22/2010 12:43 AM, Haicheng Li wrote:
>> H. Peter Anvin wrote:
>>> I have to say I agree with David Rientjes that we need the minimal patch
>>> for upstream and stable.  If you need the additional bitmask in the
>>> future it should be added later.
>>>
>>> Haicheng, would you be willing to prepare a minimal patch so we can
>>> close the issue in the release trees as quickly as possible?
>> Peter,
>>
>> Okay, let's close it. then please take the patch pasted below, which is
>> the one without additional bitmask added.
>>
> 
> As David asked, OK to add your Tested-by: line?
Sure,
Tested-by: Haicheng Li <haicheng.li@linux.intel.com>

Thanks.
-haicheng

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [tip:x86/urgent] x86: Set hotpluggable nodes in nodes_possible_map
  2010-01-20 20:10             ` [patch] x86: set hotpluggable nodes in nodes_possible_map David Rientjes
  2010-01-20 22:45               ` Yinghai Lu
  2010-01-21  2:58               ` Haicheng Li
@ 2010-01-22 11:15               ` tip-bot for David Rientjes
  2010-01-23  6:51               ` tip-bot for David Rientjes
  3 siblings, 0 replies; 26+ messages in thread
From: tip-bot for David Rientjes @ 2010-01-22 11:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, haicheng.li, hpa, mingo, stable, tglx, rientjes

Commit-ID:  fefe0d8e9f0b2d1855ab4902c10399432cfc2e16
Gitweb:     http://git.kernel.org/tip/fefe0d8e9f0b2d1855ab4902c10399432cfc2e16
Author:     David Rientjes <rientjes@google.com>
AuthorDate: Wed, 20 Jan 2010 12:10:47 -0800
Committer:  H. Peter Anvin <hpa@zytor.com>
CommitDate: Fri, 22 Jan 2010 02:44:43 -0800

x86: Set hotpluggable nodes in nodes_possible_map

nodes_possible_map does not currently include nodes that have SRAT
entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is
cleared in nodes_parsed if it does not have an online address range.

Unequivocally setting the bit in nodes_parsed is insufficient since
existing code, such as acpi_get_nodes(), assumes all nodes in the map
have online address ranges.  In fact, all code using nodes_parsed
assumes such nodes represent an address range of online memory.

nodes_possible_map is created by unioning nodes_parsed and
cpu_nodes_parsed; the former represents nodes with online memory and
the latter represents memoryless nodes.  We now set the bit for
hotpluggable nodes in cpu_nodes_parsed so that it also gets set in
nodes_possible_map.

[ hpa: Haicheng Li points out that this makes the naming of the
  variable cpu_nodes_parsed somewhat counterintuitive.  However, leave
  it as is in the interest of keeping the pure bug fix patch small. ]

Signed-off-by: David Rientjes <rientjes@google.com>
Tested-by: Haicheng Li <haicheng.li@linux.intel.com>
LKML-Reference: <alpine.DEB.2.00.1001201152040.30528@chino.kir.corp.google.com>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
 arch/x86/mm/srat_64.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index a271241..28c6876 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -229,9 +229,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
 			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
 	}
 
-	if (changed)
+	if (changed) {
+		node_set(node, cpu_nodes_parsed);
 		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
 				 nd->start, nd->end);
+	}
 }
 
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [tip:x86/urgent] x86: Set hotpluggable nodes in nodes_possible_map
  2010-01-20 20:10             ` [patch] x86: set hotpluggable nodes in nodes_possible_map David Rientjes
                                 ` (2 preceding siblings ...)
  2010-01-22 11:15               ` [tip:x86/urgent] x86: Set hotpluggable nodes in nodes_possible_map tip-bot for David Rientjes
@ 2010-01-23  6:51               ` tip-bot for David Rientjes
  3 siblings, 0 replies; 26+ messages in thread
From: tip-bot for David Rientjes @ 2010-01-23  6:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, haicheng.li, hpa, mingo, stable, tglx, rientjes

Commit-ID:  3a5fc0e40cb467e692737bc798bc99773c81e1e2
Gitweb:     http://git.kernel.org/tip/3a5fc0e40cb467e692737bc798bc99773c81e1e2
Author:     David Rientjes <rientjes@google.com>
AuthorDate: Wed, 20 Jan 2010 12:10:47 -0800
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 23 Jan 2010 06:21:57 +0100

x86: Set hotpluggable nodes in nodes_possible_map

nodes_possible_map does not currently include nodes that have SRAT
entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is
cleared in nodes_parsed if it does not have an online address range.

Unequivocally setting the bit in nodes_parsed is insufficient since
existing code, such as acpi_get_nodes(), assumes all nodes in the map
have online address ranges.  In fact, all code using nodes_parsed
assumes such nodes represent an address range of online memory.

nodes_possible_map is created by unioning nodes_parsed and
cpu_nodes_parsed; the former represents nodes with online memory and
the latter represents memoryless nodes.  We now set the bit for
hotpluggable nodes in cpu_nodes_parsed so that it also gets set in
nodes_possible_map.

[ hpa: Haicheng Li points out that this makes the naming of the
  variable cpu_nodes_parsed somewhat counterintuitive.  However, leave
  it as is in the interest of keeping the pure bug fix patch small. ]

Signed-off-by: David Rientjes <rientjes@google.com>
Tested-by: Haicheng Li <haicheng.li@linux.intel.com>
LKML-Reference: <alpine.DEB.2.00.1001201152040.30528@chino.kir.corp.google.com>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
 arch/x86/mm/srat_64.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index a271241..28c6876 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -229,9 +229,11 @@ update_nodes_add(int node, unsigned long start, unsigned long end)
 			printk(KERN_ERR "SRAT: Hotplug zone not continuous. Partly ignored\n");
 	}
 
-	if (changed)
+	if (changed) {
+		node_set(node, cpu_nodes_parsed);
 		printk(KERN_INFO "SRAT: hot plug zone found %Lx - %Lx\n",
 				 nd->start, nd->end);
+	}
 }
 
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */

^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2010-01-23  6:52 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-15  7:42 [PATCH] x86/mm/srat_64.c: nodes_parsed should include all nodes detected by ACPI Haicheng Li
2010-01-17  2:22 ` Haicheng Li
2010-01-17 21:53 ` David Rientjes
2010-01-18  6:30   ` Yinghai Lu
2010-01-18 10:43     ` David Rientjes
2010-01-19 11:08       ` Haicheng Li
2010-01-19 11:29         ` Haicheng Li
2010-01-19 23:30         ` David Rientjes
2010-01-20 16:40           ` Haicheng Li
2010-01-20 20:10             ` [patch] x86: set hotpluggable nodes in nodes_possible_map David Rientjes
2010-01-20 22:45               ` Yinghai Lu
2010-01-20 23:32                 ` David Rientjes
2010-01-21  3:00                 ` Haicheng Li
2010-01-21  2:58               ` Haicheng Li
2010-01-21  6:58                 ` David Rientjes
2010-01-21  7:31                   ` Haicheng Li
2010-01-21  7:50                     ` David Rientjes
2010-01-21  8:33                       ` Haicheng Li
2010-01-21 23:12                         ` David Rientjes
2010-01-22  4:06                           ` [PATCH] x86/mm/srat_64.c: make node_possible_map include hotpluggable node Haicheng Li
2010-01-22  7:33                             ` H. Peter Anvin
2010-01-22  8:43                               ` Haicheng Li
2010-01-22 10:14                                 ` H. Peter Anvin
2010-01-22 10:35                                   ` Haicheng Li
2010-01-22 11:15               ` [tip:x86/urgent] x86: Set hotpluggable nodes in nodes_possible_map tip-bot for David Rientjes
2010-01-23  6:51               ` tip-bot for David Rientjes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox