linux-mm.kvack.org archive mirror
* [PATCH] memory-hotplug: Clear pgdat which is allocated by bootmem in try_offline_node()
@ 2014-10-20 10:05 Yasuaki Ishimatsu
  2014-10-21 16:56 ` Toshi Kani
  0 siblings, 1 reply; 3+ messages in thread
From: Yasuaki Ishimatsu @ 2014-10-20 10:05 UTC (permalink / raw)
  To: akpm, linux-mm, linux-kernel
  Cc: zhenzhang.zhang, wangnan0, tangchen, toshi.kani, dave.hansen,
	rientjes

When memory is hot added again after having been hot removed,
the following warning is shown:

WARNING: CPU: 20 PID: 6 at mm/page_alloc.c:4968 free_area_init_node+0x3fe/0x426()
...
Call Trace:
 [<...>] dump_stack+0x46/0x58
 [<...>] warn_slowpath_common+0x81/0xa0
 [<...>] warn_slowpath_null+0x1a/0x20
 [<...>] free_area_init_node+0x3fe/0x426
 [<...>] ? up+0x32/0x50
 [<...>] hotadd_new_pgdat+0x90/0x110
 [<...>] add_memory+0xd4/0x200
 [<...>] acpi_memory_device_add+0x1aa/0x289
 [<...>] acpi_bus_attach+0xfd/0x204
 [<...>] ? device_register+0x1e/0x30
 [<...>] acpi_bus_attach+0x178/0x204
 [<...>] acpi_bus_scan+0x6a/0x90
 [<...>] ? acpi_bus_get_status+0x2d/0x5f
 [<...>] acpi_device_hotplug+0xe8/0x418
 [<...>] acpi_hotplug_work_fn+0x1f/0x2b
 [<...>] process_one_work+0x14e/0x3f0
 [<...>] worker_thread+0x11b/0x510
 [<...>] ? rescuer_thread+0x350/0x350
 [<...>] kthread+0xe1/0x100
 [<...>] ? kthread_create_on_node+0x1b0/0x1b0
 [<...>] ret_from_fork+0x7c/0xb0
 [<...>] ? kthread_create_on_node+0x1b0/0x1b0

The detailed explanation is as follows:

When hot removing memory, try_offline_node() clears the node's pgdat
to zero. But if the pgdat was allocated by the bootmem allocator, the
function returns early and the clearing step is skipped. When the same
memory is hot added again, the stale (uncleared) pgdat is reused.
free_area_init_node() checks whether the pgdat has been zeroed, so it
hits WARN_ON().

This patch clears pgdat which is allocated by bootmem allocator
in try_offline_node().
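
The failure mode described above can be modelled in userspace. The
following is only a sketch: every toy_* name is hypothetical, the
struct fields are illustrative, and the check in
toy_hotadd_would_warn() is an assumption standing in for whatever
fields free_area_init_node() actually inspects, which this message
does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's pgdat; fields are illustrative. */
struct toy_pgdat {
	int node_id;
	unsigned long node_start_pfn;
	unsigned long nr_zones;
};

/* Models the pre-patch offline path: a bootmem-backed pgdat returns early. */
static void toy_offline_before(struct toy_pgdat *pgdat, bool from_bootmem)
{
	assert(pgdat != NULL);
	if (from_bootmem)
		return;		/* the clearing step below is skipped */
	/* (freeing of per-zone wait tables would happen here) */
	memset(pgdat, 0, sizeof(*pgdat));
}

/* Models the posted patch: "goto out" still reaches the clearing step. */
static void toy_offline_after(struct toy_pgdat *pgdat, bool from_bootmem)
{
	assert(pgdat != NULL);
	if (from_bootmem)
		goto out;
	/* (freeing of per-zone wait tables would happen here) */
out:
	memset(pgdat, 0, sizeof(*pgdat));
}

/*
 * Models the sanity check on hot-add (assumed form): returns true when
 * the reused pgdat still carries state, i.e. when a warning would fire.
 */
static bool toy_hotadd_would_warn(const struct toy_pgdat *pgdat)
{
	assert(pgdat != NULL);
	return pgdat->node_start_pfn != 0 || pgdat->nr_zones != 0;
}
```

Calling toy_offline_before() with from_bootmem set leaves the struct
populated, so toy_hotadd_would_warn() reports a warning; the same
sequence through toy_offline_after() does not.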

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
CC: Zhang Zhen <zhenzhang.zhang@huawei.com>
CC: Wang Nan <wangnan0@huawei.com>
CC: Tang Chen <tangchen@cn.fujitsu.com>
CC: Toshi Kani <toshi.kani@hp.com>
CC: Dave Hansen <dave.hansen@intel.com>
CC: David Rientjes <rientjes@google.com>

---
 mm/memory_hotplug.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 29d8693..7649f7c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1943,7 +1943,7 @@ void try_offline_node(int nid)

 	if (!PageSlab(pgdat_page) && !PageCompound(pgdat_page))
 		/* node data is allocated from boot memory */
-		return;
+		goto out;

 	/* free waittable in each zone */
 	for (i = 0; i < MAX_NR_ZONES; i++) {
@@ -1957,6 +1957,7 @@ void try_offline_node(int nid)
 			vfree(zone->wait_table);
 	}

+out:
 	/*
 	 * Since there is no way to guarentee the address of pgdat/zone is not
 	 * on stack of any kernel threads or used by other kernel objects
-- 
1.8.3.1


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] memory-hotplug: Clear pgdat which is allocated by bootmem in try_offline_node()
  2014-10-20 10:05 [PATCH] memory-hotplug: Clear pgdat which is allocated by bootmem in try_offline_node() Yasuaki Ishimatsu
@ 2014-10-21 16:56 ` Toshi Kani
  2014-10-22  5:32   ` Yasuaki Ishimatsu
  0 siblings, 1 reply; 3+ messages in thread
From: Toshi Kani @ 2014-10-21 16:56 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: akpm, linux-mm, linux-kernel, zhenzhang.zhang, wangnan0, tangchen,
	dave.hansen, rientjes

On Mon, 2014-10-20 at 19:05 +0900, Yasuaki Ishimatsu wrote:
 :
> When hot removing memory, pgdat is set to 0 in try_offline_node().
> But if the pgdat is allocated by bootmem allocator, the clearing
> step is skipped. And when hot adding the same memory, the uninitialized
> pgdat is reused. But free_area_init_node() chacks wether pgdat is set

s/chacks/checks


> to zero. As a result, free_area_init_node() hits WARN_ON().
> 
> This patch clears pgdat which is allocated by bootmem allocator
> in try_offline_node().
> 
> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> CC: Zhang Zhen <zhenzhang.zhang@huawei.com>
> CC: Wang Nan <wangnan0@huawei.com>
> CC: Tang Chen <tangchen@cn.fujitsu.com>
> CC: Toshi Kani <toshi.kani@hp.com>
> CC: Dave Hansen <dave.hansen@intel.com>
> CC: David Rientjes <rientjes@google.com>
> 
> ---
>  mm/memory_hotplug.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 29d8693..7649f7c 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1943,7 +1943,7 @@ void try_offline_node(int nid)
> 
>  	if (!PageSlab(pgdat_page) && !PageCompound(pgdat_page))
>  		/* node data is allocated from boot memory */
> -		return;
> +		goto out;

Do we still need this if-statement?  That is, do we have to skip the
for-loop below even though it checks with is_vmalloc_addr()?

Thanks,
-Toshi


>  	/* free waittable in each zone */
>  	for (i = 0; i < MAX_NR_ZONES; i++) {
> @@ -1957,6 +1957,7 @@ void try_offline_node(int nid)
>  			vfree(zone->wait_table);
>  	}
> 
> +out:
>  	/*
>  	 * Since there is no way to guarentee the address of pgdat/zone is not
>  	 * on stack of any kernel threads or used by other kernel objects



* Re: [PATCH] memory-hotplug: Clear pgdat which is allocated by bootmem in try_offline_node()
  2014-10-21 16:56 ` Toshi Kani
@ 2014-10-22  5:32   ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 3+ messages in thread
From: Yasuaki Ishimatsu @ 2014-10-22  5:32 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, linux-mm, linux-kernel, zhenzhang.zhang, wangnan0, tangchen,
	dave.hansen, rientjes

(2014/10/22 1:56), Toshi Kani wrote:
> On Mon, 2014-10-20 at 19:05 +0900, Yasuaki Ishimatsu wrote:
>   :
>> When hot removing memory, pgdat is set to 0 in try_offline_node().
>> But if the pgdat is allocated by bootmem allocator, the clearing
>> step is skipped. And when hot adding the same memory, the uninitialized
>> pgdat is reused. But free_area_init_node() chacks wether pgdat is set
>

> s/chacks/checks

I'll update it.

>
>
>> to zero. As a result, free_area_init_node() hits WARN_ON().
>>
>> This patch clears pgdat which is allocated by bootmem allocator
>> in try_offline_node().
>>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>> CC: Zhang Zhen <zhenzhang.zhang@huawei.com>
>> CC: Wang Nan <wangnan0@huawei.com>
>> CC: Tang Chen <tangchen@cn.fujitsu.com>
>> CC: Toshi Kani <toshi.kani@hp.com>
>> CC: Dave Hansen <dave.hansen@intel.com>
>> CC: David Rientjes <rientjes@google.com>
>>
>> ---
>>   mm/memory_hotplug.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 29d8693..7649f7c 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1943,7 +1943,7 @@ void try_offline_node(int nid)
>>
>>   	if (!PageSlab(pgdat_page) && !PageCompound(pgdat_page))
>>   		/* node data is allocated from boot memory */
>> -		return;
>> +		goto out;
>

> Do we still need this if-statement?  That is, do we have to skip the
> for-loop below even though it checks with is_vmalloc_addr()?

You are right. The if-statement is not necessary. So the issue can be
fixed by just removing the if-statement.
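
A rough userspace sketch of why dropping the if-statement is enough:
the per-zone loop already frees only the wait tables that pass the
vmalloc check, so a bootmem-backed node can fall through to the
clearing step safely. All toy_* names, the two arenas and the call
counter are hypothetical; this is not the updated patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_NR_ZONES 4

/* Hypothetical arenas standing in for vmalloc space and boot memory. */
static char toy_vmalloc_arena[256];
static char toy_bootmem_arena[256];
static int toy_vfree_calls;

struct toy_zone {
	void *wait_table;
};

struct toy_pgdat {
	unsigned long nr_zones;
	struct toy_zone node_zones[TOY_MAX_NR_ZONES];
};

/* Models is_vmalloc_addr(): true only inside the fake vmalloc arena. */
static bool toy_is_vmalloc_addr(const void *p)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t start = (uintptr_t)toy_vmalloc_arena;

	return addr >= start && addr < start + sizeof(toy_vmalloc_arena);
}

/* Models vfree(): only counts calls, nothing is really released. */
static void toy_vfree(void *p)
{
	(void)p;
	toy_vfree_calls++;
}

/* Offline path with the bootmem special case removed. */
static void toy_try_offline_node(struct toy_pgdat *pgdat)
{
	int i;

	assert(pgdat != NULL);
	for (i = 0; i < TOY_MAX_NR_ZONES; i++) {
		struct toy_zone *zone = &pgdat->node_zones[i];

		/* boot-memory and NULL tables fail the check and are left alone */
		if (toy_is_vmalloc_addr(zone->wait_table))
			toy_vfree(zone->wait_table);
	}
	/* always reached now, whatever allocator backed the pgdat */
	memset(pgdat, 0, sizeof(*pgdat));
}
```

With one zone's wait_table pointing into toy_vmalloc_arena and another
into toy_bootmem_arena, toy_vfree() runs once and the struct ends up
zeroed.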

I'll post updated patch soon.

Thanks,
Yasuaki Ishimatsu

>
> Thanks,
> -Toshi
>
>
>>   	/* free waittable in each zone */
>>   	for (i = 0; i < MAX_NR_ZONES; i++) {
>> @@ -1957,6 +1957,7 @@ void try_offline_node(int nid)
>>   			vfree(zone->wait_table);
>>   	}
>>
>> +out:
>>   	/*
>>   	 * Since there is no way to guarentee the address of pgdat/zone is not
>>   	 * on stack of any kernel threads or used by other kernel objects
>
>


