Message-ID: <5004EF68.5050708@jp.fujitsu.com>
Date: Tue, 17 Jul 2012 13:51:52 +0900
From: Yasuaki Ishimatsu
To: Wen Congyang
Subject: Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
References: <4FFAB0A2.8070304@jp.fujitsu.com> <4FFAB148.9000803@jp.fujitsu.com>
 <4FFF9771.5080307@cn.fujitsu.com> <5004C39B.1060204@jp.fujitsu.com>
 <5004C5E2.1050906@jp.fujitsu.com> <5004CEB7.4090400@cn.fujitsu.com>
 <5004D745.3060303@jp.fujitsu.com> <5004DCC2.4030905@cn.fujitsu.com>
In-Reply-To: <5004DCC2.4030905@cn.fujitsu.com>
Cc: len.brown@intel.com, linux-acpi@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, paulus@samba.org,
 minchan.kim@gmail.com, kosaki.motohiro@jp.fujitsu.com,
 rientjes@google.com, cl@linux.com, linuxppc-dev@lists.ozlabs.org,
 akpm@linux-foundation.org, liuj97@gmail.com
List-Id: Linux on PowerPC Developers Mail List

Hi Wen,

2012/07/17 12:32, Wen Congyang wrote:
> At 07/17/2012 11:08 AM, Yasuaki Ishimatsu Wrote:
>> Hi Wen,
>>
>> 2012/07/17 11:32, Wen Congyang wrote:
>>> At 07/17/2012 09:54 AM, Yasuaki Ishimatsu Wrote:
>>>> Hi Wen,
>>>>
>>>> 2012/07/17 10:44, Yasuaki Ishimatsu wrote:
>>>>> Hi Wen,
>>>>>
>>>>> 2012/07/13 12:35, Wen Congyang wrote:
>>>>>> At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote:
>>>>>>> acpi_memory_device_remove() has been prepared to remove physical memory.
>>>>>>> But, the function only frees acpi_memory_device currentlry.
>>>>>>>
>>>>>>> The patch adds following functions into acpi_memory_device_remove():
>>>>>>>   - offline memory
>>>>>>>   - remove physical memory (only return -EBUSY)
>>>>>>>   - free acpi_memory_device
>>>>>>>
>>>>>>> CC: David Rientjes
>>>>>>> CC: Jiang Liu
>>>>>>> CC: Len Brown
>>>>>>> CC: Benjamin Herrenschmidt
>>>>>>> CC: Paul Mackerras
>>>>>>> CC: Christoph Lameter
>>>>>>> Cc: Minchan Kim
>>>>>>> CC: Andrew Morton
>>>>>>> CC: KOSAKI Motohiro
>>>>>>> CC: Wen Congyang
>>>>>>> Signed-off-by: Yasuaki Ishimatsu
>>>>>>>
>>>>>>> ---
>>>>>>>  drivers/acpi/acpi_memhotplug.c |   26 +++++++++++++++++++++++++-
>>>>>>>  drivers/base/memory.c          |   39 +++++++++++++++++++++++++++++++++++++++
>>>>>>>  include/linux/memory.h         |    5 +++++
>>>>>>>  include/linux/memory_hotplug.h |    1 +
>>>>>>>  mm/memory_hotplug.c            |    8 ++++++++
>>>>>>>  5 files changed, 78 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c 2012-07-09 18:08:29.946888653 +0900
>>>>>>> +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c 2012-07-09 18:08:43.470719531 +0900
>>>>>>> @@ -29,6 +29,7 @@
>>>>>>>  #include
>>>>>>>  #include
>>>>>>>  #include
>>>>>>> +#include
>>>>>>>  #include
>>>>>>>  #include
>>>>>>>  #include
>>>>>>> @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct
>>>>>>>  static int acpi_memory_device_remove(struct acpi_device *device, int type)
>>>>>>>  {
>>>>>>>  	struct acpi_memory_device *mem_device = NULL;
>>>>>>> -
>>>>>>> +	struct acpi_memory_info *info, *tmp;
>>>>>>> +	int result;
>>>>>>> +	int node;
>>>>>>>
>>>>>>>  	if (!device || !acpi_driver_data(device))
>>>>>>>  		return -EINVAL;
>>>>>>>
>>>>>>>  	mem_device = acpi_driver_data(device);
>>>>>>> +
>>>>>>> +	node = acpi_get_node(mem_device->device->handle);
>>>>>>> +
>>>>>>> +	list_for_each_entry_safe(info, tmp, &mem_device->res_list, list) {
>>>>>>> +		if (!info->enabled)
>>>>>>> +			continue;
>>>>>>> +
>>>>>>> +		if (!is_memblk_offline(info->start_addr, info->length)) {
>>>>>>> +			result = offline_memory(info->start_addr, info->length);
>>>>>>> +			if (result)
>>>>>>> +				return result;
>>>>>>> +		}
>>>>>>> +
>>>>>>> +		result = remove_memory(node, info->start_addr, info->length);
>>>>>>
>>>>>> The user may online the memory between offline_memory() and remove_memory().
>>>>>> So I think we should lock memory hotplug before check the memory's status
>>>>>> and release it after remove_memory().
>>>>>
>>>>> How about get "mem_block->state_mutex" of removed memory? When offlining
>>>>> memory, we need to change "memory_block->state" into "MEM_OFFLINE".
>>>>> In this case, we get mem_block->state_mutex. So I think the mutex lock
>>>>> is beneficial.
>>>>
>>>> It is not good idea since remove_memory frees mem_block structure...
>>>> Do you have any ideas?
>>>
>>> Hmm, split offline_memory() to 2 functions: offline_pages() and __offline_pages()
>>>
>>> offline_pages()
>>> 	lock_memory_hotplug();
>>> 	__offline_pages();
>>> 	unlock_memory_hotplug();
>>>
>>> and implement remove_memory() like this:
>>> remove_memory()
>>> 	lock_memory_hotplug()
>>> 	if (!is_memblk_offline()) {
>>> 		__offline_pages();
>>> 	}
>>> 	// cleanup
>>> 	unlock_memory_hotplug();
>>>
>>> What about this?
>>
>> I also thought about it once. But a problem remains. Current offilne_pages()
>> cannot realize the memory has been removed by remove_memory(). So even if
>> protecting the race by lock_memory_hotplug(), offline_pages() can offline
>> the removed memory. offline_pages() should have the means to know the memory
>> was removed. But I don't have good idea.
>
> We can not online/offline part of memory block, so what about this?

It seems you do not understand my concern. When remove_memory() and
offline_pages() run on the same memory simultaneously, offline_pages() can
end up operating on memory that has already been removed.

remove_memory()           | offline_pages()
-----------------------------------------------------------
lock_memory_hotplug()     |
                          | wait at lock_memory_hotplug()
remove memory             |
unlock_memory_hotplug()   |
                          | wake up and start offline_pages()
                          | offline page
                          | => but the memory has already been removed
                          |    by remove_memory()

In this case, offline_pages() may access removed memory.
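
To make the interleaving above concrete, here is a minimal userspace sketch
using pthreads. It is not kernel code: struct fake_block, hotplug_lock,
fake_offline_pages() and run_race() are hypothetical names invented for this
illustration. The "present" flag that the offline path re-checks after taking
the lock is only an assumption about one possible way to let offline_pages()
notice the removal; nothing like it exists in the patch, and in the kernel
such state would have to live somewhere that outlives the removed block.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory block; not the kernel's struct memory_block. */
struct fake_block {
	bool present;              /* false once the "remove" path has run */
	bool online;
	int touched_after_removal; /* counts accesses made after removal */
};

/* Plays the role of lock_memory_hotplug()/unlock_memory_hotplug(). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

struct offline_args {
	struct fake_block *blk;
	bool recheck;              /* re-validate the block after taking the lock? */
	int result;
};

/* The "offline_pages()" side: blocks on the lock while removal is running. */
static void *fake_offline_pages(void *arg)
{
	struct offline_args *a = arg;

	pthread_mutex_lock(&hotplug_lock);
	if (a->recheck && !a->blk->present) {
		/* Assumed fix: notice the removal and bail out. */
		a->result = -ENODEV;
	} else {
		if (!a->blk->present)
			a->blk->touched_after_removal++; /* the problem case */
		a->blk->online = false;
		a->result = 0;
	}
	pthread_mutex_unlock(&hotplug_lock);
	return NULL;
}

/*
 * The "remove_memory()" side. The lock is taken before the offline thread
 * is created and dropped only after the removal, so the offline thread
 * always runs after the block is gone, as in the diagram.
 * Returns how many times the removed block was touched.
 */
int run_race(bool recheck, int *offline_result)
{
	struct fake_block blk = { .present = true, .online = true };
	struct offline_args args = { .blk = &blk, .recheck = recheck };
	pthread_t t;
	int err;

	pthread_mutex_lock(&hotplug_lock);
	err = pthread_create(&t, NULL, fake_offline_pages, &args);
	assert(err == 0);
	(void)err;
	blk.online = false;
	blk.present = false;       /* memory removed */
	pthread_mutex_unlock(&hotplug_lock);

	pthread_join(t, NULL);
	*offline_result = args.result;
	return blk.touched_after_removal;
}
```

With recheck set to false, the waiter wakes up after the removal and touches
the removed block, so run_race() returns 1. With recheck set to true, it bails
out with -ENODEV and run_race() returns 0. The lock alone only orders the two
paths; it does not tell the waiter that its target has gone away.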

Thanks,
Yasuaki Ishimatsu

>
> remove_memory()
> 	lock_memory_hotplug()
> 	for each memory block:
> 		if (!is_memblk_offline()) {
> 			__offline_pages();
> 		}
> 	// cleanup
> 	unlock_memory_hotplug();
>
> Thanks
> Wen Congyang
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>>
>>> Thanks
>>> Wen Congyang
>>>>
>>>> Thanks,
>>>> Yasuaki Ishimatsu
>>>>
>>>>> Thanks,
>>>>> Yasuaki Ishimatsu
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Wen Congyang
>>>>>>
>>>>>>> +		if (result)
>>>>>>> +			return result;
>>>>>>> +
>>>>>>> +		list_del(&info->list);
>>>>>>> +		kfree(info);
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	kfree(mem_device);
>>>>>>>
>>>>>>>  	return 0;
>>>>>>> Index: linux-3.5-rc6/include/linux/memory_hotplug.h
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/include/linux/memory_hotplug.h 2012-07-09 18:08:29.955888542 +0900
>>>>>>> +++ linux-3.5-rc6/include/linux/memory_hotplug.h 2012-07-09 18:08:43.471719518 +0900
>>>>>>> @@ -233,6 +233,7 @@ static inline int is_mem_section_removab
>>>>>>>  extern int mem_online_node(int nid);
>>>>>>>  extern int add_memory(int nid, u64 start, u64 size);
>>>>>>>  extern int arch_add_memory(int nid, u64 start, u64 size);
>>>>>>> +extern int remove_memory(int nid, u64 start, u64 size);
>>>>>>>  extern int offline_memory(u64 start, u64 size);
>>>>>>>  extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>>>>>>>  					int nr_pages);
>>>>>>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c 2012-07-09 18:08:29.953888567 +0900
>>>>>>> +++ linux-3.5-rc6/mm/memory_hotplug.c 2012-07-09 18:08:43.476719455 +0900
>>>>>>> @@ -659,6 +659,14 @@ out:
>>>>>>>  }
>>>>>>>  EXPORT_SYMBOL_GPL(add_memory);
>>>>>>>
>>>>>>> +int remove_memory(int nid, u64 start, u64 size)
>>>>>>> +{
>>>>>>> +	return -EBUSY;
>>>>>>> +
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(remove_memory);
>>>>>>> +
>>>>>>> +
>>>>>>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>>>>>>>  /*
>>>>>>>   * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
>>>>>>> Index: linux-3.5-rc6/drivers/base/memory.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/drivers/base/memory.c 2012-07-09 18:08:29.947888640 +0900
>>>>>>> +++ linux-3.5-rc6/drivers/base/memory.c 2012-07-09 18:10:54.880076739 +0900
>>>>>>> @@ -70,6 +70,45 @@ void unregister_memory_isolate_notifier(
>>>>>>>  }
>>>>>>>  EXPORT_SYMBOL(unregister_memory_isolate_notifier);
>>>>>>>
>>>>>>> +bool is_memblk_offline(unsigned long start, unsigned long size)
>>>>>>> +{
>>>>>>> +	struct memory_block *mem = NULL;
>>>>>>> +	struct mem_section *section;
>>>>>>> +	unsigned long start_pfn, end_pfn;
>>>>>>> +	unsigned long pfn, section_nr;
>>>>>>> +
>>>>>>> +	start_pfn = PFN_DOWN(start);
>>>>>>> +	end_pfn = start_pfn + PFN_DOWN(start);
>>>>>>> +
>>>>>>> +	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>>>>>>> +		section_nr = pfn_to_section_nr(pfn);
>>>>>>> +		if (!present_section_nr(section_nr));
>>>>>>> +			continue;
>>>>>>> +
>>>>>>> +		section = __nr_to_section(section_nr);
>>>>>>> +		/* same memblock? */
>>>>>>> +		if (mem)
>>>>>>> +			if((section_nr >= mem->start_section_nr) &&
>>>>>>> +			   (section_nr <= mem->end_section_nr))
>>>>>>> +				continue;
>>>>>>> +
>>>>>>> +		mem = find_memory_block_hinted(section, mem);
>>>>>>> +		if (!mem)
>>>>>>> +			continue;
>>>>>>> +		if (mem->state == MEM_OFFLINE)
>>>>>>> +			continue;
>>>>>>> +
>>>>>>> +		kobject_put(&mem->dev.kobj);
>>>>>>> +		return false;
>>>>>>> +	}
>>>>>>> +
>>>>>>> +	if (mem)
>>>>>>> +		kobject_put(&mem->dev.kobj);
>>>>>>> +
>>>>>>> +	return true;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL(is_memblk_offline);
>>>>>>> +
>>>>>>>  /*
>>>>>>>   * register_memory - Setup a sysfs device for a memory block
>>>>>>>   */
>>>>>>> Index: linux-3.5-rc6/include/linux/memory.h
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/include/linux/memory.h 2012-07-08 09:23:56.000000000 +0900
>>>>>>> +++ linux-3.5-rc6/include/linux/memory.h 2012-07-09 18:08:43.484719355 +0900
>>>>>>> @@ -106,6 +106,10 @@ static inline int memory_isolate_notify(
>>>>>>>  {
>>>>>>>  	return 0;
>>>>>>>  }
>>>>>>> +static inline bool is_memblk_offline(unsigned long start, unsigned long size)
>>>>>>> +{
>>>>>>> +	return false;
>>>>>>> +}
>>>>>>>  #else
>>>>>>>  extern int register_memory_notifier(struct notifier_block *nb);
>>>>>>>  extern void unregister_memory_notifier(struct notifier_block *nb);
>>>>>>> @@ -120,6 +124,7 @@ extern int memory_isolate_notify(unsigne
>>>>>>>  extern struct memory_block *find_memory_block_hinted(struct mem_section *,
>>>>>>>  							struct memory_block *);
>>>>>>>  extern struct memory_block *find_memory_block(struct mem_section *);
>>>>>>> +extern bool is_memblk_offline(unsigned long start, unsigned long size);
>>>>>>>  #define CONFIG_MEM_BLOCK_SIZE (PAGES_PER_SECTION<<PAGE_SHIFT)
>>>>>>>  enum mem_add_context { BOOT, HOTPLUG };
>>>>>>>  #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>>>>> the body to majordomo@kvack.org. For more info on Linux MM,
>>>>>> see: http://www.linux-mm.org/ .
>>>>>> Don't email: email@kvack.org
>>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>>