* [PATCH v2 0/3] Support memory hot-delete to boot memory
From: Toshi Kani @ 2013-04-08 17:09 UTC
To: akpm
Cc: linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu, Toshi Kani

Memory hot-delete to a memory range present at boot causes an
error message in __release_region(), such as:

 Trying to free nonexistent resource <0000000070000000-0000000077ffffff>

Hot-delete operation still continues since __release_region() is
a void function, but the target memory range is not freed from
iomem_resource as the result. This also leads to a failure in a
subsequent hot-add operation to the same memory range since the
address range is still in-use in iomem_resource.

This problem happens because the granularity of memory resource ranges
may be different between boot and hot-delete. During bootup,
iomem_resource is set up from the boot descriptor table, such as EFI
Memory Table and e820. Each resource entry usually covers the whole
contiguous memory range. Hot-delete request, on the other hand, may
target to a particular range of memory resource, and its size can be
much smaller than the whole contiguous memory. Since the existing
release interfaces like __release_region() require a requested region
to be exactly matched to a resource entry, they do not allow a partial
resource to be released.

This patchset introduces release_mem_region_adjustable() for memory
hot-delete operations, which allows releasing a partial memory range
and adjusts remaining resource accordingly. This patchset makes no
changes to the existing interfaces since their restriction is still
valid for I/O resources.

---
v2: Updated release_mem_region_adjustable() per code reviews from
    Yasuaki Ishimatsu, Ram Pai and Gu Zheng.

---
Toshi Kani (3):
  resource: Add __adjust_resource() for internal use
  resource: Add release_mem_region_adjustable()
  mm: Change __remove_pages() to call release_mem_region_adjustable()

---
 include/linux/ioport.h |    2 +
 kernel/resource.c      |  128 ++++++++++++++++++++++++++++++++++++++++++++-----
 mm/memory_hotplug.c    |   11 ++++-
 3 files changed, 126 insertions(+), 15 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
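
For illustration only, the granularity mismatch described in the cover
letter can be sketched in userspace C. When a release request is
compared against one existing entry there are five possible outcomes:
no fit, exact match, head trim, tail trim, or a split. The enum and the
function name below are hypothetical and are not part of the patchset;
ranges are inclusive, as in struct resource.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; these are not kernel identifiers. */
enum release_kind {
	RELEASE_NO_FIT,		/* request is not contained in the entry */
	RELEASE_WHOLE,		/* request matches the entry exactly */
	RELEASE_TRIM_START,	/* request covers the head of the entry */
	RELEASE_TRIM_END,	/* request covers the tail of the entry */
	RELEASE_SPLIT,		/* request lies strictly inside the entry */
};

/* Both ranges are inclusive: [res_start, res_end] and [start, end]. */
static enum release_kind classify_release(uint64_t res_start, uint64_t res_end,
					  uint64_t start, uint64_t end)
{
	assert(start <= end);

	if (start < res_start || end > res_end)
		return RELEASE_NO_FIT;
	if (start == res_start && end == res_end)
		return RELEASE_WHOLE;
	if (start == res_start)
		return RELEASE_TRIM_START;
	if (end == res_end)
		return RELEASE_TRIM_END;
	return RELEASE_SPLIT;
}
```

An exact-match-only interface accepts only the RELEASE_WHOLE case. A
request for 0x70000000-0x77ffffff against a single boot entry that
covers a larger range falls into RELEASE_SPLIT, which is the situation
behind the error message quoted above.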

* [PATCH v2 1/3] resource: Add __adjust_resource() for internal use
From: Toshi Kani @ 2013-04-08 17:09 UTC
To: akpm
Cc: linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu, Toshi Kani

Added __adjust_resource(), which is called by adjust_resource()
internally after the resource_lock is held. There is no interface
change to adjust_resource(). This change allows other functions
to call __adjust_resource() internally while the resource_lock is
held.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
 kernel/resource.c |   35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 73f35d4..ae246f9 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -706,24 +706,13 @@ void insert_resource_expand_to_fit(struct resource *root, struct resource *new)
 	write_unlock(&resource_lock);
 }
 
-/**
- * adjust_resource - modify a resource's start and size
- * @res: resource to modify
- * @start: new start value
- * @size: new size
- *
- * Given an existing resource, change its start and size to match the
- * arguments. Returns 0 on success, -EBUSY if it can't fit.
- * Existing children of the resource are assumed to be immutable.
- */
-int adjust_resource(struct resource *res, resource_size_t start, resource_size_t size)
+static int __adjust_resource(struct resource *res, resource_size_t start,
+				resource_size_t size)
 {
 	struct resource *tmp, *parent = res->parent;
 	resource_size_t end = start + size - 1;
 	int result = -EBUSY;
 
-	write_lock(&resource_lock);
-
 	if (!parent)
 		goto skip;
 
@@ -751,6 +740,26 @@ skip:
 	result = 0;
 
 out:
+	return result;
+}
+
+/**
+ * adjust_resource - modify a resource's start and size
+ * @res: resource to modify
+ * @start: new start value
+ * @size: new size
+ *
+ * Given an existing resource, change its start and size to match the
+ * arguments. Returns 0 on success, -EBUSY if it can't fit.
+ * Existing children of the resource are assumed to be immutable.
+ */
+int adjust_resource(struct resource *res, resource_size_t start,
+			resource_size_t size)
+{
+	int result;
+
+	write_lock(&resource_lock);
+	result = __adjust_resource(res, start, size);
 	write_unlock(&resource_lock);
 	return result;
 }
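
The locked-wrapper split used by this patch can be illustrated with a
small userspace sketch. Everything here is an assumption made for the
example: the "lock" is a plain flag standing in for resource_lock, and
the toy_* names are hypothetical. The point is only that the core
routine expects the lock to be held, so a second locked operation can
reuse it without taking the lock twice.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for resource_lock: it only records whether the lock is held. */
static bool toy_lock_held;

static void toy_lock(void)
{
	assert(!toy_lock_held);
	toy_lock_held = true;
}

static void toy_unlock(void)
{
	assert(toy_lock_held);
	toy_lock_held = false;
}

struct toy_range {
	unsigned long start, end;	/* inclusive */
};

/* Lock-free core; the caller must already hold the lock. */
static int toy_adjust_locked(struct toy_range *r, unsigned long start,
			     unsigned long size)
{
	assert(toy_lock_held);
	if (size == 0)
		return -1;
	r->start = start;
	r->end = start + size - 1;
	return 0;
}

/* Public wrapper: take the lock, call the core, drop the lock. */
static int toy_adjust(struct toy_range *r, unsigned long start,
		      unsigned long size)
{
	int ret;

	toy_lock();
	ret = toy_adjust_locked(r, start, size);
	toy_unlock();
	return ret;
}

/* Another locked operation that reuses the core without re-locking. */
static int toy_trim_tail(struct toy_range *r, unsigned long bytes)
{
	unsigned long size;
	int ret = -1;

	toy_lock();
	size = r->end - r->start + 1;
	if (bytes < size)
		ret = toy_adjust_locked(r, r->start, size - bytes);
	toy_unlock();
	return ret;
}
```

In this sketch toy_trim_tail() plays the role that
release_mem_region_adjustable() plays in PATCH 2/3: it already holds the
lock when it needs the adjust logic.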

* Re: [PATCH v2 1/3] resource: Add __adjust_resource() for internal use
From: David Rientjes @ 2013-04-10 6:10 UTC
To: Toshi Kani
Cc: akpm, linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu

On Mon, 8 Apr 2013, Toshi Kani wrote:

> Added __adjust_resource(), which is called by adjust_resource()
> internally after the resource_lock is held. There is no interface
> change to adjust_resource(). This change allows other functions
> to call __adjust_resource() internally while the resource_lock is
> held.
>
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Acked-by: David Rientjes <rientjes@google.com>

* Re: [PATCH v2 1/3] resource: Add __adjust_resource() for internal use
From: Toshi Kani @ 2013-04-10 15:39 UTC
To: David Rientjes
Cc: akpm, linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu

On Tue, 2013-04-09 at 23:10 -0700, David Rientjes wrote:
> On Mon, 8 Apr 2013, Toshi Kani wrote:
>
> > Added __adjust_resource(), which is called by adjust_resource()
> > internally after the resource_lock is held. There is no interface
> > change to adjust_resource(). This change allows other functions
> > to call __adjust_resource() internally while the resource_lock is
> > held.
> >
> > Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> > Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>
> Acked-by: David Rientjes <rientjes@google.com>

Great! Thanks David!
-Toshi

* [PATCH v2 2/3] resource: Add release_mem_region_adjustable()
From: Toshi Kani @ 2013-04-08 17:09 UTC
To: akpm
Cc: linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu, Toshi Kani

Added release_mem_region_adjustable(), which releases a requested
region from a currently busy memory resource. This interface
adjusts the matched memory resource accordingly even if the
requested region does not match exactly but still fits into.

This new interface is intended for memory hot-delete. During
bootup, memory resources are inserted from the boot descriptor
table, such as EFI Memory Table and e820. Each memory resource
entry usually covers the whole contiguous memory range. Memory
hot-delete request, on the other hand, may target to a particular
range of memory resource, and its size can be much smaller than
the whole contiguous memory. Since the existing release interfaces
like __release_region() require a requested region to be exactly
matched to a resource entry, they do not allow a partial resource
to be released.

There is no change to the existing interfaces since their restriction
is valid for I/O resources.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
 include/linux/ioport.h |    2 +
 kernel/resource.c      |   93 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 85ac9b9b..0fe1a82 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -192,6 +192,8 @@ extern struct resource * __request_region(struct resource *,
 extern int __check_region(struct resource *, resource_size_t, resource_size_t);
 extern void __release_region(struct resource *, resource_size_t,
 				resource_size_t);
+extern int release_mem_region_adjustable(struct resource *, resource_size_t,
+				resource_size_t);
 
 static inline int __deprecated check_region(resource_size_t s,
 						resource_size_t n)
diff --git a/kernel/resource.c b/kernel/resource.c
index ae246f9..870fb26 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1021,6 +1021,99 @@ void __release_region(struct resource *parent, resource_size_t start,
 }
 EXPORT_SYMBOL(__release_region);
 
+/**
+ * release_mem_region_adjustable - release a previously reserved memory region
+ * @parent: parent resource descriptor
+ * @start: resource start address
+ * @size: resource region size
+ *
+ * The requested region is released from a currently busy memory resource.
+ * It adjusts the matched busy memory resource accordingly if the requested
+ * region does not match exactly but still fits into. Existing children of
+ * the busy memory resource must be immutable in this request.
+ *
+ * Note, when the busy memory resource gets split into two entries, the code
+ * assumes that all children remain in the lower address entry for simplicity.
+ * Enhance this logic when necessary.
+ */
+int release_mem_region_adjustable(struct resource *parent,
+			resource_size_t start, resource_size_t size)
+{
+	struct resource **p;
+	struct resource *res, *new;
+	resource_size_t end;
+	int ret = -EINVAL;
+
+	end = start + size - 1;
+	if ((start < parent->start) || (end > parent->end))
+		return ret;
+
+	p = &parent->child;
+	write_lock(&resource_lock);
+
+	while ((res = *p)) {
+		if (res->start >= end)
+			break;
+
+		/* look for the next resource if it does not fit into */
+		if (res->start > start || res->end < end) {
+			p = &res->sibling;
+			continue;
+		}
+
+		if (!(res->flags & IORESOURCE_MEM))
+			break;
+
+		if (!(res->flags & IORESOURCE_BUSY)) {
+			p = &res->child;
+			continue;
+		}
+
+		/* found the target resource; let's adjust accordingly */
+		if (res->start == start && res->end == end) {
+			/* free the whole entry */
+			*p = res->sibling;
+			kfree(res);
+			ret = 0;
+		} else if (res->start == start && res->end != end) {
+			/* adjust the start */
+			ret = __adjust_resource(res, end + 1,
+						res->end - end);
+		} else if (res->start != start && res->end == end) {
+			/* adjust the end */
+			ret = __adjust_resource(res, res->start,
+						start - res->start);
+		} else {
+			/* split into two entries */
+			new = kzalloc(sizeof(struct resource), GFP_KERNEL);
+			if (!new) {
+				ret = -ENOMEM;
+				break;
+			}
+			new->name = res->name;
+			new->start = end + 1;
+			new->end = res->end;
+			new->flags = res->flags;
+			new->parent = res->parent;
+			new->sibling = res->sibling;
+			new->child = NULL;
+
+			ret = __adjust_resource(res, res->start,
+						start - res->start);
+			if (ret) {
+				kfree(new);
+				break;
+			}
+			res->sibling = new;
+		}
+
+		break;
+	}
+
+	write_unlock(&resource_lock);
+	return ret;
+}
+
 /*
  * Managed region resource
  */
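
As a rough userspace model of the release logic in this patch, the
sketch below applies the same four cases (free the whole entry, adjust
the start, adjust the end, split into two entries) to a sorted singly
linked list of inclusive ranges. It is an assumption-laden
simplification: struct range, range_new() and range_release() are
hypothetical names, there is no locking, no parent/child tree and no
flag checking, and malloc()/free() stand in for kzalloc()/kfree().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct range {
	uint64_t start, end;	/* inclusive */
	struct range *next;	/* next sibling, sorted by address */
};

static struct range *range_new(uint64_t start, uint64_t end)
{
	struct range *r = malloc(sizeof(*r));

	if (r) {
		r->start = start;
		r->end = end;
		r->next = NULL;
	}
	return r;
}

/* Release [start, start + size - 1] from a sorted, non-overlapping list. */
static int range_release(struct range **head, uint64_t start, uint64_t size)
{
	struct range **p = head, *r, *tail;
	uint64_t end;

	assert(head != NULL);
	if (size == 0)
		return -EINVAL;
	end = start + size - 1;

	while ((r = *p)) {
		if (r->start > end)
			break;			/* no later entry can contain it */
		if (r->start > start || r->end < end) {
			p = &r->next;		/* does not fit in this entry */
			continue;
		}
		if (r->start == start && r->end == end) {
			*p = r->next;		/* free the whole entry */
			free(r);
		} else if (r->start == start) {
			r->start = end + 1;	/* adjust the start */
		} else if (r->end == end) {
			r->end = start - 1;	/* adjust the end */
		} else {
			tail = range_new(end + 1, r->end);	/* split */
			if (!tail)
				return -ENOMEM;
			tail->next = r->next;
			r->end = start - 1;
			r->next = tail;
		}
		return 0;
	}
	return -EINVAL;
}
```

Releasing 0x70000000-0x77ffffff from a single entry 0x0-0x7fffffff
leaves two entries, 0x0-0x6fffffff and 0x78000000-0x7fffffff, which is
the split case in the patch.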

* Re: [PATCH v2 2/3] resource: Add release_mem_region_adjustable()
From: David Rientjes @ 2013-04-10 6:16 UTC
To: Toshi Kani
Cc: akpm, linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu

On Mon, 8 Apr 2013, Toshi Kani wrote:

> Added release_mem_region_adjustable(), which releases a requested
> region from a currently busy memory resource. This interface
> adjusts the matched memory resource accordingly even if the
> requested region does not match exactly but still fits into.
>
> This new interface is intended for memory hot-delete. During
> bootup, memory resources are inserted from the boot descriptor
> table, such as EFI Memory Table and e820. Each memory resource
> entry usually covers the whole contiguous memory range. Memory
> hot-delete request, on the other hand, may target to a particular
> range of memory resource, and its size can be much smaller than
> the whole contiguous memory. Since the existing release interfaces
> like __release_region() require a requested region to be exactly
> matched to a resource entry, they do not allow a partial resource
> to be released.
>
> There is no change to the existing interfaces since their restriction
> is valid for I/O resources.
>
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Should this emit a warning for attempting to free a nonexistent region
like __release_region() does?

I think it would be better to base this off my patch and surround it with
#ifdef CONFIG_MEMORY_HOTREMOVE as suggested by Andrew. There shouldn't be
any conflicts.

* Re: [PATCH v2 2/3] resource: Add release_mem_region_adjustable()
From: Toshi Kani @ 2013-04-10 16:36 UTC
To: David Rientjes
Cc: akpm, linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu

On Tue, 2013-04-09 at 23:16 -0700, David Rientjes wrote:
> On Mon, 8 Apr 2013, Toshi Kani wrote:
>
> > Added release_mem_region_adjustable(), which releases a requested
> > region from a currently busy memory resource. This interface
> > adjusts the matched memory resource accordingly even if the
> > requested region does not match exactly but still fits into.
> >
> > This new interface is intended for memory hot-delete. During
> > bootup, memory resources are inserted from the boot descriptor
> > table, such as EFI Memory Table and e820. Each memory resource
> > entry usually covers the whole contiguous memory range. Memory
> > hot-delete request, on the other hand, may target to a particular
> > range of memory resource, and its size can be much smaller than
> > the whole contiguous memory. Since the existing release interfaces
> > like __release_region() require a requested region to be exactly
> > matched to a resource entry, they do not allow a partial resource
> > to be released.
> >
> > There is no change to the existing interfaces since their restriction
> > is valid for I/O resources.
> >
> > Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> > Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>
> Should this emit a warning for attempting to free a nonexistent region
> like __release_region() does?

Since __release_region() is a void function, it needs to emit a warning
within the function. I made release_mem_region_adjustable() an int
function so that the caller can receive an error and decide what to do
based on its operation. I changed the caller __remove_pages() to emit
a warning message in PATCH 3/3 in this case.

> I think it would be better to base this off my patch and surround it with
> #ifdef CONFIG_MEMORY_HOTREMOVE as suggested by Andrew. There shouldn't be
> any conflicts.

Yes, I realized that CONFIG_MEMORY_HOTREMOVE was a better choice, but I
had to use CONFIG_MEMORY_HOTPLUG at this time. So, thanks for doing the
cleanup! Since it's already rc6, I will keep my patchset independent
for now. I will make a minor change to update CONFIG_MEMORY_HOTPLUG to
CONFIG_MEMORY_HOTREMOVE after your patch gets accepted -- either by
sending a separate patch (if my patchset is already accepted) or
updating my current patchset (if my patchset is not accepted yet).

Thanks!
-Toshi

* [PATCH v2 3/3] mm: Change __remove_pages() to call release_mem_region_adjustable()
From: Toshi Kani @ 2013-04-08 17:09 UTC
To: akpm
Cc: linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu, Toshi Kani

Changed __remove_pages() to call release_mem_region_adjustable().
This allows a requested memory range to be released from
the iomem_resource table even if it does not match exactly to
a resource entry but still fits into. The resource entries
initialized at bootup usually cover the whole contiguous
memory ranges and may not necessarily match with the size of
memory hot-delete requests.

If release_mem_region_adjustable() failed, __remove_pages() logs
an error message and continues to proceed, as was the case
with release_mem_region(). release_mem_region(), which is defined
to __release_region(), logs an error message and returns no error
since it is a void function.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
 mm/memory_hotplug.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 57decb2..c916582 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -705,8 +705,10 @@ EXPORT_SYMBOL_GPL(__add_pages);
 int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 		 unsigned long nr_pages)
 {
-	unsigned long i, ret = 0;
+	unsigned long i;
 	int sections_to_remove;
+	resource_size_t start, size;
+	int ret = 0;
 
 	/*
 	 * We can only remove entire sections
@@ -714,7 +716,12 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 	BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
 	BUG_ON(nr_pages % PAGES_PER_SECTION);
 
-	release_mem_region(phys_start_pfn << PAGE_SHIFT, nr_pages * PAGE_SIZE);
+	start = phys_start_pfn << PAGE_SHIFT;
+	size = nr_pages * PAGE_SIZE;
+	ret = release_mem_region_adjustable(&iomem_resource, start, size);
+	if (ret)
+		pr_warn("Unable to release resource <%016llx-%016llx> (%d)\n",
+				start, start + size - 1, ret);
 
 	sections_to_remove = nr_pages / PAGES_PER_SECTION;
 	for (i = 0; i < sections_to_remove; i++) {
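
The conversion from a page frame range to the byte range passed to the
release call can be checked with a small userspace sketch. Two
assumptions are made purely for the example: a 4 KiB page size
(TOY_PAGE_SHIFT of 12), and the PFN range 0x70000 with 0x8000 pages,
chosen so the result matches the range in the cover letter's error
message. The struct and function names are hypothetical. Widening to
64 bits before shifting and casting for the %llx format are choices of
this sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

struct byte_range {
	uint64_t start;		/* first byte of the range */
	uint64_t size;		/* length in bytes */
	uint64_t end;		/* last byte, inclusive */
};

static struct byte_range pfn_range_to_bytes(unsigned long start_pfn,
					    unsigned long nr_pages)
{
	struct byte_range r;

	assert(nr_pages != 0);
	/* Widen before shifting so a 32-bit unsigned long cannot overflow. */
	r.start = (uint64_t)start_pfn << TOY_PAGE_SHIFT;
	r.size = (uint64_t)nr_pages << TOY_PAGE_SHIFT;
	r.end = r.start + r.size - 1;
	return r;
}

/* Format the range the same way as the messages in this thread. */
static int format_range(char *buf, size_t len, struct byte_range r)
{
	return snprintf(buf, len, "<%016llx-%016llx>",
			(unsigned long long)r.start,
			(unsigned long long)r.end);
}
```

With these inputs the sketch yields start 0x70000000, size 0x08000000
and an inclusive end of 0x77ffffff.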

* Re: [PATCH v2 0/3] Support memory hot-delete to boot memory
From: Andrew Morton @ 2013-04-08 20:44 UTC
To: Toshi Kani
Cc: linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu

On Mon, 8 Apr 2013 11:09:53 -0600 Toshi Kani <toshi.kani@hp.com> wrote:

> Memory hot-delete to a memory range present at boot causes an
> error message in __release_region(), such as:
>
>  Trying to free nonexistent resource <0000000070000000-0000000077ffffff>
>
> Hot-delete operation still continues since __release_region() is
> a void function, but the target memory range is not freed from
> iomem_resource as the result. This also leads to a failure in a
> subsequent hot-add operation to the same memory range since the
> address range is still in-use in iomem_resource.
>
> This problem happens because the granularity of memory resource ranges
> may be different between boot and hot-delete.

So we don't need this new code if CONFIG_MEMORY_HOTPLUG=n? If so, can
we please arrange for it to not be present if the user doesn't need it?

> During bootup,
> iomem_resource is set up from the boot descriptor table, such as EFI
> Memory Table and e820. Each resource entry usually covers the whole
> contiguous memory range. Hot-delete request, on the other hand, may
> target to a particular range of memory resource, and its size can be
> much smaller than the whole contiguous memory. Since the existing
> release interfaces like __release_region() require a requested region
> to be exactly matched to a resource entry, they do not allow a partial
> resource to be released.
>
> This patchset introduces release_mem_region_adjustable() for memory
> hot-delete operations, which allows releasing a partial memory range
> and adjusts remaining resource accordingly. This patchset makes no
> changes to the existing interfaces since their restriction is still
> valid for I/O resources.

* Re: [PATCH v2 0/3] Support memory hot-delete to boot memory
From: Toshi Kani @ 2013-04-08 20:58 UTC
To: Andrew Morton
Cc: linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu

On Mon, 2013-04-08 at 13:44 -0700, Andrew Morton wrote:
> On Mon, 8 Apr 2013 11:09:53 -0600 Toshi Kani <toshi.kani@hp.com> wrote:
>
> > Memory hot-delete to a memory range present at boot causes an
> > error message in __release_region(), such as:
> >
> >  Trying to free nonexistent resource <0000000070000000-0000000077ffffff>
> >
> > Hot-delete operation still continues since __release_region() is
> > a void function, but the target memory range is not freed from
> > iomem_resource as the result. This also leads to a failure in a
> > subsequent hot-add operation to the same memory range since the
> > address range is still in-use in iomem_resource.
> >
> > This problem happens because the granularity of memory resource ranges
> > may be different between boot and hot-delete.
>
> So we don't need this new code if CONFIG_MEMORY_HOTPLUG=n? If so, can
> we please arrange for it to not be present if the user doesn't need it?

Good point! Yes, since the new function is intended for memory
hot-delete and is only called from __remove_pages() in
mm/memory_hotplug.c, it should be added as #ifdef CONFIG_MEMORY_HOTPLUG
in PATCH 2/3.

I will make the change, and send an updated patch to PATCH 2/3.

Thanks,
-Toshi

* Re: [PATCH v2 0/3] Support memory hot-delete to boot memory
From: David Rientjes @ 2013-04-10 5:52 UTC
To: Toshi Kani
Cc: Andrew Morton, linux-mm, linux-kernel, linuxram, guz.fnst, tmac, isimatu.yasuaki, wency, tangchen, jiang.liu

On Mon, 8 Apr 2013, Toshi Kani wrote:

> > So we don't need this new code if CONFIG_MEMORY_HOTPLUG=n? If so, can
> > we please arrange for it to not be present if the user doesn't need it?
>
> Good point! Yes, since the new function is intended for memory
> hot-delete and is only called from __remove_pages() in
> mm/memory_hotplug.c, it should be added as #ifdef CONFIG_MEMORY_HOTPLUG
> in PATCH 2/3.
>
> I will make the change, and send an updated patch to PATCH 2/3.
>

It should actually depend on CONFIG_MEMORY_HOTREMOVE, but the pseries
OF_RECONFIG_DETACH_NODE code seems to be the only code that doesn't
make that distinction. CONFIG_MEMORY_HOTREMOVE acts as a wrapper to
protect configs that don't have ARCH_ENABLE_MEMORY_HOTREMOVE, so we'll
want to keep it around and presumably that powerpc code depends on it as
well.
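
The compile-time gating discussed here is commonly paired with a stub
that returns an error, so callers do not need their own #ifdef. The
sketch below is a userspace illustration under stated assumptions:
TOY_HOTREMOVE is a hypothetical build switch standing in for
CONFIG_MEMORY_HOTREMOVE, and the toy_* functions are made up for the
example.

```c
#include <assert.h>
#include <errno.h>

/*
 * TOY_HOTREMOVE is a hypothetical build switch standing in for
 * CONFIG_MEMORY_HOTREMOVE; it is left undefined here on purpose.
 */
#ifdef TOY_HOTREMOVE
static int toy_remove_memory(unsigned long base, unsigned long size)
{
	/* A real implementation would go here. */
	(void)base;
	(void)size;
	return 0;
}
#else
/* Stub: callers still compile and link, and get a clear error code. */
static inline int toy_remove_memory(unsigned long base, unsigned long size)
{
	(void)base;
	(void)size;
	return -EOPNOTSUPP;
}
#endif /* TOY_HOTREMOVE */

/* The caller needs no #ifdef of its own. */
static int toy_handle_detach(unsigned long base, unsigned long size)
{
	assert(size != 0);
	return toy_remove_memory(base, size);
}
```

Built without -DTOY_HOTREMOVE, toy_handle_detach() returns -EOPNOTSUPP
and the real implementation is not compiled at all, so it adds nothing
to the text size.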

* [patch] mm, hotplug: avoid compiling memory hotremove functions when disabled
From: David Rientjes @ 2013-04-10 6:07 UTC
To: Andrew Morton
Cc: Toshi Kani, Benjamin Herrenschmidt, Paul Mackerras, Greg Kroah-Hartman, Wen Congyang, Tang Chen, Yasuaki Ishimatsu, linux-kernel, linuxppc-dev, linux-mm

__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE. PowerPC
pseries will return -EOPNOTSUPP if unsupported.

Adding an #ifdef causes several other functions it depends on to also
become unnecessary, which saves in .text when disabled (it's disabled in
most defconfigs besides powerpc, including x86). remove_memory_block()
becomes static since it is not referenced outside of
drivers/base/memory.c.

Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled
and disabled.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/powerpc/platforms/pseries/hotplug-memory.c | 12 +++++
 drivers/base/memory.c                           | 44 +++++++--------
 include/linux/memory.h                          |  3 +-
 include/linux/memory_hotplug.h                  |  4 +-
 mm/memory_hotplug.c                             | 68 +++++++++++------------
 mm/sparse.c                                     | 72 +++++++++++++------------
 6 files changed, 113 insertions(+), 90 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -72,6 +72,7 @@ unsigned long memory_block_size_bytes(void)
 	return get_memblock_size();
 }
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
 static int pseries_remove_memblock(unsigned long base, unsigned int memblock_size)
 {
 	unsigned long start, start_pfn;
@@ -153,6 +154,17 @@ static int pseries_remove_memory(struct device_node *np)
 	ret = pseries_remove_memblock(base, lmb_size);
 	return ret;
 }
+#else
+static inline int pseries_remove_memblock(unsigned long base,
+					  unsigned int memblock_size)
+{
+	return -EOPNOTSUPP;
+}
+static inline int pseries_remove_memory(struct device_node *np)
+{
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_MEMORY_HOTREMOVE */
 
 static int pseries_add_memory(struct device_node *np)
 {
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -93,16 +93,6 @@ int register_memory(struct memory_block *memory)
 	return error;
 }
 
-static void
-unregister_memory(struct memory_block *memory)
-{
-	BUG_ON(memory->dev.bus != &memory_subsys);
-
-	/* drop the ref. we got in remove_memory_block() */
-	kobject_put(&memory->dev.kobj);
-	device_unregister(&memory->dev);
-}
-
 unsigned long __weak memory_block_size_bytes(void)
 {
 	return MIN_MEMORY_BLOCK_SIZE;
@@ -637,8 +627,28 @@ static int add_memory_section(int nid, struct mem_section *section,
 	return ret;
 }
 
-int remove_memory_block(unsigned long node_id, struct mem_section *section,
-			int phys_device)
+/*
+ * need an interface for the VM to add new memory regions,
+ * but without onlining it.
+ */
+int register_new_memory(int nid, struct mem_section *section)
+{
+	return add_memory_section(nid, section, NULL, MEM_OFFLINE, HOTPLUG);
+}
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+static void
+unregister_memory(struct memory_block *memory)
+{
+	BUG_ON(memory->dev.bus != &memory_subsys);
+
+	/* drop the ref. we got in remove_memory_block() */
+	kobject_put(&memory->dev.kobj);
+	device_unregister(&memory->dev);
+}
+
+static int remove_memory_block(unsigned long node_id,
+			       struct mem_section *section, int phys_device)
 {
 	struct memory_block *mem;
 
@@ -661,15 +671,6 @@ int remove_memory_block(unsigned long node_id, struct mem_section *section,
 	return 0;
 }
 
-/*
- * need an interface for the VM to add new memory regions,
- * but without onlining it.
- */
-int register_new_memory(int nid, struct mem_section *section)
-{
-	return add_memory_section(nid, section, NULL, MEM_OFFLINE, HOTPLUG);
-}
-
 int unregister_memory_section(struct mem_section *section)
 {
 	if (!present_section(section))
@@ -677,6 +678,7 @@ int unregister_memory_section(struct mem_section *section)
 
 	return remove_memory_block(0, section, 0);
 }
+#endif /* CONFIG_MEMORY_HOTREMOVE */
 
 /*
  * offline one memory block. If the memory block has been offlined, do nothing.
diff --git a/include/linux/memory.h b/include/linux/memory.h
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -114,9 +114,10 @@ extern void unregister_memory_notifier(struct notifier_block *nb);
 extern int register_memory_isolate_notifier(struct notifier_block *nb);
 extern void unregister_memory_isolate_notifier(struct notifier_block *nb);
 extern int register_new_memory(int, struct mem_section *);
+#ifdef CONFIG_MEMORY_HOTREMOVE
 extern int unregister_memory_section(struct mem_section *);
+#endif
 extern int memory_dev_init(void);
-extern int remove_memory_block(unsigned long, struct mem_section *, int);
 extern int memory_notify(unsigned long val, void *v);
 extern int memory_isolate_notify(unsigned long val, void *v);
 extern struct memory_block *find_memory_block_hinted(struct mem_section *,
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -97,13 +97,13 @@ extern void __online_page_free(struct page *page);
 #ifdef CONFIG_MEMORY_HOTREMOVE
 extern bool is_pageblock_removable_nolock(struct page *page);
 extern int arch_remove_memory(u64 start, u64 size);
+extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
+	unsigned long nr_pages);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 /* reasonably generic interface to expand the physical pages in a zone */
 extern int __add_pages(int nid, struct zone *zone, unsigned long start_pfn,
 	unsigned long nr_pages);
-extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
-	unsigned long nr_pages);
 
 #ifdef CONFIG_NUMA
 extern int memory_add_physaddr_to_nid(u64 start);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -436,6 +436,40 @@ static int __meminit __add_section(int nid, struct zone *zone,
 	return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
 }
 
+/*
+ * Reasonably generic function for adding memory. It is
+ * expected that archs that support memory hotplug will
+ * call this function after deciding the zone to which to
+ * add the new pages.
+ */
+int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
+			unsigned long nr_pages)
+{
+	unsigned long i;
+	int err = 0;
+	int start_sec, end_sec;
+	/* during initialize mem_map, align hot-added range to section */
+	start_sec = pfn_to_section_nr(phys_start_pfn);
+	end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
+
+	for (i = start_sec; i <= end_sec; i++) {
+		err = __add_section(nid, zone, i << PFN_SECTION_SHIFT);
+
+		/*
+		 * EEXIST is finally dealt with by ioresource collision
+		 * check. see add_memory() => register_memory_resource()
+		 * Warning will be printed if there is collision.
+		 */
+		if (err && (err != -EEXIST))
+			break;
+		err = 0;
+	}
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(__add_pages);
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
 /* find the smallest valid pfn in the range [start_pfn, end_pfn) */
 static int find_smallest_section_pfn(int nid, struct zone *zone,
 				     unsigned long start_pfn,
@@ -658,39 +692,6 @@ static int __remove_section(struct zone *zone, struct mem_section *ms)
 	return 0;
 }
 
-/*
- * Reasonably generic function for adding memory. It is
- * expected that archs that support memory hotplug will
- * call this function after deciding the zone to which to
- * add the new pages.
- */
-int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
-			unsigned long nr_pages)
-{
-	unsigned long i;
-	int err = 0;
-	int start_sec, end_sec;
-	/* during initialize mem_map, align hot-added range to section */
-	start_sec = pfn_to_section_nr(phys_start_pfn);
-	end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
-
-	for (i = start_sec; i <= end_sec; i++) {
-		err = __add_section(nid, zone, i << PFN_SECTION_SHIFT);
-
-		/*
-		 * EEXIST is finally dealt with by ioresource collision
-		 * check. see add_memory() => register_memory_resource()
-		 * Warning will be printed if there is collision.
-		 */
-		if (err && (err != -EEXIST))
-			break;
-		err = 0;
-	}
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(__add_pages);
-
 /**
  * __remove_pages() - remove sections of pages from a zone
  * @zone: zone from which pages need to be removed
@@ -726,6 +727,7 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 	return ret;
 }
 EXPORT_SYMBOL_GPL(__remove_pages);
+#endif /* CONFIG_MEMORY_HOTREMOVE */
 
 int set_online_page_callback(online_page_callback_t callback)
 {
diff --git a/mm/sparse.c b/mm/sparse.c
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -620,6 +620,7 @@ static void __kfree_section_memmap(struct page *memmap, unsigned long nr_pages)
 
 	vmemmap_free(start, end);
 }
+#ifdef CONFIG_MEMORY_HOTREMOVE
 static void free_map_bootmem(struct page *memmap, unsigned long nr_pages)
 {
 	unsigned long start = (unsigned long)memmap;
@@ -627,6 +628,7 @@ static void free_map_bootmem(struct page *memmap, unsigned long nr_pages)
 
 	vmemmap_free(start, end);
 }
+#endif /* CONFIG_MEMORY_HOTREMOVE */
 #else
 static struct page *__kmalloc_section_memmap(unsigned long nr_pages)
 {
@@ -664,6 +666,7 @@ static void __kfree_section_memmap(struct page *memmap, unsigned long nr_pages)
 			   get_order(sizeof(struct page) * nr_pages));
 }
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
 static void free_map_bootmem(struct page *memmap, unsigned long nr_pages)
 {
 	unsigned long maps_section_nr, removing_section_nr, i;
@@
-690,40 +693,9 @@ static void free_map_bootmem(struct page *memmap, unsigned long nr_pages) put_page_bootmem(page); } } +#endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_SPARSEMEM_VMEMMAP */ -static void free_section_usemap(struct page *memmap, unsigned long *usemap) -{ - struct page *usemap_page; - unsigned long nr_pages; - - if (!usemap) - return; - - usemap_page = virt_to_page(usemap); - /* - * Check to see if allocation came from hot-plug-add - */ - if (PageSlab(usemap_page) || PageCompound(usemap_page)) { - kfree(usemap); - if (memmap) - __kfree_section_memmap(memmap, PAGES_PER_SECTION); - return; - } - - /* - * The usemap came from bootmem. This is packed with other usemaps - * on the section which has pgdat at boot time. Just keep it as is now. - */ - - if (memmap) { - nr_pages = PAGE_ALIGN(PAGES_PER_SECTION * sizeof(struct page)) - >> PAGE_SHIFT; - - free_map_bootmem(memmap, nr_pages); - } -} - /* * returns the number of sections whose mem_maps were properly * set. If this is <=0, then that means that the passed-in @@ -800,6 +772,39 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages) } #endif +#ifdef CONFIG_MEMORY_HOTREMOVE +static void free_section_usemap(struct page *memmap, unsigned long *usemap) +{ + struct page *usemap_page; + unsigned long nr_pages; + + if (!usemap) + return; + + usemap_page = virt_to_page(usemap); + /* + * Check to see if allocation came from hot-plug-add + */ + if (PageSlab(usemap_page) || PageCompound(usemap_page)) { + kfree(usemap); + if (memmap) + __kfree_section_memmap(memmap, PAGES_PER_SECTION); + return; + } + + /* + * The usemap came from bootmem. This is packed with other usemaps + * on the section which has pgdat at boot time. Just keep it as is now. 
+ */ + + if (memmap) { + nr_pages = PAGE_ALIGN(PAGES_PER_SECTION * sizeof(struct page)) + >> PAGE_SHIFT; + + free_map_bootmem(memmap, nr_pages); + } +} + void sparse_remove_one_section(struct zone *zone, struct mem_section *ms) { struct page *memmap = NULL; @@ -819,4 +824,5 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms) clear_hwpoisoned_pages(memmap, PAGES_PER_SECTION); free_section_usemap(memmap, usemap); } -#endif +#endif /* CONFIG_MEMORY_HOTREMOVE */ +#endif /* CONFIG_MEMORY_HOTPLUG */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
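For readers unfamiliar with the idiom, the stub pattern this patch applies to pseries_remove_memblock() and pseries_remove_memory() can be reduced to the standalone sketch below. It is only an illustration, not kernel code: DEMO_HOTREMOVE is a hypothetical macro standing in for CONFIG_MEMORY_HOTREMOVE, and demo_remove_memory() and demo_handle_remove_request() are hypothetical names that do not exist in the tree.

```c
#include <errno.h>

/*
 * DEMO_HOTREMOVE is a hypothetical stand-in for CONFIG_MEMORY_HOTREMOVE.
 * Build with -DDEMO_HOTREMOVE to compile the real implementation.
 */
#ifdef DEMO_HOTREMOVE
static int demo_remove_memory(unsigned long base, unsigned long size)
{
	/* The actual removal work would go here. */
	(void)base;
	(void)size;
	return 0;
}
#else
/* Feature compiled out: keep the same signature, report "not supported". */
static inline int demo_remove_memory(unsigned long base, unsigned long size)
{
	(void)base;
	(void)size;
	return -EOPNOTSUPP;
}
#endif

/* The caller is identical in both configurations and needs no #ifdef. */
static int demo_handle_remove_request(unsigned long base, unsigned long size)
{
	return demo_remove_memory(base, size);
}
```

Because the stub keeps the signature of the real function, call sites compile unchanged and the removal code contributes nothing to .text when the option is off, which appears to be the saving the changelog describes.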
* Re: [patch] mm, hotplug: avoid compiling memory hotremove functions when disabled
  2013-04-10  6:07 ` [patch] mm, hotplug: avoid compiling memory hotremove functions when disabled David Rientjes
@ 2013-04-10 17:29 ` Toshi Kani
From: Toshi Kani @ 2013-04-10 17:29 UTC
To: David Rientjes
Cc: Andrew Morton, Benjamin Herrenschmidt, Paul Mackerras, Greg Kroah-Hartman, Wen Congyang, Tang Chen, Yasuaki Ishimatsu, linux-kernel, linuxppc-dev, linux-mm

On Tue, 2013-04-09 at 23:07 -0700, David Rientjes wrote:
> __remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE.  PowerPC
> pseries will return -EOPNOTSUPP if unsupported.
>
> Adding an #ifdef causes several other functions it depends on to also
> become unnecessary, which saves in .text when disabled (it's disabled in
> most defconfigs besides powerpc, including x86).  remove_memory_block()
> becomes static since it is not referenced outside of
> drivers/base/memory.c.
>
> Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled
> and disabled.
>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Toshi Kani <toshi.kani@hp.com>

Thanks,
-Toshi
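As a side note on the __add_pages() body that the patch moves unchanged: it walks every section touched by the PFN range and treats -EEXIST from a single section as non-fatal. The standalone sketch below mirrors that control flow under stated assumptions. DEMO_SECTION_SHIFT, demo_pfn_to_section_nr(), demo_add_section() and demo_add_pages() are hypothetical stand-ins, the shift value is arbitrary, and the per-section results are hard-coded purely to exercise both paths.

```c
#include <errno.h>

/* Hypothetical section size: 1 << 15 pages per section. */
#define DEMO_SECTION_SHIFT 15UL

static unsigned long demo_pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> DEMO_SECTION_SHIFT;
}

/*
 * Hypothetical per-section add. For demonstration only: section 1 is
 * reported as already present, section 3 fails with -ENOMEM.
 */
static int demo_add_section(unsigned long start_pfn)
{
	unsigned long sec = demo_pfn_to_section_nr(start_pfn);

	if (sec == 1)
		return -EEXIST;
	if (sec == 3)
		return -ENOMEM;
	return 0;
}

/* Same loop shape as __add_pages(); nr_pages is assumed to be non-zero. */
static int demo_add_pages(unsigned long start_pfn, unsigned long nr_pages)
{
	unsigned long i, start_sec, end_sec;
	int err = 0;

	/* Widen the range to whole sections. */
	start_sec = demo_pfn_to_section_nr(start_pfn);
	end_sec = demo_pfn_to_section_nr(start_pfn + nr_pages - 1);

	for (i = start_sec; i <= end_sec; i++) {
		err = demo_add_section(i << DEMO_SECTION_SHIFT);

		/* An existing section is tolerated; anything else aborts. */
		if (err && err != -EEXIST)
			break;
		err = 0;
	}

	return err;
}
```

In this sketch a range covering sections 0 to 2 succeeds even though section 1 already exists, while a range that reaches section 3 stops there and returns the error from that section.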
Thread overview: 13+ messages
2013-04-08 17:09 [PATCH v2 0/3] Support memory hot-delete to boot memory Toshi Kani
2013-04-08 17:09 ` [PATCH v2 1/3] resource: Add __adjust_resource() for internal use Toshi Kani
2013-04-10  6:10 ` David Rientjes
2013-04-10 15:39 ` Toshi Kani
2013-04-08 17:09 ` [PATCH v2 2/3] resource: Add release_mem_region_adjustable() Toshi Kani
2013-04-10  6:16 ` David Rientjes
2013-04-10 16:36 ` Toshi Kani
2013-04-08 17:09 ` [PATCH v2 3/3] mm: Change __remove_pages() to call release_mem_region_adjustable() Toshi Kani
2013-04-08 20:44 ` [PATCH v2 0/3] Support memory hot-delete to boot memory Andrew Morton
2013-04-08 20:58 ` Toshi Kani
2013-04-10  5:52 ` David Rientjes
2013-04-10  6:07 ` [patch] mm, hotplug: avoid compiling memory hotremove functions when disabled David Rientjes
2013-04-10 17:29 ` Toshi Kani