linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v6 00/15] memory-hotplug: hot-remove physical memory
@ 2013-01-09  9:32 Tang Chen
From: Tang Chen @ 2013-01-09  9:32 UTC (permalink / raw)
  To: akpm, rientjes, len.brown, benh, paulus, cl, minchan.kim,
	kosaki.motohiro, isimatu.yasuaki, wujianguo, wency, tangchen, hpa,
	linfeng, laijs, mgorman, yinghai, glommer
  Cc: linux-s390, linux-ia64, linux-acpi, linux-sh, x86, linux-kernel,
	cmetcalf, linux-mm, sparclinux, linuxppc-dev

Here is the physical memory hot-remove patch-set based on 3.8-rc2.

This patch-set aims to implement physical memory hot-removing.

The patches can free/remove the following things:

  - /sys/firmware/memmap/X/{end, start, type} : [PATCH 4/15]
  - memmap of sparse-vmemmap                  : [PATCH 6,7,8,10/15]
  - page table of removed memory              : [RFC PATCH 7,8,10/15]
  - node and related sysfs files              : [RFC PATCH 13-15/15]


Existing problem:
If CONFIG_MEMCG is selected, we allocate memory to store the page cgroup
data when we online pages.

For example: there is a memory device on node 1. The address range
is [1G, 1.5G). You will find 4 new directories memory8, memory9, memory10,
and memory11 under the directory /sys/devices/system/memory/.
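
The block numbers in this example follow from simple arithmetic. A minimal
sketch, assuming a memory block size of 128 MiB (an assumption that is
consistent with the four directories named above; on a real system the size
is reported in /sys/devices/system/memory/block_size_bytes). The helper name
memory_block_ids is hypothetical.

```python
GIB = 1 << 30
MIB = 1 << 20

# Assumed block size; not stated in the cover letter.
ASSUMED_BLOCK_SIZE = 128 * MIB


def memory_block_ids(start, end, block_size=ASSUMED_BLOCK_SIZE):
    """Return the memoryN indices covering the half-open range [start, end)."""
    if start % block_size or end % block_size:
        raise ValueError("range must be aligned to the block size")
    return list(range(start // block_size, end // block_size))


if __name__ == "__main__":
    # The device in the example covers [1G, 1.5G).
    ids = memory_block_ids(1 * GIB, 3 * GIB // 2)
    print(["memory%d" % i for i in ids])
```

With the assumed block size, the range [1G, 1.5G) maps to indices 8 through
11, matching memory8 to memory11.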

If CONFIG_MEMCG is selected, when we online memory8, the memory that stores
its page cgroup is not provided by this memory device. But when we online
memory9, the memory that stores its page cgroup may be provided by memory8.
In that case we can't offline memory8 while memory9 is still online, so we
should offline the memory in reverse order.

When the memory device is hot-removed, we automatically offline the memory
provided by this memory device. But we don't know which memory block was
onlined first, so offlining the memory may fail.

In patch1, we provide a solution which is not good enough:
Iterate twice to offline the memory.
1st iteration: offline every non-primary memory block.
2nd iteration: offline the primary (i.e. first added) memory block.
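
The two-pass idea can be illustrated with a toy userspace model. This is only
a sketch, not the code in patch 1: the holder map (which block stores another
block's page cgroup) and both function names are hypothetical, and the model
simply retries every block that is still online in the second pass.

```python
def can_offline(block, online, holder):
    """A block is pinned while another online block keeps its page cgroup in it."""
    return not any(
        holder.get(other) == block for other in online if other != block
    )


def offline_two_pass(blocks, holder):
    """Try to offline every block, iterating over the online set twice.

    Returns (order in which blocks went offline, blocks still online).
    """
    online = set(blocks)
    order = []
    for _ in range(2):
        # sorted() copies the set, so removing entries inside the loop is safe.
        for block in sorted(online):
            if can_offline(block, online, holder):
                online.remove(block)
                order.append(block)
    return order, online


if __name__ == "__main__":
    # memory9..memory11 all keep their page cgroup in memory8.
    print(offline_two_pass([8, 9, 10, 11], {9: 8, 10: 8, 11: 8}))
    # A chain: each block keeps its page cgroup in the previous one.
    print(offline_two_pass([8, 9, 10, 11], {9: 8, 10: 9, 11: 10}))
```

In the first case the model offlines memory9, memory10 and memory11 in the
first pass and memory8 in the second. In the chained case two passes still
leave memory8 and memory9 online, which is one way a fixed number of passes
could fall short in this model.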

And a new idea from Wen Congyang <wency@cn.fujitsu.com> is:
allocate the page cgroup memory from the memory block it describes.

But we are not sure if it is OK to do so, because there is no existing API
for it, and we would need to move the page_cgroup memory allocation from
MEM_GOING_ONLINE to MEM_ONLINE. It may also interfere with hugepages.



How to test this patchset?
1. apply this patchset and build the kernel. MEMORY_HOTPLUG, MEMORY_HOTREMOVE,
   ACPI_HOTPLUG_MEMORY must be selected.
2. load the module acpi_memhotplug
3. hotplug the memory device (this depends on your hardware)
   You will see the memory device under the directory /sys/bus/acpi/devices/.
   Its name is PNP0C80:XX.
4. online/offline pages provided by this memory device
   You can write online/offline to /sys/devices/system/memory/memoryX/state to
   online/offline pages provided by this memory device
5. hotremove the memory device
   You can hotremove the memory device through the hardware, or by writing 1 to
   /sys/bus/acpi/devices/PNP0C80:XX/eject.
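
Steps 4 and 5 can be scripted. A minimal sketch in Python, using only the
sysfs paths named in the steps above. The helper names and the sysfs_root and
acpi_root parameters are hypothetical; the parameters exist so the functions
can be pointed at a scratch directory. Writing to the real files requires
root, and a failed offline would typically surface as an OSError.

```python
import os

MEMORY_ROOT = "/sys/devices/system/memory"
ACPI_ROOT = "/sys/bus/acpi/devices"


def set_block_state(block, state, sysfs_root=MEMORY_ROOT):
    """Write 'online' or 'offline' to memory<block>/state and return the path."""
    if state not in ("online", "offline"):
        raise ValueError("state must be 'online' or 'offline'")
    path = os.path.join(sysfs_root, "memory%d" % block, "state")
    with open(path, "w") as f:
        f.write(state)
    return path


def eject_device(device, acpi_root=ACPI_ROOT):
    """Write 1 to <device>/eject to request hot-removal and return the path."""
    path = os.path.join(acpi_root, device, "eject")
    with open(path, "w") as f:
        f.write("1")
    return path
```

For example, set_block_state(9, "offline") offlines memory9, and
eject_device() takes the device name as it appears under
/sys/bus/acpi/devices/.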


Note: if the memory provided by the memory device is used by the kernel, it
can't be offlined. This is not a bug.


Changelogs from v5 to v6:
 Patch3: Add some more comments to explain memory hot-remove.
 Patch4: Remove bootmem member in struct firmware_map_entry.
 Patch6: Repeatedly register bootmem pages when using hugepage.
 Patch8: Repeatedly free bootmem pages when using hugepage.
 Patch14: Don't free pgdat when offlining a node, just reset it to 0.
 Patch15: New patch, pgdat is not freed in patch14, so don't allocate a new
          one when onlining a node.

Changelogs from v4 to v5:
 Patch7: new patch, move pgdat_resize_lock into sparse_remove_one_section() to
         avoid disabling IRQs, because we need to flush the TLB when freeing
         pagetables.
 Patch8: new patch, pick up some common APIs that are used to free direct mapping
         and vmemmap pagetables.
 Patch9: free direct mapping pagetables on x86_64 arch.
 Patch10: free vmemmap pagetables.
 Patch11: since freeing memmap with vmemmap has been implemented, the config
          macro CONFIG_SPARSEMEM_VMEMMAP when defining __remove_section() is
          no longer needed.
 Patch13: no need to modify acpi_memory_disable_device() since it was removed,
          and add nid parameter when calling remove_memory().

Changelogs from v3 to v4:
 Patch7: remove unused code.
 Patch8: fix nr_pages that is passed to free_map_bootmem()

Changelogs from v2 to v3:
 Patch9: call sync_global_pgds() if pgd is changed
 Patch10: fix a problem in the patch

Changelogs from v1 to v2:
 Patch1: new patch, offline memory twice. 1st iteration: offline every
         non-primary memory block. 2nd iteration: offline the primary
         (i.e. first added) memory block.

 Patch3: new patch, no logical change, just remove redundant code.

 Patch9: merge the patch from wujianguo into this patch. flush the TLB on all
         CPUs after the pagetable is changed.

 Patch12: new patch, free node_data when a node is offlined.


Tang Chen (6):
  memory-hotplug: move pgdat_resize_lock into
    sparse_remove_one_section()
  memory-hotplug: remove page table of x86_64 architecture
  memory-hotplug: remove memmap of sparse-vmemmap
  memory-hotplug: Integrated __remove_section() of
    CONFIG_SPARSEMEM_VMEMMAP.
  memory-hotplug: remove sysfs file of node
  memory-hotplug: Do not allocate pdgat if it was not freed when
    offline.

Wen Congyang (5):
  memory-hotplug: try to offline the memory twice to avoid dependence
  memory-hotplug: remove redundant codes
  memory-hotplug: introduce new function arch_remove_memory() for
    removing page table depends on architecture
  memory-hotplug: Common APIs to support page tables hot-remove
  memory-hotplug: free node_data when a node is offlined

Yasuaki Ishimatsu (4):
  memory-hotplug: check whether all memory blocks are offlined or not
    when removing memory
  memory-hotplug: remove /sys/firmware/memmap/X sysfs
  memory-hotplug: implement register_page_bootmem_info_section of
    sparse-vmemmap
  memory-hotplug: memory_hotplug: clear zone when removing the memory

 arch/arm64/mm/mmu.c                  |    3 +
 arch/ia64/mm/discontig.c             |   10 +
 arch/ia64/mm/init.c                  |   18 ++
 arch/powerpc/mm/init_64.c            |   10 +
 arch/powerpc/mm/mem.c                |   12 +
 arch/s390/mm/init.c                  |   12 +
 arch/s390/mm/vmem.c                  |   10 +
 arch/sh/mm/init.c                    |   17 ++
 arch/sparc/mm/init_64.c              |   10 +
 arch/tile/mm/init.c                  |    8 +
 arch/x86/include/asm/pgtable_types.h |    1 +
 arch/x86/mm/init_32.c                |   12 +
 arch/x86/mm/init_64.c                |  390 +++++++++++++++++++++++++++++
 arch/x86/mm/pageattr.c               |   47 ++--
 drivers/acpi/acpi_memhotplug.c       |    8 +-
 drivers/base/memory.c                |    6 +
 drivers/firmware/memmap.c            |   96 +++++++-
 include/linux/bootmem.h              |    1 +
 include/linux/firmware-map.h         |    6 +
 include/linux/memory_hotplug.h       |   15 +-
 include/linux/mm.h                   |    4 +-
 mm/memory_hotplug.c                  |  459 +++++++++++++++++++++++++++++++---
 mm/sparse.c                          |    8 +-
 23 files changed, 1094 insertions(+), 69 deletions(-)


Thread overview: 67+ messages
2013-01-09  9:32 [PATCH v6 00/15] memory-hotplug: hot-remove physical memory Tang Chen
2013-01-09  9:32 ` [PATCH v6 01/15] memory-hotplug: try to offline the memory twice to avoid dependence Tang Chen
2013-01-09  9:32 ` [PATCH v6 02/15] memory-hotplug: check whether all memory blocks are offlined or not when removing memory Tang Chen
2013-01-09 23:11   ` Andrew Morton
2013-01-10  5:56     ` Tang Chen
2013-01-09  9:32 ` [PATCH v6 03/15] memory-hotplug: remove redundant codes Tang Chen
2013-01-09  9:32 ` [PATCH v6 04/15] memory-hotplug: remove /sys/firmware/memmap/X sysfs Tang Chen
2013-01-09 22:49   ` Andrew Morton
2013-01-10  6:07     ` Tang Chen
2013-01-09 23:19   ` Andrew Morton
2013-01-10  6:15     ` Tang Chen
2013-01-09  9:32 ` [PATCH v6 05/15] memory-hotplug: introduce new function arch_remove_memory() for removing page table depends on architecture Tang Chen
2013-01-09 22:50   ` Andrew Morton
2013-01-10  2:25     ` Tang Chen
2013-01-09  9:32 ` [PATCH v6 06/15] memory-hotplug: implement register_page_bootmem_info_section of sparse-vmemmap Tang Chen
2013-01-09  9:32 ` [PATCH v6 07/15] memory-hotplug: move pgdat_resize_lock into sparse_remove_one_section() Tang Chen
2013-01-09  9:32 ` [PATCH v6 08/15] memory-hotplug: Common APIs to support page tables hot-remove Tang Chen
2013-01-29 13:02   ` Simon Jeons
2013-01-30  1:53     ` Jianguo Wu
2013-01-30  2:13       ` Simon Jeons
2013-01-29 13:04   ` Simon Jeons
2013-01-30  2:16     ` Tang Chen
2013-01-30  3:27       ` Simon Jeons
2013-01-30  5:55         ` Tang Chen
2013-01-30  7:32           ` Simon Jeons
2013-02-04 23:04   ` Andrew Morton
2013-01-09  9:32 ` [PATCH v6 09/15] memory-hotplug: remove page table of x86_64 architecture Tang Chen
2013-01-09  9:32 ` [PATCH v6 10/15] memory-hotplug: remove memmap of sparse-vmemmap Tang Chen
2013-01-09  9:32 ` [PATCH v6 11/15] memory-hotplug: Integrated __remove_section() of CONFIG_SPARSEMEM_VMEMMAP Tang Chen
2013-01-09  9:32 ` [PATCH v6 12/15] memory-hotplug: memory_hotplug: clear zone when removing the memory Tang Chen
2013-01-09  9:32 ` [PATCH v6 13/15] memory-hotplug: remove sysfs file of node Tang Chen
2013-01-09  9:32 ` [PATCH v6 14/15] memory-hotplug: free node_data when a node is offlined Tang Chen
2013-01-09  9:32 ` [PATCH v6 15/15] memory-hotplug: Do not allocate pdgat if it was not freed when offline Tang Chen
2013-01-09 22:23 ` [PATCH v6 00/15] memory-hotplug: hot-remove physical memory Andrew Morton
2013-01-10  2:17   ` Tang Chen
2013-01-10  7:14     ` Glauber Costa
2013-01-10  7:31       ` Kamezawa Hiroyuki
2013-01-10  7:55         ` Glauber Costa
2013-01-10  8:23           ` Kamezawa Hiroyuki
2013-01-10  8:36             ` Glauber Costa
2013-01-10  8:39               ` Kamezawa Hiroyuki
2013-01-09 23:33 ` Andrew Morton
2013-01-10  2:18   ` Tang Chen
2013-01-29 12:52 ` Simon Jeons
2013-01-30  2:32   ` Tang Chen
2013-01-30  2:48     ` Simon Jeons
2013-01-30  3:00       ` Tang Chen
2013-01-30 10:15   ` Tang Chen
2013-01-30 10:18     ` Tang Chen
2013-01-31  1:22     ` Simon Jeons
2013-01-31  3:31       ` Tang Chen
2013-01-31  6:19         ` Simon Jeons
2013-01-31  7:10           ` Tang Chen
2013-01-31  8:17             ` Simon Jeons
2013-01-31  8:48             ` Simon Jeons
2013-01-31  9:44               ` Tang Chen
2013-01-31 10:38                 ` Simon Jeons
2013-02-01  1:32                   ` Jianguo Wu
2013-02-01  1:36                     ` Simon Jeons
2013-02-01  1:57                       ` Jianguo Wu
2013-02-01  2:06                         ` Simon Jeons
2013-02-01  2:18                           ` Jianguo Wu
2013-02-01  1:57                       ` Tang Chen
2013-02-01  2:17                         ` Simon Jeons
2013-02-01  2:42                           ` Tang Chen
2013-02-01  3:06                             ` Simon Jeons
2013-02-01  3:39                               ` Tang Chen
