* [RFC, 0/7] NUMA Hotplug emulator
@ 2010-05-13 11:36 Shaohui Zheng
From: Shaohui Zheng @ 2010-05-13 11:36 UTC
To: akpm, linux-mm
Cc: linux-kernel, ak, fengguang.wu, haicheng.li, shaohui.zheng

Hi, All

This patchset introduces a NUMA hotplug emulator for x86. It touches many
files and might introduce new bugs, so we are sending an RFC to the community
first and look forward to comments and suggestions. Thanks.

* WHAT IS HOTPLUG EMULATOR

"NUMA hotplug emulator" is a collective name for the hotplug emulation
features; it emulates NUMA node hotplug purely in software. It is intended
to help people debug and test node/CPU/memory hotplug code on a machine
without NUMA hotplug support, even a UMA machine.

The emulator provides a mechanism to emulate the process of physical
CPU/memory hot-add, which lets kernel developers debug CPU and memory hotplug
on machines without NUMA support. It offers an interface for CPU and memory
hotplug testing.

* WHY DO WE USE HOTPLUG EMULATOR

We have been working on hotplug emulation for a few months. The emulator has
helped the team reproduce all the major hotplug bugs. It plays an important
role in hotplug code quality assurance. Because of the hotplug emulator, we
have already moved most of the debugging work to a virtual environment. We
send it to

* EXPECT BUGS

This is the first version sent to the community, but it is already the 3rd
version internally. It is expected to have bugs.

OPEN: The kernel might use part of the hidden memory region as a RAM buffer;
for now the emulator simply hides an extra 128M of space to work around this
issue. Is there a better way to avoid this conflict? We hope for a better
solution from the community (for patch 002).

* Principles & Usages

The NUMA hotplug emulator includes 3 different parts. We add a menu item to
menuconfig to enable/disable them
(refer to http://shaohui.org/images/hpe-krnl-cfg.jpg).

1) Node hotplug emulation:

The emulator first hides RAM via the E820 table, and then it can fake
offlined nodes with the hidden RAM.

After system bootup, the user is able to hot-add these offlined nodes, which
is similar to the behavior of real hotplug hardware.

Use the boot option "numa=hide=N*size" to fake offlined nodes:
	- N is the number of hidden nodes
	- size is the memory size (in MB) per hidden node.

There is a sysfs entry "probe" under /sys/devices/system/node/ for the user
to hotplug the fake offlined nodes:

 - to show all fake offlined nodes:
    $ cat /sys/devices/system/node/probe

 - to hot-add a fake offlined node, e.g. nodeid is N:
    $ echo N > /sys/devices/system/node/probe

2) CPU hotplug emulation:

The emulator reserves CPUs through a grub parameter. The reserved CPUs can be
hot-added/hot-removed in software, which emulates the process of physical CPU
hotplug.

 - to hide CPUs
	- Use the boot option "maxcpus=N" to hide CPUs;
	  N is the number of CPUs to initialize
	- Use the boot option "cpu_hpe=on" to enable CPU hotplug emulation;
	  when cpu_hpe is enabled, the remaining CPUs will not be initialized

 - to hot-add a CPU to a node
	$ echo nid > cpu/probe

 - to hot-remove a CPU
	$ echo nid > cpu/release

3) Memory hotplug emulation:

The emulator reserves memory before the OS boots. The reserved memory region
is removed from the e820 table, and it can be hot-added via the probe
interface. This interface was extended to support adding memory to a
specified node, and it maintains backwards compatibility.

Memory release is well known to be difficult; we have no plan for it so far.

 - reserve memory through a grub parameter
	mem=1024m

 - add a memory section to node 3
    $ echo 0x40000000,3 > memory/probe
	OR
    $ echo 1024m,3 > memory/probe

* ACKNOWLEDGMENT

The hotplug emulator is a team effort; thanks to all of them. They are:
Andi Kleen, Haicheng Li, Shaohui Zheng, Fengguang Wu and Yongkang You

--
Thanks & Regards,
Shaohui

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
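The two memory/probe forms in the posting name the same physical address:
1024m is 1024 * 2^20 bytes, which is 0x40000000. The sketch below only checks
that arithmetic and builds the "address,node" string. The helper names
size_to_hex and probe_arg are hypothetical and are not part of the patchset;
nothing here writes to sysfs.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; not taken from the patchset.

# Convert a size such as "1024m" (suffix k, m or g, or none for bytes)
# into a hexadecimal byte count.
size_to_hex() {
    num=${1%[kKmMgG]}              # numeric part with the suffix stripped
    case $1 in
        *[kK]) shift_bits=10 ;;
        *[mM]) shift_bits=20 ;;
        *[gG]) shift_bits=30 ;;
        *)     shift_bits=0  ;;
    esac
    printf '0x%x\n' $(( num << shift_bits ))
}

# Build the "address,node" string that the posting writes to memory/probe.
probe_arg() {
    printf '%s,%s\n' "$(size_to_hex "$1")" "$2"
}

size_to_hex 1024m     # same address as the hex form in the posting
probe_arg 1024m 3     # string for adding a section at 1024m to node 3
```

Whether the patched probe file accepts other suffixes than the "m" shown in
the posting is not stated there; the k and g cases are only an assumption of
this sketch.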
* Re: [RFC, 0/7] NUMA Hotplug emulator
@ 2010-05-13 12:11 Shaohui Zheng
From: Shaohui Zheng @ 2010-05-13 12:11 UTC
To: akpm, linux-mm, linux-kernel, ak, fengguang.wu, haicheng.li, shaohui.zheng

This email appeared to be lost when I checked LKML, so I am resending it.
Sorry if it is a duplicate.

--
Thanks & Regards,
Shaohui
* Re: [RFC, 0/7] NUMA Hotplug emulator
@ 2010-05-13 12:22 Shaohui Zheng
From: Shaohui Zheng @ 2010-05-13 12:22 UTC
To: linux-mm, linux-kernel, ak, shaohui.zheng

This email appeared to be lost when I checked LKML, so I am resending it.
Sorry if it is a duplicate.

--
Thanks & Regards,
Shaohui
* Re: [RFC, 0/7] NUMA Hotplug emulator
@ 2010-05-14 6:57 Balbir Singh
From: Balbir Singh @ 2010-05-14 6:57 UTC
To: akpm, linux-mm, linux-kernel, ak, fengguang.wu, haicheng.li, shaohui.zheng

* Shaohui Zheng <shaohui.zheng@intel.com> [2010-05-13 19:36:30]:

> Hi, All
> 	This patchset introduces a NUMA hotplug emulator for x86. It touches
> many files and might introduce new bugs, so we are sending an RFC to the
> community first and look forward to comments and suggestions. Thanks.
>
> * WHAT IS HOTPLUG EMULATOR
>
> "NUMA hotplug emulator" is a collective name for the hotplug emulation
> features; it emulates NUMA node hotplug purely in software. It is intended
> to help people debug and test node/CPU/memory hotplug code on a machine
> without NUMA hotplug support, even a UMA machine.
>
> The emulator provides a mechanism to emulate the process of physical
> CPU/memory hot-add, which lets kernel developers debug CPU and memory
> hotplug on machines without NUMA support. It offers an interface for CPU
> and memory hotplug testing.
>

Sounds like an interesting project. Could you please post your patches as a
thread? Ideally, having 0/7 to 7/7 in one thread helps track the patches and
comments.

--
	Three Cheers,
	Balbir
* Re: [RFC, 0/7] NUMA Hotplug emulator
@ 2010-05-15 11:59 Shaohui Zheng
From: Shaohui Zheng @ 2010-05-15 11:59 UTC
To: Balbir Singh
Cc: akpm, linux-mm, linux-kernel, ak, fengguang.wu, haicheng.li, shaohui.zheng

On Fri, May 14, 2010 at 12:27:44PM +0530, Balbir Singh wrote:
> * Shaohui Zheng <shaohui.zheng@intel.com> [2010-05-13 19:36:30]:
>
> > Hi, All
> > 	This patchset introduces a NUMA hotplug emulator for x86. It touches
> > many files and might introduce new bugs, so we are sending an RFC to the
> > community first and look forward to comments and suggestions. Thanks.
> >
> > * WHAT IS HOTPLUG EMULATOR
> >
> > "NUMA hotplug emulator" is a collective name for the hotplug emulation
> > features; it emulates NUMA node hotplug purely in software. It is
> > intended to help people debug and test node/CPU/memory hotplug code on a
> > machine without NUMA hotplug support, even a UMA machine.
> >
> > The emulator provides a mechanism to emulate the process of physical
> > CPU/memory hot-add, which lets kernel developers debug CPU and memory
> > hotplug on machines without NUMA support. It offers an interface for CPU
> > and memory hotplug testing.
> >
>
> Sounds like an interesting project. Could you please post your patches as
> a thread? Ideally, having 0/7 to 7/7 in one thread helps track the patches
> and comments.
>
> --
> 	Three Cheers,
> 	Balbir

Sorry for the late response. I have no experience posting all the patches in
one thread; I will consult a local expert.

Thanks, Balbir. With feedback and review comments from all of you, the code
quality should be assured.

--
Thanks & Regards,
Shaohui
* Re: [RFC, 0/7] NUMA Hotplug emulator
@ 2010-05-21 9:33 Ankita Garg
From: Ankita Garg @ 2010-05-21 9:33 UTC
To: akpm, linux-mm, linux-kernel, ak, fengguang.wu, haicheng.li, shaohui.zheng
Cc: Balbir Singh, Vaidyanathan Srinivasan

Hi,

On Thu, May 13, 2010 at 07:36:30PM +0800, Shaohui Zheng wrote:
> Hi, All
> 	This patchset introduces a NUMA hotplug emulator for x86. It touches
> many files and might introduce new bugs, so we are sending an RFC to the
> community first and look forward to comments and suggestions. Thanks.
>

<snip>

> * Principles & Usages
>
> The NUMA hotplug emulator includes 3 different parts. We add a menu item
> to menuconfig to enable/disable them
> (refer to http://shaohui.org/images/hpe-krnl-cfg.jpg).
>
>
> 1) Node hotplug emulation:
>
> The emulator first hides RAM via the E820 table, and then it can
> fake offlined nodes with the hidden RAM.
>
> After system bootup, the user is able to hot-add these offlined
> nodes, which is similar to the behavior of real hotplug hardware.
>
> Use the boot option "numa=hide=N*size" to fake offlined nodes:
> 	- N is the number of hidden nodes
> 	- size is the memory size (in MB) per hidden node.
>
> There is a sysfs entry "probe" under /sys/devices/system/node/ for the
> user to hotplug the fake offlined nodes:
>
>  - to show all fake offlined nodes:
>     $ cat /sys/devices/system/node/probe
>
>  - to hot-add a fake offlined node, e.g. nodeid is N:
>     $ echo N > /sys/devices/system/node/probe
>

I tried the patchset on a non-NUMA machine. In order to create fake NUMA
nodes and emulate the hotplug behavior, I used the following command line:

	"numa=fake=4 numa=hide=2*2048"

on a machine with 8G of memory. I expected to see 4 nodes, out of which 2
would be hidden. However, the system comes up with 4 online nodes and 2
offline nodes (a total of 6 nodes). While we could decide that this is the
semantics, I feel that numa=fake should define the total number of nodes. So
in the above case, the system should have come up with 2 online nodes and 2
offline nodes.

Also, "numa=hide=N" could be supported, with the size of each hidden node
being equal to the entire size of the node, with or without the numa=fake
parameter.

On onlining one of the offline nodes, I see another issue: the memory under
it is not automatically brought online. For example:

#ls /sys/devices/system/node
.... node0 node1 node2..

#cat /sys/devices/system/node/probe
3

#echo 3 > /sys/devices/system/node/probe
#ls /sys/devices/system/node
.... node0 node1 node2 node3

#cat /sys/devices/system/node/node3/meminfo
Node 3 MemTotal:       0 kB
Node 3 MemFree:        0 kB
Node 3 MemUsed:        0 kB
Node 3 Active:         0 kB
......

i.e., the nodes come up as memory-less nodes. However, these nodes were
designated to have memory. So, on onlining the nodes, maybe we could have all
their memory brought into the online state as well?

--
Regards,
Ankita Garg (ankita@in.ibm.com)
Linux Technology Center
IBM India Systems & Technology Labs,
Bangalore, India
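The node counts in this report can be restated as a small calculation. The
sketch below parses the two boot options from the message and prints the
counts under the observed behavior and under the semantics the message
proposes. The function name node_counts is hypothetical, the parsing is for
illustration only and is not the kernel's parser, and the hidden-memory
figure ignores the extra 128M mentioned in the original posting.

```sh
#!/bin/sh
# Illustration only: restates the arithmetic from the report above.
node_counts() {
    fake=0 hide=0 size_mb=0
    set -f                              # keep the '*' in numa=hide from globbing
    for opt in $1; do
        case $opt in
            numa=fake=*) fake=${opt#numa=fake=} ;;
            numa=hide=*) spec=${opt#numa=hide=}
                         hide=${spec%%\**}       # N, the part before '*'
                         size_mb=${spec#*\*} ;;  # size in MB, the part after '*'
        esac
    done
    set +f
    echo "hidden memory: $(( hide * size_mb )) MB"
    echo "observed: $fake online + $hide offline = $(( fake + hide )) nodes"
    echo "proposed: $(( fake - hide )) online + $hide offline = $fake nodes"
}

node_counts "numa=fake=4 numa=hide=2*2048"
```

With the command line from the report, the "observed" line corresponds to the
6 nodes seen on the test machine and the "proposed" line to the 2 online plus
2 offline nodes that the message argues for.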
* Re: [RFC, 0/7] NUMA Hotplug emulator
@ 2010-05-24 1:47 Shaohui Zheng
From: Shaohui Zheng @ 2010-05-24 1:47 UTC
To: Ankita Garg
Cc: akpm, linux-mm, linux-kernel, ak, fengguang.wu, haicheng.li, shaohui.zheng, Balbir Singh, Vaidyanathan Srinivasan

On Fri, May 21, 2010 at 03:03:40PM +0530, Ankita Garg wrote:
>
> I tried the patchset on a non-NUMA machine. In order to create fake
> NUMA nodes and emulate the hotplug behavior, I used the following
> command line:
>
> 	"numa=fake=4 numa=hide=2*2048"
>
> on a machine with 8G of memory. I expected to see 4 nodes, out of which 2
> would be hidden. However, the system comes up with 4 online nodes and 2
> offline nodes (a total of 6 nodes). While we could decide that this is
> the semantics, I feel that numa=fake should define the total number of
> nodes. So in the above case, the system should have come up with 2 online
> nodes and 2 offline nodes.

Ankita,
	this is the expected result. NUMA_EMU and NUMA_HOTPLUG_EMU are 2
different features, and there is no dependency between them. Even if you
disable NUMA_EMU, hotplug emulation still works. This implementation reduces
dependencies and keeps things simple and easy to understand. Your concern
makes sense semantically, but we prefer not to combine 2 independent modules.

>
> Also, "numa=hide=N" could be supported, with the size of each hidden node
> being equal to the entire size of the node, with or without the numa=fake
> parameter.
>
> On onlining one of the offline nodes, I see another issue: the memory
> under it is not automatically brought online. For example:
>
> #ls /sys/devices/system/node
> .... node0 node1 node2..
>
> #cat /sys/devices/system/node/probe
> 3
>
> #echo 3 > /sys/devices/system/node/probe
> #ls /sys/devices/system/node
> .... node0 node1 node2 node3
>
> #cat /sys/devices/system/node/node3/meminfo
> Node 3 MemTotal:       0 kB
> Node 3 MemFree:        0 kB
> Node 3 MemUsed:        0 kB
> Node 3 Active:         0 kB
> ......
>
> i.e., the nodes come up as memory-less nodes. However, these nodes were
> designated to have memory. So, on onlining the nodes, maybe we could have
> all their memory brought into the online state as well?

This is the same result as the real implementation of memory hotplug in the
Linux kernel: when we hot-add physical memory to a machine, the kernel
creates the memory entries and the related data structures, but it never
onlines the memory; that has to be done in user space. The node hotplug
emulation and memory hotplug emulation features follow the same rules as the
kernel.

As we know, allocating memory from a memory-less node causes an OOM issue,
and some engineers are already working on this bug. Because the OOM issue can
be reproduced with the hotplug emulator, it helps those engineers a lot.

This feature is flexible. As far as I know, some OSVs already online
hotplugged memory automatically; if the mainline kernel decides to do the
same, we will change the related code too.

>
> --
> Regards,
> Ankita Garg (ankita@in.ibm.com)
> Linux Technology Center
> IBM India Systems & Technology Labs,
> Bangalore, India

--
Thanks & Regards,
Shaohui
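The user-space step described in this reply, onlining memory after it has
been hot-added, could look roughly like the sketch below. It assumes the
mainline sysfs layout, where each memory block linked under a node directory
has a "state" file that accepts "online"; whether the emulated nodes expose
their memory blocks this way is not confirmed in the thread. The function
name online_node_memory and its optional second argument (an alternate sysfs
root, useful for a dry run against a scratch directory) are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper; assumes the mainline sysfs memory-block layout.
# Usage: online_node_memory NID [SYSFS_ROOT]
online_node_memory() {
    nid=$1
    root=${2:-/sys/devices/system}
    for block in "$root/node/node$nid"/memory[0-9]*; do
        # Skip when the glob matched nothing or the block has no state file.
        [ -e "$block/state" ] || continue
        if [ "$(cat "$block/state")" = offline ]; then
            echo online > "$block/state"   # needs root on a real system
            echo "onlined ${block##*/}"
        fi
    done
}
```

On a real system this would be run as root, for example as
"online_node_memory 3" after node 3 has been hot-added and its memory has
been probed.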
end of thread (~2010-05-24 2:14 UTC)

Thread overview: 7+ messages
2010-05-13 11:36 [RFC, 0/7] NUMA Hotplug emulator Shaohui Zheng
2010-05-13 12:11 ` Shaohui Zheng
2010-05-13 12:22   ` Shaohui Zheng
2010-05-14  6:57 ` Balbir Singh
2010-05-15 11:59   ` Shaohui Zheng
2010-05-21  9:33 ` Ankita Garg
2010-05-24  1:47   ` Shaohui Zheng