From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kamezawa Hiroyuki
Subject: Re: [PATCH v6 00/15] memory-hotplug: hot-remove physical memory
Date: Thu, 10 Jan 2013 17:23:07 +0900
Message-ID: <50EE7A6B.7020005@jp.fujitsu.com>
References: <1357723959-5416-1-git-send-email-tangchen@cn.fujitsu.com>
 <20130109142314.1ce04a96.akpm@linux-foundation.org>
 <50EE24A4.8020601@cn.fujitsu.com>
 <50EE6A48.7060307@parallels.com>
 <50EE6E50.3040609@jp.fujitsu.com>
 <50EE73DE.30208@parallels.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <50EE73DE.30208@parallels.com>
Sender: linux-ia64-owner@vger.kernel.org
To: Glauber Costa
Cc: Tang Chen, Andrew Morton, rientjes@google.com, len.brown@intel.com,
 benh@kernel.crashing.org, paulus@samba.org, cl@linux.com,
 minchan.kim@gmail.com, kosaki.motohiro@jp.fujitsu.com,
 isimatu.yasuaki@jp.fujitsu.com, wujianguo@huawei.com,
 wency@cn.fujitsu.com, hpa@zytor.com, linfeng@cn.fujitsu.com,
 laijs@cn.fujitsu.com, mgorman@suse.de, yinghai@kernel.org,
 x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-acpi@vger.kernel.org,
 linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
 linux-ia64@vger.kernel.org, cmetcalf@tilera.com,
 sparclinux@vger.kernel.org
List-Id: linux-acpi@vger.kernel.org

(2013/01/10 16:55), Glauber Costa wrote:
> On 01/10/2013 11:31 AM, Kamezawa Hiroyuki wrote:
>> (2013/01/10 16:14), Glauber Costa wrote:
>>> On 01/10/2013 06:17 AM, Tang Chen wrote:
>>>>>> Note: if the memory provided by the memory device is used by the
>>>>>> kernel, it
>>>>>> can't be offlined. It is not a bug.
>>>>>
>>>>> Right. But how often does this happen in testing? In other words,
>>>>> please provide an overall description of how well memory hot-remove is
>>>>> presently operating. Is it reliable? What is the success rate in
>>>>> real-world situations?
>>>>
>>>> We test the hot-remove functionality mostly with movable_online used.
>>>> And the memory used by kernel is not allowed to be removed.
>>>
>>> Can you try doing this using cpusets configured to hardwall ?
>>> It is my understanding that the object allocators will try hard not to
>>> allocate anything outside the walls defined by cpuset. Which means that
>>> if you have one process per node, and they are hardwalled, your kernel
>>> memory will be spread evenly among the machine. With a big enough load,
>>> they should eventually be present in all blocks.
>>>
>>
>> I'm sorry I couldn't catch your point.
>> Do you want to confirm whether cpuset can work enough instead of
>> ZONE_MOVABLE ?
>> Or Do you want to confirm whether ZONE_MOVABLE will not work if it's
>> used with cpuset ?
>>
>>
> No, I am not proposing to use cpuset to tackle the problem. I am just
> wondering if you would still have high success rates with cpusets in use
> with hardwalls. This is just one example of a workload that would spread
> kernel memory around quite heavily.
>
> So this is just me trying to understand the limitations of the mechanism.
>

Hm, okay. In my understanding, if the whole memory of a node is
configured as MOVABLE, no kernel memory will be allocated in the node
because the zonelist will not match. So, if cpuset is used with
hardwalls, the user will see -ENOMEM or OOM, I guess. Even fork() will
fail if fallback-to-other-node is not allowed.

If it's configured as ZONE_NORMAL, you need to pray when offlining memory.

AFAIK, IBM's ppc? has 16MB section size. So, some sections can be
offlined even if they are configured as ZONE_NORMAL. For them, placement
of offlined memory is not important because it's virtualized by LPAR;
they don't try to remove a DIMM, they just want to increase/decrease the
amount of memory. It's another approach.

But here, we (Fujitsu) are trying to remove a system board/DIMM.
So, we configure the whole memory of a node as ZONE_MOVABLE and try to
guarantee that the DIMM is removable.

>> IMHO, I don't think shrink_slab() can kill all objects in a node even
>> if they are some caches. We need more study for doing that.
>>
>
> Indeed, shrink_slab can only kill cached objects. They, however, are
> usually a very big part of kernel memory. I wonder though if in case of
> failure, it is worth it to try at least one shrink pass before you give up.
>

Yeah, now, his (our) approach is to never allow kernel memory on a node
that is to be hot-removed, by using ZONE_MOVABLE. So, shrink_slab()'s
effect will not be seen.

If other brave guys try to use ZONE_NORMAL for hot-pluggable DIMMs, I see,
it's worth trying.

How about checking whether the target memsection is in NORMAL or in
MOVABLE at hot-removing ? If NORMAL, shrink_slab() will be worth calling.

BTW, is shrink_slab() now node/zone aware ? If not, fixing that first
would be the better direction, I guess.

Thanks,
-Kame
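
To make the proposed check concrete, below is a rough user-space sketch of
the decision flow: migrate what can be moved, and only if the section sits
in a NORMAL zone try one shrink pass before giving up. This is only a toy
model, not kernel code. Every name in it (mem_section_model, try_offline,
shrink_pass, and so on) is hypothetical and made up for illustration, and
the page counters stand in for state the real offlining path would have to
discover by itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum zone_kind { ZONE_KIND_NORMAL, ZONE_KIND_MOVABLE };

enum offline_result { OFFLINE_OK, OFFLINE_BUSY };

struct mem_section_model {
    enum zone_kind zone;      /* zone the section's pages belong to */
    size_t movable_pages;     /* pages that migration can move away */
    size_t cached_slab_pages; /* slab pages holding only reclaimable cache objects */
    size_t pinned_pages;      /* kernel pages that can be neither moved nor reclaimed */
};

/* Stand-in for page migration: every movable page leaves the section. */
static void migrate_movable(struct mem_section_model *sec)
{
    sec->movable_pages = 0;
}

/* Stand-in for one shrink pass: only cached objects are freed. */
static void shrink_pass(struct mem_section_model *sec)
{
    sec->cached_slab_pages = 0;
}

static bool section_empty(const struct mem_section_model *sec)
{
    return sec->movable_pages == 0 &&
           sec->cached_slab_pages == 0 &&
           sec->pinned_pages == 0;
}

/*
 * Try to offline one section. *shrink_called reports whether the
 * shrink pass was attempted, so callers can see which path was taken.
 */
enum offline_result try_offline(struct mem_section_model *sec,
                                bool *shrink_called)
{
    assert(sec != NULL && shrink_called != NULL);
    *shrink_called = false;

    migrate_movable(sec);
    if (section_empty(sec))
        return OFFLINE_OK;

    /*
     * A MOVABLE section is assumed to hold no kernel memory, so a
     * shrink pass would have no visible effect there. Only try it
     * for NORMAL sections.
     */
    if (sec->zone == ZONE_KIND_NORMAL) {
        shrink_pass(sec);
        *shrink_called = true;
        if (section_empty(sec))
            return OFFLINE_OK;
    }
    return OFFLINE_BUSY;
}
```

In this model a NORMAL section holding only movable and cached pages
becomes offlinable after one shrink pass, while a single pinned page still
makes the attempt fail, which matches the "you need to pray" case above.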