From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B44C5C25B75 for ; Fri, 31 May 2024 09:49:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Subject:CC:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=jhiPuYdfX0QIw2/8DXUiMQfwetHZIpFYZ7/Hmh2R5wA=; b=VTuz1EqFPNxd4d r8lZnQbN9Q6Gp5bPtIhw2rgLpx7PXAOnjM//kJH29hSfTErDd159K5Gj/CbtenYJtTMdrGbnsh9gr sZft09CxLEgndRGJqqsiaBQkyESH8acBBfs/xpJzn0gatW8nfWOJUaBTW6suVhCUImdaAZgb9D9Bc u0mFiAZYk3Qx07Mk1IXLpfwoLuDKv02iCOhYFgMVJuxtpC/ULiWhYX1McfCqUAsEmRGvLpmQlXzAw cwXviSUWSZCpSnYW8ya+eR5y6zd8a1bhiHFX1idZ4hvraHQXRFioHHVmEq1u9j11pdVazxugQ5v0O T2ZmIU9U+zrEjjlmn6Sw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sCysh-00000009r2O-4A1p; Fri, 31 May 2024 09:49:04 +0000 Received: from frasgout.his.huawei.com ([185.176.79.56]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sCysd-00000009r12-42au for linux-arm-kernel@lists.infradead.org; Fri, 31 May 2024 09:49:02 +0000 Received: from mail.maildlp.com (unknown [172.18.186.231]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4VrJ9p11zGz6K5bC; Fri, 31 May 2024 17:44:30 +0800 (CST) Received: from lhrpeml500005.china.huawei.com (unknown [7.191.163.240]) by mail.maildlp.com (Postfix) with ESMTPS id C8E62140B33; Fri, 31 May 2024 17:48:49 +0800 (CST) Received: from localhost (10.202.227.76) by lhrpeml500005.china.huawei.com (7.191.163.240) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Fri, 31 May 2024 10:48:49 +0100 Date: Fri, 31 May 2024 10:48:48 +0100 From: Jonathan Cameron To: David Hildenbrand CC: Dan Williams , , , Sudeep Holla , Andrew Morton , Will Deacon , Jia He , Mike Rapoport , , , , Yuquan Wang , Oscar Salvador , Lorenzo Pieralisi , James Morse Subject: Re: [RFC PATCH 8/8] HACK: mm: memory_hotplug: Drop memblock_phys_free() call in try_remove_memory() Message-ID: <20240531104848.00006a95@Huawei.com> In-Reply-To: References: <20240529171236.32002-1-Jonathan.Cameron@huawei.com> <20240529171236.32002-9-Jonathan.Cameron@huawei.com> Organization: Huawei Technologies Research and Development (UK) Ltd. X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32) MIME-Version: 1.0 X-Originating-IP: [10.202.227.76] X-ClientProxiedBy: lhrpeml500003.china.huawei.com (7.191.162.67) To lhrpeml500005.china.huawei.com (7.191.163.240) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240531_024900_645693_C533134E X-CRM114-Status: GOOD ( 35.52 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Fri, 31 May 2024 09:49:32 +0200 David Hildenbrand wrote: > On 29.05.24 19:12, Jonathan Cameron wrote: > > I'm not sure what this is balancing, but it if is necessary then the reserved > > memblock approach can't be used to stash NUMA node assignments as after the > > first add / remove cycle the entry is dropped so not available if memory is > > re-added at the same HPA. > > > > This patch is here to hopefully spur comments on what this is there for! > > > > Signed-off-by: Jonathan Cameron > > --- > > mm/memory_hotplug.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > > index 431b1f6753c0..3d8dd4749dfc 100644 > > --- a/mm/memory_hotplug.c > > +++ b/mm/memory_hotplug.c > > @@ -2284,7 +2284,7 @@ static int __ref try_remove_memory(u64 start, u64 size) > > } > > > > if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) { > > - memblock_phys_free(start, size); > > + // memblock_phys_free(start, size); > > memblock_remove(start, size); > > } > > memblock_phys_free() works on memblock.reserved, memblock_remove() works > on memblock.memory. > > If you take a look at the doc at the top of memblock.c: > > memblock.memory: physical memory available to the system > memblock.reserved: regions that were allocated [during boot] > > > memblock.memory is supposed to be a superset of memblock.reserved. Your > "hack" here indicates that you somehow would be relying on the opposite > being true, which indicates that you are doing the wrong thing. > > > memblock_remove() indeed balances against memblock_add_node() for > hotplugged memory [add_memory_resource()]. There seem to a case where we > would succeed in hotunplugging memory that was part of "memblock.reserved". > > But how could that happen? I think the following way: > > Once the buddy is up and running, memory allocated during early boot is > not freed back to memblock, but usually we simply go via something like > free_reserved_page(), not memblock_free() [because the buddy took over]. > So one could end up unplugging memory that still resides in > memblock.reserved set. > > So with memblock_phys_free(), we are enforcing the invariant that > memblock.memory is a superset of memblock.reserved. > > Likely, arm64 should store that node assignment elsewhere from where it > can be queried. Or it should be using something like > CONFIG_HAVE_MEMBLOCK_PHYS_MAP for these static windows. > Hi David, Thanks for the explanation and pointers. I'd rather avoid inventing a parallel infrastructure for this (like x86 has for other reasons, but which is also used for this purpose). >From a quick look CONFIG_HAVE_MEMBLOCK_PHYS_MAP is documented in a fashion that makes me think it's not directly appropriate (this isn't actual physical memory available during boot) but the general approach of just adding another memblock collection seems like it will work. Hardest problem might be naming it. physmem_possible perhaps? Fill that with anything found in SRAT or CEDT should work for ACPI, but I'm not sure on whether we can fill it when neither of those is present. Maybe we just don't bother as for today's usecase CEDT needs to be present. Maybe physmem_known_possible is the way to go. I'll give this approach a spin and see how it goes. Jonathan _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel