From mboxrd@z Thu Jan 1 00:00:00 1970
From: Muchun Song <songmuchun@bytedance.com>
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
	Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
	aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 02/49] mm/sparse: add a @pgmap argument to memory deactivation paths
Date: Sun, 5 Apr 2026 20:51:53 +0800
Message-Id: <20260405125240.2558577-3-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
X-Mailing-List: linuxppc-dev@lists.ozlabs.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Currently, memory hot-remove paths do not pass the struct dev_pagemap
pointer down to section_deactivate(). This prevents the lower levels from
knowing whether the section was originally populated with vmemmap
optimizations (e.g., DAX with HVO enabled). Without this information, we
cannot call vmemmap_can_optimize() to determine if the vmemmap pages were
optimized. As a result, the vmemmap page accounting during teardown will
mistakenly assume a non-optimized allocation, leading to incorrect page
statistics.

To lay the groundwork for fixing the vmemmap page accounting, we need to
pass the @pgmap pointer down to the deactivation location. Plumb the
@pgmap argument through the APIs of arch_remove_memory(),
__remove_pages() and sparse_remove_section(), mirroring the corresponding
*_activate() paths.
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 arch/arm64/mm/mmu.c            |  5 +++--
 arch/loongarch/mm/init.c       |  5 +++--
 arch/powerpc/mm/mem.c          |  5 +++--
 arch/riscv/mm/init.c           |  5 +++--
 arch/s390/mm/init.c            |  5 +++--
 arch/x86/mm/init_64.c          |  5 +++--
 include/linux/memory_hotplug.h |  8 +++++---
 mm/memory_hotplug.c            | 12 ++++++------
 mm/memremap.c                  |  4 ++--
 mm/sparse-vmemmap.c            |  8 ++++----
 10 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ec1c6971a561..dc8a8281888c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1994,12 +1994,13 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return ret;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	__remove_pgd_mapping(swapper_pg_dir, __phys_to_virt(start), size);
 }
 
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index 00f3822b6e47..c9c57f08fa2c 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -86,7 +86,8 @@ int arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *params)
 	return ret;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -95,7 +96,7 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 	/* With altmap the first mapped page is offset from @start */
 	if (altmap)
 		page += vmem_altmap_offset(altmap);
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 }
 #endif
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 648d0c5602ec..4c1afab91996 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -158,12 +158,13 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 	return rc;
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			      struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	arch_remove_linear_mapping(start, size);
 }
 #endif
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 5142ca80be6f..980f693e6b19 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1810,9 +1810,10 @@ int __ref arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *param
 	return ret;
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			      struct dev_pagemap *pgmap)
 {
-	__remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap);
+	__remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap, pgmap);
 	remove_linear_mapping(start, size);
 	flush_tlb_all();
 }
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 1f72efc2a579..11a689423440 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -276,12 +276,13 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return rc;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	vmem_remove_mapping(start, size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index df2261fa4f98..77b889b71cf3 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1288,12 +1288,13 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end)
 	remove_pagetable(start, end, true, NULL);
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			      struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	kernel_physical_mapping_remove(start, start + size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 815e908c4135..7c9d66729c60 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -135,9 +135,10 @@ static inline bool movable_node_is_enabled(void)
 	return movable_node_enabled;
 }
 
-extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap);
+extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			       struct dev_pagemap *pgmap);
 extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages,
-			   struct vmem_altmap *altmap);
+			   struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
 
 /* reasonably generic interface to expand the physical pages */
 extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
@@ -307,7 +308,8 @@ extern int sparse_add_section(int nid, unsigned long pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap);
 extern void sparse_remove_section(unsigned long pfn, unsigned long nr_pages,
-				  struct vmem_altmap *altmap);
+				  struct vmem_altmap *altmap,
+				  struct dev_pagemap *pgmap);
 extern struct zone *zone_for_pfn_range(enum mmop online_type, int nid,
 				       struct memory_group *group,
 				       unsigned long start_pfn, unsigned long nr_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8b18ddd1e7d5..05f5df12d843 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -583,7 +583,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
  * calling offline_pages().
  */
 void __remove_pages(unsigned long pfn, unsigned long nr_pages,
-		    struct vmem_altmap *altmap)
+		    struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	const unsigned long end_pfn = pfn + nr_pages;
 	unsigned long cur_nr_pages;
@@ -598,7 +598,7 @@ void __remove_pages(unsigned long pfn, unsigned long nr_pages,
 
 		/* Select all remaining pages up to the next section boundary */
 		cur_nr_pages = min(end_pfn - pfn,
 				   SECTION_ALIGN_UP(pfn + 1) - pfn);
-		sparse_remove_section(pfn, cur_nr_pages, altmap);
+		sparse_remove_section(pfn, cur_nr_pages, altmap, pgmap);
 	}
 }
 
@@ -1418,7 +1418,7 @@ static void remove_memory_blocks_and_altmaps(u64 start, u64 size)
 
 		remove_memory_block_devices(cur_start, memblock_size);
 
-		arch_remove_memory(cur_start, memblock_size, altmap);
+		arch_remove_memory(cur_start, memblock_size, altmap, NULL);
 
 		/* Verify that all vmemmap pages have actually been freed. */
 		WARN(altmap->alloc, "Altmap not fully unmapped");
@@ -1461,7 +1461,7 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
 		ret = create_memory_block_devices(cur_start, memblock_size, nid,
 						  params.altmap, group);
 		if (ret) {
-			arch_remove_memory(cur_start, memblock_size, NULL);
+			arch_remove_memory(cur_start, memblock_size, NULL, NULL);
 			kfree(params.altmap);
 			goto out;
 		}
@@ -1547,7 +1547,7 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 		/* create memory block devices after memory was added */
 		ret = create_memory_block_devices(start, size, nid, NULL, group);
 		if (ret) {
-			arch_remove_memory(start, size, params.altmap);
+			arch_remove_memory(start, size, params.altmap, NULL);
 			goto error;
 		}
 	}
@@ -2246,7 +2246,7 @@ static int try_remove_memory(u64 start, u64 size)
 		 * No altmaps present, do the removal directly
 		 */
 		remove_memory_block_devices(start, size);
-		arch_remove_memory(start, size, NULL);
+		arch_remove_memory(start, size, NULL, NULL);
 	} else {
 		/* all memblocks in the range have altmaps */
 		remove_memory_blocks_and_altmaps(start, size);
diff --git a/mm/memremap.c b/mm/memremap.c
index ac7be07e3361..c45b90f334ea 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -97,10 +97,10 @@ static void pageunmap_range(struct dev_pagemap *pgmap, int range_id)
 			PHYS_PFN(range_len(range)));
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		__remove_pages(PHYS_PFN(range->start),
-			       PHYS_PFN(range_len(range)), NULL);
+			       PHYS_PFN(range_len(range)), NULL, pgmap);
 	} else {
 		arch_remove_memory(range->start, range_len(range),
-				   pgmap_altmap(pgmap));
+				   pgmap_altmap(pgmap), pgmap);
 		kasan_remove_zero_shadow(__va(range->start), range_len(range));
 	}
 	mem_hotplug_done();
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index ee27d0c0efe2..7aa9a97498eb 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -737,7 +737,7 @@ static int fill_subsection_map(unsigned long pfn, unsigned long nr_pages)
  * usage map, but still need to free the vmemmap range.
  */
 static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
-		struct vmem_altmap *altmap)
+		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	struct mem_section *ms = __pfn_to_section(pfn);
 	bool section_is_early = early_section(ms);
@@ -824,7 +824,7 @@ static struct page * __meminit section_activate(int nid, unsigned long pfn,
 	memmap = populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap);
 	memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE));
 	if (!memmap) {
-		section_deactivate(pfn, nr_pages, altmap);
+		section_deactivate(pfn, nr_pages, altmap, pgmap);
 		return ERR_PTR(-ENOMEM);
 	}
 
@@ -885,13 +885,13 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 }
 
 void sparse_remove_section(unsigned long pfn, unsigned long nr_pages,
-		struct vmem_altmap *altmap)
+		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	struct mem_section *ms = __pfn_to_section(pfn);
 
 	if (WARN_ON_ONCE(!valid_section(ms)))
 		return;
 
-	section_deactivate(pfn, nr_pages, altmap);
+	section_deactivate(pfn, nr_pages, altmap, pgmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
-- 
2.20.1