From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 473B1FF886D for ; Tue, 28 Apr 2026 20:06:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=Dr76VyEiyO5HN1DNiXlEz3RnYgKfsn2aLp91PIx5228=; b=wip7y4Pi8rNN0RzBVo15kSqLMd kcDtZCiKuOo2ihLBQq0hTuhisCUhmUcQuoCpWt9Nym0TjmbqyYnJOQc86iHj4wACKplQcwi3q1qUj GOBVRpYnBJW1RGDTaYsetMFOD7NjrZ54fXow3LeSzM/HuRedE+GxZtm3SWYyCAvczqHQ8SgvRQX0T 5AScM6WM2v/xVXpFiErNt+5xKd1QxPWq2vtWfqaI/w7ugt92hu1q2QIN7nfYn96UiMq2d3yjymCCC qDSXfK/TCPRPCxGYPbdUgZoysIBVbybIIC1IO4n+LEj39ygk4MQZiKX09ziqLhYHW/KZb0VFOteDB fq7ilBmw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98.2 #2 (Red Hat Linux)) id 1wHogt-00000002C47-1Srl; Tue, 28 Apr 2026 20:05:55 +0000 Received: from sea.source.kernel.org ([2600:3c0a:e001:78e:0:1991:8:25]) by bombadil.infradead.org with esmtps (Exim 4.98.2 #2 (Red Hat Linux)) id 1wHogq-00000002C3h-0ghH for linux-arm-kernel@lists.infradead.org; Tue, 28 Apr 2026 20:05:54 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by sea.source.kernel.org (Postfix) with ESMTP id 652EC43FE5; Tue, 28 Apr 2026 20:05:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5FE79C2BCAF; Tue, 28 Apr 2026 20:05:50 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777406751; bh=Y7a6feOaVeJP4q66K7OSq9SR4dMdT1W5SWT0J9vniQs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AbRbhe6hWtUIuUxqeDy6YlkubQj9Pc9fdc4NdN442hqWF5SfWOJ8wL95wjpEVvbV4 Oiv3iyEuIMKSypH2bc0GIR1wVkxfPRv6qlNDl4WCer1YaL1sW3HGtfuhLzj9MomLBV yANqte9VUi0WM4Xb0camJ9LtfNDgTR4DwFTnphd2sCYhRIFkBxs1Y7tdczOxja1UL4 Jrz1NV2KULD7Sujr5H9qSNl2GukmP+99KEoOhGVKkc7rbiMcqO5sxPn89477Fz+drI S0vYsMIyd1ePD0Rm6IE+xYU4fwdeWZUhBfLkJlJdBRuGjYqMF4ukorPbeQRNyrWH8d Gc+8Sp7d1btMA== From: Sasha Levin To: stable@vger.kernel.org Cc: Anshuman Khandual , Will Deacon , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, "David Hildenbrand (Arm)" , Ryan Roberts , Catalin Marinas , Sasha Levin Subject: [PATCH 5.10.y] arm64/mm: Enable batched TLB flush in unmap_hotplug_range() Date: Tue, 28 Apr 2026 16:05:48 -0400 Message-ID: <20260428200548.3191346-1-sashal@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <2026042729-abiding-helmet-8eef@gregkh> References: <2026042729-abiding-helmet-8eef@gregkh> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20260428_130552_773703_F5030B1E X-CRM114-Status: GOOD ( 17.26 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: Anshuman Khandual [ Upstream commit 48478b9f791376b4b89018d7afdfd06865498f65 ] During a memory hot remove operation, both linear and vmemmap mappings for the memory range being removed, get unmapped via unmap_hotplug_range() but mapped pages get freed only for vmemmap mapping. 
This was previously a simple sequential operation: each table entry gets cleared, followed by a leaf-specific TLB flush, and then by a memory free operation when applicable. That approach was uniform for both vmemmap and linear mappings. But the linear mapping might contain CONT-marked block memory, where the architecture requires that all entries in the range be cleared before any TLB flush. Hence, batch all TLB flushes during the table teardown walk and perform a single flush at the end in unmap_hotplug_range().

Prior to this fix, it was hypothetically possible for a speculative access to a higher address in the contiguous block to fill the TLB with shattered entries for the entire contiguous range after a lower address had already been cleared and invalidated. Because the table entries had been shattered, the subsequent TLB invalidation for the higher address would not clear the TLB entries for the lower address, meaning stale TLB entries could persist.

Batching also improves performance via the TLBI range operation and reduced synchronization instructions: the time spent executing unmap_hotplug_range() improved by 97%, measured over a 2GB memory hot removal in a KVM guest.

This scheme is not applicable during vmemmap teardown, where memory needs to be freed, and hence a TLB flush is still required after clearing each page table entry.
Cc: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Closes: https://lore.kernel.org/all/aWZYXhrT6D2M-7-N@willie-the-truck/
Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand (Arm)
Reviewed-by: Ryan Roberts
Signed-off-by: Ryan Roberts
Signed-off-by: Anshuman Khandual
Signed-off-by: Catalin Marinas
[ renamed `__pte_clear()` to `pte_clear()` and inlined `pmd_cont(pmd)` as `pmd_val(pmd) & PMD_SECT_CONT` ]
Signed-off-by: Sasha Levin
---
 arch/arm64/mm/mmu.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b584bf200619f..ff16cc7251b40 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -862,10 +862,14 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
 
 		WARN_ON(!pte_present(pte));
 		pte_clear(&init_mm, addr, ptep);
-		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
-		if (free_mapped)
+		if (free_mapped) {
+			/* CONT blocks are not supported in the vmemmap */
+			WARN_ON(pte_cont(pte));
+			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 			free_hotplug_page_range(pte_page(pte), PAGE_SIZE,
 						altmap);
+		}
+		/* unmap_hotplug_range() flushes TLB for !free_mapped */
 	} while (addr += PAGE_SIZE, addr < end);
 }
 
@@ -886,15 +890,14 @@ static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
 		WARN_ON(!pmd_present(pmd));
 		if (pmd_sect(pmd)) {
 			pmd_clear(pmdp);
-
-			/*
-			 * One TLBI should be sufficient here as the PMD_SIZE
-			 * range is mapped with a single block entry.
-			 */
-			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
-			if (free_mapped)
+			if (free_mapped) {
+				/* CONT blocks are not supported in the vmemmap */
+				WARN_ON(pmd_val(pmd) & PMD_SECT_CONT);
+				flush_tlb_kernel_range(addr, addr + PMD_SIZE);
 				free_hotplug_page_range(pmd_page(pmd), PMD_SIZE,
 							altmap);
+			}
+			/* unmap_hotplug_range() flushes TLB for !free_mapped */
 			continue;
 		}
 		WARN_ON(!pmd_table(pmd));
@@ -919,15 +922,12 @@ static void unmap_hotplug_pud_range(p4d_t *p4dp, unsigned long addr,
 		WARN_ON(!pud_present(pud));
 		if (pud_sect(pud)) {
 			pud_clear(pudp);
-
-			/*
-			 * One TLBI should be sufficient here as the PUD_SIZE
-			 * range is mapped with a single block entry.
-			 */
-			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
-			if (free_mapped)
+			if (free_mapped) {
+				flush_tlb_kernel_range(addr, addr + PUD_SIZE);
 				free_hotplug_page_range(pud_page(pud), PUD_SIZE,
 							altmap);
+			}
+			/* unmap_hotplug_range() flushes TLB for !free_mapped */
 			continue;
 		}
 		WARN_ON(!pud_table(pud));
@@ -957,6 +957,7 @@ static void unmap_hotplug_p4d_range(pgd_t *pgdp, unsigned long addr,
 static void unmap_hotplug_range(unsigned long addr, unsigned long end,
 				bool free_mapped, struct vmem_altmap *altmap)
 {
+	unsigned long start = addr;
 	unsigned long next;
 	pgd_t *pgdp, pgd;
 
@@ -978,6 +979,9 @@ static void unmap_hotplug_range(unsigned long addr, unsigned long end,
 		WARN_ON(!pgd_present(pgd));
 		unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped, altmap);
 	} while (addr = next, addr < end);
+
+	if (!free_mapped)
+		flush_tlb_kernel_range(start, end);
 }
 
 static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr,
-- 
2.53.0