* [PATCH 6.6.y] arm64/mm: Enable batched TLB flush in unmap_hotplug_range()
From: Sasha Levin @ 2026-04-28 15:24 UTC (permalink / raw)
To: stable
Cc: Anshuman Khandual, Will Deacon, linux-arm-kernel, linux-kernel,
David Hildenbrand (Arm), Ryan Roberts, Catalin Marinas,
Sasha Levin
From: Anshuman Khandual <anshuman.khandual@arm.com>
[ Upstream commit 48478b9f791376b4b89018d7afdfd06865498f65 ]
During a memory hot remove operation, both the linear and vmemmap mappings
for the memory range being removed get unmapped via unmap_hotplug_range(),
but mapped pages get freed only for the vmemmap mapping. This is a simple
sequential operation where each table entry gets cleared, followed by a
leaf-specific TLB flush, and then by a memory free operation when
applicable. This approach was uniform for both vmemmap and linear mappings.
But the linear mapping might contain CONT-marked block memory, where the
architecture requires that all entries in the range be cleared before a
TLB flush. Hence batch all TLB flushes during the table tear down walk and
finally do the flush in unmap_hotplug_range().
Prior to this fix, it was hypothetically possible for a speculative access
to a higher address in the contiguous block to fill the TLB with shattered
entries for the entire contiguous range after a lower address had already
been cleared and invalidated. Due to the table entries being shattered, the
subsequent TLB invalidation for the higher address would not then clear the
TLB entries for the lower address, meaning stale TLB entries could persist.
Besides correctness, batching also improves performance via the TLBI range
operation along with reduced synchronization instructions. The time spent
executing unmap_hotplug_range() improved by 97%, measured over a 2GB
memory hot removal in a KVM guest.
This scheme is not applicable during vmemmap mapping tear down, where
memory needs to be freed and hence a TLB flush is required after clearing
each page table entry.
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Closes: https://lore.kernel.org/all/aWZYXhrT6D2M-7-N@willie-the-truck/
Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[ replaced `__pte_clear()` with `pte_clear()` ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/arm64/mm/mmu.c | 36 ++++++++++++++++++++----------------
1 file changed, 20 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d6411f7f0b72c..8c5cbf4c858d9 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -870,10 +870,14 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
WARN_ON(!pte_present(pte));
pte_clear(&init_mm, addr, ptep);
- flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
- if (free_mapped)
+ if (free_mapped) {
+ /* CONT blocks are not supported in the vmemmap */
+ WARN_ON(pte_cont(pte));
+ flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
free_hotplug_page_range(pte_page(pte),
PAGE_SIZE, altmap);
+ }
+ /* unmap_hotplug_range() flushes TLB for !free_mapped */
} while (addr += PAGE_SIZE, addr < end);
}
@@ -894,15 +898,14 @@ static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
WARN_ON(!pmd_present(pmd));
if (pmd_sect(pmd)) {
pmd_clear(pmdp);
-
- /*
- * One TLBI should be sufficient here as the PMD_SIZE
- * range is mapped with a single block entry.
- */
- flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
- if (free_mapped)
+ if (free_mapped) {
+ /* CONT blocks are not supported in the vmemmap */
+ WARN_ON(pmd_cont(pmd));
+ flush_tlb_kernel_range(addr, addr + PMD_SIZE);
free_hotplug_page_range(pmd_page(pmd),
PMD_SIZE, altmap);
+ }
+ /* unmap_hotplug_range() flushes TLB for !free_mapped */
continue;
}
WARN_ON(!pmd_table(pmd));
@@ -927,15 +930,12 @@ static void unmap_hotplug_pud_range(p4d_t *p4dp, unsigned long addr,
WARN_ON(!pud_present(pud));
if (pud_sect(pud)) {
pud_clear(pudp);
-
- /*
- * One TLBI should be sufficient here as the PUD_SIZE
- * range is mapped with a single block entry.
- */
- flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
- if (free_mapped)
+ if (free_mapped) {
+ flush_tlb_kernel_range(addr, addr + PUD_SIZE);
free_hotplug_page_range(pud_page(pud),
PUD_SIZE, altmap);
+ }
+ /* unmap_hotplug_range() flushes TLB for !free_mapped */
continue;
}
WARN_ON(!pud_table(pud));
@@ -965,6 +965,7 @@ static void unmap_hotplug_p4d_range(pgd_t *pgdp, unsigned long addr,
static void unmap_hotplug_range(unsigned long addr, unsigned long end,
bool free_mapped, struct vmem_altmap *altmap)
{
+ unsigned long start = addr;
unsigned long next;
pgd_t *pgdp, pgd;
@@ -986,6 +987,9 @@ static void unmap_hotplug_range(unsigned long addr, unsigned long end,
WARN_ON(!pgd_present(pgd));
unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped, altmap);
} while (addr = next, addr < end);
+
+ if (!free_mapped)
+ flush_tlb_kernel_range(start, end);
}
static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr,
--
2.53.0
* Patch "arm64/mm: Enable batched TLB flush in unmap_hotplug_range()" has been added to the 6.6-stable tree
From: gregkh @ 2026-05-15 10:27 UTC (permalink / raw)
To: anshuman.khandual, catalin.marinas, david, gregkh,
linux-arm-kernel, ryan.roberts, sashal, will
Cc: stable-commits
This is a note to let you know that I've just added the patch titled
arm64/mm: Enable batched TLB flush in unmap_hotplug_range()
to the 6.6-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
arm64-mm-enable-batched-tlb-flush-in-unmap_hotplug_range.patch
and it can be found in the queue-6.6 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From stable+bounces-241691-greg=kroah.com@vger.kernel.org Tue Apr 28 17:25:10 2026
From: Sasha Levin <sashal@kernel.org>
Date: Tue, 28 Apr 2026 11:24:00 -0400
Subject: arm64/mm: Enable batched TLB flush in unmap_hotplug_range()
To: stable@vger.kernel.org
Cc: Anshuman Khandual <anshuman.khandual@arm.com>, Will Deacon <will@kernel.org>, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, "David Hildenbrand (Arm)" <david@kernel.org>, Ryan Roberts <ryan.roberts@arm.com>, Catalin Marinas <catalin.marinas@arm.com>, Sasha Levin <sashal@kernel.org>
Message-ID: <20260428152400.3033637-1-sashal@kernel.org>
From: Anshuman Khandual <anshuman.khandual@arm.com>
[ Upstream commit 48478b9f791376b4b89018d7afdfd06865498f65 ]
During a memory hot remove operation, both the linear and vmemmap mappings
for the memory range being removed get unmapped via unmap_hotplug_range(),
but mapped pages get freed only for the vmemmap mapping. This is a simple
sequential operation where each table entry gets cleared, followed by a
leaf-specific TLB flush, and then by a memory free operation when
applicable. This approach was uniform for both vmemmap and linear mappings.
But the linear mapping might contain CONT-marked block memory, where the
architecture requires that all entries in the range be cleared before a
TLB flush. Hence batch all TLB flushes during the table tear down walk and
finally do the flush in unmap_hotplug_range().
Prior to this fix, it was hypothetically possible for a speculative access
to a higher address in the contiguous block to fill the TLB with shattered
entries for the entire contiguous range after a lower address had already
been cleared and invalidated. Due to the table entries being shattered, the
subsequent TLB invalidation for the higher address would not then clear the
TLB entries for the lower address, meaning stale TLB entries could persist.
Besides correctness, batching also improves performance via the TLBI range
operation along with reduced synchronization instructions. The time spent
executing unmap_hotplug_range() improved by 97%, measured over a 2GB
memory hot removal in a KVM guest.
This scheme is not applicable during vmemmap mapping tear down, where
memory needs to be freed and hence a TLB flush is required after clearing
each page table entry.
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Closes: https://lore.kernel.org/all/aWZYXhrT6D2M-7-N@willie-the-truck/
Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[ replaced `__pte_clear()` with `pte_clear()` ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/arm64/mm/mmu.c | 36 ++++++++++++++++++++----------------
1 file changed, 20 insertions(+), 16 deletions(-)
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -870,10 +870,14 @@ static void unmap_hotplug_pte_range(pmd_
WARN_ON(!pte_present(pte));
pte_clear(&init_mm, addr, ptep);
- flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
- if (free_mapped)
+ if (free_mapped) {
+ /* CONT blocks are not supported in the vmemmap */
+ WARN_ON(pte_cont(pte));
+ flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
free_hotplug_page_range(pte_page(pte),
PAGE_SIZE, altmap);
+ }
+ /* unmap_hotplug_range() flushes TLB for !free_mapped */
} while (addr += PAGE_SIZE, addr < end);
}
@@ -894,15 +898,14 @@ static void unmap_hotplug_pmd_range(pud_
WARN_ON(!pmd_present(pmd));
if (pmd_sect(pmd)) {
pmd_clear(pmdp);
-
- /*
- * One TLBI should be sufficient here as the PMD_SIZE
- * range is mapped with a single block entry.
- */
- flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
- if (free_mapped)
+ if (free_mapped) {
+ /* CONT blocks are not supported in the vmemmap */
+ WARN_ON(pmd_cont(pmd));
+ flush_tlb_kernel_range(addr, addr + PMD_SIZE);
free_hotplug_page_range(pmd_page(pmd),
PMD_SIZE, altmap);
+ }
+ /* unmap_hotplug_range() flushes TLB for !free_mapped */
continue;
}
WARN_ON(!pmd_table(pmd));
@@ -927,15 +930,12 @@ static void unmap_hotplug_pud_range(p4d_
WARN_ON(!pud_present(pud));
if (pud_sect(pud)) {
pud_clear(pudp);
-
- /*
- * One TLBI should be sufficient here as the PUD_SIZE
- * range is mapped with a single block entry.
- */
- flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
- if (free_mapped)
+ if (free_mapped) {
+ flush_tlb_kernel_range(addr, addr + PUD_SIZE);
free_hotplug_page_range(pud_page(pud),
PUD_SIZE, altmap);
+ }
+ /* unmap_hotplug_range() flushes TLB for !free_mapped */
continue;
}
WARN_ON(!pud_table(pud));
@@ -965,6 +965,7 @@ static void unmap_hotplug_p4d_range(pgd_
static void unmap_hotplug_range(unsigned long addr, unsigned long end,
bool free_mapped, struct vmem_altmap *altmap)
{
+ unsigned long start = addr;
unsigned long next;
pgd_t *pgdp, pgd;
@@ -986,6 +987,9 @@ static void unmap_hotplug_range(unsigned
WARN_ON(!pgd_present(pgd));
unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped, altmap);
} while (addr = next, addr < end);
+
+ if (!free_mapped)
+ flush_tlb_kernel_range(start, end);
}
static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr,
Patches currently in stable-queue which might be from sashal@kernel.org are
queue-6.6/ksmbd-reset-rcount-per-connection-in-ksmbd_conn_wait_idle_sess_id.patch
queue-6.6/dmaengine-idxd-fix-crash-when-the-event-log-is-disab.patch
queue-6.6/bpf-don-t-mark-stack_invalid-as-stack_misc-in-mark_s.patch
queue-6.6/wifi-mt76-connac-introduce-helper-for-mt7925-chipset.patch
queue-6.6/wifi-mt76-mt792x-describe-usb-wfsys-reset-with-a-descriptor.patch
queue-6.6/mmc-core-optimize-time-for-secure-erase-trim-for-some-kingston-emmcs.patch
queue-6.6/ksmbd-replace-connection-list-with-hash-table.patch
queue-6.6/selftests-bpf-validate-fake-register-spill-fill-prec.patch
queue-6.6/block-relax-pgmap-check-in-bio_add_page-for-compatible-zone-device-pages.patch
queue-6.6/wifi-rtl8xxxu-fix-potential-use-of-uninitialized-value.patch
queue-6.6/x86-shadow-stacks-proper-error-handling-for-mmap-loc.patch
queue-6.6/ksmbd-use-msleep-instaed-of-schedule_timeout_interruptible.patch
queue-6.6/net-txgbe-fix-rtnl-assertion-warning-when-remove-mod.patch
queue-6.6/bluetooth-mgmt-fix-possible-uafs.patch
queue-6.6/net-qrtr-ns-limit-the-total-number-of-nodes.patch
queue-6.6/bpf-handle-fake-register-spill-to-stack-with-bpf_st_.patch
queue-6.6/io_uring-poll-fix-multishot-recv-missing-eof-on-wake.patch
queue-6.6/drm-amdgpu-use-vmemdup_array_user-in-amdgpu_bo_creat.patch
queue-6.6/arm64-mm-enable-batched-tlb-flush-in-unmap_hotplug_range.patch
queue-6.6/smb-common-change-the-data-type-of-num_aces-to-le16.patch
queue-6.6/mtd-docg3-convert-to-platform-remove-callback-return.patch
queue-6.6/f2fs-fix-uaf-caused-by-decrementing-sbi-nr_pages-in-f2fs_write_end_io.patch
queue-6.6/iommu-amd-use-atomic64_inc_return-in-iommu.c.patch
queue-6.6/wifi-mwifiex-fix-use-after-free-in-mwifiex_adapter_cleanup.patch
queue-6.6/f2fs-fix-to-detect-potential-corrupted-nid-in-free_n.patch
queue-6.6/selftests-bpf-validate-precision-logic-in-partial_st.patch
queue-6.6/rxrpc-fix-rxrpc_input_call_event-to-only-unshare-dat.patch
queue-6.6/regset-use-kvzalloc-for-regset_get_alloc.patch
queue-6.6/pci-epf-mhi-return-0-not-remaining-timeout-when-edma-ops-complete.patch
queue-6.6/spi-meson-spicc-fix-double-put-in-remove-path.patch
queue-6.6/net-fix-icmp-host-relookup-triggering-ip_rt_bug.patch
queue-6.6/alsa-aoa-use-guard-for-mutex-locks.patch
queue-6.6/udf-fix-partition-descriptor-append-bookkeeping.patch
queue-6.6/lib-test_hmm-evict-device-pages-on-file-close-to-avoid-use-after-free.patch
queue-6.6/kvm-x86-fix-shadow-paging-use-after-free-due-to-unex.patch
queue-6.6/bpf-preserve-stack_zero-slots-on-partial-reg-spills.patch
queue-6.6/driver-core-don-t-let-a-device-probe-until-it-s-read.patch
queue-6.6/hfsplus-fix-uninit-value-by-validating-catalog-record-size.patch
queue-6.6/selftests-bpf-validate-zero-preservation-for-sub-slo.patch
queue-6.6/bpf-preserve-constant-zero-when-doing-partial-regist.patch
queue-6.6/smb-move-some-duplicate-definitions-to-common-smbacl.h.patch
queue-6.6/alsa-aoa-i2sbus-clear-stale-prepared-state.patch
queue-6.6/padata-fix-pd-uaf-once-and-for-all.patch
queue-6.6/drm-amdgpu-limit-bo-list-entry-count-to-prevent-reso.patch
queue-6.6/net-mctp-fix-don-t-require-received-header-reserved-bits-to-be-zero.patch
queue-6.6/media-rc-ttusbir-respect-dma-coherency-rules.patch
queue-6.6/f2fs-fix-to-do-sanity-check-on-dcc-discard_cmd_cnt-conditionally.patch
queue-6.6/hfsplus-fix-held-lock-freed-on-hfsplus_fill_super.patch
queue-6.6/spi-fix-resource-leaks-on-device-setup-failure.patch
queue-6.6/selftests-bpf-add-stack-access-precision-test.patch
queue-6.6/bpf-track-aligned-stack_zero-cases-as-imprecise-spil.patch
queue-6.6/mtd-docg3-fix-use-after-free-in-docg3_release.patch
queue-6.6/smb-client-validate-the-whole-dacl-before-rewriting-it-in-cifsacl.patch
queue-6.6/sched-use-u64-for-bandwidth-ratio-calculations.patch
queue-6.6/flow_dissector-do-not-dissect-pppoe-pfc-frames.patch
queue-6.6/padata-remove-comment-for-reorder_work.patch
queue-6.6/fbdev-defio-disconnect-deferred-i-o-from-the-lifetime-of-struct-fb_info.patch
queue-6.6/dmaengine-idxd-fix-leaking-event-log-memory.patch
queue-6.6/selftests-bpf-validate-stack_zero-is-preserved-on-su.patch
queue-6.6/net-qrtr-ns-limit-the-maximum-number-of-lookups.patch
queue-6.6/iommu-amd-serialize-sequence-allocation-under-concur.patch
queue-6.6/alsa-aoa-skip-devices-with-no-codecs-in-i2sbus_resume.patch
queue-6.6/loongarch-add-spectre-boundry-for-syscall-dispatch-t.patch
queue-6.6/net-bridge-use-a-stable-fdb-dst-snapshot-in-rcu-readers.patch
queue-6.6/x86-shstk-prevent-deadlock-during-shstk-sigreturn.patch
queue-6.6/rdma-mana_ib-disable-rx-steering-on-rss-qp-destroy.patch
queue-6.6/xfs-fix-a-resource-leak-in-xfs_alloc_buftarg.patch
queue-6.6/thermal-core-fix-thermal-zone-governor-cleanup-issues.patch
queue-6.6/drm-amd-display-do-not-skip-unrelated-mode-changes-i.patch
queue-6.6/wifi-mt76-mt792x-fix-mt7925u-usb-wfsys-reset-handling.patch
queue-6.6/net-qrtr-ns-limit-the-maximum-server-registration-per-node.patch
queue-6.6/ext4-validate-p_idx-bounds-in-ext4_ext_correct_index.patch
queue-6.6/bpf-support-non-r10-register-spill-fill-to-from-stac.patch
queue-6.6/rxrpc-fix-potential-uaf-after-skb_unshare-failure.patch
queue-6.6/firmware-google-framebuffer-do-not-unregister-platform-device.patch
queue-6.6/ksmbd-require-minimum-ace-size-in-smb_check_perm_dacl.patch
queue-6.6/media-rc-igorplugusb-heed-coherency-rules.patch