linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v1 00/13] arm64: Refactor TLB invalidation API and implementation
@ 2025-12-16 14:45 Ryan Roberts
  2025-12-16 14:45 ` [PATCH v1 01/13] arm64: mm: Re-implement the __tlbi_level macro as a C function Ryan Roberts
                   ` (12 more replies)
  0 siblings, 13 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-12-16 14:45 UTC (permalink / raw)
  To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
	Linu Cherian
  Cc: Ryan Roberts, linux-arm-kernel, linux-kernel

Hi All,

This series refactors the TLB invalidation API to make it more general and
flexible, and refactors the implementation, aiming to make it more robust,
easier to understand, and easier to extend with new features in future.

It is heavily based on the series posted by Will back in July at [1]; I've
attempted to maintain correct authorship and tags - apologies if I got any of
the etiquette wrong.

The first 8 patches reimplement the full scope of Will's series, but fixed up to
use function pointers instead of the enum, as per Linus's suggestion. Patches
9-12 then reformulate the API for the range- and page-based functions to remove
all the "nosync", "nonotify" and "local" function variants and replace them with
a set of flags that modify the behaviour instead. This allows a single
implementation that relies on constant folding. IMO it's much cleaner and more
flexible. Finally, patch 13 provides a minor theoretical performance improvement
by hinting the TTL for page-based invalidations (the preceding API improvements
made that pretty simple).

We have a couple of other things in the queue to put on top of this series,
which these changes make simpler:

 - Optimization to only do local TLBI when an mm is single-threaded
 - Introduce TLBIP for use with D128 pgtables

The series applies on top of v6.19-rc1. I've compile-tested each patch and run
the mm selftests for the end result in a VM on Apple M2; all tests pass. I've
run an earlier version of this code through our performance benchmarking system
and no regressions were found. I've looked at the generated instructions and all
the expected constant folding seems to be happening, and I've checked code size
before and after; there is no significant change.

[1] https://lore.kernel.org/linux-arm-kernel/20250711161732.384-1-will@kernel.org/

Thanks,
Ryan

Ryan Roberts (9):
  arm64: mm: Re-implement the __tlbi_level macro as a C function
  arm64: mm: Introduce a C wrapper for by-range TLB invalidation
  arm64: mm: Implicitly invalidate user ASID based on TLBI operation
  arm64: mm: Re-implement the __flush_tlb_range_op macro in C
  arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid()
  arm64: mm: Refactor __flush_tlb_range() to take flags
  arm64: mm: More flags for __flush_tlb_range()
  arm64: mm: Wrap flush_tlb_page() around ___flush_tlb_range()
  arm64: mm: Provide level hint for flush_tlb_page()

Will Deacon (4):
  arm64: mm: Push __TLBI_VADDR() into __tlbi_level()
  arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range()
  arm64: mm: Simplify __TLBI_RANGE_NUM() macro
  arm64: mm: Simplify __flush_tlb_range_limit_excess()

 arch/arm64/include/asm/hugetlb.h  |  12 +-
 arch/arm64/include/asm/pgtable.h  |  13 +-
 arch/arm64/include/asm/tlb.h      |   6 +-
 arch/arm64/include/asm/tlbflush.h | 461 +++++++++++++++++-------------
 arch/arm64/kernel/sys_compat.c    |   2 +-
 arch/arm64/kvm/hyp/nvhe/mm.c      |   2 +-
 arch/arm64/kvm/hyp/pgtable.c      |   4 +-
 arch/arm64/mm/contpte.c           |  12 +-
 arch/arm64/mm/fault.c             |   2 +-
 arch/arm64/mm/hugetlbpage.c       |   4 +-
 arch/arm64/mm/mmu.c               |   2 +-
 11 files changed, 288 insertions(+), 232 deletions(-)

--
2.43.0





Thread overview: 24+ messages
2025-12-16 14:45 [PATCH v1 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 01/13] arm64: mm: Re-implement the __tlbi_level macro as a C function Ryan Roberts
2025-12-16 17:53   ` Jonathan Cameron
2026-01-02 14:18     ` Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 02/13] arm64: mm: Introduce a C wrapper for by-range TLB invalidation Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 03/13] arm64: mm: Implicitly invalidate user ASID based on TLBI operation Ryan Roberts
2025-12-16 18:01   ` Jonathan Cameron
2026-01-02 14:20     ` Ryan Roberts
2025-12-18  6:30   ` Linu Cherian
2025-12-18  7:05     ` Linu Cherian
2025-12-18 15:47       ` Linu Cherian
2026-01-02 14:30         ` Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 04/13] arm64: mm: Push __TLBI_VADDR() into __tlbi_level() Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 05/13] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 06/13] arm64: mm: Re-implement the __flush_tlb_range_op macro in C Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 07/13] arm64: mm: Simplify __TLBI_RANGE_NUM() macro Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 08/13] arm64: mm: Simplify __flush_tlb_range_limit_excess() Ryan Roberts
2025-12-17  8:12   ` Dev Jain
2026-01-02 15:23     ` Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 09/13] arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid() Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 10/13] arm64: mm: Refactor __flush_tlb_range() to take flags Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 11/13] arm64: mm: More flags for __flush_tlb_range() Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 12/13] arm64: mm: Wrap flush_tlb_page() around ___flush_tlb_range() Ryan Roberts
2025-12-16 14:45 ` [PATCH v1 13/13] arm64: mm: Provide level hint for flush_tlb_page() Ryan Roberts
