* [RFC PATCH v4 0/8] Intel RAR TLB invalidation
@ 2025-06-19 20:03 Rik van Riel
From: Rik van Riel @ 2025-06-19 20:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, dave.hansen, luto, peterz, bp, x86, nadav.amit,
	seanjc, tglx, mingo

This patch series adds support for IPI-less TLB invalidation
using Intel RAR technology.

Intel RAR differs from AMD INVLPGB in a few ways:
- RAR goes through (emulated?) APIC writes, not instructions
- RAR flushes go through a memory table with 64 entries
- RAR flushes can be targeted to a cpumask
- The RAR functionality must be set up at boot time before it can be used
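
As a rough illustration of the table and cpumask model described above,
here is a minimal userspace sketch. It is not the code from this series
and not the hardware layout: every sketch_* name, the 8-CPU limit, the
bitmap slot allocator, and the status values are assumptions made up for
the example. Only the 64-entry table size and the PENDING/SUCCESS status
names come from this cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PAYLOADS 64	/* table size stated in the cover letter */
#define SKETCH_NR_CPUS       8	/* hypothetical, kept small for illustration */

/* Status values are arbitrary here; 0 doubles as "nothing outstanding". */
enum sketch_status { SKETCH_RAR_SUCCESS, SKETCH_RAR_PENDING };

static uint64_t slots_in_use;	/* one bit per payload table entry */
static uint8_t action[SKETCH_NR_CPUS][SKETCH_MAX_PAYLOADS];

/* Claim a free payload slot, or return -1 if all 64 are busy. */
static int sketch_get_slot(void)
{
	for (int i = 0; i < SKETCH_MAX_PAYLOADS; i++) {
		if (!(slots_in_use & (UINT64_C(1) << i))) {
			slots_in_use |= UINT64_C(1) << i;
			return i;
		}
	}
	return -1;
}

static void sketch_put_slot(int slot)
{
	assert(slot >= 0 && slot < SKETCH_MAX_PAYLOADS);
	slots_in_use &= ~(UINT64_C(1) << slot);
}

/* Sender: mark the slot pending on every CPU in the target mask. */
static void sketch_send(uint32_t cpumask, int slot)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (cpumask & (1u << cpu))
			action[cpu][slot] = SKETCH_RAR_PENDING;
}

/* Target: process every pending entry and report success. */
static void sketch_cpu_handle(int cpu)
{
	for (int slot = 0; slot < SKETCH_MAX_PAYLOADS; slot++)
		if (action[cpu][slot] == SKETCH_RAR_PENDING)
			action[cpu][slot] = SKETCH_RAR_SUCCESS;
}

/* Sender: the request is complete once no targeted CPU is still pending. */
static bool sketch_done(uint32_t cpumask, int slot)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if ((cpumask & (1u << cpu)) &&
		    action[cpu][slot] != SKETCH_RAR_SUCCESS)
			return false;
	return true;
}
```

A sender would call sketch_get_slot(), sketch_send() with the target
mask, wait until sketch_done() returns true, and then release the slot
with sketch_put_slot(). Because the table has a fixed size, a caller has
to handle the -1 case, for example by waiting for a slot to free up.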

The cpumask targeting has resulted in Intel RAR and AMD INVLPGB having
slightly different rules:
- Processes with dynamic ASIDs use IPI based shootdowns
- INVLPGB: processes with global ASIDs
   - always have the TLB up to date, on every CPU
   - never need to flush the TLB at context switch time
- RAR: processes with global ASIDs
   - have the TLB up to date on CPUs in the mm_cpumask
   - can skip a TLB flush at context switch time if the CPU is in the mm_cpumask
   - need to flush the TLB when scheduled on a CPU not in the mm_cpumask,
     in case the process ran there before and the TLB still has stale entries
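
The rules above can be restated as a small decision helper. This is only
a sketch of the policy as described in this cover letter, not the code in
the patches; the enum and function names are hypothetical, and it leaves
out the existing flush tracking for dynamic ASIDs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum flush_hw  { FLUSH_HW_INVLPGB, FLUSH_HW_RAR };
enum asid_kind { ASID_DYNAMIC, ASID_GLOBAL };
enum shootdown { SHOOTDOWN_IPI, SHOOTDOWN_INVLPGB, SHOOTDOWN_RAR };

/* How are remote TLBs invalidated for a process? */
static enum shootdown pick_shootdown(enum flush_hw hw, enum asid_kind asid)
{
	/* Processes with dynamic ASIDs keep using IPI based shootdowns. */
	if (asid == ASID_DYNAMIC)
		return SHOOTDOWN_IPI;

	return hw == FLUSH_HW_RAR ? SHOOTDOWN_RAR : SHOOTDOWN_INVLPGB;
}

/*
 * Must a CPU flush its TLB when switching to a process that has a
 * global ASID?
 */
static bool global_asid_switch_needs_flush(enum flush_hw hw,
					   bool cpu_in_mm_cpumask)
{
	/* INVLPGB keeps the TLB up to date on every CPU. */
	if (hw == FLUSH_HW_INVLPGB)
		return false;

	/*
	 * RAR only targets CPUs in the mm_cpumask, so a CPU outside the
	 * mask may still hold stale entries from an earlier run.
	 */
	assert(hw == FLUSH_HW_RAR);
	return !cpu_in_mm_cpumask;
}
```

The only asymmetry between the two cases is the mm_cpumask check: with
RAR, membership in the mask is what guarantees that the CPU has received
every flush for that process.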

RAR functionality is present on Sapphire Rapids and newer CPUs.

Information about Intel RAR can be found in this whitepaper:

https://www.intel.com/content/dam/develop/external/us/en/documents/341431-remote-action-request-white-paper.pdf

This patch series is based on a 2019 patch series created by
Intel, with patches later in the series modified to fit into
the TLB flush code structure we have after AMD INVLPGB functionality
was integrated.

TODO:
- some sort of optimization to avoid sending RARs to CPUs in deeper
  idle states when they have init_mm loaded (flush when switching to init_mm?)

v4:
- remove chicken/egg problem that made it impossible to use RAR early
  in bootup, now RAR can be used to flush the local TLB (but it's broken?)
- always flush other CPUs with RAR, no more periodic flush_tlb_func
- separate, simplified cpumask trimming code
- attempt to use RAR to flush the local TLB, which should work
  according to the documentation
- add a DEBUG patch to flush the local TLB with RAR and again locally,
  may need some help from Intel to figure out why this makes a difference
- memory dumps of rar_payload[] suggest we are sending valid RARs
- receiving CPUs set the status from RAR_PENDING to RAR_SUCCESS
- unclear whether the TLB is actually flushed correctly :(
v3:
- move cpa_flush() change out of this patch series
- use MSR_IA32_CORE_CAPS definition, merge first two patches together
- move RAR initialization to early_init_intel()
- remove single-CPU "fast path" from smp_call_rar_many
- remove smp call table RAR entries, just do a direct call
- cleanups suggested (Ingo, Nadav, Dave, Thomas, Borislav, Sean)
- fix !CONFIG_SMP compile in Kconfig
- match RAR definitions to the names & numbers in the documentation
- the code seems to work now
v2:
- Cleanups suggested by Ingo and Nadav (thank you)
- Basic RAR code seems to actually work now.
- Kernel TLB flushes with RAR seem to work correctly.
- User TLB flushes with RAR are still broken, with two symptoms:
  - The !is_lazy WARN_ON in leave_mm() is tripped
  - Random segfaults.



Thread overview: 23+ messages
2025-06-19 20:03 [RFC PATCH v4 0/8] Intel RAR TLB invalidation Rik van Riel
2025-06-19 20:03 ` [RFC PATCH v4 1/8] x86/mm: Introduce Remote Action Request MSRs Rik van Riel
2025-06-19 20:03 ` [RFC PATCH v4 2/8] x86/mm: enable BROADCAST_TLB_FLUSH on Intel, too Rik van Riel
2025-06-26 13:08   ` Kirill A. Shutemov
2025-06-19 20:03 ` [RFC PATCH v4 3/8] x86/mm: Introduce X86_FEATURE_RAR Rik van Riel
2025-06-19 20:03 ` [RFC PATCH v4 4/8] x86/apic: Introduce Remote Action Request Operations Rik van Riel
2025-06-26 13:20   ` Kirill A. Shutemov
2025-06-26 16:09     ` Sean Christopherson
2025-06-19 20:03 ` [RFC PATCH v4 5/8] x86/mm: Introduce Remote Action Request Rik van Riel
2025-06-19 23:01   ` Nadav Amit
2025-06-20  1:10     ` Rik van Riel
2025-06-20 15:27       ` Sean Christopherson
2025-06-20 21:24       ` Nadav Amit
2025-06-23 10:50       ` David Laight
2025-06-20 15:05   ` kernel test robot
2025-06-26 15:41   ` Kirill A. Shutemov
2025-06-26 15:54     ` Kirill A. Shutemov
2025-06-19 20:03 ` [RFC PATCH v4 6/8] x86/mm: use RAR for kernel TLB flushes Rik van Riel
2025-06-27 13:27   ` Kirill A. Shutemov
2025-06-29  1:30     ` Rik van Riel
2025-06-19 20:03 ` [RFC PATCH v4 7/8] x86/mm: userspace & pageout flushing using Intel RAR Rik van Riel
2025-06-19 20:04 ` [RFC PATCH v4 8/8] x86/tlb: flush the local TLB twice (DEBUG) Rik van Riel
2025-06-26 18:08 ` [RFC PATCH v4 0/8] Intel RAR TLB invalidation Dave Jiang
