Re: [PATCH v6 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <peterz@infradead.org>
To: Rik van Riel <riel@surriel.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de,
	dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
	nadav.amit@gmail.com, thomas.lendacky@amd.com,
	kernel-team@meta.com, linux-mm@kvack.org,
	akpm@linux-foundation.org, jannh@google.com,
	mhklinux@outlook.com, andrew.cooper3@citrix.com
Subject: Re: [PATCH v6 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
Date: Tue, 21 Jan 2025 11:33:33 +0100	[thread overview]
Message-ID: <20250121103333.GA7145@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <20250121095507.GB5388@noisy.programming.kicks-ass.net>

On Tue, Jan 21, 2025 at 10:55:07AM +0100, Peter Zijlstra wrote:
> On Sun, Jan 19, 2025 at 09:40:17PM -0500, Rik van Riel wrote:
> > +/*
> > + * Figure out whether to assign a global ASID to a process.
> > + * We vary the threshold by how empty or full global ASID space is.
> > + * 1/4 full: >= 4 active threads
> > + * 1/2 full: >= 8 active threads
> > + * 3/4 full: >= 16 active threads
> > + * 7/8 full: >= 32 active threads
> > + * etc
> > + *
> > + * This way we should never exhaust the global ASID space, even on very
> > + * large systems, and the processes with the largest number of active
> > + * threads should be able to use broadcast TLB invalidation.
> > + */
> > +#define HALFFULL_THRESHOLD 8
> > +static bool meets_global_asid_threshold(struct mm_struct *mm)
> > +{
> > +	int avail = global_asid_available;
> > +	int threshold = HALFFULL_THRESHOLD;
> > +
> > +	if (!avail)
> > +		return false;
> > +
> > +	if (avail > MAX_ASID_AVAILABLE * 3 / 4) {
> > +		threshold = HALFFULL_THRESHOLD / 4;
> > +	} else if (avail > MAX_ASID_AVAILABLE / 2) {
> > +		threshold = HALFFULL_THRESHOLD / 2;
> > +	} else if (avail < MAX_ASID_AVAILABLE / 3) {
> > +		do {
> > +			avail *= 2;
> > +			threshold *= 2;
> > +		} while ((avail + threshold) < MAX_ASID_AVAILABLE / 2);
> > +	}
> > +
> > +	return mm_active_cpus_exceeds(mm, threshold);
> > +}
> 
> I'm still very much disliking this. Why do we need this? Yes, running
> out of ASID space is a pain, but this increasing threshold also makes
> things behave weird.
> 
> Suppose our most used processes starts slow, and ends up not getting an
> ASID because too much irrelevant crap gets started before it spawns
> enough threads and then no longer qualifies.
> 
> Can't we just start with a very simple constant test and poke at things
> if/when its found to not work?

Something like so perhaps?

--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -268,7 +268,7 @@ static inline u16 mm_global_asid(struct
 	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
 		return 0;
 
-	asid = READ_ONCE(mm->context.global_asid);
+	asid = smp_load_acquire(&mm->context.global_asid);
 
 	/* mm->context.global_asid is either 0, or a global ASID */
 	VM_WARN_ON_ONCE(is_dyn_asid(asid));
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -308,13 +308,18 @@ static void reset_global_asid_space(void
 static u16 get_global_asid(void)
 {
 	lockdep_assert_held(&global_asid_lock);
+	bool done_reset = false;
 
 	do {
 		u16 start = last_global_asid;
 		u16 asid = find_next_zero_bit(global_asid_used, MAX_ASID_AVAILABLE, start);
 
-		if (asid >= MAX_ASID_AVAILABLE) {
+		if (asid > MAX_ASID_AVAILABLE) {
+			if (done_reset)
+				return asid;
+
 			reset_global_asid_space();
+			done_reset = true;
 			continue;
 		}
 
@@ -392,6 +398,12 @@ static bool mm_active_cpus_exceeds(struc
  */
 static void use_global_asid(struct mm_struct *mm)
 {
+	u16 asid;
+
+	/* This process is already using broadcast TLB invalidation. */
+	if (mm->context.global_asid)
+		return;
+
 	guard(raw_spinlock_irqsave)(&global_asid_lock);
 
 	/* This process is already using broadcast TLB invalidation. */
@@ -402,58 +414,25 @@ static void use_global_asid(struct mm_st
 	if (!global_asid_available)
 		return;
 
+	asid = get_global_asid();
+	if (asid > MAX_ASID_AVAILABLE)
+		return;
+
 	/*
-	 * The transition from IPI TLB flushing, with a dynamic ASID,
-	 * and broadcast TLB flushing, using a global ASID, uses memory
-	 * ordering for synchronization.
-	 *
-	 * While the process has threads still using a dynamic ASID,
-	 * TLB invalidation IPIs continue to get sent.
-	 *
-	 * This code sets asid_transition first, before assigning the
-	 * global ASID.
-	 *
-	 * The TLB flush code will only verify the ASID transition
-	 * after it has seen the new global ASID for the process.
+	 * Notably flush_tlb_mm_range() -> broadcast_tlb_flush() ->
+	 * finish_asid_transition() needs to observe asid_transition == true
+	 * once it observes global_asid.
 	 */
-	WRITE_ONCE(mm->context.asid_transition, true);
-	WRITE_ONCE(mm->context.global_asid, get_global_asid());
+	mm->context.asid_transition = true;
+	smp_store_release(&mm->context.global_asid, asid);
 }
 
-/*
- * Figure out whether to assign a global ASID to a process.
- * We vary the threshold by how empty or full global ASID space is.
- * 1/4 full: >= 4 active threads
- * 1/2 full: >= 8 active threads
- * 3/4 full: >= 16 active threads
- * 7/8 full: >= 32 active threads
- * etc
- *
- * This way we should never exhaust the global ASID space, even on very
- * large systems, and the processes with the largest number of active
- * threads should be able to use broadcast TLB invalidation.
- */
-#define HALFFULL_THRESHOLD 8
 static bool meets_global_asid_threshold(struct mm_struct *mm)
 {
-	int avail = global_asid_available;
-	int threshold = HALFFULL_THRESHOLD;
-
-	if (!avail)
+	if (!global_asid_available)
 		return false;
 
-	if (avail > MAX_ASID_AVAILABLE * 3 / 4) {
-		threshold = HALFFULL_THRESHOLD / 4;
-	} else if (avail > MAX_ASID_AVAILABLE / 2) {
-		threshold = HALFFULL_THRESHOLD / 2;
-	} else if (avail < MAX_ASID_AVAILABLE / 3) {
-		do {
-			avail *= 2;
-			threshold *= 2;
-		} while ((avail + threshold) < MAX_ASID_AVAILABLE / 2);
-	}
-
-	return mm_active_cpus_exceeds(mm, threshold);
+	return mm_active_cpus_exceeds(mm, 4);
 }
 
 static void consider_global_asid(struct mm_struct *mm)

next prev parent reply	other threads:[~2025-01-21 10:33 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-20  2:40 [PATCH v6 00/12] AMD broadcast TLB invalidation Rik van Riel
2025-01-20  2:40 ` [PATCH v6 01/12] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional Rik van Riel
2025-01-20 19:32   ` David Hildenbrand
2025-01-20  2:40 ` [PATCH v6 02/12] x86/mm: remove pv_ops.mmu.tlb_remove_table call Rik van Riel
2025-01-20 19:47   ` David Hildenbrand
2025-01-21  1:03     ` Rik van Riel
2025-01-21  7:46       ` David Hildenbrand
2025-01-21  8:54         ` Peter Zijlstra
2025-01-22 15:48           ` Rik van Riel
2025-01-20  2:40 ` [PATCH v6 03/12] x86/mm: consolidate full flush threshold decision Rik van Riel
2025-01-20  2:40 ` [PATCH v6 04/12] x86/mm: get INVLPGB count max from CPUID Rik van Riel
2025-01-20  2:40 ` [PATCH v6 05/12] x86/mm: add INVLPGB support code Rik van Riel
2025-01-21  9:45   ` Peter Zijlstra
2025-01-22 16:58     ` Rik van Riel
2025-01-20  2:40 ` [PATCH v6 06/12] x86/mm: use INVLPGB for kernel TLB flushes Rik van Riel
2025-01-20  2:40 ` [PATCH v6 07/12] x86/tlb: use INVLPGB in flush_tlb_all Rik van Riel
2025-01-20  2:40 ` [PATCH v6 08/12] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing Rik van Riel
2025-01-20  2:40 ` [PATCH v6 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes Rik van Riel
2025-01-20 14:02   ` Nadav Amit
2025-01-20 16:09     ` Rik van Riel
2025-01-20 20:04       ` Nadav Amit
2025-01-20 22:44         ` Rik van Riel
2025-01-21  7:31           ` Nadav Amit
2025-01-21  9:55   ` Peter Zijlstra
2025-01-21 10:33     ` Peter Zijlstra [this message]
2025-01-23  1:40       ` Rik van Riel
2025-01-21 18:48     ` Dave Hansen
2025-01-22  8:38   ` Peter Zijlstra
2025-01-23  1:13     ` Rik van Riel
2025-01-23  9:07       ` Peter Zijlstra
2025-01-23 12:42         ` Rik van Riel
2025-01-20  2:40 ` [PATCH v6 10/12] x86,tlb: do targeted broadcast flushing from tlbbatch code Rik van Riel
2025-01-20  2:40 ` [PATCH v6 11/12] x86/mm: enable AMD translation cache extensions Rik van Riel
2025-01-20  2:40 ` [PATCH v6 12/12] x86/mm: only invalidate final translations with INVLPGB Rik van Riel
2025-01-20  5:58 ` [PATCH v6 00/12] AMD broadcast TLB invalidation Michael Kelley
2025-01-24 11:41 ` Manali Shukla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250121103333.GA7145@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=jannh@google.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhklinux@outlook.com \
    --cc=nadav.amit@gmail.com \
    --cc=riel@surriel.com \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    --cc=zhengqi.arch@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.