[RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 13:39 [RFC PATCH 0/3] x86/mm: LAM fixups and cleanups Yosry Ahmed
@ 2024-03-07 13:39 ` Yosry Ahmed
  2024-03-07 15:29   ` Dave Hansen
                     ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-07 13:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel, Yosry Ahmed

During context switching, if we are not switching to new mm and no TLB
flush is needed, we do not write CR3. However, it is possible that a
user thread enables LAM while a kthread is running on a different CPU
with the old LAM CR3 mask. If the kthread context switches into any
thread of that user process, it may not write CR3 with the new LAM mask,
which would cause the user thread to run with a misconfigured CR3 that
disables LAM on the CPU.

Fix this by making sure we write a new CR3 if LAM is not up-to-date. No
problems were observed in practice, this was found by code inspection.

Not that it is possible that mm->context.lam_cr3_mask changes throughout
switch_mm_irqs_off(). But since LAM can only be enabled by a
single-threaded process on its own behalf, in that case we cannot be
switching to a user thread in that same process, we can only be
switching to another kthread using the borrowed mm or a different user
process, which should be fine.

Fixes: 82721d8b25d7 ("x86/mm: Handle LAM on context switch")
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
 arch/x86/mm/tlb.c | 50 ++++++++++++++++++++++++++++-------------------
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 2975d3f89a5de..3610c23499085 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -503,11 +503,12 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 {
 	struct mm_struct *prev = this_cpu_read(cpu_tlbstate.loaded_mm);
 	u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+	u64 cpu_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen);
 	bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy);
+	bool need_flush = false, need_lam_update = false;
 	unsigned cpu = smp_processor_id();
 	unsigned long new_lam;
 	u64 next_tlb_gen;
-	bool need_flush;
 	u16 new_asid;
 
 	/* We don't want flush_tlb_func() to run concurrently with us. */
@@ -570,32 +571,41 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 
+		/*
+		 * tlbstate_lam_cr3_mask() may be outdated if a different thread
+		 * has enabled LAM while we were borrowing its mm on this CPU.
+		 * Make sure we update CR3 in case we are switching to another
+		 * thread in that process.
+		 */
+		if (tlbstate_lam_cr3_mask() != mm_lam_cr3_mask(next))
+			need_lam_update = true;
+
 		/*
 		 * If the CPU is not in lazy TLB mode, we are just switching
 		 * from one thread in a process to another thread in the same
 		 * process. No TLB flush required.
 		 */
-		if (!was_lazy)
-			return;
+		if (was_lazy) {
+			/*
+			 * Read the tlb_gen to check whether a flush is needed.
+			 * If the TLB is up to date, just use it.  The barrier
+			 * synchronizes with the tlb_gen increment in the TLB
+			 * shootdown code.
+			 */
+			smp_mb();
+			next_tlb_gen = atomic64_read(&next->context.tlb_gen);
+			if (cpu_tlb_gen < next_tlb_gen) {
+				/*
+				 * TLB contents went out of date while we were
+				 * in lazy mode.
+				 */
+				new_asid = prev_asid;
+				need_flush = true;
+			}
+		}
 
-		/*
-		 * Read the tlb_gen to check whether a flush is needed.
-		 * If the TLB is up to date, just use it.
-		 * The barrier synchronizes with the tlb_gen increment in
-		 * the TLB shootdown code.
-		 */
-		smp_mb();
-		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
-		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
-				next_tlb_gen)
+		if (!need_flush && !need_lam_update)
 			return;
-
-		/*
-		 * TLB contents went out of date while we were in lazy
-		 * mode. Fall through to the TLB switching code below.
-		 */
-		new_asid = prev_asid;
-		need_flush = true;
 	} else {
 		/*
 		 * Apply process to process speculation vulnerability
-- 
2.44.0.278.ge034bb2e1d-goog



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 13:39 ` [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching Yosry Ahmed
@ 2024-03-07 15:29   ` Dave Hansen
  2024-03-07 21:04     ` Yosry Ahmed
  2024-03-07 17:29   ` Kirill A. Shutemov
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 27+ messages in thread
From: Dave Hansen @ 2024-03-07 15:29 UTC (permalink / raw)
  To: Yosry Ahmed, Andrew Morton
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Kirill A. Shutemov, x86, linux-mm, linux-kernel

On 3/7/24 05:39, Yosry Ahmed wrote:
> -		/*
> -		 * Read the tlb_gen to check whether a flush is needed.
> -		 * If the TLB is up to date, just use it.
> -		 * The barrier synchronizes with the tlb_gen increment in
> -		 * the TLB shootdown code.
> -		 */
> -		smp_mb();
> -		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
> -		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
> -				next_tlb_gen)
> +		if (!need_flush && !need_lam_update)
>  			return;

Instead of all this new complexity, why not just inc_mm_tlb_gen() at the
site where LAM is enabled?  That should signal to any CPU thread that
its TLB is out of date and it needs to do a full CR3 reload.

Also, have you been able to actually catch this scenario in practice?
Considering how fun this code path is, a little effort at an actual
reproduction would be really appreciated.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 13:39 ` [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching Yosry Ahmed
  2024-03-07 15:29   ` Dave Hansen
@ 2024-03-07 17:29   ` Kirill A. Shutemov
  2024-03-07 17:56     ` Dave Hansen
  2024-03-08  1:34   ` Andy Lutomirski
  2024-03-11  6:09   ` Dan Carpenter
  3 siblings, 1 reply; 27+ messages in thread
From: Kirill A. Shutemov @ 2024-03-07 17:29 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Peter Zijlstra, Andy Lutomirski, x86, linux-mm,
	linux-kernel

On Thu, Mar 07, 2024 at 01:39:15PM +0000, Yosry Ahmed wrote:
> During context switching, if we are not switching to new mm and no TLB
> flush is needed, we do not write CR3. However, it is possible that a
> user thread enables LAM while a kthread is running on a different CPU
> with the old LAM CR3 mask. If the kthread context switches into any
> thread of that user process, it may not write CR3 with the new LAM mask,
> which would cause the user thread to run with a misconfigured CR3 that
> disables LAM on the CPU.

I don't think it is possible. As I said we can only enable LAM when the
process has single thread. If it enables LAM concurrently with kernel
thread and kernel thread gets control on the same CPU after the userspace
thread of the same process LAM is already going to be enabled. No need in
special handling.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 17:29   ` Kirill A. Shutemov
@ 2024-03-07 17:56     ` Dave Hansen
  2024-03-07 21:08       ` Yosry Ahmed
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Hansen @ 2024-03-07 17:56 UTC (permalink / raw)
  To: Kirill A. Shutemov, Yosry Ahmed
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, x86, linux-mm, linux-kernel

On 3/7/24 09:29, Kirill A. Shutemov wrote:
> On Thu, Mar 07, 2024 at 01:39:15PM +0000, Yosry Ahmed wrote:
>> During context switching, if we are not switching to new mm and no TLB
>> flush is needed, we do not write CR3. However, it is possible that a
>> user thread enables LAM while a kthread is running on a different CPU
>> with the old LAM CR3 mask. If the kthread context switches into any
>> thread of that user process, it may not write CR3 with the new LAM mask,
>> which would cause the user thread to run with a misconfigured CR3 that
>> disables LAM on the CPU.
> I don't think it is possible. As I said we can only enable LAM when the
> process has single thread. If it enables LAM concurrently with kernel
> thread and kernel thread gets control on the same CPU after the userspace
> thread of the same process LAM is already going to be enabled. No need in
> special handling.

I think it's something logically like this:

						// main thread
	kthread_use_mm()
	cr3 |= mm->lam_cr3_mask;
						mm->lam_cr3_mask = foo;
	cpu_tlbstate.lam = mm->lam_cr3_mask;

Obviously the kthread's LAM state is going to be random.  It's
fundamentally racing with the enabling thread.  That part is fine.

The main pickle is the fact that CR3 and cpu_tlbstate.lam are out of
sync.  That seems worth fixing.

Or is there something else that keeps this whole thing from racing in
the first place?


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 15:29   ` Dave Hansen
@ 2024-03-07 21:04     ` Yosry Ahmed
  2024-03-07 21:39       ` Dave Hansen
  0 siblings, 1 reply; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-07 21:04 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel

On Thu, Mar 07, 2024 at 07:29:34AM -0800, Dave Hansen wrote:
> On 3/7/24 05:39, Yosry Ahmed wrote:
> > -		/*
> > -		 * Read the tlb_gen to check whether a flush is needed.
> > -		 * If the TLB is up to date, just use it.
> > -		 * The barrier synchronizes with the tlb_gen increment in
> > -		 * the TLB shootdown code.
> > -		 */
> > -		smp_mb();
> > -		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
> > -		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
> > -				next_tlb_gen)
> > +		if (!need_flush && !need_lam_update)
> >  			return;
> 
> Instead of all this new complexity, why not just inc_mm_tlb_gen() at the
> site where LAM is enabled?  That should signal to any CPU thread that
> its TLB is out of date and it needs to do a full CR3 reload.

It's not really a lot of complexity imo, most of the patch is reverting
the if conditions that return if a TLB flush is not needed to have a
single if block that sets need_flush to true. I can split this to a
different patch if it helps review, or I can do something like this to
keep the diff concise:

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 2975d3f89a5de..17ab105522287 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -576,7 +576,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
                 * process. No TLB flush required.
                 */
                if (!was_lazy)
-                       return;
+                       goto check_return;

                /*
                 * Read the tlb_gen to check whether a flush is needed.
@@ -588,7 +588,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
                next_tlb_gen = atomic64_read(&next->context.tlb_gen);
                if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
                                next_tlb_gen)
-                       return;
+                       goto check_return;

                /*
                 * TLB contents went out of date while we were in lazy
@@ -596,6 +596,9 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
                 */
                new_asid = prev_asid;
                need_flush = true;
+check_return:
+               if (!need_flush && /* LAM up-to-date */)
+                       return;
        } else {
                /*
                 * Apply process to process speculation vulnerability

..but I prefer the current patch though to avoid the goto. I think the
flow is easier to follow.

I thought about doing inc_mm_tlb_gen() when LAM is updated, but it felt
hacky and more importantly doesn't make it clear in switch_mm_irqs_off()
that we correctly handle LAM updates. We can certainly add a comment,
but I think an explicit check for CPU LAM vs. mm LAM is much clearer.

WDYT?

> 
> Also, have you been able to actually catch this scenario in practice?

Nope, by code inspection (as I admitted in the commit log).

> Considering how fun this code path is, a little effort at an actual
> reproduction would be really appreciated.

I tried reproducing it but gave up quickly. We need a certain sequence
of events to happen:

CPU 1					CPU 2
kthread_use_mm()
					/* user thread enables LAM */
					context_switch()
context_switch() /* to user thread */


IIUC we don't really need kthread_use_mm(), we need the kthread to be
scheduled on the CPU directly after the user thread, so maybe something
like:

CPU 1					CPU 2
/* user thread running */
context_switch() /* to kthread */
					/* user thread enables LAM */
					context_switch()
context_switch() /* to user thread */

It doesn't seem easy to reproduce. WDYT?


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 17:56     ` Dave Hansen
@ 2024-03-07 21:08       ` Yosry Ahmed
  2024-03-07 21:48         ` Dave Hansen
  0 siblings, 1 reply; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-07 21:08 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kirill A. Shutemov, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Peter Zijlstra, Andy Lutomirski, x86, linux-mm,
	linux-kernel

On Thu, Mar 07, 2024 at 09:56:07AM -0800, Dave Hansen wrote:
> On 3/7/24 09:29, Kirill A. Shutemov wrote:
> > On Thu, Mar 07, 2024 at 01:39:15PM +0000, Yosry Ahmed wrote:
> >> During context switching, if we are not switching to new mm and no TLB
> >> flush is needed, we do not write CR3. However, it is possible that a
> >> user thread enables LAM while a kthread is running on a different CPU
> >> with the old LAM CR3 mask. If the kthread context switches into any
> >> thread of that user process, it may not write CR3 with the new LAM mask,
> >> which would cause the user thread to run with a misconfigured CR3 that
> >> disables LAM on the CPU.
> > I don't think it is possible. As I said we can only enable LAM when the
> > process has single thread. If it enables LAM concurrently with kernel
> > thread and kernel thread gets control on the same CPU after the userspace
> > thread of the same process LAM is already going to be enabled. No need in
> > special handling.
> 
> I think it's something logically like this:
> 
> 						// main thread
> 	kthread_use_mm()
> 	cr3 |= mm->lam_cr3_mask;
> 						mm->lam_cr3_mask = foo;
> 	cpu_tlbstate.lam = mm->lam_cr3_mask;

IIUC it doesn't have to be through kthread_use_mm(). If we context
switch directly from the user thread to a kthread, the kthread will keep
using the user thread's mm AFAICT.

> 
> Obviously the kthread's LAM state is going to be random.  It's
> fundamentally racing with the enabling thread.  That part is fine.
> 
> The main pickle is the fact that CR3 and cpu_tlbstate.lam are out of
> sync.  That seems worth fixing.

That's what is fixed by patch 1, specifically a race between
switch_mm_irqs_off() and LAM being enabled. This patch is fixing a
different problem:

CPU 1                                   CPU 2
/* user thread running */
context_switch() /* to kthread */
                                        /* user thread enables LAM */
                                        context_switch()
context_switch() /* to user thread */

In this case, there are no races, but the second context switch on CPU 1
may not write CR3 (if TLB is up-to-date), in which case we will run the
user thread with CR3 having the wrong LAM mask. This could cause bigger
problems, right?

> 
> Or is there something else that keeps this whole thing from racing in
> the first place?

+1 that would be good to know, but I didn't find anything.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 21:04     ` Yosry Ahmed
@ 2024-03-07 21:39       ` Dave Hansen
  2024-03-07 22:29         ` Yosry Ahmed
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Hansen @ 2024-03-07 21:39 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel

On 3/7/24 13:04, Yosry Ahmed wrote:
> I thought about doing inc_mm_tlb_gen() when LAM is updated, but it felt
> hacky and more importantly doesn't make it clear in switch_mm_irqs_off()
> that we correctly handle LAM updates. We can certainly add a comment,
> but I think an explicit check for CPU LAM vs. mm LAM is much clearer.
> 
> WDYT?

The mm generations are literally there so that if the mm changes that
all the CPUs know they need an update.  Changing LAM enabling is 100%
consistent with telling other CPUs that they need an update.

I'd be curious of Andy feels differently though.

>> Considering how fun this code path is, a little effort at an actual
>> reproduction would be really appreciated.
> 
> I tried reproducing it but gave up quickly. We need a certain sequence
> of events to happen:
> 
> CPU 1					CPU 2
> kthread_use_mm()
> 					/* user thread enables LAM */
> 					context_switch()
> context_switch() /* to user thread */

First, it would be fine to either create a new kthread for reproduction
purposes or to hack an existing one.  For instance, have have the LAM
prctl() take an extra ref on the mm and stick it in a global variable:

	mmgrab(current->mm);
	global_mm = current->mm;

Then in the kthread, grab the mm and use it:

	while (!global_mm);
	kthread_use_mm(global_mm);
	... check for the race
	mmdrop(global_mm);

You can also hackily wait for thread to move with a stupid spin loop:

	while (smp_processor_id() != 1);

and then actually move it with sched_setaffinity() from userspace.  That
can make it easier to get that series of events to happen in lockstep.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 21:08       ` Yosry Ahmed
@ 2024-03-07 21:48         ` Dave Hansen
  2024-03-07 22:30           ` Yosry Ahmed
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Hansen @ 2024-03-07 21:48 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Kirill A. Shutemov, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Peter Zijlstra, Andy Lutomirski, x86, linux-mm,
	linux-kernel

On 3/7/24 13:08, Yosry Ahmed wrote:
> CPU 1                                   CPU 2
> /* user thread running */
> context_switch() /* to kthread */
>                                         /* user thread enables LAM */
>                                         context_switch()
> context_switch() /* to user thread */
> 
> In this case, there are no races, but the second context switch on CPU 1
> may not write CR3 (if TLB is up-to-date), in which case we will run the
> user thread with CR3 having the wrong LAM mask. This could cause bigger
> problems, right?

Yes, but this is precisely the kind of thing that we should solve with
mm generations.  Its sole purpose is to thwart optimization where the
mm_prev==mm_next and might not rewrite CR3.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 21:39       ` Dave Hansen
@ 2024-03-07 22:29         ` Yosry Ahmed
  2024-03-07 22:41           ` Dave Hansen
  2024-03-08  1:26           ` Yosry Ahmed
  0 siblings, 2 replies; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-07 22:29 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel

On Thu, Mar 07, 2024 at 01:39:53PM -0800, Dave Hansen wrote:
> On 3/7/24 13:04, Yosry Ahmed wrote:
> > I thought about doing inc_mm_tlb_gen() when LAM is updated, but it felt
> > hacky and more importantly doesn't make it clear in switch_mm_irqs_off()
> > that we correctly handle LAM updates. We can certainly add a comment,
> > but I think an explicit check for CPU LAM vs. mm LAM is much clearer.
> > 
> > WDYT?
> 
> The mm generations are literally there so that if the mm changes that
> all the CPUs know they need an update.  Changing LAM enabling is 100%
> consistent with telling other CPUs that they need an update.
> 
> I'd be curious of Andy feels differently though.

The mm generations are TLB-specific and all the code using them implies
as such (e.g. look at the checks in switch_mm_irqs_off() when prev ==
next). We can go around and update comments and/or function names to
make them more generic, but this seems excessive. If we don't, the code
becomes less clear imo.

I agree that the use case here is essentially the same (let other
CPUs know they need to write CR3), but I still think that since the LAM
case is just a simple one-time enablement, an explicit check in
switch_mm_irqs_off() would be clearer.

Just my 2c, let me know what you prefer :)

> 
> >> Considering how fun this code path is, a little effort at an actual
> >> reproduction would be really appreciated.
> > 
> > I tried reproducing it but gave up quickly. We need a certain sequence
> > of events to happen:
> > 
> > CPU 1					CPU 2
> > kthread_use_mm()
> > 					/* user thread enables LAM */
> > 					context_switch()
> > context_switch() /* to user thread */
> 
> First, it would be fine to either create a new kthread for reproduction
> purposes or to hack an existing one.  For instance, have have the LAM
> prctl() take an extra ref on the mm and stick it in a global variable:
> 
> 	mmgrab(current->mm);
> 	global_mm = current->mm;
> 
> Then in the kthread, grab the mm and use it:
> 
> 	while (!global_mm);
> 	kthread_use_mm(global_mm);
> 	... check for the race
> 	mmdrop(global_mm);
> 
> You can also hackily wait for thread to move with a stupid spin loop:
> 
> 	while (smp_processor_id() != 1);
> 
> and then actually move it with sched_setaffinity() from userspace.  That
> can make it easier to get that series of events to happen in lockstep.

I will take a stab at doing something similar and let you know, thanks.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 21:48         ` Dave Hansen
@ 2024-03-07 22:30           ` Yosry Ahmed
  0 siblings, 0 replies; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-07 22:30 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kirill A. Shutemov, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Peter Zijlstra, Andy Lutomirski, x86, linux-mm,
	linux-kernel

On Thu, Mar 07, 2024 at 01:48:28PM -0800, Dave Hansen wrote:
> On 3/7/24 13:08, Yosry Ahmed wrote:
> > CPU 1                                   CPU 2
> > /* user thread running */
> > context_switch() /* to kthread */
> >                                         /* user thread enables LAM */
> >                                         context_switch()
> > context_switch() /* to user thread */
> > 
> > In this case, there are no races, but the second context switch on CPU 1
> > may not write CR3 (if TLB is up-to-date), in which case we will run the
> > user thread with CR3 having the wrong LAM mask. This could cause bigger
> > problems, right?
> 
> Yes, but this is precisely the kind of thing that we should solve with
> mm generations.  Its sole purpose is to thwart optimization where the
> mm_prev==mm_next and might not rewrite CR3.

I think it's clearer to have an explicit check for oudated LAM in
switch_mm_irqs_off(), as I mentioned in my other reply.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 22:29         ` Yosry Ahmed
@ 2024-03-07 22:41           ` Dave Hansen
  2024-03-07 22:44             ` Yosry Ahmed
  2024-03-08  1:26           ` Yosry Ahmed
  1 sibling, 1 reply; 27+ messages in thread
From: Dave Hansen @ 2024-03-07 22:41 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel

On 3/7/24 14:29, Yosry Ahmed wrote:
> Just my 2c, let me know what you prefer 🙂

I'll let Andy weigh in here, but I think I kinda already mentioned at
least twice what I prefer. :)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 22:41           ` Dave Hansen
@ 2024-03-07 22:44             ` Yosry Ahmed
  0 siblings, 0 replies; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-07 22:44 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel

On Thu, Mar 7, 2024 at 2:41 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 3/7/24 14:29, Yosry Ahmed wrote:
> > Just my 2c, let me know what you prefer 🙂
>
> I'll let Andy weigh in here, but I think I kinda already mentioned at
> least twice what I prefer. :)

Well I was hoping my last email changed your mind, but obviously not :)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 22:29         ` Yosry Ahmed
  2024-03-07 22:41           ` Dave Hansen
@ 2024-03-08  1:26           ` Yosry Ahmed
  2024-03-08  8:09             ` Yosry Ahmed
  1 sibling, 1 reply; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-08  1:26 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel

On Thu, Mar 07, 2024 at 10:29:29PM +0000, Yosry Ahmed wrote:
> On Thu, Mar 07, 2024 at 01:39:53PM -0800, Dave Hansen wrote:
> > On 3/7/24 13:04, Yosry Ahmed wrote:
> > > I thought about doing inc_mm_tlb_gen() when LAM is updated, but it felt
> > > hacky and more importantly doesn't make it clear in switch_mm_irqs_off()
> > > that we correctly handle LAM updates. We can certainly add a comment,
> > > but I think an explicit check for CPU LAM vs. mm LAM is much clearer.
> > > 
> > > WDYT?
> > 
> > The mm generations are literally there so that if the mm changes that
> > all the CPUs know they need an update.  Changing LAM enabling is 100%
> > consistent with telling other CPUs that they need an update.
> > 
> > I'd be curious of Andy feels differently though.
> 
> The mm generations are TLB-specific and all the code using them implies
> as such (e.g. look at the checks in switch_mm_irqs_off() when prev ==
> next). We can go around and update comments and/or function names to
> make them more generic, but this seems excessive. If we don't, the code
> becomes less clear imo.
> 
> I agree that the use case here is essentially the same (let other
> CPUs know they need to write CR3), but I still think that since the LAM
> case is just a simple one-time enablement, an explicit check in
> switch_mm_irqs_off() would be clearer.
> 
> Just my 2c, let me know what you prefer :)
> 
> > 
> > >> Considering how fun this code path is, a little effort at an actual
> > >> reproduction would be really appreciated.
> > > 
> > > I tried reproducing it but gave up quickly. We need a certain sequence
> > > of events to happen:
> > > 
> > > CPU 1					CPU 2
> > > kthread_use_mm()
> > > 					/* user thread enables LAM */
> > > 					context_switch()
> > > context_switch() /* to user thread */
> > 
> > First, it would be fine to either create a new kthread for reproduction
> > purposes or to hack an existing one.  For instance, have have the LAM
> > prctl() take an extra ref on the mm and stick it in a global variable:
> > 
> > 	mmgrab(current->mm);
> > 	global_mm = current->mm;
> > 
> > Then in the kthread, grab the mm and use it:
> > 
> > 	while (!global_mm);
> > 	kthread_use_mm(global_mm);
> > 	... check for the race
> > 	mmdrop(global_mm);
> > 
> > You can also hackily wait for thread to move with a stupid spin loop:
> > 
> > 	while (smp_processor_id() != 1);
> > 
> > and then actually move it with sched_setaffinity() from userspace.  That
> > can make it easier to get that series of events to happen in lockstep.
> 
> I will take a stab at doing something similar and let you know, thanks.

I came up with a kernel patch that I *think* may reproduce the problem
with enough iterations. Userspace only needs to enable LAM, so I think
the selftest can be enough to trigger it.

However, there is no hardware with LAM at my disposal, and IIUC I cannot
use QEMU without KVM to run a kernel with LAM. I was planning to do more
testing before sending a non-RFC version, but apparently I cannot do
any testing beyond building at this point (including reproducing) :/

Let me know how you want to proceed. I can send a non-RFC v1 based on
the feedback I got on the RFC, but it will only be build tested.

For the record, here is the diff that I *think* may reproduce the bug:

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 33b268747bb7b..c37a8c26a3c21 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -750,8 +750,25 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
 
 #define LAM_U57_BITS 6
 
+static int kthread_fn(void *_mm)
+{
+	struct mm_struct *mm = _mm;
+
+	/*
+	 * Wait for LAM to be enabled then schedule. Hopefully we will context
+	 * switch directly into the task that enabled LAM due to CPU pinning.
+	 */
+	kthread_use_mm(mm);
+	while (!test_bit(MM_CONTEXT_LOCK_LAM, &mm->context.flags));
+	schedule();
+	return 0;
+}
+
 static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
 {
+	struct task_struct *kthread_task;
+	int kthread_cpu;
+
 	if (!cpu_feature_enabled(X86_FEATURE_LAM))
 		return -ENODEV;
 
@@ -782,10 +799,22 @@ static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
 		return -EINVAL;
 	}
 
+	/* Pin the task to the current CPU */
+	set_cpus_allowed_ptr(current, cpumask_of(smp_processor_id()));
+
+	/* Run a kthread on another CPU and wait for it to start */
+	kthread_cpu = cpumask_next_wrap(smp_processor_id(), cpu_online_mask, 0, false),
+	kthread_task = kthread_run_on_cpu(kthread_fn, mm, kthread_cpu, "lam_repro_kthread");
+	while (!task_is_running(kthread_task));
+
 	write_cr3(__read_cr3() | mm->context.lam_cr3_mask);
 	set_tlbstate_lam_mode(mm);
 	set_bit(MM_CONTEXT_LOCK_LAM, &mm->context.flags);
 
+	/* Move the task to the kthread CPU */
+	set_cpus_allowed_ptr(current, cpumask_of(kthread_cpu));
+
 	mmap_write_unlock(mm);
 
 	return 0;
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 51f9f56941058..3afb53f1a1901 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -593,7 +593,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
 				next_tlb_gen)
-			return;
+			BUG_ON(new_lam != tlbstate_lam_cr3_mask());
 
 		/*
 		 * TLB contents went out of date while we were in lazy



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 13:39 ` [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching Yosry Ahmed
  2024-03-07 15:29   ` Dave Hansen
  2024-03-07 17:29   ` Kirill A. Shutemov
@ 2024-03-08  1:34   ` Andy Lutomirski
  2024-03-08  1:47     ` Yosry Ahmed
  2024-03-08 15:23     ` Dave Hansen
  2024-03-11  6:09   ` Dan Carpenter
  3 siblings, 2 replies; 27+ messages in thread
From: Andy Lutomirski @ 2024-03-08  1:34 UTC (permalink / raw)
  To: Yosry Ahmed, Andrew Morton
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Peter Zijlstra (Intel), Kirill A. Shutemov,
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

Catching up a bit...

On Thu, Mar 7, 2024, at 5:39 AM, Yosry Ahmed wrote:
> During context switching, if we are not switching to new mm and no TLB
> flush is needed, we do not write CR3. However, it is possible that a
> user thread enables LAM while a kthread is running on a different CPU
> with the old LAM CR3 mask. If the kthread context switches into any
> thread of that user process, it may not write CR3 with the new LAM mask,
> which would cause the user thread to run with a misconfigured CR3 that
> disables LAM on the CPU.

So I think (off the top of my head -- haven't thought about it all that hard) that LAM is logically like PCE and LDT: it's a property of an mm that is only rarely changed, and it doesn't really belong as part of the tlb_gen mechanism.  And, critically, it's not worth the effort and complexity to try to optimize LAM changes when we have a lazy CPU (just like PCE and LDT) (whereas TLB flushes are performance critical and are absolutely worth optimizing).

So...

>
> Fix this by making sure we write a new CR3 if LAM is not up-to-date. No
> problems were observed in practice, this was found by code inspection.

I think it should be fixed with a much bigger hammer: explicit IPIs.  Just don't ever let it get out of date, like install_ldt().

>
> Not that it is possible that mm->context.lam_cr3_mask changes throughout
> switch_mm_irqs_off(). But since LAM can only be enabled by a
> single-threaded process on its own behalf, in that case we cannot be
> switching to a user thread in that same process, we can only be
> switching to another kthread using the borrowed mm or a different user
> process, which should be fine.

The thought process is even simpler with the IPI: it *can* change while switching, but it will resynchronize immediately once IRQs turn back on.  And whoever changes it will *synchronize* with us, which would otherwise require extremely complex logic to get right.

And...

> -		if (!was_lazy)
> -			return;
> +		if (was_lazy) {
> +			/*
> +			 * Read the tlb_gen to check whether a flush is needed.
> +			 * If the TLB is up to date, just use it.  The barrier
> +			 * synchronizes with the tlb_gen increment in the TLB
> +			 * shootdown code.
> +			 */
> +			smp_mb();

This is actually rather expensive -- from old memory, we're talking maybe 20 cycles here, but this path is *very* hot and we try fairly hard to make it be fast.  If we get the happy PCID path, it's maybe 100-200 cycles, so this is like a 10% regression.  Ouch.

And you can delete all of this if you accept my suggestion.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-08  1:34   ` Andy Lutomirski
@ 2024-03-08  1:47     ` Yosry Ahmed
  2024-03-08 14:05       ` Kirill A. Shutemov
  2024-03-08 15:23     ` Dave Hansen
  1 sibling, 1 reply; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-08  1:47 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Peter Zijlstra (Intel), Kirill A. Shutemov,
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Thu, Mar 07, 2024 at 05:34:21PM -0800, Andy Lutomirski wrote:
> Catching up a bit...
> 
> On Thu, Mar 7, 2024, at 5:39 AM, Yosry Ahmed wrote:
> > During context switching, if we are not switching to new mm and no TLB
> > flush is needed, we do not write CR3. However, it is possible that a
> > user thread enables LAM while a kthread is running on a different CPU
> > with the old LAM CR3 mask. If the kthread context switches into any
> > thread of that user process, it may not write CR3 with the new LAM mask,
> > which would cause the user thread to run with a misconfigured CR3 that
> > disables LAM on the CPU.
> 
> So I think (off the top of my head -- haven't thought about it all that hard) that LAM is logically like PCE and LDT: it's a property of an mm that is only rarely changed, and it doesn't really belong as part of the tlb_gen mechanism.  And, critically, it's not worth the effort and complexity to try to optimize LAM changes when we have a lazy CPU (just like PCE and LDT) (whereas TLB flushes are performance critical and are absolutely worth optimizing).
> 
> So...
> 
> >
> > Fix this by making sure we write a new CR3 if LAM is not up-to-date. No
> > problems were observed in practice, this was found by code inspection.
> 
> I think it should be fixed with a much bigger hammer: explicit IPIs.  Just don't ever let it get out of date, like install_ldt().

I like this, and I think earlier versions of the code did this. I think
the code now assumes it's fine to not send an IPI since only
single-threaded processes can enable LAM, but this means we have to
handle kthreads switching to user threads with outdated LAMs (what this
patch is trying to do).

I also think there is currently an assumption that it's fine for
kthreads to run with an incorrect LAM, which is mostly fine, but the IPI
also drops that assumption.

> 
> >
> > Not that it is possible that mm->context.lam_cr3_mask changes throughout
> > switch_mm_irqs_off(). But since LAM can only be enabled by a
> > single-threaded process on its own behalf, in that case we cannot be
> > switching to a user thread in that same process, we can only be
> > switching to another kthread using the borrowed mm or a different user
> > process, which should be fine.
> 
> The thought process is even simpler with the IPI: it *can* change while switching, but it will resynchronize immediately once IRQs turn back on.  And whoever changes it will *synchronize* with us, which would otherwise require extremely complex logic to get right.
> 
> And...
> 
> > -		if (!was_lazy)
> > -			return;
> > +		if (was_lazy) {
> > +			/*
> > +			 * Read the tlb_gen to check whether a flush is needed.
> > +			 * If the TLB is up to date, just use it.  The barrier
> > +			 * synchronizes with the tlb_gen increment in the TLB
> > +			 * shootdown code.
> > +			 */
> > +			smp_mb();
> 
> This is actually rather expensive -- from old memory, we're talking maybe 20 cycles here, but this path is *very* hot and we try fairly hard to make it be fast.  If we get the happy PCID path, it's maybe 100-200 cycles, so this is like a 10% regression.  Ouch.

This is not newly introduced by this patch. I merely refactored this
code (reversed the if conditions). I think if we keep the current
approach I should move this refactoring to a separate patch to make
things clearer.

> 
> And you can delete all of this if you accept my suggestion.

I like it very much. The problem now is, as I told Dave, I realized I
cannot do any testing beyond compilation due to lack of hardware. I am
happy to send a next version if this is acceptable or if someone else
can test.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-08  1:26           ` Yosry Ahmed
@ 2024-03-08  8:09             ` Yosry Ahmed
  0 siblings, 0 replies; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-08  8:09 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, Kirill A. Shutemov, x86,
	linux-mm, linux-kernel

> I came up with a kernel patch that I *think* may reproduce the problem
> with enough iterations. Userspace only needs to enable LAM, so I think
> the selftest can be enough to trigger it.
> 
> However, there is no hardware with LAM at my disposal, and IIUC I cannot
> use QEMU without KVM to run a kernel with LAM. I was planning to do more
> testing before sending a non-RFC version, but apparently I cannot do
> any testing beyond building at this point (including reproducing) :/
> 
> Let me know how you want to proceed. I can send a non-RFC v1 based on
> the feedback I got on the RFC, but it will only be build tested.
> 
> For the record, here is the diff that I *think* may reproduce the bug:

Okay, I was actually able to run _some_ testing with the diff below on
_a kernel_, and I hit the BUG_ON pretty quickly. If I did things
correctly, this BUG_ON means that even though we have an outdated LAM in
our CR3, we will not update CR3 because the TLB is up-to-date.

I can work on a v1 now with the IPI approach that Andy suggested. A
small kink is that we may still hit the BUG_ON with that fix, but in
that case it should be fine to not write CR3 because once we re-enable
interrupts we will receive the IPI and fix it. IOW, the diff below will
still BUG with the proposed fix, but it should be okay.

One thing I am not clear about with the IPI approach, if we use
mm_cpumask() to limit the IPI scope, we need to make sure that we read
mm_lam_cr3_mask() *after* we update the cpumask in switch_mm_irqs_off(),
which makes me think we'll need a barrier (and Andy said we want to
avoid those in this path). But looking at the code I see:

		/*
		 * Start remote flushes and then read tlb_gen.
		 */
		if (next != &init_mm)
			cpumask_set_cpu(cpu, mm_cpumask(next));
		next_tlb_gen = atomic64_read(&next->context.tlb_gen);

This code doesn't have a barrier. How do we make sure the read actually
happens after the write?

If no barrier is needed there, then I think we can similarly just read
the LAM mask after cpumask_set_cpu().

> 
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index 33b268747bb7b..c37a8c26a3c21 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -750,8 +750,25 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
>  
>  #define LAM_U57_BITS 6
>  
> +static int kthread_fn(void *_mm)
> +{
> +	struct mm_struct *mm = _mm;
> +
> +	/*
> +	 * Wait for LAM to be enabled then schedule. Hopefully we will context
> +	 * switch directly into the task that enabled LAM due to CPU pinning.
> +	 */
> +	kthread_use_mm(mm);
> +	while (!test_bit(MM_CONTEXT_LOCK_LAM, &mm->context.flags));
> +	schedule();
> +	return 0;
> +}
> +
>  static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
>  {
> +	struct task_struct *kthread_task;
> +	int kthread_cpu;
> +
>  	if (!cpu_feature_enabled(X86_FEATURE_LAM))
>  		return -ENODEV;
>  
> @@ -782,10 +799,22 @@ static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
>  		return -EINVAL;
>  	}
>  
> +	/* Pin the task to the current CPU */
> +	set_cpus_allowed_ptr(current, cpumask_of(smp_processor_id()));
> +
> +	/* Run a kthread on another CPU and wait for it to start */
> +	kthread_cpu = cpumask_next_wrap(smp_processor_id(), cpu_online_mask, 0, false),
> +	kthread_task = kthread_run_on_cpu(kthread_fn, mm, kthread_cpu, "lam_repro_kthread");
> +	while (!task_is_running(kthread_task));
> +
>  	write_cr3(__read_cr3() | mm->context.lam_cr3_mask);
>  	set_tlbstate_lam_mode(mm);
>  	set_bit(MM_CONTEXT_LOCK_LAM, &mm->context.flags);
>  
> +	/* Move the task to the kthread CPU */
> +	set_cpus_allowed_ptr(current, cpumask_of(kthread_cpu));
> +
>  	mmap_write_unlock(mm);
>  
>  	return 0;
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 51f9f56941058..3afb53f1a1901 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -593,7 +593,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
>  		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
>  		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
>  				next_tlb_gen)
> -			return;
> +			BUG_ON(new_lam != tlbstate_lam_cr3_mask());
>  
>  		/*
>  		 * TLB contents went out of date while we were in lazy
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-08  1:47     ` Yosry Ahmed
@ 2024-03-08 14:05       ` Kirill A. Shutemov
  0 siblings, 0 replies; 27+ messages in thread
From: Kirill A. Shutemov @ 2024-03-08 14:05 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Andy Lutomirski, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Peter Zijlstra (Intel),
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Fri, Mar 08, 2024 at 01:47:39AM +0000, Yosry Ahmed wrote:
> I like it very much. The problem now is, as I told Dave, I realized I
> cannot do any testing beyond compilation due to lack of hardware. I am
> happy to send a next version if this is acceptable or if someone else
> can test.

I have non-upstreamable QEMU patch that adds LAM emulation, if it helps:

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 93b1ca810bf4..fe887a86a156 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1295,6 +1295,19 @@ void tlb_set_page(CPUState *cpu, vaddr addr,
                             prot, mmu_idx, size);
 }
 
+
+static vaddr clean_addr(CPUState *cpu, vaddr addr)
+{
+    CPUClass *cc = CPU_GET_CLASS(cpu);
+
+    if (cc->tcg_ops->do_clean_addr) {
+        addr = cc->tcg_ops->do_clean_addr(cpu, addr);
+    }
+
+    return addr;
+}
+
+
 /*
  * Note: tlb_fill() can trigger a resize of the TLB. This means that all of the
  * caller's prior references to the TLB table (e.g. CPUTLBEntry pointers) must
@@ -1867,9 +1880,10 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
  * Probe for an atomic operation.  Do not allow unaligned operations,
  * or io operations to proceed.  Return the host address.
  */
-static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
+static void *atomic_mmu_lookup(CPUState *cpu, vaddr address, MemOpIdx oi,
                                int size, uintptr_t retaddr)
 {
+    vaddr addr = clean_addr(cpu, address);
     uintptr_t mmu_idx = get_mmuidx(oi);
     MemOp mop = get_memop(oi);
     int a_bits = get_alignment_bits(mop);
@@ -2002,10 +2016,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
  * The bytes are concatenated in big-endian order with @ret_be.
  */
 static uint64_t int_ld_mmio_beN(CPUState *cpu, CPUTLBEntryFull *full,
-                                uint64_t ret_be, vaddr addr, int size,
+                                uint64_t ret_be, vaddr address, int size,
                                 int mmu_idx, MMUAccessType type, uintptr_t ra,
                                 MemoryRegion *mr, hwaddr mr_offset)
 {
+    vaddr addr = clean_addr(cpu, address);
     do {
         MemOp this_mop;
         unsigned this_size;
@@ -2543,10 +2558,11 @@ static Int128 do_ld16_mmu(CPUState *cpu, vaddr addr,
  * return the bytes of @val_le beyond @p->size that have not been stored.
  */
 static uint64_t int_st_mmio_leN(CPUState *cpu, CPUTLBEntryFull *full,
-                                uint64_t val_le, vaddr addr, int size,
+                                uint64_t val_le, vaddr address, int size,
                                 int mmu_idx, uintptr_t ra,
                                 MemoryRegion *mr, hwaddr mr_offset)
 {
+    vaddr addr = clean_addr(cpu, address);
     do {
         MemOp this_mop;
         unsigned this_size;
diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index bf8ff8e3eec1..eaa8e09a6226 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -140,6 +140,12 @@ struct TCGCPUOps {
                                            MMUAccessType access_type,
                                            int mmu_idx, uintptr_t retaddr);
 
+
+    /**
+     * @do_clean_addr: Callback for clearing metadata/tags from the address.
+     */
+    vaddr (*do_clean_addr)(CPUState *cpu, vaddr addr);
+
     /**
      * @adjust_watchpoint_address: hack for cpu_check_watchpoint used by ARM
      */
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2666ef380891..1bbfd31042b2 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -739,7 +739,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
 #define TCG_7_0_EDX_FEATURES (CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_KERNEL_FEATURES)
 
 #define TCG_7_1_EAX_FEATURES (CPUID_7_1_EAX_FZRM | CPUID_7_1_EAX_FSRS | \
-          CPUID_7_1_EAX_FSRC | CPUID_7_1_EAX_CMPCCXADD)
+          CPUID_7_1_EAX_FSRC | CPUID_7_1_EAX_CMPCCXADD | CPUID_7_1_EAX_LAM)
 #define TCG_7_1_EDX_FEATURES 0
 #define TCG_7_2_EDX_FEATURES 0
 #define TCG_APM_FEATURES 0
@@ -968,7 +968,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             "fsrc", NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, "amx-fp16", NULL, "avx-ifma",
-            NULL, NULL, NULL, NULL,
+            NULL, NULL, "lam", NULL,
             NULL, NULL, NULL, NULL,
         },
         .cpuid = {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 952174bb6f52..6ef9afd443b7 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -238,6 +238,9 @@ typedef enum X86Seg {
 #define CR0_CD_MASK  (1U << 30)
 #define CR0_PG_MASK  (1U << 31)
 
+#define CR3_LAM_U57  (1ULL << 61)
+#define CR3_LAM_U48  (1ULL << 62)
+
 #define CR4_VME_MASK  (1U << 0)
 #define CR4_PVI_MASK  (1U << 1)
 #define CR4_TSD_MASK  (1U << 2)
@@ -261,6 +264,7 @@ typedef enum X86Seg {
 #define CR4_SMAP_MASK   (1U << 21)
 #define CR4_PKE_MASK   (1U << 22)
 #define CR4_PKS_MASK   (1U << 24)
+#define CR4_LAM_SUP    (1U << 28)
 
 #define CR4_RESERVED_MASK \
 (~(target_ulong)(CR4_VME_MASK | CR4_PVI_MASK | CR4_TSD_MASK \
@@ -269,7 +273,8 @@ typedef enum X86Seg {
                 | CR4_OSFXSR_MASK | CR4_OSXMMEXCPT_MASK | CR4_UMIP_MASK \
                 | CR4_LA57_MASK \
                 | CR4_FSGSBASE_MASK | CR4_PCIDE_MASK | CR4_OSXSAVE_MASK \
-                | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK))
+                | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK \
+                | CR4_LAM_SUP ))
 
 #define DR6_BD          (1 << 13)
 #define DR6_BS          (1 << 14)
@@ -932,6 +937,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_1_EAX_AMX_FP16          (1U << 21)
 /* Support for VPMADD52[H,L]UQ */
 #define CPUID_7_1_EAX_AVX_IFMA          (1U << 23)
+/* Linear Address Masking */
+#define CPUID_7_1_EAX_LAM               (1U << 26)
 
 /* Support for VPDPB[SU,UU,SS]D[,S] */
 #define CPUID_7_1_EDX_AVX_VNNI_INT8     (1U << 4)
@@ -2525,6 +2532,24 @@ static inline bool hyperv_feat_enabled(X86CPU *cpu, int feat)
     return !!(cpu->hyperv_features & BIT(feat));
 }
 
+static inline uint64_t cr3_reserved_bits(CPUX86State *env)
+{
+    uint64_t reserved_bits;
+
+    if (!(env->efer & MSR_EFER_LMA)) {
+        return 0;
+    }
+
+    reserved_bits = (~0ULL) << env_archcpu(env)->phys_bits;
+
+    if (env->features[FEAT_7_1_EAX] & CPUID_7_1_EAX_LAM) {
+        reserved_bits &= ~(CR3_LAM_U48 | CR3_LAM_U57);
+    }
+
+    return reserved_bits;
+}
+
+
 static inline uint64_t cr4_reserved_bits(CPUX86State *env)
 {
     uint64_t reserved_bits = CR4_RESERVED_MASK;
diff --git a/target/i386/helper.c b/target/i386/helper.c
index 2070dd0dda1f..4901c9c17b1e 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -262,7 +262,7 @@ hwaddr x86_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
             }
 
             if (la57) {
-                pml5e_addr = ((env->cr[3] & ~0xfff) +
+                pml5e_addr = ((env->cr[3] & PG_ADDRESS_MASK) +
                         (((addr >> 48) & 0x1ff) << 3)) & a20_mask;
                 pml5e = x86_ldq_phys(cs, pml5e_addr);
                 if (!(pml5e & PG_PRESENT_MASK)) {
diff --git a/target/i386/tcg/helper-tcg.h b/target/i386/tcg/helper-tcg.h
index effc2c1c9842..11f75ea475e3 100644
--- a/target/i386/tcg/helper-tcg.h
+++ b/target/i386/tcg/helper-tcg.h
@@ -84,6 +84,7 @@ bool x86_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 G_NORETURN void x86_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
                                             MMUAccessType access_type,
                                             int mmu_idx, uintptr_t retaddr);
+vaddr x86_cpu_clean_addr(CPUState *cpu, vaddr addr);
 #endif
 
 /* cc_helper.c */
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 8f7011d96631..1bc71170e6a3 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -163,7 +163,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
                 /*
                  * Page table level 5
                  */
-                pte_addr = (in->cr3 & ~0xfff) + (((addr >> 48) & 0x1ff) << 3);
+                pte_addr = (in->cr3 & PG_ADDRESS_MASK) + (((addr >> 48) & 0x1ff) << 3);
                 if (!ptw_translate(&pte_trans, pte_addr)) {
                     return false;
                 }
@@ -638,3 +638,30 @@ G_NORETURN void x86_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
     X86CPU *cpu = X86_CPU(cs);
     handle_unaligned_access(&cpu->env, vaddr, access_type, retaddr);
 }
+
+
+static inline int64_t sign_extend64(uint64_t value, int index)
+{
+    int shift = 63 - index;
+    return (int64_t)(value << shift) >> shift;
+}
+
+vaddr x86_cpu_clean_addr(CPUState *cs, vaddr addr)
+{
+    CPUX86State *env = &X86_CPU(cs)->env;
+    bool la57 = env->cr[4] & CR4_LA57_MASK;
+
+    if (addr >> 63) {
+        if (env->cr[4] & CR4_LAM_SUP) {
+            return sign_extend64(addr, la57 ? 56 : 47);
+        }
+    } else {
+        if (env->cr[3] & CR3_LAM_U57) {
+            return sign_extend64(addr, 56);
+        } else if (env->cr[3] & CR3_LAM_U48) {
+            return sign_extend64(addr, 47);
+        }
+    }
+
+    return addr;
+}
diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c
index edb7c3d89408..aecb523e777d 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -98,8 +98,7 @@ void helper_write_crN(CPUX86State *env, int reg, target_ulong t0)
         cpu_x86_update_cr0(env, t0);
         break;
     case 3:
-        if ((env->efer & MSR_EFER_LMA) &&
-                (t0 & ((~0ULL) << env_archcpu(env)->phys_bits))) {
+        if (t0 & cr3_reserved_bits(env)) {
             cpu_vmexit(env, SVM_EXIT_ERR, 0, GETPC());
         }
         if (!(env->efer & MSR_EFER_LMA)) {
diff --git a/target/i386/tcg/sysemu/svm_helper.c b/target/i386/tcg/sysemu/svm_helper.c
index 5d6de2294fa1..e981b124d975 100644
--- a/target/i386/tcg/sysemu/svm_helper.c
+++ b/target/i386/tcg/sysemu/svm_helper.c
@@ -305,8 +305,7 @@ void helper_vmrun(CPUX86State *env, int aflag, int next_eip_addend)
         cpu_vmexit(env, SVM_EXIT_ERR, 0, GETPC());
     }
     new_cr3 = x86_ldq_phys(cs, env->vm_vmcb + offsetof(struct vmcb, save.cr3));
-    if ((env->efer & MSR_EFER_LMA) &&
-            (new_cr3 & ((~0ULL) << cpu->phys_bits))) {
+    if (new_cr3 & cr3_reserved_bits(env)) {
         cpu_vmexit(env, SVM_EXIT_ERR, 0, GETPC());
     }
     new_cr4 = x86_ldq_phys(cs, env->vm_vmcb + offsetof(struct vmcb, save.cr4));
diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
index cca19cd40e81..8ceeb954364e 100644
--- a/target/i386/tcg/tcg-cpu.c
+++ b/target/i386/tcg/tcg-cpu.c
@@ -118,6 +118,7 @@ static const TCGCPUOps x86_tcg_ops = {
     .record_sigbus = x86_cpu_record_sigbus,
 #else
     .tlb_fill = x86_cpu_tlb_fill,
+    .do_clean_addr = x86_cpu_clean_addr,
     .do_interrupt = x86_cpu_do_interrupt,
     .cpu_exec_halt = x86_cpu_exec_halt,
     .cpu_exec_interrupt = x86_cpu_exec_interrupt,
-- 
  Kiryl Shutsemau / Kirill A. Shutemov


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-08  1:34   ` Andy Lutomirski
  2024-03-08  1:47     ` Yosry Ahmed
@ 2024-03-08 15:23     ` Dave Hansen
  2024-03-08 18:18       ` Kirill A. Shutemov
  2024-03-09  2:19       ` Yosry Ahmed
  1 sibling, 2 replies; 27+ messages in thread
From: Dave Hansen @ 2024-03-08 15:23 UTC (permalink / raw)
  To: Andy Lutomirski, Yosry Ahmed, Andrew Morton
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra (Intel), Kirill A. Shutemov,
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On 3/7/24 17:34, Andy Lutomirski wrote:
>> Fix this by making sure we write a new CR3 if LAM is not
>> up-to-date. No problems were observed in practice, this was found
>> by code inspection.
> I think it should be fixed with a much bigger hammer: explicit IPIs.
> Just don't ever let it get out of date, like install_ldt().
I guess it matters whether the thing that matters is having a persistent
inconsistency or a temporary one.  IPIs will definitely turn a permanent
one into a temporary one.

But this is all easier to reason about if we can get rid of even the
temporary inconsistency.

Wouldn't this be even simpler than IPIs?

static inline unsigned long set_tlbstate_lam_mode(struct mm_struct *mm)
{
	unsigned long lam = READ_ONCE(mm->context.lam_cr3_mask);

+	/* LAM is for userspace only.  Ignore it for kernel threads: */
+	if (tsk->flags & PF_KTHREAD)
+		return 0;

	this_cpu_write(cpu_tlbstate.lam, lam >> X86_CR3_LAM_U57_BIT);
 	this_cpu_write(tlbstate_untag_mask, mm->context.untag_mask);
	return lam;
}


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-08 15:23     ` Dave Hansen
@ 2024-03-08 18:18       ` Kirill A. Shutemov
  2024-03-09  2:19       ` Yosry Ahmed
  1 sibling, 0 replies; 27+ messages in thread
From: Kirill A. Shutemov @ 2024-03-08 18:18 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andy Lutomirski, Yosry Ahmed, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Peter Zijlstra (Intel),
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Fri, Mar 08, 2024 at 07:23:58AM -0800, Dave Hansen wrote:
> On 3/7/24 17:34, Andy Lutomirski wrote:
> >> Fix this by making sure we write a new CR3 if LAM is not
> >> up-to-date. No problems were observed in practice, this was found
> >> by code inspection.
> > I think it should be fixed with a much bigger hammer: explicit IPIs.
> > Just don't ever let it get out of date, like install_ldt().
> I guess it matters whether the thing that matters is having a persistent
> inconsistency or a temporary one.  IPIs will definitely turn a permanent
> one into a temporary one.
> 
> But this is all easier to reason about if we can get rid of even the
> temporary inconsistency.
> 
> Wouldn't this be even simpler than IPIs?
> 
> static inline unsigned long set_tlbstate_lam_mode(struct mm_struct *mm)
> {
> 	unsigned long lam = READ_ONCE(mm->context.lam_cr3_mask);
> 
> +	/* LAM is for userspace only.  Ignore it for kernel threads: */
> +	if (tsk->flags & PF_KTHREAD)
> +		return 0;

I like this approach. kthread_use_mm() WARNs if it called for
non-PF_KTHREAD task, so it should be okay.

I was worried that it would also exclude io_uring, but worker threads
don't have the flag set.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-08 15:23     ` Dave Hansen
  2024-03-08 18:18       ` Kirill A. Shutemov
@ 2024-03-09  2:19       ` Yosry Ahmed
  2024-03-09 16:34         ` Kirill A. Shutemov
  1 sibling, 1 reply; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-09  2:19 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andy Lutomirski, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Peter Zijlstra (Intel), Kirill A. Shutemov,
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Fri, Mar 08, 2024 at 07:23:58AM -0800, Dave Hansen wrote:
> On 3/7/24 17:34, Andy Lutomirski wrote:
> >> Fix this by making sure we write a new CR3 if LAM is not
> >> up-to-date. No problems were observed in practice, this was found
> >> by code inspection.
> > I think it should be fixed with a much bigger hammer: explicit IPIs.
> > Just don't ever let it get out of date, like install_ldt().
> I guess it matters whether the thing that matters is having a persistent
> inconsistency or a temporary one.  IPIs will definitely turn a permanent
> one into a temporary one.
> 
> But this is all easier to reason about if we can get rid of even the
> temporary inconsistency.
> 
> Wouldn't this be even simpler than IPIs?
> 
> static inline unsigned long set_tlbstate_lam_mode(struct mm_struct *mm)
> {
> 	unsigned long lam = READ_ONCE(mm->context.lam_cr3_mask);
> 
> +	/* LAM is for userspace only.  Ignore it for kernel threads: */
> +	if (tsk->flags & PF_KTHREAD)
> +		return 0;
> 
> 	this_cpu_write(cpu_tlbstate.lam, lam >> X86_CR3_LAM_U57_BIT);
>  	this_cpu_write(tlbstate_untag_mask, mm->context.untag_mask);
> 	return lam;
> }

Hmm I don't see how this fixes the problem addressed by this patch. I
think this fixes the problem addressed by patch 1, where CR3 and
cpu_tlbstate.lam may get out of sync if LAM enablement races with
switch_mm_irqs_off().

However, this patch is fixing a deeper problem (an actual bug).
Precisely this situation:

CPU 1					CPU 2
/* kthread */
kthread_use_mm()
					/* user thread */
					prctl_enable_tagged_addr()
					/* LAM enabled */
					context_switch() /* to CPU 1 */
switch_mm_irqs_off()
/* user thread */
---> LAM is disabled here <---

When switch_mm_irqs_off() runs on CPU 1 to switch from the kthread to
the user thread, because the mm is not actually changing, we may not
write CR3. In this case, the user thread runs on CPU 1 with LAM
disabled, right?

The IPI would fix this problem because prctl_enable_tagged_addr() will
make sure that CPU 1 enables LAM before it returns to userspace.
Alternatively, this patch fixes the problem by making sure we write CR3
in switch_mm_irqs_off() if LAM is out-of-date.

I don't see how skipping set_tlbstate_lam_mode() for kthreads fixes this
problem. Do you mind elaborating?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-09  2:19       ` Yosry Ahmed
@ 2024-03-09 16:34         ` Kirill A. Shutemov
  2024-03-09 21:37           ` Yosry Ahmed
  0 siblings, 1 reply; 27+ messages in thread
From: Kirill A. Shutemov @ 2024-03-09 16:34 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Dave Hansen, Andy Lutomirski, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Peter Zijlstra (Intel),
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Sat, Mar 09, 2024 at 02:19:19AM +0000, Yosry Ahmed wrote:
> I don't see how skipping set_tlbstate_lam_mode() for kthreads fixes this
> problem. Do you mind elaborating?

Define what problem is.

Yes, in this scenario kthread gets more permissive LAM mode than it needs.
But nothing breaks.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-09 16:34         ` Kirill A. Shutemov
@ 2024-03-09 21:37           ` Yosry Ahmed
  2024-03-11 12:42             ` Kirill A. Shutemov
  0 siblings, 1 reply; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-09 21:37 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dave Hansen, Andy Lutomirski, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Peter Zijlstra (Intel),
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Sat, Mar 9, 2024 at 8:34 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> On Sat, Mar 09, 2024 at 02:19:19AM +0000, Yosry Ahmed wrote:
> > I don't see how skipping set_tlbstate_lam_mode() for kthreads fixes this
> > problem. Do you mind elaborating?
>
> Define what problem is.
>
> Yes, in this scenario kthread gets more permissive LAM mode than it needs.
> But nothing breaks.

The problem here is not how the kthread runs at all. It is the fact
that if that kthread context switches into the user process that has
enabled LAM, it may not update CR3 because the mm doesn't change.
switch_mm_irqs_off() will only update CR3 in this case if there is a
pending TLB flush. Otherwise, we just return, even if the LAM for this
mm has changed.

This can cause the process that has enabled LAM to run with LAM
disabled and fault on tagged addresses, right? Did I miss something?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
@ 2024-03-10 10:04 kernel test robot
  0 siblings, 0 replies; 27+ messages in thread
From: kernel test robot @ 2024-03-10 10:04 UTC (permalink / raw)
  To: oe-kbuild; +Cc: lkp, Dan Carpenter

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20240307133916.3782068-3-yosryahmed@google.com>
References: <20240307133916.3782068-3-yosryahmed@google.com>
TO: Yosry Ahmed <yosryahmed@google.com>

Hi Yosry,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on tip/master linus/master tip/auto-latest v6.8-rc7 next-20240308]
[cannot apply to tip/x86/core tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Yosry-Ahmed/x86-mm-fix-LAM-cr3-mask-inconsistency-during-context-switch/20240307-214052
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20240307133916.3782068-3-yosryahmed%40google.com
patch subject: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
:::::: branch date: 3 days ago
:::::: commit date: 3 days ago
config: x86_64-randconfig-161-20240310 (https://download.01.org/0day-ci/archive/20240310/202403101713.eM4pyFpc-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>
| Closes: https://lore.kernel.org/r/202403101713.eM4pyFpc-lkp@intel.com/

smatch warnings:
arch/x86/mm/tlb.c:655 switch_mm_irqs_off() error: uninitialized symbol 'new_asid'.

vim +/new_asid +655 arch/x86/mm/tlb.c

cb2a02355b042e Thomas Gleixner    2020-04-21  494  
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  495  /*
cd87d9f58439a1 Yosry Ahmed        2024-02-22  496   * This optimizes when not actually switching mm's.  Some architectures use the
cd87d9f58439a1 Yosry Ahmed        2024-02-22  497   * 'unused' argument for this optimization, but x86 must use
cd87d9f58439a1 Yosry Ahmed        2024-02-22  498   * 'cpu_tlbstate.loaded_mm' instead because it does not always keep
cd87d9f58439a1 Yosry Ahmed        2024-02-22  499   * 'current->active_mm' up to date.
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  500   */
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  501  void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
078194f8e9fe3c Andy Lutomirski    2016-04-26  502  			struct task_struct *tsk)
69c0319aabba45 Andy Lutomirski    2016-04-26  503  {
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  504  	struct mm_struct *prev = this_cpu_read(cpu_tlbstate.loaded_mm);
10af6235e0d327 Andy Lutomirski    2017-07-24  505  	u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
c2bc96a91dc26e Yosry Ahmed        2024-03-07  506  	u64 cpu_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen);
2f4305b19fe6a2 Nadav Amit         2021-02-20  507  	bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy);
c2bc96a91dc26e Yosry Ahmed        2024-03-07  508  	bool need_flush = false, need_lam_update = false;
94b1b03b519b81 Andy Lutomirski    2017-06-29  509  	unsigned cpu = smp_processor_id();
897738e46e29e1 Yosry Ahmed        2024-03-07  510  	unsigned long new_lam;
94b1b03b519b81 Andy Lutomirski    2017-06-29  511  	u64 next_tlb_gen;
12c4d978fd170c Rik van Riel       2018-09-25  512  	u16 new_asid;
3d28ebceaffab4 Andy Lutomirski    2017-05-28  513  
4c1ba3923e6c8a Nadav Amit         2021-02-20  514  	/* We don't want flush_tlb_func() to run concurrently with us. */
94b1b03b519b81 Andy Lutomirski    2017-06-29  515  	if (IS_ENABLED(CONFIG_PROVE_LOCKING))
94b1b03b519b81 Andy Lutomirski    2017-06-29  516  		WARN_ON_ONCE(!irqs_disabled());
94b1b03b519b81 Andy Lutomirski    2017-06-29  517  
94b1b03b519b81 Andy Lutomirski    2017-06-29  518  	/*
94b1b03b519b81 Andy Lutomirski    2017-06-29  519  	 * Verify that CR3 is what we think it is.  This will catch
94b1b03b519b81 Andy Lutomirski    2017-06-29  520  	 * hypothetical buggy code that directly switches to swapper_pg_dir
10af6235e0d327 Andy Lutomirski    2017-07-24  521  	 * without going through leave_mm() / switch_mm_irqs_off() or that
10af6235e0d327 Andy Lutomirski    2017-07-24  522  	 * does something like write_cr3(read_cr3_pa()).
a376e7f99be7c1 Andy Lutomirski    2017-09-07  523  	 *
a376e7f99be7c1 Andy Lutomirski    2017-09-07  524  	 * Only do this check if CONFIG_DEBUG_VM=y because __read_cr3()
a376e7f99be7c1 Andy Lutomirski    2017-09-07  525  	 * isn't free.
94b1b03b519b81 Andy Lutomirski    2017-06-29  526  	 */
a376e7f99be7c1 Andy Lutomirski    2017-09-07  527  #ifdef CONFIG_DEBUG_VM
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  528  	if (WARN_ON_ONCE(__read_cr3() != build_cr3(prev->pgd, prev_asid,
82721d8b25d76c Kirill A. Shutemov 2023-03-12  529  						   tlbstate_lam_cr3_mask()))) {
a376e7f99be7c1 Andy Lutomirski    2017-09-07  530  		/*
a376e7f99be7c1 Andy Lutomirski    2017-09-07  531  		 * If we were to BUG here, we'd be very likely to kill
a376e7f99be7c1 Andy Lutomirski    2017-09-07  532  		 * the system so hard that we don't see the call trace.
a376e7f99be7c1 Andy Lutomirski    2017-09-07  533  		 * Try to recover instead by ignoring the error and doing
a376e7f99be7c1 Andy Lutomirski    2017-09-07  534  		 * a global flush to minimize the chance of corruption.
a376e7f99be7c1 Andy Lutomirski    2017-09-07  535  		 *
a376e7f99be7c1 Andy Lutomirski    2017-09-07  536  		 * (This is far from being a fully correct recovery.
a376e7f99be7c1 Andy Lutomirski    2017-09-07  537  		 *  Architecturally, the CPU could prefetch something
a376e7f99be7c1 Andy Lutomirski    2017-09-07  538  		 *  back into an incorrect ASID slot and leave it there
a376e7f99be7c1 Andy Lutomirski    2017-09-07  539  		 *  to cause trouble down the road.  It's better than
a376e7f99be7c1 Andy Lutomirski    2017-09-07  540  		 *  nothing, though.)
a376e7f99be7c1 Andy Lutomirski    2017-09-07  541  		 */
a376e7f99be7c1 Andy Lutomirski    2017-09-07  542  		__flush_tlb_all();
a376e7f99be7c1 Andy Lutomirski    2017-09-07  543  	}
a376e7f99be7c1 Andy Lutomirski    2017-09-07  544  #endif
09c5272e48614a Nadav Amit         2021-02-20  545  	if (was_lazy)
2f4305b19fe6a2 Nadav Amit         2021-02-20  546  		this_cpu_write(cpu_tlbstate_shared.is_lazy, false);
3d28ebceaffab4 Andy Lutomirski    2017-05-28  547  
306e060435d7a3 Mathieu Desnoyers  2018-01-29  548  	/*
10bcc80e9dbced Mathieu Desnoyers  2018-01-29  549  	 * The membarrier system call requires a full memory barrier and
10bcc80e9dbced Mathieu Desnoyers  2018-01-29  550  	 * core serialization before returning to user-space, after
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  551  	 * storing to rq->curr, when changing mm.  This is because
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  552  	 * membarrier() sends IPIs to all CPUs that are in the target mm
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  553  	 * to make them issue memory barriers.  However, if another CPU
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  554  	 * switches to/from the target mm concurrently with
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  555  	 * membarrier(), it can cause that CPU not to receive an IPI
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  556  	 * when it really should issue a memory barrier.  Writing to CR3
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  557  	 * provides that full memory barrier and core serializing
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  558  	 * instruction.
306e060435d7a3 Mathieu Desnoyers  2018-01-29  559  	 */
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  560  	if (prev == next) {
82721d8b25d76c Kirill A. Shutemov 2023-03-12  561  		/* Not actually switching mm's */
e8b9b0cc8269c8 Andy Lutomirski    2017-10-14  562  		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
94b1b03b519b81 Andy Lutomirski    2017-06-29  563  			   next->context.ctx_id);
94b1b03b519b81 Andy Lutomirski    2017-06-29  564  
3d28ebceaffab4 Andy Lutomirski    2017-05-28  565  		/*
145f573b89a62b Rik van Riel       2018-09-25  566  		 * Even in lazy TLB mode, the CPU should stay set in the
145f573b89a62b Rik van Riel       2018-09-25  567  		 * mm_cpumask. The TLB shootdown code can figure out from
2f4305b19fe6a2 Nadav Amit         2021-02-20  568  		 * cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
3d28ebceaffab4 Andy Lutomirski    2017-05-28  569  		 */
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  570  		if (WARN_ON_ONCE(prev != &init_mm &&
b956575bed91ec Andy Lutomirski    2017-10-09  571  				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
94b1b03b519b81 Andy Lutomirski    2017-06-29  572  			cpumask_set_cpu(cpu, mm_cpumask(next));
94b1b03b519b81 Andy Lutomirski    2017-06-29  573  
c2bc96a91dc26e Yosry Ahmed        2024-03-07  574  		/*
c2bc96a91dc26e Yosry Ahmed        2024-03-07  575  		 * tlbstate_lam_cr3_mask() may be outdated if a different thread
c2bc96a91dc26e Yosry Ahmed        2024-03-07  576  		 * has enabled LAM while we were borrowing its mm on this CPU.
c2bc96a91dc26e Yosry Ahmed        2024-03-07  577  		 * Make sure we update CR3 in case we are switching to another
c2bc96a91dc26e Yosry Ahmed        2024-03-07  578  		 * thread in that process.
c2bc96a91dc26e Yosry Ahmed        2024-03-07  579  		 */
c2bc96a91dc26e Yosry Ahmed        2024-03-07  580  		if (tlbstate_lam_cr3_mask() != mm_lam_cr3_mask(next))
c2bc96a91dc26e Yosry Ahmed        2024-03-07  581  			need_lam_update = true;
c2bc96a91dc26e Yosry Ahmed        2024-03-07  582  
145f573b89a62b Rik van Riel       2018-09-25  583  		/*
145f573b89a62b Rik van Riel       2018-09-25  584  		 * If the CPU is not in lazy TLB mode, we are just switching
145f573b89a62b Rik van Riel       2018-09-25  585  		 * from one thread in a process to another thread in the same
145f573b89a62b Rik van Riel       2018-09-25  586  		 * process. No TLB flush required.
145f573b89a62b Rik van Riel       2018-09-25  587  		 */
c2bc96a91dc26e Yosry Ahmed        2024-03-07  588  		if (was_lazy) {
145f573b89a62b Rik van Riel       2018-09-25  589  			/*
145f573b89a62b Rik van Riel       2018-09-25  590  			 * Read the tlb_gen to check whether a flush is needed.
c2bc96a91dc26e Yosry Ahmed        2024-03-07  591  			 * If the TLB is up to date, just use it.  The barrier
c2bc96a91dc26e Yosry Ahmed        2024-03-07  592  			 * synchronizes with the tlb_gen increment in the TLB
c2bc96a91dc26e Yosry Ahmed        2024-03-07  593  			 * shootdown code.
145f573b89a62b Rik van Riel       2018-09-25  594  			 */
145f573b89a62b Rik van Riel       2018-09-25  595  			smp_mb();
145f573b89a62b Rik van Riel       2018-09-25  596  			next_tlb_gen = atomic64_read(&next->context.tlb_gen);
c2bc96a91dc26e Yosry Ahmed        2024-03-07  597  			if (cpu_tlb_gen < next_tlb_gen) {
145f573b89a62b Rik van Riel       2018-09-25  598  				/*
c2bc96a91dc26e Yosry Ahmed        2024-03-07  599  				 * TLB contents went out of date while we were
c2bc96a91dc26e Yosry Ahmed        2024-03-07  600  				 * in lazy mode.
145f573b89a62b Rik van Riel       2018-09-25  601  				 */
145f573b89a62b Rik van Riel       2018-09-25  602  				new_asid = prev_asid;
145f573b89a62b Rik van Riel       2018-09-25  603  				need_flush = true;
c2bc96a91dc26e Yosry Ahmed        2024-03-07  604  			}
c2bc96a91dc26e Yosry Ahmed        2024-03-07  605  		}
c2bc96a91dc26e Yosry Ahmed        2024-03-07  606  
c2bc96a91dc26e Yosry Ahmed        2024-03-07  607  		if (!need_flush && !need_lam_update)
c2bc96a91dc26e Yosry Ahmed        2024-03-07  608  			return;
94b1b03b519b81 Andy Lutomirski    2017-06-29  609  	} else {
18bf3c3ea8ece8 Tim Chen           2018-01-29  610  		/*
371b09c6fdc436 Balbir Singh       2021-01-08  611  		 * Apply process to process speculation vulnerability
371b09c6fdc436 Balbir Singh       2021-01-08  612  		 * mitigations if applicable.
18bf3c3ea8ece8 Tim Chen           2018-01-29  613  		 */
371b09c6fdc436 Balbir Singh       2021-01-08  614  		cond_mitigation(tsk);
94b1b03b519b81 Andy Lutomirski    2017-06-29  615  
e9d8c61557687b Rik van Riel       2018-07-16  616  		/*
e9d8c61557687b Rik van Riel       2018-07-16  617  		 * Stop remote flushes for the previous mm.
e9d8c61557687b Rik van Riel       2018-07-16  618  		 * Skip kernel threads; we never send init_mm TLB flushing IPIs,
e9d8c61557687b Rik van Riel       2018-07-16  619  		 * but the bitmap manipulation can cause cache line contention.
e9d8c61557687b Rik van Riel       2018-07-16  620  		 */
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  621  		if (prev != &init_mm) {
e9d8c61557687b Rik van Riel       2018-07-16  622  			VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu,
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  623  						mm_cpumask(prev)));
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  624  			cpumask_clear_cpu(cpu, mm_cpumask(prev));
e9d8c61557687b Rik van Riel       2018-07-16  625  		}
e37e43a497d5a8 Andy Lutomirski    2016-08-11  626  
69c0319aabba45 Andy Lutomirski    2016-04-26  627  		/*
94b1b03b519b81 Andy Lutomirski    2017-06-29  628  		 * Start remote flushes and then read tlb_gen.
69c0319aabba45 Andy Lutomirski    2016-04-26  629  		 */
e9d8c61557687b Rik van Riel       2018-07-16  630  		if (next != &init_mm)
94b1b03b519b81 Andy Lutomirski    2017-06-29  631  			cpumask_set_cpu(cpu, mm_cpumask(next));
94b1b03b519b81 Andy Lutomirski    2017-06-29  632  		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
94b1b03b519b81 Andy Lutomirski    2017-06-29  633  
10af6235e0d327 Andy Lutomirski    2017-07-24  634  		choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
69c0319aabba45 Andy Lutomirski    2016-04-26  635  
4012e77a903d11 Andy Lutomirski    2018-08-29  636  		/* Let nmi_uaccess_okay() know that we're changing CR3. */
4012e77a903d11 Andy Lutomirski    2018-08-29  637  		this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
4012e77a903d11 Andy Lutomirski    2018-08-29  638  		barrier();
12c4d978fd170c Rik van Riel       2018-09-25  639  	}
4012e77a903d11 Andy Lutomirski    2018-08-29  640  
897738e46e29e1 Yosry Ahmed        2024-03-07  641  	/*
897738e46e29e1 Yosry Ahmed        2024-03-07  642  	 * Even if we are not actually switching mm's, another thread could have
897738e46e29e1 Yosry Ahmed        2024-03-07  643  	 * updated mm->context.lam_cr3_mask. Make sure tlbstate_lam_cr3_mask()
897738e46e29e1 Yosry Ahmed        2024-03-07  644  	 * and the loaded CR3 use the up-to-date mask.
897738e46e29e1 Yosry Ahmed        2024-03-07  645  	 */
897738e46e29e1 Yosry Ahmed        2024-03-07  646  	new_lam = set_tlbstate_lam_mode(next);
10af6235e0d327 Andy Lutomirski    2017-07-24  647  	if (need_flush) {
10af6235e0d327 Andy Lutomirski    2017-07-24  648  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
10af6235e0d327 Andy Lutomirski    2017-07-24  649  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
82721d8b25d76c Kirill A. Shutemov 2023-03-12  650  		load_new_mm_cr3(next->pgd, new_asid, new_lam, true);
675357362aeba1 Andy Lutomirski    2017-11-04  651  
bf9282dc26e7fe Peter Zijlstra     2020-08-12  652  		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
10af6235e0d327 Andy Lutomirski    2017-07-24  653  	} else {
10af6235e0d327 Andy Lutomirski    2017-07-24  654  		/* The new ASID is already up to date. */
82721d8b25d76c Kirill A. Shutemov 2023-03-12 @655  		load_new_mm_cr3(next->pgd, new_asid, new_lam, false);
675357362aeba1 Andy Lutomirski    2017-11-04  656  
bf9282dc26e7fe Peter Zijlstra     2020-08-12  657  		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, 0);
10af6235e0d327 Andy Lutomirski    2017-07-24  658  	}
10af6235e0d327 Andy Lutomirski    2017-07-24  659  
4012e77a903d11 Andy Lutomirski    2018-08-29  660  	/* Make sure we write CR3 before loaded_mm. */
4012e77a903d11 Andy Lutomirski    2018-08-29  661  	barrier();
4012e77a903d11 Andy Lutomirski    2018-08-29  662  
10af6235e0d327 Andy Lutomirski    2017-07-24  663  	this_cpu_write(cpu_tlbstate.loaded_mm, next);
10af6235e0d327 Andy Lutomirski    2017-07-24  664  	this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
69c0319aabba45 Andy Lutomirski    2016-04-26  665  
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  666  	if (next != prev) {
cb2a02355b042e Thomas Gleixner    2020-04-21  667  		cr4_update_pce_mm(next);
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  668  		switch_ldt(prev, next);
69c0319aabba45 Andy Lutomirski    2016-04-26  669  	}
145f573b89a62b Rik van Riel       2018-09-25  670  }
69c0319aabba45 Andy Lutomirski    2016-04-26  671  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-07 13:39 ` [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching Yosry Ahmed
                     ` (2 preceding siblings ...)
  2024-03-08  1:34   ` Andy Lutomirski
@ 2024-03-11  6:09   ` Dan Carpenter
  2024-03-11 21:28     ` Yosry Ahmed
  3 siblings, 1 reply; 27+ messages in thread
From: Dan Carpenter @ 2024-03-11  6:09 UTC (permalink / raw)
  To: oe-kbuild, Yosry Ahmed; +Cc: lkp, oe-kbuild-all

Hi Yosry,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Yosry-Ahmed/x86-mm-fix-LAM-cr3-mask-inconsistency-during-context-switch/20240307-214052
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20240307133916.3782068-3-yosryahmed%40google.com
patch subject: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
config: x86_64-randconfig-161-20240310 (https://download.01.org/0day-ci/archive/20240310/202403101713.eM4pyFpc-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
| Closes: https://lore.kernel.org/r/202403101713.eM4pyFpc-lkp@intel.com/

smatch warnings:
arch/x86/mm/tlb.c:655 switch_mm_irqs_off() error: uninitialized symbol 'new_asid'.

vim +/new_asid +655 arch/x86/mm/tlb.c

3cfd6625a6cf83 Yosry Ahmed        2024-01-26  501  void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
078194f8e9fe3c Andy Lutomirski    2016-04-26  502  			struct task_struct *tsk)
69c0319aabba45 Andy Lutomirski    2016-04-26  503  {
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  504  	struct mm_struct *prev = this_cpu_read(cpu_tlbstate.loaded_mm);
10af6235e0d327 Andy Lutomirski    2017-07-24  505  	u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
c2bc96a91dc26e Yosry Ahmed        2024-03-07  506  	u64 cpu_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen);
2f4305b19fe6a2 Nadav Amit         2021-02-20  507  	bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy);
c2bc96a91dc26e Yosry Ahmed        2024-03-07  508  	bool need_flush = false, need_lam_update = false;
94b1b03b519b81 Andy Lutomirski    2017-06-29  509  	unsigned cpu = smp_processor_id();
897738e46e29e1 Yosry Ahmed        2024-03-07  510  	unsigned long new_lam;
94b1b03b519b81 Andy Lutomirski    2017-06-29  511  	u64 next_tlb_gen;
12c4d978fd170c Rik van Riel       2018-09-25  512  	u16 new_asid;
3d28ebceaffab4 Andy Lutomirski    2017-05-28  513  
4c1ba3923e6c8a Nadav Amit         2021-02-20  514  	/* We don't want flush_tlb_func() to run concurrently with us. */
94b1b03b519b81 Andy Lutomirski    2017-06-29  515  	if (IS_ENABLED(CONFIG_PROVE_LOCKING))
94b1b03b519b81 Andy Lutomirski    2017-06-29  516  		WARN_ON_ONCE(!irqs_disabled());
94b1b03b519b81 Andy Lutomirski    2017-06-29  517  
94b1b03b519b81 Andy Lutomirski    2017-06-29  518  	/*
94b1b03b519b81 Andy Lutomirski    2017-06-29  519  	 * Verify that CR3 is what we think it is.  This will catch
94b1b03b519b81 Andy Lutomirski    2017-06-29  520  	 * hypothetical buggy code that directly switches to swapper_pg_dir
10af6235e0d327 Andy Lutomirski    2017-07-24  521  	 * without going through leave_mm() / switch_mm_irqs_off() or that
10af6235e0d327 Andy Lutomirski    2017-07-24  522  	 * does something like write_cr3(read_cr3_pa()).
a376e7f99be7c1 Andy Lutomirski    2017-09-07  523  	 *
a376e7f99be7c1 Andy Lutomirski    2017-09-07  524  	 * Only do this check if CONFIG_DEBUG_VM=y because __read_cr3()
a376e7f99be7c1 Andy Lutomirski    2017-09-07  525  	 * isn't free.
94b1b03b519b81 Andy Lutomirski    2017-06-29  526  	 */
a376e7f99be7c1 Andy Lutomirski    2017-09-07  527  #ifdef CONFIG_DEBUG_VM
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  528  	if (WARN_ON_ONCE(__read_cr3() != build_cr3(prev->pgd, prev_asid,
82721d8b25d76c Kirill A. Shutemov 2023-03-12  529  						   tlbstate_lam_cr3_mask()))) {
a376e7f99be7c1 Andy Lutomirski    2017-09-07  530  		/*
a376e7f99be7c1 Andy Lutomirski    2017-09-07  531  		 * If we were to BUG here, we'd be very likely to kill
a376e7f99be7c1 Andy Lutomirski    2017-09-07  532  		 * the system so hard that we don't see the call trace.
a376e7f99be7c1 Andy Lutomirski    2017-09-07  533  		 * Try to recover instead by ignoring the error and doing
a376e7f99be7c1 Andy Lutomirski    2017-09-07  534  		 * a global flush to minimize the chance of corruption.
a376e7f99be7c1 Andy Lutomirski    2017-09-07  535  		 *
a376e7f99be7c1 Andy Lutomirski    2017-09-07  536  		 * (This is far from being a fully correct recovery.
a376e7f99be7c1 Andy Lutomirski    2017-09-07  537  		 *  Architecturally, the CPU could prefetch something
a376e7f99be7c1 Andy Lutomirski    2017-09-07  538  		 *  back into an incorrect ASID slot and leave it there
a376e7f99be7c1 Andy Lutomirski    2017-09-07  539  		 *  to cause trouble down the road.  It's better than
a376e7f99be7c1 Andy Lutomirski    2017-09-07  540  		 *  nothing, though.)
a376e7f99be7c1 Andy Lutomirski    2017-09-07  541  		 */
a376e7f99be7c1 Andy Lutomirski    2017-09-07  542  		__flush_tlb_all();
a376e7f99be7c1 Andy Lutomirski    2017-09-07  543  	}
a376e7f99be7c1 Andy Lutomirski    2017-09-07  544  #endif
09c5272e48614a Nadav Amit         2021-02-20  545  	if (was_lazy)
2f4305b19fe6a2 Nadav Amit         2021-02-20  546  		this_cpu_write(cpu_tlbstate_shared.is_lazy, false);
3d28ebceaffab4 Andy Lutomirski    2017-05-28  547  
306e060435d7a3 Mathieu Desnoyers  2018-01-29  548  	/*
10bcc80e9dbced Mathieu Desnoyers  2018-01-29  549  	 * The membarrier system call requires a full memory barrier and
10bcc80e9dbced Mathieu Desnoyers  2018-01-29  550  	 * core serialization before returning to user-space, after
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  551  	 * storing to rq->curr, when changing mm.  This is because
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  552  	 * membarrier() sends IPIs to all CPUs that are in the target mm
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  553  	 * to make them issue memory barriers.  However, if another CPU
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  554  	 * switches to/from the target mm concurrently with
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  555  	 * membarrier(), it can cause that CPU not to receive an IPI
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  556  	 * when it really should issue a memory barrier.  Writing to CR3
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  557  	 * provides that full memory barrier and core serializing
a493d1ca1a03b5 Andy Lutomirski    2020-12-03  558  	 * instruction.
306e060435d7a3 Mathieu Desnoyers  2018-01-29  559  	 */
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  560  	if (prev == next) {
82721d8b25d76c Kirill A. Shutemov 2023-03-12  561  		/* Not actually switching mm's */
e8b9b0cc8269c8 Andy Lutomirski    2017-10-14  562  		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
94b1b03b519b81 Andy Lutomirski    2017-06-29  563  			   next->context.ctx_id);
94b1b03b519b81 Andy Lutomirski    2017-06-29  564  
3d28ebceaffab4 Andy Lutomirski    2017-05-28  565  		/*
145f573b89a62b Rik van Riel       2018-09-25  566  		 * Even in lazy TLB mode, the CPU should stay set in the
145f573b89a62b Rik van Riel       2018-09-25  567  		 * mm_cpumask. The TLB shootdown code can figure out from
2f4305b19fe6a2 Nadav Amit         2021-02-20  568  		 * cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
3d28ebceaffab4 Andy Lutomirski    2017-05-28  569  		 */
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  570  		if (WARN_ON_ONCE(prev != &init_mm &&
b956575bed91ec Andy Lutomirski    2017-10-09  571  				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
94b1b03b519b81 Andy Lutomirski    2017-06-29  572  			cpumask_set_cpu(cpu, mm_cpumask(next));
94b1b03b519b81 Andy Lutomirski    2017-06-29  573  
c2bc96a91dc26e Yosry Ahmed        2024-03-07  574  		/*
c2bc96a91dc26e Yosry Ahmed        2024-03-07  575  		 * tlbstate_lam_cr3_mask() may be outdated if a different thread
c2bc96a91dc26e Yosry Ahmed        2024-03-07  576  		 * has enabled LAM while we were borrowing its mm on this CPU.
c2bc96a91dc26e Yosry Ahmed        2024-03-07  577  		 * Make sure we update CR3 in case we are switching to another
c2bc96a91dc26e Yosry Ahmed        2024-03-07  578  		 * thread in that process.
c2bc96a91dc26e Yosry Ahmed        2024-03-07  579  		 */
c2bc96a91dc26e Yosry Ahmed        2024-03-07  580  		if (tlbstate_lam_cr3_mask() != mm_lam_cr3_mask(next))
c2bc96a91dc26e Yosry Ahmed        2024-03-07  581  			need_lam_update = true;
c2bc96a91dc26e Yosry Ahmed        2024-03-07  582  
145f573b89a62b Rik van Riel       2018-09-25  583  		/*
145f573b89a62b Rik van Riel       2018-09-25  584  		 * If the CPU is not in lazy TLB mode, we are just switching
145f573b89a62b Rik van Riel       2018-09-25  585  		 * from one thread in a process to another thread in the same
145f573b89a62b Rik van Riel       2018-09-25  586  		 * process. No TLB flush required.
145f573b89a62b Rik van Riel       2018-09-25  587  		 */
c2bc96a91dc26e Yosry Ahmed        2024-03-07  588  		if (was_lazy) {
145f573b89a62b Rik van Riel       2018-09-25  589  			/*
145f573b89a62b Rik van Riel       2018-09-25  590  			 * Read the tlb_gen to check whether a flush is needed.
c2bc96a91dc26e Yosry Ahmed        2024-03-07  591  			 * If the TLB is up to date, just use it.  The barrier
c2bc96a91dc26e Yosry Ahmed        2024-03-07  592  			 * synchronizes with the tlb_gen increment in the TLB
c2bc96a91dc26e Yosry Ahmed        2024-03-07  593  			 * shootdown code.
145f573b89a62b Rik van Riel       2018-09-25  594  			 */
145f573b89a62b Rik van Riel       2018-09-25  595  			smp_mb();
145f573b89a62b Rik van Riel       2018-09-25  596  			next_tlb_gen = atomic64_read(&next->context.tlb_gen);
c2bc96a91dc26e Yosry Ahmed        2024-03-07  597  			if (cpu_tlb_gen < next_tlb_gen) {
145f573b89a62b Rik van Riel       2018-09-25  598  				/*
c2bc96a91dc26e Yosry Ahmed        2024-03-07  599  				 * TLB contents went out of date while we were
c2bc96a91dc26e Yosry Ahmed        2024-03-07  600  				 * in lazy mode.
145f573b89a62b Rik van Riel       2018-09-25  601  				 */
145f573b89a62b Rik van Riel       2018-09-25  602  				new_asid = prev_asid;

new_asid is only initialized if several conditions are true.

145f573b89a62b Rik van Riel       2018-09-25  603  				need_flush = true;
c2bc96a91dc26e Yosry Ahmed        2024-03-07  604  			}
c2bc96a91dc26e Yosry Ahmed        2024-03-07  605  		}
c2bc96a91dc26e Yosry Ahmed        2024-03-07  606  
c2bc96a91dc26e Yosry Ahmed        2024-03-07  607  		if (!need_flush && !need_lam_update)
c2bc96a91dc26e Yosry Ahmed        2024-03-07  608  			return;
94b1b03b519b81 Andy Lutomirski    2017-06-29  609  	} else {
18bf3c3ea8ece8 Tim Chen           2018-01-29  610  		/*
371b09c6fdc436 Balbir Singh       2021-01-08  611  		 * Apply process to process speculation vulnerability
371b09c6fdc436 Balbir Singh       2021-01-08  612  		 * mitigations if applicable.
18bf3c3ea8ece8 Tim Chen           2018-01-29  613  		 */
371b09c6fdc436 Balbir Singh       2021-01-08  614  		cond_mitigation(tsk);
94b1b03b519b81 Andy Lutomirski    2017-06-29  615  
e9d8c61557687b Rik van Riel       2018-07-16  616  		/*
e9d8c61557687b Rik van Riel       2018-07-16  617  		 * Stop remote flushes for the previous mm.
e9d8c61557687b Rik van Riel       2018-07-16  618  		 * Skip kernel threads; we never send init_mm TLB flushing IPIs,
e9d8c61557687b Rik van Riel       2018-07-16  619  		 * but the bitmap manipulation can cause cache line contention.
e9d8c61557687b Rik van Riel       2018-07-16  620  		 */
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  621  		if (prev != &init_mm) {
e9d8c61557687b Rik van Riel       2018-07-16  622  			VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu,
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  623  						mm_cpumask(prev)));
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  624  			cpumask_clear_cpu(cpu, mm_cpumask(prev));
e9d8c61557687b Rik van Riel       2018-07-16  625  		}
e37e43a497d5a8 Andy Lutomirski    2016-08-11  626  
69c0319aabba45 Andy Lutomirski    2016-04-26  627  		/*
94b1b03b519b81 Andy Lutomirski    2017-06-29  628  		 * Start remote flushes and then read tlb_gen.
69c0319aabba45 Andy Lutomirski    2016-04-26  629  		 */
e9d8c61557687b Rik van Riel       2018-07-16  630  		if (next != &init_mm)
94b1b03b519b81 Andy Lutomirski    2017-06-29  631  			cpumask_set_cpu(cpu, mm_cpumask(next));
94b1b03b519b81 Andy Lutomirski    2017-06-29  632  		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
94b1b03b519b81 Andy Lutomirski    2017-06-29  633  
10af6235e0d327 Andy Lutomirski    2017-07-24  634  		choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
69c0319aabba45 Andy Lutomirski    2016-04-26  635  
4012e77a903d11 Andy Lutomirski    2018-08-29  636  		/* Let nmi_uaccess_okay() know that we're changing CR3. */
4012e77a903d11 Andy Lutomirski    2018-08-29  637  		this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
4012e77a903d11 Andy Lutomirski    2018-08-29  638  		barrier();
12c4d978fd170c Rik van Riel       2018-09-25  639  	}
4012e77a903d11 Andy Lutomirski    2018-08-29  640  
897738e46e29e1 Yosry Ahmed        2024-03-07  641  	/*
897738e46e29e1 Yosry Ahmed        2024-03-07  642  	 * Even if we are not actually switching mm's, another thread could have
897738e46e29e1 Yosry Ahmed        2024-03-07  643  	 * updated mm->context.lam_cr3_mask. Make sure tlbstate_lam_cr3_mask()
897738e46e29e1 Yosry Ahmed        2024-03-07  644  	 * and the loaded CR3 use the up-to-date mask.
897738e46e29e1 Yosry Ahmed        2024-03-07  645  	 */
897738e46e29e1 Yosry Ahmed        2024-03-07  646  	new_lam = set_tlbstate_lam_mode(next);
10af6235e0d327 Andy Lutomirski    2017-07-24  647  	if (need_flush) {
10af6235e0d327 Andy Lutomirski    2017-07-24  648  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
10af6235e0d327 Andy Lutomirski    2017-07-24  649  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
82721d8b25d76c Kirill A. Shutemov 2023-03-12  650  		load_new_mm_cr3(next->pgd, new_asid, new_lam, true);
675357362aeba1 Andy Lutomirski    2017-11-04  651  
bf9282dc26e7fe Peter Zijlstra     2020-08-12  652  		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
10af6235e0d327 Andy Lutomirski    2017-07-24  653  	} else {
10af6235e0d327 Andy Lutomirski    2017-07-24  654  		/* The new ASID is already up to date. */
82721d8b25d76c Kirill A. Shutemov 2023-03-12 @655  		load_new_mm_cr3(next->pgd, new_asid, new_lam, false);
675357362aeba1 Andy Lutomirski    2017-11-04  656  
bf9282dc26e7fe Peter Zijlstra     2020-08-12  657  		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, 0);
10af6235e0d327 Andy Lutomirski    2017-07-24  658  	}
10af6235e0d327 Andy Lutomirski    2017-07-24  659  
4012e77a903d11 Andy Lutomirski    2018-08-29  660  	/* Make sure we write CR3 before loaded_mm. */
4012e77a903d11 Andy Lutomirski    2018-08-29  661  	barrier();
4012e77a903d11 Andy Lutomirski    2018-08-29  662  
10af6235e0d327 Andy Lutomirski    2017-07-24  663  	this_cpu_write(cpu_tlbstate.loaded_mm, next);
10af6235e0d327 Andy Lutomirski    2017-07-24  664  	this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
69c0319aabba45 Andy Lutomirski    2016-04-26  665  
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  666  	if (next != prev) {
cb2a02355b042e Thomas Gleixner    2020-04-21  667  		cr4_update_pce_mm(next);
3cfd6625a6cf83 Yosry Ahmed        2024-01-26  668  		switch_ldt(prev, next);
69c0319aabba45 Andy Lutomirski    2016-04-26  669  	}
145f573b89a62b Rik van Riel       2018-09-25  670  }

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-09 21:37           ` Yosry Ahmed
@ 2024-03-11 12:42             ` Kirill A. Shutemov
  2024-03-11 18:27               ` Yosry Ahmed
  0 siblings, 1 reply; 27+ messages in thread
From: Kirill A. Shutemov @ 2024-03-11 12:42 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Dave Hansen, Andy Lutomirski, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Peter Zijlstra (Intel),
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Sat, Mar 09, 2024 at 01:37:06PM -0800, Yosry Ahmed wrote:
> On Sat, Mar 9, 2024 at 8:34 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > On Sat, Mar 09, 2024 at 02:19:19AM +0000, Yosry Ahmed wrote:
> > > I don't see how skipping set_tlbstate_lam_mode() for kthreads fixes this
> > > problem. Do you mind elaborating?
> >
> > Define what problem is.
> >
> > Yes, in this scenario kthread gets more permissive LAM mode than it needs.
> > But nothing breaks.
> 
> 
> The problem here is not how the kthread runs at all. It is the fact
> that if that kthread context switches into the user process that has
> enabled LAM, it may not update CR3 because the mm doesn't change.
> switch_mm_irqs_off() will only update CR3 in this case if there is a
> pending TLB flush. Otherwise, we just return, even if the LAM for this
> mm has changed.
> 
> This can cause the process that has enabled LAM to run with LAM
> disabled and fault on tagged addresses, right? Did I miss something?

You are right. I think IPI is the way to go.

Will you prepare a patch?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-11 12:42             ` Kirill A. Shutemov
@ 2024-03-11 18:27               ` Yosry Ahmed
  0 siblings, 0 replies; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-11 18:27 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dave Hansen, Andy Lutomirski, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Peter Zijlstra (Intel),
	the arch/x86 maintainers, linux-mm, Linux Kernel Mailing List

On Mon, Mar 11, 2024 at 5:42 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> On Sat, Mar 09, 2024 at 01:37:06PM -0800, Yosry Ahmed wrote:
> > On Sat, Mar 9, 2024 at 8:34 AM Kirill A. Shutemov
> > <kirill.shutemov@linux.intel.com> wrote:
> > >
> > > On Sat, Mar 09, 2024 at 02:19:19AM +0000, Yosry Ahmed wrote:
> > > > I don't see how skipping set_tlbstate_lam_mode() for kthreads fixes this
> > > > problem. Do you mind elaborating?
> > >
> > > Define what problem is.
> > >
> > > Yes, in this scenario kthread gets more permissive LAM mode than it needs.
> > > But nothing breaks.
> >
> >
> > The problem here is not how the kthread runs at all. It is the fact
> > that if that kthread context switches into the user process that has
> > enabled LAM, it may not update CR3 because the mm doesn't change.
> > switch_mm_irqs_off() will only update CR3 in this case if there is a
> > pending TLB flush. Otherwise, we just return, even if the LAM for this
> > mm has changed.
> >
> > This can cause the process that has enabled LAM to run with LAM
> > disabled and fault on tagged addresses, right? Did I miss something?
>
> You are right. I think IPI is the way to go.
>
> Will you prepare a patch?

I am working on it.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
  2024-03-11  6:09   ` Dan Carpenter
@ 2024-03-11 21:28     ` Yosry Ahmed
  0 siblings, 0 replies; 27+ messages in thread
From: Yosry Ahmed @ 2024-03-11 21:28 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: oe-kbuild, lkp, oe-kbuild-all

On Mon, Mar 11, 2024 at 09:09:34AM +0300, Dan Carpenter wrote:
> Hi Yosry,
> 
> [This is a private test report for your RFC patch.]
> kernel test robot noticed the following build warnings:
> 
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Yosry-Ahmed/x86-mm-fix-LAM-cr3-mask-inconsistency-during-context-switch/20240307-214052
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/20240307133916.3782068-3-yosryahmed%40google.com
> patch subject: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
> config: x86_64-randconfig-161-20240310 (https://download.01.org/0day-ci/archive/20240310/202403101713.eM4pyFpc-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> | Closes: https://lore.kernel.org/r/202403101713.eM4pyFpc-lkp@intel.com/
> 
> smatch warnings:
> arch/x86/mm/tlb.c:655 switch_mm_irqs_off() error: uninitialized symbol 'new_asid'.

Thanks for reporting this. I think it may be a false positive, but this
code chunk will be dropped in the next version anyway.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-03-11 21:28 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-10 10:04 [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2024-03-07 13:39 [RFC PATCH 0/3] x86/mm: LAM fixups and cleanups Yosry Ahmed
2024-03-07 13:39 ` [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching Yosry Ahmed
2024-03-07 15:29   ` Dave Hansen
2024-03-07 21:04     ` Yosry Ahmed
2024-03-07 21:39       ` Dave Hansen
2024-03-07 22:29         ` Yosry Ahmed
2024-03-07 22:41           ` Dave Hansen
2024-03-07 22:44             ` Yosry Ahmed
2024-03-08  1:26           ` Yosry Ahmed
2024-03-08  8:09             ` Yosry Ahmed
2024-03-07 17:29   ` Kirill A. Shutemov
2024-03-07 17:56     ` Dave Hansen
2024-03-07 21:08       ` Yosry Ahmed
2024-03-07 21:48         ` Dave Hansen
2024-03-07 22:30           ` Yosry Ahmed
2024-03-08  1:34   ` Andy Lutomirski
2024-03-08  1:47     ` Yosry Ahmed
2024-03-08 14:05       ` Kirill A. Shutemov
2024-03-08 15:23     ` Dave Hansen
2024-03-08 18:18       ` Kirill A. Shutemov
2024-03-09  2:19       ` Yosry Ahmed
2024-03-09 16:34         ` Kirill A. Shutemov
2024-03-09 21:37           ` Yosry Ahmed
2024-03-11 12:42             ` Kirill A. Shutemov
2024-03-11 18:27               ` Yosry Ahmed
2024-03-11  6:09   ` Dan Carpenter
2024-03-11 21:28     ` Yosry Ahmed

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.