From: Yosry Ahmed <yosryahmed@google.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	 Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Peter Zijlstra <peterz@infradead.org>,
	 Andy Lutomirski <luto@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	x86@kernel.org,  linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
Date: Fri, 8 Mar 2024 08:09:21 +0000
Message-ID: <ZerHsQV8qcyzW0V5@google.com>
In-Reply-To: <ZeppLlDeTro6zpIg@google.com>

> I came up with a kernel patch that I *think* may reproduce the problem
> with enough iterations. Userspace only needs to enable LAM, so I think
> the selftest can be enough to trigger it.
> 
> However, there is no hardware with LAM at my disposal, and IIUC I cannot
> use QEMU without KVM to run a kernel with LAM. I was planning to do more
> testing before sending a non-RFC version, but apparently I cannot do
> any testing beyond building at this point (including reproducing) :/
> 
> Let me know how you want to proceed. I can send a non-RFC v1 based on
> the feedback I got on the RFC, but it will only be build tested.
> 
> For the record, here is the diff that I *think* may reproduce the bug:

Okay, I was actually able to run _some_ testing with the diff below on
_a kernel_, and I hit the BUG_ON pretty quickly. If I did things
correctly, this BUG_ON means that even though we have an outdated LAM in
our CR3, we will not update CR3 because the TLB is up-to-date.
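
To make the condition concrete, here is a minimal userspace sketch of
what the BUG_ON checks. This is not kernel code: struct demo_cpu,
struct demo_mm, and the demo_* helpers are hypothetical stand-ins, and
the LAM value used is just a placeholder bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of what this CPU last loaded. */
struct demo_cpu {
	uint64_t tlb_gen;      /* TLB generation this CPU has caught up to */
	uint64_t lam_cr3_mask; /* LAM bits currently in this CPU's CR3 */
};

/* Hypothetical, simplified view of the mm being switched to. */
struct demo_mm {
	uint64_t tlb_gen;
	uint64_t lam_cr3_mask;
};

/* Models the early return: only the TLB generation is compared. */
static inline bool demo_skips_cr3_write(const struct demo_cpu *cpu,
					const struct demo_mm *mm)
{
	return cpu->tlb_gen == mm->tlb_gen;
}

/* True when CR3 is left alone even though its LAM bits are stale. */
static inline bool demo_hits_bug_on(const struct demo_cpu *cpu,
				    const struct demo_mm *mm)
{
	return demo_skips_cr3_write(cpu, mm) &&
	       cpu->lam_cr3_mask != mm->lam_cr3_mask;
}

/* Usage: LAM was enabled on the mm, but no TLB flush bumped tlb_gen. */
static inline void demo_stale_lam(void)
{
	struct demo_cpu cpu = { .tlb_gen = 5, .lam_cr3_mask = 0 };
	struct demo_mm mm = { .tlb_gen = 5, .lam_cr3_mask = UINT64_C(1) << 61 };

	assert(demo_hits_bug_on(&cpu, &mm));
}
```

In this model, enabling LAM changes the mm's mask without changing
tlb_gen, so a check based on tlb_gen alone cannot notice it.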

I can work on a v1 now with the IPI approach that Andy suggested. A
small kink is that we may still hit the BUG_ON with that fix, but in
that case it should be fine to not write CR3 because once we re-enable
interrupts we will receive the IPI and fix it. IOW, the diff below will
still BUG with the proposed fix, but it should be okay.

One thing I am not clear about with the IPI approach: if we use
mm_cpumask() to limit the IPI scope, we need to make sure that we read
mm_lam_cr3_mask() *after* we update the cpumask in switch_mm_irqs_off(),
which makes me think we'll need a barrier (and Andy said we want to
avoid those in this path). But looking at the code I see:

		/*
		 * Start remote flushes and then read tlb_gen.
		 */
		if (next != &init_mm)
			cpumask_set_cpu(cpu, mm_cpumask(next));
		next_tlb_gen = atomic64_read(&next->context.tlb_gen);

This code doesn't have a barrier. How do we make sure the read actually
happens after the write?

If no barrier is needed there, then I think we can similarly just read
the LAM mask after cpumask_set_cpu().
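
For reference, the ordering I am after can be sketched in userspace
with C11 atomics. This only illustrates the property, under the
assumption that both sides use sequentially consistent operations; it
says nothing about what cpumask_set_cpu() actually guarantees in
switch_mm_irqs_off(). struct demo_mm and the demo_* functions are
hypothetical names, and the 64-bit cpumask is a simplification.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for mm_cpumask(mm) and the mm's LAM mask. */
struct demo_mm {
	_Atomic uint64_t cpumask;
	_Atomic uint64_t lam_cr3_mask;
};

/*
 * Switching CPU: set our bit in the cpumask, then read the LAM mask.
 * The seq_cst read-modify-write keeps the later load from being
 * ordered before it. Assumes cpu < 64.
 */
static inline uint64_t demo_join_then_read_lam(struct demo_mm *mm,
					       unsigned int cpu)
{
	atomic_fetch_or_explicit(&mm->cpumask, UINT64_C(1) << cpu,
				 memory_order_seq_cst);
	return atomic_load_explicit(&mm->lam_cr3_mask, memory_order_seq_cst);
}

/*
 * Enabling CPU: publish the LAM mask, then read the cpumask to decide
 * which CPUs would need an IPI.
 */
static inline uint64_t demo_set_lam_then_read_cpus(struct demo_mm *mm,
						   uint64_t lam)
{
	atomic_store_explicit(&mm->lam_cr3_mask, lam, memory_order_seq_cst);
	return atomic_load_explicit(&mm->cpumask, memory_order_seq_cst);
}

/* Single-threaded usage, just to show the two sides of the handshake. */
static inline void demo_handshake(void)
{
	struct demo_mm mm;

	atomic_init(&mm.cpumask, 0);
	atomic_init(&mm.lam_cr3_mask, 0);

	/* CPU 3 joins before LAM is enabled, so it reads no LAM bits... */
	assert(demo_join_then_read_lam(&mm, 3) == 0);
	/* ...but the enabling side then sees CPU 3 in the mask. */
	assert(demo_set_lam_then_read_cpus(&mm, UINT64_C(1) << 61) ==
	       (UINT64_C(1) << 3));
}
```

With both sides sequentially consistent, the two functions racing on
different CPUs cannot both miss each other: either the switching CPU
reads the new LAM mask, or the enabling CPU sees it in the cpumask and
can send it an IPI.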

> 
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index 33b268747bb7b..c37a8c26a3c21 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -750,8 +750,25 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
>  
>  #define LAM_U57_BITS 6
>  
> +static int kthread_fn(void *_mm)
> +{
> +	struct mm_struct *mm = _mm;
> +
> +	/*
> +	 * Wait for LAM to be enabled then schedule. Hopefully we will context
> +	 * switch directly into the task that enabled LAM due to CPU pinning.
> +	 */
> +	kthread_use_mm(mm);
> +	while (!test_bit(MM_CONTEXT_LOCK_LAM, &mm->context.flags));
> +	schedule();
> +	return 0;
> +}
> +
>  static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
>  {
> +	struct task_struct *kthread_task;
> +	int kthread_cpu;
> +
>  	if (!cpu_feature_enabled(X86_FEATURE_LAM))
>  		return -ENODEV;
>  
> @@ -782,10 +799,22 @@ static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
>  		return -EINVAL;
>  	}
>  
> +	/* Pin the task to the current CPU */
> +	set_cpus_allowed_ptr(current, cpumask_of(smp_processor_id()));
> +
> +	/* Run a kthread on another CPU and wait for it to start */
> 	kthread_cpu = cpumask_next_wrap(smp_processor_id(), cpu_online_mask, 0, false);
> +	kthread_task = kthread_run_on_cpu(kthread_fn, mm, kthread_cpu, "lam_repro_kthread");
> +	while (!task_is_running(kthread_task));
> +
>  	write_cr3(__read_cr3() | mm->context.lam_cr3_mask);
>  	set_tlbstate_lam_mode(mm);
>  	set_bit(MM_CONTEXT_LOCK_LAM, &mm->context.flags);
>  
> +	/* Move the task to the kthread CPU */
> +	set_cpus_allowed_ptr(current, cpumask_of(kthread_cpu));
> +
>  	mmap_write_unlock(mm);
>  
>  	return 0;
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 51f9f56941058..3afb53f1a1901 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -593,7 +593,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
>  		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
>  		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
>  				next_tlb_gen)
> -			return;
> +			BUG_ON(new_lam != tlbstate_lam_cr3_mask());
>  
>  		/*
>  		 * TLB contents went out of date while we were in lazy
> 

