From: Yosry Ahmed <yosryahmed@google.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@intel.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
"the arch/x86 maintainers" <x86@kernel.org>,
linux-mm@kvack.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching
Date: Fri, 8 Mar 2024 01:47:39 +0000 [thread overview]
Message-ID: <ZepuO5bDoE-5T0RB@google.com> (raw)
In-Reply-To: <420fcb06-c3c3-4e8f-a82d-be2fb2ef444d@app.fastmail.com>
On Thu, Mar 07, 2024 at 05:34:21PM -0800, Andy Lutomirski wrote:
> Catching up a bit...
>
> On Thu, Mar 7, 2024, at 5:39 AM, Yosry Ahmed wrote:
> > During context switching, if we are not switching to new mm and no TLB
> > flush is needed, we do not write CR3. However, it is possible that a
> > user thread enables LAM while a kthread is running on a different CPU
> > with the old LAM CR3 mask. If the kthread context switches into any
> > thread of that user process, it may not write CR3 with the new LAM mask,
> > which would cause the user thread to run with a misconfigured CR3 that
> > disables LAM on the CPU.
>
> So I think (off the top of my head -- haven't thought about it all that hard) that LAM is logically like PCE and LDT: it's a property of an mm that is only rarely changed, and it doesn't really belong as part of the tlb_gen mechanism. And, critically, it's not worth the effort and complexity to try to optimize LAM changes when we have a lazy CPU (just like PCE and LDT) (whereas TLB flushes are performance critical and are absolutely worth optimizing).
>
> So...
>
> >
> > Fix this by making sure we write a new CR3 if LAM is not up-to-date. No
> > problems were observed in practice, this was found by code inspection.
>
> I think it should be fixed with a much bigger hammer: explicit IPIs. Just don't ever let it get out of date, like install_ldt().
I like this, and I think earlier versions of the code did this. I think
the code now assumes it's fine to not send an IPI since only
single-threaded processes can enable LAM, but this means we have to
handle kthreads switching to user threads with outdated LAMs (what this
patch is trying to do).
I also think there is currently an assumption that it's fine for
kthreads to run with an incorrect LAM, which is mostly fine, but the IPI
also drops that assumption.
>
> >
> > Not that it is possible that mm->context.lam_cr3_mask changes throughout
> > switch_mm_irqs_off(). But since LAM can only be enabled by a
> > single-threaded process on its own behalf, in that case we cannot be
> > switching to a user thread in that same process, we can only be
> > switching to another kthread using the borrowed mm or a different user
> > process, which should be fine.
>
> The thought process is even simpler with the IPI: it *can* change while switching, but it will resynchronize immediately once IRQs turn back on. And whoever changes it will *synchronize* with us, which would otherwise require extremely complex logic to get right.
>
> And...
>
> > - if (!was_lazy)
> > - return;
> > + if (was_lazy) {
> > + /*
> > + * Read the tlb_gen to check whether a flush is needed.
> > + * If the TLB is up to date, just use it. The barrier
> > + * synchronizes with the tlb_gen increment in the TLB
> > + * shootdown code.
> > + */
> > + smp_mb();
>
> This is actually rather expensive -- from old memory, we're talking maybe 20 cycles here, but this path is *very* hot and we try fairly hard to make it be fast. If we get the happy PCID path, it's maybe 100-200 cycles, so this is like a 10% regression. Ouch.
This is not newly introduced by this patch. I merely refactored this
code (reversed the if conditions). I think if we keep the current
approach I should move this refactoring to a separate patch to make
things clearer.
>
> And you can delete all of this if you accept my suggestion.
I like it very much. The problem now is, as I told Dave, I realized I
cannot do any testing beyond compilation due to lack of hardware. I am
happy to send a next version if this is acceptable or if someone else
can test.
next prev parent reply other threads:[~2024-03-08 1:47 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-07 13:39 [RFC PATCH 0/3] x86/mm: LAM fixups and cleanups Yosry Ahmed
2024-03-07 13:39 ` [RFC PATCH 1/3] x86/mm: fix LAM cr3 mask inconsistency during context switch Yosry Ahmed
2024-03-07 17:22 ` Kirill A. Shutemov
2024-03-07 20:31 ` Yosry Ahmed
2024-03-07 17:36 ` Dave Hansen
2024-03-07 18:49 ` Sean Christopherson
2024-03-07 20:44 ` Yosry Ahmed
2024-03-07 22:12 ` Sean Christopherson
2024-03-07 20:42 ` Yosry Ahmed
2024-03-07 23:21 ` Yosry Ahmed
2024-03-07 23:32 ` Dave Hansen
2024-03-07 23:37 ` Yosry Ahmed
2024-03-07 13:39 ` [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching Yosry Ahmed
2024-03-07 15:29 ` Dave Hansen
2024-03-07 21:04 ` Yosry Ahmed
2024-03-07 21:39 ` Dave Hansen
2024-03-07 22:29 ` Yosry Ahmed
2024-03-07 22:41 ` Dave Hansen
2024-03-07 22:44 ` Yosry Ahmed
2024-03-08 1:26 ` Yosry Ahmed
2024-03-08 8:09 ` Yosry Ahmed
2024-03-07 17:29 ` Kirill A. Shutemov
2024-03-07 17:56 ` Dave Hansen
2024-03-07 21:08 ` Yosry Ahmed
2024-03-07 21:48 ` Dave Hansen
2024-03-07 22:30 ` Yosry Ahmed
2024-03-08 1:34 ` Andy Lutomirski
2024-03-08 1:47 ` Yosry Ahmed [this message]
2024-03-08 14:05 ` Kirill A. Shutemov
2024-03-08 15:23 ` Dave Hansen
2024-03-08 18:18 ` Kirill A. Shutemov
2024-03-09 2:19 ` Yosry Ahmed
2024-03-09 16:34 ` Kirill A. Shutemov
2024-03-09 21:37 ` Yosry Ahmed
2024-03-11 12:42 ` Kirill A. Shutemov
2024-03-11 18:27 ` Yosry Ahmed
2024-03-11 6:09 ` Dan Carpenter
2024-03-11 21:28 ` Yosry Ahmed
2024-03-07 13:39 ` [RFC PATCH 3/3] x86/mm: cleanup prctl_enable_tagged_addr() nr_bits error checking Yosry Ahmed
2024-03-07 17:31 ` Kirill A. Shutemov
2024-03-07 20:27 ` Yosry Ahmed
-- strict thread matches above, loose matches on Subject: below --
2024-03-10 10:04 [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZepuO5bDoE-5T0RB@google.com \
--to=yosryahmed@google.com \
--cc=akpm@linux-foundation.org \
--cc=bp@alien8.de \
--cc=dave.hansen@intel.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.