From: Rik van Riel <riel@surriel.com>
To: Andy Lutomirski <luto@kernel.org>,
"Jason A. Donenfeld" <Jason@zx2c4.com>,
Dave Hansen <dave.hansen@linux.intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>
Subject: Re: Lazy FPU restoration / moving kernel_fpu_end() to context switch
Date: Fri, 15 Jun 2018 14:40:02 -0400 [thread overview]
Message-ID: <1529088002.7898.156.camel@surriel.com> (raw)
In-Reply-To: <CALCETrU+2mBPDfkBz1i_GT1EOJau+mzj4yOK8N0UnT2pGjiUWQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3746 bytes --]
On Fri, 2018-06-15 at 11:31 -0700, Andy Lutomirski wrote:
> On Fri, Jun 15, 2018 at 6:11 AM Jason A. Donenfeld <Jason@zx2c4.com>
> wrote:
> >
> > Hi Andy & folks,
> >
> > Lots of crypto routines look like this:
> >
> > kernel_fpu_begin();
> > encrypt();
> > kernel_fpu_end();
> >
> > If you call such a routine twice, you get:
> >
> > kernel_fpu_begin();
> > encrypt();
> > kernel_fpu_end();
> > kernel_fpu_begin();
> > encrypt();
> > kernel_fpu_end();
> >
> > In a loop this looks like:
> >
> > for (thing) {
> > kernel_fpu_begin();
> > encrypt(thing);
> > kernel_fpu_end();
> > }
> >
> > This is obviously very bad, because begin() and end() are slow, so
> > WireGuard does the obvious:
> >
> > kernel_fpu_begin();
> > for (thing)
> > encrypt(thing);
> > kernel_fpu_end();
> >
> > This is fine and well, and the crypto API I'm working on will
> > enable
> > this to be done in a clear way, but I do wonder if maybe this is
> > not
> > something that should be happening at the level of the caller, but
> > rather in the fpu functions themselves. Namely, what are your
> > thoughts
> > on modifying kernel_fpu_end() so that it doesn't actually restore
> > the
> > state, but just reenables preemption and marks that on the next
> > context switch, the state should be restored? Then, essentially,
> > kernel_fpu_begin() and end() become free after the first usage of
> > kernel_fpu_begin().
> >
> > Is this something feasible? I know that performance-wise, I'm
> > really
> > gaining a lot from hoisting those functions out of the loops, and
> > API
> > wise, it'd be slightly simpler to implement if I didn't have to all
> > for that hoisting.
>
> Hi Jason-
>
> Funny you asked. This has been discussed a few times, although not
> quite in the form you imagined. The idea that we've tossed around is
> to restore FPU state on return to user mode. Roughly, we'd introduce
> a new thread flag TIF_FPU_UNLOADED (name TBD).
> prepare_exit_to_usermode() would notice this flag, copy the fpstate
> to
> fpregs, and clear the flag. (Or maybe exit_to_usermode_loop() -- No
> one has quite thought it through, but I think it should be outside
> the
> loop.) We'd update all the FPU accessors to understand the flag.
> We'd probably have a helper like:
>
> void unload_user_fpstate(void)
>
> that would do nothing if TIF_FPU_UNLOADED is set. If
> TIF_FPU_UNLOADED
> is clear, it would move the state to memory and set TIF_FPU_UNLOADED.
> We'd call unload_user_fpstate() during context switch in the prev
> task
> context and in kernel_fpu_begin().
>
> The one major complication I know of is that PKRU is different from
> all other FPU state in that it needs to be loaded when *kernel* code
> does any user memory access. So we'd need to make sure that context
> switches load the new task's PKRU eagerly. Using WRPKRU is easy,
> but,
> unless we do something very clever, actually finding PKRU in the
> in-memory fpstate image may be slightly nontrivial.
IIRC the in-kernel FPU state always has every FPU field
at a constant location.
> I suppose we could eagerly load the new FPU state on context
> switches,
> but I'm not sure I see any point. We'd still have to load it when we
> switch back to user mode, so it would be slower but not necessarily
> any simpler.
>
> I'd love to review patches to make this change, but I don't have the
> bandwidth to write them right now.
I can take a look at this, I am already looking at
some context switch code right now, anyway.
I also should still have the TIF_FPU_UNLOADED patches
(or whatever the flag was called) code around somewhere.
--
All Rights Reversed.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2018-06-15 18:40 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-15 13:11 Lazy FPU restoration / moving kernel_fpu_end() to context switch Jason A. Donenfeld
2018-06-15 16:25 ` Thomas Gleixner
2018-06-15 18:33 ` Brian Gerst
2018-06-15 19:34 ` Peter Zijlstra
2018-06-15 20:30 ` Jason A. Donenfeld
2018-06-18 9:44 ` Peter Zijlstra
2018-06-18 15:25 ` Jason A. Donenfeld
2018-06-15 18:31 ` Andy Lutomirski
2018-06-15 18:40 ` Rik van Riel [this message]
2018-06-15 18:41 ` Dave Hansen
2018-06-15 18:49 ` Andy Lutomirski
2018-06-15 18:48 ` Dave Hansen
2018-06-15 18:53 ` Andy Lutomirski
2018-06-15 20:27 ` Jason A. Donenfeld
2018-06-15 20:48 ` Jason A. Donenfeld
2018-06-19 11:43 ` David Laight
2018-06-19 13:08 ` Thomas Gleixner
2018-06-15 20:33 ` Jason A. Donenfeld
2018-06-15 20:42 ` Dave Hansen
2018-06-15 20:52 ` Andy Lutomirski
2018-06-15 20:56 ` Rik van Riel
2018-07-11 16:28 ` Sebastian Andrzej Siewior
2018-07-11 20:10 ` Rik van Riel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1529088002.7898.156.camel@surriel.com \
--to=riel@surriel.com \
--cc=Jason@zx2c4.com \
--cc=dave.hansen@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.