From: Denys Vlasenko <dvlasenk@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Steven Rostedt <rostedt@goodmis.org>,
Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>, Oleg Nesterov <oleg@redhat.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Alexei Starovoitov <ast@plumgrid.com>,
Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>,
X86 ML <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] x86: optimize IRET returns to kernel
Date: Tue, 31 Mar 2015 17:59:05 +0200
Message-ID: <551AC449.6030909@redhat.com>
In-Reply-To: <CALCETrU=nywXFqCDQb7EPZFZiyyXSES+UXGHMB6+5Mm=Fc9_eQ@mail.gmail.com>

On 03/31/2015 03:54 PM, Andy Lutomirski wrote:
> On Tue, Mar 31, 2015 at 5:46 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> This is not proposed to be merged yet.
>>
>> Andy, this patch is in the spirit of your crazy ideas of repurposing
>> instructions for roles they weren't intended for :)
>>
>> Recently I measured IRET timings and was newly "impressed"
>> how slow it is. 200+ cycles. So I started thinking...
>>
>> When we return from interrupt/exception *to kernel*,
>> most of IRET's doings are not necessary. CS and SS
>> do not need changing. And in many (most?) cases
>> saved RSP points right at the top of pt_regs,
>> or (top of pt_regs+8).
>>
>> In which case we can (ab)use POPF and RET!
>>
>> Please see the patch.
>
> I have an old attempt at this here:
>
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=fast-return-to-kernel&id=6cfe29821979c42cd812878e05577f69f99fafaf

Your version is better :/
I'd only suggest s/pop %rsp/mov (%rsp),%rsp/
I suspect "pop %rsp" is not an easy insn for the CPU to digest.

> If I were doing it again, I'd add a bit more care: if saved eflags
> have RF set (can kgdb do that?), then we have to use iret.

Good idea. We can even be paranoid and jump to the real IRET if any
of the "unusual" flags are set.

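
The conditions discussed so far (return to kernel CS, saved RSP at the top
of pt_regs or top+8, no "unusual" flags) can be sketched as a predicate.
This is only an illustration in C, not code from either patch: the struct
iret_frame layout, the function name can_fast_return(), and the exact set
of flags treated as "unusual" (TF, NT, RF, VM, VIF, VIP) are assumptions
made for this sketch. Only the EFLAGS bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 EFLAGS bit positions. */
#define X86_EFLAGS_TF  (UINT64_C(1) << 8)   /* trap flag */
#define X86_EFLAGS_NT  (UINT64_C(1) << 14)  /* nested task */
#define X86_EFLAGS_RF  (UINT64_C(1) << 16)  /* resume flag */
#define X86_EFLAGS_VM  (UINT64_C(1) << 17)  /* virtual-8086 mode */
#define X86_EFLAGS_VIF (UINT64_C(1) << 19)  /* virtual interrupt flag */
#define X86_EFLAGS_VIP (UINT64_C(1) << 20)  /* virtual interrupt pending */

/* Assumption: which flags count as "unusual" is a choice made for this sketch. */
#define UNUSUAL_FLAGS (X86_EFLAGS_TF | X86_EFLAGS_NT | X86_EFLAGS_RF | \
                       X86_EFLAGS_VM | X86_EFLAGS_VIF | X86_EFLAGS_VIP)

/* Hypothetical mirror of the five words IRET consumes in 64-bit mode. */
struct iret_frame {
    uint64_t ip;
    uint64_t cs;
    uint64_t flags;
    uint64_t sp;
    uint64_t ss;
};

/*
 * Return true if a POPF+RET style return looks safe, false if the
 * real IRET should be used. frame_end is the address just past the
 * saved frame, i.e. the "top of pt_regs" in the discussion above.
 */
static bool can_fast_return(const struct iret_frame *f, uint64_t frame_end)
{
    /* Low two bits of CS are the privilege level; 0 means kernel. */
    if ((f->cs & 3) != 0)
        return false;

    /* Be paranoid: any unusual flag sends us to the real IRET. */
    if (f->flags & UNUSUAL_FLAGS)
        return false;

    /* Saved RSP must be right at the top of the frame, or 8 above it. */
    return f->sp == frame_end || f->sp == frame_end + 8;
}
```

A caller would test can_fast_return() first and fall back to IRET whenever
it returns false, so a too-conservative flag mask only costs speed.
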
> I think that, if returning to IF=1, you need to do sti;ret to avoid an
> infinite stack usage failure in which, during an IRQ storm, each IRQ
> adds around one word of stack utilization because you haven't done the
> ret yet before the next IRQ comes in. To make that robust, I'd adjust
> the NMI code to clear IF and back up one instruction if it interrupts
> after sti.

I kinda hoped POPF was secretly a shadowing insn too, like STI.
Experiments show it is not.
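
The stack-growth concern can be made concrete with a toy model. It is a
deliberately simplified worst case, not a measurement: it assumes every IRQ
of the storm arrives in the window between re-enabling interrupts and the
final RET, and that exactly one word (the return address) is still on the
stack at that point. The function name stack_words_left_after_storm() and
its parameters are invented for this sketch.

```c
#include <stdbool.h>

/*
 * Toy model of an IRQ storm hitting the fast return path.
 *
 * shadowed == false: interrupts are re-enabled by POPF, which has no
 *   interrupt shadow, so the next IRQ can be taken before RET pops the
 *   return address. That word stays on the stack under the new frame.
 * shadowed == true: interrupts are re-enabled by STI, whose shadow
 *   delays delivery until after the following RET has executed, so the
 *   return address is gone before the next IRQ is taken.
 *
 * Returns the number of stack words still in use after `irqs`
 * back-to-back interrupts.
 */
static unsigned long stack_words_left_after_storm(unsigned long irqs,
                                                  bool shadowed)
{
    unsigned long words = 0;

    for (unsigned long i = 0; i < irqs; i++) {
        words += 1;          /* return address waiting for RET */
        if (shadowed)
            words -= 1;      /* RET ran before the next IRQ arrived */
    }
    return words;
}
```

In this model the unshadowed path grows linearly with the number of IRQs,
while the sti;ret path stays flat, which is the point of Andy's suggestion.
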