From: Joerg Roedel <joro@8bytes.org>
To: Andy Lutomirski <luto@kernel.org>
Cc: kvm list <kvm@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Linux Virtualization <virtualization@lists.linux-foundation.org>,
Arvind Sankar <nivedita@alum.mit.edu>,
"H. Peter Anvin" <hpa@zytor.com>, Jiri Slaby <jslaby@suse.cz>,
X86 ML <x86@kernel.org>, David Rientjes <rientjes@google.com>,
Martin Radev <martin.b.radev@gmail.com>,
Tom Lendacky <thomas.lendacky@amd.com>,
Joerg Roedel <jroedel@suse.de>, Kees Cook <keescook@chromium.org>,
Cfir Cohen <cfir@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Juergen Gross <jgross@suse.com>, Mike Stunes <mstunes@vmware.com>,
Sean Christopherson <seanjc@google.com>,
LKML <linux-kernel@vger.kernel.org>,
stable <stable@vger.kernel.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Erdem Aktas <erdemaktas@google.com>
Subject: Re: [PATCH 2/3] x86/sev-es: Check if regs->sp is trusted before adjusting #VC IST stack
Date: Fri, 19 Feb 2021 12:05:49 +0100 [thread overview]
Message-ID: <20210219110549.GI7302@8bytes.org> (raw)
In-Reply-To: <CALCETrUaOLwO51Js+OGNY03aep8BHoncZKTMr8sG1guUhLk40A@mail.gmail.com>
On Thu, Feb 18, 2021 at 04:28:36PM -0800, Andy Lutomirski wrote:
> On Thu, Feb 18, 2021 at 11:21 AM Joerg Roedel <jroedel@suse.de> wrote:
> Can you give me an example, even artificial, in which the linked-list
> logic is useful?
So here we go, its of course artificial, but still:
1. #VC happens, not important where
2. NMI in the #VC prologue before it moved off its IST stack
- first VC IST adjustment happening here
3. #VC in the NMI handler
4. #HV in the #VC prologue again
- second VC IST adjustment happening here, so the #HV handler
can cause its own #VC exceptions.
Can only happen if the #HV handler is allowed to cause #VC exceptions.
But even if its not allowed, it can happen with SNP and a malicious
Hypervisor. But in this case the only option is to reliably panic.
> Can you explain your reasoning in considering the entry stack unsafe?
> It's 4k bytes these days.
I wasn't aware that it is 4k in size now. I still thought it was just
these 64 words large and one can not simply execute C code on it.
> You forgot about entry_SYSCALL_compat.
Right, thanks for pointing this out.
> Your 8-byte alignment is confusing to me. In valid kernel code, SP
> should be 8-byte-aligned already, and, if you're trying to match
> architectural behavior, the CPU aligns to 16 bytes.
Yeah, I was just being cautious. The explicit alignment can be removed,
Boris also pointed this out.
> We're not robust against #VC, NMI in the #VC prologue before the magic
> stack switch, and a new #VC in the NMI prologue. Nor do we appear to
> have any detection of the case where #VC nests directly inside its own
> prologue. Or did I miss something else here?
No, you don't miss anything here. At the moment #VC can't happen at
those places, so this is not handled yet. With SNP it can happen and
needs to be handled in a way to at least allow a reliable panic (because
if it really happens the Hypervisor is messing with us).
> If we get NMI and get #VC in the NMI *asm*, the #VC magic stack switch
> looks like it will merrily run itself in the NMI special-stack-layout
> section, and that sounds really quite bad.
Yes, I havn't looked at the details yet, but if a #VC happens there it
probably better not returns.
> I mean that, IIRC, a malicious hypervisor can inject inappropriate
> vectors at inappropriate times if the #HV mechanism isn't enabled.
> For example, it could inject a page fault or an interrupt in a context
> in which we have the wrong GSBASE loaded.
Yes, a malicious Hypervisor can do that, and without #HV there is no
real protection against this besides turning all vectors (even IRQs)
into paranoid entries. Maybe even more care is needed, but I think its
not worth to care about this.
> But the #DB issue makes this moot. We have to use IST unless we turn
> off SCE. But I admit I'm leaning toward turning off SCE until we have
> a solution that seems convincingly robust.
Turning off SCE might be tempting, but I guess doing so would break a
quite some user-space code, no?
Regards,
Joerg
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
WARNING: multiple messages have this Message-ID (diff)
From: Joerg Roedel <joro@8bytes.org>
To: Andy Lutomirski <luto@kernel.org>
Cc: Joerg Roedel <jroedel@suse.de>, X86 ML <x86@kernel.org>,
stable <stable@vger.kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Jiri Slaby <jslaby@suse.cz>,
Dan Williams <dan.j.williams@intel.com>,
Tom Lendacky <thomas.lendacky@amd.com>,
Juergen Gross <jgross@suse.com>,
Kees Cook <keescook@chromium.org>,
David Rientjes <rientjes@google.com>,
Cfir Cohen <cfir@google.com>, Erdem Aktas <erdemaktas@google.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mike Stunes <mstunes@vmware.com>,
Sean Christopherson <seanjc@google.com>,
Martin Radev <martin.b.radev@gmail.com>,
Arvind Sankar <nivedita@alum.mit.edu>,
LKML <linux-kernel@vger.kernel.org>,
kvm list <kvm@vger.kernel.org>,
Linux Virtualization <virtualization@lists.linux-foundation.org>
Subject: Re: [PATCH 2/3] x86/sev-es: Check if regs->sp is trusted before adjusting #VC IST stack
Date: Fri, 19 Feb 2021 12:05:49 +0100 [thread overview]
Message-ID: <20210219110549.GI7302@8bytes.org> (raw)
In-Reply-To: <CALCETrUaOLwO51Js+OGNY03aep8BHoncZKTMr8sG1guUhLk40A@mail.gmail.com>
On Thu, Feb 18, 2021 at 04:28:36PM -0800, Andy Lutomirski wrote:
> On Thu, Feb 18, 2021 at 11:21 AM Joerg Roedel <jroedel@suse.de> wrote:
> Can you give me an example, even artificial, in which the linked-list
> logic is useful?
So here we go, its of course artificial, but still:
1. #VC happens, not important where
2. NMI in the #VC prologue before it moved off its IST stack
- first VC IST adjustment happening here
3. #VC in the NMI handler
4. #HV in the #VC prologue again
- second VC IST adjustment happening here, so the #HV handler
can cause its own #VC exceptions.
Can only happen if the #HV handler is allowed to cause #VC exceptions.
But even if its not allowed, it can happen with SNP and a malicious
Hypervisor. But in this case the only option is to reliably panic.
> Can you explain your reasoning in considering the entry stack unsafe?
> It's 4k bytes these days.
I wasn't aware that it is 4k in size now. I still thought it was just
these 64 words large and one can not simply execute C code on it.
> You forgot about entry_SYSCALL_compat.
Right, thanks for pointing this out.
> Your 8-byte alignment is confusing to me. In valid kernel code, SP
> should be 8-byte-aligned already, and, if you're trying to match
> architectural behavior, the CPU aligns to 16 bytes.
Yeah, I was just being cautious. The explicit alignment can be removed,
Boris also pointed this out.
> We're not robust against #VC, NMI in the #VC prologue before the magic
> stack switch, and a new #VC in the NMI prologue. Nor do we appear to
> have any detection of the case where #VC nests directly inside its own
> prologue. Or did I miss something else here?
No, you don't miss anything here. At the moment #VC can't happen at
those places, so this is not handled yet. With SNP it can happen and
needs to be handled in a way to at least allow a reliable panic (because
if it really happens the Hypervisor is messing with us).
> If we get NMI and get #VC in the NMI *asm*, the #VC magic stack switch
> looks like it will merrily run itself in the NMI special-stack-layout
> section, and that sounds really quite bad.
Yes, I havn't looked at the details yet, but if a #VC happens there it
probably better not returns.
> I mean that, IIRC, a malicious hypervisor can inject inappropriate
> vectors at inappropriate times if the #HV mechanism isn't enabled.
> For example, it could inject a page fault or an interrupt in a context
> in which we have the wrong GSBASE loaded.
Yes, a malicious Hypervisor can do that, and without #HV there is no
real protection against this besides turning all vectors (even IRQs)
into paranoid entries. Maybe even more care is needed, but I think its
not worth to care about this.
> But the #DB issue makes this moot. We have to use IST unless we turn
> off SCE. But I admit I'm leaning toward turning off SCE until we have
> a solution that seems convincingly robust.
Turning off SCE might be tempting, but I guess doing so would break a
quite some user-space code, no?
Regards,
Joerg
next prev parent reply other threads:[~2021-02-19 11:06 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-17 12:01 [PATCH 0/3] x86/sev-es: Check for trusted regs->sp in __sev_es_ist_enter() Joerg Roedel
2021-02-17 12:01 ` Joerg Roedel
2021-02-17 12:01 ` [PATCH 1/3] x86/sev-es: Introduce from_syscall_gap() helper Joerg Roedel
2021-02-17 12:01 ` Joerg Roedel
2021-02-17 17:59 ` Borislav Petkov
2021-02-17 17:59 ` Borislav Petkov
2021-02-17 12:01 ` [PATCH 2/3] x86/sev-es: Check if regs->sp is trusted before adjusting #VC IST stack Joerg Roedel
2021-02-17 12:01 ` Joerg Roedel
2021-02-17 18:00 ` Borislav Petkov
2021-02-17 18:00 ` Borislav Petkov
2021-02-17 18:09 ` Andy Lutomirski
2021-02-17 18:09 ` Andy Lutomirski
2021-02-18 11:25 ` Joerg Roedel
2021-02-18 11:25 ` Joerg Roedel
2021-02-18 17:49 ` Andy Lutomirski
2021-02-18 17:49 ` Andy Lutomirski
2021-02-18 19:21 ` Joerg Roedel
2021-02-18 19:21 ` Joerg Roedel
2021-02-19 0:28 ` Andy Lutomirski
2021-02-19 0:28 ` Andy Lutomirski
2021-02-19 11:05 ` Joerg Roedel [this message]
2021-02-19 11:05 ` Joerg Roedel
2021-02-17 12:01 ` [PATCH 3/3] x86/sev-es: Improve comments in and around __sev_es_ist_enter/exit() Joerg Roedel
2021-02-17 12:01 ` Joerg Roedel
2021-02-17 18:00 ` Borislav Petkov
2021-02-17 18:00 ` Borislav Petkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210219110549.GI7302@8bytes.org \
--to=joro@8bytes.org \
--cc=cfir@google.com \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=erdemaktas@google.com \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=jroedel@suse.de \
--cc=jslaby@suse.cz \
--cc=keescook@chromium.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=martin.b.radev@gmail.com \
--cc=mhiramat@kernel.org \
--cc=mstunes@vmware.com \
--cc=nivedita@alum.mit.edu \
--cc=peterz@infradead.org \
--cc=rientjes@google.com \
--cc=seanjc@google.com \
--cc=stable@vger.kernel.org \
--cc=thomas.lendacky@amd.com \
--cc=virtualization@lists.linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.