From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Tom Lendacky <thomas.lendacky@amd.com>, Pu Wen <puwen@hygon.cn>,
Stephen Hemminger <sthemmin@microsoft.com>,
Sasha Levin <alexander.levin@microsoft.com>,
Dirk Hohndel <dirkhh@vmware.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>,
"H. Peter Anvin" <hpa@linux.intel.com>,
Asit Mallick <asit.k.mallick@intel.com>,
Gordon Tetlow <gordon@tetlows.org>,
David Kaplan <David.Kaplan@amd.com>,
Tony Luck <tony.luck@intel.com>
Subject: Re: TDX #VE in SYSCALL gap (was: [RFD] x86: Curing the exception and syscall trainwreck in hardware)
Date: Tue, 25 Aug 2020 10:19:03 -0700
Message-ID: <20200825171903.GA20660@sjchrist-ice>
In-Reply-To: <CALCETrUP1T2k3UzZMsXMfAD83xbYEG+nAv3a-LeBjNW+=ijJAg@mail.gmail.com>

On Tue, Aug 25, 2020 at 09:49:05AM -0700, Andy Lutomirski wrote:
> On Mon, Aug 24, 2020 at 9:40 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > +Andy
> >
> > On Mon, Aug 24, 2020 at 02:52:01PM +0100, Andrew Cooper wrote:
> > > And to help with coordination, here is something prepared (slightly)
> > > earlier.
> > >
> > > https://docs.google.com/document/d/1hWejnyDkjRRAW-JEsRjA5c9CKLOPc6VKJQsuvODlQEI/edit?usp=sharing
> > >
> > > This identifies the problems from software's perspective, along with
> > > proposing behaviour which ought to resolve the issues.
> > >
> > > It is still a work-in-progress. The #VE section still needs updating in
> > > light of the publication of the recent TDX spec.
> >
> > For #VE on memory accesses in the SYSCALL gap (or NMI entry), is this
> > something we (Linux) as the guest kernel actually want to handle gracefully
> > (where gracefully means not panicking)? For TDX, a #VE in the SYSCALL gap
> > would require one of two things:
> >
> > a) The guest kernel to not accept/validate the GPA->HPA mapping for the
> > relevant pages, e.g. code or scratch data.
> >
> > b) The host VMM to remap the GPA (making the GPA->HPA pending again).
> >
> > (a) is only possible if there's a fatally buggy guest kernel (or perhaps vBIOS).
> > (b) requires either a buggy or malicious host VMM.
> >
> > I ask because, if the answer is "no, panic at will", then we shouldn't need
> > to burn an IST for TDX #VE. Exceptions won't morph to #VE and hitting an
> > instruction based #VE in the SYSCALL gap would be a CPU bug or a kernel bug.
>
> Or malicious hypervisor action, and that's a problem.
>
> Suppose the hypervisor remaps a GPA used in the SYSCALL gap (e.g. the
> actual SYSCALL text or the first memory it accesses -- I don't have a
> TDX spec so I don't know the details).

You can thank our legal department :-)

> The user does SYSCALL, the kernel hits the funny GPA, and #VE is delivered.
> The microcode will write the IRET frame, with mostly user-controlled contents,
> wherever RSP points, and RSP is also user controlled. Calling this a "panic"
> is charitable -- it's really game over against an attacker who is moderately
> clever.
>
> The kernel can't do anything about this because it's game over before
> the kernel has had the chance to execute any instructions.

Hrm, I was thinking that SMAP=1 would give the necessary protections, but
in typing that out I realized userspace can throw in an RSP value that
points at kernel memory. Duh.
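To make the failure mode concrete, here is a toy user-space model of
exception delivery without an IST stack switch. It is only a sketch: the
struct, the function name and the flat 256-byte "memory" are made up for
illustration, and the only architectural facts it relies on are that the
frame (SS, RSP, RFLAGS, CS, RIP, no error code for #VE) is pushed at the
current RSP after aligning it down to 16 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE     256   /* toy flat memory, not a real address space */
#define FRAME_QWORDS 5     /* RIP, CS, RFLAGS, RSP, SS; #VE has no error code */

/* Hypothetical model state; none of these names exist in the kernel. */
struct cpu_model {
	uint64_t rsp, rip, rflags, cs, ss;
	uint8_t  mem[MEM_SIZE];
};

/*
 * Model exception delivery with no IST and no privilege change: the frame
 * lands wherever RSP happens to point. Returns the new RSP.
 */
static uint64_t model_deliver_no_ist(struct cpu_model *c)
{
	/* Lowest address first: RIP, CS, RFLAGS, old RSP, SS. */
	uint64_t frame[FRAME_QWORDS] = { c->rip, c->cs, c->rflags, c->rsp, c->ss };
	uint64_t top = c->rsp & ~0xfULL;   /* 64-bit mode aligns the stack to 16 bytes */
	uint64_t new_rsp;

	/* Bounds check for the toy memory only; real hardware has no such guard. */
	assert(top >= sizeof(frame) && top <= MEM_SIZE);

	new_rsp = top - sizeof(frame);
	memcpy(&c->mem[new_rsp], frame, sizeof(frame));
	c->rsp = new_rsp;
	return new_rsp;
}
```

In the SYSCALL gap RSP still holds whatever userspace loaded, so in this
model the caller of model_deliver_no_ist() picks where the 40 bytes get
written, and nothing in the delivery path consults SMAP.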

One thought would be to have the TDX module (thing that runs in SEAM and
sits between the VMM and the guest) provide a TDCALL (hypercall from guest
to TDX module) to the guest that would allow the guest to specify a very
limited number of GPAs that must never generate a #VE, e.g. go straight to
guest shutdown if a disallowed GPA would go pending. That seems doable
from a TDX perspective without incurring noticeable overhead (assuming the
list of GPAs is very small) and should be easy to support in the guest,
e.g. make a TDCALL/hypercall or two during boot to protect the SYSCALL
page and its scratch data.
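
A rough sketch of the bookkeeping this would need on the TDX module side,
written as a plain C model. Everything here is hypothetical: no such TDCALL
is defined in this thread, and the names, the list size of 4 and the 4 KiB
page granularity are assumptions picked only to illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: "very limited number" of GPAs; 4 is an arbitrary choice. */
#define MAX_PROTECTED_GPAS 4
#define PAGE_MASK_4K       (~0xfffULL)   /* assumes 4 KiB granularity */

enum remap_result {
	REMAP_PENDING,         /* GPA goes pending; next guest access takes a #VE */
	REMAP_GUEST_SHUTDOWN   /* GPA was protected; shut the guest down instead */
};

struct tdx_module_model {
	uint64_t protected_gpa[MAX_PROTECTED_GPAS];
	size_t   nr_protected;
};

/* Models the proposed guest-to-module call; fails once the small list is full. */
static bool model_tdcall_protect_gpa(struct tdx_module_model *m, uint64_t gpa)
{
	assert(m != NULL);
	if (m->nr_protected >= MAX_PROTECTED_GPAS)
		return false;
	m->protected_gpa[m->nr_protected++] = gpa & PAGE_MASK_4K;
	return true;
}

/* Models the module's check when the host VMM tries to remap a GPA. */
static enum remap_result model_vmm_remap_gpa(const struct tdx_module_model *m,
					     uint64_t gpa)
{
	assert(m != NULL);
	for (size_t i = 0; i < m->nr_protected; i++) {
		if (m->protected_gpa[i] == (gpa & PAGE_MASK_4K))
			return REMAP_GUEST_SHUTDOWN;
	}
	return REMAP_PENDING;
}
```

The guest would call something like model_tdcall_protect_gpa() once or
twice during boot, and the linear scan in model_vmm_remap_gpa() stays cheap
as long as the list really is tiny.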