From: Dave Hansen <dave.hansen@intel.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
luto@kernel.org, peterz@infradead.org,
sathyanarayanan.kuppuswamy@linux.intel.com, aarcange@redhat.com,
ak@linux.intel.com, dan.j.williams@intel.com, david@redhat.com,
hpa@zytor.com, jgross@suse.com, jmattson@google.com,
joro@8bytes.org, jpoimboe@redhat.com, knsathya@kernel.org,
pbonzini@redhat.com, sdeep@vmware.com, seanjc@google.com,
tony.luck@intel.com, vkuznets@redhat.com, wanpengli@tencent.com,
thomas.lendacky@amd.com, brijesh.singh@amd.com, x86@kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCHv5 11/30] x86/tdx: Handle in-kernel MMIO
Date: Thu, 10 Mar 2022 09:53:01 -0800 [thread overview]
Message-ID: <9b2836ce-5267-8342-65eb-1084ba7e0cdf@intel.com> (raw)
In-Reply-To: <20220310164839.erpjijvxwuzjql5x@black.fi.intel.com>
On 3/10/22 08:48, Kirill A. Shutemov wrote:
> On Wed, Mar 09, 2022 at 05:06:11PM -0800, Dave Hansen wrote:
>> On 3/9/22 16:51, Kirill A. Shutemov wrote:
>>> On Tue, Mar 08, 2022 at 01:26:28PM -0800, Dave Hansen wrote:
>>>> Memory encryption has zero to do with this. The TDX isolation
>>>> mechanisms are totally discrete from memory encryption, although they
>>>> are "neighbors" of sorts.
>>>
>>> Hm. I don't see why you say encryption is not relevant. VMM (host kernel)
>>> has ultimate access to guest memory cypher text. It can read it as
>>> cypher text without any issue (using KeyID-0).
>>>
>>> Could you elaborate on the point?
>>
>> I think you're just confusing what TDX has with MKTME. The whitepaper says:
>>
>>> The TD-bit associated with the line in memory seeks to
>>> detect software or devices attempting to read memory
>>> encrypted with private KeyID, using a shared KeyID, to reveal
>>> the ciphertext. On such accesses, the MKTME returns a fixed
>>> pattern to prevent ciphertext analysis.
>>
>> I think several firstborn were sacrificed to get that bit. Let's not
>> forget why we have it. :)
>
> Okay, I missed the memo. I will drop reference to encryption:
>
> - The CPU disallows software other than the TDX module and TDs from
> making memory accesses using the private key. Without the correct
> key VMM has no way to access TD-private memory.
I think this is good enough:
- All guest code is expected to be in TD-private memory. Being
private to the TD, VMMs have no way to access TD-private memory and
no way to read the instruction to decode and emulate it.
We don't have to rehash what private memory is or how it is implemented.
>> This problem was laid out as having three cases:
>> 1. virtio
>> 2. x86-specific drivers
>> 3. random drivers (everything else)
>>
>> #1 could be done with paravirt
>> #2 is unspecified and unknown
>> #3 use doesn't as far as I know exist in TDX guests today
>
> #2 doesn't matter from performance point of view and there is no
> convenient place where they can be intercepted as they are scattered
> across the tree. Patching them doesn't bring any benefit, only pain.
I'd feel a lot better if this was slightly better specified. Even
booting with a:
printf("rip: %lx\n", regs->rip);
in the #VE handler would give some hard data about these. This still
feels to me like something that Sean got booting two years ago and
nobody has really reconsidered.
> #3 some customers already declared that they will use device passthough
> (yes, it is not safe). CSP may want to emulate random device, depending on
> setup. Like, a power button or something.
I'm not sure I'm totally on board with that.
But, let's try to make a coherent changelog out of that mess.
This approach is bad for performance. But, it has (virtually)
no impact on the size of the kernel image and will work for a
wide variety of drivers. This allows TDX deployments to use
arbitrary devices and device drivers, including virtio. TDX
customers have asked for the capability to use random devices in
their deployments.
In other words, even if all of the work was done to
paravirtualize all x86 MMIO users and virtio, this approach
would still be needed. There is essentially no way to get rid
of this code.
This approach is functional for all in-kernel MMIO users current
and future and does so with a minimal amount of code and kernel
image bloat.
Does that summarize it?
>>> BUG() here makes it clear that the handler itself is buggy. Returning
>>> false and kicking in #GP-like logic indicates that something wrong with
>>> the code that triggered #VE. I think it is an important distinction.
>>
>> OK, then how about a WARN_ON() which is followed by the #GP?
>
> You folks give mixed messages. Thomas was very unhappy when I tried to add
> code that recovers from WBINVD:
>
> https://lore.kernel.org/all/87y22uujkm.ffs@tglx
>
> It is exactly the same scenario: kernel code is buggy and has to be fixed.
>
> So, what the policy?
Lately, I've tried to subscribe to the "there is NO F*CKING EXCUSE to
knowingly kill the kernel" policy[1].
You don't add a BUG_ON() unless the kernel has no meaningful way to
continue. It's not for a "hey, that's weird..." kind of thing. Like,
"hey, the instruction decoder looks confused, that's weird."
>> Let's say insn_decode_mmio() does something insane like:
>>
>> return -EINVAL;
>>
>> Should we really be killing the kernel for that?
>
> Note that #GP is most likely kill kernel as well. We handle in-kernel
> MMIO. There are no many chances for recover.
>
> Is it really the big deal?
No, not really.
But, I'd like to see one of two things:
1. Change the BUG()s to WARN()s.
2. Make it utterly clear that handle_mmio() is for handling kernel MMIO
only. Definitely change the naming, possibly add a check for
user_mode(). In other words, make it even _less_ generic.
None of that should be hard.
BTW, the BUG()s made me think about how the gp_try_fixup_and_notify()
code would work for MMIO. For instance, are there any places where
fixup might be done for MMIO? If so, an earlier BUG() wouldn't allow
the fixup to occur.
Do we *WANT* #VE's to be exposed to the #GP fixup machinery?
1.
https://lore.kernel.org/all/CA+55aFwyNTLuZgOWMTRuabWobF27ygskuxvFd-P0n-3UNT=0Og@mail.gmail.com/
next prev parent reply other threads:[~2022-03-10 17:53 UTC|newest]
Thread overview: 84+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-02 14:27 [PATCHv5 00/30] TDX Guest: TDX core support Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 01/30] x86/tdx: Detect running as a TDX guest in early boot Kirill A. Shutemov
2022-03-04 15:43 ` Borislav Petkov
2022-03-04 15:47 ` Dave Hansen
2022-03-04 16:02 ` Borislav Petkov
2022-03-07 22:24 ` [PATCHv5.1 " Kirill A. Shutemov
2022-03-09 18:22 ` Borislav Petkov
2022-03-02 14:27 ` [PATCHv5 02/30] x86/tdx: Provide common base for SEAMCALL and TDCALL C wrappers Kirill A. Shutemov
2022-03-08 19:56 ` Dave Hansen
2022-03-10 12:32 ` Borislav Petkov
2022-03-10 14:44 ` Kirill A. Shutemov
2022-03-10 14:51 ` Borislav Petkov
2022-03-02 14:27 ` [PATCHv5 03/30] x86/tdx: Add __tdx_module_call() and __tdx_hypercall() helper functions Kirill A. Shutemov
2022-03-08 20:03 ` Dave Hansen
2022-03-10 15:30 ` Borislav Petkov
2022-03-10 21:20 ` Kirill A. Shutemov
2022-03-10 21:48 ` Kirill A. Shutemov
2022-03-15 15:56 ` Borislav Petkov
2022-03-12 10:41 ` Borislav Petkov
2022-03-02 14:27 ` [PATCHv5 04/30] x86/tdx: Extend the confidential computing API to support TDX guests Kirill A. Shutemov
2022-03-08 20:17 ` Dave Hansen
2022-03-09 16:01 ` [PATCHv5.1 " Kirill A. Shutemov
2022-03-09 18:36 ` Dave Hansen
2022-03-09 23:51 ` [PATCHv5.2 " Kirill A. Shutemov
2022-03-10 0:07 ` Dave Hansen
2022-03-15 19:41 ` Borislav Petkov
2022-03-02 14:27 ` [PATCHv5 05/30] x86/tdx: Exclude shared bit from __PHYSICAL_MASK Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 06/30] x86/traps: Refactor exc_general_protection() Kirill A. Shutemov
2022-03-08 20:18 ` Dave Hansen
2022-03-02 14:27 ` [PATCHv5 07/30] x86/traps: Add #VE support for TDX guest Kirill A. Shutemov
2022-03-08 20:29 ` Dave Hansen
2022-03-02 14:27 ` [PATCHv5 08/30] x86/tdx: Add HLT support for TDX guests Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 09/30] x86/tdx: Add MSR " Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 10/30] x86/tdx: Handle CPUID via #VE Kirill A. Shutemov
2022-03-08 20:33 ` Dave Hansen
2022-03-09 16:15 ` [PATCH] " Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 11/30] x86/tdx: Handle in-kernel MMIO Kirill A. Shutemov
2022-03-08 21:26 ` Dave Hansen
2022-03-10 0:51 ` Kirill A. Shutemov
2022-03-10 1:06 ` Dave Hansen
2022-03-10 16:48 ` Kirill A. Shutemov
2022-03-10 17:53 ` Dave Hansen [this message]
2022-03-11 17:18 ` Kirill A. Shutemov
2022-03-11 17:22 ` Dave Hansen
2022-03-11 18:01 ` Dave Hansen
2022-03-02 14:27 ` [PATCHv5 12/30] x86/tdx: Detect TDX at early kernel decompression time Kirill A. Shutemov
2022-03-07 22:27 ` [PATCHv5.1 " Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 13/30] x86: Adjust types used in port I/O helpers Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 14/30] x86: Consolidate " Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 15/30] x86/boot: Port I/O: allow to hook up alternative helpers Kirill A. Shutemov
2022-03-02 17:42 ` Josh Poimboeuf
2022-03-02 19:41 ` Dave Hansen
2022-03-02 20:02 ` Josh Poimboeuf
2022-03-02 14:27 ` [PATCHv5 16/30] x86/boot: Port I/O: add decompression-time support for TDX Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 17/30] x86/tdx: Port I/O: add runtime hypercalls Kirill A. Shutemov
2022-03-08 21:30 ` Dave Hansen
2022-03-02 14:27 ` [PATCHv5 18/30] x86/tdx: Port I/O: add early boot support Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 19/30] x86/tdx: Wire up KVM hypercalls Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 20/30] x86/boot: Add a trampoline for booting APs via firmware handoff Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 21/30] x86/acpi, x86/boot: Add multiprocessor wake-up support Kirill A. Shutemov
2022-03-02 14:27 ` [PATCHv5 22/30] x86/boot: Set CR0.NE early and keep it set during the boot Kirill A. Shutemov
2022-03-08 21:37 ` Dave Hansen
2022-03-02 14:27 ` [PATCHv5 23/30] x86/boot: Avoid #VE during boot for TDX platforms Kirill A. Shutemov
2022-03-07 9:29 ` Xiaoyao Li
2022-03-07 22:33 ` Kirill A. Shutemov
2022-03-08 1:19 ` Xiaoyao Li
2022-03-08 16:41 ` Kirill A. Shutemov
2022-03-07 22:36 ` [PATCHv5.1 " Kirill A. Shutemov
2022-03-02 14:28 ` [PATCHv5 24/30] x86/topology: Disable CPU online/offline control for TDX guests Kirill A. Shutemov
2022-03-02 14:28 ` [PATCHv5 25/30] x86/tdx: Make pages shared in ioremap() Kirill A. Shutemov
2022-03-08 22:02 ` Dave Hansen
2022-03-02 14:28 ` [PATCHv5 26/30] x86/mm/cpa: Add support for TDX shared memory Kirill A. Shutemov
2022-03-09 19:44 ` Dave Hansen
2022-03-02 14:28 ` [PATCHv5 27/30] x86/kvm: Use bounce buffers for TD guest Kirill A. Shutemov
2022-03-09 20:07 ` Dave Hansen
2022-03-10 14:29 ` Tom Lendacky
2022-03-10 14:51 ` Christoph Hellwig
2022-03-02 14:28 ` [PATCHv5 28/30] x86/tdx: ioapic: Add shared bit for IOAPIC base address Kirill A. Shutemov
2022-03-09 20:39 ` Dave Hansen
2022-03-02 14:28 ` [PATCHv5 29/30] ACPICA: Avoid cache flush inside virtual machines Kirill A. Shutemov
2022-03-02 16:13 ` Dan Williams
2022-03-09 20:56 ` Dave Hansen
2022-03-02 14:28 ` [PATCHv5 30/30] Documentation/x86: Document TDX kernel architecture Kirill A. Shutemov
2022-03-09 21:49 ` Dave Hansen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9b2836ce-5267-8342-65eb-1084ba7e0cdf@intel.com \
--to=dave.hansen@intel.com \
--cc=aarcange@redhat.com \
--cc=ak@linux.intel.com \
--cc=bp@alien8.de \
--cc=brijesh.singh@amd.com \
--cc=dan.j.williams@intel.com \
--cc=david@redhat.com \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=jpoimboe@redhat.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=knsathya@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=sdeep@vmware.com \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
--cc=tony.luck@intel.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox