kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	 "thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"dionnaglaze@google.com" <dionnaglaze@google.com>,
	 Binbin Wu <binbin.wu@intel.com>,
	 "kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	 "pbonzini@redhat.com" <pbonzini@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	 "vkuznets@redhat.com" <vkuznets@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"jgross@suse.com" <jgross@suse.com>,
	 "pgonda@google.com" <pgonda@google.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX
Date: Mon, 3 Feb 2025 16:27:02 -0800	[thread overview]
Message-ID: <Z6Fe1nFWv52rDijx@google.com> (raw)
In-Reply-To: <0102090cd553e42709f43c30d2982b2c713e1a68.camel@intel.com>

On Mon, Feb 03, 2025, Rick P Edgecombe wrote:
> On Mon, 2025-02-03 at 12:33 -0800, Sean Christopherson wrote:
> > > Since there is no upstream KVM TDX support yet, why isn't it an option to
> > > still
> > > revert the EDKII commit too? It was a relatively recent change.
> > 
> > I'm fine with that route too, but it too is a band-aid.  Relying on the
> > *untrusted*
> > hypervisor to essentially communicate memory maps is not a winning strategy. 
> > 
> > > To me it seems that the normal KVM MTRR support is not ideal, because it is
> > > still lying about what it is doing. For example, in the past there was an
> > > attempt to use UC to prevent speculative execution accesses to sensitive
> > > data.
> > > The KVM MTRR support only happens to work with existing guests, but not all
> > > possible MTRR usages.
> > > 
> > > Since diverging from the architecture creates loose ends like that, we could
> > > instead define some other way for EDKII to communicate the ranges to the
> > > kernel.
> > > Like some simple KVM PV MSRs that are for communication only, and not
> > 
> > Hard "no" to any PV solution.  This isn't KVM specific, and as above, bouncing
> > through the hypervisor to communicate information within the guest is asinine,
> > especially for CoCo VMs.
> 
> Hmm, right.
> 
> So the other options could be:
> 
> 1. Some TDX module feature to hold the ranges:
>  - Con: Not shared with AMD
> 
> 2. Re-use MTRRs for the communication, revert changes in guest and edk2:

Thinking more about how EDK2 is consumed downstream, I think reverting the EDK2
changes is necessary regardless of what happens in the kernel.  Or at the least,
somehow communicate to EDK2 users that ingesting those changes is a bad idea
unless the kernel has also been updated.

AFAIK, Bring Your Own Firmware[*] isn't widely adopted, which means that the CSP
is shipping the firmware.  And shipping OVMF/EDK2 with the "ignores MTRRs" code
will cause problems for guests without commit 8e690b817e38 ("x86/kvm: Override
default caching mode for SEV-SNP and TDX").  Since the host doesn't control the
guest kernel, there's no way to know if deploying those EDK2 changes is safe.
 
[*] https://kvm-forum.qemu.org/2024/BYOF_-_KVM_Forum_2024_iWTioIP.pdf

>  - Con: Creating more half support, when it's technically not required
>  - Con: Still bouncing through the hypervisor

I assume by "Re-use MTRRs for the communication" you also mean updating the guest
to address the "everything is UC!" flaw, otherwise another con is:

   - Con: Doesn't address the performance issue with TDX guests "using" UC
          memory by default (unless there's yet more enabled).

Presumably that can be accomplished by simply skipping the CR0.CD toggling, and
doing MTRR stuff as nonrmal?

>  - Pro: Design and code is clear
>
> 3. Create some new architectural definition, like a bit that means "MTRRs don't
> actually work:
>  - Con: Takes a long time, need to get agreement
>  - Con: Still bouncing through the hypervisor

Not for KVM guests.  As I laid out in my bug report, it's safe to assume MTRRs
don't actually affect the memory type when running under KVM.

FWIW, PAT doesn't "work" on most KVM Intel setups either, because of misguided
KVM code that resulted in "Ignore Guest PAT" being set in all EPTEs for the
overwhelming majority of guests.  That's not desirable long term because it
prevents the guest from using WC (via PAT) in situations where doing so is needed
for performance and/or correctness.

>  - Pro: More pure solution

MTRRs "not working" is a red herring.  The problem isn't that MTRRs don't work,
it's that the kernel is (somewhat unknowingly) using MTRRs as a crutch to get the
desired memtype for devices.  E.g. for emulated MMIO, MTRRs _can't_ be virtualized,
because there's never a valid mapping, i.e. there is no physical memory and thus
no memtype.  In other words, under KVM guests (and possibly other hypervisors),
MTRRs end up being nothing more than a communication channel between guest firmware
and the kernel.

The gap for CoCo VMs is that using MTRRs is undesirable because they are controlled
by the untrusted host.  But that's largely a future problem, unless someone has a
clever way to fix the kernel mess.

> 4. Do this series:
>  - Pro: Looks ok to me
>  - Cons: As explained in the patches, it's a bit hacky.
>  - Cons: Could there be more cases than the legacy PCI hole?
> 
> I would kind of like to see something like 3, but 2 or 4 seem the only feasible
> ones if we want to resolve this soon.

  reply	other threads:[~2025-02-04  0:27 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-01  0:50 [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX Sean Christopherson
2025-02-01  0:50 ` [PATCH 1/2] x86/mtrr: Return success vs. "failure" from guest_force_mtrr_state() Sean Christopherson
2025-02-01  0:50 ` [PATCH 2/2] x86/kvm: Override low memory above TOLUD to WB when MTRRs are forced WB Sean Christopherson
2025-02-01 14:25 ` [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX Dionna Amalie Glaze
2025-02-03 18:14 ` Edgecombe, Rick P
2025-02-03 20:33   ` Sean Christopherson
2025-02-03 23:01     ` Edgecombe, Rick P
2025-02-04  0:27       ` Sean Christopherson [this message]
2025-02-05  3:51         ` Edgecombe, Rick P
2025-02-05  7:49           ` Xu, Min M
2025-02-10 15:29         ` Binbin Wu
2025-07-08 14:24 ` Nikolay Borisov
  -- strict thread matches above, loose matches on Subject: below --
2025-07-09 16:54 Jianxiong Gao
2025-07-14  9:06 ` Binbin Wu
2025-07-14 11:24   ` Nikolay Borisov
2025-07-15  2:53     ` Binbin Wu
2025-07-16  9:51       ` Binbin Wu
2025-07-23 14:34     ` Sean Christopherson
2025-07-24  3:16       ` Binbin Wu
2025-07-28 15:33         ` Sean Christopherson
2025-07-30  7:34           ` Binbin Wu
2025-08-15 23:55             ` Korakit Seemakhupt
2025-08-18 11:07               ` Binbin Wu
2025-08-20  3:07             ` Vishal Annapurve
2025-08-20 10:03               ` Binbin Wu
2025-08-20 11:13                 ` Binbin Wu
2025-08-20 17:56                   ` Sean Christopherson
2025-08-21  3:30                     ` Binbin Wu
2025-08-21  5:23                       ` Binbin Wu
2025-08-21  6:02                         ` Jürgen Groß
2025-08-21 15:27                         ` Sean Christopherson
2025-08-28  0:07                           ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z6Fe1nFWv52rDijx@google.com \
    --to=seanjc@google.com \
    --cc=binbin.wu@intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dionnaglaze@google.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pgonda@google.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).