All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"x86@kernel.org" <x86@kernel.org>,
	 "mingo@redhat.com" <mingo@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	 "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"pgonda@google.com" <pgonda@google.com>,
	 "jgross@suse.com" <jgross@suse.com>,
	Binbin Wu <binbin.wu@intel.com>,
	 "vkuznets@redhat.com" <vkuznets@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 "kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	 "dionnaglaze@google.com" <dionnaglaze@google.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "thomas.lendacky@amd.com" <thomas.lendacky@amd.com>
Subject: Re: [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX
Date: Mon, 3 Feb 2025 12:33:04 -0800	[thread overview]
Message-ID: <Z6EoAAHn4d_FujZa@google.com> (raw)
In-Reply-To: <dbbfa3f1d16a3ab60523f5c21d857e0fcbc65e52.camel@intel.com>

On Mon, Feb 03, 2025, Rick P Edgecombe wrote:
> On Fri, 2025-01-31 at 16:50 -0800, Sean Christopherson wrote:
> > Attempt to hack around the SNP/TDX guest MTRR disaster by hijacking
> > x86_platform.is_untracked_pat_range() to force the legacy PCI hole, i.e.
> > memory from TOLUD => 4GiB, as unconditionally writeback.
> > 
> > TDX in particular has created an impossible situation with MTRRs.  Because
> > TDX disallows toggling CR0.CD, TDX enabling decided the easiest solution
> > was to ignore MTRRs entirely (because omitting CR0.CD write is obviously
> > too simple).
> > 
> > Unfortunately, under KVM at least, the kernel subtly relies on MTRRs to
> > make ACPI play nice with device drivers.  ACPI tries to map ranges it finds
> > as WB, which in turn prevents device drivers from mapping device memory as
> > WC/UC-.
> > 
> > For the record, I hate this hack.  But it's the safest approach I can come
> > up with.  E.g. forcing ioremap() to always use WB scares me because it's
> > possible, however unlikely, that the kernel could try to map non-emulated
> > memory (that is presented as MMIO to the guest) as WC/UC-, and silently
> > forcing those mappings to WB could do weird things.
> > 
> > My initial thought was to effectively revert the offending commit and
> > skip the cache disabling/enabling, i.e. the problematic CR0.CD toggling,
> > but unfortunately OVMF/EDKII has also added code to skip MTRR setup. :-(
> 
> Oof. The missing context in 8e690b817e38 ("x86/kvm: Override default caching
> mode for SEV-SNP and TDX"), is that since it is impossible to virtualize MTRRs
> on TDX private memory (in the old way KVM used to do it) and there was no
> upstream support yet, there looked like an opportunity to avoid strange "happens
> to work" support that normal VMs ended up with. Instead KVM could just not
> support TDX KVM MTRRs from the beginning. So part of the thinking was that we
> could drop all TDX KVM MTRR hacks. (which I guess turned out to be wrong).
> 
> Since there is no upstream KVM TDX support yet, why isn't it an option to still
> revert the EDKII commit too? It was a relatively recent change.

I'm fine with that route too, but it too is a band-aid.  Relying on the *untrusted*
hypervisor to essentially communicate memory maps is not a winning strategy. 

> To me it seems that the normal KVM MTRR support is not ideal, because it is
> still lying about what it is doing. For example, in the past there was an
> attempt to use UC to prevent speculative execution accesses to sensitive data.
> The KVM MTRR support only happens to work with existing guests, but not all
> possible MTRR usages.
> 
> Since diverging from the architecture creates loose ends like that, we could
> instead define some other way for EDKII to communicate the ranges to the kernel.
> Like some simple KVM PV MSRs that are for communication only, and not

Hard "no" to any PV solution.  This isn't KVM specific, and as above, bouncing
through the hypervisor to communicate information within the guest is asinine,
especially for CoCo VMs.

> overlapping with architecture that expects to cause memory behavior. EDKII and
> the kernel could be changed to use them.

  reply	other threads:[~2025-02-03 20:33 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-01  0:50 [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX Sean Christopherson
2025-02-01  0:50 ` [PATCH 1/2] x86/mtrr: Return success vs. "failure" from guest_force_mtrr_state() Sean Christopherson
2025-02-01  0:50 ` [PATCH 2/2] x86/kvm: Override low memory above TOLUD to WB when MTRRs are forced WB Sean Christopherson
2025-02-01 14:25 ` [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX Dionna Amalie Glaze
2025-02-03 18:14 ` Edgecombe, Rick P
2025-02-03 20:33   ` Sean Christopherson [this message]
2025-02-03 23:01     ` Edgecombe, Rick P
2025-02-04  0:27       ` Sean Christopherson
2025-02-05  3:51         ` Edgecombe, Rick P
2025-02-05  7:49           ` Xu, Min M
2025-02-10 15:29         ` Binbin Wu
2025-07-08 14:24 ` Nikolay Borisov
  -- strict thread matches above, loose matches on Subject: below --
2025-07-09 16:54 Jianxiong Gao
2025-07-14  9:06 ` Binbin Wu
2025-07-14 11:24   ` Nikolay Borisov
2025-07-15  2:53     ` Binbin Wu
2025-07-16  9:51       ` Binbin Wu
2025-07-23 14:34     ` Sean Christopherson
2025-07-24  3:16       ` Binbin Wu
2025-07-28 15:33         ` Sean Christopherson
2025-07-30  7:34           ` Binbin Wu
2025-08-15 23:55             ` Korakit Seemakhupt
2025-08-18 11:07               ` Binbin Wu
2025-08-20  3:07             ` Vishal Annapurve
2025-08-20 10:03               ` Binbin Wu
2025-08-20 11:13                 ` Binbin Wu
2025-08-20 17:56                   ` Sean Christopherson
2025-08-21  3:30                     ` Binbin Wu
2025-08-21  5:23                       ` Binbin Wu
2025-08-21  6:02                         ` Jürgen Groß
2025-08-21 15:27                         ` Sean Christopherson
2025-08-28  0:07                           ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z6EoAAHn4d_FujZa@google.com \
    --to=seanjc@google.com \
    --cc=binbin.wu@intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dionnaglaze@google.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pgonda@google.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.