public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Chao Gao <chao.gao@intel.com>
Cc: Jim Mattson <jmattson@google.com>,
	Tao Su <tao1.su@linux.intel.com>,
	kvm@vger.kernel.org,  pbonzini@redhat.com, eddie.dong@intel.com,
	xiaoyao.li@intel.com,  yuan.yao@linux.intel.com,
	yi1.lai@intel.com, xudong.hao@intel.com,  chao.p.peng@intel.com
Subject: Re: [PATCH 1/2] x86: KVM: Limit guest physical bits when 5-level EPT is unsupported
Date: Tue, 19 Dec 2023 07:26:00 -0800	[thread overview]
Message-ID: <ZYG2CDRFlq50siec@google.com> (raw)
In-Reply-To: <ZYFPsISS9K867BU5@chao-email>

On Tue, Dec 19, 2023, Chao Gao wrote:
> On Mon, Dec 18, 2023 at 07:40:11PM -0800, Jim Mattson wrote:
> >Honestly, I think KVM should just disable EPT if the EPT tables can't
> >support the CPU's physical address width.
> 
> Yes, it is an option.
> But I prefer to allow admin to override this (i.e., admin still can enable EPT
> via module parameter) because those issues are not new and disabling EPT
> doesn't prevent QEMU from launching guests w/ smaller MAXPHYADDR.
> 
> >> Here nothing visible to selftests or QEMU indicates that guest.MAXPHYADDR = 52
> >> is invalid/incorrect. how can we say selftests are at fault and we should fix
> >> them?
> >
> >In this case, the CPU is at fault, and you should complain to the CPU vendor.
> 
> Yeah, I agree with you and will check with related team inside Intel.

I agree that the CPU is being weird, but this is technically an architecturally
legal configuration, and KVM has largely committed to supporting weird setups.
At some point we have to draw a line when things get too ridiculous, but I don't
think this particular oddity crosses into absurd territory.

> My point was just this isn't a selftest issue because not all information is
> disclosed to the tests.

Ah, right, EPT capabilities are in MSRs that userspace can't read.

> And I am afraid KVM as L1 VMM may run into this situation, i.e., only 4-level
> EPT is supported but MAXPHYADDR is 52. So, KVM needs a fix anyway.

Yes, but forcing emulation for a funky setup is not a good fix.  KVM can simply
constrain the advertised MAXPHYADDR, no? 

---
 arch/x86/kvm/cpuid.c   | 17 +++++++++++++----
 arch/x86/kvm/mmu.h     |  2 ++
 arch/x86/kvm/mmu/mmu.c |  5 +++++
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 294e5bd5f8a0..5c346e1a10bd 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1233,12 +1233,21 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		 *
 		 * If TDP is enabled but an explicit guest MAXPHYADDR is not
 		 * provided, use the raw bare metal MAXPHYADDR as reductions to
-		 * the HPAs do not affect GPAs.
+		 * the HPAs do not affect GPAs.  Finally, if TDP is enabled and
+		 * doesn't support 5-level paging, cap guest MAXPHYADDR at 48
+		 * bits as KVM can't install SPTEs for larger GPAs.
 		 */
-		if (!tdp_enabled)
+		if (!tdp_enabled) {
 			g_phys_as = boot_cpu_data.x86_phys_bits;
-		else if (!g_phys_as)
-			g_phys_as = phys_as;
+		} else {
+			u8 max_tdp_level = kvm_mmu_get_max_tdp_level();
+
+			if (!g_phys_as)
+				g_phys_as = phys_as;
+
+			if (max_tdp_level < 5)
+				g_phys_as = min(g_phys_as, 48);
+		}
 
 		entry->eax = g_phys_as | (virt_as << 8);
 		entry->ecx &= ~(GENMASK(31, 16) | GENMASK(11, 8));
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 60f21bb4c27b..b410a227c601 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -100,6 +100,8 @@ static inline u8 kvm_get_shadow_phys_bits(void)
 	return boot_cpu_data.x86_phys_bits;
 }
 
+u8 kvm_mmu_get_max_tdp_level(void);
+
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask);
 void kvm_mmu_set_me_spte_mask(u64 me_value, u64 me_mask);
 void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3c844e428684..b2845f5520b3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5267,6 +5267,11 @@ static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
 	return max_tdp_level;
 }
 
+u8 kvm_mmu_get_max_tdp_level(void)
+{
+	return tdp_root_level ? tdp_root_level : max_tdp_level;
+}
+
 static union kvm_mmu_page_role
 kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
 				union kvm_cpu_role cpu_role)

base-commit: f2a3fb7234e52f72ff4a38364dbf639cf4c7d6c6
-- 

  reply	other threads:[~2023-12-19 15:26 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-18 14:05 [PATCH 0/2] x86: KVM: Limit guest physical bits when 5-level EPT is unsupported Tao Su
2023-12-18 14:05 ` [PATCH 1/2] " Tao Su
2023-12-18 15:13   ` Sean Christopherson
2023-12-19  2:51     ` Chao Gao
2023-12-19  3:40       ` Jim Mattson
2023-12-19  8:09         ` Chao Gao
2023-12-19 15:26           ` Sean Christopherson [this message]
2023-12-20  7:16             ` Xiaoyao Li
2023-12-20 15:37               ` Sean Christopherson
2023-12-20 11:59             ` Tao Su
2023-12-20 13:39             ` Jim Mattson
2023-12-19  8:31     ` Tao Su
2023-12-20 16:28   ` Sean Christopherson
2023-12-21  7:45     ` Tao Su
2023-12-21  8:19     ` Xu Yilun
2024-01-02 23:24       ` Sean Christopherson
2024-01-03  0:34         ` Jim Mattson
2024-01-03 18:04           ` Sean Christopherson
2024-01-04  2:45             ` Chao Gao
2024-01-04  3:40               ` Jim Mattson
2024-01-04  4:34                 ` Jim Mattson
2024-01-04 11:56                   ` Tao Su
2024-01-04 14:03                     ` Jim Mattson
2024-01-04 15:07                 ` Chao Gao
2024-01-04 17:02                   ` Jim Mattson
2024-01-05 20:26                     ` Sean Christopherson
2024-01-08 13:45                       ` Tao Su
2024-01-08 15:29                         ` Sean Christopherson
2023-12-18 14:05 ` [PATCH 2/2] x86: KVM: Emulate instruction when GPA can't be translated by EPT Tao Su
2023-12-18 15:23   ` Sean Christopherson
2023-12-19  3:10     ` Chao Gao
2023-12-20 13:42   ` Jim Mattson
2024-01-08 13:48     ` Tao Su
2024-01-08 15:19       ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZYG2CDRFlq50siec@google.com \
    --to=seanjc@google.com \
    --cc=chao.gao@intel.com \
    --cc=chao.p.peng@intel.com \
    --cc=eddie.dong@intel.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=tao1.su@linux.intel.com \
    --cc=xiaoyao.li@intel.com \
    --cc=xudong.hao@intel.com \
    --cc=yi1.lai@intel.com \
    --cc=yuan.yao@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox