From: Sean Christopherson <seanjc@google.com>
To: Jim Mattson <jmattson@google.com>
Cc: Chao Gao <chao.gao@intel.com>,
	Xu Yilun <yilun.xu@linux.intel.com>,
	 Tao Su <tao1.su@linux.intel.com>,
	kvm@vger.kernel.org, pbonzini@redhat.com,  eddie.dong@intel.com,
	xiaoyao.li@intel.com, yuan.yao@linux.intel.com,
	 yi1.lai@intel.com, xudong.hao@intel.com, chao.p.peng@intel.com
Subject: Re: [PATCH 1/2] x86: KVM: Limit guest physical bits when 5-level EPT is unsupported
Date: Fri, 5 Jan 2024 12:26:08 -0800
Message-ID: <ZZhl4FHcdrzMXoVy@google.com>
In-Reply-To: <CALMp9eTf=9VqM=xutOXmgRr+aFz-YhOz6h4B+uLgtFBXtHefPA@mail.gmail.com>

On Thu, Jan 04, 2024, Jim Mattson wrote:
> On Thu, Jan 4, 2024 at 7:08 AM Chao Gao <chao.gao@intel.com> wrote:
> >
> > On Wed, Jan 03, 2024 at 07:40:02PM -0800, Jim Mattson wrote:
> > >On Wed, Jan 3, 2024 at 6:45 PM Chao Gao <chao.gao@intel.com> wrote:
> > >>
> > >> On Wed, Jan 03, 2024 at 10:04:41AM -0800, Sean Christopherson wrote:
> > >> >On Tue, Jan 02, 2024, Jim Mattson wrote:
> > >> >> This is all just so broken and wrong. The only guest.MAXPHYADDR that
> > >> >> can be supported under TDP is the host.MAXPHYADDR. If KVM claims to
> > >> >> support a smaller guest.MAXPHYADDR, then KVM is obligated to intercept
> > >> >> every #PF,
> > >>
> > >> in this case (i.e., to support 48-bit guest.MAXPHYADDR when CPU supports only
> > >> 4-level EPT), KVM has no need to intercept #PF because accessing a GPA with
> > >> RSVD bits 51-48 set leads to EPT violation.
> > >
> > >At the completion of the page table walk, if there is a permission
> > >fault, the data address should not be accessed, so there should not be
> > >an EPT violation. Remember Meltdown?
> >
> > You are right. I missed this case. KVM needs to intercept #PF to set RSVD bit
> > in PFEC.
> 
> I have no problem with a user deliberately choosing an unsupported
> configuration, but I do have a problem with KVM_GET_SUPPORTED_CPUID
> returning an unsupported configuration.

+1

Advertising guest.MAXPHYADDR < host.MAXPHYADDR in KVM_GET_SUPPORTED_CPUID simply
isn't viable when TDP is enabled.  I suppose KVM could do so when allow_smaller_maxphyaddr
is enabled, but that's just asking for confusion, e.g. if userspace reflects the
CPUID back into the guest, it could unknowingly create a VM that depends on
allow_smaller_maxphyaddr.

I think the least awful option is to have the kernel expose whether or not the
CPU supports 5-level EPT to userspace.  That doesn't even require new uAPI per se,
just a new flag in /proc/cpuinfo.  It'll be a bit gross for userspace to parse,
but it's not the end of the world.  Alternatively, KVM could add a capability to
enumerate the max *addressable* GPA, but userspace would still need to manually
take action when KVM can't address all of memory, i.e. a capability would be less
ugly, but wouldn't meaningfully change userspace's responsibilities.

I.e.

diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
index c6a7eed03914..266daf5b5b84 100644
--- a/arch/x86/include/asm/vmxfeatures.h
+++ b/arch/x86/include/asm/vmxfeatures.h
@@ -25,6 +25,7 @@
 #define VMX_FEATURE_EPT_EXECUTE_ONLY   ( 0*32+ 17) /* "ept_x_only" EPT entries can be execute only */
 #define VMX_FEATURE_EPT_AD             ( 0*32+ 18) /* EPT Accessed/Dirty bits */
 #define VMX_FEATURE_EPT_1GB            ( 0*32+ 19) /* 1GB EPT pages */
+#define VMX_FEATURE_EPT_5LEVEL         ( 0*32+ 20) /* 5-level EPT paging */
 
 /* Aggregated APIC features 24-27 */
 #define VMX_FEATURE_FLEXPRIORITY       ( 0*32+ 24) /* TPR shadow + virt APIC */
diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
index 03851240c3e3..1640ae76548f 100644
--- a/arch/x86/kernel/cpu/feat_ctl.c
+++ b/arch/x86/kernel/cpu/feat_ctl.c
@@ -72,6 +72,8 @@ static void init_vmx_capabilities(struct cpuinfo_x86 *c)
                c->vmx_capability[MISC_FEATURES] |= VMX_F(EPT_AD);
        if (ept & VMX_EPT_1GB_PAGE_BIT)
                c->vmx_capability[MISC_FEATURES] |= VMX_F(EPT_1GB);
+       if (ept & VMX_EPT_PAGE_WALK_5_BIT)
+               c->vmx_capability[MISC_FEATURES] |= VMX_F(EPT_5LEVEL);
 
        /* Synthetic APIC features that are aggregates of multiple features. */
        if ((c->vmx_capability[PRIMARY_CTLS] & VMX_F(VIRTUAL_TPR)) &&
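
For illustration only, the userspace side of the /proc/cpuinfo approach might
look roughly like the sketch below.  It assumes the new bit shows up on the
"vmx flags" line as "ept_5level" (the lowercased define name; the diff above
does not spell out a string), that host.MAXPHYADDR can be read from the
"address sizes" line, and that 4-level and 5-level EPT can address 48 and 57
bits of GPA respectively.  All function names are hypothetical.

```c
/* Userspace sketch, not kernel code.  Names and flag string are assumptions. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if @flag appears as a whole whitespace-delimited token in @line. */
bool line_has_flag(const char *line, const char *flag)
{
	const char *p = line;
	size_t len;

	assert(flag && *flag);
	len = strlen(flag);

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' ||
			    p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;
	}
	return false;
}

/* Scan a cpuinfo-style stream for @flag on any "vmx flags" line. */
bool cpuinfo_has_vmx_flag(FILE *f, const char *flag)
{
	char line[4096];

	while (fgets(line, sizeof(line), f)) {
		const char *colon;

		if (strncmp(line, "vmx flags", 9) != 0)
			continue;
		colon = strchr(line, ':');
		if (colon && line_has_flag(colon + 1, flag))
			return true;
	}
	return false;
}

/* Parse "address sizes : NN bits physical, ..."; return -1 on any other line. */
int parse_phys_bits(const char *line)
{
	int bits;

	if (sscanf(line, "address sizes : %d bits physical", &bits) == 1)
		return bits;
	return -1;
}

/*
 * Highest GPA width that EPT can map, assuming 48 bits for 4-level and
 * 57 bits for 5-level paging, capped at host.MAXPHYADDR.
 */
int max_addressable_gpa_bits(int host_phys_bits, bool ept_5level)
{
	int ept_bits = ept_5level ? 57 : 48;

	return host_phys_bits < ept_bits ? host_phys_bits : ept_bits;
}
```

A VMM would open /proc/cpuinfo, feed it to cpuinfo_has_vmx_flag() and
parse_phys_bits(), and then keep guest memory below the width returned by
max_addressable_gpa_bits() while still advertising host.MAXPHYADDR to the
guest.  That is the "manually take action" part; the sketch only covers the
detection.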

