All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Rick P Edgecombe <rick.p.edgecombe@intel.com>,
	Isaku Yamahata <isaku.yamahata@intel.com>,
	 Wei W Wang <wei.w.wang@intel.com>,
	David Skidmore <davidskidmore@google.com>,
	 Steve Rutherford <srutherford@google.com>,
	Pankaj Gupta <pankaj.gupta@amd.com>
Subject: Re: [ANNOUNCE] PUCK Notes - 2024.04.03 - TDX Upstreaming Strategy
Date: Mon, 8 Apr 2024 09:20:51 -0700	[thread overview]
Message-ID: <ZhQZYzkDPMxXe2RN@google.com> (raw)
In-Reply-To: <73b40363-1063-4cb3-b744-9c90bae900b5@intel.com>

On Sun, Apr 07, 2024, Xiaoyao Li wrote:
> On 4/6/2024 12:58 AM, Sean Christopherson wrote:
> >   - For guest MAXPHYADDR vs. GPAW, rely on KVM_GET_SUPPORTED_CPUID to enumerate
> >     the usable MAXPHYADDR[2], and simply refuse to enable TDX if the TDX Module
> >     isn't compatible.  Specifically, if MAXPHYADDR=52, 5-level paging is enabled,
> >     but the TDX-Module only allows GPAW=0, i.e. only supports 4-level paging.
> 
> So userspace can get supported GPAW from usable MAXPHYADDR, i.e.,
> CPUID(0X8000_0008).eaxx[23:16] of KVM_GET_SUPPORTED_CPUID:
>  - if usable MAXPHYADDR == 52, supported GPAW is 0 and 1.
>  - if usable MAXPHYADDR <= 48, supported GPAW is only 0.
> 
> There is another thing needs to be discussed. How does userspace configure
> GPAW for TD guest?
> 
> Currently, KVM uses CPUID(0x8000_0008).EAX[7:0] in struct
> kvm_tdx_init_vm::cpuid.entries[] of IOCTL(KVM_TDX_INIT_VM) to deduce the
> GPAW:
> 
> 	int maxpa = 36;
> 	entry = kvm_find_cpuid_entry2(cpuid->entries, cpuid->nent, 0x80000008, 0);
> 	if (entry)
> 		max_pa = entry->eax & 0xff;
> 
> 	...
> 	if (!cpu_has_vmx_ept_5levels() && max_pa > 48)
> 		return -EINVAL;
> 	if (cpu_has_vmx_ept_5levels() && max_pa > 48) {
> 		td_params->eptp_controls |= VMX_EPTP_PWL_5;
> 		td_params->exec_controls |= TDX_EXEC_CONTROL_MAX_GPAW;
> 	} else {
> 		td_params->eptp_controls |= VMX_EPTP_PWL_4;
> 	}
> 
> The code implies that KVM allows the provided CPUID(0x8000_0008).EAX[7:0] to
> be any value (when 5level ept is supported). when it > 48, configure GPAW of
> TD to 1, otherwise to 0.
> 
> However, the virtual value of CPUID(0x8000_0008).EAX[7:0] inside TD is
> always the native value of hardware (for current TDX).
> 
> So if we want to keep this behavior, we need to document it somewhere that
> CPUID(0x8000_0008).EAX[7:0] in struct kvm_tdx_init_vm::cpuid.entries[] of
> IOCTL(KVM_TDX_INIT_VM) is only for configuring GPAW, not for userspace to
> configure virtual CPUID value for TD VMs.
> 
> Another option is that, KVM doesn't allow userspace to configure
> CPUID(0x8000_0008).EAX[7:0]. Instead, it provides a gpaw field in struct
> kvm_tdx_init_vm for userspace to configure directly.
> 
> What do you prefer?

Hmm, neither.  I think the best approach is to build on Gerd's series to have KVM
select 4-level vs. 5-level based on the enumerated guest.MAXPHYADDR, not on
host.MAXPHYADDR.

With a moderate amount of refactoring, cache/compute guest_maxphyaddr as:

	static void kvm_vcpu_refresh_maxphyaddr(struct kvm_vcpu *vcpu)
	{
		struct kvm_cpuid_entry2 *best;

		best = kvm_find_cpuid_entry(vcpu, 0x80000000);
		if (!best || best->eax < 0x80000008)
			goto not_found;

		best = kvm_find_cpuid_entry(vcpu, 0x80000008);
		if (!best)
			goto not_found;

		vcpu->arch.maxphyaddr = best->eax & GENMASK(7, 0);

		if (best->eax & GENMASK(15, 8))
			vcpu->arch.guest_maxphyaddr = (best->eax & GENMASK(15, 8)) >> 8;
		else
			vcpu->arch.guest_maxphyaddr = vcpu->arch.maxphyaddr;

		return;

	not_found:
		vcpu->arch.maxphyaddr = KVM_X86_DEFAULT_MAXPHYADDR;
		vcpu->arch.guest_maxphyaddr = KVM_X86_DEFAULT_MAXPHYADDR;
	}

and then use vcpu->arch.guest_maxphyaddr instead of vcpu->arch.maxphyaddr when
selecting the TDP level.

	static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
	{
		/* tdp_root_level is architecture forced level, use it if nonzero */
		if (tdp_root_level)
			return tdp_root_level;

		/*
		* Use 5-level TDP if and only if it's useful/necessary.  Definitely a
		* more verbose comment here.
		*/
		if (max_tdp_level == 5 && vcpu->arch.guest_maxphyaddr <= 48)
			return 4;

		return max_tdp_level;
	}

The only question is whether or not the behavior needs to be opt-in via a new
capability, e.g. in case there is some weird usage where userspace enumerates
guest.MAXPHYADDR < host.MAXPHYADDR but still wants/needs 5-level paging.  I highly
doubt such a use case exists though.

I'll get Gerd's series applied, and will post a small series to implement the
above later this week.

  reply	other threads:[~2024-04-08 16:20 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-05 16:58 [ANNOUNCE] PUCK Notes - 2024.04.03 - TDX Upstreaming Strategy Sean Christopherson
2024-04-07  3:15 ` Xiaoyao Li
2024-04-08 16:20   ` Sean Christopherson [this message]
2024-04-08 17:42     ` Edgecombe, Rick P
2024-04-08 18:51       ` Sean Christopherson
2024-04-08 21:56         ` Edgecombe, Rick P
2024-04-08 22:36           ` Sean Christopherson
2024-04-08 23:46             ` Edgecombe, Rick P
2024-04-09  1:37               ` Sean Christopherson
2024-04-09 14:46                 ` Edgecombe, Rick P
2024-04-09 15:23                   ` Sean Christopherson
2024-04-09 15:49                     ` Edgecombe, Rick P
2024-04-09 16:13                       ` Xiaoyao Li
2024-04-09 16:18                         ` Xiaoyao Li
2024-04-10  1:05                           ` Huang, Kai
2024-04-09 16:26                       ` Sean Christopherson
2024-04-11  1:13                         ` Edgecombe, Rick P
2024-04-11 14:22                           ` Sean Christopherson
2024-04-11 15:16                             ` Xiaoyao Li
2024-04-11 15:26                               ` Sean Christopherson
2024-04-11 15:41                                 ` Xiaoyao Li
2024-04-11 18:52                                 ` Edgecombe, Rick P
2024-04-12  8:40                                   ` Xiaoyao Li
2024-04-12 17:39                                     ` Isaku Yamahata
2024-04-12 20:05                                       ` Edgecombe, Rick P
2024-04-15 21:04                                         ` Isaku Yamahata
2024-04-10  1:12         ` Isaku Yamahata
2024-04-10 14:03           ` Huang, Kai
2024-04-11  1:03             ` Isaku Yamahata
2024-04-11  3:46               ` Isaku Yamahata
2024-04-11 13:39                 ` Huang, Kai
2024-04-09  2:57     ` Xiaoyao Li
2024-04-09 14:01       ` Sean Christopherson
2024-04-09 14:15         ` Xiaoyao Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZhQZYzkDPMxXe2RN@google.com \
    --to=seanjc@google.com \
    --cc=davidskidmore@google.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pankaj.gupta@amd.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=srutherford@google.com \
    --cc=wei.w.wang@intel.com \
    --cc=xiaoyao.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.