Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Binbin Wu <binbin.wu@linux.intel.com>, t@google.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	pbonzini@redhat.com, chao.gao@intel.com, kai.huang@intel.com,
	David.Laight@aculab.com, robert.hu@linux.intel.com
Subject: Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
Date: Tue, 27 Jun 2023 16:40:37 -0700	[thread overview]
Message-ID: <ZJtzdftocuwTvp67@google.com> (raw)
In-Reply-To: <20230606091842.13123-4-binbin.wu@linux.intel.com>

On Tue, Jun 06, 2023, Binbin Wu wrote:
> Opportunistically use GENMASK_ULL() to define __PT_BASE_ADDR_MASK.

This are not the type of changes to do opportunstically.   Opportunstic changes
are things like fixing comment typos, dropping unnecessary semi-colons, fixing
coding styles violations, etc.

> Opportunistically use kvm_vcpu_is_legal_cr3() to check CR3 in SVM nested code,
> to provide a clear distinction b/t CR3 and GPA checks.

This *shouldn't* be an opportunsitic thing.  That you felt compelled to call it
out is a symptom of this patch doing too much.

In short, split this into three patches:

  1. Do the __PT_BASE_ADDR_MASK() changes
  2. Add and use kvm_vcpu_is_legal_cr3()
  3. Add support for CR3.LAM bits

> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
> Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Reviewed-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Chao Gao <chao.gao@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 5 +++++
>  arch/x86/kvm/cpuid.h            | 5 +++++
>  arch/x86/kvm/mmu.h              | 5 +++++
>  arch/x86/kvm/mmu/mmu.c          | 8 +++++++-
>  arch/x86/kvm/mmu/mmu_internal.h | 1 +
>  arch/x86/kvm/mmu/paging_tmpl.h  | 3 ++-
>  arch/x86/kvm/mmu/spte.h         | 2 +-
>  arch/x86/kvm/svm/nested.c       | 4 ++--
>  arch/x86/kvm/vmx/nested.c       | 4 ++--
>  arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
>  arch/x86/kvm/x86.c              | 4 ++--
>  11 files changed, 39 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c6f03d151c31..46471dd9cc1b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -727,6 +727,11 @@ struct kvm_vcpu_arch {
>  	unsigned long cr0_guest_owned_bits;
>  	unsigned long cr2;
>  	unsigned long cr3;
> +	/*
> +	 * CR3 non-address feature control bits.
> +	 * Guest CR3 may contain any of those bits at runtime.
> +	 */
> +	u64 cr3_ctrl_bits;

This should be an "unsigned long".

Hmm, "ctrl_bits" is unnecessarily generic at this point.  It's also arguably wrong,
because X86_CR3_PCID_NOFLUSH is also a control bit, it's just allowed in CR3 itself.

I think I'd prefer to drop this field and avoid bikeshedding the name entirely.  The
only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
feature" framework.

More below.

>  	unsigned long cr4;
>  	unsigned long cr4_guest_owned_bits;
>  	unsigned long cr4_guest_rsvd_bits;
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index b1658c0de847..ef8e1b912d7d 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -42,6 +42,11 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
>  	return vcpu->arch.maxphyaddr;
>  }
>  
> +static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)

Heh, I think it makes sense to wrap this one.  I'll probably tell you differently
tomorrow, but today, let's wrap.

> +{
> +	return !((cr3 & vcpu->arch.reserved_gpa_bits) & ~vcpu->arch.cr3_ctrl_bits);

Don't open code something for which there is a perfect helper, i.e. use
kvm_vcpu_is_legal_gpa().

If we go the governed feature route, this becomes:

static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu,
					 unsigned long cr3)
{
	if (guest_can_use(vcpu, X86_FEATURE_LAM))
		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);

	return kvm_vcpu_is_legal_gpa(cr3);
}

> +}
> +
>  static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
>  {
>  	return !(gpa & vcpu->arch.reserved_gpa_bits);
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 92d5a1924fc1..81d8a433dae1 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -144,6 +144,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
>  	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
>  }
>  
> +static inline u64 kvm_get_active_cr3_ctrl_bits(struct kvm_vcpu *vcpu)

And then this becomes:

static inline u64 kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
{
	if (!guest_can_use(vcpu, X86_FEATURE_LAM))
		return 0;

	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
}

> +{
> +	return kvm_read_cr3(vcpu) & vcpu->arch.cr3_ctrl_bits;
> +}
> +
>  static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
>  {
>  	u64 root_hpa = vcpu->arch.mmu->root.hpa;
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index c8961f45e3b1..deea9a9f0c75 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3812,7 +3812,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	hpa_t root;
>  
>  	root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
> -	root_gfn = root_pgd >> PAGE_SHIFT;
> +	/*
> +	 * Guest PGD can be CR3 or EPTP (for nested EPT case). CR3 may contain
> +	 * additional control bits (e.g. LAM control bits). To be generic,
> +	 * unconditionally strip non-address bits when computing the GFN since
> +	 * the guest PGD has already been checked for validity.
> +	 */

Drop this comment, the code is self-explanatory, and the comment is incomplete,
e.g. it can also be nCR3.

> +	root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
>  
>  	if (mmu_check_root(vcpu, root_gfn))
>  		return 1;
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index d39af5639ce9..7d2105432d66 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -21,6 +21,7 @@ extern bool dbg;
>  #endif
>  
>  /* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
> +#define __PT_BASE_ADDR_MASK GENMASK_ULL(51, 12)
>  #define __PT_LEVEL_SHIFT(level, bits_per_level)	\
>  	(PAGE_SHIFT + ((level) - 1) * (bits_per_level))
>  #define __PT_INDEX(address, level, bits_per_level) \
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 0662e0278e70..394733ac9088 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -62,7 +62,7 @@
>  #endif
>  
>  /* Common logic, but per-type values.  These also need to be undefined. */
> -#define PT_BASE_ADDR_MASK	((pt_element_t)(((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1)))
> +#define PT_BASE_ADDR_MASK	((pt_element_t)__PT_BASE_ADDR_MASK)
>  #define PT_LVL_ADDR_MASK(lvl)	__PT_LVL_ADDR_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>  #define PT_LVL_OFFSET_MASK(lvl)	__PT_LVL_OFFSET_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>  #define PT_INDEX(addr, lvl)	__PT_INDEX(addr, lvl, PT_LEVEL_BITS)
> @@ -324,6 +324,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>  	trace_kvm_mmu_pagetable_walk(addr, access);
>  retry_walk:
>  	walker->level = mmu->cpu_role.base.level;
> +	/* gpte_to_gfn() will strip non-address bits. */

Drop this comment too, it's not relevant to the immediate code, i.e. it'd be
better suited about this code:

	table_gfn = gpte_to_gfn(pte);

but IMO that code is quite self-explanatory too.

> @@ -7740,6 +7741,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  		vmx->msr_ia32_feature_control_valid_bits &=
>  			~FEAT_CTL_SGX_LC_ENABLED;
>  
> +	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))

This is wrong, KVM needs to check that the host supports LAM too, otherwise KVM
will allow userspace to shove garbage into guest CR3 and induce VM-Entry failures
and whatnot.  If we go the guest_can_use() route, this problem solves itself.

next prev parent reply	other threads:[~2023-06-27 23:40 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-06  9:18 [PATCH v9 0/6] Linear Address Masking (LAM) KVM Enabling Binbin Wu
2023-06-06  9:18 ` [PATCH v9 1/6] KVM: x86: Consolidate flags for __linearize() Binbin Wu
2023-06-06  9:18 ` [PATCH v9 2/6] KVM: x86: Virtualize CR4.LAM_SUP Binbin Wu
2023-06-07  3:40   ` Huang, Kai
2023-06-07  4:55     ` Binbin Wu
2023-06-07  9:20       ` Huang, Kai
2023-06-06  9:18 ` [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57} Binbin Wu
2023-06-27 23:40   ` Sean Christopherson [this message]
2023-06-28  3:05     ` Binbin Wu
2023-06-28 17:40       ` Sean Christopherson
2023-07-03  7:56         ` Binbin Wu
2023-07-22  1:28           ` Sean Christopherson
2023-06-06  9:18 ` [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops Binbin Wu
2023-06-28  0:15   ` Sean Christopherson
2023-06-29  6:12     ` Binbin Wu
2023-06-29  6:57       ` Chao Gao
2023-06-29  7:22         ` Binbin Wu
2023-06-29 15:33           ` Sean Christopherson
2023-06-29  8:30       ` David Laight
2023-06-29 15:16       ` Sean Christopherson
2023-06-29 17:26         ` Binbin Wu
2023-06-06  9:18 ` [PATCH v9 5/6] KVM: x86: Untag address when LAM applicable Binbin Wu
2023-06-28  0:19   ` Sean Christopherson
2023-06-06  9:18 ` [PATCH v9 6/6] KVM: x86: Expose LAM feature to userspace VMM Binbin Wu
2023-06-07  3:52   ` Huang, Kai
2023-06-16  1:45     ` Binbin Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZJtzdftocuwTvp67@google.com \
    --to=seanjc@google.com \
    --cc=David.Laight@aculab.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=chao.gao@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=robert.hu@linux.intel.com \
    --cc=t@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.