From: Binbin Wu <binbin.wu@linux.intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	pbonzini@redhat.com, chao.gao@intel.com, kai.huang@intel.com,
	David.Laight@aculab.com, robert.hu@linux.intel.com
Subject: Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}
Date: Wed, 28 Jun 2023 11:05:05 +0800
Message-ID: <e11e348c-3763-8eda-281d-c8d965cd52b6@linux.intel.com>
In-Reply-To: <ZJtzdftocuwTvp67@google.com>



On 6/28/2023 7:40 AM, Sean Christopherson wrote:
> On Tue, Jun 06, 2023, Binbin Wu wrote:
>> Opportunistically use GENMASK_ULL() to define __PT_BASE_ADDR_MASK.
> These are not the type of changes to make opportunistically.  Opportunistic changes
> are things like fixing comment typos, dropping unnecessary semicolons, fixing
> coding style violations, etc.

OK, thanks for the education.
>
>> Opportunistically use kvm_vcpu_is_legal_cr3() to check CR3 in SVM nested code,
>> to provide a clear distinction b/t CR3 and GPA checks.
> This *shouldn't* be an opportunistic thing.  That you felt compelled to call it
> out is a symptom of this patch doing too much.
>
> In short, split this into three patches:
>
>    1. Do the __PT_BASE_ADDR_MASK() changes
>    2. Add and use kvm_vcpu_is_legal_cr3()
>    3. Add support for CR3.LAM bits
Will do that, thanks.

>
>> Suggested-by: Sean Christopherson <seanjc@google.com>
>> Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>> Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
>> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>> Reviewed-by: Kai Huang <kai.huang@intel.com>
>> Reviewed-by: Chao Gao <chao.gao@intel.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h | 5 +++++
>>   arch/x86/kvm/cpuid.h            | 5 +++++
>>   arch/x86/kvm/mmu.h              | 5 +++++
>>   arch/x86/kvm/mmu/mmu.c          | 8 +++++++-
>>   arch/x86/kvm/mmu/mmu_internal.h | 1 +
>>   arch/x86/kvm/mmu/paging_tmpl.h  | 3 ++-
>>   arch/x86/kvm/mmu/spte.h         | 2 +-
>>   arch/x86/kvm/svm/nested.c       | 4 ++--
>>   arch/x86/kvm/vmx/nested.c       | 4 ++--
>>   arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
>>   arch/x86/kvm/x86.c              | 4 ++--
>>   11 files changed, 39 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index c6f03d151c31..46471dd9cc1b 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -727,6 +727,11 @@ struct kvm_vcpu_arch {
>>   	unsigned long cr0_guest_owned_bits;
>>   	unsigned long cr2;
>>   	unsigned long cr3;
>> +	/*
>> +	 * CR3 non-address feature control bits.
>> +	 * Guest CR3 may contain any of those bits at runtime.
>> +	 */
>> +	u64 cr3_ctrl_bits;
> This should be an "unsigned long".
>
> Hmm, "ctrl_bits" is unnecessarily generic at this point.  It's also arguably wrong,
> because X86_CR3_PCID_NOFLUSH is also a control bit, it's just allowed in CR3 itself.
>
> I think I'd prefer to drop this field and avoid bikeshedding the name entirely.  The
> only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
> guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
> feature" framework.
Thanks for the suggestion.

Is the patch below the latest version of the "governed feature" framework
support?
https://lore.kernel.org/kvm/20230217231022.816138-2-seanjc@google.com/

Do you have a plan to apply it to the kvm-x86 repo?
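
If the framework does land, my understanding is that the LAM wiring would be
roughly the below (a sketch based on that patch; the helper names are from
the posted version and may change):

	/* In kvm_vcpu_after_set_cpuid(), once guest CPUID is settled. */
	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LAM);

Later queries would then go through guest_can_use(vcpu, X86_FEATURE_LAM), as
in your examples below.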

>
> More below.
>
>>   	unsigned long cr4;
>>   	unsigned long cr4_guest_owned_bits;
>>   	unsigned long cr4_guest_rsvd_bits;
>> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
>> index b1658c0de847..ef8e1b912d7d 100644
>> --- a/arch/x86/kvm/cpuid.h
>> +++ b/arch/x86/kvm/cpuid.h
>> @@ -42,6 +42,11 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
>>   	return vcpu->arch.maxphyaddr;
>>   }
>>   
>> +static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> Heh, I think it makes sense to wrap this one.  I'll probably tell you differently
> tomorrow, but today, let's wrap.
>
>> +{
>> +	return !((cr3 & vcpu->arch.reserved_gpa_bits) & ~vcpu->arch.cr3_ctrl_bits);
> Don't open code something for which there is a perfect helper, i.e. use
> kvm_vcpu_is_legal_gpa().
I didn't use the helper because even after masking off the control bits (the
LAM bits), CR3 is not conceptually a GPA,
i.e. it still contains the PCID or PWT/PCD in its lower bits.
But maybe I am overthinking this? Will use the helper instead.
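
Concretely, a minimal sketch of the reworked check, still keeping the cached
cr3_ctrl_bits field from this version of the patch (bit-for-bit equivalent to
the open-coded expression above):

	return kvm_vcpu_is_legal_gpa(vcpu, cr3 & ~vcpu->arch.cr3_ctrl_bits);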

>
> If we go the governed feature route, this becomes:
>
> static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu,
> 					 unsigned long cr3)
> {
> 	if (guest_can_use(vcpu, X86_FEATURE_LAM))
> 		cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
>
> 	return kvm_vcpu_is_legal_gpa(vcpu, cr3);
> }
>
>> +}
>> +
>>   static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
>>   {
>>   	return !(gpa & vcpu->arch.reserved_gpa_bits);
>> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
>> index 92d5a1924fc1..81d8a433dae1 100644
>> --- a/arch/x86/kvm/mmu.h
>> +++ b/arch/x86/kvm/mmu.h
>> @@ -144,6 +144,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
>>   	return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
>>   }
>>   
>> +static inline u64 kvm_get_active_cr3_ctrl_bits(struct kvm_vcpu *vcpu)
> And then this becomes:
>
> static inline u64 kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
> {
> 	if (!guest_can_use(vcpu, X86_FEATURE_LAM))
> 		return 0;
>
> 	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> }
>
>> +{
>> +	return kvm_read_cr3(vcpu) & vcpu->arch.cr3_ctrl_bits;
>> +}
>> +
>>   static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
>>   {
>>   	u64 root_hpa = vcpu->arch.mmu->root.hpa;
>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>> index c8961f45e3b1..deea9a9f0c75 100644
>> --- a/arch/x86/kvm/mmu/mmu.c
>> +++ b/arch/x86/kvm/mmu/mmu.c
>> @@ -3812,7 +3812,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>>   	hpa_t root;
>>   
>>   	root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
>> -	root_gfn = root_pgd >> PAGE_SHIFT;
>> +	/*
>> +	 * Guest PGD can be CR3 or EPTP (for nested EPT case). CR3 may contain
>> +	 * additional control bits (e.g. LAM control bits). To be generic,
>> +	 * unconditionally strip non-address bits when computing the GFN since
>> +	 * the guest PGD has already been checked for validity.
>> +	 */
> Drop this comment, the code is self-explanatory, and the comment is incomplete,
> e.g. it can also be nCR3.
I was trying to use "CR3" to cover both the nested and non-nested cases.
Sorry for the confusion.
Anyway, I will drop the comment.


>
>> +	root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
>>   
>>   	if (mmu_check_root(vcpu, root_gfn))
>>   		return 1;
>> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
>> index d39af5639ce9..7d2105432d66 100644
>> --- a/arch/x86/kvm/mmu/mmu_internal.h
>> +++ b/arch/x86/kvm/mmu/mmu_internal.h
>> @@ -21,6 +21,7 @@ extern bool dbg;
>>   #endif
>>   
>>   /* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
>> +#define __PT_BASE_ADDR_MASK GENMASK_ULL(51, 12)
>>   #define __PT_LEVEL_SHIFT(level, bits_per_level)	\
>>   	(PAGE_SHIFT + ((level) - 1) * (bits_per_level))
>>   #define __PT_INDEX(address, level, bits_per_level) \
>> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
>> index 0662e0278e70..394733ac9088 100644
>> --- a/arch/x86/kvm/mmu/paging_tmpl.h
>> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
>> @@ -62,7 +62,7 @@
>>   #endif
>>   
>>   /* Common logic, but per-type values.  These also need to be undefined. */
>> -#define PT_BASE_ADDR_MASK	((pt_element_t)(((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1)))
>> +#define PT_BASE_ADDR_MASK	((pt_element_t)__PT_BASE_ADDR_MASK)
>>   #define PT_LVL_ADDR_MASK(lvl)	__PT_LVL_ADDR_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>>   #define PT_LVL_OFFSET_MASK(lvl)	__PT_LVL_OFFSET_MASK(PT_BASE_ADDR_MASK, lvl, PT_LEVEL_BITS)
>>   #define PT_INDEX(addr, lvl)	__PT_INDEX(addr, lvl, PT_LEVEL_BITS)
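
As a sanity check on the GENMASK_ULL() conversion: the old and new spellings
are bit-for-bit identical (bits 51:12, i.e. 0x000FFFFFFFFFF000). A throwaway
assertion, not part of the patch, just verification:

	static_assert(GENMASK_ULL(51, 12) ==
		      (((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE - 1)));
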
>> @@ -324,6 +324,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>>   	trace_kvm_mmu_pagetable_walk(addr, access);
>>   retry_walk:
>>   	walker->level = mmu->cpu_role.base.level;
>> +	/* gpte_to_gfn() will strip non-address bits. */
> Drop this comment too, it's not relevant to the immediate code, i.e. it'd be
> better suited about this code:
>
> 	table_gfn = gpte_to_gfn(pte);
>
> but IMO that code is quite self-explanatory too.

OK, thanks.
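
For reference, gpte_to_gfn() already strips the non-address bits, since it is
built on PT_LVL_ADDR_MASK(), which derives from PT_BASE_ADDR_MASK (quoting
the existing paging_tmpl.h macros from memory):

	#define gpte_to_gfn_lvl(pte, lvl) \
		((gpa_t)((pte) & PT_LVL_ADDR_MASK(lvl)) >> PAGE_SHIFT)
	#define gpte_to_gfn(pte)	gpte_to_gfn_lvl((pte), PG_LEVEL_4K)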
>
>> @@ -7740,6 +7741,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>>   		vmx->msr_ia32_feature_control_valid_bits &=
>>   			~FEAT_CTL_SGX_LC_ENABLED;
>>   
>> +	if (guest_cpuid_has(vcpu, X86_FEATURE_LAM))
> This is wrong, KVM needs to check that the host supports LAM too, otherwise KVM
> will allow userspace to shove garbage into guest CR3 and induce VM-Entry failures
> and whatnot.
Right, will fix it.
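
For the non-governed-feature route, I am thinking of something like the below
(a sketch, assuming the hunk keeps setting cr3_ctrl_bits as in this version
of the patch):

	if (kvm_cpu_cap_has(X86_FEATURE_LAM) &&
	    guest_cpuid_has(vcpu, X86_FEATURE_LAM))
		vcpu->arch.cr3_ctrl_bits = X86_CR3_LAM_U48 | X86_CR3_LAM_U57;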

>    If we go the guest_can_use() route, this problem solves itself.

