All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Matlack <dmatlack@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Lai Jiangshan <jiangshan.ljs@antgroup.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH V2 2/7] KVM: X86/MMU: Add special shadow pages
Date: Fri, 13 May 2022 23:43:06 +0000	[thread overview]
Message-ID: <Yn7tCpt9s8qf3Rn/@google.com> (raw)
In-Reply-To: <20220503150735.32723-3-jiangshanlai@gmail.com>

On Tue, May 03, 2022 at 11:07:30PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> 
> Special pages are pages to hold PDPTEs for 32bit guest or higher
> level pages linked to special page when shadowing NPT.
> 
> Current code use mmu->pae_root, mmu->pml4_root, and mmu->pml5_root to
> setup special root.  The initialization code is complex and the roots
> are not associated with struct kvm_mmu_page which causes the code more
> complex.
> 
> Add kvm_mmu_alloc_special_page() and mmu_free_special_root_page() to
> allocate and free special shadow pages and prepare for using special
> shadow pages to replace current logic and share the most logic with
> normal shadow pages.
> 
> The code is not activated since using_special_root_page() is false in
> the place where it is inserted.
> 
> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 91 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7f20796af351..126f0cd07f98 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1719,6 +1719,58 @@ static bool using_special_root_page(struct kvm_mmu *mmu)
>  		return mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL;
>  }
>  
> +/*
> + * Special pages are pages to hold PAE PDPTEs for 32bit guest or higher level
> + * pages linked to special page when shadowing NPT.
> + *
> + * Special pages are specially allocated.  If sp->spt needs to be 32bit, it

I'm not sure what you mean by "If sp->spt needs to be 32bit". Do you mean
"If sp shadows a 32-bit PAE page table"?

> + * will use the preallocated mmu->pae_root.
> + *
> + * Special pages are only visible to local VCPU except through rmap from their
> + * children, so they are not in the kvm->arch.active_mmu_pages nor in the hash.
> + *
> + * And they are either accounted nor write-protected since they don't has gfn
> + * associated.

Instead of "has gfn associated", how about "shadow a guest page table"?

> + *
> + * Because of above, special pages can not be freed nor zapped like normal
> + * shadow pages.  They are freed directly when the special root is freed, see
> + * mmu_free_special_root_page().
> + *
> + * Special root page can not be put on mmu->prev_roots because the comparison
> + * must use PDPTEs instead of CR3 and mmu->pae_root can not be shared for multi
> + * root pages.
> + *
> + * Except above limitations, all the other abilities are the same as other
> + * shadow page, like link, parent rmap, sync, unsync etc.
> + *
> + * Special pages can be obsoleted but might be possibly reused later.  When
> + * the obsoleting process is done, all the obsoleted shadow pages are unlinked
> + * from the special pages by the help of the parent rmap of the children and
> + * the special pages become theoretically valid again.  If there is no other
> + * event to cause a VCPU to free the root and the VCPU is being preempted by
> + * the host during two obsoleting processes, the VCPU can reuse its special
> + * pages when it is back.

Sorry I am having a lot of trouble parsing this paragraph.

> + */

This comment (and more broadly, this series) mixes "special page",
"special root", "special root page", and "special shadow page". Can you
be more consistent with the terminology?

> +static struct kvm_mmu_page *kvm_mmu_alloc_special_page(struct kvm_vcpu *vcpu,
> +		union kvm_mmu_page_role role)
> +{
> +	struct kvm_mmu_page *sp;
> +
> +	sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache);
> +	sp->gfn = 0;
> +	sp->role = role;
> +	if (role.level == PT32E_ROOT_LEVEL &&
> +	    vcpu->arch.mmu->root_role.level == PT32E_ROOT_LEVEL)
> +		sp->spt = vcpu->arch.mmu->pae_root;

Why use pae_root here instead of allocating from the cache?

> +	else
> +		sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);
> +	/* sp->gfns is not used for special shadow page */
> +	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
> +	sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen;
> +
> +	return sp;
> +}
> +
>  static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct)
>  {
>  	struct kvm_mmu_page *sp;
> @@ -2076,6 +2128,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>  	if (level <= vcpu->arch.mmu->cpu_role.base.level)
>  		role.passthrough = 0;
>  
> +	if (unlikely(level >= PT32E_ROOT_LEVEL && using_special_root_page(vcpu->arch.mmu)))
> +		return kvm_mmu_alloc_special_page(vcpu, role);
> +
>  	sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
>  	for_each_valid_sp(vcpu->kvm, sp, sp_list) {
>  		if (sp->gfn != gfn) {
> @@ -3290,6 +3345,37 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
>  	*root_hpa = INVALID_PAGE;
>  }
>  
> +static void mmu_free_special_root_page(struct kvm *kvm, struct kvm_mmu *mmu)
> +{
> +	u64 spte = mmu->root.hpa;
> +	struct kvm_mmu_page *sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
> +	int i;
> +
> +	/* Free level 5 or 4 roots for shadow NPT for 32 bit L1 */
> +	while (sp->role.level > PT32E_ROOT_LEVEL)
> +	{
> +		spte = sp->spt[0];
> +		mmu_page_zap_pte(kvm, sp, sp->spt + 0, NULL);

Instead of using mmu_page_zap_pte(..., NULL) what about creating a new
helper that just does drop_parent_pte(), since that's all you really
want?

> +		free_page((unsigned long)sp->spt);
> +		kmem_cache_free(mmu_page_header_cache, sp);
> +		if (!is_shadow_present_pte(spte))
> +			return;
> +		sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
> +	}
> +
> +	if (WARN_ON_ONCE(sp->role.level != PT32E_ROOT_LEVEL))
> +		return;
> +
> +	/* Free PAE roots */

nit: This loop does not do any freeing, it just disconnets the PAE root
table from the 4 PAE page directories. So how about:

/* Disconnect PAE root from the 4 PAE page directories */

> +	for (i = 0; i < 4; i++)
> +		mmu_page_zap_pte(kvm, sp, sp->spt + i, NULL);
> +
> +	if (sp->spt != mmu->pae_root)
> +		free_page((unsigned long)sp->spt);
> +
> +	kmem_cache_free(mmu_page_header_cache, sp);
> +}
> +
>  /* roots_to_free must be some combination of the KVM_MMU_ROOT_* flags */
>  void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
>  			ulong roots_to_free)
> @@ -3323,7 +3409,10 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
>  
>  	if (free_active_root) {
>  		if (to_shadow_page(mmu->root.hpa)) {
> -			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
> +			if (using_special_root_page(mmu))
> +				mmu_free_special_root_page(kvm, mmu);
> +			else
> +				mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
>  		} else if (mmu->pae_root) {
>  			for (i = 0; i < 4; ++i) {
>  				if (!IS_VALID_PAE_ROOT(mmu->pae_root[i]))
> -- 
> 2.19.1.6.gb485710b
> 

  reply	other threads:[~2022-05-14  1:38 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-03 15:07 [PATCH V2 0/7] KVM: X86/MMU: Use one-off special shadow page for special roots Lai Jiangshan
2022-05-03 15:07 ` [PATCH V2 1/7] KVM: X86/MMU: Add using_special_root_page() Lai Jiangshan
2022-05-13 22:53   ` David Matlack
2022-05-26  9:20     ` Lai Jiangshan
2022-05-03 15:07 ` [PATCH V2 2/7] KVM: X86/MMU: Add special shadow pages Lai Jiangshan
2022-05-13 23:43   ` David Matlack [this message]
2022-05-26  9:38     ` Lai Jiangshan
2022-05-03 15:07 ` [PATCH V2 3/7] KVM: X86/MMU: Link PAE root pagetable with its children Lai Jiangshan
2022-05-17  0:01   ` David Matlack
2022-05-17  1:13     ` Lai Jiangshan
2022-05-17 16:41       ` David Matlack
2022-05-26  9:12         ` Lai Jiangshan
2022-05-03 15:07 ` [PATCH V2 4/7] KVM: X86/MMU: Activate special shadow pages and remove old logic Lai Jiangshan
2022-05-17  0:16   ` David Matlack
2022-05-26  9:15     ` Lai Jiangshan
2022-05-03 15:07 ` [PATCH V2 5/7] KVM: X86/MMU: Remove the check of the return value of to_shadow_page() Lai Jiangshan
2022-05-17 16:47   ` David Matlack
2022-05-03 15:07 ` [PATCH V2 6/7] KVM: X86/MMU: Allocate mmu->pae_root for PAE paging on-demand Lai Jiangshan
2022-05-17 16:57   ` David Matlack
2022-05-26  8:52     ` Lai Jiangshan
2022-05-03 15:07 ` [PATCH V2 7/7] KVM: X86/MMU: Remove mmu_alloc_special_roots() Lai Jiangshan
2022-05-13  8:22 ` [PATCH V2 0/7] KVM: X86/MMU: Use one-off special shadow page for special roots Lai Jiangshan
2022-05-13 18:31   ` David Matlack

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yn7tCpt9s8qf3Rn/@google.com \
    --to=dmatlack@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jiangshan.ljs@antgroup.com \
    --cc=jiangshanlai@gmail.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.