public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: David Matlack <dmatlack@google.com>
To: Ben Gardon <bgardon@google.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org, Peter Xu <peterx@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Jim Mattson <jmattson@google.com>,
	David Dunn <daviddunn@google.com>,
	Jing Zhang <jingzhangos@google.com>,
	Junaid Shahid <junaids@google.com>
Subject: Re: [PATCH] KVM: x86/MMU: Zap non-leaf SPTEs when disabling dirty logging
Date: Thu, 26 May 2022 16:40:23 +0000	[thread overview]
Message-ID: <Yo+td/A+yEK8z7Z/@google.com> (raw)
In-Reply-To: <20220525230904.1584480-1-bgardon@google.com>

On Wed, May 25, 2022 at 11:09:04PM +0000, Ben Gardon wrote:
> When disabling dirty logging, zap non-leaf parent entries to allow
> replacement with huge pages instead of recursing and zapping all of the
> child, leaf entries. This reduces the number of TLB flushes required.
> 
> Currently disabling dirty logging with the TDP MMU is extremely slow.
> On a 96 vCPU / 96G VM backed with gigabyte pages, it takes ~200 seconds
> to disable dirty logging with the TDP MMU, as opposed to ~4 seconds with
> the shadow MMU. This patch reduces the disable dirty log time with the
> TDP MMU to ~3 seconds.

Nice!

It'd be good to also mention the new WARN. e.g.

  Opportunistically add a WARN() to catch GFNS that are mapped at a
  higher level than their max level.

> 
> Testing:
> Ran KVM selftests and kvm-unit-tests on an Intel Haswell. This
> patch introduced no new failures.
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/tdp_iter.c |  9 +++++++++
>  arch/x86/kvm/mmu/tdp_iter.h |  1 +
>  arch/x86/kvm/mmu/tdp_mmu.c  | 38 +++++++++++++++++++++++++++++++------
>  3 files changed, 42 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
> index 6d3b3e5a5533..ee4802d7b36c 100644
> --- a/arch/x86/kvm/mmu/tdp_iter.c
> +++ b/arch/x86/kvm/mmu/tdp_iter.c
> @@ -145,6 +145,15 @@ static bool try_step_up(struct tdp_iter *iter)
>  	return true;
>  }
>  
> +/*
> + * Step the iterator back up a level in the paging structure. Should only be
> + * used when the iterator is below the root level.
> + */
> +void tdp_iter_step_up(struct tdp_iter *iter)
> +{
> +	WARN_ON(!try_step_up(iter));
> +}
> +
>  /*
>   * Step to the next SPTE in a pre-order traversal of the paging structure.
>   * To get to the next SPTE, the iterator either steps down towards the goal
> diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
> index f0af385c56e0..adfca0cf94d3 100644
> --- a/arch/x86/kvm/mmu/tdp_iter.h
> +++ b/arch/x86/kvm/mmu/tdp_iter.h
> @@ -114,5 +114,6 @@ void tdp_iter_start(struct tdp_iter *iter, struct kvm_mmu_page *root,
>  		    int min_level, gfn_t next_last_level_gfn);
>  void tdp_iter_next(struct tdp_iter *iter);
>  void tdp_iter_restart(struct tdp_iter *iter);
> +void tdp_iter_step_up(struct tdp_iter *iter);
>  
>  #endif /* __KVM_X86_MMU_TDP_ITER_H */
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 841feaa48be5..7b9265d67131 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1742,12 +1742,12 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
>  	gfn_t start = slot->base_gfn;
>  	gfn_t end = start + slot->npages;
>  	struct tdp_iter iter;
> +	int max_mapping_level;
>  	kvm_pfn_t pfn;
>  
>  	rcu_read_lock();
>  
>  	tdp_root_for_each_pte(iter, root, start, end) {
> -retry:
>  		if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
>  			continue;
>  
> @@ -1755,15 +1755,41 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
>  		    !is_last_spte(iter.old_spte, iter.level))
>  			continue;
>  
> +		/*
> +		 * This is a leaf SPTE. Check if the PFN it maps can
> +		 * be mapped at a higher level.
> +		 */
>  		pfn = spte_to_pfn(iter.old_spte);
> -		if (kvm_is_reserved_pfn(pfn) ||
> -		    iter.level >= kvm_mmu_max_mapping_level(kvm, slot, iter.gfn,
> -							    pfn, PG_LEVEL_NUM))
> +
> +		if (kvm_is_reserved_pfn(pfn))
>  			continue;
>  
> +		max_mapping_level = kvm_mmu_max_mapping_level(kvm, slot,
> +				iter.gfn, pfn, PG_LEVEL_NUM);
> +
> +		WARN_ON(max_mapping_level < iter.level);
> +
> +		/*
> +		 * If this page is already mapped at the highest
> +		 * viable level, there's nothing more to do.
> +		 */
> +		if (max_mapping_level == iter.level)
> +			continue;
> +
> +		/*
> +		 * The page can be remapped at a higher level, so step
> +		 * up to zap the parent SPTE.
> +		 */
> +		while (max_mapping_level > iter.level)
> +			tdp_iter_step_up(&iter);
> +
>  		/* Note, a successful atomic zap also does a remote TLB flush. */
> -		if (tdp_mmu_zap_spte_atomic(kvm, &iter))
> -			goto retry;
> +		tdp_mmu_zap_spte_atomic(kvm, &iter);
> +
> +		/*
> +		 * If the atomic zap fails, the iter will recurse back into
> +		 * the same subtree to retry.
> +		 */
>  	}
>  
>  	rcu_read_unlock();
> -- 
> 2.36.1.124.g0e6072fb45-goog
> 

      parent reply	other threads:[~2022-05-26 16:40 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-25 23:09 [PATCH] KVM: x86/MMU: Zap non-leaf SPTEs when disabling dirty logging Ben Gardon
2022-05-25 23:12 ` Ben Gardon
2022-05-26  1:30 ` Yuan Yao
2022-05-26 15:57   ` Ben Gardon
2022-05-26 12:01 ` Paolo Bonzini
2022-05-26 14:30   ` Peter Xu
2022-05-26 15:52     ` Paolo Bonzini
2022-05-26 16:00       ` Ben Gardon
2022-06-06 23:11         ` Ben Gardon
2022-05-26 16:40 ` David Matlack [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yo+td/A+yEK8z7Z/@google.com \
    --to=dmatlack@google.com \
    --cc=bgardon@google.com \
    --cc=daviddunn@google.com \
    --cc=jingzhangos@google.com \
    --cc=jmattson@google.com \
    --cc=junaids@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox