kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Vipin Sharma <vipinsh@google.com>
Cc: pbonzini@redhat.com, dmatlack@google.com, kvm@vger.kernel.org,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] KVM: x86/mmu: Recover NX Huge pages belonging to TDP MMU under MMU read lock
Date: Mon, 19 Aug 2024 15:19:19 -0700	[thread overview]
Message-ID: <ZsPE56MnelsV490m@google.com> (raw)
In-Reply-To: <20240812171341.1763297-3-vipinsh@google.com>

On Mon, Aug 12, 2024, Vipin Sharma wrote:
> Use MMU read lock to recover NX huge pages belonging to TDP MMU. Iterate
> through kvm->arch.possible_nx_huge_pages while holding
> kvm->arch.tdp_mmu_pages_lock. Rename kvm_tdp_mmu_zap_sp() to
> tdp_mmu_zap_sp() and make it static as there are no callers outside of
> TDP MMU.
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 933bb8b11c9f..7c7d207ee590 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -817,9 +817,11 @@ static void tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
>  	rcu_read_unlock();
>  }
>  
> -bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> +static bool tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)

At this point, I think we should rename this to tdp_mmu_zap_possible_nx_huge_page(),
as I can't imagine there's another use case where we'll zap a SP starting from the
SP itself, i.e. without first walking from the root.

>  {
> -	u64 old_spte;
> +	struct tdp_iter iter = {};

Rather than initializes the on-stack structure, I think it makes sense to directly
initialize the whole thing and then WARN after, e.g. so that its easier to see
that "iter" is simply being filled from @sp.

	struct tdp_iter iter = {
		.old_spte = sp->ptep ? kvm_tdp_mmu_read_spte(sp->ptep) : 0,
		.sptep = sp->ptep,
		.level = sp->role.level + 1,
		.gfn = sp->gfn,
		.as_id = kvm_mmu_page_as_id(sp),
	};

	lockdep_assert_held_read(&kvm->mmu_lock);

	/*
	 * Root shadow pages don't a parent page table and thus no associated
	 * entry, but they can never be possible NX huge pages.
	 */
	if (WARN_ON_ONCE(!sp->ptep))
		return false;

> +
> +	lockdep_assert_held_read(&kvm->mmu_lock);
>  
>  	/*
>  	 * This helper intentionally doesn't allow zapping a root shadow page,
> @@ -828,12 +830,25 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
>  	if (WARN_ON_ONCE(!sp->ptep))
>  		return false;
>  
> -	old_spte = kvm_tdp_mmu_read_spte(sp->ptep);
> -	if (WARN_ON_ONCE(!is_shadow_present_pte(old_spte)))
> +	iter.old_spte = kvm_tdp_mmu_read_spte(sp->ptep);
> +	iter.sptep = sp->ptep;
> +	iter.level = sp->role.level + 1;
> +	iter.gfn = sp->gfn;
> +	iter.as_id = kvm_mmu_page_as_id(sp);
> +
> +retry:
> +	/*
> +	 * Since mmu_lock is held in read mode, it's possible to race with
> +	 * another CPU which can remove sp from the page table hierarchy.
> +	 *
> +	 * No need to re-read iter.old_spte as tdp_mmu_set_spte_atomic() will
> +	 * update it in the case of failure.
> +	 */
> +	if (sp->spt != spte_to_child_pt(iter.old_spte, iter.level))
>  		return false;
>  
> -	tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte,
> -			 SHADOW_NONPRESENT_VALUE, sp->gfn, sp->role.level + 1);
> +	if (tdp_mmu_set_spte_atomic(kvm, &iter, SHADOW_NONPRESENT_VALUE))
> +		goto retry;

I'm pretty sure there's no need to retry.  Non-leaf SPTEs don't have Dirty bits,
and KVM always sets the Accessed bit (and never clears it) for non-leaf SPTEs.
Ditty for the Writable bit.

So unless I'm missing something, the only way for the CMPXCHG to fail is if the
SPTE was zapped or replaced with something else, in which case the sp->spt will
fail.  I would much prefer to WARN on that logic failing than have what appears
to be a potential infinite loop.

  parent reply	other threads:[~2024-08-19 22:19 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-12 17:13 [PATCH 0/2] KVM: x86/mmu: Run NX huge page recovery under MMU read lock Vipin Sharma
2024-08-12 17:13 ` [PATCH 1/2] KVM: x86/mmu: Split NX hugepage recovery flow into TDP and non-TDP flow Vipin Sharma
2024-08-16 23:29   ` Sean Christopherson
2024-08-19 17:20     ` Vipin Sharma
2024-08-19 17:28       ` David Matlack
2024-08-19 18:31         ` Sean Christopherson
2024-08-19 21:57           ` Vipin Sharma
2024-08-12 17:13 ` [PATCH 2/2] KVM: x86/mmu: Recover NX Huge pages belonging to TDP MMU under MMU read lock Vipin Sharma
2024-08-14  9:33   ` kernel test robot
2024-08-14 18:23     ` Vipin Sharma
2024-08-14 22:50   ` kernel test robot
2024-08-15 16:42     ` Vipin Sharma
2024-08-16 23:38   ` Sean Christopherson
2024-08-19 17:34     ` Vipin Sharma
2024-08-19 22:12       ` Sean Christopherson
2024-08-23 22:38         ` Vipin Sharma
2024-08-26 14:34           ` Sean Christopherson
2024-08-26 19:24             ` Vipin Sharma
2024-08-26 19:51               ` Sean Christopherson
2024-08-19 22:19   ` Sean Christopherson [this message]
2024-08-23 19:27     ` Vipin Sharma
2024-08-23 20:45       ` Sean Christopherson
2024-08-23 21:41         ` Vipin Sharma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZsPE56MnelsV490m@google.com \
    --to=seanjc@google.com \
    --cc=dmatlack@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=vipinsh@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).