kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] x86/kvm/mmu: Remove unnecessary ‘NULL’ values from sptep
  2023-09-02 17:54 [PATCH] x86/kvm/mmu: Remove unnecessary ‘NULL’ values from sptep Li zeming
@ 2023-09-01 16:48 ` Sean Christopherson
  2023-09-04 10:39   ` Huang, Kai
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2023-09-01 16:48 UTC (permalink / raw)
  To: Li zeming; +Cc: pbonzini, tglx, mingo, bp, dave.hansen, x86, kvm, linux-kernel

On Sun, Sep 03, 2023, Li zeming wrote:
> sptep is assigned first, so it does not need to initialize the assignment.
> 
> Signed-off-by: Li zeming <zeming@nfschina.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index ec169f5c7dce..95f745aec4aa 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3457,7 +3457,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  	struct kvm_mmu_page *sp;
>  	int ret = RET_PF_INVALID;
>  	u64 spte = 0ull;
> -	u64 *sptep = NULL;
> +	u64 *sptep;
>  	uint retry_count = 0;
>  
>  	if (!page_fault_can_be_fast(fault))

Hmm, this is safe, but there's some ugliness lurking.  Theoretically, it's possible
for spte to be left untouched by the walkers.  That _shouldn't_ happen, as it means
there's a bug somewhere in KVM.  But if that did happen, on the second or later
iteration, it's (again, theoretically) possible to consume a stale spte.

		if (tdp_mmu_enabled)
			sptep = kvm_tdp_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
		else
			sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);

		if (!is_shadow_present_pte(spte)) <=== could consume stale data
			break;

If we're going to tidy up sptep, I think we should also give spte similar treatment
and harden KVM in the process, e.g.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6325bb3e8c2b..ae2f87bbbf0a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3430,8 +3430,8 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
        struct kvm_mmu_page *sp;
        int ret = RET_PF_INVALID;
-       u64 spte = 0ull;
-       u64 *sptep = NULL;
+       u64 spte;
+       u64 *sptep;
        uint retry_count = 0;
 
        if (!page_fault_can_be_fast(fault))
@@ -3447,6 +3447,14 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
                else
                        sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
 
+               /*
+                * It's entirely possible for the mapping to have been zapped
+                * by a different task, but the root page is should always be
+                * available as the vCPU holds a reference to its root(s).
+                */
+               if (WARN_ON_ONCE(!sptep))
+                       spte = REMOVED_SPTE;
+
                if (!is_shadow_present_pte(spte))
                        break;
 


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH] x86/kvm/mmu: Remove unnecessary ‘NULL’ values from sptep
@ 2023-09-02 17:54 Li zeming
  2023-09-01 16:48 ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: Li zeming @ 2023-09-02 17:54 UTC (permalink / raw)
  To: seanjc, pbonzini, tglx, mingo, bp, dave.hansen
  Cc: x86, kvm, linux-kernel, Li zeming

sptep is assigned first, so it does not need to initialize the assignment.

Signed-off-by: Li zeming <zeming@nfschina.com>
---
 arch/x86/kvm/mmu/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ec169f5c7dce..95f745aec4aa 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3457,7 +3457,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	struct kvm_mmu_page *sp;
 	int ret = RET_PF_INVALID;
 	u64 spte = 0ull;
-	u64 *sptep = NULL;
+	u64 *sptep;
 	uint retry_count = 0;
 
 	if (!page_fault_can_be_fast(fault))
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/kvm/mmu: Remove unnecessary ‘NULL’ values from sptep
  2023-09-01 16:48 ` Sean Christopherson
@ 2023-09-04 10:39   ` Huang, Kai
  2023-09-05 21:09     ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: Huang, Kai @ 2023-09-04 10:39 UTC (permalink / raw)
  To: zeming@nfschina.com, Christopherson,, Sean
  Cc: x86@kernel.org, dave.hansen@linux.intel.com, bp@alien8.de,
	mingo@redhat.com, tglx@linutronix.de, kvm@vger.kernel.org,
	pbonzini@redhat.com, linux-kernel@vger.kernel.org

On Fri, 2023-09-01 at 09:48 -0700, Sean Christopherson wrote:
> On Sun, Sep 03, 2023, Li zeming wrote:
> > sptep is assigned first, so it does not need to initialize the assignment.
> > 
> > Signed-off-by: Li zeming <zeming@nfschina.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index ec169f5c7dce..95f745aec4aa 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3457,7 +3457,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  	struct kvm_mmu_page *sp;
> >  	int ret = RET_PF_INVALID;
> >  	u64 spte = 0ull;
> > -	u64 *sptep = NULL;
> > +	u64 *sptep;
> >  	uint retry_count = 0;
> >  
> >  	if (!page_fault_can_be_fast(fault))
> 
> Hmm, this is safe, but there's some ugliness lurking.  Theoretically, it's possible
> for spte to be left untouched by the walkers.  That _shouldn't_ happen, as it means
> there's a bug somewhere in KVM.  But if that did happen, on the second or later
> iteration, it's (again, theoretically) possible to consume a stale spte.
> 
> 		if (tdp_mmu_enabled)
> 			sptep = kvm_tdp_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
> 		else
> 			sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
> 
> 		if (!is_shadow_present_pte(spte)) <=== could consume stale data
> 			break;
> 
> If we're going to tidy up sptep, I think we should also give spte similar treatment
> and harden KVM in the process, e.g.
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 6325bb3e8c2b..ae2f87bbbf0a 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3430,8 +3430,8 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  {
>         struct kvm_mmu_page *sp;
>         int ret = RET_PF_INVALID;
> -       u64 spte = 0ull;
> -       u64 *sptep = NULL;
> +       u64 spte;
> +       u64 *sptep;
>         uint retry_count = 0;
>  
>         if (!page_fault_can_be_fast(fault))
> @@ -3447,6 +3447,14 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>                 else
>                         sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
>  
> +               /*
> +                * It's entirely possible for the mapping to have been zapped
> +                * by a different task, but the root page is should always be
> +                * available as the vCPU holds a reference to its root(s).
> +                */
> +               if (WARN_ON_ONCE(!sptep))
> +                       spte = REMOVED_SPTE;

If I recall correctly, REMOVED_SPTE is only used by TDP MMU code.  Should we use
0 (or initial SPTE value for case like TDX) instead of REMOVED_SPTE?

And I agree this code is more error proof (although theoretically for now).
  
> +
>                 if (!is_shadow_present_pte(spte))
>                         break;
>  
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/kvm/mmu: Remove unnecessary ‘NULL’ values from sptep
  2023-09-04 10:39   ` Huang, Kai
@ 2023-09-05 21:09     ` Sean Christopherson
  2023-09-05 23:31       ` Huang, Kai
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2023-09-05 21:09 UTC (permalink / raw)
  To: Kai Huang
  Cc: zeming@nfschina.com, x86@kernel.org, dave.hansen@linux.intel.com,
	bp@alien8.de, mingo@redhat.com, tglx@linutronix.de,
	kvm@vger.kernel.org, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org

On Mon, Sep 04, 2023, Kai Huang wrote:
> On Fri, 2023-09-01 at 09:48 -0700, Sean Christopherson wrote:
> > @@ -3447,6 +3447,14 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >                 else
> >                         sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
> >  
> > +               /*
> > +                * It's entirely possible for the mapping to have been zapped
> > +                * by a different task, but the root page is should always be
> > +                * available as the vCPU holds a reference to its root(s).
> > +                */
> > +               if (WARN_ON_ONCE(!sptep))
> > +                       spte = REMOVED_SPTE;
> 
> If I recall correctly, REMOVED_SPTE is only used by TDP MMU code.  Should we use
> 0 (or initial SPTE value for case like TDX) instead of REMOVED_SPTE?

I deliberately suggested REMOVED_SPTE in part because of TDX introducing "initial
SPTE"; finding/remembering '0' initialization of SPTEs is hard.  Though FWIW, '0'
would be totally fine for TDX because the value is never exposed to hardware.

I think it's totally fine to use REMOVED_SPTE like this in common code, I would
be quite surprised if it causes confusion.  Even though REMOVED_SPTE was introduced
by the TDP MMU, its value/semantics are generic.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/kvm/mmu: Remove unnecessary ‘NULL’ values from sptep
  2023-09-05 21:09     ` Sean Christopherson
@ 2023-09-05 23:31       ` Huang, Kai
  0 siblings, 0 replies; 5+ messages in thread
From: Huang, Kai @ 2023-09-05 23:31 UTC (permalink / raw)
  To: Christopherson,, Sean
  Cc: zeming@nfschina.com, dave.hansen@linux.intel.com, bp@alien8.de,
	x86@kernel.org, mingo@redhat.com, tglx@linutronix.de,
	kvm@vger.kernel.org, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org

On Tue, 2023-09-05 at 14:09 -0700, Sean Christopherson wrote:
> On Mon, Sep 04, 2023, Kai Huang wrote:
> > On Fri, 2023-09-01 at 09:48 -0700, Sean Christopherson wrote:
> > > @@ -3447,6 +3447,14 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > >                 else
> > >                         sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
> > >  
> > > +               /*
> > > +                * It's entirely possible for the mapping to have been zapped
> > > +                * by a different task, but the root page is should always be
> > > +                * available as the vCPU holds a reference to its root(s).
> > > +                */
> > > +               if (WARN_ON_ONCE(!sptep))
> > > +                       spte = REMOVED_SPTE;
> > 
> > If I recall correctly, REMOVED_SPTE is only used by TDP MMU code.  Should we use
> > 0 (or initial SPTE value for case like TDX) instead of REMOVED_SPTE?
> 
> I deliberately suggested REMOVED_SPTE in part because of TDX introducing "initial
> SPTE"; finding/remembering '0' initialization of SPTEs is hard.  Though FWIW, '0'
> would be totally fine for TDX because the value is never exposed to hardware.
> 
> I think it's totally fine to use REMOVED_SPTE like this in common code, I would
> be quite surprised if it causes confusion.  Even though REMOVED_SPTE was introduced
> by the TDP MMU, its value/semantics are generic.

Yeah certainly no harm here. :-)

My thinking was REMOVED_SPTE is supposed to be an intermediate state for one
SPTE, which is actually "in the page table", while multiple threads can operate
on the page table entry concurrently.  Slightly mismatch the case here IIUC. 
But I guess it also depends on how we view this case here.



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-09-05 23:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-02 17:54 [PATCH] x86/kvm/mmu: Remove unnecessary ‘NULL’ values from sptep Li zeming
2023-09-01 16:48 ` Sean Christopherson
2023-09-04 10:39   ` Huang, Kai
2023-09-05 21:09     ` Sean Christopherson
2023-09-05 23:31       ` Huang, Kai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).