public inbox for kvm@vger.kernel.org
* KVM: MMU: update sp->gfns on pte update path
@ 2011-01-25 13:07 Marcelo Tosatti
  2011-01-25 14:32 ` Avi Kivity
  0 siblings, 1 reply; 5+ messages in thread
From: Marcelo Tosatti @ 2011-01-25 13:07 UTC (permalink / raw)
  To: kvm; +Cc: Avi Kivity, Nicolas Prochazka


If an emulated pte write modifies the gpa of a present spte, sp->gfns is
not updated, retaining a stale value which later leads to:

rmap_remove: ffff8807d245fff8 0->BUG
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu.c:695!

Fix by updating sp->gfns even if spte was present.

Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=27052
Reported-and-tested-by: Nicolas Prochazka <prochazka.nicolas@gmail.com>
KVM-Stable-Tag.

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index cc1bada..37d0886 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2054,6 +2054,12 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		rmap_count = rmap_add(vcpu, sptep, gfn);
 		if (rmap_count > RMAP_RECYCLE_THRESHOLD)
 			rmap_recycle(vcpu, sptep, gfn);
+	} else {
+		struct kvm_mmu_page *sp = page_header(__pa(sptep));
+		int index = sptep - sp->spt;
+
+		if (!sp->role.direct && sp->gfns[index] != gfn)
+			sp->gfns[index] = gfn;
 	}
 	kvm_release_pfn_clean(pfn);
 	if (speculative) {



* Re: KVM: MMU: update sp->gfns on pte update path
  2011-01-25 13:07 KVM: MMU: update sp->gfns on pte update path Marcelo Tosatti
@ 2011-01-25 14:32 ` Avi Kivity
  2011-01-25 17:12   ` Marcelo Tosatti
  0 siblings, 1 reply; 5+ messages in thread
From: Avi Kivity @ 2011-01-25 14:32 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, Nicolas Prochazka

On 01/25/2011 03:07 PM, Marcelo Tosatti wrote:
> If an emulated pte write modifies the gpa of a present spte, sp->gfns is
> not updated, retaining a stale value which later leads to:
>
> rmap_remove: ffff8807d245fff8 0->BUG
> ------------[ cut here ]------------
> kernel BUG at arch/x86/kvm/mmu.c:695!
>
> Fix by updating sp->gfns even if spte was present.
>
> Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=27052
> Reported-and-tested-by: Nicolas Prochazka <prochazka.nicolas@gmail.com>
> KVM-Stable-Tag.
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index cc1bada..37d0886 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2054,6 +2054,12 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>   		rmap_count = rmap_add(vcpu, sptep, gfn);
>   		if (rmap_count > RMAP_RECYCLE_THRESHOLD)
>   			rmap_recycle(vcpu, sptep, gfn);
> +	} else {
> +		struct kvm_mmu_page *sp = page_header(__pa(sptep));
> +		int index = sptep - sp->spt;
> +
> +		if (!sp->role.direct && sp->gfns[index] != gfn)
> +			sp->gfns[index] = gfn;
>   	}
>   	kvm_release_pfn_clean(pfn);
>   	if (speculative) {
>

Should be done by a call to kvm_mmu_page_set_gfn().  But I don't 
understand how it could become inconsistent in the first place.

     if (is_rmap_spte(*sptep)) {
         /*
          * If we overwrite a PTE page pointer with a 2MB PMD, unlink
          * the parent of the now unreachable PTE.
          */
         if (level > PT_PAGE_TABLE_LEVEL &&
             !is_large_pte(*sptep)) {
             struct kvm_mmu_page *child;
             u64 pte = *sptep;

             child = page_header(pte & PT64_BASE_ADDR_MASK);
             mmu_page_remove_parent_pte(child, sptep);
             __set_spte(sptep, shadow_trap_nonpresent_pte);
             kvm_flush_remote_tlbs(vcpu->kvm);
         } else if (pfn != spte_to_pfn(*sptep)) {
             pgprintk("hfn old %llx new %llx\n",
                  spte_to_pfn(*sptep), pfn);
             drop_spte(vcpu->kvm, sptep, shadow_trap_nonpresent_pte);
             kvm_flush_remote_tlbs(vcpu->kvm);
         } else
             was_rmapped = 1;
     }

If we set was_rmapped, that means rmap_add() was previously called for 
this spte/gfn/pfn pair, and all that changes is permissions, no?

-- 
error compiling committee.c: too many arguments to function



* Re: KVM: MMU: update sp->gfns on pte update path
  2011-01-25 14:32 ` Avi Kivity
@ 2011-01-25 17:12   ` Marcelo Tosatti
  2011-01-25 17:36     ` Avi Kivity
  0 siblings, 1 reply; 5+ messages in thread
From: Marcelo Tosatti @ 2011-01-25 17:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Nicolas Prochazka

On Tue, Jan 25, 2011 at 04:32:29PM +0200, Avi Kivity wrote:
> On 01/25/2011 03:07 PM, Marcelo Tosatti wrote:
> >If an emulated pte write modifies the gpa of a present spte, sp->gfns is
> >not updated, retaining a stale value which later leads to:
> >
> >rmap_remove: ffff8807d245fff8 0->BUG
> >------------[ cut here ]------------
> >kernel BUG at arch/x86/kvm/mmu.c:695!
> >
> >Fix by updating sp->gfns even if spte was present.
> >
> >Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=27052
> >Reported-and-tested-by: Nicolas Prochazka <prochazka.nicolas@gmail.com>
> >KVM-Stable-Tag.
> >
> >diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >index cc1bada..37d0886 100644
> >--- a/arch/x86/kvm/mmu.c
> >+++ b/arch/x86/kvm/mmu.c
> >@@ -2054,6 +2054,12 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> >  		rmap_count = rmap_add(vcpu, sptep, gfn);
> >  		if (rmap_count > RMAP_RECYCLE_THRESHOLD)
> >  			rmap_recycle(vcpu, sptep, gfn);
> >+	} else {
> >+		struct kvm_mmu_page *sp = page_header(__pa(sptep));
> >+		int index = sptep - sp->spt;
> >+
> >+		if (!sp->role.direct && sp->gfns[index] != gfn)
> >+			sp->gfns[index] = gfn;
> >  	}
> >  	kvm_release_pfn_clean(pfn);
> >  	if (speculative) {
> >
> 
> Should be done by a call to kvm_mmu_page_set_gfn().  But I don't
> understand how it could become inconsistent in the first place.
> 
>     if (is_rmap_spte(*sptep)) {
>         /*
>          * If we overwrite a PTE page pointer with a 2MB PMD, unlink
>          * the parent of the now unreachable PTE.
>          */
>         if (level > PT_PAGE_TABLE_LEVEL &&
>             !is_large_pte(*sptep)) {
>             struct kvm_mmu_page *child;
>             u64 pte = *sptep;
> 
>             child = page_header(pte & PT64_BASE_ADDR_MASK);
>             mmu_page_remove_parent_pte(child, sptep);
>             __set_spte(sptep, shadow_trap_nonpresent_pte);
>             kvm_flush_remote_tlbs(vcpu->kvm);
>         } else if (pfn != spte_to_pfn(*sptep)) {
>             pgprintk("hfn old %llx new %llx\n",
>                  spte_to_pfn(*sptep), pfn);
>             drop_spte(vcpu->kvm, sptep, shadow_trap_nonpresent_pte);
>             kvm_flush_remote_tlbs(vcpu->kvm);
>         } else
>             was_rmapped = 1;
>     }
> 
> If we set was_rmapped, that means rmap_add() was previously called
> for this spte/gfn/pfn pair, and all that changes is permissions, no?

What if pfn is the same but gfn differs?



* Re: KVM: MMU: update sp->gfns on pte update path
  2011-01-25 17:12   ` Marcelo Tosatti
@ 2011-01-25 17:36     ` Avi Kivity
  2011-01-31 13:54       ` Marcelo Tosatti
  0 siblings, 1 reply; 5+ messages in thread
From: Avi Kivity @ 2011-01-25 17:36 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, Nicolas Prochazka

On 01/25/2011 07:12 PM, Marcelo Tosatti wrote:
> >
> >  Should be done by a call to kvm_mmu_page_set_gfn().  But I don't
> >  understand how it could become inconsistent in the first place.
> >
> >      if (is_rmap_spte(*sptep)) {
> >          /*
> >           * If we overwrite a PTE page pointer with a 2MB PMD, unlink
> >           * the parent of the now unreachable PTE.
> >           */
> >          if (level > PT_PAGE_TABLE_LEVEL &&
> >              !is_large_pte(*sptep)) {
> >              struct kvm_mmu_page *child;
> >              u64 pte = *sptep;
> >
> >              child = page_header(pte & PT64_BASE_ADDR_MASK);
> >              mmu_page_remove_parent_pte(child, sptep);
> >              __set_spte(sptep, shadow_trap_nonpresent_pte);
> >              kvm_flush_remote_tlbs(vcpu->kvm);
> >          } else if (pfn != spte_to_pfn(*sptep)) {
> >              pgprintk("hfn old %llx new %llx\n",
> >                   spte_to_pfn(*sptep), pfn);
> >              drop_spte(vcpu->kvm, sptep, shadow_trap_nonpresent_pte);
> >              kvm_flush_remote_tlbs(vcpu->kvm);
> >          } else
> >              was_rmapped = 1;
> >      }
> >
> >  If we set was_rmapped, that means rmap_add() was previously called
> >  for this spte/gfn/pfn pair, and all that changes is permissions, no?
>
> What if pfn is the same but gfn differs?

Could be.  Any way to verify if this was the case?

Isn't it nicer to have it detected by the test above and do the 
drop_spte()/kvm_flush_remote_tlbs() thing instead?
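
As a rough illustration of that idea, the "can this spte be updated in place?" decision could compare the recorded gfn as well as the pfn. The sketch below is a userspace toy, not kernel code: toy_gfn_t, toy_pfn_t and toy_must_drop() are hypothetical names, and nothing here claims to be what a resubmitted patch would look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's gfn_t and pfn_t. */
typedef uint64_t toy_gfn_t;
typedef uint64_t toy_pfn_t;

/*
 * Decide whether an already-present spte must be dropped (and later
 * re-added to the rmap) instead of being updated in place.
 * The thread's existing test only compares pfns; this toy version
 * also treats a changed gfn as a reason to drop.
 */
static bool toy_must_drop(toy_pfn_t old_pfn, toy_pfn_t new_pfn,
			  toy_gfn_t recorded_gfn, toy_gfn_t new_gfn)
{
	return old_pfn != new_pfn || recorded_gfn != new_gfn;
}

/* Small self-check covering the three cases discussed in the thread. */
static void toy_must_drop_demo(void)
{
	/* Same pfn, same gfn: only permissions change, keep the spte. */
	assert(!toy_must_drop(10, 10, 2, 2));
	/* Different pfn: the existing test already drops this. */
	assert(toy_must_drop(10, 11, 2, 2));
	/* Same pfn, different gfn: the case raised in this thread. */
	assert(toy_must_drop(10, 10, 2, 5));
}
```

With a check of this shape, the same-pfn/different-gfn case would go through the drop path, so the rmap entry and the recorded gfn would be rebuilt together rather than patched up afterwards.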

-- 
error compiling committee.c: too many arguments to function



* Re: KVM: MMU: update sp->gfns on pte update path
  2011-01-25 17:36     ` Avi Kivity
@ 2011-01-31 13:54       ` Marcelo Tosatti
  0 siblings, 0 replies; 5+ messages in thread
From: Marcelo Tosatti @ 2011-01-31 13:54 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Nicolas Prochazka

On Tue, Jan 25, 2011 at 07:36:02PM +0200, Avi Kivity wrote:
> On 01/25/2011 07:12 PM, Marcelo Tosatti wrote:
> >>
> >>  Should be done by a call to kvm_mmu_page_set_gfn().  But I don't
> >>  understand how it could become inconsistent in the first place.
> >>
> >>      if (is_rmap_spte(*sptep)) {
> >>          /*
> >>           * If we overwrite a PTE page pointer with a 2MB PMD, unlink
> >>           * the parent of the now unreachable PTE.
> >>           */
> >>          if (level > PT_PAGE_TABLE_LEVEL &&
> >>              !is_large_pte(*sptep)) {
> >>              struct kvm_mmu_page *child;
> >>              u64 pte = *sptep;
> >>
> >>              child = page_header(pte & PT64_BASE_ADDR_MASK);
> >>              mmu_page_remove_parent_pte(child, sptep);
> >>              __set_spte(sptep, shadow_trap_nonpresent_pte);
> >>              kvm_flush_remote_tlbs(vcpu->kvm);
> >>          } else if (pfn != spte_to_pfn(*sptep)) {
> >>              pgprintk("hfn old %llx new %llx\n",
> >>                   spte_to_pfn(*sptep), pfn);
> >>              drop_spte(vcpu->kvm, sptep, shadow_trap_nonpresent_pte);
> >>              kvm_flush_remote_tlbs(vcpu->kvm);
> >>          } else
> >>              was_rmapped = 1;
> >>      }
> >>
> >>  If we set was_rmapped, that means rmap_add() was previously called
> >>  for this spte/gfn/pfn pair, and all that changes is permissions, no?
> >
> >What if pfn is the same but gfn differs?
> 
> Could be.  Any way to verify if this was the case?
> 
> Isn't it nicer to have it detected by the test above and do the
> drop_spte()/kvm_flush_remote_tlbs() thing instead?

That could not be the case here. If the spte is updated to point to a
new gfn and the rmap is not updated, the state goes from (1) to (2):

1. rmap[A] = spte
   sp->gfns[i] = A
   spte points to gfn A

2. rmap[A] = spte
   sp->gfns[i] = A
   spte points to gfn B

rmap_remove(spte) will succeed (as in not crash). In case gfn A's slot
is removed, all shadow pages will be destroyed. So what can fail from
this point on are operations on gfn B such as rmap_write_protect(B).
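
The two states above can be modelled with a small userspace toy. This is only a sketch under stated assumptions: struct toy_mmu and every toy_* helper are hypothetical, the rmap is reduced to one spte index per gfn, and none of it is the kernel's actual data layout.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NGFNS  8    /* hypothetical: guest with 8 frames */
#define TOY_NSPTES 4    /* hypothetical: 4 sptes in one shadow page */
#define TOY_NONE   (-1)

struct toy_mmu {
	int rmap[TOY_NGFNS];    /* rmap[gfn] = index of the spte mapping gfn */
	int gfns[TOY_NSPTES];   /* gfn recorded for spte i, like sp->gfns[i] */
	int target[TOY_NSPTES]; /* gfn the spte really points to */
};

static void toy_init(struct toy_mmu *m)
{
	size_t i;

	for (i = 0; i < TOY_NGFNS; i++)
		m->rmap[i] = TOY_NONE;
	for (i = 0; i < TOY_NSPTES; i++)
		m->gfns[i] = m->target[i] = TOY_NONE;
}

/* State 1: rmap, recorded gfn and spte target all agree. */
static void toy_map(struct toy_mmu *m, int i, int gfn)
{
	m->rmap[gfn] = i;
	m->gfns[i] = gfn;
	m->target[i] = gfn;
}

/* State 2: the spte is retargeted, the bookkeeping is left stale. */
static void toy_retarget_stale(struct toy_mmu *m, int i, int gfn)
{
	m->target[i] = gfn;
}

/* Removal looks the rmap entry up through the recorded gfn.
 * Returns 0 on success, -1 where the kernel would hit BUG(). */
static int toy_rmap_remove(struct toy_mmu *m, int i)
{
	int gfn = m->gfns[i];

	if (gfn == TOY_NONE || m->rmap[gfn] != i)
		return -1;
	m->rmap[gfn] = TOY_NONE;
	m->gfns[i] = m->target[i] = TOY_NONE;
	return 0;
}

/* Which spte does the rmap report for this gfn? */
static int toy_rmap_find(const struct toy_mmu *m, int gfn)
{
	return m->rmap[gfn];
}

static void toy_demo(void)
{
	struct toy_mmu m;

	toy_init(&m);
	toy_map(&m, 0, 2);                        /* gfn A = 2 */
	toy_retarget_stale(&m, 0, 5);             /* gfn B = 5 */
	assert(toy_rmap_find(&m, 5) == TOY_NONE); /* lookups on B miss the spte */
	assert(toy_rmap_remove(&m, 0) == 0);      /* removal still succeeds */
}
```

In this model the removal path stays consistent because it only consults the recorded gfn, while anything keyed on gfn B never sees the spte, which matches the reasoning above.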

Yes, it's nicer (and correct) to do it at drop_spte. Will resubmit.

However, I still have no explanation for Nicolas's BUG... ideas?


