From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: KVM: MMU: update sp->gfns on pte update path
Date: Mon, 31 Jan 2011 11:54:11 -0200
Message-ID: <20110131135411.GA18950@amt.cnet>
References: <20110125130733.GA5645@amt.cnet>
 <4D3EDEFD.9060706@redhat.com>
 <20110125171236.GA10388@amt.cnet>
 <4D3F0A02.3060902@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm <kvm@vger.kernel.org>,
	Nicolas Prochazka <prochazka.nicolas@gmail.com>
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:57214 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753566Ab1AaNyX (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 31 Jan 2011 08:54:23 -0500
Content-Disposition: inline
In-Reply-To: <4D3F0A02.3060902@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Tue, Jan 25, 2011 at 07:36:02PM +0200, Avi Kivity wrote:
> On 01/25/2011 07:12 PM, Marcelo Tosatti wrote:
> >>
> >>  Should be done by a call to kvm_mmu_page_set_gfn().  But I don't
> >>  understand how it could become inconsistent in the first place.
> >>
> >>      if (is_rmap_spte(*sptep)) {
> >>          /*
> >>           * If we overwrite a PTE page pointer with a 2MB PMD, unlink
> >>           * the parent of the now unreachable PTE.
> >>           */
> >>          if (level > PT_PAGE_TABLE_LEVEL &&
> >>              !is_large_pte(*sptep)) {
> >>              struct kvm_mmu_page *child;
> >>              u64 pte = *sptep;
> >>
> >>              child = page_header(pte & PT64_BASE_ADDR_MASK);
> >>              mmu_page_remove_parent_pte(child, sptep);
> >>              __set_spte(sptep, shadow_trap_nonpresent_pte);
> >>              kvm_flush_remote_tlbs(vcpu->kvm);
> >>          } else if (pfn != spte_to_pfn(*sptep)) {
> >>              pgprintk("hfn old %llx new %llx\n",
> >>                   spte_to_pfn(*sptep), pfn);
> >>              drop_spte(vcpu->kvm, sptep, shadow_trap_nonpresent_pte);
> >>              kvm_flush_remote_tlbs(vcpu->kvm);
> >>          } else
> >>              was_rmapped = 1;
> >>      }
> >>
> >>  If we set was_rmapped, that means rmap_add() was previously called
> >>  for this spte/gfn/pfn pair, and all that changes is permissions, no?
> >
> >What if pfn is the same but gfn differs?
> 
> Could be.  Any way to verify if this was the case?
> 
> Isn't it nicer to have it detected by the test above and do the
> drop_spte()/kvm_flush_remote_tlbs() thing instead?

It could not be the case. If spte is updated to point to a new gfn, and
rmap is not updated:

1. rmap[A] = spte
   sp->gfns[i] = A
   spte points to gfn A

2. rmap[A] = spte
   sp->gfns[i] = A
   spte points to gfn B

rmap_remove(spte) will succeed (as in not crash). In case gfn A's slot
is removed, all shadow pages will be destroyed. So what can fail from
this point on are operations on gfn B such as rmap_write_protect(B).
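
To make the two states above concrete, here is a rough userspace toy
model, not the kernel code. Every toy_* name is made up for
illustration, the rmap is simplified to one spte per gfn, and it
assumes removal looks up the gfn through sp->gfns[i] rather than
through the spte itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NGFNS 4                     /* hypothetical tiny guest */

typedef uint64_t gfn_t;

/* Toy spte: only the gfn it points to plus two flag bits. */
struct toy_spte {
	gfn_t gfn;
	bool present;
	bool writable;
};

/* Toy shadow page with one slot; gfns[0] stands in for sp->gfns[i]. */
struct toy_sp {
	gfn_t gfns[1];
	struct toy_spte spte;
};

/* Toy reverse map: at most one spte per gfn (the real rmap is a chain). */
static struct toy_spte *toy_rmap[TOY_NGFNS];

/* State 1: rmap[A] = spte, sp->gfns[i] = A, spte points to gfn A. */
static void toy_rmap_add(struct toy_sp *sp, gfn_t gfn)
{
	assert(gfn < TOY_NGFNS);
	sp->gfns[0] = gfn;
	sp->spte.gfn = gfn;
	sp->spte.present = true;
	sp->spte.writable = true;
	toy_rmap[gfn] = &sp->spte;
}

/* State 2: spte now points to gfn B, rmap and sp->gfns left untouched. */
static void toy_retarget_without_rmap_update(struct toy_sp *sp, gfn_t gfn)
{
	sp->spte.gfn = gfn;
}

/* Removal finds the chain via sp->gfns (assumption), so it still works. */
static bool toy_rmap_remove(struct toy_sp *sp)
{
	gfn_t gfn = sp->gfns[0];

	if (toy_rmap[gfn] != &sp->spte)
		return false;
	toy_rmap[gfn] = NULL;
	return true;
}

/* Write-protect whatever rmap[gfn] reaches; returns sptes changed. */
static int toy_rmap_write_protect(gfn_t gfn)
{
	struct toy_spte *spte = toy_rmap[gfn];

	if (!spte || !spte->writable)
		return 0;
	spte->writable = false;
	return 1;
}
```

In this model, after toy_rmap_add(&sp, A) followed by
toy_retarget_without_rmap_update(&sp, B), toy_rmap_write_protect(B)
finds nothing and the spte stays writable, while toy_rmap_remove(&sp)
still succeeds because rmap[A] and sp->gfns[i] agree with each other.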

Yes, it's nicer (and correct) to do it at drop_spte. Will resubmit.

However, I still have no explanation for Nicolas' BUGs... ideas?