From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marcelo Tosatti Subject: Re: [patch 02/10] KVM: MMU: move local TLB flush to mmu_set_spte Date: Sat, 20 Sep 2008 21:10:46 -0300 Message-ID: <20080921001046.GA8593@dmt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: kvm-devel , "David S. Ahern" To: avi@redhat.com Return-path: Received: from mx2.redhat.com ([66.187.237.31]:50961 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751578AbYIUAMH (ORCPT ); Sat, 20 Sep 2008 20:12:07 -0400 Content-Disposition: inline Sender: kvm-owner@vger.kernel.org List-ID: On Fri, Sep 19, 2008 at 05:21:09PM -0700, Avi Kivity wrote: > Marcelo Tosatti wrote: >> Since the sync page path can collapse flushes. >> >> Also only flush if the spte was writable before. >> >> Signed-off-by: Marcelo Tosatti >> >> @@ -1241,9 +1239,12 @@ static void mmu_set_spte(struct kvm_vcpu >> } >> } >> if (set_spte(vcpu, shadow_pte, pte_access, user_fault, write_fault, >> - dirty, largepage, gfn, pfn, speculative)) >> + dirty, largepage, gfn, pfn, speculative)) { >> if (write_fault) >> *ptwrite = 1; >> + if (was_writeble) >> + kvm_x86_ops->tlb_flush(vcpu); >> + } >> > > I think we had cases where the spte.pfn contents changed, for example > when a large page was replaced by a normal page, True. And the TLB is not flushed now for large->normal replace, in case the pte thats faulting is read-only. The local (and remote) TLB's must be flushed on large->normal replace. (BTW the largepage patch is wrong, will reply to that soon). > and also: > > } else if (pfn != spte_to_pfn(*shadow_pte)) { That one is likely to crash the guest anyway, so I don't see the need for a flush there: > > Did you find out what's causing the errors in the first place (if > > zap is not used)? It worries me greatly. > Yes, the problem is that the rmap code does not handle the qemu > process > mappings from vanishing while there is a present rmap. If that > happens, > and there is a fault for a gfn whose qemu mapping has been removed, a > different physical zero page will be allocated: > > rmap a -> gfn 0 -> physical host page 0 > mapping for gfn 0 gets removed > guest faults in gfn 0 through the same pte "chain" > rmap a -> gfn 0 -> physical host page 1 > > When instantiating the shadow mapping for the second time, the > "is_rmap_pte" check succeeds, so we release the reference grabbed by > gfn_to_page() at mmu_set_spte(). We now have a shadow mapping > pointing > to a physical page without having an additional reference on that > page. > > The following makes the host not crash under such a condition, but > the condition itself is invalid leading to inconsistent state on the > guest. > So IMHO it shouldnt be allowed to happen in the first place.