From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: [PATCH v2 3/5] KVM: Flush TLB in mmu notifier without holding mmu_lock
Date: Mon, 21 May 2012 18:07:40 -0300
Message-ID: <20120521210740.GB27076@amt.cnet>
References: <1337250284-18607-1-git-send-email-avi@redhat.com> <1337250284-18607-3-git-send-email-avi@redhat.com> <20120521205850.GA27076@amt.cnet>
Cc: Xiao Guangrong, Takuya Yoshikawa, kvm@vger.kernel.org
To: Avi Kivity
In-Reply-To: <20120521205850.GA27076@amt.cnet>

On Mon, May 21, 2012 at 05:58:50PM -0300, Marcelo Tosatti wrote:
> On Thu, May 17, 2012 at 01:24:42PM +0300, Avi Kivity wrote:
> > Signed-off-by: Avi Kivity
> > ---
> >  virt/kvm/kvm_main.c | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 585ab45..9f6d15d 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -302,11 +302,11 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
> >  	kvm->mmu_notifier_seq++;
> >  	if (kvm_unmap_hva(kvm, address))
> >  		kvm_mark_tlb_dirty(kvm);
> > -	/* we've to flush the tlb before the pages can be freed */
> > -	kvm_cond_flush_remote_tlbs(kvm);
> > -
> >  	spin_unlock(&kvm->mmu_lock);
> >  	srcu_read_unlock(&kvm->srcu, idx);
> > +
> > +	/* we've to flush the tlb before the pages can be freed */
> > +	kvm_cond_flush_remote_tlbs(kvm);
> >  }
> 
> There are still sites that assumed implicitly that acquiring mmu_lock
> guarantees that sptes and remote TLBs are in sync.
> Example:
> 
> void kvm_mmu_zap_all(struct kvm *kvm)
> {
> 	struct kvm_mmu_page *sp, *node;
> 	LIST_HEAD(invalid_list);
> 
> 	spin_lock(&kvm->mmu_lock);
> restart:
> 	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
> 		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> 			goto restart;
> 
> 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
> 	spin_unlock(&kvm->mmu_lock);
> }
> 
> kvm_mmu_commit_zap_page only flushes if the TLB was dirtied by this
> context, not before. kvm_mmu_unprotect_page is similar.

It's tempting to say "let's fix kvm_mmu_commit_zap_page by always
conditionally flushing and be done". But some variation of what's
suggested below is safer (no need to think about all this every time
something touches mmu_lock).

> In general:
> 
> context 1				context 2
> 
> lock(mmu_lock)
> modify spte
> mark_tlb_dirty
> unlock(mmu_lock)
> 					lock(mmu_lock)
> 					read spte
> 					make a decision based on spte value
> 					unlock(mmu_lock)
> flush remote TLBs
> 
> Is scary.
> 
> Perhaps have a rule that says:
> 
> 1) Conditionally flush remote TLB after acquiring mmu_lock,
>    before anything (even perhaps inside the lock macro).
> 2) Except special cases where it is clear that this is not
>    necessary.