From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: [RFC] KVM: x86: conditionally acquire/release slots_lock on entry/exit Date: Sun, 13 Sep 2009 18:42:49 +0300 Message-ID: <4AAD12F9.5020601@redhat.com> References: <20090827012000.762063112@localhost.localdomain> <20090827155450.GA6312@amt.cnet> <4A96B404.5010500@redhat.com> <20090827225940.GA13571@amt.cnet> <4A977E3C.7050604@redhat.com> <20090910223003.GA658@amt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm@vger.kernel.org, Gleb Natapov , "Paul E. McKenney" To: Marcelo Tosatti Return-path: Received: from mx1.redhat.com ([209.132.183.28]:44473 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750869AbZIMPmt (ORCPT ); Sun, 13 Sep 2009 11:42:49 -0400 In-Reply-To: <20090910223003.GA658@amt.cnet> Sender: kvm-owner@vger.kernel.org List-ID: On 09/11/2009 01:30 AM, Marcelo Tosatti wrote: > >> We don't need to stop vcpus, just kick them out of guest mode to let >> them notice the new data. SRCU does that well. >> > Two problems: > > 1. The removal of memslots/aliases and zapping of mmu (to remove any > shadow pages with stale sp->gfn) must be atomic from the POV of the > pagefault path. Otherwise something like this can happen: > > fault path set_memory_region > srcu_read_lock() > walk_addr fetches and validates > table_gfns > delete memslot > synchronize_srcu > > fetch creates shadow > srcu_read_unlock() > page with nonexistant sp->gfn > I think synchronize_srcu() will be deferred until the fault path is complete (and srcu_read_unlock() runs). Copying someone who knows for sure. > OR > > mmu_alloc_roots path set_memory_region > srcu_read_lock() > > delete memslot > root_gfn = vcpu->arch.cr3<< PAGE_SHIFT > mmu_check_root(root_gfn) synchronize_rcu > kvm_mmu_get_page() > srcu_read_unlock() > kvm_mmu_zap_all > Ditto, srcu_read_lock() protects us. > Accesses between kvm_mmu_get_page and kvm_mmu_zap_all window can see > shadow pages with stale gfn. > > But, if you still think its worthwhile to use RCU, at least handling > gfn_to_memslot / unalias_gfn errors _and_ adding mmu_notifier_retry > invalidation to set_memory_region path will be necessary (so that > gfn_to_pfn validation, in the fault path, is discarded in case > of memslot/alias update). > It really is worthwhile to reuse complex infrastructure instead of writing new infrastructure. > 2. Another complication is that memslot creation and kvm_iommu_map_pages > are not atomic. > > create memslot > synchronize_srcu > <----- vcpu grabs gfn reference without > iommu mapping. > kvm_iommu_map_pages > > Which can be solved by changing kvm_iommu_map_pages (and new gfn_to_pfn > helper) to use base_gfn,npages,hva information from somewhere else other > than visible kvm->memslots (so that when the slot becomes visible its > already iommu mapped). > Yes. It can accept a memslots structure instead of deriving it from kvm->memslots. Then we do a rcu_assign_pointer() to switch the tables. > So it appears to me using RCU introduces more complications / subtle > details than mutual exclusion here. The new request bit which the > original patch introduces is limited to enabling/disabling conditional > acquision of slots_lock (calling it a "new locking protocol" is unfair) > to improve write acquision latency. > It's true that it is not a new locking protocol. But I feel it is worthwhile to try to use rcu for this; at least it will make it easier for newcomers (provided they understand rcu). -- error compiling committee.c: too many arguments to function