Date: Tue, 26 Nov 2013 18:38:57 +0200
From: Gleb Natapov
To: Avi Kivity
Cc: "Huangweidong (C)", KVM, "Michael S. Tsirkin", "Zhanghaoyu (A)",
	Luonengjun, "qemu-devel@nongnu.org", Zanghongyong, Avi Kivity,
	Paolo Bonzini, "Jinxin (F)"
Subject: Re: [Qemu-devel] [RFC] create a single workqueue for each vm
	to update vm irq routing table
Message-ID: <20131126163857.GE20352@redhat.com>
In-Reply-To: <5294CC03.7050801@cloudius-systems.com>
References: <52949847.6020908@redhat.com> <5294A68F.6060301@redhat.com>
	<5294B461.5000405@redhat.com> <5294B634.4050801@cloudius-systems.com>
	<20131126150357.GA20352@redhat.com> <5294BC3B.6070902@redhat.com>
	<20131126162414.GC20352@redhat.com> <5294CC03.7050801@cloudius-systems.com>

On Tue, Nov 26, 2013 at 06:27:47PM +0200, Avi Kivity wrote:
> On 11/26/2013 06:24 PM, Gleb Natapov wrote:
> >On Tue, Nov 26, 2013 at 04:20:27PM +0100, Paolo Bonzini wrote:
> >>On 26/11/2013 16:03, Gleb Natapov wrote:
> >>>>>>>>>I understood the proposal was also to eliminate the synchronize_rcu(),
> >>>>>>>>>so while new interrupts would see the new routing table, interrupts
> >>>>>>>>>already in flight could pick up the old one.
> >>>>>>>Isn't that always the case with RCU?  (See my answer above: "the vcpus
> >>>>>>>already see the new routing table after the rcu_assign_pointer that is
> >>>>>>>in kvm_irq_routing_update").
> >>>>>With synchronize_rcu(), you have the additional guarantee that any
> >>>>>parallel accesses to the old routing table have completed.  Since we
> >>>>>also trigger the irq from rcu context, you know that after
> >>>>>synchronize_rcu() you won't get any interrupts to the old
> >>>>>destination (see kvm_set_irq_inatomic()).
> >>>We do not have this guarantee for other vcpus that do not call
> >>>synchronize_rcu().  They may still use an outdated routing table while
> >>>the vcpu or iothread that performed the table update sits in
> >>>synchronize_rcu().
> >>Avi's point is that, after the VCPU resumes execution, you know that no
> >>interrupt will be sent to the old destination because
> >>kvm_set_msi_inatomic (and ultimately kvm_irq_delivery_to_apic_fast) is
> >>also called within the RCU read-side critical section.
> >>
> >>Without synchronize_rcu you could have
> >>
> >>    VCPU writes to routing table
> >>                                       e = entry from IRQ routing table
> >>    kvm_irq_routing_update(kvm, new);
> >>    VCPU resumes execution
> >>                                       kvm_set_msi_irq(e, &irq);
> >>                                       kvm_irq_delivery_to_apic_fast();
> >>
> >>where the entry is stale but the VCPU has already resumed execution.
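For reference, the RCU usage being debated looks roughly like the
following C sketch.  It is a simplified illustration, not the actual
kvm code: the struct definitions are elided, and deliver_via() is a
hypothetical stand-in for the delivery path.

    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    /* Writer side, shaped like kvm_irq_routing_update(): publish the
     * new table, then wait until no reader can still hold the old one. */
    void update_routing(struct kvm *kvm, struct kvm_irq_routing_table *new)
    {
            struct kvm_irq_routing_table *old = kvm->irq_routing;

            rcu_assign_pointer(kvm->irq_routing, new);
            synchronize_rcu();      /* all readers of "old" are done */
            kfree(old);             /* only now is freeing "old" safe */
    }

    /* Reader side, shaped like kvm_set_irq_inatomic(): a reader that
     * entered the critical section before the update still sees, and
     * delivers through, the old table. */
    int route_and_deliver(struct kvm *kvm, int gsi)
    {
            struct kvm_irq_routing_table *table;
            int ret;

            rcu_read_lock();
            table = rcu_dereference(kvm->irq_routing);
            ret = deliver_via(table, gsi);  /* hypothetical helper */
            rcu_read_unlock();
            return ret;
    }

Dropping the synchronize_rcu() keeps the publication (new readers see
the new table immediately) but removes the completion guarantee: a
reader already inside rcu_read_lock() can still deliver through the old
table after the writer has moved on, which is exactly the stale-delivery
window shown in the interleaving above.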
> >So how is it different from what we have now:
> >
> >disable_irq()
> >VCPU writes to routing table
> >                                     e = entry from IRQ routing table
> >                                     kvm_set_msi_irq(e, &irq);
> >                                     kvm_irq_delivery_to_apic_fast();
> >kvm_irq_routing_update(kvm, new);
> >synchronize_rcu()
> >VCPU resumes execution
> >enable_irq()
> >receive stale irq
> >
>
> Suppose the guest did not disable_irq() and enable_irq(), but
> instead had a pci read where you have the enable_irq().  After the
> read you cannot have a stale irq (assuming the read flushes the irq
> all the way to the APIC).
There may still be a race between the pci read and an MSI being
registered in the IRR.  I do not believe such a read can undo IRR
changes.

--
			Gleb.
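For reference, the IRR point in the last paragraph, as a C sketch
(hypothetical types and helpers, not the actual kvm code): delivering
an MSI ultimately sets a bit in the destination APIC's IRR, and nothing
on a pci-read path clears that bit, so a read can order in-flight
delivery but cannot retract one that has already landed.

    #include <linux/bitops.h>

    struct lapic {
            unsigned long irr[4];   /* 256 pending-vector bits (IRR) */
    };

    /* What kvm_irq_delivery_to_apic_fast() boils down to for a
     * fixed-destination MSI: mark the vector pending, kick the vcpu. */
    static void deliver_msi(struct lapic *apic, unsigned int vector)
    {
            set_bit(vector, apic->irr);  /* vector is now pending */
            kick_vcpu(apic);             /* hypothetical wakeup helper */
    }

Once set_bit() has run against an entry from a stale routing table, the
vector stays pending at the old destination; a subsequent pci read can
only guarantee ordering for deliveries that have not yet happened.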