From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42769)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1VlLUV-0002NB-2A for qemu-devel@nongnu.org;
	Tue, 26 Nov 2013 11:28:06 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1VlLUN-0003wm-Jt for qemu-devel@nongnu.org;
	Tue, 26 Nov 2013 11:27:59 -0500
Received: from mail-bk0-f53.google.com ([209.85.214.53]:34101)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1VlLUN-0003wT-Dh for qemu-devel@nongnu.org;
	Tue, 26 Nov 2013 11:27:51 -0500
Received: by mail-bk0-f53.google.com with SMTP id na10so2668354bkb.40 for ;
	Tue, 26 Nov 2013 08:27:50 -0800 (PST)
Message-ID: <5294CC03.7050801@cloudius-systems.com>
Date: Tue, 26 Nov 2013 18:27:47 +0200
From: Avi Kivity
MIME-Version: 1.0
References: <52949847.6020908@redhat.com> <5294A68F.6060301@redhat.com>
	<5294B461.5000405@redhat.com> <5294B634.4050801@cloudius-systems.com>
	<20131126150357.GA20352@redhat.com> <5294BC3B.6070902@redhat.com>
	<20131126162414.GC20352@redhat.com>
In-Reply-To: <20131126162414.GC20352@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC] create a single workqueue for each vm to
	update vm irq routing table
To: Gleb Natapov, Paolo Bonzini
Cc: "Huangweidong (C)", KVM, "Michael S. Tsirkin", "Zhanghaoyu (A)",
	Luonengjun, "qemu-devel@nongnu.org", Zanghongyong, Avi Kivity,
	"Jinxin (F)"

On 11/26/2013 06:24 PM, Gleb Natapov wrote:
> On Tue, Nov 26, 2013 at 04:20:27PM +0100, Paolo Bonzini wrote:
>> On 26/11/2013 16:03, Gleb Natapov wrote:
>>>>>>>>> I understood the proposal was also to eliminate the
>>>>>>>>> synchronize_rcu(), so while new interrupts would see the new
>>>>>>>>> routing table, interrupts already in flight could pick up the
>>>>>>>>> old one.
>>>>>>> Isn't that always the case with RCU?  (See my answer above: "the
>>>>>>> vcpus already see the new routing table after the
>>>>>>> rcu_assign_pointer that is in kvm_irq_routing_update").
>>>>> With synchronize_rcu(), you have the additional guarantee that any
>>>>> parallel accesses to the old routing table have completed.  Since
>>>>> we also trigger the irq from rcu context, you know that after
>>>>> synchronize_rcu() you won't get any interrupts to the old
>>>>> destination (see kvm_set_irq_inatomic()).
>>> We do not have this guarantee for other vcpus that do not call
>>> synchronize_rcu().  They may still use the outdated routing table
>>> while the vcpu or iothread that performed the table update sits in
>>> synchronize_rcu().
>> Avi's point is that, after the VCPU resumes execution, you know that
>> no interrupt will be sent to the old destination because
>> kvm_set_msi_inatomic (and ultimately kvm_irq_delivery_to_apic_fast)
>> is also called within the RCU read-side critical section.
>>
>> Without synchronize_rcu you could have
>>
>>     VCPU writes to routing table
>>                                        e = entry from IRQ routing table
>>     kvm_irq_routing_update(kvm, new);
>>     VCPU resumes execution
>>                                        kvm_set_msi_irq(e, &irq);
>>                                        kvm_irq_delivery_to_apic_fast();
>>
>> where the entry is stale but the VCPU has already resumed execution.
>>
> So how is it different from what we have now:
>
>     disable_irq()
>     VCPU writes to routing table
>                                        e = entry from IRQ routing table
>                                        kvm_set_msi_irq(e, &irq);
>                                        kvm_irq_delivery_to_apic_fast();
>     kvm_irq_routing_update(kvm, new);
>     synchronize_rcu()
>     VCPU resumes execution
>     enable_irq()
>     receive stale irq
>

Suppose the guest did not disable_irq() and enable_irq(), but instead
had a PCI read where you have the enable_irq().  After the read you
cannot have a stale irq (assuming the read flushes the irq all the way
to the APIC).
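
For reference, a minimal sketch of the publish-then-wait pattern the
thread is debating.  This is simplified and hypothetical, not the
actual KVM code: the type irq_routing_table and the functions
update_routing()/inject() are made up for illustration, while the RCU
primitives are the real kernel API.

    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    struct irq_routing_table;  /* opaque in this sketch */

    static struct irq_routing_table __rcu *irq_rt;

    /* Updater side (the routing-table update path): */
    void update_routing(struct irq_routing_table *new)
    {
        struct irq_routing_table *old;

        old = rcu_dereference_protected(irq_rt, 1);
        rcu_assign_pointer(irq_rt, new);  /* new readers see the new table */
        synchronize_rcu();                /* wait out readers of the old one */
        kfree(old);                       /* no reader can still hold 'old' */
    }

    /* Reader side (the interrupt-injection path): */
    void inject(int gsi)
    {
        struct irq_routing_table *rt;

        rcu_read_lock();
        rt = rcu_dereference(irq_rt);
        /* ... look up gsi in rt and deliver to the APIC ... */
        rcu_read_unlock();
    }

Dropping synchronize_rcu(), as proposed, still lets new readers see the
new table immediately after rcu_assign_pointer(), but a reader that
fetched its entry before the assignment may deliver through the stale
entry after the VCPU has resumed, which is exactly Paolo's interleaving
above.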
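
Avi's closing point, as a guest-side sketch.  Everything here is
hypothetical (the register offset, the function, the device layout);
the one load-bearing assumption, stated in the thread, is that a read
from the device flushes any in-flight interrupt all the way to the APIC:

    #include <linux/io.h>
    #include <linux/types.h>

    #define MSI_DATA_REG 0x5c  /* hypothetical device register offset */

    void guest_retarget_msi(void __iomem *regs, u32 new_data)
    {
        /* No disable_irq()/enable_irq() bracket here. */
        writel(new_data, regs + MSI_DATA_REG);  /* reprogram the MSI; the
                                                   host rebuilds its routing
                                                   table in response */
        (void)readl(regs + MSI_DATA_REG);       /* assumed to flush any
                                                   in-flight irq to the APIC;
                                                   no stale irq after this */
    }

With the read as the synchronization point, the stale irq that arrives
after enable_irq() in Gleb's trace cannot occur past the readl().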