From mboxrd@z Thu Jan 1 00:00:00 1970 From: Gleb Natapov Subject: Re: [Qemu-devel] [RFC] create a single workqueue for each vm to update vm irq routing table Date: Thu, 28 Nov 2013 11:20:35 +0200 Message-ID: <20131128092034.GB4609@kernel.org> References: <52949847.6020908@redhat.com> <5294A68F.6060301@redhat.com> <5294B461.5000405@redhat.com> <5294B634.4050801@cloudius-systems.com> <20131126150357.GA20352@redhat.com> <5294BD21.4010701@cloudius-systems.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Avi Kivity , Gleb Natapov , Paolo Bonzini , Avi Kivity , "Huangweidong (C)" , KVM , "Michael S. Tsirkin" , "Jinxin (F)" , Luonengjun , "qemu-devel@nongnu.org" , Zanghongyong To: "Zhanghaoyu (A)" Return-path: Received: from mail-bk0-f43.google.com ([209.85.214.43]:42808 "EHLO mail-bk0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750859Ab3K1JUk (ORCPT ); Thu, 28 Nov 2013 04:20:40 -0500 Received: by mail-bk0-f43.google.com with SMTP id mz12so3672210bkb.16 for ; Thu, 28 Nov 2013 01:20:39 -0800 (PST) Content-Disposition: inline In-Reply-To: Sender: kvm-owner@vger.kernel.org List-ID: On Thu, Nov 28, 2013 at 09:14:22AM +0000, Zhanghaoyu (A) wrote: > >>>>> No, this would be exactly the same code that is running now: > >>>>> > >>>>> mutex_lock(&kvm->irq_lock); > >>>>> old = kvm->irq_routing; > >>>>> kvm_irq_routing_update(kvm, new); > >>>>> mutex_unlock(&kvm->irq_lock); > >>>>> > >>>>> synchronize_rcu(); > >>>>> kfree(old); > >>>>> return 0; > >>>>> > >>>>> Except that the kfree would run in the call_rcu kernel thread instead of > >>>>> the vcpu thread. But the vcpus already see the new routing table after > >>>>> the rcu_assign_pointer that is in kvm_irq_routing_update. > >>>>> > >>>>> I understood the proposal was also to eliminate the > >>>>> synchronize_rcu(), so while new interrupts would see the new > >>>>> routing table, interrupts already in flight could pick up the old one. > >>>> Isn't that always the case with RCU? (See my answer above: "the > >>>> vcpus already see the new routing table after the rcu_assign_pointer > >>>> that is in kvm_irq_routing_update"). > >>> With synchronize_rcu(), you have the additional guarantee that any > >>> parallel accesses to the old routing table have completed. Since we > >>> also trigger the irq from rcu context, you know that after > >>> synchronize_rcu() you won't get any interrupts to the old destination > >>> (see kvm_set_irq_inatomic()). > >> We do not have this guaranty for other vcpus that do not call > >> synchronize_rcu(). They may still use outdated routing table while a > >> vcpu or iothread that performed table update sits in synchronize_rcu(). > >> > > > >Consider this guest code: > > > > write msi entry, directing the interrupt away from this vcpu > > nop > > memset(&idt, 0, sizeof(idt)); > > > >Currently, this code will never trigger a triple fault. With the change to call_rcu(), it may. > > > >Now it may be that the guest does not expect this to work (PCI writes are posted; and interrupts can be delayed indefinitely by the pci fabric), > >but we don't know if there's a path that guarantees the guest something that we're taking away with this change. > > In native environment, if a CPU's LAPIC's IRR and ISR have been pending many interrupts, then OS perform zeroing this CPU's IDT before receiving interrupts, > will the same problem happen? > This is just an example. OS can ensure that there is no other pending interrupt by some other means. -- Gleb.