From: Gleb Natapov <gleb@redhat.com>
To: Avi Kivity <avi@cloudius-systems.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Avi Kivity <avi.kivity@gmail.com>,
"Huangweidong (C)" <weidong.huang@huawei.com>,
KVM <kvm@vger.kernel.org>, "Michael S. Tsirkin" <mst@redhat.com>,
"Jinxin (F)" <jinxin712@huawei.com>,
"Zhanghaoyu (A)" <haoyu.zhang@huawei.com>,
Luonengjun <luonengjun@huawei.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Zanghongyong <zanghongyong@huawei.com>
Subject: Re: [Qemu-devel] [RFC] create a single workqueue for each vm to update vm irq routing table
Date: Tue, 26 Nov 2013 18:38:57 +0200
Message-ID: <20131126163857.GE20352@redhat.com>
In-Reply-To: <5294CC03.7050801@cloudius-systems.com>
On Tue, Nov 26, 2013 at 06:27:47PM +0200, Avi Kivity wrote:
> On 11/26/2013 06:24 PM, Gleb Natapov wrote:
> >On Tue, Nov 26, 2013 at 04:20:27PM +0100, Paolo Bonzini wrote:
> >>On 26/11/2013 16:03, Gleb Natapov wrote:
> >>>>>>>>>I understood the proposal was also to eliminate the synchronize_rcu(),
> >>>>>>>>>so while new interrupts would see the new routing table, interrupts
> >>>>>>>>>already in flight could pick up the old one.
> >>>>>>>Isn't that always the case with RCU? (See my answer above: "the vcpus
> >>>>>>>already see the new routing table after the rcu_assign_pointer that is
> >>>>>>>in kvm_irq_routing_update").
> >>>>>With synchronize_rcu(), you have the additional guarantee that any
> >>>>>parallel accesses to the old routing table have completed. Since we
> >>>>>also trigger the irq from rcu context, you know that after
> >>>>>synchronize_rcu() you won't get any interrupts to the old
> >>>>>destination (see kvm_set_irq_inatomic()).
> >>>We do not have this guarantee for other vcpus that do not call
> >>>synchronize_rcu(). They may still use an outdated routing table while a vcpu
> >>>or iothread that performed the table update sits in synchronize_rcu().
> >>Avi's point is that, after the VCPU resumes execution, you know that no
> >>interrupt will be sent to the old destination because
> >>kvm_set_msi_inatomic (and ultimately kvm_irq_delivery_to_apic_fast) is
> >>also called within the RCU read-side critical section.
> >>
> >>Without synchronize_rcu you could have
> >>
> >> VCPU writes to routing table
> >> e = entry from IRQ routing table
> >> kvm_irq_routing_update(kvm, new);
> >> VCPU resumes execution
> >> kvm_set_msi_irq(e, &irq);
> >> kvm_irq_delivery_to_apic_fast();
> >>
> >>where the entry is stale but the VCPU has already resumed execution.
> >>
> >So how is it different from what we have now:
> >
> >disable_irq()
> >VCPU writes to routing table
> > e = entry from IRQ routing table
> > kvm_set_msi_irq(e, &irq);
> > kvm_irq_delivery_to_apic_fast();
> >kvm_irq_routing_update(kvm, new);
> >synchronize_rcu()
> >VCPU resumes execution
> >enable_irq()
> >receive stale irq
> >
> >
>
> Suppose the guest did not disable_irq() and enable_irq(), but
> instead had a pci read where you have the enable_irq(). After the
> read you cannot have a stale irq (assuming the read flushes the irq
> all the way to the APIC).
There may still be a race between the PCI read and an MSI already
registered in the IRR. I do not believe such a read can undo IRR changes.
--
Gleb.
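
The two interleavings quoted above can be replayed in a small user-space
model. This is only a sketch under loud assumptions: struct route,
struct vcpu, lookup_route(), deliver() and publish() are hypothetical
stand-ins, not the kernel's structures or functions, and each "thread"
is replayed in a fixed order inside a single function, so no real RCU or
concurrency is involved. It is meant to show the ordering argument only:
without a grace period a stale entry can be delivered after the writer
has resumed, and with a grace period an interrupt latched in the IRR
before the update is still pending afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the kernel's data structures. */
struct route { int dest_vcpu; };
struct vcpu  { bool irr_pending; };

static struct route old_route = { .dest_vcpu = 0 };
static struct route new_route = { .dest_vcpu = 1 };
static _Atomic(struct route *) routing_table;
static struct vcpu vcpus[2];

static void reset_model(void)
{
    vcpus[0].irr_pending = false;
    vcpus[1].irr_pending = false;
    atomic_store_explicit(&routing_table, &old_route, memory_order_release);
}

/* Reader side: snapshot the current entry (plays the role of rcu_dereference). */
static struct route *lookup_route(void)
{
    return atomic_load_explicit(&routing_table, memory_order_acquire);
}

/* Writer side: publish the new table (plays the role of rcu_assign_pointer). */
static void publish(struct route *r)
{
    atomic_store_explicit(&routing_table, r, memory_order_release);
}

/* Delivery latches the interrupt in the destination's IRR. */
static void deliver(const struct route *e)
{
    assert(e->dest_vcpu >= 0 && e->dest_vcpu < 2);
    vcpus[e->dest_vcpu].irr_pending = true;
}

/* No grace period: the reader took its snapshot before the update and
 * delivers after the writer has already resumed. */
bool stale_irq_without_grace_period(void)
{
    reset_model();
    struct route *e = lookup_route();  /* reader: e = old entry          */
    publish(&new_route);               /* writer: update, does not wait  */
    bool writer_resumed = true;        /* "VCPU resumes execution"       */
    deliver(e);                        /* reader: delivers via stale e   */
    return writer_resumed && vcpus[0].irr_pending;
}

/* With a grace period: the reader finishes before the writer resumes,
 * but the bit it set in the old destination's IRR is not undone. */
bool stale_irq_latched_before_update(void)
{
    reset_model();
    struct route *e = lookup_route();
    deliver(e);                        /* reader completes entirely      */
    publish(&new_route);
    /* A synchronize_rcu() here would return: no reader is in flight.   */
    return vcpus[0].irr_pending;       /* still pending for old dest     */
}

/* Lookups that start after the update always see the new entry. */
bool new_lookup_sees_new_route(void)
{
    reset_model();
    publish(&new_route);
    return lookup_route()->dest_vcpu == 1;
}
```

Calling the three functions from a test harness or a main() of your own
should show each returning true under this model. The second case is
the one the reply above argues about: waiting for readers does not
retract an interrupt that was already latched.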