From: Gleb Natapov <gleb@minantech.com>
To: "Zhanghaoyu (A)" <haoyu.zhang@huawei.com>
Cc: Avi Kivity <avi@cloudius-systems.com>,
Gleb Natapov <gleb@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Avi Kivity <avi.kivity@gmail.com>,
"Huangweidong (C)" <weidong.huang@huawei.com>,
KVM <kvm@vger.kernel.org>, "Michael S. Tsirkin" <mst@redhat.com>,
"Jinxin (F)" <jinxin712@huawei.com>,
Luonengjun <luonengjun@huawei.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Zanghongyong <zanghongyong@huawei.com>
Subject: Re: [Qemu-devel] [RFC] create a single workqueue for each vm to update vm irq routing table
Date: Thu, 28 Nov 2013 11:20:35 +0200
Message-ID: <20131128092034.GB4609@kernel.org>
In-Reply-To: <D3E216785288A145B7BC975F83A2ED10448BBBB4@SZXEMA510-MBS.china.huawei.com>
On Thu, Nov 28, 2013 at 09:14:22AM +0000, Zhanghaoyu (A) wrote:
> >>>>> No, this would be exactly the same code that is running now:
> >>>>>
> >>>>> mutex_lock(&kvm->irq_lock);
> >>>>> old = kvm->irq_routing;
> >>>>> kvm_irq_routing_update(kvm, new);
> >>>>> mutex_unlock(&kvm->irq_lock);
> >>>>>
> >>>>> synchronize_rcu();
> >>>>> kfree(old);
> >>>>> return 0;
> >>>>>
> >>>>> Except that the kfree would run in the call_rcu kernel thread instead of
> >>>>> the vcpu thread. But the vcpus already see the new routing table after
> >>>>> the rcu_assign_pointer that is in kvm_irq_routing_update.
> >>>>>
> >>>>> I understood the proposal was also to eliminate the
> >>>>> synchronize_rcu(), so while new interrupts would see the new
> >>>>> routing table, interrupts already in flight could pick up the old one.
> >>>> Isn't that always the case with RCU? (See my answer above: "the
> >>>> vcpus already see the new routing table after the rcu_assign_pointer
> >>>> that is in kvm_irq_routing_update").
> >>> With synchronize_rcu(), you have the additional guarantee that any
> >>> parallel accesses to the old routing table have completed. Since we
> >>> also trigger the irq from rcu context, you know that after
> >>> synchronize_rcu() you won't get any interrupts to the old destination
> >>> (see kvm_set_irq_inatomic()).
> >> We do not have this guarantee for other vcpus that do not call
> >> synchronize_rcu(). They may still use an outdated routing table while
> >> the vcpu or iothread that performed the table update sits in synchronize_rcu().
> >>
> >
> >Consider this guest code:
> >
> > write msi entry, directing the interrupt away from this vcpu
> > nop
> > memset(&idt, 0, sizeof(idt));
> >
> >Currently, this code will never trigger a triple fault. With the change to call_rcu(), it may.
> >
> >Now it may be that the guest does not expect this to work (PCI writes are posted, and interrupts can be delayed indefinitely by the PCI fabric),
> >but we don't know whether there is a path that guarantees the guest something that we're taking away with this change.
>
> In a native environment, if a CPU's LAPIC has many interrupts pending in its IRR and ISR, and the OS then zeroes this CPU's IDT before those interrupts are received,
> will the same problem happen?
>
This is just an example. The OS can ensure by some other means that no other interrupts are pending.
--
Gleb.
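
The difference argued over in the quoted text can be illustrated with a
single-threaded toy model. This is only a sketch under loud assumptions: it
is not kernel code, it does not use real RCU, and every name in it (struct
vm, irq_begin, irq_finish, routing_update, stale_delivery_after_update) is
hypothetical. A "reader" snapshots the routing table before the update; the
update then either drains that reader before returning (standing in for
synchronize_rcu()) or returns at once and defers the free (standing in for
call_rcu()). It only models what the updating thread can observe after the
update returns, not what other vcpus see in the meantime.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the irq routing table: one destination vcpu. */
struct routing_table { int dest_vcpu; };

struct vm {
    struct routing_table *routing;   /* table seen by new readers */
    struct routing_table *in_flight; /* snapshot held by a pending reader */
    struct routing_table *deferred;  /* old table awaiting a deferred free */
};

/* A reader starts: it snapshots the current table (models rcu_dereference). */
static void irq_begin(struct vm *vm) { vm->in_flight = vm->routing; }

/* The reader finishes: deliver using its snapshot, then run deferred frees. */
static int irq_finish(struct vm *vm)
{
    int dest;

    assert(vm->in_flight != NULL);
    dest = vm->in_flight->dest_vcpu;
    vm->in_flight = NULL;
    free(vm->deferred);              /* free(NULL) is a no-op */
    vm->deferred = NULL;
    return dest;
}

/*
 * Publish a new table. With wait_for_readers set, drain the pending reader
 * before returning (toy synchronize_rcu). Otherwise return immediately and
 * leave the old table to be freed once the reader is done (toy call_rcu).
 */
static void routing_update(struct vm *vm, int new_dest, int wait_for_readers)
{
    struct routing_table *next = malloc(sizeof(*next));
    struct routing_table *old = vm->routing;

    if (!next)
        abort();
    next->dest_vcpu = new_dest;
    vm->routing = next;

    if (wait_for_readers) {
        if (vm->in_flight)
            (void)irq_finish(vm);    /* delivery completes before we return */
        free(old);
    } else if (vm->in_flight) {
        vm->deferred = old;          /* reader still holds the old table */
    } else {
        free(old);
    }
}

/* Returns 1 if an interrupt reached the old destination after the update
 * had already returned to its caller, 0 otherwise. */
int stale_delivery_after_update(int wait_for_readers)
{
    struct vm vm = { 0 };
    int stale = 0;

    vm.routing = malloc(sizeof(*vm.routing));
    if (!vm.routing)
        abort();
    vm.routing->dest_vcpu = 0;

    irq_begin(&vm);                            /* interrupt in flight */
    routing_update(&vm, 1, wait_for_readers);  /* redirect to vcpu 1 */
    if (vm.in_flight)
        stale = (irq_finish(&vm) == 0);        /* delivered to old vcpu 0 */

    free(vm.routing);
    return stale;
}
```

In this model stale_delivery_after_update(1) reports no late delivery, while
stale_delivery_after_update(0) reports one. That late delivery is the window
the guest example (rewrite the MSI entry, then zero the IDT) would hit.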