From: Gleb Natapov <gleb@minantech.com>
To: Avi Kivity <avi@cloudius-systems.com>
Cc: "Huangweidong (C)" <weidong.huang@huawei.com>,
	KVM <kvm@vger.kernel.org>, Gleb Natapov <gleb@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Zhanghaoyu (A)" <haoyu.zhang@huawei.com>,
	Luonengjun <luonengjun@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Zanghongyong <zanghongyong@huawei.com>,
	Avi Kivity <avi.kivity@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Jinxin (F)" <jinxin712@huawei.com>
Subject: Re: [Qemu-devel] [RFC] create a single workqueue for each vm to update vm irq routing table
Date: Thu, 28 Nov 2013 13:02:50 +0200
Message-ID: <20131128110250.GC5822@minantech.com>
In-Reply-To: <52971727.2030600@cloudius-systems.com>

On Thu, Nov 28, 2013 at 12:12:55PM +0200, Avi Kivity wrote:
> On 11/28/2013 12:11 PM, Gleb Natapov wrote:
> >On Thu, Nov 28, 2013 at 11:49:00AM +0200, Avi Kivity wrote:
> >>On 11/28/2013 11:19 AM, Gleb Natapov wrote:
> >>>On Thu, Nov 28, 2013 at 09:55:42AM +0100, Paolo Bonzini wrote:
> >>>>Il 28/11/2013 07:27, Zhanghaoyu (A) ha scritto:
> >>>>>>>Without synchronize_rcu you could have
> >>>>>>>
> >>>>>>>    VCPU writes to routing table
> >>>>>>>                                       e = entry from IRQ routing table
> >>>>>>>    kvm_irq_routing_update(kvm, new);
> >>>>>>>    VCPU resumes execution
> >>>>>>>                                       kvm_set_msi_irq(e, &irq);
> >>>>>>>                                       kvm_irq_delivery_to_apic_fast();
> >>>>>>>
> >>>>>>>where the entry is stale but the VCPU has already resumed execution.
> >>>>>>>
> >>>>>If we use call_rcu() instead of synchronize_rcu() (setting aside, for now, the problem that Gleb pointed out), should we still ensure this?
> >>>>The problem is that we should ensure this, so using call_rcu is not
> >>>>possible (even not considering the memory allocation problem).
> >>>>
> >>>Not changing current behaviour is certainly safer, but I am still not 100%
> >>>convinced we have to ensure this.
> >>>
> >>>Suppose guest does:
> >>>
> >>>1: change msi interrupt by writing to pci register
> >>>2: read the pci register to flush the write
> >>>3: zero idt
> >>>
> >>>I am pretty certain that this code can get an interrupt after step 2 on real HW,
> >>>but I cannot tell whether the guest can rely on it being delivered exactly after
> >>>the read instruction or whether it can be delayed by a couple of instructions. Seems to me
> >>>it would be fragile for an OS to depend on this behaviour. AFAIK Linux does not.
> >>>
> >>Linux is safe, it does interrupt migration from within the interrupt
> >>handler.  If you do that before the device-specific EOI, you won't
> >>get another interrupt until programming the MSI is complete.
> >>
> >>Is virtio safe? IIRC it can post multiple interrupts without guest acks.
> >>
> >>Using call_rcu() is a better solution than srcu IMO.  Less code
> >>changes, consistently faster.
> >Why not fix userspace to use KVM_SIGNAL_MSI instead?
> >
> >
> 
> Shouldn't it work with old userspace too? Maybe I misunderstood your intent.
Zhanghaoyu said that the problem mostly hurts in a real-time telecom
environment, so I am proposing a way for him to fix the problem in his
specific environment.  It will not fix older userspace, obviously, but a
kernel fix will also require a kernel update, and updating userspace
is usually easier.
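
For illustration only, the userspace side of the KVM_SIGNAL_MSI suggestion
might look roughly like the sketch below. It assumes a VM file descriptor
obtained from KVM_CREATE_VM and a kernel that advertises KVM_CAP_SIGNAL_MSI;
the helper names fill_msi() and inject_msi() are hypothetical and are not
taken from QEMU or from any patch in this thread. The idea is that the MSI
address and data are passed directly with each injection, so the irq
routing table does not have to be rewritten when the guest reprograms
the MSI.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: build a struct kvm_msi from a 64-bit MSI address
 * and a 32-bit data word. All other fields (flags, padding) are zeroed.
 */
static void fill_msi(struct kvm_msi *msi, uint64_t addr, uint32_t data)
{
	memset(msi, 0, sizeof(*msi));
	msi->address_lo = (uint32_t)(addr & 0xffffffffu);
	msi->address_hi = (uint32_t)(addr >> 32);
	msi->data = data;
}

/*
 * Hypothetical helper: inject one MSI into the VM behind vm_fd.
 * Returns the raw ioctl() result: > 0 if the MSI was delivered,
 * 0 if the guest blocked it, and -1 with errno set on error.
 */
static int inject_msi(int vm_fd, uint64_t addr, uint32_t data)
{
	struct kvm_msi msi;

	fill_msi(&msi, addr, data);
	return ioctl(vm_fd, KVM_SIGNAL_MSI, &msi);
}
```

A caller would typically check KVM_CAP_SIGNAL_MSI with KVM_CHECK_EXTENSION
once at startup and fall back to the routing-table path if it is missing.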

--
			Gleb.
