From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Huangweidong (C)" <weidong.huang@huawei.com>,
"gleb@redhat.com" <gleb@redhat.com>,
Radim Krcmar <rkrcmar@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Gonglei (Arei)" <arei.gonglei@huawei.com>,
"avi.kivity@gmail.com" <avi.kivity@gmail.com>,
"Herongguang (Stephen)" <herongguang.he@huawei.com>
Subject: Re: [Qemu-devel] [RFC] vhost: Can we change synchronize_rcu to call_rcu in vhost_set_memory() in vhost kernel module?
Date: Mon, 12 May 2014 15:53:23 +0300
Message-ID: <20140512125323.GA16846@redhat.com>
In-Reply-To: <5370C2B6.6080605@redhat.com>

On Mon, May 12, 2014 at 02:46:46PM +0200, Paolo Bonzini wrote:
> On 12/05/2014 14:12, Michael S. Tsirkin wrote:
> >On Mon, May 12, 2014 at 01:46:19PM +0200, Paolo Bonzini wrote:
> >>On 12/05/2014 13:07, Michael S. Tsirkin wrote:
> >>>On Mon, May 12, 2014 at 12:25:35PM +0200, Paolo Bonzini wrote:
> >>>>On 12/05/2014 12:18, Michael S. Tsirkin wrote:
> >>>>>On Mon, May 12, 2014 at 12:14:25PM +0200, Paolo Bonzini wrote:
> >>>>>>On 12/05/2014 12:08, Michael S. Tsirkin wrote:
> >>>>>>>On Mon, May 12, 2014 at 11:57:32AM +0200, Paolo Bonzini wrote:
> >>>>>>>>Perhaps we can check for cases where only the address is changing,
> >>>>>>>>and poke at an existing struct kvm_kernel_irq_routing_entry without
> >>>>>>>>doing any RCU synchronization?
> >>>>>>>
> >>>>>>>I suspect interrupts can get lost then: e.g. if the address didn't match any
> >>>>>>>CPUs and now it matches some. No?
> >>>>>>
> >>>>>>Can you explain the problem more verbosely? :)
> >>>>>>
> >>>>>>Multiple writers would still be protected by the mutex, so you
> >>>>>>cannot have an "in-place update" writer racing with a "copy the
> >>>>>>array" writer.
> >>>>>
> >>>>>I am not really sure.
> >>>>>I'm worried about reader vs. writer.
> >>>>>If a reader sees a stale MSI value, the MSI will be sent to the wrong
> >>>>>address.
> >>>>
> >>>>That shouldn't happen on any cache-coherent system, no?
> >>>>
> >>>>Or at least, it shouldn't become any worse than what can already
> >>>>happen with RCU.
> >>>
> >>>Meaning guest must do some synchronization anyway?
> >>
> >>Yes, I think so. The simplest would be to mask the interrupt around
> >>MSI configuration changes. Radim was looking at a similar bug.
> >
> >Was this with classic assignment or with vfio?
>
> No, it was with QEMU emulated devices (AHCI). Not this path, but it
> did involve MSI configuration changes while the interrupt was
> unmasked in the device.
>
> >>He couldn't replicate on bare metal,
> >>but I don't see why it shouldn't
> >>be possible there too.
> >
> >I doubt Linux does anything tricky here, so
> >I'm not talking about something a Linux guest does;
> >I'm talking about an abstract guarantee of the
> >APIC.
>
> APIC or PCI? The chipset is involved here, not the APICs (the APICs
> just see interrupt transactions).
>
> >If bare metal causes a synchronisation point
> >and we don't, at some point Linux will rely on it and
> >this will bite us.
>
> Yes, and I agree.
>
> It's complicated; on one hand I'm not sure we can entirely fix this,
> on the other hand it may be that the guest cannot have too high
> expectations.
>
> The invalid scenario is an interrupt happening before the MSI
> reconfiguration and using the new value. It's invalid because MSI
> reconfiguration is a non-posted write and must push previously
> posted writes. But we cannot do that at all! QEMU has the big lock
> that implicitly orders transactions, but this does not apply to
> high-performance users of irqfd (vhost, VFIO, even QEMU's own
> virtio-blk dataplane). Even if we removed all RCU uses for the
> routing table, we don't have a "bottleneck" that orders transactions
> and removes this invalid scenario.
>
> At the same time, it's obviously okay if an interrupt happens after
> the MSI reconfiguration and uses the old value. That's a posted
> write passing an earlier non-posted write, which is allowed. This
> is the main reason why I believe that masking/unmasking is required
> on bare metal. In addition, if the masking/unmasking is done with a
> posted write (MMIO BAR), the guest needs to do a read for
> synchronization before it unmasks the interrupt.
>
> In any case, whether writes synchronize with RCU or bypass it
> doesn't change the picture. In either case, writes are ordered
> against each other but not against reads. RCU does nothing except
> preventing dangling pointer accesses.
>
> Paolo

This is the only part I don't get.
RCU will make sure no VCPUs are running, won't it?
So it's a kind of full barrier.
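
To make the question concrete, here is a userspace sketch of the
copy-and-publish update as I understand it. All names (struct route,
route_update, wait_for_readers) are invented for illustration, and
wait_for_readers() is an empty stub standing in for synchronize_rcu();
this is not the kvm or vhost code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a routing entry; not the kernel's struct. */
struct route {
    uint64_t msi_addr;
    uint32_t msi_data;
};

static _Atomic(struct route *) cur_route;

/* Hypothetical stub. The kernel's synchronize_rcu() blocks until every
 * read-side critical section that began before the call has finished. */
static void wait_for_readers(void)
{
}

/* Reader: loads the current pointer once and uses only that copy. */
uint64_t route_read_addr(void)
{
    struct route *r = atomic_load_explicit(&cur_route, memory_order_acquire);

    return r ? r->msi_addr : 0;
}

/* Writer: callers are assumed to be serialised by a mutex (omitted). */
int route_update(uint64_t addr, uint32_t data)
{
    struct route *fresh = malloc(sizeof(*fresh));
    struct route *old;

    if (!fresh)
        return -1;
    fresh->msi_addr = addr;
    fresh->msi_data = data;

    /* Publish the new entry; readers see either the old or the new one. */
    old = atomic_exchange_explicit(&cur_route, fresh, memory_order_acq_rel);

    wait_for_readers();     /* with real RCU, no reader can still hold
                             * 'old' once this returns */
    free(old);
    return 0;
}
```

What I am asking is whether the wait in route_update() orders anything
beyond the lifetime of the old entry.
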
--
MST