From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 00/38] Delay destruction of memory regions to instance_finalize
Date: Tue, 17 Sep 2013 22:51:23 +0300
Message-ID: <20130917195123.GB21419@redhat.com>
In-Reply-To: <5238A859.9040705@redhat.com>

On Tue, Sep 17, 2013 at 09:07:05PM +0200, Paolo Bonzini wrote:
> Il 17/09/2013 19:26, Michael S. Tsirkin ha scritto:
> > On Tue, Sep 17, 2013 at 07:16:15PM +0200, Paolo Bonzini wrote:
> >> Il 17/09/2013 19:07, Michael S. Tsirkin ha scritto:
> >>> After memory_region_del_subregion returns,
> >>> it's a promise that there will not be accesses
> >>> to the region.
> >>
> >> It's racy anyway. You can have memory_region_del_subregion happen one
> >> clock cycle after the other (physical) CPU has checked that there
> >> will not be accesses to the region.
> >
> > Fix it then. Stick synchronize_rcu() in memory_region_del_subregion.
>
> Easier said than done since you shouldn't do synchronize_rcu() with the
> BQL taken.
>
> > Are you trying to convince me there's no way to have
> > synchronous APIs in presence of RCU?
>
> No, I'm not. But in the presence of coarse locks it's harder to use
> synchronous APIs. Sticking to past QEMU experiences, the AIO
> synchronous cancellation API is a nightmare because the monitor cannot
> run while I/O is cancelled.
>
> >> A real bus has a "big PCI lock" (there can be only one transaction at a
> >> time), which is exactly what we want to get rid of.
> >
> > Not really, in a bridged setup transactions on the
> > secondary bus are handled in parallel with transactions
> > on the primary bus.
>
> That's not what you're suggesting though. A device removal on the
> secondary bus, triggered by configuration space access on the primary
> bus, wouldn't complete the configuration space access on the primary bus
> until the removal on the secondary bus is complete. So you still have a
> big lock around transactions on each bus.

I'm not really concerned about device removal. We can work it out.
A much more interesting case is e.g. disabling memory:

    config write (disable memory)
    read (flush out outstanding writes)
    write <- must now have no effect

Or disabling bus master:

    config write (disable bus master)
    read (flush in outstanding writes)
    <- device must now not change memory

These rules absolutely must be obeyed; if they aren't, guests will break.
So I think we really should find a way to make the above work correctly.
Removal will follow almost automatically, since it disables memory and
mastering by itself.

Note: I'm talking about writes done by the CPU here.
When writes are initiated by a device and target MMIO, it's less
practical. The only case we emulate that I'm familiar with is MSI
writes into the APIC, and this MR is never disabled, resized, or moved.
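
To make the two sequences above concrete, here is a toy, single-threaded
model of the guest-visible contract. It is a sketch, not QEMU code:
ToyDevice, read_flush and the other method names are made up for
illustration, and only the two command register bit values (Memory Space
Enable = 0x2, Bus Master Enable = 0x4) come from the PCI spec. It
deliberately leaves out the hard part, which is getting the same guarantee
when vCPUs and device threads run concurrently without the BQL.

```python
# Toy model only: all names are hypothetical, nothing here is QEMU API.
PCI_COMMAND_MEMORY = 0x2  # Memory Space Enable bit of the PCI command register
PCI_COMMAND_MASTER = 0x4  # Bus Master Enable bit of the PCI command register


class ToyDevice:
    def __init__(self, guest_ram):
        self.command = PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER
        self.regs = {}            # device registers behind the BAR
        self.guest_ram = guest_ram
        self.pending_dma = []     # device writes not yet visible in guest RAM

    def config_write_command(self, value):
        """Guest writes the command register in configuration space."""
        self.command = value

    def mmio_write(self, addr, value):
        """CPU write to the BAR; dropped once memory decoding is disabled."""
        if self.command & PCI_COMMAND_MEMORY:
            self.regs[addr] = value

    def queue_dma(self, addr, value):
        """Device-initiated write; refused once bus mastering is disabled."""
        if self.command & PCI_COMMAND_MASTER:
            self.pending_dma.append((addr, value))

    def read_flush(self):
        """A read that completes only after outstanding writes have landed."""
        for addr, value in self.pending_dma:
            self.guest_ram[addr] = value
        self.pending_dma.clear()


ram = {}
dev = ToyDevice(ram)
dev.mmio_write(0x10, 1)        # accepted: memory decoding is enabled
dev.queue_dma(0x1000, 0xAB)    # accepted: bus mastering is enabled

dev.config_write_command(0)    # config write: disable memory and bus master
dev.read_flush()               # read: flushes the write queued earlier

dev.mmio_write(0x10, 2)        # must now have no effect
dev.queue_dma(0x1000, 0xCD)    # device must now not change memory
dev.read_flush()
```

After the last line, dev.regs[0x10] is still 1 and ram[0x1000] is still
0xAB: the write queued before the disable was flushed by the read, and
nothing issued after it was allowed through.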

> Would a PCI bridge behave any differently?

A real bridge? Yes, what happens on the primary bus doesn't
affect the secondary bus.

> It's still too coarse for emulation purposes.

It's possible that we'll have to implement real
transaction ordering. If you want to remove the BQL for MMIOs, I'd
say that's almost a given?

> > Also, there are split transactions completed
> > asynchronously: guests that care about ordering
> > do flushes e.g. using memory reads.
>
> Can you do split burst transactions? If not, the granularity is still
> measured in bytes, not pages or more.
>
> Paolo

I think we can, but I'll have to check.

> > The logic is in the device/memory region really.
> >
> >>> So I'm not even sure we really need to move destroy to finalize anymore ...
> >>
> >> We definitely need to move it. Even if we added a flag that we set in
> >> memory_region_del_subregion, we need to check it later when AIO completes.
> >>
> >> Paolo
> >
> > Flag where? region can become subregion of another
> > region entirely.
> >
> > The solution for RCU is to flush on changes, not add
> > more flags and locks.
> >