From: Paolo Bonzini <pbonzini@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 00/38] Delay destruction of memory regions to instance_finalize
Date: Wed, 18 Sep 2013 13:26:16 +0200
Message-ID: <52398DD8.3050805@redhat.com>
In-Reply-To: <20130918084138.GA31069@redhat.com>

On 18/09/2013 10:41, Michael S. Tsirkin wrote:
> On Wed, Sep 18, 2013 at 09:40:19AM +0200, Paolo Bonzini wrote:
>> On 18/09/2013 07:48, Michael S. Tsirkin wrote:
>>> So I think the fix is actually obeying ordering rules,
>>> that is, knowing that a write is in progress
>>> and flushing on read.
>>
>> I think this can be modeled as a generic, synchronous
>> (*busmaster_cancel)(PCIDevice*) callback, that is called after bus
>> master is turned off.  You don't even really have to wait for a read.
> 
> Not really.
> Bus master is just a single instance of the bigger issue.
> It could be any device-specific register just as well.
> 
> PCI reads and writes must obey ordering rules.
> ATM MMIO and DMA achieve this by using a single lock.

There are two issues.

One is synchronization with address_space_map/unmap.  Reads and writes
through memory mapped with address_space_map are already performed outside
the BQL (possibly in the kernel), so they already ignore any ordering.  As
far as memory writes are concerned, it is indeed specific to bus master, see
PCIe spec 6.4:

   The ability of the driver and/or system software to block new
   Requests from the device is supported by the Bus Master Enable, SERR
   Enable, and Interrupt Disable bits in the Command register
   (Section 7.5.1.1), and other such control bits.

The second is ordering among writes, and between writes and reads.  This
does not concern device->memory transactions, because when a device does
DMA it can use smp_*mb() to achieve the ordering.  Similarly, I think we
can ignore message requests (but we probably have hidden bugs now; for
example, you probably need a smp_wmb() in msi{,x}_notify).
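
To make the msi{,x}_notify remark concrete, here is a minimal sketch of
"write the data, then a write barrier, then signal".  It is not QEMU code:
struct demo_dev and both functions are hypothetical, and the C11 release and
acquire fences only stand in for smp_wmb() and smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical device model state; none of these names exist in QEMU. */
struct demo_dev {
    uint32_t dma_buf;         /* stands in for guest memory written by DMA */
    atomic_uint msi_pending;  /* stands in for the MSI message write */
};

/* Device side: write the data first, then raise the interrupt. */
static void demo_dma_then_notify(struct demo_dev *d, uint32_t value)
{
    assert(d);
    d->dma_buf = value;                         /* the "DMA" write */
    atomic_thread_fence(memory_order_release);  /* role of smp_wmb() */
    atomic_store_explicit(&d->msi_pending, 1, memory_order_relaxed);
}

/* Consumer side: returns 1 and fills *out if an interrupt was pending. */
static int demo_take_irq(struct demo_dev *d, uint32_t *out)
{
    assert(d && out);
    if (!atomic_exchange_explicit(&d->msi_pending, 0, memory_order_relaxed)) {
        return 0;
    }
    atomic_thread_fence(memory_order_acquire);  /* role of smp_rmb() */
    *out = d->dma_buf;
    return 1;
}
```

Without the release fence, a consumer on another thread could observe the
pending flag before the data it announces.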

But even though this only concerns CPU->device transactions, the
restrictions are very strong.  They apply across distinct devices
(because the ordering is already enforced at the root complex level,
right?), so basically you'd have to wrap each and every MMIO operation
destined to BARs or configuration space with some kind of
pci_global_{read,write}_{start,end} API.  This is likely slower than
just having a "big MMIO lock" around all memory accesses(*).  Devices
can then move part of the processing to a separate thread or bottom half.

     (*) You cannot really do that for all of them; if a BAR is backed
         by RAM, accesses would still be unordered since they do not go
         through QEMU at all.
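
As a rough illustration of the "big MMIO lock" alternative, the sketch below
serializes every register access, for every device, behind one global lock.
All names are hypothetical, and a C11 spinlock stands in for a real mutex
only to keep the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* One global lock shared by all devices (hypothetical, not a QEMU API). */
static atomic_flag demo_big_mmio_lock = ATOMIC_FLAG_INIT;

static void demo_mmio_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_big_mmio_lock,
                                             memory_order_acquire)) {
        /* spin until the current holder releases the lock */
    }
}

static void demo_mmio_unlock(void)
{
    atomic_flag_clear_explicit(&demo_big_mmio_lock, memory_order_release);
}

#define DEMO_NREGS 4

struct demo_dev {
    uint32_t regs[DEMO_NREGS];
};

/* Every write, to any device, takes the same lock. */
static void demo_mmio_write(struct demo_dev *d, unsigned idx, uint32_t val)
{
    assert(idx < DEMO_NREGS);
    demo_mmio_lock();
    d->regs[idx] = val;
    demo_mmio_unlock();
}

/* Reads take it too, so reads and writes are totally ordered. */
static uint32_t demo_mmio_read(struct demo_dev *d, unsigned idx)
{
    uint32_t val;

    assert(idx < DEMO_NREGS);
    demo_mmio_lock();
    val = d->regs[idx];
    demo_mmio_unlock();
    return val;
}
```

Because the lock is global rather than per-device, a write to one device and
a later read from another cannot be reordered with respect to each other,
which is the cross-device property discussed above.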

But does guest code actually care?  In many cases, I suspect that
sticking a smp_rmb() in the read side of "unlocked" register accesses,
and a smp_wmb() in the write side, will do just fine.  We can also add a
compatibility property that places a device back under the BQL for guests
that have problems.
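
A minimal sketch of that idea, under stated assumptions: the property name
under_big_lock, struct demo_dev and the demo_* helpers are all made up, the
exact placement of the barriers is my guess, and C11 fences stand in for
smp_rmb() and smp_wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the BQL; a spinlock keeps the sketch self-contained. */
static atomic_flag demo_bql = ATOMIC_FLAG_INIT;

static void demo_bql_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_bql,
                                             memory_order_acquire)) {
        /* spin */
    }
}

static void demo_bql_unlock(void)
{
    atomic_flag_clear_explicit(&demo_bql, memory_order_release);
}

struct demo_dev {
    bool under_big_lock;  /* hypothetical compatibility property */
    atomic_uint reg;      /* one device register */
};

static uint32_t demo_reg_read(struct demo_dev *d)
{
    uint32_t val;

    assert(d);
    if (d->under_big_lock) {
        demo_bql_lock();  /* compatibility path: fully serialized */
        val = atomic_load_explicit(&d->reg, memory_order_relaxed);
        demo_bql_unlock();
        return val;
    }
    val = atomic_load_explicit(&d->reg, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);  /* role of smp_rmb() */
    return val;
}

static void demo_reg_write(struct demo_dev *d, uint32_t val)
{
    assert(d);
    if (d->under_big_lock) {
        demo_bql_lock();
        atomic_store_explicit(&d->reg, val, memory_order_relaxed);
        demo_bql_unlock();
        return;
    }
    atomic_thread_fence(memory_order_release);  /* role of smp_wmb() */
    atomic_store_explicit(&d->reg, val, memory_order_relaxed);
}
```

The unlocked path gives up ordering across distinct devices; the property
exists so that a guest relying on it can opt back into the locked path.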

Paolo

> If you want to move MMIO and DMA out of a common lock you must find some
> other way to force ordering.
> 
>>> I think moving memory region destroy out to finalize makes sense
>>> irrespectively, as long as destroy is made idempotent so we can simply
>>> destroy everything without worrying whether we initialized it.
>>>
>>> The rest of the changes will be harder, we'll have to do
>>> them carefully on a case by case basis.
>>
>> Good, we are in agreement then.
>>
>> Paolo
> 
> 
