Re: [Qemu-devel] [PATCH V3] virtio: do not require IOMMU to be created in advance

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Marcel Apfelbaum <marcel@redhat.com>
To: Jason Wang <jasowang@redhat.com>, Peter Xu <peterx@redhat.com>
Cc: mst@redhat.com, qemu-devel@nongnu.org,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH V3] virtio: do not require IOMMU to be created in advance
Date: Wed, 8 Mar 2017 10:17:09 +0200	[thread overview]
Message-ID: <5b388627-dcae-6c11-008d-f3c5da57b334@redhat.com> (raw)
In-Reply-To: <9d2dd1a4-aadc-a497-3dbb-6afb6d621587@redhat.com>

On 03/08/2017 05:15 AM, Jason Wang wrote:
>
>
> On 2017年03月08日 10:43, Peter Xu wrote:
>> On Tue, Mar 07, 2017 at 02:47:30PM +0200, Marcel Apfelbaum wrote:
>>> On 03/07/2017 11:09 AM, Jason Wang wrote:
>>>> After commit 96a8821d2141 ("virtio: unbreak virtio-pci with IOMMU
>>>> after caching ring translations"), IOMMU was required to be created in
>>>> advance. This is because we can only get the correct dma_as after pci
>>>> IOMMU (e.g intel_iommu) was initialized. This is suboptimal and
>>>> inconvenient for user. This patch releases this by:
>>>>
>>>> - introduce a bus_master_ready method for PCIDeviceClass and trigger
>>>>   this during pci_init_bus_master
>>>> - implement virtio-pci method and 1) reset the dma_as 2) re-register
>>>>   the memory listener to the new dma_as
>>>>
>>> Hi Jason,
>>>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>> Changes from V2:
>>>> - delay pci_init_bus_master() after pci device is realized to make
>>>>   bus_master_ready a more generic method
>>>> ---
>>>> hw/pci/pci.c               | 11 ++++++++---
>>>> hw/virtio/virtio-pci.c     |  9 +++++++++
>>>> hw/virtio/virtio.c         | 19 +++++++++++++++++++
>>>> include/hw/pci/pci.h       |  1 +
>>>> include/hw/virtio/virtio.h |  1 +
>>>> 5 files changed, 38 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 273f1e4..22e6ad9 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -83,6 +83,7 @@ static const VMStateDescription vmstate_pcibus = {
>>>> static void pci_init_bus_master(PCIDevice *pci_dev)
>>>> {
>>>>      AddressSpace *dma_as = pci_device_iommu_address_space(pci_dev);
>>>> +    PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pci_dev);
>>>>
>>>>      memory_region_init_alias(&pci_dev->bus_master_enable_region,
>>>>                               OBJECT(pci_dev), "bus master",
>>>> @@ -90,6 +91,9 @@ static void pci_init_bus_master(PCIDevice *pci_dev)
>>>>      memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
>>>>      address_space_init(&pci_dev->bus_master_as,
>>>>                         &pci_dev->bus_master_enable_region, pci_dev->name);
>>>> +    if (pc->bus_master_ready) {
>>>> +        pc->bus_master_ready(pci_dev);
>>>> +    }
>>>> }
>>>>
>>>> static void pcibus_machine_done(Notifier *notifier, void *data)
>>>> @@ -995,9 +999,6 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>>>>      pci_dev->devfn = devfn;
>>>>      pci_dev->requester_id_cache = pci_req_id_cache_get(pci_dev);
>>>>
>>>> -    if (qdev_hotplug) {
>>>> -        pci_init_bus_master(pci_dev);
>>>> -    }
>>>>      pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
>>>>      pci_dev->irq_state = 0;
>>>>      pci_config_alloc(pci_dev);
>>>> @@ -1995,6 +1996,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>>>>          }
>>>>      }
>>>>
>>>> +    if (qdev_hotplug) {
>>>> +        pci_init_bus_master(pci_dev);
>>>> +    }
>>>> +
>>> How does it help if we move qdev_hotplug check outside "do_pci_register_device"?
>>> I suppose you want the code to run after the "realize" function?
>
> Yes.
>
>>> If so, what happens if a "realize" function of another device needs the bus_master_as
>>> (at hotplug time)?
>
> I'm not sure this is really needed. What kind of device need to check hotplug during their realize? (Looks like we don't have such kind of device now).

I am sorry I was not clear enough:
If a device is added after the system is up (hotplug), we cannot depend on the "machine_done"
event to enable "bus master".
This is why we have
    if (qdev_hotplug)
         pci_init_bus_master(pci_dev);

The code you proposed changes the order, so this call is done *after* realize.

My question was: What if any other device may require the bus_master_as
at realize time (and can be hot-plugged) ?
For example: hcd-ehci/hcd-ohci devices call pci_get_address_space()
and caches the bus_master_as.

It is possible the mentioned devices can't be hot-plugged or
are not relevant for PCIe.

I am only saying we should double check when making such a change.

>
>> My unmature understanding is that, we can just call
>> pci_device_iommu_address_space() if device realization really needs
>> the address space, rather than using bus_master_as.
>
> A little issue is pci_device_iommu_address_space() can be wrong if it was called before OMMU was created.
>

At hotplug time this is irrelevant because the iommu has already been created.

Thanks,
Marcel

> Thanks
>
>>
>> An example is vfio-pci device. It is using
>> pci_device_iommu_address_space() in vfio_realize(). Though I _guess_
>> there may be other reasons behind, e.g., VFIOAddressSpace should be
>> 1:1 mapped to bus master address space, so we may want to make sure
>> this address space will be the same when different devices are using
>> the same address space (while bus_master_as is one per device, even if
>> two devices have the same backend address space, there will be two
>> distinct bus_master_as).
>>
>> IIUC, bus_master_as is only used to emulate the master bit in PCI
>> command register. In that case, that should only be there for the
>> guest operations, while imho device realization is "emulation code",
>> which can directly use pci_device_iommu_address_space(). Actually, if
>> it plays with bus_master_as even if it can, I guess it'll just fail
>> since that region has not yet been enabled.
>>
>> Please kindly correct me if I made a mistake...
>>
>> Thanks,
>>
>> -- peterx
>

next prev parent reply	other threads:[~2017-03-08  8:17 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-07  9:09 [Qemu-devel] [PATCH V3] virtio: do not require IOMMU to be created in advance Jason Wang
2017-03-07  9:19 ` Peter Xu
2017-03-07 12:47 ` Marcel Apfelbaum
2017-03-08  2:43   ` Peter Xu
2017-03-08  3:15     ` Jason Wang
2017-03-08  8:17       ` Marcel Apfelbaum [this message]
2017-03-08  9:09         ` Peter Xu
2017-03-08  9:16           ` Peter Xu
2017-03-08 13:14             ` Marcel Apfelbaum
2017-03-08  9:57         ` Jason Wang
2017-03-08  8:24     ` Marcel Apfelbaum
2017-03-08 16:40   ` Igor Mammedov
2017-03-09  2:32     ` Jason Wang
2017-03-09  9:28       ` Igor Mammedov
2017-03-09  9:30         ` Paolo Bonzini
2017-03-09  9:58         ` Jason Wang
2017-03-09 10:05           ` Paolo Bonzini
2017-03-09 15:31             ` Michael S. Tsirkin
2017-03-09 15:32               ` Paolo Bonzini
2017-03-10 10:54                 ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5b388627-dcae-6c11-008d-f3c5da57b334@redhat.com \
    --to=marcel@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).