From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org,
Alex Williamson <alex.williamson@redhat.com>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH v2] pci: Introduce helper to retrieve a PCI device's DMA address space
Date: Fri, 09 Aug 2013 21:21:49 +1000 [thread overview]
Message-ID: <5204D0CD.8050400@ozlabs.ru> (raw)
In-Reply-To: <5204CD80.7000409@redhat.com>
On 08/09/2013 09:07 PM, Paolo Bonzini wrote:
> Il 09/08/2013 12:58, Alexey Kardashevskiy ha scritto:
>> On 08/09/2013 08:19 PM, Paolo Bonzini wrote:
>>> Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto:
>>>> On 08/09/2013 07:53 PM, Paolo Bonzini wrote:
>>>>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto:
>>>>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote:
>>>>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto:
>>>>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a
>>>>>>>> method on the PCIBus. At the moment that only has one caller, so the
>>>>>>>> method is simply open coded. We'll need another caller for VFIO, so
>>>>>>>> this patch introduces a helper/wrapper function.
>>>>>>>>
>>>>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>>>>>>> [aik: added inheritance from parent if iommu is not set for the current bus]
>>>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>>>>>
>>>>>>>> ---
>>>>>>>> Changes:
>>>>>>>> v2:
>>>>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64
>>>>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space
>>>>>>>> ---
>>>>>>>> hw/pci/pci.c | 22 ++++++++++++++++------
>>>>>>>> include/hw/pci/pci.h | 1 +
>>>>>>>> 2 files changed, 17 insertions(+), 6 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>>>> index 4c004f5..0072b54 100644
>>>>>>>> --- a/hw/pci/pci.c
>>>>>>>> +++ b/hw/pci/pci.c
>>>>>>>> @@ -812,12 +812,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>>>>>>>> }
>>>>>>>>
>>>>>>>> pci_dev->bus = bus;
>>>>>>>> - if (bus->iommu_fn) {
>>>>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn);
>>>>>>>> - } else {
>>>>>>>> - /* FIXME: inherit memory region from bus creator */
>>>>>>>> - dma_as = &address_space_memory;
>>>>>>>> - }
>>>>>>>> + dma_as = pci_device_iommu_address_space(pci_dev);
>>>>>>>>
>>>>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region,
>>>>>>>> OBJECT(pci_dev), "bus master",
>>>>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass *klass, void *data)
>>>>>>>> k->props = pci_props;
>>>>>>>> }
>>>>>>>>
>>>>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>>>>>>>> +{
>>>>>>>> + PCIBus *bus = PCI_BUS(dev->bus);
>>>>>>>> +
>>>>>>>> + if (bus->iommu_fn) {
>>>>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn);
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + if (bus->parent_dev) {
>>>>>>>> + return pci_device_iommu_address_space(bus->parent_dev);
>>>>>>>> + }
>>>>>>>
>>>>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device
>>>>>>> either.
>>>>>>
>>>>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/
>>>>>
>>>>> Doh, I misread the code, I thought it was the "parent" field in
>>>>> BusState. Why do we have parent_dev at all?
>>>>
>>>> The code is too old? Don't know.
>>>>
>>>>
>>>>>>> You can use object_dynamic_cast to convert the parent_dev to
>>>>>>> PCIDevice, and if the cast succeeds you call the new function.
>>>>>>>
>>>>>>> Perhaps you could make the new function take a PCIBus instead.
>>>>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the
>>>>>>> bus-master address space) doesn't make much sense, VFIO is really a
>>>>>>> special case. Putting the new API on the bus side instead looks better.
>>>>>>>
>>>>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for
>>>>>>> devices sitting on the secondary bus?)
>>>>>>
>>>>>> It happens naturally I guess when linux enables devices.
>>>>>
>>>>> Yes, but then using the IOMMU address space would be wrong; you would
>>>>> have to use the bus-master address space as a base for the child's
>>>>> bus-master address space.
>>>>
>>>>
>>>> Like this? Works too.
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 8bdcedc..a4c70e6 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -2247,23 +2247,23 @@ AddressSpace
>>>> *pci_device_iommu_address_space(PCIDevice *dev)
>>>> {
>>>> PCIBus *bus = PCI_BUS(dev->bus);
>>>>
>>>> if (bus->iommu_fn) {
>>>> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn);
>>>> }
>>>>
>>>> if (bus->parent_dev) {
>>>> return pci_device_iommu_address_space(bus->parent_dev);
>>>> }
>>>>
>>>> - return &address_space_memory;
>>>> + return &dev->bus_master_as;
>>>> }
>>>
>>> I was thinking more like this:
>>>
>>> if (bus->parent_dev) {
>>> - return pci_device_iommu_address_space(bus->parent_dev);
>>> + /* Take parent device's bus-master enable bit into account. */
>>> + return pci_get_address_space(bus->parent_dev);
>>> }
>>>
>>> + /* Not a secondary bus and no IOMMU. Use system memory. */
>>> return &address_space_memory;
>>
>> Ok.
>>
>> Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices
>> being in the same address space as every single device has its own "bus
>> master" AddressSpace.
>
> Yes, fixing that check is another good use of the new API you introduced.
>
> But after Ben's answer, I guess the above change is not really needed.
> It would add complication for VFIO, too. Proper emulation would require
> QEMU to trap writes to the device's bus-master bit. QEMU would have to
> take of the value that the guest writes, AND it with the bus-master
> bit(s) of all bridges between the host bridge and the VFIO device, and
> write the result to the passed-through device. This is because the
> bridges are emulated, and do not exist in real hardware.
So does this mean that we go with the original patch and ignore bus master
address space here?
I guess I could overcome that VFIO check by comparing not just AddressSpace
but AddressSpace->root if AddressSpace is different but it does not make a
lot of sense.
>
>>>>> Also, we would have to fix the x86 firmware.
>>
>> btw what would it mean? :)
>
> Firmware would have to look for bridges and enable bus master DMA on
> them. Otherwise, for example a SCSI controller below a bridge (not
> virtio-scsi) would fail to boot.
>
> Paolo
>
--
Alexey
next prev parent reply other threads:[~2013-08-09 11:22 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-09 8:49 [Qemu-devel] [PATCH v2] pci: Introduce helper to retrieve a PCI device's DMA address space Alexey Kardashevskiy
2013-08-09 9:40 ` Paolo Bonzini
2013-08-09 9:48 ` Alexey Kardashevskiy
2013-08-09 9:53 ` Paolo Bonzini
2013-08-09 10:13 ` Alexey Kardashevskiy
2013-08-09 10:19 ` Paolo Bonzini
2013-08-09 10:58 ` Alexey Kardashevskiy
2013-08-09 11:07 ` Paolo Bonzini
2013-08-09 11:21 ` Alexey Kardashevskiy [this message]
2013-08-09 11:30 ` Paolo Bonzini
2013-08-09 10:44 ` David Gibson
2013-08-09 10:20 ` Benjamin Herrenschmidt
2013-08-09 10:29 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5204D0CD.8050400@ozlabs.ru \
--to=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=david@gibson.dropbear.id.au \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).