From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60473) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V7kYZ-0007RW-N4 for qemu-devel@nongnu.org; Fri, 09 Aug 2013 07:08:35 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1V7kYV-0001cb-BH for qemu-devel@nongnu.org; Fri, 09 Aug 2013 07:08:31 -0400 Received: from mx1.redhat.com ([209.132.183.28]:2402) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V7kYV-0001cL-3c for qemu-devel@nongnu.org; Fri, 09 Aug 2013 07:08:27 -0400 Message-ID: <5204CD80.7000409@redhat.com> Date: Fri, 09 Aug 2013 13:07:44 +0200 From: Paolo Bonzini MIME-Version: 1.0 References: <1376038150-14527-1-git-send-email-aik@ozlabs.ru> <5204B8F9.5050201@redhat.com> <5204BAE0.8070507@ozlabs.ru> <5204BC34.8030405@redhat.com> <5204C0D8.5010205@ozlabs.ru> <5204C247.2050906@redhat.com> <5204CB5A.8060000@ozlabs.ru> In-Reply-To: <5204CB5A.8060000@ozlabs.ru> Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2] pci: Introduce helper to retrieve a PCI device's DMA address space List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexey Kardashevskiy Cc: Anthony Liguori , "Michael S . Tsirkin" , qemu-devel@nongnu.org, Alex Williamson , David Gibson Il 09/08/2013 12:58, Alexey Kardashevskiy ha scritto: > On 08/09/2013 08:19 PM, Paolo Bonzini wrote: >> Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: >>> On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >>>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>>>> this patch introduces a helper/wrapper function. >>>>>>> >>>>>>> Signed-off-by: David Gibson >>>>>>> [aik: added inheritance from parent if iommu is not set for the current bus] >>>>>>> Signed-off-by: Alexey Kardashevskiy >>>>>>> >>>>>>> --- >>>>>>> Changes: >>>>>>> v2: >>>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>>>> --- >>>>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>>>> include/hw/pci/pci.h | 1 + >>>>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>>>> >>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>>> index 4c004f5..0072b54 100644 >>>>>>> --- a/hw/pci/pci.c >>>>>>> +++ b/hw/pci/pci.c >>>>>>> @@ -812,12 +812,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>>>> } >>>>>>> >>>>>>> pci_dev->bus = bus; >>>>>>> - if (bus->iommu_fn) { >>>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>>>> - } else { >>>>>>> - /* FIXME: inherit memory region from bus creator */ >>>>>>> - dma_as = &address_space_memory; >>>>>>> - } >>>>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>>>> >>>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>>>> OBJECT(pci_dev), "bus master", >>>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass *klass, void *data) >>>>>>> k->props = pci_props; >>>>>>> } >>>>>>> >>>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>>>> +{ >>>>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>>>> + >>>>>>> + if (bus->iommu_fn) { >>>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>>>> + } >>>>>>> + >>>>>>> + if (bus->parent_dev) { >>>>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>>>> + } >>>>>> >>>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>>>> either. >>>>> >>>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/ >>>> >>>> Doh, I misread the code, I thought it was the "parent" field in >>>> BusState. Why do we have parent_dev at all? >>> >>> The code is too old? Don't know. >>> >>> >>>>>> You can use object_dynamic_cast to convert the parent_dev to >>>>>> PCIDevice, and if the cast succeeds you call the new function. >>>>>> >>>>>> Perhaps you could make the new function take a PCIBus instead. >>>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>>>> bus-master address space) doesn't make much sense, VFIO is really a >>>>>> special case. Putting the new API on the bus side instead looks better. >>>>>> >>>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for >>>>>> devices sitting on the secondary bus?) >>>>> >>>>> It happens naturally I guess when linux enables devices. >>>> >>>> Yes, but then using the IOMMU address space would be wrong; you would >>>> have to use the bus-master address space as a base for the child's >>>> bus-master address space. >>> >>> >>> Like this? Works too. >>> >>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>> index 8bdcedc..a4c70e6 100644 >>> --- a/hw/pci/pci.c >>> +++ b/hw/pci/pci.c >>> @@ -2247,23 +2247,23 @@ AddressSpace >>> *pci_device_iommu_address_space(PCIDevice *dev) >>> { >>> PCIBus *bus = PCI_BUS(dev->bus); >>> >>> if (bus->iommu_fn) { >>> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>> } >>> >>> if (bus->parent_dev) { >>> return pci_device_iommu_address_space(bus->parent_dev); >>> } >>> >>> - return &address_space_memory; >>> + return &dev->bus_master_as; >>> } >> >> I was thinking more like this: >> >> if (bus->parent_dev) { >> - return pci_device_iommu_address_space(bus->parent_dev); >> + /* Take parent device's bus-master enable bit into account. */ >> + return pci_get_address_space(bus->parent_dev); >> } >> >> + /* Not a secondary bus and no IOMMU. Use system memory. */ >> return &address_space_memory; > > Ok. > > Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices > being in the same address space as every single device has its own "bus > master" AddressSpace. Yes, fixing that check is another good use of the new API you introduced. But after Ben's answer, I guess the above change is not really needed. It would add complication for VFIO, too. Proper emulation would require QEMU to trap writes to the device's bus-master bit. QEMU would have to take of the value that the guest writes, AND it with the bus-master bit(s) of all bridges between the host bridge and the VFIO device, and write the result to the passed-through device. This is because the bridges are emulated, and do not exist in real hardware. >>>> Also, we would have to fix the x86 firmware. > > btw what would it mean? :) Firmware would have to look for bridges and enable bus master DMA on them. Otherwise, for example a SCSI controller below a bridge (not virtio-scsi) would fail to boot. Paolo