From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57548) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZCDzj-0002HJ-V1 for qemu-devel@nongnu.org; Mon, 06 Jul 2015 17:32:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZCDze-00075N-Pi for qemu-devel@nongnu.org; Mon, 06 Jul 2015 17:32:07 -0400 Date: Mon, 6 Jul 2015 23:31:49 +0200 From: Thomas Huth Message-ID: <20150706233149.74e6af40@thh440s> In-Reply-To: <1436148670-6592-11-git-send-email-aik@ozlabs.ru> References: <1436148670-6592-1-git-send-email-aik@ozlabs.ru> <1436148670-6592-11-git-send-email-aik@ozlabs.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH qemu v10 10/14] spapr_pci: Enable vfio-pci hotplug List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexey Kardashevskiy Cc: Michael Roth , qemu-devel@nongnu.org, Gavin Shan , Alex Williamson , qemu-ppc@nongnu.org, David Gibson On Mon, 6 Jul 2015 12:11:06 +1000 Alexey Kardashevskiy wrote: > sPAPR IOMMU is managing two copies of an TCE table: > 1) a guest view of the table - this is what emulated devices use and > this is where H_GET_TCE reads from; > 2) a hardware TCE table - only present if there is at least one vfio-pci > device on a PHB; it is updated via a memory listener on a PHB address > space which forwards map/unmap requests to vfio-pci IOMMU host driver. > > At the moment presence of vfio-pci devices on a bus affect the way > the guest view table is allocated. If there is no vfio-pci on a PHB > and the host kernel supports KVM acceleration of H_PUT_TCE, a table > is allocated in KVM. However, if there is vfio-pci and we do yet not > support KVM acceleration for these, the table has to be allocated > by the userspace. > > When vfio-pci device is hotplugged and there were no vfio-pci devices > already, the guest view table could have been allocated by KVM which > means that H_PUT_TCE is handled by the host kernel and since we > do not support vfio-pci in KVM, the hardware table will not be updated. > > This reallocates the guest view table in QEMU if the first vfio-pci > device has just been plugged. spapr_tce_realloc_userspace() handles this. I wonder whether it would help to improve the readability of the code later if you put the description of the function into the code instead of the commit message? > This replays all the mappings to make sure that the tables are in sync. > This will not have a visible effect though as for a new device > the guest kernel will allocate-and-map new addresses and therefore > existing mappings from emulated devices will not be used by vfio-pci > devices. > > This adds calls to spapr_phb_dma_capabilities_update() in PCI hotplug > hooks. > > Signed-off-by: Alexey Kardashevskiy > --- ... > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 76c988f..d1fa157 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -827,6 +827,43 @@ int spapr_phb_dma_reset(sPAPRPHBState *sphb) > return 0; > } > > +static int spapr_phb_hotplug_dma_sync(sPAPRPHBState *sphb) > +{ > + int ret = 0, i; > + bool had_vfio = sphb->has_vfio; > + sPAPRTCETable *tcet; > + > + spapr_phb_dma_capabilities_update(sphb); > + > + if (!had_vfio && sphb->has_vfio) { if (had_vfio || !sphb->has_vfio) { return 0; } ... and then you can save one level of indentation for the following for-loop. > + for (i = 0; i < SPAPR_PCI_DMA_MAX_WINDOWS; ++i) { > + tcet = spapr_tce_find_by_liobn(SPAPR_PCI_LIOBN(sphb->index, i)); > + if (!tcet || !tcet->enabled) { > + continue; > + } > + if (tcet->fd >= 0) { > + /* > + * We got first vfio-pci device on accelerated table. > + * VFIO acceleration is not possible. > + * Reallocate table in userspace and replay mappings. > + */ > + ret = spapr_tce_realloc_userspace(tcet, true); > + trace_spapr_pci_dma_realloc_update(tcet->liobn, ret); > + } else { > + /* There was no acceleration, so just replay mappings. */ > + ret = spapr_tce_replay(tcet); > + trace_spapr_pci_dma_update(tcet->liobn, ret); > + } > + if (ret) { > + break; > + } > + } > + return ret; > + } > + > + return 0; > +} > + > /* Macros to operate with address in OF binding to PCI */ > #define b_x(x, p, l) (((x) & ((1<<(l))-1)) << (p)) > #define b_n(x) b_x((x), 31, 1) /* 0 if relocatable */ ... > @@ -1130,6 +1174,9 @@ static void spapr_phb_remove_pci_device_cb(DeviceState *dev, void *opaque) > */ > pci_device_reset(PCI_DEVICE(dev)); > object_unparent(OBJECT(dev)); > + > + /* Actual VFIO device release happens from RCU so postpone DMA update */ > + call_rcu1(&((sPAPRPHBState *)opaque)->rcu, spapr_phb_remove_sync_dma); Too much brackets again for my taste ;-) > } > Thomas