From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40987) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bs56J-0007kr-M9 for qemu-devel@nongnu.org; Thu, 06 Oct 2016 05:36:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bs56D-0002gv-QD for qemu-devel@nongnu.org; Thu, 06 Oct 2016 05:36:26 -0400 References: <1475722987-18644-1-git-send-email-david@gibson.dropbear.id.au> <1475722987-18644-2-git-send-email-david@gibson.dropbear.id.au> From: Laurent Vivier Message-ID: <071b0b41-0c16-4be8-a59b-c1296c18eaf8@redhat.com> Date: Thu, 6 Oct 2016 11:36:12 +0200 MIME-Version: 1.0 In-Reply-To: <1475722987-18644-2-git-send-email-david@gibson.dropbear.id.au> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC 1/4] spapr_pci: Delegate placement of PCI host bridges to machine type List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: David Gibson , qemu-ppc@nongnu.org Cc: qemu-devel@nongnu.org, benh@kernel.crashing.org, thuth@redhat.com, agraf@suse.de, mst@redhat.com, aik@ozlabs.ru, mdroth@linux.vnet.ibm.com, nikunj@linux.vnet.ibm.com, bharata@linux.vnet.ibm.com, abologna@redhat.com, mpolednik@redhat.com On 06/10/2016 05:03, David Gibson wrote: > The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB) > for a PAPR guest. Unlike on x86, it's routine on Power (both bare metal > and PAPR guests) to have numerous independent PHBs, each controlling a > separate PCI domain. > > There are two ways of configuring the spapr-pci-host-bridge device: first > it can be done fully manually, specifying the locations and sizes of all > the IO windows. This gives the most control, but is very awkward with 6 > mandatory parameters. Alternatively just an "index" can be specified > which essentially selects from an array of predefined PHB locations. > The PHB at index 0 is automatically created as the default PHB. > > The current set of default locations causes some problems for guests with > large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia > GPGPU cards via VFIO). Obviously, for migration we can only change the > locations on a new machine type, however. > > This is awkward, because the placement is currently decided within the > spapr-pci-host-bridge code, so it breaks abstraction to look inside the > machine type version. > > So, this patch delegates the "default mode" PHB placement from the > spapr-pci-host-bridge device back to the machine type via a public method > in sPAPRMachineClass. It's still a bit ugly, but it's about the best we > can do. > > For now, this just changes where the calculation is done. It doesn't > change the actual location of the host bridges, or any other behaviour. > > Signed-off-by: David Gibson > --- > hw/ppc/spapr.c | 34 ++++++++++++++++++++++++++++++++++ > hw/ppc/spapr_pci.c | 22 ++++++++-------------- > include/hw/pci-host/spapr.h | 11 +---------- > include/hw/ppc/spapr.h | 4 ++++ > 4 files changed, 47 insertions(+), 24 deletions(-) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 03e3803..f6e9c2a 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -2370,6 +2370,39 @@ static HotpluggableCPUList *spapr_query_hotpluggable_cpus(MachineState *machine) > return head; > } > > +static void spapr_phb_placement(sPAPRMachineState *spapr, uint32_t index, > + uint64_t *buid, hwaddr *pio, hwaddr *pio_size, > + hwaddr *mmio, hwaddr *mmio_size, > + unsigned n_dma, uint32_t *liobns, Error **errp) > +{ > + const uint64_t base_buid = 0x800000020000000ULL; > + const hwaddr phb0_base = 0x10000000000ULL; /* 1 TiB */ > + const hwaddr phb_spacing = 0x1000000000ULL; /* 64 GiB */ > + const hwaddr mmio_offset = 0xa0000000; /* 2 GiB + 512 MiB */ > + const hwaddr pio_offset = 0x80000000; /* 2 GiB */ > + const uint32_t max_index = 255; > + > + hwaddr phb_base; > + int i; > + > + if (index > max_index) { > + error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)", > + max_index); > + return; > + } > + > + *buid = base_buid + index; > + for (i = 0; i < n_dma; ++i) { > + liobns[i] = SPAPR_PCI_LIOBN(index, i); > + } > + > + phb_base = phb0_base + index * phb_spacing; > + *pio = phb_base + pio_offset; > + *pio_size = SPAPR_PCI_IO_WIN_SIZE; > + *mmio = phb_base + mmio_offset; > + *mmio_size = SPAPR_PCI_MMIO_WIN_SIZE; > +} > + > static void spapr_machine_class_init(ObjectClass *oc, void *data) > { > MachineClass *mc = MACHINE_CLASS(oc); > @@ -2406,6 +2439,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data) > mc->query_hotpluggable_cpus = spapr_query_hotpluggable_cpus; > fwc->get_dev_path = spapr_get_fw_dev_path; > nc->nmi_monitor_handler = spapr_nmi; > + smc->phb_placement = spapr_phb_placement; > } > > static const TypeInfo spapr_machine_info = { > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 4f00865..c0fc964 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -1311,7 +1311,8 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) > sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1; > > if (sphb->index != (uint32_t)-1) { > - hwaddr windows_base; > + sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr); > + Error *local_err = NULL; > > if ((sphb->buid != (uint64_t)-1) || (sphb->dma_liobn[0] != (uint32_t)-1) > || (sphb->dma_liobn[1] != (uint32_t)-1 && windows_supported == 2) > @@ -1322,21 +1323,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) > return; > } > > - if (sphb->index > SPAPR_PCI_MAX_INDEX) { > - error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)", > - SPAPR_PCI_MAX_INDEX); > + smc->phb_placement(spapr, sphb->index, > + &sphb->buid, &sphb->io_win_addr, &sphb->io_win_size, > + &sphb->mem_win_addr, &sphb->mem_win_size, > + windows_supported, sphb->dma_liobn, &local_err); Why don't you pass "sphb" instead of "&sphb->buid, &sphb->io_win_addr, &sphb->io_win_size, &sphb->mem_win_addr, &sphb->mem_win_size, sphb->dma_liobn"? Something like: smc->phb_placement(spapr, sphb, windows_supported, &local_err); Thanks, Laurent