From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: qemu-devel@nongnu.org, qemu-ppc@nongnu.org,
Alexander Graf <agraf@suse.de>,
Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [Qemu-devel] [PATCH qemu v17 11/12] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)
Date: Thu, 9 Jun 2016 14:28:56 +1000 [thread overview]
Message-ID: <20160609042856.GJ9226@voom.fritz.box> (raw)
In-Reply-To: <20d3189c-ac47-3cdb-137b-079a93eaaff2@ozlabs.ru>
[-- Attachment #1: Type: text/plain, Size: 9243 bytes --]
On Wed, Jun 08, 2016 at 04:09:57PM +1000, Alexey Kardashevskiy wrote:
> On 08/06/16 15:57, David Gibson wrote:
> > On Mon, Jun 06, 2016 at 06:12:58PM +1000, Alexey Kardashevskiy wrote:
> >> On 06/06/16 15:57, David Gibson wrote:
> >>> On Wed, Jun 01, 2016 at 06:57:42PM +1000, Alexey Kardashevskiy wrote:
> >>>> This adds support for Dynamic DMA Windows (DDW) option defined by
> >>>> the SPAPR specification which allows to have additional DMA window(s)
> >>>>
> >>>> The "ddw" property is enabled by default on a PHB but for compatibility
> >>>> the pseries-2.5 machine (TODO: update version) and older disable it.
> >>>
> >>> Looks like your todo is now todone, but you need to update the commit
> >>> message.
> >>>
> >>>> This also creates a single DMA window for the older machines to
> >>>> maintain backward migration.
> >>>>
> >>>> This implements DDW for PHB with emulated and VFIO devices. The host
> >>>> kernel support is required. The advertised IOMMU page sizes are 4K and
> >>>> 64K; 16M pages are supported but not advertised by default, in order to
> >>>> enable them, the user has to specify "pgsz" property for PHB and
> >>>> enable huge pages for RAM.
> >>>>
> >>>> The existing linux guests try creating one additional huge DMA window
> >>>> with 64K or 16MB pages and map the entire guest RAM to. If succeeded,
> >>>> the guest switches to dma_direct_ops and never calls TCE hypercalls
> >>>> (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM
> >>>> and not waste time on map/unmap later. This adds a "dma64_win_addr"
> >>>> property which is a bus address for the 64bit window and by default
> >>>> set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware
> >>>> uses and this allows having emulated and VFIO devices on the same bus.
> >>>>
> >>>> This adds 4 RTAS handlers:
> >>>> * ibm,query-pe-dma-window
> >>>> * ibm,create-pe-dma-window
> >>>> * ibm,remove-pe-dma-window
> >>>> * ibm,reset-pe-dma-window
> >>>> These are registered from type_init() callback.
> >>>>
> >>>> These RTAS handlers are implemented in a separate file to avoid polluting
> >>>> spapr_iommu.c with PCI.
> >>>>
> >>>> This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs.
> >>>>
> >>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>>
> >>> Looks pretty close to ready.
> >>>
> >>> There are a handful of nits and one real error noted below.
> >>>
> >>>> ---
> >>>> Changes:
> >>>> v17:
> >>>> * fixed: "query" did return non-page-shifted value when memory hotplug is enabled
> >>>>
> >>>> v16:
> >>>> * s/dma_liobn/dma_liobn[SPAPR_PCI_DMA_MAX_WINDOWS]/
> >>>> * s/SPAPR_PCI_LIOBN()/dma_liobn[]/
> >>>>
> >>>> v15:
> >>>> * moved page mask filtering to PHB realize(), use "-mempath" to know
> >>>> if there are huge pages
> >>>> * fixed error reporting in RTAS handlers
> >>>> * max window size accounts now hotpluggable memory boundaries
> >>>> ---
> >>>> hw/ppc/Makefile.objs | 1 +
> >>>> hw/ppc/spapr.c | 5 +
> >>>> hw/ppc/spapr_pci.c | 77 +++++++++---
> >>>> hw/ppc/spapr_rtas_ddw.c | 293 ++++++++++++++++++++++++++++++++++++++++++++
> >>>> include/hw/pci-host/spapr.h | 8 +-
> >>>> include/hw/ppc/spapr.h | 16 ++-
> >>>> trace-events | 4 +
> >>>> 7 files changed, 383 insertions(+), 21 deletions(-)
> >>>> create mode 100644 hw/ppc/spapr_rtas_ddw.c
> >>>>
> >>>> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> >>>> index c1ffc77..986b36f 100644
> >>>> --- a/hw/ppc/Makefile.objs
> >>>> +++ b/hw/ppc/Makefile.objs
> >>>> @@ -7,6 +7,7 @@ obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> >>>> ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
> >>>> obj-y += spapr_pci_vfio.o
> >>>> endif
> >>>> +obj-$(CONFIG_PSERIES) += spapr_rtas_ddw.o
> >>>> # PowerPC 4xx boards
> >>>> obj-y += ppc405_boards.o ppc4xx_devs.o ppc405_uc.o ppc440_bamboo.o
> >>>> obj-y += ppc4xx_pci.o
> >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>> index 44e401a..6ddcda9 100644
> >>>> --- a/hw/ppc/spapr.c
> >>>> +++ b/hw/ppc/spapr.c
> >>>> @@ -2366,6 +2366,11 @@ DEFINE_SPAPR_MACHINE(2_6, "2.6", true);
> >>>> .driver = "spapr-vlan", \
> >>>> .property = "use-rx-buffer-pools", \
> >>>> .value = "off", \
> >>>> + }, \
> >>>> + {\
> >>>> + .driver = TYPE_SPAPR_PCI_HOST_BRIDGE,\
> >>>> + .property = "ddw",\
> >>>> + .value = stringify(off),\
> >>>> },
> >>>>
> >>>> static void spapr_machine_2_5_instance_options(MachineState *machine)
> >>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >>>> index 68de523..bcf0360 100644
> >>>> --- a/hw/ppc/spapr_pci.c
> >>>> +++ b/hw/ppc/spapr_pci.c
> >>>> @@ -35,6 +35,7 @@
> >>>> #include "hw/ppc/spapr.h"
> >>>> #include "hw/pci-host/spapr.h"
> >>>> #include "exec/address-spaces.h"
> >>>> +#include "exec/ram_addr.h"
> >>>> #include <libfdt.h>
> >>>> #include "trace.h"
> >>>> #include "qemu/error-report.h"
> >>>> @@ -45,6 +46,7 @@
> >>>> #include "hw/ppc/spapr_drc.h"
> >>>> #include "sysemu/device_tree.h"
> >>>> #include "sysemu/kvm.h"
> >>>> +#include "sysemu/hostmem.h"
> >>>>
> >>>> #include "hw/vfio/vfio.h"
> >>>>
> >>>> @@ -1088,7 +1090,7 @@ static void spapr_phb_add_pci_device(sPAPRDRConnector *drc,
> >>>> int fdt_start_offset = 0, fdt_size;
> >>>>
> >>>> if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
> >>>> - sPAPRTCETable *tcet = spapr_tce_find_by_liobn(phb->dma_liobn);
> >>>> + sPAPRTCETable *tcet = spapr_tce_find_by_liobn(phb->dma_liobn[0]);
> >>>>
> >>>> spapr_tce_set_need_vfio(tcet, true);
> >>>> }
> >>>
> >>> Hang on.. I thought you'd got rid of the need for this explicit
> >>> set_need_vfio() stuff.
> >>
> >>
> >> It is in 12/12 (which I'll split in 2 halves when I respin this), I moved
> >> it to the end as it is not essential for DDW itself.
> >
> > Yes, sorry, I saw that shortly after writing this.
> >
> >>>> @@ -1310,11 +1312,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>>> PCIBus *bus;
> >>>> uint64_t msi_window_size = 4096;
> >>>> sPAPRTCETable *tcet;
> >>>> + const unsigned windows_supported =
> >>>> + sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1;
> >>>>
> >>>> if (sphb->index != (uint32_t)-1) {
> >>>> hwaddr windows_base;
> >>>>
> >>>> - if ((sphb->buid != (uint64_t)-1) || (sphb->dma_liobn != (uint32_t)-1)
> >>>> + if ((sphb->buid != (uint64_t)-1) || (sphb->dma_liobn[0] != (uint32_t)-1)
> >>>> + || ((sphb->dma_liobn[1] != (uint32_t)-1) && (windows_supported > 1))
> >>>> || (sphb->mem_win_addr != (hwaddr)-1)
> >>>> || (sphb->io_win_addr != (hwaddr)-1)) {
> >>>> error_setg(errp, "Either \"index\" or other parameters must"
> >>>> @@ -1329,7 +1334,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>>> }
> >>>>
> >>>> sphb->buid = SPAPR_PCI_BASE_BUID + sphb->index;
> >>>> - sphb->dma_liobn = SPAPR_PCI_LIOBN(sphb->index, 0);
> >>>> + for (i = 0; i < windows_supported; ++i) {
> >>>> + sphb->dma_liobn[i] = SPAPR_PCI_LIOBN(sphb->index, i);
> >>>> + }
> >>>>
> >>>> windows_base = SPAPR_PCI_WINDOW_BASE
> >>>> + sphb->index * SPAPR_PCI_WINDOW_SPACING;
> >>>> @@ -1342,8 +1349,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>>> return;
> >>>> }
> >>>>
> >>>> - if (sphb->dma_liobn == (uint32_t)-1) {
> >>>> - error_setg(errp, "LIOBN not specified for PHB");
> >>>> + if ((sphb->dma_liobn[0] == (uint32_t)-1) ||
> >>>> + ((sphb->dma_liobn[1] == (uint32_t)-1) && (windows_supported > 1))) {
> >>>> + error_setg(errp, "LIOBN(s) not specified for PHB");
> >>>> return;
> >>>> }
> >>>
> >>> Hrm.. there's a bit of false generality here, since this would break
> >>> if windows_supported > 2, and dma_liobn[2] was not specified. Not
> >>> urgent for the initial commit though.
> >>
> >>
> >> Is s/windows_supported > 1/windows_supported == 2/ any better here?
> >
> > Not really. Unless you also have a windows_supported > 2 case (which
> > could just error / abort / whatever).
>
>
> It cannot be >2 as SPAPR_PCI_DMA_MAX_WINDOWS is defined as "2" now and it
> is quite unlikely to change in the future.
Yes, I know, but the value of MAX_WINDOWS isn't obvious if you're just
looking at this code. My point is that a casual look at this code
suggests it will handle arbitrary windows_supported values, but that's
not actually the case. This is generally bad practice.
> What do I do right now? :)
Well, as I said it's not urgent for the first commit. Just leave it
as it is and we'll deal with the consequences later.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
prev parent reply other threads:[~2016-06-09 5:10 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1464771463-37214-1-git-send-email-aik@ozlabs.ru>
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 01/12] vmstate: Define VARRAY with VMS_ALLOC Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 02/12] spapr_iommu: Introduce "enabled" state for TCE table Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 03/12] spapr_iommu: Migrate full state Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 04/12] spapr_iommu: Add root memory region Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 05/12] spapr_pci: Reset DMA config on PHB reset Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 06/12] memory: Add reporting of supported page sizes Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 07/12] vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2) Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 08/12] spapr_pci: Add and export DMA resetting helper Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 09/12] vfio: Add host side DMA window capabilities Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 10/12] vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2) Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 11/12] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) Alexey Kardashevskiy
2016-06-01 8:57 ` [Qemu-devel] [PATCH qemu v17 12/12] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping listening Alexey Kardashevskiy
[not found] ` <201606010902.u518wwmb029353@mx0a-001b2d01.pphosted.com>
2016-06-02 3:35 ` [Qemu-devel] [PATCH qemu v17 06/12] memory: Add reporting of supported page sizes David Gibson
2016-06-06 13:31 ` Paolo Bonzini
2016-06-07 3:42 ` Alexey Kardashevskiy
2016-06-08 6:00 ` David Gibson
2016-06-08 6:05 ` Alexey Kardashevskiy
2016-06-14 21:41 ` Alexey Kardashevskiy
2016-06-15 6:15 ` David Gibson
[not found] ` <201606010900.u518wvH7046287@mx0a-001b2d01.pphosted.com>
2016-06-02 4:18 ` [Qemu-devel] [PATCH qemu v17 07/12] vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2) David Gibson
[not found] ` <201606010902.u518x15j023604@mx0a-001b2d01.pphosted.com>
2016-06-02 4:19 ` [Qemu-devel] [PATCH qemu v17 08/12] spapr_pci: Add and export DMA resetting helper David Gibson
[not found] ` <201606010901.u518wwEL029369@mx0a-001b2d01.pphosted.com>
2016-06-03 7:23 ` [Qemu-devel] [PATCH qemu v17 09/12] vfio: Add host side DMA window capabilities David Gibson
[not found] ` <201606011012.u51A9A6i023070@mx0a-001b2d01.pphosted.com>
2016-06-03 7:37 ` [Qemu-devel] [PATCH qemu v17 10/12] vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2) David Gibson
2016-06-06 6:45 ` Alexey Kardashevskiy
2016-06-08 5:56 ` David Gibson
[not found] ` <201606010902.u51902Zl007699@mx0a-001b2d01.pphosted.com>
2016-06-03 15:37 ` [Qemu-devel] [PATCH qemu v17 06/12] memory: Add reporting of supported page sizes Alex Williamson
[not found] ` <201606010900.u51900Om007391@mx0a-001b2d01.pphosted.com>
2016-06-03 16:13 ` [Qemu-devel] [PATCH qemu v17 07/12] vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2) Alex Williamson
2016-06-06 6:04 ` Alexey Kardashevskiy
2016-06-06 17:20 ` Alex Williamson
2016-06-07 3:10 ` Alexey Kardashevskiy
[not found] ` <201606010901.u518x843001647@mx0a-001b2d01.pphosted.com>
2016-06-03 16:32 ` [Qemu-devel] [PATCH qemu v17 09/12] vfio: Add host side DMA window capabilities Alex Williamson
[not found] ` <201606010901.u518x7AQ001537@mx0a-001b2d01.pphosted.com>
2016-06-03 16:50 ` [Qemu-devel] [PATCH qemu v17 10/12] vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2) Alex Williamson
[not found] ` <201606011013.u51A9ALx023064@mx0a-001b2d01.pphosted.com>
2016-06-03 16:59 ` [Qemu-devel] [PATCH qemu v17 12/12] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping listening Alex Williamson
[not found] ` <201606010901.u518x1wF023568@mx0a-001b2d01.pphosted.com>
2016-06-06 5:57 ` [Qemu-devel] [PATCH qemu v17 11/12] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) David Gibson
2016-06-06 8:12 ` Alexey Kardashevskiy
2016-06-08 5:57 ` David Gibson
2016-06-08 6:09 ` Alexey Kardashevskiy
2016-06-09 4:28 ` David Gibson [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160609042856.GJ9226@voom.fritz.box \
--to=david@gibson.dropbear.id.au \
--cc=agraf@suse.de \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.