From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Alex Williamson <alex.williamson@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH qemu v5 07/12] spapr_iommu: Rework TCE table initialization
Date: Wed, 8 Apr 2015 12:35:24 +1000 [thread overview]
Message-ID: <20150408023524.GE28909@voom.redhat.com> (raw)
In-Reply-To: <1427779727-13353-8-git-send-email-aik@ozlabs.ru>
[-- Attachment #1: Type: text/plain, Size: 12950 bytes --]
On Tue, Mar 31, 2015 at 04:28:42PM +1100, Alexey Kardashevskiy wrote:
> Currently TCE tables are created once at start and their size never
> changes. We are going to change that by introducing a Dynamic DMA windows
> support where DMA configuration may change during the guest execution.
>
> This changes spapr_tce_new_table() to create an empty stub object. Only
> LIOBN is assigned by the time of creation. It still will be called once
> at the owner object (VIO or PHB) creation.
>
> This introduces spapr_tce_set_props() to set the table size, start and
> page size. It only assigns the properties. It will be called at the owner
> object creation OR later from the "ibm,create-pe-dma-window" RTAS handler
> so the table's parameters can change.
>
> This introduces an "enabled" state for TCE table objects with two
> helper functions - spapr_tce_table_enable()/spapr_tce_table_disable().
> spapr_tce_table_enable() allocates the guest view of the TCE table
> (in the user space or KVM). spapr_tce_table_disable() disposes the table.
>
> Follow up patches will disable+enable tables on reset (system reset
> or DDW reset).
>
> No visible change in behaviour is expected except the actual table
> will be reallocated every reset. We might optimize this later.
>
> The other way to implement this would be dynamically create/remove
> the TCE table QOM objects but this would make migration impossible
> as migration expects all QOM objects to exist at the receiver
> so we have to have TCE table objects created when migration begins.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> hw/ppc/spapr_iommu.c | 98 +++++++++++++++++++++++++++++++------------------
> hw/ppc/spapr_pci.c | 8 ++--
> hw/ppc/spapr_pci_vfio.c | 11 ++++--
> hw/ppc/spapr_vio.c | 10 ++---
> include/hw/ppc/spapr.h | 12 +++---
> 5 files changed, 87 insertions(+), 52 deletions(-)
>
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index a14cdc4..a015357 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -126,25 +126,6 @@ static MemoryRegionIOMMUOps spapr_iommu_ops = {
> static int spapr_tce_table_realize(DeviceState *dev)
> {
> sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
> - uint64_t window_size = (uint64_t)tcet->nb_table << tcet->page_shift;
> -
> - if (kvm_enabled() && !(window_size >> 32)) {
> - tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
> - window_size,
> - &tcet->fd,
> - tcet->vfio_accel);
> - }
> -
> - if (!tcet->table) {
> - size_t table_size = tcet->nb_table * sizeof(uint64_t);
> - tcet->table = g_malloc0(table_size);
> - }
> -
> - trace_spapr_iommu_new_table(tcet->liobn, tcet, tcet->table, tcet->fd);
> -
> - memory_region_init_iommu(&tcet->iommu, OBJECT(dev), &spapr_iommu_ops,
> - "iommu-spapr",
> - (uint64_t)tcet->nb_table << tcet->page_shift);
>
> QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
>
> @@ -154,11 +135,7 @@ static int spapr_tce_table_realize(DeviceState *dev)
> return 0;
> }
>
> -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
> - uint64_t bus_offset,
> - uint32_t page_shift,
> - uint32_t nb_table,
> - bool vfio_accel)
> +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn)
> {
> sPAPRTCETable *tcet;
> char tmp[64];
> @@ -169,36 +146,87 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
> return NULL;
> }
>
> - if (!nb_table) {
> - return NULL;
> - }
> -
> tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE));
> tcet->liobn = liobn;
> - tcet->bus_offset = bus_offset;
> - tcet->page_shift = page_shift;
> - tcet->nb_table = nb_table;
> - tcet->vfio_accel = vfio_accel;
>
> snprintf(tmp, sizeof(tmp), "tce-table-%x", liobn);
> object_property_add_child(OBJECT(owner), tmp, OBJECT(tcet), NULL);
>
> object_property_set_bool(OBJECT(tcet), true, "realized", NULL);
>
> + trace_spapr_iommu_new_table(tcet->liobn, tcet, tcet->table, tcet->fd);
> +
> return tcet;
> }
>
> -static void spapr_tce_table_unrealize(DeviceState *dev, Error **errp)
> +void spapr_tce_set_props(sPAPRTCETable *tcet, uint64_t bus_offset,
> + uint32_t page_shift, uint32_t nb_table,
> + bool vfio_accel)
> {
> - sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
> + if (tcet->enabled) {
> + return;
> + }
Since you can't change the properties while the table is enabled, why
not just make these parameters to spapr_tce_table_enable().
It seems to me what this is really about is making a distinction
between two objects: (1) is the TCE table as an abstract concept - it
knows its liobn and its owner, and that's about it (2) the TCE table
as a specific instantiated table - it has a specific size and current
entries.
(2) can't be a QOM object or migration breaks, but you can still think
of it as a distinct entity at the C level.
> + tcet->bus_offset = bus_offset;
> + tcet->page_shift = page_shift;
> + tcet->nb_table = nb_table;
> + tcet->vfio_accel = vfio_accel;
> +}
>
> - QLIST_REMOVE(tcet, list);
> +void spapr_tce_table_enable(sPAPRTCETable *tcet)
> +{
> + uint64_t window_size = (uint64_t)tcet->nb_table << tcet->page_shift;
> +
> + if (tcet->enabled) {
> + return;
> + }
> +
> + if (!tcet->nb_table) {
> + return;
> + }
> +
> + if (kvm_enabled() && !(window_size >> 32)) {
> + tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
> + window_size,
> + &tcet->fd,
> + tcet->vfio_accel);
> + }
> +
> + if (!tcet->table) {
> + size_t table_size = tcet->nb_table * sizeof(uint64_t);
> + tcet->table = g_malloc0(table_size);
> + }
> +
> + memory_region_init_iommu(&tcet->iommu, OBJECT(tcet), &spapr_iommu_ops,
> + "iommu-spapr",
> + (uint64_t)tcet->nb_table << tcet->page_shift);
> +
> + tcet->enabled = true;
> +}
> +
> +void spapr_tce_table_disable(sPAPRTCETable *tcet)
> +{
> + if (!tcet->enabled) {
> + return;
> + }
>
> if (!kvm_enabled() ||
> (kvmppc_remove_spapr_tce(tcet->table, tcet->fd,
> tcet->nb_table) != 0)) {
> + tcet->fd = -1;
> g_free(tcet->table);
> }
> + tcet->table = NULL;
> + tcet->enabled = false;
> + spapr_tce_set_props(tcet, 0, 0, 0, false);
> +}
> +
> +static void spapr_tce_table_unrealize(DeviceState *dev, Error **errp)
> +{
> + sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
> +
> + QLIST_REMOVE(tcet, list);
> +
> + spapr_tce_table_disable(tcet);
> }
>
> MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet)
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 52c5c73..acfdbe5 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -895,15 +895,17 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp)
> sPAPRTCETable *tcet;
> uint32_t nb_table;
>
> - nb_table = SPAPR_PCI_DMA32_SIZE >> SPAPR_TCE_PAGE_SHIFT;
> - tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn,
> - 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false);
> + tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn);
> if (!tcet) {
> error_setg(errp, "Unable to create TCE table for %s",
> sphb->dtbusname);
> return ;
> }
>
> + nb_table = SPAPR_PCI_DMA32_SIZE >> SPAPR_TCE_PAGE_SHIFT;
> + spapr_tce_set_props(tcet, 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false);
> + spapr_tce_table_enable(tcet);
> +
> /* Register default 32bit DMA window */
> memory_region_add_subregion(&sphb->iommu_root, 0,
> spapr_tce_get_iommu(tcet));
> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> index f8b503e..6c9adb5 100644
> --- a/hw/ppc/spapr_pci_vfio.c
> +++ b/hw/ppc/spapr_pci_vfio.c
> @@ -34,6 +34,7 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp)
> int ret;
> sPAPRTCETable *tcet;
> uint32_t liobn = svphb->phb.dma_liobn;
> + uint32_t nb_table;
>
> ret = vfio_container_ioctl(&svphb->phb.iommu_as,
> VFIO_CHECK_EXTENSION,
> @@ -52,16 +53,18 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp)
> return;
> }
>
> - tcet = spapr_tce_new_table(DEVICE(sphb), liobn, info.dma32_window_start,
> - SPAPR_TCE_PAGE_SHIFT,
> - info.dma32_window_size >> SPAPR_TCE_PAGE_SHIFT,
> - true);
> + tcet = spapr_tce_new_table(DEVICE(sphb), liobn);
> if (!tcet) {
> error_setg(errp, "spapr-vfio: failed to create VFIO TCE table");
> return;
> }
>
> /* Register default 32bit DMA window */
> + nb_table = info.dma32_window_size >> SPAPR_TCE_PAGE_SHIFT;
> + spapr_tce_set_props(tcet, info.dma32_window_start, SPAPR_TCE_PAGE_SHIFT,
> + nb_table, true);
> + spapr_tce_table_enable(tcet);
> +
> memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset,
> spapr_tce_get_iommu(tcet));
> }
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 174033d..6394527 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -479,11 +479,11 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> memory_region_add_subregion_overlap(&dev->mrroot, 0, &dev->mrbypass, 1);
> address_space_init(&dev->as, &dev->mrroot, qdev->id);
>
> - dev->tcet = spapr_tce_new_table(qdev, liobn,
> - 0,
> - SPAPR_TCE_PAGE_SHIFT,
> - pc->rtce_window_size >>
> - SPAPR_TCE_PAGE_SHIFT, false);
> + dev->tcet = spapr_tce_new_table(qdev, liobn);
> + spapr_tce_set_props(dev->tcet, 0, SPAPR_TCE_PAGE_SHIFT,
> + pc->rtce_window_size >> SPAPR_TCE_PAGE_SHIFT,
> + false);
> + spapr_tce_table_enable(dev->tcet);
> dev->tcet->vdev = dev;
> memory_region_add_subregion_overlap(&dev->mrroot, 0,
> spapr_tce_get_iommu(dev->tcet), 2);
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 7d9ab9d..6e33b9b 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -498,6 +498,7 @@ typedef struct sPAPRTCETable sPAPRTCETable;
>
> struct sPAPRTCETable {
> DeviceState parent;
> + bool enabled;
> uint32_t liobn;
> uint32_t nb_table;
> uint64_t bus_offset;
> @@ -515,11 +516,12 @@ sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn);
> void spapr_events_init(sPAPREnvironment *spapr);
> void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
> - uint64_t bus_offset,
> - uint32_t page_shift,
> - uint32_t nb_table,
> - bool vfio_accel);
> +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn);
> +void spapr_tce_set_props(sPAPRTCETable *tcet, uint64_t bus_offset,
> + uint32_t page_shift, uint32_t nb_table,
> + bool vfio_accel);
> +void spapr_tce_table_enable(sPAPRTCETable *tcet);
> +void spapr_tce_table_disable(sPAPRTCETable *tcet);
> MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet);
> int spapr_dma_dt(void *fdt, int node_off, const char *propname,
> uint32_t liobn, uint64_t window, uint32_t size);
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]
next prev parent reply other threads:[~2015-04-08 2:41 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-31 5:28 [Qemu-devel] [PATCH qemu v5 00/12] spapr: vfio: Enable Dynamic DMA windows (DDW) Alexey Kardashevskiy
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 01/12] linux headers update for DDW on SPAPR Alexey Kardashevskiy
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 02/12] vmstate: Define VARRAY with VMS_ALLOC Alexey Kardashevskiy
2015-04-08 1:55 ` David Gibson
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 03/12] spapr_pci: Make find_phb()/find_dev() public Alexey Kardashevskiy
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 04/12] spapr_pci_vfio: Enable multiple groups per container Alexey Kardashevskiy
2015-04-08 2:01 ` David Gibson
2015-04-08 3:45 ` Alexey Kardashevskiy
2015-04-09 6:43 ` David Gibson
2015-04-09 7:13 ` Alexey Kardashevskiy
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 05/12] vfio: spapr: Move SPAPR-related code to a separate file Alexey Kardashevskiy
2015-04-08 2:05 ` David Gibson
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 06/12] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering) Alexey Kardashevskiy
2015-04-08 2:15 ` David Gibson
2015-04-08 4:05 ` Alexey Kardashevskiy
2015-04-08 5:11 ` David Gibson
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 07/12] spapr_iommu: Rework TCE table initialization Alexey Kardashevskiy
2015-04-08 2:35 ` David Gibson [this message]
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 08/12] spapr_pci: Rework reset to reset DMA configuration Alexey Kardashevskiy
2015-04-08 2:42 ` David Gibson
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 09/12] spapr_iommu: Add root memory region Alexey Kardashevskiy
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 10/12] spapr_pci: Rework finish_realize() Alexey Kardashevskiy
2015-04-08 5:08 ` David Gibson
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 11/12] spapr_pci: Disable all DMA windows on reset Alexey Kardashevskiy
2015-04-08 5:09 ` David Gibson
2015-03-31 5:28 ` [Qemu-devel] [PATCH qemu v5 12/12] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) Alexey Kardashevskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150408023524.GE28909@voom.redhat.com \
--to=david@gibson.dropbear.id.au \
--cc=agraf@suse.de \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.