From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Alex Williamson <alex.williamson@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH qemu v5 07/12] spapr_iommu: Rework TCE table initialization
Date: Wed, 8 Apr 2015 12:35:24 +1000
Message-ID: <20150408023524.GE28909@voom.redhat.com>
In-Reply-To: <1427779727-13353-8-git-send-email-aik@ozlabs.ru>
On Tue, Mar 31, 2015 at 04:28:42PM +1100, Alexey Kardashevskiy wrote:
> Currently TCE tables are created once at start and their size never
> changes. We are going to change that by introducing a Dynamic DMA windows
> support where DMA configuration may change during the guest execution.
>
> This changes spapr_tce_new_table() to create an empty stub object. Only
> LIOBN is assigned by the time of creation. It still will be called once
> at the owner object (VIO or PHB) creation.
>
> This introduces spapr_tce_set_props() to set the table size, start and
> page size. It only assigns the properties. It will be called at the owner
> object creation OR later from the "ibm,create-pe-dma-window" RTAS handler
> so the table's parameters can change.
>
> This introduces an "enabled" state for TCE table objects with two
> helper functions - spapr_tce_table_enable()/spapr_tce_table_disable().
> spapr_tce_table_enable() allocates the guest view of the TCE table
> (in the user space or KVM). spapr_tce_table_disable() disposes the table.
>
> Follow up patches will disable+enable tables on reset (system reset
> or DDW reset).
>
> No visible change in behaviour is expected except the actual table
> will be reallocated every reset. We might optimize this later.
>
> The other way to implement this would be dynamically create/remove
> the TCE table QOM objects but this would make migration impossible
> as migration expects all QOM objects to exist at the receiver
> so we have to have TCE table objects created when migration begins.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> hw/ppc/spapr_iommu.c | 98 +++++++++++++++++++++++++++++++------------------
> hw/ppc/spapr_pci.c | 8 ++--
> hw/ppc/spapr_pci_vfio.c | 11 ++++--
> hw/ppc/spapr_vio.c | 10 ++---
> include/hw/ppc/spapr.h | 12 +++---
> 5 files changed, 87 insertions(+), 52 deletions(-)
>
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index a14cdc4..a015357 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -126,25 +126,6 @@ static MemoryRegionIOMMUOps spapr_iommu_ops = {
> static int spapr_tce_table_realize(DeviceState *dev)
> {
> sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
> - uint64_t window_size = (uint64_t)tcet->nb_table << tcet->page_shift;
> -
> - if (kvm_enabled() && !(window_size >> 32)) {
> - tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
> - window_size,
> - &tcet->fd,
> - tcet->vfio_accel);
> - }
> -
> - if (!tcet->table) {
> - size_t table_size = tcet->nb_table * sizeof(uint64_t);
> - tcet->table = g_malloc0(table_size);
> - }
> -
> - trace_spapr_iommu_new_table(tcet->liobn, tcet, tcet->table, tcet->fd);
> -
> - memory_region_init_iommu(&tcet->iommu, OBJECT(dev), &spapr_iommu_ops,
> - "iommu-spapr",
> - (uint64_t)tcet->nb_table << tcet->page_shift);
>
> QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
>
> @@ -154,11 +135,7 @@ static int spapr_tce_table_realize(DeviceState *dev)
> return 0;
> }
>
> -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
> - uint64_t bus_offset,
> - uint32_t page_shift,
> - uint32_t nb_table,
> - bool vfio_accel)
> +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn)
> {
> sPAPRTCETable *tcet;
> char tmp[64];
> @@ -169,36 +146,87 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
> return NULL;
> }
>
> - if (!nb_table) {
> - return NULL;
> - }
> -
> tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE));
> tcet->liobn = liobn;
> - tcet->bus_offset = bus_offset;
> - tcet->page_shift = page_shift;
> - tcet->nb_table = nb_table;
> - tcet->vfio_accel = vfio_accel;
>
> snprintf(tmp, sizeof(tmp), "tce-table-%x", liobn);
> object_property_add_child(OBJECT(owner), tmp, OBJECT(tcet), NULL);
>
> object_property_set_bool(OBJECT(tcet), true, "realized", NULL);
>
> + trace_spapr_iommu_new_table(tcet->liobn, tcet, tcet->table, tcet->fd);
> +
> return tcet;
> }
>
> -static void spapr_tce_table_unrealize(DeviceState *dev, Error **errp)
> +void spapr_tce_set_props(sPAPRTCETable *tcet, uint64_t bus_offset,
> + uint32_t page_shift, uint32_t nb_table,
> + bool vfio_accel)
> {
> - sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
> + if (tcet->enabled) {
> + return;
> + }
Since you can't change the properties while the table is enabled, why
not just make these parameters of spapr_tce_table_enable()?
It seems to me that what this is really about is making a distinction
between two objects:
(1) the TCE table as an abstract concept - it knows its liobn and its
owner, and that's about it
(2) the TCE table as a specific instantiated table - it has a specific
size and current entries.
(2) can't be a QOM object or migration breaks, but you can still think
of it as a distinct entity at the C level.
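To illustrate, here is a rough standalone sketch of that split. It is
not code from the patch: TCEWindow, TCETable, tce_table_enable() and
tce_table_disable() are hypothetical names, the KVM path is left out,
and calloc() stands in for g_malloc0(). The only point is that the
window parameters arrive with the enable call, so a separate
set_props() step is not needed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* (2) A specific instantiated window. Hypothetical type, not in the patch. */
typedef struct TCEWindow {
    uint64_t bus_offset;
    uint32_t page_shift;
    uint32_t nb_table;
    bool vfio_accel;
} TCEWindow;

/* (1) The abstract table. Simplified stand-in for sPAPRTCETable. */
typedef struct TCETable {
    uint32_t liobn;     /* assigned once, at creation */
    bool enabled;
    TCEWindow win;      /* only meaningful while enabled */
    uint64_t *table;    /* userspace view of the TCE entries */
} TCETable;

/*
 * Enable the table with the given window parameters.
 * Returns 0 on success, -1 if the table is already enabled, the window
 * is empty, or the allocation fails.
 */
static int tce_table_enable(TCETable *t, const TCEWindow *win)
{
    if (t->enabled || !win->nb_table) {
        return -1;
    }
    t->table = calloc(win->nb_table, sizeof(uint64_t));
    if (!t->table) {
        return -1;
    }
    t->win = *win;
    t->enabled = true;
    return 0;
}

/* Dispose of the instantiated window; the abstract table keeps its liobn. */
static void tce_table_disable(TCETable *t)
{
    if (!t->enabled) {
        return;
    }
    free(t->table);
    t->table = NULL;
    t->win = (TCEWindow){0};
    t->enabled = false;
}
```

A caller would then fill in a TCEWindow and pass it to
tce_table_enable() where the patch currently calls
spapr_tce_set_props() followed by spapr_tce_table_enable().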
> + tcet->bus_offset = bus_offset;
> + tcet->page_shift = page_shift;
> + tcet->nb_table = nb_table;
> + tcet->vfio_accel = vfio_accel;
> +}
>
> - QLIST_REMOVE(tcet, list);
> +void spapr_tce_table_enable(sPAPRTCETable *tcet)
> +{
> + uint64_t window_size = (uint64_t)tcet->nb_table << tcet->page_shift;
> +
> + if (tcet->enabled) {
> + return;
> + }
> +
> + if (!tcet->nb_table) {
> + return;
> + }
> +
> + if (kvm_enabled() && !(window_size >> 32)) {
> + tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
> + window_size,
> + &tcet->fd,
> + tcet->vfio_accel);
> + }
> +
> + if (!tcet->table) {
> + size_t table_size = tcet->nb_table * sizeof(uint64_t);
> + tcet->table = g_malloc0(table_size);
> + }
> +
> + memory_region_init_iommu(&tcet->iommu, OBJECT(tcet), &spapr_iommu_ops,
> + "iommu-spapr",
> + (uint64_t)tcet->nb_table << tcet->page_shift);
> +
> + tcet->enabled = true;
> +}
> +
> +void spapr_tce_table_disable(sPAPRTCETable *tcet)
> +{
> + if (!tcet->enabled) {
> + return;
> + }
>
> if (!kvm_enabled() ||
> (kvmppc_remove_spapr_tce(tcet->table, tcet->fd,
> tcet->nb_table) != 0)) {
> + tcet->fd = -1;
> g_free(tcet->table);
> }
> + tcet->table = NULL;
> + tcet->enabled = false;
> + spapr_tce_set_props(tcet, 0, 0, 0, false);
> +}
> +
> +static void spapr_tce_table_unrealize(DeviceState *dev, Error **errp)
> +{
> + sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
> +
> + QLIST_REMOVE(tcet, list);
> +
> + spapr_tce_table_disable(tcet);
> }
>
> MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet)
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 52c5c73..acfdbe5 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -895,15 +895,17 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp)
> sPAPRTCETable *tcet;
> uint32_t nb_table;
>
> - nb_table = SPAPR_PCI_DMA32_SIZE >> SPAPR_TCE_PAGE_SHIFT;
> - tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn,
> - 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false);
> + tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn);
> if (!tcet) {
> error_setg(errp, "Unable to create TCE table for %s",
> sphb->dtbusname);
> return ;
> }
>
> + nb_table = SPAPR_PCI_DMA32_SIZE >> SPAPR_TCE_PAGE_SHIFT;
> + spapr_tce_set_props(tcet, 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false);
> + spapr_tce_table_enable(tcet);
> +
> /* Register default 32bit DMA window */
> memory_region_add_subregion(&sphb->iommu_root, 0,
> spapr_tce_get_iommu(tcet));
> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> index f8b503e..6c9adb5 100644
> --- a/hw/ppc/spapr_pci_vfio.c
> +++ b/hw/ppc/spapr_pci_vfio.c
> @@ -34,6 +34,7 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp)
> int ret;
> sPAPRTCETable *tcet;
> uint32_t liobn = svphb->phb.dma_liobn;
> + uint32_t nb_table;
>
> ret = vfio_container_ioctl(&svphb->phb.iommu_as,
> VFIO_CHECK_EXTENSION,
> @@ -52,16 +53,18 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp)
> return;
> }
>
> - tcet = spapr_tce_new_table(DEVICE(sphb), liobn, info.dma32_window_start,
> - SPAPR_TCE_PAGE_SHIFT,
> - info.dma32_window_size >> SPAPR_TCE_PAGE_SHIFT,
> - true);
> + tcet = spapr_tce_new_table(DEVICE(sphb), liobn);
> if (!tcet) {
> error_setg(errp, "spapr-vfio: failed to create VFIO TCE table");
> return;
> }
>
> /* Register default 32bit DMA window */
> + nb_table = info.dma32_window_size >> SPAPR_TCE_PAGE_SHIFT;
> + spapr_tce_set_props(tcet, info.dma32_window_start, SPAPR_TCE_PAGE_SHIFT,
> + nb_table, true);
> + spapr_tce_table_enable(tcet);
> +
> memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset,
> spapr_tce_get_iommu(tcet));
> }
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 174033d..6394527 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -479,11 +479,11 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> memory_region_add_subregion_overlap(&dev->mrroot, 0, &dev->mrbypass, 1);
> address_space_init(&dev->as, &dev->mrroot, qdev->id);
>
> - dev->tcet = spapr_tce_new_table(qdev, liobn,
> - 0,
> - SPAPR_TCE_PAGE_SHIFT,
> - pc->rtce_window_size >>
> - SPAPR_TCE_PAGE_SHIFT, false);
> + dev->tcet = spapr_tce_new_table(qdev, liobn);
> + spapr_tce_set_props(dev->tcet, 0, SPAPR_TCE_PAGE_SHIFT,
> + pc->rtce_window_size >> SPAPR_TCE_PAGE_SHIFT,
> + false);
> + spapr_tce_table_enable(dev->tcet);
> dev->tcet->vdev = dev;
> memory_region_add_subregion_overlap(&dev->mrroot, 0,
> spapr_tce_get_iommu(dev->tcet), 2);
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 7d9ab9d..6e33b9b 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -498,6 +498,7 @@ typedef struct sPAPRTCETable sPAPRTCETable;
>
> struct sPAPRTCETable {
> DeviceState parent;
> + bool enabled;
> uint32_t liobn;
> uint32_t nb_table;
> uint64_t bus_offset;
> @@ -515,11 +516,12 @@ sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn);
> void spapr_events_init(sPAPREnvironment *spapr);
> void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
> - uint64_t bus_offset,
> - uint32_t page_shift,
> - uint32_t nb_table,
> - bool vfio_accel);
> +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn);
> +void spapr_tce_set_props(sPAPRTCETable *tcet, uint64_t bus_offset,
> + uint32_t page_shift, uint32_t nb_table,
> + bool vfio_accel);
> +void spapr_tce_table_enable(sPAPRTCETable *tcet);
> +void spapr_tce_table_disable(sPAPRTCETable *tcet);
> MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet);
> int spapr_dma_dt(void *fdt, int node_off, const char *propname,
> uint32_t liobn, uint64_t window, uint32_t size);
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson