* [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support
@ 2017-05-13 17:43 Eric Auger
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class Eric Auger
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Eric Auger @ 2017-05-13 17:43 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, edgar.iglesias,
	qemu-arm, qemu-devel, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain

This series introduces the emulation code for ARM SMMUv3.
This is the continuation of Prem's work [1].

This v4 is another early-visibility step; many restrictions still
apply to the model at the moment:
- only VMSAv8-64 is supported (no VMSAv8-32)
- only stage 1 has been tested (Block PTE still needs to be
  implemented and tested though)
- no integration with VFIO yet. We are missing some quirks to force
  the guest to invalidate TLBs when it updates the page tables.
  Replay, however, should be easy to implement on top of the new
  page table scan.
- stage 2 is not supported. There is partial support when decoding
  configuration information but no support at page table walk level.
  This is my next step.
- I do not plan to support nested S1 + S2 for now, as I understand
  no SW stack uses it at the moment.
- no support for HYP mappings
- fine-grained register emulation, commands, interrupts and errors
  were not thoroughly tested. Handling is sufficient to run a guest
  with a virtio-net-pci device using DMA ops.
- At the moment there is no change to the PCIe instantiation (further
  discussions about SMMUv2 TBUs versus SMMUv3 TLBs are needed).

- In ACPI mode, the guest must include the IOMMU probe deferral
  series, which fixes the streamid multiple lookup issue

The SMMU is instantiated by passing the smmu option to machvirt:
"-M virt-2.10,smmu"

Best Regards

Eric

This series can be found at:
v4: https://github.com/eauger/qemu/tree/v2.9-SMMU-v4

Testing:
- booted a 4.11 guest in dt and acpi mode with an iommu_platform
  virtio-net-pci device (using dma ops). Tested with the following
  guest combinations: 4K page - 39 bit VA, 4K - 48b, 64K - 39b,
  64K - 48b.
- Prem's unit test environment was also rebased (available on
  request)

References:
[1] Prem's last iteration:
- https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg03531.html

History:
v3 -> v4 [Eric]:
- page table walk rewritten to allow scan of the page table within a
  range of IOVA. This prepares for VFIO integration and replay.
- configuration parsing partially reworked.
- do not advertise unsupported/untested features: S2, S1 + S2, HYP,
  PRI, ATS, ..
- added ACPI table generation
- migrated to dynamic traces
- mingw compilation fix

v2 -> v3 [Eric]:
- rebased on 2.9
- mostly code and patch reorganization to ease the review process
- optional patches removed. They may be handled separately. I am currently
  working on ACPI enablement.
- optional instantiation of the smmu in mach-virt
- removed [2/9] (fdt functions) since not mandated
- start splitting main patch into base and derived object
- no new function feature added

v1 -> v2 [Prem]:
- Adopted review comments from Eric Auger
        - Make SMMU_DPRINTF internally call qemu_log
            (since there are many translation requests, we need control
             over the type of log we want)
        - SMMUTransCfg modified for simplicity
        - Change RegInfo to uint64 register array
        - Code cleanup
        - Test cleanups
- Reshuffled patches

v0 -> v1 [Prem]:
- As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
- Reworked register access/update logic
- Factored out translation code for
        - single point bug fix
        - sharing/removal in future
- (optional) Unit tests added, with PCI test device
        - S1 with 4k/64k, S1+S2 with 4k/64k
        - S1-only or S2-only can be verified by the Linux 4.7 driver
        - (optional) Preliminary ACPI support

v0 [Prem]:
- Implements SMMUv3 spec 11.0
- Support for PCIe devices
- Command Queue and Event Queue supported
- LPAE only; S1 is supported and tested, S2 not tested
- BE mode translation not supported
- IRQ support (legacy, no MSI)
- Tested with DPDK and e1000

Eric Auger (2):
  hw/arm/smmu-common: smmu base class
  hw/arm/virt: Add 2.10 machine type

Prem Mallappa (3):
  hw/arm/smmuv3: smmuv3 emulation model
  hw/arm/virt: Add SMMUv3 to the virt board
  hw/arm/virt-acpi-build: add smmuv3 node in IORT table

 default-configs/aarch64-softmmu.mak |    1 +
 hw/arm/Makefile.objs                |    1 +
 hw/arm/smmu-common.c                |  419 +++++++++++++
 hw/arm/smmu-internal.h              |   97 +++
 hw/arm/smmuv3-internal.h            |  603 +++++++++++++++++++
 hw/arm/smmuv3.c                     | 1134 +++++++++++++++++++++++++++++++++++
 hw/arm/trace-events                 |   44 ++
 hw/arm/virt-acpi-build.c            |   56 +-
 hw/arm/virt.c                       |  109 +++-
 include/hw/acpi/acpi-defs.h         |   15 +
 include/hw/arm/smmu-common.h        |  125 ++++
 include/hw/arm/smmuv3.h             |   87 +++
 include/hw/arm/virt.h               |    5 +
 13 files changed, 2687 insertions(+), 9 deletions(-)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 hw/arm/smmu-internal.h
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmu-common.h
 create mode 100644 include/hw/arm/smmuv3.h

-- 
2.5.5


* [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class
  2017-05-13 17:43 [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Eric Auger
@ 2017-05-13 17:43 ` Eric Auger
  2017-05-30 15:56   ` Peter Maydell
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 2/5] hw/arm/smmuv3: smmuv3 emulation model Eric Auger
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Eric Auger @ 2017-05-13 17:43 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, edgar.iglesias,
	qemu-arm, qemu-devel, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain

Introduces the base device and class for the ARM SMMU.
Implements VMSAv8-64 table lookup and translation. VMSAv8-32
is not yet implemented.

For VFIO integration we will need to notify mapping changes over
an input range and skip unmapped regions; the table walk helper
allows this.
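
For illustration only (not part of the patch), a future replay-like
caller could use the walk helper along these lines; the hook type and
the page_walk_64 class method are the ones introduced below, while
count_mappings and the surrounding code are made up:

    /* hypothetical hook matching smmu_page_walk_hook; it only counts
     * the valid mappings reported by the walk */
    static int count_mappings(IOMMUTLBEntry *entry, void *private)
    {
        int *count = private;

        (*count)++;   /* entry->iova/translated_addr describe the area */
        return 0;
    }

    ...
        /* assumes "s" is an SMMU device and "cfg" a decoded config */
        SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
        int count = 0;

        /* scan [start, end[ and call the hook on each valid mapping */
        sbc->page_walk_64(cfg, start, end, false /* must_translate */,
                          count_mappings, &count);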

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>

---
v3 -> v4:
- reworked page table walk to prepare for VFIO integration
  (capability to scan a range of IOVA). Same function is used
  for translate for a single iova. This is largely inspired
  from intel_iommu.c
- as the translate function was not straightforward to me,
  I tried to stick more closely to the VMSA spec.
- remove support of nested stage (kernel driver does not
  support it anyway)
- introduce smmu-internal.h to put page table definitions
- added smmu_find_as_from_bus_num
- SMMU_PCI_BUS_MAX and SMMU_PCI_DEVFN_MAX in smmu-common header
- new fields in SMMUState:
  - iommu_ops, smmu_as_by_busptr, smmu_as_by_bus_num
- use error_report and trace events
- add aa64[] field in SMMUTransCfg

v3:
- moved the base code in a separate patch to ease the review.
- clearer separation between base class and smmuv3 class
- translate_* only implemented as class methods
---
 default-configs/aarch64-softmmu.mak |   1 +
 hw/arm/Makefile.objs                |   1 +
 hw/arm/smmu-common.c                | 419 ++++++++++++++++++++++++++++++++++++
 hw/arm/smmu-internal.h              |  97 +++++++++
 hw/arm/trace-events                 |  12 ++
 include/hw/arm/smmu-common.h        | 125 +++++++++++
 6 files changed, 655 insertions(+)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 hw/arm/smmu-internal.h
 create mode 100644 include/hw/arm/smmu-common.h

diff --git a/default-configs/aarch64-softmmu.mak b/default-configs/aarch64-softmmu.mak
index 2449483..83a2932 100644
--- a/default-configs/aarch64-softmmu.mak
+++ b/default-configs/aarch64-softmmu.mak
@@ -7,3 +7,4 @@ CONFIG_AUX=y
 CONFIG_DDC=y
 CONFIG_DPCD=y
 CONFIG_XLNX_ZYNQMP=y
+CONFIG_ARM_SMMUV3=y
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 4c5c4ee..6c7d4af 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -18,3 +18,4 @@ obj-$(CONFIG_FSL_IMX25) += fsl-imx25.o imx25_pdk.o
 obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
new file mode 100644
index 0000000..49ed1c2
--- /dev/null
+++ b/hw/arm/smmu-common.c
@@ -0,0 +1,419 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Author: Prem Mallappa <pmallapp@broadcom.com>
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+#include "smmu-internal.h"
+
+inline MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf, dma_addr_t len,
+                                    bool secure)
+{
+    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};
+
+    switch (len) {
+    case 4:
+        *(uint32_t *)buf = ldl_le_phys(&address_space_memory, addr);
+        break;
+    case 8:
+        *(uint64_t *)buf = ldq_le_phys(&address_space_memory, addr);
+        break;
+    default:
+        return address_space_rw(&address_space_memory, addr,
+                                attrs, buf, len, false);
+    }
+    return MEMTX_OK;
+}
+
+inline void
+smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure)
+{
+    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};
+
+    switch (len) {
+    case 4:
+        stl_le_phys(&address_space_memory, addr, *(uint32_t *)buf);
+        break;
+    case 8:
+        stq_le_phys(&address_space_memory, addr, *(uint64_t *)buf);
+        break;
+    default:
+        address_space_rw(&address_space_memory, addr,
+                         attrs, buf, len, true);
+    }
+}
+
+/*************************/
+/* VMSAv8-64 Translation */
+/*************************/
+
+/**
+ * get_pte - Get the content of a page table entry located at
+ * @baseaddr[@index]
+ */
+static uint64_t get_pte(dma_addr_t baseaddr, uint32_t index)
+{
+    uint64_t pte;
+
+    if (smmu_read_sysmem(baseaddr + index * sizeof(pte),
+                         &pte, sizeof(pte), false)) {
+        error_report("can't read pte at address=0x%"PRIx64,
+                     baseaddr + index * sizeof(pte));
+        pte = (uint64_t)-1;
+        return pte;
+    }
+    trace_smmu_get_pte(baseaddr, index, baseaddr + index * sizeof(pte), pte);
+    /* TODO: handle endianness */
+    return pte;
+}
+
+/* VMSAv8-64 Translation Table Format Descriptor Decoding */
+
+#define PTE_ADDRESS(pte, shift) (extract64(pte, shift, 47 - shift) << shift)
+
+/**
+ * get_page_pte_address - returns the L3 descriptor output address,
+ * ie. the page frame
+ * ARM ARM spec: Figure D4-17 VMSAv8-64 level 3 descriptor format
+ */
+static inline hwaddr get_page_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_table_pte_address - return table descriptor output address,
+ * ie. address of next level table
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_block_pte_address - return block descriptor output address
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static hwaddr get_block_pte_address(uint64_t pte, int level, int granule_sz)
+{
+    int n;
+
+    switch (granule_sz) {
+    case 12:
+        if (level == 1) {
+            n = 30;
+        } else if (level == 2) {
+            n = 21;
+        } else {
+            goto error_out;
+        }
+        break;
+    case 14:
+        if (level == 2) {
+            n = 25;
+        } else {
+            goto error_out;
+        }
+        break;
+    case 16:
+        if (level == 2) {
+            n = 29;
+        } else {
+            goto error_out;
+        }
+        break;
+    default:
+            goto error_out;
+    }
+    return PTE_ADDRESS(pte, n);
+
+error_out:
+
+    error_report("unexpected granule_sz=%d/level=%d for block pte",
+                 granule_sz, level);
+    return (hwaddr)-1;
+}
+
+/**
+ * smmu_page_walk_level_64 - Walk an IOVA range from a specific level
+ * @baseaddr: table base address corresponding to @level
+ * @level: level
+ * @cfg: translation config
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ * @hook_fn: the hook to be called for each detected area
+ * @private: private data for the hook function
+ * @read: whether parent level has read permission
+ * @write: whether parent level has write permission
+ * @must_translate: indicates whether each iova of the range
+ *  must be translated or whether failure is allowed
+ * @notify_unmap: whether we should notify invalid entries
+ *
+ * Return 0 on success, < 0 on errors not related to the translation
+ * process, > 0 on errors related to the translation process (only
+ * if must_translate is set)
+ */
+static int
+smmu_page_walk_level_64(dma_addr_t baseaddr, int level,
+                        SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                        smmu_page_walk_hook hook_fn, void *private,
+                        bool read, bool write, bool must_translate,
+                        bool notify_unmap)
+{
+    uint64_t subpage_size, subpage_mask, pte, iova = start;
+    bool read_cur, write_cur, entry_valid;
+    int ret, granule_sz;
+    IOMMUTLBEntry entry;
+
+    granule_sz = cfg->granule_sz;
+
+    subpage_size = 1ULL << level_shift(level, granule_sz);
+    subpage_mask = level_page_mask(level, granule_sz);
+
+    trace_smmu_page_walk_level_in(level, baseaddr, granule_sz,
+                                  start, end, subpage_size);
+
+    while (iova < end) {
+        dma_addr_t next_table_baseaddr;
+        uint64_t iova_next;
+        uint32_t offset;
+
+        iova_next = (iova & subpage_mask) + subpage_size;
+
+        offset = iova_level_offset(iova, level, granule_sz);
+        pte = get_pte(baseaddr, offset);
+
+        trace_smmu_page_walk_level(level, iova, baseaddr, offset, pte);
+
+        if (pte == (uint64_t)-1) {
+            if (must_translate) {
+                return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+            }
+            goto next;
+        }
+        if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
+            trace_smmu_page_walk_level_res_invalid_pte(baseaddr, offset, pte);
+            if (must_translate) {
+                return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+            }
+            goto next;
+        }
+
+        read_cur = read; /* TODO */
+        write_cur = write; /* TODO */
+        entry_valid = read_cur | write_cur; /* TODO */
+
+        if (is_page_pte(pte, level)) {
+            entry.target_as = &address_space_memory;
+            entry.iova = iova & subpage_mask;
+            /* NOTE: this is only meaningful if entry_valid == true */
+            entry.translated_addr = get_page_pte_address(pte, granule_sz);
+            entry.addr_mask = ~subpage_mask;
+            entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
+            trace_smmu_page_walk_level_page_pte(pte, entry.translated_addr);
+            if (!entry_valid && !notify_unmap) {
+                printf("%s entry_valid=%d notify_unmap=%d\n", __func__,
+                       entry_valid, notify_unmap);
+                goto next;
+            }
+            if (hook_fn) {
+                ret = hook_fn(&entry, private);
+                if (ret) {
+                    return ret;
+                }
+            }
+            goto next;
+        }
+        if (is_block_pte(pte, level)) {
+            trace_smmu_page_walk_level_block_pte(pte,
+                get_block_pte_address(pte, level, granule_sz));
+            if (must_translate) {
+                return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+            }
+            printf("%s BLOCK PTE not handled yet\n", __func__);
+            goto next;
+        }
+        /* table pte */
+        next_table_baseaddr = get_table_pte_address(pte, granule_sz);
+        trace_smmu_page_walk_level_table_pte(pte, next_table_baseaddr);
+        ret = smmu_page_walk_level_64(next_table_baseaddr, level + 1, cfg,
+                                      iova, MIN(iova_next, end),
+                                      hook_fn, private, read_cur, write_cur,
+                                      must_translate, notify_unmap);
+        if (!ret) {
+            return ret;
+        }
+
+next:
+        iova = iova_next;
+    }
+
+    return SMMU_TRANS_ERR_NONE;
+}
+
+/**
+ * smmu_page_walk_64 - walk a specific IOVA range from the initial
+ * lookup level, and call the hook for each valid entry
+ *
+ * @cfg: translation config
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ * @must_translate: indicates whether each iova of the range
+ *  must be translated or whether failure is allowed
+ * @hook_fn: the hook to be called for each detected area
+ * @private: private data for the hook function
+ */
+static int
+smmu_page_walk_64(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                  bool must_translate, smmu_page_walk_hook hook_fn,
+                  void *private)
+{
+    dma_addr_t ttbr;
+    int stage = cfg->stage;
+    int initial_level;
+
+    if (stage != 1) {
+        error_report("%s stage 2 not yet supported", __func__);
+        return -1; /* TODO */
+    }
+
+    /* TODO check start/end */
+
+    ttbr = extract64(cfg->ttbr, 0, 48);
+    initial_level = initial_lookup_level(cfg->tsz, cfg->granule_sz);
+
+    trace_smmu_page_walk(stage, cfg->ttbr, initial_level, start, end);
+
+    if (initial_level < 0) {
+        return -1; /* TODO */
+    }
+
+    return smmu_page_walk_level_64(ttbr, initial_level, cfg, start, end,
+                                   hook_fn, private,
+                                   true /* read */, true /* write */,
+                                   must_translate, false /* notify_unmap */);
+}
+
+static int set_translated_address(IOMMUTLBEntry *entry, void *private)
+{
+    SMMUTransCfg *cfg = (SMMUTransCfg *)private;
+    size_t offset = cfg->input - entry->iova;
+
+    cfg->output = entry->translated_addr + offset;
+
+    trace_smmu_set_translated_address(cfg->input, cfg->output);
+    return 0;
+}
+
+static int
+smmu_translate_64(SMMUTransCfg *cfg, uint32_t *pagesize,
+                  uint32_t *perm, bool is_write)
+{
+    int ret;
+
+    ret = smmu_page_walk_64(cfg, cfg->input, cfg->input + 1,
+                            true /* must_translate */,
+                            set_translated_address, cfg);
+    *pagesize = 1 << cfg->granule_sz;
+    return ret;
+}
+
+/*************************/
+/* VMSAv8-32 Translation */
+/*************************/
+
+static int
+smmu_page_walk_32(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                  bool must_translate, smmu_page_walk_hook hook_fn,
+                  void *private)
+{
+    error_report("VMSAv8-32 translation is not yet implemented");
+    abort();
+}
+
+static int smmu_translate_32(SMMUTransCfg *cfg, uint32_t *pagesize,
+                             uint32_t *perm, bool is_write)
+{
+    error_report("VMSAv8-32 translation is not yet implemented");
+    abort();
+}
+
+/******************/
+/* Infrastructure */
+/******************/
+
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
+{
+    SMMUPciBus *smmu_pci_bus = s->smmu_as_by_bus_num[bus_num];
+
+    if (!smmu_pci_bus) {
+        GHashTableIter iter;
+
+        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
+        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
+            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
+                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
+                return smmu_pci_bus;
+            }
+        }
+    }
+    return smmu_pci_bus;
+}
+
+static void smmu_base_instance_init(Object *obj)
+{
+     /* Nothing much to do here as of now */
+}
+
+static void smmu_base_class_init(ObjectClass *klass, void *data)
+{
+    SMMUBaseClass *sbc = SMMU_DEVICE_CLASS(klass);
+
+    sbc->translate_64 = smmu_translate_64;
+    sbc->page_walk_64 = smmu_page_walk_64;
+
+    sbc->translate_32 = smmu_translate_32;
+    sbc->page_walk_32 = smmu_page_walk_32;
+}
+
+static const TypeInfo smmu_base_info = {
+    .name          = TYPE_SMMU_DEV_BASE,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(SMMUState),
+    .instance_init = smmu_base_instance_init,
+    .class_data    = NULL,
+    .class_size    = sizeof(SMMUBaseClass),
+    .class_init    = smmu_base_class_init,
+    .abstract      = true,
+};
+
+static void smmu_base_register_types(void)
+{
+    type_register_static(&smmu_base_info);
+}
+
+type_init(smmu_base_register_types)
+
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
new file mode 100644
index 0000000..5e890bb
--- /dev/null
+++ b/hw/arm/smmu-internal.h
@@ -0,0 +1,97 @@
+/*
+ * ARM SMMU support - Internal API
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_INTERNAL_H
+#define HW_ARM_SMMU_INTERNAL_H
+
+#define ARM_LPAE_MAX_ADDR_BITS          48
+#define ARM_LPAE_MAX_LEVELS             4
+
+/* Page table bits */
+
+#define ARM_LPAE_PTE_TYPE_SHIFT         0
+#define ARM_LPAE_PTE_TYPE_MASK          0x3
+
+#define ARM_LPAE_PTE_TYPE_BLOCK         1
+#define ARM_LPAE_PTE_TYPE_RESERVED      1
+#define ARM_LPAE_PTE_TYPE_TABLE         3
+#define ARM_LPAE_PTE_TYPE_PAGE          3
+
+#define ARM_LPAE_PTE_VALID              (1 << 0)
+
+static inline bool is_invalid_pte(uint64_t pte)
+{
+    return !(pte & ARM_LPAE_PTE_VALID);
+}
+
+static inline bool is_reserved_pte(uint64_t pte, int level)
+{
+    return ((level == 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_RESERVED));
+}
+
+static inline bool is_block_pte(uint64_t pte, int level)
+{
+    return ((level < 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_BLOCK));
+}
+
+static inline bool is_table_pte(uint64_t pte, int level)
+{
+    return ((level < 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_TABLE));
+}
+
+static inline bool is_page_pte(uint64_t pte, int level)
+{
+    return ((level == 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_PAGE));
+}
+
+static inline int level_shift(int level, int granule_sz)
+{
+    return granule_sz + (3 - level) * (granule_sz - 3);
+}
+
+static inline uint64_t level_page_mask(int level, int granule_sz)
+{
+    return ~((1ULL << level_shift(level, granule_sz)) - 1);
+}
+
+/**
+ * TODO: handle the case where the level resolves less than
+ * granule_sz -3 IA bits.
+ */
+static inline
+uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
+{
+    return (iova >> level_shift(level, granule_sz)) &
+            ((1ULL << (granule_sz - 3)) - 1);
+}
+
+/* TODO: check this for stage 2 and table concatenation */
+static inline int initial_lookup_level(int tnsz, int granule_sz)
+{
+    return 4 - (64 - tnsz - 4) / (granule_sz - 3);
+}
+
+
+
+#endif
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index d5f33a2..1d53ad0 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -2,3 +2,15 @@
 
 # hw/arm/virt-acpi-build.c
 virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
+
+# hw/arm/smmu-common.c
+
+smmu_page_walk(int stage, uint64_t baseaddr, int first_level, uint64_t start, uint64_t end) "stage=%d, baseaddr=0x%"PRIx64", first level=%d, start=0x%"PRIx64", end=0x%"PRIx64
+smmu_page_walk_level_in(int level, uint64_t baseaddr, int granule_sz, uint64_t start, uint64_t end, uint64_t subpage_size) "level=%d baseaddr=0x%"PRIx64" granule=%d, start=0x%"PRIx64" end=0x%"PRIx64", subpage_size=0x%lx"
+smmu_page_walk_level(int level, uint64_t iova, uint64_t baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%lx baseaddr=0x%"PRIx64" offset=0x%x => pte=0x%lx"
+smmu_page_walk_level_res_invalid_pte(uint64_t baseaddr, uint32_t offset, uint64_t pte) "baseaddr=0x%"PRIx64" offset=0x%x pte=0x%lx"
+smmu_page_walk_level_page_pte(uint64_t pte, uint64_t address) "pte=0x%"PRIx64" page address = 0x%"PRIx64
+smmu_page_walk_level_block_pte(uint64_t pte, uint64_t address) "pte=0x%"PRIx64" block address = 0x%"PRIx64
+smmu_page_walk_level_table_pte(uint64_t pte, uint64_t address) "pte=0x%"PRIx64" next table address = 0x%"PRIx64
+smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
+smmu_set_translated_address(hwaddr iova, hwaddr va) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
new file mode 100644
index 0000000..836c916
--- /dev/null
+++ b/include/hw/arm/smmu-common.h
@@ -0,0 +1,125 @@
+/*
+ * ARM SMMU Support
+ *
+ * Copyright (C) 2015-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_COMMON_H
+#define HW_ARM_SMMU_COMMON_H
+
+#include <hw/sysbus.h>
+#include "hw/pci/pci.h"
+
+#define SMMU_PCI_BUS_MAX      256
+#define SMMU_PCI_DEVFN_MAX    256
+
+typedef enum {
+    SMMU_TRANS_ERR_NONE          = 0x0,
+    SMMU_TRANS_ERR_WALK_EXT_ABRT = 0x1,  /* Translation walk external abort */
+    SMMU_TRANS_ERR_TRANS         = 0x10, /* Translation fault */
+    SMMU_TRANS_ERR_ADDR_SZ,              /* Address Size fault */
+    SMMU_TRANS_ERR_ACCESS,               /* Access fault */
+    SMMU_TRANS_ERR_PERM,                 /* Permission fault */
+    SMMU_TRANS_ERR_TLB_CONFLICT  = 0x20, /* TLB Conflict */
+} SMMUTransErr;
+
+/*
+ * Generic structure populated by derived SMMU devices
+ * after decoding the configuration information and used as
+ * input to the page table walk
+ */
+typedef struct SMMUTransCfg {
+    hwaddr   input;            /* input address */
+    hwaddr   output;           /* Output address */
+    int      stage;            /* translation stage */
+    uint32_t oas;              /* output address width */
+    uint32_t tsz;              /* input range, ie. 2^(64 - tsz) */
+    uint64_t ttbr;             /* TTBR address */
+    uint32_t granule_sz;       /* granule page shift */
+    bool     aa64;             /* aarch64 or aarch32 translation table */
+} SMMUTransCfg;
+
+typedef struct SMMUDevice {
+    void         *smmu;
+    PCIBus       *bus;
+    int           devfn;
+    MemoryRegion  iommu;
+    AddressSpace  as;
+} SMMUDevice;
+
+typedef struct SMMUNotifierNode {
+    SMMUDevice *sdev;
+    QLIST_ENTRY(SMMUNotifierNode) next;
+} SMMUNotifierNode;
+
+typedef struct SMMUPciBus {
+    PCIBus       *bus;
+    SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
+} SMMUPciBus;
+
+typedef struct SMMUState {
+    /* <private> */
+    SysBusDevice  dev;
+
+    MemoryRegion iomem;
+
+    MemoryRegionIOMMUOps iommu_ops;
+    GHashTable *smmu_as_by_busptr;
+    SMMUPciBus *smmu_as_by_bus_num[SMMU_PCI_BUS_MAX];
+    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
+
+} SMMUState;
+
+typedef int (*smmu_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
+
+typedef struct {
+    /* <private> */
+    SysBusDeviceClass parent_class;
+
+    /* public */
+    int (*translate_32)(SMMUTransCfg *cfg, uint32_t *pagesize,
+                        uint32_t *perm, bool is_write);
+    int (*translate_64)(SMMUTransCfg *cfg, uint32_t *pagesize,
+                        uint32_t *perm, bool is_write);
+    int (*page_walk_32)(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                        bool must_translate, smmu_page_walk_hook hook_fn,
+                        void *private);
+    int (*page_walk_64)(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                        bool must_translate, smmu_page_walk_hook hook_fn,
+                        void *private);
+} SMMUBaseClass;
+
+#define TYPE_SMMU_DEV_BASE "smmu-base"
+#define SMMU_SYS_DEV(obj) OBJECT_CHECK(SMMUState, (obj), TYPE_SMMU_DEV_BASE)
+#define SMMU_DEVICE_GET_CLASS(obj)                              \
+    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_DEV_BASE)
+#define SMMU_DEVICE_CLASS(klass)                                    \
+    OBJECT_CLASS_CHECK(SMMUBaseClass, (klass), TYPE_SMMU_DEV_BASE)
+
+MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf,
+                             dma_addr_t len, bool secure);
+void smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure);
+
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num);
+
+static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
+{
+    return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
+}
+
+#endif  /* HW_ARM_SMMU_COMMON_H */
-- 
2.5.5


* [Qemu-devel] [RFC v4 2/5] hw/arm/smmuv3: smmuv3 emulation model
  2017-05-13 17:43 [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Eric Auger
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class Eric Auger
@ 2017-05-13 17:43 ` Eric Auger
  2017-05-30 16:01   ` Peter Maydell
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 3/5] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Eric Auger @ 2017-05-13 17:43 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, edgar.iglesias,
	qemu-arm, qemu-devel, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain

From: Prem Mallappa <prem.mallappa@broadcom.com>

Introduces the SMMUv3 derived model. This is based on the ARM
System MMUv3 specification (v17).

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4
- smmu_irq_update
- fix hash key allocation
- set smmu_iommu_ops
- set SMMU_REG_CR0,
- smmuv3_translate: ret.perm not set in bypass mode
- use trace events
- renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
- rework smmu_find_ste
- fix tg2granule: in TT0, 0b10 corresponds to 16kB

v2 -> v3:
- move creation of include/hw/arm/smmuv3.h to this patch to fix compilation issue
- compilation allowed
- fix sbus allocation in smmu_init_pci_iommu
- restructure code into headers
- misc cleanups
---
 hw/arm/Makefile.objs     |    2 +-
 hw/arm/smmuv3-internal.h |  603 ++++++++++++++++++++++++
 hw/arm/smmuv3.c          | 1134 ++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |   32 ++
 include/hw/arm/smmuv3.h  |   87 ++++
 5 files changed, 1857 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmuv3.h

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 6c7d4af..02cd23f 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -18,4 +18,4 @@ obj-$(CONFIG_FSL_IMX25) += fsl-imx25.o imx25_pdk.o
 obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
-obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
new file mode 100644
index 0000000..db0048f
--- /dev/null
+++ b/hw/arm/smmuv3-internal.h
@@ -0,0 +1,603 @@
+/*
+ * ARM SMMUv3 support - Internal API
+ *
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_V3_INTERNAL_H
+#define HW_ARM_SMMU_V3_INTERNAL_H
+
+#include "trace.h"
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+
+/*****************************
+ * MMIO Register
+ *****************************/
+enum {
+    SMMU_REG_IDR0            = 0x0,
+
+/* IDR0 Field Values and supported features */
+
+#define SMMU_IDR0_S2P      1  /* stage 2 */
+#define SMMU_IDR0_S1P      1  /* stage 1 */
+#define SMMU_IDR0_TTF      2  /* Aarch64 only - not Aarch32 (LPAE) */
+#define SMMU_IDR0_COHACC   1  /* IO coherent access */
+#define SMMU_IDR0_HTTU     2  /* Access and Dirty flag update */
+#define SMMU_IDR0_HYP      0  /* Hypervisor Stage 1 contexts */
+#define SMMU_IDR0_ATS      0  /* PCIe RC ATS */
+#define SMMU_IDR0_ASID16   1  /* 16-bit ASID */
+#define SMMU_IDR0_PRI      0  /* Page Request Interface */
+#define SMMU_IDR0_VMID16   0  /* 16-bit VMID */
+#define SMMU_IDR0_CD2L     0  /* 2-level Context Descriptor table */
+#define SMMU_IDR0_STALL    1  /* Stalling fault model */
+#define SMMU_IDR0_TERM     1  /* Termination model behaviour */
+#define SMMU_IDR0_STLEVEL  1  /* Multi-level Stream Table */
+
+#define SMMU_IDR0_S2P_SHIFT      0
+#define SMMU_IDR0_S1P_SHIFT      1
+#define SMMU_IDR0_TTF_SHIFT      2
+#define SMMU_IDR0_COHACC_SHIFT   4
+#define SMMU_IDR0_HTTU_SHIFT     6
+#define SMMU_IDR0_HYP_SHIFT      9
+#define SMMU_IDR0_ATS_SHIFT      10
+#define SMMU_IDR0_ASID16_SHIFT   12
+#define SMMU_IDR0_PRI_SHIFT      16
+#define SMMU_IDR0_VMID16_SHIFT   18
+#define SMMU_IDR0_CD2L_SHIFT     19
+#define SMMU_IDR0_STALL_SHIFT    24
+#define SMMU_IDR0_TERM_SHIFT     26
+#define SMMU_IDR0_STLEVEL_SHIFT  27
+
+    SMMU_REG_IDR1            = 0x4,
+#define SMMU_IDR1_SIDSIZE 16
+    SMMU_REG_IDR2            = 0x8,
+    SMMU_REG_IDR3            = 0xc,
+    SMMU_REG_IDR4            = 0x10,
+    SMMU_REG_IDR5            = 0x14,
+#define SMMU_IDR5_GRAN_SHIFT 4
+#define SMMU_IDR5_GRAN       0b101 /* GRAN4K, GRAN64K */
+#define SMMU_IDR5_OAS        4     /* 44 bits */
+    SMMU_REG_IIDR            = 0x1c,
+    SMMU_REG_CR0             = 0x20,
+
+#define SMMU_CR0_SMMU_ENABLE (1 << 0)
+#define SMMU_CR0_PRIQ_ENABLE (1 << 1)
+#define SMMU_CR0_EVTQ_ENABLE (1 << 2)
+#define SMMU_CR0_CMDQ_ENABLE (1 << 3)
+#define SMMU_CR0_ATS_CHECK   (1 << 4)
+
+    SMMU_REG_CR0_ACK         = 0x24,
+    SMMU_REG_CR1             = 0x28,
+    SMMU_REG_CR2             = 0x2c,
+
+    SMMU_REG_STATUSR         = 0x40,
+
+    SMMU_REG_IRQ_CTRL        = 0x50,
+    SMMU_REG_IRQ_CTRL_ACK    = 0x54,
+
+#define SMMU_IRQ_CTRL_GERROR_EN (1 << 0)
+#define SMMU_IRQ_CTRL_EVENT_EN  (1 << 1)
+#define SMMU_IRQ_CTRL_PRI_EN    (1 << 2)
+
+    SMMU_REG_GERROR          = 0x60,
+
+#define SMMU_GERROR_CMDQ       (1 << 0)
+#define SMMU_GERROR_EVENTQ     (1 << 2)
+#define SMMU_GERROR_PRIQ       (1 << 3)
+#define SMMU_GERROR_MSI_CMDQ   (1 << 4)
+#define SMMU_GERROR_MSI_EVENTQ (1 << 5)
+#define SMMU_GERROR_MSI_PRIQ   (1 << 6)
+#define SMMU_GERROR_MSI_GERROR (1 << 7)
+#define SMMU_GERROR_SFM_ERR    (1 << 8)
+
+    SMMU_REG_GERRORN         = 0x64,
+    SMMU_REG_GERROR_IRQ_CFG0 = 0x68,
+    SMMU_REG_GERROR_IRQ_CFG1 = 0x70,
+    SMMU_REG_GERROR_IRQ_CFG2 = 0x74,
+
+    /* SMMU_BASE_RA Applies to STRTAB_BASE, CMDQ_BASE and EVTQ_BASE */
+#define SMMU_BASE_RA        (1ULL << 62)
+    SMMU_REG_STRTAB_BASE     = 0x80,
+    SMMU_REG_STRTAB_BASE_CFG = 0x88,
+
+    SMMU_REG_CMDQ_BASE       = 0x90,
+    SMMU_REG_CMDQ_PROD       = 0x98,
+    SMMU_REG_CMDQ_CONS       = 0x9c,
+    /* CMD Consumer (CONS) */
+#define SMMU_CMD_CONS_ERR_SHIFT        24
+#define SMMU_CMD_CONS_ERR_BITS         7
+
+    SMMU_REG_EVTQ_BASE       = 0xa0,
+    SMMU_REG_EVTQ_PROD       = 0xa8,
+    SMMU_REG_EVTQ_CONS       = 0xac,
+    SMMU_REG_EVTQ_IRQ_CFG0   = 0xb0,
+    SMMU_REG_EVTQ_IRQ_CFG1   = 0xb8,
+    SMMU_REG_EVTQ_IRQ_CFG2   = 0xbc,
+
+    SMMU_REG_PRIQ_BASE       = 0xc0,
+    SMMU_REG_PRIQ_PROD       = 0xc8,
+    SMMU_REG_PRIQ_CONS       = 0xcc,
+    SMMU_REG_PRIQ_IRQ_CFG0   = 0xd0,
+    SMMU_REG_PRIQ_IRQ_CFG1   = 0xd8,
+    SMMU_REG_PRIQ_IRQ_CFG2   = 0xdc,
+
+    SMMU_ID_REGS_OFFSET      = 0xfd0,
+
+    /* Secure registers are not used for now */
+    SMMU_SECURE_OFFSET       = 0x8000,
+};
+
+/**********************
+ * Data Structures
+ **********************/
+
+struct __smmu_data2 {
+    uint32_t word[2];
+};
+
+struct __smmu_data8 {
+    uint32_t word[8];
+};
+
+struct __smmu_data16 {
+    uint32_t word[16];
+};
+
+struct __smmu_data4 {
+    uint32_t word[4];
+};
+
+typedef struct __smmu_data2  STEDesc; /* STE Level 1 Descriptor */
+typedef struct __smmu_data16 Ste;     /* Stream Table Entry(STE) */
+typedef struct __smmu_data2  CDDesc;  /* CD Level 1 Descriptor */
+typedef struct __smmu_data16 Cd;      /* Context Descriptor(CD) */
+
+typedef struct __smmu_data4  Cmd; /* Command Entry */
+typedef struct __smmu_data8  Evt; /* Event Entry */
+typedef struct __smmu_data4  Pri; /* PRI entry */
+
+/*****************************
+ * STE fields
+ *****************************/
+
+#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */
+#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
+enum {
+    STE_CONFIG_NONE      = 0,
+    STE_CONFIG_BYPASS    = 4,           /* S1 Bypass, S2 Bypass */
+    STE_CONFIG_S1TR      = 1,           /* S1 Translate, S2 Bypass */
+    STE_CONFIG_S2TR      = 2,           /* S1 Bypass, S2 Translate */
+    STE_CONFIG_S1TR_S2TR = 3,           /* S1 Translate, S2 Translate */
+};
+#define STE_S1FMT(x)   extract32((x)->word[0], 4, 2)
+#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
+#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
+#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
+#define STE_S2VMID(x)  extract32((x)->word[4], 0, 16)
+#define STE_S2T0SZ(x)  extract32((x)->word[5], 0, 6)
+#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
+#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
+#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
+#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
+#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
+#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
+#define STE_CTXPTR(x)                                           \
+    ({                                                          \
+        unsigned long addr;                                     \
+        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
+        addr;                                                   \
+    })
+
+#define STE_S2TTB(x)                                            \
+    ({                                                          \
+        unsigned long addr;                                     \
+        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
+        addr;                                                   \
+    })
+
+static inline int is_ste_bypass(Ste *ste)
+{
+    return STE_CONFIG(ste) == STE_CONFIG_BYPASS;
+}
+
+static inline bool has_stage1(Ste *ste)
+{
+    return STE_CONFIG(ste) & 0b001;
+}
+
+static inline bool has_stage2(Ste *ste)
+{
+    return STE_CONFIG(ste) & 0b010;
+}
+
+/**
+ * is_s2granule_valid - Check that the stage 2 translation granule size
+ * advertised in the STE matches an IDR5-supported value
+ */
+static inline bool is_s2granule_valid(Ste *ste)
+{
+    int idr5_format = 0;
+
+    switch (STE_S2TG(ste)) {
+    case 0: /* 4kB */
+        idr5_format = 0x1;
+        break;
+    case 1: /* 64 kB */
+        idr5_format = 0x4;
+        break;
+    case 2: /* 16 kB */
+        idr5_format = 0x2;
+        break;
+    case 3: /* reserved */
+        break;
+    }
+    idr5_format &= SMMU_IDR5_GRAN;
+    return idr5_format;
+}
+
+static inline int oas2bits(int oas_field)
+{
+    switch (oas_field) {
+    case 0b011:
+        return 42;
+    case 0b100:
+        return 44;
+    default:
+        return 32 + (1 << oas_field);
+   }
+}
+
+static inline int pa_range(Ste *ste)
+{
+    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
+
+    if (!STE_S2AA64(ste)) {
+        return 40;
+    }
+
+    return oas2bits(oas_field);
+}
+
+#define MAX_PA(ste) ((1ULL << pa_range(ste)) - 1)
+
+/*****************************
+ * CD fields
+ *****************************/
+#define CD_VALID(x)   extract32((x)->word[0], 30, 1)
+#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
+#define CD_TTB(x, sel)                                      \
+    ({                                                      \
+        uint64_t hi, lo;                                    \
+        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \
+        hi <<= 32;                                          \
+        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \
+        hi | lo;                                            \
+    })
+
+#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
+#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
+#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
+
+#define CD_T0SZ(x)    CD_TSZ((x), 0)
+#define CD_T1SZ(x)    CD_TSZ((x), 1)
+#define CD_TG0(x)     CD_TG((x), 0)
+#define CD_TG1(x)     CD_TG((x), 1)
+#define CD_EPD0(x)    CD_EPD((x), 0)
+#define CD_EPD1(x)    CD_EPD((x), 1)
+#define CD_IPS(x)     extract32((x)->word[1], 0, 3)
+#define CD_AARCH64(x) extract32((x)->word[1], 9, 1)
+#define CD_TTB0(x)    CD_TTB((x), 0)
+#define CD_TTB1(x)    CD_TTB((x), 1)
+
+#define CDM_VALID(x)    ((x)->word[0] & 0x1)
+
+static inline int is_cd_valid(SMMUV3State *s, Ste *ste, Cd *cd)
+{
+    return CD_VALID(cd);
+}
+
+/*****************************
+ * Commands
+ *****************************/
+enum {
+    SMMU_CMD_PREFETCH_CONFIG = 0x01,
+    SMMU_CMD_PREFETCH_ADDR,
+    SMMU_CMD_CFGI_STE,
+    SMMU_CMD_CFGI_STE_RANGE,
+    SMMU_CMD_CFGI_CD,
+    SMMU_CMD_CFGI_CD_ALL,
+    SMMU_CMD_CFGI_ALL,
+    SMMU_CMD_TLBI_NH_ALL     = 0x10,
+    SMMU_CMD_TLBI_NH_ASID,
+    SMMU_CMD_TLBI_NH_VA,
+    SMMU_CMD_TLBI_NH_VAA,
+    SMMU_CMD_TLBI_EL3_ALL    = 0x18,
+    SMMU_CMD_TLBI_EL3_VA     = 0x1a,
+    SMMU_CMD_TLBI_EL2_ALL    = 0x20,
+    SMMU_CMD_TLBI_EL2_ASID,
+    SMMU_CMD_TLBI_EL2_VA,
+    SMMU_CMD_TLBI_EL2_VAA,  /* 0x23 */
+    SMMU_CMD_TLBI_S12_VMALL  = 0x28,
+    SMMU_CMD_TLBI_S2_IPA     = 0x2a,
+    SMMU_CMD_TLBI_NSNH_ALL   = 0x30,
+    SMMU_CMD_ATC_INV         = 0x40,
+    SMMU_CMD_PRI_RESP,
+    SMMU_CMD_RESUME          = 0x44,
+    SMMU_CMD_STALL_TERM,
+    SMMU_CMD_SYNC,          /* 0x46 */
+};
+
+/*****************************
+ *  Register Access Primitives
+ *****************************/
+
+static inline void smmu_write64_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
+{
+    addr >>= 2;
+    s->regs[addr] = val & 0xFFFFFFFFULL;
+    s->regs[addr + 1] = val >> 32;
+}
+
+static inline void smmu_write_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
+{
+    s->regs[addr >> 2] = val;
+}
+
+static inline uint32_t smmu_read_reg(SMMUV3State *s, uint32_t addr)
+{
+    return s->regs[addr >> 2];
+}
+
+static inline uint64_t smmu_read64_reg(SMMUV3State *s, uint32_t addr)
+{
+    addr >>= 2;
+    return s->regs[addr] | ((uint64_t)s->regs[addr + 1] << 32);
+}
+
+#define smmu_read32_reg smmu_read_reg
+#define smmu_write32_reg smmu_write_reg
+
+/*****************************
+ * CMDQ fields
+ *****************************/
+
+enum { /* Command Errors */
+    SMMU_CMD_ERR_NONE = 0,
+    SMMU_CMD_ERR_ILLEGAL,
+    SMMU_CMD_ERR_ABORT
+};
+
+enum { /* Command completion notification */
+    CMD_SYNC_SIG_NONE,
+    CMD_SYNC_SIG_IRQ,
+    CMD_SYNC_SIG_SEV,
+};
+
+#define CMD_TYPE(x)  extract32((x)->word[0], 0, 8)
+#define CMD_SEC(x)   extract32((x)->word[0], 9, 1)
+#define CMD_SEV(x)   extract32((x)->word[0], 10, 1)
+#define CMD_AC(x)    extract32((x)->word[0], 12, 1)
+#define CMD_AB(x)    extract32((x)->word[0], 13, 1)
+#define CMD_CS(x)    extract32((x)->word[0], 12, 2)
+#define CMD_SSID(x)  extract32((x)->word[0], 16, 16)
+#define CMD_SID(x)   ((x)->word[1])
+#define CMD_VMID(x)  extract32((x)->word[1], 0, 16)
+#define CMD_ASID(x)  extract32((x)->word[1], 16, 16)
+#define CMD_STAG(x)  extract32((x)->word[2], 0, 16)
+#define CMD_RESP(x)  extract32((x)->word[2], 11, 2)
+#define CMD_GRPID(x) extract32((x)->word[3], 0, 8)
+#define CMD_SIZE(x)  extract32((x)->word[3], 0, 16)
+#define CMD_LEAF(x)  extract32((x)->word[3], 0, 1)
+#define CMD_SPAN(x)  extract32((x)->word[3], 0, 5)
+#define CMD_ADDR(x) ({                                  \
+            uint64_t addr = (uint64_t)(x)->word[3];     \
+            addr <<= 32;                                \
+            addr |=  extract32((x)->word[3], 12, 20);   \
+            addr;                                       \
+        })
+
+/***************************
+ * Queue Handling
+ ***************************/
+
+typedef enum {
+    CMD_Q_EMPTY,
+    CMD_Q_FULL,
+    CMD_Q_INUSE,
+} SMMUQStatus;
+
+#define Q_ENTRY(q, idx)  ((q)->base + (q)->ent_size * (idx))
+#define Q_WRAP(q, pc)    ((pc) >> (q)->shift)
+#define Q_IDX(q, pc)     ((pc) & ((1 << (q)->shift) - 1))
+
+static inline SMMUQStatus
+__smmu_queue_status(SMMUV3State *s, SMMUQueue *q)
+{
+    uint32_t prod = Q_IDX(q, q->prod), cons = Q_IDX(q, q->cons);
+    if ((prod == cons) && (q->wrap.prod != q->wrap.cons)) {
+        return CMD_Q_FULL;
+    } else if ((prod == cons) && (q->wrap.prod == q->wrap.cons)) {
+        return CMD_Q_EMPTY;
+    }
+    return CMD_Q_INUSE;
+}
+#define smmu_is_q_full(s, q) (__smmu_queue_status(s, q) == CMD_Q_FULL)
+#define smmu_is_q_empty(s, q) (__smmu_queue_status(s, q) == CMD_Q_EMPTY)
+
+static inline int __smmu_q_enabled(SMMUV3State *s, uint32_t q)
+{
+    return smmu_read32_reg(s, SMMU_REG_CR0) & q;
+}
+#define smmu_cmd_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_CMDQ_ENABLE)
+#define smmu_evt_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_EVTQ_ENABLE)
+
+#define SMMU_CMDQ_ERR(s) ((smmu_read32_reg(s, SMMU_REG_GERROR) ^    \
+                           smmu_read32_reg(s, SMMU_REG_GERRORN)) &  \
+                          SMMU_GERROR_CMDQ)
+
+/*****************************
+ * EVTQ fields
+ *****************************/
+
+#define EVT_Q_OVERFLOW        (1 << 31)
+
+#define EVT_SET_TYPE(x, t)    deposit32((x)->word[0], 0, 8, t)
+#define EVT_SET_SID(x, s)     ((x)->word[1] =  s)
+#define EVT_SET_INPUT_ADDR(x, addr) ({                    \
+            (x)->word[5] = (uint32_t)(addr >> 32);        \
+            (x)->word[4] = (uint32_t)(addr & 0xffffffff); \
+            addr;                                         \
+        })
+
+/*****************************
+ * Events
+ *****************************/
+
+enum evt_err {
+    SMMU_EVT_F_UUT    = 0x1,
+    SMMU_EVT_C_BAD_SID,
+    SMMU_EVT_F_STE_FETCH,
+    SMMU_EVT_C_BAD_STE,
+    SMMU_EVT_F_BAD_ATS_REQ,
+    SMMU_EVT_F_STREAM_DISABLED,
+    SMMU_EVT_F_TRANS_FORBIDDEN,
+    SMMU_EVT_C_BAD_SSID,
+    SMMU_EVT_F_CD_FETCH,
+    SMMU_EVT_C_BAD_CD,
+    SMMU_EVT_F_WALK_EXT_ABRT,
+    SMMU_EVT_F_TRANS        = 0x10,
+    SMMU_EVT_F_ADDR_SZ,
+    SMMU_EVT_F_ACCESS,
+    SMMU_EVT_F_PERM,
+    SMMU_EVT_F_TLB_CONFLICT = 0x20,
+    SMMU_EVT_F_CFG_CONFLICT = 0x21,
+    SMMU_EVT_E_PAGE_REQ     = 0x24,
+};
+
+typedef enum evt_err SMMUEvtErr;
+
+/*****************************
+ * Interrupts
+ *****************************/
+
+static inline int __smmu_irq_enabled(SMMUV3State *s, uint32_t q)
+{
+    return smmu_read64_reg(s, SMMU_REG_IRQ_CTRL) & q;
+}
+#define smmu_evt_irq_enabled(s)                   \
+    __smmu_irq_enabled(s, SMMU_IRQ_CTRL_EVENT_EN)
+#define smmu_gerror_irq_enabled(s)                  \
+    __smmu_irq_enabled(s, SMMU_IRQ_CTRL_GERROR_EN)
+#define smmu_pri_irq_enabled(s)                 \
+    __smmu_irq_enabled(s, SMMU_IRQ_CTRL_PRI_EN)
+
+static inline bool
+smmu_is_irq_pending(SMMUV3State *s, int irq)
+{
+    return smmu_read32_reg(s, SMMU_REG_GERROR) ^
+        smmu_read32_reg(s, SMMU_REG_GERRORN);
+}
+
+/*****************************
+ * Hash Table
+ *****************************/
+
+static inline gboolean smmu_uint64_equal(gconstpointer v1, gconstpointer v2)
+{
+    return *((const uint64_t *)v1) == *((const uint64_t *)v2);
+}
+
+static inline guint smmu_uint64_hash(gconstpointer v)
+{
+    return (guint)*(const uint64_t *)v;
+}
+
+/*****************************
+ * Misc
+ *****************************/
+
+/**
+ * tg2granule - Decodes the CD translation granule size field according
+ * to the TT in use
+ * @bits: TG0/1 field
+ * @tg1: if set, @bits belongs to TG1, otherwise to TG0
+ */
+static inline int tg2granule(int bits, bool tg1)
+{
+    switch (bits) {
+    case 1:
+        return tg1 ? 14 : 16;
+    case 2:
+        return tg1 ? 12 : 14;
+    case 3:
+        return tg1 ? 16 : 12;
+    default:
+        return 12;
+    }
+}
+
+#define L1STD_L2PTR(stm) ({                                 \
+            uint64_t hi, lo;                            \
+            hi = (stm)->word[1];                        \
+            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \
+            hi << 32 | lo;                              \
+        })
+
+#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 4))
+
+/*****************************
+ * Debug
+ *****************************/
+/* #define ARM_SMMU_DEBUG */
+
+#ifdef ARM_SMMU_DEBUG
+static inline void dump_ste(Ste *ste)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(ste->word); i += 2) {
+        trace_smmu_dump_ste(i, ste->word[i], i + 1, ste->word[i + 1]);
+    }
+}
+
+static inline void dump_cd(Cd *cd)
+{
+    int i;
+    for (i = 0; i < ARRAY_SIZE(cd->word); i += 2) {
+        trace_smmu_dump_cd(i, cd->word[i], i + 1, cd->word[i + 1]);
+    }
+}
+
+static inline void dump_cmd(Cmd *cmd)
+{
+    int i;
+    for (i = 0; i < ARRAY_SIZE(cmd->word); i += 2) {
+        trace_smmu_dump_cmd(i, cmd->word[i], i + 1, cmd->word[i + 1]);
+    }
+}
+
+#else
+#define dump_ste(...) do {} while (0)
+#define dump_cd(...) do {} while (0)
+#define dump_cmd(...) do {} while (0)
+#endif /* ARM_SMMU_DEBUG */
+
+#endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
new file mode 100644
index 0000000..aa6e991
--- /dev/null
+++ b/hw/arm/smmuv3.c
@@ -0,0 +1,1134 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/boards.h"
+#include "sysemu/sysemu.h"
+#include "hw/sysbus.h"
+#include "hw/pci/pci.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "qemu/error-report.h"
+
+#include "hw/arm/smmuv3.h"
+#include "smmuv3-internal.h"
+
+static inline int smmu_enabled(SMMUV3State *s)
+{
+    return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
+}
+
+/**
+ * smmu_irq_update - update the GERROR register according to
+ * the IRQ and the enable state
+ *
+ * return > 0 when IRQ is supposed to be raised
+ */
+static int smmu_irq_update(SMMUV3State *s, int irq, uint64_t data)
+{
+    uint32_t error = 0;
+
+    if (!smmu_gerror_irq_enabled(s)) {
+        return 0;
+    }
+
+    switch (irq) {
+    case SMMU_IRQ_EVTQ:
+        if (smmu_evt_irq_enabled(s)) {
+            error = SMMU_GERROR_EVENTQ;
+        }
+        break;
+    case SMMU_IRQ_CMD_SYNC:
+        if (smmu_gerror_irq_enabled(s)) {
+            uint32_t err_type = (uint32_t)data;
+
+            if (err_type) {
+                uint32_t regval = smmu_read32_reg(s, SMMU_REG_CMDQ_CONS);
+                smmu_write32_reg(s, SMMU_REG_CMDQ_CONS,
+                                 regval | err_type << SMMU_CMD_CONS_ERR_SHIFT);
+            }
+            error = SMMU_GERROR_CMDQ;
+        }
+        break;
+    case SMMU_IRQ_PRIQ:
+        if (smmu_pri_irq_enabled(s)) {
+            error = SMMU_GERROR_PRIQ;
+        }
+        break;
+    }
+
+    if (error) {
+        uint32_t gerror = smmu_read32_reg(s, SMMU_REG_GERROR);
+        uint32_t gerrorn = smmu_read32_reg(s, SMMU_REG_GERRORN);
+
+        trace_smmuv3_irq_update(error, gerror, gerrorn);
+
+        /* only toggle GERROR if the interrupt is not active */
+        if (!((gerror ^ gerrorn) & error)) {
+            smmu_write32_reg(s, SMMU_REG_GERROR, gerror ^ error);
+        }
+    }
+
+    return error;
+}
+
+static void smmu_irq_raise(SMMUV3State *s, int irq, uint64_t data)
+{
+    trace_smmuv3_irq_raise(irq);
+    if (smmu_irq_update(s, irq, data)) {
+            qemu_irq_raise(s->irq[irq]);
+    }
+}
+
+static MemTxResult smmu_q_read(SMMUV3State *s, SMMUQueue *q, void *data)
+{
+    uint64_t addr = Q_ENTRY(q, Q_IDX(q, q->cons));
+
+    q->cons++;
+    if (q->cons == q->entries) {
+        q->cons = 0;
+        q->wrap.cons++;     /* this will toggle */
+    }
+
+    return smmu_read_sysmem(addr, data, q->ent_size, false);
+}
+
+static MemTxResult smmu_q_write(SMMUV3State *s, SMMUQueue *q, void *data)
+{
+    uint64_t addr = Q_ENTRY(q, Q_IDX(q, q->prod));
+
+    if (q->prod == q->entries) {
+        q->prod = 0;
+        q->wrap.prod++;     /* this will toggle */
+    }
+
+    q->prod++;
+
+    smmu_write_sysmem(addr, data, q->ent_size, false);
+
+    return MEMTX_OK;
+}
+
+static MemTxResult smmu_read_cmdq(SMMUV3State *s, Cmd *cmd)
+{
+    SMMUQueue *q = &s->cmdq;
+    MemTxResult ret = smmu_q_read(s, q, cmd);
+    uint32_t val = 0;
+
+    val |= (q->wrap.cons << q->shift) | q->cons;
+
+    /* Update consumer pointer */
+    smmu_write32_reg(s, SMMU_REG_CMDQ_CONS, val);
+
+    return ret;
+}
+
+static int smmu_cmdq_consume(SMMUV3State *s)
+{
+    uint32_t error = SMMU_CMD_ERR_NONE;
+
+    trace_smmuv3_cmdq_consume(SMMU_CMDQ_ERR(s));
+
+    if (!smmu_cmd_q_enabled(s)) {
+        return 0;
+    }
+
+    while (!SMMU_CMDQ_ERR(s) && !smmu_is_q_empty(s, &s->cmdq)) {
+        SMMUQueue *q = &s->cmdq;
+        uint32_t type;
+        Cmd cmd;
+
+        if (smmu_read_cmdq(s, &cmd) != MEMTX_OK) {
+            error = SMMU_CMD_ERR_ABORT;
+            break;
+        }
+
+        trace_smmuv3_cmdq_consume_details(q->base, q->cons, q->prod,
+                                          cmd.word[0], q->wrap.cons);
+
+        type = CMD_TYPE(&cmd);
+
+        switch (CMD_TYPE(&cmd)) {
+        case SMMU_CMD_SYNC:
+            if (CMD_CS(&cmd) & CMD_SYNC_SIG_IRQ) {
+                smmu_irq_raise(s, SMMU_IRQ_CMD_SYNC, SMMU_CMD_ERR_NONE);
+            } else if (CMD_CS(&cmd) & CMD_SYNC_SIG_SEV) {
+                trace_smmuv3_cmdq_consume_sev();
+            }
+            break;
+        case SMMU_CMD_PREFETCH_CONFIG:
+        case SMMU_CMD_PREFETCH_ADDR:
+        case SMMU_CMD_CFGI_STE:
+        case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
+        case SMMU_CMD_CFGI_CD:
+        case SMMU_CMD_CFGI_CD_ALL:
+        case SMMU_CMD_TLBI_NH_ALL:
+        case SMMU_CMD_TLBI_NH_ASID:
+        case SMMU_CMD_TLBI_NH_VA:
+        case SMMU_CMD_TLBI_NH_VAA:
+        case SMMU_CMD_TLBI_EL3_ALL:
+        case SMMU_CMD_TLBI_EL3_VA:
+        case SMMU_CMD_TLBI_EL2_ALL:
+        case SMMU_CMD_TLBI_EL2_ASID:
+        case SMMU_CMD_TLBI_EL2_VA:
+        case SMMU_CMD_TLBI_EL2_VAA:
+        case SMMU_CMD_TLBI_S12_VMALL:
+        case SMMU_CMD_TLBI_S2_IPA:
+        case SMMU_CMD_TLBI_NSNH_ALL:
+        case SMMU_CMD_ATC_INV:
+        case SMMU_CMD_PRI_RESP:
+        case SMMU_CMD_RESUME:
+        case SMMU_CMD_STALL_TERM:
+            trace_smmuv3_unhandled_cmd(type);
+            break;
+        default:
+            error = SMMU_CMD_ERR_ILLEGAL;
+            error_report("Illegal command type: %d, ignoring", CMD_TYPE(&cmd));
+            dump_cmd(&cmd);
+            break;
+        }
+
+        if (error != SMMU_CMD_ERR_NONE) {
+            error_report("CMD Error");
+            break;
+        }
+    }
+
+    if (error) {
+        smmu_irq_raise(s, SMMU_IRQ_GERROR, error);
+    }
+
+    trace_smmuv3_cmdq_consume_out(s->cmdq.wrap.prod, s->cmdq.prod,
+                                  s->cmdq.wrap.cons, s->cmdq.cons);
+
+    return 0;
+}
+
+/**
+ * GERROR is updated when raising an interrupt, GERRORN will be updated
+ * by SW and should match GERROR before normal operation resumes.
+ */
+static void smmu_irq_clear(SMMUV3State *s, uint64_t gerrorn)
+{
+    int irq = SMMU_IRQ_GERROR;
+    uint32_t toggled;
+
+    toggled = smmu_read32_reg(s, SMMU_REG_GERRORN) ^ gerrorn;
+
+    while (toggled) {
+        irq = ctz32(toggled);
+
+        qemu_irq_lower(s->irq[irq]);
+
+        toggled &= toggled - 1;
+    }
+}
+
+static int smmu_evtq_update(SMMUV3State *s)
+{
+    if (!smmu_enabled(s)) {
+        return 0;
+    }
+
+    if (!smmu_is_q_empty(s, &s->evtq)) {
+        if (smmu_evt_irq_enabled(s)) {
+            smmu_irq_raise(s, SMMU_IRQ_EVTQ, 0);
+        }
+    }
+
+    if (smmu_is_q_empty(s, &s->evtq)) {
+        smmu_irq_clear(s, SMMU_GERROR_EVENTQ);
+    }
+
+    return 1;
+}
+
+static void smmu_create_event(SMMUV3State *s, hwaddr iova,
+                              uint32_t sid, bool is_write, int error);
+
+static void smmu_update(SMMUV3State *s)
+{
+    int error = 0;
+
+    /* SMMU starts processing commands even when not enabled */
+    if (!smmu_enabled(s)) {
+        goto check_cmdq;
+    }
+
+    /* Event queue updates take priority */
+    if ((smmu_evt_q_enabled(s)) && !smmu_is_q_empty(s, &s->evtq)) {
+        trace_smmuv3_update(smmu_is_q_empty(s, &s->evtq), s->evtq.prod,
+                            s->evtq.cons, s->evtq.wrap.prod, s->evtq.wrap.cons);
+        error = smmu_evtq_update(s);
+    }
+
+    if (error) {
+        /* TODO: in the future we may create a proper event queue entry */
+        /* an error condition is not a recoverable event, unlike on other devices */
+        error_report("Unrecoverable event queue error");
+        smmu_create_event(s, 0, 0, 0, error);
+    }
+
+check_cmdq:
+    if (smmu_cmd_q_enabled(s) && !SMMU_CMDQ_ERR(s)) {
+        smmu_cmdq_consume(s);
+    } else {
+        trace_smmuv3_update_check_cmd(SMMU_CMDQ_ERR(s));
+    }
+
+}
+
+static void smmu_update_irq(SMMUV3State *s, uint64_t addr, uint64_t val)
+{
+    smmu_irq_clear(s, val);
+
+    smmu_write32_reg(s, SMMU_REG_GERRORN, val);
+
+    trace_smmuv3_update_irq(smmu_is_irq_pending(s, 0),
+                          smmu_read32_reg(s, SMMU_REG_GERROR),
+                          smmu_read32_reg(s, SMMU_REG_GERRORN));
+
+    /* Clear only when no more left */
+    if (!smmu_is_irq_pending(s, 0)) {
+        qemu_irq_lower(s->irq[0]);
+    }
+}
+
+#define SMMU_ID_REG_INIT(s, reg, d) do {        \
+    s->regs[reg >> 2] = d;                      \
+    } while (0)
+
+static void smmuv3_id_reg_init(SMMUV3State *s)
+{
+    uint32_t data =
+        SMMU_IDR0_STLEVEL << SMMU_IDR0_STLEVEL_SHIFT |
+        SMMU_IDR0_TERM    << SMMU_IDR0_TERM_SHIFT    |
+        SMMU_IDR0_STALL   << SMMU_IDR0_STALL_SHIFT   |
+        SMMU_IDR0_VMID16  << SMMU_IDR0_VMID16_SHIFT  |
+        SMMU_IDR0_PRI     << SMMU_IDR0_PRI_SHIFT     |
+        SMMU_IDR0_ASID16  << SMMU_IDR0_ASID16_SHIFT  |
+        SMMU_IDR0_ATS     << SMMU_IDR0_ATS_SHIFT     |
+        SMMU_IDR0_HYP     << SMMU_IDR0_HYP_SHIFT     |
+        SMMU_IDR0_HTTU    << SMMU_IDR0_HTTU_SHIFT    |
+        SMMU_IDR0_COHACC  << SMMU_IDR0_COHACC_SHIFT  |
+        SMMU_IDR0_TTF     << SMMU_IDR0_TTF_SHIFT     |
+        SMMU_IDR0_S1P     << SMMU_IDR0_S1P_SHIFT     |
+        SMMU_IDR0_S2P     << SMMU_IDR0_S2P_SHIFT;
+
+    SMMU_ID_REG_INIT(s, SMMU_REG_IDR0, data);
+
+#define SMMU_QUEUE_SIZE_LOG2  19
+    data =
+        1 << 27 |                    /* Attr Types override */
+        SMMU_QUEUE_SIZE_LOG2 << 21 | /* Cmd Q size */
+        SMMU_QUEUE_SIZE_LOG2 << 16 | /* Event Q size */
+        SMMU_QUEUE_SIZE_LOG2 << 11 | /* PRI Q size */
+        0  << 6 |                    /* SSID not supported */
+        SMMU_IDR1_SIDSIZE;
+
+    SMMU_ID_REG_INIT(s, SMMU_REG_IDR1, data);
+
+    data =
+        SMMU_IDR5_GRAN << SMMU_IDR5_GRAN_SHIFT | SMMU_IDR5_OAS;
+
+    SMMU_ID_REG_INIT(s, SMMU_REG_IDR5, data);
+
+}
+
+static void smmuv3_init(SMMUV3State *s)
+{
+    smmuv3_id_reg_init(s);      /* Update ID regs alone */
+
+    s->sid_size = SMMU_IDR1_SIDSIZE;
+
+    s->cmdq.entries = (smmu_read32_reg(s, SMMU_REG_IDR1) >> 21) & 0x1f;
+    s->cmdq.ent_size = sizeof(Cmd);
+    s->evtq.entries = (smmu_read32_reg(s, SMMU_REG_IDR1) >> 16) & 0x1f;
+    s->evtq.ent_size = sizeof(Evt);
+}
+
+/*
+ * All SMMU data structures are little endian, and are aligned to 8 bytes
+ * L1STE/STE/L1CD/CD, Queue entries in CMDQ/EVTQ/PRIQ
+ */
+static inline int smmu_get_ste(SMMUV3State *s, hwaddr addr, Ste *buf)
+{
+    int ret;
+
+    trace_smmuv3_get_ste(addr);
+    ret = dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
+    dump_ste(buf);
+    return ret;
+}
+
+/*
+ * For now we only support a single-entry CD table; 'ssid' would otherwise
+ * be used to index into a multi-entry table
+ */
+static inline int smmu_get_cd(SMMUV3State *s, Ste *ste, uint32_t ssid, Cd *buf)
+{
+    hwaddr addr = STE_CTXPTR(ste);
+    int ret;
+
+    if (STE_S1CDMAX(ste) != 0) {
+        error_report("Multilevel Ctx Descriptor not supported yet");
+    }
+
+    ret = dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
+
+    trace_smmuv3_get_cd(addr);
+    dump_cd(buf);
+
+    return ret;
+}
+
+/**
+ * is_ste_consistent - Check validity of STE
+ * according to 6.2.1 Validity of STE
+ * TODO: check the relevance of each check and compliance
+ * with this spec chapter
+ */
+static int is_ste_consistent(SMMUV3State *s, Ste *ste)
+{
+    uint32_t _config = STE_CONFIG(ste);
+    uint32_t ste_vmid, ste_eats, ste_s2s, ste_s1fmt, ste_s2aa64, ste_s1cdmax;
+    uint32_t ste_strw;
+    bool strw_unused, addr_out_of_range, granule_supported;
+    bool config[] = {_config & 0x1, _config & 0x2, _config & 0x3};
+
+    ste_vmid = STE_S2VMID(ste);
+    ste_eats = STE_EATS(ste); /* Enable PCIe ATS trans */
+    ste_s2s = STE_S2S(ste);
+    ste_s1fmt = STE_S1FMT(ste);
+    ste_s2aa64 = STE_S2AA64(ste);
+    ste_s1cdmax = STE_S1CDMAX(ste); /* CD bit # S1ContextPtr */
+    ste_strw = STE_STRW(ste); /* stream world control */
+
+    if (!STE_VALID(ste)) {
+        error_report("STE NOT valid");
+        return false;
+    }
+
+    granule_supported = is_s2granule_valid(ste);
+
+    /* As S1/S2 combinations are supported, do not check
+     * the corresponding STE config values */
+
+    if (!config[2]) {
+        /* Report abort to device, no event recorded */
+        error_report("STE config 0b000 not implemented");
+        return false;
+    }
+
+    if (!SMMU_IDR1_SIDSIZE && ste_s1cdmax && config[0] &&
+        !SMMU_IDR0_CD2L && (ste_s1fmt == 1 || ste_s1fmt == 2)) {
+        error_report("STE inconsistant, CD mismatch");
+        return false;
+    }
+    if (SMMU_IDR0_ATS && ((_config & 0x3) == 0) &&
+        ((ste_eats == 2 && (_config != 0x7 || ste_s2s)) ||
+        (ste_eats == 1 && !ste_s2s))) {
+        error_report("STE inconsistant, EATS/S2S mismatch");
+        return false;
+    }
+    if (config[0] && (SMMU_IDR1_SIDSIZE &&
+        (ste_s1cdmax > SMMU_IDR1_SIDSIZE))) {
+        error_report("STE inconsistant, SSID out of range");
+        return false;
+    }
+
+    strw_unused = (!SMMU_IDR0_S1P || !SMMU_IDR0_HYP || (_config == 4));
+
+    addr_out_of_range = STE_S2TTB(ste) > MAX_PA(ste);
+
+    if (has_stage2(ste)) {
+        if ((ste_s2aa64 && !is_s2granule_valid(ste)) ||
+            (!ste_s2aa64 && !(SMMU_IDR0_TTF & 0x1)) ||
+            (ste_s2aa64 && !(SMMU_IDR0_TTF & 0x2))  ||
+            ((STE_S2HA(ste) || STE_S2HD(ste)) && !ste_s2aa64) ||
+            ((STE_S2HA(ste) || STE_S2HD(ste)) && !SMMU_IDR0_HTTU) ||
+            (STE_S2HD(ste) && (SMMU_IDR0_HTTU == 1)) || addr_out_of_range) {
+            error_report("STE inconsistant");
+            trace_smmuv3_is_ste_consistent(config[1], granule_supported,
+                                           addr_out_of_range, ste_s2aa64,
+                                           STE_S2HA(ste), STE_S2HD(ste),
+                                           STE_S2TTB(ste));
+            return false;
+        }
+    }
+    if (SMMU_IDR0_S2P && (config[0] == 0 && config[1]) &&
+        (strw_unused || !ste_strw) && !SMMU_IDR0_VMID16 && !(ste_vmid >> 8)) {
+        error_report("STE inconsistant, VMID out of range");
+        return false;
+    }
+
+    return true;
+}
+
+/**
+ * smmu_find_ste - Return the stream table entry associated
+ * to the sid
+ *
+ * @s: smmuv3 handle
+ * @sid: stream ID
+ * @ste: returned stream table entry
+ * Supports linear and 2-level stream table
+ */
+static int smmu_find_ste(SMMUV3State *s, uint16_t sid, Ste *ste)
+{
+    hwaddr addr;
+
+    trace_smmuv3_find_ste(sid, s->features, s->sid_split);
+    /* Check SID range */
+    if (sid >= (1 << s->sid_size)) {
+        return SMMU_EVT_C_BAD_SID;
+    }
+    if (s->features & SMMU_FEATURE_2LVL_STE) {
+        int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
+        hwaddr l1ptr, l2ptr;
+        STEDesc l1std;
+
+        l1_ste_offset = sid >> s->sid_split;
+        l2_ste_offset = sid & ((1 << s->sid_split) - 1);
+        l1ptr = (hwaddr)(s->strtab_base + l1_ste_offset * sizeof(l1std));
+        smmu_read_sysmem(l1ptr, &l1std, sizeof(l1std), false);
+        span = L1STD_SPAN(&l1std);
+
+        if (!span) {
+            /* l2ptr is not valid */
+            error_report("invalid sid=%d (L1STD span=0)", sid);
+            return SMMU_EVT_C_BAD_SID;
+        }
+        max_l2_ste = (1 << span) - 1;
+        l2ptr = L1STD_L2PTR(&l1std);
+        trace_smmuv3_find_ste_2lvl(s->strtab_base, l1ptr, l1_ste_offset,
+                                   l2ptr, l2_ste_offset, max_l2_ste);
+        if (l2_ste_offset > max_l2_ste) {
+            error_report("l2_ste_offset=%d > max_l2_ste=%d",
+                         l2_ste_offset, max_l2_ste);
+            return SMMU_EVT_C_BAD_STE;
+        }
+        addr = L1STD_L2PTR(&l1std) + l2_ste_offset * sizeof(*ste);
+    } else {
+        addr = s->strtab_base + sid * sizeof(*ste);
+    }
+
+    if (smmu_get_ste(s, addr, ste)) {
+        error_report("Unable to Fetch STE");
+        return SMMU_EVT_F_UUT;
+    }
+
+    return 0;
+}
+
+/**
+ * smmu_cfg_populate_s1 - Populate the stage 1 translation config
+ * from the context descriptor
+ */
+static int smmu_cfg_populate_s1(SMMUTransCfg *cfg, Cd *cd)
+{
+    bool s1a64 = CD_AARCH64(cd);
+    int epd0 = CD_EPD0(cd);
+    int tg;
+
+    cfg->stage   = 1;
+    tg           = epd0 ? CD_TG1(cd) : CD_TG0(cd);
+    cfg->tsz     = epd0 ? CD_T1SZ(cd) : CD_T0SZ(cd);
+    cfg->ttbr    = epd0 ? CD_TTB1(cd) : CD_TTB0(cd);
+    cfg->oas     = oas2bits(CD_IPS(cd));
+
+    if (s1a64) {
+        cfg->tsz = MIN(cfg->tsz, 39);
+        cfg->tsz = MAX(cfg->tsz, 16);
+    }
+    cfg->granule_sz = tg2granule(tg, epd0);
+
+    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
+    /* fix ttbr - make top bits zero */
+    cfg->ttbr = extract64(cfg->ttbr, 0, cfg->oas);
+    cfg->aa64 = s1a64;
+
+    trace_smmuv3_cfg_stage(cfg->stage, cfg->oas, cfg->tsz, cfg->ttbr,
+                           cfg->aa64, cfg->granule_sz);
+
+    return 0;
+}
+
+/**
+ * smmu_cfg_populate_s2 - Populate the stage 2 translation config
+ * from the Stream Table Entry
+ */
+static int smmu_cfg_populate_s2(SMMUTransCfg *cfg, Ste *ste)
+{
+    bool s2a64 = STE_S2AA64(ste);
+    int tg;
+
+    if (cfg->stage) {
+        error_report("%s nested S1 + S2 is not supported", __func__);
+    } else {
+        /* S2 only */
+        cfg->stage = 2;
+    }
+
+    tg           = STE_S2TG(ste);
+    cfg->tsz     = STE_S2T0SZ(ste);
+    cfg->ttbr    = STE_S2TTB(ste);
+    cfg->oas     = pa_range(ste);
+
+    cfg->aa64    = s2a64;
+
+    if (s2a64) {
+        cfg->tsz = MIN(cfg->tsz, 39);
+        cfg->tsz = MAX(cfg->tsz, 16);
+    }
+    cfg->granule_sz = tg2granule(tg, 0);
+
+    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
+    /* fix ttbr - make top bits zero */
+    cfg->ttbr = extract64(cfg->ttbr, 0, cfg->oas);
+
+    trace_smmuv3_cfg_stage(cfg->stage, cfg->oas, cfg->tsz, cfg->ttbr,
+                           cfg->aa64, cfg->granule_sz);
+
+    return 0;
+}
+
+/**
+ * smmu_cfg_populate - Populates the translation config corresponding
+ * to the STE and CD content
+ */
+static int smmu_cfg_populate(Ste *ste, Cd *cd, SMMUTransCfg *cfg)
+{
+    int ret = 0;
+
+    if (is_ste_bypass(ste)) {
+        return ret;
+    }
+
+    if (has_stage1(ste)) {
+        ret = smmu_cfg_populate_s1(cfg, cd);
+        if (ret) {
+            return ret;
+        }
+    }
+    if (has_stage2(ste)) {
+        ret = smmu_cfg_populate_s2(cfg, ste);
+        if (ret) {
+            return ret;
+        }
+    }
+    return ret;
+}
+
+static SMMUEvtErr smmu_walk_pgtable(SMMUV3State *s, SMMUTransCfg *cfg,
+                                    IOMMUTLBEntry *tlbe, bool is_write)
+{
+    SMMUState *sys = SMMU_SYS_DEV(s);
+    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(sys);
+    SMMUEvtErr error = 0;
+    uint32_t page_size = 0, perm = 0;
+
+    trace_smmuv3_walk_pgtable(tlbe->iova, is_write);
+
+    if (!cfg->stage) {
+        return 0;
+    }
+
+    cfg->input = tlbe->iova;
+
+    if (cfg->aa64) {
+        error = sbc->translate_64(cfg, &page_size, &perm, is_write);
+    } else {
+        error = sbc->translate_32(cfg, &page_size, &perm, is_write);
+    }
+
+    if (error) {
+        error_report("PTW failed for iova=0x%"PRIx64" is_write=%d (%d)",
+                     cfg->input, is_write, error);
+        goto exit;
+    }
+    tlbe->translated_addr = cfg->output;
+    tlbe->addr_mask = page_size - 1;
+    tlbe->perm = perm;
+
+    trace_smmuv3_walk_pgtable_out(tlbe->translated_addr,
+                                  tlbe->addr_mask, tlbe->perm);
+exit:
+    return error;
+}
+
+static MemTxResult smmu_write_evtq(SMMUV3State *s, Evt *evt)
+{
+    SMMUQueue *q = &s->evtq;
+    int ret = smmu_q_write(s, q, evt);
+    uint32_t val = 0;
+
+    val |= (q->wrap.prod << q->shift) | q->prod;
+
+    smmu_write32_reg(s, SMMU_REG_EVTQ_PROD, val);
+
+    return ret;
+}
+
+/*
+ * Events created on the EventQ
+ */
+static void smmu_create_event(SMMUV3State *s, hwaddr iova,
+                              uint32_t sid, bool is_write, int error)
+{
+    SMMUQueue *q = &s->evtq;
+    uint64_t head;
+    Evt evt;
+
+    if (!smmu_evt_q_enabled(s)) {
+        return;
+    }
+
+    EVT_SET_TYPE(&evt, error);
+    EVT_SET_SID(&evt, sid);
+
+    switch (error) {
+    case SMMU_EVT_F_UUT:
+    case SMMU_EVT_C_BAD_STE:
+        break;
+    case SMMU_EVT_C_BAD_CD:
+    case SMMU_EVT_F_CD_FETCH:
+        break;
+    case SMMU_EVT_F_TRANS_FORBIDDEN:
+    case SMMU_EVT_F_WALK_EXT_ABRT:
+        EVT_SET_INPUT_ADDR(&evt, iova); /* fall through */
+    default:
+        break;
+    }
+
+    smmu_write_evtq(s, &evt);
+
+    head = Q_IDX(q, q->prod);
+
+    if (smmu_is_q_full(s, &s->evtq)) {
+        head = q->prod ^ (1U << 31);     /* Set overflow */
+    }
+
+    smmu_write32_reg(s, SMMU_REG_EVTQ_PROD, head);
+
+    smmu_irq_raise(s, SMMU_IRQ_EVTQ, 0);
+}
+
+/*
+ * TR - Translation Request
+ * TT - Translated Transaction
+ * OT - Other Transaction
+ */
+static IOMMUTLBEntry
+smmuv3_translate(MemoryRegion *mr, hwaddr addr, bool is_write)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    SMMUV3State *s = sdev->smmu;
+    SMMUTransCfg transcfg = {};
+    uint16_t sid = 0;
+    Ste ste;
+    Cd cd;
+    SMMUEvtErr error = 0;
+
+    IOMMUTLBEntry entry = {
+        .target_as = &address_space_memory,
+        .iova = addr,
+        .translated_addr = addr,
+        .addr_mask = ~(hwaddr)0,
+        .perm = IOMMU_NONE,
+    };
+
+    sid = smmu_get_sid(sdev);
+
+    /* SMMU bypass: allow traffic through if the SMMU is disabled */
+    if (!smmu_enabled(s)) {
+        trace_smmuv3_translate_bypass(mr->name, sid, addr, is_write);
+        goto out;
+    }
+
+    trace_smmuv3_translate_in(sid, pci_bus_num(sdev->bus), s->strtab_base);
+
+    /* Fetch & Check STE */
+    error = smmu_find_ste(s, sid, &ste);
+    if (error) {
+        goto error_out;  /* F_STE_FETCH or F_CFG_CONFLICT */
+    }
+
+    if (STE_VALID(&ste) && is_ste_bypass(&ste)) {
+        trace_smmuv3_translate_bypass(mr->name, sid, addr, is_write);
+        goto out;
+    }
+
+    if (!is_ste_consistent(s, &ste)) {
+        error = SMMU_EVT_C_BAD_STE;
+        goto error_out;
+    }
+
+    if (has_stage1(&ste)) { /* Stage 1 */
+        smmu_get_cd(s, &ste, 0, &cd); /* We don't have an SSID yet, so use 0 */
+
+        if (!is_cd_valid(s, &ste, &cd)) {
+            error = SMMU_EVT_C_BAD_CD;
+            goto error_out;
+        }
+    }
+
+    smmu_cfg_populate(&ste, &cd, &transcfg);
+
+    /* Walk stage 1; if S2 is enabled, S2 is walked for every S1 access */
+    error = smmu_walk_pgtable(s, &transcfg, &entry, is_write);
+
+    entry.perm = is_write ? IOMMU_RW : IOMMU_RO;
+
+    trace_smmuv3_translate_ok(mr->name, sid, addr,
+                              entry.translated_addr, entry.perm);
+
+error_out:
+    if (error > 1) {
+        error_report("Translation Error: %x", error);
+        smmu_create_event(s, entry.iova, sid, is_write, error);
+    }
+
+out:
+    return entry;
+}
+
+static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
+                                        uint64_t val)
+{
+    *base = val & ~(SMMU_BASE_RA | 0x3fULL);
+}
+
+static void smmu_update_qreg(SMMUV3State *s, SMMUQueue *q, hwaddr reg,
+                             uint32_t off, uint64_t val, unsigned size)
+{
+    if (size == 8 && off == 0) {
+        smmu_write64_reg(s, reg, val);
+    } else {
+        smmu_write_reg(s, reg, val);
+    }
+
+    switch (off) {
+    case 0:                             /* BASE register */
+        val = smmu_read64_reg(s, reg);
+        q->shift = val & 0x1f;
+        q->entries = 1 << (q->shift);
+        smmu_update_base_reg(s, &q->base, val);
+        break;
+
+    case 4:                             /* CONS */
+        q->cons = Q_IDX(q, val);
+        q->wrap.cons = val >> q->shift;
+        trace_smmuv3_update_qreg(q->cons, val);
+        break;
+
+    case 8:                             /* PROD */
+        q->prod = Q_IDX(q, val);
+        q->wrap.prod = val >> q->shift;
+        break;
+    }
+
+    switch (reg) {
+    case SMMU_REG_CMDQ_PROD:            /* an update is only needed for CMDQ_PROD */
+    case SMMU_REG_CMDQ_CONS:            /* but we trigger it for CMDQ_CONS too */
+        smmu_update(s);
+        break;
+    }
+}
+
+static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
+{
+    switch (*addr) {
+    case 0x100a8: case 0x100ac:         /* Aliasing => page0 registers */
+    case 0x100c8: case 0x100cc:
+        *addr ^= (hwaddr)0x10000;
+    }
+}
+
+static void smmu_write_mmio(void *opaque, hwaddr addr,
+                            uint64_t val, unsigned size)
+{
+    SMMUState *sys = opaque;
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    bool update = false;
+
+    smmu_write_mmio_fixup(s, &addr);
+
+    trace_smmuv3_write_mmio(addr, val);
+
+    switch (addr) {
+    case 0xFDC ... 0xFFC:
+    case SMMU_REG_IDR0 ... SMMU_REG_IDR5:
+        trace_smmuv3_write_mmio_idr(addr, val);
+        return;
+
+    case SMMU_REG_GERRORN:
+        smmu_update_irq(s, addr, val);
+        return;
+
+    case SMMU_REG_CR0:
+        smmu_write32_reg(s, SMMU_REG_CR0, val);
+        smmu_write32_reg(s, SMMU_REG_CR0_ACK, val);
+        update = true;
+        break;
+
+    case SMMU_REG_IRQ_CTRL:
+        smmu_write32_reg(s, SMMU_REG_IRQ_CTRL_ACK, val);
+        update = true;
+        break;
+
+    case SMMU_REG_STRTAB_BASE:
+        smmu_update_base_reg(s, &s->strtab_base, val);
+        return;
+
+    case SMMU_REG_STRTAB_BASE_CFG:
+        if (((val >> 16) & 0x3) == 0x1) {
+            s->sid_split = (val >> 6) & 0x1f;
+            s->features |= SMMU_FEATURE_2LVL_STE;
+        }
+        break;
+
+    case SMMU_REG_CMDQ_PROD:
+    case SMMU_REG_CMDQ_CONS:
+    case SMMU_REG_CMDQ_BASE:
+    case SMMU_REG_CMDQ_BASE + 4:
+        smmu_update_qreg(s, &s->cmdq, addr, addr - SMMU_REG_CMDQ_BASE,
+                         val, size);
+        return;
+
+    case SMMU_REG_EVTQ_CONS:
+    {
+        SMMUQueue *evtq = &s->evtq;
+        evtq->cons = Q_IDX(evtq, val);
+        evtq->wrap.cons = Q_WRAP(evtq, val);
+
+        trace_smmuv3_write_mmio_evtq_cons_bef_clear(evtq->prod, evtq->cons,
+                                                    evtq->wrap.prod,
+                                                    evtq->wrap.cons);
+        if (smmu_is_q_empty(s, &s->evtq)) {
+            trace_smmuv3_write_mmio_evtq_cons_after_clear(evtq->prod,
+                                                          evtq->cons,
+                                                          evtq->wrap.prod,
+                                                          evtq->wrap.cons);
+            qemu_irq_lower(s->irq[SMMU_IRQ_EVTQ]);
+        }
+    }   /* fall through */
+    case SMMU_REG_EVTQ_BASE:
+    case SMMU_REG_EVTQ_BASE + 4:
+    case SMMU_REG_EVTQ_PROD:
+        smmu_update_qreg(s, &s->evtq, addr, addr - SMMU_REG_EVTQ_BASE,
+                         val, size);
+        return;
+
+    case SMMU_REG_PRIQ_CONS:
+    case SMMU_REG_PRIQ_BASE:
+    case SMMU_REG_PRIQ_BASE + 4:
+    case SMMU_REG_PRIQ_PROD:
+        smmu_update_qreg(s, &s->priq, addr, addr - SMMU_REG_PRIQ_BASE,
+                         val, size);
+        return;
+    }
+
+    if (size == 8) {
+        smmu_write_reg(s, addr, val);
+    } else {
+        smmu_write32_reg(s, addr, (uint32_t)val);
+    }
+
+    if (update) {
+        smmu_update(s);
+    }
+}
+
+static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
+{
+    SMMUState *sys = opaque;
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    uint64_t val;
+
+    smmu_write_mmio_fixup(s, &addr);
+
+    /* Primecell/Corelink ID registers */
+    switch (addr) {
+    case 0xFF0 ... 0xFFC:
+    case 0xFDC ... 0xFE4:
+        val = 0;
+        error_report("addr:0x%"PRIx64" val:0x%"PRIx64, addr, val);
+        break;
+
+    default:
+        val = (uint64_t)smmu_read32_reg(s, addr);
+        break;
+
+    case SMMU_REG_STRTAB_BASE ... SMMU_REG_CMDQ_BASE:
+    case SMMU_REG_EVTQ_BASE:
+    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_IRQ_CFG1:
+        val = smmu_read64_reg(s, addr);
+        break;
+    }
+
+    trace_smmuv3_read_mmio(addr, val, s->cmdq.cons);
+    return val;
+}
+
+static const MemoryRegionOps smmu_mem_ops = {
+    .read = smmu_read_mmio,
+    .write = smmu_write_mmio,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
+static void smmu_init_irq(SMMUV3State *s, SysBusDevice *dev)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
+        sysbus_init_irq(dev, &s->irq[i]);
+    }
+}
+
+static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+{
+    SMMUState *s = opaque;
+    uintptr_t key = (uintptr_t)bus;
+    SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_as_by_busptr, &key);
+    SMMUDevice *sdev;
+
+    if (!sbus) {
+        uintptr_t *new_key = g_malloc(sizeof(*new_key));
+
+        *new_key = (uintptr_t)bus;
+        sbus = g_malloc0(sizeof(SMMUPciBus) +
+                         sizeof(SMMUDevice *) * SMMU_PCI_DEVFN_MAX);
+        sbus->bus = bus;
+        g_hash_table_insert(s->smmu_as_by_busptr, new_key, sbus);
+    }
+
+    sdev = sbus->pbdev[devfn];
+    if (!sdev) {
+        sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(SMMUDevice));
+
+        sdev->smmu = s;
+        sdev->bus = bus;
+        sdev->devfn = devfn;
+
+        memory_region_init_iommu(&sdev->iommu, OBJECT(s),
+                                 &s->iommu_ops, TYPE_SMMU_V3_DEV,
+                                 UINT64_MAX);
+        address_space_init(&sdev->as, &sdev->iommu, TYPE_SMMU_V3_DEV);
+    }
+
+    return &sdev->as;
+
+}
+
+static void smmu_init_iommu_as(SMMUV3State *sys)
+{
+    SMMUState *s = SMMU_SYS_DEV(sys);
+    PCIBus *pcibus = pci_find_primary_bus();
+
+    if (pcibus) {
+        pci_setup_iommu(pcibus, smmu_find_add_as, s);
+    } else {
+        error_report("No PCI bus, SMMU is not registered");
+    }
+}
+
+static void smmu_reset(DeviceState *dev)
+{
+    SMMUV3State *s = SMMU_V3_DEV(dev);
+    smmuv3_init(s);
+}
+
+static int smmu_populate_internal_state(void *opaque, int version_id)
+{
+    SMMUV3State *s = opaque;
+
+    smmu_update(s);
+    return 0;
+}
+
+static void smmu_realize(DeviceState *d, Error **errp)
+{
+    SMMUState *sys = SMMU_SYS_DEV(d);
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    SysBusDevice *dev = SYS_BUS_DEVICE(d);
+
+    sys->iommu_ops.translate = smmuv3_translate;
+    sys->iommu_ops.notify_flag_changed = NULL;
+    /* Register Access */
+    memset(sys->smmu_as_by_bus_num, 0, sizeof(sys->smmu_as_by_bus_num));
+    memory_region_init_io(&sys->iomem, OBJECT(s),
+                          &smmu_mem_ops, sys, TYPE_SMMU_V3_DEV, 0x20000);
+
+    sys->smmu_as_by_busptr = g_hash_table_new_full(smmu_uint64_hash,
+                                                   smmu_uint64_equal,
+                                                   g_free, g_free);
+    sysbus_init_mmio(dev, &sys->iomem);
+
+    smmu_init_irq(s, dev);
+
+    smmu_init_iommu_as(s);
+}
+
+static const VMStateDescription vmstate_smmuv3 = {
+    .name = "smmuv3",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = smmu_populate_internal_state,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT64_ARRAY(regs, SMMUV3State, SMMU_NREGS),
+        VMSTATE_END_OF_LIST(),
+    },
+};
+
+static void smmuv3_instance_init(Object *obj)
+{
+    /* Nothing much to do here as of now */
+}
+
+static void smmuv3_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->reset   = smmu_reset;
+    dc->vmsd    = &vmstate_smmuv3;
+    dc->realize = smmu_realize;
+}
+
+static const TypeInfo smmuv3_type_info = {
+    .name          = TYPE_SMMU_V3_DEV,
+    .parent        = TYPE_SMMU_DEV_BASE,
+    .instance_size = sizeof(SMMUV3State),
+    .instance_init = smmuv3_instance_init,
+    .class_data    = NULL,
+    .class_size    = sizeof(SMMUV3Class),
+    .class_init    = smmuv3_class_init,
+};
+
+static void smmuv3_register_types(void)
+{
+    type_register(&smmuv3_type_info);
+}
+
+type_init(smmuv3_register_types)
+
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 1d53ad0..d3aeaef 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -14,3 +14,35 @@ smmu_page_walk_level_block_pte(uint64_t pte, uint64_t address) "pte=0x%"PRIx64"
 smmu_page_walk_level_table_pte(uint64_t pte, uint64_t address) "pte=0x%"PRIx64" next table address = 0x%"PRIx64
 smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
 smmu_set_translated_address(hwaddr iova, hwaddr va) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
+
+#hw/arm/smmuv3.c
+smmuv3_irq_update(uint32_t error, uint32_t gerror, uint32_t gerrorn) "<<<< error:0x%x gerror:0x%x gerrorn:0x%x"
+smmuv3_irq_raise(int irq) "irq:%d"
+smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
+smmuv3_cmdq_consume(int error) "CMDQ_ERR: %d"
+smmuv3_cmdq_consume_details(hwaddr base, uint32_t cons, uint32_t prod, uint32_t word, uint8_t wrap_cons) "CMDQ base: 0x%"PRIx64" cons:%d prod:%d val:0x%x wrap:%d"
+smmuv3_cmdq_consume_sev(void) "CMD_SYNC CS=SEV not supported, ignoring"
+smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
+smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d c.wrap:%d"
+smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
+smmuv3_update_irq(bool is_pending, uint32_t gerror, uint32_t gerrorn) "irq pend: %d gerror:0x%x gerrorn:0x%x"
+smmuv3_is_ste_consistent(bool cfg, bool granule_supported, bool addr_oor, uint32_t aa64, int s2ha, int s2hd, uint64_t s2ttb) "config[1]:%d gran:%d addr:%d aa64:%d s2ha:%d s2hd:%d s2ttb:0x%"PRIx64
+smmuv3_find_ste(uint16_t sid, uint32_t features, uint16_t sid_split) "SID:0x%x features:0x%x, sid_split:0x%x"
+smmuv3_find_ste_2lvl(uint64_t strtab_base, hwaddr l1ptr, int l1_ste_offset, hwaddr l2ptr, int l2_ste_offset, int max_l2_ste) "strtab_base:0x%"PRIx64" l1ptr:0x%"PRIx64" l1_off:0x%x, l2ptr:0x%"PRIx64" l2_off:0x%x max_l2_ste:%d"
+smmuv3_get_ste(hwaddr addr) "STE addr: 0x%"PRIx64
+smmuv3_walk_pgtable(hwaddr iova, bool is_write) "Input addr: 0x%"PRIx64", is_write=%d"
+smmuv3_walk_pgtable_out(hwaddr addr, uint32_t mask, int perm) "DONE: o/p addr:0x%"PRIx64" mask:0x%x perm:%d"
+smmuv3_translate_bypass(const char *n, uint16_t sid, hwaddr addr, bool is_write) "%s sid=%d bypass iova:0x%"PRIx64" is_write=%d"
+smmuv3_translate_in(uint16_t sid, int pci_bus_num, hwaddr strtab_base) "SID:0x%x bus:%d strtab_base:0x%"PRIx64
+smmuv3_get_cd(hwaddr addr) "CD addr: 0x%"PRIx64
+smmuv3_translate_ok(const char *n, uint16_t sid, hwaddr iova, hwaddr translated, int perm) "%s sid=%d iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x"
+smmuv3_update_qreg(uint32_t cons, uint64_t val) "cons written : %d val:0x%"PRIx64
+smmuv3_write_mmio(hwaddr addr, uint64_t val) "addr: 0x%"PRIx64" val:0x%"PRIx64
+smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%"PRIx64" val64:0x%"PRIx64
+smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_read_mmio(hwaddr addr, uint64_t val, uint32_t cons) "addr: 0x%"PRIx64" val:0x%"PRIx64" cmdq cons:%d"
+smmuv3_dump_ste(int i, uint32_t word0, int j,  uint32_t word1) "STE[%2d]: %#010x\t STE[%2d]: %#010x"
+smmuv3_dump_cd(int i, uint32_t word0, int j,  uint32_t word1) "CD[%2d]: %#010x\t CD[%2d]: %#010x"
+smmuv3_dump_cmd(int i, uint32_t word0, int j,  uint32_t word1) "CMD[%2d]: %#010x\t CMD[%2d]: %#010x"
+smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d"
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
new file mode 100644
index 0000000..1f7d78e
--- /dev/null
+++ b/include/hw/arm/smmuv3.h
@@ -0,0 +1,87 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMUV3_H
+#define HW_ARM_SMMUV3_H
+
+#include "hw/arm/smmu-common.h"
+
+#define SMMU_NREGS            0x200
+
+typedef struct SMMUQueue {
+     hwaddr base;
+     uint32_t prod;
+     uint32_t cons;
+     union {
+          struct {
+               uint8_t prod:1;
+               uint8_t cons:1;
+          };
+          uint8_t unused;
+     } wrap;
+
+     uint16_t entries;           /* Number of entries */
+     uint8_t  ent_size;          /* Size of entry in bytes */
+     uint8_t  shift;             /* Size in log2 */
+} SMMUQueue;
+
+typedef struct SMMUV3State {
+    SMMUState     smmu_state;
+
+#define SMMU_FEATURE_2LVL_STE (1 << 0)
+    /* Local cache of most-frequently used register */
+    uint32_t     features;
+    uint16_t     sid_size;
+    uint16_t     sid_split;
+    uint64_t     strtab_base;
+
+    uint64_t    regs[SMMU_NREGS];
+
+    qemu_irq     irq[4];
+
+    SMMUQueue    cmdq, evtq, priq;
+
+    /* IOMMU Address space */
+    MemoryRegion iommu;
+    AddressSpace iommu_as;
+    /*
+     * The bus number is not populated at init time, hence we need
+     * a mechanism to retrieve the corresponding address space for each
+     * PCI device.
+     */
+    GHashTable   *smmu_as_by_busptr;
+} SMMUV3State;
+
+typedef enum {
+    SMMU_IRQ_GERROR,
+    SMMU_IRQ_PRIQ,
+    SMMU_IRQ_EVTQ,
+    SMMU_IRQ_CMD_SYNC,
+} SMMUIrq;
+
+typedef struct {
+    SMMUBaseClass smmu_base_class;
+} SMMUV3Class;
+
+#define TYPE_SMMU_V3_DEV   "smmuv3"
+#define SMMU_V3_DEV(obj) OBJECT_CHECK(SMMUV3State, (obj), TYPE_SMMU_V3_DEV)
+#define SMMU_V3_DEVICE_GET_CLASS(obj)                              \
+    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_V3_DEV)
+
+#endif
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC v4 3/5] hw/arm/virt: Add SMMUv3 to the virt board
  2017-05-13 17:43 [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Eric Auger
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class Eric Auger
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 2/5] hw/arm/smmuv3: smmuv3 emulation model Eric Auger
@ 2017-05-13 17:43 ` Eric Auger
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 4/5] hw/arm/virt: Add 2.10 machine type Eric Auger
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Eric Auger @ 2017-05-13 17:43 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, edgar.iglesias,
	qemu-arm, qemu-devel, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain

From: Prem Mallappa <prem.mallappa@broadcom.com>

Add code to instantiate an smmu-v3 in mach-virt. A new boolean flag
is introduced in VirtMachineState to allow this instantiation. It
currently defaults to false.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- vbi was removed. Use vms instead
- migrate to new smmu binding format (iommu-map)
- don't use appendprop anymore
- add vms->smmu and guard instantiation with this latter
- interrupts type changed to edge
---
 hw/arm/virt.c         | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/arm/virt.h |  4 ++++
 2 files changed, 62 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5f62a03..c00efb2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -56,6 +56,7 @@
 #include "hw/smbios/smbios.h"
 #include "qapi/visitor.h"
 #include "standard-headers/linux/input.h"
+#include "hw/arm/smmuv3.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -139,6 +140,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_FW_CFG] =             { 0x09020000, 0x00000018 },
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
+    [VIRT_SMMU] =               { 0x09050000, 0x00020000 }, /* 128K: 2 x 64K pages */
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -159,6 +161,7 @@ static const int a15irqmap[] = {
     [VIRT_SECURE_UART] = 8,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
+    [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
     [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
 };
 
@@ -969,6 +972,52 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
                            0x7           /* PCI irq */);
 }
 
+static void alloc_smmu_phandle(VirtMachineState *vms)
+{
+    if (vms->smmu && !vms->smmu_phandle) {
+        vms->smmu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+    }
+}
+
+static void create_smmu(const VirtMachineState *vms, qemu_irq *pic)
+{
+    char *smmu;
+    const char compat[] = "arm,smmu-v3";
+    int irq =  vms->irqmap[VIRT_SMMU];
+    hwaddr base = vms->memmap[VIRT_SMMU].base;
+    hwaddr size = vms->memmap[VIRT_SMMU].size;
+    const char irq_names[] = "eventq\0priq\0cmdq-sync\0gerror";
+
+    if (!vms->smmu) {
+        return;
+    }
+
+    sysbus_create_varargs("smmuv3", base, pic[irq], pic[irq + 1],
+                          pic[irq + 2], pic[irq + 3], NULL);
+
+    smmu = g_strdup_printf("/smmuv3@%" PRIx64, base);
+    qemu_fdt_add_subnode(vms->fdt, smmu);
+    qemu_fdt_setprop(vms->fdt, smmu, "compatible", compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(vms->fdt, smmu, "reg", 2, base, 2, size);
+
+    qemu_fdt_setprop_cells(vms->fdt, smmu, "interrupts",
+            GIC_FDT_IRQ_TYPE_SPI, irq    , GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 1, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 2, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 3, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
+
+    qemu_fdt_setprop(vms->fdt, smmu, "interrupt-names", irq_names,
+                     sizeof(irq_names));
+
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "clocks", vms->clock_phandle);
+    qemu_fdt_setprop_string(vms->fdt, smmu, "clock-names", "apb_pclk");
+
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "#iommu-cells", 1);
+
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "phandle", vms->smmu_phandle);
+    g_free(smmu);
+}
+
 static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
@@ -1081,6 +1130,11 @@ static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
     qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 1);
     create_pcie_irq_map(vms, vms->gic_phandle, irq, nodename);
 
+    if (vms->smmu) {
+        qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map",
+                               0x0, vms->smmu_phandle, 0x0, 0x10000);
+    }
+
     g_free(nodename);
 }
 
@@ -1402,8 +1456,12 @@ static void machvirt_init(MachineState *machine)
 
     create_rtc(vms, pic);
 
+    alloc_smmu_phandle(vms);
+
     create_pcie(vms, pic);
 
+    create_smmu(vms, pic);
+
     create_gpio(vms, pic);
 
     /* Create mmio transports, so the user can create virtio backends
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 33b0ff3..164a531 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -38,6 +38,7 @@
 
 #define NUM_GICV2M_SPIS       64
 #define NUM_VIRTIO_TRANSPORTS 32
+#define NUM_SMMU_IRQS          4
 
 #define ARCH_GICV3_MAINT_IRQ  9
 
@@ -59,6 +60,7 @@ enum {
     VIRT_GIC_V2M,
     VIRT_GIC_ITS,
     VIRT_GIC_REDIST,
+    VIRT_SMMU,
     VIRT_UART,
     VIRT_MMIO,
     VIRT_RTC,
@@ -95,6 +97,7 @@ typedef struct {
     bool highmem;
     bool its;
     bool virt;
+    bool smmu;
     int32_t gic_version;
     struct arm_boot_info bootinfo;
     const MemMapEntry *memmap;
@@ -105,6 +108,7 @@ typedef struct {
     uint32_t clock_phandle;
     uint32_t gic_phandle;
     uint32_t msi_phandle;
+    uint32_t smmu_phandle;
     int psci_conduit;
 } VirtMachineState;
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC v4 4/5] hw/arm/virt: Add 2.10 machine type
  2017-05-13 17:43 [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Eric Auger
                   ` (2 preceding siblings ...)
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 3/5] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
@ 2017-05-13 17:43 ` Eric Auger
  2017-05-30 16:04   ` Peter Maydell
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 5/5] hw/arm/virt-acpi-build: add smmuv3 node in IORT table Eric Auger
  2017-05-30 16:09 ` [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Peter Maydell
  5 siblings, 1 reply; 14+ messages in thread
From: Eric Auger @ 2017-05-13 17:43 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, edgar.iglesias,
	qemu-arm, qemu-devel, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain

The new machine type allows smmuv3 instantiation. A new option
is introduced to turn the feature on/off (off by default).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

Another alternative would be to use the -device option as
done on x86. As the smmu is a sysbus device, we would need to
use the platform bus framework. This would work fine
for the dt generation. However, the feasibility needs to be
studied for ACPI table generation.

---
 hw/arm/virt.c         | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 include/hw/arm/virt.h |  1 +
 2 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index c00efb2..71ea707 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1496,6 +1496,20 @@ static void machvirt_init(MachineState *machine)
     create_platform_bus(vms, pic);
 }
 
+static bool virt_get_smmu(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->smmu;
+}
+
+static void virt_set_smmu(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->smmu = value;
+}
+
 static bool virt_get_secure(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -1609,7 +1623,7 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_2_9_instance_init(Object *obj)
+static void virt_2_10_instance_init(Object *obj)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1665,14 +1679,47 @@ static void virt_2_9_instance_init(Object *obj)
                                         NULL);
     }
 
+    if (vmc->no_smmu) {
+        vms->smmu = false;
+    } else {
+        /* Default disallows smmu instantiation */
+        vms->smmu = false;
+        object_property_add_bool(obj, "smmu", virt_get_smmu,
+                                 virt_set_smmu, NULL);
+        object_property_set_description(obj, "smmu",
+                                        "Set on/off to enable/disable "
+                                        "smmu instantiation (default off)",
+                                        NULL);
+    }
+
     vms->memmap = a15memmap;
     vms->irqmap = a15irqmap;
 }
 
+static void virt_machine_2_10_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(2, 10)
+
+#define VIRT_COMPAT_2_9 \
+    HW_COMPAT_2_9
+
+static void virt_2_9_instance_init(Object *obj)
+{
+    virt_2_10_instance_init(obj);
+}
+
 static void virt_machine_2_9_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
+    virt_machine_2_10_options(mc);
+    SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_9);
+
+    vmc->no_smmu = true;
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(2, 9)
+DEFINE_VIRT_MACHINE(2, 9)
+
 
 #define VIRT_COMPAT_2_8 \
     HW_COMPAT_2_8
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 164a531..cd2c82e 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -86,6 +86,7 @@ typedef struct {
     bool disallow_affinity_adjustment;
     bool no_its;
     bool no_pmu;
+    bool no_smmu;
     bool claim_edge_triggered_timers;
 } VirtMachineClass;
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC v4 5/5] hw/arm/virt-acpi-build: add smmuv3 node in IORT table
  2017-05-13 17:43 [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Eric Auger
                   ` (3 preceding siblings ...)
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 4/5] hw/arm/virt: Add 2.10 machine type Eric Auger
@ 2017-05-13 17:43 ` Eric Auger
  2017-05-30 16:09 ` [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Peter Maydell
  5 siblings, 0 replies; 14+ messages in thread
From: Eric Auger @ 2017-05-13 17:43 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, edgar.iglesias,
	qemu-arm, qemu-devel, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain

From: Prem Mallappa <prem.mallappa@broadcom.com>

This patch builds the smmuv3 node in the ACPI IORT table.

The RID space of the root complex, which spans 0x0-0x10000,
maps to streamid space 0x0-0x10000 in smmuv3, which in turn
maps to deviceid space 0x0-0x10000 in the ITS group.
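
As an illustration of this identity chain (a sketch only; the helper below
is hypothetical and not part of the patch), a device's PCI requester ID is
used unchanged as its StreamID and then as its ITS DeviceID:

#include <stdint.h>

/* Illustrative only: RID -> StreamID -> DeviceID are all equal here. */
static uint32_t iort_example_deviceid(uint8_t bus, uint8_t devfn)
{
    uint32_t rid = (bus << 8) | devfn;  /* e.g. 00:02.0 -> RID 0x0010 */
    uint32_t streamid = rid;            /* RC node: identity-mapped */
    uint32_t deviceid = streamid;       /* SMMUv3 node: identity map to the ITS */

    return deviceid;
}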

The guest must feature the IOMMU probe deferral series
(https://lkml.org/lkml/2017/4/10/214) which fixes streamid
multiple lookup. This bug is not related to the SMMU emulation.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- integrate into the existing IORT table made up of ITS, RC nodes
- take into account vms->smmu
- match linux actbl2.h acpi_iort_smmu_v3 field names
---
 hw/arm/virt-acpi-build.c    | 56 +++++++++++++++++++++++++++++++++++++++------
 include/hw/acpi/acpi-defs.h | 15 ++++++++++++
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0835e59..28ee133 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -393,19 +393,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned rsdt_tbl_offset)
 }
 
 static void
-build_iort(GArray *table_data, BIOSLinker *linker)
+build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
-    int iort_start = table_data->len;
+    int nb_nodes, iort_start = table_data->len;
     AcpiIortIdMapping *idmap;
     AcpiIortItsGroup *its;
     AcpiIortTable *iort;
-    size_t node_size, iort_length;
+    AcpiIortSmmu3 *smmu;
+    size_t node_size, iort_length, smmu_offset = 0;
     AcpiIortRC *rc;
 
     iort = acpi_data_push(table_data, sizeof(*iort));
 
+    if (vms->smmu) {
+        nb_nodes = 3; /* RC, ITS, SMMUv3 */
+    } else {
+        nb_nodes = 2; /* RC, ITS */
+    }
+
     iort_length = sizeof(*iort);
-    iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
+    iort->node_count = cpu_to_le32(nb_nodes);
     iort->node_offset = cpu_to_le32(sizeof(*iort));
 
     /* ITS group node */
@@ -418,6 +425,35 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     its->its_count = cpu_to_le32(1);
     its->identifiers[0] = 0; /* MADT translation_id */
 
+    if (vms->smmu) {
+        int irq =  vms->irqmap[VIRT_SMMU];
+
+        /* SMMUv3 node */
+        smmu_offset = cpu_to_le32(iort->node_offset + node_size);
+        node_size = sizeof(*smmu) + sizeof(*idmap);
+        iort_length += node_size;
+        smmu = acpi_data_push(table_data, node_size);
+
+
+        smmu->type = ACPI_IORT_NODE_SMMU_V3;
+        smmu->length = cpu_to_le16(node_size);
+        smmu->mapping_count = cpu_to_le32(1);
+        smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
+        smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
+        smmu->event_gsiv = cpu_to_le32(irq);
+        smmu->pri_gsiv = cpu_to_le32(irq + 1);
+        smmu->gerr_gsiv = cpu_to_le32(irq + 2);
+        smmu->sync_gsiv = cpu_to_le32(irq + 3);
+
+        /* Identity RID mapping covering the whole input RID range */
+        idmap = &smmu->id_mapping_array[0];
+        idmap->input_base = 0;
+        idmap->id_count = cpu_to_le32(0xFFFF);
+        idmap->output_base = 0;
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
+
     /* Root Complex Node */
     node_size = sizeof(*rc) + sizeof(*idmap);
     iort_length += node_size;
@@ -438,8 +474,14 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     idmap->input_base = 0;
     idmap->id_count = cpu_to_le32(0xFFFF);
     idmap->output_base = 0;
-    /* output IORT node is the ITS group node (the first node) */
-    idmap->output_reference = cpu_to_le32(iort->node_offset);
+
+    if (vms->smmu) {
+        /* output IORT node is the smmuv3 node */
+        idmap->output_reference = cpu_to_le32(smmu_offset);
+    } else {
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
 
     iort->length = cpu_to_le32(iort_length);
 
@@ -785,7 +827,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (its_class_name() && !vmc->no_its) {
         acpi_add_table(table_offsets, tables_blob);
-        build_iort(tables_blob, tables->linker);
+        build_iort(tables_blob, tables->linker, vms);
     }
 
     /* RSDT is pointed to by RSDP */
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 293ee45..62ff60c 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -696,6 +696,21 @@ struct AcpiIortItsGroup {
 } QEMU_PACKED;
 typedef struct AcpiIortItsGroup AcpiIortItsGroup;
 
+struct AcpiIortSmmu3 {
+    ACPI_IORT_NODE_HEADER_DEF
+    uint64_t base_address;
+    uint32_t flags;
+    uint32_t reserved2;
+    uint64_t vatos_address;
+    uint32_t model;
+    uint32_t event_gsiv;
+    uint32_t pri_gsiv;
+    uint32_t gerr_gsiv;
+    uint32_t sync_gsiv;
+    AcpiIortIdMapping id_mapping_array[0];
+} QEMU_PACKED;
+typedef struct AcpiIortSmmu3 AcpiIortSmmu3;
+
 struct AcpiIortRC {
     ACPI_IORT_NODE_HEADER_DEF
     AcpiIortMemoryAccess memory_properties;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class Eric Auger
@ 2017-05-30 15:56   ` Peter Maydell
  2017-05-31  7:04     ` Auger Eric
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Maydell @ 2017-05-30 15:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
> Introduces the base device and class for the ARM smmu.
> Implements VMSAv8-64 table lookup and translation. VMSAv8-32
> is not yet implemented.
>
> For VFIO integration we will need to notify mapping changes
> of an input range and skipped unmapped regions. table walk
> helper allows.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>
> +/**
> + * smmu_page_walk_level_64 - Walk an IOVA range from a specific level
> + * @baseaddr: table base address corresponding to @level
> + * @level: level
> + * @cfg: translation config
> + * @start: end of the IOVA range
> + * @end: end of the IOVA range
> + * @hook_fn: the hook that to be called for each detected area
> + * @private: private data for the hook function
> + * @read: whether parent level has read permission
> + * @write: whether parent level has write permission
> + * @must_translate: indicates whether each iova of the range
> + *  must be translated or whether failure is allowed
> + * @notify_unmap: whether we should notify invalid entries
> + *
> + * Return 0 on success, < 0 on errors not related to translation
> + * process, > 1 on errors related to translation process (only
> + * if must_translate is set)
> + */
> +static int
> +smmu_page_walk_level_64(dma_addr_t baseaddr, int level,
> +                        SMMUTransCfg *cfg, uint64_t start, uint64_t end,
> +                        smmu_page_walk_hook hook_fn, void *private,
> +                        bool read, bool write, bool must_translate,
> +                        bool notify_unmap)
> +{

> +        /* table pte */
> +        next_table_baseaddr = get_table_pte_address(pte, granule_sz);
> +        trace_smmu_page_walk_level_table_pte(pte, next_table_baseaddr);
> +        ret = smmu_page_walk_level_64(next_table_baseaddr, level + 1, cfg,
> +                                      iova, MIN(iova_next, end),
> +                                      hook_fn, private, read_cur, write_cur,
> +                                      must_translate, notify_unmap);
> +        if (!ret) {
> +            return ret;
> +        }
> +
> +next:
> +        iova = iova_next;
> +    }
> +
> +    return SMMU_TRANS_ERR_NONE;
> +}

It seems a bit odd that this function is recursive -- I was
expecting it just to loop through for each successive level
until it hits a block/page entry, as the LPAE table walk
code in target/arm/helper.c does...
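
For illustration, a loop-based walk of the kind described above might look
roughly like the sketch below (apart from get_table_pte_address() and
SMMUTransCfg, the helper names are hypothetical and the descriptor-bit
handling assumes the VMSAv8-64 format; this is not the patch's API):

/*
 * Iterative descent: follow table descriptors level by level until a
 * block or page descriptor (or an invalid entry) is reached.
 */
static int smmu_walk_iova_iter(SMMUTransCfg *cfg, uint64_t iova,
                               uint64_t *out_pa)
{
    uint64_t baseaddr = cfg->ttbr;
    int level;

    for (level = start_level(cfg); level <= 3; level++) {     /* hypothetical */
        uint64_t pte = fetch_pte(baseaddr, iova, level, cfg);  /* hypothetical */

        if (!(pte & 1)) {
            return -1;                   /* invalid descriptor: fault */
        }
        if (level == 3 || !(pte & 2)) {  /* page (level 3) or block descriptor */
            *out_pa = pte_to_pa(pte, iova, level, cfg);        /* hypothetical */
            return 0;
        }
        /* table descriptor: descend to the next-level table */
        baseaddr = get_table_pte_address(pte, cfg->granule_sz);
    }
    return -1;
}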

thanks
-- PMM

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC v4 2/5] hw/arm/smmuv3: smmuv3 emulation model
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 2/5] hw/arm/smmuv3: smmuv3 emulation model Eric Auger
@ 2017-05-30 16:01   ` Peter Maydell
  2017-05-31  7:07     ` Auger Eric
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Maydell @ 2017-05-30 16:01 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
> From: Prem Mallappa <prem.mallappa@broadcom.com>
>
> Introduces the SMMUv3 derived model. This is based on
> System MMUv3 specification (v17).
>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v3 -> v4
> - smmu_irq_update
> - fix hash key allocation
> - set smmu_iommu_ops
> - set SMMU_REG_CR0,
> - smmuv3_translate: ret.perm not set in bypass mode
> - use trace events
> - renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
> - rework smmu_find_ste
> - fix tg2granule in TT0/0b10 corresponds to 16kB
>
> v2 -> v3:
> - move creation of include/hw/arm/smmuv3.h to this patch to fix compil issue
> - compilation allowed
> - fix sbus allocation in smmu_init_pci_iommu
> - restructure code into headers
> - misc cleanups
> ---
>  hw/arm/Makefile.objs     |    2 +-
>  hw/arm/smmuv3-internal.h |  603 ++++++++++++++++++++++++
>  hw/arm/smmuv3.c          | 1134 ++++++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |   32 ++
>  include/hw/arm/smmuv3.h  |   87 ++++
>  5 files changed, 1857 insertions(+), 1 deletion(-)
>  create mode 100644 hw/arm/smmuv3-internal.h
>  create mode 100644 hw/arm/smmuv3.c
>  create mode 100644 include/hw/arm/smmuv3.h

This is a bit of a big patch for review for my taste -- are
there some easy splits into multiple patches possible?

> +typedef struct SMMUQueue {
> +     hwaddr base;
> +     uint32_t prod;
> +     uint32_t cons;
> +     union {
> +          struct {
> +               uint8_t prod:1;
> +               uint8_t cons:1;
> +          };
> +          uint8_t unused;
> +     } wrap;

Use of bitfields here seems a bit odd but I haven't
looked at the code really.

thanks
-- PMM


* Re: [Qemu-devel] [RFC v4 4/5] hw/arm/virt: Add 2.10 machine type
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 4/5] hw/arm/virt: Add 2.10 machine type Eric Auger
@ 2017-05-30 16:04   ` Peter Maydell
  2017-05-31  7:15     ` Auger Eric
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Maydell @ 2017-05-30 16:04 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
> The new machine type allows smmuv3 instantiation. A new option
> is introduced to turn the feature on/off (off by default).

Should we go for default-on, or would that break guests?
For other things added to the virt board I think the
approach we've taken has been:
 * if this is just an extra device, simply provide it in the
   new virt-n.nn machine by default
 * if this is something that changes how the machine behaves
   even for code that doesn't care about that feature (eg
   EL2, EL3, since they change what mode you start in on boot)
   then default it to off
 * if the feature only works with TCG and not KVM then
   maybe default it to off

thanks
-- PMM


* Re: [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support
  2017-05-13 17:43 [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Eric Auger
                   ` (4 preceding siblings ...)
  2017-05-13 17:43 ` [Qemu-devel] [RFC v4 5/5] hw/arm/virt-acpi-build: add smmuv3 node in IORT table Eric Auger
@ 2017-05-30 16:09 ` Peter Maydell
  2017-05-31  7:21   ` Auger Eric
  5 siblings, 1 reply; 14+ messages in thread
From: Peter Maydell @ 2017-05-30 16:09 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
> This series introduces the emulation code for ARM SMMUv3.
> This is the continuation of Prem's work [1].
>
> This v4 is yet another visibility step as many restrictions apply
> to the model at the moment:

I had a quick scan through to see if anything leapt out
as odd, and have sent a few comments, but I'm assuming
you don't want in-depth review at this time (and hoping
that somebody else will provide it when you do want it ;-)).

thanks
-- PMM


* Re: [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class
  2017-05-30 15:56   ` Peter Maydell
@ 2017-05-31  7:04     ` Auger Eric
  0 siblings, 0 replies; 14+ messages in thread
From: Auger Eric @ 2017-05-31  7:04 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

Hi Peter,

On 30/05/2017 17:56, Peter Maydell wrote:
> On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
>> Introduces the base device and class for the ARM smmu.
>> Implements VMSAv8-64 table lookup and translation. VMSAv8-32
>> is not yet implemented.
>>
>> For VFIO integration we will need to notify mapping changes
>> over an input range and skip unmapped regions; the table walk
>> helper allows this.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>>
>> +/**
>> + * smmu_page_walk_level_64 - Walk an IOVA range from a specific level
>> + * @baseaddr: table base address corresponding to @level
>> + * @level: level
>> + * @cfg: translation config
>> + * @start: start of the IOVA range
>> + * @end: end of the IOVA range
>> + * @hook_fn: the hook to be called for each detected area
>> + * @private: private data for the hook function
>> + * @read: whether parent level has read permission
>> + * @write: whether parent level has write permission
>> + * @must_translate: indicates whether each iova of the range
>> + *  must be translated or whether failure is allowed
>> + * @notify_unmap: whether we should notify invalid entries
>> + *
>> + * Return 0 on success, < 0 on errors not related to translation
>> + * process, > 1 on errors related to translation process (only
>> + * if must_translate is set)
>> + */
>> +static int
>> +smmu_page_walk_level_64(dma_addr_t baseaddr, int level,
>> +                        SMMUTransCfg *cfg, uint64_t start, uint64_t end,
>> +                        smmu_page_walk_hook hook_fn, void *private,
>> +                        bool read, bool write, bool must_translate,
>> +                        bool notify_unmap)
>> +{
> 
>> +        /* table pte */
>> +        next_table_baseaddr = get_table_pte_address(pte, granule_sz);
>> +        trace_smmu_page_walk_level_table_pte(pte, next_table_baseaddr);
>> +        ret = smmu_page_walk_level_64(next_table_baseaddr, level + 1, cfg,
>> +                                      iova, MIN(iova_next, end),
>> +                                      hook_fn, private, read_cur, write_cur,
>> +                                      must_translate, notify_unmap);
>> +        if (!ret) {
>> +            return ret;
>> +        }
>> +
>> +next:
>> +        iova = iova_next;
>> +    }
>> +
>> +    return SMMU_TRANS_ERR_NONE;
>> +}
> 
> It seems a bit odd that this function is recursive -- I was
> expecting it just to loop through for each successive level
> until it hits a block/page entry, as the LPAE table walk
> code in target/arm/helper.c does...

welcome back ;-)

Thanks for the pointer to get_phys_addr_lpae(). Actually I was largely
inspired by vtd_page_walk() and vtd_page_walk_level() in
hw/i386/intel_iommu.c, which use the same recursive form.

The peculiarity compared to get_phys_addr_lpae() is that we scan a range
of IOVAs rather than translating a single IOVA. When a level entry is
invalid we want to jump to the next entry at the same level, and when the
entry is a table entry we want to scan the next-level entries covering
that IOVA range.
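
Roughly, the walk has the shape below. This is only an illustrative
sketch, not the patch code: level_shift(), get_pte(), is_invalid_pte()
and is_table_pte() are placeholder names, the error handling is
simplified, and only get_table_pte_address() matches a helper actually
used in the patch.

#include <stdint.h>
#include <stdbool.h>

typedef uint64_t dma_addr_t;   /* stand-in for QEMU's dma_addr_t */
typedef int (*walk_hook)(uint64_t pte, uint64_t iova, uint64_t end,
                         void *private);

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* placeholders for the patch's PTE helpers */
int level_shift(int level, int granule_sz);
uint64_t get_pte(dma_addr_t base, uint64_t iova, int level, int granule_sz);
bool is_invalid_pte(uint64_t pte);
bool is_table_pte(uint64_t pte, int level);
dma_addr_t get_table_pte_address(uint64_t pte, int granule_sz);

static int walk_level(dma_addr_t baseaddr, int level, int granule_sz,
                      uint64_t start, uint64_t end,
                      walk_hook hook, void *private)
{
    uint64_t subpage = 1ULL << level_shift(level, granule_sz);
    uint64_t iova = start;

    while (iova < end) {
        /* end of the IOVA chunk covered by this level's entry */
        uint64_t iova_next = (iova & ~(subpage - 1)) + subpage;
        uint64_t pte = get_pte(baseaddr, iova, level, granule_sz);
        int ret;

        if (is_invalid_pte(pte)) {
            /* hole: move on to the next entry at the same level */
        } else if (is_table_pte(pte, level)) {
            /* descend, restricted to the part of [start, end) it covers */
            ret = walk_level(get_table_pte_address(pte, granule_sz),
                             level + 1, granule_sz,
                             iova, MIN(iova_next, end), hook, private);
            if (ret) {
                return ret;
            }
        } else {
            /* block/page entry: report the mapped area to the hook */
            ret = hook(pte, iova, MIN(iova_next, end), private);
            if (ret) {
                return ret;
            }
        }
        iova = iova_next;
    }
    return 0;
}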

Does that make more sense?

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [RFC v4 2/5] hw/arm/smmuv3: smmuv3 emulation model
  2017-05-30 16:01   ` Peter Maydell
@ 2017-05-31  7:07     ` Auger Eric
  0 siblings, 0 replies; 14+ messages in thread
From: Auger Eric @ 2017-05-31  7:07 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

Hi Peter,
On 30/05/2017 18:01, Peter Maydell wrote:
> On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
>> From: Prem Mallappa <prem.mallappa@broadcom.com>
>>
>> Introduces the SMMUv3 derived model. This is based on
>> System MMUv3 specification (v17).
>>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v3 -> v4
>> - smmu_irq_update
>> - fix hash key allocation
>> - set smmu_iommu_ops
>> - set SMMU_REG_CR0,
>> - smmuv3_translate: ret.perm not set in bypass mode
>> - use trace events
>> - renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
>> - rework smmu_find_ste
>> - fix tg2granule in TT0/0b10 corresponds to 16kB
>>
>> v2 -> v3:
>> - move creation of include/hw/arm/smmuv3.h to this patch to fix compil issue
>> - compilation allowed
>> - fix sbus allocation in smmu_init_pci_iommu
>> - restructure code into headers
>> - misc cleanups
>> ---
>>  hw/arm/Makefile.objs     |    2 +-
>>  hw/arm/smmuv3-internal.h |  603 ++++++++++++++++++++++++
>>  hw/arm/smmuv3.c          | 1134 ++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/arm/trace-events      |   32 ++
>>  include/hw/arm/smmuv3.h  |   87 ++++
>>  5 files changed, 1857 insertions(+), 1 deletion(-)
>>  create mode 100644 hw/arm/smmuv3-internal.h
>>  create mode 100644 hw/arm/smmuv3.c
>>  create mode 100644 include/hw/arm/smmuv3.h
> 
> This is a bit of a big patch for review for my taste -- are
> there some easy splits into multiple patches possible?

Yes, I fully understand. I will try to split it to ease the review. Maybe
I need to reach a decent level of functionality first and then split it.
> 
>> +typedef struct SMMUQueue {
>> +     hwaddr base;
>> +     uint32_t prod;
>> +     uint32_t cons;
>> +     union {
>> +          struct {
>> +               uint8_t prod:1;
>> +               uint8_t cons:1;
>> +          };
>> +          uint8_t unused;
>> +     } wrap;
> 
> Use of bitfields here seems a bit odd but I haven't
> looked at the code really.

OK. I will further look at this part.
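
One alternative I could look into, assuming the usual SMMUv3 queue layout
where the wrap flag sits just above the index bits of PROD/CONS, would be
to drop the bitfields and derive index and wrap from the raw register
values. A rough sketch only (the log2size field and helper names are
illustrative, not the actual code):

#include <stdint.h>
#include <stdbool.h>

typedef uint64_t hwaddr;   /* stand-in for QEMU's hwaddr */

typedef struct SMMUQueue {
    hwaddr base;
    uint32_t prod;     /* index in the low bits, wrap flag at bit log2size */
    uint32_t cons;
    uint8_t log2size;  /* log2 of the number of queue entries */
} SMMUQueue;

static inline uint32_t queue_index(uint32_t reg, uint8_t log2size)
{
    return reg & ((1u << log2size) - 1);
}

static inline bool queue_wrap(uint32_t reg, uint8_t log2size)
{
    return (reg >> log2size) & 1;
}

static inline bool queue_empty(SMMUQueue *q)
{
    /* same index and same wrap flag (only those bits are kept in prod/cons) */
    return q->prod == q->cons;
}

static inline bool queue_full(SMMUQueue *q)
{
    /* same index but opposite wrap flags */
    return queue_index(q->prod, q->log2size) ==
           queue_index(q->cons, q->log2size) &&
           queue_wrap(q->prod, q->log2size) !=
           queue_wrap(q->cons, q->log2size);
}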

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [RFC v4 4/5] hw/arm/virt: Add 2.10 machine type
  2017-05-30 16:04   ` Peter Maydell
@ 2017-05-31  7:15     ` Auger Eric
  0 siblings, 0 replies; 14+ messages in thread
From: Auger Eric @ 2017-05-31  7:15 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

Hi Peter,

On 30/05/2017 18:04, Peter Maydell wrote:
> On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
>> The new machine type allows smmuv3 instantiation. A new option
>> is introduced to turn the feature on/off (off by default).
> 
> Should we go for default-on, or would that break guests?
> For other things added to the virt board I think the
> approach we've taken has been:
>  * if this is just an extra device, simply provide it in the
>    new virt-n.nn machine by default
>  * if this is something that changes how the machine behaves
>    even for code that doesn't care about that feature (eg
>    EL2, EL3, since they change what mode you start in on boot)
>    then default it to off
>  * if the feature only works with TCG and not KVM then
>    maybe default it to off

OK, thanks for the guidelines. I made it off by default since it can
significantly degrade performance in some use cases. Instantiating the
SMMU can also require option changes at other levels: for instance,
virtio-pci devices may need to be used with the iommu_platform option for
the guest to use DMA ops, as illustrated below.
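
For example (illustrative only -- the exact set of device options depends
on the QEMU version and device), such a guest device would typically be
declared with something like:

  -device virtio-net-pci,disable-legacy=on,iommu_platform=on

so that the virtio device sits behind the vIOMMU and the guest driver
uses DMA ops for it.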

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support
  2017-05-30 16:09 ` [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Peter Maydell
@ 2017-05-31  7:21   ` Auger Eric
  0 siblings, 0 replies; 14+ messages in thread
From: Auger Eric @ 2017-05-31  7:21 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, Edgar E. Iglesias, qemu-arm, QEMU Developers,
	prem.mallappa, Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain

Hi Peter,

On 30/05/2017 18:09, Peter Maydell wrote:
> On 13 May 2017 at 18:43, Eric Auger <eric.auger@redhat.com> wrote:
>> This series introduces the emulation code for ARM SMMUv3.
>> This is the continuation of Prem's work [1].
>>
>> This v4 is yet another visibility step as many restrictions apply
>> to the model at the moment:
> 
> I had a quick scan through to see if anything leapt out
> as odd, and have sent a few comments, but I'm assuming
> you don't want in-depth review at this time (and hoping
> that somebody else will provide it when you do want it ;-)).

Thank you for this quick scan. Indeed this is what I expect at the moment
since the model clearly lacks maturity and features. I will ping people
for a proper review when that makes sense.

At the moment people can start testing it for stage 1 only. I was told
about, and have reproduced, boot problems with a guest using
virtio-blk-pci. This is currently under debug; my next step after that is
to complete the VFIO integration.

Thanks

Eric

> 
> thanks
> -- PMM
> 


end of thread, other threads:[~2017-05-31  7:21 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-13 17:43 [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Eric Auger
2017-05-13 17:43 ` [Qemu-devel] [RFC v4 1/5] hw/arm/smmu-common: smmu base class Eric Auger
2017-05-30 15:56   ` Peter Maydell
2017-05-31  7:04     ` Auger Eric
2017-05-13 17:43 ` [Qemu-devel] [RFC v4 2/5] hw/arm/smmuv3: smmuv3 emulation model Eric Auger
2017-05-30 16:01   ` Peter Maydell
2017-05-31  7:07     ` Auger Eric
2017-05-13 17:43 ` [Qemu-devel] [RFC v4 3/5] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
2017-05-13 17:43 ` [Qemu-devel] [RFC v4 4/5] hw/arm/virt: Add 2.10 machine type Eric Auger
2017-05-30 16:04   ` Peter Maydell
2017-05-31  7:15     ` Auger Eric
2017-05-13 17:43 ` [Qemu-devel] [RFC v4 5/5] hw/arm/virt-acpi-build: add smmuv3 node in IORT table Eric Auger
2017-05-30 16:09 ` [Qemu-devel] [RFC v4 0/5] ARM SMMUv3 Emulation Support Peter Maydell
2017-05-31  7:21   ` Auger Eric
