[RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description.

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description.
@ 2023-08-08 11:57 Jonathan Cameron via
  2023-08-08 11:57 ` [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions Jonathan Cameron via
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-08 11:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gavin Shan, linuxarm, James Morse, peter . maydell @ linaro . org,
	zhao1.liu, Alex Bennée, Shameerali Kolothum Thodi,
	Yicong Yang

Aim of this bit of emulation is to use it for testing James Morse's
kernel tree - in particularly letting us poke the corner cases.
Right now I'm not that focused on upstreaming this (too many other things
in my backlog), but any feedback on the approach etc welcome and perhaps
the PPTT part is useful independent of MPAM support.

Current kernel branch (one outstanding bug reported but that's hard to hit
and requires setting the narrowing target number of IDs to 1 which is bonkers):
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=mpam/snapshot/v6.5-rc1

Supported:
* PPTT cache description - this is necessary for the cross references MPAM
  table entries use to establish which Cache any given control set influences.
  I included option for generating shared tables which were a common choice
  prior to MPAM needing those cross references.
* CPU emulation for MPAM. Given we aren't doing anything with the content
  this is just a case of adding the MPAM_IDR register and read/write registers
  to control current PARTID / PMG group.
* MPAM MSC emulation for caches and memory controllers.
  Multiple RIS support allows up to 16 such elements to be controlled via
  a single interface (used only for memory currently.
  Most controls wired up, though introspection interface and sanity checks
  only cover some of them so far. No monitoring yet.
* ACPI tables and device instantiation in ARM Virt. ACPI only because the
  kernel patches clearly state the DT binding is a WIP.
* A hack to add lots of caches to the MAX cpu via the relevant CPU registers
  - these are read back to generate the PPTT Table and MPAM devices.

TODO:
- Dealing with case of no NUMA nodes. Currently we don't start if NUMA
  nodes aren't specified and mpam=on.  Defaulting to a single NUMA
  node if MPAM is enabled may make more sense.
- Error injection / reporting on invalid parameters.
- Monitor support.
- Wire up the interrupts properly.
- Tighten checks on unexpected values to further help with catching
  bugs in kernel code (a few already found and fixed by James).
- ACPI table test (yeah I'm lazy).
- Remove remaining 'fixed' constraints on number of partitions etc
  so they can be different across controllers / different levels
  of the hierarchy.
- Expand the qmp introspection interface to cover the missing parts.

Example command line (who doesn't love SMT arm machines?):
aarch64-softmmu/qemu-system-aarch64 -D test.log -d unimp \
 -M virt,nvdimm=on,gic-version=3,mpam=on,mpam_min_msc=on \
 -m 4g -cpu max,core-count=2 \
 -smp 16,sockets=1,clusters=4,threads=2,cache-cluster-start-level=2,cache-node-start-level=3 \
 -kernel Image \
 -drive if=none,file=full.qcow2,format=qcow2,id=hd \
 -device pcie-root-port,id=root_port1 -device virtio-blk-pci,drive=hd \
 -qmp-pretty tcp:localhost:4445,server=on,wait=off \
...
 -nographic -no-reboot -append 'earlycon root=/dev/vda2' \
 -bios QEMU_EFI.fd \
 -object memory-backend-ram,size=1G,id=mem0 \
 -object memory-backend-ram,size=1G,id=mem1 \
 -object memory-backend-ram,size=1G,id=mem2 \
 -object memory-backend-ram,size=1G,id=mem3 \
 -numa node,nodeid=0,cpus=0-3,memdev=mem0 \
 -numa node,nodeid=1,cpus=4-7,memdev=mem1 \
 -numa node,nodeid=2,cpus=8-11,memdev=mem2 \
 -numa node,nodeid=3,cpus=12-15,memdev=mem3
 
QMP comamnds:

{ "execute": "qmp_capabilities" }
{ "execute": "query-mpam-cache",
  "arguments": {
    "level": 3
  }
}

Will return something like (reformatted as the pretty version is 'long')
An 'ideal' version of this interface will take some more thought as it
needs to balance readability and clarity with complex implementation of
the code to 'interpret' the register values.

{
    "return": [
        {
            "cpu": 0,
            "level": 3,
            "regs": [
                {
                    "mbwumon-idr": 0,
                    "idr": 758514712831,
                    "cfg-cpbm": [
                        { "words": [ 4294967295 ] },
                        { "words": [ 0 ] },
                        { "words": [ 0 ] },
                        { "words": [ 0 ] },	
                        { "words": [ 0 ] },
			....
			{ "words": [ 0 ] }
                    ],
                    "partid-nrw-idr": 31,
                    "mbw-idr": 0,
                    "csumon-idr": 0,
                    "esr": 0,
                    "ecr": 1,
                    "cfg-part-sel": 0,
                    "iidr": 44042038,
                    "cpor-idr": 32,
                    "msmon-idr": 0,
                    "ccap-idr": 2952791044,
                    "aidr": 17,
                    "pri-idr": 35
                }
            ],
            "type": 3
        }
    ]
}

Jonathan Cameron (5):
  hw/acpi: Add PPTT cache descriptions
  HACK: target/arm/tcg: Add some more caches to cpu=max
  target/arm: Add support for MPAM CPU registers
  hw/arm: Add MPAM emulation.
  hw/arm/virt: Add MPAM MSCs for memory controllers and caches.

 qapi/machine.json           |   8 +-
 qapi/mpam.json              |  78 ++++
 qapi/qapi-schema.json       |   1 +
 include/hw/acpi/aml-build.h |  19 +-
 include/hw/arm/mpam.h       |  13 +
 include/hw/arm/virt.h       |   2 +
 include/hw/boards.h         |   4 +
 target/arm/cpu.h            |  15 +
 hw/acpi/aml-build.c         | 189 +++++++-
 hw/arm/mpam-qapi-stubs.c    |   9 +
 hw/arm/mpam-qapi.c          |  58 +++
 hw/arm/mpam.c               | 886 ++++++++++++++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c    | 327 ++++++++++++-
 hw/arm/virt.c               | 134 ++++++
 hw/core/machine-smp.c       |   8 +
 hw/loongarch/acpi-build.c   |   2 +-
 target/arm/cpu.c            |  10 +-
 target/arm/helper.c         |  30 ++
 target/arm/tcg/cpu64.c      |  12 +
 hw/arm/Kconfig              |   4 +
 hw/arm/meson.build          |   4 +
 qapi/meson.build            |   1 +
 22 files changed, 1803 insertions(+), 11 deletions(-)
 create mode 100644 qapi/mpam.json
 create mode 100644 include/hw/arm/mpam.h
 create mode 100644 hw/arm/mpam-qapi-stubs.c
 create mode 100644 hw/arm/mpam-qapi.c
 create mode 100644 hw/arm/mpam.c

-- 
2.39.2



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions
  2023-08-08 11:57 [RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description Jonathan Cameron via
@ 2023-08-08 11:57 ` Jonathan Cameron via
  2023-08-14  9:50   ` Zhao Liu
  2023-08-08 11:57 ` [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max Jonathan Cameron via
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-08 11:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gavin Shan, linuxarm, James Morse, peter . maydell @ linaro . org,
	zhao1.liu, Alex Bennée, Shameerali Kolothum Thodi,
	Yicong Yang

Current PPTT tables generated by QEMU only provide information on CPU
topology and neglect the description of Caches.

This patch adds flexible definition of those caches and updates the
table version to 3 to allow for the per CPU cache instance IDs needed
for cross references from the MPAM table.

If MPAM is not being used, then a unified description can be used,
greatly reducing the resulting table size.

New machine parameters are used to control the cache toplogy.
cache-cluster-start-level: Which caches are associated with the cluster
  level of the topology. e.g cache-cluster-start-level=2 results in shared
  l2 cache across a cluster.
cache-numa-start-level: Which caches are associate with the NUMA (in qemu
  this is currently the physical package level).  For example
  cache-cluster-start-level=2,cache-numa-start-level=3 gives
  private l1, cluster shared l2 and package shared L3.

FIXME: Test updates.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 qapi/machine.json           |   8 +-
 include/hw/acpi/aml-build.h |  19 +++-
 include/hw/boards.h         |   4 +
 hw/acpi/aml-build.c         | 189 ++++++++++++++++++++++++++++++++++--
 hw/arm/virt-acpi-build.c    | 130 ++++++++++++++++++++++++-
 hw/core/machine-smp.c       |   8 ++
 hw/loongarch/acpi-build.c   |   2 +-
 7 files changed, 350 insertions(+), 10 deletions(-)

diff --git a/qapi/machine.json b/qapi/machine.json
index a08b6576ca..cc86784641 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1494,6 +1494,10 @@
 # @maxcpus: maximum number of hotpluggable virtual CPUs in the virtual
 #     machine
 #
+# @cache-cluster-start-level: Level of first cache attached to cluster
+#
+# @cache-node-start-level: Level of first cache attached to cluster
+#
 # Since: 6.1
 ##
 { 'struct': 'SMPConfiguration', 'data': {
@@ -1503,7 +1507,9 @@
      '*clusters': 'int',
      '*cores': 'int',
      '*threads': 'int',
-     '*maxcpus': 'int' } }
+     '*maxcpus': 'int',
+     '*cache-cluster-start-level': 'int',
+     '*cache-node-start-level': 'int'} }
 
 ##
 # @x-query-irq:
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index d1fb08514b..055b74820d 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -489,8 +489,25 @@ void build_srat_memory(GArray *table_data, uint64_t base,
 void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
                 const char *oem_id, const char *oem_table_id);
 
+typedef enum ACPIPPTTCacheType {
+    DATA,
+    INSTRUCTION,
+    UNIFIED,
+} ACPIPPTTCacheType;
+
+typedef struct ACPIPPTTCache {
+    ACPIPPTTCacheType type;
+    int sets;
+    int size;
+    int associativity;
+    int linesize;
+    unsigned int pptt_id;
+    int level;
+} ACPIPPTTCache;
+
 void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
-                const char *oem_id, const char *oem_table_id);
+                const char *oem_id, const char *oem_table_id,
+                int num_caches, ACPIPPTTCache *caches);
 
 void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
                 const char *oem_id, const char *oem_table_id);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index ed83360198..6e8ab92684 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -316,6 +316,8 @@ typedef struct DeviceMemoryState {
  * @cores: the number of cores in one cluster
  * @threads: the number of threads in one core
  * @max_cpus: the maximum number of logical processors on the machine
+ * @cache_cluster_start_level: First cache level attached to cluster
+ * @cache_node_start_level: First cache level attached to node
  */
 typedef struct CpuTopology {
     unsigned int cpus;
@@ -325,6 +327,8 @@ typedef struct CpuTopology {
     unsigned int cores;
     unsigned int threads;
     unsigned int max_cpus;
+    unsigned int cache_cluster_start_level;
+    unsigned int cache_node_start_level;
 } CpuTopology;
 
 /**
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index ea331a20d1..e103cd638f 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1994,32 +1994,175 @@ static void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
     }
 }
 
+static void build_cache_nodes(GArray *tbl, ACPIPPTTCache *cache,
+                              uint32_t next_offset,
+                              bool has_id, unsigned int id)
+{
+    int val;
+
+    /* Type 1 - cache */
+    build_append_byte(tbl, 1);
+    /* Length */
+    build_append_byte(tbl, 28);
+    /* Reserved */
+    build_append_int_noprefix(tbl, 0, 2);
+    /* Flags - everything except possibly the ID */
+    build_append_int_noprefix(tbl, has_id ? 0xff : 0x7f, 4);
+    /* Offset of next cache up */
+    build_append_int_noprefix(tbl, next_offset, 4);
+    build_append_int_noprefix(tbl, cache->size, 4);
+    build_append_int_noprefix(tbl, cache->sets, 4);
+    build_append_byte(tbl, cache->associativity);
+    /* Read and Write allocate amd WB */
+    val = 0x3 | (1 << 4);
+    switch (cache->type) {
+    case INSTRUCTION:
+        val |= (1 << 2);
+        break;
+    case DATA:
+        val |= (0 << 2); /* Data */
+        break;
+    case UNIFIED:
+        val |= (3 << 2); /* Unified */
+        break;
+    }
+    build_append_byte(tbl, val);
+    build_append_int_noprefix(tbl, cache->linesize, 2);
+    build_append_int_noprefix(tbl,
+                              has_id ?
+                              (cache->type << 24) | (cache->level << 16) | id :
+                              0, 4);
+}
+
+static void build_caches_subset(GArray *table_data, uint32_t pptt_start,
+                                int num_caches, ACPIPPTTCache *caches,
+                                bool assign_ids, int base_id,
+                                uint8_t level_high, uint8_t level_low,
+                                uint32_t *data_offset, uint32_t *instr_offset)
+{
+    uint32_t next_level_offset_data = 0, next_level_offset_instruction = 0;
+    uint32_t this_offset, next_offset = 0;
+    int c, l;
+
+    /* Walk caches from top to bottom */
+
+    for (l = level_high; l >= level_low; l--) { /* Walk down levels */
+        for (c = 0; c < num_caches; c++) {
+            if (caches[c].level != l) {
+                continue;
+            }
+
+            /* Assume only unified above l1 for now */
+            this_offset = table_data->len - pptt_start;
+            switch (caches[c].type) {
+            case INSTRUCTION:
+                next_offset = next_level_offset_instruction;
+                break;
+            case DATA:
+                next_offset = next_level_offset_data;
+                break;
+            case UNIFIED:
+                /* Either is fine here - hopefully */
+                next_offset = next_level_offset_instruction;
+                break;
+            }
+            build_cache_nodes(table_data, &caches[c], next_offset,
+                              assign_ids, base_id);
+            switch (caches[c].type) {
+            case INSTRUCTION:
+                next_level_offset_instruction = this_offset;
+                break;
+            case DATA:
+                next_level_offset_data = this_offset;
+                break;
+            case UNIFIED:
+                next_level_offset_instruction = this_offset;
+                next_level_offset_data = this_offset;
+                break;
+            }
+            *data_offset = next_level_offset_data;
+            *instr_offset = next_level_offset_instruction;
+        }
+    }
+}
+
 /*
  * ACPI spec, Revision 6.3
  * 5.2.29 Processor Properties Topology Table (PPTT)
  */
 void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
-                const char *oem_id, const char *oem_table_id)
+                const char *oem_id, const char *oem_table_id,
+                int num_caches, ACPIPPTTCache *caches)
 {
+    bool share_structs = false;
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     CPUArchIdList *cpus = ms->possible_cpus;
     int64_t socket_id = -1, cluster_id = -1, core_id = -1;
     uint32_t socket_offset = 0, cluster_offset = 0, core_offset = 0;
     uint32_t pptt_start = table_data->len;
     int n;
-    AcpiTable table = { .sig = "PPTT", .rev = 2,
+    AcpiTable table = { .sig = "PPTT", .rev = 3,
                         .oem_id = oem_id, .oem_table_id = oem_table_id };
+    uint32_t l1_data_offset = 0;
+    uint32_t l1_instr_offset = 0;
+    uint32_t cluster_data_offset = 0;
+    uint32_t cluster_instr_offset = 0;
+    uint32_t node_data_offset = 0;
+    uint32_t node_instr_offset = 0;
+    int top_node = 7;
+    int top_cluster = 7;
+    int top_core = 7;
 
     acpi_table_begin(&table, table_data);
 
+    /* Let us have a unified cache description for now */
+
+    if (share_structs && num_caches >= 1) {
+        if (ms->smp.cache_node_start_level) {
+            build_caches_subset(table_data, pptt_start, num_caches, caches,
+                                false, 0,
+                                top_node, ms->smp.cache_node_start_level,
+                                &node_data_offset, &node_instr_offset);
+            top_cluster = ms->smp.cache_node_start_level - 1;
+        }
+        /* Assumption that some caches below this */
+        if (ms->smp.cache_cluster_start_level) {
+            build_caches_subset(table_data, pptt_start, num_caches, caches,
+                                false, 0,
+                                top_cluster,  ms->smp.cache_cluster_start_level,
+                                &cluster_data_offset, &cluster_instr_offset);
+            top_core = ms->smp.cache_cluster_start_level - 1;
+        }
+        build_caches_subset(table_data, pptt_start, num_caches, caches,
+                            false, 0,
+                            top_core , 0,
+                            &l1_data_offset, &l1_instr_offset);
+    }
+
     /*
      * This works with the assumption that cpus[n].props.*_id has been
      * sorted from top to down levels in mc->possible_cpu_arch_ids().
      * Otherwise, the unexpected and duplicated containers will be
      * created.
      */
+
     for (n = 0; n < cpus->len; n++) {
         if (cpus->cpus[n].props.socket_id != socket_id) {
+            uint32_t priv_rsrc[2];
+            int num_priv = 0;
+
+            if (!share_structs && ms->smp.cache_node_start_level) {
+                build_caches_subset(table_data, pptt_start, num_caches, caches,
+                                    true, n,
+                                    top_node, ms->smp.cache_node_start_level,
+                                    &node_data_offset, &node_instr_offset);
+                top_cluster = ms->smp.cache_node_start_level - 1;
+            }
+            priv_rsrc[0] = node_instr_offset;
+            priv_rsrc[1] = node_data_offset;
+            if (node_instr_offset || node_data_offset) {
+                num_priv = node_instr_offset == node_data_offset ? 1 : 2;
+            }
             assert(cpus->cpus[n].props.socket_id > socket_id);
             socket_id = cpus->cpus[n].props.socket_id;
             cluster_id = -1;
@@ -2027,36 +2170,70 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
             socket_offset = table_data->len - pptt_start;
             build_processor_hierarchy_node(table_data,
                 (1 << 0), /* Physical package */
-                0, socket_id, NULL, 0);
+                0, socket_id, priv_rsrc, num_priv);
         }
 
+
         if (mc->smp_props.clusters_supported && mc->smp_props.has_clusters) {
             if (cpus->cpus[n].props.cluster_id != cluster_id) {
+                uint32_t priv_rsrc[2];
+                int num_priv = 0;
+
+                if (!share_structs && ms->smp.cache_cluster_start_level) {
+                    build_caches_subset(table_data, pptt_start, num_caches,
+                                        caches, true, n,
+                                        top_cluster,
+                                        ms->smp.cache_cluster_start_level,
+                                        &cluster_data_offset,
+                                        &cluster_instr_offset);
+                    top_core = ms->smp.cache_cluster_start_level - 1;
+                }
+                priv_rsrc[0] = cluster_instr_offset;
+                priv_rsrc[1] = cluster_data_offset;
+
                 assert(cpus->cpus[n].props.cluster_id > cluster_id);
                 cluster_id = cpus->cpus[n].props.cluster_id;
                 core_id = -1;
                 cluster_offset = table_data->len - pptt_start;
+
+                if (cluster_instr_offset || cluster_data_offset) {
+                    num_priv = cluster_instr_offset == cluster_data_offset ?
+                        1 : 2;
+                }
                 build_processor_hierarchy_node(table_data,
                     (0 << 0), /* Not a physical package */
-                    socket_offset, cluster_id, NULL, 0);
+                    socket_offset, cluster_id, priv_rsrc, num_priv);
             }
         } else {
             cluster_offset = socket_offset;
         }
 
+        if (!share_structs &&
+            cpus->cpus[n].props.core_id != core_id) {
+            build_caches_subset(table_data, pptt_start, num_caches, caches,
+                                true, n,
+                                top_core , 0,
+                                &l1_data_offset, &l1_instr_offset);
+        }
         if (ms->smp.threads == 1) {
+            uint32_t priv_rsrc[2] = { l1_instr_offset, l1_data_offset };
+
             build_processor_hierarchy_node(table_data,
                 (1 << 1) | /* ACPI Processor ID valid */
                 (1 << 3),  /* Node is a Leaf */
-                cluster_offset, n, NULL, 0);
+                cluster_offset, n, priv_rsrc,
+                l1_instr_offset == l1_data_offset ? 1 : 2);
         } else {
             if (cpus->cpus[n].props.core_id != core_id) {
+                uint32_t priv_rsrc[2] = { l1_instr_offset, l1_data_offset };
+
                 assert(cpus->cpus[n].props.core_id > core_id);
                 core_id = cpus->cpus[n].props.core_id;
                 core_offset = table_data->len - pptt_start;
                 build_processor_hierarchy_node(table_data,
                     (0 << 0), /* Not a physical package */
-                    cluster_offset, core_id, NULL, 0);
+                    cluster_offset, core_id, priv_rsrc,
+                    l1_instr_offset == l1_data_offset ? 1 : 2);
             }
 
             build_processor_hierarchy_node(table_data,
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 6b674231c2..ec8fdcefff 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -922,6 +922,129 @@ static void acpi_align_size(GArray *blob, unsigned align)
     g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
 }
 
+static unsigned int virt_get_caches(VirtMachineState *vms,
+                                    ACPIPPTTCache *caches)
+{
+    ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0));
+    bool ccidx = cpu_isar_feature(any_ccidx, armcpu);
+    unsigned int num_cache, i;
+    int level_instr = 1, level_data = 1;
+
+    for (i = 0, num_cache = 0; i < 7; i++, num_cache++) {
+        int type = (armcpu->clidr >> (3 * i)) & 7;
+        int bank_index;
+        int level = 0;
+        ACPIPPTTCacheType cache_type = INSTRUCTION;
+
+        if (type == 0) {
+            break;
+        }
+
+        switch (type) {
+        case 1:
+            cache_type = INSTRUCTION;
+            level = level_instr;
+            break;
+        case 2:
+            cache_type = DATA;
+            level = level_data;
+            break;
+        case 4:
+            cache_type = UNIFIED;
+            level = level_instr > level_data ? level_instr : level_data;
+            break;
+        case 3: /* Split - Do data first */
+            cache_type = DATA;
+            level = level_data;
+            break;
+        }
+        /*
+         * ccsidr is indexed using both the level and whether it is
+         * an instruction cache. Unified caches use the same storage
+         * as data caches.
+         */
+        bank_index = (i * 2) | ((type == 1) ? 1 : 0);
+        if (ccidx) {
+            caches[num_cache] = (ACPIPPTTCache) {
+                .type =  cache_type,
+                .level = level,
+                .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
+                                             CCSIDR_EL1,
+                                             CCIDX_LINESIZE) + 4),
+                .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
+                                            CCSIDR_EL1,
+                                            CCIDX_ASSOCIATIVITY) + 1,
+                .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
+                                   CCIDX_NUMSETS) + 1,
+            };
+        } else {
+            caches[num_cache] = (ACPIPPTTCache) {
+                .type =  cache_type,
+                .level = level,
+                .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
+                                             CCSIDR_EL1, LINESIZE) + 4),
+                .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
+                                            CCSIDR_EL1,
+                                            ASSOCIATIVITY) + 1,
+                .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
+                                   NUMSETS) + 1,
+            };
+        }
+        caches[num_cache].size = caches[num_cache].associativity *
+            caches[num_cache].sets * caches[num_cache].linesize;
+
+        /* Break one 'split' entry up into two records */
+        if (type == 3) {
+            num_cache++;
+            bank_index = (i * 2) | 1;
+            if (ccidx) {
+                /* Instruction cache: bottom bit set when reading banked reg */
+                caches[num_cache] = (ACPIPPTTCache) {
+                    .type = INSTRUCTION,
+                    .level = level_instr,
+                    .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
+                                                 CCSIDR_EL1,
+                                                 CCIDX_LINESIZE) + 4),
+                    .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
+                                                CCSIDR_EL1,
+                                                CCIDX_ASSOCIATIVITY) + 1,
+                    .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
+                                       CCIDX_NUMSETS) + 1,
+                };
+            } else {
+                caches[num_cache] = (ACPIPPTTCache) {
+                    .type = INSTRUCTION,
+                    .level = level_instr,
+                    .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
+                                                 CCSIDR_EL1, LINESIZE) + 4),
+                    .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
+                                                CCSIDR_EL1,
+                                                ASSOCIATIVITY) + 1,
+                    .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
+                                       NUMSETS) + 1,
+                };
+            }
+            caches[num_cache].size = caches[num_cache].associativity *
+                caches[num_cache].sets * caches[num_cache].linesize;
+        }
+        switch (type) {
+        case 1:
+            level_instr++;
+            break;
+        case 2:
+            level_data++;
+            break;
+        case 3:
+        case 4:
+            level_instr++;
+            level_data++;
+            break;
+        }
+    }
+
+    return num_cache;
+}
+
 static
 void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 {
@@ -930,6 +1053,8 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     unsigned dsdt, xsdt;
     GArray *tables_blob = tables->table_data;
     MachineState *ms = MACHINE(vms);
+    ACPIPPTTCache caches[16]; /* Can select up to 16 */
+    unsigned int num_cache;
 
     table_offsets = g_array_new(false, true /* clear */,
                                         sizeof(uint32_t));
@@ -949,10 +1074,13 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     acpi_add_table(table_offsets, tables_blob);
     build_madt(tables_blob, tables->linker, vms);
 
+    num_cache = virt_get_caches(vms, caches);
+
     if (!vmc->no_cpu_topology) {
         acpi_add_table(table_offsets, tables_blob);
         build_pptt(tables_blob, tables->linker, ms,
-                   vms->oem_id, vms->oem_table_id);
+                   vms->oem_id, vms->oem_table_id,
+                   num_cache, caches);
     }
 
     acpi_add_table(table_offsets, tables_blob);
diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 0f4d9b6f7a..cbb0bf1bc7 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -81,6 +81,10 @@ void machine_parse_smp_config(MachineState *ms,
     unsigned cores   = config->has_cores ? config->cores : 0;
     unsigned threads = config->has_threads ? config->threads : 0;
     unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0;
+    unsigned cache_cl_start = config->has_cache_cluster_start_level ?
+        config->cache_cluster_start_level : 0;
+    unsigned cache_nd_start = config->has_cache_node_start_level ?
+        config->cache_node_start_level : 0;
 
     /*
      * Specified CPU topology parameters must be greater than zero,
@@ -161,6 +165,10 @@ void machine_parse_smp_config(MachineState *ms,
     ms->smp.max_cpus = maxcpus;
 
     mc->smp_props.has_clusters = config->has_clusters;
+    if (mc->smp_props.has_clusters) {
+        ms->smp.cache_cluster_start_level = cache_cl_start;
+        ms->smp.cache_node_start_level = cache_nd_start;
+    }
 
     /* sanity-check of the computed topology */
     if (sockets * dies * clusters * cores * threads != maxcpus) {
diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index 0b62c3a2f7..51d4ed9a19 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -439,7 +439,7 @@ static void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 
     acpi_add_table(table_offsets, tables_blob);
     build_pptt(tables_blob, tables->linker, machine,
-               lams->oem_id, lams->oem_table_id);
+               lams->oem_id, lams->oem_table_id, 0, NULL);
 
     acpi_add_table(table_offsets, tables_blob);
     build_srat(tables_blob, tables->linker, machine);
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max
  2023-08-08 11:57 [RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description Jonathan Cameron via
  2023-08-08 11:57 ` [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions Jonathan Cameron via
@ 2023-08-08 11:57 ` Jonathan Cameron via
  2023-08-14 10:13   ` Alex Bennée
  2023-08-08 11:57 ` [RFC PATCH 3/5] target/arm: Add support for MPAM CPU registers Jonathan Cameron via
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-08 11:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gavin Shan, linuxarm, James Morse, peter . maydell @ linaro . org,
	zhao1.liu, Alex Bennée, Shameerali Kolothum Thodi,
	Yicong Yang

Used to drive the MPAM cache intialization and to exercise more
of the PPTT cache entry generation code. Perhaps a default
L3 cache is acceptable for max?

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 target/arm/tcg/cpu64.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 8019f00bc3..2af67739f6 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
     uint64_t t;
     uint32_t u;
 
+    /*
+     * Expanded cache set
+     */
+    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
+    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
+    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
+    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
+    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 2MB L3 cache 128B line */
+    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 16MB L4 cache 128B line */
+    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 128B line */
+
     /*
      * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
      * one and try to apply errata workarounds or use impdef features we
@@ -828,6 +839,7 @@ void aarch64_max_tcg_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2);      /* FEAT_BBM at level 2 */
     t = FIELD_DP64(t, ID_AA64MMFR2, EVT, 2);      /* FEAT_EVT */
     t = FIELD_DP64(t, ID_AA64MMFR2, E0PD, 1);     /* FEAT_E0PD */
+    t = FIELD_DP64(t, ID_AA64MMFR2, CCIDX, 1);      /* FEAT_TTCNP */
     cpu->isar.id_aa64mmfr2 = t;
 
     t = cpu->isar.id_aa64zfr0;
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 3/5] target/arm: Add support for MPAM CPU registers
  2023-08-08 11:57 [RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description Jonathan Cameron via
  2023-08-08 11:57 ` [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions Jonathan Cameron via
  2023-08-08 11:57 ` [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max Jonathan Cameron via
@ 2023-08-08 11:57 ` Jonathan Cameron via
  2023-08-08 11:57 ` [RFC PATCH 4/5] hw/arm: Add MPAM emulation Jonathan Cameron via
  2023-08-08 11:57 ` [RFC PATCH 5/5] hw/arm/virt: Add MPAM MSCs for memory controllers and caches Jonathan Cameron via
  4 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-08 11:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gavin Shan, linuxarm, James Morse, peter . maydell @ linaro . org,
	zhao1.liu, Alex Bennée, Shameerali Kolothum Thodi,
	Yicong Yang

It is common to support MPAM on CPU cores, but not in the rest
of the system, so there is little disadvantage in always enabling
these.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 target/arm/cpu.h    | 15 +++++++++++++++
 target/arm/cpu.c    | 10 +++++++++-
 target/arm/helper.c | 30 ++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 88e5accda6..8d28e22291 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -735,6 +735,10 @@ typedef struct CPUArchState {
      * to keep the offsets into the rest of the structure smaller.
      */
     ARMVectorReg zarray[ARM_MAX_VQ * 16];
+
+    uint64_t mpam0_el1;
+    uint64_t mpam1_el1;
+
 #endif
 
     struct CPUBreakpoint *cpu_breakpoint[16];
@@ -1043,6 +1047,7 @@ struct ArchCPU {
         uint64_t id_aa64zfr0;
         uint64_t id_aa64smfr0;
         uint64_t reset_pmcr_el0;
+        uint64_t mpamidr_el1;
     } isar;
     uint64_t midr;
     uint32_t revidr;
@@ -2327,6 +2332,16 @@ FIELD(DBGDEVID, DOUBLELOCK, 20, 4)
 FIELD(DBGDEVID, AUXREGS, 24, 4)
 FIELD(DBGDEVID, CIDMASK, 28, 4)
 
+FIELD(MPAMIDR, PARTID_MAX, 0, 16)
+FIELD(MPAMIDR, HAS_HCR, 17, 1)
+FIELD(MPAMIDR, VMR_MAX, 18, 3)
+FIELD(MPAMIDR, PMG_MAX, 32, 8)
+FIELD(MPAMIDR, HAS_ALTSP, 57, 1)
+FIELD(MPAMIDR, HAS_TIDR, 58, 1)
+FIELD(MPAMIDR, SP4, 59, 1)
+FIELD(MPAMIDR, HAS_FORCE_NS, 60, 1)
+FIELD(MPAMIDR, HAS_SDEFLT, 61, 1)
+
 FIELD(MVFR0, SIMDREG, 0, 4)
 FIELD(MVFR0, FPSP, 4, 4)
 FIELD(MVFR0, FPDP, 8, 4)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 93c28d50e5..d85a3ec8a2 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -305,6 +305,9 @@ static void arm_cpu_reset_hold(Object *obj)
         env->cp15.rvbar = cpu->rvbar_prop;
         env->pc = env->cp15.rvbar;
 #endif
+
+        env->mpam1_el1 = 1ULL << 63;
+
     } else {
 #if defined(CONFIG_USER_ONLY)
         /* Userspace expects access to cp10 and cp11 for FP/Neon */
@@ -2097,7 +2100,12 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
             FIELD_DP32(cpu->isar.id_pfr0, ID_PFR0, AMU, 0);
         /* FEAT_MPAM (Memory Partitioning and Monitoring Extension) */
         cpu->isar.id_aa64pfr0 =
-            FIELD_DP64(cpu->isar.id_aa64pfr0, ID_AA64PFR0, MPAM, 0);
+            FIELD_DP64(cpu->isar.id_aa64pfr0, ID_AA64PFR0, MPAM, 1);
+        cpu->isar.mpamidr_el1 =
+            FIELD_DP64(cpu->isar.mpamidr_el1, MPAMIDR, PARTID_MAX, 63);
+        cpu->isar.mpamidr_el1 =
+            FIELD_DP64(cpu->isar.mpamidr_el1, MPAMIDR, PMG_MAX, 3);
+
         /* FEAT_NV (Nested Virtualization) */
         cpu->isar.id_aa64mmfr2 =
             FIELD_DP64(cpu->isar.id_aa64mmfr2, ID_AA64MMFR2, NV, 0);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 50f61e42ca..dbeb8d9fa6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8072,7 +8072,17 @@ static const ARMCPRegInfo actlr2_hactlr2_reginfo[] = {
       .access = PL2_RW, .type = ARM_CP_CONST,
       .resetvalue = 0 },
 };
+/*
+static uint64_t mpam_el1_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    return 0;
+}
 
+static void mpam_el1_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t val)
+{
+    return;
+}
+*/
 void register_cp_regs_for_features(ARMCPU *cpu)
 {
     /* Register all the coprocessor registers based on feature bits */
@@ -8404,6 +8414,26 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa64_tid3,
               .resetvalue = 0 },
+
+            /* Should be separate feature */
+            { .name = "MPAMIDR_EL1", .state = ARM_CP_STATE_AA64,
+              .opc0 = 3, .opc1 = 0, .crn = 0xa, .crm = 4, .opc2 = 4,
+              .access = PL1_R, .type = ARM_CP_CONST,
+              .accessfn = access_aa64_tid3,
+              .resetvalue = cpu->isar.mpamidr_el1 },
+            /* TODO: check the accessfn and whether we need a reset value for these */
+            { .name = "MPAM0_EL1", .state = ARM_CP_STATE_AA64,
+              .opc0 = 3, .opc1 = 0, .crn = 0xa, .crm = 5, .opc2 = 1,
+              .access = PL1_RW, .type = ARM_CP_ALIAS,
+              .accessfn = access_aa64_tid3,
+              .fieldoffset = offsetof(CPUARMState, mpam0_el1),
+            },
+            { .name = "MPAM1_EL1", .state = ARM_CP_STATE_AA64,
+              .opc0 = 3, .opc1 = 0, .crn = 0xa, .crm = 5, .opc2 = 0,
+              .access = PL1_RW, .type = ARM_CP_ALIAS,
+              .accessfn = access_aa64_tid3,
+              .fieldoffset = offsetof(CPUARMState, mpam1_el1),
+            },
             { .name = "MVFR0_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 0,
               .access = PL1_R, .type = ARM_CP_CONST,
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 4/5] hw/arm: Add MPAM emulation.
  2023-08-08 11:57 [RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description Jonathan Cameron via
                   ` (2 preceding siblings ...)
  2023-08-08 11:57 ` [RFC PATCH 3/5] target/arm: Add support for MPAM CPU registers Jonathan Cameron via
@ 2023-08-08 11:57 ` Jonathan Cameron via
  2023-08-08 11:57 ` [RFC PATCH 5/5] hw/arm/virt: Add MPAM MSCs for memory controllers and caches Jonathan Cameron via
  4 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-08 11:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gavin Shan, linuxarm, James Morse, peter . maydell @ linaro . org,
	zhao1.liu, Alex Bennée, Shameerali Kolothum Thodi,
	Yicong Yang

Note this doesn't 'do' anything other than provide an introspection
interface.  The intent here is to support work on the Linux kernel support
and for that a functional emulation of the interface is useful.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 qapi/mpam.json           |  78 ++++
 qapi/qapi-schema.json    |   1 +
 include/hw/arm/mpam.h    |  12 +
 include/hw/arm/virt.h    |   2 +
 hw/arm/mpam-qapi-stubs.c |   9 +
 hw/arm/mpam-qapi.c       |  58 +++
 hw/arm/mpam.c            | 886 +++++++++++++++++++++++++++++++++++++++
 hw/arm/Kconfig           |   4 +
 hw/arm/meson.build       |   4 +
 qapi/meson.build         |   1 +
 10 files changed, 1055 insertions(+)

diff --git a/qapi/mpam.json b/qapi/mpam.json
new file mode 100644
index 0000000000..f4990ef96b
--- /dev/null
+++ b/qapi/mpam.json
@@ -0,0 +1,78 @@
+# -*- Mode: Python -*-
+# vim: filetype=python
+
+##
+# = ARM MPAM State Introspection
+##
+
+##
+# @MpamBm:
+##
+{ 'struct': 'MpamBm',
+  'data' : { 'words' : [ 'int' ]
+           }
+}
+
+##
+# @MpamRegs:
+#
+# Per RIS Register State
+#
+##
+{ 'struct' : 'MpamRegs',
+  'data' : { 'idr' : 'int',
+             'iidr' : 'int',
+             'aidr' : 'int',
+             'cpor-idr': 'int',
+             'ccap-idr': 'int',
+             'mbw-idr': 'int',
+             'pri-idr': 'int',
+             'partid-nrw-idr': 'int',
+             'msmon-idr': 'int',
+             'csumon-idr': 'int',
+             'mbwumon-idr': 'int',
+             'ecr': 'int',
+             'esr': 'int',
+             'cfg-part-sel': 'int',
+             'cfg-cpbm': ['MpamBm']#lazy
+          }
+}
+
+##
+# @MpamCacheInfo:
+#
+# Information about MPAM Cache MSCs
+#
+# @cpu: First CPU of the set associated with the cache
+#
+# @level: Level of cache
+#
+# @type: type of cache - make an enum
+#
+# Since: 9.0
+##
+
+{ 'struct': 'MpamCacheInfo',
+  'data' : { 'cpu' : 'int',
+             'level' : 'int',
+             'type' : 'int',
+             'regs' : ['MpamRegs']
+           }
+}
+
+##
+# @query-mpam-cache:
+#
+# Get a list of MpamInfo for all Cache Related MSC
+#
+# @level: Provide a cache level to filter against
+#
+# Returns: a list of @MpamCacheInfo describing all
+#   the MPAM cache MSC instances
+#
+# Since: 9.0
+##
+{ 'command': 'query-mpam-cache',
+  'data': { '*level': 'int' },
+  'returns' : ['MpamCacheInfo'],
+  'allow-preconfig': true }
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 6594afba31..ea3ee75841 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -79,3 +79,4 @@
 { 'include': 'virtio.json' }
 { 'include': 'cryptodev.json' }
 { 'include': 'cxl.json' }
+{ 'include': 'mpam.json' }
diff --git a/include/hw/arm/mpam.h b/include/hw/arm/mpam.h
new file mode 100644
index 0000000000..7bd88d57bc
--- /dev/null
+++ b/include/hw/arm/mpam.h
@@ -0,0 +1,12 @@
+#ifndef _MPAM_H_
+#define _MPAM_H_
+
+#include "qom/object.h"
+#include "qapi/qapi-commands-mpam.h"
+
+#define TYPE_MPAM_MSC_MEM "mpam-msc-mem"
+#define TYPE_MPAM_MSC_CACHE "mpam-msc-cache"
+
+void mpam_cache_fill_info(Object *obj, MpamCacheInfo *info);
+
+#endif /* _MPAM_H_ */
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index e1ddbea96b..ac015a391a 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -71,6 +71,7 @@ enum {
     VIRT_SMMU,
     VIRT_UART,
     VIRT_MMIO,
+    VIRT_MPAM_MSC,
     VIRT_RTC,
     VIRT_FW_CFG,
     VIRT_PCIE,
@@ -160,6 +161,7 @@ struct VirtMachineState {
     bool ras;
     bool mte;
     bool dtb_randomness;
+    bool mpam, mpam_min_msc;
     OnOffAuto acpi;
     VirtGICType gic_version;
     VirtIOMMUType iommu;
diff --git a/hw/arm/mpam-qapi-stubs.c b/hw/arm/mpam-qapi-stubs.c
new file mode 100644
index 0000000000..40ccb7de9a
--- /dev/null
+++ b/hw/arm/mpam-qapi-stubs.c
@@ -0,0 +1,9 @@
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-mpam.h"
+
+MpamCacheInfoList *qmp_query_mpam_cache(bool has_level, int64_t level, Error **errp)
+{
+    return NULL;
+}
diff --git a/hw/arm/mpam-qapi.c b/hw/arm/mpam-qapi.c
new file mode 100644
index 0000000000..cf027e0da9
--- /dev/null
+++ b/hw/arm/mpam-qapi.c
@@ -0,0 +1,58 @@
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/arm/mpam.h"
+#include "qom/object.h"
+#include "qapi/qapi-commands-mpam.h"
+
+typedef struct MPAMQueryState {
+    Error **errp;
+    MpamCacheInfoList **head;
+    bool level_filter_on;
+    int level;
+} MPAMQueryState;
+
+static int mpam_query_cache(Object *obj, void *opaque)
+{
+    MPAMQueryState *state = opaque;
+    MpamCacheInfoList *infolist;
+    MpamCacheInfo *info;
+
+    if (!object_dynamic_cast(obj, TYPE_MPAM_MSC_CACHE)) {
+        return 0;
+    }
+    if (state->level_filter_on &&
+        object_property_get_uint(obj, "cache-level", state->errp) !=
+        state->level) {
+        return 0;
+    }
+
+    infolist = g_malloc0(sizeof(*infolist));
+    info = g_malloc0(sizeof(*info));
+
+    mpam_cache_fill_info(obj, info);
+
+    infolist->value = info;
+
+    *state->head = infolist;
+    state->head = &infolist->next;
+
+    return 0;
+}
+
+MpamCacheInfoList *qmp_query_mpam_cache(bool has_level, int64_t level,
+                                        Error **errp)
+{
+
+    MpamCacheInfoList *head = NULL;
+    MPAMQueryState state = {
+        .errp = errp,
+        .head = &head,
+        .level_filter_on = has_level,
+        .level = level,
+    };
+
+    object_child_foreach_recursive(object_get_root(), mpam_query_cache, &state);
+
+    return head;
+}
diff --git a/hw/arm/mpam.c b/hw/arm/mpam.c
new file mode 100644
index 0000000000..4b645efc2e
--- /dev/null
+++ b/hw/arm/mpam.c
@@ -0,0 +1,886 @@
+/*
+ * ARM MPAM emulation
+ *
+ * Copyright (c) 2023 Huawei
+ */
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+#include "hw/registerfields.h"
+#include "hw/sysbus.h"
+#include "qom/object.h"
+#include "hw/arm/mpam.h"
+
+REG64(MPAMF_IDR, 0)
+    FIELD(MPAMF_IDR, PART_ID_MAX, 0, 16)
+    FIELD(MPAMF_IDR, PMG_MAX, 16, 8)
+    FIELD(MPAMF_IDR, HAS_CCAP_PART, 24, 1)
+    FIELD(MPAMF_IDR, HAS_CPOR_PART, 25, 1)
+    FIELD(MPAMF_IDR, HAS_MBW_PART, 26, 1)
+    FIELD(MPAMF_IDR, HAS_PRI_PART, 27, 1)
+    FIELD(MPAMF_IDR, EXT, 28, 1)
+    FIELD(MPAMF_IDR, HAS_IMPL_IDR, 29, 1)
+    FIELD(MPAMF_IDR, HAS_MSMON, 30, 1)
+    FIELD(MPAMF_IDR, HAS_PARTID_NRW, 31, 1)
+    FIELD(MPAMF_IDR, HAS_RIS, 32, 1)
+    FIELD(MPAMF_IDR, NO_IMPL_PART, 36, 1)
+    FIELD(MPAMF_IDR, NO_IMPL_MSMON, 37, 1)
+    FIELD(MPAMF_IDR, HAS_EXTD_ESR, 38, 1)
+    FIELD(MPAMF_IDR, HAS_ESR, 39, 1)
+    FIELD(MPAMF_IDR, HAS_ERR_MS, 40, 1)
+    FIELD(MPAMF_IDR, SP4, 41, 1)
+    FIELD(MPAMF_IDR, HAS_ENDIS, 42, 1)
+    FIELD(MPAMF_IDR, HAS_NFU, 43, 1)
+    FIELD(MPAMF_IDR, RIS_MAX, 56, 4)
+
+REG32(MPAMF_IIDR, 0x0018)
+    FIELD(MPAMF_IIDR, IMPLEMENTER, 0, 12)
+    FIELD(MPAMF_IIDR, REVISION, 12, 4)
+    FIELD(MPAMF_IIDR, VARIANT, 16, 4)
+    FIELD(MPAMF_IIDR, PRODUCT_ID, 20, 12)
+
+REG32(MPAMF_AIDR, 0x0020)
+    FIELD(MPAMF_AIDR, ARCH_MINOR_REV, 0, 4)
+    FIELD(MPAMF_AIDR, ARCH_MAJOR_REV, 4, 4)
+
+REG32(MPAMF_IMPL_IDR, 0x0028)
+REG32(MPAMF_CPOR_IDR, 0x0030)
+    FIELD(MPAMF_CPOR_IDR, CPBM_WD, 0, 16)
+
+REG32(MPAMF_CCAP_IDR, 0x0038)
+    FIELD(MPAMF_CCAP_IDR, CMAX_WD, 0, 6)
+    FIELD(MPAMF_CCAP_IDR, CASSOC_WD, 8, 5)
+    FIELD(MPAMF_CCAP_IDR, HAS_CASSOC, 28, 1)
+    FIELD(MPAMF_CCAP_IDR, HAS_CMIN, 29, 1)
+    FIELD(MPAMF_CCAP_IDR, NO_CMAX, 30, 1)
+    FIELD(MPAMF_CCAP_IDR, HAS_CMAX_SOFTLIM, 31, 1)
+
+REG32(MPAMF_MBW_IDR, 0x0040)
+    FIELD(MPAMF_MBW_IDR, BWA_WD, 0, 6)
+    FIELD(MPAMF_MBW_IDR, HAS_MIN, 10, 1)
+    FIELD(MPAMF_MBW_IDR, HAS_MAX, 11, 1)
+    FIELD(MPAMF_MBW_IDR, HAS_PBM, 12, 1)
+    FIELD(MPAMF_MBW_IDR, HAS_PROP, 13, 1)
+    FIELD(MPAMF_MBW_IDR, WINDWR, 14, 1)
+    FIELD(MPAMF_MBW_IDR, BWPBM_WD, 16, 13)
+
+REG32(MPAMF_PRI_IDR, 0x0048)
+    FIELD(MPAMF_PRI_IDR, HAS_INTPRI, 0, 1)
+    FIELD(MPAMF_PRI_IDR, INTPRI_0_IS_LOW, 1, 1)
+    FIELD(MPAMF_PRI_IDR, INTPRI_WD, 4, 6)
+    FIELD(MPAMF_PRI_IDR, HAS_DSPRI, 16, 1)
+    FIELD(MPAMF_PRI_IDR, DSPRI_0_IS_LOW, 17, 1)
+    FIELD(MPAMF_PRI_IDR, DSPRI_WD, 20, 6)
+
+REG32(MPAMF_PARTID_NRW_IDR, 0x0050)
+    FIELD(MPAMF_PARTID_NRW_IDR, INTPARTID_MAX, 0, 16)
+
+REG32(MPAMF_MSMON_IDR, 0x080)
+    FIELD(MPAMF_MSMON_IDR, MSMON_CSU, 16, 1)
+    FIELD(MPAMF_MSMON_IDR, MSMON_MBWU, 17, 1)
+    FIELD(MPAMF_MSMON_IDR, HAS_OFLOW_SR, 28, 1)
+    FIELD(MPAMF_MSMON_IDR, HAS_OFLW_MS, 29, 1)
+    FIELD(MPAMF_MSMON_IDR, NO_OFLW_INTR, 30, 1)
+    FIELD(MPAMF_MSMON_IDR, HAS_LOCAL_CAPT_EVNT, 31, 1)
+
+REG32(MPAMF_CSUMON_IDR, 0x0088)
+    FIELD(MPAMF_CSUMON_IDR, NUM_MON, 0, 16)
+    FIELD(MPAMF_CSUMON_IDR, HAS_OFLOW_CAPT, 24, 1)
+    FIELD(MPAMF_CSUMON_IDR, HAS_CEVNT_OFLW, 25, 1)
+    FIELD(MPAMF_CSUMON_IDR, HAS_OFSR, 26, 1)
+    FIELD(MPAMF_CSUMON_IDR, HAS_OFLOW_LNKG, 27, 1)
+    FIELD(MPAMF_CSUMON_IDR, HAS_XCL, 29, 1)
+    FIELD(MPAMF_CSUMON_IDR, CSU_RO, 30, 1)
+    FIELD(MPAMF_CSUMON_IDR, HAS_CAPTURE, 31, 1)
+
+REG32(MPAMF_MBWUMON_IDR, 0x0090)
+    FIELD(MPAMF_MBWUMON_IDR, NUM_MON, 0, 16)
+    FIELD(MPAMF_MBWUMON_IDR, SCALE, 16, 5)
+    FIELD(MPAMF_MBWUMON_IDR, HAS_OFLOW_CAPT, 24, 1)
+    FIELD(MPAMF_MBWUMON_IDR, HAS_CEVNT_OFLW, 25, 1)
+    FIELD(MPAMF_MBWUMON_IDR, HAS_OFSR, 26, 1)
+    FIELD(MPAMF_MBWUMON_IDR, HAS_OFLOW_LNKG, 27, 1)
+    FIELD(MPAMF_MBWUMON_IDR, HAS_RWBW, 28, 1)
+    FIELD(MPAMF_MBWUMON_IDR, LWD, 29, 1)
+    FIELD(MPAMF_MBWUMON_IDR, HAS_LONG, 30, 1)
+    FIELD(MPAMF_MBWUMON_IDR, HAS_CAPTURE, 31, 1)
+
+REG32(MPAMF_ERR_MSI_MPAM, 0x00dc)
+REG32(MPAMF_ERR_MSI_ADDR_L, 0x00e0)
+REG32(MPAMF_ERR_MSI_ADDR_H, 0x00e4)
+REG32(MPAMF_ERR_MSI_DATA, 0x00e8)
+REG32(MPAMF_ERR_MSI_ATTR, 0x00ec)
+
+REG32(MPAMF_ECR, 0x00f0)
+    FIELD(MPAMF_ECR, INTEN, 0, 1)
+#define MPAMF_ECR_WRITE_MASK ( \
+    R_MPAMF_ECR_INTEN_MASK)
+
+REG64(MPAMF_ESR, 0x00f8)
+    FIELD(MPAMF_ESR, PARID_MON, 0, 16)
+    FIELD(MPAMF_ESR, PMG, 16, 8)
+    FIELD(MPAMF_ESR, ERR_CODE, 24, 4)
+    FIELD(MPAMF_ESR, OVRWR, 31, 1)
+    FIELD(MPAMF_ESR, RIS, 32, 4)
+
+REG32(MPAMF_CFG_PART_SEL, 0x0100)
+    FIELD(MPAMF_CFG_PART_SEL, PARTID_SEL, 0, 16)
+    FIELD(MPAMF_CFG_PART_SEL, INTERNAL, 16, 1)
+    FIELD(MPAMF_CFG_PART_SEL, RIS, 24, 4)
+#define MPAMF_CFG_PART_SEL_WRITE_MASK ( \
+    R_MPAMF_CFG_PART_SEL_PARTID_SEL_MASK | \
+    R_MPAMF_CFG_PART_SEL_INTERNAL_MASK | \
+    R_MPAMF_CFG_PART_SEL_RIS_MASK)
+
+REG32(MPAMF_MPAMCFG_CMAX, 0x0108)
+    FIELD(MPAMF_MPAMCFG_CMAX, CMAX, 0, 16)
+    FIELD(MPAMF_MPAMCFG_CMAX, SOFTLIM, 31, 1)
+#define MPAMF_MPAMCFG_CMAX_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_CMAX_CMAX_MASK | \
+    R_MPAMF_MPAMCFG_CMAX_SOFTLIM_MASK)
+
+REG32(MPAMF_MPAMCFG_CMIN, 0x0110)
+    FIELD(MPAMF_MPAMCFG_CMIN, CMIN, 0, 16)
+#define MPAMF_MPAMCFG_CMIN_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_CMIN_CMIN_MASK)
+
+REG32(MPAMF_MPAMCFG_CASSOC, 0x0118)
+    FIELD(MPAMF_MPAMCFG_CASSOC, CASSOC, 0, 16)
+#define MPAMF_MPAMCFG_CASSOC_WRITE_MASK ( \
+    R_MPAMF_MPAMCCG_CASSOC_CASSOC_MASK)
+
+REG32(MPAMF_MPAMCFG_MBW_MIN, 0x0200)
+    FIELD(MPAMF_MPAMCFG_MBW_MIN, MIN, 0, 16)
+#define MPAMF_MPAMCFG_MBW_MIN_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_MBW_MIN_MASK)
+
+REG32(MPAMF_MPAMCFG_MBW_MAX, 0x0208)
+    FIELD(MPAMF_MPAMCFG_MBW_MAX, MAX, 0, 16)
+    FIELD(MPAMF_MPAMCFG_MBW_MAX, HARDLIM, 31, 1)
+#define MPAMF_MPAMCFG_MBW_MAX_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_MBW_MAX_MAX_MASK | \
+    R_MPAMF_MPAMCFG_MBW_MAX_HARDLIM_MASK)
+
+REG32(MPAMF_MPAMCFG_WINWD, 0x0220)
+    FIELD(MPAMF_MPAMCFG_WINWD, US_FRAC, 0, 8)
+    FIELD(MPAMF_MPAMCFG_WINWD, US_INT, 8, 16)
+#define MPAMF_MPAMCFG_WINWD_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_WINWD_US_FRAC | \
+    R_MPAMF_MPAMCFG_WINWD_US_INT)
+
+REG32(MPAMF_MPAMCFG_EN, 0x0300)
+    FIELD(MPAMF_MPAMCFG_EN, PARTID, 0, 16)
+#define MPAMF_MPAMCFG_EN_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_EN_PARTID_MASK)
+
+REG32(MPAMF_MPAMCFG_DIS, 0x0310) /* WHat is this for? */
+    FIELD(MPAMF_MPAMCFG_DIS, PARTID, 0, 16)
+    FIELD(MPAMF_MPAMCFG_DIS, NFU, 31, 1)
+#define MPAMF_MPAMCFG_DIS_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_DIS_PARTID_MASK | \
+    R_MPAMF_MPAMCFG_DIS_NFU)
+
+REG32(MPAMF_MPAMCFG_EN_FLAGS, 0x320)
+
+REG32(MPAMF_MPAMCFG_PRI, 0x400)
+    FIELD(MPAMF_MPAMCFG_PRI, INTPRI, 0, 16)
+    FIELD(MPAMF_MPAMCFG_PRI, DSPRI, 16, 16)
+#define MPAMF_MPAMCFG_PRI_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_PRI_INTPRI_MASK | \
+    R_MPAMF_MPAMCFG_PRI_DSPRI_MASK)
+
+REG32(MPAMF_MPAMCFG_MBW_PROP, 0x500)
+    FIELD(MPAMF_MPAMCFG_MBW_PROP, STRIDEM1, 0, 16)
+    FIELD(MPAMF_MPAMCFG_MBW_PROP, EN, 31, 1)
+#define MPAMF_MPAMCFG_MBW_PROP_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_MBW_PROP_STRIDEM1_MASK | \
+    R_MPAMF_MPAMCFG_MBW_PROP_EN_MASK)
+
+REG32(MPAMF_MPAMCFG_INTPARTID, 0x600)
+    FIELD(MPAMF_MPAMCFG_INTPARTID, INTPARTID, 0, 16)
+    FIELD(MPAMF_MPAMCFG_INTPARTID, INTERNAL, 16, 1)
+#define MPAMF_MPAMCFG_INTPARTID_WRITE_MASK ( \
+    R_MPAMF_MPAMCFG_INTPARTID_INTPARTID_MASK | \
+    R_MPAMF_MPAMCFG_INTPARTID_INTERNAL_MASK)
+
+REG32(MPAMF_MPAMCFG_CPBM0, 0x1000)
+
+REG32(MPAMF_MPAMCFG_MBW_PBM0, 0x2000)
+
+#define TYPE_MPAM_MSC "mpam-msc"
+#define MPAM_MBW_PART 4
+#define MPAM_CACHE_PART 32
+
+typedef struct MpamfPerNrwId {
+        uint32_t cfg_cpbm[(MPAM_CACHE_PART + 31) / 32];
+        uint32_t cfg_mbw_pbm[(MPAM_MBW_PART + 31) / 32];
+        uint32_t cfg_pri;
+        uint32_t cfg_cmax;
+        uint32_t cfg_mbw_min;
+        uint32_t cfg_mbw_max;
+        uint32_t cfg_mbw_prop;
+} MpamfPerNrwId;
+
+typedef struct Mpamf {
+    uint64_t idr;
+    uint32_t iidr;
+    uint32_t aidr;
+    uint32_t impl_idr;
+    uint32_t cpor_idr;
+    uint32_t ccap_idr;
+    uint32_t mbw_idr;
+    uint32_t pri_idr;
+    uint32_t partid_nrw_idr;
+    uint32_t msmon_idr;
+    uint32_t csumon_idr;
+    uint32_t mbwumon_idr;
+    uint32_t err_msi_mpam;
+    uint32_t err_msi_addr_l;
+    uint32_t err_msi_addr_h;
+    uint32_t err_msi_data;
+    uint32_t err_msi_attr;
+    uint32_t ecr;
+    uint32_t esr;
+    uint32_t cfg_part_sel;
+    uint32_t *cfg_intpartid;
+
+    MpamfPerNrwId *per_nrw_id;
+
+} Mpamf;
+
+typedef struct MPAMMSCState {
+    SysBusDevice parent_obj;
+
+    Mpamf *mpamf;
+
+    uint8_t ris;
+    uint16_t part_sel; /* Technically per ris, but in same reg */
+    bool internal_part_sel;
+    struct MemoryRegion mr;
+    uint32_t num_partid;
+    uint32_t num_int_partid;
+    uint8_t num_ris;
+
+} MPAMMSCState;
+
+/*
+ * ID narrowing may be in effect.  If it is there is an indirection
+ * table per RIS mapping from part_sel to the internal ID. To make things
+ * more complex, the Partition selection register can directly address
+ * internal IDs. That works for everything other than the ID map itself.
+ * This function pulls the right internal ID out of this complexity
+ * for use in accessing the per_nrw_id structures.
+ */
+static uint32_t mpam_get_nrw_id(MPAMMSCState *s)
+{
+    Mpamf *mpamf = &s->mpamf[s->ris];
+
+    if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_PARTID_NRW)) {
+        return s->part_sel;
+    }
+    if (s->internal_part_sel) {
+        return s->part_sel;
+    }
+    return mpamf->cfg_intpartid[s->part_sel];
+}
+
+typedef struct MPAMMSCMemState {
+    MPAMMSCState parent;
+} MPAMMSCMemState;
+
+typedef struct MPAMMSCCacheState {
+    MPAMMSCState parent;
+    uint8_t cache_level;
+    uint8_t cache_type;
+    uint16_t cpu;
+
+} MPAMMSCCacheState;
+
+DECLARE_INSTANCE_CHECKER(MPAMMSCState, MPAM_MSC_DEVICE, TYPE_MPAM_MSC);
+
+DECLARE_INSTANCE_CHECKER(MPAMMSCMemState, MPAM_MSC_MEM_DEVICE,
+                         TYPE_MPAM_MSC_MEM);
+
+DECLARE_INSTANCE_CHECKER(MPAMMSCCacheState, MPAM_MSC_CACHE_DEVICE,
+                         TYPE_MPAM_MSC_CACHE);
+
+void mpam_cache_fill_info(Object *obj, MpamCacheInfo *info)
+{
+    MPAMMSCCacheState *cs = MPAM_MSC_CACHE_DEVICE(obj);
+    MPAMMSCState *s = MPAM_MSC_DEVICE(obj);
+    MpamRegsList *reg_list = NULL, **r_next = &reg_list;
+    int i, p, b;
+
+    info->cpu = cs->cpu;
+    info->level = cs->cache_level;
+    info->type = cs->cache_level;
+    for (i = 0; i < s->num_ris; i++) {
+        MpamRegsList *regs;
+        MpamRegs *r;
+        MpamBmList *bm_list = NULL, **bm_next = &bm_list;
+
+        Mpamf *mpamf = &s->mpamf[i];
+
+        r = g_malloc0(sizeof(*r));
+        regs = g_malloc0(sizeof(*regs));
+        regs->value = r;
+
+        *r = (MpamRegs) {
+            .idr = mpamf->idr,
+            .iidr = mpamf->iidr,
+            .aidr = mpamf->aidr,
+            .cpor_idr = mpamf->cpor_idr,
+            .ccap_idr = mpamf->ccap_idr,
+            .mbw_idr = mpamf->mbw_idr,
+            .pri_idr = mpamf->pri_idr,
+            .partid_nrw_idr = mpamf->partid_nrw_idr,
+            .msmon_idr = mpamf->msmon_idr,
+            .csumon_idr = mpamf->csumon_idr,
+            .mbwumon_idr = mpamf->mbwumon_idr,
+            .ecr = mpamf->ecr,
+            .esr = mpamf->esr,
+            .cfg_part_sel = mpamf->cfg_part_sel, /* Garbage */
+        };
+
+        /* This is annoyingly complex */
+        for (p = 0; p < s->num_int_partid; p++) {
+            intList *w_list = NULL, **w_next = &w_list;
+            MpamBm *bm = g_malloc0(sizeof(*bm));
+            MpamBmList *bml = g_malloc0(sizeof(*bml));
+
+            bml->value = bm;
+
+            for (b = 0; b < (MPAM_CACHE_PART + 31) / 32; b++) {
+                intList *il = g_malloc0(sizeof(*il));
+
+                il->value = mpamf->per_nrw_id[p].cfg_cpbm[b];
+                *w_next = il;
+                w_next = &il->next;
+
+            }
+            *bm_next = bml;
+            bm_next = &bml->next;
+            bm->words = w_list;
+        }
+        r->cfg_cpbm = bm_list;
+        *r_next = regs;
+        r_next = &regs->next;
+    }
+    info->regs = reg_list;
+
+}
+
+static uint64_t mpam_msc_read_reg(void *opaque, hwaddr offset,
+                                  unsigned size)
+{
+    MPAMMSCState *s = MPAM_MSC_DEVICE(opaque);
+    Mpamf *mpamf = &s->mpamf[s->ris];
+    uint32_t nrw_part_sel = mpam_get_nrw_id(s);
+
+    switch (offset) {
+    case A_MPAMF_IDR:
+        switch (size) {
+        case 4:
+            return mpamf->idr & 0xffffffff;
+        case 8:
+            return mpamf->idr;
+        default:
+            qemu_log_mask(LOG_UNIMP, "MPAM: Unexpected read size\n");
+            return 0;
+        }
+    case A_MPAMF_IDR + 0x04:
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, EXT)) {
+            qemu_log_mask(LOG_UNIMP, "MPAM: Unexpected read of top of IDR\n");
+            return 0;
+        }
+        switch (size) {
+        case 4:
+            return mpamf->idr >> 32;
+        default:
+            qemu_log_mask(LOG_UNIMP, "MPAM: Unexpected read size\n");
+            return 0;
+        }
+    case A_MPAMF_IIDR:
+        return mpamf->iidr;
+    case A_MPAMF_AIDR:
+        return mpamf->aidr;
+    case A_MPAMF_IMPL_IDR:
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_IMPL_IDR)) {
+            qemu_log_mask(LOG_UNIMP,
+                "MPAM: Accessing IMPL_IDR which isn't suported\n");
+            return 0;
+        }
+        return mpamf->impl_idr;
+    case A_MPAMF_CPOR_IDR:
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_CPOR_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                "MPAM: Unexpected read of CPOR_IDR with no CPOR support\n");
+            return 0;
+        }
+        return mpamf->cpor_idr;
+    case A_MPAMF_CCAP_IDR:
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_CCAP_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                "MPAM: Unexpected read of CCAP_IDR with no CCAP support\n");
+            return 0;
+        }
+        return mpamf->ccap_idr;
+    case A_MPAMF_MBW_IDR:
+        return mpamf->mbw_idr;
+    case A_MPAMF_PRI_IDR:
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_PRI_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                "MPAM: Unexpected read of PRI_IDR with no PRI PART support\n");
+            return 0;
+        }
+        return mpamf->pri_idr;
+    case A_MPAMF_PARTID_NRW_IDR: {
+        return mpamf->partid_nrw_idr;
+    }
+    case A_MPAMF_MSMON_IDR:
+        return mpamf->msmon_idr;
+    case A_MPAMF_CSUMON_IDR:
+        return mpamf->csumon_idr;
+    case A_MPAMF_MBWUMON_IDR:
+        return mpamf->mbwumon_idr;
+    case A_MPAMF_ERR_MSI_MPAM:
+        return mpamf->err_msi_mpam;
+    case A_MPAMF_ERR_MSI_ADDR_L:
+        return mpamf->err_msi_addr_l;
+    case A_MPAMF_ERR_MSI_ADDR_H:
+        return mpamf->err_msi_addr_h;
+    case A_MPAMF_ERR_MSI_DATA:
+        return mpamf->err_msi_data;
+    case A_MPAMF_ERR_MSI_ATTR:
+        return mpamf->err_msi_attr;
+    case A_MPAMF_ECR:
+        return mpamf->ecr;
+    case A_MPAMF_ESR:
+        return mpamf->esr;
+    case A_MPAMF_CFG_PART_SEL:
+        return mpamf->cfg_part_sel;
+    case A_MPAMF_MPAMCFG_CMAX:
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_CCAP_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                "MPAM: Unexpected read of CMAX with no CCAP support\n");
+            return 0;
+        }
+        return mpamf->per_nrw_id[nrw_part_sel].cfg_cmax;
+    case A_MPAMF_MPAMCFG_MBW_MIN:
+        return mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_min;
+    case A_MPAMF_MPAMCFG_MBW_MAX:
+        return mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_max;
+    case A_MPAMF_MPAMCFG_PRI:
+        return mpamf->per_nrw_id[nrw_part_sel].cfg_pri;
+    case A_MPAMF_MPAMCFG_MBW_PROP:
+        return mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_prop;
+    case A_MPAMF_MPAMCFG_CPBM0...
+        (A_MPAMF_MPAMCFG_CPBM0 + ((MPAM_CACHE_PART + 31) / 32 - 1) * 4):
+    {
+        uint32_t array_offset;
+
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_CPOR_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                "MPAM: Unexpected read of CPBM with no CPOR support\n");
+            return 0;
+        }
+        array_offset = (offset - A_MPAMF_MPAMCFG_CPBM0) / sizeof(uint32_t);
+        return mpamf->per_nrw_id[nrw_part_sel].cfg_cpbm[array_offset];
+    }
+    case A_MPAMF_MPAMCFG_MBW_PBM0...
+            (A_MPAMF_MPAMCFG_MBW_PBM0 + ((MPAM_MBW_PART + 31) / 32 - 1) * 4):
+    {
+        uint32_t array_offset;
+
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_MBW_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                "MPAM: Unexpected read of MBW_PBM with no MBW_PART support\n");
+            return 0;
+        }
+        array_offset = (offset - A_MPAMF_MPAMCFG_MBW_PBM0) / sizeof(uint32_t);
+        return mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_pbm[array_offset];
+    }
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "MPAM: Unexpected read of %lx\n", offset);
+        return 0x0;
+    }
+}
+
+static void mpam_msc_write_reg(void *opaque, hwaddr offset, uint64_t value,
+                               unsigned size)
+{
+    MPAMMSCState *s = MPAM_MSC_DEVICE(opaque);
+    /* Update in the ris setting path */
+    Mpamf *mpamf = &s->mpamf[s->ris];
+    /* Update if cfg_intpartid being touched */
+    uint32_t nrw_part_sel = mpam_get_nrw_id(s);
+
+    switch (offset) {
+    case A_MPAMF_CFG_PART_SEL:
+        if (value & ~MPAMF_CFG_PART_SEL_WRITE_MASK) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to CFG_PART_SEL Mask=%x Value=%lx\n",
+                          MPAMF_CFG_PART_SEL_WRITE_MASK, value);
+        }
+        /* Field matches for all RIS */
+        if (!FIELD_EX64(mpamf->idr, MPAMF_IDR, HAS_RIS) &&
+            FIELD_EX32(value, MPAMF_CFG_PART_SEL, RIS) != 0) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write of non 0 RIS on MSC with !HAS_RIS\n");
+            return;
+        }
+        s->ris = FIELD_EX32(value, MPAMF_CFG_PART_SEL, RIS);
+        s->part_sel = FIELD_EX32(value, MPAMF_CFG_PART_SEL, PARTID_SEL);
+        s->internal_part_sel = FIELD_EX32(value, MPAMF_CFG_PART_SEL, INTERNAL);
+        mpamf = &s->mpamf[s->ris];
+        mpamf->cfg_part_sel = value;
+        return;
+    case A_MPAMF_MPAMCFG_CMAX:
+        if (value & ~MPAMF_MPAMCFG_CMAX_WRITE_MASK) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to CMAX Mask=%x Value=%lx\n",
+                          MPAMF_MPAMCFG_CMAX_WRITE_MASK, value);
+        }
+        mpamf->per_nrw_id[nrw_part_sel].cfg_cmax = value;
+        return;
+    case A_MPAMF_MPAMCFG_MBW_MIN:
+        if (value & ~MPAMF_MPAMCFG_CMIN_WRITE_MASK) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to CMAX Mask=%x Value=%lx\n",
+                          MPAMF_MPAMCFG_CMIN_WRITE_MASK, value);
+        }
+        mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_min = value;
+        return;
+    case A_MPAMF_MPAMCFG_MBW_MAX:
+        if (value & ~MPAMF_MPAMCFG_MBW_MAX_WRITE_MASK) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to MBW_MAX Mask=%x Value=%lx\n",
+                          MPAMF_MPAMCFG_MBW_MAX_WRITE_MASK, value);
+        }
+        mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_max = value;
+        return;
+    case A_MPAMF_MPAMCFG_PRI:
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_PRI_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to CFG_PRI when !HAS_PRI_PART\n");
+        } else {
+            if (!FIELD_EX32(mpamf->pri_idr, MPAMF_PRI_IDR, HAS_DSPRI) &&
+                FIELD_EX32(value, MPAMF_MPAMCFG_PRI, DSPRI)) {
+                qemu_log_mask(LOG_UNIMP,
+                              "MPAM: Unexpected write to CFGP_PRI DSPRI when !HAS_DSPRI\n");
+            }
+            if (!FIELD_EX32(mpamf->pri_idr, MPAMF_PRI_IDR, HAS_INTPRI) &&
+                FIELD_EX32(value, MPAMF_MPAMCFG_PRI, INTPRI)) {
+                qemu_log_mask(LOG_UNIMP,
+                              "MPAM: Unexpected write to CFGP_PRI INTPRI when !HAS_INTPRI\n");
+            }
+        }
+        mpamf->per_nrw_id[nrw_part_sel].cfg_pri = value;
+        return;
+    case A_MPAMF_MPAMCFG_MBW_PROP:
+        if (value & ~MPAMF_MPAMCFG_MBW_PROP_WRITE_MASK) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to MBW_PROP Mask=%x Value=%lx\n",
+                          MPAMF_MPAMCFG_MBW_MAX_WRITE_MASK, value);
+        }
+        mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_prop = value;
+        return;
+    case A_MPAMF_MPAMCFG_CPBM0...
+        (A_MPAMF_MPAMCFG_CPBM0 + ((MPAM_CACHE_PART + 31) / 32 - 1) * 4):
+    {
+        uint32_t array_offset;
+
+        /* TODO: Figure out write mask to check this stays in write bits */
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_CPOR_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to CPMB when !HAS_CPORT_PART\n");
+            return;
+        }
+        array_offset = (offset - A_MPAMF_MPAMCFG_CPBM0) / sizeof(uint32_t);
+        mpamf->per_nrw_id[nrw_part_sel].cfg_cpbm[array_offset] = value;
+        return;
+    }
+    case A_MPAMF_MPAMCFG_MBW_PBM0...
+        (A_MPAMF_MPAMCFG_MBW_PBM0 + ((MPAM_MBW_PART + 31) / 32 - 1) * 4):
+    {
+        uint32_t array_offset;
+
+        if (!FIELD_EX32(mpamf->idr, MPAMF_IDR, HAS_MBW_PART)) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to MBW_PBM when !HAS_MBW_PART\n");
+            return;
+        }
+        /* TODO: Figure out write mask to check this stays in write bits */
+        array_offset = (offset - A_MPAMF_MPAMCFG_MBW_PBM0) / sizeof(uint32_t);
+
+        mpamf->per_nrw_id[nrw_part_sel].cfg_mbw_pbm[array_offset] = value;
+        return;
+    }
+    case A_MPAMF_ECR:
+        if (value & ~MPAMF_ECR_WRITE_MASK) {
+            qemu_log_mask(LOG_UNIMP,
+                          "MPAM: Unexpected write to ECR Mask=%x Value=%lx\n",
+                          MPAMF_ECR_WRITE_MASK, value);
+        }
+        mpamf->ecr = value;
+        return;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "MPAM: Write to unexpected register Addr %lx Value=%lx\n",
+                      offset, value);
+    }
+}
+
+static const MemoryRegionOps mpam_msc_ops = {
+    .read = mpam_msc_read_reg,
+    .write = mpam_msc_write_reg,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+    },
+};
+
+
+static void mpam_msc_realize(DeviceState *dev, Error **errp)
+{
+    MPAMMSCState *s = MPAM_MSC_DEVICE(dev);
+    int i;
+
+    if (s->num_ris > 16) {
+        error_setg(errp, "num-ris must be <= 16");
+        return;
+    }
+    if (s->num_partid == 0) {
+        error_setg(errp, "num-ris must be <= 16");
+        return;
+    }
+    if (s->num_int_partid == 0) {
+        s->num_int_partid = s->num_partid;
+    }
+
+    s->mpamf = g_new0(Mpamf, s->num_ris);
+    for (i = 0; i < s->num_ris; i++) {
+        s->mpamf[i].per_nrw_id = g_new0(MpamfPerNrwId, s->num_int_partid);
+        s->mpamf[i].cfg_intpartid = g_new0(uint32_t, s->num_partid);
+    }
+
+    memory_region_init_io(&s->mr, OBJECT(s), &mpam_msc_ops, s, "mpam_msc",
+                          0x4000);
+    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->mr);
+}
+
+static void mpam_msc_mem_realize(DeviceState *dev, Error **errp)
+{
+
+    MPAMMSCState *s = MPAM_MSC_DEVICE(dev);
+    int i;
+
+    mpam_msc_realize(dev, errp);
+
+    for (i = 0; i < s->num_ris; i++) {
+        Mpamf *mpamf = &s->mpamf[i];
+
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, PART_ID_MAX,
+                                s->num_partid - 1);
+        /* No PCMG for now */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, PMG_MAX, 0);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, EXT, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_RIS, s->num_ris > 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, RIS_MAX,
+                                  s->num_ris > 1 ? s->num_ris - 1 : 0);
+         /* Optional - test with and without */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_ESR, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_EXTD_ESR, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_PARTID_NRW,
+                                s->num_int_partid < s->num_partid);
+
+        /* We won't implement any implementation specific stuff */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, NO_IMPL_PART, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, NO_IMPL_MSMON, 1);
+        /* Memory specific bit */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_MBW_PART, 1);
+
+        mpamf->iidr = FIELD_DP64(mpamf->iidr, MPAMF_IIDR, IMPLEMENTER, 0x736);
+        mpamf->iidr = FIELD_DP64(mpamf->iidr, MPAMF_IIDR, REVISION, 0);
+        mpamf->iidr = FIELD_DP64(mpamf->iidr, MPAMF_IIDR, VARIANT, 0);
+        /* FIXME get allocation for this emulation */
+        mpamf->iidr = FIELD_DP64(mpamf->iidr, MPAMF_IIDR, PRODUCT_ID, 42);
+
+        mpamf->aidr = FIELD_DP64(mpamf->aidr, MPAMF_AIDR, ARCH_MINOR_REV, 1);
+        mpamf->aidr = FIELD_DP64(mpamf->aidr, MPAMF_AIDR, ARCH_MAJOR_REV, 1);
+
+        mpamf->mbw_idr = FIELD_DP32(mpamf->mbw_idr, MPAMF_MBW_IDR, BWA_WD, 16);
+        mpamf->mbw_idr = FIELD_DP32(mpamf->mbw_idr, MPAMF_MBW_IDR, HAS_MIN, 1);
+        mpamf->mbw_idr = FIELD_DP32(mpamf->mbw_idr, MPAMF_MBW_IDR, HAS_MAX, 1);
+        mpamf->mbw_idr = FIELD_DP32(mpamf->mbw_idr, MPAMF_MBW_IDR, HAS_PBM, 1);
+        mpamf->mbw_idr = FIELD_DP32(mpamf->mbw_idr, MPAMF_MBW_IDR, HAS_PROP, 1);
+
+        mpamf->mbw_idr = FIELD_DP32(mpamf->mbw_idr, MPAMF_MBW_IDR, WINDWR, 0);
+        mpamf->mbw_idr = FIELD_DP32(mpamf->mbw_idr, MPAMF_MBW_IDR,
+                                    BWPBM_WD, MPAM_MBW_PART);
+
+        if (s->num_int_partid < s->num_partid) {
+            mpamf->partid_nrw_idr = FIELD_DP32(mpamf->partid_nrw_idr,
+                                               MPAMF_PARTID_NRW_IDR,
+                                               INTPARTID_MAX,
+                                               s->num_int_partid - 1);
+        }
+    }
+}
+
+static void mpam_msc_cache_realize(DeviceState *dev, Error **errp)
+{
+    MPAMMSCState *s = MPAM_MSC_DEVICE(dev);
+    int i;
+
+    mpam_msc_realize(dev, errp);
+
+    for (i = 0; i < s->num_ris; i++) {
+        Mpamf *mpamf = &s->mpamf[i];
+
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, PART_ID_MAX,
+                                s->num_partid - 1);
+        /* No PCMG for now */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, PMG_MAX, 0);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, EXT, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_RIS, s->num_ris > 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, RIS_MAX,
+                                  s->num_ris > 1 ? s->num_ris - 1 : 0);
+
+        /* Optional - test with and without */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_ESR, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_EXTD_ESR,
+                                s->num_ris > 1);
+
+        /* We won't implement any implementation specific stuff */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, NO_IMPL_PART, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, NO_IMPL_MSMON, 1);
+
+        /* Need to implement for RME */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, SP4, 0);
+
+        /* Cache specific bit */
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_CPOR_PART, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_CCAP_PART, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_PRI_PART, 1);
+        mpamf->idr = FIELD_DP64(mpamf->idr, MPAMF_IDR, HAS_PARTID_NRW,
+                                s->num_int_partid < s->num_partid);
+
+        mpamf->iidr = FIELD_DP32(mpamf->iidr, MPAMF_IIDR, IMPLEMENTER, 0x736);
+        mpamf->iidr = FIELD_DP32(mpamf->iidr, MPAMF_IIDR, REVISION, 0);
+        mpamf->iidr = FIELD_DP32(mpamf->iidr, MPAMF_IIDR, VARIANT, 0);
+        /* FIXME get allocation for this emulation */
+        mpamf->iidr = FIELD_DP32(mpamf->iidr, MPAMF_IIDR, PRODUCT_ID, 42);
+
+        mpamf->aidr = FIELD_DP32(mpamf->aidr, MPAMF_AIDR, ARCH_MINOR_REV, 1);
+        mpamf->aidr = FIELD_DP32(mpamf->aidr, MPAMF_AIDR, ARCH_MAJOR_REV, 1);
+
+        /* Portion */
+        mpamf->cpor_idr = FIELD_DP32(mpamf->cpor_idr, MPAMF_CPOR_IDR, CPBM_WD,
+                                     MPAM_CACHE_PART);
+
+        /* Priority */
+        mpamf->pri_idr = FIELD_DP32(mpamf->pri_idr, MPAMF_PRI_IDR,
+                                    HAS_INTPRI, 1);
+        mpamf->pri_idr = FIELD_DP32(mpamf->pri_idr, MPAMF_PRI_IDR,
+                                    INTPRI_0_IS_LOW, 1);
+        mpamf->pri_idr = FIELD_DP32(mpamf->pri_idr, MPAMF_PRI_IDR,
+                                    INTPRI_WD, 2);
+
+        /* Capacity Partitioning */
+        mpamf->ccap_idr = FIELD_DP32(mpamf->ccap_idr, MPAMF_CCAP_IDR,
+                                     HAS_CMAX_SOFTLIM, 1);
+        mpamf->ccap_idr = FIELD_DP32(mpamf->ccap_idr, MPAMF_CCAP_IDR,
+                                     NO_CMAX, 0);
+        mpamf->ccap_idr = FIELD_DP32(mpamf->ccap_idr, MPAMF_CCAP_IDR,
+                                     HAS_CMIN, 1);
+        mpamf->ccap_idr = FIELD_DP32(mpamf->ccap_idr, MPAMF_CCAP_IDR,
+                                     HAS_CASSOC, 1);
+        mpamf->ccap_idr = FIELD_DP32(mpamf->ccap_idr, MPAMF_CCAP_IDR,
+                                     CASSOC_WD, 4); /* Not much flex on this */
+        mpamf->ccap_idr = FIELD_DP32(mpamf->ccap_idr, MPAMF_CCAP_IDR,
+                                     CMAX_WD, 4);
+
+        if (s->num_int_partid < s->num_partid) {
+            mpamf->partid_nrw_idr = FIELD_DP32(mpamf->partid_nrw_idr,
+                                               MPAMF_PARTID_NRW_IDR,
+                                               INTPARTID_MAX,
+                                               s->num_int_partid - 1);
+        }
+        /* TODO: Initialize all on as if firmware had done it for us */
+    }
+}
+
+static Property mpam_msc_props[] = {
+    DEFINE_PROP_UINT8("num-ris", MPAMMSCState, num_ris, 1),
+    DEFINE_PROP_UINT32("num-partid", MPAMMSCState, num_partid, 1),
+    DEFINE_PROP_UINT32("num-int-partid", MPAMMSCState, num_int_partid, 0),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static Property mpam_msc_cache_props[] = {
+    DEFINE_PROP_UINT8("cache-level", MPAMMSCCacheState, cache_level, 1),
+    DEFINE_PROP_UINT8("cache-type", MPAMMSCCacheState, cache_type, 1),
+    DEFINE_PROP_UINT16("cpu", MPAMMSCCacheState, cpu, 2),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static void mpam_msc_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    device_class_set_props(dc, mpam_msc_props);
+}
+
+static void mpam_msc_mem_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->realize = mpam_msc_mem_realize;
+}
+
+static void mpam_msc_cache_init(ObjectClass *klass, void *data)
+{
+   DeviceClass *dc = DEVICE_CLASS(klass);
+
+   dc->realize = mpam_msc_cache_realize;
+   device_class_set_props(dc, mpam_msc_cache_props);
+}
+
+static const TypeInfo mpam_msc_info = {
+    .name = TYPE_MPAM_MSC,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(MPAMMSCState),
+    .class_init = mpam_msc_init
+};
+
+static const TypeInfo mpam_msc_mem_info = {
+    .name = TYPE_MPAM_MSC_MEM,
+    .parent = TYPE_MPAM_MSC,
+    .instance_size = sizeof(MPAMMSCMemState),
+    .class_init = mpam_msc_mem_init
+};
+
+static const TypeInfo mpam_msc_cache_info = {
+    .name = TYPE_MPAM_MSC_CACHE,
+    .parent = TYPE_MPAM_MSC,
+    .instance_size = sizeof(MPAMMSCCacheState),
+    .class_init = mpam_msc_cache_init
+};
+
+static void mpam_register_types(void)
+{
+    type_register_static(&mpam_msc_info);
+    type_register_static(&mpam_msc_mem_info);
+    type_register_static(&mpam_msc_cache_info);
+}
+type_init(mpam_register_types);
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 7e68348440..e44259910c 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -32,6 +32,7 @@ config ARM_VIRT
     select VIRTIO_MEM_SUPPORTED
     select ACPI_CXL
     select ACPI_HMAT
+    select MPAM
 
 config CHEETAH
     bool
@@ -593,6 +594,9 @@ config FSL_IMX7
 config ARM_SMMUV3
     bool
 
+config MPAM
+    bool
+
 config FSL_IMX6UL
     bool
     default y
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index 11eb9112f8..08f72befa1 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -67,6 +67,10 @@ arm_ss.add(when: 'CONFIG_XEN', if_true: files('xen_arm.c'))
 arm_ss.add_all(xen_ss)
 
 system_ss.add(when: 'CONFIG_ARM_SMMUV3', if_true: files('smmu-common.c'))
+system_ss.add(when: 'CONFIG_MPAM', if_true: files('mpam.c', 'mpam-qapi.c'))
+system_ss.add(when: 'CONFIG_MPAM', if_false: files('mpam-qapi-stubs.c'))
+system_ss.add(when: 'CONFIG_ALL', if_true: files('mpam-qapi-stubs.c'))
+
 system_ss.add(when: 'CONFIG_EXYNOS4', if_true: files('exynos4_boards.c'))
 system_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2835_peripherals.c'))
 system_ss.add(when: 'CONFIG_TOSA', if_true: files('tosa.c'))
diff --git a/qapi/meson.build b/qapi/meson.build
index 60a668b343..8e9a45330b 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -41,6 +41,7 @@ qapi_all_modules = [
   'migration',
   'misc',
   'misc-target',
+  'mpam',
   'net',
   'pragma',
   'qom',
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 5/5] hw/arm/virt: Add MPAM MSCs for memory controllers and caches.
  2023-08-08 11:57 [RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description Jonathan Cameron via
                   ` (3 preceding siblings ...)
  2023-08-08 11:57 ` [RFC PATCH 4/5] hw/arm: Add MPAM emulation Jonathan Cameron via
@ 2023-08-08 11:57 ` Jonathan Cameron via
  4 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-08 11:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gavin Shan, linuxarm, James Morse, peter . maydell @ linaro . org,
	zhao1.liu, Alex Bennée, Shameerali Kolothum Thodi,
	Yicong Yang

Allow sharing of MSC instances (using RIS) for memory controllers.
Currently cached controllers are not using RIS (tend to be more
in a system so more clever logic needed to auto allocate them).

No DT support yet.  The kernel bindings are considered unstable
so premature to add too much on that front.

Note that for now MPAM MSC creation is dependent on having SRAT
and hence you need some numa nodes defined.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 include/hw/arm/mpam.h    |   1 +
 hw/arm/virt-acpi-build.c | 197 +++++++++++++++++++++++++++++++++++++++
 hw/arm/virt.c            | 134 ++++++++++++++++++++++++++
 3 files changed, 332 insertions(+)

diff --git a/include/hw/arm/mpam.h b/include/hw/arm/mpam.h
index 7bd88d57bc..8f47c8806f 100644
--- a/include/hw/arm/mpam.h
+++ b/include/hw/arm/mpam.h
@@ -7,6 +7,7 @@
 #define TYPE_MPAM_MSC_MEM "mpam-msc-mem"
 #define TYPE_MPAM_MSC_CACHE "mpam-msc-cache"
 
+#define MPAM_SIZE 0x4000 /* Big enough for anyone ;) */
 void mpam_cache_fill_info(Object *obj, MpamCacheInfo *info);
 
 #endif /* _MPAM_H_ */
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index ec8fdcefff..b14dce3722 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -49,6 +49,7 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci-host/gpex.h"
 #include "hw/arm/virt.h"
+#include "hw/arm/mpam.h"
 #include "hw/intc/arm_gicv3_its_common.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/platform-bus.h"
@@ -515,6 +516,198 @@ build_spcr(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     acpi_table_end(linker, &table);
 }
 
+static void build_msc_memory_controller(GArray *table_data, int identifier,
+                                        hwaddr base_addr, uint32_t mpam_id,
+                                        int num_ris, uint64_t *nodes)
+{
+    int length = 72 + 24 * num_ris;
+    int i;
+
+    build_append_int_noprefix(table_data, length, 2);
+    /* Interface Type */
+    build_append_int_noprefix(table_data, 0 /* MMIO */, 1);
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 1);
+    build_append_int_noprefix(table_data, identifier, 4);
+    build_append_int_noprefix(table_data, base_addr, 8);
+    build_append_int_noprefix(table_data, MPAM_SIZE, 4);
+    /* Overflow interrupt - HACK */
+    build_append_int_noprefix(table_data, 0x2C, 4);
+    /* Edge - SPI */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* Overflow Int Affinity - HACK */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* Error interrupt - HACK */
+    build_append_int_noprefix(table_data, 0x2D, 4);
+    /* Edge - SPI */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* More reserved */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* Error Int Affinity */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* MAX_NRDY_USEC */
+    build_append_int_noprefix(table_data, 100, 4);
+    /* Linked Device - none for now */
+    build_append_int_noprefix(table_data, 0, 8);
+    /* _UID of linked device */
+    build_append_int_noprefix(table_data, 0, 4);
+    build_append_int_noprefix(table_data, num_ris, 4);
+    /* Build a memory controller resouce */
+    for (i = 0; i < num_ris; i++) {
+        build_append_int_noprefix(table_data, mpam_id, 4);
+        build_append_int_noprefix(table_data, i, 1);
+        /* Reserved1 */
+        build_append_int_noprefix(table_data, 0, 2);
+        /* Locator type - 1 = Memory */
+        build_append_int_noprefix(table_data, 1, 1);
+        /* Locator part 1 Node */
+        build_append_int_noprefix(table_data, nodes[i], 8);
+        /* Locator part 2 reserved */
+        build_append_int_noprefix(table_data, 0, 4);
+        /* Num functional dependencies */
+        build_append_int_noprefix(table_data, 0, 4);
+    }
+}
+
+static void build_msc_cache_controller(GArray *table_data, int identifier,
+                                       hwaddr base_addr, uint32_t mpam_id,
+                                       int num_ris, uint64_t *cache_id)
+{
+    int length = 72 + 24 * num_ris;
+    int i;
+
+    build_append_int_noprefix(table_data, length, 2);
+    /* Interface Type */
+    build_append_int_noprefix(table_data, 0 /* MMIO */, 1);
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 1);
+    build_append_int_noprefix(table_data, identifier, 4);
+    build_append_int_noprefix(table_data, base_addr, 8);
+    build_append_int_noprefix(table_data, MPAM_SIZE, 4);
+    /* Overflow interrupt - HACK */
+    build_append_int_noprefix(table_data, 0x2C, 4);
+    /* Edge - SPI */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* Overflow Int Affinity */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* Error interrupt - HACK */
+    build_append_int_noprefix(table_data, 0x2D, 4);
+    /* Edge - SPI */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* More reserved */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* Error Int Affinity */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* MAX_NRDY_USEC */
+    build_append_int_noprefix(table_data, 100, 4);
+    /* Linked Device - none for now */
+    build_append_int_noprefix(table_data, 0, 8);
+    /* _UID of linked device */
+    build_append_int_noprefix(table_data, 0, 4);
+    /* Num resource nodes */
+    build_append_int_noprefix(table_data, num_ris, 4);
+    /* Build a memory controller resouce */
+    for (i = 0; i < num_ris; i++) {
+        /* Identifier */
+        build_append_int_noprefix(table_data, mpam_id, 4);
+        /* RIS index */
+        build_append_int_noprefix(table_data, i, 1);
+        /* Reserved1 */
+        build_append_int_noprefix(table_data, 0, 2);
+        /* Locator type - 0 = Cache */
+        build_append_int_noprefix(table_data, 0, 1);
+        /* Locator part 1 PPTT ID */
+        build_append_int_noprefix(table_data, cache_id[i], 8);
+        /* Locator part 2 reserved */
+        build_append_int_noprefix(table_data, 0, 4);
+        /* Num functional dependencies */
+        build_append_int_noprefix(table_data, 0, 4);
+    }
+}
+
+struct mpam_stat {
+    MachineState *ms;
+    GArray *table_data;
+    hwaddr base_addr;
+    int count;
+    int cpu_count;
+    uint32_t mpam_id; /* Just needs to be unique */
+};
+
+static int mpam_add_msc(Object *obj, void *opaque)
+{
+    if (object_dynamic_cast(obj, TYPE_MPAM_MSC_MEM)) {
+        struct mpam_stat *mpam_s = opaque;
+        SysBusDevice *s = SYS_BUS_DEVICE(obj);
+        int num_ris = object_property_get_uint(obj, "num-ris", &error_fatal);
+        uint64_t *ids = g_new0(uint64_t, num_ris);
+        int j = 0;
+        int i;
+
+        /* Fill them in based on which nodes have memory */
+        for (i = 0; i < mpam_s->ms->numa_state->num_nodes; i++) {
+            if (mpam_s->ms->numa_state->nodes[i].node_mem) {
+                ids[j++] = i;
+            }
+        }
+
+        build_msc_memory_controller(mpam_s->table_data, mpam_s->count,
+                                    s->mmio[0].addr, mpam_s->mpam_id++,
+                                    num_ris, ids);
+        mpam_s->count++;
+        g_free(ids);
+    }
+
+    if (object_dynamic_cast(obj, TYPE_MPAM_MSC_CACHE)) {
+        struct mpam_stat *mpam_s = opaque;
+        SysBusDevice *s = SYS_BUS_DEVICE(obj);
+        int num_ris = 1;
+        uint64_t *ids = g_new0(uint64_t, num_ris);
+        uint8_t cache_level = object_property_get_uint(obj, "cache-level",
+                                                       &error_fatal);
+        uint8_t cache_type = object_property_get_uint(obj, "cache-type",
+                                                      &error_fatal);
+        uint16_t cpu = object_property_get_uint(obj, "cpu", &error_fatal);
+
+        ids[0] = cpu | (cache_level << 16) | (cache_type << 24);
+        printf("MPAM has a cache with ID %lx\n", ids[0]);
+        build_msc_cache_controller(mpam_s->table_data, mpam_s->count,
+                                   s->mmio[0].addr, mpam_s->mpam_id++,
+                                   num_ris, ids);
+        mpam_s->count++;
+        g_free(ids);
+    }
+
+    return 0;
+}
+
+static void
+build_mpam(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
+{
+    AcpiTable table = {
+        .sig = "MPAM",
+        .rev = 1,
+        .oem_id = vms->oem_id,
+        .oem_table_id = vms->oem_table_id,
+    };
+    struct mpam_stat mpam_s = {
+        .ms = MACHINE(vms),
+        .count = 0,
+        .base_addr = vms->memmap[VIRT_MPAM_MSC].base,
+        .table_data = table_data,
+    };
+
+    acpi_table_begin(&table, table_data);
+
+    object_child_foreach_recursive(object_get_root(), mpam_add_msc, &mpam_s);
+
+    acpi_table_end(linker, &table);
+}
+
 /*
  * ACPI spec, Revision 5.1
  * 5.2.16 System Resource Affinity Table (SRAT)
@@ -1124,6 +1317,10 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
                        vms->oem_id, vms->oem_table_id);
         }
     }
+    if (vms->mpam) {
+        acpi_add_table(table_offsets, tables_blob);
+        build_mpam(tables_blob, tables->linker, vms);
+    }
 
     if (ms->nvdimms_state->is_enabled) {
         nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 7d9dbc2663..1ded7737f0 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -79,6 +79,7 @@
 #include "hw/virtio/virtio-md-pci.h"
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
+#include "hw/arm/mpam.h"
 #include "qemu/guest-random.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
@@ -156,6 +157,7 @@ static const MemMapEntry base_memmap[] = {
     [VIRT_PVTIME] =             { 0x090a0000, 0x00010000 },
     [VIRT_SECURE_GPIO] =        { 0x090b0000, 0x00001000 },
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
+    [VIRT_MPAM_MSC] =           { 0x0b006000, 0x00004000 * 256 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
     [VIRT_SECURE_MEM] =         { 0x0e000000, 0x01000000 },
@@ -1406,6 +1408,98 @@ static void create_virtio_iommu_dt_bindings(VirtMachineState *vms)
                            bdf + 1, vms->iommu_phandle, bdf + 1, 0xffff - bdf);
 }
 
+static void create_mpam_msc_cache(VirtMachineState *vms, int level,
+                                  hwaddr *base)
+{
+    VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    MachineClass *mc = MACHINE_CLASS(vmc);
+    MachineState *ms = MACHINE(vms);
+    const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(ms);
+    DeviceState *dev;
+    int step, i;
+
+    /* First check if L2 is at socket level */
+    if (ms->smp.cache_node_start_level &&
+        ms->smp.cache_node_start_level <= level) {
+        step = cpu_list->len / ms->smp.sockets;
+        /* If not check cluster level */
+    } else if (ms->smp.cache_cluster_start_level <= level) {
+        step = cpu_list->len / ms->smp.clusters;
+        /* Must be private then (or non existent?) */
+    } else {
+        step = ms->smp.threads;
+    }
+
+    for (i = 0; i < cpu_list->len; i += step) {
+        dev = qdev_new(TYPE_MPAM_MSC_CACHE);
+        object_property_set_uint(OBJECT(dev), "num-ris", 1, &error_fatal);
+        object_property_set_uint(OBJECT(dev), "num-partid", 256, &error_fatal);
+        object_property_set_uint(OBJECT(dev), "num-int-partid", 32,
+                                 &error_fatal);
+        object_property_set_uint(OBJECT(dev), "cache-level", level,
+                                 &error_fatal);
+        object_property_set_uint(OBJECT(dev), "cache-type", UNIFIED,
+                                 &error_fatal);
+        object_property_set_uint(OBJECT(dev), "cpu", i, &error_fatal);
+        sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+        sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, *base);
+        *base += MPAM_SIZE;
+    }
+}
+
+static void create_mpam_msc(VirtMachineState *vms, Error **errp)
+{
+    MachineState *ms = MACHINE(vms);
+    DeviceState *dev;
+    int i, count = 0;
+    hwaddr base = vms->memmap[VIRT_MPAM_MSC].base;
+
+    if (ms->numa_state->num_nodes == 0) {
+        error_setg(errp,
+                   "MPAM support requires NUMA nodes to be specified");
+        return;
+    }
+    if (!vms->mpam_min_msc) {
+        for (i = 0; i < ms->numa_state->num_nodes; i++) {
+            if (ms->numa_state->nodes[i].node_mem > 0 && count < 16) {
+                dev = qdev_new(TYPE_MPAM_MSC_MEM);
+
+                object_property_set_uint(OBJECT(dev), "num-ris", 1,
+                                         &error_fatal);
+                object_property_set_uint(OBJECT(dev), "num-partid", 256,
+                                         &error_fatal);
+                object_property_set_uint(OBJECT(dev), "num-int-partid", 32,
+                                         &error_fatal);
+                sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+                sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
+                base += MPAM_SIZE;
+            }
+        }
+    } else {
+        /* One MSC for all numa nodes with memory */
+        int count_with_mem = 0;
+
+        for (i = 0; i < ms->numa_state->num_nodes; i++) {
+            if (ms->numa_state->nodes[i].node_mem) {
+                count_with_mem++;
+            }
+        }
+        dev = qdev_new(TYPE_MPAM_MSC_MEM);
+        object_property_set_uint(OBJECT(dev), "num-ris", count_with_mem,
+                                 &error_fatal);
+        object_property_set_uint(OBJECT(dev), "num-partid", 256, &error_fatal);
+        object_property_set_uint(OBJECT(dev), "num-int-partid", 2,
+                                 &error_fatal);
+
+        sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+        sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
+        base += MPAM_SIZE;
+    }
+
+    create_mpam_msc_cache(vms, 3, &base);
+    create_mpam_msc_cache(vms, 2, &base);
+}
+
 static void create_pcie(VirtMachineState *vms)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
@@ -2280,6 +2374,10 @@ static void machvirt_init(MachineState *machine)
 
     create_pcie(vms);
 
+    if (vms->mpam) {
+        create_mpam_msc(vms, &error_fatal);
+    }
+
     if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
         vms->acpi_dev = create_acpi_ged(vms);
     } else {
@@ -2457,6 +2555,34 @@ static void virt_set_dtb_randomness(Object *obj, bool value, Error **errp)
     vms->dtb_randomness = value;
 }
 
+static bool virt_get_mpam(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->mpam;
+}
+
+static void virt_set_mpam(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->mpam = value;
+}
+
+static bool virt_get_mpam_min_msc(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->mpam_min_msc;
+}
+
+static void virt_set_mpam_min_msc(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->mpam_min_msc = value;
+}
+
 static char *virt_get_oem_id(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -3053,6 +3179,14 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
                                           "guest CPU which implements the ARM "
                                           "Memory Tagging Extension");
 
+    object_class_property_add_bool(oc, "mpam", virt_get_mpam, virt_set_mpam);
+    object_class_property_set_description(oc, "mpam", "Enable MPAM");
+
+    object_class_property_add_bool(oc, "mpam-min-msc", virt_get_mpam_min_msc,
+                                   virt_set_mpam_min_msc);
+    object_class_property_set_description(oc, "mpam-min-msc",
+                                          "Use RIS to reduce MSCs exposed.");
+
     object_class_property_add_bool(oc, "its", virt_get_its,
                                    virt_set_its);
     object_class_property_set_description(oc, "its",
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions
  2023-08-08 11:57 ` [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions Jonathan Cameron via
@ 2023-08-14  9:50   ` Zhao Liu
  2023-08-23 15:08     ` Jonathan Cameron via
  0 siblings, 1 reply; 11+ messages in thread
From: Zhao Liu @ 2023-08-14  9:50 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: qemu-devel, Gavin Shan, linuxarm, James Morse,
	peter . maydell @ linaro . org, Alex Bennée,
	Shameerali Kolothum Thodi, Yicong Yang, Zhao Liu

Hi Jonathan,

On Tue, Aug 08, 2023 at 12:57:09PM +0100, Jonathan Cameron via wrote:
> Date: Tue, 8 Aug 2023 12:57:09 +0100
> From: Jonathan Cameron via <qemu-devel@nongnu.org>
> Subject: [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions
> X-Mailer: git-send-email 2.39.2
> 
> Current PPTT tables generated by QEMU only provide information on CPU
> topology and neglect the description of Caches.
> 
> This patch adds flexible definition of those caches and updates the
> table version to 3 to allow for the per CPU cache instance IDs needed
> for cross references from the MPAM table.
> 
> If MPAM is not being used, then a unified description can be used,
> greatly reducing the resulting table size.
> 
> New machine parameters are used to control the cache toplogy.
> cache-cluster-start-level: Which caches are associated with the cluster
>   level of the topology. e.g cache-cluster-start-level=2 results in shared
>   l2 cache across a cluster.

So the i/d cache are at core level by default and we don't need to
configure its topology, right?

> cache-numa-start-level: Which caches are associate with the NUMA (in qemu
>   this is currently the physical package level).

I'm a bit confused about the connection of this numa and l3.
Does there "NUMA" refer to socket level?

> For example
>   cache-cluster-start-level=2,cache-numa-start-level=3 gives
>   private l1, cluster shared l2 and package shared L3.

Okay, you list the topology as: l1 per core, l2 per cluster and l3 per
socket.

For this case, I think my QOM topology proposal [1] (this is the underlying
general topology implementation, compatible with symmetric and 
heterogeneous, and I'm working on this QOM topology as a superset of smp)
is compatible with your command.

And I understand the difference between my "x-l2-cache-topo=[core|cluster]"
for x86 and yours is that I named the l2 cache, while you took level count
as the parameter.

What if I extend my symmetric cache topology commands for i386 as
"l2-cache=cluster,l3-cache=socket (*)"?

Compared to cache-cluster-start-level=2,cache-numa-start-level=3, are there
some specific cases that cache-xxx-start-level can solves but (*) command
cannot?

[1]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05167.html

> 
> FIXME: Test updates.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  qapi/machine.json           |   8 +-
>  include/hw/acpi/aml-build.h |  19 +++-
>  include/hw/boards.h         |   4 +
>  hw/acpi/aml-build.c         | 189 ++++++++++++++++++++++++++++++++++--
>  hw/arm/virt-acpi-build.c    | 130 ++++++++++++++++++++++++-
>  hw/core/machine-smp.c       |   8 ++
>  hw/loongarch/acpi-build.c   |   2 +-
>  7 files changed, 350 insertions(+), 10 deletions(-)
> 
> diff --git a/qapi/machine.json b/qapi/machine.json
> index a08b6576ca..cc86784641 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1494,6 +1494,10 @@
>  # @maxcpus: maximum number of hotpluggable virtual CPUs in the virtual
>  #     machine
>  #
> +# @cache-cluster-start-level: Level of first cache attached to cluster
> +#
> +# @cache-node-start-level: Level of first cache attached to cluster

node or numa?

Thanks,
Zhao

> +#
>  # Since: 6.1
>  ##
>  { 'struct': 'SMPConfiguration', 'data': {
> @@ -1503,7 +1507,9 @@
>       '*clusters': 'int',
>       '*cores': 'int',
>       '*threads': 'int',
> -     '*maxcpus': 'int' } }
> +     '*maxcpus': 'int',
> +     '*cache-cluster-start-level': 'int',
> +     '*cache-node-start-level': 'int'} }
>  
>  ##
>  # @x-query-irq:
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index d1fb08514b..055b74820d 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -489,8 +489,25 @@ void build_srat_memory(GArray *table_data, uint64_t base,
>  void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>                  const char *oem_id, const char *oem_table_id);
>  
> +typedef enum ACPIPPTTCacheType {
> +    DATA,
> +    INSTRUCTION,
> +    UNIFIED,
> +} ACPIPPTTCacheType;
> +
> +typedef struct ACPIPPTTCache {
> +    ACPIPPTTCacheType type;
> +    int sets;
> +    int size;
> +    int associativity;
> +    int linesize;
> +    unsigned int pptt_id;
> +    int level;
> +} ACPIPPTTCache;
> +
>  void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> -                const char *oem_id, const char *oem_table_id);
> +                const char *oem_id, const char *oem_table_id,
> +                int num_caches, ACPIPPTTCache *caches);
>  
>  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>                  const char *oem_id, const char *oem_table_id);
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index ed83360198..6e8ab92684 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -316,6 +316,8 @@ typedef struct DeviceMemoryState {
>   * @cores: the number of cores in one cluster
>   * @threads: the number of threads in one core
>   * @max_cpus: the maximum number of logical processors on the machine
> + * @cache_cluster_start_level: First cache level attached to cluster
> + * @cache_node_start_level: First cache level attached to node
>   */
>  typedef struct CpuTopology {
>      unsigned int cpus;
> @@ -325,6 +327,8 @@ typedef struct CpuTopology {
>      unsigned int cores;
>      unsigned int threads;
>      unsigned int max_cpus;
> +    unsigned int cache_cluster_start_level;
> +    unsigned int cache_node_start_level;
>  } CpuTopology;
>  
>  /**
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index ea331a20d1..e103cd638f 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1994,32 +1994,175 @@ static void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
>      }
>  }
>  
> +static void build_cache_nodes(GArray *tbl, ACPIPPTTCache *cache,
> +                              uint32_t next_offset,
> +                              bool has_id, unsigned int id)
> +{
> +    int val;
> +
> +    /* Type 1 - cache */
> +    build_append_byte(tbl, 1);
> +    /* Length */
> +    build_append_byte(tbl, 28);
> +    /* Reserved */
> +    build_append_int_noprefix(tbl, 0, 2);
> +    /* Flags - everything except possibly the ID */
> +    build_append_int_noprefix(tbl, has_id ? 0xff : 0x7f, 4);
> +    /* Offset of next cache up */
> +    build_append_int_noprefix(tbl, next_offset, 4);
> +    build_append_int_noprefix(tbl, cache->size, 4);
> +    build_append_int_noprefix(tbl, cache->sets, 4);
> +    build_append_byte(tbl, cache->associativity);
> +    /* Read and Write allocate amd WB */
> +    val = 0x3 | (1 << 4);
> +    switch (cache->type) {
> +    case INSTRUCTION:
> +        val |= (1 << 2);
> +        break;
> +    case DATA:
> +        val |= (0 << 2); /* Data */
> +        break;
> +    case UNIFIED:
> +        val |= (3 << 2); /* Unified */
> +        break;
> +    }
> +    build_append_byte(tbl, val);
> +    build_append_int_noprefix(tbl, cache->linesize, 2);
> +    build_append_int_noprefix(tbl,
> +                              has_id ?
> +                              (cache->type << 24) | (cache->level << 16) | id :
> +                              0, 4);
> +}
> +
> +static void build_caches_subset(GArray *table_data, uint32_t pptt_start,
> +                                int num_caches, ACPIPPTTCache *caches,
> +                                bool assign_ids, int base_id,
> +                                uint8_t level_high, uint8_t level_low,
> +                                uint32_t *data_offset, uint32_t *instr_offset)
> +{
> +    uint32_t next_level_offset_data = 0, next_level_offset_instruction = 0;
> +    uint32_t this_offset, next_offset = 0;
> +    int c, l;
> +
> +    /* Walk caches from top to bottom */
> +
> +    for (l = level_high; l >= level_low; l--) { /* Walk down levels */
> +        for (c = 0; c < num_caches; c++) {
> +            if (caches[c].level != l) {
> +                continue;
> +            }
> +
> +            /* Assume only unified above l1 for now */
> +            this_offset = table_data->len - pptt_start;
> +            switch (caches[c].type) {
> +            case INSTRUCTION:
> +                next_offset = next_level_offset_instruction;
> +                break;
> +            case DATA:
> +                next_offset = next_level_offset_data;
> +                break;
> +            case UNIFIED:
> +                /* Either is fine here - hopefully */
> +                next_offset = next_level_offset_instruction;
> +                break;
> +            }
> +            build_cache_nodes(table_data, &caches[c], next_offset,
> +                              assign_ids, base_id);
> +            switch (caches[c].type) {
> +            case INSTRUCTION:
> +                next_level_offset_instruction = this_offset;
> +                break;
> +            case DATA:
> +                next_level_offset_data = this_offset;
> +                break;
> +            case UNIFIED:
> +                next_level_offset_instruction = this_offset;
> +                next_level_offset_data = this_offset;
> +                break;
> +            }
> +            *data_offset = next_level_offset_data;
> +            *instr_offset = next_level_offset_instruction;
> +        }
> +    }
> +}
> +
>  /*
>   * ACPI spec, Revision 6.3
>   * 5.2.29 Processor Properties Topology Table (PPTT)
>   */
>  void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> -                const char *oem_id, const char *oem_table_id)
> +                const char *oem_id, const char *oem_table_id,
> +                int num_caches, ACPIPPTTCache *caches)
>  {
> +    bool share_structs = false;
>      MachineClass *mc = MACHINE_GET_CLASS(ms);
>      CPUArchIdList *cpus = ms->possible_cpus;
>      int64_t socket_id = -1, cluster_id = -1, core_id = -1;
>      uint32_t socket_offset = 0, cluster_offset = 0, core_offset = 0;
>      uint32_t pptt_start = table_data->len;
>      int n;
> -    AcpiTable table = { .sig = "PPTT", .rev = 2,
> +    AcpiTable table = { .sig = "PPTT", .rev = 3,
>                          .oem_id = oem_id, .oem_table_id = oem_table_id };
> +    uint32_t l1_data_offset = 0;
> +    uint32_t l1_instr_offset = 0;
> +    uint32_t cluster_data_offset = 0;
> +    uint32_t cluster_instr_offset = 0;
> +    uint32_t node_data_offset = 0;
> +    uint32_t node_instr_offset = 0;
> +    int top_node = 7;
> +    int top_cluster = 7;
> +    int top_core = 7;
>  
>      acpi_table_begin(&table, table_data);
>  
> +    /* Let us have a unified cache description for now */
> +
> +    if (share_structs && num_caches >= 1) {
> +        if (ms->smp.cache_node_start_level) {
> +            build_caches_subset(table_data, pptt_start, num_caches, caches,
> +                                false, 0,
> +                                top_node, ms->smp.cache_node_start_level,
> +                                &node_data_offset, &node_instr_offset);
> +            top_cluster = ms->smp.cache_node_start_level - 1;
> +        }
> +        /* Assumption that some caches below this */
> +        if (ms->smp.cache_cluster_start_level) {
> +            build_caches_subset(table_data, pptt_start, num_caches, caches,
> +                                false, 0,
> +                                top_cluster,  ms->smp.cache_cluster_start_level,
> +                                &cluster_data_offset, &cluster_instr_offset);
> +            top_core = ms->smp.cache_cluster_start_level - 1;
> +        }
> +        build_caches_subset(table_data, pptt_start, num_caches, caches,
> +                            false, 0,
> +                            top_core , 0,
> +                            &l1_data_offset, &l1_instr_offset);
> +    }
> +
>      /*
>       * This works with the assumption that cpus[n].props.*_id has been
>       * sorted from top to down levels in mc->possible_cpu_arch_ids().
>       * Otherwise, the unexpected and duplicated containers will be
>       * created.
>       */
> +
>      for (n = 0; n < cpus->len; n++) {
>          if (cpus->cpus[n].props.socket_id != socket_id) {
> +            uint32_t priv_rsrc[2];
> +            int num_priv = 0;
> +
> +            if (!share_structs && ms->smp.cache_node_start_level) {
> +                build_caches_subset(table_data, pptt_start, num_caches, caches,
> +                                    true, n,
> +                                    top_node, ms->smp.cache_node_start_level,
> +                                    &node_data_offset, &node_instr_offset);
> +                top_cluster = ms->smp.cache_node_start_level - 1;
> +            }
> +            priv_rsrc[0] = node_instr_offset;
> +            priv_rsrc[1] = node_data_offset;
> +            if (node_instr_offset || node_data_offset) {
> +                num_priv = node_instr_offset == node_data_offset ? 1 : 2;
> +            }
>              assert(cpus->cpus[n].props.socket_id > socket_id);
>              socket_id = cpus->cpus[n].props.socket_id;
>              cluster_id = -1;
> @@ -2027,36 +2170,70 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>              socket_offset = table_data->len - pptt_start;
>              build_processor_hierarchy_node(table_data,
>                  (1 << 0), /* Physical package */
> -                0, socket_id, NULL, 0);
> +                0, socket_id, priv_rsrc, num_priv);
>          }
>  
> +
>          if (mc->smp_props.clusters_supported && mc->smp_props.has_clusters) {
>              if (cpus->cpus[n].props.cluster_id != cluster_id) {
> +                uint32_t priv_rsrc[2];
> +                int num_priv = 0;
> +
> +                if (!share_structs && ms->smp.cache_cluster_start_level) {
> +                    build_caches_subset(table_data, pptt_start, num_caches,
> +                                        caches, true, n,
> +                                        top_cluster,
> +                                        ms->smp.cache_cluster_start_level,
> +                                        &cluster_data_offset,
> +                                        &cluster_instr_offset);
> +                    top_core = ms->smp.cache_cluster_start_level - 1;
> +                }
> +                priv_rsrc[0] = cluster_instr_offset;
> +                priv_rsrc[1] = cluster_data_offset;
> +
>                  assert(cpus->cpus[n].props.cluster_id > cluster_id);
>                  cluster_id = cpus->cpus[n].props.cluster_id;
>                  core_id = -1;
>                  cluster_offset = table_data->len - pptt_start;
> +
> +                if (cluster_instr_offset || cluster_data_offset) {
> +                    num_priv = cluster_instr_offset == cluster_data_offset ?
> +                        1 : 2;
> +                }
>                  build_processor_hierarchy_node(table_data,
>                      (0 << 0), /* Not a physical package */
> -                    socket_offset, cluster_id, NULL, 0);
> +                    socket_offset, cluster_id, priv_rsrc, num_priv);
>              }
>          } else {
>              cluster_offset = socket_offset;
>          }
>  
> +        if (!share_structs &&
> +            cpus->cpus[n].props.core_id != core_id) {
> +            build_caches_subset(table_data, pptt_start, num_caches, caches,
> +                                true, n,
> +                                top_core , 0,
> +                                &l1_data_offset, &l1_instr_offset);
> +        }
>          if (ms->smp.threads == 1) {
> +            uint32_t priv_rsrc[2] = { l1_instr_offset, l1_data_offset };
> +
>              build_processor_hierarchy_node(table_data,
>                  (1 << 1) | /* ACPI Processor ID valid */
>                  (1 << 3),  /* Node is a Leaf */
> -                cluster_offset, n, NULL, 0);
> +                cluster_offset, n, priv_rsrc,
> +                l1_instr_offset == l1_data_offset ? 1 : 2);
>          } else {
>              if (cpus->cpus[n].props.core_id != core_id) {
> +                uint32_t priv_rsrc[2] = { l1_instr_offset, l1_data_offset };
> +
>                  assert(cpus->cpus[n].props.core_id > core_id);
>                  core_id = cpus->cpus[n].props.core_id;
>                  core_offset = table_data->len - pptt_start;
>                  build_processor_hierarchy_node(table_data,
>                      (0 << 0), /* Not a physical package */
> -                    cluster_offset, core_id, NULL, 0);
> +                    cluster_offset, core_id, priv_rsrc,
> +                    l1_instr_offset == l1_data_offset ? 1 : 2);
>              }
>  
>              build_processor_hierarchy_node(table_data,
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 6b674231c2..ec8fdcefff 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -922,6 +922,129 @@ static void acpi_align_size(GArray *blob, unsigned align)
>      g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
>  }
>  
> +static unsigned int virt_get_caches(VirtMachineState *vms,
> +                                    ACPIPPTTCache *caches)
> +{
> +    ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0));
> +    bool ccidx = cpu_isar_feature(any_ccidx, armcpu);
> +    unsigned int num_cache, i;
> +    int level_instr = 1, level_data = 1;
> +
> +    for (i = 0, num_cache = 0; i < 7; i++, num_cache++) {
> +        int type = (armcpu->clidr >> (3 * i)) & 7;
> +        int bank_index;
> +        int level = 0;
> +        ACPIPPTTCacheType cache_type = INSTRUCTION;
> +
> +        if (type == 0) {
> +            break;
> +        }
> +
> +        switch (type) {
> +        case 1:
> +            cache_type = INSTRUCTION;
> +            level = level_instr;
> +            break;
> +        case 2:
> +            cache_type = DATA;
> +            level = level_data;
> +            break;
> +        case 4:
> +            cache_type = UNIFIED;
> +            level = level_instr > level_data ? level_instr : level_data;
> +            break;
> +        case 3: /* Split - Do data first */
> +            cache_type = DATA;
> +            level = level_data;
> +            break;
> +        }
> +        /*
> +         * ccsidr is indexed using both the level and whether it is
> +         * an instruction cache. Unified caches use the same storage
> +         * as data caches.
> +         */
> +        bank_index = (i * 2) | ((type == 1) ? 1 : 0);
> +        if (ccidx) {
> +            caches[num_cache] = (ACPIPPTTCache) {
> +                .type =  cache_type,
> +                .level = level,
> +                .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                             CCSIDR_EL1,
> +                                             CCIDX_LINESIZE) + 4),
> +                .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                            CCSIDR_EL1,
> +                                            CCIDX_ASSOCIATIVITY) + 1,
> +                .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> +                                   CCIDX_NUMSETS) + 1,
> +            };
> +        } else {
> +            caches[num_cache] = (ACPIPPTTCache) {
> +                .type =  cache_type,
> +                .level = level,
> +                .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                             CCSIDR_EL1, LINESIZE) + 4),
> +                .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                            CCSIDR_EL1,
> +                                            ASSOCIATIVITY) + 1,
> +                .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> +                                   NUMSETS) + 1,
> +            };
> +        }
> +        caches[num_cache].size = caches[num_cache].associativity *
> +            caches[num_cache].sets * caches[num_cache].linesize;
> +
> +        /* Break one 'split' entry up into two records */
> +        if (type == 3) {
> +            num_cache++;
> +            bank_index = (i * 2) | 1;
> +            if (ccidx) {
> +                /* Instruction cache: bottom bit set when reading banked reg */
> +                caches[num_cache] = (ACPIPPTTCache) {
> +                    .type = INSTRUCTION,
> +                    .level = level_instr,
> +                    .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                                 CCSIDR_EL1,
> +                                                 CCIDX_LINESIZE) + 4),
> +                    .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                                CCSIDR_EL1,
> +                                                CCIDX_ASSOCIATIVITY) + 1,
> +                    .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> +                                       CCIDX_NUMSETS) + 1,
> +                };
> +            } else {
> +                caches[num_cache] = (ACPIPPTTCache) {
> +                    .type = INSTRUCTION,
> +                    .level = level_instr,
> +                    .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                                 CCSIDR_EL1, LINESIZE) + 4),
> +                    .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> +                                                CCSIDR_EL1,
> +                                                ASSOCIATIVITY) + 1,
> +                    .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> +                                       NUMSETS) + 1,
> +                };
> +            }
> +            caches[num_cache].size = caches[num_cache].associativity *
> +                caches[num_cache].sets * caches[num_cache].linesize;
> +        }
> +        switch (type) {
> +        case 1:
> +            level_instr++;
> +            break;
> +        case 2:
> +            level_data++;
> +            break;
> +        case 3:
> +        case 4:
> +            level_instr++;
> +            level_data++;
> +            break;
> +        }
> +    }
> +
> +    return num_cache;
> +}
> +
>  static
>  void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>  {
> @@ -930,6 +1053,8 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>      unsigned dsdt, xsdt;
>      GArray *tables_blob = tables->table_data;
>      MachineState *ms = MACHINE(vms);
> +    ACPIPPTTCache caches[16]; /* Can select up to 16 */
> +    unsigned int num_cache;
>  
>      table_offsets = g_array_new(false, true /* clear */,
>                                          sizeof(uint32_t));
> @@ -949,10 +1074,13 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>      acpi_add_table(table_offsets, tables_blob);
>      build_madt(tables_blob, tables->linker, vms);
>  
> +    num_cache = virt_get_caches(vms, caches);
> +
>      if (!vmc->no_cpu_topology) {
>          acpi_add_table(table_offsets, tables_blob);
>          build_pptt(tables_blob, tables->linker, ms,
> -                   vms->oem_id, vms->oem_table_id);
> +                   vms->oem_id, vms->oem_table_id,
> +                   num_cache, caches);
>      }
>  
>      acpi_add_table(table_offsets, tables_blob);
> diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
> index 0f4d9b6f7a..cbb0bf1bc7 100644
> --- a/hw/core/machine-smp.c
> +++ b/hw/core/machine-smp.c
> @@ -81,6 +81,10 @@ void machine_parse_smp_config(MachineState *ms,
>      unsigned cores   = config->has_cores ? config->cores : 0;
>      unsigned threads = config->has_threads ? config->threads : 0;
>      unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0;
> +    unsigned cache_cl_start = config->has_cache_cluster_start_level ?
> +        config->cache_cluster_start_level : 0;
> +    unsigned cache_nd_start = config->has_cache_node_start_level ?
> +        config->cache_node_start_level : 0;
>  
>      /*
>       * Specified CPU topology parameters must be greater than zero,
> @@ -161,6 +165,10 @@ void machine_parse_smp_config(MachineState *ms,
>      ms->smp.max_cpus = maxcpus;
>  
>      mc->smp_props.has_clusters = config->has_clusters;
> +    if (mc->smp_props.has_clusters) {
> +        ms->smp.cache_cluster_start_level = cache_cl_start;
> +        ms->smp.cache_node_start_level = cache_nd_start;
> +    }
>  
>      /* sanity-check of the computed topology */
>      if (sockets * dies * clusters * cores * threads != maxcpus) {
> diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
> index 0b62c3a2f7..51d4ed9a19 100644
> --- a/hw/loongarch/acpi-build.c
> +++ b/hw/loongarch/acpi-build.c
> @@ -439,7 +439,7 @@ static void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>  
>      acpi_add_table(table_offsets, tables_blob);
>      build_pptt(tables_blob, tables->linker, machine,
> -               lams->oem_id, lams->oem_table_id);
> +               lams->oem_id, lams->oem_table_id, 0, NULL);
>  
>      acpi_add_table(table_offsets, tables_blob);
>      build_srat(tables_blob, tables->linker, machine);
> -- 
> 2.39.2
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max
  2023-08-08 11:57 ` [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max Jonathan Cameron via
@ 2023-08-14 10:13   ` Alex Bennée
  2023-08-23 14:59     ` Jonathan Cameron via
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Bennée @ 2023-08-14 10:13 UTC (permalink / raw)
  To: qemu-devel, Jonathan Cameron
  Cc: Gavin Shan, linuxarm, James Morse, peter . maydell @ linaro . org,
	zhao1.liu, Shameerali Kolothum Thodi, Yicong Yang


Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:

> Used to drive the MPAM cache intialization and to exercise more
> of the PPTT cache entry generation code. Perhaps a default
> L3 cache is acceptable for max?
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  target/arm/tcg/cpu64.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
> index 8019f00bc3..2af67739f6 100644
> --- a/target/arm/tcg/cpu64.c
> +++ b/target/arm/tcg/cpu64.c
> @@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
>      uint64_t t;
>      uint32_t u;
>  
> +    /*
> +     * Expanded cache set
> +     */
> +    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
> +    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
> +    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
> +    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
> +    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 2MB L3 cache 128B line */
> +    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 16MB L4 cache 128B line */
> +    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 128B line */
> +

I think Peter in another thread wondered if we should have a generic
function for expanding the cache idr registers based on a abstract lane
definition. 

>      /*
>       * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
>       * one and try to apply errata workarounds or use impdef features we
> @@ -828,6 +839,7 @@ void aarch64_max_tcg_initfn(Object *obj)
>      t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2);      /* FEAT_BBM at level 2 */
>      t = FIELD_DP64(t, ID_AA64MMFR2, EVT, 2);      /* FEAT_EVT */
>      t = FIELD_DP64(t, ID_AA64MMFR2, E0PD, 1);     /* FEAT_E0PD */
> +    t = FIELD_DP64(t, ID_AA64MMFR2, CCIDX, 1);      /* FEAT_TTCNP */
>      cpu->isar.id_aa64mmfr2 = t;
>  
>      t = cpu->isar.id_aa64zfr0;


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max
  2023-08-14 10:13   ` Alex Bennée
@ 2023-08-23 14:59     ` Jonathan Cameron via
  2023-08-23 19:05       ` Richard Henderson
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-23 14:59 UTC (permalink / raw)
  To: Alex Bennée
  Cc: qemu-devel, Gavin Shan, linuxarm, James Morse,
	peter . maydell @ linaro . org, zhao1.liu,
	Shameerali Kolothum Thodi, Yicong Yang

On Mon, 14 Aug 2023 11:13:58 +0100
Alex Bennée <alex.bennee@linaro.org> wrote:

> Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:
> 
> > Used to drive the MPAM cache intialization and to exercise more
> > of the PPTT cache entry generation code. Perhaps a default
> > L3 cache is acceptable for max?
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  target/arm/tcg/cpu64.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
> > index 8019f00bc3..2af67739f6 100644
> > --- a/target/arm/tcg/cpu64.c
> > +++ b/target/arm/tcg/cpu64.c
> > @@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
> >      uint64_t t;
> >      uint32_t u;
> >  
> > +    /*
> > +     * Expanded cache set
> > +     */
> > +    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
> > +    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
> > +    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
> > +    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
> > +    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 2MB L3 cache 128B line */
> > +    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 16MB L4 cache 128B line */
> > +    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 128B line */
> > +  
> 
> I think Peter in another thread wondered if we should have a generic
> function for expanding the cache idr registers based on a abstract lane
> definition. 
> 

Great!

This response?
https://lore.kernel.org/qemu-devel/CAFEAcA_Lzj1LEutMro72fCfqiCWtOpd+5b-YPcfKv8Bg1f+rCg@mail.gmail.com/

That might get us somewhere but ultimately I think we need a general way to push this stuff
in as parameters of the CPU or a CPU definition with a wide enough set of caches to allow us to
poke the boundaries and hang a typical MPAM setup off it.  Would people mind adding at least
an L3 to max? The L4 and above is useful for checking the PPTT building code works,
but that's probably more a development time activity than an every day one.

Jonathan



> >      /*
> >       * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
> >       * one and try to apply errata workarounds or use impdef features we
> > @@ -828,6 +839,7 @@ void aarch64_max_tcg_initfn(Object *obj)
> >      t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2);      /* FEAT_BBM at level 2 */
> >      t = FIELD_DP64(t, ID_AA64MMFR2, EVT, 2);      /* FEAT_EVT */
> >      t = FIELD_DP64(t, ID_AA64MMFR2, E0PD, 1);     /* FEAT_E0PD */
> > +    t = FIELD_DP64(t, ID_AA64MMFR2, CCIDX, 1);      /* FEAT_TTCNP */
> >      cpu->isar.id_aa64mmfr2 = t;
> >  
> >      t = cpu->isar.id_aa64zfr0;  
> 
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions
  2023-08-14  9:50   ` Zhao Liu
@ 2023-08-23 15:08     ` Jonathan Cameron via
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron via @ 2023-08-23 15:08 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Gavin Shan, linuxarm, James Morse,
	peter . maydell @ linaro . org, Alex Bennée,
	Shameerali Kolothum Thodi, Yicong Yang, Zhao Liu

On Mon, 14 Aug 2023 17:50:58 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> Hi Jonathan,
> 
> On Tue, Aug 08, 2023 at 12:57:09PM +0100, Jonathan Cameron via wrote:
> > Date: Tue, 8 Aug 2023 12:57:09 +0100
> > From: Jonathan Cameron via <qemu-devel@nongnu.org>
> > Subject: [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions
> > X-Mailer: git-send-email 2.39.2
> > 
> > Current PPTT tables generated by QEMU only provide information on CPU
> > topology and neglect the description of Caches.
> > 
> > This patch adds flexible definition of those caches and updates the
> > table version to 3 to allow for the per CPU cache instance IDs needed
> > for cross references from the MPAM table.
> > 
> > If MPAM is not being used, then a unified description can be used,
> > greatly reducing the resulting table size.
> > 
> > New machine parameters are used to control the cache toplogy.
> > cache-cluster-start-level: Which caches are associated with the cluster
> >   level of the topology. e.g cache-cluster-start-level=2 results in shared
> >   l2 cache across a cluster.  
> 
> So the i/d cache are at core level by default and we don't need to
> configure its topology, right?

Exactly.  Default is everything private.  Everything below
cache-cluster-start-level remains so.

> 
> > cache-numa-start-level: Which caches are associate with the NUMA (in qemu
> >   this is currently the physical package level).  
> 
> I'm a bit confused about the connection of this numa and l3.
> Does there "NUMA" refer to socket level?
Fair point. We don't really have enough flexibility in QEMU so far to
represent all the complexities seen in fairly standard systems.
Can take one I have to hand

l1i, l1d, l2 private caches.
l3 tags shared by cluster (that just looks like a processor container
in PPTT with no caches associated with it)
l3 shared by die
Numa domain per die,
2 Numa domains (dies) per Socket.
N sockets.

So far we are a level short in QEMU representation. 

> 
> > For example
> >   cache-cluster-start-level=2,cache-numa-start-level=3 gives
> >   private l1, cluster shared l2 and package shared L3.  
> 
> Okay, you list the topology as: l1 per core, l2 per cluster and l3 per
> socket.
> 
> For this case, I think my QOM topology proposal [1] (this is the underlying
> general topology implementation, compatible with symmetric and 
> heterogeneous, and I'm working on this QOM topology as a superset of smp)
> is compatible with your command.
I've been on holiday and it will take a few more days to catch up.
After that I plan to take a close look at your proposal.

> 
> And I understand the difference between my "x-l2-cache-topo=[core|cluster]"
> for x86 and yours is that I named the l2 cache, while you took level count
> as the parameter.
> 
> What if I extend my symmetric cache topology commands for i386 as
> "l2-cache=cluster,l3-cache=socket (*)"?

That would work fine for me.


> 
> Compared to cache-cluster-start-level=2,cache-numa-start-level=3, are there
> some specific cases that cache-xxx-start-level can solves but (*) command
> cannot?
> 
> [1]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05167.html

None, though we'd probably want some sanity checking as
l2-cache=socket,l3-cache=cluster isn't representable in PPTT.
Arguably I should have that for my start-level based approach as well.

> 
> > 
> > FIXME: Test updates.
> > 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  qapi/machine.json           |   8 +-
> >  include/hw/acpi/aml-build.h |  19 +++-
> >  include/hw/boards.h         |   4 +
> >  hw/acpi/aml-build.c         | 189 ++++++++++++++++++++++++++++++++++--
> >  hw/arm/virt-acpi-build.c    | 130 ++++++++++++++++++++++++-
> >  hw/core/machine-smp.c       |   8 ++
> >  hw/loongarch/acpi-build.c   |   2 +-
> >  7 files changed, 350 insertions(+), 10 deletions(-)
> > 
> > diff --git a/qapi/machine.json b/qapi/machine.json
> > index a08b6576ca..cc86784641 100644
> > --- a/qapi/machine.json
> > +++ b/qapi/machine.json
> > @@ -1494,6 +1494,10 @@
> >  # @maxcpus: maximum number of hotpluggable virtual CPUs in the virtual
> >  #     machine
> >  #
> > +# @cache-cluster-start-level: Level of first cache attached to cluster
> > +#
> > +# @cache-node-start-level: Level of first cache attached to cluster  
> 
> node or numa?
oops - should be consistent on that. Thanks!

Jonathan
> 
> Thanks,
> Zhao
> 
> > +#
> >  # Since: 6.1
> >  ##
> >  { 'struct': 'SMPConfiguration', 'data': {
> > @@ -1503,7 +1507,9 @@
> >       '*clusters': 'int',
> >       '*cores': 'int',
> >       '*threads': 'int',
> > -     '*maxcpus': 'int' } }
> > +     '*maxcpus': 'int',
> > +     '*cache-cluster-start-level': 'int',
> > +     '*cache-node-start-level': 'int'} }
> >  
> >  ##
> >  # @x-query-irq:
> > diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> > index d1fb08514b..055b74820d 100644
> > --- a/include/hw/acpi/aml-build.h
> > +++ b/include/hw/acpi/aml-build.h
> > @@ -489,8 +489,25 @@ void build_srat_memory(GArray *table_data, uint64_t base,
> >  void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> >                  const char *oem_id, const char *oem_table_id);
> >  
> > +typedef enum ACPIPPTTCacheType {
> > +    DATA,
> > +    INSTRUCTION,
> > +    UNIFIED,
> > +} ACPIPPTTCacheType;
> > +
> > +typedef struct ACPIPPTTCache {
> > +    ACPIPPTTCacheType type;
> > +    int sets;
> > +    int size;
> > +    int associativity;
> > +    int linesize;
> > +    unsigned int pptt_id;
> > +    int level;
> > +} ACPIPPTTCache;
> > +
> >  void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> > -                const char *oem_id, const char *oem_table_id);
> > +                const char *oem_id, const char *oem_table_id,
> > +                int num_caches, ACPIPPTTCache *caches);
> >  
> >  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
> >                  const char *oem_id, const char *oem_table_id);
> > diff --git a/include/hw/boards.h b/include/hw/boards.h
> > index ed83360198..6e8ab92684 100644
> > --- a/include/hw/boards.h
> > +++ b/include/hw/boards.h
> > @@ -316,6 +316,8 @@ typedef struct DeviceMemoryState {
> >   * @cores: the number of cores in one cluster
> >   * @threads: the number of threads in one core
> >   * @max_cpus: the maximum number of logical processors on the machine
> > + * @cache_cluster_start_level: First cache level attached to cluster
> > + * @cache_node_start_level: First cache level attached to node
> >   */
> >  typedef struct CpuTopology {
> >      unsigned int cpus;
> > @@ -325,6 +327,8 @@ typedef struct CpuTopology {
> >      unsigned int cores;
> >      unsigned int threads;
> >      unsigned int max_cpus;
> > +    unsigned int cache_cluster_start_level;
> > +    unsigned int cache_node_start_level;
> >  } CpuTopology;
> >  
> >  /**
> > diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> > index ea331a20d1..e103cd638f 100644
> > --- a/hw/acpi/aml-build.c
> > +++ b/hw/acpi/aml-build.c
> > @@ -1994,32 +1994,175 @@ static void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
> >      }
> >  }
> >  
> > +static void build_cache_nodes(GArray *tbl, ACPIPPTTCache *cache,
> > +                              uint32_t next_offset,
> > +                              bool has_id, unsigned int id)
> > +{
> > +    int val;
> > +
> > +    /* Type 1 - cache */
> > +    build_append_byte(tbl, 1);
> > +    /* Length */
> > +    build_append_byte(tbl, 28);
> > +    /* Reserved */
> > +    build_append_int_noprefix(tbl, 0, 2);
> > +    /* Flags - everything except possibly the ID */
> > +    build_append_int_noprefix(tbl, has_id ? 0xff : 0x7f, 4);
> > +    /* Offset of next cache up */
> > +    build_append_int_noprefix(tbl, next_offset, 4);
> > +    build_append_int_noprefix(tbl, cache->size, 4);
> > +    build_append_int_noprefix(tbl, cache->sets, 4);
> > +    build_append_byte(tbl, cache->associativity);
> > +    /* Read and Write allocate amd WB */
> > +    val = 0x3 | (1 << 4);
> > +    switch (cache->type) {
> > +    case INSTRUCTION:
> > +        val |= (1 << 2);
> > +        break;
> > +    case DATA:
> > +        val |= (0 << 2); /* Data */
> > +        break;
> > +    case UNIFIED:
> > +        val |= (3 << 2); /* Unified */
> > +        break;
> > +    }
> > +    build_append_byte(tbl, val);
> > +    build_append_int_noprefix(tbl, cache->linesize, 2);
> > +    build_append_int_noprefix(tbl,
> > +                              has_id ?
> > +                              (cache->type << 24) | (cache->level << 16) | id :
> > +                              0, 4);
> > +}
> > +
> > +static void build_caches_subset(GArray *table_data, uint32_t pptt_start,
> > +                                int num_caches, ACPIPPTTCache *caches,
> > +                                bool assign_ids, int base_id,
> > +                                uint8_t level_high, uint8_t level_low,
> > +                                uint32_t *data_offset, uint32_t *instr_offset)
> > +{
> > +    uint32_t next_level_offset_data = 0, next_level_offset_instruction = 0;
> > +    uint32_t this_offset, next_offset = 0;
> > +    int c, l;
> > +
> > +    /* Walk caches from top to bottom */
> > +
> > +    for (l = level_high; l >= level_low; l--) { /* Walk down levels */
> > +        for (c = 0; c < num_caches; c++) {
> > +            if (caches[c].level != l) {
> > +                continue;
> > +            }
> > +
> > +            /* Assume only unified above l1 for now */
> > +            this_offset = table_data->len - pptt_start;
> > +            switch (caches[c].type) {
> > +            case INSTRUCTION:
> > +                next_offset = next_level_offset_instruction;
> > +                break;
> > +            case DATA:
> > +                next_offset = next_level_offset_data;
> > +                break;
> > +            case UNIFIED:
> > +                /* Either is fine here - hopefully */
> > +                next_offset = next_level_offset_instruction;
> > +                break;
> > +            }
> > +            build_cache_nodes(table_data, &caches[c], next_offset,
> > +                              assign_ids, base_id);
> > +            switch (caches[c].type) {
> > +            case INSTRUCTION:
> > +                next_level_offset_instruction = this_offset;
> > +                break;
> > +            case DATA:
> > +                next_level_offset_data = this_offset;
> > +                break;
> > +            case UNIFIED:
> > +                next_level_offset_instruction = this_offset;
> > +                next_level_offset_data = this_offset;
> > +                break;
> > +            }
> > +            *data_offset = next_level_offset_data;
> > +            *instr_offset = next_level_offset_instruction;
> > +        }
> > +    }
> > +}
> > +
> >  /*
> >   * ACPI spec, Revision 6.3
> >   * 5.2.29 Processor Properties Topology Table (PPTT)
> >   */
> >  void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> > -                const char *oem_id, const char *oem_table_id)
> > +                const char *oem_id, const char *oem_table_id,
> > +                int num_caches, ACPIPPTTCache *caches)
> >  {
> > +    bool share_structs = false;
> >      MachineClass *mc = MACHINE_GET_CLASS(ms);
> >      CPUArchIdList *cpus = ms->possible_cpus;
> >      int64_t socket_id = -1, cluster_id = -1, core_id = -1;
> >      uint32_t socket_offset = 0, cluster_offset = 0, core_offset = 0;
> >      uint32_t pptt_start = table_data->len;
> >      int n;
> > -    AcpiTable table = { .sig = "PPTT", .rev = 2,
> > +    AcpiTable table = { .sig = "PPTT", .rev = 3,
> >                          .oem_id = oem_id, .oem_table_id = oem_table_id };
> > +    uint32_t l1_data_offset = 0;
> > +    uint32_t l1_instr_offset = 0;
> > +    uint32_t cluster_data_offset = 0;
> > +    uint32_t cluster_instr_offset = 0;
> > +    uint32_t node_data_offset = 0;
> > +    uint32_t node_instr_offset = 0;
> > +    int top_node = 7;
> > +    int top_cluster = 7;
> > +    int top_core = 7;
> >  
> >      acpi_table_begin(&table, table_data);
> >  
> > +    /* Let us have a unified cache description for now */
> > +
> > +    if (share_structs && num_caches >= 1) {
> > +        if (ms->smp.cache_node_start_level) {
> > +            build_caches_subset(table_data, pptt_start, num_caches, caches,
> > +                                false, 0,
> > +                                top_node, ms->smp.cache_node_start_level,
> > +                                &node_data_offset, &node_instr_offset);
> > +            top_cluster = ms->smp.cache_node_start_level - 1;
> > +        }
> > +        /* Assumption that some caches below this */
> > +        if (ms->smp.cache_cluster_start_level) {
> > +            build_caches_subset(table_data, pptt_start, num_caches, caches,
> > +                                false, 0,
> > +                                top_cluster,  ms->smp.cache_cluster_start_level,
> > +                                &cluster_data_offset, &cluster_instr_offset);
> > +            top_core = ms->smp.cache_cluster_start_level - 1;
> > +        }
> > +        build_caches_subset(table_data, pptt_start, num_caches, caches,
> > +                            false, 0,
> > +                            top_core , 0,
> > +                            &l1_data_offset, &l1_instr_offset);
> > +    }
> > +
> >      /*
> >       * This works with the assumption that cpus[n].props.*_id has been
> >       * sorted from top to down levels in mc->possible_cpu_arch_ids().
> >       * Otherwise, the unexpected and duplicated containers will be
> >       * created.
> >       */
> > +
> >      for (n = 0; n < cpus->len; n++) {
> >          if (cpus->cpus[n].props.socket_id != socket_id) {
> > +            uint32_t priv_rsrc[2];
> > +            int num_priv = 0;
> > +
> > +            if (!share_structs && ms->smp.cache_node_start_level) {
> > +                build_caches_subset(table_data, pptt_start, num_caches, caches,
> > +                                    true, n,
> > +                                    top_node, ms->smp.cache_node_start_level,
> > +                                    &node_data_offset, &node_instr_offset);
> > +                top_cluster = ms->smp.cache_node_start_level - 1;
> > +            }
> > +            priv_rsrc[0] = node_instr_offset;
> > +            priv_rsrc[1] = node_data_offset;
> > +            if (node_instr_offset || node_data_offset) {
> > +                num_priv = node_instr_offset == node_data_offset ? 1 : 2;
> > +            }
> >              assert(cpus->cpus[n].props.socket_id > socket_id);
> >              socket_id = cpus->cpus[n].props.socket_id;
> >              cluster_id = -1;
> > @@ -2027,36 +2170,70 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> >              socket_offset = table_data->len - pptt_start;
> >              build_processor_hierarchy_node(table_data,
> >                  (1 << 0), /* Physical package */
> > -                0, socket_id, NULL, 0);
> > +                0, socket_id, priv_rsrc, num_priv);
> >          }
> >  
> > +
> >          if (mc->smp_props.clusters_supported && mc->smp_props.has_clusters) {
> >              if (cpus->cpus[n].props.cluster_id != cluster_id) {
> > +                uint32_t priv_rsrc[2];
> > +                int num_priv = 0;
> > +
> > +                if (!share_structs && ms->smp.cache_cluster_start_level) {
> > +                    build_caches_subset(table_data, pptt_start, num_caches,
> > +                                        caches, true, n,
> > +                                        top_cluster,
> > +                                        ms->smp.cache_cluster_start_level,
> > +                                        &cluster_data_offset,
> > +                                        &cluster_instr_offset);
> > +                    top_core = ms->smp.cache_cluster_start_level - 1;
> > +                }
> > +                priv_rsrc[0] = cluster_instr_offset;
> > +                priv_rsrc[1] = cluster_data_offset;
> > +
> >                  assert(cpus->cpus[n].props.cluster_id > cluster_id);
> >                  cluster_id = cpus->cpus[n].props.cluster_id;
> >                  core_id = -1;
> >                  cluster_offset = table_data->len - pptt_start;
> > +
> > +                if (cluster_instr_offset || cluster_data_offset) {
> > +                    num_priv = cluster_instr_offset == cluster_data_offset ?
> > +                        1 : 2;
> > +                }
> >                  build_processor_hierarchy_node(table_data,
> >                      (0 << 0), /* Not a physical package */
> > -                    socket_offset, cluster_id, NULL, 0);
> > +                    socket_offset, cluster_id, priv_rsrc, num_priv);
> >              }
> >          } else {
> >              cluster_offset = socket_offset;
> >          }
> >  
> > +        if (!share_structs &&
> > +            cpus->cpus[n].props.core_id != core_id) {
> > +            build_caches_subset(table_data, pptt_start, num_caches, caches,
> > +                                true, n,
> > +                                top_core , 0,
> > +                                &l1_data_offset, &l1_instr_offset);
> > +        }
> >          if (ms->smp.threads == 1) {
> > +            uint32_t priv_rsrc[2] = { l1_instr_offset, l1_data_offset };
> > +
> >              build_processor_hierarchy_node(table_data,
> >                  (1 << 1) | /* ACPI Processor ID valid */
> >                  (1 << 3),  /* Node is a Leaf */
> > -                cluster_offset, n, NULL, 0);
> > +                cluster_offset, n, priv_rsrc,
> > +                l1_instr_offset == l1_data_offset ? 1 : 2);
> >          } else {
> >              if (cpus->cpus[n].props.core_id != core_id) {
> > +                uint32_t priv_rsrc[2] = { l1_instr_offset, l1_data_offset };
> > +
> >                  assert(cpus->cpus[n].props.core_id > core_id);
> >                  core_id = cpus->cpus[n].props.core_id;
> >                  core_offset = table_data->len - pptt_start;
> >                  build_processor_hierarchy_node(table_data,
> >                      (0 << 0), /* Not a physical package */
> > -                    cluster_offset, core_id, NULL, 0);
> > +                    cluster_offset, core_id, priv_rsrc,
> > +                    l1_instr_offset == l1_data_offset ? 1 : 2);
> >              }
> >  
> >              build_processor_hierarchy_node(table_data,
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index 6b674231c2..ec8fdcefff 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -922,6 +922,129 @@ static void acpi_align_size(GArray *blob, unsigned align)
> >      g_array_set_size(blob, ROUND_UP(acpi_data_len(blob), align));
> >  }
> >  
> > +static unsigned int virt_get_caches(VirtMachineState *vms,
> > +                                    ACPIPPTTCache *caches)
> > +{
> > +    ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0));
> > +    bool ccidx = cpu_isar_feature(any_ccidx, armcpu);
> > +    unsigned int num_cache, i;
> > +    int level_instr = 1, level_data = 1;
> > +
> > +    for (i = 0, num_cache = 0; i < 7; i++, num_cache++) {
> > +        int type = (armcpu->clidr >> (3 * i)) & 7;
> > +        int bank_index;
> > +        int level = 0;
> > +        ACPIPPTTCacheType cache_type = INSTRUCTION;
> > +
> > +        if (type == 0) {
> > +            break;
> > +        }
> > +
> > +        switch (type) {
> > +        case 1:
> > +            cache_type = INSTRUCTION;
> > +            level = level_instr;
> > +            break;
> > +        case 2:
> > +            cache_type = DATA;
> > +            level = level_data;
> > +            break;
> > +        case 4:
> > +            cache_type = UNIFIED;
> > +            level = level_instr > level_data ? level_instr : level_data;
> > +            break;
> > +        case 3: /* Split - Do data first */
> > +            cache_type = DATA;
> > +            level = level_data;
> > +            break;
> > +        }
> > +        /*
> > +         * ccsidr is indexed using both the level and whether it is
> > +         * an instruction cache. Unified caches use the same storage
> > +         * as data caches.
> > +         */
> > +        bank_index = (i * 2) | ((type == 1) ? 1 : 0);
> > +        if (ccidx) {
> > +            caches[num_cache] = (ACPIPPTTCache) {
> > +                .type =  cache_type,
> > +                .level = level,
> > +                .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                             CCSIDR_EL1,
> > +                                             CCIDX_LINESIZE) + 4),
> > +                .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                            CCSIDR_EL1,
> > +                                            CCIDX_ASSOCIATIVITY) + 1,
> > +                .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> > +                                   CCIDX_NUMSETS) + 1,
> > +            };
> > +        } else {
> > +            caches[num_cache] = (ACPIPPTTCache) {
> > +                .type =  cache_type,
> > +                .level = level,
> > +                .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                             CCSIDR_EL1, LINESIZE) + 4),
> > +                .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                            CCSIDR_EL1,
> > +                                            ASSOCIATIVITY) + 1,
> > +                .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> > +                                   NUMSETS) + 1,
> > +            };
> > +        }
> > +        caches[num_cache].size = caches[num_cache].associativity *
> > +            caches[num_cache].sets * caches[num_cache].linesize;
> > +
> > +        /* Break one 'split' entry up into two records */
> > +        if (type == 3) {
> > +            num_cache++;
> > +            bank_index = (i * 2) | 1;
> > +            if (ccidx) {
> > +                /* Instruction cache: bottom bit set when reading banked reg */
> > +                caches[num_cache] = (ACPIPPTTCache) {
> > +                    .type = INSTRUCTION,
> > +                    .level = level_instr,
> > +                    .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                                 CCSIDR_EL1,
> > +                                                 CCIDX_LINESIZE) + 4),
> > +                    .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                                CCSIDR_EL1,
> > +                                                CCIDX_ASSOCIATIVITY) + 1,
> > +                    .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> > +                                       CCIDX_NUMSETS) + 1,
> > +                };
> > +            } else {
> > +                caches[num_cache] = (ACPIPPTTCache) {
> > +                    .type = INSTRUCTION,
> > +                    .level = level_instr,
> > +                    .linesize = 1 << (FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                                 CCSIDR_EL1, LINESIZE) + 4),
> > +                    .associativity = FIELD_EX64(armcpu->ccsidr[bank_index],
> > +                                                CCSIDR_EL1,
> > +                                                ASSOCIATIVITY) + 1,
> > +                    .sets = FIELD_EX64(armcpu->ccsidr[bank_index], CCSIDR_EL1,
> > +                                       NUMSETS) + 1,
> > +                };
> > +            }
> > +            caches[num_cache].size = caches[num_cache].associativity *
> > +                caches[num_cache].sets * caches[num_cache].linesize;
> > +        }
> > +        switch (type) {
> > +        case 1:
> > +            level_instr++;
> > +            break;
> > +        case 2:
> > +            level_data++;
> > +            break;
> > +        case 3:
> > +        case 4:
> > +            level_instr++;
> > +            level_data++;
> > +            break;
> > +        }
> > +    }
> > +
> > +    return num_cache;
> > +}
> > +
> >  static
> >  void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> >  {
> > @@ -930,6 +1053,8 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> >      unsigned dsdt, xsdt;
> >      GArray *tables_blob = tables->table_data;
> >      MachineState *ms = MACHINE(vms);
> > +    ACPIPPTTCache caches[16]; /* Can select up to 16 */
> > +    unsigned int num_cache;
> >  
> >      table_offsets = g_array_new(false, true /* clear */,
> >                                          sizeof(uint32_t));
> > @@ -949,10 +1074,13 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> >      acpi_add_table(table_offsets, tables_blob);
> >      build_madt(tables_blob, tables->linker, vms);
> >  
> > +    num_cache = virt_get_caches(vms, caches);
> > +
> >      if (!vmc->no_cpu_topology) {
> >          acpi_add_table(table_offsets, tables_blob);
> >          build_pptt(tables_blob, tables->linker, ms,
> > -                   vms->oem_id, vms->oem_table_id);
> > +                   vms->oem_id, vms->oem_table_id,
> > +                   num_cache, caches);
> >      }
> >  
> >      acpi_add_table(table_offsets, tables_blob);
> > diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
> > index 0f4d9b6f7a..cbb0bf1bc7 100644
> > --- a/hw/core/machine-smp.c
> > +++ b/hw/core/machine-smp.c
> > @@ -81,6 +81,10 @@ void machine_parse_smp_config(MachineState *ms,
> >      unsigned cores   = config->has_cores ? config->cores : 0;
> >      unsigned threads = config->has_threads ? config->threads : 0;
> >      unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0;
> > +    unsigned cache_cl_start = config->has_cache_cluster_start_level ?
> > +        config->cache_cluster_start_level : 0;
> > +    unsigned cache_nd_start = config->has_cache_node_start_level ?
> > +        config->cache_node_start_level : 0;
> >  
> >      /*
> >       * Specified CPU topology parameters must be greater than zero,
> > @@ -161,6 +165,10 @@ void machine_parse_smp_config(MachineState *ms,
> >      ms->smp.max_cpus = maxcpus;
> >  
> >      mc->smp_props.has_clusters = config->has_clusters;
> > +    if (mc->smp_props.has_clusters) {
> > +        ms->smp.cache_cluster_start_level = cache_cl_start;
> > +        ms->smp.cache_node_start_level = cache_nd_start;
> > +    }
> >  
> >      /* sanity-check of the computed topology */
> >      if (sockets * dies * clusters * cores * threads != maxcpus) {
> > diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
> > index 0b62c3a2f7..51d4ed9a19 100644
> > --- a/hw/loongarch/acpi-build.c
> > +++ b/hw/loongarch/acpi-build.c
> > @@ -439,7 +439,7 @@ static void acpi_build(AcpiBuildTables *tables, MachineState *machine)
> >  
> >      acpi_add_table(table_offsets, tables_blob);
> >      build_pptt(tables_blob, tables->linker, machine,
> > -               lams->oem_id, lams->oem_table_id);
> > +               lams->oem_id, lams->oem_table_id, 0, NULL);
> >  
> >      acpi_add_table(table_offsets, tables_blob);
> >      build_srat(tables_blob, tables->linker, machine);
> > -- 
> > 2.39.2
> > 
> >   
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max
  2023-08-23 14:59     ` Jonathan Cameron via
@ 2023-08-23 19:05       ` Richard Henderson
  0 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2023-08-23 19:05 UTC (permalink / raw)
  To: Jonathan Cameron, Alex Bennée
  Cc: qemu-devel, Gavin Shan, linuxarm, James Morse,
	peter . maydell @ linaro . org, zhao1.liu,
	Shameerali Kolothum Thodi, Yicong Yang

On 8/23/23 07:59, Jonathan Cameron via wrote:
> On Mon, 14 Aug 2023 11:13:58 +0100
> Alex Bennée <alex.bennee@linaro.org> wrote:
> 
>> Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:
>>
>>> Used to drive the MPAM cache intialization and to exercise more
>>> of the PPTT cache entry generation code. Perhaps a default
>>> L3 cache is acceptable for max?
>>>
>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>> ---
>>>   target/arm/tcg/cpu64.c | 12 ++++++++++++
>>>   1 file changed, 12 insertions(+)
>>>
>>> diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
>>> index 8019f00bc3..2af67739f6 100644
>>> --- a/target/arm/tcg/cpu64.c
>>> +++ b/target/arm/tcg/cpu64.c
>>> @@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
>>>       uint64_t t;
>>>       uint32_t u;
>>>   
>>> +    /*
>>> +     * Expanded cache set
>>> +     */
>>> +    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
>>> +    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
>>> +    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
>>> +    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
>>> +    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 2MB L3 cache 128B line */
>>> +    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 16MB L4 cache 128B line */
>>> +    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 128B line */
>>> +
>>
>> I think Peter in another thread wondered if we should have a generic
>> function for expanding the cache idr registers based on a abstract lane
>> definition.
>>
> 
> Great!
> 
> This response?
> https://lore.kernel.org/qemu-devel/CAFEAcA_Lzj1LEutMro72fCfqiCWtOpd+5b-YPcfKv8Bg1f+rCg@mail.gmail.com/

Followed up with

https://lore.kernel.org/qemu-devel/20230811214031.171020-6-richard.henderson@linaro.org/


r~


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-08-23 19:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-08 11:57 [RFC PATCH 0/5] hw/arm: MPAM Emulation + PPTT cache description Jonathan Cameron via
2023-08-08 11:57 ` [RFC PATCH 1/5] hw/acpi: Add PPTT cache descriptions Jonathan Cameron via
2023-08-14  9:50   ` Zhao Liu
2023-08-23 15:08     ` Jonathan Cameron via
2023-08-08 11:57 ` [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max Jonathan Cameron via
2023-08-14 10:13   ` Alex Bennée
2023-08-23 14:59     ` Jonathan Cameron via
2023-08-23 19:05       ` Richard Henderson
2023-08-08 11:57 ` [RFC PATCH 3/5] target/arm: Add support for MPAM CPU registers Jonathan Cameron via
2023-08-08 11:57 ` [RFC PATCH 4/5] hw/arm: Add MPAM emulation Jonathan Cameron via
2023-08-08 11:57 ` [RFC PATCH 5/5] hw/arm/virt: Add MPAM MSCs for memory controllers and caches Jonathan Cameron via

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).