* [PATCH v6 0/1] numa: add 'memmap-type' option for memory type configuration
@ 2026-02-26 10:50 fanhuang
  2026-02-26 10:50 ` [PATCH v6 1/1] " fanhuang
  2026-03-05 12:32 ` [PATCH v6 0/1] " Jonathan Cameron via qemu development
  0 siblings, 2 replies; 10+ messages in thread
From: fanhuang @ 2026-02-26 10:50 UTC (permalink / raw)
  To: qemu-devel, david, imammedo, gourry, jonathan.cameron
  Cc: apopple, dan.j.williams, Zhigang.Luo, Lianjie.Shi, fanhuang

Hi all,

This is v6 of the SPM (Specific Purpose Memory) patch. Thank you for
the feedback on v5, especially Gregory's review.

Changes in v6:
- Added validation: memmap-type now requires memdev to be specified,
  to avoid misconfiguration on memory-less NUMA nodes
- Simplified pc_update_numa_memory_types() by replacing switch/goto
  with a direct conditional expression
- Reserved memory nodes are now excluded from SRAT memory affinity
  entries, since E820 already marks them as reserved and SRAT should
  not report them as enabled memory affinity

Use case:
This feature allows marking NUMA node memory as Specific Purpose Memory
(SPM) or reserved in the E820 table. SPM serves as a hint to the guest
that this memory might be managed by device drivers based on guest policy.

Example usage:
  -object memory-backend-ram,size=8G,id=m0
  -object memory-backend-memfd,size=8G,id=m1
  -numa node,nodeid=0,memdev=m0
  -numa node,nodeid=1,memdev=m1,memmap-type=spm

Supported memmap-type values:
  - normal:   Regular system RAM (E820 type 1, default)
  - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF), a hint
              that this memory might be managed by device drivers
  - reserved: Reserved memory (E820 type 2), not usable as RAM
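
As a reading aid, the mapping above can be sketched in standalone C.
The enum and helper names (memmap_type, memmap_type_to_e820) are
illustrative only and are not the identifiers used in the patch; the
numeric values are the ones listed above.

```c
#include <assert.h>
#include <stdint.h>

/* E820 type codes as listed in the cover letter. */
#define E820_RAM            1u
#define E820_RESERVED       2u
#define E820_SOFT_RESERVED  0xEFFFFFFFu

/* Illustrative enum; not the patch's own type names. */
enum memmap_type {
    MEMMAP_NORMAL,
    MEMMAP_SPM,
    MEMMAP_RESERVED,
};

/* Return the E820 type a node's memory would be reported with. */
static uint32_t memmap_type_to_e820(enum memmap_type t)
{
    /* This sketch only knows the three values listed above. */
    assert(t == MEMMAP_NORMAL || t == MEMMAP_SPM || t == MEMMAP_RESERVED);

    switch (t) {
    case MEMMAP_SPM:
        return E820_SOFT_RESERVED;
    case MEMMAP_RESERVED:
        return E820_RESERVED;
    default:
        return E820_RAM;    /* normal: regular system RAM */
    }
}
```

Calling memmap_type_to_e820(MEMMAP_SPM) yields the soft-reserved code,
which is the type a Linux guest prints as "soft reserved" in the test
results below.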

OS-facing test results:

1. memmap-type=spm
~~~~~~~~~~~~~~~~~~

  -numa node,cpus=4-7,nodeid=1,memdev=m1,memmap-type=spm

Guest dmesg output:

  [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000027fffffff] usable
  [    0.000000] BIOS-e820: [mem 0x0000000280000000-0x000000047fffffff] soft reserved
  [    0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved

  [    0.042582] ACPI: SRAT: Node 1 PXM 1 [mem 0x280000000-0x47fffffff]
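
The E820 lines above show one usable range being split into a usable
head and a soft-reserved remainder. A minimal sketch of that kind of
split follows. It is a simplified stand-in, not the patch's
e820_update_entry_type(): the struct and function names are
hypothetical, values are kept in host byte order, and the pieces are
returned in address order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_RAM            1u
#define E820_SOFT_RESERVED  0xEFFFFFFFu

/* Simplified stand-in for an E820 entry (host byte order, no packing). */
struct range {
    uint64_t start;
    uint64_t len;
    uint32_t type;
};

/*
 * Retype [start, start + len) inside *entry.  Writes 1 to 3 ranges to
 * out[] in address order and returns how many were written, or 0 if the
 * target is empty or not fully contained in the entry.  Overflow of
 * start + len is not handled in this sketch.
 */
static size_t split_range(const struct range *entry, uint64_t start,
                          uint64_t len, uint32_t new_type,
                          struct range out[3])
{
    assert(entry != NULL && out != NULL);

    uint64_t entry_end = entry->start + entry->len;
    uint64_t end = start + len;
    size_t n = 0;

    if (len == 0 || start < entry->start || end > entry_end) {
        return 0;
    }
    if (entry->start < start) {     /* head keeps the original type */
        out[n++] = (struct range){ entry->start, start - entry->start,
                                   entry->type };
    }
    out[n++] = (struct range){ start, len, new_type };
    if (end < entry_end) {          /* tail keeps the original type */
        out[n++] = (struct range){ end, entry_end - end, entry->type };
    }
    return n;
}
```

With a usable entry covering 0x100000000-0x47fffffff and node 1 at
0x280000000 with length 0x200000000, the helper returns two ranges: a
usable head (0x100000000, length 0x180000000) and the soft-reserved
remainder. No tail is produced because node 1 ends exactly where the
original entry ends.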

2. memmap-type=reserved
~~~~~~~~~~~~~~~~~~~~~~~

  -numa node,cpus=4-7,nodeid=1,memdev=m1,memmap-type=reserved

Guest dmesg output:

  [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000027fffffff] usable
  [    0.000000] BIOS-e820: [mem 0x0000000280000000-0x000000047fffffff] reserved
  [    0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved

  [    0.042728] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
  [    0.042729] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0x7fffffff]
  [    0.042731] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x27fffffff]

  Note: Node 1 is excluded from SRAT memory affinity entries since v6,
  because E820 already marks it as reserved and it should not be reported
  as enabled memory.

3. normal (default, no memmap-type specified)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  -numa node,cpus=4-7,nodeid=1,memdev=m1

Guest dmesg output:

  [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000047fffffff] usable
  [    0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved

  [    0.042413] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
  [    0.042414] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0x7fffffff]
  [    0.042416] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x27fffffff]
  [    0.042417] ACPI: SRAT: Node 1 PXM 1 [mem 0x280000000-0x47fffffff]

The results show:
- Node association is correct (Node 1 at 0x280000000-0x47fffffff)
- E820 types are correctly applied (usable/soft reserved/reserved)
- SRAT entries are generated for normal and spm configurations
- Reserved nodes are excluded from SRAT memory affinity (new in v6)

Please review. Thanks!

Best regards,
Jerry Huang

fanhuang (1):
  numa: add 'memmap-type' option for memory type configuration

 hw/core/numa.c               | 24 ++++++++++++
 hw/i386/acpi-build.c         |  8 ++++
 hw/i386/e820_memory_layout.c | 72 ++++++++++++++++++++++++++++++++++++
 hw/i386/e820_memory_layout.h | 12 +++---
 hw/i386/pc.c                 | 48 ++++++++++++++++++++++++
 include/system/numa.h        |  7 ++++
 qapi/machine.json            | 24 ++++++++++++
 qemu-options.hx              | 14 ++++++-
 8 files changed, 202 insertions(+), 7 deletions(-)

-- 
2.34.1




* [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-02-26 10:50 [PATCH v6 0/1] numa: add 'memmap-type' option for memory type configuration fanhuang
@ 2026-02-26 10:50 ` fanhuang
  2026-02-27 20:34   ` David Hildenbrand
                     ` (2 more replies)
  2026-03-05 12:32 ` [PATCH v6 0/1] " Jonathan Cameron via qemu development
  1 sibling, 3 replies; 10+ messages in thread
From: fanhuang @ 2026-02-26 10:50 UTC (permalink / raw)
  To: qemu-devel, david, imammedo, gourry, jonathan.cameron
  Cc: apopple, dan.j.williams, Zhigang.Luo, Lianjie.Shi, fanhuang

Add a 'memmap-type' option to NUMA node configuration that allows
specifying the memory type for a NUMA node.

Supported values:
  - normal:   Regular system RAM (E820 type 1, default)
  - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF)
  - reserved: Reserved memory (E820 type 2)

The 'spm' type indicates Specific Purpose Memory - a hint to the guest
that this memory might be managed by device drivers based on guest policy.
The 'reserved' type marks memory as not usable as RAM.

Note: This option is only supported on x86 platforms.

Usage:
  -numa node,nodeid=1,memdev=m1,memmap-type=spm

Signed-off-by: fanhuang <FangSheng.Huang@amd.com>
---
 hw/core/numa.c               | 24 ++++++++++++
 hw/i386/acpi-build.c         |  8 ++++
 hw/i386/e820_memory_layout.c | 72 ++++++++++++++++++++++++++++++++++++
 hw/i386/e820_memory_layout.h | 12 +++---
 hw/i386/pc.c                 | 48 ++++++++++++++++++++++++
 include/system/numa.h        |  7 ++++
 qapi/machine.json            | 24 ++++++++++++
 qemu-options.hx              | 14 ++++++-
 8 files changed, 202 insertions(+), 7 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index f462883c87..521c8f10f1 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -38,6 +38,7 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/core/boards.h"
 #include "hw/mem/memory-device.h"
+#include "hw/i386/x86.h"
 #include "qemu/option.h"
 #include "qemu/config-file.h"
 #include "qemu/cutils.h"
@@ -164,6 +165,29 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
         numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
     }
 
+    if (node->has_memmap_type && node->memmap_type != NUMA_MEMMAP_TYPE_NORMAL) {
+        if (!node->memdev) {
+            error_setg(errp, "memmap-type=%s requires memdev to be specified",
+                       NumaMemmapType_str(node->memmap_type));
+            return;
+        }
+        if (!object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
+            error_setg(errp, "memmap-type=%s is only supported on x86 machines",
+                       NumaMemmapType_str(node->memmap_type));
+            return;
+        }
+        switch (node->memmap_type) {
+        case NUMA_MEMMAP_TYPE_SPM:
+            numa_info[nodenr].memmap_type = NUMA_MEMMAP_SPM;
+            break;
+        case NUMA_MEMMAP_TYPE_RESERVED:
+            numa_info[nodenr].memmap_type = NUMA_MEMMAP_RESERVED;
+            break;
+        default:
+            break;
+        }
+    }
+
     numa_info[nodenr].present = true;
     max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1);
     ms->numa_state->num_nodes++;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 9446a9f862..5aefef9079 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1418,6 +1418,14 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
         mem_len = numa_info[i - 1].node_mem;
         next_base = mem_base + mem_len;
 
+        /*
+         * Skip reserved memory nodes - E820 marks them as reserved,
+         * so SRAT should not report them as enabled memory affinity.
+         */
+        if (numa_info[i - 1].memmap_type == NUMA_MEMMAP_RESERVED) {
+            continue;
+        }
+
         /* Cut out the 640K hole */
         if (mem_base <= HOLE_640K_START &&
             next_base > HOLE_640K_START) {
diff --git a/hw/i386/e820_memory_layout.c b/hw/i386/e820_memory_layout.c
index 3e848fb69c..4c62b5ddea 100644
--- a/hw/i386/e820_memory_layout.c
+++ b/hw/i386/e820_memory_layout.c
@@ -46,3 +46,75 @@ bool e820_get_entry(int idx, uint32_t type, uint64_t *address, uint64_t *length)
     }
     return false;
 }
+
+bool e820_update_entry_type(uint64_t start, uint64_t length, uint32_t new_type)
+{
+    uint64_t end = start + length;
+    assert(!e820_done);
+
+    /* For E820_SOFT_RESERVED, validate range is within E820_RAM */
+    if (new_type == E820_SOFT_RESERVED) {
+        bool range_in_ram = false;
+
+        for (size_t j = 0; j < e820_entries; j++) {
+            uint64_t ram_start = le64_to_cpu(e820_table[j].address);
+            uint64_t ram_end = ram_start + le64_to_cpu(e820_table[j].length);
+            uint32_t ram_type = le32_to_cpu(e820_table[j].type);
+
+            if (ram_type == E820_RAM && ram_start <= start && ram_end >= end) {
+                range_in_ram = true;
+                break;
+            }
+        }
+        if (!range_in_ram) {
+            return false;
+        }
+    }
+
+    /* Find entry that contains the target range and update it */
+    for (size_t i = 0; i < e820_entries; i++) {
+        uint64_t entry_start = le64_to_cpu(e820_table[i].address);
+        uint64_t entry_length = le64_to_cpu(e820_table[i].length);
+        uint64_t entry_end = entry_start + entry_length;
+
+        if (entry_start <= start && entry_end >= end) {
+            uint32_t original_type = e820_table[i].type;
+
+            /* Remove original entry */
+            memmove(&e820_table[i], &e820_table[i + 1],
+                    (e820_entries - i - 1) * sizeof(struct e820_entry));
+            e820_entries--;
+
+            /* Add split parts inline */
+            if (entry_start < start) {
+                e820_table = g_renew(struct e820_entry, e820_table,
+                                     e820_entries + 1);
+                e820_table[e820_entries].address = cpu_to_le64(entry_start);
+                e820_table[e820_entries].length =
+                    cpu_to_le64(start - entry_start);
+                e820_table[e820_entries].type = original_type;
+                e820_entries++;
+            }
+
+            e820_table = g_renew(struct e820_entry, e820_table,
+                                 e820_entries + 1);
+            e820_table[e820_entries].address = cpu_to_le64(start);
+            e820_table[e820_entries].length = cpu_to_le64(length);
+            e820_table[e820_entries].type = cpu_to_le32(new_type);
+            e820_entries++;
+
+            if (end < entry_end) {
+                e820_table = g_renew(struct e820_entry, e820_table,
+                                     e820_entries + 1);
+                e820_table[e820_entries].address = cpu_to_le64(end);
+                e820_table[e820_entries].length = cpu_to_le64(entry_end - end);
+                e820_table[e820_entries].type = original_type;
+                e820_entries++;
+            }
+
+            return true;
+        }
+    }
+
+    return false;
+}
diff --git a/hw/i386/e820_memory_layout.h b/hw/i386/e820_memory_layout.h
index b50acfa201..a85b4fd14c 100644
--- a/hw/i386/e820_memory_layout.h
+++ b/hw/i386/e820_memory_layout.h
@@ -10,11 +10,12 @@
 #define HW_I386_E820_MEMORY_LAYOUT_H
 
 /* e820 types */
-#define E820_RAM        1
-#define E820_RESERVED   2
-#define E820_ACPI       3
-#define E820_NVS        4
-#define E820_UNUSABLE   5
+#define E820_RAM            1
+#define E820_RESERVED       2
+#define E820_ACPI           3
+#define E820_NVS            4
+#define E820_UNUSABLE       5
+#define E820_SOFT_RESERVED  0xEFFFFFFF
 
 struct e820_entry {
     uint64_t address;
@@ -26,5 +27,6 @@ void e820_add_entry(uint64_t address, uint64_t length, uint32_t type);
 bool e820_get_entry(int index, uint32_t type,
                     uint64_t *address, uint64_t *length);
 int e820_get_table(struct e820_entry **table);
+bool e820_update_entry_type(uint64_t start, uint64_t length, uint32_t new_type);
 
 #endif
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5cb074c0a0..22679c69fb 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -794,6 +794,51 @@ static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
     return pc_above_4g_end(pcms) - 1;
 }
 
+/*
+ * Update E820 entries for NUMA nodes with non-default memory types.
+ */
+static void pc_update_numa_memory_types(X86MachineState *x86ms)
+{
+    MachineState *ms = MACHINE(x86ms);
+    uint64_t addr = 0;
+
+    for (int i = 0; i < ms->numa_state->num_nodes; i++) {
+        NodeInfo *numa_info = &ms->numa_state->nodes[i];
+        uint64_t node_size = numa_info->node_mem;
+
+        if (numa_info->node_memdev &&
+            (numa_info->memmap_type == NUMA_MEMMAP_SPM ||
+             numa_info->memmap_type == NUMA_MEMMAP_RESERVED)) {
+            uint64_t guest_addr;
+            uint32_t e820_type = (numa_info->memmap_type == NUMA_MEMMAP_SPM)
+                                  ? E820_SOFT_RESERVED : E820_RESERVED;
+
+            if (addr < x86ms->below_4g_mem_size) {
+                if (addr + node_size <= x86ms->below_4g_mem_size) {
+                    guest_addr = addr;
+                } else {
+                    error_report("NUMA node %d with memmap-type spans across "
+                                 "4GB boundary, not supported", i);
+                    exit(EXIT_FAILURE);
+                }
+            } else {
+                guest_addr = 0x100000000ULL +
+                            (addr - x86ms->below_4g_mem_size);
+            }
+
+            if (!e820_update_entry_type(guest_addr, node_size, e820_type)) {
+                warn_report("Failed to update E820 entry for node %d "
+                           "at 0x%" PRIx64 " length 0x%" PRIx64,
+                           i, guest_addr, node_size);
+            }
+        }
+
+        if (numa_info->node_memdev) {
+            addr += node_size;
+        }
+    }
+}
+
 /*
  * AMD systems with an IOMMU have an additional hole close to the
  * 1Tb, which are special GPAs that cannot be DMA mapped. Depending
@@ -910,6 +955,9 @@ void pc_memory_init(PCMachineState *pcms,
         e820_add_entry(pcms->sgx_epc.base, pcms->sgx_epc.size, E820_RESERVED);
     }
 
+    /* Update E820 for NUMA nodes with special memory types */
+    pc_update_numa_memory_types(x86ms);
+
     if (!pcmc->has_reserved_memory &&
         (machine->ram_slots ||
          (machine->maxram_size > machine->ram_size))) {
diff --git a/include/system/numa.h b/include/system/numa.h
index 1044b0eb6e..64e8f63736 100644
--- a/include/system/numa.h
+++ b/include/system/numa.h
@@ -35,12 +35,19 @@ enum {
 
 #define UINT16_BITS       16
 
+typedef enum {
+    NUMA_MEMMAP_NORMAL = 0,
+    NUMA_MEMMAP_SPM,
+    NUMA_MEMMAP_RESERVED,
+} NumaMemmapTypeInternal;
+
 typedef struct NodeInfo {
     uint64_t node_mem;
     struct HostMemoryBackend *node_memdev;
     bool present;
     bool has_cpu;
     bool has_gi;
+    NumaMemmapTypeInternal memmap_type;
     uint8_t lb_info_provided;
     uint16_t initiator;
     uint8_t distance[MAX_NODES];
diff --git a/qapi/machine.json b/qapi/machine.json
index 907cb25f75..b7fc8c564f 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -464,6 +464,22 @@
 { 'enum': 'NumaOptionsType',
   'data': [ 'node', 'dist', 'cpu', 'hmat-lb', 'hmat-cache' ] }
 
+##
+# @NumaMemmapType:
+#
+# Memory mapping type for a NUMA node.
+#
+# @normal: Normal system RAM (E820 type 1)
+#
+# @spm: Specific Purpose Memory (E820 type 0xEFFFFFFF)
+#
+# @reserved: Reserved memory (E820 type 2)
+#
+# Since: 10.2
+##
+{ 'enum': 'NumaMemmapType',
+  'data': ['normal', 'spm', 'reserved'] }
+
 ##
 # @NumaOptions:
 #
@@ -500,6 +516,13 @@
 # @memdev: memory backend object.  If specified for one node, it must
 #     be specified for all nodes.
 #
+# @memmap-type: specifies the memory type for this NUMA node.
+#     'normal' (default) is regular system RAM.
+#     'spm' is Specific Purpose Memory - a hint to the guest that
+#     this memory might be managed by device drivers based on policy.
+#     'reserved' is reserved memory, not usable as RAM.
+#     Currently only supported on x86.  (since 10.2)
+#
 # @initiator: defined in ACPI 6.3 Chapter 5.2.27.3 Table 5-145, points
 #     to the nodeid which has the memory controller responsible for
 #     this NUMA node.  This field provides additional information as
@@ -514,6 +537,7 @@
    '*cpus':   ['uint16'],
    '*mem':    'size',
    '*memdev': 'str',
+   '*memmap-type': 'NumaMemmapType',
    '*initiator': 'uint16' }}
 
 ##
diff --git a/qemu-options.hx b/qemu-options.hx
index ec92723f10..4da17cbefb 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -433,7 +433,7 @@ ERST
 
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
     "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
-    "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
+    "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node][,memmap-type=normal|spm|reserved]\n"
     "-numa dist,src=source,dst=destination,val=distance\n"
     "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n"
     "-numa hmat-lb,initiator=node,target=node,hierarchy=memory|first-level|second-level|third-level,data-type=access-latency|read-latency|write-latency[,latency=lat][,bandwidth=bw]\n"
@@ -442,7 +442,7 @@ DEF("numa", HAS_ARG, QEMU_OPTION_numa,
 SRST
 ``-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=initiator]``
   \ 
-``-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=initiator]``
+``-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=initiator][,memmap-type=type]``
   \
 ``-numa dist,src=source,dst=destination,val=distance``
   \ 
@@ -510,6 +510,16 @@ SRST
     largest bandwidth) to this NUMA node. Note that this option can be
     set only when the machine property 'hmat' is set to 'on'.
 
+    '\ ``memmap-type``\ ' specifies the memory type for this NUMA node:
+
+    - ``normal`` (default): Regular system RAM (E820 type 1)
+    - ``spm``: Specific Purpose Memory (E820 type 0xEFFFFFFF). This is a
+      hint to the guest that the memory might be managed by device drivers
+      based on guest policy.
+    - ``reserved``: Reserved memory (E820 type 2), not usable as RAM.
+
+    This option is only supported on x86 platforms.
+
     Following example creates a machine with 2 NUMA nodes, node 0 has
     CPU. node 1 has only memory, and its initiator is node 0. Note that
     because node 0 has CPU, by default the initiator of node 0 is itself
-- 
2.34.1




* Re: [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-02-26 10:50 ` [PATCH v6 1/1] " fanhuang
@ 2026-02-27 20:34   ` David Hildenbrand
  2026-03-02  9:01     ` Huang, FangSheng (Jerry)
  2026-03-04 17:19   ` Gregory Price
  2026-03-05 21:06   ` Gregory Price
  2 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2026-02-27 20:34 UTC (permalink / raw)
  To: fanhuang, qemu-devel, imammedo, gourry, jonathan.cameron
  Cc: apopple, dan.j.williams, Zhigang.Luo, Lianjie.Shi

On 2/26/26 11:50, fanhuang wrote:
> Add a 'memmap-type' option to NUMA node configuration that allows
> specifying the memory type for a NUMA node.
> 
> Supported values:
>   - normal:   Regular system RAM (E820 type 1, default)
>   - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF)
>   - reserved: Reserved memory (E820 type 2)
> 
> The 'spm' type indicates Specific Purpose Memory - a hint to the guest
> that this memory might be managed by device drivers based on guest policy.
> The 'reserved' type marks memory as not usable as RAM.
> 
> Note: This option is only supported on x86 platforms.
> 
> Usage:
>   -numa node,nodeid=1,memdev=m1,memmap-type=spm
> 
> Signed-off-by: fanhuang <FangSheng.Huang@amd.com>
> ---
>  hw/core/numa.c               | 24 ++++++++++++
>  hw/i386/acpi-build.c         |  8 ++++
>  hw/i386/e820_memory_layout.c | 72 ++++++++++++++++++++++++++++++++++++
>  hw/i386/e820_memory_layout.h | 12 +++---
>  hw/i386/pc.c                 | 48 ++++++++++++++++++++++++
>  include/system/numa.h        |  7 ++++
>  qapi/machine.json            | 24 ++++++++++++
>  qemu-options.hx              | 14 ++++++-
>  8 files changed, 202 insertions(+), 7 deletions(-)

I didn't take a look at the x86 implementation bits. The high-level
concept LGTM.

In an ideal world, we'd only indicate the property if actually supported
by the machine. Not sure if that is easy to achieve with the "-numa"
option. So I guess this has to do :)

-- 
Cheers,

David




* Re: [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-02-27 20:34   ` David Hildenbrand
@ 2026-03-02  9:01     ` Huang, FangSheng (Jerry)
  2026-03-04 17:16       ` David Hildenbrand
  0 siblings, 1 reply; 10+ messages in thread
From: Huang, FangSheng (Jerry) @ 2026-03-02  9:01 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel, imammedo, gourry, jonathan.cameron
  Cc: apopple, dan.j.williams, Zhigang.Luo, Lianjie.Shi



On 2/28/2026 4:34 AM, David Hildenbrand wrote:
> On 2/26/26 11:50, fanhuang wrote:
>> Add a 'memmap-type' option to NUMA node configuration that allows
>> specifying the memory type for a NUMA node.
>>
>> Supported values:
>>    - normal:   Regular system RAM (E820 type 1, default)
>>    - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF)
>>    - reserved: Reserved memory (E820 type 2)
>>
>> The 'spm' type indicates Specific Purpose Memory - a hint to the guest
>> that this memory might be managed by device drivers based on guest policy.
>> The 'reserved' type marks memory as not usable as RAM.
>>
>> Note: This option is only supported on x86 platforms.
>>
>> Usage:
>>    -numa node,nodeid=1,memdev=m1,memmap-type=spm
>>
>> Signed-off-by: fanhuang <FangSheng.Huang@amd.com>
>> ---
>>   hw/core/numa.c               | 24 ++++++++++++
>>   hw/i386/acpi-build.c         |  8 ++++
>>   hw/i386/e820_memory_layout.c | 72 ++++++++++++++++++++++++++++++++++++
>>   hw/i386/e820_memory_layout.h | 12 +++---
>>   hw/i386/pc.c                 | 48 ++++++++++++++++++++++++
>>   include/system/numa.h        |  7 ++++
>>   qapi/machine.json            | 24 ++++++++++++
>>   qemu-options.hx              | 14 ++++++-
>>   8 files changed, 202 insertions(+), 7 deletions(-)
> 
> I didn't take a look at the x86 implementation bits. The high-level
> concept LGTM.
> 
> In an ideal world, we'd only indicate the property if actually supported
> by the machine. Not sure if that is easy to achieve with the "-numa"
> option. So I guess this has to do :)
> 
Hi David,

Thanks for the review and the LGTM on the high-level concept!

Regarding the per-machine property visibility — agreed, it would be
cleaner. Currently we handle it with a runtime error when memmap-type
is used on non-x86 machines, which seems like a reasonable compromise
given the "-numa" option structure.

I was wondering if there's anything else you or the other maintainers
would like me to address in the current v6? If the patch is in
reasonable shape, it would be great if it could be picked up, as I
have a follow-up OVMF patch for soft reserved memory support that
depends on this QEMU change being merged first.

Happy to make further changes if needed — just let me know.

Thanks,
Jerry Huang



* Re: [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-03-02  9:01     ` Huang, FangSheng (Jerry)
@ 2026-03-04 17:16       ` David Hildenbrand
  0 siblings, 0 replies; 10+ messages in thread
From: David Hildenbrand @ 2026-03-04 17:16 UTC (permalink / raw)
  To: Huang, FangSheng (Jerry), qemu-devel, imammedo, gourry,
	jonathan.cameron
  Cc: apopple, dan.j.williams, Zhigang.Luo, Lianjie.Shi

On 3/2/26 10:01, Huang, FangSheng (Jerry) wrote:
> 
> 
> On 2/28/2026 4:34 AM, David Hildenbrand wrote:
>> On 2/26/26 11:50, fanhuang wrote:
>>> Add a 'memmap-type' option to NUMA node configuration that allows
>>> specifying the memory type for a NUMA node.
>>>
>>> Supported values:
>>>    - normal:   Regular system RAM (E820 type 1, default)
>>>    - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF)
>>>    - reserved: Reserved memory (E820 type 2)
>>>
>>> The 'spm' type indicates Specific Purpose Memory - a hint to the guest
>>> that this memory might be managed by device drivers based on guest
>>> policy.
>>> The 'reserved' type marks memory as not usable as RAM.
>>>
>>> Note: This option is only supported on x86 platforms.
>>>
>>> Usage:
>>>    -numa node,nodeid=1,memdev=m1,memmap-type=spm
>>>
>>> Signed-off-by: fanhuang <FangSheng.Huang@amd.com>
>>> ---
>>>   hw/core/numa.c               | 24 ++++++++++++
>>>   hw/i386/acpi-build.c         |  8 ++++
>>>   hw/i386/e820_memory_layout.c | 72 ++++++++++++++++++++++++++++++++++++
>>>   hw/i386/e820_memory_layout.h | 12 +++---
>>>   hw/i386/pc.c                 | 48 ++++++++++++++++++++++++
>>>   include/system/numa.h        |  7 ++++
>>>   qapi/machine.json            | 24 ++++++++++++
>>>   qemu-options.hx              | 14 ++++++-
>>>   8 files changed, 202 insertions(+), 7 deletions(-)
>>
>> I didn't take a look at the x86 implementation bits. The high-level
>> concept LGTM.
>>
>> In an ideal world, we'd only indicate the property if actually supported
>> by the machine. Not sure if that is easy to achieve with the "-numa"
>> option. So I guess this has to do :)
>>
> Hi David,
> 
> Thanks for the review and the LGTM on the high-level concept!
> 
> Regarding the per-machine property visibility — agreed, it would be
> cleaner. Currently we handle it with a runtime error when memmap-type
> is used on non-x86 machines, which seems like a reasonable compromise
> given the "-numa" option structure.
> 
> I was wondering if there's anything else you or the other maintainers
> would like me to address in the current v6?

Not from my side, so

Acked-by: David Hildenbrand <david@kernel.org>

on the core bits.

-- 
Cheers,

David




* Re: [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-02-26 10:50 ` [PATCH v6 1/1] " fanhuang
  2026-02-27 20:34   ` David Hildenbrand
@ 2026-03-04 17:19   ` Gregory Price
  2026-03-05 10:39     ` Huang, FangSheng (Jerry)
  2026-03-05 21:06   ` Gregory Price
  2 siblings, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-03-04 17:19 UTC (permalink / raw)
  To: fanhuang
  Cc: qemu-devel, david, imammedo, jonathan.cameron, apopple,
	dan.j.williams, Zhigang.Luo, Lianjie.Shi

On Thu, Feb 26, 2026 at 06:50:23PM +0800, fanhuang wrote:
> Add a 'memmap-type' option to NUMA node configuration that allows
> specifying the memory type for a NUMA node.
> 
> Supported values:
>   - normal:   Regular system RAM (E820 type 1, default)
>   - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF)
>   - reserved: Reserved memory (E820 type 2)
> 
> The 'spm' type indicates Specific Purpose Memory - a hint to the guest
> that this memory might be managed by device drivers based on guest policy.
> The 'reserved' type marks memory as not usable as RAM.
> 
> Note: This option is only supported on x86 platforms.
> 
> Usage:
>   -numa node,nodeid=1,memdev=m1,memmap-type=spm
> 
> Signed-off-by: fanhuang <FangSheng.Huang@amd.com>

Thank you for the reworks!

I will set up a test soon, and this will actually help me with my other
work, so it's much appreciated.

Reviewed-by: Gregory Price <gourry@gourry.net>



* Re: [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-03-04 17:19   ` Gregory Price
@ 2026-03-05 10:39     ` Huang, FangSheng (Jerry)
  0 siblings, 0 replies; 10+ messages in thread
From: Huang, FangSheng (Jerry) @ 2026-03-05 10:39 UTC (permalink / raw)
  To: Gregory Price
  Cc: qemu-devel, david, imammedo, jonathan.cameron, apopple,
	dan.j.williams, Zhigang.Luo, Lianjie.Shi



On 3/5/2026 1:19 AM, Gregory Price wrote:
> On Thu, Feb 26, 2026 at 06:50:23PM +0800, fanhuang wrote:
>> Add a 'memmap-type' option to NUMA node configuration that allows
>> specifying the memory type for a NUMA node.
>>
>> Supported values:
>>    - normal:   Regular system RAM (E820 type 1, default)
>>    - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF)
>>    - reserved: Reserved memory (E820 type 2)
>>
>> The 'spm' type indicates Specific Purpose Memory - a hint to the guest
>> that this memory might be managed by device drivers based on guest policy.
>> The 'reserved' type marks memory as not usable as RAM.
>>
>> Note: This option is only supported on x86 platforms.
>>
>> Usage:
>>    -numa node,nodeid=1,memdev=m1,memmap-type=spm
>>
>> Signed-off-by: fanhuang <FangSheng.Huang@amd.com>
> 
> Thank you for the reworks!
> 
> I will set up a test soon, and this will actually help me with my other
> work, so it's much appreciated.
> 
> Reviewed-by: Gregory Price <gourry@gourry.net>

Hi David, Gregory,

Thank you both for the review and the Acked-by / Reviewed-by!

Really appreciate all the feedback and guidance from you and the
other reviewers throughout this patch series. It helped shape the
design significantly.

Thanks,
Jerry Huang



* Re: [PATCH v6 0/1] numa: add 'memmap-type' option for memory type configuration
  2026-02-26 10:50 [PATCH v6 0/1] numa: add 'memmap-type' option for memory type configuration fanhuang
  2026-02-26 10:50 ` [PATCH v6 1/1] " fanhuang
@ 2026-03-05 12:32 ` Jonathan Cameron via qemu development
  1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Cameron via qemu development @ 2026-03-05 12:32 UTC (permalink / raw)
  To: fanhuang
  Cc: qemu-devel, david, imammedo, gourry, apopple, dan.j.williams,
	Zhigang.Luo, Lianjie.Shi

On Thu, 26 Feb 2026 18:50:22 +0800
fanhuang <FangSheng.Huang@amd.com> wrote:

> Hi all,
> 
> This is v6 of the SPM (Specific Purpose Memory) patch. Thank you for
> the feedback on v5, especially Gregory's review.
> 
> Changes in v6:
> - Added validation: memmap-type now requires memdev to be specified,
>   to avoid misconfiguration on memory-less NUMA nodes
> - Simplified pc_update_numa_memory_types() by replacing switch/goto
>   with a direct conditional expression
> - Reserved memory nodes are now excluded from SRAT memory affinity
>   entries, since E820 already marks them as reserved and SRAT should
>   not report them as enabled memory affinity
> 
> Use case:
> This feature allows marking NUMA node memory as Specific Purpose Memory
> (SPM) or reserved in the E820 table. SPM serves as a hint to the guest
> that this memory might be managed by device drivers based on guest policy
> 
> Example usage:
>   -object memory-backend-ram,size=8G,id=m0
>   -object memory-backend-memfd,size=8G,id=m1
>   -numa node,nodeid=0,memdev=m0
>   -numa node,nodeid=1,memdev=m1,memmap-type=spm
> 
> Supported memmap-type values:
>   - normal:   Regular system RAM (E820 type 1, default)
>   - spm:      Specific Purpose Memory (E820 type 0xEFFFFFFF), a hint
>               that this memory might be managed by device drivers
>   - reserved: Reserved memory (E820 type 2), not usable as RAM

Interface looks good to me.  I'm not familiar enough with the
x86-specific elements to confirm them without more time than I have
today, though, so no tags from me.
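
For reference, the mapping described in the quoted cover letter can be
sketched as below. This is only an illustration, not code from the patch:
the function name and the SKETCH_* macros are hypothetical, and only the
three option strings and their E820 type values are taken from the text
above.

```c
#include <stdint.h>
#include <string.h>

/* E820 type values as listed in the cover letter (macro names are made up). */
#define SKETCH_E820_RAM           1u
#define SKETCH_E820_RESERVED      2u
#define SKETCH_E820_SOFT_RESERVED 0xEFFFFFFFu

/*
 * Hypothetical helper: map a memmap-type string to its E820 type.
 * A missing option is treated as "normal", the documented default.
 * Returns 0 for an unrecognised value.
 */
static uint32_t memmap_type_to_e820(const char *memmap_type)
{
    if (memmap_type == NULL || strcmp(memmap_type, "normal") == 0) {
        return SKETCH_E820_RAM;
    }
    if (strcmp(memmap_type, "spm") == 0) {
        return SKETCH_E820_SOFT_RESERVED;
    }
    if (strcmp(memmap_type, "reserved") == 0) {
        return SKETCH_E820_RESERVED;
    }
    return 0;
}
```

Type 0xEFFFFFFF is the value the guest dmesg in the cover letter reports
as "soft reserved".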

Thanks for doing this! 

Jonathan



* Re: [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-02-26 10:50 ` [PATCH v6 1/1] " fanhuang
  2026-02-27 20:34   ` David Hildenbrand
  2026-03-04 17:19   ` Gregory Price
@ 2026-03-05 21:06   ` Gregory Price
  2026-03-06  5:48     ` Huang, FangSheng (Jerry)
  2 siblings, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-03-05 21:06 UTC (permalink / raw)
  To: fanhuang
  Cc: qemu-devel, david, imammedo, jonathan.cameron, apopple,
	dan.j.williams, Zhigang.Luo, Lianjie.Shi

On Thu, Feb 26, 2026 at 06:50:23PM +0800, fanhuang wrote:
> +
> +            if (addr < x86ms->below_4g_mem_size) {
> +                if (addr + node_size <= x86ms->below_4g_mem_size) {
> +                    guest_addr = addr;
> +                } else {
> +                    error_report("NUMA node %d with memmap-type spans across "
> +                                 "4GB boundary, not supported", i);
> +                    exit(EXIT_FAILURE);
> +                }
> +            } else {
> +                guest_addr = 0x100000000ULL +
> +                            (addr - x86ms->below_4g_mem_size);
> +            }
> +

I missed this on my first go around

Should this be:

  guest_addr = x86ms->above_4g_mem_start +
              (addr - x86ms->below_4g_mem_size);

Or is there a reason for the hard-code?

~Gregory



* Re: [PATCH v6 1/1] numa: add 'memmap-type' option for memory type configuration
  2026-03-05 21:06   ` Gregory Price
@ 2026-03-06  5:48     ` Huang, FangSheng (Jerry)
  0 siblings, 0 replies; 10+ messages in thread
From: Huang, FangSheng (Jerry) @ 2026-03-06  5:48 UTC (permalink / raw)
  To: Gregory Price
  Cc: qemu-devel, david, imammedo, jonathan.cameron, apopple,
	dan.j.williams, Zhigang.Luo, Lianjie.Shi



On 3/6/2026 5:06 AM, Gregory Price wrote:
> On Thu, Feb 26, 2026 at 06:50:23PM +0800, fanhuang wrote:
>> +
>> +            if (addr < x86ms->below_4g_mem_size) {
>> +                if (addr + node_size <= x86ms->below_4g_mem_size) {
>> +                    guest_addr = addr;
>> +                } else {
>> +                    error_report("NUMA node %d with memmap-type spans across "
>> +                                 "4GB boundary, not supported", i);
>> +                    exit(EXIT_FAILURE);
>> +                }
>> +            } else {
>> +                guest_addr = 0x100000000ULL +
>> +                            (addr - x86ms->below_4g_mem_size);
>> +            }
>> +
> 
> I missed this on my first go around
> 
> Should this be:
> 
>    guest_addr = x86ms->above_4g_mem_start +
>                (addr - x86ms->below_4g_mem_size);
> 
> Or is there a reason for the hard-code?
> 
> ~Gregory

Hi Gregory,

Good catch, thanks for spotting this!

You're right, there's no reason to hardcode 4 GiB here. On AMD
hosts with IOMMU, above_4g_mem_start gets relocated to above 1 TB,
so the hardcoded value would produce a wrong address.

It should be:

   guest_addr = x86ms->above_4g_mem_start +
               (addr - x86ms->below_4g_mem_size);

I'll send a v7 shortly with this fix.
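
To make the difference concrete, here is a minimal standalone sketch of
the translation discussed above. It is not the v7 code: MemLayout and
ram_offset_to_gpa() are hypothetical names, and the struct only mirrors
the two x86ms fields mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two X86MachineState fields used here. */
typedef struct {
    uint64_t below_4g_mem_size;   /* amount of RAM mapped below 4 GiB */
    uint64_t above_4g_mem_start;  /* guest address where high RAM begins */
} MemLayout;

/*
 * Translate an offset into machine RAM into a guest physical address.
 * Returns false if the node straddles the low/high split, which the
 * patch rejects with an error.
 */
static bool ram_offset_to_gpa(const MemLayout *l, uint64_t addr,
                              uint64_t node_size, uint64_t *gpa)
{
    assert(l != NULL && gpa != NULL);

    if (addr < l->below_4g_mem_size) {
        if (addr + node_size > l->below_4g_mem_size) {
            return false;  /* spans the boundary, not supported */
        }
        *gpa = addr;       /* low RAM is identity-mapped */
        return true;
    }

    /* Use the machine's high-memory base instead of a fixed 4 GiB. */
    *gpa = l->above_4g_mem_start + (addr - l->below_4g_mem_size);
    return true;
}
```

For example, with below_4g_mem_size = 2 GiB and above_4g_mem_start
relocated to 1 TiB, a RAM offset of 2 GiB translates to 0x10000000000,
whereas the hardcoded form would have produced 0x100000000.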

Thanks,
Jerry Huang



Thread overview: 10+ messages
2026-02-26 10:50 [PATCH v6 0/1] numa: add 'memmap-type' option for memory type configuration fanhuang
2026-02-26 10:50 ` [PATCH v6 1/1] " fanhuang
2026-02-27 20:34   ` David Hildenbrand
2026-03-02  9:01     ` Huang, FangSheng (Jerry)
2026-03-04 17:16       ` David Hildenbrand
2026-03-04 17:19   ` Gregory Price
2026-03-05 10:39     ` Huang, FangSheng (Jerry)
2026-03-05 21:06   ` Gregory Price
2026-03-06  5:48     ` Huang, FangSheng (Jerry)
2026-03-05 12:32 ` [PATCH v6 0/1] " Jonathan Cameron via qemu development
