public inbox for linux-cxl@vger.kernel.org
* [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices
@ 2025-11-27 22:55 Alireza Sanaee
  2025-11-27 22:55 ` [RFC PATCH 1/7] hw/mem: Add tagged memory backend object Alireza Sanaee
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Applications may need memory for specific purposes. For example, a
database application may want to allocate memory that is optimized for
workloads involving large datasets and frequent read/write operations.
Or there might be a large read-only dataset that can be mapped to a CXL
Type 3 device.

There must be a way to feed this memory into the VMs. This series
introduces a tagged memory backend object that allows tagging memory
regions with application specific tags. The tagged memory regions can
then be used as backing memory for CXL Type 3 devices.

This series includes the following changes:
 - A new tagged memory backend object is introduced that allows
   allocating and managing memory regions with specific tags.
 - The CXL Type 3 device implementation is modified to support using
   tagged memory regions as backing memory.
 - New QAPI commands and structures are added to facilitate the
   management of tagged memory regions and their association with CXL
   Type 3 devices.
 - The CXL extent management code is updated to handle tagged memory
   regions appropriately, including lazy loading and direct mapping
   optimizations.

# Assumptions:

1) Each extent must be mapped entirely to a single tagged memory
backend.
2) Punching holes in extents is not supported, and not allowed.
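
The two assumptions above can be sketched as validation checks. This is
only an illustration, not code from the series: struct backend, struct
extent, extent_fits_backend() and release_covers_whole_extent() are
hypothetical names. The size-equality rule mirrors the check in patch 2
that rejects an extent whose length differs from the backend size, and
"no hole punching" is read here as "a release must cover the whole
extent".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; not the QEMU types used by the series. */
struct backend {
    const char *tag;   /* UUID string identifying the backend */
    uint64_t size;     /* backend size in bytes */
};

struct extent {
    const char *tag;   /* tag requested for this extent */
    uint64_t len;      /* extent length in bytes */
};

/*
 * Assumption 1: an extent maps entirely onto exactly one backend,
 * so the tags must match and the lengths must be equal.
 */
static bool extent_fits_backend(const struct extent *e,
                                const struct backend *b)
{
    assert(e && b);
    return strcmp(e->tag, b->tag) == 0 && e->len == b->size;
}

/*
 * Assumption 2: no hole punching. A release request is accepted only
 * if it covers the whole extent [ext_off, ext_off + ext_len).
 */
static bool release_covers_whole_extent(uint64_t ext_off, uint64_t ext_len,
                                        uint64_t rel_off, uint64_t rel_len)
{
    return rel_off == ext_off && rel_len == ext_len;
}
```

With the 1 GiB backends used in the commands below, a 1 GiB extent with
a matching tag passes, while a 512 MiB extent or a mismatched tag does
not.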

# Diagram that illustrates the design:

                  VM
                  |
                  | FMW.0
                  v
                +--------------------+
                |   CXL Type-3 dev   |
                |  +--------------+  |
                |  | extent0 T0   |----> backend0 (tag T0)
                |  | extent1 T1   |----> backend1 (tag T1)
                |  | extent2 T2   |----> backend2 (tag T2)
                |  +--------------+  |
                +--------------------+


# Tested scenario:

Created two different tagged memory backends with different tags at
runtime with QMP commands, and then added two different extents that
cover the whole memory backends specified by tags.

# Changes to the kernel:

UUIDs/tags must be allowed in the DCD patchset available online from
Ira, on which this series depends [1].

# Commands used:

## First memory backend with tag 5be13bce-ae34-4a77-b6c3-16df975fcf1a:

{
    "execute": "object-add",
    "arguments": {
        "qom-type": "memory-backend-tagged",
        "id": "tm0",
        "size": 1073741824,
        "tag": "5be13bce-ae34-4a77-b6c3-16df975fcf1a"
    }
}

## Second memory backend with tag 6be13bce-ae34-4a77-b6c3-16df975fcf1a:

{
    "execute": "object-add",
    "arguments": {
        "qom-type": "memory-backend-tagged",
        "id": "tm1",
        "size": 1073741824,
        "tag": "6be13bce-ae34-4a77-b6c3-16df975fcf1a"
    }
}

## Add capacity extent with tag 5be13bce-ae34-4a77-b6c3-16df975fcf1a:

{
    "execute": "cxl-add-dynamic-capacity",
    "arguments": {
        "path": "/machine/peripheral/cxl-vmem0",
        "host-id": 0,
        "selection-policy": "prescriptive",
        "region": 0,
        "tag": "5be13bce-ae34-4a77-b6c3-16df975fcf1a",
        "extents": [
            {
                "offset": 0,
                "len": 1073741824
            }
        ]
    }
}

## Add capacity extent with tag 6be13bce-ae34-4a77-b6c3-16df975fcf1a:

{
    "execute": "cxl-add-dynamic-capacity",
    "arguments": {
        "path": "/machine/peripheral/cxl-vmem0",
        "host-id": 0,
        "selection-policy": "prescriptive",
        "region": 0,
        "tag": "6be13bce-ae34-4a77-b6c3-16df975fcf1a",
        "extents": [
            {
                "offset": 1073741824,
                "len": 1073741824
            }
        ]
    }
}

## Release capacity extent with tag 5be13bce-ae34-4a77-b6c3-16df975fcf1a:

{
    "execute": "cxl-release-dynamic-capacity",
    "arguments": {
        "path": "/machine/peripheral/cxl-vmem0",
        "host-id": 0,
        "removal-policy": "tag-based",
        "tag": "5be13bce-ae34-4a77-b6c3-16df975fcf1a",
        "region": 0,
        "extents": [
            {
                "offset": 0,
                "len": 1073741824
            }
        ]
    }
}

## Release capacity extent with tag 6be13bce-ae34-4a77-b6c3-16df975fcf1a:

{
    "execute": "cxl-release-dynamic-capacity",
    "arguments": {
        "path": "/machine/peripheral/cxl-vmem0",
        "host-id": 0,
        "removal-policy": "tag-based",
        "tag": "6be13bce-ae34-4a77-b6c3-16df975fcf1a",
        "region": 0,
        "extents": [
            {
                "offset": 1073741824,
                "len": 1073741824
            }
        ]
    }
}

## Checking if capacity extents are removed successfully:

{
    "execute": "cxl-release-dynamic-capacity-status",
    "arguments": {
        "path": "/machine/peripheral/cxl-vmem0",
        "host-id": 0,
        "tag": "6be13bce-ae34-4a77-b6c3-16df975fcf1a",
        "region": 0
    }
}

Response:
{
    "return": {
        "status": "not-found"
    }
}

# QEMU Command:

$ qemu-system-x86_64 \
  -cpu max \
  -smp 8 \
  -drive file=debian12.qcow2,format=qcow2,if=none,id=mydrive0,index=0 \
  -device virtio-blk-pci,drive=mydrive0 \
  -kernel bzImage \
  -append "console=ttyS0,115200 TERM=linux root=/dev/vda1 nokaslr ignore_loglevel \
           fsck.mode=skip cxl_acpi.dyndbg=+fplm cxl_pci.dyndbg=+fplm cxl" \
  -nographic \
  -serial mon:stdio \
  -machine type=q35,accel=tcg \
  -virtfs local,path=hostshare/,mount_tag=hostshare,security_model=passthrough,id=hostshare \
  -qmp tcp:localhost:4444,server,wait=off \
  -netdev user,id=network0,hostfwd=tcp::2025-:22 \
  -device virtio-net,netdev=network0 \
  -m 12G,maxmem=20G,slots=10 \
  -object memory-backend-ram,id=vmem0,share=on,size=2G \
  -device pxb-cxl,numa_node=0,bus_nr=23,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
  -M cxl=on,cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=6G \
  -device cxl-type3,bus=root_port13,id=cxl-vmem0,num-dc-regions=1,dc-regions-total-size=4G

BASE: 2a3af116a78e1adceb24521d76199b97f08c0f1d

Depends-on:
https://lore.kernel.org/linux-cxl/20251013160151.000039dd.alireza.sanaee@huawei.com/

Depends-on:
https://lore.kernel.org/all/20250413-dcd-type2-upstream-v9-0-1d4911a0b365@intel.com/

[1] https://github.com/sarsanaee/linux/tree/allow_uuid_ira

Alireza Sanaee (7):
  hw/mem: Add tagged memory backend object
  hw/cxl: Allow initializing type3 device with no backing device
  hw/cxl: Change Extent add/remove APIs for lazy memory backend.
  hw/cxl: Map lazy memory backend after host acceptance
  hw/cxl: Add performant direct mapping for extents
  hw/cxl: Add remove alias functionality for extent direct mapping
  hw/cxl: Add tag-based removal functionality

 hw/cxl/cxl-host.c           |   6 +
 hw/cxl/cxl-mailbox-utils.c  | 190 +++++++++++++++++++--
 hw/mem/cxl_type3.c          | 326 ++++++++++++++++++++++++++++++------
 hw/mem/meson.build          |   1 +
 hw/mem/tagged_mem.c         | 116 +++++++++++++
 include/hw/cxl/cxl_device.h |  44 ++++-
 include/hw/mem/tagged_mem.h |  31 ++++
 qapi/cxl.json               |  46 +++++
 qapi/qom.json               |  15 ++
 9 files changed, 707 insertions(+), 68 deletions(-)
 create mode 100644 hw/mem/tagged_mem.c
 create mode 100644 include/hw/mem/tagged_mem.h

-- 
2.43.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 1/7] hw/mem: Add tagged memory backend object
  2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
@ 2025-11-27 22:55 ` Alireza Sanaee
  2026-02-06 12:16   ` Jonathan Cameron
  2025-11-27 22:55 ` [RFC PATCH 2/7] hw/cxl: Allow initializing type3 device with no backing device Alireza Sanaee
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Add a new memory-backend-tagged object that supports a tag property, so
that the backend can be looked up by its tag. This is useful for
scenarios where you want to add a piece of memory for a particular
purpose and pass it to another device.
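
As a rough illustration of the lookup-by-tag idea, the sketch below does
a linear search over a plain array. It is not the code in this patch,
which walks the QOM object tree instead: struct tagged_backend and
find_backend_by_tag() are hypothetical names used only here. Like the
callback in the patch, it stops at the first match, so duplicate tags
are not detected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tagged memory backend. */
struct tagged_backend {
    const char *id;    /* object id, e.g. "tm0" */
    const char *tag;   /* user-supplied tag, may be NULL if unset */
};

/*
 * Return the first backend whose tag equals 'tag', or NULL if none
 * matches. Backends without a tag are skipped.
 */
static const struct tagged_backend *
find_backend_by_tag(const struct tagged_backend *list, size_t n,
                    const char *tag)
{
    assert(tag);
    for (size_t i = 0; i < n; i++) {
        if (list[i].tag && strcmp(list[i].tag, tag) == 0) {
            return &list[i];
        }
    }
    return NULL;
}
```

A caller has to handle the NULL case explicitly, since an unknown tag is
an expected condition when backends are created at runtime.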

At the moment, this only supports a RAM-backed object to which we add a
tag, and it is temporary. However, we are planning a generalized
approach. The plan is to have a shim object that carries the tag and
can later be linked to any backend object type.

Example use via the QMP API:
{
    "execute": "object-add",
    "arguments": {
        "qom-type": "memory-backend-tagged",
        "id": "tm0",
        "size": 1073741824,
        "tag": "6be13bce-ae34-4a77-b6c3-16df975fcf1a"
    }
}

Tags are assumed to be UUIDs, but this is open for debate.
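
To make the UUID assumption concrete, here is a minimal standalone
sketch that checks a tag string for the canonical 8-4-4-4-12 form and
converts it to 16 raw bytes. It is an illustration only:
parse_uuid_tag() and hex_val() are hypothetical helpers, not functions
from this series (patch 2 uses qemu_uuid_parse() for this purpose).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_STR_LEN 36   /* 32 hex digits plus 4 dashes */
#define UUID_BYTES   16

/* Map one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    return -1;
}

/* Parse "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into 16 bytes. */
static bool parse_uuid_tag(const char *s, uint8_t out[UUID_BYTES])
{
    size_t i = 0, n = 0;

    assert(out);
    if (!s || strlen(s) != UUID_STR_LEN) {
        return false;
    }

    while (i < UUID_STR_LEN) {
        /* Dashes are only allowed at these four positions. */
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (s[i] != '-') {
                return false;
            }
            i++;
            continue;
        }
        int hi = hex_val(s[i]);
        int lo = hex_val(s[i + 1]);
        if (hi < 0 || lo < 0) {
            return false;
        }
        out[n++] = (uint8_t)((hi << 4) | lo);
        i += 2;
    }
    return n == UUID_BYTES;
}
```

Rejecting malformed strings up front would keep a free-form tag from
being silently turned into an all-zero or partial 16-byte value.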

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 hw/mem/meson.build          |   1 +
 hw/mem/tagged_mem.c         | 116 ++++++++++++++++++++++++++++++++++++
 include/hw/mem/tagged_mem.h |  31 ++++++++++
 qapi/qom.json               |  15 +++++
 4 files changed, 163 insertions(+)
 create mode 100644 hw/mem/tagged_mem.c
 create mode 100644 include/hw/mem/tagged_mem.h

diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index 1c1c6da24b..529d86f840 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -10,3 +10,4 @@ system_ss.add(when: 'CONFIG_MEM_DEVICE', if_false: files('memory-device-stubs.c'
 system_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
 
 system_ss.add(when: 'CONFIG_SPARSE_MEM', if_true: files('sparse-mem.c'))
+system_ss.add(files('tagged_mem.c'))
diff --git a/hw/mem/tagged_mem.c b/hw/mem/tagged_mem.c
new file mode 100644
index 0000000000..27b88e845e
--- /dev/null
+++ b/hw/mem/tagged_mem.c
@@ -0,0 +1,116 @@
+/*
+ * Tagged memory backend. Temporary implementation for testing purposes and
+ * only supports RAM based.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/mem/tagged_mem.h"
+#include "qapi/error.h"
+#include "qemu/module.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qom/object.h"
+#include "qom/qom-qobject.h"
+
+static int check_property_equals_test(Object *obj, void *opaque)
+{
+    Error *err = NULL;
+    struct TagSearchContext *ctx = opaque;
+    g_autofree char *value = NULL;
+
+    if (!object_dynamic_cast(OBJECT(obj), TYPE_MEMORY_BACKEND_TAGGED)) {
+        return 0;
+    }
+
+    value = object_property_get_str(obj, "tag", &err);
+    if (err) {
+        error_report_err(err);
+        return 0;
+    }
+
+    if (strcmp(value, ctx->tag_value) == 0) {
+        ctx->result = MEMORY_BACKEND(obj);
+        return 1;
+    }
+
+    return 0;
+}
+
+HostMemoryBackend *memory_backend_tagged_find_by_tag(const char *tag,
+                                                     Error **errp)
+{
+    struct TagSearchContext ctx = {
+        .tag_value = tag,
+        .result = NULL,
+    };
+
+    object_child_foreach_recursive(object_get_objects_root(),
+                                   check_property_equals_test, &ctx);
+
+    if (!ctx.result) {
+        qemu_log("didn't find any results!\n");
+        return NULL;
+    }
+
+    return ctx.result;
+}
+
+static void tagged_mem_set_tag(Object *obj, const char *value, Error **errp)
+{
+    MemoryBackendTagged *tm = MEMORY_BACKEND_TAGGED(obj);
+    g_free(tm->tag);
+    tm->tag = g_strdup(value);
+}
+
+static char *tagged_mem_get_tag(Object *obj, Error **errp)
+{
+    MemoryBackendTagged *tm = MEMORY_BACKEND_TAGGED(obj);
+    return g_strdup(tm->tag);
+}
+
+static bool ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
+{
+    g_autofree char *name = NULL;
+    uint32_t ram_flags;
+
+    if (!backend->size) {
+        error_setg(errp, "can't create backend with size 0");
+        return false;
+    }
+
+    name = host_memory_backend_get_name(backend);
+    ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
+    ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    return memory_region_init_ram_flags_nomigrate(
+        &backend->mr, OBJECT(backend), name, backend->size, ram_flags, errp);
+}
+
+static void memory_backend_tagged_class_init(ObjectClass *oc, const void *data)
+{
+    HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
+
+    bc->alloc = ram_backend_memory_alloc;
+    object_class_property_add_str(oc, "tag", tagged_mem_get_tag,
+                                  tagged_mem_set_tag);
+    object_class_property_set_description(oc, "tag",
+        "A user-defined tag to identify this memory backend");
+}
+
+static const TypeInfo memory_backend_tagged_info = {
+    .name = TYPE_MEMORY_BACKEND_TAGGED,
+    .parent = TYPE_MEMORY_BACKEND,
+    .instance_size = sizeof(MemoryBackendTagged),
+    .class_init = memory_backend_tagged_class_init,
+};
+
+static void memory_backend_tagged_register_types(void)
+{
+    type_register_static(&memory_backend_tagged_info);
+}
+
+type_init(memory_backend_tagged_register_types);
diff --git a/include/hw/mem/tagged_mem.h b/include/hw/mem/tagged_mem.h
new file mode 100644
index 0000000000..4f3b033597
--- /dev/null
+++ b/include/hw/mem/tagged_mem.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Tagged Memory backend
+ *
+ * Copyright (c) 2025 Alireza Sanaee <alireza.sanaee@huawei.com>
+ */
+#ifndef HW_TAGGED_MEM_H
+#define HW_TAGGED_MEM_H
+
+#include "hw/qdev-core.h"
+#include "system/memory.h"
+#include "system/hostmem.h"
+
+#define TYPE_MEMORY_BACKEND_TAGGED "memory-backend-tagged"
+OBJECT_DECLARE_SIMPLE_TYPE(MemoryBackendTagged, MEMORY_BACKEND_TAGGED)
+
+typedef struct MemoryBackendTagged {
+    HostMemoryBackend parent_obj;
+
+    char *tag;
+} MemoryBackendTagged;
+
+struct TagSearchContext {
+    const char *tag_value;
+    HostMemoryBackend *result;
+};
+
+HostMemoryBackend *memory_backend_tagged_find_by_tag(const char *tag,
+                                                     Error **errp);
+
+#endif
diff --git a/qapi/qom.json b/qapi/qom.json
index 830cb2ffe7..96d0184864 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -687,6 +687,19 @@
             'size': 'size',
             '*x-use-canonical-path-for-ramblock-id': 'bool' } }
 
+##
+# @MemoryBackendTaggedProperties:
+#
+# Properties for objects of classes derived from memory-backend.
+#
+# @tag: Memory tag
+#
+# Since: 11.0
+##
+{ 'struct': 'MemoryBackendTaggedProperties',
+  'base': 'MemoryBackendProperties',
+  'data': { '*tag': 'str' } }
+
 ##
 # @MemoryBackendFileProperties:
 #
@@ -1218,6 +1231,7 @@
     { 'name': 'memory-backend-memfd',
       'if': 'CONFIG_LINUX' },
     'memory-backend-ram',
+    'memory-backend-tagged',
     { 'name': 'memory-backend-shm',
       'if': 'CONFIG_POSIX' },
     'pef-guest',
@@ -1296,6 +1310,7 @@
       'memory-backend-memfd':       { 'type': 'MemoryBackendMemfdProperties',
                                       'if': 'CONFIG_LINUX' },
       'memory-backend-ram':         'MemoryBackendProperties',
+      'memory-backend-tagged':      'MemoryBackendTaggedProperties',
       'memory-backend-shm':         { 'type': 'MemoryBackendShmProperties',
                                       'if': 'CONFIG_POSIX' },
       'pr-manager-helper':          { 'type': 'PrManagerHelperProperties',
-- 
2.43.0



* [RFC PATCH 2/7] hw/cxl: Allow initializing type3 device with no backing device
  2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
  2025-11-27 22:55 ` [RFC PATCH 1/7] hw/mem: Add tagged memory backend object Alireza Sanaee
@ 2025-11-27 22:55 ` Alireza Sanaee
  2026-02-06 12:28   ` Jonathan Cameron
  2025-11-27 22:55 ` [RFC PATCH 3/7] hw/cxl: Change Extent add/remove APIs for lazy memory backend Alireza Sanaee
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Allow creating a type3 device without any backing device for DC. In
Dynamic Capacity scenarios, memory can show up asynchronously and can
come from different resources: RAM, PMEM, or file-backed memory. For
these cases, only one parameter is needed, the total size of the DC
regions, which is exposed as dc-regions-total-size.
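
The size selection this patch introduces can be sketched as follows. It
is a simplified, standalone illustration rather than the patch code:
dc_region_len() and DC_ALIGN are hypothetical names, and the 256 MiB
granularity is an assumption taken from the "256M aligned" comments in
cxl_device.h. A non-zero size property (dc-regions-total-size in the
property table of this patch) takes precedence over the size of a
statically attached backend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed capacity granularity (256 MiB), per the "256M aligned" note. */
#define DC_ALIGN (256ULL << 20)

/*
 * Pick the total DC size and split it evenly across the regions.
 * total_size_prop: value of the size property, 0 if unset.
 * backend_size:    size of a statically attached backend, 0 if none.
 * Returns false if the resulting size is zero or is not a multiple of
 * num_regions * DC_ALIGN.
 */
static bool dc_region_len(uint64_t total_size_prop, uint64_t backend_size,
                          uint8_t num_regions, uint64_t *region_len)
{
    uint64_t dc_size;

    assert(region_len);
    if (num_regions == 0) {
        return false;
    }

    dc_size = total_size_prop ? total_size_prop : backend_size;
    if (dc_size == 0 || dc_size % (num_regions * DC_ALIGN) != 0) {
        return false;
    }

    *region_len = dc_size / num_regions;
    return true;
}
```

For the command line in the cover letter (num-dc-regions=1 with a 4G
total size and no static backend), this yields a single 4 GiB region.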

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 hw/mem/cxl_type3.c          | 157 ++++++++++++++++++++++++++----------
 include/hw/cxl/cxl_device.h |   1 +
 2 files changed, 115 insertions(+), 43 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 8cdb3bff7e..690b3ab658 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -30,6 +30,7 @@
 #include "system/numa.h"
 #include "hw/cxl/cxl.h"
 #include "hw/pci/msix.h"
+#include "hw/mem/tagged_mem.h"
 
 /* type3 device private */
 enum CXL_T3_MSIX_VECTOR {
@@ -190,12 +191,15 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
     }
 
     if (ct3d->dc.num_regions) {
-        if (!ct3d->dc.host_dc) {
-            return -EINVAL;
-        }
-        dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
-        if (!dc_mr) {
-            return -EINVAL;
+        /* Only check if DC is static */
+        if (ct3d->dc.total_capacity_cmd == 0) {
+            if (!ct3d->dc.host_dc) {
+                return -EINVAL;
+            }
+            dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+            if (!dc_mr) {
+                return -EINVAL;
+            }
         }
         len += CT3_CDAT_NUM_ENTRIES * ct3d->dc.num_regions;
     }
@@ -216,7 +220,7 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
         cur_ent += CT3_CDAT_NUM_ENTRIES;
     }
 
-    if (dc_mr) {
+    if (dc_mr || ct3d->dc.total_capacity_cmd) {
         int i;
         uint64_t region_base = vmr_size + pmr_size;
 
@@ -651,8 +655,13 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
     MemoryRegion *mr;
     uint64_t dc_size;
 
-    mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
-    dc_size = memory_region_size(mr);
+    if (ct3d->dc.total_capacity_cmd != 0) {
+        dc_size = ct3d->dc.total_capacity_cmd;
+    } else {
+        mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+        dc_size = memory_region_size(mr);
+    }
+
     region_len = DIV_ROUND_UP(dc_size, ct3d->dc.num_regions);
 
     if (dc_size % (ct3d->dc.num_regions * CXL_CAPACITY_MULTIPLIER) != 0) {
@@ -810,39 +819,43 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
 
     ct3d->dc.total_capacity = 0;
     if (ct3d->dc.num_regions > 0) {
-        MemoryRegion *dc_mr;
-        char *dc_name;
+        if (ct3d->dc.total_capacity_cmd == 0) {
+            MemoryRegion *dc_mr;
+            char *dc_name;
 
-        if (!ct3d->dc.host_dc) {
-            error_setg(errp, "dynamic capacity must have a backing device");
-            return false;
-        }
+            if (!ct3d->dc.host_dc) {
+                error_setg(errp, "dynamic capacity must have a backing device");
+                return false;
+            }
 
-        dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
-        if (!dc_mr) {
-            error_setg(errp, "dynamic capacity must have a backing device");
-            return false;
-        }
+            dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+            if (!dc_mr) {
+                error_setg(errp, "dynamic capacity must have a backing device");
+                return false;
+            }
 
-        if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
-            error_setg(errp, "memory backend %s can't be used multiple times.",
-               object_get_canonical_path_component(OBJECT(ct3d->dc.host_dc)));
-            return false;
-        }
-        /*
-         * Set DC regions as volatile for now, non-volatile support can
-         * be added in the future if needed.
-         */
-        memory_region_set_nonvolatile(dc_mr, false);
-        memory_region_set_enabled(dc_mr, true);
-        host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
-        if (ds->id) {
-            dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
-        } else {
-            dc_name = g_strdup("cxl-dcd-dpa-dc-space");
+            if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
+                error_setg(errp,
+                           "memory backend %s can't be used multiple times.",
+                           object_get_canonical_path_component(
+                               OBJECT(ct3d->dc.host_dc)));
+                return false;
+            }
+            /*
+             * Set DC regions as volatile for now, non-volatile support can
+             * be added in the future if needed.
+             */
+            memory_region_set_nonvolatile(dc_mr, false);
+            memory_region_set_enabled(dc_mr, true);
+            host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
+            if (ds->id) {
+                dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
+            } else {
+                dc_name = g_strdup("cxl-dcd-dpa-dc-space");
+            }
+            address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
+            g_free(dc_name);
         }
-        address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
-        g_free(dc_name);
 
         if (!cxl_create_dc_regions(ct3d, errp)) {
             error_append_hint(errp, "setup DC regions failed");
@@ -1284,6 +1297,8 @@ static const Property ct3_props[] = {
     DEFINE_PROP_UINT8("num-dc-regions", CXLType3Dev, dc.num_regions, 0),
     DEFINE_PROP_LINK("volatile-dc-memdev", CXLType3Dev, dc.host_dc,
                      TYPE_MEMORY_BACKEND, HostMemoryBackend *),
+    DEFINE_PROP_SIZE("dc-regions-total-size", CXLType3Dev,
+                     dc.total_capacity_cmd, 0),
     DEFINE_PROP_PCIE_LINK_SPEED("x-speed", CXLType3Dev,
                                 speed, PCIE_LINK_SPEED_32),
     DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLType3Dev,
@@ -1952,12 +1967,38 @@ bool cxl_extent_groups_overlaps_dpa_range(CXLDCExtentGroupList *list,
     return false;
 }
 
+static bool cxl_device_lazy_dynamic_capacity_init(CXLType3Dev *ct3d,
+                                                  const char *tag, Error **errp)
+{
+    MemoryRegion *dc_mr;
+
+    ct3d->dc.host_dc = memory_backend_tagged_find_by_tag(tag, errp);
+    if (!ct3d->dc.host_dc) {
+        error_setg(errp, "dynamic capacity must have a backing device");
+        return false;
+    }
+
+    dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+    if (!dc_mr) {
+        error_setg(errp, "test dynamic capacity must have a backing device");
+        return false;
+    }
+
+    if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
+        qemu_log("Warning: memory backend %s is already mapped. Reusing it.\n",
+               object_get_canonical_path_component(OBJECT(ct3d->dc.host_dc)));
+        return true;
+    }
+
+    return true;
+}
+
 /*
  * The main function to process dynamic capacity event with extent list.
  * Currently DC extents add/release requests are processed.
  */
 static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
-        uint16_t hid, CXLDCEventType type, uint8_t rid,
+        uint16_t hid, CXLDCEventType type, uint8_t rid, const char *tag,
         CxlDynamicCapacityExtentList *records, Error **errp)
 {
     Object *obj;
@@ -1966,8 +2007,10 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
     CxlDynamicCapacityExtentList *list;
     CXLDCExtentGroup *group = NULL;
     g_autofree CXLDCExtentRaw *extents = NULL;
-    uint64_t dpa, offset, len, block_size;
+    uint64_t dpa, offset, block_size;
+    uint64_t len = 0;
     g_autofree unsigned long *blk_bitmap = NULL;
+    QemuUUID uuid;
     int i;
 
     obj = object_resolve_path_type(path, TYPE_CXL_TYPE3, NULL);
@@ -1996,6 +2039,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
         offset = list->value->offset;
         len = list->value->len;
         dpa = offset + dcd->dc.regions[rid].base;
+        qemu_uuid_parse(tag, &uuid);
 
         if (len == 0) {
             error_setg(errp, "extent with 0 length is not allowed");
@@ -2049,6 +2093,31 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
         num_extents++;
     }
 
+    if (type == DC_EVENT_ADD_CAPACITY && dcd->dc.total_capacity_cmd) {
+        MemoryRegion *host_dc_mr;
+        uint64_t size;
+
+        if (num_extents > 1) {
+            error_setg(errp, "Only single extent add is supported currently");
+            return;
+        }
+
+        if (!cxl_device_lazy_dynamic_capacity_init(dcd, tag, errp)) {
+            return;
+        }
+
+        host_dc_mr = host_memory_backend_get_memory(dcd->dc.host_dc);
+        size = memory_region_size(host_dc_mr);
+
+        if (size != len) {
+            error_setg(errp,
+                       "Host memory backend size 0x%" PRIx64
+                       " does not match extent length 0x%" PRIx64,
+                       size, len);
+            return;
+        }
+    }
+
     /* Create extent list for event being passed to host */
     i = 0;
     list = records;
@@ -2060,7 +2129,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
 
         extents[i].start_dpa = dpa;
         extents[i].len = len;
-        memset(extents[i].tag, 0, 0x10);
+        memcpy(extents[i].tag, &uuid.data, 0x10);
         extents[i].shared_seq = 0;
         if (type == DC_EVENT_ADD_CAPACITY) {
             group = cxl_insert_extent_to_extent_group(group,
@@ -2091,7 +2160,8 @@ void qmp_cxl_add_dynamic_capacity(const char *path, uint16_t host_id,
     case CXL_EXTENT_SELECTION_POLICY_PRESCRIPTIVE:
         qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id,
                                                       DC_EVENT_ADD_CAPACITY,
-                                                      region, extents, errp);
+                                                      region, tag, extents,
+                                                      errp);
         return;
     default:
         error_setg(errp, "Selection policy not supported");
@@ -2122,7 +2192,8 @@ void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
     switch (removal_policy) {
     case CXL_EXTENT_REMOVAL_POLICY_PRESCRIPTIVE:
         qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id, type,
-                                                      region, extents, errp);
+                                                      region, tag, extents,
+                                                      errp);
         return;
     default:
         error_setg(errp, "Removal policy not supported");
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 1d199d035e..bfd1a97e03 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -633,6 +633,7 @@ struct CXLType3Dev {
          * memory region size.
          */
         uint64_t total_capacity; /* 256M aligned */
+        uint64_t total_capacity_cmd; /* 256M aligned */
         CXLDCExtentList extents;
         CXLDCExtentGroupList extents_pending;
         uint32_t total_extent_count;
-- 
2.43.0



* [RFC PATCH 3/7] hw/cxl: Change Extent add/remove APIs for lazy memory backend.
  2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
  2025-11-27 22:55 ` [RFC PATCH 1/7] hw/mem: Add tagged memory backend object Alireza Sanaee
  2025-11-27 22:55 ` [RFC PATCH 2/7] hw/cxl: Allow initializing type3 device with no backing device Alireza Sanaee
@ 2025-11-27 22:55 ` Alireza Sanaee
  2026-02-06 12:30   ` Jonathan Cameron
  2025-11-27 22:55 ` [RFC PATCH 4/7] hw/cxl: Map lazy memory backend after host acceptance Alireza Sanaee
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Add extra information to each extent about its fixed memory window and
memory backend. Extents may be backed by different memory backends, so
this information must be stored in the extent object itself.
Consequently, the APIs are changed to take the extra members.
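
A simplified picture of what the extended extent record and insert call
look like is given below. This is a standalone sketch with hypothetical
names (struct sketch_extent, sketch_insert_extent()) and a fixed-size
array in place of the QTAILQ used by the real code; only the hm, fw and
rid members correspond to what the diff below adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TAG_LEN     0x10
#define MAX_EXTENTS 8

/* Hypothetical, simplified extent record. */
struct sketch_extent {
    uint64_t start_dpa;
    uint64_t len;
    uint8_t tag[TAG_LEN];
    uint16_t shared_seq;
    const void *hm;   /* backing memory backend, NULL if not yet known */
    const void *fw;   /* fixed memory window, NULL if not yet known */
    int rid;          /* DC region index */
};

struct sketch_extent_list {
    struct sketch_extent ents[MAX_EXTENTS];
    size_t count;
};

/* Append an extent, recording its backend, window and region id. */
static bool sketch_insert_extent(struct sketch_extent_list *list,
                                 const void *hm, const void *fw,
                                 uint64_t dpa, uint64_t len,
                                 const uint8_t *tag, uint16_t shared_seq,
                                 int rid)
{
    struct sketch_extent *e;

    assert(list);
    if (list->count == MAX_EXTENTS) {
        return false;
    }

    e = &list->ents[list->count++];
    e->start_dpa = dpa;
    e->len = len;
    if (tag) {
        memcpy(e->tag, tag, TAG_LEN);
    } else {
        memset(e->tag, 0, TAG_LEN);   /* untagged extent */
    }
    e->shared_seq = shared_seq;
    e->hm = hm;
    e->fw = fw;
    e->rid = rid;
    return true;
}
```

Keeping the backend and window pointers in the extent lets a later step
map or unmap that extent without consulting device-wide state.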

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 hw/cxl/cxl-host.c           |  2 ++
 hw/cxl/cxl-mailbox-utils.c  | 42 ++++++++++++++++++++++++-------------
 hw/mem/cxl_type3.c          | 23 +++++++++++++++-----
 include/hw/cxl/cxl_device.h | 17 ++++++++++++---
 4 files changed, 61 insertions(+), 23 deletions(-)

diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c
index 3a563af3bc..7c8fde4646 100644
--- a/hw/cxl/cxl-host.c
+++ b/hw/cxl/cxl-host.c
@@ -365,6 +365,8 @@ static int cxl_fmws_direct_passthrough(Object *obj, void *opaque)
         return 0;
     }
 
+    ct3d->dc.fw = fw;
+
     if (state->commit) {
         MemoryRegion *mr = NULL;
         uint64_t vmr_size = 0, pmr_size = 0;
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 68c7cc9891..ae723c03ec 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -2840,10 +2840,13 @@ CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len)
 }
 
 void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
+                                             HostMemoryBackend *host_mem,
+                                             struct CXLFixedWindow *fw,
                                              uint64_t dpa,
                                              uint64_t len,
                                              uint8_t *tag,
-                                             uint16_t shared_seq)
+                                             uint16_t shared_seq,
+                                             int rid)
 {
     CXLDCExtent *extent;
 
@@ -2871,17 +2874,20 @@ void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
  * Return value: the extent group where the extent is inserted.
  */
 CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
+                                                    HostMemoryBackend *host_mem,
+                                                    struct CXLFixedWindow *fw,
                                                     uint64_t dpa,
                                                     uint64_t len,
                                                     uint8_t *tag,
-                                                    uint16_t shared_seq)
+                                                    uint16_t shared_seq,
+                                                    int rid)
 {
     if (!group) {
         group = g_new0(CXLDCExtentGroup, 1);
         QTAILQ_INIT(&group->list);
     }
-    cxl_insert_extent_to_extent_list(&group->list, dpa, len,
-                                     tag, shared_seq);
+    cxl_insert_extent_to_extent_list(&group->list, host_mem, fw, dpa, len,
+                                     tag, shared_seq, rid);
     return group;
 }
 
@@ -3062,7 +3068,9 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
         dpa = in->updated_entries[i].start_dpa;
         len = in->updated_entries[i].len;
 
-        cxl_insert_extent_to_extent_list(extent_list, dpa, len, NULL, 0);
+        cxl_insert_extent_to_extent_list(extent_list, NULL,
+                                         NULL, dpa, len,
+                                         NULL, 0, 0);
         ct3d->dc.total_extent_count += 1;
         ct3d->dc.nr_extents_accepted += 1;
         ct3_set_region_block_backed(ct3d, dpa, len);
@@ -3089,8 +3097,9 @@ static uint32_t copy_extent_list(CXLDCExtentList *dst,
     }
 
     QTAILQ_FOREACH(ent, src, node) {
-        cxl_insert_extent_to_extent_list(dst, ent->start_dpa, ent->len,
-                                         ent->tag, ent->shared_seq);
+        cxl_insert_extent_to_extent_list(dst, ent->hm, ent->fw, ent->start_dpa,
+                                         ent->len, ent->tag, ent->shared_seq,
+                                         ent->rid);
         cnt++;
     }
     return cnt;
@@ -3144,15 +3153,17 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
                     cnt_delta--;
 
                     if (len1) {
-                        cxl_insert_extent_to_extent_list(updated_list,
-                                                         ent_start_dpa,
-                                                         len1, NULL, 0);
+                        cxl_insert_extent_to_extent_list(updated_list, NULL,
+                                                         NULL, ent_start_dpa,
+                                                         len1, NULL, 0,
+                                                         ent->rid);
                         cnt_delta++;
                     }
                     if (len2) {
-                        cxl_insert_extent_to_extent_list(updated_list,
-                                                         dpa + len,
-                                                         len2, NULL, 0);
+                        cxl_insert_extent_to_extent_list(updated_list, NULL,
+                                                         NULL, dpa + len,
+                                                         len2, NULL, 0,
+                                                         ent->rid);
                         cnt_delta++;
                     }
 
@@ -3624,9 +3635,10 @@ static CXLRetCode cmd_fm_initiate_dc_add(const struct cxl_cmd *cmd,
             for (i = 0; i < in->ext_count; i++) {
                 CXLDCExtentRaw *ext = &in->extents[i];
 
-                group = cxl_insert_extent_to_extent_group(group, ext->start_dpa,
+                group = cxl_insert_extent_to_extent_group(group, NULL, NULL,
+                                                          ext->start_dpa,
                                                           ext->len, ext->tag,
-                                                          ext->shared_seq);
+                                                          ext->shared_seq, 0);
             }
 
             cxl_extent_group_list_insert_tail(&ct3d->dc.extents_pending, group);
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 690b3ab658..ef1c1cbaef 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -2132,11 +2132,24 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
         memcpy(extents[i].tag, &uuid.data, 0x10);
         extents[i].shared_seq = 0;
         if (type == DC_EVENT_ADD_CAPACITY) {
-            group = cxl_insert_extent_to_extent_group(group,
-                                                      extents[i].start_dpa,
-                                                      extents[i].len,
-                                                      extents[i].tag,
-                                                      extents[i].shared_seq);
+            if (!dcd->dc.total_capacity_cmd) {
+                group = cxl_insert_extent_to_extent_group(group,
+                                                          NULL, NULL,
+                                                          extents[i].start_dpa,
+                                                          extents[i].len,
+                                                          extents[i].tag,
+                                                          extents[i].shared_seq,
+                                                          rid);
+            } else {
+                group = cxl_insert_extent_to_extent_group(group,
+                                                          dcd->dc.host_dc,
+                                                          dcd->dc.fw,
+                                                          extents[i].start_dpa,
+                                                          extents[i].len,
+                                                          extents[i].tag,
+                                                          extents[i].shared_seq,
+                                                          rid);
+            }
         }
 
         list = list->next;
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index bfd1a97e03..fe0c44e8d7 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -10,6 +10,7 @@
 #ifndef CXL_DEVICE_H
 #define CXL_DEVICE_H
 
+#include "hw/cxl/cxl.h"
 #include "hw/cxl/cxl_component.h"
 #include "hw/pci/pci_device.h"
 #include "hw/register.h"
@@ -515,11 +516,14 @@ typedef struct CXLDCExtentRaw {
 } QEMU_PACKED CXLDCExtentRaw;
 
 typedef struct CXLDCExtent {
+    HostMemoryBackend *hm;
+    struct CXLFixedWindow *fw;
     uint64_t start_dpa;
     uint64_t len;
     uint8_t tag[0x10];
     uint16_t shared_seq;
     uint8_t rsvd[0x6];
+    int rid;
 
     QTAILQ_ENTRY(CXLDCExtent) node;
 } CXLDCExtent;
@@ -628,6 +632,7 @@ struct CXLType3Dev {
     struct dynamic_capacity {
         HostMemoryBackend *host_dc;
         AddressSpace host_dc_as;
+        struct CXLFixedWindow *fw;
         /*
          * total_capacity is equivalent to the dynamic capability
          * memory region size.
@@ -711,18 +716,24 @@ CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len);
 
 void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
                                         CXLDCExtent *extent);
-void cxl_insert_extent_to_extent_list(CXLDCExtentList *list, uint64_t dpa,
+void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
+                                      HostMemoryBackend *hm,
+                                      struct CXLFixedWindow *fw,
+                                      uint64_t dpa,
                                       uint64_t len, uint8_t *tag,
-                                      uint16_t shared_seq);
+                                      uint16_t shared_seq, int rid);
 bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
                        unsigned long size);
 bool cxl_extents_contains_dpa_range(CXLDCExtentList *list,
                                     uint64_t dpa, uint64_t len);
 CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
+                                                    HostMemoryBackend *host_mem,
+                                                    struct CXLFixedWindow *fw,
                                                     uint64_t dpa,
                                                     uint64_t len,
                                                     uint8_t *tag,
-                                                    uint16_t shared_seq);
+                                                    uint16_t shared_seq,
+                                                    int rid);
 void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
                                        CXLDCExtentGroup *group);
 uint32_t cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list);
-- 
2.43.0



* [RFC PATCH 4/7] hw/cxl: Map lazy memory backend after host acceptance
  2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
                   ` (2 preceding siblings ...)
  2025-11-27 22:55 ` [RFC PATCH 3/7] hw/cxl: Change Extent add/remove APIs for lazy memory backend Alireza Sanaee
@ 2025-11-27 22:55 ` Alireza Sanaee
  2026-02-06 12:33   ` Jonathan Cameron
  2025-11-27 22:55 ` [RFC PATCH 5/7] hw/cxl: Add performant direct mapping for extents Alireza Sanaee
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Map the relevant memory backend once the host has accepted an extent.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
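Note for reviewers: the acceptance flow in this patch can be modelled
outside QEMU roughly as below. This is a toy sketch only. The names
toy_backend, toy_extent, toy_find and toy_accept are hypothetical
stand-ins and not QEMU APIs; the real code works on CXLDCExtent and
HostMemoryBackend and also enables the backend memory region. The NULL
backend check is an extra guard in the sketch, not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for HostMemoryBackend: tracks only the mapped flag. */
struct toy_backend {
    bool mapped;
};

/* Hypothetical stand-in for a pending CXLDCExtent. */
struct toy_extent {
    uint64_t start_dpa;
    uint64_t len;
    struct toy_backend *hm;
};

enum toy_ret { TOY_OK = 0, TOY_INVALID_PA = 1 };

/* Exact (dpa, len) match, in the spirit of cxl_extent_find_extent_detail(). */
static struct toy_extent *toy_find(struct toy_extent *pending, size_t n,
                                   uint64_t dpa, uint64_t len)
{
    assert(pending || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pending[i].start_dpa == dpa && pending[i].len == len) {
            return &pending[i];
        }
    }
    return NULL;
}

/*
 * Accept one extent: it must be known to the device, must have a backend
 * (sketch-only guard), and that backend must not already be mapped.
 */
static enum toy_ret toy_accept(struct toy_extent *pending, size_t n,
                               uint64_t dpa, uint64_t len)
{
    struct toy_extent *ent = toy_find(pending, n, dpa, len);

    if (!ent || !ent->hm || ent->hm->mapped) {
        return TOY_INVALID_PA;
    }
    ent->hm->mapped = true;
    return TOY_OK;
}
```

In this model a partial acceptance (same DPA, shorter length) is
rejected, as is accepting the same extent twice, which matches the
"must be mapped entirely" assumption in the cover letter.
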
 hw/cxl/cxl-mailbox-utils.c | 74 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 71 insertions(+), 3 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index ae723c03ec..b785553225 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -2979,6 +2979,30 @@ static CXLRetCode cxl_detect_malformed_extent_list(CXLType3Dev *ct3d,
     return CXL_MBOX_SUCCESS;
 }
 
+static bool cxl_extent_find_extent_detail(CXLDCExtentGroupList *list,
+                                          uint64_t start_dpa,
+                                          uint64_t len,
+                                          uint8_t *tag,
+                                          HostMemoryBackend **hmb,
+                                          struct CXLFixedWindow **fw,
+                                          int *rid)
+{
+    CXLDCExtent *ent, *ent_next;
+    CXLDCExtentGroup *group = QTAILQ_FIRST(list);
+
+    QTAILQ_FOREACH_SAFE(ent, &group->list, node, ent_next) {
+        if (ent->start_dpa == start_dpa && ent->len == len) {
+            *fw = ent->fw;
+            *hmb = ent->hm;
+            memcpy(tag, ent->tag, 0x10);
+            *rid = ent->rid;
+            return true;
+        }
+    }
+
+    return false;
+}
+
 static CXLRetCode cxl_dcd_add_dyn_cap_rsp_dry_run(CXLType3Dev *ct3d,
         const CXLUpdateDCExtentListInPl *in)
 {
@@ -3029,8 +3053,12 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
     CXLUpdateDCExtentListInPl *in = (void *)payload_in;
     CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
     CXLDCExtentList *extent_list = &ct3d->dc.extents;
+    struct CXLFixedWindow *fw;
+    HostMemoryBackend *hmb_dc;
+    uint8_t tag[0x10];
     uint32_t i, num;
     uint64_t dpa, len;
+    int rid;
     CXLRetCode ret;
 
     if (len_in < sizeof(*in)) {
@@ -3065,12 +3093,52 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
     }
 
     for (i = 0; i < in->num_entries_updated; i++) {
+        bool found;
+        MemoryRegion *mr;
+
         dpa = in->updated_entries[i].start_dpa;
         len = in->updated_entries[i].len;
 
-        cxl_insert_extent_to_extent_list(extent_list, NULL,
-                                         NULL, dpa, len,
-                                         NULL, 0, 0);
+        if (ct3d->dc.total_capacity_cmd) {
+            found = cxl_extent_find_extent_detail(
+                &ct3d->dc.extents_pending, dpa, len, tag, &hmb_dc, &fw, &rid);
+
+            /*
+             * This only occurs when host accepts an extent where device does
+             * not know anything about it.
+             */
+            if (!found) {
+                qemu_log("Could not find the extent detail for DPA 0x%" PRIx64
+                         " LEN 0x%" PRIx64 "\n",
+                         dpa, len);
+                return CXL_MBOX_INVALID_PA;
+            }
+
+            /* The host memory backend should not be already mapped */
+            if (host_memory_backend_is_mapped(hmb_dc)) {
+                qemu_log("The host memory backend for DPA 0x%" PRIx64
+                         " LEN 0x%" PRIx64 " is already mapped\n",
+                         dpa, len);
+                return CXL_MBOX_INVALID_PA;
+            }
+
+            mr = host_memory_backend_get_memory(hmb_dc);
+            if (!mr) {
+                qemu_log("Could not get memory region from host memory "
+                         "backend\n");
+                return CXL_MBOX_INVALID_PA;
+            }
+
+            memory_region_set_nonvolatile(mr, false);
+            memory_region_set_enabled(mr, true);
+            host_memory_backend_set_mapped(hmb_dc, true);
+            cxl_insert_extent_to_extent_list(extent_list, hmb_dc, fw, dpa, len,
+                                             NULL, 0, rid);
+        } else {
+            cxl_insert_extent_to_extent_list(extent_list, NULL, NULL, dpa, len,
+                                             NULL, 0, -1);
+        }
+
         ct3d->dc.total_extent_count += 1;
         ct3d->dc.nr_extents_accepted += 1;
         ct3_set_region_block_backed(ct3d, dpa, len);
-- 
2.43.0



* [RFC PATCH 5/7] hw/cxl: Add performant direct mapping for extents
  2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
                   ` (3 preceding siblings ...)
  2025-11-27 22:55 ` [RFC PATCH 4/7] hw/cxl: Map lazy memory backend after host acceptance Alireza Sanaee
@ 2025-11-27 22:55 ` Alireza Sanaee
  2026-02-06 12:41   ` Jonathan Cameron
  2025-11-27 22:55 ` [RFC PATCH 6/7] hw/cxl: Add remove alias functionality for extent direct mapping Alireza Sanaee
  2025-11-27 22:55 ` [RFC PATCH 7/7] hw/cxl: Add tag-based removal functionality Alireza Sanaee
  6 siblings, 1 reply; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Add a direct-mapped alias of each accepted extent's memory backend into
the fixed memory window.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
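Note for reviewers: the alias placement arithmetic used in this patch
can be checked in isolation with the sketch below. It is illustrative
only. toy_alias, toy_place_alias and toy_next_slot are hypothetical
names, and the sketch uses 64-bit arithmetic throughout, whereas the
patch stores region_offset in an int.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_DIRECT_MR 4 /* mirrors the fixed array of 4 in this patch */

struct toy_alias {
    uint64_t backend_offset; /* where the alias starts inside the backend */
    uint64_t window_offset;  /* where the alias lands inside the window */
};

/*
 * backend_offset corresponds to "dpa - regions[rid].base" and
 * window_offset to "dc_decoder_window.base - fw->base + offset".
 */
static struct toy_alias toy_place_alias(uint64_t dpa, uint64_t region_base,
                                        uint64_t decoder_base,
                                        uint64_t fw_base, uint64_t offset)
{
    struct toy_alias a;

    assert(dpa >= region_base);
    assert(decoder_base >= fw_base);
    a.backend_offset = dpa - region_base;
    a.window_offset = decoder_base - fw_base + offset;
    return a;
}

/* Round-robin choice of the next alias slot, as in "(idx + 1) % 4". */
static unsigned toy_next_slot(unsigned cur)
{
    return (cur + 1) % TOY_NUM_DIRECT_MR;
}
```

With a decoder window that starts 4 GiB into the fixed window and an
extent offset of 0x1000, the alias lands at 0x100001000 within the
window. Because slots are reused round-robin, a fifth accepted extent
would land on slot 0 again.
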
 hw/cxl/cxl-host.c           |  4 ++
 hw/cxl/cxl-mailbox-utils.c  | 73 +++++++++++++++++++++++++++++--------
 hw/mem/cxl_type3.c          |  6 ++-
 include/hw/cxl/cxl_device.h | 25 +++++++++++--
 4 files changed, 87 insertions(+), 21 deletions(-)

diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c
index 7c8fde4646..3b327be68c 100644
--- a/hw/cxl/cxl-host.c
+++ b/hw/cxl/cxl-host.c
@@ -362,10 +362,14 @@ static int cxl_fmws_direct_passthrough(Object *obj, void *opaque)
     fw = CXL_FMW(obj);
 
     if (!cfmws_is_not_interleaved(fw, state->decoder_base)) {
+        ct3d->direct_mr_enabled = false;
         return 0;
     }
+    ct3d->direct_mr_enabled = true;
 
     ct3d->dc.fw = fw;
+    ct3d->dc.dc_decoder_window.base = state->decoder_base;
+    ct3d->dc.dc_decoder_window.size = state->decoder_size;
 
     if (state->commit) {
         MemoryRegion *mr = NULL;
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index b785553225..15114a5314 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -2846,13 +2846,19 @@ void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
                                              uint64_t len,
                                              uint8_t *tag,
                                              uint16_t shared_seq,
-                                             int rid)
+                                             int rid,
+                                             uint64_t offset)
 {
     CXLDCExtent *extent;
 
     extent = g_new0(CXLDCExtent, 1);
     extent->start_dpa = dpa;
     extent->len = len;
+    extent->offset = offset;
+    extent->shared_seq = shared_seq;
+    extent->hm = host_mem;
+    extent->fw = fw;
+    extent->rid = rid;
     if (tag) {
         memcpy(extent->tag, tag, 0x10);
     }
@@ -2880,14 +2886,15 @@ CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
                                                     uint64_t len,
                                                     uint8_t *tag,
                                                     uint16_t shared_seq,
-                                                    int rid)
+                                                    int rid,
+                                                    uint64_t offset)
 {
     if (!group) {
         group = g_new0(CXLDCExtentGroup, 1);
         QTAILQ_INIT(&group->list);
     }
     cxl_insert_extent_to_extent_list(&group->list, host_mem, fw, dpa, len,
-                                     tag, shared_seq, rid);
+                                     tag, shared_seq, rid, offset);
     return group;
 }
 
@@ -2985,7 +2992,8 @@ static bool cxl_extent_find_extent_detail(CXLDCExtentGroupList *list,
                                           uint8_t *tag,
                                           HostMemoryBackend **hmb,
                                           struct CXLFixedWindow **fw,
-                                          int *rid)
+                                          int *rid,
+                                          uint64_t *offset)
 {
     CXLDCExtent *ent, *ent_next;
     CXLDCExtentGroup *group = QTAILQ_FIRST(list);
@@ -2996,6 +3004,7 @@ static bool cxl_extent_find_extent_detail(CXLDCExtentGroupList *list,
             *hmb = ent->hm;
             memcpy(tag, ent->tag, 0x10);
             *rid = ent->rid;
+            *offset = ent->offset;
             return true;
         }
     }
@@ -3057,7 +3066,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
     HostMemoryBackend *hmb_dc;
     uint8_t tag[0x10];
     uint32_t i, num;
-    uint64_t dpa, len;
+    uint64_t dpa, len, offset;
     int rid;
     CXLRetCode ret;
 
@@ -3100,16 +3109,25 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
         len = in->updated_entries[i].len;
 
         if (ct3d->dc.total_capacity_cmd) {
+            int mr_idx = ct3d->dc.cur_direct_region_idx;
             found = cxl_extent_find_extent_detail(
-                &ct3d->dc.extents_pending, dpa, len, tag, &hmb_dc, &fw, &rid);
+                &ct3d->dc.extents_pending, dpa, len, tag, &hmb_dc, &fw, &rid,
+                &offset);
 
             /*
              * This only occurs when host accepts an extent where device does
              * not know anything about it.
              */
             if (!found) {
-                qemu_log("Could not find the extent detail for DPA 0x%" PRIx64
-                         " LEN 0x%" PRIx64 "\n",
+                qemu_log("could not find the extent detail for dpa 0x%" PRIx64
+                         " len 0x%" PRIx64 "\n",
+                         dpa, len);
+                return CXL_MBOX_INVALID_PA;
+            }
+
+            if (!hmb_dc) {
+                qemu_log("No host memory backend for dpa 0x%" PRIx64
+                         " len 0x%" PRIx64 "\n",
                          dpa, len);
                 return CXL_MBOX_INVALID_PA;
             }
@@ -3123,6 +3141,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
             }
 
             mr = host_memory_backend_get_memory(hmb_dc);
+
             if (!mr) {
                 qemu_log("Could not get memory region from host memory "
                          "backend\n");
@@ -3132,11 +3151,32 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
             memory_region_set_nonvolatile(mr, false);
             memory_region_set_enabled(mr, true);
             host_memory_backend_set_mapped(hmb_dc, true);
+
+            if (ct3d->direct_mr_enabled) {
+                g_autofree char *direct_mapping_name =
+                    g_strdup_printf("cxl-direct-mapping-%d", mr_idx);
+                hwaddr region_offset = dpa - ct3d->dc.regions[rid].base;
+                MemoryRegion *dr_dc_mr = &ct3d->dc.dc_direct_mr[mr_idx];
+                memory_region_init_alias(dr_dc_mr, OBJECT(ct3d),
+                                         direct_mapping_name, mr, region_offset,
+                                         ct3d->dc.dc_decoder_window.size);
+                memory_region_add_subregion(&fw->mr,
+                                            ct3d->dc.dc_decoder_window.base -
+                                                fw->base + offset,
+                                            dr_dc_mr);
+                /*
+                 * for now assuming 4 extents and 4 direct mapping memory
+                 * regions.
+                 */
+                ct3d->dc.cur_direct_region_idx =
+                    (ct3d->dc.cur_direct_region_idx + 1) % 4;
+            }
+
             cxl_insert_extent_to_extent_list(extent_list, hmb_dc, fw, dpa, len,
-                                             NULL, 0, rid);
+                                             tag, 0, rid, offset);
         } else {
             cxl_insert_extent_to_extent_list(extent_list, NULL, NULL, dpa, len,
-                                             NULL, 0, -1);
+                                             NULL, 0, -1, -1);
         }
 
         ct3d->dc.total_extent_count += 1;
@@ -3167,7 +3207,7 @@ static uint32_t copy_extent_list(CXLDCExtentList *dst,
     QTAILQ_FOREACH(ent, src, node) {
         cxl_insert_extent_to_extent_list(dst, ent->hm, ent->fw, ent->start_dpa,
                                          ent->len, ent->tag, ent->shared_seq,
-                                         ent->rid);
+                                         ent->rid, ent->offset);
         cnt++;
     }
     return cnt;
@@ -3223,15 +3263,15 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
                     if (len1) {
                         cxl_insert_extent_to_extent_list(updated_list, NULL,
                                                          NULL, ent_start_dpa,
-                                                         len1, NULL, 0,
-                                                         ent->rid);
+                                                         len1, ent->tag, 0,
+                                                         ent->rid, ent->offset);
                         cnt_delta++;
                     }
                     if (len2) {
                         cxl_insert_extent_to_extent_list(updated_list, NULL,
                                                          NULL, dpa + len,
-                                                         len2, NULL, 0,
-                                                         ent->rid);
+                                                         len2, ent->tag, 0,
+                                                         ent->rid, ent->offset);
                         cnt_delta++;
                     }
 
@@ -3706,7 +3746,8 @@ static CXLRetCode cmd_fm_initiate_dc_add(const struct cxl_cmd *cmd,
                 group = cxl_insert_extent_to_extent_group(group, NULL, NULL,
                                                           ext->start_dpa,
                                                           ext->len, ext->tag,
-                                                          ext->shared_seq, 0);
+                                                          ext->shared_seq, 0,
+                                                          -1);
             }
 
             cxl_extent_group_list_insert_tail(&ct3d->dc.extents_pending, group);
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index ef1c1cbaef..e3093f63a3 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -2139,7 +2139,8 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
                                                           extents[i].len,
                                                           extents[i].tag,
                                                           extents[i].shared_seq,
-                                                          rid);
+                                                          rid,
+                                                          offset);
             } else {
                 group = cxl_insert_extent_to_extent_group(group,
                                                           dcd->dc.host_dc,
@@ -2148,7 +2149,8 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
                                                           extents[i].len,
                                                           extents[i].tag,
                                                           extents[i].shared_seq,
-                                                          rid);
+                                                          rid,
+                                                          offset);
             }
         }
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index fe0c44e8d7..1a521df881 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -524,6 +524,7 @@ typedef struct CXLDCExtent {
     uint16_t shared_seq;
     uint8_t rsvd[0x6];
     int rid;
+    uint64_t offset;
 
     QTAILQ_ENTRY(CXLDCExtent) node;
 } CXLDCExtent;
@@ -589,6 +590,7 @@ struct CXLType3Dev {
 
     /* State */
     MemoryRegion direct_mr[CXL_HDM_DECODER_COUNT];
+    bool direct_mr_enabled;
     AddressSpace hostvmem_as;
     AddressSpace hostpmem_as;
     CXLComponentState cxl_cstate;
@@ -633,6 +635,14 @@ struct CXLType3Dev {
         HostMemoryBackend *host_dc;
         AddressSpace host_dc_as;
         struct CXLFixedWindow *fw;
+        int cur_direct_region_idx;
+        /*
+         * dc_decoder_window represents the CXL Decoder Window
+         */
+        struct decoder_window {
+            hwaddr base;
+            hwaddr size;
+        } dc_decoder_window;
         /*
          * total_capacity is equivalent to the dynamic capability
          * memory region size.
@@ -647,6 +657,11 @@ struct CXLType3Dev {
 
         uint8_t num_regions; /* 0-8 regions */
         CXLDCRegion regions[DCD_MAX_NUM_REGION];
+        /*
+         * Assume 4 for now, though more are possible. Each region aliases one
+         * extent, allowing performant translation in KVM.
+         */
+        MemoryRegion dc_direct_mr[4];
     } dc;
 
     struct CXLSanitizeInfo *media_op_sanitize;
@@ -720,8 +735,11 @@ void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
                                       HostMemoryBackend *hm,
                                       struct CXLFixedWindow *fw,
                                       uint64_t dpa,
-                                      uint64_t len, uint8_t *tag,
-                                      uint16_t shared_seq, int rid);
+                                      uint64_t len,
+                                      uint8_t *tag,
+                                      uint16_t shared_seq,
+                                      int rid,
+                                      uint64_t offset);
 bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
                        unsigned long size);
 bool cxl_extents_contains_dpa_range(CXLDCExtentList *list,
@@ -733,7 +751,8 @@ CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
                                                     uint64_t len,
                                                     uint8_t *tag,
                                                     uint16_t shared_seq,
-                                                    int rid);
+                                                    int rid,
+                                                    uint64_t offset);
 void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
                                        CXLDCExtentGroup *group);
 uint32_t cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list);
-- 
2.43.0



* [RFC PATCH 6/7] hw/cxl: Add remove alias functionality for extent direct mapping
  2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
                   ` (4 preceding siblings ...)
  2025-11-27 22:55 ` [RFC PATCH 5/7] hw/cxl: Add performant direct mapping for extents Alireza Sanaee
@ 2025-11-27 22:55 ` Alireza Sanaee
  2026-02-06 12:43   ` Jonathan Cameron
  2025-11-27 22:55 ` [RFC PATCH 7/7] hw/cxl: Add tag-based removal functionality Alireza Sanaee
  6 siblings, 1 reply; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Remove the alias related to an extent when the extent is no longer
available, i.e. once it has been removed from the VM.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
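Note for reviewers: the rule this patch adds to the release dry run (a
direct-mapped extent can only be released whole) can be sketched as
below. This is a simplified model under stated assumptions. The name
toy_release_allowed is hypothetical, the release range is assumed to lie
inside a single extent, and the index is assumed to be signed with -1
meaning "no alias", whereas the insert helpers in this patch take a
uint32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether releasing [dpa, dpa + len) from the extent
 * [ent_start, ent_start + ent_len) is allowed.
 * len1 and len2 are the leftovers before and after the released range.
 */
static bool toy_release_allowed(uint64_t ent_start, uint64_t ent_len,
                                uint64_t dpa, uint64_t len,
                                int direct_window_idx)
{
    uint64_t ent_end = ent_start + ent_len;
    uint64_t rel_end = dpa + len;
    uint64_t len1, len2;

    /* Sketch-only assumption: the release range sits inside the extent. */
    assert(dpa >= ent_start && rel_end <= ent_end);

    len1 = dpa - ent_start;
    len2 = ent_end - rel_end;

    /* An aliased extent cannot be split, so any leftover is an error. */
    if (direct_window_idx >= 0 && (len1 || len2)) {
        return false;
    }
    return true;
}
```

A full release of an aliased extent passes, a partial release of the
same extent is refused, and a partial release of an extent with no alias
is still allowed.
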
 hw/cxl/cxl-mailbox-utils.c  | 57 ++++++++++++++++++++++++++++++-------
 hw/mem/cxl_type3.c          | 29 +++++++++++++++++--
 include/hw/cxl/cxl_device.h |  9 ++++--
 3 files changed, 81 insertions(+), 14 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 15114a5314..e0ac31ac41 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -2847,7 +2847,8 @@ void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
                                              uint8_t *tag,
                                              uint16_t shared_seq,
                                              int rid,
-                                             uint64_t offset)
+                                             uint64_t offset,
+                                             uint32_t direct_window_idx)
 {
     CXLDCExtent *extent;
 
@@ -2863,6 +2864,7 @@ void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
         memcpy(extent->tag, tag, 0x10);
     }
     extent->shared_seq = shared_seq;
+    extent->direct_window_idx = direct_window_idx;
 
     QTAILQ_INSERT_TAIL(list, extent, node);
 }
@@ -2887,14 +2889,16 @@ CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
                                                     uint8_t *tag,
                                                     uint16_t shared_seq,
                                                     int rid,
-                                                    uint64_t offset)
+                                                    uint64_t offset,
+                                                    uint32_t direct_window_idx)
 {
     if (!group) {
         group = g_new0(CXLDCExtentGroup, 1);
         QTAILQ_INIT(&group->list);
     }
-    cxl_insert_extent_to_extent_list(&group->list, host_mem, fw, dpa, len,
-                                     tag, shared_seq, rid, offset);
+    cxl_insert_extent_to_extent_list(&group->list, host_mem, fw, dpa, len, tag,
+                                     shared_seq, rid, offset,
+                                     direct_window_idx);
     return group;
 }
 
@@ -3173,10 +3177,10 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
             }
 
             cxl_insert_extent_to_extent_list(extent_list, hmb_dc, fw, dpa, len,
-                                             tag, 0, rid, offset);
+                                             tag, 0, rid, offset, mr_idx);
         } else {
             cxl_insert_extent_to_extent_list(extent_list, NULL, NULL, dpa, len,
-                                             NULL, 0, -1, -1);
+                                             NULL, 0, -1, -1, -1);
         }
 
         ct3d->dc.total_extent_count += 1;
@@ -3207,7 +3211,8 @@ static uint32_t copy_extent_list(CXLDCExtentList *dst,
     QTAILQ_FOREACH(ent, src, node) {
         cxl_insert_extent_to_extent_list(dst, ent->hm, ent->fw, ent->start_dpa,
                                          ent->len, ent->tag, ent->shared_seq,
-                                         ent->rid, ent->offset);
+                                         ent->rid, ent->offset,
+                                         ent->direct_window_idx);
         cnt++;
     }
     return cnt;
@@ -3215,6 +3220,7 @@ static uint32_t copy_extent_list(CXLDCExtentList *dst,
 
 static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
         const CXLUpdateDCExtentListInPl *in, CXLDCExtentList *updated_list,
+        CXLDCExtentList *updated_removed_list,
         uint32_t *updated_list_size)
 {
     CXLDCExtent *ent, *ent_next;
@@ -3224,6 +3230,9 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
     CXLRetCode ret = CXL_MBOX_SUCCESS;
 
     QTAILQ_INIT(updated_list);
+    if (updated_removed_list) {
+        QTAILQ_INIT(updated_removed_list);
+    }
     copy_extent_list(updated_list, &ct3d->dc.extents);
 
     for (i = 0; i < in->num_entries_updated; i++) {
@@ -3257,6 +3266,19 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
                     }
                     len_done = ent_len - len1 - len2;
 
+                    /* Cannot split extents with direct window mapping */
+                    if (ent->direct_window_idx >= 0 && (len1 || len2)) {
+                        ret = CXL_MBOX_INVALID_INPUT;
+                        goto free_and_exit;
+                    }
+
+                    if (updated_removed_list) {
+                        cxl_insert_extent_to_extent_list(
+                            updated_removed_list, ent->hm, ent->fw,
+                            ent->start_dpa, ent->len, ent->tag, ent->shared_seq,
+                            ent->rid, ent->offset, ent->direct_window_idx);
+                    }
+
                     cxl_remove_extent_from_extent_list(updated_list, ent);
                     cnt_delta--;
 
@@ -3264,14 +3286,16 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
                         cxl_insert_extent_to_extent_list(updated_list, NULL,
                                                          NULL, ent_start_dpa,
                                                          len1, ent->tag, 0,
-                                                         ent->rid, ent->offset);
+                                                         ent->rid, ent->offset,
+                                                         ent->direct_window_idx);
                         cnt_delta++;
                     }
                     if (len2) {
                         cxl_insert_extent_to_extent_list(updated_list, NULL,
                                                          NULL, dpa + len,
                                                          len2, ent->tag, 0,
-                                                         ent->rid, ent->offset);
+                                                         ent->rid, ent->offset,
+                                                         ent->direct_window_idx);
                         cnt_delta++;
                     }
 
@@ -3313,6 +3337,7 @@ static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd,
     CXLUpdateDCExtentListInPl *in = (void *)payload_in;
     CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
     CXLDCExtentList updated_list;
+    CXLDCExtentList updated_removed_list;
     CXLDCExtent *ent, *ent_next;
     uint32_t updated_list_size;
     CXLRetCode ret;
@@ -3336,11 +3361,22 @@ static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd,
     }
 
     ret = cxl_dc_extent_release_dry_run(ct3d, in, &updated_list,
+                                        &updated_removed_list,
                                         &updated_list_size);
     if (ret != CXL_MBOX_SUCCESS) {
         return ret;
     }
 
+    if (ct3d->direct_mr_enabled) {
+        /*
+         * Remove memory alias for the removed extents
+         */
+        QTAILQ_FOREACH_SAFE(ent, &updated_removed_list, node, ent_next) {
+            cxl_remove_memory_alias(ct3d, ent->fw, ent->direct_window_idx);
+            cxl_remove_extent_from_extent_list(&updated_removed_list, ent);
+        }
+    }
+
     /*
      * If the dry run release passes, the returned updated_list will
      * be the updated extent list and we just need to clear the extents
@@ -3747,7 +3783,7 @@ static CXLRetCode cmd_fm_initiate_dc_add(const struct cxl_cmd *cmd,
                                                           ext->start_dpa,
                                                           ext->len, ext->tag,
                                                           ext->shared_seq, 0,
-                                                          -1);
+                                                          -1, -1);
             }
 
             cxl_extent_group_list_insert_tail(&ct3d->dc.extents_pending, group);
@@ -3829,6 +3865,7 @@ static CXLRetCode cmd_fm_initiate_dc_release(const struct cxl_cmd *cmd,
             rc = cxl_dc_extent_release_dry_run(ct3d,
                                                list,
                                                &updated_list,
+                                               NULL,
                                                &updated_list_size);
             if (rc) {
                 return rc;
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index e3093f63a3..d3ea62ef3f 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -2140,7 +2140,8 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
                                                           extents[i].tag,
                                                           extents[i].shared_seq,
                                                           rid,
-                                                          offset);
+                                                          offset,
+                                                          0);
             } else {
                 group = cxl_insert_extent_to_extent_group(group,
                                                           dcd->dc.host_dc,
@@ -2150,7 +2151,8 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
                                                           extents[i].tag,
                                                           extents[i].shared_seq,
                                                           rid,
-                                                          offset);
+                                                          offset,
+                                                          0);
             }
         }
 
@@ -2216,6 +2218,29 @@ void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
     }
 }
 
+void cxl_remove_memory_alias(CXLType3Dev *dcd, struct CXLFixedWindow *fw,
+                             uint32_t hdm_id)
+{
+    MemoryRegion *mr;
+
+    if (dcd->dc.total_capacity_cmd > 0) {
+        mr = &dcd->dc.dc_direct_mr[hdm_id];
+    } else {
+        qemu_log("No dynamic capacity command support, "
+                 "cannot remove memory region alias\n");
+        return;
+    }
+
+    if (!fw) {
+        qemu_log(
+            "Cannot remove memory region alias without a valid fixed window\n");
+        return;
+    }
+
+    memory_region_del_subregion(&fw->mr, mr);
+    return;
+}
+
 static void ct3_class_init(ObjectClass *oc, const void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 1a521df881..c8c57ac837 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -525,6 +525,7 @@ typedef struct CXLDCExtent {
     uint8_t rsvd[0x6];
     int rid;
     uint64_t offset;
+    int direct_window_idx;
 
     QTAILQ_ENTRY(CXLDCExtent) node;
 } CXLDCExtent;
@@ -739,7 +740,8 @@ void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
                                       uint8_t *tag,
                                       uint16_t shared_seq,
                                       int rid,
-                                      uint64_t offset);
+                                      uint64_t offset,
+                                      uint32_t direct_window_idx);
 bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
                        unsigned long size);
 bool cxl_extents_contains_dpa_range(CXLDCExtentList *list,
@@ -752,7 +754,8 @@ CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
                                                     uint8_t *tag,
                                                     uint16_t shared_seq,
                                                     int rid,
-                                                    uint64_t offset);
+                                                    uint64_t offset,
+                                                    uint32_t direct_window_idx);
 void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
                                        CXLDCExtentGroup *group);
 uint32_t cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list);
@@ -773,4 +776,6 @@ bool cxl_extents_overlaps_dpa_range(CXLDCExtentList *list,
                                     uint64_t dpa, uint64_t len);
 bool cxl_extent_groups_overlaps_dpa_range(CXLDCExtentGroupList *list,
                                           uint64_t dpa, uint64_t len);
+void cxl_remove_memory_alias(CXLType3Dev *dcd, struct CXLFixedWindow *fw,
+                             uint32_t hdm_id);
 #endif
-- 
2.43.0



* [RFC PATCH 7/7] hw/cxl: Add tag-based removal functionality
  2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
                   ` (5 preceding siblings ...)
  2025-11-27 22:55 ` [RFC PATCH 6/7] hw/cxl: Add remove alias functionality for extent direct mapping Alireza Sanaee
@ 2025-11-27 22:55 ` Alireza Sanaee
  2026-02-06 12:49   ` Jonathan Cameron
  6 siblings, 1 reply; 15+ messages in thread
From: Alireza Sanaee @ 2025-11-27 22:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: jonathan.cameron, linuxarm, eblake, armbru, berrange, pbonzini,
	mst, lizhijian, anisa.su, linux-cxl

Add tag-based removal: every extent carrying the requested tag is released
together, and alias teardown for those extents must be done properly.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 hw/mem/cxl_type3.c | 119 +++++++++++++++++++++++++++++++++++++++++++++
 qapi/cxl.json      |  46 ++++++++++++++++++
 2 files changed, 165 insertions(+)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index d3ea62ef3f..29355792da 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -2186,6 +2186,61 @@ void qmp_cxl_add_dynamic_capacity(const char *path, uint16_t host_id,
     }
 }
 
+static void qmp_cxl_process_dynamic_capacity_tag_based(const char *path,
+        uint16_t hid, CXLDCEventType type, uint8_t rid, const char *tag,
+        CxlDynamicCapacityExtentList *records, Error **errp) {
+
+    Object *obj;
+    CXLType3Dev *dcd;
+    CXLDCExtentList *list = NULL;
+    CXLDCExtent *ent;
+    g_autofree CXLDCExtentRaw *extents = NULL;
+
+    obj = object_resolve_path_type(path, TYPE_CXL_TYPE3, NULL);
+    if (!obj) {
+        error_setg(errp, "Unable to resolve CXL type 3 device");
+        return;
+    }
+
+    dcd = CXL_TYPE3(obj);
+    if (!dcd->dc.num_regions) {
+        error_setg(errp, "No dynamic capacity support from the device");
+        return;
+    }
+
+    if (rid >= dcd->dc.num_regions) {
+        error_setg(errp, "region id is too large");
+        return;
+    }
+
+    QemuUUID uuid_req;
+    qemu_uuid_parse(tag, &uuid_req);
+
+    list = &dcd->dc.extents;
+    size_t cap = 8, n = 0;
+    extents = g_new0(CXLDCExtentRaw, cap);
+    QTAILQ_FOREACH(ent, list, node) {
+        QemuUUID uuid_ext;
+        memcpy(&uuid_ext.data, ent->tag, sizeof(ent->tag));
+        if (!qemu_uuid_is_equal(&uuid_req, &uuid_ext)) {
+            continue;
+        }
+
+        if (n == cap) {
+            cap = cap < 8 ? 8 : cap * 2;
+            extents = g_renew(CXLDCExtentRaw, extents, cap);
+        }
+
+        extents[n++] = (CXLDCExtentRaw){ .start_dpa = ent->start_dpa,
+                                         .len = ent->len,
+                                         .shared_seq = 0 };
+    }
+
+    extents = g_renew(CXLDCExtentRaw, extents, n);
+    cxl_create_dc_event_records_for_extents(dcd, type, extents, n);
+    return;
+}
+
 void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
                                       CxlExtentRemovalPolicy removal_policy,
                                       bool has_forced_removal,
@@ -2212,6 +2267,10 @@ void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
                                                       region, tag, extents,
                                                       errp);
         return;
+    case CXL_EXTENT_REMOVAL_POLICY_TAG_BASED:
+        qmp_cxl_process_dynamic_capacity_tag_based(path, host_id, type, region,
+                                                   tag, extents, errp);
+        return;
     default:
         error_setg(errp, "Removal policy not supported");
         return;
@@ -2241,6 +2300,66 @@ void cxl_remove_memory_alias(CXLType3Dev *dcd, struct CXLFixedWindow *fw,
     return;
 }
 
+/*
+ * This function allows for a simple check to make sure that
+ * our extent is removed. It can be used by an orchestration layer.
+ */
+ExtentStatus *qmp_cxl_release_dynamic_capacity_status(const char *path,
+                                                      uint16_t hid, uint8_t rid,
+                                                      const char *tag,
+                                                      Error **errp)
+{
+    Object *obj;
+    CXLType3Dev *dcd;
+    CXLDCExtentList *list = NULL;
+    CXLDCExtent *ent;
+    QemuUUID uuid_req;
+    ExtentStatus *res = g_new0(ExtentStatus, 1);
+
+    obj = object_resolve_path_type(path, TYPE_CXL_TYPE3, NULL);
+    if (!obj) {
+        error_setg(errp, "Unable to resolve CXL type 3 device");
+        return NULL;
+    }
+
+    dcd = CXL_TYPE3(obj);
+    if (!dcd->dc.num_regions) {
+        error_setg(errp, "No dynamic capacity support from the device");
+        return NULL;
+    }
+
+    if (rid >= dcd->dc.num_regions) {
+        error_setg(errp, "Region id is too large");
+        return NULL;
+    }
+
+    if (!tag) {
+        error_setg(errp, "Tag must be valid");
+        return NULL;
+    }
+
+    list = &dcd->dc.extents;
+    qemu_uuid_parse(tag, &uuid_req);
+
+    QTAILQ_FOREACH(ent, list, node) {
+        QemuUUID uuid_ext;
+        memcpy(&uuid_ext.data, ent->tag, sizeof(ent->tag));
+        if (qemu_uuid_is_equal(&uuid_req, &uuid_ext) == true) {
+            res->status = g_strdup("Not Released");
+            res->message =
+                g_strdup_printf("Found extent with tag %s dpa 0x%" PRIx64
+                                " len 0x%" PRIx64 "\n",
+                                tag, ent->start_dpa, ent->len);
+            return res;
+        }
+    }
+
+
+    res->status = g_strdup("Released");
+    res->message = g_strdup_printf("Tag %s released or not found\n", tag);
+    return res;
+}
+
 static void ct3_class_init(ObjectClass *oc, const void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
diff --git a/qapi/cxl.json b/qapi/cxl.json
index 52cc5d4f33..3372ce3745 100644
--- a/qapi/cxl.json
+++ b/qapi/cxl.json
@@ -555,3 +555,49 @@
            },
   'features': [ 'unstable' ]
 }
+
+##
+# @ExtentStatus:
+# This is an object that describes the status of an extent.
+#
+# @status:   String indicating the result: "Released" or "Not Released".
+# @message:  Human-readable description of the outcome.
+#
+# Since: 9.1
+##
+{ 'struct': 'ExtentStatus',
+      'data': { 'status': 'str', 'message': 'str' }
+}
+
+##
+# @cxl-release-dynamic-capacity-status:
+#
+# This command checks whether an extent tag has been released.
+#
+# @path: path to the CXL Dynamic Capacity Device in the QOM tree.
+#
+# @host-id: The "Host ID" field as defined in Compute Express Link
+#     (CXL) Specification, Revision 3.1, Table 7-71.
+#
+# @region: The "Region Number" field as defined in Compute Express
+#     Link Specification, Revision 3.1, Table 7-71.  Valid range
+#     is from 0-7.
+#
+# @tag: The "Tag" field as defined in Compute Express Link (CXL)
+#     Specification, Revision 3.1, Table 7-71.
+#
+# Features:
+#
+# @unstable: For now this command is subject to change.
+#
+# Since: 9.1
+##
+{ 'command': 'cxl-release-dynamic-capacity-status',
+  'data': { 'path': 'str',
+            'host-id': 'uint16',
+            'region': 'uint8',
+            'tag': 'str'
+           },
+  'features': [ 'unstable' ],
+  'returns': 'ExtentStatus'
+}
-- 
2.43.0



* Re: [RFC PATCH 1/7] hw/mem: Add tagged memory backend object
  2025-11-27 22:55 ` [RFC PATCH 1/7] hw/mem: Add tagged memory backend object Alireza Sanaee
@ 2026-02-06 12:16   ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2026-02-06 12:16 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: qemu-devel, linuxarm, eblake, armbru, berrange, pbonzini, mst,
	lizhijian, anisa.su, linux-cxl

On Thu, 27 Nov 2025 22:55:19 +0000
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Add a new memory-backend-tagged object that supports a tag property, so
> that the backend can be found by its tag. This is useful for scenarios
> where you want to add a piece of memory for a particular purpose and pass
> it to another device.
> 
> At the moment, this only supports a ram-backed object where we add a tag
> to it, and it is temporary. However, we are planning for a generalized
> approach. The plan is to have a shim object where we add a tag to it,
> and then it can be later linked to any BACKEND object types.
> 
> Example use QMP API:
> {
> 
>     "execute": "object-add",
>     "arguments": {
>         "qom-type": "memory-backend-tagged",
>         "id": "tm0",
>         "size": 1073741824,
>         "tag": "6be13bce-ae34-4a77-b6c3-16df975fcf1a"
>     }
> }
> 
> Tags are assumed to be UUIDs, but this is open to debate.
> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
Just to keep a note here on what we discussed offline:

Given this doesn't end up with a shim, but rather would need a
variant for each potential backend type, I'd not do this and instead
(for now, anyway) just add tag as a property to the HostMemoryBackend.

We can come up with a clever solution later if that isn't an acceptable
path forwards.
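
For illustration only, a minimal sketch of what lookup by a tag property
could look like. This is not code from the series: Backend is a stand-in
for HostMemoryBackend, backend_find_by_tag() is a hypothetical helper, and
tags are compared as plain strings rather than parsed UUIDs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for HostMemoryBackend; only the fields this sketch needs. */
typedef struct Backend {
    const char *id;          /* object id, e.g. "tm0" */
    const char *tag;         /* optional tag property, NULL when unset */
    unsigned long long size; /* backend size in bytes */
} Backend;

/*
 * Hypothetical helper: return the first backend whose tag matches,
 * or NULL if the tag is unset or no backend carries it.
 */
static const Backend *backend_find_by_tag(const Backend *backends, size_t n,
                                          const char *tag)
{
    size_t i;

    if (!tag) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        /* Untagged backends are skipped, so any backend type can opt in. */
        if (backends[i].tag && strcmp(backends[i].tag, tag) == 0) {
            return &backends[i];
        }
    }
    return NULL;
}
```

With the tag living on the common backend type, the lookup does not care
whether the memory is RAM or file backed.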

Jonathan


* Re: [RFC PATCH 2/7] hw/cxl: Allow initializing type3 device with no backing device
  2025-11-27 22:55 ` [RFC PATCH 2/7] hw/cxl: Allow initializing type3 device with no backing device Alireza Sanaee
@ 2026-02-06 12:28   ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2026-02-06 12:28 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: qemu-devel, linuxarm, eblake, armbru, berrange, pbonzini, mst,
	lizhijian, anisa.su, linux-cxl

On Thu, 27 Nov 2025 22:55:20 +0000
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Allow creating a type3 device without any backing device for DC. In
> Dynamic Capacity scenarios, memory can show up asynchronously and it can
> come from different resources: RAM, PMEM, or file-backed memory. For these
> cases, only one parameter is needed to give the total size of DC, which
> is exposed by dc-regions-total-size.
Hi Ali,

I'd describe this as 'one additional parameter' rather than 'only one'.

The patch title and this description don't really do justice to everything
I think is in here.

I'd do just what the title says first: allow the device to be created, but
with no capacity to add extents if dc-regions-total-size is used.  You'll
need a trivial check to fail any FM-API or QMP command that tries to add them.
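
For illustration only, a minimal sketch of such a check, not code from this
series. DCState is a stand-in for the dynamic capacity state of the device
(the two field names mirror the patch), and dc_can_add_extents() is a
hypothetical helper that both the mailbox and QMP add paths could call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the DC state of the type 3 device. */
typedef struct DCState {
    void *host_dc;               /* backing memory, NULL when none attached */
    uint64_t total_capacity_cmd; /* size from the command line, 0 if unset */
} DCState;

/*
 * Hypothetical check: refuse add-capacity requests on a device that was
 * created with only a total size and has no backing memory yet.
 */
static bool dc_can_add_extents(const DCState *dc, const char **reason)
{
    if (dc->total_capacity_cmd != 0 && dc->host_dc == NULL) {
        *reason = "no backing memory attached, cannot add extents";
        return false;
    }
    *reason = NULL;
    return true;
}
```

The caller would turn the reason string into an error return for the
mailbox command or an Error for QMP.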

Then in a second patch add the stuff that hooks up the backend at runtime.
That can have a lot more description of what actually works after
that patch.  I think nothing, beyond verifying that the size is
good?

Aim being a series of steps with clear descriptions heading for
the end goal.

Jonathan


> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
> ---
>  hw/mem/cxl_type3.c          | 157 ++++++++++++++++++++++++++----------
>  include/hw/cxl/cxl_device.h |   1 +
>  2 files changed, 115 insertions(+), 43 deletions(-)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 8cdb3bff7e..690b3ab658 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -30,6 +30,7 @@
>  #include "system/numa.h"
>  #include "hw/cxl/cxl.h"
>  #include "hw/pci/msix.h"
> +#include "hw/mem/tagged_mem.h"
>  
>  /* type3 device private */
>  enum CXL_T3_MSIX_VECTOR {
> @@ -190,12 +191,15 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
>      }
>  
>      if (ct3d->dc.num_regions) {
> -        if (!ct3d->dc.host_dc) {
> -            return -EINVAL;
> -        }
> -        dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> -        if (!dc_mr) {
> -            return -EINVAL;
> +        /* Only check if DC is static */
> +        if (ct3d->dc.total_capacity_cmd == 0) {
> +            if (!ct3d->dc.host_dc) {
> +                return -EINVAL;
> +            }
> +            dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> +            if (!dc_mr) {
> +                return -EINVAL;
> +            }
>          }
>          len += CT3_CDAT_NUM_ENTRIES * ct3d->dc.num_regions;
>      }
> @@ -216,7 +220,7 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
>          cur_ent += CT3_CDAT_NUM_ENTRIES;
>      }
>  
> -    if (dc_mr) {
> +    if (dc_mr || ct3d->dc.total_capacity_cmd) {
>          int i;
>          uint64_t region_base = vmr_size + pmr_size;
>  
> @@ -651,8 +655,13 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
>      MemoryRegion *mr;
>      uint64_t dc_size;
>  
> -    mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> -    dc_size = memory_region_size(mr);
> +    if (ct3d->dc.total_capacity_cmd != 0) {
> +        dc_size = ct3d->dc.total_capacity_cmd;
> +    } else {
> +        mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> +        dc_size = memory_region_size(mr);
> +    }
> +
>      region_len = DIV_ROUND_UP(dc_size, ct3d->dc.num_regions);
>  
>      if (dc_size % (ct3d->dc.num_regions * CXL_CAPACITY_MULTIPLIER) != 0) {
> @@ -810,39 +819,43 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
>  
>      ct3d->dc.total_capacity = 0;
>      if (ct3d->dc.num_regions > 0) {
> -        MemoryRegion *dc_mr;
> -        char *dc_name;
> +        if (ct3d->dc.total_capacity_cmd == 0) {
> +            MemoryRegion *dc_mr;
> +            char *dc_name;
>  
> -        if (!ct3d->dc.host_dc) {
> -            error_setg(errp, "dynamic capacity must have a backing device");
> -            return false;
> -        }
> +            if (!ct3d->dc.host_dc) {
> +                error_setg(errp, "dynamic capacity must have a backing device");
> +                return false;
> +            }
>  
> -        dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> -        if (!dc_mr) {
> -            error_setg(errp, "dynamic capacity must have a backing device");
> -            return false;
> -        }
> +            dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> +            if (!dc_mr) {
> +                error_setg(errp, "dynamic capacity must have a backing device");
> +                return false;
> +            }
>  
> -        if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
> -            error_setg(errp, "memory backend %s can't be used multiple times.",
> -               object_get_canonical_path_component(OBJECT(ct3d->dc.host_dc)));
> -            return false;
> -        }
> -        /*
> -         * Set DC regions as volatile for now, non-volatile support can
> -         * be added in the future if needed.
> -         */
> -        memory_region_set_nonvolatile(dc_mr, false);
> -        memory_region_set_enabled(dc_mr, true);
> -        host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
> -        if (ds->id) {
> -            dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
> -        } else {
> -            dc_name = g_strdup("cxl-dcd-dpa-dc-space");
> +            if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
> +                error_setg(errp,
> +                           "memory backend %s can't be used multiple times.",
> +                           object_get_canonical_path_component(
> +                               OBJECT(ct3d->dc.host_dc)));
> +                return false;
> +            }
> +            /*
> +             * Set DC regions as volatile for now, non-volatile support can
> +             * be added in the future if needed.
> +             */
> +            memory_region_set_nonvolatile(dc_mr, false);
> +            memory_region_set_enabled(dc_mr, true);
> +            host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
> +            if (ds->id) {
> +                dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
> +            } else {
> +                dc_name = g_strdup("cxl-dcd-dpa-dc-space");
> +            }
> +            address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
> +            g_free(dc_name);
>          }
> -        address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
> -        g_free(dc_name);
>  
>          if (!cxl_create_dc_regions(ct3d, errp)) {
>              error_append_hint(errp, "setup DC regions failed");
> @@ -1284,6 +1297,8 @@ static const Property ct3_props[] = {
>      DEFINE_PROP_UINT8("num-dc-regions", CXLType3Dev, dc.num_regions, 0),
>      DEFINE_PROP_LINK("volatile-dc-memdev", CXLType3Dev, dc.host_dc,
>                       TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> +    DEFINE_PROP_SIZE("dc-regions-total-size", CXLType3Dev,
> +                     dc.total_capacity_cmd, 0),
>      DEFINE_PROP_PCIE_LINK_SPEED("x-speed", CXLType3Dev,
>                                  speed, PCIE_LINK_SPEED_32),
>      DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLType3Dev,
> @@ -1952,12 +1967,38 @@ bool cxl_extent_groups_overlaps_dpa_range(CXLDCExtentGroupList *list,
>      return false;
>  }
>  
> +static bool cxl_device_lazy_dynamic_capacity_init(CXLType3Dev *ct3d,
> +                                                  const char *tag, Error **errp)
> +{
> +    MemoryRegion *dc_mr;
> +
> +    ct3d->dc.host_dc = memory_backend_tagged_find_by_tag(tag, errp);
> +    if (!ct3d->dc.host_dc) {
> +        error_setg(errp, "dynamic capacity must have a backing device");
> +        return false;
> +    }
> +
> +    dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> +    if (!dc_mr) {
> +        error_setg(errp, "test dynamic capacity must have a backing device");
> +        return false;
> +    }
> +
> +    if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
> +        qemu_log("Warning: memory backend %s is already mapped. Reusing it.\n",
> +               object_get_canonical_path_component(OBJECT(ct3d->dc.host_dc)));
> +        return true;
> +    }
> +
> +    return true;
> +}
> +
>  /*
>   * The main function to process dynamic capacity event with extent list.
>   * Currently DC extents add/release requests are processed.
>   */
>  static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
> -        uint16_t hid, CXLDCEventType type, uint8_t rid,
> +        uint16_t hid, CXLDCEventType type, uint8_t rid, const char *tag,
>          CxlDynamicCapacityExtentList *records, Error **errp)
>  {
>      Object *obj;
> @@ -1966,8 +2007,10 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
>      CxlDynamicCapacityExtentList *list;
>      CXLDCExtentGroup *group = NULL;
>      g_autofree CXLDCExtentRaw *extents = NULL;
> -    uint64_t dpa, offset, len, block_size;
> +    uint64_t dpa, offset, block_size;
> +    uint64_t len = 0;
>      g_autofree unsigned long *blk_bitmap = NULL;
> +    QemuUUID uuid;
>      int i;
>  
>      obj = object_resolve_path_type(path, TYPE_CXL_TYPE3, NULL);
> @@ -1996,6 +2039,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
>          offset = list->value->offset;
>          len = list->value->len;
>          dpa = offset + dcd->dc.regions[rid].base;
> +        qemu_uuid_parse(tag, &uuid);
>  
>          if (len == 0) {
>              error_setg(errp, "extent with 0 length is not allowed");
> @@ -2049,6 +2093,31 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
>          num_extents++;
>      }
>  
> +    if (type == DC_EVENT_ADD_CAPACITY && dcd->dc.total_capacity_cmd) {
> +        MemoryRegion *host_dc_mr;
> +        uint64_t size;
> +
> +        if (num_extents > 1) {
> +            error_setg(errp, "Only single extent add is supported currently");
> +            return;
> +        }
> +
> +        if (!cxl_device_lazy_dynamic_capacity_init(dcd, tag, errp)) {
> +            return;
> +        }
> +
> +        host_dc_mr = host_memory_backend_get_memory(dcd->dc.host_dc);
> +        size = memory_region_size(host_dc_mr);
> +
> +        if (size != len) {
> +            error_setg(errp,
> +                       "Host memory backend size 0x%" PRIx64
> +                       " does not match extent length 0x%" PRIx64,
> +                       size, len);
> +            return;
> +        }
> +    }




* Re: [RFC PATCH 3/7] hw/cxl: Change Extent add/remove APIs for lazy memory backend.
  2025-11-27 22:55 ` [RFC PATCH 3/7] hw/cxl: Change Extent add/remove APIs for lazy memory backend Alireza Sanaee
@ 2026-02-06 12:30   ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2026-02-06 12:30 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: qemu-devel, linuxarm, eblake, armbru, berrange, pbonzini, mst,
	lizhijian, anisa.su, linux-cxl

On Thu, 27 Nov 2025 22:55:21 +0000
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Add extra information in each extent about fix memory window and

fixed

> memory backend, because extent might be backed by different memory
because each extent might be backed by a different memory backend

(under new scheme they are one to one)

> backends, thus such information must be stored in the extent object.
> Consequently, APIs should be changed to support extra members.
> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
> ---
>  hw/cxl/cxl-host.c           |  2 ++
>  hw/cxl/cxl-mailbox-utils.c  | 42 ++++++++++++++++++++++++-------------
>  hw/mem/cxl_type3.c          | 23 +++++++++++++++-----
>  include/hw/cxl/cxl_device.h | 17 ++++++++++++---
>  4 files changed, 61 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c
> index 3a563af3bc..7c8fde4646 100644
> --- a/hw/cxl/cxl-host.c
> +++ b/hw/cxl/cxl-host.c
> @@ -365,6 +365,8 @@ static int cxl_fmws_direct_passthrough(Object *obj, void *opaque)
>          return 0;
>      }
>  
> +    ct3d->dc.fw = fw;
> +
>      if (state->commit) {
>          MemoryRegion *mr = NULL;
>          uint64_t vmr_size = 0, pmr_size = 0;
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 68c7cc9891..ae723c03ec 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -2840,10 +2840,13 @@ CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len)
>  }
>  
>  void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
> +                                             HostMemoryBackend *host_mem,
> +                                             struct CXLFixedWindow *fw,
>                                               uint64_t dpa,
>                                               uint64_t len,
>                                               uint8_t *tag,
> -                                             uint16_t shared_seq)
> +                                             uint16_t shared_seq,
> +                                             int rid)

I'm not sure what rid is...  Maybe there is a better name?
The dynamic capacity region index?





* Re: [RFC PATCH 4/7] hw/cxl: Map lazy memory backend after host acceptance
  2025-11-27 22:55 ` [RFC PATCH 4/7] hw/cxl: Map lazy memory backend after host acceptance Alireza Sanaee
@ 2026-02-06 12:33   ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2026-02-06 12:33 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: qemu-devel, linuxarm, eblake, armbru, berrange, pbonzini, mst,
	lizhijian, anisa.su, linux-cxl

On Thu, 27 Nov 2025 22:55:22 +0000
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Map relevant memory backend when host accepted an extent.

Explain what works at this point.  Does the old non-performant
read / write path land in this memory after this patch?

We could decide not to support that, but the key is that the patch
description should explain where we are at this point.

No comments inline.

> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
> ---
>  hw/cxl/cxl-mailbox-utils.c | 74 ++++++++++++++++++++++++++++++++++++--
>  1 file changed, 71 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index ae723c03ec..b785553225 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -2979,6 +2979,30 @@ static CXLRetCode cxl_detect_malformed_extent_list(CXLType3Dev *ct3d,
>      return CXL_MBOX_SUCCESS;
>  }
>  
> +static bool cxl_extent_find_extent_detail(CXLDCExtentGroupList *list,
> +                                          uint64_t start_dpa,
> +                                          uint64_t len,
> +                                          uint8_t *tag,
> +                                          HostMemoryBackend **hmb,
> +                                          struct CXLFixedWindow **fw,
> +                                          int *rid)
> +{
> +    CXLDCExtent *ent, *ent_next;
> +    CXLDCExtentGroup *group = QTAILQ_FIRST(list);
> +
> +    QTAILQ_FOREACH_SAFE(ent, &group->list, node, ent_next) {
> +        if (ent->start_dpa == start_dpa && ent->len == len) {
> +            *fw = ent->fw;
> +            *hmb = ent->hm;
> +            memcpy(tag, ent->tag, 0x10);
> +            *rid = ent->rid;
> +            return true;
> +        }
> +    }
> +
> +    return false;
> +}
> +
>  static CXLRetCode cxl_dcd_add_dyn_cap_rsp_dry_run(CXLType3Dev *ct3d,
>          const CXLUpdateDCExtentListInPl *in)
>  {
> @@ -3029,8 +3053,12 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
>      CXLUpdateDCExtentListInPl *in = (void *)payload_in;
>      CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
>      CXLDCExtentList *extent_list = &ct3d->dc.extents;
> +    struct CXLFixedWindow *fw;
> +    HostMemoryBackend *hmb_dc;
> +    uint8_t tag[0x10];
>      uint32_t i, num;
>      uint64_t dpa, len;
> +    int rid;
>      CXLRetCode ret;
>  
>      if (len_in < sizeof(*in)) {
> @@ -3065,12 +3093,52 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
>      }
>  
>      for (i = 0; i < in->num_entries_updated; i++) {
> +        bool found;
> +        MemoryRegion *mr;
> +
>          dpa = in->updated_entries[i].start_dpa;
>          len = in->updated_entries[i].len;
>  
> -        cxl_insert_extent_to_extent_list(extent_list, NULL,
> -                                         NULL, dpa, len,
> -                                         NULL, 0, 0);
> +        if (ct3d->dc.total_capacity_cmd) {
> +            found = cxl_extent_find_extent_detail(
> +                &ct3d->dc.extents_pending, dpa, len, tag, &hmb_dc, &fw, &rid);
> +
> +            /*
> +             * This only occurs when host accepts an extent where device does
> +             * not know anything about it.
> +             */
> +            if (!found) {
> +                qemu_log("Could not find the extent detail for DPA 0x%" PRIx64
> +                         " LEN 0x%" PRIx64 "\n",
> +                         dpa, len);
> +                return CXL_MBOX_INVALID_PA;
> +            }
> +
> +            /* The host memory backend should not be already mapped */
> +            if (host_memory_backend_is_mapped(hmb_dc)) {
> +                qemu_log("The host memory backend for DPA 0x%" PRIx64
> +                         " LEN 0x%" PRIx64 " is already mapped\n",
> +                         dpa, len);
> +                return CXL_MBOX_INVALID_PA;
> +            }
> +
> +            mr = host_memory_backend_get_memory(hmb_dc);
> +            if (!mr) {
> +                qemu_log("Could not get memory region from host memory "
> +                         "backend\n");
> +                return CXL_MBOX_INVALID_PA;
> +            }
> +
> +            memory_region_set_nonvolatile(mr, false);
> +            memory_region_set_enabled(mr, true);
> +            host_memory_backend_set_mapped(hmb_dc, true);
> +            cxl_insert_extent_to_extent_list(extent_list, hmb_dc, fw, dpa, len,
> +                                             NULL, 0, rid);
> +        } else {
> +            cxl_insert_extent_to_extent_list(extent_list, NULL, NULL, dpa, len,
> +                                             NULL, 0, -1);
> +        }
> +
>          ct3d->dc.total_extent_count += 1;
>          ct3d->dc.nr_extents_accepted += 1;
>          ct3_set_region_block_backed(ct3d, dpa, len);



* Re: [RFC PATCH 5/7] hw/cxl: Add performant direct mapping for extents
  2025-11-27 22:55 ` [RFC PATCH 5/7] hw/cxl: Add performant direct mapping for extents Alireza Sanaee
@ 2026-02-06 12:41   ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2026-02-06 12:41 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: qemu-devel, linuxarm, eblake, armbru, berrange, pbonzini, mst,
	lizhijian, anisa.su, linux-cxl

On Thu, 27 Nov 2025 22:55:23 +0000
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Add alias direct mapping into the fixed memory window.

Say more on why, plus provide the sequence of commands to do this.

> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>

> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index b785553225..15114a5314 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c

> @@ -3123,6 +3141,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
>              }
>  
>              mr = host_memory_backend_get_memory(hmb_dc);
> +
>              if (!mr) {
>                  qemu_log("Could not get memory region from host memory "
>                           "backend\n");
> @@ -3132,11 +3151,32 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
>              memory_region_set_nonvolatile(mr, false);
>              memory_region_set_enabled(mr, true);
>              host_memory_backend_set_mapped(hmb_dc, true);
> +
> +            if (ct3d->direct_mr_enabled) {
> +                g_autofree char *direct_mapping_name =
> +                    g_strdup_printf("cxl-direct-mapping-%d", mr_idx);
> +                int region_offset = dpa - ct3d->dc.regions[rid].base;
> +                MemoryRegion *dr_dc_mr = &ct3d->dc.dc_direct_mr[mr_idx];
> +                memory_region_init_alias(dr_dc_mr, OBJECT(ct3d),
> +                                         direct_mapping_name, mr, region_offset,
> +                                         ct3d->dc.dc_decoder_window.size);
> +                memory_region_add_subregion(&fw->mr,
> +                                            ct3d->dc.dc_decoder_window.base -
> +                                                fw->base + offset,
> +                                            dr_dc_mr);
> +                /*
> +                 * for now assuming 4 extents and 4 direct mapping memory
> +                 * regions.
> +                 */
> +                ct3d->dc.cur_direct_region_idx =
> +                    (ct3d->dc.cur_direct_region_idx + 1) % 4;

Don't do a modulo as that'll just overwrite someone else's entry.  Just check
if we are out of space.  Ultimately they will come and go in random order,
so you'll need a different data structure to avoid running out too early.
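
For illustration only, a minimal sketch of the "check if we are out of
space" approach, assuming a fixed pool of 4 slots as in the patch. The names
DirectSlotPool, direct_slot_alloc and direct_slot_free are hypothetical and
exist neither in the patch nor in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DIRECT_SLOT_COUNT 4 /* mirrors the "4 for now" assumption in the patch */

/* Hypothetical helper type: records which direct-mapping slots are in use. */
typedef struct DirectSlotPool {
    bool in_use[DIRECT_SLOT_COUNT];
} DirectSlotPool;

/* Return the lowest free slot index, or -1 if every slot is taken. */
static int direct_slot_alloc(DirectSlotPool *pool)
{
    assert(pool != NULL);
    for (int i = 0; i < DIRECT_SLOT_COUNT; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Release a slot so a later extent can reuse it; bad indices are ignored. */
static void direct_slot_free(DirectSlotPool *pool, int idx)
{
    assert(pool != NULL);
    if (idx >= 0 && idx < DIRECT_SLOT_COUNT) {
        pool->in_use[idx] = false;
    }
}
```

With this shape the add path can fail cleanly when direct_slot_alloc()
returns -1 instead of silently reusing a live alias, and slots released in
any order become available again.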

> +            }

> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index fe0c44e8d7..1a521df881 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -524,6 +524,7 @@ typedef struct CXLDCExtent {
>      uint16_t shared_seq;
>      uint8_t rsvd[0x6];
>      int rid;
> +    uint64_t offset;
>  
>      QTAILQ_ENTRY(CXLDCExtent) node;
>  } CXLDCExtent;
> @@ -589,6 +590,7 @@ struct CXLType3Dev {
>  
>      /* State */
>      MemoryRegion direct_mr[CXL_HDM_DECODER_COUNT];
> +    bool direct_mr_enabled;
>      AddressSpace hostvmem_as;
>      AddressSpace hostpmem_as;
>      CXLComponentState cxl_cstate;
> @@ -633,6 +635,14 @@ struct CXLType3Dev {
>          HostMemoryBackend *host_dc;
>          AddressSpace host_dc_as;
>          struct CXLFixedWindow *fw;
> +        int cur_direct_region_idx;
> +        /*
> +         * dc_decoder_window represents the CXL Decoder Window

What decoder? Needs more info on why this exists.
There is no particular reason a decoder maps to a DCD region but
that's probably the largest granularity we'll get. Could well point
to just part of a dc region.  I don't mind rejecting fast path
in those corner cases but we need to make the slow path work then.

> +         */
> +        struct decoder_window {
> +            hwaddr base;
> +            hwaddr size;
> +        } dc_decoder_window;
>          /*
>           * total_capacity is equivalent to the dynamic capability
>           * memory region size.
> @@ -647,6 +657,11 @@ struct CXLType3Dev {
>  
>          uint8_t num_regions; /* 0-8 regions */
>          CXLDCRegion regions[DCD_MAX_NUM_REGION];
> +        /*
> +         * Assume 4 now but many possible, each region is one alias an extent
> +         * to allow performance translation in KVM.

The KVM bit doesn't matter.  It is better with this in TCG as well.
I think we'll ultimately want a list for these - but this is fine for now.

> +         */
> +        MemoryRegion dc_direct_mr[4];
>      } dc;



* Re: [RFC PATCH 6/7] hw/cxl: Add remove alias functionality for extent direct mapping
  2025-11-27 22:55 ` [RFC PATCH 6/7] hw/cxl: Add remove alias functionality for extent direct mapping Alireza Sanaee
@ 2026-02-06 12:43   ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2026-02-06 12:43 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: qemu-devel, linuxarm, eblake, armbru, berrange, pbonzini, mst,
	lizhijian, anisa.su, linux-cxl

On Thu, 27 Nov 2025 22:55:24 +0000
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Remove alias related to an extent when the extent is longer available,
> from removed from the VM.
> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
I'd combine with previous.  Otherwise we are broken in between.
I guess you could add a temporary error return to the remove path in the
previous patch if they really don't make more sense combined.

Jonathan



* Re: [RFC PATCH 7/7] hw/cxl: Add tag-based removal functionality
  2025-11-27 22:55 ` [RFC PATCH 7/7] hw/cxl: Add tag-based removal functionality Alireza Sanaee
@ 2026-02-06 12:49   ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2026-02-06 12:49 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: qemu-devel, linuxarm, eblake, armbru, berrange, pbonzini, mst,
	lizhijian, anisa.su, linux-cxl

On Thu, 27 Nov 2025 22:55:25 +0000
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Add tag based removal, in which alias tear down must be done properly.

I'd add the status check as a separate patch.
Aim to keep the addition of the removal command super simple.

I'm not sure how this interacts with the older handling.  Maybe it's
worth adding a release by tag before all the rest of the series?

Seems like a useful thing on its own - as is the status check,
though I think that interface may need some more thought.

Jonathan

> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
> ---
>  hw/mem/cxl_type3.c | 119 +++++++++++++++++++++++++++++++++++++++++++++
>  qapi/cxl.json      |  46 ++++++++++++++++++
>  2 files changed, 165 insertions(+)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index d3ea62ef3f..29355792da 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -2186,6 +2186,61 @@ void qmp_cxl_add_dynamic_capacity(const char *path, uint16_t host_id,
>      }
>  }
>  
> +static void qmp_cxl_process_dynamic_capacity_tag_based(const char *path,
> +        uint16_t hid, CXLDCEventType type, uint8_t rid, const char *tag,
> +        CxlDynamicCapacityExtentList *records, Error **errp) {
Check style...  I don't remember QEMU putting { on that line.


> +
> +    Object *obj;
> +    CXLType3Dev *dcd;
> +    CXLDCExtentList *list = NULL;
> +    CXLDCExtent *ent;
> +    g_autofree CXLDCExtentRaw *extents = NULL;
> +
> +    obj = object_resolve_path_type(path, TYPE_CXL_TYPE3, NULL);
> +    if (!obj) {
> +        error_setg(errp, "Unable to resolve CXL type 3 device");
> +        return;
> +    }
> +
> +    dcd = CXL_TYPE3(obj);
> +    if (!dcd->dc.num_regions) {
> +        error_setg(errp, "No dynamic capacity support from the device");
> +        return;
> +    }
> +
> +    if (rid >= dcd->dc.num_regions) {
> +        error_setg(errp, "region id is too large");
> +        return;
> +    }
> +
> +    QemuUUID uuid_req;
> +    qemu_uuid_parse(tag, &uuid_req);
> +
> +    list = &dcd->dc.extents;
> +    size_t cap = 8, n = 0;
> +    extents = g_new0(CXLDCExtentRaw, cap);
> +    QTAILQ_FOREACH(ent, list, node) {
> +        QemuUUID uuid_ext;
> +        memcpy(&uuid_ext.data, ent->tag, sizeof(ent->tag));
> +        if (!qemu_uuid_is_equal(&uuid_req, &uuid_ext)) {
> +            continue;
> +        }
> +
> +        if (n == cap) {
> +            cap = cap < 8 ? 8 : cap * 2;
> +            extents = g_renew(CXLDCExtentRaw, extents, cap);
> +        }
> +
> +        extents[n++] = (CXLDCExtentRaw){ .start_dpa = ent->start_dpa,
> +                                         .len = ent->len,
> +                                         .shared_seq = 0 };
> +    }
> +
> +    extents = g_renew(CXLDCExtentRaw, extents, n);
> +    cxl_create_dc_event_records_for_extents(dcd, type, extents, n);
> +    return;
> +}
> +
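
For illustration only, the tag-matching loop in the hunk above can be
sketched stand-alone in plain C. The Extent type and collect_by_tag() are
hypothetical simplifications, not QEMU APIs. The sketch uses realloc()
instead of g_renew() and makes the no-match case explicit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TAG_LEN 16 /* extent tags are 0x10 bytes in the quoted code */

/* Hypothetical simplified extent: only the fields the filter needs. */
typedef struct Extent {
    uint64_t start_dpa;
    uint64_t len;
    uint8_t tag[TAG_LEN];
} Extent;

/*
 * Copy every extent whose tag equals `tag` into a newly allocated array.
 * Returns the number of matches and stores the array in *out. *out is NULL
 * when nothing matches or when allocation fails (0 is returned then).
 * The caller frees *out.
 */
static size_t collect_by_tag(const Extent *all, size_t count,
                             const uint8_t tag[TAG_LEN], Extent **out)
{
    size_t cap = 0, n = 0;
    Extent *res = NULL;

    assert(out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (memcmp(all[i].tag, tag, TAG_LEN) != 0) {
            continue;
        }
        if (n == cap) {
            /* Grow geometrically, starting at 8 entries. */
            size_t new_cap = cap ? cap * 2 : 8;
            Extent *tmp = realloc(res, new_cap * sizeof(*res));
            if (!tmp) {
                free(res);
                *out = NULL;
                return 0;
            }
            res = tmp;
            cap = new_cap;
        }
        res[n++] = all[i];
    }
    *out = res;
    return n;
}
```

A caller would turn the returned array into release event records and could
report an error when the count is 0, since no extent carries that tag.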




Thread overview: 15+ messages
2025-11-27 22:55 [RFC QEMU PATCH 0/7] Application Specific Tagged Memory Support in CXL Type 3 Devices Alireza Sanaee
2025-11-27 22:55 ` [RFC PATCH 1/7] hw/mem: Add tagged memory backend object Alireza Sanaee
2026-02-06 12:16   ` Jonathan Cameron
2025-11-27 22:55 ` [RFC PATCH 2/7] hw/cxl: Allow initializing type3 device with no backing device Alireza Sanaee
2026-02-06 12:28   ` Jonathan Cameron
2025-11-27 22:55 ` [RFC PATCH 3/7] hw/cxl: Change Extent add/remove APIs for lazy memory backend Alireza Sanaee
2026-02-06 12:30   ` Jonathan Cameron
2025-11-27 22:55 ` [RFC PATCH 4/7] hw/cxl: Map lazy memory backend after host acceptance Alireza Sanaee
2026-02-06 12:33   ` Jonathan Cameron
2025-11-27 22:55 ` [RFC PATCH 5/7] hw/cxl: Add performant direct mapping for extents Alireza Sanaee
2026-02-06 12:41   ` Jonathan Cameron
2025-11-27 22:55 ` [RFC PATCH 6/7] hw/cxl: Add remove alias functionality for extent direct mapping Alireza Sanaee
2026-02-06 12:43   ` Jonathan Cameron
2025-11-27 22:55 ` [RFC PATCH 7/7] hw/cxl: Add tag-based removal functionality Alireza Sanaee
2026-02-06 12:49   ` Jonathan Cameron
