* [PATCH v8 00/14] Enabling DCD emulation support in Qemu
@ 2024-05-23 17:44 nifan.cxl
2024-05-23 17:44 ` [PATCH v8 01/14] hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference nifan.cxl
` (15 more replies)
0 siblings, 16 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst
From: Fan Ni <nifan.cxl@gmail.com>
A git tree of this series can be found here (with one extra commit on top
for printing out accepted/pending extent list for testing):
https://github.com/moking/qemu/tree/dcd-v8-qapi
v7->v8:
This version carries over the following two patches from Gregory.
1. hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference
https://gitlab.com/jic23/qemu/-/commit/f44ebc5a455ccdd6535879b0c5824e0d76b04da5
2. hw/cxl/mailbox: interface to add CCI commands to an existing CCI
https://gitlab.com/jic23/qemu/-/commit/00a4dd8b388add03c588298f665ee918626296a5
Note, the above two patches are not directly related to DCD emulation.
All the following patches in this series are built on top of mainstream QEMU
and the above two patches.
The most significant change in v8 is in Patch 11 (Patch 9 in v7). Based on
feedback from Markus and Jonathan, the QMP interfaces for adding and releasing
DC extents have been redesigned and now look like this:
# Add a 128MB extent at offset 0 to region 0
{ "execute": "cxl-add-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-memdev0",
"host-id":0,
"selection-policy": "prescriptive",
"region": 0,
"tag": "",
"extents": [
{
"offset": 0,
"len": 134217728
}
]
}
}
Note: tag is optional.
# Release a 128MB extent at offset 0 from region 0
{ "execute": "cxl-release-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-memdev0",
"host-id":0,
"removal-policy":"prescriptive",
"forced-removal": false,
"sanitize-on-release": false,
"region": 0,
"tag": "",
"extents": [
{
"offset": 0,
"len": 134217728
}
]
}
}
Note: removal-policy, sanitize-on-release and tag are optional.
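The "len" value in both examples is a byte count: 128 MiB = 128 * 2^20 =
134217728. The following is a minimal sketch of that arithmetic, not code from
this series. The helper names are hypothetical, and the alignment check assumes
the 2 MiB block size that Patch 6 hard-codes for DC regions; whether the QMP
handlers enforce it is not shown in this part of the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)
/* Assumption: 2 MiB block size, the value hard-coded in Patch 6. */
#define EXAMPLE_BLOCK_SIZE (2 * MIB)

/* Hypothetical helper: extent size in MiB -> byte count for "len". */
static uint64_t extent_len_bytes(uint64_t size_mib)
{
    assert(size_mib <= UINT64_MAX / MIB); /* guard against overflow */
    return size_mib * MIB;
}

/* Hypothetical helper: non-empty extent with block-aligned offset and len. */
static bool extent_is_block_aligned(uint64_t offset, uint64_t len)
{
    return len != 0 &&
           offset % EXAMPLE_BLOCK_SIZE == 0 &&
           len % EXAMPLE_BLOCK_SIZE == 0;
}
```

For the extent above, extent_len_bytes(128) gives the 134217728 used in the
"extents" list, and offset 0 with that length is block aligned.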
Other changes include,
1. Applied tags to patches.
2. Replaced error_setg with error_append_hint for cxl_create_dc_region error
case in Patch 6 (Patch 4 in v7); (Zhijian Li)
3. Updated the error message to include region size information in
cxl_create_dc_region.
4. Set range1_size_hi to 0 for DCD in build_dvsec. (Jonathan)
5. Several minor format fixes.
Thanks to Markus, Jonathan, Gregory, and Zhijian for reviewing v7 and to
Svetly Todorov for testing v7.
This series passes the same tests as v7; see the v7 cover letter for
more details. Additionally, we tested the QAPI interface for
adding/releasing DC extents with optional input parameters.
v7: https://lore.kernel.org/linux-cxl/5856b7a4-4082-465f-9f61-b1ec6c35ef0f@fujitsu.com/T/#mec4c85022ce28c80b241aaf2d5431cadaa45f097
Fan Ni (12):
hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
payload of identify memory device command
hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
and mailbox command support
include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
type3 memory devices
hw/mem/cxl_type3: Add support to create DC regions to type3 memory
devices
hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr
size instead of mr as argument
hw/mem/cxl_type3: Add host backend and address space handling for DC
regions
hw/mem/cxl_type3: Add DC extent list representative and get DC extent
list mailbox support
hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release
dynamic capacity response
hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
extents
hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions
hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support
hw/mem/cxl_type3: Allow to release extent superset in QMP interface
Gregory Price (2):
hw/cxl/mailbox: change CCI cmd set structure to be a member, not a
reference
hw/cxl/mailbox: interface to add CCI commands to an existing CCI
hw/cxl/cxl-mailbox-utils.c | 658 +++++++++++++++++++++++++++++++++++-
hw/mem/cxl_type3.c | 634 ++++++++++++++++++++++++++++++++--
hw/mem/cxl_type3_stubs.c | 25 ++
include/hw/cxl/cxl_device.h | 85 ++++-
include/hw/cxl/cxl_events.h | 18 +
qapi/cxl.json | 143 ++++++++
6 files changed, 1511 insertions(+), 52 deletions(-)
--
2.43.0
* [PATCH v8 01/14] hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 02/14] hw/cxl/mailbox: interface to add CCI commands to an existing CCI nifan.cxl
` (14 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Gregory Price,
Jonathan Cameron, Fan Ni
From: Gregory Price <gourry.memverge@gmail.com>
This allows devices to have fully customized CCIs, along with complex
devices where wrapper devices can override or add additional CCI
commands without having to replicate full command structures or
pollute a base device with every command that might ever be used.
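As an illustration of the idea (a reduced sketch, not the QEMU code), the
per-device table below is a struct member that is filled by copying only the
populated entries of one or more const tables. The toy_* names and the 4x4
table size are assumptions made for brevity; the real tables are 256x256.

```c
#include <assert.h>
#include <stddef.h>

#define N 4 /* reduced from 256 for illustration */

struct toy_cmd {
    const char *name;
    int (*handler)(void);
};

struct toy_cci {
    struct toy_cmd cmd_set[N][N]; /* a member, not a pointer to a shared table */
};

/* Copy only entries that have a handler, so later tables add or override. */
static void toy_copy_commands(struct toy_cci *cci,
                              const struct toy_cmd (*cmds)[N])
{
    assert(cci && cmds);
    for (int set = 0; set < N; set++) {
        for (int cmd = 0; cmd < N; cmd++) {
            if (cmds[set][cmd].handler) {
                cci->cmd_set[set][cmd] = cmds[set][cmd];
            }
        }
    }
}

static int toy_identify(void) { return 1; }
static int toy_vendor(void) { return 2; }

static const struct toy_cmd toy_base[N][N] = {
    [0][1] = { "IDENTIFY", toy_identify },
};

static const struct toy_cmd toy_extra[N][N] = {
    [3][0] = { "VENDOR", toy_vendor },
};

/* Build a device-private table from the base set plus a wrapper's additions. */
static struct toy_cci toy_build(void)
{
    struct toy_cci cci = { 0 };

    toy_copy_commands(&cci, toy_base);
    toy_copy_commands(&cci, toy_extra);
    return cci;
}
```

Because each device owns its copy, a wrapper can layer toy_extra on top of
toy_base without modifying either const table.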
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 19 +++++++++++++++----
include/hw/cxl/cxl_device.h | 2 +-
2 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index e5eb97cb91..2c9f50f0f9 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -1447,10 +1447,21 @@ void cxl_init_cci(CXLCCI *cci, size_t payload_max)
bg_timercb, cci);
}
+static void cxl_copy_cci_commands(CXLCCI *cci, const struct cxl_cmd (*cxl_cmds)[256])
+{
+ for (int set = 0; set < 256; set++) {
+ for (int cmd = 0; cmd < 256; cmd++) {
+ if (cxl_cmds[set][cmd].handler) {
+ cci->cxl_cmd_set[set][cmd] = cxl_cmds[set][cmd];
+ }
+ }
+ }
+}
+
void cxl_initialize_mailbox_swcci(CXLCCI *cci, DeviceState *intf,
DeviceState *d, size_t payload_max)
{
- cci->cxl_cmd_set = cxl_cmd_set_sw;
+ cxl_copy_cci_commands(cci, cxl_cmd_set_sw);
cci->d = d;
cci->intf = intf;
cxl_init_cci(cci, payload_max);
@@ -1458,7 +1469,7 @@ void cxl_initialize_mailbox_swcci(CXLCCI *cci, DeviceState *intf,
void cxl_initialize_mailbox_t3(CXLCCI *cci, DeviceState *d, size_t payload_max)
{
- cci->cxl_cmd_set = cxl_cmd_set;
+ cxl_copy_cci_commands(cci, cxl_cmd_set);
cci->d = d;
/* No separation for PCI MB as protocol handled in PCI device */
@@ -1476,7 +1487,7 @@ static const struct cxl_cmd cxl_cmd_set_t3_ld[256][256] = {
void cxl_initialize_t3_ld_cci(CXLCCI *cci, DeviceState *d, DeviceState *intf,
size_t payload_max)
{
- cci->cxl_cmd_set = cxl_cmd_set_t3_ld;
+ cxl_copy_cci_commands(cci, cxl_cmd_set_t3_ld);
cci->d = d;
cci->intf = intf;
cxl_init_cci(cci, payload_max);
@@ -1496,7 +1507,7 @@ void cxl_initialize_t3_fm_owned_ld_mctpcci(CXLCCI *cci, DeviceState *d,
DeviceState *intf,
size_t payload_max)
{
- cci->cxl_cmd_set = cxl_cmd_set_t3_fm_owned_ld_mctp;
+ cxl_copy_cci_commands(cci, cxl_cmd_set_t3_fm_owned_ld_mctp);
cci->d = d;
cci->intf = intf;
cxl_init_cci(cci, payload_max);
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 279b276bda..ccc4611875 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -164,7 +164,7 @@ typedef struct CXLEventLog {
} CXLEventLog;
typedef struct CXLCCI {
- const struct cxl_cmd (*cxl_cmd_set)[256];
+ struct cxl_cmd cxl_cmd_set[256][256];
struct cel_log {
uint16_t opcode;
uint16_t effect;
--
2.43.0
* [PATCH v8 02/14] hw/cxl/mailbox: interface to add CCI commands to an existing CCI
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
2024-05-23 17:44 ` [PATCH v8 01/14] hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 03/14] hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output payload of identify memory device command nifan.cxl
` (13 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Gregory Price,
Jonathan Cameron, Fan Ni
From: Gregory Price <gourry.memverge@gmail.com>
This enables wrapper devices to customize the base device's CCI
(for example, with custom commands outside the specification)
without the need to change the base device.
This also enables the base device to dispatch those commands without
requiring additional driver support.
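The flow can be sketched as follows, under stated assumptions: the toy_* names,
the 4x4 table size and the flat opcode array are illustrative stand-ins, not
the QEMU structures. The sketch shows why the command effects log (CEL) count
is reset before a rebuild and how payload_max only ever grows. Opcodes are
formed as (set << 8) | cmd, consistent with DCD_CONFIG 0x48 / GET_DC_CONFIG 0x0
mapping to opcode 4800h in Patch 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 4 /* reduced from 256 for illustration */

struct toy_cmd {
    int (*handler)(void);
};

struct toy_cci {
    struct toy_cmd cmd_set[N][N];
    uint16_t cel[N * N]; /* opcodes of all supported commands */
    size_t cel_size;
    size_t payload_max;
};

/* Rebuild the log from scratch so repeated calls never duplicate entries. */
static void toy_rebuild_cel(struct toy_cci *cci)
{
    cci->cel_size = 0;
    for (int set = 0; set < N; set++) {
        for (int cmd = 0; cmd < N; cmd++) {
            if (cci->cmd_set[set][cmd].handler) {
                cci->cel[cci->cel_size++] = (uint16_t)((set << 8) | cmd);
            }
        }
    }
}

/* Merge extra commands into an existing CCI, then refresh the log. */
static void toy_add_commands(struct toy_cci *cci,
                             const struct toy_cmd (*cmds)[N],
                             size_t payload_max)
{
    assert(cci && cmds);
    if (payload_max > cci->payload_max) {
        cci->payload_max = payload_max;
    }
    for (int set = 0; set < N; set++) {
        for (int cmd = 0; cmd < N; cmd++) {
            if (cmds[set][cmd].handler) {
                cci->cmd_set[set][cmd] = cmds[set][cmd];
            }
        }
    }
    toy_rebuild_cel(cci);
}

static int toy_identify(void) { return 1; }
static int toy_vendor(void) { return 2; }

static const struct toy_cmd toy_base[N][N] = { [0][1] = { toy_identify } };
static const struct toy_cmd toy_extra[N][N] = { [3][0] = { toy_vendor } };
```

Calling toy_add_commands() twice, first with toy_base and then with toy_extra,
leaves two log entries rather than three, because the rebuild starts from zero.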
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
--
Heavily edited by Jonathan Cameron to increase code reuse
---
hw/cxl/cxl-mailbox-utils.c | 19 +++++++++++++++++--
include/hw/cxl/cxl_device.h | 2 ++
2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 2c9f50f0f9..4bcd727f4c 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -1424,9 +1424,9 @@ static void bg_timercb(void *opaque)
}
}
-void cxl_init_cci(CXLCCI *cci, size_t payload_max)
+static void cxl_rebuild_cel(CXLCCI *cci)
{
- cci->payload_max = payload_max;
+ cci->cel_size = 0; /* Reset for a fresh build */
for (int set = 0; set < 256; set++) {
for (int cmd = 0; cmd < 256; cmd++) {
if (cci->cxl_cmd_set[set][cmd].handler) {
@@ -1440,6 +1440,13 @@ void cxl_init_cci(CXLCCI *cci, size_t payload_max)
}
}
}
+}
+
+void cxl_init_cci(CXLCCI *cci, size_t payload_max)
+{
+ cci->payload_max = payload_max;
+ cxl_rebuild_cel(cci);
+
cci->bg.complete_pct = 0;
cci->bg.starttime = 0;
cci->bg.runtime = 0;
@@ -1458,6 +1465,14 @@ static void cxl_copy_cci_commands(CXLCCI *cci, const struct cxl_cmd (*cxl_cmds)[
}
}
+void cxl_add_cci_commands(CXLCCI *cci, const struct cxl_cmd (*cxl_cmd_set)[256],
+ size_t payload_max)
+{
+ cci->payload_max = payload_max > cci->payload_max ? payload_max : cci->payload_max;
+ cxl_copy_cci_commands(cci, cxl_cmd_set);
+ cxl_rebuild_cel(cci);
+}
+
void cxl_initialize_mailbox_swcci(CXLCCI *cci, DeviceState *intf,
DeviceState *d, size_t payload_max)
{
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index ccc4611875..a5f8e25020 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -301,6 +301,8 @@ void cxl_initialize_mailbox_t3(CXLCCI *cci, DeviceState *d, size_t payload_max);
void cxl_initialize_mailbox_swcci(CXLCCI *cci, DeviceState *intf,
DeviceState *d, size_t payload_max);
void cxl_init_cci(CXLCCI *cci, size_t payload_max);
+void cxl_add_cci_commands(CXLCCI *cci, const struct cxl_cmd (*cxl_cmd_set)[256],
+ size_t payload_max);
int cxl_process_cci_message(CXLCCI *cci, uint8_t set, uint8_t cmd,
size_t len_in, uint8_t *pl_in,
size_t *len_out, uint8_t *pl_out,
--
2.43.0
* [PATCH v8 03/14] hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output payload of identify memory device command
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
2024-05-23 17:44 ` [PATCH v8 01/14] hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference nifan.cxl
2024-05-23 17:44 ` [PATCH v8 02/14] hw/cxl/mailbox: interface to add CCI commands to an existing CCI nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 04/14] hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and mailbox command support nifan.cxl
` (12 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
Based on CXL spec r3.1 Table 8-127 (Identify Memory Device Output
Payload), dynamic capacity event log size should be part of
output of the Identify command.
Add dc_event_log_size to the output payload for the host to get the info.
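The size change from 0x43 to 0x45 bytes can be checked in isolation. This is a
sketch only: the existing payload fields are collapsed into an opaque 0x43-byte
array, the toy_* names are hypothetical, and __attribute__((packed)) (GCC/Clang)
stands in for QEMU_PACKED.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the Identify Memory Device output payload. */
struct toy_identify_out {
    uint8_t existing_fields[0x43]; /* all fields present before this patch */
    uint16_t dc_event_log_size;    /* new trailing field */
} __attribute__((packed));

_Static_assert(sizeof(struct toy_identify_out) == 0x45,
               "payload grows by exactly two bytes");
_Static_assert(offsetof(struct toy_identify_out, dc_event_log_size) == 0x43,
               "new field follows the existing payload");

/* Store a 16-bit value in little-endian order at a possibly unaligned address. */
static void toy_stw_le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Zero the payload and fill in the log size (8 in this patch). */
static void toy_fill_identify(struct toy_identify_out *out)
{
    assert(out);
    memset(out, 0, sizeof(*out));
    toy_stw_le((uint8_t *)out +
               offsetof(struct toy_identify_out, dc_event_log_size), 8);
}
```

The field sits at an odd offset, which is why the sketch writes it byte by byte
instead of through a uint16_t pointer.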
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 4bcd727f4c..ba1d9901df 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -21,6 +21,7 @@
#include "sysemu/hostmem.h"
#define CXL_CAPACITY_MULTIPLIER (256 * MiB)
+#define CXL_DC_EVENT_LOG_SIZE 8
/*
* How to add a new command, example. The command set FOO, with cmd BAR.
@@ -780,8 +781,9 @@ static CXLRetCode cmd_identify_memory_device(const struct cxl_cmd *cmd,
uint16_t inject_poison_limit;
uint8_t poison_caps;
uint8_t qos_telemetry_caps;
+ uint16_t dc_event_log_size;
} QEMU_PACKED *id;
- QEMU_BUILD_BUG_ON(sizeof(*id) != 0x43);
+ QEMU_BUILD_BUG_ON(sizeof(*id) != 0x45);
CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
CXLDeviceState *cxl_dstate = &ct3d->cxl_dstate;
@@ -807,6 +809,7 @@ static CXLRetCode cmd_identify_memory_device(const struct cxl_cmd *cmd,
st24_le_p(id->poison_list_max_mer, 256);
/* No limit - so limited by main poison record limit */
stw_le_p(&id->inject_poison_limit, 0);
+ stw_le_p(&id->dc_event_log_size, CXL_DC_EVENT_LOG_SIZE);
*len_out = sizeof(*id);
return CXL_MBOX_SUCCESS;
--
2.43.0
* [PATCH v8 04/14] hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and mailbox command support
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (2 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 03/14] hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output payload of identify memory device command nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 05/14] include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for type3 memory devices nifan.cxl
` (11 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
Per CXL spec r3.1, add a dynamic capacity (DC) region representation based on
Table 8-165 and extend the CXL type3 device definition to include DC region
information. Also, based on section 8.2.9.9.9.1, add 'Get Dynamic Capacity
Configuration' mailbox support.
Note: we store region decode length as byte-wise length on the device, which
should be divided by 256 * MiB before being returned to the host
for "Get Dynamic Capacity Configuration" mailbox command per
specification.
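A small sketch of the unit conversion and of the output payload length, using
the sizes of the packed structures in this patch (8-byte header, 40-byte region
record, 16-byte trailer). The function and constant names are hypothetical and
the code is independent of QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)
#define CAPACITY_MULTIPLIER (256 * MIB)

/* Sizes derived from the packed structs in cmd_dcd_get_dyn_cap_config(). */
enum {
    HDR_LEN = 8,      /* num_regions + regions_returned + 6 reserved bytes */
    RECORD_LEN = 40,  /* 4 x uint64_t + uint32_t + flags + 3 reserved bytes */
    TRAILER_LEN = 16, /* 4 x uint32_t extent/tag counters */
};

/* Byte-wise decode length on the device -> 256 MiB multiples for the host. */
static uint64_t decode_len_to_multiples(uint64_t decode_len_bytes)
{
    assert(decode_len_bytes % CAPACITY_MULTIPLIER == 0);
    return decode_len_bytes / CAPACITY_MULTIPLIER;
}

/* Payload length for a request starting at start_rid for region_cnt regions. */
static size_t dc_config_payload_len(unsigned num_regions, unsigned start_rid,
                                    unsigned region_cnt)
{
    unsigned remaining, record_count;

    assert(start_rid < num_regions); /* the handler rejects this case */
    remaining = num_regions - start_rid;
    record_count = remaining < region_cnt ? remaining : region_cnt;
    return HDR_LEN + (size_t)record_count * RECORD_LEN + TRAILER_LEN;
}
```

With the 2 GiB decode length used later in the series, the host sees a value of
8; a device with two regions queried from region 0 returns 8 + 2 * 40 + 16 =
104 bytes.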
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 96 +++++++++++++++++++++++++++++++++++++
include/hw/cxl/cxl_device.h | 16 +++++++
2 files changed, 112 insertions(+)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index ba1d9901df..49c7944d93 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -22,6 +22,8 @@
#define CXL_CAPACITY_MULTIPLIER (256 * MiB)
#define CXL_DC_EVENT_LOG_SIZE 8
+#define CXL_NUM_EXTENTS_SUPPORTED 512
+#define CXL_NUM_TAGS_SUPPORTED 0
/*
* How to add a new command, example. The command set FOO, with cmd BAR.
@@ -80,6 +82,8 @@ enum {
#define GET_POISON_LIST 0x0
#define INJECT_POISON 0x1
#define CLEAR_POISON 0x2
+ DCD_CONFIG = 0x48,
+ #define GET_DC_CONFIG 0x0
PHYSICAL_SWITCH = 0x51,
#define IDENTIFY_SWITCH_DEVICE 0x0
#define GET_PHYSICAL_PORT_STATE 0x1
@@ -1238,6 +1242,88 @@ static CXLRetCode cmd_media_clear_poison(const struct cxl_cmd *cmd,
return CXL_MBOX_SUCCESS;
}
+/*
+ * CXL r3.1 section 8.2.9.9.9.1: Get Dynamic Capacity Configuration
+ * (Opcode: 4800h)
+ */
+static CXLRetCode cmd_dcd_get_dyn_cap_config(const struct cxl_cmd *cmd,
+ uint8_t *payload_in,
+ size_t len_in,
+ uint8_t *payload_out,
+ size_t *len_out,
+ CXLCCI *cci)
+{
+ CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
+ struct {
+ uint8_t region_cnt;
+ uint8_t start_rid;
+ } QEMU_PACKED *in = (void *)payload_in;
+ struct {
+ uint8_t num_regions;
+ uint8_t regions_returned;
+ uint8_t rsvd1[6];
+ struct {
+ uint64_t base;
+ uint64_t decode_len;
+ uint64_t region_len;
+ uint64_t block_size;
+ uint32_t dsmadhandle;
+ uint8_t flags;
+ uint8_t rsvd2[3];
+ } QEMU_PACKED records[];
+ } QEMU_PACKED *out = (void *)payload_out;
+ struct {
+ uint32_t num_extents_supported;
+ uint32_t num_extents_available;
+ uint32_t num_tags_supported;
+ uint32_t num_tags_available;
+ } QEMU_PACKED *extra_out;
+ uint16_t record_count;
+ uint16_t i;
+ uint16_t out_pl_len;
+ uint8_t start_rid;
+
+ start_rid = in->start_rid;
+ if (start_rid >= ct3d->dc.num_regions) {
+ return CXL_MBOX_INVALID_INPUT;
+ }
+
+ record_count = MIN(ct3d->dc.num_regions - in->start_rid, in->region_cnt);
+
+ out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
+ extra_out = (void *)(payload_out + out_pl_len);
+ out_pl_len += sizeof(*extra_out);
+ assert(out_pl_len <= CXL_MAILBOX_MAX_PAYLOAD_SIZE);
+
+ out->num_regions = ct3d->dc.num_regions;
+ out->regions_returned = record_count;
+ for (i = 0; i < record_count; i++) {
+ stq_le_p(&out->records[i].base,
+ ct3d->dc.regions[start_rid + i].base);
+ stq_le_p(&out->records[i].decode_len,
+ ct3d->dc.regions[start_rid + i].decode_len /
+ CXL_CAPACITY_MULTIPLIER);
+ stq_le_p(&out->records[i].region_len,
+ ct3d->dc.regions[start_rid + i].len);
+ stq_le_p(&out->records[i].block_size,
+ ct3d->dc.regions[start_rid + i].block_size);
+ stl_le_p(&out->records[i].dsmadhandle,
+ ct3d->dc.regions[start_rid + i].dsmadhandle);
+ out->records[i].flags = ct3d->dc.regions[start_rid + i].flags;
+ }
+ /*
+ * TODO: Assign values once extents and tags are introduced
+ * to use.
+ */
+ stl_le_p(&extra_out->num_extents_supported, CXL_NUM_EXTENTS_SUPPORTED);
+ stl_le_p(&extra_out->num_extents_available, CXL_NUM_EXTENTS_SUPPORTED);
+ stl_le_p(&extra_out->num_tags_supported, CXL_NUM_TAGS_SUPPORTED);
+ stl_le_p(&extra_out->num_tags_available, CXL_NUM_TAGS_SUPPORTED);
+
+ *len_out = out_pl_len;
+ return CXL_MBOX_SUCCESS;
+}
+
#define IMMEDIATE_CONFIG_CHANGE (1 << 1)
#define IMMEDIATE_DATA_CHANGE (1 << 2)
#define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -1282,6 +1368,11 @@ static const struct cxl_cmd cxl_cmd_set[256][256] = {
cmd_media_clear_poison, 72, 0 },
};
+static const struct cxl_cmd cxl_cmd_set_dcd[256][256] = {
+ [DCD_CONFIG][GET_DC_CONFIG] = { "DCD_GET_DC_CONFIG",
+ cmd_dcd_get_dyn_cap_config, 2, 0 },
+};
+
static const struct cxl_cmd cxl_cmd_set_sw[256][256] = {
[INFOSTAT][IS_IDENTIFY] = { "IDENTIFY", cmd_infostat_identify, 0, 0 },
[INFOSTAT][BACKGROUND_OPERATION_STATUS] = { "BACKGROUND_OPERATION_STATUS",
@@ -1487,7 +1578,12 @@ void cxl_initialize_mailbox_swcci(CXLCCI *cci, DeviceState *intf,
void cxl_initialize_mailbox_t3(CXLCCI *cci, DeviceState *d, size_t payload_max)
{
+ CXLType3Dev *ct3d = CXL_TYPE3(d);
+
cxl_copy_cci_commands(cci, cxl_cmd_set);
+ if (ct3d->dc.num_regions) {
+ cxl_copy_cci_commands(cci, cxl_cmd_set_dcd);
+ }
cci->d = d;
/* No separation for PCI MB as protocol handled in PCI device */
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index a5f8e25020..e839370266 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -422,6 +422,17 @@ typedef struct CXLPoison {
typedef QLIST_HEAD(, CXLPoison) CXLPoisonList;
#define CXL_POISON_LIST_LIMIT 256
+#define DCD_MAX_NUM_REGION 8
+
+typedef struct CXLDCRegion {
+ uint64_t base; /* aligned to 256*MiB */
+ uint64_t decode_len; /* aligned to 256*MiB */
+ uint64_t len;
+ uint64_t block_size;
+ uint32_t dsmadhandle;
+ uint8_t flags;
+} CXLDCRegion;
+
struct CXLType3Dev {
/* Private */
PCIDevice parent_obj;
@@ -454,6 +465,11 @@ struct CXLType3Dev {
unsigned int poison_list_cnt;
bool poison_list_overflowed;
uint64_t poison_list_overflow_ts;
+
+ struct dynamic_capacity {
+ uint8_t num_regions; /* 0-8 regions */
+ CXLDCRegion regions[DCD_MAX_NUM_REGION];
+ } dc;
};
#define TYPE_CXL_TYPE3 "cxl-type3"
--
2.43.0
* [PATCH v8 05/14] include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for type3 memory devices
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (3 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 04/14] hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and mailbox command support nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 06/14] hw/mem/cxl_type3: Add support to create DC regions to " nifan.cxl
` (10 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
Rename mem_size as static_mem_size for type3 memdev to cover static RAM and
pmem capacity, preparing for the introduction of dynamic capacity to support
dynamic capacity devices.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 4 ++--
hw/mem/cxl_type3.c | 8 ++++----
include/hw/cxl/cxl_device.h | 2 +-
3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 49c7944d93..0f2ad58a14 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -803,7 +803,7 @@ static CXLRetCode cmd_identify_memory_device(const struct cxl_cmd *cmd,
snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
stq_le_p(&id->total_capacity,
- cxl_dstate->mem_size / CXL_CAPACITY_MULTIPLIER);
+ cxl_dstate->static_mem_size / CXL_CAPACITY_MULTIPLIER);
stq_le_p(&id->persistent_capacity,
cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER);
stq_le_p(&id->volatile_capacity,
@@ -1179,7 +1179,7 @@ static CXLRetCode cmd_media_clear_poison(const struct cxl_cmd *cmd,
struct clear_poison_pl *in = (void *)payload_in;
dpa = ldq_le_p(&in->dpa);
- if (dpa + CXL_CACHE_LINE_SIZE > cxl_dstate->mem_size) {
+ if (dpa + CXL_CACHE_LINE_SIZE > cxl_dstate->static_mem_size) {
return CXL_MBOX_INVALID_PA;
}
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 3e42490b6c..7194c8f902 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -608,7 +608,7 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
}
address_space_init(&ct3d->hostvmem_as, vmr, v_name);
ct3d->cxl_dstate.vmem_size = memory_region_size(vmr);
- ct3d->cxl_dstate.mem_size += memory_region_size(vmr);
+ ct3d->cxl_dstate.static_mem_size += memory_region_size(vmr);
g_free(v_name);
}
@@ -631,7 +631,7 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
}
address_space_init(&ct3d->hostpmem_as, pmr, p_name);
ct3d->cxl_dstate.pmem_size = memory_region_size(pmr);
- ct3d->cxl_dstate.mem_size += memory_region_size(pmr);
+ ct3d->cxl_dstate.static_mem_size += memory_region_size(pmr);
g_free(p_name);
}
@@ -837,7 +837,7 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
return -EINVAL;
}
- if (*dpa_offset > ct3d->cxl_dstate.mem_size) {
+ if (*dpa_offset > ct3d->cxl_dstate.static_mem_size) {
return -EINVAL;
}
@@ -1010,7 +1010,7 @@ static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
return false;
}
- if (dpa_offset + CXL_CACHE_LINE_SIZE > ct3d->cxl_dstate.mem_size) {
+ if (dpa_offset + CXL_CACHE_LINE_SIZE > ct3d->cxl_dstate.static_mem_size) {
return false;
}
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index e839370266..f7f56b44e3 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -234,7 +234,7 @@ typedef struct cxl_device_state {
} timestamp;
/* memory region size, HDM */
- uint64_t mem_size;
+ uint64_t static_mem_size;
uint64_t pmem_size;
uint64_t vmem_size;
--
2.43.0
* [PATCH v8 06/14] hw/mem/cxl_type3: Add support to create DC regions to type3 memory devices
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (4 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 05/14] include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for type3 memory devices nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-27 7:42 ` Zhijian Li (Fujitsu)
2024-05-23 17:44 ` [PATCH v8 07/14] hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr size instead of mr as argument nifan.cxl
` (9 subsequent siblings)
15 siblings, 1 reply; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
With this change, DC regions can be created when setting up memory for a
type3 memory device.
A property 'num-dc-regions' is added to ct3_props to allow users to pass the
number of DC regions to create. To make it easier, other region parameters
like region base, length, and block size are hard coded. If needed,
these parameters can be added easily.
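The hard-coded layout can be sketched as below: DC regions start right after
the static volatile and persistent capacity, each region is 2 GiB long, and the
first base must be a multiple of 256 MiB. The helper name and signature are
hypothetical; only the constants come from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)
#define GIB (UINT64_C(1) << 30)
#define CAPACITY_MULTIPLIER (256 * MIB)
#define REGION_LEN (2 * GIB)  /* hard-coded region length in this patch */
#define MAX_NUM_REGION 8      /* DCD_MAX_NUM_REGION from Patch 4 */

/*
 * Compute the DPA base of DC region region_idx.
 * Returns false if the static capacity is not 256 MiB aligned.
 */
static bool toy_dc_region_base(uint64_t vmem_size, uint64_t pmem_size,
                               unsigned region_idx, uint64_t *base)
{
    uint64_t first = vmem_size + pmem_size;

    assert(base && region_idx < MAX_NUM_REGION);
    if (first % CAPACITY_MULTIPLIER != 0) {
        return false;
    }
    *base = first + (uint64_t)region_idx * REGION_LEN;
    return true;
}
```

For example, with 256 MiB of volatile and 256 MiB of persistent memory, region
2 starts at 512 MiB + 4 GiB.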
With this change and proper kernel-side support, DC regions can be created
as follows:
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_dc_region)
echo $region > /sys/bus/cxl/devices/decoder0.0/create_dc_region
echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity
echo 1 > /sys/bus/cxl/devices/$region/interleave_ways
echo "dc0" >/sys/bus/cxl/devices/decoder2.0/mode
echo 0x40000000 >/sys/bus/cxl/devices/decoder2.0/dpa_size
echo 0x40000000 > /sys/bus/cxl/devices/$region/size
echo "decoder2.0" > /sys/bus/cxl/devices/$region/target0
echo 1 > /sys/bus/cxl/devices/$region/commit
echo $region > /sys/bus/cxl/drivers/cxl_region/bind
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/mem/cxl_type3.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 7194c8f902..06c6f9bb78 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -30,6 +30,7 @@
#include "hw/pci/msix.h"
#define DWORD_BYTE 4
+#define CXL_CAPACITY_MULTIPLIER (256 * MiB)
/* Default CDAT entries for a memory region */
enum {
@@ -567,6 +568,50 @@ static void ct3d_reg_write(void *opaque, hwaddr offset, uint64_t value,
}
}
+/*
+ * TODO: dc region configuration will be updated once host backend and address
+ * space support is added for DCD.
+ */
+static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
+{
+ int i;
+ uint64_t region_base = 0;
+ uint64_t region_len = 2 * GiB;
+ uint64_t decode_len = 2 * GiB;
+ uint64_t blk_size = 2 * MiB;
+ CXLDCRegion *region;
+ MemoryRegion *mr;
+
+ if (ct3d->hostvmem) {
+ mr = host_memory_backend_get_memory(ct3d->hostvmem);
+ region_base += memory_region_size(mr);
+ }
+ if (ct3d->hostpmem) {
+ mr = host_memory_backend_get_memory(ct3d->hostpmem);
+ region_base += memory_region_size(mr);
+ }
+ if (region_base % CXL_CAPACITY_MULTIPLIER != 0) {
+ error_setg(errp, "DC region base not aligned to 0x%lx",
+ CXL_CAPACITY_MULTIPLIER);
+ return false;
+ }
+
+ for (i = 0, region = &ct3d->dc.regions[0];
+ i < ct3d->dc.num_regions;
+ i++, region++, region_base += region_len) {
+ *region = (CXLDCRegion) {
+ .base = region_base,
+ .decode_len = decode_len,
+ .len = region_len,
+ .block_size = blk_size,
+ /* dsmad_handle set when creating CDAT table entries */
+ .flags = 0,
+ };
+ }
+
+ return true;
+}
+
static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
{
DeviceState *ds = DEVICE(ct3d);
@@ -635,6 +680,13 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
g_free(p_name);
}
+ if (ct3d->dc.num_regions > 0) {
+ if (!cxl_create_dc_regions(ct3d, errp)) {
+ error_append_hint(errp, "setup DC regions failed");
+ return false;
+ }
+ }
+
return true;
}
@@ -930,6 +982,7 @@ static Property ct3_props[] = {
HostMemoryBackend *),
DEFINE_PROP_UINT64("sn", CXLType3Dev, sn, UI64_NULL),
DEFINE_PROP_STRING("cdat", CXLType3Dev, cxl_cstate.cdat.filename),
+ DEFINE_PROP_UINT8("num-dc-regions", CXLType3Dev, dc.num_regions, 0),
DEFINE_PROP_END_OF_LIST(),
};
--
2.43.0
* [PATCH v8 07/14] hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr size instead of mr as argument
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (5 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 06/14] hw/mem/cxl_type3: Add support to create DC regions to " nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions nifan.cxl
` (8 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
The function ct3_build_cdat_entries_for_mr only uses the size of the passed
memory region argument, so refactor the function definition to take the size
directly and make the passed arguments more specific.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/mem/cxl_type3.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 06c6f9bb78..51be50ce87 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -44,7 +44,7 @@ enum {
};
static void ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table,
- int dsmad_handle, MemoryRegion *mr,
+ int dsmad_handle, uint64_t size,
bool is_pmem, uint64_t dpa_base)
{
CDATDsmas *dsmas;
@@ -63,7 +63,7 @@ static void ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table,
.DSMADhandle = dsmad_handle,
.flags = is_pmem ? CDAT_DSMAS_FLAG_NV : 0,
.DPA_base = dpa_base,
- .DPA_length = memory_region_size(mr),
+ .DPA_length = size,
};
/* For now, no memory side cache, plausiblish numbers */
@@ -132,7 +132,7 @@ static void ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table,
*/
.EFI_memory_type_attr = is_pmem ? 2 : 1,
.DPA_offset = 0,
- .DPA_length = memory_region_size(mr),
+ .DPA_length = size,
};
/* Header always at start of structure */
@@ -149,6 +149,7 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
g_autofree CDATSubHeader **table = NULL;
CXLType3Dev *ct3d = priv;
MemoryRegion *volatile_mr = NULL, *nonvolatile_mr = NULL;
+ uint64_t vmr_size = 0, pmr_size = 0;
int dsmad_handle = 0;
int cur_ent = 0;
int len = 0;
@@ -163,6 +164,7 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
return -EINVAL;
}
len += CT3_CDAT_NUM_ENTRIES;
+ vmr_size = memory_region_size(volatile_mr);
}
if (ct3d->hostpmem) {
@@ -171,21 +173,22 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
return -EINVAL;
}
len += CT3_CDAT_NUM_ENTRIES;
+ pmr_size = memory_region_size(nonvolatile_mr);
}
table = g_malloc0(len * sizeof(*table));
/* Now fill them in */
if (volatile_mr) {
- ct3_build_cdat_entries_for_mr(table, dsmad_handle++, volatile_mr,
+ ct3_build_cdat_entries_for_mr(table, dsmad_handle++, vmr_size,
false, 0);
cur_ent = CT3_CDAT_NUM_ENTRIES;
}
if (nonvolatile_mr) {
- uint64_t base = volatile_mr ? memory_region_size(volatile_mr) : 0;
+ uint64_t base = vmr_size;
ct3_build_cdat_entries_for_mr(&(table[cur_ent]), dsmad_handle++,
- nonvolatile_mr, true, base);
+ pmr_size, true, base);
cur_ent += CT3_CDAT_NUM_ENTRIES;
}
assert(len == cur_ent);
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (6 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 07/14] hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr size instead of mr as argument nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-06-03 12:27 ` Jonathan Cameron
2024-05-23 17:44 ` [PATCH v8 09/14] hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support nifan.cxl
` (7 subsequent siblings)
15 siblings, 1 reply; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni
From: Fan Ni <fan.ni@samsung.com>
Add a (file- or memory-backed) host backend for DCD. All dynamic capacity
regions share a single host backend, which must be large enough to hold
them. Set up an address space for the DC regions to support read/write
operations to dynamic capacity.
With this change, the following support is added:
1. Add a new property "volatile-dc-memdev" to the type3 device, pointing to
the host memory backend for dynamic capacity. Currently, all DC regions
share one host backend;
2. Add an address space for dynamic capacity for read/write support;
3. Create CDAT entries for each dynamic capacity region.
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 16 ++--
hw/mem/cxl_type3.c | 174 +++++++++++++++++++++++++++++-------
include/hw/cxl/cxl_device.h | 8 ++
3 files changed, 162 insertions(+), 36 deletions(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 0f2ad58a14..831cef0567 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -622,7 +622,8 @@ static CXLRetCode cmd_firmware_update_get_info(const struct cxl_cmd *cmd,
size_t *len_out,
CXLCCI *cci)
{
- CXLDeviceState *cxl_dstate = &CXL_TYPE3(cci->d)->cxl_dstate;
+ CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
+ CXLDeviceState *cxl_dstate = &ct3d->cxl_dstate;
struct {
uint8_t slots_supported;
uint8_t slot_info;
@@ -636,7 +637,8 @@ static CXLRetCode cmd_firmware_update_get_info(const struct cxl_cmd *cmd,
QEMU_BUILD_BUG_ON(sizeof(*fw_info) != 0x50);
if ((cxl_dstate->vmem_size < CXL_CAPACITY_MULTIPLIER) ||
- (cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER)) {
+ (cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER) ||
+ (ct3d->dc.total_capacity < CXL_CAPACITY_MULTIPLIER)) {
return CXL_MBOX_INTERNAL_ERROR;
}
@@ -793,7 +795,8 @@ static CXLRetCode cmd_identify_memory_device(const struct cxl_cmd *cmd,
CXLDeviceState *cxl_dstate = &ct3d->cxl_dstate;
if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) ||
- (!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER))) {
+ (!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER)) ||
+ (!QEMU_IS_ALIGNED(ct3d->dc.total_capacity, CXL_CAPACITY_MULTIPLIER))) {
return CXL_MBOX_INTERNAL_ERROR;
}
@@ -835,9 +838,11 @@ static CXLRetCode cmd_ccls_get_partition_info(const struct cxl_cmd *cmd,
uint64_t next_pmem;
} QEMU_PACKED *part_info = (void *)payload_out;
QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
+ CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) ||
- (!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER))) {
+ (!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER)) ||
+ (!QEMU_IS_ALIGNED(ct3d->dc.total_capacity, CXL_CAPACITY_MULTIPLIER))) {
return CXL_MBOX_INTERNAL_ERROR;
}
@@ -1179,7 +1184,8 @@ static CXLRetCode cmd_media_clear_poison(const struct cxl_cmd *cmd,
struct clear_poison_pl *in = (void *)payload_in;
dpa = ldq_le_p(&in->dpa);
- if (dpa + CXL_CACHE_LINE_SIZE > cxl_dstate->static_mem_size) {
+ if (dpa + CXL_CACHE_LINE_SIZE > cxl_dstate->static_mem_size +
+ ct3d->dc.total_capacity) {
return CXL_MBOX_INVALID_PA;
}
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 51be50ce87..f645a3f2e9 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -45,7 +45,8 @@ enum {
static void ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table,
int dsmad_handle, uint64_t size,
- bool is_pmem, uint64_t dpa_base)
+ bool is_pmem, bool is_dynamic,
+ uint64_t dpa_base)
{
CDATDsmas *dsmas;
CDATDslbis *dslbis0;
@@ -61,7 +62,8 @@ static void ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table,
.length = sizeof(*dsmas),
},
.DSMADhandle = dsmad_handle,
- .flags = is_pmem ? CDAT_DSMAS_FLAG_NV : 0,
+ .flags = (is_pmem ? CDAT_DSMAS_FLAG_NV : 0) |
+ (is_dynamic ? CDAT_DSMAS_FLAG_DYNAMIC_CAP : 0),
.DPA_base = dpa_base,
.DPA_length = size,
};
@@ -149,12 +151,13 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
g_autofree CDATSubHeader **table = NULL;
CXLType3Dev *ct3d = priv;
MemoryRegion *volatile_mr = NULL, *nonvolatile_mr = NULL;
+ MemoryRegion *dc_mr = NULL;
uint64_t vmr_size = 0, pmr_size = 0;
int dsmad_handle = 0;
int cur_ent = 0;
int len = 0;
- if (!ct3d->hostpmem && !ct3d->hostvmem) {
+ if (!ct3d->hostpmem && !ct3d->hostvmem && !ct3d->dc.num_regions) {
return 0;
}
@@ -176,21 +179,54 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv)
pmr_size = memory_region_size(nonvolatile_mr);
}
+ if (ct3d->dc.num_regions) {
+ if (!ct3d->dc.host_dc) {
+ return -EINVAL;
+ }
+ dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+ if (!dc_mr) {
+ return -EINVAL;
+ }
+ len += CT3_CDAT_NUM_ENTRIES * ct3d->dc.num_regions;
+ }
+
table = g_malloc0(len * sizeof(*table));
/* Now fill them in */
if (volatile_mr) {
ct3_build_cdat_entries_for_mr(table, dsmad_handle++, vmr_size,
- false, 0);
+ false, false, 0);
cur_ent = CT3_CDAT_NUM_ENTRIES;
}
if (nonvolatile_mr) {
uint64_t base = vmr_size;
ct3_build_cdat_entries_for_mr(&(table[cur_ent]), dsmad_handle++,
- pmr_size, true, base);
+ pmr_size, true, false, base);
cur_ent += CT3_CDAT_NUM_ENTRIES;
}
+
+ if (dc_mr) {
+ int i;
+ uint64_t region_base = vmr_size + pmr_size;
+
+ /*
+ * We assume the dynamic capacity to be volatile for now.
+ * Non-volatile dynamic capacity will be added if needed in the
+ * future.
+ */
+ for (i = 0; i < ct3d->dc.num_regions; i++) {
+ ct3_build_cdat_entries_for_mr(&(table[cur_ent]),
+ dsmad_handle++,
+ ct3d->dc.regions[i].len,
+ false, true, region_base);
+ ct3d->dc.regions[i].dsmadhandle = dsmad_handle - 1;
+
+ cur_ent += CT3_CDAT_NUM_ENTRIES;
+ region_base += ct3d->dc.regions[i].len;
+ }
+ }
+
assert(len == cur_ent);
*cdat_table = g_steal_pointer(&table);
@@ -301,10 +337,17 @@ static void build_dvsecs(CXLType3Dev *ct3d)
range2_size_lo = (2 << 5) | (2 << 2) | 0x3 |
(ct3d->hostpmem->size & 0xF0000000);
}
- } else {
+ } else if (ct3d->hostpmem) {
range1_size_hi = ct3d->hostpmem->size >> 32;
range1_size_lo = (2 << 5) | (2 << 2) | 0x3 |
(ct3d->hostpmem->size & 0xF0000000);
+ } else {
+ /*
+ * For DCD with no static memory, set memory active, memory class bits.
+ * No range is set.
+ */
+ range1_size_hi = 0;
+ range1_size_lo = (2 << 5) | (2 << 2) | 0x3;
}
dvsec = (uint8_t *)&(CXLDVSECDevice){
@@ -579,11 +622,28 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
{
int i;
uint64_t region_base = 0;
- uint64_t region_len = 2 * GiB;
- uint64_t decode_len = 2 * GiB;
+ uint64_t region_len;
+ uint64_t decode_len;
uint64_t blk_size = 2 * MiB;
CXLDCRegion *region;
MemoryRegion *mr;
+ uint64_t dc_size;
+
+ mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+ dc_size = memory_region_size(mr);
+ region_len = DIV_ROUND_UP(dc_size, ct3d->dc.num_regions);
+
+ if (dc_size % (ct3d->dc.num_regions * CXL_CAPACITY_MULTIPLIER) != 0) {
+ error_setg(errp, "backend size is not multiple of region len: 0x%lx",
+ region_len);
+ return false;
+ }
+ if (region_len % CXL_CAPACITY_MULTIPLIER != 0) {
+ error_setg(errp, "DC region size is unaligned to 0x%lx",
+ CXL_CAPACITY_MULTIPLIER);
+ return false;
+ }
+ decode_len = region_len;
if (ct3d->hostvmem) {
mr = host_memory_backend_get_memory(ct3d->hostvmem);
@@ -610,6 +670,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
/* dsmad_handle set when creating CDAT table entries */
.flags = 0,
};
+ ct3d->dc.total_capacity += region->len;
}
return true;
@@ -619,7 +680,8 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
{
DeviceState *ds = DEVICE(ct3d);
- if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem) {
+ if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem
+ && !ct3d->dc.num_regions) {
error_setg(errp, "at least one memdev property must be set");
return false;
} else if (ct3d->hostmem && ct3d->hostpmem) {
@@ -683,7 +745,37 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
g_free(p_name);
}
+ ct3d->dc.total_capacity = 0;
if (ct3d->dc.num_regions > 0) {
+ MemoryRegion *dc_mr;
+ char *dc_name;
+
+ if (!ct3d->dc.host_dc) {
+ error_setg(errp, "dynamic capacity must have a backing device");
+ return false;
+ }
+
+ dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+ if (!dc_mr) {
+ error_setg(errp, "dynamic capacity must have a backing device");
+ return false;
+ }
+
+ /*
+ * Set DC regions as volatile for now, non-volatile support can
+ * be added in the future if needed.
+ */
+ memory_region_set_nonvolatile(dc_mr, false);
+ memory_region_set_enabled(dc_mr, true);
+ host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
+ if (ds->id) {
+ dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
+ } else {
+ dc_name = g_strdup("cxl-dcd-dpa-dc-space");
+ }
+ address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
+ g_free(dc_name);
+
if (!cxl_create_dc_regions(ct3d, errp)) {
error_append_hint(errp, "setup DC regions failed");
return false;
@@ -779,6 +871,9 @@ err_release_cdat:
err_free_special_ops:
g_free(regs->special_ops);
err_address_space_free:
+ if (ct3d->dc.host_dc) {
+ address_space_destroy(&ct3d->dc.host_dc_as);
+ }
if (ct3d->hostpmem) {
address_space_destroy(&ct3d->hostpmem_as);
}
@@ -797,6 +892,9 @@ static void ct3_exit(PCIDevice *pci_dev)
pcie_aer_exit(pci_dev);
cxl_doe_cdat_release(cxl_cstate);
g_free(regs->special_ops);
+ if (ct3d->dc.host_dc) {
+ address_space_destroy(&ct3d->dc.host_dc_as);
+ }
if (ct3d->hostpmem) {
address_space_destroy(&ct3d->hostpmem_as);
}
@@ -875,16 +973,23 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
AddressSpace **as,
uint64_t *dpa_offset)
{
- MemoryRegion *vmr = NULL, *pmr = NULL;
+ MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
+ uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
if (ct3d->hostvmem) {
vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+ vmr_size = memory_region_size(vmr);
}
if (ct3d->hostpmem) {
pmr = host_memory_backend_get_memory(ct3d->hostpmem);
+ pmr_size = memory_region_size(pmr);
+ }
+ if (ct3d->dc.host_dc) {
+ dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+ dc_size = memory_region_size(dc_mr);
}
- if (!vmr && !pmr) {
+ if (!vmr && !pmr && !dc_mr) {
return -ENODEV;
}
@@ -892,19 +997,18 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
return -EINVAL;
}
- if (*dpa_offset > ct3d->cxl_dstate.static_mem_size) {
+ if (*dpa_offset >= vmr_size + pmr_size + dc_size) {
return -EINVAL;
}
- if (vmr) {
- if (*dpa_offset < memory_region_size(vmr)) {
- *as = &ct3d->hostvmem_as;
- } else {
- *as = &ct3d->hostpmem_as;
- *dpa_offset -= memory_region_size(vmr);
- }
- } else {
+ if (*dpa_offset < vmr_size) {
+ *as = &ct3d->hostvmem_as;
+ } else if (*dpa_offset < vmr_size + pmr_size) {
*as = &ct3d->hostpmem_as;
+ *dpa_offset -= vmr_size;
+ } else {
+ *as = &ct3d->dc.host_dc_as;
+ *dpa_offset -= (vmr_size + pmr_size);
}
return 0;
@@ -986,6 +1090,8 @@ static Property ct3_props[] = {
DEFINE_PROP_UINT64("sn", CXLType3Dev, sn, UI64_NULL),
DEFINE_PROP_STRING("cdat", CXLType3Dev, cxl_cstate.cdat.filename),
DEFINE_PROP_UINT8("num-dc-regions", CXLType3Dev, dc.num_regions, 0),
+ DEFINE_PROP_LINK("volatile-dc-memdev", CXLType3Dev, dc.host_dc,
+ TYPE_MEMORY_BACKEND, HostMemoryBackend *),
DEFINE_PROP_END_OF_LIST(),
};
@@ -1052,33 +1158,39 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
{
- MemoryRegion *vmr = NULL, *pmr = NULL;
+ MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
AddressSpace *as;
+ uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
if (ct3d->hostvmem) {
vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+ vmr_size = memory_region_size(vmr);
}
if (ct3d->hostpmem) {
pmr = host_memory_backend_get_memory(ct3d->hostpmem);
+ pmr_size = memory_region_size(pmr);
}
+ if (ct3d->dc.host_dc) {
+ dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
+ dc_size = memory_region_size(dc_mr);
+ }
- if (!vmr && !pmr) {
+ if (!vmr && !pmr && !dc_mr) {
return false;
}
- if (dpa_offset + CXL_CACHE_LINE_SIZE > ct3d->cxl_dstate.static_mem_size) {
+ if (dpa_offset + CXL_CACHE_LINE_SIZE > vmr_size + pmr_size + dc_size) {
return false;
}
- if (vmr) {
- if (dpa_offset < memory_region_size(vmr)) {
- as = &ct3d->hostvmem_as;
- } else {
- as = &ct3d->hostpmem_as;
- dpa_offset -= memory_region_size(vmr);
- }
- } else {
+ if (dpa_offset < vmr_size) {
+ as = &ct3d->hostvmem_as;
+ } else if (dpa_offset < vmr_size + pmr_size) {
as = &ct3d->hostpmem_as;
+ dpa_offset -= vmr_size;
+ } else {
+ as = &ct3d->dc.host_dc_as;
+ dpa_offset -= (vmr_size + pmr_size);
}
address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data,
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index f7f56b44e3..c2c3df0d2a 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -467,6 +467,14 @@ struct CXLType3Dev {
uint64_t poison_list_overflow_ts;
struct dynamic_capacity {
+ HostMemoryBackend *host_dc;
+ AddressSpace host_dc_as;
+ /*
+ * total_capacity is equivalent to the dynamic capacity
+ * memory region size.
+ */
+ uint64_t total_capacity; /* 256M aligned */
+
uint8_t num_regions; /* 0-8 regions */
CXLDCRegion regions[DCD_MAX_NUM_REGION];
} dc;
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v8 09/14] hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (7 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 10/14] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response nifan.cxl
` (6 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Svetly Todorov, Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
Add a dynamic capacity extent list representation to the definition of
CXLType3Dev and implement the Get Dynamic Capacity Extent List mailbox
command per CXL r3.1 section 8.2.9.9.9.2.
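The number of records returned by one request is bounded by three things: what the host asked for, how many extents remain from the starting index, and how many records fit in the output payload. A minimal sketch of that clamping follows. The function and parameter names are illustrative only, and rejecting a start index beyond the total (which the command does with an invalid-input error) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of extent records to return for one request.
 *
 *   requested     - extent count asked for by the host
 *   start_index   - index of the first extent to return
 *   total_extents - extents currently tracked by the device
 *   payload_room  - bytes left in the output payload after the header
 *   record_size   - size in bytes of one extent record
 */
uint32_t extent_records_to_return(uint32_t requested, uint32_t start_index,
                                  uint32_t total_extents,
                                  size_t payload_room, size_t record_size)
{
    uint32_t count;
    size_t fit;

    assert(record_size > 0);

    if (start_index >= total_extents) {
        return 0;                   /* nothing at or past the start index */
    }

    count = total_extents - start_index;    /* extents that remain */
    if (requested < count) {
        count = requested;                  /* host wants fewer */
    }
    fit = payload_room / record_size;       /* whole records that fit */
    if (fit < count) {
        count = (uint32_t)fit;
    }
    return count;
}
```

A host that receives fewer records than the reported total can issue the command again with a larger starting index to read the rest.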
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 73 ++++++++++++++++++++++++++++++++++++-
hw/mem/cxl_type3.c | 1 +
include/hw/cxl/cxl_device.h | 22 +++++++++++
3 files changed, 95 insertions(+), 1 deletion(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 831cef0567..1915959015 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -84,6 +84,7 @@ enum {
#define CLEAR_POISON 0x2
DCD_CONFIG = 0x48,
#define GET_DC_CONFIG 0x0
+ #define GET_DYN_CAP_EXT_LIST 0x1
PHYSICAL_SWITCH = 0x51,
#define IDENTIFY_SWITCH_DEVICE 0x0
#define GET_PHYSICAL_PORT_STATE 0x1
@@ -1322,7 +1323,8 @@ static CXLRetCode cmd_dcd_get_dyn_cap_config(const struct cxl_cmd *cmd,
* to use.
*/
stl_le_p(&extra_out->num_extents_supported, CXL_NUM_EXTENTS_SUPPORTED);
- stl_le_p(&extra_out->num_extents_available, CXL_NUM_EXTENTS_SUPPORTED);
+ stl_le_p(&extra_out->num_extents_available, CXL_NUM_EXTENTS_SUPPORTED -
+ ct3d->dc.total_extent_count);
stl_le_p(&extra_out->num_tags_supported, CXL_NUM_TAGS_SUPPORTED);
stl_le_p(&extra_out->num_tags_available, CXL_NUM_TAGS_SUPPORTED);
@@ -1330,6 +1332,72 @@ static CXLRetCode cmd_dcd_get_dyn_cap_config(const struct cxl_cmd *cmd,
return CXL_MBOX_SUCCESS;
}
+/*
+ * CXL r3.1 section 8.2.9.9.9.2:
+ * Get Dynamic Capacity Extent List (Opcode 4801h)
+ */
+static CXLRetCode cmd_dcd_get_dyn_cap_ext_list(const struct cxl_cmd *cmd,
+ uint8_t *payload_in,
+ size_t len_in,
+ uint8_t *payload_out,
+ size_t *len_out,
+ CXLCCI *cci)
+{
+ CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
+ struct {
+ uint32_t extent_cnt;
+ uint32_t start_extent_id;
+ } QEMU_PACKED *in = (void *)payload_in;
+ struct {
+ uint32_t count;
+ uint32_t total_extents;
+ uint32_t generation_num;
+ uint8_t rsvd[4];
+ CXLDCExtentRaw records[];
+ } QEMU_PACKED *out = (void *)payload_out;
+ uint32_t start_extent_id = in->start_extent_id;
+ CXLDCExtentList *extent_list = &ct3d->dc.extents;
+ uint16_t record_count = 0, i = 0, record_done = 0;
+ uint16_t out_pl_len, size;
+ CXLDCExtent *ent;
+
+ if (start_extent_id > ct3d->dc.total_extent_count) {
+ return CXL_MBOX_INVALID_INPUT;
+ }
+
+ record_count = MIN(in->extent_cnt,
+ ct3d->dc.total_extent_count - start_extent_id);
+ size = CXL_MAILBOX_MAX_PAYLOAD_SIZE - sizeof(*out);
+ record_count = MIN(record_count, size / sizeof(out->records[0]));
+ out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
+
+ stl_le_p(&out->count, record_count);
+ stl_le_p(&out->total_extents, ct3d->dc.total_extent_count);
+ stl_le_p(&out->generation_num, ct3d->dc.ext_list_gen_seq);
+
+ if (record_count > 0) {
+ CXLDCExtentRaw *out_rec = &out->records[record_done];
+
+ QTAILQ_FOREACH(ent, extent_list, node) {
+ if (i++ < start_extent_id) {
+ continue;
+ }
+ stq_le_p(&out_rec->start_dpa, ent->start_dpa);
+ stq_le_p(&out_rec->len, ent->len);
+ memcpy(&out_rec->tag, ent->tag, 0x10);
+ stw_le_p(&out_rec->shared_seq, ent->shared_seq);
+
+ record_done++;
+ if (record_done == record_count) {
+ break;
+ }
+ }
+ }
+
+ *len_out = out_pl_len;
+ return CXL_MBOX_SUCCESS;
+}
+
#define IMMEDIATE_CONFIG_CHANGE (1 << 1)
#define IMMEDIATE_DATA_CHANGE (1 << 2)
#define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -1377,6 +1445,9 @@ static const struct cxl_cmd cxl_cmd_set[256][256] = {
static const struct cxl_cmd cxl_cmd_set_dcd[256][256] = {
[DCD_CONFIG][GET_DC_CONFIG] = { "DCD_GET_DC_CONFIG",
cmd_dcd_get_dyn_cap_config, 2, 0 },
+ [DCD_CONFIG][GET_DYN_CAP_EXT_LIST] = {
+ "DCD_GET_DYNAMIC_CAPACITY_EXTENT_LIST", cmd_dcd_get_dyn_cap_ext_list,
+ 8, 0 },
};
static const struct cxl_cmd cxl_cmd_set_sw[256][256] = {
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index f645a3f2e9..f6ab885270 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -672,6 +672,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
};
ct3d->dc.total_capacity += region->len;
}
+ QTAILQ_INIT(&ct3d->dc.extents);
return true;
}
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index c2c3df0d2a..6aec6ac983 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -424,6 +424,25 @@ typedef QLIST_HEAD(, CXLPoison) CXLPoisonList;
#define DCD_MAX_NUM_REGION 8
+typedef struct CXLDCExtentRaw {
+ uint64_t start_dpa;
+ uint64_t len;
+ uint8_t tag[0x10];
+ uint16_t shared_seq;
+ uint8_t rsvd[0x6];
+} QEMU_PACKED CXLDCExtentRaw;
+
+typedef struct CXLDCExtent {
+ uint64_t start_dpa;
+ uint64_t len;
+ uint8_t tag[0x10];
+ uint16_t shared_seq;
+ uint8_t rsvd[0x6];
+
+ QTAILQ_ENTRY(CXLDCExtent) node;
+} CXLDCExtent;
+typedef QTAILQ_HEAD(, CXLDCExtent) CXLDCExtentList;
+
typedef struct CXLDCRegion {
uint64_t base; /* aligned to 256*MiB */
uint64_t decode_len; /* aligned to 256*MiB */
@@ -474,6 +493,9 @@ struct CXLType3Dev {
* memory region size.
*/
uint64_t total_capacity; /* 256M aligned */
+ CXLDCExtentList extents;
+ uint32_t total_extent_count;
+ uint32_t ext_list_gen_seq;
uint8_t num_regions; /* 0-8 regions */
CXLDCRegion regions[DCD_MAX_NUM_REGION];
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v8 10/14] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (8 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 09/14] hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents nifan.cxl
` (5 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Svetly Todorov
From: Fan Ni <fan.ni@samsung.com>
Per CXL r3.1, two mailbox commands are implemented:
Add Dynamic Capacity Response (Opcode 4802h), section 8.2.9.9.9.3, and
Release Dynamic Capacity (Opcode 4803h), section 8.2.9.9.9.4.
Both commands are processed with a two-pass approach.
Pass 1: Check whether the input payload is valid; if it is not, skip
Pass 2 and return a mailbox error.
Pass 2: Do the real work: add or release extents, respectively.
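The two-pass idea can be illustrated with a small standalone sketch for the add case. It is not the QEMU implementation: the struct and function names are hypothetical, extents live in a plain array rather than a QTAILQ, the checks are simplified stand-ins for the real validation, and address overflow is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent type: len bytes of DPA starting at start. */
struct extent {
    uint64_t start;
    uint64_t len;
};

/* True when [a, a + alen) and [b, b + blen) intersect (no overflow check). */
bool ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Pass 1: validate the whole request without modifying any state. */
bool add_request_is_valid(const struct extent *accepted, size_t n_accepted,
                          const struct extent *req, size_t n_req,
                          uint64_t block_size)
{
    assert(block_size > 0);

    for (size_t i = 0; i < n_req; i++) {
        if (req[i].len == 0 || req[i].start % block_size ||
            req[i].len % block_size) {
            return false;           /* empty or not block aligned */
        }
        for (size_t j = 0; j < i; j++) {
            if (ranges_overlap(req[i].start, req[i].len,
                               req[j].start, req[j].len)) {
                return false;       /* overlaps another requested extent */
            }
        }
        for (size_t k = 0; k < n_accepted; k++) {
            if (ranges_overlap(req[i].start, req[i].len,
                               accepted[k].start, accepted[k].len)) {
                return false;       /* overlaps an accepted extent */
            }
        }
    }
    return true;
}

/*
 * Pass 2 runs only when pass 1 succeeds, so a rejected request leaves the
 * accepted array untouched. Returns the resulting number of accepted extents.
 */
size_t add_extents(struct extent *accepted, size_t n_accepted, size_t capacity,
                   const struct extent *req, size_t n_req, uint64_t block_size)
{
    assert(n_accepted <= capacity);

    if (n_req > capacity - n_accepted ||
        !add_request_is_valid(accepted, n_accepted, req, n_req, block_size)) {
        return n_accepted;
    }
    for (size_t i = 0; i < n_req; i++) {
        accepted[n_accepted++] = req[i];
    }
    return n_accepted;
}
```

Separating validation from mutation means a request with one bad extent cannot leave the list half updated, at the cost of walking the request twice.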
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 394 ++++++++++++++++++++++++++++++++++++
hw/mem/cxl_type3.c | 11 +
include/hw/cxl/cxl_device.h | 4 +
3 files changed, 409 insertions(+)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 1915959015..9d54e10cd4 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -19,6 +19,7 @@
#include "qemu/units.h"
#include "qemu/uuid.h"
#include "sysemu/hostmem.h"
+#include "qemu/range.h"
#define CXL_CAPACITY_MULTIPLIER (256 * MiB)
#define CXL_DC_EVENT_LOG_SIZE 8
@@ -85,6 +86,8 @@ enum {
DCD_CONFIG = 0x48,
#define GET_DC_CONFIG 0x0
#define GET_DYN_CAP_EXT_LIST 0x1
+ #define ADD_DYN_CAP_RSP 0x2
+ #define RELEASE_DYN_CAP 0x3
PHYSICAL_SWITCH = 0x51,
#define IDENTIFY_SWITCH_DEVICE 0x0
#define GET_PHYSICAL_PORT_STATE 0x1
@@ -1398,6 +1401,391 @@ static CXLRetCode cmd_dcd_get_dyn_cap_ext_list(const struct cxl_cmd *cmd,
return CXL_MBOX_SUCCESS;
}
+/*
+ * Check whether any bit between addr[nr, nr+size) is set,
+ * return true if any bit is set, otherwise return false
+ */
+static bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
+ unsigned long size)
+{
+ unsigned long res = find_next_bit(addr, size + nr, nr);
+
+ return res < nr + size;
+}
+
+CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len)
+{
+ int i;
+ CXLDCRegion *region = &ct3d->dc.regions[0];
+
+ if (dpa < region->base ||
+ dpa >= region->base + ct3d->dc.total_capacity) {
+ return NULL;
+ }
+
+ /*
+ * CXL r3.1 section 9.13.3: Dynamic Capacity Device (DCD)
+ *
+ * Regions are used in increasing-DPA order, with Region 0 being used for
+ * the lowest DPA of Dynamic Capacity and Region 7 for the highest DPA.
+ * So check from the last region to find where the dpa belongs. Extents that
+ * cross multiple regions are not allowed.
+ */
+ for (i = ct3d->dc.num_regions - 1; i >= 0; i--) {
+ region = &ct3d->dc.regions[i];
+ if (dpa >= region->base) {
+ if (dpa + len > region->base + region->len) {
+ return NULL;
+ }
+ return region;
+ }
+ }
+
+ return NULL;
+}
+
+static void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
+ uint64_t dpa,
+ uint64_t len,
+ uint8_t *tag,
+ uint16_t shared_seq)
+{
+ CXLDCExtent *extent;
+
+ extent = g_new0(CXLDCExtent, 1);
+ extent->start_dpa = dpa;
+ extent->len = len;
+ if (tag) {
+ memcpy(extent->tag, tag, 0x10);
+ }
+ extent->shared_seq = shared_seq;
+
+ QTAILQ_INSERT_TAIL(list, extent, node);
+}
+
+void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
+ CXLDCExtent *extent)
+{
+ QTAILQ_REMOVE(list, extent, node);
+ g_free(extent);
+}
+
+/*
+ * CXL r3.1 Table 8-168: Add Dynamic Capacity Response Input Payload
+ * CXL r3.1 Table 8-170: Release Dynamic Capacity Input Payload
+ */
+typedef struct CXLUpdateDCExtentListInPl {
+ uint32_t num_entries_updated;
+ uint8_t flags;
+ uint8_t rsvd[3];
+ /* CXL r3.1 Table 8-169: Updated Extent */
+ struct {
+ uint64_t start_dpa;
+ uint64_t len;
+ uint8_t rsvd[8];
+ } QEMU_PACKED updated_entries[];
+} QEMU_PACKED CXLUpdateDCExtentListInPl;
+
+/*
+ * For the extents in the extent list to operate, check whether they are valid
+ * 1. The extent should be in the range of a valid DC region;
+ * 2. The extent should not cross multiple regions;
+ * 3. The start DPA and the length of the extent should align with the block
+ * size of the region;
+ * 4. The address range of multiple extents in the list should not overlap.
+ */
+static CXLRetCode cxl_detect_malformed_extent_list(CXLType3Dev *ct3d,
+ const CXLUpdateDCExtentListInPl *in)
+{
+ uint64_t min_block_size = UINT64_MAX;
+ CXLDCRegion *region;
+ CXLDCRegion *lastregion = &ct3d->dc.regions[ct3d->dc.num_regions - 1];
+ g_autofree unsigned long *blk_bitmap = NULL;
+ uint64_t dpa, len;
+ uint32_t i;
+
+ for (i = 0; i < ct3d->dc.num_regions; i++) {
+ region = &ct3d->dc.regions[i];
+ min_block_size = MIN(min_block_size, region->block_size);
+ }
+
+ blk_bitmap = bitmap_new((lastregion->base + lastregion->len -
+ ct3d->dc.regions[0].base) / min_block_size);
+
+ for (i = 0; i < in->num_entries_updated; i++) {
+ dpa = in->updated_entries[i].start_dpa;
+ len = in->updated_entries[i].len;
+
+ region = cxl_find_dc_region(ct3d, dpa, len);
+ if (!region) {
+ return CXL_MBOX_INVALID_PA;
+ }
+
+ dpa -= ct3d->dc.regions[0].base;
+ if (dpa % region->block_size || len % region->block_size) {
+ return CXL_MBOX_INVALID_EXTENT_LIST;
+ }
+ /* the dpa range already covered by some other extents in the list */
+ if (test_any_bits_set(blk_bitmap, dpa / min_block_size,
+ len / min_block_size)) {
+ return CXL_MBOX_INVALID_EXTENT_LIST;
+ }
+ bitmap_set(blk_bitmap, dpa / min_block_size, len / min_block_size);
+ }
+
+ return CXL_MBOX_SUCCESS;
+}
+
+static CXLRetCode cxl_dcd_add_dyn_cap_rsp_dry_run(CXLType3Dev *ct3d,
+ const CXLUpdateDCExtentListInPl *in)
+{
+ uint32_t i;
+ CXLDCExtent *ent;
+ uint64_t dpa, len;
+ Range range1, range2;
+
+ for (i = 0; i < in->num_entries_updated; i++) {
+ dpa = in->updated_entries[i].start_dpa;
+ len = in->updated_entries[i].len;
+
+ range_init_nofail(&range1, dpa, len);
+
+ /*
+ * TODO: once the pending extent list is added, check against
+ * the list will be added here.
+ */
+
+ /* to-be-added range should not overlap with range already accepted */
+ QTAILQ_FOREACH(ent, &ct3d->dc.extents, node) {
+ range_init_nofail(&range2, ent->start_dpa, ent->len);
+ if (range_overlaps_range(&range1, &range2)) {
+ return CXL_MBOX_INVALID_PA;
+ }
+ }
+ }
+ return CXL_MBOX_SUCCESS;
+}
+
+/*
+ * CXL r3.1 section 8.2.9.9.9.3: Add Dynamic Capacity Response (Opcode 4802h)
+ * An extent is added to the extent list and becomes usable only after the
+ * response is processed successfully.
+ */
+static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
+ uint8_t *payload_in,
+ size_t len_in,
+ uint8_t *payload_out,
+ size_t *len_out,
+ CXLCCI *cci)
+{
+ CXLUpdateDCExtentListInPl *in = (void *)payload_in;
+ CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
+ CXLDCExtentList *extent_list = &ct3d->dc.extents;
+ uint32_t i;
+ uint64_t dpa, len;
+ CXLRetCode ret;
+
+ if (in->num_entries_updated == 0) {
+ /*
+ * TODO: once the pending list is introduced, extents in the beginning
+ * will get wiped out.
+ */
+ return CXL_MBOX_SUCCESS;
+ }
+
+ /* Adding the extents would exceed the device's extent tracking ability. */
+ if (in->num_entries_updated + ct3d->dc.total_extent_count >
+ CXL_NUM_EXTENTS_SUPPORTED) {
+ return CXL_MBOX_RESOURCES_EXHAUSTED;
+ }
+
+ ret = cxl_detect_malformed_extent_list(ct3d, in);
+ if (ret != CXL_MBOX_SUCCESS) {
+ return ret;
+ }
+
+ ret = cxl_dcd_add_dyn_cap_rsp_dry_run(ct3d, in);
+ if (ret != CXL_MBOX_SUCCESS) {
+ return ret;
+ }
+
+ for (i = 0; i < in->num_entries_updated; i++) {
+ dpa = in->updated_entries[i].start_dpa;
+ len = in->updated_entries[i].len;
+
+ cxl_insert_extent_to_extent_list(extent_list, dpa, len, NULL, 0);
+ ct3d->dc.total_extent_count += 1;
+ /*
+ * TODO: we will add a pending extent list based on event log record
+ * and process the list accordingly here.
+ */
+ }
+
+ return CXL_MBOX_SUCCESS;
+}
+
+/*
+ * Copy extent list from src to dst
+ * Return value: number of extents copied
+ */
+static uint32_t copy_extent_list(CXLDCExtentList *dst,
+ const CXLDCExtentList *src)
+{
+ uint32_t cnt = 0;
+ CXLDCExtent *ent;
+
+ if (!dst || !src) {
+ return 0;
+ }
+
+ QTAILQ_FOREACH(ent, src, node) {
+ cxl_insert_extent_to_extent_list(dst, ent->start_dpa, ent->len,
+ ent->tag, ent->shared_seq);
+ cnt++;
+ }
+ return cnt;
+}
+
+static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
+ const CXLUpdateDCExtentListInPl *in, CXLDCExtentList *updated_list,
+ uint32_t *updated_list_size)
+{
+ CXLDCExtent *ent, *ent_next;
+ uint64_t dpa, len;
+ uint32_t i;
+ int cnt_delta = 0;
+ CXLRetCode ret = CXL_MBOX_SUCCESS;
+
+ QTAILQ_INIT(updated_list);
+ copy_extent_list(updated_list, &ct3d->dc.extents);
+
+ for (i = 0; i < in->num_entries_updated; i++) {
+ Range range;
+
+ dpa = in->updated_entries[i].start_dpa;
+ len = in->updated_entries[i].len;
+
+ while (len > 0) {
+ QTAILQ_FOREACH(ent, updated_list, node) {
+ range_init_nofail(&range, ent->start_dpa, ent->len);
+
+ if (range_contains(&range, dpa)) {
+ uint64_t len1, len2 = 0, len_done = 0;
+ uint64_t ent_start_dpa = ent->start_dpa;
+ uint64_t ent_len = ent->len;
+
+ len1 = dpa - ent->start_dpa;
+ /* Found the extent or the subset of an existing extent */
+ if (range_contains(&range, dpa + len - 1)) {
+ len2 = ent_start_dpa + ent_len - dpa - len;
+ } else {
+ /*
+ * TODO: we reject the attempt to remove an extent
+ * that overlaps with multiple extents in the device
+ * for now. We will allow it once superset release
+ * support is added.
+ */
+ ret = CXL_MBOX_INVALID_PA;
+ goto free_and_exit;
+ }
+ len_done = ent_len - len1 - len2;
+
+ cxl_remove_extent_from_extent_list(updated_list, ent);
+ cnt_delta--;
+
+ if (len1) {
+ cxl_insert_extent_to_extent_list(updated_list,
+ ent_start_dpa,
+ len1, NULL, 0);
+ cnt_delta++;
+ }
+ if (len2) {
+ cxl_insert_extent_to_extent_list(updated_list,
+ dpa + len,
+ len2, NULL, 0);
+ cnt_delta++;
+ }
+
+ if (cnt_delta + ct3d->dc.total_extent_count >
+ CXL_NUM_EXTENTS_SUPPORTED) {
+ ret = CXL_MBOX_RESOURCES_EXHAUSTED;
+ goto free_and_exit;
+ }
+
+ len -= len_done;
+ /* len == 0 here until superset release is added */
+ break;
+ }
+ }
+ if (len) {
+ ret = CXL_MBOX_INVALID_PA;
+ goto free_and_exit;
+ }
+ }
+ }
+free_and_exit:
+ if (ret != CXL_MBOX_SUCCESS) {
+ QTAILQ_FOREACH_SAFE(ent, updated_list, node, ent_next) {
+ cxl_remove_extent_from_extent_list(updated_list, ent);
+ }
+ *updated_list_size = 0;
+ } else {
+ *updated_list_size = ct3d->dc.total_extent_count + cnt_delta;
+ }
+
+ return ret;
+}
+
+/*
+ * CXL r3.1 section 8.2.9.9.9.4: Release Dynamic Capacity (Opcode 4803h)
+ */
+static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd,
+ uint8_t *payload_in,
+ size_t len_in,
+ uint8_t *payload_out,
+ size_t *len_out,
+ CXLCCI *cci)
+{
+ CXLUpdateDCExtentListInPl *in = (void *)payload_in;
+ CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
+ CXLDCExtentList updated_list;
+ CXLDCExtent *ent, *ent_next;
+ uint32_t updated_list_size;
+ CXLRetCode ret;
+
+ if (in->num_entries_updated == 0) {
+ return CXL_MBOX_INVALID_INPUT;
+ }
+
+ ret = cxl_detect_malformed_extent_list(ct3d, in);
+ if (ret != CXL_MBOX_SUCCESS) {
+ return ret;
+ }
+
+ ret = cxl_dc_extent_release_dry_run(ct3d, in, &updated_list,
+ &updated_list_size);
+ if (ret != CXL_MBOX_SUCCESS) {
+ return ret;
+ }
+
+ /*
+ * If the dry run release passes, the returned updated_list will
+ * be the updated extent list and we just need to clear the extents
+ * in the accepted list and copy extents in the updated_list to accepted
+ * list and update the extent count;
+ */
+ QTAILQ_FOREACH_SAFE(ent, &ct3d->dc.extents, node, ent_next) {
+ cxl_remove_extent_from_extent_list(&ct3d->dc.extents, ent);
+ }
+ copy_extent_list(&ct3d->dc.extents, &updated_list);
+ QTAILQ_FOREACH_SAFE(ent, &updated_list, node, ent_next) {
+ cxl_remove_extent_from_extent_list(&updated_list, ent);
+ }
+ ct3d->dc.total_extent_count = updated_list_size;
+
+ return CXL_MBOX_SUCCESS;
+}
+
#define IMMEDIATE_CONFIG_CHANGE (1 << 1)
#define IMMEDIATE_DATA_CHANGE (1 << 2)
#define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -1448,6 +1836,12 @@ static const struct cxl_cmd cxl_cmd_set_dcd[256][256] = {
[DCD_CONFIG][GET_DYN_CAP_EXT_LIST] = {
"DCD_GET_DYNAMIC_CAPACITY_EXTENT_LIST", cmd_dcd_get_dyn_cap_ext_list,
8, 0 },
+ [DCD_CONFIG][ADD_DYN_CAP_RSP] = {
+ "DCD_ADD_DYNAMIC_CAPACITY_RESPONSE", cmd_dcd_add_dyn_cap_rsp,
+ ~0, IMMEDIATE_DATA_CHANGE },
+ [DCD_CONFIG][RELEASE_DYN_CAP] = {
+ "DCD_RELEASE_DYNAMIC_CAPACITY", cmd_dcd_release_dyn_cap,
+ ~0, IMMEDIATE_DATA_CHANGE },
};
static const struct cxl_cmd cxl_cmd_set_sw[256][256] = {
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index f6ab885270..7c9038938f 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -677,6 +677,15 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
return true;
}
+static void cxl_destroy_dc_regions(CXLType3Dev *ct3d)
+{
+ CXLDCExtent *ent, *ent_next;
+
+ QTAILQ_FOREACH_SAFE(ent, &ct3d->dc.extents, node, ent_next) {
+ cxl_remove_extent_from_extent_list(&ct3d->dc.extents, ent);
+ }
+}
+
static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
{
DeviceState *ds = DEVICE(ct3d);
@@ -873,6 +882,7 @@ err_free_special_ops:
g_free(regs->special_ops);
err_address_space_free:
if (ct3d->dc.host_dc) {
+ cxl_destroy_dc_regions(ct3d);
address_space_destroy(&ct3d->dc.host_dc_as);
}
if (ct3d->hostpmem) {
@@ -894,6 +904,7 @@ static void ct3_exit(PCIDevice *pci_dev)
cxl_doe_cdat_release(cxl_cstate);
g_free(regs->special_ops);
if (ct3d->dc.host_dc) {
+ cxl_destroy_dc_regions(ct3d);
address_space_destroy(&ct3d->dc.host_dc_as);
}
if (ct3d->hostpmem) {
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 6aec6ac983..df3511e91b 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -551,4 +551,8 @@ void cxl_event_irq_assert(CXLType3Dev *ct3d);
void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d);
+CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len);
+
+void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
+ CXLDCExtent *extent);
#endif
--
2.43.0
* [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (9 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 10/14] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-06-04 7:12 ` Markus Armbruster
2025-09-02 10:39 ` Alireza Sanaee
2024-05-23 17:44 ` [PATCH v8 12/14] hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions nifan.cxl
` (4 subsequent siblings)
15 siblings, 2 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Svetly Todorov
From: Fan Ni <fan.ni@samsung.com>
To simulate FM functionalities for initiating Dynamic Capacity Add
(Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec
r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue
add/release dynamic capacity extents requests.
With this change, an extent can be released only when its DPA range
is contained by a single accepted extent in the device. That is,
extent superset release is not supported yet.
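As a rough illustration of the containment rule above, here is a
standalone sketch (not the code in this series). ExtentSketch and
release_from_extent() are hypothetical names; the head/tail lengths
play the role of len1/len2 in the dry-run helper of the mailbox patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an accepted extent: [start, start + len) */
typedef struct {
    uint64_t start;
    uint64_t len;
} ExtentSketch;

/*
 * Try to release [dpa, dpa + len) from one accepted extent.
 * On success, *head and *tail describe the leftover pieces
 * (len == 0 means "no leftover on that side").
 * Returns false if the range is not fully contained in the extent,
 * i.e. the superset case that is not handled here.
 */
static bool release_from_extent(const ExtentSketch *ent, uint64_t dpa,
                                uint64_t len, ExtentSketch *head,
                                ExtentSketch *tail)
{
    assert(ent && head && tail);

    /* Containment check written to avoid overflowing dpa + len */
    if (len == 0 || dpa < ent->start || len > ent->len ||
        dpa - ent->start > ent->len - len) {
        return false;
    }

    head->start = ent->start;
    head->len = dpa - ent->start;            /* part before the hole */
    tail->start = dpa + len;
    tail->len = ent->len - head->len - len;  /* part after the hole */
    return true;
}
```

Releasing the second 128MiB of a 256MiB extent leaves a 128MiB head
and an empty tail; a request that runs past the end of the extent is
rejected rather than split across neighbours.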
1. Add dynamic capacity extents:
For example, the command to add two contiguous extents (each 128MiB long)
to region 0 (starting at DPA offset 0) looks like below:
{ "execute": "qmp_capabilities" }
{ "execute": "cxl-add-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-dcd0",
"host-id": 0,
"selection-policy": "prescriptive",
"region": 0,
"extents": [
{
"offset": 0,
"len": 134217728
},
{
"offset": 134217728,
"len": 134217728
}
]
}
}
2. Release dynamic capacity extents:
For example, the command to release an extent of size 128MiB from region 0
(DPA offset 128MiB) looks like below:
{ "execute": "cxl-release-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-dcd0",
"host-id": 0,
"removal-policy":"prescriptive",
"region": 0,
"extents": [
{
"offset": 134217728,
"len": 134217728
}
]
}
}
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 62 ++++++--
hw/mem/cxl_type3.c | 306 +++++++++++++++++++++++++++++++++++-
hw/mem/cxl_type3_stubs.c | 25 +++
include/hw/cxl/cxl_device.h | 22 +++
include/hw/cxl/cxl_events.h | 18 +++
qapi/cxl.json | 143 +++++++++++++++++
6 files changed, 563 insertions(+), 13 deletions(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 9d54e10cd4..ab71492697 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -1405,7 +1405,7 @@ static CXLRetCode cmd_dcd_get_dyn_cap_ext_list(const struct cxl_cmd *cmd,
* Check whether any bit between addr[nr, nr+size) is set,
* return true if any bit is set, otherwise return false
*/
-static bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
+bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
unsigned long size)
{
unsigned long res = find_next_bit(addr, size + nr, nr);
@@ -1444,7 +1444,7 @@ CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len)
return NULL;
}
-static void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
+void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
uint64_t dpa,
uint64_t len,
uint8_t *tag,
@@ -1470,6 +1470,44 @@ void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
g_free(extent);
}
+/*
+ * Add a new extent to the extent "group" if group exists;
+ * otherwise, create a new group
+ * Return value: the extent group where the extent is inserted.
+ */
+CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
+ uint64_t dpa,
+ uint64_t len,
+ uint8_t *tag,
+ uint16_t shared_seq)
+{
+ if (!group) {
+ group = g_new0(CXLDCExtentGroup, 1);
+ QTAILQ_INIT(&group->list);
+ }
+ cxl_insert_extent_to_extent_list(&group->list, dpa, len,
+ tag, shared_seq);
+ return group;
+}
+
+void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
+ CXLDCExtentGroup *group)
+{
+ QTAILQ_INSERT_TAIL(list, group, node);
+}
+
+void cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list)
+{
+ CXLDCExtent *ent, *ent_next;
+ CXLDCExtentGroup *group = QTAILQ_FIRST(list);
+
+ QTAILQ_REMOVE(list, group, node);
+ QTAILQ_FOREACH_SAFE(ent, &group->list, node, ent_next) {
+ cxl_remove_extent_from_extent_list(&group->list, ent);
+ }
+ g_free(group);
+}
+
/*
* CXL r3.1 Table 8-168: Add Dynamic Capacity Response Input Payload
* CXL r3.1 Table 8-170: Release Dynamic Capacity Input Payload
@@ -1541,6 +1579,7 @@ static CXLRetCode cxl_dcd_add_dyn_cap_rsp_dry_run(CXLType3Dev *ct3d,
{
uint32_t i;
CXLDCExtent *ent;
+ CXLDCExtentGroup *ext_group;
uint64_t dpa, len;
Range range1, range2;
@@ -1551,9 +1590,13 @@ static CXLRetCode cxl_dcd_add_dyn_cap_rsp_dry_run(CXLType3Dev *ct3d,
range_init_nofail(&range1, dpa, len);
/*
- * TODO: once the pending extent list is added, check against
- * the list will be added here.
+ * The host-accepted DPA range must be contained by the first extent
+ * group in the pending list
*/
+ ext_group = QTAILQ_FIRST(&ct3d->dc.extents_pending);
+ if (!cxl_extents_contains_dpa_range(&ext_group->list, dpa, len)) {
+ return CXL_MBOX_INVALID_PA;
+ }
/* to-be-added range should not overlap with range already accepted */
QTAILQ_FOREACH(ent, &ct3d->dc.extents, node) {
@@ -1586,10 +1629,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
CXLRetCode ret;
if (in->num_entries_updated == 0) {
- /*
- * TODO: once the pending list is introduced, extents in the beginning
- * will get wiped out.
- */
+ cxl_extent_group_list_delete_front(&ct3d->dc.extents_pending);
return CXL_MBOX_SUCCESS;
}
@@ -1615,11 +1655,9 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
cxl_insert_extent_to_extent_list(extent_list, dpa, len, NULL, 0);
ct3d->dc.total_extent_count += 1;
- /*
- * TODO: we will add a pending extent list based on event log record
- * and process the list accordingly here.
- */
}
+ /* Remove the first extent group in the pending list */
+ cxl_extent_group_list_delete_front(&ct3d->dc.extents_pending);
return CXL_MBOX_SUCCESS;
}
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 7c9038938f..2161766b14 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -673,6 +673,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
ct3d->dc.total_capacity += region->len;
}
QTAILQ_INIT(&ct3d->dc.extents);
+ QTAILQ_INIT(&ct3d->dc.extents_pending);
return true;
}
@@ -680,10 +681,19 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
static void cxl_destroy_dc_regions(CXLType3Dev *ct3d)
{
CXLDCExtent *ent, *ent_next;
+ CXLDCExtentGroup *group, *group_next;
QTAILQ_FOREACH_SAFE(ent, &ct3d->dc.extents, node, ent_next) {
cxl_remove_extent_from_extent_list(&ct3d->dc.extents, ent);
}
+
+ QTAILQ_FOREACH_SAFE(group, &ct3d->dc.extents_pending, node, group_next) {
+ QTAILQ_REMOVE(&ct3d->dc.extents_pending, group, node);
+ QTAILQ_FOREACH_SAFE(ent, &group->list, node, ent_next) {
+ cxl_remove_extent_from_extent_list(&group->list, ent);
+ }
+ g_free(group);
+ }
}
static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
@@ -1448,7 +1458,6 @@ static int ct3d_qmp_cxl_event_log_enc(CxlEventLog log)
return CXL_EVENT_TYPE_FAIL;
case CXL_EVENT_LOG_FATAL:
return CXL_EVENT_TYPE_FATAL;
-/* DCD not yet supported */
default:
return -EINVAL;
}
@@ -1699,6 +1708,301 @@ void qmp_cxl_inject_memory_module_event(const char *path, CxlEventLog log,
}
}
+/* CXL r3.1 Table 8-50: Dynamic Capacity Event Record */
+static const QemuUUID dynamic_capacity_uuid = {
+ .data = UUID(0xca95afa7, 0xf183, 0x4018, 0x8c, 0x2f,
+ 0x95, 0x26, 0x8e, 0x10, 0x1a, 0x2a),
+};
+
+typedef enum CXLDCEventType {
+ DC_EVENT_ADD_CAPACITY = 0x0,
+ DC_EVENT_RELEASE_CAPACITY = 0x1,
+ DC_EVENT_FORCED_RELEASE_CAPACITY = 0x2,
+ DC_EVENT_REGION_CONFIG_UPDATED = 0x3,
+ DC_EVENT_ADD_CAPACITY_RSP = 0x4,
+ DC_EVENT_CAPACITY_RELEASED = 0x5,
+} CXLDCEventType;
+
+/*
+ * Check whether the range [dpa, dpa + len - 1] has overlaps with extents in
+ * the list.
+ * Return value: return true if has overlaps; otherwise, return false
+ */
+static bool cxl_extents_overlaps_dpa_range(CXLDCExtentList *list,
+ uint64_t dpa, uint64_t len)
+{
+ CXLDCExtent *ent;
+ Range range1, range2;
+
+ if (!list) {
+ return false;
+ }
+
+ range_init_nofail(&range1, dpa, len);
+ QTAILQ_FOREACH(ent, list, node) {
+ range_init_nofail(&range2, ent->start_dpa, ent->len);
+ if (range_overlaps_range(&range1, &range2)) {
+ return true;
+ }
+ }
+ return false;
+}
+
+/*
+ * Check whether the range [dpa, dpa + len - 1] is contained by extents in
+ * the list.
+ * Will check multiple extents containment once superset release is added.
+ * Return value: return true if range is contained; otherwise, return false
+ */
+bool cxl_extents_contains_dpa_range(CXLDCExtentList *list,
+ uint64_t dpa, uint64_t len)
+{
+ CXLDCExtent *ent;
+ Range range1, range2;
+
+ if (!list) {
+ return false;
+ }
+
+ range_init_nofail(&range1, dpa, len);
+ QTAILQ_FOREACH(ent, list, node) {
+ range_init_nofail(&range2, ent->start_dpa, ent->len);
+ if (range_contains_range(&range2, &range1)) {
+ return true;
+ }
+ }
+ return false;
+}
+
+static bool cxl_extent_groups_overlaps_dpa_range(CXLDCExtentGroupList *list,
+ uint64_t dpa, uint64_t len)
+{
+ CXLDCExtentGroup *group;
+
+ if (!list) {
+ return false;
+ }
+
+ QTAILQ_FOREACH(group, list, node) {
+ if (cxl_extents_overlaps_dpa_range(&group->list, dpa, len)) {
+ return true;
+ }
+ }
+ return false;
+}
+
+/*
+ * The main function to process dynamic capacity event with extent list.
+ * Currently DC extents add/release requests are processed.
+ */
+static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
+ uint16_t hid, CXLDCEventType type, uint8_t rid,
+ CXLDynamicCapacityExtentList *records, Error **errp)
+{
+ Object *obj;
+ CXLEventDynamicCapacity dCap = {};
+ CXLEventRecordHdr *hdr = &dCap.hdr;
+ CXLType3Dev *dcd;
+ uint8_t flags = 1 << CXL_EVENT_TYPE_INFO;
+ uint32_t num_extents = 0;
+ CXLDynamicCapacityExtentList *list;
+ CXLDCExtentGroup *group = NULL;
+ g_autofree CXLDCExtentRaw *extents = NULL;
+ uint8_t enc_log = CXL_EVENT_TYPE_DYNAMIC_CAP;
+ uint64_t dpa, offset, len, block_size;
+ g_autofree unsigned long *blk_bitmap = NULL;
+ int i;
+
+ obj = object_resolve_path_type(path, TYPE_CXL_TYPE3, NULL);
+ if (!obj) {
+ error_setg(errp, "Unable to resolve CXL type 3 device");
+ return;
+ }
+
+ dcd = CXL_TYPE3(obj);
+ if (!dcd->dc.num_regions) {
+ error_setg(errp, "No dynamic capacity support from the device");
+ return;
+ }
+
+
+ if (rid >= dcd->dc.num_regions) {
+ error_setg(errp, "region id is too large");
+ return;
+ }
+ block_size = dcd->dc.regions[rid].block_size;
+ blk_bitmap = bitmap_new(dcd->dc.regions[rid].len / block_size);
+
+ /* Sanity check and count the extents */
+ list = records;
+ while (list) {
+ offset = list->value->offset;
+ len = list->value->len;
+ dpa = offset + dcd->dc.regions[rid].base;
+
+ if (len == 0) {
+ error_setg(errp, "extent with 0 length is not allowed");
+ return;
+ }
+
+ if (offset % block_size || len % block_size) {
+ error_setg(errp, "dpa or len is not aligned to region block size");
+ return;
+ }
+
+ if (offset + len > dcd->dc.regions[rid].len) {
+ error_setg(errp, "extent range is beyond the region end");
+ return;
+ }
+
+ /* No duplicate or overlapped extents are allowed */
+ if (test_any_bits_set(blk_bitmap, offset / block_size,
+ len / block_size)) {
+ error_setg(errp, "duplicate or overlapped extents are detected");
+ return;
+ }
+ bitmap_set(blk_bitmap, offset / block_size, len / block_size);
+
+ if (type == DC_EVENT_RELEASE_CAPACITY) {
+ if (cxl_extent_groups_overlaps_dpa_range(&dcd->dc.extents_pending,
+ dpa, len)) {
+ error_setg(errp,
+ "cannot release extent with pending DPA range");
+ return;
+ }
+ if (!cxl_extents_contains_dpa_range(&dcd->dc.extents, dpa, len)) {
+ error_setg(errp,
+ "cannot release extent with non-existing DPA range");
+ return;
+ }
+ } else if (type == DC_EVENT_ADD_CAPACITY) {
+ if (cxl_extents_overlaps_dpa_range(&dcd->dc.extents, dpa, len)) {
+ error_setg(errp,
+ "cannot add DPA already accessible to the same LD");
+ return;
+ }
+ if (cxl_extent_groups_overlaps_dpa_range(&dcd->dc.extents_pending,
+ dpa, len)) {
+ error_setg(errp,
+ "cannot add DPA again while still pending");
+ return;
+ }
+ }
+ list = list->next;
+ num_extents++;
+ }
+
+ /* Create extent list for event being passed to host */
+ i = 0;
+ list = records;
+ extents = g_new0(CXLDCExtentRaw, num_extents);
+ while (list) {
+ offset = list->value->offset;
+ len = list->value->len;
+ dpa = dcd->dc.regions[rid].base + offset;
+
+ extents[i].start_dpa = dpa;
+ extents[i].len = len;
+ memset(extents[i].tag, 0, 0x10);
+ extents[i].shared_seq = 0;
+ if (type == DC_EVENT_ADD_CAPACITY) {
+ group = cxl_insert_extent_to_extent_group(group,
+ extents[i].start_dpa,
+ extents[i].len,
+ extents[i].tag,
+ extents[i].shared_seq);
+ }
+
+ list = list->next;
+ i++;
+ }
+ if (group) {
+ cxl_extent_group_list_insert_tail(&dcd->dc.extents_pending, group);
+ }
+
+ /*
+ * CXL r3.1 section 8.2.9.2.1.6: Dynamic Capacity Event Record
+ *
+ * All Dynamic Capacity event records shall set the Event Record Severity
+ * field in the Common Event Record Format to Informational Event. All
+ * Dynamic Capacity related events shall be logged in the Dynamic Capacity
+ * Event Log.
+ */
+ cxl_assign_event_header(hdr, &dynamic_capacity_uuid, flags, sizeof(dCap),
+ cxl_device_get_timestamp(&dcd->cxl_dstate));
+
+ dCap.type = type;
+ /* FIXME: for now, validity flag is cleared */
+ dCap.validity_flags = 0;
+ stw_le_p(&dCap.host_id, hid);
+ /* only valid for DC_REGION_CONFIG_UPDATED event */
+ dCap.updated_region_id = 0;
+ dCap.flags = 0;
+ for (i = 0; i < num_extents; i++) {
+ memcpy(&dCap.dynamic_capacity_extent, &extents[i],
+ sizeof(CXLDCExtentRaw));
+
+ if (i < num_extents - 1) {
+ /* Set "More" flag */
+ dCap.flags |= BIT(0);
+ }
+
+ if (cxl_event_insert(&dcd->cxl_dstate, enc_log,
+ (CXLEventRecordRaw *)&dCap)) {
+ cxl_event_irq_assert(dcd);
+ }
+ }
+}
+
+void qmp_cxl_add_dynamic_capacity(const char *path, uint16_t host_id,
+ CXLExtSelPolicy sel_policy, uint8_t region,
+ const char *tag,
+ CXLDynamicCapacityExtentList *extents,
+ Error **errp)
+{
+ switch (sel_policy) {
+ case CXL_EXT_SEL_POLICY_PRESCRIPTIVE:
+ qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id,
+ DC_EVENT_ADD_CAPACITY,
+ region, extents, errp);
+ return;
+ default:
+ error_setg(errp, "Selection policy not supported");
+ return;
+ }
+}
+
+void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
+ CXLExtRemovalPolicy removal_policy,
+ bool has_forced_removal,
+ bool forced_removal,
+ bool has_sanitize_on_release,
+ bool sanitize_on_release,
+ uint8_t region,
+ const char *tag,
+ CXLDynamicCapacityExtentList *extents,
+ Error **errp)
+{
+ CXLDCEventType type = DC_EVENT_RELEASE_CAPACITY;
+
+ if (has_forced_removal && forced_removal) {
+ /* TODO: enable forced removal in the future */
+ type = DC_EVENT_FORCED_RELEASE_CAPACITY;
+ error_setg(errp, "Forced removal not supported yet");
+ return;
+ }
+
+ switch (removal_policy) {
+ case CXL_EXT_REMOVAL_POLICY_PRESCRIPTIVE:
+ qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id, type,
+ region, extents, errp);
+ return;
+ default:
+ error_setg(errp, "Removal policy not supported");
+ return;
+ }
+}
+
static void ct3_class_init(ObjectClass *oc, void *data)
{
DeviceClass *dc = DEVICE_CLASS(oc);
diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
index 3e1851e32b..45419bbefe 100644
--- a/hw/mem/cxl_type3_stubs.c
+++ b/hw/mem/cxl_type3_stubs.c
@@ -67,3 +67,28 @@ void qmp_cxl_inject_correctable_error(const char *path, CxlCorErrorType type,
{
error_setg(errp, "CXL Type 3 support is not compiled in");
}
+
+void qmp_cxl_add_dynamic_capacity(const char *path,
+ uint16_t host_id,
+ CXLExtSelPolicy sel_policy,
+ uint8_t region,
+ const char *tag,
+ CXLDynamicCapacityExtentList *extents,
+ Error **errp)
+{
+ error_setg(errp, "CXL Type 3 support is not compiled in");
+}
+
+void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
+ CXLExtRemovalPolicy removal_policy,
+ bool has_forced_removal,
+ bool forced_removal,
+ bool has_sanitize_on_release,
+ bool sanitize_on_release,
+ uint8_t region,
+ const char *tag,
+ CXLDynamicCapacityExtentList *extents,
+ Error **errp)
+{
+ error_setg(errp, "CXL Type 3 support is not compiled in");
+}
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index df3511e91b..c69ff6b5de 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -443,6 +443,12 @@ typedef struct CXLDCExtent {
} CXLDCExtent;
typedef QTAILQ_HEAD(, CXLDCExtent) CXLDCExtentList;
+typedef struct CXLDCExtentGroup {
+ CXLDCExtentList list;
+ QTAILQ_ENTRY(CXLDCExtentGroup) node;
+} CXLDCExtentGroup;
+typedef QTAILQ_HEAD(, CXLDCExtentGroup) CXLDCExtentGroupList;
+
typedef struct CXLDCRegion {
uint64_t base; /* aligned to 256*MiB */
uint64_t decode_len; /* aligned to 256*MiB */
@@ -494,6 +500,7 @@ struct CXLType3Dev {
*/
uint64_t total_capacity; /* 256M aligned */
CXLDCExtentList extents;
+ CXLDCExtentGroupList extents_pending;
uint32_t total_extent_count;
uint32_t ext_list_gen_seq;
@@ -555,4 +562,19 @@ CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len);
void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
CXLDCExtent *extent);
+void cxl_insert_extent_to_extent_list(CXLDCExtentList *list, uint64_t dpa,
+ uint64_t len, uint8_t *tag,
+ uint16_t shared_seq);
+bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
+ unsigned long size);
+bool cxl_extents_contains_dpa_range(CXLDCExtentList *list,
+ uint64_t dpa, uint64_t len);
+CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
+ uint64_t dpa,
+ uint64_t len,
+ uint8_t *tag,
+ uint16_t shared_seq);
+void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
+ CXLDCExtentGroup *group);
+void cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list);
#endif
diff --git a/include/hw/cxl/cxl_events.h b/include/hw/cxl/cxl_events.h
index 5170b8dbf8..38cadaa0f3 100644
--- a/include/hw/cxl/cxl_events.h
+++ b/include/hw/cxl/cxl_events.h
@@ -166,4 +166,22 @@ typedef struct CXLEventMemoryModule {
uint8_t reserved[0x3d];
} QEMU_PACKED CXLEventMemoryModule;
+/*
+ * CXL r3.1 section Table 8-50: Dynamic Capacity Event Record
+ * All fields little endian.
+ */
+typedef struct CXLEventDynamicCapacity {
+ CXLEventRecordHdr hdr;
+ uint8_t type;
+ uint8_t validity_flags;
+ uint16_t host_id;
+ uint8_t updated_region_id;
+ uint8_t flags;
+ uint8_t reserved2[2];
+ uint8_t dynamic_capacity_extent[0x28]; /* defined in cxl_device.h */
+ uint8_t reserved[0x18];
+ uint32_t extents_avail;
+ uint32_t tags_avail;
+} QEMU_PACKED CXLEventDynamicCapacity;
+
#endif /* CXL_EVENTS_H */
diff --git a/qapi/cxl.json b/qapi/cxl.json
index 4281726dec..57d9f82014 100644
--- a/qapi/cxl.json
+++ b/qapi/cxl.json
@@ -361,3 +361,146 @@
##
{'command': 'cxl-inject-correctable-error',
'data': {'path': 'str', 'type': 'CxlCorErrorType'}}
+
+##
+# @CXLDynamicCapacityExtent:
+#
+# A single dynamic capacity extent
+#
+# @offset: The offset (in bytes) from the start of the region
+# to which the extent belongs.
+#
+# @len: The length of the extent in bytes.
+#
+# Since: 9.1
+##
+{ 'struct': 'CXLDynamicCapacityExtent',
+ 'data': {
+ 'offset':'uint64',
+ 'len': 'uint64'
+ }
+}
+
+##
+# @CXLExtSelPolicy:
+#
+# The policy to use for selecting which extents comprise the added
+# capacity, as defined in cxl spec r3.1 Table 7-70.
+#
+# @free: 0h = Free
+#
+# @contiguous: 1h = Contiguous
+#
+# @prescriptive: 2h = Prescriptive
+#
+# @enable-shared-access: 3h = Enable Shared Access
+#
+# Since: 9.1
+##
+{ 'enum': 'CXLExtSelPolicy',
+ 'data': ['free',
+ 'contiguous',
+ 'prescriptive',
+ 'enable-shared-access']
+}
+
+##
+# @cxl-add-dynamic-capacity:
+#
+# Initiate adding dynamic capacity extents to a host. This simulates
+# the operation defined in cxl spec r3.1 7.6.7.6.5.
+#
+# @path: CXL DCD canonical QOM path.
+#
+# @host-id: The "Host ID" field as defined in cxl spec r3.1
+# Table 7-70.
+#
+# @selection-policy: The "Selection Policy" bits as defined in
+# cxl spec r3.1 Table 7-70. It specifies the policy to use for
+# selecting which extents comprise the added capacity.
+#
+# @region: The "Region Number" field as defined in cxl spec r3.1
+# Table 7-70. The dynamic capacity region where the capacity
+# is being added. Valid range is from 0-7.
+#
+# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-70.
+#
+# @extents: The "Extent List" field as defined in cxl spec r3.1
+# Table 7-70.
+#
+# Since : 9.1
+##
+{ 'command': 'cxl-add-dynamic-capacity',
+ 'data': { 'path': 'str',
+ 'host-id': 'uint16',
+ 'selection-policy': 'CXLExtSelPolicy',
+ 'region': 'uint8',
+ '*tag': 'str',
+ 'extents': [ 'CXLDynamicCapacityExtent' ]
+ }
+}
+
+##
+# @CXLExtRemovalPolicy:
+#
+# The policy to use for selecting which extents comprise the released
+# capacity, defined in the "Flags" field in cxl spec r3.1 Table 7-71.
+#
+# @tag-based: value = 0h. Extents are selected by the device based
+# on tag, with no requirement for contiguous extents.
+#
+# @prescriptive: value = 1h. Extent list of capacity to release is
+# included in the request payload.
+#
+# Since: 9.1
+##
+{ 'enum': 'CXLExtRemovalPolicy',
+ 'data': ['tag-based',
+ 'prescriptive']
+}
+
+##
+# @cxl-release-dynamic-capacity:
+#
+# Initiate releasing dynamic capacity extents from a host. This
+# simulates the operation defined in cxl spec r3.1 7.6.7.6.6.
+#
+# @path: CXL DCD canonical QOM path.
+#
+# @host-id: The "Host ID" field as defined in cxl spec r3.1
+# Table 7-71.
+#
+# @removal-policy: Bit[3:0] of the "Flags" field as defined in cxl
+# spec r3.1 Table 7-71.
+#
+# @forced-removal: Bit[4] of the "Flags" field in cxl spec r3.1
+# Table 7-71. When set, device does not wait for a Release
+# Dynamic Capacity command from the host. Host immediately
+# loses access to released capacity.
+#
+# @sanitize-on-release: Bit[5] of the "Flags" field in cxl spec r3.1
+# Table 7-71. When set, device should sanitize all released
+# capacity as a result of this request.
+#
+# @region: The "Region Number" field as defined in cxl spec r3.1
+# Table 7-71. The dynamic capacity region where the capacity
+# is being released. Valid range is from 0-7.
+#
+# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
+#
+# @extents: The "Extent List" field as defined in cxl spec r3.1
+# Table 7-71.
+#
+# Since : 9.1
+##
+{ 'command': 'cxl-release-dynamic-capacity',
+ 'data': { 'path': 'str',
+ 'host-id': 'uint16',
+ 'removal-policy': 'CXLExtRemovalPolicy',
+ '*forced-removal': 'bool',
+ '*sanitize-on-release': 'bool',
+ 'region': 'uint8',
+ '*tag': 'str',
+ 'extents': [ 'CXLDynamicCapacityExtent' ]
+ }
+}
--
2.43.0
* [PATCH v8 12/14] hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (10 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 13/14] hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support nifan.cxl
` (3 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Svetly Todorov, Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
All DPA ranges in the DC regions are invalid to access until an extent
covering the range has been successfully accepted by the host. A bitmap
is added to each region to record whether a DC block in the region has
been backed by a DC extent. Each bit in the bitmap represents a DC block.
When a DC extent is accepted, all the bits representing the blocks in the
extent are set; they are cleared when the extent is released.
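The per-region bitmap can be pictured with the standalone sketch below.
It is illustrative only: RegionSketch, the sketch_* helpers, the 2 MiB
block size and the 256 MiB region size are assumptions, and it uses a
plain uint64_t array instead of QEMU's bitmap helpers. Unlike the patch,
the check here derives the first and last block from the byte range, so
an unaligned access that straddles two blocks tests both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE (2ULL << 20) /* assumed 2 MiB DC block size */
#define SKETCH_NUM_BLOCKS 128          /* assumed 256 MiB region */

typedef struct {
    uint64_t base;                         /* region base DPA */
    uint64_t bits[SKETCH_NUM_BLOCKS / 64]; /* one bit per DC block */
} RegionSketch;

/* Set (backed == true) or clear the bits of a block-aligned extent. */
static void sketch_mark_blocks(RegionSketch *r, uint64_t dpa, uint64_t len,
                               bool backed)
{
    uint64_t first, count, i;

    assert(dpa >= r->base);
    first = (dpa - r->base) / SKETCH_BLOCK_SIZE;
    count = len / SKETCH_BLOCK_SIZE;
    assert(first <= SKETCH_NUM_BLOCKS && count <= SKETCH_NUM_BLOCKS - first);

    for (i = first; i < first + count; i++) {
        uint64_t mask = 1ULL << (i % 64);

        if (backed) {
            r->bits[i / 64] |= mask;
        } else {
            r->bits[i / 64] &= ~mask;
        }
    }
}

/* Return true only if every block touched by [dpa, dpa + len) is backed. */
static bool sketch_range_is_backed(const RegionSketch *r, uint64_t dpa,
                                   uint64_t len)
{
    uint64_t off, first, last, i;

    if (len == 0 || dpa < r->base) {
        return false;
    }
    off = dpa - r->base;
    first = off / SKETCH_BLOCK_SIZE;
    last = (off + len - 1) / SKETCH_BLOCK_SIZE;
    if (last >= SKETCH_NUM_BLOCKS) {
        return false;
    }
    for (i = first; i <= last; i++) {
        if (!(r->bits[i / 64] & (1ULL << (i % 64)))) {
            return false;
        }
    }
    return true;
}
```

In this sketch an access is allowed only after the accept path has
marked the covering blocks, and it is refused again once the release
path has cleared them.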
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 3 ++
hw/mem/cxl_type3.c | 76 +++++++++++++++++++++++++++++++++++++
include/hw/cxl/cxl_device.h | 7 ++++
3 files changed, 86 insertions(+)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index ab71492697..045bce8f74 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -1655,6 +1655,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
cxl_insert_extent_to_extent_list(extent_list, dpa, len, NULL, 0);
ct3d->dc.total_extent_count += 1;
+ ct3_set_region_block_backed(ct3d, dpa, len);
}
/* Remove the first extent group in the pending list */
cxl_extent_group_list_delete_front(&ct3d->dc.extents_pending);
@@ -1813,10 +1814,12 @@ static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd,
* list and update the extent count;
*/
QTAILQ_FOREACH_SAFE(ent, &ct3d->dc.extents, node, ent_next) {
+ ct3_clear_region_block_backed(ct3d, ent->start_dpa, ent->len);
cxl_remove_extent_from_extent_list(&ct3d->dc.extents, ent);
}
copy_extent_list(&ct3d->dc.extents, &updated_list);
QTAILQ_FOREACH_SAFE(ent, &updated_list, node, ent_next) {
+ ct3_set_region_block_backed(ct3d, ent->start_dpa, ent->len);
cxl_remove_extent_from_extent_list(&updated_list, ent);
}
ct3d->dc.total_extent_count = updated_list_size;
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 2161766b14..60cbaa9bb6 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -671,6 +671,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
.flags = 0,
};
ct3d->dc.total_capacity += region->len;
+ region->blk_bitmap = bitmap_new(region->len / region->block_size);
}
QTAILQ_INIT(&ct3d->dc.extents);
QTAILQ_INIT(&ct3d->dc.extents_pending);
@@ -682,6 +683,8 @@ static void cxl_destroy_dc_regions(CXLType3Dev *ct3d)
{
CXLDCExtent *ent, *ent_next;
CXLDCExtentGroup *group, *group_next;
+ int i;
+ CXLDCRegion *region;
QTAILQ_FOREACH_SAFE(ent, &ct3d->dc.extents, node, ent_next) {
cxl_remove_extent_from_extent_list(&ct3d->dc.extents, ent);
@@ -694,6 +697,11 @@ static void cxl_destroy_dc_regions(CXLType3Dev *ct3d)
}
g_free(group);
}
+
+ for (i = 0; i < ct3d->dc.num_regions; i++) {
+ region = &ct3d->dc.regions[i];
+ g_free(region->blk_bitmap);
+ }
}
static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
@@ -925,6 +933,70 @@ static void ct3_exit(PCIDevice *pci_dev)
}
}
+/*
+ * Mark the DPA range [dpa, dpa + len - 1] to be backed and accessible. This
+ * happens when a DC extent is added and accepted by the host.
+ */
+void ct3_set_region_block_backed(CXLType3Dev *ct3d, uint64_t dpa,
+ uint64_t len)
+{
+ CXLDCRegion *region;
+
+ region = cxl_find_dc_region(ct3d, dpa, len);
+ if (!region) {
+ return;
+ }
+
+ bitmap_set(region->blk_bitmap, (dpa - region->base) / region->block_size,
+ len / region->block_size);
+}
+
+/*
+ * Check whether the DPA range [dpa, dpa + len - 1] is backed with DC extents.
+ * Used when validating read/write to dc regions
+ */
+bool ct3_test_region_block_backed(CXLType3Dev *ct3d, uint64_t dpa,
+ uint64_t len)
+{
+ CXLDCRegion *region;
+ uint64_t nbits;
+ long nr;
+
+ region = cxl_find_dc_region(ct3d, dpa, len);
+ if (!region) {
+ return false;
+ }
+
+ nr = (dpa - region->base) / region->block_size;
+ nbits = DIV_ROUND_UP(len, region->block_size);
+ /*
+ * if bits between [dpa, dpa + len) are all 1s, meaning the DPA range is
+ * backed with DC extents, return true; else return false.
+ */
+ return find_next_zero_bit(region->blk_bitmap, nr + nbits, nr) == nr + nbits;
+}
+
+/*
+ * Mark the DPA range [dpa, dpa + len - 1] to be unbacked and inaccessible.
+ * This happens when a dc extent is released by the host.
+ */
+void ct3_clear_region_block_backed(CXLType3Dev *ct3d, uint64_t dpa,
+ uint64_t len)
+{
+ CXLDCRegion *region;
+ uint64_t nbits;
+ long nr;
+
+ region = cxl_find_dc_region(ct3d, dpa, len);
+ if (!region) {
+ return;
+ }
+
+ nr = (dpa - region->base) / region->block_size;
+ nbits = len / region->block_size;
+ bitmap_clear(region->blk_bitmap, nr, nbits);
+}
+
static bool cxl_type3_dpa(CXLType3Dev *ct3d, hwaddr host_addr, uint64_t *dpa)
{
int hdm_inc = R_CXL_HDM_DECODER1_BASE_LO - R_CXL_HDM_DECODER0_BASE_LO;
@@ -1029,6 +1101,10 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
*as = &ct3d->hostpmem_as;
*dpa_offset -= vmr_size;
} else {
+ if (!ct3_test_region_block_backed(ct3d, *dpa_offset, size)) {
+ return -ENODEV;
+ }
+
*as = &ct3d->dc.host_dc_as;
*dpa_offset -= (vmr_size + pmr_size);
}
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index c69ff6b5de..0a4fcb2800 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -456,6 +456,7 @@ typedef struct CXLDCRegion {
uint64_t block_size;
uint32_t dsmadhandle;
uint8_t flags;
+ unsigned long *blk_bitmap;
} CXLDCRegion;
struct CXLType3Dev {
@@ -577,4 +578,10 @@ CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
CXLDCExtentGroup *group);
void cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list);
+void ct3_set_region_block_backed(CXLType3Dev *ct3d, uint64_t dpa,
+ uint64_t len);
+void ct3_clear_region_block_backed(CXLType3Dev *ct3d, uint64_t dpa,
+ uint64_t len);
+bool ct3_test_region_block_backed(CXLType3Dev *ct3d, uint64_t dpa,
+ uint64_t len);
#endif
--
2.43.0
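The per-region block bitmap used by this patch can be sketched outside QEMU as below. This is only an illustration under stated assumptions: DemoRegion and the demo_* helpers are hypothetical names, and a plain byte array stands in for QEMU's bitmap helpers (bitmap_new, bitmap_set, bitmap_clear, find_next_zero_bit). As in the patch, set/clear take whole blocks while the test rounds the length up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for CXLDCRegion; not the QEMU structure. */
typedef struct {
    uint64_t base;        /* first DPA of the region */
    uint64_t len;         /* region length in bytes */
    uint64_t block_size;  /* bytes tracked by one bit */
    uint8_t *bits;        /* one bit per block, 1 = backed by an extent */
} DemoRegion;

/* Allocate an all-zero bitmap: nothing is backed initially. */
static bool demo_region_init(DemoRegion *r, uint64_t base, uint64_t len,
                             uint64_t block_size)
{
    uint64_t nblocks = len / block_size;

    r->base = base;
    r->len = len;
    r->block_size = block_size;
    r->bits = calloc((size_t)((nblocks + 7) / 8), 1);
    return r->bits != NULL;
}

/* True when [dpa, dpa + len) lies entirely inside the region. */
static bool demo_in_region(const DemoRegion *r, uint64_t dpa, uint64_t len)
{
    return dpa >= r->base && len <= r->len && dpa - r->base <= r->len - len;
}

/* Mark (backed = true) or unmark (backed = false) a block-aligned range. */
static void demo_set_range(DemoRegion *r, uint64_t dpa, uint64_t len,
                           bool backed)
{
    assert(demo_in_region(r, dpa, len));
    uint64_t first = (dpa - r->base) / r->block_size;
    uint64_t n = len / r->block_size;

    for (uint64_t i = first; i < first + n; i++) {
        if (backed) {
            r->bits[i / 8] |= (uint8_t)(1u << (i % 8));
        } else {
            r->bits[i / 8] &= (uint8_t)~(1u << (i % 8));
        }
    }
}

/* Return true only if every block touched by the range is backed. */
static bool demo_test_range(const DemoRegion *r, uint64_t dpa, uint64_t len)
{
    if (!demo_in_region(r, dpa, len)) {
        return false;
    }
    uint64_t first = (dpa - r->base) / r->block_size;
    uint64_t n = (len + r->block_size - 1) / r->block_size;

    for (uint64_t i = first; i < first + n; i++) {
        if (!(r->bits[i / 8] & (1u << (i % 8)))) {
            return false;
        }
    }
    return true;
}
```

With 2 MiB blocks, accepting an extent would call demo_set_range(..., true), releasing it would call demo_set_range(..., false), and every read/write path would gate on demo_test_range before touching the backend.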
* [PATCH v8 13/14] hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (11 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 12/14] hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-05-23 17:44 ` [PATCH v8 14/14] hw/mem/cxl_type3: Allow to release extent superset in QMP interface nifan.cxl
` (2 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Svetly Todorov
From: Fan Ni <fan.ni@samsung.com>
With the change, we extend the extent release mailbox command processing
to allow more flexible release. As long as the DPA range of the extent to
release is covered by accepted extent(s) in the device, the release can be
performed.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/cxl/cxl-mailbox-utils.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 045bce8f74..ec8949ce7b 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -1704,6 +1704,13 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
dpa = in->updated_entries[i].start_dpa;
len = in->updated_entries[i].len;
+ /* Check if the DPA range is not fully backed with valid extents */
+ if (!ct3_test_region_block_backed(ct3d, dpa, len)) {
+ ret = CXL_MBOX_INVALID_PA;
+ goto free_and_exit;
+ }
+
+ /* After this point, extent overflow is the only error that can happen */
while (len > 0) {
QTAILQ_FOREACH(ent, updated_list, node) {
range_init_nofail(&range, ent->start_dpa, ent->len);
@@ -1718,14 +1725,7 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
if (range_contains(&range, dpa + len - 1)) {
len2 = ent_start_dpa + ent_len - dpa - len;
} else {
- /*
- * TODO: we reject the attempt to remove an extent
- * that overlaps with multiple extents in the device
- * for now. We will allow it once superset release
- * support is added.
- */
- ret = CXL_MBOX_INVALID_PA;
- goto free_and_exit;
+ dpa = ent_start_dpa + ent_len;
}
len_done = ent_len - len1 - len2;
@@ -1752,14 +1752,9 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
}
len -= len_done;
- /* len == 0 here until superset release is added */
break;
}
}
- if (len) {
- ret = CXL_MBOX_INVALID_PA;
- goto free_and_exit;
- }
}
}
free_and_exit:
--
2.43.0
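To make the superset release concrete, here is a minimal standalone sketch of releasing a DPA range that spans several accepted extents. It is not the QEMU implementation: DemoExtent, demo_emit and demo_release are hypothetical names, the extent list is a plain array, and the range is assumed to have already passed the "fully backed" check. The left and right leftovers play the role of len1 and len2 in the patch, and the only remaining failure mode is running out of extent slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: [start, start + len). */
typedef struct {
    uint64_t start;
    uint64_t len;
} DemoExtent;

/* Append one extent to the output array; return 0 on success, -1 if full. */
static int demo_emit(DemoExtent *out, size_t out_cap, size_t *n_out,
                     uint64_t start, uint64_t len)
{
    if (*n_out == out_cap) {
        return -1;
    }
    out[*n_out].start = start;
    out[*n_out].len = len;
    (*n_out)++;
    return 0;
}

/*
 * Release [dpa, dpa + len) from the extents in 'in' and write the surviving
 * pieces to 'out'. Returns the number of resulting extents, or -1 when the
 * result does not fit in out_cap entries (extent overflow).
 */
static int demo_release(const DemoExtent *in, size_t n_in,
                        uint64_t dpa, uint64_t len,
                        DemoExtent *out, size_t out_cap)
{
    uint64_t rel_end = dpa + len; /* exclusive end of released range */
    size_t n_out = 0;

    for (size_t i = 0; i < n_in; i++) {
        uint64_t ext_start = in[i].start;
        uint64_t ext_end = ext_start + in[i].len;
        uint64_t lo = ext_start > dpa ? ext_start : dpa;
        uint64_t hi = ext_end < rel_end ? ext_end : rel_end;

        if (lo >= hi) {
            /* No overlap with the released range: keep the extent as-is. */
            if (demo_emit(out, out_cap, &n_out, ext_start, in[i].len)) {
                return -1;
            }
            continue;
        }
        /* Leftover in front of the released range (len1 in the patch). */
        if (lo > ext_start &&
            demo_emit(out, out_cap, &n_out, ext_start, lo - ext_start)) {
            return -1;
        }
        /* Leftover behind the released range (len2 in the patch). */
        if (ext_end > hi &&
            demo_emit(out, out_cap, &n_out, hi, ext_end - hi)) {
            return -1;
        }
    }
    return (int)n_out;
}
```

Releasing from the middle of a single extent yields two extents, which is why a release can increase the extent count and why overflow still has to be checked.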
* [PATCH v8 14/14] hw/mem/cxl_type3: Allow to release extent superset in QMP interface
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (12 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 13/14] hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support nifan.cxl
@ 2024-05-23 17:44 ` nifan.cxl
2024-06-03 13:51 ` [PATCH v8 00/14] Enabling DCD emulation support in Qemu Jonathan Cameron
2025-06-25 14:22 ` Alireza Sanaee
15 siblings, 0 replies; 28+ messages in thread
From: nifan.cxl @ 2024-05-23 17:44 UTC (permalink / raw)
To: qemu-devel
Cc: jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, nifan.cxl,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni,
Svetly Todorov, Jonathan Cameron
From: Fan Ni <fan.ni@samsung.com>
Before this change, the QMP interface for adding/releasing DC extents
only allowed releasing an extent whose DPA range is contained in a single
accepted extent in the device.
With the change, we relax the constraints. As long as the DPA range of
the extent is covered by accepted extents, we allow the release.
Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
hw/mem/cxl_type3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 60cbaa9bb6..284db94182 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1946,7 +1946,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
"cannot release extent with pending DPA range");
return;
}
- if (!cxl_extents_contains_dpa_range(&dcd->dc.extents, dpa, len)) {
+ if (!ct3_test_region_block_backed(dcd, dpa, len)) {
error_setg(errp,
"cannot release extent with non-existing DPA range");
return;
--
2.43.0
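The difference between the old and the relaxed check can be illustrated with a small standalone sketch. The names below (DemoExtent, demo_contained_in_one, demo_covered) are hypothetical, and the sketch walks a sorted extent array rather than the block bitmap the patch actually consults; it only shows which requests each rule admits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: [start, start + len). */
typedef struct {
    uint64_t start;
    uint64_t len;
} DemoExtent;

/* Old rule: the range must fit inside one single accepted extent. */
static bool demo_contained_in_one(const DemoExtent *ext, size_t n,
                                  uint64_t dpa, uint64_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (dpa >= ext[i].start &&
            dpa + len <= ext[i].start + ext[i].len) {
            return true;
        }
    }
    return false;
}

/*
 * Relaxed rule: the range may span several extents as long as there is no
 * gap. Assumes 'ext' is sorted by start and the extents do not overlap.
 */
static bool demo_covered(const DemoExtent *ext, size_t n,
                         uint64_t dpa, uint64_t len)
{
    assert(len > 0);
    uint64_t cur = dpa;       /* first byte not yet known to be covered */
    uint64_t end = dpa + len; /* exclusive end of the request */

    for (size_t i = 0; i < n; i++) {
        uint64_t s = ext[i].start;
        uint64_t e = s + ext[i].len;

        if (e <= cur) {
            continue;     /* extent lies entirely before the cursor */
        }
        if (s > cur) {
            return false; /* hole between the cursor and this extent */
        }
        cur = e;
        if (cur >= end) {
            return true;
        }
    }
    return false;
}
```

A request that straddles two adjacent extents fails the first check but passes the second, which is the behaviour change this patch describes.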
* Re: [PATCH v8 06/14] hw/mem/cxl_type3: Add support to create DC regions to type3 memory devices
2024-05-23 17:44 ` [PATCH v8 06/14] hw/mem/cxl_type3: Add support to create DC regions to " nifan.cxl
@ 2024-05-27 7:42 ` Zhijian Li (Fujitsu)
0 siblings, 0 replies; 28+ messages in thread
From: Zhijian Li (Fujitsu) @ 2024-05-27 7:42 UTC (permalink / raw)
To: nifan.cxl@gmail.com, qemu-devel@nongnu.org
Cc: jonathan.cameron@huawei.com, linux-cxl@vger.kernel.org,
gregory.price@memverge.com, ira.weiny@intel.com,
dan.j.williams@intel.com, a.manzanares@samsung.com,
dave@stgolabs.net, nmtadam.samsung@gmail.com,
jim.harris@samsung.com, Jorgen.Hansen@wdc.com, wj28.lee@gmail.com,
armbru@redhat.com, mst@redhat.com, Fan Ni
On 24/05/2024 01:44, nifan.cxl@gmail.com wrote:
> From: Fan Ni <fan.ni@samsung.com>
>
> With the change, when setting up memory for type3 memory device, we can
> create DC regions.
> A property 'num-dc-regions' is added to ct3_props to allow users to pass the
> number of DC regions to create. To make it easier, other region parameters
> like region base, length, and block size are hard coded. If needed,
> these parameters can be added easily.
>
> With the change, we can create DC regions with proper kernel side
> support like below:
>
> region=$(cat /sys/bus/cxl/devices/decoder0.0/create_dc_region)
> echo $region > /sys/bus/cxl/devices/decoder0.0/create_dc_region
> echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity
> echo 1 > /sys/bus/cxl/devices/$region/interleave_ways
>
> echo "dc0" >/sys/bus/cxl/devices/decoder2.0/mode
> echo 0x40000000 >/sys/bus/cxl/devices/decoder2.0/dpa_size
>
> echo 0x40000000 > /sys/bus/cxl/devices/$region/size
> echo "decoder2.0" > /sys/bus/cxl/devices/$region/target0
> echo 1 > /sys/bus/cxl/devices/$region/commit
> echo $region > /sys/bus/cxl/drivers/cxl_region/bind
>
> Reviewed-by: Gregory Price <gregory.price@memverge.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
* Re: [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions
2024-05-23 17:44 ` [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions nifan.cxl
@ 2024-06-03 12:27 ` Jonathan Cameron
2024-06-03 15:04 ` Michael S. Tsirkin
0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Cameron @ 2024-06-03 12:27 UTC (permalink / raw)
To: nifan.cxl
Cc: qemu-devel, linux-cxl, gregory.price, ira.weiny, dan.j.williams,
a.manzanares, dave, nmtadam.samsung, jim.harris, Jorgen.Hansen,
wj28.lee, armbru, mst, Fan Ni
On Thu, 23 May 2024 10:44:48 -0700
nifan.cxl@gmail.com wrote:
> From: Fan Ni <fan.ni@samsung.com>
>
> Add (file/memory backed) host backend for DCD. All the dynamic capacity
> regions will share a single, large enough host backend. Set up address
> space for DC regions to support read/write operations to dynamic capacity
> for DCD.
>
> With the change, the following support is added:
> 1. Add a new property to type3 device "volatile-dc-memdev" to point to host
> memory backend for dynamic capacity. Currently, all DC regions share one
> host backend;
> 2. Add namespace for dynamic capacity for read/write support;
> 3. Create cdat entries for each dynamic capacity region.
>
> Reviewed-by: Gregory Price <gregory.price@memverge.com>
> Signed-off-by: Fan Ni <fan.ni@samsung.com>
> dvsec = (uint8_t *)&(CXLDVSECDevice){
> @@ -579,11 +622,28 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> {
> int i;
> uint64_t region_base = 0;
> - uint64_t region_len = 2 * GiB;
> - uint64_t decode_len = 2 * GiB;
> + uint64_t region_len;
> + uint64_t decode_len;
> uint64_t blk_size = 2 * MiB;
> CXLDCRegion *region;
> MemoryRegion *mr;
> + uint64_t dc_size;
> +
> + mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> + dc_size = memory_region_size(mr);
> + region_len = DIV_ROUND_UP(dc_size, ct3d->dc.num_regions);
> +
> + if (dc_size % (ct3d->dc.num_regions * CXL_CAPACITY_MULTIPLIER) != 0) {
> + error_setg(errp, "backend size is not multiple of region len: 0x%lx",
Just seen a build error for this in mst's gitlab.
Needs to be the messy PRIx64 (not tested), e.g.
error_setg(errp, "backend size is not multiple of region len: 0x%" PRIx64,
region_len);
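For reference, a small standalone sketch of the portable pattern, using snprintf in place of error_setg (demo_format_region_len is a hypothetical helper, not part of the series). PRIx64 from <inttypes.h> expands to only the length modifier and conversion letter, so the literal still has to supply the "%" (and the "0x" prefix, if wanted) before it.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a uint64_t in hex without assuming that it is 'unsigned long'.
 * "%lx" only matches uint64_t on LP64 targets; PRIx64 picks the right
 * length modifier on every platform. Returns snprintf's result.
 */
static int demo_format_region_len(char *buf, size_t buflen,
                                  uint64_t region_len)
{
    return snprintf(buf, buflen,
                    "backend size is not multiple of region len: 0x%" PRIx64,
                    region_len);
}
```

On a 32-bit build, or on 64-bit Windows, passing a uint64_t to "%lx" is a format mismatch, which is the kind of failure a -Werror=format CI job reports.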
Michael, do you want a new version, or are you happy to fix this up?
Thanks,
Jonathan
> + region_len);
> + return false;
> + }
> + if (region_len % CXL_CAPACITY_MULTIPLIER != 0) {
> + error_setg(errp, "DC region size is unaligned to 0x%lx",
> + CXL_CAPACITY_MULTIPLIER);
> + return false;
> + }
> + decode_len = region_len;
>
> if (ct3d->hostvmem) {
> mr = host_memory_backend_get_memory(ct3d->hostvmem);
> @@ -610,6 +670,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> /* dsmad_handle set when creating CDAT table entries */
> .flags = 0,
> };
> + ct3d->dc.total_capacity += region->len;
> }
>
> return true;
> @@ -619,7 +680,8 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> {
> DeviceState *ds = DEVICE(ct3d);
>
> - if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem) {
> + if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem
> + && !ct3d->dc.num_regions) {
> error_setg(errp, "at least one memdev property must be set");
> return false;
> } else if (ct3d->hostmem && ct3d->hostpmem) {
> @@ -683,7 +745,37 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> g_free(p_name);
> }
>
> + ct3d->dc.total_capacity = 0;
> if (ct3d->dc.num_regions > 0) {
> + MemoryRegion *dc_mr;
> + char *dc_name;
> +
> + if (!ct3d->dc.host_dc) {
> + error_setg(errp, "dynamic capacity must have a backing device");
> + return false;
> + }
> +
> + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> + if (!dc_mr) {
> + error_setg(errp, "dynamic capacity must have a backing device");
> + return false;
> + }
> +
> + /*
> + * Set DC regions as volatile for now, non-volatile support can
> + * be added in the future if needed.
> + */
> + memory_region_set_nonvolatile(dc_mr, false);
> + memory_region_set_enabled(dc_mr, true);
> + host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
> + if (ds->id) {
> + dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
> + } else {
> + dc_name = g_strdup("cxl-dcd-dpa-dc-space");
> + }
> + address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
> + g_free(dc_name);
> +
> if (!cxl_create_dc_regions(ct3d, errp)) {
> error_append_hint(errp, "setup DC regions failed");
> return false;
> @@ -779,6 +871,9 @@ err_release_cdat:
> err_free_special_ops:
> g_free(regs->special_ops);
> err_address_space_free:
> + if (ct3d->dc.host_dc) {
> + address_space_destroy(&ct3d->dc.host_dc_as);
> + }
> if (ct3d->hostpmem) {
> address_space_destroy(&ct3d->hostpmem_as);
> }
> @@ -797,6 +892,9 @@ static void ct3_exit(PCIDevice *pci_dev)
> pcie_aer_exit(pci_dev);
> cxl_doe_cdat_release(cxl_cstate);
> g_free(regs->special_ops);
> + if (ct3d->dc.host_dc) {
> + address_space_destroy(&ct3d->dc.host_dc_as);
> + }
> if (ct3d->hostpmem) {
> address_space_destroy(&ct3d->hostpmem_as);
> }
> @@ -875,16 +973,23 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
> AddressSpace **as,
> uint64_t *dpa_offset)
> {
> - MemoryRegion *vmr = NULL, *pmr = NULL;
> + MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
> + uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
>
> if (ct3d->hostvmem) {
> vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> + vmr_size = memory_region_size(vmr);
> }
> if (ct3d->hostpmem) {
> pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> + pmr_size = memory_region_size(pmr);
> + }
> + if (ct3d->dc.host_dc) {
> + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> + dc_size = memory_region_size(dc_mr);
> }
>
> - if (!vmr && !pmr) {
> + if (!vmr && !pmr && !dc_mr) {
> return -ENODEV;
> }
>
> @@ -892,19 +997,18 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
> return -EINVAL;
> }
>
> - if (*dpa_offset > ct3d->cxl_dstate.static_mem_size) {
> + if (*dpa_offset >= vmr_size + pmr_size + dc_size) {
> return -EINVAL;
> }
>
> - if (vmr) {
> - if (*dpa_offset < memory_region_size(vmr)) {
> - *as = &ct3d->hostvmem_as;
> - } else {
> - *as = &ct3d->hostpmem_as;
> - *dpa_offset -= memory_region_size(vmr);
> - }
> - } else {
> + if (*dpa_offset < vmr_size) {
> + *as = &ct3d->hostvmem_as;
> + } else if (*dpa_offset < vmr_size + pmr_size) {
> *as = &ct3d->hostpmem_as;
> + *dpa_offset -= vmr_size;
> + } else {
> + *as = &ct3d->dc.host_dc_as;
> + *dpa_offset -= (vmr_size + pmr_size);
> }
>
> return 0;
> @@ -986,6 +1090,8 @@ static Property ct3_props[] = {
> DEFINE_PROP_UINT64("sn", CXLType3Dev, sn, UI64_NULL),
> DEFINE_PROP_STRING("cdat", CXLType3Dev, cxl_cstate.cdat.filename),
> DEFINE_PROP_UINT8("num-dc-regions", CXLType3Dev, dc.num_regions, 0),
> + DEFINE_PROP_LINK("volatile-dc-memdev", CXLType3Dev, dc.host_dc,
> + TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> DEFINE_PROP_END_OF_LIST(),
> };
>
> @@ -1052,33 +1158,39 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
>
> static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
> {
> - MemoryRegion *vmr = NULL, *pmr = NULL;
> + MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
> AddressSpace *as;
> + uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
>
> if (ct3d->hostvmem) {
> vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> + vmr_size = memory_region_size(vmr);
> }
> if (ct3d->hostpmem) {
> pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> + pmr_size = memory_region_size(pmr);
> }
> + if (ct3d->dc.host_dc) {
> + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> + dc_size = memory_region_size(dc_mr);
> + }
>
> - if (!vmr && !pmr) {
> + if (!vmr && !pmr && !dc_mr) {
> return false;
> }
>
> - if (dpa_offset + CXL_CACHE_LINE_SIZE > ct3d->cxl_dstate.static_mem_size) {
> + if (dpa_offset + CXL_CACHE_LINE_SIZE > vmr_size + pmr_size + dc_size) {
> return false;
> }
>
> - if (vmr) {
> - if (dpa_offset < memory_region_size(vmr)) {
> - as = &ct3d->hostvmem_as;
> - } else {
> - as = &ct3d->hostpmem_as;
> - dpa_offset -= memory_region_size(vmr);
> - }
> - } else {
> + if (dpa_offset < vmr_size) {
> + as = &ct3d->hostvmem_as;
> + } else if (dpa_offset < vmr_size + pmr_size) {
> as = &ct3d->hostpmem_as;
> + dpa_offset -= vmr_size;
> + } else {
> + as = &ct3d->dc.host_dc_as;
> + dpa_offset -= (vmr_size + pmr_size);
> }
>
> address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data,
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index f7f56b44e3..c2c3df0d2a 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -467,6 +467,14 @@ struct CXLType3Dev {
> uint64_t poison_list_overflow_ts;
>
> struct dynamic_capacity {
> + HostMemoryBackend *host_dc;
> + AddressSpace host_dc_as;
> + /*
> + * total_capacity is equivalent to the dynamic capacity
> + * memory region size.
> + */
> + uint64_t total_capacity; /* 256M aligned */
> +
> uint8_t num_regions; /* 0-8 regions */
> CXLDCRegion regions[DCD_MAX_NUM_REGION];
> } dc;
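The address-space selection in cxl_type3_hpa_to_as_and_dpa() and set_cacheline() above amounts to walking three back-to-back DPA windows (volatile, persistent, dynamic capacity). A minimal sketch of that routing, with hypothetical names (DemoSpace, demo_route_dpa) and plain sizes in place of MemoryRegion lookups:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags for the three backends; not QEMU types. */
typedef enum {
    DEMO_AS_VMEM,
    DEMO_AS_PMEM,
    DEMO_AS_DC,
    DEMO_AS_NONE, /* offset is beyond all configured backends */
} DemoSpace;

/*
 * Pick the backend for a device-wide DPA offset and rebase the offset so
 * that it is relative to the start of that backend. A backend that is not
 * configured is passed as size 0 and is therefore never selected.
 */
static DemoSpace demo_route_dpa(uint64_t vmr_size, uint64_t pmr_size,
                                uint64_t dc_size, uint64_t *dpa_offset)
{
    if (*dpa_offset >= vmr_size + pmr_size + dc_size) {
        return DEMO_AS_NONE;
    }
    if (*dpa_offset < vmr_size) {
        return DEMO_AS_VMEM;
    }
    if (*dpa_offset < vmr_size + pmr_size) {
        *dpa_offset -= vmr_size;
        return DEMO_AS_PMEM;
    }
    *dpa_offset -= vmr_size + pmr_size;
    return DEMO_AS_DC;
}
```

Treating an absent backend as size 0 is what lets the patch drop the nested "if (vmr)" special case from the old code.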
* Re: [PATCH v8 00/14] Enabling DCD emulation support in Qemu
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (13 preceding siblings ...)
2024-05-23 17:44 ` [PATCH v8 14/14] hw/mem/cxl_type3: Allow to release extent superset in QMP interface nifan.cxl
@ 2024-06-03 13:51 ` Jonathan Cameron
2025-06-25 14:22 ` Alireza Sanaee
15 siblings, 0 replies; 28+ messages in thread
From: Jonathan Cameron @ 2024-06-03 13:51 UTC (permalink / raw)
To: nifan.cxl
Cc: qemu-devel, linux-cxl, gregory.price, ira.weiny, dan.j.williams,
a.manzanares, dave, nmtadam.samsung, jim.harris, Jorgen.Hansen,
wj28.lee, armbru, mst
On Thu, 23 May 2024 10:44:40 -0700
nifan.cxl@gmail.com wrote:
> From: Fan Ni <nifan.cxl@gmail.com>
>
Hi Fan,
Apologies for slow response to this update - been a busy few weeks and
I knew I was basically happy with this now, so it fell down the todo list.
I've taken one last look this morning and looks good to me.
I'll rebase my tree on top of this so that I can start posting the various
other dependent series in the next few days.
For the remaining patches which don't already have my tag,
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Thanks for your hard work on this fiddly series!
I notice Michael has this in his gitlab tree already which is great.
Hopefully that will go smoothly. No problem if I'm too late with
the above tag.
Looks like there are some build issues with format strings though.
https://gitlab.com/mstredhat/qemu/-/jobs/6998408862
I've called that out in the relevant patch.
Jonathan
> A git tree of this series can be found here (with one extra commit on top
> for printing out accepted/pending extent list for testing):
> https://github.com/moking/qemu/tree/dcd-v8-qapi
>
> v7->v8:
>
> This version carries over the following two patches from Gregory.
> 1. hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference
> https://gitlab.com/jic23/qemu/-/commit/f44ebc5a455ccdd6535879b0c5824e0d76b04da5
> 2. hw/cxl/mailbox: interface to add CCI commands to an existing CCI
> https://gitlab.com/jic23/qemu/-/commit/00a4dd8b388add03c588298f665ee918626296a5
>
> Note, the above two patches are not directly related to DCD emulation.
>
> All the following patches in this series are built on top of mainstream QEMU
> and the above two patches.
>
> The most significant change in v8 is in Patch 11 (Patch 9 in v7). Based on
> feedback from Markus and Jonathan, the QMP interfaces for adding and releasing
> DC extents have been redesigned and now they look like below,
>
> # add a 128MB extent at offset 0 to region 0
> { "execute": "cxl-add-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-memdev0",
> "host-id":0,
> "selection-policy": 'prescriptive',
> "region": 0,
> "tag": "",
> "extents": [
> {
> "offset": 0,
> "len": 134217728
> }
> ]
> }
> }
>
> Note: tag is optional.
>
> # Release a 128MB extent at offset 0 from region 0
> { "execute": "cxl-release-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-memdev0",
> "host-id":0,
> "removal-policy":"prescriptive",
> "forced-removal": false,
> "sanitize-on-release": false,
> "region": 0,
> "tag": "",
> "extents": [
> {
> "offset": 0,
> "len": 134217728
> }
> ]
> }
> }
>
> Note: removal-policy, sanitize-on-release and tag are optional.
>
> Other changes include,
> 1. Applied tags to patches.
> 2. Replaced error_setg with error_append_hint for cxl_create_dc_region error
> case in Patch 6 (Patch 4 in v7); (Zhijian Li)
> 3. Updated the error message to include region size information in
> cxl_create_dc_region.
> 4. set range1_size_hi to 0 for DCD in build_dvsec. (Jonathan)
> 5. Several minor format fixes.
>
> Thanks Markus, Jonathan, Gregory, and Zhijian for reviewing v7 and
> Svetly Todorov for testing v7.
>
> This series passes the same tests as v7; check the cover letter of v7 for
> more details. Additionally, we tested the QAPI interface for
> adding/releasing DC extents with optional input parameters.
>
>
> v7: https://lore.kernel.org/linux-cxl/5856b7a4-4082-465f-9f61-b1ec6c35ef0f@fujitsu.com/T/#mec4c85022ce28c80b241aaf2d5431cadaa45f097
>
>
> Fan Ni (12):
> hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
> payload of identify memory device command
> hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
> and mailbox command support
> include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
> type3 memory devices
> hw/mem/cxl_type3: Add support to create DC regions to type3 memory
> devices
> hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr
> size instead of mr as argument
> hw/mem/cxl_type3: Add host backend and address space handling for DC
> regions
> hw/mem/cxl_type3: Add DC extent list representative and get DC extent
> list mailbox support
> hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release
> dynamic capacity response
> hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
> extents
> hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions
> hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support
> hw/mem/cxl_type3: Allow to release extent superset in QMP interface
>
> Gregory Price (2):
> hw/cxl/mailbox: change CCI cmd set structure to be a member, not a
> reference
> hw/cxl/mailbox: interface to add CCI commands to an existing CCI
>
> hw/cxl/cxl-mailbox-utils.c | 658 +++++++++++++++++++++++++++++++++++-
> hw/mem/cxl_type3.c | 634 ++++++++++++++++++++++++++++++++--
> hw/mem/cxl_type3_stubs.c | 25 ++
> include/hw/cxl/cxl_device.h | 85 ++++-
> include/hw/cxl/cxl_events.h | 18 +
> qapi/cxl.json | 143 ++++++++
> 6 files changed, 1511 insertions(+), 52 deletions(-)
>
* Re: [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions
2024-06-03 12:27 ` Jonathan Cameron
@ 2024-06-03 15:04 ` Michael S. Tsirkin
2024-06-03 17:27 ` Jonathan Cameron
0 siblings, 1 reply; 28+ messages in thread
From: Michael S. Tsirkin @ 2024-06-03 15:04 UTC (permalink / raw)
To: Jonathan Cameron
Cc: nifan.cxl, qemu-devel, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, armbru, Fan Ni
On Mon, Jun 03, 2024 at 01:27:59PM +0100, Jonathan Cameron wrote:
> On Thu, 23 May 2024 10:44:48 -0700
> nifan.cxl@gmail.com wrote:
>
> > From: Fan Ni <fan.ni@samsung.com>
> >
> > Add (file/memory backed) host backend for DCD. All the dynamic capacity
> > regions will share a single, large enough host backend. Set up address
> > space for DC regions to support read/write operations to dynamic capacity
> > for DCD.
> >
> > With the change, the following support is added:
> > 1. Add a new property to type3 device "volatile-dc-memdev" to point to host
> > memory backend for dynamic capacity. Currently, all DC regions share one
> > host backend;
> > 2. Add namespace for dynamic capacity for read/write support;
> > 3. Create cdat entries for each dynamic capacity region.
> >
> > Reviewed-by: Gregory Price <gregory.price@memverge.com>
> > Signed-off-by: Fan Ni <fan.ni@samsung.com>
>
> > dvsec = (uint8_t *)&(CXLDVSECDevice){
> > @@ -579,11 +622,28 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> > {
> > int i;
> > uint64_t region_base = 0;
> > - uint64_t region_len = 2 * GiB;
> > - uint64_t decode_len = 2 * GiB;
> > + uint64_t region_len;
> > + uint64_t decode_len;
> > uint64_t blk_size = 2 * MiB;
> > CXLDCRegion *region;
> > MemoryRegion *mr;
> > + uint64_t dc_size;
> > +
> > + mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > + dc_size = memory_region_size(mr);
> > + region_len = DIV_ROUND_UP(dc_size, ct3d->dc.num_regions);
> > +
> > + if (dc_size % (ct3d->dc.num_regions * CXL_CAPACITY_MULTIPLIER) != 0) {
> > + error_setg(errp, "backend size is not multiple of region len: 0x%lx",
>
> Just seen a build error for this in mst's gitlab.
> Needs to be the messy PRIx64 (not tested), e.g.
>
> error_setg(errp, "backend size is not multiple of region len: 0x%" PRIx64,
> region_len);
>
> Michael, do you want a new version, or are you happy to fix this up?
>
> Thanks,
>
> Jonathan
I did this fixup. If nothing else happens I'll keep it, if more
issues creep up I will drop. Thanks!
> > + region_len);
> > + return false;
> > + }
> > + if (region_len % CXL_CAPACITY_MULTIPLIER != 0) {
> > + error_setg(errp, "DC region size is unaligned to 0x%lx",
> > + CXL_CAPACITY_MULTIPLIER);
> > + return false;
> > + }
> > + decode_len = region_len;
> >
> > if (ct3d->hostvmem) {
> > mr = host_memory_backend_get_memory(ct3d->hostvmem);
> > @@ -610,6 +670,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> > /* dsmad_handle set when creating CDAT table entries */
> > .flags = 0,
> > };
> > + ct3d->dc.total_capacity += region->len;
> > }
> >
> > return true;
> > @@ -619,7 +680,8 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> > {
> > DeviceState *ds = DEVICE(ct3d);
> >
> > - if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem) {
> > + if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem
> > + && !ct3d->dc.num_regions) {
> > error_setg(errp, "at least one memdev property must be set");
> > return false;
> > } else if (ct3d->hostmem && ct3d->hostpmem) {
> > @@ -683,7 +745,37 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> > g_free(p_name);
> > }
> >
> > + ct3d->dc.total_capacity = 0;
> > if (ct3d->dc.num_regions > 0) {
> > + MemoryRegion *dc_mr;
> > + char *dc_name;
> > +
> > + if (!ct3d->dc.host_dc) {
> > + error_setg(errp, "dynamic capacity must have a backing device");
> > + return false;
> > + }
> > +
> > + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > + if (!dc_mr) {
> > + error_setg(errp, "dynamic capacity must have a backing device");
> > + return false;
> > + }
> > +
> > + /*
> > + * Set DC regions as volatile for now, non-volatile support can
> > + * be added in the future if needed.
> > + */
> > + memory_region_set_nonvolatile(dc_mr, false);
> > + memory_region_set_enabled(dc_mr, true);
> > + host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
> > + if (ds->id) {
> > + dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
> > + } else {
> > + dc_name = g_strdup("cxl-dcd-dpa-dc-space");
> > + }
> > + address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
> > + g_free(dc_name);
> > +
> > if (!cxl_create_dc_regions(ct3d, errp)) {
> > error_append_hint(errp, "setup DC regions failed");
> > return false;
> > @@ -779,6 +871,9 @@ err_release_cdat:
> > err_free_special_ops:
> > g_free(regs->special_ops);
> > err_address_space_free:
> > + if (ct3d->dc.host_dc) {
> > + address_space_destroy(&ct3d->dc.host_dc_as);
> > + }
> > if (ct3d->hostpmem) {
> > address_space_destroy(&ct3d->hostpmem_as);
> > }
> > @@ -797,6 +892,9 @@ static void ct3_exit(PCIDevice *pci_dev)
> > pcie_aer_exit(pci_dev);
> > cxl_doe_cdat_release(cxl_cstate);
> > g_free(regs->special_ops);
> > + if (ct3d->dc.host_dc) {
> > + address_space_destroy(&ct3d->dc.host_dc_as);
> > + }
> > if (ct3d->hostpmem) {
> > address_space_destroy(&ct3d->hostpmem_as);
> > }
> > @@ -875,16 +973,23 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
> > AddressSpace **as,
> > uint64_t *dpa_offset)
> > {
> > - MemoryRegion *vmr = NULL, *pmr = NULL;
> > + MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
> > + uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
> >
> > if (ct3d->hostvmem) {
> > vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> > + vmr_size = memory_region_size(vmr);
> > }
> > if (ct3d->hostpmem) {
> > pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> > + pmr_size = memory_region_size(pmr);
> > + }
> > + if (ct3d->dc.host_dc) {
> > + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > + dc_size = memory_region_size(dc_mr);
> > }
> >
> > - if (!vmr && !pmr) {
> > + if (!vmr && !pmr && !dc_mr) {
> > return -ENODEV;
> > }
> >
> > @@ -892,19 +997,18 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
> > return -EINVAL;
> > }
> >
> > - if (*dpa_offset > ct3d->cxl_dstate.static_mem_size) {
> > + if (*dpa_offset >= vmr_size + pmr_size + dc_size) {
> > return -EINVAL;
> > }
> >
> > - if (vmr) {
> > - if (*dpa_offset < memory_region_size(vmr)) {
> > - *as = &ct3d->hostvmem_as;
> > - } else {
> > - *as = &ct3d->hostpmem_as;
> > - *dpa_offset -= memory_region_size(vmr);
> > - }
> > - } else {
> > + if (*dpa_offset < vmr_size) {
> > + *as = &ct3d->hostvmem_as;
> > + } else if (*dpa_offset < vmr_size + pmr_size) {
> > *as = &ct3d->hostpmem_as;
> > + *dpa_offset -= vmr_size;
> > + } else {
> > + *as = &ct3d->dc.host_dc_as;
> > + *dpa_offset -= (vmr_size + pmr_size);
> > }
> >
> > return 0;
> > @@ -986,6 +1090,8 @@ static Property ct3_props[] = {
> > DEFINE_PROP_UINT64("sn", CXLType3Dev, sn, UI64_NULL),
> > DEFINE_PROP_STRING("cdat", CXLType3Dev, cxl_cstate.cdat.filename),
> > DEFINE_PROP_UINT8("num-dc-regions", CXLType3Dev, dc.num_regions, 0),
> > + DEFINE_PROP_LINK("volatile-dc-memdev", CXLType3Dev, dc.host_dc,
> > + TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> > DEFINE_PROP_END_OF_LIST(),
> > };
> >
> > @@ -1052,33 +1158,39 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
> >
> > static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
> > {
> > - MemoryRegion *vmr = NULL, *pmr = NULL;
> > + MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
> > AddressSpace *as;
> > + uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
> >
> > if (ct3d->hostvmem) {
> > vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> > + vmr_size = memory_region_size(vmr);
> > }
> > if (ct3d->hostpmem) {
> > pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> > + pmr_size = memory_region_size(pmr);
> > }
> > + if (ct3d->dc.host_dc) {
> > + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > + dc_size = memory_region_size(dc_mr);
> > + }
> >
> > - if (!vmr && !pmr) {
> > + if (!vmr && !pmr && !dc_mr) {
> > return false;
> > }
> >
> > - if (dpa_offset + CXL_CACHE_LINE_SIZE > ct3d->cxl_dstate.static_mem_size) {
> > + if (dpa_offset + CXL_CACHE_LINE_SIZE > vmr_size + pmr_size + dc_size) {
> > return false;
> > }
> >
> > - if (vmr) {
> > - if (dpa_offset < memory_region_size(vmr)) {
> > - as = &ct3d->hostvmem_as;
> > - } else {
> > - as = &ct3d->hostpmem_as;
> > - dpa_offset -= memory_region_size(vmr);
> > - }
> > - } else {
> > + if (dpa_offset < vmr_size) {
> > + as = &ct3d->hostvmem_as;
> > + } else if (dpa_offset < vmr_size + pmr_size) {
> > as = &ct3d->hostpmem_as;
> > + dpa_offset -= vmr_size;
> > + } else {
> > + as = &ct3d->dc.host_dc_as;
> > + dpa_offset -= (vmr_size + pmr_size);
> > }
> >
> > address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data,
> > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > index f7f56b44e3..c2c3df0d2a 100644
> > --- a/include/hw/cxl/cxl_device.h
> > +++ b/include/hw/cxl/cxl_device.h
> > @@ -467,6 +467,14 @@ struct CXLType3Dev {
> > uint64_t poison_list_overflow_ts;
> >
> > struct dynamic_capacity {
> > + HostMemoryBackend *host_dc;
> > + AddressSpace host_dc_as;
> > + /*
> > + * total_capacity is equivalent to the dynamic capability
> > + * memory region size.
> > + */
> > + uint64_t total_capacity; /* 256M aligned */
> > +
> > uint8_t num_regions; /* 0-8 regions */
> > CXLDCRegion regions[DCD_MAX_NUM_REGION];
> > } dc;
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions
2024-06-03 15:04 ` Michael S. Tsirkin
@ 2024-06-03 17:27 ` Jonathan Cameron
0 siblings, 0 replies; 28+ messages in thread
From: Jonathan Cameron @ 2024-06-03 17:27 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: nifan.cxl, qemu-devel, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, armbru, Fan Ni
On Mon, 3 Jun 2024 11:04:06 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Jun 03, 2024 at 01:27:59PM +0100, Jonathan Cameron wrote:
> > On Thu, 23 May 2024 10:44:48 -0700
> > nifan.cxl@gmail.com wrote:
> >
> > > From: Fan Ni <fan.ni@samsung.com>
> > >
> > > Add (file/memory backed) host backend for DCD. All the dynamic capacity
> > > regions will share a single, large enough host backend. Set up address
> > > space for DC regions to support read/write operations to dynamic capacity
> > > for DCD.
> > >
> > > With the change, the following support is added:
> > > 1. Add a new property to type3 device "volatile-dc-memdev" to point to host
> > > memory backend for dynamic capacity. Currently, all DC regions share one
> > > host backend;
> > > 2. Add namespace for dynamic capacity for read/write support;
> > > 3. Create cdat entries for each dynamic capacity region.
> > >
> > > Reviewed-by: Gregory Price <gregory.price@memverge.com>
> > > Signed-off-by: Fan Ni <fan.ni@samsung.com>
> >
> > > dvsec = (uint8_t *)&(CXLDVSECDevice){
> > > @@ -579,11 +622,28 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> > > {
> > > int i;
> > > uint64_t region_base = 0;
> > > - uint64_t region_len = 2 * GiB;
> > > - uint64_t decode_len = 2 * GiB;
> > > + uint64_t region_len;
> > > + uint64_t decode_len;
> > > uint64_t blk_size = 2 * MiB;
> > > CXLDCRegion *region;
> > > MemoryRegion *mr;
> > > + uint64_t dc_size;
> > > +
> > > + mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > > + dc_size = memory_region_size(mr);
> > > + region_len = DIV_ROUND_UP(dc_size, ct3d->dc.num_regions);
> > > +
> > > + if (dc_size % (ct3d->dc.num_regions * CXL_CAPACITY_MULTIPLIER) != 0) {
> > > + error_setg(errp, "backend size is not multiple of region len: 0x%lx",
> >
> > Just seen a build error for this in mst's gitlab.
> > Needs to be the messy PRIx64(not tested) e.g.
> >
> > error_setg(errp, "backend size is not multiple of region len: " PRIx64,
> > region_len);
> >
> > Michael, do you want a new version, or are you happy to fix this up?
> >
> > Thanks,
> >
> > Jonathan
>
>
> I did this fixup. If nothing else happens I'll keep it, if more
> issues creep up I will drop. Thanks!
I failed to mention there are several instances.
I guess you have seen that in the gitlab run.
There is one in this patch and one in patch 6 concerning
CXL_CAPACITY_MULTIPLIER, which is defined as SZ_256M and oddly seems
to end up as different sizes on different architectures. Maybe just
cast that in the calls?
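For illustration only, a minimal standalone sketch of that fixup. It assumes
a snprintf() buffer as a stand-in for error_setg() and a local
CAPACITY_MULTIPLIER constant in place of CXL_CAPACITY_MULTIPLIER; the helper
name is hypothetical. It shows the "0x%" PRIx64 form and an explicit
uint64_t cast so the format matches the argument on every architecture:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for CXL_CAPACITY_MULTIPLIER (256 MiB). */
#define CAPACITY_MULTIPLIER (256u * 1024 * 1024)

/*
 * Hypothetical helper: apply the same divisibility check as the patch and
 * write a message into err on failure. Returns true when the size is usable.
 */
static bool check_dc_backend_size(uint64_t dc_size, uint8_t num_regions,
                                  char *err, size_t err_len)
{
    /* Cast once so every later use is a plain uint64_t. */
    uint64_t mult = (uint64_t)CAPACITY_MULTIPLIER;
    uint64_t region_len;

    assert(err != NULL && err_len > 0);

    if (num_regions == 0) {
        snprintf(err, err_len, "no DC regions configured");
        return false;
    }

    /* Same rounding as DIV_ROUND_UP(dc_size, num_regions). */
    region_len = (dc_size + num_regions - 1) / num_regions;

    if (dc_size % (num_regions * mult) != 0) {
        /* PRIx64 expands to the correct length modifier for uint64_t. */
        snprintf(err, err_len,
                 "backend size is not multiple of region len: 0x%" PRIx64,
                 region_len);
        return false;
    }
    return true;
}
```

The same cast-then-PRIx64 pattern would apply to the second message that
prints the multiplier itself.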
>
> > > + region_len);
> > > + return false;
> > > + }
> > > + if (region_len % CXL_CAPACITY_MULTIPLIER != 0) {
> > > + error_setg(errp, "DC region size is unaligned to 0x%lx",
> > > + CXL_CAPACITY_MULTIPLIER);
> > > + return false;
> > > + }
> > > + decode_len = region_len;
> > >
> > > if (ct3d->hostvmem) {
> > > mr = host_memory_backend_get_memory(ct3d->hostvmem);
> > > @@ -610,6 +670,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> > > /* dsmad_handle set when creating CDAT table entries */
> > > .flags = 0,
> > > };
> > > + ct3d->dc.total_capacity += region->len;
> > > }
> > >
> > > return true;
> > > @@ -619,7 +680,8 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> > > {
> > > DeviceState *ds = DEVICE(ct3d);
> > >
> > > - if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem) {
> > > + if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem
> > > + && !ct3d->dc.num_regions) {
> > > error_setg(errp, "at least one memdev property must be set");
> > > return false;
> > > } else if (ct3d->hostmem && ct3d->hostpmem) {
> > > @@ -683,7 +745,37 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> > > g_free(p_name);
> > > }
> > >
> > > + ct3d->dc.total_capacity = 0;
> > > if (ct3d->dc.num_regions > 0) {
> > > + MemoryRegion *dc_mr;
> > > + char *dc_name;
> > > +
> > > + if (!ct3d->dc.host_dc) {
> > > + error_setg(errp, "dynamic capacity must have a backing device");
> > > + return false;
> > > + }
> > > +
> > > + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > > + if (!dc_mr) {
> > > + error_setg(errp, "dynamic capacity must have a backing device");
> > > + return false;
> > > + }
> > > +
> > > + /*
> > > + * Set DC regions as volatile for now, non-volatile support can
> > > + * be added in the future if needed.
> > > + */
> > > + memory_region_set_nonvolatile(dc_mr, false);
> > > + memory_region_set_enabled(dc_mr, true);
> > > + host_memory_backend_set_mapped(ct3d->dc.host_dc, true);
> > > + if (ds->id) {
> > > + dc_name = g_strdup_printf("cxl-dcd-dpa-dc-space:%s", ds->id);
> > > + } else {
> > > + dc_name = g_strdup("cxl-dcd-dpa-dc-space");
> > > + }
> > > + address_space_init(&ct3d->dc.host_dc_as, dc_mr, dc_name);
> > > + g_free(dc_name);
> > > +
> > > if (!cxl_create_dc_regions(ct3d, errp)) {
> > > error_append_hint(errp, "setup DC regions failed");
> > > return false;
> > > @@ -779,6 +871,9 @@ err_release_cdat:
> > > err_free_special_ops:
> > > g_free(regs->special_ops);
> > > err_address_space_free:
> > > + if (ct3d->dc.host_dc) {
> > > + address_space_destroy(&ct3d->dc.host_dc_as);
> > > + }
> > > if (ct3d->hostpmem) {
> > > address_space_destroy(&ct3d->hostpmem_as);
> > > }
> > > @@ -797,6 +892,9 @@ static void ct3_exit(PCIDevice *pci_dev)
> > > pcie_aer_exit(pci_dev);
> > > cxl_doe_cdat_release(cxl_cstate);
> > > g_free(regs->special_ops);
> > > + if (ct3d->dc.host_dc) {
> > > + address_space_destroy(&ct3d->dc.host_dc_as);
> > > + }
> > > if (ct3d->hostpmem) {
> > > address_space_destroy(&ct3d->hostpmem_as);
> > > }
> > > @@ -875,16 +973,23 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
> > > AddressSpace **as,
> > > uint64_t *dpa_offset)
> > > {
> > > - MemoryRegion *vmr = NULL, *pmr = NULL;
> > > + MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
> > > + uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
> > >
> > > if (ct3d->hostvmem) {
> > > vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> > > + vmr_size = memory_region_size(vmr);
> > > }
> > > if (ct3d->hostpmem) {
> > > pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> > > + pmr_size = memory_region_size(pmr);
> > > + }
> > > + if (ct3d->dc.host_dc) {
> > > + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > > + dc_size = memory_region_size(dc_mr);
> > > }
> > >
> > > - if (!vmr && !pmr) {
> > > + if (!vmr && !pmr && !dc_mr) {
> > > return -ENODEV;
> > > }
> > >
> > > @@ -892,19 +997,18 @@ static int cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
> > > return -EINVAL;
> > > }
> > >
> > > - if (*dpa_offset > ct3d->cxl_dstate.static_mem_size) {
> > > + if (*dpa_offset >= vmr_size + pmr_size + dc_size) {
> > > return -EINVAL;
> > > }
> > >
> > > - if (vmr) {
> > > - if (*dpa_offset < memory_region_size(vmr)) {
> > > - *as = &ct3d->hostvmem_as;
> > > - } else {
> > > - *as = &ct3d->hostpmem_as;
> > > - *dpa_offset -= memory_region_size(vmr);
> > > - }
> > > - } else {
> > > + if (*dpa_offset < vmr_size) {
> > > + *as = &ct3d->hostvmem_as;
> > > + } else if (*dpa_offset < vmr_size + pmr_size) {
> > > *as = &ct3d->hostpmem_as;
> > > + *dpa_offset -= vmr_size;
> > > + } else {
> > > + *as = &ct3d->dc.host_dc_as;
> > > + *dpa_offset -= (vmr_size + pmr_size);
> > > }
> > >
> > > return 0;
> > > @@ -986,6 +1090,8 @@ static Property ct3_props[] = {
> > > DEFINE_PROP_UINT64("sn", CXLType3Dev, sn, UI64_NULL),
> > > DEFINE_PROP_STRING("cdat", CXLType3Dev, cxl_cstate.cdat.filename),
> > > DEFINE_PROP_UINT8("num-dc-regions", CXLType3Dev, dc.num_regions, 0),
> > > + DEFINE_PROP_LINK("volatile-dc-memdev", CXLType3Dev, dc.host_dc,
> > > + TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> > > DEFINE_PROP_END_OF_LIST(),
> > > };
> > >
> > > @@ -1052,33 +1158,39 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
> > >
> > > static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
> > > {
> > > - MemoryRegion *vmr = NULL, *pmr = NULL;
> > > + MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
> > > AddressSpace *as;
> > > + uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
> > >
> > > if (ct3d->hostvmem) {
> > > vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> > > + vmr_size = memory_region_size(vmr);
> > > }
> > > if (ct3d->hostpmem) {
> > > pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> > > + pmr_size = memory_region_size(pmr);
> > > }
> > > + if (ct3d->dc.host_dc) {
> > > + dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > > + dc_size = memory_region_size(dc_mr);
> > > + }
> > >
> > > - if (!vmr && !pmr) {
> > > + if (!vmr && !pmr && !dc_mr) {
> > > return false;
> > > }
> > >
> > > - if (dpa_offset + CXL_CACHE_LINE_SIZE > ct3d->cxl_dstate.static_mem_size) {
> > > + if (dpa_offset + CXL_CACHE_LINE_SIZE > vmr_size + pmr_size + dc_size) {
> > > return false;
> > > }
> > >
> > > - if (vmr) {
> > > - if (dpa_offset < memory_region_size(vmr)) {
> > > - as = &ct3d->hostvmem_as;
> > > - } else {
> > > - as = &ct3d->hostpmem_as;
> > > - dpa_offset -= memory_region_size(vmr);
> > > - }
> > > - } else {
> > > + if (dpa_offset < vmr_size) {
> > > + as = &ct3d->hostvmem_as;
> > > + } else if (dpa_offset < vmr_size + pmr_size) {
> > > as = &ct3d->hostpmem_as;
> > > + dpa_offset -= vmr_size;
> > > + } else {
> > > + as = &ct3d->dc.host_dc_as;
> > > + dpa_offset -= (vmr_size + pmr_size);
> > > }
> > >
> > > address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data,
> > > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > > index f7f56b44e3..c2c3df0d2a 100644
> > > --- a/include/hw/cxl/cxl_device.h
> > > +++ b/include/hw/cxl/cxl_device.h
> > > @@ -467,6 +467,14 @@ struct CXLType3Dev {
> > > uint64_t poison_list_overflow_ts;
> > >
> > > struct dynamic_capacity {
> > > + HostMemoryBackend *host_dc;
> > > + AddressSpace host_dc_as;
> > > + /*
> > > + * total_capacity is equivalent to the dynamic capability
> > > + * memory region size.
> > > + */
> > > + uint64_t total_capacity; /* 256M aligned */
> > > +
> > > uint8_t num_regions; /* 0-8 regions */
> > > CXLDCRegion regions[DCD_MAX_NUM_REGION];
> > > } dc;
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
2024-05-23 17:44 ` [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents nifan.cxl
@ 2024-06-04 7:12 ` Markus Armbruster
2024-06-04 11:55 ` Jonathan Cameron
2025-09-02 10:39 ` Alireza Sanaee
1 sibling, 1 reply; 28+ messages in thread
From: Markus Armbruster @ 2024-06-04 7:12 UTC (permalink / raw)
To: nifan.cxl
Cc: qemu-devel, jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, mst, Fan Ni, Svetly Todorov
nifan.cxl@gmail.com writes:
> From: Fan Ni <fan.ni@samsung.com>
>
> To simulate FM functionalities for initiating Dynamic Capacity Add
> (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec
> r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue
> add/release dynamic capacity extents requests.
>
> With the change, we allow to release an extent only when its DPA range
> is contained by a single accepted extent in the device. That is to say,
> extent superset release is not supported yet.
>
> 1. Add dynamic capacity extents:
>
> For example, the command to add two continuous extents (each 128MiB long)
> to region 0 (starting at DPA offset 0) looks like below:
>
> { "execute": "qmp_capabilities" }
>
> { "execute": "cxl-add-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-dcd0",
> "host-id": 0,
> "selection-policy": "prescriptive",
> "region": 0,
> "extents": [
> {
> "offset": 0,
> "len": 134217728
> },
> {
> "offset": 134217728,
> "len": 134217728
> }
> ]
> }
> }
>
> 2. Release dynamic capacity extents:
>
> For example, the command to release an extent of size 128MiB from region 0
> (DPA offset 128MiB) looks like below:
>
> { "execute": "cxl-release-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-dcd0",
> "host-id": 0,
> "removal-policy":"prescriptive",
> "region": 0,
> "extents": [
> {
> "offset": 134217728,
> "len": 134217728
> }
> ]
> }
> }
>
> Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
> Reviewed-by: Gregory Price <gregory.price@memverge.com>
> Signed-off-by: Fan Ni <fan.ni@samsung.com>
[...]
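As a rough illustration of the release rule in the quoted commit message (a
release is accepted only when its DPA range sits inside a single accepted
extent), here is a minimal C sketch. The struct and function names are
hypothetical and not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: DPA start and length, both in bytes. */
struct extent {
    uint64_t start;
    uint64_t len;
};

/* True when req lies entirely inside acc (written to avoid overflow). */
static bool extent_contains(const struct extent *acc, const struct extent *req)
{
    return req->len > 0 &&
           req->start >= acc->start &&
           req->len <= acc->len &&
           req->start - acc->start <= acc->len - req->len;
}

/* True when some single accepted extent covers the whole release request. */
static bool release_allowed(const struct extent *accepted, size_t n,
                            const struct extent *req)
{
    assert(req != NULL && (accepted != NULL || n == 0));

    for (size_t i = 0; i < n; i++) {
        if (extent_contains(&accepted[i], req)) {
            return true;
        }
    }
    return false;
}
```

Under this rule, a request that spans two adjacent accepted extents is
refused even though every byte of it is accepted, which is the "superset
release" case the commit message says is not supported yet.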
> diff --git a/qapi/cxl.json b/qapi/cxl.json
> index 4281726dec..57d9f82014 100644
> --- a/qapi/cxl.json
> +++ b/qapi/cxl.json
> @@ -361,3 +361,146 @@
> ##
> {'command': 'cxl-inject-correctable-error',
> 'data': {'path': 'str', 'type': 'CxlCorErrorType'}}
> +
> +##
> +# @CXLDynamicCapacityExtent:
Three existing type names start with Cxl, and only one starts with CXL.
Please make your new ones start with Cxl, not CXL:
CxlDynamicCapacityExtent.
> +#
> +# A single dynamic capacity extent
> +#
> +# @offset: The offset (in bytes) to the start of the region
> +# where the extent belongs to.
> +#
> +# @len: The length of the extent in bytes.
What is this? Memory?
> +#
> +# Since: 9.1
> +##
> +{ 'struct': 'CXLDynamicCapacityExtent',
> + 'data': {
> + 'offset':'uint64',
> + 'len': 'uint64'
> + }
> +}
> +
> +##
> +# @CXLExtSelPolicy:
CxlExtentSelectionPolicy
> +#
> +# The policy to use for selecting which extents comprise the added
> +# capacity, as defined in cxl spec r3.1 Table 7-70.
Use the official title: "as defined in the CXL Specification 3.1" (I
think, the actual document is behind a click-through agreement).
> +#
> +# @free: 0h = Free
> +#
> +# @contiguous: 1h = Continuous
What does "1h =" mean? The numeric encoding?
What exactly is "contiguous" / "continuous"? I figure it's clear enough
if you have the CXL spec open in another window. Can we condense it
into one phrase for use here?
> +#
> +# @prescriptive: 2h = Prescriptive
> +#
> +# @enable-shared-access: 3h = Enable Shared Access
Similar questions.
> +#
> +# Since: 9.1
> +##
> +{ 'enum': 'CXLExtSelPolicy',
> + 'data': ['free',
> + 'contiguous',
> + 'prescriptive',
> + 'enable-shared-access']
> +}
> +
> +##
> +# @cxl-add-dynamic-capacity:
> +#
> +# Command to initiate to add dynamic capacity extents to a host. It
"Initiate adding dynamic capacity extents"
When a command initiates something, we commonly need a way to detect
completion, and sometimes need a way to track progress.
How can we detect completion, and if we can't, why's that okay?
Can adding capacity fail after the command succeeded? If yes, how can
we detect that?
How long until completion after the command succeeded? Unbounded time?
> +# simulates operations defined in cxl spec r3.1 7.6.7.6.5.
"defined in the CXL Specification 3.1 section 7.6.7.6.5"
More of the same below, not noting it again.
> +#
> +# @path: CXL DCD canonical QOM path.
Sure the QOM path needs to be canonical?
If not, what about "path to the CXL dynamic capacity device in the QOM
tree". Intentionally close to existing descriptions of @qom-path
elsewhere.
> +#
> +# @host-id: The "Host ID" field as defined in cxl spec r3.1
> +# Table 7-70.
> +#
> +# @selection-policy: The "Selection Policy" bits as defined in
> +# cxl spec r3.1 Table 7-70. It specifies the policy to use for
> +# selecting which extents comprise the added capacity.
> +#
> +# @region: The "Region Number" field as defined in cxl spec r3.1
> +# Table 7-70. The dynamic capacity region where the capacity
> +# is being added. Valid range is from 0-7.
Scratch the second sentence?
> +#
> +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-70.
> +#
> +# @extents: The "Extent List" field as defined in cxl spec r3.1
> +# Table 7-70.
> +#
> +# Since : 9.1
> +##
> +{ 'command': 'cxl-add-dynamic-capacity',
> + 'data': { 'path': 'str',
> + 'host-id': 'uint16',
> + 'selection-policy': 'CXLExtSelPolicy',
> + 'region': 'uint8',
> + '*tag': 'str',
> + 'extents': [ 'CXLDynamicCapacityExtent' ]
> + }
> +}
> +
> +##
> +# @CXLExtRemovalPolicy:
CxlExtentRemovalPolicy
> +#
> +# The policy to use for selecting which extents comprise the released
> +# capacity, defined in the "Flags" field in cxl spec r3.1 Table 7-71.
> +#
> +# @tag-based: value = 0h. Extents are selected by the device based
> +# on tag, with no requirement for contiguous extents.
> +#
> +# @prescriptive: value = 1h. Extent list of capacity to release is
> +# included in the request payload.
I guess "value = ..." documents the numeric value. Sure that's useful
here?
> +#
> +# Since: 9.1
> +##
> +{ 'enum': 'CXLExtRemovalPolicy',
> + 'data': ['tag-based',
> + 'prescriptive']
> +}
> +
> +##
> +# @cxl-release-dynamic-capacity:
> +#
> +# Command to initiate to release dynamic capacity extents from a
"Initiate releasing dynamic capacity extents"
When a command initiates something, we commonly need a way to detect
completion, and sometimes need a way to track progress. See
cxl-add-dynamic-capacity above.
> +# host. It simulates operations defined in cxl spec r3.1 7.6.7.6.6.
> +#
> +# @path: CXL DCD canonical QOM path.
My comment on cxl-add-dynamic-capacity argument @path applies.
> +#
> +# @host-id: The "Host ID" field as defined in cxl spec r3.1
> +# Table 7-71.
> +#
> +# @removal-policy: Bit[3:0] of the "Flags" field as defined in cxl
> +# spec r3.1 Table 7-71.
> +#
> +# @forced-removal: Bit[4] of the "Flags" field in cxl spec r3.1
> +# Table 7-71. When set, device does not wait for a Release
"the device"
> +# Dynamic Capacity command from the host. Host immediately
> +# loses access to released capacity.
"Instead, the host immediately loses"
> +#
> +# @sanitize-on-release: Bit[5] of the "Flags" field in cxl spec r3.1
> +# Table 7-71. When set, device should sanitize all released
"the device"
> +# capacity as a result of this request.
What does it mean "to sanitize capacity"? Is this about scrubbing the
memory?
> +#
> +# @region: The "Region Number" field as defined in cxl spec r3.1
> +# Table 7-71. The dynamic capacity region where the capacity
> +# is being added. Valid range is from 0-7.
My comment on cxl-add-dynamic-capacity argument @region applies.
> +#
> +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
> +#
> +# @extents: The "Extent List" field as defined in cxl spec r3.1
> +# Table 7-71.
> +#
> +# Since : 9.1
> +##
> +{ 'command': 'cxl-release-dynamic-capacity',
> + 'data': { 'path': 'str',
> + 'host-id': 'uint16',
> + 'removal-policy': 'CXLExtRemovalPolicy',
> + '*forced-removal': 'bool',
> + '*sanitize-on-release': 'bool',
> + 'region': 'uint8',
> + '*tag': 'str',
> + 'extents': [ 'CXLDynamicCapacityExtent' ]
> + }
> +}
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
2024-06-04 7:12 ` Markus Armbruster
@ 2024-06-04 11:55 ` Jonathan Cameron
2024-06-04 14:49 ` Markus Armbruster
0 siblings, 1 reply; 28+ messages in thread
From: Jonathan Cameron @ 2024-06-04 11:55 UTC (permalink / raw)
To: Markus Armbruster
Cc: nifan.cxl, qemu-devel, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, mst, Fan Ni, Svetly Todorov
On Tue, 04 Jun 2024 09:12:09 +0200
Markus Armbruster <armbru@redhat.com> wrote:
> nifan.cxl@gmail.com writes:
>
> > From: Fan Ni <fan.ni@samsung.com>
> >
> > To simulate FM functionalities for initiating Dynamic Capacity Add
> > (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec
> > r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue
> > add/release dynamic capacity extents requests.
> >
> > With the change, we allow to release an extent only when its DPA range
> > is contained by a single accepted extent in the device. That is to say,
> > extent superset release is not supported yet.
> >
> > 1. Add dynamic capacity extents:
> >
> > For example, the command to add two continuous extents (each 128MiB long)
> > to region 0 (starting at DPA offset 0) looks like below:
> >
> > { "execute": "qmp_capabilities" }
> >
> > { "execute": "cxl-add-dynamic-capacity",
> > "arguments": {
> > "path": "/machine/peripheral/cxl-dcd0",
> > "host-id": 0,
> > "selection-policy": "prescriptive",
> > "region": 0,
> > "extents": [
> > {
> > "offset": 0,
> > "len": 134217728
> > },
> > {
> > "offset": 134217728,
> > "len": 134217728
> > }
> > ]
> > }
> > }
> >
> > 2. Release dynamic capacity extents:
> >
> > For example, the command to release an extent of size 128MiB from region 0
> > (DPA offset 128MiB) looks like below:
> >
> > { "execute": "cxl-release-dynamic-capacity",
> > "arguments": {
> > "path": "/machine/peripheral/cxl-dcd0",
> > "host-id": 0,
> > "removal-policy":"prescriptive",
> > "region": 0,
> > "extents": [
> > {
> > "offset": 134217728,
> > "len": 134217728
> > }
> > ]
> > }
> > }
> >
> > Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
> > Reviewed-by: Gregory Price <gregory.price@memverge.com>
> > Signed-off-by: Fan Ni <fan.ni@samsung.com>
>
Hi Markus,
Thanks for the detailed review.
Fan is traveling for a few weeks and may have intermittent internet.
He asked me to help with any feedback that came in during this period.
Perhaps at this stage (as Michael has this queued) the best bet is a
follow-on patch tweaking things. The blast radius is more or less
contained to the qmp file, subject to a few parameter type changes.
I'd be keen on this approach if possible because that lets me start
attacking the annoyingly large queue of stuff dependent on this series
in parallel with improving this aspect.
Proposed draft patch at end of this email and responses to individual
comments inline.
I'll do a separate patch in response to your suggestion to mark the
two interfaces unstable. For now it seems there is little disadvantage
in doing so, as I assume there is nothing stopping us removing
that marking in a cycle or two if things look stable.
> [...]
>
> > diff --git a/qapi/cxl.json b/qapi/cxl.json
> > index 4281726dec..57d9f82014 100644
> > --- a/qapi/cxl.json
> > +++ b/qapi/cxl.json
> > @@ -361,3 +361,146 @@
> > ##
> > {'command': 'cxl-inject-correctable-error',
> > 'data': {'path': 'str', 'type': 'CxlCorErrorType'}}
> > +
> > +##
> > +# @CXLDynamicCapacityExtent:
>
> Three existing type names start with Cxl, and only one starts with CXL.
> Please make your new ones start with Cxl, not CXL:
> CxlDynamicCapacityExtent.
Ok.
>
> > +#
> > +# A single dynamic capacity extent
> > +#
> > +# @offset: The offset (in bytes) to the start of the region
> > +# where the extent belongs to.
> > +#
> > +# @len: The length of the extent in bytes.
>
> What is this? Memory?
Yes. Probably makes more sense to add to the initial description rather
than down here.
# A single dynamic capacity extent. This is a contiguous allocation
# of memory by Device Physical Address within a single Dynamic Capacity
# Region on a CXL Type 3 device.
This is all a bit of a balance between not quoting large chunks of
the specification and providing enough detail here.
Reality is that people who don't know what this is, won't use this
interface. We can add some additional documentation to introduce
all the concepts but it probably doesn't make sense to do so here.
>
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'struct': 'CXLDynamicCapacityExtent',
> > + 'data': {
> > + 'offset':'uint64',
> > + 'len': 'uint64'
> > + }
> > +}
> > +
> > +##
> > +# @CXLExtSelPolicy:
>
> CxlExtentSelectionPolicy
>
> > +#
> > +# The policy to use for selecting which extents comprise the added
> > +# capacity, as defined in cxl spec r3.1 Table 7-70.
>
> Use the official title: "as defined in the CXL Specification 3.1" (I
> think, the actual document is behind a click-through agreement).
Sadly not that simple, hence the desire for an abbreviation. Should be
Compute Express Link (CXL) Specification, Revision 3.1, Version 1.0
Can drop the Version 1.0 (as there have never been other versions and
probably won't be) but the Revision part matters (unfortunately)
hence the r in the above.
Note that we've used CXL r3.0 etc in previous QMP docs for this. Perhaps
we should just stick to that and rely on the reference in
docs/system/devices/cxl.rst as the canonical reference.
For now I'll go with the (almost) full form here as it's never wrong to
spell it out. So all the new references will be to
Compute Express Link (CXL) Specification, Revision 3.1, Section xxxx
>
> > +#
> > +# @free: 0h = Free
> > +#
> > +# @contiguous: 1h = Continuous
>
> What does "1h =" mean? The numeric encoding?
Alignment with spec, but doesn't need to be here so removed.
>
> What exactly is "contiguous" / "continuous"? I figure it's clear enough
> if you have the CXL spec open in another window. Can we condense it
> into one phrase for use here?
@free: Device is responsible for allocating the requested memory
capacity and is free to do this using any combination of
supported extents.
@contiguous: Device is responsible for allocating the requested
memory capacity but must do so as a single contiguous
extent.
@prescriptive: The precise set of extents to be allocated is specified
by the command. Thus allocation is being managed by the
issuer of the allocation command, not the device.
@enable-shared-access: Capacity has already been allocated to a
different host using free, contiguous or prescriptive methods with
a known tag. This policy then instructs the device to make the
capacity with the specified tag available to an additional host.
Capacity is implicit as it matches that already associated with the
tag. Note that the extent list (and hence DPAs) used are per host,
so a device may use different representations on each host. The
ordering of the extents provided to each host is indicated to the
host using per extent sequence numbers generated by the device.
Has a similar meaning for temporal sharing but in that case there
may be only one host involved.
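To make the split between device-managed and issuer-managed allocation
concrete, here is a small C sketch. The enum and helper names are
hypothetical; the numeric encodings are the 0h..3h values from the quoted
doc comments, and the helper only restates the descriptions above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Encodings as listed in the quoted doc comments (0h..3h). */
enum sel_policy {
    SEL_FREE = 0x0,
    SEL_CONTIGUOUS = 0x1,
    SEL_PRESCRIPTIVE = 0x2,
    SEL_ENABLE_SHARED_ACCESS = 0x3,
};

/* The policy value has to fit in a small bit field. */
static_assert(SEL_ENABLE_SHARED_ACCESS == 3, "unexpected policy encoding");

/* QMP spelling of each policy, or NULL for an unknown value. */
static const char *sel_policy_name(enum sel_policy p)
{
    switch (p) {
    case SEL_FREE:                 return "free";
    case SEL_CONTIGUOUS:           return "contiguous";
    case SEL_PRESCRIPTIVE:         return "prescriptive";
    case SEL_ENABLE_SHARED_ACCESS: return "enable-shared-access";
    default:                       return NULL;
    }
}

/*
 * True when the device, not the issuer of the command, decides which
 * extents make up the added capacity.
 */
static bool device_picks_extents(enum sel_policy p)
{
    return p == SEL_FREE || p == SEL_CONTIGUOUS;
}
```

For prescriptive the extents come from the command itself, and for
enable-shared-access the capacity is implied by the tag, so neither counts
as device-selected here.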
>
> > +#
> > +# @prescriptive: 2h = Prescriptive
> > +#
> > +# @enable-shared-access: 3h = Enable Shared Access
>
> Similar questions.
>
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'enum': 'CXLExtSelPolicy',
> > + 'data': ['free',
> > + 'contiguous',
> > + 'prescriptive',
> > + 'enable-shared-access']
> > +}
> > +
> > +##
> > +# @cxl-add-dynamic-capacity:
> > +#
> > +# Command to initiate to add dynamic capacity extents to a host. It
>
> "Initiate adding dynamic capacity extents"
Done.
>
> When a command initiates something, we commonly need a way to detect
> completion, and sometimes need a way to track progress.
>
> How can we detect completion, and if we can't, why's that okay?
>
> Can adding capacity fail after the command succeeded? If yes, how can
> we detect that?
The full flow can fail, in the sense that the host can reject the offered
capacity.
This command just initiates the flow.
Today we can't detect it via QMP. There are a couple of options but I
think they are out of scope for this document (for now).
There are a lot more DCD features to come and I'd include a
resolution to this aspect as one of those. Aim today is just
to get to the point where we can test the OS handling - other
cases like virtualization of this require a lot more infrastructure
on top of what we have here.
So likely options:
* The 'fabric manager' will have an out of band path to the OS as it
doesn't spontaneously decide to offer capacity - that happens
because an orchestrator (think kubernetes or similar) has told a
host to bring up an application that needs this extra capacity.
That path would typically include an acknowledgment that the capacity
has turned up and the host can run what it was asked to run.
There is an inband path for a real fabric manager interface that
we don't yet have an equivalent of in QEMU. An earlier version
of this patch set provided a hacky equivalent so was dropped.
That path is the Fabric Manager side Dynamic Capacity Event Log
which has events for this
0x4 Add Capacity Response:
" The host has responded to the Add Capacity event and the Dynamic
Capacity Extent field in this structure specifies the capacity
accepted by the host. This event shall only be reported
to the FM"
0x5 is the similar one for release.
So long term there is probably a need for a reporting interface
but lots more to do in general and I think this is functional
without that. For now I think all we can do is document that
discovering success must be done via an out of band interface.
I've added:
" Note that, currently, establishing success or failure of the full Add Dynamic
Capacity flow requires out of band communication with the OS of
the CXL host."
Does that work for now? We will have to remember to update if/when
we add a way to query this.
It is also clear we could benefit from some additional documentation
in cxl.rst. That's a job for another day however - for now, to
get the details users will have to read the CXL specification or
at least watch a bunch of conference videos and webinars.
>
> How long until completion after the command succeeded? Unbounded time?
Depends on the host, and indeed unbounded - ultimately there is an abort
path (forced removal later in this doc) but it is sometimes fatal for the
OS running and only meant for the case where the host OS crashed.
Not many operating systems play well with forced removal of memory and due
to a race condition it may look like that to the host. So basically
it's a 'don't use this' kind of hardware feature.
However it's not that QEMU is waiting for it beyond having some tracking
structures allocated that are not freed until the flow has finished.
This is very much an asynchronous flow.
>
> > +# simulates operations defined in cxl spec r3.1 7.6.7.6.5.
>
> "defined in the CXL Specification 3.1 section 7.6.7.6.5"
>
> More of the same below, not noting it again.
Sure. Hopefully fixed throughout the new text. I've not taken
on the existing cases today.
>
> > +#
> > +# @path: CXL DCD canonical QOM path.
>
> Sure the QOM path needs to be canonical?
>
> If not, what about "path to the CXL dynamic capacity device in the QOM
> tree". Intentionally close to existing descriptions of @qom-path
> elsewhere.
That text LGTM. I'll focus only on new cases of this for an initial
patch but there are a load of other cases of this text that will
want updating separately.
>
> > +#
> > +# @host-id: The "Host ID" field as defined in cxl spec r3.1
> > +# Table 7-70.
> > +#
> > +# @selection-policy: The "Selection Policy" bits as defined in
> > +# cxl spec r3.1 Table 7-70. It specifies the policy to use for
> > +# selecting which extents comprise the added capacity.
> > +#
> > +# @region: The "Region Number" field as defined in cxl spec r3.1
> > +# Table 7-70. The dynamic capacity region where the capacity
> > +# is being added. Valid range is from 0-7.
>
> Scratch the second sentence?
Sure, I guess because nearly everything else is just a spec reference
and this isn't adding enough info to be useful?
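As an aside on what stays in the text: the "Valid range is from 0-7" remark is the sort of thing a caller can check before issuing the command. A rough sketch, assuming a caller-side helper (dc_request_is_sane and struct dc_extent are hypothetical names, not QEMU code). The zero-length and wrap-around checks are my own assumptions about what a sensible request looks like, not something the schema states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DC_REGION_MAX 7 /* "Valid range is from 0-7" */

/* Mirrors the offset/len pair of the QAPI extent type (hypothetical C form). */
struct dc_extent {
    uint64_t offset; /* bytes from the start of the region */
    uint64_t len;    /* length in bytes */
};

/*
 * Basic sanity checks on a prescriptive request.
 * Assumptions (not from the schema): extents must be non-empty and
 * offset + len must not wrap around.
 */
static bool dc_request_is_sane(uint8_t region,
                               const struct dc_extent *extents,
                               size_t count)
{
    assert(extents != NULL || count == 0);

    if (region > DC_REGION_MAX) {
        return false;
    }
    for (size_t i = 0; i < count; i++) {
        if (extents[i].len == 0) {
            return false;
        }
        if (extents[i].offset > UINT64_MAX - extents[i].len) {
            return false;
        }
    }
    return true;
}
```

Whether the extents actually fit inside the region is a separate check that needs the region size from the device.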
>
> > +#
> > +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-70.
> > +#
> > +# @extents: The "Extent List" field as defined in cxl spec r3.1
> > +# Table 7-70.
> > +#
> > +# Since : 9.1
> > +##
> > +{ 'command': 'cxl-add-dynamic-capacity',
> > + 'data': { 'path': 'str',
> > + 'host-id': 'uint16',
> > + 'selection-policy': 'CXLExtSelPolicy',
> > + 'region': 'uint8',
> > + '*tag': 'str',
> > + 'extents': [ 'CXLDynamicCapacityExtent' ]
> > + }
> > +}
> > +
> > +##
> > +# @CXLExtRemovalPolicy:
>
> CxlExtentRemovalPolicy
Done this and similar.
>
> > +#
> > +# The policy to use for selecting which extents comprise the released
> > +# capacity, defined in the "Flags" field in cxl spec r3.1 Table 7-71.
> > +#
> > +# @tag-based: value = 0h. Extents are selected by the device based
> > +# on tag, with no requirement for contiguous extents.
> > +#
> > +# @prescriptive: value = 1h. Extent list of capacity to release is
> > +# included in the request payload.
>
> I guess "value = ..." documents the numeric value. Sure that's useful
> here?
Dropped as not useful here.
>
> > +#
> > +# Since: 9.1
> > +##
> > +{ 'enum': 'CXLExtRemovalPolicy',
> > + 'data': ['tag-based',
> > + 'prescriptive']
> > +}
> > +
> > +##
> > +# @cxl-release-dynamic-capacity:
> > +#
> > +# Command to initiate to release dynamic capacity extents from a
>
> "Initiate releasing dynamic capacity extents"
>
> When a command initiates something, we commonly need a way to detect
> completion, and sometimes need a way to track progress. See
> cxl-add-dynamic-capacity above.
>
Effectively same reply. Today you can only do this via out of band
comms with the host. We have quite a lot more to add before we
can report this via QMP. This is very much part 1 of DCD support,
I'd expect us to be still adding features in a year or more.
I'll add similar text to that proposed for the add path.
...
>
> > +# capacity as a result of this request.
>
> What does it mean "to sanitize capacity"? Is this about scrubbing the
> memory?
For one meaning of scrubbing. Not the one that is normally applied to
memory which is patrol scrub / ECC error detection and correction and
subject to a long kernel mailing list thread at the moment and another
QEMU patch set on my queue.
Why can't we have a dictionary of canonical terms? Ah well.
Added a slightly shortened quote from the CXL spec.
"This ensures that all user data and metadata is made permanently
unavailable by whatever means is appropriate for the media type.
Note that changing encryption keys is not sufficient."
The last bit is because we will shortly have secure erase support
via another patch set and in that case changing encryption keys is
sufficient.
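For anyone trying to map the three release arguments onto the spec, here is a small sketch of how they land in the "Flags" field as the schema comments describe it: removal policy in Bit[3:0], forced removal in Bit[4], sanitize on release in Bit[5]. The policy encodings 0h and 1h are the ones from the original doc comments. The helper and macro names are made up for illustration, and treating Flags as a single byte is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Removal policy encodings from the original doc comments (0h and 1h). */
enum removal_policy {
    REMOVAL_POLICY_TAG_BASED    = 0x0,
    REMOVAL_POLICY_PRESCRIPTIVE = 0x1,
};

#define RELEASE_FLAG_POLICY_MASK 0x0Fu     /* Bit[3:0] */
#define RELEASE_FLAG_FORCED      (1u << 4) /* Bit[4] */
#define RELEASE_FLAG_SANITIZE    (1u << 5) /* Bit[5] */

/* Pack the three QMP arguments into one flags byte (width assumed). */
static uint8_t release_flags_pack(enum removal_policy policy,
                                  bool forced_removal,
                                  bool sanitize_on_release)
{
    uint8_t flags;

    /* The policy must fit in the low four bits. */
    assert(((unsigned)policy & ~RELEASE_FLAG_POLICY_MASK) == 0);

    flags = (uint8_t)policy & RELEASE_FLAG_POLICY_MASK;
    if (forced_removal) {
        flags |= RELEASE_FLAG_FORCED;
    }
    if (sanitize_on_release) {
        flags |= RELEASE_FLAG_SANITIZE;
    }
    return flags;
}
```

So a plain prescriptive release packs to 0x01, and adding sanitize-on-release sets bit 5 on top of that.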
>
> > +#
> > +# @region: The "Region Number" field as defined in cxl spec r3.1
> > +# Table 7-71. The dynamic capacity region where the capacity
> > +# is being added. Valid range is from 0-7.
>
> My comment on cxl-add-dynamic-capacity argument @region applies.
"The dynamic capacity region where the capacity is being added."
sentence dropped.
>
> > +#
> > +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
> > +#
> > +# @extents: The "Extent List" field as defined in cxl spec r3.1
> > +# Table 7-71.
> > +#
> > +# Since : 9.1
> > +##
> > +{ 'command': 'cxl-release-dynamic-capacity',
> > + 'data': { 'path': 'str',
> > + 'host-id': 'uint16',
> > + 'removal-policy': 'CXLExtRemovalPolicy',
> > + '*forced-removal': 'bool',
> > + '*sanitize-on-release': 'bool',
> > + 'region': 'uint8',
> > + '*tag': 'str',
> > + 'extents': [ 'CXLDynamicCapacityExtent' ]
> > + }
> > +}
>
So with all that incorporated, what I currently have is:
[PATCH] hw/cxl/events: Improve QMP interfaces and documentation for add/release dynamic capacity.
New DCD command definitions updated in response to review comments
from Markus.
- Used CxlXXXX instead of CXLXXXXX for newly added types.
- Expanded some abbreviations in type names to be easier to read.
- Additional documentation for some fields.
- Replace slightly vague cxl r3.1 references with
"Compute Express Link (CXL) Specification, Revision 3.1, XXXX"
to bring them inline with what it says on the specification cover.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
I can break this up into separate patches, but that's going to be
quite a lot of churn as often multiple of the above affect the same
paragraph.
---
qapi/cxl.json | 152 ++++++++++++++++++++++++---------------
hw/mem/cxl_type3.c | 18 ++---
hw/mem/cxl_type3_stubs.c | 8 +--
3 files changed, 107 insertions(+), 71 deletions(-)
diff --git a/qapi/cxl.json b/qapi/cxl.json
index 57d9f82014..a38622a0d1 100644
--- a/qapi/cxl.json
+++ b/qapi/cxl.json
@@ -363,9 +363,11 @@
'data': {'path': 'str', 'type': 'CxlCorErrorType'}}
##
-# @CXLDynamicCapacityExtent:
+# @CxlDynamicCapacityExtent:
#
-# A single dynamic capacity extent
+# A single dynamic capacity extent. This is a contiguous allocation
+# of memory by Device Physical Address within a single Dynamic
+# Capacity Region on a CXL Type 3 Device.
#
# @offset: The offset (in bytes) to the start of the region
# where the extent belongs to.
@@ -374,7 +376,7 @@
#
# Since: 9.1
##
-{ 'struct': 'CXLDynamicCapacityExtent',
+{ 'struct': 'CxlDynamicCapacityExtent',
'data': {
'offset':'uint64',
'len': 'uint64'
@@ -382,22 +384,40 @@
}
##
-# @CXLExtSelPolicy:
+# @CxlExtentSelectionPolicy:
#
# The policy to use for selecting which extents comprise the added
-# capacity, as defined in cxl spec r3.1 Table 7-70.
-#
-# @free: 0h = Free
-#
-# @contiguous: 1h = Continuous
-#
-# @prescriptive: 2h = Prescriptive
-#
-# @enable-shared-access: 3h = Enable Shared Access
+# capacity, as defined in Compute Express Link (CXL) Specification,
+# Revision 3.1, Table 7-70.
+#
+# @free: Device is responsible for allocating the requested memory
+# capacity and is free to do this using any combination of
+# supported extents.
+#
+# @contiguous: Device is responsible for allocating the requested
+# memory capacity but must do so as a single contiguous
+# extent.
+#
+# @prescriptive: The precise set of extents to be allocated is
+# specified by the command. Thus allocation is being managed
+# by the issuer of the allocation command, not the device.
+#
+# @enable-shared-access: Capacity has already been allocated to a
+# different host using free, contiguous or prescriptive policy
+# with a known tag. This policy then instructs the device to
+# make the capacity with the specified tag available to an
+# additional host. Capacity is implicit as it matches that
+# already associated with the tag. Note that the extent list
+# (and hence Device Physical Addresses) used are per host, so
+# a device may use different representations on each host.
+# The ordering of the extents provided to each host is indicated
+# to the host using per extent sequence numbers generated by
+# the device. Has a similar meaning for temporal sharing, but
+# in that case there may be only one host involved.
#
# Since: 9.1
##
-{ 'enum': 'CXLExtSelPolicy',
+{ 'enum': 'CxlExtentSelectionPolicy',
'data': ['free',
'contiguous',
'prescriptive',
@@ -407,54 +427,60 @@
##
# @cxl-add-dynamic-capacity:
#
-# Command to initiate to add dynamic capacity extents to a host. It
-# simulates operations defined in cxl spec r3.1 7.6.7.6.5.
+# Initiate adding dynamic capacity extents to a host. This simulates
+# operations defined in Compute Express Link (CXL) Specification,
+# Revision 3.1, Section 7.6.7.6.5. Note that, currently, establishing
+# success or failure of the full Add Dynamic Capacity flow requires
+# out of band communication with the OS of the CXL host.
#
-# @path: CXL DCD canonical QOM path.
+# @path: path to the CXL Dynamic Capacity Device in the QOM tree.
#
-# @host-id: The "Host ID" field as defined in cxl spec r3.1
-# Table 7-70.
+# @host-id: The "Host ID" field as defined in Compute Express Link
+# (CXL) Specification, Revision 3.1, Table 7-70.
#
# @selection-policy: The "Selection Policy" bits as defined in
-# cxl spec r3.1 Table 7-70. It specifies the policy to use for
-# selecting which extents comprise the added capacity.
+# Compute Express Link (CXL) Specification, Revision 3.1,
+# Table 7-70. It specifies the policy to use for selecting
+# which extents comprise the added capacity.
#
-# @region: The "Region Number" field as defined in cxl spec r3.1
-# Table 7-70. The dynamic capacity region where the capacity
-# is being added. Valid range is from 0-7.
+# @region: The "Region Number" field as defined in Compute Express
+# Link (CXL) Specification, Revision 3.1, Table 7-70. Valid
+# range is from 0-7.
#
-# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-70.
+# @tag: The "Tag" field as defined in Compute Express Link (CXL)
+# Specification, Revision 3.1, Table 7-70.
#
-# @extents: The "Extent List" field as defined in cxl spec r3.1
-# Table 7-70.
+# @extents: The "Extent List" field as defined in Compute Express Link
+# (CXL) Specification, Revision 3.1, Table 7-70.
#
# Since : 9.1
##
{ 'command': 'cxl-add-dynamic-capacity',
'data': { 'path': 'str',
'host-id': 'uint16',
- 'selection-policy': 'CXLExtSelPolicy',
+ 'selection-policy': 'CxlExtentSelectionPolicy',
'region': 'uint8',
'*tag': 'str',
- 'extents': [ 'CXLDynamicCapacityExtent' ]
+ 'extents': [ 'CxlDynamicCapacityExtent' ]
}
}
##
-# @CXLExtRemovalPolicy:
+# @CxlExtentRemovalPolicy:
#
# The policy to use for selecting which extents comprise the released
-# capacity, defined in the "Flags" field in cxl spec r3.1 Table 7-71.
+# capacity, defined in the "Flags" field in Compute Express Link (CXL)
+# Specification, Revision 3.1, Table 7-71.
#
-# @tag-based: value = 0h. Extents are selected by the device based
-# on tag, with no requirement for contiguous extents.
+# @tag-based: Extents are selected by the device based on tag, with
+# no requirement for contiguous extents.
#
-# @prescriptive: value = 1h. Extent list of capacity to release is
-# included in the request payload.
+# @prescriptive: Extent list of capacity to release is included in
+# the request payload.
#
# Since: 9.1
##
-{ 'enum': 'CXLExtRemovalPolicy',
+{ 'enum': 'CxlExtentRemovalPolicy',
'data': ['tag-based',
'prescriptive']
}
@@ -462,45 +488,55 @@
##
# @cxl-release-dynamic-capacity:
#
-# Command to initiate to release dynamic capacity extents from a
-# host. It simulates operations defined in cxl spec r3.1 7.6.7.6.6.
+# Initiate release of dynamic capacity extents from a host. This
+# simulates operations defined in Compute Express Link (CXL)
+# Specification, Revision 3.1, Section 7.6.7.6.6. Note that,
+# currently, success or failure of the full Release Dynamic Capacity
+# flow requires out of band communication with the OS of the CXL host.
#
-# @path: CXL DCD canonical QOM path.
+# @path: path to the CXL Dynamic Capacity Device in the QOM tree.
#
-# @host-id: The "Host ID" field as defined in cxl spec r3.1
-# Table 7-71.
+# @host-id: The "Host ID" field as defined in Compute Express Link
+# (CXL) Specification, Revision 3.1, Table 7-71.
#
-# @removal-policy: Bit[3:0] of the "Flags" field as defined in cxl
-# spec r3.1 Table 7-71.
+# @removal-policy: Bit[3:0] of the "Flags" field as defined in
+# Compute Express Link (CXL) Specification, Revision 3.1,
+# Table 7-71.
#
-# @forced-removal: Bit[4] of the "Flags" field in cxl spec r3.1
-# Table 7-71. When set, device does not wait for a Release
-# Dynamic Capacity command from the host. Host immediately
-# loses access to released capacity.
+# @forced-removal: Bit[4] of the "Flags" field in Compute Express
+# Link (CXL) Specification, Revision 3.1, Table 7-71. When set,
+# the device does not wait for a Release Dynamic Capacity command
+# from the host. Instead, the host immediately loses access to
+# the released capacity.
#
-# @sanitize-on-release: Bit[5] of the "Flags" field in cxl spec r3.1
-# Table 7-71. When set, device should sanitize all released
-# capacity as a result of this request.
+# @sanitize-on-release: Bit[5] of the "Flags" field in Compute
+# Express Link (CXL) Specification, Revision 3.1, Table 7-71.
+# When set, the device should sanitize all released capacity as
+# a result of this request. This ensures that all user data
+# and metadata is made permanently unavailable by whatever
+# means is appropriate for the media type. Note that changing
+# encryption keys is not sufficient.
#
-# @region: The "Region Number" field as defined in cxl spec r3.1
-# Table 7-71. The dynamic capacity region where the capacity
-# is being added. Valid range is from 0-7.
+# @region: The "Region Number" field as defined in Compute Express
+# Link (CXL) Specification, Revision 3.1, Table 7-71. Valid range
+# is from 0-7.
#
-# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
+# @tag: The "Tag" field as defined in Compute Express Link (CXL)
+# Specification, Revision 3.1, Table 7-71.
#
-# @extents: The "Extent List" field as defined in cxl spec r3.1
-# Table 7-71.
+# @extents: The "Extent List" field as defined in Compute Express
+# Link (CXL) Specification, Revision 3.1, Table 7-71.
#
# Since : 9.1
##
{ 'command': 'cxl-release-dynamic-capacity',
'data': { 'path': 'str',
'host-id': 'uint16',
- 'removal-policy': 'CXLExtRemovalPolicy',
+ 'removal-policy': 'CxlExtentRemovalPolicy',
'*forced-removal': 'bool',
'*sanitize-on-release': 'bool',
'region': 'uint8',
'*tag': 'str',
- 'extents': [ 'CXLDynamicCapacityExtent' ]
+ 'extents': [ 'CxlDynamicCapacityExtent' ]
}
}
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 284db94182..2242986d8b 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1873,7 +1873,7 @@ static bool cxl_extent_groups_overlaps_dpa_range(CXLDCExtentGroupList *list,
*/
static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
uint16_t hid, CXLDCEventType type, uint8_t rid,
- CXLDynamicCapacityExtentList *records, Error **errp)
+ CxlDynamicCapacityExtentList *records, Error **errp)
{
Object *obj;
CXLEventDynamicCapacity dCap = {};
@@ -1881,7 +1881,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
CXLType3Dev *dcd;
uint8_t flags = 1 << CXL_EVENT_TYPE_INFO;
uint32_t num_extents = 0;
- CXLDynamicCapacityExtentList *list;
+ CxlDynamicCapacityExtentList *list;
CXLDCExtentGroup *group = NULL;
g_autofree CXLDCExtentRaw *extents = NULL;
uint8_t enc_log = CXL_EVENT_TYPE_DYNAMIC_CAP;
@@ -2031,13 +2031,13 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
}
void qmp_cxl_add_dynamic_capacity(const char *path, uint16_t host_id,
- CXLExtSelPolicy sel_policy, uint8_t region,
- const char *tag,
- CXLDynamicCapacityExtentList *extents,
+ CxlExtentSelectionPolicy sel_policy,
+ uint8_t region, const char *tag,
+ CxlDynamicCapacityExtentList *extents,
Error **errp)
{
switch (sel_policy) {
- case CXL_EXT_SEL_POLICY_PRESCRIPTIVE:
+ case CXL_EXTENT_SELECTION_POLICY_PRESCRIPTIVE:
qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id,
DC_EVENT_ADD_CAPACITY,
region, extents, errp);
@@ -2049,14 +2049,14 @@ void qmp_cxl_add_dynamic_capacity(const char *path, uint16_t host_id,
}
void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
- CXLExtRemovalPolicy removal_policy,
+ CxlExtentRemovalPolicy removal_policy,
bool has_forced_removal,
bool forced_removal,
bool has_sanitize_on_release,
bool sanitize_on_release,
uint8_t region,
const char *tag,
- CXLDynamicCapacityExtentList *extents,
+ CxlDynamicCapacityExtentList *extents,
Error **errp)
{
CXLDCEventType type = DC_EVENT_RELEASE_CAPACITY;
@@ -2069,7 +2069,7 @@ void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
}
switch (removal_policy) {
- case CXL_EXT_REMOVAL_POLICY_PRESCRIPTIVE:
+ case CXL_EXTENT_REMOVAL_POLICY_PRESCRIPTIVE:
qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id, type,
region, extents, errp);
return;
diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
index 45419bbefe..c1a5e4a7c1 100644
--- a/hw/mem/cxl_type3_stubs.c
+++ b/hw/mem/cxl_type3_stubs.c
@@ -70,24 +70,24 @@ void qmp_cxl_inject_correctable_error(const char *path, CxlCorErrorType type,
void qmp_cxl_add_dynamic_capacity(const char *path,
uint16_t host_id,
- CXLExtSelPolicy sel_policy,
+ CxlExtentSelectionPolicy sel_policy,
uint8_t region,
const char *tag,
- CXLDynamicCapacityExtentList *extents,
+ CxlDynamicCapacityExtentList *extents,
Error **errp)
{
error_setg(errp, "CXL Type 3 support is not compiled in");
}
void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
- CXLExtRemovalPolicy removal_policy,
+ CxlExtentRemovalPolicy removal_policy,
bool has_forced_removal,
bool forced_removal,
bool has_sanitize_on_release,
bool sanitize_on_release,
uint8_t region,
const char *tag,
- CXLDynamicCapacityExtentList *extents,
+ CxlDynamicCapacityExtentList *extents,
Error **errp)
{
error_setg(errp, "CXL Type 3 support is not compiled in");
* Re: [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
2024-06-04 11:55 ` Jonathan Cameron
@ 2024-06-04 14:49 ` Markus Armbruster
0 siblings, 0 replies; 28+ messages in thread
From: Markus Armbruster @ 2024-06-04 14:49 UTC (permalink / raw)
To: Jonathan Cameron
Cc: nifan.cxl, qemu-devel, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, mst, Fan Ni, Svetly Todorov
Jonathan Cameron <Jonathan.Cameron@Huawei.com> writes:
> On Tue, 04 Jun 2024 09:12:09 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> nifan.cxl@gmail.com writes:
>>
>> > From: Fan Ni <fan.ni@samsung.com>
>> >
>> > To simulate FM functionalities for initiating Dynamic Capacity Add
>> > (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec
>> > r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue
>> > add/release dynamic capacity extents requests.
>> >
>> > With the change, we allow to release an extent only when its DPA range
>> > is contained by a single accepted extent in the device. That is to say,
>> > extent superset release is not supported yet.
>> >
>> > 1. Add dynamic capacity extents:
>> >
>> > For example, the command to add two continuous extents (each 128MiB long)
>> > to region 0 (starting at DPA offset 0) looks like below:
>> >
>> > { "execute": "qmp_capabilities" }
>> >
>> > { "execute": "cxl-add-dynamic-capacity",
>> > "arguments": {
>> > "path": "/machine/peripheral/cxl-dcd0",
>> > "host-id": 0,
>> > "selection-policy": "prescriptive",
>> > "region": 0,
>> > "extents": [
>> > {
>> > "offset": 0,
>> > "len": 134217728
>> > },
>> > {
>> > "offset": 134217728,
>> > "len": 134217728
>> > }
>> > ]
>> > }
>> > }
>> >
>> > 2. Release dynamic capacity extents:
>> >
>> > For example, the command to release an extent of size 128MiB from region 0
>> > (DPA offset 128MiB) looks like below:
>> >
>> > { "execute": "cxl-release-dynamic-capacity",
>> > "arguments": {
>> > "path": "/machine/peripheral/cxl-dcd0",
>> > "host-id": 0,
>> > "removal-policy":"prescriptive",
>> > "region": 0,
>> > "extents": [
>> > {
>> > "offset": 134217728,
>> > "len": 134217728
>> > }
>> > ]
>> > }
>> > }
>> >
>> > Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
>> > Reviewed-by: Gregory Price <gregory.price@memverge.com>
>> > Signed-off-by: Fan Ni <fan.ni@samsung.com>
>>
>
> Hi Markus,
>
> Thanks for the detailed review.
>
> Fan is traveling for a few weeks and may have intermittent internet.
> He asked me to help with any feedback that came in during this period.
>
> Perhaps at this stage (as Michael has this queued) best bet is a follow on patch
> tweaking things. The blast radius is more or less contained to the
> qmp file subject to a few parameter type changes. I'd be keen on this
> approach if possible because that lets me start attacking the annoyingly
> large queue of stuff dependent on this series in parallel with
> improving this aspect.
Sacrifices git history tidiness for development velocity. Judgement
call.
> Proposed draft patch at end of this email and responses to individual
> comments inline.
>
> I'll do a separate patch in response to your suggestion to mark the
> two interfaces unstable. For now seems there is little disadvantage
> in doing so as I assume there is nothing stopping us removing
> that marking in a cycle or two if things look stable.
We can certainly make things stable when we're reasonably convinced they
are, and have a need for it.
>> [...]
>>
>> > diff --git a/qapi/cxl.json b/qapi/cxl.json
>> > index 4281726dec..57d9f82014 100644
>> > --- a/qapi/cxl.json
>> > +++ b/qapi/cxl.json
>> > @@ -361,3 +361,146 @@
>> > ##
>> > {'command': 'cxl-inject-correctable-error',
>> > 'data': {'path': 'str', 'type': 'CxlCorErrorType'}}
>> > +
>> > +##
>> > +# @CXLDynamicCapacityExtent:
>>
>> Three existing type names start with Cxl, and only one starts with CXL.
>> Please make your new ones start with Cxl, not CXL:
>> CxlDynamicCapacityExtent.
> Ok.
>>
>> > +#
>> > +# A single dynamic capacity extent
>> > +#
>> > +# @offset: The offset (in bytes) to the start of the region
>> > +# where the extent belongs to.
>> > +#
>> > +# @len: The length of the extent in bytes.
>>
>> What is this? Memory?
>
> Yes. Probably makes more sense to add to the initial description rather
> than down here.
>
> # A single dynamic capacity extent. This is a contiguous allocation
> # of memory by Device Physical Address within a single Dynamic Capacity
> # Region on a CXL Type 3 device.
Yes, that's better.
> This is all a bit of a balance between not quoting large chunks of
> the specification and providing enough detail here.
Yes.
> Reality is that people who don't know what this is, won't use this
> interface. We can add some additional documentation to introduce
> all the concepts but it probably doesn't make sense to do so here.
I suggest trying to combine references to the spec with just enough
explanation to serve as reminders for the people familiar with this
stuff, and maybe even as terse overview for the rest of us.
>> > +#
>> > +# Since: 9.1
>> > +##
>> > +{ 'struct': 'CXLDynamicCapacityExtent',
>> > + 'data': {
>> > + 'offset':'uint64',
>> > + 'len': 'uint64'
>> > + }
>> > +}
>> > +
>> > +##
>> > +# @CXLExtSelPolicy:
>>
>> CxlExtentSelectionPolicy
>>
>> > +#
>> > +# The policy to use for selecting which extents comprise the added
>> > +# capacity, as defined in cxl spec r3.1 Table 7-70.
>>
>> Use the official title: "as defined in the CXL Specification 3.1" (I
>> think, the actual document is behind a click-through agreement).
>
> Sadly not that simple, hence the desire for an abbreviation. Should be
>
> Compute Express Link (CXL) Specification, Revision 3.1, Version 1.0
>
> Can drop the Version 1.0 (as there have never been other versions and
> probably won't be) but the Revision part matters (unfortunately)
> hence the r in the above.
>
> Note that we've used CXL r3.0 etc in previous QMP docs for this. Perhaps
> just sticking to that and relying on the reference in
> docs/system/devices/cxl.rst for the canonical reference.
>
> For now I'll go with the (almost) full form here as it's never wrong to
> spell it out. So all the new references will be to
> Compute Express Link (CXL) Specification, Revision 3.1, Section xxxx
Abbreviating a long title is okay as long as the full title is still easy
enough to find. But always abbreviate the same way, please.
>> > +#
>> > +# @free: 0h = Free
>> > +#
>> > +# @contiguous: 1h = Continuous
>>
>> What does "1h =" mean? The numeric encoding?
> Alignment with spec, but doesn't need to be here so removed.
>
>>
>> What exactly is "contiguous" / "continuous"? I figure it's clear enough
>> if you have the CXL spec open in another window. Can we condense it
>> into one phrase for use here?
>
> @free: Device is responsible for allocating the requested memory
> capacity and is free to do this using any combination of
> supported extents.
>
> @contiguous: Device is responsible for allocating the requested
> memory capacity but must do so as a single contiguous
> extent.
>
> @prescriptive: The precise set of extents to be allocated is specified
> by the command. Thus allocation is being managed by the
> issuer of the allocation command, not the device.
>
> @enable-shared-access: Capacity has already been allocated to a
> different host using free, contiguous or prescriptive methods with
> a known tag. This policy then instructs the device to make the
> capacity with the specified tag available to an additional host.
> Capacity is implicit as it matches that already associated with the
> tag. Note that the extent list (and hence DPAs)
> used are per host, so a device may use different representations
> on each host. The ordering of the extents provided to each host
> is indicated to the host using per extent sequence numbers generated
> by the device. Has a similar
> meaning for temporal sharing but in that case there may be only
> one host involved.
Better.
Feel free to omit some of the detail from the last one.
>> > +#
>> > +# @prescriptive: 2h = Prescriptive
>> > +#
>> > +# @enable-shared-access: 3h = Enable Shared Access
>>
>> Similar questions.
>>
>> > +#
>> > +# Since: 9.1
>> > +##
>> > +{ 'enum': 'CXLExtSelPolicy',
>> > + 'data': ['free',
>> > + 'contiguous',
>> > + 'prescriptive',
>> > + 'enable-shared-access']
>> > +}
>> > +
>> > +##
>> > +# @cxl-add-dynamic-capacity:
>> > +#
>> > +# Command to initiate to add dynamic capacity extents to a host. It
>>
>> "Initiate adding dynamic capacity extents"
> Done.
>>
>> When a command initiates something, we commonly need a way to detect
>> completion, and sometimes need a way to track progress.
>>
>> How can we detect completion, and if we can't, why's that okay?
>>
>> Can adding capacity fail after the command succeeded? If yes, how can
>> we detect that?
>
> The full flow can fail, in the sense that the host can reject the offered
> capacity.
> This command just initiates the flow.
>
> Today we can't detect it via QMP. There are a couple of options but I
> think they are out of scope for this document (for now).
> There are a lot more DCD features to come and I'd include a
> resolution to this aspect as one of those. Aim today is just
> to get to the point where we can test the OS handling - other
> cases like virtualization of this require a lot more infrastructure
> on top of what we have here.
Sounds scary...
> So likely options:
>
> * The 'fabric manager' will have an out of band path to the OS as it
> doesn't spontaneously decide to offer capacity - that happens
> because an orchestrator (think kubernetes or similar) has told a
> host to bring up an application that needs this extra capacity.
> That path would typically include an acknowledgment that the capacity
> has turned up and the host can run what it was asked to run.
>
> There is an inband path for a real fabric manager interface that
> we don't yet have an equivalent of in QEMU. An earlier version
> of this patch set provided a hacky equivalent so was dropped.
> That path is the Fabric Manager side Dynamic Capacity Event Log
> which has events for this
> 0x4 Add Capacity Response:
> " The host has responded to the Add Capacity event and the Dynamic
> Capacity Extent field in this structure specifies the capacity
> accepted by the host. This event shall only be reported
> to the FM"
> 0x5 is the similar one for release.
>
> So long term there is probably a need for a reporting interface
> but lots more to do in general and I think this is functional
> without that. For now I think all we can do is document that
> discovering success must be done via an out of band interface.
>
> I've added:
> " Note that, currently, establishing success or failure of the full Add Dynamic
> Capacity flow requires out of band communication with the OS of
> the CXL host."
>
> Does that work for now? We will have to remember to update if/when
> we add a way to query this.
Good enough for an unstable interface.
> It's also clear we could benefit from some additional documentation
> in cxl.rst. That's a job for another day however - for now, to
> get the details, users will have to read the CXL specification or
> at least watch a bunch of conference videos and webinars.
Would it make sense to add a short paragraph on what's missing there?
>> How long until completion after the command succeeded? Unbounded time?
>
> Depends on the host, and indeed unbounded - ultimately there is an abort
> path (forced removal later in this doc) but it is sometimes fatal for the
> OS running and only meant for the case where the host OS crashed.
> Not many operating systems play well with force removal of memory and due
> to a race condition it may look like that to the host. So basically
> it's a 'don't use this' kind of hardware feature.
>
> However it's not that QEMU is waiting for it beyond having some tracking
> structures allocated that are not freed until the flow has finished.
> This is very much an asynchronous flow.
Vaguely similar: device_del merely initiates hot unplug, and
completion requires guest cooperation. This puts management
applications into an awkward position. What if they don't get
DEVICE_DELETED event within a reasonable time? What is a reasonable
time? We later added DEVICE_UNPLUG_GUEST_ERROR to avoid this for the
common case of well-behaved guests.
I'm not asking you to do anything about this now. Spelling it out in
documentation seems advisable, though.
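To illustrate the awkward position: with no completion event, a management application can only bound the wait itself and pick its own idea of "reasonable time". A minimal sketch of that pattern follows. The names wait_for_host, oob_check_fn and confirmed_after_countdown are invented for this example, and the check callback stands in for whatever out-of-band channel to the host OS is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum oob_result { OOB_CONFIRMED, OOB_TIMED_OUT };

/* Caller-supplied out-of-band check, e.g. asking an agent in the guest. */
typedef bool (*oob_check_fn)(void *opaque);

/*
 * Poll the out-of-band check up to max_polls times.  The command itself
 * gives no completion signal, so the caller has to pick the bound.
 */
static enum oob_result wait_for_host(oob_check_fn check, void *opaque,
                                     unsigned max_polls)
{
    assert(check != NULL);

    for (unsigned i = 0; i < max_polls; i++) {
        if (check(opaque)) {
            return OOB_CONFIRMED;
        }
        /* A real caller would sleep or wait on some notification here. */
    }
    return OOB_TIMED_OUT;
}

/* Stand-in check for illustration: confirms once the counter reaches 0. */
static bool confirmed_after_countdown(void *opaque)
{
    unsigned *remaining = opaque;

    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

What to do on OOB_TIMED_OUT is the hard part, which is why spelling the limitation out in the documentation matters.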
>> > +# simulates operations defined in cxl spec r3.1 7.6.7.6.5.
>>
>> "defined in the CXL Specification 3.1 section 7.6.7.6.5"
>>
>> More of the same below, not noting it again.
> Sure. Hopefully fixed throughout the new text. I've not taken
> on the existing cases today.
>
>>
>> > +#
>> > +# @path: CXL DCD canonical QOM path.
>>
>> Sure the QOM path needs to be canonical?
>>
>> If not, what about "path to the CXL dynamic capacity device in the QOM
>> tree". Intentionally close to existing descriptions of @qom-path
>> elsewhere.
>
> That text LGTM. I'll focus only on new cases of this for an initial
> patch but there are a load of other cases of this text that will
> want updating separately.
Okay.
>> > +#
>> > +# @host-id: The "Host ID" field as defined in cxl spec r3.1
>> > +# Table 7-70.
>> > +#
>> > +# @selection-policy: The "Selection Policy" bits as defined in
>> > +# cxl spec r3.1 Table 7-70. It specifies the policy to use for
>> > +# selecting which extents comprise the added capacity.
>> > +#
>> > +# @region: The "Region Number" field as defined in cxl spec r3.1
>> > +# Table 7-70. The dynamic capacity region where the capacity
>> > +# is being added. Valid range is from 0-7.
>>
>> Scratch the second sentence?
>
> Sure, I guess because nearly everything else is just a spec reference
> and this isn't adding enough info to be useful?
One, it adds relatively little over "region number", and two, it's not
actually a sentence ;)
>>
>> > +#
>> > +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-70.
>> > +#
>> > +# @extents: The "Extent List" field as defined in cxl spec r3.1
>> > +# Table 7-70.
>> > +#
>> > +# Since : 9.1
>> > +##
>> > +{ 'command': 'cxl-add-dynamic-capacity',
>> > + 'data': { 'path': 'str',
>> > + 'host-id': 'uint16',
>> > + 'selection-policy': 'CXLExtSelPolicy',
>> > + 'region': 'uint8',
>> > + '*tag': 'str',
>> > + 'extents': [ 'CXLDynamicCapacityExtent' ]
>> > + }
>> > +}
>> > +
>> > +##
>> > +# @CXLExtRemovalPolicy:
>>
>> CxlExtentRemovalPolicy
> Done this and similar.
>>
>> > +#
>> > +# The policy to use for selecting which extents comprise the released
>> > +# capacity, defined in the "Flags" field in cxl spec r3.1 Table 7-71.
>> > +#
>> > +# @tag-based: value = 0h. Extents are selected by the device based
>> > +# on tag, with no requirement for contiguous extents.
>> > +#
>> > +# @prescriptive: value = 1h. Extent list of capacity to release is
>> > +# included in the request payload.
>>
>> I guess "value = ..." documents the numeric value. Sure that's useful
>> here?
>
> Dropped as not useful here.
>
>>
>> > +#
>> > +# Since: 9.1
>> > +##
>> > +{ 'enum': 'CXLExtRemovalPolicy',
>> > + 'data': ['tag-based',
>> > + 'prescriptive']
>> > +}
>> > +
>> > +##
>> > +# @cxl-release-dynamic-capacity:
>> > +#
>> > +# Command to initiate to release dynamic capacity extents from a
>>
>> "Initiate releasing dynamic capacity extents"
>>
>> When a command initiates something, we commonly need a way to detect
>> completion, and sometimes need a way to track progress. See
>> cxl-add-dynamic-capacity above.
>>
>
> Effectively same reply. Today you can only do this via out of band
> comms with the host. We have quite a lot more to add before we
> can report this via QMP. This is very much part 1 of DCD support,
> I'd expect us to be still adding features in a year or more.
>
> I'll add similar text to proposed for the add path.
>
> ...
>
>>
>> > +# capacity as a result of this request.
>>
>> What does it mean "to sanitize capacity"? Is this about scrubbing the
>> memory?
>
> For one meaning of scrubbing. Not the one that is normally applied to
> memory which is patrol scrub / ECC error detection and correction and
> subject to a long kernel mailing list thread at the moment and another
> QEMU patch set on my queue..
>
> Why can't we have a dictionary of canonical terms? Ah well.
> Added a slightly shortened quote from the CXL spec.
> "This Ensures that all user data and metadata is made permanently
ensures
> unavailable by whatever means is appropriate for the media type.
> Note that changing encryption keys is not sufficient."
>
> The last bit is because we will shortly have secure erase support
> via another patch set and in that case changing encryption keys is
> sufficient.
Works for me.
>> > +#
>> > +# @region: The "Region Number" field as defined in cxl spec r3.1
>> > +# Table 7-71. The dynamic capacity region where the capacity
>> > +# is being added. Valid range is from 0-7.
>>
>> My comment on cxl-add-dynamic-capacity argument @region applies.
> "The dynamic capacity region where the capacity is being added."
> sentence dropped.
>
>>
>> > +#
>> > +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
>> > +#
>> > +# @extents: The "Extent List" field as defined in cxl spec r3.1
>> > +# Table 7-71.
>> > +#
>> > +# Since : 9.1
>> > +##
>> > +{ 'command': 'cxl-release-dynamic-capacity',
>> > + 'data': { 'path': 'str',
>> > + 'host-id': 'uint16',
>> > + 'removal-policy': 'CXLExtRemovalPolicy',
>> > + '*forced-removal': 'bool',
>> > + '*sanitize-on-release': 'bool',
>> > + 'region': 'uint8',
>> > + '*tag': 'str',
>> > + 'extents': [ 'CXLDynamicCapacityExtent' ]
>> > + }
>> > +}
>>
>
> So with all that incorporated, what I currently have is:
>
>
>
> [PATCH] hw/cxl/events: Improve QMP interfaces and documentation for add/release dynamic capacity.
>
> New DCD command definitions updated in response to review comments
> from Markus.
>
> - Used CxlXXXX instead of CXLXXXXX for newly added types.
> - Expanded some abbreviations in type names to be easier to read.
> - Additional documentation for some fields.
> - Replace slightly vague cxl r3.1 references with
> "Compute Express Link (CXL) Specification, Revision 3.1, XXXX"
> to bring them in line with what it says on the specification cover.
>
> Suggested-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> ---
> I can break this up into separate patches, but that's going to be
> quite a lot of churn as often several of the above affect the same
> paragraph.
I don't think breaking it up is worth your while or mine :)
Patch looks good to me at a glance. There are a few instances of

    For legibility, wrap text paragraphs so every line is at most 70
    characters long.

    Separate sentences with two spaces.

[...]
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v8 00/14] Enabling DCD emulation support in Qemu
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
` (14 preceding siblings ...)
2024-06-03 13:51 ` [PATCH v8 00/14] Enabling DCD emulation support in Qemu Jonathan Cameron
@ 2025-06-25 14:22 ` Alireza Sanaee
2025-06-26 16:39 ` Fan Ni
15 siblings, 1 reply; 28+ messages in thread
From: Alireza Sanaee @ 2025-06-25 14:22 UTC (permalink / raw)
To: nifan.cxl
Cc: qemu-devel, jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, armbru, mst, anisa.su887
On Thu, 23 May 2024 10:44:40 -0700
nifan.cxl@gmail.com wrote:
> From: Fan Ni <nifan.cxl@gmail.com>
>
> A git tree of this series can be found here (with one extra commit on
> top for printing out accepted/pending extent list for testing):
> https://github.com/moking/qemu/tree/dcd-v8-qapi
>
> v7->v8:
>
> This version carries over the following two patches from Gregory.
> 1. hw/cxl/mailbox: change CCI cmd set structure to be a member, not a
> reference https://gitlab.com/jic23/qemu/-/commit/f44ebc5a455ccdd6535879b0c5824e0d76b04da5
> 2. hw/cxl/mailbox: interface to add CCI commands to an existing CCI
> https://gitlab.com/jic23/qemu/-/commit/00a4dd8b388add03c588298f665ee918626296a5
>
> Note, the above two patches are not directly related to DCD emulation.
>
> All the following patches in this series are built on top of
> mainstream QEMU and the above two patches.
>
> The most significant changes of v8 is in Patch 11 (Patch 9 in v7).
> Based on feedback from Markus and Jonathan, the QMP interfaces for
> adding and releasing DC extents have been redesigned and now they
> look like below,
>
> # add a 128MB extent at offset 0 to region 0
> { "execute": "cxl-add-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-memdev0",
> "host-id":0,
> "selection-policy": 'prescriptive',
> "region": 0,
> "tag": "",
> "extents": [
> {
> "offset": 0,
> "len": 134217728
> }
> ]
> }
> }
>
> Note: tag is optional.
>
> # Release a 128MB extent at offset 0 from region 0
> { "execute": "cxl-release-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-memdev0",
> "host-id":0,
> "removal-policy":"prescriptive",
> "forced-removal": false,
> "sanitize-on-release": false,
> "region": 0,
> "tag": "",
> "extents": [
> {
> "offset": 0,
> "len": 134217728
> }
> ]
> }
> }
>
> Note: removal-policy, sanitize-on-release and tag are optional.
>
> Other changes include,
> 1. Applied tags to patches.
> 2. Replaced error_setq with error_append_hint for
> cxl_create_dc_region error case in Patch 6 (Patch 4 in v7); (Zhijian Li)
> 3. Updated the error message to include region size information in
> cxl_create_dc_region.
> 4. set range1_size_hi to 0 for DCD in build_dvsec. (Jonathan)
> 5. Several minor format fixes.
>
> Thanks Markus, Jonathan, Gregory, and Zhijian for reviewing v7 and
> svetly Todorov for testing v7.
>
> This series pass the same tests as v7 check the cover letter of v7 for
> more details. Additionally, we tested the QAPI interface for
> adding/releasing DC extents with optional input parameters.
>
>
> v7: https://lore.kernel.org/linux-cxl/5856b7a4-4082-465f-9f61-b1ec6c35ef0f@fujitsu.com/T/#mec4c85022ce28c80b241aaf2d5431cadaa45f097
>
>
> Fan Ni (12):
> hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
> payload of identify memory device command
> hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
> and mailbox command support
> include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
> type3 memory devices
> hw/mem/cxl_type3: Add support to create DC regions to type3 memory
> devices
> hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr
> size instead of mr as argument
> hw/mem/cxl_type3: Add host backend and address space handling for DC
> regions
> hw/mem/cxl_type3: Add DC extent list representative and get DC
> extent list mailbox support
> hw/cxl/cxl-mailbox-utils: Add mailbox commands to support
> add/release dynamic capacity response
> hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
> extents
> hw/mem/cxl_type3: Add DPA range validation for accesses to DC
> regions
> hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox
> support
> hw/mem/cxl_type3: Allow to release extent superset in QMP
> interface
>
> Gregory Price (2):
> hw/cxl/mailbox: change CCI cmd set structure to be a member, not a
> reference
> hw/cxl/mailbox: interface to add CCI commands to an existing CCI
>
> hw/cxl/cxl-mailbox-utils.c | 658 +++++++++++++++++++++++++++++++++++-
> hw/mem/cxl_type3.c | 634 ++++++++++++++++++++++++++++++++--
> hw/mem/cxl_type3_stubs.c | 25 ++
> include/hw/cxl/cxl_device.h | 85 ++++-
> include/hw/cxl/cxl_events.h | 18 +
> qapi/cxl.json | 143 ++++++++
> 6 files changed, 1511 insertions(+), 52 deletions(-)
>
Hi Nifan,
I am trying to test this patchset with Ira's set on DCD, and I am
trying to work everything out by using sysfs directly rather than the
tools. I am not sure where things are going off the rails.
I am using this patchset
(https://lore.kernel.org/qemu-devel/20240523174651.1089554-2-nifan.cxl@gmail.com/) with Ira's v9 https://lore.kernel.org/all/20250413-dcd-type2-upstream-v9-0-1d4911a0b365@intel.com/
This is my CXL config:
return [
"-m", "6G,maxmem=20G,slots=10",
"-object", "memory-backend-ram,id=vmem0,share=on,size=2G",
"-device", "pxb-cxl,bus_nr=23,bus=pcie.0,id=cxl.1",
"-device","cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2",
"-device","cxl-type3,bus=root_port13,volatile-dc-memdev=vmem0,id=cxl-vmem0,num-dc-regions=2",
"-M", "cxl=on,cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G"
]
This is how I create a new CXL DC Region ready to work with.
# creating a region
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_dynamic_ram_a_region)
echo $region
echo $region > /sys/bus/cxl/devices/decoder0.0/create_dynamic_ram_a_region
echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity
echo 1 > /sys/bus/cxl/devices/$region/interleave_ways
echo "dynamic_ram_a" > /sys/bus/cxl/devices/decoder2.0/mode
echo $((256 << 20)) > /sys/bus/cxl/devices/decoder2.0/dpa_size
echo $((256 << 20)) > /sys/bus/cxl/devices/$region/size
echo "decoder2.0" > /sys/bus/cxl/devices/$region/target0
echo 1 > /sys/bus/cxl/devices/$region/commit
echo $region > /sys/bus/cxl/drivers/cxl_region/bind
After this I have two things created, region0 and dax_region0:
root@localhost:~# ls /sys/bus/cxl/devices/
dax_region0/ decoder1.0/ decoder1.2/ decoder2.0/ decoder2.2/ endpoint2/ nvdimm-bridge0/ region0/
decoder0.0/ decoder1.1/ decoder1.3/ decoder2.1/ decoder2.3/ mem0/ port1/ root0/
This is what I have in dax_region0:
root@localhost:~# ls /sys/bus/cxl/devices/dax_region0/
dax0.0 dax_region devtype driver modalias subsystem uevent
Now I want to add an extent, and this is how I am doing it:
{ "execute": "cxl-add-dynamic-capacity",
"arguments": {
"path": "/machine/peripheral/cxl-memdev0",
"host-id":0,
"selection-policy": 'prescriptive',
"region": 0,
"tag": "",
"extents": [
{
"offset": 0,
"len": 134217728
}
]
}
}
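For reference, the same request can be generated rather than typed by
hand, which avoids mistakes in the byte counts. A minimal Python
sketch, assuming only the argument names shown above; the helper name
build_add_dc_cmd is made up for illustration, and nothing here talks
to QEMU:

```python
import json

MIB = 1 << 20


def build_add_dc_cmd(path, region, extents_mib, host_id=0, tag=None):
    """Build a cxl-add-dynamic-capacity request as a dict.

    extents_mib is a list of (offset_mib, len_mib) pairs.
    """
    args = {
        "path": path,
        "host-id": host_id,
        "selection-policy": "prescriptive",
        "region": region,
        "extents": [
            {"offset": off * MIB, "len": length * MIB}
            for off, length in extents_mib
        ],
    }
    if tag is not None:  # tag is optional in the schema ('*tag')
        args["tag"] = tag
    return {"execute": "cxl-add-dynamic-capacity", "arguments": args}


# One 128 MiB extent at offset 0 of region 0, as in the command above.
cmd = build_add_dc_cmd("/machine/peripheral/cxl-memdev0", 0, [(0, 128)])
print(json.dumps(cmd, indent=2))
```

json.dumps emits double-quoted strings throughout, so the output is
strict JSON.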
Now I see the extent added in my region device:
root@localhost:~# ls /sys/bus/cxl/devices/dax_region0/
dax0.0 dax_region devtype driver extent0.0 modalias subsystem uevent
root@localhost:~# ls /sys/bus/cxl/devices/dax_region0/extent0.0/
length offset uevent
This is where things go off the rails. At this point I want to
create a new dax device to use, but this part is unsuccessful. Here I
first add some size to the dax region created before:
root@localhost:~# echo 67108864 > /sys/bus/dax/devices/dax0.0/size
[264.539280] dax:alloc_dev_dax_range:1015: dax dax0.0: alloc range[0]: 0x0000000810000000:0x0000000813ffffff
When I check the size everything looks OK:
root@localhost:~# cat /sys/bus/dax/devices/dax0.0/size
67108864
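As a quick sanity check on the numbers in this step, the range in the
kernel message can be compared with the size written to sysfs and with
the 128 MiB extent added earlier. A small Python sketch using only
values that appear in this mail; the variable names are just for
illustration:

```python
# Values copied from the mail above.
range_start = 0x0000000810000000
range_end = 0x0000000813ffffff   # inclusive end, as printed by the kernel
requested = 67108864             # written to dax0.0/size
extent_len = 134217728           # "len" in the QMP add command

allocated = range_end - range_start + 1
print(hex(allocated), allocated >> 20, "MiB")

# The allocation matches the request and fits inside the extent.
assert allocated == requested
assert allocated <= extent_len
```

So the size allocation itself is consistent with the request; the
failure only shows up at the bind step.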
But when I try to bind it, it does not work:
root@localhost:~# echo "dax0.0" > /sys/bus/dax/drivers/device_dax/bind
-bash: echo: write error: No such device
I believe I am missing something here. It would be good if you could
help out.
Thanks,
Alireza
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v8 00/14] Enabling DCD emulation support in Qemu
2025-06-25 14:22 ` Alireza Sanaee
@ 2025-06-26 16:39 ` Fan Ni
0 siblings, 0 replies; 28+ messages in thread
From: Fan Ni @ 2025-06-26 16:39 UTC (permalink / raw)
To: Alireza Sanaee
Cc: nifan.cxl, qemu-devel, jonathan.cameron, linux-cxl, gregory.price,
ira.weiny, dan.j.williams, a.manzanares, dave, nmtadam.samsung,
jim.harris, Jorgen.Hansen, wj28.lee, armbru, mst, anisa.su887
On Wed, Jun 25, 2025 at 03:22:34PM +0100, Alireza Sanaee wrote:
> On Thu, 23 May 2024 10:44:40 -0700
> nifan.cxl@gmail.com wrote:
>
> > From: Fan Ni <nifan.cxl@gmail.com>
> >
> > A git tree of this series can be found here (with one extra commit on
> > top for printing out accepted/pending extent list for testing):
> > https://github.com/moking/qemu/tree/dcd-v8-qapi
> >
> > v7->v8:
> >
> > This version carries over the following two patches from Gregory.
> > 1. hw/cxl/mailbox: change CCI cmd set structure to be a member, not a
> > reference https://gitlab.com/jic23/qemu/-/commit/f44ebc5a455ccdd6535879b0c5824e0d76b04da5
> > 2. hw/cxl/mailbox: interface to add CCI commands to an existing CCI
> > https://gitlab.com/jic23/qemu/-/commit/00a4dd8b388add03c588298f665ee918626296a5
> >
> > Note, the above two patches are not directly related to DCD emulation.
> >
> > All the following patches in this series are built on top of
> > mainstream QEMU and the above two patches.
> >
> > The most significant changes of v8 is in Patch 11 (Patch 9 in v7).
> > Based on feedback from Markus and Jonathan, the QMP interfaces for
> > adding and releasing DC extents have been redesigned and now they
> > look like below,
> >
> > # add a 128MB extent at offset 0 to region 0
> > { "execute": "cxl-add-dynamic-capacity",
> > "arguments": {
> > "path": "/machine/peripheral/cxl-memdev0",
> > "host-id":0,
> > "selection-policy": 'prescriptive',
> > "region": 0,
> > "tag": "",
> > "extents": [
> > {
> > "offset": 0,
> > "len": 134217728
> > }
> > ]
> > }
> > }
> >
> > Note: tag is optional.
> >
> > # Release a 128MB extent at offset 0 from region 0
> > { "execute": "cxl-release-dynamic-capacity",
> > "arguments": {
> > "path": "/machine/peripheral/cxl-memdev0",
> > "host-id":0,
> > "removal-policy":"prescriptive",
> > "forced-removal": false,
> > "sanitize-on-release": false,
> > "region": 0,
> > "tag": "",
> > "extents": [
> > {
> > "offset": 0,
> > "len": 134217728
> > }
> > ]
> > }
> > }
> >
> > Note: removal-policy, sanitize-on-release and tag are optional.
> >
> > Other changes include,
> > 1. Applied tags to patches.
> > 2. Replaced error_setq with error_append_hint for
> > cxl_create_dc_region error case in Patch 6 (Patch 4 in v7); (Zhijian
> > Li) 3. Updated the error message to include region size information in
> > cxl_create_dc_region.
> > 4. set range1_size_hi to 0 for DCD in build_dvsec. (Jonathan)
> > 5. Several minor format fixes.
> >
> > Thanks Markus, Jonathan, Gregory, and Zhijian for reviewing v7 and
> > svetly Todorov for testing v7.
> >
> > This series pass the same tests as v7 check the cover letter of v7 for
> > more details. Additionally, we tested the QAPI interface for
> > adding/releasing DC extents with optional input parameters.
> >
> >
> > v7: https://lore.kernel.org/linux-cxl/5856b7a4-4082-465f-9f61-b1ec6c35ef0f@fujitsu.com/T/#mec4c85022ce28c80b241aaf2d5431cadaa45f097
> >
> >
> > Fan Ni (12):
> > hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
> > payload of identify memory device command
> > hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
> > and mailbox command support
> > include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
> > type3 memory devices
> > hw/mem/cxl_type3: Add support to create DC regions to type3 memory
> > devices
> > hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr
> > size instead of mr as argument
> > hw/mem/cxl_type3: Add host backend and address space handling for DC
> > regions
> > hw/mem/cxl_type3: Add DC extent list representative and get DC
> > extent list mailbox support
> > hw/cxl/cxl-mailbox-utils: Add mailbox commands to support
> > add/release dynamic capacity response
> > hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
> > extents
> > hw/mem/cxl_type3: Add DPA range validation for accesses to DC
> > regions hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox
> > support hw/mem/cxl_type3: Allow to release extent superset in QMP
> > interface
> >
> > Gregory Price (2):
> > hw/cxl/mailbox: change CCI cmd set structure to be a member, not a
> > reference
> > hw/cxl/mailbox: interface to add CCI commands to an existing CCI
> >
> > hw/cxl/cxl-mailbox-utils.c | 658
> > +++++++++++++++++++++++++++++++++++- hw/mem/cxl_type3.c |
> > 634 ++++++++++++++++++++++++++++++++-- hw/mem/cxl_type3_stubs.c |
> > 25 ++ include/hw/cxl/cxl_device.h | 85 ++++-
> > include/hw/cxl/cxl_events.h | 18 +
> > qapi/cxl.json | 143 ++++++++
> > 6 files changed, 1511 insertions(+), 52 deletions(-)
> >
>
> Hi Nifan,
>
> I am trying to test this patchset with Ira's set on DCD, and I am
> trying to work out everything by using sysfs rather than using tools
> instead. I am not sure where things are going of the rail.
>
> I am using this patchset
> (https://lore.kernel.org/qemu-devel/20240523174651.1089554-2-nifan.cxl@gmail.com/) with Ira's v9 https://lore.kernel.org/all/20250413-dcd-type2-upstream-v9-0-1d4911a0b365@intel.com/
>
> This my CXL config:
> return [
> "-m", "6G,maxmem=20G,slots=10",
> "-object", "memory-backend-ram,id=vmem0,share=on,size=2G",
> "-device", "pxb-cxl,bus_nr=23,bus=pcie.0,id=cxl.1",
> "-device","cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2",
> "-device","cxl-type3,bus=root_port13,volatile-dc-memdev=vmem0,id=cxl-vmem0,num-dc-regions=2",
> "-M", "cxl=on,cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G"
> ]
>
>
> This is how I create a new CXL DC Region ready to work with.
>
> # creating a region
> region=$(cat
> /sys/bus/cxl/devices/decoder0.0/create_dynamic_ram_a_region)
> echo $region
>
> echo $region >
> /sys/bus/cxl/devices/decoder0.0/create_dynamic_ram_a_region
> echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity
> echo 1 > /sys/bus/cxl/devices/$region/interleave_ways
>
> echo "dynamic_ram_a" > /sys/bus/cxl/devices/decoder2.0/mode
> echo $((256 << 20)) > /sys/bus/cxl/devices/decoder2.0/dpa_size
>
> echo $((256 << 20)) > /sys/bus/cxl/devices/$region/size
> echo "decoder2.0" > /sys/bus/cxl/devices/$region/target0
> echo 1 > /sys/bus/cxl/devices/$region/commit
>
> echo $region > /sys/bus/cxl/drivers/cxl_region/bind
>
> After this I have two things created, region0 and dax_region0:
>
> root@localhost:~# ls /sys/bus/cxl/devices/
> dax_region0/ decoder1.0/ decoder1.2/ decoder2.0/
> decoder2.2/ endpoint2/ nvdimm-bridge0/ region0/ decoder0.0/
> decoder1.1/ decoder1.3/ decoder2.1/ decoder2.3/ mem0/
> port1/ root0/
>
> This is what I have in dax_region0:
> root@localhost:~# ls /sys/bus/cxl/devices/dax_region0/
> dax0.0 dax_region devtype driver modalias subsystem uevent
>
> Now I want to add an extent, and this is how I am doing it:
>
> { "execute": "cxl-add-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-memdev0",
> "host-id":0,
> "selection-policy": 'prescriptive',
> "region": 0,
> "tag": "",
> "extents": [
> {
> "offset": 0,
> "len": 134217728
> }
> ]
> }
> }
>
> Now I see the extent added in my region device:
> root@localhost:~# ls /sys/bus/cxl/devices/dax_region0/
> dax0.0 dax_region devtype driver extent0.0 modalias subsystem
> uevent root@localhost:~# ls /sys/bus/cxl/devices/dax_region0/extent0.0/
> length offset uevent
>
> This is where things will go off the rails, at this point I want to
> create a new dax device to use, but this part is unsuccessful. Here I
> first add some size to the dax region created before:
>
> root@localhost:~# echo 67108864 > /sys/bus/dax/devices/dax0.0/size
> [264.539280] dax:alloc_dev_dax_range:1015: dax dax0.0: alloc range[0]:
> 0x0000000810000000:0x0000000813ffffff
>
> When I check the size everything looks OK:
> root@localhost:~# cat /sys/bus/dax/devices/dax0.0/size
> 67108864
>
> But when I want to bind it then it does not work:
> root@localhost:~# echo "dax0.0" > /sys/bus/dax/drivers/device_dax/bind
> -bash: echo: write error: No such device
>
> I believe I am missing something here. Would be good if you can help
> out here.
>
> Thanks,
> Alireza
Hi Alireza,
My testing of this series has always used Ira's ndctl repo;
I have not tried testing through sysfs directly.
To understand what is going wrong, we need to see how the ndctl
workflow differs from the one you use above.
Beyond that, when I see an error like the one above, I usually try to gdb the kernel:
debug the step that fails and understand why.
For example, since the last step fails, try to gdb the "bind_store" function in
drivers/base/bus.c.
Fan
>
--
Fan Ni
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
2024-05-23 17:44 ` [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents nifan.cxl
2024-06-04 7:12 ` Markus Armbruster
@ 2025-09-02 10:39 ` Alireza Sanaee
2025-09-02 15:59 ` Ira Weiny
1 sibling, 1 reply; 28+ messages in thread
From: Alireza Sanaee @ 2025-09-02 10:39 UTC (permalink / raw)
To: nifan.cxl
Cc: qemu-devel, jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni, Svetly Todorov
On Thu, 23 May 2024 10:44:51 -0700
nifan.cxl@gmail.com wrote:
> From: Fan Ni <fan.ni@samsung.com>
>
> To simulate FM functionalities for initiating Dynamic Capacity Add
> (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL
> spec r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces
> to issue add/release dynamic capacity extents requests.
>
> With the change, we allow to release an extent only when its DPA range
> is contained by a single accepted extent in the device. That is to
> say, extent superset release is not supported yet.
>
> 1. Add dynamic capacity extents:
>
> For example, the command to add two continuous extents (each 128MiB
> long) to region 0 (starting at DPA offset 0) looks like below:
>
> { "execute": "qmp_capabilities" }
>
> { "execute": "cxl-add-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-dcd0",
> "host-id": 0,
> "selection-policy": "prescriptive",
> "region": 0,
> "extents": [
> {
> "offset": 0,
> "len": 134217728
> },
> {
> "offset": 134217728,
> "len": 134217728
> }
> ]
> }
> }
>
> 2. Release dynamic capacity extents:
>
> For example, the command to release an extent of size 128MiB from
> region 0 (DPA offset 128MiB) looks like below:
>
> { "execute": "cxl-release-dynamic-capacity",
> "arguments": {
> "path": "/machine/peripheral/cxl-dcd0",
> "host-id": 0,
> "removal-policy":"prescriptive",
> "region": 0,
> "extents": [
> {
> "offset": 134217728,
> "len": 134217728
> }
> ]
> }
> }
>
> Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
> Reviewed-by: Gregory Price <gregory.price@memverge.com>
> Signed-off-by: Fan Ni <fan.ni@samsung.com>
> ---
> hw/cxl/cxl-mailbox-utils.c | 62 ++++++--
> hw/mem/cxl_type3.c | 306 +++++++++++++++++++++++++++++++++++-
> hw/mem/cxl_type3_stubs.c | 25 +++
> include/hw/cxl/cxl_device.h | 22 +++
> include/hw/cxl/cxl_events.h | 18 +++
> qapi/cxl.json | 143 +++++++++++++++++
> 6 files changed, 563 insertions(+), 13 deletions(-)
>
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 9d54e10cd4..ab71492697 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -1405,7 +1405,7 @@ static CXLRetCode cmd_dcd_get_dyn_cap_ext_list(const struct cxl_cmd *cmd,
> * Check whether any bit between addr[nr, nr+size) is set,
> * return true if any bit is set, otherwise return false
> */
> -static bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
> +bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
> unsigned long size)
> {
> unsigned long res = find_next_bit(addr, size + nr, nr);
> @@ -1444,7 +1444,7 @@ CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len)
> return NULL;
> }
>
> -static void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
> +void cxl_insert_extent_to_extent_list(CXLDCExtentList *list,
> uint64_t dpa,
> uint64_t len,
> uint8_t *tag,
> @@ -1470,6 +1470,44 @@ void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
> g_free(extent);
> }
>
> +/*
> + * Add a new extent to the extent "group" if group exists;
> + * otherwise, create a new group
> + * Return value: the extent group where the extent is inserted.
> + */
> +CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
> + uint64_t dpa,
> + uint64_t len,
> + uint8_t *tag,
> + uint16_t shared_seq)
> +{
> + if (!group) {
> + group = g_new0(CXLDCExtentGroup, 1);
> + QTAILQ_INIT(&group->list);
> + }
> + cxl_insert_extent_to_extent_list(&group->list, dpa, len,
> + tag, shared_seq);
> + return group;
> +}
> +
> +void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
> + CXLDCExtentGroup *group)
> +{
> + QTAILQ_INSERT_TAIL(list, group, node);
> +}
> +
> +void cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list)
> +{
> + CXLDCExtent *ent, *ent_next;
> + CXLDCExtentGroup *group = QTAILQ_FIRST(list);
> +
> + QTAILQ_REMOVE(list, group, node);
> + QTAILQ_FOREACH_SAFE(ent, &group->list, node, ent_next) {
> + cxl_remove_extent_from_extent_list(&group->list, ent);
> + }
> + g_free(group);
> +}
> +
> /*
> * CXL r3.1 Table 8-168: Add Dynamic Capacity Response Input Payload
> * CXL r3.1 Table 8-170: Release Dynamic Capacity Input Payload
> @@ -1541,6 +1579,7 @@ static CXLRetCode cxl_dcd_add_dyn_cap_rsp_dry_run(CXLType3Dev *ct3d,
> {
> uint32_t i;
> CXLDCExtent *ent;
> + CXLDCExtentGroup *ext_group;
> uint64_t dpa, len;
> Range range1, range2;
>
> @@ -1551,9 +1590,13 @@ static CXLRetCode cxl_dcd_add_dyn_cap_rsp_dry_run(CXLType3Dev *ct3d,
> range_init_nofail(&range1, dpa, len);
> /*
> - * TODO: once the pending extent list is added, check against
> - * the list will be added here.
> + * The host-accepted DPA range must be contained by the first extent
> + * group in the pending list
> */
> + ext_group = QTAILQ_FIRST(&ct3d->dc.extents_pending);
> + if (!cxl_extents_contains_dpa_range(&ext_group->list, dpa, len)) {
> + return CXL_MBOX_INVALID_PA;
> + }
>
> /* to-be-added range should not overlap with range already accepted */
> QTAILQ_FOREACH(ent, &ct3d->dc.extents, node) {
> @@ -1586,10 +1629,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
> CXLRetCode ret;
>
> if (in->num_entries_updated == 0) {
> - /*
> - * TODO: once the pending list is introduced, extents in the beginning
> - * will get wiped out.
> - */
> + cxl_extent_group_list_delete_front(&ct3d->dc.extents_pending);
> return CXL_MBOX_SUCCESS;
> }
>
> @@ -1615,11 +1655,9 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd,
> cxl_insert_extent_to_extent_list(extent_list, dpa, len, NULL, 0);
> ct3d->dc.total_extent_count += 1;
> - /*
> - * TODO: we will add a pending extent list based on event log record
> - * and process the list accordingly here.
> - */
> }
> + /* Remove the first extent group in the pending list */
> + cxl_extent_group_list_delete_front(&ct3d->dc.extents_pending);
>
> return CXL_MBOX_SUCCESS;
> }
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 7c9038938f..2161766b14 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -673,6 +673,7 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> ct3d->dc.total_capacity += region->len;
> }
> QTAILQ_INIT(&ct3d->dc.extents);
> + QTAILQ_INIT(&ct3d->dc.extents_pending);
>
> return true;
> }
> @@ -680,10 +681,19 @@ static bool cxl_create_dc_regions(CXLType3Dev *ct3d, Error **errp)
> static void cxl_destroy_dc_regions(CXLType3Dev *ct3d)
> {
> CXLDCExtent *ent, *ent_next;
> + CXLDCExtentGroup *group, *group_next;
>
> QTAILQ_FOREACH_SAFE(ent, &ct3d->dc.extents, node, ent_next) {
> cxl_remove_extent_from_extent_list(&ct3d->dc.extents, ent);
> }
> +
> + QTAILQ_FOREACH_SAFE(group, &ct3d->dc.extents_pending, node, group_next) {
> + QTAILQ_REMOVE(&ct3d->dc.extents_pending, group, node);
> + QTAILQ_FOREACH_SAFE(ent, &group->list, node, ent_next) {
> + cxl_remove_extent_from_extent_list(&group->list, ent);
> + }
> + g_free(group);
> + }
> }
>
> static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> @@ -1448,7 +1458,6 @@ static int ct3d_qmp_cxl_event_log_enc(CxlEventLog log)
> return CXL_EVENT_TYPE_FAIL;
> case CXL_EVENT_LOG_FATAL:
> return CXL_EVENT_TYPE_FATAL;
> -/* DCD not yet supported */
> default:
> return -EINVAL;
> }
> @@ -1699,6 +1708,301 @@ void qmp_cxl_inject_memory_module_event(const char *path, CxlEventLog log,
> }
> }
>
> +/* CXL r3.1 Table 8-50: Dynamic Capacity Event Record */
> +static const QemuUUID dynamic_capacity_uuid = {
> + .data = UUID(0xca95afa7, 0xf183, 0x4018, 0x8c, 0x2f,
> + 0x95, 0x26, 0x8e, 0x10, 0x1a, 0x2a),
> +};
> +
> +typedef enum CXLDCEventType {
> + DC_EVENT_ADD_CAPACITY = 0x0,
> + DC_EVENT_RELEASE_CAPACITY = 0x1,
> + DC_EVENT_FORCED_RELEASE_CAPACITY = 0x2,
> + DC_EVENT_REGION_CONFIG_UPDATED = 0x3,
> + DC_EVENT_ADD_CAPACITY_RSP = 0x4,
> + DC_EVENT_CAPACITY_RELEASED = 0x5,
> +} CXLDCEventType;
> +
> +/*
> + * Check whether the range [dpa, dpa + len - 1] has overlaps with extents in
> + * the list.
> + * Return value: return true if has overlaps; otherwise, return false
> + */
> +static bool cxl_extents_overlaps_dpa_range(CXLDCExtentList *list,
> + uint64_t dpa, uint64_t len)
> +{
> + CXLDCExtent *ent;
> + Range range1, range2;
> +
> + if (!list) {
> + return false;
> + }
> +
> + range_init_nofail(&range1, dpa, len);
> + QTAILQ_FOREACH(ent, list, node) {
> + range_init_nofail(&range2, ent->start_dpa, ent->len);
> + if (range_overlaps_range(&range1, &range2)) {
> + return true;
> + }
> + }
> + return false;
> +}
> +
> +/*
> + * Check whether the range [dpa, dpa + len - 1] is contained by extents in
> + * the list.
> + * Will check multiple extents containment once superset release is added.
> + * Return value: return true if range is contained; otherwise, return false
> + */
> +bool cxl_extents_contains_dpa_range(CXLDCExtentList *list,
> + uint64_t dpa, uint64_t len)
> +{
> + CXLDCExtent *ent;
> + Range range1, range2;
> +
> + if (!list) {
> + return false;
> + }
> +
> + range_init_nofail(&range1, dpa, len);
> + QTAILQ_FOREACH(ent, list, node) {
> + range_init_nofail(&range2, ent->start_dpa, ent->len);
> + if (range_contains_range(&range2, &range1)) {
> + return true;
> + }
> + }
> + return false;
> +}
> +
> +static bool cxl_extent_groups_overlaps_dpa_range(CXLDCExtentGroupList *list,
> + uint64_t dpa, uint64_t len)
> +{
> + CXLDCExtentGroup *group;
> +
> + if (!list) {
> + return false;
> + }
> +
> + QTAILQ_FOREACH(group, list, node) {
> + if (cxl_extents_overlaps_dpa_range(&group->list, dpa, len)) {
> + return true;
> + }
> + }
> + return false;
> +}
> +
> +/*
> + * The main function to process dynamic capacity event with extent list.
> + * Currently DC extents add/release requests are processed.
> + */
> +static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
> + uint16_t hid, CXLDCEventType type, uint8_t rid,
> + CXLDynamicCapacityExtentList *records, Error **errp)
> +{
> + Object *obj;
> + CXLEventDynamicCapacity dCap = {};
> + CXLEventRecordHdr *hdr = &dCap.hdr;
> + CXLType3Dev *dcd;
> + uint8_t flags = 1 << CXL_EVENT_TYPE_INFO;
> + uint32_t num_extents = 0;
> + CXLDynamicCapacityExtentList *list;
> + CXLDCExtentGroup *group = NULL;
> + g_autofree CXLDCExtentRaw *extents = NULL;
> + uint8_t enc_log = CXL_EVENT_TYPE_DYNAMIC_CAP;
> + uint64_t dpa, offset, len, block_size;
> + g_autofree unsigned long *blk_bitmap = NULL;
> + int i;
> +
> + obj = object_resolve_path_type(path, TYPE_CXL_TYPE3, NULL);
> + if (!obj) {
> + error_setg(errp, "Unable to resolve CXL type 3 device");
> + return;
> + }
> +
> + dcd = CXL_TYPE3(obj);
> + if (!dcd->dc.num_regions) {
> + error_setg(errp, "No dynamic capacity support from the device");
> + return;
> + }
> +
> +
> + if (rid >= dcd->dc.num_regions) {
> + error_setg(errp, "region id is too large");
> + return;
> + }
> + block_size = dcd->dc.regions[rid].block_size;
> + blk_bitmap = bitmap_new(dcd->dc.regions[rid].len / block_size);
> +
> + /* Sanity check and count the extents */
> + list = records;
> + while (list) {
> + offset = list->value->offset;
> + len = list->value->len;
> + dpa = offset + dcd->dc.regions[rid].base;
> +
> + if (len == 0) {
> + error_setg(errp, "extent with 0 length is not allowed");
> + return;
> + }
> +
> + if (offset % block_size || len % block_size) {
> + error_setg(errp, "dpa or len is not aligned to region block size");
> + return;
> + }
> +
> + if (offset + len > dcd->dc.regions[rid].len) {
> + error_setg(errp, "extent range is beyond the region end");
> + return;
> + }
> +
> + /* No duplicate or overlapped extents are allowed */
> + if (test_any_bits_set(blk_bitmap, offset / block_size,
> + len / block_size)) {
> + error_setg(errp, "duplicate or overlapped extents are detected");
> + return;
> + }
> + bitmap_set(blk_bitmap, offset / block_size, len / block_size);
> +
> + if (type == DC_EVENT_RELEASE_CAPACITY) {
> + if (cxl_extent_groups_overlaps_dpa_range(&dcd->dc.extents_pending,
> + dpa, len)) {
> + error_setg(errp,
> + "cannot release extent with pending DPA range");
> + return;
> + }
> + if (!cxl_extents_contains_dpa_range(&dcd->dc.extents, dpa, len)) {
> + error_setg(errp,
> + "cannot release extent with non-existing DPA range");
> + return;
> + }
> + } else if (type == DC_EVENT_ADD_CAPACITY) {
> + if (cxl_extents_overlaps_dpa_range(&dcd->dc.extents, dpa, len)) {
> + error_setg(errp,
> + "cannot add DPA already accessible to the same LD");
> + return;
> + }
> + if (cxl_extent_groups_overlaps_dpa_range(&dcd->dc.extents_pending,
> + dpa, len)) {
> + error_setg(errp,
> + "cannot add DPA again while still pending");
> + return;
> + }
> + }
> + list = list->next;
> + num_extents++;
> + }
> +
> + /* Create extent list for event being passed to host */
> + i = 0;
> + list = records;
> + extents = g_new0(CXLDCExtentRaw, num_extents);
> + while (list) {
> + offset = list->value->offset;
> + len = list->value->len;
> + dpa = dcd->dc.regions[rid].base + offset;
> +
> + extents[i].start_dpa = dpa;
> + extents[i].len = len;
> + memset(extents[i].tag, 0, 0x10);
> + extents[i].shared_seq = 0;
> + if (type == DC_EVENT_ADD_CAPACITY) {
> + group = cxl_insert_extent_to_extent_group(group,
> + extents[i].start_dpa,
> + extents[i].len,
> + extents[i].tag,
> + extents[i].shared_seq);
> + }
> +
> + list = list->next;
> + i++;
> + }
> + if (group) {
> + cxl_extent_group_list_insert_tail(&dcd->dc.extents_pending, group);
> + }
> +
> + /*
> + * CXL r3.1 section 8.2.9.2.1.6: Dynamic Capacity Event Record
> + *
> + * All Dynamic Capacity event records shall set the Event Record Severity
> + * field in the Common Event Record Format to Informational Event. All
> + * Dynamic Capacity related events shall be logged in the Dynamic Capacity
> + * Event Log.
> + */
> + cxl_assign_event_header(hdr, &dynamic_capacity_uuid, flags, sizeof(dCap),
> + cxl_device_get_timestamp(&dcd->cxl_dstate));
> +
> + dCap.type = type;
> + /* FIXME: for now, validity flag is cleared */
> + dCap.validity_flags = 0;
> + stw_le_p(&dCap.host_id, hid);
> + /* only valid for DC_REGION_CONFIG_UPDATED event */
> + dCap.updated_region_id = 0;
> + dCap.flags = 0;
> + for (i = 0; i < num_extents; i++) {
> + memcpy(&dCap.dynamic_capacity_extent, &extents[i],
> + sizeof(CXLDCExtentRaw));
> +
> + if (i < num_extents - 1) {
> + /* Set "More" flag */
> + dCap.flags |= BIT(0);
> + }
> +
> + if (cxl_event_insert(&dcd->cxl_dstate, enc_log,
> + (CXLEventRecordRaw *)&dCap)) {
> + cxl_event_irq_assert(dcd);
> + }
> + }
> +}
> +
> +void qmp_cxl_add_dynamic_capacity(const char *path, uint16_t host_id,
> + CXLExtSelPolicy sel_policy, uint8_t region,
> + const char *tag,
> + CXLDynamicCapacityExtentList *extents,
> + Error **errp)
> +{
> + switch (sel_policy) {
> + case CXL_EXT_SEL_POLICY_PRESCRIPTIVE:
> + qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id,
> + DC_EVENT_ADD_CAPACITY,
> + region, extents, errp);
> + return;
> + default:
> + error_setg(errp, "Selection policy not supported");
> + return;
> + }
> +}
> +
> +void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
> + CXLExtRemovalPolicy removal_policy,
> + bool has_forced_removal,
> + bool forced_removal,
> + bool has_sanitize_on_release,
> + bool sanitize_on_release,
> + uint8_t region,
> + const char *tag,
> + CXLDynamicCapacityExtentList *extents,
> + Error **errp)
> +{
> + CXLDCEventType type = DC_EVENT_RELEASE_CAPACITY;
> +
> + if (has_forced_removal && forced_removal) {
> + /* TODO: enable forced removal in the future */
> + type = DC_EVENT_FORCED_RELEASE_CAPACITY;
> + error_setg(errp, "Forced removal not supported yet");
> + return;
> + }
> +
> + switch (removal_policy) {
> + case CXL_EXT_REMOVAL_POLICY_PRESCRIPTIVE:
> + qmp_cxl_process_dynamic_capacity_prescriptive(path, host_id, type,
> + region, extents, errp);
> + return;
> + default:
> + error_setg(errp, "Removal policy not supported");
> + return;
> + }
> +}
> +
> static void ct3_class_init(ObjectClass *oc, void *data)
> {
> DeviceClass *dc = DEVICE_CLASS(oc);
> diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
> index 3e1851e32b..45419bbefe 100644
> --- a/hw/mem/cxl_type3_stubs.c
> +++ b/hw/mem/cxl_type3_stubs.c
> @@ -67,3 +67,28 @@ void qmp_cxl_inject_correctable_error(const char *path, CxlCorErrorType type,
> {
> error_setg(errp, "CXL Type 3 support is not compiled in");
> }
> +
> +void qmp_cxl_add_dynamic_capacity(const char *path,
> + uint16_t host_id,
> + CXLExtSelPolicy sel_policy,
> + uint8_t region,
> + const char *tag,
> + CXLDynamicCapacityExtentList *extents,
> + Error **errp)
> +{
> + error_setg(errp, "CXL Type 3 support is not compiled in");
> +}
> +
> +void qmp_cxl_release_dynamic_capacity(const char *path, uint16_t host_id,
> + CXLExtRemovalPolicy removal_policy,
> + bool has_forced_removal,
> + bool forced_removal,
> + bool has_sanitize_on_release,
> + bool sanitize_on_release,
> + uint8_t region,
> + const char *tag,
> + CXLDynamicCapacityExtentList *extents,
> + Error **errp)
> + Error **errp)
> +{
> + error_setg(errp, "CXL Type 3 support is not compiled in");
> +}
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index df3511e91b..c69ff6b5de 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -443,6 +443,12 @@ typedef struct CXLDCExtent {
> } CXLDCExtent;
> typedef QTAILQ_HEAD(, CXLDCExtent) CXLDCExtentList;
>
> +typedef struct CXLDCExtentGroup {
> + CXLDCExtentList list;
> + QTAILQ_ENTRY(CXLDCExtentGroup) node;
> +} CXLDCExtentGroup;
> +typedef QTAILQ_HEAD(, CXLDCExtentGroup) CXLDCExtentGroupList;
> +
> typedef struct CXLDCRegion {
> uint64_t base; /* aligned to 256*MiB */
> uint64_t decode_len; /* aligned to 256*MiB */
> @@ -494,6 +500,7 @@ struct CXLType3Dev {
> */
> uint64_t total_capacity; /* 256M aligned */
> CXLDCExtentList extents;
> + CXLDCExtentGroupList extents_pending;
> uint32_t total_extent_count;
> uint32_t ext_list_gen_seq;
>
> @@ -555,4 +562,19 @@ CXLDCRegion *cxl_find_dc_region(CXLType3Dev *ct3d, uint64_t dpa, uint64_t len);
> void cxl_remove_extent_from_extent_list(CXLDCExtentList *list,
> CXLDCExtent *extent);
> +void cxl_insert_extent_to_extent_list(CXLDCExtentList *list, uint64_t dpa,
> + uint64_t len, uint8_t *tag,
> + uint16_t shared_seq);
> +bool test_any_bits_set(const unsigned long *addr, unsigned long nr,
> + unsigned long size);
> +bool cxl_extents_contains_dpa_range(CXLDCExtentList *list,
> + uint64_t dpa, uint64_t len);
> +CXLDCExtentGroup *cxl_insert_extent_to_extent_group(CXLDCExtentGroup *group,
> + uint64_t dpa,
> + uint64_t len,
> + uint8_t *tag,
> + uint16_t shared_seq);
> +void cxl_extent_group_list_insert_tail(CXLDCExtentGroupList *list,
> + CXLDCExtentGroup *group);
> +void cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list);
> #endif
> diff --git a/include/hw/cxl/cxl_events.h b/include/hw/cxl/cxl_events.h
> index 5170b8dbf8..38cadaa0f3 100644
> --- a/include/hw/cxl/cxl_events.h
> +++ b/include/hw/cxl/cxl_events.h
> @@ -166,4 +166,22 @@ typedef struct CXLEventMemoryModule {
> uint8_t reserved[0x3d];
> } QEMU_PACKED CXLEventMemoryModule;
>
> +/*
> + * CXL r3.1 section Table 8-50: Dynamic Capacity Event Record
> + * All fields little endian.
> + */
> +typedef struct CXLEventDynamicCapacity {
> + CXLEventRecordHdr hdr;
> + uint8_t type;
> + uint8_t validity_flags;
> + uint16_t host_id;
> + uint8_t updated_region_id;
> + uint8_t flags;
> + uint8_t reserved2[2];
> + uint8_t dynamic_capacity_extent[0x28]; /* defined in cxl_device.h */
> + uint8_t reserved[0x18];
> + uint32_t extents_avail;
> + uint32_t tags_avail;
> +} QEMU_PACKED CXLEventDynamicCapacity;
> +
> #endif /* CXL_EVENTS_H */
> diff --git a/qapi/cxl.json b/qapi/cxl.json
> index 4281726dec..57d9f82014 100644
> --- a/qapi/cxl.json
> +++ b/qapi/cxl.json
> @@ -361,3 +361,146 @@
> ##
> {'command': 'cxl-inject-correctable-error',
> 'data': {'path': 'str', 'type': 'CxlCorErrorType'}}
> +
> +##
> +# @CXLDynamicCapacityExtent:
> +#
> +# A single dynamic capacity extent
> +#
> +# @offset: The offset (in bytes) to the start of the region
> +# where the extent belongs to.
> +#
> +# @len: The length of the extent in bytes.
> +#
> +# Since: 9.1
> +##
> +{ 'struct': 'CXLDynamicCapacityExtent',
> + 'data': {
> + 'offset':'uint64',
> + 'len': 'uint64'
> + }
> +}
> +
> +##
> +# @CXLExtSelPolicy:
> +#
> +# The policy to use for selecting which extents comprise the added
> +# capacity, as defined in cxl spec r3.1 Table 7-70.
> +#
> +# @free: 0h = Free
> +#
> +# @contiguous: 1h = Continuous
> +#
> +# @prescriptive: 2h = Prescriptive
> +#
> +# @enable-shared-access: 3h = Enable Shared Access
> +#
> +# Since: 9.1
> +##
> +{ 'enum': 'CXLExtSelPolicy',
> + 'data': ['free',
> + 'contiguous',
> + 'prescriptive',
> + 'enable-shared-access']
> +}
> +
> +##
> +# @cxl-add-dynamic-capacity:
> +#
> +# Command to initiate to add dynamic capacity extents to a host. It
> +# simulates operations defined in cxl spec r3.1 7.6.7.6.5.
> +#
> +# @path: CXL DCD canonical QOM path.
> +#
> +# @host-id: The "Host ID" field as defined in cxl spec r3.1
> +# Table 7-70.
> +#
> +# @selection-policy: The "Selection Policy" bits as defined in
> +# cxl spec r3.1 Table 7-70. It specifies the policy to use for
> +# selecting which extents comprise the added capacity.
> +#
> +# @region: The "Region Number" field as defined in cxl spec r3.1
> +# Table 7-70. The dynamic capacity region where the capacity
> +# is being added. Valid range is from 0-7.
> +#
> +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-70.
> +#
> +# @extents: The "Extent List" field as defined in cxl spec r3.1
> +# Table 7-70.
> +#
> +# Since : 9.1
> +##
> +{ 'command': 'cxl-add-dynamic-capacity',
> + 'data': { 'path': 'str',
> + 'host-id': 'uint16',
> + 'selection-policy': 'CXLExtSelPolicy',
> + 'region': 'uint8',
> + '*tag': 'str',
> + 'extents': [ 'CXLDynamicCapacityExtent' ]
> + }
> +}
> +
> +##
> +# @CXLExtRemovalPolicy:
> +#
> +# The policy to use for selecting which extents comprise the released
> +# capacity, defined in the "Flags" field in cxl spec r3.1 Table 7-71.
> +#
> +# @tag-based: value = 0h. Extents are selected by the device based
> +# on tag, with no requirement for contiguous extents.
> +#
> +# @prescriptive: value = 1h. Extent list of capacity to release is
> +# included in the request payload.
> +#
> +# Since: 9.1
> +##
> +{ 'enum': 'CXLExtRemovalPolicy',
> + 'data': ['tag-based',
> + 'prescriptive']
> +}
> +
> +##
> +# @cxl-release-dynamic-capacity:
> +#
> +# Command to initiate to release dynamic capacity extents from a
> +# host. It simulates operations defined in cxl spec r3.1 7.6.7.6.6.
> +#
> +# @path: CXL DCD canonical QOM path.
> +#
> +# @host-id: The "Host ID" field as defined in cxl spec r3.1
> +# Table 7-71.
> +#
> +# @removal-policy: Bit[3:0] of the "Flags" field as defined in cxl
> +# spec r3.1 Table 7-71.
> +#
> +# @forced-removal: Bit[4] of the "Flags" field in cxl spec r3.1
> +# Table 7-71. When set, device does not wait for a Release
> +# Dynamic Capacity command from the host. Host immediately
> +# loses access to released capacity.
> +#
> +# @sanitize-on-release: Bit[5] of the "Flags" field in cxl spec r3.1
> +# Table 7-71. When set, device should sanitize all released
> +# capacity as a result of this request.
> +#
> +# @region: The "Region Number" field as defined in cxl spec r3.1
> +# Table 7-71. The dynamic capacity region where the capacity
> +# is being added. Valid range is from 0-7.
> +#
> +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
> +#
> +# @extents: The "Extent List" field as defined in cxl spec r3.1
> +# Table 7-71.
> +#
> +# Since : 9.1
> +##
> +{ 'command': 'cxl-release-dynamic-capacity',
> + 'data': { 'path': 'str',
> + 'host-id': 'uint16',
> + 'removal-policy': 'CXLExtRemovalPolicy',
> + '*forced-removal': 'bool',
> + '*sanitize-on-release': 'bool',
> + 'region': 'uint8',
> + '*tag': 'str',
> + 'extents': [ 'CXLDynamicCapacityExtent' ]
> + }
> +}
Although tag-based removal is not implemented yet, I wanted to leave a
comment here: exact extents are not needed for tag-based removal, so
`extents` should be an optional parameter. This is my understanding from
reading the spec, so I might be wrong; shout if you think it does not
make sense.
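To make the suggestion concrete, here is a minimal sketch of the argument
check it implies: an extent list is required for prescriptive removal and
may be left out for tag-based removal. This is only an illustration under
stated assumptions. `RemovalPolicy`, `check_release_args()`, and the
`extents_present` flag are hypothetical names, not part of the patch, and
how QAPI would represent an omitted `extents` member is not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the QAPI-generated CXLExtRemovalPolicy enum. */
typedef enum {
    REMOVAL_POLICY_TAG_BASED = 0,
    REMOVAL_POLICY_PRESCRIPTIVE = 1,
} RemovalPolicy;

/*
 * Return NULL if the argument combination is acceptable, otherwise a
 * static error string. extents_present is an assumed flag saying whether
 * the caller supplied an extent list at all.
 */
const char *check_release_args(RemovalPolicy policy,
                               bool extents_present, size_t num_extents,
                               const char *tag)
{
    switch (policy) {
    case REMOVAL_POLICY_PRESCRIPTIVE:
        /* The extent list is the whole request, so it must be there. */
        if (!extents_present || num_extents == 0) {
            return "prescriptive removal needs an extent list";
        }
        return NULL;
    case REMOVAL_POLICY_TAG_BASED:
        /* The device picks extents by tag; no extent list is required. */
        if (tag == NULL || tag[0] == '\0') {
            return "tag-based removal needs a tag";
        }
        return NULL;
    default:
        return "removal policy not supported";
    }
}
```

With a check along these lines, a tag-based request that carries only a tag
would pass, while a prescriptive request without extents would be rejected
before any event is generated.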
* Re: [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
2025-09-02 10:39 ` Alireza Sanaee
@ 2025-09-02 15:59 ` Ira Weiny
2025-09-04 8:44 ` Alireza Sanaee
0 siblings, 1 reply; 28+ messages in thread
From: Ira Weiny @ 2025-09-02 15:59 UTC (permalink / raw)
To: Alireza Sanaee, nifan.cxl
Cc: qemu-devel, jonathan.cameron, linux-cxl, gregory.price, ira.weiny,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni, Svetly Todorov
Alireza Sanaee wrote:
> On Thu, 23 May 2024 10:44:51 -0700
> nifan.cxl@gmail.com wrote:
>
> > From: Fan Ni <fan.ni@samsung.com>
> >
> > To simulate FM functionalities for initiating Dynamic Capacity Add
> > (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL
> > spec r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces
> > to issue add/release dynamic capacity extents requests.
> >
> > With the change, we allow to release an extent only when its DPA range
> > is contained by a single accepted extent in the device. That is to
> > say, extent superset release is not supported yet.
> >
> > 1. Add dynamic capacity extents:
> >
> > For example, the command to add two continuous extents (each 128MiB
> > long) to region 0 (starting at DPA offset 0) looks like below:
> >
> > { "execute": "qmp_capabilities" }
> >
> > { "execute": "cxl-add-dynamic-capacity",
> > "arguments": {
> > "path": "/machine/peripheral/cxl-dcd0",
> > "host-id": 0,
> > "selection-policy": "prescriptive",
> > "region": 0,
> > "extents": [
> > {
> > "offset": 0,
> > "len": 134217728
> > },
> > {
> > "offset": 134217728,
> > "len": 134217728
> > }
> > ]
> > }
> > }
> >
> > 2. Release dynamic capacity extents:
> >
> > For example, the command to release an extent of size 128MiB from
> > region 0 (DPA offset 128MiB) looks like below:
> >
> > { "execute": "cxl-release-dynamic-capacity",
> > "arguments": {
> > "path": "/machine/peripheral/cxl-dcd0",
> > "host-id": 0,
> > "removal-policy":"prescriptive",
> > "region": 0,
> > "extents": [
> > {
> > "offset": 134217728,
> > "len": 134217728
> > }
> > ]
> > }
> > }
> >
> > Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
> > Reviewed-by: Gregory Price <gregory.price@memverge.com>
> > Signed-off-by: Fan Ni <fan.ni@samsung.com>
[snip]
> > +##
> > +# @cxl-release-dynamic-capacity:
> > +#
> > +# Command to initiate to release dynamic capacity extents from a
> > +# host. It simulates operations defined in cxl spec r3.1 7.6.7.6.6.
> > +#
> > +# @path: CXL DCD canonical QOM path.
> > +#
> > +# @host-id: The "Host ID" field as defined in cxl spec r3.1
> > +# Table 7-71.
> > +#
> > +# @removal-policy: Bit[3:0] of the "Flags" field as defined in cxl
> > +# spec r3.1 Table 7-71.
> > +#
> > +# @forced-removal: Bit[4] of the "Flags" field in cxl spec r3.1
> > +# Table 7-71. When set, device does not wait for a Release
> > +# Dynamic Capacity command from the host. Host immediately
> > +# loses access to released capacity.
> > +#
> > +# @sanitize-on-release: Bit[5] of the "Flags" field in cxl spec r3.1
> > +# Table 7-71. When set, device should sanitize all released
> > +# capacity as a result of this request.
> > +#
> > +# @region: The "Region Number" field as defined in cxl spec r3.1
> > +# Table 7-71. The dynamic capacity region where the capacity
> > +# is being added. Valid range is from 0-7.
> > +#
> > +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
> > +#
> > +# @extents: The "Extent List" field as defined in cxl spec r3.1
> > +# Table 7-71.
> > +#
> > +# Since : 9.1
> > +##
> > +{ 'command': 'cxl-release-dynamic-capacity',
> > + 'data': { 'path': 'str',
> > + 'host-id': 'uint16',
> > + 'removal-policy': 'CXLExtRemovalPolicy',
> > + '*forced-removal': 'bool',
> > + '*sanitize-on-release': 'bool',
> > + 'region': 'uint8',
> > + '*tag': 'str',
> > + 'extents': [ 'CXLDynamicCapacityExtent' ]
> > + }
> > +}
>
> Although tag-based removal is not implemented yet, I wanted to leave a
> comment here: exact extents are not needed for tag-based removal, so
> `extents` should be an optional parameter. This is my understanding from
> reading the spec, so I might be wrong; shout if you think it does not
> make sense.
It's been a while, but I think this allows the removal of non-tagged
extents as well(?). If so, the tag would be NULL (or an empty string) and
one can remove a regular extent.
But I could be misremembering something,
Ira
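As a rough illustration of the NULL-or-empty convention described above,
the sketch below treats both forms as "non-tagged" and produces an all-zero
16-byte tag field, which is what the patch currently writes for every
extent. The helper names `dc_tag_is_absent()` and `dc_tag_fill()` are
hypothetical, and the byte copy used for a present tag is a placeholder
encoding assumed for this example only; the series does not define how a
tag string maps onto the raw field.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DC_TAG_LEN 0x10 /* same size as the 0x10-byte memset in the patch */

/* An omitted (NULL) tag and an empty string both mean "non-tagged". */
bool dc_tag_is_absent(const char *tag)
{
    return tag == NULL || tag[0] == '\0';
}

/*
 * Fill the raw tag field. Non-tagged extents get an all-zero tag. Copying
 * the string bytes for a present tag is an assumption of this sketch.
 */
void dc_tag_fill(uint8_t out[DC_TAG_LEN], const char *tag)
{
    memset(out, 0, DC_TAG_LEN);
    if (!dc_tag_is_absent(tag)) {
        size_t n = strlen(tag);
        /* Truncate to the field size; shorter tags stay zero-padded. */
        memcpy(out, tag, n < DC_TAG_LEN ? n : DC_TAG_LEN);
    }
}
```

Under this convention, releasing a regular extent is just a prescriptive
request whose tag field stays zeroed.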
* Re: [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
2025-09-02 15:59 ` Ira Weiny
@ 2025-09-04 8:44 ` Alireza Sanaee
0 siblings, 0 replies; 28+ messages in thread
From: Alireza Sanaee @ 2025-09-04 8:44 UTC (permalink / raw)
To: Ira Weiny
Cc: nifan.cxl, qemu-devel, jonathan.cameron, linux-cxl, gregory.price,
dan.j.williams, a.manzanares, dave, nmtadam.samsung, jim.harris,
Jorgen.Hansen, wj28.lee, armbru, mst, Fan Ni, Svetly Todorov
On Tue, 2 Sep 2025 10:59:35 -0500
Ira Weiny <ira.weiny@intel.com> wrote:
> Alireza Sanaee wrote:
> > On Thu, 23 May 2024 10:44:51 -0700
> > nifan.cxl@gmail.com wrote:
> >
> > > From: Fan Ni <fan.ni@samsung.com>
> > >
> > > To simulate FM functionalities for initiating Dynamic Capacity Add
> > > (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in
> > > CXL spec r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP
> > > interfaces to issue add/release dynamic capacity extents requests.
> > >
> > > With the change, we allow to release an extent only when its DPA
> > > range is contained by a single accepted extent in the device.
> > > That is to say, extent superset release is not supported yet.
> > >
> > > 1. Add dynamic capacity extents:
> > >
> > > For example, the command to add two continuous extents (each
> > > 128MiB long) to region 0 (starting at DPA offset 0) looks like
> > > below:
> > >
> > > { "execute": "qmp_capabilities" }
> > >
> > > { "execute": "cxl-add-dynamic-capacity",
> > > "arguments": {
> > > "path": "/machine/peripheral/cxl-dcd0",
> > > "host-id": 0,
> > > "selection-policy": "prescriptive",
> > > "region": 0,
> > > "extents": [
> > > {
> > > "offset": 0,
> > > "len": 134217728
> > > },
> > > {
> > > "offset": 134217728,
> > > "len": 134217728
> > > }
> > > ]
> > > }
> > > }
> > >
> > > 2. Release dynamic capacity extents:
> > >
> > > For example, the command to release an extent of size 128MiB from
> > > region 0 (DPA offset 128MiB) looks like below:
> > >
> > > { "execute": "cxl-release-dynamic-capacity",
> > > "arguments": {
> > > "path": "/machine/peripheral/cxl-dcd0",
> > > "host-id": 0,
> > > "removal-policy":"prescriptive",
> > > "region": 0,
> > > "extents": [
> > > {
> > > "offset": 134217728,
> > > "len": 134217728
> > > }
> > > ]
> > > }
> > > }
> > >
> > > Tested-by: Svetly Todorov <svetly.todorov@memverge.com>
> > > Reviewed-by: Gregory Price <gregory.price@memverge.com>
> > > Signed-off-by: Fan Ni <fan.ni@samsung.com>
>
> [snip]
>
> > > +##
> > > +# @cxl-release-dynamic-capacity:
> > > +#
> > > +# Command to initiate to release dynamic capacity extents from a
> > > +# host. It simulates operations defined in cxl spec r3.1 7.6.7.6.6.
> > > +#
> > > +# @path: CXL DCD canonical QOM path.
> > > +#
> > > +# @host-id: The "Host ID" field as defined in cxl spec r3.1
> > > +# Table 7-71.
> > > +#
> > > +# @removal-policy: Bit[3:0] of the "Flags" field as defined in cxl
> > > +# spec r3.1 Table 7-71.
> > > +#
> > > +# @forced-removal: Bit[4] of the "Flags" field in cxl spec r3.1
> > > +# Table 7-71. When set, device does not wait for a Release
> > > +# Dynamic Capacity command from the host. Host immediately
> > > +# loses access to released capacity.
> > > +#
> > > +# @sanitize-on-release: Bit[5] of the "Flags" field in cxl spec r3.1
> > > +# Table 7-71. When set, device should sanitize all released
> > > +# capacity as a result of this request.
> > > +#
> > > +# @region: The "Region Number" field as defined in cxl spec r3.1
> > > +# Table 7-71. The dynamic capacity region where the capacity
> > > +# is being added. Valid range is from 0-7.
> > > +#
> > > +# @tag: The "Tag" field as defined in cxl spec r3.1 Table 7-71.
> > > +#
> > > +# @extents: The "Extent List" field as defined in cxl spec r3.1
> > > +# Table 7-71.
> > > +#
> > > +# Since : 9.1
> > > +##
> > > +{ 'command': 'cxl-release-dynamic-capacity',
> > > + 'data': { 'path': 'str',
> > > + 'host-id': 'uint16',
> > > + 'removal-policy': 'CXLExtRemovalPolicy',
> > > + '*forced-removal': 'bool',
> > > + '*sanitize-on-release': 'bool',
> > > + 'region': 'uint8',
> > > + '*tag': 'str',
> > > + 'extents': [ 'CXLDynamicCapacityExtent' ]
> > > + }
> > > +}
> >
> > Although tag-based removal is not implemented yet, I wanted to leave a
> > comment here: exact extents are not needed for tag-based removal, so
> > `extents` should be an optional parameter. This is my understanding
> > from reading the spec, so I might be wrong; shout if you think it does
> > not make sense.
>
> It's been a while, but I think this allows the removal of non-tagged
> extents as well(?). If so, the tag would be NULL (or an empty string) and
> one can remove a regular extent.
>
> But I could be misremembering something,
> Ira
>
Yes, non-tagged removal is working completely.
end of thread, other threads:[~2025-09-04 8:45 UTC | newest]
Thread overview: 28+ messages
2024-05-23 17:44 [PATCH v8 00/14] Enabling DCD emulation support in Qemu nifan.cxl
2024-05-23 17:44 ` [PATCH v8 01/14] hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference nifan.cxl
2024-05-23 17:44 ` [PATCH v8 02/14] hw/cxl/mailbox: interface to add CCI commands to an existing CCI nifan.cxl
2024-05-23 17:44 ` [PATCH v8 03/14] hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output payload of identify memory device command nifan.cxl
2024-05-23 17:44 ` [PATCH v8 04/14] hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and mailbox command support nifan.cxl
2024-05-23 17:44 ` [PATCH v8 05/14] include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for type3 memory devices nifan.cxl
2024-05-23 17:44 ` [PATCH v8 06/14] hw/mem/cxl_type3: Add support to create DC regions to " nifan.cxl
2024-05-27 7:42 ` Zhijian Li (Fujitsu)
2024-05-23 17:44 ` [PATCH v8 07/14] hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr size instead of mr as argument nifan.cxl
2024-05-23 17:44 ` [PATCH v8 08/14] hw/mem/cxl_type3: Add host backend and address space handling for DC regions nifan.cxl
2024-06-03 12:27 ` Jonathan Cameron
2024-06-03 15:04 ` Michael S. Tsirkin
2024-06-03 17:27 ` Jonathan Cameron
2024-05-23 17:44 ` [PATCH v8 09/14] hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support nifan.cxl
2024-05-23 17:44 ` [PATCH v8 10/14] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response nifan.cxl
2024-05-23 17:44 ` [PATCH v8 11/14] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents nifan.cxl
2024-06-04 7:12 ` Markus Armbruster
2024-06-04 11:55 ` Jonathan Cameron
2024-06-04 14:49 ` Markus Armbruster
2025-09-02 10:39 ` Alireza Sanaee
2025-09-02 15:59 ` Ira Weiny
2025-09-04 8:44 ` Alireza Sanaee
2024-05-23 17:44 ` [PATCH v8 12/14] hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions nifan.cxl
2024-05-23 17:44 ` [PATCH v8 13/14] hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support nifan.cxl
2024-05-23 17:44 ` [PATCH v8 14/14] hw/mem/cxl_type3: Allow to release extent superset in QMP interface nifan.cxl
2024-06-03 13:51 ` [PATCH v8 00/14] Enabling DCD emulation support in Qemu Jonathan Cameron
2025-06-25 14:22 ` Alireza Sanaee
2025-06-26 16:39 ` Fan Ni