From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: <shiju.jose@huawei.com>
Cc: <qemu-devel@nongnu.org>, <linux-cxl@vger.kernel.org>,
<armbru@redhat.com>, <dave@stgolabs.net>, <linuxarm@huawei.com>
Subject: Re: [PATCH v6 8/8] hw/cxl: Add emulation for memory sparing control feature
Date: Mon, 11 Aug 2025 15:38:28 +0100 [thread overview]
Message-ID: <20250811153828.00003283@huawei.com> (raw)
In-Reply-To: <20250811085530.2263-9-shiju.jose@huawei.com>
On Mon, 11 Aug 2025 09:55:30 +0100
<shiju.jose@huawei.com> wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Memory sparing is defined as a repair function that replaces a portion of
> memory with a portion of functional memory at that same DPA. The
> subclasses for this operation vary in terms of the scope of the sparing
> being performed. The Cacheline sparing subclass refers to a sparing
> action that can replace a full cacheline. Row sparing is provided as an
> alternative to PPR sparing functions and its scope is that of a single
> DDR row. Bank sparing allows an entire bank to be replaced. Rank sparing
> is defined as an operation in which an entire DDR rank is replaced.
>
> Memory sparing maintenance operations may be supported by CXL devices
> that implement CXL.mem protocol. A sparing maintenance operation requests
> the CXL device to perform a repair operation on its media.
> For example, a CXL device with DRAM components that support memory sparing
> features may implement sparing Maintenance operations.
>
> The host may issue a query command by setting Query Resources flag in the
> Input Payload (CXL Spec 3.2 Table 8-120) to determine availability of
> sparing resources for a given address. In response to a query request,
> the device shall report the resource availability by producing the Memory
> Sparing Event Record (CXL Spec 3.2 Table 8-60) in which the Channel, Rank,
> Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields are a copy
> of the values specified in the request.
>
> During the execution of a sparing maintenance operation, a CXL memory
> device:
> - May or may not retain data
> - May or may not be able to process CXL.mem requests correctly.
> These CXL memory device capabilities are specified by restriction flags
> in the memory sparing feature readable attributes.
>
> When a CXL device identifies error on a memory component, the device
> may inform the host about the need for a memory sparing maintenance
> operation by using DRAM event record, where the 'maintenance needed' flag
> may set. The event record contains some of the DPA, Channel, Rank,
> Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields that
> should be repaired. The userspace tool requests for maintenance operation
> if the 'maintenance needed' flag set in the CXL DRAM error record.
>
> CXL spec 3.2 section 8.2.10.7.2.3 describes the memory sparing feature
> discovery and configuration.
>
> CXL spec 3.2 section 8.2.10.7.1.4 describes the device's memory sparing
> maintenance operation feature.
>
> Add emulation for CXL memory device memory sparing control feature
> and memory sparing maintenance operation command.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Hi Shiju,
A few comments inline. I'll fix up whilst applying to the CXL staging tree.
> ---
> hw/cxl/cxl-mailbox-utils.c | 317 +++++++++++++++++++++++++++++++++++-
> hw/mem/cxl_type3.c | 44 +++++
> include/hw/cxl/cxl_device.h | 33 ++++
> include/hw/cxl/cxl_events.h | 5 +
> 4 files changed, 395 insertions(+), 4 deletions(-)
>
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 254154ceda..a3ab3b48f2 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> +static void cxl_create_mem_sparing_event_records(CXLType3Dev *ct3d,
> uint8_t maint_op_class, uint8_t maint_op_sub_class,
> - CXLMaintenance *ent)
> + CXLMaintenance *ent,
> + CXLMemSparingMaintInPayload *sparing_pi)
> {
> CXLEventSparing event_rec = {};
>
> @@ -1615,6 +1852,31 @@ static void cxl_mbox_create_mem_sparing_event_records(CXLType3Dev *ct3d,
> strncpy((char *)event_rec.component_id, (char *)ent->component_id,
> sizeof(event_rec.component_id));
> }
> + } else if (sparing_pi) {
> + event_rec.flags = CXL_MSER_FLAGS_QUERY_RESOURCES;
> + event_rec.result = 0;
> + event_rec.validity_flags = CXL_MSER_VALID_CHANNEL |
> + CXL_MSER_VALID_RANK |
> + CXL_MSER_VALID_NIB_MASK |
> + CXL_MSER_VALID_BANK_GROUP |
> + CXL_MSER_VALID_BANK |
> + CXL_MSER_VALID_ROW |
> + CXL_MSER_VALID_COLUMN;
> + event_rec.res_avail = 1;
> + event_rec.channel = sparing_pi->channel;
> + event_rec.rank = sparing_pi->rank;
> + if (sparing_pi->flags & CXL_MEM_SPARING_FLAGS_NIB_MASK_VALID) {
> + __builtin_memcpy(event_rec.nibble_mask, sparing_pi->nibble_mask,
Why not memcpy()?
Also, wrong size - should be sizeof(sparing_pi->nibble_mask);
> + sizeof(uint32_t));
> + }
> + event_rec.bank_group = sparing_pi->bank_group;
> + event_rec.bank = sparing_pi->bank;
> + event_rec.column = sparing_pi->column;
> + __builtin_memcpy(event_rec.row, sparing_pi->row, sizeof(uint32_t));
As above.
> + if (sparing_pi->flags & CXL_MEM_SPARING_FLAGS_SUB_CHANNEL_VALID) {
> + event_rec.sub_channel = sparing_pi->sub_channel;
> + event_rec.validity_flags |= CXL_MSER_VALID_SUB_CHANNEL;
> + }
> } else {
> return;
> }
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 959049aa46..59f8fa7c24 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> +#define CXL_MEMDEV_SPARING_GET_FEATURE_VERSION 0x01
> +#define CXL_MEMDEV_SPARING_SET_FEATURE_VERSION 0x01
> +#define CXL_MEMDEV_SPARING_SAFE_IN_USE_FLAG BIT(0)
> +#define CXL_MEMDEV_HARD_SPARING_SUPPORT_FLAG BIT(1)
> +#define CXL_MEMDEV_SOFT_SPARING_SUPPORT_FLAG BIT(2)
Odd alignment. If going for aligned values, make sure they all
align fully. This seems to be 4 spaces, which doesn't relaly
help anything.
WARNING: multiple messages have this Message-ID (diff)
From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: <shiju.jose@huawei.com>
Cc: <qemu-devel@nongnu.org>, <linux-cxl@vger.kernel.org>,
<armbru@redhat.com>, <dave@stgolabs.net>, <linuxarm@huawei.com>
Subject: Re: [PATCH v6 8/8] hw/cxl: Add emulation for memory sparing control feature
Date: Mon, 11 Aug 2025 15:38:28 +0100 [thread overview]
Message-ID: <20250811153828.00003283@huawei.com> (raw)
In-Reply-To: <20250811085530.2263-9-shiju.jose@huawei.com>
On Mon, 11 Aug 2025 09:55:30 +0100
<shiju.jose@huawei.com> wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Memory sparing is defined as a repair function that replaces a portion of
> memory with a portion of functional memory at that same DPA. The
> subclasses for this operation vary in terms of the scope of the sparing
> being performed. The Cacheline sparing subclass refers to a sparing
> action that can replace a full cacheline. Row sparing is provided as an
> alternative to PPR sparing functions and its scope is that of a single
> DDR row. Bank sparing allows an entire bank to be replaced. Rank sparing
> is defined as an operation in which an entire DDR rank is replaced.
>
> Memory sparing maintenance operations may be supported by CXL devices
> that implement CXL.mem protocol. A sparing maintenance operation requests
> the CXL device to perform a repair operation on its media.
> For example, a CXL device with DRAM components that support memory sparing
> features may implement sparing Maintenance operations.
>
> The host may issue a query command by setting Query Resources flag in the
> Input Payload (CXL Spec 3.2 Table 8-120) to determine availability of
> sparing resources for a given address. In response to a query request,
> the device shall report the resource availability by producing the Memory
> Sparing Event Record (CXL Spec 3.2 Table 8-60) in which the Channel, Rank,
> Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields are a copy
> of the values specified in the request.
>
> During the execution of a sparing maintenance operation, a CXL memory
> device:
> - May or may not retain data
> - May or may not be able to process CXL.mem requests correctly.
> These CXL memory device capabilities are specified by restriction flags
> in the memory sparing feature readable attributes.
>
> When a CXL device identifies error on a memory component, the device
> may inform the host about the need for a memory sparing maintenance
> operation by using DRAM event record, where the 'maintenance needed' flag
> may set. The event record contains some of the DPA, Channel, Rank,
> Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields that
> should be repaired. The userspace tool requests for maintenance operation
> if the 'maintenance needed' flag set in the CXL DRAM error record.
>
> CXL spec 3.2 section 8.2.10.7.2.3 describes the memory sparing feature
> discovery and configuration.
>
> CXL spec 3.2 section 8.2.10.7.1.4 describes the device's memory sparing
> maintenance operation feature.
>
> Add emulation for CXL memory device memory sparing control feature
> and memory sparing maintenance operation command.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Hi Shiju,
A few comments inline. I'll fix up whilst applying to the CXL staging tree.
> ---
> hw/cxl/cxl-mailbox-utils.c | 317 +++++++++++++++++++++++++++++++++++-
> hw/mem/cxl_type3.c | 44 +++++
> include/hw/cxl/cxl_device.h | 33 ++++
> include/hw/cxl/cxl_events.h | 5 +
> 4 files changed, 395 insertions(+), 4 deletions(-)
>
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 254154ceda..a3ab3b48f2 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> +static void cxl_create_mem_sparing_event_records(CXLType3Dev *ct3d,
> uint8_t maint_op_class, uint8_t maint_op_sub_class,
> - CXLMaintenance *ent)
> + CXLMaintenance *ent,
> + CXLMemSparingMaintInPayload *sparing_pi)
> {
> CXLEventSparing event_rec = {};
>
> @@ -1615,6 +1852,31 @@ static void cxl_mbox_create_mem_sparing_event_records(CXLType3Dev *ct3d,
> strncpy((char *)event_rec.component_id, (char *)ent->component_id,
> sizeof(event_rec.component_id));
> }
> + } else if (sparing_pi) {
> + event_rec.flags = CXL_MSER_FLAGS_QUERY_RESOURCES;
> + event_rec.result = 0;
> + event_rec.validity_flags = CXL_MSER_VALID_CHANNEL |
> + CXL_MSER_VALID_RANK |
> + CXL_MSER_VALID_NIB_MASK |
> + CXL_MSER_VALID_BANK_GROUP |
> + CXL_MSER_VALID_BANK |
> + CXL_MSER_VALID_ROW |
> + CXL_MSER_VALID_COLUMN;
> + event_rec.res_avail = 1;
> + event_rec.channel = sparing_pi->channel;
> + event_rec.rank = sparing_pi->rank;
> + if (sparing_pi->flags & CXL_MEM_SPARING_FLAGS_NIB_MASK_VALID) {
> + __builtin_memcpy(event_rec.nibble_mask, sparing_pi->nibble_mask,
Why not memcpy()?
Also, wrong size - should be sizeof(sparing_pi->nibble_mask);
> + sizeof(uint32_t));
> + }
> + event_rec.bank_group = sparing_pi->bank_group;
> + event_rec.bank = sparing_pi->bank;
> + event_rec.column = sparing_pi->column;
> + __builtin_memcpy(event_rec.row, sparing_pi->row, sizeof(uint32_t));
As above.
> + if (sparing_pi->flags & CXL_MEM_SPARING_FLAGS_SUB_CHANNEL_VALID) {
> + event_rec.sub_channel = sparing_pi->sub_channel;
> + event_rec.validity_flags |= CXL_MSER_VALID_SUB_CHANNEL;
> + }
> } else {
> return;
> }
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 959049aa46..59f8fa7c24 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> +#define CXL_MEMDEV_SPARING_GET_FEATURE_VERSION 0x01
> +#define CXL_MEMDEV_SPARING_SET_FEATURE_VERSION 0x01
> +#define CXL_MEMDEV_SPARING_SAFE_IN_USE_FLAG BIT(0)
> +#define CXL_MEMDEV_HARD_SPARING_SUPPORT_FLAG BIT(1)
> +#define CXL_MEMDEV_SOFT_SPARING_SUPPORT_FLAG BIT(2)
Odd alignment. If going for aligned values, make sure they all
align fully. This seems to be 4 spaces, which doesn't relaly
help anything.
next prev parent reply other threads:[~2025-08-11 14:38 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-11 8:55 [PATCH v6 0/8] hw/cxl: Update CXL events to rev3.2 and add maintenance support for memory repair features shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 8:55 ` [PATCH v6 1/8] qapi: cxl: Refactor CXL event injection for common commands arguments shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 11:06 ` Jonathan Cameron
2025-08-11 11:06 ` Jonathan Cameron via
2025-08-11 8:55 ` [PATCH v6 2/8] hw/cxl/events: Update for rev3.2 common event record format shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 8:55 ` [PATCH v6 3/8] hw/cxl/events: Updates for rev3.2 general media event record shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 8:55 ` [PATCH v6 4/8] hw/cxl/events: Updates for rev3.2 DRAM " shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 8:55 ` [PATCH v6 5/8] hw/cxl/events: Updates for rev3.2 memory module " shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 8:55 ` [PATCH v6 6/8] hw/cxl/cxl-mailbox-utils: Move declaration of scrub and ECS feature attributes in cmd_features_set_feature() shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 8:55 ` [PATCH v6 7/8] hw/cxl: Add support for Maintenance command and Post Package Repair (PPR) shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 14:32 ` Jonathan Cameron
2025-08-11 14:32 ` Jonathan Cameron via
2025-08-11 8:55 ` [PATCH v6 8/8] hw/cxl: Add emulation for memory sparing control feature shiju.jose
2025-08-11 8:55 ` shiju.jose--- via
2025-08-11 14:38 ` Jonathan Cameron [this message]
2025-08-11 14:38 ` Jonathan Cameron via
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250811153828.00003283@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=armbru@redhat.com \
--cc=dave@stgolabs.net \
--cc=linux-cxl@vger.kernel.org \
--cc=linuxarm@huawei.com \
--cc=qemu-devel@nongnu.org \
--cc=shiju.jose@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.