From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: <shiju.jose@huawei.com>
Cc: <linux-cxl@vger.kernel.org>, <dan.j.williams@intel.com>,
<dave.jiang@intel.com>, <alison.schofield@intel.com>,
<dave@stgolabs.net>, <vishal.l.verma@intel.com>,
<ira.weiny@intel.com>, <tanxiaofei@huawei.com>,
<prime.zeng@hisilicon.com>, <linuxarm@huawei.com>
Subject: Re: [PATCH 4/4] cxl/events: Trace Memory Sparing Event Record
Date: Wed, 16 Jul 2025 14:16:44 +0100
Message-ID: <20250716141644.00000347@huawei.com>
In-Reply-To: <20250716104945.2002-5-shiju.jose@huawei.com>
On Wed, 16 Jul 2025 11:49:45 +0100
<shiju.jose@huawei.com> wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> CXL rev 3.2 section 8.2.10.2.1.4 Table 8-60 defines the Memory Sparing
> Event Record.
>
> Determine whether the event read is a memory sparing record and, if so,
> trace the record.
>
> A memory device shall produce a memory sparing event record:
> 1. After completion of a PPR maintenance operation if the memory sparing
> event record enable bit is set (Field: sPPR/hPPR Operation Mode in
> Table 8-128/Table 8-131).
> 2. In response to a query request by the host (see section 8.2.10.7.1.4)
> to determine the availability of sparing resources.
> The device shall report the resource availability by producing the Memory
> Sparing Event Record (see Table 8-60) in which the channel, rank, nibble
> mask, bank group, bank, row, column, sub-channel fields are a copy of the
> values specified in the request. If the controller does not support
> reporting whether a resource is available, and a perform maintenance
> operation for memory sparing is issued with query resources set to 1, the
> controller shall return invalid input.
>
> Example trace log for a memory sparing event record produced on
> completion of a soft PPR operation:
> cxl_memory_sparing: memdev=mem1 host=0000:0f:00.0 serial=3
> log=Informational : time=55045163029
> uuid=e71f3a40-2d29-4092-8a39-4d1c966c7c65 len=128 flags='0x1' handle=1
> related_handle=0 maint_op_class=2 maint_op_sub_class=1
> ld_id=0 head_id=0 : flags='' result=0
> validity_flags='CHANNEL|RANK|NIBBLE|BANK GROUP|BANK|ROW|COLUMN'
> spare resource avail=1 channel=2 rank=5 nibble_mask=a59c bank_group=2
> bank=4 row=13 column=23 sub_channel=0
> comp_id=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> comp_id_pldm_valid_flags='' pldm_entity_id=0x00 pldm_resource_id=0x00
>
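As a rough illustration of how validity_flags maps to the string in the
example log above, here is a minimal user-space sketch. It is not kernel
code: mser_valid_flags_str() and the lookup table are hypothetical
stand-ins that only approximate what __print_flags() does, with bit
positions copied from the CXL_MSER_VALID_* defines in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit/name pairs mirroring the CXL_MSER_VALID_* defines in the patch. */
struct flag_name {
	uint16_t mask;
	const char *name;
};

static const struct flag_name mser_valid_names[] = {
	{ 1u << 0, "CHANNEL" },
	{ 1u << 1, "RANK" },
	{ 1u << 2, "NIBBLE" },
	{ 1u << 3, "BANK GROUP" },
	{ 1u << 4, "BANK" },
	{ 1u << 5, "ROW" },
	{ 1u << 6, "COLUMN" },
	{ 1u << 7, "COMPONENT ID" },
	{ 1u << 8, "COMPONENT ID PLDM FORMAT" },
	{ 1u << 9, "SUB CHANNEL" },
};

/*
 * Hypothetical helper (not a kernel API): join the names of all set
 * bits with '|', in table order. Unknown bits are silently ignored.
 */
static const char *mser_valid_flags_str(uint16_t flags, char *buf, size_t len)
{
	size_t count = sizeof(mser_valid_names) / sizeof(mser_valid_names[0]);
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		if (!(flags & mser_valid_names[i].mask))
			continue;

		int n = snprintf(buf + used, len - used, "%s%s",
				 used ? "|" : "", mser_valid_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output did not fit; stop appending */
		used += (size_t)n;
	}
	return buf;
}
```

With bits 0 to 6 set (0x7f), this produces the validity_flags string
shown in the example log.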
> Note: For memory sparing event record, fields 'maintenance operation
> class' and 'maintenance operation subclass' are defined twice, first
> in the common event record (Table 8-55) and second in the memory
> sparing event record (Table 8-60). Thus those in the sparing event
> record are coded as reserved, to be removed when the spec is updated.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
My only comment is formatting related.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> ---
> drivers/cxl/core/mbox.c | 6 +++
> drivers/cxl/core/trace.h | 100 +++++++++++++++++++++++++++++++++++++++
> drivers/cxl/cxlmem.h | 8 ++++
> include/cxl/event.h | 33 +++++++++++++
> 4 files changed, 147 insertions(+)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index c3cd871942c5..2c291fb1857c 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -888,6 +888,106 @@ TRACE_EVENT(cxl_memory_module,
> )
> );
>
> +#define CXL_MSER_QUERY_RESOURCE_FLAG BIT(0)
> +#define CXL_MSER_HARD_SPARING_FLAG BIT(1)
> +#define CXL_MSER_DEV_INITED_FLAG BIT(2)
> +#define show_mem_sparing_flags(flags) __print_flags(flags, "|", \
> + { CXL_MSER_QUERY_RESOURCE_FLAG, "Query Resources" }, \
> + { CXL_MSER_HARD_SPARING_FLAG, "Hard Sparing" }, \
> + { CXL_MSER_DEV_INITED_FLAG, "Device Initiated Sparing" } \
The spacing before the } is inconsistent on this last line. Copy whatever
the file already uses and, if that is itself inconsistent (which it is),
pick the most common option.
> +)
> +
> +#define CXL_MSER_VALID_CHANNEL BIT(0)
> +#define CXL_MSER_VALID_RANK BIT(1)
> +#define CXL_MSER_VALID_NIBBLE BIT(2)
> +#define CXL_MSER_VALID_BANK_GROUP BIT(3)
> +#define CXL_MSER_VALID_BANK BIT(4)
> +#define CXL_MSER_VALID_ROW BIT(5)
> +#define CXL_MSER_VALID_COLUMN BIT(6)
> +#define CXL_MSER_VALID_COMPONENT_ID BIT(7)
> +#define CXL_MSER_VALID_COMPONENT_ID_FORMAT BIT(8)
> +#define CXL_MSER_VALID_SUB_CHANNEL BIT(9)
> +#define show_mem_sparing_valid_flags(flags) __print_flags(flags, "|", \
> + { CXL_MSER_VALID_CHANNEL, "CHANNEL" }, \
> + { CXL_MSER_VALID_RANK, "RANK" }, \
> + { CXL_MSER_VALID_NIBBLE, "NIBBLE" }, \
> + { CXL_MSER_VALID_BANK_GROUP, "BANK GROUP" }, \
> + { CXL_MSER_VALID_BANK, "BANK" }, \
> + { CXL_MSER_VALID_ROW, "ROW" }, \
> + { CXL_MSER_VALID_COLUMN, "COLUMN" }, \
> + { CXL_MSER_VALID_COMPONENT_ID, "COMPONENT ID" }, \
> + { CXL_MSER_VALID_COMPONENT_ID_FORMAT, "COMPONENT ID PLDM FORMAT" }, \
> + { CXL_MSER_VALID_SUB_CHANNEL, "SUB CHANNEL" } \
> +)
> +
> +TRACE_EVENT(cxl_memory_sparing,
> +
> + TP_PROTO(const struct cxl_memdev *cxlmd, enum cxl_event_log_type log,
> + struct cxl_event_mem_sparing *rec),
> +
> + TP_ARGS(cxlmd, log, rec),
> +
> + TP_STRUCT__entry(
> + CXL_EVT_TP_entry
> +
> + /* Memory Sparing Event */
> + __field(u8, flags)
> + __field(u8, result)
> + __field(u16, validity_flags)
> + __field(u16, res_avail)
> + __field(u8, channel)
> + __field(u8, rank)
> + __field(u32, nibble_mask)
> + __field(u8, bank_group)
> + __field(u8, bank)
> + __field(u32, row)
> + __field(u16, column)
> + __field(u8, sub_channel)
> + __array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
> + ),
> +
> + TP_fast_assign(
> + CXL_EVT_TP_fast_assign(cxlmd, log, rec->hdr);
> + __entry->hdr_uuid = CXL_EVENT_MEM_SPARING_UUID;
> +
> + /* Memory Sparing Event */
> + __entry->flags = rec->flags;
> + __entry->result = rec->result;
> + __entry->validity_flags = le16_to_cpu(rec->validity_flags);
> + __entry->res_avail = le16_to_cpu(rec->res_avail);
> + __entry->channel = rec->channel;
> + __entry->rank = rec->rank;
> + __entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
> + __entry->bank_group = rec->bank_group;
> + __entry->bank = rec->bank;
> + __entry->row = get_unaligned_le24(rec->row);
> + __entry->column = le16_to_cpu(rec->column);
> + __entry->sub_channel = rec->sub_channel;
> + memcpy(__entry->comp_id, &rec->component_id,
> + CXL_EVENT_GEN_MED_COMP_ID_SIZE);
> + ),
> +
> + CXL_EVT_TP_printk("flags='%s' result=%u validity_flags='%s' " \
> + "spare resource avail=%u channel=%u rank=%u " \
> + "nibble_mask=%x bank_group=%u bank=%u " \
> + "row=%u column=%u sub_channel=%u " \
> + "comp_id=%s comp_id_pldm_valid_flags='%s' " \
> + "pldm_entity_id=%s pldm_resource_id=%s",
> + show_mem_sparing_flags(__entry->flags),
> + __entry->result,
> + show_mem_sparing_valid_flags(__entry->validity_flags),
> + __entry->res_avail, __entry->channel, __entry->rank,
> + __entry->nibble_mask, __entry->bank_group, __entry->bank,
> + __entry->row, __entry->column, __entry->sub_channel,
> + __print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
> + show_comp_id_pldm_flags(__entry->comp_id[0]),
> + show_pldm_entity_id(__entry->validity_flags, CXL_MSER_VALID_COMPONENT_ID,
> + CXL_MSER_VALID_COMPONENT_ID_FORMAT, __entry->comp_id),
> + show_pldm_resource_id(__entry->validity_flags, CXL_MSER_VALID_COMPONENT_ID,
> + CXL_MSER_VALID_COMPONENT_ID_FORMAT, __entry->comp_id)
> + )
> +);
> +
> #define show_poison_trace_type(type) \
> __print_symbolic(type, \
> { CXL_POISON_TRACE_LIST, "List" }, \
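For readers unfamiliar with the 24-bit fields, the nibble_mask and row
assignments in TP_fast_assign() decode three little-endian bytes into a
32-bit value. A minimal user-space sketch of that decoding follows. It
assumes the record stores these fields as 3-byte arrays (the struct
definition is not quoted here), and le24_to_u32() is a hypothetical name,
not the kernel's get_unaligned_le24() itself.

```c
#include <stdint.h>

/*
 * Hypothetical user-space equivalent of a 24-bit little-endian read:
 * byte 0 is the least significant, byte 2 the most significant.
 * Reading byte by byte avoids any alignment requirement.
 */
static uint32_t le24_to_u32(const uint8_t p[3])
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16);
}
```

Under that assumption, the bytes 9c a5 00 decode to 0xa59c, which matches
nibble_mask=a59c in the example log.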