* [PATCH] cxl/events: Update memory event type for patrol scrub cycle end event
@ 2025-12-10 13:12 shiju.jose
2025-12-10 14:47 ` Dave Jiang
0 siblings, 1 reply; 3+ messages in thread
From: shiju.jose @ 2025-12-10 13:12 UTC (permalink / raw)
To: linux-cxl, dan.j.williams, dave.jiang, jonathan.cameron,
alison.schofield, dave, vishal.l.verma, ira.weiny
Cc: tanxiaofei, prime.zeng, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
According to the CXL Specification Revision 4.0, Advanced CVME (Corrected
Volatile Memory Error) enhancements added additional granularity control
and event generation for Patrol Scrub cycle end.
Update Memory Event Type field in the trace events for section
8.2.10.2.1.1, Table 8-224 (General Media Event Record), and section
8.2.10.2.1.2, Table 8-225 (DRAM Event Record), to include the event type
'Patrol Scrub cycle end'.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
Open question:
The option to enable event generation for 'Patrol Scrub cycle end' is provided by
the Memory Error Threshold feature, Section 8.2.10.9.11.3 Advanced
Programmable Corrected Volatile Memory Error Threshold Feature Discovery
and Configuration. Should support for this Memory Error Threshold feature be
added in the kernel, or handled via fwctl?
---
drivers/cxl/core/trace.h | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index a972e4ef1936..e79c2bd415af 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -367,6 +367,7 @@ TRACE_EVENT(cxl_generic_event,
#define CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR 0x04
#define CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE 0x05
#define CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION 0x06
+#define CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END 0x07
#define show_gmer_mem_event_type(type) __print_symbolic(type, \
{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR, "Invalid Address" }, \
@@ -374,7 +375,8 @@ TRACE_EVENT(cxl_generic_event,
{ CXL_GMER_MEM_EVT_TYPE_TE_STATE_VIOLATION, "TE State Violation" }, \
{ CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" }, \
{ CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" }, \
- { CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" } \
+ { CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" }, \
+ { CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END, "Adv Prog CME Patrol Scrub Cycle End" } \
)
#define CXL_GMER_TRANS_UNKNOWN 0x00
@@ -554,6 +556,7 @@ TRACE_EVENT(cxl_general_media,
#define CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION 0x04
#define CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE 0x05
#define CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION 0x06
+#define CXL_DER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END 0x07
#define show_dram_mem_event_type(type) __print_symbolic(type, \
{ CXL_DER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
{ CXL_DER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" }, \
@@ -561,7 +564,8 @@ TRACE_EVENT(cxl_general_media,
{ CXL_DER_MEM_EVT_TYPE_DATA_PATH_ERROR, "Data Path Error" }, \
{ CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION, "TE State Violation" }, \
{ CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" }, \
- { CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" } \
+ { CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" }, \
+ { CXL_DER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END, "Adv Prog CME Patrol Scrub Cycle End" } \
)
#define CXL_DER_VALID_CHANNEL BIT(0)
--
2.43.0
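For readers unfamiliar with __print_symbolic(), the following is a minimal
user-space sketch of the value-to-string lookup that the tables in this patch
feed. It is not the kernel implementation: the struct, the table name, and the
helper decode_gmer_mem_event_type() are hypothetical and exist only for
illustration. Only the four General Media values visible in the hunks above
are included; the hex fallback for unlisted values is an assumption about how
unmatched types are rendered.

```c
#include <stddef.h>
#include <stdio.h>

/* Values copied from the hunks above; entries not shown in the diff are omitted. */
#define CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR	0x04
#define CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE	0x05
#define CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION		0x06
#define CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END	0x07

/* Hypothetical stand-in for the { value, "name" } pairs passed to __print_symbolic(). */
struct sym_entry {
	unsigned long value;
	const char *name;
};

static const struct sym_entry gmer_mem_event_types[] = {
	{ CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" },
	{ CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" },
	{ CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" },
	{ CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END, "Adv Prog CME Patrol Scrub Cycle End" },
};

/*
 * Hypothetical helper: return the symbolic name for a Memory Event Type,
 * or format the raw value as hex into buf when the table has no entry.
 */
static const char *decode_gmer_mem_event_type(unsigned long type,
					      char *buf, size_t len)
{
	size_t n = sizeof(gmer_mem_event_types) / sizeof(gmer_mem_event_types[0]);

	for (size_t i = 0; i < n; i++) {
		if (gmer_mem_event_types[i].value == type)
			return gmer_mem_event_types[i].name;
	}

	snprintf(buf, len, "0x%lx", type);
	return buf;
}
```

With the table extended as in this patch, a record carrying Memory Event Type
0x07 decodes to "Adv Prog CME Patrol Scrub Cycle End" rather than falling
through to the raw hex value.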
* Re: [PATCH] cxl/events: Update memory event type for patrol scrub cycle end event
2025-12-10 13:12 [PATCH] cxl/events: Update memory event type for patrol scrub cycle end event shiju.jose
@ 2025-12-10 14:47 ` Dave Jiang
2025-12-15 10:52 ` Jonathan Cameron
0 siblings, 1 reply; 3+ messages in thread
From: Dave Jiang @ 2025-12-10 14:47 UTC (permalink / raw)
To: shiju.jose, linux-cxl, dan.j.williams, jonathan.cameron,
alison.schofield, dave, vishal.l.verma, ira.weiny
Cc: tanxiaofei, prime.zeng, linuxarm
On 12/10/25 6:12 AM, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> According to the CXL Specification Revision 4.0, Advanced CVME (Corrected
> Volatile Memory Error) enhancements added additional granularity control
> and event generation for Patrol Scrub cycle end.
>
> Update Memory Event Type field in the trace events for section
> 8.2.10.2.1.1, Table 8-224 (General Media Event Record), and section
> 8.2.10.2.1.2, Table 8-225 (DRAM Event Record), to include the event type
> 'Patrol Scrub cycle end'.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> Open question:
> The option to enable event generation for 'Patrol Scrub cycle end' is provided by
> the Memory Error Threshold feature, Section 8.2.10.9.11.3 Advanced
> Programmable Corrected Volatile Memory Error Threshold Feature Discovery
> and Configuration. Should support for this Memory Error Threshold feature be
> added in the kernel, or handled via fwctl?
Any thoughts Jonathan? Is that something that would be exposed through EDAC?
DJ
> ---
> drivers/cxl/core/trace.h | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index a972e4ef1936..e79c2bd415af 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -367,6 +367,7 @@ TRACE_EVENT(cxl_generic_event,
> #define CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR 0x04
> #define CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE 0x05
> #define CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION 0x06
> +#define CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END 0x07
> #define show_gmer_mem_event_type(type) __print_symbolic(type, \
> { CXL_GMER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
> { CXL_GMER_MEM_EVT_TYPE_INV_ADDR, "Invalid Address" }, \
> @@ -374,7 +375,8 @@ TRACE_EVENT(cxl_generic_event,
> { CXL_GMER_MEM_EVT_TYPE_TE_STATE_VIOLATION, "TE State Violation" }, \
> { CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" }, \
> { CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" }, \
> - { CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" } \
> + { CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" }, \
> + { CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END, "Adv Prog CME Patrol Scrub Cycle End" } \
> )
>
> #define CXL_GMER_TRANS_UNKNOWN 0x00
> @@ -554,6 +556,7 @@ TRACE_EVENT(cxl_general_media,
> #define CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION 0x04
> #define CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE 0x05
> #define CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION 0x06
> +#define CXL_DER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END 0x07
> #define show_dram_mem_event_type(type) __print_symbolic(type, \
> { CXL_DER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
> { CXL_DER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" }, \
> @@ -561,7 +564,8 @@ TRACE_EVENT(cxl_general_media,
> { CXL_DER_MEM_EVT_TYPE_DATA_PATH_ERROR, "Data Path Error" }, \
> { CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION, "TE State Violation" }, \
> { CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" }, \
> - { CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" } \
> + { CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" }, \
> + { CXL_DER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END, "Adv Prog CME Patrol Scrub Cycle End" } \
> )
>
> #define CXL_DER_VALID_CHANNEL BIT(0)
* Re: [PATCH] cxl/events: Update memory event type for patrol scrub cycle end event
2025-12-10 14:47 ` Dave Jiang
@ 2025-12-15 10:52 ` Jonathan Cameron
0 siblings, 0 replies; 3+ messages in thread
From: Jonathan Cameron @ 2025-12-15 10:52 UTC (permalink / raw)
To: Dave Jiang
Cc: shiju.jose, linux-cxl, dan.j.williams, alison.schofield, dave,
vishal.l.verma, ira.weiny, tanxiaofei, prime.zeng, linuxarm
On Wed, 10 Dec 2025 07:47:20 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> On 12/10/25 6:12 AM, shiju.jose@huawei.com wrote:
> > From: Shiju Jose <shiju.jose@huawei.com>
> >
> > According to the CXL Specification Revision 4.0, Advanced CVME (Corrected
> > Volatile Memory Error) enhancements added additional granularity control
> > and event generation for Patrol Scrub cycle end.
> >
> > Update Memory Event Type field in the trace events for section
> > 8.2.10.2.1.1, Table 8-224 (General Media Event Record), and section
> > 8.2.10.2.1.2, Table 8-225 (DRAM Event Record), to include the event type
> > 'Patrol Scrub cycle end'.
> >
> > Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
>
> > ---
> > Open question:
> > The option to enable event generation for 'Patrol Scrub cycle end' is provided by
> > the Memory Error Threshold feature, Section 8.2.10.9.11.3 Advanced
> > Programmable Corrected Volatile Memory Error Threshold Feature Discovery
> > and Configuration. Should support for this Memory Error Threshold feature be
> > added in the kernel, or handled via fwctl?
>
> Any thoughts Jonathan? Is that something that would be exposed through EDAC?
I asked Shiju to add this comment because I wasn't sure of the answer about those
controls. Given the conservative viewpoint currently being taken around more
complex scrub features in general in EDAC (which I'm not saying I disagree with!),
these might be very hard to land unless there are similar facilities in other
scrub controllers and we can argue it is about generalizing the interface.
For anyone following along, this new stuff is about counting granularity,
thresholds, resets of counters, etc., not the reporting of individual errors.
So it is more telemetry than error detection.
So my gut feeling is these are probably a fwctl / userspace tool problem, but I may
well be wrong, and then we run into the question of whether we can rip out
existing functionality exposed via fwctl later. IIRC we decided
we could, but we still don't want to do that unless we have to!
Jonathan
>
> DJ
>
> > ---
> > drivers/cxl/core/trace.h | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> > index a972e4ef1936..e79c2bd415af 100644
> > --- a/drivers/cxl/core/trace.h
> > +++ b/drivers/cxl/core/trace.h
> > @@ -367,6 +367,7 @@ TRACE_EVENT(cxl_generic_event,
> > #define CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR 0x04
> > #define CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE 0x05
> > #define CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION 0x06
> > +#define CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END 0x07
> > #define show_gmer_mem_event_type(type) __print_symbolic(type, \
> > { CXL_GMER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
> > { CXL_GMER_MEM_EVT_TYPE_INV_ADDR, "Invalid Address" }, \
> > @@ -374,7 +375,8 @@ TRACE_EVENT(cxl_generic_event,
> > { CXL_GMER_MEM_EVT_TYPE_TE_STATE_VIOLATION, "TE State Violation" }, \
> > { CXL_GMER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" }, \
> > { CXL_GMER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" }, \
> > - { CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" } \
> > + { CXL_GMER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" }, \
> > + { CXL_GMER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END, "Adv Prog CME Patrol Scrub Cycle End" } \
> > )
> >
> > #define CXL_GMER_TRANS_UNKNOWN 0x00
> > @@ -554,6 +556,7 @@ TRACE_EVENT(cxl_general_media,
> > #define CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION 0x04
> > #define CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE 0x05
> > #define CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION 0x06
> > +#define CXL_DER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END 0x07
> > #define show_dram_mem_event_type(type) __print_symbolic(type, \
> > { CXL_DER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
> > { CXL_DER_MEM_EVT_TYPE_SCRUB_MEDIA_ECC_ERROR, "Scrub Media ECC Error" }, \
> > @@ -561,7 +564,8 @@ TRACE_EVENT(cxl_general_media,
> > { CXL_DER_MEM_EVT_TYPE_DATA_PATH_ERROR, "Data Path Error" }, \
> > { CXL_DER_MEM_EVT_TYPE_TE_STATE_VIOLATION, "TE State Violation" }, \
> > { CXL_DER_MEM_EVT_TYPE_AP_CME_COUNTER_EXPIRE, "Adv Prog CME Counter Expiration" }, \
> > - { CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" } \
> > + { CXL_DER_MEM_EVT_TYPE_CKID_VIOLATION, "CKID Violation" }, \
> > + { CXL_DER_MEM_EVT_TYPE_AP_CME_PS_CYCLE_END, "Adv Prog CME Patrol Scrub Cycle End" } \
> > )
> >
> > #define CXL_DER_VALID_CHANNEL BIT(0)
>
>