qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/6] cxl: add poison event handler
@ 2024-03-29  6:36 Shiyang Ruan via
  2024-03-29  6:36 ` [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks Shiyang Ruan via
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Shiyang Ruan via @ 2024-03-29  6:36 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

Changes:
RFCv1 -> RFCv2:
1. update commit message of PATCH 1
2. use memory_failure_queue() instead of MCE
3. also report poison in debugfs when injecting poison
4. correct DPA->HPA logic:
     find memdev's endpoint decoder to find the region it belongs to
5. distinguish transaction_type of GMER, only handle POISON related
     event for now


Currently driver only traces cxl events, poison injection (for both vmem
and pmem type) on cxl memdev is silent.  OS needs to be notified then it
could handle poison range in time.  Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.

So, add poison event handler in OS-First method:
  - qemu:
    - CXL device report POISON event to OS by MSI by sending GMER after
      injecting a poison record
  - CXL driver                                  <-- this patchset
    a. parse the POISON event from GMER;
    b. retrieve POISON list from memdev;
    c. translate poisoned DPA to HPA;
    d. enqueue poisoned PFN to memory_failure's work queue;


Shiyang Ruan (6):
  cxl/core: correct length of DPA field masks
  cxl/core: introduce cxl_mem_report_poison()
  cxl/core: add report option for cxl_mem_get_poison()
  cxl/core: report poison when injecting from debugfs
  cxl: add definition for transaction_type
  cxl/core: add poison injection event handler

 drivers/cxl/core/mbox.c   | 126 +++++++++++++++++++++++++++++++++-----
 drivers/cxl/core/memdev.c |   5 +-
 drivers/cxl/core/region.c |   8 +--
 drivers/cxl/core/trace.h  |   6 +-
 drivers/cxl/cxlmem.h      |  13 ++--
 include/linux/cxl-event.h |  17 ++++-
 6 files changed, 144 insertions(+), 31 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks
  2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
@ 2024-03-29  6:36 ` Shiyang Ruan via
  2024-03-30  1:37   ` Dan Williams
  2024-03-29  6:36 ` [RFC PATCH v2 2/6] cxl/core: introduce cxl_mem_report_poison() Shiyang Ruan via
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Shiyang Ruan via @ 2024-03-29  6:36 UTC (permalink / raw)
  To: qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, stable

The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.

If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: <stable@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/trace.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..e2d1f296df97 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_FLAGS_MASK			0x3FULL
 #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
 
-#define CXL_DPA_VOLATILE			BIT(0)
-#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
+#define CXL_DPA_VOLATILE			BIT_ULL(0)
+#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)
 #define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
 	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
 	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH v2 2/6] cxl/core: introduce cxl_mem_report_poison()
  2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
  2024-03-29  6:36 ` [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks Shiyang Ruan via
@ 2024-03-29  6:36 ` Shiyang Ruan via
  2024-03-30  1:39   ` Dan Williams
  2024-03-29  6:36 ` [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison() Shiyang Ruan via
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Shiyang Ruan via @ 2024-03-29  6:36 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

If poison is detected(reported from cxl memdev), OS should be notified to
handle it. So, introduce this helper function for later use:
  1. translate DPA to HPA;
  2. enqueue records into memory_failure's work queue;

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---

Currently poison injection from debugfs always create a 64-bytes-length
record, which is fine.  But the injection from qemu's QMP API:
qmp_cxl_inject_poison() could create a poison record contains big length,
which may cause many many times of calling memory_failure_queue().
Though the MEMORY_FAILURE_FIFO_SIZE is 1 << 4, it seems not enougth.

---
 drivers/cxl/core/mbox.c | 18 ++++++++++++++++++
 drivers/cxl/cxlmem.h    |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 9adda4795eb7..31b1b8711256 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1290,6 +1290,24 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
 
+void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
+			   struct cxl_region *cxlr,
+			   struct cxl_poison_record *poison)
+{
+	u64 dpa = le64_to_cpu(poison->address) & CXL_POISON_START_MASK;
+	u64 len = PAGE_ALIGN(le32_to_cpu(poison->length) * CXL_POISON_LEN_MULT);
+	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
+	unsigned long pfn = PHYS_PFN(hpa);
+	unsigned long pfn_end = pfn + len / PAGE_SIZE - 1;
+
+	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
+		return;
+
+	for (; pfn <= pfn_end; pfn++)
+		memory_failure_queue(pfn, MF_ACTION_REQUIRED);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_report_poison, CXL);
+
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr)
 {
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 20fb3b35e89e..82f80eb381fb 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -828,6 +828,9 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 			    const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
+void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
+			   struct cxl_region *cxlr,
+			   struct cxl_poison_record *poison);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison()
  2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
  2024-03-29  6:36 ` [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks Shiyang Ruan via
  2024-03-29  6:36 ` [RFC PATCH v2 2/6] cxl/core: introduce cxl_mem_report_poison() Shiyang Ruan via
@ 2024-03-29  6:36 ` Shiyang Ruan via
  2024-03-30  1:50   ` Dan Williams
  2024-03-29  6:36 ` [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs Shiyang Ruan via
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Shiyang Ruan via @ 2024-03-29  6:36 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

The GMER only has "Physical Address" field, no such one indicates length.
So, when a poison event is received, we could use GET_POISON_LIST command
to get the poison list.  Now driver has cxl_mem_get_poison(), so
reuse it and add a parameter 'bool report', report poison record to MCE
if set true.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c   | 8 ++++++--
 drivers/cxl/core/memdev.c | 4 ++--
 drivers/cxl/core/region.c | 8 ++++----
 drivers/cxl/cxlmem.h      | 2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 31b1b8711256..19b46fb06ed6 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1309,7 +1309,7 @@ void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 EXPORT_SYMBOL_NS_GPL(cxl_mem_report_poison, CXL);
 
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr)
+		       struct cxl_region *cxlr, bool report)
 {
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_poison_out *po;
@@ -1340,10 +1340,14 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		if (rc)
 			break;
 
-		for (int i = 0; i < le16_to_cpu(po->count); i++)
+		for (int i = 0; i < le16_to_cpu(po->count); i++) {
 			trace_cxl_poison(cxlmd, cxlr, &po->record[i],
 					 po->flags, po->overflow_ts,
 					 CXL_POISON_TRACE_LIST);
+			if (report)
+				cxl_mem_report_poison(cxlmd, cxlr,
+						      &po->record[i]);
+		}
 
 		/* Protect against an uncleared _FLAG_MORE */
 		nr_records = nr_records + le16_to_cpu(po->count);
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index d4e259f3a7e9..e976141ca4a9 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -200,14 +200,14 @@ static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
 	if (resource_size(&cxlds->pmem_res)) {
 		offset = cxlds->pmem_res.start;
 		length = resource_size(&cxlds->pmem_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc)
 			return rc;
 	}
 	if (resource_size(&cxlds->ram_res)) {
 		offset = cxlds->ram_res.start;
 		length = resource_size(&cxlds->ram_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		/*
 		 * Invalid Physical Address is not an error for
 		 * volatile addresses. Device support is optional.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5c186e0a39b9..e83c46cb4dea 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2585,7 +2585,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 	if (ctx->mode == CXL_DECODER_RAM) {
 		offset = ctx->offset;
 		length = resource_size(&cxlds->ram_res) - offset;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT)
 			rc = 0;
 		if (rc)
@@ -2603,7 +2603,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 		return 0;
 	}
 
-	return cxl_mem_get_poison(cxlmd, offset, length, NULL);
+	return cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 }
 
 static int poison_by_decoder(struct device *dev, void *arg)
@@ -2637,7 +2637,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 	if (cxled->skip) {
 		offset = cxled->dpa_res->start - cxled->skip;
 		length = cxled->skip;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 			rc = 0;
 		if (rc)
@@ -2646,7 +2646,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 
 	offset = cxled->dpa_res->start;
 	length = cxled->dpa_res->end - offset + 1;
-	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region);
+	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region, false);
 	if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 		rc = 0;
 	if (rc)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 82f80eb381fb..1f03130b9d6a 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -832,7 +832,7 @@ void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 			   struct cxl_region *cxlr,
 			   struct cxl_poison_record *poison);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr);
+		       struct cxl_region *cxlr, bool report);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs
  2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
                   ` (2 preceding siblings ...)
  2024-03-29  6:36 ` [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison() Shiyang Ruan via
@ 2024-03-29  6:36 ` Shiyang Ruan via
  2024-03-29 18:13   ` Alison Schofield
  2024-03-30  1:52   ` Dan Williams
  2024-03-29  6:36 ` [RFC PATCH v2 5/6] cxl: add definition for transaction types Shiyang Ruan via
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Shiyang Ruan via @ 2024-03-29  6:36 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

Poison injection from debugfs is silent too.  Add calling
cxl_mem_report_poison() to make it able to do memory_failure().

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/memdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index e976141ca4a9..b0dcbe6f1004 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -366,6 +366,7 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 		.length = cpu_to_le32(1),
 	};
 	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
+	cxl_mem_report_poison(cxlmd, cxlr, &record);
 out:
 	up_read(&cxl_dpa_rwsem);
 	up_read(&cxl_region_rwsem);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH v2 5/6] cxl: add definition for transaction types
  2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
                   ` (3 preceding siblings ...)
  2024-03-29  6:36 ` [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs Shiyang Ruan via
@ 2024-03-29  6:36 ` Shiyang Ruan via
  2024-03-30  1:53   ` Dan Williams
  2024-03-29  6:36 ` [RFC PATCH v2 6/6] cxl/core: add poison injection event handler Shiyang Ruan via
  2024-03-29 17:43 ` [RFC PATCH v2 0/6] cxl: add poison " Alison Schofield
  6 siblings, 1 reply; 22+ messages in thread
From: Shiyang Ruan via @ 2024-03-29  6:36 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

The transaction types are defined in General Media Event Record/DRAM Event
per CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43 and
Section 8.2.9.2.1.2; Table 8-44.  Add them for Event Record handler use.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 include/linux/cxl-event.h | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 03fa6d50d46f..0a50754fc330 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -23,6 +23,19 @@ struct cxl_event_generic {
 	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
 } __packed;
 
+/*
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+enum cxl_event_transaction_type {
+	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
+	CXL_EVENT_TRANSACTION_READ,
+	CXL_EVENT_TRANSACTION_WRITE,
+	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
+	CXL_EVENT_TRANSACTION_INJECT_POISON,
+	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
+	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
+};
+
 /*
  * General Media Event Record
  * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
@@ -33,7 +46,7 @@ struct cxl_event_gen_media {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type;	/* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;
@@ -52,7 +65,7 @@ struct cxl_event_dram {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type;	/* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH v2 6/6] cxl/core: add poison injection event handler
  2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
                   ` (4 preceding siblings ...)
  2024-03-29  6:36 ` [RFC PATCH v2 5/6] cxl: add definition for transaction types Shiyang Ruan via
@ 2024-03-29  6:36 ` Shiyang Ruan via
  2024-03-29 18:27   ` Alison Schofield
  2024-03-29 17:43 ` [RFC PATCH v2 0/6] cxl: add poison " Alison Schofield
  6 siblings, 1 reply; 22+ messages in thread
From: Shiyang Ruan via @ 2024-03-29  6:36 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

Currently driver only traces cxl events, poison injection (for both vmem
and pmem type) on cxl memdev is silent.  OS needs to be notified then it
could handle poison range in time.  Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.

So, add poison event handler in OS-First method:
  - qemu:
    - CXL device report POISON event to OS by MSI by sending GMER after
      injecting a poison record
  - CXL driver
    a. parse the POISON event from GMER;        <-- this patch
    b. retrieve POISON list from memdev;
    c. translate poisoned DPA to HPA;
    d. enqueue poisoned PFN to memory_failure's work queue;

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---

the reply to Jonathan's comment in last version:
> I'm not 100% convinced this is necessary poison causing.  Also
> the text tells us we should see 'an appropriate event'.
> DRAM one seems likely to be chosen by some vendors.
I think it's right to use DRAM Event Record for volatile-memdev, but 
should poison on a persistent-memdev also use DRAM Event Record too? 
Though its 'Physical Address' feild has the 'Volatile' bit too, which is 
same as General Media Event Record.  I am a bit confused about this.

---
 drivers/cxl/core/mbox.c | 100 ++++++++++++++++++++++++++++++++++------
 drivers/cxl/cxlmem.h    |   8 ++--
 2 files changed, 91 insertions(+), 17 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 19b46fb06ed6..97ef45d808b8 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -837,25 +837,99 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt)
+struct cxl_event_poison_context {
+	u64 dpa;
+	u64 length;
+};
+
+static int __cxl_report_poison(struct device *dev, void *arg)
+{
+	struct cxl_event_poison_context *ctx = arg;
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_memdev *cxlmd;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
+		return 0;
+
+	if (cxled->mode == CXL_DECODER_MIXED) {
+		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
+		return 0;
+	}
+
+	if (ctx->dpa > cxled->dpa_res->end || ctx->dpa < cxled->dpa_res->start)
+		return 0;
+
+	cxlmd = cxled_to_memdev(cxled);
+	cxl_mem_get_poison(cxlmd, ctx->dpa, ctx->length, cxled->cxld.region,
+			   true);
+
+	return 1;
+}
+
+static void cxl_event_handle_poison(struct cxl_memdev *cxlmd,
+				    struct cxl_event_gen_media *rec)
+{
+	struct cxl_port *port = cxlmd->endpoint;
+	u64 phys_addr = le64_to_cpu(rec->phys_addr);
+	struct cxl_event_poison_context ctx = {
+		.dpa = phys_addr & CXL_DPA_MASK,
+	};
+
+	/* No regions mapped to this memdev, that is to say no HPA is mapped */
+	if (!port || !is_cxl_endpoint(port) ||
+	    cxl_num_decoders_committed(port) == 0)
+		return;
+
+	/*
+	 * Host Inject Poison may have a range of DPA, but the GMER only has
+	 * "Physical Address" field, no such one indicates length.  So it's
+	 * better to call cxl_mem_get_poison() to find this poison record.
+	 */
+	ctx.length = phys_addr & CXL_DPA_VOLATILE ?
+			resource_size(&cxlmd->cxlds->ram_res) :
+			resource_size(&cxlmd->cxlds->pmem_res) - ctx.dpa;
+
+	device_for_each_child(&port->dev, &ctx, __cxl_report_poison);
+}
+
+static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
+					   enum cxl_event_log_type type,
+					   struct cxl_event_gen_media *rec)
+{
+	if (type == CXL_EVENT_TYPE_FAIL) {
+		switch (rec->transaction_type) {
+		case CXL_EVENT_TRANSACTION_READ:
+		case CXL_EVENT_TRANSACTION_WRITE:
+		case CXL_EVENT_TRANSACTION_INJECT_POISON:
+			cxl_event_handle_poison(cxlmd, rec);
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
 		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
+	} else if (event_type == CXL_CPER_EVENT_DRAM)
 		trace_cxl_dram(cxlmd, type, &evt->dram);
 	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
 	else
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
 }
-EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
 
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-				     enum cxl_event_log_type type,
-				     struct cxl_event_record_raw *record)
+static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
+				      enum cxl_event_log_type type,
+				      struct cxl_event_record_raw *record)
 {
 	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
 	const uuid_t *uuid = &record->id;
@@ -867,7 +941,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 
-	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
+	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
 }
 
 static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -978,8 +1052,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			__cxl_event_trace_record(cxlmd, type,
-						 &payload->records[i]);
+			__cxl_event_handle_record(cxlmd, type,
+						  &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 1f03130b9d6a..dfd7bdd0d66a 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt);
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 0/6] cxl: add poison event handler
  2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
                   ` (5 preceding siblings ...)
  2024-03-29  6:36 ` [RFC PATCH v2 6/6] cxl/core: add poison injection event handler Shiyang Ruan via
@ 2024-03-29 17:43 ` Alison Schofield
  2024-03-29 18:22   ` Dan Williams
  6 siblings, 1 reply; 22+ messages in thread
From: Alison Schofield @ 2024-03-29 17:43 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny

On Fri, Mar 29, 2024 at 02:36:08PM +0800, Shiyang Ruan wrote:
> Changes:
> RFCv1 -> RFCv2:
> 1. update commit message of PATCH 1
> 2. use memory_failure_queue() instead of MCE
> 3. also report poison in debugfs when injecting poison
> 4. correct DPA->HPA logic:
>      find memdev's endpoint decoder to find the region it belongs to
> 5. distinguish transaction_type of GMER, only handle POISON related
>      event for now
> 
> 
> Currently driver only traces cxl events, poison injection (for both vmem
> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
> could handle poison range in time.  Per CXL spec, the device error event
> could be signaled through FW-First and OS-First methods.
> 
> So, add poison event handler in OS-First method:
>   - qemu:
>     - CXL device report POISON event to OS by MSI by sending GMER after
>       injecting a poison record
>   - CXL driver                                  <-- this patchset
>     a. parse the POISON event from GMER;
>     b. retrieve POISON list from memdev;
>     c. translate poisoned DPA to HPA;
>     d. enqueue poisoned PFN to memory_failure's work queue;

Hi,

Yesterday I posted code adding the HPAs to cxl_general_media & dram
events[1], so as I review this patchset today it's fresh in my mind.

Can we integrate this into the trace_ path directly:

1) On any GMER/poison, trigger a new poison list read

BTW - I'm not sure where to trigger that because we want to keep all
the locking in place and read by endpoints like is done now. It may
not be safe to sneak in a direct call to cxl_mem_get_poison()
as is done in this patch set.

2) Teach the poison list read trace event handler to call
memory_failure_queue().

Upon receipt of that new poison list, call memory_failture_queue()
on *any* poison in a mapped space. Is that OK?  Can we call
memory_failure_queue() on any and every poison report that is in
HPA space regardless of whether it first came to us through a GMER?
I'm actually wondering if that is going to be the next ask anyway -
ie report all poison.

I'll comment a bit more on individual patches.

--Alison

[1] https://lore.kernel.org/linux-cxl/cover.1711598777.git.alison.schofield@intel.com/

> 
> 
> Shiyang Ruan (6):
>   cxl/core: correct length of DPA field masks
>   cxl/core: introduce cxl_mem_report_poison()
>   cxl/core: add report option for cxl_mem_get_poison()
>   cxl/core: report poison when injecting from debugfs
>   cxl: add definition for transaction_type
>   cxl/core: add poison injection event handler
> 
>  drivers/cxl/core/mbox.c   | 126 +++++++++++++++++++++++++++++++++-----
>  drivers/cxl/core/memdev.c |   5 +-
>  drivers/cxl/core/region.c |   8 +--
>  drivers/cxl/core/trace.h  |   6 +-
>  drivers/cxl/cxlmem.h      |  13 ++--
>  include/linux/cxl-event.h |  17 ++++-
>  6 files changed, 144 insertions(+), 31 deletions(-)
> 
> -- 
> 2.34.1
> > 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs
  2024-03-29  6:36 ` [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs Shiyang Ruan via
@ 2024-03-29 18:13   ` Alison Schofield
  2024-03-30  1:52   ` Dan Williams
  1 sibling, 0 replies; 22+ messages in thread
From: Alison Schofield @ 2024-03-29 18:13 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny

On Fri, Mar 29, 2024 at 02:36:12PM +0800, Shiyang Ruan wrote:
> Poison injection from debugfs is silent too.  Add calling
> cxl_mem_report_poison() to make it able to do memory_failure().

Curious as to why it is silent? Will a GMER poison event occur
and trigger the path to report it via memory_failure?

> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/memdev.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index e976141ca4a9..b0dcbe6f1004 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -366,6 +366,7 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  		.length = cpu_to_le32(1),
>  	};
>  	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
> +	cxl_mem_report_poison(cxlmd, cxlr, &record);
>  out:
>  	up_read(&cxl_dpa_rwsem);
>  	up_read(&cxl_region_rwsem);
> -- 
> 2.34.1
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 0/6] cxl: add poison event handler
  2024-03-29 17:43 ` [RFC PATCH v2 0/6] cxl: add poison " Alison Schofield
@ 2024-03-29 18:22   ` Dan Williams
  2024-03-29 19:38     ` Alison Schofield
  0 siblings, 1 reply; 22+ messages in thread
From: Dan Williams @ 2024-03-29 18:22 UTC (permalink / raw)
  To: Alison Schofield, Shiyang Ruan
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny

Alison Schofield wrote:
[..]
> Upon receipt of that new poison list, call memory_failture_queue()
> on *any* poison in a mapped space. Is that OK?  Can we call
> memory_failure_queue() on any and every poison report that is in
> HPA space regardless of whether it first came to us through a GMER?
> I'm actually wondering if that is going to be the next ask anyway -
> ie report all poison.

memory_failure_queue() should be called on poison creation events. Leave
the MF_ACTION_REQUIRED flag not set so that memory_failure() performs
"action optional" handling.  So I would expect memory_failure_queue()
notification for GMER events, but not on poison list events.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 6/6] cxl/core: add poison injection event handler
  2024-03-29  6:36 ` [RFC PATCH v2 6/6] cxl/core: add poison injection event handler Shiyang Ruan via
@ 2024-03-29 18:27   ` Alison Schofield
  0 siblings, 0 replies; 22+ messages in thread
From: Alison Schofield @ 2024-03-29 18:27 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny

On Fri, Mar 29, 2024 at 02:36:14PM +0800, Shiyang Ruan wrote:
> Currently driver only traces cxl events, poison injection (for both vmem
> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
> could handle poison range in time.  Per CXL spec, the device error event
> could be signaled through FW-First and OS-First methods.
> 
> So, add poison event handler in OS-First method:
>   - qemu:
>     - CXL device report POISON event to OS by MSI by sending GMER after
>       injecting a poison record
>   - CXL driver
>     a. parse the POISON event from GMER;        <-- this patch
>     b. retrieve POISON list from memdev;
>     c. translate poisoned DPA to HPA;
>     d. enqueue poisoned PFN to memory_failure's work queue;
> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
> 
> the reply to Jonathan's comment in last version:
> > I'm not 100% convinced this is necessary poison causing.  Also
> > the text tells us we should see 'an appropriate event'.
> > DRAM one seems likely to be chosen by some vendors.
> I think it's right to use DRAM Event Record for volatile-memdev, but 
> should poison on a persistent-memdev also use DRAM Event Record too? 
> Though its 'Physical Address' feild has the 'Volatile' bit too, which is 
> same as General Media Event Record.  I am a bit confused about this.


Similar thought as shared in cover letter -
Can the driver trigger new poison list read on any events of interest,
and implement a policy to report mem failures on all poison list reads
that hit mapped addresses?

I guess if the answer to that question is NO - we only report memory
failures on GMER/poison, can we find more synergy and not repeat so 
much work.

--Alison

> 
> ---
>  drivers/cxl/core/mbox.c | 100 ++++++++++++++++++++++++++++++++++------
>  drivers/cxl/cxlmem.h    |   8 ++--
>  2 files changed, 91 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 19b46fb06ed6..97ef45d808b8 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -837,25 +837,99 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt)
> +struct cxl_event_poison_context {
> +	u64 dpa;
> +	u64 length;
> +};
> +
> +static int __cxl_report_poison(struct device *dev, void *arg)
> +{
> +	struct cxl_event_poison_context *ctx = arg;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_memdev *cxlmd;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
> +		return 0;
> +
> +	if (cxled->mode == CXL_DECODER_MIXED) {
> +		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
> +		return 0;
> +	}
> +
> +	if (ctx->dpa > cxled->dpa_res->end || ctx->dpa < cxled->dpa_res->start)
> +		return 0;
> +
> +	cxlmd = cxled_to_memdev(cxled);
> +	cxl_mem_get_poison(cxlmd, ctx->dpa, ctx->length, cxled->cxld.region,
> +			   true);
> +
> +	return 1;
> +}
> +
> +static void cxl_event_handle_poison(struct cxl_memdev *cxlmd,
> +				    struct cxl_event_gen_media *rec)
> +{
> +	struct cxl_port *port = cxlmd->endpoint;
> +	u64 phys_addr = le64_to_cpu(rec->phys_addr);
> +	struct cxl_event_poison_context ctx = {
> +		.dpa = phys_addr & CXL_DPA_MASK,
> +	};
> +
> +	/* No regions mapped to this memdev, that is to say no HPA is mapped */
> +	if (!port || !is_cxl_endpoint(port) ||
> +	    cxl_num_decoders_committed(port) == 0)
> +		return;
> +
> +	/*
> +	 * Host Inject Poison may have a range of DPA, but the GMER only has
> +	 * "Physical Address" field, no such one indicates length.  So it's
> +	 * better to call cxl_mem_get_poison() to find this poison record.
> +	 */
> +	ctx.length = phys_addr & CXL_DPA_VOLATILE ?
> +			resource_size(&cxlmd->cxlds->ram_res) :
> +			resource_size(&cxlmd->cxlds->pmem_res) - ctx.dpa;
> +
> +	device_for_each_child(&port->dev, &ctx, __cxl_report_poison);
> +}
> +
> +static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
> +					   enum cxl_event_log_type type,
> +					   struct cxl_event_gen_media *rec)
> +{
> +	if (type == CXL_EVENT_TYPE_FAIL) {
> +		switch (rec->transaction_type) {
> +		case CXL_EVENT_TRANSACTION_READ:
> +		case CXL_EVENT_TRANSACTION_WRITE:
> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> +			cxl_event_handle_poison(cxlmd, rec);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +}
> +
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt)
>  {
> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>  		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> +		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
> +	} else if (event_type == CXL_CPER_EVENT_DRAM)
>  		trace_cxl_dram(cxlmd, type, &evt->dram);
>  	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>  		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
>  	else
>  		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
>  }
> -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
> +EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
>  
> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -				     enum cxl_event_log_type type,
> -				     struct cxl_event_record_raw *record)
> +static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +				      enum cxl_event_log_type type,
> +				      struct cxl_event_record_raw *record)
>  {
>  	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
>  	const uuid_t *uuid = &record->id;
> @@ -867,7 +941,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>  	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
>  		ev_type = CXL_CPER_EVENT_MEM_MODULE;
>  
> -	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
> +	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
>  }
>  
>  static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> @@ -978,8 +1052,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>  			break;
>  
>  		for (i = 0; i < nr_rec; i++)
> -			__cxl_event_trace_record(cxlmd, type,
> -						 &payload->records[i]);
> +			__cxl_event_handle_record(cxlmd, type,
> +						  &payload->records[i]);
>  
>  		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>  			trace_cxl_overflow(cxlmd, type, payload);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 1f03130b9d6a..dfd7bdd0d66a 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  				  unsigned long *cmds);
>  void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt);
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt);
>  int cxl_set_timestamp(struct cxl_memdev_state *mds);
>  int cxl_poison_state_init(struct cxl_memdev_state *mds);
>  void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
> -- 
> 2.34.1
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 0/6] cxl: add poison event handler
  2024-03-29 18:22   ` Dan Williams
@ 2024-03-29 19:38     ` Alison Schofield
  2024-03-29 20:56       ` Dan Williams
  0 siblings, 1 reply; 22+ messages in thread
From: Alison Schofield @ 2024-03-29 19:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: Shiyang Ruan, qemu-devel, linux-cxl, Jonathan.Cameron, dave,
	ira.weiny

On Fri, Mar 29, 2024 at 11:22:32AM -0700, Dan Williams wrote:
> Alison Schofield wrote:
> [..]
> > Upon receipt of that new poison list, call memory_failture_queue()
> > on *any* poison in a mapped space. Is that OK?  Can we call
> > memory_failure_queue() on any and every poison report that is in
> > HPA space regardless of whether it first came to us through a GMER?
> > I'm actually wondering if that is going to be the next ask anyway -
> > ie report all poison.
> 
> memory_failure_queue() should be called on poison creation events. Leave
> the MF_ACTION_REQUIRED flag not set so that memory_failure() performs
> "action optional" handling.  So I would expect memory_failure_queue()
> notification for GMER events, but not on poison list events.

Seems I totally missed the point of this patch set.
Is it's only purpose to make sure that poison that is injected gets
reported to memory_failure?

So this single patch only:
1. Poison inject leads to this GMER/CXL_EVENT_TRANSACTION_INJECT_POISON 
2. Driver sees GMER/CXL_EVENT_TRANSACTION_INJECT_POISON and reads poison
list to get accurate length.
3. Driver reports that to memory_failure_queue()

Still expect there's some code sharing opportunities and I still wonder
about what is next in this area.

--Alison



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 0/6] cxl: add poison event handler
  2024-03-29 19:38     ` Alison Schofield
@ 2024-03-29 20:56       ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2024-03-29 20:56 UTC (permalink / raw)
  To: Alison Schofield, Dan Williams
  Cc: Shiyang Ruan, qemu-devel, linux-cxl, Jonathan.Cameron, dave,
	ira.weiny

Alison Schofield wrote:
> On Fri, Mar 29, 2024 at 11:22:32AM -0700, Dan Williams wrote:
> > Alison Schofield wrote:
> > [..]
> > > Upon receipt of that new poison list, call memory_failture_queue()
> > > on *any* poison in a mapped space. Is that OK?  Can we call
> > > memory_failure_queue() on any and every poison report that is in
> > > HPA space regardless of whether it first came to us through a GMER?
> > > I'm actually wondering if that is going to be the next ask anyway -
> > > ie report all poison.
> > 
> > memory_failure_queue() should be called on poison creation events. Leave
> > the MF_ACTION_REQUIRED flag not set so that memory_failure() performs
> > "action optional" handling.  So I would expect memory_failure_queue()
> > notification for GMER events, but not on poison list events.
> 
> Seems I totally missed the point of this patch set.
> Is it's only purpose to make sure that poison that is injected gets
> reported to memory_failure?

Clarify terms, "poison injection" to me is a debug-only event to test
that poison handling is working, "poison creation" is an event where new
poison was encountered by CPU consumption, deposited by a
DMA-with-poison transaction, or discovered by a background scrub
operation.

> 
> So this single patch only:
> 1. Poison inject leads to this GMER/CXL_EVENT_TRANSACTION_INJECT_POISON 

Inject is a special case. Likely this should copy the PMEM legacy where
notifying memory_failure() on injected poison is optional:

"ndctl inject-error --no-notify"

> 2. Driver sees GMER/CXL_EVENT_TRANSACTION_INJECT_POISON and reads poison
> list to get accurate length.

Again, inject is the least interesting for the common case, production
kernels care about "Media ECC Error" and "Scrub Media ECC Error"
regardless of transaction type.

> 3. Driver reports that to memory_failure_queue()
> 
> Still expect there's some code sharing opportunities and I still wonder
> about what is next in this area.

One area this needs to be careful is in unifying the OS-first and
FW-first paths. In the FW-first case the platform can trigger
memory_failure() along with the GMER by just posting a memory failure
CPER record. So there is a risk that things get doubly reported if the
GMER handling code blindly triggers memory_failure(). Might be benign,
but probably best avoided.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks
  2024-03-29  6:36 ` [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks Shiyang Ruan via
@ 2024-03-30  1:37   ` Dan Williams
  2024-04-01  9:14     ` Shiyang Ruan via
  0 siblings, 1 reply; 22+ messages in thread
From: Dan Williams @ 2024-03-30  1:37 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, stable

Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/trace.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..e2d1f296df97 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_FLAGS_MASK			0x3FULL

This change makes sense...

>  #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>  
> -#define CXL_DPA_VOLATILE			BIT(0)
> -#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
> +#define CXL_DPA_VOLATILE			BIT_ULL(0)
> +#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)

...these do not. The argument to __print_flags() is an unsigned long, so
they will be cast down to (unsigned long), and they are never used as a
mask so the generated code should not change.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 2/6] cxl/core: introduce cxl_mem_report_poison()
  2024-03-29  6:36 ` [RFC PATCH v2 2/6] cxl/core: introduce cxl_mem_report_poison() Shiyang Ruan via
@ 2024-03-30  1:39   ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2024-03-30  1:39 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

Shiyang Ruan wrote:
> If poison is detected(reported from cxl memdev), OS should be notified to
> handle it. So, introduce this helper function for later use:
>   1. translate DPA to HPA;
>   2. enqueue records into memory_failure's work queue;
> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>

This patch is too small, it needs the corresponding caller to make sense
of the proposed change.

> ---
> 
> Currently poison injection from debugfs always create a 64-bytes-length
> record, which is fine.  But the injection from qemu's QMP API:
> qmp_cxl_inject_poison() could create a poison record contains big length,
> which may cause many many times of calling memory_failure_queue().
> Though the MEMORY_FAILURE_FIFO_SIZE is 1 << 4, it seems not enougth.

What matters is what devices do in practice, the kernel should not be
worried about odd corner cases that only exist in QEMU injection
scenarios.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison()
  2024-03-29  6:36 ` [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison() Shiyang Ruan via
@ 2024-03-30  1:50   ` Dan Williams
  2024-04-03 14:56     ` Shiyang Ruan via
  0 siblings, 1 reply; 22+ messages in thread
From: Dan Williams @ 2024-03-30  1:50 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

Shiyang Ruan wrote:
> The GMER only has "Physical Address" field, no such one indicates length.
> So, when a poison event is received, we could use GET_POISON_LIST command
> to get the poison list.  Now driver has cxl_mem_get_poison(), so
> reuse it and add a parameter 'bool report', report poison record to MCE
> if set true.

I am not sure I agree with the rationale here because there is no
correlation between the event being signaled and the current state of
the poison list. It also establishes race between multiple GMER events,
i.e. imagine the hardware sends 4 GMER events to communicate a 256B
poison discovery event. Does the driver need logic to support GMER event
2, 3, and 4 if it already say all 256B of poison after processing GMER
event 1?

I think the best the driver can do is assume at least 64B of poison
per-event and depend on multiple notifications to handle larger poison
lengths.

Otherwise, the poison list is really only useful for pre-populating
pages to offline after a reboot, i.e. to catch the kernel up with the
state of poison pages after a reboot.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs
  2024-03-29  6:36 ` [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs Shiyang Ruan via
  2024-03-29 18:13   ` Alison Schofield
@ 2024-03-30  1:52   ` Dan Williams
  2024-04-03 15:07     ` Shiyang Ruan via
  1 sibling, 1 reply; 22+ messages in thread
From: Dan Williams @ 2024-03-30  1:52 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

Shiyang Ruan wrote:
> Poison injection from debugfs is silent too.  Add calling
> cxl_mem_report_poison() to make it able to do memory_failure().

Why does this needs to be signalled? It is a debug interface, the
debugger can also trigger a read after the injection, or trigger page
soft-offline.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 5/6] cxl: add definition for transaction types
  2024-03-29  6:36 ` [RFC PATCH v2 5/6] cxl: add definition for transaction types Shiyang Ruan via
@ 2024-03-30  1:53   ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2024-03-30  1:53 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny

Shiyang Ruan wrote:
> The transaction types are defined in General Media Event Record/DRAM Event
> per CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43 and
> Section 8.2.9.2.1.2; Table 8-44.  Add them for Event Record handler use.

Combine this patch with the one that uses them so that the use case can
be reviewed together with the implementation.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks
  2024-03-30  1:37   ` Dan Williams
@ 2024-04-01  9:14     ` Shiyang Ruan via
  0 siblings, 0 replies; 22+ messages in thread
From: Shiyang Ruan via @ 2024-04-01  9:14 UTC (permalink / raw)
  To: Dan Williams, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dave, ira.weiny, stable



在 2024/3/30 9:37, Dan Williams 写道:
> Shiyang Ruan wrote:
>> The length of Physical Address in General Media Event Record/DRAM Event
>> Record is 64-bit, so the field mask should be defined as such length.
>> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
>> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
>> unaffected.
>>
>> If userspace was doing its own DPA-to-HPA translation this could lead to
>> incorrect page retirement decisions, but there is no known consumer
>> (like rasdaemon) of this event today.
>>
>> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
>> Cc: <stable@vger.kernel.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Davidlohr Bueso <dave@stgolabs.net>
>> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/trace.h | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>> index e5f13260fc52..e2d1f296df97 100644
>> --- a/drivers/cxl/core/trace.h
>> +++ b/drivers/cxl/core/trace.h
>> @@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
>>    * DRAM Event Record
>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>>    */
>> -#define CXL_DPA_FLAGS_MASK			0x3F
>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
> 
> This change makes sense...
> 
>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>>   
>> -#define CXL_DPA_VOLATILE			BIT(0)
>> -#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
>> +#define CXL_DPA_VOLATILE			BIT_ULL(0)
>> +#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)
> 
> ...these do not. The argument to __print_flags() is an unsigned long, so
> they will be cast down to (unsigned long), and they are never used as a
> mask so the generated code should not change.

They will only used to check if such flag is set, not used as mask.  So, 
yes, I'll remove these changes.


--
Thanks,
Ruan.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison()
  2024-03-30  1:50   ` Dan Williams
@ 2024-04-03 14:56     ` Shiyang Ruan via
  2024-04-04 13:46       ` Jonathan Cameron via
  0 siblings, 1 reply; 22+ messages in thread
From: Shiyang Ruan via @ 2024-04-03 14:56 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, dave, ira.weiny, qemu-devel, linux-cxl



在 2024/3/30 9:50, Dan Williams 写道:
> Shiyang Ruan wrote:
>> The GMER only has "Physical Address" field, no such one indicates length.
>> So, when a poison event is received, we could use GET_POISON_LIST command
>> to get the poison list.  Now driver has cxl_mem_get_poison(), so
>> reuse it and add a parameter 'bool report', report poison record to MCE
>> if set true.
> 
> I am not sure I agree with the rationale here because there is no
> correlation between the event being signaled and the current state of
> the poison list. It also establishes race between multiple GMER events,
> i.e. imagine the hardware sends 4 GMER events to communicate a 256B
> poison discovery event. Does the driver need logic to support GMER event
> 2, 3, and 4 if it already say all 256B of poison after processing GMER
> event 1?

Yes, I didn't thought about that.

> 
> I think the best the driver can do is assume at least 64B of poison
> per-event and depend on multiple notifications to handle larger poison
> lengths.

Agree.  This also makes things easier.

And for qemu, I'm thinking of making a patch to limit the length of a 
poison record when injecting.  The length should between 64B to 4KiB per 
GMER. And emit many GMERs if length > 4KiB.

> 
> Otherwise, the poison list is really only useful for pre-populating
> pages to offline after a reboot, i.e. to catch the kernel up with the
> state of poison pages after a reboot.

Got it.


--
Thanks,
Ruan.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs
  2024-03-30  1:52   ` Dan Williams
@ 2024-04-03 15:07     ` Shiyang Ruan via
  0 siblings, 0 replies; 22+ messages in thread
From: Shiyang Ruan via @ 2024-04-03 15:07 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, dave, ira.weiny, qemu-devel, linux-cxl



在 2024/3/30 9:52, Dan Williams 写道:
> Shiyang Ruan wrote:
>> Poison injection from debugfs is silent too.  Add calling
>> cxl_mem_report_poison() to make it able to do memory_failure().
> 
> Why does this needs to be signalled? It is a debug interface, the
> debugger can also trigger a read after the injection, or trigger page
> soft-offline.

I was thinking that the poison injection should trigger a chain of 
events.  So, for debugfs they should be independent, right?  I wasn't 
aware of this.  Will drop this patch.


--
Thanks,
Ruan.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison()
  2024-04-03 14:56     ` Shiyang Ruan via
@ 2024-04-04 13:46       ` Jonathan Cameron via
  0 siblings, 0 replies; 22+ messages in thread
From: Jonathan Cameron via @ 2024-04-04 13:46 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: Dan Williams, dave, ira.weiny, qemu-devel, linux-cxl

On Wed, 3 Apr 2024 22:56:58 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> 在 2024/3/30 9:50, Dan Williams 写道:
> > Shiyang Ruan wrote:  
> >> The GMER only has "Physical Address" field, no such one indicates length.
> >> So, when a poison event is received, we could use GET_POISON_LIST command
> >> to get the poison list.  Now driver has cxl_mem_get_poison(), so
> >> reuse it and add a parameter 'bool report', report poison record to MCE
> >> if set true.  
> > 
> > I am not sure I agree with the rationale here because there is no
> > correlation between the event being signaled and the current state of
> > the poison list. It also establishes race between multiple GMER events,
> > i.e. imagine the hardware sends 4 GMER events to communicate a 256B
> > poison discovery event. Does the driver need logic to support GMER event
> > 2, 3, and 4 if it already say all 256B of poison after processing GMER
> > event 1?  
> 
> Yes, I didn't thought about that.
> 
> > 
> > I think the best the driver can do is assume at least 64B of poison
> > per-event and depend on multiple notifications to handle larger poison
> > lengths.  
> 
> Agree.  This also makes things easier.
> 
> And for qemu, I'm thinking of making a patch to limit the length of a 
> poison record when injecting.  The length should between 64B to 4KiB per 
> GMER. And emit many GMERs if length > 4KiB.

I'm not keen on such a restriction in QEMU.
QEMU is injecting lengths allowed by the specification.  That facility is
useful for testing the kernel and the QEMU modeling should not be based
on what the kernel supports.

When you said this I wondered if we had a clever implementation that fused
entries in the list, but we don't (I thought about doing so a long time
ago but seems I never bothered :)  So if you are using QEMU for testing
and you don't want to exceed the kernel supported poison lengths, don't
inject poison that big.

Jonathan

> 
> > 
> > Otherwise, the poison list is really only useful for pre-populating
> > pages to offline after a reboot, i.e. to catch the kernel up with the
> > state of poison pages after a reboot.  
> 
> Got it.
> 
> 
> --
> Thanks,
> Ruan.



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-04-04 13:48 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-29  6:36 [RFC PATCH v2 0/6] cxl: add poison event handler Shiyang Ruan via
2024-03-29  6:36 ` [RFC PATCH v2 1/6] cxl/core: correct length of DPA field masks Shiyang Ruan via
2024-03-30  1:37   ` Dan Williams
2024-04-01  9:14     ` Shiyang Ruan via
2024-03-29  6:36 ` [RFC PATCH v2 2/6] cxl/core: introduce cxl_mem_report_poison() Shiyang Ruan via
2024-03-30  1:39   ` Dan Williams
2024-03-29  6:36 ` [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison() Shiyang Ruan via
2024-03-30  1:50   ` Dan Williams
2024-04-03 14:56     ` Shiyang Ruan via
2024-04-04 13:46       ` Jonathan Cameron via
2024-03-29  6:36 ` [RFC PATCH v2 4/6] cxl/core: report poison when injecting from debugfs Shiyang Ruan via
2024-03-29 18:13   ` Alison Schofield
2024-03-30  1:52   ` Dan Williams
2024-04-03 15:07     ` Shiyang Ruan via
2024-03-29  6:36 ` [RFC PATCH v2 5/6] cxl: add definition for transaction types Shiyang Ruan via
2024-03-30  1:53   ` Dan Williams
2024-03-29  6:36 ` [RFC PATCH v2 6/6] cxl/core: add poison injection event handler Shiyang Ruan via
2024-03-29 18:27   ` Alison Schofield
2024-03-29 17:43 ` [RFC PATCH v2 0/6] cxl: add poison " Alison Schofield
2024-03-29 18:22   ` Dan Williams
2024-03-29 19:38     ` Alison Schofield
2024-03-29 20:56       ` Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).