Intel-XE Archive on lore.kernel.org
* [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump
@ 2024-10-02 21:14 John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 01/11] drm/xe/guc: Remove spurious line feed in debug print John.C.Harrison
                   ` (14 more replies)
  0 siblings, 15 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison

From: John Harrison <John.C.Harrison@Intel.com>

There is a debug mechanism for dumping the GuC log as an ASCII hex
stream via dmesg. This is extremely useful for situations where it is
not possible to query the log from debugfs (self tests, bugs that cause
the driver to fail to load, system hangs, etc.). However, dumping via
dmesg is not the most reliable. The dmesg buffer is limited in size,
can be rate limited and a simple hex stream is hard to parse by tools.

So add extra information to the dump to make it more robust and
parsable. This includes adding start and end tags to delimit the dump,
using longer lines to reduce the per line overhead, adding a rolling
count to check for missing lines and interleaved concurrent dumps and
adding other important information such as the GuC version number and
timestamp offset. Also, switch to using the much more compact ASCII85
encoding rather than 0x%08X hexdumping.
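
To illustrate the size difference, here is a minimal userspace sketch. It
assumes the same convention as the kernel's ascii85_encode() helper: five
printable characters per 32-bit word, with an all-zero word collapsing to a
single 'z'. The sketch_* names are hypothetical and are not part of this
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace re-implementation, for illustration only.
 * Encodes one 32-bit word as base-85 digits offset from '!'.
 */
static const char *sketch_ascii85_encode(uint32_t in, char out[6])
{
	int i;

	assert(out != NULL);

	if (in == 0)
		return "z";	/* an all-zero word collapses to one character */

	out[5] = '\0';
	for (i = 4; i >= 0; i--) {
		out[i] = (char)('!' + in % 85);
		in /= 85;
	}

	return out;
}

/* Characters needed to print 'count' words as space separated 0x%08X values */
static size_t sketch_hex_len(size_t count)
{
	return count * strlen("0x00000000 ");
}

/* Characters needed to print the same words as ASCII85 */
static size_t sketch_ascii85_len(const uint32_t *words, size_t count)
{
	char buf[6];
	size_t i, total = 0;

	for (i = 0; i < count; i++)
		total += strlen(sketch_ascii85_encode(words[i], buf));

	return total;
}
```

Under these assumptions a non-zero word costs 5 characters instead of 11, and
a zero word costs 1. Log buffers tend to contain long runs of zeros, so the
saving in practice is likely larger than the 5:11 ratio suggests.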

There are various internal error states that the CTB code can check
for. These should never happen but when they do (driver bug, firmware
bug or even hardware bug), they can be a nightmare to debug. So add in
a capture of the GuC log and CT state at the point of error and
subsequent dump from a worker thread.

Finally, include the GuC log and full CTBs in a devcoredump capture.

Note that the ultimate aim is to then provide a mechanism for
generating a devcoredump at an arbitrary point (such as dead CTB or
failed selftest) and dumping that to dmesg. There are still a few
issues with doing that, but these are all good steps along the way.

v2: Remove pm get/put as unnecessary (review feedback from Matthew B).
v3: Add firmware filename and 'wanted' version number.
v4: Use DRM level line printer wrapper from Michal W. Add 'dead CTB'
dump support. Lots of restructuring of capture vs dump for both GuC
log and CTB capture for both the dead CTB dump and for future
inclusion in devcoredump.
v5: Add missing kerneldocs and other review feedback from Michal W.
Fix printf of size_t, clean up re-arming of dead CTBs, add GuC log to
devcoredump captures.
v6: Replace hexdumps with much more compact ascii85 encoding, drop
module parameter (review feedback from Matthew B). Fix potential
use-after-free bug.
v7: Couple of bug fixes and a bunch of changes to improve
readability/parsability of the core dump file, debugfs file and dead
CTB dmesg dump.
v8: Fix string size calculation, clean up a macro, fix some
formatting, re-work CT_DEAD capture to prevent potential leak (review
feedback by Julia F). Add more section headers, use drm_puts, use
cached variables.
v9: Disable a couple of CT_DEAD checks that are being hit by CI until
the underlying issues can be debugged and fixed.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


John Harrison (10):
  drm/xe/guc: Remove spurious line feed in debug print
  drm/xe/devcoredump: Use drm_puts and already cached local variables
  drm/xe/devcoredump: Improve section headings and add tile info
  drm/xe/devcoredump: Add ASCII85 dump helper function
  drm/xe/guc: Copy GuC log prior to dumping
  drm/xe/guc: Use a two stage dump for GuC logs and add more info
  drm/xe/guc: Dead CT helper
  drm/xe/guc: Dump entire CTB on errors
  drm/xe/guc: Add GuC log to devcoredump captures
  drm/xe/guc: Add a helper function for dumping GuC log to dmesg

Michal Wajdeczko (1):
  drm/print: Introduce drm_line_printer

 drivers/gpu/drm/drm_print.c                   |  14 +
 .../drm/xe/abi/guc_communication_ctb_abi.h    |   1 +
 drivers/gpu/drm/xe/regs/xe_guc_regs.h         |   1 +
 drivers/gpu/drm/xe/xe_devcoredump.c           | 144 +++++-
 drivers/gpu/drm/xe/xe_devcoredump.h           |   6 +
 drivers/gpu/drm/xe/xe_devcoredump_types.h     |  13 +-
 drivers/gpu/drm/xe/xe_device.c                |   1 +
 drivers/gpu/drm/xe/xe_guc.c                   |   2 +-
 drivers/gpu/drm/xe/xe_guc_ct.c                | 430 ++++++++++++++----
 drivers/gpu/drm/xe/xe_guc_ct.h                |  10 +-
 drivers/gpu/drm/xe/xe_guc_ct_types.h          |  29 +-
 drivers/gpu/drm/xe/xe_guc_log.c               | 208 ++++++++-
 drivers/gpu/drm/xe/xe_guc_log.h               |   5 +
 drivers/gpu/drm/xe/xe_guc_log_types.h         |  27 ++
 drivers/gpu/drm/xe/xe_guc_submit.c            |   2 +-
 drivers/gpu/drm/xe/xe_hw_engine.c             |   1 -
 include/drm/drm_print.h                       |  64 +++
 17 files changed, 816 insertions(+), 142 deletions(-)

-- 
2.46.0



* [PATCH v9 01/11] drm/xe/guc: Remove spurious line feed in debug print
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 02/11] drm/xe/devcoredump: Use drm_puts and already cached local variables John.C.Harrison
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Michal Wajdeczko, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

Including line feeds at the start of a debug print messes up the
output when sent to dmesg. The break appears between all the useful
prefix information and the actual string being printed. In this case,
each block of data has a very clear start line and an extra delimiter
is really not necessary. So don't do it.

v2: Fix typo in commit message (review feedback from Michal W.)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_guc_ct.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index 4b95f75b1546..a63fe0a9077a 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -1523,7 +1523,7 @@ void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot,
 		drm_puts(p, "H2G CTB (all sizes in DW):\n");
 		guc_ctb_snapshot_print(&snapshot->h2g, p);
 
-		drm_puts(p, "\nG2H CTB (all sizes in DW):\n");
+		drm_puts(p, "G2H CTB (all sizes in DW):\n");
 		guc_ctb_snapshot_print(&snapshot->g2h, p);
 
 		drm_printf(p, "\tg2h outstanding: %d\n",
-- 
2.46.0



* [PATCH v9 02/11] drm/xe/devcoredump: Use drm_puts and already cached local variables
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 01/11] drm/xe/guc: Remove spurious line feed in debug print John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info John.C.Harrison
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

There are a bunch of calls to drm_printf with static strings. Switch
them to drm_puts instead.

There are also a bunch of 'coredump->snapshot.XXX' references when
'coredump->snapshot' has already been cached locally as 'ss'. So use
'ss->XXX' instead.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_devcoredump.c | 40 ++++++++++++++---------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index bdb76e834e4c..d23719d5c2a3 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -85,9 +85,9 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
 
 	p = drm_coredump_printer(&iter);
 
-	drm_printf(&p, "**** Xe Device Coredump ****\n");
-	drm_printf(&p, "kernel: " UTS_RELEASE "\n");
-	drm_printf(&p, "module: " KBUILD_MODNAME "\n");
+	drm_puts(&p, "**** Xe Device Coredump ****\n");
+	drm_puts(&p, "kernel: " UTS_RELEASE "\n");
+	drm_puts(&p, "module: " KBUILD_MODNAME "\n");
 
 	ts = ktime_to_timespec64(ss->snapshot_time);
 	drm_printf(&p, "Snapshot time: %lld.%09ld\n", ts.tv_sec, ts.tv_nsec);
@@ -96,20 +96,20 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
 	drm_printf(&p, "Process: %s\n", ss->process_name);
 	xe_device_snapshot_print(xe, &p);
 
-	drm_printf(&p, "\n**** GuC CT ****\n");
-	xe_guc_ct_snapshot_print(coredump->snapshot.ct, &p);
-	xe_guc_exec_queue_snapshot_print(coredump->snapshot.ge, &p);
+	drm_puts(&p, "\n**** GuC CT ****\n");
+	xe_guc_ct_snapshot_print(ss->ct, &p);
+	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
 
-	drm_printf(&p, "\n**** Job ****\n");
-	xe_sched_job_snapshot_print(coredump->snapshot.job, &p);
+	drm_puts(&p, "\n**** Job ****\n");
+	xe_sched_job_snapshot_print(ss->job, &p);
 
-	drm_printf(&p, "\n**** HW Engines ****\n");
+	drm_puts(&p, "\n**** HW Engines ****\n");
 	for (i = 0; i < XE_NUM_HW_ENGINES; i++)
-		if (coredump->snapshot.hwe[i])
-			xe_hw_engine_snapshot_print(coredump->snapshot.hwe[i],
-						    &p);
-	drm_printf(&p, "\n**** VM state ****\n");
-	xe_vm_snapshot_print(coredump->snapshot.vm, &p);
+		if (ss->hwe[i])
+			xe_hw_engine_snapshot_print(ss->hwe[i], &p);
+
+	drm_puts(&p, "\n**** VM state ****\n");
+	xe_vm_snapshot_print(ss->vm, &p);
 
 	return count - iter.remain;
 }
@@ -247,18 +247,18 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
 	if (xe_force_wake_get(gt_to_fw(q->gt), XE_FORCEWAKE_ALL))
 		xe_gt_info(ss->gt, "failed to get forcewake for coredump capture\n");
 
-	coredump->snapshot.ct = xe_guc_ct_snapshot_capture(&guc->ct, true);
-	coredump->snapshot.ge = xe_guc_exec_queue_snapshot_capture(q);
-	coredump->snapshot.job = xe_sched_job_snapshot_capture(job);
-	coredump->snapshot.vm = xe_vm_snapshot_capture(q->vm);
+	ss->ct = xe_guc_ct_snapshot_capture(&guc->ct, true);
+	ss->ge = xe_guc_exec_queue_snapshot_capture(q);
+	ss->job = xe_sched_job_snapshot_capture(job);
+	ss->vm = xe_vm_snapshot_capture(q->vm);
 
 	for_each_hw_engine(hwe, q->gt, id) {
 		if (hwe->class != q->hwe->class ||
 		    !(BIT(hwe->logical_instance) & adj_logical_mask)) {
-			coredump->snapshot.hwe[id] = NULL;
+			ss->hwe[id] = NULL;
 			continue;
 		}
-		coredump->snapshot.hwe[id] = xe_hw_engine_snapshot_capture(hwe);
+		ss->hwe[id] = xe_hw_engine_snapshot_capture(hwe);
 	}
 
 	queue_work(system_unbound_wq, &ss->work);
-- 
2.46.0



* [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 01/11] drm/xe/guc: Remove spurious line feed in debug print John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 02/11] drm/xe/devcoredump: Use drm_puts and already cached local variables John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 04/11] drm/xe/devcoredump: Add ASCII85 dump helper function John.C.Harrison
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
is definitely not a GuC CT thing. So give it its own section heading.
The snapshot itself is really a capture of the submission backend's
internal state. Although all it currently prints out is the submission
contexts. So label it as 'Contexts'. If more general state is added
later then it could be changed to 'Submission backend' or some such.

Further, everything from the GuC CT section onwards is GT specific but
there was no indication of which GT it was related to (and that is
impossible to work out from the other fields that are given). So add a
GT section heading. Also include the tile id of the GT because that,
again, is significant information.

Lastly, drop a couple of unnecessary line feeds within sections.

v2: Add GT section heading, add tile id to device section.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
 drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
 drivers/gpu/drm/xe/xe_device.c            | 1 +
 drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
 drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
 5 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index d23719d5c2a3..2690f1d1cde4 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
 	drm_printf(&p, "Process: %s\n", ss->process_name);
 	xe_device_snapshot_print(xe, &p);
 
+	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
+	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
+
 	drm_puts(&p, "\n**** GuC CT ****\n");
 	xe_guc_ct_snapshot_print(ss->ct, &p);
+
+	drm_puts(&p, "\n**** Contexts ****\n");
 	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
 
 	drm_puts(&p, "\n**** Job ****\n");
diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
index 440d05d77a5a..3cc2f095fdfb 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
+++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
@@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
 	/* GuC snapshots */
 	/** @ct: GuC CT snapshot */
 	struct xe_guc_ct_snapshot *ct;
-	/** @ge: Guc Engine snapshot */
+
+	/** @ge: GuC Submission Engine snapshot */
 	struct xe_guc_submit_exec_queue_snapshot *ge;
 
 	/** @hwe: HW Engine snapshot array */
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 09a7ad830e69..030cf703e970 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
 
 	for_each_gt(gt, xe, id) {
 		drm_printf(p, "GT id: %u\n", id);
+		drm_printf(p, "\tTile: %u\n", gt->tile->id);
 		drm_printf(p, "\tType: %s\n",
 			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
 		drm_printf(p, "\tIP ver: %u.%u.%u\n",
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 0ac4a19ec9cc..8690df699170 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
 	if (!snapshot)
 		return;
 
-	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
+	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
 	drm_printf(p, "\tName: %s\n", snapshot->name);
 	drm_printf(p, "\tClass: %d\n", snapshot->class);
 	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
index ea6d9ef7fab6..6c9c27304cdc 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
 	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
 		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
 			   snapshot->reg.rcu_mode);
-	drm_puts(p, "\n");
 }
 
 /**
-- 
2.46.0



* [PATCH v9 04/11] drm/xe/devcoredump: Add ASCII85 dump helper function
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (2 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 05/11] drm/xe/guc: Copy GuC log prior to dumping John.C.Harrison
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

There is a need to include the GuC log and other large binary objects
in core dumps and via dmesg. So add a helper for dumping to a printer
function via conversion to ASCII85 encoding.

Another issue with dumping such a large buffer is that it can be slow,
especially if dumping to dmesg over a serial port. So add a yield to
prevent the 'task has been stuck for 120s' kernel hang check feature
from firing.
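
A tool consuming this output has to reverse the encoding. The following is a
minimal userspace sketch of such a decoder, not code from this series. It
assumes the convention of the kernel's ascii85_encode() helper (five
characters per 32-bit word, a lone 'z' for zero) and that any prefix such as
"Log data: " has already been stripped from the line. The function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one line of ASCII85 text into 32-bit words.
 * Stops at the end of the string or at the first line feed.
 * Returns the number of words written, or -1 if the input is malformed
 * or does not fit in 'max_words' entries.
 */
static long sketch_ascii85_decode(const char *text, uint32_t *out,
				  size_t max_words)
{
	size_t n = 0;

	assert(text != NULL && out != NULL);

	while (*text && *text != '\n') {
		uint64_t val = 0;
		int i;

		if (n == max_words)
			return -1;

		/* A lone 'z' stands for an all-zero word */
		if (*text == 'z') {
			out[n++] = 0;
			text++;
			continue;
		}

		for (i = 0; i < 5; i++) {
			/* Also rejects a group cut short by '\0' or '\n' */
			if (text[i] < '!' || text[i] > 'u')
				return -1;
			val = val * 85 + (uint64_t)(text[i] - '!');
		}

		/* Five base-85 digits can exceed 32 bits, reject those */
		if (val > UINT32_MAX)
			return -1;

		out[n++] = (uint32_t)val;
		text += 5;
	}

	return (long)n;
}
```

Because each output line holds only whole encoded words, a decoder of this
shape can process the dump one line at a time without carrying state across
lines.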

v2: Add a prefix to the output string. Fix memory allocation bug.
v3: Correct a string size calculation and clean up a define (review
feedback from Julia F).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_devcoredump.c | 87 +++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_devcoredump.h |  6 ++
 2 files changed, 93 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index 2690f1d1cde4..0884c49942fe 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -6,6 +6,7 @@
 #include "xe_devcoredump.h"
 #include "xe_devcoredump_types.h"
 
+#include <linux/ascii85.h>
 #include <linux/devcoredump.h>
 #include <generated/utsrelease.h>
 
@@ -315,3 +316,89 @@ int xe_devcoredump_init(struct xe_device *xe)
 }
 
 #endif
+
+/**
+ * xe_print_blob_ascii85 - print a BLOB to some useful location in ASCII85
+ *
+ * The output is split to multiple lines because some print targets, e.g. dmesg
+ * cannot handle arbitrarily long lines. Note also that printing to dmesg in
+ * piece-meal fashion is not possible, each separate call to drm_puts() has a
+ * line-feed automatically added! Therefore, the entire output line must be
+ * constructed in a local buffer first, then printed in one atomic output call.
+ *
+ * There is also a scheduler yield call to prevent the 'task has been stuck for
+ * 120s' kernel hang check feature from firing when printing to a slow target
+ * such as dmesg over a serial port.
+ *
+ * TODO: Add compression prior to the ASCII85 encoding to shrink huge buffers down.
+ *
+ * @p: the printer object to output to
+ * @prefix: optional prefix to add to output string
+ * @blob: the Binary Large OBject to dump out
+ * @offset: offset in bytes to skip from the front of the BLOB, must be a multiple of sizeof(u32)
+ * @size: the size in bytes of the BLOB, must be a multiple of sizeof(u32)
+ */
+void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix,
+			   const void *blob, size_t offset, size_t size)
+{
+	const u32 *blob32 = (const u32 *)blob;
+	char buff[ASCII85_BUFSZ], *line_buff;
+	size_t line_pos = 0;
+
+#define DMESG_MAX_LINE_LEN	800
+#define MIN_SPACE		(ASCII85_BUFSZ + 2)		/* 85 + "\n\0" */
+
+	if (size & 3)
+		drm_printf(p, "Size not word aligned: %zu", size);
+	if (offset & 3)
+		drm_printf(p, "Offset not word aligned: %zu", offset);
+
+	line_buff = kzalloc(DMESG_MAX_LINE_LEN, GFP_KERNEL);
+	if (IS_ERR_OR_NULL(line_buff)) {
+		drm_printf(p, "Failed to allocate line buffer: %pe", line_buff);
+		return;
+	}
+
+	blob32 += offset / sizeof(*blob32);
+	size /= sizeof(*blob32);
+
+	if (prefix) {
+		strscpy(line_buff, prefix, DMESG_MAX_LINE_LEN - MIN_SPACE - 2);
+		line_pos = strlen(line_buff);
+
+		line_buff[line_pos++] = ':';
+		line_buff[line_pos++] = ' ';
+	}
+
+	while (size--) {
+		u32 val = *(blob32++);
+
+		strscpy(line_buff + line_pos, ascii85_encode(val, buff),
+			DMESG_MAX_LINE_LEN - line_pos);
+		line_pos += strlen(line_buff + line_pos);
+
+		if ((line_pos + MIN_SPACE) >= DMESG_MAX_LINE_LEN) {
+			line_buff[line_pos++] = '\n';
+			line_buff[line_pos++] = 0;
+
+			drm_puts(p, line_buff);
+
+			line_pos = 0;
+
+			/* Prevent 'stuck thread' time out errors */
+			cond_resched();
+		}
+	}
+
+	if (line_pos) {
+		line_buff[line_pos++] = '\n';
+		line_buff[line_pos++] = 0;
+
+		drm_puts(p, line_buff);
+	}
+
+	kfree(line_buff);
+
+#undef MIN_SPACE
+#undef DMESG_MAX_LINE_LEN
+}
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.h b/drivers/gpu/drm/xe/xe_devcoredump.h
index e2fa65ce0932..a4eebc285fc8 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.h
+++ b/drivers/gpu/drm/xe/xe_devcoredump.h
@@ -6,6 +6,9 @@
 #ifndef _XE_DEVCOREDUMP_H_
 #define _XE_DEVCOREDUMP_H_
 
+#include <linux/types.h>
+
+struct drm_printer;
 struct xe_device;
 struct xe_sched_job;
 
@@ -23,4 +26,7 @@ static inline int xe_devcoredump_init(struct xe_device *xe)
 }
 #endif
 
+void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix,
+			   const void *blob, size_t offset, size_t size);
+
 #endif
-- 
2.46.0



* [PATCH v9 05/11] drm/xe/guc: Copy GuC log prior to dumping
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (3 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 04/11] drm/xe/devcoredump: Add ASCII85 dump helper function John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 06/11] drm/xe/guc: Use a two stage dump for GuC logs and add more info John.C.Harrison
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

Add an extra stage to the GuC log print to copy the log buffer into
regular host memory first, rather than printing the live GPU buffer
object directly. Doing so helps prevent inconsistencies due to the log
being updated as it is being dumped. It also allows the use of the
ASCII85 helper function for printing the log in a more compact form
than a straight hex dump.

v2: Use %zx instead of %lx for size_t prints.
v3: Replace hexdump code with ascii85 call (review feedback from
Matthew B). Move chunking code into next patch as that reduces the
deltas of both.
v4: Add a prefix to the ASCII85 output to aid tool parsing.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_guc_log.c | 40 +++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_log.c b/drivers/gpu/drm/xe/xe_guc_log.c
index a37ee3419428..be47780ec2a7 100644
--- a/drivers/gpu/drm/xe/xe_guc_log.c
+++ b/drivers/gpu/drm/xe/xe_guc_log.c
@@ -6,9 +6,12 @@
 #include "xe_guc_log.h"
 
 #include <drm/drm_managed.h>
+#include <linux/vmalloc.h>
 
 #include "xe_bo.h"
+#include "xe_devcoredump.h"
 #include "xe_gt.h"
+#include "xe_gt_printk.h"
 #include "xe_map.h"
 #include "xe_module.h"
 
@@ -49,32 +52,35 @@ static size_t guc_log_size(void)
 		CAPTURE_BUFFER_SIZE;
 }
 
+/**
+ * xe_guc_log_print - dump a copy of the GuC log to some useful location
+ * @log: GuC log structure
+ * @p: the printer object to output to
+ */
 void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p)
 {
 	struct xe_device *xe = log_to_xe(log);
 	size_t size;
-	int i, j;
+	void *copy;
 
-	xe_assert(xe, log->bo);
+	if (!log->bo) {
+		drm_puts(p, "GuC log buffer not allocated");
+		return;
+	}
 
 	size = log->bo->size;
 
-#define DW_PER_READ		128
-	xe_assert(xe, !(size % (DW_PER_READ * sizeof(u32))));
-	for (i = 0; i < size / sizeof(u32); i += DW_PER_READ) {
-		u32 read[DW_PER_READ];
-
-		xe_map_memcpy_from(xe, read, &log->bo->vmap, i * sizeof(u32),
-				   DW_PER_READ * sizeof(u32));
-#define DW_PER_PRINT		4
-		for (j = 0; j < DW_PER_READ / DW_PER_PRINT; ++j) {
-			u32 *print = read + j * DW_PER_PRINT;
-
-			drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n",
-				   *(print + 0), *(print + 1),
-				   *(print + 2), *(print + 3));
-		}
+	copy = vmalloc(size);
+	if (!copy) {
+		drm_printf(p, "Failed to allocate %zu", size);
+		return;
 	}
+
+	xe_map_memcpy_from(xe, copy, &log->bo->vmap, 0, size);
+
+	xe_print_blob_ascii85(p, "Log data", copy, 0, size);
+
+	vfree(copy);
 }
 
 int xe_guc_log_init(struct xe_guc_log *log)
-- 
2.46.0



* [PATCH v9 06/11] drm/xe/guc: Use a two stage dump for GuC logs and add more info
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (4 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 05/11] drm/xe/guc: Copy GuC log prior to dumping John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 07/11] drm/print: Introduce drm_line_printer John.C.Harrison
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

Split the GuC log dump into a two stage snapshot and print mechanism.
This allows the log to be captured at the point of an error (which may
be in a restricted context) and then dumped out later (from a regular
context such as a worker function or a sysfs file handler).

Also add a bunch of other useful pieces of information that can help
(or are fundamentally required!) to decode and parse the log.
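
The capture/print split can be sketched in plain userspace C as follows. This
is an illustration under stated assumptions rather than the driver code:
malloc()/calloc() stand in for the kernel allocators, the chunk size is
shrunk to 4096 bytes for readability (the patch uses SZ_2M), and all sketch_*
names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_CHUNK_SIZE 4096	/* small stand-in for the patch's SZ_2M */

struct sketch_snapshot {
	size_t size;		/* total bytes captured */
	size_t num_chunks;	/* entries in copy[] */
	void **copy;		/* one allocation per chunk */
};

/* Length of chunk i; only the last chunk may be short */
static size_t sketch_chunk_len(const struct sketch_snapshot *s, size_t i)
{
	size_t remain = s->size - i * SKETCH_CHUNK_SIZE;

	return remain < SKETCH_CHUNK_SIZE ? remain : SKETCH_CHUNK_SIZE;
}

static void sketch_snapshot_free(struct sketch_snapshot *s)
{
	size_t i;

	if (!s)
		return;

	if (s->copy) {
		for (i = 0; i < s->num_chunks; i++)
			free(s->copy[i]);	/* free(NULL) is a no-op */
		free(s->copy);
	}
	free(s);
}

/* Stage one: copy the live buffer into chunked private storage */
static struct sketch_snapshot *sketch_snapshot_capture(const void *src,
						       size_t size)
{
	struct sketch_snapshot *s;
	size_t i;

	if (!src || !size)
		return NULL;

	s = calloc(1, sizeof(*s));
	if (!s)
		return NULL;

	s->size = size;
	s->num_chunks = (size + SKETCH_CHUNK_SIZE - 1) / SKETCH_CHUNK_SIZE;
	s->copy = calloc(s->num_chunks, sizeof(*s->copy));
	if (!s->copy)
		goto fail;

	for (i = 0; i < s->num_chunks; i++) {
		size_t len = sketch_chunk_len(s, i);

		s->copy[i] = malloc(len);
		if (!s->copy[i])
			goto fail;
		memcpy(s->copy[i],
		       (const char *)src + i * SKETCH_CHUNK_SIZE, len);
	}

	return s;

fail:
	sketch_snapshot_free(s);
	return NULL;
}

/* Stage two: consume the private copy later; here it is just flattened */
static void sketch_snapshot_flatten(const struct sketch_snapshot *s, void *dst)
{
	size_t i;

	for (i = 0; i < s->num_chunks; i++)
		memcpy((char *)dst + i * SKETCH_CHUNK_SIZE, s->copy[i],
		       sketch_chunk_len(s, i));
}
```

The point of the split is that stage two never touches the live buffer, so
whatever the producer writes after the capture cannot leak into the dump.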

v2: Add kerneldoc and fix a couple of comment typos - review feedback
from Michal W.
v3: Move chunking code to this patch as it makes the deltas simpler.
Fix a bunch of kerneldoc issues.
v4: Move the CS frequency out of the coredump snapshot function into
the debugfs only code (as that info is already part of the main
devcoredump). Add a header to the debugfs log to match the one in the
devcoredump to aid processing by a unified tool. Add forcewake to the
GuC timestamp read so it actually works.
v6: Add colon to GuC version string (review feedback by Julia F).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/regs/xe_guc_regs.h |   1 +
 drivers/gpu/drm/xe/xe_guc_log.c       | 178 +++++++++++++++++++++++---
 drivers/gpu/drm/xe/xe_guc_log.h       |   4 +
 drivers/gpu/drm/xe/xe_guc_log_types.h |  27 ++++
 4 files changed, 195 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/xe/regs/xe_guc_regs.h b/drivers/gpu/drm/xe/regs/xe_guc_regs.h
index a5fd14307f94..b27b73680c12 100644
--- a/drivers/gpu/drm/xe/regs/xe_guc_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_guc_regs.h
@@ -84,6 +84,7 @@
 #define   HUC_LOADING_AGENT_GUC			REG_BIT(1)
 #define   GUC_WOPCM_OFFSET_VALID		REG_BIT(0)
 #define GUC_MAX_IDLE_COUNT			XE_REG(0xc3e4)
+#define GUC_PMTIMESTAMP				XE_REG(0xc3e8)
 
 #define GUC_SEND_INTERRUPT			XE_REG(0xc4c8)
 #define   GUC_SEND_TRIGGER			REG_BIT(0)
diff --git a/drivers/gpu/drm/xe/xe_guc_log.c b/drivers/gpu/drm/xe/xe_guc_log.c
index be47780ec2a7..24564624e91e 100644
--- a/drivers/gpu/drm/xe/xe_guc_log.c
+++ b/drivers/gpu/drm/xe/xe_guc_log.c
@@ -6,15 +6,23 @@
 #include "xe_guc_log.h"
 
 #include <drm/drm_managed.h>
-#include <linux/vmalloc.h>
 
+#include "regs/xe_guc_regs.h"
 #include "xe_bo.h"
 #include "xe_devcoredump.h"
+#include "xe_force_wake.h"
 #include "xe_gt.h"
 #include "xe_gt_printk.h"
 #include "xe_map.h"
+#include "xe_mmio.h"
 #include "xe_module.h"
 
+static struct xe_guc *
+log_to_guc(struct xe_guc_log *log)
+{
+	return container_of(log, struct xe_guc, log);
+}
+
 static struct xe_gt *
 log_to_gt(struct xe_guc_log *log)
 {
@@ -52,35 +60,175 @@ static size_t guc_log_size(void)
 		CAPTURE_BUFFER_SIZE;
 }
 
+#define GUC_LOG_CHUNK_SIZE	SZ_2M
+
+static struct xe_guc_log_snapshot *xe_guc_log_snapshot_alloc(struct xe_guc_log *log, bool atomic)
+{
+	struct xe_guc_log_snapshot *snapshot;
+	size_t remain;
+	int i;
+
+	snapshot = kzalloc(sizeof(*snapshot), atomic ? GFP_ATOMIC : GFP_KERNEL);
+	if (!snapshot)
+		return NULL;
+
+	/*
+	 * NB: kmalloc has a hard limit well below the maximum GuC log buffer size.
+	 * Also, can't use vmalloc as might be called from atomic context. So need
+	 * to break the buffer up into smaller chunks that can be allocated.
+	 */
+	snapshot->size = log->bo->size;
+	snapshot->num_chunks = DIV_ROUND_UP(snapshot->size, GUC_LOG_CHUNK_SIZE);
+
+	snapshot->copy = kcalloc(snapshot->num_chunks, sizeof(*snapshot->copy),
+				 atomic ? GFP_ATOMIC : GFP_KERNEL);
+	if (!snapshot->copy)
+		goto fail_snap;
+
+	remain = snapshot->size;
+	for (i = 0; i < snapshot->num_chunks; i++) {
+		size_t size = min(GUC_LOG_CHUNK_SIZE, remain);
+
+		snapshot->copy[i] = kmalloc(size, atomic ? GFP_ATOMIC : GFP_KERNEL);
+		if (!snapshot->copy[i])
+			goto fail_copy;
+		remain -= size;
+	}
+
+	return snapshot;
+
+fail_copy:
+	for (i = 0; i < snapshot->num_chunks; i++)
+		kfree(snapshot->copy[i]);
+	kfree(snapshot->copy);
+fail_snap:
+	kfree(snapshot);
+	return NULL;
+}
+
 /**
- * xe_guc_log_print - dump a copy of the GuC log to some useful location
+ * xe_guc_log_snapshot_free - free a previously captured GuC log snapshot
+ * @snapshot: GuC log snapshot structure
+ *
+ * Releases each chunk of the log copy, the chunk array and then the snapshot itself. Passing a
+ * NULL snapshot is allowed and does nothing.
+ */
+void xe_guc_log_snapshot_free(struct xe_guc_log_snapshot *snapshot)
+{
+	int i;
+
+	if (!snapshot)
+		return;
+
+	if (snapshot->copy) {
+		for (i = 0; i < snapshot->num_chunks; i++)
+			kfree(snapshot->copy[i]);
+		kfree(snapshot->copy);
+	}
+
+	kfree(snapshot);
+}
+
+/**
+ * xe_guc_log_snapshot_capture - create a new snapshot copy of the GuC log for later dumping
  * @log: GuC log structure
- * @p: the printer object to output to
+ * @atomic: is the call inside an atomic section of some kind?
+ *
+ * Return: pointer to a newly allocated snapshot object or null if out of memory. Caller is
+ * responsible for calling xe_guc_log_snapshot_free when done with the snapshot.
  */
-void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p)
+struct xe_guc_log_snapshot *xe_guc_log_snapshot_capture(struct xe_guc_log *log, bool atomic)
 {
+	struct xe_guc_log_snapshot *snapshot;
 	struct xe_device *xe = log_to_xe(log);
-	size_t size;
-	void *copy;
+	struct xe_guc *guc = log_to_guc(log);
+	struct xe_gt *gt = log_to_gt(log);
+	size_t remain;
+	int i, err;
 
 	if (!log->bo) {
-		drm_puts(p, "GuC log buffer not allocated");
-		return;
+		xe_gt_err(gt, "GuC log buffer not allocated\n");
+		return NULL;
+	}
+
+	snapshot = xe_guc_log_snapshot_alloc(log, atomic);
+	if (!snapshot) {
+		xe_gt_err(gt, "GuC log snapshot not allocated\n");
+		return NULL;
 	}
 
-	size = log->bo->size;
+	remain = snapshot->size;
+	for (i = 0; i < snapshot->num_chunks; i++) {
+		size_t size = min(GUC_LOG_CHUNK_SIZE, remain);
+
+		xe_map_memcpy_from(xe, snapshot->copy[i], &log->bo->vmap,
+				   i * GUC_LOG_CHUNK_SIZE, size);
+		remain -= size;
+	}
+
+	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
+	if (err) {
+		snapshot->stamp = ~0;
+	} else {
+		snapshot->stamp = xe_mmio_read32(&gt->mmio, GUC_PMTIMESTAMP);
+		xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
+	}
+	snapshot->ktime = ktime_get_boottime_ns();
+	snapshot->level = log->level;
+	snapshot->ver_found = guc->fw.versions.found[XE_UC_FW_VER_RELEASE];
+	snapshot->ver_want = guc->fw.versions.wanted;
+	snapshot->path = guc->fw.path;
+
+	return snapshot;
+}
+
+/**
+ * xe_guc_log_snapshot_print - dump a previously saved copy of the GuC log to some useful location
+ * @snapshot: a snapshot of the GuC log
+ * @p: the printer object to output to
+ */
+void xe_guc_log_snapshot_print(struct xe_guc_log_snapshot *snapshot, struct drm_printer *p)
+{
+	size_t remain;
+	int i;
 
-	copy = vmalloc(size);
-	if (!copy) {
-		drm_printf(p, "Failed to allocate %zu", size);
+	if (!snapshot) {
+		drm_printf(p, "GuC log snapshot not allocated!\n");
 		return;
 	}
 
-	xe_map_memcpy_from(xe, copy, &log->bo->vmap, 0, size);
+	drm_printf(p, "GuC firmware: %s\n", snapshot->path);
+	drm_printf(p, "GuC version: %u.%u.%u (wanted %u.%u.%u)\n",
+		   snapshot->ver_found.major, snapshot->ver_found.minor, snapshot->ver_found.patch,
+		   snapshot->ver_want.major, snapshot->ver_want.minor, snapshot->ver_want.patch);
+	drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", snapshot->ktime, snapshot->ktime);
+	drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", snapshot->stamp, snapshot->stamp);
+	drm_printf(p, "Log level: %u\n", snapshot->level);
+
+	remain = snapshot->size;
+	for (i = 0; i < snapshot->num_chunks; i++) {
+		size_t size = min(GUC_LOG_CHUNK_SIZE, remain);
+
+		xe_print_blob_ascii85(p, i ? NULL : "Log data", snapshot->copy[i], 0, size);
+		remain -= size;
+	}
+}
+
+/**
+ * xe_guc_log_print - dump a copy of the GuC log to some useful location
+ * @log: GuC log structure
+ * @p: the printer object to output to
+ */
+void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p)
+{
+	struct xe_guc_log_snapshot *snapshot;
 
-	xe_print_blob_ascii85(p, "Log data", copy, 0, size);
+	drm_printf(p, "**** GuC Log ****\n");
 
-	vfree(copy);
+	snapshot = xe_guc_log_snapshot_capture(log, false);
+	drm_printf(p, "CS reference clock: %u\n", log_to_gt(log)->info.reference_clock);
+	xe_guc_log_snapshot_print(snapshot, p);
+	xe_guc_log_snapshot_free(snapshot);
 }
 
 int xe_guc_log_init(struct xe_guc_log *log)
diff --git a/drivers/gpu/drm/xe/xe_guc_log.h b/drivers/gpu/drm/xe/xe_guc_log.h
index 2d25ab28b4b3..949d2c98343d 100644
--- a/drivers/gpu/drm/xe/xe_guc_log.h
+++ b/drivers/gpu/drm/xe/xe_guc_log.h
@@ -9,6 +9,7 @@
 #include "xe_guc_log_types.h"
 
 struct drm_printer;
+struct xe_device;
 
 #if IS_ENABLED(CONFIG_DRM_XE_LARGE_GUC_BUFFER)
 #define CRASH_BUFFER_SIZE       SZ_1M
@@ -38,6 +39,9 @@ struct drm_printer;
 
 int xe_guc_log_init(struct xe_guc_log *log);
 void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p);
+struct xe_guc_log_snapshot *xe_guc_log_snapshot_capture(struct xe_guc_log *log, bool atomic);
+void xe_guc_log_snapshot_print(struct xe_guc_log_snapshot *snapshot, struct drm_printer *p);
+void xe_guc_log_snapshot_free(struct xe_guc_log_snapshot *snapshot);
 
 static inline u32
 xe_guc_log_get_level(struct xe_guc_log *log)
diff --git a/drivers/gpu/drm/xe/xe_guc_log_types.h b/drivers/gpu/drm/xe/xe_guc_log_types.h
index 125080d138a7..962b9edbd9eb 100644
--- a/drivers/gpu/drm/xe/xe_guc_log_types.h
+++ b/drivers/gpu/drm/xe/xe_guc_log_types.h
@@ -8,8 +8,35 @@
 
 #include <linux/types.h>
 
+#include "xe_uc_fw_types.h"
+
 struct xe_bo;
 
+/**
+ * struct xe_guc_log_snapshot:
+ * Capture of the GuC log plus various state useful for decoding the log
+ */
+struct xe_guc_log_snapshot {
+	/** @size: Size in bytes of the @copy allocation */
+	size_t size;
+	/** @copy: Host memory copy of the log buffer for later dumping, split into chunks */
+	void **copy;
+	/** @num_chunks: Number of chunks within @copy */
+	int num_chunks;
+	/** @ktime: Kernel time the snapshot was taken */
+	u64 ktime;
+	/** @stamp: GuC timestamp at which the snapshot was taken */
+	u32 stamp;
+	/** @level: GuC log verbosity level */
+	u32 level;
+	/** @ver_found: GuC firmware version */
+	struct xe_uc_fw_version ver_found;
+	/** @ver_want: GuC firmware version that driver expected */
+	struct xe_uc_fw_version ver_want;
+	/** @path: Path of GuC firmware blob */
+	const char *path;
+};
+
 /**
  * struct xe_guc_log - GuC log
  */
-- 
2.46.0



* [PATCH v9 07/11] drm/print: Introduce drm_line_printer
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (5 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 06/11] drm/xe/guc: Use a two stage dump for GuC logs and add more info John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 08/11] drm/xe/guc: Dead CT helper John.C.Harrison
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: Michal Wajdeczko, Jani Nikula, John Harrison, dri-devel

From: Michal Wajdeczko <michal.wajdeczko@intel.com>

This drm_printer wrapper increases the robustness of captured output
generated by any other drm_printer: it prefixes each output line with
a line number so that lost intermediate lines can be detected. Helpful
when capturing crash data.
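
For illustration only, the numbering scheme can be sketched in plain
userspace C. The struct and function names below are hypothetical
stand-ins, not part of the patch; only the prefix/series/counter
formatting is meant to mirror __drm_printfn_line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for the line state kept in struct drm_printer. */
struct line_printer {
	const char *prefix;	/* optional prefix, NULL for none */
	unsigned int series;	/* dump identifier, 0 to omit it */
	unsigned int counter;	/* rolling per-line count */
};

/* Format one numbered line into buf, following the same prefix rules as the patch. */
static int line_printer_format(struct line_printer *lp, char *buf, size_t len,
			       const char *msg)
{
	unsigned int counter = ++lp->counter;
	const char *prefix = lp->prefix ? lp->prefix : "";
	const char *pad = lp->prefix ? " " : "";

	if (lp->series)
		return snprintf(buf, len, "%s%s%u.%u: %s",
				prefix, pad, lp->series, counter, msg);

	return snprintf(buf, len, "%s%s%u: %s", prefix, pad, counter, msg);
}
```

A log parser that sees "1.1" followed directly by "1.3" then knows that
one line of series 1 was dropped, and lines from a concurrent dump show
up with a different series number.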

v2: Extended short int counters to full int (JohnH)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/drm_print.c | 14 ++++++++
 include/drm/drm_print.h     | 64 +++++++++++++++++++++++++++++++++++++
 2 files changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index 0081190201a7..08cfea04e22b 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -235,6 +235,20 @@ void __drm_printfn_err(struct drm_printer *p, struct va_format *vaf)
 }
 EXPORT_SYMBOL(__drm_printfn_err);
 
+void __drm_printfn_line(struct drm_printer *p, struct va_format *vaf)
+{
+	unsigned int counter = ++p->line.counter;
+	const char *prefix = p->prefix ?: "";
+	const char *pad = p->prefix ? " " : "";
+
+	if (p->line.series)
+		drm_printf(p->arg, "%s%s%u.%u: %pV",
+			   prefix, pad, p->line.series, counter, vaf);
+	else
+		drm_printf(p->arg, "%s%s%u: %pV", prefix, pad, counter, vaf);
+}
+EXPORT_SYMBOL(__drm_printfn_line);
+
 /**
  * drm_puts - print a const string to a &drm_printer stream
  * @p: the &drm printer
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index d2676831d765..b3906dc04388 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -177,6 +177,10 @@ struct drm_printer {
 	void *arg;
 	const void *origin;
 	const char *prefix;
+	struct {
+		unsigned int series;
+		unsigned int counter;
+	} line;
 	enum drm_debug_category category;
 };
 
@@ -187,6 +191,7 @@ void __drm_puts_seq_file(struct drm_printer *p, const char *str);
 void __drm_printfn_info(struct drm_printer *p, struct va_format *vaf);
 void __drm_printfn_dbg(struct drm_printer *p, struct va_format *vaf);
 void __drm_printfn_err(struct drm_printer *p, struct va_format *vaf);
+void __drm_printfn_line(struct drm_printer *p, struct va_format *vaf);
 
 __printf(2, 3)
 void drm_printf(struct drm_printer *p, const char *f, ...);
@@ -411,6 +416,65 @@ static inline struct drm_printer drm_err_printer(struct drm_device *drm,
 	return p;
 }
 
+/**
+ * drm_line_printer - construct a &drm_printer that prefixes outputs with line numbers
+ * @p: the &struct drm_printer which actually generates the output
+ * @prefix: optional output prefix, or NULL for no prefix
+ * @series: optional unique series identifier, or 0 to omit identifier in the output
+ *
+ * This printer can be used to increase the robustness of the captured output
+ * by making it possible to detect lost intermediate lines of the output.
+ * Helpful while capturing some crash data.
+ *
+ * Example 1::
+ *
+ *	void crash_dump(struct drm_device *drm)
+ *	{
+ *		static unsigned int id;
+ *		struct drm_printer p = drm_err_printer(drm, "crash");
+ *		struct drm_printer lp = drm_line_printer(&p, "dump", ++id);
+ *
+ *		drm_printf(&lp, "foo");
+ *		drm_printf(&lp, "bar");
+ *	}
+ *
+ * Above code will print into the dmesg something like::
+ *
+ *	[ ] 0000:00:00.0: [drm] *ERROR* crash dump 1.1: foo
+ *	[ ] 0000:00:00.0: [drm] *ERROR* crash dump 1.2: bar
+ *
+ * Example 2::
+ *
+ *	void line_dump(struct device *dev)
+ *	{
+ *		struct drm_printer p = drm_info_printer(dev);
+ *		struct drm_printer lp = drm_line_printer(&p, NULL, 0);
+ *
+ *		drm_printf(&lp, "foo");
+ *		drm_printf(&lp, "bar");
+ *	}
+ *
+ * Above code will print::
+ *
+ *	[ ] 0000:00:00.0: [drm] 1: foo
+ *	[ ] 0000:00:00.0: [drm] 2: bar
+ *
+ * RETURNS:
+ * The &drm_printer object
+ */
+static inline struct drm_printer drm_line_printer(struct drm_printer *p,
+						  const char *prefix,
+						  unsigned int series)
+{
+	struct drm_printer lp = {
+		.printfn = __drm_printfn_line,
+		.arg = p,
+		.prefix = prefix,
+		.line = { .series = series, },
+	};
+	return lp;
+}
+
 /*
  * struct device based logging
  *
-- 
2.46.0



* [PATCH v9 08/11] drm/xe/guc: Dead CT helper
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (6 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 07/11] drm/print: Introduce drm_line_printer John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 09/11] drm/xe/guc: Dump entire CTB on errors John.C.Harrison
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

Add a worker function helper for asynchronously dumping state when an
internal/fatal error is detected in CT processing. Being asynchronous
is required to avoid deadlocks and scheduling-while-atomic or
process-stalled-for-too-long issues. Also check for a bunch more error
conditions and improve the handling of some existing checks.
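
As a rough sketch of the intended state machine (userspace C, locking
omitted, all names hypothetical and much simplified compared to the
patch): the first error claims the capture, later errors only add their
reason bit, and reporting is re-armed only once a reset has been seen.

```c
#include <stdbool.h>

/* Hypothetical bit indices: two internal states, then error reasons. */
enum {
	STATE_REARM,	/* mask 0x1 */
	STATE_CAPTURE,	/* mask 0x2 */
	ERR_SETUP,	/* mask 0x4 */
	ERR_H2G_WRITE,	/* mask 0x8 */
};

struct dead_state {
	unsigned int reason;	/* bit mask of the codes above */
	bool reported;		/* a dump was already printed */
};

/* Record an error; returns true if the caller should take the capture. */
static bool dead_note_error(struct dead_state *d, unsigned int code)
{
	bool have_capture;

	/* Ignore further errors after the first dump until a reset */
	if (d->reported)
		return false;

	have_capture = d->reason & (1u << STATE_CAPTURE);
	d->reason |= (1u << code) | (1u << STATE_CAPTURE);

	return !have_capture;
}

/* The channel was reset: allow re-arming once the pending dump is done. */
static void dead_note_reset(struct dead_state *d)
{
	if (d->reason)
		d->reason |= 1u << STATE_REARM;
}

/* Worker epilogue: mark as reported, re-arm if a reset was seen. */
static void dead_worker_done(struct dead_state *d)
{
	d->reported = true;

	if (d->reason & (1u << STATE_REARM)) {
		d->reason = 0;
		d->reported = false;
	}
}
```

In the real code these updates happen under a spinlock and the dump
itself runs from a worker, which is what keeps the error paths free of
sleeping or long-running work.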

v2: Use compile time CONFIG check for new (but not directly CT_DEAD
related) checks and use unsigned int for a bitmask, rename
CT_DEAD_RESET to CT_DEAD_REARM and add some explaining comments,
rename 'hxg' macro parameter to 'ctb' - review feedback from Michal W.
Drop CT_DEAD_ALIVE as no need for a bitfield define to just set the
entire mask to zero.
v3: Fix kerneldoc
v4: Nullify some floating pointers after free.
v5: Add section headings and device info to make the state dump look
more like a devcoredump to allow parsing by the same tools (eventual
aim is to just call the devcoredump code itself, but that currently
requires an xe_sched_job, which is not available in the CT code).
v6: Fix potential for leaking snapshots with concurrent error
conditions (review feedback from Julia F).
v7: Don't complain about unexpected G2H messages yet because there is
a known issue causing them. Fix bit shift bug with v6 change. Add GT
id to fake coredump headers and use puts instead of printf.
v8: Disable the head mis-match check in g2h_read because it is failing
on various discrete platforms due to unknown reasons.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 .../drm/xe/abi/guc_communication_ctb_abi.h    |   1 +
 drivers/gpu/drm/xe/xe_guc.c                   |   2 +-
 drivers/gpu/drm/xe/xe_guc_ct.c                | 336 ++++++++++++++++--
 drivers/gpu/drm/xe/xe_guc_ct.h                |   2 +-
 drivers/gpu/drm/xe/xe_guc_ct_types.h          |  23 ++
 5 files changed, 335 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h
index 8f86a16dc577..f58198cf2cf6 100644
--- a/drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h
@@ -52,6 +52,7 @@ struct guc_ct_buffer_desc {
 #define GUC_CTB_STATUS_OVERFLOW				(1 << 0)
 #define GUC_CTB_STATUS_UNDERFLOW			(1 << 1)
 #define GUC_CTB_STATUS_MISMATCH				(1 << 2)
+#define GUC_CTB_STATUS_DISABLED				(1 << 3)
 	u32 reserved[13];
 } __packed;
 static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index 2ce22ed12b02..cf646057281b 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -1181,7 +1181,7 @@ void xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p)
 
 	xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 
-	xe_guc_ct_print(&guc->ct, p, false);
+	xe_guc_ct_print(&guc->ct, p);
 	xe_guc_submit_print(guc, p);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index a63fe0a9077a..46320b94bceb 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -25,12 +25,48 @@
 #include "xe_gt_sriov_pf_monitor.h"
 #include "xe_gt_tlb_invalidation.h"
 #include "xe_guc.h"
+#include "xe_guc_log.h"
 #include "xe_guc_relay.h"
 #include "xe_guc_submit.h"
 #include "xe_map.h"
 #include "xe_pm.h"
 #include "xe_trace_guc.h"
 
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+enum {
+	/* Internal states, not error conditions */
+	CT_DEAD_STATE_REARM,			/* 0x0001 */
+	CT_DEAD_STATE_CAPTURE,			/* 0x0002 */
+
+	/* Error conditions */
+	CT_DEAD_SETUP,				/* 0x0004 */
+	CT_DEAD_H2G_WRITE,			/* 0x0008 */
+	CT_DEAD_H2G_HAS_ROOM,			/* 0x0010 */
+	CT_DEAD_G2H_READ,			/* 0x0020 */
+	CT_DEAD_G2H_RECV,			/* 0x0040 */
+	CT_DEAD_G2H_RELEASE,			/* 0x0080 */
+	CT_DEAD_DEADLOCK,			/* 0x0100 */
+	CT_DEAD_PROCESS_FAILED,			/* 0x0200 */
+	CT_DEAD_FAST_G2H,			/* 0x0400 */
+	CT_DEAD_PARSE_G2H_RESPONSE,		/* 0x0800 */
+	CT_DEAD_PARSE_G2H_UNKNOWN,		/* 0x1000 */
+	CT_DEAD_PARSE_G2H_ORIGIN,		/* 0x2000 */
+	CT_DEAD_PARSE_G2H_TYPE,			/* 0x4000 */
+};
+
+static void ct_dead_worker_func(struct work_struct *w);
+static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reason_code);
+
+#define CT_DEAD(ct, ctb, reason_code)		ct_dead_capture((ct), (ctb), CT_DEAD_##reason_code)
+#else
+#define CT_DEAD(ct, ctb, reason)			\
+	do {						\
+		struct guc_ctb *_ctb = (ctb);		\
+		if (_ctb)				\
+			_ctb->info.broken = true;	\
+	} while (0)
+#endif
+
 /* Used when a CT send wants to block and / or receive data */
 struct g2h_fence {
 	u32 *response_buffer;
@@ -183,6 +219,10 @@ int xe_guc_ct_init(struct xe_guc_ct *ct)
 	xa_init(&ct->fence_lookup);
 	INIT_WORK(&ct->g2h_worker, g2h_worker_func);
 	INIT_DELAYED_WORK(&ct->safe_mode_worker, safe_mode_worker_func);
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+	spin_lock_init(&ct->dead.lock);
+	INIT_WORK(&ct->dead.worker, ct_dead_worker_func);
+#endif
 	init_waitqueue_head(&ct->wq);
 	init_waitqueue_head(&ct->g2h_fence_wq);
 
@@ -419,10 +459,22 @@ int xe_guc_ct_enable(struct xe_guc_ct *ct)
 	if (ct_needs_safe_mode(ct))
 		ct_enter_safe_mode(ct);
 
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+	/*
+	 * The CT has now been reset so the dumper can be re-armed
+	 * after any existing dead state has been dumped.
+	 */
+	spin_lock_irq(&ct->dead.lock);
+	if (ct->dead.reason)
+		ct->dead.reason |= (1 << CT_DEAD_STATE_REARM);
+	spin_unlock_irq(&ct->dead.lock);
+#endif
+
 	return 0;
 
 err_out:
 	xe_gt_err(gt, "Failed to enable GuC CT (%pe)\n", ERR_PTR(err));
+	CT_DEAD(ct, NULL, SETUP);
 
 	return err;
 }
@@ -466,6 +518,19 @@ static bool h2g_has_room(struct xe_guc_ct *ct, u32 cmd_len)
 
 	if (cmd_len > h2g->info.space) {
 		h2g->info.head = desc_read(ct_to_xe(ct), h2g, head);
+
+		if (h2g->info.head > h2g->info.size) {
+			struct xe_device *xe = ct_to_xe(ct);
+			u32 desc_status = desc_read(xe, h2g, status);
+
+			desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_OVERFLOW);
+
+			xe_gt_err(ct_to_gt(ct), "CT: invalid head offset %u >= %u\n",
+				  h2g->info.head, h2g->info.size);
+			CT_DEAD(ct, h2g, H2G_HAS_ROOM);
+			return false;
+		}
+
 		h2g->info.space = CIRC_SPACE(h2g->info.tail, h2g->info.head,
 					     h2g->info.size) -
 				  h2g->info.resv_space;
@@ -521,10 +586,24 @@ static void __g2h_reserve_space(struct xe_guc_ct *ct, u32 g2h_len, u32 num_g2h)
 
 static void __g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len)
 {
+	bool bad = false;
+
 	lockdep_assert_held(&ct->fast_lock);
-	xe_gt_assert(ct_to_gt(ct), ct->ctbs.g2h.info.space + g2h_len <=
-		     ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space);
-	xe_gt_assert(ct_to_gt(ct), ct->g2h_outstanding);
+
+	bad = ct->ctbs.g2h.info.space + g2h_len >
+		     ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space;
+	bad |= !ct->g2h_outstanding;
+
+	if (bad) {
+		xe_gt_err(ct_to_gt(ct), "Invalid G2H release: %d + %d vs %d - %d -> %d vs %d, outstanding = %d!\n",
+			  ct->ctbs.g2h.info.space, g2h_len,
+			  ct->ctbs.g2h.info.size, ct->ctbs.g2h.info.resv_space,
+			  ct->ctbs.g2h.info.space + g2h_len,
+			  ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space,
+			  ct->g2h_outstanding);
+		CT_DEAD(ct, &ct->ctbs.g2h, G2H_RELEASE);
+		return;
+	}
 
 	ct->ctbs.g2h.info.space += g2h_len;
 	if (!--ct->g2h_outstanding)
@@ -551,12 +630,43 @@ static int h2g_write(struct xe_guc_ct *ct, const u32 *action, u32 len,
 	u32 full_len;
 	struct iosys_map map = IOSYS_MAP_INIT_OFFSET(&h2g->cmds,
 							 tail * sizeof(u32));
+	u32 desc_status;
 
 	full_len = len + GUC_CTB_HDR_LEN;
 
 	lockdep_assert_held(&ct->lock);
 	xe_gt_assert(gt, full_len <= GUC_CTB_MSG_MAX_LEN);
-	xe_gt_assert(gt, tail <= h2g->info.size);
+
+	desc_status = desc_read(xe, h2g, status);
+	if (desc_status) {
+		xe_gt_err(gt, "CT write: non-zero status: %u\n", desc_status);
+		goto corrupted;
+	}
+
+	if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
+		u32 desc_tail = desc_read(xe, h2g, tail);
+		u32 desc_head = desc_read(xe, h2g, head);
+
+		if (tail != desc_tail) {
+			desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_MISMATCH);
+			xe_gt_err(gt, "CT write: tail was modified %u != %u\n", desc_tail, tail);
+			goto corrupted;
+		}
+
+		if (tail > h2g->info.size) {
+			desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_OVERFLOW);
+			xe_gt_err(gt, "CT write: tail out of range: %u vs %u\n",
+				  tail, h2g->info.size);
+			goto corrupted;
+		}
+
+		if (desc_head >= h2g->info.size) {
+			desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_OVERFLOW);
+			xe_gt_err(gt, "CT write: invalid head offset %u >= %u\n",
+				  desc_head, h2g->info.size);
+			goto corrupted;
+		}
+	}
 
 	/* Command will wrap, zero fill (NOPs), return and check credits again */
 	if (tail + full_len > h2g->info.size) {
@@ -609,6 +719,10 @@ static int h2g_write(struct xe_guc_ct *ct, const u32 *action, u32 len,
 			     desc_read(xe, h2g, head), h2g->info.tail);
 
 	return 0;
+
+corrupted:
+	CT_DEAD(ct, &ct->ctbs.h2g, H2G_WRITE);
+	return -EPIPE;
 }
 
 /*
@@ -720,7 +834,6 @@ static int guc_ct_send_locked(struct xe_guc_ct *ct, const u32 *action, u32 len,
 {
 	struct xe_device *xe = ct_to_xe(ct);
 	struct xe_gt *gt = ct_to_gt(ct);
-	struct drm_printer p = xe_gt_info_printer(gt);
 	unsigned int sleep_period_ms = 1;
 	int ret;
 
@@ -773,8 +886,13 @@ static int guc_ct_send_locked(struct xe_guc_ct *ct, const u32 *action, u32 len,
 			goto broken;
 #undef g2h_avail
 
-		if (dequeue_one_g2h(ct) < 0)
+		ret = dequeue_one_g2h(ct);
+		if (ret < 0) {
+			if (ret != -ECANCELED)
+				xe_gt_err(ct_to_gt(ct), "CTB receive failed (%pe)\n",
+					  ERR_PTR(ret));
 			goto broken;
+		}
 
 		goto try_again;
 	}
@@ -783,8 +901,7 @@ static int guc_ct_send_locked(struct xe_guc_ct *ct, const u32 *action, u32 len,
 
 broken:
 	xe_gt_err(gt, "No forward process on H2G, reset required\n");
-	xe_guc_ct_print(ct, &p, true);
-	ct->ctbs.h2g.info.broken = true;
+	CT_DEAD(ct, &ct->ctbs.h2g, DEADLOCK);
 
 	return -EDEADLK;
 }
@@ -1011,6 +1128,7 @@ static int parse_g2h_response(struct xe_guc_ct *ct, u32 *msg, u32 len)
 		else
 			xe_gt_err(gt, "unexpected response %u for FAST_REQ H2G fence 0x%x!\n",
 				  type, fence);
+		CT_DEAD(ct, NULL, PARSE_G2H_RESPONSE);
 
 		return -EPROTO;
 	}
@@ -1018,6 +1136,7 @@ static int parse_g2h_response(struct xe_guc_ct *ct, u32 *msg, u32 len)
 	g2h_fence = xa_erase(&ct->fence_lookup, fence);
 	if (unlikely(!g2h_fence)) {
 		/* Don't tear down channel, as send could've timed out */
+		/* CT_DEAD(ct, NULL, PARSE_G2H_UNKNOWN); */
 		xe_gt_warn(gt, "G2H fence (%u) not found!\n", fence);
 		g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
 		return 0;
@@ -1062,7 +1181,7 @@ static int parse_g2h_msg(struct xe_guc_ct *ct, u32 *msg, u32 len)
 	if (unlikely(origin != GUC_HXG_ORIGIN_GUC)) {
 		xe_gt_err(gt, "G2H channel broken on read, origin=%u, reset required\n",
 			  origin);
-		ct->ctbs.g2h.info.broken = true;
+		CT_DEAD(ct, &ct->ctbs.g2h, PARSE_G2H_ORIGIN);
 
 		return -EPROTO;
 	}
@@ -1080,7 +1199,7 @@ static int parse_g2h_msg(struct xe_guc_ct *ct, u32 *msg, u32 len)
 	default:
 		xe_gt_err(gt, "G2H channel broken on read, type=%u, reset required\n",
 			  type);
-		ct->ctbs.g2h.info.broken = true;
+		CT_DEAD(ct, &ct->ctbs.g2h, PARSE_G2H_TYPE);
 
 		ret = -EOPNOTSUPP;
 	}
@@ -1157,9 +1276,11 @@ static int process_g2h_msg(struct xe_guc_ct *ct, u32 *msg, u32 len)
 		xe_gt_err(gt, "unexpected G2H action 0x%04x\n", action);
 	}
 
-	if (ret)
+	if (ret) {
 		xe_gt_err(gt, "G2H action 0x%04x failed (%pe)\n",
 			  action, ERR_PTR(ret));
+		CT_DEAD(ct, NULL, PROCESS_FAILED);
+	}
 
 	return 0;
 }
@@ -1169,7 +1290,7 @@ static int g2h_read(struct xe_guc_ct *ct, u32 *msg, bool fast_path)
 	struct xe_device *xe = ct_to_xe(ct);
 	struct xe_gt *gt = ct_to_gt(ct);
 	struct guc_ctb *g2h = &ct->ctbs.g2h;
-	u32 tail, head, len;
+	u32 tail, head, len, desc_status;
 	s32 avail;
 	u32 action;
 	u32 *hxg;
@@ -1188,6 +1309,63 @@ static int g2h_read(struct xe_guc_ct *ct, u32 *msg, bool fast_path)
 
 	xe_gt_assert(gt, xe_guc_ct_enabled(ct));
 
+	desc_status = desc_read(xe, g2h, status);
+	if (desc_status) {
+		if (desc_status & GUC_CTB_STATUS_DISABLED) {
+			/*
+			 * Potentially valid if a CLIENT_RESET request resulted in
+			 * contexts/engines being reset. But should never happen as
+			 * no contexts should be active when CLIENT_RESET is sent.
+			 */
+			xe_gt_err(gt, "CT read: unexpected G2H after GuC has stopped!\n");
+			desc_status &= ~GUC_CTB_STATUS_DISABLED;
+		}
+
+		if (desc_status) {
+			xe_gt_err(gt, "CT read: non-zero status: %u\n", desc_status);
+			goto corrupted;
+		}
+	}
+
+	if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
+		u32 desc_tail = desc_read(xe, g2h, tail);
+		u32 desc_head = desc_read(xe, g2h, head);
+
+		/*
+		 * info.head and desc_head are updated back-to-back at the end of
+		 * this function and nowhere else. Hence, they cannot be different
+		 * unless two g2h_read calls are running concurrently. Which is not
+		 * possible because it is guarded by ct->fast_lock. And yet, some
+		 * discrete platforms are regularly hitting this error :(.
+		 *
+		 * desc_head rolling backwards shouldn't cause any noticeable
+		 * problems - just a delay in GuC being allowed to proceed past that
+		 * point in the queue. So for now, just disable the error until it
+		 * can be root caused.
+		 *
+		if (g2h->info.head != desc_head) {
+			desc_write(xe, g2h, status, desc_status | GUC_CTB_STATUS_MISMATCH);
+			xe_gt_err(gt, "CT read: head was modified %u != %u\n",
+				  desc_head, g2h->info.head);
+			goto corrupted;
+		}
+		 */
+
+		if (g2h->info.head > g2h->info.size) {
+			desc_write(xe, g2h, status, desc_status | GUC_CTB_STATUS_OVERFLOW);
+			xe_gt_err(gt, "CT read: head out of range: %u vs %u\n",
+				  g2h->info.head, g2h->info.size);
+			goto corrupted;
+		}
+
+		if (desc_tail >= g2h->info.size) {
+			desc_write(xe, g2h, status, desc_status | GUC_CTB_STATUS_OVERFLOW);
+			xe_gt_err(gt, "CT read: invalid tail offset %u >= %u\n",
+				  desc_tail, g2h->info.size);
+			goto corrupted;
+		}
+	}
+
 	/* Calculate DW available to read */
 	tail = desc_read(xe, g2h, tail);
 	avail = tail - g2h->info.head;
@@ -1204,9 +1382,7 @@ static int g2h_read(struct xe_guc_ct *ct, u32 *msg, bool fast_path)
 	if (len > avail) {
 		xe_gt_err(gt, "G2H channel broken on read, avail=%d, len=%d, reset required\n",
 			  avail, len);
-		g2h->info.broken = true;
-
-		return -EPROTO;
+		goto corrupted;
 	}
 
 	head = (g2h->info.head + 1) % g2h->info.size;
@@ -1252,6 +1428,10 @@ static int g2h_read(struct xe_guc_ct *ct, u32 *msg, bool fast_path)
 			     action, len, g2h->info.head, tail);
 
 	return len;
+
+corrupted:
+	CT_DEAD(ct, &ct->ctbs.g2h, G2H_READ);
+	return -EPROTO;
 }
 
 static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
@@ -1278,9 +1458,11 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
 		xe_gt_warn(gt, "NOT_POSSIBLE");
 	}
 
-	if (ret)
+	if (ret) {
 		xe_gt_err(gt, "G2H action 0x%04x failed (%pe)\n",
 			  action, ERR_PTR(ret));
+		CT_DEAD(ct, NULL, FAST_G2H);
+	}
 }
 
 /**
@@ -1340,7 +1522,6 @@ static int dequeue_one_g2h(struct xe_guc_ct *ct)
 
 static void receive_g2h(struct xe_guc_ct *ct)
 {
-	struct xe_gt *gt = ct_to_gt(ct);
 	bool ongoing;
 	int ret;
 
@@ -1377,9 +1558,8 @@ static void receive_g2h(struct xe_guc_ct *ct)
 		mutex_unlock(&ct->lock);
 
 		if (unlikely(ret == -EPROTO || ret == -EOPNOTSUPP)) {
-			struct drm_printer p = xe_gt_info_printer(gt);
-
-			xe_guc_ct_print(ct, &p, false);
+			xe_gt_err(ct_to_gt(ct), "CT dequeue failed: %d\n", ret);
+			CT_DEAD(ct, NULL, G2H_RECV);
 			kick_reset(ct);
 		}
 	} while (ret == 1);
@@ -1407,9 +1587,8 @@ static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb,
 
 	snapshot->cmds = kmalloc_array(ctb->info.size, sizeof(u32),
 				       atomic ? GFP_ATOMIC : GFP_KERNEL);
-
 	if (!snapshot->cmds) {
-		drm_err(&xe->drm, "Skipping CTB commands snapshot. Only CTB info will be available.\n");
+		drm_err(&xe->drm, "Skipping CTB commands snapshot. Only CT info will be available.\n");
 		return;
 	}
 
@@ -1490,7 +1669,7 @@ struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct,
 			   atomic ? GFP_ATOMIC : GFP_KERNEL);
 
 	if (!snapshot) {
-		drm_err(&xe->drm, "Skipping CTB snapshot entirely.\n");
+		xe_gt_err(ct_to_gt(ct), "Skipping CTB snapshot entirely.\n");
 		return NULL;
 	}
 
@@ -1554,16 +1733,119 @@ void xe_guc_ct_snapshot_free(struct xe_guc_ct_snapshot *snapshot)
  * xe_guc_ct_print - GuC CT Print.
  * @ct: GuC CT.
  * @p: drm_printer where it will be printed out.
- * @atomic: Boolean to indicate if this is called from atomic context like
- * reset or CTB handler or from some regular path like debugfs.
  *
  * This function quickly capture a snapshot and immediately print it out.
  */
-void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p, bool atomic)
+void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p)
 {
 	struct xe_guc_ct_snapshot *snapshot;
 
-	snapshot = xe_guc_ct_snapshot_capture(ct, atomic);
+	snapshot = xe_guc_ct_snapshot_capture(ct, false);
 	xe_guc_ct_snapshot_print(snapshot, p);
 	xe_guc_ct_snapshot_free(snapshot);
 }
+
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reason_code)
+{
+	struct xe_guc_log_snapshot *snapshot_log;
+	struct xe_guc_ct_snapshot *snapshot_ct;
+	struct xe_guc *guc = ct_to_guc(ct);
+	unsigned long flags;
+	bool have_capture;
+
+	if (ctb)
+		ctb->info.broken = true;
+
+	/* Ignore further errors after the first dump until a reset */
+	if (ct->dead.reported)
+		return;
+
+	spin_lock_irqsave(&ct->dead.lock, flags);
+
+	/* And only capture one dump at a time */
+	have_capture = ct->dead.reason & (1 << CT_DEAD_STATE_CAPTURE);
+	ct->dead.reason |= (1 << reason_code) |
+			   (1 << CT_DEAD_STATE_CAPTURE);
+
+	spin_unlock_irqrestore(&ct->dead.lock, flags);
+
+	if (have_capture)
+		return;
+
+	snapshot_log = xe_guc_log_snapshot_capture(&guc->log, true);
+	snapshot_ct = xe_guc_ct_snapshot_capture((ct), true);
+
+	spin_lock_irqsave(&ct->dead.lock, flags);
+
+	if (ct->dead.snapshot_log || ct->dead.snapshot_ct) {
+		xe_gt_err(ct_to_gt(ct), "Got unexpected dead CT capture!\n");
+		xe_guc_log_snapshot_free(snapshot_log);
+		xe_guc_ct_snapshot_free(snapshot_ct);
+	} else {
+		ct->dead.snapshot_log = snapshot_log;
+		ct->dead.snapshot_ct = snapshot_ct;
+	}
+
+	spin_unlock_irqrestore(&ct->dead.lock, flags);
+
+	queue_work(system_unbound_wq, &(ct)->dead.worker);
+}
+
+static void ct_dead_print(struct xe_dead_ct *dead)
+{
+	struct xe_guc_ct *ct = container_of(dead, struct xe_guc_ct, dead);
+	struct xe_device *xe = ct_to_xe(ct);
+	struct xe_gt *gt = ct_to_gt(ct);
+	static int g_count;
+	struct drm_printer ip = xe_gt_info_printer(gt);
+	struct drm_printer lp = drm_line_printer(&ip, "Capture", ++g_count);
+
+	if (!dead->reason) {
+		xe_gt_err(gt, "CTB is dead for no reason!?\n");
+		return;
+	}
+
+	drm_printf(&lp, "CTB is dead - reason=0x%X\n", dead->reason);
+
+	/* Can't generate a genuine core dump at this point, so just do the good bits */
+	drm_puts(&lp, "**** Xe Device Coredump ****\n");
+	xe_device_snapshot_print(xe, &lp);
+
+	drm_printf(&lp, "**** GT #%d ****\n", gt->info.id);
+	drm_printf(&lp, "\tTile: %d\n", gt->tile->id);
+
+	drm_puts(&lp, "**** GuC Log ****\n");
+	xe_guc_log_snapshot_print(dead->snapshot_log, &lp);
+
+	drm_puts(&lp, "**** GuC CT ****\n");
+	xe_guc_ct_snapshot_print(dead->snapshot_ct, &lp);
+
+	drm_puts(&lp, "Done.\n");
+}
+
+static void ct_dead_worker_func(struct work_struct *w)
+{
+	struct xe_guc_ct *ct = container_of(w, struct xe_guc_ct, dead.worker);
+
+	if (!ct->dead.reported) {
+		ct->dead.reported = true;
+		ct_dead_print(&ct->dead);
+	}
+
+	spin_lock_irq(&ct->dead.lock);
+
+	xe_guc_log_snapshot_free(ct->dead.snapshot_log);
+	ct->dead.snapshot_log = NULL;
+	xe_guc_ct_snapshot_free(ct->dead.snapshot_ct);
+	ct->dead.snapshot_ct = NULL;
+
+	if (ct->dead.reason & (1 << CT_DEAD_STATE_REARM)) {
+		/* A reset has occurred so re-arm the error reporting */
+		ct->dead.reason = 0;
+		ct->dead.reported = false;
+	}
+
+	spin_unlock_irq(&ct->dead.lock);
+}
+#endif
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.h b/drivers/gpu/drm/xe/xe_guc_ct.h
index 190202fce2d0..293041bed7ed 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.h
+++ b/drivers/gpu/drm/xe/xe_guc_ct.h
@@ -21,7 +21,7 @@ xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct, bool atomic);
 void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot,
 			      struct drm_printer *p);
 void xe_guc_ct_snapshot_free(struct xe_guc_ct_snapshot *snapshot);
-void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p, bool atomic);
+void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p);
 
 static inline bool xe_guc_ct_enabled(struct xe_guc_ct *ct)
 {
diff --git a/drivers/gpu/drm/xe/xe_guc_ct_types.h b/drivers/gpu/drm/xe/xe_guc_ct_types.h
index 761cb9031298..85e127ec91d7 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct_types.h
+++ b/drivers/gpu/drm/xe/xe_guc_ct_types.h
@@ -86,6 +86,24 @@ enum xe_guc_ct_state {
 	XE_GUC_CT_STATE_ENABLED,
 };
 
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+/** struct xe_dead_ct - Information for debugging a dead CT */
+struct xe_dead_ct {
+	/** @lock: protects memory allocation/free operations, and @reason updates */
+	spinlock_t lock;
+	/** @reason: bit mask of CT_DEAD_* reason codes */
+	unsigned int reason;
+	/** @reported: for preventing multiple dumps per error sequence */
+	bool reported;
+	/** @worker: worker thread to get out of interrupt context before dumping */
+	struct work_struct worker;
+	/** @snapshot_ct: copy of CT state and CTB content at point of error */
+	struct xe_guc_ct_snapshot *snapshot_ct;
+	/** @snapshot_log: copy of GuC log at point of error */
+	struct xe_guc_log_snapshot *snapshot_log;
+};
+#endif
+
 /**
  * struct xe_guc_ct - GuC command transport (CT) layer
  *
@@ -128,6 +146,11 @@ struct xe_guc_ct {
 	u32 msg[GUC_CTB_MSG_MAX_LEN];
 	/** @fast_msg: Message buffer */
 	u32 fast_msg[GUC_CTB_MSG_MAX_LEN];
+
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+	/** @dead: information for debugging dead CTs */
+	struct xe_dead_ct dead;
+#endif
 };
 
 #endif
-- 
2.46.0



* [PATCH v9 09/11] drm/xe/guc: Dump entire CTB on errors
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (7 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 08/11] drm/xe/guc: Dead CT helper John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 10/11] drm/xe/guc: Add GuC log to devcoredump captures John.C.Harrison
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

The dump of the CT buffers only showed the unprocessed data, which is
generally not useful for explaining why a hang occurred because the
hang was probably caused by commands that had just been processed. So
save and dump the entire buffer, but in a more compact format. Also
zero fill it on allocation to avoid confusion over uninitialised data
in the dump.
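
As a rough illustration of the idea (not the driver code), the
following userspace sketch copies a whole ring buffer into a snapshot
regardless of where head and tail sit. The struct and function names
(ring, ring_snapshot, ring_snapshot_capture, ring_snapshot_free) are
hypothetical and exist only for this example:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a command ring; sizes are in 32-bit words. */
struct ring {
	const uint32_t *buf;
	size_t size_dw;
	uint32_t head, tail;
};

/* Snapshot of the head/tail state plus a copy of the entire buffer. */
struct ring_snapshot {
	uint32_t *copy;
	size_t size_dw;
	uint32_t head, tail;
};

struct ring_snapshot *ring_snapshot_capture(const struct ring *r)
{
	struct ring_snapshot *s;

	assert(r != NULL);

	s = calloc(1, sizeof(*s));
	if (!s)
		return NULL;

	s->head = r->head;
	s->tail = r->tail;
	s->size_dw = r->size_dw;

	/* Copy everything, not just head..tail, so processed words survive. */
	s->copy = malloc(r->size_dw * sizeof(uint32_t));
	if (s->copy)
		memcpy(s->copy, r->buf, r->size_dw * sizeof(uint32_t));

	return s;
}

void ring_snapshot_free(struct ring_snapshot *s)
{
	if (!s)
		return;

	free(s->copy);
	free(s);
}
```

With head equal to tail there is no unprocessed data at all, yet the
snapshot still holds every word that was recently consumed, which is
the part most likely to explain a hang.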

v2: Add kerneldoc - review feedback from Michal W.
v3: Fix kerneldoc.
v4: Use ascii85 instead of hexdump (review feedback from Matthew B).
v5: Dump the entire CTB object rather than separately dumping just the
H2G and G2H sections. That way it includes the full header info.
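
To give a feel for why ascii85 is more compact than a hexdump, here is
a minimal userspace sketch of encoding one 32-bit word in base 85. It
assumes the usual scheme of five printable characters per word starting
at '!', with 'z' as shorthand for an all-zero word. The function name
encode_word() is hypothetical and this is not the kernel helper itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of base-85 encoding of one 32-bit word.
 * 'out' must hold at least 6 bytes. A zero word is abbreviated to "z".
 */
const char *encode_word(uint32_t in, char *out)
{
	int i;

	assert(out != NULL);

	if (in == 0)
		return "z";

	out[5] = '\0';
	/* Fill the five base-85 digits from least to most significant. */
	for (i = 4; i >= 0; i--) {
		out[i] = (char)('!' + in % 85);
		in /= 85;
	}

	return out;
}
```

Under these assumptions a non-zero word costs five characters instead
of the eight hex digits (plus prefix and separator) of a 0x%08x dump,
and zero-filled regions shrink to one character per word.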

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_guc_ct.c       | 94 ++++++++++------------------
 drivers/gpu/drm/xe/xe_guc_ct.h       |  8 +--
 drivers/gpu/drm/xe/xe_guc_ct_types.h |  6 +-
 3 files changed, 41 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index 46320b94bceb..39abf3b3a043 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -17,6 +17,7 @@
 #include "abi/guc_actions_sriov_abi.h"
 #include "abi/guc_klvs_abi.h"
 #include "xe_bo.h"
+#include "xe_devcoredump.h"
 #include "xe_device.h"
 #include "xe_gt.h"
 #include "xe_gt_pagefault.h"
@@ -435,6 +436,7 @@ int xe_guc_ct_enable(struct xe_guc_ct *ct)
 
 	xe_gt_assert(gt, !xe_guc_ct_enabled(ct));
 
+	xe_map_memset(xe, &ct->bo->vmap, 0, 0, ct->bo->size);
 	guc_ct_ctb_h2g_init(xe, &ct->ctbs.h2g, &ct->bo->vmap);
 	guc_ct_ctb_g2h_init(xe, &ct->ctbs.g2h, &ct->bo->vmap);
 
@@ -1575,48 +1577,33 @@ static void g2h_worker_func(struct work_struct *w)
 	receive_g2h(ct);
 }
 
-static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb,
-				     struct guc_ctb_snapshot *snapshot,
-				     bool atomic)
+struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_alloc(struct xe_guc_ct *ct, bool atomic)
 {
-	u32 head, tail;
+	struct xe_guc_ct_snapshot *snapshot;
 
-	xe_map_memcpy_from(xe, &snapshot->desc, &ctb->desc, 0,
-			   sizeof(struct guc_ct_buffer_desc));
-	memcpy(&snapshot->info, &ctb->info, sizeof(struct guc_ctb_info));
+	snapshot = kzalloc(sizeof(*snapshot), atomic ? GFP_ATOMIC : GFP_KERNEL);
+	if (!snapshot)
+		return NULL;
 
-	snapshot->cmds = kmalloc_array(ctb->info.size, sizeof(u32),
-				       atomic ? GFP_ATOMIC : GFP_KERNEL);
-	if (!snapshot->cmds) {
-		drm_err(&xe->drm, "Skipping CTB commands snapshot. Only CT info will be available.\n");
-		return;
+	if (ct->bo) {
+		snapshot->ctb_size = ct->bo->size;
+		snapshot->ctb = kmalloc(snapshot->ctb_size, atomic ? GFP_ATOMIC : GFP_KERNEL);
 	}
 
-	head = snapshot->desc.head;
-	tail = snapshot->desc.tail;
-
-	if (head != tail) {
-		struct iosys_map map =
-			IOSYS_MAP_INIT_OFFSET(&ctb->cmds, head * sizeof(u32));
-
-		while (head != tail) {
-			snapshot->cmds[head] = xe_map_rd(xe, &map, 0, u32);
-			++head;
-			if (head == ctb->info.size) {
-				head = 0;
-				map = ctb->cmds;
-			} else {
-				iosys_map_incr(&map, sizeof(u32));
-			}
-		}
-	}
+	return snapshot;
+}
+
+static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb,
+				     struct guc_ctb_snapshot *snapshot)
+{
+	xe_map_memcpy_from(xe, &snapshot->desc, &ctb->desc, 0,
+			   sizeof(struct guc_ct_buffer_desc));
+	memcpy(&snapshot->info, &ctb->info, sizeof(struct guc_ctb_info));
 }
 
 static void guc_ctb_snapshot_print(struct guc_ctb_snapshot *snapshot,
 				   struct drm_printer *p)
 {
-	u32 head, tail;
-
 	drm_printf(p, "\tsize: %d\n", snapshot->info.size);
 	drm_printf(p, "\tresv_space: %d\n", snapshot->info.resv_space);
 	drm_printf(p, "\thead: %d\n", snapshot->info.head);
@@ -1626,25 +1613,6 @@ static void guc_ctb_snapshot_print(struct guc_ctb_snapshot *snapshot,
 	drm_printf(p, "\thead (memory): %d\n", snapshot->desc.head);
 	drm_printf(p, "\ttail (memory): %d\n", snapshot->desc.tail);
 	drm_printf(p, "\tstatus (memory): 0x%x\n", snapshot->desc.status);
-
-	if (!snapshot->cmds)
-		return;
-
-	head = snapshot->desc.head;
-	tail = snapshot->desc.tail;
-
-	while (head != tail) {
-		drm_printf(p, "\tcmd[%d]: 0x%08x\n", head,
-			   snapshot->cmds[head]);
-		++head;
-		if (head == snapshot->info.size)
-			head = 0;
-	}
-}
-
-static void guc_ctb_snapshot_free(struct guc_ctb_snapshot *snapshot)
-{
-	kfree(snapshot->cmds);
 }
 
 /**
@@ -1665,9 +1633,7 @@ struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct,
 	struct xe_device *xe = ct_to_xe(ct);
 	struct xe_guc_ct_snapshot *snapshot;
 
-	snapshot = kzalloc(sizeof(*snapshot),
-			   atomic ? GFP_ATOMIC : GFP_KERNEL);
-
+	snapshot = xe_guc_ct_snapshot_alloc(ct, atomic);
 	if (!snapshot) {
 		xe_gt_err(ct_to_gt(ct), "Skipping CTB snapshot entirely.\n");
 		return NULL;
@@ -1676,12 +1642,13 @@ struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct,
 	if (xe_guc_ct_enabled(ct) || ct->state == XE_GUC_CT_STATE_STOPPED) {
 		snapshot->ct_enabled = true;
 		snapshot->g2h_outstanding = READ_ONCE(ct->g2h_outstanding);
-		guc_ctb_snapshot_capture(xe, &ct->ctbs.h2g,
-					 &snapshot->h2g, atomic);
-		guc_ctb_snapshot_capture(xe, &ct->ctbs.g2h,
-					 &snapshot->g2h, atomic);
+		guc_ctb_snapshot_capture(xe, &ct->ctbs.h2g, &snapshot->h2g);
+		guc_ctb_snapshot_capture(xe, &ct->ctbs.g2h, &snapshot->g2h);
 	}
 
+	if (ct->bo && snapshot->ctb)
+		xe_map_memcpy_from(xe, snapshot->ctb, &ct->bo->vmap, 0, snapshot->ctb_size);
+
 	return snapshot;
 }
 
@@ -1704,9 +1671,15 @@ void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot,
 
 		drm_puts(p, "G2H CTB (all sizes in DW):\n");
 		guc_ctb_snapshot_print(&snapshot->g2h, p);
-
 		drm_printf(p, "\tg2h outstanding: %d\n",
 			   snapshot->g2h_outstanding);
+
+		if (snapshot->ctb) {
+			xe_print_blob_ascii85(p, "CTB data", snapshot->ctb, 0, snapshot->ctb_size);
+		} else {
+			drm_printf(p, "CTB snapshot missing!\n");
+			return;
+		}
 	} else {
 		drm_puts(p, "CT disabled\n");
 	}
@@ -1724,8 +1697,7 @@ void xe_guc_ct_snapshot_free(struct xe_guc_ct_snapshot *snapshot)
 	if (!snapshot)
 		return;
 
-	guc_ctb_snapshot_free(&snapshot->h2g);
-	guc_ctb_snapshot_free(&snapshot->g2h);
+	kfree(snapshot->ctb);
 	kfree(snapshot);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.h b/drivers/gpu/drm/xe/xe_guc_ct.h
index 293041bed7ed..338f0b75d29f 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.h
+++ b/drivers/gpu/drm/xe/xe_guc_ct.h
@@ -9,6 +9,7 @@
 #include "xe_guc_ct_types.h"
 
 struct drm_printer;
+struct xe_device;
 
 int xe_guc_ct_init(struct xe_guc_ct *ct);
 int xe_guc_ct_enable(struct xe_guc_ct *ct);
@@ -16,10 +17,9 @@ void xe_guc_ct_disable(struct xe_guc_ct *ct);
 void xe_guc_ct_stop(struct xe_guc_ct *ct);
 void xe_guc_ct_fast_path(struct xe_guc_ct *ct);
 
-struct xe_guc_ct_snapshot *
-xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct, bool atomic);
-void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot,
-			      struct drm_printer *p);
+struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_alloc(struct xe_guc_ct *ct, bool atomic);
+struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct, bool atomic);
+void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot, struct drm_printer *p);
 void xe_guc_ct_snapshot_free(struct xe_guc_ct_snapshot *snapshot);
 void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p);
 
diff --git a/drivers/gpu/drm/xe/xe_guc_ct_types.h b/drivers/gpu/drm/xe/xe_guc_ct_types.h
index 85e127ec91d7..8e1b9d981d61 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct_types.h
+++ b/drivers/gpu/drm/xe/xe_guc_ct_types.h
@@ -52,8 +52,6 @@ struct guc_ctb {
 struct guc_ctb_snapshot {
 	/** @desc: snapshot of the CTB descriptor */
 	struct guc_ct_buffer_desc desc;
-	/** @cmds: snapshot of the CTB commands */
-	u32 *cmds;
 	/** @info: snapshot of the CTB info */
 	struct guc_ctb_info info;
 };
@@ -70,6 +68,10 @@ struct xe_guc_ct_snapshot {
 	struct guc_ctb_snapshot g2h;
 	/** @h2g: H2G CTB snapshot */
 	struct guc_ctb_snapshot h2g;
+	/** @ctb_size: size of the snapshot of the CTB */
+	size_t ctb_size;
+	/** @ctb: snapshot of the entire CTB */
+	u32 *ctb;
 };
 
 /**
-- 
2.46.0



* [PATCH v9 10/11] drm/xe/guc: Add GuC log to devcoredump captures
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (8 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 09/11] drm/xe/guc: Dump entire CTB on errors John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:14 ` [PATCH v9 11/11] drm/xe/guc: Add a helper function for dumping GuC log to dmesg John.C.Harrison
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

Include the GuC log in devcoredump captures because it can be useful
when debugging certain types of bug.

v2: Fix kerneldoc
v3: Drop module parameter as now using more compact ascii85 encoding
rather than hexdump (although still not compressed) (review feedback
from Matthew B). Rebase onto recent refactoring of devcoredump code.
v4: Don't move the submission snapshot inside the GuC internals
structure 'cos it really doesn't belong there.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_devcoredump.c       | 16 ++++++++++++----
 drivers/gpu/drm/xe/xe_devcoredump_types.h | 10 +++++++---
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index 0884c49942fe..a2d816713489 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -18,8 +18,10 @@
 #include "xe_gt.h"
 #include "xe_gt_printk.h"
 #include "xe_guc_ct.h"
+#include "xe_guc_log.h"
 #include "xe_guc_submit.h"
 #include "xe_hw_engine.h"
+#include "xe_module.h"
 #include "xe_sched_job.h"
 #include "xe_vm.h"
 
@@ -100,8 +102,10 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
 	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
 	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
 
+	drm_puts(&p, "\n**** GuC Log ****\n");
+	xe_guc_log_snapshot_print(ss->guc.log, &p);
 	drm_puts(&p, "\n**** GuC CT ****\n");
-	xe_guc_ct_snapshot_print(ss->ct, &p);
+	xe_guc_ct_snapshot_print(ss->guc.ct, &p);
 
 	drm_puts(&p, "\n**** Contexts ****\n");
 	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
@@ -124,8 +128,11 @@ static void xe_devcoredump_snapshot_free(struct xe_devcoredump_snapshot *ss)
 {
 	int i;
 
-	xe_guc_ct_snapshot_free(ss->ct);
-	ss->ct = NULL;
+	xe_guc_log_snapshot_free(ss->guc.log);
+	ss->guc.log = NULL;
+
+	xe_guc_ct_snapshot_free(ss->guc.ct);
+	ss->guc.ct = NULL;
 
 	xe_guc_exec_queue_snapshot_free(ss->ge);
 	ss->ge = NULL;
@@ -253,7 +260,8 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
 	if (xe_force_wake_get(gt_to_fw(q->gt), XE_FORCEWAKE_ALL))
 		xe_gt_info(ss->gt, "failed to get forcewake for coredump capture\n");
 
-	ss->ct = xe_guc_ct_snapshot_capture(&guc->ct, true);
+	ss->guc.log = xe_guc_log_snapshot_capture(&guc->log, true);
+	ss->guc.ct = xe_guc_ct_snapshot_capture(&guc->ct, true);
 	ss->ge = xe_guc_exec_queue_snapshot_capture(q);
 	ss->job = xe_sched_job_snapshot_capture(job);
 	ss->vm = xe_vm_snapshot_capture(q->vm);
diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
index 3cc2f095fdfb..06ac75ce63dd 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
+++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
@@ -34,9 +34,13 @@ struct xe_devcoredump_snapshot {
 	/** @work: Workqueue for deferred capture outside of signaling context */
 	struct work_struct work;
 
-	/* GuC snapshots */
-	/** @ct: GuC CT snapshot */
-	struct xe_guc_ct_snapshot *ct;
+	/** @guc: GuC snapshots */
+	struct {
+		/** @guc.ct: GuC CT snapshot */
+		struct xe_guc_ct_snapshot *ct;
+		/** @guc.log: GuC log snapshot */
+		struct xe_guc_log_snapshot *log;
+	} guc;
 
 	/** @ge: GuC Submission Engine snapshot */
 	struct xe_guc_submit_exec_queue_snapshot *ge;
-- 
2.46.0



* [PATCH v9 11/11] drm/xe/guc: Add a helper function for dumping GuC log to dmesg
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (9 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 10/11] drm/xe/guc: Add GuC log to devcoredump captures John.C.Harrison
@ 2024-10-02 21:14 ` John.C.Harrison
  2024-10-02 21:20 ` ✓ CI.Patch_applied: success for drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5) Patchwork
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: John.C.Harrison @ 2024-10-02 21:14 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

Create a helper function that can be used to dump the GuC log to dmesg
in a manner that is reliable for extraction and decode. The intention
is that calls to this can be added by developers when debugging
specific issues that require a GuC log but do not allow easy capture
of the log - e.g. failures in selftests and failures that lead to
kernel hangs.
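
The helper tags every line with a prefix, a per-dump series number and
a rolling line count so that a decoder can spot dropped or interleaved
lines. The userspace sketch below models that idea only. The
"prefix series.counter: text" layout and the names line_printer,
line_printer_emit() and line_check() are assumptions made for this
example and are not taken from the DRM code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a printer that numbers each line it emits. */
struct line_printer {
	const char *prefix;
	unsigned int series;
	unsigned int counter;
};

/* Format one tagged line into 'out', bumping the rolling counter. */
int line_printer_emit(struct line_printer *lp, char *out, size_t len,
		      const char *msg)
{
	assert(lp != NULL && out != NULL);

	return snprintf(out, len, "%s %u.%u: %s", lp->prefix, lp->series,
			++lp->counter, msg);
}

/* Return 1 if 'line' carries the expected series and counter, else 0. */
int line_check(const char *line, const char *prefix, unsigned int series,
	       unsigned int expected)
{
	size_t plen = strlen(prefix);
	unsigned int s, c;

	if (strncmp(line, prefix, plen) != 0)
		return 0;
	if (sscanf(line + plen, " %u.%u:", &s, &c) != 2)
		return 0;

	return s == series && c == expected;
}
```

A decoder could walk the captured dmesg lines with something like
line_check() and flag a gap as soon as the counter skips, or a
concurrent dump as soon as the series number changes.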

Also note that this is really a temporary stop-gap. The aim is to
allow on-demand creation and dumping of devcoredump captures (which
include the GuC log and much more). Currently this is not possible as
much of the devcoredump code requires a 'struct xe_sched_job' and
those are not available at many places that might want to do the dump.

v2: Add kerneldoc - review feedback from Michal W.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_guc_log.c | 18 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_guc_log.h |  1 +
 2 files changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_guc_log.c b/drivers/gpu/drm/xe/xe_guc_log.c
index 24564624e91e..d7be69f20af4 100644
--- a/drivers/gpu/drm/xe/xe_guc_log.c
+++ b/drivers/gpu/drm/xe/xe_guc_log.c
@@ -214,6 +214,24 @@ void xe_guc_log_snapshot_print(struct xe_guc_log_snapshot *snapshot, struct drm_
 	}
 }
 
+/**
+ * xe_guc_log_print_dmesg - dump a copy of the GuC log to dmesg
+ * @log: GuC log structure
+ */
+void xe_guc_log_print_dmesg(struct xe_guc_log *log)
+{
+	struct xe_gt *gt = log_to_gt(log);
+	static int g_count;
+	struct drm_printer ip = xe_gt_info_printer(gt);
+	struct drm_printer lp = drm_line_printer(&ip, "Capture", ++g_count);
+
+	drm_printf(&lp, "Dumping GuC log for %ps...\n", __builtin_return_address(0));
+
+	xe_guc_log_print(log, &lp);
+
+	drm_printf(&lp, "Done.\n");
+}
+
 /**
  * xe_guc_log_print - dump a copy of the GuC log to some useful location
  * @log: GuC log structure
diff --git a/drivers/gpu/drm/xe/xe_guc_log.h b/drivers/gpu/drm/xe/xe_guc_log.h
index 949d2c98343d..1fb2fae1f4e1 100644
--- a/drivers/gpu/drm/xe/xe_guc_log.h
+++ b/drivers/gpu/drm/xe/xe_guc_log.h
@@ -39,6 +39,7 @@ struct xe_device;
 
 int xe_guc_log_init(struct xe_guc_log *log);
 void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p);
+void xe_guc_log_print_dmesg(struct xe_guc_log *log);
 struct xe_guc_log_snapshot *xe_guc_log_snapshot_capture(struct xe_guc_log *log, bool atomic);
 void xe_guc_log_snapshot_print(struct xe_guc_log_snapshot *snapshot, struct drm_printer *p);
 void xe_guc_log_snapshot_free(struct xe_guc_log_snapshot *snapshot);
-- 
2.46.0



* ✓ CI.Patch_applied: success for drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (10 preceding siblings ...)
  2024-10-02 21:14 ` [PATCH v9 11/11] drm/xe/guc: Add a helper function for dumping GuC log to dmesg John.C.Harrison
@ 2024-10-02 21:20 ` Patchwork
  2024-10-02 21:21 ` ✗ CI.checkpatch: warning " Patchwork
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2024-10-02 21:20 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-xe

== Series Details ==

Series: drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
URL   : https://patchwork.freedesktop.org/series/137985/
State : success

== Summary ==

=== Applying kernel patches on branch 'drm-tip' with base: ===
Base commit: 789d5631453c drm-tip: 2024y-10m-02d-15h-36m-04s UTC integration manifest
=== git am output follows ===
Applying: drm/xe/guc: Remove spurious line feed in debug print
Applying: drm/xe/devcoredump: Use drm_puts and already cached local variables
Applying: drm/xe/devcoredump: Improve section headings and add tile info
Applying: drm/xe/devcoredump: Add ASCII85 dump helper function
Applying: drm/xe/guc: Copy GuC log prior to dumping
Applying: drm/xe/guc: Use a two stage dump for GuC logs and add more info
Applying: drm/print: Introduce drm_line_printer
Applying: drm/xe/guc: Dead CT helper
Applying: drm/xe/guc: Dump entire CTB on errors
Applying: drm/xe/guc: Add GuC log to devcoredump captures
Applying: drm/xe/guc: Add a helper function for dumping GuC log to dmesg




* ✗ CI.checkpatch: warning for drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (11 preceding siblings ...)
  2024-10-02 21:20 ` ✓ CI.Patch_applied: success for drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5) Patchwork
@ 2024-10-02 21:21 ` Patchwork
  2024-10-02 21:22 ` ✓ CI.KUnit: success " Patchwork
  2024-10-02 21:27 ` ✗ CI.Build: failure " Patchwork
  14 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2024-10-02 21:21 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-xe

== Series Details ==

Series: drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
URL   : https://patchwork.freedesktop.org/series/137985/
State : warning

== Summary ==

+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
30ab6715fc09baee6cc14cb3c89ad8858688d474
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit b457795128f10d0ae6c41373c3e6469258a7869b
Author: John Harrison <John.C.Harrison@Intel.com>
Date:   Wed Oct 2 14:14:22 2024 -0700

    drm/xe/guc: Add a helper function for dumping GuC log to dmesg
    
    Create a helper function that can be used to dump the GuC log to dmesg
    in a manner that is reliable for extraction and decode. The intention
    is that calls to this can be added by developers when debugging
    specific issues that require a GuC log but do not allow easy capture
    of the log - e.g. failures in selftests and failures that lead to
    kernel hangs.
    
    Also note that this is really a temporary stop-gap. The aim is to
    allow on demand creation and dumping of devcoredump captures (which
    includes the GuC log and much more). Currently this is not possible as
    much of the devcoredump code requires a 'struct xe_sched_job' and
    those are not available at many places that might want to do the dump.
    
    v2: Add kerneldoc - review feedback from Michal W.
    
    Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
    Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
+ /mt/dim checkpatch 789d5631453c3edad1988cd47db1643555e52ac9 drm-intel
2b66b94c6f47 drm/xe/guc: Remove spurious line feed in debug print
076f9d38216e drm/xe/devcoredump: Use drm_puts and already cached local variables
d38d194ac57e drm/xe/devcoredump: Improve section headings and add tile info
58e588bab814 drm/xe/devcoredump: Add ASCII85 dump helper function
059aee99c9d8 drm/xe/guc: Copy GuC log prior to dumping
56d1762b790a drm/xe/guc: Use a two stage dump for GuC logs and add more info
1158d15e6718 drm/print: Introduce drm_line_printer
4fdede82017a drm/xe/guc: Dead CT helper
-:102: WARNING:MACRO_ARG_UNUSED: Argument 'ct' is not used in function-like macro
#102: FILE: drivers/gpu/drm/xe/xe_guc_ct.c:62:
+#define CT_DEAD(ct, ctb, reason)			\
+	do {						\
+		struct guc_ctb *_ctb = (ctb);		\
+		if (_ctb)				\
+			_ctb->info.broken = true;	\
+	} while (0)

-:102: WARNING:MACRO_ARG_UNUSED: Argument 'reason' is not used in function-like macro
#102: FILE: drivers/gpu/drm/xe/xe_guc_ct.c:62:
+#define CT_DEAD(ct, ctb, reason)			\
+	do {						\
+		struct guc_ctb *_ctb = (ctb);		\
+		if (_ctb)				\
+			_ctb->info.broken = true;	\
+	} while (0)

total: 0 errors, 2 warnings, 0 checks, 572 lines checked
5bc232d0885d drm/xe/guc: Dump entire CTB on errors
bec4246c9fba drm/xe/guc: Add GuC log to devcoredump captures
b457795128f1 drm/xe/guc: Add a helper function for dumping GuC log to dmesg




* ✓ CI.KUnit: success for drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (12 preceding siblings ...)
  2024-10-02 21:21 ` ✗ CI.checkpatch: warning " Patchwork
@ 2024-10-02 21:22 ` Patchwork
  2024-10-02 21:27 ` ✗ CI.Build: failure " Patchwork
  14 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2024-10-02 21:22 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-xe

== Series Details ==

Series: drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
URL   : https://patchwork.freedesktop.org/series/137985/
State : success

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[21:21:21] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[21:21:25] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json ARCH=um O=.kunit --jobs=48
../drivers/gpu/drm/xe/xe_guc_ct.c: In function ‘g2h_read’:
../drivers/gpu/drm/xe/xe_guc_ct.c:1334:21: warning: unused variable ‘desc_head’ [-Wunused-variable]
 1334 |                 u32 desc_head = desc_read(xe, g2h, head);
      |                     ^~~~~~~~~
../lib/iomap.c:156:5: warning: no previous prototype for ‘ioread64_lo_hi’ [-Wmissing-prototypes]
  156 | u64 ioread64_lo_hi(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~
../lib/iomap.c:163:5: warning: no previous prototype for ‘ioread64_hi_lo’ [-Wmissing-prototypes]
  163 | u64 ioread64_hi_lo(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~
../lib/iomap.c:170:5: warning: no previous prototype for ‘ioread64be_lo_hi’ [-Wmissing-prototypes]
  170 | u64 ioread64be_lo_hi(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~~~
../lib/iomap.c:178:5: warning: no previous prototype for ‘ioread64be_hi_lo’ [-Wmissing-prototypes]
  178 | u64 ioread64be_hi_lo(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~~~
../lib/iomap.c:264:6: warning: no previous prototype for ‘iowrite64_lo_hi’ [-Wmissing-prototypes]
  264 | void iowrite64_lo_hi(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~
../lib/iomap.c:272:6: warning: no previous prototype for ‘iowrite64_hi_lo’ [-Wmissing-prototypes]
  272 | void iowrite64_hi_lo(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~
../lib/iomap.c:280:6: warning: no previous prototype for ‘iowrite64be_lo_hi’ [-Wmissing-prototypes]
  280 | void iowrite64be_lo_hi(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~~~
../lib/iomap.c:288:6: warning: no previous prototype for ‘iowrite64be_hi_lo’ [-Wmissing-prototypes]
  288 | void iowrite64be_hi_lo(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~~~

[21:21:53] Starting KUnit Kernel (1/1)...
[21:21:53] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[21:21:54] =================== guc_dbm (7 subtests) ===================
[21:21:54] [PASSED] test_empty
[21:21:54] [PASSED] test_default
[21:21:54] ======================== test_size  ========================
[21:21:54] [PASSED] 4
[21:21:54] [PASSED] 8
[21:21:54] [PASSED] 32
[21:21:54] [PASSED] 256
[21:21:54] ==================== [PASSED] test_size ====================
[21:21:54] ======================= test_reuse  ========================
[21:21:54] [PASSED] 4
[21:21:54] [PASSED] 8
[21:21:54] [PASSED] 32
[21:21:54] [PASSED] 256
[21:21:54] =================== [PASSED] test_reuse ====================
[21:21:54] =================== test_range_overlap  ====================
[21:21:54] [PASSED] 4
[21:21:54] [PASSED] 8
[21:21:54] [PASSED] 32
[21:21:54] [PASSED] 256
[21:21:54] =============== [PASSED] test_range_overlap ================
[21:21:54] =================== test_range_compact  ====================
[21:21:54] [PASSED] 4
[21:21:54] [PASSED] 8
[21:21:54] [PASSED] 32
[21:21:54] [PASSED] 256
[21:21:54] =============== [PASSED] test_range_compact ================
[21:21:54] ==================== test_range_spare  =====================
[21:21:54] [PASSED] 4
[21:21:54] [PASSED] 8
[21:21:54] [PASSED] 32
[21:21:54] [PASSED] 256
[21:21:54] ================ [PASSED] test_range_spare =================
[21:21:54] ===================== [PASSED] guc_dbm =====================
[21:21:54] =================== guc_idm (6 subtests) ===================
[21:21:54] [PASSED] bad_init
[21:21:54] [PASSED] no_init
[21:21:54] [PASSED] init_fini
[21:21:54] [PASSED] check_used
[21:21:54] [PASSED] check_quota
[21:21:54] [PASSED] check_all
[21:21:54] ===================== [PASSED] guc_idm =====================
[21:21:54] ================== no_relay (3 subtests) ===================
[21:21:54] [PASSED] xe_drops_guc2pf_if_not_ready
[21:21:54] [PASSED] xe_drops_guc2vf_if_not_ready
[21:21:54] [PASSED] xe_rejects_send_if_not_ready
[21:21:54] ==================== [PASSED] no_relay =====================
[21:21:54] ================== pf_relay (14 subtests) ==================
[21:21:54] [PASSED] pf_rejects_guc2pf_too_short
[21:21:54] [PASSED] pf_rejects_guc2pf_too_long
[21:21:54] [PASSED] pf_rejects_guc2pf_no_payload
[21:21:54] [PASSED] pf_fails_no_payload
[21:21:54] [PASSED] pf_fails_bad_origin
[21:21:54] [PASSED] pf_fails_bad_type
[21:21:54] [PASSED] pf_txn_reports_error
[21:21:54] [PASSED] pf_txn_sends_pf2guc
[21:21:54] [PASSED] pf_sends_pf2guc
[21:21:54] [SKIPPED] pf_loopback_nop
[21:21:54] [SKIPPED] pf_loopback_echo
[21:21:54] [SKIPPED] pf_loopback_fail
[21:21:54] [SKIPPED] pf_loopback_busy
[21:21:54] [SKIPPED] pf_loopback_retry
[21:21:54] ==================== [PASSED] pf_relay =====================
[21:21:54] ================== vf_relay (3 subtests) ===================
[21:21:54] [PASSED] vf_rejects_guc2vf_too_short
[21:21:54] [PASSED] vf_rejects_guc2vf_too_long
[21:21:54] [PASSED] vf_rejects_guc2vf_no_payload
[21:21:54] ==================== [PASSED] vf_relay =====================
[21:21:54] ================= pf_service (11 subtests) =================
[21:21:54] [PASSED] pf_negotiate_any
[21:21:54] [PASSED] pf_negotiate_base_match
[21:21:54] [PASSED] pf_negotiate_base_newer
[21:21:54] [PASSED] pf_negotiate_base_next
[21:21:54] [SKIPPED] pf_negotiate_base_older
[21:21:54] [PASSED] pf_negotiate_base_prev
[21:21:54] [PASSED] pf_negotiate_latest_match
[21:21:54] [PASSED] pf_negotiate_latest_newer
[21:21:54] [PASSED] pf_negotiate_latest_next
[21:21:54] [SKIPPED] pf_negotiate_latest_older
[21:21:54] [SKIPPED] pf_negotiate_latest_prev
[21:21:54] =================== [PASSED] pf_service ====================
[21:21:54] ===================== lmtt (1 subtest) =====================
[21:21:54] ======================== test_ops  =========================
[21:21:54] [PASSED] 2-level
[21:21:54] [PASSED] multi-level
[21:21:54] ==================== [PASSED] test_ops =====================
[21:21:54] ====================== [PASSED] lmtt =======================
[21:21:54] =================== xe_mocs (2 subtests) ===================
[21:21:54] ================ xe_live_mocs_kernel_kunit  ================
[21:21:54] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[21:21:54] ================ xe_live_mocs_reset_kunit  =================
[21:21:54] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[21:21:54] ==================== [SKIPPED] xe_mocs =====================
[21:21:54] ================= xe_migrate (2 subtests) ==================
[21:21:54] ================= xe_migrate_sanity_kunit  =================
[21:21:54] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[21:21:54] ================== xe_validate_ccs_kunit  ==================
[21:21:54] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[21:21:54] =================== [SKIPPED] xe_migrate ===================
[21:21:54] ================== xe_dma_buf (1 subtest) ==================
[21:21:54] ==================== xe_dma_buf_kunit  =====================
[21:21:54] ================ [SKIPPED] xe_dma_buf_kunit ================
[21:21:54] =================== [SKIPPED] xe_dma_buf ===================
[21:21:54] ==================== xe_bo (3 subtests) ====================
[21:21:54] ================== xe_ccs_migrate_kunit  ===================
[21:21:54] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[21:21:54] ==================== xe_bo_evict_kunit  ====================
[21:21:54] =============== [SKIPPED] xe_bo_evict_kunit ================
[21:21:54] =================== xe_bo_shrink_kunit  ====================
[21:21:54] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[21:21:54] ===================== [SKIPPED] xe_bo ======================
[21:21:54] ==================== args (11 subtests) ====================
[21:21:54] [PASSED] count_args_test
[21:21:54] [PASSED] call_args_example
[21:21:54] [PASSED] call_args_test
stty: 'standard input': Inappropriate ioctl for device
[21:21:54] [PASSED] drop_first_arg_example
[21:21:54] [PASSED] drop_first_arg_test
[21:21:54] [PASSED] first_arg_example
[21:21:54] [PASSED] first_arg_test
[21:21:54] [PASSED] last_arg_example
[21:21:54] [PASSED] last_arg_test
[21:21:54] [PASSED] pick_arg_example
[21:21:54] [PASSED] sep_comma_example
[21:21:54] ====================== [PASSED] args =======================
[21:21:54] =================== xe_pci (2 subtests) ====================
[21:21:54] [PASSED] xe_gmdid_graphics_ip
[21:21:54] [PASSED] xe_gmdid_media_ip
[21:21:54] ===================== [PASSED] xe_pci ======================
[21:21:54] =================== xe_rtp (2 subtests) ====================
[21:21:54] =============== xe_rtp_process_to_sr_tests  ================
[21:21:54] [PASSED] coalesce-same-reg
[21:21:54] [PASSED] no-match-no-add
[21:21:54] [PASSED] match-or
[21:21:54] [PASSED] match-or-xfail
[21:21:54] [PASSED] no-match-no-add-multiple-rules
[21:21:54] [PASSED] two-regs-two-entries
[21:21:54] [PASSED] clr-one-set-other
[21:21:54] [PASSED] set-field
[21:21:54] [PASSED] conflict-duplicate
[21:21:54] [PASSED] conflict-not-disjoint
[21:21:54] [PASSED] conflict-reg-type
[21:21:54] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[21:21:54] ================== xe_rtp_process_tests  ===================
[21:21:54] [PASSED] active1
[21:21:54] [PASSED] active2
[21:21:54] [PASSED] active-inactive
[21:21:54] [PASSED] inactive-active
[21:21:54] [PASSED] inactive-1st_or_active-inactive
[21:21:54] [PASSED] inactive-2nd_or_active-inactive
[21:21:54] [PASSED] inactive-last_or_active-inactive
[21:21:54] [PASSED] inactive-no_or_active-inactive
[21:21:54] ============== [PASSED] xe_rtp_process_tests ===============
[21:21:54] ===================== [PASSED] xe_rtp ======================
[21:21:54] ==================== xe_wa (1 subtest) =====================
[21:21:54] ======================== xe_wa_gt  =========================
[21:21:54] [PASSED] TIGERLAKE (B0)
[21:21:54] [PASSED] DG1 (A0)
[21:21:54] [PASSED] DG1 (B0)
[21:21:54] [PASSED] ALDERLAKE_S (A0)
[21:21:54] [PASSED] ALDERLAKE_S (B0)
[21:21:54] [PASSED] ALDERLAKE_S (C0)
[21:21:54] [PASSED] ALDERLAKE_S (D0)
[21:21:54] [PASSED] ALDERLAKE_P (A0)
[21:21:54] [PASSED] ALDERLAKE_P (B0)
[21:21:54] [PASSED] ALDERLAKE_P (C0)
[21:21:54] [PASSED] ALDERLAKE_S_RPLS (D0)
[21:21:54] [PASSED] ALDERLAKE_P_RPLU (E0)
[21:21:54] [PASSED] DG2_G10 (C0)
[21:21:54] [PASSED] DG2_G11 (B1)
[21:21:54] [PASSED] DG2_G12 (A1)
[21:21:54] [PASSED] METEORLAKE (g:A0, m:A0)
[21:21:54] [PASSED] METEORLAKE (g:A0, m:A0)
[21:21:54] [PASSED] METEORLAKE (g:A0, m:A0)
[21:21:54] [PASSED] LUNARLAKE (g:A0, m:A0)
[21:21:54] [PASSED] LUNARLAKE (g:B0, m:A0)
[21:21:54] [PASSED] BATTLEMAGE (g:A0, m:A1)
[21:21:54] ==================== [PASSED] xe_wa_gt =====================
[21:21:54] ====================== [PASSED] xe_wa ======================
[21:21:54] ============================================================
[21:21:54] Testing complete. Ran 122 tests: passed: 106, skipped: 16
[21:21:54] Elapsed time: 32.755s total, 4.426s configuring, 28.062s building, 0.217s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[21:21:54] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[21:21:55] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json ARCH=um O=.kunit --jobs=48
../lib/iomap.c:156:5: warning: no previous prototype for ‘ioread64_lo_hi’ [-Wmissing-prototypes]
  156 | u64 ioread64_lo_hi(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~
../lib/iomap.c:163:5: warning: no previous prototype for ‘ioread64_hi_lo’ [-Wmissing-prototypes]
  163 | u64 ioread64_hi_lo(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~
../lib/iomap.c:170:5: warning: no previous prototype for ‘ioread64be_lo_hi’ [-Wmissing-prototypes]
  170 | u64 ioread64be_lo_hi(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~~~
../lib/iomap.c:178:5: warning: no previous prototype for ‘ioread64be_hi_lo’ [-Wmissing-prototypes]
  178 | u64 ioread64be_hi_lo(const void __iomem *addr)
      |     ^~~~~~~~~~~~~~~~
../lib/iomap.c:264:6: warning: no previous prototype for ‘iowrite64_lo_hi’ [-Wmissing-prototypes]
  264 | void iowrite64_lo_hi(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~
../lib/iomap.c:272:6: warning: no previous prototype for ‘iowrite64_hi_lo’ [-Wmissing-prototypes]
  272 | void iowrite64_hi_lo(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~
../lib/iomap.c:280:6: warning: no previous prototype for ‘iowrite64be_lo_hi’ [-Wmissing-prototypes]
  280 | void iowrite64be_lo_hi(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~~~
../lib/iomap.c:288:6: warning: no previous prototype for ‘iowrite64be_hi_lo’ [-Wmissing-prototypes]
  288 | void iowrite64be_hi_lo(u64 val, void __iomem *addr)
      |      ^~~~~~~~~~~~~~~~~

[21:22:18] Starting KUnit Kernel (1/1)...
[21:22:18] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[21:22:19] ============ drm_test_pick_cmdline (2 subtests) ============
[21:22:19] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[21:22:19] =============== drm_test_pick_cmdline_named  ===============
[21:22:19] [PASSED] NTSC
[21:22:19] [PASSED] NTSC-J
[21:22:19] [PASSED] PAL
[21:22:19] [PASSED] PAL-M
[21:22:19] =========== [PASSED] drm_test_pick_cmdline_named ===========
[21:22:19] ============== [PASSED] drm_test_pick_cmdline ==============
[21:22:19] ================== drm_buddy (7 subtests) ==================
[21:22:19] [PASSED] drm_test_buddy_alloc_limit
[21:22:19] [PASSED] drm_test_buddy_alloc_optimistic
[21:22:19] [PASSED] drm_test_buddy_alloc_pessimistic
[21:22:19] [PASSED] drm_test_buddy_alloc_pathological
[21:22:19] [PASSED] drm_test_buddy_alloc_contiguous
[21:22:19] [PASSED] drm_test_buddy_alloc_clear
[21:22:19] [PASSED] drm_test_buddy_alloc_range_bias
[21:22:19] ==================== [PASSED] drm_buddy ====================
[21:22:19] ============= drm_cmdline_parser (40 subtests) =============
[21:22:19] [PASSED] drm_test_cmdline_force_d_only
[21:22:19] [PASSED] drm_test_cmdline_force_D_only_dvi
[21:22:19] [PASSED] drm_test_cmdline_force_D_only_hdmi
[21:22:19] [PASSED] drm_test_cmdline_force_D_only_not_digital
[21:22:19] [PASSED] drm_test_cmdline_force_e_only
[21:22:19] [PASSED] drm_test_cmdline_res
[21:22:19] [PASSED] drm_test_cmdline_res_vesa
[21:22:19] [PASSED] drm_test_cmdline_res_vesa_rblank
[21:22:19] [PASSED] drm_test_cmdline_res_rblank
[21:22:19] [PASSED] drm_test_cmdline_res_bpp
[21:22:19] [PASSED] drm_test_cmdline_res_refresh
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[21:22:19] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[21:22:19] [PASSED] drm_test_cmdline_res_margins_force_on
[21:22:19] [PASSED] drm_test_cmdline_res_vesa_margins
[21:22:19] [PASSED] drm_test_cmdline_name
[21:22:19] [PASSED] drm_test_cmdline_name_bpp
[21:22:19] [PASSED] drm_test_cmdline_name_option
[21:22:19] [PASSED] drm_test_cmdline_name_bpp_option
[21:22:19] [PASSED] drm_test_cmdline_rotate_0
[21:22:19] [PASSED] drm_test_cmdline_rotate_90
[21:22:19] [PASSED] drm_test_cmdline_rotate_180
[21:22:19] [PASSED] drm_test_cmdline_rotate_270
[21:22:19] [PASSED] drm_test_cmdline_hmirror
[21:22:19] [PASSED] drm_test_cmdline_vmirror
[21:22:19] [PASSED] drm_test_cmdline_margin_options
[21:22:19] [PASSED] drm_test_cmdline_multiple_options
[21:22:19] [PASSED] drm_test_cmdline_bpp_extra_and_option
[21:22:19] [PASSED] drm_test_cmdline_extra_and_option
[21:22:19] [PASSED] drm_test_cmdline_freestanding_options
[21:22:19] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[21:22:19] [PASSED] drm_test_cmdline_panel_orientation
[21:22:19] ================ drm_test_cmdline_invalid  =================
[21:22:19] [PASSED] margin_only
[21:22:19] [PASSED] interlace_only
[21:22:19] [PASSED] res_missing_x
[21:22:19] [PASSED] res_missing_y
[21:22:19] [PASSED] res_bad_y
[21:22:19] [PASSED] res_missing_y_bpp
[21:22:19] [PASSED] res_bad_bpp
[21:22:19] [PASSED] res_bad_refresh
[21:22:19] [PASSED] res_bpp_refresh_force_on_off
[21:22:19] [PASSED] res_invalid_mode
[21:22:19] [PASSED] res_bpp_wrong_place_mode
[21:22:19] [PASSED] name_bpp_refresh
[21:22:19] [PASSED] name_refresh
[21:22:19] [PASSED] name_refresh_wrong_mode
[21:22:19] [PASSED] name_refresh_invalid_mode
[21:22:19] [PASSED] rotate_multiple
[21:22:19] [PASSED] rotate_invalid_val
[21:22:19] [PASSED] rotate_truncated
[21:22:19] [PASSED] invalid_option
[21:22:19] [PASSED] invalid_tv_option
[21:22:19] [PASSED] truncated_tv_option
[21:22:19] ============ [PASSED] drm_test_cmdline_invalid =============
[21:22:19] =============== drm_test_cmdline_tv_options  ===============
[21:22:19] [PASSED] NTSC
[21:22:19] [PASSED] NTSC_443
[21:22:19] [PASSED] NTSC_J
[21:22:19] [PASSED] PAL
[21:22:19] [PASSED] PAL_M
[21:22:19] [PASSED] PAL_N
[21:22:19] [PASSED] SECAM
[21:22:19] [PASSED] MONO_525
[21:22:19] [PASSED] MONO_625
[21:22:19] =========== [PASSED] drm_test_cmdline_tv_options ===========
[21:22:19] =============== [PASSED] drm_cmdline_parser ================
[21:22:19] ========== drmm_connector_hdmi_init (19 subtests) ==========
[21:22:19] [PASSED] drm_test_connector_hdmi_init_valid
[21:22:19] [PASSED] drm_test_connector_hdmi_init_bpc_8
[21:22:19] [PASSED] drm_test_connector_hdmi_init_bpc_10
[21:22:19] [PASSED] drm_test_connector_hdmi_init_bpc_12
[21:22:19] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[21:22:19] [PASSED] drm_test_connector_hdmi_init_bpc_null
[21:22:19] [PASSED] drm_test_connector_hdmi_init_formats_empty
[21:22:19] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[21:22:19] [PASSED] drm_test_connector_hdmi_init_null_ddc
[21:22:19] [PASSED] drm_test_connector_hdmi_init_null_product
[21:22:19] [PASSED] drm_test_connector_hdmi_init_null_vendor
[21:22:19] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[21:22:19] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[21:22:19] [PASSED] drm_test_connector_hdmi_init_product_valid
[21:22:19] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[21:22:19] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[21:22:19] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[21:22:19] ========= drm_test_connector_hdmi_init_type_valid  =========
[21:22:19] [PASSED] HDMI-A
[21:22:19] [PASSED] HDMI-B
[21:22:19] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[21:22:19] ======== drm_test_connector_hdmi_init_type_invalid  ========
[21:22:19] [PASSED] Unknown
[21:22:19] [PASSED] VGA
[21:22:19] [PASSED] DVI-I
[21:22:19] [PASSED] DVI-D
[21:22:19] [PASSED] DVI-A
[21:22:19] [PASSED] Composite
[21:22:19] [PASSED] SVIDEO
[21:22:19] [PASSED] LVDS
[21:22:19] [PASSED] Component
[21:22:19] [PASSED] DIN
[21:22:19] [PASSED] DP
[21:22:19] [PASSED] TV
[21:22:19] [PASSED] eDP
[21:22:19] [PASSED] Virtual
[21:22:19] [PASSED] DSI
[21:22:19] [PASSED] DPI
[21:22:19] [PASSED] Writeback
[21:22:19] [PASSED] SPI
[21:22:19] [PASSED] USB
[21:22:19] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[21:22:19] ============ [PASSED] drmm_connector_hdmi_init =============
[21:22:19] ============= drmm_connector_init (3 subtests) =============
[21:22:19] [PASSED] drm_test_drmm_connector_init
[21:22:19] [PASSED] drm_test_drmm_connector_init_null_ddc
[21:22:19] ========= drm_test_drmm_connector_init_type_valid  =========
[21:22:19] [PASSED] Unknown
[21:22:19] [PASSED] VGA
[21:22:19] [PASSED] DVI-I
[21:22:19] [PASSED] DVI-D
[21:22:19] [PASSED] DVI-A
[21:22:19] [PASSED] Composite
[21:22:19] [PASSED] SVIDEO
[21:22:19] [PASSED] LVDS
[21:22:19] [PASSED] Component
[21:22:19] [PASSED] DIN
[21:22:19] [PASSED] DP
[21:22:19] [PASSED] HDMI-A
[21:22:19] [PASSED] HDMI-B
[21:22:19] [PASSED] TV
[21:22:19] [PASSED] eDP
[21:22:19] [PASSED] Virtual
[21:22:19] [PASSED] DSI
[21:22:19] [PASSED] DPI
[21:22:19] [PASSED] Writeback
[21:22:19] [PASSED] SPI
[21:22:19] [PASSED] USB
[21:22:19] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[21:22:19] =============== [PASSED] drmm_connector_init ===============
[21:22:19] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[21:22:19] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[21:22:19] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[21:22:19] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[21:22:19] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[21:22:19] ========== drm_test_get_tv_mode_from_name_valid  ===========
[21:22:19] [PASSED] NTSC
[21:22:19] [PASSED] NTSC-443
[21:22:19] [PASSED] NTSC-J
[21:22:19] [PASSED] PAL
[21:22:19] [PASSED] PAL-M
[21:22:19] [PASSED] PAL-N
[21:22:19] [PASSED] SECAM
[21:22:19] [PASSED] Mono
[21:22:19] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[21:22:19] [PASSED] drm_test_get_tv_mode_from_name_truncated
[21:22:19] ============ [PASSED] drm_get_tv_mode_from_name ============
[21:22:19] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[21:22:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[21:22:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[21:22:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[21:22:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[21:22:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[21:22:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[21:22:19] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid  =
[21:22:19] [PASSED] VIC 96
[21:22:19] [PASSED] VIC 97
[21:22:19] [PASSED] VIC 101
[21:22:19] [PASSED] VIC 102
[21:22:19] [PASSED] VIC 106
[21:22:19] [PASSED] VIC 107
[21:22:19] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[21:22:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[21:22:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[21:22:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[21:22:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[21:22:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[21:22:19] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[21:22:19] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[21:22:19] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name  ====
[21:22:19] [PASSED] Automatic
[21:22:19] [PASSED] Full
[21:22:19] [PASSED] Limited 16:235
[21:22:19] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[21:22:19] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[21:22:19] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[21:22:19] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[21:22:19] === drm_test_drm_hdmi_connector_get_output_format_name  ====
[21:22:19] [PASSED] RGB
[21:22:19] [PASSED] YUV 4:2:0
[21:22:19] [PASSED] YUV 4:2:2
[21:22:19] [PASSED] YUV 4:4:4
[21:22:19] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[21:22:19] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[21:22:19] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[21:22:19] ============= drm_damage_helper (21 subtests) ==============
[21:22:19] [PASSED] drm_test_damage_iter_no_damage
[21:22:19] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[21:22:19] [PASSED] drm_test_damage_iter_no_damage_src_moved
[21:22:19] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[21:22:19] [PASSED] drm_test_damage_iter_no_damage_not_visible
[21:22:19] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[21:22:19] [PASSED] drm_test_damage_iter_no_damage_no_fb
[21:22:19] [PASSED] drm_test_damage_iter_simple_damage
[21:22:19] [PASSED] drm_test_damage_iter_single_damage
[21:22:19] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[21:22:19] [PASSED] drm_test_damage_iter_single_damage_outside_src
[21:22:19] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[21:22:19] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[21:22:19] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[21:22:19] [PASSED] drm_test_damage_iter_single_damage_src_moved
[21:22:19] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[21:22:19] [PASSED] drm_test_damage_iter_damage
[21:22:19] [PASSED] drm_test_damage_iter_damage_one_intersect
[21:22:19] [PASSED] drm_test_damage_iter_damage_one_outside
[21:22:19] [PASSED] drm_test_damage_iter_damage_src_moved
[21:22:19] [PASSED] drm_test_damage_iter_damage_not_visible
[21:22:19] ================ [PASSED] drm_damage_helper ================
[21:22:19] ============== drm_dp_mst_helper (3 subtests) ==============
[21:22:19] ============== drm_test_dp_mst_calc_pbn_mode  ==============
[21:22:19] [PASSED] Clock 154000 BPP 30 DSC disabled
[21:22:19] [PASSED] Clock 234000 BPP 30 DSC disabled
[21:22:19] [PASSED] Clock 297000 BPP 24 DSC disabled
[21:22:19] [PASSED] Clock 332880 BPP 24 DSC enabled
[21:22:19] [PASSED] Clock 324540 BPP 24 DSC enabled
[21:22:19] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[21:22:19] ============== drm_test_dp_mst_calc_pbn_div  ===============
[21:22:19] [PASSED] Link rate 2000000 lane count 4
[21:22:19] [PASSED] Link rate 2000000 lane count 2
[21:22:19] [PASSED] Link rate 2000000 lane count 1
[21:22:19] [PASSED] Link rate 1350000 lane count 4
[21:22:19] [PASSED] Link rate 1350000 lane count 2
[21:22:19] [PASSED] Link rate 1350000 lane count 1
[21:22:19] [PASSED] Link rate 1000000 lane count 4
[21:22:19] [PASSED] Link rate 1000000 lane count 2
[21:22:19] [PASSED] Link rate 1000000 lane count 1
[21:22:19] [PASSED] Link rate 810000 lane count 4
[21:22:19] [PASSED] Link rate 810000 lane count 2
[21:22:19] [PASSED] Link rate 810000 lane count 1
[21:22:19] [PASSED] Link rate 540000 lane count 4
[21:22:19] [PASSED] Link rate 540000 lane count 2
[21:22:19] [PASSED] Link rate 540000 lane count 1
[21:22:19] [PASSED] Link rate 270000 lane count 4
[21:22:19] [PASSED] Link rate 270000 lane count 2
[21:22:19] [PASSED] Link rate 270000 lane count 1
[21:22:19] [PASSED] Link rate 162000 lane count 4
[21:22:19] [PASSED] Link rate 162000 lane count 2
[21:22:19] [PASSED] Link rate 162000 lane count 1
[21:22:19] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[21:22:19] ========= drm_test_dp_mst_sideband_msg_req_decode  =========
[21:22:19] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[21:22:19] [PASSED] DP_POWER_UP_PHY with port number
[21:22:19] [PASSED] DP_POWER_DOWN_PHY with port number
[21:22:19] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[21:22:19] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[21:22:19] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[21:22:19] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[21:22:19] [PASSED] DP_QUERY_PAYLOAD with port number
[21:22:19] [PASSED] DP_QUERY_PAYLOAD with VCPI
[21:22:19] [PASSED] DP_REMOTE_DPCD_READ with port number
[21:22:19] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[21:22:19] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[21:22:19] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[21:22:19] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[21:22:19] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[21:22:19] [PASSED] DP_REMOTE_I2C_READ with port number
[21:22:19] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[21:22:19] [PASSED] DP_REMOTE_I2C_READ with transactions array
[21:22:19] [PASSED] DP_REMOTE_I2C_WRITE with port number
[21:22:19] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[21:22:19] [PASSED] DP_REMOTE_I2C_WRITE with data array
[21:22:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[21:22:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[21:22:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[21:22:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[21:22:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[21:22:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[21:22:19] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[21:22:19] ================ [PASSED] drm_dp_mst_helper ================
[21:22:19] ================== drm_exec (7 subtests) ===================
[21:22:19] [PASSED] sanitycheck
[21:22:19] [PASSED] test_lock
[21:22:19] [PASSED] test_lock_unlock
[21:22:19] [PASSED] test_duplicates
[21:22:19] [PASSED] test_prepare
[21:22:19] [PASSED] test_prepare_array
[21:22:19] [PASSED] test_multiple_loops
[21:22:19] ==================== [PASSED] drm_exec =====================
[21:22:19] =========== drm_format_helper_test (17 subtests) ===========
[21:22:19] ============== drm_test_fb_xrgb8888_to_gray8  ==============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[21:22:19] ============= drm_test_fb_xrgb8888_to_rgb332  ==============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[21:22:19] ============= drm_test_fb_xrgb8888_to_rgb565  ==============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[21:22:19] ============ drm_test_fb_xrgb8888_to_xrgb1555  =============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[21:22:19] ============ drm_test_fb_xrgb8888_to_argb1555  =============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[21:22:19] ============ drm_test_fb_xrgb8888_to_rgba5551  =============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[21:22:19] ============= drm_test_fb_xrgb8888_to_rgb888  ==============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[21:22:19] ============ drm_test_fb_xrgb8888_to_argb8888  =============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[21:22:19] =========== drm_test_fb_xrgb8888_to_xrgb2101010  ===========
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[21:22:19] =========== drm_test_fb_xrgb8888_to_argb2101010  ===========
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[21:22:19] ============== drm_test_fb_xrgb8888_to_mono  ===============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[21:22:19] ==================== drm_test_fb_swab  =====================
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ================ [PASSED] drm_test_fb_swab =================
[21:22:19] ============ drm_test_fb_xrgb8888_to_xbgr8888  =============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[21:22:19] ============ drm_test_fb_xrgb8888_to_abgr8888  =============
[21:22:19] [PASSED] single_pixel_source_buffer
[21:22:19] [PASSED] single_pixel_clip_rectangle
[21:22:19] [PASSED] well_known_colors
[21:22:19] [PASSED] destination_pitch
[21:22:19] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[21:22:19] ================= drm_test_fb_clip_offset  =================
[21:22:19] [PASSED] pass through
[21:22:19] [PASSED] horizontal offset
[21:22:19] [PASSED] vertical offset
[21:22:19] [PASSED] horizontal and vertical offset
[21:22:19] [PASSED] horizontal offset (custom pitch)
[21:22:19] [PASSED] vertical offset (custom pitch)
[21:22:19] [PASSED] horizontal and vertical offset (custom pitch)
[21:22:19] ============= [PASSED] drm_test_fb_clip_offset =============
[21:22:19] ============== drm_test_fb_build_fourcc_list  ==============
[21:22:19] [PASSED] no native formats
[21:22:19] [PASSED] XRGB8888 as native format
[21:22:19] [PASSED] remove duplicates
[21:22:19] [PASSED] convert alpha formats
[21:22:19] [PASSED] random formats
[21:22:19] ========== [PASSED] drm_test_fb_build_fourcc_list ==========
[21:22:19] =================== drm_test_fb_memcpy  ====================
[21:22:19] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[21:22:19] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[21:22:19] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[21:22:19] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[21:22:19] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[21:22:19] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[21:22:19] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[21:22:19] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[21:22:19] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[21:22:19] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[21:22:19] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[21:22:19] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[21:22:19] =============== [PASSED] drm_test_fb_memcpy ================
[21:22:19] ============= [PASSED] drm_format_helper_test ==============
[21:22:19] ================= drm_format (18 subtests) =================
[21:22:19] [PASSED] drm_test_format_block_width_invalid
[21:22:19] [PASSED] drm_test_format_block_width_one_plane
[21:22:19] [PASSED] drm_test_format_block_width_two_plane
[21:22:19] [PASSED] drm_test_format_block_width_three_plane
[21:22:19] [PASSED] drm_test_format_block_width_tiled
[21:22:19] [PASSED] drm_test_format_block_height_invalid
[21:22:19] [PASSED] drm_test_format_block_height_one_plane
[21:22:19] [PASSED] drm_test_format_block_height_two_plane
[21:22:19] [PASSED] drm_test_format_block_height_three_plane
[21:22:19] [PASSED] drm_test_format_block_height_tiled
[21:22:19] [PASSED] drm_test_format_min_pitch_invalid
[21:22:19] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[21:22:19] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[21:22:19] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[21:22:19] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[21:22:19] [PASSED] drm_test_format_min_pitch_two_plane
[21:22:19] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[21:22:19] [PASSED] drm_test_format_min_pitch_tiled
[21:22:19] =================== [PASSED] drm_format ====================
[21:22:19] ============== drm_framebuffer (10 subtests) ===============
[21:22:19] ========== drm_test_framebuffer_check_src_coords  ==========
[21:22:19] [PASSED] Success: source fits into fb
[21:22:19] [PASSED] Fail: overflowing fb with x-axis coordinate
[21:22:19] [PASSED] Fail: overflowing fb with y-axis coordinate
[21:22:19] [PASSED] Fail: overflowing fb with source width
[21:22:19] [PASSED] Fail: overflowing fb with source height
[21:22:19] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[21:22:19] [PASSED] drm_test_framebuffer_cleanup
[21:22:19] =============== drm_test_framebuffer_create  ===============
[21:22:19] [PASSED] ABGR8888 normal sizes
[21:22:19] [PASSED] ABGR8888 max sizes
[21:22:19] [PASSED] ABGR8888 pitch greater than min required
[21:22:19] [PASSED] ABGR8888 pitch less than min required
[21:22:19] [PASSED] ABGR8888 Invalid width
[21:22:19] [PASSED] ABGR8888 Invalid buffer handle
[21:22:19] [PASSED] No pixel format
[21:22:19] [PASSED] ABGR8888 Width 0
[21:22:19] [PASSED] ABGR8888 Height 0
[21:22:19] [PASSED] ABGR8888 Out of bound height * pitch combination
[21:22:19] [PASSED] ABGR8888 Large buffer offset
[21:22:19] [PASSED] ABGR8888 Buffer offset for inexistent plane
[21:22:19] [PASSED] ABGR8888 Invalid flag
[21:22:19] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[21:22:19] [PASSED] ABGR8888 Valid buffer modifier
[21:22:19] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[21:22:19] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[21:22:19] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[21:22:19] [PASSED] NV12 Normal sizes
[21:22:19] [PASSED] NV12 Max sizes
[21:22:19] [PASSED] NV12 Invalid pitch
[21:22:19] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[21:22:19] [PASSED] NV12 different  modifier per-plane
[21:22:19] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[21:22:19] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[21:22:19] [PASSED] NV12 Modifier for inexistent plane
[21:22:19] [PASSED] NV12 Handle for inexistent plane
[21:22:19] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[21:22:19] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[21:22:19] [PASSED] YVU420 Normal sizes
[21:22:19] [PASSED] YVU420 Max sizes
[21:22:19] [PASSED] YVU420 Invalid pitch
[21:22:19] [PASSED] YVU420 Different pitches
[21:22:19] [PASSED] YVU420 Different buffer offsets/pitches
[21:22:19] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[21:22:19] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[21:22:19] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[21:22:19] [PASSED] YVU420 Valid modifier
[21:22:19] [PASSED] YVU420 Different modifiers per plane
[21:22:19] [PASSED] YVU420 Modifier for inexistent plane
[21:22:19] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[21:22:19] [PASSED] X0L2 Normal sizes
[21:22:19] [PASSED] X0L2 Max sizes
[21:22:19] [PASSED] X0L2 Invalid pitch
[21:22:19] [PASSED] X0L2 Pitch greater than minimum required
[21:22:19] [PASSED] X0L2 Handle for inexistent plane
[21:22:19] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[21:22:19] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[21:22:19] [PASSED] X0L2 Valid modifier
[21:22:19] [PASSED] X0L2 Modifier for inexistent plane
[21:22:19] =========== [PASSED] drm_test_framebuffer_create ===========
[21:22:19] [PASSED] drm_test_framebuffer_free
[21:22:19] [PASSED] drm_test_framebuffer_init
[21:22:19] [PASSED] drm_test_framebuffer_init_bad_format
[21:22:19] [PASSED] drm_test_framebuffer_init_dev_mismatch
[21:22:19] [PASSED] drm_test_framebuffer_lookup
[21:22:19] [PASSED] drm_test_framebuffer_lookup_inexistent
[21:22:19] [PASSED] drm_test_framebuffer_modifiers_not_supported
[21:22:19] ================= [PASSED] drm_framebuffer =================
[21:22:19] ================ drm_gem_shmem (8 subtests) ================
[21:22:19] [PASSED] drm_gem_shmem_test_obj_create
[21:22:19] [PASSED] drm_gem_shmem_test_obj_create_private
[21:22:19] [PASSED] drm_gem_shmem_test_pin_pages
[21:22:19] [PASSED] drm_gem_shmem_test_vmap
[21:22:19] [PASSED] drm_gem_shmem_test_get_pages_sgt
[21:22:19] [PASSED] drm_gem_shmem_test_get_sg_table
[21:22:19] [PASSED] drm_gem_shmem_test_madvise
[21:22:19] [PASSED] drm_gem_shmem_test_purge
[21:22:19] ================== [PASSED] drm_gem_shmem ==================
[21:22:19] === drm_atomic_helper_connector_hdmi_check (22 subtests) ===
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[21:22:19] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[21:22:19] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback
[21:22:19] [PASSED] drm_test_check_max_tmds_rate_format_fallback
[21:22:19] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[21:22:19] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[21:22:19] [PASSED] drm_test_check_output_bpc_dvi
[21:22:19] [PASSED] drm_test_check_output_bpc_format_vic_1
[21:22:19] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[21:22:19] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[21:22:19] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[21:22:19] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[21:22:19] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[21:22:19] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[21:22:19] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[21:22:19] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[21:22:19] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[21:22:19] [PASSED] drm_test_check_broadcast_rgb_value
[21:22:19] [PASSED] drm_test_check_bpc_8_value
[21:22:19] [PASSED] drm_test_check_bpc_10_value
[21:22:19] [PASSED] drm_test_check_bpc_12_value
[21:22:19] [PASSED] drm_test_check_format_value
[21:22:19] [PASSED] drm_test_check_tmds_char_value
[21:22:19] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[21:22:19] ================= drm_managed (2 subtests) =================
[21:22:19] [PASSED] drm_test_managed_release_action
[21:22:19] [PASSED] drm_test_managed_run_action
[21:22:19] =================== [PASSED] drm_managed ===================
[21:22:19] =================== drm_mm (6 subtests) ====================
[21:22:19] [PASSED] drm_test_mm_init
[21:22:19] [PASSED] drm_test_mm_debug
[21:22:19] [PASSED] drm_test_mm_align32
[21:22:19] [PASSED] drm_test_mm_align64
[21:22:19] [PASSED] drm_test_mm_lowest
[21:22:19] [PASSED] drm_test_mm_highest
[21:22:19] ===================== [PASSED] drm_mm ======================
[21:22:19] ============= drm_modes_analog_tv (5 subtests) =============
[21:22:19] [PASSED] drm_test_modes_analog_tv_mono_576i
[21:22:19] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[21:22:19] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[21:22:19] [PASSED] drm_test_modes_analog_tv_pal_576i
[21:22:19] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[21:22:19] =============== [PASSED] drm_modes_analog_tv ===============
[21:22:19] ============== drm_plane_helper (2 subtests) ===============
[21:22:19] =============== drm_test_check_plane_state  ================
[21:22:19] [PASSED] clipping_simple
[21:22:19] [PASSED] clipping_rotate_reflect
[21:22:19] [PASSED] positioning_simple
[21:22:19] [PASSED] upscaling
[21:22:19] [PASSED] downscaling
[21:22:19] [PASSED] rounding1
[21:22:19] [PASSED] rounding2
[21:22:19] [PASSED] rounding3
[21:22:19] [PASSED] rounding4
[21:22:19] =========== [PASSED] drm_test_check_plane_state ============
[21:22:19] =========== drm_test_check_invalid_plane_state  ============
[21:22:19] [PASSED] positioning_invalid
[21:22:19] [PASSED] upscaling_invalid
[21:22:19] [PASSED] downscaling_invalid
[21:22:19] ======= [PASSED] drm_test_check_invalid_plane_state ========
[21:22:19] ================ [PASSED] drm_plane_helper =================
[21:22:19] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[21:22:19] ====== drm_test_connector_helper_tv_get_modes_check  =======
[21:22:19] [PASSED] None
[21:22:19] [PASSED] PAL
[21:22:19] [PASSED] NTSC
[21:22:19] [PASSED] Both, NTSC Default
[21:22:19] [PASSED] Both, PAL Default
[21:22:19] [PASSED] Both, NTSC Default, with PAL on command-line
[21:22:19] [PASSED] Both, PAL Default, with NTSC on command-line
[21:22:19] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[21:22:19] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[21:22:19] ================== drm_rect (9 subtests) ===================
[21:22:19] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[21:22:19] [PASSED] drm_test_rect_clip_scaled_not_clipped
[21:22:19] [PASSED] drm_test_rect_clip_scaled_clipped
[21:22:19] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[21:22:19] ================= drm_test_rect_intersect  =================
[21:22:19] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[21:22:19] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[21:22:19] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[21:22:19] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[21:22:19] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[21:22:19] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[21:22:19] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[21:22:19] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[21:22:19] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[21:22:19] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[21:22:19] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[21:22:19] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[21:22:19] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[21:22:19] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[21:22:19] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[21:22:19] ============= [PASSED] drm_test_rect_intersect =============
[21:22:19] ================ drm_test_rect_calc_hscale  ================
[21:22:19] [PASSED] normal use
[21:22:19] [PASSED] out of max range
[21:22:19] [PASSED] out of min range
[21:22:19] [PASSED] zero dst
[21:22:19] [PASSED] negative src
[21:22:19] [PASSED] negative dst
[21:22:19] ============ [PASSED] drm_test_rect_calc_hscale ============
[21:22:19] ================ drm_test_rect_calc_vscale  ================
[21:22:19] [PASSED] normal use
[21:22:19] [PASSED] out of max range
[21:22:19] [PASSED] out of min range
[21:22:19] [PASSED] zero dst
[21:22:19] [PASSED] negative src
[21:22:19] [PASSED] negative dst
[21:22:19] ============ [PASSED] drm_test_rect_calc_vscale ============
[21:22:19] ================== drm_test_rect_rotate  ===================
[21:22:19] [PASSED] reflect-x
[21:22:19] [PASSED] reflect-y
[21:22:19] [PASSED] rotate-0
[21:22:19] [PASSED] rotate-90
[21:22:19] [PASSED] rotate-180
[21:22:19] [PASSED] rotate-270
[21:22:19] ============== [PASSED] drm_test_rect_rotate ===============
[21:22:19] ================ drm_test_rect_rotate_inv  =================
[21:22:19] [PASSED] reflect-x
[21:22:19] [PASSED] reflect-y
[21:22:19] [PASSED] rotate-0
[21:22:19] [PASSED] rotate-90
[21:22:19] [PASSED] rotate-180
[21:22:19] [PASSED] rotate-270
[21:22:19] ============ [PASSED] drm_test_rect_rotate_inv =============
[21:22:19] ==================== [PASSED] drm_rect =====================
[21:22:19] ============================================================
[21:22:19] Testing complete. Ran 531 tests: passed: 531
[21:22:19] Elapsed time: 24.840s total, 1.607s configuring, 23.015s building, 0.172s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[21:22:19] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[21:22:20] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json ARCH=um O=.kunit --jobs=48
[21:22:28] Starting KUnit Kernel (1/1)...
[21:22:28] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[21:22:28] ================= ttm_device (5 subtests) ==================
[21:22:28] [PASSED] ttm_device_init_basic
[21:22:28] [PASSED] ttm_device_init_multiple
[21:22:28] [PASSED] ttm_device_fini_basic
[21:22:28] [PASSED] ttm_device_init_no_vma_man
[21:22:28] ================== ttm_device_init_pools  ==================
[21:22:28] [PASSED] No DMA allocations, no DMA32 required
[21:22:28] [PASSED] DMA allocations, DMA32 required
[21:22:28] [PASSED] No DMA allocations, DMA32 required
[21:22:28] [PASSED] DMA allocations, no DMA32 required
[21:22:28] ============== [PASSED] ttm_device_init_pools ==============
[21:22:28] =================== [PASSED] ttm_device ====================
[21:22:28] ================== ttm_pool (8 subtests) ===================
[21:22:28] ================== ttm_pool_alloc_basic  ===================
[21:22:28] [PASSED] One page
[21:22:28] [PASSED] More than one page
[21:22:28] [PASSED] Above the allocation limit
[21:22:28] [PASSED] One page, with coherent DMA mappings enabled
[21:22:28] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[21:22:28] ============== [PASSED] ttm_pool_alloc_basic ===============
[21:22:28] ============== ttm_pool_alloc_basic_dma_addr  ==============
[21:22:28] [PASSED] One page
[21:22:28] [PASSED] More than one page
[21:22:28] [PASSED] Above the allocation limit
[21:22:28] [PASSED] One page, with coherent DMA mappings enabled
[21:22:28] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[21:22:28] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[21:22:28] [PASSED] ttm_pool_alloc_order_caching_match
[21:22:28] [PASSED] ttm_pool_alloc_caching_mismatch
[21:22:28] [PASSED] ttm_pool_alloc_order_mismatch
[21:22:28] [PASSED] ttm_pool_free_dma_alloc
[21:22:28] [PASSED] ttm_pool_free_no_dma_alloc
[21:22:28] [PASSED] ttm_pool_fini_basic
[21:22:28] ==================== [PASSED] ttm_pool =====================
[21:22:28] ================ ttm_resource (8 subtests) =================
[21:22:28] ================= ttm_resource_init_basic  =================
[21:22:28] [PASSED] Init resource in TTM_PL_SYSTEM
[21:22:28] [PASSED] Init resource in TTM_PL_VRAM
[21:22:28] [PASSED] Init resource in a private placement
[21:22:28] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[21:22:28] ============= [PASSED] ttm_resource_init_basic =============
[21:22:28] [PASSED] ttm_resource_init_pinned
[21:22:28] [PASSED] ttm_resource_fini_basic
[21:22:28] [PASSED] ttm_resource_manager_init_basic
[21:22:28] [PASSED] ttm_resource_manager_usage_basic
[21:22:28] [PASSED] ttm_resource_manager_set_used_basic
[21:22:28] [PASSED] ttm_sys_man_alloc_basic
[21:22:28] [PASSED] ttm_sys_man_free_basic
[21:22:28] ================== [PASSED] ttm_resource ===================
[21:22:28] =================== ttm_tt (15 subtests) ===================
[21:22:28] ==================== ttm_tt_init_basic  ====================
[21:22:28] [PASSED] Page-aligned size
[21:22:28] [PASSED] Extra pages requested
[21:22:28] ================ [PASSED] ttm_tt_init_basic ================
[21:22:28] [PASSED] ttm_tt_init_misaligned
[21:22:28] [PASSED] ttm_tt_fini_basic
[21:22:28] [PASSED] ttm_tt_fini_sg
[21:22:28] [PASSED] ttm_tt_fini_shmem
[21:22:28] [PASSED] ttm_tt_create_basic
[21:22:28] [PASSED] ttm_tt_create_invalid_bo_type
[21:22:28] [PASSED] ttm_tt_create_ttm_exists
[21:22:28] [PASSED] ttm_tt_create_failed
[21:22:28] [PASSED] ttm_tt_destroy_basic
[21:22:28] [PASSED] ttm_tt_populate_null_ttm
[21:22:28] [PASSED] ttm_tt_populate_populated_ttm
[21:22:28] [PASSED] ttm_tt_unpopulate_basic
[21:22:28] [PASSED] ttm_tt_unpopulate_empty_ttm
[21:22:28] [PASSED] ttm_tt_swapin_basic
[21:22:28] ===================== [PASSED] ttm_tt ======================
[21:22:28] =================== ttm_bo (14 subtests) ===================
[21:22:28] =========== ttm_bo_reserve_optimistic_no_ticket  ===========
[21:22:28] [PASSED] Cannot be interrupted and sleeps
[21:22:28] [PASSED] Cannot be interrupted, locks straight away
[21:22:28] [PASSED] Can be interrupted, sleeps
[21:22:28] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[21:22:28] [PASSED] ttm_bo_reserve_locked_no_sleep
[21:22:28] [PASSED] ttm_bo_reserve_no_wait_ticket
[21:22:28] [PASSED] ttm_bo_reserve_double_resv
[21:22:28] [PASSED] ttm_bo_reserve_interrupted
[21:22:28] [PASSED] ttm_bo_reserve_deadlock
[21:22:28] [PASSED] ttm_bo_unreserve_basic
[21:22:28] [PASSED] ttm_bo_unreserve_pinned
[21:22:28] [PASSED] ttm_bo_unreserve_bulk
[21:22:28] [PASSED] ttm_bo_put_basic
[21:22:28] [PASSED] ttm_bo_put_shared_resv
[21:22:28] [PASSED] ttm_bo_pin_basic
[21:22:28] [PASSED] ttm_bo_pin_unpin_resource
[21:22:28] [PASSED] ttm_bo_multiple_pin_one_unpin
[21:22:28] ===================== [PASSED] ttm_bo ======================
[21:22:28] ============== ttm_bo_validate (22 subtests) ===============
[21:22:28] ============== ttm_bo_init_reserved_sys_man  ===============
[21:22:28] [PASSED] Buffer object for userspace
[21:22:28] [PASSED] Kernel buffer object
[21:22:28] [PASSED] Shared buffer object
[21:22:28] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[21:22:28] ============== ttm_bo_init_reserved_mock_man  ==============
[21:22:28] [PASSED] Buffer object for userspace
[21:22:28] [PASSED] Kernel buffer object
[21:22:28] [PASSED] Shared buffer object
[21:22:28] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[21:22:28] [PASSED] ttm_bo_init_reserved_resv
[21:22:28] ================== ttm_bo_validate_basic  ==================
[21:22:28] [PASSED] Buffer object for userspace
[21:22:28] [PASSED] Kernel buffer object
[21:22:28] [PASSED] Shared buffer object
[21:22:28] ============== [PASSED] ttm_bo_validate_basic ==============
[21:22:28] [PASSED] ttm_bo_validate_invalid_placement
[21:22:28] ============= ttm_bo_validate_same_placement  ==============
[21:22:28] [PASSED] System manager
[21:22:28] [PASSED] VRAM manager
[21:22:28] ========= [PASSED] ttm_bo_validate_same_placement ==========
[21:22:28] [PASSED] ttm_bo_validate_failed_alloc
[21:22:28] [PASSED] ttm_bo_validate_pinned
[21:22:28] [PASSED] ttm_bo_validate_busy_placement
[21:22:28] ================ ttm_bo_validate_multihop  =================
[21:22:28] [PASSED] Buffer object for userspace
[21:22:28] [PASSED] Kernel buffer object
[21:22:28] [PASSED] Shared buffer object
[21:22:28] ============ [PASSED] ttm_bo_validate_multihop =============
[21:22:28] ========== ttm_bo_validate_no_placement_signaled  ==========
[21:22:28] [PASSED] Buffer object in system domain, no page vector
[21:22:28] [PASSED] Buffer object in system domain with an existing page vector
[21:22:28] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[21:22:28] ======== ttm_bo_validate_no_placement_not_signaled  ========
[21:22:28] [PASSED] Buffer object for userspace
[21:22:28] [PASSED] Kernel buffer object
[21:22:28] [PASSED] Shared buffer object
[21:22:28] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[21:22:28] [PASSED] ttm_bo_validate_move_fence_signaled
[21:22:28] ========= ttm_bo_validate_move_fence_not_signaled  =========
[21:22:28] [PASSED] Waits for GPU
[21:22:28] [PASSED] Tries to lock straight away
[21:22:29] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[21:22:29] [PASSED] ttm_bo_validate_swapout
[21:22:29] [PASSED] ttm_bo_validate_happy_evict
[21:22:29] [PASSED] ttm_bo_validate_all_pinned_evict
[21:22:29] [PASSED] ttm_bo_validate_allowed_only_evict
[21:22:29] [PASSED] ttm_bo_validate_deleted_evict
[21:22:29] [PASSED] ttm_bo_validate_busy_domain_evict
[21:22:29] [PASSED] ttm_bo_validate_evict_gutting
[21:22:29] [PASSED] ttm_bo_validate_recrusive_evict
[21:22:29] ================= [PASSED] ttm_bo_validate =================
[21:22:29] ============================================================
[21:22:29] Testing complete. Ran 102 tests: passed: 102
[21:22:29] Elapsed time: 9.910s total, 1.700s configuring, 7.543s building, 0.563s running

+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* ✗ CI.Build: failure for drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
  2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
                   ` (13 preceding siblings ...)
  2024-10-02 21:22 ` ✓ CI.KUnit: success " Patchwork
@ 2024-10-02 21:27 ` Patchwork
  14 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2024-10-02 21:27 UTC (permalink / raw)
  To: John Harrison; +Cc: intel-xe

== Series Details ==

Series: drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5)
URL   : https://patchwork.freedesktop.org/series/137985/
State : failure

== Summary ==

  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml2_dc_resource_mgmt.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml2_mall_phantom.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml_display_rq_dlg_calc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_top/dml_top.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_top/dml_top_mcache.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/inc/dml2_debug.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_factory.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_factory.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_mcg/dml2_mcg_dcn4.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_mcg/dml2_mcg_factory.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn3.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_factory.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn4_fams2.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/dml21_translation_helper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/dml21_wrapper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/dml21_utils.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce120/dce120_timing_generator.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce112/dce112_compressor.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_timing_generator.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_compressor.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_opp_regamma_v.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_opp_csc_v.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_timing_generator_v.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_mem_input_v.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_opp_v.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_transform_v.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce80/dce80_timing_generator.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_timing_generator.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_hw_sequencer.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/hdcp/hdcp_msg.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/spl/dc_spl.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/spl/dc_spl_scl_filters.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/spl/dc_spl_scl_easf_filters.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/spl/dc_spl_isharp_filters.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/spl/dc_spl_filters.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/spl/spl_fixpt31_32.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/spl/spl_custom_float.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stat.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_hw_sequencer.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_sink.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_surface.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_debug.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_enc_cfg.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_exports.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_state.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_vm_helper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dc_edid_parser.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dc_spl_translate.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/freesync/freesync.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/color/color_gamma.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/color/color_table.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/info_packet/info_packet.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/power/power_helpers.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_srv.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_srv_stat.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_reg.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn20.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn21.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn30.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn301.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn302.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn303.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn31.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn314.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn315.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn316.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn32.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn35.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn351.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn401.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_ddc.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_log.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp1_execution.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp1_transition.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp2_execution.o
  CC [M]  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp2_transition.o
  LD [M]  drivers/gpu/drm/amd/amdgpu/amdgpu.o
make[5]: *** [../scripts/Makefile.build:478: drivers/gpu/drm] Error 2
make[4]: *** [../scripts/Makefile.build:478: drivers/gpu] Error 2
make[3]: *** [../scripts/Makefile.build:478: drivers] Error 2
make[2]: *** [/kernel/Makefile:1936: .] Error 2
make[1]: *** [/kernel/Makefile:224: __sub-make] Error 2
make[1]: Leaving directory '/kernel/build64-default'
make: *** [Makefile:224: __sub-make] Error 2
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-10-03  0:46 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
@ 2024-10-03  0:46 ` John.C.Harrison
  2024-12-12 18:17   ` Souza, Jose
  0 siblings, 1 reply; 23+ messages in thread
From: John.C.Harrison @ 2024-10-03  0:46 UTC (permalink / raw)
  To: Intel-Xe; +Cc: John Harrison, Julia Filipchuk

From: John Harrison <John.C.Harrison@Intel.com>

The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
is definitely not a GuC CT thing. So give it its own section heading.
The snapshot itself is really a capture of the submission backend's
internal state. Although all it currently prints out is the submission
contexts. So label it as 'Contexts'. If more general state is added
later then it could be changed to 'Submission backend' or some such.

Further, everything from the GuC CT section onwards is GT specific but
there was no indication of which GT it was related to (and that is
impossible to work out from the other fields that are given). So add a
GT section heading. Also include the tile id of the GT, because that is
also significant information.

Lastly, drop a couple of unnecessary line feeds within sections.

v2: Add GT section heading, add tile id to device section.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
---
 drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
 drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
 drivers/gpu/drm/xe/xe_device.c            | 1 +
 drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
 drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
 5 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index d23719d5c2a3..2690f1d1cde4 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
 	drm_printf(&p, "Process: %s\n", ss->process_name);
 	xe_device_snapshot_print(xe, &p);
 
+	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
+	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
+
 	drm_puts(&p, "\n**** GuC CT ****\n");
 	xe_guc_ct_snapshot_print(ss->ct, &p);
+
+	drm_puts(&p, "\n**** Contexts ****\n");
 	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
 
 	drm_puts(&p, "\n**** Job ****\n");
diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
index 440d05d77a5a..3cc2f095fdfb 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
+++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
@@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
 	/* GuC snapshots */
 	/** @ct: GuC CT snapshot */
 	struct xe_guc_ct_snapshot *ct;
-	/** @ge: Guc Engine snapshot */
+
+	/** @ge: GuC Submission Engine snapshot */
 	struct xe_guc_submit_exec_queue_snapshot *ge;
 
 	/** @hwe: HW Engine snapshot array */
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 09a7ad830e69..030cf703e970 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
 
 	for_each_gt(gt, xe, id) {
 		drm_printf(p, "GT id: %u\n", id);
+		drm_printf(p, "\tTile: %u\n", gt->tile->id);
 		drm_printf(p, "\tType: %s\n",
 			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
 		drm_printf(p, "\tIP ver: %u.%u.%u\n",
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 0ac4a19ec9cc..8690df699170 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
 	if (!snapshot)
 		return;
 
-	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
+	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
 	drm_printf(p, "\tName: %s\n", snapshot->name);
 	drm_printf(p, "\tClass: %d\n", snapshot->class);
 	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
index ea6d9ef7fab6..6c9c27304cdc 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
 	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
 		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
 			   snapshot->reg.rcu_mode);
-	drm_puts(p, "\n");
 }
 
 /**
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-10-03  0:46 ` [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info John.C.Harrison
@ 2024-12-12 18:17   ` Souza, Jose
  2024-12-12 18:59     ` John Harrison
  0 siblings, 1 reply; 23+ messages in thread
From: Souza, Jose @ 2024-12-12 18:17 UTC (permalink / raw)
  To: Intel-Xe@Lists.FreeDesktop.Org, Harrison, John C, Vivi, Rodrigo,
	De Marchi, Lucas
  Cc: Filipchuk, Julia

On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
> is definitely not a GuC CT thing. So give it its own section heading.
> The snapshot itself is really a capture of the submission backend's
> internal state. Although all it currently prints out is the submission
> contexts. So label it as 'Contexts'. If more general state is added
> later then it could be changed to 'Submission backend' or some such.
> 
> Further, everything from the GuC CT section onwards is GT specific but
> there was no indication of which GT it was related to (and that is
> impossible to work out from the other fields that are given). So add a
> GT section heading. Also include the tile id of the GT, because that is
> also significant information.
> 
> Lastly, drop a couple of unnecessary line feeds within sections.
> 
> v2: Add GT section heading, add tile id to device section.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
>  drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
>  drivers/gpu/drm/xe/xe_device.c            | 1 +
>  drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
>  drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
>  5 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
> index d23719d5c2a3..2690f1d1cde4 100644
> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
>  	drm_printf(&p, "Process: %s\n", ss->process_name);
>  	xe_device_snapshot_print(xe, &p);
>  
> +	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
> +	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
> +
>  	drm_puts(&p, "\n**** GuC CT ****\n");
>  	xe_guc_ct_snapshot_print(ss->ct, &p);
> +
> +	drm_puts(&p, "\n**** Contexts ****\n");
>  	xe_guc_exec_queue_snapshot_print(ss->ge, &p);

This broke the Mesa parser!
It can no longer parse the exec_queue context, because it expects that context to be in the '**** GuC CT ****' section.
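
For illustration only: a minimal sketch in C of how a consumer of the dump
could find the exec queue entry by its 'GuC ID:' line under either the old
'**** GuC CT ****' heading or the new '**** Contexts ****' heading. This is
not the Mesa parser. The helper names parse_heading() and find_guc_id() are
made up for this example, and the sketch assumes section headings keep the
'**** name ****' form used in the diff above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if 'line' is a "**** name ****" heading, copy the
 * name (truncated to fit) into 'name' and return 1. Otherwise return 0.
 */
static int parse_heading(const char *line, char *name, size_t len)
{
	const char *start, *end;
	size_t n;

	assert(name && len > 0);

	if (strncmp(line, "**** ", 5) != 0)
		return 0;

	start = line + 5;
	end = strstr(start, " ****");
	if (!end)
		return 0;

	n = (size_t)(end - start);
	if (n >= len)
		n = len - 1;
	memcpy(name, start, n);
	name[n] = '\0';
	return 1;
}

/*
 * Hypothetical helper: walk the dump line by line, remember the current
 * section, and return the first "GuC ID" found in either the "GuC CT"
 * section (old layout) or the "Contexts" section (new layout).
 * Returns -1 if no such line is found.
 */
static int find_guc_id(const char *dump)
{
	char section[64] = "";
	char line[256];
	const char *p = dump;

	while (*p) {
		size_t n = strcspn(p, "\n");
		size_t copy = n < sizeof(line) - 1 ? n : sizeof(line) - 1;
		int id;

		/* Copy one (possibly truncated) line and step past its newline. */
		memcpy(line, p, copy);
		line[copy] = '\0';
		p += n;
		if (*p == '\n')
			p++;

		if (parse_heading(line, section, sizeof(section)))
			continue;

		if ((strcmp(section, "GuC CT") == 0 ||
		     strcmp(section, "Contexts") == 0) &&
		    sscanf(line, "GuC ID: %d", &id) == 1)
			return id;
	}

	return -1;
}
```

Matching on the 'GuC ID:' line and accepting both headings means the same
code handles dumps produced before and after this patch. It also does not
depend on the leading line feed that this patch removes from the
'GuC ID:' print.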

>  
>  	drm_puts(&p, "\n**** Job ****\n");
> diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> index 440d05d77a5a..3cc2f095fdfb 100644
> --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
> +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
>  	/* GuC snapshots */
>  	/** @ct: GuC CT snapshot */
>  	struct xe_guc_ct_snapshot *ct;
> -	/** @ge: Guc Engine snapshot */
> +
> +	/** @ge: GuC Submission Engine snapshot */
>  	struct xe_guc_submit_exec_queue_snapshot *ge;
>  
>  	/** @hwe: HW Engine snapshot array */
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 09a7ad830e69..030cf703e970 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
>  
>  	for_each_gt(gt, xe, id) {
>  		drm_printf(p, "GT id: %u\n", id);
> +		drm_printf(p, "\tTile: %u\n", gt->tile->id);
>  		drm_printf(p, "\tType: %s\n",
>  			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
>  		drm_printf(p, "\tIP ver: %u.%u.%u\n",
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 0ac4a19ec9cc..8690df699170 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
>  	if (!snapshot)
>  		return;
>  
> -	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
> +	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
>  	drm_printf(p, "\tName: %s\n", snapshot->name);
>  	drm_printf(p, "\tClass: %d\n", snapshot->class);
>  	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> index ea6d9ef7fab6..6c9c27304cdc 100644
> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
>  	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
>  		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
>  			   snapshot->reg.rcu_mode);
> -	drm_puts(p, "\n");
>  }
>  
>  /**



* Re: [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-12-12 18:17   ` Souza, Jose
@ 2024-12-12 18:59     ` John Harrison
  2024-12-12 19:31       ` Souza, Jose
  0 siblings, 1 reply; 23+ messages in thread
From: John Harrison @ 2024-12-12 18:59 UTC (permalink / raw)
  To: Souza, Jose, Intel-Xe@Lists.FreeDesktop.Org, Vivi, Rodrigo,
	De Marchi, Lucas
  Cc: Filipchuk, Julia

On 12/12/2024 10:17, Souza, Jose wrote:
> On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
>> is definitely not a GuC CT thing. So give it its own section heading.
>> The snapshot itself is really a capture of the submission backend's
>> internal state. Although all it currently prints out is the submission
>> contexts. So label it as 'Contexts'. If more general state is added
>> later then it could be change to 'Submission backend' or some such.
>>
>> Further, everything from the GuC CT section onwards is GT specific but
>> there was no indication of which GT it was related to (and that is
>> impossible to work out from the other fields that are given). So add a
>> GT section heading. Also include the tile id of the GT, because again
>> significant information.
>>
>> Lastly, drop a couple of unnecessary line feeds within sections.
>>
>> v2: Add GT section heading, add tile id to device section.
>>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
>>   drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
>>   drivers/gpu/drm/xe/xe_device.c            | 1 +
>>   drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
>>   drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
>>   5 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
>> index d23719d5c2a3..2690f1d1cde4 100644
>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
>> @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
>>   	drm_printf(&p, "Process: %s\n", ss->process_name);
>>   	xe_device_snapshot_print(xe, &p);
>>   
>> +	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
>> +	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
>> +
>>   	drm_puts(&p, "\n**** GuC CT ****\n");
>>   	xe_guc_ct_snapshot_print(ss->ct, &p);
>> +
>> +	drm_puts(&p, "\n**** Contexts ****\n");
>>   	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
> This broke Mesa parser!
> It can't now parse the exec_queue context because it was expected to be on the '**** GuC CT ****' section.
Then the Mesa parser needs to be updated. That was clearly a bug - exec 
queue contexts are absolutely not GuC CT data and should not be in the 
GuC CT section.

John.

>
>>   
>>   	drm_puts(&p, "\n**** Job ****\n");
>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>> index 440d05d77a5a..3cc2f095fdfb 100644
>> --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
>> +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>> @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
>>   	/* GuC snapshots */
>>   	/** @ct: GuC CT snapshot */
>>   	struct xe_guc_ct_snapshot *ct;
>> -	/** @ge: Guc Engine snapshot */
>> +
>> +	/** @ge: GuC Submission Engine snapshot */
>>   	struct xe_guc_submit_exec_queue_snapshot *ge;
>>   
>>   	/** @hwe: HW Engine snapshot array */
>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>> index 09a7ad830e69..030cf703e970 100644
>> --- a/drivers/gpu/drm/xe/xe_device.c
>> +++ b/drivers/gpu/drm/xe/xe_device.c
>> @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
>>   
>>   	for_each_gt(gt, xe, id) {
>>   		drm_printf(p, "GT id: %u\n", id);
>> +		drm_printf(p, "\tTile: %u\n", gt->tile->id);
>>   		drm_printf(p, "\tType: %s\n",
>>   			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
>>   		drm_printf(p, "\tIP ver: %u.%u.%u\n",
>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>> index 0ac4a19ec9cc..8690df699170 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>> @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
>>   	if (!snapshot)
>>   		return;
>>   
>> -	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
>> +	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
>>   	drm_printf(p, "\tName: %s\n", snapshot->name);
>>   	drm_printf(p, "\tClass: %d\n", snapshot->class);
>>   	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
>> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
>> index ea6d9ef7fab6..6c9c27304cdc 100644
>> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
>> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
>> @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
>>   	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
>>   		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
>>   			   snapshot->reg.rcu_mode);
>> -	drm_puts(p, "\n");
>>   }
>>   
>>   /**



* Re: [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-12-12 18:59     ` John Harrison
@ 2024-12-12 19:31       ` Souza, Jose
  2024-12-12 20:06         ` John Harrison
  0 siblings, 1 reply; 23+ messages in thread
From: Souza, Jose @ 2024-12-12 19:31 UTC (permalink / raw)
  To: Intel-Xe@Lists.FreeDesktop.Org, Harrison, John C, Vivi, Rodrigo,
	De Marchi, Lucas
  Cc: Filipchuk, Julia

On Thu, 2024-12-12 at 10:59 -0800, John Harrison wrote:
> On 12/12/2024 10:17, Souza, Jose wrote:
> > On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison@Intel.com wrote:
> > > From: John Harrison <John.C.Harrison@Intel.com>
> > > 
> > > The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
> > > is definitely not a GuC CT thing. So give it its own section heading.
> > > The snapshot itself is really a capture of the submission backend's
> > > internal state. Although all it currently prints out is the submission
> > > contexts. So label it as 'Contexts'. If more general state is added
> > > later then it could be change to 'Submission backend' or some such.
> > > 
> > > Further, everything from the GuC CT section onwards is GT specific but
> > > there was no indication of which GT it was related to (and that is
> > > impossible to work out from the other fields that are given). So add a
> > > GT section heading. Also include the tile id of the GT, because again
> > > significant information.
> > > 
> > > Lastly, drop a couple of unnecessary line feeds within sections.
> > > 
> > > v2: Add GT section heading, add tile id to device section.
> > > 
> > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> > > Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
> > >   drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
> > >   drivers/gpu/drm/xe/xe_device.c            | 1 +
> > >   drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
> > >   drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
> > >   5 files changed, 9 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > index d23719d5c2a3..2690f1d1cde4 100644
> > > --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> > > +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
> > >   	drm_printf(&p, "Process: %s\n", ss->process_name);
> > >   	xe_device_snapshot_print(xe, &p);
> > >   
> > > +	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
> > > +	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
> > > +
> > >   	drm_puts(&p, "\n**** GuC CT ****\n");
> > >   	xe_guc_ct_snapshot_print(ss->ct, &p);
> > > +
> > > +	drm_puts(&p, "\n**** Contexts ****\n");
> > >   	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
> > This broke Mesa parser!
> > It can't now parse the exec_queue context because it was expected to be on the '**** GuC CT ****' section.
> Then the mesa parse needs to be updated. That was clearly a bug - exec 
> queue contexts are absolutely not GuC CT data and should not be in the 
> GuC CT section.

It doesn't matter whether it is a bug or not, it broke the parser.
If this is not reverted, we will have older kernel versions that don't work with newer Mesa and newer kernel versions that don't work with older Mesa.

> 
> John.
> 
> > 
> > >   
> > >   	drm_puts(&p, "\n**** Job ****\n");
> > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > index 440d05d77a5a..3cc2f095fdfb 100644
> > > --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
> > >   	/* GuC snapshots */
> > >   	/** @ct: GuC CT snapshot */
> > >   	struct xe_guc_ct_snapshot *ct;
> > > -	/** @ge: Guc Engine snapshot */
> > > +
> > > +	/** @ge: GuC Submission Engine snapshot */
> > >   	struct xe_guc_submit_exec_queue_snapshot *ge;
> > >   
> > >   	/** @hwe: HW Engine snapshot array */
> > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > index 09a7ad830e69..030cf703e970 100644
> > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
> > >   
> > >   	for_each_gt(gt, xe, id) {
> > >   		drm_printf(p, "GT id: %u\n", id);
> > > +		drm_printf(p, "\tTile: %u\n", gt->tile->id);
> > >   		drm_printf(p, "\tType: %s\n",
> > >   			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
> > >   		drm_printf(p, "\tIP ver: %u.%u.%u\n",
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > index 0ac4a19ec9cc..8690df699170 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
> > >   	if (!snapshot)
> > >   		return;
> > >   
> > > -	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
> > > +	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
> > >   	drm_printf(p, "\tName: %s\n", snapshot->name);
> > >   	drm_printf(p, "\tClass: %d\n", snapshot->class);
> > >   	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
> > > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > index ea6d9ef7fab6..6c9c27304cdc 100644
> > > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
> > >   	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
> > >   		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
> > >   			   snapshot->reg.rcu_mode);
> > > -	drm_puts(p, "\n");
> > >   }
> > >   
> > >   /**
> 



* Re: [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-12-12 19:31       ` Souza, Jose
@ 2024-12-12 20:06         ` John Harrison
  2024-12-12 20:30           ` Souza, Jose
  0 siblings, 1 reply; 23+ messages in thread
From: John Harrison @ 2024-12-12 20:06 UTC (permalink / raw)
  To: Souza, Jose, Intel-Xe@Lists.FreeDesktop.Org, Vivi, Rodrigo,
	De Marchi, Lucas
  Cc: Filipchuk, Julia

On 12/12/2024 11:31, Souza, Jose wrote:
> On Thu, 2024-12-12 at 10:59 -0800, John Harrison wrote:
>> On 12/12/2024 10:17, Souza, Jose wrote:
>>> On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison@Intel.com wrote:
>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>
>>>> The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
>>>> is definitely not a GuC CT thing. So give it its own section heading.
>>>> The snapshot itself is really a capture of the submission backend's
>>>> internal state. Although all it currently prints out is the submission
>>>> contexts. So label it as 'Contexts'. If more general state is added
>>>> later then it could be change to 'Submission backend' or some such.
>>>>
>>>> Further, everything from the GuC CT section onwards is GT specific but
>>>> there was no indication of which GT it was related to (and that is
>>>> impossible to work out from the other fields that are given). So add a
>>>> GT section heading. Also include the tile id of the GT, because again
>>>> significant information.
>>>>
>>>> Lastly, drop a couple of unnecessary line feeds within sections.
>>>>
>>>> v2: Add GT section heading, add tile id to device section.
>>>>
>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
>>>>    drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
>>>>    drivers/gpu/drm/xe/xe_device.c            | 1 +
>>>>    drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
>>>>    drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
>>>>    5 files changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
>>>> index d23719d5c2a3..2690f1d1cde4 100644
>>>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
>>>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
>>>> @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
>>>>    	drm_printf(&p, "Process: %s\n", ss->process_name);
>>>>    	xe_device_snapshot_print(xe, &p);
>>>>    
>>>> +	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
>>>> +	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
>>>> +
>>>>    	drm_puts(&p, "\n**** GuC CT ****\n");
>>>>    	xe_guc_ct_snapshot_print(ss->ct, &p);
>>>> +
>>>> +	drm_puts(&p, "\n**** Contexts ****\n");
>>>>    	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
>>> This broke Mesa parser!
>>> It can't now parse the exec_queue context because it was expected to be on the '**** GuC CT ****' section.
>> Then the mesa parse needs to be updated. That was clearly a bug - exec
>> queue contexts are absolutely not GuC CT data and should not be in the
>> GuC CT section.
> Don't matter if it is a bug or not, it broke the parser.
> If this is not reverted we will have older Kernel versions that don't work with newer Mesa and newer Kernel versions that don't with old Mesa.
Debug tools cannot count as UAPI that must never change.

The devcoredump contains much information that is essentially the 
internals of the kernel. It is going to change. That is about the only 
guarantee that we can make about it. And saying that we must 
intentionally break the output of a developer only debug feature in 
order to support older mesa is plain wrong. End users do not care about 
debug tools. All user applications will still work just perfectly.

We can start adding version numbers to the devcoredump format if we 
really need to. But that was already shot down as a bad idea. It is 
debug information and not UAPI. So version incompatibilities are 
expected from time to time.

John.


>
>> John.
>>
>>>>    
>>>>    	drm_puts(&p, "\n**** Job ****\n");
>>>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>> index 440d05d77a5a..3cc2f095fdfb 100644
>>>> --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>> +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>> @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
>>>>    	/* GuC snapshots */
>>>>    	/** @ct: GuC CT snapshot */
>>>>    	struct xe_guc_ct_snapshot *ct;
>>>> -	/** @ge: Guc Engine snapshot */
>>>> +
>>>> +	/** @ge: GuC Submission Engine snapshot */
>>>>    	struct xe_guc_submit_exec_queue_snapshot *ge;
>>>>    
>>>>    	/** @hwe: HW Engine snapshot array */
>>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>>> index 09a7ad830e69..030cf703e970 100644
>>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>>> @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
>>>>    
>>>>    	for_each_gt(gt, xe, id) {
>>>>    		drm_printf(p, "GT id: %u\n", id);
>>>> +		drm_printf(p, "\tTile: %u\n", gt->tile->id);
>>>>    		drm_printf(p, "\tType: %s\n",
>>>>    			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
>>>>    		drm_printf(p, "\tIP ver: %u.%u.%u\n",
>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>>>> index 0ac4a19ec9cc..8690df699170 100644
>>>> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>>>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>>>> @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
>>>>    	if (!snapshot)
>>>>    		return;
>>>>    
>>>> -	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
>>>> +	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
>>>>    	drm_printf(p, "\tName: %s\n", snapshot->name);
>>>>    	drm_printf(p, "\tClass: %d\n", snapshot->class);
>>>>    	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
>>>> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
>>>> index ea6d9ef7fab6..6c9c27304cdc 100644
>>>> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
>>>> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
>>>> @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
>>>>    	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
>>>>    		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
>>>>    			   snapshot->reg.rcu_mode);
>>>> -	drm_puts(p, "\n");
>>>>    }
>>>>    
>>>>    /**



* Re: [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-12-12 20:06         ` John Harrison
@ 2024-12-12 20:30           ` Souza, Jose
  2024-12-12 20:38             ` John Harrison
  0 siblings, 1 reply; 23+ messages in thread
From: Souza, Jose @ 2024-12-12 20:30 UTC (permalink / raw)
  To: Intel-Xe@Lists.FreeDesktop.Org, Harrison, John C, Vivi, Rodrigo,
	De Marchi, Lucas
  Cc: Filipchuk, Julia

On Thu, 2024-12-12 at 12:06 -0800, John Harrison wrote:
> On 12/12/2024 11:31, Souza, Jose wrote:
> > On Thu, 2024-12-12 at 10:59 -0800, John Harrison wrote:
> > > On 12/12/2024 10:17, Souza, Jose wrote:
> > > > On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison@Intel.com wrote:
> > > > > From: John Harrison <John.C.Harrison@Intel.com>
> > > > > 
> > > > > The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
> > > > > is definitely not a GuC CT thing. So give it its own section heading.
> > > > > The snapshot itself is really a capture of the submission backend's
> > > > > internal state. Although all it currently prints out is the submission
> > > > > contexts. So label it as 'Contexts'. If more general state is added
> > > > > later then it could be change to 'Submission backend' or some such.
> > > > > 
> > > > > Further, everything from the GuC CT section onwards is GT specific but
> > > > > there was no indication of which GT it was related to (and that is
> > > > > impossible to work out from the other fields that are given). So add a
> > > > > GT section heading. Also include the tile id of the GT, because again
> > > > > significant information.
> > > > > 
> > > > > Lastly, drop a couple of unnecessary line feeds within sections.
> > > > > 
> > > > > v2: Add GT section heading, add tile id to device section.
> > > > > 
> > > > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> > > > > Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
> > > > > ---
> > > > >    drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
> > > > >    drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
> > > > >    drivers/gpu/drm/xe/xe_device.c            | 1 +
> > > > >    drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
> > > > >    drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
> > > > >    5 files changed, 9 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > index d23719d5c2a3..2690f1d1cde4 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
> > > > >    	drm_printf(&p, "Process: %s\n", ss->process_name);
> > > > >    	xe_device_snapshot_print(xe, &p);
> > > > >    
> > > > > +	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
> > > > > +	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
> > > > > +
> > > > >    	drm_puts(&p, "\n**** GuC CT ****\n");
> > > > >    	xe_guc_ct_snapshot_print(ss->ct, &p);
> > > > > +
> > > > > +	drm_puts(&p, "\n**** Contexts ****\n");
> > > > >    	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
> > > > This broke Mesa parser!
> > > > It can't now parse the exec_queue context because it was expected to be on the '**** GuC CT ****' section.
> > > Then the mesa parse needs to be updated. That was clearly a bug - exec
> > > queue contexts are absolutely not GuC CT data and should not be in the
> > > GuC CT section.
> > Don't matter if it is a bug or not, it broke the parser.
> > If this is not reverted we will have older Kernel versions that don't work with newer Mesa and newer Kernel versions that don't with old Mesa.
> Debug tools cannot count as UAPI that must never change.

That is not my understanding from previous threads.

Imagine that a big customer files a bug with us and attaches the devcoredump from an older kernel version.
The devcoredump parser will not work. If the developer is aware of this "contract" break, he can check out an older UMD version, build it and then parse
the devcoredump, then check out the main/master branch again and work on the fix... Not viable at all.

At the very least, the UMD teams should be notified. At the moment, Mesa debugging is blocked because of these patches.

> 
> The devcoredump contains much information that is essentially the 
> internals of the kernel. It is going to change. That is about the only 
> guarantee that we can make about it. And saying that we must 
> intentionally break the output of a developer only debug feature in 
> order to support older mesa is plain wrong. End users do not care about 
> debug tools. All user applications will still work just perfectly.
> 
> We can start adding version numbers to the devcoredump format if we 
> really need to. But that was already shot down as a bad idea. It is 
> debug information and not UAPI. So version incompatibilities are 
> expected from time to time.
> 
> John.
> 
> 
> > 
> > > John.
> > > 
> > > > >    
> > > > >    	drm_puts(&p, "\n**** Job ****\n");
> > > > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > > > index 440d05d77a5a..3cc2f095fdfb 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > > > @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
> > > > >    	/* GuC snapshots */
> > > > >    	/** @ct: GuC CT snapshot */
> > > > >    	struct xe_guc_ct_snapshot *ct;
> > > > > -	/** @ge: Guc Engine snapshot */
> > > > > +
> > > > > +	/** @ge: GuC Submission Engine snapshot */
> > > > >    	struct xe_guc_submit_exec_queue_snapshot *ge;
> > > > >    
> > > > >    	/** @hwe: HW Engine snapshot array */
> > > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > > > index 09a7ad830e69..030cf703e970 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > > @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
> > > > >    
> > > > >    	for_each_gt(gt, xe, id) {
> > > > >    		drm_printf(p, "GT id: %u\n", id);
> > > > > +		drm_printf(p, "\tTile: %u\n", gt->tile->id);
> > > > >    		drm_printf(p, "\tType: %s\n",
> > > > >    			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
> > > > >    		drm_printf(p, "\tIP ver: %u.%u.%u\n",
> > > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > > index 0ac4a19ec9cc..8690df699170 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > > @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
> > > > >    	if (!snapshot)
> > > > >    		return;
> > > > >    
> > > > > -	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
> > > > > +	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
> > > > >    	drm_printf(p, "\tName: %s\n", snapshot->name);
> > > > >    	drm_printf(p, "\tClass: %d\n", snapshot->class);
> > > > >    	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
> > > > > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > index ea6d9ef7fab6..6c9c27304cdc 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
> > > > >    	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
> > > > >    		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
> > > > >    			   snapshot->reg.rcu_mode);
> > > > > -	drm_puts(p, "\n");
> > > > >    }
> > > > >    
> > > > >    /**
> 



* Re: [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
  2024-12-12 20:30           ` Souza, Jose
@ 2024-12-12 20:38             ` John Harrison
  0 siblings, 0 replies; 23+ messages in thread
From: John Harrison @ 2024-12-12 20:38 UTC (permalink / raw)
  To: Souza, Jose, Intel-Xe@Lists.FreeDesktop.Org, Vivi, Rodrigo,
	De Marchi, Lucas, Matthew Brost
  Cc: Filipchuk, Julia

+Matthew B


On 12/12/2024 12:30, Souza, Jose wrote:
> On Thu, 2024-12-12 at 12:06 -0800, John Harrison wrote:
>> On 12/12/2024 11:31, Souza, Jose wrote:
>>> On Thu, 2024-12-12 at 10:59 -0800, John Harrison wrote:
>>>> On 12/12/2024 10:17, Souza, Jose wrote:
>>>>> On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison@Intel.com wrote:
>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>
>>>>>> The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
>>>>>> is definitely not a GuC CT thing. So give it its own section heading.
>>>>>> The snapshot itself is really a capture of the submission backend's
>>>>>> internal state. Although all it currently prints out is the submission
>>>>>> contexts. So label it as 'Contexts'. If more general state is added
>>>>>> later then it could be change to 'Submission backend' or some such.
>>>>>>
>>>>>> Further, everything from the GuC CT section onwards is GT specific but
>>>>>> there was no indication of which GT it was related to (and that is
>>>>>> impossible to work out from the other fields that are given). So add a
>>>>>> GT section heading. Also include the tile id of the GT, because again
>>>>>> significant information.
>>>>>>
>>>>>> Lastly, drop a couple of unnecessary line feeds within sections.
>>>>>>
>>>>>> v2: Add GT section heading, add tile id to device section.
>>>>>>
>>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>>> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/xe/xe_devcoredump.c       | 5 +++++
>>>>>>     drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
>>>>>>     drivers/gpu/drm/xe/xe_device.c            | 1 +
>>>>>>     drivers/gpu/drm/xe/xe_guc_submit.c        | 2 +-
>>>>>>     drivers/gpu/drm/xe/xe_hw_engine.c         | 1 -
>>>>>>     5 files changed, 9 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
>>>>>> index d23719d5c2a3..2690f1d1cde4 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
>>>>>> @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
>>>>>>     	drm_printf(&p, "Process: %s\n", ss->process_name);
>>>>>>     	xe_device_snapshot_print(xe, &p);
>>>>>>     
>>>>>> +	drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
>>>>>> +	drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
>>>>>> +
>>>>>>     	drm_puts(&p, "\n**** GuC CT ****\n");
>>>>>>     	xe_guc_ct_snapshot_print(ss->ct, &p);
>>>>>> +
>>>>>> +	drm_puts(&p, "\n**** Contexts ****\n");
>>>>>>     	xe_guc_exec_queue_snapshot_print(ss->ge, &p);
>>>>> This broke Mesa parser!
>>>>> It can't now parse the exec_queue context because it was expected to be on the '**** GuC CT ****' section.
>>>> Then the mesa parse needs to be updated. That was clearly a bug - exec
>>>> queue contexts are absolutely not GuC CT data and should not be in the
>>>> GuC CT section.
>>> Don't matter if it is a bug or not, it broke the parser.
>>> If this is not reverted we will have older Kernel versions that don't work with newer Mesa and newer Kernel versions that don't with old Mesa.
>> Debug tools cannot count as UAPI that must never change.
> That is not my understating from previous threads.
>
> Imagine that a big customer files a bug with us and attaches the devcoredump from an older kernel version.
> The devcoredump parser will not work. If the developer is aware of this "contract" break, they can check out an older UMD version, build it and then parse
> the devcoredump, then check out the main/master branch again and work on the fix... Not viable at all.
>
> At least the UMD teams should be notified. At the moment Mesa debugging is blocked because of these patches.
The alternative is we can never update the devcoredump output to add new 
information, remove old entries that no longer make sense due to driver 
re-work, etc.? That is even less viable.

For this particular issue, the fix is presumably trivial. The Mesa tool 
can be updated to look under either the old incorrect section header (I'm 
assuming the exec queue info was actually just not in any section at all 
previously, because it was really not ever part of the GuC CT info) or 
the new correct one. Then the new build will work on either the current 
kernel or the old one.
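
As a rough illustration of that kind of fix, here is a minimal sketch in C.
It is not the actual Mesa parser code: the enum, the helper names and
find_exec_queue() are all hypothetical. The only things taken from the patch
are the '**** GuC CT ****' and '**** Contexts ****' headers and the
'GuC ID:' line that starts the exec queue snapshot. The idea is simply to
track the current section and accept the exec queue info under either header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical section identifiers; not taken from any real parser. */
enum dump_section {
	SECTION_OTHER,
	SECTION_GUC_CT,   /* "**** GuC CT ****": where older kernels put it */
	SECTION_CONTEXTS, /* "**** Contexts ****": added by this patch */
};

/* True if the len-byte line exactly equals the NUL-terminated string s. */
static bool line_is(const char *line, size_t len, const char *s)
{
	return strlen(s) == len && memcmp(line, s, len) == 0;
}

/* True if the len-byte line begins with the NUL-terminated prefix. */
static bool line_starts_with(const char *line, size_t len, const char *prefix)
{
	size_t n = strlen(prefix);

	return len >= n && memcmp(line, prefix, n) == 0;
}

/*
 * Walk the dump line by line, tracking the current "**** ... ****" section.
 * Returns true at the first "GuC ID:" line found under either the old or
 * the new header, and reports which one through found_in.
 */
static bool find_exec_queue(const char *dump, enum dump_section *found_in)
{
	enum dump_section cur = SECTION_OTHER;
	const char *line = dump;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (line_starts_with(line, len, "**** ")) {
			if (line_is(line, len, "**** GuC CT ****"))
				cur = SECTION_GUC_CT;
			else if (line_is(line, len, "**** Contexts ****"))
				cur = SECTION_CONTEXTS;
			else
				cur = SECTION_OTHER;
		} else if (line_starts_with(line, len, "GuC ID:") &&
			   (cur == SECTION_GUC_CT || cur == SECTION_CONTEXTS)) {
			*found_in = cur;
			return true;
		}

		/* Step past the newline, or stop at the terminating NUL. */
		line = eol ? eol + 1 : line + len;
	}

	return false;
}
```

A tool built along these lines would read dumps from kernels both before and
after this patch, without needing to know which kernel produced them.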

Going forwards, as I said, we can start adding format version numbers 
but my memory is that was argued against.

John.

>
>> The devcoredump contains much information that is essentially the
>> internals of the kernel. It is going to change. That is about the only
>> guarantee that we can make about it. And saying that we must
>> intentionally break the output of a developer only debug feature in
>> order to support older mesa is plain wrong. End users do not care about
>> debug tools. All user applications will still work just perfectly.
>>
>> We can start adding version numbers to the devcoredump format if we
>> really need to. But that was already shot down as a bad idea. It is
>> debug information and not UAPI. So version incompatibilities are
>> expected from time to time.
>>
>> John.
>>
>>
>>>> John.
>>>>
>>>>>>     
>>>>>>     	drm_puts(&p, "\n**** Job ****\n");
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>>>> index 440d05d77a5a..3cc2f095fdfb 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>>>> +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
>>>>>> @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
>>>>>>     	/* GuC snapshots */
>>>>>>     	/** @ct: GuC CT snapshot */
>>>>>>     	struct xe_guc_ct_snapshot *ct;
>>>>>> -	/** @ge: Guc Engine snapshot */
>>>>>> +
>>>>>> +	/** @ge: GuC Submission Engine snapshot */
>>>>>>     	struct xe_guc_submit_exec_queue_snapshot *ge;
>>>>>>     
>>>>>>     	/** @hwe: HW Engine snapshot array */
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>>>>> index 09a7ad830e69..030cf703e970 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>>>>> @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
>>>>>>     
>>>>>>     	for_each_gt(gt, xe, id) {
>>>>>>     		drm_printf(p, "GT id: %u\n", id);
>>>>>> +		drm_printf(p, "\tTile: %u\n", gt->tile->id);
>>>>>>     		drm_printf(p, "\tType: %s\n",
>>>>>>     			   gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
>>>>>>     		drm_printf(p, "\tIP ver: %u.%u.%u\n",
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>>>>>> index 0ac4a19ec9cc..8690df699170 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>>>>>> @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
>>>>>>     	if (!snapshot)
>>>>>>     		return;
>>>>>>     
>>>>>> -	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
>>>>>> +	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
>>>>>>     	drm_printf(p, "\tName: %s\n", snapshot->name);
>>>>>>     	drm_printf(p, "\tClass: %d\n", snapshot->class);
>>>>>>     	drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
>>>>>> index ea6d9ef7fab6..6c9c27304cdc 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
>>>>>> @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
>>>>>>     	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
>>>>>>     		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
>>>>>>     			   snapshot->reg.rcu_mode);
>>>>>> -	drm_puts(p, "\n");
>>>>>>     }
>>>>>>     
>>>>>>     /**



end of thread, other threads:[~2024-12-12 20:38 UTC | newest]

Thread overview: 23+ messages
2024-10-02 21:14 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 01/11] drm/xe/guc: Remove spurious line feed in debug print John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 02/11] drm/xe/devcoredump: Use drm_puts and already cached local variables John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 04/11] drm/xe/devcoredump: Add ASCII85 dump helper function John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 05/11] drm/xe/guc: Copy GuC log prior to dumping John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 06/11] drm/xe/guc: Use a two stage dump for GuC logs and add more info John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 07/11] drm/print: Introduce drm_line_printer John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 08/11] drm/xe/guc: Dead CT helper John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 09/11] drm/xe/guc: Dump entire CTB on errors John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 10/11] drm/xe/guc: Add GuC log to devcoredump captures John.C.Harrison
2024-10-02 21:14 ` [PATCH v9 11/11] drm/xe/guc: Add a helper function for dumping GuC log to dmesg John.C.Harrison
2024-10-02 21:20 ` ✓ CI.Patch_applied: success for drm/xe/guc: Improve GuC log dumping and add to devcoredump (rev5) Patchwork
2024-10-02 21:21 ` ✗ CI.checkpatch: warning " Patchwork
2024-10-02 21:22 ` ✓ CI.KUnit: success " Patchwork
2024-10-02 21:27 ` ✗ CI.Build: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2024-10-03  0:46 [PATCH v9 00/11] drm/xe/guc: Improve GuC log dumping and add to devcoredump John.C.Harrison
2024-10-03  0:46 ` [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info John.C.Harrison
2024-12-12 18:17   ` Souza, Jose
2024-12-12 18:59     ` John Harrison
2024-12-12 19:31       ` Souza, Jose
2024-12-12 20:06         ` John Harrison
2024-12-12 20:30           ` Souza, Jose
2024-12-12 20:38             ` John Harrison
