Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: alan.previn.teres.alexis@intel.com, zhanjun.dong@intel.com,
	rodrigo.vivi@intel.com
Subject: [PATCH 5/7] drm/xe: Add exec queue param to devcoredump
Date: Fri,  8 Nov 2024 09:43:10 -0800
Message-ID: <20241108174312.272792-6-matthew.brost@intel.com>
In-Reply-To: <20241108174312.272792-1-matthew.brost@intel.com>

A job may be unavailable at capture time (e.g., in LR mode) while an
exec queue is still available. Add an exec queue parameter to
xe_devcoredump() for such use cases; the job argument may now be NULL.

Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_devcoredump.c | 15 +++++++++------
 drivers/gpu/drm/xe/xe_devcoredump.h |  6 ++++--
 drivers/gpu/drm/xe/xe_guc_submit.c  |  2 +-
 3 files changed, 14 insertions(+), 9 deletions(-)
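
Note (illustration only, not part of the patch): the calling convention
this change introduces can be sketched in isolation as below. The mock_*
types and mock_devcoredump_snapshot() are hypothetical stand-ins, not the
real xe structures or functions; the sketch only assumes what the diff
shows, namely that the exec queue is mandatory and the job is optional.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for xe_exec_queue / xe_sched_job. */
struct mock_exec_queue {
	int id;
};

struct mock_sched_job {
	struct mock_exec_queue *q;
	int seqno;
};

/* Hypothetical, heavily reduced snapshot. */
struct mock_snapshot {
	int queue_id;
	bool has_job;
	int job_seqno;
};

/*
 * Queue state is always captured from @q. Job state is captured only
 * when @job is non-NULL, mirroring the "if (job)" checks in the diff.
 */
void mock_devcoredump_snapshot(struct mock_snapshot *ss,
			       struct mock_exec_queue *q,
			       struct mock_sched_job *job)
{
	assert(ss != NULL && q != NULL);

	ss->queue_id = q->id;
	ss->has_job = false;
	ss->job_seqno = 0;

	if (job) {
		ss->has_job = true;
		ss->job_seqno = job->seqno;
	}
}
```

A caller that has a job passes both (q, job); a caller without one, such
as an LR-mode path, would pass (q, NULL) and still get the queue state.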

diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index d3570d3d573c..c32cbb46ef8c 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -238,10 +238,10 @@ static void xe_devcoredump_free(void *data)
 }
 
 static void devcoredump_snapshot(struct xe_devcoredump *coredump,
+				 struct xe_exec_queue *q,
 				 struct xe_sched_job *job)
 {
 	struct xe_devcoredump_snapshot *ss = &coredump->snapshot;
-	struct xe_exec_queue *q = job->q;
 	struct xe_guc *guc = exec_queue_to_guc(q);
 	u32 adj_logical_mask = q->logical_mask;
 	u32 width_mask = (0x1 << q->width) - 1;
@@ -278,10 +278,12 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
 	ss->guc.log = xe_guc_log_snapshot_capture(&guc->log, true);
 	ss->guc.ct = xe_guc_ct_snapshot_capture(&guc->ct);
 	ss->ge = xe_guc_exec_queue_snapshot_capture(q);
-	ss->job = xe_sched_job_snapshot_capture(job);
+	if (job)
+		ss->job = xe_sched_job_snapshot_capture(job);
 	ss->vm = xe_vm_snapshot_capture(q->vm);
 
-	xe_engine_snapshot_capture_for_job(job);
+	if (job)
+		xe_engine_snapshot_capture_for_job(job);
 
 	queue_work(system_unbound_wq, &ss->work);
 
@@ -291,15 +293,16 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
 
 /**
  * xe_devcoredump - Take the required snapshots and initialize coredump device.
+ * @q: The faulty xe_exec_queue, where the issue was detected.
  * @job: The faulty xe_sched_job, where the issue was detected.
  *
  * This function should be called at the crash time within the serialized
  * gt_reset. It is skipped if we still have the core dump device available
  * with the information of the 'first' snapshot.
  */
-void xe_devcoredump(struct xe_sched_job *job)
+void xe_devcoredump(struct xe_exec_queue *q, struct xe_sched_job *job)
 {
-	struct xe_device *xe = gt_to_xe(job->q->gt);
+	struct xe_device *xe = gt_to_xe(q->gt);
 	struct xe_devcoredump *coredump = &xe->devcoredump;
 
 	if (coredump->captured) {
@@ -308,7 +311,7 @@ void xe_devcoredump(struct xe_sched_job *job)
 	}
 
 	coredump->captured = true;
-	devcoredump_snapshot(coredump, job);
+	devcoredump_snapshot(coredump, q, job);
 
 	drm_info(&xe->drm, "Xe device coredump has been created\n");
 	drm_info(&xe->drm, "Check your /sys/class/drm/card%d/device/devcoredump/data\n",
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.h b/drivers/gpu/drm/xe/xe_devcoredump.h
index a4eebc285fc8..c04a534e3384 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.h
+++ b/drivers/gpu/drm/xe/xe_devcoredump.h
@@ -10,13 +10,15 @@
 
 struct drm_printer;
 struct xe_device;
+struct xe_exec_queue;
 struct xe_sched_job;
 
 #ifdef CONFIG_DEV_COREDUMP
-void xe_devcoredump(struct xe_sched_job *job);
+void xe_devcoredump(struct xe_exec_queue *q, struct xe_sched_job *job);
 int xe_devcoredump_init(struct xe_device *xe);
 #else
-static inline void xe_devcoredump(struct xe_sched_job *job)
+static inline void xe_devcoredump(struct xe_exec_queue *q,
+				  struct xe_sched_job *job)
 {
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 2cf4750bc24d..974c7af7064d 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1162,7 +1162,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
 	trace_xe_sched_job_timedout(job);
 
 	if (!exec_queue_killed(q))
-		xe_devcoredump(job);
+		xe_devcoredump(q, job);
 
 	/*
 	 * Kernel jobs should never fail, nor should VM jobs if they do
-- 
2.34.1

