Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Shekhar Chauhan <shekhar.chauhan@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <rodrigo.vivi@intel.com>
Subject: Re: [PATCH] Decouple devcoredump from accessing core driver structures
Date: Tue, 10 Feb 2026 16:17:59 -0800	[thread overview]
Message-ID: <aYvKt64AeBC6Nvj6@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <43de4a96-2aaa-4dc3-8c8d-0bce63385536@intel.com>

On Tue, Feb 10, 2026 at 10:48:09PM +0530, Shekhar Chauhan wrote:
> The patch should have a 'drm/xe' prefix, missed it somehow. Will add it in
> the next revision other feedback for the patch is provided!
> 
> -shekhar
> 
> On 2/10/2026 12:59, Shekhar Chauhan wrote:
> > During devcoredump, print should not access any live gt pointers.

Can you explain 'why'?

> > Decouple the snapshot_print from accessing gt pointer by storing the
> > captured data in a separate list and then print data from that list.
> > 
> > Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_guc_capture.c     | 25 +++++++------------------
> >   drivers/gpu/drm/xe/xe_hw_engine.c       | 20 ++++++++++++++++++++
> >   drivers/gpu/drm/xe/xe_hw_engine_types.h |  7 +++++++
> >   3 files changed, 34 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c
> > index 2f5816c78fba..0555fb07c499 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_capture.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_capture.c
> > @@ -1695,8 +1695,7 @@ static void
> >   snapshot_print_by_list_order(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p,
> >   			     u32 type, const struct __guc_mmio_reg_descr_group *list)
> >   {
> > -	struct xe_gt *gt = snapshot->hwe->gt;
> > -	struct xe_device *xe = gt_to_xe(gt);
> > +	struct xe_device *xe = snapshot->xe;

If an Xe object is available - so is a 'hwe' and 'gt'. All of these
persistent objects created during driver load and the memory is free'd
upin driver unload. i.e., It is not possible for a GT or HWE to invalid
but to have a valid Xe object unless I'm missing something. 

> >   	struct xe_devcoredump *devcoredump = &xe->devcoredump;
> >   	struct xe_devcoredump_snapshot *devcore_snapshot = &devcoredump->snapshot;
> >   	struct gcap_reg_list_info *reginfo = NULL;
> > @@ -1806,7 +1805,6 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm
> >   	};
> >   	int type;
> >   	const struct __guc_mmio_reg_descr_group *list;
> > -	enum guc_capture_list_class_type capture_class;
> >   	struct xe_gt *gt;
> >   	struct xe_device *xe;
> > @@ -1826,8 +1824,6 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm
> >   	xe_gt_assert(gt, snapshot->hwe);
> > -	capture_class = xe_engine_class_to_guc_capture_class(snapshot->hwe->class);
> > -
> >   	drm_printf(p, "%s (physical), logical instance=%d\n",
> >   		   snapshot->name ? snapshot->name : "",
> >   		   snapshot->logical_instance);
> > @@ -1841,23 +1837,16 @@ void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm
> >   		   str_yes_no(snapshot->kernel_reserved));
> >   	for (type = GUC_STATE_CAPTURE_TYPE_GLOBAL; type < GUC_STATE_CAPTURE_TYPE_MAX; type++) {
> > -		/*
> > -		 * FIXME: During devcoredump print we should avoid accessing the
> > -		 * driver pointers for gt or engine. Printing should be done only
> > -		 * using the snapshot captured. Here we are accessing the gt
> > -		 * pointer. It should be fixed.
> > -		 */
> > -		list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF, type,
> > -							capture_class, false);

Again unless I'm completely missing something this is accessing driver
persistent data which seems fine to access during
xe_engine_snapshot_print... What we shouldn't do during print is access
the hardware which is not what this is doing.

Matt

> > +
> > +		list = snapshot->list[type];
> > +
> >   		snapshot_print_by_list_order(snapshot, p, type, list);
> >   	}
> > -	if (capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE) {
> > -		list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF,
> > -							GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS,
> > -							capture_class, true);
> > +	if (snapshot->render_compute_list) {
> > +
> >   		snapshot_print_by_list_order(snapshot, p, GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS,
> > -					     list);
> > +					     snapshot->render_compute_list);
> >   	}
> >   	drm_puts(p, "\n");
> > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> > index 4d3ee5226e3a..a246b4f860e6 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > @@ -923,6 +923,8 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q)
> >   {
> >   	struct xe_hw_engine_snapshot *snapshot;
> >   	struct __guc_capture_parsed_output *node;
> > +	enum guc_capture_list_class_type capture_class;
> > +	int type;
> >   	if (!xe_hw_engine_is_valid(hwe))
> >   		return NULL;
> > @@ -941,6 +943,24 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_exec_queue *q)
> >   	snapshot->mmio_base = hwe->mmio_base;
> >   	snapshot->kernel_reserved = xe_hw_engine_is_reserved(hwe);
> > +	snapshot->xe = gt_to_xe(hwe->gt);
> > +
> > +	capture_class = xe_engine_class_to_guc_capture_class(hwe->class);
> > +
> > +	/* Capture the register list */
> > +	for (type = GUC_STATE_CAPTURE_TYPE_GLOBAL; type < GUC_STATE_CAPTURE_TYPE_MAX; type++) {
> > +		snapshot->list[type] = xe_guc_capture_get_reg_desc_list(hwe->gt,
> > +				GUC_CAPTURE_LIST_INDEX_PF, type, capture_class, false);
> > +	}
> > +
> > +	/* Capture the register list from RENDER / COMPUTE */
> > +	if (capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE) {
> > +		snapshot->render_compute_list = xe_guc_capture_get_reg_desc_list(hwe->gt,
> > +				GUC_CAPTURE_LIST_INDEX_PF, GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS,
> > +				capture_class, true);
> > +	}
> > +
> > +
> >   	/* no more VF accessible data below this point */
> >   	if (IS_SRIOV_VF(gt_to_xe(hwe->gt)))
> >   		return snapshot;
> > diff --git a/drivers/gpu/drm/xe/xe_hw_engine_types.h b/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > index e4191a7a2c31..87ad1e68fb47 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > +++ b/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > @@ -6,6 +6,7 @@
> >   #ifndef _XE_HW_ENGINE_TYPES_H_
> >   #define _XE_HW_ENGINE_TYPES_H_
> > +#include "abi/guc_capture_abi.h"
> >   #include "xe_force_wake_types.h"
> >   #include "xe_lrc_types.h"
> >   #include "xe_reg_sr_types.h"
> > @@ -167,6 +168,12 @@ struct xe_hw_engine_snapshot {
> >   	char *name;
> >   	/** @hwe: hw engine */
> >   	struct xe_hw_engine *hwe;
> > +	/** @xe: device pointer */
> > +	struct xe_device *xe;
> > +	/** @list: register list */
> > +	const struct __guc_mmio_reg_descr_group *list[GUC_STATE_CAPTURE_TYPE_MAX];
> > +	/** @render_compute_list: render/compute list */
> > +	const struct __guc_mmio_reg_descr_group *render_compute_list;
> >   	/** @logical_instance: logical instance of this hw engine */
> >   	u16 logical_instance;
> >   	/** @forcewake: Force Wake information snapshot */

  reply	other threads:[~2026-02-11  0:18 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-10  7:29 [PATCH] Decouple devcoredump from accessing core driver structures Shekhar Chauhan
2026-02-10  7:34 ` ✗ CI.checkpatch: warning for " Patchwork
2026-02-10  7:35 ` ✓ CI.KUnit: success " Patchwork
2026-02-10  8:09 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-10 12:33 ` ✓ Xe.CI.FULL: " Patchwork
2026-02-10 17:18 ` [PATCH] " Shekhar Chauhan
2026-02-11  0:17   ` Matthew Brost [this message]
2026-02-12  8:16     ` Shekhar Chauhan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aYvKt64AeBC6Nvj6@lstrano-desk.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=rodrigo.vivi@intel.com \
    --cc=shekhar.chauhan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox