Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Souza, Jose" <jose.souza@intel.com>
To: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Dong,  Zhanjun" <zhanjun.dong@intel.com>,
	"Roper, Matthew D" <matthew.d.roper@intel.com>
Subject: Re: [PATCH v6 3/3] drm/xe: Add INSTDONE registers to devcoredump
Date: Tue, 23 Apr 2024 14:13:19 +0000	[thread overview]
Message-ID: <ca39ac515092e3ef178236ebde834afeb29b6b79.camel@intel.com> (raw)
In-Reply-To: <ZicNtTSq3cH6L4PM@intel.com>

On Mon, 2024-04-22 at 21:24 -0400, Rodrigo Vivi wrote:
> On Fri, Apr 05, 2024 at 11:23:26AM -0700, José Roberto de Souza wrote:
> > This registers contains important information that can help with debug
> > of GPU hangs.
> > 
> > While at it also fixing the double line jump at the end of engine
> > registers for CCS engines.
> > 
> > v2:
> > - print other INSTDONE registers
> > 
> > v3:
> > - add for_each_geometry/compute_dss()
> > 
> > v4:
> > - print one slice_common_instdone per glice in DG2+
> > 
> > v5:
> > - rename registers prefix from DG2 to XEHPG (Zhanjun)
> > 
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > Cc: Zhanjun Dong <zhanjun.dong@intel.com>
> > Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> > ---
> >  drivers/gpu/drm/xe/regs/xe_engine_regs.h |   1 +
> >  drivers/gpu/drm/xe/regs/xe_gt_regs.h     |  13 +++
> >  drivers/gpu/drm/xe/xe_hw_engine.c        | 128 +++++++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_hw_engine_types.h  |  16 +++
> >  4 files changed, 158 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
> > index a08528d9c76b2..98d830013e96b 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
> > @@ -65,6 +65,7 @@
> >  #define RING_ACTHD_UDW(base)			XE_REG((base) + 0x5c)
> >  #define RING_DMA_FADD_UDW(base)			XE_REG((base) + 0x60)
> >  #define RING_IPEHR(base)			XE_REG((base) + 0x68)
> > +#define RING_INSTDONE(base)			XE_REG((base) + 0x6c)
> >  #define RING_ACTHD(base)			XE_REG((base) + 0x74)
> >  #define RING_DMA_FADD(base)			XE_REG((base) + 0x78)
> >  #define RING_HWS_PGA(base)			XE_REG((base) + 0x80)
> > diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > index 6617c86a096b6..04f9077bacc57 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > @@ -94,6 +94,8 @@
> >  #define   FF_MODE2_TDS_TIMER_MASK		REG_GENMASK(23, 16)
> >  #define   FF_MODE2_TDS_TIMER_128		REG_FIELD_PREP(FF_MODE2_TDS_TIMER_MASK, 4)
> >  
> > +#define XEHPG_INSTDONE_GEOM_SVGUNIT		XE_REG_MCR(0x666c)
> > +
> >  #define CACHE_MODE_1				XE_REG(0x7004, XE_REG_OPTION_MASKED)
> >  #define   MSAA_OPTIMIZATION_REDUC_DISABLE	REG_BIT(11)
> >  
> > @@ -111,6 +113,14 @@
> >  #define   FLSH_IGNORES_PSD			REG_BIT(10)
> >  #define   FD_END_COLLECT			REG_BIT(5)
> >  
> > +#define SC_INSTDONE				XE_REG(0x7100)
> > +#define SC_INSTDONE_EXTRA			XE_REG(0x7104)
> > +#define SC_INSTDONE_EXTRA2			XE_REG(0x7108)
> > +
> > +#define XEHPG_SC_INSTDONE			XE_REG_MCR(0x7100)
> > +#define XEHPG_SC_INSTDONE_EXTRA			XE_REG_MCR(0x7104)
> 
> I believe you have an alignment problem here    ^

odd, I check it and looks good in editor also dim checkpatch is not complaining here.

> 
> > +#define XEHPG_SC_INSTDONE_EXTRA2		XE_REG_MCR(0x7108)
> > +
> >  #define COMMON_SLICE_CHICKEN4			XE_REG(0x7300, XE_REG_OPTION_MASKED)
> >  #define   DISABLE_TDC_LOAD_BALANCING_CALC	REG_BIT(6)
> >  
> > @@ -327,6 +337,9 @@
> >  #define HALF_SLICE_CHICKEN5			XE_REG_MCR(0xe188, XE_REG_OPTION_MASKED)
> >  #define   DISABLE_SAMPLE_G_PERFORMANCE		REG_BIT(0)
> >  
> > +#define SAMPLER_INSTDONE			XE_REG_MCR(0xe160)
> > +#define ROW_INSTDONE				XE_REG_MCR(0xe164)
> > +
> >  #define SAMPLER_MODE				XE_REG_MCR(0xe18c, XE_REG_OPTION_MASKED)
> >  #define   ENABLE_SMALLPL			REG_BIT(15)
> >  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
> > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> > index 669753f0a9c50..1864eeedf95fd 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > @@ -18,6 +18,7 @@
> >  #include "xe_gt.h"
> >  #include "xe_gt_ccs_mode.h"
> >  #include "xe_gt_printk.h"
> > +#include "xe_gt_mcr.h"
> >  #include "xe_gt_topology.h"
> >  #include "xe_hw_fence.h"
> >  #include "xe_irq.h"
> > @@ -770,6 +771,57 @@ void xe_hw_engine_handle_irq(struct xe_hw_engine *hwe, u16 intr_vec)
> >  		xe_hw_fence_irq_run(hwe->fence_irq);
> >  }
> >  
> > +static bool
> > +is_slice_common_per_gslice(struct xe_device *xe)
> > +{
> > +	return GRAPHICS_VERx100(xe) >= 1255;
> > +}
> > +
> > +static void
> > +xe_he_engine_snapshot_instdone_capture(struct xe_hw_engine *hwe,
> > +				       struct xe_hw_engine_snapshot *snapshot)
> > +{
> > +	struct xe_gt *gt = hwe->gt;
> > +	struct xe_device *xe = gt_to_xe(gt);
> > +	unsigned int dss;
> > +	u16 group, instance;
> > +
> > +	snapshot->reg.instdone.ring = hw_engine_mmio_read32(hwe, RING_INSTDONE(0));
> > +
> > +	if (snapshot->hwe->class != XE_ENGINE_CLASS_RENDER)
> > +		return;
> > +
> > +	if (is_slice_common_per_gslice(xe) == false) {
> 
> do you really need this if? I don't see equivalence in i915.

i915 is missing it for DG2+.
Bspec: 66534, range: 0x7000 to 0x7FFF. all registers in this range has steering.

> 
> > +		snapshot->reg.instdone.slice_common[0] =
> > +			xe_mmio_read32(gt, SC_INSTDONE);
> > +		snapshot->reg.instdone.slice_common_extra[0] =
> > +			xe_mmio_read32(gt, SC_INSTDONE_EXTRA);
> > +		snapshot->reg.instdone.slice_common_extra2[0] =
> > +			xe_mmio_read32(gt, SC_INSTDONE_EXTRA2);
> 
> do you really need to allocate the MAX array only for these?
> perhaps unify them in one array like i915?

I guess i915 is not doing the same because it is not doing one read of each register for each dss in DG2+.
But yeah for gen12 we could allocate just one entry for slice_common, slice_common_extra and slice_common_extra2 but I'm allocating
XE_MAX_DSS_FUSE_BITS just to avoid more complexity in  the allocation block... I can change if you think it is worthy.

> 
> > +	} else {
> > +		for_each_geometry_dss(dss, gt, group, instance) {
> > +			snapshot->reg.instdone.slice_common[dss] =
> > +				xe_gt_mcr_unicast_read(gt, XEHPG_SC_INSTDONE, group, instance);
> > +			snapshot->reg.instdone.slice_common_extra[dss] =
> > +				xe_gt_mcr_unicast_read(gt, XEHPG_SC_INSTDONE_EXTRA, group, instance);
> > +			snapshot->reg.instdone.slice_common_extra2[dss] =
> > +				xe_gt_mcr_unicast_read(gt, XEHPG_SC_INSTDONE_EXTRA2, group, instance);
> > +		}
> > +	}
> > +
> > +	for_each_geometry_dss(dss, gt, group, instance) {
> > +		snapshot->reg.instdone.sampler[dss] =
> > +			xe_gt_mcr_unicast_read(gt, SAMPLER_INSTDONE, group, instance);
> > +		snapshot->reg.instdone.row[dss] =
> > +			xe_gt_mcr_unicast_read(gt, ROW_INSTDONE, group, instance);
> > +
> > +		if (GRAPHICS_VERx100(xe) >= 1255)
> > +			snapshot->reg.instdone.geom_svg[dss] =
> > +				xe_gt_mcr_unicast_read(gt, XEHPG_INSTDONE_GEOM_SVGUNIT,
> > +						       group, instance);
> > +	}
> > +}
> > +
> >  /**
> >   * xe_hw_engine_snapshot_capture - Take a quick snapshot of the HW Engine.
> >   * @hwe: Xe HW Engine.
> > @@ -784,6 +836,7 @@ struct xe_hw_engine_snapshot *
> >  xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe)
> >  {
> >  	struct xe_hw_engine_snapshot *snapshot;
> > +	size_t len;
> >  	u64 val;
> >  
> >  	if (!xe_hw_engine_is_valid(hwe))
> > @@ -794,6 +847,28 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe)
> >  	if (!snapshot)
> >  		return NULL;
> >  
> > +	/* Because XE_MAX_DSS_FUSE_BITS is defined in xe_gt_types.h and it
> > +	 * includes xe_hw_engine_types.h the length of this 3 registers can't be
> > +	 * set in struct xe_hw_engine_snapshot, so here doing additional
> > +	 * allocations.
> > +	 */
> > +	len = (XE_MAX_DSS_FUSE_BITS * sizeof(u32));
> > +	snapshot->reg.instdone.slice_common = kzalloc(len, GFP_ATOMIC);
> > +	snapshot->reg.instdone.slice_common_extra = kzalloc(len, GFP_ATOMIC);
> > +	snapshot->reg.instdone.slice_common_extra2 = kzalloc(len, GFP_ATOMIC);
> > +	snapshot->reg.instdone.sampler = kzalloc(len, GFP_ATOMIC);
> > +	snapshot->reg.instdone.row = kzalloc(len, GFP_ATOMIC);
> > +	snapshot->reg.instdone.geom_svg = kzalloc(len, GFP_ATOMIC);
> > +	if (!snapshot->reg.instdone.slice_common ||
> > +	    !snapshot->reg.instdone.slice_common_extra ||
> > +	    !snapshot->reg.instdone.slice_common_extra2 ||
> > +	    !snapshot->reg.instdone.sampler ||
> > +	    !snapshot->reg.instdone.row ||
> > +	    !snapshot->reg.instdone.geom_svg) {
> > +		xe_hw_engine_snapshot_free(snapshot);
> > +		return NULL;
> > +	}
> > +
> >  	snapshot->name = kstrdup(hwe->name, GFP_ATOMIC);
> >  	snapshot->hwe = hwe;
> >  	snapshot->logical_instance = hwe->logical_instance;
> > @@ -845,6 +920,7 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe)
> >  	snapshot->reg.ring_emr = hw_engine_mmio_read32(hwe, RING_EMR(0));
> >  	snapshot->reg.ring_eir = hw_engine_mmio_read32(hwe, RING_EIR(0));
> >  	snapshot->reg.ipehr = hw_engine_mmio_read32(hwe, RING_IPEHR(0));
> > +	xe_he_engine_snapshot_instdone_capture(hwe, snapshot);
> >  
> >  	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
> >  		snapshot->reg.rcu_mode = xe_mmio_read32(hwe->gt, RCU_MODE);
> > @@ -852,6 +928,49 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe)
> >  	return snapshot;
> >  }
> >  
> > +static void
> > +xe_hw_engine_snapshot_instdone_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p)
> > +{
> > +	struct xe_gt *gt = snapshot->hwe->gt;
> > +	struct xe_device *xe = gt_to_xe(gt);
> > +	u16 group, instance;
> > +	unsigned int dss;
> > +
> > +	drm_printf(p, "\tRING_INSTDONE: 0x%08x\n", snapshot->reg.instdone.ring);
> > +
> > +	if (snapshot->hwe->class != XE_ENGINE_CLASS_RENDER)
> > +		return;
> > +
> > +	if (is_slice_common_per_gslice(xe) == false) {
> > +		drm_printf(p, "\tSC_INSTDONE[0]: 0x%08x\n",
> > +			   snapshot->reg.instdone.slice_common[0]);
> > +		drm_printf(p, "\tSC_INSTDONE_EXTRA[0]: 0x%08x\n",
> > +			   snapshot->reg.instdone.slice_common_extra[0]);
> > +		drm_printf(p, "\tSC_INSTDONE_EXTRA2[0]: 0x%08x\n",
> > +			   snapshot->reg.instdone.slice_common_extra2[0]);
> > +	} else {
> > +		for_each_geometry_dss(dss, gt, group, instance) {
> > +			drm_printf(p, "\tSC_INSTDONE[%u]: 0x%08x\n", dss,
> > +				   snapshot->reg.instdone.slice_common[dss]);
> > +			drm_printf(p, "\tSC_INSTDONE_EXTRA[%u]: 0x%08x\n", dss,
> > +				   snapshot->reg.instdone.slice_common_extra[dss]);
> > +			drm_printf(p, "\tSC_INSTDONE_EXTRA2[%u]: 0x%08x\n", dss,
> > +				   snapshot->reg.instdone.slice_common_extra2[dss]);
> > +		}
> > +	}
> > +
> > +	for_each_geometry_dss(dss, gt, group, instance) {
> > +		drm_printf(p, "\tSAMPLER_INSTDONE[%u]: 0x%08x\n", dss,
> > +			   snapshot->reg.instdone.sampler[dss]);
> > +		drm_printf(p, "\tROW_INSTDONE[%u]: 0x%08x\n", dss,
> > +			   snapshot->reg.instdone.row[dss]);
> > +
> > +		if (GRAPHICS_VERx100(xe) >= 1255)
> > +			drm_printf(p, "\tINSTDONE_GEOM_SVGUNIT[%u]: 0x%08x\n",
> > +				   dss, snapshot->reg.instdone.geom_svg[dss]);
> > +	}
> > +}
> > +
> >  /**
> >   * xe_hw_engine_snapshot_print - Print out a given Xe HW Engine snapshot.
> >   * @snapshot: Xe HW Engine snapshot object.
> > @@ -891,9 +1010,12 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
> >  	drm_printf(p, "\tBBADDR: 0x%016llx\n", snapshot->reg.ring_bbaddr);
> >  	drm_printf(p, "\tDMA_FADDR: 0x%016llx\n", snapshot->reg.ring_dma_fadd);
> >  	drm_printf(p, "\tIPEHR: 0x%08x\n", snapshot->reg.ipehr);
> > +	xe_hw_engine_snapshot_instdone_print(snapshot, p);
> > +
> >  	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
> >  		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
> >  			   snapshot->reg.rcu_mode);
> > +	drm_puts(p, "\n");
> >  }
> >  
> >  /**
> > @@ -908,6 +1030,12 @@ void xe_hw_engine_snapshot_free(struct xe_hw_engine_snapshot *snapshot)
> >  	if (!snapshot)
> >  		return;
> >  
> > +	kfree(snapshot->reg.instdone.slice_common);
> > +	kfree(snapshot->reg.instdone.slice_common_extra);
> > +	kfree(snapshot->reg.instdone.slice_common_extra2);
> > +	kfree(snapshot->reg.instdone.sampler);
> > +	kfree(snapshot->reg.instdone.row);
> > +	kfree(snapshot->reg.instdone.geom_svg);
> >  	kfree(snapshot->name);
> >  	kfree(snapshot);
> >  }
> > diff --git a/drivers/gpu/drm/xe/xe_hw_engine_types.h b/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > index 27deaa31efd31..9f9755e31b9fe 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > +++ b/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > @@ -211,6 +211,22 @@ struct xe_hw_engine_snapshot {
> >  		u32 ipehr;
> >  		/** @reg.rcu_mode: RCU_MODE */
> >  		u32 rcu_mode;
> > +		struct {
> > +			/** @reg.instdone.ring: RING_INSTDONE */
> > +			u32 ring;
> > +			/** @reg.instdone.slice_common: SC_INSTDONE */
> > +			u32 *slice_common;
> > +			/** @reg.instdone.slice_common_extra: SC_INSTDONE_EXTRA */
> > +			u32 *slice_common_extra;
> > +			/** @reg.instdone.slice_common_extra2: SC_INSTDONE_EXTRA2 */
> > +			u32 *slice_common_extra2;
> > +			/** @reg.instdone.sampler: SAMPLER_INSTDONE */
> > +			u32 *sampler;
> > +			/** @reg.instdone.row: ROW_INSTDONE */
> > +			u32 *row;
> > +			/** @reg.instdone.geom_svg: INSTDONE_GEOM_SVGUNIT */
> > +			u32 *geom_svg;
> > +		} instdone;
> >  	} reg;
> >  };
> >  
> > -- 
> > 2.44.0
> > 


  reply	other threads:[~2024-04-23 14:13 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-05 18:23 [PATCH v6 1/3] drm/xe: Store xe_he_engine in xe_hw_engine_snapshot José Roberto de Souza
2024-04-05 18:23 ` [PATCH v6 2/3] drm/xe: Add helpers to loop over geometry and compute DSS José Roberto de Souza
2024-04-23  1:24   ` Rodrigo Vivi
2024-04-05 18:23 ` [PATCH v6 3/3] drm/xe: Add INSTDONE registers to devcoredump José Roberto de Souza
2024-04-23  1:24   ` Rodrigo Vivi
2024-04-23 14:13     ` Souza, Jose [this message]
2024-04-23 14:17       ` Rodrigo Vivi
2024-04-05 20:11 ` ✓ CI.Patch_applied: success for series starting with [v6,1/3] drm/xe: Store xe_he_engine in xe_hw_engine_snapshot Patchwork
2024-04-05 20:12 ` ✗ CI.checkpatch: warning " Patchwork
2024-04-05 20:13 ` ✓ CI.KUnit: success " Patchwork
2024-04-05 20:24 ` ✓ CI.Build: " Patchwork
2024-04-05 20:27 ` ✓ CI.Hooks: " Patchwork
2024-04-05 20:28 ` ✓ CI.checksparse: " Patchwork
2024-04-05 20:42 ` ✗ CI.BAT: failure " Patchwork
2024-04-08 16:00 ` ✓ CI.Patch_applied: success for series starting with [v6,1/3] drm/xe: Store xe_he_engine in xe_hw_engine_snapshot (rev2) Patchwork
2024-04-08 16:00 ` ✗ CI.checkpatch: warning " Patchwork
2024-04-08 16:02 ` ✓ CI.KUnit: success " Patchwork
2024-04-08 16:14 ` ✓ CI.Build: " Patchwork
2024-04-08 16:18 ` ✓ CI.Hooks: " Patchwork
2024-04-08 16:19 ` ✓ CI.checksparse: " Patchwork
2024-04-08 16:50 ` ✓ CI.BAT: " Patchwork
2024-04-08 19:28 ` ✗ CI.FULL: failure " Patchwork
2024-04-23  1:12 ` [PATCH v6 1/3] drm/xe: Store xe_he_engine in xe_hw_engine_snapshot Rodrigo Vivi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ca39ac515092e3ef178236ebde834afeb29b6b79.camel@intel.com \
    --to=jose.souza@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.d.roper@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=zhanjun.dong@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox