Re: [PATCH v2 2/2] drm/xe: Add prefetch fault support for Xe3p

From: Matthew Brost <matthew.brost@intel.com>
To: Varun Gupta <varun.gupta@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <matthew.d.roper@intel.com>,
	<priyanka.dandamudi@intel.com>, <himal.prasad.ghimiray@intel.com>
Subject: Re: [PATCH v2 2/2] drm/xe: Add prefetch fault support for Xe3p
Date: Tue, 27 Jan 2026 12:47:46 -0800
Message-ID: <aXkkcluQPykwySPx@lstrano-desk.jf.intel.com>
In-Reply-To: <aXkghJIl8V/3lh/d@lstrano-desk.jf.intel.com>

On Tue, Jan 27, 2026 at 12:31:00PM -0800, Matthew Brost wrote:
> On Tue, Jan 27, 2026 at 05:27:13PM +0530, Varun Gupta wrote:
> > Xe3p prefetches memory ranges and it notifies software via an additional
> > bit in the page fault descriptor that the fault was caused by prefetch.
> > The prefetch bit should only be in the reply if the page fault handling
> > was not successful, which allows the HW to avoid generating a CAT error
> > for prefetch faults.
> > 
> > Based on original patches by Brian Welty <brian.welty@intel.com> and
> > Priyanka Dandamudi <priyanka.dandamudi@intel.com>.
> > 
> > v2: Changed comment wording from "repairs" to "handling" for clarity
> >     (Matt Roper)
> > 
> > Bspec: 59311
> > Originally-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > Signed-off-by: Varun Gupta <varun.gupta@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_gt_stats.c        |  1 +
> >  drivers/gpu/drm/xe/xe_gt_stats_types.h  |  1 +
> >  drivers/gpu/drm/xe/xe_guc_fwif.h        |  5 +++--
> >  drivers/gpu/drm/xe/xe_guc_pagefault.c   |  2 ++
> >  drivers/gpu/drm/xe/xe_pagefault.c       | 12 ++++++++++++
> >  drivers/gpu/drm/xe/xe_pagefault_types.h |  8 +++++++-
> >  6 files changed, 26 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_gt_stats.c b/drivers/gpu/drm/xe/xe_gt_stats.c
> > index fb2904bd0abd..340d0831b752 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_stats.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_stats.c
> > @@ -35,6 +35,7 @@ static const char *const stat_description[__XE_GT_STATS_NUM_IDS] = {
> >  	DEF_STAT_STR(SVM_TLB_INVAL_US, "svm_tlb_inval_us"),
> >  	DEF_STAT_STR(VMA_PAGEFAULT_COUNT, "vma_pagefault_count"),
> >  	DEF_STAT_STR(VMA_PAGEFAULT_KB, "vma_pagefault_kb"),
> > +	DEF_STAT_STR(PREFETCH_PAGEFAULT_COUNT, "prefetch_pagefault_count"),
> 
> I'd break the stats change into a different patch.
> 
> >  	DEF_STAT_STR(SVM_4K_PAGEFAULT_COUNT, "svm_4K_pagefault_count"),
> >  	DEF_STAT_STR(SVM_64K_PAGEFAULT_COUNT, "svm_64K_pagefault_count"),
> >  	DEF_STAT_STR(SVM_2M_PAGEFAULT_COUNT, "svm_2M_pagefault_count"),
> > diff --git a/drivers/gpu/drm/xe/xe_gt_stats_types.h b/drivers/gpu/drm/xe/xe_gt_stats_types.h
> > index b92d013091d5..82e578726088 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_stats_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_stats_types.h
> > @@ -13,6 +13,7 @@ enum xe_gt_stats_id {
> >  	XE_GT_STATS_ID_SVM_TLB_INVAL_US,
> >  	XE_GT_STATS_ID_VMA_PAGEFAULT_COUNT,
> >  	XE_GT_STATS_ID_VMA_PAGEFAULT_KB,
> > +	XE_GT_STATS_ID_PREFETCH_PAGEFAULT_COUNT,
> >  	XE_GT_STATS_ID_SVM_4K_PAGEFAULT_COUNT,
> >  	XE_GT_STATS_ID_SVM_64K_PAGEFAULT_COUNT,
> >  	XE_GT_STATS_ID_SVM_2M_PAGEFAULT_COUNT,
> > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > index a33ea288b907..b1b7cea26212 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > @@ -261,7 +261,8 @@ struct xe_guc_pagefault_desc {
> >  #define PFD_ACCESS_TYPE		GENMASK(1, 0)
> >  #define PFD_FAULT_TYPE		GENMASK(3, 2)
> >  #define PFD_VFID		GENMASK(9, 4)
> > -#define PFD_RSVD_1		GENMASK(11, 10)
> > +#define PFD_RSVD_1		BIT(10)
> > +#define XE3P_PFD_PREFETCH	BIT(11)
> 
> s/XE3P_PFD_PREFETCH/PFD_PREFETCH
> 
> Then...
> 
> #define PFD_PREFETCH	BIT(11)	/* Only valid on XE3P+, reserved on prior platforms */
> 
> >  #define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
> >  #define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> >  
> > @@ -281,7 +282,7 @@ struct xe_guc_pagefault_reply {
> >  
> >  	u32 dw1;
> >  #define PFR_VFID		GENMASK(5, 0)
> > -#define PFR_RSVD_1		BIT(6)
> > +#define XE3P_PFR_PREFETCH	BIT(6)
> 
> s/XE3P_PFR_PREFETCH/PFR_PREFETCH
> 
> >  #define PFR_ENG_INSTANCE	GENMASK(12, 7)
> >  #define PFR_ENG_CLASS		GENMASK(15, 13)
> >  #define PFR_PDATA		GENMASK(31, 16)
> > diff --git a/drivers/gpu/drm/xe/xe_guc_pagefault.c b/drivers/gpu/drm/xe/xe_guc_pagefault.c
> > index 719a18187a31..b6c12e563067 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_pagefault.c
> > @@ -27,6 +27,7 @@ static void guc_ack_fault(struct xe_pagefault *pf, int err)
> >  		FIELD_PREP(PFR_ASID, pf->consumer.asid),
> >  
> >  		FIELD_PREP(PFR_VFID, vfid) |
> > +		FIELD_PREP(XE3P_PFR_PREFETCH, pf->consumer.prefetch) |
> >  		FIELD_PREP(PFR_ENG_INSTANCE, engine_instance) |
> >  		FIELD_PREP(PFR_ENG_CLASS, engine_class) |
> >  		FIELD_PREP(PFR_PDATA, pdata),
> > @@ -77,6 +78,7 @@ int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 len)
> >  	pf.consumer.asid = FIELD_GET(PFD_ASID, msg[1]);
> >  	pf.consumer.access_type = FIELD_GET(PFD_ACCESS_TYPE, msg[2]);
> >  	pf.consumer.fault_type = FIELD_GET(PFD_FAULT_TYPE, msg[2]);
> > +	pf.consumer.prefetch = FIELD_GET(XE3P_PFD_PREFETCH, msg[2]);
> >  	if (FIELD_GET(XE2_PFD_TRVA_FAULT, msg[0]))
> >  		pf.consumer.fault_level = XE_PAGEFAULT_LEVEL_NACK;
> >  	else
> > diff --git a/drivers/gpu/drm/xe/xe_pagefault.c b/drivers/gpu/drm/xe/xe_pagefault.c
> > index 34dac4280b9d..97d93ed616f9 100644
> > --- a/drivers/gpu/drm/xe/xe_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_pagefault.c
> > @@ -223,6 +223,12 @@ static bool xe_pagefault_queue_pop(struct xe_pagefault_queue *pf_queue,
> >  
> >  static void xe_pagefault_error_account(struct xe_pagefault *pf, int err)
> >  {
> > +	/* Don't spam log for prefetch accesses, just add to stats */
> > +	if (pf->consumer.prefetch) {
> > +		xe_gt_stats_incr(pf->gt, XE_GT_STATS_ID_PREFETCH_PAGEFAULT_COUNT, 1);
> > +		return;
> > +	}
> 
> I don't get why this counter is incremented here.
> 
> I do get why this function aborts though - no need to print a fault which
> was triggered by a prefetch. So I'd move this to the call site (e.g.,
> suppress error messages there) plus I'd rename PREFETCH_PAGEFAULT_COUNT
> to something more meaningful - maybe INVALID_PREFETCH_PAGEFAULT_COUNT?
> 
> Matt
> 
> > +
> >  	xe_gt_info(pf->gt, "\n\tASID: %d\n"
> >  		   "\tFaulted Address: 0x%08x%08x\n"
> >  		   "\tFaultType: %d\n"
> > @@ -262,6 +268,12 @@ static void xe_pagefault_queue_work(struct work_struct *w)
> >  			xe_pagefault_error_account(&pf, err);
> >  			xe_gt_info(pf.gt, "Fault response: Unsuccessful %pe\n",
> >  				   ERR_PTR(err));

Missed this earlier - also, if you want to keep the 'Fault response:
Unsuccessful' message for prefetch cases, I think that is fine, but at
minimum also include the prefetch bit in the info message.

Matt

> > +		} else {
> > +			/*
> > +			 * Clear prefetch bit - only needed to suppress CAT errors
> > +			 * on unsuccessful handling.
> > +			 */
> > +			pf.consumer.prefetch = 0;
> >  		}
> >  
> >  		pf.producer.ops->ack_fault(&pf, err);
> > diff --git a/drivers/gpu/drm/xe/xe_pagefault_types.h b/drivers/gpu/drm/xe/xe_pagefault_types.h
> > index d3b516407d60..4837f2b40079 100644
> > --- a/drivers/gpu/drm/xe/xe_pagefault_types.h
> > +++ b/drivers/gpu/drm/xe/xe_pagefault_types.h
> > @@ -84,8 +84,14 @@ struct xe_pagefault {
> >  		u8 engine_class;
> >  		/** @consumer.engine_instance: engine instance */
> >  		u8 engine_instance;
> > +		/**
> > +		 * @consumer.prefetch: fault is caused by HW prefetch.
> > +		 * Echo in response to suppress CAT errors on 
> > +		 * unsuccessful handling.
> > +		 */
> > +		u8 prefetch;
> >  		/** consumer.reserved: reserved bits for future expansion */
> > -		u8 reserved[7];
> > +		u8 reserved[6];
> >  	} consumer;
> >  	/**
> >  	 * @producer: State for the producer (i.e., HW/FW interface). Populated
> > -- 
> > 2.43.0
> >
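
For readers following the thread, the prefetch-bit flow discussed above can
be modelled in a few lines of user-space C. This is only a sketch under
stated assumptions, not the driver code: the bit positions are copied from
the quoted patch, the PFD_PREFETCH/PFR_PREFETCH names follow the rename
suggested in the review, and the SK_* macros and sketch_* helpers are
hypothetical stand-ins for the kernel's BIT(), GENMASK(), FIELD_GET() and
FIELD_PREP().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT()/GENMASK() helpers. */
#define SK_BIT(n)         (UINT32_C(1) << (n))
#define SK_GENMASK(h, l)  ((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Bit positions from the quoted patch; names follow the review's rename. */
#define PFD_PREFETCH  SK_BIT(11)        /* descriptor dword, Xe3p+ only */
#define PFR_VFID      SK_GENMASK(5, 0)  /* reply dw1 */
#define PFR_PREFETCH  SK_BIT(6)         /* reply dw1, Xe3p+ only */

/* Hypothetical helper: was the fault caused by a HW prefetch? */
static bool sketch_desc_is_prefetch(uint32_t desc_dw)
{
	return (desc_dw & PFD_PREFETCH) != 0;
}

/*
 * Hypothetical helper: build the VFID and prefetch fields of reply dw1.
 * The prefetch bit is echoed only when handling failed (err != 0), which
 * matches the "clear prefetch bit on success" branch in the patch.
 */
static uint32_t sketch_reply_dw1(uint32_t vfid, bool prefetch, int err)
{
	uint32_t dw1;

	assert((vfid & ~PFR_VFID) == 0); /* VFID must fit in bits 5:0 */
	dw1 = vfid;                      /* PFR_VFID starts at bit 0 */
	if (err && prefetch)
		dw1 |= PFR_PREFETCH;
	return dw1;
}
```

With these helpers, a descriptor dword of 0x800 decodes as a prefetch fault,
and a failed reply for VFID 3 carries bit 6 while a successful one does not.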
