From: Matthew Brost <matthew.brost@intel.com>
To: Varun Gupta <varun.gupta@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <matthew.d.roper@intel.com>,
<priyanka.dandamudi@intel.com>, <himal.prasad.ghimiray@intel.com>
Subject: Re: [PATCH v2 2/2] drm/xe: Add prefetch fault support for Xe3p
Date: Tue, 27 Jan 2026 12:47:46 -0800
Message-ID: <aXkkcluQPykwySPx@lstrano-desk.jf.intel.com>
In-Reply-To: <aXkghJIl8V/3lh/d@lstrano-desk.jf.intel.com>
On Tue, Jan 27, 2026 at 12:31:00PM -0800, Matthew Brost wrote:
> On Tue, Jan 27, 2026 at 05:27:13PM +0530, Varun Gupta wrote:
> > Xe3p prefetches memory ranges and it notifies software via an additional
> > bit in the page fault descriptor that the fault was caused by prefetch.
> > The prefetch bit should only be in the reply if the page fault handling
> > was not successful, which allows the HW to avoid generating a CAT error
> > for prefetch faults.
> >
> > Based on original patches by Brian Welty <brian.welty@intel.com> and
> > Priyanka Dandamudi <priyanka.dandamudi@intel.com>.
> >
> > v2: Changed comment wording from "repairs" to "handling" for clarity
> > (Matt Roper)
> >
> > Bspec: 59311
> > Originally-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > Signed-off-by: Varun Gupta <varun.gupta@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_stats.c | 1 +
> > drivers/gpu/drm/xe/xe_gt_stats_types.h | 1 +
> > drivers/gpu/drm/xe/xe_guc_fwif.h | 5 +++--
> > drivers/gpu/drm/xe/xe_guc_pagefault.c | 2 ++
> > drivers/gpu/drm/xe/xe_pagefault.c | 12 ++++++++++++
> > drivers/gpu/drm/xe/xe_pagefault_types.h | 8 +++++++-
> > 6 files changed, 26 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_stats.c b/drivers/gpu/drm/xe/xe_gt_stats.c
> > index fb2904bd0abd..340d0831b752 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_stats.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_stats.c
> > @@ -35,6 +35,7 @@ static const char *const stat_description[__XE_GT_STATS_NUM_IDS] = {
> > DEF_STAT_STR(SVM_TLB_INVAL_US, "svm_tlb_inval_us"),
> > DEF_STAT_STR(VMA_PAGEFAULT_COUNT, "vma_pagefault_count"),
> > DEF_STAT_STR(VMA_PAGEFAULT_KB, "vma_pagefault_kb"),
> > + DEF_STAT_STR(PREFETCH_PAGEFAULT_COUNT, "prefetch_pagefault_count"),
>
> I'd break the stats change into a different patch.
>
> > DEF_STAT_STR(SVM_4K_PAGEFAULT_COUNT, "svm_4K_pagefault_count"),
> > DEF_STAT_STR(SVM_64K_PAGEFAULT_COUNT, "svm_64K_pagefault_count"),
> > DEF_STAT_STR(SVM_2M_PAGEFAULT_COUNT, "svm_2M_pagefault_count"),
> > diff --git a/drivers/gpu/drm/xe/xe_gt_stats_types.h b/drivers/gpu/drm/xe/xe_gt_stats_types.h
> > index b92d013091d5..82e578726088 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_stats_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_stats_types.h
> > @@ -13,6 +13,7 @@ enum xe_gt_stats_id {
> > XE_GT_STATS_ID_SVM_TLB_INVAL_US,
> > XE_GT_STATS_ID_VMA_PAGEFAULT_COUNT,
> > XE_GT_STATS_ID_VMA_PAGEFAULT_KB,
> > + XE_GT_STATS_ID_PREFETCH_PAGEFAULT_COUNT,
> > XE_GT_STATS_ID_SVM_4K_PAGEFAULT_COUNT,
> > XE_GT_STATS_ID_SVM_64K_PAGEFAULT_COUNT,
> > XE_GT_STATS_ID_SVM_2M_PAGEFAULT_COUNT,
> > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > index a33ea288b907..b1b7cea26212 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > @@ -261,7 +261,8 @@ struct xe_guc_pagefault_desc {
> > #define PFD_ACCESS_TYPE GENMASK(1, 0)
> > #define PFD_FAULT_TYPE GENMASK(3, 2)
> > #define PFD_VFID GENMASK(9, 4)
> > -#define PFD_RSVD_1 GENMASK(11, 10)
> > +#define PFD_RSVD_1 BIT(10)
> > +#define XE3P_PFD_PREFETCH BIT(11)
>
> s/XE3P_PFD_PREFETCH/PFD_PREFETCH
>
> Then...
>
> #define PFD_PREFETCH BIT(11) /* Only valid on XE3P+, reserved on prior platforms */
>
> > #define PFD_VIRTUAL_ADDR_LO GENMASK(31, 12)
> > #define PFD_VIRTUAL_ADDR_LO_SHIFT 12
> >
> > @@ -281,7 +282,7 @@ struct xe_guc_pagefault_reply {
> >
> > u32 dw1;
> > #define PFR_VFID GENMASK(5, 0)
> > -#define PFR_RSVD_1 BIT(6)
> > +#define XE3P_PFR_PREFETCH BIT(6)
>
> s/XE3P_PFR_PREFETCH/PFR_PREFETCH
>
> > #define PFR_ENG_INSTANCE GENMASK(12, 7)
> > #define PFR_ENG_CLASS GENMASK(15, 13)
> > #define PFR_PDATA GENMASK(31, 16)
> > diff --git a/drivers/gpu/drm/xe/xe_guc_pagefault.c b/drivers/gpu/drm/xe/xe_guc_pagefault.c
> > index 719a18187a31..b6c12e563067 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_pagefault.c
> > @@ -27,6 +27,7 @@ static void guc_ack_fault(struct xe_pagefault *pf, int err)
> > FIELD_PREP(PFR_ASID, pf->consumer.asid),
> >
> > FIELD_PREP(PFR_VFID, vfid) |
> > + FIELD_PREP(XE3P_PFR_PREFETCH, pf->consumer.prefetch) |
> > FIELD_PREP(PFR_ENG_INSTANCE, engine_instance) |
> > FIELD_PREP(PFR_ENG_CLASS, engine_class) |
> > FIELD_PREP(PFR_PDATA, pdata),
> > @@ -77,6 +78,7 @@ int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 len)
> > pf.consumer.asid = FIELD_GET(PFD_ASID, msg[1]);
> > pf.consumer.access_type = FIELD_GET(PFD_ACCESS_TYPE, msg[2]);
> > pf.consumer.fault_type = FIELD_GET(PFD_FAULT_TYPE, msg[2]);
> > + pf.consumer.prefetch = FIELD_GET(XE3P_PFD_PREFETCH, msg[2]);
> > if (FIELD_GET(XE2_PFD_TRVA_FAULT, msg[0]))
> > pf.consumer.fault_level = XE_PAGEFAULT_LEVEL_NACK;
> > else
> > diff --git a/drivers/gpu/drm/xe/xe_pagefault.c b/drivers/gpu/drm/xe/xe_pagefault.c
> > index 34dac4280b9d..97d93ed616f9 100644
> > --- a/drivers/gpu/drm/xe/xe_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_pagefault.c
> > @@ -223,6 +223,12 @@ static bool xe_pagefault_queue_pop(struct xe_pagefault_queue *pf_queue,
> >
> > static void xe_pagefault_error_account(struct xe_pagefault *pf, int err)
> > {
> > + /* Don't spam log for prefetch accesses, just add to stats */
> > + if (pf->consumer.prefetch) {
> > + xe_gt_stats_incr(pf->gt, XE_GT_STATS_ID_PREFETCH_PAGEFAULT_COUNT, 1);
> > + return;
> > + }
>
> I don't get why this counter is incremented here.
>
> I do get why this function returns early though - no need to print a
> fault which was triggered by a prefetch. So I'd move this to the call
> site (e.g., suppress error messages there) plus I'd rename
> PREFETCH_PAGEFAULT_COUNT to something more meaningful - maybe
> INVALID_PREFETCH_PAGEFAULT_COUNT?
>
> Matt
>
> > +
> > xe_gt_info(pf->gt, "\n\tASID: %d\n"
> > "\tFaulted Address: 0x%08x%08x\n"
> > "\tFaultType: %d\n"
> > @@ -262,6 +268,12 @@ static void xe_pagefault_queue_work(struct work_struct *w)
> > xe_pagefault_error_account(&pf, err);
> > xe_gt_info(pf.gt, "Fault response: Unsuccessful %pe\n",
> > ERR_PTR(err));
Missed this - if you want to keep the 'Fault response: Unsuccessful'
message for prefetch cases, I think that is fine, but at minimum also
include the prefetch bit in the info message.
Matt
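
To illustrate the call-site idea, here is a rough userspace sketch, not
driver code. struct pf_sketch, finish_fault() and format_fault_response()
are made-up names for illustration only, and the plain integer error
stands in for the kernel's %pe formatting. It assumes the policy
discussed above: skip the verbose dump for failed prefetch faults, keep
the short message with the prefetch bit in it, and clear the bit on
success.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for struct xe_pagefault. */
struct pf_sketch {
	uint8_t prefetch;	/* fault was caused by a HW prefetch */
};

/*
 * Build the short "unsuccessful" message with the prefetch bit included.
 * Returns what snprintf() returns (length that would have been written).
 */
static int format_fault_response(char *buf, size_t len,
				 const struct pf_sketch *pf, int err)
{
	return snprintf(buf, len,
			"Fault response: Unsuccessful %d, prefetch=%u",
			err, (unsigned int)pf->prefetch);
}

/*
 * Call-site policy: returns 1 if the verbose fault dump should be printed.
 * On failure the prefetch bit is left set so it can be echoed in the
 * reply; on success it is cleared.
 */
static int finish_fault(struct pf_sketch *pf, int err)
{
	assert(pf);

	if (err)
		return !pf->prefetch;	/* no verbose dump for prefetch faults */

	pf->prefetch = 0;
	return 0;
}
```

With this shape the dump helper itself stays unaware of prefetch, and
the decision lives in one place next to the ack.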
> > + } else {
> > + /*
> > + * Clear prefetch bit - only needed to suppress CAT errors
> > + * on unsuccessful handling.
> > + */
> > + pf.consumer.prefetch = 0;
> > }
> >
> > pf.producer.ops->ack_fault(&pf, err);
> > diff --git a/drivers/gpu/drm/xe/xe_pagefault_types.h b/drivers/gpu/drm/xe/xe_pagefault_types.h
> > index d3b516407d60..4837f2b40079 100644
> > --- a/drivers/gpu/drm/xe/xe_pagefault_types.h
> > +++ b/drivers/gpu/drm/xe/xe_pagefault_types.h
> > @@ -84,8 +84,14 @@ struct xe_pagefault {
> > u8 engine_class;
> > /** @consumer.engine_instance: engine instance */
> > u8 engine_instance;
> > + /**
> > + * @consumer.prefetch: fault is caused by HW prefetch.
> > + * Echo in response to suppress CAT errors on
> > + * unsuccessful handling.
> > + */
> > + u8 prefetch;
> > /** consumer.reserved: reserved bits for future expansion */
> > - u8 reserved[7];
> > + u8 reserved[6];
> > } consumer;
> > /**
> > * @producer: State for the producer (i.e., HW/FW interface). Populated
> > --
> > 2.43.0
> >
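
For reference, a rough userspace sketch of the bit handling with the
renamed macros. The bit positions are taken from the hunks quoted above;
BIT32(), GENMASK32(), field_get(), field_prep(), pfd_decode_prefetch()
and pfr_encode_dw1() are illustrative stand-ins written for this sketch,
not the kernel's BIT()/GENMASK()/FIELD_GET()/FIELD_PREP() helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT()/GENMASK() helpers. */
#define BIT32(n)	(UINT32_C(1) << (n))
#define GENMASK32(h, l)	((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Fault descriptor bits, as in the quoted xe_guc_fwif.h hunk. */
#define PFD_RSVD_1	BIT32(10)
#define PFD_PREFETCH	BIT32(11) /* Only valid on XE3P+, reserved on prior platforms */

/* Fault reply dw1 bits. */
#define PFR_VFID	GENMASK32(5, 0)
#define PFR_PREFETCH	BIT32(6)

/* Position of the lowest set bit in a non-zero mask. */
static unsigned int mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Extract the field selected by mask from reg. */
static uint32_t field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> mask_shift(mask);
}

/* Place val into the field selected by mask. */
static uint32_t field_prep(uint32_t mask, uint32_t val)
{
	return (val << mask_shift(mask)) & mask;
}

/* Decode the prefetch bit from the descriptor dword that carries it. */
static uint8_t pfd_decode_prefetch(uint32_t dw)
{
	return (uint8_t)field_get(PFD_PREFETCH, dw);
}

/* Build the VFID and prefetch portion of reply dw1. */
static uint32_t pfr_encode_dw1(uint32_t vfid, uint8_t prefetch)
{
	return field_prep(PFR_VFID, vfid) | field_prep(PFR_PREFETCH, prefetch);
}
```

On platforms before Xe3p the descriptor bit is reserved, so decoding it
unconditionally relies on that bit reading as zero there.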