Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: "Nguyen, Brian3" <brian3.nguyen@intel.com>
Cc: "Lin, Shuicheng" <shuicheng.lin@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Upadhyay,  Tejas" <tejas.upadhyay@intel.com>,
	"Summers, Stuart" <stuart.summers@intel.com>
Subject: Re: [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC
Date: Sat, 22 Nov 2025 10:39:31 -0800	[thread overview]
Message-ID: <aSIDY00xv6Q+I2cg@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <SA1PR11MB8839340F6CB3C3E374DAAF07AAD2A@SA1PR11MB8839.namprd11.prod.outlook.com>

On Fri, Nov 21, 2025 at 06:56:27PM -0700, Nguyen, Brian3 wrote:
> 
> On Friday, November 21, 2025 10:33 AM Lin, Shuicheng wrote:
> > On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > > Add page reclamation related changes to GuC interface, handlers, and
> > > senders to support page reclamation.
> > >
> > > Currently TLB invalidations will perform an entire PPC flush in order
> > > to prevent stale memory access for noncoherent system memory. Page
> > > reclamation is an extension of the typical TLB invalidation workflow,
> > > allowing disabling of full PPC flush and enable selective PPC
> > > flushing. Selective flushing will be decided by a list of pages whom's address
> > is passed to GuC at time of action.
> > >
> > > Page reclamation interfaces require at least GuC FW ver 70.31.0.
> > 
> > Should driver disable this feature if the running FW is < 70.31.0?
> 
> Default FW version is above this at time of patchset submission so
> I had assumed it not to be a problem, since the danger is a user
> forcibly using a bad FW which already has unpredictable results.
> 
> However, in hindsight, it is easy enough to skip if FW version is lower,
> and we can safely fallback to default TLB invalidation, so I'll proceed
> with adding a check within the later patches that'll disable
> page reclamation within the xe_guc_tlb_inval.c unless there are
> any objections.
> 

I would just flip 'xe->info.has_page_reclaim_hw_assist' to false very
early in driver load once we have the GuC version if GuC version doesn't
support it.

> > What will happen if driver send this action while GuC doesn't support it yet?
> > 
> > Shuicheng
> > 
> 
> AFAIK, if action is sent before correct FW version, it'll report
> out GUC_HXG_TYPE_RESPONSE_FAILURE in g2h
> due to illegal operation, eventually triggering reset.
> 
> Brian
> 
> > >
> > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/abi/guc_actions_abi.h |  2 ++
> > >  drivers/gpu/drm/xe/xe_guc_ct.c           |  4 ++++
> > >  drivers/gpu/drm/xe/xe_guc_fwif.h         |  1 +
> > >  drivers/gpu/drm/xe/xe_guc_tlb_inval.c    | 14 ++++++++++++++
> > >  4 files changed, 21 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > index 47756e4674a1..11de3bdf69b5 100644
> > > --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > @@ -151,6 +151,8 @@ enum xe_guc_action {
> > >  	XE_GUC_ACTION_TLB_INVALIDATION = 0x7000,
> > >  	XE_GUC_ACTION_TLB_INVALIDATION_DONE = 0x7001,
> > >  	XE_GUC_ACTION_TLB_INVALIDATION_ALL = 0x7002,
> > > +	XE_GUC_ACTION_PAGE_RECLAMATION = 0x7003,
> > > +	XE_GUC_ACTION_PAGE_RECLAMATION_DONE = 0x7004,
> > >  	XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION = 0x8002,
> > >  	XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
> > >  	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004, diff --git
> > > a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > index
> > > 2697d711adb2..e13704e61032 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > @@ -1311,6 +1311,7 @@ static int parse_g2h_event(struct xe_guc_ct *ct,
> > > u32 *msg, u32 len)
> > >  	case XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
> > >  	case XE_GUC_ACTION_SCHED_ENGINE_MODE_DONE:
> > >  	case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > +	case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > >  		g2h_release_space(ct, len);
> > >  	}
> > >
> > > @@ -1546,6 +1547,7 @@ static int process_g2h_msg(struct xe_guc_ct *ct,
> > > u32 *msg, u32 len)
> > >  		ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > >  		break;
> > >  	case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > +	case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > >  		ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);

I get what is happening here - page reclaim G2H just uses a shared seqno
with TLB invalidations but this looks very odd to the reader of the code
who hasn't worked on this or understands this as shared function is
called for both G2H. Can you add a comment here explaining why it is ok
to use a single G2H handler for both TLB invalidations and page reclaim?

Matt

> > >  		break;
> > >  	case XE_GUC_ACTION_GUC2PF_RELAY_FROM_VF:
> > > @@ -1711,6 +1713,7 @@ static int g2h_read(struct xe_guc_ct *ct, u32
> > > *msg, bool fast_path)
> > >  		switch (action) {
> > >  		case XE_GUC_ACTION_REPORT_PAGE_FAULT_REQ_DESC:
> > >  		case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > +		case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > >  			break;	/* Process these in fast-path */
> > >  		default:
> > >  			return 0;
> > > @@ -1747,6 +1750,7 @@ static void g2h_fast_path(struct xe_guc_ct *ct,
> > > u32 *msg, u32 len)
> > >  		ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > >  		break;
> > >  	case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > +	case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > >  		__g2h_release_space(ct, len);
> > >  		ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
> > >  		break;
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > index c90dd266e9cf..34d74a71c4f0 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > @@ -16,6 +16,7 @@
> > >  #define G2H_LEN_DW_DEREGISTER_CONTEXT		3
> > >  #define G2H_LEN_DW_TLB_INVALIDATE		3
> > >  #define G2H_LEN_DW_G2G_NOTIFY_MIN		3
> > > +#define G2H_LEN_DW_PAGE_RECLAMATION		3
> > >
> > >  #define GUC_ID_MAX			65535
> > >  #define GUC_ID_UNKNOWN			0xffffffff
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > index c05709a5bc98..3185f8dc00c4 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > @@ -95,6 +95,20 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval
> > > *tlb_inval, u32 seqno)
> > >  	return -ECANCELED;
> > >  }
> > >
> > > +static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
> > > +			     u64 gpu_addr)
> > > +{
> > > +	u32 action[] = {
> > > +		XE_GUC_ACTION_PAGE_RECLAMATION,
> > > +		seqno,
> > > +		lower_32_bits(gpu_addr),
> > > +		upper_32_bits(gpu_addr),
> > > +	};
> > > +
> > > +	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
> > > +			      G2H_LEN_DW_PAGE_RECLAMATION, 1); }
> > > +
> > >  /*
> > >   * Ensure that roundup_pow_of_two(length) doesn't overflow.
> > >   * Note that roundup_pow_of_two() operates on unsigned long,
> > > --
> > > 2.51.2
> 

  reply	other threads:[~2025-11-22 18:39 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-18  9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
2025-11-18  9:05 ` [PATCH 01/11] [DO, NOT, REVIEW] drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers Brian Nguyen
2025-11-18  9:05 ` [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received Brian Nguyen
2025-11-21 17:23   ` Lin, Shuicheng
2025-11-22  1:53     ` Nguyen, Brian3
2025-11-22 18:25   ` Matthew Brost
2025-11-25 11:01     ` Nguyen, Brian3
2025-11-18  9:05 ` [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush Brian Nguyen
2025-11-21 18:02   ` Lin, Shuicheng
2025-11-22  1:54     ` Nguyen, Brian3
2025-11-22 19:32   ` Matthew Brost
2025-11-25 11:07     ` Nguyen, Brian3
2025-11-18  9:05 ` [PATCH 04/11] drm/xe: Add page reclamation info to device info Brian Nguyen
2025-11-21 18:15   ` Lin, Shuicheng
2025-11-22 18:31   ` Matthew Brost
2025-11-18  9:05 ` [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC Brian Nguyen
2025-11-21 18:32   ` Lin, Shuicheng
2025-11-22  1:56     ` Nguyen, Brian3
2025-11-22 18:39       ` Matthew Brost [this message]
2025-11-25 11:13         ` Nguyen, Brian3
2025-11-18  9:05 ` [PATCH 06/11] drm/xe: Create page reclaim list on unbind Brian Nguyen
2025-11-21 21:29   ` Lin, Shuicheng
2025-11-22  1:57     ` Nguyen, Brian3
2025-11-22 19:18   ` Matthew Brost
2025-11-25 11:18     ` Nguyen, Brian3
2025-11-25 18:34       ` Matthew Brost
2025-11-25 19:01         ` Nguyen, Brian3
2025-11-25 19:07           ` Matthew Brost
2025-11-25 19:46             ` Nguyen, Brian3
2025-11-25 22:35               ` Matthew Brost
2025-11-26  2:33                 ` Nguyen, Brian3
2025-11-18  9:05 ` [PATCH 07/11] drm/xe: Suballocate BO for page reclaim Brian Nguyen
2025-11-22 19:42   ` Matthew Brost
2025-11-25 11:20     ` Nguyen, Brian3
2025-11-18  9:05 ` [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job Brian Nguyen
2025-11-22 13:52   ` Michal Wajdeczko
2025-11-25 11:20     ` Nguyen, Brian3
2025-11-18  9:05 ` [PATCH 09/11] drm/xe: Append page reclamation action to tlb inval Brian Nguyen
2025-11-18  9:05 ` [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim Brian Nguyen
2025-11-24 12:29   ` Matthew Auld
2025-11-25  6:12     ` Nguyen, Brian3
2025-11-25 11:48     ` Upadhyay, Tejas
2025-11-25 13:05       ` Upadhyay, Tejas
2025-11-18  9:05 ` [PATCH 11/11] drm/xe: Add debugfs support for page reclamation Brian Nguyen
2025-11-21 22:32   ` Lin, Shuicheng
2025-11-22  1:57     ` Nguyen, Brian3
2025-11-22 14:18   ` Michal Wajdeczko
2025-11-25 11:21     ` Nguyen, Brian3
2025-11-18  9:52 ` ✗ CI.checkpatch: warning for Page Reclamation Support for Xe3p Platforms Patchwork
2025-11-18  9:53 ` ✓ CI.KUnit: success " Patchwork
2025-11-18 13:02 ` ✗ Xe.CI.Full: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aSIDY00xv6Q+I2cg@lstrano-desk.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=brian3.nguyen@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=shuicheng.lin@intel.com \
    --cc=stuart.summers@intel.com \
    --cc=tejas.upadhyay@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox