Re: [PATCH v3 2/7] drm/xe: Implement xe_pagefault_init

From: Matthew Brost <matthew.brost@intel.com>
To: Francois Dugast <francois.dugast@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <stuart.summers@intel.com>
Subject: Re: [PATCH v3 2/7] drm/xe: Implement xe_pagefault_init
Date: Fri, 31 Oct 2025 09:40:33 -0700
Message-ID: <aQTmgc8CexOuoend@lstrano-desk.jf.intel.com>
In-Reply-To: <aQS8RTAavQUlBQsl@fdugast-desk>

On Fri, Oct 31, 2025 at 02:40:21PM +0100, Francois Dugast wrote:
> On Mon, Oct 27, 2025 at 08:58:38PM -0700, Matthew Brost wrote:
> > Create pagefault queues and initialize them.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_device.c       |  5 ++
> >  drivers/gpu/drm/xe/xe_device_types.h | 11 ++++
> >  drivers/gpu/drm/xe/xe_pagefault.c    | 93 +++++++++++++++++++++++++++-
> >  3 files changed, 107 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 47f5391ad8e9..c17813c469fd 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -52,6 +52,7 @@
> >  #include "xe_nvm.h"
> >  #include "xe_oa.h"
> >  #include "xe_observation.h"
> > +#include "xe_pagefault.h"
> >  #include "xe_pat.h"
> >  #include "xe_pcode.h"
> >  #include "xe_pm.h"
> > @@ -890,6 +891,10 @@ int xe_device_probe(struct xe_device *xe)
> >  	if (err)
> >  		return err;
> >  
> > +	err = xe_pagefault_init(xe);
> > +	if (err)
> > +		return err;
> > +
> 
> It seems these lines ^...
> 
> >  	for_each_gt(gt, xe, id) {
> >  		err = xe_gt_init(gt);
> >  		if (err)
> 
> 
> ... should come after those ^ otherwise the number of EUs in
> xe_pagefault_queue_init is incorrect.
> 

Yea, this version is busted.
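
To illustrate the sizing problem, here is a minimal user-space sketch of the
arithmetic in xe_pagefault_queue_init. It is not the driver code: the entry
size, engine count, and EU count are hypothetical stand-ins, and all sketch_*
names are made up for this example. It only shows that an EU count of zero
(which is what the review comment says results when the init call runs before
xe_gt_init) produces a much smaller queue than intended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real values come from the xe driver headers. */
#define SKETCH_NUM_HW_ENGINES	64u
#define SKETCH_ENTRY_SIZE	64u	/* assumed power-of-two entry size in bytes */
#define SKETCH_PF_MULTIPLIER	8u	/* mirrors PF_MULTIPLIER in the patch */

/* Round v up to the next power of two; v must be non-zero. */
static uint32_t sketch_roundup_pow_of_two(uint32_t v)
{
	uint32_t r = 1;

	assert(v != 0);
	while (r < v)
		r <<= 1;
	return r;
}

/* Same formula as the patch: (EUs + engines) * entry size * multiplier. */
static uint32_t sketch_queue_size(uint32_t total_num_eus)
{
	uint32_t size = (total_num_eus + SKETCH_NUM_HW_ENGINES) *
			SKETCH_ENTRY_SIZE * SKETCH_PF_MULTIPLIER;

	return sketch_roundup_pow_of_two(size);
}
```

With these assumed constants, sketch_queue_size(0) gives 32768 bytes while
sketch_queue_size(512) gives 524288 bytes, so a queue sized before the
topology is known would be 16x too small here. In the patch itself,
xe_assert(xe, total_num_eus) would also fire in that case wherever asserts
are enabled.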

> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > index af0ce275b032..7baf15f51575 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -18,6 +18,7 @@
> >  #include "xe_lmtt_types.h"
> >  #include "xe_memirq_types.h"
> >  #include "xe_oa_types.h"
> > +#include "xe_pagefault_types.h"
> >  #include "xe_platform_types.h"
> >  #include "xe_pmu_types.h"
> >  #include "xe_pt_types.h"
> > @@ -418,6 +419,16 @@ struct xe_device {
> >  		u32 next_asid;
> >  		/** @usm.lock: protects UM state */
> >  		struct rw_semaphore lock;
> > +		/** @usm.pf_wq: page fault work queue, unbound, high priority */
> > +		struct workqueue_struct *pf_wq;
> > +		/*
> > +		 * We pick 4 here because, in the current implementation, it
> > +		 * yields the best bandwidth utilization of the kernel paging
> > +		 * engine.
> > +		 */
> > +#define XE_PAGEFAULT_QUEUE_COUNT	4
> > +		/** @usm.pf_queue: Page fault queues */
> > +		struct xe_pagefault_queue pf_queue[XE_PAGEFAULT_QUEUE_COUNT];
> >  	} usm;
> >  
> >  	/** @pinned: pinned BO state */
> > diff --git a/drivers/gpu/drm/xe/xe_pagefault.c b/drivers/gpu/drm/xe/xe_pagefault.c
> > index d509a80cb1f3..43b26e7d090a 100644
> > --- a/drivers/gpu/drm/xe/xe_pagefault.c
> > +++ b/drivers/gpu/drm/xe/xe_pagefault.c
> > @@ -3,6 +3,10 @@
> >   * Copyright © 2025 Intel Corporation
> >   */
> >  
> > +#include <drm/drm_managed.h>
> > +
> > +#include "xe_device.h"
> > +#include "xe_gt_types.h"
> >  #include "xe_pagefault.h"
> >  #include "xe_pagefault_types.h"
> >  
> > @@ -21,6 +25,71 @@
> >   * xe_pagefault.c implements the consumer layer.
> >   */
> >  
> > +static int xe_pagefault_entry_size(void)
> > +{
> > +	return roundup_pow_of_two(sizeof(struct xe_pagefault));
> > +}
> > +
> > +static void xe_pagefault_queue_work(struct work_struct *w)
> > +{
> > +	/* TODO: Implement */
> > +}
> > +
> > +static int xe_pagefault_queue_init(struct xe_device *xe,
> > +				   struct xe_pagefault_queue *pf_queue)
> > +{
> > +	struct xe_gt *gt;
> > +	int total_num_eus = 0;
> > +	u8 id;
> > +
> > +	for_each_gt(gt, xe, id) {
> > +		xe_dss_mask_t all_dss;
> > +		int num_dss, num_eus;
> > +
> > +		bitmap_or(all_dss, gt->fuse_topo.g_dss_mask,
> > +			  gt->fuse_topo.c_dss_mask, XE_MAX_DSS_FUSE_BITS);
> > +
> > +		num_dss = bitmap_weight(all_dss, XE_MAX_DSS_FUSE_BITS);
> > +		num_eus = bitmap_weight(gt->fuse_topo.eu_mask_per_dss,
> > +					XE_MAX_EU_FUSE_BITS) * num_dss;
> > +
> > +		total_num_eus += num_eus;
> > +	}
> > +
> > +	xe_assert(xe, total_num_eus);
> > +
> > +	/*
> > +	 * user can issue separate page faults per EU and per CS
> > +	 *
> > +	 * XXX: Multiplier required as compute UMD are getting PF queue errors
> > +	 * without it. Follow on why this multiplier is required.
> > +	 */
> > +#define PF_MULTIPLIER	8
> > +	pf_queue->size = (total_num_eus + XE_NUM_HW_ENGINES) *
> > +		xe_pagefault_entry_size() * PF_MULTIPLIER;
> > +	pf_queue->size = roundup_pow_of_two(pf_queue->size);
> > +#undef PF_MULTIPLIER
> > +
> > +	drm_dbg(&xe->drm, "xe_pagefault_entry_size=%d, total_num_eus=%d, pf_queue->size=%u",
> > +		xe_pagefault_entry_size(), total_num_eus, pf_queue->size);
> > +
> > +	spin_lock_init(&pf_queue->lock);
> > +	INIT_WORK(&pf_queue->worker, xe_pagefault_queue_work);
> 
> These 2 lines ^...
> 
> > +
> > +	pf_queue->data = drmm_kzalloc(&xe->drm, pf_queue->size, GFP_KERNEL);
> > +	if (!pf_queue->data)
> > +		return -ENOMEM;
> 
> ... and those 3 lines were swapped since last revision. It is probably
> too early for pf_queue to be used here anyway but was there a reason for
> changing the order?
> 

I was fighting CI and lost. I now have a proper version locally which should work.

Matt

> Francois
> 
> > +
> > +	return 0;
> > +}
> > +
> > +static void xe_pagefault_fini(void *arg)
> > +{
> > +	struct xe_device *xe = arg;
> > +
> > +	destroy_workqueue(xe->usm.pf_wq);
> > +}
> > +
> >  /**
> >   * xe_pagefault_init() - Page fault init
> >   * @xe: xe device instance
> > @@ -31,8 +100,28 @@
> >   */
> >  int xe_pagefault_init(struct xe_device *xe)
> >  {
> > -	/* TODO - implement */
> > -	return 0;
> > +	int err, i;
> > +
> > +	if (!xe->info.has_usm)
> > +		return 0;
> > +
> > +	xe->usm.pf_wq = alloc_workqueue("xe_page_fault_work_queue",
> > +					WQ_UNBOUND | WQ_HIGHPRI,
> > +					XE_PAGEFAULT_QUEUE_COUNT);
> > +	if (!xe->usm.pf_wq)
> > +		return -ENOMEM;
> > +
> > +	for (i = 0; i < XE_PAGEFAULT_QUEUE_COUNT; ++i) {
> > +		err = xe_pagefault_queue_init(xe, xe->usm.pf_queue + i);
> > +		if (err)
> > +			goto err_out;
> > +	}
> > +
> > +	return devm_add_action_or_reset(xe->drm.dev, xe_pagefault_fini, xe);
> > +
> > +err_out:
> > +	destroy_workqueue(xe->usm.pf_wq);
> > +	return err;
> >  }
> >  
> >  /**
> > -- 
> > 2.34.1
> >

Thread overview: 16+ messages
2025-10-28  3:58 [PATCH v3 0/7] Pagefault refactor Matthew Brost
2025-10-28  3:58 ` [PATCH v3 1/7] drm/xe: Stub out new pagefault layer Matthew Brost
2025-10-31 13:33   ` Francois Dugast
2025-10-31 16:41     ` Matthew Brost
2025-10-28  3:58 ` [PATCH v3 2/7] drm/xe: Implement xe_pagefault_init Matthew Brost
2025-10-31 13:40   ` Francois Dugast
2025-10-31 16:40     ` Matthew Brost [this message]
2025-10-28  3:58 ` [PATCH v3 3/7] drm/xe: Implement xe_pagefault_reset Matthew Brost
2025-10-28  3:58 ` [PATCH v3 4/7] drm/xe: Implement xe_pagefault_handler Matthew Brost
2025-10-28  3:58 ` [PATCH v3 5/7] drm/xe: Implement xe_pagefault_queue_work Matthew Brost
2025-10-28  3:58 ` [PATCH v3 6/7] drm/xe: Add xe_guc_pagefault layer Matthew Brost
2025-10-28  3:58 ` [PATCH v3 7/7] drm/xe: Remove unused GT page fault code Matthew Brost
2025-10-28  4:05 ` ✗ CI.checkpatch: warning for Pagefault refactor (rev2) Patchwork
2025-10-28  4:06 ` ✓ CI.KUnit: success " Patchwork
2025-10-28  4:46 ` ✗ Xe.CI.BAT: failure " Patchwork
2025-10-28  9:30 ` ✗ Xe.CI.Full: " Patchwork
