Intel-XE Archive on lore.kernel.org
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
To: Jonathan Cavitt <jonathan.cavitt@intel.com>,
	<intel-xe@lists.freedesktop.org>
Cc: <saurabhg.gupta@intel.com>, <alex.zuo@intel.com>,
	<michal.wajdeczko@intel.com>, <matthew.d.roper@intel.com>
Subject: Re: [PATCH] drm/xe/xe_guc: Dynamically decide g2g buffer owner
Date: Mon, 27 Oct 2025 14:39:52 -0700
Message-ID: <4c8b3cd7-13bc-4051-b58d-df5c68e45e2b@intel.com>
In-Reply-To: <20251015181159.92509-2-jonathan.cavitt@intel.com>



On 10/15/2025 11:12 AM, Jonathan Cavitt wrote:
> In today's driver, xe_device_get_gt(xe, 0) can never return NULL.
> Hardware-wise there's always at least one tile, and every tile has a
> primary GT.  If something went wrong during init of tile or GT and we
> couldn't create/initialize the structures, then we already aborted the
> device probe immediately and we'll never get further on to places in the
> code that would be chasing a NULL pointer.
>
> However, there's currently ongoing work to allow the primary GT to be
> disabled via configfs for debugging purposes.  Once that lands, it will
> be possible for this query to return a NULL pointer.  This can cause
> problems in guc_g2g_alloc, as this process currently relies on the
> primary GT always being present.
>
> Instead of making the primary GT the g2g buffer owner, make the first
> GuC passed to guc_g2g_alloc the g2g buffer owner.  This requires keeping
> track of the g2g buffer owner in the xe device so each GuC can know if
> it's the owner or not during initialization.
>
> Note that the associated kunit testing code does not need to be updated
> to reflect this change because
> 1. The kunit testing code always runs with the primary GT enabled, and
> 2. Tracking the g2g buffer owner in the xe device is unnecessary to
>     perform the functions of the tests.

I'm not convinced this works without modifying the selftest. If G2G is
already enabled, the selftest destroys the buffer (see g2g_free) so it
can create a new one for testing. Both for the G2G_CTB_TYPE_DEFAULT
subtest and as cleanup at the end, the test calls guc_g2g_alloc()
again. At that point xe->g2g_owner has not been cleared but the BO is
no longer allocated, so guc_g2g_alloc() will try to do an xe_bo_get()
on a bad pointer. I couldn't find the g2g test in the CI results, so I
couldn't confirm whether this is really happening or whether I'm
missing something.
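
The sequence described above can be modeled in userspace. What follows is
only a rough sketch under stated assumptions: model_bo, model_guc,
model_dev, model_g2g_alloc() and model_test_free() are hypothetical
stand-ins, not the xe driver types or functions, and the extra check on the
owner's buffer is just one possible guard, not necessarily the fix the
patch should adopt (clearing the owner in the test's free path would be
another option).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for xe_bo, xe_guc and xe_device. */
struct model_bo { int refcount; };
struct model_guc { struct model_bo *bo; bool owned; };
struct model_dev { struct model_guc *g2g_owner; };

static struct model_bo *model_bo_get(struct model_bo *bo)
{
	/* Taking a reference on a missing buffer is the failure being discussed. */
	assert(bo);
	bo->refcount++;
	return bo;
}

/* Same shape as the patched guc_g2g_alloc(), plus a guard for a stale owner. */
static int model_g2g_alloc(struct model_dev *dev, struct model_guc *guc)
{
	if (guc->bo)
		return 0;

	/* Share only when the recorded owner still holds a buffer. */
	if (dev->g2g_owner && dev->g2g_owner->bo) {
		guc->bo = model_bo_get(dev->g2g_owner->bo);
		guc->owned = false;
		return 0;
	}

	guc->bo = calloc(1, sizeof(*guc->bo));
	if (!guc->bo)
		return -1;

	guc->bo->refcount = 1;
	guc->owned = true;
	dev->g2g_owner = guc;
	return 0;
}

/* Models a teardown that drops the buffers but leaves dev->g2g_owner set. */
static void model_test_free(struct model_guc *gucs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		struct model_bo *bo = gucs[i].bo;

		if (bo && --bo->refcount == 0)
			free(bo);
		gucs[i].bo = NULL;
		gucs[i].owned = false;
	}
}
```

In this model, dropping the `dev->g2g_owner->bo` half of the condition
reproduces the problem: after model_test_free() the owner pointer is still
set, so the next model_g2g_alloc() call reaches model_bo_get() with a NULL
buffer.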

Daniele

> Suggested-by: Matt Roper <matthew.d.roper@intel.com>
> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device_types.h | 4 ++++
>   drivers/gpu/drm/xe/xe_guc.c          | 9 ++++++---
>   2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 02c04ad7296e..e17f8b84b8d4 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -38,6 +38,7 @@ struct dram_info;
>   struct intel_display;
>   struct intel_dg_nvm_dev;
>   struct xe_ggtt;
> +struct xe_guc;
>   struct xe_i2c;
>   struct xe_pat_ops;
>   struct xe_pxp;
> @@ -626,6 +627,9 @@ struct xe_device {
>   	atomic_t g2g_test_count;
>   #endif
>   
> +	/** @g2g_owner: Pointer to the GuC that is the owner of the g2g buffer */
> +	struct xe_guc *g2g_owner;
> +
>   	/* private: */
>   
>   #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index d94490979adc..9790ddad606a 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -465,9 +465,8 @@ static int guc_g2g_alloc(struct xe_guc *guc)
>   	if (guc->g2g.bo)
>   		return 0;
>   
> -	if (gt->info.id != 0) {
> -		struct xe_gt *root_gt = xe_device_get_gt(xe, 0);
> -		struct xe_guc *root_guc = &root_gt->uc.guc;
> +	if (xe->g2g_owner) {
> +		struct xe_guc *root_guc = xe->g2g_owner;
>   		struct xe_bo *bo;
>   
>   		bo = xe_bo_get(root_guc->g2g.bo);
> @@ -492,6 +491,7 @@ static int guc_g2g_alloc(struct xe_guc *guc)
>   	xe_map_memset(xe, &bo->vmap, 0, 0, g2g_size);
>   	guc->g2g.bo = bo;
>   	guc->g2g.owned = true;
> +	xe->g2g_owner = guc;
>   
>   	return 0;
>   }
> @@ -504,6 +504,9 @@ static void guc_g2g_fini(struct xe_guc *guc)
>   	/* Unpinning the owned object is handled by generic shutdown */
>   	if (!guc->g2g.owned)
>   		xe_bo_put(guc->g2g.bo);
> +	/* g2g owner is no longer valid.  Mark as NULL in xe device */
> +	else
> +		guc_to_xe(guc)->g2g_owner = NULL;
>   
>   	guc->g2g.bo = NULL;
>   }


Thread overview: 9+ messages
2025-10-15 18:12 [PATCH] drm/xe/xe_guc: Dynamically decide g2g buffer owner Jonathan Cavitt
2025-10-16  3:26 ` ✓ CI.KUnit: success for " Patchwork
2025-10-16  4:06 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-16 21:14 ` ✗ Xe.CI.Full: failure " Patchwork
2025-10-27 21:39 ` Daniele Ceraolo Spurio [this message]
2025-10-28 14:37   ` [PATCH] " Cavitt, Jonathan
2025-10-28 18:26     ` Daniele Ceraolo Spurio
2025-10-28 18:39       ` Cavitt, Jonathan
2025-10-28 19:21         ` Cavitt, Jonathan
