Intel-XE Archive on lore.kernel.org
From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Matthew Brost <matthew.brost@intel.com>,
	<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v4 20/34] drm/xe/vf: Use GUC_HXG_TYPE_EVENT for GuC context register
Date: Fri, 3 Oct 2025 16:57:07 +0200
Message-ID: <4b3763eb-9b6f-44d7-b597-015bb63ec508@intel.com>
In-Reply-To: <20251002055402.1865880-21-matthew.brost@intel.com>



On 10/2/2025 7:53 AM, Matthew Brost wrote:
> The only case where the GuC submission backend cannot reason 100%
> correctly is when a GuC context is registered during VF post-migration
> recovery. In this scenario, it's possible that the GuC context register
> H2G is processed, but the immediately following schedule-enable H2G gets
> lost.
> 
> A double register is harmless when using `GUC_HXG_TYPE_EVENT`, as GuC
> simply drops the duplicate H2G. To keep things simple, use
> `GUC_HXG_TYPE_EVENT` for all context registrations on VFs.
> 
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_guc_ct.c | 32 ++++++++++++++++++++++++--------
>  1 file changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> index d0fde371fae3..d84de8544532 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> @@ -736,6 +736,26 @@ static u16 next_ct_seqno(struct xe_guc_ct *ct, bool is_g2h_fence)
>  	return seqno;
>  }
>  
> +#define MAKE_ACTION(type, __action)				\
> +({								\
> +	FIELD_PREP(GUC_HXG_MSG_0_TYPE, type) |			\
> +	FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |			\
> +		   GUC_HXG_EVENT_MSG_0_DATA0, __action);	\
> +})
> +
> +static bool vf_action_can_safely_fail(struct xe_device *xe, u32 action)
> +{
> +	/*
> +	 * If we are VF resuming, we can't exactly track if a context

s/resuming/recovering

> +	 * registration has been completed in the GuC state machine, it is

well, we can (by checking whether the H2G was processed), but to "simplify" we don't

> +	 * harmless to resend as it will just fail silently if
> +	 * GUC_HXG_TYPE_EVENT is used.
> +	 */
> +	return IS_SRIOV_VF(xe) &&

maybe also:
		xe_gt_recovery_inprogress(gt) &&

to limit our trick to recovery only?

> +		(action == XE_GUC_ACTION_REGISTER_CONTEXT_MULTI_LRC ||
> +		 action == XE_GUC_ACTION_REGISTER_CONTEXT);
> +}
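
The suggestion above (gate the trick on recovery being in progress) could be sketched roughly as follows. This is a standalone userspace model, not driver code: `struct gt_model` and its two flags are hypothetical stand-ins for `IS_SRIOV_VF(xe)` and `xe_gt_recovery_inprogress(gt)`, the action values are assumed for illustration, and in the real function a `gt` pointer would have to be made available since the quoted version only takes `xe`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state consulted by the check. */
struct gt_model {
	bool is_sriov_vf;          /* models IS_SRIOV_VF(xe) */
	bool recovery_in_progress; /* models xe_gt_recovery_inprogress(gt) */
};

/* Assumed action codes, for illustration only. */
enum {
	ACTION_REGISTER_CONTEXT           = 0x4502,
	ACTION_REGISTER_CONTEXT_MULTI_LRC = 0x4601,
	ACTION_SOMETHING_ELSE             = 0x1000, /* any non-register action */
};

/*
 * Variant of vf_action_can_safely_fail() with the extra recovery gate:
 * only a VF that is currently recovering downgrades context registration
 * to a message type that is allowed to fail silently.
 */
static bool vf_action_can_safely_fail(const struct gt_model *gt, uint32_t action)
{
	return gt->is_sriov_vf &&
	       gt->recovery_in_progress &&
	       (action == ACTION_REGISTER_CONTEXT_MULTI_LRC ||
		action == ACTION_REGISTER_CONTEXT);
}
```

With this shape, a VF outside of recovery keeps the existing FAST_REQUEST path (and its tracking) for context registration.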
> +
>  #define H2G_CT_HEADERS (GUC_CTB_HDR_LEN + 1) /* one DW CTB header and one DW HxG header */
>  
>  static int h2g_write(struct xe_guc_ct *ct, const u32 *action, u32 len,
> @@ -807,18 +827,14 @@ static int h2g_write(struct xe_guc_ct *ct, const u32 *action, u32 len,
>  		FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
>  		FIELD_PREP(GUC_CTB_MSG_0_FENCE, ct_fence_value);
>  	if (want_response) {
> -		cmd[1] =
> -			FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> -			FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
> -				   GUC_HXG_EVENT_MSG_0_DATA0, action[0]);
> +		cmd[1] = MAKE_ACTION(GUC_HXG_TYPE_REQUEST, action[0]);
> +	} else if (vf_action_can_safely_fail(xe, action[0])) {
> +		cmd[1] = MAKE_ACTION(GUC_HXG_TYPE_EVENT, action[0]);
>  	} else {
>  		fast_req_track(ct, ct_fence_value,
>  			       FIELD_GET(GUC_HXG_EVENT_MSG_0_ACTION, action[0]));
>  
> -		cmd[1] =
> -			FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_FAST_REQUEST) |
> -			FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
> -				   GUC_HXG_EVENT_MSG_0_DATA0, action[0]);
> +		cmd[1] = MAKE_ACTION(GUC_HXG_TYPE_FAST_REQUEST, action[0]);
>  	}
>  
>  	/* H2G header in cmd[1] replaces action[0] so: */
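
For reference, the header-type selection in the hunk above can be modelled in plain userspace C. This is a minimal sketch under stated assumptions: the bit layout (type in bits 30:28, action plus data0 in the low 28 bits) and the type encodings are assumed here and should be checked against the GuC ABI headers, and `make_action()` / `pick_type()` are hypothetical names mirroring `MAKE_ACTION()` and the if/else chain in `h2g_write()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed HXG header layout; the driver's ABI headers are authoritative. */
#define HXG_MSG_0_TYPE_SHIFT	28
#define HXG_MSG_0_TYPE_MASK	(0x7u << HXG_MSG_0_TYPE_SHIFT)
#define HXG_MSG_0_LOW_MASK	0x0fffffffu	/* ACTION | DATA0 combined */

enum hxg_type {	/* assumed encodings */
	HXG_TYPE_REQUEST      = 0,
	HXG_TYPE_EVENT        = 1,
	HXG_TYPE_FAST_REQUEST = 2,
};

/* Userspace equivalent of the two FIELD_PREP() calls in MAKE_ACTION(). */
static uint32_t make_action(enum hxg_type type, uint32_t action)
{
	assert((unsigned int)type <= 7);	/* type must fit the 3-bit field */
	return (((uint32_t)type << HXG_MSG_0_TYPE_SHIFT) & HXG_MSG_0_TYPE_MASK) |
	       (action & HXG_MSG_0_LOW_MASK);
}

/*
 * Same ordering as the patched h2g_write(): an explicit response request
 * wins, then the "can safely fail" case uses EVENT (and skips fast request
 * tracking), and everything else stays a FAST_REQUEST.
 */
static enum hxg_type pick_type(bool want_response, bool can_safely_fail)
{
	if (want_response)
		return HXG_TYPE_REQUEST;
	if (can_safely_fail)
		return HXG_TYPE_EVENT;
	return HXG_TYPE_FAST_REQUEST;
}
```

Note that a caller asking for a response still gets a REQUEST even when the action is one of the register-context actions, since that branch is checked first.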




Thread overview: 71+ messages
2025-10-02  5:53 [PATCH v4 00/34] VF migration redesign Matthew Brost
2025-10-02  5:53 ` [PATCH v4 01/34] drm/xe: Add NULL checks to scratch LRC allocation Matthew Brost
2025-10-02 22:02   ` Lis, Tomasz
2025-10-02  5:53 ` [PATCH v4 02/34] Revert "drm/xe/vf: Rebase exec queue parallel commands during migration recovery" Matthew Brost
2025-10-02  5:53 ` [PATCH v4 03/34] Revert "drm/xe/vf: Post migration, repopulate ring area for pending request" Matthew Brost
2025-10-02  5:53 ` [PATCH v4 04/34] Revert "drm/xe/vf: Fixup CTB send buffer messages after migration" Matthew Brost
2025-10-02  5:53 ` [PATCH v4 05/34] drm/xe: Save off position in ring in which a job was programmed Matthew Brost
2025-10-02  5:53 ` [PATCH v4 06/34] drm/xe/guc: Track pending-enable source in submission state Matthew Brost
2025-10-02  5:53 ` [PATCH v4 07/34] drm/xe: Track LR jobs in DRM scheduler pending list Matthew Brost
2025-10-02 16:14   ` Matthew Auld
2025-10-05  5:21     ` Matthew Brost
2025-10-02  5:53 ` [PATCH v4 08/34] drm/xe: Don't change LRC ring head on job resubmission Matthew Brost
2025-10-02 14:15   ` Matthew Auld
2025-10-05  5:25     ` Matthew Brost
2025-10-05  6:53       ` Matthew Brost
2025-10-06  8:59         ` Matthew Auld
2025-10-02  5:53 ` [PATCH v4 09/34] drm/xe: Make LRC W/A scratch buffer usage consistent Matthew Brost
2025-10-02  5:53 ` [PATCH v4 10/34] drm/xe/guc: Document GuC submission backend Matthew Brost
2025-10-03 14:30   ` Lis, Tomasz
2025-10-02  5:53 ` [PATCH v4 11/34] drm/xe/vf: Add xe_gt_recovery_inprogress helper Matthew Brost
2025-10-03  1:39   ` Lis, Tomasz
2025-10-04  4:32     ` Matthew Brost
2025-10-03  8:40   ` Michal Wajdeczko
2025-10-04  4:32     ` Matthew Brost
2025-10-02  5:53 ` [PATCH v4 12/34] drm/xe/vf: Make VF recovery run on per-GT worker Matthew Brost
2025-10-02  5:53 ` [PATCH v4 13/34] drm/xe/vf: Abort H2G sends during VF post-migration recovery Matthew Brost
2025-10-02  5:53 ` [PATCH v4 14/34] drm/xe/vf: Remove memory allocations from VF post migration recovery Matthew Brost
2025-10-02  5:53 ` [PATCH v4 15/34] drm/xe/vf: Close multi-GT GGTT shift race Matthew Brost
2025-10-03 14:24   ` Michal Wajdeczko
2025-10-04  4:36     ` Matthew Brost
2025-10-02  5:53 ` [PATCH v4 16/34] drm/xe/vf: Teardown VF post migration worker on driver unload Matthew Brost
2025-10-02  5:53 ` [PATCH v4 17/34] drm/xe/vf: Don't allow GT reset to be queued during VF post migration recovery Matthew Brost
2025-10-03 16:09   ` Lis, Tomasz
2025-10-02  5:53 ` [PATCH v4 18/34] drm/xe/vf: Wakeup in GuC backend on " Matthew Brost
2025-10-03 14:38   ` Michal Wajdeczko
2025-10-05  6:22     ` Matthew Brost
2025-10-05  6:35       ` Matthew Brost
2025-10-02  5:53 ` [PATCH v4 19/34] drm/xe/vf: Avoid indefinite blocking in preempt rebind worker for VFs supporting migration Matthew Brost
2025-10-02  5:53 ` [PATCH v4 20/34] drm/xe/vf: Use GUC_HXG_TYPE_EVENT for GuC context register Matthew Brost
2025-10-03 14:26   ` Lis, Tomasz
2025-10-05  5:43     ` Matthew Brost
2025-10-03 14:57   ` Michal Wajdeczko [this message]
2025-10-02  5:53 ` [PATCH v4 21/34] drm/xe/vf: Flush and stop CTs in VF post migration recovery Matthew Brost
2025-10-02  5:53 ` [PATCH v4 22/34] drm/xe/vf: Reset TLB invalidations during " Matthew Brost
2025-10-02  5:53 ` [PATCH v4 23/34] drm/xe/vf: Kickstart after resfix in " Matthew Brost
2025-10-02  5:53 ` [PATCH v4 24/34] drm/xe/vf: Start CTs before resfix " Matthew Brost
2025-10-02 21:50   ` Lis, Tomasz
2025-10-03 15:10   ` Michal Wajdeczko
2025-10-05  6:49     ` Matthew Brost
2025-10-05 12:28       ` Michal Wajdeczko
2025-10-02  5:53 ` [PATCH v4 25/34] drm/xe/vf: Abort VF post migration recovery on failure Matthew Brost
2025-10-02  5:53 ` [PATCH v4 26/34] drm/xe/vf: Replay GuC submission state on pause / unpause Matthew Brost
2025-10-02  5:53 ` [PATCH v4 27/34] drm/xe: Move queue init before LRC creation Matthew Brost
2025-10-03 13:25   ` Lis, Tomasz
2025-10-05  8:03     ` Matthew Brost
2025-10-02  5:53 ` [PATCH v4 28/34] drm/xe/vf: Add debug prints for GuC replaying state during VF recovery Matthew Brost
2025-10-03 13:08   ` Lis, Tomasz
2025-10-02  5:53 ` [PATCH v4 29/34] drm/xe/vf: Workaround for race condition in GuC firmware during VF pause Matthew Brost
2025-10-03 13:06   ` Lis, Tomasz
2025-10-02  5:53 ` [PATCH v4 30/34] drm/xe: Use PPGTT addresses for TLB invalidation to avoid GGTT fixups Matthew Brost
2025-10-02  5:53 ` [PATCH v4 31/34] drm/xe/vf: Use primary GT ordered work queue on media GT on PTL VF Matthew Brost
2025-10-02 21:00   ` Lis, Tomasz
2025-10-05  7:03     ` Matthew Brost
2025-10-02  5:54 ` [PATCH v4 32/34] drm/xe/vf: Ensure media GT VF recovery runs after primary GT on PTL Matthew Brost
2025-10-02 20:19   ` Lis, Tomasz
2025-10-02  5:54 ` [PATCH v4 33/34] drm/xe/vf: Rebase CCS save/restore BB GGTT addresses Matthew Brost
2025-10-02  5:54 ` [PATCH v4 34/34] drm/xe/guc: Increase wait timeout to 2sec after BUSY reply from GuC Matthew Brost
2025-10-02  6:45 ` ✗ CI.checkpatch: warning for VF migration redesign (rev4) Patchwork
2025-10-02  6:47 ` ✓ CI.KUnit: success " Patchwork
2025-10-02  7:33 ` ✗ Xe.CI.BAT: failure " Patchwork
2025-10-02  9:19 ` ✗ Xe.CI.Full: " Patchwork
