From: "Lis, Tomasz" <tomasz.lis@intel.com>
To: Matthew Brost <matthew.brost@intel.com>,
	<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v3 2/3] drm/xe/guc: Track pending-enable source in submission state
Date: Wed, 27 Aug 2025 15:56:59 +0200
Message-ID: <8295cb2d-ded1-4a00-837a-e89d1c3bdf89@intel.com>
In-Reply-To: <20250818172243.2649863-3-matthew.brost@intel.com>


On 8/18/2025 7:22 PM, Matthew Brost wrote:
> Add explicit tracking in the GuC submission state to record the source
> of a pending enable (TDR vs. resume path vs. submission). Disambiguating
> the origin lets the GuC submission state machine apply the correct
> recovery/replay behavior.
>
> This helps VF resume:
s/VF resume/VF restore/
>   when the device comes back, the state machine knows
> whether the pending enable stems from timeout recovery, from a resume
s/a resume/a VF post-migration recovery/
> sequence, or submission and can gate sequencing and fixups accordingly.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_guc_submit.c | 36 ++++++++++++++++++++++++++++++
>   1 file changed, 36 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 1185b23b1384..9e4118126ef9 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -68,6 +68,8 @@ exec_queue_to_guc(struct xe_exec_queue *q)
>   #define EXEC_QUEUE_STATE_BANNED			(1 << 9)
>   #define EXEC_QUEUE_STATE_CHECK_TIMEOUT		(1 << 10)
>   #define EXEC_QUEUE_STATE_EXTRA_REF		(1 << 11)
> +#define EXEC_QUEUE_STATE_PENDING_RESUME		(1 << 12)

You meant RESTORE. But the restore is mostly done on the PF side, and only a
small part remains by the time we raise this flag, so that is not the best
name either.

VF_FIXUPS? VF_RECOVERY? GGTT_FIXUPS?

Maybe VF_FIXUPS is the best fit for this context, as fixups will be applied
both for migration recovery and for VF PM resume. GGTT_FIXUPS is longer and
will likely not match future platforms, where other fixups will likely be
required (we already have compression metadata restore, and it looks like
more are coming).

-Tomasz

> +#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT	(1 << 13)
>   
>   static bool exec_queue_registered(struct xe_exec_queue *q)
>   {
> @@ -219,6 +221,36 @@ static void set_exec_queue_extra_ref(struct xe_exec_queue *q)
>   	atomic_or(EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
>   }
>   
> +static bool __maybe_unused exec_queue_pending_resume(struct xe_exec_queue *q)
> +{
> +	return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_PENDING_RESUME;
> +}
> +
> +static void set_exec_queue_pending_resume(struct xe_exec_queue *q)
> +{
> +	atomic_or(EXEC_QUEUE_STATE_PENDING_RESUME, &q->guc->state);
> +}
> +
> +static void clear_exec_queue_pending_resume(struct xe_exec_queue *q)
> +{
> +	atomic_and(~EXEC_QUEUE_STATE_PENDING_RESUME, &q->guc->state);
> +}
> +
> +static bool __maybe_unused exec_queue_pending_tdr_exit(struct xe_exec_queue *q)
> +{
> +	return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_PENDING_TDR_EXIT;
> +}
> +
> +static void set_exec_queue_pending_tdr_exit(struct xe_exec_queue *q)
> +{
> +	atomic_or(EXEC_QUEUE_STATE_PENDING_TDR_EXIT, &q->guc->state);
> +}
> +
> +static void clear_exec_queue_pending_tdr_exit(struct xe_exec_queue *q)
> +{
> +	atomic_and(~EXEC_QUEUE_STATE_PENDING_TDR_EXIT, &q->guc->state);
> +}
> +
>   static bool exec_queue_killed_or_banned_or_wedged(struct xe_exec_queue *q)
>   {
>   	return (atomic_read(&q->guc->state) &
> @@ -1344,6 +1376,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>   	return DRM_GPU_SCHED_STAT_RESET;
>   
>   sched_enable:
> +	set_exec_queue_pending_tdr_exit(q);
>   	enable_scheduling(q);
>   rearm:
>   	/*
> @@ -1494,6 +1527,7 @@ static void __guc_exec_queue_process_msg_resume(struct xe_sched_msg *msg)
>   		clear_exec_queue_suspended(q);
>   		if (!exec_queue_enabled(q)) {
>   			q->guc->resume_time = RESUME_PENDING;
> +			set_exec_queue_pending_resume(q);
>   			enable_scheduling(q);
>   		}
>   	} else {
> @@ -2065,6 +2099,8 @@ static void handle_sched_done(struct xe_guc *guc, struct xe_exec_queue *q,
>   		xe_gt_assert(guc_to_gt(guc), exec_queue_pending_enable(q));
>   
>   		q->guc->resume_time = ktime_get();
> +		clear_exec_queue_pending_resume(q);
> +		clear_exec_queue_pending_tdr_exit(q);
>   		clear_exec_queue_pending_enable(q);
>   		smp_wmb();
>   		wake_up_all(&guc->ct.wq);
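
For readers following along, the flag lifecycle in the hunks above can be
sketched in user space with C11 atomics. This is only an illustration under
stated assumptions, not the driver code: struct queue, test_state(),
request_enable() and sched_done() are hypothetical names, the bit value of
STATE_PENDING_ENABLE is illustrative, and the sketch folds the source bit and
the pending-enable bit into one atomic OR, whereas the patch sets the source
flag first and then calls enable_scheduling().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative bit values; only the last two mirror the patch. */
#define STATE_PENDING_ENABLE   (1u << 2)
#define STATE_PENDING_RESUME   (1u << 12)
#define STATE_PENDING_TDR_EXIT (1u << 13)

/* Hypothetical stand-in for the per-queue GuC state word. */
struct queue {
	atomic_uint state;
};

/* True if any of the bits in 'bit' is currently set. */
static bool test_state(struct queue *q, unsigned int bit)
{
	return atomic_load(&q->state) & bit;
}

/* Record why the enable was requested and mark the enable as pending. */
static void request_enable(struct queue *q, unsigned int source_bit)
{
	atomic_fetch_or(&q->state, source_bit | STATE_PENDING_ENABLE);
}

/*
 * Completion path: the source is only meaningful while the enable is in
 * flight, so both source bits are dropped together with pending-enable.
 */
static void sched_done(struct queue *q)
{
	atomic_fetch_and(&q->state, ~(STATE_PENDING_RESUME |
				      STATE_PENDING_TDR_EXIT |
				      STATE_PENDING_ENABLE));
}
```

In this sketch a plain submission would call request_enable(q, 0), leaving
both source bits clear, so "neither bit set" identifies the submission path.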
