From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>,
<intel-xe@lists.freedesktop.org>
Cc: Matthew Brost <matthew.brost@intel.com>,
Tomasz Lis <tomasz.lis@intel.com>
Subject: Re: [PATCH 1/1] drm/xe/vf: Reset recovery_queued after issuing RESFIX_START
Date: Fri, 5 Dec 2025 21:01:16 +0100
Message-ID: <6f4a0e2d-0bcf-4f19-92a8-bff2949b63c9@intel.com>
In-Reply-To: <20251205082615.154649-4-satyanarayana.k.v.p@intel.com>
On 12/5/2025 9:26 AM, Satyanarayana K V P wrote:
> During VF_RESTORE or VF_RESUME, the GuC sends a migration interrupt and
> clears the RESFIX_START marker. If migration or resume occurs before the
> VF issues its own RESFIX_START, VF KMD may receive two back-to-back
> migration interrupts. VF then sends RESFIX_START to indicate the beginning
> of fixups and RESFIX_DONE to mark completion. However, the second
> RESFIX_START fails because the GuC is already in the RUNNING state.
>
> To prevent VF KMD from queuing additional recovery work items when extra
> interrupts arrive, move the clearing of recovery_queued from
> vf_post_migration_shutdown() to vf_post_migration_resfix_start().
hmm, it's not about moving the code from one function to another, as that is
an implementation detail that is clear from the diff below; rather, we want to
say "clear the flag after sending a RESFIX_START message to ignore duplicated
IRQs seen before we start actual recovery"
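To illustrate the intended wording, here is a rough standalone userspace model
(not driver code; struct vf_model, model_irq() and model_resfix_start() are
hypothetical names, and locking and the teardown flag are left out on purpose):

```c
#include <stdbool.h>

/* Hypothetical model of the VF migration state, for illustration only. */
struct vf_model {
	bool recovery_queued;	/* a recovery work item is already pending */
	int queued_count;	/* number of work items queued so far */
};

/* Migration IRQ: queue recovery work only if none is pending yet. */
static void model_irq(struct vf_model *vf)
{
	if (!vf->recovery_queued) {
		vf->recovery_queued = true;
		vf->queued_count++;
	}
}

/* Worker: RESFIX_START would be sent here, and only then is the flag cleared. */
static void model_resfix_start(struct vf_model *vf)
{
	vf->recovery_queued = false;
}
```

In this model, two back-to-back calls to model_irq() queue a single work item,
and only an IRQ arriving after model_resfix_start() queues another one.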
> This ensures the state is reset only after the fixup process begins,
> avoiding redundant work item queuing.
>
> Fixes: b5fbb94341a2 ("drm/xe/vf: Introduce RESFIX start marker support")
> Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Tomasz Lis <tomasz.lis@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> index 3c806c8e5f3e..90f2ef1772f2 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> @@ -781,7 +781,7 @@ static void vf_start_migration_recovery(struct xe_gt *gt)
>
> spin_lock(&gt->sriov.vf.migration.lock);
>
> - if (!gt->sriov.vf.migration.recovery_queued ||
> + if (!gt->sriov.vf.migration.recovery_queued &&
this is a different fix that deserves its own separate patch with
a proper commit message
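For that separate commit message, the behavioral difference between the two
conditions could be spelled out along these lines (a standalone sketch; the
helper names are made up for illustration and do not exist in the driver):

```c
#include <stdbool.h>

/* Old condition: queue unless recovery is both queued and in teardown. */
static bool should_queue_old(bool recovery_queued, bool recovery_teardown)
{
	return !recovery_queued || !recovery_teardown;
}

/* New condition: queue only if nothing is queued and no teardown is underway. */
static bool should_queue_new(bool recovery_queued, bool recovery_teardown)
{
	return !recovery_queued && !recovery_teardown;
}
```

The two differ exactly when one flag is set and the other is not: with
recovery_queued set and no teardown, the old check queues again while the new
one does not; the same holds when only recovery_teardown is set.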
> !gt->sriov.vf.migration.recovery_teardown) {
> gt->sriov.vf.migration.recovery_queued = true;
> WRITE_ONCE(gt->sriov.vf.migration.recovery_inprogress, true);
> @@ -1171,10 +1171,6 @@ static bool vf_post_migration_shutdown(struct xe_gt *gt)
> return true;
> }
>
> - spin_lock_irq(&gt->sriov.vf.migration.lock);
> - gt->sriov.vf.migration.recovery_queued = false;
> - spin_unlock_irq(&gt->sriov.vf.migration.lock);
> -
> xe_guc_ct_flush_and_stop(&gt->uc.guc.ct);
> xe_guc_submit_pause_vf(&gt->uc.guc);
> xe_tlb_inval_reset(&gt->tlb_inval);
> @@ -1258,7 +1254,15 @@ static int vf_post_migration_resfix_done(struct xe_gt *gt, u16 marker)
>
> static int vf_post_migration_resfix_start(struct xe_gt *gt, u16 marker)
> {
> - return vf_resfix_start(gt, marker);
> + int err;
> +
> + err = vf_resfix_start(gt, marker);
> +
> + spin_lock_irq(&gt->sriov.vf.migration.lock);
> + gt->sriov.vf.migration.recovery_queued = false;
> + spin_unlock_irq(&gt->sriov.vf.migration.lock);
we may want to use

	scoped_guard(spinlock_irq, &gt->sriov.vf.migration.lock)
		gt->sriov.vf.migration.recovery_queued = false;
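For readers not familiar with the pattern, below is a rough userspace analogue
of a scope-based guard, built on the GCC/Clang cleanup attribute. It is only a
sketch: toy_lock, toy_scoped_guard() and the other toy_* names are
hypothetical, the "lock" is a plain counter rather than a real spinlock, and
this is not how the kernel macro is implemented in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: depth counts how many holders there are (illustration only). */
struct toy_lock { int depth; };

static struct toy_lock *toy_guard_enter(struct toy_lock *l)
{
	l->depth++;		/* "acquire" */
	return l;
}

/* Runs automatically when the guard variable goes out of scope. */
static void toy_guard_exit(struct toy_lock **l)
{
	if (*l)
		(*l)->depth--;	/* "release" */
}

/* Runs the following statement once with the lock held, then releases it. */
#define toy_scoped_guard(l) \
	for (struct toy_lock *_guard __attribute__((cleanup(toy_guard_exit))) = \
		toy_guard_enter(l), *_once = _guard; _once; _once = NULL)

struct toy_migration {
	struct toy_lock lock;
	bool recovery_queued;
	int depth_seen;		/* lock depth observed inside the guarded scope */
};

static void toy_clear_recovery_queued(struct toy_migration *m)
{
	toy_scoped_guard(&m->lock) {
		m->depth_seen = m->lock.depth;
		m->recovery_queued = false;
	}
	/* the lock has been released here, with no explicit unlock call */
}
```

The point of the pattern is that the release is tied to the scope, so an early
return or a later edit cannot leave the lock held by accident.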
> +
> + return err;
> }
>
> static u16 vf_post_migration_next_resfix_marker(struct xe_gt *gt)