From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>,
<intel-xe@lists.freedesktop.org>
Cc: Matthew Brost <matthew.brost@intel.com>,
Tomasz Lis <tomasz.lis@intel.com>
Subject: Re: [PATCH v7 3/4] drm/xe/vf: Requeue recovery on GuC MIGRATION error during VF post-migration
Date: Sat, 29 Nov 2025 21:27:36 +0100
Message-ID: <91d0b4c4-8209-44c5-8d1d-8960ca02144a@intel.com>
In-Reply-To: <20251128133052.17120-9-satyanarayana.k.v.p@intel.com>
On 11/28/2025 2:30 PM, Satyanarayana K V P wrote:
> Handle GuC response `XE_GUC_RESPONSE_VF_MIGRATED` as a special case in the
> VF post-migration recovery flow. When this error occurs, it indicates that
> a new migration was detected while the resource fixup process was still in
> progress. Instead of failing immediately, requeue the VF into the recovery
> path to allow proper handling of the new migration event.
>
> This improves robustness of VF recovery in SR-IOV environments where
> migrations can overlap with resource fixup steps.
>
> Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Tomasz Lis <tomasz.lis@intel.com>
>
> ---
> V6 -> V7:
> - New commit.
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 9 +++++++--
> drivers/gpu/drm/xe/xe_guc.c | 6 ++++++
> 2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> index fd7dd4a4739d..937554657440 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> @@ -1256,14 +1256,19 @@ static void vf_post_migration_recovery(struct xe_gt *gt)
> }
>
> err = vf_post_migration_fixups(gt);
> - if (err)
> + if (unlikely(err))

no need to add unlikely() anywhere, this is not a critical path

> goto fail;
>
> vf_post_migration_rearm(gt);
>
> err = vf_post_migration_resfix_done(gt, marker);
> - if (err)
> + if (unlikely(err == -EREMCHG))
> + goto queue;

maybe it's better to code that as:

	if (err) {
		if (err == -EREMCHG)
			goto queue;
		...
	}

> + if (unlikely(err)) {
> + xe_gt_sriov_err(gt, "Recovery failed at GuC RESFIX_DONE step (%pe)\n",
> + ERR_PTR(err));
> goto fail;
shouldn't this err message be part of the earlier patch?
> + }
>
> vf_post_migration_kickstart(gt);
>
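
For illustration, a minimal userspace sketch of the nested form suggested above. Everything in it is a stand-in: resfix_done_stub(), recover() and enum outcome are hypothetical names for this sketch, not driver code. Only the -EREMCHG check and the queue/fail labels mirror the quoted hunk, and what the labels do here (just return a status) is an assumption.

```c
#include <errno.h>
#include <stdio.h>

#ifndef EREMCHG
#define EREMCHG 78	/* fallback for libcs that lack it; value used by most Linux arches */
#endif

/* Hypothetical outcome codes, used only by this sketch. */
enum outcome { RECOVERY_DONE, RECOVERY_REQUEUED, RECOVERY_FAILED };

/* Stand-in for vf_post_migration_resfix_done(): returns a canned errno. */
static int resfix_done_stub(int canned)
{
	return canned;
}

static enum outcome recover(int canned)
{
	int err;

	err = resfix_done_stub(canned);
	if (err) {
		/* A new migration arrived during fixups: requeue, do not fail. */
		if (err == -EREMCHG)
			goto queue;
		fprintf(stderr, "recovery failed at RESFIX_DONE step (%d)\n", err);
		goto fail;
	}
	return RECOVERY_DONE;

queue:
	return RECOVERY_REQUEUED;
fail:
	return RECOVERY_FAILED;
}
```

With this shape the success path has a single test of err, and the special case sits next to the generic failure handling instead of in front of it.
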
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index 88376bc2a483..f0407bab9a0c 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -1484,6 +1484,12 @@ int xe_guc_mmio_send_recv(struct xe_guc *guc, const u32 *request,
> u32 hint = FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, header);
> u32 error = FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, header);
>
> + if (unlikely(error == XE_GUC_RESPONSE_VF_MIGRATED)) {
> + xe_gt_dbg(gt, "GuC mmio request %#x rejected due to MIGRATION (hint %#x)\n",
> + request[0], hint);
> + return -EREMCHG;
> + }
> +
> xe_gt_err(gt, "GuC mmio request %#x: failure %#x hint %#x\n",
> request[0], error, hint);
> return -ENXIO;
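
To make the error mapping in this hunk easier to reason about on its own, here is a small standalone sketch. failure_to_errno() and SKETCH_RESPONSE_VF_MIGRATED are hypothetical names, and the numeric value is a placeholder, not the real XE_GUC_RESPONSE_VF_MIGRATED from the GuC ABI headers. It only shows the idea: one failure code is translated to -EREMCHG so callers can tell it apart, everything else stays -ENXIO.

```c
#include <errno.h>

#ifndef EREMCHG
#define EREMCHG 78	/* fallback for libcs that lack it; value used by most Linux arches */
#endif

/* Placeholder value for this sketch only, not taken from the GuC ABI. */
#define SKETCH_RESPONSE_VF_MIGRATED 0x1234u

/* Translate a failure code from the response header into a negative errno. */
static int failure_to_errno(unsigned int error)
{
	/* Migration detected: report a distinct errno so the caller can requeue. */
	if (error == SKETCH_RESPONSE_VF_MIGRATED)
		return -EREMCHG;

	/* Any other failure keeps the generic error. */
	return -ENXIO;
}
```
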