From: Tomasz Lis <tomasz.lis@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Michał Winiarski" <michal.winiarski@intel.com>,
"Michał Wajdeczko" <michal.wajdeczko@intel.com>,
"Piotr Piórkowski" <piotr.piorkowski@intel.com>,
"Matthew Brost" <matthew.brost@intel.com>,
"Lucas De Marchi" <lucas.demarchi@intel.com>
Subject: [PATCH v1 4/4] drm/xe/vf: Redo LRC creation while in VF fixups
Date: Fri, 6 Feb 2026 15:53:34 +0100
Message-ID: <20260206145334.674679-5-tomasz.lis@intel.com>
In-Reply-To: <20260206145334.674679-1-tomasz.lis@intel.com>

If the xe module inside a VM is creating a new LRC while the VM is
saved and restored, that LRC will be invalid. The fixups procedure may
not reach it, because adding the new LRC reference to an exec queue
races with the fixups.

Testing suggests that even if an LRC created during VM migration is
added to the exec queue in time for the fixups, it may still remain
damaged.

Free the incorrectly created LRC and re-run the creation, but only
after waiting for the default LRC fixups to finish.

Since LRC creation is many times faster than the fixups procedure
(the fixups include a GuC handshake), checking once at the end of LRC
creation is enough to detect fixups running in parallel.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
---
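Note for readers: the create/check/discard/retry flow described above
can be modelled in plain userspace C. The block below is only a sketch
under stated assumptions, not the driver code. struct ctx,
struct vf_state, default_ctx_valid() and init_contexts() are
hypothetical names invented for illustration, and the counter-based
"fixups pending" model stands in for the real recovery worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_WIDTH 4

/* Hypothetical stand-in for an LRC. */
struct ctx {
	bool valid;
};

/* Hypothetical model of VF state, not the driver's data structures. */
struct vf_state {
	int fixups_pending;	/* > 0 while simulated fixups still overlap creation */
	int created;		/* contexts allocated so far */
	int discarded;		/* contexts dropped because fixups overlapped */
};

/*
 * Assumption: each failed check lets the simulated fixups make progress,
 * so the retry loop below always terminates.
 */
static bool default_ctx_valid(struct vf_state *vf)
{
	if (vf->fixups_pending > 0) {
		vf->fixups_pending--;
		return false;
	}
	return true;
}

/* Create 'width' contexts; redo any creation that overlapped fixups. */
static int init_contexts(struct vf_state *vf, struct ctx *slots[], int width)
{
	assert(width <= MAX_WIDTH);

	for (int i = 0; i < width; i++) {
		struct ctx *c = malloc(sizeof(*c));

		if (!c) {
			/* Unwind the slots that were already published. */
			while (i-- > 0) {
				free(slots[i]);
				slots[i] = NULL;
			}
			return -1;
		}
		vf->created++;

		/* Single check at the end of creation. */
		if (!default_ctx_valid(vf)) {
			free(c);
			vf->discarded++;
			i--;		/* the loop increment retries this slot */
			continue;
		}

		c->valid = true;
		slots[i] = c;		/* publish only a context known to be good */
	}
	return 0;
}
```

In this model, starting with fixups_pending = 2 and width = 3 ends with
created == 5 and discarded == 2: the first slot is retried twice before
a context is kept. The index must be a signed type for the i-- and
continue pairing to retry slot 0 in this form.
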
drivers/gpu/drm/xe/xe_exec_queue.c | 6 ++++++
drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 8 ++++++++
drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
3 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 6eb561086e1c..2ebf25a35557 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -316,6 +316,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
 			err = PTR_ERR(lrc);
 			goto err_lrc;
 		}
+		if (!xe_gt_vf_valid_default_lrc(q->gt)) {
+			xe_lrc_put(lrc);
+			i--;
+			continue;
+		}
+
 		/* Pairs with READ_ONCE to xe_exec_queue_contexts_hwsp_rebase */
 		WRITE_ONCE(q->lrc[i], lrc);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
index 1edccee84c76..704c7e083ff4 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
@@ -1504,6 +1504,14 @@ static bool vf_valid_default_lrc(struct xe_gt *gt)
 	return true;
 }
+bool xe_gt_vf_valid_default_lrc(struct xe_gt *gt)
+{
+	if (!IS_SRIOV_VF(gt_to_xe(gt)) ||
+	    !xe_sriov_vf_migration_supported(gt_to_xe(gt)))
+		return true;
+	return vf_valid_default_lrc(gt);
+}
+
 /**
  * xe_gt_sriov_vf_wait_valid_default_lrc() - wait for valid GGTT refs in default LRCs
  * @gt: the &xe_gt
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
index 70232dc38f9a..8c21b8ab2f16 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
@@ -39,6 +39,7 @@ void xe_gt_sriov_vf_print_config(struct xe_gt *gt, struct drm_printer *p);
void xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p);
void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p);
+bool xe_gt_vf_valid_default_lrc(struct xe_gt *gt);
void xe_gt_sriov_vf_wait_valid_default_lrc(struct xe_gt *gt);
#endif
--
2.25.1
Thread overview: 14+ messages
2026-02-06 14:53 [PATCH v1 0/4] drm/xe/vf: Fix exec queue creation during post-migration recovery Tomasz Lis
2026-02-06 14:53 ` [PATCH v1 1/4] drm/xe/queue: Call fini on exec queue creation fail Tomasz Lis
2026-02-06 17:38 ` Matthew Brost
2026-02-06 14:53 ` [PATCH v1 2/4] drm/xe/vf: Avoid LRC being freed while applying fixups Tomasz Lis
2026-02-06 17:46 ` Matthew Brost
2026-02-10 20:16 ` Lis, Tomasz
2026-02-06 14:53 ` [PATCH v1 3/4] drm/xe/vf: Wait for default LRCs fixups before using Tomasz Lis
2026-02-06 18:11 ` Matthew Brost
2026-02-10 20:11 ` Lis, Tomasz
2026-02-18 23:15 ` Lis, Tomasz
2026-02-06 14:53 ` Tomasz Lis [this message]
2026-02-06 14:56 ` ✓ CI.KUnit: success for drm/xe/vf: Fix exec queue creation during post-migration recovery Patchwork
2026-02-06 15:29 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-07 15:42 ` ✗ Xe.CI.FULL: failure " Patchwork