Intel-XE Archive on lore.kernel.org
* [PATCH v8 0/4] VF double migration
@ 2025-12-01  9:50 Satyanarayana K V P
From: Satyanarayana K V P @ 2025-12-01  9:50 UTC
  To: intel-xe; +Cc: Satyanarayana K V P

In a double migration, the VF KMD can be migrated again while the fix-up
for the previous migration is still underway, that is, before it has had
the opportunity to send RESFIX_DONE for that migration.

As a result, two notifications may be sent: the pending fix-up
notification for the first migration and a second notification for the new
migration. Upon receiving the first RESFIX_DONE notification, the GuC
resumes VF submission on the GPU, which can result in undefined behavior
such as system hangs or crashes.

To avoid these hangs, the VF sends a new VF2GUC action,
`VF2GUC_RESFIX_START`, together with a marker. Only when the GuC receives
the same marker with the `VF2GUC_RESFIX_DONE` action does it start
scheduling workloads from the VF.
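
The marker handshake described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
GuC firmware or the xe driver code: the struct and the model_*() helpers
are hypothetical names, and the assumption that a new migration
invalidates the previously announced marker is made here for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-VF state kept by the GuC (illustration only). */
struct guc_vf_model {
	uint8_t expected_marker;  /* marker from the latest RESFIX_START */
	bool marker_valid;        /* a RESFIX_START was seen since the last migration */
	bool submission_enabled;  /* VF workloads are being scheduled */
};

/* Assumption: a migration pauses submission and drops any earlier marker. */
static void model_vf_migrated(struct guc_vf_model *g)
{
	g->submission_enabled = false;
	g->marker_valid = false;
}

/* Models VF2GUC_RESFIX_START: remember the marker of the fix-up in progress. */
static void model_resfix_start(struct guc_vf_model *g, uint8_t marker)
{
	g->expected_marker = marker;
	g->marker_valid = true;
}

/*
 * Models VF2GUC_RESFIX_DONE: resume scheduling only when the marker matches
 * the latest RESFIX_START. Returns true if submission was resumed.
 */
static bool model_resfix_done(struct guc_vf_model *g, uint8_t marker)
{
	if (!g->marker_valid || marker != g->expected_marker)
		return false;  /* stale notification from an earlier fix-up */

	g->marker_valid = false;
	g->submission_enabled = true;
	return true;
}

/* Double migration: the late RESFIX_DONE for the first fix-up must be ignored. */
bool double_migration_scenario(void)
{
	struct guc_vf_model g = { 0 };

	model_vf_migrated(&g);                  /* first migration */
	model_resfix_start(&g, 1);              /* fix-up #1 begins */
	model_vf_migrated(&g);                  /* second migration, fix-up #1 unfinished */
	bool stale = model_resfix_done(&g, 1);  /* late DONE for fix-up #1 */
	bool paused = !g.submission_enabled;    /* submission must still be paused */
	model_resfix_start(&g, 2);              /* fix-up #2 begins */
	bool fresh = model_resfix_done(&g, 2);  /* matching DONE for fix-up #2 */

	return !stale && paused && fresh && g.submission_enabled;
}
```

In this model a rejected RESFIX_DONE leaves VF submission paused, so the
VF has to announce a fresh marker and finish the new fix-up before any
workload is scheduled again.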

---
V7 -> V8:
- Fixed review comments (Michal W).
- Fixed issues reported by checkpatch.pl.
- xe_irq_resume() is moved to vf_post_migration_rearm().
- Renamed vf_post_migration_resfix_start_marker() to
  vf_post_migration_next_resfix_marker().
- Created new wrapper vf_post_migration_resfix_start() for
  vf_resfix_start().
- Updated flow diagram as per review comments.

V6 -> V7:
- Fixed review comments (Michal W).
- Changed the resfix_start marker width to u8.
- Moved XE_GUC_RESPONSE_VF_MIGRATED handling in the
  xe_guc_mmio_send_recv() function to a new patch.

V5 -> V6:
- Fixed review comments (Michal W).
- Updated resfix_done and res_fix_start function names.
- Handled XE_GUC_RESPONSE_VF_MIGRATED error case received from GuC.
- Removed the skip_resfix error when another migration is queued.
- Removed the timeout; the VF KMD now waits indefinitely while
  resfix_stoppers bits are set.
- Created helper macro for WAIT positions.

V4 -> V5:
- Fixed review comments (Michal W).
- Created new function vf_migration_init_late().
- Fixed debug log levels and documentation.
- Moved complete marker logic to vf_post_migration_resfix_start_marker().
- Updated debugfs entries.

V3 -> V4:
- Gated save/restore on GuC version 70.54.0.
- Enabled RESFIX_START by default.
- Updated RESFIX_DONE documentation.

V2 -> V3:
- Fixed review comments (Michal W).
- Updated commit message.
- Fixed CI.BAT issues.
- Added helper function to assert on unsupported GuC versions.
- Added debugfs entries to test VF double migration.

V1 -> V2:
- Squashed "Enable RESFIX start marker only on supported GUC
versions" commit into a single commit. (Matt B)
- Use fault injection for testing VF double migration feature (Matt B).

Satyanarayana K V P (4):
  drm/xe/vf: Enable VF migration only on supported GuC versions
  drm/xe/vf: Introduce RESFIX start marker support
  drm/xe/vf: Requeue recovery on GuC MIGRATION error during VF
    post-migration
  drm/xe/vf: Add debugfs entries to test VF double migration

 .../gpu/drm/xe/abi/guc_actions_sriov_abi.h    |  67 ++++++--
 drivers/gpu/drm/xe/xe_gt_sriov_vf.c           | 153 +++++++++++++-----
 drivers/gpu/drm/xe/xe_gt_sriov_vf_debugfs.c   |  12 ++
 drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h     |  13 ++
 drivers/gpu/drm/xe/xe_guc.c                   |   6 +
 drivers/gpu/drm/xe/xe_sriov_vf.c              |  86 +++++++++-
 6 files changed, 286 insertions(+), 51 deletions(-)

-- 
2.51.0


Thread overview: 11+ messages
2025-12-01  9:50 [PATCH v8 0/4] VF double migration Satyanarayana K V P
2025-12-01  9:39 ` ✓ CI.KUnit: success for VF double migration (rev8) Patchwork
2025-12-01  9:50 ` [PATCH v8 1/4] drm/xe/vf: Enable VF migration only on supported GuC versions Satyanarayana K V P
2025-12-01  9:50 ` [PATCH v8 2/4] drm/xe/vf: Introduce RESFIX start marker support Satyanarayana K V P
2025-12-01 14:25   ` Michal Wajdeczko
2025-12-01  9:50 ` [PATCH v8 3/4] drm/xe/vf: Requeue recovery on GuC MIGRATION error during VF post-migration Satyanarayana K V P
2025-12-01 14:25   ` Michal Wajdeczko
2025-12-01  9:50 ` [PATCH v8 4/4] drm/xe/vf: Add debugfs entries to test VF double migration Satyanarayana K V P
2025-12-01 10:19 ` ✓ Xe.CI.BAT: success for VF double migration (rev8) Patchwork
2025-12-01 11:29 ` ✗ Xe.CI.Full: failure " Patchwork
2025-12-02  5:12   ` K V P, Satyanarayana
