From: Adam Miszczak <adam.miszczak@linux.intel.com>
To: igt-dev@lists.freedesktop.org
Cc: marcin.bernatowicz@linux.intel.com, kamil.konieczny@linux.intel.com
Subject: [PATCH i-g-t 3/3] tools/vmtb: Add VF migration tests
Date: Thu, 16 Apr 2026 10:35:44 +0200
Message-ID: <20260416083544.2441874-4-adam.miszczak@linux.intel.com>
In-Reply-To: <20260416083544.2441874-1-adam.miszczak@linux.intel.com>

Introduce a comprehensive VF migration (state save/restore) test suite,
covering the following scenarios:
- idle migration: no GPU workload active during save/restore
- idle app migration: GPU contexts created but idle during save/restore
- busy migration (WSIM): short/long preemptable and non-preemptable batches
- busy migration (IGT): xe_exec_reset, xe_exec_threads, xe_ccs,
  xe_compute_preempt workloads in multiple variants
- double migration: re-migration while post-restore resource fixup
  (resfix) is in progress, tested at a few KMD debug checkpoints
- checkpointing: restore a VM state saved at an earlier point in time
- auxiliary: basic pause/resume exercise, migrate without VF driver loaded

Tests support execution in a variety of configuration variants:
VF/VM counts, auto provisioning or vGPU profiles, and scheduling modes.

Signed-off-by: Adam Miszczak <adam.miszczak@linux.intel.com>
---
 tools/vmtb/vmm_flows/test_migration.py | 1199 ++++++++++++++++++++++++
 1 file changed, 1199 insertions(+)
 create mode 100644 tools/vmtb/vmm_flows/test_migration.py
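
Note on the IGT loop calibration used by the busy migration tests: a short
IGT subtest has to keep running past the pre-save wait (IGT_INIT_DELAY), so
the repeat count is derived from one measured run. The following is a minimal
standalone sketch of that arithmetic, not code from the patch. The helper name
calibrate_repeats and the positive-time guard are assumptions added for
illustration; the constant and the formula mirror __calibrate_igt_wl().

```python
IGT_INIT_DELAY = 6  # seconds between workload start and VM pause, as in the patch


def calibrate_repeats(exec_time_s: float, init_delay_s: int = IGT_INIT_DELAY) -> int:
    """Return how many times to loop a subtest so it outlasts the pre-save wait.

    Hypothetical standalone helper; not part of the patch.
    """
    if exec_time_s <= 0:
        # Guard added for this sketch only; the patch does not check for it.
        raise ValueError('exec_time_s must be positive')
    # Subtests that already run for the delay plus a 2 s margin execute once.
    if exec_time_s >= init_delay_s + 2:
        return 1
    # Shorter subtests loop for roughly twice the pre-save wait.
    return int(init_delay_s * 2 / exec_time_s) + 1


if __name__ == '__main__':
    for run_time in (0.5, 6.5, 12.5):
        print(f'{run_time}s per run -> {calibrate_repeats(run_time)} iteration(s)')
```

With these inputs a 0.5 s subtest loops 25 times, a 6.5 s subtest twice and a
12.5 s subtest once, which is in line with the per-workload timing comments in
test_migration.py.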

diff --git a/tools/vmtb/vmm_flows/test_migration.py b/tools/vmtb/vmm_flows/test_migration.py
new file mode 100644
index 000000000..8a3f10d52
--- /dev/null
+++ b/tools/vmtb/vmm_flows/test_migration.py
@@ -0,0 +1,1199 @@
+# SPDX-License-Identifier: MIT
+# Copyright © 2024-2026 Intel Corporation
+
+import enum
+import logging
+import random
+import time
+from dataclasses import dataclass
+from typing import List, Tuple
+
+import pytest
+
+from bench import exceptions
+from bench.configurators.vgpu_profile_config import VfProvisioningMode, VfSchedulingMode
+from bench.executors.gem_wsim import ONE_CYCLE_DURATION_MS, PREEMPT_10MS_WORKLOAD, GemWsim
+from bench.executors.igt import IgtExecutor, IgtType
+from bench.executors.shell import ShellExecutor
+from bench.helpers.helpers import (cmd_run_check, driver_check,
+                                   duplicate_vm_image, igt_check,
+                                   igt_run_check, modprobe_driver_run_check)
+from bench.machines.host import Host
+from bench.machines.virtual.vm import VirtualMachine
+from vmm_flows.conftest import (VmmTestingConfig, VmmTestingSetup,
+                                idfn_test_config)
+
+logger = logging.getLogger(__name__)
+
+IGT_INIT_DELAY = 6 # Time between WL start and VM pause (pre-save)
+IGT_RESTORE_DELAY = 3 # Time between VM resume and WL status check (post-restore)
+MS_IN_SEC = 1000
+
+
+# Full configuration variant: 1xVF, 2xVF and MAXxVF with auto and vGPU profiles provisioning
+# TODO: add max VFs variants
+test_variants_full = [(1, VfProvisioningMode.AUTO, VfSchedulingMode.DEFAULT_PROFILE),
+                      (2, VfProvisioningMode.AUTO, VfSchedulingMode.DEFAULT_PROFILE),
+                      (1, VfProvisioningMode.VGPU_PROFILE, VfSchedulingMode.DEFAULT_PROFILE),
+                      (2, VfProvisioningMode.VGPU_PROFILE, VfSchedulingMode.DEFAULT_PROFILE)]
+
+
+# Basic configuration variant: 1xVF and 2xVF with auto provisioning
+test_variants_basic = [(1, VfProvisioningMode.AUTO, VfSchedulingMode.DEFAULT_PROFILE),
+                       (2, VfProvisioningMode.AUTO, VfSchedulingMode.DEFAULT_PROFILE)]
+
+
+# vGPU profiles configuration variant: 1xVF and 2xVF with vGPU profiles provisioning
+test_variants_profiles = [(1, VfProvisioningMode.VGPU_PROFILE, VfSchedulingMode.DEFAULT_PROFILE),
+                          (2, VfProvisioningMode.VGPU_PROFILE, VfSchedulingMode.DEFAULT_PROFILE)]
+
+
+@dataclass
+class MigrationWorkloadWsim:
+    workload_file: str # Wsim workload descriptor file
+    num_clients: int # Fork N clients emitting the workload simultaneously
+    num_repeats: int # How many times to emit the workload
+
+    def __str__(self) -> str:
+        return f'WL:{self.workload_file}-(C:{self.num_clients} R:{self.num_repeats})'
+
+
+# VF busy migration WSIM workloads (payload for TestBusyMigrationWsim[N]):
+wsim_idle_app = MigrationWorkloadWsim('idle_ctxs', 1, 1)
+wsim_short_preempt = MigrationWorkloadWsim('short_preempt', 1, 4000) # 5ms * 4000 (20s)
+wsim_short_nonpreempt = MigrationWorkloadWsim('short_nonpreempt', 1, 4000)
+wsim_long_preempt = MigrationWorkloadWsim('long_preempt', 1, 200) # 100ms * 200 (20s)
+wsim_long_nonpreempt = MigrationWorkloadWsim('long_nonpreempt', 1, 200)
+
+
+@dataclass
+class MigrationWorkloadIgt:
+    igt_test: IgtType # IGT test type
+    num_repeats: int = 1 # Number of repeats of the IGT test (calibrated in runtime)
+
+    def __str__(self) -> str:
+        return f'WL:{self.igt_test}'
+
+# VF busy migration IGT workloads (payload for TestBusyMigrationIgt[M]):
+# xe_exec_reset/long_spin subtests:
+# Average exec time: 12-13s - execute 1x
+igt_exec_reset_long_spin_many_preempt = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_MANY_PREEMPT)
+igt_exec_reset_long_spin_many_preempt_media = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_MANY_PREEMPT_MEDIA)
+igt_exec_reset_long_spin_many_preempt_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_MANY_PREEMPT_THREADS)
+igt_exec_reset_long_spin_many_preempt_gt0_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_MANY_PREEMPT_GT0_THREADS)
+# Average exec time: 6-7s - execute 2x
+igt_exec_reset_long_spin_many_preempt_gt1_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_MANY_PREEMPT_GT1_THREADS)
+
+# Average exec time: 12-13s - execute 1x
+igt_exec_reset_long_spin_reuse_many_preempt = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_REUSE_MANY_PREEMPT)
+igt_exec_reset_long_spin_reuse_many_preempt_media = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_REUSE_MANY_PREEMPT_MEDIA)
+igt_exec_reset_long_spin_reuse_many_preempt_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_REUSE_MANY_PREEMPT_THREADS)
+igt_exec_reset_long_spin_reuse_many_preempt_gt0_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_REUSE_MANY_PREEMPT_GT0_THREADS)
+# Average exec time: 6-7s - execute 2x
+igt_exec_reset_long_spin_reuse_many_preempt_gt1_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_REUSE_MANY_PREEMPT_GT1_THREADS)
+
+# Average exec time: 12-13s - execute 1x
+igt_exec_reset_long_spin_sys_reuse_many_preempt_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_SYS_REUSE_MANY_PREEMPT_THREADS)
+igt_exec_reset_long_spin_comp_reuse_many_preempt_threads = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_LONG_SPIN_COMP_REUSE_MANY_PREEMPT_THREADS)
+
+# xe_exec_reset/cancel subtests:
+# Average exec time: 5-7s - execute 2x
+igt_exec_reset_cancel = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_CANCEL)
+igt_exec_reset_cancel_preempt = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_CANCEL_PREEMPT)
+# Average exec time: 10-15s - execute 1x
+igt_exec_reset_cancel_timeslice_preempt = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_CANCEL_TIMESLICE_PREEMPT)
+# Average exec time: 20-25s - execute 1x
+igt_exec_reset_cancel_timeslice_many_preempt = MigrationWorkloadIgt(
+    IgtType.EXEC_RESET_CANCEL_TIMESLICE_MANY_PREEMPT)
+
+# xe_exec_threads subtests (short, execute in a loop):
+# Average exec time: <500ms
+igt_exec_threads_basic = MigrationWorkloadIgt(
+    IgtType.EXEC_THREADS_BASIC)
+igt_exec_threads_bal_basic = MigrationWorkloadIgt(
+    IgtType.EXEC_THREADS_BAL_BASIC)
+# Average exec time: 1-2s
+igt_exec_threads_cm_userptr_invalidate = MigrationWorkloadIgt(
+    IgtType.EXEC_THREADS_CM_USERPTR_INVALIDATE)
+igt_exec_threads_bal_mixed_userptr_invalidate = MigrationWorkloadIgt(
+    IgtType.EXEC_THREADS_BAL_MIXED_USERPTR_INVALIDATE)
+# Average exec time: 1-4s
+igt_exec_threads_many_queues = MigrationWorkloadIgt(
+    IgtType.EXEC_THREADS_MANY_QUEUES)
+
+# xe_ccs subtest (short, execute in a loop):
+# Average exec time: 200-600ms
+igt_ccs_block_copy_compressed = MigrationWorkloadIgt(
+    IgtType.CCS_BLOCK_COPY_COMPRESSED)
+
+# xe_compute_preempt subtest (short, execute in a loop):
+# Average exec time: 1.8-2s
+igt_compute_preempt_many = MigrationWorkloadIgt(
+    IgtType.COMPUTE_PREEMPT_MANY)
+
+
+class BaseTestBusyMigration:
+    """Base class for busy migration tests (with workload executed).
+
+    The class provides implementation for VF save and restore subtests,
+    supports parametrization with different numbers of VMs and various IGT workload types.
+
+    Dedicated for inheritance by separate child test classes with specific workload setup
+    to avoid bulk dynamic test variants execution with the same VM setup.
+    """
+
+    # State save result flag: executing test_restore depends on prior test_save success
+    test_save_failed = True
+
+    def __calibrate_igt_wl(self, vm: VirtualMachine, igt_wl: MigrationWorkloadIgt):
+        logger.info("Starting %s test loop calibration for migration workload", igt_wl.igt_test)
+        igt_exec = IgtExecutor(vm, igt_wl.igt_test)
+        assert igt_exec.check_results(), 'Calibration IGT run failed'
+
+        results_log = igt_exec.get_results_log()
+        igt_exec_time: float = round(results_log['time_elapsed']['end'] - results_log['time_elapsed']['start'], 3)
+
+        # Adjust IGT workload loop to execute longer than pre-save wait (with additional margins)
+        if igt_exec_time < IGT_INIT_DELAY + 2:
+            igt_wl.num_repeats = int(IGT_INIT_DELAY * 2 / igt_exec_time) + 1
+
+        logger.debug("Calibrated IGT workload loop: %s iteration(s) x ~%ss", igt_wl.num_repeats, igt_exec_time)
+
+    @pytest.fixture(scope='class', name='run_source_workload')
+    def fixture_run_source_workload(self, setup_vms, set_migration_wl):
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0] # First VM as source
+        migration_wl = set_migration_wl # WSIM/IGT Workload variant
+
+        if isinstance(migration_wl, MigrationWorkloadWsim):
+            wsim_file_path = ts.wsim_wl_dir / f'{migration_wl.workload_file}.wsim' # Workload descriptor file path
+            if not wsim_file_path.exists():
+                logger.error("gem_wsim workload file %s not available!", wsim_file_path)
+                raise exceptions.GemWsimError(f'gem_wsim workload file {wsim_file_path} not available!')
+
+            # Run IGT wsim workload in pre-migration and check completion in post-migration
+            return GemWsim(vm_src, migration_wl.num_clients, migration_wl.num_repeats, workload=wsim_file_path)
+
+        if isinstance(migration_wl, MigrationWorkloadIgt):
+            self.__calibrate_igt_wl(vm_src, migration_wl)
+            return IgtExecutor(vm_src, migration_wl.igt_test, migration_wl.num_repeats)
+
+        logger.error("Invalid workload type passed to run_source_workload fixture")
+        raise exceptions.BenchError('Invalid workload type passed to run_source_workload fixture')
+
+    @pytest.fixture(scope='function', name='setup_destination_vm')
+    def fixture_setup_destination_vm(self, setup_vms):
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0]  # First VM as a source
+        vm_dst: VirtualMachine = ts.get_vm[-1] # Last VM as a destination
+        num_vms = ts.get_num_vms()
+
+        if num_vms == 1:
+            logger.debug("Single VM: the same source and destination VM instance")
+            assert vm_src == vm_dst
+            return vm_dst
+
+        logger.debug("Multiple VMs: reload destination VM with the source image (with state snapshot)")
+
+        if vm_src.is_running():
+            # QMP 'quit' is used for paused VM (cannot be powered off via guest-agent)
+            vm_src.quit()
+
+        if vm_dst.is_running():
+            vm_dst.quit()
+            while vm_dst.is_running():
+                time.sleep(1) # VM usually doesn't terminate immediately
+
+        # Re-start destination VM with an image containing a state snapshot
+        vm_dst.set_migration_source(vm_src.image)
+        vm_dst.poweron()
+
+        return vm_dst
+
+    def test_save(self, setup_vms, run_source_workload):
+        logger.info("Test VM busy migration: state save")
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0] # First VM as source
+
+        logger.debug("Execute migration in-flight workload on source VM")
+        migration_wl = run_source_workload
+        time.sleep(IGT_INIT_DELAY)
+        assert migration_wl.is_running(), 'IGT/wsim migration workload is not running on source VM'
+
+        # Pause VM and save snapshot
+        logger.debug("Pause execution and save source VM state")
+        try:
+            vm_src.pause()
+            vm_src.save_state()
+        except exceptions.GuestError as exc:
+            logger.error("State save error: %s", exc)
+            assert False, 'VF migration failed on save'
+
+        logger.debug("Resume execution on source VM")
+        vm_src.resume()
+
+        assert migration_wl.check_results(), 'VF migration workload failed on source VM (post-save)'
+
+        if ts.get_num_vms() > 1:
+            logger.debug("Multiple VMs: shutdown source VM")
+            vm_src.poweroff()
+
+        BaseTestBusyMigration.test_save_failed = False
+
+    def test_restore(self, setup_vms, setup_destination_vm, run_source_workload):
+        logger.info("Test VM busy migration: state restore")
+        if BaseTestBusyMigration.test_save_failed:
+            logger.error("State save failed - restore is pointless (fail immediately)")
+            assert False, 'test_save subtest failed - do not execute test_restore'
+
+        ts: VmmTestingSetup = setup_vms
+        vm_dst: VirtualMachine = setup_destination_vm
+        migration_wl = run_source_workload # Get an instance of the IGT WL started in a save test
+
+        # Patch the source IgtExecutor/GemWsim instance with the current VM
+        migration_wl.target = vm_dst
+        if isinstance(migration_wl, IgtExecutor):
+            # Clear IGT test results cache - remove post-save source VM results
+            # TODO: implement common IgtExecutor/GemWsim results clear interface to avoid instance type check
+            migration_wl.results.clear()
+
+        # Load the source state snapshot
+        logger.debug("Restore source state on the destination VM")
+        vm_dst.load_state()
+        vm_dst.resume()
+
+        # TODO: add sync to VM class
+        sync_value = random.randint(1, 0xFFFF)
+        assert vm_dst.ga.sync(sync_value)['return'] == sync_value
+
+        assert migration_wl.is_running(), 'IGT/wsim migration workload is not running on destination VM'
+        time.sleep(IGT_RESTORE_DELAY)
+
+        assert migration_wl.check_results(), 'VF migration workload failed on destination VM (post-restore)'
+
+        logger.debug("Check driver health on host and destination VM")
+        assert driver_check(ts.host)
+        assert driver_check(vm_dst)
+
+
+@pytest.fixture(scope='class', name='set_migration_wl')
+def fixture_set_migration_wl(request):
+    """Set IGT/wsim descriptor file used as a migration workload in a TestBusyMigration[WL]."""
+    # Wsim workload variant provided as MigrationWorkload data class instance
+    return request.param
+
+
+def idfn_workload(workload: MigrationWorkloadWsim):
+    """Add workload name to a test config ID in parametrized tests
+    (e.g. test_something[2VF-WL:workload_type-C:n-R:m]).
+    """
+    return str(workload)
+
+
+def set_test_config(test_variants: List[Tuple[int, VfProvisioningMode, VfSchedulingMode]],
+                    max_vms: int = 2, wa_reduce_vf_lmem: bool = False) -> List[VmmTestingConfig]:
+    """Helper function to provide a parametrized test with a list of test configuration variants."""
+    test_configs: List[VmmTestingConfig] = []
+
+    for config in test_variants:
+        (num_vfs, provisioning_mode, scheduling_mode) = config
+        test_configs.append(VmmTestingConfig(num_vfs, max_vms, provisioning_mode, scheduling_mode,
+                                             wa_reduce_vf_lmem=wa_reduce_vf_lmem))
+
+    return test_configs
+
+
+# Busy migration TCs with WSIM workload
+@pytest.mark.parametrize('set_migration_wl', [wsim_short_preempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_full), ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationWsim1(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing short (5ms) preemptable batches.
+
+    IGT/WSIM workload initiated pre-migration starts firing short submissions on each engine and
+    during the execution VM state is migrated (VM state snapshot is saved, then restored).
+    In the post-migration some additional batches are submitted.
+    Executed in the following VM number variants:
+    - single VF/VM: same VM acts as a source and destination.
+    - multiple VFs/VMs: the workload execution is initiated on the source VM,
+      then migrated and verified on the other, destination one.
+    """
+
+
+@pytest.mark.parametrize('set_migration_wl', [wsim_short_nonpreempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationWsim2(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing short (5ms) non-preemptable batches.
+    Similar to TestBusyMigrationWsim1 subtest, but emits non-preemptable batches.
+    """
+
+
+@pytest.mark.parametrize('set_migration_wl', [wsim_long_preempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationWsim3(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing quite long (100ms) but preemptable batches.
+
+    IGT/WSIM workload initiated pre-migration starts firing relatively long submissions and
+    during the execution VM state is migrated (VM state snapshot is saved, then restored).
+    In the post-migration some additional batches are submitted.
+    Executed in the following VM number variants:
+    - single VF/VM: same VM acts as a source and destination.
+    - multiple VFs/VMs: the workload execution is initiated on the source VM,
+      then migrated and verified on the other, destination one.
+    """
+
+
+# TODO: convert to negative scenario.
+# Test is expected to fail because non-preemptable workload execution time > PT (VLK-81241)
+@pytest.mark.parametrize('set_migration_wl', [wsim_long_nonpreempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationWsim4(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing quite long (100ms) non-preemptable batches.
+    Similar to TestBusyMigrationWsim3 subtest, but emits non-preemptable batches.
+    """
+
+
+@pytest.mark.parametrize('set_migration_wl', [wsim_idle_app],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestIdleAppMigration(BaseTestBusyMigration):
+    """Save-restore VM state with an idle VF but user application attached (contexts created).
+
+    IGT/WSIM workload initiated pre-migration creates multiple user contexts and
+    does short submission on each but is idle during a save-restore operation,
+    then resumes post-migration to do more submissions on previously created contexts.
+    Executed in the following VM number variants:
+    - single VF/VM: same VM acts as a source and destination.
+    - multiple VFs/VMs: the workload execution is initiated on the source VM,
+      then migrated and verified on the other, destination one.
+    """
+
+# Busy migration TCs with IGT workload
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_many_preempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset1(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-many-preempt."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_many_preempt_media],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset2(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-many-preempt-media."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_many_preempt_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_full),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset3(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-many-preempt-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_many_preempt_gt0_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset4(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-many-preempt-gt0-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_many_preempt_gt1_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset5(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-many-preempt-gt1-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_reuse_many_preempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset6(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-reuse-many-preempt."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_reuse_many_preempt_media],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset7(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-reuse-many-preempt-media."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_reuse_many_preempt_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_full),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset8(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-reuse-many-preempt-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_reuse_many_preempt_gt0_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset9(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-reuse-many-preempt-gt0-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_reuse_many_preempt_gt1_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset10(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-reuse-many-preempt-gt1-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_sys_reuse_many_preempt_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_full),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset11(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-sys-reuse-many-preempt-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_long_spin_comp_reuse_many_preempt_threads],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_full),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset12(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@long-spin-comp-reuse-many-preempt-threads."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_cancel],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset13(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@cancel."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_cancel_preempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset14(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@cancel-preempt."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_cancel_timeslice_preempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset15(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@cancel-timeslice-preempt."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_reset_cancel_timeslice_many_preempt],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecReset16(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_reset@cancel-timeslice-many-preempt."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_threads_basic],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_full),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecThreads1(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_threads@threads-basic."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_threads_bal_basic],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecThreads2(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_threads@threads-bal-basic."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_threads_cm_userptr_invalidate],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecThreads3(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_threads@threads-cm-userptr-invalidate."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_threads_many_queues],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_full),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecThreads4(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_threads@threads-many-queues."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_exec_threads_bal_mixed_userptr_invalidate],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtExecThreads5(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_exec_threads@threads-bal-mixed-userptr-invalidate."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_ccs_block_copy_compressed],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtCcs(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_ccs@block-copy-compressed."""
+
+
+@pytest.mark.parametrize('set_migration_wl', [igt_compute_preempt_many],
+                         ids=idfn_workload, indirect=['set_migration_wl'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestBusyMigrationIgtComputePreempt(BaseTestBusyMigration):
+    """Save-restore VM state with VF busy executing IGT xe_compute_preempt@compute-preempt-many (CCS path)."""
+
+
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestIdleMigration:
+    """Save-restore VM state with an idle VF and no user application attached.
+
+    IGT workload initiated and ended twice: pre- and post-migration, but not executing during a save-restore operation.
+    Test setup:
+    - NxVFs running NxVM instances (first VM[0] acts as the source, last VM[N-1] as the destination)
+    - platform provisioned with the relevant vGPU profile M[N] (ATSM, ADLP) or C[N] (PVC)
+    - VF state is saved on the source VM and then restored on the destination VM instance
+      (in case of a single VF variant, source and destination are the same VM instance)
+    """
+
+    @pytest.fixture(scope='function', name='setup_destination_vm')
+    def fixture_setup_destination_vm(self, setup_vms):
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0]  # First VM as a source
+        vm_dst: VirtualMachine = ts.get_vm[-1] # Last VM as a destination
+        num_vms = ts.get_num_vms()
+
+        if num_vms == 1:
+            logger.debug("Single VM: the same source and destination VM instance")
+            assert vm_src == vm_dst
+            return vm_dst
+
+        logger.debug("Multiple VMs: reload destination VM with the source image (with state snapshot)")
+
+        if vm_src.is_running():
+            # QMP 'quit' is used for paused VM (cannot be powered off via guest-agent)
+            vm_src.quit()
+
+        if vm_dst.is_running():
+            vm_dst.quit()
+            while vm_dst.is_running():
+                time.sleep(1) # VM usually doesn't terminate immediately
+
+        # Re-start destination VM with an image containing a state snapshot
+        vm_dst.set_migration_source(vm_src.image)
+        vm_dst.poweron()
+
+        return vm_dst
+
+    def test_save(self, setup_vms):
+        logger.info("Test VM idle migration: state save")
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0] # First VM as source
+
+        # Run an interactive, non-returning program (such as vim) to verify state after migration
+        src_proc = ShellExecutor(vm_src, 'vim migrate.txt')
+        source_proc = vm_src.execute_status(src_proc.pid)
+        logger.debug("Source process: %s", source_proc)
+        assert source_proc.exited is False, 'Source process is not running'
+
+        logger.debug("Execute pre-migration workload on source VM")
+        assert igt_run_check(vm_src, IgtType.EXEC_STORE)
+
+        # Pause VM and save snapshot
+        logger.debug("Pause execution and save VM state")
+        try:
+            vm_src.pause()
+            vm_src.save_state()
+        except exceptions.GuestError as exc:
+            logger.error("State save error: %s", exc)
+            assert False, 'VF migration failed on save'
+
+    def test_restore(self, setup_vms, setup_destination_vm):
+        logger.info("Test VM idle migration: state restore")
+        ts: VmmTestingSetup = setup_vms
+        vm_dst: VirtualMachine = setup_destination_vm
+
+        # Load the source state snapshot
+        logger.debug("Restore source state on the destination VM")
+        vm_dst.load_state()
+        vm_dst.resume()
+
+        # Verify program initiated on source VM is still running after migration
+        pgrep_dst = ShellExecutor(vm_dst, 'pgrep -f "vim migrate.txt"')
+        pgrep_dst_result = vm_dst.execute_wait(pgrep_dst.pid)
+        assert pgrep_dst_result.exit_code == 0, 'Source process (vim) not found'
+        restored_proc = vm_dst.execute_status(int(pgrep_dst_result.stdout))
+        logger.debug("Restored process: %s", restored_proc)
+        assert restored_proc.exited is False, 'Restored process is not running'
+
+        logger.debug("Execute post-migration workload on destination VM")
+        assert igt_run_check(vm_dst, IgtType.EXEC_STORE)
+
+        logger.debug("Check driver health on host and destination VM")
+        assert driver_check(ts.host)
+        assert driver_check(vm_dst)
+
+
+class ResfixWaitStage(enum.IntEnum):
+    # Resfix stopper checkpoints
+    VF_MIGRATION_CONTINUE = 0
+    VF_MIGRATION_WAIT_BEFORE_RESFIX_START = 1 << 0
+    VF_MIGRATION_WAIT_BEFORE_FIXUPS = 1 << 1
+    VF_MIGRATION_WAIT_BEFORE_RESTART_JOBS = 1 << 2
+    VF_MIGRATION_WAIT_BEFORE_RESFIX_DONE = 1 << 3
+
+
+class MigrationToRestore(enum.Enum):
+    FIRST = 1
+    SECOND = 2
+
+
+@dataclass
+class DoubleMigrationConfig:
+    resfix_stoppers: ResfixWaitStage # Stage for migration RESFIX stop
+    migration_to_restore: MigrationToRestore # Migration snapshot to be restored after doubled save
+
+    def __str__(self) -> str:
+        return f'RS:{hex(self.resfix_stoppers)}-MR:{self.migration_to_restore}'
+
+
+double_migration_1_resfix_1 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_RESFIX_START, MigrationToRestore.FIRST)
+double_migration_1_resfix_2 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_FIXUPS, MigrationToRestore.FIRST)
+double_migration_1_resfix_3 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_RESTART_JOBS, MigrationToRestore.FIRST)
+double_migration_1_resfix_4 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_RESFIX_DONE, MigrationToRestore.FIRST)
+
+
+double_migration_2_resfix_1 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_RESFIX_START, MigrationToRestore.SECOND)
+double_migration_2_resfix_2 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_FIXUPS, MigrationToRestore.SECOND)
+double_migration_2_resfix_3 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_RESTART_JOBS, MigrationToRestore.SECOND)
+double_migration_2_resfix_4 = DoubleMigrationConfig(
+    ResfixWaitStage.VF_MIGRATION_WAIT_BEFORE_RESFIX_DONE, MigrationToRestore.SECOND)
+
+class BaseTestDoubleMigration:
+    """Base class for double migration tests.
+    Test scenario triggers VF re-migrate while the initial restore (resources fixup) is still ongoing.
+
+    Save-load and immediately save again before the initial migration completes (prior to resfix done).
+    Post migration resources fixup is delayed via KMD debug hook to initiate the 2nd save.
+    Tests Xe KMD corner case where two migration notifications must be handled.
+    IGT/WSIM workload is executing during the migration (started prior to 1st save).
+
+    The class provides implementation for VF save and restore-save subtests,
+    and supports parametrization with different numbers of VMs and double migration scenario variants.
+
+    Intended to be inherited by child test classes with specific
+    double migration scenario configurations:
+    - stopping RESFIX at a different stage
+    - restoring from the initial (1st) or the later (2nd) migration
+    """
+
+    def __set_debugfs_resfix_stoppers(self, vm: VirtualMachine, stage: ResfixWaitStage):
+        """Set resfix_stoppers:
+        predefined checkpoints that allow the migration process to pause at specific stages.
+        While a checkpoint bit is set, the corresponding stage waits in 1-second iterations
+        until that bit is cleared.
+        Debug hook path: /sys/kernel/debug/dri/<card>/gt0/vf/resfix_stoppers
+        """
+        vf_driver = vm.get_dut().driver
+        vf_driver.write_debugfs(f'{vf_driver.debugfs_path}/gt0/vf/resfix_stoppers', str(int(stage)))
+
+        resfix_stoppers = vf_driver.read_debugfs(f'{vf_driver.debugfs_path}/gt0/vf/resfix_stoppers').strip()
+        logger.debug("[%s] Set migration resfix stoppers: %s (%s)"
+                     "\nPause checkpoints:"
+                     "\n\tVF_MIGRATION_WAIT_BEFORE_RESFIX_START: BIT(0)"
+                     "\n\tVF_MIGRATION_WAIT_BEFORE_FIXUPS: BIT(1)"
+                     "\n\tVF_MIGRATION_WAIT_BEFORE_RESTART_JOBS: BIT(2)"
+                     "\n\tVF_MIGRATION_WAIT_BEFORE_RESFIX_DONE: BIT(3)"
+                     "\n\tResume execution: 0",
+                     vm, resfix_stoppers, bin(int(resfix_stoppers, 16)))
+
+        return int(resfix_stoppers, 16) == stage
+
+    def __is_resfix_stopped(self, vm: VirtualMachine):
+        vf_driver = vm.get_dut().driver
+        resfix_stoppers = vf_driver.read_debugfs(f'{vf_driver.debugfs_path}/gt0/vf/resfix_stoppers').strip()
+
+        return int(resfix_stoppers, 16) != 0
+
+    @pytest.fixture(scope='function', name='set_resfix_stoppers')
+    def fixture_set_resfix_stoppers(self, setup_vms, set_double_migration_config):
+        ts: VmmTestingSetup = setup_vms
+        migration_config: DoubleMigrationConfig = set_double_migration_config
+        vm_src: VirtualMachine = ts.get_vm[0]  # First VM as a source
+
+        return self.__set_debugfs_resfix_stoppers(vm_src, migration_config.resfix_stoppers)
+
+    @pytest.fixture(scope='function', name='clear_resfix_stoppers')
+    def fixture_clear_resfix_stoppers(self, setup_vms):
+        ts: VmmTestingSetup = setup_vms
+        yield
+
+        for vm in ts.get_vm:
+            if vm.is_running() and self.__is_resfix_stopped(vm):
+                logger.info("Teardown fixture - clear remaining resfix stoppers")
+                self.__set_debugfs_resfix_stoppers(vm, ResfixWaitStage.VF_MIGRATION_CONTINUE)
+
+    @pytest.fixture(scope='class', name='run_source_workload')
+    def fixture_run_source_workload(self, setup_vms):
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0] # First VM as source
+        migration_wl: MigrationWorkloadWsim = wsim_short_preempt # Workload variant
+        wsim_file_path = ts.wsim_wl_dir / f'{migration_wl.workload_file}.wsim' # Workload descriptor file path
+        if not wsim_file_path.exists():
+            logger.error("gem_wsim workload file %s not available!", wsim_file_path)
+            raise exceptions.GemWsimError(f'gem_wsim workload file {wsim_file_path} not available!')
+
+        # Run IGT wsim workload in pre-migration and check completion in post-migration
+        return GemWsim(vm_src, migration_wl.num_clients, migration_wl.num_repeats, workload=wsim_file_path)
+
+    @pytest.fixture(scope='function', name='setup_destination_vm')
+    def fixture_setup_destination_vm(self, setup_vms):
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0]  # First VM as a source
+        vm_dst: VirtualMachine = ts.get_vm[-1] # Last VM as a destination
+        num_vms = ts.get_num_vms()
+
+        if num_vms == 1:
+            logger.debug("Single VM: the same source and destination VM instance")
+            assert vm_src == vm_dst
+            vm_dst.pause()
+            return vm_dst
+
+        logger.debug("Multiple VMs: reload destination VM with the source image (with state snapshot)")
+
+        if vm_src.is_running():
+            # QMP 'quit' is used for paused VM (cannot be powered off via guest-agent)
+            vm_src.quit()
+
+        if vm_dst.is_running():
+            vm_dst.quit()
+            while vm_dst.is_running():
+                time.sleep(1) # VM usually doesn't terminate immediately
+
+        # Re-start destination VM with an image containing a state snapshot
+        vm_dst.set_migration_source(vm_src.image)
+        vm_dst.poweron()
+
+        return vm_dst
+
+    def test_save(self, setup_vms, run_source_workload, set_resfix_stoppers):
+        logger.info("Test VM double migration: 1st state save")
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0] # First VM as source
+        assert set_resfix_stoppers, 'Failed to set migration resfix stoppers'
+
+        logger.debug("Execute throughout-migration workload on source VM")
+        migration_wl: GemWsim = run_source_workload
+        time.sleep(IGT_INIT_DELAY)
+        assert migration_wl.is_running(), 'IGT/wsim migration workload is not running on source VM'
+
+        # Pause source VM and save snapshot
+        logger.debug("Pause execution and save source VM state (snapshot #1)")
+        try:
+            vm_src.pause()
+            vm_src.save_state() # snapshot #1
+        except exceptions.GuestError as exc:
+            logger.error("State save error: %s", exc)
+            assert False, 'VF migration failed on save'
+
+    def test_restore_save(self, setup_destination_vm, run_source_workload,
+                          set_double_migration_config, clear_resfix_stoppers):
+        logger.info("Test VM double migration: state restore and 2nd save before recovery is done")
+        vm_dst: VirtualMachine = setup_destination_vm
+        migration_wl: GemWsim = run_source_workload # Get an instance of the IGT WL started in a save test
+        migration_config: DoubleMigrationConfig = set_double_migration_config
+
+        # Point the source GemWsim instance at the current (destination) VM
+        migration_wl.target = vm_dst
+
+        # Load the source state snapshot
+        logger.debug("Restore source state on the destination VM (snapshot #1)")
+        vm_dst.load_state() # snapshot #1
+        vm_dst.resume()
+
+        time.sleep(3) # Wait a bit for the migration recovery to start
+        logger.debug("Save 2nd VM state (snapshot #2) while the 1st migration recovery is still in progress")
+
+        if migration_config.migration_to_restore is MigrationToRestore.FIRST:
+            # VM pause/resume is implicitly called by save,
+            # snapshot #1 recovery is continued immediately after snapshot #2 save completes
+            vm_dst.save_state() # snapshot #2
+            logger.info("Continue source VM state recovery (snapshot #1)")
+
+        if migration_config.migration_to_restore is MigrationToRestore.SECOND:
+            # Include explicit VM pause/resume, snapshot #2 load shall immediately follow its save,
+            # to not allow continuation of state recovery of snapshot #1.
+            vm_dst.pause()
+            vm_dst.save_state() # snapshot #2
+            logger.info("Load state and re-start state recovery of 2nd saved state (snapshot #2)")
+            vm_dst.load_state() # snapshot #2
+            vm_dst.resume()
+
+        logger.info("Continue migration recovery - clear resfix stoppers")
+        self.__set_debugfs_resfix_stoppers(vm_dst, ResfixWaitStage.VF_MIGRATION_CONTINUE)
+
+        logger.debug("Check migration in-flight workload after destination VM save")
+        time.sleep(IGT_RESTORE_DELAY)
+        assert migration_wl.check_results(), 'VF migration workload failed on destination VM (post-restore)'
+
+
+@pytest.fixture(scope='class', name='set_double_migration_config')
+def fixture_set_double_migration_config(request):
+    """Set migration recovery wait stage for double migration test and the number of the snapshot to restore."""
+    # Provide a list of DoubleMigrationConfig instances to set up the test.
+    return request.param
+
+
+def idfn_double_migration(config: DoubleMigrationConfig):
+    """Add double migration settings to a test config ID in parametrized tests
+    (e.g. test_something[2VF-RS:resfix_stopper-MR:snapshot_to_restore]).
+    """
+    return str(config)
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_1_resfix_1],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration1Resfix1(BaseTestDoubleMigration):
+    """Double migration test restoring the first snapshot (the former, initial migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> continue to recover #1
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_RESFIX_START (BIT0) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_1_resfix_2],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration1Resfix2(BaseTestDoubleMigration):
+    """Double migration test restoring the first snapshot (the former, initial migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> continue to recover #1
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_FIXUPS (BIT1) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_1_resfix_3],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration1Resfix3(BaseTestDoubleMigration):
+    """Double migration test restoring the first snapshot (the former, initial migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> continue to recover #1
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_RESTART_JOBS (BIT2) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_1_resfix_4],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration1Resfix4(BaseTestDoubleMigration):
+    """Double migration test restoring the first snapshot (the former, initial migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> continue to recover #1
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_RESFIX_DONE (BIT3) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_2_resfix_1],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration2Resfix1(BaseTestDoubleMigration):
+    """Double migration test restoring the second snapshot (the latter migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> load and recover #2
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_RESFIX_START (BIT0) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_2_resfix_2],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration2Resfix2(BaseTestDoubleMigration):
+    """Double migration test restoring the second snapshot (the latter migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> load and recover #2
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_FIXUPS (BIT1) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_2_resfix_3],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration2Resfix3(BaseTestDoubleMigration):
+    """Double migration test restoring the second snapshot (the latter migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> load and recover #2
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_RESTART_JOBS (BIT2) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('set_double_migration_config', [double_migration_2_resfix_4],
+                        ids=idfn_double_migration, indirect=['set_double_migration_config'])
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_profiles, wa_reduce_vf_lmem=True),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestDoubleMigration2Resfix4(BaseTestDoubleMigration):
+    """Double migration test restoring the second snapshot (the latter migration):
+    save snapshot #1 -> load snapshot #1 -> save snapshot #2 (during #1 recovery) -> load and recover #2
+    Stop resfix on VF_MIGRATION_WAIT_BEFORE_RESFIX_DONE (BIT3) checkpoint to initiate 2nd save.
+
+    W/A: reduce VF VRAM quota to speed up the 2nd save (to avoid time-out).
+    """
+
+
+@pytest.mark.parametrize('setup_vms', set_test_config(test_variants_basic),
+                         ids=idfn_test_config, indirect=['setup_vms'])
+class TestCheckpoint:
+    """Verify a state can be saved for future use and later loaded to return to that checkpoint."""
+
+    @pytest.fixture(scope='function', name='setup_destination_vm')
+    def fixture_setup_destination_vm(self, setup_vms):
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0]  # First VM as a source
+        vm_dst: VirtualMachine = ts.get_vm[-1] # Last VM as a destination
+        num_vms = ts.get_num_vms()
+
+        if num_vms == 1:
+            logger.debug("Single VM: the same source and destination VM instance")
+            assert vm_src == vm_dst
+            return vm_dst
+
+        logger.debug("Multiple VMs: restart destination VM with the source image (with state checkpoint)")
+        vm_dst.poweroff()
+        # Source qcow2 must be copied because multiple VMs cannot run with the same image file
+        vm_dst.set_migration_source(duplicate_vm_image(vm_src.image))
+        vm_dst.poweron()
+        vm_dst.resume()
+        assert modprobe_driver_run_check(vm_dst)
+
+        return vm_dst
+
+    @pytest.fixture(scope='class', name='run_source_workload')
+    def fixture_run_source_workload(self, setup_vms):
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0] # First VM as source
+
+        # Run IGT workload to check before and after a state checkpoint
+        return IgtExecutor(vm_src, IgtType.SPIN_BATCH)
+
+    def test_save(self, setup_vms, run_source_workload):
+        logger.info("Test VM state checkpoint save")
+        ts: VmmTestingSetup = setup_vms
+        vm_src: VirtualMachine = ts.get_vm[0] # First VM as source
+        igt_src: IgtExecutor = run_source_workload
+
+        # Save state checkpoint
+        logger.debug("Save VM state checkpoint")
+        try:
+            vm_src.save_state()
+        except exceptions.GuestError as exc:
+            logger.error("State save error: %s", exc)
+            assert False, 'VF migration failed on save'
+
+        # Verify workload submitted prior to the state checkpoint succeeds
+        assert igt_check(igt_src), 'Source IGT workload has failed'
+
+        logger.debug("Check driver health on host and source VM")
+        assert driver_check(ts.host)
+        assert driver_check(vm_src)
+
+    def test_load(self, setup_vms, setup_destination_vm, run_source_workload):
+        logger.info("Test VM state checkpoint load")
+        ts: VmmTestingSetup = setup_vms
+        vm_dst: VirtualMachine = setup_destination_vm
+        igt_src: IgtExecutor = run_source_workload # Get an instance of the IGT WL started in a save test
+
+        # Patch the source IgtExecutor instance with the current VM and clear results cache
+        igt_src.target = vm_dst
+        igt_src.results.clear()
+
+        # Workload submitted before the checkpoint should not be active before load
+        logger.debug("Verify IGT workload is not executing prior to the state restore (expected pgrep error)")
+        assert not cmd_run_check(vm_dst, 'pgrep igt_runner'), 'IGT workload is (unexpectedly) running'
+
+        # Load previously saved state checkpoint and resume on destination VM
+        logger.debug("Load VM state checkpoint")
+        vm_dst.load_state()
+
+        # Workload submitted before the checkpoint should be restored in running state after load
+        logger.debug("Verify IGT workload is executing again after the state restore")
+        assert not igt_src.status().exited, 'IGT workload is not running after checkpoint load'
+        assert igt_check(igt_src), 'IGT workload loaded on checkpoint has failed'
+
+        logger.debug("Check driver health on host and destination VM")
+        assert driver_check(ts.host)
+        assert driver_check(vm_dst)
+
+
+def test_2vm_pause_resume(create_1host_2vm):
+    """
+    VM/VF pause-resume does not affect workload execution:
+     - 2xVFs running 2xVM instances
+     - both VFs auto-provisioned, running IGT workloads
+     - 1st VM/VF is paused and resumed (but VF state is not saved/loaded)
+     - 2nd VM/VF workload should not be interrupted
+     - IGT workloads shall finish successfully on both VMs
+    """
+    ts: VmmTestingSetup = create_1host_2vm
+    host: Host = ts.host
+    vm0: VirtualMachine = ts.get_vm[0]
+    vm1: VirtualMachine = ts.get_vm[1]
+    assert driver_check(host)
+
+    num_vfs = ts.testing_config.num_vfs
+    assert ts.get_dut().create_vf(num_vfs) == num_vfs
+
+    vf1, vf2 = ts.get_dut().get_vfs_bdf(1, 2)
+    vm0.assign_vf(vf1)
+    vm1.assign_vf(vf2)
+    ts.poweron_vms()
+
+    pause_vf_num = 1
+
+    assert modprobe_driver_run_check(vm0)
+    assert modprobe_driver_run_check(vm1)
+
+    logger.debug("Submit IGT WL (gem_wsim) on VM0")
+    iterations = 3000 # 3k iterations of 10ms WLs give 30s total expected time
+    expected_elapsed_sec = ONE_CYCLE_DURATION_MS * iterations / MS_IN_SEC
+    gem_wsim_vm0 = GemWsim(vm0, 1, iterations, PREEMPT_10MS_WORKLOAD)
+
+    # Allow wsim WL to run some time
+    time.sleep(IGT_INIT_DELAY)
+    assert gem_wsim_vm0.is_running()
+
+    logger.debug("Submit IGT WL (gem_spin_batch) on VM1")
+    igt_vm1 = IgtExecutor(vm1, IgtType.SPIN_BATCH)
+
+    # Special handling of pausing VMs with infinite ExecQuanta - refer to SAS for details
+    logger.debug("Set VF1 EQ/PT before the pause")
+    ts.get_dut().driver.set_exec_quantum_ms(pause_vf_num, 1)
+    ts.get_dut().driver.set_preempt_timeout_us(pause_vf_num, 100)
+
+    logger.debug("Pause execution on VM0/VF1")
+    vm0.pause()
+
+    assert igt_check(igt_vm1)
+    logger.debug("VM1 IGT WL (not paused) finished successfully")
+
+    logger.debug("Resume execution on VM0/VF1")
+    vm0.resume()
+
+    logger.debug("Reset VF1 EQ/PT to the initial values (infinite) after resume")
+    ts.get_dut().driver.set_exec_quantum_ms(pause_vf_num, 0)
+    ts.get_dut().driver.set_preempt_timeout_us(pause_vf_num, 0)
+
+    result_vm0 = gem_wsim_vm0.wait_results() # Throws exception on wsim fail
+    assert expected_elapsed_sec * 0.8 < result_vm0.elapsed_sec < expected_elapsed_sec * 1.5
+    logger.debug("VM0 IGT WL (paused-resumed) finished successfully")
+
+    # Check host and VM health status after pause-resume transition
+    assert driver_check(host)
+    assert driver_check(vm0)
+    assert driver_check(vm1)
+
+
+def test_1vm_save_restore_no_driver(create_1host_1vm):
+    """
+    Save/restore single VM state with no guest driver loaded:
+     - 1xVF running 1xVM instance (single VM acts as source and destination)
+     - platform provisioned with vGPU profile M1 (ATSM, ADLP) or C1 (PVC)
+     - VF state saved and then restored on the same VM instance
+     - driver probed on VM after the resume, IGT workload executed
+    """
+    ts: VmmTestingSetup = create_1host_1vm
+    host: Host = ts.host
+    vm: VirtualMachine = ts.get_vm[0]
+    assert driver_check(host)
+
+    num_vfs = ts.testing_config.num_vfs
+    assert ts.get_dut().create_vf(num_vfs) == num_vfs
+
+    vf = ts.get_dut().get_vf_bdf(1)
+    vm.assign_vf(vf)
+
+    vm.poweron()
+
+    # Run an interactive, non-returning program (such as vim) to verify state after migration
+    src_proc = ShellExecutor(vm, 'vim migrate.txt')
+    src_pid = src_proc.pid
+
+    # Pause VM and save snapshot
+    logger.debug("Pause execution and save VM state")
+    try:
+        vm.pause()
+        vm.save_state()
+    except exceptions.GuestError as exc:
+        logger.error("State save error: %s", exc)
+        assert False, 'VF migration failed on save'
+
+    # Load previously saved snapshot and resume the same VM
+    logger.debug("Load state on the same VM instance")
+    vm.load_state()
+    vm.resume()
+
+    # Verify program initiated on source VM is still running after migration
+    migrated_proc = vm.execute_status(src_pid)
+    logger.debug("Migrated process: %s", migrated_proc)
+    assert migrated_proc.exited is False, 'Migrated process is not running after VM snapshot load'
+
+    logger.debug("Probe driver and execute workload on VM")
+    assert modprobe_driver_run_check(vm)
+    assert igt_run_check(vm, IgtType.EXEC_STORE)
+
+    logger.debug("Check driver health on host and VM")
+    assert driver_check(host)
+    assert driver_check(vm)
-- 
2.39.1
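
The resfix_stoppers debugfs value read back in __set_debugfs_resfix_stoppers
is a bitmask of the four checkpoints listed in ResfixWaitStage. A minimal
sketch of decoding such a value into checkpoint names, assuming the file
reports a hexadecimal string (as the int(..., 16) parsing in the patch
implies). decode_resfix_stoppers is a hypothetical helper for illustration
and is not part of vmtb:

```python
import enum
from typing import List


class ResfixWaitStage(enum.IntFlag):
    # Same bit layout as the ResfixWaitStage checkpoints in the patch
    VF_MIGRATION_WAIT_BEFORE_RESFIX_START = 1 << 0
    VF_MIGRATION_WAIT_BEFORE_FIXUPS = 1 << 1
    VF_MIGRATION_WAIT_BEFORE_RESTART_JOBS = 1 << 2
    VF_MIGRATION_WAIT_BEFORE_RESFIX_DONE = 1 << 3


def decode_resfix_stoppers(raw: str) -> List[str]:
    """Hypothetical helper: names of checkpoints set in a hex string read from debugfs."""
    value = int(raw.strip(), 16)  # accepts both '0x2' and '2'
    return [stage.name for stage in ResfixWaitStage if value & stage]
```

An empty list means no stopper is armed, which corresponds to the
VF_MIGRATION_CONTINUE (0) value the tests write to resume recovery.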


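The setup_destination_vm fixtures wait for the destination VM to exit with
an unbounded "while vm_dst.is_running(): time.sleep(1)" loop. If a bounded
wait is preferred, one possible sketch follows. wait_until and its default
timeout are assumptions made for illustration and are not part of vmtb:

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_s: float = 60.0,
               interval_s: float = 1.0) -> bool:
    """Hypothetical helper: poll condition() until it is true or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False  # Timed out, the caller decides how to fail
        time.sleep(interval_s)
```

In a fixture this could be used as
"assert wait_until(lambda: not vm_dst.is_running()), 'Destination VM did not terminate'",
so that a VM that never exits fails the test instead of blocking the run.
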
  parent reply	other threads:[~2026-04-16  9:13 UTC|newest]

Thread overview: 9+ messages
2026-04-16  8:35 [PATCH i-g-t 0/3] vmtb: SR-IOV VF migration test suite Adam Miszczak
2026-04-16  8:35 ` [PATCH i-g-t 1/3] tools/vmtb: Define IGT tests used as VF migration workloads Adam Miszczak
2026-04-20 18:27   ` Kamil Konieczny
2026-04-16  8:35 ` [PATCH i-g-t 2/3] tools/vmtb: Provide VF busy migration IGT/gem_wsim workloads Adam Miszczak
2026-04-16  8:35 ` Adam Miszczak [this message]
2026-04-16 15:40 ` ✓ i915.CI.BAT: success for vmtb: SR-IOV VF migration test suite Patchwork
2026-04-16 15:50 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-16 17:34 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-04-17  3:47 ` ✗ i915.CI.Full: " Patchwork
