From: David Matlack <dmatlack@google.com>
To: Vipin Sharma <vipinsh@google.com>
Cc: kvm@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-pci@vger.kernel.org, ajayachandra@nvidia.com,
alex@shazbot.org, amastro@fb.com, ankita@nvidia.com,
apopple@nvidia.com, chrisl@kernel.org, corbet@lwn.net,
graf@amazon.com, jacob.pan@linux.microsoft.com, jgg@nvidia.com,
jgg@ziepe.ca, jrhilke@google.com, julianr@linux.ibm.com,
kevin.tian@intel.com, leon@kernel.org, leonro@nvidia.com,
lukas@wunner.de, michal.winiarski@intel.com, parav@nvidia.com,
pasha.tatashin@soleen.com, praan@google.com, pratyush@kernel.org,
rananta@google.com, rientjes@google.com, rodrigo.vivi@intel.com,
rppt@kernel.org, saeedm@nvidia.com, skhan@linuxfoundation.org,
skhawaja@google.com, vivek.kasireddy@intel.com, witu@nvidia.com,
yanjun.zhu@linux.dev, yi.l.liu@intel.com
Subject: Re: [PATCH v4 02/16] vfio/pci: Preserve vfio-pci device files across Live Update
Date: Thu, 21 May 2026 23:49:40 +0000
Message-ID: <ag-aFA1BJxdJMywr@google.com>
In-Reply-To: <20260511234802.2280368-3-vipinsh@google.com>
On 2026-05-11 04:47 PM, Vipin Sharma wrote:
> +static int vfio_pci_liveupdate_freeze(struct liveupdate_file_op_args *args)
> +{
> +	struct vfio_device *device = vfio_device_from_file(args->file);
> +	struct vfio_pci_core_device *vdev;
> +	struct pci_dev *pdev;
> +	int ret;
> +
> +	vdev = container_of(device, struct vfio_pci_core_device, vdev);
> +	pdev = vdev->pdev;
> +
> +	guard(mutex)(&device->dev_set->lock);
> +
> +	/*
> +	 * Userspace must disable interrupts on the device prior to freeze so
> +	 * that the device does not send any interrupts until new interrupt
> +	 * handlers have been established by the next kernel.
> +	 */
> +	if (vdev->irq_type != VFIO_PCI_NUM_IRQS) {
> +		pci_err(pdev, "Freeze failed! Interrupts are still enabled.\n");
> +		return -EINVAL;
> +	}
> +
> +	guard(rwsem_write)(&vdev->memory_lock);
> +
> +	/*
> +	 * Userspace must make sure device is not in the lower power state for
> +	 * live update. We may relax this in future.
> +	 */
> +	if (pdev->current_state != PCI_D0) {
> +		pci_err(pdev, "Freeze failed! Device not in D0 state.\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Reset is a temporary measure to provide kernel after kexec a clean
> +	 * device while VFIO live update work is under development and not
> +	 * fully supported. It will go away once continuous DMA support is
> +	 * added to device preservation.
> +	 */
> +	vfio_pci_zap_bars(vdev);
> +	ret = pci_load_saved_state(pdev, vdev->pci_saved_state);
> +	if (ret)
> +		return ret;
> +	pci_clear_master(pdev);
> +	vfio_pci_core_try_reset(vdev);
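I am seeing the following lockdep splat triggered by this reset when testing
this commit with vfio_pci_liveupdate_kexec_test. It seems to be related to
taking memory_lock above: the reset path acquires the IOMMU group->mutex (in
pci_dev_reset_iommu_prepare()) while memory_lock is still held.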
[ 2710.299017][T75672] ======================================================
[ 2710.305908][T75672] WARNING: possible circular locking dependency detected
[ 2710.312797][T75672] 7.1.0-dbg-DEV #59 Tainted: G S
[ 2710.319077][T75672] ------------------------------------------------------
[ 2710.325967][T75672] kexec/75672 is trying to acquire lock:
[ 2710.331474][T75672] ff46fd4fdbaeef08 (&group->mutex){+.+.}-{4:4}, at: pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.341336][T75672]
[ 2710.341336][T75672] but task is already holding lock:
[ 2710.348574][T75672] ff46fd501f9a19a8 (&vdev->memory_lock){++++}-{4:4}, at: vfio_pci_liveupdate_freeze+0x51/0x100
[ 2710.358764][T75672]
[ 2710.358764][T75672] which lock already depends on the new lock.
[ 2710.358764][T75672]
[ 2710.369031][T75672]
[ 2710.369031][T75672] the existing dependency chain (in reverse order) is:
[ 2710.377916][T75672]
[ 2710.377916][T75672] -> #4 (&vdev->memory_lock){++++}-{4:4}:
[ 2710.385675][T75672] down_read+0x3d/0x150
[ 2710.390235][T75672] vfio_pci_mmap_huge_fault+0xb9/0x160
[ 2710.396091][T75672] __do_fault+0x46/0x140
[ 2710.400734][T75672] do_pte_missing+0x4c3/0xff0
[ 2710.405803][T75672] handle_mm_fault+0x7c4/0xb30
[ 2710.410961][T75672] fixup_user_fault+0x115/0x270
[ 2710.416209][T75672] vaddr_get_pfns+0x1a1/0x390
[ 2710.421286][T75672] vfio_pin_pages_remote+0x148/0x4d0
[ 2710.426959][T75672] vfio_pin_map_dma+0xcc/0x260
[ 2710.432116][T75672] vfio_iommu_type1_ioctl+0xda4/0xec0
[ 2710.437884][T75672] __se_sys_ioctl+0x71/0xc0
[ 2710.442790][T75672] do_syscall_64+0x15f/0x710
[ 2710.447788][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.454074][T75672]
[ 2710.454074][T75672] -> #3 (&mm->mmap_lock){++++}-{4:4}:
[ 2710.461489][T75672] down_read_killable+0x48/0x180
[ 2710.466821][T75672] mmap_read_lock_killable+0x12/0x50
[ 2710.472505][T75672] lock_mm_and_find_vma+0x11d/0x130
[ 2710.478093][T75672] do_user_addr_fault+0x3a0/0x6c0
[ 2710.483521][T75672] exc_page_fault+0x68/0xa0
[ 2710.488423][T75672] asm_exc_page_fault+0x26/0x30
[ 2710.493669][T75672] filldir+0xe2/0x190
[ 2710.498047][T75672] ext4_readdir+0xb47/0xcf0
[ 2710.502950][T75672] iterate_dir+0x84/0x160
[ 2710.507677][T75672] __se_sys_getdents+0x74/0x120
[ 2710.512929][T75672] do_syscall_64+0x15f/0x710
[ 2710.517919][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.524202][T75672]
[ 2710.524202][T75672] -> #2 (&type->i_mutex_dir_key#4){++++}-{4:4}:
[ 2710.532478][T75672] down_read+0x3d/0x150
[ 2710.537030][T75672] lookup_slow+0x26/0x50
[ 2710.541675][T75672] link_path_walk+0x42c/0x580
[ 2710.546743][T75672] path_openat+0xd1/0xde0
[ 2710.551466][T75672] do_file_open_root+0x114/0x250
[ 2710.556798][T75672] file_open_root+0x89/0xb0
[ 2710.561703][T75672] kernel_read_file_from_path_initns+0xba/0x130
[ 2710.568342][T75672] _request_firmware+0x4ab/0x8c0
[ 2710.573677][T75672] request_firmware_direct+0x36/0x50
[ 2710.579356][T75672] request_microcode_fw+0xf2/0x510
[ 2710.584869][T75672] reload_store+0x197/0x230
[ 2710.589766][T75672] kernfs_fop_write_iter+0x13f/0x1d0
[ 2710.595452][T75672] vfs_write+0x2be/0x3b0
[ 2710.600097][T75672] ksys_write+0x73/0x100
[ 2710.604735][T75672] do_syscall_64+0x15f/0x710
[ 2710.609723][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.616009][T75672]
[ 2710.616009][T75672] -> #1 (cpu_hotplug_lock){++++}-{0:0}:
[ 2710.623591][T75672] cpus_read_lock+0x3b/0xd0
[ 2710.628499][T75672] __cpuhp_state_add_instance+0x19/0x40
[ 2710.634443][T75672] iova_domain_init_rcaches+0x1ef/0x230
[ 2710.640385][T75672] iommu_setup_dma_ops+0x175/0x540
[ 2710.645891][T75672] iommu_device_register+0x188/0x220
[ 2710.651564][T75672] intel_iommu_init+0x35a/0x440
[ 2710.656811][T75672] pci_iommu_init+0x16/0x40
[ 2710.661713][T75672] do_one_initcall+0xf5/0x3a0
[ 2710.666786][T75672] do_initcall_level+0x82/0xa0
[ 2710.671953][T75672] do_initcalls+0x43/0x70
[ 2710.676672][T75672] kernel_init_freeable+0x152/0x1d0
[ 2710.682266][T75672] kernel_init+0x1a/0x130
[ 2710.686996][T75672] ret_from_fork+0x16b/0x310
[ 2710.691991][T75672] ret_from_fork_asm+0x1a/0x30
[ 2710.697151][T75672]
[ 2710.697151][T75672] -> #0 (&group->mutex){+.+.}-{4:4}:
[ 2710.704478][T75672] __lock_acquire+0x14c6/0x2800
[ 2710.709729][T75672] lock_acquire+0xd3/0x2c0
[ 2710.714542][T75672] __mutex_lock+0x8f/0xcd0
[ 2710.719349][T75672] pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.725461][T75672] pcie_flr+0x32/0xc0
[ 2710.729842][T75672] __pci_reset_function_locked+0x84/0x120
[ 2710.735954][T75672] vfio_pci_core_try_reset+0x96/0xe0
[ 2710.741630][T75672] vfio_pci_liveupdate_freeze+0x89/0x100
[ 2710.747653][T75672] luo_file_freeze+0xba/0x280
[ 2710.752725][T75672] luo_session_serialize+0x69/0x190
[ 2710.758321][T75672] liveupdate_reboot+0x19/0x30
[ 2710.763490][T75672] kernel_kexec+0x2f/0xa0
[ 2710.768220][T75672] __se_sys_reboot+0xfd/0x210
[ 2710.773301][T75672] do_syscall_64+0x15f/0x710
[ 2710.778284][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.784568][T75672]
[ 2710.784568][T75672] other info that might help us debug this:
[ 2710.784568][T75672]
[ 2710.794663][T75672] Chain exists of:
[ 2710.794663][T75672] &group->mutex --> &mm->mmap_lock --> &vdev->memory_lock
[ 2710.794663][T75672]
[ 2710.807543][T75672] Possible unsafe locking scenario:
[ 2710.807543][T75672]
[ 2710.814863][T75672]        CPU0                    CPU1
[ 2710.820106][T75672]        ----                    ----
[ 2710.825352][T75672]   lock(&vdev->memory_lock);
[ 2710.829904][T75672]                                lock(&mm->mmap_lock);
[ 2710.836620][T75672]                                lock(&vdev->memory_lock);
[ 2710.843682][T75672]   lock(&group->mutex);
[ 2710.847798][T75672]
[ 2710.847798][T75672] *** DEADLOCK ***
[ 2710.847798][T75672]
[ 2710.855818][T75672] 7 locks held by kexec/75672:
[ 2710.860457][T75672] #0: ffffffff90a81330 (system_transition_mutex){+.+.}-{4:4}, at: __se_sys_reboot+0xe4/0x210
[ 2710.870554][T75672] #1: ffffffff90e1d0c0 (luo_session_global.outgoing.rwsem){+.+.}-{4:4}, at: luo_session_serialize+0x1f/0x190
[ 2710.882043][T75672] #2: ff46fd50602b7ae0 (&session->mutex){+.+.}-{4:4}, at: luo_session_serialize+0x4f/0x190
[ 2710.891972][T75672] #3: ff46fd500bec0788 (&luo_file->mutex){+.+.}-{4:4}, at: luo_file_freeze+0x65/0x280
[ 2710.901463][T75672] #4: ff46fd509d8106a8 (&new_dev_set->lock){+.+.}-{4:4}, at: vfio_pci_liveupdate_freeze+0x36/0x100
[ 2710.912086][T75672] #5: ff46fd501f9a19a8 (&vdev->memory_lock){++++}-{4:4}, at: vfio_pci_liveupdate_freeze+0x51/0x100
[ 2710.922701][T75672] #6: ff46fd4fd416c1f0 (&dev->mutex){....}-{4:4}, at: pci_dev_trylock+0x25/0x60
[ 2710.931676][T75672]
[ 2710.931676][T75672] stack backtrace:
[ 2710.937439][T75672] CPU: 193 UID: 0 PID: 75672 Comm: kexec Tainted: G S 7.1.0-dbg-DEV #59 PREEMPTLAZY
[ 2710.937442][T75672] Tainted: [S]=CPU_OUT_OF_SPEC
[ 2710.937442][T75672] Hardware name: Google Izumi-EMR/izumi, BIOS 0.20251023.0-0 10/23/2025
[ 2710.937443][T75672] Call Trace:
[ 2710.937446][T75672] <TASK>
[ 2710.937448][T75672] dump_stack_lvl+0x54/0x70
[ 2710.937453][T75672] print_circular_bug+0x2e1/0x300
[ 2710.937455][T75672] check_noncircular+0xf9/0x120
[ 2710.937456][T75672] ? __bfs+0x129/0x200
[ 2710.937458][T75672] __lock_acquire+0x14c6/0x2800
[ 2710.937460][T75672] ? __lock_acquire+0x1240/0x2800
[ 2710.937463][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937465][T75672] lock_acquire+0xd3/0x2c0
[ 2710.937466][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937468][T75672] ? lock_is_held_type+0x76/0x100
[ 2710.937471][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937473][T75672] __mutex_lock+0x8f/0xcd0
[ 2710.937473][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937475][T75672] ? lockdep_hardirqs_on_prepare+0x151/0x210
[ 2710.937477][T75672] ? _raw_spin_unlock_irqrestore+0x35/0x50
[ 2710.937482][T75672] pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937484][T75672] pcie_flr+0x32/0xc0
[ 2710.937485][T75672] __pci_reset_function_locked+0x84/0x120
[ 2710.937487][T75672] vfio_pci_core_try_reset+0x96/0xe0
[ 2710.937489][T75672] vfio_pci_liveupdate_freeze+0x89/0x100
[ 2710.937490][T75672] luo_file_freeze+0xba/0x280
[ 2710.937492][T75672] luo_session_serialize+0x69/0x190
[ 2710.937493][T75672] liveupdate_reboot+0x19/0x30
[ 2710.937495][T75672] kernel_kexec+0x2f/0xa0
[ 2710.937496][T75672] __se_sys_reboot+0xfd/0x210
[ 2710.937497][T75672] ? check_object+0x1ee/0x390
[ 2710.937500][T75672] ? lock_release+0xef/0x350
[ 2710.937501][T75672] ? kmem_cache_free+0x1b5/0x520
[ 2710.937506][T75672] ? _raw_spin_unlock_irqrestore+0x35/0x50
[ 2710.937508][T75672] ? kmem_cache_free+0x1b5/0x520
[ 2710.937509][T75672] ? __x64_sys_close+0x3d/0x80
[ 2710.937510][T75672] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.937511][T75672] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.937512][T75672] do_syscall_64+0x15f/0x710
[ 2710.937514][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.937515][T75672] RIP: 0033:0x7fa57e4f2513
[ 2710.937519][T75672] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 89 fa b8 a9 00 00 00 bf ad de e1 fe be 69 19 12 28 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 f7 d8 48 8b 0d db 2c 07 00 64 89 01 48
[ 2710.937520][T75672] RSP: 002b:00007ffd16943748 EFLAGS: 00000246 ORIG_RAX: 00000000000000a9
[ 2710.937523][T75672] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fa57e4f2513
[ 2710.937524][T75672] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
[ 2710.937526][T75672] RBP: 00007ffd16943a60 R08: 0000000000000009 R09: 00007fa57e5672e0
[ 2710.937527][T75672] R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffd169438e0
[ 2710.937528][T75672] R13: 0000000000000000 R14: 00007ffd169438e0 R15: 0000000000000001
[ 2710.937532][T75672] </TASK>
> +	pci_restore_state(pdev);
> +	return 0;
> }
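
For readers following the splat, the reported cycle can be modeled in a few
lines of user-space C. This is only an illustrative sketch, not kernel code and
not how lockdep is implemented: the enum, struct, and function names below are
hypothetical, and the edges are copied by hand from the "existing dependency
chain" (#1 through #4) in the log above. It records "lock B was taken while
lock A was held" edges and checks whether a new acquisition would close a
cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the splat; the enum itself is illustrative. */
enum lock_class {
	GROUP_MUTEX,      /* &group->mutex */
	CPU_HOTPLUG_LOCK, /* cpu_hotplug_lock */
	I_MUTEX_DIR,      /* &type->i_mutex_dir_key */
	MMAP_LOCK,        /* &mm->mmap_lock */
	MEMORY_LOCK,      /* &vdev->memory_lock */
	NR_LOCKS,
};

/* dep[a][b] is true when lock b was acquired while lock a was held. */
struct lock_graph {
	bool dep[NR_LOCKS][NR_LOCKS];
};

static void add_dep(struct lock_graph *g, enum lock_class held,
		    enum lock_class acquired)
{
	assert(held < NR_LOCKS && acquired < NR_LOCKS);
	g->dep[held][acquired] = true;
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Taking 'acquired' while holding 'held' closes a cycle if 'held' is
 * already reachable from 'acquired' through existing dependencies.
 */
static bool would_deadlock(const struct lock_graph *g, enum lock_class held,
			   enum lock_class acquired)
{
	bool seen[NR_LOCKS] = { false };

	return reachable(g, acquired, held, seen);
}

/* Record the existing chain, entries #1 through #4 in the splat. */
static void record_splat_chain(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
	add_dep(g, GROUP_MUTEX, CPU_HOTPLUG_LOCK); /* #1 */
	add_dep(g, CPU_HOTPLUG_LOCK, I_MUTEX_DIR); /* #2 */
	add_dep(g, I_MUTEX_DIR, MMAP_LOCK);        /* #3 */
	add_dep(g, MMAP_LOCK, MEMORY_LOCK);        /* #4 */
}
```

With the chain recorded via record_splat_chain(), calling
would_deadlock(&g, MEMORY_LOCK, GROUP_MUTEX) returns true, which corresponds to
entry #0 in the splat: freeze holds memory_lock and the reset path then asks
for group->mutex. The reverse query, would_deadlock(&g, GROUP_MUTEX,
MEMORY_LOCK), returns false because that order agrees with the existing chain.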