From: David Matlack <dmatlack@google.com>
To: Vipin Sharma <vipinsh@google.com>
Cc: kvm@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-pci@vger.kernel.org, ajayachandra@nvidia.com,
alex@shazbot.org, amastro@fb.com, ankita@nvidia.com,
apopple@nvidia.com, chrisl@kernel.org, corbet@lwn.net,
graf@amazon.com, jacob.pan@linux.microsoft.com, jgg@nvidia.com,
jgg@ziepe.ca, jrhilke@google.com, julianr@linux.ibm.com,
kevin.tian@intel.com, leon@kernel.org, leonro@nvidia.com,
lukas@wunner.de, michal.winiarski@intel.com, parav@nvidia.com,
pasha.tatashin@soleen.com, praan@google.com, pratyush@kernel.org,
rananta@google.com, rientjes@google.com, rodrigo.vivi@intel.com,
rppt@kernel.org, saeedm@nvidia.com, skhan@linuxfoundation.org,
skhawaja@google.com, vivek.kasireddy@intel.com, witu@nvidia.com,
yanjun.zhu@linux.dev, yi.l.liu@intel.com
Subject: Re: [PATCH v4 02/16] vfio/pci: Preserve vfio-pci device files across Live Update
Date: Thu, 21 May 2026 23:49:40 +0000
Message-ID: <ag-aFA1BJxdJMywr@google.com>
In-Reply-To: <20260511234802.2280368-3-vipinsh@google.com>
On 2026-05-11 04:47 PM, Vipin Sharma wrote:
> +static int vfio_pci_liveupdate_freeze(struct liveupdate_file_op_args *args)
> +{
> +	struct vfio_device *device = vfio_device_from_file(args->file);
> +	struct vfio_pci_core_device *vdev;
> +	struct pci_dev *pdev;
> +	int ret;
> +
> +	vdev = container_of(device, struct vfio_pci_core_device, vdev);
> +	pdev = vdev->pdev;
> +
> +	guard(mutex)(&device->dev_set->lock);
> +
> +	/*
> +	 * Userspace must disable interrupts on the device prior to freeze so
> +	 * that the device does not send any interrupts until new interrupt
> +	 * handlers have been established by the next kernel.
> +	 */
> +	if (vdev->irq_type != VFIO_PCI_NUM_IRQS) {
> +		pci_err(pdev, "Freeze failed! Interrupts are still enabled.\n");
> +		return -EINVAL;
> +	}
> +
> +	guard(rwsem_write)(&vdev->memory_lock);
> +
> +	/*
> +	 * Userspace must make sure device is not in the lower power state for
> +	 * live update. We may relax this in future.
> +	 */
> +	if (pdev->current_state != PCI_D0) {
> +		pci_err(pdev, "Freeze failed! Device not in D0 state.\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Reset is a temporary measure to provide kernel after kexec a clean
> +	 * device while VFIO live update work is under development and not
> +	 * fully supported. It will go away once continuous DMA support is
> +	 * added to device preservation.
> +	 */
> +	vfio_pci_zap_bars(vdev);
> +	ret = pci_load_saved_state(pdev, vdev->pci_saved_state);
> +	if (ret)
> +		return ret;
> +	pci_clear_master(pdev);
> +	vfio_pci_core_try_reset(vdev);

I am seeing the following lockdep splat, triggered by this reset, when
testing this commit with vfio_pci_liveupdate_kexec_test. It seems to be
related to taking memory_lock above: the reset path acquires group->mutex
(via pci_dev_reset_iommu_prepare()) while memory_lock is still held.

[ 2710.299017][T75672] ======================================================
[ 2710.305908][T75672] WARNING: possible circular locking dependency detected
[ 2710.312797][T75672] 7.1.0-dbg-DEV #59 Tainted: G S
[ 2710.319077][T75672] ------------------------------------------------------
[ 2710.325967][T75672] kexec/75672 is trying to acquire lock:
[ 2710.331474][T75672] ff46fd4fdbaeef08 (&group->mutex){+.+.}-{4:4}, at: pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.341336][T75672]
[ 2710.341336][T75672] but task is already holding lock:
[ 2710.348574][T75672] ff46fd501f9a19a8 (&vdev->memory_lock){++++}-{4:4}, at: vfio_pci_liveupdate_freeze+0x51/0x100
[ 2710.358764][T75672]
[ 2710.358764][T75672] which lock already depends on the new lock.
[ 2710.358764][T75672]
[ 2710.369031][T75672]
[ 2710.369031][T75672] the existing dependency chain (in reverse order) is:
[ 2710.377916][T75672]
[ 2710.377916][T75672] -> #4 (&vdev->memory_lock){++++}-{4:4}:
[ 2710.385675][T75672] down_read+0x3d/0x150
[ 2710.390235][T75672] vfio_pci_mmap_huge_fault+0xb9/0x160
[ 2710.396091][T75672] __do_fault+0x46/0x140
[ 2710.400734][T75672] do_pte_missing+0x4c3/0xff0
[ 2710.405803][T75672] handle_mm_fault+0x7c4/0xb30
[ 2710.410961][T75672] fixup_user_fault+0x115/0x270
[ 2710.416209][T75672] vaddr_get_pfns+0x1a1/0x390
[ 2710.421286][T75672] vfio_pin_pages_remote+0x148/0x4d0
[ 2710.426959][T75672] vfio_pin_map_dma+0xcc/0x260
[ 2710.432116][T75672] vfio_iommu_type1_ioctl+0xda4/0xec0
[ 2710.437884][T75672] __se_sys_ioctl+0x71/0xc0
[ 2710.442790][T75672] do_syscall_64+0x15f/0x710
[ 2710.447788][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.454074][T75672]
[ 2710.454074][T75672] -> #3 (&mm->mmap_lock){++++}-{4:4}:
[ 2710.461489][T75672] down_read_killable+0x48/0x180
[ 2710.466821][T75672] mmap_read_lock_killable+0x12/0x50
[ 2710.472505][T75672] lock_mm_and_find_vma+0x11d/0x130
[ 2710.478093][T75672] do_user_addr_fault+0x3a0/0x6c0
[ 2710.483521][T75672] exc_page_fault+0x68/0xa0
[ 2710.488423][T75672] asm_exc_page_fault+0x26/0x30
[ 2710.493669][T75672] filldir+0xe2/0x190
[ 2710.498047][T75672] ext4_readdir+0xb47/0xcf0
[ 2710.502950][T75672] iterate_dir+0x84/0x160
[ 2710.507677][T75672] __se_sys_getdents+0x74/0x120
[ 2710.512929][T75672] do_syscall_64+0x15f/0x710
[ 2710.517919][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.524202][T75672]
[ 2710.524202][T75672] -> #2 (&type->i_mutex_dir_key#4){++++}-{4:4}:
[ 2710.532478][T75672] down_read+0x3d/0x150
[ 2710.537030][T75672] lookup_slow+0x26/0x50
[ 2710.541675][T75672] link_path_walk+0x42c/0x580
[ 2710.546743][T75672] path_openat+0xd1/0xde0
[ 2710.551466][T75672] do_file_open_root+0x114/0x250
[ 2710.556798][T75672] file_open_root+0x89/0xb0
[ 2710.561703][T75672] kernel_read_file_from_path_initns+0xba/0x130
[ 2710.568342][T75672] _request_firmware+0x4ab/0x8c0
[ 2710.573677][T75672] request_firmware_direct+0x36/0x50
[ 2710.579356][T75672] request_microcode_fw+0xf2/0x510
[ 2710.584869][T75672] reload_store+0x197/0x230
[ 2710.589766][T75672] kernfs_fop_write_iter+0x13f/0x1d0
[ 2710.595452][T75672] vfs_write+0x2be/0x3b0
[ 2710.600097][T75672] ksys_write+0x73/0x100
[ 2710.604735][T75672] do_syscall_64+0x15f/0x710
[ 2710.609723][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.616009][T75672]
[ 2710.616009][T75672] -> #1 (cpu_hotplug_lock){++++}-{0:0}:
[ 2710.623591][T75672] cpus_read_lock+0x3b/0xd0
[ 2710.628499][T75672] __cpuhp_state_add_instance+0x19/0x40
[ 2710.634443][T75672] iova_domain_init_rcaches+0x1ef/0x230
[ 2710.640385][T75672] iommu_setup_dma_ops+0x175/0x540
[ 2710.645891][T75672] iommu_device_register+0x188/0x220
[ 2710.651564][T75672] intel_iommu_init+0x35a/0x440
[ 2710.656811][T75672] pci_iommu_init+0x16/0x40
[ 2710.661713][T75672] do_one_initcall+0xf5/0x3a0
[ 2710.666786][T75672] do_initcall_level+0x82/0xa0
[ 2710.671953][T75672] do_initcalls+0x43/0x70
[ 2710.676672][T75672] kernel_init_freeable+0x152/0x1d0
[ 2710.682266][T75672] kernel_init+0x1a/0x130
[ 2710.686996][T75672] ret_from_fork+0x16b/0x310
[ 2710.691991][T75672] ret_from_fork_asm+0x1a/0x30
[ 2710.697151][T75672]
[ 2710.697151][T75672] -> #0 (&group->mutex){+.+.}-{4:4}:
[ 2710.704478][T75672] __lock_acquire+0x14c6/0x2800
[ 2710.709729][T75672] lock_acquire+0xd3/0x2c0
[ 2710.714542][T75672] __mutex_lock+0x8f/0xcd0
[ 2710.719349][T75672] pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.725461][T75672] pcie_flr+0x32/0xc0
[ 2710.729842][T75672] __pci_reset_function_locked+0x84/0x120
[ 2710.735954][T75672] vfio_pci_core_try_reset+0x96/0xe0
[ 2710.741630][T75672] vfio_pci_liveupdate_freeze+0x89/0x100
[ 2710.747653][T75672] luo_file_freeze+0xba/0x280
[ 2710.752725][T75672] luo_session_serialize+0x69/0x190
[ 2710.758321][T75672] liveupdate_reboot+0x19/0x30
[ 2710.763490][T75672] kernel_kexec+0x2f/0xa0
[ 2710.768220][T75672] __se_sys_reboot+0xfd/0x210
[ 2710.773301][T75672] do_syscall_64+0x15f/0x710
[ 2710.778284][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.784568][T75672]
[ 2710.784568][T75672] other info that might help us debug this:
[ 2710.784568][T75672]
[ 2710.794663][T75672] Chain exists of:
[ 2710.794663][T75672] &group->mutex --> &mm->mmap_lock --> &vdev->memory_lock
[ 2710.794663][T75672]
[ 2710.807543][T75672] Possible unsafe locking scenario:
[ 2710.807543][T75672]
[ 2710.814863][T75672]        CPU0                    CPU1
[ 2710.820106][T75672]        ----                    ----
[ 2710.825352][T75672]   lock(&vdev->memory_lock);
[ 2710.829904][T75672]                                lock(&mm->mmap_lock);
[ 2710.836620][T75672]                                lock(&vdev->memory_lock);
[ 2710.843682][T75672]   lock(&group->mutex);
[ 2710.847798][T75672]
[ 2710.847798][T75672] *** DEADLOCK ***
[ 2710.847798][T75672]
[ 2710.855818][T75672] 7 locks held by kexec/75672:
[ 2710.860457][T75672] #0: ffffffff90a81330 (system_transition_mutex){+.+.}-{4:4}, at: __se_sys_reboot+0xe4/0x210
[ 2710.870554][T75672] #1: ffffffff90e1d0c0 (luo_session_global.outgoing.rwsem){+.+.}-{4:4}, at: luo_session_serialize+0x1f/0x190
[ 2710.882043][T75672] #2: ff46fd50602b7ae0 (&session->mutex){+.+.}-{4:4}, at: luo_session_serialize+0x4f/0x190
[ 2710.891972][T75672] #3: ff46fd500bec0788 (&luo_file->mutex){+.+.}-{4:4}, at: luo_file_freeze+0x65/0x280
[ 2710.901463][T75672] #4: ff46fd509d8106a8 (&new_dev_set->lock){+.+.}-{4:4}, at: vfio_pci_liveupdate_freeze+0x36/0x100
[ 2710.912086][T75672] #5: ff46fd501f9a19a8 (&vdev->memory_lock){++++}-{4:4}, at: vfio_pci_liveupdate_freeze+0x51/0x100
[ 2710.922701][T75672] #6: ff46fd4fd416c1f0 (&dev->mutex){....}-{4:4}, at: pci_dev_trylock+0x25/0x60
[ 2710.931676][T75672]
[ 2710.931676][T75672] stack backtrace:
[ 2710.937439][T75672] CPU: 193 UID: 0 PID: 75672 Comm: kexec Tainted: G S 7.1.0-dbg-DEV #59 PREEMPTLAZY
[ 2710.937442][T75672] Tainted: [S]=CPU_OUT_OF_SPEC
[ 2710.937442][T75672] Hardware name: Google Izumi-EMR/izumi, BIOS 0.20251023.0-0 10/23/2025
[ 2710.937443][T75672] Call Trace:
[ 2710.937446][T75672] <TASK>
[ 2710.937448][T75672] dump_stack_lvl+0x54/0x70
[ 2710.937453][T75672] print_circular_bug+0x2e1/0x300
[ 2710.937455][T75672] check_noncircular+0xf9/0x120
[ 2710.937456][T75672] ? __bfs+0x129/0x200
[ 2710.937458][T75672] __lock_acquire+0x14c6/0x2800
[ 2710.937460][T75672] ? __lock_acquire+0x1240/0x2800
[ 2710.937463][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937465][T75672] lock_acquire+0xd3/0x2c0
[ 2710.937466][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937468][T75672] ? lock_is_held_type+0x76/0x100
[ 2710.937471][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937473][T75672] __mutex_lock+0x8f/0xcd0
[ 2710.937473][T75672] ? pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937475][T75672] ? lockdep_hardirqs_on_prepare+0x151/0x210
[ 2710.937477][T75672] ? _raw_spin_unlock_irqrestore+0x35/0x50
[ 2710.937482][T75672] pci_dev_reset_iommu_prepare+0x6e/0x1a0
[ 2710.937484][T75672] pcie_flr+0x32/0xc0
[ 2710.937485][T75672] __pci_reset_function_locked+0x84/0x120
[ 2710.937487][T75672] vfio_pci_core_try_reset+0x96/0xe0
[ 2710.937489][T75672] vfio_pci_liveupdate_freeze+0x89/0x100
[ 2710.937490][T75672] luo_file_freeze+0xba/0x280
[ 2710.937492][T75672] luo_session_serialize+0x69/0x190
[ 2710.937493][T75672] liveupdate_reboot+0x19/0x30
[ 2710.937495][T75672] kernel_kexec+0x2f/0xa0
[ 2710.937496][T75672] __se_sys_reboot+0xfd/0x210
[ 2710.937497][T75672] ? check_object+0x1ee/0x390
[ 2710.937500][T75672] ? lock_release+0xef/0x350
[ 2710.937501][T75672] ? kmem_cache_free+0x1b5/0x520
[ 2710.937506][T75672] ? _raw_spin_unlock_irqrestore+0x35/0x50
[ 2710.937508][T75672] ? kmem_cache_free+0x1b5/0x520
[ 2710.937509][T75672] ? __x64_sys_close+0x3d/0x80
[ 2710.937510][T75672] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.937511][T75672] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.937512][T75672] do_syscall_64+0x15f/0x710
[ 2710.937514][T75672] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2710.937515][T75672] RIP: 0033:0x7fa57e4f2513
[ 2710.937519][T75672] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 89 fa b8 a9 00 00 00 bf ad de e1 fe be 69 19 12 28 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 f7 d8 48 8b 0d db 2c 07 00 64 89 01 48
[ 2710.937520][T75672] RSP: 002b:00007ffd16943748 EFLAGS: 00000246 ORIG_RAX: 00000000000000a9
[ 2710.937523][T75672] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fa57e4f2513
[ 2710.937524][T75672] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
[ 2710.937526][T75672] RBP: 00007ffd16943a60 R08: 0000000000000009 R09: 00007fa57e5672e0
[ 2710.937527][T75672] R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffd169438e0
[ 2710.937528][T75672] R13: 0000000000000000 R14: 00007ffd169438e0 R15: 0000000000000001
[ 2710.937532][T75672] </TASK>
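
For anyone following along, the cycle reported above can be modeled in a few
lines of userspace C. This is only an illustrative sketch, not kernel code and
not how lockdep itself is implemented: the enum names, struct lock_graph, and
the helper functions are all made up for this example, and the "held" side of
each edge is inferred from the order of the dependency chain in the splat
(each entry is assumed to be acquired while the next-lower-numbered lock is
held).

```c
#include <stdbool.h>

/* Lock classes named in the lockdep report (labels are illustrative). */
enum lock_class {
	GROUP_MUTEX,		/* &group->mutex */
	CPU_HOTPLUG_LOCK,	/* cpu_hotplug_lock */
	I_MUTEX_DIR_KEY,	/* &type->i_mutex_dir_key#4 */
	MMAP_LOCK,		/* &mm->mmap_lock */
	MEMORY_LOCK,		/* &vdev->memory_lock */
	NR_LOCK_CLASSES,
};

/* dep[a][b] is true if some path acquired b while already holding a. */
struct lock_graph {
	bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

static void record_dependency(struct lock_graph *g, enum lock_class held,
			      enum lock_class acquired)
{
	g->dep[held][acquired] = true;
}

/* Depth-first search: is 'to' reachable from 'from' over recorded edges? */
static bool reaches(const struct lock_graph *g, enum lock_class from,
		    enum lock_class to, bool *visited)
{
	if (from == to)
		return true;

	visited[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (g->dep[from][next] && !visited[next] &&
		    reaches(g, (enum lock_class)next, to, visited))
			return true;
	}
	return false;
}

/*
 * Taking 'acquired' while holding 'held' closes a cycle if 'held' is
 * already reachable from 'acquired' through existing dependencies.
 */
static bool closes_cycle(const struct lock_graph *g, enum lock_class held,
			 enum lock_class acquired)
{
	bool visited[NR_LOCK_CLASSES] = { false };

	return reaches(g, acquired, held, visited);
}

/* Existing chain from the splat, entries #1 through #4. */
static void load_existing_chain(struct lock_graph *g)
{
	record_dependency(g, GROUP_MUTEX, CPU_HOTPLUG_LOCK);	/* #1 */
	record_dependency(g, CPU_HOTPLUG_LOCK, I_MUTEX_DIR_KEY);	/* #2 */
	record_dependency(g, I_MUTEX_DIR_KEY, MMAP_LOCK);	/* #3 */
	record_dependency(g, MMAP_LOCK, MEMORY_LOCK);		/* #4 */
}
```

With the chain loaded into a zero-initialized struct lock_graph,
closes_cycle(&g, MEMORY_LOCK, GROUP_MUTEX) returns true. That corresponds to
entry #0 in the report: freeze takes group->mutex during the reset while it
still holds memory_lock. The reverse order, closes_cycle(&g, GROUP_MUTEX,
MEMORY_LOCK), returns false because it matches the direction of the existing
chain.
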
> +	pci_restore_state(pdev);
> +	return 0;
> }