From: Zhu Yanjun <yanjun.zhu@linux.dev>
To: David Matlack <dmatlack@google.com>, Alex Williamson <alex@shazbot.org>
Cc: Adithya Jayachandran <ajayachandra@nvidia.com>,
Alex Mastro <amastro@fb.com>,
Alistair Popple <apopple@nvidia.com>,
Andrew Morton <akpm@linux-foundation.org>,
Bjorn Helgaas <bhelgaas@google.com>, Chris Li <chrisl@kernel.org>,
David Rientjes <rientjes@google.com>,
Jacob Pan <jacob.pan@linux.microsoft.com>,
Jason Gunthorpe <jgg@nvidia.com>, Jason Gunthorpe <jgg@ziepe.ca>,
Josh Hilke <jrhilke@google.com>,
Kevin Tian <kevin.tian@intel.com>,
kvm@vger.kernel.org, Leon Romanovsky <leonro@nvidia.com>,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-pci@vger.kernel.org, Lukas Wunner <lukas@wunner.de>,
Mike Rapoport <rppt@kernel.org>, Parav Pandit <parav@nvidia.com>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Philipp Stanner <pstanner@redhat.com>,
Pratyush Yadav <pratyush@kernel.org>,
Saeed Mahameed <saeedm@nvidia.com>,
Samiullah Khawaja <skhawaja@google.com>,
Shuah Khan <shuah@kernel.org>,
Tomita Moeko <tomitamoeko@gmail.com>,
Vipin Sharma <vipinsh@google.com>, William Tu <witu@nvidia.com>,
Yi Liu <yi.l.liu@intel.com>, Yunxiang Li <Yunxiang.Li@amd.com>
Subject: Re: [PATCH 00/21] vfio/pci: Base support to preserve a VFIO device file across Live Update
Date: Thu, 27 Nov 2025 20:56:06 -0800
Message-ID: <dadaeeb9-4008-4450-8b61-e147a2af38b2@linux.dev>
In-Reply-To: <20251126193608.2678510-1-dmatlack@google.com>
On 2025/11/26 11:35, David Matlack wrote:
> This series adds the base support to preserve a VFIO device file across
> a Live Update. "Base support" means that this allows userspace to
> safely preserve a VFIO device file with LIVEUPDATE_SESSION_PRESERVE_FD
> and retrieve a preserved VFIO device file with
> LIVEUPDATE_SESSION_RETRIEVE_FD, but the device itself is not preserved
> in a fully running state across Live Update.
>
> This series unblocks 2 parallel but related streams of work:
>
> - iommufd preservation across Live Update. This work spans iommufd,
> the IOMMU subsystem, and IOMMU drivers [1]
>
> - Preservation of VFIO device state across Live Update (config space,
> BAR addresses, power state, SR-IOV state, etc.). This work spans both
> VFIO and the core PCI subsystem.
>
> While we need all of the above to fully preserve a VFIO device across a
> Live Update without disrupting the workload on the device, this series
> aims to be functional and safe enough to merge as the first incremental
> step toward that goal.
>
> Areas for Discussion
> --------------------
>
> BDF Stability across Live Update
>
> The PCI support for tracking preserved devices across a Live Update to
> prevent auto-probing relies on PCI segment numbers and BDFs remaining
> stable. For now I have disallowed VFs, as the BDFs assigned to VFs can
> vary depending on how the kernel chooses to allocate bus numbers. For
> non-VFs I am wondering if there is any more needed to ensure BDF
> stability across Live Update.
>
> While we would like to support many different systems and
> configurations in due time (including preserving VFs), I'd like to
> keep this first series constrained to simple use-cases.
>
> FLB Locking
>
> I don't see a way to properly synchronize pci_flb_finish() with
> pci_liveupdate_incoming_is_preserved() since the incoming FLB mutex is
> dropped by liveupdate_flb_get_incoming() when it returns the pointer
> to the object, and taking pci_flb_incoming_lock in pci_flb_finish()
> could result in a deadlock due to reversing the lock ordering.
>
> FLB Retrieving
>
> The first patch of this series includes a fix to prevent an FLB from
> being retrieved again after it is finished. I am wondering if this is the
> right approach or if subsystems are expected to stop calling
> liveupdate_flb_get_incoming() after an FLB is finished.
>
> Testing
> -------
>
> The patches at the end of this series provide comprehensive selftests
> for the new code added by this series. The selftests have been validated
> in both a VM environment using a virtio-net PCIe device, and in a
> bare-metal environment on an Intel EMR server with an Intel DSA device.
>
> Here is an example of how to run the new selftests:

Hi, David

ERROR: modpost: "liveupdate_register_file_handler" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "vfio_pci_ops" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "liveupdate_enabled" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "liveupdate_unregister_file_handler" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "vfio_device_fops" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "vfio_pci_is_intel_display" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "vfio_pci_liveupdate_init" [drivers/vfio/pci/vfio-pci.ko] undefined!
ERROR: modpost: "vfio_pci_liveupdate_cleanup" [drivers/vfio/pci/vfio-pci.ko] undefined!
make[4]: *** [scripts/Makefile.modpost:147: Module.symvers] Error 1
make[3]: *** [Makefile:1960: modpost] Error 2

I cloned the source code from
https://github.com/dmatlack/linux/tree/liveupdate/vfio/cdev/v1
and hit the errors above when building it.
Exporting the missing symbols with EXPORT_SYMBOL might fix them,
but I am not sure whether there is a better solution.
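
To make it easier to see which symbols each module is missing, the
modpost output can be grouped per module. The following is only a rough
sketch, not part of the series: the helper name undefined_symbols() and
the LOG variable are made up for illustration, and the log text is the
excerpt from this message. It lists the unresolved symbols; it does not
say whether EXPORT_SYMBOL is the right fix.

```python
import re
from collections import defaultdict

# Build log excerpt copied from this message. The mailer wrapped some
# entries across two lines, which the pattern below tolerates.
LOG = """\
ERROR: modpost: "liveupdate_register_file_handler"
[drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "vfio_pci_ops" [drivers/vfio/pci/vfio-pci-core.ko]
undefined!
ERROR: modpost: "liveupdate_enabled" [drivers/vfio/pci/vfio-pci-core.ko]
undefined!
ERROR: modpost: "liveupdate_unregister_file_handler"
[drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "vfio_device_fops" [drivers/vfio/pci/vfio-pci-core.ko]
undefined!
ERROR: modpost: "vfio_pci_is_intel_display"
[drivers/vfio/pci/vfio-pci-core.ko] undefined!
ERROR: modpost: "vfio_pci_liveupdate_init"
[drivers/vfio/pci/vfio-pci.ko] undefined!
ERROR: modpost: "vfio_pci_liveupdate_cleanup"
[drivers/vfio/pci/vfio-pci.ko] undefined!
"""

# \s+ also matches newlines, so wrapped entries are still recognised.
PATTERN = re.compile(r'ERROR: modpost: "([^"]+)"\s+\[([^\]]+)\]\s+undefined!')


def undefined_symbols(log):
    """Map each module path to the sorted symbols modpost could not resolve."""
    by_module = defaultdict(list)
    for symbol, module in PATTERN.findall(log):
        by_module[module].append(symbol)
    return {module: sorted(symbols) for module, symbols in by_module.items()}


if __name__ == "__main__":
    for module, symbols in undefined_symbols(LOG).items():
        print(f"{module}: {len(symbols)} undefined")
        for symbol in symbols:
            print(f"  {symbol}")
```

Each symbol reported this way is referenced by the named module but is
not visible to it at modpost time.
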
Thanks,
Yanjun.Zhu
>
> vfio_pci_liveupdate_uapi_test:
>
> $ tools/testing/selftests/vfio/scripts/setup.sh 0000:00:04.0
> $ tools/testing/selftests/vfio/vfio_pci_liveupdate_uapi_test 0000:00:04.0
> $ tools/testing/selftests/vfio/scripts/cleanup.sh
>
> vfio_pci_liveupdate_kexec_test:
>
> $ tools/testing/selftests/vfio/scripts/setup.sh 0000:00:04.0
> $ tools/testing/selftests/vfio/vfio_pci_liveupdate_kexec_test --stage 1 0000:00:04.0
> $ kexec [...] # NOTE: distro-dependent
>
> $ tools/testing/selftests/vfio/scripts/setup.sh 0000:00:04.0
> $ tools/testing/selftests/vfio/vfio_pci_liveupdate_kexec_test --stage 2 0000:00:04.0
> $ tools/testing/selftests/vfio/scripts/cleanup.sh
>
> Dependencies
> ------------
>
> This series was constructed on top of several in-flight series and on
> top of mm-nonmm-unstable [2].
>
> +-- This series
> |
> +-- [PATCH v2 00/18] vfio: selftests: Support for multi-device tests
> | https://lore.kernel.org/kvm/20251112192232.442761-1-dmatlack@google.com/
> |
> +-- [PATCH v3 0/4] vfio: selftests: update DMA mapping tests to use queried IOVA ranges
> | https://lore.kernel.org/kvm/20251111-iova-ranges-v3-0-7960244642c5@fb.com/
> |
> +-- [PATCH v8 0/2] Live Update: File-Lifecycle-Bound (FLB) State
> | https://lore.kernel.org/linux-mm/20251125225006.3722394-1-pasha.tatashin@soleen.com/
> |
> +-- [PATCH v8 00/18] Live Update Orchestrator
> | https://lore.kernel.org/linux-mm/20251125165850.3389713-1-pasha.tatashin@soleen.com/
> |
>
> To simplify checking out the code, this series can be found on GitHub:
>
> https://github.com/dmatlack/linux/tree/liveupdate/vfio/cdev/v1
>
> Changelog
> ---------
>
> v1:
> - Rebase series on top of LUOv8 and VFIO selftests improvements
> - Drop commits to preserve config space fields across Live Update.
> These changes require changes to the PCI layer. For example,
> preserving rbars could lead to an inconsistent device state until
> device BAR addresses are preserved across Live Update.
> - Drop commits to preserve Bus Master Enable on the device. There's no
> reason to preserve this until iommufd preservation is fully working.
> Furthermore, preserving Bus Master Enable could lead to memory
> corruption if the device is bound to the default
> identity-map domain after Live Update.
> - Drop commits to preserve saved PCI state. This work is not needed
> until we are ready to preserve the device's config space, and
> requires more thought to make the PCI state data layout ABI-friendly.
> - Add support to skip auto-probing devices that are preserved by VFIO
> to avoid them getting bound to a different driver by the next kernel.
> - Restrict device preservation further (no VFs, no intel-graphics).
> - Various refactoring and small edits to improve readability and
> eliminate code duplication.
>
> rfc: https://lore.kernel.org/kvm/20251018000713.677779-1-vipinsh@google.com/
>
> Cc: Saeed Mahameed <saeedm@nvidia.com>
> Cc: Adithya Jayachandran <ajayachandra@nvidia.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Parav Pandit <parav@nvidia.com>
> Cc: Leon Romanovsky <leonro@nvidia.com>
> Cc: William Tu <witu@nvidia.com>
> Cc: Jacob Pan <jacob.pan@linux.microsoft.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Pratyush Yadav <pratyush@kernel.org>
> Cc: Samiullah Khawaja <skhawaja@google.com>
> Cc: Chris Li <chrisl@kernel.org>
> Cc: Josh Hilke <jrhilke@google.com>
> Cc: David Rientjes <rientjes@google.com>
>
> [1] https://lore.kernel.org/linux-iommu/20250928190624.3735830-1-skhawaja@google.com/
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/log/?h=mm-nonmm-unstable
>
> David Matlack (12):
> liveupdate: luo_flb: Prevent retrieve() after finish()
> PCI: Add API to track PCI devices preserved across Live Update
> PCI: Require driver_override for incoming Live Update preserved
> devices
> vfio/pci: Notify PCI subsystem about devices preserved across Live
> Update
> vfio: Enforce preserved devices are retrieved via
> LIVEUPDATE_SESSION_RETRIEVE_FD
> vfio/pci: Store Live Update state in struct vfio_pci_core_device
> vfio: selftests: Add Makefile support for TEST_GEN_PROGS_EXTENDED
> vfio: selftests: Add vfio_pci_liveupdate_uapi_test
> vfio: selftests: Expose iommu_modes to tests
> vfio: selftests: Expose low-level helper routines for setting up
> struct vfio_pci_device
> vfio: selftests: Verify that opening VFIO device fails during Live
> Update
> vfio: selftests: Add continuous DMA to vfio_pci_liveupdate_kexec_test
>
> Vipin Sharma (9):
> vfio/pci: Register a file handler with Live Update Orchestrator
> vfio/pci: Preserve vfio-pci device files across Live Update
> vfio/pci: Retrieve preserved device files after Live Update
> vfio/pci: Skip reset of preserved device after Live Update
> selftests/liveupdate: Move luo_test_utils.* into a reusable library
> selftests/liveupdate: Add helpers to preserve/retrieve FDs
> vfio: selftests: Build liveupdate library in VFIO selftests
> vfio: selftests: Initialize vfio_pci_device using a VFIO cdev FD
> vfio: selftests: Add vfio_pci_liveupdate_kexec_test
>
> MAINTAINERS | 1 +
> drivers/pci/Makefile | 1 +
> drivers/pci/liveupdate.c | 248 ++++++++++++++++
> drivers/pci/pci-driver.c | 12 +-
> drivers/vfio/device_cdev.c | 25 +-
> drivers/vfio/group.c | 9 +
> drivers/vfio/pci/Makefile | 1 +
> drivers/vfio/pci/vfio_pci.c | 11 +-
> drivers/vfio/pci/vfio_pci_core.c | 23 +-
> drivers/vfio/pci/vfio_pci_liveupdate.c | 278 ++++++++++++++++++
> drivers/vfio/pci/vfio_pci_priv.h | 16 +
> drivers/vfio/vfio.h | 13 -
> drivers/vfio/vfio_main.c | 22 +-
> include/linux/kho/abi/pci.h | 53 ++++
> include/linux/kho/abi/vfio_pci.h | 45 +++
> include/linux/liveupdate.h | 3 +
> include/linux/pci.h | 38 +++
> include/linux/vfio.h | 51 ++++
> include/linux/vfio_pci_core.h | 7 +
> kernel/liveupdate/luo_flb.c | 4 +
> tools/testing/selftests/liveupdate/.gitignore | 1 +
> tools/testing/selftests/liveupdate/Makefile | 14 +-
> .../include/libliveupdate.h} | 11 +-
> .../selftests/liveupdate/lib/libliveupdate.mk | 20 ++
> .../{luo_test_utils.c => lib/liveupdate.c} | 43 ++-
> .../selftests/liveupdate/luo_kexec_simple.c | 2 +-
> .../selftests/liveupdate/luo_multi_session.c | 2 +-
> tools/testing/selftests/vfio/Makefile | 23 +-
> .../vfio/lib/include/libvfio/iommu.h | 2 +
> .../lib/include/libvfio/vfio_pci_device.h | 8 +
> tools/testing/selftests/vfio/lib/iommu.c | 4 +-
> .../selftests/vfio/lib/vfio_pci_device.c | 60 +++-
> .../vfio/vfio_pci_liveupdate_kexec_test.c | 255 ++++++++++++++++
> .../vfio/vfio_pci_liveupdate_uapi_test.c | 93 ++++++
> 34 files changed, 1313 insertions(+), 86 deletions(-)
> create mode 100644 drivers/pci/liveupdate.c
> create mode 100644 drivers/vfio/pci/vfio_pci_liveupdate.c
> create mode 100644 include/linux/kho/abi/pci.h
> create mode 100644 include/linux/kho/abi/vfio_pci.h
> rename tools/testing/selftests/liveupdate/{luo_test_utils.h => lib/include/libliveupdate.h} (80%)
> create mode 100644 tools/testing/selftests/liveupdate/lib/libliveupdate.mk
> rename tools/testing/selftests/liveupdate/{luo_test_utils.c => lib/liveupdate.c} (89%)
> create mode 100644 tools/testing/selftests/vfio/vfio_pci_liveupdate_kexec_test.c
> create mode 100644 tools/testing/selftests/vfio/vfio_pci_liveupdate_uapi_test.c
>
--
Best Regards,
Yanjun.Zhu
Thread overview: 62+ messages
2025-11-26 19:35 [PATCH 00/21] vfio/pci: Base support to preserve a VFIO device file across Live Update David Matlack
2025-11-26 19:35 ` [PATCH 01/21] liveupdate: luo_flb: Prevent retrieve() after finish() David Matlack
2025-11-26 19:35 ` [PATCH 02/21] PCI: Add API to track PCI devices preserved across Live Update David Matlack
2025-11-29 10:34 ` Lukas Wunner
2025-11-29 20:10 ` Pasha Tatashin
2025-11-30 0:51 ` Jason Gunthorpe
2025-11-30 1:20 ` Pasha Tatashin
2025-12-01 13:29 ` Jason Gunthorpe
2025-12-01 18:54 ` David Matlack
2025-12-02 6:20 ` Lukas Wunner
2025-12-02 14:59 ` Jason Gunthorpe
2025-12-02 16:36 ` Chris Li
2025-12-02 18:19 ` Jason Gunthorpe
2025-12-02 21:20 ` Chris Li
2025-12-03 5:44 ` Lukas Wunner
2025-12-01 21:23 ` Pasha Tatashin
2025-11-29 20:15 ` Pasha Tatashin
2025-12-01 18:07 ` David Matlack
2025-11-26 19:35 ` [PATCH 03/21] PCI: Require driver_override for incoming Live Update preserved devices David Matlack
2025-12-02 21:16 ` David Matlack
2025-12-02 21:24 ` Chris Li
2025-11-26 19:35 ` [PATCH 04/21] vfio/pci: Register a file handler with Live Update Orchestrator David Matlack
2025-11-26 19:35 ` [PATCH 05/21] vfio/pci: Preserve vfio-pci device files across Live Update David Matlack
2025-11-26 19:35 ` [PATCH 06/21] vfio/pci: Retrieve preserved device files after " David Matlack
2025-12-03 12:55 ` Alex Mastro
2025-12-03 15:45 ` Pasha Tatashin
2025-12-03 17:29 ` David Matlack
2025-12-04 10:30 ` Alex Mastro
2025-11-26 19:35 ` [PATCH 07/21] vfio/pci: Notify PCI subsystem about devices preserved across " David Matlack
2025-11-26 19:35 ` [PATCH 08/21] vfio: Enforce preserved devices are retrieved via LIVEUPDATE_SESSION_RETRIEVE_FD David Matlack
2025-11-26 19:35 ` [PATCH 09/21] vfio/pci: Store Live Update state in struct vfio_pci_core_device David Matlack
2025-11-26 19:35 ` [PATCH 10/21] vfio/pci: Skip reset of preserved device after Live Update David Matlack
2025-11-26 19:35 ` [PATCH 11/21] selftests/liveupdate: Move luo_test_utils.* into a reusable library David Matlack
2025-11-26 19:35 ` [PATCH 12/21] selftests/liveupdate: Add helpers to preserve/retrieve FDs David Matlack
2025-11-26 19:36 ` [PATCH 13/21] vfio: selftests: Build liveupdate library in VFIO selftests David Matlack
2025-11-26 19:36 ` [PATCH 14/21] vfio: selftests: Add Makefile support for TEST_GEN_PROGS_EXTENDED David Matlack
2025-11-26 19:36 ` [PATCH 15/21] vfio: selftests: Add vfio_pci_liveupdate_uapi_test David Matlack
2025-11-26 19:36 ` [PATCH 16/21] vfio: selftests: Initialize vfio_pci_device using a VFIO cdev FD David Matlack
2025-11-26 19:36 ` [PATCH 17/21] vfio: selftests: Add vfio_pci_liveupdate_kexec_test David Matlack
2025-11-26 19:36 ` [PATCH 18/21] vfio: selftests: Expose iommu_modes to tests David Matlack
2025-11-26 19:36 ` [PATCH 19/21] vfio: selftests: Expose low-level helper routines for setting up struct vfio_pci_device David Matlack
2025-12-28 4:03 ` Zhu Yanjun
2026-01-05 17:54 ` David Matlack
2026-01-06 0:07 ` Yanjun.Zhu
2026-01-06 0:19 ` David Matlack
2025-11-26 19:36 ` [PATCH 20/21] vfio: selftests: Verify that opening VFIO device fails during Live Update David Matlack
2025-11-26 19:36 ` [PATCH 21/21] vfio: selftests: Add continuous DMA to vfio_pci_liveupdate_kexec_test David Matlack
2025-11-28 4:56 ` Zhu Yanjun [this message]
2025-12-01 15:49 ` [PATCH 00/21] vfio/pci: Base support to preserve a VFIO device file across Live Update Zhu Yanjun
2025-12-01 17:10 ` David Matlack
2025-12-01 17:16 ` Zhu Yanjun
2025-12-01 17:32 ` David Matlack
2025-12-01 17:36 ` David Matlack
2025-12-01 17:44 ` Pasha Tatashin
2025-12-01 21:45 ` Yanjun.Zhu
2025-12-01 21:48 ` David Matlack
2025-12-01 21:56 ` Yanjun.Zhu
2025-12-02 5:50 ` Zhu Yanjun
2025-12-01 21:59 ` Pasha Tatashin
2025-12-02 14:10 ` Pratyush Yadav
2025-12-02 21:29 ` David Matlack
2025-12-02 21:41 ` Pasha Tatashin