public inbox for linux-kernel@vger.kernel.org
From: Alex Mastro <amastro@fb.com>
To: David Matlack <dmatlack@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>,
	Alex Williamson <alex@shazbot.org>,
	Adithya Jayachandran <ajayachandra@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Bjorn Helgaas <bhelgaas@google.com>, Chris Li <chrisl@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Jacob Pan <jacob.pan@linux.microsoft.com>,
	Jason Gunthorpe <jgg@nvidia.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Josh Hilke <jrhilke@google.com>,
	Kevin Tian <kevin.tian@intel.com>, <kvm@vger.kernel.org>,
	Leon Romanovsky <leonro@nvidia.com>,
	<linux-kernel@vger.kernel.org>, <linux-kselftest@vger.kernel.org>,
	<linux-pci@vger.kernel.org>, Lukas Wunner <lukas@wunner.de>,
	Mike Rapoport <rppt@kernel.org>, Parav Pandit <parav@nvidia.com>,
	Philipp Stanner <pstanner@redhat.com>,
	Pratyush Yadav <pratyush@kernel.org>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Samiullah Khawaja <skhawaja@google.com>,
	Shuah Khan <shuah@kernel.org>,
	Tomita Moeko <tomitamoeko@gmail.com>,
	Vipin Sharma <vipinsh@google.com>, William Tu <witu@nvidia.com>,
	Yi Liu <yi.l.liu@intel.com>, Yunxiang Li <Yunxiang.Li@amd.com>,
	Zhu Yanjun <yanjun.zhu@linux.dev>
Subject: Re: [PATCH 06/21] vfio/pci: Retrieve preserved device files after Live Update
Date: Thu, 4 Dec 2025 02:30:05 -0800
Message-ID: <aTFirYPI5vlIhvCK@devgpu015.cco6.facebook.com>
In-Reply-To: <CALzav=ciz4kV+u3B5bMzZzVY+cMs-G=q9c5O-jKPz+E4LUdx7g@mail.gmail.com>

On Wed, Dec 03, 2025 at 09:29:27AM -0800, David Matlack wrote:
> On Wed, Dec 3, 2025 at 7:46 AM Pasha Tatashin <pasha.tatashin@soleen.com> wrote:
> >
> > On Wed, Dec 3, 2025 at 7:55 AM Alex Mastro <amastro@fb.com> wrote:
> > >
> > > On Wed, Nov 26, 2025 at 07:35:53PM +0000, David Matlack wrote:
> > > > From: Vipin Sharma <vipinsh@google.com>
> > > >  static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_op_args *args)
> > > >  {
> > > > -     return -EOPNOTSUPP;
> > > > +     struct vfio_pci_core_device_ser *ser;
> > > > +     struct vfio_device *device;
> > > > +     struct folio *folio;
> > > > +     struct file *file;
> > > > +     int ret;
> > > > +
> > > > +     folio = kho_restore_folio(args->serialized_data);
> > > > +     if (!folio)
> > > > +             return -ENOENT;
> > >
> > > Should this be consistent with the behavior of pci_flb_retrieve() which panics
> > > on failure? The short circuit failure paths which follow leak the folio,
> 
> Thanks for catching the leaked folio. I'll fix that in the next version.
> 
> > > which seems like a hygiene issue, but the practical significance is moot if
> > > vfio_pci_liveupdate_retrieve() failure is catastrophic anyways?
> >
> > pci_flb_retrieve() is used during boot. If it fails, we risk DMA
> > corrupting any memory region, so a panic makes sense. In contrast,
> > this retrieval happens once we are already in userspace, allowing the
> > user to decide how to handle the failure to recover the preserved
> > cdev.
> 
> This is what I was thinking as well. vfio_pci_liveupdate_retrieve()
> runs in the context of the ioctl LIVEUPDATE_SESSION_RETRIEVE_FD, so we
> can just return an error up to userspace if anything goes wrong and
> let userspace initiate the reboot to recover the device if/when it's
> ready.
> 
> OTOH, pci_flb_retrieve() gets called by the kernel during early boot
> to determine what devices the previous kernel preserved. If the kernel
> can't determine which devices were preserved by the previous kernel
> and once the kernel starts preserving I/O page tables, that could lead
> to corruption, so panicking is warranted.

Makes sense, thanks for elaborating, David and Pasha.
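
For readers following the thread: the folio leak raised above is the kind of
bug a goto-based unwind usually avoids, and returning an errno (rather than
panicking) is what lets the ioctl caller decide how to recover. Below is a
minimal userspace sketch of that shape. Everything in it is hypothetical:
fake_restore_folio(), fake_put_folio(), fake_retrieve() and the live_folios
counter are stand-ins invented for illustration, not the kernel's KHO or VFIO
APIs, and releasing the folio on the success path is a simplification that the
real patch may not share.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a restored folio; not a kernel type. */
struct fake_folio {
	int data;
};

/* Counts folios that were restored but not yet released. */
static int live_folios;

/* Hypothetical model of restoring preserved memory; NULL on failure. */
static struct fake_folio *fake_restore_folio(int handle)
{
	struct fake_folio *folio;

	if (handle < 0)
		return NULL;

	folio = malloc(sizeof(*folio));
	if (!folio)
		return NULL;

	folio->data = handle;
	live_folios++;
	return folio;
}

/* Hypothetical release helper paired with fake_restore_folio(). */
static void fake_put_folio(struct fake_folio *folio)
{
	assert(live_folios > 0);
	free(folio);
	live_folios--;
}

/*
 * Model of a retrieve() callback that runs in ioctl context: every
 * failure is reported to the caller as a negative errno, and every
 * exit taken after a successful restore goes through out_put, so the
 * folio cannot leak.
 */
static int fake_retrieve(int handle, int fail_lookup, int *out)
{
	struct fake_folio *folio;
	int ret = 0;

	folio = fake_restore_folio(handle);
	if (!folio)
		return -ENOENT;	/* nothing was restored, nothing to release */

	if (fail_lookup) {
		/* Models a later step failing, e.g. device lookup. */
		ret = -ENODEV;
		goto out_put;
	}

	*out = folio->data;

out_put:
	fake_put_folio(folio);
	return ret;
}
```

In this model a caller sees -ENOENT or -ENODEV and can choose what to do next,
while live_folios returns to zero on every path. An early-boot path like the
one described for pci_flb_retrieve() has no such caller to report to, which is
the argument given above for panicking there instead.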
