linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jacob Pan <jacob.pan@linux.microsoft.com>
Cc: Vipin Sharma <vipinsh@google.com>,
	bhelgaas@google.com, alex.williamson@redhat.com,
	pasha.tatashin@soleen.com, dmatlack@google.com, graf@amazon.com,
	pratyush@kernel.org, gregkh@linuxfoundation.org,
	chrisl@kernel.org, rppt@kernel.org, skhawaja@google.com,
	parav@nvidia.com, saeedm@nvidia.com, kevin.tian@intel.com,
	jrhilke@google.com, david@redhat.com, jgowans@amazon.com,
	dwmw2@infradead.org, epetron@amazon.de, junaids@google.com,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	kvm@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
Date: Wed, 29 Oct 2025 13:21:10 -0300	[thread overview]
Message-ID: <20251029162110.GQ760669@ziepe.ca> (raw)
In-Reply-To: <20251028103945.0000716e@linux.microsoft.com>

On Tue, Oct 28, 2025 at 10:39:45AM -0700, Jacob Pan wrote:

> My current approach is that I have a special noiommu driver that handles
> the special iommu_domain. It seems much cleaner though some extra code
> overhead. I have a working prototype that has:

Oh interesting, maybe that is OK and reasonable.. My first worry is
that we don't well support iommu driver hot unplug, but if it is very
carefully controlled I think we can make it safe. iommufd selftests is
already doing this and I've been trying to make sure it stays safe
without races or memory leaks..

Binding is going to also need some fiddling because we don't want to
mess with the fwspec on a real struct device..

But maybe we can have some kind of direct 'bind iommu driver to struct
device' call?

> The following user test can pass:
> 1. __iommufd = open("/dev/iommu", O_RDWR);
> 2. devfd = open a noiommu cdev
> 3. ioas_id = ioas_alloc(__iommufd)
> 4. iommufd_bind(__iommufd, devfd)
> 5. successfully do an ioas map, e.g.
> ioctl(iommufd, IOMMU_IOAS_MAP, &map) 
> This will call pfn_reader_user_pin() but the noiommu driver does
> nothing for mapping.

Make sense.

So you can't have a paging iommu_domain that doesn't have a map
function - that just won't work for iommufd. What you should do is use
the iommu pt stuff and have the noiommu driver implement its paging
domain using the amdv1 format.

That will give you map/unmap/iova_to_phys and then iommufd will
immediately full work.

Look at how that series handles the selftest, the simple selftest
iommu_domain is very close to what you need. It is pretty small code
wise.

> > After writing the generic pt self test it occured to me we now have
> > enough infrastructure for iommufd to internally create its own
> > iommu_domain with a AMDv1 page table for the noiommu devices. It would
> > then be so easy to feed that through the existing machinery and have
> > all the pinning/etc work.
>
> Could you elaborate a little more? noiommu devices don't have page
> tables. Are you saying iommufd can create its own iommu_domain w/o a
> vendor iommu driver? Let me catch up with your v7 :)

That was my suggestion, but it seems you tried that and decided it was
too hard with groups/etc. OK.

Adding a dummy iommu driver solves that and you still get to the same
place where there is a paging iommu domain that implements an actual
page table with map/unmap/iova_to_phys. From this perspective iommufd
will be entirely happy and will do all the required pinning and
unpinning.

> > Then only an ioctl to read back the physical addresses from this
> > special domain would be needed
>
> Yes, that was part of your original suggestion to avoid /proc pagemap.
> I have not added that yet. Do you think this warrant a new ioctl or
> just return it in

I think a new ioctl is probably the right idea..

Jason

  reply	other threads:[~2025-10-29 16:21 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-18  0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
2025-10-18  0:06 ` [RFC PATCH 01/21] selftests/liveupdate: Build tests from the selftests/liveupdate directory Vipin Sharma
2025-10-18  0:06 ` [RFC PATCH 02/21] selftests/liveupdate: Create library of core live update ioctls Vipin Sharma
2025-10-18  0:06 ` [RFC PATCH 03/21] selftests/liveupdate: Move do_kexec.sh script to liveupdate/lib Vipin Sharma
2025-10-18  0:06 ` [RFC PATCH 04/21] selftests/liveupdate: Move LUO ioctls calls to liveupdate library Vipin Sharma
2025-10-18  0:06 ` [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator Vipin Sharma
2025-10-31 21:24   ` David Matlack
2025-10-31 22:28   ` David Matlack
2025-10-18  0:06 ` [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev Vipin Sharma
2025-10-27 20:44   ` Jacob Pan
2025-10-28 13:28     ` Jason Gunthorpe
2025-10-28 17:39       ` Jacob Pan
2025-10-29 16:21         ` Jason Gunthorpe [this message]
2025-10-30 23:10     ` David Matlack
2025-10-31  0:18       ` Pasha Tatashin
2025-10-31 21:41         ` David Matlack
2025-10-18  0:06 ` [RFC PATCH 07/21] vfio/pci: Store VFIO PCI device preservation data in KHO for live update Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 08/21] vfio/pci: Retrieve preserved VFIO device for Live Update Orechestrator Vipin Sharma
2025-10-31 23:12   ` David Matlack
2025-10-18  0:07 ` [RFC PATCH 09/21] vfio/pci: Add Live Update finish callback implementation Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 10/21] PCI: Add option to skip Bus Master Enable reset during kexec Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live update device " Vipin Sharma
2025-10-18  7:09   ` Lukas Wunner
2025-10-18 22:19     ` Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 12/21] vfio/pci: Skip clearing bus master on live update restored device Vipin Sharma
2025-10-20 21:29   ` David Matlack
2025-10-20 22:39     ` Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 13/21] vfio/pci: Preserve VFIO PCI config space through live update Vipin Sharma
2025-10-18 14:59   ` Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 14/21] vfio/pci: Skip device reset on live update restored device Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public Vipin Sharma
2025-10-18  7:17   ` Lukas Wunner
2025-10-18 22:36     ` Vipin Sharma
2025-10-18 23:11       ` Jason Gunthorpe
2025-10-20 23:49         ` Vipin Sharma
2025-10-22 17:45           ` David Matlack
2025-10-22 17:51             ` Jason Gunthorpe
2025-10-22 17:53           ` Jason Gunthorpe
2025-10-19  8:15       ` Lukas Wunner
2025-10-20 23:54         ` Vipin Sharma
2025-10-30 23:55         ` David Matlack
2025-10-31  0:06           ` David Matlack
2025-10-18  0:07 ` [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device Vipin Sharma
2025-10-18  7:25   ` Lukas Wunner
2025-10-18 22:44     ` Vipin Sharma
2025-10-18 15:02   ` Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 17/21] vfio/pci: Disable interrupts before going live update kexec Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 18/21] vfio: selftests: Build liveupdate library in VFIO selftests Vipin Sharma
2025-10-20 20:50   ` David Matlack
2025-10-20 23:55     ` Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 19/21] vfio: selftests: Initialize vfio_pci_device using a VFIO cdev FD Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 20/21] vfio: selftests: Add VFIO live update test Vipin Sharma
2025-10-18  0:07 ` [RFC PATCH 21/21] vfio: selftests: Validate vconfig preservation of VFIO PCI device during live update Vipin Sharma
2025-10-18 17:21 ` [RFC PATCH 00/21] VFIO live update support Jason Gunthorpe
2025-10-18 22:53   ` Vipin Sharma
2025-10-18 23:06     ` Jason Gunthorpe
2025-10-20 23:30       ` Vipin Sharma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251029162110.GQ760669@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=chrisl@kernel.org \
    --cc=david@redhat.com \
    --cc=dmatlack@google.com \
    --cc=dwmw2@infradead.org \
    --cc=epetron@amazon.de \
    --cc=graf@amazon.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jacob.pan@linux.microsoft.com \
    --cc=jgowans@amazon.com \
    --cc=jrhilke@google.com \
    --cc=junaids@google.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=parav@nvidia.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=pratyush@kernel.org \
    --cc=rppt@kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=skhawaja@google.com \
    --cc=vipinsh@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).