From: Jacob Pan <jacob.pan@linux.microsoft.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Gowans, James" <jgowans@amazon.com>,
"yi.l.liu@intel.com" <yi.l.liu@intel.com>,
"jinankjain@linux.microsoft.com" <jinankjain@linux.microsoft.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"rppt@kernel.org" <rppt@kernel.org>,
"kw@linux.com" <kw@linux.com>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"madvenka@linux.microsoft.com" <madvenka@linux.microsoft.com>,
"anthony.yznaga@oracle.com" <anthony.yznaga@oracle.com>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
"nh-open-source@amazon.com" <nh-open-source@amazon.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"seanjc@google.com" <seanjc@google.com>,
"Saenz Julienne, Nicolas" <nsaenz@amazon.es>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"kevin.tian@intel.com" <kevin.tian@intel.com>,
"dwmw2@infradead.org" <dwmw2@infradead.org>,
"ssengar@linux.microsoft.com" <ssengar@linux.microsoft.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"will@kernel.org" <will@kernel.org>,
"Graf (AWS), Alexander" <graf@amazon.de>,
"steven.sistare@oracle.com" <steven.sistare@oracle.com>,
jacob.pan@linux.microsoft.com,
"zhangyu1@microsoft.com" <zhangyu1@microsoft.com>
Subject: Re: [RFC PATCH 05/13] iommufd: Serialise persisted iommufds and ioas
Date: Wed, 6 Nov 2024 11:18:50 -0800
Message-ID: <20241106111850.69904346@DESKTOP-0403QTC.>
In-Reply-To: <20241104130011.GD35848@ziepe.ca>
Hi Jason,
On Mon, 4 Nov 2024 09:00:11 -0400
Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Sat, Nov 02, 2024 at 10:22:54AM +0000, Gowans, James wrote:
>
> > Yes, I think the guidance was to bind a device to iommufd in noiommu
> > mode. It does seem a bit weird to use iommufd with noiommu, but we
> > agreed it's the best/simplest way to get the functionality.
>
> noiommu should still have an ioas and still have kernel managed page
> pinning.
>
> My remark to bring it to iommufd was to also make it a fully
> architected feature and stop relying on mprotect and /proc/ tricks.
>
Just to clarify my tentative understanding with more details (please
correct me where I am wrong):
1. Create an iommufd access object for the noiommu device when
binding it to an iommufd ctx.
2. All user memory used by the device in noiommu mode should be
pinned by iommufd, i.e. via iommufd_access_pin_pages().
I guess you meant to stop doing mlock, rather than the mprotect trick?
I think openHCL is using a /dev/mem trick.
3. An ioas can be attached to the noiommu iommufd_access object, similar
to an emulated device (mdev).
What kind/source of memory should be supported here, e.g. device
memory regions exposed by PCI BARs?
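A rough way to picture points 1-3 is the small userspace model below.
It is not kernel code: model_ioas, model_access and every model_*
function are hypothetical stand-ins invented for illustration, and only
iommufd_access_pin_pages() is named in this thread. The model also
assumes that the ioas is attached before any pin request is made, and
it tracks pins as simple per-page counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_MAX_PAGES 64

/* Hypothetical stand-in for an ioas: one pin counter per page. */
struct model_ioas {
	unsigned long base;	/* first IOVA covered by this ioas */
	size_t npages;		/* number of pages covered */
	unsigned int pin_count[MODEL_MAX_PAGES];
};

/* Hypothetical stand-in for the access object created at bind time. */
struct model_access {
	struct model_ioas *ioas;	/* NULL until an ioas is attached */
};

/* Point 1: create the access object when the device is bound. */
void model_access_create(struct model_access *access)
{
	access->ioas = NULL;
}

/* Point 3: attach an ioas; only one attachment is allowed in this model. */
int model_access_attach(struct model_access *access, struct model_ioas *ioas)
{
	if (access->ioas || !ioas || ioas->npages > MODEL_MAX_PAGES)
		return -1;
	access->ioas = ioas;
	return 0;
}

/* Translate [iova, iova + length) into a page range; -1 if invalid. */
static int model_range(const struct model_access *access, unsigned long iova,
		       unsigned long length, size_t *first, size_t *count)
{
	const struct model_ioas *ioas = access->ioas;

	if (!ioas || length == 0 || iova < ioas->base)
		return -1;
	if ((iova - ioas->base) % MODEL_PAGE_SIZE || length % MODEL_PAGE_SIZE)
		return -1;
	*first = (iova - ioas->base) / MODEL_PAGE_SIZE;
	*count = length / MODEL_PAGE_SIZE;
	if (*first >= ioas->npages || *count > ioas->npages - *first)
		return -1;
	return 0;
}

/* Point 2: pin every page in the range, or fail without side effects. */
int model_access_pin_pages(struct model_access *access, unsigned long iova,
			   unsigned long length)
{
	size_t first, count, i;

	if (model_range(access, iova, length, &first, &count))
		return -1;
	assert(first + count <= MODEL_MAX_PAGES);
	for (i = 0; i < count; i++)
		access->ioas->pin_count[first + i]++;
	return 0;
}

/* Drop one pin per page; refuse if any page in the range is not pinned. */
int model_access_unpin_pages(struct model_access *access, unsigned long iova,
			     unsigned long length)
{
	size_t first, count, i;

	if (model_range(access, iova, length, &first, &count))
		return -1;
	for (i = 0; i < count; i++)
		if (access->ioas->pin_count[first + i] == 0)
			return -1;
	for (i = 0; i < count; i++)
		access->ioas->pin_count[first + i]--;
	return 0;
}

/* True while any page in the ioas still holds a pin. */
bool model_ioas_has_pins(const struct model_ioas *ioas)
{
	size_t i;

	for (i = 0; i < ioas->npages; i++)
		if (ioas->pin_count[i])
			return true;
	return false;
}
```

In this model a pin request on an unattached access object fails, which
is the ordering assumption stated above; whether the real noiommu path
should behave the same way is part of the question being asked here.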
> > Then as you suggest below the IOMMUFD_OBJ_DEVICE would be serialised
> > too in some way, probably by iommufd telling the PCI layer that this
> > device must be persistent and hence not to re-probe it on kexec.
>
> Presumably VFIO would be doing some/most of this part since it is the
> driver that will be binding?
>
Yes, it is the user mode driver that initiates the binding. I was
thinking that, since the granularity of persistence is per iommufd ctx,
the VFIO device flag that marks keep_alive could come from the iommufd
ctx.
> > It's all a bit hand wavy at the moment, but something along those
> > lines probably makes sense. I need to work on rev2 of this RFC as
> > per Jason's feedback in the other thread. Rev2 will make the
> > restore path more userspace driven, with fresh iommufd and pgtables
> > objects being created and then atomically swapped over too. I'll
> > also get the PCI layer involved with rev2. Once that's out (it'll
> > be a few weeks as I'm on leave) then let's take a look at how the
> > noiommu device persistence case would fit in.
>
> In a certain sense it would be nice to see the noiommu flow as it
> breaks apart the problem into the first dependency:
>
> How to get the device handed across the kexec and safely land back in
> VFIO, and only VFIO's hands.
>
> Preserving the iommu HW configuration is an incremental step built on
> that base line.
Makes sense. I need to catch up on the KHO series and hook up noiommu
as the first step.
> Also, FWIW, this needs to follow good open source practices - we need
> an open userspace for the feature and the kernel stuff should be
> merged in a logical order.
>
Yes, we will have matching userspace in openHCL:
https://github.com/microsoft/openvmm
Thanks,
Jacob