From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "Annapurve, Vishal" <vannapurve@google.com>,
Alexey Kardashevskiy <aik@amd.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
Alex Williamson <alex.williamson@redhat.com>,
"Williams, Dan J" <dan.j.williams@intel.com>,
"pratikrajesh.sampat@amd.com" <pratikrajesh.sampat@amd.com>,
"michael.day@amd.com" <michael.day@amd.com>,
"david.kaplan@amd.com" <david.kaplan@amd.com>,
"dhaval.giani@amd.com" <dhaval.giani@amd.com>,
Santosh Shukla <santosh.shukla@amd.com>,
Tom Lendacky <thomas.lendacky@amd.com>,
Michael Roth <michael.roth@amd.com>,
Alexander Graf <agraf@suse.de>,
Nikunj A Dadhania <nikunj@amd.com>,
Vasant Hegde <vasant.hegde@amd.com>,
Lukas Wunner <lukas@wunner.de>
Subject: Re: [RFC PATCH 12/21] KVM: IOMMUFD: MEMFD: Map private pages
Date: Mon, 23 Sep 2024 13:02:39 -0300 [thread overview]
Message-ID: <20240923160239.GD9417@nvidia.com> (raw)
In-Reply-To: <BL1PR11MB5271327169B23A60D66965F38C6F2@BL1PR11MB5271.namprd11.prod.outlook.com>
On Mon, Sep 23, 2024 at 08:24:40AM +0000, Tian, Kevin wrote:
> > From: Vishal Annapurve <vannapurve@google.com>
> > Sent: Monday, September 23, 2024 2:34 PM
> >
> > On Mon, Sep 23, 2024 at 7:36 AM Tian, Kevin <kevin.tian@intel.com> wrote:
> > >
> > > > From: Vishal Annapurve <vannapurve@google.com>
> > > > Sent: Saturday, September 21, 2024 5:11 AM
> > > >
> > > > On Sun, Sep 15, 2024 at 11:08 PM Jason Gunthorpe <jgg@nvidia.com>
> > wrote:
> > > > >
> > > > > On Fri, Aug 23, 2024 at 11:21:26PM +1000, Alexey Kardashevskiy wrote:
> > > > > > IOMMUFD calls get_user_pages() for every mapping which will
> > allocate
> > > > > > shared memory instead of using private memory managed by the
> > KVM
> > > > and
> > > > > > MEMFD.
> > > > >
> > > > > Please check this series, it is much more how I would expect this to
> > > > > work. Use the guest memfd directly and forget about kvm in the
> > iommufd
> > > > code:
> > > > >
> > > > > https://lore.kernel.org/r/1726319158-283074-1-git-send-email-
> > > > steven.sistare@oracle.com
> > > > >
> > > > > I would imagine you'd detect the guest memfd when accepting the FD
> > and
> > > > > then having some different path in the pinning logic to pin and get
> > > > > the physical ranges out.
> > > >
> > > > According to the discussion at KVM microconference around hugepage
> > > > support for guest_memfd [1], it's imperative that guest private memory
> > > > is not long term pinned. Ideal way to implement this integration would
> > > > be to support a notifier that can be invoked by guest_memfd when
> > > > memory ranges get truncated so that IOMMU can unmap the
> > corresponding
> > > > ranges. Such a notifier should also get called during memory
> > > > conversion, it would be interesting to discuss how conversion flow
> > > > would work in this case.
> > > >
> > > > [1] https://lpc.events/event/18/contributions/1764/ (checkout the
> > > > slide 12 from attached presentation)
> > > >
> > >
> > > Most devices don't support I/O page fault hence can only DMA to long
> > > term pinned buffers. The notifier might be helpful for in-kernel conversion
> > > but as a basic requirement there needs a way for IOMMUFD to call into
> > > guest memfd to request long term pinning for a given range. That is
> > > how I interpreted "different path" in Jason's comment.
> >
> > Policy that is being aimed here:
> > 1) guest_memfd will pin the pages backing guest memory for all users.
> > 2) kvm_gmem_get_pfn users will get a locked folio with elevated
> > refcount when asking for the pfn/page from guest_memfd. Users will
> > drop the refcount and release the folio lock when they are done
> > using/installing (e.g. in KVM EPT/IOMMU PT entries) it. This folio
> > lock is supposed to be held for short durations.
> > 3) Users can assume the pfn is around until they are notified by
> > guest_memfd on truncation or memory conversion.
> >
> > Step 3 above is already followed by KVM EPT setup logic for CoCo VMs.
> > TDX VMs especially need to have secure EPT entries always mapped (once
> > faulted-in) while the guest memory ranges are private.
>
> 'faulted-in' doesn't work for device DMAs (w/o IOPF).
>
> and above is based on the assumption that CoCo VM will always
> map/pin the private memory pages until a conversion happens.
>
> Conversion is initiated by the guest so ideally the guest is responsible
> for not leaving any in-fly DMAs to the page which is being converted.
> From this angle it is fine for IOMMUFD to receive a notification from
> guest memfd when such a conversion happens.
Right, I think the expectation is if a guest has active DMA on a page
it is changing between shared/private there is no expectation that the
DMA will succeed. So we don't need page fault, we just need to allow
it to safely fail.
IMHO we should try to do as best we can here, and the ideal interface
would be a notifier to switch the shared/private pages in some portion
of the guestmemfd. With the idea that iommufd could perhaps do it
atomically.
When the notifier returns then the old pages are fenced off at the
HW.
But this would have to be a sleepable notifier that can do memory
allocation.
It is actually pretty complicated and we will need a reliable cut
operation to make this work on AMD v1.
Jason
next prev parent reply other threads:[~2024-09-23 16:02 UTC|newest]
Thread overview: 128+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-23 13:21 [RFC PATCH 00/21] Secure VFIO, TDISP, SEV TIO Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 01/21] tsm-report: Rename module to reflect what it does Alexey Kardashevskiy
2024-08-23 22:17 ` Bjorn Helgaas
2024-08-28 13:49 ` Jonathan Cameron
2024-08-30 0:13 ` Dan Williams
2024-09-02 1:29 ` Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 02/21] pci/doe: Define protocol types and make those public Alexey Kardashevskiy
2024-08-23 22:18 ` Bjorn Helgaas
2024-08-30 2:15 ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 03/21] pci: Define TEE-IO bit in PCIe device capabilities Alexey Kardashevskiy
2024-08-23 22:19 ` Bjorn Helgaas
2024-08-28 13:54 ` Jonathan Cameron
2024-08-30 2:21 ` Dan Williams
2024-08-30 4:04 ` Alexey Kardashevskiy
2024-08-30 21:37 ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 04/21] PCI/IDE: Define Integrity and Data Encryption (IDE) extended capability Alexey Kardashevskiy
2024-08-23 22:28 ` Bjorn Helgaas
2024-08-28 14:24 ` Jonathan Cameron
2024-08-30 2:41 ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 05/21] crypto/ccp: Make some SEV helpers public Alexey Kardashevskiy
2024-08-30 2:45 ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 06/21] crypto: ccp: Enable SEV-TIO feature in the PSP when supported Alexey Kardashevskiy
2024-08-28 14:32 ` Jonathan Cameron
2024-09-03 21:27 ` Dan Williams
2024-09-05 2:29 ` Alexey Kardashevskiy
2024-09-05 17:40 ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 07/21] pci/tdisp: Introduce tsm module Alexey Kardashevskiy
2024-08-27 12:32 ` Jason Gunthorpe
2024-08-28 3:00 ` Alexey Kardashevskiy
2024-08-28 23:42 ` Jason Gunthorpe
2024-08-29 0:00 ` Dan Williams
2024-08-29 0:09 ` Jason Gunthorpe
2024-08-29 0:20 ` Dan Williams
2024-08-29 12:03 ` Jason Gunthorpe
2024-08-29 4:57 ` Alexey Kardashevskiy
2024-08-29 12:07 ` Jason Gunthorpe
2024-09-02 0:52 ` Alexey Kardashevskiy
2024-08-28 15:04 ` Jonathan Cameron
2024-09-02 6:50 ` Aneesh Kumar K.V
2024-09-02 7:26 ` Alexey Kardashevskiy
2024-09-03 23:51 ` Dan Williams
2024-09-04 11:13 ` Alexey Kardashevskiy
2024-09-04 23:28 ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 08/21] crypto/ccp: Implement SEV TIO firmware interface Alexey Kardashevskiy
2024-08-28 15:39 ` Jonathan Cameron
2024-08-23 13:21 ` [RFC PATCH 09/21] kvm: Export kvm_vm_set_mem_attributes Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 10/21] vfio: Export helper to get vfio_device from fd Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 11/21] KVM: SEV: Add TIO VMGEXIT and bind TDI Alexey Kardashevskiy
2024-08-29 10:08 ` Xu Yilun
2024-08-30 4:00 ` Alexey Kardashevskiy
2024-08-30 7:02 ` Xu Yilun
2024-09-02 1:24 ` Alexey Kardashevskiy
2024-09-13 13:50 ` Zhi Wang
2024-09-13 22:08 ` Dan Williams
2024-09-14 2:47 ` Tian, Kevin
2024-09-14 5:19 ` Zhi Wang
2024-09-18 10:45 ` Xu Yilun
2024-09-20 3:41 ` Tian, Kevin
2024-08-23 13:21 ` [RFC PATCH 12/21] KVM: IOMMUFD: MEMFD: Map private pages Alexey Kardashevskiy
2024-08-26 8:39 ` Tian, Kevin
2024-08-26 12:30 ` Jason Gunthorpe
2024-08-29 9:34 ` Xu Yilun
2024-08-29 12:15 ` Jason Gunthorpe
2024-08-30 3:47 ` Alexey Kardashevskiy
2024-08-30 12:35 ` Jason Gunthorpe
2024-09-02 1:09 ` Alexey Kardashevskiy
2024-09-02 23:52 ` Jason Gunthorpe
2024-09-03 0:03 ` Alexey Kardashevskiy
2024-09-03 0:37 ` Jason Gunthorpe
2024-08-30 5:20 ` Xu Yilun
2024-08-30 12:36 ` Jason Gunthorpe
2024-09-03 20:34 ` Dan Williams
2024-09-04 0:02 ` Jason Gunthorpe
2024-09-04 0:59 ` Dan Williams
2024-09-05 8:29 ` Tian, Kevin
2024-09-05 12:02 ` Jason Gunthorpe
2024-09-05 12:07 ` Tian, Kevin
2024-09-05 12:00 ` Jason Gunthorpe
2024-09-05 12:17 ` Tian, Kevin
2024-09-05 12:23 ` Jason Gunthorpe
2024-09-05 20:53 ` Dan Williams
2024-09-05 23:06 ` Jason Gunthorpe
2024-09-06 2:46 ` Tian, Kevin
2024-09-06 13:54 ` Jason Gunthorpe
2024-09-06 2:41 ` Tian, Kevin
2024-08-27 2:27 ` Alexey Kardashevskiy
2024-08-27 2:31 ` Tian, Kevin
2024-09-15 21:07 ` Jason Gunthorpe
2024-09-20 21:10 ` Vishal Annapurve
2024-09-23 5:35 ` Tian, Kevin
2024-09-23 6:34 ` Vishal Annapurve
2024-09-23 8:24 ` Tian, Kevin
2024-09-23 16:02 ` Jason Gunthorpe [this message]
2024-09-23 23:52 ` Tian, Kevin
2024-09-24 12:07 ` Jason Gunthorpe
2024-09-25 8:44 ` Vishal Annapurve
2024-09-25 15:41 ` Jason Gunthorpe
2024-09-23 20:53 ` Vishal Annapurve
2024-09-23 23:55 ` Tian, Kevin
2024-08-23 13:21 ` [RFC PATCH 13/21] KVM: X86: Handle private MMIO as shared Alexey Kardashevskiy
2024-08-30 16:57 ` Xu Yilun
2024-09-02 2:22 ` Alexey Kardashevskiy
2024-09-03 5:13 ` Xu Yilun
2024-09-06 3:31 ` Alexey Kardashevskiy
2024-09-09 10:07 ` Xu Yilun
2024-09-10 1:28 ` Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 14/21] RFC: iommu/iommufd/amd: Add IOMMU_HWPT_TRUSTED flag, tweak DTE's DomainID, IOTLB Alexey Kardashevskiy
2024-08-27 12:17 ` Jason Gunthorpe
2024-08-23 13:21 ` [RFC PATCH 15/21] coco/sev-guest: Allow multiple source files in the driver Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 16/21] coco/sev-guest: Make SEV-to-PSP request helpers public Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 17/21] coco/sev-guest: Implement the guest side of things Alexey Kardashevskiy
2024-08-28 15:54 ` Jonathan Cameron
2024-09-14 7:19 ` Zhi Wang
2024-09-16 1:18 ` Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 18/21] RFC: pci: Add BUS_NOTIFY_PCI_BUS_MASTER event Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 19/21] sev-guest: Stop changing encrypted page state for TDISP devices Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 20/21] pci: Allow encrypted MMIO mapping via sysfs Alexey Kardashevskiy
2024-08-23 22:37 ` Bjorn Helgaas
2024-09-02 8:22 ` Alexey Kardashevskiy
2024-09-03 21:46 ` Bjorn Helgaas
2024-08-23 13:21 ` [RFC PATCH 21/21] pci: Define pci_iomap_range_encrypted Alexey Kardashevskiy
2024-08-28 20:43 ` [RFC PATCH 00/21] Secure VFIO, TDISP, SEV TIO Dan Williams
2024-08-29 14:13 ` Alexey Kardashevskiy
2024-08-29 23:41 ` Dan Williams
2024-08-30 4:38 ` Alexey Kardashevskiy
2024-08-30 21:57 ` Dan Williams
2024-09-05 8:21 ` Tian, Kevin
2024-09-03 15:56 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240923160239.GD9417@nvidia.com \
--to=jgg@nvidia.com \
--cc=agraf@suse.de \
--cc=aik@amd.com \
--cc=alex.williamson@redhat.com \
--cc=dan.j.williams@intel.com \
--cc=david.kaplan@amd.com \
--cc=dhaval.giani@amd.com \
--cc=iommu@lists.linux.dev \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-coco@lists.linux.dev \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=michael.day@amd.com \
--cc=michael.roth@amd.com \
--cc=nikunj@amd.com \
--cc=pratikrajesh.sampat@amd.com \
--cc=santosh.shukla@amd.com \
--cc=suravee.suthikulpanit@amd.com \
--cc=thomas.lendacky@amd.com \
--cc=vannapurve@google.com \
--cc=vasant.hegde@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).