From: Jason Gunthorpe <jgg@nvidia.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>,
iommu@lists.linux.dev, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, alex.williamson@redhat.com,
pbonzini@redhat.com, joro@8bytes.org, will@kernel.org,
robin.murphy@arm.com, kevin.tian@intel.com,
baolu.lu@linux.intel.com, dwmw2@infradead.org,
yi.l.liu@intel.com
Subject: Re: [RFC PATCH 00/42] Sharing KVM TDP to IOMMU
Date: Mon, 4 Dec 2023 13:30:28 -0400 [thread overview]
Message-ID: <20231204173028.GJ1493156@nvidia.com> (raw)
In-Reply-To: <ZW4Fx2U80L1PJKlh@google.com>
On Mon, Dec 04, 2023 at 09:00:55AM -0800, Sean Christopherson wrote:
> There are more approaches beyond having IOMMUFD and KVM be
> completely separate entities. E.g. extract the bulk of KVM's "TDP
> MMU" implementation to common code so that IOMMUFD doesn't need to
> reinvent the wheel.
We've pretty much done this already, it is called "hmm" and it is what
the IO world uses. Merging/splitting huge page is just something that
needs some coding in the page table code, that people want for other
reasons anyhow.
> - Subjects IOMMUFD to all of KVM's historical baggage, e.g. the memslot deletion
> mess, the truly nasty MTRR emulation (which I still hope to delete), the NX
> hugepage mitigation, etc.
Does it? I think that just remains isolated in kvm. The output from
KVM is only a radix table top pointer, it is up to KVM how to manage
it still.
> I'm not convinced that memory consumption is all that interesting. If a VM is
> mapping the majority of memory into a device, then odds are good that the guest
> is backed with at least 2MiB page, if not 1GiB pages, at which point the memory
> overhead for pages tables is quite small, especially relative to the total amount
> of memory overheads for such systems.
AFAIK the main argument is performance. It is similar to why we want
to do IOMMU SVA with MM page table sharing.
If IOMMU mirrors/shadows/copies a page table using something like HMM
techniques then the invalidations will mark ranges of IOVA as
non-present and faults will occur to trigger hmm_range_fault to do the
shadowing.
This means that pretty much all IO will always encounter a non-present
fault, certainly at the start and maybe worse while ongoing.
On the other hand, if we share the exact page table then natural CPU
touches will usually make the page present before an IO happens in
almost all cases and we don't have to take the horribly expensive IO
page fault at all.
We were not able to make bi-dir notifiers with with the CPU mm, I'm
not sure that is "relatively easy" :(
Jason
next prev parent reply other threads:[~2023-12-04 17:30 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-02 9:12 [RFC PATCH 00/42] Sharing KVM TDP to IOMMU Yan Zhao
2023-12-02 9:13 ` [RFC PATCH 01/42] KVM: Public header for KVM to export TDP Yan Zhao
2023-12-02 9:15 ` [RFC PATCH 02/42] KVM: x86: Arch header for kvm to export TDP for Intel Yan Zhao
2023-12-02 9:15 ` [RFC PATCH 03/42] KVM: Introduce VM ioctl KVM_CREATE_TDP_FD Yan Zhao
2023-12-02 9:16 ` [RFC PATCH 04/42] KVM: Skeleton of KVM TDP FD object Yan Zhao
2023-12-02 9:16 ` [RFC PATCH 05/42] KVM: Embed "arch" object and call arch init/destroy in TDP FD Yan Zhao
2023-12-02 9:17 ` [RFC PATCH 06/42] KVM: Register/Unregister importers to KVM exported TDP Yan Zhao
2023-12-02 9:18 ` [RFC PATCH 07/42] KVM: Forward page fault requests to arch specific code for " Yan Zhao
2023-12-02 9:18 ` [RFC PATCH 08/42] KVM: Add a helper to notify importers that KVM exported TDP is flushed Yan Zhao
2023-12-02 9:19 ` [RFC PATCH 09/42] iommu: Add IOMMU_DOMAIN_KVM Yan Zhao
2023-12-02 9:20 ` [RFC PATCH 10/42] iommu: Add new iommu op to create domains managed by KVM Yan Zhao
2023-12-04 15:09 ` Jason Gunthorpe
2023-12-02 9:20 ` [RFC PATCH 11/42] iommu: Add new domain op cache_invalidate_kvm Yan Zhao
2023-12-04 15:09 ` Jason Gunthorpe
2023-12-05 6:40 ` Yan Zhao
2023-12-05 14:52 ` Jason Gunthorpe
2023-12-06 1:00 ` Yan Zhao
2023-12-02 9:21 ` [RFC PATCH 12/42] iommufd: Introduce allocation data info and flag for KVM managed HWPT Yan Zhao
2023-12-04 18:29 ` Jason Gunthorpe
2023-12-05 7:08 ` Yan Zhao
2023-12-05 14:53 ` Jason Gunthorpe
2023-12-06 0:58 ` Yan Zhao
2023-12-02 9:21 ` [RFC PATCH 13/42] iommufd: Add a KVM HW pagetable object Yan Zhao
2023-12-02 9:22 ` [RFC PATCH 14/42] iommufd: Enable KVM HW page table object to be proxy between KVM and IOMMU Yan Zhao
2023-12-04 18:34 ` Jason Gunthorpe
2023-12-05 7:09 ` Yan Zhao
2023-12-02 9:22 ` [RFC PATCH 15/42] iommufd: Add iopf handler to KVM hw pagetable Yan Zhao
2023-12-02 9:23 ` [RFC PATCH 16/42] iommufd: Enable device feature IOPF during device attachment to KVM HWPT Yan Zhao
2023-12-04 18:36 ` Jason Gunthorpe
2023-12-05 7:14 ` Yan Zhao
2023-12-05 14:53 ` Jason Gunthorpe
2023-12-06 0:55 ` Yan Zhao
2023-12-02 9:23 ` [RFC PATCH 17/42] iommu/vt-d: Make some macros and helpers to be extern Yan Zhao
2023-12-02 9:24 ` [RFC PATCH 18/42] iommu/vt-d: Support of IOMMU_DOMAIN_KVM domain in Intel IOMMU Yan Zhao
2023-12-02 9:24 ` [RFC PATCH 19/42] iommu/vt-d: Set bit PGSNP in PASIDTE if domain cache coherency is enforced Yan Zhao
2023-12-02 9:25 ` [RFC PATCH 20/42] iommu/vt-d: Support attach devices to IOMMU_DOMAIN_KVM domain Yan Zhao
2023-12-02 9:26 ` [RFC PATCH 21/42] iommu/vt-d: Check reserved bits for " Yan Zhao
2023-12-02 9:26 ` [RFC PATCH 22/42] iommu/vt-d: Support cache invalidate of " Yan Zhao
2023-12-02 9:26 ` [RFC PATCH 23/42] iommu/vt-d: Allow pasid 0 in IOPF Yan Zhao
2023-12-02 9:27 ` [RFC PATCH 24/42] KVM: x86/mmu: Move bit SPTE_MMU_PRESENT from bit 11 to bit 59 Yan Zhao
2023-12-02 9:27 ` [RFC PATCH 25/42] KVM: x86/mmu: Abstract "struct kvm_mmu_common" from "struct kvm_mmu" Yan Zhao
2023-12-02 9:28 ` [RFC PATCH 26/42] KVM: x86/mmu: introduce new op get_default_mt_mask to kvm_x86_ops Yan Zhao
2023-12-02 9:28 ` [RFC PATCH 27/42] KVM: x86/mmu: change param "vcpu" to "kvm" in kvm_mmu_hugepage_adjust() Yan Zhao
2023-12-02 9:29 ` [RFC PATCH 28/42] KVM: x86/mmu: change "vcpu" to "kvm" in page_fault_handle_page_track() Yan Zhao
2023-12-02 9:29 ` [RFC PATCH 29/42] KVM: x86/mmu: remove param "vcpu" from kvm_mmu_get_tdp_level() Yan Zhao
2023-12-02 9:30 ` [RFC PATCH 30/42] KVM: x86/mmu: remove param "vcpu" from kvm_calc_tdp_mmu_root_page_role() Yan Zhao
2023-12-02 9:30 ` [RFC PATCH 31/42] KVM: x86/mmu: add extra param "kvm" to kvm_faultin_pfn() Yan Zhao
2023-12-02 9:31 ` [RFC PATCH 32/42] KVM: x86/mmu: add extra param "kvm" to make_mmio_spte() Yan Zhao
2023-12-02 9:31 ` [RFC PATCH 33/42] KVM: x86/mmu: add extra param "kvm" to make_spte() Yan Zhao
2023-12-02 9:32 ` [RFC PATCH 34/42] KVM: x86/mmu: add extra param "kvm" to tdp_mmu_map_handle_target_level() Yan Zhao
2023-12-02 9:32 ` [RFC PATCH 35/42] KVM: x86/mmu: Get/Put TDP root page to be exported Yan Zhao
2023-12-02 9:33 ` [RFC PATCH 36/42] KVM: x86/mmu: Keep exported TDP root valid Yan Zhao
2023-12-02 9:33 ` [RFC PATCH 37/42] KVM: x86: Implement KVM exported TDP fault handler on x86 Yan Zhao
2023-12-02 9:35 ` [RFC PATCH 38/42] KVM: x86: "compose" and "get" interface for meta data of exported TDP Yan Zhao
2023-12-02 9:35 ` [RFC PATCH 39/42] KVM: VMX: add config KVM_INTEL_EXPORTED_EPT Yan Zhao
2023-12-02 9:36 ` [RFC PATCH 40/42] KVM: VMX: Compose VMX specific meta data for KVM exported TDP Yan Zhao
2023-12-02 9:36 ` [RFC PATCH 41/42] KVM: VMX: Implement ops .flush_remote_tlbs* in VMX when EPT is on Yan Zhao
2023-12-02 9:37 ` [RFC PATCH 42/42] KVM: VMX: Notify importers of exported TDP to flush TLBs on KVM flushes EPT Yan Zhao
2023-12-04 15:08 ` [RFC PATCH 00/42] Sharing KVM TDP to IOMMU Jason Gunthorpe
2023-12-04 16:38 ` Sean Christopherson
2023-12-05 1:31 ` Yan Zhao
2023-12-05 6:45 ` Tian, Kevin
2023-12-05 1:52 ` Yan Zhao
2023-12-05 6:30 ` Tian, Kevin
2023-12-04 17:00 ` Sean Christopherson
2023-12-04 17:30 ` Jason Gunthorpe [this message]
2023-12-04 19:22 ` Sean Christopherson
2023-12-04 19:50 ` Jason Gunthorpe
2023-12-04 20:11 ` Sean Christopherson
2023-12-04 23:49 ` Jason Gunthorpe
2023-12-05 7:17 ` Tian, Kevin
2023-12-05 5:53 ` Yan Zhao
2023-12-05 3:51 ` Yan Zhao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231204173028.GJ1493156@nvidia.com \
--to=jgg@nvidia.com \
--cc=alex.williamson@redhat.com \
--cc=baolu.lu@linux.intel.com \
--cc=dwmw2@infradead.org \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=robin.murphy@arm.com \
--cc=seanjc@google.com \
--cc=will@kernel.org \
--cc=yan.y.zhao@intel.com \
--cc=yi.l.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox