From: "Suthikulpanit, Suravee" <suravee.suthikulpanit@amd.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Yi Liu <yi.l.liu@intel.com>,
"Giani, Dhaval" <Dhaval.Giani@amd.com>,
Vasant Hegde <vasant.hegde@amd.com>
Cc: joro@8bytes.org, alex.williamson@redhat.com,
kevin.tian@intel.com, robin.murphy@arm.com,
baolu.lu@linux.intel.com, cohuck@redhat.com,
eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org,
mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org, zhenzhong.duan@intel.com,
joao.m.martins@oracle.com, xin.zeng@intel.com,
yan.y.zhao@intel.com
Subject: Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)
Date: Tue, 12 Dec 2023 00:35:26 +0700
Message-ID: <391ab316-79b1-4535-a45b-4c01bfb80de6@amd.com>
In-Reply-To: <20231209014726.GA2945299@nvidia.com>
On 12/9/2023 8:47 AM, Jason Gunthorpe wrote:
> On Fri, Nov 17, 2023 at 05:07:11AM -0800, Yi Liu wrote:
>
>> Take Intel VT-d as an example, the stage-1 translation table is I/O page
>> table. As the below diagram shows, guest I/O page table pointer in GPA
>> (guest physical address) is passed to host and be used to perform the stage-1
>> address translation. Along with it, modifications to present mappings in the
>> guest I/O page table should be followed with an IOTLB invalidation.
>
> I've been looking at what the three HW's need for invalidation, it is
> a bit messy.. Here is my thinking. Please let me know if I got it right
>
> What is the starting point of the guest memory walks:
> Intel: Single Scalable Mode PASID table entry indexed by a RID & PASID
> AMD: GCR3 table (a table of PASIDs) indexed by RID
The GCR3 table is indexed by PASID.
The Device Table (DTE) is indexed by DeviceID (RID).
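To make the two-step lookup concrete, here is a minimal sketch of it as a toy model. The struct layouts, the flat single-level GCR3 table, and every `toy_` name are assumptions made for illustration only; they are not the real AMD IOMMU table formats or any kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model, NOT the real AMD IOMMU table layouts. */
struct toy_gcr3_entry {
	int valid;
	uint64_t guest_cr3;	/* root of the guest (stage-1) page table */
};

struct toy_dte {
	int valid;
	const struct toy_gcr3_entry *gcr3_table; /* per-device table of PASIDs */
	size_t nr_pasids;
};

/*
 * Step 1: the DeviceID (RID) indexes the device table and selects a DTE.
 * Step 2: the PASID indexes the GCR3 table that the DTE points to.
 * Returns NULL when either step finds no valid entry.
 */
static const struct toy_gcr3_entry *
toy_walk_start(const struct toy_dte *dev_table, size_t nr_devs,
	       uint16_t devid, uint32_t pasid)
{
	const struct toy_dte *dte;

	assert(dev_table != NULL);

	if (devid >= nr_devs || !dev_table[devid].valid)
		return NULL;
	dte = &dev_table[devid];

	if (dte->gcr3_table == NULL || pasid >= dte->nr_pasids ||
	    !dte->gcr3_table[pasid].valid)
		return NULL;

	return &dte->gcr3_table[pasid];
}
```

In this model the RID never indexes the GCR3 table directly; it only selects which GCR3 table the PASID is looked up in.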
> ...
> Will ATC be forwarded or synthesized:
> Intel: The (vDomain-ID,PASID) is a unique nesting domain so
> the hypervisor knows exactly which RIDs this nesting domain is
> linked to and can generate an ATC invalidation. Plan is to
> suppress/discard the ATC invalidations from the VM and generate
> them in the hypervisor.
> AMD: (vDomain-ID,PASID) is ambiguous, it can refer to multiple GCR3
> tables. We know which maximal set of RIDs it represents, but not
> the actual set. I expect AMD will forward the ATC invalidation
> to avoid over invalidation.
Not sure I understand your description here.
For the AMD IOMMU INVALIDATE_IOMMU_PAGES (i.e. invalidate the IOMMU TLB),
the hypervisor needs to map gDomainId->hDomainId and issue the command
on behalf of the VM along with the PASID and GVA (or GVA range) provided
by the guest.
For the AMD IOMMU INVALIDATE_IOTLB_PAGES (i.e. invalidate the ATC on the
device), the hypervisor needs to map gDeviceId->hDeviceId and issue the
command on behalf of the VM along with the PASID and GVA (or GVA range)
provided by the guest.
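The two cases above can be sketched as a single translation step in the hypervisor. This is a rough illustration under stated assumptions: the command struct, the array-based ID maps, and all `toy_`/`TOY_` names are hypothetical and do not reflect the real command encoding or any existing kernel or iommufd interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command model; the real commands are encoded differently. */
enum toy_cmd_type {
	TOY_INVALIDATE_IOMMU_PAGES,	/* IOMMU TLB, tagged by DomainID */
	TOY_INVALIDATE_IOTLB_PAGES,	/* device ATC, tagged by DeviceID */
};

struct toy_inv_cmd {
	enum toy_cmd_type type;
	uint16_t id;		/* DomainID or DeviceID, depending on type */
	uint32_t pasid;
	uint64_t gva;
	uint64_t size;
};

/* host_id[guest_id] holds the host ID, or -1 when the guest ID is unmapped. */
struct toy_id_map {
	size_t nr;
	const int32_t *host_id;
};

struct toy_vm {
	struct toy_id_map domain_map;	/* gDomainId -> hDomainId */
	struct toy_id_map device_map;	/* gDeviceId -> hDeviceId */
};

static bool toy_map_id(const struct toy_id_map *map, uint16_t guest_id,
		       uint16_t *host_id)
{
	assert(map != NULL && host_id != NULL);

	if (guest_id >= map->nr || map->host_id[guest_id] < 0)
		return false;
	*host_id = (uint16_t)map->host_id[guest_id];
	return true;
}

/*
 * Build the host command for a guest command: only the ID is rewritten,
 * the PASID and GVA range provided by the guest pass through unchanged.
 * Returns false (and leaves *host untouched) when the guest ID is unmapped.
 */
static bool toy_translate_inv_cmd(const struct toy_vm *vm,
				  const struct toy_inv_cmd *guest,
				  struct toy_inv_cmd *host)
{
	const struct toy_id_map *map;
	uint16_t hid;

	switch (guest->type) {
	case TOY_INVALIDATE_IOMMU_PAGES:
		map = &vm->domain_map;
		break;
	case TOY_INVALIDATE_IOTLB_PAGES:
		map = &vm->device_map;
		break;
	default:
		return false;
	}

	if (!toy_map_id(map, guest->id, &hid))
		return false;

	*host = *guest;
	host->id = hid;
	return true;
}
```

The sketch rejects a command whose guest ID has no mapping rather than forwarding it; what a real implementation should do in that case is not covered here.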
> ARM: ASID is ambiguous. We have no idea which Nesting Domain/CD table
> the ASID is contained in. ARM must forward the ATC invalidation
> from the guest.
>
> What iommufd object should receive the IOTLB invalidation command list:
> Intel: The Nesting domain. The command list has to be broken up per
> (vDomain-ID,PASID) and that batch delivered to the single
> nesting domain. Kernel ignores vDomain-ID/PASID and just
> invalidates whatever the nesting domain is actually attached to
> AMD: Any Nesting Domain in the vDomain-ID group. The command list has
> to be broken up per (vDomain-ID). Kernel replaces
> vDomain-ID with pDomain-ID from the nesting domain and executes
> the invalidation.
> ARM: The Nesting Parent domain. Kernel forces the VMID from the
> Nesting Parent and executes the invalidation.
>
> In all cases the VM issues an ATC invalidation with (vRID, PASID) as
> the tag. The VMM must translate vRID -> dev_id -> pRID
>
> For a pure SW flow the vRID can be mapped to the dev_id and the ATC
> invalidation delivered to the device object (eg IOMMUFD_DEV_INVALIDATE)
>
> Finally, we have the HW driven invalidation DMA queues that can be
> directly assigned to the guest. AMD and SMMUv3+vCMDQ support this. In
> this case the HW is directly processing invalidation commands without
> a hypervisor trap.
>
> To make this work the iommu needs to be programmed with:
> AMD: A vDomain-ID -> pDomain-ID table
> A vRID -> pRID table
> This is all bound to some "virtual function"
By "virtual function", I assume you are referring to the AMD vIOMMU
instance in the guest?
> ARM: A vRID -> pRID table
> The vCMDQ is bound to a VM_ID, so to the Nesting Parent
>
> For AMD, as above, I suggest the vDomain-ID be passed when creating
> the nesting domain
Sure, we can do this part.
> The AMD "virtual function".. It is probably best to create a new iommufd
> object for this and it can be passed in to a few places
Something like IOMMUFD_OBJ_VIOMMU? Then the operations would include
something like:
* Init
* Destroy
* ...
> The vRID->pRID table should be some mostly common
> IOMMUFD_DEV_ASSIGN_VIRTUAL_ID. AMD will need to pass in the virtual
> function ID and ARM will need to pass in the Nesting Parent ID.
Ok.
> ...
> Thus next steps:
> - Respin this and lets focus on Intel only (this will be tough for
> the holidays, but if it is available I will try)
> - Get an ARM patch that just does IOTLB invalidation and add it to my
> part 3
> - Start working on IOMMUFD_DEV_INVALIDATE along with an ARM
> implementation of it
> - Reorganize the AMD RFC broadly along these lines and lets see it
> freshened up in the next months as well. I would like to see the
> AMD support structured to implement the SW paths in first steps and
> later add in the "virtual function" acceleration stuff. The latter
> is going to be complex.
I am working on refining part 1 to add HW info reporting and nested
translation (minus the invalidation stuff). I should be sending it out soon.
Suravee