public inbox for linux-kselftest@vger.kernel.org
From: "Suthikulpanit, Suravee" <suravee.suthikulpanit@amd.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Yi Liu <yi.l.liu@intel.com>,
	"Giani, Dhaval" <Dhaval.Giani@amd.com>,
	Vasant Hegde <vasant.hegde@amd.com>
Cc: joro@8bytes.org, alex.williamson@redhat.com,
	kevin.tian@intel.com, robin.murphy@arm.com,
	baolu.lu@linux.intel.com, cohuck@redhat.com,
	eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org,
	mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
	yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, zhenzhong.duan@intel.com,
	joao.m.martins@oracle.com, xin.zeng@intel.com,
	yan.y.zhao@intel.com
Subject: Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)
Date: Tue, 12 Dec 2023 00:35:26 +0700
Message-ID: <391ab316-79b1-4535-a45b-4c01bfb80de6@amd.com>
In-Reply-To: <20231209014726.GA2945299@nvidia.com>



On 12/9/2023 8:47 AM, Jason Gunthorpe wrote:
> On Fri, Nov 17, 2023 at 05:07:11AM -0800, Yi Liu wrote:
> 
>> Take Intel VT-d as an example, the stage-1 translation table is I/O page
>> table. As the below diagram shows, guest I/O page table pointer in GPA
>> (guest physical address) is passed to host and be used to perform the stage-1
>> address translation. Along with it, modifications to present mappings in the
>> guest I/O page table should be followed with an IOTLB invalidation.
> 
> I've been looking at what the three HW's need for invalidation, it is
> a bit messy.. Here is my thinking. Please let me know if I got it right
> 
> What is the starting point of the guest memory walks:
>   Intel: Single Scalable Mode PASID table entry indexed by a RID & PASID
>   AMD: GCR3 table (a table of PASIDs) indexed by RID

The GCR3 table is indexed by PASID.
The Device Table (DTE) is indexed by DeviceID (RID).
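
To make the two-level indexing concrete, here is a minimal user-space
sketch, not driver code. The struct layouts, the table sizes, and
lookup_gcr3() are all hypothetical simplifications; the real DTE and
GCR3 table formats are defined by the AMD IOMMU specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes only; real tables are far larger. */
#define NR_DEVICE_IDS 4
#define NR_PASIDS     4

/* Hypothetical, simplified GCR3 table: one guest CR3 value per PASID. */
struct gcr3_table {
	uint64_t gcr3[NR_PASIDS];
};

/* Hypothetical, simplified Device Table entry: points at a GCR3 table. */
struct dte {
	struct gcr3_table *gcr3_tbl; /* NULL if no guest table is set up */
};

/* Device Table: indexed by DeviceID (RID). */
static struct dte device_table[NR_DEVICE_IDS];

/*
 * Find the start of the guest page-table walk for (devid, pasid).
 * Step 1 indexes the Device Table by DeviceID, step 2 indexes the
 * GCR3 table by PASID. Returns 0 on success, -1 on an out-of-range
 * index or a missing GCR3 table.
 */
int lookup_gcr3(uint16_t devid, uint32_t pasid, uint64_t *out)
{
	const struct dte *dte;

	assert(out != NULL);

	if (devid >= NR_DEVICE_IDS || pasid >= NR_PASIDS)
		return -1;

	dte = &device_table[devid];
	if (!dte->gcr3_tbl)
		return -1;

	*out = dte->gcr3_tbl->gcr3[pasid];
	return 0;
}
```

The point of the sketch is only that the RID selects the DTE, and the
PASID is applied one level down, inside the GCR3 table that the DTE
refers to.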

> ...
> Will ATC be forwarded or synthesized:
>   Intel: The (vDomain-ID,PASID) is a unique nesting domain so
>          the hypervisor knows exactly which RIDs this nesting domain is
> 	linked to and can generate an ATC invalidation. Plan is to
> 	supress/discard the ATC invalidations from the VM and generate
> 	them in the hypervisor.
>   AMD: (vDomain-ID,PASID) is ambiguous, it can refer to multiple GCR3
>        tables. We know which maximal set of RIDs it represents, but not
>        the actual set. I expect AMD will forward the ATC invalidation
>        to avoid over invalidation.

Not sure I understand your description here.

For the AMD IOMMU INVALIDATE_IOMMU_PAGES command (i.e. invalidate the
IOMMU TLB), the hypervisor needs to map gDomainId->hDomainId and issue
the command on behalf of the VM along with the PASID and GVA (or GVA
range) provided by the guest.

For the AMD IOMMU INVALIDATE_IOTLB_PAGES command (i.e. invalidate the
ATC on the device), the hypervisor needs to map gDeviceId->hDeviceId and
issue the command on behalf of the VM along with the PASID and GVA (or
GVA range) provided by the guest.
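
As a rough illustration of the two translations described above, the
following user-space sketch rewrites a guest command into a host one.
Everything in it is an assumption made for the example: struct
vm_id_maps, struct inv_cmd, the fixed map size, and
translate_inv_cmd() are hypothetical and do not reflect the hardware
command layout or any existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_GUEST_IDS 8
#define ID_INVALID   UINT16_MAX

/* Hypothetical per-VM ID maps maintained by the hypervisor. */
struct vm_id_maps {
	uint16_t domid_g2h[NR_GUEST_IDS]; /* gDomainId -> hDomainId */
	uint16_t devid_g2h[NR_GUEST_IDS]; /* gDeviceId -> hDeviceId */
};

enum inv_type {
	INV_IOMMU_PAGES, /* IOMMU TLB, tagged by DomainID */
	INV_IOTLB_PAGES, /* device ATC, tagged by DeviceID */
};

/* Simplified command; not the hardware command format. */
struct inv_cmd {
	enum inv_type type;
	uint16_t id;     /* DomainID or DeviceID, depending on type */
	uint32_t pasid;
	uint64_t gva;
	uint64_t size;
};

/*
 * Rewrite a guest invalidation command into a host one. Only the ID
 * is translated; PASID and the GVA range pass through unchanged.
 * Returns 0 on success, -1 if the guest ID has no host mapping.
 */
int translate_inv_cmd(const struct vm_id_maps *maps,
		      const struct inv_cmd *guest, struct inv_cmd *host)
{
	const uint16_t *map;

	assert(maps != NULL && guest != NULL && host != NULL);

	if (guest->id >= NR_GUEST_IDS)
		return -1;

	switch (guest->type) {
	case INV_IOMMU_PAGES:
		map = maps->domid_g2h;
		break;
	case INV_IOTLB_PAGES:
		map = maps->devid_g2h;
		break;
	default:
		return -1;
	}

	if (map[guest->id] == ID_INVALID)
		return -1;

	*host = *guest;
	host->id = map[guest->id];
	return 0;
}
```

In both cases the hypervisor only swaps the guest ID for the host ID;
the PASID and GVA (or GVA range) come straight from the guest command.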

>   ARM: ASID is ambiguous. We have no idea which Nesting Domain/CD table
>        the ASID is contained in. ARM must forward the ATC invalidation
>        from the guest.
> 
> What iommufd object should receive the IOTLB invalidation command list:
>   Intel: The Nesting domain. The command list has to be broken up per
>          (vDomain-ID,PASID) and that batch delivered to the single
> 	nesting domain. Kernel ignores vDomain-ID/PASID and just
> 	invalidates whatever the nesting domain is actually attached to
>   AMD: Any Nesting Domain in the vDomain-ID group. The command list has
>        to be broken up per (vDomain-ID). Kernel replaces
>        vDomain-ID with pDomain-ID from the nesting domain and executes
>        the invalidation.
>   ARM: The Nesting Parent domain. Kernel forces the VMID from the
>        Nesting Parent and executes the invalidation.
> 
> In all cases the VM issues an ATC invalidation with (vRID, PASID) as
> the tag. The VMM must translate vRID -> dev_id -> pRID
> 
> For a pure SW flow the vRID can be mapped to the dev_id and the ATC
> invalidation delivered to the device object (eg IOMMUFD_DEV_INVALIDATE)
> 
> Finally, we have the HW driven invalidation DMA queues that can be
> directly assigned to the guest. AMD and SMMUv3+vCMDQ support this. In
> this case the HW is directly processing invalidation commands without
> a hypervisor trap.
> 
> To make this work the iommu needs to be programmed with:
>   AMD: A vDomain-ID -> pDomain-ID table
>        A vRID -> pRID table
>        This is all bound to some "virtual function"

By "virtual function", I assume you are referring to the AMD vIOMMU 
instance in the guest?

>   ARM: A vRID -> pRID table
>        The vCMDQ is bound to a VM_ID, so to the Nesting Parent
> 
> For AMD, as above, I suggest the vDomain-ID be passed when creating
> the nesting domain

Sure, we can do this part.

> The AMD "virtual function".. It is probably best to create a new iommufd
> object for this and it can be passed in to a few places

Something like IOMMUFD_OBJ_VIOMMU? Then the operations would include
something like:
   * Init
   * Destroy
   * ...

> The vRID->pRID table should be some mostly common
> IOMMUFD_DEV_ASSIGN_VIRTUAL_ID. AMD will need to pass in the virtual
> function ID and ARM will need to pass in the Nesting Parent ID.

Ok.

> ...
> Thus next steps:
>   - Respin this and lets focus on Intel only (this will be tough for
>     the holidays, but if it is available I will try)
>   - Get an ARM patch that just does IOTLB invalidation and add it to my
>     part 3
>   - Start working on IOMMUFD_DEV_INVALIDATE along with an ARM
>     implementation of it
>   - Reorganize the AMD RFC broadly along these lines and lets see it
>     freshened up in the next months as well. I would like to see the
>     AMD support structured to implement the SW paths in first steps and
>     later add in the "virtual function" acceleration stuff. The latter
>     is going to be complex.

I am working on refining part 1 to add HW info reporting and nested
translation (minus the invalidation stuff). I should be sending it out soon.

Suravee
