From: Jason Gunthorpe <jgg@nvidia.com>
To: "Suthikulpanit, Suravee" <suravee.suthikulpanit@amd.com>
Cc: Yi Liu <yi.l.liu@intel.com>, "Tian, Kevin" <kevin.tian@intel.com>,
	"Giani, Dhaval" <Dhaval.Giani@amd.com>,
	Vasant Hegde <vasant.hegde@amd.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
	"yi.y.sun@linux.intel.com" <yi.y.sun@linux.intel.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"shameerali.kolothum.thodi@huawei.com" 
	<shameerali.kolothum.thodi@huawei.com>,
	"lulu@redhat.com" <lulu@redhat.com>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
	"joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
	"Zeng, Xin" <xin.zeng@intel.com>,
	"Zhao, Yan Y" <yan.y.zhao@intel.com>
Subject: Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)
Date: Mon, 11 Dec 2023 12:06:32 -0400
Message-ID: <20231211160632.GI2944114@nvidia.com>
In-Reply-To: <509489ce-0169-4021-ad56-a31544752aa4@amd.com>

On Mon, Dec 11, 2023 at 10:34:09PM +0700, Suthikulpanit, Suravee wrote:

> Currently, the AMD IOMMU driver allocates a DomainId per IOMMU group.
> One issue with this is when we have nested translation where we could end up
> with multiple devices (RIDs) sharing the same PASID and the same hDomainID.

Which means you also create multiple GCR3 tables since those are
(soon) per-device and we end up with the situation I described for a
functional legitimate reason :( It is just wasting memory by
duplicating GCR3 tables.
 
> For example:
> 
>   - Host view
>     Device1 (RID 1) w/ hDomainId 1
>     Device2 (RID 2) w/ hDomainId 1

So.. Groups are another ugly mess that we may have to do something
more robust about.

The group infrastructure assumes that all devices in the group have
the same translation. This is not how the VM communicates: each member
of the group gets to have its own DTE, and there are legitimate cases
where the DTEs will be different (even if just temporarily).

How to mesh this is not yet solved (most likely we need to allow group
members to have temporarily different translations). But in the long
run the group should definitely not be providing the cache tag; the
driver has to be smarter than this.

I think we talked about this before.. For the AMD driver the v1 page
table should store the domain id in the iommu_domain and that value
should be used everywhere.

For modes with a GCR3 table the best you can do is to de-duplicate the
GCR3 tables and assign identical GCR3 tables to identical domain ids.
I.e., all devices in a group will eventually share GCR3 tables so they
can converge on the same domain id.
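
A minimal sketch of that de-duplication, written as standalone userspace
C rather than driver code. Everything named here (struct gcr3_tag,
domid_get(), domid_put(), the fixed-size table) is hypothetical and only
illustrates the idea of keying the domain id on the GCR3 table instead
of on the group; it is not how the AMD driver does or will do it.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TAGS 16

/* Hypothetical bookkeeping: one entry per distinct GCR3 table. */
struct gcr3_tag {
	uint64_t gcr3_root;	/* address of the GCR3 table root */
	uint16_t domain_id;	/* cache tag shared by all users of it */
	unsigned int users;	/* number of devices using this table */
};

static struct gcr3_tag tags[MAX_TAGS];
static uint16_t next_domain_id = 1;	/* 0 is used as "no id" here */

/*
 * Return the domain id for a GCR3 table. Devices that present an
 * identical table get the identical id; a table seen for the first
 * time gets a fresh one. Returns 0 if the table is full.
 */
static uint16_t domid_get(uint64_t gcr3_root)
{
	struct gcr3_tag *free_slot = NULL;
	size_t i;

	for (i = 0; i < MAX_TAGS; i++) {
		if (tags[i].users && tags[i].gcr3_root == gcr3_root) {
			tags[i].users++;
			return tags[i].domain_id;
		}
		if (!tags[i].users && !free_slot)
			free_slot = &tags[i];
	}
	if (!free_slot)
		return 0;

	free_slot->gcr3_root = gcr3_root;
	free_slot->domain_id = next_domain_id++;
	free_slot->users = 1;
	return free_slot->domain_id;
}

/* Drop one user of a domain id; the slot is reusable at zero users. */
static void domid_put(uint16_t domain_id)
{
	size_t i;

	for (i = 0; i < MAX_TAGS; i++) {
		if (tags[i].users && tags[i].domain_id == domain_id) {
			tags[i].users--;
			return;
		}
	}
}
```

With this shape, two devices in a group that end up pointing at the same
GCR3 table call domid_get() with the same root and converge on one
domain id, while a device that temporarily has a different table gets
its own.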

>   - Guest view
>     Pass-through Device1 (vRID 3) w/ vDomainID A + PASID 0
>     Pass-through Device2 (vRID 4) w/ vDomainID B + PASID 0
> 
> We should be able to work around this by changing the way we assign hDomainId
> to be per-device for VFIO pass-through devices although sharing the same v1
> (stage-2) page table. This would look like:

As I said, this doesn't quite work since the VM could do other
things. The kernel must be aware of the vDomainID and must select an
appropriate hDomainID with that knowledge in mind, otherwise
multi-device groups in guests are fully broken.

>   - Guest view
>     Pass-through Device1 (vRID 3) w/ vDomainID A + PASID 0
>     Pass-through Device2 (vRID 4) w/ vDomainID B + PASID 0
> 
> This should avoid the IOMMU TLB conflict. However, the invalidation would
> need to be done for both DomainId 1 and 2 when updating the v1 (stage-2)
> page table.

Which is the key problem: if the VM thinks it has only one vDomainID
the VMM can't split that into two hDomainIDs and expect that the viommu
acceleration will work - so we shouldn't try to make it work in SW
either, IMHO.

Jason
