From: Nicolin Chen <nicolinc@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Yi Liu <yi.l.liu@intel.com>,
"Giani, Dhaval" <Dhaval.Giani@amd.com>,
Vasant Hegde <vasant.hegde@amd.com>,
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
<joro@8bytes.org>, <alex.williamson@redhat.com>,
<kevin.tian@intel.com>, <robin.murphy@arm.com>,
<baolu.lu@linux.intel.com>, <cohuck@redhat.com>,
<eric.auger@redhat.com>, <kvm@vger.kernel.org>,
<mjrosato@linux.ibm.com>, <chao.p.peng@linux.intel.com>,
<yi.y.sun@linux.intel.com>, <peterx@redhat.com>,
<jasowang@redhat.com>, <shameerali.kolothum.thodi@huawei.com>,
<lulu@redhat.com>, <iommu@lists.linux.dev>,
<linux-kernel@vger.kernel.org>, <linux-kselftest@vger.kernel.org>,
<zhenzhong.duan@intel.com>, <joao.m.martins@oracle.com>,
<xin.zeng@intel.com>, <yan.y.zhao@intel.com>
Subject: Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)
Date: Wed, 13 Dec 2023 11:54:15 -0800 [thread overview]
Message-ID: <ZXoL56s1EIchNRzD@Asurada-Nvidia> (raw)
In-Reply-To: <20231213124055.GR3014157@nvidia.com>
On Wed, Dec 13, 2023 at 08:40:55AM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 12, 2023 at 12:05:41PM -0800, Nicolin Chen wrote:
> > > > // iommufd_private.h
> > > >
> > > > enum iommufd_object_type {
> > > > ...
> > > > + IOMMUFD_OBJ_VIOMMU,
> > > > ...
> > > > };
> > > >
> > > > +struct iommufd_viommu {
> > > > + struct iommufd_object obj;
> > > > + struct iommufd_hwpt_paging *hwpt;
> > > > + struct xarray devices;
> > > > +};
> > > >
> > > > struct iommufd_hwpt_paging hwpt {
> > > > ...
> > > > + struct list_head viommu_list;
> > > > ...
> > > > };
> > >
> > > I'd probably first try to go backwards and link the hwpt to the
> > > viommu.
> >
> > I think a VM should have only one hwpt_paging object while one
> > or more viommu objects, so we could do only viommu->hwpt_paging
> > and hwpt_paging->viommu_list. How to go backwards?
>
> That is probably how things would work but I don't know if it makes
> sense to enforce it in the kernel logic..
>
> Point the S2 to a list of viommu objects it is linked to
Hmm, isn't that hwpt_paging->viommu_list already?
> > > The second version maybe we have the xarray, or maybe we just push the
> > > xarray to the eventual viommu series.
> >
> > I think that I still don't get the purpose of the xarray here.
> > It was needed previously because a cache invalidate per hwpt
> > doesn't know which device. Now IOMMUFD_DEV_INVALIDATE knows.
> >
> > Maybe it's related to that narrative "logically we could have
> > multiple mappings per iommufd" that you mentioned above. Mind
> > elaborating a bit?
> >
> > In my mind, viommu is allocated by VMM per piommu, by detecting
> > the piommu_id via hw_info. In that case, viommu can only have
> > one unique device list. If IOMMUFD_DEV_INVALIDATE passes in the
> > dev_id, we don't really need a mapping of vRID-pRID in a multi-
> > viommu case either? In another word, VMM already has a mapping
> > from vRID to dev_id, so it could call the DEV_INVALIDATE ioctl
> > in the first place?
>
> The xarray exists to optimize the invalidation flow.
>
> For SW you can imagine issuing an invalidation against the viommu
> itself and *all* commands, be they ASID or ATC invalidations can be
> processed in one shot. The xarray allows converting the vSID to pSID
> to process ATC invalidations, and the viommu object forces a single
> VMID to handle the ATC invalidations. If we want to do this, I don't
> know.
I drafted some patches with IOMMUFD_DEV_INVALIDATE yesterday,
and realized the same problem that you pointed out here: how
VMM should handle a group of commands interlaced with ASID and
ATC commands. If VMM dispatches commands into two groups, the
executions of the commands will be in a different order than
what the guest kernel issued in. This might be bitter if there
is an error occurring in the middle of command executions, in
which case some later invalidations are done successfully but
the CONS index would have to stop at a command prior.
And even if there are only ATC invalidations in a guest queue,
there's no guarantee that all commands are for the same dev_id,
i.e. ATC invalidations themselves would be dispatched into more
groups and separate IOMMUFD_DEV_INVALIDATE calls.
With the xarray, perhaps we could provide a viommu_id in data
structure of the current iommu_hwpt_invalidate, i.e. reshaping
the existing invalidate uAPI per viommu, so it can be reused by
ATC invalidations too instead of adding IOMMUFD_DEV_INVALIDATE?
Then we wouldn't have the out-of-order execution problem above.
> For HW, the xarray holds the vSID to pSID mapping that must be
> programmed into the HW operating the dedicated invalidation queue.
Ah, right! VCMDQ at least needs that.
Thanks
Nicolin
next prev parent reply other threads:[~2023-12-13 19:54 UTC|newest]
Thread overview: 93+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20231217215720eucas1p2a590aca62ce8eb5ba81df6bc8b1a785d@eucas1p2.samsung.com>
2023-11-17 13:07 ` [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2) Yi Liu
2023-11-17 13:07 ` [PATCH v6 1/6] iommu: Add cache_invalidate_user op Yi Liu
2023-11-20 7:53 ` Tian, Kevin
2023-12-06 18:32 ` Jason Gunthorpe
2023-12-06 18:43 ` Nicolin Chen
2023-12-06 18:50 ` Jason Gunthorpe
2023-12-07 6:53 ` Yi Liu
2024-01-08 7:32 ` Binbin Wu
2023-11-17 13:07 ` [PATCH v6 2/6] iommufd: Add IOMMU_HWPT_INVALIDATE Yi Liu
2023-11-20 8:09 ` Tian, Kevin
2023-11-20 8:29 ` Yi Liu
2023-11-20 8:34 ` Tian, Kevin
2023-11-20 17:36 ` Nicolin Chen
2023-11-21 2:50 ` Tian, Kevin
2023-11-21 5:24 ` Nicolin Chen
2023-11-24 2:36 ` Tian, Kevin
2023-11-27 19:53 ` Nicolin Chen
2023-11-28 6:01 ` Yi Liu
2023-11-29 0:54 ` Nicolin Chen
2023-11-28 8:03 ` Tian, Kevin
2023-11-29 0:51 ` Nicolin Chen
2023-11-29 0:57 ` Jason Gunthorpe
2023-11-29 1:09 ` Nicolin Chen
2023-11-29 19:58 ` Jason Gunthorpe
2023-11-29 22:07 ` Nicolin Chen
2023-11-30 0:08 ` Jason Gunthorpe
2023-11-30 20:41 ` Nicolin Chen
2023-12-01 0:45 ` Jason Gunthorpe
2023-12-01 4:29 ` Nicolin Chen
2023-12-01 12:55 ` Jason Gunthorpe
2023-12-01 19:58 ` Nicolin Chen
2023-12-01 20:43 ` Jason Gunthorpe
2023-12-01 22:12 ` Nicolin Chen
2023-12-04 14:48 ` Jason Gunthorpe
2023-12-05 17:33 ` Nicolin Chen
2023-12-06 12:48 ` Jason Gunthorpe
2023-12-01 3:51 ` Yi Liu
2023-12-01 4:50 ` Nicolin Chen
2023-12-01 5:19 ` Tian, Kevin
2023-12-01 7:05 ` Yi Liu
2023-12-01 7:10 ` Tian, Kevin
2023-12-01 9:08 ` Yi Liu
2023-11-21 5:02 ` Baolu Lu
2023-11-21 5:19 ` Nicolin Chen
2023-11-28 5:54 ` Yi Liu
2023-12-06 18:33 ` Jason Gunthorpe
2023-12-07 6:59 ` Yi Liu
2023-12-07 9:04 ` Tian, Kevin
2023-12-07 14:42 ` Jason Gunthorpe
2023-12-11 7:53 ` Yi Liu
2023-12-11 13:21 ` Jason Gunthorpe
2023-12-12 13:45 ` Liu, Yi L
2023-12-12 14:40 ` Jason Gunthorpe
2023-12-13 13:47 ` Liu, Yi L
2023-12-13 14:11 ` Jason Gunthorpe
2023-12-11 7:49 ` Yi Liu
2023-11-17 13:07 ` [PATCH v6 3/6] iommu: Add iommu_copy_struct_from_user_array helper Yi Liu
2023-11-20 8:17 ` Tian, Kevin
2023-11-20 17:25 ` Nicolin Chen
2023-11-21 2:48 ` Tian, Kevin
2024-01-08 8:37 ` Binbin Wu
2023-11-17 13:07 ` [PATCH v6 4/6] iommufd/selftest: Add mock_domain_cache_invalidate_user support Yi Liu
2023-12-06 18:16 ` Jason Gunthorpe
2023-12-11 11:21 ` Yi Liu
2023-11-17 13:07 ` [PATCH v6 5/6] iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op Yi Liu
2023-11-17 13:07 ` [PATCH v6 6/6] iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl Yi Liu
2023-12-06 18:19 ` Jason Gunthorpe
2023-12-11 11:28 ` Yi Liu
2023-12-11 13:06 ` Jason Gunthorpe
2023-12-09 1:47 ` [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2) Jason Gunthorpe
2023-12-11 2:29 ` Tian, Kevin
2023-12-11 12:36 ` Yi Liu
2023-12-11 13:05 ` Jason Gunthorpe
2023-12-11 15:34 ` Suthikulpanit, Suravee
2023-12-11 16:06 ` Jason Gunthorpe
2023-12-11 12:35 ` Yi Liu
2023-12-11 13:20 ` Jason Gunthorpe
2023-12-11 20:11 ` Nicolin Chen
2023-12-11 21:48 ` Jason Gunthorpe
2023-12-11 17:35 ` Suthikulpanit, Suravee
2023-12-11 17:45 ` Jason Gunthorpe
2023-12-11 21:27 ` Nicolin Chen
2023-12-11 21:57 ` Jason Gunthorpe
2023-12-12 7:30 ` Nicolin Chen
2023-12-12 14:44 ` Jason Gunthorpe
2023-12-12 19:13 ` Nicolin Chen
2023-12-12 19:21 ` Jason Gunthorpe
2023-12-12 20:05 ` Nicolin Chen
2023-12-13 12:40 ` Jason Gunthorpe
2023-12-13 19:54 ` Nicolin Chen [this message]
2023-12-17 11:21 ` Joel Granados
2023-12-19 9:26 ` Yi Liu
2023-12-20 11:23 ` Joel Granados
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZXoL56s1EIchNRzD@Asurada-Nvidia \
--to=nicolinc@nvidia.com \
--cc=Dhaval.Giani@amd.com \
--cc=alex.williamson@redhat.com \
--cc=baolu.lu@linux.intel.com \
--cc=chao.p.peng@linux.intel.com \
--cc=cohuck@redhat.com \
--cc=eric.auger@redhat.com \
--cc=iommu@lists.linux.dev \
--cc=jasowang@redhat.com \
--cc=jgg@nvidia.com \
--cc=joao.m.martins@oracle.com \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=lulu@redhat.com \
--cc=mjrosato@linux.ibm.com \
--cc=peterx@redhat.com \
--cc=robin.murphy@arm.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=suravee.suthikulpanit@amd.com \
--cc=vasant.hegde@amd.com \
--cc=xin.zeng@intel.com \
--cc=yan.y.zhao@intel.com \
--cc=yi.l.liu@intel.com \
--cc=yi.y.sun@linux.intel.com \
--cc=zhenzhong.duan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox