From: Nicolin Chen <nicolinc@nvidia.com>
To: "Liu, Yi L" <yi.l.liu@intel.com>
Cc: "jgg@nvidia.com" <jgg@nvidia.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"will@kernel.org" <will@kernel.org>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"shuah@kernel.org" <shuah@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>
Subject: Re: [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain()
Date: Thu, 9 Feb 2023 12:55:51 -0800 [thread overview]
Message-ID: <Y+Vd17huIxYPJMrI@Asurada-Nvidia> (raw)
In-Reply-To: <DS0PR11MB75293887741EEB8953BCB0A9C3D89@DS0PR11MB7529.namprd11.prod.outlook.com>
On Wed, Feb 08, 2023 at 08:08:42AM +0000, Liu, Yi L wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Wednesday, February 8, 2023 5:18 AM
> >
> > iommu_group_replace_domain() is introduced to support use cases where
> > an
> > iommu_group can be attached to a new domain without getting detached
> > from
> > the old one. This replacement feature will be useful, for cases such as:
> > 1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
> > table with a larger table (PASID=N)
> > 2) Nesting mode, when switching the attaching device from an S2 domain
> > to an S1 domain, or when switching between relevant S1 domains.
> > as it allows these cases to switch seamlessly without a DMA disruption.
> >
> > So, call iommu_group_replace_domain() in the
> > iommufd_device_do_attach().
> > And add a __iommmufd_device_detach helper to allow the replace routine
> > to
> > do a partial detach on the current hwpt that's being replaced. Though the
> > updated locking logic is overcomplicated, it will be eased, once those
> > iopt_table_add/remove_ioas and list_add/del calls are moved to hwpt's
> > allocation/destroy() functions in the coming nesting series, as that'll
> > depend on a new ->domain_alloc_user op in the iommu core.
> >
> > Also, block replace operations that are from/to auto_domains, i.e. only
> > user-allocated hw_pagetables can be replaced or replaced with.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> > drivers/iommu/iommufd/device.c | 101 +++++++++++++++++-------
> > drivers/iommu/iommufd/iommufd_private.h | 2 +
> > 2 files changed, 76 insertions(+), 27 deletions(-)
> >
> > diff --git a/drivers/iommu/iommufd/device.c
> > b/drivers/iommu/iommufd/device.c
> > index b8c3e3baccb5..8a9834fc129a 100644
> > --- a/drivers/iommu/iommufd/device.c
> > +++ b/drivers/iommu/iommufd/device.c
> > @@ -9,6 +9,8 @@
> > #include "io_pagetable.h"
> > #include "iommufd_private.h"
> >
> > +MODULE_IMPORT_NS(IOMMUFD_INTERNAL);
> > +
> > static bool allow_unsafe_interrupts;
> > module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
> > MODULE_PARM_DESC(
> > @@ -194,9 +196,61 @@ static bool
> > iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
> > return false;
> > }
> >
> > +/**
> > + * __iommmufd_device_detach - Detach a device from idev->hwpt to
> > new_hwpt
>
> This function doesn't do anything to make this device attached to new_hwpt.
> It is done in the iommufd_device_attach_ioas(). New_hwpt here indicates if
> this detach requires to do some extra thing. E.g. remove reserved iova from
> the idev->hwpt->ioas. So may just say " Detach a device from idev->hwpt",
> and explain the usage of new_hwpt in the below.
Yea. You are right.
> > + * @idev: device to detach
> > + * @new_hwpt: new hw_pagetable to attach (pass in NULL for a simple
> > detach)
>
> The new hw_pagetable to be attached.
OK.
> > + * @detach_group: flag to call iommu_detach_group
> > + *
> > + * This is a cleanup helper shared by the replace and detach routines.
> > Comparing
> > + * to a detach routine, a replace routine only needs a partial detach
> > procedure:
> > + * it does not need the iommu_detach_group(); it will attach the device to
> > a new
> > + * hw_pagetable after a partial detach from the currently attached
> > hw_pagetable,
> > + * so certain steps can be skipped if two hw_pagetables have the same
> > IOAS.
> > + */
> > +static void __iommmufd_device_detach(struct iommufd_device *idev,
> > + struct iommufd_hw_pagetable
> > *new_hwpt,
> > + bool detach_group)
> > +{
> > + struct iommufd_hw_pagetable *hwpt = idev->hwpt;
> > + struct iommufd_ioas *new_ioas = NULL;
> > +
> > + if (new_hwpt)
> > + new_ioas = new_hwpt->ioas;
> > +
> > + mutex_lock(&hwpt->devices_lock);
> > + list_del(&idev->devices_item);
> > + if (hwpt->ioas != new_ioas)
> > + mutex_lock(&hwpt->ioas->mutex);
>
> The lock order is mostly hwpt->ioas->mutex and then hwpt->devices_lock.
> See the iommufd_device_auto_get_domain(). If possible, may switch the
> order sequence here.
Yea, I know it's a bit strange. Yet...
Our nesting series simplifies this part to:
if (cur_ioas != new_ioas) {
mutex_lock(&hwpt->ioas->mutex);
iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
mutex_unlock(&hwpt->ioas->mutex);
}
So, here is trying to avoid something like:
if (cur_ioas != new_ioas)
mutex_lock(&hwpt->ioas->mutex);
// doing something
if (cur_ioas != new_ioas)
iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
// doing something
if (cur_ioas != new_ioas)
mutex_unlock(&hwpt->ioas->mutex);
> Also, rename hwpt to be cur_hwpt, this may help
> reviewers to distinguish it from the hwpt in the caller of this function. It
> looks to be a deadlock at first look, but not after closer reading.
Sure.
> > @@ -345,6 +406,13 @@ int iommufd_device_attach(struct iommufd_device
> > *idev, u32 *pt_id)
> > struct iommufd_hw_pagetable *hwpt =
> > container_of(pt_obj, struct
> > iommufd_hw_pagetable, obj);
> >
> > + if (idev->hwpt == hwpt)
> > + goto out_done;
> > + if (idev->hwpt && idev->hwpt->auto_domain) {
> > + rc = -EBUSY;
>
> This means if device was attached to an auto_created hwpt, then we
> cannot replace it with a user allocated hwpt? If yes, this means the
> replace is not available until user hwpt support, which is part of nesting.
After aligning with Jason, this limit here might be wrong, as we
should be able to support replacing an IOAS. I'd need to take a
closer look and fix it in v3.
> > + if (idev->hwpt)
> > + return -EBUSY;
>
> So we don't allow ioas replacement for physical devices. Is it?
> Looks like emulated devices allows it.
This was to avoid an replace with an auto_domain. Similarly, it's
likely wrong, as I replied above.
Thanks
Nic
next prev parent reply other threads:[~2023-02-09 20:56 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-07 21:17 [PATCH v2 00/10] Add IO page table replacement support Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 01/10] iommu: Move dev_iommu_ops() to private header Nicolin Chen
2023-02-09 2:49 ` Tian, Kevin
2023-02-07 21:17 ` [PATCH v2 02/10] iommu: Introduce a new iommu_group_replace_domain() API Nicolin Chen
2023-02-09 2:55 ` Tian, Kevin
2023-02-09 13:23 ` Jason Gunthorpe
2023-02-10 1:34 ` Tian, Kevin
2023-02-10 23:51 ` Alex Williamson
2023-02-11 0:44 ` Jason Gunthorpe
2023-02-13 2:24 ` Tian, Kevin
2023-02-13 8:34 ` Baolu Lu
2023-02-13 14:45 ` Jason Gunthorpe
2023-02-14 3:29 ` Tian, Kevin
2023-02-15 6:10 ` Tian, Kevin
2023-02-15 12:52 ` Jason Gunthorpe
2023-02-22 2:11 ` Tian, Kevin
2023-02-24 0:57 ` Jason Gunthorpe
2023-02-24 8:07 ` Tian, Kevin
2023-02-07 21:17 ` [PATCH v2 03/10] iommufd: Create access in vfio_iommufd_emulated_bind() Nicolin Chen
2023-02-09 2:56 ` Tian, Kevin
2023-02-09 16:15 ` Nicolin Chen
2023-02-09 18:58 ` Eric Farman
2023-02-09 19:54 ` Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 04/10] iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_SET_IOAS coverage Nicolin Chen
2023-02-09 2:59 ` Tian, Kevin
2023-02-07 21:17 ` [PATCH v2 05/10] iommufd: Add replace support in iommufd_access_set_ioas() Nicolin Chen
2023-02-09 3:13 ` Tian, Kevin
2023-02-09 20:28 ` Nicolin Chen
2023-02-09 20:49 ` Jason Gunthorpe
2023-02-09 22:18 ` Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 06/10] iommufd/selftest: Add coverage for access->ioas replacement Nicolin Chen
2023-02-07 21:17 ` [PATCH v2 07/10] iommufd/device: Make hwpt_list list_add/del symmetric Nicolin Chen
2023-02-09 3:23 ` Tian, Kevin
2023-02-09 13:24 ` Jason Gunthorpe
2023-02-10 1:46 ` Tian, Kevin
2023-02-10 21:17 ` Jason Gunthorpe
2023-02-13 2:12 ` Tian, Kevin
2023-02-07 21:18 ` [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain() Nicolin Chen
2023-02-08 8:08 ` Liu, Yi L
2023-02-09 20:55 ` Nicolin Chen [this message]
2023-02-08 8:12 ` Liu, Yi L
2023-02-09 20:56 ` Nicolin Chen
2023-02-09 4:00 ` Tian, Kevin
2023-02-09 21:13 ` Nicolin Chen
2023-02-10 0:01 ` Jason Gunthorpe
2023-02-10 20:50 ` Nicolin Chen
2023-02-10 2:11 ` Tian, Kevin
2023-02-11 0:10 ` Nicolin Chen
2023-02-13 2:34 ` Tian, Kevin
2023-02-13 7:48 ` Nicolin Chen
2023-02-13 8:27 ` Tian, Kevin
2023-02-13 14:49 ` Jason Gunthorpe
2023-02-14 10:54 ` Nicolin Chen
2023-02-15 1:37 ` Tian, Kevin
2023-02-15 1:58 ` Nicolin Chen
2023-02-15 2:15 ` Tian, Kevin
2023-02-15 7:15 ` Nicolin Chen
2023-02-15 7:24 ` Tian, Kevin
2023-02-15 12:51 ` Jason Gunthorpe
2023-02-14 10:59 ` Nicolin Chen
2023-02-15 1:38 ` Tian, Kevin
2023-02-15 7:16 ` Nicolin Chen
2023-02-07 21:18 ` [PATCH v2 09/10] vfio: Support IO page table replacement Nicolin Chen
2023-02-09 4:06 ` Tian, Kevin
2023-02-07 21:18 ` [PATCH v2 10/10] vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages() Nicolin Chen
2023-02-09 4:10 ` Tian, Kevin
2023-02-09 13:26 ` Jason Gunthorpe
2023-02-09 16:19 ` Nicolin Chen
2023-02-09 2:50 ` [PATCH v2 00/10] Add IO page table replacement support Tian, Kevin
2023-02-09 16:13 ` Nicolin Chen
2023-02-10 1:34 ` Tian, Kevin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y+Vd17huIxYPJMrI@Asurada-Nvidia \
--to=nicolinc@nvidia.com \
--cc=alex.williamson@redhat.com \
--cc=baolu.lu@linux.intel.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=robin.murphy@arm.com \
--cc=shuah@kernel.org \
--cc=will@kernel.org \
--cc=yi.l.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox