From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Nicolin Chen <nicolinc@nvidia.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>
Subject: Re: [PATCH v3 03/17] iommufd: Replace the hwpt->devices list with iommufd_group
Date: Fri, 14 Apr 2023 10:31:33 -0300
Message-ID: <ZDlVtcwhV2G8ZKao@nvidia.com>
In-Reply-To: <BN9PR11MB52762841AAA04A24F76A743C8C989@BN9PR11MB5276.namprd11.prod.outlook.com>

On Thu, Apr 13, 2023 at 02:52:54AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 12, 2023 7:18 PM
> > 
> > On Wed, Apr 12, 2023 at 08:27:36AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Tuesday, April 11, 2023 10:31 PM
> > > >
> > > > On Thu, Mar 23, 2023 at 07:21:42AM +0000, Tian, Kevin wrote:
> > > >
> > > > > If no oversight then we can directly put the lock in
> > > > > iommufd_hw_pagetable_attach/detach() which can also simplify a bit on
> > > > > its callers in device.c.
> > > >
> > > > So, I did this, and syzkaller explains why this can't be done:
> > > >
> > > > https://lore.kernel.org/r/0000000000006e66d605f83e09bc@google.com
> > > >
> > > > We can't allow the hwpt to be discovered by a parallel
> > > > iommufd_hw_pagetable_attach() until it is done being setup, otherwise
> > > > if we fail to set it up we can't destroy the hwpt.
> > > >
> > > > 	if (immediate_attach) {
> > > > 		rc = iommufd_hw_pagetable_attach(hwpt, idev);
> > > > 		if (rc)
> > > > 			goto out_abort;
> > > > 	}
> > > >
> > > > 	rc = iopt_table_add_domain(&hwpt->ioas->iopt, hwpt->domain);
> > > > 	if (rc)
> > > > 		goto out_detach;
> > > > 	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);
> > > > 	return hwpt;
> > > >
> > > > out_detach:
> > > > 	if (immediate_attach)
> > > > 		iommufd_hw_pagetable_detach(idev);
> > > > out_abort:
> > > > 	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
> > > >
> > > > As some other idev could be pointing at it too now.
> > >
> > > How could this happen before this object is finalized? iirc you pointed to
> > > me this fact in previous discussion.
> > 
> > It only is unavailable through the xarray, but we've added it to at
> > least one internal list on the group already, it is kind of sketchy to
> > work like this, it should all be atomic..
> > 
> 
> which internal list? group has a list for attached devices but regarding
> to hwpt it's stored in a single field igroup->hwpt.

It is added to ioas->hwpt_list by

	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);

and that list can then be observed from

	mutex_lock(&ioas->mutex);
	list_for_each_entry(hwpt, &ioas->hwpt_list, hwpt_item) {
		if (!hwpt->auto_domain)
			continue;

		if (!iommufd_lock_obj(&hwpt->obj))
			continue;

If iommufd_lock_obj() has already succeeded on the new hwpt, another
thread is holding the object and iommufd_object_abort_and_destroy() is
in trouble.

Thus we need to hold the ioas->mutex right up until we know we can't
call iommufd_object_abort_and_destroy(), or lift the hwpt list_add out
so that it only happens after that point.
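
To make the ordering concrete, here is a rough userspace model of the
rule above. It is only a sketch, not the iommufd code: every model_*
name is hypothetical, a pthread mutex stands in for ioas->mutex, and a
plain counter stands in for the reference iommufd_lock_obj() would
take. The creator holds the mutex across everything that can fail and
publishes to the list last, so a walker can never take a reference on
an hwpt that is about to be aborted.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct model_hwpt {
	struct model_hwpt *next;	/* link on model_ioas.hwpt_list */
	int users;			/* models a reference taken by a walker */
};

struct model_ioas {
	pthread_mutex_t mutex;		/* stands in for ioas->mutex */
	struct model_hwpt *hwpt_list;	/* stands in for ioas->hwpt_list */
};

/* Walker: takes a reference on the first hwpt it can see on the list. */
static struct model_hwpt *model_find_hwpt(struct model_ioas *ioas)
{
	struct model_hwpt *hwpt;

	pthread_mutex_lock(&ioas->mutex);
	hwpt = ioas->hwpt_list;
	if (hwpt)
		hwpt->users++;
	pthread_mutex_unlock(&ioas->mutex);
	return hwpt;
}

/*
 * Creator: the mutex is held across every step that can fail, and the
 * list insertion comes last. On failure the hwpt was never visible to
 * model_find_hwpt(), so the caller can abort it without racing.
 * attach_fails is a test knob that simulates a failing setup step.
 */
static bool model_hwpt_create(struct model_ioas *ioas,
			      struct model_hwpt *hwpt, bool attach_fails)
{
	bool ok = false;

	pthread_mutex_lock(&ioas->mutex);
	if (!attach_fails) {
		/* Publish only once nothing can fail any more. */
		hwpt->next = ioas->hwpt_list;
		ioas->hwpt_list = hwpt;
		ok = true;
	}
	pthread_mutex_unlock(&ioas->mutex);
	return ok;
}
```

With this shape a failed create leaves users at zero on the hwpt, which
is the condition an abort path would depend on.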

This could maybe also be fixed by holding the destroy_rwsem right up
until finalize. Though, I think I looked at this once and decided
against it for some reason.

> btw removing this lock in this file also makes it easier to support siov
> device which doesn't have group. We can have internal group attach
> and pasid attach wrappers within device.c and leave igroup->lock held
> in the group attach path.

Yeah, I expect this will need more work when we get to PASID support.

Most likely the resolution will be something like PASID domains can't
be used as PF/VF domains because they don't have the right reserved
regions, so they shouldn't be in the hwpt_list at all, so we can use a
more relaxed locking.

Jason
