Re: [PATCH 2/4] vfio: Move the sanity check of the group to vfio_create_group()

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Jason Gunthorpe <jgg@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, Will Deacon <will@kernel.org>,
	Qian Cai <cai@lca.pw>, Joerg Roedel <jroedel@suse.de>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH 2/4] vfio: Move the sanity check of the group to vfio_create_group()
Date: Thu, 22 Sep 2022 16:36:14 -0300	[thread overview]
Message-ID: <Yyy5Lr30A3GuK4Ab@nvidia.com> (raw)
In-Reply-To: <20220922131050.7136481f.alex.williamson@redhat.com>

On Thu, Sep 22, 2022 at 01:10:50PM -0600, Alex Williamson wrote:
> > @@ -378,13 +394,20 @@ static struct vfio_group *vfio_get_group(struct iommu_group *iommu_group,
> >  
> >  	mutex_lock(&vfio.group_lock);
> >  
> > -	ret = __vfio_group_get_from_iommu(iommu_group);
> > -	if (ret)
> > -		goto err_unlock;
> > +	ret = vfio_group_find_from_iommu(iommu_group);
> > +	if (ret) {
> > +		if (WARN_ON(vfio_group_has_device(ret, dev))) {
> > +			ret = ERR_PTR(-EINVAL);
> > +			goto out_unlock;
> > +		}
> 
> This still looks racy.  We only know within vfio_group_has_device() that
> the device is not present in the group, what prevents a race between
> here and when we finally do add it to group->device_list?

This is a condition which is defined to be impossible and by
auditing I've checked it can't happen.

There is no race in the sense that this can't actually happen, if it
does happen it means the group is corrupted. At that point reasoning
about locks and such goes out the window too - eg it might be
corrupted because of bad locking.

When it comes to self-debugging tests we often have these
"races", eg in the destroy path we do:

	WARN_ON(!list_empty(&group->device_list));

Which doesn't hold the appropriate locks either.

The point is just to detect that group has been corrupted at a point
in time in hopes of guarding against a future kernel bug.

> The semantics of vfio_get_group() are also rather strange, 'return a
> vfio_group for this iommu_group, but make sure it doesn't include this
> device' :-\  Thanks,

I think of it as "return a group and also do sanity checks that the
returned group has not been corrupted"

I don't like the name of this function but couldn't figure a better
one. It is something like "find or create a group for a device which
we know doesn't already have a group"

Jason

next prev parent reply	other threads:[~2022-09-22 19:36 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-08 18:44 [PATCH 0/4] Fix splats releated to using the iommu_group after destroying devices Jason Gunthorpe
2022-09-08 18:44 ` [PATCH 1/4] vfio: Simplify vfio_create_group() Jason Gunthorpe
2022-09-20 19:45   ` Matthew Rosato
2022-09-08 18:44 ` [PATCH 2/4] vfio: Move the sanity check of the group to vfio_create_group() Jason Gunthorpe
2022-09-22 19:10   ` Alex Williamson
2022-09-22 19:36     ` Jason Gunthorpe [this message]
2022-09-22 21:23       ` Alex Williamson
2022-09-22 23:12         ` Jason Gunthorpe
2022-09-08 18:45 ` [PATCH 3/4] vfio: Follow a strict lifetime for struct iommu_group * Jason Gunthorpe
2022-09-20 19:32   ` Matthew Rosato
2022-09-08 18:45 ` [PATCH 4/4] iommu: Fix ordering of iommu_release_device() Jason Gunthorpe
2022-09-08 21:05   ` Robin Murphy
2022-09-08 21:27     ` Robin Murphy
2022-09-08 21:43       ` Jason Gunthorpe
2022-09-09  9:05         ` Robin Murphy
2022-09-09 13:25           ` Jason Gunthorpe
2022-09-09 17:57             ` Robin Murphy
2022-09-09 18:30               ` Jason Gunthorpe
2022-09-09 19:55                 ` Robin Murphy
2022-09-09 23:45                   ` Jason Gunthorpe
2022-09-12 11:13                     ` Robin Murphy
2022-09-22 16:56                       ` Jason Gunthorpe
2022-09-09 12:49 ` [PATCH 0/4] Fix splats releated to using the iommu_group after destroying devices Matthew Rosato
2022-09-09 16:24   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yyy5Lr30A3GuK4Ab@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=cai@lca.pw \
    --cc=cohuck@redhat.com \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=jroedel@suse.de \
    --cc=kvm@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox