public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Matthew Rosato <mjrosato@linux.ibm.com>,
	alex.williamson@redhat.com, pbonzini@redhat.com,
	cohuck@redhat.com, farman@linux.ibm.com, pmorel@linux.ibm.com,
	borntraeger@linux.ibm.com, frankja@linux.ibm.com,
	imbrenda@linux.ibm.com, david@redhat.com, akrowiak@linux.ibm.com,
	jjherne@linux.ibm.com, pasic@linux.ibm.com,
	zhenyuw@linux.intel.com, zhi.a.wang@intel.com,
	linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	intel-gvt-dev@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] KVM: async kvm_destroy_vm for vfio devices
Date: Wed, 11 Jan 2023 20:53:34 +0000	[thread overview]
Message-ID: <Y78hzsHiwaFpL60+@google.com> (raw)
In-Reply-To: <Y78Wk2/P5+gLMdpk@nvidia.com>

On Wed, Jan 11, 2023, Jason Gunthorpe wrote:
> On Wed, Jan 11, 2023 at 07:54:51PM +0000, Sean Christopherson wrote:
> 
> > Something feels off.  If KVM's refcount is 0, then accessing device->group->kvm
> > in vfio_device_open() can't happen unless there's a refcounting bug somewhere.
> 
> The problem is in close, not open.

The deadlock problem is, yes.  My point is that if group_lock needs to be taken
when nullifying group->kvm during kvm_vfio_destroy(), then there is also a refcounting
prolem with respect to open().  If there is no refcounting problem, then nullifying
group->kvm during kvm_vfio_destroy() is unnecessary (but again, I doubt this is
the case).

The two things aren't directly related, but it seems possible to solve both while
making this all slightly less ugly.  Well, at least from KVM's perspective, whether
or not it'd be an improvement on the VFIO side is definitely debatable.

> Specifically it would be very hard to avoid holding the group_lock
> during close which is when the put is done.
> 
> > Rather than force devices to get KVM references, why not handle that in common
> > VFIO code and drop KVM refcountin from devices?  Worst case scenario KVM is pinned
> > by a device that doesn't need KVM but is in a group associated with KVM.  If that's
> > a concern, it seems easy enough to add a flag to vfio_device_ops to enumerate
> > whether or not the device depends on KVM.
> 
> We can't make cross-dependencies between kvm and core VFIO - it is why
> so much of this is soo ugly.

Ugh, right, modules for everyone.

> The few device drivers that unavoidably have KVM involvment already
> have a KVM module dependency, so they can safely do the get/put

Rather than store a "struct kvm *" in vfio_device, what about adding a new set
of optional ops to get/put KVM references?  Having dedicated KVM ops is gross,
but IMO it's less gross than backdooring the KVM pointer into open_device() by
stashing KVM into the device, e.g. it formalizes the VFIO API for devices that
depend on KVM instead of making devices pinky-swear to grab a reference during
open_device().

To further harden things, KVM could export only kvm_get_safe_kvm() if there are
no vendor modules.  I.e. make kvm_get_kvm() an internal-only helper when possible
and effectively force VFIO devices to use the safe variant.  That would work even
x86, as kvm_get_kvm() wouldn't be exported if neither KVM_AMD nor KVM_INTEL is
built as a module.

---
 drivers/vfio/vfio_main.c | 20 +++++++++++++-------
 include/linux/vfio.h     |  9 +++++++--
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 6e8804fe0095..b3a84d65baa6 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -772,7 +772,12 @@ static struct file *vfio_device_open(struct vfio_device *device)
 		 * reference and release it during close_device.
 		 */
 		mutex_lock(&device->group->group_lock);
-		device->kvm = device->group->kvm;
+
+		if (device->kvm_ops && device->group->kvm) {
+			ret = device->kvm_ops->get_kvm(device->group->kvm);
+			if (ret)
+				goto err_undo_count;
+		}
 
 		if (device->ops->open_device) {
 			ret = device->ops->open_device(device);
@@ -823,8 +828,9 @@ static struct file *vfio_device_open(struct vfio_device *device)
 err_undo_count:
 	mutex_unlock(&device->group->group_lock);
 	device->open_count--;
-	if (device->open_count == 0 && device->kvm)
-		device->kvm = NULL;
+	if (device->open_count == 0 && device->kvm_ops)
+		device->kvm_ops->put_kvm();
+
 	mutex_unlock(&device->dev_set->lock);
 	module_put(device->dev->driver->owner);
 err_unassign_container:
@@ -1039,8 +1045,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	}
 	mutex_unlock(&device->group->group_lock);
 	device->open_count--;
-	if (device->open_count == 0)
-		device->kvm = NULL;
+	if (device->open_count == 0 && device->kvm_ops)
+		device->kvm_ops->put_kvm();
 	mutex_unlock(&device->dev_set->lock);
 
 	module_put(device->dev->driver->owner);
@@ -1656,8 +1662,8 @@ EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
  * @file: VFIO group file
  * @kvm: KVM to link
  *
- * When a VFIO device is first opened the KVM will be available in
- * device->kvm if one was associated with the group.
+ * When a VFIO device is first opened, the device's kvm_ops->get_kvm() will be
+ * invoked with the KVM instance associated with the group (if applicable).
  */
 void vfio_file_set_kvm(struct file *file, struct kvm *kvm)
 {
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index fdd393f70b19..d6dcbe0546bf 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -18,6 +18,11 @@
 
 struct kvm;
 
+struct vfio_device_kvm_ops {
+	int (*get_kvm)(struct kvm *kvm);
+	void (*put_kvm)(void);
+};
+
 /*
  * VFIO devices can be placed in a set, this allows all devices to share this
  * structure and the VFIO core will provide a lock that is held around
@@ -43,8 +48,8 @@ struct vfio_device {
 	struct vfio_device_set *dev_set;
 	struct list_head dev_set_list;
 	unsigned int migration_flags;
-	/* Driver must reference the kvm during open_device or never touch it */
-	struct kvm *kvm;
+
+	const struct vfio_device_kvm_ops *kvm_ops;
 
 	/* Members below here are private, not for driver use */
 	unsigned int index;

base-commit: d52444c7a90fc551b4c3b0bda7d3f0b2ca9fc84d
-- 


  reply	other threads:[~2023-01-11 20:53 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-09 20:10 [PATCH 0/2] kvm/vfio: fix potential deadlock on vfio group lock Matthew Rosato
2023-01-09 20:10 ` [PATCH 1/2] KVM: async kvm_destroy_vm for vfio devices Matthew Rosato
2023-01-09 20:13   ` Jason Gunthorpe
2023-01-09 20:24     ` Matthew Rosato
2023-01-09 21:07   ` Anthony Krowiak
2023-01-11 19:54   ` Sean Christopherson
2023-01-11 20:05     ` Jason Gunthorpe
2023-01-11 20:53       ` Sean Christopherson [this message]
2023-01-12 12:45         ` Jason Gunthorpe
2023-01-12 17:21           ` Matthew Rosato
2023-01-12 17:27             ` Jason Gunthorpe
2023-01-09 20:10 ` [PATCH 2/2] KVM: s390: pci: use asyncronous kvm put Matthew Rosato

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y78hzsHiwaFpL60+@google.com \
    --to=seanjc@google.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=frankja@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=intel-gvt-dev@lists.freedesktop.org \
    --cc=jgg@nvidia.com \
    --cc=jjherne@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=pmorel@linux.ibm.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhi.a.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox