* [RFC v2 00/11] Move group specific code into group.c
@ 2022-11-24 12:26 Yi Liu
2022-11-24 12:26 ` [RFC v2 01/11] vfio: Simplify vfio_create_group() Yi Liu
` (10 more replies)
0 siblings, 11 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
With the introduction of iommufd[1], VFIO is towarding to provide device
centric uAPI after adapting to iommufd. With this trend, existing VFIO
group infrastructure is optional once VFIO converted to device centric.
This series moves the group specific code out of vfio_main.c, prepares
for compiling group infrastructure out after adding vfio device cdev[2]
Complete code in below branch:
https://github.com/yiliu1765/iommufd/commits/vfio_group_split_rfcv2
This is based on Jason's "Connect VFIO to IOMMUFD"[3] and my "Make mdev driver
dma_unmap callback tolerant to unmaps come before device open"[4]
[1] https://lore.kernel.org/all/0-v5-4001c2997bd0+30c-iommufd_jgg@nvidia.com/
[2] https://github.com/yiliu1765/iommufd/tree/wip/vfio_device_cdev
[3] https://lore.kernel.org/kvm/063990c3-c244-1f7f-4e01-348023832066@intel.com/T/#t
[4] https://lore.kernel.org/kvm/20221123134832.429589-1-yi.l.liu@intel.com/T/#t
v2:
- Remove device->group reference in vfio_main.c suggested by Jason.
- Cherry-pick the patches in Alex's vfio/next branch, and rebased this
series on the top.
v1: https://lore.kernel.org/kvm/20221123150113.670399-1-yi.l.liu@intel.com/T/#t
Regards,
Yi Liu
Jason Gunthorpe (2):
vfio: Simplify vfio_create_group()
vfio: Move the sanity check of the group to vfio_create_group()
Yi Liu (9):
vfio: Set device->group in helper function
vfio: Wrap group codes to be helpers for __vfio_register_dev() and
unregister
vfio: Make vfio_device_open() group agnostic
vfio: Move device open/close code to be helpfers
vfio: Swap order of vfio_device_container_register() and open_device()
vfio: Refactor vfio_device_first_open() and _last_close()
vfio: Wrap vfio group module init/clean code into helpers
vfio: Refactor dma APIs for emulated devices
vfio: Move vfio group specific code into group.c
drivers/vfio/Makefile | 1 +
drivers/vfio/container.c | 20 +-
drivers/vfio/group.c | 842 ++++++++++++++++++++++++++++++++++++
drivers/vfio/vfio.h | 49 ++-
drivers/vfio/vfio_main.c | 896 +++------------------------------------
5 files changed, 959 insertions(+), 849 deletions(-)
create mode 100644 drivers/vfio/group.c
--
2.34.1
^ permalink raw reply [flat|nested] 42+ messages in thread
* [RFC v2 01/11] vfio: Simplify vfio_create_group()
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-28 7:57 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 02/11] vfio: Move the sanity check of the group to vfio_create_group() Yi Liu
` (9 subsequent siblings)
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
From: Jason Gunthorpe <jgg@nvidia.com>
The vfio.group_lock is now only used to serialize vfio_group creation
and destruction, we don't need a micro-optimization of searching,
unlocking, then allocating and searching again. Just hold the lock
the whole time.
Grabbed from:
https://lore.kernel.org/kvm/20220922152338.2a2238fe.alex.williamson@redhat.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 33 ++++++++++-----------------------
1 file changed, 10 insertions(+), 23 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index b233805ff645..5388a11ed496 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -146,10 +146,12 @@ EXPORT_SYMBOL_GPL(vfio_device_set_open_count);
* Group objects - create, release, get, put, search
*/
static struct vfio_group *
-__vfio_group_get_from_iommu(struct iommu_group *iommu_group)
+vfio_group_get_from_iommu(struct iommu_group *iommu_group)
{
struct vfio_group *group;
+ lockdep_assert_held(&vfio.group_lock);
+
/*
* group->iommu_group from the vfio.group_list cannot be NULL
* under the vfio.group_lock.
@@ -163,17 +165,6 @@ __vfio_group_get_from_iommu(struct iommu_group *iommu_group)
return NULL;
}
-static struct vfio_group *
-vfio_group_get_from_iommu(struct iommu_group *iommu_group)
-{
- struct vfio_group *group;
-
- mutex_lock(&vfio.group_lock);
- group = __vfio_group_get_from_iommu(iommu_group);
- mutex_unlock(&vfio.group_lock);
- return group;
-}
-
static void vfio_group_release(struct device *dev)
{
struct vfio_group *group = container_of(dev, struct vfio_group, dev);
@@ -228,6 +219,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
struct vfio_group *ret;
int err;
+ lockdep_assert_held(&vfio.group_lock);
+
group = vfio_group_alloc(iommu_group, type);
if (IS_ERR(group))
return group;
@@ -240,26 +233,16 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
goto err_put;
}
- mutex_lock(&vfio.group_lock);
-
- /* Did we race creating this group? */
- ret = __vfio_group_get_from_iommu(iommu_group);
- if (ret)
- goto err_unlock;
-
err = cdev_device_add(&group->cdev, &group->dev);
if (err) {
ret = ERR_PTR(err);
- goto err_unlock;
+ goto err_put;
}
list_add(&group->vfio_next, &vfio.group_list);
- mutex_unlock(&vfio.group_lock);
return group;
-err_unlock:
- mutex_unlock(&vfio.group_lock);
err_put:
put_device(&group->dev);
return ret;
@@ -456,7 +439,9 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
if (ret)
goto out_put_group;
+ mutex_lock(&vfio.group_lock);
group = vfio_create_group(iommu_group, type);
+ mutex_unlock(&vfio.group_lock);
if (IS_ERR(group)) {
ret = PTR_ERR(group);
goto out_remove_device;
@@ -505,9 +490,11 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
return ERR_PTR(-EINVAL);
}
+ mutex_lock(&vfio.group_lock);
group = vfio_group_get_from_iommu(iommu_group);
if (!group)
group = vfio_create_group(iommu_group, VFIO_IOMMU);
+ mutex_unlock(&vfio.group_lock);
/* The vfio_group holds a reference to the iommu_group */
iommu_group_put(iommu_group);
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 02/11] vfio: Move the sanity check of the group to vfio_create_group()
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
2022-11-24 12:26 ` [RFC v2 01/11] vfio: Simplify vfio_create_group() Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-28 8:05 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 03/11] vfio: Set device->group in helper function Yi Liu
` (8 subsequent siblings)
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
From: Jason Gunthorpe <jgg@nvidia.com>
This avoids opening group specific code in __vfio_register_dev() for the
sanity check if an (existing) group is not corrupted by having two copies
of the same struct device in it. It also simplifies the error unwind for
this sanity check since the failure can be detected in the group allocation.
This also prepares for moving the group specific code into separate group.c.
Grabbed from:
https://lore.kernel.org/kvm/20220922152338.2a2238fe.alex.williamson@redhat.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 62 ++++++++++++++++------------------------
1 file changed, 25 insertions(+), 37 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 5388a11ed496..7a78256a650e 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -146,7 +146,7 @@ EXPORT_SYMBOL_GPL(vfio_device_set_open_count);
* Group objects - create, release, get, put, search
*/
static struct vfio_group *
-vfio_group_get_from_iommu(struct iommu_group *iommu_group)
+vfio_group_find_from_iommu(struct iommu_group *iommu_group)
{
struct vfio_group *group;
@@ -157,10 +157,8 @@ vfio_group_get_from_iommu(struct iommu_group *iommu_group)
* under the vfio.group_lock.
*/
list_for_each_entry(group, &vfio.group_list, vfio_next) {
- if (group->iommu_group == iommu_group) {
- refcount_inc(&group->drivers);
+ if (group->iommu_group == iommu_group)
return group;
- }
}
return NULL;
}
@@ -310,23 +308,6 @@ static bool vfio_device_try_get_registration(struct vfio_device *device)
return refcount_inc_not_zero(&device->refcount);
}
-static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
- struct device *dev)
-{
- struct vfio_device *device;
-
- mutex_lock(&group->device_lock);
- list_for_each_entry(device, &group->device_list, group_next) {
- if (device->dev == dev &&
- vfio_device_try_get_registration(device)) {
- mutex_unlock(&group->device_lock);
- return device;
- }
- }
- mutex_unlock(&group->device_lock);
- return NULL;
-}
-
/*
* VFIO driver API
*/
@@ -456,6 +437,21 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
return ERR_PTR(ret);
}
+static bool vfio_group_has_device(struct vfio_group *group, struct device *dev)
+{
+ struct vfio_device *device;
+
+ mutex_lock(&group->device_lock);
+ list_for_each_entry(device, &group->device_list, group_next) {
+ if (device->dev == dev) {
+ mutex_unlock(&group->device_lock);
+ return true;
+ }
+ }
+ mutex_unlock(&group->device_lock);
+ return false;
+}
+
static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
{
struct iommu_group *iommu_group;
@@ -491,9 +487,15 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
}
mutex_lock(&vfio.group_lock);
- group = vfio_group_get_from_iommu(iommu_group);
- if (!group)
+ group = vfio_group_find_from_iommu(iommu_group);
+ if (group) {
+ if (WARN_ON(vfio_group_has_device(group, dev)))
+ group = ERR_PTR(-EINVAL);
+ else
+ refcount_inc(&group->drivers);
+ } else {
group = vfio_create_group(iommu_group, VFIO_IOMMU);
+ }
mutex_unlock(&vfio.group_lock);
/* The vfio_group holds a reference to the iommu_group */
@@ -504,7 +506,6 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
static int __vfio_register_dev(struct vfio_device *device,
struct vfio_group *group)
{
- struct vfio_device *existing_device;
int ret;
/*
@@ -526,19 +527,6 @@ static int __vfio_register_dev(struct vfio_device *device,
if (!device->dev_set)
vfio_assign_device_set(device, device);
- existing_device = vfio_group_get_device(group, device->dev);
- if (existing_device) {
- /*
- * group->iommu_group is non-NULL because we hold the drivers
- * refcount.
- */
- dev_WARN(device->dev, "Device already exists on group %d\n",
- iommu_group_id(group->iommu_group));
- vfio_device_put_registration(existing_device);
- ret = -EBUSY;
- goto err_out;
- }
-
/* Our reference on group is moved to the device */
device->group = group;
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
2022-11-24 12:26 ` [RFC v2 01/11] vfio: Simplify vfio_create_group() Yi Liu
2022-11-24 12:26 ` [RFC v2 02/11] vfio: Move the sanity check of the group to vfio_create_group() Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-24 13:17 ` Jason Gunthorpe
2022-11-28 8:08 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 04/11] vfio: Wrap group codes to be helpers for __vfio_register_dev() and unregister Yi Liu
` (7 subsequent siblings)
10 siblings, 2 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
This avoids referencing device->group in __vfio_register_dev()
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 52 +++++++++++++++++++++++++---------------
1 file changed, 33 insertions(+), 19 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 7a78256a650e..4980b8acf5d3 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -503,10 +503,15 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
return group;
}
-static int __vfio_register_dev(struct vfio_device *device,
- struct vfio_group *group)
+static int vfio_device_set_group(struct vfio_device *device,
+ enum vfio_group_type type)
{
- int ret;
+ struct vfio_group *group;
+
+ if (type == VFIO_IOMMU)
+ group = vfio_group_find_or_alloc(device->dev);
+ else
+ group = vfio_noiommu_group_alloc(device->dev, type);
/*
* In all cases group is the output of one of the group allocation
@@ -515,6 +520,16 @@ static int __vfio_register_dev(struct vfio_device *device,
if (IS_ERR(group))
return PTR_ERR(group);
+ /* Our reference on group is moved to the device */
+ device->group = group;
+ return 0;
+}
+
+static int __vfio_register_dev(struct vfio_device *device,
+ enum vfio_group_type type)
+{
+ int ret;
+
if (WARN_ON(device->ops->bind_iommufd &&
(!device->ops->unbind_iommufd ||
!device->ops->attach_ioas)))
@@ -527,34 +542,33 @@ static int __vfio_register_dev(struct vfio_device *device,
if (!device->dev_set)
vfio_assign_device_set(device, device);
- /* Our reference on group is moved to the device */
- device->group = group;
-
ret = dev_set_name(&device->device, "vfio%d", device->index);
if (ret)
- goto err_out;
+ return ret;
- ret = device_add(&device->device);
+ ret = vfio_device_set_group(device, type);
if (ret)
- goto err_out;
+ return ret;
+
+ ret = device_add(&device->device);
+ if (ret) {
+ vfio_device_remove_group(device);
+ return ret;
+ }
/* Refcounting can't start until the driver calls register */
refcount_set(&device->refcount, 1);
- mutex_lock(&group->device_lock);
- list_add(&device->group_next, &group->device_list);
- mutex_unlock(&group->device_lock);
+ mutex_lock(&device->group->device_lock);
+ list_add(&device->group_next, &device->group->device_list);
+ mutex_unlock(&device->group->device_lock);
return 0;
-err_out:
- vfio_device_remove_group(device);
- return ret;
}
int vfio_register_group_dev(struct vfio_device *device)
{
- return __vfio_register_dev(device,
- vfio_group_find_or_alloc(device->dev));
+ return __vfio_register_dev(device, VFIO_IOMMU);
}
EXPORT_SYMBOL_GPL(vfio_register_group_dev);
@@ -564,8 +578,7 @@ EXPORT_SYMBOL_GPL(vfio_register_group_dev);
*/
int vfio_register_emulated_iommu_dev(struct vfio_device *device)
{
- return __vfio_register_dev(device,
- vfio_noiommu_group_alloc(device->dev, VFIO_EMULATED_IOMMU));
+ return __vfio_register_dev(device, VFIO_EMULATED_IOMMU);
}
EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
@@ -638,6 +651,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
/* Balances device_add in register path */
device_del(&device->device);
+ /* Balances vfio_device_set_group in register path */
vfio_device_remove_group(device);
}
EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 04/11] vfio: Wrap group codes to be helpers for __vfio_register_dev() and unregister
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (2 preceding siblings ...)
2022-11-24 12:26 ` [RFC v2 03/11] vfio: Set device->group in helper function Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-28 8:11 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic Yi Liu
` (6 subsequent siblings)
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
This avoids to decode group fields in the common functions used by
vfio_device registration, and prepares for further moving vfio group
specific code into separate file.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 4980b8acf5d3..edcfa8a61096 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -525,6 +525,20 @@ static int vfio_device_set_group(struct vfio_device *device,
return 0;
}
+static void vfio_device_group_register(struct vfio_device *device)
+{
+ mutex_lock(&device->group->device_lock);
+ list_add(&device->group_next, &device->group->device_list);
+ mutex_unlock(&device->group->device_lock);
+}
+
+static void vfio_device_group_unregister(struct vfio_device *device)
+{
+ mutex_lock(&device->group->device_lock);
+ list_del(&device->group_next);
+ mutex_unlock(&device->group->device_lock);
+}
+
static int __vfio_register_dev(struct vfio_device *device,
enum vfio_group_type type)
{
@@ -559,9 +573,7 @@ static int __vfio_register_dev(struct vfio_device *device,
/* Refcounting can't start until the driver calls register */
refcount_set(&device->refcount, 1);
- mutex_lock(&device->group->device_lock);
- list_add(&device->group_next, &device->group->device_list);
- mutex_unlock(&device->group->device_lock);
+ vfio_device_group_register(device);
return 0;
}
@@ -616,7 +628,6 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
* removed. Open file descriptors for the device... */
void vfio_unregister_group_dev(struct vfio_device *device)
{
- struct vfio_group *group = device->group;
unsigned int i = 0;
bool interrupted = false;
long rc;
@@ -644,9 +655,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
}
}
- mutex_lock(&group->device_lock);
- list_del(&device->group_next);
- mutex_unlock(&group->device_lock);
+ vfio_device_group_unregister(device);
/* Balances device_add in register path */
device_del(&device->device);
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (3 preceding siblings ...)
2022-11-24 12:26 ` [RFC v2 04/11] vfio: Wrap group codes to be helpers for __vfio_register_dev() and unregister Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-28 8:17 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 06/11] vfio: Move device open/close code to be helpfers Yi Liu
` (5 subsequent siblings)
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
This prepares for moving group specific code to separate file.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index edcfa8a61096..fcb9f778fc9b 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -878,9 +878,6 @@ static struct file *vfio_device_open(struct vfio_device *device)
*/
filep->f_mode |= (FMODE_PREAD | FMODE_PWRITE);
- if (device->group->type == VFIO_NO_IOMMU)
- dev_warn(device->dev, "vfio-noiommu device opened by user "
- "(%s:%d)\n", current->comm, task_pid_nr(current));
/*
* On success the ref of device is moved to the file and
* put in vfio_device_fops_release()
@@ -927,6 +924,10 @@ static int vfio_group_ioctl_get_device_fd(struct vfio_group *group,
goto err_put_fdno;
}
+ if (group->type == VFIO_NO_IOMMU)
+ dev_warn(device->dev, "vfio-noiommu device opened by user "
+ "(%s:%d)\n", current->comm, task_pid_nr(current));
+
fd_install(fdno, filep);
return fdno;
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 06/11] vfio: Move device open/close code to be helpfers
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (4 preceding siblings ...)
2022-11-24 12:26 ` [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-28 8:21 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 07/11] vfio: Swap order of vfio_device_container_register() and open_device() Yi Liu
` (4 subsequent siblings)
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
This makes the device open and close be in paired helpers.
vfio_device_open(), and vfio_device_close() handles the open_count, and
calls vfio_device_first_open(), and vfio_device_last_close() when
open_count condition is met. This also helps to avoid open code for device
in the vfio_group_ioctl_get_device_fd(), and prepares for further moving
vfio group specific code into separate file.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 46 +++++++++++++++++++++++++---------------
1 file changed, 29 insertions(+), 17 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index fcb9f778fc9b..c3d58cbc2aa9 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -846,20 +846,41 @@ static void vfio_device_last_close(struct vfio_device *device)
module_put(device->dev->driver->owner);
}
-static struct file *vfio_device_open(struct vfio_device *device)
+static int vfio_device_open(struct vfio_device *device)
{
- struct file *filep;
- int ret;
+ int ret = 0;
mutex_lock(&device->dev_set->lock);
device->open_count++;
if (device->open_count == 1) {
ret = vfio_device_first_open(device);
if (ret)
- goto err_unlock;
+ device->open_count--;
}
mutex_unlock(&device->dev_set->lock);
+ return ret;
+}
+
+static void vfio_device_close(struct vfio_device *device)
+{
+ mutex_lock(&device->dev_set->lock);
+ vfio_assert_device_open(device);
+ if (device->open_count == 1)
+ vfio_device_last_close(device);
+ device->open_count--;
+ mutex_unlock(&device->dev_set->lock);
+}
+
+static struct file *vfio_device_open_file(struct vfio_device *device)
+{
+ struct file *filep;
+ int ret;
+
+ ret = vfio_device_open(device);
+ if (ret)
+ goto err_out;
+
/*
* We can't use anon_inode_getfd() because we need to modify
* the f_mode flags directly to allow more than just ioctls
@@ -885,12 +906,8 @@ static struct file *vfio_device_open(struct vfio_device *device)
return filep;
err_close_device:
- mutex_lock(&device->dev_set->lock);
- if (device->open_count == 1)
- vfio_device_last_close(device);
-err_unlock:
- device->open_count--;
- mutex_unlock(&device->dev_set->lock);
+ vfio_device_close(device);
+err_out:
return ERR_PTR(ret);
}
@@ -918,7 +935,7 @@ static int vfio_group_ioctl_get_device_fd(struct vfio_group *group,
goto err_put_device;
}
- filep = vfio_device_open(device);
+ filep = vfio_device_open_file(device);
if (IS_ERR(filep)) {
ret = PTR_ERR(filep);
goto err_put_fdno;
@@ -1105,12 +1122,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
{
struct vfio_device *device = filep->private_data;
- mutex_lock(&device->dev_set->lock);
- vfio_assert_device_open(device);
- if (device->open_count == 1)
- vfio_device_last_close(device);
- device->open_count--;
- mutex_unlock(&device->dev_set->lock);
+ vfio_device_close(device);
vfio_device_put_registration(device);
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 07/11] vfio: Swap order of vfio_device_container_register() and open_device()
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (5 preceding siblings ...)
2022-11-24 12:26 ` [RFC v2 06/11] vfio: Move device open/close code to be helpfers Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-28 8:27 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close() Yi Liu
` (3 subsequent siblings)
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
This makes the DMA unmap callback registration to container be consistent
across the vfio iommufd compat mode and the legacy container mode.
In the vfio iommufd compat mode, this registration is done in the
vfio_iommufd_bind() when creating access which has an unmap callback. This
is prior to calling the open_device() op provided by mdev driver. While,
in the vfio legacy mode, the registration is done by
vfio_device_container_register() and it is after open_device(). By swapping
the order of vfio_device_container_register() and open_device(), the two
modes have the consistent order for the DMA unmap callback registration.
This also prepares for further splitting group specific code.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index c3d58cbc2aa9..af5908455cfc 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -799,6 +799,7 @@ static int vfio_device_first_open(struct vfio_device *device)
ret = vfio_group_use_container(device->group);
if (ret)
goto err_module_put;
+ vfio_device_container_register(device);
} else if (device->group->iommufd) {
ret = vfio_iommufd_bind(device, device->group->iommufd);
if (ret)
@@ -811,17 +812,17 @@ static int vfio_device_first_open(struct vfio_device *device)
if (ret)
goto err_container;
}
- if (device->group->container)
- vfio_device_container_register(device);
mutex_unlock(&device->group->group_lock);
return 0;
err_container:
device->kvm = NULL;
- if (device->group->container)
+ if (device->group->container) {
+ vfio_device_container_unregister(device);
vfio_group_unuse_container(device->group);
- else if (device->group->iommufd)
+ } else if (device->group->iommufd) {
vfio_iommufd_unbind(device);
+ }
err_module_put:
mutex_unlock(&device->group->group_lock);
module_put(device->dev->driver->owner);
@@ -833,15 +834,15 @@ static void vfio_device_last_close(struct vfio_device *device)
lockdep_assert_held(&device->dev_set->lock);
mutex_lock(&device->group->group_lock);
- if (device->group->container)
- vfio_device_container_unregister(device);
if (device->ops->close_device)
device->ops->close_device(device);
device->kvm = NULL;
- if (device->group->container)
+ if (device->group->container) {
+ vfio_device_container_unregister(device);
vfio_group_unuse_container(device->group);
- else if (device->group->iommufd)
+ } else if (device->group->iommufd) {
vfio_iommufd_unbind(device);
+ }
mutex_unlock(&device->group->group_lock);
module_put(device->dev->driver->owner);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (6 preceding siblings ...)
2022-11-24 12:26 ` [RFC v2 07/11] vfio: Swap order of vfio_device_container_register() and open_device() Yi Liu
@ 2022-11-24 12:26 ` Yi Liu
2022-11-24 14:56 ` Jason Gunthorpe
2022-11-28 8:31 ` Tian, Kevin
2022-11-24 12:27 ` [RFC v2 09/11] vfio: Wrap vfio group module init/clean code into helpers Yi Liu
` (2 subsequent siblings)
10 siblings, 2 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:26 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
To prepare for moving group specific code out from vfio_main.c.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 93 ++++++++++++++++++++++++++++------------
1 file changed, 65 insertions(+), 28 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index af5908455cfc..45f96515c05e 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -775,14 +775,9 @@ static bool vfio_assert_device_open(struct vfio_device *device)
return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
}
-static int vfio_device_first_open(struct vfio_device *device)
+static int vfio_device_group_use_iommu(struct vfio_device *device)
{
- int ret;
-
- lockdep_assert_held(&device->dev_set->lock);
-
- if (!try_module_get(device->dev->driver->owner))
- return -ENODEV;
+ int ret = 0;
/*
* Here we pass the KVM pointer with the group under the lock. If the
@@ -792,39 +787,88 @@ static int vfio_device_first_open(struct vfio_device *device)
mutex_lock(&device->group->group_lock);
if (!vfio_group_has_iommu(device->group)) {
ret = -EINVAL;
- goto err_module_put;
+ goto out_unlock;
}
if (device->group->container) {
ret = vfio_group_use_container(device->group);
if (ret)
- goto err_module_put;
+ goto out_unlock;
vfio_device_container_register(device);
} else if (device->group->iommufd) {
ret = vfio_iommufd_bind(device, device->group->iommufd);
- if (ret)
- goto err_module_put;
}
- device->kvm = device->group->kvm;
+out_unlock:
+ mutex_unlock(&device->group->group_lock);
+ return ret;
+}
+
+static void vfio_device_group_unuse_iommu(struct vfio_device *device)
+{
+ mutex_lock(&device->group->group_lock);
+ if (device->group->container) {
+ vfio_device_container_unregister(device);
+ vfio_group_unuse_container(device->group);
+ } else if (device->group->iommufd) {
+ vfio_iommufd_unbind(device);
+ }
+ mutex_unlock(&device->group->group_lock);
+}
+
+static struct kvm *vfio_device_get_group_kvm(struct vfio_device *device)
+{
+ struct vfio_group *group = device->group;
+
+ mutex_lock(&group->group_lock);
+ if (!group->kvm) {
+ mutex_unlock(&group->group_lock);
+ return NULL;
+ }
+ /* group_lock is released in the vfio_device_put_group_kvm() */
+ return group->kvm;
+}
+
+static void vfio_device_put_group_kvm(struct vfio_device *device)
+{
+ mutex_unlock(&device->group->group_lock);
+}
+
+static int vfio_device_first_open(struct vfio_device *device)
+{
+ struct kvm *kvm;
+ int ret;
+
+ lockdep_assert_held(&device->dev_set->lock);
+
+ if (!try_module_get(device->dev->driver->owner))
+ return -ENODEV;
+
+ ret = vfio_device_group_use_iommu(device);
+ if (ret)
+ goto err_module_put;
+
+ kvm = vfio_device_get_group_kvm(device);
+ if (!kvm) {
+ ret = -EINVAL;
+ goto err_unuse_iommu;
+ }
+
+ device->kvm = kvm;
if (device->ops->open_device) {
ret = device->ops->open_device(device);
if (ret)
goto err_container;
}
- mutex_unlock(&device->group->group_lock);
+ vfio_device_put_group_kvm(device);
return 0;
err_container:
device->kvm = NULL;
- if (device->group->container) {
- vfio_device_container_unregister(device);
- vfio_group_unuse_container(device->group);
- } else if (device->group->iommufd) {
- vfio_iommufd_unbind(device);
- }
+ vfio_device_put_group_kvm(device);
+err_unuse_iommu:
+ vfio_device_group_unuse_iommu(device);
err_module_put:
- mutex_unlock(&device->group->group_lock);
module_put(device->dev->driver->owner);
return ret;
}
@@ -833,17 +877,10 @@ static void vfio_device_last_close(struct vfio_device *device)
{
lockdep_assert_held(&device->dev_set->lock);
- mutex_lock(&device->group->group_lock);
if (device->ops->close_device)
device->ops->close_device(device);
device->kvm = NULL;
- if (device->group->container) {
- vfio_device_container_unregister(device);
- vfio_group_unuse_container(device->group);
- } else if (device->group->iommufd) {
- vfio_iommufd_unbind(device);
- }
- mutex_unlock(&device->group->group_lock);
+ vfio_device_group_unuse_iommu(device);
module_put(device->dev->driver->owner);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 09/11] vfio: Wrap vfio group module init/clean code into helpers
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (7 preceding siblings ...)
2022-11-24 12:26 ` [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close() Yi Liu
@ 2022-11-24 12:27 ` Yi Liu
2022-11-28 8:36 ` Tian, Kevin
2022-11-24 12:27 ` [RFC v2 10/11] vfio: Refactor dma APIs for emulated devices Yi Liu
2022-11-24 12:27 ` [RFC v2 11/11] vfio: Move vfio group specific code into group.c Yi Liu
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:27 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
This wraps the init/clean code of vfio group global variable to be helpers,
and prepares for further moving vfio group specific code into separate
file.
As container is used by group, so vfio_container_init/cleanup() is moved
into vfio_group_init/cleanup().
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 56 ++++++++++++++++++++++++++--------------
1 file changed, 36 insertions(+), 20 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 45f96515c05e..a0c699b3e3d9 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -2087,12 +2087,11 @@ static char *vfio_devnode(struct device *dev, umode_t *mode)
return kasprintf(GFP_KERNEL, "vfio/%s", dev_name(dev));
}
-static int __init vfio_init(void)
+static int __init vfio_group_init(void)
{
int ret;
ida_init(&vfio.group_ida);
- ida_init(&vfio.device_ida);
mutex_init(&vfio.group_lock);
INIT_LIST_HEAD(&vfio.group_list);
@@ -2109,24 +2108,12 @@ static int __init vfio_init(void)
vfio.class->devnode = vfio_devnode;
- /* /sys/class/vfio-dev/vfioX */
- vfio.device_class = class_create(THIS_MODULE, "vfio-dev");
- if (IS_ERR(vfio.device_class)) {
- ret = PTR_ERR(vfio.device_class);
- goto err_dev_class;
- }
-
ret = alloc_chrdev_region(&vfio.group_devt, 0, MINORMASK + 1, "vfio");
if (ret)
goto err_alloc_chrdev;
-
- pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
return 0;
err_alloc_chrdev:
- class_destroy(vfio.device_class);
- vfio.device_class = NULL;
-err_dev_class:
class_destroy(vfio.class);
vfio.class = NULL;
err_group_class:
@@ -2134,18 +2121,47 @@ static int __init vfio_init(void)
return ret;
}
-static void __exit vfio_cleanup(void)
+static void vfio_group_cleanup(void)
{
WARN_ON(!list_empty(&vfio.group_list));
-
- ida_destroy(&vfio.device_ida);
ida_destroy(&vfio.group_ida);
unregister_chrdev_region(vfio.group_devt, MINORMASK + 1);
- class_destroy(vfio.device_class);
- vfio.device_class = NULL;
class_destroy(vfio.class);
- vfio_container_cleanup();
vfio.class = NULL;
+ vfio_container_cleanup();
+}
+
+static int __init vfio_init(void)
+{
+ int ret;
+
+ ida_init(&vfio.device_ida);
+
+ ret = vfio_group_init();
+ if (ret)
+ return ret;
+
+ /* /sys/class/vfio-dev/vfioX */
+ vfio.device_class = class_create(THIS_MODULE, "vfio-dev");
+ if (IS_ERR(vfio.device_class)) {
+ ret = PTR_ERR(vfio.device_class);
+ goto err_dev_class;
+ }
+
+ pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
+ return 0;
+
+err_dev_class:
+ vfio_group_cleanup();
+ return ret;
+}
+
+static void __exit vfio_cleanup(void)
+{
+ ida_destroy(&vfio.device_ida);
+ class_destroy(vfio.device_class);
+ vfio.device_class = NULL;
+ vfio_group_cleanup();
xa_destroy(&vfio_device_set_xa);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 10/11] vfio: Refactor dma APIs for emulated devices
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (8 preceding siblings ...)
2022-11-24 12:27 ` [RFC v2 09/11] vfio: Wrap vfio group module init/clean code into helpers Yi Liu
@ 2022-11-24 12:27 ` Yi Liu
2022-11-28 8:42 ` Tian, Kevin
2022-11-24 12:27 ` [RFC v2 11/11] vfio: Move vfio group specific code into group.c Yi Liu
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:27 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
To use group helpers instead of opening group related code in the
API. This prepares moving group specific code out of vfio_main.c.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/vfio/container.c | 20 +++++++++++++-------
drivers/vfio/vfio.h | 32 ++++++++++++++++----------------
drivers/vfio/vfio_main.c | 25 ++++++++++++++-----------
3 files changed, 43 insertions(+), 34 deletions(-)
diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index 6b362d97d682..b7a9560ab25e 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -540,10 +540,12 @@ void vfio_group_unuse_container(struct vfio_group *group)
fput(group->opened_file);
}
-int vfio_container_pin_pages(struct vfio_container *container,
- struct iommu_group *iommu_group, dma_addr_t iova,
- int npage, int prot, struct page **pages)
+int vfio_device_container_pin_pages(struct vfio_device *device,
+ dma_addr_t iova, int npage,
+ int prot, struct page **pages)
{
+ struct vfio_container *container = device->group->container;
+ struct iommu_group *iommu_group = device->group->iommu_group;
struct vfio_iommu_driver *driver = container->iommu_driver;
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
@@ -555,9 +557,11 @@ int vfio_container_pin_pages(struct vfio_container *container,
npage, prot, pages);
}
-void vfio_container_unpin_pages(struct vfio_container *container,
- dma_addr_t iova, int npage)
+void vfio_device_container_unpin_pages(struct vfio_device *device,
+ dma_addr_t iova, int npage)
{
+ struct vfio_container *container = device->group->container;
+
if (WARN_ON(npage <= 0 || npage > VFIO_PIN_PAGES_MAX_ENTRIES))
return;
@@ -565,9 +569,11 @@ void vfio_container_unpin_pages(struct vfio_container *container,
npage);
}
-int vfio_container_dma_rw(struct vfio_container *container, dma_addr_t iova,
- void *data, size_t len, bool write)
+int vfio_device_container_dma_rw(struct vfio_device *device,
+ dma_addr_t iova, void *data,
+ size_t len, bool write)
{
+ struct vfio_container *container = device->group->container;
struct vfio_iommu_driver *driver = container->iommu_driver;
if (unlikely(!driver || !driver->ops->dma_rw))
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 3378714a7462..d7c631b43ac1 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -122,13 +122,14 @@ int vfio_container_attach_group(struct vfio_container *container,
void vfio_group_detach_container(struct vfio_group *group);
void vfio_device_container_register(struct vfio_device *device);
void vfio_device_container_unregister(struct vfio_device *device);
-int vfio_container_pin_pages(struct vfio_container *container,
- struct iommu_group *iommu_group, dma_addr_t iova,
- int npage, int prot, struct page **pages);
-void vfio_container_unpin_pages(struct vfio_container *container,
- dma_addr_t iova, int npage);
-int vfio_container_dma_rw(struct vfio_container *container, dma_addr_t iova,
- void *data, size_t len, bool write);
+int vfio_device_container_pin_pages(struct vfio_device *device,
+ dma_addr_t iova, int npage,
+ int prot, struct page **pages);
+void vfio_device_container_unpin_pages(struct vfio_device *device,
+ dma_addr_t iova, int npage);
+int vfio_device_container_dma_rw(struct vfio_device *device,
+ dma_addr_t iova, void *data,
+ size_t len, bool write);
int __init vfio_container_init(void);
void vfio_container_cleanup(void);
@@ -166,22 +167,21 @@ static inline void vfio_device_container_unregister(struct vfio_device *device)
{
}
-static inline int vfio_container_pin_pages(struct vfio_container *container,
- struct iommu_group *iommu_group,
- dma_addr_t iova, int npage, int prot,
- struct page **pages)
+static inline int vfio_device_container_pin_pages(struct vfio_group *group,
+ dma_addr_t iova, int npage,
+ int prot, struct page **pages)
{
return -EOPNOTSUPP;
}
-static inline void vfio_container_unpin_pages(struct vfio_container *container,
- dma_addr_t iova, int npage)
+static inline void vfio_device_container_unpin_pages(struct vfio_group *group,
+ dma_addr_t iova, int npage)
{
}
-static inline int vfio_container_dma_rw(struct vfio_container *container,
- dma_addr_t iova, void *data, size_t len,
- bool write)
+static inline int vfio_device_container_dma_rw(struct vfio_device *device,
+ dma_addr_t iova, void *data,
+ size_t len, bool write)
{
return -EOPNOTSUPP;
}
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index a0c699b3e3d9..b6353b054899 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1959,6 +1959,11 @@ int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr, int num_irqs,
}
EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
+static bool vfio_device_has_container(struct vfio_device *device)
+{
+ return device->group->container;
+}
+
/*
* Pin contiguous user pages and return their associated host pages for local
* domain only.
@@ -1971,7 +1976,7 @@ EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
* Return error or number of pages pinned.
*
* A driver may only call this function if the vfio_device was created
- * by vfio_register_emulated_iommu_dev() due to vfio_container_pin_pages().
+ * by vfio_register_emulated_iommu_dev() due to vfio_device_container_pin_pages().
*/
int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova,
int npage, int prot, struct page **pages)
@@ -1979,10 +1984,9 @@ int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova,
/* group->container cannot change while a vfio device is open */
if (!pages || !npage || WARN_ON(!vfio_assert_device_open(device)))
return -EINVAL;
- if (device->group->container)
- return vfio_container_pin_pages(device->group->container,
- device->group->iommu_group,
- iova, npage, prot, pages);
+ if (vfio_device_has_container(device))
+ return vfio_device_container_pin_pages(device, iova,
+ npage, prot, pages);
if (device->iommufd_access) {
int ret;
@@ -2018,9 +2022,8 @@ void vfio_unpin_pages(struct vfio_device *device, dma_addr_t iova, int npage)
if (WARN_ON(!vfio_assert_device_open(device)))
return;
- if (device->group->container) {
- vfio_container_unpin_pages(device->group->container, iova,
- npage);
+ if (vfio_device_has_container(device)) {
+ vfio_device_container_unpin_pages(device, iova, npage);
return;
}
if (device->iommufd_access) {
@@ -2057,9 +2060,9 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
if (!data || len <= 0 || !vfio_assert_device_open(device))
return -EINVAL;
- if (device->group->container)
- return vfio_container_dma_rw(device->group->container, iova,
- data, len, write);
+ if (vfio_device_has_container(device))
+ return vfio_device_container_dma_rw(device, iova,
+ data, len, write);
if (device->iommufd_access) {
unsigned int flags = 0;
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC v2 11/11] vfio: Move vfio group specific code into group.c
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
` (9 preceding siblings ...)
2022-11-24 12:27 ` [RFC v2 10/11] vfio: Refactor dma APIs for emulated devices Yi Liu
@ 2022-11-24 12:27 ` Yi Liu
2022-11-24 14:36 ` Jason Gunthorpe
10 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-24 12:27 UTC (permalink / raw)
To: alex.williamson, jgg
Cc: kevin.tian, eric.auger, cohuck, nicolinc, yi.y.sun, chao.p.peng,
mjrosato, kvm, yi.l.liu
This prepares for compiling out vfio group after vfio device cdev is
added. No vfio_group decode code should be in vfio_main.c.
No functional change is intended.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/Makefile | 1 +
drivers/vfio/group.c | 842 +++++++++++++++++++++++++++++++++++++++
drivers/vfio/vfio.h | 17 +
drivers/vfio/vfio_main.c | 830 +-------------------------------------
4 files changed, 863 insertions(+), 827 deletions(-)
create mode 100644 drivers/vfio/group.c
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index b953517dc70f..3783db7e8082 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -4,6 +4,7 @@ vfio_virqfd-y := virqfd.o
obj-$(CONFIG_VFIO) += vfio.o
vfio-y += vfio_main.o \
+ group.o \
iova_bitmap.o
vfio-$(CONFIG_IOMMUFD) += iommufd.o
vfio-$(CONFIG_VFIO_CONTAINER) += container.o
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
new file mode 100644
index 000000000000..f097c3f4ca55
--- /dev/null
+++ b/drivers/vfio/group.c
@@ -0,0 +1,842 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VFIO core
+ *
+ * Copyright (C) 2012 Red Hat, Inc. All rights reserved.
+ * Author: Alex Williamson <alex.williamson@redhat.com>
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc. All rights reserved.
+ * Author: Tom Lyon, pugs@cisco.com
+ */
+
+#include <linux/file.h>
+#include <linux/vfio.h>
+#include <linux/iommufd.h>
+#include "vfio.h"
+
+static struct vfio {
+ struct class *class;
+ struct list_head group_list;
+ struct mutex group_lock; /* locks group_list */
+ struct ida group_ida;
+ dev_t group_devt;
+} vfio;
+
+static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
+ char *buf)
+{
+ struct vfio_device *it, *device = ERR_PTR(-ENODEV);
+
+ mutex_lock(&group->device_lock);
+ list_for_each_entry(it, &group->device_list, group_next) {
+ int ret;
+
+ if (it->ops->match) {
+ ret = it->ops->match(it, buf);
+ if (ret < 0) {
+ device = ERR_PTR(ret);
+ break;
+ }
+ } else {
+ ret = !strcmp(dev_name(it->dev), buf);
+ }
+
+ if (ret && vfio_device_try_get_registration(it)) {
+ device = it;
+ break;
+ }
+ }
+ mutex_unlock(&group->device_lock);
+
+ return device;
+}
+
+/*
+ * VFIO Group fd, /dev/vfio/$GROUP
+ */
+static bool vfio_group_has_iommu(struct vfio_group *group)
+{
+ lockdep_assert_held(&group->group_lock);
+ /*
+ * There can only be users if there is a container, and if there is a
+ * container there must be users.
+ */
+ WARN_ON(!group->container != !group->container_users);
+
+ return group->container || group->iommufd;
+}
+
+/*
+ * VFIO_GROUP_UNSET_CONTAINER should fail if there are other users or
+ * if there was no container to unset. Since the ioctl is called on
+ * the group, we know that still exists, therefore the only valid
+ * transition here is 1->0.
+ */
+static int vfio_group_ioctl_unset_container(struct vfio_group *group)
+{
+ int ret = 0;
+
+ mutex_lock(&group->group_lock);
+ if (!vfio_group_has_iommu(group)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ if (group->container) {
+ if (group->container_users != 1) {
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ vfio_group_detach_container(group);
+ }
+ if (group->iommufd) {
+ iommufd_ctx_put(group->iommufd);
+ group->iommufd = NULL;
+ }
+
+out_unlock:
+ mutex_unlock(&group->group_lock);
+ return ret;
+}
+
+static int vfio_group_ioctl_set_container(struct vfio_group *group,
+ int __user *arg)
+{
+ struct vfio_container *container;
+ struct iommufd_ctx *iommufd;
+ struct fd f;
+ int ret;
+ int fd;
+
+ if (get_user(fd, arg))
+ return -EFAULT;
+
+ f = fdget(fd);
+ if (!f.file)
+ return -EBADF;
+
+ mutex_lock(&group->group_lock);
+ if (vfio_group_has_iommu(group)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ if (!group->iommu_group) {
+ ret = -ENODEV;
+ goto out_unlock;
+ }
+
+ container = vfio_container_from_file(f.file);
+ if (container) {
+ ret = vfio_container_attach_group(container, group);
+ goto out_unlock;
+ }
+
+ iommufd = iommufd_ctx_from_file(f.file);
+ if (!IS_ERR(iommufd)) {
+ u32 ioas_id;
+
+ ret = iommufd_vfio_compat_ioas_id(iommufd, &ioas_id);
+ if (ret) {
+ iommufd_ctx_put(group->iommufd);
+ goto out_unlock;
+ }
+
+ group->iommufd = iommufd;
+ goto out_unlock;
+ }
+
+ /* The FD passed is not recognized. */
+ ret = -EBADFD;
+
+out_unlock:
+ mutex_unlock(&group->group_lock);
+ fdput(f);
+ return ret;
+}
+
+static int vfio_group_ioctl_get_device_fd(struct vfio_group *group,
+ char __user *arg)
+{
+ struct vfio_device *device;
+ struct file *filep;
+ char *buf;
+ int fdno;
+ int ret;
+
+ buf = strndup_user(arg, PAGE_SIZE);
+ if (IS_ERR(buf))
+ return PTR_ERR(buf);
+
+ device = vfio_device_get_from_name(group, buf);
+ kfree(buf);
+ if (IS_ERR(device))
+ return PTR_ERR(device);
+
+ fdno = get_unused_fd_flags(O_CLOEXEC);
+ if (fdno < 0) {
+ ret = fdno;
+ goto err_put_device;
+ }
+
+ filep = vfio_device_open_file(device);
+ if (IS_ERR(filep)) {
+ ret = PTR_ERR(filep);
+ goto err_put_fdno;
+ }
+
+ if (group->type == VFIO_NO_IOMMU)
+ dev_warn(device->dev, "vfio-noiommu device opened by user "
+ "(%s:%d)\n", current->comm, task_pid_nr(current));
+
+ fd_install(fdno, filep);
+ return fdno;
+
+err_put_fdno:
+ put_unused_fd(fdno);
+err_put_device:
+ vfio_device_put_registration(device);
+ return ret;
+}
+
+static int vfio_group_ioctl_get_status(struct vfio_group *group,
+ struct vfio_group_status __user *arg)
+{
+ unsigned long minsz = offsetofend(struct vfio_group_status, flags);
+ struct vfio_group_status status;
+
+ if (copy_from_user(&status, arg, minsz))
+ return -EFAULT;
+
+ if (status.argsz < minsz)
+ return -EINVAL;
+
+ status.flags = 0;
+
+ mutex_lock(&group->group_lock);
+ if (!group->iommu_group) {
+ mutex_unlock(&group->group_lock);
+ return -ENODEV;
+ }
+
+ /*
+ * With the container FD the iommu_group_claim_dma_owner() is done
+ * during SET_CONTAINER but for IOMMFD this is done during
+ * VFIO_GROUP_GET_DEVICE_FD. Meaning that with iommufd
+ * VFIO_GROUP_FLAGS_VIABLE could be set but GET_DEVICE_FD will fail due
+ * to viability.
+ */
+ if (vfio_group_has_iommu(group))
+ status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
+ VFIO_GROUP_FLAGS_VIABLE;
+ else if (!iommu_group_dma_owner_claimed(group->iommu_group))
+ status.flags |= VFIO_GROUP_FLAGS_VIABLE;
+ mutex_unlock(&group->group_lock);
+
+ if (copy_to_user(arg, &status, minsz))
+ return -EFAULT;
+ return 0;
+}
+
+static long vfio_group_fops_unl_ioctl(struct file *filep,
+ unsigned int cmd, unsigned long arg)
+{
+ struct vfio_group *group = filep->private_data;
+ void __user *uarg = (void __user *)arg;
+
+ switch (cmd) {
+ case VFIO_GROUP_GET_DEVICE_FD:
+ return vfio_group_ioctl_get_device_fd(group, uarg);
+ case VFIO_GROUP_GET_STATUS:
+ return vfio_group_ioctl_get_status(group, uarg);
+ case VFIO_GROUP_SET_CONTAINER:
+ return vfio_group_ioctl_set_container(group, uarg);
+ case VFIO_GROUP_UNSET_CONTAINER:
+ return vfio_group_ioctl_unset_container(group);
+ default:
+ return -ENOTTY;
+ }
+}
+
+static int vfio_group_fops_open(struct inode *inode, struct file *filep)
+{
+ struct vfio_group *group =
+ container_of(inode->i_cdev, struct vfio_group, cdev);
+ int ret;
+
+ mutex_lock(&group->group_lock);
+
+ /*
+ * drivers can be zero if this races with vfio_device_remove_group(), it
+ * will be stable at 0 under the group rwsem
+ */
+ if (refcount_read(&group->drivers) == 0) {
+ ret = -ENODEV;
+ goto out_unlock;
+ }
+
+ if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO)) {
+ ret = -EPERM;
+ goto out_unlock;
+ }
+
+ /*
+ * Do we need multiple instances of the group open? Seems not.
+ */
+ if (group->opened_file) {
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ group->opened_file = filep;
+ filep->private_data = group;
+ ret = 0;
+out_unlock:
+ mutex_unlock(&group->group_lock);
+ return ret;
+}
+
+static int vfio_group_fops_release(struct inode *inode, struct file *filep)
+{
+ struct vfio_group *group = filep->private_data;
+
+ filep->private_data = NULL;
+
+ mutex_lock(&group->group_lock);
+ /*
+ * Device FDs hold a group file reference, therefore the group release
+ * is only called when there are no open devices.
+ */
+ WARN_ON(group->notifier.head);
+ if (group->container)
+ vfio_group_detach_container(group);
+ if (group->iommufd) {
+ iommufd_ctx_put(group->iommufd);
+ group->iommufd = NULL;
+ }
+ group->opened_file = NULL;
+ mutex_unlock(&group->group_lock);
+ return 0;
+}
+
+static const struct file_operations vfio_group_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = vfio_group_fops_unl_ioctl,
+ .compat_ioctl = compat_ptr_ioctl,
+ .open = vfio_group_fops_open,
+ .release = vfio_group_fops_release,
+};
+
+/*
+ * Group objects - create, release, get, put, search
+ */
+static struct vfio_group *
+vfio_group_find_from_iommu(struct iommu_group *iommu_group)
+{
+ struct vfio_group *group;
+
+ lockdep_assert_held(&vfio.group_lock);
+
+ /*
+ * group->iommu_group from the vfio.group_list cannot be NULL
+ * under the vfio.group_lock.
+ */
+ list_for_each_entry(group, &vfio.group_list, vfio_next) {
+ if (group->iommu_group == iommu_group)
+ return group;
+ }
+ return NULL;
+}
+
+static void vfio_group_release(struct device *dev)
+{
+ struct vfio_group *group = container_of(dev, struct vfio_group, dev);
+
+ mutex_destroy(&group->device_lock);
+ mutex_destroy(&group->group_lock);
+ WARN_ON(group->iommu_group);
+ ida_free(&vfio.group_ida, MINOR(group->dev.devt));
+ kfree(group);
+}
+
+static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
+ enum vfio_group_type type)
+{
+ struct vfio_group *group;
+ int minor;
+
+ group = kzalloc(sizeof(*group), GFP_KERNEL);
+ if (!group)
+ return ERR_PTR(-ENOMEM);
+
+ minor = ida_alloc_max(&vfio.group_ida, MINORMASK, GFP_KERNEL);
+ if (minor < 0) {
+ kfree(group);
+ return ERR_PTR(minor);
+ }
+
+ device_initialize(&group->dev);
+ group->dev.devt = MKDEV(MAJOR(vfio.group_devt), minor);
+ group->dev.class = vfio.class;
+ group->dev.release = vfio_group_release;
+ cdev_init(&group->cdev, &vfio_group_fops);
+ group->cdev.owner = THIS_MODULE;
+
+ refcount_set(&group->drivers, 1);
+ mutex_init(&group->group_lock);
+ INIT_LIST_HEAD(&group->device_list);
+ mutex_init(&group->device_lock);
+ group->iommu_group = iommu_group;
+ /* put in vfio_group_release() */
+ iommu_group_ref_get(iommu_group);
+ group->type = type;
+ BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
+
+ return group;
+}
+
+static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
+ enum vfio_group_type type)
+{
+ struct vfio_group *group;
+ struct vfio_group *ret;
+ int err;
+
+ lockdep_assert_held(&vfio.group_lock);
+
+ group = vfio_group_alloc(iommu_group, type);
+ if (IS_ERR(group))
+ return group;
+
+ err = dev_set_name(&group->dev, "%s%d",
+ group->type == VFIO_NO_IOMMU ? "noiommu-" : "",
+ iommu_group_id(iommu_group));
+ if (err) {
+ ret = ERR_PTR(err);
+ goto err_put;
+ }
+
+ err = cdev_device_add(&group->cdev, &group->dev);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto err_put;
+ }
+
+ list_add(&group->vfio_next, &vfio.group_list);
+
+ return group;
+
+err_put:
+ put_device(&group->dev);
+ return ret;
+}
+
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
+ enum vfio_group_type type)
+{
+ struct iommu_group *iommu_group;
+ struct vfio_group *group;
+ int ret;
+
+ iommu_group = iommu_group_alloc();
+ if (IS_ERR(iommu_group))
+ return ERR_CAST(iommu_group);
+
+ ret = iommu_group_set_name(iommu_group, "vfio-noiommu");
+ if (ret)
+ goto out_put_group;
+ ret = iommu_group_add_device(iommu_group, dev);
+ if (ret)
+ goto out_put_group;
+
+ mutex_lock(&vfio.group_lock);
+ group = vfio_create_group(iommu_group, type);
+ mutex_unlock(&vfio.group_lock);
+ if (IS_ERR(group)) {
+ ret = PTR_ERR(group);
+ goto out_remove_device;
+ }
+ iommu_group_put(iommu_group);
+ return group;
+
+out_remove_device:
+ iommu_group_remove_device(dev);
+out_put_group:
+ iommu_group_put(iommu_group);
+ return ERR_PTR(ret);
+}
+
+static bool vfio_group_has_device(struct vfio_group *group, struct device *dev)
+{
+ struct vfio_device *device;
+
+ mutex_lock(&group->device_lock);
+ list_for_each_entry(device, &group->device_list, group_next) {
+ if (device->dev == dev) {
+ mutex_unlock(&group->device_lock);
+ return true;
+ }
+ }
+ mutex_unlock(&group->device_lock);
+ return false;
+}
+
+static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
+{
+ struct iommu_group *iommu_group;
+ struct vfio_group *group;
+
+ iommu_group = iommu_group_get(dev);
+ if (!iommu_group && vfio_noiommu) {
+ /*
+ * With noiommu enabled, create an IOMMU group for devices that
+ * don't already have one, implying no IOMMU hardware/driver
+ * exists. Taint the kernel because we're about to give a DMA
+ * capable device to a user without IOMMU protection.
+ */
+ group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
+ if (!IS_ERR(group)) {
+ add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+ dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
+ }
+ return group;
+ }
+
+ if (!iommu_group)
+ return ERR_PTR(-EINVAL);
+
+ /*
+ * VFIO always sets IOMMU_CACHE because we offer no way for userspace to
+ * restore cache coherency. It has to be checked here because it is only
+ * valid for cases where we are using iommu groups.
+ */
+ if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY)) {
+ iommu_group_put(iommu_group);
+ return ERR_PTR(-EINVAL);
+ }
+
+ mutex_lock(&vfio.group_lock);
+ group = vfio_group_find_from_iommu(iommu_group);
+ if (group) {
+ if (WARN_ON(vfio_group_has_device(group, dev)))
+ group = ERR_PTR(-EINVAL);
+ else
+ refcount_inc(&group->drivers);
+ } else {
+ group = vfio_create_group(iommu_group, VFIO_IOMMU);
+ }
+ mutex_unlock(&vfio.group_lock);
+
+ /* The vfio_group holds a reference to the iommu_group */
+ iommu_group_put(iommu_group);
+ return group;
+}
+
+int vfio_device_set_group(struct vfio_device *device,
+ enum vfio_group_type type)
+{
+ struct vfio_group *group;
+
+ if (type == VFIO_IOMMU)
+ group = vfio_group_find_or_alloc(device->dev);
+ else
+ group = vfio_noiommu_group_alloc(device->dev, type);
+
+ /*
+ * In all cases group is the output of one of the group allocation
+ * functions and we have group->drivers incremented for us.
+ */
+ if (IS_ERR(group))
+ return PTR_ERR(group);
+
+ /* Our reference on group is moved to the device */
+ device->group = group;
+ return 0;
+}
+
+void vfio_device_remove_group(struct vfio_device *device)
+{
+ struct vfio_group *group = device->group;
+ struct iommu_group *iommu_group;
+
+ if (group->type == VFIO_NO_IOMMU || group->type == VFIO_EMULATED_IOMMU)
+ iommu_group_remove_device(device->dev);
+
+ /* Pairs with vfio_create_group() / vfio_group_get_from_iommu() */
+ if (!refcount_dec_and_mutex_lock(&group->drivers, &vfio.group_lock))
+ return;
+ list_del(&group->vfio_next);
+
+ /*
+ * We could concurrently probe another driver in the group that might
+ * race vfio_device_remove_group() with vfio_get_group(), so we have to
+ * ensure that the sysfs is all cleaned up under lock otherwise the
+ * cdev_device_add() will fail due to the name aready existing.
+ */
+ cdev_device_del(&group->cdev, &group->dev);
+
+ mutex_lock(&group->group_lock);
+ /*
+ * These data structures all have paired operations that can only be
+ * undone when the caller holds a live reference on the device. Since
+ * all pairs must be undone these WARN_ON's indicate some caller did not
+ * properly hold the group reference.
+ */
+ WARN_ON(!list_empty(&group->device_list));
+ WARN_ON(group->notifier.head);
+
+ /*
+ * Revoke all users of group->iommu_group. At this point we know there
+ * are no devices active because we are unplugging the last one. Setting
+ * iommu_group to NULL blocks all new users.
+ */
+ if (group->container)
+ vfio_group_detach_container(group);
+ iommu_group = group->iommu_group;
+ group->iommu_group = NULL;
+ mutex_unlock(&group->group_lock);
+ mutex_unlock(&vfio.group_lock);
+
+ iommu_group_put(iommu_group);
+ put_device(&group->dev);
+}
+
+void vfio_device_group_register(struct vfio_device *device)
+{
+ mutex_lock(&device->group->device_lock);
+ list_add(&device->group_next, &device->group->device_list);
+ mutex_unlock(&device->group->device_lock);
+}
+
+void vfio_device_group_unregister(struct vfio_device *device)
+{
+ mutex_lock(&device->group->device_lock);
+ list_del(&device->group_next);
+ mutex_unlock(&device->group->device_lock);
+}
+
+int vfio_device_group_use_iommu(struct vfio_device *device)
+{
+ int ret = 0;
+
+ /*
+ * Here we pass the KVM pointer with the group under the lock. If the
+ * device driver will use it, it must obtain a reference and release it
+ * during close_device.
+ */
+ mutex_lock(&device->group->group_lock);
+ if (!vfio_group_has_iommu(device->group)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ if (device->group->container) {
+ ret = vfio_group_use_container(device->group);
+ if (ret)
+ goto out_unlock;
+ vfio_device_container_register(device);
+ } else if (device->group->iommufd) {
+ ret = vfio_iommufd_bind(device, device->group->iommufd);
+ }
+
+out_unlock:
+ mutex_unlock(&device->group->group_lock);
+ return ret;
+}
+
+void vfio_device_group_unuse_iommu(struct vfio_device *device)
+{
+ mutex_lock(&device->group->group_lock);
+ if (device->group->container) {
+ vfio_device_container_unregister(device);
+ vfio_group_unuse_container(device->group);
+ } else if (device->group->iommufd) {
+ vfio_iommufd_unbind(device);
+ }
+ mutex_unlock(&device->group->group_lock);
+}
+
+struct kvm *vfio_device_get_group_kvm(struct vfio_device *device)
+{
+ struct vfio_group *group = device->group;
+
+ mutex_lock(&group->group_lock);
+ if (!group->kvm) {
+ mutex_unlock(&group->group_lock);
+ return NULL;
+ }
+ /* group_lock is released in the vfio_device_put_group_kvm() */
+ return group->kvm;
+}
+
+void vfio_device_put_group_kvm(struct vfio_device *device)
+{
+ mutex_unlock(&device->group->group_lock);
+}
+
+bool vfio_device_has_container(struct vfio_device *device)
+{
+ return device->group->container;
+}
+
+/**
+ * vfio_file_iommu_group - Return the struct iommu_group for the vfio group file
+ * @file: VFIO group file
+ *
+ * The returned iommu_group is valid as long as a ref is held on the file. This
+ * returns a reference on the group. This function is deprecated, only the SPAPR
+ * path in kvm should call it.
+ */
+struct iommu_group *vfio_file_iommu_group(struct file *file)
+{
+ struct vfio_group *group = file->private_data;
+ struct iommu_group *iommu_group = NULL;
+
+ if (!IS_ENABLED(CONFIG_SPAPR_TCE_IOMMU))
+ return NULL;
+
+ if (!vfio_file_is_group(file))
+ return NULL;
+
+ mutex_lock(&group->group_lock);
+ if (group->iommu_group) {
+ iommu_group = group->iommu_group;
+ iommu_group_ref_get(iommu_group);
+ }
+ mutex_unlock(&group->group_lock);
+ return iommu_group;
+}
+EXPORT_SYMBOL_GPL(vfio_file_iommu_group);
+
+/**
+ * vfio_file_is_group - True if the file is usable with VFIO aPIS
+ * @file: VFIO group file
+ */
+bool vfio_file_is_group(struct file *file)
+{
+ return file->f_op == &vfio_group_fops;
+}
+EXPORT_SYMBOL_GPL(vfio_file_is_group);
+
+/**
+ * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file
+ * is always CPU cache coherent
+ * @file: VFIO group file
+ *
+ * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop
+ * bit in DMA transactions. A return of false indicates that the user has
+ * rights to access additional instructions such as wbinvd on x86.
+ */
+bool vfio_file_enforced_coherent(struct file *file)
+{
+ struct vfio_group *group = file->private_data;
+ struct vfio_device *device;
+ bool ret = true;
+
+ if (!vfio_file_is_group(file))
+ return true;
+
+ /*
+ * If the device does not have IOMMU_CAP_ENFORCE_CACHE_COHERENCY then
+ * any domain later attached to it will also not support it. If the cap
+ * is set then the iommu_domain eventually attached to the device/group
+ * must use a domain with enforce_cache_coherency().
+ */
+ mutex_lock(&group->device_lock);
+ list_for_each_entry(device, &group->device_list, group_next) {
+ if (!device_iommu_capable(device->dev,
+ IOMMU_CAP_ENFORCE_CACHE_COHERENCY)) {
+ ret = false;
+ break;
+ }
+ }
+ mutex_unlock(&group->device_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
+
+/**
+ * vfio_file_set_kvm - Link a kvm with VFIO drivers
+ * @file: VFIO group file
+ * @kvm: KVM to link
+ *
+ * When a VFIO device is first opened the KVM will be available in
+ * device->kvm if one was associated with the group.
+ */
+void vfio_file_set_kvm(struct file *file, struct kvm *kvm)
+{
+ struct vfio_group *group = file->private_data;
+
+ if (!vfio_file_is_group(file))
+ return;
+
+ mutex_lock(&group->group_lock);
+ group->kvm = kvm;
+ mutex_unlock(&group->group_lock);
+}
+EXPORT_SYMBOL_GPL(vfio_file_set_kvm);
+
+/**
+ * vfio_file_has_dev - True if the VFIO file is a handle for device
+ * @file: VFIO file to check
+ * @device: Device that must be part of the file
+ *
+ * Returns true if given file has permission to manipulate the given device.
+ */
+bool vfio_file_has_dev(struct file *file, struct vfio_device *device)
+{
+ struct vfio_group *group = file->private_data;
+
+ if (!vfio_file_is_group(file))
+ return false;
+
+ return group == device->group;
+}
+EXPORT_SYMBOL_GPL(vfio_file_has_dev);
+
+static char *vfio_devnode(struct device *dev, umode_t *mode)
+{
+ return kasprintf(GFP_KERNEL, "vfio/%s", dev_name(dev));
+}
+
+int __init vfio_group_init(void)
+{
+ int ret;
+
+ ida_init(&vfio.group_ida);
+ mutex_init(&vfio.group_lock);
+ INIT_LIST_HEAD(&vfio.group_list);
+
+ ret = vfio_container_init();
+ if (ret)
+ return ret;
+
+ /* /dev/vfio/$GROUP */
+ vfio.class = class_create(THIS_MODULE, "vfio");
+ if (IS_ERR(vfio.class)) {
+ ret = PTR_ERR(vfio.class);
+ goto err_group_class;
+ }
+
+ vfio.class->devnode = vfio_devnode;
+
+ ret = alloc_chrdev_region(&vfio.group_devt, 0, MINORMASK + 1, "vfio");
+ if (ret)
+ goto err_alloc_chrdev;
+ return 0;
+
+err_alloc_chrdev:
+ class_destroy(vfio.class);
+ vfio.class = NULL;
+err_group_class:
+ vfio_container_cleanup();
+ return ret;
+}
+
+void vfio_group_cleanup(void)
+{
+ WARN_ON(!list_empty(&vfio.group_list));
+ ida_destroy(&vfio.group_ida);
+ unregister_chrdev_region(vfio.group_devt, MINORMASK + 1);
+ class_destroy(vfio.class);
+ vfio.class = NULL;
+ vfio_container_cleanup();
+}
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index d7c631b43ac1..0e61c3afce5d 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -15,6 +15,10 @@ struct iommu_group;
struct vfio_device;
struct vfio_container;
+void vfio_device_put_registration(struct vfio_device *device);
+bool vfio_device_try_get_registration(struct vfio_device *device);
+struct file *vfio_device_open_file(struct vfio_device *device);
+
enum vfio_group_type {
/*
* Physical device with IOMMU backing.
@@ -66,6 +70,19 @@ struct vfio_group {
struct iommufd_ctx *iommufd;
};
+int vfio_device_set_group(struct vfio_device *device,
+ enum vfio_group_type type);
+void vfio_device_remove_group(struct vfio_device *device);
+void vfio_device_group_register(struct vfio_device *device);
+void vfio_device_group_unregister(struct vfio_device *device);
+int vfio_device_group_use_iommu(struct vfio_device *device);
+void vfio_device_group_unuse_iommu(struct vfio_device *device);
+struct kvm *vfio_device_get_group_kvm(struct vfio_device *device);
+void vfio_device_put_group_kvm(struct vfio_device *device);
+bool vfio_device_has_container(struct vfio_device *device);
+int __init vfio_group_init(void);
+void vfio_group_cleanup(void);
+
#if IS_ENABLED(CONFIG_VFIO_CONTAINER)
/* events for the backend driver notify callback */
enum vfio_iommu_notify_type {
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index b6353b054899..52cd5cabdae5 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -43,11 +43,6 @@
#define DRIVER_DESC "VFIO - User Level meta-driver"
static struct vfio {
- struct class *class;
- struct list_head group_list;
- struct mutex group_lock; /* locks group_list */
- struct ida group_ida;
- dev_t group_devt;
struct class *device_class;
struct ida device_ida;
} vfio;
@@ -56,7 +51,6 @@ bool vfio_allow_unsafe_interrupts;
EXPORT_SYMBOL_GPL(vfio_allow_unsafe_interrupts);
static DEFINE_XARRAY(vfio_device_set_xa);
-static const struct file_operations vfio_group_fops;
int vfio_assign_device_set(struct vfio_device *device, void *set_id)
{
@@ -142,168 +136,17 @@ unsigned int vfio_device_set_open_count(struct vfio_device_set *dev_set)
}
EXPORT_SYMBOL_GPL(vfio_device_set_open_count);
-/*
- * Group objects - create, release, get, put, search
- */
-static struct vfio_group *
-vfio_group_find_from_iommu(struct iommu_group *iommu_group)
-{
- struct vfio_group *group;
-
- lockdep_assert_held(&vfio.group_lock);
-
- /*
- * group->iommu_group from the vfio.group_list cannot be NULL
- * under the vfio.group_lock.
- */
- list_for_each_entry(group, &vfio.group_list, vfio_next) {
- if (group->iommu_group == iommu_group)
- return group;
- }
- return NULL;
-}
-
-static void vfio_group_release(struct device *dev)
-{
- struct vfio_group *group = container_of(dev, struct vfio_group, dev);
-
- mutex_destroy(&group->device_lock);
- mutex_destroy(&group->group_lock);
- WARN_ON(group->iommu_group);
- ida_free(&vfio.group_ida, MINOR(group->dev.devt));
- kfree(group);
-}
-
-static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
- enum vfio_group_type type)
-{
- struct vfio_group *group;
- int minor;
-
- group = kzalloc(sizeof(*group), GFP_KERNEL);
- if (!group)
- return ERR_PTR(-ENOMEM);
-
- minor = ida_alloc_max(&vfio.group_ida, MINORMASK, GFP_KERNEL);
- if (minor < 0) {
- kfree(group);
- return ERR_PTR(minor);
- }
-
- device_initialize(&group->dev);
- group->dev.devt = MKDEV(MAJOR(vfio.group_devt), minor);
- group->dev.class = vfio.class;
- group->dev.release = vfio_group_release;
- cdev_init(&group->cdev, &vfio_group_fops);
- group->cdev.owner = THIS_MODULE;
-
- refcount_set(&group->drivers, 1);
- mutex_init(&group->group_lock);
- INIT_LIST_HEAD(&group->device_list);
- mutex_init(&group->device_lock);
- group->iommu_group = iommu_group;
- /* put in vfio_group_release() */
- iommu_group_ref_get(iommu_group);
- group->type = type;
- BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
-
- return group;
-}
-
-static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
- enum vfio_group_type type)
-{
- struct vfio_group *group;
- struct vfio_group *ret;
- int err;
-
- lockdep_assert_held(&vfio.group_lock);
-
- group = vfio_group_alloc(iommu_group, type);
- if (IS_ERR(group))
- return group;
-
- err = dev_set_name(&group->dev, "%s%d",
- group->type == VFIO_NO_IOMMU ? "noiommu-" : "",
- iommu_group_id(iommu_group));
- if (err) {
- ret = ERR_PTR(err);
- goto err_put;
- }
-
- err = cdev_device_add(&group->cdev, &group->dev);
- if (err) {
- ret = ERR_PTR(err);
- goto err_put;
- }
-
- list_add(&group->vfio_next, &vfio.group_list);
-
- return group;
-
-err_put:
- put_device(&group->dev);
- return ret;
-}
-
-static void vfio_device_remove_group(struct vfio_device *device)
-{
- struct vfio_group *group = device->group;
- struct iommu_group *iommu_group;
-
- if (group->type == VFIO_NO_IOMMU || group->type == VFIO_EMULATED_IOMMU)
- iommu_group_remove_device(device->dev);
-
- /* Pairs with vfio_create_group() / vfio_group_get_from_iommu() */
- if (!refcount_dec_and_mutex_lock(&group->drivers, &vfio.group_lock))
- return;
- list_del(&group->vfio_next);
-
- /*
- * We could concurrently probe another driver in the group that might
- * race vfio_device_remove_group() with vfio_get_group(), so we have to
- * ensure that the sysfs is all cleaned up under lock otherwise the
- * cdev_device_add() will fail due to the name aready existing.
- */
- cdev_device_del(&group->cdev, &group->dev);
-
- mutex_lock(&group->group_lock);
- /*
- * These data structures all have paired operations that can only be
- * undone when the caller holds a live reference on the device. Since
- * all pairs must be undone these WARN_ON's indicate some caller did not
- * properly hold the group reference.
- */
- WARN_ON(!list_empty(&group->device_list));
- WARN_ON(group->notifier.head);
-
- /*
- * Revoke all users of group->iommu_group. At this point we know there
- * are no devices active because we are unplugging the last one. Setting
- * iommu_group to NULL blocks all new users.
- */
- if (group->container)
- vfio_group_detach_container(group);
- iommu_group = group->iommu_group;
- group->iommu_group = NULL;
- mutex_unlock(&group->group_lock);
- mutex_unlock(&vfio.group_lock);
-
- iommu_group_put(iommu_group);
- put_device(&group->dev);
-}
-
/*
* Device objects - create, release, get, put, search
*/
/* Device reference always implies a group reference */
-static void vfio_device_put_registration(struct vfio_device *device)
+void vfio_device_put_registration(struct vfio_device *device)
{
if (refcount_dec_and_test(&device->refcount))
complete(&device->comp);
}
-static bool vfio_device_try_get_registration(struct vfio_device *device)
+bool vfio_device_try_get_registration(struct vfio_device *device)
{
return refcount_inc_not_zero(&device->refcount);
}
@@ -402,143 +245,6 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev,
return ret;
}
-static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
- enum vfio_group_type type)
-{
- struct iommu_group *iommu_group;
- struct vfio_group *group;
- int ret;
-
- iommu_group = iommu_group_alloc();
- if (IS_ERR(iommu_group))
- return ERR_CAST(iommu_group);
-
- ret = iommu_group_set_name(iommu_group, "vfio-noiommu");
- if (ret)
- goto out_put_group;
- ret = iommu_group_add_device(iommu_group, dev);
- if (ret)
- goto out_put_group;
-
- mutex_lock(&vfio.group_lock);
- group = vfio_create_group(iommu_group, type);
- mutex_unlock(&vfio.group_lock);
- if (IS_ERR(group)) {
- ret = PTR_ERR(group);
- goto out_remove_device;
- }
- iommu_group_put(iommu_group);
- return group;
-
-out_remove_device:
- iommu_group_remove_device(dev);
-out_put_group:
- iommu_group_put(iommu_group);
- return ERR_PTR(ret);
-}
-
-static bool vfio_group_has_device(struct vfio_group *group, struct device *dev)
-{
- struct vfio_device *device;
-
- mutex_lock(&group->device_lock);
- list_for_each_entry(device, &group->device_list, group_next) {
- if (device->dev == dev) {
- mutex_unlock(&group->device_lock);
- return true;
- }
- }
- mutex_unlock(&group->device_lock);
- return false;
-}
-
-static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
-{
- struct iommu_group *iommu_group;
- struct vfio_group *group;
-
- iommu_group = iommu_group_get(dev);
- if (!iommu_group && vfio_noiommu) {
- /*
- * With noiommu enabled, create an IOMMU group for devices that
- * don't already have one, implying no IOMMU hardware/driver
- * exists. Taint the kernel because we're about to give a DMA
- * capable device to a user without IOMMU protection.
- */
- group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
- if (!IS_ERR(group)) {
- add_taint(TAINT_USER, LOCKDEP_STILL_OK);
- dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
- }
- return group;
- }
-
- if (!iommu_group)
- return ERR_PTR(-EINVAL);
-
- /*
- * VFIO always sets IOMMU_CACHE because we offer no way for userspace to
- * restore cache coherency. It has to be checked here because it is only
- * valid for cases where we are using iommu groups.
- */
- if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY)) {
- iommu_group_put(iommu_group);
- return ERR_PTR(-EINVAL);
- }
-
- mutex_lock(&vfio.group_lock);
- group = vfio_group_find_from_iommu(iommu_group);
- if (group) {
- if (WARN_ON(vfio_group_has_device(group, dev)))
- group = ERR_PTR(-EINVAL);
- else
- refcount_inc(&group->drivers);
- } else {
- group = vfio_create_group(iommu_group, VFIO_IOMMU);
- }
- mutex_unlock(&vfio.group_lock);
-
- /* The vfio_group holds a reference to the iommu_group */
- iommu_group_put(iommu_group);
- return group;
-}
-
-static int vfio_device_set_group(struct vfio_device *device,
- enum vfio_group_type type)
-{
- struct vfio_group *group;
-
- if (type == VFIO_IOMMU)
- group = vfio_group_find_or_alloc(device->dev);
- else
- group = vfio_noiommu_group_alloc(device->dev, type);
-
- /*
- * In all cases group is the output of one of the group allocation
- * functions and we have group->drivers incremented for us.
- */
- if (IS_ERR(group))
- return PTR_ERR(group);
-
- /* Our reference on group is moved to the device */
- device->group = group;
- return 0;
-}
-
-static void vfio_device_group_register(struct vfio_device *device)
-{
- mutex_lock(&device->group->device_lock);
- list_add(&device->group_next, &device->group->device_list);
- mutex_unlock(&device->group->device_lock);
-}
-
-static void vfio_device_group_unregister(struct vfio_device *device)
-{
- mutex_lock(&device->group->device_lock);
- list_del(&device->group_next);
- mutex_unlock(&device->group->device_lock);
-}
-
static int __vfio_register_dev(struct vfio_device *device,
enum vfio_group_type type)
{
@@ -594,35 +300,6 @@ int vfio_register_emulated_iommu_dev(struct vfio_device *device)
}
EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
-static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
- char *buf)
-{
- struct vfio_device *it, *device = ERR_PTR(-ENODEV);
-
- mutex_lock(&group->device_lock);
- list_for_each_entry(it, &group->device_list, group_next) {
- int ret;
-
- if (it->ops->match) {
- ret = it->ops->match(it, buf);
- if (ret < 0) {
- device = ERR_PTR(ret);
- break;
- }
- } else {
- ret = !strcmp(dev_name(it->dev), buf);
- }
-
- if (ret && vfio_device_try_get_registration(it)) {
- device = it;
- break;
- }
- }
- mutex_unlock(&group->device_lock);
-
- return device;
-}
-
/*
* Decrement the device reference count and wait for the device to be
* removed. Open file descriptors for the device... */
@@ -665,108 +342,6 @@ void vfio_unregister_group_dev(struct vfio_device *device)
}
EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
-/*
- * VFIO Group fd, /dev/vfio/$GROUP
- */
-static bool vfio_group_has_iommu(struct vfio_group *group)
-{
- lockdep_assert_held(&group->group_lock);
- /*
- * There can only be users if there is a container, and if there is a
- * container there must be users.
- */
- WARN_ON(!group->container != !group->container_users);
-
- return group->container || group->iommufd;
-}
-
-/*
- * VFIO_GROUP_UNSET_CONTAINER should fail if there are other users or
- * if there was no container to unset. Since the ioctl is called on
- * the group, we know that still exists, therefore the only valid
- * transition here is 1->0.
- */
-static int vfio_group_ioctl_unset_container(struct vfio_group *group)
-{
- int ret = 0;
-
- mutex_lock(&group->group_lock);
- if (!vfio_group_has_iommu(group)) {
- ret = -EINVAL;
- goto out_unlock;
- }
- if (group->container) {
- if (group->container_users != 1) {
- ret = -EBUSY;
- goto out_unlock;
- }
- vfio_group_detach_container(group);
- }
- if (group->iommufd) {
- iommufd_ctx_put(group->iommufd);
- group->iommufd = NULL;
- }
-
-out_unlock:
- mutex_unlock(&group->group_lock);
- return ret;
-}
-
-static int vfio_group_ioctl_set_container(struct vfio_group *group,
- int __user *arg)
-{
- struct vfio_container *container;
- struct iommufd_ctx *iommufd;
- struct fd f;
- int ret;
- int fd;
-
- if (get_user(fd, arg))
- return -EFAULT;
-
- f = fdget(fd);
- if (!f.file)
- return -EBADF;
-
- mutex_lock(&group->group_lock);
- if (vfio_group_has_iommu(group)) {
- ret = -EINVAL;
- goto out_unlock;
- }
- if (!group->iommu_group) {
- ret = -ENODEV;
- goto out_unlock;
- }
-
- container = vfio_container_from_file(f.file);
- if (container) {
- ret = vfio_container_attach_group(container, group);
- goto out_unlock;
- }
-
- iommufd = iommufd_ctx_from_file(f.file);
- if (!IS_ERR(iommufd)) {
- u32 ioas_id;
-
- ret = iommufd_vfio_compat_ioas_id(iommufd, &ioas_id);
- if (ret) {
- iommufd_ctx_put(group->iommufd);
- goto out_unlock;
- }
-
- group->iommufd = iommufd;
- goto out_unlock;
- }
-
- /* The FD passed is not recognized. */
- ret = -EBADFD;
-
-out_unlock:
- mutex_unlock(&group->group_lock);
- fdput(f);
- return ret;
-}
-
static const struct file_operations vfio_device_fops;
/* true if the vfio_device has open_device() called but not close_device() */
@@ -775,65 +350,6 @@ static bool vfio_assert_device_open(struct vfio_device *device)
return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
}
-static int vfio_device_group_use_iommu(struct vfio_device *device)
-{
- int ret = 0;
-
- /*
- * Here we pass the KVM pointer with the group under the lock. If the
- * device driver will use it, it must obtain a reference and release it
- * during close_device.
- */
- mutex_lock(&device->group->group_lock);
- if (!vfio_group_has_iommu(device->group)) {
- ret = -EINVAL;
- goto out_unlock;
- }
-
- if (device->group->container) {
- ret = vfio_group_use_container(device->group);
- if (ret)
- goto out_unlock;
- vfio_device_container_register(device);
- } else if (device->group->iommufd) {
- ret = vfio_iommufd_bind(device, device->group->iommufd);
- }
-
-out_unlock:
- mutex_unlock(&device->group->group_lock);
- return ret;
-}
-
-static void vfio_device_group_unuse_iommu(struct vfio_device *device)
-{
- mutex_lock(&device->group->group_lock);
- if (device->group->container) {
- vfio_device_container_unregister(device);
- vfio_group_unuse_container(device->group);
- } else if (device->group->iommufd) {
- vfio_iommufd_unbind(device);
- }
- mutex_unlock(&device->group->group_lock);
-}
-
-static struct kvm *vfio_device_get_group_kvm(struct vfio_device *device)
-{
- struct vfio_group *group = device->group;
-
- mutex_lock(&group->group_lock);
- if (!group->kvm) {
- mutex_unlock(&group->group_lock);
- return NULL;
- }
- /* group_lock is released in the vfio_device_put_group_kvm() */
- return group->kvm;
-}
-
-static void vfio_device_put_group_kvm(struct vfio_device *device)
-{
- mutex_unlock(&device->group->group_lock);
-}
-
static int vfio_device_first_open(struct vfio_device *device)
{
struct kvm *kvm;
@@ -910,7 +426,7 @@ static void vfio_device_close(struct vfio_device *device)
mutex_unlock(&device->dev_set->lock);
}
-static struct file *vfio_device_open_file(struct vfio_device *device)
+struct file *vfio_device_open_file(struct vfio_device *device)
{
struct file *filep;
int ret;
@@ -949,177 +465,6 @@ static struct file *vfio_device_open_file(struct vfio_device *device)
return ERR_PTR(ret);
}
-static int vfio_group_ioctl_get_device_fd(struct vfio_group *group,
- char __user *arg)
-{
- struct vfio_device *device;
- struct file *filep;
- char *buf;
- int fdno;
- int ret;
-
- buf = strndup_user(arg, PAGE_SIZE);
- if (IS_ERR(buf))
- return PTR_ERR(buf);
-
- device = vfio_device_get_from_name(group, buf);
- kfree(buf);
- if (IS_ERR(device))
- return PTR_ERR(device);
-
- fdno = get_unused_fd_flags(O_CLOEXEC);
- if (fdno < 0) {
- ret = fdno;
- goto err_put_device;
- }
-
- filep = vfio_device_open_file(device);
- if (IS_ERR(filep)) {
- ret = PTR_ERR(filep);
- goto err_put_fdno;
- }
-
- if (group->type == VFIO_NO_IOMMU)
- dev_warn(device->dev, "vfio-noiommu device opened by user "
- "(%s:%d)\n", current->comm, task_pid_nr(current));
-
- fd_install(fdno, filep);
- return fdno;
-
-err_put_fdno:
- put_unused_fd(fdno);
-err_put_device:
- vfio_device_put_registration(device);
- return ret;
-}
-
-static int vfio_group_ioctl_get_status(struct vfio_group *group,
- struct vfio_group_status __user *arg)
-{
- unsigned long minsz = offsetofend(struct vfio_group_status, flags);
- struct vfio_group_status status;
-
- if (copy_from_user(&status, arg, minsz))
- return -EFAULT;
-
- if (status.argsz < minsz)
- return -EINVAL;
-
- status.flags = 0;
-
- mutex_lock(&group->group_lock);
- if (!group->iommu_group) {
- mutex_unlock(&group->group_lock);
- return -ENODEV;
- }
-
- /*
- * With the container FD the iommu_group_claim_dma_owner() is done
- * during SET_CONTAINER but for IOMMFD this is done during
- * VFIO_GROUP_GET_DEVICE_FD. Meaning that with iommufd
- * VFIO_GROUP_FLAGS_VIABLE could be set but GET_DEVICE_FD will fail due
- * to viability.
- */
- if (vfio_group_has_iommu(group))
- status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
- VFIO_GROUP_FLAGS_VIABLE;
- else if (!iommu_group_dma_owner_claimed(group->iommu_group))
- status.flags |= VFIO_GROUP_FLAGS_VIABLE;
- mutex_unlock(&group->group_lock);
-
- if (copy_to_user(arg, &status, minsz))
- return -EFAULT;
- return 0;
-}
-
-static long vfio_group_fops_unl_ioctl(struct file *filep,
- unsigned int cmd, unsigned long arg)
-{
- struct vfio_group *group = filep->private_data;
- void __user *uarg = (void __user *)arg;
-
- switch (cmd) {
- case VFIO_GROUP_GET_DEVICE_FD:
- return vfio_group_ioctl_get_device_fd(group, uarg);
- case VFIO_GROUP_GET_STATUS:
- return vfio_group_ioctl_get_status(group, uarg);
- case VFIO_GROUP_SET_CONTAINER:
- return vfio_group_ioctl_set_container(group, uarg);
- case VFIO_GROUP_UNSET_CONTAINER:
- return vfio_group_ioctl_unset_container(group);
- default:
- return -ENOTTY;
- }
-}
-
-static int vfio_group_fops_open(struct inode *inode, struct file *filep)
-{
- struct vfio_group *group =
- container_of(inode->i_cdev, struct vfio_group, cdev);
- int ret;
-
- mutex_lock(&group->group_lock);
-
- /*
- * drivers can be zero if this races with vfio_device_remove_group(), it
- * will be stable at 0 under the group rwsem
- */
- if (refcount_read(&group->drivers) == 0) {
- ret = -ENODEV;
- goto out_unlock;
- }
-
- if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO)) {
- ret = -EPERM;
- goto out_unlock;
- }
-
- /*
- * Do we need multiple instances of the group open? Seems not.
- */
- if (group->opened_file) {
- ret = -EBUSY;
- goto out_unlock;
- }
- group->opened_file = filep;
- filep->private_data = group;
- ret = 0;
-out_unlock:
- mutex_unlock(&group->group_lock);
- return ret;
-}
-
-static int vfio_group_fops_release(struct inode *inode, struct file *filep)
-{
- struct vfio_group *group = filep->private_data;
-
- filep->private_data = NULL;
-
- mutex_lock(&group->group_lock);
- /*
- * Device FDs hold a group file reference, therefore the group release
- * is only called when there are no open devices.
- */
- WARN_ON(group->notifier.head);
- if (group->container)
- vfio_group_detach_container(group);
- if (group->iommufd) {
- iommufd_ctx_put(group->iommufd);
- group->iommufd = NULL;
- }
- group->opened_file = NULL;
- mutex_unlock(&group->group_lock);
- return 0;
-}
-
-static const struct file_operations vfio_group_fops = {
- .owner = THIS_MODULE,
- .unlocked_ioctl = vfio_group_fops_unl_ioctl,
- .compat_ioctl = compat_ptr_ioctl,
- .open = vfio_group_fops_open,
- .release = vfio_group_fops_release,
-};
-
/*
* Wrapper around pm_runtime_resume_and_get().
* Return error code on failure or 0 on success.
@@ -1725,121 +1070,6 @@ static const struct file_operations vfio_device_fops = {
.mmap = vfio_device_fops_mmap,
};
-/**
- * vfio_file_iommu_group - Return the struct iommu_group for the vfio group file
- * @file: VFIO group file
- *
- * The returned iommu_group is valid as long as a ref is held on the file. This
- * returns a reference on the group. This function is deprecated, only the SPAPR
- * path in kvm should call it.
- */
-struct iommu_group *vfio_file_iommu_group(struct file *file)
-{
- struct vfio_group *group = file->private_data;
- struct iommu_group *iommu_group = NULL;
-
- if (!IS_ENABLED(CONFIG_SPAPR_TCE_IOMMU))
- return NULL;
-
- if (!vfio_file_is_group(file))
- return NULL;
-
- mutex_lock(&group->group_lock);
- if (group->iommu_group) {
- iommu_group = group->iommu_group;
- iommu_group_ref_get(iommu_group);
- }
- mutex_unlock(&group->group_lock);
- return iommu_group;
-}
-EXPORT_SYMBOL_GPL(vfio_file_iommu_group);
-
-/**
- * vfio_file_is_group - True if the file is usable with VFIO aPIS
- * @file: VFIO group file
- */
-bool vfio_file_is_group(struct file *file)
-{
- return file->f_op == &vfio_group_fops;
-}
-EXPORT_SYMBOL_GPL(vfio_file_is_group);
-
-/**
- * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file
- * is always CPU cache coherent
- * @file: VFIO group file
- *
- * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop
- * bit in DMA transactions. A return of false indicates that the user has
- * rights to access additional instructions such as wbinvd on x86.
- */
-bool vfio_file_enforced_coherent(struct file *file)
-{
- struct vfio_group *group = file->private_data;
- struct vfio_device *device;
- bool ret = true;
-
- if (!vfio_file_is_group(file))
- return true;
-
- /*
- * If the device does not have IOMMU_CAP_ENFORCE_CACHE_COHERENCY then
- * any domain later attached to it will also not support it. If the cap
- * is set then the iommu_domain eventually attached to the device/group
- * must use a domain with enforce_cache_coherency().
- */
- mutex_lock(&group->device_lock);
- list_for_each_entry(device, &group->device_list, group_next) {
- if (!device_iommu_capable(device->dev,
- IOMMU_CAP_ENFORCE_CACHE_COHERENCY)) {
- ret = false;
- break;
- }
- }
- mutex_unlock(&group->device_lock);
- return ret;
-}
-EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
-
-/**
- * vfio_file_set_kvm - Link a kvm with VFIO drivers
- * @file: VFIO group file
- * @kvm: KVM to link
- *
- * When a VFIO device is first opened the KVM will be available in
- * device->kvm if one was associated with the group.
- */
-void vfio_file_set_kvm(struct file *file, struct kvm *kvm)
-{
- struct vfio_group *group = file->private_data;
-
- if (!vfio_file_is_group(file))
- return;
-
- mutex_lock(&group->group_lock);
- group->kvm = kvm;
- mutex_unlock(&group->group_lock);
-}
-EXPORT_SYMBOL_GPL(vfio_file_set_kvm);
-
-/**
- * vfio_file_has_dev - True if the VFIO file is a handle for device
- * @file: VFIO file to check
- * @device: Device that must be part of the file
- *
- * Returns true if given file has permission to manipulate the given device.
- */
-bool vfio_file_has_dev(struct file *file, struct vfio_device *device)
-{
- struct vfio_group *group = file->private_data;
-
- if (!vfio_file_is_group(file))
- return false;
-
- return group == device->group;
-}
-EXPORT_SYMBOL_GPL(vfio_file_has_dev);
-
/*
* Sub-module support
*/
@@ -1959,11 +1189,6 @@ int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr, int num_irqs,
}
EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
-static bool vfio_device_has_container(struct vfio_device *device)
-{
- return device->group->container;
-}
-
/*
* Pin contiguous user pages and return their associated host pages for local
* domain only.
@@ -2085,55 +1310,6 @@ EXPORT_SYMBOL(vfio_dma_rw);
/*
* Module/class support
*/
-static char *vfio_devnode(struct device *dev, umode_t *mode)
-{
- return kasprintf(GFP_KERNEL, "vfio/%s", dev_name(dev));
-}
-
-static int __init vfio_group_init(void)
-{
- int ret;
-
- ida_init(&vfio.group_ida);
- mutex_init(&vfio.group_lock);
- INIT_LIST_HEAD(&vfio.group_list);
-
- ret = vfio_container_init();
- if (ret)
- return ret;
-
- /* /dev/vfio/$GROUP */
- vfio.class = class_create(THIS_MODULE, "vfio");
- if (IS_ERR(vfio.class)) {
- ret = PTR_ERR(vfio.class);
- goto err_group_class;
- }
-
- vfio.class->devnode = vfio_devnode;
-
- ret = alloc_chrdev_region(&vfio.group_devt, 0, MINORMASK + 1, "vfio");
- if (ret)
- goto err_alloc_chrdev;
- return 0;
-
-err_alloc_chrdev:
- class_destroy(vfio.class);
- vfio.class = NULL;
-err_group_class:
- vfio_container_cleanup();
- return ret;
-}
-
-static void vfio_group_cleanup(void)
-{
- WARN_ON(!list_empty(&vfio.group_list));
- ida_destroy(&vfio.group_ida);
- unregister_chrdev_region(vfio.group_devt, MINORMASK + 1);
- class_destroy(vfio.class);
- vfio.class = NULL;
- vfio_container_cleanup();
-}
-
static int __init vfio_init(void)
{
int ret;
--
2.34.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-24 12:26 ` [RFC v2 03/11] vfio: Set device->group in helper function Yi Liu
@ 2022-11-24 13:17 ` Jason Gunthorpe
2022-11-24 13:50 ` Yi Liu
2022-11-28 8:08 ` Tian, Kevin
1 sibling, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2022-11-24 13:17 UTC (permalink / raw)
To: Yi Liu
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On Thu, Nov 24, 2022 at 04:26:54AM -0800, Yi Liu wrote:
> This avoids referencing device->group in __vfio_register_dev()
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
> drivers/vfio/vfio_main.c | 52 +++++++++++++++++++++++++---------------
> 1 file changed, 33 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index 7a78256a650e..4980b8acf5d3 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -503,10 +503,15 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
> return group;
> }
>
> -static int __vfio_register_dev(struct vfio_device *device,
> - struct vfio_group *group)
> +static int vfio_device_set_group(struct vfio_device *device,
> + enum vfio_group_type type)
> {
> - int ret;
> + struct vfio_group *group;
> +
> + if (type == VFIO_IOMMU)
> + group = vfio_group_find_or_alloc(device->dev);
> + else
> + group = vfio_noiommu_group_alloc(device->dev, type);
>
> /*
> * In all cases group is the output of one of the group allocation
This comment should be deleted
> @@ -515,6 +520,16 @@ static int __vfio_register_dev(struct vfio_device *device,
> if (IS_ERR(group))
> return PTR_ERR(group);
>
> + /* Our reference on group is moved to the device */
> + device->group = group;
> + return 0;
> +}
> +
> +static int __vfio_register_dev(struct vfio_device *device,
> + enum vfio_group_type type)
> +{
> + int ret;
> +
> if (WARN_ON(device->ops->bind_iommufd &&
> (!device->ops->unbind_iommufd ||
> !device->ops->attach_ioas)))
> @@ -527,34 +542,33 @@ static int __vfio_register_dev(struct vfio_device *device,
> if (!device->dev_set)
> vfio_assign_device_set(device, device);
>
> - /* Our reference on group is moved to the device */
> - device->group = group;
> -
> ret = dev_set_name(&device->device, "vfio%d", device->index);
> if (ret)
> - goto err_out;
> + return ret;
>
> - ret = device_add(&device->device);
> + ret = vfio_device_set_group(device, type);
> if (ret)
> - goto err_out;
> + return ret;
> +
> + ret = device_add(&device->device);
> + if (ret) {
> + vfio_device_remove_group(device);
> + return ret;
You could probably keep the goto
Jason
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-24 13:17 ` Jason Gunthorpe
@ 2022-11-24 13:50 ` Yi Liu
0 siblings, 0 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-24 13:50 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On 2022/11/24 21:17, Jason Gunthorpe wrote:
> On Thu, Nov 24, 2022 at 04:26:54AM -0800, Yi Liu wrote:
>> This avoids referencing device->group in __vfio_register_dev()
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> ---
>> drivers/vfio/vfio_main.c | 52 +++++++++++++++++++++++++---------------
>> 1 file changed, 33 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
>> index 7a78256a650e..4980b8acf5d3 100644
>> --- a/drivers/vfio/vfio_main.c
>> +++ b/drivers/vfio/vfio_main.c
>> @@ -503,10 +503,15 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
>> return group;
>> }
>>
>> -static int __vfio_register_dev(struct vfio_device *device,
>> - struct vfio_group *group)
>> +static int vfio_device_set_group(struct vfio_device *device,
>> + enum vfio_group_type type)
>> {
>> - int ret;
>> + struct vfio_group *group;
>> +
>> + if (type == VFIO_IOMMU)
>> + group = vfio_group_find_or_alloc(device->dev);
>> + else
>> + group = vfio_noiommu_group_alloc(device->dev, type);
>>
>> /*
>> * In all cases group is the output of one of the group allocation
>
> This comment should be deleted
ok.
>> @@ -515,6 +520,16 @@ static int __vfio_register_dev(struct vfio_device *device,
>> if (IS_ERR(group))
>> return PTR_ERR(group);
>>
>> + /* Our reference on group is moved to the device */
>> + device->group = group;
>> + return 0;
>> +}
>> +
>> +static int __vfio_register_dev(struct vfio_device *device,
>> + enum vfio_group_type type)
>> +{
>> + int ret;
>> +
>> if (WARN_ON(device->ops->bind_iommufd &&
>> (!device->ops->unbind_iommufd ||
>> !device->ops->attach_ioas)))
>> @@ -527,34 +542,33 @@ static int __vfio_register_dev(struct vfio_device *device,
>> if (!device->dev_set)
>> vfio_assign_device_set(device, device);
>>
>> - /* Our reference on group is moved to the device */
>> - device->group = group;
>> -
>> ret = dev_set_name(&device->device, "vfio%d", device->index);
>> if (ret)
>> - goto err_out;
>> + return ret;
>>
>> - ret = device_add(&device->device);
>> + ret = vfio_device_set_group(device, type);
>> if (ret)
>> - goto err_out;
>> + return ret;
>> +
>> + ret = device_add(&device->device);
>> + if (ret) {
>> + vfio_device_remove_group(device);
>> + return ret;
>
> You could probably keep the goto
after the change, only this branch will use the err_out; may be not
necessary to use goto. but keep goto may have less changes in patch
file. e.g. I'm ok to keep goto.
@@ -527,12 +542,13 @@ static int __vfio_register_dev(struct vfio_device
*device,
if (!device->dev_set)
vfio_assign_device_set(device, device);
- /* Our reference on group is moved to the device */
- device->group = group;
-
ret = dev_set_name(&device->device, "vfio%d", device->index);
if (ret)
- goto err_out;
+ return ret;
+
+ ret = vfio_device_set_group(device, type);
+ if (ret)
+ return ret;
ret = device_add(&device->device);
if (ret)
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 11/11] vfio: Move vfio group specific code into group.c
2022-11-24 12:27 ` [RFC v2 11/11] vfio: Move vfio group specific code into group.c Yi Liu
@ 2022-11-24 14:36 ` Jason Gunthorpe
2022-11-25 8:45 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2022-11-24 14:36 UTC (permalink / raw)
To: Yi Liu
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On Thu, Nov 24, 2022 at 04:27:02AM -0800, Yi Liu wrote:
> This prepares for compiling out vfio group after vfio device cdev is
> added. No vfio_group decode code should be in vfio_main.c.
>
> No functional change is intended.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
> drivers/vfio/Makefile | 1 +
> drivers/vfio/group.c | 842 +++++++++++++++++++++++++++++++++++++++
> drivers/vfio/vfio.h | 17 +
> drivers/vfio/vfio_main.c | 830 +-------------------------------------
> 4 files changed, 863 insertions(+), 827 deletions(-)
> create mode 100644 drivers/vfio/group.c
vfio_device_open_file() should be moved into group.c as well and
export vfio_device_open/close() instead
Jason
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-24 12:26 ` [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close() Yi Liu
@ 2022-11-24 14:56 ` Jason Gunthorpe
2022-11-25 8:57 ` Yi Liu
2022-11-28 8:31 ` Tian, Kevin
1 sibling, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2022-11-24 14:56 UTC (permalink / raw)
To: Yi Liu
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On Thu, Nov 24, 2022 at 04:26:59AM -0800, Yi Liu wrote:
> + kvm = vfio_device_get_group_kvm(device);
> + if (!kvm) {
> + ret = -EINVAL;
> + goto err_unuse_iommu;
> + }
A null kvm is not an error.
And looking at this along with following cdev patch, I think this
organization is cleaner. Make it so the caller of the vfio_device_open
does most of the group/device differences. We already have different
call chains. keep the iommfd code in vfio_main.c's functions.
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 3a69839c65ff75..9b511055150cec 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -609,61 +609,32 @@ void vfio_device_group_unregister(struct vfio_device *device)
int vfio_device_group_use_iommu(struct vfio_device *device)
{
+ struct vfio_group *group = device->group;
int ret = 0;
- /*
- * Here we pass the KVM pointer with the group under the lock. If the
- * device driver will use it, it must obtain a reference and release it
- * during close_device.
- */
- mutex_lock(&device->group->group_lock);
- if (!vfio_group_has_iommu(device->group)) {
- ret = -EINVAL;
- goto out_unlock;
- }
+ lockdep_assert_held(&group->group_lock);
- if (device->group->container) {
- ret = vfio_group_use_container(device->group);
- if (ret)
- goto out_unlock;
- vfio_device_container_register(device);
- } else if (device->group->iommufd) {
- ret = vfio_iommufd_bind(device, device->group->iommufd);
- }
+ if (WARN_ON(!group->container))
+ return -EINVAL;
-out_unlock:
- mutex_unlock(&device->group->group_lock);
- return ret;
+ ret = vfio_group_use_container(group);
+ if (ret)
+ return ret;
+ vfio_device_container_register(device);
+ return 0;
}
void vfio_device_group_unuse_iommu(struct vfio_device *device)
-{
- mutex_lock(&device->group->group_lock);
- if (device->group->container) {
- vfio_device_container_unregister(device);
- vfio_group_unuse_container(device->group);
- } else if (device->group->iommufd) {
- vfio_iommufd_unbind(device);
- }
- mutex_unlock(&device->group->group_lock);
-}
-
-struct kvm *vfio_device_get_group_kvm(struct vfio_device *device)
{
struct vfio_group *group = device->group;
- mutex_lock(&group->group_lock);
- if (!group->kvm) {
- mutex_unlock(&group->group_lock);
- return NULL;
- }
- /* group_lock is released in the vfio_group_put_kvm() */
- return group->kvm;
-}
+ lockdep_assert_held(&group->group_lock);
-void vfio_device_put_group_kvm(struct vfio_device *device)
-{
- mutex_unlock(&device->group->group_lock);
+ if (WARN_ON(!group->container))
+ return;
+
+ vfio_device_container_unregister(device);
+ vfio_group_unuse_container(group);
}
/**
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 3108e92a5cb20b..f9386a34d584e2 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -364,9 +364,9 @@ static bool vfio_assert_device_open(struct vfio_device *device)
return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
}
-static int vfio_device_first_open(struct vfio_device *device)
+static int vfio_device_first_open(struct vfio_device *device,
+ struct iommufd_ctx *iommufd, struct kvm *kvm)
{
- struct kvm *kvm;
int ret;
lockdep_assert_held(&device->dev_set->lock);
@@ -374,54 +374,56 @@ static int vfio_device_first_open(struct vfio_device *device)
if (!try_module_get(device->dev->driver->owner))
return -ENODEV;
- ret = vfio_device_group_use_iommu(device);
+ if (iommufd)
+ ret = vfio_iommufd_bind(device, iommufd);
+ else
+ ret = vfio_device_group_use_iommu(device);
if (ret)
goto err_module_put;
- kvm = vfio_device_get_group_kvm(device);
- if (!kvm) {
- ret = -EINVAL;
- goto err_unuse_iommu;
- }
-
device->kvm = kvm;
if (device->ops->open_device) {
ret = device->ops->open_device(device);
if (ret)
- goto err_container;
+ goto err_unuse_iommu;
}
- vfio_device_put_group_kvm(device);
return 0;
-err_container:
- device->kvm = NULL;
- vfio_device_put_group_kvm(device);
err_unuse_iommu:
- vfio_device_group_unuse_iommu(device);
+ if (iommufd)
+ vfio_iommufd_unbind(device);
+ else
+ vfio_device_group_unuse_iommu(device);
err_module_put:
module_put(device->dev->driver->owner);
+ device->kvm = NULL;
return ret;
}
-static void vfio_device_last_close(struct vfio_device *device)
+static void vfio_device_last_close(struct vfio_device *device,
+ struct iommufd_ctx *iommufd)
{
lockdep_assert_held(&device->dev_set->lock);
if (device->ops->close_device)
device->ops->close_device(device);
device->kvm = NULL;
- vfio_device_group_unuse_iommu(device);
+ if (iommufd)
+ vfio_iommufd_unbind(device);
+ else
+ vfio_device_group_unuse_iommu(device);
module_put(device->dev->driver->owner);
}
-static int vfio_device_open(struct vfio_device *device)
+static int vfio_device_open(struct vfio_device *device,
+ struct iommufd_ctx *iommufd, struct kvm *kvm)
{
int ret = 0;
mutex_lock(&device->dev_set->lock);
device->open_count++;
if (device->open_count == 1) {
- ret = vfio_device_first_open(device);
+ ret = vfio_device_first_open(device, iommufd, kvm);
if (ret)
device->open_count--;
}
@@ -430,22 +432,53 @@ static int vfio_device_open(struct vfio_device *device)
return ret;
}
-static void vfio_device_close(struct vfio_device *device)
+static void vfio_device_close(struct vfio_device *device,
+ struct iommufd_ctx *iommufd)
{
mutex_lock(&device->dev_set->lock);
vfio_assert_device_open(device);
if (device->open_count == 1)
- vfio_device_last_close(device);
+ vfio_device_last_close(device, iommufd);
device->open_count--;
mutex_unlock(&device->dev_set->lock);
}
+static int vfio_device_group_open(struct vfio_device *device)
+{
+ int ret;
+
+ mutex_lock(&device->group->group_lock);
+ if (!vfio_group_has_iommu(device->group)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ /*
+ * Here we pass the KVM pointer with the group under the lock. If the
+ * device driver will use it, it must obtain a reference and release it
+ * during close_device.
+ */
+ ret = vfio_device_open(device, device->group->iommufd,
+ device->group->kvm);
+
+out_unlock:
+ mutex_unlock(&device->group->group_lock);
+ return ret;
+}
+
+void vfio_device_close_group(struct vfio_device *device)
+{
+ mutex_lock(&device->group->group_lock);
+ vfio_device_close(device, device->group->iommufd);
+ mutex_unlock(&device->group->group_lock);
+}
+
struct file *vfio_device_open_file(struct vfio_device *device)
{
struct file *filep;
int ret;
- ret = vfio_device_open(device);
+ ret = vfio_device_group_open(device);
if (ret)
goto err_out;
@@ -474,7 +507,7 @@ struct file *vfio_device_open_file(struct vfio_device *device)
return filep;
err_close_device:
- vfio_device_close(device);
+ vfio_device_group_close(device), device->group->iommufd;
err_out:
return ERR_PTR(ret);
}
@@ -519,7 +552,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
{
struct vfio_device *device = filep->private_data;
- vfio_device_close(device);
+ vfio_device_close_group(device);
vfio_device_put_registration(device);
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [RFC v2 11/11] vfio: Move vfio group specific code into group.c
2022-11-24 14:36 ` Jason Gunthorpe
@ 2022-11-25 8:45 ` Yi Liu
0 siblings, 0 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-25 8:45 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On 2022/11/24 22:36, Jason Gunthorpe wrote:
> On Thu, Nov 24, 2022 at 04:27:02AM -0800, Yi Liu wrote:
>> This prepares for compiling out vfio group after vfio device cdev is
>> added. No vfio_group decode code should be in vfio_main.c.
>>
>> No functional change is intended.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> ---
>> drivers/vfio/Makefile | 1 +
>> drivers/vfio/group.c | 842 +++++++++++++++++++++++++++++++++++++++
>> drivers/vfio/vfio.h | 17 +
>> drivers/vfio/vfio_main.c | 830 +-------------------------------------
>> 4 files changed, 863 insertions(+), 827 deletions(-)
>> create mode 100644 drivers/vfio/group.c
>
> vfio_device_open_file() should be moved into group.c as well and
> export vfio_device_open/close() instead
also need export vfio_device_ops as well:-)
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-24 14:56 ` Jason Gunthorpe
@ 2022-11-25 8:57 ` Yi Liu
2022-11-25 12:37 ` Jason Gunthorpe
0 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-25 8:57 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On 2022/11/24 22:56, Jason Gunthorpe wrote:
> On Thu, Nov 24, 2022 at 04:26:59AM -0800, Yi Liu wrote:
>> + kvm = vfio_device_get_group_kvm(device);
>> + if (!kvm) {
>> + ret = -EINVAL;
>> + goto err_unuse_iommu;
>> + }
>
> A null kvm is not an error.
makes sense. vfio not only serves for VM but also userspace drivers.
> And looking at this along with following cdev patch, I think this
> organization is cleaner. Make it so the caller of the vfio_device_open
> does most of the group/device differences. We already have different
> call chains. keep the iommfd code in vfio_main.c's functions.
ok.
>
> diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
> index 3a69839c65ff75..9b511055150cec 100644
> --- a/drivers/vfio/group.c
> +++ b/drivers/vfio/group.c
> @@ -609,61 +609,32 @@ void vfio_device_group_unregister(struct vfio_device *device)
>
> int vfio_device_group_use_iommu(struct vfio_device *device)
> {
> + struct vfio_group *group = device->group;
> int ret = 0;
>
> - /*
> - * Here we pass the KVM pointer with the group under the lock. If the
> - * device driver will use it, it must obtain a reference and release it
> - * during close_device.
> - */
> - mutex_lock(&device->group->group_lock);
> - if (!vfio_group_has_iommu(device->group)) {
> - ret = -EINVAL;
> - goto out_unlock;
> - }
> + lockdep_assert_held(&group->group_lock);
>
> - if (device->group->container) {
> - ret = vfio_group_use_container(device->group);
> - if (ret)
> - goto out_unlock;
> - vfio_device_container_register(device);
> - } else if (device->group->iommufd) {
> - ret = vfio_iommufd_bind(device, device->group->iommufd);
> - }
> + if (WARN_ON(!group->container))
> + return -EINVAL;
>
> -out_unlock:
> - mutex_unlock(&device->group->group_lock);
> - return ret;
> + ret = vfio_group_use_container(group);
> + if (ret)
> + return ret;
> + vfio_device_container_register(device);
> + return 0;
> }
>
> void vfio_device_group_unuse_iommu(struct vfio_device *device)
> -{
> - mutex_lock(&device->group->group_lock);
> - if (device->group->container) {
> - vfio_device_container_unregister(device);
> - vfio_group_unuse_container(device->group);
> - } else if (device->group->iommufd) {
> - vfio_iommufd_unbind(device);
> - }
> - mutex_unlock(&device->group->group_lock);
> -}
> -
> -struct kvm *vfio_device_get_group_kvm(struct vfio_device *device)
> {
> struct vfio_group *group = device->group;
>
> - mutex_lock(&group->group_lock);
> - if (!group->kvm) {
> - mutex_unlock(&group->group_lock);
> - return NULL;
> - }
> - /* group_lock is released in the vfio_group_put_kvm() */
> - return group->kvm;
> -}
> + lockdep_assert_held(&group->group_lock);
>
> -void vfio_device_put_group_kvm(struct vfio_device *device)
> -{
> - mutex_unlock(&device->group->group_lock);
> + if (WARN_ON(!group->container))
> + return;
> +
> + vfio_device_container_unregister(device);
> + vfio_group_unuse_container(group);
> }
>
> /**
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index 3108e92a5cb20b..f9386a34d584e2 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -364,9 +364,9 @@ static bool vfio_assert_device_open(struct vfio_device *device)
> return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
> }
>
> -static int vfio_device_first_open(struct vfio_device *device)
> +static int vfio_device_first_open(struct vfio_device *device,
> + struct iommufd_ctx *iommufd, struct kvm *kvm)
> {
> - struct kvm *kvm;
> int ret;
>
> lockdep_assert_held(&device->dev_set->lock);
> @@ -374,54 +374,56 @@ static int vfio_device_first_open(struct vfio_device *device)
> if (!try_module_get(device->dev->driver->owner))
> return -ENODEV;
>
> - ret = vfio_device_group_use_iommu(device);
> + if (iommufd)
> + ret = vfio_iommufd_bind(device, iommufd);
> + else
> + ret = vfio_device_group_use_iommu(device);
> if (ret)
> goto err_module_put;
>
> - kvm = vfio_device_get_group_kvm(device);
> - if (!kvm) {
> - ret = -EINVAL;
> - goto err_unuse_iommu;
> - }
> -
> device->kvm = kvm;
> if (device->ops->open_device) {
> ret = device->ops->open_device(device);
> if (ret)
> - goto err_container;
> + goto err_unuse_iommu;
> }
> - vfio_device_put_group_kvm(device);
> return 0;
>
> -err_container:
> - device->kvm = NULL;
> - vfio_device_put_group_kvm(device);
> err_unuse_iommu:
> - vfio_device_group_unuse_iommu(device);
> + if (iommufd)
> + vfio_iommufd_unbind(device);
> + else
> + vfio_device_group_unuse_iommu(device);
> err_module_put:
> module_put(device->dev->driver->owner);
> + device->kvm = NULL;
> return ret;
> }
>
> -static void vfio_device_last_close(struct vfio_device *device)
> +static void vfio_device_last_close(struct vfio_device *device,
> + struct iommufd_ctx *iommufd)
> {
> lockdep_assert_held(&device->dev_set->lock);
>
> if (device->ops->close_device)
> device->ops->close_device(device);
> device->kvm = NULL;
> - vfio_device_group_unuse_iommu(device);
> + if (iommufd)
> + vfio_iommufd_unbind(device);
> + else
> + vfio_device_group_unuse_iommu(device);
> module_put(device->dev->driver->owner);
> }
>
> -static int vfio_device_open(struct vfio_device *device)
> +static int vfio_device_open(struct vfio_device *device,
> + struct iommufd_ctx *iommufd, struct kvm *kvm)
> {
> int ret = 0;
>
> mutex_lock(&device->dev_set->lock);
> device->open_count++;
> if (device->open_count == 1) {
> - ret = vfio_device_first_open(device);
> + ret = vfio_device_first_open(device, iommufd, kvm);
> if (ret)
> device->open_count--;
> }
> @@ -430,22 +432,53 @@ static int vfio_device_open(struct vfio_device *device)
> return ret;
> }
>
> -static void vfio_device_close(struct vfio_device *device)
> +static void vfio_device_close(struct vfio_device *device,
> + struct iommufd_ctx *iommufd)
> {
> mutex_lock(&device->dev_set->lock);
> vfio_assert_device_open(device);
> if (device->open_count == 1)
> - vfio_device_last_close(device);
> + vfio_device_last_close(device, iommufd);
> device->open_count--;
> mutex_unlock(&device->dev_set->lock);
> }
>
> +static int vfio_device_group_open(struct vfio_device *device)
> +{
> + int ret;
> +
> + mutex_lock(&device->group->group_lock);
now the group path holds group_lock first, and then device_set->lock.
this is different with existing code. is it acceptable? I had a quick
check with this change, basic test is good. no a-b-b-a locking issue.
> + if (!vfio_group_has_iommu(device->group)) {
> + ret = -EINVAL;
> + goto out_unlock;
> + }
> +
> + /*
> + * Here we pass the KVM pointer with the group under the lock. If the
> + * device driver will use it, it must obtain a reference and release it
> + * during close_device.
> + */
> + ret = vfio_device_open(device, device->group->iommufd,
> + device->group->kvm);
> +
> +out_unlock:
> + mutex_unlock(&device->group->group_lock);
> + return ret;
> +}
> +
> +void vfio_device_close_group(struct vfio_device *device)
> +{
> + mutex_lock(&device->group->group_lock);
> + vfio_device_close(device, device->group->iommufd);
> + mutex_unlock(&device->group->group_lock);
> +}
> +
above two functions should be put in group.c.
> struct file *vfio_device_open_file(struct vfio_device *device)
> {
> struct file *filep;
> int ret;
>
> - ret = vfio_device_open(device);
> + ret = vfio_device_group_open(device);
> if (ret)
> goto err_out;
>
> @@ -474,7 +507,7 @@ struct file *vfio_device_open_file(struct vfio_device *device)
> return filep;
>
> err_close_device:
> - vfio_device_close(device);
> + vfio_device_group_close(device), device->group->iommufd;
> err_out:
> return ERR_PTR(ret);
> }
> @@ -519,7 +552,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
> {
> struct vfio_device *device = filep->private_data;
>
> - vfio_device_close(device);
> + vfio_device_close_group(device);
> > vfio_device_put_registration(device);
>
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-25 8:57 ` Yi Liu
@ 2022-11-25 12:37 ` Jason Gunthorpe
2022-11-25 14:06 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2022-11-25 12:37 UTC (permalink / raw)
To: Yi Liu
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On Fri, Nov 25, 2022 at 04:57:27PM +0800, Yi Liu wrote:
> > +static int vfio_device_group_open(struct vfio_device *device)
> > +{
> > + int ret;
> > +
> > + mutex_lock(&device->group->group_lock);
>
> now the group path holds group_lock first, and then device_set->lock.
> this is different with existing code. is it acceptable? I had a quick
> check with this change, basic test is good. no a-b-b-a locking issue.
I looked for a while and couldn't find a reason why it wouldn't be OK
> > + if (!vfio_group_has_iommu(device->group)) {
> > + ret = -EINVAL;
> > + goto out_unlock;
> > + }
> > +
> > + /*
> > + * Here we pass the KVM pointer with the group under the lock. If the
> > + * device driver will use it, it must obtain a reference and release it
> > + * during close_device.
> > + */
> > + ret = vfio_device_open(device, device->group->iommufd,
> > + device->group->kvm);
> > +
> > +out_unlock:
> > + mutex_unlock(&device->group->group_lock);
> > + return ret;
> > +}
> > +
> > +void vfio_device_close_group(struct vfio_device *device)
> > +{
> > + mutex_lock(&device->group->group_lock);
> > + vfio_device_close(device, device->group->iommufd);
> > + mutex_unlock(&device->group->group_lock);
> > +}
> > +
>
> above two functions should be put in group.c.
Yes
Jason
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-25 12:37 ` Jason Gunthorpe
@ 2022-11-25 14:06 ` Yi Liu
2022-11-25 14:15 ` Jason Gunthorpe
0 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-25 14:06 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On 2022/11/25 20:37, Jason Gunthorpe wrote:
> On Fri, Nov 25, 2022 at 04:57:27PM +0800, Yi Liu wrote:
>
>>> +static int vfio_device_group_open(struct vfio_device *device)
>>> +{
>>> + int ret;
>>> +
>>> + mutex_lock(&device->group->group_lock);
>>
>> now the group path holds group_lock first, and then device_set->lock.
>> this is different with existing code. is it acceptable? I had a quick
>> check with this change, basic test is good. no a-b-b-a locking issue.
>
> I looked for a while and couldn't find a reason why it wouldn't be OK
ok, so updated this commit as below:
From dd236de34a4f736041456fb46bd4a5eab360681b Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu@intel.com>
Date: Wed, 2 Nov 2022 04:42:25 -0700
Subject: [PATCH 08/11] vfio: Refactor vfio_device_open/close()
With this refactor, vfio_device_open/close() is the common code to open
device for the current group path, and also the future vfio device cdev
path. It calls the vfio_device_first_open() and _last_close() to handle
the legacy container path and the compat iommufd path.
Current caller of vfio_device_open/close() are vfio_device_group_open
and vfio_device_group_close(). They take care of group_lock, iommufd_ctx
pointer and also kvm pointer.
Future caller in the vfio device cdev path just need to care about the
iommufd_ctx pointer and kvm pointer. This is not part of this commit.
This prepares for moving group specific code out of vfio_main.c and also
compiling out the group infrastructure in future.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/vfio_main.c | 133 +++++++++++++++++++++++++--------------
1 file changed, 87 insertions(+), 46 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 6898a492acc0..b49c48f3bcc8 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -772,7 +772,38 @@ static bool vfio_assert_device_open(struct vfio_device
*device)
return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
}
-static int vfio_device_first_open(struct vfio_device *device)
+static int vfio_device_group_use_iommu(struct vfio_device *device)
+{
+ struct vfio_group *group = device->group;
+ int ret = 0;
+
+ lockdep_assert_held(&group->group_lock);
+
+ if (WARN_ON(!group->container))
+ return -EINVAL;
+
+ ret = vfio_group_use_container(group);
+ if (ret)
+ return ret;
+ vfio_device_container_register(device);
+ return 0;
+}
+
+static void vfio_device_group_unuse_iommu(struct vfio_device *device)
+{
+ struct vfio_group *group = device->group;
+
+ lockdep_assert_held(&group->group_lock);
+
+ if (WARN_ON(!group->container))
+ return;
+
+ vfio_device_container_unregister(device);
+ vfio_group_unuse_container(group);
+}
+
+static int vfio_device_first_open(struct vfio_device *device,
+ struct iommufd_ctx *iommufd, struct kvm *kvm)
{
int ret;
@@ -781,77 +812,56 @@ static int vfio_device_first_open(struct vfio_device
*device)
if (!try_module_get(device->dev->driver->owner))
return -ENODEV;
- /*
- * Here we pass the KVM pointer with the group under the lock. If the
- * device driver will use it, it must obtain a reference and release it
- * during close_device.
- */
- mutex_lock(&device->group->group_lock);
- if (!vfio_group_has_iommu(device->group)) {
- ret = -EINVAL;
+ if (iommufd)
+ ret = vfio_iommufd_bind(device, iommufd);
+ else
+ ret = vfio_device_group_use_iommu(device);
+ if (ret)
goto err_module_put;
- }
- if (device->group->container) {
- ret = vfio_group_use_container(device->group);
- if (ret)
- goto err_module_put;
- vfio_device_container_register(device);
- } else if (device->group->iommufd) {
- ret = vfio_iommufd_bind(device, device->group->iommufd);
- if (ret)
- goto err_module_put;
- }
-
- device->kvm = device->group->kvm;
+ device->kvm = kvm;
if (device->ops->open_device) {
ret = device->ops->open_device(device);
if (ret)
- goto err_container;
+ goto err_unuse_iommu;
}
- mutex_unlock(&device->group->group_lock);
return 0;
-err_container:
+err_unuse_iommu:
device->kvm = NULL;
- if (device->group->container) {
- vfio_device_container_unregister(device);
- vfio_group_unuse_container(device->group);
- } else if (device->group->iommufd) {
+ if (iommufd)
vfio_iommufd_unbind(device);
- }
+ else
+ vfio_device_group_unuse_iommu(device);
err_module_put:
- mutex_unlock(&device->group->group_lock);
module_put(device->dev->driver->owner);
return ret;
}
-static void vfio_device_last_close(struct vfio_device *device)
+static void vfio_device_last_close(struct vfio_device *device,
+ struct iommufd_ctx *iommufd)
{
lockdep_assert_held(&device->dev_set->lock);
- mutex_lock(&device->group->group_lock);
if (device->ops->close_device)
device->ops->close_device(device);
device->kvm = NULL;
- if (device->group->container) {
- vfio_device_container_unregister(device);
- vfio_group_unuse_container(device->group);
- } else if (device->group->iommufd) {
+ if (iommufd)
vfio_iommufd_unbind(device);
- }
- mutex_unlock(&device->group->group_lock);
+ else
+ vfio_device_group_unuse_iommu(device);
module_put(device->dev->driver->owner);
}
-static int vfio_device_open(struct vfio_device *device)
+static int vfio_device_open(struct vfio_device *device,
+ struct iommufd_ctx *iommufd, struct kvm *kvm)
{
int ret = 0;
mutex_lock(&device->dev_set->lock);
device->open_count++;
if (device->open_count == 1) {
- ret = vfio_device_first_open(device);
+ ret = vfio_device_first_open(device, iommufd, kvm);
if (ret)
device->open_count--;
}
@@ -860,22 +870,53 @@ static int vfio_device_open(struct vfio_device *device)
return ret;
}
-static void vfio_device_close(struct vfio_device *device)
+static void vfio_device_close(struct vfio_device *device,
+ struct iommufd_ctx *iommufd)
{
mutex_lock(&device->dev_set->lock);
vfio_assert_device_open(device);
if (device->open_count == 1)
- vfio_device_last_close(device);
+ vfio_device_last_close(device, iommufd);
device->open_count--;
mutex_unlock(&device->dev_set->lock);
}
+static int vfio_device_group_open(struct vfio_device *device)
+{
+ int ret;
+
+ mutex_lock(&device->group->group_lock);
+ if (!vfio_group_has_iommu(device->group)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ /*
+ * Here we pass the KVM pointer with the group under the lock. If the
+ * device driver will use it, it must obtain a reference and release it
+ * during close_device.
+ */
+ ret = vfio_device_open(device, device->group->iommufd,
+ device->group->kvm);
+
+out_unlock:
+ mutex_unlock(&device->group->group_lock);
+ return ret;
+}
+
+static void vfio_device_group_close(struct vfio_device *device)
+{
+ mutex_lock(&device->group->group_lock);
+ vfio_device_close(device, device->group->iommufd);
+ mutex_unlock(&device->group->group_lock);
+}
+
static struct file *vfio_device_open_file(struct vfio_device *device)
{
struct file *filep;
int ret;
- ret = vfio_device_open(device);
+ ret = vfio_device_group_open(device);
if (ret)
goto err_out;
@@ -904,7 +945,7 @@ static struct file *vfio_device_open_file(struct
vfio_device *device)
return filep;
err_close_device:
- vfio_device_close(device);
+ vfio_device_group_close(device);
err_out:
return ERR_PTR(ret);
}
@@ -1120,7 +1161,7 @@ static int vfio_device_fops_release(struct inode
*inode, struct file *filep)
{
struct vfio_device *device = filep->private_data;
- vfio_device_close(device);
+ vfio_device_group_close(device);
vfio_device_put_registration(device);
--
2.34.1
--
Regards,
Yi Liu
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-25 14:06 ` Yi Liu
@ 2022-11-25 14:15 ` Jason Gunthorpe
2022-11-25 14:33 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2022-11-25 14:15 UTC (permalink / raw)
To: Yi Liu
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On Fri, Nov 25, 2022 at 10:06:13PM +0800, Yi Liu wrote:
> On 2022/11/25 20:37, Jason Gunthorpe wrote:
> > On Fri, Nov 25, 2022 at 04:57:27PM +0800, Yi Liu wrote:
> >
> > > > +static int vfio_device_group_open(struct vfio_device *device)
> > > > +{
> > > > + int ret;
> > > > +
> > > > + mutex_lock(&device->group->group_lock);
> > >
> > > now the group path holds group_lock first, and then device_set->lock.
> > > this is different with existing code. is it acceptable? I had a quick
> > > check with this change, basic test is good. no a-b-b-a locking issue.
> >
> > I looked for a while and couldn't find a reason why it wouldn't be OK
>
> ok, so updated this commit as below:
Yeah, can you update the cdev series too, lets see how it looks
Jason
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-25 14:15 ` Jason Gunthorpe
@ 2022-11-25 14:33 ` Yi Liu
0 siblings, 0 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-25 14:33 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: alex.williamson, kevin.tian, eric.auger, cohuck, nicolinc,
yi.y.sun, chao.p.peng, mjrosato, kvm
On 2022/11/25 22:15, Jason Gunthorpe wrote:
> On Fri, Nov 25, 2022 at 10:06:13PM +0800, Yi Liu wrote:
>> On 2022/11/25 20:37, Jason Gunthorpe wrote:
>>> On Fri, Nov 25, 2022 at 04:57:27PM +0800, Yi Liu wrote:
>>>
>>>>> +static int vfio_device_group_open(struct vfio_device *device)
>>>>> +{
>>>>> + int ret;
>>>>> +
>>>>> + mutex_lock(&device->group->group_lock);
>>>>
>>>> now the group path holds group_lock first, and then device_set->lock.
>>>> this is different with existing code. is it acceptable? I had a quick
>>>> check with this change, basic test is good. no a-b-b-a locking issue.
>>>
>>> I looked for a while and couldn't find a reason why it wouldn't be OK
>>
>> ok, so updated this commit as below:
>
> Yeah, can you update the cdev series too, lets see how it looks
yes, it's wip, not finished yet. I'll update tomorrow.
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 01/11] vfio: Simplify vfio_create_group()
2022-11-24 12:26 ` [RFC v2 01/11] vfio: Simplify vfio_create_group() Yi Liu
@ 2022-11-28 7:57 ` Tian, Kevin
0 siblings, 0 replies; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 7:57 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> From: Jason Gunthorpe <jgg@nvidia.com>
>
> The vfio.group_lock is now only used to serialize vfio_group creation
> and destruction, we don't need a micro-optimization of searching,
> unlocking, then allocating and searching again. Just hold the lock
> the whole time.
>
> Grabbed from:
> https://lore.kernel.org/kvm/20220922152338.2a2238fe.alex.williamson@red
> hat.com/
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 02/11] vfio: Move the sanity check of the group to vfio_create_group()
2022-11-24 12:26 ` [RFC v2 02/11] vfio: Move the sanity check of the group to vfio_create_group() Yi Liu
@ 2022-11-28 8:05 ` Tian, Kevin
0 siblings, 0 replies; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:05 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> From: Jason Gunthorpe <jgg@nvidia.com>
>
> This avoids opening group specific code in __vfio_register_dev() for the
> sanity check if an (existing) group is not corrupted by having two copies
> of the same struct device in it. It also simplifies the error unwind for
> this sanity check since the failure can be detected in the group allocation.
>
> This also prepares for moving the group specific code into separate group.c.
>
> Grabbed from:
> https://lore.kernel.org/kvm/20220922152338.2a2238fe.alex.williamson@red
> hat.com/
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-24 12:26 ` [RFC v2 03/11] vfio: Set device->group in helper function Yi Liu
2022-11-24 13:17 ` Jason Gunthorpe
@ 2022-11-28 8:08 ` Tian, Kevin
2022-11-28 9:17 ` Yi Liu
1 sibling, 1 reply; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:08 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> This avoids referencing device->group in __vfio_register_dev()
this is not true. There is still reference to device->group when adding
the device to the group->device_list.
this remark will become true if you put it after patch04.
> +static int vfio_device_set_group(struct vfio_device *device,
> + enum vfio_group_type type)
> {
> - int ret;
> + struct vfio_group *group;
> +
> + if (type == VFIO_IOMMU)
> + group = vfio_group_find_or_alloc(device->dev);
> + else
> + group = vfio_noiommu_group_alloc(device->dev, type);
Do we need a WARN_ON(type == VFIO_NO_IOMMU)?
> @@ -638,6 +651,7 @@ void vfio_unregister_group_dev(struct vfio_device
> *device)
> /* Balances device_add in register path */
> device_del(&device->device);
>
> + /* Balances vfio_device_set_group in register path */
"vfio_device_set_group()"
> vfio_device_remove_group(device);
> }
> EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
> --
> 2.34.1
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 04/11] vfio: Wrap group codes to be helpers for __vfio_register_dev() and unregister
2022-11-24 12:26 ` [RFC v2 04/11] vfio: Wrap group codes to be helpers for __vfio_register_dev() and unregister Yi Liu
@ 2022-11-28 8:11 ` Tian, Kevin
2022-11-28 9:17 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:11 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
Subject:
"vfio: Create wrappers for group register/unregister"
>
> This avoids to decode group fields in the common functions used by
"avoids decoding"
> vfio_device registration, and prepares for further moving vfio group
> specific code into separate file.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic
2022-11-24 12:26 ` [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic Yi Liu
@ 2022-11-28 8:17 ` Tian, Kevin
2022-11-28 9:19 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:17 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> This prepares for moving group specific code to separate file.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
> drivers/vfio/vfio_main.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index edcfa8a61096..fcb9f778fc9b 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -878,9 +878,6 @@ static struct file *vfio_device_open(struct vfio_device
> *device)
> */
> filep->f_mode |= (FMODE_PREAD | FMODE_PWRITE);
>
> - if (device->group->type == VFIO_NO_IOMMU)
> - dev_warn(device->dev, "vfio-noiommu device opened by
> user "
> - "(%s:%d)\n", current->comm, task_pid_nr(current));
> /*
> * On success the ref of device is moved to the file and
> * put in vfio_device_fops_release()
> @@ -927,6 +924,10 @@ static int vfio_group_ioctl_get_device_fd(struct
> vfio_group *group,
> goto err_put_fdno;
> }
>
> + if (group->type == VFIO_NO_IOMMU)
> + dev_warn(device->dev, "vfio-noiommu device opened by
> user "
> + "(%s:%d)\n", current->comm, task_pid_nr(current));
> +
> fd_install(fdno, filep);
> return fdno;
>
Do we want to support no-iommu mode in future cdev path?
If yes keeping the check in vfio_device_open() makes more sense. Just
replace direct device->group reference with a helper e.g.:
vfio_device_group_noiommu()
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 06/11] vfio: Move device open/close code to be helpfers
2022-11-24 12:26 ` [RFC v2 06/11] vfio: Move device open/close code to be helpfers Yi Liu
@ 2022-11-28 8:21 ` Tian, Kevin
2022-11-28 9:27 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:21 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> This makes the device open and close be in paired helpers.
>
> vfio_device_open(), and vfio_device_close() handles the open_count, and
> calls vfio_device_first_open(), and vfio_device_last_close() when
> open_count condition is met. This also helps to avoid open code for device
> in the vfio_group_ioctl_get_device_fd(), and prepares for further moving
I didn't get which 'open code' is referred to here:
> @@ -918,7 +935,7 @@ static int vfio_group_ioctl_get_device_fd(struct
> vfio_group *group,
> goto err_put_device;
> }
>
> - filep = vfio_device_open(device);
> + filep = vfio_device_open_file(device);
it's simply a replacement of function calls.
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 07/11] vfio: Swap order of vfio_device_container_register() and open_device()
2022-11-24 12:26 ` [RFC v2 07/11] vfio: Swap order of vfio_device_container_register() and open_device() Yi Liu
@ 2022-11-28 8:27 ` Tian, Kevin
2022-11-28 9:28 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:27 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> This makes the DMA unmap callback registration to container be consistent
> across the vfio iommufd compat mode and the legacy container mode.
>
> In the vfio iommufd compat mode, this registration is done in the
> vfio_iommufd_bind() when creating access which has an unmap callback.
> This
> is prior to calling the open_device() op provided by mdev driver. While,
> in the vfio legacy mode, the registration is done by
> vfio_device_container_register() and it is after open_device(). By swapping
> the order of vfio_device_container_register() and open_device(), the two
> modes have the consistent order for the DMA unmap callback registration.
let's mark out that existing drivers have been converted to be OK with
this order swap in Jason's series.
>
> This also prepares for further splitting group specific code.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close()
2022-11-24 12:26 ` [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close() Yi Liu
2022-11-24 14:56 ` Jason Gunthorpe
@ 2022-11-28 8:31 ` Tian, Kevin
1 sibling, 0 replies; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:31 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> To prepare for moving group specific code out from vfio_main.c.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
it's not easy to review this patch with Jason's further change. Will wait
until next version
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 09/11] vfio: Wrap vfio group module init/clean code into helpers
2022-11-24 12:27 ` [RFC v2 09/11] vfio: Wrap vfio group module init/clean code into helpers Yi Liu
@ 2022-11-28 8:36 ` Tian, Kevin
0 siblings, 0 replies; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:36 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> This wraps the init/clean code of vfio group global variable to be helpers,
> and prepares for further moving vfio group specific code into separate
> file.
>
> As container is used by group, so vfio_container_init/cleanup() is moved
> into vfio_group_init/cleanup().
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 10/11] vfio: Refactor dma APIs for emulated devices
2022-11-24 12:27 ` [RFC v2 10/11] vfio: Refactor dma APIs for emulated devices Yi Liu
@ 2022-11-28 8:42 ` Tian, Kevin
0 siblings, 0 replies; 42+ messages in thread
From: Tian, Kevin @ 2022-11-28 8:42 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 24, 2022 8:27 PM
>
> To use group helpers instead of opening group related code in the
> API. This prepares moving group specific code out of vfio_main.c.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-28 8:08 ` Tian, Kevin
@ 2022-11-28 9:17 ` Yi Liu
2022-11-29 2:04 ` Tian, Kevin
0 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-28 9:17 UTC (permalink / raw)
To: Tian, Kevin, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
On 2022/11/28 16:08, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, November 24, 2022 8:27 PM
>>
>> This avoids referencing device->group in __vfio_register_dev()
>
> this is not true. There is still reference to device->group when adding
> the device to the group->device_list.
>
> this remark will become true if you put it after patch04.
oh yes. :-)
>> +static int vfio_device_set_group(struct vfio_device *device,
>> + enum vfio_group_type type)
>> {
>> - int ret;
>> + struct vfio_group *group;
>> +
>> + if (type == VFIO_IOMMU)
>> + group = vfio_group_find_or_alloc(device->dev);
>> + else
>> + group = vfio_noiommu_group_alloc(device->dev, type);
>
> Do we need a WARN_ON(type == VFIO_NO_IOMMU)?
do you mean a heads-up to user? if so, there is already a warn in
vfio_group_find_or_alloc() and vfio_group_ioctl_get_device_fd()
>> @@ -638,6 +651,7 @@ void vfio_unregister_group_dev(struct vfio_device
>> *device)
>> /* Balances device_add in register path */
>> device_del(&device->device);
>>
>> + /* Balances vfio_device_set_group in register path */
>
> "vfio_device_set_group()"
got it.
>> vfio_device_remove_group(device);
>> }
>> EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
>> --
>> 2.34.1
>
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 04/11] vfio: Wrap group codes to be helpers for __vfio_register_dev() and unregister
2022-11-28 8:11 ` Tian, Kevin
@ 2022-11-28 9:17 ` Yi Liu
0 siblings, 0 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-28 9:17 UTC (permalink / raw)
To: Tian, Kevin, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
On 2022/11/28 16:11, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, November 24, 2022 8:27 PM
>
> Subject:
>
> "vfio: Create wrappers for group register/unregister"
>
>>
>> This avoids to decode group fields in the common functions used by
>
> "avoids decoding"
got above.
>> vfio_device registration, and prepares for further moving vfio group
>> specific code into separate file.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
thanks.
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic
2022-11-28 8:17 ` Tian, Kevin
@ 2022-11-28 9:19 ` Yi Liu
2022-12-01 7:08 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-11-28 9:19 UTC (permalink / raw)
To: Tian, Kevin, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
On 2022/11/28 16:17, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, November 24, 2022 8:27 PM
>>
>> This prepares for moving group specific code to separate file.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> ---
>> drivers/vfio/vfio_main.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
>> index edcfa8a61096..fcb9f778fc9b 100644
>> --- a/drivers/vfio/vfio_main.c
>> +++ b/drivers/vfio/vfio_main.c
>> @@ -878,9 +878,6 @@ static struct file *vfio_device_open(struct vfio_device
>> *device)
>> */
>> filep->f_mode |= (FMODE_PREAD | FMODE_PWRITE);
>>
>> - if (device->group->type == VFIO_NO_IOMMU)
>> - dev_warn(device->dev, "vfio-noiommu device opened by
>> user "
>> - "(%s:%d)\n", current->comm, task_pid_nr(current));
>> /*
>> * On success the ref of device is moved to the file and
>> * put in vfio_device_fops_release()
>> @@ -927,6 +924,10 @@ static int vfio_group_ioctl_get_device_fd(struct
>> vfio_group *group,
>> goto err_put_fdno;
>> }
>>
>> + if (group->type == VFIO_NO_IOMMU)
>> + dev_warn(device->dev, "vfio-noiommu device opened by
>> user "
>> + "(%s:%d)\n", current->comm, task_pid_nr(current));
>> +
>> fd_install(fdno, filep);
>> return fdno;
>>
>
> Do we want to support no-iommu mode in future cdev path?
>
> If yes keeping the check in vfio_device_open() makes more sense. Just
> replace direct device->group reference with a helper e.g.:
>
> vfio_device_group_noiommu()
I didn't see a reason cdev cannot support no-iommu mode. so a helper to
check noiommu is reasonable.
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 06/11] vfio: Move device open/close code to be helpfers
2022-11-28 8:21 ` Tian, Kevin
@ 2022-11-28 9:27 ` Yi Liu
0 siblings, 0 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-28 9:27 UTC (permalink / raw)
To: Tian, Kevin, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
On 2022/11/28 16:21, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, November 24, 2022 8:27 PM
>>
>> This makes the device open and close be in paired helpers.
>>
>> vfio_device_open(), and vfio_device_close() handles the open_count, and
>> calls vfio_device_first_open(), and vfio_device_last_close() when
>> open_count condition is met. This also helps to avoid open code for device
>> in the vfio_group_ioctl_get_device_fd(), and prepares for further moving
>
> I didn't get which 'open code' is referred to here:
it's the device->open_count things. but you are right, it's not in the
vfio_group_ioctl_get_device_fd().
>> @@ -918,7 +935,7 @@ static int vfio_group_ioctl_get_device_fd(struct
>> vfio_group *group,
>> goto err_put_device;
>> }
>>
>> - filep = vfio_device_open(device);
>> + filep = vfio_device_open_file(device);
>
> it's simply a replacement of function calls.
so more accurate description is splitting the vfio_device_open() into
common vfio_device_open() which is paired with vfio_device_close(), and
another wrapper to deal with device open and device file open, which is
group path specific.
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 07/11] vfio: Swap order of vfio_device_container_register() and open_device()
2022-11-28 8:27 ` Tian, Kevin
@ 2022-11-28 9:28 ` Yi Liu
0 siblings, 0 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-28 9:28 UTC (permalink / raw)
To: Tian, Kevin, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
On 2022/11/28 16:27, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, November 24, 2022 8:27 PM
>>
>> This makes the DMA unmap callback registration to container be consistent
>> across the vfio iommufd compat mode and the legacy container mode.
>>
>> In the vfio iommufd compat mode, this registration is done in the
>> vfio_iommufd_bind() when creating access which has an unmap callback.
>> This
>> is prior to calling the open_device() op provided by mdev driver. While,
>> in the vfio legacy mode, the registration is done by
>> vfio_device_container_register() and it is after open_device(). By swapping
>> the order of vfio_device_container_register() and open_device(), the two
>> modes have the consistent order for the DMA unmap callback registration.
>
> let's mark out that existing drivers have been converted to be OK with
> this order swap in Jason's series.
sure.
>>
>> This also prepares for further splitting group specific code.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-28 9:17 ` Yi Liu
@ 2022-11-29 2:04 ` Tian, Kevin
2022-11-29 13:27 ` Jason Gunthorpe
0 siblings, 1 reply; 42+ messages in thread
From: Tian, Kevin @ 2022-11-29 2:04 UTC (permalink / raw)
To: Liu, Yi L, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, November 28, 2022 5:17 PM
>
>
> >> +static int vfio_device_set_group(struct vfio_device *device,
> >> + enum vfio_group_type type)
> >> {
> >> - int ret;
> >> + struct vfio_group *group;
> >> +
> >> + if (type == VFIO_IOMMU)
> >> + group = vfio_group_find_or_alloc(device->dev);
> >> + else
> >> + group = vfio_noiommu_group_alloc(device->dev, type);
> >
> > Do we need a WARN_ON(type == VFIO_NO_IOMMU)?
>
> do you mean a heads-up to user? if so, there is already a warn in
> vfio_group_find_or_alloc() and vfio_group_ioctl_get_device_fd()
>
I meant that VFIO_NO_IOMMU is not expected as a passed in type.
It's implicitly handled by vfio_group_find_or_alloc() which calls
vfio_noiommu_group_alloc() plus kernel taint.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-29 2:04 ` Tian, Kevin
@ 2022-11-29 13:27 ` Jason Gunthorpe
2022-11-30 7:09 ` Yi Liu
0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2022-11-29 13:27 UTC (permalink / raw)
To: Tian, Kevin
Cc: Liu, Yi L, alex.williamson@redhat.com, eric.auger@redhat.com,
cohuck@redhat.com, nicolinc@nvidia.com, yi.y.sun@linux.intel.com,
chao.p.peng@linux.intel.com, mjrosato@linux.ibm.com,
kvm@vger.kernel.org
On Tue, Nov 29, 2022 at 02:04:01AM +0000, Tian, Kevin wrote:
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Monday, November 28, 2022 5:17 PM
> >
> >
> > >> +static int vfio_device_set_group(struct vfio_device *device,
> > >> + enum vfio_group_type type)
> > >> {
> > >> - int ret;
> > >> + struct vfio_group *group;
> > >> +
> > >> + if (type == VFIO_IOMMU)
> > >> + group = vfio_group_find_or_alloc(device->dev);
> > >> + else
> > >> + group = vfio_noiommu_group_alloc(device->dev, type);
> > >
> > > Do we need a WARN_ON(type == VFIO_NO_IOMMU)?
> >
> > do you mean a heads-up to user? if so, there is already a warn in
> > vfio_group_find_or_alloc() and vfio_group_ioctl_get_device_fd()
> >
>
> I meant that VFIO_NO_IOMMU is not expected as a passed in type.
> It's implicitly handled by vfio_group_find_or_alloc() which calls
> vfio_noiommu_group_alloc() plus kernel taint.
The code is simple enough to check the two callers, I don't know if we
need a WARN_ON - and it isn't actually wrong to call
vfio_noiommu_group_alloc() for a VFIO_NO_IOMMU..
Jason
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 03/11] vfio: Set device->group in helper function
2022-11-29 13:27 ` Jason Gunthorpe
@ 2022-11-30 7:09 ` Yi Liu
0 siblings, 0 replies; 42+ messages in thread
From: Yi Liu @ 2022-11-30 7:09 UTC (permalink / raw)
To: Jason Gunthorpe, Tian, Kevin
Cc: alex.williamson@redhat.com, eric.auger@redhat.com,
cohuck@redhat.com, nicolinc@nvidia.com, yi.y.sun@linux.intel.com,
chao.p.peng@linux.intel.com, mjrosato@linux.ibm.com,
kvm@vger.kernel.org
On 2022/11/29 21:27, Jason Gunthorpe wrote:
> On Tue, Nov 29, 2022 at 02:04:01AM +0000, Tian, Kevin wrote:
>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>> Sent: Monday, November 28, 2022 5:17 PM
>>>
>>>
>>>>> +static int vfio_device_set_group(struct vfio_device *device,
>>>>> + enum vfio_group_type type)
>>>>> {
>>>>> - int ret;
>>>>> + struct vfio_group *group;
>>>>> +
>>>>> + if (type == VFIO_IOMMU)
>>>>> + group = vfio_group_find_or_alloc(device->dev);
>>>>> + else
>>>>> + group = vfio_noiommu_group_alloc(device->dev, type);
>>>>
>>>> Do we need a WARN_ON(type == VFIO_NO_IOMMU)?
>>>
>>> do you mean a heads-up to user? if so, there is already a warn in
>>> vfio_group_find_or_alloc() and vfio_group_ioctl_get_device_fd()
>>>
>>
>> I meant that VFIO_NO_IOMMU is not expected as a passed in type.
>> It's implicitly handled by vfio_group_find_or_alloc() which calls
>> vfio_noiommu_group_alloc() plus kernel taint.
>
> The code is simple enough to check the two callers, I don't know if we
> need a WARN_ON - and it isn't actually wrong to call
> vfio_noiommu_group_alloc() for a VFIO_NO_IOMMU..
this function also works if the VFIO_NO_IOMMU is passed in. But no such
usage in the code yet. :-)
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic
2022-11-28 9:19 ` Yi Liu
@ 2022-12-01 7:08 ` Yi Liu
2022-12-01 12:48 ` Jason Gunthorpe
0 siblings, 1 reply; 42+ messages in thread
From: Yi Liu @ 2022-12-01 7:08 UTC (permalink / raw)
To: Tian, Kevin, alex.williamson@redhat.com, jgg@nvidia.com
Cc: eric.auger@redhat.com, cohuck@redhat.com, nicolinc@nvidia.com,
yi.y.sun@linux.intel.com, chao.p.peng@linux.intel.com,
mjrosato@linux.ibm.com, kvm@vger.kernel.org
On 2022/11/28 17:19, Yi Liu wrote:
> On 2022/11/28 16:17, Tian, Kevin wrote:
>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>> Sent: Thursday, November 24, 2022 8:27 PM
>>>
>>> This prepares for moving group specific code to separate file.
>>>
>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>> ---
>>> drivers/vfio/vfio_main.c | 7 ++++---
>>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
>>> index edcfa8a61096..fcb9f778fc9b 100644
>>> --- a/drivers/vfio/vfio_main.c
>>> +++ b/drivers/vfio/vfio_main.c
>>> @@ -878,9 +878,6 @@ static struct file *vfio_device_open(struct vfio_device
>>> *device)
>>> */
>>> filep->f_mode |= (FMODE_PREAD | FMODE_PWRITE);
>>>
>>> - if (device->group->type == VFIO_NO_IOMMU)
>>> - dev_warn(device->dev, "vfio-noiommu device opened by
>>> user "
>>> - "(%s:%d)\n", current->comm, task_pid_nr(current));
>>> /*
>>> * On success the ref of device is moved to the file and
>>> * put in vfio_device_fops_release()
>>> @@ -927,6 +924,10 @@ static int vfio_group_ioctl_get_device_fd(struct
>>> vfio_group *group,
>>> goto err_put_fdno;
>>> }
>>>
>>> + if (group->type == VFIO_NO_IOMMU)
>>> + dev_warn(device->dev, "vfio-noiommu device opened by
>>> user "
>>> + "(%s:%d)\n", current->comm, task_pid_nr(current));
>>> +
>>> fd_install(fdno, filep);
>>> return fdno;
>>>
>>
>> Do we want to support no-iommu mode in future cdev path?
>>
>> If yes keeping the check in vfio_device_open() makes more sense. Just
>> replace direct device->group reference with a helper e.g.:
>>
>> vfio_device_group_noiommu()
>
> I didn't see a reason cdev cannot support no-iommu mode. so a helper to
> check noiommu is reasonable.
This check should be done after opening device and the file. Current
vfio_device_open() opens device first and then open file. Open file is
group path specific, not needed in future device cdev path. So if want to
have this check in the common function, the open device and open file
order should be swapped. However, it is not necessary here. So may just
drop this patch and consider it in future device cdev series.
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic
2022-12-01 7:08 ` Yi Liu
@ 2022-12-01 12:48 ` Jason Gunthorpe
0 siblings, 0 replies; 42+ messages in thread
From: Jason Gunthorpe @ 2022-12-01 12:48 UTC (permalink / raw)
To: Yi Liu
Cc: Tian, Kevin, alex.williamson@redhat.com, eric.auger@redhat.com,
cohuck@redhat.com, nicolinc@nvidia.com, yi.y.sun@linux.intel.com,
chao.p.peng@linux.intel.com, mjrosato@linux.ibm.com,
kvm@vger.kernel.org
On Thu, Dec 01, 2022 at 03:08:12PM +0800, Yi Liu wrote:
> On 2022/11/28 17:19, Yi Liu wrote:
> > On 2022/11/28 16:17, Tian, Kevin wrote:
> > > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > > Sent: Thursday, November 24, 2022 8:27 PM
> > > >
> > > > This prepares for moving group specific code to separate file.
> > > >
> > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > > > ---
> > > > drivers/vfio/vfio_main.c | 7 ++++---
> > > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> > > > index edcfa8a61096..fcb9f778fc9b 100644
> > > > --- a/drivers/vfio/vfio_main.c
> > > > +++ b/drivers/vfio/vfio_main.c
> > > > @@ -878,9 +878,6 @@ static struct file *vfio_device_open(struct vfio_device
> > > > *device)
> > > > */
> > > > filep->f_mode |= (FMODE_PREAD | FMODE_PWRITE);
> > > >
> > > > - if (device->group->type == VFIO_NO_IOMMU)
> > > > - dev_warn(device->dev, "vfio-noiommu device opened by
> > > > user "
> > > > - "(%s:%d)\n", current->comm, task_pid_nr(current));
> > > > /*
> > > > * On success the ref of device is moved to the file and
> > > > * put in vfio_device_fops_release()
> > > > @@ -927,6 +924,10 @@ static int vfio_group_ioctl_get_device_fd(struct
> > > > vfio_group *group,
> > > > goto err_put_fdno;
> > > > }
> > > >
> > > > + if (group->type == VFIO_NO_IOMMU)
> > > > + dev_warn(device->dev, "vfio-noiommu device opened by
> > > > user "
> > > > + "(%s:%d)\n", current->comm, task_pid_nr(current));
> > > > +
> > > > fd_install(fdno, filep);
> > > > return fdno;
> > > >
> > >
> > > Do we want to support no-iommu mode in future cdev path?
> > >
> > > If yes keeping the check in vfio_device_open() makes more sense. Just
> > > replace direct device->group reference with a helper e.g.:
> > >
> > > vfio_device_group_noiommu()
> >
> > I didn't see a reason cdev cannot support no-iommu mode. so a helper to
> > check noiommu is reasonable.
>
> This check should be done after opening device and the file. Current
> vfio_device_open() opens device first and then open file. Open file is
> group path specific, not needed in future device cdev path. So if want to
> have this check in the common function, the open device and open file
> order should be swapped. However, it is not necessary here. So may just
> drop this patch and consider it in future device cdev series.
the point here was to remove the device->group touches from
vfio_main.c, which this does and seems appropraite.
cdev no-iommu mode is going to be quite different since it will work
without groups
Jason
^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2022-12-01 12:48 UTC | newest]
Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-24 12:26 [RFC v2 00/11] Move group specific code into group.c Yi Liu
2022-11-24 12:26 ` [RFC v2 01/11] vfio: Simplify vfio_create_group() Yi Liu
2022-11-28 7:57 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 02/11] vfio: Move the sanity check of the group to vfio_create_group() Yi Liu
2022-11-28 8:05 ` Tian, Kevin
2022-11-24 12:26 ` [RFC v2 03/11] vfio: Set device->group in helper function Yi Liu
2022-11-24 13:17 ` Jason Gunthorpe
2022-11-24 13:50 ` Yi Liu
2022-11-28 8:08 ` Tian, Kevin
2022-11-28 9:17 ` Yi Liu
2022-11-29 2:04 ` Tian, Kevin
2022-11-29 13:27 ` Jason Gunthorpe
2022-11-30 7:09 ` Yi Liu
2022-11-24 12:26 ` [RFC v2 04/11] vfio: Wrap group codes to be helpers for __vfio_register_dev() and unregister Yi Liu
2022-11-28 8:11 ` Tian, Kevin
2022-11-28 9:17 ` Yi Liu
2022-11-24 12:26 ` [RFC v2 05/11] vfio: Make vfio_device_open() group agnostic Yi Liu
2022-11-28 8:17 ` Tian, Kevin
2022-11-28 9:19 ` Yi Liu
2022-12-01 7:08 ` Yi Liu
2022-12-01 12:48 ` Jason Gunthorpe
2022-11-24 12:26 ` [RFC v2 06/11] vfio: Move device open/close code to be helpfers Yi Liu
2022-11-28 8:21 ` Tian, Kevin
2022-11-28 9:27 ` Yi Liu
2022-11-24 12:26 ` [RFC v2 07/11] vfio: Swap order of vfio_device_container_register() and open_device() Yi Liu
2022-11-28 8:27 ` Tian, Kevin
2022-11-28 9:28 ` Yi Liu
2022-11-24 12:26 ` [RFC v2 08/11] vfio: Refactor vfio_device_first_open() and _last_close() Yi Liu
2022-11-24 14:56 ` Jason Gunthorpe
2022-11-25 8:57 ` Yi Liu
2022-11-25 12:37 ` Jason Gunthorpe
2022-11-25 14:06 ` Yi Liu
2022-11-25 14:15 ` Jason Gunthorpe
2022-11-25 14:33 ` Yi Liu
2022-11-28 8:31 ` Tian, Kevin
2022-11-24 12:27 ` [RFC v2 09/11] vfio: Wrap vfio group module init/clean code into helpers Yi Liu
2022-11-28 8:36 ` Tian, Kevin
2022-11-24 12:27 ` [RFC v2 10/11] vfio: Refactor dma APIs for emulated devices Yi Liu
2022-11-28 8:42 ` Tian, Kevin
2022-11-24 12:27 ` [RFC v2 11/11] vfio: Move vfio group specific code into group.c Yi Liu
2022-11-24 14:36 ` Jason Gunthorpe
2022-11-25 8:45 ` Yi Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).