* [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
2020-11-20 18:07 [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Eric Farman
@ 2020-11-20 18:07 ` Eric Farman
2020-11-24 9:57 ` Cornelia Huck
2020-12-02 20:28 ` Alex Williamson
2020-11-20 18:07 ` [PATCH v2 2/2] vfio-ccw: Wire in the request callback Eric Farman
2020-11-24 10:00 ` [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Cornelia Huck
2 siblings, 2 replies; 7+ messages in thread
From: Eric Farman @ 2020-11-20 18:07 UTC (permalink / raw)
To: Cornelia Huck, Kirti Wankhede, Alex Williamson
Cc: Halil Pasic, Matthew Rosato, linux-s390, kvm, Eric Farman
While performing some destructive tests with vfio-ccw, where the
paths to a device are forcible removed and thus the device itself
is unreachable, it is rather easy to end up in an endless loop in
vfio_del_group_dev() due to the lack of a request callback for the
associated device.
In this example, one MDEV (77c) is used by a guest, while another
(77b) is not. The symptom is that the iommu is detached from the
mdev for 77b, but not 77c, until that guest is shutdown:
[ 238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
[ 238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
[ 238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
[ 238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
...silence...
Let's wire in the request call back to the mdev device, so that a
device being physically removed from the host can be (gracefully?)
handled by the parent device at the time the device is removed.
Add a message when registering the device if a driver doesn't
provide this callback, so a clue is given that this same loop
may be encountered in a similar situation.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
drivers/vfio/mdev/mdev_core.c | 4 ++++
drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
include/linux/mdev.h | 4 ++++
3 files changed, 18 insertions(+)
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b558d4cfd082..6de97d25a3f8 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -154,6 +154,10 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
if (!dev)
return -EINVAL;
+ /* Not mandatory, but its absence could be a problem */
+ if (!ops->request)
+ dev_info(dev, "Driver cannot be asked to release device\n");
+
mutex_lock(&parent_list_lock);
/* Check for duplicate */
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 30964a4e0a28..06d8fc4a6d72 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -98,6 +98,15 @@ static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
return parent->ops->mmap(mdev, vma);
}
+static void vfio_mdev_request(void *device_data, unsigned int count)
+{
+ struct mdev_device *mdev = device_data;
+ struct mdev_parent *parent = mdev->parent;
+
+ if (parent->ops->request)
+ parent->ops->request(mdev, count);
+}
+
static const struct vfio_device_ops vfio_mdev_dev_ops = {
.name = "vfio-mdev",
.open = vfio_mdev_open,
@@ -106,6 +115,7 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
.read = vfio_mdev_read,
.write = vfio_mdev_write,
.mmap = vfio_mdev_mmap,
+ .request = vfio_mdev_request,
};
static int vfio_mdev_probe(struct device *dev)
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 0ce30ca78db0..9004375c462e 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -72,6 +72,9 @@ struct device *mdev_get_iommu_device(struct device *dev);
* @mmap: mmap callback
* @mdev: mediated device structure
* @vma: vma structure
+ * @request: request callback to release device
+ * @mdev: mediated device structure
+ * @count: request sequence number
* Parent device that support mediated device should be registered with mdev
* module with mdev_parent_ops structure.
**/
@@ -92,6 +95,7 @@ struct mdev_parent_ops {
long (*ioctl)(struct mdev_device *mdev, unsigned int cmd,
unsigned long arg);
int (*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
+ void (*request)(struct mdev_device *mdev, unsigned int count);
};
/* interface for exporting mdev supported type attributes */
--
2.17.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
@ 2020-11-24 9:57 ` Cornelia Huck
2020-12-02 20:28 ` Alex Williamson
1 sibling, 0 replies; 7+ messages in thread
From: Cornelia Huck @ 2020-11-24 9:57 UTC (permalink / raw)
To: Eric Farman
Cc: Kirti Wankhede, Alex Williamson, Halil Pasic, Matthew Rosato,
linux-s390, kvm
On Fri, 20 Nov 2020 19:07:39 +0100
Eric Farman <farman@linux.ibm.com> wrote:
> While performing some destructive tests with vfio-ccw, where the
> paths to a device are forcible removed and thus the device itself
> is unreachable, it is rather easy to end up in an endless loop in
> vfio_del_group_dev() due to the lack of a request callback for the
> associated device.
>
> In this example, one MDEV (77c) is used by a guest, while another
> (77b) is not. The symptom is that the iommu is detached from the
> mdev for 77b, but not 77c, until that guest is shutdown:
>
> [ 238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
> [ 238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
> [ 238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
> [ 238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
> ...silence...
>
> Let's wire in the request call back to the mdev device, so that a
> device being physically removed from the host can be (gracefully?)
> handled by the parent device at the time the device is removed.
>
> Add a message when registering the device if a driver doesn't
> provide this callback, so a clue is given that this same loop
> may be encountered in a similar situation.
>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> drivers/vfio/mdev/mdev_core.c | 4 ++++
> drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
> include/linux/mdev.h | 4 ++++
> 3 files changed, 18 insertions(+)
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
2020-11-24 9:57 ` Cornelia Huck
@ 2020-12-02 20:28 ` Alex Williamson
2020-12-03 3:02 ` Eric Farman
1 sibling, 1 reply; 7+ messages in thread
From: Alex Williamson @ 2020-12-02 20:28 UTC (permalink / raw)
To: Eric Farman
Cc: Cornelia Huck, Kirti Wankhede, Halil Pasic, Matthew Rosato,
linux-s390, kvm
On Fri, 20 Nov 2020 19:07:39 +0100
Eric Farman <farman@linux.ibm.com> wrote:
> While performing some destructive tests with vfio-ccw, where the
> paths to a device are forcible removed and thus the device itself
> is unreachable, it is rather easy to end up in an endless loop in
> vfio_del_group_dev() due to the lack of a request callback for the
> associated device.
>
> In this example, one MDEV (77c) is used by a guest, while another
> (77b) is not. The symptom is that the iommu is detached from the
> mdev for 77b, but not 77c, until that guest is shutdown:
>
> [ 238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
> [ 238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
> [ 238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
> [ 238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
> ...silence...
>
> Let's wire in the request call back to the mdev device, so that a
> device being physically removed from the host can be (gracefully?)
> handled by the parent device at the time the device is removed.
>
> Add a message when registering the device if a driver doesn't
> provide this callback, so a clue is given that this same loop
> may be encountered in a similar situation.
>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> drivers/vfio/mdev/mdev_core.c | 4 ++++
> drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
> include/linux/mdev.h | 4 ++++
> 3 files changed, 18 insertions(+)
>
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b558d4cfd082..6de97d25a3f8 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -154,6 +154,10 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
> if (!dev)
> return -EINVAL;
>
> + /* Not mandatory, but its absence could be a problem */
> + if (!ops->request)
> + dev_info(dev, "Driver cannot be asked to release device\n");
> +
> mutex_lock(&parent_list_lock);
>
> /* Check for duplicate */
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index 30964a4e0a28..06d8fc4a6d72 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -98,6 +98,15 @@ static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
> return parent->ops->mmap(mdev, vma);
> }
>
> +static void vfio_mdev_request(void *device_data, unsigned int count)
> +{
> + struct mdev_device *mdev = device_data;
> + struct mdev_parent *parent = mdev->parent;
> +
> + if (parent->ops->request)
> + parent->ops->request(mdev, count);
What do you think about duplicating the count==0 notice in the else
case here? ie.
else if (count == 0)
dev_notice(mdev_dev(mdev), "No mdev vendor driver request callback support, blocked until released by user\n");
This at least puts something in the log a bit closer to the timeframe
of a possible issue versus the registration nag. vfio-core could do
this too, but vfio-mdev registers a request callback on behalf of all
mdev devices, so vfio-core would no longer have visibility for this
case.
Otherwise this series looks fine to me and I can take it through the
vfio tree. Thanks,
Alex
> +}
> +
> static const struct vfio_device_ops vfio_mdev_dev_ops = {
> .name = "vfio-mdev",
> .open = vfio_mdev_open,
> @@ -106,6 +115,7 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
> .read = vfio_mdev_read,
> .write = vfio_mdev_write,
> .mmap = vfio_mdev_mmap,
> + .request = vfio_mdev_request,
> };
>
> static int vfio_mdev_probe(struct device *dev)
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index 0ce30ca78db0..9004375c462e 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -72,6 +72,9 @@ struct device *mdev_get_iommu_device(struct device *dev);
> * @mmap: mmap callback
> * @mdev: mediated device structure
> * @vma: vma structure
> + * @request: request callback to release device
> + * @mdev: mediated device structure
> + * @count: request sequence number
> * Parent device that support mediated device should be registered with mdev
> * module with mdev_parent_ops structure.
> **/
> @@ -92,6 +95,7 @@ struct mdev_parent_ops {
> long (*ioctl)(struct mdev_device *mdev, unsigned int cmd,
> unsigned long arg);
> int (*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
> + void (*request)(struct mdev_device *mdev, unsigned int count);
> };
>
> /* interface for exporting mdev supported type attributes */
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
2020-12-02 20:28 ` Alex Williamson
@ 2020-12-03 3:02 ` Eric Farman
0 siblings, 0 replies; 7+ messages in thread
From: Eric Farman @ 2020-12-03 3:02 UTC (permalink / raw)
To: Alex Williamson
Cc: Cornelia Huck, Kirti Wankhede, Halil Pasic, Matthew Rosato,
linux-s390, kvm
On 12/2/20 3:28 PM, Alex Williamson wrote:
> On Fri, 20 Nov 2020 19:07:39 +0100
> Eric Farman <farman@linux.ibm.com> wrote:
>
>> While performing some destructive tests with vfio-ccw, where the
>> paths to a device are forcible removed and thus the device itself
>> is unreachable, it is rather easy to end up in an endless loop in
>> vfio_del_group_dev() due to the lack of a request callback for the
>> associated device.
>>
>> In this example, one MDEV (77c) is used by a guest, while another
>> (77b) is not. The symptom is that the iommu is detached from the
>> mdev for 77b, but not 77c, until that guest is shutdown:
>>
>> [ 238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
>> [ 238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
>> [ 238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
>> [ 238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
>> ...silence...
>>
>> Let's wire in the request call back to the mdev device, so that a
>> device being physically removed from the host can be (gracefully?)
>> handled by the parent device at the time the device is removed.
>>
>> Add a message when registering the device if a driver doesn't
>> provide this callback, so a clue is given that this same loop
>> may be encountered in a similar situation.
>>
>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>> ---
>> drivers/vfio/mdev/mdev_core.c | 4 ++++
>> drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
>> include/linux/mdev.h | 4 ++++
>> 3 files changed, 18 insertions(+)
>>
>> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
>> index b558d4cfd082..6de97d25a3f8 100644
>> --- a/drivers/vfio/mdev/mdev_core.c
>> +++ b/drivers/vfio/mdev/mdev_core.c
>> @@ -154,6 +154,10 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
>> if (!dev)
>> return -EINVAL;
>>
>> + /* Not mandatory, but its absence could be a problem */
>> + if (!ops->request)
>> + dev_info(dev, "Driver cannot be asked to release device\n");
>> +
>> mutex_lock(&parent_list_lock);
>>
>> /* Check for duplicate */
>> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
>> index 30964a4e0a28..06d8fc4a6d72 100644
>> --- a/drivers/vfio/mdev/vfio_mdev.c
>> +++ b/drivers/vfio/mdev/vfio_mdev.c
>> @@ -98,6 +98,15 @@ static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
>> return parent->ops->mmap(mdev, vma);
>> }
>>
>> +static void vfio_mdev_request(void *device_data, unsigned int count)
>> +{
>> + struct mdev_device *mdev = device_data;
>> + struct mdev_parent *parent = mdev->parent;
>> +
>> + if (parent->ops->request)
>> + parent->ops->request(mdev, count);
>
> What do you think about duplicating the count==0 notice in the else
> case here? ie.
>
> else if (count == 0)
> dev_notice(mdev_dev(mdev), "No mdev vendor driver request callback support, blocked until released by user\n");
>
I'm fine with that. If there are no objections, I should be able to spin
a v3 with such a change tomorrow.
Thank you!
Eric
> This at least puts something in the log a bit closer to the timeframe
> of a possible issue versus the registration nag. vfio-core could do
> this too, but vfio-mdev registers a request callback on behalf of all
> mdev devices, so vfio-core would no longer have visibility for this
> case.
>
> Otherwise this series looks fine to me and I can take it through the
> vfio tree. Thanks,
>
> Alex
>
>> +}
>> +
>> static const struct vfio_device_ops vfio_mdev_dev_ops = {
>> .name = "vfio-mdev",
>> .open = vfio_mdev_open,
>> @@ -106,6 +115,7 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
>> .read = vfio_mdev_read,
>> .write = vfio_mdev_write,
>> .mmap = vfio_mdev_mmap,
>> + .request = vfio_mdev_request,
>> };
>>
>> static int vfio_mdev_probe(struct device *dev)
>> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
>> index 0ce30ca78db0..9004375c462e 100644
>> --- a/include/linux/mdev.h
>> +++ b/include/linux/mdev.h
>> @@ -72,6 +72,9 @@ struct device *mdev_get_iommu_device(struct device *dev);
>> * @mmap: mmap callback
>> * @mdev: mediated device structure
>> * @vma: vma structure
>> + * @request: request callback to release device
>> + * @mdev: mediated device structure
>> + * @count: request sequence number
>> * Parent device that support mediated device should be registered with mdev
>> * module with mdev_parent_ops structure.
>> **/
>> @@ -92,6 +95,7 @@ struct mdev_parent_ops {
>> long (*ioctl)(struct mdev_device *mdev, unsigned int cmd,
>> unsigned long arg);
>> int (*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
>> + void (*request)(struct mdev_device *mdev, unsigned int count);
>> };
>>
>> /* interface for exporting mdev supported type attributes */
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 2/2] vfio-ccw: Wire in the request callback
2020-11-20 18:07 [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Eric Farman
2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
@ 2020-11-20 18:07 ` Eric Farman
2020-11-24 10:00 ` [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Cornelia Huck
2 siblings, 0 replies; 7+ messages in thread
From: Eric Farman @ 2020-11-20 18:07 UTC (permalink / raw)
To: Cornelia Huck, Kirti Wankhede, Alex Williamson
Cc: Halil Pasic, Matthew Rosato, linux-s390, kvm, Eric Farman
The device is being unplugged, so pass the request to userspace to
ask for a graceful cleanup. This should free up the thread that
would otherwise loop waiting for the device to be fully released.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
---
drivers/s390/cio/vfio_ccw_ops.c | 26 ++++++++++++++++++++++++++
drivers/s390/cio/vfio_ccw_private.h | 4 ++++
include/uapi/linux/vfio.h | 1 +
3 files changed, 31 insertions(+)
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 8b3ed5b45277..68106be4ba7a 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -394,6 +394,7 @@ static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
switch (info->index) {
case VFIO_CCW_IO_IRQ_INDEX:
case VFIO_CCW_CRW_IRQ_INDEX:
+ case VFIO_CCW_REQ_IRQ_INDEX:
info->count = 1;
info->flags = VFIO_IRQ_INFO_EVENTFD;
break;
@@ -424,6 +425,9 @@ static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
case VFIO_CCW_CRW_IRQ_INDEX:
ctx = &private->crw_trigger;
break;
+ case VFIO_CCW_REQ_IRQ_INDEX:
+ ctx = &private->req_trigger;
+ break;
default:
return -EINVAL;
}
@@ -607,6 +611,27 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
}
}
+/* Request removal of the device*/
+static void vfio_ccw_mdev_request(struct mdev_device *mdev, unsigned int count)
+{
+ struct vfio_ccw_private *private = dev_get_drvdata(mdev_parent_dev(mdev));
+
+ if (!private)
+ return;
+
+ if (private->req_trigger) {
+ if (!(count % 10))
+ dev_notice_ratelimited(mdev_dev(private->mdev),
+ "Relaying device request to user (#%u)\n",
+ count);
+
+ eventfd_signal(private->req_trigger, 1);
+ } else if (count == 0) {
+ dev_notice(mdev_dev(private->mdev),
+ "No device request channel registered, blocked until released by user\n");
+ }
+}
+
static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
.owner = THIS_MODULE,
.supported_type_groups = mdev_type_groups,
@@ -617,6 +642,7 @@ static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
.read = vfio_ccw_mdev_read,
.write = vfio_ccw_mdev_write,
.ioctl = vfio_ccw_mdev_ioctl,
+ .request = vfio_ccw_mdev_request,
};
int vfio_ccw_mdev_reg(struct subchannel *sch)
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 8723156b29ea..b2c762eb42b9 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -84,7 +84,10 @@ struct vfio_ccw_crw {
* @irb: irb info received from interrupt
* @scsw: scsw info
* @io_trigger: eventfd ctx for signaling userspace I/O results
+ * @crw_trigger: eventfd ctx for signaling userspace CRW information
+ * @req_trigger: eventfd ctx for signaling userspace to return device
* @io_work: work for deferral process of I/O handling
+ * @crw_work: work for deferral process of CRW handling
*/
struct vfio_ccw_private {
struct subchannel *sch;
@@ -108,6 +111,7 @@ struct vfio_ccw_private {
struct eventfd_ctx *io_trigger;
struct eventfd_ctx *crw_trigger;
+ struct eventfd_ctx *req_trigger;
struct work_struct io_work;
struct work_struct crw_work;
} __aligned(8);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 2f313a238a8f..d1812777139f 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -820,6 +820,7 @@ enum {
enum {
VFIO_CCW_IO_IRQ_INDEX,
VFIO_CCW_CRW_IRQ_INDEX,
+ VFIO_CCW_REQ_IRQ_INDEX,
VFIO_CCW_NUM_IRQS
};
--
2.17.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw
2020-11-20 18:07 [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Eric Farman
2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
2020-11-20 18:07 ` [PATCH v2 2/2] vfio-ccw: Wire in the request callback Eric Farman
@ 2020-11-24 10:00 ` Cornelia Huck
2 siblings, 0 replies; 7+ messages in thread
From: Cornelia Huck @ 2020-11-24 10:00 UTC (permalink / raw)
To: Eric Farman
Cc: Kirti Wankhede, Alex Williamson, Halil Pasic, Matthew Rosato,
linux-s390, kvm
On Fri, 20 Nov 2020 19:07:38 +0100
Eric Farman <farman@linux.ibm.com> wrote:
> There is a situation where removing all the paths from a device
> connected via mdev and vfio-ccw can cause some difficulty.
> Using the "chchp -c 0 xx" command to all paths will cause the
> device to be removed from the configuration, and any guest
> filesystem that is relying on that device will encounter errors.
> Interestingly, the last chchp command will actually fail to
> return to a shell prompt, and subsequent commands (another
> chchp to bring the paths back online, chzdev, etc.) will also
> hang because of the outstanding chchp.
>
> The last chchp command drives to vfio_ccw_sch_remove() for every
> affected mediated device, and ultimately enters an infinite loop
> in vfio_del_group_dev(). This loop is broken when the guest goes
> away, which in this case doesn't occur until the guest is shutdown.
> This drives vfio_ccw_mdev_release() and thus vfio_device_release()
> to wake up the vfio_del_group_dev() thread.
>
> There is also a callback mechanism called "request" to ask a
> driver (and perhaps user) to release the device, but this is not
> implemented for mdev. So this adds one to that point, and then
> wire it to vfio-ccw to pass it along to userspace. This will
> gracefully drive the unplug code, and everything behaves nicely.
>
> Despite the testing that was occurring, this doesn't appear related
> to the vfio-ccw channel path handling code. I can reproduce this with
> an older kernel/QEMU, which makes sense because the above behavior is
> driven from the subchannel event codepaths and not the chpid events.
> Because of that, I didn't flag anything with a Fixes tag, since it's
> seemingly been this way forever.
Both patches look good to me.
Which would be the best way to merge this? Via vfio or via vfio-ccw?
>
> RFC->V2:
> - Patch 1
> - Added a message when registering a device without a request callback
> - Changed the "if(!callback) return" to "if(callback) do" layout
> - Removed "unlikely" from "if(callback)" logic
> - Clarified some wording in the device ops struct commentary
> - Patch 2
> - Added Conny's r-b
>
> Eric Farman (2):
> vfio-mdev: Wire in a request handler for mdev parent
> vfio-ccw: Wire in the request callback
>
> drivers/s390/cio/vfio_ccw_ops.c | 26 ++++++++++++++++++++++++++
> drivers/s390/cio/vfio_ccw_private.h | 4 ++++
> drivers/vfio/mdev/mdev_core.c | 4 ++++
> drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
> include/linux/mdev.h | 4 ++++
> include/uapi/linux/vfio.h | 1 +
> 6 files changed, 49 insertions(+)
>
^ permalink raw reply [flat|nested] 7+ messages in thread