[PATCH v3 0/5] Error recovery for zPCI passthrough devices

* [PATCH v3 0/5] Error recovery for zPCI passthrough devices
@ 2025-09-25 17:48 Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 1/5] [NOTFORMERGE] linux-headers: Update for zpci vfio device Farhan Ali
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Farhan Ali @ 2025-09-25 17:48 UTC (permalink / raw)
  To: qemu-devel, qemu-s390x; +Cc: mjrosato, alifm, thuth, alex.williamson, clg

Hi,

This patch series introduces error recovery support for passthrough
PCI devices on System Z (s390x). It is the user space component of the Linux
kernel patches [1]. When QEMU receives an eventfd notification for a PCI error
from the vfio-pci driver, it calls the vfio error handler. A per-device error
handler callback can be used to override the default vfio error handler.

For the s390x-specific error handler, we retrieve the architecture-specific
PCI error information and inject it into the guest. Once the guest receives
the error information, the guest drivers drive the error recovery.
Typically, recovery involves a device reset, which translates to a CLP
disable/enable cycle for the device.
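
The override mechanism described above can be pictured with a minimal,
self-contained sketch. Every name in it (DemoDevice, demo_dispatch_error,
demo_s390_handler, ...) is a hypothetical stand-in chosen for illustration,
not a QEMU type or function; the series itself hooks into
vfio_err_notifier_handler() and VFIOPCIDevice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a passthrough device; not a QEMU type. */
typedef struct DemoDevice DemoDevice;
struct DemoDevice {
    const char *name;
    /* Optional per-device override; NULL means "use the default". */
    bool (*err_handler)(DemoDevice *dev);
    int injected;   /* errors forwarded to the guest */
    int stopped;    /* times the default path stopped the guest */
};

/* Default policy: contain the error by stopping the guest. */
static void demo_default_handler(DemoDevice *dev)
{
    dev->stopped++;
}

/* Architecture-specific override: forward the error to the guest. */
static bool demo_s390_handler(DemoDevice *dev)
{
    dev->injected++;
    return true;    /* handled, so the default path is skipped */
}

/* Called when the error eventfd fires for this device. */
static void demo_dispatch_error(DemoDevice *dev)
{
    assert(dev != NULL);
    if (dev->err_handler && dev->err_handler(dev)) {
        return;
    }
    demo_default_handler(dev);
}
```

A device without a callback takes the default path, and a device with
demo_s390_handler installed only reaches it if the callback returns false.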

I would appreciate some feedback on this patch series.

Thanks
Farhan

[1] https://lore.kernel.org/all/20250924171628.826-1-alifm@linux.ibm.com/

ChangeLog
---------
v2 https://lore.kernel.org/qemu-devel/20250825212434.2255-1-alifm@linux.ibm.com/
v2 -> v3
    - Update arch_err_handler to err_handler and include Error ** in
    function definition. (patch 2)

    - Introduce helper function to hide the internal indirection of device_feature()
    (patch 3)

    - Update function definitions to include Error ** (patch 4)
    


v1 https://lore.kernel.org/qemu-devel/20250813174152.1238-1-alifm@linux.ibm.com/
v1 -> v2
   - Use VFIO_DEVICE_FEATURE ioctl to get device error information.
   (Based on Alex's feedback on kernel series)

Farhan Ali (5):
  [NOTFORMERGE] linux-headers: Update for zpci vfio device
  vfio/pci: Add an error handler callback
  vfio: Introduce vfio_device_feature helper function
  s390x/pci: Add PCI error handling for vfio pci devices
  s390x/pci: Reset a device in error state

 hw/s390x/s390-pci-bus.c          | 16 ++++++
 hw/s390x/s390-pci-vfio.c         | 87 ++++++++++++++++++++++++++++++++
 hw/vfio/device.c                 |  6 +++
 hw/vfio/pci.c                    |  8 +++
 hw/vfio/pci.h                    |  1 +
 include/hw/s390x/s390-pci-bus.h  |  1 +
 include/hw/s390x/s390-pci-vfio.h |  8 +++
 include/hw/vfio/vfio-device.h    |  2 +
 linux-headers/linux/vfio.h       | 15 ++++++
 9 files changed, 144 insertions(+)

-- 
2.43.0




* [PATCH v3 1/5] [NOTFORMERGE] linux-headers: Update for zpci vfio device
  2025-09-25 17:48 [PATCH v3 0/5] Error recovery for zPCI passthrough devices Farhan Ali
@ 2025-09-25 17:48 ` Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 2/5] vfio/pci: Add an error handler callback Farhan Ali
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Farhan Ali @ 2025-09-25 17:48 UTC (permalink / raw)
  To: qemu-devel, qemu-s390x; +Cc: mjrosato, alifm, thuth, alex.williamson, clg

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 linux-headers/linux/vfio.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 79bf8c0cc5..2918080ad9 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -1468,6 +1468,21 @@ struct vfio_device_feature_bus_master {
 };
 #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
 
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_ERROR feature provides PCI error information to
+ * userspace for vfio-pci devices on s390x. On s390x, PCI error recovery
+ * involves platform firmware, and notification to the operating system is
+ * done by an architecture-specific mechanism.  Exposing this information to
+ * userspace allows it to take appropriate actions to handle an error on the
+ * device.
+ */
+struct vfio_device_feature_zpci_err {
+        __u16 pec;
+        __u8 pending_errors;
+        __u8 pad;
+};
+#define VFIO_DEVICE_FEATURE_ZPCI_ERROR 11
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
-- 
2.43.0
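
For orientation, here is a rough sketch of how user space might build a GET
request for the feature added above. It assumes the usual VFIO_DEVICE_FEATURE
calling convention (a header with argsz and flags, followed by the payload).
The demo_* types and macros are local mirrors written for this sketch so it
compiles on its own; real code would include <linux/vfio.h>, and the feature
number 11 is only the value proposed in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Local mirrors of the uAPI layout so the sketch compiles on its own.
 * Real code would include <linux/vfio.h> instead.
 */
struct demo_vfio_device_feature {
    uint32_t argsz;
    uint32_t flags;
    uint8_t data[];
};

struct demo_zpci_err {          /* layout taken from the patch above */
    uint16_t pec;
    uint8_t pending_errors;
    uint8_t pad;
};

#define DEMO_FEATURE_GET        (1u << 16)  /* VFIO_DEVICE_FEATURE_GET */
#define DEMO_FEATURE_ZPCI_ERROR 11u         /* value proposed in the patch */

/*
 * Allocate and fill a GET request for the zPCI error feature.
 * The caller would pass the buffer to ioctl(fd, VFIO_DEVICE_FEATURE, req)
 * and then read the result from req->data. Returns NULL on allocation
 * failure; the caller frees the buffer.
 */
static struct demo_vfio_device_feature *demo_build_zpci_err_request(void)
{
    size_t size = sizeof(struct demo_vfio_device_feature) +
                  sizeof(struct demo_zpci_err);
    struct demo_vfio_device_feature *req = calloc(1, size);

    if (!req) {
        return NULL;
    }
    req->argsz = (uint32_t)size;
    req->flags = DEMO_FEATURE_GET | DEMO_FEATURE_ZPCI_ERROR;
    return req;
}

/* Copy the payload out of a completed request. */
static struct demo_zpci_err demo_read_zpci_err(
    const struct demo_vfio_device_feature *req)
{
    struct demo_zpci_err err;

    assert(req->argsz >= sizeof(*req) + sizeof(err));
    memcpy(&err, req->data, sizeof(err));
    return err;
}
```

The sketch stops short of the ioctl itself, since that needs an open vfio
device file descriptor.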




* [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-25 17:48 [PATCH v3 0/5] Error recovery for zPCI passthrough devices Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 1/5] [NOTFORMERGE] linux-headers: Update for zpci vfio device Farhan Ali
@ 2025-09-25 17:48 ` Farhan Ali
  2025-09-26  4:57   ` Markus Armbruster
  2025-09-25 17:48 ` [PATCH v3 3/5] vfio: Introduce vfio_device_feature helper function Farhan Ali
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Farhan Ali @ 2025-09-25 17:48 UTC (permalink / raw)
  To: qemu-devel, qemu-s390x; +Cc: mjrosato, alifm, thuth, alex.williamson, clg

Provide a vfio error handling callback that devices can use to
handle PCI errors for passthrough devices.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 hw/vfio/pci.c | 8 ++++++++
 hw/vfio/pci.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index bc0b4c4d56..b02a974954 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
 static void vfio_err_notifier_handler(void *opaque)
 {
     VFIOPCIDevice *vdev = opaque;
+    Error *err = NULL;
 
     if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
         return;
     }
 
+    if (vdev->err_handler) {
+        if (vdev->err_handler(vdev, &err)) {
+            return;
+        }
+        error_report_err(err);
+    }
+
     /*
      * TBD. Retrieve the error details and decide what action
      * needs to be taken. One of the actions could be to pass
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index e0aef82a89..faadce487c 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -146,6 +146,7 @@ struct VFIOPCIDevice {
     EventNotifier err_notifier;
     EventNotifier req_notifier;
     int (*resetfn)(struct VFIOPCIDevice *);
+    bool (*err_handler)(struct VFIOPCIDevice *, Error **);
     uint32_t vendor_id;
     uint32_t device_id;
     uint32_t sub_vendor_id;
-- 
2.43.0




* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-25 17:48 ` [PATCH v3 2/5] vfio/pci: Add an error handler callback Farhan Ali
@ 2025-09-26  4:57   ` Markus Armbruster
  2025-09-26  7:40     ` Cédric Le Goater
  2025-09-26 17:53     ` Farhan Ali
  0 siblings, 2 replies; 18+ messages in thread
From: Markus Armbruster @ 2025-09-26  4:57 UTC (permalink / raw)
  To: Farhan Ali; +Cc: qemu-devel, qemu-s390x, mjrosato, thuth, alex.williamson, clg

Farhan Ali <alifm@linux.ibm.com> writes:

> Provide a vfio error handling callback, that can be used by devices to
> handle PCI errors for passthrough devices.
>
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
>  hw/vfio/pci.c | 8 ++++++++
>  hw/vfio/pci.h | 1 +
>  2 files changed, 9 insertions(+)
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index bc0b4c4d56..b02a974954 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>  static void vfio_err_notifier_handler(void *opaque)
>  {
>      VFIOPCIDevice *vdev = opaque;
> +    Error *err = NULL;
>  
>      if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>          return;
>      }
>  
> +    if (vdev->err_handler) {
> +        if (vdev->err_handler(vdev, &err)) {
> +            return;
> +        }
> +        error_report_err(err);
> +    }

This is unusual.

Functions taking an Error ** argument usually do so to report errors.
The rules spelled out in qapi/error.h apply.  In particular:

 * - On success, the function should not touch *errp.  On failure, it
 *   should set a new error, e.g. with error_setg(errp, ...), or
 *   propagate an existing one, e.g. with error_propagate(errp, ...).
 *
 * - Whenever practical, also return a value that indicates success /
 *   failure.  This can make the error checking more concise, and can
 *   avoid useless error object creation and destruction.  Note that
 *   we still have many functions returning void.  We recommend
 *   • bool-valued functions return true on success / false on failure,

If ->err_handler() behaved that way, @err would be null after it
returns false.  We'd call error_report_err(NULL), and crash.

Functions with unusual behavior need a contract: a comment spelling out
their behavior.

What is the intended behavior of the err_handler() callback?
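
As an illustration of the qapi/error.h convention quoted above, here is a
minimal stand-alone sketch. DemoError, demo_error_set(), demo_err_handler()
and demo_notifier() are hypothetical stand-ins written for this example; they
only imitate the shape of QEMU's Error API and are not part of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Stand-in for error_setg(): store a new error in *errp if requested. */
static void demo_error_set(DemoError **errp, const char *msg)
{
    if (!errp) {
        return;             /* caller does not want the details */
    }
    assert(*errp == NULL);  /* never overwrite an existing error */
    *errp = calloc(1, sizeof(**errp));
    if (*errp) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

/*
 * Contract: return true on success and leave *errp untouched;
 * return false on failure and set *errp.
 */
static bool demo_err_handler(bool recoverable, DemoError **errp)
{
    if (!recoverable) {
        demo_error_set(errp, "device error could not be handled");
        return false;
    }
    return true;
}

/*
 * Caller: report and free the error only on the failure path, where the
 * contract says it is set. Returns true if the error was handled.
 */
static bool demo_notifier(bool recoverable, char *report, size_t len)
{
    DemoError *err = NULL;

    if (demo_err_handler(recoverable, &err)) {
        assert(err == NULL);
        return true;
    }
    snprintf(report, len, "%s", err ? err->msg : "out of memory");
    free(err);
    return false;           /* caller falls through to the default policy */
}
```

Writing the contract as a comment on the callback, as in demo_err_handler()
here, is one way to make the expected behavior explicit.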

> +
>      /*
>       * TBD. Retrieve the error details and decide what action
>       * needs to be taken. One of the actions could be to pass
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index e0aef82a89..faadce487c 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -146,6 +146,7 @@ struct VFIOPCIDevice {
>      EventNotifier err_notifier;
>      EventNotifier req_notifier;
>      int (*resetfn)(struct VFIOPCIDevice *);
> +    bool (*err_handler)(struct VFIOPCIDevice *, Error **);
>      uint32_t vendor_id;
>      uint32_t device_id;
>      uint32_t sub_vendor_id;


* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-26  4:57   ` Markus Armbruster
@ 2025-09-26  7:40     ` Cédric Le Goater
  2025-09-26 18:44       ` Farhan Ali
  2025-09-26 17:53     ` Farhan Ali
  1 sibling, 1 reply; 18+ messages in thread
From: Cédric Le Goater @ 2025-09-26  7:40 UTC (permalink / raw)
  To: Markus Armbruster, Farhan Ali
  Cc: qemu-devel, qemu-s390x, mjrosato, thuth, alex.williamson

On 9/26/25 06:57, Markus Armbruster wrote:
> Farhan Ali <alifm@linux.ibm.com> writes:
> 
>> Provide a vfio error handling callback, that can be used by devices to
>> handle PCI errors for passthrough devices.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>>   hw/vfio/pci.c | 8 ++++++++
>>   hw/vfio/pci.h | 1 +
>>   2 files changed, 9 insertions(+)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index bc0b4c4d56..b02a974954 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>   static void vfio_err_notifier_handler(void *opaque)
>>   {
>>       VFIOPCIDevice *vdev = opaque;
>> +    Error *err = NULL;
>>   
>>       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>           return;
>>       }
>>   
>> +    if (vdev->err_handler) {
>> +        if (vdev->err_handler(vdev, &err)) {
>> +            return;
>> +        }
>> +        error_report_err(err);
>> +    }
> 
> This is unusual.

and the compiler complains:

../hw/vfio/pci.c: In function ‘vfio_err_notifier_handler’:
../hw/vfio/pci.c:3076:9: error: dangling pointer to ‘err’ may be used [-Werror=dangling-pointer=]
  3076 |         error_report_err(err);
       |         ^~~~~~~~~~~~~~~~~~~~~
../hw/vfio/pci.c:3066:12: note: ‘err’ declared here
  3066 |     Error *err = NULL;
       |            ^~~
cc1: all warnings being treated as errors


C.


> 
> Functions taking an Error ** argument usually do so to report errors.
> The rules spelled out in qapi/error.h apply.  In particular:
> 
>   * - On success, the function should not touch *errp.  On failure, it
>   *   should set a new error, e.g. with error_setg(errp, ...), or
>   *   propagate an existing one, e.g. with error_propagate(errp, ...).
>   *
>   * - Whenever practical, also return a value that indicates success /
>   *   failure.  This can make the error checking more concise, and can
>   *   avoid useless error object creation and destruction.  Note that
>   *   we still have many functions returning void.  We recommend
>   *   • bool-valued functions return true on success / false on failure,
> 
> If ->err_handler() behaved that way, it @err would be null after it
> returns false.  We'd call error_report_err(NULL), and crash.
> 
> Functions with unusual behavior need a contract: a comment spelling out
> their behavior.
> 
> What is the intended behavior of the err_handler() callback?
> 
>> +
>>       /*
>>        * TBD. Retrieve the error details and decide what action
>>        * needs to be taken. One of the actions could be to pass
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index e0aef82a89..faadce487c 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -146,6 +146,7 @@ struct VFIOPCIDevice {
>>       EventNotifier err_notifier;
>>       EventNotifier req_notifier;
>>       int (*resetfn)(struct VFIOPCIDevice *);
>> +    bool (*err_handler)(struct VFIOPCIDevice *, Error **);
>>       uint32_t vendor_id;
>>       uint32_t device_id;
>>       uint32_t sub_vendor_id;
> 




* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-26  7:40     ` Cédric Le Goater
@ 2025-09-26 18:44       ` Farhan Ali
  0 siblings, 0 replies; 18+ messages in thread
From: Farhan Ali @ 2025-09-26 18:44 UTC (permalink / raw)
  To: Cédric Le Goater, Markus Armbruster
  Cc: qemu-devel, qemu-s390x, mjrosato, thuth, alex.williamson


On 9/26/2025 12:40 AM, Cédric Le Goater wrote:
> On 9/26/25 06:57, Markus Armbruster wrote:
>> Farhan Ali <alifm@linux.ibm.com> writes:
>>
>>> Provide a vfio error handling callback, that can be used by devices to
>>> handle PCI errors for passthrough devices.
>>>
>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>> ---
>>>   hw/vfio/pci.c | 8 ++++++++
>>>   hw/vfio/pci.h | 1 +
>>>   2 files changed, 9 insertions(+)
>>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index bc0b4c4d56..b02a974954 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>>   static void vfio_err_notifier_handler(void *opaque)
>>>   {
>>>       VFIOPCIDevice *vdev = opaque;
>>> +    Error *err = NULL;
>>>
>>>       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>           return;
>>>       }
>>>
>>> +    if (vdev->err_handler) {
>>> +        if (vdev->err_handler(vdev, &err)) {
>>> +            return;
>>> +        }
>>> +        error_report_err(err);
>>> +    }
>>
>> This is unusual.
>
> and the compiler complains :
>
> ../hw/vfio/pci.c: In function ‘vfio_err_notifier_handler’:
> ../hw/vfio/pci.c:3076:9: error: dangling pointer to ‘err’ may be used [-Werror=dangling-pointer=]
>  3076 |         error_report_err(err);
>       |         ^~~~~~~~~~~~~~~~~~~~~
> ../hw/vfio/pci.c:3066:12: note: ‘err’ declared here
>  3066 |     Error *err = NULL;
>       |            ^~~
> cc1: all warnings being treated as errors
>
>
> C.

Compiling on s390x didn't trigger this compiler error, but compiling
on x86 indeed did.

Thanks
Farhan


>
>
>>
>> Functions taking an Error ** argument usually do so to report errors.
>> The rules spelled out in qapi/error.h apply.  In particular:
>>
>>   * - On success, the function should not touch *errp.  On failure, it
>>   *   should set a new error, e.g. with error_setg(errp, ...), or
>>   *   propagate an existing one, e.g. with error_propagate(errp, ...).
>>   *
>>   * - Whenever practical, also return a value that indicates success /
>>   *   failure.  This can make the error checking more concise, and can
>>   *   avoid useless error object creation and destruction.  Note that
>>   *   we still have many functions returning void.  We recommend
>>   *   • bool-valued functions return true on success / false on failure,
>>
>> If ->err_handler() behaved that way, it @err would be null after it
>> returns false.  We'd call error_report_err(NULL), and crash.
>>
>> Functions with unusual behavior need a contract: a comment spelling out
>> their behavior.
>>
>> What is the intended behavior of the err_handler() callback?
>>
>>> +
>>>       /*
>>>        * TBD. Retrieve the error details and decide what action
>>>        * needs to be taken. One of the actions could be to pass
>>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>>> index e0aef82a89..faadce487c 100644
>>> --- a/hw/vfio/pci.h
>>> +++ b/hw/vfio/pci.h
>>> @@ -146,6 +146,7 @@ struct VFIOPCIDevice {
>>>       EventNotifier err_notifier;
>>>       EventNotifier req_notifier;
>>>       int (*resetfn)(struct VFIOPCIDevice *);
>>> +    bool (*err_handler)(struct VFIOPCIDevice *, Error **);
>>>       uint32_t vendor_id;
>>>       uint32_t device_id;
>>>       uint32_t sub_vendor_id;
>>
>
>



* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-26  4:57   ` Markus Armbruster
  2025-09-26  7:40     ` Cédric Le Goater
@ 2025-09-26 17:53     ` Farhan Ali
  2025-09-27  5:59       ` Markus Armbruster
  1 sibling, 1 reply; 18+ messages in thread
From: Farhan Ali @ 2025-09-26 17:53 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-devel, qemu-s390x, mjrosato, thuth, alex.williamson, clg


On 9/25/2025 9:57 PM, Markus Armbruster wrote:
> Farhan Ali <alifm@linux.ibm.com> writes:
>
>> Provide a vfio error handling callback, that can be used by devices to
>> handle PCI errors for passthrough devices.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>>   hw/vfio/pci.c | 8 ++++++++
>>   hw/vfio/pci.h | 1 +
>>   2 files changed, 9 insertions(+)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index bc0b4c4d56..b02a974954 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>   static void vfio_err_notifier_handler(void *opaque)
>>   {
>>       VFIOPCIDevice *vdev = opaque;
>> +    Error *err = NULL;
>>   
>>       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>           return;
>>       }
>>   
>> +    if (vdev->err_handler) {
>> +        if (vdev->err_handler(vdev, &err)) {
>> +            return;
>> +        }
>> +        error_report_err(err);
>> +    }
> This is unusual.
>
> Functions taking an Error ** argument usually do so to report errors.
> The rules spelled out in qapi/error.h apply.  In particular:
>
>   * - On success, the function should not touch *errp.  On failure, it
>   *   should set a new error, e.g. with error_setg(errp, ...), or
>   *   propagate an existing one, e.g. with error_propagate(errp, ...).
>   *
>   * - Whenever practical, also return a value that indicates success /
>   *   failure.  This can make the error checking more concise, and can
>   *   avoid useless error object creation and destruction.  Note that
>   *   we still have many functions returning void.  We recommend
>   *   • bool-valued functions return true on success / false on failure,
>
> If ->err_handler() behaved that way, it @err would be null after it
> returns false.  We'd call error_report_err(NULL), and crash.
>
> Functions with unusual behavior need a contract: a comment spelling out
> their behavior.
>
> What is the intended behavior of the err_handler() callback?

Hi Markus,

Thanks for reviewing! The intended behavior for err_handler() is to set
errp and report the error on false/failure. With the above code, I also
intended to fall through to vm_stop() when err_handler() fails.

I think I misunderstood the errp error handling. It seems the
correct way to do what I intended would be:

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index b02a974954..630de46c90 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3070,10 +3070,11 @@ static void vfio_err_notifier_handler(void *opaque)
      }

      if (vdev->err_handler) {
-        if (vdev->err_handler(vdev, &err)) {
+        if (!vdev->err_handler(vdev, &err)) {
+            error_report_err(err);
+        } else {
              return;
          }
-        error_report_err(err);
      }

Please correct me if I missed anything.

Thanks
Farhan

>
>> +
>>       /*
>>        * TBD. Retrieve the error details and decide what action
>>        * needs to be taken. One of the actions could be to pass
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index e0aef82a89..faadce487c 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -146,6 +146,7 @@ struct VFIOPCIDevice {
>>       EventNotifier err_notifier;
>>       EventNotifier req_notifier;
>>       int (*resetfn)(struct VFIOPCIDevice *);
>> +    bool (*err_handler)(struct VFIOPCIDevice *, Error **);
>>       uint32_t vendor_id;
>>       uint32_t device_id;
>>       uint32_t sub_vendor_id;
>



* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-26 17:53     ` Farhan Ali
@ 2025-09-27  5:59       ` Markus Armbruster
  2025-09-27  7:05         ` Cédric Le Goater
  0 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2025-09-27  5:59 UTC (permalink / raw)
  To: Farhan Ali; +Cc: qemu-devel, qemu-s390x, mjrosato, thuth, alex.williamson, clg

Farhan Ali <alifm@linux.ibm.com> writes:

> On 9/25/2025 9:57 PM, Markus Armbruster wrote:
>> Farhan Ali <alifm@linux.ibm.com> writes:
>>
>>> Provide a vfio error handling callback, that can be used by devices to
>>> handle PCI errors for passthrough devices.
>>>
>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>> ---
>>>   hw/vfio/pci.c | 8 ++++++++
>>>   hw/vfio/pci.h | 1 +
>>>   2 files changed, 9 insertions(+)
>>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index bc0b4c4d56..b02a974954 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>>  static void vfio_err_notifier_handler(void *opaque)
>>>  {
>>>      VFIOPCIDevice *vdev = opaque;
>>> +    Error *err = NULL;
>>>
>>>      if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>          return;
>>>      }
>>>
>>> +    if (vdev->err_handler) {
>>> +        if (vdev->err_handler(vdev, &err)) {
>>> +            return;
>>> +        }
>>> +        error_report_err(err);
>>> +    }
>>
>> This is unusual.
>>
>> Functions taking an Error ** argument usually do so to report errors.
>> The rules spelled out in qapi/error.h apply.  In particular:
>>
>>   * - On success, the function should not touch *errp.  On failure, it
>>   *   should set a new error, e.g. with error_setg(errp, ...), or
>>   *   propagate an existing one, e.g. with error_propagate(errp, ...).
>>   *
>>   * - Whenever practical, also return a value that indicates success /
>>   *   failure.  This can make the error checking more concise, and can
>>   *   avoid useless error object creation and destruction.  Note that
>>   *   we still have many functions returning void.  We recommend
>>   *   • bool-valued functions return true on success / false on failure,
>>
>> If ->err_handler() behaved that way, it @err would be null after it
>> returns false.  We'd call error_report_err(NULL), and crash.
>>
>> Functions with unusual behavior need a contract: a comment spelling out
>> their behavior.
>>
>> What is the intended behavior of the err_handler() callback?
>
> Hi Markus,
>
> Thanks for reviewing! The intended behavior for err_handler() is to set errp and report the error on false/failure. With the above code, I also intended fall through to vm_stop() when err_handler() fails.
>
> I think I misunderstood the errp error handling, it seems like the correct way to do what I intended would be
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index b02a974954..630de46c90 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3070,10 +3070,11 @@ static void vfio_err_notifier_handler(void *opaque)
>      }
>
>      if (vdev->err_handler) {
> -        if (vdev->err_handler(vdev, &err)) {
> +        if (!vdev->err_handler(vdev, &err)) {
> +            error_report_err(err);
> +        } else {
>              return;
>          }
> -        error_report_err(err);
>      }
>
> Please correct me if I missed anything.

Resulting function:

   static void vfio_err_notifier_handler(void *opaque)
   {
       VFIOPCIDevice *vdev = opaque;
       Error *err = NULL;

       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
           return;
       }

       if (vdev->err_handler) {
           if (!vdev->err_handler(vdev, &err)) {
               error_report_err(err);
           } else {
               return;
           }
       }

       /*
        * TBD. Retrieve the error details and decide what action
        * needs to be taken. One of the actions could be to pass
        * the error to the guest and have the guest driver recover
        * from the error. This requires that PCIe capabilities be
        * exposed to the guest. For now, we just terminate the
        * guest to contain the error.
        */

       error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);

       vm_stop(RUN_STATE_INTERNAL_ERROR);
   }

Slightly rearranged for clearer control flow:

   static void vfio_err_notifier_handler(void *opaque)
   {
       VFIOPCIDevice *vdev = opaque;
       Error *err = NULL;

       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
           return;
       }

       if (vdev->err_handler) {
           if (vdev->err_handler(vdev, &err)) {
               /* Error successfully handled */
               return;
           }
           error_report_err(err);
       }

       /*
        * TBD. Retrieve the error details and decide what action
        * needs to be taken. One of the actions could be to pass
        * the error to the guest and have the guest driver recover
        * from the error. This requires that PCIe capabilities be
        * exposed to the guest. For now, we just terminate the
        * guest to contain the error.
        */

       error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);

       vm_stop(RUN_STATE_INTERNAL_ERROR);
   }

Questions / issues:

* Is the comment still accurate?

* When ->err_handler() fails, we report the error twice.  Would it make
  sense to combine the two error reports into one?

* Preexisting: the second error message is ugly.

  Error messages should be short and to the point: single phrase, with
  no newline or trailing punctuation.  The "please collect ..." part
  does not belong to the error message proper, it's advice on what to
  do.  Better: report the error, then print advice:

       error_report("%s(%s) Unrecoverable error detected",
                    __func__, vdev->vbasedev.name);
       error_printf("Please collect any data possible and then kill the guest.");

  Including __func__ in an error message is an anti-pattern.  Look at

    vfio_err_notifier_handler(fred) Unrecoverable error detected

  with a user's eyes: "vfio_err_notifier_handler" is programmer
  gobbledygook, the device name "fred" is useful once you realize what
  it is, "Unrecoverable error detected" lacks detail.
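
To make the points in the list above concrete, here is a small stand-alone
sketch of a single combined report: one short message that names the device
and carries the handler's detail, with the advice on a separate line and no
function name. demo_format_report() and the exact wording are hypothetical;
in QEMU the two parts would go through error_report() and error_printf() as
shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one user-facing report: a short message that names the device
 * and carries the handler's detail, followed by advice on its own line.
 * Returns the number of characters snprintf() would have written.
 */
static int demo_format_report(char *buf, size_t len,
                              const char *device, const char *detail)
{
    assert(buf && device && detail);
    return snprintf(buf, len,
                    "vfio device %s: unrecoverable error: %s\n"
                    "Please collect any data possible and then kill the guest.\n",
                    device, detail);
}
```

Folding the handler's detail into the main message avoids reporting the same
failure twice.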

[...]




* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-27  5:59       ` Markus Armbruster
@ 2025-09-27  7:05         ` Cédric Le Goater
  2025-09-29 17:20           ` Farhan Ali
  0 siblings, 1 reply; 18+ messages in thread
From: Cédric Le Goater @ 2025-09-27  7:05 UTC (permalink / raw)
  To: Markus Armbruster, Farhan Ali
  Cc: qemu-devel, qemu-s390x, mjrosato, thuth, alex.williamson

On 9/27/25 07:59, Markus Armbruster wrote:
> Farhan Ali <alifm@linux.ibm.com> writes:
> 
>> On 9/25/2025 9:57 PM, Markus Armbruster wrote:
>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>
>>>> Provide a vfio error handling callback, that can be used by devices to
>>>> handle PCI errors for passthrough devices.
>>>>
>>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>>> ---
>>>>    hw/vfio/pci.c | 8 ++++++++
>>>>    hw/vfio/pci.h | 1 +
>>>>    2 files changed, 9 insertions(+)
>>>>
>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>> index bc0b4c4d56..b02a974954 100644
>>>> --- a/hw/vfio/pci.c
>>>> +++ b/hw/vfio/pci.c
>>>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>>>   static void vfio_err_notifier_handler(void *opaque)
>>>>   {
>>>>       VFIOPCIDevice *vdev = opaque;
>>>> +    Error *err = NULL;
>>>>
>>>>       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>>           return;
>>>>       }
>>>>
>>>> +    if (vdev->err_handler) {
>>>> +        if (vdev->err_handler(vdev, &err)) {
>>>> +            return;
>>>> +        }
>>>> +        error_report_err(err);
>>>> +    }
>>>
>>> This is unusual.
>>>
>>> Functions taking an Error ** argument usually do so to report errors.
>>> The rules spelled out in qapi/error.h apply.  In particular:
>>>
>>>    * - On success, the function should not touch *errp.  On failure, it
>>>    *   should set a new error, e.g. with error_setg(errp, ...), or
>>>    *   propagate an existing one, e.g. with error_propagate(errp, ...).
>>>    *
>>>    * - Whenever practical, also return a value that indicates success /
>>>    *   failure.  This can make the error checking more concise, and can
>>>    *   avoid useless error object creation and destruction.  Note that
>>>    *   we still have many functions returning void.  We recommend
>>>    *   • bool-valued functions return true on success / false on failure,
>>>
>>> If ->err_handler() behaved that way, it @err would be null after it
>>> returns false.  We'd call error_report_err(NULL), and crash.
>>>
>>> Functions with unusual behavior need a contract: a comment spelling out
>>> their behavior.
>>>
>>> What is the intended behavior of the err_handler() callback?
>>
>> Hi Markus,
>>
>> Thanks for reviewing! The intended behavior for err_handler() is to set errp and report the error on false/failure. With the above code, I also intended fall through to vm_stop() when err_handler() fails.
>>
>> I think I misunderstood the errp error handling, it seems like the correct way to do what I intended would be
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index b02a974954..630de46c90 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3070,10 +3070,11 @@ static void vfio_err_notifier_handler(void *opaque)
>>       }
>>
>>       if (vdev->err_handler) {
>> -        if (vdev->err_handler(vdev, &err)) {
>> +        if (!vdev->err_handler(vdev, &err)) {
>> +            error_report_err(err);
>> +        } else {
>>               return;
>>           }
>> -        error_report_err(err);
>>       }
>>
>> Please correct me if I missed anything.
> 
> Resulting function:
> 
>     static void vfio_err_notifier_handler(void *opaque)
>     {
>         VFIOPCIDevice *vdev = opaque;
>         Error *err = NULL;
> 
>         if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>             return;
>         }
> 
>         if (vdev->err_handler) {
>             if (!vdev->err_handler(vdev, &err)) {
>                 error_report_err(err);
>             } else {
>                 return;
>             }
>         }
> 
>         /*
>          * TBD. Retrieve the error details and decide what action
>          * needs to be taken. One of the actions could be to pass
>          * the error to the guest and have the guest driver recover
>          * from the error. This requires that PCIe capabilities be
>          * exposed to the guest. For now, we just terminate the
>          * guest to contain the error.
>          */
> 
>         error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
> 
>         vm_stop(RUN_STATE_INTERNAL_ERROR);
>     }
> 
> Slightly rearranged for clearer control flow:
> 
>     static void vfio_err_notifier_handler(void *opaque)
>     {
>         VFIOPCIDevice *vdev = opaque;
>         Error *err = NULL;
> 
>         if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>             return;
>         }
> 
>         if (vdev->err_handler) {
>             if (vdev->err_handler(vdev, &err)) {
>                 /* Error successfully handled */
>                 return;
>             }
>             error_report_err(err);
>         }
> 
>         /*
>          * TBD. Retrieve the error details and decide what action
>          * needs to be taken. One of the actions could be to pass
>          * the error to the guest and have the guest driver recover
>          * from the error. This requires that PCIe capabilities be
>          * exposed to the guest. For now, we just terminate the
>          * guest to contain the error.
>          */
> 
>         error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
> 
>         vm_stop(RUN_STATE_INTERNAL_ERROR);
>     }
> 
> Questions / issues:
> 
> * Is the comment still accurate?
> 
> * When ->err_handler() fails, we report the error twice.  Would it make
>    sense to combine the two error reports into one?

Yes. It was my request too.

Thanks,

C.



> * Preexisting: the second error message is ugly.
> 
>    Error messages should be short and to the point: single phrase, with
>    no newline or trailing punctuation.  The "please collect ..." part
>    does not belong to the error message proper, it's advice on what to
>    do.  Better: report the error, then print advice:
> 
>         error_report("%s(%s) Unrecoverable error detected",
>                      __func__, vdev->vbasedev.name);
>         error_printf("Please collect any data possible and then kill the guest.");
> 
>    Including __func__ in an error message is an anti-pattern.  Look at
> 
>      vfio_err_notifier_handler(fred) Unrecoverable error detected
> 
>    with a user's eyes: "vfio_err_notifier_handler" is programmer
>    gobbledygook, the device name "fred" is useful once you realize what
>    it is, "Unrecoverable error detected" lacks detail.
> 
> [...]
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-27  7:05         ` Cédric Le Goater
@ 2025-09-29 17:20           ` Farhan Ali
  2025-09-30  9:20             ` Markus Armbruster
  0 siblings, 1 reply; 18+ messages in thread
From: Farhan Ali @ 2025-09-29 17:20 UTC (permalink / raw)
  To: Cédric Le Goater, Markus Armbruster
  Cc: qemu-devel, qemu-s390x, mjrosato, thuth, alex.williamson


On 9/27/2025 12:05 AM, Cédric Le Goater wrote:
> On 9/27/25 07:59, Markus Armbruster wrote:
>> Farhan Ali <alifm@linux.ibm.com> writes:
>>
>>> On 9/25/2025 9:57 PM, Markus Armbruster wrote:
>>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>>
>>>>> Provide a vfio error handling callback, that can be used by 
>>>>> devices to
>>>>> handle PCI errors for passthrough devices.
>>>>>
>>>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>>>> ---
>>>>>    hw/vfio/pci.c | 8 ++++++++
>>>>>    hw/vfio/pci.h | 1 +
>>>>>    2 files changed, 9 insertions(+)
>>>>>
>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>>> index bc0b4c4d56..b02a974954 100644
>>>>> --- a/hw/vfio/pci.c
>>>>> +++ b/hw/vfio/pci.c
>>>>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>>>>   static void vfio_err_notifier_handler(void *opaque)
>>>>>   {
>>>>>       VFIOPCIDevice *vdev = opaque;
>>>>> +    Error *err = NULL;
>>>>>
>>>>>       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>>>           return;
>>>>>       }
>>>>>
>>>>> +    if (vdev->err_handler) {
>>>>> +        if (vdev->err_handler(vdev, &err)) {
>>>>> +            return;
>>>>> +        }
>>>>> +        error_report_err(err);
>>>>> +    }
>>>>
>>>> This is unusual.
>>>>
>>>> Functions taking an Error ** argument usually do so to report errors.
>>>> The rules spelled out in qapi/error.h apply.  In particular:
>>>>
>>>>    * - On success, the function should not touch *errp.  On failure, it
>>>>    *   should set a new error, e.g. with error_setg(errp, ...), or
>>>>    *   propagate an existing one, e.g. with error_propagate(errp, ...).
>>>>    *
>>>>    * - Whenever practical, also return a value that indicates success /
>>>>    *   failure.  This can make the error checking more concise, and can
>>>>    *   avoid useless error object creation and destruction. Note that
>>>>    *   we still have many functions returning void.  We recommend
>>>>    *   • bool-valued functions return true on success / false on failure,
>>>>
>>>> If ->err_handler() behaved that way, @err would be null after it
>>>> returns false.  We'd call error_report_err(NULL), and crash.
>>>>
>>>> Functions with unusual behavior need a contract: a comment spelling 
>>>> out
>>>> their behavior.
>>>>
>>>> What is the intended behavior of the err_handler() callback?
>>>
>>> Hi Markus,
>>>
>>> Thanks for reviewing! The intended behavior for err_handler() is to
>>> set errp and report the error on false/failure. With the above code,
>>> I also intended to fall through to vm_stop() when err_handler() fails.
>>>
>>> I think I misunderstood the errp error handling, it seems like the 
>>> correct way to do what I intended would be
>>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index b02a974954..630de46c90 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -3070,10 +3070,11 @@ static void vfio_err_notifier_handler(void *opaque)
>>>       }
>>>
>>>       if (vdev->err_handler) {
>>> -        if (vdev->err_handler(vdev, &err)) {
>>> +        if (!vdev->err_handler(vdev, &err)) {
>>> +            error_report_err(err);
>>> +        } else {
>>>               return;
>>>           }
>>> -        error_report_err(err);
>>>       }
>>>
>>> Please correct me if I missed anything.
>>
>> Resulting function:
>>
>>     static void vfio_err_notifier_handler(void *opaque)
>>     {
>>         VFIOPCIDevice *vdev = opaque;
>>         Error *err = NULL;
>>
>>         if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>             return;
>>         }
>>
>>         if (vdev->err_handler) {
>>             if (!vdev->err_handler(vdev, &err)) {
>>                 error_report_err(err);
>>             } else {
>>                 return;
>>             }
>>         }
>>
>>         /*
>>          * TBD. Retrieve the error details and decide what action
>>          * needs to be taken. One of the actions could be to pass
>>          * the error to the guest and have the guest driver recover
>>          * from the error. This requires that PCIe capabilities be
>>          * exposed to the guest. For now, we just terminate the
>>          * guest to contain the error.
>>          */
>>
>>         error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>>
>>         vm_stop(RUN_STATE_INTERNAL_ERROR);
>>     }
>>
>> Slightly rearranged for clearer control flow:
>>
>>     static void vfio_err_notifier_handler(void *opaque)
>>     {
>>         VFIOPCIDevice *vdev = opaque;
>>         Error *err = NULL;
>>
>>         if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>             return;
>>         }
>>
>>         if (vdev->err_handler) {
>>             if (vdev->err_handler(vdev, &err)) {
>>                 /* Error successfully handled */
>>                 return;
>>             }
>>             error_report_err(err);
>>         }

Yes, this is what I intended to do with my patch, and it gives a clearer
flow. The compiler error reported by Cédric is a little confusing,
though; I need to understand why that happens.


>>
>>         /*
>>          * TBD. Retrieve the error details and decide what action
>>          * needs to be taken. One of the actions could be to pass
>>          * the error to the guest and have the guest driver recover
>>          * from the error. This requires that PCIe capabilities be
>>          * exposed to the guest. For now, we just terminate the
>>          * guest to contain the error.
>>          */
>>
>>         error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>>
>>         vm_stop(RUN_STATE_INTERNAL_ERROR);
>>     }
>>
>> Questions / issues:
>>
>> * Is the comment still accurate?

This comment would still apply to vfio-pci devices on architectures
other than s390x. We are trying to change this behavior for s390x.

>>
>> * When ->err_handler() fails, we report the error twice. Would it make
>>    sense to combine the two error reports into one?
>
> Yes. It was my request too.
>
> Thanks,
>
> C.

I was a little hesitant about changing the existing error message as it's
been there for almost 12 years (since commit 7b4b0e9eda ("vfio:
QEMU-AER: Qemu changes to support AER for VFIO-PCI devices")). Nothing
should ever depend on specific error messages, but still... If the
preference is to combine/change the message I can do that.


>
>
>
>> * Preexisting: the second error message is ugly.
>>
>>    Error messages should be short and to the point: single phrase, with
>>    no newline or trailing punctuation.  The "please collect ..." part
>>    does not belong to the error message proper, it's advice on what to
>>    do.  Better: report the error, then print advice:
>>
>>         error_report("%s(%s) Unrecoverable error detected",
>>                      __func__, vdev->vbasedev.name);
>>         error_printf("Please collect any data possible and then kill the guest.");
>>
>>    Including __func__ in an error message is an anti-pattern. Look at
>>
>>      vfio_err_notifier_handler(fred) Unrecoverable error detected
>>
>>    with a user's eyes: "vfio_err_notifier_handler" is programmer
>>    gobbledygook, the device name "fred" is useful once you realize what
>>    it is, "Unrecoverable error detected" lacks detail.
>>
>> [...]
>>
>
How about "(device) Unrecoverable PCIe error detected for device"

Thanks
Farhan




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-29 17:20           ` Farhan Ali
@ 2025-09-30  9:20             ` Markus Armbruster
  2025-09-30 17:15               ` Farhan Ali
  0 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2025-09-30  9:20 UTC (permalink / raw)
  To: Farhan Ali
  Cc: Cédric Le Goater, qemu-devel, qemu-s390x, mjrosato, thuth,
	alex.williamson

Farhan Ali <alifm@linux.ibm.com> writes:

> On 9/27/2025 12:05 AM, Cédric Le Goater wrote:
>> On 9/27/25 07:59, Markus Armbruster wrote:
>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>
>>>> On 9/25/2025 9:57 PM, Markus Armbruster wrote:
>>>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>>>
>>>>>> Provide a vfio error handling callback, that can be used by devices to
>>>>>> handle PCI errors for passthrough devices.
>>>>>>
>>>>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>>>>> ---
>>>>>>    hw/vfio/pci.c | 8 ++++++++
>>>>>>    hw/vfio/pci.h | 1 +
>>>>>>    2 files changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>>>> index bc0b4c4d56..b02a974954 100644
>>>>>> --- a/hw/vfio/pci.c
>>>>>> +++ b/hw/vfio/pci.c
>>>>>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>>>>>   static void vfio_err_notifier_handler(void *opaque)
>>>>>>   {
>>>>>>       VFIOPCIDevice *vdev = opaque;
>>>>>> +    Error *err = NULL;
>>>>>>
>>>>>>       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>>>>           return;
>>>>>>       }
>>>>>>
>>>>>> +    if (vdev->err_handler) {
>>>>>> +        if (vdev->err_handler(vdev, &err)) {
>>>>>> +            return;
>>>>>> +        }
>>>>>> +        error_report_err(err);
>>>>>> +    }
>>>>>
>>>>> This is unusual.
>>>>>
>>>>> Functions taking an Error ** argument usually do so to report errors.
>>>>> The rules spelled out in qapi/error.h apply.  In particular:
>>>>>
>>>>>    * - On success, the function should not touch *errp.  On failure, it
>>>>>    *   should set a new error, e.g. with error_setg(errp, ...), or
>>>>>    *   propagate an existing one, e.g. with error_propagate(errp, ...).
>>>>>    *
>>>>>    * - Whenever practical, also return a value that indicates success /
>>>>>    *   failure.  This can make the error checking more concise, and can
>>>>>    *   avoid useless error object creation and destruction. Note that
>>>>>    *   we still have many functions returning void.  We recommend
>>>>>    *   • bool-valued functions return true on success / false on failure,
>>>>>
>>>>> If ->err_handler() behaved that way, @err would be null after it
>>>>> returns false.  We'd call error_report_err(NULL), and crash.
>>>>>
>>>>> Functions with unusual behavior need a contract: a comment spelling out
>>>>> their behavior.
>>>>>
>>>>> What is the intended behavior of the err_handler() callback?
>>>>
>>>> Hi Markus,
>>>>
>>>> Thanks for reviewing! The intended behavior for err_handler() is to set errp and report the error on false/failure. With the above code, I also intended to fall through to vm_stop() when err_handler() fails.
>>>>
>>>> I think I misunderstood the errp error handling, it seems like the correct way to do what I intended would be
>>>>
>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>> index b02a974954..630de46c90 100644
>>>> --- a/hw/vfio/pci.c
>>>> +++ b/hw/vfio/pci.c
>>>> @@ -3070,10 +3070,11 @@ static void vfio_err_notifier_handler(void *opaque)
>>>>       }
>>>>
>>>>       if (vdev->err_handler) {
>>>> -        if (vdev->err_handler(vdev, &err)) {
>>>> +        if (!vdev->err_handler(vdev, &err)) {
>>>> +            error_report_err(err);
>>>> +        } else {
>>>>               return;
>>>>           }
>>>> -        error_report_err(err);
>>>>       }
>>>>
>>>> Please correct me if I missed anything.
>>>
>>> Resulting function:
>>>
>>>     static void vfio_err_notifier_handler(void *opaque)
>>>     {
>>>         VFIOPCIDevice *vdev = opaque;
>>>         Error *err = NULL;
>>>
>>>         if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>             return;
>>>         }
>>>
>>>         if (vdev->err_handler) {
>>>             if (!vdev->err_handler(vdev, &err)) {
>>>                 error_report_err(err);
>>>             } else {
>>>                 return;
>>>             }
>>>         }
>>>
>>>         /*
>>>          * TBD. Retrieve the error details and decide what action
>>>          * needs to be taken. One of the actions could be to pass
>>>          * the error to the guest and have the guest driver recover
>>>          * from the error. This requires that PCIe capabilities be
>>>          * exposed to the guest. For now, we just terminate the
>>>          * guest to contain the error.
>>>          */
>>>
>>>         error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>>>
>>>         vm_stop(RUN_STATE_INTERNAL_ERROR);
>>>     }
>>>
>>> Slightly rearranged for clearer control flow:
>>>
>>>     static void vfio_err_notifier_handler(void *opaque)
>>>     {
>>>         VFIOPCIDevice *vdev = opaque;
>>>         Error *err = NULL;
>>>
>>>         if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>             return;
>>>         }
>>>
>>>         if (vdev->err_handler) {
>>>             if (vdev->err_handler(vdev, &err)) {
>>>                 /* Error successfully handled */
>>>                 return;
>>>             }
>>>             error_report_err(err);
>>>         }
>
> Yes, this is what I intended to do with my patch, and it gives a clearer flow. The compiler error reported by Cédric is a little confusing, though; I need to understand why that happens.
>
>
>>>
>>>         /*
>>>          * TBD. Retrieve the error details and decide what action
>>>          * needs to be taken. One of the actions could be to pass
>>>          * the error to the guest and have the guest driver recover
>>>          * from the error. This requires that PCIe capabilities be
>>>          * exposed to the guest. For now, we just terminate the
>>>          * guest to contain the error.
>>>          */
>>>
>>>         error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>>>
>>>         vm_stop(RUN_STATE_INTERNAL_ERROR);
>>>     }
>>>
>>> Questions / issues:
>>>
>>> * Is the comment still accurate?
>
> This comment would still apply to vfio-pci devices on architectures other than s390x. We are trying to change this behavior for s390x.

The comment is about things that should be done to handle the error.
Would these things be done here, or in a suitable ->err_handler()?

>>>
>>> * When ->err_handler() fails, we report the error twice. Would it make
>>>    sense to combine the two error reports into one?
>>
>> Yes. It was my request too.
>>
>> Thanks,
>>
>> C.
>
> I was a little hesitant about changing the existing error message as it's been there for almost 12 years (since commit 7b4b0e9eda ("vfio: QEMU-AER: Qemu changes to support AER for VFIO-PCI devices")). Nothing should ever depend on specific error messages, but still... If the preference is to combine/change the message I can do that.

Don't hesitate to improve error messages.

>>> * Preexisting: the second error message is ugly.
>>>
>>>    Error messages should be short and to the point: single phrase, with
>>>    no newline or trailing punctuation.  The "please collect ..." part
>>>    does not belong to the error message proper, it's advice on what to
>>>    do.  Better: report the error, then print advice:
>>>
>>>         error_report("%s(%s) Unrecoverable error detected",
>>>                      __func__, vdev->vbasedev.name);
>>>         error_printf("Please collect any data possible and then kill the guest.");
>>>
>>>    Including __func__ in an error message is an anti-pattern. Look at
>>>
>>>      vfio_err_notifier_handler(fred) Unrecoverable error detected
>>>
>>>    with a user's eyes: "vfio_err_notifier_handler" is programmer
>>>    gobbledygook, the device name "fred" is useful once you realize what
>>>    it is, "Unrecoverable error detected" lacks detail.
>>>
>>> [...]
>>>
>>
> How about "(device) Unrecoverable PCIe error detected for device"

Suggest "for device %s", where %s identifies the device to the user.
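
As a minimal sketch of that message shape (standalone C, not QEMU code;
format_pcie_err_msg() is a hypothetical helper used only for
illustration), the text could be built like this. In QEMU itself the
same format string would presumably be passed to error_report() with
vdev->vbasedev.name as the argument:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: formats the user-facing
 * message with no __func__ prefix and no trailing punctuation.
 * Returns the snprintf() result, i.e. the length the full message needs.
 */
static int format_pcie_err_msg(char *buf, size_t len, const char *dev_name)
{
    assert(buf && dev_name);
    return snprintf(buf, len,
                    "Unrecoverable PCIe error detected for device %s",
                    dev_name);
}
```

The advice ("Please collect any data possible ...") stays out of this
string; it would be printed separately, as suggested earlier in the
thread.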



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-30  9:20             ` Markus Armbruster
@ 2025-09-30 17:15               ` Farhan Ali
  2025-10-01  4:52                 ` Markus Armbruster
  0 siblings, 1 reply; 18+ messages in thread
From: Farhan Ali @ 2025-09-30 17:15 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Cédric Le Goater, qemu-devel, qemu-s390x, mjrosato, thuth,
	alex.williamson


On 9/30/2025 2:20 AM, Markus Armbruster wrote:
> Farhan Ali <alifm@linux.ibm.com> writes:
>
>> On 9/27/2025 12:05 AM, Cédric Le Goater wrote:
>>> On 9/27/25 07:59, Markus Armbruster wrote:
>>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>>
>>>>> On 9/25/2025 9:57 PM, Markus Armbruster wrote:
>>>>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>>>>
>>>>>>> Provide a vfio error handling callback, that can be used by devices to
>>>>>>> handle PCI errors for passthrough devices.
>>>>>>>
>>>>>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>>>>>> ---
>>>>>>>     hw/vfio/pci.c | 8 ++++++++
>>>>>>>     hw/vfio/pci.h | 1 +
>>>>>>>     2 files changed, 9 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>>>>> index bc0b4c4d56..b02a974954 100644
>>>>>>> --- a/hw/vfio/pci.c
>>>>>>> +++ b/hw/vfio/pci.c
>>>>>>> @@ -3063,11 +3063,19 @@ void vfio_pci_put_device(VFIOPCIDevice *vdev)
>>>>>>>    static void vfio_err_notifier_handler(void *opaque)
>>>>>>>    {
>>>>>>>        VFIOPCIDevice *vdev = opaque;
>>>>>>> +    Error *err = NULL;
>>>>>>>
>>>>>>>        if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>>>>>            return;
>>>>>>>        }
>>>>>>>
>>>>>>> +    if (vdev->err_handler) {
>>>>>>> +        if (vdev->err_handler(vdev, &err)) {
>>>>>>> +            return;
>>>>>>> +        }
>>>>>>> +        error_report_err(err);
>>>>>>> +    }
>>>>>> This is unusual.
>>>>>>
>>>>>> Functions taking an Error ** argument usually do so to report errors.
>>>>>> The rules spelled out in qapi/error.h apply.  In particular:
>>>>>>
>>>>>>     * - On success, the function should not touch *errp.  On failure, it
>>>>>>     *   should set a new error, e.g. with error_setg(errp, ...), or
>>>>>>     *   propagate an existing one, e.g. with error_propagate(errp, ...).
>>>>>>     *
>>>>>>     * - Whenever practical, also return a value that indicates success /
>>>>>>     *   failure.  This can make the error checking more concise, and can
>>>>>>     *   avoid useless error object creation and destruction. Note that
>>>>>>     *   we still have many functions returning void.  We recommend
>>>>>>     *   • bool-valued functions return true on success / false on failure,
>>>>>>
>>>>>> If ->err_handler() behaved that way, @err would be null after it
>>>>>> returns false.  We'd call error_report_err(NULL), and crash.
>>>>>>
>>>>>> Functions with unusual behavior need a contract: a comment spelling out
>>>>>> their behavior.
>>>>>>
>>>>>> What is the intended behavior of the err_handler() callback?
>>>>> Hi Markus,
>>>>>
>>>>> Thanks for reviewing! The intended behavior for err_handler() is to set errp and report the error on false/failure. With the above code, I also intended to fall through to vm_stop() when err_handler() fails.
>>>>>
>>>>> I think I misunderstood the errp error handling, it seems like the correct way to do what I intended would be
>>>>>
>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>>> index b02a974954..630de46c90 100644
>>>>> --- a/hw/vfio/pci.c
>>>>> +++ b/hw/vfio/pci.c
>>>>> @@ -3070,10 +3070,11 @@ static void vfio_err_notifier_handler(void *opaque)
>>>>>        }
>>>>>
>>>>>        if (vdev->err_handler) {
>>>>> -        if (vdev->err_handler(vdev, &err)) {
>>>>> +        if (!vdev->err_handler(vdev, &err)) {
>>>>> +            error_report_err(err);
>>>>> +        } else {
>>>>>                return;
>>>>>            }
>>>>> -        error_report_err(err);
>>>>>        }
>>>>>
>>>>> Please correct me if I missed anything.
>>>> Resulting function:
>>>>
>>>>      static void vfio_err_notifier_handler(void *opaque)
>>>>      {
>>>>          VFIOPCIDevice *vdev = opaque;
>>>>          Error *err = NULL;
>>>>
>>>>          if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>>              return;
>>>>          }
>>>>
>>>>          if (vdev->err_handler) {
>>>>              if (!vdev->err_handler(vdev, &err)) {
>>>>                  error_report_err(err);
>>>>              } else {
>>>>                  return;
>>>>              }
>>>>          }
>>>>
>>>>          /*
>>>>           * TBD. Retrieve the error details and decide what action
>>>>           * needs to be taken. One of the actions could be to pass
>>>>           * the error to the guest and have the guest driver recover
>>>>           * from the error. This requires that PCIe capabilities be
>>>>           * exposed to the guest. For now, we just terminate the
>>>>           * guest to contain the error.
>>>>           */
>>>>
>>>>          error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>>>>
>>>>          vm_stop(RUN_STATE_INTERNAL_ERROR);
>>>>      }
>>>>
>>>> Slightly rearranged for clearer control flow:
>>>>
>>>>      static void vfio_err_notifier_handler(void *opaque)
>>>>      {
>>>>          VFIOPCIDevice *vdev = opaque;
>>>>          Error *err = NULL;
>>>>
>>>>          if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>>              return;
>>>>          }
>>>>
>>>>          if (vdev->err_handler) {
>>>>              if (vdev->err_handler(vdev, &err)) {
>>>>                  /* Error successfully handled */
>>>>                  return;
>>>>              }
>>>>              error_report_err(err);
>>>>          }
>> Yes, this is what I intended to do with my patch, and it gives a clearer flow. The compiler error reported by Cédric is a little confusing, though; I need to understand why that happens.
>>
>>
>>>>          /*
>>>>           * TBD. Retrieve the error details and decide what action
>>>>           * needs to be taken. One of the actions could be to pass
>>>>           * the error to the guest and have the guest driver recover
>>>>           * from the error. This requires that PCIe capabilities be
>>>>           * exposed to the guest. For now, we just terminate the
>>>>           * guest to contain the error.
>>>>           */
>>>>
>>>>          error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>>>>
>>>>          vm_stop(RUN_STATE_INTERNAL_ERROR);
>>>>      }
>>>>
>>>> Questions / issues:
>>>>
>>>> * Is the comment still accurate?
>> This comment would still apply to vfio-pci devices on architectures other than s390x. We are trying to change this behavior for s390x.
> The comment is about things that should be done to handle the error.
> Would these things be done here, or in a suitable ->err_handler()?

Ideally in the err_handler(). And for s390x we try to do what the comment
mentions, which is to inject the error into the guest through an s390x
architecture-specific mechanism. I can remove the comment block.


>
>>>> * When ->err_handler() fails, we report the error twice. Would it make
>>>>     sense to combine the two error reports into one?
>>> Yes. It was my request too.
>>>
>>> Thanks,
>>>
>>> C.
>> I was a little hesitant about changing the existing error message as it's been there for almost 12 years (since commit 7b4b0e9eda ("vfio: QEMU-AER: Qemu changes to support AER for VFIO-PCI devices")). Nothing should ever depend on specific error messages, but still... If the preference is to combine/change the message I can do that.
> Don't hesitate to improve error messages.
>
>>>> * Preexisting: the second error message is ugly.
>>>>
>>>>     Error messages should be short and to the point: single phrase, with
>>>>     no newline or trailing punctuation.  The "please collect ..." part
>>>>     does not belong to the error message proper, it's advice on what to
>>>>     do.  Better: report the error, then print advice:
>>>>
>>>>          error_report("%s(%s) Unrecoverable error detected",
>>>>                       __func__, vdev->vbasedev.name);
>>>>          error_printf("Please collect any data possible and then kill the guest.");
>>>>
>>>>     Including __func__ in an error message is an anti-pattern. Look at
>>>>
>>>>       vfio_err_notifier_handler(fred) Unrecoverable error detected
>>>>
>>>>     with a user's eyes: "vfio_err_notifier_handler" is programmer
>>>>     gobbledygook, the device name "fred" is useful once you realize what
>>>>     it is, "Unrecoverable error detected" lacks detail.
>>>>
>>>> [...]
>>>>
>> How about "(device) Unrecoverable PCIe error detected for device"
> Suggest "for device %s", where %s identifies the device to the user.

Sure, I can make the change.

Thanks

Farhan

>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-09-30 17:15               ` Farhan Ali
@ 2025-10-01  4:52                 ` Markus Armbruster
  2025-10-01 18:21                   ` Farhan Ali
  0 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2025-10-01  4:52 UTC (permalink / raw)
  To: Farhan Ali
  Cc: Cédric Le Goater, qemu-devel, qemu-s390x, mjrosato, thuth,
	alex.williamson

Farhan Ali <alifm@linux.ibm.com> writes:

> On 9/30/2025 2:20 AM, Markus Armbruster wrote:
>> Farhan Ali <alifm@linux.ibm.com> writes:
>>
>>> On 9/27/2025 12:05 AM, Cédric Le Goater wrote:
>>>> On 9/27/25 07:59, Markus Armbruster wrote:

[...]

>>>>> * Is the comment still accurate?
>>>
>>> This comment would still apply to vfio-pci devices on architectures other than s390x. We are trying to change this behavior for s390x.
>>
>> The comment is about things that should be done to handle the error.
>> Would these things be done here, or in a suitable ->err_handler()?
>
> Ideally in the err_handler(). And for s390x we try to do what the comment mentions, which is to inject the error into the guest through an s390x architecture-specific mechanism. I can remove the comment block.

Well, if there's stuff left to do, a comment outlining it is desirable.
If I understand you correctly, then the one we have is no longer
accurate.  Could you update it, so it is?

[...]



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-10-01  4:52                 ` Markus Armbruster
@ 2025-10-01 18:21                   ` Farhan Ali
  2025-10-06  6:06                     ` Markus Armbruster
  0 siblings, 1 reply; 18+ messages in thread
From: Farhan Ali @ 2025-10-01 18:21 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Cédric Le Goater, qemu-devel, qemu-s390x, mjrosato, thuth,
	alex.williamson


On 9/30/2025 9:52 PM, Markus Armbruster wrote:
> Farhan Ali <alifm@linux.ibm.com> writes:
>
>> On 9/30/2025 2:20 AM, Markus Armbruster wrote:
>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>
>>>> On 9/27/2025 12:05 AM, Cédric Le Goater wrote:
>>>>> On 9/27/25 07:59, Markus Armbruster wrote:
> [...]
>
>>>>>> * Is the comment still accurate?
>>>> This comment would still apply to vfio-pci devices on architectures other than s390x. We are trying to change this behavior for s390x.
>>> The comment is about things that should be done to handle the error.
>>> Would these things be done here, or in a suitable ->err_handler()?
>> Ideally in the err_handler(). And for s390x we try to do what the comment mentions, which is to inject the error into the guest through an s390x architecture-specific mechanism. I can remove the comment block.
> Well, if there's stuff left to do, a comment outlining it is desirable.
> If I understand you correctly, then the one we have is no longer
> accurate.  Could you update it, so it is?
>
> [...]
>
How about something like this?

We can retrieve the error details and decide what action needs to be 
taken in err_handler(). One of the actions could be to pass the error to 
the guest and have the guest driver recover from the error. This 
requires that PCIe capabilities be exposed to the guest. If 
err_handler() is not implemented/fails, we just terminate the guest to 
contain the error.
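
To make the proposed flow concrete, here is a rough standalone model of
it, assuming the Error ** rules quoted from qapi/error.h earlier in the
thread. Error, MockDev, mock_error_set(), handler_ok() and
handler_fail() are hypothetical stand-ins, not QEMU APIs, and the single
combined report on failure is only one possible reading of the request
above, not the final patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    char msg[128];
} Error;

typedef struct MockDev MockDev;
typedef bool (*ErrHandler)(MockDev *dev, Error **errp);

struct MockDev {
    const char *name;
    ErrHandler err_handler;  /* optional per-device callback */
    bool vm_stopped;         /* models vm_stop(RUN_STATE_INTERNAL_ERROR) */
    char last_report[256];   /* models the text error_report() would print */
};

/* Models error_setg(): create an error only if the caller wants one. */
static void mock_error_set(Error **errp, const char *msg)
{
    if (!errp) {
        return;
    }
    assert(*errp == NULL);   /* never overwrite an existing error */
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    /* calloc() zeroed the buffer, so the copy stays NUL-terminated */
    strncpy((*errp)->msg, msg, sizeof((*errp)->msg) - 1);
}

/* Success: return true and leave *errp untouched. */
static bool handler_ok(MockDev *dev, Error **errp)
{
    (void)dev;
    (void)errp;
    return true;
}

/* Failure: set *errp, then return false. */
static bool handler_fail(MockDev *dev, Error **errp)
{
    (void)dev;
    mock_error_set(errp, "cannot read device error information");
    return false;
}

static void mock_err_notifier_handler(MockDev *dev)
{
    Error *err = NULL;

    if (dev->err_handler && dev->err_handler(dev, &err)) {
        /* Error successfully handled */
        return;
    }

    /*
     * err_handler() is not implemented or failed: terminate the guest
     * to contain the error, using one combined report.
     */
    snprintf(dev->last_report, sizeof(dev->last_report),
             "Unrecoverable PCIe error detected for device %s%s%s",
             dev->name, err ? ": " : "", err ? err->msg : "");
    free(err);
    dev->vm_stopped = true;
}
```

With this shape, err is only dereferenced when the callback actually
set it, so a missing callback and a failing callback share the same
fall-through path.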

Thanks

Farhan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/5] vfio/pci: Add an error handler callback
  2025-10-01 18:21                   ` Farhan Ali
@ 2025-10-06  6:06                     ` Markus Armbruster
  0 siblings, 0 replies; 18+ messages in thread
From: Markus Armbruster @ 2025-10-06  6:06 UTC (permalink / raw)
  To: Farhan Ali
  Cc: Cédric Le Goater, qemu-devel, qemu-s390x, mjrosato, thuth,
	alex.williamson

Farhan Ali <alifm@linux.ibm.com> writes:

> On 9/30/2025 9:52 PM, Markus Armbruster wrote:
>> Farhan Ali <alifm@linux.ibm.com> writes:
>>
>>> On 9/30/2025 2:20 AM, Markus Armbruster wrote:
>>>> Farhan Ali <alifm@linux.ibm.com> writes:
>>>>
>>>>> On 9/27/2025 12:05 AM, Cédric Le Goater wrote:
>>>>>> On 9/27/25 07:59, Markus Armbruster wrote:
>>
>> [...]
>>
>>>>>>> * Is the comment still accurate?
>>>>>
>>>>> This comment would still apply to vfio-pci devices on architectures other than s390x. We are trying to change this behavior for s390x.
>>>
>>>> The comment is about things that should be done to handle the error.
>>>> Would these things be done here, or in a suitable ->err_handler()?
>>>
>>> Ideally in the err_handler(). And for s390x we try to do what the comment mentions, which is to inject the error into the guest through an s390x architecture-specific mechanism. I can remove the comment block.
>>
>> Well, if there's stuff left to do, a comment outlining it is desirable.
>> If I understand you correctly, then the one we have is no longer
>> accurate.  Could you update it, so it is?
>>
>> [...]
>>
> How about something like this?
>
> We can retrieve the error details and decide what action needs to be taken in err_handler(). One of the actions could be to pass the error to the guest and have the guest driver recover from the error. This requires that PCIe capabilities be exposed to the guest. If err_handler() is not implemented/fails, we just terminate the guest to contain the error.

Looks good to me.  I lack the expertise to really judge.




* [PATCH v3 3/5] vfio: Introduce vfio_device_feature helper function
  2025-09-25 17:48 [PATCH v3 0/5] Error recovery for zPCI passthrough devices Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 1/5] [NOTFORMERGE] linux-headers: Update for zpci vfio device Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 2/5] vfio/pci: Add an error handler callback Farhan Ali
@ 2025-09-25 17:48 ` Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 4/5] s390x/pci: Add PCI error handling for vfio pci devices Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 5/5] s390x/pci: Reset a device in error state Farhan Ali
  4 siblings, 0 replies; 18+ messages in thread
From: Farhan Ali @ 2025-09-25 17:48 UTC (permalink / raw)
  To: qemu-devel, qemu-s390x; +Cc: mjrosato, alifm, thuth, alex.williamson, clg

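Introduce a helper function, vfio_device_feature(), that calls the internal
VFIODeviceIOOps device_feature() callback, so that callers do not have to
dereference io_ops themselves.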

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 hw/vfio/device.c              | 6 ++++++
 include/hw/vfio/vfio-device.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/hw/vfio/device.c b/hw/vfio/device.c
index 08f12ac31f..2ea2af3f79 100644
--- a/hw/vfio/device.c
+++ b/hw/vfio/device.c
@@ -504,6 +504,12 @@ void vfio_device_unprepare(VFIODevice *vbasedev)
     vbasedev->bcontainer = NULL;
 }
 
+int vfio_device_feature(VFIODevice *vbasedev,
+                        struct vfio_device_feature *feature)
+{
+    return vbasedev->io_ops->device_feature(vbasedev, feature);
+}
+
 /*
  * Traditional ioctl() based io
  */
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index e7e6243e2d..a7f00d2a80 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -157,6 +157,8 @@ bool vfio_device_attach_by_iommu_type(const char *iommu_type, char *name,
                                       Error **errp);
 void vfio_device_detach(VFIODevice *vbasedev);
 VFIODevice *vfio_get_vfio_device(Object *obj);
+int vfio_device_feature(VFIODevice *vbasedev,
+                        struct vfio_device_feature *feature);
 
 typedef QLIST_HEAD(VFIODeviceList, VFIODevice) VFIODeviceList;
 extern VFIODeviceList vfio_device_list;
-- 
2.43.0
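
A rough, self-contained sketch of how a caller might use such a helper: build an 8-byte-aligned buffer that holds the header plus a payload, then go through the wrapper rather than the ops table. All names here (feature_hdr, feature_payload, device_ops, fake_backend, get_payload) are hypothetical stand-ins, and the payload layout is invented for illustration; the real structures come from the kernel's VFIO headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical header followed by a variable-size payload. */
struct feature_hdr {
    uint32_t argsz;
    uint32_t flags;
    uint8_t data[];
};

/* Hypothetical payload; not the real zPCI error layout. */
struct feature_payload {
    uint16_t code;
    uint8_t pending;
};

struct device;
struct device_ops {
    int (*device_feature)(struct device *dev, struct feature_hdr *hdr);
};
struct device {
    const struct device_ops *ops;
};

/* Thin wrapper: callers never touch dev->ops directly. */
static int device_feature(struct device *dev, struct feature_hdr *hdr)
{
    assert(dev && dev->ops && dev->ops->device_feature);
    return dev->ops->device_feature(dev, hdr);
}

/* Fake backend that fills the payload if the buffer is large enough. */
static int fake_backend(struct device *dev, struct feature_hdr *hdr)
{
    struct feature_payload p = { .code = 0x3a, .pending = 0 };

    (void)dev;
    if (hdr->argsz < sizeof(*hdr) + sizeof(p)) {
        return -1;
    }
    memcpy(hdr->data, &p, sizeof(p));
    return 0;
}

static const struct device_ops fake_ops = { .device_feature = fake_backend };

/* Size the buffer in uint64_t units so header and payload stay aligned. */
static int get_payload(struct device *dev, struct feature_payload *out)
{
    uint64_t buf[DIV_ROUND_UP(sizeof(struct feature_hdr) +
                              sizeof(struct feature_payload),
                              sizeof(uint64_t))] = { 0 };
    struct feature_hdr *hdr = (struct feature_hdr *)buf;
    int ret;

    hdr->argsz = sizeof(buf);
    ret = device_feature(dev, hdr);
    if (ret) {
        return ret;
    }
    memcpy(out, hdr->data, sizeof(*out));
    return 0;
}
```

Swapping fake_ops for another ops table changes the backend without touching get_payload(), which is the point of hiding the indirection behind one function.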




* [PATCH v3 4/5] s390x/pci: Add PCI error handling for vfio pci devices
  2025-09-25 17:48 [PATCH v3 0/5] Error recovery for zPCI passthrough devices Farhan Ali
                   ` (2 preceding siblings ...)
  2025-09-25 17:48 ` [PATCH v3 3/5] vfio: Introduce vfio_device_feature helper function Farhan Ali
@ 2025-09-25 17:48 ` Farhan Ali
  2025-09-25 17:48 ` [PATCH v3 5/5] s390x/pci: Reset a device in error state Farhan Ali
  4 siblings, 0 replies; 18+ messages in thread
From: Farhan Ali @ 2025-09-25 17:48 UTC (permalink / raw)
  To: qemu-devel, qemu-s390x; +Cc: mjrosato, alifm, thuth, alex.williamson, clg

Add an s390x-specific callback for vfio error handling. s390x PCI devices
have platform-specific error information, which we need to retrieve for
passthrough devices. This is done via the VFIO_DEVICE_FEATURE ioctl, using
the VFIO_DEVICE_FEATURE_ZPCI_ERROR feature.

Once this error information is retrieved we can then inject an error into
the guest, and let the guest drive the recovery.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          |  9 ++++
 hw/s390x/s390-pci-vfio.c         | 81 ++++++++++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h  |  1 +
 include/hw/s390x/s390-pci-vfio.h |  6 +++
 4 files changed, 97 insertions(+)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index f87d2748b6..9f7b17e807 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -158,6 +158,8 @@ static void s390_pci_perform_unplug(S390PCIBusDevice *pbdev)
 {
     HotplugHandler *hotplug_ctrl;
 
+    qemu_mutex_destroy(&pbdev->err_handler_lock);
+
     if (pbdev->pft == ZPCI_PFT_ISM) {
         notifier_remove(&pbdev->shutdown_notifier);
     }
@@ -1074,6 +1076,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
     PCIDevice *pdev = NULL;
     S390PCIBusDevice *pbdev = NULL;
+    Error *local_err = NULL;
     int rc;
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
@@ -1140,6 +1143,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         pbdev->iommu->pbdev = pbdev;
         pbdev->state = ZPCI_FS_DISABLED;
         set_pbdev_info(pbdev);
+        qemu_mutex_init(&pbdev->err_handler_lock);
 
         if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
             /*
@@ -1164,6 +1168,11 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
             /* Fill in CLP information passed via the vfio region */
             s390_pci_get_clp_info(pbdev);
+            /* Setup error handler for error recovery */
+            if (!s390_pci_setup_err_handler(pbdev, &local_err)) {
+                warn_report_err(local_err);
+            }
+
             if (!pbdev->interp) {
                 /* Do vfio passthrough but intercept for I/O */
                 pbdev->fh |= FH_SHM_VFIO;
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 938a551171..1697a84de7 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -103,6 +103,58 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
     }
 }
 
+static bool s390_pci_get_feature_err(VFIOPCIDevice *vfio_pci,
+                                    struct vfio_device_feature_zpci_err *err,
+                                    Error **errp)
+{
+    int ret;
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
+                              sizeof(struct vfio_device_feature_zpci_err),
+                              sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+
+    feature->argsz = sizeof(buf);
+    feature->flags = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_ZPCI_ERROR;
+    ret = vfio_device_feature(&vfio_pci->vbasedev, feature);
+
+    if (ret) {
+        error_setg(errp, "Failed feature get VFIO_DEVICE_FEATURE_ZPCI_ERROR"
+                    " (rc=%d)", ret);
+        return false;
+    }
+
+    memcpy(err, (struct vfio_device_feature_zpci_err *) feature->data,
+           sizeof(struct vfio_device_feature_zpci_err));
+
+    return true;
+}
+
+static bool s390_pci_err_handler(VFIOPCIDevice *vfio_pci, Error **errp)
+{
+    S390PCIBusDevice *pbdev;
+    struct vfio_device_feature_zpci_err err;
+
+    pbdev = s390_pci_find_dev_by_target(s390_get_phb(),
+                                        DEVICE(&vfio_pci->parent_obj)->id);
+
+    QEMU_LOCK_GUARD(&pbdev->err_handler_lock);
+
+    if (!s390_pci_get_feature_err(vfio_pci, &err, errp)) {
+        return false;
+    }
+
+    pbdev->state = ZPCI_FS_ERROR;
+    s390_pci_generate_error_event(err.pec, pbdev->fh, pbdev->fid, 0, 0);
+
+    while (err.pending_errors) {
+        if (!s390_pci_get_feature_err(vfio_pci, &err, errp)) {
+            return false;
+        }
+        s390_pci_generate_error_event(err.pec, pbdev->fh, pbdev->fid, 0, 0);
+    }
+    return true;
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
@@ -369,3 +421,32 @@ void s390_pci_get_clp_info(S390PCIBusDevice *pbdev)
     s390_pci_read_util(pbdev, info);
     s390_pci_read_pfip(pbdev, info);
 }
+
+bool s390_pci_setup_err_handler(S390PCIBusDevice *pbdev, Error **errp)
+{
+    int ret;
+    VFIOPCIDevice *vfio_pci = VFIO_PCI_BASE(pbdev->pdev);
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
+                              sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
+
+    feature->argsz = sizeof(buf);
+    feature->flags = VFIO_DEVICE_FEATURE_PROBE | VFIO_DEVICE_FEATURE_ZPCI_ERROR;
+
+    ret = vfio_device_feature(&vfio_pci->vbasedev, feature);
+
+    if (ret != 0) {
+        if (ret == -ENOTTY) {
+            error_setg(errp, "Automated error recovery unavailable for device");
+        } else {
+            error_setg(errp,
+                       "Failed to probe for VFIO_DEVICE_FEATURE_ZPCI_ERROR (ret=%d)",
+                       ret);
+        }
+        return false;
+    }
+
+    vfio_pci->err_handler = s390_pci_err_handler;
+
+    return true;
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 04944d4fed..3795e0bbfc 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -364,6 +364,7 @@ struct S390PCIBusDevice {
     bool forwarding_assist;
     bool aif;
     bool rtr_avail;
+    QemuMutex err_handler_lock;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };
 
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index ae1b126ff7..b45ffa5044 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -22,6 +22,7 @@ S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
 void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
 bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh);
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
+bool s390_pci_setup_err_handler(S390PCIBusDevice *pbdev, Error **errp);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
 {
@@ -39,6 +40,11 @@ static inline bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh)
     return false;
 }
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
+static inline bool s390_pci_setup_err_handler(S390PCIBusDevice *pbdev, Error **errp)
+{
+    error_setg(errp, "VFIO not available, cannot setup error handler");
+    return false;
+}
 #endif
 
 #endif
-- 
2.43.0
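
The fetch-and-inject loop in s390_pci_err_handler() can be illustrated in isolation. The sketch below is an assumption-laden model, not the patch's code: err_source, fetch_error(), guest_log, inject_event() and drain_errors() are hypothetical, the PEC values used with it are made up, and the kernel side is replaced by an in-memory array. It is written as a do/while, whereas the patch spells out the first iteration separately so that it can set the error state before the first injection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error record; field names follow the patch's usage only. */
struct err_record {
    uint16_t pec;
    uint8_t pending_errors;
};

/* Hypothetical source standing in for the kernel's pending-error queue. */
struct err_source {
    const uint16_t *pecs;
    size_t count;
    size_t next;
    size_t fail_at; /* index at which a fetch fails; SIZE_MAX for never */
};

/* Pop one record and report how many are still pending after it. */
static bool fetch_error(struct err_source *src, struct err_record *rec)
{
    if (src->next >= src->count || src->next == src->fail_at) {
        return false;
    }
    rec->pec = src->pecs[src->next++];
    rec->pending_errors = (uint8_t)(src->count - src->next);
    return true;
}

/* Records what would have been injected into the guest. */
struct guest_log {
    uint16_t pecs[8];
    size_t n;
};

static void inject_event(struct guest_log *log, uint16_t pec)
{
    assert(log->n < 8);
    log->pecs[log->n++] = pec;
}

/* Fetch and inject until no errors remain; stop early if a fetch fails. */
static bool drain_errors(struct err_source *src, struct guest_log *log)
{
    struct err_record rec;

    do {
        if (!fetch_error(src, &rec)) {
            return false;
        }
        inject_event(log, rec.pec);
    } while (rec.pending_errors);

    return true;
}
```

Note that when a fetch fails partway, events injected before the failure have already reached the guest; the caller only learns that the drain was incomplete.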




* [PATCH v3 5/5] s390x/pci: Reset a device in error state
  2025-09-25 17:48 [PATCH v3 0/5] Error recovery for zPCI passthrough devices Farhan Ali
                   ` (3 preceding siblings ...)
  2025-09-25 17:48 ` [PATCH v3 4/5] s390x/pci: Add PCI error handling for vfio pci devices Farhan Ali
@ 2025-09-25 17:48 ` Farhan Ali
  4 siblings, 0 replies; 18+ messages in thread
From: Farhan Ali @ 2025-09-25 17:48 UTC (permalink / raw)
  To: qemu-devel, qemu-s390x; +Cc: mjrosato, alifm, thuth, alex.williamson, clg

When the guest drives a reset of a passthrough device that is in the error
state, attempt a reset of the host device to recover it. The reset triggers
a CLP disable/enable cycle on the host, which brings the device into a
recovered state.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          | 7 +++++++
 hw/s390x/s390-pci-vfio.c         | 6 ++++++
 include/hw/s390x/s390-pci-vfio.h | 2 ++
 3 files changed, 15 insertions(+)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 9f7b17e807..c0216d4a82 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -1497,6 +1497,8 @@ static void s390_pci_device_reset(DeviceState *dev)
         return;
     case ZPCI_FS_STANDBY:
         break;
+    case ZPCI_FS_ERROR:
+        break;
     default:
         pbdev->fh &= ~FH_MASK_ENABLE;
         pbdev->state = ZPCI_FS_DISABLED;
@@ -1509,6 +1511,11 @@ static void s390_pci_device_reset(DeviceState *dev)
     } else if (pbdev->summary_ind) {
         pci_dereg_irqs(pbdev);
     }
+
+    if (pbdev->state == ZPCI_FS_ERROR) {
+        s390_pci_reset(pbdev);
+    }
+
     if (pbdev->iommu->enabled) {
         pci_dereg_ioat(pbdev->iommu);
     }
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 1697a84de7..27e300f95d 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -155,6 +155,12 @@ static bool s390_pci_err_handler(VFIOPCIDevice *vfio_pci, Error **errp)
     return true;
 }
 
+void s390_pci_reset(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vfio_pci = VFIO_PCI_BASE(pbdev->pdev);
+    ioctl(vfio_pci->vbasedev.fd, VFIO_DEVICE_RESET);
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index b45ffa5044..5d7f21023f 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -23,6 +23,7 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
 bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh);
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 bool s390_pci_setup_err_handler(S390PCIBusDevice *pbdev, Error **errp);
+void s390_pci_reset(S390PCIBusDevice *pbdev);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
 {
@@ -45,6 +46,7 @@ static inline bool s390_pci_setup_err_handler(S390PCIBusDevice *pbdev, Error **e
     error_setg(errp, "VFIO not available, cannot setup error handler");
     return false;
 }
+static inline void s390_pci_reset(S390PCIBusDevice *pbdev) { }
 #endif
 
 #endif
-- 
2.43.0
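
A minimal model of the reset path, assuming a simplified state enum and a function pointer in place of the VFIO reset ioctl. Every name here (fn_state, struct fn, ok_reset, bad_reset, fn_reset) is hypothetical. Unlike s390_pci_reset() in the patch, which does not look at the ioctl's return value, this sketch propagates the result to the caller; that is one possible variation, not what the series does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified function states. */
enum fn_state { FS_RESERVED, FS_STANDBY, FS_DISABLED, FS_ENABLED, FS_ERROR };

struct fn {
    enum fn_state state;
    bool enabled_bit;
    int (*host_reset)(struct fn *fn); /* stand-in for the host reset call */
    int resets;                       /* how often host_reset was invoked */
};

static int ok_reset(struct fn *fn)
{
    fn->resets++;
    return 0;
}

static int bad_reset(struct fn *fn)
{
    fn->resets++;
    return -1;
}

/* Returns 0 on success, or the host reset's error code. */
static int fn_reset(struct fn *fn)
{
    assert(fn != NULL);

    switch (fn->state) {
    case FS_RESERVED:
        return 0;
    case FS_STANDBY:
    case FS_ERROR:
        break; /* leave the state untouched */
    default:
        fn->enabled_bit = false;
        fn->state = FS_DISABLED;
        break;
    }

    /* Only a function in the error state needs the host-side reset. */
    if (fn->state == FS_ERROR) {
        return fn->host_reset(fn);
    }
    return 0;
}
```

With the result propagated, a caller could log or escalate a failed recovery attempt instead of silently continuing.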




end of thread, other threads:[~2025-10-06  8:03 UTC | newest]

Thread overview: 18+ messages
2025-09-25 17:48 [PATCH v3 0/5] Error recovery for zPCI passthrough devices Farhan Ali
2025-09-25 17:48 ` [PATCH v3 1/5] [NOTFORMERGE] linux-headers: Update for zpci vfio device Farhan Ali
2025-09-25 17:48 ` [PATCH v3 2/5] vfio/pci: Add an error handler callback Farhan Ali
2025-09-26  4:57   ` Markus Armbruster
2025-09-26  7:40     ` Cédric Le Goater
2025-09-26 18:44       ` Farhan Ali
2025-09-26 17:53     ` Farhan Ali
2025-09-27  5:59       ` Markus Armbruster
2025-09-27  7:05         ` Cédric Le Goater
2025-09-29 17:20           ` Farhan Ali
2025-09-30  9:20             ` Markus Armbruster
2025-09-30 17:15               ` Farhan Ali
2025-10-01  4:52                 ` Markus Armbruster
2025-10-01 18:21                   ` Farhan Ali
2025-10-06  6:06                     ` Markus Armbruster
2025-09-25 17:48 ` [PATCH v3 3/5] vfio: Introduce vfio_device_feature helper function Farhan Ali
2025-09-25 17:48 ` [PATCH v3 4/5] s390x/pci: Add PCI error handling for vfio pci devices Farhan Ali
2025-09-25 17:48 ` [PATCH v3 5/5] s390x/pci: Reset a device in error state Farhan Ali
