* [PATCH] PCIe AER: report non fatal errors only to the functions of the same device
@ 2017-08-04 2:18 Dongdong Liu
2017-08-15 22:50 ` Bjorn Helgaas
0 siblings, 1 reply; 6+ messages in thread
From: Dongdong Liu @ 2017-08-04 2:18 UTC (permalink / raw)
To: helgaas
Cc: linux-pci, gabriele.paoloni, charles.chenxin, linuxarm,
Dongdong Liu
From: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Currently, if an uncorrectable error is reported by an EP, the AER
driver walks over all the devices connected to the upstream port
bus and calls the report_error_detected() callback on each in turn.
If any of the devices connected to the bus does not implement
dev->driver->err_handler->error_detected(), do_recovery() will fail.
However, for non-fatal errors the PCIe link should not be considered
compromised, so it makes sense to report the error only to
the functions of the multifunction device that reported it.
This patch implements this new behaviour for non-fatal errors.
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
drivers/pci/bus.c | 38 ++++++++++++++++++++++++++++++++++++++
drivers/pci/pcie/aer/aerdrv_core.c | 13 ++++++++++++-
include/linux/pci.h | 3 ++-
3 files changed, 52 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index bc56cf1..bc8f8b2 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -364,6 +364,44 @@ void pci_bus_add_devices(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_add_devices);
+/** pci_walk_mf_dev - walk all functions of a multi-function
+ * device calling callback.
+ * @dev a function in a multi-function device
+ * @cb callback to be called for each device found
+ * @userdata arbitrary pointer to be passed to callback.
+ *
+ * Walk, on a given bus, only the adjacent functions of a
+ * multi-function device. Call the provided callback on each
+ * device found.
+ *
+ * We check the return of @cb each time. If it returns anything
+ * other than 0, we break out.
+ *
+ */
+void pci_walk_mf_dev(struct pci_dev *dev, int (*cb)(struct pci_dev *, void *),
+ void *userdata)
+{
+ int retval;
+ struct pci_bus *bus;
+ struct pci_dev *pdev;
+ int ndev;
+
+ bus = dev->bus;
+ ndev = PCI_SLOT(dev->devfn);
+
+ down_read(&pci_bus_sem);
+ /* call cb for all the functions of the mf device */
+ list_for_each_entry(pdev, &bus->devices, bus_list) {
+ if (PCI_SLOT(pdev->devfn) == ndev) {
+ retval = cb(pdev, userdata);
+ if (retval)
+ break;
+ }
+ }
+ up_read(&pci_bus_sem);
+}
+EXPORT_SYMBOL_GPL(pci_walk_mf_dev);
+
/** pci_walk_bus - walk devices on/under bus, calling callback.
* @top bus whose devices should be walked
* @cb callback to be called for each device found
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index b1303b3..67c3dc0 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -390,7 +390,18 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
* If the error is reported by an end point, we think this
* error is related to the upstream link of the end point.
*/
- pci_walk_bus(dev->bus, cb, &result_data);
+ if ((state == pci_channel_io_normal) &&
+ (!pci_ari_enabled(dev->bus)))
+ /*
+ * the error is non fatal so the bus is ok, just walk
+ * through all the functions in a multifunction device.
+ * if ARI is enabled on the bus then there can be only
+ * one device under that bus (so walk all the functions
+ * under the bus).
+ */
+ pci_walk_mf_dev(dev, cb, &result_data);
+ else
+ pci_walk_bus(dev->bus, cb, &result_data);
}
return result_data.result;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4869e66..69e77bb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1269,7 +1269,8 @@ const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
struct pci_dev *dev);
int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
int pass);
-
+void pci_walk_mf_dev(struct pci_dev *dev, int (*cb)(struct pci_dev *, void *),
+ void *userdata);
void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
void *userdata);
int pci_cfg_space_size(struct pci_dev *dev);
--
1.9.1
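The same-slot filter in pci_walk_mf_dev() can be tried outside the kernel. The following is only a userspace sketch under stated assumptions: struct fake_dev, mark_visited() and walk_same_slot() are hypothetical stand-ins, the bus is a plain array rather than a locked list, and only the PCI_DEVFN/PCI_SLOT/PCI_FUNC encoding matches the kernel's uapi macros.

```c
#include <stddef.h>

/* Same encoding as the kernel's uapi macros: devfn = (slot << 3) | function. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Hypothetical stand-in for struct pci_dev; only devfn matters here. */
struct fake_dev {
	unsigned int devfn;
	int visited;
};

/* Hypothetical callback: mark the device as visited, never stop the walk. */
static int mark_visited(struct fake_dev *dev, void *userdata)
{
	(void)userdata;
	dev->visited = 1;
	return 0;
}

/*
 * Visit every entry on "bus" that shares a slot number with "dev",
 * stopping early if the callback returns non-zero. Returns the number
 * of callbacks invoked.
 */
static int walk_same_slot(struct fake_dev *bus, size_t n,
			  const struct fake_dev *dev,
			  int (*cb)(struct fake_dev *, void *), void *userdata)
{
	unsigned int slot = PCI_SLOT(dev->devfn);
	int calls = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (PCI_SLOT(bus[i].devfn) != slot)
			continue;
		calls++;
		if (cb(&bus[i], userdata))
			break;
	}
	return calls;
}
```

With functions 00.0, 00.1, 01.0 and 02.3 on the bus, a walk started from 00.1 visits only 00.0 and 00.1.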
* Re: [PATCH] PCIe AER: report non fatal errors only to the functions of the same device
From: Bjorn Helgaas @ 2017-08-15 22:50 UTC (permalink / raw)
To: Dongdong Liu; +Cc: linux-pci, gabriele.paoloni, charles.chenxin, linuxarm
On Fri, Aug 04, 2017 at 10:18:26AM +0800, Dongdong Liu wrote:
> From: Gabriele Paoloni <gabriele.paoloni@huawei.com>
>
> Currently if an uncorrectable error is reported by an EP the AER
> driver walks over all the devices connected to the upstream port
> bus and in turns call the report_error_detected() callback.
> If any of the devices connected to the bus does not implement
> dev->driver->err_handler->error_detected() do_recovery() will fail.
>
> However for non fatal errors the PCIe link should not be considered
> compromised, therefore it makes sense to report the error only to
> all the functions of a multifunction device.
> This patch implements this new behaviour for non fatal errors.
Why do we bother even with other functions in the same multifunction
device? PCIe r3.1, sec 6.2.2.2.2, says non-fatal errors only affect a
particular transaction, and "devices not associated with the
transaction in error are not impacted."
A transaction is only associated with one function, so other functions
in the same device shouldn't be affected, should they?
* Re: [PATCH] PCIe AER: report non fatal errors only to the functions of the same device
From: Dongdong Liu @ 2017-08-16 8:50 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: linux-pci, gabriele.paoloni, charles.chenxin, linuxarm
Hi Bjorn
On 2017/8/16 6:50, Bjorn Helgaas wrote:
> On Fri, Aug 04, 2017 at 10:18:26AM +0800, Dongdong Liu wrote:
>> From: Gabriele Paoloni <gabriele.paoloni@huawei.com>
>>
>> Currently if an uncorrectable error is reported by an EP the AER
>> driver walks over all the devices connected to the upstream port
>> bus and in turns call the report_error_detected() callback.
>> If any of the devices connected to the bus does not implement
>> dev->driver->err_handler->error_detected() do_recovery() will fail.
>>
>> However for non fatal errors the PCIe link should not be considered
>> compromised, therefore it makes sense to report the error only to
>> all the functions of a multifunction device.
>> This patch implements this new behaviour for non fatal errors.
>
> Why do we bother even with other functions in the same multifunction
> device? PCIe r3.1, sec 6.2.2.2.2, says non-fatal errors only affect a
> particular transaction, and "devices not associated with the
> transaction in error are not impacted."
>
> A transaction is only associated with one function, so other functions
> in the same device shouldn't be affected, should they?
PCIe r3.1, sec 6.2.4 Error Logging
PCI Express errors are not Function-specific.
"Software is responsible for scanning all Functions in a multi-Function device when it detects one of those errors"
Thanks,
Dongdong
* Re: [PATCH] PCIe AER: report non fatal errors only to the functions of the same device
From: Bjorn Helgaas @ 2017-08-16 14:07 UTC (permalink / raw)
To: Dongdong Liu; +Cc: linux-pci, gabriele.paoloni, charles.chenxin, linuxarm
On Wed, Aug 16, 2017 at 04:50:16PM +0800, Dongdong Liu wrote:
> Hi Bjorn
>
> On 2017/8/16 6:50, Bjorn Helgaas wrote:
> >On Fri, Aug 04, 2017 at 10:18:26AM +0800, Dongdong Liu wrote:
> >>From: Gabriele Paoloni <gabriele.paoloni@huawei.com>
> >>
> >>Currently if an uncorrectable error is reported by an EP the AER
> >>driver walks over all the devices connected to the upstream port
> >>bus and in turns call the report_error_detected() callback.
> >>If any of the devices connected to the bus does not implement
> >>dev->driver->err_handler->error_detected() do_recovery() will fail.
> >>
> >>However for non fatal errors the PCIe link should not be considered
> >>compromised, therefore it makes sense to report the error only to
> >>all the functions of a multifunction device.
> >>This patch implements this new behaviour for non fatal errors.
> >
> >Why do we bother even with other functions in the same multifunction
> >device? PCIe r3.1, sec 6.2.2.2.2, says non-fatal errors only affect a
> >particular transaction, and "devices not associated with the
> >transaction in error are not impacted."
> >
> >A transaction is only associated with one function, so other functions
> >in the same device shouldn't be affected, should they?
>
> PCIe r3.1, sec 6.2.4 Error Logging
> PCI Express errors are not Function-specific. "Software is
> responsible for scanning all Functions in a multi-Function device
> when it detects one of those errors"
The previous text basically says that a multi-function device
should generate at most one error reporting message, even if several
functions have logged an error of the same severity. I think that
single message corresponds to a single interrupt.
So when it says "software is responsible for scanning all Functions in
a multi-Function device," I think the point is that software should
read the error reporting registers of all functions in case several of
them have logged errors.
But IIUC, this patch has nothing to do with reading the error CSRs
(which should be done in the PCI/AER core). This patch merely changes
the set of devices for which we call the driver's error reporting
interfaces.
If this is fixing a problem, maybe it would help clarify things if you
could include the concrete details of what's going wrong.
Bjorn
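The reading of sec 6.2.4 given above (one message, then software reads the error registers of every function) can be sketched in userspace C. This is an illustration only, not the AER core: struct fake_func, SLOT_OF() and scan_functions() are hypothetical names, and uncor_status merely models a per-function copy of an uncorrectable error status register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one function: devfn plus a copy of its
 * uncorrectable error status register (0 means nothing logged). */
struct fake_func {
	unsigned int devfn;
	uint32_t uncor_status;
};

/* Device (slot) number encoded in bits 7:3 of devfn. */
#define SLOT_OF(devfn)	(((devfn) >> 3) & 0x1f)

/*
 * One error message arrived naming "reporter". Scan every function in the
 * same device and record the index of each one that logged an error.
 * Returns how many were found (at most "max").
 */
static size_t scan_functions(const struct fake_func *funcs, size_t n,
			     const struct fake_func *reporter,
			     size_t *found, size_t max)
{
	size_t count = 0, i;

	for (i = 0; i < n && count < max; i++) {
		if (SLOT_OF(funcs[i].devfn) != SLOT_OF(reporter->devfn))
			continue;
		if (funcs[i].uncor_status)
			found[count++] = i;
	}
	return count;
}
```

The scan only decides which functions have something logged; which drivers get their error callbacks invoked afterwards is a separate decision, which is the distinction drawn above.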
* RE: [PATCH] PCIe AER: report non fatal errors only to the functions of the same device
From: Gabriele Paoloni @ 2017-08-17 13:06 UTC (permalink / raw)
To: Bjorn Helgaas, liudongdong (C)
Cc: linux-pci@vger.kernel.org, Chenxin (Charles), Linuxarm
Hi Bjorn
> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: 16 August 2017 15:08
> To: liudongdong (C)
> Cc: linux-pci@vger.kernel.org; Gabriele Paoloni; Chenxin (Charles);
> Linuxarm
> Subject: Re: [PATCH] PCIe AER: report non fatal errors only to the
> functions of the same device
>
> On Wed, Aug 16, 2017 at 04:50:16PM +0800, Dongdong Liu wrote:
> > Hi Bjorn
> >
> > On 2017/8/16 6:50, Bjorn Helgaas wrote:
> > >On Fri, Aug 04, 2017 at 10:18:26AM +0800, Dongdong Liu wrote:
> > >>From: Gabriele Paoloni <gabriele.paoloni@huawei.com>
> > >>
> > >>Currently if an uncorrectable error is reported by an EP the AER
> > >>driver walks over all the devices connected to the upstream port
> > >>bus and in turns call the report_error_detected() callback.
> > >>If any of the devices connected to the bus does not implement
> > >>dev->driver->err_handler->error_detected() do_recovery() will fail.
> > >>
> > >>However for non fatal errors the PCIe link should not be considered
> > >>compromised, therefore it makes sense to report the error only to
> > >>all the functions of a multifunction device.
> > >>This patch implements this new behaviour for non fatal errors.
> > >
> > >Why do we bother even with other functions in the same multifunction
> > >device?  PCIe r3.1, sec 6.2.2.2.2, says non-fatal errors only affect a
> > >particular transaction, and "devices not associated with the
> > >transaction in error are not impacted."
> > >
> > >A transaction is only associated with one function, so other functions
> > >in the same device shouldn't be affected, should they?
> >
> > PCIe r3.1, sec 6.2.4 Error Logging
> > PCI Express errors are not Function-specific.  "Software is
> > responsible for scanning all Functions in a multi-Function device
> > when it detects one of those errors"
>
> The previous text basically says that if a multi-function device
> should generate at most one error reporting message, even if several
> functions have logged an error of the same severity.  I think that
> single message corresponds to a single interrupt.
>
> So when it says "software is responsible for scanning all Functions in
> a multi-Function device," I think the point is that software should
> read the error reporting registers of all functions in case several of
> them have logged errors.
Right, looking again at the AER core it seems that find_source_device()
would look for the error sources by walking the PCIe hierarchy starting
from the RP that reported the error (however from AER_MAX_MULTI_ERR_DEVICES
max 5 devices can log an error on a single AER interrupt...).
Anyway as it is now, and assuming that we have no more than 5 functions
in a multi-function device, AER core should call handle_error_source()
for each function that logged an error, right?
>
> But IIUC, this patch has nothing to do with reading the error CSRs
> (which should be done in the PCI/AER core).  This patch merely changes
> the set of devices for which we call the driver's error reporting
> interfaces.
Correct. From our point of view if a fatal AER is reported by a function
then we need to call the driver callbacks also on all the function under
the same bus as the reporting device (as the link is compromised).
We thought that for non-fatal errors this is not necessary as the bus
link should not be considered compromised, but we thought that for MF
devices maybe it would have been appropriate to call the driver callbacks
on the other functions of the same device. However after your consideration
above and after double checking the AER core it seems that also this is
not necessary (in fact for a MF device handle_error_source() will be called
for each function that logged the error)
>
> If this is fixing a problem, maybe it would help clarify things if you
> could include the concrete details of what's going wrong.
In Hi1620 we have some integrated controllers that appear as PCIe EPs
under the same bus. Some of these controllers (e.g. the SATA
controller) are missing the err_handler callbacks.
If one device reports a non-fatal uncorrectable error with the current
AER core code the callbacks for all the devices under the same bus will
be called and, if any of the devices is missing the callback all the
devices in the subtree are left in error state without recovery...
This patch is needed to sort out a situation like this one.
Anyway I think that after the considerations above on the MF device
I can modify this patch to just call the callback for the reporting
function...?
Thanks
Gab
* Re: [PATCH] PCIe AER: report non fatal errors only to the functions of the same device
2017-08-17 13:06 ` Gabriele Paoloni
@ 2017-08-17 16:48 ` Bjorn Helgaas
0 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2017-08-17 16:48 UTC (permalink / raw)
To: Gabriele Paoloni
Cc: liudongdong (C), linux-pci@vger.kernel.org, Chenxin (Charles),
Linuxarm
On Thu, Aug 17, 2017 at 01:06:32PM +0000, Gabriele Paoloni wrote:
> Hi Bjorn
>
> > -----Original Message-----
> > From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> > Sent: 16 August 2017 15:08
> > To: liudongdong (C)
> > Cc: linux-pci@vger.kernel.org; Gabriele Paoloni; Chenxin (Charles);
> > Linuxarm
> > Subject: Re: [PATCH] PCIe AER: report non fatal errors only to the
> > functions of the same device
> >
> > On Wed, Aug 16, 2017 at 04:50:16PM +0800, Dongdong Liu wrote:
> > > Hi Bjorn
> > >
> > > On 2017/8/16 6:50, Bjorn Helgaas wrote:
> > > >On Fri, Aug 04, 2017 at 10:18:26AM +0800, Dongdong Liu wrote:
> > > >>From: Gabriele Paoloni <gabriele.paoloni@huawei.com>
> > > >>
> > > >>Currently, if an uncorrectable error is reported by an EP, the AER
> > > >>driver walks over all the devices connected to the upstream port
> > > >>bus and in turn calls the report_error_detected() callback on each.
> > > >>If any of the devices connected to the bus does not implement
> > > >>dev->driver->err_handler->error_detected(), do_recovery() will fail.
> > > >>
> > > >>However for non fatal errors the PCIe link should not be considered
> > > >>compromised, therefore it makes sense to report the error only to
> > > >>all the functions of a multifunction device.
> > > >>This patch implements this new behaviour for non fatal errors.
> > > >
> > > >Why do we bother even with other functions in the same multifunction
> > > >device? PCIe r3.1, sec 6.2.2.2.2, says non-fatal errors only affect a
> > > >particular transaction, and "devices not associated with the
> > > >transaction in error are not impacted."
> > > >
> > > >A transaction is only associated with one function, so other functions
> > > >in the same device shouldn't be affected, should they?
> > >
> > > PCIe r3.1, sec 6.2.4 Error Logging
> > > PCI Express errors are not Function-specific. "Software is
> > > responsible for scanning all Functions in a multi-Function device
> > > when it detects one of those errors"
> >
> > The previous text basically says that a multi-function device
> > should generate at most one error reporting message, even if several
> > functions have logged an error of the same severity. I think that
> > single message corresponds to a single interrupt.
> >
> > So when it says "software is responsible for scanning all Functions in
> > a multi-Function device," I think the point is that software should
> > read the error reporting registers of all functions in case several of
> > them have logged errors.
>
> Right, looking again at the AER core it seems that find_source_device()
> would look for the error sources by walking the PCIe hierarchy starting
> from the RP that reported the error (however, AER_MAX_MULTI_ERR_DEVICES
> limits this to at most 5 devices per AER interrupt...).
>
> Anyway as it is now, and assuming that we have no more than 5 functions
> in a multi-function device, AER core should call handle_error_source()
> for each function that logged an error, right?
>
> >
> > But IIUC, this patch has nothing to do with reading the error CSRs
> > (which should be done in the PCI/AER core). This patch merely changes
> > the set of devices for which we call the driver's error reporting
> > interfaces.
>
> Correct. From our point of view, if a fatal AER is reported by a function
> then we need to call the driver callbacks also on all the functions under
> the same bus as the reporting device (as the link is compromised).
>
> We thought that for non-fatal errors this is not necessary as the bus
> link should not be considered compromised, but we thought that for MF
> devices it might be appropriate to call the driver callbacks
> on the other functions of the same device. However, after your comments
> above and after double-checking the AER core, it seems that this is not
> necessary either (in fact, for a MF device handle_error_source() will be
> called for each function that logged the error).
>
> >
> > If this is fixing a problem, maybe it would help clarify things if you
> > could include the concrete details of what's going wrong.
>
> In Hi1620 we have some integrated controllers that appear as PCIe EPs
> under the same bus. Some of these controllers (e.g. the SATA
> controller) are missing the err_handler callbacks.
>
> With the current AER core code, if one device reports a non-fatal
> uncorrectable error, the callbacks for all the devices under the same bus
> will be called and, if any of the devices is missing the callback, all the
> devices in the subtree are left in error state without recovery...
> This patch is needed to sort out a situation like this one.
>
> Anyway I think that after the considerations above on the MF device
> I can modify this patch to just call the callback for the reporting
> function...?
Sounds like a reasonable approach. I'll compare the patch with the
spec in more detail when you post it.
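
The three notification policies discussed in this thread can be compared with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_dev, struct model_bus, enum walk_policy and model_recover() are hypothetical names invented for illustration, and the rule "recovery fails if any notified device lacks an error_detected handler" is taken from the patch description rather than from the actual do_recovery() implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI function; NOT the kernel's struct pci_dev. */
struct model_dev {
	const char *name;
	unsigned int slot;      /* device number on the bus */
	unsigned int func;      /* function number within the device */
	bool has_err_handler;   /* models driver->err_handler->error_detected != NULL */
	int notified;           /* how many times this function was notified */
};

/* Hypothetical stand-in for a bus: a flat array of functions. */
struct model_bus {
	struct model_dev *devs;
	size_t ndevs;
};

/* Which functions get the error callback for a non-fatal error. */
enum walk_policy {
	WALK_WHOLE_BUS,     /* behaviour described as current in the thread */
	WALK_SAME_DEVICE,   /* the posted patch: all functions of one device */
	WALK_REPORTER_ONLY, /* the follow-up idea: only the reporting function */
};

/* Notify one function; returns false if it cannot handle the error. */
static bool notify(struct model_dev *d)
{
	d->notified++;
	return d->has_err_handler;
}

/*
 * Assumed rule: recovery succeeds only if every notified function
 * has an error handler.
 */
bool model_recover(struct model_bus *bus, struct model_dev *reporter,
		   enum walk_policy policy)
{
	bool ok = true;

	for (size_t i = 0; i < bus->ndevs; i++) {
		struct model_dev *d = &bus->devs[i];
		bool selected;

		switch (policy) {
		case WALK_WHOLE_BUS:
			selected = true;
			break;
		case WALK_SAME_DEVICE:
			selected = (d->slot == reporter->slot);
			break;
		default: /* WALK_REPORTER_ONLY */
			selected = (d == reporter);
			break;
		}

		if (selected && !notify(d))
			ok = false;
	}
	return ok;
}
```

With a bus holding a two-function device whose drivers both have handlers, plus an unrelated device without one (as in the Hi1620 SATA case described above), WALK_WHOLE_BUS reports failure while the two narrower policies succeed; the model says nothing about which policy the spec actually requires.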
Thread overview: 6+ messages
2017-08-04 2:18 [PATCH] PCIe AER: report non fatal errors only to the functions of the same device Dongdong Liu
2017-08-15 22:50 ` Bjorn Helgaas
2017-08-16 8:50 ` Dongdong Liu
2017-08-16 14:07 ` Bjorn Helgaas
2017-08-17 13:06 ` Gabriele Paoloni
2017-08-17 16:48 ` Bjorn Helgaas