* [PATCH v2 1/4] Documentation: PCI: Sync AER doc with code
2025-09-15 13:50 [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Lukas Wunner
@ 2025-09-15 13:50 ` Lukas Wunner
2025-09-15 13:50 ` [PATCH v2 2/4] Documentation: PCI: Sync error recovery " Lukas Wunner
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Lukas Wunner @ 2025-09-15 13:50 UTC (permalink / raw)
To: Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Sathyanarayanan Kuppuswamy,
Niklas Schnelle, Linas Vepstas, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-doc,
Brian Norris
The PCIe Advanced Error Reporting driver has evolved over the years but
its documentation hasn't. Catch up with past code changes:
* The documentation claims that Correctable Errors are logged with
KERN_INFO severity, but the code uses KERN_WARNING.
It had used KERN_WARNING from the beginning with commit 6c2b374d7485
("PCI-Express AER implemetation: AER core and aerdriver"). In 2013,
commit 2cced2d95961 ("aerdrv: Cleanup log output for AER") switched to
KERN_ERR, until 2020 when it was changed back to KERN_WARNING by commit
e83e2ca3c395 ("PCI/AER: Log correctable errors as warning, not error").
* An example log message in the documentation uses the term "Uncorrected",
but the code has used "Uncorrectable" since commit 02a06f5f1a6a ("PCI/AER:
Use 'Correctable' and 'Uncorrectable' spec terms for errors").
* The example contains the Requester ID "id=0500", which has been omitted
since commit 010caed4ccb6 ("PCI/AER: Decode Error Source Requester ID").
* The example contains the error name "Unsupported Request", which has
instead been reported as "UnsupReq" since commit bd237801fef2 ("PCI/AER:
Adopt lspci names for AER error decoding").
* The example doesn't prepend "0x" to hex values from the TLP Header Log,
which the code has done since commit f68ea779d98a ("PCI: Add
pcie_print_tlp_log() to print TLP Header and Prefix Log").
* The documentation refers to a reset_link callback which was removed by
commit b6cf1a42f916 ("PCI/ERR: Remove service dependency in
pcie_do_recovery()").
* Commit 579086225502 ("PCI/ERR: Recover from RCiEP AER errors") added
support for recovering Root Complex Integrated Endpoints by applying a
Function Level Reset instead of the Secondary Bus Reset that is
applied otherwise.
* On non-fatal errors, a reset was previously never performed. But the
AER driver has just been amended to allow drivers to opt in to a reset.
* The documentation claims that a warning message is logged if a driver
lacks pci_error_handlers. But the message has been informational
(logged with KERN_INFO severity) since its introduction with commit
01daacfb9035 ("PCI/AER: Log which device prevents error recovery").
The documentation claims that the message is only logged for fatal
errors, which is incorrect. Moreover it refers to "section 3", even
though the documentation no longer contains section numbers since commit
4e37f055a92e ("Documentation: PCI: convert pcieaer-howto.txt to reST").
Section 3 is titled "Developer Guide". That's the same section where
the reference is located, so it is self-referential and can be dropped.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Brian Norris <briannorris@chromium.org>
---
Documentation/PCI/pcieaer-howto.rst | 81 ++++++++++++++---------------
1 file changed, 38 insertions(+), 43 deletions(-)
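
As a rough illustration of the reset rule documented by this patch, the
following standalone C sketch models the decision. It is not the kernel's
pcie_do_recovery(); the enum and the function reset_needed() are
hypothetical names that only mirror the PCI_ERS_RESULT_* values mentioned
in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI_ERS_RESULT_* values in the text. */
enum ers_result {
	ERS_CAN_RECOVER,	/* driver can recover without a reset */
	ERS_NEED_RESET,		/* driver requests a reset */
	ERS_DISCONNECT,		/* driver considers the device unrecoverable */
};

/*
 * Model of the documented rule:
 * - fatal error: a reset is always performed, drivers cannot opt out
 * - non-fatal error: a reset is performed only if at least one
 *   affected driver asks for it
 */
bool reset_needed(bool fatal, const enum ers_result *votes, size_t n)
{
	assert(votes != NULL || n == 0);

	if (fatal)
		return true;

	for (size_t i = 0; i < n; i++) {
		if (votes[i] == ERS_NEED_RESET)
			return true;	/* one request overrides the others */
	}

	return false;
}
```

The sketch deliberately ignores what happens after the reset decision
(mmio_enabled, slot_reset, resume); it only captures who gets to decide.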
diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
index 4b71e2f43ca7..d448efe572c8 100644
--- a/Documentation/PCI/pcieaer-howto.rst
+++ b/Documentation/PCI/pcieaer-howto.rst
@@ -70,16 +70,16 @@ AER error output
----------------
When a PCIe AER error is captured, an error message will be output to
-console. If it's a correctable error, it is output as an info message.
+console. If it's a correctable error, it is output as a warning message.
Otherwise, it is printed as an error. So users could choose different
log level to filter out correctable error messages.
Below shows an example::
- 0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
+ 0000:50:00.0: PCIe Bus Error: severity=Uncorrectable (Fatal), type=Transaction Layer, (Requester ID)
0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
- 0000:50:00.0: [20] Unsupported Request (First)
- 0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
+ 0000:50:00.0: [20] UnsupReq (First)
+ 0000:50:00.0: TLP Header: 0x04000001 0x00200a03 0x05010000 0x00050100
In the example, 'Requester ID' means the ID of the device that sent
the error message to the Root Port. Please refer to PCIe specs for other
@@ -152,18 +152,6 @@ the device driver.
Provide callbacks
-----------------
-callback reset_link to reset PCIe link
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-This callback is used to reset the PCIe physical link when a
-fatal error happens. The Root Port AER service driver provides a
-default reset_link function, but different Upstream Ports might
-have different specifications to reset the PCIe link, so
-Upstream Port drivers may provide their own reset_link functions.
-
-Section 3.2.2.2 provides more detailed info on when to call
-reset_link.
-
PCI error-recovery callbacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -174,8 +162,8 @@ when performing error recovery actions.
Data struct pci_driver has a pointer, err_handler, to point to
pci_error_handlers who consists of a couple of callback function
pointers. The AER driver follows the rules defined in
-pci-error-recovery.rst except PCIe-specific parts (e.g.
-reset_link). Please refer to pci-error-recovery.rst for detailed
+pci-error-recovery.rst except PCIe-specific parts (see
+below). Please refer to pci-error-recovery.rst for detailed
definitions of the callbacks.
The sections below specify when to call the error callback functions.
@@ -189,10 +177,21 @@ software intervention or any loss of data. These errors do not
require any recovery actions. The AER driver clears the device's
correctable error status register accordingly and logs these errors.
-Non-correctable (non-fatal and fatal) errors
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Uncorrectable (non-fatal and fatal) errors
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-If an error message indicates a non-fatal error, performing link reset
+The AER driver performs a Secondary Bus Reset to recover from
+uncorrectable errors. The reset is applied at the port above
+the originating device: If the originating device is an Endpoint,
+only the Endpoint is reset. If on the other hand the originating
+device has subordinate devices, those are all affected by the
+reset as well.
+
+If the originating device is a Root Complex Integrated Endpoint,
+there's no port above where a Secondary Bus Reset could be applied.
+In this case, the AER driver instead applies a Function Level Reset.
+
+If an error message indicates a non-fatal error, performing a reset
at upstream is not required. The AER driver calls error_detected(dev,
pci_channel_io_normal) to all drivers associated within a hierarchy in
question. For example::
@@ -204,38 +203,34 @@ Downstream Port B and Endpoint.
A driver may return PCI_ERS_RESULT_CAN_RECOVER,
PCI_ERS_RESULT_DISCONNECT, or PCI_ERS_RESULT_NEED_RESET, depending on
-whether it can recover or the AER driver calls mmio_enabled as next.
+whether it can recover without a reset, considers the device unrecoverable
+or needs a reset for recovery. If all affected drivers agree that they can
+recover without a reset, it is skipped. Should one driver request a reset,
+it overrides all other drivers.
If an error message indicates a fatal error, kernel will broadcast
error_detected(dev, pci_channel_io_frozen) to all drivers within
-a hierarchy in question. Then, performing link reset at upstream is
-necessary. As different kinds of devices might use different approaches
-to reset link, AER port service driver is required to provide the
-function to reset link via callback parameter of pcie_do_recovery()
-function. If reset_link is not NULL, recovery function will use it
-to reset the link. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER
-and reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
-to mmio_enabled.
+a hierarchy in question. Then, performing a reset at upstream is
+necessary. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER
+to indicate that recovery without a reset is possible, the error
+handling goes to mmio_enabled, but afterwards a reset is still
+performed.
-Frequent Asked Questions
-------------------------
+In other words, for non-fatal errors, drivers may opt in to a reset.
+But for fatal errors, they cannot opt out of a reset, based on the
+assumption that the link is unreliable.
+
+Frequently Asked Questions
+--------------------------
Q:
What happens if a PCIe device driver does not provide an
error recovery handler (pci_driver->err_handler is equal to NULL)?
A:
- The devices attached with the driver won't be recovered. If the
- error is fatal, kernel will print out warning messages. Please refer
- to section 3 for more information.
-
-Q:
- What happens if an upstream port service driver does not provide
- callback reset_link?
-
-A:
- Fatal error recovery will fail if the errors are reported by the
- upstream ports who are attached by the service driver.
+ The devices attached with the driver won't be recovered.
+ The kernel will print out informational messages to identify
+ unrecoverable devices.
Software error injection
--
2.51.0
* [PATCH v2 2/4] Documentation: PCI: Sync error recovery doc with code
2025-09-15 13:50 [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Lukas Wunner
2025-09-15 13:50 ` [PATCH v2 1/4] Documentation: PCI: Sync AER doc with code Lukas Wunner
@ 2025-09-15 13:50 ` Lukas Wunner
2025-09-15 15:51 ` Niklas Schnelle
2025-09-15 13:50 ` [PATCH v2 3/4] Documentation: PCI: Amend error recovery doc with DPC/AER specifics Lukas Wunner
` (3 subsequent siblings)
5 siblings, 1 reply; 10+ messages in thread
From: Lukas Wunner @ 2025-09-15 13:50 UTC (permalink / raw)
To: Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Sathyanarayanan Kuppuswamy,
Niklas Schnelle, Linas Vepstas, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-doc,
Brian Norris
Amend the documentation on PCI error recovery to fix minor inaccuracies
vis-à-vis the actual code:
* The documentation claims that a missing ->resume() or ->mmio_enabled()
callback always leads to recovery through reset. But none of the
implementations do this (pcie_do_recovery(), eeh_handle_normal_event(),
zpci_event_do_error_state_clear()).
Drop the claim to align the documentation with the code.
* The documentation does not list PCI_ERS_RESULT_RECOVERED as a valid
return value from ->error_detected(). But none of the implementations
forbid this and some drivers are returning it, e.g.:
drivers/bus/mhi/host/pci_generic.c
drivers/infiniband/hw/hfi1/pcie.c
Further down in the documentation it is implied that the return value is
in fact allowed:
"The platform will call the resume() callback on all affected device
drivers if all drivers on the segment have returned
PCI_ERS_RESULT_RECOVERED from one of the 3 previous callbacks."
The "3 previous callbacks" being ->error_detected(), ->mmio_enabled()
and ->slot_reset().
Add it to the valid return values for consistency.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Brian Norris <briannorris@chromium.org>
---
Documentation/PCI/pci-error-recovery.rst | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
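
To make the first point concrete, here is a minimal standalone C sketch
of how a missing ->mmio_enabled() callback can simply be skipped instead
of forcing a reset. The struct, enum and function names are hypothetical
and chosen for this example; they are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the PCI_ERS_RESULT_* values in the text. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
};

/* Hypothetical, reduced version of a driver's error handler table. */
struct err_handlers {
	enum ers_result (*mmio_enabled)(void *dev);	/* may be NULL */
};

/* Example callback: the driver reports that the device is usable again. */
enum ers_result sample_mmio_enabled(void *dev)
{
	(void)dev;
	return ERS_RECOVERED;
}

/*
 * If the driver has no handler table or no mmio_enabled callback,
 * keep the current result unchanged rather than escalating to a reset.
 */
enum ers_result report_mmio_enabled(const struct err_handlers *h, void *dev,
				    enum ers_result current)
{
	if (h == NULL || h->mmio_enabled == NULL)
		return current;

	return h->mmio_enabled(dev);
}
```

A driver without the callback leaves the earlier result in place, which
matches the amended wording that the driver "does not need these
callbacks for recovery".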
diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst
index 42e1e78353f3..d5c661baa87f 100644
--- a/Documentation/PCI/pci-error-recovery.rst
+++ b/Documentation/PCI/pci-error-recovery.rst
@@ -108,8 +108,8 @@ A driver does not have to implement all of these callbacks; however,
if it implements any, it must implement error_detected(). If a callback
is not implemented, the corresponding feature is considered unsupported.
For example, if mmio_enabled() and resume() aren't there, then it
-is assumed that the driver is not doing any direct recovery and requires
-a slot reset. Typically a driver will want to know about
+is assumed that the driver does not need these callbacks
+for recovery. Typically a driver will want to know about
a slot_reset().
The actual steps taken by a platform to recover from a PCI error
@@ -141,6 +141,9 @@ shouldn't do any new IOs. Called in task context. This is sort of a
All drivers participating in this system must implement this call.
The driver must return one of the following result codes:
+ - PCI_ERS_RESULT_RECOVERED
+ Driver returns this if it thinks the device is usable despite
+ the error and does not need further intervention.
- PCI_ERS_RESULT_CAN_RECOVER
Driver returns this if it thinks it might be able to recover
the HW by just banging IOs or if it wants to be given
--
2.51.0
* Re: [PATCH v2 2/4] Documentation: PCI: Sync error recovery doc with code
2025-09-15 13:50 ` [PATCH v2 2/4] Documentation: PCI: Sync error recovery " Lukas Wunner
@ 2025-09-15 15:51 ` Niklas Schnelle
0 siblings, 0 replies; 10+ messages in thread
From: Niklas Schnelle @ 2025-09-15 15:51 UTC (permalink / raw)
To: Lukas Wunner, Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Sathyanarayanan Kuppuswamy,
Linas Vepstas, Mahesh J Salgaonkar, Oliver OHalloran,
linuxppc-dev, linux-pci, linux-doc, Brian Norris
On Mon, 2025-09-15 at 15:50 +0200, Lukas Wunner wrote:
> Amend the documentation on PCI error recovery to fix minor inaccuracies
> vis-à-vis the actual code:
>
> * The documentation claims that a missing ->resume() or ->mmio_enabled()
> callback always leads to recovery through reset. But none of the
> implementations do this (pcie_do_recovery(), eeh_handle_normal_event(),
> zpci_event_do_error_state_clear()).
>
> Drop the claim to align the documentation with the code.
>
> * The documentation does not list PCI_ERS_RESULT_RECOVERED as a valid
> return value from ->error_detected(). But none of the implementations
> forbid this and some drivers are returning it, e.g.:
> drivers/bus/mhi/host/pci_generic.c
> drivers/infiniband/hw/hfi1/pcie.c
>
> Further down in the documentation it is implied that the return value is
> in fact allowed:
> "The platform will call the resume() callback on all affected device
> drivers if all drivers on the segment have returned
> PCI_ERS_RESULT_RECOVERED from one of the 3 previous callbacks."
>
> The "3 previous callbacks" being ->error_detected(), ->mmio_enabled()
> and ->slot_reset().
>
> Add it to the valid return values for consistency.
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Brian Norris <briannorris@chromium.org>
> ---
> Documentation/PCI/pci-error-recovery.rst | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst
> index 42e1e78353f3..d5c661baa87f 100644
> --- a/Documentation/PCI/pci-error-recovery.rst
> +++ b/Documentation/PCI/pci-error-recovery.rst
> @@ -108,8 +108,8 @@ A driver does not have to implement all of these callbacks; however,
> if it implements any, it must implement error_detected(). If a callback
> is not implemented, the corresponding feature is considered unsupported.
> For example, if mmio_enabled() and resume() aren't there, then it
> -is assumed that the driver is not doing any direct recovery and requires
> -a slot reset. Typically a driver will want to know about
> +is assumed that the driver does not need these callbacks
> +for recovery. Typically a driver will want to know about
> a slot_reset().
>
> The actual steps taken by a platform to recover from a PCI error
> @@ -141,6 +141,9 @@ shouldn't do any new IOs. Called in task context. This is sort of a
> All drivers participating in this system must implement this call.
> The driver must return one of the following result codes:
>
> + - PCI_ERS_RESULT_RECOVERED
> + Driver returns this if it thinks the device is usable despite
> + the error and does not need further intervention.
> - PCI_ERS_RESULT_CAN_RECOVER
> Driver returns this if it thinks it might be able to recover
> the HW by just banging IOs or if it wants to be given
Thanks and good catch on these inaccuracies.
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
* [PATCH v2 3/4] Documentation: PCI: Amend error recovery doc with DPC/AER specifics
2025-09-15 13:50 [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Lukas Wunner
2025-09-15 13:50 ` [PATCH v2 1/4] Documentation: PCI: Sync AER doc with code Lukas Wunner
2025-09-15 13:50 ` [PATCH v2 2/4] Documentation: PCI: Sync error recovery " Lukas Wunner
@ 2025-09-15 13:50 ` Lukas Wunner
2025-09-15 15:43 ` Niklas Schnelle
2025-09-15 13:50 ` [PATCH v2 4/4] Documentation: PCI: Tidy error recovery doc's PCIe nomenclature Lukas Wunner
` (2 subsequent siblings)
5 siblings, 1 reply; 10+ messages in thread
From: Lukas Wunner @ 2025-09-15 13:50 UTC (permalink / raw)
To: Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Sathyanarayanan Kuppuswamy,
Niklas Schnelle, Linas Vepstas, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-doc,
Brian Norris
Amend the documentation on PCI error recovery with specifics about
Downstream Port Containment and Advanced Error Reporting:
* Explain that with DPC, devices are inaccessible upon an error (similar
to EEH on powerpc) and do not become accessible until the link is
re-enabled.
* Explain that with AER, although devices may already be accessible in the
->error_detected() callback, accesses should be deferred to the
->mmio_enabled() callback for compatibility with EEH on powerpc and with
s390.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Brian Norris <briannorris@chromium.org>
---
Documentation/PCI/pci-error-recovery.rst | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
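
The accessibility check mentioned in the last note can be pictured with
the standalone C sketch below. It does not use the kernel's
PCI_POSSIBLE_ERROR() macro; read_looks_like_error() and
device_seems_accessible() are hypothetical helpers that apply the same
"all 1's" idea to a 32-bit value which the caller has already read from
the device.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: a 32-bit read that returns all 1's may indicate
 * that the device did not respond. A register can legitimately contain
 * all 1's, so this is only a hint, not proof of an error.
 */
bool read_looks_like_error(uint32_t val)
{
	return val == UINT32_MAX;
}

/*
 * Hypothetical helper: judge accessibility from a dword that should
 * never be all 1's on a working device, such as the combined
 * Vendor ID / Device ID dword at the start of config space.
 */
bool device_seems_accessible(uint32_t id_dword)
{
	return !read_looks_like_error(id_dword);
}
```

A driver following this pattern would pick a register whose valid values
exclude all 1's, so that the check cannot misfire on ordinary data.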
diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst
index d5c661baa87f..9e1e2f2a13fa 100644
--- a/Documentation/PCI/pci-error-recovery.rst
+++ b/Documentation/PCI/pci-error-recovery.rst
@@ -122,6 +122,10 @@ A PCI bus error is detected by the PCI hardware. On powerpc, the slot
is isolated, in that all I/O is blocked: all reads return 0xffffffff,
all writes are ignored.
+Similarly, on platforms supporting Downstream Port Containment
+(PCIe r7.0 sec 6.2.11), the link to the sub-hierarchy with the
+faulting device is disabled. Any device in the sub-hierarchy
+becomes inaccessible.
STEP 1: Notification
--------------------
@@ -204,6 +208,24 @@ link reset was performed by the HW. If the platform can't just re-enable IOs
without a slot reset or a link reset, it will not call this callback, and
instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
+.. note::
+
+ On platforms supporting Advanced Error Reporting (PCIe r7.0 sec 6.2),
+ the faulting device may already be accessible in STEP 1 (Notification).
+ Drivers should nevertheless defer accesses to STEP 2 (MMIO Enabled)
+ to be compatible with EEH on powerpc and with s390 (where devices are
+ inaccessible until STEP 2).
+
+ On platforms supporting Downstream Port Containment, the link to the
+ sub-hierarchy with the faulting device is re-enabled in STEP 3 (Link
+ Reset). Hence devices in the sub-hierarchy are inaccessible until
+ STEP 4 (Slot Reset).
+
+ For errors such as Surprise Down (PCIe r7.0 sec 6.2.7), the device
+ may not even be accessible in STEP 4 (Slot Reset). Drivers can detect
+ accessibility by checking whether reads from the device return all 1's
+ (PCI_POSSIBLE_ERROR()).
+
.. note::
The following is proposed; no platform implements this yet:
--
2.51.0
* Re: [PATCH v2 3/4] Documentation: PCI: Amend error recovery doc with DPC/AER specifics
2025-09-15 13:50 ` [PATCH v2 3/4] Documentation: PCI: Amend error recovery doc with DPC/AER specifics Lukas Wunner
@ 2025-09-15 15:43 ` Niklas Schnelle
0 siblings, 0 replies; 10+ messages in thread
From: Niklas Schnelle @ 2025-09-15 15:43 UTC (permalink / raw)
To: Lukas Wunner, Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Sathyanarayanan Kuppuswamy,
Linas Vepstas, Mahesh J Salgaonkar, Oliver OHalloran,
linuxppc-dev, linux-pci, linux-doc, Brian Norris
On Mon, 2025-09-15 at 15:50 +0200, Lukas Wunner wrote:
> Amend the documentation on PCI error recovery with specifics about
> Downstream Port Containment and Advanced Error Reporting:
>
> * Explain that with DPC, devices are inaccessible upon an error (similar
> to EEH on powerpc) and do not become accessible until the link is
> re-enabled.
>
> * Explain that with AER, although devices may already be accessible in the
> ->error_detected() callback, accesses should be deferred to the
> ->mmio_enabled() callback for compatibility with EEH on powerpc and with
> s390.
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Brian Norris <briannorris@chromium.org>
> ---
> Documentation/PCI/pci-error-recovery.rst | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst
> index d5c661baa87f..9e1e2f2a13fa 100644
> --- a/Documentation/PCI/pci-error-recovery.rst
> +++ b/Documentation/PCI/pci-error-recovery.rst
> @@ -122,6 +122,10 @@ A PCI bus error is detected by the PCI hardware. On powerpc, the slot
> is isolated, in that all I/O is blocked: all reads return 0xffffffff,
> all writes are ignored.
>
> +Similarly, on platforms supporting Downstream Port Containment
> +(PCIe r7.0 sec 6.2.11), the link to the sub-hierarchy with the
> +faulting device is disabled. Any device in the sub-hierarchy
> +becomes inaccessible.
>
> STEP 1: Notification
> --------------------
> @@ -204,6 +208,24 @@ link reset was performed by the HW. If the platform can't just re-enable IOs
> without a slot reset or a link reset, it will not call this callback, and
> instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
>
> +.. note::
> +
> + On platforms supporting Advanced Error Reporting (PCIe r7.0 sec 6.2),
> + the faulting device may already be accessible in STEP 1 (Notification).
> + Drivers should nevertheless defer accesses to STEP 2 (MMIO Enabled)
> + to be compatible with EEH on powerpc and with s390 (where devices are
> + inaccessible until STEP 2).
> +
> + On platforms supporting Downstream Port Containment, the link to the
> + sub-hierarchy with the faulting device is re-enabled in STEP 3 (Link
> + Reset). Hence devices in the sub-hierarchy are inaccessible until
> + STEP 4 (Slot Reset).
> +
> + For errors such as Surprise Down (PCIe r7.0 sec 6.2.7), the device
> + may not even be accessible in STEP 4 (Slot Reset). Drivers can detect
> + accessibility by checking whether reads from the device return all 1's
> + (PCI_POSSIBLE_ERROR()).
> +
> .. note::
>
> The following is proposed; no platform implements this yet:
Thanks for improving this. Makes sense to mention and spell this out
explicitly.
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
* [PATCH v2 4/4] Documentation: PCI: Tidy error recovery doc's PCIe nomenclature
2025-09-15 13:50 [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Lukas Wunner
` (2 preceding siblings ...)
2025-09-15 13:50 ` [PATCH v2 3/4] Documentation: PCI: Amend error recovery doc with DPC/AER specifics Lukas Wunner
@ 2025-09-15 13:50 ` Lukas Wunner
2025-09-15 15:46 ` Niklas Schnelle
2025-09-15 15:25 ` [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Sathyanarayanan Kuppuswamy
2025-09-16 15:55 ` Bjorn Helgaas
5 siblings, 1 reply; 10+ messages in thread
From: Lukas Wunner @ 2025-09-15 13:50 UTC (permalink / raw)
To: Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Sathyanarayanan Kuppuswamy,
Niklas Schnelle, Linas Vepstas, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-doc,
Brian Norris
Commit 11502feab423 ("Documentation: PCI: Tidy AER documentation")
replaced the terms "PCI-E", "PCI-Express" and "PCI Express" with "PCIe"
in the AER documentation.
Do the same in the documentation on PCI error recovery. While at it,
add a missing period and a missing blank.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Brian Norris <briannorris@chromium.org>
---
Documentation/PCI/pci-error-recovery.rst | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst
index 9e1e2f2a13fa..5df481ac6193 100644
--- a/Documentation/PCI/pci-error-recovery.rst
+++ b/Documentation/PCI/pci-error-recovery.rst
@@ -13,7 +13,7 @@ PCI Error Recovery
Many PCI bus controllers are able to detect a variety of hardware
PCI errors on the bus, such as parity errors on the data and address
buses, as well as SERR and PERR errors. Some of the more advanced
-chipsets are able to deal with these errors; these include PCI-E chipsets,
+chipsets are able to deal with these errors; these include PCIe chipsets,
and the PCI-host bridges found on IBM Power4, Power5 and Power6-based
pSeries boxes. A typical action taken is to disconnect the affected device,
halting all I/O to it. The goal of a disconnection is to avoid system
@@ -206,7 +206,7 @@ reset or some such, but not restart operations. This callback is made if
all drivers on a segment agree that they can try to recover and if no automatic
link reset was performed by the HW. If the platform can't just re-enable IOs
without a slot reset or a link reset, it will not call this callback, and
-instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
+instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset).
.. note::
@@ -259,14 +259,14 @@ The driver should return one of the following result codes:
The next step taken depends on the results returned by the drivers.
If all drivers returned PCI_ERS_RESULT_RECOVERED, then the platform
-proceeds to either STEP3 (Link Reset) or to STEP 5 (Resume Operations).
+proceeds to either STEP 3 (Link Reset) or to STEP 5 (Resume Operations).
If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform
proceeds to STEP 4 (Slot Reset)
STEP 3: Link Reset
------------------
-The platform resets the link. This is a PCI-Express specific step
+The platform resets the link. This is a PCIe specific step
and is done whenever a fatal error has been detected that can be
"solved" by resetting the link.
@@ -288,13 +288,13 @@ that is equivalent to what it would be after a fresh system
power-on followed by power-on BIOS/system firmware initialization.
Soft reset is also known as hot-reset.
-Powerpc fundamental reset is supported by PCI Express cards only
+Powerpc fundamental reset is supported by PCIe cards only
and results in device's state machines, hardware logic, port states and
configuration registers to initialize to their default conditions.
For most PCI devices, a soft reset will be sufficient for recovery.
Optional fundamental reset is provided to support a limited number
-of PCI Express devices for which a soft reset is not sufficient
+of PCIe devices for which a soft reset is not sufficient
for recovery.
If the platform supports PCI hotplug, then the reset might be
@@ -338,7 +338,7 @@ Result codes:
- PCI_ERS_RESULT_DISCONNECT
Same as above.
-Drivers for PCI Express cards that require a fundamental reset must
+Drivers for PCIe cards that require a fundamental reset must
set the needs_freset bit in the pci_dev structure in their probe function.
For example, the QLogic qla2xxx driver sets the needs_freset bit for certain
PCI card types::
--
2.51.0
* Re: [PATCH v2 4/4] Documentation: PCI: Tidy error recovery doc's PCIe nomenclature
2025-09-15 13:50 ` [PATCH v2 4/4] Documentation: PCI: Tidy error recovery doc's PCIe nomenclature Lukas Wunner
@ 2025-09-15 15:46 ` Niklas Schnelle
0 siblings, 0 replies; 10+ messages in thread
From: Niklas Schnelle @ 2025-09-15 15:46 UTC (permalink / raw)
To: Lukas Wunner, Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Sathyanarayanan Kuppuswamy,
Linas Vepstas, Mahesh J Salgaonkar, Oliver OHalloran,
linuxppc-dev, linux-pci, linux-doc, Brian Norris
On Mon, 2025-09-15 at 15:50 +0200, Lukas Wunner wrote:
> Commit 11502feab423 ("Documentation: PCI: Tidy AER documentation")
> replaced the terms "PCI-E", "PCI-Express" and "PCI Express" with "PCIe"
> in the AER documentation.
>
> Do the same in the documentation on PCI error recovery. While at it,
> add a missing period and a missing blank.
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Brian Norris <briannorris@chromium.org>
> ---
> Documentation/PCI/pci-error-recovery.rst | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst
> index 9e1e2f2a13fa..5df481ac6193 100644
> --- a/Documentation/PCI/pci-error-recovery.rst
> +++ b/Documentation/PCI/pci-error-recovery.rst
> @@ -13,7 +13,7 @@ PCI Error Recovery
> Many PCI bus controllers are able to detect a variety of hardware
> PCI errors on the bus, such as parity errors on the data and address
> buses, as well as SERR and PERR errors. Some of the more advanced
> -chipsets are able to deal with these errors; these include PCI-E chipsets,
> +chipsets are able to deal with these errors; these include PCIe chipsets,
> and the PCI-host bridges found on IBM Power4, Power5 and Power6-based
> pSeries boxes. A typical action taken is to disconnect the affected device,
> halting all I/O to it. The goal of a disconnection is to avoid system
> @@ -206,7 +206,7 @@ reset or some such, but not restart operations. This callback is made if
> all drivers on a segment agree that they can try to recover and if no automatic
> link reset was performed by the HW. If the platform can't just re-enable IOs
> without a slot reset or a link reset, it will not call this callback, and
> -instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
> +instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset).
>
> .. note::
>
> @@ -259,14 +259,14 @@ The driver should return one of the following result codes:
>
> The next step taken depends on the results returned by the drivers.
> If all drivers returned PCI_ERS_RESULT_RECOVERED, then the platform
> -proceeds to either STEP3 (Link Reset) or to STEP 5 (Resume Operations).
> +proceeds to either STEP 3 (Link Reset) or to STEP 5 (Resume Operations).
>
> If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform
> proceeds to STEP 4 (Slot Reset)
>
> STEP 3: Link Reset
> ------------------
> -The platform resets the link. This is a PCI-Express specific step
> +The platform resets the link. This is a PCIe specific step
> and is done whenever a fatal error has been detected that can be
> "solved" by resetting the link.
>
> @@ -288,13 +288,13 @@ that is equivalent to what it would be after a fresh system
> power-on followed by power-on BIOS/system firmware initialization.
> Soft reset is also known as hot-reset.
>
> -Powerpc fundamental reset is supported by PCI Express cards only
> +Powerpc fundamental reset is supported by PCIe cards only
> and results in device's state machines, hardware logic, port states and
> configuration registers to initialize to their default conditions.
>
> For most PCI devices, a soft reset will be sufficient for recovery.
> Optional fundamental reset is provided to support a limited number
> -of PCI Express devices for which a soft reset is not sufficient
> +of PCIe devices for which a soft reset is not sufficient
> for recovery.
>
> If the platform supports PCI hotplug, then the reset might be
> @@ -338,7 +338,7 @@ Result codes:
> - PCI_ERS_RESULT_DISCONNECT
> Same as above.
>
> -Drivers for PCI Express cards that require a fundamental reset must
> +Drivers for PCIe cards that require a fundamental reset must
> set the needs_freset bit in the pci_dev structure in their probe function.
> For example, the QLogic qla2xxx driver sets the needs_freset bit for certain
> PCI card types::
Thanks for bringing this in sync.
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
* Re: [PATCH v2 0/4] Documentation: PCI: Update error recovery docs
2025-09-15 13:50 [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Lukas Wunner
` (3 preceding siblings ...)
2025-09-15 13:50 ` [PATCH v2 4/4] Documentation: PCI: Tidy error recovery doc's PCIe nomenclature Lukas Wunner
@ 2025-09-15 15:25 ` Sathyanarayanan Kuppuswamy
2025-09-16 15:55 ` Bjorn Helgaas
5 siblings, 0 replies; 10+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2025-09-15 15:25 UTC (permalink / raw)
To: Lukas Wunner, Bjorn Helgaas, Jonathan Corbet
Cc: Terry Bowman, Ilpo Jarvinen, Niklas Schnelle, Linas Vepstas,
Mahesh J Salgaonkar, Oliver OHalloran, linuxppc-dev, linux-pci,
linux-doc, Brian Norris
On 9/15/25 6:50 AM, Lukas Wunner wrote:
> Fix deviations between the code and the documentation on
> PCIe Advanced Error Reporting. Add minor clarifications
> and make a few small cleanups.
>
> Changes v1 -> v2:
> * In all patches, change subject prefix to "Documentation: PCI: ".
> * In patch [3/4], mention s390 alongside powerpc (Niklas).
Looks good to me.
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Link to v1:
> https://lore.kernel.org/all/cover.1756451884.git.lukas@wunner.de/
>
> Lukas Wunner (4):
> Documentation: PCI: Sync AER doc with code
> Documentation: PCI: Sync error recovery doc with code
> Documentation: PCI: Amend error recovery doc with DPC/AER specifics
> Documentation: PCI: Tidy error recovery doc's PCIe nomenclature
>
> Documentation/PCI/pci-error-recovery.rst | 43 ++++++++++---
> Documentation/PCI/pcieaer-howto.rst | 81 +++++++++++-------------
> 2 files changed, 72 insertions(+), 52 deletions(-)
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
* Re: [PATCH v2 0/4] Documentation: PCI: Update error recovery docs
2025-09-15 13:50 [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Lukas Wunner
` (4 preceding siblings ...)
2025-09-15 15:25 ` [PATCH v2 0/4] Documentation: PCI: Update error recovery docs Sathyanarayanan Kuppuswamy
@ 2025-09-16 15:55 ` Bjorn Helgaas
5 siblings, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2025-09-16 15:55 UTC (permalink / raw)
To: Lukas Wunner
Cc: Jonathan Corbet, Terry Bowman, Ilpo Jarvinen,
Sathyanarayanan Kuppuswamy, Niklas Schnelle, Linas Vepstas,
Mahesh J Salgaonkar, Oliver OHalloran, linuxppc-dev, linux-pci,
linux-doc, Brian Norris
On Mon, Sep 15, 2025 at 03:50:00PM +0200, Lukas Wunner wrote:
> Fix deviations between the code and the documentation on
> PCIe Advanced Error Reporting. Add minor clarifications
> and make a few small cleanups.
>
> Changes v1 -> v2:
> * In all patches, change subject prefix to "Documentation: PCI: ".
> * In patch [3/4], mention s390 alongside powerpc (Niklas).
>
> Link to v1:
> https://lore.kernel.org/all/cover.1756451884.git.lukas@wunner.de/
>
> Lukas Wunner (4):
> Documentation: PCI: Sync AER doc with code
> Documentation: PCI: Sync error recovery doc with code
> Documentation: PCI: Amend error recovery doc with DPC/AER specifics
> Documentation: PCI: Tidy error recovery doc's PCIe nomenclature
>
> Documentation/PCI/pci-error-recovery.rst | 43 ++++++++++---
> Documentation/PCI/pcieaer-howto.rst | 81 +++++++++++-------------
> 2 files changed, 72 insertions(+), 52 deletions(-)
Applied to pci/aer for v6.18, thanks, everybody!