linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Niklas Schnelle <schnelle@linux.ibm.com>,
	Lukas Wunner <lukas@wunner.de>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kuppuswamy Sathyanarayanan
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Mahesh Salgaonkar <mahesh@linux.ibm.com>,
	Sasha Levin <sashal@kernel.org>,
	linuxppc-dev@lists.ozlabs.org
Subject: [PATCH AUTOSEL 6.17-5.4] powerpc/eeh: Use result of error_detected() in uevent
Date: Sat, 25 Oct 2025 11:54:48 -0400	[thread overview]
Message-ID: <20251025160905.3857885-57-sashal@kernel.org> (raw)
In-Reply-To: <20251025160905.3857885-1-sashal@kernel.org>

From: Niklas Schnelle <schnelle@linux.ibm.com>

[ Upstream commit 704e5dd1c02371dfc7d22e1520102b197a3b628b ]

Ever since uevent support was added for AER and EEH with commit
856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume"), it
reported PCI_ERS_RESULT_NONE as uevent when recovery begins.

Commit 7b42d97e99d3 ("PCI/ERR: Always report current recovery status for
udev") subsequently amended AER to report the actual return value of
error_detected().

Make the same change to EEH to align it with AER and s390.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/linux-pci/aIp6LiKJor9KLVpv@wunner.de/
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Link: https://patch.msgid.link/20250807-add_err_uevents-v5-3-adf85b0620b0@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

Rationale
- Fixes incorrect uevent status at start of EEH recovery: the code
  currently emits a uevent with `PCI_ERS_RESULT_NONE` regardless of what
  the driver reported via `error_detected()`. This misrepresents the
  actual recovery status to user space.
- The fix makes EEH behave like AER (already fixed by commit
  7b42d97e99d3) and s390, improving cross-arch consistency and user
  space expectations.

Evidence in code
- Current EEH behavior: emits BEGIN_RECOVERY unconditionally at error
  detection
  - `pci_uevent_ers(pdev, PCI_ERS_RESULT_NONE);` is called after
    `error_detected()` even if the driver “votes” differently (e.g.,
    DISCONNECT/NEED_RESET): arch/powerpc/kernel/eeh_driver.c:337
- Proposed change: pass actual driver result
  - Changes the above call to `pci_uevent_ers(pdev, rc);`, where `rc` is
    the result of `driver->err_handler->error_detected()` captured just
    above: arch/powerpc/kernel/eeh_driver.c:337
- uevent mapping semantics (what user space sees) are centralized in
  `pci_uevent_ers()`:
  - NONE/CAN_RECOVER -> `ERROR_EVENT=BEGIN_RECOVERY`, `DEVICE_ONLINE=0`
  - RECOVERED -> `ERROR_EVENT=SUCCESSFUL_RECOVERY`, `DEVICE_ONLINE=1`
  - DISCONNECT -> `ERROR_EVENT=FAILED_RECOVERY`, `DEVICE_ONLINE=0`
  - Others (e.g., NEED_RESET) -> no immediate uevent (consistent with
    AER)
  - drivers/pci/pci-driver.c:1595
- AER already reports actual `error_detected()` return value to udev:
  - `pci_uevent_ers(dev, vote);` after computing `vote` in
    `report_error_detected()`: drivers/pci/pcie/err.c:83
- EEH already emits final-stage uevents correctly (unchanged by this
  patch):
  - Success at resume: `pci_uevent_ers(edev->pdev,
    PCI_ERS_RESULT_RECOVERED);` arch/powerpc/kernel/eeh_driver.c:432
  - Failure path: `pci_uevent_ers(pdev, PCI_ERS_RESULT_DISCONNECT);`
    arch/powerpc/kernel/eeh_driver.c:462

Why this is a bugfix suitable for stable
- User-visible correctness: With the current code, user space always
  sees “BEGIN_RECOVERY” even when drivers have already indicated an
  unrecoverable state (e.g., DISCONNECT). The patch ensures uevents
  reflect the true state immediately, matching AER behavior introduced
  by 7b42d97e99d3.
- Minimal, contained change: One-line change in a single architecture-
  specific file (PowerPC EEH). No API/ABI changes; only corrects the
  parameter passed to an existing helper.
- No architectural change: Keeps existing EEH flow; only adjusts the
  uevent status emitted at a single step.
- Low regression risk: AER has used this semantic for years;
  `pci_uevent_ers()` already handles `rc` values. EEH already emits
  RECOVERED/DISCONNECT at later stages; this makes the initial event
  consistent.
- Aligns cross-arch semantics: Consistent uevent reporting across AER,
  EEH, and s390 reduces user space special-casing and potential errors.

Potential side effects and why acceptable
- For drivers returning `PCI_ERS_RESULT_DISCONNECT` at
  `error_detected()`, user space will now see `FAILED_RECOVERY`
  immediately instead of a misleading `BEGIN_RECOVERY`. This is a
  correctness fix.
- For returns like `PCI_ERS_RESULT_NEED_RESET`, no initial uevent is
  emitted (consistent with AER); user space will still receive final
  RECOVERED/FAILED, as today. Any scripts that strictly expected an
  initial BEGIN_RECOVERY for all cases are already inconsistent with AER
  and should not rely on that behavior.

Historical context
- Uevent support was added by 856e1eb9bdd4 (“PCI/AER: Add uevents in AER
  and EEH error/resume”), initially emitting `NONE` at error detection
  for both AER and EEH.
- AER was corrected by 7b42d97e99d3 (“PCI/ERR: Always report current
  recovery status for udev”) to emit the actual `error_detected()`
  result.
- This patch brings EEH to parity with that established AER behavior.

Conclusion
- This is a targeted, low-risk correctness fix that improves user space
  observability and cross-arch consistency without changing kernel-side
  recovery logic. It fits stable backport rules (important bugfix,
  minimal change, low regression risk, confined to a subsystem).

 arch/powerpc/kernel/eeh_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 48ad0116f3590..ef78ff77cf8f2 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -334,7 +334,7 @@ static enum pci_ers_result eeh_report_error(struct eeh_dev *edev,
 	rc = driver->err_handler->error_detected(pdev, pci_channel_io_frozen);
 
 	edev->in_error = true;
-	pci_uevent_ers(pdev, PCI_ERS_RESULT_NONE);
+	pci_uevent_ers(pdev, rc);
 	return rc;
 }
 
-- 
2.51.0



  parent reply	other threads:[~2025-10-25 16:12 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20251025160905.3857885-1-sashal@kernel.org>
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17-6.12] PCI/ERR: Update device error_state already after reset Sasha Levin
2025-10-25 15:54 ` Sasha Levin [this message]
2025-10-25 16:00 ` [PATCH AUTOSEL 6.17] PCI/AER: Fix NULL pointer access by aer_info Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251025160905.3857885-57-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lukas@wunner.de \
    --cc=mahesh@linux.ibm.com \
    --cc=patches@lists.linux.dev \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=schnelle@linux.ibm.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).