From: Sam Bobroff <sbobroff@linux.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: fbarrat@linux.ibm.com
Subject: [PATCH 1/1] powerpc/eeh: fix deadlock handling dead PHB
Date: Fri, 7 Feb 2020 15:57:31 +1100 [thread overview]
Message-ID: <0547e82dbf90ee0729a2979a8cac5c91665c621f.1581051445.git.sbobroff@linux.ibm.com> (raw)
Recovering a dead PHB can currently cause a deadlock as the PCI
rescan/remove lock is taken twice.
This is caused as part of an existing bug in
eeh_handle_special_event(). The pe is processed while traversing the
PHBs even though the pe is unrelated to the loop. This causes the pe
to be, incorrectly, processed more than once.
Untangling this section can move the pe processing out of the loop and
also outside the locked section, correcting both problems.
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
I have only compile tested this fix, Frederic Barrat (who discovered it) has
offered to test it (thanks!).
arch/powerpc/kernel/eeh_driver.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 3dd1a422fc29..d6e75a8a14ce 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -1190,6 +1190,17 @@ void eeh_handle_special_event(void)
eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
eeh_handle_normal_event(pe);
} else {
+ eeh_for_each_pe(pe, tmp_pe)
+ eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev)
+ edev->mode &= ~EEH_DEV_NO_HANDLER;
+
+ /* Notify all devices to be down */
+ eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true);
+ eeh_set_channel_state(pe, pci_channel_io_perm_failure);
+ eeh_pe_report(
+ "error_detected(permanent failure)", pe,
+ eeh_report_failure, NULL);
+
pci_lock_rescan_remove();
list_for_each_entry(hose, &hose_list, list_node) {
phb_pe = eeh_phb_pe_get(hose);
@@ -1198,16 +1209,6 @@ void eeh_handle_special_event(void)
(phb_pe->state & EEH_PE_RECOVERING))
continue;
- eeh_for_each_pe(pe, tmp_pe)
- eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev)
- edev->mode &= ~EEH_DEV_NO_HANDLER;
-
- /* Notify all devices to be down */
- eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true);
- eeh_set_channel_state(pe, pci_channel_io_perm_failure);
- eeh_pe_report(
- "error_detected(permanent failure)", pe,
- eeh_report_failure, NULL);
bus = eeh_pe_bus_get(phb_pe);
if (!bus) {
pr_err("%s: Cannot find PCI bus for "
--
2.22.0.216.g00a2a96fc9
next reply other threads:[~2020-02-07 4:59 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-07 4:57 Sam Bobroff [this message]
2020-02-07 8:56 ` [PATCH 1/1] powerpc/eeh: fix deadlock handling dead PHB Frederic Barrat
2020-02-19 12:39 ` Michael Ellerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0547e82dbf90ee0729a2979a8cac5c91665c621f.1581051445.git.sbobroff@linux.ibm.com \
--to=sbobroff@linux.ibm.com \
--cc=fbarrat@linux.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).