linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: benh@kernel.crashing.org
Cc: linuxppc-dev@lists.ozlabs.org, Gavin Shan <gwshan@linux.vnet.ibm.com>
Subject: [PATCH 08/25] powerpc/eeh: Avoid I/O access during PE reset
Date: Thu, 24 Apr 2014 18:00:14 +1000	[thread overview]
Message-ID: <1398326431-24305-9-git-send-email-gwshan@linux.vnet.ibm.com> (raw)
In-Reply-To: <1398326431-24305-1-git-send-email-gwshan@linux.vnet.ibm.com>

We have suffered recrusive frozen PE a lot, which was caused
by IO accesses during the PE reset. Ben came up with the good
idea to keep frozen PE until recovery (BAR restore) gets done.
With that, IO accesses during PE reset are dropped by hardware
and wouldn't incur the recrusive frozen PE any more.

The patch implements the idea. We don't clear the frozen state
until PE reset is done completely. During the period, the EEH
core expects unfrozen state from backend to keep going. So we
have to reuse EEH_PE_RESET flag, which has been set during PE
reset, to return normal state from backend. The side effect is
we have to clear frozen state for towice (PE reset and clear it
explicitly), but that's harmless.

We have some limitations on pHyp. pHyp doesn't allow to enable
IO or DMA for unfrozen PE. So we don't enable them on unfrozen PE
in eeh_pci_enable(). We have to enable IO before grabbing logs on
pHyp. Otherwise, 0xFF's is always returned from PCI config space.
Also, we had wrong return value from eeh_pci_enable() for
EEH_OPT_THAW_DMA case. The patch fixes it too.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh.c                 | 50 ++++++++++++++----
 arch/powerpc/kernel/eeh_driver.c          | 35 +++++++++++++
 arch/powerpc/platforms/powernv/eeh-ioda.c | 84 +++++++++----------------------
 3 files changed, 99 insertions(+), 70 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index cc728e8..25f3753 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -238,9 +238,13 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
 	 * the data from PCI config space because it should return
 	 * 0xFF's. For ER, we still retrieve the data from the PCI
 	 * config space.
+	 *
+	 * For pHyp, we have to enable IO for log retrieval. Otherwise,
+	 * 0xFF's is always returned from PCI config space.
 	 */
 	if (!(pe->type & EEH_PE_PHB)) {
-		eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
+		if (eeh_probe_mode_devtree())
+			eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
 		eeh_ops->configure_bridge(pe);
 		eeh_pe_restore_bars(pe);
 
@@ -509,16 +513,42 @@ EXPORT_SYMBOL(eeh_check_failure);
  */
 int eeh_pci_enable(struct eeh_pe *pe, int function)
 {
-	int rc;
+	int rc, flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
+
+	/*
+	 * pHyp doesn't allow to enable IO or DMA on unfrozen PE.
+	 * Also, it's pointless to enable them on unfrozen PE. So
+	 * we have the check here.
+	 */
+	if (function == EEH_OPT_THAW_MMIO ||
+	    function == EEH_OPT_THAW_DMA) {
+		rc = eeh_ops->get_state(pe, NULL);
+		if (rc < 0)
+			return rc;
+
+		/* Needn't to enable or already enabled */
+		if ((rc == EEH_STATE_NOT_SUPPORT) ||
+		    ((rc & flags) == flags))
+			return 0;
+	}
 
 	rc = eeh_ops->set_option(pe, function);
 	if (rc)
-		pr_warning("%s: Unexpected state change %d on PHB#%d-PE#%x, err=%d\n",
-			__func__, function, pe->phb->global_number, pe->addr, rc);
+		pr_warn("%s: Unexpected state change %d on "
+			"PHB#%d-PE#%x, err=%d\n",
+			__func__, function, pe->phb->global_number,
+			pe->addr, rc);
 
 	rc = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
-	if (rc > 0 && (rc & EEH_STATE_MMIO_ENABLED) &&
-	   (function == EEH_OPT_THAW_MMIO))
+	if (rc <= 0)
+		return rc;
+
+	if ((function == EEH_OPT_THAW_MMIO) &&
+	    (rc & EEH_STATE_MMIO_ENABLED))
+		return 0;
+
+	if ((function == EEH_OPT_THAW_DMA) &&
+	    (rc & EEH_STATE_DMA_ENABLED))
 		return 0;
 
 	return rc;
@@ -639,11 +669,13 @@ int eeh_reset_pe(struct eeh_pe *pe)
 	for (i=0; i<3; i++) {
 		eeh_reset_pe_once(pe);
 
+		/*
+		 * EEH_PE_ISOLATED is expected to be removed after
+		 * BAR restore.
+		 */
 		rc = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
-		if ((rc & flags) == flags) {
-			eeh_pe_state_clear(pe, EEH_PE_ISOLATED);
+		if ((rc & flags) == flags)
 			return 0;
-		}
 
 		if (rc < 0) {
 			pr_err("%s: Unrecoverable slot failure on PHB#%d-PE#%x",
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 6d91b51..1f1e2cc 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -417,6 +417,36 @@ static void *eeh_pe_detach_dev(void *data, void *userdata)
 	return NULL;
 }
 
+/*
+ * Explicitly clear PE's frozen state for PowerNV where
+ * we have frozen PE until BAR restore is completed. It's
+ * harmless to clear it for pSeries. To be consistent with
+ * PE reset (for 3 times), we try to clear the frozen state
+ * for 3 times as well.
+ */
+static int eeh_clear_pe_frozen_state(struct eeh_pe *pe)
+{
+	int i, rc;
+
+	for (i = 0; i < 3; i++) {
+		rc = eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
+		if (rc)
+			continue;
+		rc = eeh_pci_enable(pe, EEH_OPT_THAW_DMA);
+		if (!rc)
+			break;
+	}
+
+	/* The PE has been isolated, clear it */
+	if (rc)
+		pr_warn("%s: Can't clear frozen PHB#%x-PE#%x (%d)\n",
+			__func__, pe->phb->global_number, pe->addr, rc);
+	else
+		eeh_pe_state_clear(pe, EEH_PE_ISOLATED);
+
+	return rc;
+}
+
 /**
  * eeh_reset_device - Perform actual reset of a pci slot
  * @pe: EEH PE
@@ -474,6 +504,11 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	eeh_pe_restore_bars(pe);
 	eeh_pe_state_clear(pe, EEH_PE_RESET);
 
+	/* Clear frozen state */
+	rc = eeh_clear_pe_frozen_state(pe);
+	if (rc)
+		return rc;
+
 	/* Give the system 5 seconds to finish running the user-space
 	 * hotplug shutdown scripts, e.g. ifdown for ethernet.  Yes,
 	 * this is a hack, but if we don't do this, and try to bring
diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
index ed6c686..6bdae8d 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -268,6 +268,21 @@ static int ioda_eeh_get_state(struct eeh_pe *pe)
 		return EEH_STATE_NOT_SUPPORT;
 	}
 
+	/*
+	 * If we're in middle of PE reset, return normal
+	 * state to keep EEH core going. For PHB reset, we
+	 * still expect to have fenced PHB cleared with
+	 * PHB reset.
+	 */
+	if (!(pe->type & EEH_PE_PHB) &&
+	    (pe->state & EEH_PE_RESET)) {
+		result = (EEH_STATE_MMIO_ACTIVE |
+			  EEH_STATE_DMA_ACTIVE |
+			  EEH_STATE_MMIO_ENABLED |
+			  EEH_STATE_DMA_ENABLED);
+		return result;
+	}
+
 	/* Retrieve PE status through OPAL */
 	pe_no = pe->addr;
 	ret = opal_pci_eeh_freeze_status(phb->opal_id, pe_no,
@@ -347,52 +362,6 @@ static int ioda_eeh_get_state(struct eeh_pe *pe)
 	return result;
 }
 
-static int ioda_eeh_pe_clear(struct eeh_pe *pe)
-{
-	struct pci_controller *hose;
-	struct pnv_phb *phb;
-	u32 pe_no;
-	u8 fstate;
-	u16 pcierr;
-	s64 ret;
-
-	pe_no = pe->addr;
-	hose = pe->phb;
-	phb = pe->phb->private_data;
-
-	/* Clear the EEH error on the PE */
-	ret = opal_pci_eeh_freeze_clear(phb->opal_id,
-			pe_no, OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
-	if (ret) {
-		pr_err("%s: Failed to clear EEH error for "
-		       "PHB#%x-PE#%x, err=%lld\n",
-		       __func__, hose->global_number, pe_no, ret);
-		return -EIO;
-	}
-
-	/*
-	 * Read the PE state back and verify that the frozen
-	 * state has been removed.
-	 */
-	ret = opal_pci_eeh_freeze_status(phb->opal_id, pe_no,
-			&fstate, &pcierr, NULL);
-	if (ret) {
-		pr_err("%s: Failed to get EEH status on "
-		       "PHB#%x-PE#%x\n, err=%lld\n",
-		       __func__, hose->global_number, pe_no, ret);
-		return -EIO;
-	}
-
-	if (fstate != OPAL_EEH_STOPPED_NOT_FROZEN) {
-		pr_err("%s: Frozen state not cleared on "
-		       "PHB#%x-PE#%x, sts=%x\n",
-		       __func__, hose->global_number, pe_no, fstate);
-		return -EIO;
-	}
-
-	return 0;
-}
-
 static s64 ioda_eeh_phb_poll(struct pnv_phb *phb)
 {
 	s64 rc = OPAL_HARDWARE;
@@ -524,27 +493,20 @@ static int ioda_eeh_reset(struct eeh_pe *pe, int option)
 	int ret;
 
 	/*
-	 * Anyway, we have to clear the problematic state for the
-	 * corresponding PE. However, we needn't do it if the PE
-	 * is PHB associated. That means the PHB is having fatal
-	 * errors and it needs reset. Further more, the AIB interface
-	 * isn't reliable any more.
-	 */
-	if (!(pe->type & EEH_PE_PHB) &&
-	    (option == EEH_RESET_HOT ||
-	    option == EEH_RESET_FUNDAMENTAL)) {
-		ret = ioda_eeh_pe_clear(pe);
-		if (ret)
-			return -EIO;
-	}
-
-	/*
 	 * The rules applied to reset, either fundamental or hot reset:
 	 *
 	 * We always reset the direct upstream bridge of the PE. If the
 	 * direct upstream bridge isn't root bridge, we always take hot
 	 * reset no matter what option (fundamental or hot) is. Otherwise,
 	 * we should do the reset according to the required option.
+	 *
+	 * Here, we have different design to pHyp, which always clear the
+	 * frozen state during PE reset. However, the good idea here from
+	 * benh is to keep frozen state before we get PE reset done completely
+	 * (until BAR restore). With the frozen state, HW drops illegal IO
+	 * or MMIO access, which can incur recrusive frozen PE during PE
+	 * reset. The side effect is that EEH core has to clear the frozen
+	 * state explicitly after BAR restore.
 	 */
 	if (pe->type & EEH_PE_PHB) {
 		ret = ioda_eeh_phb_reset(hose, option);
-- 
1.8.3.2

  parent reply	other threads:[~2014-04-24  8:00 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-24  8:00 [PATCH 00/25] EEH Enhancement and bug fixes Gavin Shan
2014-04-24  8:00 ` [PATCH 01/25] powerpc/eeh: Remove EEH_PE_PHB_DEAD Gavin Shan
2014-04-24  8:00 ` [PATCH 02/25] powerpc/powernv: Remove PNV_EEH_STATE_REMOVED Gavin Shan
2014-04-24  8:00 ` [PATCH 03/25] powerpc/powernv: Move PNV_EEH_STATE_ENABLED around Gavin Shan
2014-04-24  8:00 ` [PATCH 04/25] powerpc/powernv: Remove fields in PHB diag-data dump Gavin Shan
2014-04-24  8:00 ` [PATCH 05/25] powerpc/eeh: EEH_PE_ISOLATED not reflect HW state Gavin Shan
2014-04-24  8:00 ` [PATCH 06/25] powerpc/eeh: Block PCI-CFG access during PE reset Gavin Shan
2014-04-24  8:00 ` [PATCH 07/25] powerpc/powernv: Use EEH PCI config accessors Gavin Shan
2014-04-24  8:00 ` Gavin Shan [this message]
2014-04-24  8:00 ` [PATCH 09/25] powerpc/eeh: Cleanup eeh_gather_pci_data() Gavin Shan
2014-04-24  8:00 ` [PATCH 10/25] powerpc/eeh: Use cached capability for log dump Gavin Shan
2014-04-24  8:00 ` [PATCH 11/25] powerpc/eeh: Cleanup EEH subsystem variables Gavin Shan
2014-04-24  8:00 ` [PATCH 12/25] powerpc/eeh: Allow to disable EEH Gavin Shan
2014-04-24  8:00 ` [PATCH 13/25] powerpc/eeh: No hotplug on permanently removed dev Gavin Shan
2014-04-24  8:00 ` [PATCH 14/25] powerpc/powernv: Fix endless reporting frozen PE Gavin Shan
2014-04-24  8:00 ` [PATCH 15/25] powerpc/pseries: Fix overwritten PE state Gavin Shan
2014-04-24  8:00 ` [PATCH 16/25] powerpc/powernv: Reset root port in firmware Gavin Shan
2014-04-24  8:00 ` [PATCH 17/25] powerpc/eeh: Make the delay for PE reset unified Gavin Shan
2014-04-24  8:00 ` [PATCH 18/25] powerpc/pci: Mask linkDown on resetting PCI bus Gavin Shan
2014-04-24  8:00 ` [PATCH 19/25] powrpc/powernv: Reset PHB in kdump kernel Gavin Shan
2014-04-24  8:00 ` [PATCH 20/25] powerpc/eeh: Can't recover from non-PE-reset case Gavin Shan
2014-04-24  8:00 ` [PATCH 21/25] powerpc/powernv: Fundamental reset on PLX ports Gavin Shan
2014-04-24  8:00 ` [PATCH 22/25] powerpc/powernv: Missed IOMMU table type Gavin Shan
2014-04-24  8:00 ` [PATCH 23/25] powerpc/powernv: pci_domain_nr() not reliable Gavin Shan
2014-04-24  8:00 ` [PATCH 24/25] PCI: Fix return value from pci_user_{read, write}_config_*() Gavin Shan
2014-04-24  8:00 ` [PATCH 25/25] powerpc/prom: Stop scanning dev-tree for fdump early Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1398326431-24305-9-git-send-email-gwshan@linux.vnet.ibm.com \
    --to=gwshan@linux.vnet.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).