From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>, stable@vger.kernel.org
Subject: [PATCH 10/21] powerpc/eeh: Clear frozen device state in time
Date: Tue, 30 Sep 2014 12:38:59 +1000
Message-ID: <1412044750-24460-10-git-send-email-gwshan@linux.vnet.ibm.com>
In-Reply-To: <1412044750-24460-1-git-send-email-gwshan@linux.vnet.ibm.com>

The problem was reported by Carol: when an mlx4 adapter is passed
through to a guest, EEH errors can be recovered successfully. However,
after the device is returned to the host, the driver (mlx4_core.ko)
fails to load because mlx4_get_ownership() returns -5 (-EIO) when it
hits a PCI device that is still marked offline. The root cause is
that we fail to put the affected devices back into normal state when
the PE isolated state is cleared right after PE reset.

This patch fixes the issue by putting the affected devices back into
normal state when the PE isolated state is cleared in
eeh_pe_state_clear().

Cc: stable@vger.kernel.org
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_pe.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)
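
For illustration only, the failure and the fix described in the commit
message can be modelled in user space. This is a minimal sketch, not
kernel code: every model_* type, constant and function below is a
hypothetical stand-in (for struct eeh_pe, struct pci_dev, the
EEH_PE_* flags and the driver's ownership check), and the device list
is simplified to a singly linked list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI channel states. */
enum model_channel_state {
	MODEL_IO_NORMAL = 1,
	MODEL_IO_FROZEN,
};

/* Hypothetical stand-ins for EEH_PE_ISOLATED / EEH_PE_RECOVERING. */
#define MODEL_PE_ISOLATED	(1 << 0)
#define MODEL_PE_RECOVERING	(1 << 1)

struct model_dev {
	enum model_channel_state error_state;
	struct model_dev *next;		/* next device in the same PE */
};

struct model_pe {
	int state;
	int check_count;
	struct model_dev *devs;		/* simplified device list */
};

/* Driver-style guard: refuse a device that is not in normal state. */
int model_get_ownership(const struct model_dev *dev)
{
	assert(dev != NULL);
	return dev->error_state == MODEL_IO_NORMAL ? 0 : -EIO;
}

/* Mirrors the patched logic: clearing ISOLATED also revives devices. */
void model_pe_state_clear(struct model_pe *pe, int state)
{
	struct model_dev *dev;

	assert(pe != NULL);
	pe->state &= ~state;

	/* Only clearing the isolated state needs the extra work. */
	if (!(state & MODEL_PE_ISOLATED))
		return;

	pe->check_count = 0;
	for (dev = pe->devs; dev; dev = dev->next)
		dev->error_state = MODEL_IO_NORMAL;
}
```

In this model, a device left in MODEL_IO_FROZEN keeps failing
model_get_ownership() with -EIO until the isolated state is cleared
through model_pe_state_clear(), which is the behaviour the patch adds.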

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 00e3844..eef08f0 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -584,6 +584,8 @@ static void *__eeh_pe_state_clear(void *data, void *flag)
 {
 	struct eeh_pe *pe = (struct eeh_pe *)data;
 	int state = *((int *)flag);
+	struct eeh_dev *edev, *tmp;
+	struct pci_dev *pdev;
 
 	/* Keep the state of permanently removed PE intact */
 	if ((pe->freeze_count > EEH_MAX_ALLOWED_FREEZES) &&
@@ -592,9 +594,22 @@ static void *__eeh_pe_state_clear(void *data, void *flag)
 
 	pe->state &= ~state;
 
-	/* Clear check count since last isolation */
-	if (state & EEH_PE_ISOLATED)
-		pe->check_count = 0;
+	/*
+	 * Special treatment on clearing isolated state. Clear
+	 * check count since last isolation and put all affected
+	 * devices to normal state.
+	 */
+	if (!(state & EEH_PE_ISOLATED))
+		return NULL;
+
+	pe->check_count = 0;
+	eeh_pe_for_each_dev(pe, edev, tmp) {
+		pdev = eeh_dev_to_pci_dev(edev);
+		if (!pdev)
+			continue;
+
+		pdev->error_state = pci_channel_io_normal;
+	}
 
 	return NULL;
 }
-- 
1.8.3.2
