linuxppc-dev.lists.ozlabs.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 10/10] net/tg3: Avoid delay during MMIO access
Date: Tue, 25 Jun 2013 16:15:48 +1000
Message-ID: <1372140948.3944.193.camel@pasglop>
In-Reply-To: <1372139717-14885-11-git-send-email-shangw@linux.vnet.ibm.com>

On Tue, 2013-06-25 at 13:55 +0800, Gavin Shan wrote:
> When the driver encounters EEH errors, which might be caused
> by a frozen PCI host controller, the driver needn't keep reading
> MMIO until timeout. In that case, 0xFF's should be returned from
> hardware. Otherwise, it can trigger a soft-lockup. The patch
> adds more checks using pci_channel_offline() to avoid
> the possible soft-lockup.

Can you resend this patch "standalone" (not as part of a series)
to the maintainer/author of this driver and CC the netdev list on
vger.kernel.org?

For the CC list, check the author of the original EEH support.

Also, maybe improve the explanation above with something like:

"When the EEH error is the result of a fenced host bridge, MMIO
accesses can be very slow (milliseconds) to time out and return all 1's,
thus causing the driver's various timeout loops to take way too long and
trigger soft-lockup warnings (in addition to taking minutes to recover).

It might be worthwhile to check whether, for any of these cases, ffffffff
is a valid possible value, and if not, bail early since that means the HW
is either gone or isolated.

In the meantime, checking that the PCI channel is offline will
work around the problem".

Or something like that...
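
To make the "bail early on ffffffff" idea concrete, here is a minimal
user-space sketch of such a polling loop. It is not tg3 code: struct
fake_dev, fake_read32() and poll_for_bit() are hypothetical stand-ins
(the "offline" flag plays the role of pci_channel_offline()), and it
assumes all 1's is never a valid value for the register being polled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none of them are tg3 or PCI core APIs. */
#define REG_ALL_ONES 0xFFFFFFFFu

enum poll_result {
	POLL_DONE = 0,		/* the awaited bit was seen */
	POLL_TIMEOUT = -1,	/* the loop ran out of iterations */
	POLL_DEVICE_GONE = -2	/* all 1's read back, or channel offline */
};

struct fake_dev {
	uint32_t reg_value;	/* value every register read returns */
	bool offline;		/* stand-in for pci_channel_offline() */
	unsigned int reads;	/* number of register reads performed */
};

/* Stand-in for an MMIO read; a fenced bridge would return all 1's. */
static uint32_t fake_read32(struct fake_dev *dev)
{
	dev->reads++;
	return dev->reg_value;
}

/* Poll until done_bit is set, giving up early if the device looks gone. */
static enum poll_result poll_for_bit(struct fake_dev *dev, uint32_t done_bit,
				     unsigned int max_iters)
{
	unsigned int i;

	assert(dev != NULL);

	for (i = 0; i < max_iters; i++) {
		uint32_t val = fake_read32(dev);

		/*
		 * Test for all 1's before testing done_bit: all 1's has
		 * every bit set and would otherwise look like success.
		 */
		if (val == REG_ALL_ONES)
			return POLL_DEVICE_GONE;
		if (dev->offline)
			return POLL_DEVICE_GONE;
		if (val & done_bit)
			return POLL_DONE;
		/* A real driver would udelay() here before retrying. */
	}

	return POLL_TIMEOUT;
}
```

With a device that reads back all 1's, poll_for_bit() returns after a
single read instead of burning through every iteration, which is the
behaviour the suggested commit message describes. Whether that test is
safe has to be decided per register, as noted above.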

> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  drivers/net/ethernet/broadcom/tg3.c |   36 +++++++++++++++++++++++++++++++++++
>  1 files changed, 36 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index c777b90..a13463e 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -744,6 +744,9 @@ static int tg3_ape_lock(struct tg3 *tp, int locknum)
>  		status = tg3_ape_read32(tp, gnt + off);
>  		if (status == bit)
>  			break;
> +		if (pci_channel_offline(tp->pdev))
> +			break;
> +
>  		udelay(10);
>  	}
>  
> @@ -1635,6 +1638,9 @@ static void tg3_wait_for_event_ack(struct tg3 *tp)
>  	for (i = 0; i < delay_cnt; i++) {
>  		if (!(tr32(GRC_RX_CPU_EVENT) & GRC_RX_CPU_DRIVER_EVENT))
>  			break;
> +		if (pci_channel_offline(tp->pdev))
> +			break;
> +
>  		udelay(8);
>  	}
>  }
> @@ -1813,6 +1819,9 @@ static int tg3_poll_fw(struct tg3 *tp)
>  		for (i = 0; i < 200; i++) {
>  			if (tr32(VCPU_STATUS) & VCPU_STATUS_INIT_DONE)
>  				return 0;
> +			if (pci_channel_offline(tp->pdev))
> +				return -ENODEV;
> +
>  			udelay(100);
>  		}
>  		return -ENODEV;
> @@ -1823,6 +1832,15 @@ static int tg3_poll_fw(struct tg3 *tp)
>  		tg3_read_mem(tp, NIC_SRAM_FIRMWARE_MBOX, &val);
>  		if (val == ~NIC_SRAM_FIRMWARE_MBOX_MAGIC1)
>  			break;
> +		if (pci_channel_offline(tp->pdev)) {
> +			if (!tg3_flag(tp, NO_FWARE_REPORTED)) {
> +				tg3_flag_set(tp, NO_FWARE_REPORTED);
> +				netdev_info(tp->dev, "No firmware running\n");
> +			}
> +
> +			break;
> +		}
> +
>  		udelay(10);
>  	}
>  
> @@ -3520,6 +3538,8 @@ static int tg3_pause_cpu(struct tg3 *tp, u32 cpu_base)
>  		tw32(cpu_base + CPU_MODE,  CPU_MODE_HALT);
>  		if (tr32(cpu_base + CPU_MODE) & CPU_MODE_HALT)
>  			break;
> +		if (pci_channel_offline(tp->pdev))
> +			return -EBUSY;
>  	}
>  
>  	return (i == iters) ? -EBUSY : 0;
> @@ -8589,6 +8609,14 @@ static int tg3_stop_block(struct tg3 *tp, unsigned long ofs, u32 enable_bit, boo
>  	tw32_f(ofs, val);
>  
>  	for (i = 0; i < MAX_WAIT_CNT; i++) {
> +		if (pci_channel_offline(tp->pdev)) {
> +			dev_err(&tp->pdev->dev,
> +				"tg3_stop_block device offline, "
> +				"ofs=%lx enable_bit=%x\n",
> +				ofs, enable_bit);
> +			return -ENODEV;
> +		}
> +
>  		udelay(100);
>  		val = tr32(ofs);
>  		if ((val & enable_bit) == 0)
> @@ -8612,6 +8640,13 @@ static int tg3_abort_hw(struct tg3 *tp, bool silent)
>  
>  	tg3_disable_ints(tp);
>  
> +	if (pci_channel_offline(tp->pdev)) {
> +		tp->rx_mode &= ~(RX_MODE_ENABLE | TX_MODE_ENABLE);
> +		tp->mac_mode &= ~MAC_MODE_TDE_ENABLE;
> +		err = -ENODEV;
> +		goto err_no_dev;
> +	}
> +
>  	tp->rx_mode &= ~RX_MODE_ENABLE;
>  	tw32_f(MAC_RX_MODE, tp->rx_mode);
>  	udelay(10);
> @@ -8660,6 +8695,7 @@ static int tg3_abort_hw(struct tg3 *tp, bool silent)
>  	err |= tg3_stop_block(tp, BUFMGR_MODE, BUFMGR_MODE_ENABLE, silent);
>  	err |= tg3_stop_block(tp, MEMARB_MODE, MEMARB_MODE_ENABLE, silent);
>  
> +err_no_dev:
>  	for (i = 0; i < tp->irq_cnt; i++) {
>  		struct tg3_napi *tnapi = &tp->napi[i];
>  		if (tnapi->hw_status)

