netdev.vger.kernel.org archive mirror
From: Ben Hutchings <bhutchings@solarflare.com>
To: David Miller <davem@davemloft.net>
Cc: <netdev@vger.kernel.org>, <linux-net-drivers@solarflare.com>
Subject: [PATCH net-next 10/32] sfc: Make handling of MC reboot more reliable
Date: Fri, 27 Jan 2012 20:45:07 +0000
Message-ID: <1327697107.2503.17.camel@bwh-desktop>
In-Reply-To: <1327696858.2503.7.camel@bwh-desktop>

When the MC reboots, either as part of a firmware upgrade or due to a
bug, it attempts to complete (with an error) any requests that were
outstanding before the reboot.  Since there is an inherent race
condition in checking this, it will also write to a status word in
shared memory.

If we handle these two signals separately, we may detect the same
reboot twice, causing a spurious command failure after a firmware
upgrade or hindering recovery from a firmware bug.  Instead, when a
request completion indicates a reboot, we must also poll and clear
the status word.

This bug was previously masked by use of an incorrect address for the
status word.  Fix that, using the definition now included in
mcdi_pcol.h.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/mcdi.c |   33 +++++++++++++++++++++++++++------
 1 files changed, 27 insertions(+), 6 deletions(-)
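
Illustration only, not part of the patch: a minimal userspace sketch of
the double-detection problem described in the commit message, assuming
a simplified model of the MC.  The three MCDI_STATUS_* constants are
copied from the diff below; struct fake_mc, fake_mc_reboot(),
fake_poll_reboot() and count_reboot_detections() are hypothetical names
invented for this sketch and do not exist in the driver.  The model
assumes the status word becomes visible a few polls after the failed
completion, which is the race the bounded retry loop is meant to cover.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the patch: 100 polls spaced 100us apart. */
#define MCDI_STATUS_DELAY_US		100
#define MCDI_STATUS_DELAY_COUNT		100
#define MCDI_STATUS_SLEEP_MS						\
	(MCDI_STATUS_DELAY_US * MCDI_STATUS_DELAY_COUNT / 1000)

/* Hypothetical stand-in for the MC: one reboot raises two signals. */
struct fake_mc {
	bool completion_error;	/* outstanding request completed with an error */
	bool status_pending;	/* status word written but not yet consumed */
	int polls_until_status;	/* polls before the status word is visible */
};

/* Simulate one reboot whose status word lags by delay_polls polls. */
static void fake_mc_reboot(struct fake_mc *mc, int delay_polls)
{
	mc->completion_error = true;
	mc->status_pending = true;
	mc->polls_until_status = delay_polls;
}

/* Test and clear the status word; true if a reboot was recorded. */
static bool fake_poll_reboot(struct fake_mc *mc)
{
	if (!mc->status_pending)
		return false;
	if (mc->polls_until_status > 0) {
		--mc->polls_until_status;
		return false;
	}
	mc->status_pending = false;
	return true;
}

/* Count how many times a single reboot is reported to the caller. */
static int count_reboot_detections(bool consume_status_on_completion)
{
	struct fake_mc mc = { false, false, 0 };
	int detections = 0;
	int count;

	fake_mc_reboot(&mc, 3);

	/* Signal 1: the request completion reports the reboot. */
	if (mc.completion_error) {
		mc.completion_error = false;
		++detections;

		if (consume_status_on_completion) {
			/* Bounded wait for the status word, then discard it.
			 * Real code would delay MCDI_STATUS_DELAY_US here. */
			for (count = 0; count < MCDI_STATUS_DELAY_COUNT; ++count)
				if (fake_poll_reboot(&mc))
					break;
		}
	}

	/* Signal 2: a later, independent poll of the status word. */
	for (count = 0; count < MCDI_STATUS_DELAY_COUNT; ++count) {
		if (fake_poll_reboot(&mc)) {
			++detections;
			break;
		}
	}

	return detections;
}
```

In this model, count_reboot_detections(false) reports the single reboot
twice, while count_reboot_detections(true) reports it once, and
MCDI_STATUS_SLEEP_MS evaluates to 10, matching the "wait 10ms" comment
in the patch.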

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index 7c405d1..f16145d 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -27,8 +27,6 @@
 #define CMD_NOTIFY_PORT1 4
 #define CMD_PDU_PORT0    0x008
 #define CMD_PDU_PORT1    0x108
-#define REBOOT_FLAG_PORT0 0x3f8
-#define REBOOT_FLAG_PORT1 0x3fc
 
 #define MCDI_RPC_TIMEOUT       10 /*seconds */
 
@@ -36,8 +34,16 @@
 	(efx_port_num(efx) ? CMD_PDU_PORT1 : CMD_PDU_PORT0)
 #define MCDI_DOORBELL(efx)						\
 	(efx_port_num(efx) ? CMD_NOTIFY_PORT1 : CMD_NOTIFY_PORT0)
-#define MCDI_REBOOT_FLAG(efx)						\
-	(efx_port_num(efx) ? REBOOT_FLAG_PORT1 : REBOOT_FLAG_PORT0)
+#define MCDI_STATUS(efx)						\
+	(efx_port_num(efx) ? MC_SMEM_P1_STATUS_OFST : MC_SMEM_P0_STATUS_OFST)
+
+/* A reboot/assertion causes the MCDI status word to be set after the
+ * command word is set or a REBOOT event is sent. If we notice a reboot
+ * via these mechanisms then wait 10ms for the status word to be set. */
+#define MCDI_STATUS_DELAY_US		100
+#define MCDI_STATUS_DELAY_COUNT		100
+#define MCDI_STATUS_SLEEP_MS						\
+	(MCDI_STATUS_DELAY_US * MCDI_STATUS_DELAY_COUNT / 1000)
 
 #define SEQ_MASK							\
 	EFX_MASK32(EFX_WIDTH(MCDI_HEADER_SEQ))
@@ -210,7 +216,7 @@ out:
 /* Test and clear MC-rebooted flag for this port/function */
 int efx_mcdi_poll_reboot(struct efx_nic *efx)
 {
-	unsigned int addr = FR_CZ_MC_TREG_SMEM + MCDI_REBOOT_FLAG(efx);
+	unsigned int addr = FR_CZ_MC_TREG_SMEM + MCDI_STATUS(efx);
 	efx_dword_t reg;
 	uint32_t value;
 
@@ -384,6 +390,11 @@ int efx_mcdi_rpc(struct efx_nic *efx, unsigned cmd,
 			netif_dbg(efx, hw, efx->net_dev,
 				  "MC command 0x%x inlen %d failed rc=%d\n",
 				  cmd, (int)inlen, -rc);
+
+		if (rc == -EIO || rc == -EINTR) {
+			msleep(MCDI_STATUS_SLEEP_MS);
+			efx_mcdi_poll_reboot(efx);
+		}
 	}
 
 	efx_mcdi_release(mcdi);
@@ -465,10 +476,20 @@ static void efx_mcdi_ev_death(struct efx_nic *efx, int rc)
 			mcdi->resplen = 0;
 			++mcdi->credits;
 		}
-	} else
+	} else {
+		int count;
+
 		/* Nobody was waiting for an MCDI request, so trigger a reset */
 		efx_schedule_reset(efx, RESET_TYPE_MC_FAILURE);
 
+		/* Consume the status word since efx_mcdi_rpc_finish() won't */
+		for (count = 0; count < MCDI_STATUS_DELAY_COUNT; ++count) {
+			if (efx_mcdi_poll_reboot(efx))
+				break;
+			udelay(MCDI_STATUS_DELAY_US);
+		}
+	}
+
 	spin_unlock(&mcdi->iface_lock);
 }
 
-- 
1.7.7.5



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
