stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jens Remus <jremus@linux.ibm.com>,
	Benjamin Block <bblock@linux.ibm.com>,
	Steffen Maier <maier@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.9 38/49] scsi: zfcp: fix reaction on bit error threshold notification
Date: Sun, 27 Oct 2019 22:01:16 +0100	[thread overview]
Message-ID: <20191027203155.923204292@linuxfoundation.org> (raw)
In-Reply-To: <20191027203119.468466356@linuxfoundation.org>

From: Steffen Maier <maier@linux.ibm.com>

[ Upstream commit 2190168aaea42c31bff7b9a967e7b045f07df095 ]

On excessive bit errors for the FCP channel ingress fibre path, the channel
notifies us.  Previously, we only emitted a kernel message and a trace
record.  Since performance can become suboptimal with I/O timeouts due to
bit errors, we now stop using an FCP device by default on channel
notification so multipath on top can timely failover to other paths.  A new
module parameter zfcp.ber_stop can be used to get zfcp old behavior.

User explanation of new kernel message:

 * Description:
 * The FCP channel reported that its bit error threshold has been exceeded.
 * These errors might result from a problem with the physical components
 * of the local fibre link into the FCP channel.
 * The problem might be damage or malfunction of the cable or
 * cable connection between the FCP channel and
 * the adjacent fabric switch port or the point-to-point peer.
 * Find details about the errors in the HBA trace for the FCP device.
 * The zfcp device driver closed down the FCP device
 * to limit the performance impact from possible I/O command timeouts.
 * User action:
 * Check for problems on the local fibre link, ensure that fibre optics are
 * clean and functional, and all cables are properly plugged.
 * After the repair action, you can manually recover the FCP device by
 * writing "0" into its "failed" sysfs attribute.
 * If recovery through sysfs is not possible, set the CHPID of the device
 * offline and back online on the service element.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: <stable@vger.kernel.org> #2.6.30+
Link: https://lore.kernel.org/r/20191001104949.42810-1-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/s390/scsi/zfcp_fsf.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 1964391db9047..a3aaef4c53a3c 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -20,6 +20,11 @@
 
 struct kmem_cache *zfcp_fsf_qtcb_cache;
 
+static bool ber_stop = true;
+module_param(ber_stop, bool, 0600);
+MODULE_PARM_DESC(ber_stop,
+		 "Shuts down FCP devices for FCP channels that report a bit-error count in excess of its threshold (default on)");
+
 static void zfcp_fsf_request_timeout_handler(unsigned long data)
 {
 	struct zfcp_adapter *adapter = (struct zfcp_adapter *) data;
@@ -231,10 +236,15 @@ static void zfcp_fsf_status_read_handler(struct zfcp_fsf_req *req)
 	case FSF_STATUS_READ_SENSE_DATA_AVAIL:
 		break;
 	case FSF_STATUS_READ_BIT_ERROR_THRESHOLD:
-		dev_warn(&adapter->ccw_device->dev,
-			 "The error threshold for checksum statistics "
-			 "has been exceeded\n");
 		zfcp_dbf_hba_bit_err("fssrh_3", req);
+		if (ber_stop) {
+			dev_warn(&adapter->ccw_device->dev,
+				 "All paths over this FCP device are disused because of excessive bit errors\n");
+			zfcp_erp_adapter_shutdown(adapter, 0, "fssrh_b");
+		} else {
+			dev_warn(&adapter->ccw_device->dev,
+				 "The error threshold for checksum statistics has been exceeded\n");
+		}
 		break;
 	case FSF_STATUS_READ_LINK_DOWN:
 		zfcp_fsf_status_read_link_down(req);
-- 
2.20.1




  parent reply	other threads:[~2019-10-27 21:06 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-27 21:00 [PATCH 4.9 00/49] 4.9.198-stable review Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 01/49] scsi: ufs: skip shutdown if hba is not powered Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 02/49] scsi: megaraid: disable device when probe failed after enabled device Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 03/49] scsi: qla2xxx: Fix unbound sleep in fcport delete path Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 04/49] ARM: OMAP2+: Fix missing reset done flag for am3 and am43 Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 05/49] ARM: dts: am4372: Set memory bandwidth limit for DISPC Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 06/49] MIPS: dts: ar9331: fix interrupt-controller size Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 07/49] nl80211: fix null pointer dereference Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 08/49] mac80211: fix txq " Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 09/49] mips: Loongson: Fix the link time qualifier of serial_exit() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 10/49] net: hisilicon: Fix usage of uninitialized variable in function mdio_sc_cfg_reg_write() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 11/49] namespace: fix namespace.pl script to support relative paths Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 12/49] Revert "drm/radeon: Fix EEH during kexec" Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 13/49] ocfs2: fix panic due to ocfs2_wq is null Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 14/49] MIPS: Treat Loongson Extensions as ASEs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 15/49] MIPS: elf_hwcap: Export userspace ASEs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 16/49] loop: Add LOOP_SET_DIRECT_IO to compat ioctl Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 17/49] net: bcmgenet: Fix RGMII_MODE_EN value for GENET v1/2/3 Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 18/49] net: bcmgenet: Set phydev->dev_flags only for internal PHYs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 19/49] sctp: change sctp_prot .no_autobind with true Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 20/49] net: avoid potential infinite loop in tc_ctl_action() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 21/49] ipv4: Return -ENETUNREACH if we cant create route but saddr is valid Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 22/49] memfd: Fix locking when tagging pins Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 23/49] USB: legousbtower: fix memleak on disconnect Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 24/49] ALSA: hda/realtek - Add support for ALC711 Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 25/49] usb: udc: lpc32xx: fix bad bit shift operation Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 26/49] USB: serial: ti_usb_3410_5052: fix port-close races Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 27/49] USB: ldusb: fix memleak on disconnect Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 28/49] USB: usblp: fix use-after-free " Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 29/49] USB: ldusb: fix read info leaks Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 30/49] MIPS: tlbex: Fix build_restore_pagemask KScratch restore Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 31/49] staging: wlan-ng: fix exit return when sme->key_idx >= NUM_WEPKEYS Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 32/49] scsi: core: try to get module before removing device Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 33/49] Input: da9063 - fix capability and drop KEY_SLEEP Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 34/49] ASoC: rsnd: Reinitialize bit clock inversion flag for every format setting Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 35/49] cfg80211: wext: avoid copying malformed SSIDs Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 36/49] mac80211: Reject malformed SSID elements Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 37/49] drm/edid: Add 6 bpc quirk for SDC panel in Lenovo G50 Greg Kroah-Hartman
2019-10-27 21:01 ` Greg Kroah-Hartman [this message]
2019-10-27 21:01 ` [PATCH 4.9 39/49] mm/slub: fix a deadlock in show_slab_objects() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 40/49] xtensa: drop EXPORT_SYMBOL for outs*/ins* Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 41/49] parisc: Fix vmap memory leak in ioremap()/iounmap() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 42/49] CIFS: avoid using MID 0xFFFF Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 43/49] btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 44/49] memstick: jmb38x_ms: Fix an error handling path in jmb38x_ms_probe() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 45/49] cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 46/49] xen/netback: fix error path of xenvif_connect_data() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 47/49] PCI: PM: Fix pci_power_up() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 48/49] Revert "net: sit: fix memory leak in sit_init_net()" Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 49/49] RDMA/cxgb4: Do not dma memory off of the stack Greg Kroah-Hartman
2019-10-28  5:35 ` [PATCH 4.9 00/49] 4.9.198-stable review kernelci.org bot
2019-10-28  9:05 ` Didik Setiawan
2019-10-28 13:34 ` Guenter Roeck
2019-10-28 21:37 ` Jon Hunter
2019-10-29 15:17 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191027203155.923204292@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bblock@linux.ibm.com \
    --cc=jremus@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maier@linux.ibm.com \
    --cc=martin.petersen@oracle.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).