stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jens Remus <jremus@linux.ibm.com>,
	Benjamin Block <bblock@linux.ibm.com>,
	Steffen Maier <maier@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.4 30/41] scsi: zfcp: fix reaction on bit error threshold notification
Date: Sun, 27 Oct 2019 22:01:08 +0100	[thread overview]
Message-ID: <20191027203125.040101070@linuxfoundation.org> (raw)
In-Reply-To: <20191027203056.220821342@linuxfoundation.org>

From: Steffen Maier <maier@linux.ibm.com>

[ Upstream commit 2190168aaea42c31bff7b9a967e7b045f07df095 ]

On excessive bit errors for the FCP channel ingress fibre path, the channel
notifies us.  Previously, we only emitted a kernel message and a trace
record.  Since performance can become suboptimal with I/O timeouts due to
bit errors, we now stop using an FCP device by default on channel
notification so multipath on top can timely failover to other paths.  A new
module parameter zfcp.ber_stop can be used to get zfcp old behavior.

User explanation of new kernel message:

 * Description:
 * The FCP channel reported that its bit error threshold has been exceeded.
 * These errors might result from a problem with the physical components
 * of the local fibre link into the FCP channel.
 * The problem might be damage or malfunction of the cable or
 * cable connection between the FCP channel and
 * the adjacent fabric switch port or the point-to-point peer.
 * Find details about the errors in the HBA trace for the FCP device.
 * The zfcp device driver closed down the FCP device
 * to limit the performance impact from possible I/O command timeouts.
 * User action:
 * Check for problems on the local fibre link, ensure that fibre optics are
 * clean and functional, and all cables are properly plugged.
 * After the repair action, you can manually recover the FCP device by
 * writing "0" into its "failed" sysfs attribute.
 * If recovery through sysfs is not possible, set the CHPID of the device
 * offline and back online on the service element.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: <stable@vger.kernel.org> #2.6.30+
Link: https://lore.kernel.org/r/20191001104949.42810-1-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/s390/scsi/zfcp_fsf.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 1964391db9047..a3aaef4c53a3c 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -20,6 +20,11 @@
 
 struct kmem_cache *zfcp_fsf_qtcb_cache;
 
+static bool ber_stop = true;
+module_param(ber_stop, bool, 0600);
+MODULE_PARM_DESC(ber_stop,
+		 "Shuts down FCP devices for FCP channels that report a bit-error count in excess of its threshold (default on)");
+
 static void zfcp_fsf_request_timeout_handler(unsigned long data)
 {
 	struct zfcp_adapter *adapter = (struct zfcp_adapter *) data;
@@ -231,10 +236,15 @@ static void zfcp_fsf_status_read_handler(struct zfcp_fsf_req *req)
 	case FSF_STATUS_READ_SENSE_DATA_AVAIL:
 		break;
 	case FSF_STATUS_READ_BIT_ERROR_THRESHOLD:
-		dev_warn(&adapter->ccw_device->dev,
-			 "The error threshold for checksum statistics "
-			 "has been exceeded\n");
 		zfcp_dbf_hba_bit_err("fssrh_3", req);
+		if (ber_stop) {
+			dev_warn(&adapter->ccw_device->dev,
+				 "All paths over this FCP device are disused because of excessive bit errors\n");
+			zfcp_erp_adapter_shutdown(adapter, 0, "fssrh_b");
+		} else {
+			dev_warn(&adapter->ccw_device->dev,
+				 "The error threshold for checksum statistics has been exceeded\n");
+		}
 		break;
 	case FSF_STATUS_READ_LINK_DOWN:
 		zfcp_fsf_status_read_link_down(req);
-- 
2.20.1




  parent reply	other threads:[~2019-10-27 21:04 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-27 21:00 [PATCH 4.4 00/41] 4.4.198-stable review Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 01/41] scsi: ufs: skip shutdown if hba is not powered Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 02/41] scsi: megaraid: disable device when probe failed after enabled device Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 03/41] scsi: qla2xxx: Fix unbound sleep in fcport delete path Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 04/41] ARM: OMAP2+: Fix missing reset done flag for am3 and am43 Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 05/41] ARM: dts: am4372: Set memory bandwidth limit for DISPC Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 06/41] nl80211: fix null pointer dereference Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 07/41] mips: Loongson: Fix the link time qualifier of serial_exit() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 08/41] net: hisilicon: Fix usage of uninitialized variable in function mdio_sc_cfg_reg_write() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 09/41] namespace: fix namespace.pl script to support relative paths Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 10/41] MIPS: Treat Loongson Extensions as ASEs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 11/41] MIPS: elf_hwcap: Export userspace ASEs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 12/41] loop: Add LOOP_SET_DIRECT_IO to compat ioctl Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 13/41] net: bcmgenet: Fix RGMII_MODE_EN value for GENET v1/2/3 Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 14/41] net: bcmgenet: Set phydev->dev_flags only for internal PHYs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 15/41] sctp: change sctp_prot .no_autobind with true Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 16/41] net: avoid potential infinite loop in tc_ctl_action() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 17/41] ipv4: Return -ENETUNREACH if we cant create route but saddr is valid Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 18/41] memfd: Fix locking when tagging pins Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 19/41] USB: legousbtower: fix memleak on disconnect Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 20/41] usb: udc: lpc32xx: fix bad bit shift operation Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.4 21/41] USB: serial: ti_usb_3410_5052: fix port-close races Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 22/41] USB: ldusb: fix memleak on disconnect Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 23/41] USB: usblp: fix use-after-free " Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 24/41] USB: ldusb: fix read info leaks Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 25/41] scsi: core: try to get module before removing device Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 26/41] ASoC: rsnd: Reinitialize bit clock inversion flag for every format setting Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 27/41] cfg80211: wext: avoid copying malformed SSIDs Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 28/41] mac80211: Reject malformed SSID elements Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 29/41] drm/edid: Add 6 bpc quirk for SDC panel in Lenovo G50 Greg Kroah-Hartman
2019-10-27 21:01 ` Greg Kroah-Hartman [this message]
2019-10-27 21:01 ` [PATCH 4.4 31/41] mm/slub: fix a deadlock in show_slab_objects() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 32/41] xtensa: drop EXPORT_SYMBOL for outs*/ins* Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 33/41] parisc: Fix vmap memory leak in ioremap()/iounmap() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 34/41] CIFS: avoid using MID 0xFFFF Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 35/41] btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 36/41] memstick: jmb38x_ms: Fix an error handling path in jmb38x_ms_probe() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 37/41] cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 38/41] xen/netback: fix error path of xenvif_connect_data() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 39/41] PCI: PM: Fix pci_power_up() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 40/41] net: sched: Fix memory exposure from short TCA_U32_SEL Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.4 41/41] RDMA/cxgb4: Do not dma memory off of the stack Greg Kroah-Hartman
2019-10-28  3:55 ` [PATCH 4.4 00/41] 4.4.198-stable review kernelci.org bot
2019-10-28  9:07 ` Didik Setiawan
2019-10-28 13:32 ` Guenter Roeck
2019-10-28 13:49   ` Greg Kroah-Hartman
2019-10-28 13:57     ` Greg Kroah-Hartman
2019-10-28 18:12       ` Guenter Roeck
2019-10-29  8:19         ` Greg Kroah-Hartman
2019-10-28 13:33 ` Guenter Roeck
2019-10-28 21:36 ` Jon Hunter
2019-10-29 13:29 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191027203125.040101070@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bblock@linux.ibm.com \
    --cc=jremus@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maier@linux.ibm.com \
    --cc=martin.petersen@oracle.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).