From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Jens Remus <jremus@linux.ibm.com>,
Benjamin Block <bblock@linux.ibm.com>,
Steffen Maier <maier@linux.ibm.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.9 38/49] scsi: zfcp: fix reaction on bit error threshold notification
Date: Sun, 27 Oct 2019 22:01:16 +0100 [thread overview]
Message-ID: <20191027203155.923204292@linuxfoundation.org> (raw)
In-Reply-To: <20191027203119.468466356@linuxfoundation.org>
From: Steffen Maier <maier@linux.ibm.com>
[ Upstream commit 2190168aaea42c31bff7b9a967e7b045f07df095 ]
On excessive bit errors for the FCP channel ingress fibre path, the channel
notifies us. Previously, we only emitted a kernel message and a trace
record. Since performance can become suboptimal with I/O timeouts due to
bit errors, we now stop using an FCP device by default on channel
notification so multipath on top can timely failover to other paths. A new
module parameter zfcp.ber_stop can be used to get zfcp old behavior.
User explanation of new kernel message:
* Description:
* The FCP channel reported that its bit error threshold has been exceeded.
* These errors might result from a problem with the physical components
* of the local fibre link into the FCP channel.
* The problem might be damage or malfunction of the cable or
* cable connection between the FCP channel and
* the adjacent fabric switch port or the point-to-point peer.
* Find details about the errors in the HBA trace for the FCP device.
* The zfcp device driver closed down the FCP device
* to limit the performance impact from possible I/O command timeouts.
* User action:
* Check for problems on the local fibre link, ensure that fibre optics are
* clean and functional, and all cables are properly plugged.
* After the repair action, you can manually recover the FCP device by
* writing "0" into its "failed" sysfs attribute.
* If recovery through sysfs is not possible, set the CHPID of the device
* offline and back online on the service element.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: <stable@vger.kernel.org> #2.6.30+
Link: https://lore.kernel.org/r/20191001104949.42810-1-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/s390/scsi/zfcp_fsf.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 1964391db9047..a3aaef4c53a3c 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -20,6 +20,11 @@
struct kmem_cache *zfcp_fsf_qtcb_cache;
+static bool ber_stop = true;
+module_param(ber_stop, bool, 0600);
+MODULE_PARM_DESC(ber_stop,
+ "Shuts down FCP devices for FCP channels that report a bit-error count in excess of its threshold (default on)");
+
static void zfcp_fsf_request_timeout_handler(unsigned long data)
{
struct zfcp_adapter *adapter = (struct zfcp_adapter *) data;
@@ -231,10 +236,15 @@ static void zfcp_fsf_status_read_handler(struct zfcp_fsf_req *req)
case FSF_STATUS_READ_SENSE_DATA_AVAIL:
break;
case FSF_STATUS_READ_BIT_ERROR_THRESHOLD:
- dev_warn(&adapter->ccw_device->dev,
- "The error threshold for checksum statistics "
- "has been exceeded\n");
zfcp_dbf_hba_bit_err("fssrh_3", req);
+ if (ber_stop) {
+ dev_warn(&adapter->ccw_device->dev,
+ "All paths over this FCP device are disused because of excessive bit errors\n");
+ zfcp_erp_adapter_shutdown(adapter, 0, "fssrh_b");
+ } else {
+ dev_warn(&adapter->ccw_device->dev,
+ "The error threshold for checksum statistics has been exceeded\n");
+ }
break;
case FSF_STATUS_READ_LINK_DOWN:
zfcp_fsf_status_read_link_down(req);
--
2.20.1
next prev parent reply other threads:[~2019-10-27 21:06 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-27 21:00 [PATCH 4.9 00/49] 4.9.198-stable review Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 01/49] scsi: ufs: skip shutdown if hba is not powered Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 02/49] scsi: megaraid: disable device when probe failed after enabled device Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 03/49] scsi: qla2xxx: Fix unbound sleep in fcport delete path Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 04/49] ARM: OMAP2+: Fix missing reset done flag for am3 and am43 Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 05/49] ARM: dts: am4372: Set memory bandwidth limit for DISPC Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 06/49] MIPS: dts: ar9331: fix interrupt-controller size Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 07/49] nl80211: fix null pointer dereference Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 08/49] mac80211: fix txq " Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 09/49] mips: Loongson: Fix the link time qualifier of serial_exit() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 10/49] net: hisilicon: Fix usage of uninitialized variable in function mdio_sc_cfg_reg_write() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 11/49] namespace: fix namespace.pl script to support relative paths Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 12/49] Revert "drm/radeon: Fix EEH during kexec" Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 13/49] ocfs2: fix panic due to ocfs2_wq is null Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 14/49] MIPS: Treat Loongson Extensions as ASEs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 15/49] MIPS: elf_hwcap: Export userspace ASEs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 16/49] loop: Add LOOP_SET_DIRECT_IO to compat ioctl Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 17/49] net: bcmgenet: Fix RGMII_MODE_EN value for GENET v1/2/3 Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 18/49] net: bcmgenet: Set phydev->dev_flags only for internal PHYs Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 19/49] sctp: change sctp_prot .no_autobind with true Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 20/49] net: avoid potential infinite loop in tc_ctl_action() Greg Kroah-Hartman
2019-10-27 21:00 ` [PATCH 4.9 21/49] ipv4: Return -ENETUNREACH if we cant create route but saddr is valid Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 22/49] memfd: Fix locking when tagging pins Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 23/49] USB: legousbtower: fix memleak on disconnect Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 24/49] ALSA: hda/realtek - Add support for ALC711 Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 25/49] usb: udc: lpc32xx: fix bad bit shift operation Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 26/49] USB: serial: ti_usb_3410_5052: fix port-close races Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 27/49] USB: ldusb: fix memleak on disconnect Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 28/49] USB: usblp: fix use-after-free " Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 29/49] USB: ldusb: fix read info leaks Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 30/49] MIPS: tlbex: Fix build_restore_pagemask KScratch restore Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 31/49] staging: wlan-ng: fix exit return when sme->key_idx >= NUM_WEPKEYS Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 32/49] scsi: core: try to get module before removing device Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 33/49] Input: da9063 - fix capability and drop KEY_SLEEP Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 34/49] ASoC: rsnd: Reinitialize bit clock inversion flag for every format setting Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 35/49] cfg80211: wext: avoid copying malformed SSIDs Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 36/49] mac80211: Reject malformed SSID elements Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 37/49] drm/edid: Add 6 bpc quirk for SDC panel in Lenovo G50 Greg Kroah-Hartman
2019-10-27 21:01 ` Greg Kroah-Hartman [this message]
2019-10-27 21:01 ` [PATCH 4.9 39/49] mm/slub: fix a deadlock in show_slab_objects() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 40/49] xtensa: drop EXPORT_SYMBOL for outs*/ins* Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 41/49] parisc: Fix vmap memory leak in ioremap()/iounmap() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 42/49] CIFS: avoid using MID 0xFFFF Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 43/49] btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 44/49] memstick: jmb38x_ms: Fix an error handling path in jmb38x_ms_probe() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 45/49] cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 46/49] xen/netback: fix error path of xenvif_connect_data() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 47/49] PCI: PM: Fix pci_power_up() Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 48/49] Revert "net: sit: fix memory leak in sit_init_net()" Greg Kroah-Hartman
2019-10-27 21:01 ` [PATCH 4.9 49/49] RDMA/cxgb4: Do not dma memory off of the stack Greg Kroah-Hartman
2019-10-28 5:35 ` [PATCH 4.9 00/49] 4.9.198-stable review kernelci.org bot
2019-10-28 9:05 ` Didik Setiawan
2019-10-28 13:34 ` Guenter Roeck
2019-10-28 21:37 ` Jon Hunter
2019-10-29 15:17 ` Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191027203155.923204292@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=bblock@linux.ibm.com \
--cc=jremus@linux.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=maier@linux.ibm.com \
--cc=martin.petersen@oracle.com \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).