From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Martin Wilck <martin.wilck@suse.com>,
Rajashekhar M A <rajs@netapp.com>, Hannes Reinecke <hare@suse.de>,
Damien Le Moal <dlemoal@kernel.org>,
Christoph Hellwig <hch@lst.de>,
Mike Christie <michael.christie@oracle.com>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
Sasha Levin <sashal@kernel.org>,
James.Bottomley@HansenPartnership.com,
linux-scsi@vger.kernel.org
Subject: [PATCH AUTOSEL 6.9 01/44] scsi: core: alua: I/O errors for ALUA state transitions
Date: Mon, 17 Jun 2024 09:19:14 -0400
Message-ID: <20240617132046.2587008-1-sashal@kernel.org>
From: Martin Wilck <martin.wilck@suse.com>
[ Upstream commit 10157b1fc1a762293381e9145041253420dfc6ad ]
When a host is configured with a few LUNs and I/O is running, repeatedly
injecting FC faults leads to path recovery problems. Each LUN has 4 paths;
after an FC fault that takes down 2 of them, only 3 paths come back active
instead of all 4. This happens after several iterations of continuous FC
faults.

The reason is that we return an I/O error whenever we encounter sense code
06/04/0a (LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION)
instead of retrying.
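
As a rough illustration of the behaviour change described above, the
following standalone C sketch models the decision for additional sense code
04/0a under both sense keys. It is not the kernel code: the enum, struct and
function names are hypothetical stand-ins defined locally, and only the 04/0a
case is modelled. The sense key values 0x02 (NOT READY) and 0x06 (UNIT
ATTENTION) are the standard SPC values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard SPC sense key values; the names are local to this sketch. */
enum { SK_NOT_READY = 0x02, SK_UNIT_ATTENTION = 0x06 };

/* Hypothetical stand-ins for the handler's possible answers. */
enum disposition { DISP_NOT_HANDLED, DISP_NEEDS_RETRY };

/* Hypothetical, reduced view of a decoded sense header. */
struct sense {
	uint8_t key;
	uint8_t asc;
	uint8_t ascq;
};

/* 04/0a: LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION */
bool is_alua_transition(const struct sense *s)
{
	return s->asc == 0x04 && s->ascq == 0x0a;
}

/* Model of the old behaviour: only NOT READY + 04/0a is retried. */
enum disposition check_sense_before(const struct sense *s)
{
	if (s->key == SK_NOT_READY && is_alua_transition(s))
		return DISP_NEEDS_RETRY;
	return DISP_NOT_HANDLED;
}

/* Model of the new behaviour: UNIT ATTENTION + 04/0a is retried as well. */
enum disposition check_sense_after(const struct sense *s)
{
	if ((s->key == SK_NOT_READY || s->key == SK_UNIT_ATTENTION) &&
	    is_alua_transition(s))
		return DISP_NEEDS_RETRY;
	return DISP_NOT_HANDLED;
}
```

For sense 06/04/0a, check_sense_before() reports "not handled", while
check_sense_after() asks for a retry; 02/04/0a is retried by both.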
[mwilck: The original patch was developed by Rajashekhar M A and Hannes
Reinecke. I moved the code to alua_check_sense() as suggested by Mike
Christie [1]. Evan Milne had raised the question whether pg->state should
be set to transitioning in the UA case [2]. I believe that doing this is
correct. SCSI_ACCESS_STATE_TRANSITIONING by itself doesn't cause I/O
errors. Our handler schedules an RTPG, which will only result in an I/O
error condition if the transitioning timeout expires.]
[1] https://lore.kernel.org/all/0bc96e82-fdda-4187-148d-5b34f81d4942@oracle.com/
[2] https://lore.kernel.org/all/CAGtn9r=kicnTDE2o7Gt5Y=yoidHYD7tG8XdMHEBJTBraVEoOCw@mail.gmail.com/
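
To make the argument in the note above concrete, here is a small standalone
model of why marking a port group as transitioning does not fail I/O by
itself. Everything in it is an assumption made for illustration: the struct,
the tick-based clock and both function names are hypothetical, counter
wraparound is ignored, and the handler's RTPG work is not modelled.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of access states for this sketch. */
enum pg_state { PG_ACTIVE, PG_TRANSITIONING };

/* Hypothetical port group: a state plus a transition deadline in ticks. */
struct port_group {
	enum pg_state state;
	unsigned long expiry;	/* 0 means no deadline is armed */
};

/*
 * Mark the group as transitioning. The deadline is armed only on the
 * first call, so repeated sense reports do not extend the timeout.
 */
void mark_transitioning(struct port_group *pg, unsigned long now,
			unsigned long timeout)
{
	pg->state = PG_TRANSITIONING;
	if (pg->expiry == 0)
		pg->expiry = now + timeout;
}

/*
 * I/O is failed only once the deadline has passed while the group is
 * still transitioning; before that, commands are simply retried.
 */
bool transition_expired(const struct port_group *pg, unsigned long now)
{
	return pg->state == PG_TRANSITIONING &&
	       pg->expiry != 0 && now >= pg->expiry;
}
```

In this model, a group marked at tick 100 with a timeout of 60 keeps retrying
until tick 160, and a second report at tick 120 leaves the deadline unchanged.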
Co-developed-by: Rajashekhar M A <rajs@netapp.com>
Co-developed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin Wilck <martin.wilck@suse.com>
Link: https://lore.kernel.org/r/20240514140344.19538-1-mwilck@suse.com
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/scsi/device_handler/scsi_dh_alua.c | 31 +++++++++++++++-------
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index a226dc1b65d71..4eb0837298d4d 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -414,28 +414,40 @@ static char print_alua_state(unsigned char state)
 	}
 }
 
-static enum scsi_disposition alua_check_sense(struct scsi_device *sdev,
-					      struct scsi_sense_hdr *sense_hdr)
+static void alua_handle_state_transition(struct scsi_device *sdev)
 {
 	struct alua_dh_data *h = sdev->handler_data;
 	struct alua_port_group *pg;
 
+	rcu_read_lock();
+	pg = rcu_dereference(h->pg);
+	if (pg)
+		pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
+	rcu_read_unlock();
+	alua_check(sdev, false);
+}
+
+static enum scsi_disposition alua_check_sense(struct scsi_device *sdev,
+					      struct scsi_sense_hdr *sense_hdr)
+{
 	switch (sense_hdr->sense_key) {
 	case NOT_READY:
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0a) {
 			/*
 			 * LUN Not Accessible - ALUA state transition
 			 */
-			rcu_read_lock();
-			pg = rcu_dereference(h->pg);
-			if (pg)
-				pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
-			rcu_read_unlock();
-			alua_check(sdev, false);
+			alua_handle_state_transition(sdev);
 			return NEEDS_RETRY;
 		}
 		break;
 	case UNIT_ATTENTION:
+		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0a) {
+			/*
+			 * LUN Not Accessible - ALUA state transition
+			 */
+			alua_handle_state_transition(sdev);
+			return NEEDS_RETRY;
+		}
 		if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00) {
 			/*
 			 * Power On, Reset, or Bus Device Reset.
@@ -502,7 +514,8 @@ static int alua_tur(struct scsi_device *sdev)
 
 	retval = scsi_test_unit_ready(sdev, ALUA_FAILOVER_TIMEOUT * HZ,
 				      ALUA_FAILOVER_RETRIES, &sense_hdr);
-	if (sense_hdr.sense_key == NOT_READY &&
+	if ((sense_hdr.sense_key == NOT_READY ||
+	     sense_hdr.sense_key == UNIT_ATTENTION) &&
 	    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a)
 		return SCSI_DH_RETRY;
 	else if (retval)
--
2.43.0