* [PATCH] zfcp: fix infinite iteration on ERP ready list
@ 2018-05-03 11:52 Steffen Maier
2018-05-08 4:02 ` Martin K. Petersen
0 siblings, 1 reply; 2+ messages in thread
From: Steffen Maier @ 2018-05-03 11:52 UTC (permalink / raw)
To: James E . J . Bottomley, Martin K . Petersen
Cc: linux-scsi, linux-s390, Martin Schwidefsky, Heiko Carstens,
Steffen Maier, Jens Remus, stable
From: Jens Remus <jremus@linux.ibm.com>
zfcp_erp_adapter_reopen() schedules blocking of all of the adapter's
rports via zfcp_scsi_schedule_rports_block() and enqueues a reopen
adapter ERP action via zfcp_erp_action_enqueue(). Both are separately
processed asynchronously and concurrently.
Blocking of rports is done in a kworker by zfcp_scsi_rport_work(). It
calls zfcp_scsi_rport_block(), which then traces a DBF REC "scpdely" via
zfcp_dbf_rec_trig().
zfcp_dbf_rec_trig() acquires the DBF REC spin lock and then iterates with
list_for_each() over the adapter's ERP ready list without holding the ERP
lock. This opens a race window in which the current list entry can be
moved to another list, causing list_for_each() to iterate forever on the
wrong list, as the erp_ready_head is never encountered as terminal
condition.
Meanwhile the ERP action can be processed in the ERP thread by
zfcp_erp_thread(). It calls zfcp_erp_strategy(), which acquires the ERP
lock and then calls zfcp_erp_action_to_running() to move the ERP action
from the ready to the running list.
zfcp_erp_action_to_running() can move the ERP action using list_move()
just during the aforementioned race window. It then traces a REC RUN
"erator1" via zfcp_dbf_rec_run().
zfcp_dbf_rec_run() tries to acquire the DBF REC spin lock. If this is held
by the infinitely looping kworker, it effectively spins forever.
Example Sequence Diagram:
Process ERP Thread rport_work
------------------- ------------------- -------------------
zfcp_erp_adapter_reopen()
zfcp_erp_adapter_block()
zfcp_scsi_schedule_rports_block()
lock ERP zfcp_scsi_rport_work()
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER)
list_add_tail() on ready !(rport_task==RPORT_ADD)
wake_up() ERP thread zfcp_scsi_rport_block()
zfcp_dbf_rec_trig() zfcp_erp_strategy() zfcp_dbf_rec_trig()
unlock ERP lock DBF REC
zfcp_erp_wait() lock ERP
| zfcp_erp_action_to_running()
| list_for_each() ready
| list_move() current entry
| ready to running
| zfcp_dbf_rec_run() endless loop over running
| zfcp_dbf_rec_run_lvl()
| lock DBF REC spins forever
Any adapter recovery can trigger this, such as setting the device offline
or reboot.
V4.9 commit 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport during
rport gone") introduced additional tracing of (un)blocking of rports. It
missed that the adapter->erp_lock must be held when calling
zfcp_dbf_rec_trig().
This fix uses the approach formerly introduced by commit aa0fec62391c
("[SCSI] zfcp: Fix sparse warning by providing new entry in dbf") that got
later removed by commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug
tracing for recovery actions.").
Introduce zfcp_dbf_rec_trig_lock(), a wrapper for zfcp_dbf_rec_trig() that
acquires and releases the adapter->erp_lock for read.
Reported-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Fixes: 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport during rport gone")
Cc: <stable@vger.kernel.org> # 2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
James, Martin,
this is an important zfcp regression fix.
It would be nice if it could make it into 4.17-rcX.
The patch applies to James' fixes branch or Martin's 4.17/scsi-fixes branch.
Regards,
Steffen
drivers/s390/scsi/zfcp_dbf.c | 23 ++++++++++++++++++++++-
drivers/s390/scsi/zfcp_ext.h | 5 ++++-
drivers/s390/scsi/zfcp_scsi.c | 14 +++++++-------
3 files changed, 33 insertions(+), 9 deletions(-)
diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index a8b831000b2d..18c4f933e8b9 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -4,7 +4,7 @@
*
* Debug traces for zfcp.
*
- * Copyright IBM Corp. 2002, 2017
+ * Copyright IBM Corp. 2002, 2018
*/
#define KMSG_COMPONENT "zfcp"
@@ -308,6 +308,27 @@ void zfcp_dbf_rec_trig(char *tag, struct zfcp_adapter *adapter,
spin_unlock_irqrestore(&dbf->rec_lock, flags);
}
+/**
+ * zfcp_dbf_rec_trig_lock - trace event related to triggered recovery with lock
+ * @tag: identifier for event
+ * @adapter: adapter on which the erp_action should run
+ * @port: remote port involved in the erp_action
+ * @sdev: scsi device involved in the erp_action
+ * @want: wanted erp_action
+ * @need: required erp_action
+ *
+ * The adapter->erp_lock must not be held.
+ */
+void zfcp_dbf_rec_trig_lock(char *tag, struct zfcp_adapter *adapter,
+ struct zfcp_port *port, struct scsi_device *sdev,
+ u8 want, u8 need)
+{
+ unsigned long flags;
+
+ read_lock_irqsave(&adapter->erp_lock, flags);
+ zfcp_dbf_rec_trig(tag, adapter, port, sdev, want, need);
+ read_unlock_irqrestore(&adapter->erp_lock, flags);
+}
/**
* zfcp_dbf_rec_run_lvl - trace event related to running recovery
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index bf8ea4df2bb8..e5eed8aac0ce 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -4,7 +4,7 @@
*
* External function declarations.
*
- * Copyright IBM Corp. 2002, 2016
+ * Copyright IBM Corp. 2002, 2018
*/
#ifndef ZFCP_EXT_H
@@ -35,6 +35,9 @@ extern int zfcp_dbf_adapter_register(struct zfcp_adapter *);
extern void zfcp_dbf_adapter_unregister(struct zfcp_adapter *);
extern void zfcp_dbf_rec_trig(char *, struct zfcp_adapter *,
struct zfcp_port *, struct scsi_device *, u8, u8);
+extern void zfcp_dbf_rec_trig_lock(char *tag, struct zfcp_adapter *adapter,
+ struct zfcp_port *port,
+ struct scsi_device *sdev, u8 want, u8 need);
extern void zfcp_dbf_rec_run(char *, struct zfcp_erp_action *);
extern void zfcp_dbf_rec_run_lvl(int level, char *tag,
struct zfcp_erp_action *erp);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 4d2ba5682493..22f9562f415c 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -4,7 +4,7 @@
*
* Interface to Linux SCSI midlayer.
*
- * Copyright IBM Corp. 2002, 2017
+ * Copyright IBM Corp. 2002, 2018
*/
#define KMSG_COMPONENT "zfcp"
@@ -618,9 +618,9 @@ static void zfcp_scsi_rport_register(struct zfcp_port *port)
ids.port_id = port->d_id;
ids.roles = FC_RPORT_ROLE_FCP_TARGET;
- zfcp_dbf_rec_trig("scpaddy", port->adapter, port, NULL,
- ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD,
- ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD);
+ zfcp_dbf_rec_trig_lock("scpaddy", port->adapter, port, NULL,
+ ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD,
+ ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD);
rport = fc_remote_port_add(port->adapter->scsi_host, 0, &ids);
if (!rport) {
dev_err(&port->adapter->ccw_device->dev,
@@ -642,9 +642,9 @@ static void zfcp_scsi_rport_block(struct zfcp_port *port)
struct fc_rport *rport = port->rport;
if (rport) {
- zfcp_dbf_rec_trig("scpdely", port->adapter, port, NULL,
- ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL,
- ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL);
+ zfcp_dbf_rec_trig_lock("scpdely", port->adapter, port, NULL,
+ ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL,
+ ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL);
fc_remote_port_delete(rport);
port->rport = NULL;
}
--
2.13.5
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [PATCH] zfcp: fix infinite iteration on ERP ready list
2018-05-03 11:52 [PATCH] zfcp: fix infinite iteration on ERP ready list Steffen Maier
@ 2018-05-08 4:02 ` Martin K. Petersen
0 siblings, 0 replies; 2+ messages in thread
From: Martin K. Petersen @ 2018-05-08 4:02 UTC (permalink / raw)
To: Steffen Maier
Cc: James E . J . Bottomley, Martin K . Petersen, linux-scsi,
linux-s390, Martin Schwidefsky, Heiko Carstens, Jens Remus,
stable
Steffen,
> zfcp_erp_adapter_reopen() schedules blocking of all of the adapter's
> rports via zfcp_scsi_schedule_rports_block() and enqueues a reopen
> adapter ERP action via zfcp_erp_action_enqueue(). Both are separately
> processed asynchronously and concurrently.
Applied to 4.17/scsi-fixes. Thank you!
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2018-05-08 4:02 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-05-03 11:52 [PATCH] zfcp: fix infinite iteration on ERP ready list Steffen Maier
2018-05-08 4:02 ` Martin K. Petersen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox