From: James Smart <jsmart2021@gmail.com>
To: linux-scsi@vger.kernel.org
Cc: James Smart <jsmart2021@gmail.com>,
Dick Kennedy <dick.kennedy@broadcom.com>,
James Smart <james.smart@broadcom.com>
Subject: [PATCH 13/17] lpfc: Correct driver deregistrations with host nvme transport
Date: Fri, 3 Nov 2017 15:56:24 -0700
Message-ID: <20171103225628.24716-14-jsmart2021@gmail.com>
In-Reply-To: <20171103225628.24716-1-jsmart2021@gmail.com>
The driver's interaction with the host nvme transport has been
incorrect for a while. The driver did not wait for the unregister
callbacks (it waited only 5 jiffies). As a result, the driver could
remove objects that subsequent abort commands from the transport
may still reference, and the actual unregister callback was
effectively a no-op. This was especially problematic when the
driver was unloaded.

The driver now waits for the unregister callbacks, as it should,
before continuing with teardown.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
---
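For readers without a kernel tree at hand, the wait-and-renew pattern this patch introduces can be approximated in userspace. The sketch below is only an analogue, not the driver code: `completion_sketch`, `completion_sketch_wait_timeout()`, `unreg_wait_sketch()` and `delayed_complete()` are hypothetical names, and a pthread condition variable stands in for the kernel's `struct completion` and `wait_for_completion_timeout()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion_sketch {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void completion_sketch_init(struct completion_sketch *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Signals waiters, in the role complete() plays in the delete callback. */
static void completion_sketch_complete(struct completion_sketch *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Wait up to tmo_ms; return true if completed, false on timeout. */
static bool completion_sketch_wait_timeout(struct completion_sketch *c,
					   long tmo_ms)
{
	struct timespec deadline;
	bool done;
	int rc = 0;

	assert(tmo_ms > 0);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += tmo_ms / 1000;
	deadline.tv_nsec += (tmo_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	while (!c->done && rc == 0)
		rc = pthread_cond_timedwait(&c->cond, &c->lock, &deadline);
	done = c->done;
	pthread_mutex_unlock(&c->lock);
	return done;
}

/* Renew the wait until completion; return how many timeouts were logged. */
static int unreg_wait_sketch(struct completion_sketch *c, long tmo_ms)
{
	int renewals = 0;

	while (!completion_sketch_wait_timeout(c, tmo_ms)) {
		renewals++;
		fprintf(stderr, "wait timed out (%d). Renewing.\n", renewals);
	}
	return renewals;
}

/* Helper for demonstration: completes the object after delay_ms. */
struct delayed_complete_args {
	struct completion_sketch *c;
	long delay_ms;
};

static void *delayed_complete(void *arg)
{
	struct delayed_complete_args *a = arg;
	struct timespec ts = { a->delay_ms / 1000,
			       (a->delay_ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
	completion_sketch_complete(a->c);
	return NULL;
}
```

As in the patch, the loop never gives up: a timeout only produces a log message, and the caller proceeds with teardown solely after the completion has been signalled.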
drivers/scsi/lpfc/lpfc_disc.h | 2 +
drivers/scsi/lpfc/lpfc_nvme.c | 113 ++++++++++++++++++++++++++++++++++++++++--
drivers/scsi/lpfc/lpfc_nvme.h | 2 +
3 files changed, 113 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_disc.h b/drivers/scsi/lpfc/lpfc_disc.h
index f9a566eaef04..5a7547f9d8d8 100644
--- a/drivers/scsi/lpfc/lpfc_disc.h
+++ b/drivers/scsi/lpfc/lpfc_disc.h
@@ -134,6 +134,8 @@ struct lpfc_nodelist {
struct lpfc_scsicmd_bkt *lat_data; /* Latency data */
uint32_t fc4_prli_sent;
uint32_t upcall_flags;
+#define NLP_WAIT_FOR_UNREG 0x1
+
uint32_t nvme_fb_size; /* NVME target's supported byte cnt */
#define NVME_FB_BIT_SHIFT 9 /* PRLI Rsp first burst in 512B units. */
};
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index d3ada630b427..12d09a6a4563 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -154,6 +154,10 @@ lpfc_nvme_localport_delete(struct nvme_fc_local_port *localport)
{
struct lpfc_nvme_lport *lport = localport->private;
+ lpfc_printf_vlog(lport->vport, KERN_INFO, LOG_NVME,
+ "6173 localport %p delete complete\n",
+ lport);
+
/* release any threads waiting for the unreg to complete */
complete(&lport->lport_unreg_done);
}
@@ -946,10 +950,19 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
freqpriv->nvme_buf = NULL;
/* NVME targets need completion held off until the abort exchange
- * completes.
+ * completes unless the NVME Rport is getting unregistered.
*/
- if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY))
+ if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY) ||
+ ndlp->upcall_flags & NLP_WAIT_FOR_UNREG) {
+ /* Clear the XBUSY flag to prevent double completions.
+ * The nvme rport is getting unregistered and there is
+ * no need to defer the IO.
+ */
+ if (lpfc_ncmd->flags & LPFC_SBUF_XBUSY)
+ lpfc_ncmd->flags &= ~LPFC_SBUF_XBUSY;
+
nCmd->done(nCmd);
+ }
spin_lock_irqsave(&phba->hbalock, flags);
lpfc_ncmd->nrport = NULL;
@@ -2234,6 +2247,47 @@ lpfc_nvme_create_localport(struct lpfc_vport *vport)
return ret;
}
+/* lpfc_nvme_lport_unreg_wait - Wait for the host to complete an lport unreg.
+ *
+ * The driver has to wait for the host nvme transport to call back
+ * indicating the localport has successfully unregistered all
+ * resources. Since this is an uninterruptible wait, loop every ten
+ * seconds and print a message indicating no progress.
+ *
+ * An uninterruptible wait is used because of the risk of transport-to-
+ * driver state mismatch.
+ */
+void
+lpfc_nvme_lport_unreg_wait(struct lpfc_vport *vport,
+ struct lpfc_nvme_lport *lport)
+{
+#if (IS_ENABLED(CONFIG_NVME_FC))
+ u32 wait_tmo;
+ int ret;
+
+ /* The host transport has to clean up and confirm, which requires an
+ * indefinite wait. Print a message if a 10 second wait expires and
+ * renew the wait. A timeout here is unexpected.
+ */
+ wait_tmo = msecs_to_jiffies(LPFC_NVME_WAIT_TMO * 1000);
+ while (true) {
+ ret = wait_for_completion_timeout(&lport->lport_unreg_done,
+ wait_tmo);
+ if (unlikely(!ret)) {
+ lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR,
+ "6176 Lport %p Localport %p wait "
+ "timed out. Renewing.\n",
+ lport, vport->localport);
+ continue;
+ }
+ break;
+ }
+ lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR,
+ "6177 Lport %p Localport %p Complete Success\n",
+ lport, vport->localport);
+#endif
+}
+
/**
* lpfc_nvme_destroy_localport - Destroy lpfc_nvme bound to nvme transport.
* @pnvme: pointer to lpfc nvme data structure.
@@ -2268,7 +2322,11 @@ lpfc_nvme_destroy_localport(struct lpfc_vport *vport)
*/
init_completion(&lport->lport_unreg_done);
ret = nvme_fc_unregister_localport(localport);
- wait_for_completion_timeout(&lport->lport_unreg_done, 5);
+
+ /* Wait for completion. This either blocks
+ * indefinitely or succeeds
+ */
+ lpfc_nvme_lport_unreg_wait(vport, lport);
/* Regardless of the unregister upcall response, clear
* nvmei_support. All rports are unregistered and the
@@ -2424,6 +2482,47 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
#endif
}
+/* lpfc_nvme_rport_unreg_wait - Wait for the host to complete an rport unreg.
+ *
+ * The driver has to wait for the host nvme transport to call back
+ * indicating the remoteport has successfully unregistered all
+ * resources. Since this is an uninterruptible wait, loop every ten
+ * seconds and print a message indicating no progress.
+ *
+ * An uninterruptible wait is used because of the risk of transport-to-
+ * driver state mismatch.
+ */
+void
+lpfc_nvme_rport_unreg_wait(struct lpfc_vport *vport,
+ struct lpfc_nvme_rport *rport)
+{
+#if (IS_ENABLED(CONFIG_NVME_FC))
+ u32 wait_tmo;
+ int ret;
+
+ /* The host transport has to clean up and confirm, which requires an
+ * indefinite wait. Print a message if a 10 second wait expires and
+ * renew the wait. A timeout here is unexpected.
+ */
+ wait_tmo = msecs_to_jiffies(LPFC_NVME_WAIT_TMO * 1000);
+ while (true) {
+ ret = wait_for_completion_timeout(&rport->rport_unreg_done,
+ wait_tmo);
+ if (unlikely(!ret)) {
+ lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR,
+ "6174 Rport %p Remoteport %p wait "
+ "timed out. Renewing.\n",
+ rport, rport->remoteport);
+ continue;
+ }
+ break;
+ }
+ lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR,
+ "6175 Rport %p Remoteport %p Complete Success\n",
+ rport, rport->remoteport);
+#endif
+}
+
/* lpfc_nvme_unregister_port - unbind the DID and port_role from this rport.
*
* There is no notion of Devloss or rport recovery from the current
@@ -2480,14 +2579,20 @@ lpfc_nvme_unregister_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
/* No concern about the role change on the nvme remoteport.
* The transport will update it.
*/
+ ndlp->upcall_flags |= NLP_WAIT_FOR_UNREG;
ret = nvme_fc_unregister_remoteport(remoteport);
if (ret != 0) {
lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_DISC,
"6167 NVME unregister failed %d "
"port_state x%x\n",
ret, remoteport->port_state);
+ } else {
+ /* Wait for completion. This either blocks
+ * indefinitely or succeeds
+ */
+ lpfc_nvme_rport_unreg_wait(vport, rport);
+ ndlp->upcall_flags &= ~NLP_WAIT_FOR_UNREG;
}
-
}
return;
diff --git a/drivers/scsi/lpfc/lpfc_nvme.h b/drivers/scsi/lpfc/lpfc_nvme.h
index fbfc1786cd04..903ec37f465f 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.h
+++ b/drivers/scsi/lpfc/lpfc_nvme.h
@@ -27,6 +27,8 @@
#define LPFC_NVME_ERSP_LEN 0x20
+#define LPFC_NVME_WAIT_TMO 10
+
struct lpfc_nvme_qhandle {
uint32_t index; /* WQ index to use */
uint32_t qidx; /* queue index passed to create */
--
2.13.1