From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B2BAF1E3CA5; Fri, 4 Oct 2024 18:20:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728066039; cv=none; b=Pt3PwplTeqnhSqnLQTJ20qT4EqtRFQn2Fr2HaeKHTGTBWtXgp6cqSiM7ev6a9VJEi8gDrhcf2kdVi2Bgq9EWhGcnQoq0UwAl5H2DgOZPZaNGa898y52hVaV/fhNbIuHaxu63+Y32R1Jzvp1UDDOEn1BBMD/2I7tRAmXv2gOEzfk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728066039; c=relaxed/simple; bh=rUkD8+5U+0RJpNz4hr/3huklv5s/U3NDo+P4UQ4lxv0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dTixeNJUfxhucQL/m04PRdPHTRbQp8EffCTgjJ8n1KpDJt5M6sO6oNYDxNXUBY2JIhn3irE+JcICct6kB5SPCEHj95/mOPQkFFR3hISnmX8zXfXQ2bwWZs5jZeUGz1phqvkwdohQE/wSxXZxL0XhafkHZEPN/OYyabjUjUYbs6A= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=OE4EWngd; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OE4EWngd" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 95EC9C4CEC6; Fri, 4 Oct 2024 18:20:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1728066039; bh=rUkD8+5U+0RJpNz4hr/3huklv5s/U3NDo+P4UQ4lxv0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OE4EWngdpWV8IPiU/jIcLcWQKaq7LKynJ/olTt2sXX4NCZKNUDBm+l58tzoVuynhN TBTERfxM3ZuAHr81CT8Ipxm89Xtkt9ZwBdl/WqDZOo7Xw353YAXtNhDj9rg8XFR0GW QGhsOPGUGc1iKDDn2Z4rukgvmYMNK/5qfnkHTIOn7zv66WHcjvdeKCQOGAl2ZTGX8t sYYw6HnzAR24pmBG44AxN/nB8Vv4JgRQJja635VAtBr2knhSD7B3RXkXe8AaSMcauz dAgT318k7f4rtv4VnCirPE/ws2wYkNPzw0d1c2PpnsPR4W0AsETqG8QaOCM2rFw6hD QuC/q3yMADkHA== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Justin Tee , "Martin K . Petersen" , Sasha Levin , james.smart@broadcom.com, dick.kennedy@broadcom.com, James.Bottomley@HansenPartnership.com, linux-scsi@vger.kernel.org Subject: [PATCH AUTOSEL 6.11 69/76] scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance Date: Fri, 4 Oct 2024 14:17:26 -0400 Message-ID: <20241004181828.3669209-69-sashal@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241004181828.3669209-1-sashal@kernel.org> References: <20241004181828.3669209-1-sashal@kernel.org> Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 6.11.2 Content-Transfer-Encoding: 8bit From: Justin Tee [ Upstream commit 0a3c84f71680684c1d41abb92db05f95c09111e8 ] Deleting an NPIV instance requires all fabric ndlps to be released before an NPIV's resources can be torn down. Failure to release fabric ndlps beforehand opens kref imbalance race conditions. Fix by forcing the DA_ID to complete synchronously with usage of wait_queue. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240912232447.45607-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_ct.c | 12 ++++++++++ drivers/scsi/lpfc/lpfc_disc.h | 7 ++++++ drivers/scsi/lpfc/lpfc_vport.c | 43 ++++++++++++++++++++++++++++------ 3 files changed, 55 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c index 2dedd1493e5ba..1e5db489a00c5 100644 --- a/drivers/scsi/lpfc/lpfc_ct.c +++ b/drivers/scsi/lpfc/lpfc_ct.c @@ -1647,6 +1647,18 @@ lpfc_cmpl_ct(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, } out: + /* If the caller wanted a synchronous DA_ID completion, signal the + * wait obj and clear flag to reset the vport. + */ + if (ndlp->save_flags & NLP_WAIT_FOR_DA_ID) { + if (ndlp->da_id_waitq) + wake_up(ndlp->da_id_waitq); + } + + spin_lock_irq(&ndlp->lock); + ndlp->save_flags &= ~NLP_WAIT_FOR_DA_ID; + spin_unlock_irq(&ndlp->lock); + lpfc_ct_free_iocb(phba, cmdiocb); lpfc_nlp_put(ndlp); return; diff --git a/drivers/scsi/lpfc/lpfc_disc.h b/drivers/scsi/lpfc/lpfc_disc.h index f82615d87c4bb..f5ae8cc158205 100644 --- a/drivers/scsi/lpfc/lpfc_disc.h +++ b/drivers/scsi/lpfc/lpfc_disc.h @@ -90,6 +90,8 @@ enum lpfc_nlp_save_flags { NLP_IN_RECOV_POST_DEV_LOSS = 0x1, /* wait for outstanding LOGO to cmpl */ NLP_WAIT_FOR_LOGO = 0x2, + /* wait for outstanding DA_ID to finish */ + NLP_WAIT_FOR_DA_ID = 0x4 }; struct lpfc_nodelist { @@ -159,7 +161,12 @@ struct lpfc_nodelist { uint32_t nvme_fb_size; /* NVME target's supported byte cnt */ #define NVME_FB_BIT_SHIFT 9 /* PRLI Rsp first burst in 512B units. */ uint32_t nlp_defer_did; + + /* These wait objects are NPIV specific. These IOs must complete + * synchronously. + */ wait_queue_head_t *logo_waitq; + wait_queue_head_t *da_id_waitq; }; struct lpfc_node_rrq { diff --git a/drivers/scsi/lpfc/lpfc_vport.c b/drivers/scsi/lpfc/lpfc_vport.c index 4439167a51882..7a4d4d8e2ad55 100644 --- a/drivers/scsi/lpfc/lpfc_vport.c +++ b/drivers/scsi/lpfc/lpfc_vport.c @@ -626,6 +626,7 @@ lpfc_vport_delete(struct fc_vport *fc_vport) struct Scsi_Host *shost = lpfc_shost_from_vport(vport); struct lpfc_hba *phba = vport->phba; int rc; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(waitq); if (vport->port_type == LPFC_PHYSICAL_PORT) { lpfc_printf_vlog(vport, KERN_ERR, LOG_TRACE_EVENT, @@ -679,21 +680,49 @@ lpfc_vport_delete(struct fc_vport *fc_vport) if (!ndlp) goto skip_logo; + /* Send the DA_ID and Fabric LOGO to cleanup the NPIV fabric entries. */ if (ndlp && ndlp->nlp_state == NLP_STE_UNMAPPED_NODE && phba->link_state >= LPFC_LINK_UP && phba->fc_topology != LPFC_TOPOLOGY_LOOP) { if (vport->cfg_enable_da_id) { - /* Send DA_ID and wait for a completion. */ + /* Send DA_ID and wait for a completion. This is best + * effort. If the DA_ID fails, likely the fabric will + * "leak" NportIDs but at least the driver issued the + * command. + */ + ndlp = lpfc_findnode_did(vport, NameServer_DID); + if (!ndlp) + goto issue_logo; + + spin_lock_irq(&ndlp->lock); + ndlp->da_id_waitq = &waitq; + ndlp->save_flags |= NLP_WAIT_FOR_DA_ID; + spin_unlock_irq(&ndlp->lock); + rc = lpfc_ns_cmd(vport, SLI_CTNS_DA_ID, 0, 0); - if (rc) { - lpfc_printf_log(vport->phba, KERN_WARNING, - LOG_VPORT, - "1829 CT command failed to " - "delete objects on fabric, " - "rc %d\n", rc); + if (!rc) { + wait_event_timeout(waitq, + !(ndlp->save_flags & NLP_WAIT_FOR_DA_ID), + msecs_to_jiffies(phba->fc_ratov * 2000)); } + + lpfc_printf_vlog(vport, KERN_INFO, LOG_VPORT | LOG_ELS, + "1829 DA_ID issue status %d. " + "SFlag x%x NState x%x, NFlag x%x " + "Rpi x%x\n", + rc, ndlp->save_flags, ndlp->nlp_state, + ndlp->nlp_flag, ndlp->nlp_rpi); + + /* Remove the waitq and save_flags. It no + * longer matters if the wake happened. + */ + spin_lock_irq(&ndlp->lock); + ndlp->da_id_waitq = NULL; + ndlp->save_flags &= ~NLP_WAIT_FOR_DA_ID; + spin_unlock_irq(&ndlp->lock); } +issue_logo: /* * If the vpi is not registered, then a valid FDISC doesn't * exist and there is no need for a ELS LOGO. Just cleanup -- 2.43.0