From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Justin Tee <justin.tee@broadcom.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Sasha Levin <sashal@kernel.org>,
paul.ely@broadcom.com, linux-scsi@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.12] scsi: lpfc: Decrement ndlp kref after FDISC retries exhausted
Date: Sat, 25 Oct 2025 12:00:47 -0400
Message-ID: <20251025160905.3857885-416-sashal@kernel.org>
In-Reply-To: <20251025160905.3857885-1-sashal@kernel.org>
From: Justin Tee <justin.tee@broadcom.com>
[ Upstream commit b5bf6d681fce69cd1a57bfc0f1bdbbb348035117 ]
The kref for Fabric_DID ndlps is not decremented after repeated FDISC
failures and exhausting maximum allowed retries. This can leave the
ndlp lingering unnecessarily. Add a test and set bit operation for the
NLP_DROPPED flag. If not previously set, then a kref is decremented. The
ndlp is freed when the remaining reference for the completing ELS is
put.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-6-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- In the failure branch of `lpfc_cmpl_els_fdisc()`, the driver used to
log the exhausted retries and jump straight to `fdisc_failed`, leaving
the fabric `ndlp`’s initial kref outstanding; only the completion-held
reference is released later, at `out:`
(`drivers/scsi/lpfc/lpfc_els.c:11252-11271`).
- The new `test_and_set_bit(NLP_DROPPED, …)` + `lpfc_nlp_put(ndlp)`
sequence (`drivers/scsi/lpfc/lpfc_els.c:11267-11269`) mirrors the
established pattern for retiring nodes safely once that initial
reference is no longer needed
(`drivers/scsi/lpfc/lpfc_hbadisc.c:4949-4954`, with the meaning of
`NLP_DROPPED` defined in `drivers/scsi/lpfc/lpfc_disc.h:197`).
- Without this drop, every fabric FDISC failure that exhausts retries
leaks the `ndlp`, keeping discovery objects and their resources
pinned; that is a real bug that can accumulate across repeated fabric
login failures.
- The fix is small, localized to the terminal failure path, and guarded
by the bit test so it cannot double-drop an already-released node,
which keeps regression risk low.
- The affected logic exists unchanged in stable kernels, so backporting
would directly eliminate the leak there without pulling in broader
dependencies.
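
The guarded drop described in the points above can be sketched in
userspace C. This is only an illustration under stated assumptions, not
the driver's code: `struct demo_node`, `DEMO_DROPPED` and every `demo_*`
helper are hypothetical stand-ins for `struct lpfc_nodelist`,
`NLP_DROPPED`, `test_and_set_bit()` and `lpfc_nlp_put()`, built on C11
atomics rather than the kernel's bitops and kref.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for NLP_DROPPED. */
enum { DEMO_DROPPED = 1u << 0 };

/* Hypothetical stand-in for the node; not the real struct lpfc_nodelist. */
struct demo_node {
	atomic_uint flags;	/* state bits, e.g. DEMO_DROPPED */
	atomic_int refcount;	/* plays the role of the kref */
	bool freed;		/* set when the last reference is put */
};

static void demo_node_init(struct demo_node *n, int refs)
{
	atomic_init(&n->flags, 0);
	atomic_init(&n->refcount, refs);
	n->freed = false;
}

/* Atomically set the bit and report whether it was already set. */
static bool demo_test_and_set(struct demo_node *n, unsigned int bit)
{
	return (atomic_fetch_or(&n->flags, bit) & bit) != 0;
}

/* Drop one reference; the node counts as freed when the count hits 0. */
static void demo_node_put(struct demo_node *n)
{
	int prev = atomic_fetch_sub(&n->refcount, 1);

	assert(prev > 0);	/* a put without a matching reference is a bug */
	if (prev == 1)
		n->freed = true;
}

/* Drop the initial reference at most once, however often this is called. */
static void demo_drop_initial_ref(struct demo_node *n)
{
	if (!demo_test_and_set(n, DEMO_DROPPED))
		demo_node_put(n);
}
```

With a node initialised to two references (the initial one plus the one
held by the completing ELS), calling `demo_drop_initial_ref()` twice
lowers the count only once, and the final `demo_node_put()` marks the
node freed. That is the property the bit test is meant to provide: a
path that has already dropped the initial reference cannot drop it
again.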
drivers/scsi/lpfc/lpfc_els.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index fca81e0c7c2e1..4c405bade4f34 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -11259,6 +11259,11 @@ lpfc_cmpl_els_fdisc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		lpfc_vlog_msg(vport, KERN_WARNING, LOG_ELS,
 			      "0126 FDISC cmpl status: x%x/x%x)\n",
 			      ulp_status, ulp_word4);
+
+		/* drop initial reference */
+		if (!test_and_set_bit(NLP_DROPPED, &ndlp->nlp_flag))
+			lpfc_nlp_put(ndlp);
+
 		goto fdisc_failed;
 	}
--
2.51.0