linux-nvme.lists.infradead.org archive mirror
* [PATCH AUTOSEL 6.17-5.15] nvmet-fc: avoid scheduling association deletion twice
       [not found] <20251009155752.773732-1-sashal@kernel.org>
@ 2025-10-09 15:54 ` Sasha Levin
  2025-10-10  7:39   ` Daniel Wagner
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] nvme: Use non zero KATO for persistent discovery connections Sasha Levin
  2025-10-09 15:56 ` [PATCH AUTOSEL 6.17-5.10] nvme-fc: use lock accessing port_state and rport state Sasha Levin
  2 siblings, 1 reply; 5+ messages in thread
From: Sasha Levin @ 2025-10-09 15:54 UTC (permalink / raw)
  To: patches, stable
  Cc: Daniel Wagner, Shinichiro Kawasaki, Hannes Reinecke, Keith Busch,
	Sasha Levin, justin.tee, nareshgottumukkala83, paul.ely, sagi,
	kch, linux-nvme

From: Daniel Wagner <wagi@kernel.org>

[ Upstream commit f2537be4f8421f6495edfa0bc284d722f253841d ]

When forcefully shutting down a port via the configfs interface,
nvmet_port_subsys_drop_link() first calls nvmet_port_del_ctrls() and
then nvmet_disable_port(). Both functions will eventually schedule all
remaining associations for deletion.

The current implementation checks whether an association is about to be
removed, but only after the work item has already been scheduled. As a
result, it is possible for the first scheduled work item to free all
resources, and then for the same work item to be scheduled again for
deletion.

Because the association list is an RCU list, it is not possible to take
a lock and remove the list entry directly, so it cannot be looked up
again. Instead, a flag (terminating) must be used to determine whether
the association is already in the process of being deleted.

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/all/rsdinhafrtlguauhesmrrzkybpnvwantwmyfq2ih5aregghax5@mhr7v3eryci3/
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES – this prevents a real use-after-free when an FC target port is torn
down through configfs.

- `nvmet_port_subsys_drop_link()` first calls `nvmet_port_del_ctrls()`
  and then `nvmet_disable_port()`
  (`drivers/nvme/target/configfs.c:1088`,
  `drivers/nvme/target/core.c:301`), and both paths funnel into
  `__nvmet_fc_free_assocs()` which queues `assoc->del_work`
  (`drivers/nvme/target/fc.c:1482`). So a forced shutdown schedules the
  same association cleanup twice.
- The guard that’s supposed to stop duplicates only runs inside
  `nvmet_fc_delete_target_assoc()` after the work executes
  (`drivers/nvme/target/fc.c:1201`), so the second caller can still
  requeue the work once the first invocation has freed the
  association, hitting the race reported on the mailing list.
- The patch simply flips the `terminating` flag before queueing
  (`drivers/nvme/target/fc.c:1076` in the new code) and removes the
  redundant check from the worker. That keeps the work from ever being
  queued a second time, exactly matching the original intent with no
  behavioural side effects.
- Change is tiny, isolated to the nvmet-fc transport, and has no
  dependencies beyond the existing `assoc->terminating` infrastructure
  (already present in supported stable series), so the risk of
  regression is minimal while the bug being fixed can crash systems
  under administrative port removal.

If you’re carrying stable trees that include the FC target (v6.10 and
earlier, back to when `assoc->terminating` was introduced), you should
pick this up; older branches without the later queue_work refcount
patch only need the same flag move applied to their local
`nvmet_fc_schedule_delete_assoc()`.

 drivers/nvme/target/fc.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index a9b18c051f5bd..249adb2811420 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -1075,6 +1075,14 @@ nvmet_fc_delete_assoc_work(struct work_struct *work)
 static void
 nvmet_fc_schedule_delete_assoc(struct nvmet_fc_tgt_assoc *assoc)
 {
+	int terminating;
+
+	terminating = atomic_xchg(&assoc->terminating, 1);
+
+	/* if already terminating, do nothing */
+	if (terminating)
+		return;
+
 	nvmet_fc_tgtport_get(assoc->tgtport);
 	if (!queue_work(nvmet_wq, &assoc->del_work))
 		nvmet_fc_tgtport_put(assoc->tgtport);
@@ -1202,13 +1210,7 @@ nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc)
 {
 	struct nvmet_fc_tgtport *tgtport = assoc->tgtport;
 	unsigned long flags;
-	int i, terminating;
-
-	terminating = atomic_xchg(&assoc->terminating, 1);
-
-	/* if already terminating, do nothing */
-	if (terminating)
-		return;
+	int i;
 
 	spin_lock_irqsave(&tgtport->lock, flags);
 	list_del_rcu(&assoc->a_list);
-- 
2.51.0




* [PATCH AUTOSEL 6.17-6.1] nvme: Use non zero KATO for persistent discovery connections
       [not found] <20251009155752.773732-1-sashal@kernel.org>
  2025-10-09 15:54 ` [PATCH AUTOSEL 6.17-5.15] nvmet-fc: avoid scheduling association deletion twice Sasha Levin
@ 2025-10-09 15:55 ` Sasha Levin
  2025-10-09 15:56 ` [PATCH AUTOSEL 6.17-5.10] nvme-fc: use lock accessing port_state and rport state Sasha Levin
  2 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2025-10-09 15:55 UTC (permalink / raw)
  To: patches, stable
  Cc: Alistair Francis, Hannes Reinecke, Christoph Hellwig, Keith Busch,
	Sasha Levin, sagi, linux-nvme

From: Alistair Francis <alistair.francis@wdc.com>

[ Upstream commit 2e482655019ab6fcfe8865b62432c6d03f0b5f80 ]

The NVMe Base Specification 2.1 states that:

"""
A host requests an explicit persistent connection ... by specifying a
non-zero Keep Alive Timer value in the Connect command.
"""

As such, if we are starting a persistent connection to a discovery
controller and the KATO is currently 0, we need to update KATO to a
non-zero value to avoid continuous timeouts on the target.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

- What it fixes
  - Aligns Linux host behavior with NVMe Base Spec 2.1 requirement that
    a host must specify a non-zero Keep Alive Timer in the Connect
    command to request an explicit persistent discovery connection. The
    previous behavior left `KATO=0` for discovery controllers even when
    the connection became persistent, causing targets to time out and
    drop connections.

- Code change and behavior
  - In `nvme_start_ctrl()` (`drivers/nvme/host/core.c:4998`), on
    reconnect for discovery controllers
    (`test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags)` and
    `nvme_discovery_ctrl(ctrl)`), the patch:
    - Checks if `ctrl->kato` is zero.
    - If zero, calls `nvme_stop_keep_alive(ctrl)`, sets `ctrl->kato =
      NVME_DEFAULT_KATO`, then `nvme_start_keep_alive(ctrl)`.
    - Still sends the rediscover uevent: `nvme_change_uevent(ctrl,
      "NVME_EVENT=rediscover")`.
  - This immediately starts keep-alive commands after a persistent
    discovery reconnect and ensures subsequent Connect commands
    advertise non-zero KATO.

- Why this is correct and effective
  - Immediate effect: Even if the just-completed Connect used `kato=0`,
    forcing a non-zero `kato` here starts the host keep-alive work right
    away, avoiding target keep-alive timeouts after a persistent
    reconnect.
  - Future connections: `nvmf_connect_cmd_prep()` sets Connect’s KATO
    from `ctrl->kato` (`drivers/nvme/host/fabrics.c:426`). With this
    change, the next reconnection will send a non-zero KATO in the
    Connect command as the spec requires.
  - Safe sequence: `nvme_stop_keep_alive()` is a no-op when `kato==0`
    (`drivers/nvme/host/core.c:1412`), then `ctrl->kato` is set to
    `NVME_DEFAULT_KATO` (`drivers/nvme/host/nvme.h:31`), and
    `nvme_start_keep_alive()` only schedules work when `kato!=0`
    (`drivers/nvme/host/core.c:1404`).

- Scope and risk
  - Scope-limited: Only affects discovery controllers on reconnect
    (persistent discovery) and only when `kato==0`. No effect on:
    - Non-discovery (I/O) controllers (they already default to non-zero
      KATO).
    - Discovery controllers where userspace explicitly set a non-zero
      KATO.
  - No architectural changes; uses existing helpers and flags; no ABI
    change.
  - Regression risk is low. Prior history already introduced persistent
    discovery semantics and a sysfs `kato` attribute, and transports
    already honor `ctrl->kato` for Connect. This change simply fills a
    corner case where `kato` remained zero in a persistent discovery
    reconnect.

- Historical context and consistency
  - 2018: We explicitly avoided KA to discovery controllers per early
    spec constraints.
  - 2021: The code was adjusted so discovery controllers default to
    `kato=0`, while I/O controllers default to `NVME_DEFAULT_KATO`
    (commit 32feb6de). Persistent discovery connections were intended to
    have a positive KATO (via options), but implicit persistent
    reconnects could still have `kato=0`.
  - 2022: Added rediscover uevent for persistent discovery reconnects
    (f46ef9e87) and `NVME_CTRL_STARTED_ONCE` usage.
  - This patch completes the intent by ensuring persistent discovery
    reconnects run with non-zero KATO automatically, preventing target
    timeouts and complying with spec 2.1.

- Stable backport suitability
  - Fixes a user-visible bug (target timeouts and unstable discovery
    connectivity on persistent reconnects).
  - Small, self-contained change confined to `nvme_start_ctrl()` in
    `drivers/nvme/host/core.c`.
  - No new features or interfaces; minimal risk of regression; behavior
    matches spec and existing design.
  - Dependencies exist in stable trees that already have persistent
    discovery support and the `NVME_CTRL_STARTED_ONCE` mechanism. For
    older branches that still use `test_and_set_bit` in the rediscover
    path, the logic remains valid within that conditional block.

- Side notes for backporters
  - Ensure the tree has `NVME_CTRL_STARTED_ONCE`,
    `nvme_discovery_ctrl()`, and the rediscover uevent path in
    `nvme_start_ctrl()`. If an older stable branch uses
    `test_and_set_bit` instead of `test_bit`, place the new KATO block
    inside that existing conditional.
  - `nvmf_connect_cmd_prep()` must already populate Connect’s `kato`
    from `ctrl->kato` (`drivers/nvme/host/fabrics.c:426`) so that future
    reconnects benefit from the updated `kato`.

 drivers/nvme/host/core.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6b7493934535a..5714d49932822 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4990,8 +4990,14 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 	 * checking that they started once before, hence are reconnecting back.
 	 */
 	if (test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags) &&
-	    nvme_discovery_ctrl(ctrl))
+	    nvme_discovery_ctrl(ctrl)) {
+		if (!ctrl->kato) {
+			nvme_stop_keep_alive(ctrl);
+			ctrl->kato = NVME_DEFAULT_KATO;
+			nvme_start_keep_alive(ctrl);
+		}
 		nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
+	}
 
 	if (ctrl->queue_count > 1) {
 		nvme_queue_scan(ctrl);
-- 
2.51.0




* [PATCH AUTOSEL 6.17-5.10] nvme-fc: use lock accessing port_state and rport state
       [not found] <20251009155752.773732-1-sashal@kernel.org>
  2025-10-09 15:54 ` [PATCH AUTOSEL 6.17-5.15] nvmet-fc: avoid scheduling association deletion twice Sasha Levin
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] nvme: Use non zero KATO for persistent discovery connections Sasha Levin
@ 2025-10-09 15:56 ` Sasha Levin
  2 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2025-10-09 15:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Daniel Wagner, Shinichiro Kawasaki, Hannes Reinecke, Keith Busch,
	Sasha Levin, justin.tee, nareshgottumukkala83, paul.ely, sagi,
	linux-nvme

From: Daniel Wagner <wagi@kernel.org>

[ Upstream commit 891cdbb162ccdb079cd5228ae43bdeebce8597ad ]

nvme_fc_unregister_remoteport() removes the remote port on an lport
object at any point in time when there is no active association. This
races with the reconnect logic, because nvme_fc_create_association()
does not take a lock to check the port_state and atomically increase
the active count on the rport.

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/all/u4ttvhnn7lark5w3sgrbuy2rxupcvosp4qmvj46nwzgeo5ausc@uyrkdls2muwx
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES
- Adds the missing synchronization in `nvme_fc_create_association()` so
  we hold `rport->lock` while checking `remoteport.port_state` and
  bumping `act_ctrl_cnt` via `nvme_fc_ctlr_active_on_rport()`
  (`drivers/nvme/host/fc.c:3034-3044`). That makes the state check and
  reference grab atomic with respect to other rport users.
- Without this lock, `nvme_fc_unregister_remoteport()` can flip the same
  `remoteport.port_state` and tear down the rport under lock
  (`drivers/nvme/host/fc.c:813-839`) while a reconnect path is still
  between the state check and the `act_ctrl_cnt` increment
  (`drivers/nvme/host/fc.c:2987-2999`). This window lets the reconnect
  code touch freed memory or miss the counter bump, causing crashes or
  failed reconnections—the bug reported on the mailing list.
- The fix is tightly scoped to this race, no API or behavioral changes
  elsewhere, and it follows existing locking rules for the rport. The
  affected logic is unchanged across supported stable branches, so the
  patch applies cleanly and removes a long-standing use-after-free
  hazard.

 drivers/nvme/host/fc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 3e12d4683ac72..03987f497a5b5 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3032,11 +3032,17 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 
 	++ctrl->ctrl.nr_reconnects;
 
-	if (ctrl->rport->remoteport.port_state != FC_OBJSTATE_ONLINE)
+	spin_lock_irqsave(&ctrl->rport->lock, flags);
+	if (ctrl->rport->remoteport.port_state != FC_OBJSTATE_ONLINE) {
+		spin_unlock_irqrestore(&ctrl->rport->lock, flags);
 		return -ENODEV;
+	}
 
-	if (nvme_fc_ctlr_active_on_rport(ctrl))
+	if (nvme_fc_ctlr_active_on_rport(ctrl)) {
+		spin_unlock_irqrestore(&ctrl->rport->lock, flags);
 		return -ENOTUNIQ;
+	}
+	spin_unlock_irqrestore(&ctrl->rport->lock, flags);
 
 	dev_info(ctrl->ctrl.device,
 		"NVME-FC{%d}: create association : host wwpn 0x%016llx "
-- 
2.51.0




* Re: [PATCH AUTOSEL 6.17-5.15] nvmet-fc: avoid scheduling association deletion twice
  2025-10-09 15:54 ` [PATCH AUTOSEL 6.17-5.15] nvmet-fc: avoid scheduling association deletion twice Sasha Levin
@ 2025-10-10  7:39   ` Daniel Wagner
  2025-11-04  0:26     ` Sasha Levin
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Wagner @ 2025-10-10  7:39 UTC (permalink / raw)
  To: Sasha Levin
  Cc: patches, stable, Daniel Wagner, Shinichiro Kawasaki,
	Hannes Reinecke, Keith Busch, justin.tee, nareshgottumukkala83,
	paul.ely, sagi, kch, linux-nvme

Hi Sasha,

On Thu, Oct 09, 2025 at 11:54:41AM -0400, Sasha Levin wrote:
> From: Daniel Wagner <wagi@kernel.org>
> 
> [ Upstream commit f2537be4f8421f6495edfa0bc284d722f253841d ]
> 
> When forcefully shutting down a port via the configfs interface,
> nvmet_port_subsys_drop_link() first calls nvmet_port_del_ctrls() and
> then nvmet_disable_port(). Both functions will eventually schedule all
> remaining associations for deletion.
> 
> The current implementation checks whether an association is about to be
> removed, but only after the work item has already been scheduled. As a
> result, it is possible for the first scheduled work item to free all
> resources, and then for the same work item to be scheduled again for
> deletion.
> 
> Because the association list is an RCU list, it is not possible to take
> a lock and remove the list entry directly, so it cannot be looked up
> again. Instead, a flag (terminating) must be used to determine whether
> the association is already in the process of being deleted.
> 
> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Closes: https://lore.kernel.org/all/rsdinhafrtlguauhesmrrzkybpnvwantwmyfq2ih5aregghax5@mhr7v3eryci3/
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Sasha Levin <sashal@kernel.org>

This patch is part of a whole series:

https://lore.kernel.org/all/20250902-fix-nvmet-fc-v3-0-1ae1ecb798d8@kernel.org/

IMO, they should all be backported:

db5a5406fb7e nvmet-fc: move lsop put work to nvmet_fc_ls_req_op
f2537be4f842 nvmet-fc: avoid scheduling association deletion twice
10c165af35d2 nvmet-fcloop: call done callback even when remote port is gone
891cdbb162cc nvme-fc: use lock accessing port_state and rport state

Thanks,
Daniel



* Re: [PATCH AUTOSEL 6.17-5.15] nvmet-fc: avoid scheduling association deletion twice
  2025-10-10  7:39   ` Daniel Wagner
@ 2025-11-04  0:26     ` Sasha Levin
  0 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2025-11-04  0:26 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: patches, stable, Daniel Wagner, Shinichiro Kawasaki,
	Hannes Reinecke, Keith Busch, justin.tee, nareshgottumukkala83,
	paul.ely, sagi, kch, linux-nvme

On Fri, Oct 10, 2025 at 09:39:05AM +0200, Daniel Wagner wrote:
>Hi Sasha,
>
>On Thu, Oct 09, 2025 at 11:54:41AM -0400, Sasha Levin wrote:
>> From: Daniel Wagner <wagi@kernel.org>
>>
>> [ Upstream commit f2537be4f8421f6495edfa0bc284d722f253841d ]
>>
>> When forcefully shutting down a port via the configfs interface,
>> nvmet_port_subsys_drop_link() first calls nvmet_port_del_ctrls() and
>> then nvmet_disable_port(). Both functions will eventually schedule all
>> remaining associations for deletion.
>>
>> The current implementation checks whether an association is about to be
>> removed, but only after the work item has already been scheduled. As a
>> result, it is possible for the first scheduled work item to free all
>> resources, and then for the same work item to be scheduled again for
>> deletion.
>>
>> Because the association list is an RCU list, it is not possible to take
>> a lock and remove the list entry directly, so it cannot be looked up
>> again. Instead, a flag (terminating) must be used to determine whether
>> the association is already in the process of being deleted.
>>
>> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
>> Closes: https://lore.kernel.org/all/rsdinhafrtlguauhesmrrzkybpnvwantwmyfq2ih5aregghax5@mhr7v3eryci3/
>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>> Signed-off-by: Daniel Wagner <wagi@kernel.org>
>> Signed-off-by: Keith Busch <kbusch@kernel.org>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
>This patch is part of a whole series:
>
>https://lore.kernel.org/all/20250902-fix-nvmet-fc-v3-0-1ae1ecb798d8@kernel.org/
>
>IMO, they should all be backported:
>
>db5a5406fb7e nvmet-fc: move lsop put work to nvmet_fc_ls_req_op
>f2537be4f842 nvmet-fc: avoid scheduling association deletion twice
>10c165af35d2 nvmet-fcloop: call done callback even when remote port is gone
>891cdbb162cc nvme-fc: use lock accessing port_state and rport state

Ack, thanks!

-- 
Thanks,
Sasha

