From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C79223B4E9F for ; Mon, 30 Mar 2026 09:38:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774863514; cv=none; b=YvjWMbYQW6mKiOppwvczRVPK1ar88LVZ9mWMTsyvHrnD0equHq9Y2AimGjqFLQ8QmQhlNTHwAVaN8ICYT3fBncg/Q6yhh+NSBnDMo2Et8fLxSI/THJJa3y1RgsojTJtHW1sQqlpBox+emj7sP61CAzz+t1VlGS/uAkVxWF1Y48k= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774863514; c=relaxed/simple; bh=qtzSz0PWSDt1A0mxV0QZPaHRHot9knP6XqUwijJVUtM=; h=Subject:To:Cc:From:Date:Message-ID:MIME-Version:Content-Type; b=nckTfHrrNvLlh7PKievpXVeMQZ2wX1+b37Crm+hzoX2H3CCCCk1pYYeJ1JGEt0HNtwkZPxq2vqwjjhATxlS/AnpyYJkbapJJSqb4glyKsVxvUTpq0jbXuQ8QsR4o/3K/yYnM3Gymi3cv9HSanO1VqYULO7/EaUr/k1ljdx3SJ3Q= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=kqREakBx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="kqREakBx" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E011AC4CEF7; Mon, 30 Mar 2026 09:38:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1774863514; bh=qtzSz0PWSDt1A0mxV0QZPaHRHot9knP6XqUwijJVUtM=; h=Subject:To:Cc:From:Date:From; b=kqREakBxWbCyXWQZrjUG0rv9dFaOOgkxoRPQZXnKnHhHjJIkr8JVKI19BzjOgaJqb HK80PiLBPUgE8XRwZzqfrlf5LQHo5/3MGrrFmqsdQZBWCJhyZR2kvZnTzucKC2TtqY /8RkTtXlrs9vLy2woKE9a3RoPPsTKxHUmzJA0Qow= Subject: FAILED: patch "[PATCH] scsi: target: tcm_loop: Drain commands in target_reset" failed to apply to 5.15-stable tree To: josef@toxicpanda.com,martin.petersen@oracle.com Cc: From: Date: Mon, 30 Mar 2026 11:38:14 +0200 Message-ID: <2026033014-wiring-railroad-acea@gregkh> Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to . To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 1333eee56cdf3f0cf67c6ab4114c2c9e0a952026 # git commit -s git send-email --to '' --in-reply-to '2026033014-wiring-railroad-acea@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 1333eee56cdf3f0cf67c6ab4114c2c9e0a952026 Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Mon, 16 Mar 2026 20:23:29 -0400 Subject: [PATCH] scsi: target: tcm_loop: Drain commands in target_reset handler MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit tcm_loop_target_reset() violates the SCSI EH contract: it returns SUCCESS without draining any in-flight commands. The SCSI EH documentation (scsi_eh.rst) requires that when a reset handler returns SUCCESS the driver has made lower layers "forget about timed out scmds" and is ready for new commands. Every other SCSI LLD (virtio_scsi, mpt3sas, ipr, scsi_debug, mpi3mr) enforces this by draining or completing outstanding commands before returning SUCCESS. Because tcm_loop_target_reset() doesn't drain, the SCSI EH reuses in-flight scsi_cmnd structures for recovery commands (e.g. TUR) while the target core still has async completion work queued for the old se_cmd. The memset in queuecommand zeroes se_lun and lun_ref_active, causing transport_lun_remove_cmd() to skip its percpu_ref_put(). The leaked LUN reference prevents transport_clear_lun_ref() from completing, hanging configfs LUN unlink forever in D-state: INFO: task rm:264 blocked for more than 122 seconds. rm D 0 264 258 0x00004000 Call Trace: __schedule+0x3d0/0x8e0 schedule+0x36/0xf0 transport_clear_lun_ref+0x78/0x90 [target_core_mod] core_tpg_remove_lun+0x28/0xb0 [target_core_mod] target_fabric_port_unlink+0x50/0x60 [target_core_mod] configfs_unlink+0x156/0x1f0 [configfs] vfs_unlink+0x109/0x290 do_unlinkat+0x1d5/0x2d0 Fix this by making tcm_loop_target_reset() actually drain commands: 1. Issue TMR_LUN_RESET via tcm_loop_issue_tmr() to drain all commands that the target core knows about (those not yet CMD_T_COMPLETE). 2. Use blk_mq_tagset_busy_iter() to iterate all started requests and flush_work() on each se_cmd — this drains any deferred completion work for commands that already had CMD_T_COMPLETE set before the TMR (which the TMR skips via __target_check_io_state()). This is the same pattern used by mpi3mr, scsi_debug, and libsas to drain outstanding commands during reset. Fixes: e0eb5d38b732 ("scsi: target: tcm_loop: Use block cmd allocator for se_cmds") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Josef Bacik Link: https://patch.msgid.link/27011aa34c8f6b1b94d2e3cf5655b6d037f53428.1773706803.git.josef@toxicpanda.com Signed-off-by: Martin K. Petersen diff --git a/drivers/target/loopback/tcm_loop.c b/drivers/target/loopback/tcm_loop.c index d668bd19fd4a..528883d989b8 100644 --- a/drivers/target/loopback/tcm_loop.c +++ b/drivers/target/loopback/tcm_loop.c @@ -26,6 +26,7 @@ #include #include #include +#include #include #include #include @@ -269,15 +270,27 @@ static int tcm_loop_device_reset(struct scsi_cmnd *sc) return (ret == TMR_FUNCTION_COMPLETE) ? SUCCESS : FAILED; } +static bool tcm_loop_flush_work_iter(struct request *rq, void *data) +{ + struct scsi_cmnd *sc = blk_mq_rq_to_pdu(rq); + struct tcm_loop_cmd *tl_cmd = scsi_cmd_priv(sc); + struct se_cmd *se_cmd = &tl_cmd->tl_se_cmd; + + flush_work(&se_cmd->work); + return true; +} + static int tcm_loop_target_reset(struct scsi_cmnd *sc) { struct tcm_loop_hba *tl_hba; struct tcm_loop_tpg *tl_tpg; + struct Scsi_Host *sh = sc->device->host; + int ret; /* * Locate the tcm_loop_hba_t pointer */ - tl_hba = *(struct tcm_loop_hba **)shost_priv(sc->device->host); + tl_hba = *(struct tcm_loop_hba **)shost_priv(sh); if (!tl_hba) { pr_err("Unable to perform device reset without active I_T Nexus\n"); return FAILED; @@ -286,11 +299,38 @@ static int tcm_loop_target_reset(struct scsi_cmnd *sc) * Locate the tl_tpg pointer from TargetID in sc->device->id */ tl_tpg = &tl_hba->tl_hba_tpgs[sc->device->id]; - if (tl_tpg) { - tl_tpg->tl_transport_status = TCM_TRANSPORT_ONLINE; - return SUCCESS; - } - return FAILED; + if (!tl_tpg) + return FAILED; + + /* + * Issue a LUN_RESET to drain all commands that the target core + * knows about. This handles commands not yet marked CMD_T_COMPLETE. + */ + ret = tcm_loop_issue_tmr(tl_tpg, sc->device->lun, 0, TMR_LUN_RESET); + if (ret != TMR_FUNCTION_COMPLETE) + return FAILED; + + /* + * Flush any deferred target core completion work that may still be + * queued. Commands that already had CMD_T_COMPLETE set before the TMR + * are skipped by the TMR drain, but their async completion work + * (transport_lun_remove_cmd → percpu_ref_put, release_cmd → scsi_done) + * may still be pending in target_completion_wq. + * + * The SCSI EH will reuse in-flight scsi_cmnd structures for recovery + * commands (e.g. TUR) immediately after this handler returns SUCCESS — + * if deferred work is still pending, the memset in queuecommand would + * zero the se_cmd while the work accesses it, leaking the LUN + * percpu_ref and hanging configfs unlink forever. + * + * Use blk_mq_tagset_busy_iter() to find all started requests and + * flush_work() on each — the same pattern used by mpi3mr, scsi_debug, + * and other SCSI drivers to drain outstanding commands during reset. + */ + blk_mq_tagset_busy_iter(&sh->tag_set, tcm_loop_flush_work_iter, NULL); + + tl_tpg->tl_transport_status = TCM_TRANSPORT_ONLINE; + return SUCCESS; } static const struct scsi_host_template tcm_loop_driver_template = {