[PATCH 0/4] scsi_debug improvements

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/4] scsi_debug improvements
@ 2025-02-24 11:55 John Garry
  2025-02-24 11:55 ` [PATCH 1/4] scsi: scsi_debug: Remove sdebug_device_access_info John Garry
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: John Garry @ 2025-02-24 11:55 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen; +Cc: linux-scsi, bvanassche, John Garry

This is a re-post with modifications of Bart's series at
https://lore.kernel.org/linux-scsi/20240307203015.870254-1-bvanassche@acm.org/

I re-ordered the patches solving the atomic context sleeping and
simplifying command handling. This makes it easier to solve the atomic
context sleeping issue. However, it is harder to backport. Personally I
don't think that backporting is so important as the atomic context
sleeping is so difficult to recreate.

The main change in this series is to solve the atomic context sleeping
issue by now reporting errors back from scsi_debug_abort() to the SCSI
midlayer when we cannot guarantee if the command was aborted.

Based on 1d67c48947c6 (origin/staging, origin/for-next,
origin/6.15/scsi-staging) Merge patch series "scsi: scsi_debug: Add
more tape support"

Bart Van Assche (3):
  scsi: scsi_debug: Remove a reference to in_use_bm
  scsi: scsi_debug: Simplify command handling
  scsi: scsi_debug: Do not sleep in atomic sections

John Garry (1):
  scsi: scsi_debug: Remove sdebug_device_access_info

 drivers/scsi/scsi_debug.c | 153 ++++++++------------------------------
 1 file changed, 30 insertions(+), 123 deletions(-)

-- 
2.31.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/4] scsi: scsi_debug: Remove sdebug_device_access_info
  2025-02-24 11:55 [PATCH 0/4] scsi_debug improvements John Garry
@ 2025-02-24 11:55 ` John Garry
  2025-02-24 18:04   ` Bart Van Assche
  2025-02-24 11:55 ` [PATCH 2/4] scsi: scsi_debug: Remove a reference to in_use_bm John Garry
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2025-02-24 11:55 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen; +Cc: linux-scsi, bvanassche, John Garry

This structure is not used, so delete it.

It was originally intended for supporting checking for atomic writes
overlapping with ongoing reads and writes, but that support never got
added.

sbc-4 r22 section 4.29.3.2 "Performing operations during an atomic write
operation" describes two methods of handling overlapping atomic writes.
Currently the only method supported is for the ongoing read or write to
complete.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/scsi/scsi_debug.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 2f60ab9a93bd..e3ebb6710d41 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -460,12 +460,6 @@ struct sdebug_defer {
 	enum sdeb_defer_type defer_t;
 };
 
-struct sdebug_device_access_info {
-	bool atomic_write;
-	u64 lba;
-	u32 num;
-	struct scsi_cmnd *self;
-};
 
 struct sdebug_queued_cmd {
 	/* corresponding bit set in in_use_bm[] in owning struct sdebug_queue
@@ -473,7 +467,6 @@ struct sdebug_queued_cmd {
 	 */
 	struct sdebug_defer sd_dp;
 	struct scsi_cmnd *scmd;
-	struct sdebug_device_access_info *i;
 };
 
 struct sdebug_scsi_cmd {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/4] scsi: scsi_debug: Remove a reference to in_use_bm
  2025-02-24 11:55 [PATCH 0/4] scsi_debug improvements John Garry
  2025-02-24 11:55 ` [PATCH 1/4] scsi: scsi_debug: Remove sdebug_device_access_info John Garry
@ 2025-02-24 11:55 ` John Garry
  2025-02-24 11:55 ` [PATCH 3/4] scsi: scsi_debug: Simplify command handling John Garry
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2025-02-24 11:55 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen; +Cc: linux-scsi, bvanassche, John Garry

From: Bart Van Assche <bvanassche@acm.org>

Commit f1437cd1e535 ("scsi: scsi_debug: Drop sdebug_queue") removed
the 'in_use_bm' struct member. Hence remove a reference to that struct
member from the procfs host info file.

Fixes: f1437cd1e535 ("scsi: scsi_debug: Drop sdebug_queue")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/scsi/scsi_debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index e3ebb6710d41..7631c2c46594 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -7575,7 +7575,7 @@ static int scsi_debug_show_info(struct seq_file *m, struct Scsi_Host *host)
 		blk_mq_tagset_busy_iter(&host->tag_set, sdebug_submit_queue_iter,
 					&data);
 		if (f >= 0) {
-			seq_printf(m, "    in_use_bm BUSY: %s: %d,%d\n",
+			seq_printf(m, "    BUSY: %s: %d,%d\n",
 				   "first,last bits", f, l);
 		}
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/4] scsi: scsi_debug: Simplify command handling
  2025-02-24 11:55 [PATCH 0/4] scsi_debug improvements John Garry
  2025-02-24 11:55 ` [PATCH 1/4] scsi: scsi_debug: Remove sdebug_device_access_info John Garry
  2025-02-24 11:55 ` [PATCH 2/4] scsi: scsi_debug: Remove a reference to in_use_bm John Garry
@ 2025-02-24 11:55 ` John Garry
  2025-02-24 11:55 ` [PATCH 4/4] scsi: scsi_debug: Do not sleep in atomic sections John Garry
  2025-02-25  1:57 ` [PATCH 0/4] scsi_debug improvements Martin K. Petersen
  4 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2025-02-24 11:55 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen; +Cc: linux-scsi, bvanassche, John Garry

From: Bart Van Assche <bvanassche@acm.org>

Simplify command handling by moving struct sdebug_defer into the
private SCSI command data instead of allocating it separately. The only
functional change is that aborting a SCSI command now fails and is
retried at a later time if the completion handler can't be cancelled.

See also commit 1107c7b24ee3 ("scsi: scsi_debug: Dynamically allocate
sdebug_queued_cmd"; v6.4).

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/scsi/scsi_debug.c | 130 ++++++--------------------------------
 1 file changed, 20 insertions(+), 110 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 7631c2c46594..c1a217b5aac2 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -300,11 +300,6 @@ struct tape_block {
 
 #define SDEB_XA_NOT_IN_USE XA_MARK_1
 
-static struct kmem_cache *queued_cmd_cache;
-
-#define TO_QUEUED_CMD(scmd)  ((void *)(scmd)->host_scribble)
-#define ASSIGN_QUEUED_CMD(scmnd, qc) { (scmnd)->host_scribble = (void *) qc; }
-
 /* Zone types (zbcr05 table 25) */
 enum sdebug_z_type {
 	ZBC_ZTYPE_CNV	= 0x1,
@@ -460,17 +455,9 @@ struct sdebug_defer {
 	enum sdeb_defer_type defer_t;
 };
 
-
-struct sdebug_queued_cmd {
-	/* corresponding bit set in in_use_bm[] in owning struct sdebug_queue
-	 * instance indicates this slot is in use.
-	 */
-	struct sdebug_defer sd_dp;
-	struct scsi_cmnd *scmd;
-};
-
 struct sdebug_scsi_cmd {
 	spinlock_t   lock;
+	struct sdebug_defer sd_dp;
 };
 
 static atomic_t sdebug_cmnd_count;   /* number of incoming commands */
@@ -636,8 +623,6 @@ static int sdebug_add_store(void);
 static void sdebug_erase_store(int idx, struct sdeb_store_info *sip);
 static void sdebug_erase_all_stores(bool apart_from_first);
 
-static void sdebug_free_queued_cmd(struct sdebug_queued_cmd *sqcp);
-
 /*
  * The following are overflow arrays for cdbs that "hit" the same index in
  * the opcode_info_arr array. The most time sensitive (or commonly used) cdb
@@ -6333,10 +6318,10 @@ static u32 get_tag(struct scsi_cmnd *cmnd)
 /* Queued (deferred) command completions converge here. */
 static void sdebug_q_cmd_complete(struct sdebug_defer *sd_dp)
 {
-	struct sdebug_queued_cmd *sqcp = container_of(sd_dp, struct sdebug_queued_cmd, sd_dp);
+	struct sdebug_scsi_cmd *sdsc = container_of(sd_dp,
+					typeof(*sdsc), sd_dp);
+	struct scsi_cmnd *scp = (struct scsi_cmnd *)sdsc - 1;
 	unsigned long flags;
-	struct scsi_cmnd *scp = sqcp->scmd;
-	struct sdebug_scsi_cmd *sdsc;
 	bool aborted;
 
 	if (sdebug_statistics) {
@@ -6347,27 +6332,23 @@ static void sdebug_q_cmd_complete(struct sdebug_defer *sd_dp)
 
 	if (!scp) {
 		pr_err("scmd=NULL\n");
-		goto out;
+		return;
 	}
 
-	sdsc = scsi_cmd_priv(scp);
 	spin_lock_irqsave(&sdsc->lock, flags);
 	aborted = sd_dp->aborted;
 	if (unlikely(aborted))
 		sd_dp->aborted = false;
-	ASSIGN_QUEUED_CMD(scp, NULL);
 
 	spin_unlock_irqrestore(&sdsc->lock, flags);
 
 	if (aborted) {
 		pr_info("bypassing scsi_done() due to aborted cmd, kicking-off EH\n");
 		blk_abort_request(scsi_cmd_to_rq(scp));
-		goto out;
+		return;
 	}
 
 	scsi_done(scp); /* callback to mid level */
-out:
-	sdebug_free_queued_cmd(sqcp);
 }
 
 /* When high resolution timer goes off this function is called. */
@@ -6674,10 +6655,15 @@ static void scsi_debug_sdev_destroy(struct scsi_device *sdp)
 	sdp->hostdata = NULL;
 }
 
-/* Returns true if we require the queued memory to be freed by the caller. */
-static bool stop_qc_helper(struct sdebug_defer *sd_dp,
-			   enum sdeb_defer_type defer_t)
+/* Returns true if it is safe to complete @cmnd. */
+static bool scsi_debug_stop_cmnd(struct scsi_cmnd *cmnd)
 {
+	struct sdebug_scsi_cmd *sdsc = scsi_cmd_priv(cmnd);
+	struct sdebug_defer *sd_dp = &sdsc->sd_dp;
+	enum sdeb_defer_type defer_t = READ_ONCE(sd_dp->defer_t);
+
+	lockdep_assert_held(&sdsc->lock);
+
 	if (defer_t == SDEB_DEFER_HRT) {
 		int res = hrtimer_try_to_cancel(&sd_dp->hrt);
 
@@ -6702,28 +6688,6 @@ static bool stop_qc_helper(struct sdebug_defer *sd_dp,
 	return false;
 }
 
-
-static bool scsi_debug_stop_cmnd(struct scsi_cmnd *cmnd)
-{
-	enum sdeb_defer_type l_defer_t;
-	struct sdebug_defer *sd_dp;
-	struct sdebug_scsi_cmd *sdsc = scsi_cmd_priv(cmnd);
-	struct sdebug_queued_cmd *sqcp = TO_QUEUED_CMD(cmnd);
-
-	lockdep_assert_held(&sdsc->lock);
-
-	if (!sqcp)
-		return false;
-	sd_dp = &sqcp->sd_dp;
-	l_defer_t = READ_ONCE(sd_dp->defer_t);
-	ASSIGN_QUEUED_CMD(cmnd, NULL);
-
-	if (stop_qc_helper(sd_dp, l_defer_t))
-		sdebug_free_queued_cmd(sqcp);
-
-	return true;
-}
-
 /*
  * Called from scsi_debug_abort() only, which is for timed-out cmd.
  */
@@ -7106,33 +7070,6 @@ static bool inject_on_this_cmd(void)
 
 #define INCLUSIVE_TIMING_MAX_NS 1000000		/* 1 millisecond */
 
-
-void sdebug_free_queued_cmd(struct sdebug_queued_cmd *sqcp)
-{
-	if (sqcp)
-		kmem_cache_free(queued_cmd_cache, sqcp);
-}
-
-static struct sdebug_queued_cmd *sdebug_alloc_queued_cmd(struct scsi_cmnd *scmd)
-{
-	struct sdebug_queued_cmd *sqcp;
-	struct sdebug_defer *sd_dp;
-
-	sqcp = kmem_cache_zalloc(queued_cmd_cache, GFP_ATOMIC);
-	if (!sqcp)
-		return NULL;
-
-	sd_dp = &sqcp->sd_dp;
-
-	hrtimer_init(&sd_dp->hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED);
-	sd_dp->hrt.function = sdebug_q_cmd_hrt_complete;
-	INIT_WORK(&sd_dp->ew.work, sdebug_q_cmd_wq_complete);
-
-	sqcp->scmd = scmd;
-
-	return sqcp;
-}
-
 /* Complete the processing of the thread that queued a SCSI command to this
  * driver. It either completes the command by calling cmnd_done() or
  * schedules a hr timer or work queue then returns 0. Returns
@@ -7149,7 +7086,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 	struct sdebug_scsi_cmd *sdsc = scsi_cmd_priv(cmnd);
 	unsigned long flags;
 	u64 ns_from_boot = 0;
-	struct sdebug_queued_cmd *sqcp;
 	struct scsi_device *sdp;
 	struct sdebug_defer *sd_dp;
 
@@ -7181,12 +7117,7 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 		}
 	}
 
-	sqcp = sdebug_alloc_queued_cmd(cmnd);
-	if (!sqcp) {
-		pr_err("%s no alloc\n", __func__);
-		return SCSI_MLQUEUE_HOST_BUSY;
-	}
-	sd_dp = &sqcp->sd_dp;
+	sd_dp = &sdsc->sd_dp;
 
 	if (polled || (ndelay > 0 && ndelay < INCLUSIVE_TIMING_MAX_NS))
 		ns_from_boot = ktime_get_boottime_ns();
@@ -7234,7 +7165,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 
 				if (kt <= d) {	/* elapsed duration >= kt */
 					/* call scsi_done() from this thread */
-					sdebug_free_queued_cmd(sqcp);
 					scsi_done(cmnd);
 					return 0;
 				}
@@ -7247,13 +7177,11 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 		if (polled) {
 			spin_lock_irqsave(&sdsc->lock, flags);
 			sd_dp->cmpl_ts = ktime_add(ns_to_ktime(ns_from_boot), kt);
-			ASSIGN_QUEUED_CMD(cmnd, sqcp);
 			WRITE_ONCE(sd_dp->defer_t, SDEB_DEFER_POLL);
 			spin_unlock_irqrestore(&sdsc->lock, flags);
 		} else {
 			/* schedule the invocation of scsi_done() for a later time */
 			spin_lock_irqsave(&sdsc->lock, flags);
-			ASSIGN_QUEUED_CMD(cmnd, sqcp);
 			WRITE_ONCE(sd_dp->defer_t, SDEB_DEFER_HRT);
 			hrtimer_start(&sd_dp->hrt, kt, HRTIMER_MODE_REL_PINNED);
 			/*
@@ -7277,13 +7205,11 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 			sd_dp->issuing_cpu = raw_smp_processor_id();
 		if (polled) {
 			spin_lock_irqsave(&sdsc->lock, flags);
-			ASSIGN_QUEUED_CMD(cmnd, sqcp);
 			sd_dp->cmpl_ts = ns_to_ktime(ns_from_boot);
 			WRITE_ONCE(sd_dp->defer_t, SDEB_DEFER_POLL);
 			spin_unlock_irqrestore(&sdsc->lock, flags);
 		} else {
 			spin_lock_irqsave(&sdsc->lock, flags);
-			ASSIGN_QUEUED_CMD(cmnd, sqcp);
 			WRITE_ONCE(sd_dp->defer_t, SDEB_DEFER_WQ);
 			schedule_work(&sd_dp->ew.work);
 			spin_unlock_irqrestore(&sdsc->lock, flags);
@@ -8650,12 +8576,6 @@ static int __init scsi_debug_init(void)
 	hosts_to_add = sdebug_add_host;
 	sdebug_add_host = 0;
 
-	queued_cmd_cache = KMEM_CACHE(sdebug_queued_cmd, SLAB_HWCACHE_ALIGN);
-	if (!queued_cmd_cache) {
-		ret = -ENOMEM;
-		goto driver_unreg;
-	}
-
 	sdebug_debugfs_root = debugfs_create_dir("scsi_debug", NULL);
 	if (IS_ERR_OR_NULL(sdebug_debugfs_root))
 		pr_info("%s: failed to create initial debugfs directory\n", __func__);
@@ -8682,8 +8602,6 @@ static int __init scsi_debug_init(void)
 
 	return 0;
 
-driver_unreg:
-	driver_unregister(&sdebug_driverfs_driver);
 bus_unreg:
 	bus_unregister(&pseudo_lld_bus);
 dev_unreg:
@@ -8699,7 +8617,6 @@ static void __exit scsi_debug_exit(void)
 
 	for (; k; k--)
 		sdebug_do_remove_host(true);
-	kmem_cache_destroy(queued_cmd_cache);
 	driver_unregister(&sdebug_driverfs_driver);
 	bus_unregister(&pseudo_lld_bus);
 	root_device_unregister(pseudo_primary);
@@ -9083,7 +9000,6 @@ static bool sdebug_blk_mq_poll_iter(struct request *rq, void *opaque)
 	struct sdebug_defer *sd_dp;
 	u32 unique_tag = blk_mq_unique_tag(rq);
 	u16 hwq = blk_mq_unique_tag_to_hwq(unique_tag);
-	struct sdebug_queued_cmd *sqcp;
 	unsigned long flags;
 	int queue_num = data->queue_num;
 	ktime_t time;
@@ -9099,13 +9015,7 @@ static bool sdebug_blk_mq_poll_iter(struct request *rq, void *opaque)
 	time = ktime_get_boottime();
 
 	spin_lock_irqsave(&sdsc->lock, flags);
-	sqcp = TO_QUEUED_CMD(cmd);
-	if (!sqcp) {
-		spin_unlock_irqrestore(&sdsc->lock, flags);
-		return true;
-	}
-
-	sd_dp = &sqcp->sd_dp;
+	sd_dp = &sdsc->sd_dp;
 	if (READ_ONCE(sd_dp->defer_t) != SDEB_DEFER_POLL) {
 		spin_unlock_irqrestore(&sdsc->lock, flags);
 		return true;
@@ -9115,8 +9025,6 @@ static bool sdebug_blk_mq_poll_iter(struct request *rq, void *opaque)
 		spin_unlock_irqrestore(&sdsc->lock, flags);
 		return true;
 	}
-
-	ASSIGN_QUEUED_CMD(cmd, NULL);
 	spin_unlock_irqrestore(&sdsc->lock, flags);
 
 	if (sdebug_statistics) {
@@ -9125,8 +9033,6 @@ static bool sdebug_blk_mq_poll_iter(struct request *rq, void *opaque)
 			atomic_inc(&sdebug_miss_cpus);
 	}
 
-	sdebug_free_queued_cmd(sqcp);
-
 	scsi_done(cmd); /* callback to mid level */
 	(*data->num_entries)++;
 	return true;
@@ -9441,8 +9347,12 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 static int sdebug_init_cmd_priv(struct Scsi_Host *shost, struct scsi_cmnd *cmd)
 {
 	struct sdebug_scsi_cmd *sdsc = scsi_cmd_priv(cmd);
+	struct sdebug_defer *sd_dp = &sdsc->sd_dp;
 
 	spin_lock_init(&sdsc->lock);
+	hrtimer_init(&sd_dp->hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED);
+	sd_dp->hrt.function = sdebug_q_cmd_hrt_complete;
+	INIT_WORK(&sd_dp->ew.work, sdebug_q_cmd_wq_complete);
 
 	return 0;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/4] scsi: scsi_debug: Do not sleep in atomic sections
  2025-02-24 11:55 [PATCH 0/4] scsi_debug improvements John Garry
                   ` (2 preceding siblings ...)
  2025-02-24 11:55 ` [PATCH 3/4] scsi: scsi_debug: Simplify command handling John Garry
@ 2025-02-24 11:55 ` John Garry
  2025-11-05 11:57   ` [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value JiangJianJun
                     ` (2 more replies)
  2025-02-25  1:57 ` [PATCH 0/4] scsi_debug improvements Martin K. Petersen
  4 siblings, 3 replies; 21+ messages in thread
From: John Garry @ 2025-02-24 11:55 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen; +Cc: linux-scsi, bvanassche, John Garry

From: Bart Van Assche <bvanassche@acm.org>

Function stop_qc_helper() is called while the debug_scsi_cmd lock is held,
and from here we may call cancel_work_sync(), which may sleep.

Sleeping in atomic sections is not allowed.

Hence change the cancel_work_sync() call into a cancel_work() call.

However now it is not possible to know if the work callback is running
when we return. This is relevant for eh_abort_handler handling, as the
semantics of that callback are that success means that we do not keep a
reference to the scsi_cmnd - now this is not possible. So return FAIL
when we are unsure if the callback still running.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
jpg: return FAILED from scsi_debug_abort() when possible callback running
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/scsi/scsi_debug.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index c1a217b5aac2..2208dcba346e 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -6655,7 +6655,7 @@ static void scsi_debug_sdev_destroy(struct scsi_device *sdp)
 	sdp->hostdata = NULL;
 }
 
-/* Returns true if it is safe to complete @cmnd. */
+/* Returns true if cancelled or not running callback. */
 static bool scsi_debug_stop_cmnd(struct scsi_cmnd *cmnd)
 {
 	struct sdebug_scsi_cmd *sdsc = scsi_cmd_priv(cmnd);
@@ -6668,18 +6668,18 @@ static bool scsi_debug_stop_cmnd(struct scsi_cmnd *cmnd)
 		int res = hrtimer_try_to_cancel(&sd_dp->hrt);
 
 		switch (res) {
-		case 0: /* Not active, it must have already run */
 		case -1: /* -1 It's executing the CB */
 			return false;
+		case 0: /* Not active, it must have already run */
 		case 1: /* Was active, we've now cancelled */
 		default:
 			return true;
 		}
 	} else if (defer_t == SDEB_DEFER_WQ) {
 		/* Cancel if pending */
-		if (cancel_work_sync(&sd_dp->ew.work))
+		if (cancel_work(&sd_dp->ew.work))
 			return true;
-		/* Was not pending, so it must have run */
+		/* callback may be running, so return false */
 		return false;
 	} else if (defer_t == SDEB_DEFER_POLL) {
 		return true;
@@ -6759,7 +6759,7 @@ static int sdebug_fail_abort(struct scsi_cmnd *cmnd)
 
 static int scsi_debug_abort(struct scsi_cmnd *SCpnt)
 {
-	bool ok = scsi_debug_abort_cmnd(SCpnt);
+	bool aborted = scsi_debug_abort_cmnd(SCpnt);
 	u8 *cmd = SCpnt->cmnd;
 	u8 opcode = cmd[0];
 
@@ -6768,7 +6768,8 @@ static int scsi_debug_abort(struct scsi_cmnd *SCpnt)
 	if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
 		sdev_printk(KERN_INFO, SCpnt->device,
 			    "%s: command%s found\n", __func__,
-			    ok ? "" : " not");
+			    aborted ? "" : " not");
+
 
 	if (sdebug_fail_abort(SCpnt)) {
 		scmd_printk(KERN_INFO, SCpnt, "fail abort command 0x%x\n",
@@ -6776,6 +6777,9 @@ static int scsi_debug_abort(struct scsi_cmnd *SCpnt)
 		return FAILED;
 	}
 
+	if (aborted == false)
+		return FAILED;
+
 	return SUCCESS;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] scsi: scsi_debug: Remove sdebug_device_access_info
  2025-02-24 11:55 ` [PATCH 1/4] scsi: scsi_debug: Remove sdebug_device_access_info John Garry
@ 2025-02-24 18:04   ` Bart Van Assche
  0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2025-02-24 18:04 UTC (permalink / raw)
  To: John Garry, James.Bottomley, martin.petersen; +Cc: linux-scsi

On 2/24/25 3:55 AM, John Garry wrote:
> This structure is not used, so delete it.
> 
> It was originally intended for supporting checking for atomic writes
> overlapping with ongoing reads and writes, but that support never got
> added.
> 
> sbc-4 r22 section 4.29.3.2 "Performing operations during an atomic write
> operation" describes two methods of handling overlapping atomic writes.
> Currently the only method supported is for the ongoing read or write to
> complete.
Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] scsi_debug improvements
  2025-02-24 11:55 [PATCH 0/4] scsi_debug improvements John Garry
                   ` (3 preceding siblings ...)
  2025-02-24 11:55 ` [PATCH 4/4] scsi: scsi_debug: Do not sleep in atomic sections John Garry
@ 2025-02-25  1:57 ` Martin K. Petersen
  4 siblings, 0 replies; 21+ messages in thread
From: Martin K. Petersen @ 2025-02-25  1:57 UTC (permalink / raw)
  To: John Garry; +Cc: James.Bottomley, martin.petersen, linux-scsi, bvanassche


John,

> The main change in this series is to solve the atomic context sleeping
> issue by now reporting errors back from scsi_debug_abort() to the SCSI
> midlayer when we cannot guarantee if the command was aborted.

Applied to 6.15/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-05 12:01   ` JiangJianJun
@ 2025-11-05 11:54     ` John Garry
  2025-11-06 12:03       ` JiangJianJun
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2025-11-05 11:54 UTC (permalink / raw)
  To: JiangJianJun, James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, hewenliang4, yangyun50, wuyifeng10

On 05/11/2025 12:01, JiangJianJun wrote:

I got this patch 3x times...

> The injection of timeout faults is achieved by directly discarding the
> SCSI command.
> 
> However, after the timeout, the SCSI mid-layer cancels the SCSI command.
> At this point, if the command is checked during cancellation, 

What do you mean by "checked during cancellation"?

> it will result
> in a "not found" outcome, making it impossible to cancel the command properly.

I don't know what is meant by "cancel the command properly".

> 
> Therefore, the approach has been changed to avoid direct cancellation, and the
> delay is set to the maximum value(0x7FFFFFFF).
> 
> Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
> ---
>   drivers/scsi/scsi_debug.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> index 353cb60e1abe..7d86a6f10130 100644
> --- a/drivers/scsi/scsi_debug.c
> +++ b/drivers/scsi/scsi_debug.c
> @@ -9249,6 +9249,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
>   	bool inject_now;
>   	int ret = 0;
>   	struct sdebug_err_inject err;
> +	bool timeout = false;
>   
>   	scsi_set_resid(scp, 0);
>   	if (sdebug_statistics) {
> @@ -9291,7 +9292,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
>   
>   	if (sdebug_timeout_cmd(scp)) {
>   		scmd_printk(KERN_INFO, scp, "timeout command 0x%x\n", opcode);
> -		return 0;
> +		timeout = true;
>   	}
>   
>   	ret = sdebug_fail_queue_cmd(scp);
> @@ -9398,7 +9399,9 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
>   		pfp = r_pfp;    /* if leaf function ptr NULL, try the root's */
>   
>   fini:
> -	if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
> +	if (unlikely(timeout)) /* inject timeout */
> +		return schedule_resp(scp, devip, errsts, pfp, 0x7FFFFFFF, 0x7FFFFFFF);
> +	else if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
>   		return schedule_resp(scp, devip, errsts, pfp, 0, 0);
>   	else if ((flags & F_LONG_DELAY) && (sdebug_jdelay > 0 ||
>   					    sdebug_ndelay > 10000)) {


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-02-24 11:55 ` [PATCH 4/4] scsi: scsi_debug: Do not sleep in atomic sections John Garry
@ 2025-11-05 11:57   ` JiangJianJun
  2025-11-05 12:00   ` JiangJianJun
  2025-11-05 12:01   ` JiangJianJun
  2 siblings, 0 replies; 21+ messages in thread
From: JiangJianJun @ 2025-11-05 11:57 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

The injection of timeout faults is achieved by directly discarding the
SCSI command.

However, after the timeout, the SCSI mid-layer cancels the SCSI command.
At this point, if the command is checked during cancellation, it will result
in a "not found" outcome, making it impossible to cancel the command properly.

Therefore, the approach has been changed to avoid direct cancellation, and the
delay is set to the maximum value(0x7FFFFFFF).

Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_debug.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 353cb60e1abe..7d86a6f10130 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -9249,6 +9249,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 	bool inject_now;
 	int ret = 0;
 	struct sdebug_err_inject err;
+	bool timeout = false;
 
 	scsi_set_resid(scp, 0);
 	if (sdebug_statistics) {
@@ -9291,7 +9292,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 
 	if (sdebug_timeout_cmd(scp)) {
 		scmd_printk(KERN_INFO, scp, "timeout command 0x%x\n", opcode);
-		return 0;
+		timeout = true;
 	}
 
 	ret = sdebug_fail_queue_cmd(scp);
@@ -9398,7 +9399,9 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 		pfp = r_pfp;    /* if leaf function ptr NULL, try the root's */
 
 fini:
-	if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
+	if (unlikely(timeout)) /* inject timeout */
+		return schedule_resp(scp, devip, errsts, pfp, 0x7FFFFFFF, 0x7FFFFFFF);
+	else if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
 		return schedule_resp(scp, devip, errsts, pfp, 0, 0);
 	else if ((flags & F_LONG_DELAY) && (sdebug_jdelay > 0 ||
 					    sdebug_ndelay > 10000)) {
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-02-24 11:55 ` [PATCH 4/4] scsi: scsi_debug: Do not sleep in atomic sections John Garry
  2025-11-05 11:57   ` [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value JiangJianJun
@ 2025-11-05 12:00   ` JiangJianJun
  2025-11-05 12:01   ` JiangJianJun
  2 siblings, 0 replies; 21+ messages in thread
From: JiangJianJun @ 2025-11-05 12:00 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

The injection of timeout faults is achieved by directly discarding the
SCSI command.

However, after the timeout, the SCSI mid-layer cancels the SCSI command.
At this point, if the command is checked during cancellation, it will result
in a "not found" outcome, making it impossible to cancel the command properly.

Therefore, the approach has been changed to avoid direct cancellation, and the
delay is set to the maximum value(0x7FFFFFFF).

Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_debug.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 353cb60e1abe..7d86a6f10130 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -9249,6 +9249,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 	bool inject_now;
 	int ret = 0;
 	struct sdebug_err_inject err;
+	bool timeout = false;
 
 	scsi_set_resid(scp, 0);
 	if (sdebug_statistics) {
@@ -9291,7 +9292,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 
 	if (sdebug_timeout_cmd(scp)) {
 		scmd_printk(KERN_INFO, scp, "timeout command 0x%x\n", opcode);
-		return 0;
+		timeout = true;
 	}
 
 	ret = sdebug_fail_queue_cmd(scp);
@@ -9398,7 +9399,9 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 		pfp = r_pfp;    /* if leaf function ptr NULL, try the root's */
 
 fini:
-	if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
+	if (unlikely(timeout)) /* inject timeout */
+		return schedule_resp(scp, devip, errsts, pfp, 0x7FFFFFFF, 0x7FFFFFFF);
+	else if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
 		return schedule_resp(scp, devip, errsts, pfp, 0, 0);
 	else if ((flags & F_LONG_DELAY) && (sdebug_jdelay > 0 ||
 					    sdebug_ndelay > 10000)) {
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-02-24 11:55 ` [PATCH 4/4] scsi: scsi_debug: Do not sleep in atomic sections John Garry
  2025-11-05 11:57   ` [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value JiangJianJun
  2025-11-05 12:00   ` JiangJianJun
@ 2025-11-05 12:01   ` JiangJianJun
  2025-11-05 11:54     ` John Garry
  2 siblings, 1 reply; 21+ messages in thread
From: JiangJianJun @ 2025-11-05 12:01 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

The injection of timeout faults is achieved by directly discarding the
SCSI command.

However, after the timeout, the SCSI mid-layer cancels the SCSI command.
At this point, if the command is checked during cancellation, it will result
in a "not found" outcome, making it impossible to cancel the command properly.

Therefore, the approach has been changed to avoid direct cancellation, and the
delay is set to the maximum value(0x7FFFFFFF).

Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_debug.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 353cb60e1abe..7d86a6f10130 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -9249,6 +9249,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 	bool inject_now;
 	int ret = 0;
 	struct sdebug_err_inject err;
+	bool timeout = false;
 
 	scsi_set_resid(scp, 0);
 	if (sdebug_statistics) {
@@ -9291,7 +9292,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 
 	if (sdebug_timeout_cmd(scp)) {
 		scmd_printk(KERN_INFO, scp, "timeout command 0x%x\n", opcode);
-		return 0;
+		timeout = true;
 	}
 
 	ret = sdebug_fail_queue_cmd(scp);
@@ -9398,7 +9399,9 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
 		pfp = r_pfp;    /* if leaf function ptr NULL, try the root's */
 
 fini:
-	if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
+	if (unlikely(timeout)) /* inject timeout */
+		return schedule_resp(scp, devip, errsts, pfp, 0x7FFFFFFF, 0x7FFFFFFF);
+	else if (F_DELAY_OVERR & flags)	/* cmds like INQUIRY respond asap */
 		return schedule_resp(scp, devip, errsts, pfp, 0, 0);
 	else if ((flags & F_LONG_DELAY) && (sdebug_jdelay > 0 ||
 					    sdebug_ndelay > 10000)) {
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-05 11:54     ` John Garry
@ 2025-11-06 12:03       ` JiangJianJun
  2025-11-07 10:05         ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: JiangJianJun @ 2025-11-06 12:03 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

> I got this patch 3x times...
Sorry, command has an error, i thought it had failed, so I sent it again.

> What do you mean by "checked during cancellation"?
&
> I don't know what is meant by "cancel the command properly".
Maybe I shouldn't use "cancel" to describe it; in the code, it's called "abort".
The function `sdebug_timeout_cmd` is designed to simulate a command timeout, by
discarding it. I noticed that you added a check to see if the command was
executed before "abort"; otherwise, it returns a failure.
Therefore, I change discarding to long-long-delay, so that "abort" will succeed.
If we needed abort-failure, we can inject ERR_ABORT_CMD_FAILED.

The fault-injection can be seen this link:
https://lore.kernel.org/r/20231010092051.608007-5-haowenchao2@huawei.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-06 12:03       ` JiangJianJun
@ 2025-11-07 10:05         ` John Garry
  2025-11-08  8:29           ` JiangJianJun
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2025-11-07 10:05 UTC (permalink / raw)
  To: JiangJianJun, James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, hewenliang4, yangyun50, wuyifeng10

On 06/11/2025 12:03, JiangJianJun wrote:
>> I got this patch 3x times...
> Sorry, command has an error, i thought it had failed, so I sent it again.
> 
>> What do you mean by "checked during cancellation"?
> &
>> I don't know what is meant by "cancel the command properly".
> Maybe I shouldn't use "cancel" to describe it; in the code, it's called "abort".
> The function `sdebug_timeout_cmd` is designed to simulate a command timeout, by
> discarding it. I noticed that you added a check to see if the command was
> executed before "abort"; otherwise, it returns a failure.

When we get discard the command, we get a timeout, and the scsi error 
handling kicks in eventually. The first thing that the scsi error 
handler tries to do is abort the command. All that the scsi_debug abort 
handler can do is ensure that we no longer have a reference to the 
scsi_command, which may mean cancelling any pending completion (if 
possible). Is your problem that sometimes the abort handler may fail, 
and we have to escalate?

> Therefore, I change discarding to long-long-delay, so that "abort" will succeed.
> If we needed abort-failure, we can inject ERR_ABORT_CMD_FAILED.
> 
> The fault-injection can be seen this link:
> https://urldefense.com/v3/__https://lore.kernel.org/r/20231010092051.608007-5-haowenchao2@huawei.com__;!!ACWV5N9M2RV99hQ!L_o5Xt_9lhXLe0R8AWsJWnt33qXlNijbVEWyiq9kjgA2lBsuV7E2s7ficvGM37RlSxxvXVZ1YEkAQxZ-WkFvKGTURrU$
> 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-07 10:05         ` John Garry
@ 2025-11-08  8:29           ` JiangJianJun
  2025-11-08  9:38             ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: JiangJianJun @ 2025-11-08  8:29 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

> When we get discard the command, we get a timeout, and the scsi error 
> handling kicks in eventually. The first thing that the scsi error 
> handler tries to do is abort the command. All that the scsi_debug abort 
> handler can do is ensure that we no longer have a reference to the 
> scsi_command, which may mean cancelling any pending completion (if 
> possible). Is your problem that sometimes the abort handler may fail, 
> and we have to escalate?
What I mean is, in fault injection, we need to set a timeout, and then we expect a successful aborting. On the other hand, we can make a failed aborting by injecting-fault.
Before your modifications, the above objectives were achievable. The purpose of my current modification is also to restore the previous functionality.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-08  8:29           ` JiangJianJun
@ 2025-11-08  9:38             ` John Garry
  2025-11-11 10:59               ` JiangJianJun
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2025-11-08  9:38 UTC (permalink / raw)
  To: JiangJianJun, James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, hewenliang4, yangyun50, wuyifeng10

On 08/11/2025 08:29, JiangJianJun wrote:
>> When we get discard the command, we get a timeout, and the scsi error
> 
>> handling kicks in eventually. The first thing that the scsi error
> 
>> handler tries to do is abort the command. All that the scsi_debug abort
> 
>> handler can do is ensure that we no longer have a reference to the
> 
>> scsi_command, which may mean cancelling any pending completion (if
> 
>> possible). Is your problem that sometimes the abort handler may fail,
> 
>> and we have to escalate?
> 
> What I mean is, in fault injection, we need to set a timeout, and then we expect a successful aborting. 

I don't think that you can always expect successful aborting.

As an example of this, if the completion callback for a command is 
running at the same time as the abort handler, then we cannot 
successfully abort.

> On the other hand, we can make a failed aborting by injecting-fault.
> 
> Before your modifications, the above objectives were achievable. The purpose of my current modification is also to restore the previous functionality.

Which modifications are you specifically referring to?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-08  9:38             ` John Garry
@ 2025-11-11 10:59               ` JiangJianJun
  2025-11-12  8:53                 ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: JiangJianJun @ 2025-11-11 10:59 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

> I don't think that you can always expect successful aborting.
If scsi_debug has been load, and scsi device id is "3:0:0:0", then we can
inject io-timeout:
    echo '0 -1 0x2a' > /sys/kernel/debug/scsi_debug/3:0:0:0/error
First dispatched WRITE-command will be timeout, after timeout scsi middle
layer will abort this command. This abort-operation will success.
IF we want make it fail, we can inject abort-fail: 
    echo '3 -1 0x2a' > /sys/kernel/debug/scsi_debug/3:0:0:0/error
So we can control the result of abort-opation.
> Which modifications are you specifically referring to?
This feature was able to achieve the expected results before you made any
changes. Additionally, I didn't change your logic; I was just restoring this
feature.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-11 10:59               ` JiangJianJun
@ 2025-11-12  8:53                 ` John Garry
  2025-11-12 13:38                   ` JiangJianJun
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2025-11-12  8:53 UTC (permalink / raw)
  To: JiangJianJun, James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, hewenliang4, yangyun50, wuyifeng10

On 11/11/2025 10:59, JiangJianJun wrote:
>> I don't think that you can always expect successful aborting.
> 
> If scsi_debug has been load, and scsi device id is "3:0:0:0", then we can
> 
> inject io-timeout:
> 
>      echo '0 -1 0x2a' > /sys/kernel/debug/scsi_debug/3:0:0:0/error

uh, I never thought that this thing actually was accepted

> 
> First dispatched WRITE-command will be timeout, after timeout scsi middle
> 
> layer will abort this command. This abort-operation will success.
> 
> IF we want make it fail, we can inject abort-fail:
> 
>      echo '3 -1 0x2a' > /sys/kernel/debug/scsi_debug/3:0:0:0/error
> 
> So we can control the result of abort-opation.
> 
>> Which modifications are you specifically referring to?
> 
> This feature was able to achieve the expected results before you made any
> 
> changes.

Again I will ask - which of my changes are you referring to? Provide 
commit ids, please.

> Additionally, I didn't change your logic; I was just restoring this
> 
> feature.
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-12  8:53                 ` John Garry
@ 2025-11-12 13:38                   ` JiangJianJun
  2025-11-12 16:02                     ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: JiangJianJun @ 2025-11-12 13:38 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

> uh, I never thought that this thing actually was accepted
Have you reviewed this commit before?
You seem to think that this feature is not good.

> Again I will ask - which of my changes are you referring to? Provide 
> commit ids, please.
The email I am replying to now is the change.
And the commit id: ac0fb4a55bde561c46fc7445642a722803176b33.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-12 13:38                   ` JiangJianJun
@ 2025-11-12 16:02                     ` John Garry
  2025-11-13 11:22                       ` JiangJianJun
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2025-11-12 16:02 UTC (permalink / raw)
  To: JiangJianJun, James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, hewenliang4, yangyun50, wuyifeng10

On 12/11/2025 13:38, JiangJianJun wrote:
>> uh, I never thought that this thing actually was accepted
> 
> Have you reviewed this commit before?
> 
> You seem to think that this feature is not good.
> 

The concern which I mentioned before was that we would have competing 
and potentially conflicting methods to trigger and handle errors.

> 
> 
>> Again I will ask - which of my changes are you referring to? Provide
> 
>> commit ids, please.
> 
> The email I am replying to now is the change.
> 
> And the commit id: ac0fb4a55bde561c46fc7445642a722803176b33.

This specifically is a change from Bart to fix my code. However I added 
the change to check the abort result.

I think previously scsi_debug_abort() would just always report success, 
even if scsi_debug_abort_cmnd() possibly fails, right?

Now for a fake timeout, there seems to be a problem in 
scsi_debug_stop_cmnd() that we abort depending on the deferred type, but 
this is not set for any fake timeout. Or - more specifically - it is not 
reset for when the scmd is recycled. I think that we should just allow 
the abort to be successful for that case.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-12 16:02                     ` John Garry
@ 2025-11-13 11:22                       ` JiangJianJun
  2025-11-13 11:48                         ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: JiangJianJun @ 2025-11-13 11:22 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, john.g.garry, hewenliang4, yangyun50, wuyifeng10

> The concern which I mentioned before was that we would have competing 
> and potentially conflicting methods to trigger and handle errors.
>
> This specifically is a change from Bart to fix my code. However I added 
> the change to check the abort result.
> 
> I think previously scsi_debug_abort() would just always report success, 
> even if scsi_debug_abort_cmnd() possibly fails, right?
> 
> Now for a fake timeout, there seems to be a problem in 
> scsi_debug_stop_cmnd() that we abort depending on the deferred type, but 
> this is not set for any fake timeout. Or - more specifically - it is not 
> reset for when the scmd is recycled. I think that we should just allow 
> the abort to be successful for that case.

I think your design is rational.

I think the method of simulating timeout is rational because discard-command is
equivalent to the queuecommand executing but returning a success status, which
is clearly not what this interface expects. It is also within expectation that
such an unrational approach might conflict with subsequent changes.

So, I tend to change the discard-command to setting a very much long delay for
schedule.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: scsi: scsi_debug: make timeout faults by set delay to maximum value
  2025-11-13 11:22                       ` JiangJianJun
@ 2025-11-13 11:48                         ` John Garry
  0 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2025-11-13 11:48 UTC (permalink / raw)
  To: JiangJianJun, James.Bottomley, martin.petersen, linux-scsi
  Cc: bvanassche, hewenliang4, yangyun50, wuyifeng10

On 13/11/2025 11:22, JiangJianJun wrote:
>> Now for a fake timeout, there seems to be a problem in
>> scsi_debug_stop_cmnd() that we abort depending on the deferred type, but
>> this is not set for any fake timeout. Or - more specifically - it is not
>> reset for when the scmd is recycled. I think that we should just allow
>> the abort to be successful for that case.
> 
> 
> I think your design is rational.
> 
> 
> 
> I think the method of simulating timeout is rational because discard-command is
> 
> equivalent to the queuecommand executing but returning a success status, which
> 
> is clearly not what this interface expects. It is also within expectation that
> 
> such an unrational approach might conflict with subsequent changes.
> 
> 
> 
> So, I tend to change the discard-command to setting a very much long delay for
> 
> schedule.

I have worked on some patches to improve this situation. I will send 
them in a while. Please check them.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2025-11-13 11:48 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-24 11:55 [PATCH 0/4] scsi_debug improvements John Garry
2025-02-24 11:55 ` [PATCH 1/4] scsi: scsi_debug: Remove sdebug_device_access_info John Garry
2025-02-24 18:04   ` Bart Van Assche
2025-02-24 11:55 ` [PATCH 2/4] scsi: scsi_debug: Remove a reference to in_use_bm John Garry
2025-02-24 11:55 ` [PATCH 3/4] scsi: scsi_debug: Simplify command handling John Garry
2025-02-24 11:55 ` [PATCH 4/4] scsi: scsi_debug: Do not sleep in atomic sections John Garry
2025-11-05 11:57   ` [PATCH] scsi: scsi_debug: make timeout faults by set delay to maximum value JiangJianJun
2025-11-05 12:00   ` JiangJianJun
2025-11-05 12:01   ` JiangJianJun
2025-11-05 11:54     ` John Garry
2025-11-06 12:03       ` JiangJianJun
2025-11-07 10:05         ` John Garry
2025-11-08  8:29           ` JiangJianJun
2025-11-08  9:38             ` John Garry
2025-11-11 10:59               ` JiangJianJun
2025-11-12  8:53                 ` John Garry
2025-11-12 13:38                   ` JiangJianJun
2025-11-12 16:02                     ` John Garry
2025-11-13 11:22                       ` JiangJianJun
2025-11-13 11:48                         ` John Garry
2025-02-25  1:57 ` [PATCH 0/4] scsi_debug improvements Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox