public inbox for linux-scsi@vger.kernel.org
* [PATCH 1/2] scsi: add scsi_done_direct() helper
@ 2022-02-08  6:37 Xiaoguang Wang
  2022-02-08  6:37 ` [PATCH 2/2] scsi: target: tcm_loop: use scsi_done_direct() Xiaoguang Wang
  2022-02-11 21:49 ` [PATCH 1/2] scsi: add scsi_done_direct() helper Martin K. Petersen
  0 siblings, 2 replies; 3+ messages in thread
From: Xiaoguang Wang @ 2022-02-08  6:37 UTC
  To: linux-scsi, target-devel; +Cc: martin.petersen, bostroesser, kanie

For SCSI commands that are known to complete in non-interrupt
context, scsi_done_direct(), which calls blk_mq_complete_request_direct(),
can be used to complete them directly instead of deferring the
completion to softirq, which can improve throughput.

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 drivers/scsi/scsi_lib.c  | 32 +++++++++++++++++++++++++++-----
 include/scsi/scsi_cmnd.h |  1 +
 2 files changed, 28 insertions(+), 5 deletions(-)
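
Note for readers (not part of the commit): the split into a shared
pre-check plus two completion paths can be pictured with a small
userspace model. This is only a sketch. Every model_* name is
hypothetical and stands in for kernel code, the "deferred" list merely
imitates softirq processing, and the real decision whether to defer is
made inside the block layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_cmd {
	bool state_complete;		/* plays the role of SCMD_STATE_COMPLETE */
	bool finished;			/* set once the completion handler has run */
	struct model_cmd *next;		/* link in the deferred list */
};

/* Commands waiting for the (imitated) softirq to complete them. */
static struct model_cmd *deferred_head;

/* Completion handler, loosely analogous to mq_ops->complete. */
static void model_complete(struct model_cmd *cmd)
{
	cmd->finished = true;
}

/* Shared gate: returns true when the caller must not complete the command. */
static bool model_done_gate(struct model_cmd *cmd)
{
	if (cmd->state_complete)
		return true;		/* already completed once */
	cmd->state_complete = true;
	return false;
}

/* Deferred path: queue the command; the handler runs later. */
static void model_done(struct model_cmd *cmd)
{
	if (model_done_gate(cmd))
		return;
	cmd->next = deferred_head;
	deferred_head = cmd;
}

/* Direct path: run the handler in the caller's own context. */
static void model_done_direct(struct model_cmd *cmd)
{
	if (model_done_gate(cmd))
		return;
	model_complete(cmd);
}

/* Stand-in for the softirq that drains deferred completions. */
static void model_run_deferred(void)
{
	while (deferred_head) {
		struct model_cmd *cmd = deferred_head;

		deferred_head = cmd->next;
		cmd->next = NULL;
		assert(cmd->state_complete);	/* only gated commands are queued */
		model_complete(cmd);
	}
}
```

In this model a command passed to model_done() stays unfinished until
model_run_deferred() runs, while model_done_direct() finishes it before
returning. Both paths share the gate, so a second completion attempt on
the same command is ignored either way.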

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..c37879f46eaf 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1625,26 +1625,48 @@ static blk_status_t scsi_prepare_cmd(struct request *req)
 	return scsi_cmd_to_driver(cmd)->init_command(cmd);
 }
 
-void scsi_done(struct scsi_cmnd *cmd)
+static bool __scsi_done(struct scsi_cmnd *cmd)
 {
 	switch (cmd->submitter) {
 	case SUBMITTED_BY_BLOCK_LAYER:
-		break;
+		return false;
 	case SUBMITTED_BY_SCSI_ERROR_HANDLER:
-		return scsi_eh_done(cmd);
+		scsi_eh_done(cmd);
+		return true;
 	case SUBMITTED_BY_SCSI_RESET_IOCTL:
-		return;
+		return true;
 	}
 
 	if (unlikely(blk_should_fake_timeout(scsi_cmd_to_rq(cmd)->q)))
-		return;
+		return true;
 	if (unlikely(test_and_set_bit(SCMD_STATE_COMPLETE, &cmd->state)))
+		return true;
+	return false;
+}
+
+void scsi_done(struct scsi_cmnd *cmd)
+{
+	if (__scsi_done(cmd))
 		return;
+
 	trace_scsi_dispatch_cmd_done(cmd);
 	blk_mq_complete_request(scsi_cmd_to_rq(cmd));
 }
 EXPORT_SYMBOL(scsi_done);
 
+/* Complete cmds directly; for preemptible, non-interrupt context. */
+void scsi_done_direct(struct scsi_cmnd *cmd)
+{
+	struct request *rq = scsi_cmd_to_rq(cmd);
+
+	if (__scsi_done(cmd))
+		return;
+
+	trace_scsi_dispatch_cmd_done(cmd);
+	blk_mq_complete_request_direct(rq, rq->q->mq_ops->complete);
+}
+EXPORT_SYMBOL(scsi_done_direct);
+
 static void scsi_mq_put_budget(struct request_queue *q, int budget_token)
 {
 	struct scsi_device *sdev = q->queuedata;
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 6794d7322cbd..ff1c4b51f7ae 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -168,6 +168,7 @@ static inline struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd)
 }
 
 void scsi_done(struct scsi_cmnd *cmd);
+void scsi_done_direct(struct scsi_cmnd *cmd);
 
 extern void scsi_finish_command(struct scsi_cmnd *cmd);
 
-- 
2.14.4.44.g2045bb6



* [PATCH 2/2] scsi: target: tcm_loop: use scsi_done_direct()
  2022-02-08  6:37 [PATCH 1/2] scsi: add scsi_done_direct() helper Xiaoguang Wang
@ 2022-02-08  6:37 ` Xiaoguang Wang
  2022-02-11 21:49 ` [PATCH 1/2] scsi: add scsi_done_direct() helper Martin K. Petersen
  1 sibling, 0 replies; 3+ messages in thread
From: Xiaoguang Wang @ 2022-02-08  6:37 UTC
  To: linux-scsi, target-devel; +Cc: martin.petersen, bostroesser, kanie

tcm_loop uses a workqueue to end requests, which is non-interrupt context,
so we can complete the request directly instead of deferring it to softirq.
The current call graph looks like this:
    blk_mq_complete_request_remote+1
    blk_mq_complete_request+14
    target_put_sess_cmd+294
    transport_generic_free_cmd+93
    target_complete_ok_work+251
    process_one_work+482
    worker_thread+80
    kthread+361
    ret_from_fork+31

Performance was evaluated with tcm_loop and tcmu (file backstore), using
this fio job:
  [global]
  filename=/dev/sdb
  direct=1
  runtime=30
  thread=1
  norandommap=1
  time_based
  numjobs=1
  rw=randread
  iodepth=32
  ioengine=libaio

Without this patch:
bs 4k:
   READ: bw=319MiB/s (334MB/s), 319MiB/s-319MiB/s (334MB/s-334MB/s),
io=9563MiB (10.0GB), run=30001-30001msec

bs 8k:
   READ: bw=611MiB/s (641MB/s), 611MiB/s-611MiB/s (641MB/s-641MB/s),
io=17.9GiB (19.2GB), run=30001-30001msec

bs 16k:
   READ: bw=1109MiB/s (1163MB/s), 1109MiB/s-1109MiB/s (1163MB/s-1163MB/s),
io=32.5GiB (34.9GB), run=30001-30001msec

bs 32k:
   READ: bw=2200MiB/s (2306MB/s), 2200MiB/s-2200MiB/s (2306MB/s-2306MB/s),
io=64.4GiB (69.2GB), run=30001-30001msec

With this patch:
bs 4k:
   READ: bw=344MiB/s (361MB/s), 344MiB/s-344MiB/s (361MB/s-361MB/s),
io=10.1GiB (10.8GB), run=30001-30001msec

bs 8k:
   READ: bw=651MiB/s (682MB/s), 651MiB/s-651MiB/s (682MB/s-682MB/s),
io=19.1GiB (20.5GB), run=30001-30001msec

bs 16k:
   READ: bw=1248MiB/s (1308MB/s), 1248MiB/s-1248MiB/s (1308MB/s-1308MB/s),
io=36.6GiB (39.3GB), run=30001-30001msec

bs 32k:
   READ: bw=2456MiB/s (2576MB/s), 2456MiB/s-2456MiB/s (2576MB/s-2576MB/s),
io=71.0GiB (77.3GB), run=30001-30001msec

Throughput improves by roughly 6.5% to 12.5% across the tested block sizes.

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 drivers/target/loopback/tcm_loop.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
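
Note for readers (not part of the commit): the relative gain at each
block size can be derived from the fio bandwidth figures quoted in the
commit message. The sketch below only redoes that arithmetic. The struct
and function names are made up for illustration, and the numbers are
copied from the results above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One fio result pair, bandwidth in MiB/s as reported above. */
struct bw_sample {
	int bs_kib;		/* block size in KiB */
	double before;		/* without the patch */
	double after;		/* with the patch */
};

static const struct bw_sample samples[] = {
	{  4,  319.0,  344.0 },
	{  8,  611.0,  651.0 },
	{ 16, 1109.0, 1248.0 },
	{ 32, 2200.0, 2456.0 },
};

/* Relative throughput gain in percent for one sample. */
static double gain_percent(const struct bw_sample *s)
{
	assert(s->before > 0.0);
	return (s->after - s->before) / s->before * 100.0;
}

/* Print one line per block size with the computed gain. */
static void print_gains(void)
{
	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("bs %2dk: %6.1f -> %6.1f MiB/s (+%.1f%%)\n",
		       samples[i].bs_kib, samples[i].before,
		       samples[i].after, gain_percent(&samples[i]));
}
```

Calling print_gains() from a main() gives gains of roughly 7.8%, 6.5%,
12.5% and 11.6% for 4k, 8k, 16k and 32k. These come from a single
30-second run per block size, so they are best read as an indication
rather than a precise measurement.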

diff --git a/drivers/target/loopback/tcm_loop.c b/drivers/target/loopback/tcm_loop.c
index 4407b56aa6d1..ce414fbdbae6 100644
--- a/drivers/target/loopback/tcm_loop.c
+++ b/drivers/target/loopback/tcm_loop.c
@@ -70,8 +70,12 @@ static void tcm_loop_release_cmd(struct se_cmd *se_cmd)
 
 	if (se_cmd->se_cmd_flags & SCF_SCSI_TMR_CDB)
 		kmem_cache_free(tcm_loop_cmd_cache, tl_cmd);
-	else
-		scsi_done(sc);
+	else {
+		if (unlikely(in_interrupt()))
+			scsi_done(sc);
+		else
+			scsi_done_direct(sc);
+	}
 }
 
 static int tcm_loop_show_info(struct seq_file *m, struct Scsi_Host *host)
-- 
2.14.4.44.g2045bb6



* Re: [PATCH 1/2] scsi: add scsi_done_direct() helper
  2022-02-08  6:37 [PATCH 1/2] scsi: add scsi_done_direct() helper Xiaoguang Wang
  2022-02-08  6:37 ` [PATCH 2/2] scsi: target: tcm_loop: use scsi_done_direct() Xiaoguang Wang
@ 2022-02-11 21:49 ` Martin K. Petersen
  1 sibling, 0 replies; 3+ messages in thread
From: Martin K. Petersen @ 2022-02-11 21:49 UTC
  To: Xiaoguang Wang
  Cc: linux-scsi, target-devel, martin.petersen, bostroesser, kanie


Xiaoguang,

> For SCSI commands that are known to complete in non-interrupt
> context, scsi_done_direct(), which calls blk_mq_complete_request_direct(),
> can be used to complete them directly instead of deferring the
> completion to softirq, which can improve throughput.

https://lore.kernel.org/all/20220201210954.570896-1-sebastian@breakpoint.cc/

-- 
Martin K. Petersen	Oracle Linux Engineering
