* [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path
@ 2018-06-24 14:03 Ming Lei
2018-06-24 14:03 ` [PATCH V2 1/2] scsi: read host_busy via scsi_host_busy() Ming Lei
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Ming Lei @ 2018-06-24 14:03 UTC (permalink / raw)
To: linux-scsi
Cc: Jens Axboe, linux-block, Ming Lei, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
Hi,
This patches removes the expensive atomic opeation on host-wide counter
of .host_busy for scsi-mq, and it is observed that IOPS can be increased by
15% with this change in IO test over scsi_debug.
V2:
- merge patch 1&2 into one patch
- run more tests(blktests, xfstests, ...), and not see regression
Ming Lei (2):
scsi: read host_busy via scsi_host_busy()
scsi: avoid to hold host-wide counter of host_busy for scsi_mq
drivers/scsi/advansys.c | 8 ++++----
drivers/scsi/hosts.c | 32 +++++++++++++++++++++++++++++++
drivers/scsi/libsas/sas_scsi_host.c | 4 ++--
drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
drivers/scsi/mpt3sas/mpt3sas_base.c | 4 ++--
drivers/scsi/qlogicpti.c | 2 +-
drivers/scsi/scsi.c | 2 +-
drivers/scsi/scsi_error.c | 6 +++---
drivers/scsi/scsi_lib.c | 23 ++++++++++++++++------
drivers/scsi/scsi_sysfs.c | 2 +-
include/scsi/scsi_host.h | 1 +
11 files changed, 65 insertions(+), 21 deletions(-)
Cc: Omar Sandoval <osandov@fb.com>,
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
--
2.9.5
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH V2 1/2] scsi: read host_busy via scsi_host_busy()
2018-06-24 14:03 [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path Ming Lei
@ 2018-06-24 14:03 ` Ming Lei
2018-06-25 15:37 ` Bart Van Assche
2018-06-24 14:03 ` [PATCH V2 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq Ming Lei
2018-06-26 16:53 ` [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path Martin K. Petersen
2 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2018-06-24 14:03 UTC (permalink / raw)
To: linux-scsi
Cc: Jens Axboe, linux-block, Ming Lei, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
No functional change.
Just introduce scsi_host_busy() and replace the direct read of
scsi_host->host_busy with this new API.
Cc: Omar Sandoval <osandov@fb.com>,
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/scsi/advansys.c | 8 ++++----
drivers/scsi/hosts.c | 10 ++++++++++
drivers/scsi/libsas/sas_scsi_host.c | 4 ++--
drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
drivers/scsi/mpt3sas/mpt3sas_base.c | 4 ++--
drivers/scsi/qlogicpti.c | 2 +-
drivers/scsi/scsi.c | 2 +-
drivers/scsi/scsi_error.c | 6 +++---
drivers/scsi/scsi_sysfs.c | 2 +-
include/scsi/scsi_host.h | 1 +
10 files changed, 26 insertions(+), 15 deletions(-)
diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c
index c9a52905070e..713f69033f20 100644
--- a/drivers/scsi/advansys.c
+++ b/drivers/scsi/advansys.c
@@ -2416,8 +2416,8 @@ static void asc_prt_scsi_host(struct Scsi_Host *s)
struct asc_board *boardp = shost_priv(s);
printk("Scsi_Host at addr 0x%p, device %s\n", s, dev_name(boardp->dev));
- printk(" host_busy %u, host_no %d,\n",
- atomic_read(&s->host_busy), s->host_no);
+ printk(" host_busy %d, host_no %d,\n",
+ scsi_host_busy(s), s->host_no);
printk(" base 0x%lx, io_port 0x%lx, irq %d,\n",
(ulong)s->base, (ulong)s->io_port, boardp->irq);
@@ -3182,8 +3182,8 @@ static void asc_prt_driver_conf(struct seq_file *m, struct Scsi_Host *shost)
shost->host_no);
seq_printf(m,
- " host_busy %u, max_id %u, max_lun %llu, max_channel %u\n",
- atomic_read(&shost->host_busy), shost->max_id,
+ " host_busy %d, max_id %u, max_lun %llu, max_channel %u\n",
+ scsi_host_busy(shost), shost->max_id,
shost->max_lun, shost->max_channel);
seq_printf(m,
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 3771e59a9fae..ea4b0bb0c1cd 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -564,6 +564,16 @@ struct Scsi_Host *scsi_host_get(struct Scsi_Host *shost)
EXPORT_SYMBOL(scsi_host_get);
/**
+ * scsi_host_busy - Return the host busy counter
+ * @shost: Pointer to Scsi_Host to inc.
+ **/
+int scsi_host_busy(struct Scsi_Host *shost)
+{
+ return atomic_read(&shost->host_busy);
+}
+EXPORT_SYMBOL(scsi_host_busy);
+
+/**
* scsi_host_put - dec a Scsi_Host ref count
* @shost: Pointer to Scsi_Host to dec.
**/
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index ceab5e5c41c2..33229348dcb6 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -759,7 +759,7 @@ void sas_scsi_recover_host(struct Scsi_Host *shost)
spin_unlock_irq(shost->host_lock);
SAS_DPRINTK("Enter %s busy: %d failed: %d\n",
- __func__, atomic_read(&shost->host_busy), shost->host_failed);
+ __func__, scsi_host_busy(shost), shost->host_failed);
/*
* Deal with commands that still have SAS tasks (i.e. they didn't
* complete via the normal sas_task completion mechanism),
@@ -801,7 +801,7 @@ void sas_scsi_recover_host(struct Scsi_Host *shost)
goto retry;
SAS_DPRINTK("--- Exit %s: busy: %d failed: %d tries: %d\n",
- __func__, atomic_read(&shost->host_busy),
+ __func__, scsi_host_busy(shost),
shost->host_failed, tries);
}
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index e6ba02793610..9aa9590c5373 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -2834,7 +2834,7 @@ static int megasas_reset_bus_host(struct scsi_cmnd *scmd)
"SCSI command pointer: (%p)\t SCSI host state: %d\t"
" SCSI host busy: %d\t FW outstanding: %d\n",
scmd, scmd->device->host->shost_state,
- atomic_read((atomic_t *)&scmd->device->host->host_busy),
+ scsi_host_busy(scmd->device->host),
atomic_read(&instance->fw_outstanding));
/*
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 2053970fc9f8..dc41bd3de08a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -3250,7 +3250,7 @@ _base_recovery_check(struct MPT3SAS_ADAPTER *ioc)
* See _wait_for_commands_to_complete() call with regards to this code.
*/
if (ioc->shost_recovery && ioc->pending_io_count) {
- ioc->pending_io_count = atomic_read(&ioc->shost->host_busy);
+ ioc->pending_io_count = scsi_host_busy(ioc->shost);
if (ioc->pending_io_count == 0)
wake_up(&ioc->reset_wq);
}
@@ -6857,7 +6857,7 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
return;
/* pending command count */
- ioc->pending_io_count = atomic_read(&ioc->shost->host_busy);
+ ioc->pending_io_count = scsi_host_busy(ioc->shost);
if (!ioc->pending_io_count)
return;
diff --git a/drivers/scsi/qlogicpti.c b/drivers/scsi/qlogicpti.c
index 8578e566ab41..9d09228eee28 100644
--- a/drivers/scsi/qlogicpti.c
+++ b/drivers/scsi/qlogicpti.c
@@ -959,7 +959,7 @@ static inline void update_can_queue(struct Scsi_Host *host, u_int in_ptr, u_int
/* Temporary workaround until bug is found and fixed (one bug has been found
already, but fixing it makes things even worse) -jj */
int num_free = QLOGICPTI_REQ_QUEUE_LEN - REQ_QUEUE_DEPTH(in_ptr, out_ptr) - 64;
- host->can_queue = atomic_read(&host->host_busy) + num_free;
+ host->can_queue = scsi_host_busy(host) + num_free;
host->sg_tablesize = QLOGICPTI_MAX_SG(num_free);
}
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 4c60c260c5da..a15d078321a2 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -167,7 +167,7 @@ void scsi_log_completion(struct scsi_cmnd *cmd, int disposition)
if (level > 3)
scmd_printk(KERN_INFO, cmd,
"scsi host busy %d failed %d\n",
- atomic_read(&cmd->device->host->host_busy),
+ scsi_host_busy(cmd->device->host),
cmd->device->host->host_failed);
}
}
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 8932ae81a15a..6a014fd15fe9 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -66,7 +66,7 @@ void scsi_eh_wakeup(struct Scsi_Host *shost)
{
lockdep_assert_held(shost->host_lock);
- if (atomic_read(&shost->host_busy) == shost->host_failed) {
+ if (scsi_host_busy(shost) == shost->host_failed) {
trace_scsi_eh_wakeup(shost);
wake_up_process(shost->ehandler);
SCSI_LOG_ERROR_RECOVERY(5, shost_printk(KERN_INFO, shost,
@@ -2155,7 +2155,7 @@ int scsi_error_handler(void *data)
break;
if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) ||
- shost->host_failed != atomic_read(&shost->host_busy)) {
+ shost->host_failed != scsi_host_busy(shost)) {
SCSI_LOG_ERROR_RECOVERY(1,
shost_printk(KERN_INFO, shost,
"scsi_eh_%d: sleeping\n",
@@ -2170,7 +2170,7 @@ int scsi_error_handler(void *data)
"scsi_eh_%d: waking up %d/%d/%d\n",
shost->host_no, shost->host_eh_scheduled,
shost->host_failed,
- atomic_read(&shost->host_busy)));
+ scsi_host_busy(shost)));
/*
* We have a host that is failing for some reason. Figure out
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 7943b762c12d..de122354d09a 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -382,7 +382,7 @@ static ssize_t
show_host_busy(struct device *dev, struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(dev);
- return snprintf(buf, 20, "%d\n", atomic_read(&shost->host_busy));
+ return snprintf(buf, 20, "%d\n", scsi_host_busy(shost));
}
static DEVICE_ATTR(host_busy, S_IRUGO, show_host_busy, NULL);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 53b485fe9b67..5ea06d310a25 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -758,6 +758,7 @@ extern void scsi_scan_host(struct Scsi_Host *);
extern void scsi_rescan_device(struct device *);
extern void scsi_remove_host(struct Scsi_Host *);
extern struct Scsi_Host *scsi_host_get(struct Scsi_Host *);
+extern int scsi_host_busy(struct Scsi_Host *shost);
extern void scsi_host_put(struct Scsi_Host *t);
extern struct Scsi_Host *scsi_host_lookup(unsigned short);
extern const char *scsi_host_state_name(enum scsi_host_state);
--
2.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH V2 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq
2018-06-24 14:03 [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path Ming Lei
2018-06-24 14:03 ` [PATCH V2 1/2] scsi: read host_busy via scsi_host_busy() Ming Lei
@ 2018-06-24 14:03 ` Ming Lei
2018-06-25 15:38 ` Bart Van Assche
2018-06-29 16:20 ` [V2, " Guenter Roeck
2018-06-26 16:53 ` [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path Martin K. Petersen
2 siblings, 2 replies; 10+ messages in thread
From: Ming Lei @ 2018-06-24 14:03 UTC (permalink / raw)
To: linux-scsi
Cc: Jens Axboe, linux-block, Ming Lei, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
It isn't necessary to check the host depth in scsi_queue_rq() any more
since it has been respected by blk-mq before calling scsi_queue_rq() via
getting driver tag.
Lots of LUNs may attach to same host, and per-host IOPS may reach millions
level, so we should avoid to this expensive atomic operations on the
hostwide counter in IO path.
This patch implemens scsi_host_busy() via blk_mq_tagset_busy_iter() for
reading the count of busy IOs for scsi_mq.
It is observed that IOPS is increased by 15% in IO test on scsi_debug
(32 LUNs, 32 submit queues, 1024 can_queue, libaio/dio) in one
dual-socket system.
Cc: Omar Sandoval <osandov@fb.com>,
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/scsi/hosts.c | 24 +++++++++++++++++++++++-
drivers/scsi/scsi_lib.c | 23 +++++++++++++++++------
2 files changed, 40 insertions(+), 7 deletions(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index ea4b0bb0c1cd..f02dcc875a09 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -563,13 +563,35 @@ struct Scsi_Host *scsi_host_get(struct Scsi_Host *shost)
}
EXPORT_SYMBOL(scsi_host_get);
+struct scsi_host_mq_in_flight {
+ int cnt;
+};
+
+static void scsi_host_check_in_flight(struct request *rq, void *data,
+ bool reserved)
+{
+ struct scsi_host_mq_in_flight *in_flight = data;
+
+ if (blk_mq_request_started(rq))
+ in_flight->cnt++;
+}
+
/**
* scsi_host_busy - Return the host busy counter
* @shost: Pointer to Scsi_Host to inc.
**/
int scsi_host_busy(struct Scsi_Host *shost)
{
- return atomic_read(&shost->host_busy);
+ struct scsi_host_mq_in_flight in_flight = {
+ .cnt = 0,
+ };
+
+ if (!shost->use_blk_mq)
+ return atomic_read(&shost->host_busy);
+
+ blk_mq_tagset_busy_iter(&shost->tag_set, scsi_host_check_in_flight,
+ &in_flight);
+ return in_flight.cnt;
}
EXPORT_SYMBOL(scsi_host_busy);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 41e9ac9fc138..1c79c86184b1 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -345,7 +345,8 @@ static void scsi_dec_host_busy(struct Scsi_Host *shost)
unsigned long flags;
rcu_read_lock();
- atomic_dec(&shost->host_busy);
+ if (!shost->use_blk_mq)
+ atomic_dec(&shost->host_busy);
if (unlikely(scsi_host_in_recovery(shost))) {
spin_lock_irqsave(shost->host_lock, flags);
if (shost->host_failed || shost->host_eh_scheduled)
@@ -444,7 +445,12 @@ static inline bool scsi_target_is_busy(struct scsi_target *starget)
static inline bool scsi_host_is_busy(struct Scsi_Host *shost)
{
- if (shost->can_queue > 0 &&
+ /*
+ * blk-mq can handle host queue busy efficiently via host-wide driver
+ * tag allocation
+ */
+
+ if (!shost->use_blk_mq && shost->can_queue > 0 &&
atomic_read(&shost->host_busy) >= shost->can_queue)
return true;
if (atomic_read(&shost->host_blocked) > 0)
@@ -1550,9 +1556,12 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
if (scsi_host_in_recovery(shost))
return 0;
- busy = atomic_inc_return(&shost->host_busy) - 1;
+ if (!shost->use_blk_mq)
+ busy = atomic_inc_return(&shost->host_busy) - 1;
+ else
+ busy = 0;
if (atomic_read(&shost->host_blocked) > 0) {
- if (busy)
+ if (busy || scsi_host_busy(shost))
goto starved;
/*
@@ -1566,7 +1575,7 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
"unblocking host at zero depth\n"));
}
- if (shost->can_queue > 0 && busy >= shost->can_queue)
+ if (!shost->use_blk_mq && shost->can_queue > 0 && busy >= shost->can_queue)
goto starved;
if (shost->host_self_blocked)
goto starved;
@@ -1652,7 +1661,9 @@ static void scsi_kill_request(struct request *req, struct request_queue *q)
* with the locks as normal issue path does.
*/
atomic_inc(&sdev->device_busy);
- atomic_inc(&shost->host_busy);
+
+ if (!shost->use_blk_mq)
+ atomic_inc(&shost->host_busy);
if (starget->can_queue > 0)
atomic_inc(&starget->target_busy);
--
2.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH V2 1/2] scsi: read host_busy via scsi_host_busy()
2018-06-24 14:03 ` [PATCH V2 1/2] scsi: read host_busy via scsi_host_busy() Ming Lei
@ 2018-06-25 15:37 ` Bart Van Assche
0 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2018-06-25 15:37 UTC (permalink / raw)
To: Ming Lei, linux-scsi@vger.kernel.org
Cc: Jens Axboe, linux-block@vger.kernel.org, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman
On 06/24/18 07:04, Ming Lei wrote:
> No functional change.
>
> Just introduce scsi_host_busy() and replace the direct read of
> scsi_host->host_busy with this new API.
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V2 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq
2018-06-24 14:03 ` [PATCH V2 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq Ming Lei
@ 2018-06-25 15:38 ` Bart Van Assche
2018-06-29 16:20 ` [V2, " Guenter Roeck
1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2018-06-25 15:38 UTC (permalink / raw)
To: Ming Lei, linux-scsi@vger.kernel.org
Cc: Jens Axboe, linux-block@vger.kernel.org, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman
On 06/24/18 07:04, Ming Lei wrote:
> It isn't necessary to check the host depth in scsi_queue_rq() any more
> since it has been respected by blk-mq before calling scsi_queue_rq() via
> getting driver tag.
>
> Lots of LUNs may attach to same host, and per-host IOPS may reach millions
> level, so we should avoid to this expensive atomic operations on the
> hostwide counter in IO path.
>
> This patch implemens scsi_host_busy() via blk_mq_tagset_busy_iter() for
> reading the count of busy IOs for scsi_mq.
>
> It is observed that IOPS is increased by 15% in IO test on scsi_debug
> (32 LUNs, 32 submit queues, 1024 can_queue, libaio/dio) in one
> dual-socket system.
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path
2018-06-24 14:03 [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path Ming Lei
2018-06-24 14:03 ` [PATCH V2 1/2] scsi: read host_busy via scsi_host_busy() Ming Lei
2018-06-24 14:03 ` [PATCH V2 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq Ming Lei
@ 2018-06-26 16:53 ` Martin K. Petersen
2 siblings, 0 replies; 10+ messages in thread
From: Martin K. Petersen @ 2018-06-26 16:53 UTC (permalink / raw)
To: Ming Lei
Cc: linux-scsi, Jens Axboe, linux-block, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
Ming,
> This patches removes the expensive atomic opeation on host-wide
> counter of .host_busy for scsi-mq, and it is observed that IOPS can be
> increased by 15% with this change in IO test over scsi_debug.
Applied to 4.19/scsi-queue, thanks!
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [V2, 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq
2018-06-24 14:03 ` [PATCH V2 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq Ming Lei
2018-06-25 15:38 ` Bart Van Assche
@ 2018-06-29 16:20 ` Guenter Roeck
2018-06-30 1:12 ` Ming Lei
2018-06-30 1:30 ` Ming Lei
1 sibling, 2 replies; 10+ messages in thread
From: Guenter Roeck @ 2018-06-29 16:20 UTC (permalink / raw)
To: Ming Lei
Cc: linux-scsi, Jens Axboe, linux-block, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
Hi,
On Sun, Jun 24, 2018 at 10:03:27PM +0800, Ming Lei wrote:
> It isn't necessary to check the host depth in scsi_queue_rq() any more
> since it has been respected by blk-mq before calling scsi_queue_rq() via
> getting driver tag.
>
> Lots of LUNs may attach to same host, and per-host IOPS may reach millions
> level, so we should avoid to this expensive atomic operations on the
> hostwide counter in IO path.
>
> This patch implemens scsi_host_busy() via blk_mq_tagset_busy_iter() for
> reading the count of busy IOs for scsi_mq.
>
> It is observed that IOPS is increased by 15% in IO test on scsi_debug
> (32 LUNs, 32 submit queues, 1024 can_queue, libaio/dio) in one
> dual-socket system.
>
This patch breaks two of my qemu test builds in -next: parisc:defconfig
and arm:versatilepb-scsi:versatile_defconfig (which is versatilepb booting
from scsi disk). The symptom is the same for both: Boot stalls after scsi
bus initialization.
arm:
sym53c8xx 0000:00:0c.0: enabling device (0100 -> 0103)
sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 66
sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi host0: sym-2.2.3
random: fast init done
[stalls]
parisc:
sym53c8xx 0000:00:00.0: enabling SERR and PARITY (0107 -> 0147)
sym0: <895a> rev 0x0 at pci 0000:00:00.0 irq 17
sym0: PA-RISC Firmware, ID 7, Fast-40, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi host0: sym-2.2.3
random: fast init done
[stalls]
Reverting the patch fixes the problem. Bisect log is attached.
Guenter
---
# bad: [e3c7283c19cd9ba999794f38007389ac83408a78] Add linux-next specific files for 20180629
# good: [7daf201d7fe8334e2d2364d4e8ed3394ec9af819] Linux 4.18-rc2
git bisect start 'HEAD' 'v4.18-rc2'
# good: [74f0b73f82214595f9a457c71c16c5b73daaa12b] Merge remote-tracking branch 'crypto/master'
git bisect good 74f0b73f82214595f9a457c71c16c5b73daaa12b
# good: [f4f7cb745da2544fd23fec5e33782b453121abb4] Merge remote-tracking branch 'spi/for-next'
git bisect good f4f7cb745da2544fd23fec5e33782b453121abb4
# good: [11fce5f62f76bbe4fa35c17d9d968c50a2add11d] Merge remote-tracking branch 'staging/staging-next'
git bisect good 11fce5f62f76bbe4fa35c17d9d968c50a2add11d
# bad: [3b7557dfc1a37bc3e332c96e3e84c8d2606116b8] Merge remote-tracking branch 'xarray/xarray'
git bisect bad 3b7557dfc1a37bc3e332c96e3e84c8d2606116b8
# bad: [6cb742aeb685642ea6b829b06f4655df0f9b521c] Merge remote-tracking branch 'pinctrl/for-next'
git bisect bad 6cb742aeb685642ea6b829b06f4655df0f9b521c
# good: [c65be1a63f1df224c8f22d72b9ec824241ada585] scsi: core: check for equality of result byte values
git bisect good c65be1a63f1df224c8f22d72b9ec824241ada585
# bad: [f933a84d83b933d5c343e76ee6c18768c3bac0d6] Merge remote-tracking branch 'libata/for-next'
git bisect bad f933a84d83b933d5c343e76ee6c18768c3bac0d6
# bad: [bc64692916f008be1f2c0c5a7562aa826a6444aa] Merge remote-tracking branch 'scsi/for-next'
git bisect bad bc64692916f008be1f2c0c5a7562aa826a6444aa
# good: [cd5f5a2b20414c23b16b330f1ef54a941dc90d3d] Merge remote-tracking branch 'slave-dma/next'
git bisect good cd5f5a2b20414c23b16b330f1ef54a941dc90d3d
# good: [c84b023a4c1461498abf0eda54f60e2fd64a1ca2] scsi: read host_busy via scsi_host_busy()
git bisect good c84b023a4c1461498abf0eda54f60e2fd64a1ca2
# bad: [0c218e16a8501cfda30f498217b434976cb62fc5] scsi: tcmu: Don't pass KERN_ERR to pr_err
git bisect bad 0c218e16a8501cfda30f498217b434976cb62fc5
# bad: [328728630d9f2bf14b82ca30b5e47489beefe361] scsi: core: avoid host-wide host_busy counter for scsi_mq
git bisect bad 328728630d9f2bf14b82ca30b5e47489beefe361
# first bad commit: [328728630d9f2bf14b82ca30b5e47489beefe361] scsi: core: avoid host-wide host_busy counter for scsi_mq
---
Example qemu trace (with -d exec; this is repeated over and over again, with other
code - mostly timer handling - in between):
Chain 0: 0x7fffca60f8c0 [0000000000000004/00000000103fb5c4/0x40003] blk_add_timer
Chain 0: 0x7fffca6103c0 [0000000000000004/00000000103f14fc/0x40003] blk_start_request
Chain 0: 0x7fffca6105c0 [0000000000000004/00000000103f4954/0x40003] blk_queue_start_tag
Chain 0: 0x7fffca610b00 [0000000000000004/00000000104ec990/0x40003] scsi_request_fn
Trace 0: 0x7fffca610dc0 [0000000000000004/00000000104ec9a4/0x40003] scsi_request_fn
Trace 0: 0x7fffca612040 [0000000000000004/00000000104ecaf4/0x40003] scsi_request_fn
Trace 0: 0x7fffca612240 [0000000000000004/00000000104ecb04/0x40003] scsi_request_fn
Chain 0: 0x7fffca6274c0 [0000000000000004/00000000104ecb20/0x40003] scsi_request_fn
Trace 0: 0x7fffca627780 [0000000000000004/00000000104ecbdc/0x40003] scsi_request_fn
Trace 0: 0x7fffca627ec0 [0000000000000004/00000000104ecbf0/0x40003] scsi_request_fn
Trace 0: 0x7fffca619480 [0000000000000004/00000000104ebd28/0x40003] scsi_dec_host_busy
Trace 0: 0x7fffca619680 [0000000000000004/00000000104ebd38/0x40003] scsi_dec_host_busy
Chain 0: 0x7fffca628000 [0000000000000004/00000000104ecbf8/0x40003] scsi_request_fn
Trace 0: 0x7fffca628480 [0000000000000004/00000000104eca8c/0x40003] scsi_request_fn
Chain 0: 0x7fffca61ac40 [0000000000000004/00000000103f13a4/0x40003] blk_requeue_request
Chain 0: 0x7fffca61b580 [0000000000000004/00000000103f13d4/0x40003] blk_requeue_request
Trace 0: 0x7fffca61b6c0 [0000000000000004/00000000103f13d8/0x40003] blk_requeue_request
Trace 0: 0x7fffca61b8c0 [0000000000000004/00000000103f13e8/0x40003] blk_requeue_request
Trace 0: 0x7fffca61cc00 [0000000000000004/00000000103f4c0c/0x40003] blk_queue_end_tag
Trace 0: 0x7fffca61ce80 [0000000000000004/00000000103f4c28/0x40003] blk_queue_end_tag
Chain 0: 0x7fffca61d1c0 [0000000000000004/00000000103f142c/0x40003] blk_requeue_request
Chain 0: 0x7fffca5f5440 [0000000000000004/00000000103ee2d0/0x40003] __elv_add_request
Chain 0: 0x7fffca61e100 [0000000000000004/00000000103ee3b0/0x40003] elv_requeue_request
Chain 0: 0x7fffca61e400 [0000000000000004/00000000103f140c/0x40003] blk_requeue_request
Chain 0: 0x7fffca6285c0 [0000000000000004/00000000104eca98/0x40003] scsi_request_fn
Trace 0: 0x7fffca628700 [0000000000000004/00000000104eca9c/0x40003] scsi_request_fn
Trace 0: 0x7fffca628900 [0000000000000004/00000000104ecaac/0x40003] scsi_request_fn
Chain 0: 0x7fffca61fd00 [0000000000000004/00000000103ef5ec/0x40003] blk_delay_queue
Chain 0: 0x7fffca620540 [0000000000000004/00000000101d6334/0x40003] __msecs_to_jiffies
Chain 0: 0x7fffca620940 [0000000000000004/00000000103ef62c/0x40003] blk_delay_queue
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [V2, 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq
2018-06-29 16:20 ` [V2, " Guenter Roeck
@ 2018-06-30 1:12 ` Ming Lei
2018-06-30 1:30 ` Ming Lei
1 sibling, 0 replies; 10+ messages in thread
From: Ming Lei @ 2018-06-30 1:12 UTC (permalink / raw)
To: Guenter Roeck
Cc: linux-scsi, Jens Axboe, linux-block, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
On Fri, Jun 29, 2018 at 09:20:54AM -0700, Guenter Roeck wrote:
> Hi,
>
> On Sun, Jun 24, 2018 at 10:03:27PM +0800, Ming Lei wrote:
> > It isn't necessary to check the host depth in scsi_queue_rq() any more
> > since it has been respected by blk-mq before calling scsi_queue_rq() via
> > getting driver tag.
> >
> > Lots of LUNs may attach to same host, and per-host IOPS may reach millions
> > level, so we should avoid to this expensive atomic operations on the
> > hostwide counter in IO path.
> >
> > This patch implemens scsi_host_busy() via blk_mq_tagset_busy_iter() for
> > reading the count of busy IOs for scsi_mq.
> >
> > It is observed that IOPS is increased by 15% in IO test on scsi_debug
> > (32 LUNs, 32 submit queues, 1024 can_queue, libaio/dio) in one
> > dual-socket system.
> >
>
> This patch breaks two of my qemu test builds in -next: parisc:defconfig
> and arm:versatilepb-scsi:versatile_defconfig (which is versatilepb booting
> from scsi disk). The symptom is the same for both: Boot stalls after scsi
> bus initialization.
>
> arm:
>
> sym53c8xx 0000:00:0c.0: enabling device (0100 -> 0103)
> sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 66
> sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
> sym0: SCSI BUS has been reset.
> scsi host0: sym-2.2.3
> random: fast init done
> [stalls]
>
> parisc:
>
> sym53c8xx 0000:00:00.0: enabling SERR and PARITY (0107 -> 0147)
> sym0: <895a> rev 0x0 at pci 0000:00:00.0 irq 17
> sym0: PA-RISC Firmware, ID 7, Fast-40, LVD, parity checking
> sym0: SCSI BUS has been reset.
> scsi host0: sym-2.2.3
> random: fast init done
> [stalls]
>
> Reverting the patch fixes the problem. Bisect log is attached.
Hi Guenter,
Thanks for your test & report!
This looks a bit weird, I need to take a close look given this
patch supposes to be nop for non-blk-mq, which is exactly your
case.
Thanks,
Ming
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [V2, 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq
2018-06-29 16:20 ` [V2, " Guenter Roeck
2018-06-30 1:12 ` Ming Lei
@ 2018-06-30 1:30 ` Ming Lei
2018-06-30 1:41 ` Guenter Roeck
1 sibling, 1 reply; 10+ messages in thread
From: Ming Lei @ 2018-06-30 1:30 UTC (permalink / raw)
To: Guenter Roeck
Cc: linux-scsi, Jens Axboe, linux-block, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
On Fri, Jun 29, 2018 at 09:20:54AM -0700, Guenter Roeck wrote:
> Hi,
>
> On Sun, Jun 24, 2018 at 10:03:27PM +0800, Ming Lei wrote:
> > It isn't necessary to check the host depth in scsi_queue_rq() any more
> > since it has been respected by blk-mq before calling scsi_queue_rq() via
> > getting driver tag.
> >
> > Lots of LUNs may attach to same host, and per-host IOPS may reach millions
> > level, so we should avoid to this expensive atomic operations on the
> > hostwide counter in IO path.
> >
> > This patch implemens scsi_host_busy() via blk_mq_tagset_busy_iter() for
> > reading the count of busy IOs for scsi_mq.
> >
> > It is observed that IOPS is increased by 15% in IO test on scsi_debug
> > (32 LUNs, 32 submit queues, 1024 can_queue, libaio/dio) in one
> > dual-socket system.
> >
>
> This patch breaks two of my qemu test builds in -next: parisc:defconfig
> and arm:versatilepb-scsi:versatile_defconfig (which is versatilepb booting
> from scsi disk). The symptom is the same for both: Boot stalls after scsi
> bus initialization.
>
> arm:
>
> sym53c8xx 0000:00:0c.0: enabling device (0100 -> 0103)
> sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 66
> sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
> sym0: SCSI BUS has been reset.
> scsi host0: sym-2.2.3
> random: fast init done
> [stalls]
>
> parisc:
>
> sym53c8xx 0000:00:00.0: enabling SERR and PARITY (0107 -> 0147)
> sym0: <895a> rev 0x0 at pci 0000:00:00.0 irq 17
> sym0: PA-RISC Firmware, ID 7, Fast-40, LVD, parity checking
> sym0: SCSI BUS has been reset.
> scsi host0: sym-2.2.3
> random: fast init done
> [stalls]
>
> Reverting the patch fixes the problem. Bisect log is attached.
Please test the patch of 'scsi: fix scsi_host_queue_ready', which has
been posted on linux-scsi list and CCed to you.
Thanks,
Ming
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [V2, 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq
2018-06-30 1:30 ` Ming Lei
@ 2018-06-30 1:41 ` Guenter Roeck
0 siblings, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2018-06-30 1:41 UTC (permalink / raw)
To: Ming Lei
Cc: linux-scsi, Jens Axboe, linux-block, Omar Sandoval,
Martin K. Petersen, James Bottomley, Christoph Hellwig, Don Brace,
Kashyap Desai, Mike Snitzer, Hannes Reinecke, Laurence Oberman,
Bart Van Assche
On Sat, Jun 30, 2018 at 09:30:30AM +0800, Ming Lei wrote:
> On Fri, Jun 29, 2018 at 09:20:54AM -0700, Guenter Roeck wrote:
> > Hi,
> >
> > On Sun, Jun 24, 2018 at 10:03:27PM +0800, Ming Lei wrote:
> > > It isn't necessary to check the host depth in scsi_queue_rq() any more
> > > since it has been respected by blk-mq before calling scsi_queue_rq() via
> > > getting driver tag.
> > >
> > > Lots of LUNs may attach to same host, and per-host IOPS may reach millions
> > > level, so we should avoid to this expensive atomic operations on the
> > > hostwide counter in IO path.
> > >
> > > This patch implemens scsi_host_busy() via blk_mq_tagset_busy_iter() for
> > > reading the count of busy IOs for scsi_mq.
> > >
> > > It is observed that IOPS is increased by 15% in IO test on scsi_debug
> > > (32 LUNs, 32 submit queues, 1024 can_queue, libaio/dio) in one
> > > dual-socket system.
> > >
> >
> > This patch breaks two of my qemu test builds in -next: parisc:defconfig
> > and arm:versatilepb-scsi:versatile_defconfig (which is versatilepb booting
> > from scsi disk). The symptom is the same for both: Boot stalls after scsi
> > bus initialization.
> >
> > arm:
> >
> > sym53c8xx 0000:00:0c.0: enabling device (0100 -> 0103)
> > sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 66
> > sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
> > sym0: SCSI BUS has been reset.
> > scsi host0: sym-2.2.3
> > random: fast init done
> > [stalls]
> >
> > parisc:
> >
> > sym53c8xx 0000:00:00.0: enabling SERR and PARITY (0107 -> 0147)
> > sym0: <895a> rev 0x0 at pci 0000:00:00.0 irq 17
> > sym0: PA-RISC Firmware, ID 7, Fast-40, LVD, parity checking
> > sym0: SCSI BUS has been reset.
> > scsi host0: sym-2.2.3
> > random: fast init done
> > [stalls]
> >
> > Reverting the patch fixes the problem. Bisect log is attached.
>
> Please test the patch of 'scsi: fix scsi_host_queue_ready', which has
> been posted on linux-scsi list and CCed to you.
>
Yes, that fixes the problem. Thanks for the quick turnaround!
Guenter
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2018-06-30 1:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-06-24 14:03 [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path Ming Lei
2018-06-24 14:03 ` [PATCH V2 1/2] scsi: read host_busy via scsi_host_busy() Ming Lei
2018-06-25 15:37 ` Bart Van Assche
2018-06-24 14:03 ` [PATCH V2 2/2] scsi: avoid to hold host-wide counter of host_busy for scsi_mq Ming Lei
2018-06-25 15:38 ` Bart Van Assche
2018-06-29 16:20 ` [V2, " Guenter Roeck
2018-06-30 1:12 ` Ming Lei
2018-06-30 1:30 ` Ming Lei
2018-06-30 1:41 ` Guenter Roeck
2018-06-26 16:53 ` [PATCH V2 0/2] scsi: scsi-mq: don't hold host_busy in IO path Martin K. Petersen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox