From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@fb.com>,
	linux-block@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>,
	linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	"James E . J . Bottomley" <jejb@linux.vnet.ibm.com>,
	Ming Lei <ming.lei@redhat.com>
Subject: [PATCH 00/14] blk-mq-sched: fix SCSI-MQ performance regression
Date: Tue,  1 Aug 2017 00:50:57 +0800
Message-ID: <20170731165111.11536-2-ming.lei@redhat.com>
In-Reply-To: <20170731165111.11536-1-ming.lei@redhat.com>

In Red Hat internal storage tests of the blk-mq I/O schedulers, we
found that performance is quite bad, especially for sequential
I/O on some multi-queue SCSI devices.

It turns out that one big issue causes the performance regression:
requests are still dequeued from the sw queue/scheduler queue even
when the lld's queue is busy, so I/O merging becomes quite difficult,
and sequential I/O degrades a lot.

The first five patches improve this situation and recover
some of the lost performance.

But it looks like they are still not enough. The remaining loss is
caused by the queue depth shared among all hw queues. For SCSI
devices, .cmd_per_lun defines the max number of pending I/Os on one
request queue, i.e. it is a per-request_queue depth. So during
dispatch, if one hctx is too busy to make progress, no other hctx can
dispatch either, because of the per-request_queue depth.
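The shared-depth effect can be sketched in a few lines of userspace C.
This is only a toy model under stated assumptions: struct toy_device,
toy_dispatch() and toy_complete() are hypothetical names, and the
single 'depth' counter merely stands in for the per-request_queue
limit described above.

```c
#include <assert.h>

#define TOY_NR_HCTX 4

/* Toy device: one depth budget shared by every hw queue. */
struct toy_device {
	unsigned int depth;			/* shared limit */
	unsigned int in_flight;			/* pending I/Os, all hctxs */
	unsigned int per_hctx[TOY_NR_HCTX];	/* pending I/Os per hctx */
};

/* Try to dispatch one request from hctx. Returns 1 on success, 0 if busy. */
static int toy_dispatch(struct toy_device *dev, unsigned int hctx)
{
	assert(hctx < TOY_NR_HCTX);
	if (dev->in_flight >= dev->depth)
		return 0;	/* budget exhausted, whichever hctx asks */
	dev->in_flight++;
	dev->per_hctx[hctx]++;
	return 1;
}

/* Complete one request previously dispatched from hctx. */
static void toy_complete(struct toy_device *dev, unsigned int hctx)
{
	assert(hctx < TOY_NR_HCTX);
	assert(dev->per_hctx[hctx] > 0);
	dev->per_hctx[hctx]--;
	dev->in_flight--;
}
```

With depth set to 3, three successful toy_dispatch() calls from hctx 0
make a following call from hctx 1 fail, even though hctx 1 has nothing
in flight. It succeeds again only after a toy_complete() on hctx 0
frees one slot of the shared budget.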

Patches 6 ~ 14 use a per-request_queue dispatch list to avoid
dequeuing requests from the sw/scheduler queue when the lld queue
is busy.

With these changes, SCSI-MQ performance is brought back to the
level of the legacy block path. Test results on lpfc follow:

- fio(libaio, bs:4k, dio, queue_depth:64, 20 jobs)


IOPS        | v4.13-rc3       | v4.13-rc3   | patched v4.13-rc3
            | legacy deadline | mq-none     | mq-none
------------------------------------------------------------------
read        | 401749.4001     | 346237.5025 | 387536.4427
randread    | 25175.07121     | 21688.64067 | 25578.50374
write       | 376168.7578     | 335262.0475 | 370132.4735
randwrite   | 25235.46163     | 24982.63819 | 23934.95610

IOPS        | v4.13-rc3       | v4.13-rc3   | patched v4.13-rc3
            | legacy deadline | mq-deadline | mq-deadline
------------------------------------------------------------------
read        | 401749.4001     | 35592.48901 | 401681.1137
randread    | 25175.07121     | 30029.52618 | 21446.68731
write       | 376168.7578     | 27340.56777 | 377356.7286
randwrite   | 25235.46163     | 24395.02969 | 24885.66152

Ming Lei (14):
  blk-mq-sched: fix scheduler bad performance
  blk-mq: rename flush_busy_ctx_data as ctx_iter_data
  blk-mq: introduce blk_mq_dispatch_rq_from_ctxs()
  blk-mq-sched: improve dispatching from sw queue
  blk-mq-sched: don't dequeue request until all in ->dispatch are
    flushed
  blk-mq-sched: introduce blk_mq_sched_queue_depth()
  blk-mq-sched: use q->queue_depth as hint for q->nr_requests
  blk-mq: introduce BLK_MQ_F_SHARED_DEPTH
  blk-mq-sched: cleanup blk_mq_sched_dispatch_requests()
  blk-mq-sched: introduce helpers for query, change busy state
  blk-mq: introduce helpers for operating ->dispatch list
  blk-mq: introduce pointers to dispatch lock & list
  blk-mq: pass 'request_queue *' to several helpers of operating BUSY
  blk-mq-sched: improve IO scheduling on SCSI devcie

 block/blk-mq-debugfs.c |  11 ++---
 block/blk-mq-sched.c   |  70 +++++++++++++++--------------
 block/blk-mq-sched.h   |  23 ++++++++++
 block/blk-mq.c         | 117 +++++++++++++++++++++++++++++++++++++++++++------
 block/blk-mq.h         |  72 ++++++++++++++++++++++++++++++
 block/blk-settings.c   |   2 +
 include/linux/blk-mq.h |   5 +++
 include/linux/blkdev.h |   5 +++
 8 files changed, 255 insertions(+), 50 deletions(-)

-- 
2.9.4
