* [PATCH RFC 0/2] block,nvme: latency-based I/O scheduler
@ 2024-03-26 15:35 Hannes Reinecke
From: Hannes Reinecke @ 2024-03-26 15:35 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Keith Busch, Christoph Hellwig, Sagi Grimberg, linux-nvme,
	linux-block, Hannes Reinecke

Hi all,

There have been several attempts to implement a latency-based I/O
scheduler for native nvme multipath, all of which had their issues.

So it's time to start afresh, this time using the QoS framework
already present in the block layer.
It consists of two parts:
- a new 'blk-nodelat' QoS module, which is just a simple per-node
  latency tracker
- a 'latency' nvme I/O policy
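
For those not familiar with the rq-qos framework: the idea is to hook
into request completion, much like blk-wbt and blk-iolatency do, and
keep a decaying average of the completion latency per NUMA node. As a
rough illustration only (this is not the patch code; the names and the
exact EWMA update below are made up to show the shape of such a hook):

#include <linux/blk-mq.h>
#include <linux/ktime.h>
#include <linux/numa.h>
#include <linux/topology.h>
#include "blk-rq-qos.h"

/* illustrative per-node latency tracker state */
struct nlat_data {
	struct rq_qos rqos;
	unsigned int decay;		/* EWMA weight, tunable */
	u64 avg_lat[MAX_NUMNODES];	/* per-node average latency, ns */
};

static void nlat_done(struct rq_qos *rqos, struct request *rq)
{
	struct nlat_data *nlat = container_of(rqos, struct nlat_data, rqos);
	int node = numa_node_id();	/* node of the completing CPU */
	u64 lat, old;

	/* io_start_time_ns is only set once the request was started */
	if (!rq->io_start_time_ns)
		return;
	lat = ktime_get_ns() - rq->io_start_time_ns;

	/* exponentially weighted moving average with weight 1/decay */
	old = READ_ONCE(nlat->avg_lat[node]);
	WRITE_ONCE(nlat->avg_lat[node],
		   old ? ((nlat->decay - 1) * old + lat) / nlat->decay : lat);
}

static const struct rq_qos_ops nlat_rqos_ops = {
	.done	= nlat_done,
};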

Using the 'tiobench' fio script I'm getting:
  WRITE: bw=531MiB/s (556MB/s), 33.2MiB/s-52.4MiB/s
  (34.8MB/s-54.9MB/s), io=4096MiB (4295MB), run=4888-7718msec
    WRITE: bw=539MiB/s (566MB/s), 33.7MiB/s-50.9MiB/s
  (35.3MB/s-53.3MB/s), io=4096MiB (4295MB), run=5033-7594msec
     READ: bw=898MiB/s (942MB/s), 56.1MiB/s-75.4MiB/s
  (58.9MB/s-79.0MB/s), io=4096MiB (4295MB), run=3397-4560msec
     READ: bw=1023MiB/s (1072MB/s), 63.9MiB/s-75.1MiB/s
  (67.0MB/s-78.8MB/s), io=4096MiB (4295MB), run=3408-4005msec

for 'round-robin' and

  WRITE: bw=574MiB/s (601MB/s), 35.8MiB/s-45.5MiB/s
  (37.6MB/s-47.7MB/s), io=4096MiB (4295MB), run=5629-7142msec
    WRITE: bw=639MiB/s (670MB/s), 39.9MiB/s-47.5MiB/s
  (41.9MB/s-49.8MB/s), io=4096MiB (4295MB), run=5388-6408msec
     READ: bw=1024MiB/s (1074MB/s), 64.0MiB/s-73.7MiB/s
  (67.1MB/s-77.2MB/s), io=4096MiB (4295MB), run=3475-4000msec
     READ: bw=1013MiB/s (1063MB/s), 63.3MiB/s-72.6MiB/s
  (66.4MB/s-76.2MB/s), io=4096MiB (4295MB), run=3524-4042msec
  
for 'latency' with 'decay' set to 10.
That's on a 32G FC testbed running against a brd target,
fio running with 16 threads.

As usual, comments and reviews are welcome.

Hannes Reinecke (2):
  block: track per-node I/O latency
  nvme: add 'latency' iopolicy

 block/Kconfig                 |   7 +
 block/Makefile                |   1 +
 block/blk-mq-debugfs.c        |   2 +
 block/blk-nodelat.c           | 368 ++++++++++++++++++++++++++++++++++
 block/blk-rq-qos.h            |   6 +
 drivers/nvme/host/multipath.c |  46 ++++-
 drivers/nvme/host/nvme.h      |   2 +
 include/linux/blk-mq.h        |  11 +
 8 files changed, 439 insertions(+), 4 deletions(-)
 create mode 100644 block/blk-nodelat.c

-- 
2.35.3


* [PATCHv2 0/2] block,nvme: latency-based I/O scheduler
@ 2024-04-03 14:17 Hannes Reinecke
From: Hannes Reinecke @ 2024-04-03 14:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Jens Axboe, linux-nvme, linux-block,
	Hannes Reinecke

Hi all,

There have been several attempts to implement a latency-based I/O
scheduler for native nvme multipath, all of which had their issues.

So it's time to start afresh, this time using the QoS framework
already present in the block layer.
It consists of two parts:
- a new 'blk-nlatency' QoS module, which is just a simple per-node
  latency tracker
- a 'latency' nvme I/O policy
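
The path selector on the nvme side then boils down to picking the
usable path with the lowest tracked latency for the local node, quite
similar to the existing 'numa' iopolicy. As a sketch only (not the
patch code; nvme_path_latency() is a hypothetical helper standing in
for whatever the QoS module exports, and ANA state handling is left
out):

static struct nvme_ns *nvme_lowest_latency_path(struct nvme_ns_head *head,
						int node)
{
	struct nvme_ns *ns, *best = NULL;
	u64 best_lat = U64_MAX;

	list_for_each_entry_rcu(ns, &head->list, siblings) {
		u64 lat;

		if (nvme_path_is_disabled(ns))
			continue;
		/*
		 * per-node average latency as tracked by the QoS module;
		 * a path without samples (lat == 0) would need extra
		 * handling in real code
		 */
		lat = nvme_path_latency(ns, node);
		if (lat < best_lat) {
			best_lat = lat;
			best = ns;
		}
	}
	return best;
}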

Using the 'tiobench' fio script with a 512 byte blocksize I'm getting
the following latencies (in usecs) as a baseline:
- seq write: avg 186 stddev 331
- rand write: avg 4598 stddev 7903
- seq read: avg 149 stddev 65
- rand read: avg 150 stddev 68

Enabling the 'latency' iopolicy:
- seq write: avg 178 stddev 113
- rand write: avg 3427 stddev 6703
- seq read: avg 140 stddev 59
- rand read: avg 141 stddev 58

Setting the 'decay' parameter to 10:
- seq write: avg 182 stddev 65
- rand write: avg 2619 stddev 5894
- seq read: avg 142 stddev 57
- rand read: avg 140 stddev 57  

That's on a 32G FC testbed running against a brd target,
with fio running with 48 threads. So the promise is met:
latency goes down, and we're even able to control the
standard deviation via the 'decay' parameter.
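
To give a feel for the 'decay' parameter: assuming the tracker uses a
simple exponentially weighted moving average of the form

  avg = ((decay - 1) * avg + sample) / decay

(a guess at the update rule; the actual code may differ), each sample
only contributes 1/decay of its deviation. With decay=10 a 1000 usec
outlier on top of a 150 usec average moves the estimate to
(9 * 150 + 1000) / 10 = 235 usecs rather than letting it jump to
1000 usecs, so a single slow completion perturbs the path choice far
less, which is presumably what shrinks the standard deviation above.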

As usual, comments and reviews are welcome.

Changes to the original version:
- split the rqos debugfs entries
- modify commit message to indicate latency
- rename to blk-nlatency

Hannes Reinecke (2):
  block: track per-node I/O latency
  nvme: add 'latency' iopolicy

 block/Kconfig                 |   6 +
 block/Makefile                |   1 +
 block/blk-mq-debugfs.c        |   2 +
 block/blk-nlatency.c          | 388 ++++++++++++++++++++++++++++++++++
 block/blk-rq-qos.h            |   6 +
 drivers/nvme/host/multipath.c |  57 ++++-
 drivers/nvme/host/nvme.h      |   1 +
 include/linux/blk-mq.h        |  11 +
 8 files changed, 465 insertions(+), 7 deletions(-)
 create mode 100644 block/blk-nlatency.c

-- 
2.35.3




Thread overview: 11+ messages
2024-03-26 15:35 [PATCH RFC 0/2] block,nvme: latency-based I/O scheduler Hannes Reinecke
2024-03-26 15:35 ` [PATCH 1/2] block: track per-node I/O latency Hannes Reinecke
2024-03-27 18:03   ` kernel test robot
2024-03-27 20:59   ` kernel test robot
2024-03-26 15:35 ` [PATCH 2/2] nvme: add 'latency' iopolicy Hannes Reinecke
2024-03-28 10:38 ` [PATCH RFC 0/2] block,nvme: latency-based I/O scheduler Sagi Grimberg
2024-03-28 11:32   ` Hannes Reinecke
  -- strict thread matches above, loose matches on Subject: below --
2024-04-03 14:17 [PATCHv2 " Hannes Reinecke
2024-04-03 14:17 ` [PATCH 1/2] block: track per-node I/O latency Hannes Reinecke
2024-04-04  2:22   ` kernel test robot
2024-04-04  2:55   ` kernel test robot
2024-04-04 18:47   ` kernel test robot
