* NVMe over RDMA latency
@ 2016-07-07 19:55 Ming Lin
From: Ming Lin @ 2016-07-07 19:55 UTC


Hi list,

I'm trying to understand the NVMe over RDMA latency.

Test hardware:
A real NVMe PCIe drive on the target
Host and target connected back-to-back via Mellanox ConnectX-3

fio job file:

[global]
ioengine=libaio
direct=1
runtime=10
time_based
norandommap
group_reporting

[job1]
filename=/dev/nvme0n1
rw=randread
bs=4k
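
(For reference, a job file like this is run as, e.g., "fio nvme.fio";
the filename nvme.fio is just for illustration.)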


fio latency data on the host side (testing the NVMe-over-Fabrics device)
    slat (usec): min=2, max=213, avg= 6.34, stdev= 3.47
    clat (usec): min=1, max=2470, avg=39.56, stdev=13.04
     lat (usec): min=30, max=2476, avg=46.14, stdev=15.50

fio latency data on the target side (testing the NVMe PCIe device locally)
    slat (usec): min=1, max=36, avg= 1.92, stdev= 0.42
    clat (usec): min=1, max=68, avg=20.35, stdev= 1.11
     lat (usec): min=19, max=101, avg=22.35, stdev= 1.21
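
(Note: fio's per-IO lat is slat + clat, so the averages roughly add up:
6.34 + 39.56 ~= 46.14 usec on the host, 1.92 + 20.35 ~= 22.35 usec on
the target.)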

So I picked one sample from blktrace, which seems to match the fio avg latency data.
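
(A trace like the below can be captured live with, e.g.,

  blktrace -d /dev/nvme0n1 -o - | blkparse -i -

or recorded with plain blktrace -d /dev/nvme0n1 and parsed afterwards
with blkparse -i nvme0n1.)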

Host(/dev/nvme0n1)
259,0    0       86     0.015768739  3241  Q   R 1272199648 + 8 [fio]
259,0    0       87     0.015769674  3241  G   R 1272199648 + 8 [fio]
259,0    0       88     0.015771628  3241  U   N [fio] 1
259,0    0       89     0.015771901  3241  I  RS 1272199648 + 8 (    2227) [fio]
259,0    0       90     0.015772863  3241  D  RS 1272199648 + 8 (     962) [fio]
259,0    1       85     0.015819257     0  C  RS 1272199648 + 8 (   46394) [0]

Target(/dev/nvme0n1)
259,0    0      141     0.015675637  2197  Q   R 1272199648 + 8 [kworker/u17:0]
259,0    0      142     0.015676033  2197  G   R 1272199648 + 8 [kworker/u17:0]
259,0    0      143     0.015676915  2197  D  RS 1272199648 + 8 (15676915) [kworker/u17:0]
259,0    0      144     0.015694992     0  C  RS 1272199648 + 8 (   18077) [0]

So the host completed the IO in about 50 usec while the target completed it in about 20 usec.
Does that mean the ~30 usec delta comes from the RDMA write (a host read means an RDMA write from the target)?
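
(From the samples above: host D->C = 0.015819257 - 0.015772863 = ~46.4 usec,
target D->C = 0.015694992 - 0.015676915 = ~18.1 usec, so the per-IO fabric
overhead is roughly 46.4 - 18.1 = ~28 usec.)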

Thanks,
Ming

Below are notes to myself on what the blktrace flags mean
===================================================================
Q - queued:
generic_make_request_checks: trace_block_bio_queue(q, bio)

G - get request:
blk_mq_map_request: trace_block_getrq(q, bio, op)

U - unplug:
blk_mq_insert_requests: trace_block_unplug(q, depth, !from_schedule)

I - inserted:
__blk_mq_insert_req_list: trace_block_rq_insert(hctx->queue, rq)

D - issued:
blk_mq_start_request: trace_block_rq_issue(q, rq)

C - complete:
blk_update_request: trace_block_rq_complete(req->q, req, nr_bytes)
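
As a sanity check, the Q->C latency per IO can be recomputed from blkparse
output by pairing events on the starting sector. A sketch (field positions
assume the default blkparse output format shown above, and that blktrace
wrote per-CPU files with the nvme0n1 prefix):

blkparse -i nvme0n1 | awk '
    # $4 = timestamp (s), $6 = action, $8 = starting sector
    $6 == "Q" { q[$8] = $4 }              # remember when the bio was queued
    $6 == "C" && ($8 in q) {              # matching completion for that sector
        printf "%s Q->C %.1f usec\n", $8, ($4 - q[$8]) * 1e6
        delete q[$8]
    }'

On the host sample above this gives 0.015819257 - 0.015768739 = ~50.5 usec,
in line with the fio avg lat.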
