linux-nvme.lists.infradead.org archive mirror
From: sagi@grimberg.me (Sagi Grimberg)
Subject: NVMe over RDMA latency
Date: Wed, 13 Jul 2016 12:49:46 +0300
Message-ID: <57860EBA.5010103@grimberg.me>
In-Reply-To: <1467921342.24395.12.camel@ssi>

> Hi list,

Hey Ming,

> I'm trying to understand the NVMe over RDMA latency.
>
> Test hardware:
> A real NVMe PCI drive on target
> Host and target back-to-back connected by Mellanox ConnectX-3
>
> [global]
> ioengine=libaio
> direct=1
> runtime=10
> time_based
> norandommap
> group_reporting
>
> [job1]
> filename=/dev/nvme0n1
> rw=randread
> bs=4k
>
>
> fio latency data on host side(test nvmeof device)
>      slat (usec): min=2, max=213, avg= 6.34, stdev= 3.47
>      clat (usec): min=1, max=2470, avg=39.56, stdev=13.04
>       lat (usec): min=30, max=2476, avg=46.14, stdev=15.50
>
> fio latency data on target side(test NVMe pci device locally)
>      slat (usec): min=1, max=36, avg= 1.92, stdev= 0.42
>      clat (usec): min=1, max=68, avg=20.35, stdev= 1.11
>       lat (usec): min=19, max=101, avg=22.35, stdev= 1.21
>
> So I picked up this sample from blktrace, which seems to match the fio avg latency data.
>
> Host(/dev/nvme0n1)
> 259,0    0       86     0.015768739  3241  Q   R 1272199648 + 8 [fio]
> 259,0    0       87     0.015769674  3241  G   R 1272199648 + 8 [fio]
> 259,0    0       88     0.015771628  3241  U   N [fio] 1
> 259,0    0       89     0.015771901  3241  I  RS 1272199648 + 8 (    2227) [fio]
> 259,0    0       90     0.015772863  3241  D  RS 1272199648 + 8 (     962) [fio]
> 259,0    1       85     0.015819257     0  C  RS 1272199648 + 8 (   46394) [0]
>
> Target(/dev/nvme0n1)
> 259,0    0      141     0.015675637  2197  Q   R 1272199648 + 8 [kworker/u17:0]
> 259,0    0      142     0.015676033  2197  G   R 1272199648 + 8 [kworker/u17:0]
> 259,0    0      143     0.015676915  2197  D  RS 1272199648 + 8 (15676915) [kworker/u17:0]
> 259,0    0      144     0.015694992     0  C  RS 1272199648 + 8 (   18077) [0]
>
> So host completed IO in about 50usec and target completed IO in about 20usec.
> Does that mean the 30usec delta comes from RDMA write(host read means target RDMA write)?
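
As a sanity check on the quoted numbers, the dispatch-to-completion (D to C) time on each side can be recomputed from the blktrace timestamps above. This is only a sketch: the constants are copied by hand from the two traces, and the helper name `usec` is made up for illustration. The host and target traces run on separate clocks, so only differences taken within one trace are meaningful.

```python
# Timestamps (in seconds) copied from the D (dispatch) and C (complete)
# events in the host and target blktrace samples quoted above.
HOST_D, HOST_C = 0.015772863, 0.015819257
TARGET_D, TARGET_C = 0.015676915, 0.015694992


def usec(start, end):
    """Elapsed time between two blktrace timestamps, in microseconds."""
    return round((end - start) * 1e6, 3)


host_d2c = usec(HOST_D, HOST_C)        # time the host waited on the fabric + target
target_d2c = usec(TARGET_D, TARGET_C)  # time the target waited on the local drive
delta = round(host_d2c - target_d2c, 3)

print(f"host   D->C: {host_d2c} usec")
print(f"target D->C: {target_d2c} usec")
print(f"delta:       {delta} usec")
```

For this one sample that gives roughly 46.4 usec on the host and 18.1 usec on the target, so about 28.3 usec is spent outside the target's local drive access, consistent with the "about 30usec" estimate.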


A couple of things that come to mind:

0. You are using iodepth=1, correct?

1. I imagine you are not polling on the host but are rather interrupt
    driven, correct? That's a latency source.

2. The target code polls if the block device supports it. Can you
    confirm that is indeed the case?

3. mlx4 has a strong fencing policy for memory registration, and we
    always register memory. That's a latency source. Can you try with
    register_always=0?

4. IRQ affinity assignments. If the sqe is submitted on cpu core X and
    the completion comes to cpu core Y, we will consume some latency
    with the context switch of waking up fio on cpu core X. Is this
    a possible case?

5. What happens if you test against a null_blk device (which has a
    latency of < 1us)? Back when I ran some tryouts I saw ~10-11us of
    added latency from the fabric under similar conditions.


Thread overview: 7+ messages
2016-07-07 19:55 NVMe over RDMA latency Ming Lin
2016-07-13  9:49 ` Sagi Grimberg [this message]
     [not found]   ` <CABgxfbEa077L6o-AxEqMr1WMuU-gC8_qc4VrrNs9nAkKLrysdw@mail.gmail.com>
2016-07-13 17:25     ` Ming Lin
2016-07-13 18:25   ` Ming Lin
2016-07-14  6:52     ` Sagi Grimberg
2016-07-14 16:43     ` Wendy Cheng
2016-07-14 17:45       ` Wendy Cheng
