public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Sagi Grimberg <sagi@grimberg.me>
To: Roman Pen <roman.penyaev@profitbricks.com>,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	Bart Van Assche <bart.vanassche@sandisk.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Danil Kipnis <danil.kipnis@profitbricks.com>,
	Jack Wang <jinpu.wang@profitbricks.com>
Subject: Re: [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)
Date: Mon, 5 Feb 2018 14:16:03 +0200	[thread overview]
Message-ID: <23dcfda7-eac1-ed13-b24e-3586c284ee55@grimberg.me> (raw)
In-Reply-To: <20180202140904.2017-1-roman.penyaev@profitbricks.com>

Hi Roman and the team,

On 02/02/2018 04:08 PM, Roman Pen wrote:
> This series introduces IBNBD/IBTRS modules.
> 
> IBTRS (InfiniBand Transport) is a reliable high speed transport library
> which allows for establishing connection between client and server
> machines via RDMA.

So its not strictly infiniband correct?

  It is optimized to transfer (read/write) IO blocks
> in the sense that it follows the BIO semantics of providing the
> possibility to either write data from a scatter-gather list to the
> remote side or to request ("read") data transfer from the remote side
> into a given set of buffers.
> 
> IBTRS is multipath capable and provides I/O fail-over and load-balancing
> functionality.

Couple of questions on your multipath implementation?
1. What was your main objective over dm-multipath?
2. What was the consideration of this implementation over
creating a stand-alone bio based device node to reinject the
bio to the original block device?

> IBNBD (InfiniBand Network Block Device) is a pair of kernel modules
> (client and server) that allow for remote access of a block device on
> the server over IBTRS protocol. After being mapped, the remote block
> devices can be accessed on the client side as local block devices.
> Internally IBNBD uses IBTRS as an RDMA transport library.
> 
> Why?
> 
>     - IBNBD/IBTRS is developed in order to map thin provisioned volumes,
>       thus internal protocol is simple and consists of several request
> 	 types only without awareness of underlaying hardware devices.

Can you explain how the protocol is developed for thin-p? What are the
essence of how its suited for it?

>     - IBTRS was developed as an independent RDMA transport library, which
>       supports fail-over and load-balancing policies using multipath, thus
> 	 it can be used for any other IO needs rather than only for block
> 	 device.

What do you mean by "any other IO"?

>     - IBNBD/IBTRS is faster than NVME over RDMA.  Old comparison results:
>       https://www.spinics.net/lists/linux-rdma/msg48799.html
>       (I retested on latest 4.14 kernel - there is no any significant
> 	  difference, thus I post the old link).

That is interesting to learn.

Reading your reference brings a couple of questions though,
- Its unclear to me how ibnbd performs reads without performing memory
   registration. Is it using the global dma rkey?

- Its unclear to me how there is a difference in noreg in writes,
   because for small writes nvme-rdma never register memory (it uses
   inline data).

- Looks like with nvme-rdma you max out your iops at 1.6 MIOPs, that
   seems considerably low against other reports. Can you try and explain
   what was the bottleneck? This can be a potential bug and I (and the
   rest of the community is interesting in knowing more details).

- srp/scst comparison is really not fair having it in legacy request
   mode. Can you please repeat it and report a bug to either linux-rdma
   or to the scst mailing list?

- Your latency measurements are surprisingly high for a null target
   device (even for low end nvme device actually) regardless of the
   transport implementation.

For example:
- QD=1 read latency is 648.95 for ibnbd (I assume usecs right?) which is
   fairly high. on nvme-rdma its 1058 us, which means over 1 millisecond
   and even 1.254 ms for srp. Last time I tested nvme-rdma read QD=1
   latency I got ~14 us. So something does not add up here. If this is
   not some configuration issue, then we have serious bugs to handle..

- QD=16 the read latencies are > 10ms for null devices?! I'm having
   troubles understanding how you were able to get such high latencies
   (> 100 ms for QD>=100)

Can you share more information about your setup? It would really help
us understand more.

>     - Major parts of the code were rewritten, simplified and overall code
>       size was reduced by a quarter.

That is good to know.

  parent reply	other threads:[~2018-02-05 12:16 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-02 14:08 [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD) Roman Pen
2018-02-02 14:08 ` [PATCH 01/24] ibtrs: public interface header to establish RDMA connections Roman Pen
2018-02-02 14:08 ` [PATCH 02/24] ibtrs: private headers with IBTRS protocol structs and helpers Roman Pen
2018-02-02 14:08 ` [PATCH 03/24] ibtrs: core: lib functions shared between client and server modules Roman Pen
2018-02-05 10:52   ` Sagi Grimberg
2018-02-06 12:01     ` Roman Penyaev
2018-02-06 16:10       ` Jason Gunthorpe
2018-02-07 10:34         ` Roman Penyaev
2018-02-02 14:08 ` [PATCH 04/24] ibtrs: client: private header with client structs and functions Roman Pen
2018-02-05 10:59   ` Sagi Grimberg
2018-02-06 12:23     ` Roman Penyaev
2018-02-02 14:08 ` [PATCH 05/24] ibtrs: client: main functionality Roman Pen
2018-02-02 16:54   ` Bart Van Assche
2018-02-05 13:27     ` Roman Penyaev
2018-02-05 14:14       ` Sagi Grimberg
2018-02-05 17:05         ` Roman Penyaev
2018-02-05 11:19   ` Sagi Grimberg
2018-02-05 14:19     ` Roman Penyaev
2018-02-05 16:24       ` Bart Van Assche
2018-02-02 14:08 ` [PATCH 06/24] ibtrs: client: statistics functions Roman Pen
2018-02-02 14:08 ` [PATCH 07/24] ibtrs: client: sysfs interface functions Roman Pen
2018-02-05 11:20   ` Sagi Grimberg
2018-02-06 12:28     ` Roman Penyaev
2018-02-02 14:08 ` [PATCH 08/24] ibtrs: server: private header with server structs and functions Roman Pen
2018-02-02 14:08 ` [PATCH 09/24] ibtrs: server: main functionality Roman Pen
2018-02-05 11:29   ` Sagi Grimberg
2018-02-06 12:46     ` Roman Penyaev
2018-02-02 14:08 ` [PATCH 10/24] ibtrs: server: statistics functions Roman Pen
2018-02-02 14:08 ` [PATCH 11/24] ibtrs: server: sysfs interface functions Roman Pen
2018-02-02 14:08 ` [PATCH 12/24] ibtrs: include client and server modules into kernel compilation Roman Pen
2018-02-02 14:08 ` [PATCH 13/24] ibtrs: a bit of documentation Roman Pen
2018-02-02 14:08 ` [PATCH 14/24] ibnbd: private headers with IBNBD protocol structs and helpers Roman Pen
2018-02-02 14:08 ` [PATCH 15/24] ibnbd: client: private header with client structs and functions Roman Pen
2018-02-02 14:08 ` [PATCH 16/24] ibnbd: client: main functionality Roman Pen
2018-02-02 15:11   ` Jens Axboe
2018-02-05 12:54     ` Roman Penyaev
2018-02-02 14:08 ` [PATCH 17/24] ibnbd: client: sysfs interface functions Roman Pen
2018-02-02 14:08 ` [PATCH 18/24] ibnbd: server: private header with server structs and functions Roman Pen
2018-02-02 14:08 ` [PATCH 19/24] ibnbd: server: main functionality Roman Pen
2018-02-02 14:09 ` [PATCH 20/24] ibnbd: server: functionality for IO submission to file or block dev Roman Pen
2018-02-02 14:09 ` [PATCH 21/24] ibnbd: server: sysfs interface functions Roman Pen
2018-02-02 14:09 ` [PATCH 22/24] ibnbd: include client and server modules into kernel compilation Roman Pen
2018-02-02 14:09 ` [PATCH 23/24] ibnbd: a bit of documentation Roman Pen
2018-02-02 15:55   ` Bart Van Assche
2018-02-05 13:03     ` Roman Penyaev
2018-02-05 14:16       ` Sagi Grimberg
2018-02-02 14:09 ` [PATCH 24/24] MAINTAINERS: Add maintainer for IBNBD/IBTRS modules Roman Pen
2018-02-02 16:07 ` [PATCH 00/24] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD) Bart Van Assche
2018-02-02 16:40   ` Doug Ledford
2018-02-05  8:45     ` Jinpu Wang
2018-06-04 12:14     ` Danil Kipnis
2018-02-02 17:05 ` Bart Van Assche
2018-02-05  8:56   ` Jinpu Wang
2018-02-05 11:36     ` Sagi Grimberg
2018-02-05 13:38       ` Danil Kipnis
2018-02-05 14:17         ` Sagi Grimberg
2018-02-05 16:40           ` Danil Kipnis
2018-02-05 18:38             ` Bart Van Assche
2018-02-06  9:44               ` Danil Kipnis
2018-02-06 15:35                 ` Bart Van Assche
2018-02-05 16:16     ` Bart Van Assche
2018-02-05 16:36       ` Jinpu Wang
2018-02-07 16:35       ` Christopher Lameter
2018-02-07 17:18         ` Roman Penyaev
2018-02-07 17:32           ` Bart Van Assche
2018-02-08 17:38             ` Danil Kipnis
2018-02-08 18:09               ` Bart Van Assche
2018-06-04 12:27                 ` Danil Kipnis
2018-02-05 12:16 ` Sagi Grimberg [this message]
2018-02-05 12:30   ` Sagi Grimberg
2018-02-07 13:06     ` Roman Penyaev
2018-02-05 16:58   ` Bart Van Assche
2018-02-05 17:16     ` Roman Penyaev
2018-02-05 17:20       ` Bart Van Assche
2018-02-06 11:47         ` Roman Penyaev
2018-02-06 13:12   ` Roman Penyaev
2018-02-06 16:01     ` Bart Van Assche
2018-02-07 12:57       ` Roman Penyaev
2018-02-07 16:35         ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=23dcfda7-eac1-ed13-b24e-3586c284ee55@grimberg.me \
    --to=sagi@grimberg.me \
    --cc=axboe@kernel.dk \
    --cc=bart.vanassche@sandisk.com \
    --cc=danil.kipnis@profitbricks.com \
    --cc=hch@infradead.org \
    --cc=jinpu.wang@profitbricks.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=ogerlitz@mellanox.com \
    --cc=roman.penyaev@profitbricks.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox