From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrei Mikhailovsky
Subject: Re: Preliminary RDMA vs TCP numbers
Date: Wed, 8 Apr 2015 19:16:40 +0100 (BST)
Message-ID: <13289231.842.1428516917655.JavaMail.andrei@tuchka>
References: <755F6B91B3BE364F9BCA11EA3F9E0C6F2CD75D78@SACMBXIP01.sdcorp.global.sandisk.com>
 <6442513.754.1428484840531.JavaMail.andrei@tuchka>
 <755F6B91B3BE364F9BCA11EA3F9E0C6F2CD761B4@SACMBXIP01.sdcorp.global.sandisk.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0112480069=="
Return-path:
In-Reply-To: <755F6B91B3BE364F9BCA11EA3F9E0C6F2CD761B4-cXZ6iGhjG0il5HHZYNR2WTJ2aSJ780jGSxCzGc5ayCJWk0Htik3J/w@public.gmane.org>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: ceph-users-bounces-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
Sender: "ceph-users"
To: Somnath Roy
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org, ceph-devel
List-Id: ceph-devel.vger.kernel.org

--===============0112480069==
Content-Type: multipart/alternative; boundary="----=_Part_841_28200728.1428516917654"

------=_Part_841_28200728.1428516917654
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Somnath,

Sounds very promising! I can't wait to try it on my cluster, as I am
currently using IPoIB instead of native RDMA.

Cheers

Andrei

----- Original Message -----
> From: "Somnath Roy"
> To: "Andrei Mikhailovsky", "Andrey Korolyov"
> Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org, "ceph-devel"
> Sent: Wednesday, 8 April, 2015 5:23:23 PM
> Subject: RE: [ceph-users] Preliminary RDMA vs TCP numbers
>
> Andrei,
>
> Yes, I see it has a lot of potential, and I believe that once the
> performance bottlenecks inside the XIO messenger are fixed it should
> go further.
>
> We are working on it and will keep the community posted.
>
> Thanks & Regards
> Somnath
>
> From: Andrei Mikhailovsky [mailto:andrei-930XJYlnu5nQT0dZR+AlfA@public.gmane.org]
> Sent: Wednesday, April 08, 2015 2:22 AM
> To: Andrey Korolyov
> Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org; ceph-devel; Somnath Roy
> Subject: Re: [ceph-users] Preliminary RDMA vs TCP numbers
>
> Hi,
>
> Am I the only person noticing disappointing results from the
> preliminary RDMA testing, or am I reading the numbers wrong?
>
> Yes, it's true that on a very small cluster you do see a great
> improvement with RDMA, but in real life RDMA is used in large
> infrastructure projects, not on a few servers with a handful of
> OSDs. In fact, from what I've seen in the slides, the RDMA
> implementation scales horribly, to the point that it becomes slower
> the more OSDs you throw at it.
>
> From my limited knowledge, I had expected much higher performance
> gains with RDMA, taking into account that you should have much lower
> latency and overhead and lower CPU utilisation when using this
> transport in comparison with TCP.
>
> Are we likely to see a great deal of improvement with Ceph and RDMA
> in the near future? Is there a roadmap for stable and reliable RDMA
> protocol support?
>
> Thanks
>
> Andrei
>
> ----- Original Message -----
> > From: "Andrey Korolyov" <andrey-5vqebrSIFTo@public.gmane.org>
> > To: "Somnath Roy" <Somnath.Roy-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> > Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org, "ceph-devel" <ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> > Sent: Wednesday, 8 April, 2015 9:28:12 AM
> > Subject: Re: [ceph-users] Preliminary RDMA vs TCP numbers
> >
> > On Wed, Apr 8, 2015 at 11:17 AM, Somnath Roy <Somnath.Roy-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org> wrote:
> > >
> > > Hi,
> > > Please find the preliminary performance numbers of the TCP vs RDMA
> > > (XIO) implementation (on top of SSDs) at the following link.
> > >
> > > http://www.slideshare.net/somnathroy7568/ceph-on-rdma
> > >
> > > The attachment didn't go through, it seems, so I had to use
> > > SlideShare.
> > >
> > > Mark,
> > > If we have time, I can present it in tomorrow's performance
> > > meeting.
> > >
> > > Thanks & Regards
> > > Somnath
> > >
> >
> > Those numbers are really impressive (for small numbers at least)!
> > What TCP settings are you using? For example, the difference can
> > shrink at scale due to less intensive per-connection acceleration
> > with CUBIC on a larger number of nodes, though I do not believe that
> > was the main reason for the observed TCP catch-up on a relatively
> > flat workload such as the one fio generates.
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> PLEASE NOTE: The information contained in this electronic mail
> message is intended only for the use of the designated recipient(s)
> named above. If the reader of this message is not the intended
> recipient, you are hereby notified that you have received this
> message in error and that any review, dissemination, distribution,
> or copying of this message is strictly prohibited. If you have
> received this communication in error, please notify the sender by
> telephone or e-mail (as shown above) immediately and destroy any and
> all copies of this message in your possession (whether hard copies
> or electronically stored copies).

------=_Part_841_28200728.1428516917654--

--===============0112480069==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--===============0112480069==--