From: "hzwulibin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org"
Subject: Re: Understanding the number of TCP connections between clients and OSDs
Date: Tue, 27 Oct 2015 08:41:38 +0800
Message-ID: <2015102708413664295810@gmail.com>
References: <590F8B9C-DA15-41C2-9E37-8847F79D7F6B@schermer.cz>
To: Jan Schermer, Rick Balsano
Cc: "ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org", "ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
List-Id: ceph-devel.vger.kernel.org

Hi,
I am also concerned about this problem. My question is how many threads the qemu-system-x86 process will have.

From what I tested, it can be anywhere between 100 and 800, so it may well be related to the number of OSDs. It also
seems to hurt performance when there are many threads. In my tests, 4k randwrite dropped from 15k
to 4k. That's really unacceptable!
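
To make the thread count above reproducible, here is a minimal sketch for counting the threads of a running process on Linux by listing its `/proc/<pid>/task` directory. The function name `count_threads` and the `proc_root` parameter are illustrative choices and not part of QEMU or Ceph; the approach only assumes the standard Linux procfs layout.

```python
import os
from pathlib import Path


def count_threads(pid: int, proc_root: str = "/proc") -> int:
    """Count the threads of a process via /proc/<pid>/task (Linux only).

    Each thread of the process appears as a numeric subdirectory of the
    task directory. proc_root is a hypothetical parameter that exists
    only so the function can be pointed at a different directory tree.
    """
    task_dir = Path(proc_root) / str(pid) / "task"
    return sum(1 for entry in task_dir.iterdir() if entry.name.isdigit())


def report(pid: int) -> str:
    """Format a one-line summary for the given process id."""
    return f"pid {pid}: {count_threads(pid)} threads"


# Example call (commented out because it needs a real Linux /proc):
# print(report(os.getpid()))
```

Running it against the qemu process id while the guest is idle and again under heavy IO would show whether the count really moves between the two states.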

My environment:

1. Nine OSD storage servers, each with two Intel DC 3500 SSDs
2. Hammer 0.94.3
3. QEMU emulator version 2.1.2 (Debian 1:2.1+dfsg-12+deb8u4~bpo70+1)

Thanks!

hzwulibin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
From: Jan Schermer
Date: 2015-10-27 05:48
To: Rick Balsano
CC: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Understanding the number of TCP connections between clients and OSDs
If we're talking about RBD clients (qemu) then the number also grows with the number of volumes attached to the client. With a single volume it was <1000. It grows when there's heavy IO happening in the guest.
I had to bump up the open file limits to several thousand (8000, was it?) to accommodate a client with 10 volumes in our cluster. We just scaled the number of OSDs down, so hopefully I could have a graph of that.
But I just guesstimated what it could become, and that's not necessarily what the theoretical limit is. Very bad things happen when you reach that threshold. It could also depend on the guest settings (like queue depth), and how much it seeks over the drive (how many different PGs it hits), but knowing the upper bound is most critical.
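
Since the limit above was guesstimated, a small sketch for checking how much headroom a process has left may help. It parses the `Max open files` line from the text of `/proc/<pid>/limits` on Linux and compares it with a file descriptor count. The helper names `parse_max_open_files` and `fd_headroom` are hypothetical, and the numbers in the usage comment are only illustrative, loosely based on the figures mentioned in this thread.

```python
from typing import Optional, Tuple


def parse_max_open_files(limits_text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (soft, hard) for 'Max open files' from /proc/<pid>/limits text.

    A value of None stands for 'unlimited'.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return _to_limit(soft), _to_limit(hard)
    raise ValueError("no 'Max open files' line found")


def _to_limit(value: str) -> Optional[int]:
    """Convert one limit field to an int, mapping 'unlimited' to None."""
    return None if value == "unlimited" else int(value)


def fd_headroom(soft_limit: int, open_fds: int) -> int:
    """Number of descriptors that can still be opened before the soft limit."""
    return soft_limit - open_fds


# Illustrative usage on a live system (Linux only, pid is a placeholder):
# import os
# text = open(f"/proc/{pid}/limits").read()
# soft, hard = parse_max_open_files(text)
# used = len(os.listdir(f"/proc/{pid}/fd"))
# print(fd_headroom(soft, used))
```

Watching the headroom shrink under load gives an early warning before the threshold is actually reached.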
Jan

On 26 Oct 2015, at 21:32, Rick Balsano <rick-FGJi4DqQKYDQT0dZR+AlfA@public.gmane.org> wrote:

We've run into issues with the number of open TCP connections from a single client to the OSDs in our Ceph cluster.

We can increase (and have increased) the open file limit to work around this, but we're looking to understand what determines the number of open connections maintained between a client and a particular OSD. Our naive assumption was 1 open TCP connection per OSD or per port made available by the Ceph node. There are many more than this, presumably to allow parallel connections, because we see 1-4 connections from each client per open port on a Ceph node.
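
The per-port observation above can be checked with a short sketch that groups established TCP connections by remote endpoint. It assumes the default column layout of `ss -tn` output (state, Recv-Q, Send-Q, local address, peer address); the function name `connections_per_peer` is hypothetical. Note that plain `ss -tn` lists connections for the whole host, so the input would need to be filtered to a single client process first, for example with `ss -tnp` run with sufficient privileges.

```python
from collections import Counter


def connections_per_peer(ss_output: str) -> Counter:
    """Count ESTAB connections per peer 'address:port' in `ss -tn` output.

    Assumes the default column order of ss, where the first field is the
    state and the fifth field is the peer address and port. The header
    line and connections in other states are skipped.
    """
    counts: Counter = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0] == "ESTAB":
            counts[fields[4]] += 1
    return counts


# Illustrative usage on a live system:
# import subprocess
# out = subprocess.run(["ss", "-tn"], capture_output=True, text=True).stdout
# for peer, n in connections_per_peer(out).most_common(10):
#     print(peer, n)
```

Comparing the per-endpoint counts against the number of OSD ports on each node would show whether the 1-4 connections per port pattern holds across the cluster.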

Here is some background on our cluster:
* still running Firefly 0.80.8
* 414 OSDs, 35 nodes, one massive pool
* clients are KVM processes, accessing Ceph RBD images using virtio
* total number of open TCP connections from one client to all nodes between 500-1000

Is there any way to either know or cap the maximum number of connections we should expect?

I can provide more info as required. I've done some searches and found references to "huge number of TCP connections" but nothing concrete to tell me how to predict how that scales.

Thanks,
Rick
--
Rick Balsano
Senior Software Engineer
Opower

O +1 571 384 1210
We're Hiring! See jobs here.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
