Question about big EC pool.

* Question about big EC pool.
@ 2015-09-12 16:01 Mike Almateia
  2015-09-12 16:34 ` Somnath Roy
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Almateia @ 2015-09-12 16:01 UTC (permalink / raw)
  To: ceph-devel

Hello!

Ceph has had the EC pool feature for a long time, and I am wondering how
large an EC pool we can build and support.

I have a task from a client to build storage with around 5 PB of usable
space for storing video from cameras. (I asked on the ceph-users mailing
list, but nobody answered.)

Can Ceph currently handle EC storage that large? Is anyone testing or
using huge EC pools in production?
I heard that the EC feature is not ready for production. Is that right?

Thanks for any answer.

--
Mike, yes.


* RE: Question about big EC pool.
  2015-09-12 16:01 Question about big EC pool Mike Almateia
@ 2015-09-12 16:34 ` Somnath Roy
  2015-09-12 19:12   ` Mike Almateia
  0 siblings, 1 reply; 7+ messages in thread
From: Somnath Roy @ 2015-09-12 16:34 UTC (permalink / raw)
  To: Mike Almateia, ceph-devel

I don't think there is any limit on the Ceph side.
We are testing a ~768 TB deployment with 4:2 EC on flash, and it is working well so far.

Thanks & Regards
Somnath
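
For readers sizing a similar pool, the capacity arithmetic behind a 4:2 layout can be sketched as follows. This is a rough illustration, not something posted in the thread. It assumes "4:2" means k=4 data chunks and m=2 coding chunks, that the ~768 TB figure is raw capacity, and that 5 PB means 5000 TB. The function names are made up for this example, and the sketch ignores filesystem overhead and the free space a cluster needs for recovery.

```python
from fractions import Fraction


def usable_fraction(k, m):
    """Share of raw capacity that holds user data in a k+m erasure-coded pool."""
    return Fraction(k, k + m)


def usable_tb(raw_tb, k, m):
    """Usable capacity for a given raw capacity (overheads ignored)."""
    return raw_tb * usable_fraction(k, m)


def raw_tb_needed(target_usable_tb, k, m):
    """Raw capacity required to reach a usable-capacity target."""
    return target_usable_tb / usable_fraction(k, m)


# Assumption: the ~768 TB test deployment is raw capacity with k=4, m=2.
test_cluster_usable = usable_tb(768, 4, 2)

# Assumption: the 5 PB target from the original question is 5000 TB usable.
raw_for_5pb = raw_tb_needed(5000, 4, 2)

print(f"usable in test cluster: {test_cluster_usable} TB")
print(f"raw needed for 5 PB usable: {raw_for_5pb} TB")
```

Under these assumptions a 4+2 pool stores two thirds of its raw capacity as user data, so the storage overhead is 1.5x, compared with 3x for three-way replication.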

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at  http://vger.kernel.org/majordomo-info.html

________________________________

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).


* Re: Question about big EC pool.
  2015-09-12 16:34 ` Somnath Roy
@ 2015-09-12 19:12   ` Mike Almateia
  2015-09-12 22:12     ` Somnath Roy
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Almateia @ 2015-09-12 19:12 UTC (permalink / raw)
  To: ceph-devel

12-Sep-15 19:34, Somnath Roy wrote:
> I don't think there is any limit on the Ceph side.
> We are testing a ~768 TB deployment with 4:2 EC on flash, and it is working well so far.
>
> Thanks & Regards
> Somnath

Thanks for the answer!

It's very interesting!

What hardware do you use for the test cluster?
Do you use only SSDs, or SSD+NVMe?
Is the journal located on the same SSD or not?
Which plugin do you use?
Have you hit any bugs or strange behavior?

Sorry for so many questions, but it's very interesting!

* RE: Question about big EC pool.
  2015-09-12 19:12   ` Mike Almateia
@ 2015-09-12 22:12     ` Somnath Roy
  2015-09-13 17:39       ` Mike Almateia
  0 siblings, 1 reply; 7+ messages in thread
From: Somnath Roy @ 2015-09-12 22:12 UTC (permalink / raw)
  To: Mike Almateia, ceph-devel

Replies inline.

What hardware do you use for the test cluster?
[Somnath] Three 256 TB SanDisk JBOFs (IF100) and 2 heads in front of that, so a 6-node cluster in total. FYI, each IF100 can support a maximum of 512 TB. Each head server has 128 GB of RAM and dual-socket Xeon E5-2690 v3 CPUs.

Do you use only SSDs, or SSD+NVMe?

[Somnath] For now, it is all SSDs.

Is the journal located on the same SSD or not?

[Somnath] Yes, the journal is on the same SSD.

Which plugin do you use?

[Somnath] Cauchy_good jerasure.
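
For reference, a profile like the one described (jerasure plugin, cauchy_good technique, 4:2) would typically be set up with the `ceph osd erasure-code-profile set` and `ceph osd pool create` commands. The sketch below only builds those command lines in Python and does not run them. The profile name `ec42`, the pool name `video`, and the placement-group count are placeholders invented for this example, and reading "4:2" as k=4, m=2 is an assumption. The failure-domain key is left out because its name differs between Ceph releases.

```python
def ec_profile_command(profile, k, m, plugin="jerasure", technique="cauchy_good"):
    """Build the argv for defining an erasure-code profile (not executed here)."""
    return [
        "ceph", "osd", "erasure-code-profile", "set", profile,
        f"k={k}", f"m={m}",
        f"plugin={plugin}", f"technique={technique}",
    ]


def ec_pool_command(pool, pg_num, profile):
    """Build the argv for creating an erasure-coded pool from a profile."""
    return [
        "ceph", "osd", "pool", "create", pool,
        str(pg_num), str(pg_num), "erasure", profile,
    ]


# Placeholder names and PG count; pick values that fit the real cluster.
profile_cmd = ec_profile_command("ec42", k=4, m=2)
pool_cmd = ec_pool_command("video", 4096, "ec42")

print(" ".join(profile_cmd))
print(" ".join(pool_cmd))
```

Building the argument list first makes it easy to review the exact command before handing it to a shell or to `subprocess.run`.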

Have you hit any bugs or strange behavior?

[Somnath] So far all is well :-)

Sorry for so many questions, but it's very interesting!


* Re: Question about big EC pool.
  2015-09-12 22:12     ` Somnath Roy
@ 2015-09-13 17:39       ` Mike Almateia
  2015-09-13 18:25         ` Somnath Roy
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Almateia @ 2015-09-13 17:39 UTC (permalink / raw)
  To: Somnath Roy, ceph-devel

13-Sep-15 01:12, Somnath Roy wrote:
> What hardware do you use for the test cluster?
> [Somnath] Three 256 TB SanDisk JBOFs (IF100) and 2 heads in front of that, so a 6-node cluster in total. FYI, each IF100 can support a maximum of 512 TB. Each head server has 128 GB of RAM and dual-socket Xeon E5-2690 v3 CPUs.

Which version of Ceph do you use?
How does the cluster behave in a degraded state? Is the performance
degradation large?
I think the E5-2690 is not enough for that flash cluster.

How do you get 6 nodes from three 256 TB SanDisk JBOFs (IF100) with
2 heads in front of them? Maybe I don't understand how the IF100 works.

> Which plugin do you use?
>
> [Somnath] Cauchy_good jerasure.

Did you try the isa plugin?

> Have you hit any bugs or strange behavior?
>
> [Somnath] So far all is well :-)

That's good :)

Thanks for the answer!
-- 
Mike, yes.


* RE: Question about big EC pool.
  2015-09-13 17:39       ` Mike Almateia
@ 2015-09-13 18:25         ` Somnath Roy
  2015-09-15 11:32           ` Mike Almateia
  0 siblings, 1 reply; 7+ messages in thread
From: Somnath Roy @ 2015-09-13 18:25 UTC (permalink / raw)
  To: Mike Almateia, ceph-devel

Replies inline.

Which version of Ceph do you use?
[Somnath] As of now it's Giant, but we will be moving to Hammer soon.

How does the cluster behave in a degraded state? Is the performance degradation large?

[Somnath] That's one of the reasons we are using Cauchy_good: its performance in a degraded state is much better. By reducing the recovery traffic (lower values for the recovery settings), we get a significant performance improvement during the degraded state as well. BTW, the degradation depends on how much data the cluster has to recover. In our case, we see ~8% degradation if, say, ~64 TB (one Ceph node) fails, but ~28% if ~128 TB (2 nodes) is down. This is for 4M reads.
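
The reply does not say which recovery settings were lowered or to what values. As a hedged illustration only, the sketch below renders an `[osd]` section for ceph.conf using three OSD options that are commonly used to throttle recovery and backfill: `osd_max_backfills`, `osd_recovery_max_active` and `osd_recovery_op_priority`. The values of 1 are assumptions chosen for the example, not the values used on the cluster discussed here.

```python
import configparser
import io

# Assumed throttle values; the thread does not state the real ones.
RECOVERY_THROTTLE = {
    "osd_max_backfills": 1,         # concurrent backfills per OSD
    "osd_recovery_max_active": 1,   # concurrent recovery ops per OSD
    "osd_recovery_op_priority": 1,  # priority of recovery ops vs client ops
}


def render_osd_section(settings):
    """Render the given settings as an [osd] section in INI format."""
    parser = configparser.ConfigParser()
    parser["osd"] = {key: str(value) for key, value in settings.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


conf_text = render_osd_section(RECOVERY_THROTTLE)
print(conf_text)
```

Lower values leave more bandwidth for client I/O while the cluster is degraded, at the cost of a longer recovery window, so the right trade-off depends on how long reduced redundancy is acceptable.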

I think the E5-2690 is not enough for that flash cluster.

[Somnath] In our case, and especially for object use cases with bigger block sizes, a dual-socket E5-2690 should be more than sufficient; we are not able to saturate it in this case. For block use cases with smaller block sizes, though, we are almost saturating the CPUs with our config. If you are planning to use EC with RGW, and your objects are at least 256K or so, this CPU complex is good enough IMO.

How do you get 6 nodes from three 256 TB SanDisk JBOFs (IF100) with 2 heads in front of them? Maybe I don't understand how the IF100 works.

[Somnath] There are 2 Ceph nodes connected to each IF100 (you can connect up to 8 servers in front). The IF100 drives are partitioned between the 2 head servers. We used 3 IF100s, so 3 * 2 = 6 head nodes in total, i.e. 6 Ceph server nodes. Hope that makes sense now.
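
The node count and raw capacity described above, and a first guess at how many such enclosures the 5 PB target from the original question would need, can be worked out as follows. This is a back-of-the-envelope sketch. It assumes 5 PB means 5000 TB usable, a 4+2 profile, enclosures populated to the 512 TB maximum, and no allowance for free-space headroom. The function names are made up for this example.

```python
import math


def cluster_layout(enclosures, heads_per_enclosure, tb_per_enclosure):
    """Ceph node count and raw capacity for a JBOF-plus-head-server layout."""
    return {
        "ceph_nodes": enclosures * heads_per_enclosure,
        "raw_tb": enclosures * tb_per_enclosure,
    }


def enclosures_needed(usable_tb, k, m, tb_per_enclosure):
    """Enclosures required for a usable-capacity target with k+m erasure coding."""
    raw_tb = usable_tb * (k + m) / k
    return math.ceil(raw_tb / tb_per_enclosure)


# Figures from the thread: 3 enclosures, 2 heads each, 256 TB per enclosure.
layout = cluster_layout(enclosures=3, heads_per_enclosure=2, tb_per_enclosure=256)

# Assumptions: 5000 TB usable target, 4+2 EC, 512 TB per enclosure.
needed = enclosures_needed(usable_tb=5000, k=4, m=2, tb_per_enclosure=512)

print(layout)
print(f"enclosures needed: {needed}")
```

The count scales linearly with the target, so any headroom reserved for recovery can be added by raising the usable figure before dividing.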


* Re: Question about big EC pool.
  2015-09-13 18:25         ` Somnath Roy
@ 2015-09-15 11:32           ` Mike Almateia
  0 siblings, 0 replies; 7+ messages in thread
From: Mike Almateia @ 2015-09-15 11:32 UTC (permalink / raw)
  To: Somnath Roy, ceph-devel

13-Sep-15 21:25, Somnath Roy wrote:
> Which version of Ceph do you use?
> [Somnath] As of now it's Giant, but we will be moving to Hammer soon.
>
Good. Maybe you could post the performance results and benefits after
migrating to Hammer?

> How does the cluster behave in a degraded state? Is the performance degradation large?
>
> [Somnath] That's one of the reasons we are using Cauchy_good: its performance in a degraded state is much better. By reducing the recovery traffic (lower values for the recovery settings), we get a significant performance improvement during the degraded state as well. BTW, the degradation depends on how much data the cluster has to recover. In our case, we see ~8% degradation if, say, ~64 TB (one Ceph node) fails, but ~28% if ~128 TB (2 nodes) is down. This is for 4M reads.
>
Cool, I see now.

> I think the E5-2690 is not enough for that flash cluster.
>
> [Somnath] In our case, and especially for object use cases with bigger block sizes, a dual-socket E5-2690 should be more than sufficient; we are not able to saturate it in this case. For block use cases with smaller block sizes, though, we are almost saturating the CPUs with our config. If you are planning to use EC with RGW, and your objects are at least 256K or so, this CPU complex is good enough IMO.
>

Ok.

>
> How do you get 6 nodes from three 256 TB SanDisk JBOFs (IF100) with 2 heads in front of them? Maybe I don't understand how the IF100 works.
>
> [Somnath] There are 2 Ceph nodes connected to each IF100 (you can connect up to 8 servers in front). The IF100 drives are partitioned between the 2 head servers. We used 3 IF100s, so 3 * 2 = 6 head nodes in total, i.e. 6 Ceph server nodes. Hope that makes sense now.
>

Yes, I understand it now.

Thanks for the info!

-- 
Mike, yes.
