From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wido den Hollander
Subject: Re: Guidelines for Calculating IOPS?
Date: Fri, 19 Oct 2012 19:56:40 +0200
Message-ID: <50819458.2060201@widodh.nl>
References: <50816812.2040008@gammacode.com> <508191A0.1040509@inktank.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from smtp01.mail.pcextreme.nl ([109.72.87.137]:42472 "EHLO
	smtp01.mail.pcextreme.nl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751127Ab2JSR4m (ORCPT );
	Fri, 19 Oct 2012 13:56:42 -0400
In-Reply-To: <508191A0.1040509@inktank.com>
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Mark Kampe
Cc: Mike Dawson, ceph-devel@vger.kernel.org

On 10/19/2012 07:45 PM, Mark Kampe wrote:
> Replication should have no effect on read throughput/IOPS.
>
> The client does a single write to the primary, and the
> primary then handles re-replication to the secondary
> copies. As such the client does not pay (in terms of
> CPU or NIC bandwidth) for the replication. Per-client
> throughput limitations should be largely independent of
> the replication.
>
> However, the replication does generate additional network
> and I/O activity between the OSDs. This means that the
> available aggregate throughput (of the entire cluster)
> is effectively cut in half when you move from one-copy to two.
>

I think you can say this.

You have 100 disks, each capable of doing ~100 IOPS.

Read: 100 * 100 = 10,000 IOPS
Write: (100 * 100 / 2) = 5,000 IOPS

Since everything is written twice (two replicas), writes only get the
speed of 50% of the disks.

When reading, the reads will be balanced over the available copies.

I've taken 100 IOPS as a safe assumption for a regular SATA disk.
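
To make the arithmetic above explicit, here is a rough Python sketch. It
assumes every disk delivers the same IOPS and that each client write costs
one disk write per replica. The function name and parameters are only
illustrative, and it ignores caching, journals and network limits, so
treat the result as a ballpark figure, not a prediction.

```python
def aggregate_iops(disks, iops_per_disk, replicas):
    """Return (read_iops, write_iops) for the whole cluster.

    Reads can be served by every disk, so they scale with the raw total.
    Each client write lands on `replicas` disks, so the write total is
    the raw total divided by the replica count.
    """
    raw = disks * iops_per_disk
    read_iops = raw
    write_iops = raw // replicas
    return read_iops, write_iops


# The example from this mail: 100 SATA disks, ~100 IOPS each, 2 replicas.
read_iops, write_iops = aggregate_iops(disks=100, iops_per_disk=100, replicas=2)
print("Read: ", read_iops, "IOPS")
print("Write:", write_iops, "IOPS")
```

With three replicas instead of two, the same reasoning leaves about a
third of the raw IOPS for writes.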

Wido

> I am confused by your math:
>
> You say 385MB/s and 5250 IOPS (x8k)
> 5250 IOPS * 8192 = 43MB/s
>
> Do you mean that some of your clients are generating
> a lot of small block writes (at up to 5250 IOPS) and
> that others of your clients are doing larger writes
> (with an aggregate throughput of 385MB/s)?
>
> For RADOS throughput:
>     385MB/s is a fairly small number
>     5250 buffered sequential IOPS is a very small number
>     5250 random IOPS is not a particularly large
>     number, but will require several servers
>
> My guess is that the IOPS may drive the number of
> servers, and the drives per server will be the
> capacity divided by the number of required servers.
>
> So how many IOPS can you get per server?
>
> You are using RBD, and depending on the particulars
> of your stack, there may be a great deal of buffering
> and caching on the client side that can make the
> RADOS traffic much more efficient than the tributary
> client requests. Thus, I would suggest that you
> probably want to actually benchmark the application
> in question to measure the client-experienced throughput.
>
>
> On 10/19/12 07:47, Mike Dawson wrote:
>> All,
>>
>> I am investigating the use of Ceph for a video surveillance project with
>> the following minimum block storage requirements:
>>
>> 385 Mbps of constant write bandwidth
>> 100TB storage requirement
>> 5250 IOPS (size of ~8 KB)
>>
>> I believe 2 replicas would be acceptable. We intend to use large
>> capacity (2 or 3TB) SATA 7200rpm 3.5" drives, if the IOPS work out
>> properly.
>>
>> Is there a method / formula to estimate IOPS for RBD? Specifically I
>> would like to understand:
>>
>> - How does replica count affect read/write IOPS?
>>
>> - I'm trying to understand best practice for when to optimize server
>> count, drives per server, and drive capacity as it relates to IOPS. Is
>> there a point of diminishing I/O performance using server chassis with
>> lots of drive slots, like the 36-drive Supermicro SC847a?
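
As a rough cross-check of the numbers quoted in this thread, the Python
sketch below works through Mike's requirements. It is only an
illustration under stated assumptions: ~100 IOPS per SATA disk (the
figure used earlier in this thread), 3TB drives, "~8 KB" read as 8192
bytes as in Mark's calculation, and "Mbps" read as megabits per second.
The function and constant names are made up for this example.

```python
import math

# Requirements from Mike's message.
WRITE_IOPS = 5250
IO_SIZE_BYTES = 8192      # "~8 KB", taken as 8192 bytes as in Mark's reply
BANDWIDTH_MBIT = 385      # assumes "Mbps" means megabits per second
CAPACITY_TB = 100
REPLICAS = 2

# Assumptions, not measurements.
IOPS_PER_DISK = 100       # rough figure for a 7200rpm SATA disk
DISK_TB = 3


def disks_for_iops(write_iops, replicas, iops_per_disk):
    """Disks needed if each client write costs one disk write per replica."""
    return math.ceil(write_iops * replicas / iops_per_disk)


def disks_for_capacity(capacity_tb, replicas, disk_tb):
    """Disks needed to hold every replica of the data."""
    return math.ceil(capacity_tb * replicas / disk_tb)


iops_mb_s = WRITE_IOPS * IO_SIZE_BYTES / 1e6   # throughput implied by the IOPS
stream_mb_s = BANDWIDTH_MBIT / 8               # megabits/s -> megabytes/s

print(f"IOPS-implied throughput: {iops_mb_s:.1f} MB/s")
print(f"Stated bandwidth:        {stream_mb_s:.1f} MB/s")
print("Disks for IOPS:    ", disks_for_iops(WRITE_IOPS, REPLICAS, IOPS_PER_DISK))
print("Disks for capacity:", disks_for_capacity(CAPACITY_TB, REPLICAS, DISK_TB))
```

With these inputs the IOPS requirement asks for 105 disks and the
capacity requirement for 67, which fits Mark's guess that IOPS, not
capacity, will drive the sizing. Client-side buffering and caching can
change the picture considerably, so a benchmark of the real application
is still the better guide.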