* Ceph benchmarks
@ 2012-08-27 20:47 Sébastien Han
2012-08-27 20:59 ` Andrey Korolyov
` (5 more replies)
0 siblings, 6 replies; 12+ messages in thread
From: Sébastien Han @ 2012-08-27 20:47 UTC (permalink / raw)
To: ceph-devel
Hi community,
For those of you who are interested, I ran several benchmarks of
RADOS and RBD on different types of hardware and use cases.
You can find my results here:
http://www.sebastien-han.fr/blog/2012/08/26/ceph-benchmarks/
Hope it helps :)
Feel free to comment, criticize... :)
Cheers!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Ceph benchmarks
2012-08-27 20:47 Ceph benchmarks Sébastien Han
@ 2012-08-27 20:59 ` Andrey Korolyov
2012-08-28 1:40 ` Alexandre DERUMIER
` (4 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Andrey Korolyov @ 2012-08-27 20:59 UTC (permalink / raw)
To: Sébastien Han; +Cc: ceph-devel
On Tue, Aug 28, 2012 at 12:47 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
> Hi community,
>
> For those of you who are interested, I performed several benchmarks of
> RADOS and RBD on different types of hardware and use case.
> You can find my results here:
> http://www.sebastien-han.fr/blog/2012/08/26/ceph-benchmarks/
>
> Hope it helps :)
>
> Feel free to comment, critic... :)
>
> Cheers!
My two cents: with an ultrafast journal (tmpfs), it matters which TCP
congestion control algorithm you are using. With the default CUBIC, the
aggregate write speed across sixteen OSDs is about 450MBps, but with
DCTCP it rises to 550MBps. For a device such as an SLC disk (ext4,
journal, commit=100) there is no observable difference: both times the
aggregate speed measured about 330MBps. I have not tried H(S)TCP yet; it
should behave the same as DCTCP. For delays lower than regular gigabit
Ethernet, different congestion algorithms should show a bigger
difference, though.
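For anyone who wants to repeat this comparison, here is a minimal sketch of picking a congestion control algorithm on a Linux host. The sysctl keys net.ipv4.tcp_available_congestion_control and net.ipv4.tcp_congestion_control are standard Linux settings; the helper name choose_cc is hypothetical, and whether DCTCP or H-TCP is offered at all depends on the kernel and the modules loaded on the machine.

```shell
# choose_cc AVAILABLE PREFERRED...   (hypothetical helper)
# Prints the first PREFERRED algorithm found in the space-separated
# AVAILABLE list; returns 1 if none of them is offered.
choose_cc() {
    available=" $1 "
    shift
    for algo in "$@"; do
        case "$available" in
            *" $algo "*) echo "$algo"; return 0 ;;
        esac
    done
    return 1
}

# Typical use on a Linux host (the last step needs root):
#   avail=$(sysctl -n net.ipv4.tcp_available_congestion_control)
#   algo=$(choose_cc "$avail" dctcp htcp cubic)
#   sudo sysctl -w net.ipv4.tcp_congestion_control="$algo"
```

The setting would have to be applied on every OSD host and client for a fair comparison, and it does not persist across reboots unless it is also written to the sysctl configuration.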
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Ceph benchmarks
2012-08-27 20:47 Ceph benchmarks Sébastien Han
2012-08-27 20:59 ` Andrey Korolyov
@ 2012-08-28 1:40 ` Alexandre DERUMIER
2012-08-28 2:18 ` Mark Nelson
` (3 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Alexandre DERUMIER @ 2012-08-28 1:40 UTC (permalink / raw)
To: Sébastien Han; +Cc: ceph-devel
Hi,
Nice benchmark!
It would be great if you could add some fio benchmarks.
I'm interested in seeing random IOPS values, as I have never been able to reach more than 8000 IOPS with an RBD cluster.
random read: (iops)
fio --filename=/dev/[device] --direct=1 --rw=randread --bs=4k --size=1G --iodepth=100 --runtime=120 --group_reporting --name=file1 --ioengine=libaio
random write: (iops)
fio --filename=/dev/[device] --direct=1 --rw=randwrite --bs=4k --size=1G --iodepth=100 --runtime=120 --group_reporting --name=file1 --ioengine=libaio
seq read: (bandwidth)
fio --filename=/dev/[device] --direct=1 --rw=read --bs=4M --size=1G --iodepth=100 --runtime=120 --group_reporting --name=file1 --ioengine=libaio
seq write: (bandwidth)
fio --filename=/dev/[device] --direct=1 --rw=write --bs=4M --size=1G --iodepth=100 --runtime=120 --group_reporting --name=file1 --ioengine=libaio
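The four invocations above differ only in the --rw and --bs values, so they can be generated from one template. The sketch below only prints the commands and does not run fio; the helper names fio_cmd and fio_suite are hypothetical, and every flag is copied from the commands above.

```shell
# fio_cmd DEVICE RW BS   (hypothetical helper)
# Prints one fio invocation using the flags from the commands above.
fio_cmd() {
    printf 'fio --filename=%s --direct=1 --rw=%s --bs=%s --size=1G --iodepth=100 --runtime=120 --group_reporting --name=file1 --ioengine=libaio\n' "$1" "$2" "$3"
}

# fio_suite DEVICE   (hypothetical helper)
# Prints the four tests: random read/write at 4k, sequential read/write at 4M.
fio_suite() {
    fio_cmd "$1" randread 4k
    fio_cmd "$1" randwrite 4k
    fio_cmd "$1" read 4M
    fio_cmd "$1" write 4M
}
```

For example, `fio_suite /dev/[device]` lists the commands for review, and piping that output to `sh` would run them. Note that the write tests overwrite data on the target device.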
Regards,
Alexandre
--
Alexandre Derumier
Ingénieur Systèmes et Réseaux
Fixe : 03 20 68 88 85
Fax : 03 20 68 90 88
45 Bvd du Général Leclerc 59100 Roubaix
12 rue Marivaux 75002 Paris
* Re: Ceph benchmarks
2012-08-27 20:47 Ceph benchmarks Sébastien Han
2012-08-27 20:59 ` Andrey Korolyov
2012-08-28 1:40 ` Alexandre DERUMIER
@ 2012-08-28 2:18 ` Mark Nelson
2012-08-28 4:27 ` Mark Kirkwood
2012-08-28 11:51 ` Plaetinck, Dieter
` (2 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Mark Nelson @ 2012-08-28 2:18 UTC (permalink / raw)
To: Sébastien Han; +Cc: ceph-devel
On 08/27/2012 03:47 PM, Sébastien Han wrote:
> Hi community,
>
Hi!
> For those of you who are interested, I performed several benchmarks of
> RADOS and RBD on different types of hardware and use case.
> You can find my results here:
> http://www.sebastien-han.fr/blog/2012/08/26/ceph-benchmarks/
>
> Hope it helps :)
>
> Feel free to comment, critic... :)
A couple of thoughts:
1) With so few OSDs, going from 1000 to 10000 PGs shouldn't make too much
of a difference. It would be concerning if it did!
2) Were the commodity results with SSDs using replication of 3? Also,
was that test with the flusher on or off? I'd hope that with 15k drives
you'd see a bit better throughput with journals on the SSDs.
3) It would be interesting to try these tests without the raid1 and see
if you can max out the bonded interface.
4) I think the R520 backplane is using SAS expanders like in the R515s
we have. We've had some performance problems caused either by them or
by something goofy going on with our H700 controllers.
5) rados bench tests with smaller requests could be interesting on 15k
drives. I typically see about 1-2MB/s per OSD for 4k requests with
7200rpm SATA disks.
Mark
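To put the small-request figures in this thread on a common scale, throughput and request rate can be converted into each other. A minimal sketch follows; the helper names are hypothetical, and 1 MB is assumed to mean 1024 KB, which the messages do not state.

```shell
# Both helpers are hypothetical names; 1 MB is taken as 1024 KB here.

# mbs_to_iops MB_PER_SEC REQUEST_KB: requests per second implied by a throughput
mbs_to_iops() {
    awk -v mbs="$1" -v kb="$2" 'BEGIN { printf "%d\n", mbs * 1024 / kb }'
}

# iops_to_mbs IOPS REQUEST_KB: throughput in MB/s implied by a request rate
iops_to_mbs() {
    awk -v iops="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", iops * kb / 1024 }'
}

mbs_to_iops 1 4      # prints 256
mbs_to_iops 2 4      # prints 512
iops_to_mbs 8000 4   # prints 31.25
```

Under that assumption, 1-2MB/s per OSD at 4k corresponds to roughly 256-512 requests per second per OSD, and the 8000 IOPS ceiling mentioned earlier in the thread corresponds to about 31 MB/s at 4k.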
>
> Cheers!
* Re: Ceph benchmarks
2012-08-28 2:18 ` Mark Nelson
@ 2012-08-28 4:27 ` Mark Kirkwood
2012-08-28 8:32 ` Sébastien Han
0 siblings, 1 reply; 12+ messages in thread
From: Mark Kirkwood @ 2012-08-28 4:27 UTC (permalink / raw)
To: Mark Nelson; +Cc: Sébastien Han, ceph-devel
+1 to that. I've been seeing 4-6 MB/s for 4K writes for 1 OSD with 1 SSD
for journal and another for data [1]. Interestingly I did see some nice
scaling with 4K random reads: 2-4 MB/s per thread for up to 8 threads
(looked like it plateaued thereafter).
Cheers
Mark
[1] FYI not on the box I posted about before - on a more modern PC with
6 Gbit/s SATA.
On 28/08/12 14:18, Mark Nelson wrote:
> 5) rados bench tests with smaller requests could be interesting on 15k
> drives. I typically see about 1-2MB/s per OSD for 4k requests with
> 7200rpm SATA disks.
* Re: Ceph benchmarks
2012-08-28 4:27 ` Mark Kirkwood
@ 2012-08-28 8:32 ` Sébastien Han
2012-08-28 11:46 ` Mark Nelson
0 siblings, 1 reply; 12+ messages in thread
From: Sébastien Han @ 2012-08-28 8:32 UTC (permalink / raw)
To: Mark Kirkwood; +Cc: Mark Nelson, ceph-devel
@Alexandre: I don't have all the machines anymore, but I'll see what I
can do :). Only the commodity cluster remains.
@Mark Nelson: 2) Which bench? The RADOS one?
3) Sorry, the RAID controller doesn't support JBOD...
5) I still have the commodity cluster, I'll run some small rados benchmarks
Cheers!
* Re: Ceph benchmarks
2012-08-28 8:32 ` Sébastien Han
@ 2012-08-28 11:46 ` Mark Nelson
0 siblings, 0 replies; 12+ messages in thread
From: Mark Nelson @ 2012-08-28 11:46 UTC (permalink / raw)
To: Sébastien Han; +Cc: Mark Kirkwood, ceph-devel
On 08/28/2012 03:32 AM, Sébastien Han wrote:
> @Alexandre: I don't have all the machines anymore, I'll see what I can
> do :). Only the commodity cluster remains
>
> @Mark Nelson: 2) Which bench? The RADOS one?
> 3) Sorry the RAID controller doesn't support JBOD...
> 5) I still have the commodity cluster, I'll perform some little rados benchmarks
Ah, that's the problem we have with our H700s. We do single drive
raid0s to get around it, but it's not ideal. Do you have two drives in
a raid1 or just a single drive?
Some other things we've noticed on our Dell machines:
- Writeback cache is pretty much faster than writethrough cache on all
of our tests, even sequential writes.
- Concurrent writers to a single raid group seem to tank performance. I
still don't know why this is, but it's making buffered IO and any direct
IO with more than one writer top out at about 95MB/s regardless of the
number of drives in the raid group. (And more writers slow performance
down even more.)
Mark
* Re: Ceph benchmarks
2012-08-27 20:47 Ceph benchmarks Sébastien Han
` (2 preceding siblings ...)
2012-08-28 2:18 ` Mark Nelson
@ 2012-08-28 11:51 ` Plaetinck, Dieter
2012-08-28 13:11 ` Tommi Virtanen
2012-09-08 18:16 ` Ceph benchmarks / ceph osd tell X bench Dieter Kasper
5 siblings, 0 replies; 12+ messages in thread
From: Plaetinck, Dieter @ 2012-08-28 11:51 UTC (permalink / raw)
To: Sébastien Han; +Cc: ceph-devel
Sébastien Han <han.sebastien@gmail.com> wrote:
> Just as a reminder the system maintains 2 caches facilities:
> * disk write cache
> * page cache
The page cache is the one commonly referred to as the block cache, right (i.e. in the block layer, below the filesystem layer in the kernel)?
What do you mean by disk write cache? The one where I/O commands are held so they can be reordered? I always thought that was the same as the block cache. Also, hard disks sometimes do this themselves, buffering in a few megabytes of memory on the hard disk itself?
Dieter
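The two caches in question can be looked at separately on a Linux host, which may help keep the terms apart. This is a minimal sketch: the helper name page_cache_stats is hypothetical, /dev/sdX is a placeholder, and the field names are the ones Linux exposes in /proc/meminfo.

```shell
# page_cache_stats [FILE]   (hypothetical helper)
# Prints the page cache size and the amount of dirty and in-flight
# writeback data, in kB, from a meminfo-style file (default /proc/meminfo).
page_cache_stats() {
    awk '$1 == "Cached:" || $1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }' \
        "${1:-/proc/meminfo}"
}

# The drive's own volatile write cache is a separate, per-device setting.
# It can be queried with hdparm (needs root; /dev/sdX is a placeholder):
#   hdparm -W /dev/sdX
```

"Dirty" is data that sits in the kernel's page cache and has not yet been written out, while the hdparm query reports whether the drive itself buffers writes in its onboard memory.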
* Re: Ceph benchmarks
2012-08-27 20:47 Ceph benchmarks Sébastien Han
` (3 preceding siblings ...)
2012-08-28 11:51 ` Plaetinck, Dieter
@ 2012-08-28 13:11 ` Tommi Virtanen
2012-08-28 22:16 ` Sébastien Han
2012-09-08 18:16 ` Ceph benchmarks / ceph osd tell X bench Dieter Kasper
5 siblings, 1 reply; 12+ messages in thread
From: Tommi Virtanen @ 2012-08-28 13:11 UTC (permalink / raw)
To: Sébastien Han; +Cc: ceph-devel
On Mon, Aug 27, 2012 at 1:47 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
> For those of you who are interested, I performed several benchmarks of
> RADOS and RBD on different types of hardware and use case.
> You can find my results here:
> http://www.sebastien-han.fr/blog/2012/08/26/ceph-benchmarks/
Nice!
Minor nit: "sudo echo 3 | tee /proc/sys/vm/drop_caches && sudo sync"
you probably want, say, "echo 3 | sudo tee ... && sync"
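The reason for the nit: in "sudo echo 3 | tee ...", sudo elevates only echo, while tee, which does the actual write, runs as the calling user. A minimal sketch of a corrected helper follows. The name drop_caches and the SUDO variable are hypothetical, and running sync first is a suggestion of this sketch rather than something from the article: the kernel only drops clean cache entries, so flushing dirty pages beforehand lets more of the cache be freed.

```shell
# drop_caches [TARGET]   (hypothetical helper)
# Flushes dirty pages, then asks the kernel to drop the clean page cache,
# dentries and inodes. TARGET defaults to the real control file.
# Set SUDO to an empty string when already running as root.
drop_caches() {
    target="${1:-/proc/sys/vm/drop_caches}"
    sync
    # sudo must wrap tee, because tee is the process that opens the file
    echo 3 | ${SUDO-sudo} tee "$target" > /dev/null
}
```

Calling `drop_caches` before each benchmark run keeps reads from being served out of memory left over from a previous run.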
* Re: Ceph benchmarks
2012-08-28 13:11 ` Tommi Virtanen
@ 2012-08-28 22:16 ` Sébastien Han
0 siblings, 0 replies; 12+ messages in thread
From: Sébastien Han @ 2012-08-28 22:16 UTC (permalink / raw)
To: Tommi Virtanen; +Cc: ceph-devel
@Mark Nelson: thanks for the clarification, I'll keep that in mind the
next time I build an array. It was raid1 with 2 disks (no broken array).
@Plaetinck, Dieter: Sorry, I made a small mistake. I was referring
first to the system cache (page cache), the one which considers write
operations to the storage system complete once the data has been
copied into it, and secondly to the disk write cache, the one that
lives in the hard drive itself. I'm going to make the sentence
clearer and remove the disk write cache part.
@Jerker Nyberg: I took some measurements on each system during a write,
but they are more in my head than on paper. From what I can say, the
commodity cluster was struggling during the write. The other machines
barely showed any load; even when I deactivated all the cores and kept
only 1 or 2, it was OK. The CPU load from the OSD wasn't so high.
@Tommi Virtanen: nice catch! I'm going to update the article :)
Thank you for all the feedback, I'll try to run some of the tests
you mentioned above :)
* Re: Ceph benchmarks / ceph osd tell X bench
2012-08-27 20:47 Ceph benchmarks Sébastien Han
` (4 preceding siblings ...)
2012-08-28 13:11 ` Tommi Virtanen
@ 2012-09-08 18:16 ` Dieter Kasper
2012-09-10 17:07 ` Sébastien Han
5 siblings, 1 reply; 12+ messages in thread
From: Dieter Kasper @ 2012-09-08 18:16 UTC (permalink / raw)
To: Sébastien Han; +Cc: ceph-devel
Hi Sébastien,
when running 'ceph osd tell $i bench',
how/where will I see the results:
osd.0 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 19.109900 sec at 54870 KB/sec
osd.1 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.755279 sec at 50520 KB/sec
osd.2 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 19.347267 sec at 54197 KB/sec
?
Which logging do I have to activate for it?
Thanks,
-Dieter
* Re: Ceph benchmarks / ceph osd tell X bench
2012-09-08 18:16 ` Ceph benchmarks / ceph osd tell X bench Dieter Kasper
@ 2012-09-10 17:07 ` Sébastien Han
0 siblings, 0 replies; 12+ messages in thread
From: Sébastien Han @ 2012-09-10 17:07 UTC (permalink / raw)
To: Dieter Kasper; +Cc: ceph-devel
Hi Dieter,
Simply run a "ceph -w" and wait for the output.
Cheers.
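Once the bench lines show up in the "ceph -w" output, they can be summarised with a small filter. This is a minimal sketch that assumes the line format quoted in this thread (possibly preceded by a timestamp); the helper name bench_summary is hypothetical, and 1 MB is taken as 1024 KB.

```shell
# bench_summary   (hypothetical helper)
# Reads cluster log lines on stdin, picks out the "bench:" results and
# prints per-OSD write throughput in MB/s followed by the average.
bench_summary() {
    awk '{
             for (i = 3; i <= NF; i++) {
                 if ($i == "bench:" && $NF == "KB/sec") {
                     kbs = $(NF - 1)          # the figure just before "KB/sec"
                     printf "%s %.1f MB/s\n", $(i - 2), kbs / 1024
                     total += kbs; n++
                     break
                 }
             }
         }
         END { if (n) printf "average %.1f MB/s\n", total / n / 1024 }'
}
```

Because "ceph -w" keeps running, the average is printed only when the input ends, so one option is to save the output to a file first and then run `bench_summary < file`.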