Re: RBD vs RADOS benchmark performance

From: Mark Nelson <mark.nelson-4GqslpFJ+cxBDgjK7y7TUQ@public.gmane.org>
To: Greg <itooo-xVucS5mfmt0AvxtiuMwx3w@public.gmane.org>
Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ceph-users-Qp0mS5GaXlQ@public.gmane.org
Subject: Re: RBD vs RADOS benchmark performance
Date: Mon, 13 May 2013 10:17:20 -0500
Message-ID: <51910400.9080607@inktank.com>
In-Reply-To: <5190FE49.1030307-xVucS5mfmt0AvxtiuMwx3w@public.gmane.org>

On 05/13/2013 09:52 AM, Greg wrote:
> On 13/05/2013 15:55, Mark Nelson wrote:
>> On 05/13/2013 07:26 AM, Greg wrote:
>>> On 13/05/2013 07:38, Olivier Bonvalet wrote:
>>>> On Friday 10 May 2013 at 19:16 +0200, Greg wrote:
>>>>> Hello folks,
>>>>>
>>>>> I'm in the process of testing Ceph and RBD. I have set up a small
>>>>> cluster of hosts, each running a MON and an OSD with both journal and
>>>>> data on the same SSD (OK, this is stupid, but it makes it simple to
>>>>> verify that the disks are not the bottleneck for 1 client). All nodes
>>>>> are connected on a 1Gb network (no dedicated network for OSDs, shame
>>>>> on me :).
>>>>>
>>>>> Summary: RBD performance is poor compared to the RADOS benchmark.
>>>>>
>>>>> A 5-second sequential read benchmark shows something like this:
>>>>>>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>>>>     0       0         0         0         0         0         -         0
>>>>>>     1      16        39        23   91.9586        92  0.966117  0.431249
>>>>>>     2      16        64        48   95.9602       100  0.513435   0.53849
>>>>>>     3      16        90        74   98.6317       104   0.25631   0.55494
>>>>>>     4      11        95        84   83.9735        40   1.80038   0.58712
>>>>>> Total time run:        4.165747
>>>>>> Total reads made:     95
>>>>>> Read size:            4194304
>>>>>> Bandwidth (MB/sec):    91.220
>>>>>>
>>>>>> Average Latency:       0.678901
>>>>>> Max latency:           1.80038
>>>>>> Min latency:           0.104719
>>>>> 91 MB/s read performance, quite good!
>>>>>
>>>>> Now the RBD performance:
>>>>>> root@client:~# dd if=/dev/rbd1 of=/dev/null bs=4M count=100
>>>>>> 100+0 records in
>>>>>> 100+0 records out
>>>>>> 419430400 bytes (419 MB) copied, 13.0568 s, 32.1 MB/s
>>>>> That is a 3x performance gap (same for writes: ~60 MB/s in the
>>>>> benchmark, ~20 MB/s with dd on the block device).
>>>>>
>>>>> The network is OK, and the CPU is also OK on all OSDs.
>>>>> Ceph is Bobtail 0.56.4, Linux is 3.8.1 on ARM (vanilla release + some
>>>>> patches for the SoC being used).
>>>>>
>>>>> Can you point me to a starting point for digging into this?
>>>> You should try increasing read_ahead to 512K instead of the default
>>>> 128K (/sys/block/*/queue/read_ahead_kb). I have seen a huge difference
>>>> on reads with that.
>>>>
>>> Olivier,
>>>
>>> thanks a lot for pointing this out, it indeed makes a *huge* difference!
>>>> # dd if=/mnt/t/1 of=/dev/zero bs=4M count=100
>>>> 100+0 records in
>>>> 100+0 records out
>>>> 419430400 bytes (419 MB) copied, 5.12768 s, 81.8 MB/s
>>> (caches dropped before each test of course)
>>>
>>> Mark, this is probably something you will want to investigate and
>>> explain in a "tweaking" topic of the documentation.
>>>
>>> Regards,
>>
>> Out of curiosity, has your rados bench performance improved as well?
>> We've also seen improvements for sequential read throughput when
>> increasing read_ahead_kb. (it may decrease random iops in some cases
>> though!)  The reason I didn't think to mention it here though is
>> because I was just focused on the difference between rados bench and
>> rbd.  It would be interesting to know if rbd has improved more
>> dramatically than rados bench.
> Mark, the read ahead is set on the RBD block device (on the client), so
> it doesn't improve benchmark results as the benchmark doesn't use the
> block layer.

Ah, I was thinking you had increased it on the OSDs (which can also 
help).  On the OSD side, if you are targeting spinning disks, it can 
depend a lot on how much data is stored per track and the cost of head 
switches and track switches.
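
For reference, here is a minimal sketch of reading and changing the
read_ahead_kb value discussed in this thread. It assumes the sysfs path
quoted earlier (/sys/block/<device>/queue/read_ahead_kb); the function
names and the sysfs_root parameter are hypothetical helpers, not part of
Ceph or of any library. Writing the real file needs root, and the value
does not persist across reboots.

```python
from pathlib import Path


def read_ahead_path(device, sysfs_root="/sys/block"):
    """Build the sysfs path that holds the read-ahead size (in KB) for a device."""
    return Path(sysfs_root) / device / "queue" / "read_ahead_kb"


def get_read_ahead_kb(device, sysfs_root="/sys/block"):
    """Return the current read-ahead size in KB for the given block device."""
    return int(read_ahead_path(device, sysfs_root).read_text().strip())


def set_read_ahead_kb(device, kb, sysfs_root="/sys/block"):
    """Write a new read-ahead size in KB (requires root on a real system)."""
    if kb < 0:
        raise ValueError("read-ahead size must be non-negative")
    read_ahead_path(device, sysfs_root).write_text(f"{kb}\n")
```

On the client in this thread the call would be something like
set_read_ahead_kb("rbd1", 512); on an OSD host the device name would be
whatever disk backs the OSD. Since a larger value may reduce random IOPS,
it is worth checking get_read_ahead_kb() before and after and re-running
both sequential and random tests.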

>
> One question remains: why did I get poor performance with a single
> writing thread?

In general, parallelism is really helpful because it hides latency and
also helps you spread the load over all of your OSDs.  Even on a single
disk, having concurrent requests lets the scheduler/controller do a
better job of ordering requests.  Even on high-performance distributed
file systems like Lustre, you will generally do best with lots of
IO nodes reading/writing multiple files.
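
As a rough illustration of how concurrency hides latency, the sketch
below applies Little's law (throughput = requests in flight * request
size / latency) to the read numbers quoted earlier in this thread. The
function name is hypothetical. Using the minimum latency as a stand-in
for an unloaded single request is an assumption, and dd on a block
device does not literally issue one 4 MB request at a time, so treat
the second figure as an order-of-magnitude estimate only.

```python
def throughput_mb_s(in_flight, op_size_mb, latency_s):
    """Little's law: steady-state throughput for a fixed number of requests in flight."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return in_flight * op_size_mb / latency_s


# rados bench run above: 16 ops in flight, 4 MB objects, 0.678901 s average latency.
bench = throughput_mb_s(16, 4, 0.678901)

# Assumed single-request case: 1 op in flight at the 0.104719 s minimum latency.
single = throughput_mb_s(1, 4, 0.104719)

print(f"16 in flight: {bench:.1f} MB/s, 1 in flight: {single:.1f} MB/s")
```

The first figure comes out at about 94 MB/s, close to the 91 MB/s the
benchmark reported, and the second at about 38 MB/s. Each request is
slower when 16 are in flight, yet the total throughput is much higher.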

>
> Regards,

