From: Mark Nelson <mark.nelson@inktank.com>
To: Andrey Korolyov <andrey@xdel.ru>
Cc: James Harper <james.harper@bendigoit.com.au>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: poor write performance
Date: Thu, 18 Apr 2013 12:01:20 -0500
Message-ID: <517026E0.9070905@inktank.com>
In-Reply-To: <CABYiri-9=ADNQgnVENJLrvW18f1+XcMTzDUSDmuLcikjS5u5iA@mail.gmail.com>
On 04/18/2013 11:46 AM, Andrey Korolyov wrote:
> On Thu, Apr 18, 2013 at 5:43 PM, Mark Nelson <mark.nelson@inktank.com> wrote:
>> On 04/18/2013 06:46 AM, James Harper wrote:
>>>
>>> I'm doing some basic testing so I'm not really fussed about poor
>>> performance, but my write performance appears to be so bad I think I'm doing
>>> something wrong.
>>>
>>> Testing with dd gives me write performance in the kilobytes-per-second
>>> range at a 4KB block size, while read performance is acceptable (for
>>> testing at least). For dd I'm using iflag=direct for read testing and
>>> oflag=direct for write testing.
>>>
>>> My setup, approximately, is:
>>>
>>> Two OSDs
>>> . 1 x 7200RPM SATA disk each
>>> . 2 x gigabit cluster network interfaces each in a bonded configuration
>>> directly attached (osd to osd, no switch)
>>> . 1 x gigabit public network
>>> . journal on another spindle
>>>
>>> Three MONs
>>> . 1 each on the OSDs
>>> . 1 on another server, which is also the one used for testing performance
>>>
>>> I'm using the Debian packages from Ceph, which are version 0.56.4
>>>
>>> For comparison, my existing production storage is 2 servers running DRBD
>>> with iSCSI to the initiators, which run Xen on top of (C)LVM volumes on top
>>> of the iSCSI. Performance is not spectacular but acceptable. The servers in
>>> question are the same specs as the servers I'm testing on.
>>>
>>> Where should I start looking for performance problems? I've tried running
>>> some of the benchmark stuff in the documentation but I haven't gotten very
>>> far...
>>
>>
>> Hi James! Sorry to hear about the performance trouble! Is it just
>> sequential 4KB direct IO writes that are giving you troubles? If you are
>> using the kernel version of RBD, we don't have any kind of cache implemented
>> there and since you are bypassing the pagecache on the client, those writes
>> are being sent to the different OSDs in 4KB chunks over the network. RBD
>> stores data in blocks that are represented by 4MB objects on one of the
>> OSDs, so without cache a lot of sequential 4KB writes will be hitting 1 OSD
>> repeatedly and then moving on to the next one. Hopefully those writes would
>> get aggregated at the OSD level, but clearly that's not really happening
>> here given your performance.
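
To make the object mapping above concrete, here is a minimal Python sketch.
It assumes the 4MB object size mentioned above and a plain layout with no
custom striping. The function names are made up for illustration and are
not part of any Ceph API.

```python
# Sketch only: assumes 4MB RBD objects and no custom striping.
OBJECT_SIZE = 4 * 1024 * 1024  # 4MB object size, as described above
WRITE_SIZE = 4 * 1024          # 4KB direct writes, as issued by dd


def object_index(offset, object_size=OBJECT_SIZE):
    """Return the index of the object holding the byte at `offset`."""
    return offset // object_size


def writes_per_object(write_size=WRITE_SIZE, object_size=OBJECT_SIZE):
    """Return how many back-to-back sequential writes land in one object."""
    return object_size // write_size


if __name__ == "__main__":
    # Sequential 4KB writes stay in object 0 for the first 4MB of the image.
    print(writes_per_object())                  # 1024
    print(object_index(1023 * WRITE_SIZE))      # 0
    print(object_index(1024 * WRITE_SIZE))      # 1
```

Under those assumptions, 1024 consecutive 4KB writes go to the same object
(and so to the same OSD) before the stream moves on to the next one.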
>>
>> Here are a couple of thoughts:
>>
>> 1) If you are working with VMs, using the QEMU/KVM interface with virtio
>> drivers and RBD cache enabled will give you a huge jump in small sequential
>> write performance relative to what you are seeing now.
>>
>> 2) You may want to try upgrading to 0.60. We made a change to how the
>> pg_log works that causes fewer disk seeks during small IO, especially with
>> XFS.
>
> Can you point me to the related commits, if possible?
Here you go:
http://tracker.ceph.com/projects/ceph/repository/revisions/188f3ea6867eeb6e950f6efed18d53ff17522bbc
>
>>
>> 3) If you are still having trouble, testing your network, disk speeds, and
>> using rados bench to test the object store all may be helpful.
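
A rough way to sanity-check the numbers from those tests: with direct IO
and only one write in flight at a time, throughput is bounded by the
per-write latency. The Python sketch below shows the arithmetic. The
latency values are hypothetical inputs, not measurements from this
cluster, and the function name is invented for illustration.

```python
def sync_write_throughput_kb_s(latency_ms, write_kb=4):
    """Upper bound in KB/s when each write must finish before the next starts.

    latency_ms is an assumed end-to-end time per write in milliseconds.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return write_kb * 1000.0 / latency_ms


if __name__ == "__main__":
    # Hypothetical per-write latencies, for illustration only.
    for latency_ms in (1, 10, 50):
        print(latency_ms, "ms ->", sync_write_throughput_kb_s(latency_ms), "KB/s")
```

At an assumed 10 ms per 4KB write this tops out at 400 KB/s, so per-write
latency alone can account for throughput in the KB/s range.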
>>
>>>
>>> Thanks
>>>
>>> James
>>
>>
>> Good luck!
>>
>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>