From: Mark Nelson <mnelson@redhat.com>
To: Sage Weil <sage@newdream.net>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Initial newstore vs filestore results
Date: Wed, 08 Apr 2015 09:38:46 -0500
Message-ID: <55253D76.1030409@redhat.com>
In-Reply-To: <alpine.DEB.2.00.1504071951120.4469@cobra.newdream.net>



On 04/07/2015 09:58 PM, Sage Weil wrote:
> On Tue, 7 Apr 2015, Mark Nelson wrote:
>> On 04/07/2015 02:16 PM, Mark Nelson wrote:
>>> On 04/07/2015 09:57 AM, Mark Nelson wrote:
>>>> Hi Guys,
>>>>
>>>> I ran some quick tests on Sage's newstore branch.  So far given that
>>>> this is a prototype, things are looking pretty good imho.  The 4MB
>>>> object rados bench read/write and small read performance looks
>>>> especially good.  Keep in mind that this is not using the SSD journals
>>>> in any way, so 640MB/s sequential writes is actually really good
>>>> compared to filestore without SSD journals.
>>>>
>>>> small write performance appears to be fairly bad, especially in the RBD
>>>> case where it's small writes to larger objects.  I'm going to sit down
>>>> and see if I can figure out what's going on.  It's bad enough that I
>>>> suspect there's just something odd going on.
>>>>
>>>> Mark
>>>
>>> Seekwatcher/blktrace graphs of a 4 OSD cluster using newstore for those
>>> interested:
>>>
>>> http://nhm.ceph.com/newstore/
>>>
>>> Interestingly small object write/read performance with 4 OSDs was about
>>> 1/3-1/4 the speed of the same cluster with 36 OSDs.
>>>
>>> Note: Thanks Dan for fixing the directory column width!
>>>
>>> Mark
>>
>> New fio/librbd results using Sage's latest code that attempts to keep small
>> overwrite extents in the db.  This is 4 OSD so not directly comparable to the
>> 36 OSD tests above, but does include seekwatcher graphs.  Results in MB/s:
>>
>> 	write	read	randw	randr
>> 4MB	57.9	319.6	55.2	285.9
>> 128KB	2.5	230.6	2.4	125.4
>> 4KB	0.46	55.65	1.11	3.56
>
> What would be very interesting would be to see the 4KB performance
> with the defaults (newstore overlay max = 32) vs overlays disabled
> (newstore overlay max = 0) and see if/how much it is helping.
>
> The latest branch also has open-by-handle.  It's on by default (newstore
> open by handle = true).  I think for most workloads it won't be very
> noticeable... I think there are two questions we need to answer though:
>
> 1) Does it have any impact on a creation workload (say, 4kb objects).  It
> shouldn't, but we should confirm.

Would 4KB objects via rados bench be OK for this?

>
> 2) Does it impact small object random reads with a cold cache.  I think to
> see the effect we'll probably need to pile a ton of objects into the
> store, drop caches, and then do random reads.  In the best case the
> effect will be small, but hopefully noticeable: we should go from
> a directory lookup (1+ seeks) + inode lookup (1+ seek) + data
> read, to inode lookup (1+ seek) + data read.  So, 3 -> 2 seeks best case?
> I'm not really sure what XFS is doing under the covers here...
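
As a rough sanity check on the 3 -> 2 seek estimate above, here is a minimal sketch that counts seeks only. The 8 ms per-seek cost is a hypothetical figure, not something measured in this thread, and the model ignores transfer time, queueing, and whatever XFS actually does internally.

```python
# Hypothetical per-seek cost for a spinning disk; NOT measured in this thread.
SEEK_MS = 8.0


def cold_read_latency_ms(seeks, seek_ms=SEEK_MS):
    """Best-case latency of one cold-cache small-object read, counting seeks only."""
    return seeks * seek_ms


def reads_per_second(seeks, seek_ms=SEEK_MS):
    """Serial cold-cache reads per second for a single disk under the same model."""
    return 1000.0 / cold_read_latency_ms(seeks, seek_ms)


# Open by path: directory lookup + inode lookup + data read.
by_path = reads_per_second(3)
# Open by handle: inode lookup + data read.
by_handle = reads_per_second(2)

speedup = by_handle / by_path
print(f"by path:   {by_path:.1f} reads/s")
print(f"by handle: {by_handle:.1f} reads/s")
print(f"best-case speedup: {speedup:.2f}x")
```

Under this model the ratio does not depend on the assumed seek time: going from 3 seeks to 2 caps the best-case gain at 1.5x, and any extra seeks in the "1+" terms would change it.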

So the test process for the RBD results above was basically:

1) Create a configurable-size RBD volume (16GB in this case, across 4 OSDs).
2) Fill the volume with 4MB writes to preallocate the blocks.
3) Repeat for each test:
3a) Drop caches and sync.
3b) Run the test.
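
The steps above can be sketched as a command plan. This is only an illustration under stated assumptions: the pool name, image name, and client name are hypothetical, the fill and test commands assume fio with its rbd ioengine (the results quoted above are fio/librbd), and sync is issued before dropping caches because dirty pages cannot be dropped. The script only builds and prints the commands; it does not run anything, and on a real cluster the cache drop would have to happen as root on every OSD node.

```python
import itertools
import shlex

# Access patterns and block sizes matching the results table quoted above.
FIO_MODES = ["write", "read", "randwrite", "randread"]
BLOCK_SIZES = ["4M", "128k", "4k"]


def fio_cmd(name, pool, image, rw, bs, client="admin"):
    """Build a fio command line that targets an RBD image via librbd."""
    return [
        "fio",
        f"--name={name}",
        "--ioengine=rbd",
        f"--clientname={client}",
        f"--pool={pool}",
        f"--rbdname={image}",
        f"--rw={rw}",
        f"--bs={bs}",
    ]


def rbd_test_plan(pool="rbd", image="newstore-test", size_mb=16 * 1024):
    """Return the ordered list of commands for the test procedure."""
    plan = [
        # 1) Create the volume (16GB in the run described above).
        ["rbd", "create", "--size", str(size_mb), f"{pool}/{image}"],
        # 2) Fill it with 4MB writes to preallocate the blocks.
        fio_cmd("prefill", pool, image, "write", "4M"),
    ]
    # 3) For each test: flush and drop caches, then run the test.
    for bs, rw in itertools.product(BLOCK_SIZES, FIO_MODES):
        plan.append(["sync"])
        plan.append(["sh", "-c", "echo 3 > /proc/sys/vm/drop_caches"])
        plan.append(fio_cmd(f"{rw}-{bs}", pool, image, rw, bs))
    return plan


if __name__ == "__main__":
    for argv in rbd_test_plan():
        print(shlex.join(argv))
```

Keeping the plan as data makes it easy to review the exact sequence, or to swap in a different image size, before anything touches the cluster.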



>
> sage
