From: Mark Nelson <mark.nelson@inktank.com>
To: Mike Dawson <mike.dawson@cloudapt.com>,
Sage Weil <sage@inktank.com>, Haomai Wang <haomaiwang@gmail.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: [Performance] Improvement on DB Performance
Date: Wed, 21 May 2014 10:53:01 -0500
Message-ID: <537CCBDD.1000400@inktank.com>
In-Reply-To: <537CCB49.8060408@cloudapt.com>
On 05/21/2014 10:50 AM, Mike Dawson wrote:
> Haomai,
>
> Thanks for finding this!
Yes agreed, this looks very exciting. :D
>
>
> Sage,
>
> We have a client that runs an io intensive, closed-source software
> package that seems to issue overzealous flushes which may benefit from
> this patch (or the other methods you mention). If you were to spin a wip
> build based on Dumpling, I'll be a willing tester.
I'd be happy to jump on the bandwagon too. I'm in the middle of RBD
testing using fio with the librbd engine.
>
> Thanks,
> Mike Dawson
>
> On 5/21/2014 11:23 AM, Sage Weil wrote:
>> On Wed, 21 May 2014, Haomai Wang wrote:
>>> I pushed a commit to fix this problem
>>> (https://github.com/ceph/ceph/pull/1848).
>>>
>>> With a test program (each sync request is issued with ten write
>>> requests), a significant improvement is visible:
>>>
>>> aio_flush  sum: 914750  avg: 1239  count: 738   max: 4714  min: 1011
>>> flush_set  sum: 904200  avg: 1225  count: 738   max: 4698  min: 999
>>> flush      sum: 641648  avg: 173   count: 3690  max: 1340  min: 128
>>>
>>> Compared to my last mail, this reduces the average aio_flush request
>>> from 24145 ns to 1239 ns.
>>
>> Good catch! That's a great improvement.
>>
>> The patch looks clearly correct. We can probably do even better by
>> putting the Objects on a list when they get the first dirty buffer so
>> that we only cycle through the dirty ones. Or, have a global list of
>> dirty buffers (instead of dirty objects -> dirty buffers).
>>
>> sage
>>
>>>
>>> I hope this is the root cause of the poor DB-on-RBD performance.
>>>
>>> On Wed, May 21, 2014 at 6:15 PM, Haomai Wang <haomaiwang@gmail.com>
>>> wrote:
>>>> Hi all,
>>>>
>>>> I remember there was a discussion about DB (MySQL) performance on RBD.
>>>> Recently I tested mysql-bench on RBD and found awful performance, so I
>>>> dug into it and found that the main cause is the "flush" requests from
>>>> the guest. As we know, applications such as MySQL and Ceph keep their
>>>> own journal for durability, and the journal usually issues sync and
>>>> direct I/O. If the filesystem barrier is on, each sync I/O operation
>>>> makes the kernel issue a "sync" (barrier) request to the block device.
>>>> Here, qemu calls "rbd_aio_flush" to apply it.
>>>>
>>>> Via systemtap, I found something surprising:
>>>> aio_flush  sum: 4177085  avg: 24145  count: 173     max: 28172  min: 22747
>>>> flush_set  sum: 4172116  avg: 24116  count: 173     max: 28034  min: 22733
>>>> flush      sum: 3029910  avg: 4      count: 670477  max: 1893   min: 3
>>>>
>>>> These statistics were gathered over 5 s. Most of the time is spent in
>>>> "ObjectCacher::flush". What's more, the flush count keeps increasing
>>>> as time goes on.
>>>>
>>>> After reading the source, I found that the root cause is
>>>> "ObjectCacher::flush_set": it iterates over the "object_set" looking
>>>> for dirty buffers, and "object_set" contains all objects ever opened.
>>>> For example:
>>>>
>>>> 2014-05-21 18:01:37.959013 7f785c7c6700 0 objectcacher flush_set total: 5919 flushed: 5
>>>> 2014-05-21 18:01:37.999698 7f785c7c6700 0 objectcacher flush_set total: 5919 flushed: 5
>>>> 2014-05-21 18:01:38.038405 7f785c7c6700 0 objectcacher flush_set total: 5920 flushed: 5
>>>> 2014-05-21 18:01:38.080118 7f785c7c6700 0 objectcacher flush_set total: 5920 flushed: 5
>>>> 2014-05-21 18:01:38.119792 7f785c7c6700 0 objectcacher flush_set total: 5921 flushed: 5
>>>> 2014-05-21 18:01:38.162004 7f785c7c6700 0 objectcacher flush_set total: 5922 flushed: 5
>>>> 2014-05-21 18:01:38.202755 7f785c7c6700 0 objectcacher flush_set total: 5923 flushed: 5
>>>> 2014-05-21 18:01:38.243880 7f785c7c6700 0 objectcacher flush_set total: 5923 flushed: 5
>>>> 2014-05-21 18:01:38.284399 7f785c7c6700 0 objectcacher flush_set total: 5923 flushed: 5
>>>>
>>>> These logs record the iteration info: the loop checks about 5920
>>>> objects, but only 5 of them are dirty.
>>>>
>>>> So I think the solution is to make "ObjectCacher::flush_set" iterate
>>>> only over the objects that are dirty.
>>>>
>>>> --
>>>> Best Regards,
>>>>
>>>> Wheat
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>> Wheat
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
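For readers following the thread, the idea discussed above (have flush_set visit only dirty objects instead of every object ever opened) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from the pull request: the types Object, ObjectSet and FlushStats, and the functions mark_dirty, flush_set_scan_all and flush_set_dirty_only, are hypothetical stand-ins and not Ceph's real ObjectCacher API.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-ins; NOT Ceph's real ObjectCacher types.
struct Object {
    int id = 0;
    int dirty_buffers = 0;  // number of dirty buffers held by this object
};

struct ObjectSet {
    std::vector<Object> objects;        // every object ever opened
    std::unordered_set<int> dirty_ids;  // only objects that hold dirty buffers

    explicit ObjectSet(int n) : objects(n) {
        for (int i = 0; i < n; ++i) objects[i].id = i;
    }

    // Record a write; the object joins the dirty set on its first dirty buffer.
    void mark_dirty(int id) {
        if (objects[id].dirty_buffers++ == 0) dirty_ids.insert(id);
    }
};

struct FlushStats {
    std::size_t visited = 0;  // objects examined by the loop
    std::size_t flushed = 0;  // objects that actually had dirty buffers
};

// Behaviour described in the thread: scan every object, flush the dirty ones.
FlushStats flush_set_scan_all(ObjectSet& os) {
    FlushStats st;
    for (Object& o : os.objects) {
        ++st.visited;
        if (o.dirty_buffers > 0) {
            o.dirty_buffers = 0;  // pretend the buffers were written back
            ++st.flushed;
        }
    }
    os.dirty_ids.clear();
    return st;
}

// Direction suggested in the thread: visit only objects recorded as dirty.
FlushStats flush_set_dirty_only(ObjectSet& os) {
    FlushStats st;
    for (int id : os.dirty_ids) {
        ++st.visited;
        os.objects[id].dirty_buffers = 0;  // pretend the buffers were written back
        ++st.flushed;
    }
    os.dirty_ids.clear();
    return st;
}
```

With 5920 objects of which 5 are dirty, the first function examines all 5920 objects per flush while the second examines only 5, so the cost of a flush scales with the amount of dirty data rather than with the number of objects ever opened. The trade-off is the extra bookkeeping on every write to keep the dirty set accurate.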
Thread overview: 13+ messages
2014-05-21 10:15 [Performance] Improvement on DB Performance Haomai Wang
2014-05-21 10:21 ` Haomai Wang
2014-05-21 12:00 ` Luke Jing Yuan
2014-05-21 12:12 ` Haomai Wang
2014-05-21 15:06 ` Mark Nelson
2014-05-21 15:23 ` Sage Weil
2014-05-21 15:50 ` Mike Dawson
2014-05-21 15:53 ` Mark Nelson [this message]
2014-05-21 16:15 ` Sage Weil
[not found] ` <77004F70-7FE7-4EBE-A34D-46A8DC290936@profihost.ag>
2014-05-21 18:41 ` Sage Weil
2014-05-21 18:51 ` Stefan Priebe - Profihost AG
2014-05-21 20:05 ` Stefan Priebe
2014-05-26 13:57 ` Haomai Wang