From: Stefan Priebe <s.priebe@profihost.ag>
To: Sage Weil <sage@inktank.com>
Cc: Mike Dawson <mike.dawson@cloudapt.com>,
	Haomai Wang <haomaiwang@gmail.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: [Performance] Improvement on DB Performance
Date: Wed, 21 May 2014 22:05:39 +0200
Message-ID: <537D0713.2020807@profihost.ag>
In-Reply-To: <18096300-29E6-4BAC-B956-0E5D3F80E379@profihost.ag>

*Argh*, sorry, I mixed up emperor with dumpling.

Stefan

On 21.05.2014 20:51, Stefan Priebe - Profihost AG wrote:
>
>> On 21.05.2014 at 20:41, Sage Weil <sage@inktank.com> wrote:
>>
>>> On Wed, 21 May 2014, Stefan Priebe - Profihost AG wrote:
>>> Hi sage,
>>>
>>> what about cuttlefish customers?
>>
>> We stopped backporting fixes to cuttlefish a while ago.  Please upgrade to
>> dumpling!
>
> Did I miss an announcement from Inktank to update to dumpling? I thought we should stay on cuttlefish and then upgrade to firefly.
>
>>
>> That said, this patch should apply cleanly to cuttlefish.
>>
>> sage
>>
>>
>>>
>>> Greets,
>>> Stefan
>>> Excuse any typos; sent from my mobile phone.
>>>
>>> On 21.05.2014 at 18:15, Sage Weil <sage@inktank.com> wrote:
>>>
>>>       On Wed, 21 May 2014, Mike Dawson wrote:
>>>             Haomai,
>>>
>>>             Thanks for finding this!
>>>
>>>             Sage,
>>>
>>>             We have a client that runs an I/O-intensive, closed-source software
>>>             package that seems to issue overzealous flushes, which may benefit
>>>             from this patch (or the other methods you mention). If you were to
>>>             spin a wip build based on Dumpling, I'll be a willing tester.
>>>
>>>       Pushed wip-librbd-flush-dumpling, should be built shortly.
>>>
>>>       sage
>>>
>>>             Thanks,
>>>             Mike Dawson
>>>
>>>             On 5/21/2014 11:23 AM, Sage Weil wrote:
>>>
>>>                   On Wed, 21 May 2014, Haomai Wang wrote:
>>>
>>>                         I pushed the commit that fixes this problem
>>>                         (https://github.com/ceph/ceph/pull/1848).
>>>
>>>                         With a test program (each sync request is issued
>>>                         together with ten write requests), a significant
>>>                         improvement is noticeable:
>>>
>>>
>>>                         aio_flush  sum: 914750  avg: 1239  count: 738   max: 4714  min: 1011
>>>                         flush_set  sum: 904200  avg: 1225  count: 738   max: 4698  min: 999
>>>                         flush      sum: 641648  avg: 173   count: 3690  max: 1340  min: 128
>>>
>>>
>>>                         Compared to the last mail, this reduces each aio_flush
>>>                         request to 1239 ns instead of 24145 ns.
>>>
>>>
>>>                   Good catch!  That's a great improvement.
>>>
>>>                   The patch looks clearly correct.  We can probably do even
>>>                   better by putting the Objects on a list when they get the
>>>                   first dirty buffer so that we only cycle through the dirty
>>>                   ones.  Or, have a global list of dirty buffers (instead of
>>>                   dirty objects -> dirty buffers).
>>>
>>>                   sage
>>>
>>>
>>>
>>>                         I hope this is the root cause of the poor DB-on-rbd
>>>                         performance.
>>>
>>>
>>>                         On Wed, May 21, 2014 at 6:15 PM, Haomai Wang
>>>                         <haomaiwang@gmail.com> wrote:
>>>
>>>                               Hi all,
>>>
>>>                               I remember there was a discussion about
>>>                               DB (mysql) performance on rbd. Recently I ran
>>>                               mysql-bench on rbd and found awful performance,
>>>                               so I dug into it and found that the main cause
>>>                               is the "flush" request from the guest.
>>>
>>>                               As we know, applications such as mysql and ceph
>>>                               keep their own journal for durability, and a
>>>                               journal usually sends sync and direct I/O. If
>>>                               the fs barrier is on, each sync I/O operation
>>>                               makes the kernel issue a "sync" (barrier)
>>>                               request to the block device. Here, qemu calls
>>>                               "rbd_aio_flush" to apply it.
>>>
>>>
>>>                               Via systemtap, I found an amazing thing:
>>>
>>>                               aio_flush  sum: 4177085  avg: 24145  count: 173     max: 28172  min: 22747
>>>                               flush_set  sum: 4172116  avg: 24116  count: 173     max: 28034  min: 22733
>>>                               flush      sum: 3029910  avg: 4      count: 670477  max: 1893   min: 3
>>>
>>>
>>>                               These statistics were gathered over 5 s. Most of
>>>                               the time is spent in "ObjectCacher::flush". What
>>>                               is more, the flush count keeps increasing over
>>>                               time.
>>>
>>>                               After viewing the source, I found that the root
>>>                               cause is "ObjectCacher::flush_set": it iterates
>>>                               over "object_set" looking for dirty buffers, and
>>>                               "object_set" contains all objects ever opened.
>>>                               For example:
>>>
>>>
>>>                               2014-05-21 18:01:37.959013 7f785c7c6700  0 objectcacher flush_set total: 5919 flushed: 5
>>>                               2014-05-21 18:01:37.999698 7f785c7c6700  0 objectcacher flush_set total: 5919 flushed: 5
>>>                               2014-05-21 18:01:38.038405 7f785c7c6700  0 objectcacher flush_set total: 5920 flushed: 5
>>>                               2014-05-21 18:01:38.080118 7f785c7c6700  0 objectcacher flush_set total: 5920 flushed: 5
>>>                               2014-05-21 18:01:38.119792 7f785c7c6700  0 objectcacher flush_set total: 5921 flushed: 5
>>>                               2014-05-21 18:01:38.162004 7f785c7c6700  0 objectcacher flush_set total: 5922 flushed: 5
>>>                               2014-05-21 18:01:38.202755 7f785c7c6700  0 objectcacher flush_set total: 5923 flushed: 5
>>>                               2014-05-21 18:01:38.243880 7f785c7c6700  0 objectcacher flush_set total: 5923 flushed: 5
>>>                               2014-05-21 18:01:38.284399 7f785c7c6700  0 objectcacher flush_set total: 5923 flushed: 5
>>>
>>>
>>>                               These logs record the iteration info: the loop
>>>                               checks about 5920 objects, but only 5 of them
>>>                               are dirty.
>>>
>>>                               So I think the solution is to make
>>>                               "ObjectCacher::flush_set" iterate only over the
>>>                               objects that are dirty.
>>>
>>>
>>>                               --
>>>                               Best Regards,
>>>
>>>                               Wheat
>>>
>>>                         --
>>>                         Best Regards,
>>>
>>>                         Wheat
>>>
>>>                         --
>>>                         To unsubscribe from this list: send the line
>>>                         "unsubscribe ceph-devel" in the body of a message to
>>>                         majordomo@vger.kernel.org
>>>                         More majordomo info at
>>>                         http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>

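The fix discussed in the quoted thread (iterate only over the dirty objects
instead of over every object ever opened) can be illustrated with a small
sketch. This is not the code from the pull request: ToyCache, Object,
FlushStats, scan_all and flush_dirty_only are hypothetical names chosen for
illustration, and the real ObjectCacher tracks buffers and I/O state that this
toy omits. It only shows why a side index of dirty objects makes the flush
cost proportional to the number of dirty objects rather than to the total.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a cached object; not Ceph's real class.
struct Object {
    bool dirty = false;
};

struct FlushStats {
    std::size_t visited = 0;  // objects examined by the loop
    std::size_t flushed = 0;  // objects that were actually dirty
};

// Toy cache holding every object ever opened plus an index of the dirty ones.
class ToyCache {
public:
    explicit ToyCache(std::size_t n) : objects_(n) {}

    // Mark object i dirty and remember it in the dirty index.
    void mark_dirty(std::size_t i) {
        objects_[i].dirty = true;
        dirty_.insert(i);
    }

    // Behaviour described in the thread: walk every object, count dirty ones.
    // Read-only here so both strategies can be compared on the same state.
    FlushStats scan_all() const {
        FlushStats s;
        for (const Object& o : objects_) {
            ++s.visited;
            if (o.dirty) ++s.flushed;
        }
        return s;
    }

    // Proposed behaviour: walk only the dirty index, then clear it.
    FlushStats flush_dirty_only() {
        FlushStats s;
        for (std::size_t i : dirty_) {
            ++s.visited;
            objects_[i].dirty = false;  // "flush" the object
            ++s.flushed;
        }
        dirty_.clear();
        return s;
    }

private:
    std::vector<Object> objects_;
    std::unordered_set<std::size_t> dirty_;
};
```

With 5920 objects of which 5 are dirty, as in the quoted logs, scan_all
examines all 5920 objects while flush_dirty_only examines only the 5 in the
index. The cost is one extra set insertion each time an object becomes dirty.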