From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mark Nelson Subject: Re: [Performance] Improvement on DB Performance Date: Wed, 21 May 2014 10:53:01 -0500 Message-ID: <537CCBDD.1000400@inktank.com> References: <537CCB49.8060408@cloudapt.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: Received: from mail-ig0-f180.google.com ([209.85.213.180]:61028 "EHLO mail-ig0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752109AbaEUPxB (ORCPT ); Wed, 21 May 2014 11:53:01 -0400 Received: by mail-ig0-f180.google.com with SMTP id c1so2296606igq.1 for ; Wed, 21 May 2014 08:53:01 -0700 (PDT) In-Reply-To: <537CCB49.8060408@cloudapt.com> Sender: ceph-devel-owner@vger.kernel.org List-ID: To: Mike Dawson , Sage Weil , Haomai Wang Cc: "ceph-devel@vger.kernel.org" On 05/21/2014 10:50 AM, Mike Dawson wrote: > Haomai, > > Thanks for finding this! Yes agreed, this looks very exciting. :D > > > Sage, > > We have a client that runs an io intensive, closed-source software > package that seems to issue overzealous flushes which may benefit from > this patch (or the other methods you mention). If you were to spin a wip > build based on Dumpling, I'll be a willing tester. I'd be happy to jump on the bandwagon too. I'm in the middle of RBD testing using fio with the librbd engine. > > Thanks, > Mike Dawson > > On 5/21/2014 11:23 AM, Sage Weil wrote: >> On Wed, 21 May 2014, Haomai Wang wrote: >>> I pushed the commit to fix this >>> problem(https://github.com/ceph/ceph/pull/1848). >>> >>> With test program(Each sync request is issued with ten write request), >>> a significant improvement is noticed. >>> >>> aio_flush sum: 914750 avg: 1239 count: >>> 738 max: 4714 min: 1011 >>> flush_set sum: 904200 avg: 1225 count: >>> 738 max: 4698 min: 999 >>> flush sum: 641648 avg: 173 count: >>> 3690 max: 1340 min: 128 >>> >>> Compared to last mail, it reduce each aio_flush request to 1239 ns >>> instead of 24145 ns. >> >> Good catch! That's a great improvement. >> >> The patch looks clearly correct. We can probably do even better by >> putting the Objects on a list when they get the first dirty buffer so >> that >> we only cycle through the dirty ones. Or, have a global list of dirty >> buffers (instead of dirty objects -> dirty buffers). >> >> sage >> >>> >>> I hope it's the root cause for db on rbd performance. >>> >>> On Wed, May 21, 2014 at 6:15 PM, Haomai Wang >>> wrote: >>>> Hi all, >>>> >>>> I remember there exists discuss about DB(mysql) performance on rbd. >>>> Recently I test mysql-bench with rbd and found awful performance. So I >>>> dive into it and find that main cause is "flush" request from guest. >>>> As we know, applications such as mysql, ceph has own journal for >>>> durable and journal usually send sync&direct io. If fs barrier is on, >>>> each sync io operation make kernel issue "sync"(barrier) request to >>>> block device. Here, qemu will call "rbd_aio_flush" to apply. >>>> >>>> Via systemtap, I found a amazing thing: >>>> aio_flush sum: 4177085 avg: 24145 count: >>>> 173 max: 28172 min: 22747 >>>> flush_set sum: 4172116 avg: 24116 count: >>>> 173 max: 28034 min: 22733 >>>> flush sum: 3029910 avg: 4 count: >>>> 670477 max: 1893 min: 3 >>>> >>>> This statistic info is gathered in 5s. Most of consuming time is on >>>> "ObjectCacher::flush". What's more, with time increasing, the flush >>>> count will be increasing. >>>> >>>> After view source, I find the root cause is "ObjectCacher::flush_set", >>>> it will iterator the "object_set" and look for dirty buffer. And >>>> "object_set" contains all objects ever opened. For example: >>>> >>>> 2014-05-21 18:01:37.959013 7f785c7c6700 0 objectcacher flush_set >>>> total: 5919 flushed: 5 >>>> 2014-05-21 18:01:37.999698 7f785c7c6700 0 objectcacher flush_set >>>> total: 5919 flushed: 5 >>>> 2014-05-21 18:01:38.038405 7f785c7c6700 0 objectcacher flush_set >>>> total: 5920 flushed: 5 >>>> 2014-05-21 18:01:38.080118 7f785c7c6700 0 objectcacher flush_set >>>> total: 5920 flushed: 5 >>>> 2014-05-21 18:01:38.119792 7f785c7c6700 0 objectcacher flush_set >>>> total: 5921 flushed: 5 >>>> 2014-05-21 18:01:38.162004 7f785c7c6700 0 objectcacher flush_set >>>> total: 5922 flushed: 5 >>>> 2014-05-21 18:01:38.202755 7f785c7c6700 0 objectcacher flush_set >>>> total: 5923 flushed: 5 >>>> 2014-05-21 18:01:38.243880 7f785c7c6700 0 objectcacher flush_set >>>> total: 5923 flushed: 5 >>>> 2014-05-21 18:01:38.284399 7f785c7c6700 0 objectcacher flush_set >>>> total: 5923 flushed: 5 >>>> >>>> These logs record the iteration info, the loop will check 5920 objects >>>> but only 5 objects are dirty. >>>> >>>> So I think the solution is make "ObjectCacher::flush_set" only >>>> iterator the objects which is dirty. >>>> >>>> -- >>>> Best Regards, >>>> >>>> Wheat >>> >>> >>> >>> -- >>> Best Regards, >>> >>> Wheat >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html