Re: [PATCH 02/20] Btrfs: initialize new bitmaps' list

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Christian Brunner <chb@muc.de>
To: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 02/20] Btrfs: initialize new bitmaps' list
Date: Fri, 9 Dec 2011 16:17:21 +0100	[thread overview]
Message-ID: <CAO47_-_38Ov_PKuqod6wRntipHagbhi0n=Vi5wgwtRrXZm_-ZQ@mail.gmail.com> (raw)
In-Reply-To: <CAO47_--d2oEHCyMafoBLWfbd1WZsASngBh=4_TNiwrwd+QJkGA@mail.gmail.com>

2011/12/7 Christian Brunner <chb@muc.de>:
> 2011/12/1 Christian Brunner <chb@muc.de>:
>> 2011/12/1 Alexandre Oliva <oliva@lsd.ic.unicamp.br>:
>>> On Nov 29, 2011, Christian Brunner <chb@muc.de> wrote:
>>>
>>>> When I'm doing havy reading in our ceph cluster. The load and wait-io
>>>> on the patched servers is higher than on the unpatched ones.
>>>
>>> That's unexpected.
>
> In the mean time I know, that it's not related to the reads.
>
>>> I suppose I could wave my hands while explaining that you're getting
>>> higher data throughput, so it's natural that it would take up more
>>> resources, but that explanation doesn't satisfy me.  I suppose
>>> allocation might have got slightly more CPU intensive in some cases, as
>>> we now use bitmaps where before we'd only use the cheaper-to-allocate
>>> extents.  But that's unsafisfying as well.
>>
>> I must admit, that I do not completely understand the difference
>> between bitmaps and extents.
>>
>> From what I see on my servers, I can tell, that the degradation over
>> time is gone. (Rebooting the servers every day is no longer needed.
>> This is a real plus.) But the performance compared to a freshly
>> booted, unpatched server is much slower with my ceph workload.
>>
>> I wonder if it would make sense to initialize the list field only,
>> when the cluster setup fails? This would avoid the fallback to the
>> much unclustered allocation and would give us the cheaper-to-allocate
>> extents.
>
> I've now tried various combinations of you patches and I can really
> nail it down to this one line.
>
> With this patch applied I get much higher write-io values than without
> it. Some of the other patches help to reduce the effect, but it's
> still significant.
>
> iostat on an unpatched node is giving me:
>
> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s
> avgrq-sz avgqu-sz   await  svctm  %util
> sda             105.90     0.37   15.42   14.48  2657.33   560.13
> 107.61     1.89   62.75   6.26  18.71
>
> while on a node with this patch it's
> sda             128.20     0.97   11.10   57.15  3376.80   552.80
> 57.58    20.58  296.33   4.16  28.36
>
>
> Also interesting, is the fact that the average request size on the
> patched node is much smaller.
>
> Josef was telling me, that this could be related to the number of
> bitmaps we write out, but I've no idea how to trace this.
>
> I would be very happy if someone could give me a hint on what to do
> next, as this is one of the last remaining issues with our ceph
> cluster.

This is still bugging me and I just remembered something that might be
helpfull. Also I hope that this is not misleading...

Back in 2.6.38 we were running ceph without btrfs performance
degradation. I found a thread on the list where similar problems where
reported:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg10346.html

In that thread someone bisected the issue to

>From 4e69b598f6cfb0940b75abf7e179d6020e94ad1e Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef@redhat.com>
Date: Mon, 21 Mar 2011 10:11:24 -0400
Subject: [PATCH] Btrfs: cleanup how we setup free space clusters

In this commit the bitmaps handling was changed. So I just thought
that this may be related.

I'm still hoping, that someone with a deeper understanding of btrfs
could take a look at this.

Thanks,
Christian

next prev parent reply	other threads:[~2011-12-09 15:17 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-28 19:17 [PATCH 00/20] Here's my current btrfs patchset Alexandre Oliva
2011-10-14 15:10 ` [PATCH 13/20] Btrfs: revamp clustered allocation logic Alexandre Oliva
2011-10-29  4:20 ` [PATCH 17/20] Btrfs: introduce -o cluster and -o nocluster Alexandre Oliva
2011-11-08 14:25 ` [PATCH 14/20] Btrfs: introduce option to rebalance only metadata Alexandre Oliva
2011-11-08 14:33 ` [PATCH 01/20] Btrfs: enable removal of second disk with raid1 metadata Alexandre Oliva
2011-11-10 22:55 ` [PATCH 18/20] Btrfs: add -o mincluster option Alexandre Oliva
2011-11-16  3:25 ` [PATCH 10/20] Btrfs: report reason for failed relocation Alexandre Oliva
2011-11-25 23:47 ` [PATCH 20/20] Btrfs: don't waste metadata block groups for clustered allocation Alexandre Oliva
2011-11-26 21:19 ` [PATCH 12/20] Btrfs: introduce verbose debug mode for patched clustered allocation recovery Alexandre Oliva
2011-11-26 23:53 ` [PATCH 05/20] Btrfs: start search for new cluster at the beginning of the block group Alexandre Oliva
2011-11-27  2:05 ` [PATCH 09/20] Btrfs: skip allocation attempt from empty cluster Alexandre Oliva
2011-11-27 22:49 ` [PATCH 16/20] Btrfs: try cluster but don't advance in search list Alexandre Oliva
2011-11-28  0:07 ` [PATCH 08/20] Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE Alexandre Oliva
2011-11-28  2:04 ` [PATCH 15/20] Btrfs: activate allocation debugging Alexandre Oliva
2011-11-28 13:52 ` [PATCH 02/20] Btrfs: initialize new bitmaps' list Alexandre Oliva
2011-11-29 21:00   ` Christian Brunner
2011-11-30 23:28     ` Alexandre Oliva
2011-12-01 14:50       ` Christian Brunner
2011-12-07 20:50         ` Christian Brunner
2011-12-09 15:17           ` Christian Brunner [this message]
2011-12-12  7:47           ` Alexandre Oliva
2011-12-12  8:05             ` Christian Brunner
2011-11-28 14:07 ` [PATCH 04/20] Btrfs: reset cluster's max_size when creating bitmap cluster Alexandre Oliva
2011-11-28 14:22 ` [PATCH 11/20] Btrfs: note when a bitmap is skipped because its list is in use Alexandre Oliva
2011-11-28 14:23 ` [PATCH 19/20] Btrfs: log when a bitmap is rejected for a cluster Alexandre Oliva
2011-11-28 14:30 ` [PATCH 06/20] Btrfs: skip block groups without enough space " Alexandre Oliva
2011-11-28 14:36 ` [PATCH 07/20] Btrfs: don't set up allocation result twice Alexandre Oliva
2011-11-28 15:41 ` [PATCH 03/20] Btrfs: fix comment typo Alexandre Oliva
2011-11-29  2:25 ` [PATCH 00/20] Here's my current btrfs patchset Li Zefan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAO47_-_38Ov_PKuqod6wRntipHagbhi0n=Vi5wgwtRrXZm_-ZQ@mail.gmail.com' \
    --to=chb@muc.de \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=oliva@lsd.ic.unicamp.br \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).