From: Roman Mamedov <rm@romanrm.ru>
To: dave@jikos.cz
Cc: linux-btrfs@vger.kernel.org, sensille@gmx.net, chris.mason@fusionio.com
Subject: Re: Massive metadata size increase after upgrade from 3.2.18 to 3.4.1
Date: Sun, 17 Jun 2012 21:29:08 +0600 [thread overview]
Message-ID: <20120617212908.4fcfd19c@natsu> (raw)
In-Reply-To: <20120614113316.GR32402@twin.jikos.cz>
On Thu, 14 Jun 2012 13:33:16 +0200
David Sterba <dave@jikos.cz> wrote:
> On Sat, Jun 09, 2012 at 01:38:22AM +0600, Roman Mamedov wrote:
> > Before the upgrade (on 3.2.18):
> >
> > Metadata, DUP: total=9.38GB, used=5.94GB
> >
> > After the FS has been mounted once with 3.4.1:
> >
> > Data: total=3.44TB, used=2.67TB
> > System, DUP: total=8.00MB, used=412.00KB
> > System: total=4.00MB, used=0.00
> > Metadata, DUP: total=84.38GB, used=5.94GB
> >
> > Where did my 75 GB of free space just go?
>
> This is caused by the patch (credits for bisecting it go to Arne)
>
> commit cf1d72c9ceec391d34c48724da57282e97f01122
> Author: Chris Mason <chris.mason@oracle.com>
> Date: Fri Jan 6 15:41:34 2012 -0500
>
> Btrfs: lower the bar for chunk allocation
>
> The chunk allocation code has tried to keep a pretty tight lid on creating new
> metadata chunks. This is partially because in the past the reservation
> code didn't give us an accurate idea of how much space was being used.
>
> The new code is much more accurate, so we're able to get rid of some of these
> checks.
> ---
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3263,27 +3263,12 @@ static int should_alloc_chunk(struct btrfs_root *root,
> if (num_bytes - num_allocated < thresh)
> return 1;
> }
> -
> - /*
> - * we have two similar checks here, one based on percentage
> - * and once based on a hard number of 256MB. The idea
> - * is that if we have a good amount of free
> - * room, don't allocate a chunk. A good mount is
> - * less than 80% utilized of the chunks we have allocated,
> - * or more than 256MB free
> - */
> - if (num_allocated + alloc_bytes + 256 * 1024 * 1024 < num_bytes)
> - return 0;
> -
> - if (num_allocated + alloc_bytes < div_factor(num_bytes, 8))
> - return 0;
> -
> thresh = btrfs_super_total_bytes(root->fs_info->super_copy);
>
> - /* 256MB or 5% of the FS */
> - thresh = max_t(u64, 256 * 1024 * 1024, div_factor_fine(thresh, 5));
> + /* 256MB or 2% of the FS */
> + thresh = max_t(u64, 256 * 1024 * 1024, div_factor_fine(thresh, 2));
>
> - if (num_bytes > thresh && sinfo->bytes_used < div_factor(num_bytes, 3))
> + if (num_bytes > thresh && sinfo->bytes_used < div_factor(num_bytes, 8))
> return 0;
> return 1;
> }
> ---
>
> Originally there were two types of checks, one based on a +256M margin
> and one based on a percentage. The former were removed, which leaves
> only the percentage thresholds. If less than 2% of the fs is actually
> used by metadata, the metadata reservation is pinned at exactly 2%.
> When actual usage goes over 2%, there's always at least 20%
> over-reservation,
>
> sinfo->bytes_used < div_factor(num_bytes, 8)
>
> i.e. the threshold is 80%, which may be wasteful for a large fs.
>
> So, the metadata chunks are immediately pinned to 2% of the filesystem
> after the first few writes, and this is what you observe.
>
> Running balance will remove the unused metadata chunks, but only to the
> 2% level.
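> The thresholds above can be sketched in userspace C. This is only an
> illustration, not kernel code: it reimplements div_factor()/
> div_factor_fine() as they behave in fs/btrfs, and the ~3.75 TiB
> filesystem size is a hypothetical figure chosen to roughly match the
> report (2% of it works out to the ~75 GB that went missing):
>
> ```c
> #include <assert.h>
> #include <stdint.h>
> #include <stdio.h>
>
> typedef uint64_t u64;
>
> /* div_factor(n, 8) == 80% of n (tenths). */
> static u64 div_factor(u64 num, int factor)
> {
> 	num *= factor;
> 	num /= 10;
> 	return num;
> }
>
> /* div_factor_fine(n, 2) == 2% of n (hundredths). */
> static u64 div_factor_fine(u64 num, int factor)
> {
> 	num *= factor;
> 	num /= 100;
> 	return num;
> }
>
> int main(void)
> {
> 	/* Hypothetical ~3.75 TiB filesystem, close to the reported one. */
> 	u64 total_bytes = 3750ULL << 30;
>
> 	/* Post-patch floor: max(256MB, 2% of the fs). */
> 	u64 thresh = div_factor_fine(total_bytes, 2);
> 	if (thresh < 256ULL * 1024 * 1024)
> 		thresh = 256ULL * 1024 * 1024;
>
> 	printf("2%% floor: %llu GiB\n", (unsigned long long)(thresh >> 30));
> 	assert((thresh >> 30) == 75);	/* ~75 GiB pinned for metadata */
>
> 	/* Above the floor, used >= 80% of allocated triggers another
> 	 * chunk, i.e. at least 20% stays over-reserved. */
> 	u64 num_bytes = 10ULL << 30;	/* 10 GiB of metadata chunks */
> 	u64 used = 9ULL << 30;		/* 9 GiB actually used */
> 	assert(used >= div_factor(num_bytes, 8));	/* 90% > 80% */
> 	return 0;
> }
> ```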
>
> [end of analysis]
>
> So what to do now? Simply reverting the +256M checks works and restores
> more or less the original behaviour.
Thanks.
So should I try restoring both of these, and leave the rest as is?
> - if (num_allocated + alloc_bytes + 256 * 1024 * 1024 < num_bytes)
> - return 0;
> -
> - if (num_allocated + alloc_bytes < div_factor(num_bytes, 8))
> - return 0;
Or would it make more sense to try rolling back that patch completely?
--
With respect,
Roman
~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."
Thread overview: 5+ messages
2012-06-08 19:38 Massive metadata size increase after upgrade from 3.2.18 to 3.4.1 Roman Mamedov
2012-06-12 17:38 ` Calvin Walton
2012-06-13 10:30 ` Anand Jain
2012-06-14 11:33 ` David Sterba
2012-06-17 15:29 ` Roman Mamedov [this message]