Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)
Date: Thu, 17 Dec 2015 05:07:12 +0000 (UTC)
Message-ID: <pan$b0077$783b4f16$83d44d74$9f275abe@cox.net>
In-Reply-To: 1450303141.6242.50.camel@scientia.net

Christoph Anton Mitterer posted on Wed, 16 Dec 2015 22:59:01 +0100 as
excerpted:

> I'm kinda curious what free space fragmentation actually means here.
> 
> Is it simply like this:
> +----------+-----+---+--------+
> |     F    |  D  | F |    D   |
> +----------+-----+---+--------+
> Where D is data (i.e. files/metadata) and F is free space.
> In other words, (F)ree space itself is not further subdivided and only
> fragmented by the (D)ata extents in between.
> 
> Or is it more complex like this:
> +-----+----+-----+---+--------+
> |  F  |  F |  D  | F |    D   |
> +-----+----+-----+---+--------+
> Where the (F)ree space itself is subdivided into "extents" (not
> necessarily of the same size), and btrfs couldn't use e.g. the first two
> F's as one contiguous amount of free space for a larger (D)ata extent

[still breaking into smaller points for reply]

At one level, I had the simpler f/d/f/d scheme in mind, but that would 
be the case inside a single data chunk.  At the higher file level, with 
files ranging from significant fractions of a single data chunk's size 
to much larger than a single data chunk, the second, more complex 
f/f/d/f/d case would apply, with the chunk boundary as the separation 
between the f/f.
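
To make the f/f case concrete, here is a minimal Python sketch of one 
free-space run being cut at a chunk boundary.  It assumes chunks sit at 
multiples of a fixed 1 GiB chunk size in a flat address space, which is a 
simplification for illustration only, and the function name is 
hypothetical, not anything from btrfs itself.

```python
GIB = 1024 ** 3

def split_at_chunk_boundaries(start, length, chunk_size=GIB):
    """Split a byte range into pieces that never cross a chunk boundary.

    Assumes chunks are aligned at multiples of chunk_size (a simplification).
    Returns a list of (start, length) tuples.
    """
    pieces = []
    end = start + length
    while start < end:
        # End of the chunk that contains `start`.
        chunk_end = (start // chunk_size + 1) * chunk_size
        piece_end = min(end, chunk_end)
        pieces.append((start, piece_end - start))
        start = piece_end
    return pieces

# One contiguous 1 GiB free run that starts halfway into the first chunk
# ends up as two separate free pieces, one per chunk: the f/f case.
for piece_start, piece_len in split_at_chunk_boundaries(GIB // 2, GIB):
    print(f"free piece at {piece_start / GIB:.1f} GiB, {piece_len / GIB:.1f} GiB long")
```

A free run that fits entirely inside one chunk comes back as a single 
piece, which is the simpler f/d/f/d picture.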

IOW, files larger than data chunk size will always be fragmented into 
data chunk size fragments/extents, at the largest, because chunks are 
designed to be movable using balance, device remove, replace, etc.

So (using the size numbers from a recent comment from Qu in a different 
thread), on a filesystem with under 100 GiB total space-effective 
(space-effective meaning space available after accounting for the 
replication type, raid1, etc, and I'm simplifying here...), data chunks 
should be 1 GiB, while above that, with striping, they might be up to 
10 GiB.

Using the 1 GiB nominal figure, files over 1 GiB would always be broken 
into 1 GiB maximum size extents, corresponding to 1 extent per chunk.
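
As a rough sketch of that arithmetic in Python: the lower bound on extent 
count is just a ceiling division of file size by chunk size.  This assumes 
the nominal 1 GiB chunk and the best case where every chunk the file 
touches is filled contiguously; the helper name is hypothetical.

```python
GIB = 1024 ** 3

def min_extents(file_size, chunk_size=GIB):
    """Lower bound on extent count when no extent may span a chunk."""
    return -(-file_size // chunk_size)  # ceiling division on integers

# Hypothetical file sizes, chosen only to illustrate the bound.
for size in (GIB // 2, GIB, 5 * GIB, 5 * GIB + 1):
    print(f"{size / GIB:.2f} GiB file -> at least {min_extents(size)} extent(s)")
```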

But while 4 KiB extents are clearly tiny and inefficient at today's 
scale, in practice, efficiency gains break down at well under GiB scale.  
AFAIK 128 MiB is the upper bound at which any efficiency gains could 
really be expected, and 1 MiB is arguably a reasonable point at which 
further increases in extent size likely won't have a whole lot of effect, 
even on SSD erase-blocks (where 1 MiB is a nominal max).  But that's 
still 256X the usual 4 KiB minimum data block size, 8X the 128 KiB btrfs 
compression-block size, and 4X the 256 KiB defrag default "don't bother 
with extents larger than this" size.
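
Those multipliers are easy to check.  A small Python sketch, using only 
the sizes quoted above (the dictionary labels are mine):

```python
KIB = 1024
MIB = 1024 * KIB

# Sizes as quoted in the paragraph above.
sizes = {
    "minimum data block": 4 * KIB,
    "btrfs compression block": 128 * KIB,
    "defrag default target extent": 256 * KIB,
}

# How many times each size fits into a 1 MiB extent.
ratios = {name: MIB // size for name, size in sizes.items()}

for name, ratio in ratios.items():
    print(f"1 MiB is {ratio}X the {sizes[name] // KIB} KiB {name} size")
```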

Basically, the 256 KiB btrfs defrag "don't bother with anything larger 
than this" default is quite reasonable, tho for massive multi-gig VM 
images, the number of 256 KiB fragments will still look pretty big, so 
while technically a very reasonable choice, the "eye appeal" still isn't 
that great.
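
To put a number on the "eye appeal" problem, a quick Python sketch 
counting 256 KiB fragments in an image.  The image sizes are hypothetical 
examples, and the count assumes every extent is exactly at the 256 KiB 
threshold.

```python
KIB = 1024
GIB = 1024 ** 3
DEFRAG_TARGET = 256 * KIB  # the defrag default discussed above

def fragments_at_target(image_size, extent_size=DEFRAG_TARGET):
    """Number of extents if the whole image sits in extent_size pieces."""
    return -(-image_size // extent_size)  # ceiling division

# Hypothetical VM image sizes.
for gib in (1, 10, 50):
    print(f"{gib} GiB image -> {fragments_at_target(gib * GIB)} extents of 256 KiB")
```

So a filefrag count in the tens of thousands on a multi-gig image can 
still mean every extent is at or above the default threshold.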

But based on real reports that posted before-and-after numbers from 
filefrag (on uncompressed btrfs), we do have cases where defrag can't 
find 256 KiB free-space blocks and thus can actually fragment a file 
worse than it was before, so free-space fragmentation is indeed a very 
real problem.
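
A toy Python model of how that can happen.  This is not the btrfs 
allocator, just a greedy largest-run-first placement I'm assuming for 
illustration, with made-up free-run sizes: if the biggest free runs are 
smaller than the file's existing extents, rewriting the file produces 
more extents, not fewer.

```python
KIB = 1024

def extents_after_rewrite(file_size, free_runs):
    """Greedy toy model: fill the largest free runs first.

    Returns the number of extents the rewritten file would need,
    or None if the free runs can't hold the file at all.
    """
    remaining = file_size
    count = 0
    for run in sorted(free_runs, reverse=True):
        if remaining <= 0:
            break
        remaining -= min(run, remaining)
        count += 1
    return count if remaining <= 0 else None

file_size = 1024 * KIB
extents_before = 8                    # hypothetical: eight 128 KiB extents
fragmented_free = [64 * KIB] * 32     # free space only in 64 KiB runs
healthy_free = [2048 * KIB]           # one large free run

print("before:", extents_before)
print("after, fragmented free space:", extents_after_rewrite(file_size, fragmented_free))
print("after, healthy free space:", extents_after_rewrite(file_size, healthy_free))
```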

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
