From: Martin Steigerwald <Martin@lichtvoll.de>
To: Hugo Mills <hugo@carfax.org.uk>
Cc: Robert White <rwhite@pobox.com>, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS free space handling still needs more work: Hangs again
Date: Sat, 27 Dec 2014 18:59:31 +0100
Message-ID: <6579515.E2MeSbCGLA@merkaba>
In-Reply-To: <9346949.uCfVN6IAc7@merkaba>
On Saturday, 27 December 2014, 18:11:21, Martin Steigerwald wrote:
> On Saturday, 27 December 2014, 16:26:42, Hugo Mills wrote:
> > On Sat, Dec 27, 2014 at 06:54:33AM -0800, Robert White wrote:
> > > On 12/27/2014 05:55 AM, Martin Steigerwald wrote:
> > [snip]
> > > > >while fio was just *laying* out the 4 GiB file. Yes, that's 100% system CPU
> > > > >for 10 seconds while allocating a 4 GiB file on a filesystem like:
> > > >
> > > >martin@merkaba:~> LANG=C df -hT /home
> > > >Filesystem Type Size Used Avail Use% Mounted on
> > > >/dev/mapper/msata-home btrfs 170G 156G 17G 91% /home
> > > >
> > > >where a 4 GiB file should easily fit, no? (And this output is with the 4
> > > >GiB file. So it was even 4 GiB more free before.)
> > >
> > > No. /usr/bin/df is an _approximation_ in BTRFS because of the limits
> > > > of the statfs() function call. The statfs() function call was defined
> > > in 1990 and "can't understand" the dynamic allocation model used in
> > > BTRFS as it assumes fixed geometry for filesystems. You do _not_
> > > have 17G actually available. You need to rely on btrfs fi df and
> > > btrfs fi show to figure out how much space you _really_ have.
> > >
> > > According to this block you have a RAID1 of ~ 160GB expanse (two 160G disks)
> > >
> > > > merkaba:~> date; btrfs fi sh /home ; btrfs fi df /home
> > > > Sa 27. Dez 13:26:39 CET 2014
> > > > Label: 'home' uuid: [some UUID]
> > > > Total devices 2 FS bytes used 152.83GiB
> > > > > devid 1 size 160.00GiB used 160.00GiB path /dev/mapper/msata-home
> > > > > devid 2 size 160.00GiB used 160.00GiB path /dev/mapper/sata-home
> > >
> > > > And according to this block you have about 5.39GiB of data space:
> > >
> > > > Btrfs v3.17
> > > > Data, RAID1: total=154.97GiB, used=149.58GiB
> > > > System, RAID1: total=32.00MiB, used=48.00KiB
> > > > Metadata, RAID1: total=5.00GiB, used=3.26GiB
> > > > GlobalReserve, single: total=512.00MiB, used=0.00B
> > >
> > > 154.97
> > > 5.00
> > > 0.032
> > > + 0.512
> > >
> > > Pretty much as close to 160GiB as you are going to get (those
> > > numbers being rounded up in places for "human readability") BTRFS
> > > > has allocated 100% of the raw storage into typed extents.
> > >
> > > A large datafile can only fit in the 154.97-149.58 = 5.39
> >
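The arithmetic in the quoted block can be checked with a short sketch. The
figures are copied from the btrfs fi df output above; the variable names are
illustrative only, and GlobalReserve is left out on the assumption that it is
accounted for inside the metadata allocation rather than on top of it.

```python
# Figures in GiB, copied from the quoted `btrfs fi df /home` output.
# Names are illustrative; this is not how btrfs-progs computes anything.
allocated = {
    "Data": 154.97,
    "Metadata": 5.00,
    "System": 32.0 / 1024,  # 32.00MiB expressed in GiB
}
data_used = 149.58
device_size = 160.00

# Space already handed out to typed chunks on each device of the RAID1.
total_allocated = sum(allocated.values())

# The displayed values are rounded, so allow a small tolerance.
fully_allocated = total_allocated >= device_size - 0.01

# Room left inside the already-allocated data chunks.
data_free = allocated["Data"] - data_used

print(f"allocated to chunks: {total_allocated:.2f} GiB of {device_size:.2f} GiB")
print(f"device fully allocated: {fully_allocated}")
print(f"free inside data chunks: {data_free:.2f} GiB")
```

So new data can only go into the space that is still free inside existing data
chunks; no further chunks can be allocated.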
> > I appreciate that this is something of a minor point in the grand
> > scheme of things, but I'm afraid I've lost the enthusiasm to engage
> > with the broader (somewhat rambling, possibly-at-cross-purposes)
> > conversation in this thread. However...
> >
> > > Trying to allocate that 4GiB file into that 5.39GiB of space becomes
> > > an NP-complete (e.g. "very hard") problem if it is very fragmented.
> >
> > This is... badly mistaken, at best. The problem of where to write a
> > file into a set of free extents is definitely *not* an NP-hard
> > problem. It's a P problem, with an O(n log n) solution, where n is the
> > number of free extents in the free space cache. The simple approach:
> > fill the first hole with as many bytes as you can, then move on to the
> > next hole. More complex: order the free extents by size first. Both of
> > these are O(n log n) algorithms, given an efficient general-purpose
> > index of free space.
> >
> > The problem of placing file data isn't a bin-packing problem; it's
> > not like allocating RAM (where each allocation must be contiguous).
> > The items being placed may be split as much as you like, although
> > minimising the amount of splitting is a goal.
> >
> > I suspect that the performance problems that Martin is seeing may
> > indeed be related to free space fragmentation, in that finding and
> > creating all of those tiny extents for a huge file is causing
> > problems. I believe that btrfs isn't alone in this, but it may well be
> > showing the problem to a far greater degree than other FSes. I don't
> > have figures to compare, I'm afraid.
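The two greedy strategies Hugo describes can be sketched as follows. This is a
minimal illustration, assuming free space is given as a plain list of
(start, length) pairs; the function name and the data layout are hypothetical
and do not reflect the actual btrfs allocator or its free space cache.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start offset, length); hypothetical representation


def place_file(free_extents: List[Extent], size: int,
               largest_first: bool = False) -> List[Extent]:
    """Greedily split `size` bytes across the given free extents.

    Returns the (start, length) pieces used. Raises ValueError if the
    free extents cannot hold `size` bytes in total.
    """
    if largest_first:
        # "More complex" variant: try the biggest holes first.
        order = sorted(free_extents, key=lambda e: e[1], reverse=True)
    else:
        # "Simple" variant: walk the holes in address order.
        order = sorted(free_extents)

    pieces: List[Extent] = []
    remaining = size
    for start, length in order:
        if remaining == 0:
            break
        if length == 0:
            continue  # skip empty holes
        take = min(length, remaining)
        pieces.append((start, take))
        remaining -= take

    if remaining:
        raise ValueError("not enough free space")
    return pieces


holes = [(0, 4), (10, 2), (20, 8)]
print(place_file(holes, 9))                      # address order
print(place_file(holes, 9, largest_first=True))  # biggest holes first
```

The sort dominates the cost at O(n log n) and the fill loop is linear. Because
the file may be split freely, no search over combinations is needed. Filling
the largest holes first tends to produce fewer pieces.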
>
> That's what I wanted to hint at.
>
> I suspect an issue with free space fragmentation, and this is what I think I see:
>
> btrfs balance minimizes free space fragmentation within chunks.
>
> And that is my whole case on why I think it does help with my /home
> filesystem.
>
> So while btrfs filesystem defragment may help with defragmenting individual
> files, possibly at the cost of fragmenting free space, at least when the
> filesystem is almost full, I think to help with free space fragmentation there
> are only three options at the moment:
>
> 1) reformat and restore via rsync or btrfs send from backup (i.e. file based)
>
> 2) make the BTRFS filesystem itself bigger
>
> 3) btrfs balance at least some chunks, at least those that are not more
> than 70% or 80% full.
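Option 3 amounts to picking chunks by how full they are. As a rough
illustration only, with invented chunk figures and a hypothetical helper name,
the selection could look like the sketch below. It treats the threshold as
"at or below", which is an assumption and may differ from the exact semantics
of the balance usage filter.

```python
GIB = 1024 ** 3


def chunks_to_balance(chunks, max_usage_percent):
    """Return ids of chunks whose usage is at or below the threshold.

    `chunks` is an iterable of (chunk id, bytes used, chunk size) tuples.
    """
    selected = []
    for chunk_id, used, size in chunks:
        # Integer comparison avoids floating-point rounding at the boundary.
        if size > 0 and used * 100 <= max_usage_percent * size:
            selected.append(chunk_id)
    return selected


# Invented example data: four 1 GiB chunks at 100%, 50%, 75% and 0% usage.
example = [
    (1, GIB, GIB),
    (2, GIB // 2, GIB),
    (3, GIB * 3 // 4, GIB),
    (4, 0, GIB),
]

print(chunks_to_balance(example, 70))
print(chunks_to_balance(example, 80))
```

Rewriting only the emptier chunks keeps the amount of data moved small while
still consolidating scattered free space.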
>
> Do you know of any other ways to deal with it?
Yes.
4) Delete some stuff from it or move it over to a different filesystem.
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7