From: Martin Steigerwald <Martin@lichtvoll.de>
To: Hugo Mills <hugo@carfax.org.uk>
Cc: Robert White <rwhite@pobox.com>, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS free space handling still needs more work: Hangs again
Date: Sat, 27 Dec 2014 18:59:31 +0100
Message-ID: <6579515.E2MeSbCGLA@merkaba>
In-Reply-To: <9346949.uCfVN6IAc7@merkaba>

On Saturday, 27 December 2014, 18:11:21, Martin Steigerwald wrote:
> On Saturday, 27 December 2014, 16:26:42, Hugo Mills wrote:
> > On Sat, Dec 27, 2014 at 06:54:33AM -0800, Robert White wrote:
> > > On 12/27/2014 05:55 AM, Martin Steigerwald wrote:
> > [snip]
> > > > >while fio was just *laying* out the 4 GiB file. Yes, that's 100% system CPU
> > > > >for 10 seconds while allocating a 4 GiB file on a filesystem like:
> > > >
> > > >martin@merkaba:~> LANG=C df -hT /home
> > > >Filesystem Type Size Used Avail Use% Mounted on
> > > >/dev/mapper/msata-home btrfs 170G 156G 17G 91% /home
> > > >
> > > >where a 4 GiB file should easily fit, no? (And this output is with the 4
> > > >GiB file. So it was even 4 GiB more free before.)
> > >
> > > > No. /usr/bin/df is an _approximation_ in BTRFS because of the limits
> > > > of the statfs() function call. The statfs() function call was defined
> > > > in 1990 and "can't understand" the dynamic allocation model used in
> > > BTRFS as it assumes fixed geometry for filesystems. You do _not_
> > > have 17G actually available. You need to rely on btrfs fi df and
> > > btrfs fi show to figure out how much space you _really_ have.
> > >
> > > According to this block you have a RAID1 of ~ 160GB expanse (two 160G disks)
> > >
> > > > merkaba:~> date; btrfs fi sh /home ; btrfs fi df /home
> > > > Sa 27. Dez 13:26:39 CET 2014
> > > > Label: 'home' uuid: [some UUID]
> > > > Total devices 2 FS bytes used 152.83GiB
> > > > > devid 1 size 160.00GiB used 160.00GiB path /dev/mapper/msata-home
> > > > > devid 2 size 160.00GiB used 160.00GiB path /dev/mapper/sata-home
> > >
> > > > And according to this block you have about 5.39GiB of data space:
> > >
> > > > Btrfs v3.17
> > > > Data, RAID1: total=154.97GiB, used=149.58GiB
> > > > System, RAID1: total=32.00MiB, used=48.00KiB
> > > > Metadata, RAID1: total=5.00GiB, used=3.26GiB
> > > > GlobalReserve, single: total=512.00MiB, used=0.00B
> > >
> > > 154.97
> > > 5.00
> > > 0.032
> > > + 0.512
> > >
> > > > Pretty much as close to 160GiB as you are going to get (those
> > > > numbers being rounded up in places for "human readability"). BTRFS
> > > > has allocated 100% of the raw storage into typed extents.
> > >
> > > A large datafile can only fit in the 154.97-149.58 = 5.39
> >
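The arithmetic in the quoted block can be checked with a few lines of Python. This is only a sketch: the figures are copied from the `btrfs fi sh` and `btrfs fi df` output quoted above, the sum simply follows the message in counting all four block-group totals, and the variable names are made up for illustration.

```python
# Totals in GiB, copied from the quoted `btrfs fi df` output.
# Summing all four entries follows the quoted message; it is not a
# statement about how btrfs accounts for GlobalReserve internally.
allocated = {
    "Data": 154.97,
    "Metadata": 5.00,
    "System": 0.032,
    "GlobalReserve": 0.512,
}
data_used = 149.58    # "used=" value of the Data line
device_size = 160.00  # per-device size from `btrfs fi sh`

total_allocated = sum(allocated.values())
# Raw space not yet handed out to any chunk (clamped at zero because
# the quoted figures are rounded).
unallocated = max(0.0, device_size - total_allocated)
# Room left inside the already-allocated data chunks.
data_headroom = allocated["Data"] - data_used

print(f"allocated to chunks:     {total_allocated:.3f} GiB")
print(f"unallocated raw space:   {unallocated:.2f} GiB")
print(f"free inside data chunks: {data_headroom:.2f} GiB")
```

With no unallocated raw space left, new data can only go into the headroom inside existing data chunks, which is the point the quoted text is making.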
> > I appreciate that this is something of a minor point in the grand
> > scheme of things, but I'm afraid I've lost the enthusiasm to engage
> > with the broader (somewhat rambling, possibly-at-cross-purposes)
> > conversation in this thread. However...
> >
> > > Trying to allocate that 4GiB file into that 5.39GiB of space becomes
> > > an NP-complete (e.g. "very hard") problem if it is very fragmented.
> >
> > This is... badly mistaken, at best. The problem of where to write a
> > file into a set of free extents is definitely *not* an NP-hard
> > problem. It's a P problem, with an O(n log n) solution, where n is the
> > number of free extents in the free space cache. The simple approach:
> > fill the first hole with as many bytes as you can, then move on to the
> > next hole. More complex: order the free extents by size first. Both of
> > these are O(n log n) algorithms, given an efficient general-purpose
> > index of free space.
> >
> > The problem of placing file data isn't a bin-packing problem; it's
> > not like allocating RAM (where each allocation must be contiguous).
> > The items being placed may be split as much as you like, although
> > minimising the amount of splitting is a goal.
> >
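The two approaches Hugo describes can be sketched in Python as follows. This is an illustration only, not how btrfs actually allocates: the `Extent` tuple, the function names and the example numbers are all hypothetical, and a plain sorted list stands in for the free space cache.

```python
from typing import Iterable, List, Tuple

Extent = Tuple[int, int]  # (start offset, length), arbitrary units


def _fill(ordered: Iterable[Extent], size: int) -> List[Extent]:
    """Place `size` units into the extents in the order given, splitting freely."""
    placed: List[Extent] = []
    remaining = size
    for start, length in ordered:
        if remaining == 0:
            break
        if length <= 0:
            continue
        take = min(length, remaining)
        placed.append((start, take))
        remaining -= take
    if remaining:
        raise ValueError("not enough free space")
    return placed


def place_first_hole(free: List[Extent], size: int) -> List[Extent]:
    """Simple approach: fill holes in address order."""
    return _fill(sorted(free), size)


def place_largest_first(free: List[Extent], size: int) -> List[Extent]:
    """More complex approach: order the free extents by size first."""
    return _fill(sorted(free, key=lambda e: e[1], reverse=True), size)


free_extents = [(0, 2), (10, 5), (30, 1), (40, 8)]
print(place_first_hole(free_extents, 9))
print(place_largest_first(free_extents, 9))
```

Both functions are dominated by the sort, so they run in O(n log n) in the number of free extents. On this made-up input the size-ordered variant needs fewer pieces, which matches the stated goal of minimising splitting.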
> > I suspect that the performance problems that Martin is seeing may
> > indeed be related to free space fragmentation, in that finding and
> > creating all of those tiny extents for a huge file is causing
> > problems. I believe that btrfs isn't alone in this, but it may well be
> > showing the problem to a far greater degree than other FSes. I don't
> > have figures to compare, I'm afraid.
>
> That's what I wanted to hint at.
>
> I suspect an issue with free space fragmentation, and this is what I think I see:
>
> btrfs balance minimizes free space fragmentation within chunks.
>
> And that is my whole case on why I think it does help with my /home
> filesystem.
>
> So while btrfs filesystem defragment may help with defragmenting individual
> files, possibly at the cost of fragmenting free space, at least when the
> filesystem is almost full, I think to help with free space fragmentation there
> are only three options at the moment:
>
> 1) reformat and restore via rsync or btrfs send from backup (i.e. file based)
>
> 2) make the BTRFS filesystem itself bigger
>
> 3) btrfs balance at least some chunks, at least those that are not more than
> 70% or 80% full.
>
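Option 3 corresponds to a filtered balance. As a rough sketch, the helper below (a hypothetical name, not part of any btrfs tooling) only builds the command line for `btrfs balance start` with the `-dusage=` filter, which restricts the balance to data chunks at or below the given usage percentage. The 70% threshold and the /home mount point are taken from this thread. Running the command is left to the reader, since a balance needs root and can take a long time.

```python
from typing import List


def balance_command(mountpoint: str, max_usage_percent: int) -> List[str]:
    """Build the argv for a data-chunk balance limited by chunk usage.

    Only constructs the command; it does not execute anything.
    """
    if not 0 <= max_usage_percent <= 100:
        raise ValueError("usage filter must be a percentage between 0 and 100")
    return [
        "btrfs", "balance", "start",
        f"-dusage={max_usage_percent}",  # only data chunks up to this fill level
        mountpoint,
    ]


cmd = balance_command("/home", 70)
print(" ".join(cmd))
```

Starting with a low percentage and raising it step by step keeps each run short, because nearly empty chunks are the cheapest to relocate.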
> Do you know of any other ways to deal with it?

Yes.

4) Delete some stuff from it or move it over to a different filesystem.

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7