From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: free space inode generation (0) did not match free space cache generation
Date: Sat, 22 Mar 2014 23:32:16 +0000 (UTC) [thread overview]
Message-ID: <pan$b7ff$285d59f6$2419371$b3946ebe@cox.net> (raw)
In-Reply-To: 532DFDAB.7000600@friedels.name
Hendrik Friedel posted on Sat, 22 Mar 2014 22:16:27 +0100 as excerpted:
> I read through the FAQ you mentioned, but I must admit, that I do not
> fully understand.
My experience is that it takes a bit of time to soak in. Between time,
previous Linux experience, and reading this list for a while, things do
make more sense now, but my understanding has definitely changed and
deepened over time.
> What I am wondering about is, what caused this problem to arise. The
> filesystem was hardly a week old, never mistreated (powered down without
> unmounting or so) and not even half full. So what caused the data chunks
> all being allocated?
I can't really say, but it's worth noting that btrfs allocates chunks
on demand, but doesn't (yet?) automatically deallocate them. To
deallocate, you balance. Btrfs can reuse space freed within a chunk for
the same purpose (data chunks for data, metadata chunks for metadata),
but it can't convert an allocated chunk from one type to the other
without a balance.
So the most obvious trigger: if you copy a bunch of stuff around so the
filesystem nears full, then delete a bunch of it, check your btrfs
filesystem df/show stats and see whether you need a balance. But like I
said, that's the obvious case.
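Concretely, the check-and-reclaim sequence might look like this (the
mountpoint and the usage threshold are placeholders of mine, not from
the original report):

```shell
# Compare allocation to actual use ("/mnt" is a placeholder mountpoint):
btrfs filesystem show /mnt   # per-device "used" = space allocated to chunks
btrfs filesystem df /mnt     # per-type: total = allocated, used = occupied

# If data "total" is near the device size while "used" is well below it,
# a filtered balance reclaims mostly-empty chunks.  -dusage=20 rewrites
# only data chunks under 20% full, far cheaper than a full balance:
btrfs balance start -dusage=20 /mnt
```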
> The only thing that I could think of is that I created hourly snapshots
> with snapper.
> In fact in order to be able to do the balance, I had to delete something
> -so I deleted the snapshots.
One possibility off the top of my head: Do you have noatime set in your
mount options? That's definitely recommended with snapshotting, since
otherwise, atime updates will be changes to the filesystem metadata since
the last snapshot, and thus will add to the difference between snapshots
that must be stored. If you're doing hourly snapshots and are accessing
much of the filesystem each hour, that'll add up!
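For reference, a sketch of how noatime is set; the UUID and mountpoint
here are placeholders:

```shell
# /etc/fstab entry (device and mountpoint are placeholders):
UUID=xxxxxxxx-xxxx  /mnt/data  btrfs  noatime  0  0

# Or apply it to an already-mounted filesystem without a reboot:
mount -o remount,noatime /mnt/data
```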
Additionally, I recommend snapshot thinning. Hourly snapshots are nice
but after some time, they just become noise. Will you really know or
care which specific hour it was if you're having to retrieve a snapshot
from a month ago?
So hourly snapshots, but after say a day, delete two out of three,
leaving three-hourly snapshots. After two days, delete another half,
leaving six-hourly snapshots (four a day). After a week, delete three of
the four, leaving daily snapshots. After a quarter (13 weeks), delete six
of seven (or four of five if it's weekdays only), leaving weekly snapshots.
After a year, delete 12 of the 13, leaving quarterly snapshots. ... Or
something like that. You get the idea. Obviously script it, just like
the snapshotting itself is scripted.
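As an illustration only, the keep/delete decision for that schedule can
be sketched in plain POSIX sh. Snapshot naming and the actual btrfs
subvolume delete calls are omitted, so treat this as a policy sketch,
not a working tool:

```shell
#!/bin/sh
# keep_interval AGE_HOURS: print the spacing (in hours) at which
# snapshots of that age should be kept, per the schedule above.
keep_interval() {
    age=$1
    if   [ "$age" -lt 24 ];   then echo 1      # first day: hourly
    elif [ "$age" -lt 48 ];   then echo 3      # second day: 3-hourly
    elif [ "$age" -lt 168 ];  then echo 6      # up to a week: 6-hourly
    elif [ "$age" -lt 2184 ]; then echo 24     # up to a quarter (13 wk): daily
    elif [ "$age" -lt 8760 ]; then echo 168    # up to a year: weekly
    else                           echo 2184   # older: quarterly
    fi
}

# should_keep EPOCH_HOUR AGE_HOURS: keep a snapshot taken at hour-of-epoch
# EPOCH_HOUR iff that hour is a multiple of the interval for its age class.
should_keep() {
    [ $(( $1 % $(keep_interval "$2") )) -eq 0 ] && echo keep || echo delete
}
```

If I recall correctly, snapper's own timeline-cleanup limits can be
configured to do similar thinning, so check there before rolling your own.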
That will solve another problem too. When btrfs gets into the thousands
of snapshots, as it will pretty fast with unthinned hourlies, certain
operations slow down dramatically. The problem was much worse at one
point, before snapshot-aware defrag was disabled for the time being; it
simply didn't scale, and people with thousands of snapshots were seeing
balances or defrags run for days with little visible progress. But few
people really /need/ thousands of snapshots. With a bit of reasonable
thinning down to one a quarter, you end up with 200-300 snapshots and
that's it.
Also, it may or may not apply to you, but internally rewritten (as
opposed to simply appended) files are bad news for COW-based filesystems
such as btrfs. The autodefrag mount option can help with this for
smaller files (up to a few hundred megabytes), but for larger (from say
half a gig) actively rewritten files such as databases, VM images, and
pre-allocated torrent downloads before they finish, setting the NOCOW
attribute (chattr +C: change in place instead of the normal
copy-on-write) is strongly recommended. But the catch is that the
attribute needs to be set while the file is still zero-size, before it
actually has any content. The easiest way to do that is to create a
dedicated directory for such files and to set the attribute on the
directory, after which it'll automatically be inherited by any newly
created files or subdirs in that directory.
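In commands, that trick looks like this (the directory path is just an
example of mine):

```shell
mkdir /mnt/data/vm-images          # hypothetical path
chattr +C /mnt/data/vm-images      # new files created inside inherit NOCOW
lsattr -d /mnt/data/vm-images      # should list the 'C' attribute

# Note: cp into the directory works (the destination file is created
# zero-size, so it inherits +C), but mv within the same filesystem is
# just a rename and keeps the file's old COW extents.
```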
But, there's a catch with snapshots. The first change to a block after a
snapshot forces a COW anyway, since the data has changed from that of the
snapshot. So for those making heavy use of snapshots, creating dedicated
subvolumes for these NOCOW directories is a good idea, since snapshots
are per subvolume and thus these dedicated subvolumes will be excluded
from the general snapshots (just don't snapshot the dedicated subvolumes).
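So for the snapshot-heavy case, the setup becomes (again with an example
path of my own choosing):

```shell
btrfs subvolume create /mnt/data/vm-images   # hypothetical path
chattr +C /mnt/data/vm-images

# A snapshot of /mnt/data now excludes vm-images automatically,
# since btrfs snapshots stop at subvolume boundaries.
```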
Of course that does limit the value of snapshots to some degree, but it's
worth keeping in mind that most filesystems don't even offer the snapshot
feature at all, so...
> Can you tell me where I can read about the causes for this problem?
The above wisdom is mostly from reading the list for a while. Like I
said, it takes a while to soak in, and my thinking on the subject has
changed somewhat over time. The fact that NOCOW wasn't NOCOW on the
first change after a snapshot was a rather big epiphany to me, but AFAIK,
that's not on the wiki or elsewhere yet. It makes sense if you think
about it, but someone specifically asked, and the devs confirmed it.
Before that I had no idea, and was left wondering at some of the behavior
being reported, even with NOCOW properly set. (That was back when the
broken snapshot-aware defrag was still in place, since it simply didn't
scale with snapshots and such files, and I couldn't figure out why NOCOW
wasn't avoiding the problem, until a dev confirmed that the first change
after a snapshot is COW anyway. Then it all dropped into place:
continuously rewritten VM images, even if set NOCOW, would still be
continuously fragmented if people were doing regular snapshots of them.)
> Besides this:
> You recommend monitoring the output of btrfs fi show and to do a
> balance, whenever unallocated space drops too low. I can monitor this
> and let monit send me a message once that happens. Still, I'd like to
> know how to make this less likely.
I haven't had a problem with it here, but then I haven't been doing much
snapshotting (and always manual when I do it), I don't run any VMs or
large databases, I mounted with the autodefrag option from the beginning,
and I've used noatime for nearly a decade now, as it was also
recommended for my previous filesystem, reiserfs.
But regardless of my experience with my own usage pattern, I suspect
that with reasonable monitoring you'll eventually become familiar with
how fast chunks are allocated, and perhaps with what sorts of actions
(beyond the obvious, actively moving stuff around on the filesystem)
trigger those allocations for your specific usage pattern, and can then
adapt as necessary.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman