From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs is using 25% more disk than it should
Date: Sat, 20 Dec 2014 01:33:10 +0000 (UTC)
Message-ID: <pan$5ee96$62de2cb1$5eb9b857$32dd0ff2@cox.net>
In-Reply-To: CAN6BF2JoMki_KmpmVYVM-_ECqCg_w-qo9_6P=MiZbabMQyVN_g@mail.gmail.com
Daniele Testa posted on Sat, 20 Dec 2014 03:59:42 +0800 as excerpted:
> The file has both checksums and datacow on it. I will do "chattr +C"
> on the parent dir and re-create the file to make sure all files are
> marked as "nodatacow".
>
> Should I also turn off checksums with the mount-flags if this filesystem
> only contain big VM-files? Or is it not needed if I put +C on the parent
> dir?
FWIW...
Turning off datacow, whether by chattr +C on the parent dir before
creating the file, or via mount option, turns off checksumming as well.
(For completeness, it also turns off compression, but I don't think that
applies in your case.)
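
To make the chattr route concrete, here's a rough Python sketch of the order of operations. The directory and image names are made-up placeholders, and the helper only builds and prints the commands rather than running them. The point is that +C goes on the directory first, so the file is nocow from the moment it's created; chattr(1) says +C on a file that already has data blocks isn't dependable.

```python
import shlex
from pathlib import PurePosixPath


def nocow_setup_commands(directory, image_name):
    """Return the commands (as argv lists) to create a nocow image file.

    Nothing is executed here; the caller decides whether to run them.
    """
    d = PurePosixPath(directory)
    image = d / image_name
    return [
        ["mkdir", "-p", str(d)],
        ["chattr", "+C", str(d)],  # files created in here afterwards inherit nocow
        ["touch", str(image)],     # create the file only after +C is set
        ["lsattr", str(image)],    # a 'C' in the flags column confirms nocow
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for argv in nocow_setup_commands("/var/lib/vm-images", "guest.img"):
        print(shlex.join(argv))
```

An existing image would then be copied into the new file (not reflinked or moved within the same filesystem) so the data actually gets rewritten as nocow.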
In general, active VM images (and database files) with default flags tend
to get very highly fragmented very fast, due to btrfs' default COW on a
file with a heavy "internal rewrite" pattern (as opposed to append-only
or full rename/replace on rewrite). For relatively small files with this
rewrite pattern, think typical desktop firefox sqlite database files of a
quarter GiB or less, the btrfs autodefrag mount option can be helpful.
But because autodefrag triggers a rewrite of the entire file, its
viability goes down as filesize goes up. Somewhere around half a GiB it
doesn't work so well any more, particularly on very active files where
the incoming rewrite stream may be faster than btrfs can rewrite the
entire file.
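
Those size figures are rules of thumb, not hard limits, but they can be written down as a small helper. This is only a sketch: the function name, the return labels, and the exact cutoffs (a quarter GiB and half a GiB) are my own encoding of the rough numbers above, not anything btrfs itself enforces.

```python
GIB = 1024 ** 3

# Rough thresholds taken from the rule of thumb above; both are assumptions.
AUTODEFRAG_COMFORTABLE = GIB // 4   # autodefrag tends to cope up to about here
NOCOW_THRESHOLD = GIB // 2          # beyond this, nocow is the usual suggestion


def suggest_strategy(size_bytes, internal_rewrite):
    """Suggest a handling strategy for a file on btrfs (rule of thumb only)."""
    if not internal_rewrite:
        return "default"      # append-only or replace-on-write files are fine with COW
    if size_bytes <= AUTODEFRAG_COMFORTABLE:
        return "autodefrag"   # small and rewrite-heavy, e.g. browser sqlite files
    if size_bytes <= NOCOW_THRESHOLD:
        return "borderline"   # autodefrag may or may not keep up
    return "nocow"            # large VM images, big databases


if __name__ == "__main__":
    print(suggest_strategy(100 * 1024 ** 2, True))  # small sqlite file
    print(suggest_strategy(40 * GIB, True))         # VM image
```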
Making heavy-internal-rewrite pattern files of over say half a GiB in
size nocow is one suggested solution. However, that's where snapshots
come into the picture: a snapshot locks the existing version in place,
forcing a one-time COW of each block the first time it's written after
the snapshot. If people are doing frequent automated snapshots (say once
an hour), this can be a big problem, as the file ends up fragmenting
pretty badly from these one-time COW writes as well.
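
A toy model shows why frequent snapshots undo much of the nocow benefit. This is a deliberate simplification, not how btrfs tracks extents internally: the file is a flat list of blocks, a snapshot pins every block, and the first write to a pinned block counts as one forced relocation. All names and numbers are illustrative.

```python
import random


def count_cow_relocations(num_blocks, writes_per_interval, intervals,
                          snapshot, seed=0):
    """Count forced COW relocations on a nocow file in a toy block model."""
    rng = random.Random(seed)        # seeded so runs are repeatable
    pinned = [False] * num_blocks    # True = block is held by a snapshot
    relocations = 0
    for _ in range(intervals):
        if snapshot:
            pinned = [True] * num_blocks   # a snapshot pins every current block
        for _ in range(writes_per_interval):
            block = rng.randrange(num_blocks)
            if pinned[block]:
                relocations += 1       # first write after the snapshot must COW
                pinned[block] = False  # the new copy is private; later writes go in place
    return relocations


if __name__ == "__main__":
    # 24 intervals, e.g. a day of activity with or without hourly snapshots.
    print("no snapshots:    ", count_cow_relocations(1000, 500, 24, snapshot=False))
    print("hourly snapshots:", count_cow_relocations(1000, 500, 24, snapshot=True))
```

Without snapshots the count stays at zero, since every write lands in place. With a snapshot before each interval, every block touched in that interval gets relocated once, so the total grows with the snapshot frequency.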
There are ways to work around the problem (put the files in question on a
subvolume and don't snapshot it as often as the parent, set up a cron job
to do say weekly defrag on the files in question, etc), but since you
don't have snapshots going anyway, that's not a concern for you except as
a preventative -- consider it if you /do/ start doing snapshots.
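
For the cron-job workaround, something like the following could generate the crontab entry. It's a sketch under assumptions: the image path is a placeholder, and the schedule (03:00 on Sundays) is an arbitrary choice. The command itself, btrfs filesystem defragment, is the ordinary btrfs-progs one.

```python
import shlex


def weekly_defrag_cron_line(path, hour=3, weekday=0):
    """Build a crontab line that defragments one file once a week.

    weekday follows cron convention: 0 (or 7) is Sunday.
    """
    if not (0 <= hour <= 23 and 0 <= weekday <= 7):
        raise ValueError("hour must be 0-23 and weekday 0-7")
    # Fields: minute hour day-of-month month day-of-week command
    return f"0 {hour} * * {weekday} btrfs filesystem defragment {shlex.quote(path)}"


if __name__ == "__main__":
    # Hypothetical image path, for illustration only.
    print(weekly_defrag_cron_line("/var/lib/vm-images/guest.img"))
```

The subvolume half of the workaround relies on the fact that a snapshot of the parent doesn't descend into nested subvolumes, so images kept in their own subvolume only get snapshotted when you snapshot that subvolume explicitly.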
So anyway, as I said, creating the file nocow (whether by mount option or
chattr) will turn off checksumming too. But on something that's
frequently internally rewritten, where corruption will very likely
corrupt the VM anyway and there are already mechanisms in place to deal
with that (VM integrity mechanisms, backups, or simply disposable VMs
where you fire up a new one when necessary), turning off checksumming
isn't necessarily as bad as it may seem, at least with btrfs single-mode
data, where there's no second copy to restore from if the checksum
/does/ fail.
And it /should/ save you some on the metadata... tho I'd not consider
that savings worth turning off checksumming if that were the /only/
reason, on its own. The metadata difference is more a nice side-effect
of an already commonly recommended practice for large VM image files,
than something you'd turn off checksumming for in the first place.
Certainly, on most files I'd prefer the checksums, and in fact am running
btrfs raid1 mode here specifically to get the benefit of having a second
copy to retrieve from if the first attempted copy fails checksum. But VM
images and database files are a bit of an exception.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman