Re: btrfs filesystem keeps allocating new chunks for no apparent reason

From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs filesystem keeps allocating new chunks for no apparent reason
Date: Tue, 11 Apr 2017 07:16:20 -0400
Message-ID: <8bbd6b5c-58c8-62d2-78de-76ce31ff0bc9@gmail.com>
In-Reply-To: <20170411095552.o5b4wysjqlbp57xa@angband.pl>

On 2017-04-11 05:55, Adam Borowski wrote:
> On Tue, Apr 11, 2017 at 06:01:19AM +0200, Kai Krakow wrote:
>> Yes, I know all this. But I don't see why you still want noatime or
>> relatime if you use lazytime, except for super-optimizing. Lazytime
>> gives you POSIX conformity for a problem that the other options only
>> tried to solve.
>
> (Besides lazytime also working on mtime, and, technically, ctime.)
Nope, by definition it can't work on ctime: a ctime update means
something else changed in the inode, which in turn causes the inode to
be flushed to disk normally.  Lazytime only defers the flush as long as
nothing else in the inode is different, so it won't help much on things
like traditional log files, whose size changes regularly (which updates
the inode, which then causes it to get flushed).
>
> First: atime, in any form, murders snapshots.  On any filesystem that has
> them, not just btrfs -- I've tested zfs and LVM snapshots, there's also
> qcow2/vdi and so on.  On all of them, every single read-everything operation
> costs you 5% disk space.  For a _read_ operation!
>
> I've tested /usr-y mix of files, for consistency with the guy who mentioned
> this problem first.  Your mileage will vary depending on whether you store
> 100GB disk images or a news spool.
>
> Read-everything is quite rare, but most systems have at least one
> stat-everything cronjob.  That touches only diratime, but that's still
> 1-in-11 inodes (remarkably consistent: I've checked a few machines with
> drastically different purposes, and somehow the min was 10, max 12).
>
> And no, marking snapshots as ro doesn't help: reading the live version still
> breaks CoW.
>
>
> Second: atime murders media with limited write endurance.  Modern SSD can
> cope well, but I for one work a lot with SD and eMMC.  Every single SoC
> image I've seen uses noatime for this reason.
Even on SSDs it's still an issue, especially with something like ext4,
which uses inode tables.  Updating one inode will usually require a RMW
of an erase block regardless, but using inode tables means that this
happens _all the time_.
>
>
> Third: relatime/lazytime don't eliminate the performance cost.  They fix
> only frequently read files -- if you have a big filesystem where you read a
> lot but individual files tend to be read rarely, relatime is as bad as
> strictatime, and lazytime actually worse.  Both will do an unnecessary write
> of all inodes.
>
>
> Four: why?  Beside being POSIXLY_CORRECT, what do you actually gain from
> atime?  I can think only of:
> * new mail notification with mbox.  Just patch the mail reader to manually
>   futimens(..., {UTIME_NOW,UTIME_OMIT}), it has no extra cost on !noatime
>   mounts.  I've personally did so for mutt, the updated version will ship
>   in Debian stretch; you can patch other mail readers although they tend
>   to be rarely used in conjunction with shell access (and thus they have
>   no need for atime at all).
> * Debian's popcon's "vote" field.  Use "inst", and there's no gain from
>   popcon for you personally.
> * some intrusion detection forensics (broken by open(..., O_NOATIME))
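
For reference, the futimens() suggestion quoted above can be sketched in
Python.  This is only an approximation: Python's os.utime() does not
expose UTIME_NOW or UTIME_OMIT, so the sketch passes the current time
for atime and writes the existing mtime back unchanged (which, unlike
UTIME_OMIT, is racy if something modifies the file in between).  The
helper name mark_read is hypothetical and is not taken from mutt or any
other mail reader.

```python
import os
import time


def mark_read(path):
    """Set atime to "now" explicitly and keep mtime as it was.

    Approximates futimens(fd, {UTIME_NOW, UTIME_OMIT}).  An explicit
    timestamp update like this is not suppressed by a noatime mount.
    """
    st = os.stat(path)
    # ns=(atime_ns, mtime_ns): new atime, old mtime written back as-is.
    os.utime(path, ns=(time.time_ns(), st.st_mtime_ns))
```

A mail reader would call something like this once after it has shown
the mailbox to the user, so "new mail" checks based on atime < mtime
keep working without relying on implicit atime updates.
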
On top of all that:
Five:
Handling of atime slows down stat and a handful of other things.  If you 
take a source tree the size of the Linux kernel, write a patch that 
changes every file (even just one character), and then go to commit it 
in Git (or SVN, or Bazaar, or Mercurial), you'll see a pretty serious 
difference in the time it takes to commit because almost all VCS 
software calls stat() on the entire tree.  relatime won't help much here
because the check to determine whether or not to update the atime still
has to happen.  In fact, it will hurt slightly, since strictatime
eliminates that check.
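
As an illustration of the check in question, here is a rough sketch of
the relatime rule as documented for Linux: atime is updated if it is not
newer than mtime or ctime, or if it is at least a day old.  The function
name and the use of plain seconds are illustrative assumptions; this is
not kernel code.

```python
DAY = 24 * 60 * 60  # seconds


def relatime_needs_update(atime, mtime, ctime, now):
    """Return True if an access at `now` would update atime under relatime."""
    # atime is stale relative to the last modification or inode change.
    if mtime >= atime or ctime >= atime:
        return True
    # Otherwise only refresh an atime that is at least a day old.
    return now - atime >= DAY
```

Under strictatime there is nothing to decide: every access updates the
atime.
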

Six:
It doesn't behave how most users would inherently expect, partly because 
there are ways to bypass it even if the FS is mounted with strictatime.
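
One such bypass is the O_NOATIME open flag on Linux, mentioned in the
quoted list above.  A minimal sketch, assuming a Linux system and a file
owned by the calling user (the kernel refuses the flag otherwise); the
helper name read_without_atime is hypothetical.

```python
import os

# O_NOATIME is Linux-specific; fall back to 0 (no effect) elsewhere.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def read_without_atime(path):
    """Read a whole file while asking the kernel not to touch its atime.

    Only the file's owner or a suitably privileged process may use the
    flag; for anyone else open() fails with EPERM.
    """
    fd = os.open(path, os.O_RDONLY | O_NOATIME)
    with os.fdopen(fd, "rb") as f:
        return f.read()
```

The read leaves the atime untouched regardless of whether the
filesystem is mounted with relatime or strictatime.
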
>
>
> Conclusion: death to atime!
>
