linux-btrfs.vger.kernel.org archive mirror
From: Ivan P <chrnosphered@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs-tools/linux 4.11: btrfs-cleaner misbehaving
Date: Sun, 28 May 2017 12:39:38 +0200	[thread overview]
Message-ID: <CADzmB21OO273fXqTsh-qAnRk_y8xeTXbCoyeRbDB9eO2S9ASug@mail.gmail.com> (raw)
In-Reply-To: <20170527215608.23a40176@ws>

On Sun, May 28, 2017 at 6:56 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> [This mail was also posted to gmane.comp.file-systems.btrfs.]
>
> Ivan P posted on Sat, 27 May 2017 22:54:31 +0200 as excerpted:
>
>>>>>> Please add me to CC when replying, as I am not
>>>>>> subscribed to the mailing list.
>
>> Hmm, remounting as you suggested has shut it up immediately - hurray!
>>
>> I don't really have any special write pattern from what I can tell.
>> About the only thing different from all the other btrfs systems I've
>> set up is that the data is also on the same volume as the system.
>> Normal usage, no VMs or heavy file generation. I'm also only taking
>> snapshots of the system and @home, with the latter only containing
>> my .config, .cache and symlinks to some folders in @data.
>
> Systemd?  Journald with journals on btrfs?  Regularly snapshotting that
> subvolume?
>
> If yes to all of the above, that might be the issue.  Normally systemd
> will set the journal directory NOCOW, so the journal files inherit it
> at creation, in order to avoid heavy fragmentation due to the COW-
> unfriendly database-style file-internal-rewrite pattern with the
> journal files.
>
> Great.  Except that snapshotting locks the existing version of the file
> in place with the snapshot, so the next write to any block must be COW
> anyway.  This is sometimes referred to as COW1, since it's a
> single-time COW, and the effect isn't too bad with a one-time
> snapshot.  But if you're regularly snapshotting the journal files, that
> will trigger COW1 on every snapshot, which if you're snapshotting often
> enough can be almost as bad as regular COW in terms of fragmentation.
>
> The fix is to make the journal dir a subvolume instead, thereby
> excluding it from the snapshot taken on the parent subvolume, and just
> don't snapshot the journal subvolume then, so the NOCOW that systemd
> should already set on that subdir and its contents will actually be
> NOCOW, without interference from snapshotting repeatedly forcing COW1.
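
For anyone wanting to try that conversion, a rough, untested sketch follows
(it assumes the default /var/log/journal location; with DRY_RUN=1 it only
prints the commands instead of running them, since the real thing needs root
on a btrfs filesystem):

```shell
setup_journal_subvol() {
    j=${1:-/var/log/journal}
    # DRY_RUN=1: print each command instead of executing it
    run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi; }
    run mv "$j" "$j.old"              # keep the existing journals aside
    run btrfs subvolume create "$j"   # subvolume: excluded from parent snapshots
    run chattr +C "$j"                # NOCOW on the dir, inherited by new files
    run cp -a "$j.old/." "$j/"        # copied journals are created NOCOW
    run rm -rf "$j.old"
}
```

Stopping journald (or rotating the journals) before the move would be wise;
the sketch leaves that out.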
>
>
> Of course an alternative fix, the one I use here (and am happy with)
> instead, is to have a normal syslog (I use syslog-ng, but others have
> reported using rsyslog) handling your saved logs in traditional text
> form (most modern syslogs should cooperate with systemd's journald),
> and configure journald to only use tmpfs (see the journald.conf
> manpage). Traditional text logs are append-only and not nearly as bad
> in COW terms.  Meanwhile, journald is still active, just writing to
> tmpfs only, so you get a journal for the current boot session and thus
> can still take advantage of all the usual systemd/journald features
> such as systemctl status spitting out the last 10 log entries for that
> service, etc.  It's just limited to the current boot session, and you
> use the normal text logs for anything older than that.  For me anyway
> that's the best of both worlds, and I don't have to worry about how the
> journal files behave on btrfs at all, because they're not written to
> btrfs at all. =:^)
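
For reference, the tmpfs-only journald setup described above comes down to a
couple of lines in journald.conf (option names are from the journald.conf
manpage; the size cap is just an illustrative value):

```ini
# /etc/systemd/journald.conf -- keep the journal in tmpfs (/run) only;
# persistent logs then come from the classic syslog daemon instead.
[Journal]
Storage=volatile
RuntimeMaxUse=64M
```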
>
>
> Meanwhile, since you mentioned snapshots, a word of caution there.  If
> you do have scripted snapshots being taken, be sure you have a script
> thinning down your snapshot history as well.  More than 200-300
> snapshots per subvolume scales very poorly in btrfs maintenance terms
> (and qgroups make the problem far worse, if you have them active at
> all).  But if, for instance, you're taking snapshots every hour and you
> need something from one say a month old, are you really going to
> remember or care which exact hour it was, or will the daily either
> before or after that hour be fine, and actually much easier to find if
> you've trimmed to daily by then, as opposed to having hundreds and
> hundreds of hourly snapshots accumulating?
>
> So snapshots are great but they don't come without cost, and if you
> keep under 200 and if possible under 100 per subvolume, you'll find
> maintenance such as balance and check (fsck) go much faster than they
> do with even 500, let alone thousands.
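
A rough sketch of the kind of thinning script suggested above, assuming
date-stamped snapshots collected in a single directory so their names sort
chronologically (DRY_RUN=1 only prints what would be deleted):

```shell
thin_snapshots() {
    dir=$1 keep=$2
    # ls sorts by name, oldest first; drop everything but the newest $keep
    ls -1 "$dir" | head -n "-$keep" | while read -r snap; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "would delete $dir/$snap"
        else
            btrfs subvolume delete "$dir/$snap"
        fi
    done
}
```

Something like `thin_snapshots /mnt/btrfs/snapshots 100` from a cron job
would keep the count in the range Duncan recommends.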
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman

I haven't had any issues like this before on another two boxes which ran
for years with systemd and journald, so I'm rather surprised this is a problem.

It does make sense for journald to fragment the disk, but isn't that what
autodefrag is for? The weird thing is that btrfs-cleaner never seems to
finish the work it is doing, which would mean the work piles up constantly
without ever getting done...

At the moment I am at 9 system snapshots and 5 @home snapshots, which
IMHO btrfs should be able to handle. The other boxes have about the same
number of snapshots and one of them is running 24/7 as a home server.

The snapshots are not automated, I take a snapshot of @arch_current and
@home using a script before updating my system. So the snapshot interval
is very irregular. I also try to clean up old snapshots, leaving only about
the three newest system snapshots on disk, though I haven't done that recently.
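
The script is essentially just a timestamped read-only snapshot of each
subvolume, something like the following (the mount point and snapshot
directory here are illustrative, not my exact layout; setting
BTRFS="echo btrfs" gives a dry run):

```shell
snap_before_update() {
    BTRFS=${BTRFS:-btrfs}
    ts=$(date +%Y-%m-%d_%H%M%S)
    for sv in @arch_current @home; do
        # -r makes the snapshot read-only, so it can't drift from the original
        $BTRFS subvolume snapshot -r "/mnt/btrfs/$sv" \
            "/mnt/btrfs/snapshots/${sv}-${ts}"
    done
}
```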

Oh, and I'm not using any qgroups, not that I know of, at least.

Regards,
Ivan.

Thread overview: 9+ messages
     [not found] <20170527215608.23a40176@ws>
2017-05-28 10:39 ` Ivan P [this message]
2017-05-27 18:53 btrfs-tools/linux 4.11: btrfs-cleaner misbehaving Ivan P
2017-05-27 19:33 ` Hans van Kranenburg
2017-05-27 20:29   ` Ivan P
2017-05-27 20:42     ` Hans van Kranenburg
2017-05-27 20:54       ` Ivan P
2017-05-28  4:55         ` Duncan
2017-05-28  7:13           ` Marat Khalili
2017-05-27 19:36 ` Jean-Denis Girard
