From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Amount of scrubbed data goes from 15.90GiB to 26.66GiB after defragment -r -v -clzo on a fs always mounted with compress=lzo
Date: Fri, 13 May 2016 06:11:27 +0000 (UTC) [thread overview]
Message-ID: <pan$106d2$20dff0d0$cff8ba20$86e58020@cox.net> (raw)
In-Reply-To: <60018ed4-023e-4ca6-8f74-e19b8849de65@linuxsystems.it>
Niccolò Belli posted on Thu, 12 May 2016 15:56:20 +0200 as excerpted:
> Thanks for the detailed explanation, hopefully in the future someone
> will be able to make defrag snapshot/reflink aware in a scalable manner.
It's still planned, AFAIK, but one of the scaling issues in particular,
quotas, has turned out to be particularly challenging even to get
working correctly at all.  The quota code has been rewritten twice
already (so the current code is the third attempted solution), and it's
still broken in certain corner-cases ATM.  In fact, while the devs are
still trying to get that third attempt working in the tough
corner-cases, they're already talking about an eventual third rewrite
(a fourth attempt, with three scrapped) once those corner-cases do
work, so that the next design can take both the tough corner-cases and
performance into account from the beginning.
So in practice, a truly scalable snapshot-aware defrag is likely years
out, as it's going to need working, scalable quota code first, and even
then, quotas are only one part of the full scalable snapshot/
reflink-aware defrag solution.
The good news is that while there's still work to be done, progress
has been healthy in other areas, so once the quota code both works and
scales, the other pieces should hopefully fall into place relatively
fast, as they've already been maturing separately on their own.
> I will not use defrag anymore, but what do you suggest I do to
> reclaim the lost space? Get rid of my current snapshots or maybe
> simply run bedup?
Neither snapshots nor dedup is among my direct use-cases, so my
practical knowledge there is limited, but removing the snapshots should
indeed clear the space, since in doing so you remove all the references
locking the old extents in place.  (You'll likely have to remove *all*
the snapshots covering a given subvolume in order to actually free the
space.)  If you already have them backed up elsewhere (using
send/receive, for instance) or don't actually need them, that's a
viable approach.
In theory, the various btrfs dedup solutions out there should work as
well, while letting you keep the snapshots (at least to the extent that
they're writable snapshots and thus can be reflink-modified, or that
there's a single read-only snapshot the others, including the freshly
defragged working copy, can be reflinked against).  Finding identical
block sequences and reflinking them, so there's only one actual copy on
the filesystem with everything else pointing at it, is exactly how
those tools operate, so in effect they should undo the reflink-breaking
you did with the defrag.  *But*, without any personal experience with
them, I have no idea either how effective they are in practice in a
situation like this, or how convoluted the command lines get in order
to actually accomplish your goal.  Best-case, it's a single fast
command that not only undoes the defrag's reflink breakage but finds
enough other duplication in the dataset to reduce usage even further
than before; worst-case, it's multiple complex commands that take a
week or longer to run and don't actually help much.
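
For what it's worth, a run with one of those tools might look
something like the following sketch, using duperemove as the example
(the path and hashfile name are placeholders, and I haven't tested how
well it copes with a snapshot-heavy layout like yours):

  # scan recursively and submit duplicate extents to the kernel's
  # dedup ioctl; --hashfile keeps the checksum database on disk so
  # repeated runs don't have to re-hash everything from scratch
  duperemove -dr --hashfile=/var/tmp/dedup.hash /mnt/data

  # bedup, which you mentioned, instead works at the whole-filesystem
  # level; see its own documentation for the exact invocation

Either way, I'd expect the initial scan to dominate the runtime on a
dataset of any size.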
So in practice you have a choice between EITHER deleting all the
snapshots, and with them everything locking down the old extents,
leaving you only the new, fresh copy (which by itself should be smaller
than before) at the cost of losing the snapshots, OR the, to me at
least, relative unknown of the various btrfs dedup solutions, which in
theory should work well, but in practice... I simply don't know.
AND of course you have the option of doing basically nothing and
leaving things as they are.  Given the context of this thread, though,
that doesn't seem to be a viable longer-term option for you: you were
trying to clear space, not use MORE of it, and presumably you actually
need that space for something else, which precludes simply letting
things be, unless of course you can afford to buy your way out of the
problem with more storage devices.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman