From: Gian-Carlo Pascutto <gcp@sjeng.org>
To: linux-btrfs@vger.kernel.org
Subject: Big disk space usage difference, even after defrag, on identical data
Date: Sat, 11 Apr 2015 21:59:50 +0200
Message-ID: <55297D36.8090808@sjeng.org>

Linux mozwell 3.19.0-trunk-amd64 #1 SMP Debian 3.19.1-1~exp1
(2015-03-08) x86_64 GNU/Linux
btrfs-progs v3.19.1

I have a btrfs volume that's been in use for a week or two. It holds
~560G of incompressible data (video files, tar.xz, git repos, ...) and
~200G of data that compresses 2:1 with LZO (a PostgreSQL db).

It's split into 2 subvolumes:
ID 257 gen 6550 top level 5 path @db
ID 258 gen 6590 top level 5 path @large

and mounted like this:
/dev/sdc /srv/db btrfs rw,noatime,compress=lzo,space_cache 0 0
/dev/sdc /srv/large btrfs rw,noatime,compress=lzo,space_cache 0 0

du -skh /srv
768G    /srv

df -h
/dev/sdc        1.4T  754G  641G  55% /srv/db
/dev/sdc        1.4T  754G  641G  55% /srv/large

btrfs fi df /srv/large
Data, single: total=808.01GiB, used=749.36GiB
System, DUP: total=8.00MiB, used=112.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=3.50GiB, used=1.87GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

So data usage is higher than I expected: ~750G instead of ~660G
(560G + 200G/2) plus metadata. I thought it might be related to compress
bailing out too easily, so I ran
btrfs fi defragment -r -v -clzo /srv/db /srv/large
but this didn't change anything.
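
For reference, the ~660G expectation can be reproduced with a quick
back-of-the-envelope calculation. This is only a sketch: it treats the
rough G figures above as GiB and assumes the 2:1 LZO ratio applies
uniformly to the database data. The variable names are illustrative.

```python
# Rough estimate of expected data usage, using the figures quoted above.
# Assumptions: sizes are treated as GiB, and LZO achieves a uniform 2:1 ratio.
incompressible_gib = 560.0   # video files, tar.xz, git repos, ...
compressible_gib = 200.0     # PostgreSQL db, before compression
lzo_ratio = 2.0              # assumed compression ratio

expected_gib = incompressible_gib + compressible_gib / lzo_ratio
used_gib = 749.36            # "Data, single: used" from btrfs fi df

excess_gib = used_gib - expected_gib
print(f"expected ~{expected_gib:.0f} GiB, used {used_gib:.2f} GiB, "
      f"excess ~{excess_gib:.0f} GiB")
```

Under those assumptions the first filesystem holds roughly 89 GiB more
data than the estimate, before counting metadata.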

I recently copied this data to a new, bigger disk, and the result looks
worrying:

mount options:
/dev/sdd /mnt/large btrfs rw,noatime,compress=lzo,space_cache 0 0
/dev/sdd /mnt/db btrfs rw,noatime,compress=lzo,space_cache 0 0

btrfs fi df
Data, single: total=684.00GiB, used=683.00GiB
System, DUP: total=8.00MiB, used=96.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=3.50GiB, used=2.04GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

df
/dev/sdd        3.7T  688G  3.0T  19% /mnt/large
/dev/sdd        3.7T  688G  3.0T  19% /mnt/db

du
767G    /mnt

That's a 66G difference for the same data with the same compress option.
The used size here is much more in line with what I'd have expected
given the nature of the data.
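
The 66G figure is the difference between the two "Data ... used" values.
A minimal sketch of that comparison is below. The helper name
data_used_gib is made up for illustration, and the regular expression
assumes the Data line is reported in GiB, as it is in the output above.

```python
import re

def data_used_gib(fi_df_output: str) -> float:
    """Return the 'used' value of the Data line from `btrfs fi df` output.

    Assumes the value is reported in GiB, as in the output quoted above.
    """
    match = re.search(
        r"^Data, \w+: total=[\d.]+GiB, used=([\d.]+)GiB",
        fi_df_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no Data line with GiB units found")
    return float(match.group(1))

# Relevant lines copied from the two `btrfs fi df` outputs above.
old_fs = """Data, single: total=808.01GiB, used=749.36GiB
Metadata, DUP: total=3.50GiB, used=1.87GiB"""
new_fs = """Data, single: total=684.00GiB, used=683.00GiB
Metadata, DUP: total=3.50GiB, used=2.04GiB"""

difference = data_used_gib(old_fs) - data_used_gib(new_fs)
print(f"{difference:.2f} GiB")
```

This compares only the used data bytes. Allocated-but-unused space
(total minus used) is a separate figure and is not part of the 66G.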

I would think that compression differences or things like fragmentation
or bookending for modified files shouldn't affect this, because the
first filesystem has been defragmented/recompressed and didn't shrink.

So what can explain this? Where did the 66G go?

-- 
GCP
