From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: <bo.li.liu@oracle.com>, Marc MERLIN <marc@merlins.org>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: Where is the disk space?
Date: Mon, 16 Nov 2015 09:47:21 +0800
Message-ID: <564935A9.1000202@cn.fujitsu.com>
In-Reply-To: <20151115063539.GC16363@localhost.localdomain>
Liu Bo wrote on 2015/11/14 22:35 -0800:
> Hi,
>
> On Fri, Nov 13, 2015 at 09:41:01AM -0800, Marc MERLIN wrote:
>> root@polgara:/mnt/btrfs_root# du -sh *
>> 28G @
>> 28G @_hourly.20151113_08:04:01
>> 4.0K @_last
>> 4.0K @_last_rw
>> 28G @_rw.20151113_00:02:01
>> root@polgara:/mnt/btrfs_root# df -h .
>> Filesystem Size Used Avail Use% Mounted on
>> /dev/sdb5 56G 40G 5.4G 89% /mnt/btrfs_root
>>
>> root@polgara:/mnt/btrfs_root# btrfs fi df .
>> Data, single: total=39.85GiB, used=38.52GiB
>> System, DUP: total=8.00MiB, used=16.00KiB
>> System, single: total=4.00MiB, used=0.00B
>> Metadata, DUP: total=6.00GiB, used=579.17MiB
>> Metadata, single: total=8.00MiB, used=0.00B
>> GlobalReserve, single: total=208.00MiB, used=0.00B
>>
>> root@polgara:/mnt/btrfs_root# btrfs fi show .
>> Label: 'btrfs_root' uuid: a2a1ed7b-6bfe-4e83-bc10-727126ed17bf
>> Total devices 1 FS bytes used 39.09GiB
>> devid 1 size 55.88GiB used 51.88GiB path /dev/sdb5
>>
>> btrfs-progs v4.0-dirty
>> root@polgara:/mnt/btrfs_root#
>>
>> root@polgara:/mnt/btrfs_root# btrfs balance start -dusage=80 -v /mnt/btrfs_root
>> Dumping filters: flags 0x1, state 0x0, force is off
>> DATA (flags 0x2): balancing, usage=80
>> Done, had to relocate 1 out of 55 chunks
>>
>> Sadly, it's only running 3.17.8 because of complicated reasons, but still,
>>
>> 1) I have 28GB used (modulo a few files between the btrfs send snapshots and
>> current status)
>>
>> 2) fi show shows I'm using 39GB, not sure where the extra 11GB came from
>>
>> 3) fi df agrees with fi show
>>
>> 4) regular df agrees on used too, but shows 5GB free instead of 15GB despite
>> the filesystem being balanced.
>>
>> I did have a bunch of snapshots that I did delete a while ago now, but it
>> looks like their blocks aren't being reclaimed.
>>
>> Any ideas?
>>
>
> Since you said you have some snapshots in between...I can think of one
> case to prove where the space goes,
>
> Say you have a file with size=10M on a freshly created partition (the
> total used data space is 10M), and you have a snapshot which owns this
> file. Then you modify the original file by overwriting the range
> [3M, 5M], and now you find that the total used data space has increased
> to 15M or maybe more (because of unaligned writes and extents being
> padded to 4K length).
>
> This comes from our COW and extent references implementation, so you get
> the benefit of COW, meanwhile have to live with the un-reclaimed space.
>
> It's sort of something I was trying to fix, but I found that my approach
> led to other problems so I decided to give it up.
>
> Thanks,
>
> -liubo
The case is right, but the example may be slightly off, BTW.
For a 10M file in one subvolume, rewriting [3M, 5M] (including the last
byte) in the snapshot will only increase used space to 12M + 4K, since
the above offsets, except for the last byte, are already sectorsize aligned.
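
To make the 12M + 4K arithmetic concrete, here is a minimal sketch. It
assumes a 4K sectorsize and that a COW overwrite allocates whole sectors
covering the written range; the helper name cow_bytes is only for
illustration and is not a btrfs interface.

```python
SECTOR = 4096          # assumed sectorsize
MiB = 1024 * 1024

def cow_bytes(start, end_inclusive, sector=SECTOR):
    """Bytes newly allocated when overwriting [start, end_inclusive]."""
    # Round the start down and the (inclusive) end up to sector boundaries.
    aligned_start = (start // sector) * sector
    aligned_end = (end_inclusive // sector + 1) * sector
    return aligned_end - aligned_start

new_data = cow_bytes(3 * MiB, 5 * MiB)   # 2M plus one extra 4K sector
total = 10 * MiB + new_data              # the original 10M stays allocated

print(new_data - 2 * MiB)   # extra bytes caused by the inclusive last byte
print(total - 12 * MiB)     # bytes above 12M
```

The byte at offset 5M sits in a new sector, which is where the extra 4K
comes from.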
The root cause is btrfs' lazy extent freeing behavior.
Here is a new 12M example to illustrate it:
======
Step 0:
Subv A:
  [0,12M): Shared between A and B (Extent 1)
  on disk:
    Extent 1: record the above [0,12M) data
Subv B:
  [0,4M): Shared between A and B (Extent 1)
  [4M,8M): Exclusive in B (Extent 2)
  [8M,12M): Shared between A and B (Extent 1)
  on disk:
    Extent 1: is reused in [0,4M) and [8M,12M) ranges
    Extent 2: record data of [4M,8M)
------
Step 1:
Write [4M,8M) of subv A:
  [0,4M): Shared between A and B (Extent 1)
  [4M,8M): Exclusive in A (Extent 3)
  [8M,12M): Shared between A and B (Extent 1)
  Extent 3: record the above [4M,8M) data
  And Extent 1 is not changed.
Subv B:
  [0,4M): Shared between A and B (Extent 1)
  [4M,8M): Exclusive in B (Extent 2)
  [8M,12M): Shared between A and B (Extent 1)
======
After step 1, the used space will be 12M (Extent 1) + 4M (Extent 2) + 4M
(Extent 3) = 20M, even though only 8M of Extent 1 is still referenced.
So the middle 4M of Extent 1 is completely wasted, and btrfs won't free
it until *NO* part of the extent is referenced by anyone.
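
The accounting above can be sketched as a toy model. This is only an
illustration of whole-extent accounting, not how btrfs stores
references; the dictionaries and function names are made up for the
example.

```python
MiB = 1024 * 1024

# On-disk extents after step 1: extent id -> size in bytes.
extents = {1: 12 * MiB, 2: 4 * MiB, 3: 4 * MiB}

# File extent items per subvolume: (extent id, offset in extent, length).
subvols = {
    "A": [(1, 0, 4 * MiB), (3, 0, 4 * MiB), (1, 8 * MiB, 4 * MiB)],
    "B": [(1, 0, 4 * MiB), (2, 0, 4 * MiB), (1, 8 * MiB, 4 * MiB)],
}

def used_bytes(extents, subvols):
    """An extent counts in full as long as any part of it is referenced."""
    live = {ext for items in subvols.values() for ext, _, _ in items}
    return sum(size for ext, size in extents.items() if ext in live)

def referenced_bytes(subvols):
    """Bytes actually reachable, counting shared ranges only once."""
    spans_by_extent = {}
    for items in subvols.values():
        for ext, off, length in items:
            spans_by_extent.setdefault(ext, set()).add((off, off + length))
    total = 0
    for spans in spans_by_extent.values():
        last_end = 0
        for start, end in sorted(spans):   # merge overlapping ranges
            if end > last_end:
                total += end - max(start, last_end)
                last_end = end
    return total

used = used_bytes(extents, subvols)
referenced = referenced_bytes(subvols)
print(used // MiB, referenced // MiB, (used - referenced) // MiB)
```

In this model the difference between the two numbers is the middle 4M
of Extent 1, which nothing references but which stays allocated.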
One solution would be defrag, but IIRC defrag in the multi-subvolume
case is not supported yet...
BTW, personally speaking, to find out how much space a subvolume takes,
btrfs qgroup would be quite handy (after the 4.2 kernel).
It shows how much space a subvolume takes exclusively.
Thanks,
Qu
>
>> Thanks,
>> Marc
>> --
>> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>> Microsoft is to operating systems ....
>> .... what McDonalds is to gourmet cooking
>> Home page: http://marc.merlins.org/
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html